Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH] Driver-core: Fix bluetooth network device rename regression
From: Kay Sievers @ 2010-07-22 15:30 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Eric W. Biederman, Greg KH, Andrew Morton, Greg KH,
	Rafael J. Wysocki, Maciej W. Rozycki, netdev
In-Reply-To: <1279807845.12439.20.camel@jlt3.sipsolutions.net>

On Thu, Jul 22, 2010 at 16:10, Johannes Berg <johannes@sipsolutions.net> wrote:
> On Thu, 2010-07-22 at 15:38 +0200, Kay Sievers wrote:
>>
>> Please try to fix these drivers instead, or mark the broken for
>> namespaces, if nobody can fix them right now.
>
> We've tried. Nobody, including you, has been able to suggest how to fix
> it.

I did, and several times. Here are the options again:

- Split the driver in two modules, so the auto-cleanup of the netdev
can be done by the second module when the device is force-unloaded
without taking any references to the code while in use (netdev specif
behavior).

- Move the device cleanup code somehow in the core by adding proper
functions to bus devices, similar to the completely mis-used
device_create() function for class devices. This would also be a
proper fix for the weird driver core use.

- Do not allow to force-unload the module while the netdev is in use.
You would need some "destruct" command for the device then, which
removes the netdev, and to be able to unload the module.

> And it's not just broken with network namespaces enabled either, as

So what's the problem without the sysfs ns then? I didn't hear any the
last couple of years.

> Eric explained. I really don't see why you keep asking us to fix it when
> clearly we cannot -- even you don't know how and you certainly have more
> insight into the device model than we do.

Sure, and I ask again to fix the drivers, instead of sneaking-in dirty
hacks into the core, just by calling an expected behavior a
"regression". This is not a core bug, and should not be worked around
that way in the core.

If there is a new requirement for the core (like possibly point 2
above), we can surely look into making this happen.

We can not add lists of individual subsystems to generic core
functions to work around broken drivers. I hope you will understand
that.

Thanks,
Kay

^ permalink raw reply

* Re: [RFC 1/1] netfilter: xtables: inclusion of xt_condition
From: Jan Engelhardt @ 2010-07-22 15:36 UTC (permalink / raw)
  To: Luciano Coelho
  Cc: Netfilter Developer Mailing List, netdev@vger.kernel.org,
	Patrick McHardy, sameo@linux.intel.com, Alexey Dobriyan
In-Reply-To: <1279811808.1630.8.camel@powerslave>


On Thursday 2010-07-22 17:16, Luciano Coelho wrote:
>> >+	ret = strict_strtoul(val, 0, &l);
>> >+	if (ret == -EINVAL || ((uint)l != l))
>> >+		return -EINVAL;
>> 
>> >+	*((u32 *) ((u8 *) cond_net + (size_t) kp->arg)) = l;
>> 
>> I don't think we need this level of granularity; let the options be 
>> global, similar to what xt_hashlimit does.
>
>I did this according to Patrick's comment:
>> > proc_net_condition is a global variable, so this won't work for
>> > namespaces. What the code does is reinitialize it when instantiating
>> > a new namespace, so it will always point to the last instantiated
>> > namespace.
>> > 
>> > The same problem exists for the condition_list, each namespace
>> > should only be able to access its own conditions.
>> 
>> This also applies to the permission variables. Basically, we shouldn't
>> be having any globals except perhaps the mutex. You probably need a
>> module_param_call function to set them for the correct namespace (you
>> can access that through current->nsproxy->net_ns).
>
>I found it a bit strange to be able to change the module params in a
>per-netns basis, but it is actually possible if you're changing the
>parameters via sysfs.  I tried it and it even seems to work. ;)
>
>I can't see any module parameters in the xt_hashlimit.c file.  Am I
>looking in the wrong place?

Oops, xt_recent.c.

>I would be fine with making the module params global (as they were
>before), if that's fine with Patrick too.

"When was the last time you needed to change the default ownership
when you _also_ have the possibility to chown each procfs file
individually?"

>> (I am not even sure if kp->arg can be non-multiples-of-4, in which case 
>> this would be an alignment violation even.)
>
>I'm passing size_t in kp->arg.  It looks quite ugly, because usually
>kp->arg is a pointer to some data.  But at least this way, using
>offsetof(), I could avoid lots of repeated code for the options...

if kp->arg is 1, ((u8*)cond_net + kp->arg) yields a pointer that's
usually not aligned for u32. (And C pedants would probably argue
that is should be char* not u8*, even if the one is a typedef
of another.)

^ permalink raw reply

* Re: [PATCH nf-next-2.6] netfilter: add xt_cpu match
From: Jan Engelhardt @ 2010-07-22 15:39 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Patrick McHardy, Netfilter Development Mailinglist, netdev
In-Reply-To: <1279811939.2467.79.camel@edumazet-laptop>

On Thursday 2010-07-22 17:18, Eric Dumazet wrote:

>Le jeudi 22 juillet 2010 à 16:19 +0200, Jan Engelhardt a écrit :
>> On Thursday 2010-07-22 16:03, Eric Dumazet wrote:
>> 
>> >This match is a bit strange, being packet content agnostic...
>> >+/*
>> >+ * Yes, packet content is not interesting for us, we only take care
>> >+ * of cpu handling this packet
>> >+ */
>> 
>> That is not so strange after all, we have many packet agnostic matches: 
>> xt_time, xt_condition, xt_IDLETIMER, xt_iface.
>> So this little comment looks a bit redundant.
>> 
>> Or it seems that academia can't come up with enough new protocols in time that
>> we have to resort to do -m coffeemaker :)
>> 
>> >@@ -0,0 +1,8 @@
>> >+#ifndef _XT_CPU_H
>> >+#define _XT_CPU_H
>> >+
>> >+struct xt_cpu_info {
>> >+	unsigned int	cpu;
>> >+	int		invert;
>> >+};
>> >+#endif /*_XT_MAC_H*/
>> 
>> Please take a read in "Writing Netfilter Modules" e-book :-)
>> It will tell you that types other than fixed ones are a no-no.
>
>Ok, let's do that, but I doubt sizeof(int) can be different than 4 on a
>Linux 2.6 host right now.

Never say never. "long" already bit people in the past, and now we
have that CONFIG_COMPAT stuff.

If invert is the only flag, perhaps it makes sense to use __u8 
for it. 

>I prefer not doing the !!info->invert, and do the check only once.

>+static int cpu_mt_check(const struct xt_mtchk_param *par)
>+{
>+	const struct xt_cpu_info *info = par->matchinfo;
>+
>+	if (info->invert & ~1)
>+		return -EINVAL;
>+	return 0;
>+}
>+
>+static bool cpu_mt(const struct sk_buff *skb, struct xt_action_param *par)
>+{
>+	const struct xt_cpu_info *info = par->matchinfo;
>+
>+	return (info->cpu == smp_processor_id()) ^ info->invert;
>+}

That works nicely indeed. Do you anticipate any future flags?


^ permalink raw reply

* Re: [PATCH UPDATED 1/3] vhost: replace vhost_workqueue with per-vhost kthread
From: Michael S. Tsirkin @ 2010-07-22 15:58 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, Sridhar Samudrala, netdev, lkml,
	kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev, Jiri Kosina,
	Thomas Gleixner, Ingo Molnar, Andi Kleen
In-Reply-To: <4C06A580.9060300@kernel.org>

On Wed, Jun 02, 2010 at 08:40:00PM +0200, Tejun Heo wrote:
> Replace vhost_workqueue with per-vhost kthread.  Other than callback
> argument change from struct work_struct * to struct vhost_work *,
> there's no visible change to vhost_poll_*() interface.
> 
> This conversion is to make each vhost use a dedicated kthread so that
> resource control via cgroup can be applied.
> 
> Partially based on Sridhar Samudrala's patch.
> 
> * Updated to use sub structure vhost_work instead of directly using
>   vhost_poll at Michael's suggestion.
> 
> * Added flusher wake_up() optimization at Michael's suggestion.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Sridhar Samudrala <samudrala.sridhar@gmail.com>

All the tricky barrier pairing made me uncomfortable.  So I came up with
this on top (untested): if we do all operations under the spinlock, we
can get by without barriers and atomics.  And since we need the lock for
list operations anyway, this should have no paerformance impact.

What do you think?


diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 0c6b533..7730a30 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -73,7 +73,7 @@ void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn,
 	INIT_LIST_HEAD(&work->node);
 	work->fn = fn;
 	init_waitqueue_head(&work->done);
-	atomic_set(&work->flushing, 0);
+	work->flushing = 0;
 	work->queue_seq = work->done_seq = 0;
 }
 
@@ -99,13 +99,23 @@ void vhost_poll_stop(struct vhost_poll *poll)
 void vhost_poll_flush(struct vhost_poll *poll)
 {
 	struct vhost_work *work = &poll->work;
-	int seq = work->queue_seq;
+	unsigned seq, left;
+	int flushing;
 
-	atomic_inc(&work->flushing);
-	smp_mb__after_atomic_inc();	/* mb flush-b0 paired with worker-b1 */
-	wait_event(work->done, seq - work->done_seq <= 0);
-	atomic_dec(&work->flushing);
-	smp_mb__after_atomic_dec();	/* rmb flush-b1 paired with worker-b0 */
+	spin_lock_irq(&dev->work_lock);
+	seq = work->queue_seq;
+	work->flushing++;
+	spin_unlock_irq(&dev->work_lock);
+	wait_event(work->done, {
+		   spin_lock_irq(&dev->work_lock);
+		   left = work->done_seq - seq;
+		   spin_unlock_irq(&dev->work_lock);
+		   left < UINT_MAX / 2;
+	});
+	spin_lock_irq(&dev->work_lock);
+	flushing = --work->flushing;
+	spin_unlock_irq(&dev->work_lock);
+	BUG_ON(flushing < 0);
 }
 
 void vhost_poll_queue(struct vhost_poll *poll)
@@ -151,37 +161,37 @@ static void vhost_vq_reset(struct vhost_dev *dev,
 static int vhost_worker(void *data)
 {
 	struct vhost_dev *dev = data;
-	struct vhost_work *work;
+	struct vhost_work *work = NULL;
 
-repeat:
-	set_current_state(TASK_INTERRUPTIBLE);	/* mb paired w/ kthread_stop */
+	for (;;) {
+		set_current_state(TASK_INTERRUPTIBLE);	/* mb paired w/ kthread_stop */
 
-	if (kthread_should_stop()) {
-		__set_current_state(TASK_RUNNING);
-		return 0;
-	}
+		if (kthread_should_stop()) {
+			__set_current_state(TASK_RUNNING);
+			return 0;
+		}
 
-	work = NULL;
-	spin_lock_irq(&dev->work_lock);
-	if (!list_empty(&dev->work_list)) {
-		work = list_first_entry(&dev->work_list,
-					struct vhost_work, node);
-		list_del_init(&work->node);
-	}
-	spin_unlock_irq(&dev->work_lock);
+		spin_lock_irq(&dev->work_lock);
+		if (work) {
+			work->done_seq = work->queue_seq;
+			if (work->flushing)
+				wake_up_all(&work->done);
+		}
+		if (!list_empty(&dev->work_list)) {
+			work = list_first_entry(&dev->work_list,
+						struct vhost_work, node);
+			list_del_init(&work->node);
+		} else
+			work = NULL;
+		spin_unlock_irq(&dev->work_lock);
+
+		if (work) {
+			__set_current_state(TASK_RUNNING);
+			work->fn(work);
+		} else
+			schedule();
 
-	if (work) {
-		__set_current_state(TASK_RUNNING);
-		work->fn(work);
-		smp_wmb();	/* wmb worker-b0 paired with flush-b1 */
-		work->done_seq = work->queue_seq;
-		smp_mb();	/* mb worker-b1 paired with flush-b0 */
-		if (atomic_read(&work->flushing))
-			wake_up_all(&work->done);
-	} else
-		schedule();
-
-	goto repeat;
+	}
 }
 
 long vhost_dev_init(struct vhost_dev *dev,
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 0e63091..3693327 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -27,9 +27,9 @@ struct vhost_work {
 	struct list_head	  node;
 	vhost_work_fn_t		  fn;
 	wait_queue_head_t	  done;
-	atomic_t		  flushing;
-	int			  queue_seq;
-	int			  done_seq;
+	int			  flushing;
+	unsigned		  queue_seq;
+	unsigned		  done_seq;
 };
 
 /* Poll a file (eventfd or socket) */

^ permalink raw reply related

* Re: macvtap: Limit packet queue length
From: Shirley Ma @ 2010-07-22 15:59 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David S. Miller, netdev, Arnd Bergmann, Mark Wagner
In-Reply-To: <20100722064157.GA25913@gondor.apana.org.au>

On Thu, 2010-07-22 at 14:41 +0800, Herbert Xu wrote:
>  {
>         struct macvtap_queue *q = macvtap_get_queue(dev, skb);
>         if (!q)
> -               return -ENOLINK;
> +               goto drop;
> +
> +       if (skb_queue_len(&q->sk.sk_receive_queue) >=
> dev->tx_queue_len)
> +               goto drop;
> 

Do we need to orphan skb here, just like tun?

>         skb_queue_tail(&q->sk.sk_receive_queue, skb);
>         wake_up_interruptible_poll(sk_sleep(&q->sk), POLLIN |
> POLLRDNORM | POLLRDBAND);
> -       return 0;
> +       return NET_RX_SUCCESS;
> +
> +drop:

Do we need to increase dropped++ counter here to let user know there are
packets dropped?

> +       kfree_skb(skb);
> +       return NET_RX_DROP;
>  }
> 
> 

^ permalink raw reply

* Re: macvtap: Limit packet queue length
From: Shirley Ma @ 2010-07-22 16:05 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Herbert Xu, David S. Miller, netdev, Mark Wagner, Chris Wright
In-Reply-To: <201007221130.53612.arnd@arndb.de>

On Thu, 2010-07-22 at 11:30 +0200, Arnd Bergmann wrote:
> In the TX direction, we really don't queue, since we simply forward
> to the lowerdev tx queue, so exposing the tunable to user space
> as the tx queue length is a little bit awkward, as well as
> inconsistent
> between macvtap and macvlan.

Maybe we can use lowerdev backlog queue size here from receiving path
here?

thanks
Shirley

^ permalink raw reply

* Re: macvtap: Limit packet queue length
From: Herbert Xu @ 2010-07-22 16:07 UTC (permalink / raw)
  To: Shirley Ma; +Cc: David S. Miller, netdev, Arnd Bergmann, Mark Wagner
In-Reply-To: <1279814398.3211.1.camel@localhost.localdomain>

On Thu, Jul 22, 2010 at 08:59:58AM -0700, Shirley Ma wrote:
> On Thu, 2010-07-22 at 14:41 +0800, Herbert Xu wrote:
> >  {
> >         struct macvtap_queue *q = macvtap_get_queue(dev, skb);
> >         if (!q)
> > -               return -ENOLINK;
> > +               goto drop;
> > +
> > +       if (skb_queue_len(&q->sk.sk_receive_queue) >=
> > dev->tx_queue_len)
> > +               goto drop;
> > 
> 
> Do we need to orphan skb here, just like tun?

We could, but that is orthogonal to the problem at hand so feel
free to do that in another patch.

> >         skb_queue_tail(&q->sk.sk_receive_queue, skb);
> >         wake_up_interruptible_poll(sk_sleep(&q->sk), POLLIN |
> > POLLRDNORM | POLLRDBAND);
> > -       return 0;
> > +       return NET_RX_SUCCESS;
> > +
> > +drop:
> 
> Do we need to increase dropped++ counter here to let user know there are
> packets dropped?

The caller is supposed to handle this.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: macvtap: Limit packet queue length
From: Herbert Xu @ 2010-07-22 16:08 UTC (permalink / raw)
  To: Shirley Ma
  Cc: Arnd Bergmann, David S. Miller, netdev, Mark Wagner, Chris Wright
In-Reply-To: <1279814726.3211.5.camel@localhost.localdomain>

On Thu, Jul 22, 2010 at 09:05:26AM -0700, Shirley Ma wrote:
> On Thu, 2010-07-22 at 11:30 +0200, Arnd Bergmann wrote:
> > In the TX direction, we really don't queue, since we simply forward
> > to the lowerdev tx queue, so exposing the tunable to user space
> > as the tx queue length is a little bit awkward, as well as
> > inconsistent
> > between macvtap and macvlan.
> 
> Maybe we can use lowerdev backlog queue size here from receiving path
> here?

No, you may wish to set different queue lengths for different
macvtap devices over the same lowerdev so you definitely don't
want to use any lowerdev parameter for this.

I honestly don't see any problems with using tx_queue_len since
it isn't used by anything else for macvtap.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH nf-next-2.6] netfilter: add xt_cpu match
From: Eric Dumazet @ 2010-07-22 16:24 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Patrick McHardy, Netfilter Development Mailinglist, netdev
In-Reply-To: <alpine.LSU.2.01.1007221736150.12308@obet.zrqbmnf.qr>

Le jeudi 22 juillet 2010 à 17:39 +0200, Jan Engelhardt a écrit :

> Never say never. "long" already bit people in the past, and now we
> have that CONFIG_COMPAT stuff.
> 

I know pretty well the "long" problem, I received one of the first alpha
machine ever built in the world (DEC 3000 AXP, with a fast 133 MHz
cpu ;) ), before I began to use Linux :)


> If invert is the only flag, perhaps it makes sense to use __u8 
> for it. 
> 

Quite frankly it brings more problems than plain u32

- Possible security problems (padding bytes). Not applicable to
iptables.

- Some arches have slow byte/short accesses (21064 for example :) )

"int" is the natural type, fast on all arches.

- Given alignment requirements of iptables rules, using less than 32bits
here saves no ram.

But I dont care that much.

I even see compiler doesnt want to use a XOR instruction :

00000018 <cpu_mt>:
  18:	55                   	push   %ebp
  19:	8b 42 04             	mov    0x4(%edx),%eax
  1c:	64 8b 15 00 00 00 00 	mov    %fs:0x0,%edx
  23:	89 e5                	mov    %esp,%ebp
  25:	5d                   	pop    %ebp
  26:	39 10                	cmp    %edx,(%eax)
  28:	0f 94 c2             	sete   %dl
  2b:	0f b6 d2             	movzbl %dl,%edx
  2e:	3b 50 04             	cmp    0x4(%eax),%edx
  31:	0f 95 c0             	setne  %al
  34:	c3                   	ret    




--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH] CAN: Add Flexcan CAN controller driver
From: Marc Kleine-Budde @ 2010-07-22 16:40 UTC (permalink / raw)
  To: wg-5Yr1BZd7O62+XT7JhA+gdA
  Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	netdev-u79uwXL29TY76Z2rM5mHXA, Marc Kleine-Budde,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q

Hello,

sorry for the confusion about the last minute problem with the driver,
it was false alarm. Someone had troubles a driver derived from fsl's
original one.

David, please feel free to merge the driver now. :)

Cheers, Marc


P.S.: You can pull this from:

The following changes since commit 4cfa580e7eebb8694b875d2caff3b989ada2efac:

  r6040: Fix args to phy_mii_ioctl(). (2010-07-21 21:10:49 -0700)

are available in the git repository at:
  git://git.pengutronix.de/git/mkl/linux-2.6.git for-net-next-2.6

Marc Kleine-Budde (1):
      CAN: Add Flexcan CAN controller driver

 drivers/net/can/Kconfig              |    9 +
 drivers/net/can/Makefile             |    1 +
 drivers/net/can/flexcan.c            | 1030 ++++++++++++++++++++++++++++++++++
 include/linux/can/platform/flexcan.h |   20 +
 4 files changed, 1060 insertions(+), 0 deletions(-)
 create mode 100644 drivers/net/can/flexcan.c
 create mode 100644 include/linux/can/platform/flexcan.h

---------8<--------8<--------8<--------8<--------8<--------8<--------8<-------

This core is found on some Freescale SoCs and also some Coldfire
SoCs. Support for Coldfire is missing though at the moment as
they have an older revision of the core which does not have RX FIFO
support.

Signed-off-by: Sascha Hauer <s.hauer-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
Signed-off-by: Marc Kleine-Budde <mkl-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
Acked-by: Wolfgang Grandegger <wg-5Yr1BZd7O62+XT7JhA+gdA@public.gmane.org>
---
Changes to prev version:

* improved activation of the driver in Kconfig

 drivers/net/can/Kconfig              |    9 +
 drivers/net/can/Makefile             |    1 +
 drivers/net/can/flexcan.c            | 1030 ++++++++++++++++++++++++++++++++++
 include/linux/can/platform/flexcan.h |   20 +
 4 files changed, 1060 insertions(+), 0 deletions(-)
 create mode 100644 drivers/net/can/flexcan.c
 create mode 100644 include/linux/can/platform/flexcan.h

diff --git a/drivers/net/can/Kconfig b/drivers/net/can/Kconfig
index 2c5227c..9d9e453 100644
--- a/drivers/net/can/Kconfig
+++ b/drivers/net/can/Kconfig
@@ -73,6 +73,15 @@ config CAN_JANZ_ICAN3
 	  This driver can also be built as a module. If so, the module will be
 	  called janz-ican3.ko.
 
+config HAVE_CAN_FLEXCAN
+	bool
+
+config CAN_FLEXCAN
+	tristate "Support for Freescale FLEXCAN based chips"
+	depends on CAN_DEV && HAVE_CAN_FLEXCAN
+	---help---
+	  Say Y here if you want to support for Freescale FlexCAN.
+
 source "drivers/net/can/mscan/Kconfig"
 
 source "drivers/net/can/sja1000/Kconfig"
diff --git a/drivers/net/can/Makefile b/drivers/net/can/Makefile
index 9047cd0..0057537 100644
--- a/drivers/net/can/Makefile
+++ b/drivers/net/can/Makefile
@@ -16,5 +16,6 @@ obj-$(CONFIG_CAN_TI_HECC)	+= ti_hecc.o
 obj-$(CONFIG_CAN_MCP251X)	+= mcp251x.o
 obj-$(CONFIG_CAN_BFIN)		+= bfin_can.o
 obj-$(CONFIG_CAN_JANZ_ICAN3)	+= janz-ican3.o
+obj-$(CONFIG_CAN_FLEXCAN)	+= flexcan.o
 
 ccflags-$(CONFIG_CAN_DEBUG_DEVICES) := -DDEBUG
diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
new file mode 100644
index 0000000..ef443a0
--- /dev/null
+++ b/drivers/net/can/flexcan.c
@@ -0,0 +1,1030 @@
+/*
+ * flexcan.c - FLEXCAN CAN controller driver
+ *
+ * Copyright (c) 2005-2006 Varma Electronics Oy
+ * Copyright (c) 2009 Sascha Hauer, Pengutronix
+ * Copyright (c) 2010 Marc Kleine-Budde, Pengutronix
+ *
+ * Based on code originally by Andrey Volkov <avolkov-ppI4tVfbJvJWk0Htik3J/w@public.gmane.org>
+ *
+ * LICENCE:
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/netdevice.h>
+#include <linux/can.h>
+#include <linux/can/dev.h>
+#include <linux/can/error.h>
+#include <linux/can/platform/flexcan.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/if_arp.h>
+#include <linux/if_ether.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+
+#include <mach/clock.h>
+
+#define DRV_NAME			"flexcan"
+
+/* 8 for RX fifo and 2 error handling */
+#define FLEXCAN_NAPI_WEIGHT		(8 + 2)
+
+/* FLEXCAN module configuration register (CANMCR) bits */
+#define FLEXCAN_MCR_MDIS		BIT(31)
+#define FLEXCAN_MCR_FRZ			BIT(30)
+#define FLEXCAN_MCR_FEN			BIT(29)
+#define FLEXCAN_MCR_HALT		BIT(28)
+#define FLEXCAN_MCR_NOT_RDY		BIT(27)
+#define FLEXCAN_MCR_WAK_MSK		BIT(26)
+#define FLEXCAN_MCR_SOFTRST		BIT(25)
+#define FLEXCAN_MCR_FRZ_ACK		BIT(24)
+#define FLEXCAN_MCR_SUPV		BIT(23)
+#define FLEXCAN_MCR_SLF_WAK		BIT(22)
+#define FLEXCAN_MCR_WRN_EN		BIT(21)
+#define FLEXCAN_MCR_LPM_ACK		BIT(20)
+#define FLEXCAN_MCR_WAK_SRC		BIT(19)
+#define FLEXCAN_MCR_DOZE		BIT(18)
+#define FLEXCAN_MCR_SRX_DIS		BIT(17)
+#define FLEXCAN_MCR_BCC			BIT(16)
+#define FLEXCAN_MCR_LPRIO_EN		BIT(13)
+#define FLEXCAN_MCR_AEN			BIT(12)
+#define FLEXCAN_MCR_MAXMB(x)		((x) & 0xf)
+#define FLEXCAN_MCR_IDAM_A		(0 << 8)
+#define FLEXCAN_MCR_IDAM_B		(1 << 8)
+#define FLEXCAN_MCR_IDAM_C		(2 << 8)
+#define FLEXCAN_MCR_IDAM_D		(3 << 8)
+
+/* FLEXCAN control register (CANCTRL) bits */
+#define FLEXCAN_CTRL_PRESDIV(x)		(((x) & 0xff) << 24)
+#define FLEXCAN_CTRL_RJW(x)		(((x) & 0x03) << 22)
+#define FLEXCAN_CTRL_PSEG1(x)		(((x) & 0x07) << 19)
+#define FLEXCAN_CTRL_PSEG2(x)		(((x) & 0x07) << 16)
+#define FLEXCAN_CTRL_BOFF_MSK		BIT(15)
+#define FLEXCAN_CTRL_ERR_MSK		BIT(14)
+#define FLEXCAN_CTRL_CLK_SRC		BIT(13)
+#define FLEXCAN_CTRL_LPB		BIT(12)
+#define FLEXCAN_CTRL_TWRN_MSK		BIT(11)
+#define FLEXCAN_CTRL_RWRN_MSK		BIT(10)
+#define FLEXCAN_CTRL_SMP		BIT(7)
+#define FLEXCAN_CTRL_BOFF_REC		BIT(6)
+#define FLEXCAN_CTRL_TSYN		BIT(5)
+#define FLEXCAN_CTRL_LBUF		BIT(4)
+#define FLEXCAN_CTRL_LOM		BIT(3)
+#define FLEXCAN_CTRL_PROPSEG(x)		((x) & 0x07)
+#define FLEXCAN_CTRL_ERR_BUS		(FLEXCAN_CTRL_ERR_MSK)
+#define FLEXCAN_CTRL_ERR_STATE \
+	(FLEXCAN_CTRL_TWRN_MSK | FLEXCAN_CTRL_RWRN_MSK | \
+	 FLEXCAN_CTRL_BOFF_MSK)
+#define FLEXCAN_CTRL_ERR_ALL \
+	(FLEXCAN_CTRL_ERR_BUS | FLEXCAN_CTRL_ERR_STATE)
+
+/* FLEXCAN error and status register (ESR) bits */
+#define FLEXCAN_ESR_TWRN_INT		BIT(17)
+#define FLEXCAN_ESR_RWRN_INT		BIT(16)
+#define FLEXCAN_ESR_BIT1_ERR		BIT(15)
+#define FLEXCAN_ESR_BIT0_ERR		BIT(14)
+#define FLEXCAN_ESR_ACK_ERR		BIT(13)
+#define FLEXCAN_ESR_CRC_ERR		BIT(12)
+#define FLEXCAN_ESR_FRM_ERR		BIT(11)
+#define FLEXCAN_ESR_STF_ERR		BIT(10)
+#define FLEXCAN_ESR_TX_WRN		BIT(9)
+#define FLEXCAN_ESR_RX_WRN		BIT(8)
+#define FLEXCAN_ESR_IDLE		BIT(7)
+#define FLEXCAN_ESR_TXRX		BIT(6)
+#define FLEXCAN_EST_FLT_CONF_SHIFT	(4)
+#define FLEXCAN_ESR_FLT_CONF_MASK	(0x3 << FLEXCAN_EST_FLT_CONF_SHIFT)
+#define FLEXCAN_ESR_FLT_CONF_ACTIVE	(0x0 << FLEXCAN_EST_FLT_CONF_SHIFT)
+#define FLEXCAN_ESR_FLT_CONF_PASSIVE	(0x1 << FLEXCAN_EST_FLT_CONF_SHIFT)
+#define FLEXCAN_ESR_BOFF_INT		BIT(2)
+#define FLEXCAN_ESR_ERR_INT		BIT(1)
+#define FLEXCAN_ESR_WAK_INT		BIT(0)
+#define FLEXCAN_ESR_ERR_BUS \
+	(FLEXCAN_ESR_BIT1_ERR | FLEXCAN_ESR_BIT0_ERR | \
+	 FLEXCAN_ESR_ACK_ERR | FLEXCAN_ESR_CRC_ERR | \
+	 FLEXCAN_ESR_FRM_ERR | FLEXCAN_ESR_STF_ERR)
+#define FLEXCAN_ESR_ERR_STATE \
+	(FLEXCAN_ESR_TWRN_INT | FLEXCAN_ESR_RWRN_INT | FLEXCAN_ESR_BOFF_INT)
+#define FLEXCAN_ESR_ERR_ALL \
+	(FLEXCAN_ESR_ERR_BUS | FLEXCAN_ESR_ERR_STATE)
+
+/* FLEXCAN interrupt flag register (IFLAG) bits */
+#define FLEXCAN_TX_BUF_ID		8
+#define FLEXCAN_IFLAG_BUF(x)		BIT(x)
+#define FLEXCAN_IFLAG_RX_FIFO_OVERFLOW	BIT(7)
+#define FLEXCAN_IFLAG_RX_FIFO_WARN	BIT(6)
+#define FLEXCAN_IFLAG_RX_FIFO_AVAILABLE	BIT(5)
+#define FLEXCAN_IFLAG_DEFAULT \
+	(FLEXCAN_IFLAG_RX_FIFO_OVERFLOW | FLEXCAN_IFLAG_RX_FIFO_AVAILABLE | \
+	 FLEXCAN_IFLAG_BUF(FLEXCAN_TX_BUF_ID))
+
+/* FLEXCAN message buffers */
+#define FLEXCAN_MB_CNT_CODE(x)		(((x) & 0xf) << 24)
+#define FLEXCAN_MB_CNT_SRR		BIT(22)
+#define FLEXCAN_MB_CNT_IDE		BIT(21)
+#define FLEXCAN_MB_CNT_RTR		BIT(20)
+#define FLEXCAN_MB_CNT_LENGTH(x)	(((x) & 0xf) << 16)
+#define FLEXCAN_MB_CNT_TIMESTAMP(x)	((x) & 0xffff)
+
+#define FLEXCAN_MB_CODE_MASK		(0xf0ffffff)
+
+/* Structure of the message buffer */
+struct flexcan_mb {
+	u32 can_ctrl;
+	u32 can_id;
+	u32 data[2];
+};
+
+/* Structure of the hardware registers */
+struct flexcan_regs {
+	u32 mcr;		/* 0x00 */
+	u32 ctrl;		/* 0x04 */
+	u32 timer;		/* 0x08 */
+	u32 _reserved1;		/* 0x0c */
+	u32 rxgmask;		/* 0x10 */
+	u32 rx14mask;		/* 0x14 */
+	u32 rx15mask;		/* 0x18 */
+	u32 ecr;		/* 0x1c */
+	u32 esr;		/* 0x20 */
+	u32 imask2;		/* 0x24 */
+	u32 imask1;		/* 0x28 */
+	u32 iflag2;		/* 0x2c */
+	u32 iflag1;		/* 0x30 */
+	u32 _reserved2[19];
+	struct flexcan_mb cantxfg[64];
+};
+
+struct flexcan_priv {
+	struct can_priv can;
+	struct net_device *dev;
+	struct napi_struct napi;
+
+	void __iomem *base;
+	u32 reg_esr;
+	u32 reg_ctrl_default;
+
+	struct clk *clk;
+	struct flexcan_platform_data *pdata;
+};
+
+static struct can_bittiming_const flexcan_bittiming_const = {
+	.name = DRV_NAME,
+	.tseg1_min = 4,
+	.tseg1_max = 16,
+	.tseg2_min = 2,
+	.tseg2_max = 8,
+	.sjw_max = 4,
+	.brp_min = 1,
+	.brp_max = 256,
+	.brp_inc = 1,
+};
+
+/*
+ * Swtich transceiver on or off
+ */
+static void flexcan_transceiver_switch(const struct flexcan_priv *priv, int on)
+{
+	if (priv->pdata && priv->pdata->transceiver_switch)
+		priv->pdata->transceiver_switch(on);
+}
+
+static inline int flexcan_has_and_handle_berr(const struct flexcan_priv *priv,
+					      u32 reg_esr)
+{
+	return (priv->can.ctrlmode & CAN_CTRLMODE_BERR_REPORTING) &&
+		(reg_esr & FLEXCAN_ESR_ERR_BUS);
+}
+
+static inline void flexcan_chip_enable(struct flexcan_priv *priv)
+{
+	struct flexcan_regs __iomem *regs = priv->base;
+	u32 reg;
+
+	reg = readl(&regs->mcr);
+	reg &= ~FLEXCAN_MCR_MDIS;
+	writel(reg, &regs->mcr);
+
+	udelay(10);
+}
+
+static inline void flexcan_chip_disable(struct flexcan_priv *priv)
+{
+	struct flexcan_regs __iomem *regs = priv->base;
+	u32 reg;
+
+	reg = readl(&regs->mcr);
+	reg |= FLEXCAN_MCR_MDIS;
+	writel(reg, &regs->mcr);
+}
+
+static int flexcan_get_berr_counter(const struct net_device *dev,
+				    struct can_berr_counter *bec)
+{
+	const struct flexcan_priv *priv = netdev_priv(dev);
+	struct flexcan_regs __iomem *regs = priv->base;
+	u32 reg = readl(&regs->ecr);
+
+	bec->txerr = (reg >> 0) & 0xff;
+	bec->rxerr = (reg >> 8) & 0xff;
+
+	return 0;
+}
+
+static int flexcan_start_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	const struct flexcan_priv *priv = netdev_priv(dev);
+	struct net_device_stats *stats = &dev->stats;
+	struct flexcan_regs __iomem *regs = priv->base;
+	struct can_frame *cf = (struct can_frame *)skb->data;
+	u32 can_id;
+	u32 ctrl = FLEXCAN_MB_CNT_CODE(0xc) | (cf->can_dlc << 16);
+
+	if (can_dropped_invalid_skb(dev, skb))
+		return NETDEV_TX_OK;
+
+	netif_stop_queue(dev);
+
+	if (cf->can_id & CAN_EFF_FLAG) {
+		can_id = cf->can_id & CAN_EFF_MASK;
+		ctrl |= FLEXCAN_MB_CNT_IDE | FLEXCAN_MB_CNT_SRR;
+	} else {
+		can_id = (cf->can_id & CAN_SFF_MASK) << 18;
+	}
+
+	if (cf->can_id & CAN_RTR_FLAG)
+		ctrl |= FLEXCAN_MB_CNT_RTR;
+
+	if (cf->can_dlc > 0) {
+		u32 data = be32_to_cpup((__be32 *)&cf->data[0]);
+		writel(data, &regs->cantxfg[FLEXCAN_TX_BUF_ID].data[0]);
+	}
+	if (cf->can_dlc > 3) {
+		u32 data = be32_to_cpup((__be32 *)&cf->data[4]);
+		writel(data, &regs->cantxfg[FLEXCAN_TX_BUF_ID].data[1]);
+	}
+
+	writel(can_id, &regs->cantxfg[FLEXCAN_TX_BUF_ID].can_id);
+	writel(ctrl, &regs->cantxfg[FLEXCAN_TX_BUF_ID].can_ctrl);
+
+	kfree_skb(skb);
+
+	/* tx_packets is incremented in flexcan_irq */
+	stats->tx_bytes += cf->can_dlc;
+
+	return NETDEV_TX_OK;
+}
+
+static void do_bus_err(struct net_device *dev,
+		       struct can_frame *cf, u32 reg_esr)
+{
+	struct flexcan_priv *priv = netdev_priv(dev);
+	int rx_errors = 0, tx_errors = 0;
+
+	cf->can_id |= CAN_ERR_PROT | CAN_ERR_BUSERROR;
+
+	if (reg_esr & FLEXCAN_ESR_BIT1_ERR) {
+		dev_dbg(dev->dev.parent, "BIT1_ERR irq\n");
+		cf->data[2] |= CAN_ERR_PROT_BIT1;
+		tx_errors = 1;
+	}
+	if (reg_esr & FLEXCAN_ESR_BIT0_ERR) {
+		dev_dbg(dev->dev.parent, "BIT0_ERR irq\n");
+		cf->data[2] |= CAN_ERR_PROT_BIT0;
+		tx_errors = 1;
+	}
+	if (reg_esr & FLEXCAN_ESR_ACK_ERR) {
+		dev_dbg(dev->dev.parent, "ACK_ERR irq\n");
+		cf->can_id |= CAN_ERR_ACK;
+		cf->data[3] |= CAN_ERR_PROT_LOC_ACK;
+		tx_errors = 1;
+	}
+	if (reg_esr & FLEXCAN_ESR_CRC_ERR) {
+		dev_dbg(dev->dev.parent, "CRC_ERR irq\n");
+		cf->data[2] |= CAN_ERR_PROT_BIT;
+		cf->data[3] |= CAN_ERR_PROT_LOC_CRC_SEQ;
+		rx_errors = 1;
+	}
+	if (reg_esr & FLEXCAN_ESR_FRM_ERR) {
+		dev_dbg(dev->dev.parent, "FRM_ERR irq\n");
+		cf->data[2] |= CAN_ERR_PROT_FORM;
+		rx_errors = 1;
+	}
+	if (reg_esr & FLEXCAN_ESR_STF_ERR) {
+		dev_dbg(dev->dev.parent, "STF_ERR irq\n");
+		cf->data[2] |= CAN_ERR_PROT_STUFF;
+		rx_errors = 1;
+	}
+
+	priv->can.can_stats.bus_error++;
+	if (rx_errors)
+		dev->stats.rx_errors++;
+	if (tx_errors)
+		dev->stats.tx_errors++;
+}
+
+static int flexcan_poll_bus_err(struct net_device *dev, u32 reg_esr)
+{
+	struct sk_buff *skb;
+	struct can_frame *cf;
+
+	skb = alloc_can_err_skb(dev, &cf);
+	if (unlikely(!skb))
+		return 0;
+
+	do_bus_err(dev, cf, reg_esr);
+	netif_receive_skb(skb);
+
+	dev->stats.rx_packets++;
+	dev->stats.rx_bytes += cf->can_dlc;
+
+	return 1;
+}
+
+static void do_state(struct net_device *dev,
+		     struct can_frame *cf, enum can_state new_state)
+{
+	struct flexcan_priv *priv = netdev_priv(dev);
+	struct can_berr_counter bec;
+
+	flexcan_get_berr_counter(dev, &bec);
+
+	switch (priv->can.state) {
+	case CAN_STATE_ERROR_ACTIVE:
+		/*
+		 * from: ERROR_ACTIVE
+		 * to  : ERROR_WARNING, ERROR_PASSIVE, BUS_OFF
+		 * =>  : there was a warning int
+		 */
+		if (new_state >= CAN_STATE_ERROR_WARNING &&
+		    new_state <= CAN_STATE_BUS_OFF) {
+			dev_dbg(dev->dev.parent, "Error Warning IRQ\n");
+			priv->can.can_stats.error_warning++;
+
+			cf->can_id |= CAN_ERR_CRTL;
+			cf->data[1] = (bec.txerr > bec.rxerr) ?
+				CAN_ERR_CRTL_TX_WARNING :
+				CAN_ERR_CRTL_RX_WARNING;
+		}
+	case CAN_STATE_ERROR_WARNING:	/* fallthrough */
+		/*
+		 * from: ERROR_ACTIVE, ERROR_WARNING
+		 * to  : ERROR_PASSIVE, BUS_OFF
+		 * =>  : error passive int
+		 */
+		if (new_state >= CAN_STATE_ERROR_PASSIVE &&
+		    new_state <= CAN_STATE_BUS_OFF) {
+			dev_dbg(dev->dev.parent, "Error Passive IRQ\n");
+			priv->can.can_stats.error_passive++;
+
+			cf->can_id |= CAN_ERR_CRTL;
+			cf->data[1] = (bec.txerr > bec.rxerr) ?
+				CAN_ERR_CRTL_TX_PASSIVE :
+				CAN_ERR_CRTL_RX_PASSIVE;
+		}
+		break;
+	case CAN_STATE_BUS_OFF:
+		dev_err(dev->dev.parent,
+			"BUG! hardware recovered automatically from BUS_OFF\n");
+		break;
+	default:
+		break;
+	}
+
+	/* process state changes depending on the new state */
+	switch (new_state) {
+	case CAN_STATE_ERROR_ACTIVE:
+		dev_dbg(dev->dev.parent, "Error Active\n");
+		cf->can_id |= CAN_ERR_PROT;
+		cf->data[2] = CAN_ERR_PROT_ACTIVE;
+		break;
+	case CAN_STATE_BUS_OFF:
+		cf->can_id |= CAN_ERR_BUSOFF;
+		can_bus_off(dev);
+		break;
+	default:
+		break;
+	}
+}
+
+static int flexcan_poll_state(struct net_device *dev, u32 reg_esr)
+{
+	struct flexcan_priv *priv = netdev_priv(dev);
+	struct sk_buff *skb;
+	struct can_frame *cf;
+	enum can_state new_state;
+	int flt;
+
+	flt = reg_esr & FLEXCAN_ESR_FLT_CONF_MASK;
+	if (likely(flt == FLEXCAN_ESR_FLT_CONF_ACTIVE)) {
+		if (likely(!(reg_esr & (FLEXCAN_ESR_TX_WRN |
+					FLEXCAN_ESR_RX_WRN))))
+			new_state = CAN_STATE_ERROR_ACTIVE;
+		else
+			new_state = CAN_STATE_ERROR_WARNING;
+	} else if (unlikely(flt == FLEXCAN_ESR_FLT_CONF_PASSIVE))
+		new_state = CAN_STATE_ERROR_PASSIVE;
+	else
+		new_state = CAN_STATE_BUS_OFF;
+
+	/* state hasn't changed */
+	if (likely(new_state == priv->can.state))
+		return 0;
+
+	skb = alloc_can_err_skb(dev, &cf);
+	if (unlikely(!skb))
+		return 0;
+
+	do_state(dev, cf, new_state);
+	priv->can.state = new_state;
+	netif_receive_skb(skb);
+
+	dev->stats.rx_packets++;
+	dev->stats.rx_bytes += cf->can_dlc;
+
+	return 1;
+}
+
+static void flexcan_read_fifo(const struct net_device *dev,
+			      struct can_frame *cf)
+{
+	const struct flexcan_priv *priv = netdev_priv(dev);
+	struct flexcan_regs __iomem *regs = priv->base;
+	struct flexcan_mb __iomem *mb = &regs->cantxfg[0];
+	u32 reg_ctrl, reg_id;
+
+	reg_ctrl = readl(&mb->can_ctrl);
+	reg_id = readl(&mb->can_id);
+	if (reg_ctrl & FLEXCAN_MB_CNT_IDE)
+		cf->can_id = ((reg_id >> 0) & CAN_EFF_MASK) | CAN_EFF_FLAG;
+	else
+		cf->can_id = (reg_id >> 18) & CAN_SFF_MASK;
+
+	if (reg_ctrl & FLEXCAN_MB_CNT_RTR)
+		cf->can_id |= CAN_RTR_FLAG;
+	cf->can_dlc = get_can_dlc((reg_ctrl >> 16) & 0xf);
+
+	*(__be32 *)(cf->data + 0) = cpu_to_be32(readl(&mb->data[0]));
+	*(__be32 *)(cf->data + 4) = cpu_to_be32(readl(&mb->data[1]));
+
+	/* mark as read */
+	writel(FLEXCAN_IFLAG_RX_FIFO_AVAILABLE, &regs->iflag1);
+	readl(&regs->timer);
+}
+
+static int flexcan_read_frame(struct net_device *dev)
+{
+	struct net_device_stats *stats = &dev->stats;
+	struct can_frame *cf;
+	struct sk_buff *skb;
+
+	skb = alloc_can_skb(dev, &cf);
+	if (unlikely(!skb)) {
+		stats->rx_dropped++;
+		return 0;
+	}
+
+	flexcan_read_fifo(dev, cf);
+	netif_receive_skb(skb);
+
+	stats->rx_packets++;
+	stats->rx_bytes += cf->can_dlc;
+
+	return 1;
+}
+
+static int flexcan_poll(struct napi_struct *napi, int quota)
+{
+	struct net_device *dev = napi->dev;
+	const struct flexcan_priv *priv = netdev_priv(dev);
+	struct flexcan_regs __iomem *regs = priv->base;
+	u32 reg_iflag1, reg_esr;
+	int work_done = 0;
+
+	/*
+	 * The error bits are cleared on read,
+	 * use saved value from irq handler.
+	 */
+	reg_esr = readl(&regs->esr) | priv->reg_esr;
+
+	/* handle state changes */
+	work_done += flexcan_poll_state(dev, reg_esr);
+
+	/* handle RX-FIFO */
+	reg_iflag1 = readl(&regs->iflag1);
+	while (reg_iflag1 & FLEXCAN_IFLAG_RX_FIFO_AVAILABLE &&
+	       work_done < quota) {
+		work_done += flexcan_read_frame(dev);
+		reg_iflag1 = readl(&regs->iflag1);
+	}
+
+	/* report bus errors */
+	if (flexcan_has_and_handle_berr(priv, reg_esr) && work_done < quota)
+		work_done += flexcan_poll_bus_err(dev, reg_esr);
+
+	if (work_done < quota) {
+		napi_complete(napi);
+		/* enable IRQs */
+		writel(FLEXCAN_IFLAG_DEFAULT, &regs->imask1);
+		writel(priv->reg_ctrl_default, &regs->ctrl);
+	}
+
+	return work_done;
+}
+
+static irqreturn_t flexcan_irq(int irq, void *dev_id)
+{
+	struct net_device *dev = dev_id;
+	struct net_device_stats *stats = &dev->stats;
+	struct flexcan_priv *priv = netdev_priv(dev);
+	struct flexcan_regs __iomem *regs = priv->base;
+	u32 reg_iflag1, reg_esr;
+
+	reg_iflag1 = readl(&regs->iflag1);
+	reg_esr = readl(&regs->esr);
+	writel(FLEXCAN_ESR_ERR_INT, &regs->esr);	/* ACK err IRQ */
+
+	/*
+	 * schedule NAPI in case of:
+	 * - rx IRQ
+	 * - state change IRQ
+	 * - bus error IRQ and bus error reporting is activated
+	 */
+	if ((reg_iflag1 & FLEXCAN_IFLAG_RX_FIFO_AVAILABLE) ||
+	    (reg_esr & FLEXCAN_ESR_ERR_STATE) ||
+	    flexcan_has_and_handle_berr(priv, reg_esr)) {
+		/*
+		 * The error bits are cleared on read,
+		 * save them for later use.
+		 */
+		priv->reg_esr = reg_esr & FLEXCAN_ESR_ERR_BUS;
+		writel(FLEXCAN_IFLAG_DEFAULT & ~FLEXCAN_IFLAG_RX_FIFO_AVAILABLE,
+		       &regs->imask1);
+		writel(priv->reg_ctrl_default & ~FLEXCAN_CTRL_ERR_ALL,
+		       &regs->ctrl);
+		napi_schedule(&priv->napi);
+	}
+
+	/* FIFO overflow */
+	if (reg_iflag1 & FLEXCAN_IFLAG_RX_FIFO_OVERFLOW) {
+		writel(FLEXCAN_IFLAG_RX_FIFO_OVERFLOW, &regs->iflag1);
+		dev->stats.rx_over_errors++;
+		dev->stats.rx_errors++;
+	}
+
+	/* transmission complete interrupt */
+	if (reg_iflag1 & (1 << FLEXCAN_TX_BUF_ID)) {
+		/* tx_bytes is incremented in flexcan_start_xmit */
+		stats->tx_packets++;
+		writel((1 << FLEXCAN_TX_BUF_ID), &regs->iflag1);
+		netif_wake_queue(dev);
+	}
+
+	return IRQ_HANDLED;
+}
+
+static void flexcan_set_bittiming(struct net_device *dev)
+{
+	const struct flexcan_priv *priv = netdev_priv(dev);
+	const struct can_bittiming *bt = &priv->can.bittiming;
+	struct flexcan_regs __iomem *regs = priv->base;
+	u32 reg;
+
+	reg = readl(&regs->ctrl);
+	reg &= ~(FLEXCAN_CTRL_PRESDIV(0xff) |
+		 FLEXCAN_CTRL_RJW(0x3) |
+		 FLEXCAN_CTRL_PSEG1(0x7) |
+		 FLEXCAN_CTRL_PSEG2(0x7) |
+		 FLEXCAN_CTRL_PROPSEG(0x7) |
+		 FLEXCAN_CTRL_LPB |
+		 FLEXCAN_CTRL_SMP |
+		 FLEXCAN_CTRL_LOM);
+
+	reg |= FLEXCAN_CTRL_PRESDIV(bt->brp - 1) |
+		FLEXCAN_CTRL_PSEG1(bt->phase_seg1 - 1) |
+		FLEXCAN_CTRL_PSEG2(bt->phase_seg2 - 1) |
+		FLEXCAN_CTRL_RJW(bt->sjw - 1) |
+		FLEXCAN_CTRL_PROPSEG(bt->prop_seg - 1);
+
+	if (priv->can.ctrlmode & CAN_CTRLMODE_LOOPBACK)
+		reg |= FLEXCAN_CTRL_LPB;
+	if (priv->can.ctrlmode & CAN_CTRLMODE_LISTENONLY)
+		reg |= FLEXCAN_CTRL_LOM;
+	if (priv->can.ctrlmode & CAN_CTRLMODE_3_SAMPLES)
+		reg |= FLEXCAN_CTRL_SMP;
+
+	dev_info(dev->dev.parent, "writing ctrl=0x%08x\n", reg);
+	writel(reg, &regs->ctrl);
+
+	/* print chip status */
+	dev_dbg(dev->dev.parent, "%s: mcr=0x%08x ctrl=0x%08x\n", __func__,
+		readl(&regs->mcr), readl(&regs->ctrl));
+}
+
+/*
+ * flexcan_chip_start
+ *
+ * this functions is entered with clocks enabled
+ *
+ */
+static int flexcan_chip_start(struct net_device *dev)
+{
+	struct flexcan_priv *priv = netdev_priv(dev);
+	struct flexcan_regs __iomem *regs = priv->base;
+	unsigned int i;
+	int err;
+	u32 reg_mcr, reg_ctrl;
+
+	/* enable module */
+	flexcan_chip_enable(priv);
+
+	/* soft reset */
+	writel(FLEXCAN_MCR_SOFTRST, &regs->mcr);
+	udelay(10);
+
+	reg_mcr = readl(&regs->mcr);
+	if (reg_mcr & FLEXCAN_MCR_SOFTRST) {
+		dev_err(dev->dev.parent,
+			"Failed to softreset can module (mcr=0x%08x)\n",
+			reg_mcr);
+		err = -ENODEV;
+		goto out;
+	}
+
+	flexcan_set_bittiming(dev);
+
+	/*
+	 * MCR
+	 *
+	 * enable freeze
+	 * enable fifo
+	 * halt now
+	 * only supervisor access
+	 * enable warning int
+	 * choose format C
+	 *
+	 */
+	reg_mcr = readl(&regs->mcr);
+	reg_mcr |= FLEXCAN_MCR_FRZ | FLEXCAN_MCR_FEN | FLEXCAN_MCR_HALT |
+		FLEXCAN_MCR_SUPV | FLEXCAN_MCR_WRN_EN |
+		FLEXCAN_MCR_IDAM_C;
+	dev_dbg(dev->dev.parent, "%s: writing mcr=0x%08x", __func__, reg_mcr);
+	writel(reg_mcr, &regs->mcr);
+
+	/*
+	 * CTRL
+	 *
+	 * disable timer sync feature
+	 *
+	 * disable auto busoff recovery
+	 * transmit lowest buffer first
+	 *
+	 * enable tx and rx warning interrupt
+	 * enable bus off interrupt
+	 * (== FLEXCAN_CTRL_ERR_STATE)
+	 *
+	 * _note_: we enable the "error interrupt"
+	 * (FLEXCAN_CTRL_ERR_MSK), too. Otherwise we don't get any
+	 * warning or bus passive interrupts.
+	 */
+	reg_ctrl = readl(&regs->ctrl);
+	reg_ctrl &= ~FLEXCAN_CTRL_TSYN;
+	reg_ctrl |= FLEXCAN_CTRL_BOFF_REC | FLEXCAN_CTRL_LBUF |
+		FLEXCAN_CTRL_ERR_STATE | FLEXCAN_CTRL_ERR_MSK;
+
+	/* save for later use */
+	priv->reg_ctrl_default = reg_ctrl;
+	dev_dbg(dev->dev.parent, "%s: writing ctrl=0x%08x", __func__, reg_ctrl);
+	writel(reg_ctrl, &regs->ctrl);
+
+	for (i = 0; i < ARRAY_SIZE(regs->cantxfg); i++) {
+		writel(0, &regs->cantxfg[i].can_ctrl);
+		writel(0, &regs->cantxfg[i].can_id);
+		writel(0, &regs->cantxfg[i].data[0]);
+		writel(0, &regs->cantxfg[i].data[1]);
+
+		/* put MB into rx queue */
+		writel(FLEXCAN_MB_CNT_CODE(0x4), &regs->cantxfg[i].can_ctrl);
+	}
+
+	/* acceptance mask/acceptance code (accept everything) */
+	writel(0x0, &regs->rxgmask);
+	writel(0x0, &regs->rx14mask);
+	writel(0x0, &regs->rx15mask);
+
+	flexcan_transceiver_switch(priv, 1);
+
+	/* synchronize with the can bus */
+	reg_mcr = readl(&regs->mcr);
+	reg_mcr &= ~FLEXCAN_MCR_HALT;
+	writel(reg_mcr, &regs->mcr);
+
+	priv->can.state = CAN_STATE_ERROR_ACTIVE;
+
+	/* enable FIFO interrupts */
+	writel(FLEXCAN_IFLAG_DEFAULT, &regs->imask1);
+
+	/* print chip status */
+	dev_dbg(dev->dev.parent, "%s: reading mcr=0x%08x ctrl=0x%08x\n",
+		__func__, readl(&regs->mcr), readl(&regs->ctrl));
+
+	return 0;
+
+ out:
+	flexcan_chip_disable(priv);
+	return err;
+}
+
+/*
+ * flexcan_chip_stop
+ *
+ * this functions is entered with clocks enabled
+ *
+ */
+static void flexcan_chip_stop(struct net_device *dev)
+{
+	struct flexcan_priv *priv = netdev_priv(dev);
+	struct flexcan_regs __iomem *regs = priv->base;
+	u32 reg;
+
+	/* Disable all interrupts */
+	writel(0, &regs->imask1);
+
+	/* Disable + halt module */
+	reg = readl(&regs->mcr);
+	reg |= FLEXCAN_MCR_MDIS | FLEXCAN_MCR_HALT;
+	writel(reg, &regs->mcr);
+
+	flexcan_transceiver_switch(priv, 0);
+	priv->can.state = CAN_STATE_STOPPED;
+
+	return;
+}
+
+static int flexcan_open(struct net_device *dev)
+{
+	struct flexcan_priv *priv = netdev_priv(dev);
+	int err;
+
+	clk_enable(priv->clk);
+
+	err = open_candev(dev);
+	if (err)
+		goto out;
+
+	err = request_irq(dev->irq, flexcan_irq, IRQF_SHARED, dev->name, dev);
+	if (err)
+		goto out_close;
+
+	/* start chip and queuing */
+	err = flexcan_chip_start(dev);
+	if (err)
+		goto out_close;
+	napi_enable(&priv->napi);
+	netif_start_queue(dev);
+
+	return 0;
+
+ out_close:
+	close_candev(dev);
+ out:
+	clk_disable(priv->clk);
+
+	return err;
+}
+
+static int flexcan_close(struct net_device *dev)
+{
+	struct flexcan_priv *priv = netdev_priv(dev);
+
+	netif_stop_queue(dev);
+	napi_disable(&priv->napi);
+	flexcan_chip_stop(dev);
+
+	free_irq(dev->irq, dev);
+	clk_disable(priv->clk);
+
+	close_candev(dev);
+
+	return 0;
+}
+
+static int flexcan_set_mode(struct net_device *dev, enum can_mode mode)
+{
+	int err;
+
+	switch (mode) {
+	case CAN_MODE_START:
+		err = flexcan_chip_start(dev);
+		if (err)
+			return err;
+
+		netif_wake_queue(dev);
+		break;
+
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static const struct net_device_ops flexcan_netdev_ops = {
+	.ndo_open	= flexcan_open,
+	.ndo_stop	= flexcan_close,
+	.ndo_start_xmit	= flexcan_start_xmit,
+};
+
+static int __devinit register_flexcandev(struct net_device *dev)
+{
+	struct flexcan_priv *priv = netdev_priv(dev);
+	struct flexcan_regs __iomem *regs = priv->base;
+	u32 reg, err;
+
+	clk_enable(priv->clk);
+
+	/* select "bus clock", chip must be disabled */
+	flexcan_chip_disable(priv);
+	reg = readl(&regs->ctrl);
+	reg |= FLEXCAN_CTRL_CLK_SRC;
+	writel(reg, &regs->ctrl);
+
+	flexcan_chip_enable(priv);
+
+	/* set freeze, halt and activate FIFO, restrict register access */
+	reg = readl(&regs->mcr);
+	reg |= FLEXCAN_MCR_FRZ | FLEXCAN_MCR_HALT |
+		FLEXCAN_MCR_FEN | FLEXCAN_MCR_SUPV;
+	writel(reg, &regs->mcr);
+
+	/*
+	 * Currently we only support newer versions of this core
+	 * featuring a RX FIFO. Older cores found on some Coldfire
+	 * derivates are not yet supported.
+	 */
+	reg = readl(&regs->mcr);
+	if (!(reg & FLEXCAN_MCR_FEN)) {
+		dev_err(dev->dev.parent,
+			"Could not enable RX FIFO, unsupported core\n");
+		err = -ENODEV;
+		goto out;
+	}
+
+	err = register_candev(dev);
+
+ out:
+	/* disable core and turn off clocks */
+	flexcan_chip_disable(priv);
+	clk_disable(priv->clk);
+
+	return err;
+}
+
+static void __devexit unregister_flexcandev(struct net_device *dev)
+{
+	unregister_candev(dev);
+}
+
+static int __devinit flexcan_probe(struct platform_device *pdev)
+{
+	struct net_device *dev;
+	struct flexcan_priv *priv;
+	struct resource *mem;
+	struct clk *clk;
+	void __iomem *base;
+	resource_size_t mem_size;
+	int err, irq;
+
+	clk = clk_get(&pdev->dev, NULL);
+	if (IS_ERR(clk)) {
+		dev_err(&pdev->dev, "no clock defined\n");
+		err = PTR_ERR(clk);
+		goto failed_clock;
+	}
+
+	mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	irq = platform_get_irq(pdev, 0);
+	if (!mem || irq <= 0) {
+		err = -ENODEV;
+		goto failed_get;
+	}
+
+	mem_size = resource_size(mem);
+	if (!request_mem_region(mem->start, mem_size, pdev->name)) {
+		err = -EBUSY;
+		goto failed_req;
+	}
+
+	base = ioremap(mem->start, mem_size);
+	if (!base) {
+		err = -ENOMEM;
+		goto failed_map;
+	}
+
+	dev = alloc_candev(sizeof(struct flexcan_priv), 0);
+	if (!dev) {
+		err = -ENOMEM;
+		goto failed_alloc;
+	}
+
+	dev->netdev_ops = &flexcan_netdev_ops;
+	dev->irq = irq;
+	dev->flags |= IFF_ECHO; /* we support local echo in hardware */
+
+	priv = netdev_priv(dev);
+	priv->can.clock.freq = clk_get_rate(clk);
+	priv->can.bittiming_const = &flexcan_bittiming_const;
+	priv->can.do_set_mode = flexcan_set_mode;
+	priv->can.do_get_berr_counter = flexcan_get_berr_counter;
+	priv->can.ctrlmode_supported = CAN_CTRLMODE_LOOPBACK |
+		CAN_CTRLMODE_LISTENONLY	| CAN_CTRLMODE_3_SAMPLES |
+		CAN_CTRLMODE_BERR_REPORTING;
+	priv->base = base;
+	priv->dev = dev;
+	priv->clk = clk;
+	priv->pdata = pdev->dev.platform_data;
+
+	netif_napi_add(dev, &priv->napi, flexcan_poll, FLEXCAN_NAPI_WEIGHT);
+
+	dev_set_drvdata(&pdev->dev, dev);
+	SET_NETDEV_DEV(dev, &pdev->dev);
+
+	err = register_flexcandev(dev);
+	if (err) {
+		dev_err(&pdev->dev, "registering netdev failed\n");
+		goto failed_register;
+	}
+
+	dev_info(&pdev->dev, "device registered (reg_base=%p, irq=%d)\n",
+		 priv->base, dev->irq);
+
+	return 0;
+
+ failed_register:
+	free_candev(dev);
+ failed_alloc:
+	iounmap(base);
+ failed_map:
+	release_mem_region(mem->start, mem_size);
+ failed_req:
+	clk_put(clk);
+ failed_get:
+ failed_clock:
+	return err;
+}
+
+static int __devexit flexcan_remove(struct platform_device *pdev)
+{
+	struct net_device *dev = platform_get_drvdata(pdev);
+	struct flexcan_priv *priv = netdev_priv(dev);
+	struct resource *mem;
+
+	unregister_flexcandev(dev);
+	platform_set_drvdata(pdev, NULL);
+	free_candev(dev);
+	iounmap(priv->base);
+
+	mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	release_mem_region(mem->start, resource_size(mem));
+
+	clk_put(priv->clk);
+
+	return 0;
+}
+
+static struct platform_driver flexcan_driver = {
+	.driver.name = DRV_NAME,
+	.probe = flexcan_probe,
+	.remove = __devexit_p(flexcan_remove),
+};
+
+static int __init flexcan_init(void)
+{
+	pr_info("%s netdevice driver\n", DRV_NAME);
+	return platform_driver_register(&flexcan_driver);
+}
+
+static void __exit flexcan_exit(void)
+{
+	platform_driver_unregister(&flexcan_driver);
+	pr_info("%s: driver removed\n", DRV_NAME);
+}
+
+module_init(flexcan_init);
+module_exit(flexcan_exit);
+
+MODULE_AUTHOR("Sascha Hauer <kernel-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>, "
+	      "Marc Kleine-Budde <kernel-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("CAN port driver for flexcan based chip");
diff --git a/include/linux/can/platform/flexcan.h b/include/linux/can/platform/flexcan.h
new file mode 100644
index 0000000..72b713a
--- /dev/null
+++ b/include/linux/can/platform/flexcan.h
@@ -0,0 +1,20 @@
+/*
+ * Copyright (C) 2010 Marc Kleine-Budde <kernel-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
+ *
+ * This file is released under the GPLv2
+ *
+ */
+
+#ifndef __CAN_PLATFORM_FLEXCAN_H
+#define __CAN_PLATFORM_FLEXCAN_H
+
+/**
+ * struct flexcan_platform_data - flex CAN controller platform data
+ * @transceiver_enable:         - called to power on/off the transceiver
+ *
+ */
+struct flexcan_platform_data {
+	void (*transceiver_switch)(int enable);
+};
+
+#endif /* __CAN_PLATFORM_FLEXCAN_H */
-- 
1.7.1

^ permalink raw reply related

* Re: Another oops + repost
From: Eric W. Biederman @ 2010-07-22 16:43 UTC (permalink / raw)
  To: Martín Ferrari
  Cc: netdev, Ben Hutchings, 577640, Alexey Dobriyan, Mathieu Lacage,
	Daniel Lezcano
In-Reply-To: <AANLkTin_KDKEejs1ld-qv5jOyuALysjZ8NLC2-786htJ@mail.gmail.com>

Martín Ferrari <martin.ferrari@gmail.com> writes:

> Hi again,
>
> First of all, I would like to know if anybody was able to fix this
> problem  that got kinda lost in the thread:

I can't reproduce this on 2.6.35-rc1+

Can you please test a 2.6.35-rc version?  If you can still reproduce
it there can you send me your .config?  Otherwise I expect my last
round of changes to sysfs fixed whatever was the underlying problem.

Thanks,
Eric

^ permalink raw reply

* Re: [PATCH 1/1] Bluetooth: hidp: Add support for hidraw     HIDIOCGFEATURE  and HIDIOCSFEATURE
From: Alan Ott @ 2010-07-22 16:58 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Jiri Kosina, David S Miller, Michael Poole, Bastien Nocera,
	Eric Dumazet, linux-bluetooth, linux-kernel, netdev
In-Reply-To: <1279812115.2621.6.camel@localhost.localdomain>

On 07/22/2010 11:21 AM, Marcel Holtmann wrote:
>>>>>>> what is usb-hid.ko doing here? I would expect a bunch of code
>>>>>>> duplication with minor difference between USB and Bluetooth.
>>>>>>>
>>>>>>>                
>>>>>> usbhid doesn't have a lot of code for hidraw. Two functions are involved:
>>>>>>        usbhid_output_raw_report()
>>>>>>            - calls usb_control_msg() with Get_Report
>>>>>>        usbhid_get_raw_report()
>>>>>>            - calls usb_control_msg() with Set_Report
>>>>>>                OR
>>>>>>            - calls usb_interrupt_msg() on the Ouput pipe.
>>>>>>
>>>>>> This is of course easier than bluetooth because usb_control_msg() is
>>>>>> synchronous, even when requesting reports, mostly because of the nature
>>>>>> of USB, where the request and response are part of the same transfer.
>>>>>>
>>>>>> For Bluetooth, it's a bit more complicated since the kernel treats it
>>>>>> more like a networking interface (and indeed it is). My understanding is
>>>>>> that to make a synchronous transfer in bluetooth, one must:
>>>>>>        - send the request packet
>>>>>>        - block (wait_event_*())
>>>>>>        - when the response is received in the input handler, wake_up_*().
>>>>>>
>>>>>> There's not really any code duplication, mostly because initiating
>>>>>> synchronous USB transfers (input and output) is easy (because of the
>>>>>> usb_*_msg() functions), while making synchronous Bluetooth transfers
>>>>>> must be done manually. If there's a nice, convenient, synchronous
>>>>>> function in Bluetooth similar to usb_control_msg() that I've missed,
>>>>>> then let me know, as it would simplify this whole thing.
>>>>>>
>>>>>>              
>>>>> there is not and I don't think we ever get one. My question here was
>>>>> more in the direction why HID core is doing these synchronously in the
>>>>> first place. Especially since USB can do everything async as well.
>>>>>            
>>>> I'm open to suggestions. The way I see it is from a user space
>>>> perspective. With Get_Feature being on an ioctl(), I don't see any clean
>>>> way to do it other than synchronously. Other operating systems (I can
>>>> say for sure Windows, Mac OS X, and FreeBSD) handle Get/Set Feature the
>>>> same way (synchronously) from user space.
>>>>
>>>> You seem to be proposing an asynchronous interface. What would that look
>>>> like from user space?
>>>>          
>>> not necessarily from user space, but at least from HID core to HIDP and
>>> usb-hid transports. At least that is what I would expect, Jiri?
>>>        
>> Sorry for this taking too long (vacations, conferences, you name it) for
>> me to respond.
>>
>> As all the _raw() callbacks are purely intended for userspace interaction
>> anyway, it's perfectly fine (and in fact desirable) for the low-level
>> transport drivers to perform these operations synchronously (and that's
>> what USB implementation does as well).
>>
>> Marcel, if your opposition to synchronous interface is strong, we'll have
>> to think about other aproaches, but from my POV, the patch is fine as-is
>> for Bluetooth.
>>      
> that the ioctl() API is synchronous is fine to me, however pushing that
> down to the transport drivers seems wrong to me. I think the HID core
> should be able to handle a fully asynchronous transport driver. I know
> that with the USB subsystem you are little bit spoiled here, but for
> Bluetooth it is not the case. And in the end even using the asynchronous
> USB URB calls would be nice as well.
>    

How are the URB calls better than using the synchronous calls? (see below)

> So why not make the core actually wait for responses from the transport
> driver.

Because this makes the USB side a lot more complicated, and it would 
introduce transport specific code into the core. Further, it would 
involve the transport driver calling hidraw with _every_ single packet 
it receives. Further, it would have to call hidraw with HANDSHAKE 
protocol error packets as well.

>   I would make the transport drivers a lot simpler in the long
> run.

It would make the USB transport driver and drivers/hid/hidraw much more 
complicated right now, at the expense of making the BT transport driver 
marginally simpler (and I'm not even convinced whether it would really 
be simpler). (see below for more)

>   And I know that most likely besides Bluetooth and USB you won't see
> another, but you never know.
>
>    

I just don't understand the objection. In each transport type, the 
waiting will have to be done in a different way. USB and BT are 
different enough that this is the case already, without having to 
imagine future buses which use HID. In BT, you have to check each packet 
which comes back from the BT network to see whether it is the response 
packet that hidraw is waiting for. Further, you have to check for 
HANDSHAKE packets indicating protocol error. Where better for this to be 
done than in hidp? Further, how can this possibly happen in 
drivers/hid/hidraw, as it doesn't know about the details of bluetooth to 
make this determination, and why should it? In my last email ( 
http://lkml.org/lkml/2010/7/9/231 ) to which I got no response, I laid 
out how doing the blocking in drivers/hid/hidraw would only make all the 
parts except bluetooth more complicated (including the core, and the USB 
side), and would also introduce bluetooth-specific things into the core.

Further, you're saying that using the asynchronous USB URB calls would 
be a benefit. How is it a benefit to replace a single function call 
which does exactly what I want, with a set of asynchronous calls and 
then adding my own blocking to make it do the same thing? This sounds to 
me like it would be 1: more text, 2: duplication of code, 3: error 
prone. I can't understand how this is of benefit. Please explain to me 
what I'm missing.

In theory, what you're saying makes sense. Making common code and logic 
actually common is always good. In practice though, in this case, I 
submit that there really isn't any commonality, and the only way for 
there to be commonality is to do the USB side the hard way. Further, 
drivers/hid/hidraw can't wait for a bluetooth packet without having code 
that's bluetooth-specific. It seems just that simple to me.

I'll give it some more thought, and take another look at the code to see 
if there's something obvious that I'm missing. If you know what I'm 
missing in my understanding of the problem, please tell me :)

Alan.

^ permalink raw reply

* Re: Re: Fwd: LVS on local node
From: Franchoze Eric @ 2010-07-22 16:59 UTC (permalink / raw)
  To: Simon Horman; +Cc: wensong, lvs-devel, netdev, netfilter-devel
In-Reply-To: <20100722132502.GB28362@verge.net.au>



22.07.10, 17:25, "Simon Horman" <horms@verge.net.au>:

> On Thu, Jul 22, 2010 at 07:51:20AM +0400, Franchoze Eric wrote:
>  > Hello,
>  > 
>  > I'm trying to do load balancing of incoming traffic to my applications. This applications are not very  smp friendly, and I want try to run some instances according to number of cpus on single machine. And balance load of incoming traffic/connections to this applications.
>  > Looks like is should be similar to http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.localnode.html
>  > 
>  >  linux kernel 2.6.32 with  or without hide interface patches.  Tried different configurations but could not see packets on application layer.
>  > 
>  > 192.168.1.165 - eth0 - interface for external connections
>  > 195.0.0.1 - dummy0 - virtual interface, real application is binded to that address.
>  > 
>  > Configuration is:
>  > -A -t 192.168.1.165:1234 -s wlc
>  > -a -t 192.168.1.165:1234 -r 195.0.0.1:1234 -g -w
>  > 
>  > #ipvsadm -L -n
>  > IP Virtual Server version 1.2.1 (size=4096)
>  > Prot LocalAddress:Port Scheduler Flags
>  >   -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
>  > TCP  192.168.1.165:1234 wlc
>  >   -> 195.0.0.1:1234               Local   1      0          0        
>  > #
>  > 
>  > Log:
>  > [ 2106.897409] IPVS: lookup/out TCP 192.168.1.165:44847->192.168.1.165:1234 not hit
>  > [ 2106.897412] IPVS: lookup service: fwm 0 TCP 192.168.1.165:1234 hit
>  > [ 2106.897414] IPVS: ip_vs_wlc_schedule(): Scheduling...
>  > [ 2106.897416] IPVS: WLC: server 195.0.0.1:1234 activeconns 0 refcnt 2 weight 1 overhead 1
>  > [ 2106.897418] IPVS: Enter: ip_vs_conn_new, net/netfilter/ipvs/ip_vs_conn.c line 693
>  > [ 2106.897421] IPVS: Bind-dest TCP c:192.168.1.165:44847 v:192.168.1.165:1234 d:195.0.0.1:1234 fwd:L s:0 conn->flags:181 conn->refcnt:1 dest->refcnt:3
>  > [ 2106.897425] IPVS: Schedule fwd:L c:192.168.1.165:44847 v:192.168.1.165:1234 d:195.0.0.1:1234 conn->flags:1C1 conn->refcnt:2
>  > [ 2106.897429] IPVS: TCP input  [S...] 195.0.0.1:1234->192.168.1.165:44847 state: NONE->SYN_RECV conn->refcnt:2
>  > [ 2106.897431] IPVS: Enter: ip_vs_null_xmit, net/netfilter/ipvs/ip_vs_xmit.c line 212
>  > [ 2106.897439] IPVS: lookup/in TCP 192.168.1.165:1234->192.168.1.165:44847 not hit
>  > [ 2106.897441] IPVS: lookup/out TCP 192.168.1.165:1234->192.168.1.165:44847 not hit
>  > [ 2107.277535] IPVS: packet type=1 proto=17 daddr=255.255.255.255 ignored
>  > [ 2108.542691] IPVS: packet type=1 proto=17 daddr=192.168.1.255 ignored
>  > 
>  > As the result, server application does receive anything on accept(). I tried to make dummy0 a hidden device and play with arp settings. But without result.
>  > 
>  > I will be happy to hear any idea how to do connection in this environment.
>  
>  Hi,
>  
>  while others have suggested not using LVS for this task for various reasons.
>  I would just like to comment that this should work and this smells
>  like a bug to me. I will try and confirm that. But it won't be today.
>  
>  
>  

With the latest kernel I see that: LVS accepts connections, selects right destination (if round robin is selected destination changes accoring it), then it detects that it is local node and do:
net/netfilter/ipvs/ip_vs_xmit.c:
   ip_vs_null_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
 struct ip_vs_protocol *pp)

Which does nothing with skb. (here I do not understand what happens with that packet then)
I think if VLS could change destination for packets which go from local node to local node then connection can be established. Is it reasonable?

^ permalink raw reply

* Re: [PATCH] Driver-core: Fix bluetooth network device rename  regression
From: Eric W. Biederman @ 2010-07-22 17:18 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Greg KH, Andrew Morton, Greg KH, Rafael J. Wysocki,
	Maciej W. Rozycki, Johannes Berg, netdev
In-Reply-To: <AANLkTimM5Ea8mQ7aX4kDq3dgF3P-t2Wm3dERhckC69Ja@mail.gmail.com>

Kay Sievers <kay.sievers@vrfy.org> writes:

Kay you are full of it.  My argument is that this a clear case
where aiming for perfection is the enemy of the good.

>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -673,7 +673,7 @@ static struct kobject *get_device_parent(struct device *dev,
>>                 */
>>                if (parent == NULL)
>>                        parent_kobj = virtual_device_parent(dev);
>> -               else if (parent->class)
>> +               else if (parent->class && (strcmp(dev->class->name, "net") != 0))
>

> Since the issue is not a regression, and not even a bug in the core,
> it should not be done this way for mainline.

Yes the issue is a namespace regression and it has been a regression
for the bluetooth network drivers since CONFIG_SYSFS_DEPRECATED was
introduced.  "ip link set name blarg"  Where blarg is an attribute of
the bluteooth parent device fails only with CONFIG_SYSFS_DEPRECATED
disabled.

I reported that regression for the majority of the network devices and
the class directories were introduced to avoid this problem, except
convoluted logic of when to create a class directory fails to cover
a few flavors of network drivers.

For the mac80211_hwsim driver I don't care.  That isn't real hardware
and it is only interesting for testing.  For the bluetooth network driver
that is real hardware that needs to be made to work.

> Please try to fix these drivers instead, or mark the broken for
> namespaces, if nobody can fix them right now.

Let me get this straight.  You believe a non-trival non-obvious
complete rewrite of entire subsystems after rc1 is better than
a one line bug fix that let's real hardware work for real
people?

You have had 3 years to do better a better job with the bluetooth
driver.

Furthermore we fixed the rest of this regression in the core I don't
see why we can fix the entirety of it this namespace regression in the
core.  Especially since it is an obviously correct one line change.

Eric

^ permalink raw reply

* Re: [PATCH] LSM: Add post recvmsg() hook.
From: David Miller @ 2010-07-22 17:22 UTC (permalink / raw)
  To: penguin-kernel
  Cc: kuznet, pekkas, jmorris, yoshfuji, kaber, paul.moore, netdev,
	linux-security-module
In-Reply-To: <201007222146.FGC09386.OFOtFQOVFMJSLH@I-love.SAKURA.ne.jp>

From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Thu, 22 Jul 2010 21:46:55 +0900

> David Miller wrote:
>> > Then, why does below proposal lose information?
>> 
>> Peek changes state, now it's possible that two processes end up
>> receiving the packet.
> 
> Indeed. We will need to protect sock->ops->recvmsg() call using a lock like

But this doesn't matter.

The fact is going to remain that you will be unable to return data
from recvmsg() to a blocking socket when ->poll() returns true even
though data is in fact there in the socket receive queue.

This is something that the existing LSM hooks do not do.

You can't create this silly situation where some packets in the socket
receive queue can be recvmsg()'d by some processes, but not by others.

At best, it is pure crazyness.

^ permalink raw reply

* Re: [PATCH] LSM: Add post recvmsg() hook.
From: David Miller @ 2010-07-22 17:26 UTC (permalink / raw)
  To: penguin-kernel
  Cc: kuznet, pekkas, jmorris, yoshfuji, kaber, paul.moore, netdev,
	linux-security-module
In-Reply-To: <20100722.102251.165153819.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Thu, 22 Jul 2010 10:22:51 -0700 (PDT)

> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Thu, 22 Jul 2010 21:46:55 +0900
> 
>> David Miller wrote:
>>> > Then, why does below proposal lose information?
>>> 
>>> Peek changes state, now it's possible that two processes end up
>>> receiving the packet.
>> 
>> Indeed. We will need to protect sock->ops->recvmsg() call using a lock like
> 
> But this doesn't matter.

Also, btw, we're not adding a lock to a code path which we've worked
so hard to make largely lockless.  This lock is going to kill
performance.

^ permalink raw reply

* Re: mirred, redirect action vs. dev refcount issue
From: Stephen Hemminger @ 2010-07-22 17:42 UTC (permalink / raw)
  To: hadi; +Cc: David Miller, netdev
In-Reply-To: <1279793481.2747.39.camel@bigi>

This is what I am testing now.

--- a/include/net/tc_act/tc_mirred.h	2010-07-22 09:17:41.631561924 -0700
+++ b/include/net/tc_act/tc_mirred.h	2010-07-22 09:49:53.355325645 -0700
@@ -9,6 +9,7 @@ struct tcf_mirred {
 	int			tcfm_ifindex;
 	int			tcfm_ok_push;
 	struct net_device	*tcfm_dev;
+	struct list_head	tcfm_list;
 };
 #define to_mirred(pc) \
 	container_of(pc, struct tcf_mirred, common)
--- a/net/sched/act_mirred.c	2010-07-21 16:10:48.922308036 -0700
+++ b/net/sched/act_mirred.c	2010-07-22 09:52:32.229910928 -0700
@@ -1,3 +1,4 @@
+
 /*
  * net/sched/mirred.c	packet mirroring and redirect actions
  *
@@ -21,6 +22,7 @@
 #include <linux/module.h>
 #include <linux/init.h>
 #include <linux/gfp.h>
+#include <linux/notifier.h>
 #include <net/net_namespace.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
@@ -33,6 +35,7 @@
 static struct tcf_common *tcf_mirred_ht[MIRRED_TAB_MASK + 1];
 static u32 mirred_idx_gen;
 static DEFINE_RWLOCK(mirred_lock);
+static LIST_HEAD(mirred_list);
 
 static struct tcf_hashinfo mirred_hash_info = {
 	.htab	=	tcf_mirred_ht,
@@ -47,7 +50,10 @@ static inline int tcf_mirred_release(str
 			m->tcf_bindcnt--;
 		m->tcf_refcnt--;
 		if(!m->tcf_bindcnt && m->tcf_refcnt <= 0) {
-			dev_put(m->tcfm_dev);
+			ASSERT_RTNL();
+			list_del(&m->tcfm_list);
+			if (m->tcfm_dev)
+				dev_put(m->tcfm_dev);
 			tcf_hash_destroy(&m->common, &mirred_hash_info);
 			return 1;
 		}
@@ -134,8 +140,11 @@ static int tcf_mirred_init(struct nlattr
 		m->tcfm_ok_push = ok_push;
 	}
 	spin_unlock_bh(&m->tcf_lock);
-	if (ret == ACT_P_CREATED)
+	if (ret == ACT_P_CREATED) {
+		ASSERT_RTNL();
+		list_add(&m->tcfm_list, &mirred_list);
 		tcf_hash_insert(pc, &mirred_hash_info);
+	}
 
 	return ret;
 }
@@ -164,9 +173,14 @@ static int tcf_mirred(struct sk_buff *sk
 	m->tcf_bstats.packets++;
 
 	dev = m->tcfm_dev;
+	if (!dev) {
+		printk_once(KERN_NOTICE "tc mirred: target device is gone\n");
+		goto out;
+	}
+
 	if (!(dev->flags & IFF_UP)) {
 		if (net_ratelimit())
-			pr_notice("tc mirred to Houston: device %s is gone!\n",
+			pr_notice("tc mirred to Houston: device %s is down\n",
 				  dev->name);
 		goto out;
 	}
@@ -230,6 +244,28 @@ nla_put_failure:
 	return -1;
 }
 
+static int mirred_device_event(struct notifier_block *unused,
+			       unsigned long event, void *ptr)
+{
+	struct net_device *dev = ptr;
+	struct tcf_mirred *m;
+
+	if (event == NETDEV_UNREGISTER)
+		list_for_each_entry(m, &mirred_list, tcfm_list) {
+			if (m->tcfm_dev == dev) {
+				dev_put(dev);
+				m->tcfm_dev = NULL;
+			}
+		}
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block mirred_device_notifier = {
+	.notifier_call = mirred_device_event,
+};
+
+
 static struct tc_action_ops act_mirred_ops = {
 	.kind		=	"mirred",
 	.hinfo		=	&mirred_hash_info,
@@ -250,12 +286,17 @@ MODULE_LICENSE("GPL");
 
 static int __init mirred_init_module(void)
 {
+	int err = register_netdevice_notifier(&mirred_device_notifier);
+	if (err)
+		return err;
+
 	pr_info("Mirror/redirect action on\n");
 	return tcf_register_action(&act_mirred_ops);
 }
 
 static void __exit mirred_cleanup_module(void)
 {
+	unregister_netdevice_notifier(&mirred_device_notifier);
 	tcf_unregister_action(&act_mirred_ops);
 }
 

^ permalink raw reply

* Re: [PATCH] Driver-core: Fix bluetooth network device rename regression
From: Kay Sievers @ 2010-07-22 17:44 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Greg KH, Andrew Morton, Greg KH, Rafael J. Wysocki,
	Maciej W. Rozycki, Johannes Berg, netdev
In-Reply-To: <m1vd87toy9.fsf@fess.ebiederm.org>

On Thu, Jul 22, 2010 at 19:18, Eric W. Biederman <ebiederm@xmission.com> wrote:
> Kay Sievers <kay.sievers@vrfy.org> writes:

> Let me get this straight.  You believe a non-trival non-obvious
> complete rewrite of entire subsystems after rc1 is better than
> a one line bug fix that let's real hardware work for real
> people?

Yes. The drivers are broken in its core use pattern, and enabling a
(non-common) new feature exposes this. Users should just not enable
the new feature until the drivers, they might depend on, are fixed
properly. That's the rule for all other kernel development too, and I
don't see what's different here. It's not a regression, and not a bug
in the core, which needs fixing.

I don't see what would justify subsystem-specific hacks in the core,
which we will never be able to remove. Your patch exposes sysfs device
layout for a single subsystem in an unusual manner, and we can not
change that later. We need to avoid such special cases for all
reasons.

> You have had 3 years to do better a better job with the bluetooth
> driver.

I don't know of any real-world problem here.

> Furthermore we fixed the rest of this regression in the core I don't
> see why we can fix the entirety of it this namespace regression in the
> core.  Especially since it is an obviously correct one line change.

Oh, no. It's obviously wrong to add specific lists of subsystems to
generic core code to fix individual drivers. It even implements its
own sysfs layout rules for a single subsystem.

Please help fixing the drivers, or improving the core where needed, to
support these drivers better, but don't try to throw quick and dirty
and wrong fixes at us. Sysfs is already hard to understand and more
exceptions like this, and subsystem-specific layout rules would
seriously be the wrong direction.

Thanks,
Kay

^ permalink raw reply

* TCP ssthresh in Linux/RFC
From: Yuchung Cheng @ 2010-07-22 17:56 UTC (permalink / raw)
  To: netdev

The Linux set ssthresh upon entering recovery stage (tcp_fastretrans_alert()) to
1) max(snd_cwnd/2, 2) in tcp_cong.c
2) max((tp->snd_cwnd * beta) / BICTCP_BETA_SCALE, 2U) in tcp_cubic.c

but RFC 5681 says  ssthresh = max (FlightSize / 2, 2*SMSS). It also
specifically mentions
" Implementation Note: An easy mistake to make is to simply use cwnd,
   rather than FlightSize, which in some implementations may
   incidentally increase well beyond rwnd."

I can think of a few pros and cons of both design, but would like to
hear Linux developers opinions first.

Yuchung

^ permalink raw reply

* Re: mirred, redirect action vs. dev refcount issue
From: jamal @ 2010-07-22 18:02 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, netdev
In-Reply-To: <20100722104256.5a7e7cb6@nehalam>

On Thu, 2010-07-22 at 10:42 -0700, Stephen Hemminger wrote:
> This is what I am testing now.
> 

Looks very reasonable to me. Please add my signoff when you 
submit.

cheers,
jamal

^ permalink raw reply

* Lost in iptables code.
From: Ben Greear @ 2010-07-22 18:02 UTC (permalink / raw)
  To: NetDev

I'm trying to debug an appearant bug in the "CT zones"
logic in 2.6.34, and I'm having a hard time figuring out where the
code enters the iptables logic.  (In case it matters, the
CT Zones logic is related to the 'raw' table.)

I assume it would be somewhere in netif_receive_skb,
but I don't see anything obvious.

If anyone has suggestions where to start poking at this,
please let me know.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* Re: [PATCH] Driver-core: Fix bluetooth network device rename regression
From: Johannes Berg @ 2010-07-22 18:20 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Eric W. Biederman, Greg KH, Andrew Morton, Greg KH,
	Rafael J. Wysocki, Maciej W. Rozycki, netdev
In-Reply-To: <AANLkTil3Q1Vobw48AvStl9gphaJcyUGlt6ZnMLanDMVn@mail.gmail.com>

On Thu, 2010-07-22 at 19:44 +0200, Kay Sievers wrote:

> Yes. The drivers are broken in its core use pattern, and enabling a
> (non-common) new feature exposes this.

You've been saying this for a long time now, but I still don't buy it.

This stuff WORKED before. Now, years later, you're saying that it has a
broken use pattern, and needs to be fixed. That's a pipe dream. In the
real world, you can't deliberately break things because years later you
decide that the use pattern that was working fine before is broken.

johannes

^ permalink raw reply

* Re: [PATCH] Driver-core: Fix bluetooth network device rename regression
From: Greg KH @ 2010-07-22 18:28 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Kay Sievers, Eric W. Biederman, Andrew Morton, Greg KH,
	Rafael J. Wysocki, Maciej W. Rozycki, netdev
In-Reply-To: <1279822828.12439.24.camel@jlt3.sipsolutions.net>

On Thu, Jul 22, 2010 at 08:20:28PM +0200, Johannes Berg wrote:
> On Thu, 2010-07-22 at 19:44 +0200, Kay Sievers wrote:
> 
> > Yes. The drivers are broken in its core use pattern, and enabling a
> > (non-common) new feature exposes this.
> 
> You've been saying this for a long time now, but I still don't buy it.
> 
> This stuff WORKED before. Now, years later, you're saying that it has a
> broken use pattern, and needs to be fixed. That's a pipe dream. In the
> real world, you can't deliberately break things because years later you
> decide that the use pattern that was working fine before is broken.

It worked only because no one realized that it was broken with the
DEPRECATED option enabled.  When that is enabled, it is broken, right?

Eric's changes to sysfs to add namespace support exposed this breakage.
That's not a reason to paper over the problem, but it should be driving
someone to fix it correctly, as has been pointed out a number of times
already.

thanks,

greg k-h

^ permalink raw reply

* Re: Lost in iptables code.
From: Peter Chacko @ 2010-07-22 18:37 UTC (permalink / raw)
  To: Ben Greear; +Cc: NetDev
In-Reply-To: <4C4887D0.3050309@candelatech.com>

follow NF_HOOKS.

On Thu, Jul 22, 2010 at 11:32 PM, Ben Greear <greearb@candelatech.com> wrote:
> I'm trying to debug an appearant bug in the "CT zones"
> logic in 2.6.34, and I'm having a hard time figuring out where the
> code enters the iptables logic.  (In case it matters, the
> CT Zones logic is related to the 'raw' table.)
>
> I assume it would be somewhere in netif_receive_skb,
> but I don't see anything obvious.
>
> If anyone has suggestions where to start poking at this,
> please let me know.
>
> Thanks,
> Ben
>
> --
> Ben Greear <greearb@candelatech.com>
> Candela Technologies Inc  http://www.candelatech.com
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply

* Re: [PATCH] Driver-core: Fix bluetooth network device rename regression
From: Johannes Berg @ 2010-07-22 18:36 UTC (permalink / raw)
  To: Greg KH
  Cc: Kay Sievers, Eric W. Biederman, Andrew Morton, Greg KH,
	Rafael J. Wysocki, Maciej W. Rozycki, netdev
In-Reply-To: <20100722182827.GA12821@suse.de>

On Thu, 2010-07-22 at 11:28 -0700, Greg KH wrote:

> It worked only because no one realized that it was broken with the
> DEPRECATED option enabled.  When that is enabled, it is broken, right?

I'm pretty sure I always had that enabled, and never had issues. Can't
test right now since I don't have that option back yet in the tree I'm
using.

> Eric's changes to sysfs to add namespace support exposed this breakage.
> That's not a reason to paper over the problem, but it should be driving
> someone to fix it correctly, as has been pointed out a number of times
> already.

I'm just contesting that that someone should be me. I don't think you
get to blame driver developers for doing something that worked and
solved the problem they needed to solve. sysfs is largely opaque to most
of us already, and it now sure feels like Kay decided to change the
rules underneath the code in saying "this was wrong all along".

johannes

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox