Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: no reassembly for outgoing packets on RAW socket
From: Jan Engelhardt @ 2010-06-09 15:15 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Jiri Olsa, netdev, Netfilter Developer Mailing List
In-Reply-To: <4C0FA24A.7060907@trash.net>


On Wednesday 2010-06-09 16:16, Patrick McHardy wrote:
>>>> I'd like to be able to sendout a single IP packet with MF flag set.
>>>>
>>>> When using RAW sockets the packet will get stuck in the
>>>> netfilter (NF_INET_LOCAL_OUT nf_defrag_ipv4 reassembly unit)
>>>> and wont ever make it out..
>>>>
>>>> I made a change which bypass the outgoing reassembly for
>>>> RAW sockets, but I'm not sure wether it's too invasive..
>>>>       
>>> That would break reassembly (and thus connection tracking) for cases
>>> where its really intended.
>>>     
>>>> Is there any standard for RAW sockets behaviour?
>>>> Or another way around? :)
>>>>       
>>> You could use the NOTRACK target to bypass connection tracking.
>>
>> I tried the NOTRACK target, but the packet is still going
>> throught reassembly, because the RAW filter has lower priority
>> then the connection track defragmentation..
>
>Right.

Blech. That reminds me of
http://marc.info/?l=netfilter-devel&m=126581823826735&w=2

^ permalink raw reply

* Re: no reassembly for outgoing packets on RAW socket
From: Patrick McHardy @ 2010-06-09 15:16 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Jiri Olsa, netdev, Netfilter Developer Mailing List
In-Reply-To: <alpine.LSU.2.01.1006091713260.30265@obet.zrqbmnf.qr>

Jan Engelhardt wrote:
> On Wednesday 2010-06-09 16:16, Patrick McHardy wrote:
>>>> You could use the NOTRACK target to bypass connection tracking.
>>>>         
>>> I tried the NOTRACK target, but the packet is still going
>>> throught reassembly, because the RAW filter has lower priority
>>> then the connection track defragmentation..
>>>       
>> Right.
>>     
>
> Blech. That reminds me of
> http://marc.info/?l=netfilter-devel&m=126581823826735&w=2
>   

We already fixed that.


^ permalink raw reply

* Re: [PATCH v3] netfilter: Xtables: idletimer target implementation
From: Jan Engelhardt @ 2010-06-09 15:18 UTC (permalink / raw)
  To: Luciano Coelho
  Cc: ext Patrick McHardy, netfilter-devel@vger.kernel.org,
	netdev@vger.kernel.org, Timo Teras
In-Reply-To: <1276096275.31892.117.camel@chilepepper>


On Wednesday 2010-06-09 17:11, Luciano Coelho wrote:
>> 
>> How does this work with multiple namespaces? It seems every namespace
>> can bind to any timer.
>
>I was implementing this solution for multiple namespaces (see the
>previous versions of my patch), but the code started getting really
>complicated.  Then I found out that sysfs and multiple namespaces are
>not working very well together yet and decided to drop it for the time
>being.  Of course this doesn't matter anymore, since the timers belong
>to an independent class in sysfs, so I can easily add multiple namespace
>support by adding struct net *net as part of the list key together with
>the label.
>
>Do you think it's okay to leave it like this for now and extend it for
>multiple namespace support with a future patch?

Yes. Least thing we need is one humongous patch. :)


>> > +	timer = __idletimer_tg_find_by_label(info->label);
>> > +	if (!timer) {
>> > +		spin_unlock(&list_lock);
>> > +		timer = idletimer_tg_create(info);
>> >   
>> 
>> How does this prevent creating the same timer twice?
>
>The timer will only be created if __idletimer_tg_find_by_label() returns
>NULL, which means that no timer with that label has been found.  "info"
>won't be the same if info->label is different, right? Or can it change
>on the fly?

One thing to be generally aware about is that things could potentially
be instantiated by another entity between the time a label was looked up
with negative result and the time one tries to add it.
It may thus be required to extend keeping the lock until after
idletimer_tg_create, in other words, lookup and create must be atomic
to the rest of the world.

^ permalink raw reply

* Re: no reassembly for outgoing packets on RAW socket
From: Jan Engelhardt @ 2010-06-09 15:20 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Jiri Olsa, netdev, Netfilter Developer Mailing List
In-Reply-To: <4C0FB068.9090700@trash.net>


On Wednesday 2010-06-09 17:16, Patrick McHardy wrote:
>Jan Engelhardt wrote:
>> On Wednesday 2010-06-09 16:16, Patrick McHardy wrote:
>>>>> You could use the NOTRACK target to bypass connection tracking.
>>>>>         
>>>> I tried the NOTRACK target, but the packet is still going
>>>> throught reassembly, because the RAW filter has lower priority
>>>> then the connection track defragmentation..
>>>
>>> Right.
>>
>> Blech. That reminds me of
>> http://marc.info/?l=netfilter-devel&m=126581823826735&w=2
>
>We already fixed that.

I know, and I posted it for the understanding of the OP
as to why RAW is after DEFRAG.

^ permalink raw reply

* Re: 2.6.35-rc2, CONFIG_RPS is filling the dmesg log
From: Tim Gardner @ 2010-06-09 15:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1276090953.2442.140.camel@edumazet-laptop>

[-- Attachment #1: Type: text/plain, Size: 2011 bytes --]

On 06/09/2010 07:42 AM, Eric Dumazet wrote:
> Le mercredi 09 juin 2010 à 07:27 -0600, Tim Gardner a écrit :
>> On 06/08/2010 02:55 PM, Tim Gardner wrote:
>>> With 2.6.35-rc2 my dmesg log is being flooded with messages like this:
>>>
>>> br0 received packet on queue 4, but number of RX queues is 1
>>>
>>> This machine is bridged for KVM and has 2 igb network adapters.
>>>
>>> The root cause appears to be CONFIG_RPS=y and the fact that none of the
>>> drivers that call skb_record_rx_queue() perform their net device
>>> allocation using alloc_netdev_mq(), thereby initializing num_rx_queues
>>> to a maximum of 1.
>>>
>>> Given that this is early RPS days, is the warning in get_rps_cpu()
>>> really necessary? It would appear that _all_ of the multi-receive queue
>>> devices that call skb_record_rx_queue() will cause this log noise.
>>>
>>> By the way, how do you turn off CONFIG_RPS? The only way I could get it
>>> disabled was to change the default in net/Kconfig to 'n'.
>>>
>>> rtg
>>
>> This is the route that I'm taking with Ubuntu in the short term. I'll
>> have lots of server testers complaining pretty soon if I don't take care
>> of this now. It does keep my server logs from filling.
>>
>> rtg
>>
>
> Probably fine, but your commit message is not exact :
>
>    So far no users of skb_record_rx_queue() use alloc_netdev_mq() for
>    network device initialization, so don't print a warning about num_rx_queues
>    imbalances in get_rps_cpu() unless they have actually been allocated.
>
> In fact, drivers that use skb_record_rx_queue() did use alloc_netdev_mq().
>
> Problem is : packets going thru bridge/bonding that are not yet
> multiqueue enabled. If R[PF]S enabled for these "virtual devices",
> we trigger the get_rps_cpu() warning.
>
> Also, in a bonding setup, we still have a problem
> because all tx packets will go thru tx queue 0 (dev_pick_tx() job)
>
> (That might be good to know that for Ubuntu server testers)
>

How about this?

-- 
Tim Gardner tim.gardner@canonical.com

[-- Attachment #2: 0001-UBUNTU-SAUCE-net-Print-num_rx_queues-imbalance-warni.patch --]
[-- Type: text/x-patch, Size: 1443 bytes --]

>From ad76786a1a0c7b7b3c9bfeb4116fa0e2742f6328 Mon Sep 17 00:00:00 2001
From: Tim Gardner <tim.gardner@canonical.com>
Date: Tue, 8 Jun 2010 17:51:27 -0600
Subject: [PATCH] net: Print num_rx_queues imbalance warning only when there are allocated queues

BugLink: http://bugs.launchpad.net/bugs/591416

There are a number of network drivers (bridge, bonding, etc) that are not yet
receive multi-queue enabled and use alloc_netdev(), so don't print a
num_rx_queues imbalance warning in that case.

Also, only print the warning once for those drivers that _are_ multi-queue
enabled.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
---
 net/core/dev.c |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index d03470f..14a8568 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2253,11 +2253,9 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb,
 	if (skb_rx_queue_recorded(skb)) {
 		u16 index = skb_get_rx_queue(skb);
 		if (unlikely(index >= dev->num_rx_queues)) {
-			if (net_ratelimit()) {
-				pr_warning("%s received packet on queue "
-					"%u, but number of RX queues is %u\n",
-					dev->name, index, dev->num_rx_queues);
-			}
+			WARN_ONCE(dev->num_rx_queues > 1, "%s received packet "
+				"on queue %u, but number of RX queues is %u\n",
+				dev->name, index, dev->num_rx_queues);
 			goto done;
 		}
 		rxqueue = dev->_rx + index;
-- 
1.7.0.4


^ permalink raw reply related

* Re: 2.6.35-rc2-git1 - include/linux/cgroup.h:534 invoked rcu_dereference_check() without protection!
From: Peter Zijlstra @ 2010-06-09 15:24 UTC (permalink / raw)
  To: Miles Lane
  Cc: paulmck, Vivek Goyal, Eric Paris, Lai Jiangshan, Ingo Molnar,
	LKML, nauman, eric.dumazet, netdev, Jens Axboe, Gui Jianfeng,
	Li Zefan, Johannes Berg
In-Reply-To: <AANLkTil6ucLZ0A6tWx-OQt_2EeKnx8nSx6rIg2t1L2t5@mail.gmail.com>

On Wed, 2010-06-09 at 11:11 -0400, Miles Lane wrote:
> 
> Sorry.  I misunderstood this message when I first read it.  I didn't
> realize this message include a new version of the patch.
> Anyhow, I just tried to apply the patch to 2.6.35-rc2-git3 and got this:
> 
> # patch -p1 -l -F 20 --dry-run < ../5.patch
> patching file include/linux/cgroup.h
> patching file kernel/sched.c
> Hunk #1 succeeded at 306 with fuzz 1.
> Hunk #3 FAILED at 4462.
> Hunk #4 succeeded at 4487 with fuzz 3.
> 1 out of 4 hunks FAILED -- saving rejects to file kernel/sched.c.rej

Weird.. it seems to apply without trouble to Linus' git tree.

root@twins:/usr/src/linux-2.6# git checkout -f origin/master
HEAD is now at 84f7586... Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
root@twins:/usr/src/linux-2.6# quilt push
Applying patch patches/sched-rcu-validation.patch
patching file include/linux/cgroup.h
patching file kernel/sched.c

Now at patch patches/sched-rcu-validation.patch
root@twins:/usr/src/linux-2.6# git describe
v2.6.35-rc2-54-g84f7586

^ permalink raw reply

* Re: 2.6.35-rc2, CONFIG_RPS is filling the dmesg log
From: Eric Dumazet @ 2010-06-09 15:27 UTC (permalink / raw)
  To: tim.gardner; +Cc: netdev
In-Reply-To: <4C0FB1D3.60901@canonical.com>

Le mercredi 09 juin 2010 à 09:22 -0600, Tim Gardner a écrit :
> On 06/09/2010 07:42 AM, Eric Dumazet wrote:
> > Le mercredi 09 juin 2010 à 07:27 -0600, Tim Gardner a écrit :
> >> On 06/08/2010 02:55 PM, Tim Gardner wrote:
> >>> With 2.6.35-rc2 my dmesg log is being flooded with messages like this:
> >>>
> >>> br0 received packet on queue 4, but number of RX queues is 1
> >>>
> >>> This machine is bridged for KVM and has 2 igb network adapters.
> >>>
> >>> The root cause appears to be CONFIG_RPS=y and the fact that none of the
> >>> drivers that call skb_record_rx_queue() perform their net device
> >>> allocation using alloc_netdev_mq(), thereby initializing num_rx_queues
> >>> to a maximum of 1.
> >>>
> >>> Given that this is early RPS days, is the warning in get_rps_cpu()
> >>> really necessary? It would appear that _all_ of the multi-receive queue
> >>> devices that call skb_record_rx_queue() will cause this log noise.
> >>>
> >>> By the way, how do you turn off CONFIG_RPS? The only way I could get it
> >>> disabled was to change the default in net/Kconfig to 'n'.
> >>>
> >>> rtg
> >>
> >> This is the route that I'm taking with Ubuntu in the short term. I'll
> >> have lots of server testers complaining pretty soon if I don't take care
> >> of this now. It does keep my server logs from filling.
> >>
> >> rtg
> >>
> >
> > Probably fine, but your commit message is not exact :
> >
> >    So far no users of skb_record_rx_queue() use alloc_netdev_mq() for
> >    network device initialization, so don't print a warning about num_rx_queues
> >    imbalances in get_rps_cpu() unless they have actually been allocated.
> >
> > In fact, drivers that use skb_record_rx_queue() did use alloc_netdev_mq().
> >
> > Problem is : packets going thru bridge/bonding that are not yet
> > multiqueue enabled. If R[PF]S enabled for these "virtual devices",
> > we trigger the get_rps_cpu() warning.
> >
> > Also, in a bonding setup, we still have a problem
> > because all tx packets will go thru tx queue 0 (dev_pick_tx() job)
> >
> > (That might be good to know that for Ubuntu server testers)
> >
> 
> How about this?
> 

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

Thanks !




^ permalink raw reply

* Re: 2.6.35-rc2-git1 - include/linux/cgroup.h:534 invoked rcu_dereference_check() without protection!
From: Miles Lane @ 2010-06-09 15:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: paulmck, Vivek Goyal, Eric Paris, Lai Jiangshan, Ingo Molnar,
	LKML, nauman, eric.dumazet, netdev, Jens Axboe, Gui Jianfeng,
	Li Zefan, Johannes Berg
In-Reply-To: <1276097068.1745.13.camel@laptop>

On Wed, Jun 9, 2010 at 11:24 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, 2010-06-09 at 11:11 -0400, Miles Lane wrote:
>>
>> Sorry.  I misunderstood this message when I first read it.  I didn't
>> realize this message include a new version of the patch.
>> Anyhow, I just tried to apply the patch to 2.6.35-rc2-git3 and got this:
>>
>> # patch -p1 -l -F 20 --dry-run < ../5.patch
>> patching file include/linux/cgroup.h
>> patching file kernel/sched.c
>> Hunk #1 succeeded at 306 with fuzz 1.
>> Hunk #3 FAILED at 4462.
>> Hunk #4 succeeded at 4487 with fuzz 3.
>> 1 out of 4 hunks FAILED -- saving rejects to file kernel/sched.c.rej
>
> Weird.. it seems to apply without trouble to Linus' git tree.
>
> root@twins:/usr/src/linux-2.6# git checkout -f origin/master
> HEAD is now at 84f7586... Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
> root@twins:/usr/src/linux-2.6# quilt push
> Applying patch patches/sched-rcu-validation.patch
> patching file include/linux/cgroup.h
> patching file kernel/sched.c
>
> Now at patch patches/sched-rcu-validation.patch
> root@twins:/usr/src/linux-2.6# git describe
> v2.6.35-rc2-54-g84f7586

Oh.  Duh. I know what is going on.  I had received another patch to
sched.c.  They must conflict.  I will test with just your patch now.

^ permalink raw reply

* Re: [RFC PATCH 1/2] mac80211: make max_network_latency notifier atomic safe
From: Florian Mickler @ 2010-06-09 15:37 UTC (permalink / raw)
  To: Johannes Berg
  Cc: pm list, james.bottomley-l3A5Bk7waGM,
	markgross-FexmcLCpFRVAfugRpC6u6w, mgross-VuQAYsv1563Yd54FQh9/CA,
	John W. Linville, David S. Miller, Javier Cardona, Jouni Malinen,
	Rui Paulo, Kalle Valo, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Thomas Gleixner
In-Reply-To: <1276086425.14580.14.camel-8upI4CBIZJIJvtFkdXX2HixXY32XiHfO@public.gmane.org>

On Wed, 09 Jun 2010 14:27:05 +0200
Johannes Berg <johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org> wrote:

> On Wed, 2010-06-09 at 14:16 +0200, Florian Mickler wrote:
> 
> > That was also my first idea, but then I thought about qos and thought
> > atomic notification are necessary.
> > Do you see any value in having atomic notification? 
> > 
> > I have the following situation before my eyes:
> > 
> > 	Driver A gets an interrupt and needs (to service that
> > 	interrupt) the cpu to guarantee a latency of X because the
> > 	device is a bit icky.
> > 
> > Now, in that situation, if we don't immediately (without scheduling in
> > between) notify the system to be in that latency-mode the driver won't
> > function properly. Is this a realistic scene?
> > 
> > At the moment we only have process context notification and only 2
> > listeners.
> > 
> > I think providing for atomic as well as "relaxed" notification could be
> > useful. 
> > 
> > If atomic notification is deemed unnecessary, I have no
> > problems to just use schedule_work() in update request.
> > Anyway, it is probably best to split this. I.e. first make
> > update_request callable from atomic contexts with doing the
> > schedule_work in update_request and then
> > as an add on provide for constraints_objects with atomic notifications.
> 
> Well I remember http://thread.gmane.org/gmane.linux.kernel/979935 where
> Mark renamed things to "request" which seems to imply to me more of a
> "please do this" than "I NEED IT NOW!!!!!".
> 
> johannes

Yes. I just posted a version which uses schedule_work().
Just FYI, James has also posted his version which uses either a blocking
or an atomic notifier chain.
http://article.gmane.org/gmane.linux.kernel/996813

Cheers,
Flo

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH iproute2] ip: add support for multicast rules
From: Patrick McHardy @ 2010-06-09 15:41 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Linux Netdev List

[-- Attachment #1: Type: text/plain, Size: 1 bytes --]



[-- Attachment #2: 01.diff --]
[-- Type: text/x-diff, Size: 2681 bytes --]

commit 44a5293c1c47b8c32d9bb0756660ea5d4802acf2
Author: Patrick McHardy <kaber@trash.net>
Date:   Tue Apr 13 17:03:47 2010 +0200

    ip: add support for multicast rules
    
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index e94981d..0d8ef9e 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -7,6 +7,13 @@
 #include <linux/if_addr.h>
 #include <linux/neighbour.h>
 
+/* rtnetlink families. Values up to 127 are reserved for real address
+ * families, values above 128 may be used arbitrarily.
+ */
+#define RTNL_FAMILY_IPMR		128
+#define RTNL_FAMILY_IP6MR		129
+#define RTNL_FAMILY_MAX			129
+
 /****
  *		Routing/neighbour discovery messages.
  ****/
diff --git a/ip/ip.c b/ip/ip.c
index e0cf175..9f29533 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -42,7 +42,7 @@ static void usage(void)
 "Usage: ip [ OPTIONS ] OBJECT { COMMAND | help }\n"
 "       ip [ -force ] -batch filename\n"
 "where  OBJECT := { link | addr | addrlabel | route | rule | neigh | ntable |\n"
-"                   tunnel | tuntap | maddr | mroute | monitor | xfrm }\n"
+"                   tunnel | tuntap | maddr | mroute | mrule | monitor | xfrm }\n"
 "       OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] | -r[esolve] |\n"
 "                    -f[amily] { inet | inet6 | ipx | dnet | link } |\n"
 "                    -o[neline] | -t[imestamp] | -b[atch] [filename] |\n"
@@ -76,6 +76,7 @@ static const struct cmd {
 	{ "monitor",	do_ipmonitor },
 	{ "xfrm",	do_xfrm },
 	{ "mroute",	do_multiroute },
+	{ "mrule",	do_multirule },
 	{ "help",	do_help },
 	{ 0 }
 };
diff --git a/ip/ip_common.h b/ip/ip_common.h
index c857667..a114186 100644
--- a/ip/ip_common.h
+++ b/ip/ip_common.h
@@ -37,6 +37,7 @@ extern int do_iplink(int argc, char **argv);
 extern int do_ipmonitor(int argc, char **argv);
 extern int do_multiaddr(int argc, char **argv);
 extern int do_multiroute(int argc, char **argv);
+extern int do_multirule(int argc, char **argv);
 extern int do_xfrm(int argc, char **argv);
 
 static inline int rtm_get_table(struct rtmsg *r, struct rtattr **tb)
diff --git a/ip/iprule.c b/ip/iprule.c
index 7140375..9c8c6ef 100644
--- a/ip/iprule.c
+++ b/ip/iprule.c
@@ -436,3 +436,20 @@ int do_iprule(int argc, char **argv)
 	exit(-1);
 }
 
+int do_multirule(int argc, char **argv)
+{
+	switch (preferred_family) {
+	case AF_UNSPEC:
+	case AF_INET:
+		preferred_family = RTNL_FAMILY_IPMR;
+		break;
+	case AF_INET6:
+		preferred_family = RTNL_FAMILY_IP6MR;
+		break;
+	default:
+		fprintf(stderr, "Multicast rules are only supported for IPv4/IPv6\n");
+		exit(-1);
+	}
+
+	return do_iprule(argc, argv);
+}

^ permalink raw reply related

* Re: [PATCH iproute2] ip: add support for multicast rules
From: Stephen Hemminger @ 2010-06-09 15:50 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Linux Netdev List
In-Reply-To: <4C0FB616.3000303@trash.net>


Applied

^ permalink raw reply

* Re: 2.6.35-rc2-git1 - include/linux/cgroup.h:534 invoked rcu_dereference_check() without protection!
From: Miles Lane @ 2010-06-09 17:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: paulmck, Vivek Goyal, Eric Paris, Lai Jiangshan, Ingo Molnar,
	LKML, nauman, eric.dumazet, netdev, Jens Axboe, Gui Jianfeng,
	Li Zefan, Johannes Berg
In-Reply-To: <AANLkTinl2JKUX6859Ioa1y0DqHWXH5TrkissQ8BAF4cP@mail.gmail.com>

On Wed, Jun 9, 2010 at 11:29 AM, Miles Lane <miles.lane@gmail.com> wrote:
> On Wed, Jun 9, 2010 at 11:24 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>> On Wed, 2010-06-09 at 11:11 -0400, Miles Lane wrote:
>>>
>>> Sorry.  I misunderstood this message when I first read it.  I didn't
>>> realize this message include a new version of the patch.
>>> Anyhow, I just tried to apply the patch to 2.6.35-rc2-git3 and got this:
>>>
>>> # patch -p1 -l -F 20 --dry-run < ../5.patch
>>> patching file include/linux/cgroup.h
>>> patching file kernel/sched.c
>>> Hunk #1 succeeded at 306 with fuzz 1.
>>> Hunk #3 FAILED at 4462.
>>> Hunk #4 succeeded at 4487 with fuzz 3.
>>> 1 out of 4 hunks FAILED -- saving rejects to file kernel/sched.c.rej
>>
>> Weird.. it seems to apply without trouble to Linus' git tree.
>>
>> root@twins:/usr/src/linux-2.6# git checkout -f origin/master
>> HEAD is now at 84f7586... Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
>> root@twins:/usr/src/linux-2.6# quilt push
>> Applying patch patches/sched-rcu-validation.patch
>> patching file include/linux/cgroup.h
>> patching file kernel/sched.c
>>
>> Now at patch patches/sched-rcu-validation.patch
>> root@twins:/usr/src/linux-2.6# git describe
>> v2.6.35-rc2-54-g84f7586
>
> Oh.  Duh. I know what is going on.  I had received another patch to
> sched.c.  They must conflict.  I will test with just your patch now.
>

ACPI: Core revision 20100428
[    0.061088]
[    0.061090] ===================================================
[    0.062009] [ INFO: suspicious rcu_dereference_check() usage. ]
[    0.062138] ---------------------------------------------------
[    0.062268] kernel/sched.c:616 invoked rcu_dereference_check()
without protection!
[    0.062470]
[    0.062471] other info that might help us debug this:
[    0.062472]
[    0.062835]
[    0.062836] rcu_scheduler_active = 1, debug_locks = 1
[    0.063009] no locks held by swapper/0.
[    0.063134]
[    0.063135] stack backtrace:
[    0.063378] Pid: 0, comm: swapper Not tainted 2.6.35-rc2-git3 #3
[    0.063507] Call Trace:
[    0.063638]  [<ffffffff81072205>] lockdep_rcu_dereference+0x9d/0xa5
[    0.063773]  [<ffffffff810379f9>] task_group+0x7b/0x8a
[    0.064012]  [<ffffffff81037a1d>] set_task_rq+0x15/0x6e
[    0.064143]  [<ffffffff8103e50f>] set_task_cpu+0xa9/0xba
[    0.064274]  [<ffffffff81042dbb>] sched_fork+0x10a/0x1b3
[    0.064405]  [<ffffffff810446f9>] copy_process+0x617/0x10e6
[    0.064537]  [<ffffffff8104533d>] do_fork+0x175/0x39b
[    0.064670]  [<ffffffff8106589b>] ? up+0xf/0x39
[    0.064800]  [<ffffffff8106589b>] ? up+0xf/0x39
[    0.065013]  [<ffffffff811dbf73>] ? do_raw_spin_lock+0x79/0x13e
[    0.065148]  [<ffffffff81011526>] kernel_thread+0x70/0x72
[    0.065279]  [<ffffffff816cc5e4>] ? kernel_init+0x0/0x1ce
[    0.065411]  [<ffffffff8100aba0>] ? kernel_thread_helper+0x0/0x10
[    0.065545]  [<ffffffff81096bea>] ? rcu_scheduler_starting+0x2a/0x4c
[    0.065679]  [<ffffffff813a8a4d>] rest_init+0x21/0xde
[    0.065810]  [<ffffffff816cce28>] start_kernel+0x448/0x453
[    0.066013]  [<ffffffff816cc2c8>] x86_64_start_reservations+0xb3/0xb7
[    0.066148]  [<ffffffff816cc418>] x86_64_start_kernel+0x14c/0x15b
[    0.066499] Setting APIC routing to flat

^ permalink raw reply

* Re: linux-next: Tree for June 9 (niu)
From: Randy Dunlap @ 2010-06-09 17:36 UTC (permalink / raw)
  To: Stephen Rothwell, netdev; +Cc: linux-next, LKML, davem
In-Reply-To: <20100609133443.38f1f957.sfr@canb.auug.org.au>

On Wed, 9 Jun 2010 13:34:43 +1000 Stephen Rothwell wrote:

> Hi all,
> 
> Changes since 20100608:
> 
> My fixes tree contains:
>       v4l-dvb: update gfp/slab.h includes
>       arm: update gfp/slab.h includes
>       davinci: update gfp/slab.h includes
>       ocfs2: update gfp/slab.h includes
>       acpi: update gfp/slab.h includes



on x86_64 or i386, CONFIG_OF_DEVICE is not enabled:

drivers/net/niu.c:9700: warning: 'struct of_device' declared inside parameter list
drivers/net/niu.c:9700: warning: its scope is only this definition or declaration, which is probably not what you want
drivers/net/niu.c:9716: warning: assignment from incompatible pointer type

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

^ permalink raw reply

* Re: [PATCH v3] netfilter: Xtables: idletimer target implementation
From: Luciano Coelho @ 2010-06-09 17:48 UTC (permalink / raw)
  To: ext Jan Engelhardt
  Cc: ext Patrick McHardy, netfilter-devel@vger.kernel.org,
	netdev@vger.kernel.org, Timo Teras
In-Reply-To: <alpine.LSU.2.01.1006091716170.30265@obet.zrqbmnf.qr>

Hi Jan,

On Wed, 2010-06-09 at 17:18 +0200, ext Jan Engelhardt wrote:
> On Wednesday 2010-06-09 17:11, Luciano Coelho wrote:
> >Do you think it's okay to leave it like this for now and extend it for
> >multiple namespace support with a future patch?
> 
> Yes. Least thing we need is one humongous patch. :)

Thanks! That was my concern too.  One huge and very complex patch is not
a nice thing to have. ;)


> >> > +	timer = __idletimer_tg_find_by_label(info->label);
> >> > +	if (!timer) {
> >> > +		spin_unlock(&list_lock);
> >> > +		timer = idletimer_tg_create(info);
> >> >   
> >> 
> >> How does this prevent creating the same timer twice?
> >
> >The timer will only be created if __idletimer_tg_find_by_label() returns
> >NULL, which means that no timer with that label has been found.  "info"
> >won't be the same if info->label is different, right? Or can it change
> >on the fly?
> 
> One thing to be generally aware about is that things could potentially
> be instantiated by another entity between the time a label was looked up
> with negative result and the time one tries to add it.
> It may thus be required to extend keeping the lock until after
> idletimer_tg_create, in other words, lookup and create must be atomic
> to the rest of the world.

Ahh, sure! I missed the actual point of Patrick's question.  I had the
idletimer_tg_create() inside the lock, but when I added the
sysfs_create_file() there (which can sleep), I screwed up with the
locking.

I'll move the sysfs file creation to outside that function so I can keep
the lock until after the timer is added to the list.  Thanks for
clarifying!


-- 
Cheers,
Luca.


^ permalink raw reply

* Re: 2.6.35-rc2-git1 - include/linux/cgroup.h:534 invoked rcu_dereference_check() without protection!
From: Peter Zijlstra @ 2010-06-09 17:57 UTC (permalink / raw)
  To: Miles Lane
  Cc: paulmck, Vivek Goyal, Eric Paris, Lai Jiangshan, Ingo Molnar,
	LKML, nauman, eric.dumazet, netdev, Jens Axboe, Gui Jianfeng,
	Li Zefan, Johannes Berg
In-Reply-To: <AANLkTilpM9O373rA0cw_wTYVfTiDPbjaKHScAGlt7IAr@mail.gmail.com>

On Wed, 2010-06-09 at 13:00 -0400, Miles Lane wrote:

> ACPI: Core revision 20100428
> [    0.061088]
> [    0.061090] ===================================================
> [    0.062009] [ INFO: suspicious rcu_dereference_check() usage. ]
> [    0.062138] ---------------------------------------------------
> [    0.062268] kernel/sched.c:616 invoked rcu_dereference_check()
> without protection!
> [    0.062470]
> [    0.062471] other info that might help us debug this:
> [    0.062472]
> [    0.062835]
> [    0.062836] rcu_scheduler_active = 1, debug_locks = 1
> [    0.063009] no locks held by swapper/0.
> [    0.063134]
> [    0.063135] stack backtrace:
> [    0.063378] Pid: 0, comm: swapper Not tainted 2.6.35-rc2-git3 #3
> [    0.063507] Call Trace:
> [    0.063638]  [<ffffffff81072205>] lockdep_rcu_dereference+0x9d/0xa5
> [    0.063773]  [<ffffffff810379f9>] task_group+0x7b/0x8a
> [    0.064012]  [<ffffffff81037a1d>] set_task_rq+0x15/0x6e
> [    0.064143]  [<ffffffff8103e50f>] set_task_cpu+0xa9/0xba
> [    0.064274]  [<ffffffff81042dbb>] sched_fork+0x10a/0x1b3
> [    0.064405]  [<ffffffff810446f9>] copy_process+0x617/0x10e6
> [    0.064537]  [<ffffffff8104533d>] do_fork+0x175/0x39b
> [    0.064670]  [<ffffffff8106589b>] ? up+0xf/0x39
> [    0.064800]  [<ffffffff8106589b>] ? up+0xf/0x39
> [    0.065013]  [<ffffffff811dbf73>] ? do_raw_spin_lock+0x79/0x13e
> [    0.065148]  [<ffffffff81011526>] kernel_thread+0x70/0x72
> [    0.065279]  [<ffffffff816cc5e4>] ? kernel_init+0x0/0x1ce
> [    0.065411]  [<ffffffff8100aba0>] ? kernel_thread_helper+0x0/0x10
> [    0.065545]  [<ffffffff81096bea>] ? rcu_scheduler_starting+0x2a/0x4c
> [    0.065679]  [<ffffffff813a8a4d>] rest_init+0x21/0xde
> [    0.065810]  [<ffffffff816cce28>] start_kernel+0x448/0x453
> [    0.066013]  [<ffffffff816cc2c8>] x86_64_start_reservations+0xb3/0xb7
> [    0.066148]  [<ffffffff816cc418>] x86_64_start_kernel+0x14c/0x15b
> [    0.066499] Setting APIC routing to flat

Argh, moar funkeh stuff..

Either we do something like the below, or add something like (p->flags &
PF_STARTING) to the task_subsys_state_check(), opinions?

---
 kernel/sched.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 19b3c5d..bfd3128 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2564,7 +2564,15 @@ void sched_fork(struct task_struct *p, int clone_flags)
 	if (p->sched_class->task_fork)
 		p->sched_class->task_fork(p);
 
+	/*
+	 * We're not in the pid-hash yet so no cgroup attach races, and the
+	 * cgroup is pinned by the parent running this.
+	 *
+	 * Silence PROVE_RCU.
+	 */
+	rcu_read_lock();
 	set_task_cpu(p, cpu);
+	rcu_read_unlock();
 
 #if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
 	if (likely(sched_info_on()))


^ permalink raw reply related

* Re: linux-next: Tree for June 9 (niu)
From: David Miller @ 2010-06-09 18:06 UTC (permalink / raw)
  To: randy.dunlap; +Cc: sfr, netdev, linux-next, linux-kernel
In-Reply-To: <20100609103657.66252a27.randy.dunlap@oracle.com>

From: Randy Dunlap <randy.dunlap@oracle.com>
Date: Wed, 9 Jun 2010 10:36:57 -0700

> On Wed, 9 Jun 2010 13:34:43 +1000 Stephen Rothwell wrote:
> 
>> Changes since 20100608:
>> 
>> My fixes tree contains:
>>       v4l-dvb: update gfp/slab.h includes
>>       arm: update gfp/slab.h includes
>>       davinci: update gfp/slab.h includes
>>       ocfs2: update gfp/slab.h includes
>>       acpi: update gfp/slab.h includes
> 
> 
> 
> on x86_64 or i386, CONFIG_OF_DEVICE is not enabled:
> 
> drivers/net/niu.c:9700: warning: 'struct of_device' declared inside parameter list
> drivers/net/niu.c:9700: warning: its scope is only this definition or declaration, which is probably not what you want
> drivers/net/niu.c:9716: warning: assignment from incompatible pointer type

Hmmm, I'm confused why this never happened before :-)

We conditionalize linux/of_device.h inclusion with CONFIG_SPARC64, yet
we unconditionally use "struct of_device *" pointers in the driver
with no such ifdef protection.

Even if we unconditionally included linux/of_device.h, that file does
nothing unless CONFIG_OF_DEVICE is defined so it should have always
produced these warnings since I can't see from where else it could
have gotten even a "struct of_device;" somewhere.

Do you have any idea Randy?  Pease try analyze this further so we can
fix it properly.

THanks!

^ permalink raw reply

* Re: [PATCH][RFC] Check sk_buff states
From: David VomLehn @ 2010-06-09 18:07 UTC (permalink / raw)
  To: Andi Kleen; +Cc: netdev
In-Reply-To: <878w6o65kx.fsf@basil.nowhere.org>

On Wed, Jun 09, 2010 at 02:14:06AM -0500, Andi Kleen wrote:
> David VomLehn <dvomlehn@cisco.com> writes:
> >
> > Some of the errors that can be detected are:
> > o	Double-freeing an skb_buff
> 
> slab debugging should already catch that (although at higher cost)

There is a sublety here--if skb_clone() is used, skb_free() isn't
necessarily called until both parts of the cloned sk_buff are no longer
used. But it's not just that the double free is called, it's that you
also know where the first free is located. This turns out to be very
valuable.

> I would say the big question is how many bugs such a new infrastructure
> find. Did you find any?

Definitely. I've heard back about two cases. One of them was especially
gratifying because it has a control case. Two people were looking for the
same bug and found it within hours of each other. One person, who didn't
have the code enabled, took three weeks. The other person enabled the code.
It took him three hours to find the problem. So, our networking people
have been very happy to have this.

> -Andi

-- 
David VL

^ permalink raw reply

* Re: linux-next: Tree for June 9 (niu)
From: Randy Dunlap @ 2010-06-09 18:08 UTC (permalink / raw)
  To: David Miller; +Cc: sfr, netdev, linux-next, linux-kernel
In-Reply-To: <20100609.110638.112605100.davem@davemloft.net>

On 06/09/10 11:06, David Miller wrote:
> From: Randy Dunlap <randy.dunlap@oracle.com>
> Date: Wed, 9 Jun 2010 10:36:57 -0700
> 
>> On Wed, 9 Jun 2010 13:34:43 +1000 Stephen Rothwell wrote:
>>
>>> Changes since 20100608:
>>>
>>> My fixes tree contains:
>>>       v4l-dvb: update gfp/slab.h includes
>>>       arm: update gfp/slab.h includes
>>>       davinci: update gfp/slab.h includes
>>>       ocfs2: update gfp/slab.h includes
>>>       acpi: update gfp/slab.h includes
>>
>>
>>
>> on x86_64 or i386, CONFIG_OF_DEVICE is not enabled:
>>
>> drivers/net/niu.c:9700: warning: 'struct of_device' declared inside parameter list
>> drivers/net/niu.c:9700: warning: its scope is only this definition or declaration, which is probably not what you want
>> drivers/net/niu.c:9716: warning: assignment from incompatible pointer type
> 
> Hmmm, I'm confused why this never happened before :-)
> 
> We conditionalize linux/of_device.h inclusion with CONFIG_SPARC64, yet
> we unconditionally use "struct of_device *" pointers in the driver
> with no such ifdef protection.
> 
> Even if we unconditionally included linux/of_device.h, that file does
> nothing unless CONFIG_OF_DEVICE is defined so it should have always
> produced these warnings since I can't see from where else it could
> have gotten even a "struct of_device;" somewhere.
> 
> Do you have any idea Randy?  Pease try analyze this further so we can
> fix it properly.


I looked and was confuzed, but I'll look again.

-- 
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

^ permalink raw reply

* Re: [PATCH][RFC] Infrastructure for compact call location representation
From: David VomLehn @ 2010-06-09 18:13 UTC (permalink / raw)
  To: Andi Kleen; +Cc: netdev
In-Reply-To: <874ohc65es.fsf@basil.nowhere.org>

On Wed, Jun 09, 2010 at 02:17:47AM -0500, Andi Kleen wrote:
> David VomLehn <dvomlehn@cisco.com> writes:
> 
> First your mailer generates broken cc addresses like
> "netdev@vger.kernel.org"@cisco.com
> 
> > This patch allows the location of a call to be recorded as a small integer,
> > with each call location ("callsite") assigned a new value the first time
> > the code in that location is executed. Locations can be recorded as a
> > an address or as a __FILE__/__LINE__ pair. The later is easier to read, but
> > requires significantly more space.
> 
> Seems overly complicated.
> 
> How about using a hash of __FILE__, __LINE__ instead? 
> With some care you can write it in a way that it gets completely
> evaluated by gcc at compile time, so it's just a constant then.
> 
> That may use a few more bits then, but that's far better
> than having so much runtime overhead for this.

The same indexing scheme is employed whether it is configured for reporting
an address or the __FILE__/__LINE__ pair. The __FILE__/__LINE__ data is
stored statically at compile time, but since the __FILE__ values are strings
and may be pretty long, it can use a significant amount of space in the
string table. Performance wise, the key difference is that reporting an
address means that, the first time you go through a callsite, it stores
the return address. This doesn't need to be repeated during subsequent
calls, so the performance difference is miniscule.
-- 
David VL

^ permalink raw reply

* Re: 2.6.35-rc2-git1 - include/linux/cgroup.h:534 invoked rcu_dereference_check() without protection!
From: Peter Zijlstra @ 2010-06-09 18:15 UTC (permalink / raw)
  To: Miles Lane
  Cc: paulmck, Vivek Goyal, Eric Paris, Lai Jiangshan, Ingo Molnar,
	LKML, nauman, eric.dumazet, netdev, Jens Axboe, Gui Jianfeng,
	Li Zefan, Johannes Berg
In-Reply-To: <1276106229.1745.65.camel@laptop>

On Wed, 2010-06-09 at 19:57 +0200, Peter Zijlstra wrote:
> +       /*
> +        * We're not in the pid-hash yet so no cgroup attach races, and the
> +        * cgroup is pinned by the parent running this.
> +        *
> +        * Silence PROVE_RCU.
> +        */ 

Hum,.. not sure that's actually true though, the parent itself is still
susceptible to races afaict..

^ permalink raw reply

* Re: pull request: wireless-2.6 2010-06-08
From: David Miller @ 2010-06-09 18:17 UTC (permalink / raw)
  To: linville; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20100608190214.GD2422@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Tue, 8 Jun 2010 15:02:14 -0400

> Dave,
> 
> Here is a flurry of fixes intended for 2.6.35.  Most of them are small
> and obvious, but there are a few exceptions...I'll include some text from
> the iwlwifi pull request:
...
> The iwlwifi bits are bigger than I would like to see, but the fixes
> seem worthwhile.  Also included is a revert of a hostap patch that
> introduced failures with some devices, a NULL pointer dereference fix,
> a memory leak fix, a fix for an incorrect check of function pointer,
> a USB ID for p54, and a fix for an association failure 'corner case'
> from Johannes.

Pulled, thanks John.

Please try to keep things a lot smaller from here on out, thank you.

^ permalink raw reply

* Re: [PATCH][RFC] Infrastructure for compact call location representation
From: David VomLehn @ 2010-06-09 18:22 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Stephen Hemminger, to, netdev
In-Reply-To: <20100609084417.GW26335@laptop>

On Wed, Jun 09, 2010 at 03:44:17AM -0500, Nick Piggin wrote:
> On Tue, Jun 08, 2010 at 08:44:56AM -0700, Stephen Hemminger wrote:
> > On Mon, 7 Jun 2010 17:30:52 -0700
> > David VomLehn <dvomlehn@cisco.com> wrote:
> > > History
> > > v2	Support small callsite IDs and split out out-of-band parameter
> > > 	parsing.
> > > V1	Initial release
> > > 
> > > Signed-off-by: David VomLehn <dvomlehn@cisco.com>
> > 
> > This is really Linux Kernel Mailing List material (not just netdev). And it will
> > be a hard sell to get it accepted, because it is basically an alternative call
> > tracing mechanism, and there are already several of these in use or under development
> > (see perf and ftrace).
> 
> What about a generic extension or layer on top of stacktrace that
> does caching and unique IDs for stack traces. This way you can get
> callsites or _full_ stack traces if required, and it shouldn't require
> any extra magic in the net functions.

Since the code calls BUG() when it detects an error, you already get the
full stack trace of the location where the problem is detected. The question
is the relative cost and benefits of a full stack trace of the previous
sk_buff state modification. Since I'm working in a MIPS processor
environment, I am rather prejudiced against doing any stack trace I don't
have to; for now, at least, they are *very* expensive on MIPS.

The two times this code (or its ancestor) has found problems in a deployed
software stack, the engineers reported they there were able to immediately
find and fix the problem. This suggests that we don't need to take on the
complexity of the stack backtrace, at least for now. If this gets added to
the mainline and people find they need the extra information, I'd be all
for it.

> You would need a hash for stack traces to check for an existing trace,
> and an idr to assign ids to traces.

Once you have the trace, it should be pretty easy to do this. In theory there
could be a huge number of unique stack traces, but I don't that this would
be the case in practice.
-- 
David VL

^ permalink raw reply

* Re: [PATCH v3] netfilter: Xtables: idletimer target implementation
From: Luciano Coelho @ 2010-06-09 18:42 UTC (permalink / raw)
  To: ext Jan Engelhardt
  Cc: ext Patrick McHardy, netfilter-devel@vger.kernel.org,
	netdev@vger.kernel.org, Timo Teras
In-Reply-To: <1276105707.11199.12.camel@powerslave>

On Wed, 2010-06-09 at 19:48 +0200, Coelho Luciano (Nokia-D/Helsinki)
wrote:
> On Wed, 2010-06-09 at 17:18 +0200, ext Jan Engelhardt wrote:
> > >> > +	timer = __idletimer_tg_find_by_label(info->label);
> > >> > +	if (!timer) {
> > >> > +		spin_unlock(&list_lock);
> > >> > +		timer = idletimer_tg_create(info);
> > >> >   
> > >> 
> > >> How does this prevent creating the same timer twice?
> > >
> > >The timer will only be created if __idletimer_tg_find_by_label() returns
> > >NULL, which means that no timer with that label has been found.  "info"
> > >won't be the same if info->label is different, right? Or can it change
> > >on the fly?
> > 
> > One thing to be generally aware about is that things could potentially
> > be instantiated by another entity between the time a label was looked up
> > with negative result and the time one tries to add it.
> > It may thus be required to extend keeping the lock until after
> > idletimer_tg_create, in other words, lookup and create must be atomic
> > to the rest of the world.
> 
> Ahh, sure! I missed the actual point of Patrick's question.  I had the
> idletimer_tg_create() inside the lock, but when I added the
> sysfs_create_file() there (which can sleep), I screwed up with the
> locking.
> 
> I'll move the sysfs file creation to outside that function so I can keep
> the lock until after the timer is added to the list.  Thanks for
> clarifying!

Hmmm... after struggling with this for a while, I think it's not really
possible to simply create the sysfs file outside of the lock, because if
the sysfs creation fails, we will again risk a race condition.

I think the only way is to delay the sysfs file creation and do it in a
workqueue.


-- 
Cheers,
Luca.


^ permalink raw reply

* ss -p is much too slow
From: Steve Fink @ 2010-06-09 18:42 UTC (permalink / raw)
  To: netdev

I was recently working with a monitoring system, and one check
involved displaying what process(es) were listening on specific ports.
I was using a single call to ss -lntp and processing the output, but
it would often take several minutes to respond. netstat -lntp, on the
other hand, was immediate. I had a few thousand processes running on
the machine, and it could take 7 minutes or so.

On closer inspection, it appears that ss -p does a quadratic scan. It
rescans every entry in /proc/*/fd/* repeatedly (once per listening
port? per process? I don't remember what I figured out.)

I humbly suggest that this is not a good idea.

I am currently running iproute-2.6.27-2.fc10.i386, though my real test
case was on a different machine. I don't have a newer version to check
right now.

% strace ss -lntp |& fgrep /proc/3768/fd/10
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14
readlink("/proc/3768/fd/10", "socket:[16214]"..., 63) = 14

^ permalink raw reply

* RE: [v2 Patch 1/2] s2io: add dynamic LRO disable support
From: Ramkrishna Vepa @ 2010-06-09 18:52 UTC (permalink / raw)
  To: Ben Hutchings, Amerigo Wang
  Cc: netdev@vger.kernel.org, nhorman@redhat.com, sgruszka@redhat.com,
	herbert.xu@redhat.com, davem@davemloft.net
In-Reply-To: <1276092058.2093.1.camel@achroite.uk.solarflarecom.com>

> On Wed, 2010-06-09 at 06:05 -0400, Amerigo Wang wrote:
> [...]
> > +static int s2io_ethtool_set_flags(struct net_device *dev, u32 data)
> > +{
> > +   struct s2io_nic *sp = netdev_priv(dev);
> > +   int rc = 0;
> > +   int changed = 0;
> > +
> > +   if (data & ETH_FLAG_LRO) {
> > +           if (lro_enable) {
> > +                   if (!(dev->features & NETIF_F_LRO)) {
> > +                           dev->features |= NETIF_F_LRO;
> > +                           changed = 1;
> > +                   }
> > +           } else
> > +                   rc = -EINVAL;
> > +   } else if (dev->features & NETIF_F_LRO) {
> > +           dev->features &= ~NETIF_F_LRO;
> > +           changed = 1;
> > +   }
> > +
> > +   if (changed && netif_running(dev)) {
> > +           s2io_stop_all_tx_queue(sp);
> > +           s2io_card_down(sp);
> > +           sp->lro = dev->features & NETIF_F_LRO;
> > +           rc = s2io_card_up(sp);
> > +           s2io_start_all_tx_queue(sp);
> [...]
>
> Is it safe to call s2io_start_all_tx_queue() if s2io_card_up() failed?
Ben,
Good point. If s2io_card_up() fails the chip will not be accessed, so it's safe but all transmit skbs will be freed without the user knowing the reason for failing to transmit or receive for that matter.  The other option is to return with a failure and get the watchdog timer reset the adapter.

Ram
>
> Ben.
>
> --
> Ben Hutchings, Senior Software Engineer, Solarflare Communications
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.


The information and any attached documents contained in this message
may be confidential and/or legally privileged.  The message is
intended solely for the addressee(s).  If you are not the intended
recipient, you are hereby notified that any use, dissemination, or
reproduction is strictly prohibited and may be unlawful.  If you are
not the intended recipient, please contact the sender immediately by
return e-mail and destroy all copies of the original message.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox