Netdev List
* Re: [RFC][PATCH] ethtool: Add reset operation
From: Ben Hutchings @ 2009-10-02 11:00 UTC (permalink / raw)
  To: Ajit Khaparde; +Cc: David Miller, netdev, linux-net-drivers
In-Reply-To: <20091002104010.GA19862@serverengines.com>

On Fri, 2009-10-02 at 16:10 +0530, Ajit Khaparde wrote:
[...]
> Can you tell the intention behind this copy_to_user?
> Do you envision drivers sending some data back to userland, maybe
> sometime in the future?

This allows userland to see which components were actually reset.
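
To illustrate the point, here is a minimal user-space sketch of the in/out
flags word. It is a model under stated assumptions, not the kernel code: the
DEMO_RESET_* flags, demo_driver_reset(), demo_ethtool_reset() and
demo_request_all() are hypothetical names, and memcpy() merely stands in for
copy_from_user()/copy_to_user().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical component flags, for illustration only. */
#define DEMO_RESET_MAC   0x1u
#define DEMO_RESET_PHY   0x2u
#define DEMO_RESET_FLASH 0x4u

/* Same shape as struct ethtool_value: a command word and a data word. */
struct demo_value {
	uint32_t cmd;
	uint32_t data;
};

/* Hypothetical driver that can only reset the MAC and the PHY.
 * It overwrites *flags with the subset it actually reset. */
int demo_driver_reset(uint32_t *flags)
{
	const uint32_t supported = DEMO_RESET_MAC | DEMO_RESET_PHY;

	*flags &= supported;
	return 0;
}

/* Model of the copy-in / call / copy-out sequence in the patch. */
int demo_ethtool_reset(struct demo_value *user_buf)
{
	struct demo_value reset;
	int ret;

	memcpy(&reset, user_buf, sizeof(reset));	/* copy_from_user() */
	ret = demo_driver_reset(&reset.data);
	if (ret)
		return ret;
	memcpy(user_buf, &reset, sizeof(reset));	/* copy_to_user() */
	return 0;
}

/* What the caller sees: request three components, read back the result. */
uint32_t demo_request_all(void)
{
	struct demo_value req = {
		0, DEMO_RESET_MAC | DEMO_RESET_PHY | DEMO_RESET_FLASH
	};

	if (demo_ethtool_reset(&req))
		return 0;
	return req.data;
}
```

Without the copy back, the caller in this model would still see all three
flags set and could not tell that the flash component was skipped.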

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* Re: Messages are printed on screen
From: Markus Feldmann @ 2009-10-02 10:56 UTC (permalink / raw)
  To: netdev
In-Reply-To: <ha4igd$ghh$1@ger.gmane.org>

Markus Feldmann wrote:
> ....
> my <rsyslog> to save this only to </var/log>. Here is my 
> </etc/rsyslog.conf>
http://pastebin.com/m4400fb9e




* Re: Messages are printed on screen
From: Ben Hutchings @ 2009-10-02 10:56 UTC (permalink / raw)
  To: Markus Feldmann; +Cc: netdev
In-Reply-To: <ha4igd$ghh$1@ger.gmane.org>


On Fri, 2009-10-02 at 11:52 +0200, Markus Feldmann wrote:
> Hi All,
> 
> i am setting up my Server, with Linux Debian lenny. Therefore i am using 
>   Kernel 2.6.31.1.

The current kernel version for 'lenny' is 2.6.26 (with stable updates
and other fixes).

2.6.31 is known to have a large number of regressions outstanding, which
is why it is not in Debian yet.

> I configured this Kernel with <make defconfig> and 
> <make menuconfig>. My motherboard has 5 PCI slots and 1 AGP slot. All 
> slots are in use. All devices are mapped to the following IRQ-Line:
> 
> Mass Storage Device (PCI Slot1)	IRQ-Line 11
> Ethernet (PCI Slot 2) 		IRQ-Line 4
> Ethernet (PCI Slot 3) 		IRQ-Line 5
> Ethernet (PCI Slot 4) 		IRQ-Line 7
> Ethernet (PCI Slot 5) 		IRQ-Line 11
> Onboard USB-Controller		IRQ-Line 5
> Onboard USB-Controller		IRQ-Line 4
> Onboard USB-Controller		IRQ-Line 11
> Onboard IDE			IRQ-Line 14
> AGP VGA				IRQ-Line 11
> 
> As you see some of my IRQ-Lines are multiply in use, so my Server is 
> working hard at his limit.

IRQ sharing is normal on PCs without MSI support, but to see where
that's happening you need to look at /proc/interrupts and not the BIOS
setup program or wherever you got the above information from.

This does not result in 'working hard at his limit'.

> The result is sometimes freezing of my 
> Server, especially if there is much processing on these devices. I 
> remember that with Kernel 2.6.18 my system didn't freeze.

This is simply a bug, not a result of IRQ sharing or 'working hard'.

> So i am trying to reduce the amount of this processing. I still get 
> messages about dropped network packets on my Terminal, although i set up 
> my <rsyslog> to save this only to </var/log>. Here is my 
> </etc/rsyslog.conf>

You forgot to paste it.

> How can i disable the output of messages (about dropped packets from my 
> firewall) to my terminal ?

Edit the value of kernel.printk in /etc/sysctl.conf.
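
For context: kernel.printk holds four integers, and the first one is the
console log level. The kernel prints a message on the console only if the
message's level is numerically lower than that value. A small sketch of the
rule follows; the function names are hypothetical, and "4 4 1 7" is only an
example setting, assuming the firewall messages are logged at warning
level (4).

```c
#include <assert.h>
#include <stdio.h>

/* Parse the first field of a kernel.printk value such as "4 4 1 7".
 * Returns the console log level, or -1 if the string is malformed. */
int demo_console_loglevel(const char *printk_value)
{
	int console, def_msg, min_console, def_console;

	if (sscanf(printk_value, "%d %d %d %d",
		   &console, &def_msg, &min_console, &def_console) != 4)
		return -1;
	return console;
}

/* A message reaches the console only if its level is numerically
 * lower (more severe) than the console log level. */
int demo_reaches_console(int msg_level, int console_loglevel)
{
	return msg_level < console_loglevel;
}
```

Under that assumption, a first field of 4 keeps warning-level messages off
the console while they still go to the log; a first field of 7 lets them
through.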

> How can i stabilize my IRQ-System with the kernel 2.6.31.1 ?

I would expect the standard kernel version for 'lenny' or the 2.6.30
kernel from 'sid' to be more stable.

> What debug features should i disable ?

No idea, you didn't even specify what you enabled...

Ben.

-- 
Ben Hutchings
Who are all these weirdos? - David Bowie, about L-Space IRC channel #afp



* Re: [RFC][PATCH] ethtool: Add reset operation
From: Ajit Khaparde @ 2009-10-02 10:40 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: David Miller, netdev, linux-net-drivers
In-Reply-To: <1254426195.2735.16.camel@achroite>

On 01/10/09 20:43 +0100, Ben Hutchings wrote:
> After updating firmware stored in flash, users may wish to reset the
> relevant hardware and start the new firmware immediately.  This should
> not be completely automatic as it may be disruptive.
> 
> A selective reset may also be useful for debugging or diagnostics.
> 
> This adds a separate reset operation which takes flags indicating the
> components to be reset.  Drivers are allowed to reset only a subset of
> those requested, and must report the actual subset.  This allows the
> use of generic component masks and some future expansion.
> ---
Looks good. But one question.

> +static int ethtool_reset(struct net_device *dev, char __user *useraddr)
> +{
> +	struct ethtool_value reset;
> +	int ret;
> +
> +	if (!dev->ethtool_ops->reset)
> +		return -EOPNOTSUPP;
> +
> +	if (copy_from_user(&reset, useraddr, sizeof(reset)))
> +		return -EFAULT;
> +
> +	ret = dev->ethtool_ops->reset(dev, &reset.data);
> +	if (ret)
> +		return ret;
> +
> +	if (copy_to_user(useraddr, &reset, sizeof(reset)))
> +		return -EFAULT;
Can you tell the intention behind this copy_to_user?
Do you envision drivers sending some data back to userland, maybe
sometime in the future?

Thanks
-Ajit


* [RFC take2] pkt_sched: gen_estimator: Dont report fake rate estimators
From: Eric Dumazet @ 2009-10-02 10:35 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: David Miller, kaber, netdev
In-Reply-To: <20091002070819.GA9694@ff.dom.local>

Here is a second attempt at this change, thanks Jarek!

This is indeed less intrusive.

[RFC] pkt_sched: gen_estimator: Dont report fake rate estimators

We currently send TCA_STATS_RATE_EST elements to netlink users, even if no estimator
is running.

# tc -s -d qdisc
qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 112833764978 bytes 1495081739 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0

The user has no way to tell whether "rate 0bit 0pps" is a real estimate or a fake
one (because no estimator is active).

After this patch, the tc command output is:
$ tc -s -d qdisc
qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 561075 bytes 1196 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

We add a parameter to gnet_stats_copy_rate_est() function so that
it can use gen_estimator_active(bstats, r), as suggested by Jarek.
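
The effect on a netlink consumer can be sketched in user space as follows.
This is a simplified model with hypothetical names (demo_dump,
demo_copy_rate_est, and an "active" flag standing in for the result of
gen_estimator_active()); it is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_rate_est {
	uint32_t bps;
	uint32_t pps;
};

/* Stand-in for a netlink dump: records whether a rate element was emitted. */
struct demo_dump {
	bool has_rate;
	struct demo_rate_est rate;
};

/* Emit the rate element only when an estimator is running.
 * Returning 0 without emitting is not an error, so callers that
 * test the result for "< 0" keep working unchanged. */
int demo_copy_rate_est(struct demo_dump *d, bool active,
		       const struct demo_rate_est *r)
{
	if (!active)
		return 0;
	d->has_rate = true;
	d->rate = *r;
	return 0;
}
```

In this model a reader checks has_rate before printing a rate, so an idle
estimator reporting zero can be told apart from no estimator at all.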

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/net/gen_stats.h |    1 +
 net/core/gen_stats.c    |    7 ++++++-
 net/sched/act_api.c     |    2 +-
 net/sched/sch_api.c     |    2 +-
 net/sched/sch_cbq.c     |    2 +-
 net/sched/sch_drr.c     |    2 +-
 net/sched/sch_hfsc.c    |    2 +-
 net/sched/sch_htb.c     |    2 +-
 8 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/include/net/gen_stats.h b/include/net/gen_stats.h
index c148855..a0800e6 100644
--- a/include/net/gen_stats.h
+++ b/include/net/gen_stats.h
@@ -30,6 +30,7 @@ extern int gnet_stats_start_copy_compat(struct sk_buff *skb, int type,
 extern int gnet_stats_copy_basic(struct gnet_dump *d,
 				 struct gnet_stats_basic_packed *b);
 extern int gnet_stats_copy_rate_est(struct gnet_dump *d,
+				    const struct gnet_stats_basic_packed *bstats,
 				    struct gnet_stats_rate_est *r);
 extern int gnet_stats_copy_queue(struct gnet_dump *d,
 				 struct gnet_stats_queue *q);
diff --git a/net/core/gen_stats.c b/net/core/gen_stats.c
index 8569310..6f9513e 100644
--- a/net/core/gen_stats.c
+++ b/net/core/gen_stats.c
@@ -136,8 +136,13 @@ gnet_stats_copy_basic(struct gnet_dump *d, struct gnet_stats_basic_packed *b)
  * if the room in the socket buffer was not sufficient.
  */
 int
-gnet_stats_copy_rate_est(struct gnet_dump *d, struct gnet_stats_rate_est *r)
+gnet_stats_copy_rate_est(struct gnet_dump *d,
+			 const struct gnet_stats_basic_packed *bstats,
+			 struct gnet_stats_rate_est *r)
 {
+	if (!gen_estimator_active(bstats, r))
+		return 0;
+
 	if (d->compat_tc_stats) {
 		d->tc_stats.bps = r->bps;
 		d->tc_stats.pps = r->pps;
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 2dfb3e7..2b0d5ee 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -618,7 +618,7 @@ int tcf_action_copy_stats(struct sk_buff *skb, struct tc_action *a,
 			goto errout;
 
 	if (gnet_stats_copy_basic(&d, &h->tcf_bstats) < 0 ||
-	    gnet_stats_copy_rate_est(&d, &h->tcf_rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(&d, &h->tcf_bstats, &h->tcf_rate_est) < 0 ||
 	    gnet_stats_copy_queue(&d, &h->tcf_qstats) < 0)
 		goto errout;
 
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 903e418..1acfd29 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1179,7 +1179,7 @@ static int tc_fill_qdisc(struct sk_buff *skb, struct Qdisc *q, u32 clid,
 		goto nla_put_failure;
 
 	if (gnet_stats_copy_basic(&d, &q->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(&d, &q->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(&d, &q->bstats, &q->rate_est) < 0 ||
 	    gnet_stats_copy_queue(&d, &q->qstats) < 0)
 		goto nla_put_failure;
 
diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
index 5b132c4..3846d65 100644
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -1609,7 +1609,7 @@ cbq_dump_class_stats(struct Qdisc *sch, unsigned long arg,
 		cl->xstats.undertime = cl->undertime - q->now;
 
 	if (gnet_stats_copy_basic(d, &cl->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(d, &cl->bstats, &cl->rate_est) < 0 ||
 	    gnet_stats_copy_queue(d, &cl->qstats) < 0)
 		return -1;
 
diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c
index 5a888af..a65604f 100644
--- a/net/sched/sch_drr.c
+++ b/net/sched/sch_drr.c
@@ -280,7 +280,7 @@ static int drr_dump_class_stats(struct Qdisc *sch, unsigned long arg,
 	}
 
 	if (gnet_stats_copy_basic(d, &cl->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(d, &cl->bstats, &cl->rate_est) < 0 ||
 	    gnet_stats_copy_queue(d, &cl->qdisc->qstats) < 0)
 		return -1;
 
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 2c5c76b..b38b39c 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -1375,7 +1375,7 @@ hfsc_dump_class_stats(struct Qdisc *sch, unsigned long arg,
 	xstats.rtwork  = cl->cl_cumul;
 
 	if (gnet_stats_copy_basic(d, &cl->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(d, &cl->bstats, &cl->rate_est) < 0 ||
 	    gnet_stats_copy_queue(d, &cl->qstats) < 0)
 		return -1;
 
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 85acab9..8352fa3 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1105,7 +1105,7 @@ htb_dump_class_stats(struct Qdisc *sch, unsigned long arg, struct gnet_dump *d)
 	cl->xstats.ctokens = cl->ctokens;
 
 	if (gnet_stats_copy_basic(d, &cl->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(d, &cl->bstats, &cl->rate_est) < 0 ||
 	    gnet_stats_copy_queue(d, &cl->qstats) < 0)
 		return -1;
 


* Re: [PATCH 30/31] Fix use of uninitialized variable in cache_grow()
From: David Rientjes @ 2009-10-02 10:05 UTC (permalink / raw)
  To: Neil Brown
  Cc: Suresh Jayaraman, Linus Torvalds, Andrew Morton, linux-kernel,
	linux-mm, netdev, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
	trond.myklebust
In-Reply-To: <19141.34685.863491.329836@notabene.brown>

On Fri, 2 Oct 2009, Neil Brown wrote:

> > > Index: mmotm/mm/slab.c
> > > ===================================================================
> > > --- mmotm.orig/mm/slab.c
> > > +++ mmotm/mm/slab.c
> > > @@ -2760,7 +2760,7 @@ static int cache_grow(struct kmem_cache
> > >  	size_t offset;
> > >  	gfp_t local_flags;
> > >  	struct kmem_list3 *l3;
> > > -	int reserve;
> > > +	int reserve = -1;
> > >  
> > >  	/*
> > >  	 * Be lazy and only check for valid flags here,  keeping it out of the
> > > @@ -2816,7 +2816,8 @@ static int cache_grow(struct kmem_cache
> > >  	if (local_flags & __GFP_WAIT)
> > >  		local_irq_disable();
> > >  	check_irq_off();
> > > -	slab_set_reserve(cachep, reserve);
> > > +	if (reserve != -1)
> > > +		slab_set_reserve(cachep, reserve);
> > >  	spin_lock(&l3->list_lock);
> > >  
> > >  	/* Make slab active. */
> > 
> > Given the patch description, shouldn't this be a test for objp != NULL 
> > instead, then?
> 
> In between those two patch hunks, cache_grow contains the code:
> 	if (!objp)
> 		objp = kmem_getpages(cachep, local_flags, nodeid, &reserve);
> 	if (!objp)
> 		goto failed;
> 
> We can no longer test if objp was NULL on entry to the function.
> We could take a copy of objp on entry to the function, and test it
> here.  But initialising 'reserve' to an invalid value is easier.
> 

Seems like you could do all this in kmem_getpages(), then, by calling 
slab_set_reserve(cachep, page->reserve) before returning the new page?

 [ I'd also drop the branch in slab_set_reserve(), it's faster to just 
   assign it unconditionally. ]


* Messages are printed on screen
From: Markus Feldmann @ 2009-10-02  9:52 UTC (permalink / raw)
  To: netdev

Hi All,

i am setting up my Server, with Linux Debian lenny. Therefore i am using 
  Kernel 2.6.31.1. I configured this Kernel with <make defconfig> and 
<make menuconfig>. My motherboard has 5 PCI slots and 1 AGP slot. All 
slots are in use. All devices are mapped to the following IRQ-Line:

Mass Storage Device (PCI Slot1)	IRQ-Line 11
Ethernet (PCI Slot 2) 		IRQ-Line 4
Ethernet (PCI Slot 3) 		IRQ-Line 5
Ethernet (PCI Slot 4) 		IRQ-Line 7
Ethernet (PCI Slot 5) 		IRQ-Line 11
Onboard USB-Controller		IRQ-Line 5
Onboard USB-Controller		IRQ-Line 4
Onboard USB-Controller		IRQ-Line 11
Onboard IDE			IRQ-Line 14
AGP VGA				IRQ-Line 11

As you see some of my IRQ-Lines are multiply in use, so my Server is 
working hard at his limit. The result is sometimes freezing of my 
Server, especially if there is much processing on these devices. I 
remember that with Kernel 2.6.18 my system didn't freeze.

So i am trying to reduce the amount of this processing. I still get 
messages about dropped network packets on my Terminal, although i set up 
my <rsyslog> to save this only to </var/log>. Here is my 
</etc/rsyslog.conf>

How can i disable the output of messages (about dropped packets from my 
firewall) to my terminal ?

How can i stabilize my IRQ-System with the kernel 2.6.31.1 ?

What debug features should i disable ?

regards Markus



* Re: [BUG net-2.6] bluetooth/rfcomm : sleeping function called from invalid context at mm/slub.c:1719
From: Oliver Hartkopp @ 2009-10-02  9:52 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: Linux Netdev List, linux-bluetooth
In-Reply-To: <4AC59D8A.6000102@hartkopp.net>

It's a reproducible bug.

When creating a ppp dialup connection a second time there is a lockdep annotation:

[ 1477.716936] PPP generic driver version 2.4.2
[ 1477.738035] BUG: sleeping function called from invalid context at mm/slub.c:1719
[ 1477.738046] in_atomic(): 1, irqs_disabled(): 0, pid: 5057, name: pppd
[ 1477.738053] 3 locks held by pppd/5057:
[ 1477.738058]  #0:  (rfcomm_mutex){+.+.+.}, at: [<fa5dd2a1>] rfcomm_dlc_open+0x28/0x2d6 [rfcomm]
[ 1477.738083]  #1:  (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.+.}, at: [<fa53f4f8>] l2cap_sock_connect+0x62/0x2c6 [l2cap]
[ 1477.738105]  #2:  (&hdev->lock){+...+.}, at: [<fa53f5b4>] l2cap_sock_connect+0x11e/0x2c6 [l2cap]
[ 1477.738129] Pid: 5057, comm: pppd Not tainted 2.6.31-08939-gdb8abec-dirty #21
[ 1477.738135] Call Trace:
[ 1477.738148]  [<c1042a2b>] ? __debug_show_held_locks+0x1e/0x20
[ 1477.738160]  [<c10212a1>] __might_sleep+0xc9/0xce
[ 1477.738171]  [<c1078b62>] __kmalloc+0x6d/0xfb
[ 1477.738181]  [<c119e739>] ? kzalloc+0xb/0xd
[ 1477.738190]  [<c119e739>] kzalloc+0xb/0xd
[ 1477.738199]  [<c119ef1a>] device_private_init+0x15/0x3d
[ 1477.738209]  [<c11a0e1b>] dev_set_drvdata+0x18/0x26
[ 1477.738233]  [<f88f9a1b>] hci_conn_init_sysfs+0x3d/0xc7 [bluetooth]
[ 1477.738253]  [<f88f61b3>] hci_conn_add+0x1c0/0x1d5 [bluetooth]
[ 1477.738271]  [<f88f6360>] hci_connect+0x71/0x17d [bluetooth]
[ 1477.738285]  [<fa53f62c>] l2cap_sock_connect+0x196/0x2c6 [l2cap]
[ 1477.738298]  [<c1246e3d>] kernel_connect+0xd/0x12
[ 1477.738311]  [<fa5dd3c3>] rfcomm_dlc_open+0x14a/0x2d6 [rfcomm]
[ 1477.738326]  [<fa5df0fa>] ? rfcomm_tty_open+0x73/0x227 [rfcomm]
[ 1477.738341]  [<fa5df130>] rfcomm_tty_open+0xa9/0x227 [rfcomm]
[ 1477.738352]  [<c1022e3f>] ? default_wake_function+0x0/0xd
[ 1477.738363]  [<c1180c79>] tty_open+0x29e/0x399
[ 1477.738374]  [<c107e9bd>] chrdev_open+0x13f/0x156
[ 1477.738384]  [<c107b0d3>] __dentry_open+0x11b/0x20f
[ 1477.738394]  [<c107b261>] nameidata_to_filp+0x2c/0x43
[ 1477.738403]  [<c107e87e>] ? chrdev_open+0x0/0x156
[ 1477.738414]  [<c1084e9e>] do_filp_open+0x3c6/0x70a
[ 1477.738426]  [<c108d3e4>] ? alloc_fd+0xc8/0xd2
[ 1477.738436]  [<c108d3e4>] ? alloc_fd+0xc8/0xd2
[ 1477.738446]  [<c107aebc>] do_sys_open+0x4a/0xe7
[ 1477.738456]  [<c1002acc>] ? restore_all_notrace+0x0/0x18
[ 1477.738466]  [<c107af9b>] sys_open+0x1e/0x26
[ 1477.738475]  [<c1002a18>] sysenter_do_call+0x12/0x36
[ 1484.844933] PPP BSD Compression module registered
[ 1484.870946] PPP Deflate Compression module registered
[ 4335.008503] CE: hpet increasing min_delta_ns to 15000 nsec
[ 7605.540870] INFO: trying to register non-static key.
[ 7605.540879] the code is fine but needs lockdep annotation.
[ 7605.540884] turning off the locking correctness validator.
[ 7605.540894] Pid: 0, comm: swapper Not tainted 2.6.31-08939-gdb8abec-dirty #21
[ 7605.540900] Call Trace:
[ 7605.540915]  [<c12e4fb2>] ? printk+0xf/0x11
[ 7605.540928]  [<c1042214>] register_lock_class+0x5a/0x295
[ 7605.540939]  [<c1043af2>] __lock_acquire+0x9b/0xc03
[ 7605.540949]  [<c104464b>] ? __lock_acquire+0xbf4/0xc03
[ 7605.540967]  [<fa53b168>] ? l2cap_get_chan_by_scid+0x35/0x43 [l2cap]
[ 7605.540977]  [<c104491f>] ? lock_release_non_nested+0x17b/0x1db
[ 7605.540990]  [<fa53b168>] ? l2cap_get_chan_by_scid+0x35/0x43 [l2cap]
[ 7605.541001]  [<c10426fd>] ? trace_hardirqs_off+0xb/0xd
[ 7605.541010]  [<c10446b6>] lock_acquire+0x5c/0x73
[ 7605.541021]  [<c124cd14>] ? skb_dequeue+0x12/0x4c
[ 7605.541031]  [<c12e6e23>] _spin_lock_irqsave+0x24/0x34
[ 7605.541039]  [<c124cd14>] ? skb_dequeue+0x12/0x4c
[ 7605.541048]  [<c124cd14>] skb_dequeue+0x12/0x4c
[ 7605.541057]  [<c124d579>] skb_queue_purge+0x14/0x1b
[ 7605.541070]  [<fa53de3f>] l2cap_recv_frame+0xe9e/0x129a [l2cap]
[ 7605.541080]  [<c10421d1>] ? register_lock_class+0x17/0x295
[ 7605.541091]  [<c104464b>] ? __lock_acquire+0xbf4/0xc03
[ 7605.541114]  [<c104464b>] ? __lock_acquire+0xbf4/0xc03
[ 7605.541125]  [<c120de74>] ? uhci_giveback_urb+0xf2/0x162
[ 7605.541148]  [<f88f4c45>] ? hci_rx_task+0xfe/0x1f8 [bluetooth]
[ 7605.541162]  [<fa53e2e4>] l2cap_recv_acldata+0xa9/0x1be [l2cap]
[ 7605.541174]  [<fa53e23b>] ? l2cap_recv_acldata+0x0/0x1be [l2cap]
[ 7605.541193]  [<f88f4c77>] hci_rx_task+0x130/0x1f8 [bluetooth]
[ 7605.541204]  [<c102a098>] tasklet_action+0x6b/0xb2
[ 7605.541213]  [<c102a46b>] __do_softirq+0x82/0x101
[ 7605.541222]  [<c102a515>] do_softirq+0x2b/0x43
[ 7605.541231]  [<c102a619>] irq_exit+0x35/0x68
[ 7605.541241]  [<c1004513>] do_IRQ+0x80/0x96
[ 7605.541250]  [<c10030ae>] common_interrupt+0x2e/0x34
[ 7605.541260]  [<c104007b>] ? tick_device_uses_broadcast+0x71/0x7c
[ 7605.541271]  [<c11747a8>] ? acpi_idle_enter_simple+0x103/0x12e
[ 7605.541281]  [<c1174515>] acpi_idle_enter_bm+0xc3/0x253
[ 7605.541291]  [<c1238b6f>] cpuidle_idle_call+0x60/0x91
[ 7605.541300]  [<c1001d44>] cpu_idle+0x49/0x65
[ 7605.541310]  [<c12e2f0e>] start_secondary+0x190/0x195


Oliver Hartkopp wrote:
> Hello Marcel,
> 
> with current net-2.6 tree ...
> 
> While starting my PPP Bluetooth dialup networking, i got this:
> 
> [  722.461549] PPP generic driver version 2.4.2
> [  722.477519] BUG: sleeping function called from invalid context at mm/slub.c:1719
> [  722.477530] in_atomic(): 1, irqs_disabled(): 0, pid: 4677, name: pppd
> [  722.477537] 3 locks held by pppd/4677:
> [  722.477542]  #0:  (rfcomm_mutex){+.+.+.}, at: [<fa5df2a1>] rfcomm_dlc_open+0x28/0x2d6 [rfcomm]
> [  722.477568]  #1:  (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.+.}, at: [<fa5414f8>] l2cap_sock_connect+0x62/0x2c6 [l2cap]
> [  722.477589]  #2:  (&hdev->lock){+...+.}, at: [<fa5415b4>] l2cap_sock_connect+0x11e/0x2c6 [l2cap]
> [  722.477613] Pid: 4677, comm: pppd Not tainted 2.6.31-08939-gdb8abec-dirty #21
> [  722.477619] Call Trace:
> [  722.477633]  [<c1042a2b>] ? __debug_show_held_locks+0x1e/0x20
> [  722.477644]  [<c10212a1>] __might_sleep+0xc9/0xce
> [  722.477655]  [<c1078b62>] __kmalloc+0x6d/0xfb
> [  722.477666]  [<c119e739>] ? kzalloc+0xb/0xd
> [  722.477674]  [<c119e739>] kzalloc+0xb/0xd
> [  722.477683]  [<c119ef1a>] device_private_init+0x15/0x3d
> [  722.477693]  [<c11a0e1b>] dev_set_drvdata+0x18/0x26
> [  722.477718]  [<f8b7ca1b>] hci_conn_init_sysfs+0x3d/0xc7 [bluetooth]
> [  722.477737]  [<f8b791b3>] hci_conn_add+0x1c0/0x1d5 [bluetooth]
> [  722.477756]  [<f8b79360>] hci_connect+0x71/0x17d [bluetooth]
> [  722.477769]  [<fa54162c>] l2cap_sock_connect+0x196/0x2c6 [l2cap]
> [  722.477782]  [<c1246e3d>] kernel_connect+0xd/0x12
> [  722.477795]  [<fa5df3c3>] rfcomm_dlc_open+0x14a/0x2d6 [rfcomm]
> [  722.477810]  [<fa5e10fa>] ? rfcomm_tty_open+0x73/0x227 [rfcomm]
> [  722.477825]  [<fa5e1130>] rfcomm_tty_open+0xa9/0x227 [rfcomm]
> [  722.477836]  [<c1022e3f>] ? default_wake_function+0x0/0xd
> [  722.477847]  [<c1180c79>] tty_open+0x29e/0x399
> [  722.477858]  [<c107e9bd>] chrdev_open+0x13f/0x156
> [  722.477868]  [<c107b0d3>] __dentry_open+0x11b/0x20f
> [  722.477878]  [<c107b261>] nameidata_to_filp+0x2c/0x43
> [  722.477888]  [<c107e87e>] ? chrdev_open+0x0/0x156
> [  722.477898]  [<c1084e9e>] do_filp_open+0x3c6/0x70a
> [  722.477910]  [<c108d3e4>] ? alloc_fd+0xc8/0xd2
> [  722.477920]  [<c108d3e4>] ? alloc_fd+0xc8/0xd2
> [  722.477930]  [<c107aebc>] do_sys_open+0x4a/0xe7
> [  722.477940]  [<c1002acc>] ? restore_all_notrace+0x0/0x18
> [  722.477950]  [<c107af9b>] sys_open+0x1e/0x26
> [  722.477959]  [<c1002a18>] sysenter_do_call+0x12/0x36
> [  729.658613] PPP BSD Compression module registered
> [  729.684789] PPP Deflate Compression module registered
> 
> Any idea?
> 
> Regards,
> Oliver
> 



* Re: [PATCH 04/31] mm: tag reseve pages
From: David Rientjes @ 2009-10-02  9:50 UTC (permalink / raw)
  To: Neil Brown
  Cc: Suresh Jayaraman, Linus Torvalds, Andrew Morton, linux-kernel,
	linux-mm, netdev, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
	trond.myklebust
In-Reply-To: <19141.34038.274185.392663@notabene.brown>

On Fri, 2 Oct 2009, Neil Brown wrote:

> Normally if zones are above their watermarks, page->reserve will not
> be set.
> This is because __alloc_pages_nodemask (which seems to be the main
> non-inline entrypoint) first calls get_page_from_freelist with
> alloc_flags set to ALLOC_WMARK_LOW|ALLOC_CPUSET.
> Only if this fails does __alloc_pages_nodemask call
> __alloc_pages_slowpath which potentially sets ALLOC_NO_WATERMARKS in
> alloc_flags.
> 
> So page->reserve being set actually tells us:
>   PF_MEMALLOC or GFP_MEMALLOC were used, and
>   a WMARK_LOW allocation attempt failed very recently
> 
> which is close enough to "the emergency reserves were used" I think.
> 

There are a couple of corner cases for GFP_ATOMIC, though:

 - it isn't restricted by cpuset, so ALLOC_CPUSET will never get set for 
   the slowpath allocs and may very well allow the allocation to succeed 
   in zones far above their min watermark.

 - it allows for allocating beyond the min watermark in allowed zones
   simply by setting ALLOC_HARDER; these types of "reserve" allocations
   wouldn't be marked as page->reserve with your patches if
   ALLOC_NO_WATERMARKS wasn't set because of the allocation context.

Whether the second one fits your definition of reserve is debatable, but 
if it doesn't, there's an inconsistency: the allocation may succeed in 
"no watermark" context (for example, in hard irq context) even though 
that privilege wasn't needed for the allocation to succeed; perhaps it 
only needed ALLOC_HARDER.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* [PATCH] net: Fix wrong sizeof
From: Jean Delvare @ 2009-10-02  9:30 UTC (permalink / raw)
  To: LKML, netdev; +Cc: linux-doc, Randy Dunlap, stable

Which is why I have always preferred sizeof(struct foo) over
sizeof(var).
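
For readers unfamiliar with this class of bug, here is a small stand-alone
sketch of what sizeof(&var) does to a memset(). The structure and function
names are hypothetical and chosen only for illustration; the structure is
made deliberately larger than a pointer so the effect is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure, deliberately larger than any pointer. */
struct demo_config {
	char name[64];
	int tx_type;
	int rx_filter;
};

/* Fill the structure with 0xFF, clear it with either the buggy or the
 * correct size, and count the bytes that are still non-zero afterwards. */
size_t demo_stale_bytes(int use_buggy_size)
{
	struct demo_config cfg;
	const unsigned char *p = (const unsigned char *)&cfg;
	size_t n, i, stale = 0;

	memset(&cfg, 0xFF, sizeof(cfg));

	/* sizeof(&cfg) is the size of a pointer, sizeof(cfg) the object. */
	n = use_buggy_size ? sizeof(&cfg) : sizeof(cfg);
	memset(&cfg, 0, n);

	for (i = 0; i < sizeof(cfg); i++)
		if (p[i] != 0)
			stale++;
	return stale;
}
```

With the buggy size only the first pointer-sized chunk is cleared and the
rest of the structure keeps its old contents. Writing
sizeof(struct demo_config) instead would give the same result as sizeof(cfg)
here.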

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
---
Stable team, the non-documentation part of this fix applies to 2.6.31,
2.6.30 and 2.6.27.

 Documentation/networking/timestamping/timestamping.c |    2 +-
 drivers/net/iseries_veth.c                           |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.32-rc1.orig/Documentation/networking/timestamping/timestamping.c	2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.32-rc1/Documentation/networking/timestamping/timestamping.c	2009-10-02 11:07:19.000000000 +0200
@@ -381,7 +381,7 @@ int main(int argc, char **argv)
 	memset(&hwtstamp, 0, sizeof(hwtstamp));
 	strncpy(hwtstamp.ifr_name, interface, sizeof(hwtstamp.ifr_name));
 	hwtstamp.ifr_data = (void *)&hwconfig;
-	memset(&hwconfig, 0, sizeof(&hwconfig));
+	memset(&hwconfig, 0, sizeof(hwconfig));
 	hwconfig.tx_type =
 		(so_timestamping_flags & SOF_TIMESTAMPING_TX_HARDWARE) ?
 		HWTSTAMP_TX_ON : HWTSTAMP_TX_OFF;
--- linux-2.6.32-rc1.orig/drivers/net/iseries_veth.c	2009-09-28 10:28:42.000000000 +0200
+++ linux-2.6.32-rc1/drivers/net/iseries_veth.c	2009-10-02 11:07:15.000000000 +0200
@@ -495,7 +495,7 @@ static void veth_take_cap_ack(struct vet
 			   cnx->remote_lp);
 	} else {
 		memcpy(&cnx->cap_ack_event, event,
-		       sizeof(&cnx->cap_ack_event));
+		       sizeof(cnx->cap_ack_event));
 		cnx->state |= VETH_STATE_GOTCAPACK;
 		veth_kick_statemachine(cnx);
 	}


-- 
Jean Delvare


* Re: [PATCH 03/31] mm: expose gfp_to_alloc_flags()
From: David Rientjes @ 2009-10-02  9:30 UTC (permalink / raw)
  To: Neil Brown
  Cc: Suresh Jayaraman, Linus Torvalds, Andrew Morton, linux-kernel,
	linux-mm, netdev, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
	trond.myklebust
In-Reply-To: <19141.35274.513790.845711@notabene.brown>

On Fri, 2 Oct 2009, Neil Brown wrote:

> So something like this?
> Then change every occurrence of
> +		if (!(gfp_to_alloc_flags(gfpflags) & ALLOC_NO_WATERMARKS))
> to
> +		if (!(gfp_has_no_watermarks(gfpflags)))
> 
> ??
> 

No, it's not even necessary to call gfp_to_alloc_flags() at all, just 
create a globally exported function such as can_alloc_use_reserves() and 
use it in gfp_to_alloc_flags().

 [ Using 'p' in gfp_to_alloc_flags() is actually wrong since
   test_thread_flag() only works on current anyway, so it would be
   inconsistent if p were set to anything other than current; we can
   get rid of that auto variable. ]

Something like the following, which you can fold into this patch proposal 
and modify later for GFP_MEMALLOC.

Signed-off-by: David Rientjes <rientjes@google.com>
---
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 557bdad..7dd62a0 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -265,6 +265,8 @@ static inline void arch_free_page(struct page *page, int order) { }
 static inline void arch_alloc_page(struct page *page, int order) { }
 #endif
 
+int can_alloc_use_reserves(void);
+
 struct page *
 __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
 		       struct zonelist *zonelist, nodemask_t *nodemask);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bf72055..cf1d765 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1744,10 +1744,19 @@ void wake_all_kswapd(unsigned int order, struct zonelist *zonelist,
 		wakeup_kswapd(zone, order);
 }
 
+/*
+ * Does the current context allow the allocation to utilize memory reserves
+ * by ignoring watermarks for all zones?
+ */
+int can_alloc_use_reserves(void)
+{
+	return !in_interrupt() && ((current->flags & PF_MEMALLOC) ||
+				   unlikely(test_thread_flag(TIF_MEMDIE)));
+}
+
 static inline int
 gfp_to_alloc_flags(gfp_t gfp_mask)
 {
-	struct task_struct *p = current;
 	int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
 	const gfp_t wait = gfp_mask & __GFP_WAIT;
 
@@ -1769,15 +1778,12 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 		 * See also cpuset_zone_allowed() comment in kernel/cpuset.c.
 		 */
 		alloc_flags &= ~ALLOC_CPUSET;
-	} else if (unlikely(rt_task(p)))
+	} else if (unlikely(rt_task(current)))
 		alloc_flags |= ALLOC_HARDER;
 
-	if (likely(!(gfp_mask & __GFP_NOMEMALLOC))) {
-		if (!in_interrupt() &&
-		    ((p->flags & PF_MEMALLOC) ||
-		     unlikely(test_thread_flag(TIF_MEMDIE))))
+	if (likely(!(gfp_mask & __GFP_NOMEMALLOC)))
+		if (can_alloc_use_reserves())
 			alloc_flags |= ALLOC_NO_WATERMARKS;
-	}
 
 	return alloc_flags;
 }



* [RFC] netlink: add socket destruction notification
From: Johannes Berg @ 2009-10-02  8:44 UTC (permalink / raw)
  To: netdev; +Cc: Jouni Malinen, Thomas Graf

When we want to keep track of resources associated with applications, we
need to know when an app is going away. Add a notification function to
netlink that tells us that, and also hook it up to generic netlink so
generic netlink can notify the families. Due to the way generic netlink
works though, we need to notify all families and they have to sort out
whatever resources some commands associated with the socket themselves.
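
As a rough illustration of what a family might do with such a notification,
here is a user-space sketch. All names (demo_resource, demo_claim,
demo_sock_destroyed) are hypothetical, the table is a stand-in for whatever
per-family state exists, and the callback signature used in the patch is not
reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_RES 8

/* One resource claimed by a command, tagged with the owning socket's id.
 * owner == 0 marks a free slot. */
struct demo_resource {
	uint32_t owner;
	int id;
};

static struct demo_resource demo_table[DEMO_MAX_RES];

/* Record that socket "owner" claimed resource "id"; -1 if the table is full. */
int demo_claim(uint32_t owner, int id)
{
	size_t i;

	for (i = 0; i < DEMO_MAX_RES; i++) {
		if (demo_table[i].owner == 0) {
			demo_table[i].owner = owner;
			demo_table[i].id = id;
			return 0;
		}
	}
	return -1;
}

/* Notification handler. Every family hears about every socket, so it
 * releases only the entries it recorded for that socket and ignores
 * sockets it has never seen. Returns the number of entries released. */
int demo_sock_destroyed(uint32_t owner)
{
	size_t i;
	int released = 0;

	if (owner == 0)
		return 0;

	for (i = 0; i < DEMO_MAX_RES; i++) {
		if (demo_table[i].owner == owner) {
			demo_table[i].owner = 0;
			released++;
		}
	}
	return released;
}
```

The handler has to tolerate unknown sockets because, as noted above, the
notification is not targeted at the family that actually holds resources.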

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
---
 drivers/connector/connector.c       |    2 +-
 drivers/scsi/scsi_netlink.c         |    2 +-
 drivers/scsi/scsi_transport_iscsi.c |    2 +-
 include/linux/netlink.h             |    1 +
 include/net/genetlink.h             |    3 +++
 kernel/audit.c                      |    3 ++-
 lib/kobject_uevent.c                |    2 +-
 net/bridge/netfilter/ebt_ulog.c     |    2 +-
 net/core/rtnetlink.c                |    3 ++-
 net/decnet/netfilter/dn_rtmsg.c     |    2 +-
 net/ipv4/fib_frontend.c             |    2 +-
 net/ipv4/inet_diag.c                |    2 +-
 net/ipv4/netfilter/ip_queue.c       |    2 +-
 net/ipv4/netfilter/ipt_ULOG.c       |    6 +++---
 net/ipv6/netfilter/ip6_queue.c      |    2 +-
 net/netfilter/nfnetlink.c           |    2 +-
 net/netlink/af_netlink.c            |    6 ++++++
 net/netlink/genetlink.c             |   18 ++++++++++++++++--
 net/xfrm/xfrm_user.c                |    2 +-
 security/selinux/netlink.c          |    3 ++-
 20 files changed, 47 insertions(+), 20 deletions(-)

--- wireless-testing.orig/net/xfrm/xfrm_user.c	2009-09-23 10:10:41.000000000 +0200
+++ wireless-testing/net/xfrm/xfrm_user.c	2009-09-29 14:45:33.000000000 +0200
@@ -2605,7 +2605,7 @@ static int __net_init xfrm_user_net_init
 	struct sock *nlsk;
 
 	nlsk = netlink_kernel_create(net, NETLINK_XFRM, XFRMNLGRP_MAX,
-				     xfrm_netlink_rcv, NULL, THIS_MODULE);
+				     xfrm_netlink_rcv, NULL, NULL, THIS_MODULE);
 	if (nlsk == NULL)
 		return -ENOMEM;
 	rcu_assign_pointer(net->xfrm.nlsk, nlsk);
--- wireless-testing.orig/drivers/connector/connector.c	2009-09-29 12:26:17.000000000 +0200
+++ wireless-testing/drivers/connector/connector.c	2009-09-29 14:45:33.000000000 +0200
@@ -451,7 +451,7 @@ static int __devinit cn_init(void)
 
 	dev->nls = netlink_kernel_create(&init_net, NETLINK_CONNECTOR,
 					 CN_NETLINK_USERS + 0xf,
-					 dev->input, NULL, THIS_MODULE);
+					 dev->input, NULL, NULL, THIS_MODULE);
 	if (!dev->nls)
 		return -EIO;
 
--- wireless-testing.orig/drivers/scsi/scsi_netlink.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/drivers/scsi/scsi_netlink.c	2009-09-29 14:45:33.000000000 +0200
@@ -496,7 +496,7 @@ scsi_netlink_init(void)
 
 	scsi_nl_sock = netlink_kernel_create(&init_net, NETLINK_SCSITRANSPORT,
 				SCSI_NL_GRP_CNT, scsi_nl_rcv_msg, NULL,
-				THIS_MODULE);
+				NULL, THIS_MODULE);
 	if (!scsi_nl_sock) {
 		printk(KERN_ERR "%s: register of recieve handler failed\n",
 				__func__);
--- wireless-testing.orig/drivers/scsi/scsi_transport_iscsi.c	2009-09-29 12:26:46.000000000 +0200
+++ wireless-testing/drivers/scsi/scsi_transport_iscsi.c	2009-09-29 14:45:33.000000000 +0200
@@ -2082,7 +2082,7 @@ static __init int iscsi_transport_init(v
 		goto unregister_conn_class;
 
 	nls = netlink_kernel_create(&init_net, NETLINK_ISCSI, 1, iscsi_if_rx,
-				    NULL, THIS_MODULE);
+				    NULL, NULL, THIS_MODULE);
 	if (!nls) {
 		err = -ENOBUFS;
 		goto unregister_session_class;
--- wireless-testing.orig/kernel/audit.c	2009-09-29 12:27:01.000000000 +0200
+++ wireless-testing/kernel/audit.c	2009-09-29 14:45:33.000000000 +0200
@@ -970,7 +970,8 @@ static int __init audit_init(void)
 	printk(KERN_INFO "audit: initializing netlink socket (%s)\n",
 	       audit_default ? "enabled" : "disabled");
 	audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, 0,
-					   audit_receive, NULL, THIS_MODULE);
+					   audit_receive, NULL, NULL,
+					   THIS_MODULE);
 	if (!audit_sock)
 		audit_panic("cannot initialize netlink socket");
 	else
--- wireless-testing.orig/lib/kobject_uevent.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/lib/kobject_uevent.c	2009-09-29 14:45:33.000000000 +0200
@@ -322,7 +322,7 @@ EXPORT_SYMBOL_GPL(add_uevent_var);
 static int __init kobject_uevent_init(void)
 {
 	uevent_sock = netlink_kernel_create(&init_net, NETLINK_KOBJECT_UEVENT,
-					    1, NULL, NULL, THIS_MODULE);
+					    1, NULL, NULL, NULL, THIS_MODULE);
 	if (!uevent_sock) {
 		printk(KERN_ERR
 		       "kobject_uevent: unable to create netlink socket!\n");
--- wireless-testing.orig/net/bridge/netfilter/ebt_ulog.c	2009-09-29 12:27:03.000000000 +0200
+++ wireless-testing/net/bridge/netfilter/ebt_ulog.c	2009-09-29 14:45:33.000000000 +0200
@@ -304,7 +304,7 @@ static int __init ebt_ulog_init(void)
 
 	ebtulognl = netlink_kernel_create(&init_net, NETLINK_NFLOG,
 					  EBT_ULOG_MAXNLGROUPS, NULL, NULL,
-					  THIS_MODULE);
+					  NULL, THIS_MODULE);
 	if (!ebtulognl) {
 		printk(KERN_WARNING KBUILD_MODNAME ": out of memory trying to "
 		       "call netlink_kernel_create\n");
--- wireless-testing.orig/net/core/rtnetlink.c	2009-09-29 12:27:04.000000000 +0200
+++ wireless-testing/net/core/rtnetlink.c	2009-09-29 14:45:33.000000000 +0200
@@ -1360,7 +1360,8 @@ static int rtnetlink_net_init(struct net
 {
 	struct sock *sk;
 	sk = netlink_kernel_create(net, NETLINK_ROUTE, RTNLGRP_MAX,
-				   rtnetlink_rcv, &rtnl_mutex, THIS_MODULE);
+				   rtnetlink_rcv, NULL,
+				   &rtnl_mutex, THIS_MODULE);
 	if (!sk)
 		return -ENOMEM;
 	net->rtnl = sk;
--- wireless-testing.orig/net/decnet/netfilter/dn_rtmsg.c	2009-09-23 10:10:41.000000000 +0200
+++ wireless-testing/net/decnet/netfilter/dn_rtmsg.c	2009-09-29 14:45:33.000000000 +0200
@@ -128,7 +128,7 @@ static int __init dn_rtmsg_init(void)
 
 	dnrmg = netlink_kernel_create(&init_net,
 				      NETLINK_DNRTMSG, DNRNG_NLGRP_MAX,
-				      dnrmg_receive_user_skb,
+				      dnrmg_receive_user_skb, NULL,
 				      NULL, THIS_MODULE);
 	if (dnrmg == NULL) {
 		printk(KERN_ERR "dn_rtmsg: Cannot create netlink socket");
--- wireless-testing.orig/net/ipv4/fib_frontend.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/net/ipv4/fib_frontend.c	2009-09-29 14:45:33.000000000 +0200
@@ -879,7 +879,7 @@ static int nl_fib_lookup_init(struct net
 {
 	struct sock *sk;
 	sk = netlink_kernel_create(net, NETLINK_FIB_LOOKUP, 0,
-				   nl_fib_input, NULL, THIS_MODULE);
+				   nl_fib_input, NULL, NULL, THIS_MODULE);
 	if (sk == NULL)
 		return -EAFNOSUPPORT;
 	net->ipv4.fibnl = sk;
--- wireless-testing.orig/net/ipv4/inet_diag.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/net/ipv4/inet_diag.c	2009-09-29 14:45:33.000000000 +0200
@@ -924,7 +924,7 @@ static int __init inet_diag_init(void)
 		goto out;
 
 	idiagnl = netlink_kernel_create(&init_net, NETLINK_INET_DIAG, 0,
-					inet_diag_rcv, NULL, THIS_MODULE);
+					inet_diag_rcv, NULL, NULL, THIS_MODULE);
 	if (idiagnl == NULL)
 		goto out_free_table;
 	err = 0;
--- wireless-testing.orig/net/ipv4/netfilter/ip_queue.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/net/ipv4/netfilter/ip_queue.c	2009-09-29 14:45:33.000000000 +0200
@@ -578,7 +578,7 @@ static int __init ip_queue_init(void)
 
 	netlink_register_notifier(&ipq_nl_notifier);
 	ipqnl = netlink_kernel_create(&init_net, NETLINK_FIREWALL, 0,
-				      ipq_rcv_skb, NULL, THIS_MODULE);
+				      ipq_rcv_skb, NULL, NULL, THIS_MODULE);
 	if (ipqnl == NULL) {
 		printk(KERN_ERR "ip_queue: failed to create netlink socket\n");
 		goto cleanup_netlink_notifier;
--- wireless-testing.orig/net/ipv4/netfilter/ipt_ULOG.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/net/ipv4/netfilter/ipt_ULOG.c	2009-09-29 14:45:33.000000000 +0200
@@ -400,9 +400,9 @@ static int __init ulog_tg_init(void)
 	for (i = 0; i < ULOG_MAXNLGROUPS; i++)
 		setup_timer(&ulog_buffers[i].timer, ulog_timer, i);
 
-	nflognl = netlink_kernel_create(&init_net,
-					NETLINK_NFLOG, ULOG_MAXNLGROUPS, NULL,
-					NULL, THIS_MODULE);
+	nflognl = netlink_kernel_create(&init_net, NETLINK_NFLOG,
+					ULOG_MAXNLGROUPS, NULL,
+					NULL, NULL, THIS_MODULE);
 	if (!nflognl)
 		return -ENOMEM;
 
--- wireless-testing.orig/net/ipv6/netfilter/ip6_queue.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/net/ipv6/netfilter/ip6_queue.c	2009-09-29 14:45:33.000000000 +0200
@@ -580,7 +580,7 @@ static int __init ip6_queue_init(void)
 
 	netlink_register_notifier(&ipq_nl_notifier);
 	ipqnl = netlink_kernel_create(&init_net, NETLINK_IP6_FW, 0,
-			              ipq_rcv_skb, NULL, THIS_MODULE);
+				      ipq_rcv_skb, NULL, NULL, THIS_MODULE);
 	if (ipqnl == NULL) {
 		printk(KERN_ERR "ip6_queue: failed to create netlink socket\n");
 		goto cleanup_netlink_notifier;
--- wireless-testing.orig/net/netfilter/nfnetlink.c	2009-09-29 12:27:12.000000000 +0200
+++ wireless-testing/net/netfilter/nfnetlink.c	2009-09-29 14:45:33.000000000 +0200
@@ -196,7 +196,7 @@ static int __init nfnetlink_init(void)
 	printk("Netfilter messages via NETLINK v%s.\n", nfversion);
 
 	nfnl = netlink_kernel_create(&init_net, NETLINK_NETFILTER, NFNLGRP_MAX,
-				     nfnetlink_rcv, NULL, THIS_MODULE);
+				     nfnetlink_rcv, NULL, NULL, THIS_MODULE);
 	if (!nfnl) {
 		printk(KERN_ERR "cannot initialize nfnetlink!\n");
 		return -ENOMEM;
--- wireless-testing.orig/net/netlink/genetlink.c	2009-09-29 12:27:12.000000000 +0200
+++ wireless-testing/net/netlink/genetlink.c	2009-09-29 14:45:33.000000000 +0200
@@ -561,6 +561,20 @@ static void genl_rcv(struct sk_buff *skb
 	genl_unlock();
 }
 
+static void genl_destruct(struct sock *sk)
+{
+	struct genl_family *f;
+	int idx;
+
+	genl_lock();
+
+	for (idx = 0; idx < GENL_FAM_TAB_SIZE; idx++)
+		list_for_each_entry(f, &family_ht[idx], family_list)
+			if (f->destruct_sk)
+				f->destruct_sk(sk);
+	genl_unlock();
+}
+
 /**************************************************************************
  * Controller
  **************************************************************************/
@@ -852,8 +866,8 @@ static int __net_init genl_pernet_init(s
 {
 	/* we'll bump the group number right afterwards */
 	net->genl_sock = netlink_kernel_create(net, NETLINK_GENERIC, 0,
-					       genl_rcv, &genl_mutex,
-					       THIS_MODULE);
+					       genl_rcv, genl_destruct,
+					       &genl_mutex, THIS_MODULE);
 
 	if (!net->genl_sock && net_eq(net, &init_net))
 		panic("GENL: Cannot initialize generic netlink\n");
--- wireless-testing.orig/security/selinux/netlink.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/security/selinux/netlink.c	2009-09-29 14:45:33.000000000 +0200
@@ -106,7 +106,8 @@ void selnl_notify_policyload(u32 seqno)
 static int __init selnl_init(void)
 {
 	selnl = netlink_kernel_create(&init_net, NETLINK_SELINUX,
-				      SELNLGRP_MAX, NULL, NULL, THIS_MODULE);
+				      SELNLGRP_MAX, NULL, NULL, NULL,
+				      THIS_MODULE);
 	if (selnl == NULL)
 		panic("SELinux:  Cannot create netlink socket.");
 	netlink_set_nonroot(NETLINK_SELINUX, NL_NONROOT_RECV);
--- wireless-testing.orig/include/linux/netlink.h	2009-09-29 12:26:58.000000000 +0200
+++ wireless-testing/include/linux/netlink.h	2009-09-29 14:45:33.000000000 +0200
@@ -182,6 +182,7 @@ extern void netlink_table_ungrab(void);
 extern struct sock *netlink_kernel_create(struct net *net,
 					  int unit,unsigned int groups,
 					  void (*input)(struct sk_buff *skb),
+					  void (*destruct)(struct sock *sk),
 					  struct mutex *cb_mutex,
 					  struct module *module);
 extern void netlink_kernel_release(struct sock *sk);
--- wireless-testing.orig/include/net/genetlink.h	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/include/net/genetlink.h	2009-09-29 14:45:33.000000000 +0200
@@ -30,6 +30,8 @@ struct genl_multicast_group
  * @maxattr: maximum number of attributes supported
  * @netnsok: set to true if the family can handle network
  *	namespaces and should be presented in all of them
+ * @destruct_sk: called when any generic netlink socket
+ *	is destroyed (e.g. by the application closing it)
  * @attrbuf: buffer to store parsed attributes
  * @ops_list: list of all assigned operations
  * @family_list: family list
@@ -43,6 +45,7 @@ struct genl_family
 	unsigned int		version;
 	unsigned int		maxattr;
 	bool			netnsok;
+	void			(*destruct_sk)(struct sock *sk);
 	struct nlattr **	attrbuf;	/* private */
 	struct list_head	ops_list;	/* private */
 	struct list_head	family_list;	/* private */
--- wireless-testing.orig/net/netlink/af_netlink.c	2009-09-29 12:27:12.000000000 +0200
+++ wireless-testing/net/netlink/af_netlink.c	2009-09-29 14:45:33.000000000 +0200
@@ -80,6 +80,7 @@ struct netlink_sock {
 	struct mutex		*cb_mutex;
 	struct mutex		cb_def_mutex;
 	void			(*netlink_rcv)(struct sk_buff *skb);
+	void			(*destruct)(struct sock *sk);
 	struct module		*module;
 };
 
@@ -166,6 +167,9 @@ static void netlink_sock_destruct(struct
 		return;
 	}
 
+	if (nlk->destruct)
+		nlk->destruct(sk);
+
 	WARN_ON(atomic_read(&sk->sk_rmem_alloc));
 	WARN_ON(atomic_read(&sk->sk_wmem_alloc));
 	WARN_ON(nlk_sk(sk)->groups);
@@ -1464,6 +1468,7 @@ static void netlink_data_ready(struct so
 struct sock *
 netlink_kernel_create(struct net *net, int unit, unsigned int groups,
 		      void (*input)(struct sk_buff *skb),
+		      void (*destruct)(struct sock *sk),
 		      struct mutex *cb_mutex, struct module *module)
 {
 	struct socket *sock;
@@ -1502,6 +1507,7 @@ netlink_kernel_create(struct net *net, i
 	sk->sk_data_ready = netlink_data_ready;
 	if (input)
 		nlk_sk(sk)->netlink_rcv = input;
+	nlk_sk(sk)->destruct = destruct;
 
 	if (netlink_insert(sk, net, 0))
 		goto out_sock_release;



^ permalink raw reply

* Re: [PATCH] ipvs: Add boundary check on ioctl arguments
From: Julian Anastasov @ 2009-10-02  8:35 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Hannes Eder, Wensong Zhang, netdev, linux-kernel, Simon Horman
In-Reply-To: <20090930171833.5ce0011d@infradead.org>


	Hello,

On Wed, 30 Sep 2009, Arjan van de Ven wrote:

> fair enough; updated patch below

	OK, you can add my signed-off line after changing
'cmd > ...MAX + 1' to 'cmd > ...MAX' in both
places; nf_sockopt_ops ranges are [optmin ... optmax).

Maybe the comments should be changed because:

- I'm not the author, but after inspection we do not see any holes,
and we do not want users to upgrade just for this change
- the cmd checks are just to help code-checking tools
- the len checks should help programmers (maybe BUG_ON is
better; the user does not deserve EINVAL for a wrong set_arglen/get_arglen).
Checks for *len and len are not needed.

	For example, for len checks this should be enough, before
copy_from_user():

in do_ip_vs_get_ctl check can be
	BUG_ON(get_arglen[GET_CMDID(cmd)] > sizeof(arg));

in do_ip_vs_set_ctl check can be
	BUG_ON(set_arglen[SET_CMDID(cmd)] > sizeof(arg));
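
The half-open range point above can be sketched in user-space C. This is
only an illustration: the command numbers, the table contents and the
names DEMO_BASE_CTL, DEMO_SO_SET_MAX, demo_set_arglen and
demo_set_arglen_for are made up for the example and are not the ipvs
definitions. The table has one entry per command in [BASE, MAX], so
MAX + 1 must already be rejected.

```c
#include <errno.h>

/* Hypothetical command numbering, chosen only for this sketch. */
enum {
	DEMO_BASE_CTL   = 64,
	DEMO_SO_SET_ADD = DEMO_BASE_CTL + 1,
	DEMO_SO_SET_DEL = DEMO_BASE_CTL + 2,
	DEMO_SO_SET_MAX = DEMO_SO_SET_DEL,
};

#define DEMO_CMDID(cmd) ((cmd) - DEMO_BASE_CTL)

/* One entry per command in [DEMO_BASE_CTL, DEMO_SO_SET_MAX]. */
static const unsigned char demo_set_arglen[DEMO_CMDID(DEMO_SO_SET_MAX) + 1] = {
	[DEMO_CMDID(DEMO_SO_SET_ADD)] = 16,
	[DEMO_CMDID(DEMO_SO_SET_DEL)] = 8,
};

/*
 * Return the expected argument length for cmd, or -EINVAL if cmd
 * would index outside demo_set_arglen[].
 */
static int demo_set_arglen_for(int cmd)
{
	/* "cmd > MAX", not "cmd > MAX + 1": MAX + 1 is already past the end. */
	if (cmd < DEMO_BASE_CTL || cmd > DEMO_SO_SET_MAX)
		return -EINVAL;
	return demo_set_arglen[DEMO_CMDID(cmd)];
}
```

With 'cmd > MAX + 1' the value MAX + 1 would pass the check and read one
element past the table, which is the off-by-one being pointed out.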

Acked-by: Julian Anastasov <ja@ssi.bg>

> From 28ae217858e683c0c94c02219d46a9a9c87f61c6 Mon Sep 17 00:00:00 2001
> From: Arjan van de Ven <arjan@linux.intel.com>
> Date: Wed, 30 Sep 2009 13:05:51 +0200
> Subject: [PATCH] ipvs: Add boundary check on ioctl arguments
> 
> The ipvs code has a nifty system for doing the size of ioctl command copies;
> it defines an array with values into which it indexes the cmd to find the
> right length.
> 
> Unfortunately, the ipvs code forgot to check if the cmd was in the range
> that the array provides, allowing for an index outside of the array,
> which then gives a "garbage" result into the length, which then gets
> used for copying into a stack buffer.
> 
> Fix this by adding sanity checks on these as well as the copy size.
> 
> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
> ---
>  net/netfilter/ipvs/ip_vs_ctl.c |   14 +++++++++++++-
>  1 files changed, 13 insertions(+), 1 deletions(-)
> 
> diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
> index ac624e5..7adc876 100644
> --- a/net/netfilter/ipvs/ip_vs_ctl.c
> +++ b/net/netfilter/ipvs/ip_vs_ctl.c
> @@ -2077,6 +2077,10 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
>  	if (!capable(CAP_NET_ADMIN))
>  		return -EPERM;
>  
> +	if (cmd < IP_VS_BASE_CTL || cmd > IP_VS_SO_SET_MAX + 1)
> +		return -EINVAL;
> +	if (len < 0 || len >  sizeof(arg))
> +		return -EINVAL;
>  	if (len != set_arglen[SET_CMDID(cmd)]) {
>  		pr_err("set_ctl: len %u != %u\n",
>  		       len, set_arglen[SET_CMDID(cmd)]);
> @@ -2353,17 +2357,25 @@ do_ip_vs_get_ctl(struct sock *sk, int cmd, void __user *user, int *len)
>  {
>  	unsigned char arg[128];
>  	int ret = 0;
> +	unsigned int copylen;
>  
>  	if (!capable(CAP_NET_ADMIN))
>  		return -EPERM;
>  
> +	if (cmd < IP_VS_BASE_CTL || cmd > IP_VS_SO_GET_MAX + 1)
> +		return -EINVAL;
> +
>  	if (*len < get_arglen[GET_CMDID(cmd)]) {
>  		pr_err("get_ctl: len %u < %u\n",
>  		       *len, get_arglen[GET_CMDID(cmd)]);
>  		return -EINVAL;
>  	}
>  
> -	if (copy_from_user(arg, user, get_arglen[GET_CMDID(cmd)]) != 0)
> +	copylen = get_arglen[GET_CMDID(cmd)];
> +	if (copylen > sizeof(arg))
> +		return -EINVAL;
> +
> +	if (copy_from_user(arg, user, copylen) != 0)
>  		return -EFAULT;
>  
>  	if (mutex_lock_interruptible(&__ip_vs_mutex))

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: [PATCH 00/31] Swap over NFS -v20
From: Suresh Jayaraman @ 2009-10-02  8:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Andrew Morton, linux-kernel, linux-mm, netdev,
	Neil Brown, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
	trond.myklebust
In-Reply-To: <20091001174201.GA30068@infradead.org>

Christoph Hellwig wrote:
> On Thu, Oct 01, 2009 at 07:34:18PM +0530, Suresh Jayaraman wrote:
> 
> The other really big one is adding a proper method for safe, page-backed
> kernelspace I/O on files.  That is not something like the grotty
> swap-tied address_space operations in this patch, but more something in

I'm not sure I understand what problems you see with the proposed
address_space operations. Could you please elaborate?

> the direction of the kernel direct I/O patches that Jens Axboe did
> for use in the loop driver.  But even those aren't complete as they
> don't touch the locking issue yet.
> 

Thanks,

-- 
Suresh Jayaraman

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: Network hangs with 2.6.30.5
From: Ilpo Järvinen @ 2009-10-02  8:11 UTC (permalink / raw)
  To: David Miller
  Cc: jarkao2, holger.hoffstaette, Netdev, eric.dumazet,
	Evgeniy Polyakov
In-Reply-To: <20091001.154913.88345178.davem@davemloft.net>

On Thu, 1 Oct 2009, David Miller wrote:

> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Mon, 7 Sep 2009 07:21:43 +0000
> 
> > While Eric is analyzing your data, I guess you could try reverting
> > some stuff around this tcp_tw_recycle, and my tcp ignorance would
> > point these commits for the beginning:
> > 
> > http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.30.y.git;a=commitdiff;h=fc1ad92dfc4e363a055053746552cdb445ba5c57
> > http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.30.y.git;a=commitdiff;h=c887e6d2d9aee56ee7c9f2af4cec3a5efdcc4c72
> 
> Ilpo's cleanup (the second commit listed) looks most likely to
> be a possibility.
> 
> But I surely cannot find any bugs in it, even after studying it
> a few times.
> 
> Ilpo could you audit it one more time for us just in case?

Argh, not that one... the jungle of negations. I'll try to go through it
once more, but I tell you I did go through those negations multiple
times already before submitting it :-).

> I also looked through all the TCP commits in 2.6.29 to 2.6.30
> and I could not find anything else that might cause stalls with
> time-wait recycled connections.

What about the "more than 64k connections" change a9d8f9110d7e953c2f2 (or
its fixes)? It might be another possibility... It certainly does
something related to reuse and happens to be in the correct time frame...
(I've added Evgeniy).

-- 
 i.

^ permalink raw reply

* Re: [PATCH 03/31] mm: expose gfp_to_alloc_flags()
From: Suresh Jayaraman @ 2009-10-02  8:11 UTC (permalink / raw)
  To: David Rientjes
  Cc: Linus Torvalds, Andrew Morton, linux-kernel, linux-mm, netdev,
	Neil Brown, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
	trond.myklebust
In-Reply-To: <alpine.DEB.1.00.0910011355230.32006@chino.kir.corp.google.com>

David Rientjes wrote:
> On Thu, 1 Oct 2009, Suresh Jayaraman wrote:
> 
>> From: Peter Zijlstra <a.p.zijlstra@chello.nl> 
>>
>> Expose the gfp to alloc_flags mapping, so we can use it in other parts
>> of the vm.
>>
>> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
> 
> Nack, these flags are internal to the page allocator and exporting them to 
> generic VM code is unnecessary.

Yes, you're right.

> The only bit you actually use in your patchset is ALLOC_NO_WATERMARKS to 
> determine whether a particular allocation can use memory reserves.  I'd 
> suggest adding a bool function that returns whether the current context is 
> given access to reserves including your new __GFP_MEMALLOC flag and 
> exporting that instead.

Makes sense. Neil has already posted a patch with the suggested
changes; I will incorporate it.
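
A rough user-space sketch of the suggested shape, for illustration only:
the flag values and all demo_* names are invented here, and the real
check would also have to look at the current task context, which this
mock ignores. The point is that the alloc_flags mapping stays private
and only a yes/no answer is exported.

```c
#include <stdbool.h>

/* Mock flag bits for this sketch; they are not the kernel's values. */
#define DEMO_GFP_MEMALLOC        0x1u
#define DEMO_ALLOC_NO_WATERMARKS 0x4u

/* Stand-in for the allocator-internal gfp -> alloc_flags mapping. */
static unsigned int demo_gfp_to_alloc_flags(unsigned int gfp_mask)
{
	unsigned int alloc_flags = 0;

	if (gfp_mask & DEMO_GFP_MEMALLOC)
		alloc_flags |= DEMO_ALLOC_NO_WATERMARKS;
	return alloc_flags;
}

/* The only thing exported: may this allocation dip into reserves? */
bool demo_may_use_reserves(unsigned int gfp_mask)
{
	return !!(demo_gfp_to_alloc_flags(gfp_mask) & DEMO_ALLOC_NO_WATERMARKS);
}
```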

Thanks,

-- 
Suresh Jayaraman


^ permalink raw reply

* Re: SPLICE_F_NONBLOCK semantics...
From: Jens Axboe @ 2009-10-02  7:47 UTC (permalink / raw)
  To: David Miller
  Cc: torvalds, eric.dumazet, jgunthorpe, vl, opurdila, netdev,
	linux-kernel
In-Reply-To: <20091001.152717.187318570.davem@davemloft.net>

On Thu, Oct 01 2009, David Miller wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Thu, 1 Oct 2009 15:21:44 -0700 (PDT)
> 
> > On Thu, 1 Oct 2009, David Miller wrote:
> >> 
> >> It depends upon our interpretation of how you intended the
> >> SPLICE_F_NONBLOCK flag to work when you added it way back
> >> when.
> >> 
> >> Linus introduced  SPLICE_F_NONBLOCK in commit 29e350944fdc2dfca102500790d8ad6d6ff4f69d
> >> (splice: add SPLICE_F_NONBLOCK flag )
> >> 
> >>   It doesn't make the splice itself necessarily nonblocking (because the
> >>   actual file descriptors that are spliced from/to may block unless they
> >>   have the O_NONBLOCK flag set), but it makes the splice pipe operations
> >>   nonblocking.
> >> 
> >> Linus intention was clear : let SPLICE_F_NONBLOCK control the splice pipe mode only
> > 
> > Ack. The original intent was for the flag to affect the buffering, not the 
> > end points.
> 
> Great, thanks for reviewing.
> 
> > Although the more I think about it, the more I suspect that the
> > whole NONBLOCK thing should probably have been two bits, and simply
> > been about "nonblocking input" vs "nonblocking output" (so that you
> > could control both sides on a call-by-call basis).
> 
> I think we could still extend things in this way if we wanted to.
> So if you specify the explicit input and/or output nonblock flag,
> it takes precedence over the SPLICE_F_NONBLOCK thing.

Yes I agree, thank god for having a 'flags' parameter for the syscalls
:-). I'll make a note to add and test bidirectional nonblock hints.

The net patch looks fine and correct to me, feel free to add my acked-by
if you want.

-- 
Jens Axboe


^ permalink raw reply

* Re: 2.6.32-rc1-git2: Reported regressions from 2.6.31
From: Jaswinder Singh Rajput @ 2009-10-02  7:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Adrian Bunk, Andrew Morton,
	Linus Torvalds, Natalie Protasevich, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	Linux Wireless List, DRI
In-Reply-To: <9UCePxij8cB.A.VCG.-3SxKB@chimera>

Hello Rafael,

On Thu, 2009-10-01 at 21:26 +0200, Rafael J. Wysocki wrote:
> [Notes:
> 
>  * Here's the first summary report of known regressions from 2.6.31.  There's
>    not too many of them at the moment, which is nice.
> 
>  * We're still getting quite a number of reports of regressions from 2.6.30 and
>    it's been that way since 2.6.31 was released.  For details please see the
>    summary report of regressions 2.6.30 -> 2.6.31 that will follow shortly.]
> 
> This message contains a list of some regressions from 2.6.31, for which there
> are no fixes in the mainline I know of.  If any of them have been fixed already,
> please let me know.
> 
> If you know of any other unresolved regressions from 2.6.31, please let me know
> either and I'll add them to the list.  Also, please let me know if any of the
> entries below are invalid.
> 
> Each entry from the list will be sent additionally in an automatic reply to
> this message with CCs to the people involved in reporting and handling the
> issue.
> 
> 
> Listed regressions statistics:
> 
>   Date          Total  Pending  Unresolved
>   ----------------------------------------
>   2009-10-02       22       15           9
> 
> 
> Unresolved regressions
> ----------------------
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14299
> Subject		: oops in wireless, iwl3945 related?
> Submitter	: Pavel Machek <pavel@ucw.cz>
> Date		: 2009-09-29 17:12 (3 days old)
> References	: http://marc.info/?l=linux-kernel&m=125424439725743&w=4
> 

It would be great if you added one more field, say "Suspected commit :",
as it would help resolve regressions much faster. You can ask the
submitter to find the suspected commit with git bisect, and also point
them to git bisect documentation (for more information about git bisect
see http://kerneltrap.org/node/11753).

Thanks,
--
JSR

^ permalink raw reply

* Re: [RFC] pkt_sched: gen_estimator: Dont report fake rate estimators
From: Jarek Poplawski @ 2009-10-02  7:32 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, kaber, netdev
In-Reply-To: <20091002070819.GA9694@ff.dom.local>

On Fri, Oct 02, 2009 at 07:08:19AM +0000, Jarek Poplawski wrote:
> On 01-10-2009 23:21, Jarek Poplawski wrote:
...
> To make my point clare: [...]

Am I clair? ;-)

Jarek P.

^ permalink raw reply

* Re: [RFC] pkt_sched: gen_estimator: Dont report fake rate estimators
From: Jarek Poplawski @ 2009-10-02  7:17 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, kaber, netdev
In-Reply-To: <4AC5A7F9.3000005@gmail.com>

On Fri, Oct 02, 2009 at 09:12:57AM +0200, Eric Dumazet wrote:
> Jarek Poplawski wrote:
> 
> > To make my point clare: why not something like this?:
> > 
> > static int tc_fill_qdisc(struct sk_buff *skb, struct Qdisc *q, u32 clid,
> >                          u32 pid, u32 seq, u16 flags, int event)
> > {
> > 	...
> > 	if (gnet_stats_copy_basic(&d, &q->bstats) < 0 ||
> > 	    (gen_estimator_active(&q->bstats, &q->rate_est) &&
> >              gnet_stats_copy_rate_est(&d, &q->rate_est) < 0) ||
> >             gnet_stats_copy_queue(&d, &q->qstats) < 0)
> >                 goto nla_put_failure;
> > 
> > BTW, I'm not sure we need to change the user-visible API for this.
> > (Is it really expected to work after updating gen_stats.h only in
> > iproute?)
> > 
> 
> That would be better indeed. Do you want to work on it, or shall I do it?

I want you to work on it.

Thanks,
Jarek P.

^ permalink raw reply

* [net-2.6 PATCH] e1000e/igb/ixgbe: Don't report an error if devices don't support AER
From: Jeff Kirsher @ 2009-10-02  7:15 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, Frans Pop, Jeff Kirsher

From: Frans Pop <elendil@planet.nl>

The only error returned by pci_{en,dis}able_pcie_error_reporting() is
-EIO which simply means that Advanced Error Reporting is not supported.
There is no need to report that, so remove the error check from e1000e,
igb and ixgbe.

Signed-off-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/e1000e/netdev.c    |   13 ++-----------
 drivers/net/igb/igb_main.c     |   13 ++-----------
 drivers/net/ixgbe/ixgbe_main.c |   13 ++-----------
 3 files changed, 6 insertions(+), 33 deletions(-)

diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index 16c193a..0687c6a 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -4982,12 +4982,7 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
 		goto err_pci_reg;
 
 	/* AER (Advanced Error Reporting) hooks */
-	err = pci_enable_pcie_error_reporting(pdev);
-	if (err) {
-		dev_err(&pdev->dev, "pci_enable_pcie_error_reporting failed "
-		        "0x%x\n", err);
-		/* non-fatal, continue */
-	}
+	pci_enable_pcie_error_reporting(pdev);
 
 	pci_set_master(pdev);
 	/* PCI config space info */
@@ -5263,7 +5258,6 @@ static void __devexit e1000_remove(struct pci_dev *pdev)
 {
 	struct net_device *netdev = pci_get_drvdata(pdev);
 	struct e1000_adapter *adapter = netdev_priv(netdev);
-	int err;
 
 	/*
 	 * flush_scheduled work may reschedule our watchdog task, so
@@ -5299,10 +5293,7 @@ static void __devexit e1000_remove(struct pci_dev *pdev)
 	free_netdev(netdev);
 
 	/* AER disable */
-	err = pci_disable_pcie_error_reporting(pdev);
-	if (err)
-		dev_err(&pdev->dev,
-		        "pci_disable_pcie_error_reporting failed 0x%x\n", err);
+	pci_disable_pcie_error_reporting(pdev);
 
 	pci_disable_device(pdev);
 }
diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index 5d6c153..714c3a4 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -1246,12 +1246,7 @@ static int __devinit igb_probe(struct pci_dev *pdev,
 	if (err)
 		goto err_pci_reg;
 
-	err = pci_enable_pcie_error_reporting(pdev);
-	if (err) {
-		dev_err(&pdev->dev, "pci_enable_pcie_error_reporting failed "
-		        "0x%x\n", err);
-		/* non-fatal, continue */
-	}
+	pci_enable_pcie_error_reporting(pdev);
 
 	pci_set_master(pdev);
 	pci_save_state(pdev);
@@ -1628,7 +1623,6 @@ static void __devexit igb_remove(struct pci_dev *pdev)
 	struct net_device *netdev = pci_get_drvdata(pdev);
 	struct igb_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
-	int err;
 
 	/* flush_scheduled work may reschedule our watchdog task, so
 	 * explicitly disable watchdog tasks from being rescheduled  */
@@ -1682,10 +1676,7 @@ static void __devexit igb_remove(struct pci_dev *pdev)
 
 	free_netdev(netdev);
 
-	err = pci_disable_pcie_error_reporting(pdev);
-	if (err)
-		dev_err(&pdev->dev,
-		        "pci_disable_pcie_error_reporting failed 0x%x\n", err);
+	pci_disable_pcie_error_reporting(pdev);
 
 	pci_disable_device(pdev);
 }
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 1cbc6a3..28fbb9d 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -5507,12 +5507,7 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
 		goto err_pci_reg;
 	}
 
-	err = pci_enable_pcie_error_reporting(pdev);
-	if (err) {
-		dev_err(&pdev->dev, "pci_enable_pcie_error_reporting failed "
-		                    "0x%x\n", err);
-		/* non-fatal, continue */
-	}
+	pci_enable_pcie_error_reporting(pdev);
 
 	pci_set_master(pdev);
 	pci_save_state(pdev);
@@ -5821,7 +5816,6 @@ static void __devexit ixgbe_remove(struct pci_dev *pdev)
 {
 	struct net_device *netdev = pci_get_drvdata(pdev);
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
-	int err;
 
 	set_bit(__IXGBE_DOWN, &adapter->state);
 	/* clear the module not found bit to make sure the worker won't
@@ -5872,10 +5866,7 @@ static void __devexit ixgbe_remove(struct pci_dev *pdev)
 
 	free_netdev(netdev);
 
-	err = pci_disable_pcie_error_reporting(pdev);
-	if (err)
-		dev_err(&pdev->dev,
-		        "pci_disable_pcie_error_reporting failed 0x%x\n", err);
+	pci_disable_pcie_error_reporting(pdev);
 
 	pci_disable_device(pdev);
 }


^ permalink raw reply related

* Re: [RFC] pkt_sched: gen_estimator: Dont report fake rate estimators
From: Eric Dumazet @ 2009-10-02  7:12 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: David Miller, kaber, netdev
In-Reply-To: <20091002070819.GA9694@ff.dom.local>

Jarek Poplawski wrote:

> To make my point clare: why not something like this?:
> 
> static int tc_fill_qdisc(struct sk_buff *skb, struct Qdisc *q, u32 clid,
>                          u32 pid, u32 seq, u16 flags, int event)
> {
> 	...
> 	if (gnet_stats_copy_basic(&d, &q->bstats) < 0 ||
> 	    (gen_estimator_active(&q->bstats, &q->rate_est) &&
>              gnet_stats_copy_rate_est(&d, &q->rate_est) < 0) ||
>             gnet_stats_copy_queue(&d, &q->qstats) < 0)
>                 goto nla_put_failure;
> 
> BTW, I'm not sure we need to change the user-visible API for this.
> (Is it really expected to work after updating gen_stats.h only in
> iproute?)
> 

That would be better indeed. Do you want to work on it, or shall I do it?

Thanks

^ permalink raw reply

* Re: [RFC] pkt_sched: gen_estimator: Dont report fake rate estimators
From: Jarek Poplawski @ 2009-10-02  7:08 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, kaber, netdev
In-Reply-To: <4AC51D3D.8010700@gmail.com>

On 01-10-2009 23:21, Jarek Poplawski wrote:
> David Miller wrote, On 10/01/2009 11:14 PM:
> 
>> From: Jarek Poplawski <jarkao2@gmail.com>
>> Date: Thu, 01 Oct 2009 23:05:53 +0200
>>
>>> Since you ask... I wonder about this whole int plus quite a bit of
>>> struct unreadability for one flag only. Maybe it could be queried
>>> on qdisc level (with a flag if necessary), and additional parameter
>>> of gnet_stats_copy_rate_est()? (Qdiscs should have no problem with
>>> setting this param for their classes too.)
>> Certainly, that's another approach to this problem.
>>
>> But logically, just like we wouldn't emit a block of RED scheduler
>> data to 'tc' unless RED is actually configured, it seems consistent to
>> not emit estimator data when no estimator is even there.
> 
> Sure! I've exaggerated with this additional parameter. ;-)

To make my point clare: why not something like this?:

static int tc_fill_qdisc(struct sk_buff *skb, struct Qdisc *q, u32 clid,
                         u32 pid, u32 seq, u16 flags, int event)
{
	...
	if (gnet_stats_copy_basic(&d, &q->bstats) < 0 ||
	    (gen_estimator_active(&q->bstats, &q->rate_est) &&
	     gnet_stats_copy_rate_est(&d, &q->rate_est) < 0) ||
	    gnet_stats_copy_queue(&d, &q->qstats) < 0)
		goto nla_put_failure;

BTW, I'm not sure we need to change the user-visible API for this.
(Is it really expected to work after updating gen_stats.h only in
iproute?)
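
For readers following along, here is a minimal userspace sketch of the short-circuit pattern proposed above. Everything in it (struct mock_dump, copy_block(), copy_rate_est(), mock_fill_stats()) is a hypothetical stand-in and not kernel code; it only illustrates that the rate estimator block is skipped entirely when no estimator is active, while a failed copy still aborts the dump.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the dump context; not a kernel structure. */
struct mock_dump {
	int blocks;            /* number of stats blocks emitted so far */
	int space;             /* remaining room, counted in blocks */
	bool rate_est_emitted; /* set once the rate estimator block is written */
};

/* Emit one generic block; returns 0 on success, -1 if out of space. */
static int copy_block(struct mock_dump *d)
{
	if (d->space == 0)
		return -1;
	d->space--;
	d->blocks++;
	return 0;
}

/* Emit the rate estimator block and remember that we did so. */
static int copy_rate_est(struct mock_dump *d)
{
	if (copy_block(d) < 0)
		return -1;
	d->rate_est_emitted = true;
	return 0;
}

/*
 * Same shape as the condition in the proposal: basic stats, then the
 * rate estimator only if one is active, then queue stats. Thanks to the
 * short-circuit &&, copy_rate_est() is never called for an inactive
 * estimator, so nothing fake is reported.
 */
static int mock_fill_stats(struct mock_dump *d, bool estimator_active)
{
	if (copy_block(d) < 0 ||
	    (estimator_active && copy_rate_est(d) < 0) ||
	    copy_block(d) < 0)
		return -1;
	return 0;
}
```

With an inactive estimator the sketch emits two blocks (basic and queue); with an active one it emits three.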

Jarek P.

^ permalink raw reply

* Re: [Question]: reqsk table size limited to 16?
From: Eric Dumazet @ 2009-10-02  6:50 UTC (permalink / raw)
  To: Gerrit Renker, netdev
In-Reply-To: <20091002062532.GA15755@gerrit.erg.abdn.ac.uk>

Gerrit Renker wrote:
> Please forget the posting, this is correct; the clamping is
> 
>   8 <= nr_table_entries <=  sysctl_max_syn_backlog,
> 
> i.e. the minimum table size is 16.
>

Yes, agreed, 8+1 -> 16
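
The "8+1 -> 16" arithmetic can be checked with a small userspace sketch. The helpers roundup_pow2() and max_qlen_log_for() are hypothetical stand-ins written for this illustration (assuming the kernel's roundup_pow_of_two() rounds up to the next power of two); the second one mirrors the max_qlen_log loop from the quoted reqsk_queue_alloc().

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for roundup_pow_of_two(); n must be in [1, 2^31]. */
static uint32_t roundup_pow2(uint32_t n)
{
	uint32_t p = 1;

	assert(n >= 1 && n <= (1u << 31));
	while (p < n)
		p <<= 1;
	return p;
}

/* Smallest log >= 3 such that (1 << log) >= nr_table_entries. */
static unsigned int max_qlen_log_for(uint32_t nr_table_entries)
{
	unsigned int log = 3;

	while ((1u << log) < nr_table_entries)
		log++;
	return log;
}
```

For the lower bound of 8 entries, roundup_pow2(8 + 1) gives 16 and max_qlen_log_for(16) gives 4.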


^ permalink raw reply

* Re: [Question]: reqsk table size limited to 16?
From: Eric Dumazet @ 2009-10-02  6:49 UTC (permalink / raw)
  To: Gerrit Renker, netdev
In-Reply-To: <20091002061134.GC5646@gerrit.erg.abdn.ac.uk>

Gerrit Renker wrote:
> Can someone please have a look, it may be that I am missing something?
> 
> It seems that in the following the maximum number of table entries is set
> to always 16, despite sysctl_max_syn_backlog (tcp_max_syn_backlog), 
> overriding the 'backlog' parameter to listen(2).

False alarm ;)

> 
> net/core/request_sock.c
> -----------------------
> 
> int reqsk_queue_alloc(struct request_sock_queue *queue,
>                       unsigned int nr_table_entries)
> {
>         size_t lopt_size = sizeof(struct listen_sock);
>         struct listen_sock *lopt;
> 
> 	nr_table_entries = min_t(u32, nr_table_entries, sysctl_max_syn_backlog);

Here we take the _minimum_ value.
If you have  nr_table_entries=4096 and sysctl_max_syn_backlog=1024,
result is 1024

>         nr_table_entries = max_t(u32, nr_table_entries, 8);

Here we take the _maximum_ value of nr_table_entries and 8

-> 1024

The deal is: we want at least 8 slots, even if the user called listen(fd, 1).

(Later, the user can change their mind and call listen(fd, 1024).)

We don't resize the hash table yet, so we guarantee at least 8 slots for pathological cases.
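
As a quick illustration of the two clamping steps, here is a userspace sketch. The helper name clamp_table_entries() is hypothetical; it simply applies min_t() and then max_t() in the same order as the quoted code, using plain comparisons.

```c
#include <stdint.h>

/*
 * Hypothetical helper: cap the listen(2) backlog at max_syn_backlog
 * (the min_t step), then raise it to at least 8 (the max_t step).
 */
static uint32_t clamp_table_entries(uint32_t backlog, uint32_t max_syn_backlog)
{
	uint32_t n = backlog < max_syn_backlog ? backlog : max_syn_backlog;

	return n > 8 ? n : 8;
}
```

With max_syn_backlog = 1024, a backlog of 4096 is capped to 1024, a backlog of 1 is raised to 8, and a backlog of 100 passes through unchanged.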

>         nr_table_entries = roundup_pow_of_two(nr_table_entries + 1);
> 
> 	//...
> 	for (lopt->max_qlen_log = 3;
>              (1 << lopt->max_qlen_log) < nr_table_entries;
>              lopt->max_qlen_log++);
> 
>  	//...
> 	lopt->nr_table_entries = nr_table_entries;
> 	
> 	//...
> 	return 0;
> }
> 
> The function is called with an argument 'nr_table_entries', which is then clamped as
> 
>    sysctl_max_syn_backlog <= nr_table_entries <= 8
> 
> If nr_table_entries = 8, then roundup_pow_of_two(8 + 1) = 16.
> 
> The sysctl value is set to a much higher value (default 128 or 1024, net/ipv4/tcp.c).
> 
> The reqsk_queue_alloc() gets 'nr_table_entries' passed directly from inet_csk_listen_start(),
> which in turn gets its 'nr_table_entries' as the 'backlog' argument to listen(2) via
>  * net/dccp/proto.c   (dccp_listen_start) or
>  * net/ipv4/af_inet.c (inet_listen).


^ permalink raw reply

