* [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages
@ 2013-09-03 11:37 Thomas Graf
2013-09-03 11:56 ` Hannes Frederic Sowa
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Thomas Graf @ 2013-09-03 11:37 UTC (permalink / raw)
To: davem
Cc: netdev, Eric Dumazet, Hannes Frederic Sowa, Stephen Warren,
Fabio Estevam
Allocating skbs when sending out neighbour discovery messages
currently uses sock_alloc_send_skb() based on a per net namespace
socket and thus share a socket wmem buffer space.
If a netdevice is temporarily unable to transmit due to carrier
loss or for other reasons, the queued up ndisc messages will cosnume
all of the wmem space and will thus prevent from any more skbs to
be allocated even for netdevices that are able to transmit packets.
The number of neighbour discovery messages sent is very limited,
use of alloc_skb() bypasses the socket wmem buffer size enforcement
while the manual call to skb_set_owner_w() maintains the socket
reference needed for the IPv6 output path.
This patch has orginally been posted by Eric Dumazet in a modified
form.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Fabio Estevam <festevam@gmail.com>
---
net/ipv6/ndisc.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
v2: Maintain socket reference to account for the IPv6 output path.
Problem reported by Stephen Warren <swarren@wwwdotorg.org> and
Fabio Estevam <festevam@gmail.com>.
Tested on x86_64
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 04d31c2..78141fa 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -372,14 +372,11 @@ static struct sk_buff *ndisc_alloc_skb(struct net_device *dev,
int tlen = dev->needed_tailroom;
struct sock *sk = dev_net(dev)->ipv6.ndisc_sk;
struct sk_buff *skb;
- int err;
- skb = sock_alloc_send_skb(sk,
- hlen + sizeof(struct ipv6hdr) + len + tlen,
- 1, &err);
+ skb = alloc_skb(hlen + sizeof(struct ipv6hdr) + len + tlen, GFP_ATOMIC);
if (!skb) {
- ND_PRINTK(0, err, "ndisc: %s failed to allocate an skb, err=%d\n",
- __func__, err);
+ ND_PRINTK(0, err, "ndisc: %s failed to allocate an skb\n",
+ __func__);
return NULL;
}
@@ -389,6 +386,11 @@ static struct sk_buff *ndisc_alloc_skb(struct net_device *dev,
skb_reserve(skb, hlen + sizeof(struct ipv6hdr));
skb_reset_transport_header(skb);
+ /* Manually assign socket ownership as we avoid calling
+ * sock_alloc_send_pskb() to bypass wmem buffer limits
+ */
+ skb_set_owner_w(skb, sk);
+
return skb;
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages
2013-09-03 11:37 [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages Thomas Graf
@ 2013-09-03 11:56 ` Hannes Frederic Sowa
2013-09-03 12:11 ` Thomas Graf
2013-09-03 12:18 ` Fabio Estevam
` (2 subsequent siblings)
3 siblings, 1 reply; 13+ messages in thread
From: Hannes Frederic Sowa @ 2013-09-03 11:56 UTC (permalink / raw)
To: Thomas Graf; +Cc: davem, netdev, Eric Dumazet, Stephen Warren, Fabio Estevam
Hi!
On Tue, Sep 03, 2013 at 01:37:01PM +0200, Thomas Graf wrote:
> @@ -389,6 +386,11 @@ static struct sk_buff *ndisc_alloc_skb(struct net_device *dev,
> skb_reserve(skb, hlen + sizeof(struct ipv6hdr));
> skb_reset_transport_header(skb);
>
> + /* Manually assign socket ownership as we avoid calling
> + * sock_alloc_send_pskb() to bypass wmem buffer limits
> + */
> + skb_set_owner_w(skb, sk);
> +
> return skb;
> }
Do you know why this is needed? From the report it seemed to me that we might
have a deadlock on idev->lock and I couldn't find the culprit.
When I tested your change on x86_64 I did not experience this.
Maybe someone with arm could try this patch with CONFIG_PROVE_LOCKING?
Thanks,
Hannes
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages
2013-09-03 11:56 ` Hannes Frederic Sowa
@ 2013-09-03 12:11 ` Thomas Graf
0 siblings, 0 replies; 13+ messages in thread
From: Thomas Graf @ 2013-09-03 12:11 UTC (permalink / raw)
To: davem, netdev, Eric Dumazet, Stephen Warren, Fabio Estevam
On 09/03/13 at 01:56pm, Hannes Frederic Sowa wrote:
> Hi!
>
> On Tue, Sep 03, 2013 at 01:37:01PM +0200, Thomas Graf wrote:
> > @@ -389,6 +386,11 @@ static struct sk_buff *ndisc_alloc_skb(struct net_device *dev,
> > skb_reserve(skb, hlen + sizeof(struct ipv6hdr));
> > skb_reset_transport_header(skb);
> >
> > + /* Manually assign socket ownership as we avoid calling
> > + * sock_alloc_send_pskb() to bypass wmem buffer limits
> > + */
> > + skb_set_owner_w(skb, sk);
> > +
> > return skb;
> > }
>
> Do you know why this is needed? From the report it seemed to me that we might
> have a deadlock on idev->lock and I couldn't find the culprit.
>
> When I tested your change on x86_64 I did not experience this.
I also didn't see any problems running v1 of the patch which
is confusing as the IPv6 output path assumes a socket reference
in various places as Dave pointed out correctly.
I don't see why the problem would be limited to ARM.
> Maybe someone with arm could try this patch with CONFIG_PROVE_LOCKING?
I would certainly welcome that.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages
2013-09-03 11:37 [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages Thomas Graf
2013-09-03 11:56 ` Hannes Frederic Sowa
@ 2013-09-03 12:18 ` Fabio Estevam
2013-09-03 17:19 ` Stephen Warren
2013-09-04 18:41 ` David Miller
3 siblings, 0 replies; 13+ messages in thread
From: Fabio Estevam @ 2013-09-03 12:18 UTC (permalink / raw)
To: Thomas Graf
Cc: David S. Miller, netdev@vger.kernel.org, Eric Dumazet,
Hannes Frederic Sowa, Stephen Warren
On Tue, Sep 3, 2013 at 8:37 AM, Thomas Graf <tgraf@suug.ch> wrote:
> Allocating skbs when sending out neighbour discovery messages
> currently uses sock_alloc_send_skb() based on a per net namespace
> socket and thus share a socket wmem buffer space.
>
> If a netdevice is temporarily unable to transmit due to carrier
> loss or for other reasons, the queued up ndisc messages will cosnume
> all of the wmem space and will thus prevent from any more skbs to
> be allocated even for netdevices that are able to transmit packets.
>
> The number of neighbour discovery messages sent is very limited,
> use of alloc_skb() bypasses the socket wmem buffer size enforcement
> while the manual call to skb_set_owner_w() maintains the socket
> reference needed for the IPv6 output path.
>
> This patch has orginally been posted by Eric Dumazet in a modified
> form.
>
> Signed-off-by: Thomas Graf <tgraf@suug.ch>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Cc: Stephen Warren <swarren@wwwdotorg.org>
> Cc: Fabio Estevam <festevam@gmail.com>
I don't get the system hang that I used to see with the previous
version on my ARM board, so
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages
2013-09-03 11:37 [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages Thomas Graf
2013-09-03 11:56 ` Hannes Frederic Sowa
2013-09-03 12:18 ` Fabio Estevam
@ 2013-09-03 17:19 ` Stephen Warren
2013-09-03 17:27 ` Hannes Frederic Sowa
2013-09-04 18:41 ` David Miller
3 siblings, 1 reply; 13+ messages in thread
From: Stephen Warren @ 2013-09-03 17:19 UTC (permalink / raw)
To: Thomas Graf
Cc: davem, netdev, Eric Dumazet, Hannes Frederic Sowa, Fabio Estevam
On 09/03/2013 05:37 AM, Thomas Graf wrote:
> Allocating skbs when sending out neighbour discovery messages
> currently uses sock_alloc_send_skb() based on a per net namespace
> socket and thus share a socket wmem buffer space.
>
> If a netdevice is temporarily unable to transmit due to carrier
> loss or for other reasons, the queued up ndisc messages will cosnume
> all of the wmem space and will thus prevent from any more skbs to
> be allocated even for netdevices that are able to transmit packets.
>
> The number of neighbour discovery messages sent is very limited,
> use of alloc_skb() bypasses the socket wmem buffer size enforcement
> while the manual call to skb_set_owner_w() maintains the socket
> reference needed for the IPv6 output path.
>
> This patch has orginally been posted by Eric Dumazet in a modified
> form.
Tested-by: Stephen Warren <swarren@nvidia.com>
Although I do note something slightly odd:
next-20130830 had an issue, and reverting V1 of this patch solved it.
However, in next-20130903, if I revert the revert of V1 of this patch, I
don't see any issue; it appears that the problem was some interaction
between V1 of this patch and something else in next-20130830.
Either way, this patch doesn't seem to introduce any issue when applied
on top of either next-20130830 with V1 reverted, or on top of
next-20130903, so it's fine.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages
2013-09-03 17:19 ` Stephen Warren
@ 2013-09-03 17:27 ` Hannes Frederic Sowa
2013-09-03 17:42 ` Stephen Warren
0 siblings, 1 reply; 13+ messages in thread
From: Hannes Frederic Sowa @ 2013-09-03 17:27 UTC (permalink / raw)
To: Stephen Warren; +Cc: Thomas Graf, davem, netdev, Eric Dumazet, Fabio Estevam
On Tue, Sep 03, 2013 at 11:19:14AM -0600, Stephen Warren wrote:
> On 09/03/2013 05:37 AM, Thomas Graf wrote:
> > Allocating skbs when sending out neighbour discovery messages
> > currently uses sock_alloc_send_skb() based on a per net namespace
> > socket and thus share a socket wmem buffer space.
> >
> > If a netdevice is temporarily unable to transmit due to carrier
> > loss or for other reasons, the queued up ndisc messages will cosnume
> > all of the wmem space and will thus prevent from any more skbs to
> > be allocated even for netdevices that are able to transmit packets.
> >
> > The number of neighbour discovery messages sent is very limited,
> > use of alloc_skb() bypasses the socket wmem buffer size enforcement
> > while the manual call to skb_set_owner_w() maintains the socket
> > reference needed for the IPv6 output path.
> >
> > This patch has orginally been posted by Eric Dumazet in a modified
> > form.
>
> Tested-by: Stephen Warren <swarren@nvidia.com>
>
> Although I do note something slightly odd:
>
> next-20130830 had an issue, and reverting V1 of this patch solved it.
>
> However, in next-20130903, if I revert the revert of V1 of this patch, I
> don't see any issue; it appears that the problem was some interaction
> between V1 of this patch and something else in next-20130830.
>
> Either way, this patch doesn't seem to introduce any issue when applied
> on top of either next-20130830 with V1 reverted, or on top of
> next-20130903, so it's fine.
Could either of you run the v1 version of the patch with CONFIG_PROVE_LOCKING
enabled? I also do think there is some side-effect we don't understand yet.
Thanks,
Hannes
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages
2013-09-03 17:27 ` Hannes Frederic Sowa
@ 2013-09-03 17:42 ` Stephen Warren
2013-09-03 17:51 ` Eric Dumazet
0 siblings, 1 reply; 13+ messages in thread
From: Stephen Warren @ 2013-09-03 17:42 UTC (permalink / raw)
To: Thomas Graf, davem, netdev, Eric Dumazet, Fabio Estevam
On 09/03/2013 11:27 AM, Hannes Frederic Sowa wrote:
> On Tue, Sep 03, 2013 at 11:19:14AM -0600, Stephen Warren wrote:
>> On 09/03/2013 05:37 AM, Thomas Graf wrote:
>>> Allocating skbs when sending out neighbour discovery messages
>>> currently uses sock_alloc_send_skb() based on a per net namespace
>>> socket and thus share a socket wmem buffer space.
>>>
>>> If a netdevice is temporarily unable to transmit due to carrier
>>> loss or for other reasons, the queued up ndisc messages will cosnume
>>> all of the wmem space and will thus prevent from any more skbs to
>>> be allocated even for netdevices that are able to transmit packets.
>>>
>>> The number of neighbour discovery messages sent is very limited,
>>> use of alloc_skb() bypasses the socket wmem buffer size enforcement
>>> while the manual call to skb_set_owner_w() maintains the socket
>>> reference needed for the IPv6 output path.
>>>
>>> This patch has orginally been posted by Eric Dumazet in a modified
>>> form.
>>
>> Tested-by: Stephen Warren <swarren@nvidia.com>
>>
>> Although I do note something slightly odd:
>>
>> next-20130830 had an issue, and reverting V1 of this patch solved it.
>>
>> However, in next-20130903, if I revert the revert of V1 of this patch, I
>> don't see any issue; it appears that the problem was some interaction
>> between V1 of this patch and something else in next-20130830.
>>
>> Either way, this patch doesn't seem to introduce any issue when applied
>> on top of either next-20130830 with V1 reverted, or on top of
>> next-20130903, so it's fine.
>
> Could either of you run the v1 version of the patch with CONFIG_PROVE_LOCKING
> enabled? I also do think there is some side-effect we don't understand yet.
I don't see any extra messages from PROVE_LOCKING related to networking.
There is a single extra message from inside the audio driver, but that's
not networking-related at all.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages
2013-09-03 17:42 ` Stephen Warren
@ 2013-09-03 17:51 ` Eric Dumazet
2013-09-03 18:03 ` Stephen Warren
0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2013-09-03 17:51 UTC (permalink / raw)
To: Stephen Warren; +Cc: Thomas Graf, davem, netdev, Fabio Estevam
On Tue, 2013-09-03 at 11:42 -0600, Stephen Warren wrote:
> I don't see any extra messages from PROVE_LOCKING related to networking.
> There is a single extra message from inside the audio driver, but that's
> not networking-related at all.
LOCKDEP is automatically disabled at first splat.
Please try a kernel without audio driver ;)
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages
2013-09-03 17:51 ` Eric Dumazet
@ 2013-09-03 18:03 ` Stephen Warren
2013-09-03 18:23 ` Thomas Graf
0 siblings, 1 reply; 13+ messages in thread
From: Stephen Warren @ 2013-09-03 18:03 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Thomas Graf, davem, netdev, Fabio Estevam
On 09/03/2013 11:51 AM, Eric Dumazet wrote:
> On Tue, 2013-09-03 at 11:42 -0600, Stephen Warren wrote:
>
>> I don't see any extra messages from PROVE_LOCKING related to networking.
>> There is a single extra message from inside the audio driver, but that's
>> not networking-related at all.
>
> LOCKDEP is automatically disabled at first splat.
>
> Please try a kernel without audio driver ;)
Ah, OK. Now I do see something from ipv6:
> [ 25.327622]
> [ 25.329142] =============================================
> [ 25.334533] [ INFO: possible recursive locking detected ]
> [ 25.339927] 3.11.0-rc7-next-20130830-00024-g209b4d8-dirty #17 Not tainted
> [ 25.346705] ---------------------------------------------
> [ 25.352095] login/704 is trying to acquire lock:
> [ 25.356705] (&ndev->lock){++--..}, at: [<c049be24>] ipv6_chk_mcast_addr+0x5c/0x200
> [ 25.364405]
> [ 25.364405] but task is already holding lock:
> [ 25.370230] (&ndev->lock){++--..}, at: [<c0480eb4>] addrconf_rs_timer+0x18/0x134
> [ 25.377741]
> [ 25.377741] other info that might help us debug this:
> [ 25.384261] Possible unsafe locking scenario:
> [ 25.384261]
> [ 25.390174] CPU0
> [ 25.392614] ----
> [ 25.395053] lock(&ndev->lock);
> [ 25.398286] lock(&ndev->lock);
> [ 25.401518]
> [ 25.401518] *** DEADLOCK ***
> [ 25.401518]
> [ 25.407433] May be due to missing lock nesting notation
> [ 25.407433]
> [ 25.414214] 4 locks held by login/704:
> [ 25.417955] #0: (((&ndev->rs_timer))){+.-...}, at: [<c002f8a8>] call_timer_fn+0x0/0xe4
> [ 25.426076] #1: (&ndev->lock){++--..}, at: [<c0480eb4>] addrconf_rs_timer+0x18/0x134
> [ 25.434018] #2: (rcu_read_lock){.+.+..}, at: [<c048e104>] ndisc_send_skb+0x58/0x3dc
> [ 25.441876] #3: (rcu_read_lock){.+.+..}, at: [<c049bdc8>] ipv6_chk_mcast_addr+0x0/0x200
> [ 25.450079]
> [ 25.450079] stack backtrace:
> [ 25.454436] CPU: 0 PID: 704 Comm: login Not tainted 3.11.0-rc7-next-20130830-00024-g209b4d8-dirty #17
> [ 25.463674] [<c0016958>] (unwind_backtrace+0x0/0x138) from [<c00125fc>] (show_stack+0x10/0x14)
> [ 25.472297] [<c00125fc>] (show_stack+0x10/0x14) from [<c0567808>] (dump_stack+0x80/0xc4)
> [ 25.480394] [<c0567808>] (dump_stack+0x80/0xc4) from [<c0074af8>] (validate_chain.isra.24+0x74c/0x8c8)
> [ 25.489696] [<c0074af8>] (validate_chain.isra.24+0x74c/0x8c8) from [<c00757b0>] (__lock_acquire+0x3bc/0xa80)
> [ 25.499518] [<c00757b0>] (__lock_acquire+0x3bc/0xa80) from [<c007632c>] (lock_acquire+0x60/0x74)
> [ 25.508305] [<c007632c>] (lock_acquire+0x60/0x74) from [<c056e998>] (_raw_read_lock_bh+0x4c/0x5c)
> [ 25.517175] [<c056e998>] (_raw_read_lock_bh+0x4c/0x5c) from [<c049be24>] (ipv6_chk_mcast_addr+0x5c/0x200)
> [ 25.526740] [<c049be24>] (ipv6_chk_mcast_addr+0x5c/0x200) from [<c04799ac>] (ip6_finish_output2+0x298/0x62c)
> [ 25.536563] [<c04799ac>] (ip6_finish_output2+0x298/0x62c) from [<c048e2b0>] (ndisc_send_skb+0x204/0x3dc)
> [ 25.546037] [<c048e2b0>] (ndisc_send_skb+0x204/0x3dc) from [<c0480fa4>] (addrconf_rs_timer+0x108/0x134)
> [ 25.555425] [<c0480fa4>] (addrconf_rs_timer+0x108/0x134) from [<c002f914>] (call_timer_fn+0x6c/0xe4)
> [ 25.564553] [<c002f914>] (call_timer_fn+0x6c/0xe4) from [<c0030220>] (run_timer_softirq+0x180/0x22c)
> [ 25.573691] [<c0030220>] (run_timer_softirq+0x180/0x22c) from [<c002a210>] (__do_softirq+0x100/0x1fc)
> [ 25.582905] [<c002a210>] (__do_softirq+0x100/0x1fc) from [<c002a6d8>] (irq_exit+0xa8/0xf4)
> [ 25.591166] [<c002a6d8>] (irq_exit+0xa8/0xf4) from [<c000f9b8>] (handle_IRQ+0x3c/0x94)
> [ 25.599078] [<c000f9b8>] (handle_IRQ+0x3c/0x94) from [<c0008700>] (gic_handle_irq+0x28/0x5c)
> [ 25.607511] [<c0008700>] (gic_handle_irq+0x28/0x5c) from [<c00132fc>] (__irq_usr+0x3c/0x60)
> [ 25.615855] Exception stack(0xee0d9fb0 to 0xee0d9ff8)
> [ 25.620901] 9fa0: 5c026e4c c9ce71d5 538e8b11 f2a43512
> [ 25.629072] 9fc0: 49f97e42 7a179ff1 9c755700 9b933273 8b7a2401 6d6222ee 7b8908fd 1df2f704
> [ 25.637243] 9fe0: 2664e738 bec3ef10 b6d440e9 b6d43d58 900f0030 ffffffff
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages
2013-09-03 18:03 ` Stephen Warren
@ 2013-09-03 18:23 ` Thomas Graf
2013-09-03 18:30 ` Hannes Frederic Sowa
2013-09-03 22:43 ` Hannes Frederic Sowa
0 siblings, 2 replies; 13+ messages in thread
From: Thomas Graf @ 2013-09-03 18:23 UTC (permalink / raw)
To: Stephen Warren; +Cc: Eric Dumazet, davem, netdev, Fabio Estevam
On 09/03/13 at 12:03pm, Stephen Warren wrote:
> On 09/03/2013 11:51 AM, Eric Dumazet wrote:
> > On Tue, 2013-09-03 at 11:42 -0600, Stephen Warren wrote:
> >
> >> I don't see any extra messages from PROVE_LOCKING related to networking.
> >> There is a single extra message from inside the audio driver, but that's
> >> not networking-related at all.
> >
> > LOCKDEP is automatically disabled at first splat.
> >
> > Please try a kernel without audio driver ;)
>
> Ah, OK. Now I do see something from ipv6:
>
> > [ 25.327622]
> > [ 25.329142] =============================================
> > [ 25.334533] [ INFO: possible recursive locking detected ]
> > [ 25.339927] 3.11.0-rc7-next-20130830-00024-g209b4d8-dirty #17 Not tainted
> > [ 25.346705] ---------------------------------------------
> > [ 25.352095] login/704 is trying to acquire lock:
> > [ 25.356705] (&ndev->lock){++--..}, at: [<c049be24>] ipv6_chk_mcast_addr+0x5c/0x200
> > [ 25.364405]
> > [ 25.364405] but task is already holding lock:
> > [ 25.370230] (&ndev->lock){++--..}, at: [<c0480eb4>] addrconf_rs_timer+0x18/0x134
Real deadlock, we should not hold idev->lock for ndisc_send_rs(), we
should drop the lock beforehand I guess. We also don't hold idev->lock
if we send out the RS via addrconf_dad_completed().
I'm confused why lockdep would only trigger after my patch, the
deadlock is unrelated.
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages
2013-09-03 18:23 ` Thomas Graf
@ 2013-09-03 18:30 ` Hannes Frederic Sowa
2013-09-03 22:43 ` Hannes Frederic Sowa
1 sibling, 0 replies; 13+ messages in thread
From: Hannes Frederic Sowa @ 2013-09-03 18:30 UTC (permalink / raw)
To: Thomas Graf; +Cc: Stephen Warren, Eric Dumazet, davem, netdev, Fabio Estevam
On Tue, Sep 03, 2013 at 07:23:48PM +0100, Thomas Graf wrote:
> On 09/03/13 at 12:03pm, Stephen Warren wrote:
> > On 09/03/2013 11:51 AM, Eric Dumazet wrote:
> > > On Tue, 2013-09-03 at 11:42 -0600, Stephen Warren wrote:
> > >
> > >> I don't see any extra messages from PROVE_LOCKING related to networking.
> > >> There is a single extra message from inside the audio driver, but that's
> > >> not networking-related at all.
> > >
> > > LOCKDEP is automatically disabled at first splat.
> > >
> > > Please try a kernel without audio driver ;)
> >
> > Ah, OK. Now I do see something from ipv6:
> >
> > > [ 25.327622]
> > > [ 25.329142] =============================================
> > > [ 25.334533] [ INFO: possible recursive locking detected ]
> > > [ 25.339927] 3.11.0-rc7-next-20130830-00024-g209b4d8-dirty #17 Not tainted
> > > [ 25.346705] ---------------------------------------------
> > > [ 25.352095] login/704 is trying to acquire lock:
> > > [ 25.356705] (&ndev->lock){++--..}, at: [<c049be24>] ipv6_chk_mcast_addr+0x5c/0x200
> > > [ 25.364405]
> > > [ 25.364405] but task is already holding lock:
> > > [ 25.370230] (&ndev->lock){++--..}, at: [<c0480eb4>] addrconf_rs_timer+0x18/0x134
>
> Real deadlock, we should not hold idev->lock for ndisc_send_rs(), we
> should drop the lock beforehand I guess. We also don't hold idev->lock
> if we send out the RS via addrconf_dad_completed().
Yes, this already happend by the series to implement ipv6 for vxlan:
caf92bc4007036cfac8ee06c845fdfe6496e4fb3 ("ipv6: do not call ndisc_send_rs()
with write lock").
> I'm confused why lockdep would only trigger after my patch, the
> deadlock is unrelated.
Reordering by the compiler in the short-curcuit evaluation in
ip6_finish_output2 maybe (sk_mc_loop)?
Greetings,
Hannes
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages
2013-09-03 18:23 ` Thomas Graf
2013-09-03 18:30 ` Hannes Frederic Sowa
@ 2013-09-03 22:43 ` Hannes Frederic Sowa
1 sibling, 0 replies; 13+ messages in thread
From: Hannes Frederic Sowa @ 2013-09-03 22:43 UTC (permalink / raw)
To: Thomas Graf; +Cc: Stephen Warren, Eric Dumazet, davem, netdev, Fabio Estevam
On Tue, Sep 03, 2013 at 07:23:48PM +0100, Thomas Graf wrote:
> I'm confused why lockdep would only trigger after my patch, the
> deadlock is unrelated.
Either we had patch caf92bc ("ipv6: do not call ndisc_send_rs() with write
lock") already applied or both our testing was flawed. I reverted caf92bc and
tested v1 again and could also reproduce this problem.
Because of the sk_mc_loop logic I do think it is better to set skb->sk, so:
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
If this should still go to stable, the above mentioned commit should also hit
the queue.
Greetings,
Hannes
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages
2013-09-03 11:37 [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages Thomas Graf
` (2 preceding siblings ...)
2013-09-03 17:19 ` Stephen Warren
@ 2013-09-04 18:41 ` David Miller
3 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2013-09-04 18:41 UTC (permalink / raw)
To: tgraf; +Cc: netdev, eric.dumazet, hannes, swarren, festevam
From: Thomas Graf <tgraf@suug.ch>
Date: Tue, 3 Sep 2013 13:37:01 +0200
> Allocating skbs when sending out neighbour discovery messages
> currently uses sock_alloc_send_skb() based on a per net namespace
> socket and thus share a socket wmem buffer space.
>
> If a netdevice is temporarily unable to transmit due to carrier
> loss or for other reasons, the queued up ndisc messages will cosnume
> all of the wmem space and will thus prevent from any more skbs to
> be allocated even for netdevices that are able to transmit packets.
>
> The number of neighbour discovery messages sent is very limited,
> use of alloc_skb() bypasses the socket wmem buffer size enforcement
> while the manual call to skb_set_owner_w() maintains the socket
> reference needed for the IPv6 output path.
>
> This patch has orginally been posted by Eric Dumazet in a modified
> form.
>
> Signed-off-by: Thomas Graf <tgraf@suug.ch>
Applied, and queued up for -stable along with Cong Wang's ndisc_send_rs()
locking change.
Thanks.
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2013-09-04 18:41 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-09-03 11:37 [PATCH v2] ipv6: Don't depend on per socket memory for neighbour discovery messages Thomas Graf
2013-09-03 11:56 ` Hannes Frederic Sowa
2013-09-03 12:11 ` Thomas Graf
2013-09-03 12:18 ` Fabio Estevam
2013-09-03 17:19 ` Stephen Warren
2013-09-03 17:27 ` Hannes Frederic Sowa
2013-09-03 17:42 ` Stephen Warren
2013-09-03 17:51 ` Eric Dumazet
2013-09-03 18:03 ` Stephen Warren
2013-09-03 18:23 ` Thomas Graf
2013-09-03 18:30 ` Hannes Frederic Sowa
2013-09-03 22:43 ` Hannes Frederic Sowa
2013-09-04 18:41 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox