* [patch 10/15] ppp_generic: fix lockdep warning
@ 2007-04-26 7:27 akpm
2007-04-26 8:39 ` David Miller
2007-04-26 10:04 ` [patch 10/15] " Paul Mackerras
0 siblings, 2 replies; 13+ messages in thread
From: akpm @ 2007-04-26 7:27 UTC (permalink / raw)
To: davem; +Cc: netdev, akpm, jarkao2, jura, paulus
From: Jarek Poplawski <jarkao2@o2.pl>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.21-rc4 #1
> -------------------------------------------------------
> pppd/8926 is trying to acquire lock:
> (&vlan_netdev_xmit_lock_key){-...}, at: [<c0265486>]
> dev_queue_xmit+0x247/0x2f1
>
> but task is already holding lock:
> (&pch->downl){-+..}, at: [<c0230c72>] ppp_channel_push+0x19/0x9a
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #3 (&pch->downl){-+..}:
> [<c013642b>] __lock_acquire+0xe62/0x1010
> [<c0136642>] lock_acquire+0x69/0x83
> [<c02afc13>] _spin_lock_bh+0x30/0x3d
> [<c022f715>] ppp_push+0x5a/0x9a
> [<c022fb40>] ppp_xmit_process+0x2e/0x511
> [<c0231a05>] ppp_write+0xb8/0xf2
> [<c015ec26>] vfs_write+0x7f/0xba
> [<c015f158>] sys_write+0x3d/0x64
> [<c01027de>] sysenter_past_esp+0x5f/0x99
> [<ffffffff>] 0xffffffff
>
> -> #2 (&ppp->wlock){-+..}:
> [<c013642b>] __lock_acquire+0xe62/0x1010
> [<c0136642>] lock_acquire+0x69/0x83
> [<c02afc13>] _spin_lock_bh+0x30/0x3d
> [<c022fb2b>] ppp_xmit_process+0x19/0x511
> [<c02318d3>] ppp_start_xmit+0x18a/0x204
> [<c0263a6f>] dev_hard_start_xmit+0x1f6/0x2c4
> [<c026ded3>] __qdisc_run+0x81/0x1bc
> [<c026549e>] dev_queue_xmit+0x25f/0x2f1
> [<c027c75f>] ip_output+0x1be/0x25f
> [<c02788ce>] ip_forward+0x159/0x22b
> [<c027745c>] ip_rcv+0x297/0x4dd
> [<c0263698>] netif_receive_skb+0x164/0x1f2
> [<c022199d>] e1000_clean_rx_irq+0x12a/0x4b7
> [<c02209bc>] e1000_clean+0x3ff/0x5dd
> [<c0265084>] net_rx_action+0x7d/0x12b
> [<c011e442>] __do_softirq+0x82/0xf2
> [<c011e509>] do_softirq+0x57/0x59
> [<c011e877>] irq_exit+0x7f/0x81
> [<c0105011>] do_IRQ+0x45/0x84
> [<c0103252>] common_interrupt+0x2e/0x34
> [<c0100b66>] mwait_idle+0x12/0x14
> [<c0100c60>] cpu_idle+0x6c/0x86
> [<c01001cd>] rest_init+0x23/0x36
> [<c0377d89>] start_kernel+0x3ca/0x461
> [<00000000>] 0x0
> [<ffffffff>] 0xffffffff
>
> -> #1 (&dev->_xmit_lock){-+..}:
> [<c013642b>] __lock_acquire+0xe62/0x1010
> [<c0136642>] lock_acquire+0x69/0x83
> [<c02afc13>] _spin_lock_bh+0x30/0x3d
> [<c0266861>] dev_mc_add+0x34/0x16a
> [<c02ab5c7>] vlan_dev_set_multicast_list+0x88/0x25c
> [<c0266592>] __dev_mc_upload+0x22/0x24
> [<c0266914>] dev_mc_add+0xe7/0x16a
> [<c029f323>] igmp_group_added+0xe6/0xeb
> [<c029f50b>] ip_mc_inc_group+0x13f/0x210
> [<c029f5fa>] ip_mc_up+0x1e/0x61
> [<c029ab81>] inetdev_event+0x154/0x2c7
> [<c0125a46>] notifier_call_chain+0x2c/0x39
> [<c0125a7c>] raw_notifier_call_chain+0x8/0xa
> [<c026477a>] dev_open+0x6d/0x71
> [<c0263028>] dev_change_flags+0x51/0x101
> [<c029b7ca>] devinet_ioctl+0x4df/0x644
> [<c029bc03>] inet_ioctl+0x5c/0x6f
> [<c02596e0>] sock_ioctl+0x4f/0x1e8
> [<c0168c32>] do_ioctl+0x22/0x71
> [<c0168cd6>] vfs_ioctl+0x55/0x27e
> [<c0168f32>] sys_ioctl+0x33/0x51
> [<c01027de>] sysenter_past_esp+0x5f/0x99
> [<ffffffff>] 0xffffffff
>
> -> #0 (&vlan_netdev_xmit_lock_key){-...}:
> [<c0136289>] __lock_acquire+0xcc0/0x1010
> [<c0136642>] lock_acquire+0x69/0x83
> [<c02afbd6>] _spin_lock+0x2b/0x38
> [<c0265486>] dev_queue_xmit+0x247/0x2f1
> [<c02334f6>] __pppoe_xmit+0x1a9/0x215
> [<c023356c>] pppoe_xmit+0xa/0xc
> [<c0230c9a>] ppp_channel_push+0x41/0x9a
> [<c0231a13>] ppp_write+0xc6/0xf2
> [<c015ec26>] vfs_write+0x7f/0xba
> [<c015f158>] sys_write+0x3d/0x64
> [<c01027de>] sysenter_past_esp+0x5f/0x99
> [<ffffffff>] 0xffffffff
>
> other info that might help us debug this:
>
> 1 lock held by pppd/8926:
> #0: (&pch->downl){-+..}, at: [<c0230c72>] ppp_channel_push+0x19/0x9a
>
> stack backtrace:
> [<c0103834>] show_trace_log_lvl+0x1a/0x30
> [<c0103f16>] show_trace+0x12/0x14
> [<c0103f9d>] dump_stack+0x16/0x18
> [<c01343cd>] print_circular_bug_tail+0x68/0x71
> [<c0136289>] __lock_acquire+0xcc0/0x1010
> [<c0136642>] lock_acquire+0x69/0x83
> [<c02afbd6>] _spin_lock+0x2b/0x38
> [<c0265486>] dev_queue_xmit+0x247/0x2f1
> [<c02334f6>] __pppoe_xmit+0x1a9/0x215
> [<c023356c>] pppoe_xmit+0xa/0xc
> [<c0230c9a>] ppp_channel_push+0x41/0x9a
> [<c0231a13>] ppp_write+0xc6/0xf2
> [<c015ec26>] vfs_write+0x7f/0xba
> [<c015f158>] sys_write+0x3d/0x64
> [<c01027de>] sysenter_past_esp+0x5f/0x99
> =======================
> Clocksource tsc unstable (delta = 4686844667 ns)
> Time: acpi_pm clocksource has been installed.
...
lockdep has seen locks "-> #0" - "-> #3" taken in circular order, but IMHO,
lock "-> #3" (&pch->downl) taken after "-> #2" (&ppp->wlock) differs from
&pch->downl lock taken in "-> #0" (before &vlan_netdev_xmit_lock_key) and
lockdep should be notified about this.
Reported & tested by: "Yuriy N. Shkandybin" <jura@netams.com>
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/net/ppp_generic.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff -puN drivers/net/ppp_generic.c~ppp_generic-fix-lockdep-warning drivers/net/ppp_generic.c
--- a/drivers/net/ppp_generic.c~ppp_generic-fix-lockdep-warning
+++ a/drivers/net/ppp_generic.c
@@ -1433,7 +1433,8 @@ ppp_channel_push(struct channel *pch)
struct sk_buff *skb;
struct ppp *ppp;
- spin_lock_bh(&pch->downl);
+ local_bh_disable();
+ spin_lock_nested(&pch->downl, SINGLE_DEPTH_NESTING);
if (pch->chan != 0) {
while (!skb_queue_empty(&pch->file.xq)) {
skb = skb_dequeue(&pch->file.xq);
@@ -1447,7 +1448,8 @@ ppp_channel_push(struct channel *pch)
/* channel got deregistered */
skb_queue_purge(&pch->file.xq);
}
- spin_unlock_bh(&pch->downl);
+ spin_unlock(&pch->downl);
+ local_bh_enable();
/* see if there is anything from the attached unit to be sent */
if (skb_queue_empty(&pch->file.xq)) {
read_lock_bh(&pch->upl);
_
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 10/15] ppp_generic: fix lockdep warning
2007-04-26 7:27 [patch 10/15] ppp_generic: fix lockdep warning akpm
@ 2007-04-26 8:39 ` David Miller
2007-04-26 10:49 ` Jarek Poplawski
2007-04-26 10:04 ` [patch 10/15] " Paul Mackerras
1 sibling, 1 reply; 13+ messages in thread
From: David Miller @ 2007-04-26 8:39 UTC (permalink / raw)
To: akpm; +Cc: netdev, jarkao2, jura, paulus
From: akpm@linux-foundation.org
Date: Thu, 26 Apr 2007 00:27:29 -0700
> lockdep has seen locks "-> #0" - "-> #3" taken in circular order, but IMHO,
> lock "-> #3" (&pch->downl) taken after "-> #2" (&ppp->wlock) differs from
> &pch->downl lock taken in "-> #0" (before &vlan_netdev_xmit_lock_key) and
> lockdep should be notified about this.
>
> Reported & tested by: "Yuriy N. Shkandybin" <jura@netams.com>
> Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
> Cc: Paul Mackerras <paulus@samba.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I believe there were some problems found in this one,
so I'm going to hold on it for now.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 10/15] ppp_generic: fix lockdep warning
2007-04-26 7:27 [patch 10/15] ppp_generic: fix lockdep warning akpm
2007-04-26 8:39 ` David Miller
@ 2007-04-26 10:04 ` Paul Mackerras
2007-04-26 11:28 ` Jarek Poplawski
1 sibling, 1 reply; 13+ messages in thread
From: Paul Mackerras @ 2007-04-26 10:04 UTC (permalink / raw)
To: akpm; +Cc: davem, netdev, jarkao2, jura
akpm@linux-foundation.org writes:
> lockdep has seen locks "-> #0" - "-> #3" taken in circular order, but IMHO,
> lock "-> #3" (&pch->downl) taken after "-> #2" (&ppp->wlock) differs from
> &pch->downl lock taken in "-> #0" (before &vlan_netdev_xmit_lock_key) and
> lockdep should be notified about this.
>
> Reported & tested by: "Yuriy N. Shkandybin" <jura@netams.com>
> Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
> Cc: Paul Mackerras <paulus@samba.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> drivers/net/ppp_generic.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff -puN drivers/net/ppp_generic.c~ppp_generic-fix-lockdep-warning drivers/net/ppp_generic.c
> --- a/drivers/net/ppp_generic.c~ppp_generic-fix-lockdep-warning
> +++ a/drivers/net/ppp_generic.c
> @@ -1433,7 +1433,8 @@ ppp_channel_push(struct channel *pch)
> struct sk_buff *skb;
> struct ppp *ppp;
>
> - spin_lock_bh(&pch->downl);
> + local_bh_disable();
> + spin_lock_nested(&pch->downl, SINGLE_DEPTH_NESTING);
This looks like a band-aid to me. I don't feel that I understand
exactly how the recursive locking situation arose, or why saying
"SINGLE_DEPTH_NESTING" (whatever that means exactly) is a suitable
fix.
Paul.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 10/15] ppp_generic: fix lockdep warning
2007-04-26 8:39 ` David Miller
@ 2007-04-26 10:49 ` Jarek Poplawski
2007-05-09 9:35 ` [PATCH] vlan: lockdep subclass for ppp _xmit_lock " Jarek Poplawski
0 siblings, 1 reply; 13+ messages in thread
From: Jarek Poplawski @ 2007-04-26 10:49 UTC (permalink / raw)
To: David Miller; +Cc: akpm, netdev, jura, paulus
On Thu, Apr 26, 2007 at 01:39:11AM -0700, David Miller wrote:
> From: akpm@linux-foundation.org
> Date: Thu, 26 Apr 2007 00:27:29 -0700
>
> > lockdep has seen locks "-> #0" - "-> #3" taken in circular order, but IMHO,
> > lock "-> #3" (&pch->downl) taken after "-> #2" (&ppp->wlock) differs from
> > &pch->downl lock taken in "-> #0" (before &vlan_netdev_xmit_lock_key) and
> > lockdep should be notified about this.
> >
> > Reported & tested by: "Yuriy N. Shkandybin" <jura@netams.com>
> > Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
> > Cc: Paul Mackerras <paulus@samba.org>
> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>
> I believe there were some problems found in this one,
> so I'm going to hold on it for now.
>
This patch should prevent lockdep from reporting one kind
of possible circular locking error, which in my opinion is
impossible (lockdep doesn't see the difference in serving
two types of ppp channels). So, if I'm not wrong, this
patch is helpful sometime.
But there is also a second, very similar lockdep report,
probably also false (lockdep cannot see the difference
between locks of two different, I hope, vlan devices),
which needs more work:
a) vlan should use different lockdep lock subclasses or
classes for each device, which would require quite a lot
of static memory reserved, probably only to silence
lockdep,
b) pppoe could change the way of sending packets, so
the locks of ppp_generic were not seen by lockdep
with so many variants; I'm not sure the maintainer of
pppoe likes this idea;
Doing a) should be enough, I guess; doing b) is easier
but, probably, the similar is possible elsewhere, too.
Of course, there is possibility to silence lockdep
easier, e.g. don't let it know about vlan's lock
at all, but there is always, a risk lockdep reports
could be right sometimes...
Probably this kind of "error" reports could be avoided,
if vlan's config were started after all their devices
were up. Here it is probably: netdev1, vlan1, netdev2,
vlan2... and lockdep isn't able to distinguish them
enough.
Currently I think about some change in lockdep, to track
something like different vlans with less memory, but I'm
not sure it'll work, yet.
Regards,
Jarek P.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 10/15] ppp_generic: fix lockdep warning
2007-04-26 10:04 ` [patch 10/15] " Paul Mackerras
@ 2007-04-26 11:28 ` Jarek Poplawski
0 siblings, 0 replies; 13+ messages in thread
From: Jarek Poplawski @ 2007-04-26 11:28 UTC (permalink / raw)
To: Paul Mackerras; +Cc: akpm, davem, netdev, jura
On Thu, Apr 26, 2007 at 08:04:36PM +1000, Paul Mackerras wrote:
> akpm@linux-foundation.org writes:
>
> > lockdep has seen locks "-> #0" - "-> #3" taken in circular order, but IMHO,
> > lock "-> #3" (&pch->downl) taken after "-> #2" (&ppp->wlock) differs from
> > &pch->downl lock taken in "-> #0" (before &vlan_netdev_xmit_lock_key) and
> > lockdep should be notified about this.
> >
> > Reported & tested by: "Yuriy N. Shkandybin" <jura@netams.com>
> > Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
> > Cc: Paul Mackerras <paulus@samba.org>
> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > ---
> >
> > drivers/net/ppp_generic.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff -puN drivers/net/ppp_generic.c~ppp_generic-fix-lockdep-warning drivers/net/ppp_generic.c
> > --- a/drivers/net/ppp_generic.c~ppp_generic-fix-lockdep-warning
> > +++ a/drivers/net/ppp_generic.c
> > @@ -1433,7 +1433,8 @@ ppp_channel_push(struct channel *pch)
> > struct sk_buff *skb;
> > struct ppp *ppp;
> >
> > - spin_lock_bh(&pch->downl);
> > + local_bh_disable();
> > + spin_lock_nested(&pch->downl, SINGLE_DEPTH_NESTING);
>
> This looks like a band-aid to me. I don't feel that I understand
> exactly how the recursive locking situation arose, or why saying
> "SINGLE_DEPTH_NESTING" (whatever that means exactly) is a suitable
> fix.
I think I've cc-d this patch proposal to you, and there was
something about: don't apply without maintainer's opinion and
Yuriy's testing. I think Andrew was not afraid to risk -mm
stability, and took this earlier.
In this case lockdep sees locks taken in different order,
and one of this locks is pch->downl, taken in two different
functions. Lockdep thinks it's the same lock, but these
functions serve two different type of channels, which
cannot wait for/take this lock at the same time.
SINGLE_DEPTH_NESTING means here this is another lock, so
lockdep doesn't see nothing wrong with this:
f1()
{
spin_lock_nested(&pch->downl, SINGLE_DEPTH_NESTING);
spin_lock(&B);
...
}
f2()
{
spin_lock(&B);
spin_lock(&pch->downl);
...
}
which would generate an error without nesting.
So, if you think it's wrong, the patch should be dumped,
and other measures be taken, to stop bother users with this.
Regards,
Jarek P.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] vlan: lockdep subclass for ppp _xmit_lock Re: ppp_generic: fix lockdep warning
2007-05-09 9:35 ` [PATCH] vlan: lockdep subclass for ppp _xmit_lock " Jarek Poplawski
@ 2007-05-09 9:32 ` David Miller
2007-05-09 13:13 ` [PATCH (take 2)] vlan: lockdep class " Jarek Poplawski
2007-05-09 20:06 ` [PATCH] vlan: lockdep subclass " Yuriy N. Shkandybin
1 sibling, 1 reply; 13+ messages in thread
From: David Miller @ 2007-05-09 9:32 UTC (permalink / raw)
To: jarkao2; +Cc: jura, akpm, netdev, paulus, greearb, mostrows
From: Jarek Poplawski <jarkao2@o2.pl>
Date: Wed, 9 May 2007 11:35:37 +0200
> After rethinking there is the 3-rd way (as usual):
>
> c) vlan should use different lockdep lock subclasses or
> classes for different types of devices, used at the same
> time.
Perhaps we should just bite the bullet and use a seperate
unique lock class for ->_xmit_lock of each device type.
We are going to find more cases like this one, for sure.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] vlan: lockdep subclass for ppp _xmit_lock Re: ppp_generic: fix lockdep warning
2007-04-26 10:49 ` Jarek Poplawski
@ 2007-05-09 9:35 ` Jarek Poplawski
2007-05-09 9:32 ` David Miller
2007-05-09 20:06 ` [PATCH] vlan: lockdep subclass " Yuriy N. Shkandybin
0 siblings, 2 replies; 13+ messages in thread
From: Jarek Poplawski @ 2007-05-09 9:35 UTC (permalink / raw)
To: Yuriy N. Shkandybin; +Cc: akpm, netdev, jura, paulus, greearb, mostrows, davem
On Thu, Apr 26, 2007 at 12:49:50PM +0200, Jarek Poplawski wrote:
...
> But there is also a second, very similar lockdep report,
> probably also false (lockdep cannot see the difference
> between locks of two different, I hope, vlan devices),
> which needs more work:
> a) vlan should use different lockdep lock subclasses or
> classes for each device, which would require quite a lot
> of static memory reserved, probably only to silence
> lockdep,
> b) pppoe could change the way of sending packets, so
> the locks of ppp_generic were not seen by lockdep
> with so many variants; I'm not sure the maintainer of
> pppoe likes this idea;
>
> Doing a) should be enough, I guess; doing b) is easier
> but, probably, the similar is possible elsewhere, too.
...
> Currently I think about some change in lockdep, to track
> something like different vlans with less memory, but I'm
> not sure it'll work, yet.
After rethinking there is the 3-rd way (as usual):
c) vlan should use different lockdep lock subclasses or
classes for different types of devices, used at the same
time.
This patch is intended for testing, so should be applied
after some confirmation. (It should be safe anyway.)
If this works, the previous lockdep patch on ppp_generic
should be really superfluous.
Yuriy, could you try this patch, please?
This is done on 2.6.21, but could be applied to current -mm
or -git, too. If you prefere some other version, let me know.
Regards,
Jarek P.
--->
This patch's aim is to let lockdep see ppp devs as different
from others (default), and it's OK to take: _xmit_lock of vlan
and _xmit_lock of ppp with reverse order provided vlan _xmit_locks
are bound to different devs (ppp and e.g. eth).
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.21-rc6-mm1 #4
> -------------------------------------------------------
> pppd/14305 is trying to acquire lock:
> (&vlan_netdev_xmit_lock_key){-...}, at: [<ffffffff8022f90b>]
> dev_queue_xmit+0x26b/0x300
>
> but task is already holding lock:
> (&pch->downl#2){-+..}, at: [<ffffffff80388d3c>] ppp_push+0x5f/0xa7
>
> which lock already depends on the new lock.
Reported & tested by: "Yuriy N. Shkandybin" <jura@netams.com>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michal Ostrowski <mostrows@speakeasy.net>
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
---
diff -Nurp 2.6.21-/net/8021q/vlan.c 2.6.21/net/8021q/vlan.c
--- 2.6.21-/net/8021q/vlan.c 2007-05-01 12:43:39.000000000 +0200
+++ 2.6.21/net/8021q/vlan.c 2007-05-07 21:09:30.000000000 +0200
@@ -535,7 +535,13 @@ static struct net_device *register_vlan_
if (register_netdevice(new_dev))
goto out_free_newdev;
- lockdep_set_class(&new_dev->_xmit_lock, &vlan_netdev_xmit_lock_key);
+ if (unlikely(real_dev->type == ARPHRD_PPP))
+ /* pppoe uses two different kinds of _xmit_lock for ppp & eth */
+ lockdep_set_class_and_subclass(&new_dev->_xmit_lock,
+ &vlan_netdev_xmit_lock_key, 1);
+ else
+ lockdep_set_class(&new_dev->_xmit_lock,
+ &vlan_netdev_xmit_lock_key);
new_dev->iflink = real_dev->ifindex;
vlan_transfer_operstate(real_dev, new_dev);
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH (take 2)] vlan: lockdep class for ppp _xmit_lock Re: ppp_generic: fix lockdep warning
2007-05-09 9:32 ` David Miller
@ 2007-05-09 13:13 ` Jarek Poplawski
0 siblings, 0 replies; 13+ messages in thread
From: Jarek Poplawski @ 2007-05-09 13:13 UTC (permalink / raw)
To: David Miller; +Cc: jura, akpm, netdev, paulus, greearb, mostrows
On Wed, May 09, 2007 at 02:32:24AM -0700, David Miller wrote:
> From: Jarek Poplawski <jarkao2@o2.pl>
> Date: Wed, 9 May 2007 11:35:37 +0200
>
> > After rethinking there is the 3-rd way (as usual):
> >
> > c) vlan should use different lockdep lock subclasses or
> > classes for different types of devices, used at the same
> > time.
>
> Perhaps we should just bite the bullet and use a seperate
> unique lock class for ->_xmit_lock of each device type.
>
> We are going to find more cases like this one, for sure.
>
Theoretically subclasses are intended for this, and there
are 6 free places still. Practically lockdep treats subclasses
differently sometimes (e.g., now, it seems, it cannot see a real
recursion error inside a subclass). So, it's mainly a problem
of some static memory wasting or saving (unless it should be
subclassed yet - then no question).
Here is "your way" alternative, preferred version - I hope Yuriy
isn't bored with this testing yet!
Of course, I see nothing against you or somebody else doing
or re-doing this all as needed.
Regards,
Jarek P.
---> (take 2)
This patch's aim is to let lockdep see ppp devs as different
from others (default), and it's OK to take: _xmit_lock of vlan
and _xmit_lock of ppp with reverse order provided vlan _xmit_locks
are bound to different devs (ppp and e.g. eth).
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.21-rc6-mm1 #4
> -------------------------------------------------------
> pppd/14305 is trying to acquire lock:
> (&vlan_netdev_xmit_lock_key){-...}, at: [<ffffffff8022f90b>]
> dev_queue_xmit+0x26b/0x300
>
> but task is already holding lock:
> (&pch->downl#2){-+..}, at: [<ffffffff80388d3c>] ppp_push+0x5f/0xa7
>
> which lock already depends on the new lock.
The final idea to use separate classes by David Miller.
Reported & tested by: "Yuriy N. Shkandybin" <jura@netams.com>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michal Ostrowski <mostrows@speakeasy.net>
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
---
diff -Nurp 2.6.21-git7-/net/8021q/vlan.c 2.6.21/net/8021q/vlan.c
--- 2.6.21-git7-/net/8021q/vlan.c 2007-05-01 12:43:39.000000000 +0200
+++ 2.6.21/net/8021q/vlan.c 2007-05-09 14:46:12.000000000 +0200
@@ -376,6 +376,8 @@ static void vlan_transfer_operstate(cons
*/
static struct lock_class_key vlan_netdev_xmit_lock_key;
+/* pppoe uses two different kinds of _xmit_lock for ppp & eth */
+static struct lock_class_key vlan_ppp_xmit_lock_key;
/* Attach a VLAN device to a mac address (ie Ethernet Card).
* Returns the device that was created, or NULL if there was
@@ -535,7 +537,12 @@ static struct net_device *register_vlan_
if (register_netdevice(new_dev))
goto out_free_newdev;
- lockdep_set_class(&new_dev->_xmit_lock, &vlan_netdev_xmit_lock_key);
+ if (unlikely(real_dev->type == ARPHRD_PPP))
+ lockdep_set_class(&new_dev->_xmit_lock,
+ &vlan_ppp_xmit_lock_key);
+ else
+ lockdep_set_class(&new_dev->_xmit_lock,
+ &vlan_netdev_xmit_lock_key);
new_dev->iflink = real_dev->ifindex;
vlan_transfer_operstate(real_dev, new_dev);
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] vlan: lockdep subclass for ppp _xmit_lock Re: ppp_generic: fix lockdep warning
2007-05-09 9:35 ` [PATCH] vlan: lockdep subclass for ppp _xmit_lock " Jarek Poplawski
2007-05-09 9:32 ` David Miller
@ 2007-05-09 20:06 ` Yuriy N. Shkandybin
2007-05-10 5:30 ` Jarek Poplawski
2007-05-11 7:00 ` [PATCH] ppp_generic: lockdep class " Jarek Poplawski
1 sibling, 2 replies; 13+ messages in thread
From: Yuriy N. Shkandybin @ 2007-05-09 20:06 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: akpm, netdev, paulus, greearb, mostrows, davem
After applying this patch i've got this:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.21-gentoo #2
-------------------------------------------------------
ospfd/3984 is trying to acquire lock:
(&ppp->wlock){-...}, at: [<ffffffff803512a0>] ppp_xmit_process+0x20/0x4f0
but task is already holding lock:
(&dev->_xmit_lock){-+..}, at: [<ffffffff8038d778>] __qdisc_run+0x88/0x1c3
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&dev->_xmit_lock){-+..}:
[<ffffffff80288f49>] __lock_acquire+0xca9/0xf31
[<ffffffff8028921a>] lock_acquire+0x49/0x6f
[<ffffffff8038663c>] dev_mc_add+0x3c/0x148
[<ffffffff8025e14c>] _spin_lock_bh+0x23/0x2c
[<ffffffff8038663c>] dev_mc_add+0x3c/0x148
[<ffffffff803bf0fc>] vlan_dev_set_multicast_list+0xfc/0x2a0
[<ffffffff80386707>] dev_mc_add+0x107/0x148
[<ffffffff803b1406>] igmp_group_added+0x44/0x4b
[<ffffffff803b15aa>] ip_mc_inc_group+0x12a/0x150
[<ffffffff803b15b2>] ip_mc_inc_group+0x132/0x150
[<ffffffff803b160e>] ip_mc_up+0x3e/0x5a
[<ffffffff803aeb77>] inetdev_event+0x137/0x2d0
[<ffffffff8038b67e>] rtmsg_ifinfo+0xae/0xe0
[<ffffffff8027bb24>] notifier_call_chain+0x24/0x36
[<ffffffff80384cb6>] dev_open+0x66/0x70
[<ffffffff803830c2>] dev_change_flags+0x5c/0x12a
[<ffffffff803af5fd>] devinet_ioctl+0x2ad/0x6b0
[<ffffffff8037b5e6>] sock_ioctl+0x1e6/0x202
[<ffffffff8023eaa1>] do_ioctl+0x21/0x79
[<ffffffff8022e5bd>] vfs_ioctl+0x28d/0x2b0
[<ffffffff80287dca>] trace_hardirqs_on+0x12e/0x164
[<ffffffff80248f89>] sys_ioctl+0x3b/0x5b
[<ffffffff8025811e>] system_call+0x7e/0x83
[<ffffffffffffffff>] 0xffffffffffffffff
-> #2 (&vlan_netdev_xmit_lock_key){-...}:
[<ffffffff80288f49>] __lock_acquire+0xca9/0xf31
[<ffffffff8028921a>] lock_acquire+0x49/0x6f
[<ffffffff8022de38>] dev_queue_xmit+0x178/0x263
[<ffffffff8025e120>] _spin_lock+0x1e/0x27
[<ffffffff8022de38>] dev_queue_xmit+0x178/0x263
[<ffffffff80354f57>] __pppoe_xmit+0x217/0x23b
[<ffffffff803523e3>] ppp_channel_push+0x43/0xb9
[<ffffffff8035331a>] ppp_write+0x10a/0x120
[<ffffffff80215895>] vfs_write+0xa5/0xf0
[<ffffffff80216238>] sys_write+0x45/0x7d
[<ffffffff8025811e>] system_call+0x7e/0x83
[<ffffffffffffffff>] 0xffffffffffffffff
-> #1 (&pch->downl){-...}:
[<ffffffff80288f49>] __lock_acquire+0xca9/0xf31
[<ffffffff8028921a>] lock_acquire+0x49/0x6f
[<ffffffff803509d5>] ppp_push+0x45/0xaf
[<ffffffff8025e14c>] _spin_lock_bh+0x23/0x2c
[<ffffffff803509d5>] ppp_push+0x45/0xaf
[<ffffffff803516e7>] ppp_xmit_process+0x467/0x4f0
[<ffffffff80287dca>] trace_hardirqs_on+0x12e/0x164
[<ffffffff8035330a>] ppp_write+0xfa/0x120
[<ffffffff80215895>] vfs_write+0xa5/0xf0
[<ffffffff80216238>] sys_write+0x45/0x7d
[<ffffffff8025811e>] system_call+0x7e/0x83
[<ffffffffffffffff>] 0xffffffffffffffff
-> #0 (&ppp->wlock){-...}:
[<ffffffff8028483b>] print_stack_trace+0x6d/0x8a
[<ffffffff80288e08>] __lock_acquire+0xb68/0xf31
[<ffffffff8028921a>] lock_acquire+0x49/0x6f
[<ffffffff803512a0>] ppp_xmit_process+0x20/0x4f0
[<ffffffff8025e14c>] _spin_lock_bh+0x23/0x2c
[<ffffffff803512a0>] ppp_xmit_process+0x20/0x4f0
[<ffffffff80287daf>] trace_hardirqs_on+0x113/0x164
[<ffffffff803531bc>] ppp_start_xmit+0x1dc/0x230
[<ffffffff8038d7f5>] __qdisc_run+0x105/0x1c3
[<ffffffff8022dddf>] dev_queue_xmit+0x11f/0x263
[<ffffffff80237533>] ip_mc_output+0x333/0x370
[<ffffffff803a9460>] raw_sendmsg+0x560/0x729
[<ffffffff8025045b>] sock_sendmsg+0xcb/0xe3
[<ffffffff80281bbc>] autoremove_wake_function+0x0/0x34
[<ffffffff802754bf>] local_bh_enable_ip+0xeb/0xfc
[<ffffffff80287dca>] trace_hardirqs_on+0x12e/0x164
[<ffffffff8039c20c>] ip_setsockopt+0xb1c/0xb3f
[<ffffffff8039c20c>] ip_setsockopt+0xb1c/0xb3f
[<ffffffff803813e6>] verify_iovec+0x46/0x94
[<ffffffff8037bde0>] sys_sendmsg+0x1de/0x249
[<ffffffff8025e3c9>] _spin_unlock_irqrestore+0x41/0x4a
[<ffffffff8027bd6f>] getrusage+0x19f/0x1ba
[<ffffffff80287dca>] trace_hardirqs_on+0x12e/0x164
[<ffffffff8025dbc9>] trace_hardirqs_on_thunk+0x35/0x37
[<ffffffff803a8650>] raw_setsockopt+0x0/0x54
[<ffffffff8025811e>] system_call+0x7e/0x83
[<ffffffffffffffff>] 0xffffffffffffffff
other info that might help us debug this:
1 lock held by ospfd/3984:
#0: (&dev->_xmit_lock){-+..}, at: [<ffffffff8038d778>]
__qdisc_run+0x88/0x1c3
stack backtrace:
Call Trace:
[<ffffffff80286f9d>] print_circular_bug_tail+0x6a/0x7d
[<ffffffff8028483b>] print_stack_trace+0x6d/0x8a
[<ffffffff80288e08>] __lock_acquire+0xb68/0xf31
[<ffffffff8028921a>] lock_acquire+0x49/0x6f
[<ffffffff803512a0>] ppp_xmit_process+0x20/0x4f0
[<ffffffff8025e14c>] _spin_lock_bh+0x23/0x2c
[<ffffffff803512a0>] ppp_xmit_process+0x20/0x4f0
[<ffffffff80287daf>] trace_hardirqs_on+0x113/0x164
[<ffffffff803531bc>] ppp_start_xmit+0x1dc/0x230
[<ffffffff8038d7f5>] __qdisc_run+0x105/0x1c3
[<ffffffff8022dddf>] dev_queue_xmit+0x11f/0x263
[<ffffffff80237533>] ip_mc_output+0x333/0x370
[<ffffffff803a9460>] raw_sendmsg+0x560/0x729
[<ffffffff8025045b>] sock_sendmsg+0xcb/0xe3
[<ffffffff80281bbc>] autoremove_wake_function+0x0/0x34
[<ffffffff802754bf>] local_bh_enable_ip+0xeb/0xfc
[<ffffffff80287dca>] trace_hardirqs_on+0x12e/0x164
[<ffffffff8039c20c>] ip_setsockopt+0xb1c/0xb3f
[<ffffffff8039c20c>] ip_setsockopt+0xb1c/0xb3f
[<ffffffff803813e6>] verify_iovec+0x46/0x94
[<ffffffff8037bde0>] sys_sendmsg+0x1de/0x249
[<ffffffff8025e3c9>] _spin_unlock_irqrestore+0x41/0x4a
[<ffffffff8027bd6f>] getrusage+0x19f/0x1ba
[<ffffffff80287dca>] trace_hardirqs_on+0x12e/0x164
[<ffffffff8025dbc9>] trace_hardirqs_on_thunk+0x35/0x37
[<ffffffff803a8650>] raw_setsockopt+0x0/0x54
[<ffffffff8025811e>] system_call+0x7e/0x83
Jura
----- Original Message -----
From: "Jarek Poplawski" <jarkao2@o2.pl>
To: "Yuriy N. Shkandybin" <jura@netams.com>
Cc: <akpm@linux-foundation.org>; <netdev@vger.kernel.org>;
<jura@netams.com>; <paulus@samba.org>; <greearb@candelatech.com>;
<mostrows@speakeasy.net>; <davem@davemloft.net>
Sent: Wednesday, May 09, 2007 1:35 PM
Subject: [PATCH] vlan: lockdep subclass for ppp _xmit_lock Re: ppp_generic:
fix lockdep warning
> On Thu, Apr 26, 2007 at 12:49:50PM +0200, Jarek Poplawski wrote:
> ...
>> But there is also a second, very similar lockdep report,
>> probably also false (lockdep cannot see the difference
>> between locks of two different, I hope, vlan devices),
>> which needs more work:
>> a) vlan should use different lockdep lock subclasses or
>> classes for each device, which would require quite a lot
>> of static memory reserved, probably only to silence
>> lockdep,
>> b) pppoe could change the way of sending packets, so
>> the locks of ppp_generic were not seen by lockdep
>> with so many variants; I'm not sure the maintainer of
>> pppoe likes this idea;
>>
>> Doing a) should be enough, I guess; doing b) is easier
>> but, probably, the similar is possible elsewhere, too.
> ...
>> Currently I think about some change in lockdep, to track
>> something like different vlans with less memory, but I'm
>> not sure it'll work, yet.
>
> After rethinking there is the 3-rd way (as usual):
>
> c) vlan should use different lockdep lock subclasses or
> classes for different types of devices, used at the same
> time.
>
> This patch is intended for testing, so should be applied
> after some confirmation. (It should be safe anyway.)
>
> If this works, the previous lockdep patch on ppp_generic
> should be really superfluous.
>
> Yuriy, could you try this patch, please?
> This is done on 2.6.21, but could be applied to current -mm
> or -git, too. If you prefere some other version, let me know.
>
> Regards,
> Jarek P.
> --->
>
> This patch's aim is to let lockdep see ppp devs as different
> from others (default), and it's OK to take: _xmit_lock of vlan
> and _xmit_lock of ppp with reverse order provided vlan _xmit_locks
> are bound to different devs (ppp and e.g. eth).
>
>> =======================================================
>> [ INFO: possible circular locking dependency detected ]
>> 2.6.21-rc6-mm1 #4
>> -------------------------------------------------------
>> pppd/14305 is trying to acquire lock:
>> (&vlan_netdev_xmit_lock_key){-...}, at: [<ffffffff8022f90b>]
>> dev_queue_xmit+0x26b/0x300
>>
>> but task is already holding lock:
>> (&pch->downl#2){-+..}, at: [<ffffffff80388d3c>] ppp_push+0x5f/0xa7
>>
>> which lock already depends on the new lock.
>
> Reported & tested by: "Yuriy N. Shkandybin" <jura@netams.com>
> Cc: Ben Greear <greearb@candelatech.com>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michal Ostrowski <mostrows@speakeasy.net>
> Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
>
> ---
>
> diff -Nurp 2.6.21-/net/8021q/vlan.c 2.6.21/net/8021q/vlan.c
> --- 2.6.21-/net/8021q/vlan.c 2007-05-01 12:43:39.000000000 +0200
> +++ 2.6.21/net/8021q/vlan.c 2007-05-07 21:09:30.000000000 +0200
> @@ -535,7 +535,13 @@ static struct net_device *register_vlan_
> if (register_netdevice(new_dev))
> goto out_free_newdev;
>
> - lockdep_set_class(&new_dev->_xmit_lock, &vlan_netdev_xmit_lock_key);
> + if (unlikely(real_dev->type == ARPHRD_PPP))
> + /* pppoe uses two different kinds of _xmit_lock for ppp & eth */
> + lockdep_set_class_and_subclass(&new_dev->_xmit_lock,
> + &vlan_netdev_xmit_lock_key, 1);
> + else
> + lockdep_set_class(&new_dev->_xmit_lock,
> + &vlan_netdev_xmit_lock_key);
>
> new_dev->iflink = real_dev->ifindex;
> vlan_transfer_operstate(real_dev, new_dev);
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] vlan: lockdep subclass for ppp _xmit_lock Re: ppp_generic: fix lockdep warning
2007-05-09 20:06 ` [PATCH] vlan: lockdep subclass " Yuriy N. Shkandybin
@ 2007-05-10 5:30 ` Jarek Poplawski
2007-05-10 6:03 ` Yuriy N. Shkandybin
2007-05-11 7:00 ` [PATCH] ppp_generic: lockdep class " Jarek Poplawski
1 sibling, 1 reply; 13+ messages in thread
From: Jarek Poplawski @ 2007-05-10 5:30 UTC (permalink / raw)
To: Yuriy N. Shkandybin; +Cc: akpm, netdev, paulus, greearb, mostrows, davem
On Thu, May 10, 2007 at 12:06:09AM +0400, Yuriy N. Shkandybin wrote:
> After applying this patch i've got this:
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.21-gentoo #2
> -------------------------------------------------------
> ospfd/3984 is trying to acquire lock:
> (&ppp->wlock){-...}, at: [<ffffffff803512a0>] ppp_xmit_process+0x20/0x4f0
>
> but task is already holding lock:
> (&dev->_xmit_lock){-+..}, at: [<ffffffff8038d778>] __qdisc_run+0x88/0x1c3
>
> which lock already depends on the new lock.
Thanks Yuriy!
As a matter of fact I suspected there will be something more.
Lockdep always stops working after the first warning, and you
have some very interesting (entangled) configuration, so each
fix could uncover new warnings.
I need some time to figure out this one, so I'd try to use your
help yet... I hope it's only about lockdep warnings and there
are no real lockups.
Best regards,
Jarek P.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] vlan: lockdep subclass for ppp _xmit_lock Re: ppp_generic: fix lockdep warning
2007-05-10 5:30 ` Jarek Poplawski
@ 2007-05-10 6:03 ` Yuriy N. Shkandybin
2007-05-10 6:39 ` Jarek Poplawski
0 siblings, 1 reply; 13+ messages in thread
From: Yuriy N. Shkandybin @ 2007-05-10 6:03 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: akpm, netdev, paulus, greearb, mostrows, davem
Yes, there is no real lockup with pppoe
ll repeat my configuration:
vpn (pptp(mostly)+pppoe) concentrator
PPPoE provided through 802.1q
+OSPF (quagga)
Jura
----- Original Message -----
From: "Jarek Poplawski" <jarkao2@o2.pl>
To: "Yuriy N. Shkandybin" <jura@netams.com>
Cc: <akpm@linux-foundation.org>; <netdev@vger.kernel.org>;
<paulus@samba.org>; <greearb@candelatech.com>; <mostrows@speakeasy.net>;
<davem@davemloft.net>
Sent: Thursday, May 10, 2007 9:30 AM
Subject: Re: [PATCH] vlan: lockdep subclass for ppp _xmit_lock Re:
ppp_generic: fix lockdep warning
> On Thu, May 10, 2007 at 12:06:09AM +0400, Yuriy N. Shkandybin wrote:
>> After applying this patch i've got this:
>>
>> =======================================================
>> [ INFO: possible circular locking dependency detected ]
>> 2.6.21-gentoo #2
>> -------------------------------------------------------
>> ospfd/3984 is trying to acquire lock:
>> (&ppp->wlock){-...}, at: [<ffffffff803512a0>] ppp_xmit_process+0x20/0x4f0
>>
>> but task is already holding lock:
>> (&dev->_xmit_lock){-+..}, at: [<ffffffff8038d778>] __qdisc_run+0x88/0x1c3
>>
>> which lock already depends on the new lock.
>
> Thanks Yuriy!
>
> As a matter of fact I suspected there will be something more.
> Lockdep always stops working after the first warning, and you
> have some very interesting (entangled) configuration, so each
> fix could uncover new warnings.
>
> I need some time to figure out this one, so I'd try to use your
> help yet... I hope it's only about lockdep warnings and there
> are no real lockups.
>
> Best regards,
> Jarek P.
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] vlan: lockdep subclass for ppp _xmit_lock Re: ppp_generic: fix lockdep warning
2007-05-10 6:03 ` Yuriy N. Shkandybin
@ 2007-05-10 6:39 ` Jarek Poplawski
0 siblings, 0 replies; 13+ messages in thread
From: Jarek Poplawski @ 2007-05-10 6:39 UTC (permalink / raw)
To: Yuriy N. Shkandybin; +Cc: akpm, netdev, paulus, greearb, mostrows, davem
On Thu, May 10, 2007 at 10:03:23AM +0400, Yuriy N. Shkandybin wrote:
> Yes, there is no real lockup with pppoe
> ll repeat my configuration:
> vpn (pptp(mostly)+pppoe) concentrator
> PPPoE provided through 802.1q
> +OSPF (quagga)
I think, it's a little too general... Probably at least
ifconfig and ip rule, route etc. could tell something.
But let's wait a little.
Jarek P.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] ppp_generic: lockdep class for ppp _xmit_lock Re: ppp_generic: fix lockdep warning
2007-05-09 20:06 ` [PATCH] vlan: lockdep subclass " Yuriy N. Shkandybin
2007-05-10 5:30 ` Jarek Poplawski
@ 2007-05-11 7:00 ` Jarek Poplawski
1 sibling, 0 replies; 13+ messages in thread
From: Jarek Poplawski @ 2007-05-11 7:00 UTC (permalink / raw)
To: Yuriy N. Shkandybin; +Cc: akpm, netdev, paulus, greearb, mostrows, davem
Hi,
Read below, please:
On Thu, May 10, 2007 at 12:06:09AM +0400, Yuriy N. Shkandybin wrote:
> After applying this patch i've got this:
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.21-gentoo #2
> -------------------------------------------------------
> ospfd/3984 is trying to acquire lock:
> (&ppp->wlock){-...}, at: [<ffffffff803512a0>] ppp_xmit_process+0x20/0x4f0
>
> but task is already holding lock:
> (&dev->_xmit_lock){-+..}, at: [<ffffffff8038d778>] __qdisc_run+0x88/0x1c3
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #3 (&dev->_xmit_lock){-+..}:
...
> [<ffffffff8038663c>] dev_mc_add+0x3c/0x148
> [<ffffffff8025e14c>] _spin_lock_bh+0x23/0x2c
> [<ffffffff8038663c>] dev_mc_add+0x3c/0x148
> [<ffffffff803bf0fc>] vlan_dev_set_multicast_list+0xfc/0x2a0
> [<ffffffff80386707>] dev_mc_add+0x107/0x148
...
> -> #2 (&vlan_netdev_xmit_lock_key){-...}:
...
> [<ffffffff8022de38>] dev_queue_xmit+0x178/0x263
> [<ffffffff8025e120>] _spin_lock+0x1e/0x27
> [<ffffffff8022de38>] dev_queue_xmit+0x178/0x263
> [<ffffffff80354f57>] __pppoe_xmit+0x217/0x23b
> [<ffffffff803523e3>] ppp_channel_push+0x43/0xb9
> [<ffffffff8035331a>] ppp_write+0x10a/0x120
...
> -> #1 (&pch->downl){-...}:
...
> [<ffffffff803509d5>] ppp_push+0x45/0xaf
> [<ffffffff8025e14c>] _spin_lock_bh+0x23/0x2c
> [<ffffffff803509d5>] ppp_push+0x45/0xaf
> [<ffffffff803516e7>] ppp_xmit_process+0x467/0x4f0
> [<ffffffff80287dca>] trace_hardirqs_on+0x12e/0x164
...
> -> #0 (&ppp->wlock){-...}:
...
> [<ffffffff803512a0>] ppp_xmit_process+0x20/0x4f0
> [<ffffffff8025e14c>] _spin_lock_bh+0x23/0x2c
> [<ffffffff803512a0>] ppp_xmit_process+0x20/0x4f0
> [<ffffffff80287daf>] trace_hardirqs_on+0x113/0x164
> [<ffffffff803531bc>] ppp_start_xmit+0x1dc/0x230
> [<ffffffff8038d7f5>] __qdisc_run+0x105/0x1c3
> [<ffffffff8022dddf>] dev_queue_xmit+0x11f/0x263
> [<ffffffff80237533>] ip_mc_output+0x333/0x370
> [<ffffffff803a9460>] raw_sendmsg+0x560/0x729
...
> other info that might help us debug this:
>
> 1 lock held by ospfd/3984:
> #0: (&dev->_xmit_lock){-+..}, at: [<ffffffff8038d778>]
> __qdisc_run+0x88/0x1c3
...
Here is a patch to do for ppp something similar as in vlan:
kind of lockdep configuration. I think this is useful, even
if it doesn't fix this particular problem, and IMHO should
be safe - likewise - my previous, vlan patch. These patches
should enough with simpler situations, up to two types of
interfaces (e.g. vlan with ppp with eth). There could be
something like vlan with ppp with eth with other ppp or
something virtual, so lockdep is going crazy.
It seems that above "-> #3 (&dev->_xmit_lock)" is on eth (not
ppp at least), while "#0: (&dev->_xmit_lock)" held at the end
belongs to ppp. After this patch we should be at least sure
of this.
Yuriy, could you try this patch, please?
This is done on 2.6.21, but could be applied to current -mm
or -git, too. (My previous: vlan & ppp_generic patches should
stay too.)
> Yes, there is no real lockup with pppoe
> ll repeat my configuration:
> vpn (pptp(mostly)+pppoe) concentrator
> PPPoE provided through 802.1q
> +OSPF (quagga)
>
> Jura
Jura, it would be useful to know, at least, which types of
interfaces are added to your vlans/802.1q and pppoe, so
maybe you could send some parts of configs or at least
"ip link" output.
Regards,
Jarek P.
--->
This patch's aim is to let lockdep see ppp dev->_xmit_lock as
different from others, after register_netdev inits this lock.
Reported & tested by: "Yuriy N. Shkandybin" <jura@netams.com>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michal Ostrowski <mostrows@speakeasy.net>
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
---
diff -Nurp 2.6.21-/drivers/net/ppp_generic.c 2.6.21/drivers/net/ppp_generic.c
--- 2.6.21-/drivers/net/ppp_generic.c 2007-05-10 20:05:42.000000000 +0200
+++ 2.6.21/drivers/net/ppp_generic.c 2007-05-10 21:35:46.000000000 +0200
@@ -2399,6 +2399,10 @@ ppp_get_stats(struct ppp *ppp, struct pp
* or if there is already a unit with the requested number.
* unit == -1 means allocate a new number.
*/
+
+/* ppp dev->_xmit_lock shouldn't be mixed with others used e.g. with pppoe */
+static struct lock_class_key ppp_netdev_xmit_lock_key;
+
static struct ppp *
ppp_create_interface(int unit, int *retp)
{
@@ -2456,6 +2460,7 @@ ppp_create_interface(int unit, int *retp
if (ret != 0)
goto out3;
+ lockdep_set_class(&dev->_xmit_lock, &ppp_netdev_xmit_lock_key);
mutex_unlock(&all_ppp_mutex);
*retp = 0;
return ppp;
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2007-05-11 6:54 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-04-26 7:27 [patch 10/15] ppp_generic: fix lockdep warning akpm
2007-04-26 8:39 ` David Miller
2007-04-26 10:49 ` Jarek Poplawski
2007-05-09 9:35 ` [PATCH] vlan: lockdep subclass for ppp _xmit_lock " Jarek Poplawski
2007-05-09 9:32 ` David Miller
2007-05-09 13:13 ` [PATCH (take 2)] vlan: lockdep class " Jarek Poplawski
2007-05-09 20:06 ` [PATCH] vlan: lockdep subclass " Yuriy N. Shkandybin
2007-05-10 5:30 ` Jarek Poplawski
2007-05-10 6:03 ` Yuriy N. Shkandybin
2007-05-10 6:39 ` Jarek Poplawski
2007-05-11 7:00 ` [PATCH] ppp_generic: lockdep class " Jarek Poplawski
2007-04-26 10:04 ` [patch 10/15] " Paul Mackerras
2007-04-26 11:28 ` Jarek Poplawski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).