netdev.vger.kernel.org archive mirror
* Re: + ppp_generic-fix-lockdep-warning.patch added to -mm tree
       [not found] ` <01ef01c77bfe$4cdfe640$0202fea9@Jura>
@ 2007-04-11  7:09   ` Andrew Morton
  2007-04-11  8:52     ` Yuriy N. Shkandybin
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2007-04-11  7:09 UTC
  To: Yuriy N. Shkandybin; +Cc: jarkao2, paulus, netdev

(added netdev)

On Wed, 11 Apr 2007 09:57:33 +0400 "Yuriy N. Shkandybin" <jura@netams.com> wrote:

> I've tested  2.6.21-rc6-mm1
> Linux vpn1 2.6.21-rc6-mm1 #4 SMP Wed Apr 11 03:34:26 MSD 2007 x86_64 
> Intel(R) Pentium(R) D CPU 2.80GHz GenuineIntel GNU/Linux
> 
> The warning appears on the first PPPoE connection to an rp-pppoe server in kernel mode
> 
> result:
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.21-rc6-mm1 #4
> -------------------------------------------------------
> pppd/14305 is trying to acquire lock:
>  (&vlan_netdev_xmit_lock_key){-...}, at: [<ffffffff8022f90b>] dev_queue_xmit+0x26b/0x300
> 
> but task is already holding lock:
>  (&pch->downl#2){-+..}, at: [<ffffffff80388d3c>] ppp_push+0x5f/0xa7
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #3 (&pch->downl#2){-+..}:
>        [<ffffffff80290c5f>] __lock_acquire+0xedf/0x1048
>        [<ffffffff80290e17>] lock_acquire+0x4f/0x78
>        [<ffffffff80388d3c>] ppp_push+0x5f/0xa7
>        [<ffffffff80263434>] _spin_lock_bh+0x2a/0x39
>        [<ffffffff80388d3c>] ppp_push+0x5f/0xa7
>        [<ffffffff8038967d>] ppp_xmit_process+0x3d/0x590
>        [<ffffffff8038b915>] ppp_write+0x105/0x140
>        [<ffffffff802163f3>] vfs_write+0xa3/0xf0
>        [<ffffffff80216e04>] sys_write+0x47/0x75
>        [<ffffffff8025d11e>] system_call+0x7e/0x83
>        [<ffffffffffffffff>] 0xffffffffffffffff
> 
> -> #2 (&ppp->wlock){-+..}:
>        [<ffffffff80290c5f>] __lock_acquire+0xedf/0x1048
>        [<ffffffff80290e17>] lock_acquire+0x4f/0x78
>        [<ffffffff80389667>] ppp_xmit_process+0x27/0x590
>        [<ffffffff80263434>] _spin_lock_bh+0x2a/0x39
>        [<ffffffff80389667>] ppp_xmit_process+0x27/0x590
>        [<ffffffff8038b78c>] ppp_start_xmit+0x1cc/0x250
>        [<ffffffff803c1bff>] dev_hard_start_xmit+0x22f/0x290
>        [<ffffffff803ccbf1>] __qdisc_run+0xd1/0x1f8
>        [<ffffffff8022f928>] dev_queue_xmit+0x288/0x300
>        [<ffffffff80239b72>] ip_mc_output+0x292/0x3f0
>        [<ffffffff803eb991>] raw_sendmsg+0x511/0x7c3
>        [<ffffffff80245a25>] inet_sendmsg+0x35/0x55
>        [<ffffffff80254da7>] sock_sendmsg+0xdf/0x102
>        [<ffffffff8028f826>] trace_hardirqs_on+0xc6/0x160
>        [<ffffffff802885da>] autoremove_wake_function+0x0/0x46
>        [<ffffffff80263324>] _spin_unlock_bh+0x2f/0x36
>        [<ffffffff80230ec8>] release_sock+0xcd/0xd6
>        [<ffffffff803dbcc2>] ip_setsockopt+0x142/0xbb3
>        [<ffffffff803bf33c>] verify_iovec+0x3c/0xc2
>        [<ffffffff803b8e9d>] sys_sendmsg+0x133/0x248
>        [<ffffffff8028f826>] trace_hardirqs_on+0xc6/0x160
>        [<ffffffff80282028>] getrusage+0x1b8/0x1d9
>        [<ffffffff8028f826>] trace_hardirqs_on+0xc6/0x160
>        [<ffffffff80262e27>] trace_hardirqs_on_thunk+0x35/0x37
>        [<ffffffff8025d11e>] system_call+0x7e/0x83
>        [<ffffffffffffffff>] 0xffffffffffffffff
> 
> -> #1 (&dev->_xmit_lock){-+..}:
>        [<ffffffff80290c5f>] __lock_acquire+0xedf/0x1048
>        [<ffffffff80290e17>] lock_acquire+0x4f/0x78
>        [<ffffffff803c4f59>] dev_mc_add+0x40/0x169
>        [<ffffffff80263434>] _spin_lock_bh+0x2a/0x39
>        [<ffffffff803c4f59>] dev_mc_add+0x40/0x169
>        [<ffffffff80403457>] vlan_dev_set_multicast_list+0xa7/0x2b8
>        [<ffffffff803c4c64>] __dev_mc_upload+0x24/0x26
>        [<ffffffff803c4ff7>] dev_mc_add+0xde/0x169
>        [<ffffffff803f47d7>] igmp_group_added+0x56/0x5f
>        [<ffffffff8026322b>] _write_unlock_bh+0x2f/0x36
>        [<ffffffff803f4965>] ip_mc_inc_group+0x105/0x17a
>        [<ffffffff803f49fc>] ip_mc_up+0x22/0x69
>        [<ffffffff803f1b48>] inetdev_event+0x1b8/0x2f0
>        [<ffffffff80281d69>] notifier_call_chain+0x49/0x6b
>        [<ffffffff80281dcc>] __raw_notifier_call_chain+0x9/0xb
>        [<ffffffff80281ddf>] raw_notifier_call_chain+0x11/0x13
>        [<ffffffff803c335d>] dev_open+0x7d/0x80
>        [<ffffffff803c1527>] dev_change_flags+0x107/0x138
>        [<ffffffff803f290c>] devinet_ioctl+0x5cc/0x720
>        [<ffffffff803c2fec>] dev_ioctl+0x1fc/0x31b
>        [<ffffffff8022104f>] __up_read+0x3f/0x9d
>        [<ffffffff803f2d5d>] inet_ioctl+0x5d/0x77
>        [<ffffffff803b854f>] sock_ioctl+0x4f/0x215
>        [<ffffffff80241aca>] do_ioctl+0x2a/0x83
>        [<ffffffff8022fe92>] vfs_ioctl+0x62/0x2b0
>        [<ffffffff8028f826>] trace_hardirqs_on+0xc6/0x160
>        [<ffffffff8024cef0>] sys_ioctl+0x41/0x65
>        [<ffffffff8025d11e>] system_call+0x7e/0x83
>        [<ffffffffffffffff>] 0xffffffffffffffff
> 
> -> #0 (&vlan_netdev_xmit_lock_key){-...}:
>        [<ffffffff8028ded0>] print_circular_bug_entry+0x49/0x59
>        [<ffffffff80290ad3>] __lock_acquire+0xd53/0x1048
>        [<ffffffff8020a5d5>] kmem_cache_alloc+0x1a5/0x5e0
>        [<ffffffff8028f7ea>] trace_hardirqs_on+0x8a/0x160
>        [<ffffffff80290e17>] lock_acquire+0x4f/0x78
>        [<ffffffff8022f90b>] dev_queue_xmit+0x26b/0x300
>        [<ffffffff802633fb>] _spin_lock+0x25/0x34
>        [<ffffffff8022f90b>] dev_queue_xmit+0x26b/0x300
>        [<ffffffff8038d753>] __pppoe_xmit+0x1e8/0x265
>        [<ffffffff8038d7dc>] pppoe_xmit+0xc/0xe
>        [<ffffffff80388d51>] ppp_push+0x74/0xa7
>        [<ffffffff8038967d>] ppp_xmit_process+0x3d/0x590
>        [<ffffffff8038b915>] ppp_write+0x105/0x140
>        [<ffffffff802163f3>] vfs_write+0xa3/0xf0
>        [<ffffffff80216e04>] sys_write+0x47/0x75
>        [<ffffffff8025d11e>] system_call+0x7e/0x83
>        [<ffffffffffffffff>] 0xffffffffffffffff
> 
> other info that might help us debug this:
> 
> 2 locks held by pppd/14305:
>  #0:  (&ppp->wlock){-+..}, at: [<ffffffff80389667>] ppp_xmit_process+0x27/0x590
>  #1:  (&pch->downl#2){-+..}, at: [<ffffffff80388d3c>] ppp_push+0x5f/0xa7
> 
> stack backtrace:
> 
> Call Trace:
>  [<ffffffff8028e95b>] print_circular_bug_tail+0x7c/0x91
>  [<ffffffff8028ded0>] print_circular_bug_entry+0x49/0x59
>  [<ffffffff80290ad3>] __lock_acquire+0xd53/0x1048
>  [<ffffffff8020a5d5>] kmem_cache_alloc+0x1a5/0x5e0
>  [<ffffffff8028f7ea>] trace_hardirqs_on+0x8a/0x160
>  [<ffffffff80290e17>] lock_acquire+0x4f/0x78
>  [<ffffffff8022f90b>] dev_queue_xmit+0x26b/0x300
>  [<ffffffff802633fb>] _spin_lock+0x25/0x34
>  [<ffffffff8022f90b>] dev_queue_xmit+0x26b/0x300
>  [<ffffffff8038d753>] __pppoe_xmit+0x1e8/0x265
>  [<ffffffff8038d7dc>] pppoe_xmit+0xc/0xe
>  [<ffffffff80388d51>] ppp_push+0x74/0xa7
>  [<ffffffff8038967d>] ppp_xmit_process+0x3d/0x590
>  [<ffffffff8038b915>] ppp_write+0x105/0x140
>  [<ffffffff802163f3>] vfs_write+0xa3/0xf0
>  [<ffffffff80216e04>] sys_write+0x47/0x75
>  [<ffffffff8025d11e>] system_call+0x7e/0x83
> 
> INFO: lockdep is turned off.

Thanks.  So you're saying that
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm1/broken-out/ppp_generic-fix-lockdep-warning.patch
did not fix anything?



* Re: + ppp_generic-fix-lockdep-warning.patch added to -mm tree
  2007-04-11  7:09   ` + ppp_generic-fix-lockdep-warning.patch added to -mm tree Andrew Morton
@ 2007-04-11  8:52     ` Yuriy N. Shkandybin
  2007-04-11  8:58       ` Andrew Morton
  2007-04-17  7:37       ` Jarek Poplawski
  0 siblings, 2 replies; 7+ messages in thread
From: Yuriy N. Shkandybin @ 2007-04-11  8:52 UTC
  To: Andrew Morton; +Cc: jarkao2, paulus, netdev



> (added netdev)
>
> On Wed, 11 Apr 2007 09:57:33 +0400 "Yuriy N. Shkandybin" <jura@netams.com> 
> wrote:
>
>> I've tested  2.6.21-rc6-mm1
>> Linux vpn1 2.6.21-rc6-mm1 #4 SMP Wed Apr 11 03:34:26 MSD 2007 x86_64
>> Intel(R) Pentium(R) D CPU 2.80GHz GenuineIntel GNU/Linux
>>
>> The warning appears on the first PPPoE connection to an rp-pppoe server in
>> kernel mode
>>
>
> Thanks.  So you're saying that
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm1/broken-out/ppp_generic-fix-lockdep-warning.patch
> did not fix anything?
As I understand it, this patch is already in the -mm tree, so I booted the
latest -mm kernel and got this locking warning.
Or am I mistaken, and should I apply this patch manually?
>
> 



* Re: + ppp_generic-fix-lockdep-warning.patch added to -mm tree
  2007-04-11  8:52     ` Yuriy N. Shkandybin
@ 2007-04-11  8:58       ` Andrew Morton
  2007-04-17  7:37       ` Jarek Poplawski
  1 sibling, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2007-04-11  8:58 UTC
  To: Yuriy N. Shkandybin; +Cc: jarkao2, paulus, netdev

On Wed, 11 Apr 2007 12:52:28 +0400 "Yuriy N. Shkandybin" <jura@netams.com> wrote:

> 
> 
> > (added netdev)
> >
> > On Wed, 11 Apr 2007 09:57:33 +0400 "Yuriy N. Shkandybin" <jura@netams.com> 
> > wrote:
> >
> >> I've tested  2.6.21-rc6-mm1
> >> Linux vpn1 2.6.21-rc6-mm1 #4 SMP Wed Apr 11 03:34:26 MSD 2007 x86_64
> >> Intel(R) Pentium(R) D CPU 2.80GHz GenuineIntel GNU/Linux
> >>
> >> The warning appears on the first PPPoE connection to an rp-pppoe server in
> >> kernel mode
> >>
> >
> > Thanks.  So you're saying that
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm1/broken-out/ppp_generic-fix-lockdep-warning.patch
> > did not fix anything?
> As I understand it, this patch is already in the -mm tree, so I booted the
> latest -mm kernel and got this locking warning.
> Or am I mistaken, and should I apply this patch manually?

No, that's OK - that patch is indeed in 2.6.21-rc6-mm1.  It appears that it
did not fix this bug.



* Re: + ppp_generic-fix-lockdep-warning.patch added to -mm tree
  2007-04-11  8:52     ` Yuriy N. Shkandybin
  2007-04-11  8:58       ` Andrew Morton
@ 2007-04-17  7:37       ` Jarek Poplawski
  2007-04-17 13:26         ` Michal Ostrowski
  2007-04-19  5:30         ` Jarek Poplawski
  1 sibling, 2 replies; 7+ messages in thread
From: Jarek Poplawski @ 2007-04-17  7:37 UTC
  To: Yuriy N. Shkandybin
  Cc: Andrew Morton, Paul Mackerras, netdev, Michal Ostrowski

On Wed, Apr 11, 2007 at 12:52:28PM +0400, Yuriy N. Shkandybin wrote:
...
> >On Wed, 11 Apr 2007 09:57:33 +0400 "Yuriy N. Shkandybin" <jura@netams.com> 
> >wrote:
> >
> >>I've tested  2.6.21-rc6-mm1
> >>Linux vpn1 2.6.21-rc6-mm1 #4 SMP Wed Apr 11 03:34:26 MSD 2007 x86_64
> >>Intel(R) Pentium(R) D CPU 2.80GHz GenuineIntel GNU/Linux
> >>
> >>The warning appears on the first PPPoE connection to an rp-pppoe server in
> >>kernel mode
> >>
> >
> >Thanks.  So you're saying that
> >ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm1/broken-out/ppp_generic-fix-lockdep-warning.patch
> >did not fix anything?
> As I understand it, this patch is already in the -mm tree, so I booted the
> latest -mm kernel and got this locking warning.
> Or am I mistaken, and should I apply this patch manually?

Hi!

Yuriy - thanks for testing my patch ...(pause) Not!

It seems this patch is not visible in this version - probably
it was overridden by something else. But your new log shows there
is another connection between these locks (ppp_xmit_process
and ppp_push instead of ppp_channel_push in "-> #0"), so the
patch is not sufficient (and could be dropped).

I don't know your VLAN configuration, but a real lockup doesn't
seem very probable here - it's rather a lockdep question.
I think the VLAN code's overly broad lockdep class is the main
culprit here, but pppoe could probably also be improved: it makes
things unnecessarily complicated by calling dev_queue_xmit
under ppp_generic's xmit locks. I wonder if there is any reason
against using a tasklet here.
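
A minimal sketch of the kind of check behind the report above, assuming lock-order dependencies can be modelled as "held -> wanted" edges in a directed graph. The class edges are read off the "-> #0" to "-> #3" entries of the trace. The instance edges use hypothetical device names (eth0, eth0.100, ppp0) and assume that the _xmit_lock taken in dev_mc_add and the one held around ppp_start_xmit belong to different devices. Real lockdep is considerably more involved than this.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return one cycle as a list of nodes (first == last), or None."""
    graph = defaultdict(list)
    for held, wanted in edges:
        graph[held].append(wanted)

    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)   # every node starts WHITE
    stack = []                 # current depth-first path

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in graph[node]:
            if color[nxt] == GREY:          # back edge: cycle found
                return stack[stack.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Dependencies keyed by lock class, as named in the trace.
CLASS_EDGES = [
    ("vlan_netdev_xmit_lock_key", "dev->_xmit_lock"),  # "-> #1"
    ("dev->_xmit_lock", "ppp->wlock"),                 # "-> #2"
    ("ppp->wlock", "pch->downl"),                      # "-> #3"
    ("pch->downl", "vlan_netdev_xmit_lock_key"),       # "-> #0", the new one
]

# The same dependencies keyed by lock instance (device names are assumptions).
INSTANCE_EDGES = [
    ("eth0.100:xmit_lock", "eth0:xmit_lock"),
    ("ppp0:xmit_lock", "ppp0:wlock"),
    ("ppp0:wlock", "ppp0:downl"),
    ("ppp0:downl", "eth0.100:xmit_lock"),
]

if __name__ == "__main__":
    print("by class:   ", find_cycle(CLASS_EDGES))
    print("by instance:", find_cycle(INSTANCE_EDGES))
```

Keyed by class, the four dependencies close a loop; keyed by the assumed instances, they do not, which would be consistent with the remark that a real lockup is not very probable.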

I'll try to find more time to untangle this - or maybe some
maintainer will find this interesting, too...

Regards,
Jarek P.

PS: sorry for the late response (vacation).


* Re: + ppp_generic-fix-lockdep-warning.patch added to -mm tree
  2007-04-17  7:37       ` Jarek Poplawski
@ 2007-04-17 13:26         ` Michal Ostrowski
  2007-04-18  6:40           ` Jarek Poplawski
  2007-04-19  5:30         ` Jarek Poplawski
  1 sibling, 1 reply; 7+ messages in thread
From: Michal Ostrowski @ 2007-04-17 13:26 UTC
  To: Jarek Poplawski
  Cc: Yuriy N. Shkandybin, Andrew Morton, Paul Mackerras, netdev,
	Michal Ostrowski

The "xmit" function of a PPP channel is a synchronous operation.  If the 
transmission fails, we must notify the caller and let them re-submit the 
skb later.  The return status of dev_queue_xmit is needed to determine 
the return code passed back to the caller and thus the call is made 
synchronously and not in a tasklet.
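
A toy model of the point above, assuming a lower layer that may refuse a packet. Channel, xmit_sync, xmit_deferred and send_all are hypothetical names for illustration, not kernel APIs. With the synchronous call the caller learns about the refusal and can keep the packet for resubmission; with the deferred call the packet has to be reported as accepted before the outcome is known.

```python
from collections import deque


class Channel:
    """Toy stand-in for a channel sitting above a lower-layer send function."""

    def __init__(self, lower_send):
        self.lower_send = lower_send    # callable(pkt) -> bool (accepted?)
        self.deferred = deque()

    def xmit_sync(self, pkt):
        # Call the lower layer now; the caller sees the real result.
        return self.lower_send(pkt)

    def xmit_deferred(self, pkt):
        # Queue for later; acceptance is claimed without knowing the outcome.
        self.deferred.append(pkt)
        return True

    def run_deferred(self):
        # Later context (e.g. a tasklet): returns packets the lower layer refused.
        lost = []
        while self.deferred:
            pkt = self.deferred.popleft()
            if not self.lower_send(pkt):
                lost.append(pkt)
        return lost


def send_all(xmit, packets):
    """Caller side: return the packets that were refused and need resubmitting."""
    return [pkt for pkt in packets if not xmit(pkt)]


if __name__ == "__main__":
    ch = Channel(lambda pkt: pkt != "b")      # lower layer refuses "b"
    print(send_all(ch.xmit_sync, ["a", "b", "c"]))      # caller can retry "b"
    print(send_all(ch.xmit_deferred, ["a", "b", "c"]))  # caller sees no failure
    print(ch.run_deferred())                            # refusal surfaces later
```

In the deferred variant the refusal only surfaces inside run_deferred, after the original caller has already moved on.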

Looking at the stack traces earlier in this thread, it seems to me that 
even if the PPPoE call was made in a tasklet, this same warning could be 
generated.

--
Michal Ostrowski
mostrows@earthlink.net





* Re: + ppp_generic-fix-lockdep-warning.patch added to -mm tree
  2007-04-17 13:26         ` Michal Ostrowski
@ 2007-04-18  6:40           ` Jarek Poplawski
  0 siblings, 0 replies; 7+ messages in thread
From: Jarek Poplawski @ 2007-04-18  6:40 UTC
  To: Michal Ostrowski
  Cc: Yuriy N. Shkandybin, Andrew Morton, Paul Mackerras, netdev,
	Michal Ostrowski

On Tue, Apr 17, 2007 at 08:26:32AM -0500, Michal Ostrowski wrote:
> The "xmit" function of a PPP channel is a synchronous operation.  If the 
> transmission fails, we must notify the caller and let them re-submit the 
> skb later.  The return status of dev_queue_xmit is needed to determine 
> the return code passed back to the caller and thus the call is made 
> synchronously and not in a tasklet.

Sure! But on the other hand:

- the return code from dev_queue_xmit doesn't guarantee
that the transmission won't fail,

- similar code in ppp_async: ppp_async_send isn't so
truthful and doesn't even check the return value of
ppp_async_push; BTW - other layers should probably take
care of transmission errors and resubmitting,

- maybe I'm wrong here, but I think every "layer" should
look (work) similar here: dev_queue_xmit (or qdisc_run)
thinks it's talking to some independent network device,
which does some transmission after dev_hard_start_xmit
(and dev->hard_start_xmit); if, instead of this, further
dev_queue_xmit calls are made with xmit locks held from
previous "devs", then it looks like logical recursion, and
the locking is really hard to follow (even if it's OK).

> Looking at the stack traces earlier in this thread, it seems to me that 
> even if the PPPoE call was made in a tasklet, this same warning could be 
> generated.

Of course a tasklet by itself isn't a cure, but if
dev_queue_xmit is called from a tasklet, only locks taken
within that tasklet should be counted.
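
A small sketch of that claim, under simplifying assumptions: LockTracker is a hypothetical model that records a "held -> wanted" edge for every lock held at the time of an acquisition (real lockdep tracks dependencies differently). It compares taking the VLAN xmit lock directly under the ppp locks with handing the packet to a later context that starts with no locks held. It illustrates the argument only and does not settle whether the warning would go away in practice.

```python
class LockTracker:
    """Records lock-order edges from every currently held lock to a new one."""

    def __init__(self):
        self.held = []
        self.edges = set()

    def acquire(self, name):
        for h in self.held:
            self.edges.add((h, name))
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)


def direct_path(t):
    # The lower-layer xmit lock is taken while both ppp locks are held.
    t.acquire("ppp->wlock")
    t.acquire("pch->downl")
    t.acquire("vlan_netdev_xmit_lock_key")
    t.release("vlan_netdev_xmit_lock_key")
    t.release("pch->downl")
    t.release("ppp->wlock")


def deferred_path(t):
    # The packet is only queued under the ppp locks ...
    pending = []
    t.acquire("ppp->wlock")
    t.acquire("pch->downl")
    pending.append("skb")
    t.release("pch->downl")
    t.release("ppp->wlock")
    # ... and sent later from a context that holds none of them.
    while pending:
        pending.pop()
        t.acquire("vlan_netdev_xmit_lock_key")
        t.release("vlan_netdev_xmit_lock_key")


if __name__ == "__main__":
    a, b = LockTracker(), LockTracker()
    direct_path(a)
    deferred_path(b)
    print(sorted(a.edges))
    print(sorted(b.edges))
```

In this model the deferred path never records an edge from pch->downl to the VLAN xmit lock, which is the dependency that closes the reported chain.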

Thanks for the response & best regards,
Jarek P.


* Re: + ppp_generic-fix-lockdep-warning.patch added to -mm tree
  2007-04-17  7:37       ` Jarek Poplawski
  2007-04-17 13:26         ` Michal Ostrowski
@ 2007-04-19  5:30         ` Jarek Poplawski
  1 sibling, 0 replies; 7+ messages in thread
From: Jarek Poplawski @ 2007-04-19  5:30 UTC
  To: Yuriy N. Shkandybin
  Cc: Andrew Morton, Paul Mackerras, netdev, Michal Ostrowski

On Tue, Apr 17, 2007 at 09:37:44AM +0200, Jarek Poplawski wrote:
...
> Yuriy - thanks for testing my patch ...(pause) Not!
> 
> It seems this patch is not visible in this version - probably
...

Sorry! It was only something with my eyes.
(Probably too much of Pamela!).

Jarek P.

