From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Yuriy N. Shkandybin" Subject: Re: [PATCH] vlan: lockdep subclass for ppp _xmit_lock Re: ppp_generic: fix lockdep warning Date: Thu, 10 May 2007 00:06:09 +0400 Message-ID: <001801c79275$80cd5e00$0701010a@note> References: <200704260727.l3Q7RTxG023970@shell0.pdx.osdl.net> <20070426.013911.69219176.davem@davemloft.net> <20070426104950.GA3145@ff.dom.local> <20070509093536.GA2436@ff.dom.local> Mime-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit Cc: , , , , , To: "Jarek Poplawski" Return-path: Received: from www.netams.com ([212.192.245.10]:60978 "EHLO mail.netams.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755785AbXEIUGd (ORCPT ); Wed, 9 May 2007 16:06:33 -0400 Sender: netdev-owner@vger.kernel.org List-Id: netdev.vger.kernel.org After applying this patch i've got this: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.21-gentoo #2 ------------------------------------------------------- ospfd/3984 is trying to acquire lock: (&ppp->wlock){-...}, at: [] ppp_xmit_process+0x20/0x4f0 but task is already holding lock: (&dev->_xmit_lock){-+..}, at: [] __qdisc_run+0x88/0x1c3 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&dev->_xmit_lock){-+..}: [] __lock_acquire+0xca9/0xf31 [] lock_acquire+0x49/0x6f [] dev_mc_add+0x3c/0x148 [] _spin_lock_bh+0x23/0x2c [] dev_mc_add+0x3c/0x148 [] vlan_dev_set_multicast_list+0xfc/0x2a0 [] dev_mc_add+0x107/0x148 [] igmp_group_added+0x44/0x4b [] ip_mc_inc_group+0x12a/0x150 [] ip_mc_inc_group+0x132/0x150 [] ip_mc_up+0x3e/0x5a [] inetdev_event+0x137/0x2d0 [] rtmsg_ifinfo+0xae/0xe0 [] notifier_call_chain+0x24/0x36 [] dev_open+0x66/0x70 [] dev_change_flags+0x5c/0x12a [] devinet_ioctl+0x2ad/0x6b0 [] sock_ioctl+0x1e6/0x202 [] do_ioctl+0x21/0x79 [] vfs_ioctl+0x28d/0x2b0 [] trace_hardirqs_on+0x12e/0x164 [] sys_ioctl+0x3b/0x5b [] system_call+0x7e/0x83 [] 0xffffffffffffffff -> #2 (&vlan_netdev_xmit_lock_key){-...}: [] __lock_acquire+0xca9/0xf31 [] lock_acquire+0x49/0x6f [] dev_queue_xmit+0x178/0x263 [] _spin_lock+0x1e/0x27 [] dev_queue_xmit+0x178/0x263 [] __pppoe_xmit+0x217/0x23b [] ppp_channel_push+0x43/0xb9 [] ppp_write+0x10a/0x120 [] vfs_write+0xa5/0xf0 [] sys_write+0x45/0x7d [] system_call+0x7e/0x83 [] 0xffffffffffffffff -> #1 (&pch->downl){-...}: [] __lock_acquire+0xca9/0xf31 [] lock_acquire+0x49/0x6f [] ppp_push+0x45/0xaf [] _spin_lock_bh+0x23/0x2c [] ppp_push+0x45/0xaf [] ppp_xmit_process+0x467/0x4f0 [] trace_hardirqs_on+0x12e/0x164 [] ppp_write+0xfa/0x120 [] vfs_write+0xa5/0xf0 [] sys_write+0x45/0x7d [] system_call+0x7e/0x83 [] 0xffffffffffffffff -> #0 (&ppp->wlock){-...}: [] print_stack_trace+0x6d/0x8a [] __lock_acquire+0xb68/0xf31 [] lock_acquire+0x49/0x6f [] ppp_xmit_process+0x20/0x4f0 [] _spin_lock_bh+0x23/0x2c [] ppp_xmit_process+0x20/0x4f0 [] trace_hardirqs_on+0x113/0x164 [] ppp_start_xmit+0x1dc/0x230 [] __qdisc_run+0x105/0x1c3 [] dev_queue_xmit+0x11f/0x263 [] ip_mc_output+0x333/0x370 [] raw_sendmsg+0x560/0x729 [] sock_sendmsg+0xcb/0xe3 [] autoremove_wake_function+0x0/0x34 [] local_bh_enable_ip+0xeb/0xfc [] trace_hardirqs_on+0x12e/0x164 [] ip_setsockopt+0xb1c/0xb3f [] ip_setsockopt+0xb1c/0xb3f [] verify_iovec+0x46/0x94 [] sys_sendmsg+0x1de/0x249 [] _spin_unlock_irqrestore+0x41/0x4a [] getrusage+0x19f/0x1ba [] trace_hardirqs_on+0x12e/0x164 [] trace_hardirqs_on_thunk+0x35/0x37 [] raw_setsockopt+0x0/0x54 [] system_call+0x7e/0x83 [] 0xffffffffffffffff other info that might help us debug this: 1 lock held by ospfd/3984: #0: (&dev->_xmit_lock){-+..}, at: [] __qdisc_run+0x88/0x1c3 stack backtrace: Call Trace: [] print_circular_bug_tail+0x6a/0x7d [] print_stack_trace+0x6d/0x8a [] __lock_acquire+0xb68/0xf31 [] lock_acquire+0x49/0x6f [] ppp_xmit_process+0x20/0x4f0 [] _spin_lock_bh+0x23/0x2c [] ppp_xmit_process+0x20/0x4f0 [] trace_hardirqs_on+0x113/0x164 [] ppp_start_xmit+0x1dc/0x230 [] __qdisc_run+0x105/0x1c3 [] dev_queue_xmit+0x11f/0x263 [] ip_mc_output+0x333/0x370 [] raw_sendmsg+0x560/0x729 [] sock_sendmsg+0xcb/0xe3 [] autoremove_wake_function+0x0/0x34 [] local_bh_enable_ip+0xeb/0xfc [] trace_hardirqs_on+0x12e/0x164 [] ip_setsockopt+0xb1c/0xb3f [] ip_setsockopt+0xb1c/0xb3f [] verify_iovec+0x46/0x94 [] sys_sendmsg+0x1de/0x249 [] _spin_unlock_irqrestore+0x41/0x4a [] getrusage+0x19f/0x1ba [] trace_hardirqs_on+0x12e/0x164 [] trace_hardirqs_on_thunk+0x35/0x37 [] raw_setsockopt+0x0/0x54 [] system_call+0x7e/0x83 Jura ----- Original Message ----- From: "Jarek Poplawski" To: "Yuriy N. Shkandybin" Cc: ; ; ; ; ; ; Sent: Wednesday, May 09, 2007 1:35 PM Subject: [PATCH] vlan: lockdep subclass for ppp _xmit_lock Re: ppp_generic: fix lockdep warning > On Thu, Apr 26, 2007 at 12:49:50PM +0200, Jarek Poplawski wrote: > ... >> But there is also a second, very similar lockdep report, >> probably also false (lockdep cannot see the difference >> between locks of two different, I hope, vlan devices), >> which needs more work: >> a) vlan should use different lockdep lock subclasses or >> classes for each device, which would require quite a lot >> of static memory reserved, probably only to silence >> lockdep, >> b) pppoe could change the way of sending packets, so >> the locks of ppp_generic were not seen by lockdep >> with so many variants; I'm not sure the maintainer of >> pppoe likes this idea; >> >> Doing a) should be enough, I guess; doing b) is easier >> but, probably, the similar is possible elsewhere, too. > ... >> Currently I think about some change in lockdep, to track >> something like different vlans with less memory, but I'm >> not sure it'll work, yet. > > After rethinking there is the 3-rd way (as usual): > > c) vlan should use different lockdep lock subclasses or > classes for different types of devices, used at the same > time. > > This patch is intended for testing, so should be applied > after some confirmation. (It should be safe anyway.) > > If this works, the previous lockdep patch on ppp_generic > should be really superfluous. > > Yuriy, could you try this patch, please? > This is done on 2.6.21, but could be applied to current -mm > or -git, too. If you prefere some other version, let me know. > > Regards, > Jarek P. > ---> > > This patch's aim is to let lockdep see ppp devs as different > from others (default), and it's OK to take: _xmit_lock of vlan > and _xmit_lock of ppp with reverse order provided vlan _xmit_locks > are bound to different devs (ppp and e.g. eth). > >> ======================================================= >> [ INFO: possible circular locking dependency detected ] >> 2.6.21-rc6-mm1 #4 >> ------------------------------------------------------- >> pppd/14305 is trying to acquire lock: >> (&vlan_netdev_xmit_lock_key){-...}, at: [] >> dev_queue_xmit+0x26b/0x300 >> >> but task is already holding lock: >> (&pch->downl#2){-+..}, at: [] ppp_push+0x5f/0xa7 >> >> which lock already depends on the new lock. > > Reported & tested by: "Yuriy N. Shkandybin" > Cc: Ben Greear > Cc: Paul Mackerras > Cc: Michal Ostrowski > Signed-off-by: Jarek Poplawski > > --- > > diff -Nurp 2.6.21-/net/8021q/vlan.c 2.6.21/net/8021q/vlan.c > --- 2.6.21-/net/8021q/vlan.c 2007-05-01 12:43:39.000000000 +0200 > +++ 2.6.21/net/8021q/vlan.c 2007-05-07 21:09:30.000000000 +0200 > @@ -535,7 +535,13 @@ static struct net_device *register_vlan_ > if (register_netdevice(new_dev)) > goto out_free_newdev; > > - lockdep_set_class(&new_dev->_xmit_lock, &vlan_netdev_xmit_lock_key); > + if (unlikely(real_dev->type == ARPHRD_PPP)) > + /* pppoe uses two different kinds of _xmit_lock for ppp & eth */ > + lockdep_set_class_and_subclass(&new_dev->_xmit_lock, > + &vlan_netdev_xmit_lock_key, 1); > + else > + lockdep_set_class(&new_dev->_xmit_lock, > + &vlan_netdev_xmit_lock_key); > > new_dev->iflink = real_dev->ifindex; > vlan_transfer_operstate(real_dev, new_dev); >