* My system stops during startup with curretn git tree of 2.6.25-rc2 @ 2008-02-17 10:55 Zdenek Kabelac 2008-02-17 12:04 ` Jiri Kosina 0 siblings, 1 reply; 13+ messages in thread From: Zdenek Kabelac @ 2008-02-17 10:55 UTC (permalink / raw) To: linux-kernel; +Cc: panther Hi It looks like there is something weird as my systems stops when the swap is mounted. I've played bisection game and this is the commit which makes the system unusable: # bad: [45b503548210fe6f23e92b856421c2a3f05fd034] [RTNETLINK]: Send a single notification on device state changes. git-bisect bad 45b503548210fe6f23e92b856421c2a3f05fd034 I've tried to reverse this commit - and it has compiled & worked. Zdenek ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: My system stops during startup with curretn git tree of 2.6.25-rc2 2008-02-17 10:55 My system stops during startup with curretn git tree of 2.6.25-rc2 Zdenek Kabelac @ 2008-02-17 12:04 ` Jiri Kosina 2008-02-18 2:36 ` David Miller 0 siblings, 1 reply; 13+ messages in thread From: Jiri Kosina @ 2008-02-17 12:04 UTC (permalink / raw) To: Zdenek Kabelac Cc: linux-kernel, panther, Laszlo Attila Toth, David S. Miller, Linus Torvalds On Sun, 17 Feb 2008, Zdenek Kabelac wrote: > It looks like there is something weird as my systems stops when the swap > is mounted. I've played bisection game and this is the commit which > makes the system unusable: > # bad: [45b503548210fe6f23e92b856421c2a3f05fd034] [RTNETLINK]: Send a > single notification on device state changes. > git-bisect bad 45b503548210fe6f23e92b856421c2a3f05fd034 > I've tried to reverse this commit - and it has compiled & worked. This commit is completely broken (it, for example, breaks locking around dev->link_mode), as has been already mentioned by Rafael at http://lkml.org/lkml/2008/2/15/542 Dave, do you have a proper fix queued? Otherwise I would propose just to revert it completely from Linus' tree for now. -- Jiri Kosina ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: My system stops during startup with curretn git tree of 2.6.25-rc2 2008-02-17 12:04 ` Jiri Kosina @ 2008-02-18 2:36 ` David Miller 2008-02-18 13:06 ` Laszlo Attila Toth 0 siblings, 1 reply; 13+ messages in thread From: David Miller @ 2008-02-18 2:36 UTC (permalink / raw) To: jkosina; +Cc: zdenek.kabelac, linux-kernel, panther, torvalds From: Jiri Kosina <jkosina@suse.cz> Date: Sun, 17 Feb 2008 13:04:59 +0100 (CET) > On Sun, 17 Feb 2008, Zdenek Kabelac wrote: > > > It looks like there is something weird as my systems stops when the swap > > is mounted. I've played bisection game and this is the commit which > > makes the system unusable: > > # bad: [45b503548210fe6f23e92b856421c2a3f05fd034] [RTNETLINK]: Send a > > single notification on device state changes. > > git-bisect bad 45b503548210fe6f23e92b856421c2a3f05fd034 > > I've tried to reverse this commit - and it has compiled & worked. > > This commit is completely broken (it, for example, breaks locking around > dev->link_mode), as has been already mentioned by Rafael at > http://lkml.org/lkml/2008/2/15/542 > > Dave, do you have a proper fix queued? Otherwise I would propose just to > revert it completely from Linus' tree for now. I just reverted and I'll push that to Linus. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: My system stops during startup with curretn git tree of 2.6.25-rc2 2008-02-18 2:36 ` David Miller @ 2008-02-18 13:06 ` Laszlo Attila Toth 2008-02-18 13:44 ` Jiri Kosina 0 siblings, 1 reply; 13+ messages in thread From: Laszlo Attila Toth @ 2008-02-18 13:06 UTC (permalink / raw) To: David Miller, zdenek.kabelac; +Cc: jkosina, linux-kernel, torvalds David Miller wrote: > From: Jiri Kosina <jkosina@suse.cz> > Date: Sun, 17 Feb 2008 13:04:59 +0100 (CET) > >> On Sun, 17 Feb 2008, Zdenek Kabelac wrote: >> >>> It looks like there is something weird as my systems stops when the swap >>> is mounted. I've played bisection game and this is the commit which >>> makes the system unusable: >>> # bad: [45b503548210fe6f23e92b856421c2a3f05fd034] [RTNETLINK]: Send a >>> single notification on device state changes. >>> git-bisect bad 45b503548210fe6f23e92b856421c2a3f05fd034 >>> I've tried to reverse this commit - and it has compiled & worked. >> This commit is completely broken (it, for example, breaks locking around >> dev->link_mode), as has been already mentioned by Rafael at >> http://lkml.org/lkml/2008/2/15/542 >> >> Dave, do you have a proper fix queued? Otherwise I would propose just to >> revert it completely from Linus' tree for now. > > I just reverted and I'll push that to Linus. > Okay, but I can't figure out what's the problem with it. I don't have wireless card on my linux box also I can't test it but everything else works. Swap is mounted. The concurrency cannot be a problem because the write operation is protected by a lock. My only idea is the notification itself, because the netdev_state_change function is used which calls rtmsg_ifinfo too. void netdev_state_change(struct net_device *dev) { if (dev->flags & IFF_UP) { call_netdevice_notifiers(NETDEV_CHANGE, dev); rtmsg_ifinfo(RTM_NEWLINK, dev, 0); } } Also last lines of my patch could be instead of if (modified) netdev_state_change(dev); the following: if (modified && dev->flags & IFF_UP) { call_netdevice_notifiers(NETDEV_CHANGE, dev); } Regards, Attila ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: My system stops during startup with curretn git tree of 2.6.25-rc2 2008-02-18 13:06 ` Laszlo Attila Toth @ 2008-02-18 13:44 ` Jiri Kosina 2008-02-18 14:13 ` Laszlo Attila Toth 0 siblings, 1 reply; 13+ messages in thread From: Jiri Kosina @ 2008-02-18 13:44 UTC (permalink / raw) To: Laszlo Attila Toth Cc: David Miller, zdenek.kabelac, linux-kernel, Linus Torvalds On Mon, 18 Feb 2008, Laszlo Attila Toth wrote: > Okay, but I can't figure out what's the problem with it. I don't have > wireless card on my linux box also I can't test it but everything else > works. Swap is mounted. The concurrency cannot be a problem because the > write operation is protected by a lock. - write_lock_bh(&dev_base_lock); - dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]); - write_unlock_bh(&dev_base_lock); + if (dev->link_mode != nla_get_u8(tb[IFLA_LINKMODE])) { + write_lock_bh(&dev_base_lock); + dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]); + write_lock_bh(&dev_base_lock); + modified = 1; + } } 1) you are accessing dev->link_mode and tb[] outside the dev_base_lock 2) there is obvious and immediate deadlock -- you acquire the dev_base_lock twice, without any unlock, just look at the chunk above 3) even with this deadlock fixed, Rafael states that either NM or wpa_supplicant (I don't recall from top of my head) still don't work -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: My system stops during startup with curretn git tree of 2.6.25-rc2 2008-02-18 13:44 ` Jiri Kosina @ 2008-02-18 14:13 ` Laszlo Attila Toth 2008-02-18 16:41 ` Rafael J. Wysocki 0 siblings, 1 reply; 13+ messages in thread From: Laszlo Attila Toth @ 2008-02-18 14:13 UTC (permalink / raw) To: Jiri Kosina; +Cc: David Miller, zdenek.kabelac, linux-kernel, Linus Torvalds Jiri Kosina wrote: > On Mon, 18 Feb 2008, Laszlo Attila Toth wrote: > >> Okay, but I can't figure out what's the problem with it. I don't have >> wireless card on my linux box also I can't test it but everything else >> works. Swap is mounted. The concurrency cannot be a problem because the >> write operation is protected by a lock. > > - write_lock_bh(&dev_base_lock); > - dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]); > - write_unlock_bh(&dev_base_lock); > + if (dev->link_mode != nla_get_u8(tb[IFLA_LINKMODE])) { > + write_lock_bh(&dev_base_lock); > + dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]); > + write_lock_bh(&dev_base_lock); > + modified = 1; > + } > } > > 1) you are accessing dev->link_mode and tb[] outside the dev_base_lock yes, because tb[IFLA_LINKMODE] is not used by someone else in this case only dev->link_mode. Although its value is unpredictable in case of a concurrent access in the condition, it does not affect the final value of dev->link_mode but the length of the critical section remains minimal. The if statement may be inside the lock. > 2) there is obvious and immediate deadlock -- you acquire the > dev_base_lock twice, without any unlock, just look at the chunk above Indeed: "Feb 16 16:51:49 sandman kernel: BUG: rwlock recursion on CPU#0," I missed it. I copied the code from another patch which didn't contain the two locking statements and when I copied them back it became a copy-paste bug. > 3) even with this deadlock fixed, Rafael states that either NM or > wpa_supplicant (I don't recall from top of my head) still don't work That's bad. Does my suggestion solve the problem? Again: - if (modified) - netdev_state_change(dev); + if (modified && dev->flags & IFF_UP) + call_netdevice_notifiers(NETDEV_CHANGE, dev) Regards, Attila ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: My system stops during startup with curretn git tree of 2.6.25-rc2 2008-02-18 14:13 ` Laszlo Attila Toth @ 2008-02-18 16:41 ` Rafael J. Wysocki 2008-02-18 17:03 ` Laszlo Attila Toth 0 siblings, 1 reply; 13+ messages in thread From: Rafael J. Wysocki @ 2008-02-18 16:41 UTC (permalink / raw) To: Laszlo Attila Toth Cc: Jiri Kosina, David Miller, zdenek.kabelac, linux-kernel, Linus Torvalds On Monday, 18 of February 2008, Laszlo Attila Toth wrote: > Jiri Kosina wrote: > > On Mon, 18 Feb 2008, Laszlo Attila Toth wrote: > > > >> Okay, but I can't figure out what's the problem with it. I don't have > >> wireless card on my linux box also I can't test it but everything else > >> works. Swap is mounted. The concurrency cannot be a problem because the > >> write operation is protected by a lock. > > > > - write_lock_bh(&dev_base_lock); > > - dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]); > > - write_unlock_bh(&dev_base_lock); > > + if (dev->link_mode != nla_get_u8(tb[IFLA_LINKMODE])) { > > + write_lock_bh(&dev_base_lock); > > + dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]); > > + write_lock_bh(&dev_base_lock); > > + modified = 1; > > + } > > } > > > > 1) you are accessing dev->link_mode and tb[] outside the dev_base_lock > > yes, because tb[IFLA_LINKMODE] is not used by someone else in this case > only dev->link_mode. Although its value is unpredictable in case of a > concurrent access in the condition, it does not affect the final value > of dev->link_mode but the length of the critical section remains > minimal. The if statement may be inside the lock. > > > 2) there is obvious and immediate deadlock -- you acquire the > > dev_base_lock twice, without any unlock, just look at the chunk above > > Indeed: > "Feb 16 16:51:49 sandman kernel: BUG: rwlock recursion on CPU#0," > > I missed it. I copied the code from another patch which didn't contain > the two locking statements and when I copied them back it became a > copy-paste bug. > > > > 3) even with this deadlock fixed, Rafael states that either NM or > > wpa_supplicant (I don't recall from top of my head) still don't work > > That's bad. Does my suggestion solve the problem? Again: > > - if (modified) > - netdev_state_change(dev); > + if (modified && dev->flags & IFF_UP) > + call_netdevice_notifiers(NETDEV_CHANGE, dev) All in all, I gather you wanted me to test the patch below. :-) Yes, that helps. Thanks, Rafael --- Fix net/core/rtnetlink.c breakage caused by commit 45b503548210fe6f23e92b856421c2a3f05fd034 "[RTNETLINK]: Send a single notification on device state changes." Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> --- net/core/rtnetlink.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) Index: linux-2.6/net/core/rtnetlink.c =================================================================== --- linux-2.6.orig/net/core/rtnetlink.c +++ linux-2.6/net/core/rtnetlink.c @@ -853,7 +853,7 @@ static int do_setlink(struct net_device if (dev->link_mode != nla_get_u8(tb[IFLA_LINKMODE])) { write_lock_bh(&dev_base_lock); dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]); - write_lock_bh(&dev_base_lock); + write_unlock_bh(&dev_base_lock); modified = 1; } } @@ -870,8 +870,8 @@ errout: if (send_addr_notify) call_netdevice_notifiers(NETDEV_CHANGEADDR, dev); - if (modified) - netdev_state_change(dev); + if (modified && dev->flags & IFF_UP) + call_netdevice_notifiers(NETDEV_CHANGE, dev); return err; } ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: My system stops during startup with curretn git tree of 2.6.25-rc2 2008-02-18 16:41 ` Rafael J. Wysocki @ 2008-02-18 17:03 ` Laszlo Attila Toth 2008-02-18 17:06 ` Rafael J. Wysocki 2008-02-19 4:51 ` David Miller 0 siblings, 2 replies; 13+ messages in thread From: Laszlo Attila Toth @ 2008-02-18 17:03 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jiri Kosina, David Miller, zdenek.kabelac, linux-kernel, Linus Torvalds Hello, Rafael J. Wysocki wrote: > On Monday, 18 of February 2008, Laszlo Attila Toth wrote: > > > All in all, I gather you wanted me to test the patch below. :-) > > Yes, that helps. Thanks for the testing. Dave already reverted the patch, also I don't know I should resend it or leave it as is. Regards, Attila > Thanks, > Rafael > > --- > Fix net/core/rtnetlink.c breakage caused by commit > 45b503548210fe6f23e92b856421c2a3f05fd034 > "[RTNETLINK]: Send a single notification on device state changes." > > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> > --- > net/core/rtnetlink.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > Index: linux-2.6/net/core/rtnetlink.c > =================================================================== > --- linux-2.6.orig/net/core/rtnetlink.c > +++ linux-2.6/net/core/rtnetlink.c > @@ -853,7 +853,7 @@ static int do_setlink(struct net_device > if (dev->link_mode != nla_get_u8(tb[IFLA_LINKMODE])) { > write_lock_bh(&dev_base_lock); > dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]); > - write_lock_bh(&dev_base_lock); > + write_unlock_bh(&dev_base_lock); > modified = 1; > } > } > @@ -870,8 +870,8 @@ errout: > if (send_addr_notify) > call_netdevice_notifiers(NETDEV_CHANGEADDR, dev); > > - if (modified) > - netdev_state_change(dev); > + if (modified && dev->flags & IFF_UP) > + call_netdevice_notifiers(NETDEV_CHANGE, dev); > > return err; > } > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: My system stops during startup with curretn git tree of 2.6.25-rc2 2008-02-18 17:03 ` Laszlo Attila Toth @ 2008-02-18 17:06 ` Rafael J. Wysocki 2008-02-19 4:51 ` David Miller 1 sibling, 0 replies; 13+ messages in thread From: Rafael J. Wysocki @ 2008-02-18 17:06 UTC (permalink / raw) To: Laszlo Attila Toth Cc: Jiri Kosina, David Miller, zdenek.kabelac, linux-kernel, Linus Torvalds On Monday, 18 of February 2008, Laszlo Attila Toth wrote: > Hello, > > Rafael J. Wysocki wrote: > > On Monday, 18 of February 2008, Laszlo Attila Toth wrote: > > > > > > All in all, I gather you wanted me to test the patch below. :-) > > > > Yes, that helps. > > Thanks for the testing. Dave already reverted the patch, also I don't > know I should resend it or leave it as is. Since the patch has been reverted, I think you can fold the fix into it and send it again for the next merge window (unless it's _really_ urgent). Of course, if that's fine by Dave. :-) Thanks, Rafael ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: My system stops during startup with curretn git tree of 2.6.25-rc2 2008-02-18 17:03 ` Laszlo Attila Toth 2008-02-18 17:06 ` Rafael J. Wysocki @ 2008-02-19 4:51 ` David Miller 2008-02-19 10:54 ` Jens Axboe 1 sibling, 1 reply; 13+ messages in thread From: David Miller @ 2008-02-19 4:51 UTC (permalink / raw) To: panther; +Cc: rjw, jkosina, zdenek.kabelac, linux-kernel, torvalds From: Laszlo Attila Toth <panther@balabit.hu> Date: Mon, 18 Feb 2008 18:03:47 +0100 > Hello, > > Rafael J. Wysocki wrote: > > On Monday, 18 of February 2008, Laszlo Attila Toth wrote: > > > > > > All in all, I gather you wanted me to test the patch below. :-) > > > > Yes, that helps. > > Thanks for the testing. Dave already reverted the patch, also I don't > know I should resend it or leave it as is. After this sits and gets testing and feedback for a few days you'll need to resubmit the entire original patch with these fixes added on top. For now, it's staying reverted. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: My system stops during startup with curretn git tree of 2.6.25-rc2 2008-02-19 4:51 ` David Miller @ 2008-02-19 10:54 ` Jens Axboe 2008-02-19 11:11 ` David Miller 0 siblings, 1 reply; 13+ messages in thread From: Jens Axboe @ 2008-02-19 10:54 UTC (permalink / raw) To: David Miller Cc: panther, rjw, jkosina, zdenek.kabelac, linux-kernel, torvalds On Mon, Feb 18 2008, David Miller wrote: > From: Laszlo Attila Toth <panther@balabit.hu> > Date: Mon, 18 Feb 2008 18:03:47 +0100 > > > Hello, > > > > Rafael J. Wysocki wrote: > > > On Monday, 18 of February 2008, Laszlo Attila Toth wrote: > > > > > > > > > All in all, I gather you wanted me to test the patch below. :-) > > > > > > Yes, that helps. > > > > Thanks for the testing. Dave already reverted the patch, also I don't > > know I should resend it or leave it as is. > > After this sits and gets testing and feedback for a few days > you'll need to resubmit the entire original patch with these > fixes added on top. > > For now, it's staying reverted. Can we please get the revert pushed out please? It renders my laptop unusable and I keep forgetting to revert that pesky change when testing new kernels. -- Jens Axboe ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: My system stops during startup with curretn git tree of 2.6.25-rc2 2008-02-19 10:54 ` Jens Axboe @ 2008-02-19 11:11 ` David Miller 2008-02-19 11:12 ` Jens Axboe 0 siblings, 1 reply; 13+ messages in thread From: David Miller @ 2008-02-19 11:11 UTC (permalink / raw) To: jens.axboe; +Cc: panther, rjw, jkosina, zdenek.kabelac, linux-kernel, torvalds From: Jens Axboe <jens.axboe@oracle.com> Date: Tue, 19 Feb 2008 11:54:19 +0100 > On Mon, Feb 18 2008, David Miller wrote: > > From: Laszlo Attila Toth <panther@balabit.hu> > > Date: Mon, 18 Feb 2008 18:03:47 +0100 > > > > > Hello, > > > > > > Rafael J. Wysocki wrote: > > > > On Monday, 18 of February 2008, Laszlo Attila Toth wrote: > > > > > > > > > > > > All in all, I gather you wanted me to test the patch below. :-) > > > > > > > > Yes, that helps. > > > > > > Thanks for the testing. Dave already reverted the patch, also I don't > > > know I should resend it or leave it as is. > > > > After this sits and gets testing and feedback for a few days > > you'll need to resubmit the entire original patch with these > > fixes added on top. > > > > For now, it's staying reverted. > > Can we please get the revert pushed out please? It renders my laptop > unusable and I keep forgetting to revert that pesky change when testing > new kernels. I sent a push to Linus an hour or so ago, he should pick it up in the morning. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: My system stops during startup with curretn git tree of 2.6.25-rc2 2008-02-19 11:11 ` David Miller @ 2008-02-19 11:12 ` Jens Axboe 0 siblings, 0 replies; 13+ messages in thread From: Jens Axboe @ 2008-02-19 11:12 UTC (permalink / raw) To: David Miller Cc: panther, rjw, jkosina, zdenek.kabelac, linux-kernel, torvalds On Tue, Feb 19 2008, David Miller wrote: > From: Jens Axboe <jens.axboe@oracle.com> > Date: Tue, 19 Feb 2008 11:54:19 +0100 > > > On Mon, Feb 18 2008, David Miller wrote: > > > From: Laszlo Attila Toth <panther@balabit.hu> > > > Date: Mon, 18 Feb 2008 18:03:47 +0100 > > > > > > > Hello, > > > > > > > > Rafael J. Wysocki wrote: > > > > > On Monday, 18 of February 2008, Laszlo Attila Toth wrote: > > > > > > > > > > > > > > > All in all, I gather you wanted me to test the patch below. :-) > > > > > > > > > > Yes, that helps. > > > > > > > > Thanks for the testing. Dave already reverted the patch, also I don't > > > > know I should resend it or leave it as is. > > > > > > After this sits and gets testing and feedback for a few days > > > you'll need to resubmit the entire original patch with these > > > fixes added on top. > > > > > > For now, it's staying reverted. > > > > Can we please get the revert pushed out please? It renders my laptop > > unusable and I keep forgetting to revert that pesky change when testing > > new kernels. > > I sent a push to Linus an hour or so ago, he should pick it up > in the morning. Great, thanks! -- Jens Axboe ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2008-02-19 11:12 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-02-17 10:55 My system stops during startup with curretn git tree of 2.6.25-rc2 Zdenek Kabelac 2008-02-17 12:04 ` Jiri Kosina 2008-02-18 2:36 ` David Miller 2008-02-18 13:06 ` Laszlo Attila Toth 2008-02-18 13:44 ` Jiri Kosina 2008-02-18 14:13 ` Laszlo Attila Toth 2008-02-18 16:41 ` Rafael J. Wysocki 2008-02-18 17:03 ` Laszlo Attila Toth 2008-02-18 17:06 ` Rafael J. Wysocki 2008-02-19 4:51 ` David Miller 2008-02-19 10:54 ` Jens Axboe 2008-02-19 11:11 ` David Miller 2008-02-19 11:12 ` Jens Axboe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox