* [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev @ 2016-10-05 14:12 Anoob Soman 2016-10-07 0:50 ` David Miller 0 siblings, 1 reply; 13+ messages in thread From: Anoob Soman @ 2016-10-05 14:12 UTC (permalink / raw) To: netdev; +Cc: Anoob Soman If a socket has FANOUT sockopt set, a new proto_hook is registered as part of fanout_add(). When processing a NETDEV_UNREGISTER event in af_packet, __fanout_unlink is called for all sockets, but prot_hook which was registered as part of fanout_add is not removed. Call fanout_release, on a NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the fanout_list. This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo() Signed-off-by: Anoob Soman <anoob.soman@citrix.com> --- net/packet/af_packet.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 33a4697..11db0d6 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -3952,6 +3952,7 @@ static int packet_notifier(struct notifier_block *this, } if (msg == NETDEV_UNREGISTER) { packet_cached_dev_reset(po); + fanout_release(sk); po->ifindex = -1; if (po->prot_hook.dev) dev_put(po->prot_hook.dev); -- 2.7.4 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev 2016-10-05 14:12 [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev Anoob Soman @ 2016-10-07 0:50 ` David Miller 2017-01-30 17:26 ` Eric Dumazet 0 siblings, 1 reply; 13+ messages in thread From: David Miller @ 2016-10-07 0:50 UTC (permalink / raw) To: anoob.soman; +Cc: netdev From: Anoob Soman <anoob.soman@citrix.com> Date: Wed, 5 Oct 2016 15:12:54 +0100 > If a socket has FANOUT sockopt set, a new proto_hook is registered > as part of fanout_add(). When processing a NETDEV_UNREGISTER event in > af_packet, __fanout_unlink is called for all sockets, but prot_hook which was > registered as part of fanout_add is not removed. Call fanout_release, on a > NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the > fanout_list. > > This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo() > > Signed-off-by: Anoob Soman <anoob.soman@citrix.com> Applied and queued up for -stable, thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev 2016-10-07 0:50 ` David Miller @ 2017-01-30 17:26 ` Eric Dumazet 2017-01-30 18:19 ` Anoob Soman 2017-01-30 19:08 ` Anoob Soman 0 siblings, 2 replies; 13+ messages in thread From: Eric Dumazet @ 2017-01-30 17:26 UTC (permalink / raw) To: David Miller; +Cc: anoob.soman, netdev On Thu, 2016-10-06 at 20:50 -0400, David Miller wrote: > From: Anoob Soman <anoob.soman@citrix.com> > Date: Wed, 5 Oct 2016 15:12:54 +0100 > > > If a socket has FANOUT sockopt set, a new proto_hook is registered > > as part of fanout_add(). When processing a NETDEV_UNREGISTER event in > > af_packet, __fanout_unlink is called for all sockets, but prot_hook which was > > registered as part of fanout_add is not removed. Call fanout_release, on a > > NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the > > fanout_list. > > > > This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo() > > > > Signed-off-by: Anoob Soman <anoob.soman@citrix.com> > > Applied and queued up for -stable, thanks. This commit (6664498280cf "packet: call fanout_release, while UNREGISTERING a netdev") looks buggy : We end up calling fanout_release() while holding a spinlock ( spin_lock(&po->bind_lock); ) But fanout_release() grabs a mutex ( mutex_lock(&fanout_mutex) ), and this is absolutely not valid while holding a spinlock. Anoob, can you cook a fix, I guess you have a way to reproduce the thing that wanted a kernel patch ? (Please build your test kernel with CONFIG_LOCKDEP=y) Thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev 2017-01-30 17:26 ` Eric Dumazet @ 2017-01-30 18:19 ` Anoob Soman 2017-01-30 19:08 ` Anoob Soman 1 sibling, 0 replies; 13+ messages in thread From: Anoob Soman @ 2017-01-30 18:19 UTC (permalink / raw) To: Eric Dumazet, David Miller; +Cc: netdev On 30/01/17 17:26, Eric Dumazet wrote: > On Thu, 2016-10-06 at 20:50 -0400, David Miller wrote: >> From: Anoob Soman <anoob.soman@citrix.com> >> Date: Wed, 5 Oct 2016 15:12:54 +0100 >> >>> If a socket has FANOUT sockopt set, a new proto_hook is registered >>> as part of fanout_add(). When processing a NETDEV_UNREGISTER event in >>> af_packet, __fanout_unlink is called for all sockets, but prot_hook which was >>> registered as part of fanout_add is not removed. Call fanout_release, on a >>> NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the >>> fanout_list. >>> >>> This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo() >>> >>> Signed-off-by: Anoob Soman <anoob.soman@citrix.com> >> Applied and queued up for -stable, thanks. > This commit (6664498280cf "packet: call fanout_release, while > UNREGISTERING a netdev") > looks buggy : > > We end up calling fanout_release() while holding a spinlock > ( spin_lock(&po->bind_lock); ) > > But fanout_release() grabs a mutex ( mutex_lock(&fanout_mutex) ), and > this is absolutely not valid while holding a spinlock. Yes correct, that is wrong. > > Anoob, can you cook a fix, I guess you have a way to reproduce the thing > that wanted a kernel patch ? > > (Please build your test kernel with CONFIG_LOCKDEP=y) Sure, I am planning to move fanout_release(sk) after spin_unlock(bond_lock). Something like this. } if (msg == NETDEV_UNREGISTER) { packet_cached_dev_reset(po); - fanout_release(sk); po->ifindex = -1; if (po->prot_hook.dev) dev_put(po->prot_hook.dev); po->prot_hook.dev = NULL; } spin_unlock(&po->bind_lock); + if (msg == NETDEV_UNREGISTER) { + fanout_release(sk); + } } break; I will quickly test it out. > Thanks. > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev 2017-01-30 17:26 ` Eric Dumazet 2017-01-30 18:19 ` Anoob Soman @ 2017-01-30 19:08 ` Anoob Soman 2017-01-30 19:44 ` Eric Dumazet 1 sibling, 1 reply; 13+ messages in thread From: Anoob Soman @ 2017-01-30 19:08 UTC (permalink / raw) To: Eric Dumazet, David Miller; +Cc: netdev On 30/01/17 17:26, Eric Dumazet wrote: > On Thu, 2016-10-06 at 20:50 -0400, David Miller wrote: >> From: Anoob Soman <anoob.soman@citrix.com> >> Date: Wed, 5 Oct 2016 15:12:54 +0100 >> >>> If a socket has FANOUT sockopt set, a new proto_hook is registered >>> as part of fanout_add(). When processing a NETDEV_UNREGISTER event in >>> af_packet, __fanout_unlink is called for all sockets, but prot_hook which was >>> registered as part of fanout_add is not removed. Call fanout_release, on a >>> NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the >>> fanout_list. >>> >>> This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo() >>> >>> Signed-off-by: Anoob Soman <anoob.soman@citrix.com> >> Applied and queued up for -stable, thanks. > This commit (6664498280cf "packet: call fanout_release, while > UNREGISTERING a netdev") > looks buggy : > > We end up calling fanout_release() while holding a spinlock > ( spin_lock(&po->bind_lock); ) > > But fanout_release() grabs a mutex ( mutex_lock(&fanout_mutex) ), and > this is absolutely not valid while holding a spinlock. Yes, that is wrong. > > Anoob, can you cook a fix, I guess you have a way to reproduce the thing > that wanted a kernel patch ? > > (Please build your test kernel with CONFIG_LOCKDEP=y) Sure, I am planning to move fanout_release(sk) after spin_unlock(bind_lock). Something like this. } if (msg == NETDEV_UNREGISTER) { packet_cached_dev_reset(po); - fanout_release(sk); po->ifindex = -1; if (po->prot_hook.dev) dev_put(po->prot_hook.dev); po->prot_hook.dev = NULL; } spin_unlock(&po->bind_lock); + if (msg == NETDEV_UNREGISTER) { + fanout_release(sk); + } } break; I will quickly test it out. > Thanks. > > Thanks, Anoob. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev 2017-01-30 19:08 ` Anoob Soman @ 2017-01-30 19:44 ` Eric Dumazet 2017-01-31 17:03 ` Anoob Soman 0 siblings, 1 reply; 13+ messages in thread From: Eric Dumazet @ 2017-01-30 19:44 UTC (permalink / raw) To: Anoob Soman; +Cc: David Miller, netdev On Mon, 2017-01-30 at 19:08 +0000, Anoob Soman wrote: > > On 30/01/17 17:26, Eric Dumazet wrote: > > On Thu, 2016-10-06 at 20:50 -0400, David Miller wrote: > >> From: Anoob Soman <anoob.soman@citrix.com> > >> Date: Wed, 5 Oct 2016 15:12:54 +0100 > >> > >>> If a socket has FANOUT sockopt set, a new proto_hook is registered > >>> as part of fanout_add(). When processing a NETDEV_UNREGISTER event in > >>> af_packet, __fanout_unlink is called for all sockets, but prot_hook which was > >>> registered as part of fanout_add is not removed. Call fanout_release, on a > >>> NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the > >>> fanout_list. > >>> > >>> This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo() > >>> > >>> Signed-off-by: Anoob Soman <anoob.soman@citrix.com> > >> Applied and queued up for -stable, thanks. > > This commit (6664498280cf "packet: call fanout_release, while > > UNREGISTERING a netdev") > > looks buggy : > > > > We end up calling fanout_release() while holding a spinlock > > ( spin_lock(&po->bind_lock); ) > > > > But fanout_release() grabs a mutex ( mutex_lock(&fanout_mutex) ), and > > this is absolutely not valid while holding a spinlock. > > Yes, that is wrong. > > > > > Anoob, can you cook a fix, I guess you have a way to reproduce the thing > > that wanted a kernel patch ? > > > > (Please build your test kernel with CONFIG_LOCKDEP=y) > > Sure, I am planning to move fanout_release(sk) after > spin_unlock(bind_lock). Something like this. > } > if (msg == NETDEV_UNREGISTER) { > packet_cached_dev_reset(po); > - fanout_release(sk); > po->ifindex = -1; > if (po->prot_hook.dev) > dev_put(po->prot_hook.dev); > po->prot_hook.dev = NULL; > } > spin_unlock(&po->bind_lock); > + if (msg == NETDEV_UNREGISTER) { > + fanout_release(sk); > + } > } > break; > > I will quickly test it out. It wont be enough. You need to also fix a race if two cpus call fanout_release(sk) at the same time. Thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev 2017-01-30 19:44 ` Eric Dumazet @ 2017-01-31 17:03 ` Anoob Soman 2017-01-31 18:00 ` Eric Dumazet 0 siblings, 1 reply; 13+ messages in thread From: Anoob Soman @ 2017-01-31 17:03 UTC (permalink / raw) To: Eric Dumazet; +Cc: David Miller, netdev On 30/01/17 19:44, Eric Dumazet wrote: > On Mon, 2017-01-30 at 19:08 +0000, Anoob Soman wrote: >> On 30/01/17 17:26, Eric Dumazet wrote: >>> On Thu, 2016-10-06 at 20:50 -0400, David Miller wrote: >>>> From: Anoob Soman <anoob.soman@citrix.com> >>>> Date: Wed, 5 Oct 2016 15:12:54 +0100 >>>> >>>>> If a socket has FANOUT sockopt set, a new proto_hook is registered >>>>> as part of fanout_add(). When processing a NETDEV_UNREGISTER event in >>>>> af_packet, __fanout_unlink is called for all sockets, but prot_hook which was >>>>> registered as part of fanout_add is not removed. Call fanout_release, on a >>>>> NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the >>>>> fanout_list. >>>>> >>>>> This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo() >>>>> >>>>> Signed-off-by: Anoob Soman <anoob.soman@citrix.com> >>>> Applied and queued up for -stable, thanks. >>> This commit (6664498280cf "packet: call fanout_release, while >>> UNREGISTERING a netdev") >>> looks buggy : >>> >>> We end up calling fanout_release() while holding a spinlock >>> ( spin_lock(&po->bind_lock); ) >>> >>> But fanout_release() grabs a mutex ( mutex_lock(&fanout_mutex) ), and >>> this is absolutely not valid while holding a spinlock. >> Yes, that is wrong. >> >>> Anoob, can you cook a fix, I guess you have a way to reproduce the thing >>> that wanted a kernel patch ? >>> >>> (Please build your test kernel with CONFIG_LOCKDEP=y) >> Sure, I am planning to move fanout_release(sk) after >> spin_unlock(bind_lock). Something like this. >> } >> if (msg == NETDEV_UNREGISTER) { >> packet_cached_dev_reset(po); >> - fanout_release(sk); >> po->ifindex = -1; >> if (po->prot_hook.dev) >> dev_put(po->prot_hook.dev); >> po->prot_hook.dev = NULL; >> } >> spin_unlock(&po->bind_lock); >> + if (msg == NETDEV_UNREGISTER) { >> + fanout_release(sk); >> + } >> } >> break; >> >> I will quickly test it out. > It wont be enough. > > You need to also fix a race if two cpus call fanout_release(sk) at the > same time. > > Thanks. > > > Hi Eric, I have ran into some problem trying to enable CONFIG_LOCKDEP. I think this particular scenario, taking mutex_lock() while holding a spin_lock debugging, requires CONFIG_DEBUG_ATOMIC_SLEEP to be enabled. CONFIG_DEBUG_ATOMIC_SLEEP, selects CONFIG_PREEMPT_COUNT and my kernel doesn't behave well if PREEMPTION is enabled. I am trying to reproduce this issue in a way that I might be able to use debug_atomic_sleep. Meanwhile, I have modified patch fix the race. @@ -1722,18 +1722,20 @@ static void fanout_release(struct sock *sk) struct packet_sock *po = pkt_sk(sk); struct packet_fanout *f; + mutex_lock(&fanout_mutex); + f = po->fanout; - if (!f) + if (!f) { + mutex_unlock(&fanout_mutex); return; - - mutex_lock(&fanout_mutex); - po->fanout = NULL; + } if (atomic_dec_and_test(&f->sk_ref)) { list_del(&f->list); dev_remove_pack(&f->prot_hook); fanout_release_data(f); kfree(f); + po->fanout = NULL; } mutex_unlock(&fanout_mutex); @@ -3855,13 +3857,14 @@ static int packet_notifier(struct notifier_block *this, } if (msg == NETDEV_UNREGISTER) { packet_cached_dev_reset(po); - fanout_release(sk); po->ifindex = -1; if (po->prot_hook.dev) dev_put(po->prot_hook.dev); po->prot_hook.dev = NULL; } spin_unlock(&po->bind_lock); + if (msg == NETDEV_UNREGISTER) + fanout_release(sk); } break; case NETDEV_UP: Thanks, Anoob. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev 2017-01-31 17:03 ` Anoob Soman @ 2017-01-31 18:00 ` Eric Dumazet 2017-01-31 18:14 ` Anoob Soman 0 siblings, 1 reply; 13+ messages in thread From: Eric Dumazet @ 2017-01-31 18:00 UTC (permalink / raw) To: Anoob Soman; +Cc: David Miller, netdev On Tue, 2017-01-31 at 17:03 +0000, Anoob Soman wrote: > > On 30/01/17 19:44, Eric Dumazet wrote: > > On Mon, 2017-01-30 at 19:08 +0000, Anoob Soman wrote: > >> On 30/01/17 17:26, Eric Dumazet wrote: > >>> On Thu, 2016-10-06 at 20:50 -0400, David Miller wrote: > >>>> From: Anoob Soman <anoob.soman@citrix.com> > >>>> Date: Wed, 5 Oct 2016 15:12:54 +0100 > >>>> > >>>>> If a socket has FANOUT sockopt set, a new proto_hook is registered > >>>>> as part of fanout_add(). When processing a NETDEV_UNREGISTER event in > >>>>> af_packet, __fanout_unlink is called for all sockets, but prot_hook which was > >>>>> registered as part of fanout_add is not removed. Call fanout_release, on a > >>>>> NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the > >>>>> fanout_list. > >>>>> > >>>>> This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo() > >>>>> > >>>>> Signed-off-by: Anoob Soman <anoob.soman@citrix.com> > >>>> Applied and queued up for -stable, thanks. > >>> This commit (6664498280cf "packet: call fanout_release, while > >>> UNREGISTERING a netdev") > >>> looks buggy : > >>> > >>> We end up calling fanout_release() while holding a spinlock > >>> ( spin_lock(&po->bind_lock); ) > >>> > >>> But fanout_release() grabs a mutex ( mutex_lock(&fanout_mutex) ), and > >>> this is absolutely not valid while holding a spinlock. > >> Yes, that is wrong. > >> > >>> Anoob, can you cook a fix, I guess you have a way to reproduce the thing > >>> that wanted a kernel patch ? > >>> > >>> (Please build your test kernel with CONFIG_LOCKDEP=y) > >> Sure, I am planning to move fanout_release(sk) after > >> spin_unlock(bind_lock). Something like this. > >> } > >> if (msg == NETDEV_UNREGISTER) { > >> packet_cached_dev_reset(po); > >> - fanout_release(sk); > >> po->ifindex = -1; > >> if (po->prot_hook.dev) > >> dev_put(po->prot_hook.dev); > >> po->prot_hook.dev = NULL; > >> } > >> spin_unlock(&po->bind_lock); > >> + if (msg == NETDEV_UNREGISTER) { > >> + fanout_release(sk); > >> + } > >> } > >> break; > >> > >> I will quickly test it out. > > It wont be enough. > > > > You need to also fix a race if two cpus call fanout_release(sk) at the > > same time. > > > > Thanks. > > > > > > > Hi Eric, > > I have ran into some problem trying to enable CONFIG_LOCKDEP. I think > this particular scenario, taking mutex_lock() while holding a spin_lock > debugging, requires CONFIG_DEBUG_ATOMIC_SLEEP to be enabled. > CONFIG_DEBUG_ATOMIC_SLEEP, selects CONFIG_PREEMPT_COUNT and my kernel > doesn't behave well if PREEMPTION is enabled. I am trying to reproduce > this issue in a way that I might be able to use debug_atomic_sleep. > > Meanwhile, I have modified patch fix the race. So you can definitely have in a .config all these at the same time (LOCKDEP, non PREEMPT, and DEBUG_ATOMIC_SLEEP) $ egrep "DEBUG_ATOMIC_SLEEP|PREEMPT|LOCKDEP" .config CONFIG_LOCKDEP_SUPPORT=y CONFIG_PREEMPT_NOTIFIERS=y CONFIG_PREEMPT_NONE=y # CONFIG_PREEMPT_VOLUNTARY is not set # CONFIG_PREEMPT is not set CONFIG_PREEMPT_COUNT=y CONFIG_LOCKDEP=y # CONFIG_DEBUG_LOCKDEP is not set CONFIG_DEBUG_ATOMIC_SLEEP=y ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev 2017-01-31 18:00 ` Eric Dumazet @ 2017-01-31 18:14 ` Anoob Soman 2017-02-02 14:42 ` Anoob Soman 0 siblings, 1 reply; 13+ messages in thread From: Anoob Soman @ 2017-01-31 18:14 UTC (permalink / raw) To: Eric Dumazet; +Cc: David Miller, netdev On 31/01/17 18:00, Eric Dumazet wrote: > On Tue, 2017-01-31 at 17:03 +0000, Anoob Soman wrote: >> On 30/01/17 19:44, Eric Dumazet wrote: >>> On Mon, 2017-01-30 at 19:08 +0000, Anoob Soman wrote: >>>> On 30/01/17 17:26, Eric Dumazet wrote: >>>>> On Thu, 2016-10-06 at 20:50 -0400, David Miller wrote: >>>>>> From: Anoob Soman <anoob.soman@citrix.com> >>>>>> Date: Wed, 5 Oct 2016 15:12:54 +0100 >>>>>> >>>>>>> If a socket has FANOUT sockopt set, a new proto_hook is registered >>>>>>> as part of fanout_add(). When processing a NETDEV_UNREGISTER event in >>>>>>> af_packet, __fanout_unlink is called for all sockets, but prot_hook which was >>>>>>> registered as part of fanout_add is not removed. Call fanout_release, on a >>>>>>> NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the >>>>>>> fanout_list. >>>>>>> >>>>>>> This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo() >>>>>>> >>>>>>> Signed-off-by: Anoob Soman <anoob.soman@citrix.com> >>>>>> Applied and queued up for -stable, thanks. >>>>> This commit (6664498280cf "packet: call fanout_release, while >>>>> UNREGISTERING a netdev") >>>>> looks buggy : >>>>> >>>>> We end up calling fanout_release() while holding a spinlock >>>>> ( spin_lock(&po->bind_lock); ) >>>>> >>>>> But fanout_release() grabs a mutex ( mutex_lock(&fanout_mutex) ), and >>>>> this is absolutely not valid while holding a spinlock. >>>> Yes, that is wrong. >>>> >>>>> Anoob, can you cook a fix, I guess you have a way to reproduce the thing >>>>> that wanted a kernel patch ? >>>>> >>>>> (Please build your test kernel with CONFIG_LOCKDEP=y) >>>> Sure, I am planning to move fanout_release(sk) after >>>> spin_unlock(bind_lock). Something like this. >>>> } >>>> if (msg == NETDEV_UNREGISTER) { >>>> packet_cached_dev_reset(po); >>>> - fanout_release(sk); >>>> po->ifindex = -1; >>>> if (po->prot_hook.dev) >>>> dev_put(po->prot_hook.dev); >>>> po->prot_hook.dev = NULL; >>>> } >>>> spin_unlock(&po->bind_lock); >>>> + if (msg == NETDEV_UNREGISTER) { >>>> + fanout_release(sk); >>>> + } >>>> } >>>> break; >>>> >>>> I will quickly test it out. >>> It wont be enough. >>> >>> You need to also fix a race if two cpus call fanout_release(sk) at the >>> same time. >>> >>> Thanks. >>> >>> >>> >> Hi Eric, >> >> I have ran into some problem trying to enable CONFIG_LOCKDEP. I think >> this particular scenario, taking mutex_lock() while holding a spin_lock >> debugging, requires CONFIG_DEBUG_ATOMIC_SLEEP to be enabled. >> CONFIG_DEBUG_ATOMIC_SLEEP, selects CONFIG_PREEMPT_COUNT and my kernel >> doesn't behave well if PREEMPTION is enabled. I am trying to reproduce >> this issue in a way that I might be able to use debug_atomic_sleep. >> >> Meanwhile, I have modified patch fix the race. > > So you can definitely have in a .config all these at the same time > (LOCKDEP, non PREEMPT, and DEBUG_ATOMIC_SLEEP) > > > > > $ egrep "DEBUG_ATOMIC_SLEEP|PREEMPT|LOCKDEP" .config > CONFIG_LOCKDEP_SUPPORT=y > CONFIG_PREEMPT_NOTIFIERS=y > CONFIG_PREEMPT_NONE=y > # CONFIG_PREEMPT_VOLUNTARY is not set > # CONFIG_PREEMPT is not set > CONFIG_PREEMPT_COUNT=y > CONFIG_LOCKDEP=y > # CONFIG_DEBUG_LOCKDEP is not set > CONFIG_DEBUG_ATOMIC_SLEEP=y > yes, thats exactly what I have. $ egrep "DEBUG_ATOMIC_SLEEP|PREEMPT|LOCKDEP" .config CONFIG_LOCKDEP_SUPPORT=y # CONFIG_PREEMPT_NONE is not set CONFIG_PREEMPT_VOLUNTARY=y # CONFIG_PREEMPT is not set CONFIG_PREEMPT_COUNT=y CONFIG_LOCKDEP=y # CONFIG_DEBUG_LOCKDEP is not set CONFIG_DEBUG_ATOMIC_SLEEP=y I initially thought CONFIG_PREEMPT_COUNT enables CONFIG_PREEMPT, but looks like all it does is to inc/dec preempt_count. Let me give the test a spin again, and see why everything seems to fall apart. > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev 2017-01-31 18:14 ` Anoob Soman @ 2017-02-02 14:42 ` Anoob Soman 2017-02-02 15:53 ` Eric Dumazet 0 siblings, 1 reply; 13+ messages in thread From: Anoob Soman @ 2017-02-02 14:42 UTC (permalink / raw) To: Eric Dumazet; +Cc: David Miller, netdev On 31/01/17 18:14, Anoob Soman wrote: > > > On 31/01/17 18:00, Eric Dumazet wrote: >> On Tue, 2017-01-31 at 17:03 +0000, Anoob Soman wrote: >>> On 30/01/17 19:44, Eric Dumazet wrote: >>>> On Mon, 2017-01-30 at 19:08 +0000, Anoob Soman wrote: >>>>> On 30/01/17 17:26, Eric Dumazet wrote: >>>>>> On Thu, 2016-10-06 at 20:50 -0400, David Miller wrote: >>>>>>> From: Anoob Soman <anoob.soman@citrix.com> >>>>>>> Date: Wed, 5 Oct 2016 15:12:54 +0100 >>>>>>> >>>>>>>> If a socket has FANOUT sockopt set, a new proto_hook is registered >>>>>>>> as part of fanout_add(). When processing a NETDEV_UNREGISTER >>>>>>>> event in >>>>>>>> af_packet, __fanout_unlink is called for all sockets, but >>>>>>>> prot_hook which was >>>>>>>> registered as part of fanout_add is not removed. Call >>>>>>>> fanout_release, on a >>>>>>>> NETDEV_UNREGISTER, which removes prot_hook and removes fanout >>>>>>>> from the >>>>>>>> fanout_list. >>>>>>>> >>>>>>>> This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in >>>>>>>> netdev_run_todo() >>>>>>>> >>>>>>>> Signed-off-by: Anoob Soman <anoob.soman@citrix.com> >>>>>>> Applied and queued up for -stable, thanks. >>>>>> This commit (6664498280cf "packet: call fanout_release, while >>>>>> UNREGISTERING a netdev") >>>>>> looks buggy : >>>>>> >>>>>> We end up calling fanout_release() while holding a spinlock >>>>>> ( spin_lock(&po->bind_lock); ) >>>>>> >>>>>> But fanout_release() grabs a mutex ( mutex_lock(&fanout_mutex) ), >>>>>> and >>>>>> this is absolutely not valid while holding a spinlock. >>>>> Yes, that is wrong. >>>>> >>>>>> Anoob, can you cook a fix, I guess you have a way to reproduce >>>>>> the thing >>>>>> that wanted a kernel patch ? >>>>>> >>>>>> (Please build your test kernel with CONFIG_LOCKDEP=y) >>>>> Sure, I am planning to move fanout_release(sk) after >>>>> spin_unlock(bind_lock). Something like this. >>>>> } >>>>> if (msg == NETDEV_UNREGISTER) { >>>>> packet_cached_dev_reset(po); >>>>> - fanout_release(sk); >>>>> po->ifindex = -1; >>>>> if (po->prot_hook.dev) >>>>> dev_put(po->prot_hook.dev); >>>>> po->prot_hook.dev = NULL; >>>>> } >>>>> spin_unlock(&po->bind_lock); >>>>> + if (msg == NETDEV_UNREGISTER) { >>>>> + fanout_release(sk); >>>>> + } >>>>> } >>>>> break; >>>>> >>>>> I will quickly test it out. >>>> It wont be enough. >>>> >>>> You need to also fix a race if two cpus call fanout_release(sk) at the >>>> same time. >>>> >>>> Thanks. >>>> >>>> >>>> >>> Hi Eric, >>> >>> I have ran into some problem trying to enable CONFIG_LOCKDEP. I think >>> this particular scenario, taking mutex_lock() while holding a spin_lock >>> debugging, requires CONFIG_DEBUG_ATOMIC_SLEEP to be enabled. >>> CONFIG_DEBUG_ATOMIC_SLEEP, selects CONFIG_PREEMPT_COUNT and my kernel >>> doesn't behave well if PREEMPTION is enabled. I am trying to reproduce >>> this issue in a way that I might be able to use debug_atomic_sleep. >>> >>> Meanwhile, I have modified patch fix the race. >> >> So you can definitely have in a .config all these at the same time >> (LOCKDEP, non PREEMPT, and DEBUG_ATOMIC_SLEEP) >> >> >> >> >> $ egrep "DEBUG_ATOMIC_SLEEP|PREEMPT|LOCKDEP" .config >> CONFIG_LOCKDEP_SUPPORT=y >> CONFIG_PREEMPT_NOTIFIERS=y >> CONFIG_PREEMPT_NONE=y >> # CONFIG_PREEMPT_VOLUNTARY is not set >> # CONFIG_PREEMPT is not set >> CONFIG_PREEMPT_COUNT=y >> CONFIG_LOCKDEP=y >> # CONFIG_DEBUG_LOCKDEP is not set >> CONFIG_DEBUG_ATOMIC_SLEEP=y >> > > yes, thats exactly what I have. > > $ egrep "DEBUG_ATOMIC_SLEEP|PREEMPT|LOCKDEP" .config > CONFIG_LOCKDEP_SUPPORT=y > # CONFIG_PREEMPT_NONE is not set > CONFIG_PREEMPT_VOLUNTARY=y > # CONFIG_PREEMPT is not set > CONFIG_PREEMPT_COUNT=y > CONFIG_LOCKDEP=y > # CONFIG_DEBUG_LOCKDEP is not set > CONFIG_DEBUG_ATOMIC_SLEEP=y > > I initially thought CONFIG_PREEMPT_COUNT enables CONFIG_PREEMPT, but > looks like all it does is to inc/dec preempt_count. > > Let me give the test a spin again, and see why everything seems to > fall apart. > >> >> > Hi Eric, I managed to reproduce the problem consistently with LOCKDEP enabled. I have to workaround few other problems in order to make the repro consistent. There are 4 potential problem with the commit. 1. calling mutex_lock(&fanout_mutex) (fanout_release()) from inside rcu_read-side critical section. rcu_read_lock disables preemption, most often (expect if CONFIG_PREEMPT/CONFIG_PREEMPTE_RCU are set), which prohibits calling sleeping functions. [ 180.940388] include/linux/rcupdate.h:560 Illegal context switch in RCU read-side critical section! [ 180.940401] [ 180.940401] other info that might help us debug this: [ 180.940401] [ 180.940417] [ 180.940417] rcu_scheduler_active = 1, debug_locks = 0 [ 180.940430] 4 locks held by ovs-vswitchd/1969: [ 180.940438] #0: (cb_lock){++++++}, at: [<ffffffff8158a6c9>] genl_rcv+0x19/0x40 [ 180.940498] #1: (ovs_mutex){+.+.+.}, at: [<ffffffffa04878ca>] ovs_vport_cmd_del+0x4a/0x100 [openvswitch] [ 180.940530] #2: (rtnl_mutex){+.+.+.}, at: [<ffffffff81564157>] rtnl_lock+0x17/0x20 [ 180.940557] #3: (rcu_read_lock){......}, at: [<ffffffff81614165>] packet_notifier+0x5/0x3f0 [ 180.940587] [ 180.940697] Call Trace: [ 180.940710] [<ffffffff813770c1>] dump_stack+0x85/0xc4 [ 180.940727] [<ffffffff810c9077>] lockdep_rcu_suspicious+0x107/0x110 [ 180.940742] [<ffffffff810a2da7>] ___might_sleep+0x57/0x210 [ 180.940755] [<ffffffff810a2fd0>] __might_sleep+0x70/0x90 [ 180.940768] [<ffffffff8162e80c>] mutex_lock_nested+0x3c/0x3a0 [ 180.940785] [<ffffffff810de93f>] ? vprintk_default+0x1f/0x30 [ 180.940801] [<ffffffff81186e88>] ? printk+0x4d/0x4f [ 180.940814] [<ffffffff816106dd>] fanout_release+0x1d/0xe0 [ 180.940828] [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0 2. calling mutex_lock(&fanout_mutex) inside spin_lock(&po->bind_lock). "sleeping function called from invalid context" [ 181.941336] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620 [ 181.941352] in_atomic(): 1, irqs_disabled(): 0, pid: 1969, name: ovs-vswitchd [ 181.941365] INFO: lockdep is turned off. [ 181.941462] Call Trace: [ 181.941469] [<ffffffff813770c1>] dump_stack+0x85/0xc4 [ 181.941480] [<ffffffff810a2f52>] ___might_sleep+0x202/0x210 [ 181.941492] [<ffffffff810a2fd0>] __might_sleep+0x70/0x90 [ 181.941503] [<ffffffff8162e80c>] mutex_lock_nested+0x3c/0x3a0 [ 181.941515] [<ffffffff810de93f>] ? vprintk_default+0x1f/0x30 [ 181.941526] [<ffffffff81186e88>] ? printk+0x4d/0x4f [ 181.941537] [<ffffffff816106dd>] fanout_release+0x1d/0xe0 [ 181.941548] [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0 3. calling dev_remove_pack(&fanout->prot_hook), from inside spin_lock(&po->bind_lock) or rcu_read-side critcial.section. dev_remove_pack() -> synchronize_net(), which might sleep. [ 181.942401] BUG: scheduling while atomic: ovs-vswitchd/1969/0x00000002 [ 181.942411] INFO: lockdep is turned off. [ 181.942751] Call Trace: [ 181.942760] [<ffffffff813770c1>] dump_stack+0x85/0xc4 [ 181.942771] [<ffffffff81186274>] __schedule_bug+0x64/0x73 [ 181.942782] [<ffffffff8162b8cb>] __schedule+0x6b/0xd10 [ 181.942794] [<ffffffff8162c5db>] schedule+0x6b/0x80 [ 181.942805] [<ffffffff81630b1d>] schedule_timeout+0x38d/0x410 [ 181.942820] [<ffffffff810efac0>] ? internal_add_timer+0x80/0x80 [ 181.942835] [<ffffffff810ea3fd>] synchronize_sched_expedited+0x53d/0x810 [ 181.942850] [<ffffffff810bf650>] ? prepare_to_wait_event+0x110/0x110 [ 181.942862] [<ffffffff810ea6de>] synchronize_rcu_expedited+0xe/0x10 [ 181.942873] [<ffffffff8154eab5>] synchronize_net+0x35/0x50 [ 181.942884] [<ffffffff8154eae3>] dev_remove_pack+0x13/0x20 [ 181.942896] [<ffffffff8161077e>] fanout_release+0xbe/0xe0 [ 181.942909] [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0 4. fanout_release() races with calls from different CPU. Moving mutex_lock(&fanout_mutex) out of spin_lock(&po->bind_lock), if (msg == NETDEV_UNREGISTER) { packet_cached_dev_reset(po); - fanout_release(sk); po->ifindex = -1; if (po->prot_hook.dev) dev_put(po->prot_hook.dev); po->prot_hook.dev = NULL; } spin_unlock(&po->bind_lock); + if (msg == NETDEV_UNREGISTER) + fanout_release(sk); } break; case NETDEV_UP: will solve 2 and part (spin_lock) of 3, but definetly will not solve 1 and (rcu_read-side cs) other part of 3 (if CONFIG_PREEMPT is enabled). Inside packet_notifier packet.sklist is traversed under rcu_read_lock, which meant any calls which might sleep (dev_remove_pack(), mutex_lock()) cannot be done. Instead of traversing sklist under rcu_read_lock, we can traverse sklist using mutex_lock(packet.sklist_lock). This would fix 1 and 3, but adds additional overhead of blocking modifications, while traversing sklist. Another way to fix, all the above problem, would be not to call fanout_release() under rcu_read_lock(), instead call __dev_remove_pack(&fanout->prot_hook) and netdev_run_todo is happy that &dev->ptype_specific list in empty. In order to make this work, I had to move dev_{add,remove}_pack() out of fanout_{add,release} to __fanout_{link,unlink}. So, call to {,__}unregister_prot_hook() will make sure fanout->prot_hook is removed as well. @@ -1498,6 +1498,8 @@ static void __fanout_link(struct sock *sk, struct packet_sock *po) f->arr[f->num_members] = sk; smp_wmb(); f->num_members++; + if (f->num_members == 1) + dev_add_pack(&f->prot_hook); spin_unlock(&f->lock); } @@ -1514,6 +1516,8 @@ static void __fanout_unlink(struct sock *sk, struct packet_sock *po) BUG_ON(i >= f->num_members); f->arr[i] = f->arr[f->num_members - 1]; f->num_members--; + if (f->num_members == 0) + __dev_remove_pack(&f->prot_hook); spin_unlock(&f->lock); } @@ -1692,7 +1696,6 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags) match->prot_hook.func = packet_rcv_fanout; match->prot_hook.af_packet_priv = match; match->prot_hook.id_match = match_fanout_group; - dev_add_pack(&match->prot_hook); list_add(&match->list, &fanout_list); } err = -EINVAL; @@ -1731,7 +1734,6 @@ static void fanout_release(struct sock *sk) if (atomic_dec_and_test(&f->sk_ref)) { list_del(&f->list); - dev_remove_pack(&f->prot_hook); fanout_release_data(f); kfree(f); } @@ -3855,7 +3857,6 @@ static int packet_notifier(struct notifier_block *this, } if (msg == NETDEV_UNREGISTER) { packet_cached_dev_reset(po); - fanout_release(sk); po->ifindex = -1; if (po->prot_hook.dev) dev_put(po->prot_hook.dev); I have tested both the approaches and LOCKDEP doesn't seem to catch any problem with the test I was doing. Thanks, Anoob. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev 2017-02-02 14:42 ` Anoob Soman @ 2017-02-02 15:53 ` Eric Dumazet 2017-02-02 16:44 ` Anoob Soman 0 siblings, 1 reply; 13+ messages in thread From: Eric Dumazet @ 2017-02-02 15:53 UTC (permalink / raw) To: Anoob Soman; +Cc: David Miller, netdev On Thu, 2017-02-02 at 14:42 +0000, Anoob Soman wrote: > > I have tested both the approaches and LOCKDEP doesn't seem to catch any > problem with the test I was doing. Yeah, I think I will cleanup this mess, we probably can remove rcu locking in control path, and stick to a single mutex to reduce all this lock complexity. But that will be for net-next, while we need your fix for net tree. Thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev 2017-02-02 15:53 ` Eric Dumazet @ 2017-02-02 16:44 ` Anoob Soman 2017-02-02 17:12 ` Eric Dumazet 0 siblings, 1 reply; 13+ messages in thread From: Anoob Soman @ 2017-02-02 16:44 UTC (permalink / raw) To: Eric Dumazet; +Cc: David Miller, netdev On 02/02/17 15:53, Eric Dumazet wrote: > On Thu, 2017-02-02 at 14:42 +0000, Anoob Soman wrote: > >> I have tested both the approaches and LOCKDEP doesn't seem to catch any >> problem with the test I was doing. > Yeah, I think I will cleanup this mess, we probably can remove rcu > locking in control path, and stick to a single mutex to reduce all this > lock complexity. > > But that will be for net-next, while we need your fix for net tree. > > Thanks. > > Sorry, I didn't get it. You want me to post a patch for net tree, changing the way we do dev_{add,remove}_pack(fanout_prot_hook) and you will fix up the lock mess in net-next. Thanks, Anoob. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev 2017-02-02 16:44 ` Anoob Soman @ 2017-02-02 17:12 ` Eric Dumazet 0 siblings, 0 replies; 13+ messages in thread From: Eric Dumazet @ 2017-02-02 17:12 UTC (permalink / raw) To: Anoob Soman; +Cc: David Miller, netdev On Thu, 2017-02-02 at 16:44 +0000, Anoob Soman wrote: > On 02/02/17 15:53, Eric Dumazet wrote: > > On Thu, 2017-02-02 at 14:42 +0000, Anoob Soman wrote: > > > >> I have tested both the approaches and LOCKDEP doesn't seem to catch any > >> problem with the test I was doing. > > Yeah, I think I will cleanup this mess, we probably can remove rcu > > locking in control path, and stick to a single mutex to reduce all this > > lock complexity. > > > > But that will be for net-next, while we need your fix for net tree. > > > > Thanks. > > > > > Sorry, I didn't get it. You want me to post a patch for net tree, > changing the way we do dev_{add,remove}_pack(fanout_prot_hook) and you > will fix up the lock mess in net-next. Yes. Please send your patch. In the future (~1 month when net-next is re-open) I will submit a patch to remove all these complexity out of af_packet. Once fast path uses RCU, control path does not need to use a galaxy of spinlocks and mutexes, and rcu. This is too complex to maintain. A single mutex should be more than enough. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2017-02-02 17:12 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-10-05 14:12 [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev Anoob Soman 2016-10-07 0:50 ` David Miller 2017-01-30 17:26 ` Eric Dumazet 2017-01-30 18:19 ` Anoob Soman 2017-01-30 19:08 ` Anoob Soman 2017-01-30 19:44 ` Eric Dumazet 2017-01-31 17:03 ` Anoob Soman 2017-01-31 18:00 ` Eric Dumazet 2017-01-31 18:14 ` Anoob Soman 2017-02-02 14:42 ` Anoob Soman 2017-02-02 15:53 ` Eric Dumazet 2017-02-02 16:44 ` Anoob Soman 2017-02-02 17:12 ` Eric Dumazet
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).