From: Peter Zijlstra <peterz@infradead.org>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"David S. Miller" <davem@davemloft.net>,
linuxppc-dev@ozlabs.org, Thomas Gleixner <tglx@linutronix.de>,
netdev@vger.kernel.org, akpm@linux-foundation.org,
Linus Torvalds <torvalds@linux-foundation.org>,
Ingo Molnar <mingo@elte.hu>
Subject: Re: [NET]: Fix possible dev_deactivate race condition
Date: Fri, 19 Oct 2007 09:35:19 +0200
Message-ID: <1192779319.27435.163.camel@twins>
In-Reply-To: <20071019053624.GA10560@gondor.apana.org.au>

On Fri, 2007-10-19 at 13:36 +0800, Herbert Xu wrote:
> On Fri, Oct 19, 2007 at 12:20:25PM +0800, Herbert Xu wrote:
> >
> > In fact this bug exists elsewhere too. For example, the network
> > stack does this in net/sched/sch_generic.c:
> >
> > 	/* Wait for outstanding qdisc_run calls. */
> > 	while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
> > 		yield();
> >
> > This has the same problem as the current synchronize_irq code.
>
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index e01d576..b3b7420 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -556,6 +556,7 @@ void dev_deactivate(struct net_device *dev)
>  {
>  	struct Qdisc *qdisc;
>  	struct sk_buff *skb;
> +	int running;
>  
>  	spin_lock_bh(&dev->queue_lock);
>  	qdisc = dev->qdisc;
> @@ -571,12 +572,31 @@ void dev_deactivate(struct net_device *dev)
>  
>  	dev_watchdog_down(dev);
>  
> -	/* Wait for outstanding dev_queue_xmit calls. */
> +	/* Wait for outstanding qdisc-less dev_queue_xmit calls. */
>  	synchronize_rcu();
>  
>  	/* Wait for outstanding qdisc_run calls. */
> -	while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
> -		yield();
> +	do {
> +		while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
> +			yield();
> +

Ouch! Is there really no sane locking alternative? Hashed waitqueues,
like the ones used for the page lock, come to mind.

> +		/*
> +		 * Double-check inside queue lock to ensure that all effects
> +		 * of the queue run are visible when we return.
> +		 */
> +		spin_lock_bh(&dev->queue_lock);
> +		running = test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state);
> +		spin_unlock_bh(&dev->queue_lock);
> +
> +		/*
> +		 * The running flag should never be set at this point because
> +		 * we've already set dev->qdisc to noop_qdisc *inside* the same
> +		 * pair of spin locks.  That is, if any qdisc_run starts after
> +		 * our initial test it should see the noop_qdisc and then
> +		 * clear the RUNNING bit before dropping the queue lock.  So
> +		 * if it is set here then we've found a bug.
> +		 */
> +	} while (WARN_ON_ONCE(running));
>  }
>  
>  void dev_init_scheduler(struct net_device *dev)
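
For readers unfamiliar with the hashed waitqueues mentioned in the reply
above, here is a minimal userspace sketch of the idea, not kernel code and
not anything proposed in this thread. A small fixed table of wait slots is
shared by all flags; a flag's address picks its slot, the waiter sleeps
there instead of spinning on yield(), and whoever clears the flag wakes
that slot. Everything here is an assumption made for illustration: the
names wait_table, slot_for(), wait_flag_clear() and clear_flag_and_wake(),
the table size, the hash, and the use of pthreads and C11 atomics in place
of kernel primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WAIT_TABLE_SIZE 16u	/* hypothetical size; must be a power of two */

/* One wait slot. Many unrelated flags may hash onto the same slot. */
struct wait_slot {
	pthread_mutex_t lock;
	pthread_cond_t cond;
};

static struct wait_slot wait_table[WAIT_TABLE_SIZE];
static pthread_once_t wait_table_once = PTHREAD_ONCE_INIT;

static void wait_table_init(void)
{
	for (size_t i = 0; i < WAIT_TABLE_SIZE; i++) {
		int rc = pthread_mutex_init(&wait_table[i].lock, NULL);
		assert(rc == 0);
		rc = pthread_cond_init(&wait_table[i].cond, NULL);
		assert(rc == 0);
		(void)rc;
	}
}

/* Map a flag's address to its slot (illustrative hash, not the kernel's). */
static struct wait_slot *slot_for(const void *addr)
{
	uintptr_t h = (uintptr_t)addr;

	pthread_once(&wait_table_once, wait_table_init);
	h ^= h >> 7;
	return &wait_table[(h >> 4) & (WAIT_TABLE_SIZE - 1)];
}

/* Sleep until *flag reads false. */
static void wait_flag_clear(atomic_bool *flag)
{
	struct wait_slot *s = slot_for(flag);

	pthread_mutex_lock(&s->lock);
	/* Re-check after every wakeup: the slot is shared with other flags. */
	while (atomic_load(flag))
		pthread_cond_wait(&s->cond, &s->lock);
	pthread_mutex_unlock(&s->lock);
}

/* Clear *flag and wake every sleeper hashed to the same slot. */
static void clear_flag_and_wake(atomic_bool *flag)
{
	struct wait_slot *s = slot_for(flag);

	pthread_mutex_lock(&s->lock);
	atomic_store(flag, false);
	pthread_cond_broadcast(&s->cond);
	pthread_mutex_unlock(&s->lock);
}
```

In this sketch the flag is cleared while holding the slot lock, so a waiter
cannot test the flag, see it set, and then miss the wakeup. The broadcast
(rather than a single signal) is deliberate: sleepers waiting on different
flags can share a slot, and each one re-tests its own flag before going
back to sleep. The per-flag memory cost is zero, which is the usual reason
for hashing rather than embedding a waitqueue in every object.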