* [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
@ 2007-10-16 5:40 Benjamin Herrenschmidt
0 siblings, 0 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2007-10-16 5:40 UTC (permalink / raw)
To: shemminger; +Cc: netdev, Roland Dreier, David S. Miller, linuxppc-dev list
net: Add __napi_sycnhronize() to sync with napi poll
The EMAC driver which needs to handle multiple devices with one
NAPI instance implements its own per-channel disable bit. However,
when setting such a bit, it needs to synchronize with the poller
(that is make sure that any pending poller instance has completed,
or is started late enough to see that disable bit).
This implements a low level __napi_synchronize() function to acheive
that. The underscores are to emphasis the low level aspect of it and
to discourage driver writers who don't know what they are doing to
use it (to please DaveM :-)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
If the approach is accepted, I would like to have this merged now
so the EMAC patch to make it work again can follow :-)
Note: I use msleep_interruptible(1); just like napi_disable(). However
I'm not too happy that the "hot" loop that results of a pending signal
here will spin without even a cpu_relax ... what do you guys think would
be the best way to handle this ?
Index: linux-work/include/linux/netdevice.h
===================================================================
--- linux-work.orig/include/linux/netdevice.h 2007-10-16 15:27:27.000000000 +1000
+++ linux-work/include/linux/netdevice.h 2007-10-16 15:27:38.000000000 +1000
@@ -394,6 +394,21 @@ static inline void napi_disable(struct n
}
/**
+ * __napi_synchronize - synchronize with a concurrent poll
+ * @n: napi context
+ *
+ * Synchronizes with a concurrent poll. Not to be used in normal
+ * drivers, mostly useful if you end up with multiple interfaces
+ * on one NAPI instance.
+ */
+static inline void __napi_synchronize(struct napi_struct *n)
+{
+ smp_mb();
+ while (test_bit(NAPI_STATE_SCHED, &n->state))
+ msleep_interruptible(1);
+}
+
+/**
* napi_enable - enable NAPI scheduling
* @n: napi context
*
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
@ 2007-10-16 5:49 Benjamin Herrenschmidt
2007-10-16 6:06 ` Herbert Xu
2007-10-16 18:53 ` Stephen Hemminger
0 siblings, 2 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2007-10-16 5:49 UTC (permalink / raw)
To: shemminger; +Cc: netdev, Roland Dreier, David S. Miller, linuxppc-dev list
net: Add __napi_sycnhronize() to sync with napi poll
The EMAC driver which needs to handle multiple devices with one
NAPI instance implements its own per-channel disable bit. However,
when setting such a bit, it needs to synchronize with the poller
(that is make sure that any pending poller instance has completed,
or is started late enough to see that disable bit).
This implements a low level __napi_synchronize() function to acheive
that. The underscores are to emphasis the low level aspect of it and
to discourage driver writers who don't know what they are doing to
use it (to please DaveM :-)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
(Use correct address for Stephen this time)
If the approach is accepted, I would like to have this merged now
so the EMAC patch to make it work again can follow :-)
Note: I use msleep_interruptible(1); just like napi_disable(). However
I'm not too happy that the "hot" loop that results of a pending signal
here will spin without even a cpu_relax ... what do you guys think would
be the best way to handle this ?
Index: linux-work/include/linux/netdevice.h
===================================================================
--- linux-work.orig/include/linux/netdevice.h 2007-10-16 15:27:27.000000000 +1000
+++ linux-work/include/linux/netdevice.h 2007-10-16 15:27:38.000000000 +1000
@@ -394,6 +394,21 @@ static inline void napi_disable(struct n
}
/**
+ * __napi_synchronize - synchronize with a concurrent poll
+ * @n: napi context
+ *
+ * Synchronizes with a concurrent poll. Not to be used in normal
+ * drivers, mostly useful if you end up with multiple interfaces
+ * on one NAPI instance.
+ */
+static inline void __napi_synchronize(struct napi_struct *n)
+{
+ smp_mb();
+ while (test_bit(NAPI_STATE_SCHED, &n->state))
+ msleep_interruptible(1);
+}
+
+/**
* napi_enable - enable NAPI scheduling
* @n: napi context
*
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
2007-10-16 5:49 Benjamin Herrenschmidt
@ 2007-10-16 6:06 ` Herbert Xu
2007-10-16 7:37 ` Benjamin Herrenschmidt
2007-10-16 18:53 ` Stephen Hemminger
1 sibling, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2007-10-16 6:06 UTC (permalink / raw)
To: benh; +Cc: netdev, rdreier, shemminger, davem, linuxppc-dev
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>
> Note: I use msleep_interruptible(1); just like napi_disable(). However
> I'm not too happy that the "hot" loop that results of a pending signal
> here will spin without even a cpu_relax ... what do you guys think would
> be the best way to handle this ?
Well since the loop does not check signals at all, it should
just use msleep.
Granted the process will end up in the D state and contribute
to the load average. But if this loop executes long enough
for that to be noticed then we've got bigger problems to worry
about.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
2007-10-16 6:06 ` Herbert Xu
@ 2007-10-16 7:37 ` Benjamin Herrenschmidt
2007-10-16 7:44 ` David Miller
0 siblings, 1 reply; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2007-10-16 7:37 UTC (permalink / raw)
To: Herbert Xu; +Cc: netdev, rdreier, shemminger, davem, linuxppc-dev
On Tue, 2007-10-16 at 14:06 +0800, Herbert Xu wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> >
> > Note: I use msleep_interruptible(1); just like napi_disable(). However
> > I'm not too happy that the "hot" loop that results of a pending signal
> > here will spin without even a cpu_relax ... what do you guys think would
> > be the best way to handle this ?
>
> Well since the loop does not check signals at all, it should
> just use msleep.
>
> Granted the process will end up in the D state and contribute
> to the load average. But if this loop executes long enough
> for that to be noticed then we've got bigger problems to worry
> about.
If Dave & Stephen agree, I'll send a patch changing napi_disable() too
then.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
2007-10-16 7:37 ` Benjamin Herrenschmidt
@ 2007-10-16 7:44 ` David Miller
0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2007-10-16 7:44 UTC (permalink / raw)
To: benh; +Cc: netdev, rdreier, shemminger, herbert, linuxppc-dev
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Tue, 16 Oct 2007 17:37:03 +1000
>
> On Tue, 2007-10-16 at 14:06 +0800, Herbert Xu wrote:
> > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> > >
> > > Note: I use msleep_interruptible(1); just like napi_disable(). However
> > > I'm not too happy that the "hot" loop that results of a pending signal
> > > here will spin without even a cpu_relax ... what do you guys think would
> > > be the best way to handle this ?
> >
> > Well since the loop does not check signals at all, it should
> > just use msleep.
> >
> > Granted the process will end up in the D state and contribute
> > to the load average. But if this loop executes long enough
> > for that to be noticed then we've got bigger problems to worry
> > about.
>
> If Dave & Stephen agree, I'll send a patch changing napi_disable() too
> then.
I agree with the msleep() change.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
2007-10-16 5:49 Benjamin Herrenschmidt
2007-10-16 6:06 ` Herbert Xu
@ 2007-10-16 18:53 ` Stephen Hemminger
2007-10-16 21:14 ` Benjamin Herrenschmidt
` (2 more replies)
1 sibling, 3 replies; 13+ messages in thread
From: Stephen Hemminger @ 2007-10-16 18:53 UTC (permalink / raw)
To: benh; +Cc: netdev, Dreier, Roland, David S. Miller, linuxppc-dev list
On Tue, 16 Oct 2007 15:49:52 +1000
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> net: Add __napi_sycnhronize() to sync with napi poll
>
> The EMAC driver which needs to handle multiple devices with one
> NAPI instance implements its own per-channel disable bit. However,
> when setting such a bit, it needs to synchronize with the poller
> (that is make sure that any pending poller instance has completed,
> or is started late enough to see that disable bit).
>
> This implements a low level __napi_synchronize() function to acheive
> that. The underscores are to emphasis the low level aspect of it and
> to discourage driver writers who don't know what they are doing to
> use it (to please DaveM :-)
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>
> (Use correct address for Stephen this time)
>
> If the approach is accepted, I would like to have this merged now
> so the EMAC patch to make it work again can follow :-)
>
> Note: I use msleep_interruptible(1); just like napi_disable(). However
> I'm not too happy that the "hot" loop that results of a pending signal
> here will spin without even a cpu_relax ... what do you guys think would
> be the best way to handle this ?
So this is really just like synchronize_irq()? Using msleep is bogus
because you want to spin, you are only waiting for a softirq on the other
cpu to finish. If you wait for a whole millisecond and sleep that
is far longer than the napi routine should take.
You could even optimize it like synchronize_irq() for the non-SMP case.
--
Stephen Hemminger <shemminger@linux-foundation.org>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
2007-10-16 18:53 ` Stephen Hemminger
@ 2007-10-16 21:14 ` Benjamin Herrenschmidt
2007-10-17 2:50 ` Benjamin Herrenschmidt
2007-10-17 2:52 ` Benjamin Herrenschmidt
2 siblings, 0 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2007-10-16 21:14 UTC (permalink / raw)
To: Stephen Hemminger
Cc: netdev, Roland Dreier, David S. Miller, linuxppc-dev list
> So this is really just like synchronize_irq()? Using msleep is bogus
> because you want to spin, you are only waiting for a softirq on the other
> cpu to finish. If you wait for a whole millisecond and sleep that
> is far longer than the napi routine should take.
>
> You could even optimize it like synchronize_irq() for the non-SMP case.
It's just like synchronize_irq() indeed. I used the mlseep() just like
napi_disable() mostly because I use it in a very similar context, for
disabling my sub-channels on things like link change etc... where I need
to reconfigure parts of the chip.
I prefer sleeping in my case but I agree that if somebody else was going
to use for something else more performance critical, it might be an
issue. On the other hand, spinning will not be nice for my usage
scenario :-)
I agree about the SMP optimisation though again, in my usage pattern,
it's very unimportant (similar code path as napi_disable)
I'll respin later today though.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
2007-10-16 18:53 ` Stephen Hemminger
2007-10-16 21:14 ` Benjamin Herrenschmidt
@ 2007-10-17 2:50 ` Benjamin Herrenschmidt
2007-10-17 2:52 ` Benjamin Herrenschmidt
2 siblings, 0 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2007-10-17 2:50 UTC (permalink / raw)
To: Stephen Hemminger
Cc: netdev, Roland Dreier, David S. Miller, linuxppc-dev list
napi: use non-interruptible sleep in napi_disable
The current napi_disable() uses msleep_interruptible() but doesn't
(and can't) exit in case there's a signal, thus ending up doing a
hot spin without a cpu_relax. Use uninterruptible sleep instead.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
Index: linux-work/include/linux/netdevice.h
===================================================================
--- linux-work.orig/include/linux/netdevice.h 2007-10-17 12:39:16.000000000 +1000
+++ linux-work/include/linux/netdevice.h 2007-10-17 12:45:00.000000000 +1000
@@ -390,7 +390,7 @@ static inline void napi_complete(struct
static inline void napi_disable(struct napi_struct *n)
{
while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
- msleep_interruptible(1);
+ msleep(1);
}
/**
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
2007-10-16 18:53 ` Stephen Hemminger
2007-10-16 21:14 ` Benjamin Herrenschmidt
2007-10-17 2:50 ` Benjamin Herrenschmidt
@ 2007-10-17 2:52 ` Benjamin Herrenschmidt
2007-10-17 3:16 ` Jeff Garzik
2 siblings, 1 reply; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2007-10-17 2:52 UTC (permalink / raw)
To: Stephen Hemminger
Cc: netdev, Roland Dreier, David S. Miller, linuxppc-dev list
net: Add __napi_synchronize() to sync with napi poll
The EMAC driver which needs to handle multiple devices with one
NAPI instance implements its own per-channel disable bit. However,
when setting such a bit, it needs to synchronize with the poller
(that is make sure that any pending poller instance has completed,
or is started late enough to see that disable bit).
This implements a low level __napi_synchronize() function to acheive
that. The underscores are to emphasis the low level aspect of it and
to discourage driver writers who don't know what they are doing to
use it (to please DaveM :-)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
Switched to do a cpu_relax() spin instead and only on SMP. Not that
we have an smp_mb() in there which synchronize_irq() lacks. I believe
this is a bug in synchronize_irq() which I'll handle separately.
Note: unfortunately, Jeff already picked up the EMAC patch without
waiting for this to be sorted out (oops...). So if you agree with
this patch, it would be nice to have it go in quickly or maybe via
Jeff's tree to avoid breakage ? Not terribly important tho.
Index: linux-work/include/linux/netdevice.h
===================================================================
--- linux-work.orig/include/linux/netdevice.h 2007-10-17 12:39:40.000000000 +1000
+++ linux-work/include/linux/netdevice.h 2007-10-17 12:40:59.000000000 +1000
@@ -394,6 +394,23 @@ static inline void napi_disable(struct n
}
/**
+ * __napi_synchronize - synchronize with a concurrent poll
+ * @n: napi context
+ *
+ * Synchronizes with a concurrent poll. Not to be used in normal
+ * drivers, mostly useful if you end up with multiple interfaces
+ * on one NAPI instance.
+ */
+static inline void __napi_synchronize(struct napi_struct *n)
+{
+#ifdef CONFIG_SMP
+ smp_mb();
+ while (test_bit(NAPI_STATE_SCHED, &n->state))
+ cpu_relax();
+#endif /* CONFIG_SMP */
+}
+
+/**
* napi_enable - enable NAPI scheduling
* @n: napi context
*
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
2007-10-17 2:52 ` Benjamin Herrenschmidt
@ 2007-10-17 3:16 ` Jeff Garzik
2007-10-17 3:24 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 13+ messages in thread
From: Jeff Garzik @ 2007-10-17 3:16 UTC (permalink / raw)
To: benh
Cc: netdev, Roland Dreier, Stephen Hemminger, David S. Miller,
linuxppc-dev list
Benjamin Herrenschmidt wrote:
> net: Add __napi_synchronize() to sync with napi poll
>
> The EMAC driver which needs to handle multiple devices with one
> NAPI instance implements its own per-channel disable bit. However,
> when setting such a bit, it needs to synchronize with the poller
> (that is make sure that any pending poller instance has completed,
> or is started late enough to see that disable bit).
>
> This implements a low level __napi_synchronize() function to acheive
> that. The underscores are to emphasis the low level aspect of it and
> to discourage driver writers who don't know what they are doing to
> use it (to please DaveM :-)
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>
> Switched to do a cpu_relax() spin instead and only on SMP. Not that
> we have an smp_mb() in there which synchronize_irq() lacks. I believe
> this is a bug in synchronize_irq() which I'll handle separately.
>
> Note: unfortunately, Jeff already picked up the EMAC patch without
> waiting for this to be sorted out (oops...). So if you agree with
> this patch, it would be nice to have it go in quickly or maybe via
> Jeff's tree to avoid breakage ? Not terribly important tho.
Sorry, I thought that was the way everybody was headed. With the driver
broken /anyway/, I just sorta threw it into the pile of fixes.
It's upstream now, so let me know if you want to revert or move forward
from here...
Jeff
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
2007-10-17 3:16 ` Jeff Garzik
@ 2007-10-17 3:24 ` Benjamin Herrenschmidt
2007-10-17 15:31 ` Stephen Hemminger
0 siblings, 1 reply; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2007-10-17 3:24 UTC (permalink / raw)
To: Jeff Garzik
Cc: netdev, Roland Dreier, Stephen Hemminger, David S. Miller,
linuxppc-dev list
\
> > Note: unfortunately, Jeff already picked up the EMAC patch without
> > waiting for this to be sorted out (oops...). So if you agree with
> > this patch, it would be nice to have it go in quickly or maybe via
> > Jeff's tree to avoid breakage ? Not terribly important tho.
>
>
> Sorry, I thought that was the way everybody was headed. With the driver
> broken /anyway/, I just sorta threw it into the pile of fixes.
>
> It's upstream now, so let me know if you want to revert or move forward
> from here...
Let's just move forward. I can always re-fix the driver :-)
Ben.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
2007-10-17 3:24 ` Benjamin Herrenschmidt
@ 2007-10-17 15:31 ` Stephen Hemminger
2007-10-17 21:52 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 13+ messages in thread
From: Stephen Hemminger @ 2007-10-17 15:31 UTC (permalink / raw)
To: benh; +Cc: netdev, Roland Dreier, David S. Miller, Jeff Garzik,
linuxppc-dev list
Please don't use double underscore, for this function name. There is no
reason to not make it a normal API call.
The sky2 fix I am working on will use napi_synchronize as well.
--- a/include/linux/netdevice.h 2007-10-16 16:48:20.000000000 -0700
+++ b/include/linux/netdevice.h 2007-10-17 08:29:55.000000000 -0700
@@ -407,6 +407,24 @@ static inline void napi_enable(struct na
clear_bit(NAPI_STATE_SCHED, &n->state);
}
+#ifdef CONFIG_SMP
+/**
+ * napi_synchronize - wait until NAPI is not running
+ * @n: napi context
+ *
+ * Wait until NAPI is done being scheduled on this context.
+ * Any outstanding processing completes but
+ * does not disable future activations.
+ */
+static inline void napi_synchronize(const struct napi_struct *n)
+{
+ while (test_bit(NAPI_STATE_SCHED, &n->state))
+ msleep(1);
+}
+#else
+# define napi_synchronize(n) barrier()
+#endif
+
/*
* The DEVICE structure.
* Actually, this whole structure is a big mistake. It mixes I/O
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
2007-10-17 15:31 ` Stephen Hemminger
@ 2007-10-17 21:52 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2007-10-17 21:52 UTC (permalink / raw)
To: Stephen Hemminger
Cc: netdev, Roland Dreier, David S. Miller, Jeff Garzik,
linuxppc-dev list
On Wed, 2007-10-17 at 08:31 -0700, Stephen Hemminger wrote:
> Please don't use double underscore, for this function name. There is no
> reason to not make it a normal API call.
>
> The sky2 fix I am working on will use napi_synchronize as well.
Allright. A compiler barrier in the !SMP case makes sense, but I would
still want an smp_mb() before the test_bit. I think it's a bug in
synchronize_irq not to have it.
Cheers,
Ben.
> --- a/include/linux/netdevice.h 2007-10-16 16:48:20.000000000 -0700
> +++ b/include/linux/netdevice.h 2007-10-17 08:29:55.000000000 -0700
> @@ -407,6 +407,24 @@ static inline void napi_enable(struct na
> clear_bit(NAPI_STATE_SCHED, &n->state);
> }
>
> +#ifdef CONFIG_SMP
> +/**
> + * napi_synchronize - wait until NAPI is not running
> + * @n: napi context
> + *
> + * Wait until NAPI is done being scheduled on this context.
> + * Any outstanding processing completes but
> + * does not disable future activations.
> + */
> +static inline void napi_synchronize(const struct napi_struct *n)
> +{
> + while (test_bit(NAPI_STATE_SCHED, &n->state))
> + msleep(1);
> +}
> +#else
> +# define napi_synchronize(n) barrier()
> +#endif
> +
> /*
> * The DEVICE structure.
> * Actually, this whole structure is a big mistake. It mixes I/O
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2007-10-17 21:52 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-10-16 5:40 [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll Benjamin Herrenschmidt
-- strict thread matches above, loose matches on Subject: below --
2007-10-16 5:49 Benjamin Herrenschmidt
2007-10-16 6:06 ` Herbert Xu
2007-10-16 7:37 ` Benjamin Herrenschmidt
2007-10-16 7:44 ` David Miller
2007-10-16 18:53 ` Stephen Hemminger
2007-10-16 21:14 ` Benjamin Herrenschmidt
2007-10-17 2:50 ` Benjamin Herrenschmidt
2007-10-17 2:52 ` Benjamin Herrenschmidt
2007-10-17 3:16 ` Jeff Garzik
2007-10-17 3:24 ` Benjamin Herrenschmidt
2007-10-17 15:31 ` Stephen Hemminger
2007-10-17 21:52 ` Benjamin Herrenschmidt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).