* [PATCH v2] iopoll: use udelay() for initial polling
@ 2026-05-19 10:24 Peter Collingbourne
2026-05-19 18:35 ` David Laight
2026-05-20 13:29 ` Ville Syrjälä
0 siblings, 2 replies; 11+ messages in thread
From: Peter Collingbourne @ 2026-05-19 10:24 UTC (permalink / raw)
To: Mark Brown
Cc: David Laight, Christophe Kerello, Patrice Chotard,
Boris Brezillon, linux-spi, linux-kernel, Jani Nikula,
Ville Syrjälä, Simona Vetter, Peter Collingbourne,
Randy Dunlap
A short polling delay, such as the delay of 5us
(SPINAND_READ_POLL_DELAY_US) provided by the SPI NAND driver,
can become a 1/HZ (order of ms) delay caused by the usleep_range()
call in read_poll_timeout(), significantly reducing SPI NAND access
performance. Fix it by adjusting the read_poll_timeout() macro to use
udelay() to delay until 1/10 of a timer tick after it is called, and
only then sleep.
Fixes: c955a0cc8a28 ("spi: spi-mem: add automatic poll status functions")
Signed-off-by: Peter Collingbourne <peter@pcc.me.uk>
---
include/linux/iopoll.h | 30 ++++++++++++++++++++++--------
1 file changed, 22 insertions(+), 8 deletions(-)
v2:
* Fix it in read_poll_timeout() instead
diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
index 53edd69acb9b..2ee89b76f072 100644
--- a/include/linux/iopoll.h
+++ b/include/linux/iopoll.h
@@ -19,9 +19,11 @@
*
* @op: Operation
* @cond: Break condition
- * @sleep_us: Maximum time to sleep between operations in us (0 tight-loops).
- * Please read usleep_range() function description for details and
- * limitations.
+ * @sleep_us: Maximum time to sleep or delay between operations in us
+ * (0 tight-loops). Please read usleep_range() and udelay()
+ * function descriptions for details and limitations.
+ * This macro will delay until 1/10 of a timer tick after
+ * it is called, and will then start sleeping.
* @timeout_us: Timeout in us, 0 means never timeout
* @sleep_before_op: if it is true, sleep @sleep_us before operation.
*
@@ -35,11 +37,18 @@
({ \
u64 __timeout_us = (timeout_us); \
unsigned long __sleep_us = (sleep_us); \
- ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
+ ktime_t __start_time = ktime_get(); \
+ u64 __delay_timeout_us = 100000/HZ; \
+ ktime_t __delay_timeout = ktime_add_us(__start_time, __delay_timeout_us); \
+ ktime_t __timeout = ktime_add_us(__start_time, __timeout_us); \
int ___ret; \
might_sleep_if((__sleep_us) != 0); \
- if ((sleep_before_op) && __sleep_us) \
- usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
+ if ((sleep_before_op) && __sleep_us) { \
+ if (__sleep_us <= __delay_timeout_us) \
+ udelay(__sleep_us); \
+ else \
+ usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
+ } \
for (;;) { \
bool __expired = __timeout_us && \
ktime_compare(ktime_get(), __timeout) > 0; \
@@ -54,8 +63,13 @@
___ret = -ETIMEDOUT; \
break; \
} \
- if (__sleep_us) \
- usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
+ if (__sleep_us) { \
+ if (__sleep_us <= __delay_timeout_us && \
+ ktime_compare(ktime_get(), __delay_timeout) < 0) \
+ udelay(__sleep_us); \
+ else \
+ usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
+ } \
cpu_relax(); \
} \
___ret; \
--
2.54.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2] iopoll: use udelay() for initial polling
2026-05-19 10:24 [PATCH v2] iopoll: use udelay() for initial polling Peter Collingbourne
@ 2026-05-19 18:35 ` David Laight
2026-05-20 7:38 ` Peter Collingbourne
2026-05-20 13:29 ` Ville Syrjälä
1 sibling, 1 reply; 11+ messages in thread
From: David Laight @ 2026-05-19 18:35 UTC (permalink / raw)
To: Peter Collingbourne
Cc: Mark Brown, Christophe Kerello, Patrice Chotard, Boris Brezillon,
linux-spi, linux-kernel, Jani Nikula, Ville Syrjälä,
Simona Vetter, Randy Dunlap
On Tue, 19 May 2026 03:24:46 -0700
Peter Collingbourne <peter@pcc.me.uk> wrote:
> A short polling delay, such as the delay of 5us
> (SPINAND_READ_POLL_DELAY_US) provided by the SPI NAND driver,
> can become a 1/HZ (order of ms) delay caused by the usleep_range()
> call in read_poll_timeout(), significantly reducing SPI NAND access
> performance. Fix it by adjusting the read_poll_timeout() macro to use
> udelay() to delay until 1/10 of a timer tick after it is called, and
> only then sleep.
>
> Fixes: c955a0cc8a28 ("spi: spi-mem: add automatic poll status functions")
> Signed-off-by: Peter Collingbourne <peter@pcc.me.uk>
> ---
> include/linux/iopoll.h | 30 ++++++++++++++++++++++--------
> 1 file changed, 22 insertions(+), 8 deletions(-)
>
> v2:
> * Fix it in read_poll_timeout() instead
>
> diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
> index 53edd69acb9b..2ee89b76f072 100644
> --- a/include/linux/iopoll.h
> +++ b/include/linux/iopoll.h
> @@ -19,9 +19,11 @@
> *
> * @op: Operation
> * @cond: Break condition
> - * @sleep_us: Maximum time to sleep between operations in us (0 tight-loops).
> - * Please read usleep_range() function description for details and
> - * limitations.
> + * @sleep_us: Maximum time to sleep or delay between operations in us
> + * (0 tight-loops). Please read usleep_range() and udelay()
> + * function descriptions for details and limitations.
> + * This macro will delay until 1/10 of a timer tick after
> + * it is called, and will then start sleeping.
> * @timeout_us: Timeout in us, 0 means never timeout
> * @sleep_before_op: if it is true, sleep @sleep_us before operation.
> *
> @@ -35,11 +37,18 @@
> ({ \
> u64 __timeout_us = (timeout_us); \
> unsigned long __sleep_us = (sleep_us); \
> - ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
> + ktime_t __start_time = ktime_get(); \
> + u64 __delay_timeout_us = 100000/HZ; \
> + ktime_t __delay_timeout = ktime_add_us(__start_time, __delay_timeout_us); \
> + ktime_t __timeout = ktime_add_us(__start_time, __timeout_us); \
> int ___ret; \
> might_sleep_if((__sleep_us) != 0); \
> - if ((sleep_before_op) && __sleep_us) \
> - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> + if ((sleep_before_op) && __sleep_us) { \
> + if (__sleep_us <= __delay_timeout_us) \
> + udelay(__sleep_us); \
> + else \
> + usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> + } \
> for (;;) { \
> bool __expired = __timeout_us && \
> ktime_compare(ktime_get(), __timeout) > 0; \
> @@ -54,8 +63,13 @@
> ___ret = -ETIMEDOUT; \
> break; \
> } \
> - if (__sleep_us) \
> - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> + if (__sleep_us) { \
> + if (__sleep_us <= __delay_timeout_us && \
> + ktime_compare(ktime_get(), __delay_timeout) < 0) \
> + udelay(__sleep_us); \
> + else \
> + usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> + } \
> cpu_relax(); \
> } \
> ___ret; \
How about:
#define poll_timeout_us(op, cond, sleep_us, timeout_us, sleep_before_op) \
({ \
u64 __timeout_us = (timeout_us); \
unsigned long __sleep_us = (sleep_us); \
ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
u64 __delay_timeout_us = 100000/HZ; \
int ___ret; \
bool __expired; \
might_sleep_if((__sleep_us) != 0); \
\
for (;; sleep_before_op = false) { \
if (!sleep_before_op) { \
__expired = __timeout_us && \
ktime_compare(ktime_get(), __timeout) > 0; \
/* guarantee 'op' and 'cond' are evaluated after timeout expired */ \
barrier(); \
op; \
if (cond) { \
___ret = 0; \
break; \
} \
if (__expired) { \
___ret = -ETIMEDOUT; \
break; \
} \
} \
if (__sleep_us > __delay_timeout_us) { \
usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
continue;
} \
if (__sleep_us) { \
__delay_timeout_us -= __sleep_us; \
udelay(__sleep_us); \
} \
cpu_relax(); \
} \
___ret; \
})
Which I think is approximately equivalent.
But I'm not at all sure the usleep/udelay test it right.
100000/HZ is a strange number of usecs; for HZ=100 it is 1ms, but for HZ=1000 0.1ms.
Maybe it should be more like:
u32 __delay_timeout_us = __sleep_us > 20 ? 0 : 100;
so that you delay for (approx) max 100us if the interval is less than 20us.
The is also a mismatch of long and u64.
I don't think anything (except the time_t) needs to be 64bit (esp. on 32bit).
I'm not sure about all architectures, but I'm pretty sure than on x86
usleep_range() is independent of HZ.
-- David
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] iopoll: use udelay() for initial polling
2026-05-19 18:35 ` David Laight
@ 2026-05-20 7:38 ` Peter Collingbourne
0 siblings, 0 replies; 11+ messages in thread
From: Peter Collingbourne @ 2026-05-20 7:38 UTC (permalink / raw)
To: David Laight
Cc: Mark Brown, Christophe Kerello, Patrice Chotard, Boris Brezillon,
linux-spi, linux-kernel, Jani Nikula, Ville Syrjälä,
Simona Vetter, Randy Dunlap
On Tue, May 19, 2026 at 11:35 AM David Laight
<david.laight.linux@gmail.com> wrote:
>
> On Tue, 19 May 2026 03:24:46 -0700
> Peter Collingbourne <peter@pcc.me.uk> wrote:
>
> > A short polling delay, such as the delay of 5us
> > (SPINAND_READ_POLL_DELAY_US) provided by the SPI NAND driver,
> > can become a 1/HZ (order of ms) delay caused by the usleep_range()
> > call in read_poll_timeout(), significantly reducing SPI NAND access
> > performance. Fix it by adjusting the read_poll_timeout() macro to use
> > udelay() to delay until 1/10 of a timer tick after it is called, and
> > only then sleep.
> >
> > Fixes: c955a0cc8a28 ("spi: spi-mem: add automatic poll status functions")
> > Signed-off-by: Peter Collingbourne <peter@pcc.me.uk>
> > ---
> > include/linux/iopoll.h | 30 ++++++++++++++++++++++--------
> > 1 file changed, 22 insertions(+), 8 deletions(-)
> >
> > v2:
> > * Fix it in read_poll_timeout() instead
> >
> > diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
> > index 53edd69acb9b..2ee89b76f072 100644
> > --- a/include/linux/iopoll.h
> > +++ b/include/linux/iopoll.h
> > @@ -19,9 +19,11 @@
> > *
> > * @op: Operation
> > * @cond: Break condition
> > - * @sleep_us: Maximum time to sleep between operations in us (0 tight-loops).
> > - * Please read usleep_range() function description for details and
> > - * limitations.
> > + * @sleep_us: Maximum time to sleep or delay between operations in us
> > + * (0 tight-loops). Please read usleep_range() and udelay()
> > + * function descriptions for details and limitations.
> > + * This macro will delay until 1/10 of a timer tick after
> > + * it is called, and will then start sleeping.
> > * @timeout_us: Timeout in us, 0 means never timeout
> > * @sleep_before_op: if it is true, sleep @sleep_us before operation.
> > *
> > @@ -35,11 +37,18 @@
> > ({ \
> > u64 __timeout_us = (timeout_us); \
> > unsigned long __sleep_us = (sleep_us); \
> > - ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
> > + ktime_t __start_time = ktime_get(); \
> > + u64 __delay_timeout_us = 100000/HZ; \
> > + ktime_t __delay_timeout = ktime_add_us(__start_time, __delay_timeout_us); \
> > + ktime_t __timeout = ktime_add_us(__start_time, __timeout_us); \
> > int ___ret; \
> > might_sleep_if((__sleep_us) != 0); \
> > - if ((sleep_before_op) && __sleep_us) \
> > - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> > + if ((sleep_before_op) && __sleep_us) { \
> > + if (__sleep_us <= __delay_timeout_us) \
> > + udelay(__sleep_us); \
> > + else \
> > + usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> > + } \
> > for (;;) { \
> > bool __expired = __timeout_us && \
> > ktime_compare(ktime_get(), __timeout) > 0; \
> > @@ -54,8 +63,13 @@
> > ___ret = -ETIMEDOUT; \
> > break; \
> > } \
> > - if (__sleep_us) \
> > - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> > + if (__sleep_us) { \
> > + if (__sleep_us <= __delay_timeout_us && \
> > + ktime_compare(ktime_get(), __delay_timeout) < 0) \
> > + udelay(__sleep_us); \
> > + else \
> > + usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> > + } \
> > cpu_relax(); \
> > } \
> > ___ret; \
>
> How about:
> #define poll_timeout_us(op, cond, sleep_us, timeout_us, sleep_before_op) \
> ({ \
> u64 __timeout_us = (timeout_us); \
> unsigned long __sleep_us = (sleep_us); \
> ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
> u64 __delay_timeout_us = 100000/HZ; \
> int ___ret; \
> bool __expired; \
> might_sleep_if((__sleep_us) != 0); \
> \
> for (;; sleep_before_op = false) { \
> if (!sleep_before_op) { \
> __expired = __timeout_us && \
> ktime_compare(ktime_get(), __timeout) > 0; \
> /* guarantee 'op' and 'cond' are evaluated after timeout expired */ \
> barrier(); \
> op; \
> if (cond) { \
> ___ret = 0; \
> break; \
> } \
> if (__expired) { \
> ___ret = -ETIMEDOUT; \
> break; \
> } \
> } \
> if (__sleep_us > __delay_timeout_us) { \
> usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> continue;
> } \
> if (__sleep_us) { \
> __delay_timeout_us -= __sleep_us; \
> udelay(__sleep_us); \
> } \
> cpu_relax(); \
> } \
> ___ret; \
> })
> Which I think is approximately equivalent.
> But I'm not at all sure the usleep/udelay test it right.
> 100000/HZ is a strange number of usecs; for HZ=100 it is 1ms, but for HZ=1000 0.1ms.
The idea here was that since the sleep time for usleep_range() is tied
to HZ (or so I thought, see below), the delay timeout should also be
tied to HZ.
> Maybe it should be more like:
> u32 __delay_timeout_us = __sleep_us > 20 ? 0 : 100;
> so that you delay for (approx) max 100us if the interval is less than 20us.
I think that makes sense.
> The is also a mismatch of long and u64.
> I don't think anything (except the time_t) needs to be 64bit (esp. on 32bit).
Ack.
> I'm not sure about all architectures, but I'm pretty sure than on x86
> usleep_range() is independent of HZ.
Right, after looking into it more I learned that the hrtimer should
usually allow usleep_range() to return before the next timer tick.
Which, as it turns out, was not happening on my target device because
the kernel was configured without CONFIG_HIGH_RES_TIMERS. After
turning it on, the performance issue is fixed, even without this
patch.
For me this raises the question of whether CONFIG_HIGH_RES_TIMERS=n is
always a good default. On x86 (except for ancient microarchitectures)
and arm64, the architecture provides a guaranteed hrtimer, so to me it
makes sense for it to default to y on those architectures at least, so
that users who write their own config (not defconfig based) will avoid
issues like this difficult to debug performance issue by default.
If we do decide to turn it on everywhere by default, it may be worth
reconsidering whether we need to delay in poll_timeout_us() after all.
Peter
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] iopoll: use udelay() for initial polling
2026-05-19 10:24 [PATCH v2] iopoll: use udelay() for initial polling Peter Collingbourne
2026-05-19 18:35 ` David Laight
@ 2026-05-20 13:29 ` Ville Syrjälä
2026-05-21 5:59 ` Peter Collingbourne
1 sibling, 1 reply; 11+ messages in thread
From: Ville Syrjälä @ 2026-05-20 13:29 UTC (permalink / raw)
To: Peter Collingbourne
Cc: Mark Brown, David Laight, Christophe Kerello, Patrice Chotard,
Boris Brezillon, linux-spi, linux-kernel, Jani Nikula,
Simona Vetter, Randy Dunlap
On Tue, May 19, 2026 at 03:24:46AM -0700, Peter Collingbourne wrote:
> A short polling delay, such as the delay of 5us
> (SPINAND_READ_POLL_DELAY_US) provided by the SPI NAND driver,
> can become a 1/HZ (order of ms) delay caused by the usleep_range()
> call in read_poll_timeout(), significantly reducing SPI NAND access
> performance. Fix it by adjusting the read_poll_timeout() macro to use
> udelay() to delay until 1/10 of a timer tick after it is called, and
> only then sleep.
>
> Fixes: c955a0cc8a28 ("spi: spi-mem: add automatic poll status functions")
> Signed-off-by: Peter Collingbourne <peter@pcc.me.uk>
> ---
> include/linux/iopoll.h | 30 ++++++++++++++++++++++--------
> 1 file changed, 22 insertions(+), 8 deletions(-)
>
> v2:
> * Fix it in read_poll_timeout() instead
>
> diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
> index 53edd69acb9b..2ee89b76f072 100644
> --- a/include/linux/iopoll.h
> +++ b/include/linux/iopoll.h
> @@ -19,9 +19,11 @@
> *
> * @op: Operation
> * @cond: Break condition
> - * @sleep_us: Maximum time to sleep between operations in us (0 tight-loops).
> - * Please read usleep_range() function description for details and
> - * limitations.
> + * @sleep_us: Maximum time to sleep or delay between operations in us
> + * (0 tight-loops). Please read usleep_range() and udelay()
> + * function descriptions for details and limitations.
> + * This macro will delay until 1/10 of a timer tick after
> + * it is called, and will then start sleeping.
> * @timeout_us: Timeout in us, 0 means never timeout
> * @sleep_before_op: if it is true, sleep @sleep_us before operation.
> *
> @@ -35,11 +37,18 @@
> ({ \
> u64 __timeout_us = (timeout_us); \
> unsigned long __sleep_us = (sleep_us); \
> - ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
> + ktime_t __start_time = ktime_get(); \
> + u64 __delay_timeout_us = 100000/HZ; \
> + ktime_t __delay_timeout = ktime_add_us(__start_time, __delay_timeout_us); \
> + ktime_t __timeout = ktime_add_us(__start_time, __timeout_us); \
> int ___ret; \
> might_sleep_if((__sleep_us) != 0); \
> - if ((sleep_before_op) && __sleep_us) \
> - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> + if ((sleep_before_op) && __sleep_us) { \
> + if (__sleep_us <= __delay_timeout_us) \
> + udelay(__sleep_us); \
If you want udelay() why not just use the atomic variant of the macro?
> + else \
> + usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> + } \
> for (;;) { \
> bool __expired = __timeout_us && \
> ktime_compare(ktime_get(), __timeout) > 0; \
> @@ -54,8 +63,13 @@
> ___ret = -ETIMEDOUT; \
> break; \
> } \
> - if (__sleep_us) \
> - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> + if (__sleep_us) { \
> + if (__sleep_us <= __delay_timeout_us && \
> + ktime_compare(ktime_get(), __delay_timeout) < 0) \
> + udelay(__sleep_us); \
> + else \
> + usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> + } \
> cpu_relax(); \
> } \
> ___ret; \
> --
> 2.54.0
--
Ville Syrjälä
Intel
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] iopoll: use udelay() for initial polling
2026-05-20 13:29 ` Ville Syrjälä
@ 2026-05-21 5:59 ` Peter Collingbourne
2026-05-21 7:03 ` Jani Nikula
0 siblings, 1 reply; 11+ messages in thread
From: Peter Collingbourne @ 2026-05-21 5:59 UTC (permalink / raw)
To: Ville Syrjälä
Cc: Mark Brown, David Laight, Christophe Kerello, Patrice Chotard,
Boris Brezillon, linux-spi, linux-kernel, Jani Nikula,
Simona Vetter, Randy Dunlap
On Wed, May 20, 2026 at 6:29 AM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
>
> On Tue, May 19, 2026 at 03:24:46AM -0700, Peter Collingbourne wrote:
> > A short polling delay, such as the delay of 5us
> > (SPINAND_READ_POLL_DELAY_US) provided by the SPI NAND driver,
> > can become a 1/HZ (order of ms) delay caused by the usleep_range()
> > call in read_poll_timeout(), significantly reducing SPI NAND access
> > performance. Fix it by adjusting the read_poll_timeout() macro to use
> > udelay() to delay until 1/10 of a timer tick after it is called, and
> > only then sleep.
> >
> > Fixes: c955a0cc8a28 ("spi: spi-mem: add automatic poll status functions")
> > Signed-off-by: Peter Collingbourne <peter@pcc.me.uk>
> > ---
> > include/linux/iopoll.h | 30 ++++++++++++++++++++++--------
> > 1 file changed, 22 insertions(+), 8 deletions(-)
> >
> > v2:
> > * Fix it in read_poll_timeout() instead
> >
> > diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
> > index 53edd69acb9b..2ee89b76f072 100644
> > --- a/include/linux/iopoll.h
> > +++ b/include/linux/iopoll.h
> > @@ -19,9 +19,11 @@
> > *
> > * @op: Operation
> > * @cond: Break condition
> > - * @sleep_us: Maximum time to sleep between operations in us (0 tight-loops).
> > - * Please read usleep_range() function description for details and
> > - * limitations.
> > + * @sleep_us: Maximum time to sleep or delay between operations in us
> > + * (0 tight-loops). Please read usleep_range() and udelay()
> > + * function descriptions for details and limitations.
> > + * This macro will delay until 1/10 of a timer tick after
> > + * it is called, and will then start sleeping.
> > * @timeout_us: Timeout in us, 0 means never timeout
> > * @sleep_before_op: if it is true, sleep @sleep_us before operation.
> > *
> > @@ -35,11 +37,18 @@
> > ({ \
> > u64 __timeout_us = (timeout_us); \
> > unsigned long __sleep_us = (sleep_us); \
> > - ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
> > + ktime_t __start_time = ktime_get(); \
> > + u64 __delay_timeout_us = 100000/HZ; \
> > + ktime_t __delay_timeout = ktime_add_us(__start_time, __delay_timeout_us); \
> > + ktime_t __timeout = ktime_add_us(__start_time, __timeout_us); \
> > int ___ret; \
> > might_sleep_if((__sleep_us) != 0); \
> > - if ((sleep_before_op) && __sleep_us) \
> > - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> > + if ((sleep_before_op) && __sleep_us) { \
> > + if (__sleep_us <= __delay_timeout_us) \
> > + udelay(__sleep_us); \
>
> If you want udelay() why not just use the atomic variant of the macro?
That's what I had in v1; we decided this approach would better handle
misbehaving devices.
https://lore.kernel.org/all/20260517150253.031dec09@pumpkin/
Peter
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] iopoll: use udelay() for initial polling
2026-05-21 5:59 ` Peter Collingbourne
@ 2026-05-21 7:03 ` Jani Nikula
2026-05-22 14:50 ` Mark Brown
0 siblings, 1 reply; 11+ messages in thread
From: Jani Nikula @ 2026-05-21 7:03 UTC (permalink / raw)
To: Peter Collingbourne, Ville Syrjälä
Cc: Mark Brown, David Laight, Christophe Kerello, Patrice Chotard,
Boris Brezillon, linux-spi, linux-kernel, Simona Vetter,
Randy Dunlap
On Wed, 20 May 2026, Peter Collingbourne <peter@pcc.me.uk> wrote:
> On Wed, May 20, 2026 at 6:29 AM Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
>>
>> On Tue, May 19, 2026 at 03:24:46AM -0700, Peter Collingbourne wrote:
>> > A short polling delay, such as the delay of 5us
>> > (SPINAND_READ_POLL_DELAY_US) provided by the SPI NAND driver,
>> > can become a 1/HZ (order of ms) delay caused by the usleep_range()
>> > call in read_poll_timeout(), significantly reducing SPI NAND access
>> > performance. Fix it by adjusting the read_poll_timeout() macro to use
>> > udelay() to delay until 1/10 of a timer tick after it is called, and
>> > only then sleep.
What's "1/10 of a timer tick" based on?
fsleep() has simply 10 us as the threshold.
>> >
>> > Fixes: c955a0cc8a28 ("spi: spi-mem: add automatic poll status functions")
>> > Signed-off-by: Peter Collingbourne <peter@pcc.me.uk>
>> > ---
>> > include/linux/iopoll.h | 30 ++++++++++++++++++++++--------
>> > 1 file changed, 22 insertions(+), 8 deletions(-)
>> >
>> > v2:
>> > * Fix it in read_poll_timeout() instead
>> >
>> > diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
>> > index 53edd69acb9b..2ee89b76f072 100644
>> > --- a/include/linux/iopoll.h
>> > +++ b/include/linux/iopoll.h
>> > @@ -19,9 +19,11 @@
>> > *
>> > * @op: Operation
>> > * @cond: Break condition
>> > - * @sleep_us: Maximum time to sleep between operations in us (0 tight-loops).
>> > - * Please read usleep_range() function description for details and
>> > - * limitations.
>> > + * @sleep_us: Maximum time to sleep or delay between operations in us
>> > + * (0 tight-loops). Please read usleep_range() and udelay()
>> > + * function descriptions for details and limitations.
>> > + * This macro will delay until 1/10 of a timer tick after
>> > + * it is called, and will then start sleeping.
>> > * @timeout_us: Timeout in us, 0 means never timeout
>> > * @sleep_before_op: if it is true, sleep @sleep_us before operation.
>> > *
>> > @@ -35,11 +37,18 @@
>> > ({ \
>> > u64 __timeout_us = (timeout_us); \
>> > unsigned long __sleep_us = (sleep_us); \
>> > - ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
>> > + ktime_t __start_time = ktime_get(); \
>> > + u64 __delay_timeout_us = 100000/HZ; \
>> > + ktime_t __delay_timeout = ktime_add_us(__start_time, __delay_timeout_us); \
>> > + ktime_t __timeout = ktime_add_us(__start_time, __timeout_us); \
>> > int ___ret; \
>> > might_sleep_if((__sleep_us) != 0); \
>> > - if ((sleep_before_op) && __sleep_us) \
>> > - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
>> > + if ((sleep_before_op) && __sleep_us) { \
>> > + if (__sleep_us <= __delay_timeout_us) \
>> > + udelay(__sleep_us); \
>>
>> If you want udelay() why not just use the atomic variant of the macro?
>
> That's what I had in v1; we decided this approach would better handle
> misbehaving devices.
> https://lore.kernel.org/all/20260517150253.031dec09@pumpkin/
I think the problem with trying to adapt to everything within
read_poll_timeout() is that every step like this adapts to a *specific*
use case, and once it gets specific enough, it's no longer usable to
other scenarios.
Having to reimplement the whole thing in drivers is much worse than
having to do two calls.
Could a staggered approach work?
ret = read_poll_timeout_atomic("short delay/timeout")
if (ret)
ret = read_poll_timeout("longer delay/timeout")
Then you have better control of the behaviour in the driver, instead of
adapting a generic function to a specific use case.
BR,
Jani.
--
Jani Nikula, Intel
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] iopoll: use udelay() for initial polling
2026-05-21 7:03 ` Jani Nikula
@ 2026-05-22 14:50 ` Mark Brown
2026-05-29 5:59 ` Peter Collingbourne
0 siblings, 1 reply; 11+ messages in thread
From: Mark Brown @ 2026-05-22 14:50 UTC (permalink / raw)
To: Jani Nikula
Cc: Peter Collingbourne, Ville Syrjälä, David Laight,
Christophe Kerello, Patrice Chotard, Boris Brezillon, linux-spi,
linux-kernel, Simona Vetter, Randy Dunlap
[-- Attachment #1: Type: text/plain, Size: 1310 bytes --]
On Thu, May 21, 2026 at 10:03:28AM +0300, Jani Nikula wrote:
> On Wed, 20 May 2026, Peter Collingbourne <peter@pcc.me.uk> wrote:
> > That's what I had in v1; we decided this approach would better handle
> > misbehaving devices.
> > https://lore.kernel.org/all/20260517150253.031dec09@pumpkin/
> I think the problem with trying to adapt to everything within
> read_poll_timeout() is that every step like this adapts to a *specific*
> use case, and once it gets specific enough, it's no longer usable to
> other scenarios.
> Having to reimplement the whole thing in drivers is much worse than
> having to do two calls.
> Could a staggered approach work?
> ret = read_poll_timeout_atomic("short delay/timeout")
> if (ret)
> ret = read_poll_timeout("longer delay/timeout")
> Then you have better control of the behaviour in the driver, instead of
> adapting a generic function to a specific use case.
If we're doing that it still seems like it'd be better to have a helper
than just takes the two timeouts and does the right thing with them. My
original feedback here was that we should have helpers which encourage
good patterns, and TBH I expect there's probably a bunch of scenarios
where some fixed cutoff would be a worthwile improvement for very little
effort on the part of users.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] iopoll: use udelay() for initial polling
2026-05-22 14:50 ` Mark Brown
@ 2026-05-29 5:59 ` Peter Collingbourne
2026-05-29 11:20 ` David Laight
0 siblings, 1 reply; 11+ messages in thread
From: Peter Collingbourne @ 2026-05-29 5:59 UTC (permalink / raw)
To: Mark Brown
Cc: Jani Nikula, Ville Syrjälä, David Laight,
Christophe Kerello, Patrice Chotard, Boris Brezillon, linux-spi,
linux-kernel, Simona Vetter, Randy Dunlap
On Fri, May 22, 2026 at 7:51 AM Mark Brown <broonie@kernel.org> wrote:
>
> On Thu, May 21, 2026 at 10:03:28AM +0300, Jani Nikula wrote:
> > On Wed, 20 May 2026, Peter Collingbourne <peter@pcc.me.uk> wrote:
>
> > > That's what I had in v1; we decided this approach would better handle
> > > misbehaving devices.
> > > https://lore.kernel.org/all/20260517150253.031dec09@pumpkin/
>
> > I think the problem with trying to adapt to everything within
> > read_poll_timeout() is that every step like this adapts to a *specific*
> > use case, and once it gets specific enough, it's no longer usable to
> > other scenarios.
>
> > Having to reimplement the whole thing in drivers is much worse than
> > having to do two calls.
>
> > Could a staggered approach work?
>
> > ret = read_poll_timeout_atomic("short delay/timeout")
> > if (ret)
> > ret = read_poll_timeout("longer delay/timeout")
>
> > Then you have better control of the behaviour in the driver, instead of
> > adapting a generic function to a specific use case.
>
> If we're doing that it still seems like it'd be better to have a helper
> than just takes the two timeouts and does the right thing with them. My
> original feedback here was that we should have helpers which encourage
> good patterns, and TBH I expect there's probably a bunch of scenarios
> where some fixed cutoff would be a worthwile improvement for very little
> effort on the part of users.
Thinking about it some more:
Do we really want to udelay() for the initial busy wait? Assuming that
we've decided to spend some time burning cycles, I think we may as
well use those cycles to check the condition so that we have a chance
of an early exit.
As an experiment, I open coded it in the SPI NAND driver:
diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
index a09371a075d2..cce34ada7611 100644
--- a/drivers/spi/spi-mem.c
+++ b/drivers/spi/spi-mem.c
@@ -1005,6 +1005,20 @@ int spi_mem_poll_status(struct spi_mem *mem,
usleep_range((initial_delay_us >> 2) + 1,
initial_delay_us);
+ {
+ ktime_t __start_time = ktime_get();
+ u64 __delay_timeout_us = 10;
+ ktime_t __delay_timeout =
+ ktime_add_us(__start_time, __delay_timeout_us);
+ while (ktime_compare(ktime_get(), __delay_timeout) <
+ 0) {
+ ret = spi_mem_read_status(mem, op, &status);
+ if (ret < 0)
+ return ret;
+ if ((status & mask) == match)
+ return 0;
+ }
+ }
ret = read_poll_timeout(spi_mem_read_status, read_status_ret,
(read_status_ret || ((status)
& mask) == match),
polling_delay_us, timeout_ms *
1000, false, mem,
And it did in fact improve performance on my target compared to status
quo and even v2.
With hrtimers enabled, timing the UBI scan, 3 runs of each:
- With no changes: 0.784s, 0.779s, 0.777s
- With my v2: 0.687s, 0.686s, 0.687s
- With the above patch: 0.665s, 0.665s, 0.665s
So 3 options:
1. Take my v2 and change read_poll_timeout() to do the check during
the busy wait.
2. Adjust read_poll_timeout_atomic() to busy wait using the check
instead of udelay(), then change the SPI NAND driver to do the 2
calls.
3. Introduce a new macro that does the 2 calls, and have the SPI NAND
driver use it.
I don't have a strong opinion, but my preference would be either 1 or
3. I agree with Mark that it would make the preferred usage code
easier to write (and to read), as there would be no need to duplicate
the argument list or compute the timeouts correctly in every caller.
Peter
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2] iopoll: use udelay() for initial polling
2026-05-29 5:59 ` Peter Collingbourne
@ 2026-05-29 11:20 ` David Laight
2026-05-29 21:56 ` Mark Brown
0 siblings, 1 reply; 11+ messages in thread
From: David Laight @ 2026-05-29 11:20 UTC (permalink / raw)
To: Peter Collingbourne
Cc: Mark Brown, Jani Nikula, Ville Syrjälä,
Christophe Kerello, Patrice Chotard, Boris Brezillon, linux-spi,
linux-kernel, Simona Vetter, Randy Dunlap
On Thu, 28 May 2026 22:59:25 -0700
Peter Collingbourne <peter@pcc.me.uk> wrote:
> On Fri, May 22, 2026 at 7:51 AM Mark Brown <broonie@kernel.org> wrote:
> >
> > On Thu, May 21, 2026 at 10:03:28AM +0300, Jani Nikula wrote:
> > > On Wed, 20 May 2026, Peter Collingbourne <peter@pcc.me.uk> wrote:
> >
> > > > That's what I had in v1; we decided this approach would better handle
> > > > misbehaving devices.
> > > > https://lore.kernel.org/all/20260517150253.031dec09@pumpkin/
> >
> > > I think the problem with trying to adapt to everything within
> > > read_poll_timeout() is that every step like this adapts to a *specific*
> > > use case, and once it gets specific enough, it's no longer usable to
> > > other scenarios.
> >
> > > Having to reimplement the whole thing in drivers is much worse than
> > > having to do two calls.
> >
> > > Could a staggered approach work?
> >
> > > ret = read_poll_timeout_atomic("short delay/timeout")
> > > if (ret)
> > > ret = read_poll_timeout("longer delay/timeout")
> >
> > > Then you have better control of the behaviour in the driver, instead of
> > > adapting a generic function to a specific use case.
> >
> > If we're doing that it still seems like it'd be better to have a helper
> > than just takes the two timeouts and does the right thing with them. My
> > original feedback here was that we should have helpers which encourage
> > good patterns, and TBH I expect there's probably a bunch of scenarios
> > where some fixed cutoff would be a worthwile improvement for very little
> > effort on the part of users.
>
> Thinking about it some more:
>
> Do we really want to udelay() for the initial busy wait? Assuming that
> we've decided to spend some time burning cycles, I think we may as
> well use those cycles to check the condition so that we have a chance
> of an early exit.
>
> As an experiment, I open coded it in the SPI NAND driver:
>
> diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
> index a09371a075d2..cce34ada7611 100644
> --- a/drivers/spi/spi-mem.c
> +++ b/drivers/spi/spi-mem.c
> @@ -1005,6 +1005,20 @@ int spi_mem_poll_status(struct spi_mem *mem,
> usleep_range((initial_delay_us >> 2) + 1,
> initial_delay_us);
>
> + {
> + ktime_t __start_time = ktime_get();
> + u64 __delay_timeout_us = 10;
> + ktime_t __delay_timeout =
> + ktime_add_us(__start_time, __delay_timeout_us);
> + while (ktime_compare(ktime_get(), __delay_timeout) <
> + 0) {
> + ret = spi_mem_read_status(mem, op, &status);
> + if (ret < 0)
> + return ret;
> + if ((status & mask) == match)
> + return 0;
> + }
> + }
> ret = read_poll_timeout(spi_mem_read_status, read_status_ret,
> (read_status_ret || ((status)
> & mask) == match),
> polling_delay_us, timeout_ms *
> 1000, false, mem,
>
> And it did in fact improve performance on my target compared to status
> quo and even v2.
>
> With hrtimers enabled, timing the UBI scan, 3 runs of each:
> - With no changes: 0.784s, 0.779s, 0.777s
> - With my v2: 0.687s, 0.686s, 0.687s
> - With the above patch: 0.665s, 0.665s, 0.665s
>
> So 3 options:
> 1. Take my v2 and change read_poll_timeout() to do the check during
> the busy wait.
> 2. Adjust read_poll_timeout_atomic() to busy wait using the check
> instead of udelay(), then change the SPI NAND driver to do the 2
> calls.
> 3. Introduce a new macro that does the 2 calls, and have the SPI NAND
> driver use it.
There is also the (difficult to measure) impact on overall system performance.
The versions that sleep let the cpu run other processes (or go to a low
power state).
Looping reading the status will load whatever bus handles the requests
between the cpu and spi interface logic (might be PCIe for example), that
could slow down accesses to other devices from other cpu.
I think I remember someone saying that the spi hardware interface normally
generates an interrupt when the request completes?
So for spi this is only fall-back code for a few systems.
Although I just looked at some code I have for doing bulk writes to an SPI
memory device (to write an fpga image).
That does about 35000 back to back writes.
The whole lot is in userspace - the driver just lets it mmap() the PCIe
registers. And the PCIe slave just converts 32bit accesses into 8 4-bit ones.
(I really should have supported read/write of AVX registers to reduce the
latency of reads - but reads are only really used for verify and I got the
CRC64 logic working in the end.)
The code has this comment:
/* Wait for the write - typically 0.6ms (max 5ms).
* In spite of the datasheet values, I'm seeing 200us writes. */
It waits 200us and then polls every 50us for 2 seconds.
-- David
>
> I don't have a strong opinion, but my preference would be either 1 or
> 3. I agree with Mark that it would make the preferred usage code
> easier to write (and to read), as there would be no need to duplicate
> the argument list or compute the timeouts correctly in every caller.
>
> Peter
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] iopoll: use udelay() for initial polling
2026-05-29 11:20 ` David Laight
@ 2026-05-29 21:56 ` Mark Brown
2026-05-30 10:20 ` David Laight
0 siblings, 1 reply; 11+ messages in thread
From: Mark Brown @ 2026-05-29 21:56 UTC (permalink / raw)
To: David Laight
Cc: Peter Collingbourne, Jani Nikula, Ville Syrjälä,
Christophe Kerello, Patrice Chotard, Boris Brezillon, linux-spi,
linux-kernel, Simona Vetter, Randy Dunlap
[-- Attachment #1: Type: text/plain, Size: 834 bytes --]
On Fri, May 29, 2026 at 12:20:16PM +0100, David Laight wrote:
> I think I remember someone saying that the spi hardware interface normally
> generates an interrupt when the request completes?
> So for spi this is only fall-back code for a few systems.
For something like spi-mem I'd expect to have interrupt support, we go
all the way down to fully bitbanged though. You also often have
copybreak cutoffs for smaller transfers (and quicker operations) where
it's more efficient to poll for completion than use an interrupt, or
PIO rather than DMA.
> The code has this comment:
> /* Wait for the write - typically 0.6ms (max 5ms).
> * In spite of the datasheet values, I'm seeing 200us writes. */
> It waits 200us and then polls every 50us for 2 seconds.
You can also get fun with things like contention on shared buses.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] iopoll: use udelay() for initial polling
2026-05-29 21:56 ` Mark Brown
@ 2026-05-30 10:20 ` David Laight
0 siblings, 0 replies; 11+ messages in thread
From: David Laight @ 2026-05-30 10:20 UTC (permalink / raw)
To: Mark Brown
Cc: Peter Collingbourne, Jani Nikula, Ville Syrjälä,
Christophe Kerello, Patrice Chotard, Boris Brezillon, linux-spi,
linux-kernel, Simona Vetter, Randy Dunlap
On Fri, 29 May 2026 22:56:23 +0100
Mark Brown <broonie@kernel.org> wrote:
> On Fri, May 29, 2026 at 12:20:16PM +0100, David Laight wrote:
>
> > I think I remember someone saying that the spi hardware interface normally
> > generates an interrupt when the request completes?
> > So for spi this is only fall-back code for a few systems.
>
> For something like spi-mem I'd expect to have interrupt support, we go
> all the way down to fully bitbanged though. You also often have
> copybreak cutoffs for smaller transfers (and quicker operations) where
> it's more efficient to poll for completion than use an interrupt, or
> PIO rather than DMA.
I do wonder about copybreak cutoffs especially for systems with iommu
(which has started including x86 systems).
Many years ago a colleague got a figure of ~1200 bytes for a sparc sbus card.
I suspect that the iommu setup code has got more complex (and slower)
since then and the memory copies faster (especially if you can do overlong
aligned copies that include the required data).
For spi-mem writes (max size is probably 256 bytes) that take 100s of us
it is best the sleep (for timer or isr).
spi-mem reads are another matter since the only delays are those needed
to bit-bang the physical device - which might be at over 100MHz and 4
(or even 8) data bits at a time.
Possibly worth using DMA for the data (to avoid leisurely PCIe reads)
but polling (ideally host memory) for completion to avoid the cost
of the ISR as well as avoiding context switches.
For the much slower smbus and i2c you pretty much always want to
offload the 'bit bang' to hardware and wait for an interrupts.
(I've seen ethernet drivers 'bugger' the system by repeatedly reading
the phy status - that could be done 1 bit every timer interrupt.)
>
> > The code has this comment:
> > /* Wait for the write - typically 0.6ms (max 5ms).
> > * In spite of the datasheet values, I'm seeing 200us writes. */
> > It waits 200us and then polls every 50us for 2 seconds.
FWIW I wrote the comment, the code below it, and the logic on the fpga
that converts the PCIe slave cycles into signals to the memory chip.
What I/we never resolved was why some chips/boards failed to act on
the 'read status' command issued after the first delay.
> You can also get fun with things like contention on shared buses.
Indeed - and in places you don't realise.
In some cases repeated reads of a slow device can restrict bus throughput
enough to make DMA requests underrun (eg trying to use an LCD panel on
a SA1100/SA1101 strongarm system).
-- David
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2026-05-30 10:20 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-19 10:24 [PATCH v2] iopoll: use udelay() for initial polling Peter Collingbourne
2026-05-19 18:35 ` David Laight
2026-05-20 7:38 ` Peter Collingbourne
2026-05-20 13:29 ` Ville Syrjälä
2026-05-21 5:59 ` Peter Collingbourne
2026-05-21 7:03 ` Jani Nikula
2026-05-22 14:50 ` Mark Brown
2026-05-29 5:59 ` Peter Collingbourne
2026-05-29 11:20 ` David Laight
2026-05-29 21:56 ` Mark Brown
2026-05-30 10:20 ` David Laight
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox