* [PATCH v2] iopoll: use udelay() for initial polling
@ 2026-05-19 10:24 Peter Collingbourne
2026-05-19 18:35 ` David Laight
2026-05-20 13:29 ` Ville Syrjälä
0 siblings, 2 replies; 6+ messages in thread
From: Peter Collingbourne @ 2026-05-19 10:24 UTC (permalink / raw)
To: Mark Brown
Cc: David Laight, Christophe Kerello, Patrice Chotard,
Boris Brezillon, linux-spi, linux-kernel, Jani Nikula,
Ville Syrjälä, Simona Vetter, Peter Collingbourne,
Randy Dunlap
A short polling delay, such as the delay of 5us
(SPINAND_READ_POLL_DELAY_US) provided by the SPI NAND driver,
can become a 1/HZ (order of ms) delay caused by the usleep_range()
call in read_poll_timeout(), significantly reducing SPI NAND access
performance. Fix it by adjusting the read_poll_timeout() macro to use
udelay() to delay until 1/10 of a timer tick after it is called, and
only then sleep.
Fixes: c955a0cc8a28 ("spi: spi-mem: add automatic poll status functions")
Signed-off-by: Peter Collingbourne <peter@pcc.me.uk>
---
include/linux/iopoll.h | 30 ++++++++++++++++++++++--------
1 file changed, 22 insertions(+), 8 deletions(-)
v2:
* Fix it in read_poll_timeout() instead
diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
index 53edd69acb9b..2ee89b76f072 100644
--- a/include/linux/iopoll.h
+++ b/include/linux/iopoll.h
@@ -19,9 +19,11 @@
*
* @op: Operation
* @cond: Break condition
- * @sleep_us: Maximum time to sleep between operations in us (0 tight-loops).
- * Please read usleep_range() function description for details and
- * limitations.
+ * @sleep_us: Maximum time to sleep or delay between operations in us
+ * (0 tight-loops). Please read usleep_range() and udelay()
+ * function descriptions for details and limitations.
+ * This macro will delay until 1/10 of a timer tick after
+ * it is called, and will then start sleeping.
* @timeout_us: Timeout in us, 0 means never timeout
* @sleep_before_op: if it is true, sleep @sleep_us before operation.
*
@@ -35,11 +37,18 @@
({ \
u64 __timeout_us = (timeout_us); \
unsigned long __sleep_us = (sleep_us); \
- ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
+ ktime_t __start_time = ktime_get(); \
+ u64 __delay_timeout_us = 100000/HZ; \
+ ktime_t __delay_timeout = ktime_add_us(__start_time, __delay_timeout_us); \
+ ktime_t __timeout = ktime_add_us(__start_time, __timeout_us); \
int ___ret; \
might_sleep_if((__sleep_us) != 0); \
- if ((sleep_before_op) && __sleep_us) \
- usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
+ if ((sleep_before_op) && __sleep_us) { \
+ if (__sleep_us <= __delay_timeout_us) \
+ udelay(__sleep_us); \
+ else \
+ usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
+ } \
for (;;) { \
bool __expired = __timeout_us && \
ktime_compare(ktime_get(), __timeout) > 0; \
@@ -54,8 +63,13 @@
___ret = -ETIMEDOUT; \
break; \
} \
- if (__sleep_us) \
- usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
+ if (__sleep_us) { \
+ if (__sleep_us <= __delay_timeout_us && \
+ ktime_compare(ktime_get(), __delay_timeout) < 0) \
+ udelay(__sleep_us); \
+ else \
+ usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
+ } \
cpu_relax(); \
} \
___ret; \
--
2.54.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH v2] iopoll: use udelay() for initial polling 2026-05-19 10:24 [PATCH v2] iopoll: use udelay() for initial polling Peter Collingbourne @ 2026-05-19 18:35 ` David Laight 2026-05-20 7:38 ` Peter Collingbourne 2026-05-20 13:29 ` Ville Syrjälä 1 sibling, 1 reply; 6+ messages in thread From: David Laight @ 2026-05-19 18:35 UTC (permalink / raw) To: Peter Collingbourne Cc: Mark Brown, Christophe Kerello, Patrice Chotard, Boris Brezillon, linux-spi, linux-kernel, Jani Nikula, Ville Syrjälä, Simona Vetter, Randy Dunlap On Tue, 19 May 2026 03:24:46 -0700 Peter Collingbourne <peter@pcc.me.uk> wrote: > A short polling delay, such as the delay of 5us > (SPINAND_READ_POLL_DELAY_US) provided by the SPI NAND driver, > can become a 1/HZ (order of ms) delay caused by the usleep_range() > call in read_poll_timeout(), significantly reducing SPI NAND access > performance. Fix it by adjusting the read_poll_timeout() macro to use > udelay() to delay until 1/10 of a timer tick after it is called, and > only then sleep. > > Fixes: c955a0cc8a28 ("spi: spi-mem: add automatic poll status functions") > Signed-off-by: Peter Collingbourne <peter@pcc.me.uk> > --- > include/linux/iopoll.h | 30 ++++++++++++++++++++++-------- > 1 file changed, 22 insertions(+), 8 deletions(-) > > v2: > * Fix it in read_poll_timeout() instead > > diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h > index 53edd69acb9b..2ee89b76f072 100644 > --- a/include/linux/iopoll.h > +++ b/include/linux/iopoll.h > @@ -19,9 +19,11 @@ > * > * @op: Operation > * @cond: Break condition > - * @sleep_us: Maximum time to sleep between operations in us (0 tight-loops). > - * Please read usleep_range() function description for details and > - * limitations. > + * @sleep_us: Maximum time to sleep or delay between operations in us > + * (0 tight-loops). Please read usleep_range() and udelay() > + * function descriptions for details and limitations. > + * This macro will delay until 1/10 of a timer tick after > + * it is called, and will then start sleeping. > * @timeout_us: Timeout in us, 0 means never timeout > * @sleep_before_op: if it is true, sleep @sleep_us before operation. > * > @@ -35,11 +37,18 @@ > ({ \ > u64 __timeout_us = (timeout_us); \ > unsigned long __sleep_us = (sleep_us); \ > - ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \ > + ktime_t __start_time = ktime_get(); \ > + u64 __delay_timeout_us = 100000/HZ; \ > + ktime_t __delay_timeout = ktime_add_us(__start_time, __delay_timeout_us); \ > + ktime_t __timeout = ktime_add_us(__start_time, __timeout_us); \ > int ___ret; \ > might_sleep_if((__sleep_us) != 0); \ > - if ((sleep_before_op) && __sleep_us) \ > - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ > + if ((sleep_before_op) && __sleep_us) { \ > + if (__sleep_us <= __delay_timeout_us) \ > + udelay(__sleep_us); \ > + else \ > + usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ > + } \ > for (;;) { \ > bool __expired = __timeout_us && \ > ktime_compare(ktime_get(), __timeout) > 0; \ > @@ -54,8 +63,13 @@ > ___ret = -ETIMEDOUT; \ > break; \ > } \ > - if (__sleep_us) \ > - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ > + if (__sleep_us) { \ > + if (__sleep_us <= __delay_timeout_us && \ > + ktime_compare(ktime_get(), __delay_timeout) < 0) \ > + udelay(__sleep_us); \ > + else \ > + usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ > + } \ > cpu_relax(); \ > } \ > ___ret; \ How about: #define poll_timeout_us(op, cond, sleep_us, timeout_us, sleep_before_op) \ ({ \ u64 __timeout_us = (timeout_us); \ unsigned long __sleep_us = (sleep_us); \ ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \ u64 __delay_timeout_us = 100000/HZ; \ int ___ret; \ bool __expired; \ might_sleep_if((__sleep_us) != 0); \ \ for (;; sleep_before_op = false) { \ if (!sleep_before_op) { \ __expired = __timeout_us && \ ktime_compare(ktime_get(), __timeout) > 0; \ /* guarantee 'op' and 'cond' are evaluated after timeout expired */ \ barrier(); \ op; \ if (cond) { \ ___ret = 0; \ break; \ } \ if (__expired) { \ ___ret = -ETIMEDOUT; \ break; \ } \ } \ if (__sleep_us > __delay_timeout_us) { \ usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ continue; } \ if (__sleep_us) { \ __delay_timeout_us -= __sleep_us; \ udelay(__sleep_us); \ } \ cpu_relax(); \ } \ ___ret; \ }) Which I think is approximately equivalent. But I'm not at all sure the usleep/udelay test it right. 100000/HZ is a strange number of usecs; for HZ=100 it is 1ms, but for HZ=1000 0.1ms. Maybe it should be more like: u32 __delay_timeout_us = __sleep_us > 20 ? 0 : 100; so that you delay for (approx) max 100us if the interval is less than 20us. The is also a mismatch of long and u64. I don't think anything (except the time_t) needs to be 64bit (esp. on 32bit). I'm not sure about all architectures, but I'm pretty sure than on x86 usleep_range() is independent of HZ. -- David ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] iopoll: use udelay() for initial polling 2026-05-19 18:35 ` David Laight @ 2026-05-20 7:38 ` Peter Collingbourne 0 siblings, 0 replies; 6+ messages in thread From: Peter Collingbourne @ 2026-05-20 7:38 UTC (permalink / raw) To: David Laight Cc: Mark Brown, Christophe Kerello, Patrice Chotard, Boris Brezillon, linux-spi, linux-kernel, Jani Nikula, Ville Syrjälä, Simona Vetter, Randy Dunlap On Tue, May 19, 2026 at 11:35 AM David Laight <david.laight.linux@gmail.com> wrote: > > On Tue, 19 May 2026 03:24:46 -0700 > Peter Collingbourne <peter@pcc.me.uk> wrote: > > > A short polling delay, such as the delay of 5us > > (SPINAND_READ_POLL_DELAY_US) provided by the SPI NAND driver, > > can become a 1/HZ (order of ms) delay caused by the usleep_range() > > call in read_poll_timeout(), significantly reducing SPI NAND access > > performance. Fix it by adjusting the read_poll_timeout() macro to use > > udelay() to delay until 1/10 of a timer tick after it is called, and > > only then sleep. > > > > Fixes: c955a0cc8a28 ("spi: spi-mem: add automatic poll status functions") > > Signed-off-by: Peter Collingbourne <peter@pcc.me.uk> > > --- > > include/linux/iopoll.h | 30 ++++++++++++++++++++++-------- > > 1 file changed, 22 insertions(+), 8 deletions(-) > > > > v2: > > * Fix it in read_poll_timeout() instead > > > > diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h > > index 53edd69acb9b..2ee89b76f072 100644 > > --- a/include/linux/iopoll.h > > +++ b/include/linux/iopoll.h > > @@ -19,9 +19,11 @@ > > * > > * @op: Operation > > * @cond: Break condition > > - * @sleep_us: Maximum time to sleep between operations in us (0 tight-loops). > > - * Please read usleep_range() function description for details and > > - * limitations. > > + * @sleep_us: Maximum time to sleep or delay between operations in us > > + * (0 tight-loops). Please read usleep_range() and udelay() > > + * function descriptions for details and limitations. > > + * This macro will delay until 1/10 of a timer tick after > > + * it is called, and will then start sleeping. > > * @timeout_us: Timeout in us, 0 means never timeout > > * @sleep_before_op: if it is true, sleep @sleep_us before operation. > > * > > @@ -35,11 +37,18 @@ > > ({ \ > > u64 __timeout_us = (timeout_us); \ > > unsigned long __sleep_us = (sleep_us); \ > > - ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \ > > + ktime_t __start_time = ktime_get(); \ > > + u64 __delay_timeout_us = 100000/HZ; \ > > + ktime_t __delay_timeout = ktime_add_us(__start_time, __delay_timeout_us); \ > > + ktime_t __timeout = ktime_add_us(__start_time, __timeout_us); \ > > int ___ret; \ > > might_sleep_if((__sleep_us) != 0); \ > > - if ((sleep_before_op) && __sleep_us) \ > > - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ > > + if ((sleep_before_op) && __sleep_us) { \ > > + if (__sleep_us <= __delay_timeout_us) \ > > + udelay(__sleep_us); \ > > + else \ > > + usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ > > + } \ > > for (;;) { \ > > bool __expired = __timeout_us && \ > > ktime_compare(ktime_get(), __timeout) > 0; \ > > @@ -54,8 +63,13 @@ > > ___ret = -ETIMEDOUT; \ > > break; \ > > } \ > > - if (__sleep_us) \ > > - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ > > + if (__sleep_us) { \ > > + if (__sleep_us <= __delay_timeout_us && \ > > + ktime_compare(ktime_get(), __delay_timeout) < 0) \ > > + udelay(__sleep_us); \ > > + else \ > > + usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ > > + } \ > > cpu_relax(); \ > > } \ > > ___ret; \ > > How about: > #define poll_timeout_us(op, cond, sleep_us, timeout_us, sleep_before_op) \ > ({ \ > u64 __timeout_us = (timeout_us); \ > unsigned long __sleep_us = (sleep_us); \ > ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \ > u64 __delay_timeout_us = 100000/HZ; \ > int ___ret; \ > bool __expired; \ > might_sleep_if((__sleep_us) != 0); \ > \ > for (;; sleep_before_op = false) { \ > if (!sleep_before_op) { \ > __expired = __timeout_us && \ > ktime_compare(ktime_get(), __timeout) > 0; \ > /* guarantee 'op' and 'cond' are evaluated after timeout expired */ \ > barrier(); \ > op; \ > if (cond) { \ > ___ret = 0; \ > break; \ > } \ > if (__expired) { \ > ___ret = -ETIMEDOUT; \ > break; \ > } \ > } \ > if (__sleep_us > __delay_timeout_us) { \ > usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ > continue; > } \ > if (__sleep_us) { \ > __delay_timeout_us -= __sleep_us; \ > udelay(__sleep_us); \ > } \ > cpu_relax(); \ > } \ > ___ret; \ > }) > Which I think is approximately equivalent. > But I'm not at all sure the usleep/udelay test it right. > 100000/HZ is a strange number of usecs; for HZ=100 it is 1ms, but for HZ=1000 0.1ms. The idea here was that since the sleep time for usleep_range() is tied to HZ (or so I thought, see below), the delay timeout should also be tied to HZ. > Maybe it should be more like: > u32 __delay_timeout_us = __sleep_us > 20 ? 0 : 100; > so that you delay for (approx) max 100us if the interval is less than 20us. I think that makes sense. > The is also a mismatch of long and u64. > I don't think anything (except the time_t) needs to be 64bit (esp. on 32bit). Ack. > I'm not sure about all architectures, but I'm pretty sure than on x86 > usleep_range() is independent of HZ. Right, after looking into it more I learned that the hrtimer should usually allow usleep_range() to return before the next timer tick. Which, as it turns out, was not happening on my target device because the kernel was configured without CONFIG_HIGH_RES_TIMERS. After turning it on, the performance issue is fixed, even without this patch. For me this raises the question of whether CONFIG_HIGH_RES_TIMERS=n is always a good default. On x86 (except for ancient microarchitectures) and arm64, the architecture provides a guaranteed hrtimer, so to me it makes sense for it to default to y on those architectures at least, so that users who write their own config (not defconfig based) will avoid issues like this difficult to debug performance issue by default. If we do decide to turn it on everywhere by default, it may be worth reconsidering whether we need to delay in poll_timeout_us() after all. Peter ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] iopoll: use udelay() for initial polling 2026-05-19 10:24 [PATCH v2] iopoll: use udelay() for initial polling Peter Collingbourne 2026-05-19 18:35 ` David Laight @ 2026-05-20 13:29 ` Ville Syrjälä 2026-05-21 5:59 ` Peter Collingbourne 1 sibling, 1 reply; 6+ messages in thread From: Ville Syrjälä @ 2026-05-20 13:29 UTC (permalink / raw) To: Peter Collingbourne Cc: Mark Brown, David Laight, Christophe Kerello, Patrice Chotard, Boris Brezillon, linux-spi, linux-kernel, Jani Nikula, Simona Vetter, Randy Dunlap On Tue, May 19, 2026 at 03:24:46AM -0700, Peter Collingbourne wrote: > A short polling delay, such as the delay of 5us > (SPINAND_READ_POLL_DELAY_US) provided by the SPI NAND driver, > can become a 1/HZ (order of ms) delay caused by the usleep_range() > call in read_poll_timeout(), significantly reducing SPI NAND access > performance. Fix it by adjusting the read_poll_timeout() macro to use > udelay() to delay until 1/10 of a timer tick after it is called, and > only then sleep. > > Fixes: c955a0cc8a28 ("spi: spi-mem: add automatic poll status functions") > Signed-off-by: Peter Collingbourne <peter@pcc.me.uk> > --- > include/linux/iopoll.h | 30 ++++++++++++++++++++++-------- > 1 file changed, 22 insertions(+), 8 deletions(-) > > v2: > * Fix it in read_poll_timeout() instead > > diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h > index 53edd69acb9b..2ee89b76f072 100644 > --- a/include/linux/iopoll.h > +++ b/include/linux/iopoll.h > @@ -19,9 +19,11 @@ > * > * @op: Operation > * @cond: Break condition > - * @sleep_us: Maximum time to sleep between operations in us (0 tight-loops). > - * Please read usleep_range() function description for details and > - * limitations. > + * @sleep_us: Maximum time to sleep or delay between operations in us > + * (0 tight-loops). Please read usleep_range() and udelay() > + * function descriptions for details and limitations. > + * This macro will delay until 1/10 of a timer tick after > + * it is called, and will then start sleeping. > * @timeout_us: Timeout in us, 0 means never timeout > * @sleep_before_op: if it is true, sleep @sleep_us before operation. > * > @@ -35,11 +37,18 @@ > ({ \ > u64 __timeout_us = (timeout_us); \ > unsigned long __sleep_us = (sleep_us); \ > - ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \ > + ktime_t __start_time = ktime_get(); \ > + u64 __delay_timeout_us = 100000/HZ; \ > + ktime_t __delay_timeout = ktime_add_us(__start_time, __delay_timeout_us); \ > + ktime_t __timeout = ktime_add_us(__start_time, __timeout_us); \ > int ___ret; \ > might_sleep_if((__sleep_us) != 0); \ > - if ((sleep_before_op) && __sleep_us) \ > - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ > + if ((sleep_before_op) && __sleep_us) { \ > + if (__sleep_us <= __delay_timeout_us) \ > + udelay(__sleep_us); \ If you want udelay() why not just use the atomic variant of the macro? > + else \ > + usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ > + } \ > for (;;) { \ > bool __expired = __timeout_us && \ > ktime_compare(ktime_get(), __timeout) > 0; \ > @@ -54,8 +63,13 @@ > ___ret = -ETIMEDOUT; \ > break; \ > } \ > - if (__sleep_us) \ > - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ > + if (__sleep_us) { \ > + if (__sleep_us <= __delay_timeout_us && \ > + ktime_compare(ktime_get(), __delay_timeout) < 0) \ > + udelay(__sleep_us); \ > + else \ > + usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ > + } \ > cpu_relax(); \ > } \ > ___ret; \ > -- > 2.54.0 -- Ville Syrjälä Intel ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] iopoll: use udelay() for initial polling 2026-05-20 13:29 ` Ville Syrjälä @ 2026-05-21 5:59 ` Peter Collingbourne 2026-05-21 7:03 ` Jani Nikula 0 siblings, 1 reply; 6+ messages in thread From: Peter Collingbourne @ 2026-05-21 5:59 UTC (permalink / raw) To: Ville Syrjälä Cc: Mark Brown, David Laight, Christophe Kerello, Patrice Chotard, Boris Brezillon, linux-spi, linux-kernel, Jani Nikula, Simona Vetter, Randy Dunlap On Wed, May 20, 2026 at 6:29 AM Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > > On Tue, May 19, 2026 at 03:24:46AM -0700, Peter Collingbourne wrote: > > A short polling delay, such as the delay of 5us > > (SPINAND_READ_POLL_DELAY_US) provided by the SPI NAND driver, > > can become a 1/HZ (order of ms) delay caused by the usleep_range() > > call in read_poll_timeout(), significantly reducing SPI NAND access > > performance. Fix it by adjusting the read_poll_timeout() macro to use > > udelay() to delay until 1/10 of a timer tick after it is called, and > > only then sleep. > > > > Fixes: c955a0cc8a28 ("spi: spi-mem: add automatic poll status functions") > > Signed-off-by: Peter Collingbourne <peter@pcc.me.uk> > > --- > > include/linux/iopoll.h | 30 ++++++++++++++++++++++-------- > > 1 file changed, 22 insertions(+), 8 deletions(-) > > > > v2: > > * Fix it in read_poll_timeout() instead > > > > diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h > > index 53edd69acb9b..2ee89b76f072 100644 > > --- a/include/linux/iopoll.h > > +++ b/include/linux/iopoll.h > > @@ -19,9 +19,11 @@ > > * > > * @op: Operation > > * @cond: Break condition > > - * @sleep_us: Maximum time to sleep between operations in us (0 tight-loops). > > - * Please read usleep_range() function description for details and > > - * limitations. > > + * @sleep_us: Maximum time to sleep or delay between operations in us > > + * (0 tight-loops). Please read usleep_range() and udelay() > > + * function descriptions for details and limitations. > > + * This macro will delay until 1/10 of a timer tick after > > + * it is called, and will then start sleeping. > > * @timeout_us: Timeout in us, 0 means never timeout > > * @sleep_before_op: if it is true, sleep @sleep_us before operation. > > * > > @@ -35,11 +37,18 @@ > > ({ \ > > u64 __timeout_us = (timeout_us); \ > > unsigned long __sleep_us = (sleep_us); \ > > - ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \ > > + ktime_t __start_time = ktime_get(); \ > > + u64 __delay_timeout_us = 100000/HZ; \ > > + ktime_t __delay_timeout = ktime_add_us(__start_time, __delay_timeout_us); \ > > + ktime_t __timeout = ktime_add_us(__start_time, __timeout_us); \ > > int ___ret; \ > > might_sleep_if((__sleep_us) != 0); \ > > - if ((sleep_before_op) && __sleep_us) \ > > - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ > > + if ((sleep_before_op) && __sleep_us) { \ > > + if (__sleep_us <= __delay_timeout_us) \ > > + udelay(__sleep_us); \ > > If you want udelay() why not just use the atomic variant of the macro? That's what I had in v1; we decided this approach would better handle misbehaving devices. https://lore.kernel.org/all/20260517150253.031dec09@pumpkin/ Peter ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] iopoll: use udelay() for initial polling 2026-05-21 5:59 ` Peter Collingbourne @ 2026-05-21 7:03 ` Jani Nikula 0 siblings, 0 replies; 6+ messages in thread From: Jani Nikula @ 2026-05-21 7:03 UTC (permalink / raw) To: Peter Collingbourne, Ville Syrjälä Cc: Mark Brown, David Laight, Christophe Kerello, Patrice Chotard, Boris Brezillon, linux-spi, linux-kernel, Simona Vetter, Randy Dunlap On Wed, 20 May 2026, Peter Collingbourne <peter@pcc.me.uk> wrote: > On Wed, May 20, 2026 at 6:29 AM Ville Syrjälä > <ville.syrjala@linux.intel.com> wrote: >> >> On Tue, May 19, 2026 at 03:24:46AM -0700, Peter Collingbourne wrote: >> > A short polling delay, such as the delay of 5us >> > (SPINAND_READ_POLL_DELAY_US) provided by the SPI NAND driver, >> > can become a 1/HZ (order of ms) delay caused by the usleep_range() >> > call in read_poll_timeout(), significantly reducing SPI NAND access >> > performance. Fix it by adjusting the read_poll_timeout() macro to use >> > udelay() to delay until 1/10 of a timer tick after it is called, and >> > only then sleep. What's "1/10 of a timer tick" based on? fsleep() has simply 10 us as the threshold. >> > >> > Fixes: c955a0cc8a28 ("spi: spi-mem: add automatic poll status functions") >> > Signed-off-by: Peter Collingbourne <peter@pcc.me.uk> >> > --- >> > include/linux/iopoll.h | 30 ++++++++++++++++++++++-------- >> > 1 file changed, 22 insertions(+), 8 deletions(-) >> > >> > v2: >> > * Fix it in read_poll_timeout() instead >> > >> > diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h >> > index 53edd69acb9b..2ee89b76f072 100644 >> > --- a/include/linux/iopoll.h >> > +++ b/include/linux/iopoll.h >> > @@ -19,9 +19,11 @@ >> > * >> > * @op: Operation >> > * @cond: Break condition >> > - * @sleep_us: Maximum time to sleep between operations in us (0 tight-loops). >> > - * Please read usleep_range() function description for details and >> > - * limitations. >> > + * @sleep_us: Maximum time to sleep or delay between operations in us >> > + * (0 tight-loops). Please read usleep_range() and udelay() >> > + * function descriptions for details and limitations. >> > + * This macro will delay until 1/10 of a timer tick after >> > + * it is called, and will then start sleeping. >> > * @timeout_us: Timeout in us, 0 means never timeout >> > * @sleep_before_op: if it is true, sleep @sleep_us before operation. >> > * >> > @@ -35,11 +37,18 @@ >> > ({ \ >> > u64 __timeout_us = (timeout_us); \ >> > unsigned long __sleep_us = (sleep_us); \ >> > - ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \ >> > + ktime_t __start_time = ktime_get(); \ >> > + u64 __delay_timeout_us = 100000/HZ; \ >> > + ktime_t __delay_timeout = ktime_add_us(__start_time, __delay_timeout_us); \ >> > + ktime_t __timeout = ktime_add_us(__start_time, __timeout_us); \ >> > int ___ret; \ >> > might_sleep_if((__sleep_us) != 0); \ >> > - if ((sleep_before_op) && __sleep_us) \ >> > - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \ >> > + if ((sleep_before_op) && __sleep_us) { \ >> > + if (__sleep_us <= __delay_timeout_us) \ >> > + udelay(__sleep_us); \ >> >> If you want udelay() why not just use the atomic variant of the macro? > > That's what I had in v1; we decided this approach would better handle > misbehaving devices. > https://lore.kernel.org/all/20260517150253.031dec09@pumpkin/ I think the problem with trying to adapt to everything within read_poll_timeout() is that every step like this adapts to a *specific* use case, and once it gets specific enough, it's no longer usable to other scenarios. Having to reimplement the whole thing in drivers is much worse than having to do two calls. Could a staggered approach work? ret = read_poll_timeout_atomic("short delay/timeout") if (ret) ret = read_poll_timeout("longer delay/timeout") Then you have better control of the behaviour in the driver, instead of adapting a generic function to a specific use case. BR, Jani. -- Jani Nikula, Intel ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-05-21 7:03 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-05-19 10:24 [PATCH v2] iopoll: use udelay() for initial polling Peter Collingbourne 2026-05-19 18:35 ` David Laight 2026-05-20 7:38 ` Peter Collingbourne 2026-05-20 13:29 ` Ville Syrjälä 2026-05-21 5:59 ` Peter Collingbourne 2026-05-21 7:03 ` Jani Nikula
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox