* [PATCH 00/15] timers: Cleanup delay/sleep related mess
@ 2024-09-04 13:04 Anna-Maria Behnsen
From: Anna-Maria Behnsen @ 2024-09-04 13:04 UTC
To: Frederic Weisbecker, Thomas Gleixner, Jonathan Corbet
Cc: linux-kernel, Len Brown, Rafael J. Wysocki, Anna-Maria Behnsen,
Peter Zijlstra, SeongJae Park, Andrew Morton, damon, linux-mm,
Arnd Bergmann, linux-arch, Heiner Kallweit, David S. Miller,
Andy Whitcroft, Joe Perches, Dwaipayan Ray, Liam Girdwood,
Mark Brown, Andrew Lunn, Jaroslav Kysela, Takashi Iwai, netdev,
linux-sound, Michael Ellerman, Nathan Lynch, linuxppc-dev,
Mauro Carvalho Chehab, linux-media
Hi,
a question about which sleeping function should be used in acpi_os_sleep()
started a discussion and examination of the existing documentation and
implementation of functions which insert a sleep/delay.
The result of the discussion was that the documentation is outdated and
that the implemented fsleep() reflects the outdated documentation instead
of reality. This in turn led to this queue, which covers the following
things:
- Minor changes (naming and typo fixes)
- Split out all timeout and sleep related functions from hrtimer.c and timer.c
into a separate file
- Update function descriptions of sleep related functions
- Change fsleep() to reflect reality
- Rework all comments or users which obviously rely on the outdated
documentation as they reference "Documentation/timers/timers-howto.rst"
- Last but not least (as there are no more references): Update the outdated
documentation and move it into a file with a self-explanatory file name
The queue is available here and applies on top of tip/timers/core:
git://git.kernel.org/pub/scm/linux/kernel/git/anna-maria/linux-devel.git timers/misc
Cc: linux-kernel@vger.kernel.org
Cc: Len Brown <len.brown@intel.com>
Cc: Rafael J. Wysocki <rafael@kernel.org>
To: Frederic Weisbecker <frederic@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
To: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Thanks,
Anna-Maria
---
Anna-Maria Behnsen (15):
timers: Rename next_expiry_recalc() to be unique
cpu: Use already existing usleep_range()
Comments: Fix wrong singular form of jiffies
timers: Move *sleep*() and timeout functions into a separate file
timers: Rename sleep_idle_range() to sleep_range_idle()
timers: Update function descriptions of sleep/delay related functions
timers: Adjust flseep() to reflect reality
mm/damon/core: Use generic upper bound recommondation for usleep_range()
timers: Add a warning to usleep_range_state() for wrong order of arguments
checkpatch: Remove broken sleep/delay related checks
regulator: core: Use fsleep() to get best sleep mechanism
iopoll/regmap/phy/snd: Fix comment referencing outdated timer documentation
powerpc/rtas: Use fsleep() to minimize additional sleep duration
media: anysee: Fix link to outdated sleep function documentation
timers/Documentation: Cleanup delay/sleep documentation
Documentation/admin-guide/media/vivid.rst | 2 +-
Documentation/dev-tools/checkpatch.rst | 6 -
Documentation/timers/delay_sleep_functions.rst | 122 +++++++
Documentation/timers/index.rst | 2 +-
Documentation/timers/timers-howto.rst | 115 -------
.../sp_SP/scheduler/sched-design-CFS.rst | 2 +-
MAINTAINERS | 2 +
arch/arm/mach-versatile/spc.c | 2 +-
arch/m68k/q40/q40ints.c | 2 +-
arch/powerpc/kernel/rtas.c | 21 +-
arch/x86/kernel/cpu/mce/dev-mcelog.c | 2 +-
drivers/char/ipmi/ipmi_ssif.c | 2 +-
drivers/dma-buf/st-dma-fence.c | 2 +-
drivers/gpu/drm/i915/gem/i915_gem_wait.c | 2 +-
drivers/gpu/drm/i915/gt/selftest_execlists.c | 4 +-
drivers/gpu/drm/i915/i915_utils.c | 2 +-
drivers/gpu/drm/v3d/v3d_bo.c | 2 +-
drivers/isdn/mISDN/dsp_cmx.c | 2 +-
drivers/media/usb/dvb-usb-v2/anysee.c | 6 +-
drivers/net/ethernet/marvell/mvmdio.c | 2 +-
drivers/regulator/core.c | 33 +-
fs/xfs/xfs_buf.h | 2 +-
include/asm-generic/delay.h | 46 ++-
include/linux/delay.h | 79 ++++-
include/linux/iopoll.h | 24 +-
include/linux/jiffies.h | 2 +-
include/linux/phy.h | 7 +-
include/linux/regmap.h | 18 +-
include/linux/timekeeper_internal.h | 2 +-
kernel/cpu.c | 2 +-
kernel/time/Makefile | 2 +-
kernel/time/alarmtimer.c | 2 +-
kernel/time/clockevents.c | 2 +-
kernel/time/hrtimer.c | 122 +------
kernel/time/posix-timers.c | 4 +-
kernel/time/sleep_timeout.c | 363 +++++++++++++++++++++
kernel/time/timer.c | 210 +-----------
lib/Kconfig.debug | 2 +-
mm/damon/core.c | 5 +-
net/batman-adv/types.h | 2 +-
scripts/checkpatch.pl | 38 ---
sound/soc/sof/ops.h | 6 +-
42 files changed, 668 insertions(+), 607 deletions(-)
* [PATCH 06/15] timers: Update function descriptions of sleep/delay related functions
@ 2024-09-04 13:04 Anna-Maria Behnsen
From: Anna-Maria Behnsen @ 2024-09-04 13:04 UTC
To: Frederic Weisbecker, Thomas Gleixner, Jonathan Corbet
Cc: linux-kernel, Len Brown, Rafael J. Wysocki, Anna-Maria Behnsen,
Arnd Bergmann, linux-arch

A lot of commonly used functions for inserting a sleep or delay lack a
proper function description. Add function descriptions to all of them to
have important information in a central place close to the code.

No functional change.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
 include/asm-generic/delay.h | 46 ++++++++++++++++++++++++++++++++++-----
 include/linux/delay.h       | 48 ++++++++++++++++++++++++++++++----------
 kernel/time/sleep_timeout.c | 53 ++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 123 insertions(+), 24 deletions(-)

diff --git a/include/asm-generic/delay.h b/include/asm-generic/delay.h
index e448ac61430c..76531ea7597a 100644
--- a/include/asm-generic/delay.h
+++ b/include/asm-generic/delay.h
@@ -11,12 +11,37 @@ extern void __ndelay(unsigned long nsecs);
 extern void __const_udelay(unsigned long xloops);
 extern void __delay(unsigned long loops);

-/*
- * The weird n/20000 thing suppresses a "comparison is always false due to
- * limited range of data type" warning with non-const 8-bit arguments.
+/**
+ * udelay - Inserting a delay based on microseconds with busy waiting
+ * @n: requested delay in microseconds
+ *
+ * When delaying in an atomic context ndelay(), udelay() and mdelay() are the
+ * only valid variants of delaying/sleeping to go with.
+ *
+ * When inserting delays in non atomic context which are shorter than the time
+ * which is required to queue e.g. an hrtimer and to enter then the scheduler,
+ * it is also valuable to use udelay(). But is not simple to specify a generic
+ * threshold for this which will fit for all systems, but an approximation would
+ * be a threshold for all delays up to 10 microseconds.
+ *
+ * When having a delay which is larger than the architecture specific
+ * %MAX_UDELAY_MS value, please make sure mdelay() is used. Otherwise a overflow
+ * risk is given.
+ *
+ * Please note that ndelay(), udelay() and mdelay() may return early for several
+ * reasons (https://lists.openwall.net/linux-kernel/2011/01/09/56):
+ *
+ * #. computed loops_per_jiffy too low (due to the time taken to execute the
+ *    timer interrupt.)
+ * #. cache behaviour affecting the time it takes to execute the loop function.
+ * #. CPU clock rate changes.
+ *
+ * Impelementation details for udelay() only:
+ *
+ * * The weird n/20000 thing suppresses a "comparison is always false due to
+ *   limited range of data type" warning with non-const 8-bit arguments.
+ * * 0x10c7 is 2**32 / 1000000 (rounded up)
  */
-
-/* 0x10c7 is 2**32 / 1000000 (rounded up) */
 #define udelay(n) \
 	({ \
 		if (__builtin_constant_p(n)) { \
@@ -29,7 +54,16 @@ extern void __delay(unsigned long loops);
 		} \
 	})

-/* 0x5 is 2**32 / 1000000000 (rounded up) */
+/**
+ * ndelay - Inserting a delay based on nanoseconds with busy waiting
+ * @n: requested delay in nanoseconds
+ *
+ * See udelay() for basic information about ndelay() and it's variants.
+ *
+ * Impelmentation details for ndelay():
+ *
+ * * 0x5 is 2**32 / 1000000000 (rounded up)
+ */
 #define ndelay(n) \
 	({ \
 		if (__builtin_constant_p(n)) { \
diff --git a/include/linux/delay.h b/include/linux/delay.h
index 2bc586aa2068..23623fa79768 100644
--- a/include/linux/delay.h
+++ b/include/linux/delay.h
@@ -6,17 +6,7 @@
  * Copyright (C) 1993 Linus Torvalds
  *
  * Delay routines, using a pre-computed "loops_per_jiffy" value.
- *
- * Please note that ndelay(), udelay() and mdelay() may return early for
- * several reasons:
- * 1. computed loops_per_jiffy too low (due to the time taken to
- *    execute the timer interrupt.)
- * 2. cache behaviour affecting the time it takes to execute the
- *    loop function.
- * 3. CPU clock rate changes.
- *
- * Please see this thread:
- * https://lists.openwall.net/linux-kernel/2011/01/09/56
+ * Sleep routines using timer list timers or hrtimers.
  */

 #include <linux/math.h>
@@ -35,12 +25,21 @@ extern unsigned long loops_per_jiffy;
  * The 2nd mdelay() definition ensures GCC will optimize away the
  * while loop for the common cases where n <= MAX_UDELAY_MS -- Paul G.
  */
-
 #ifndef MAX_UDELAY_MS
 #define MAX_UDELAY_MS 5
 #endif

 #ifndef mdelay
+/**
+ * mdelay - Inserting a delay based on microseconds with busy waiting
+ * @n: requested delay in microseconds
+ *
+ * See udelay() for basic information about mdelay() and it's variants.
+ *
+ * Please double check, whether mdelay() is the right way to go or whether a
+ * refactoring of the code is the better variant to be able to use msleep()
+ * instead.
+ */
 #define mdelay(n) (\
 	(__builtin_constant_p(n) && (n)<=MAX_UDELAY_MS) ? udelay((n)*1000) : \
 	({unsigned long __ms=(n); while (__ms--) udelay(1000);}))
@@ -63,16 +62,41 @@ unsigned long msleep_interruptible(unsigned int msecs);
 void usleep_range_state(unsigned long min, unsigned long max,
 			unsigned int state);

+/**
+ * usleep_range - Sleep for an approximate time
+ * @min: Minimum time in microseconds to sleep
+ * @max: Maximum time in microseconds to sleep
+ *
+ * For basic information please refere to usleep_range_state().
+ *
+ * The task will be in the state TASK_UNINTERRUPTIBLE during the sleep.
+ */
 static inline void usleep_range(unsigned long min, unsigned long max)
 {
 	usleep_range_state(min, max, TASK_UNINTERRUPTIBLE);
 }

+/**
+ * usleep_range_idle - Sleep for an approximate time with idle time accounting
+ * @min: Minimum time in microseconds to sleep
+ * @max: Maximum time in microseconds to sleep
+ *
+ * For basic information please refere to usleep_range_state().
+ *
+ * The sleeping task has the state TASK_IDLE during the sleep to prevent
+ * contribution to the load avarage.
+ */
 static inline void usleep_range_idle(unsigned long min, unsigned long max)
 {
 	usleep_range_state(min, max, TASK_IDLE);
 }

+/**
+ * ssleep - wrapper for seconds arount msleep
+ * @seconds: Requested sleep duration in seconds
+ *
+ * Please refere to msleep() for detailed information.
+ */
 static inline void ssleep(unsigned int seconds)
 {
 	msleep(seconds * 1000);
diff --git a/kernel/time/sleep_timeout.c b/kernel/time/sleep_timeout.c
index 9ecbce7da537..1ebd8429a64a 100644
--- a/kernel/time/sleep_timeout.c
+++ b/kernel/time/sleep_timeout.c
@@ -267,7 +267,34 @@ EXPORT_SYMBOL_GPL(schedule_hrtimeout);

 /**
  * msleep - sleep safely even with waitqueue interruptions
- * @msecs: Time in milliseconds to sleep for
+ * @msecs: Requested sleep duration in milliseconds
+ *
+ * msleep() uses jiffie based timeouts for the sleep duration. The accuracy of
+ * the resulting sleep duration depends on:
+ *
+ * * HZ configuration
+ * * sleep duration (as granularity of a bucket which collects timers increases
+ *   with the timer wheel levels)
+ *
+ * When the timer is queued into the second level of the timer wheel the maximum
+ * additional delay will be 12.5%. For explanation please check the detailed
+ * description about the basics of the timer wheel. In case this is accurate
+ * enough check which sleep length is selected to make sure required accuracy is
+ * given. Please use therefore the following simple steps:
+ *
+ * #. Decide which slack is fine for the requested sleep duration - but do not
+ *    use values shorter than 1/8
+ * #. Check whether your sleep duration is equal or greater than the following
+ *    result: ``TICK_NSEC / slack / NSEC_PER_MSEC``
+ *
+ * Examples:
+ *
+ * * ``HZ=1000`` with `slack=1/4``: all sleep durations greater or equal 4ms will meet
+ *   the constrains.
+ * * ``HZ=250`` with ``slack=1/4``: all sleep durations greater or equal 16ms will meet
+ *   the constrains.
+ *
+ * See also the signal aware variant msleep_interruptible().
  */
 void msleep(unsigned int msecs)
 {
@@ -280,7 +307,15 @@ EXPORT_SYMBOL(msleep);

 /**
  * msleep_interruptible - sleep waiting for signals
- * @msecs: Time in milliseconds to sleep for
+ * @msecs: Requested sleep duration in milliseconds
+ *
+ * See msleep() for some basic information.
+ *
+ * The difference between msleep() and msleep_interruptible() is that the sleep
+ * could be interrupted by a signal delivery and then returns early.
+ *
+ * Returns the remaining time of the sleep duration transformed to msecs (see
+ * schedule_timeout() for details).
  */
 unsigned long msleep_interruptible(unsigned int msecs)
 {
@@ -298,11 +333,17 @@ EXPORT_SYMBOL(msleep_interruptible);
  * @max: Maximum time in usecs to sleep
  * @state: State of the current task that will be while sleeping
  *
+ * usleep_range_state() sleeps at least for the minimum specified time but not
+ * longer than the maximum specified amount of time. The range might reduce
+ * power usage by allowing hrtimers to coalesce an already scheduled interrupt
+ * with this hrtimer. In the worst case, an interrupt is scheduled for the upper
+ * bound.
+ *
+ * The sleeping task is set to the specified state before starting the sleep.
+ *
  * In non-atomic context where the exact wakeup time is flexible, use
- * usleep_range_state() instead of udelay(). The sleep improves responsiveness
- * by avoiding the CPU-hogging busy-wait of udelay(), and the range reduces
- * power usage by allowing hrtimers to take advantage of an already-
- * scheduled interrupt instead of scheduling a new one just for this sleep.
+ * usleep_range() or its variants instead of udelay(). The sleep improves
+ * responsiveness by avoiding the CPU-hogging busy-wait of udelay().
  */
 void __sched usleep_range_state(unsigned long min, unsigned long max,
 				unsigned int state)
--
2.39.2
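
The msleep() accuracy rule from the kernel-doc in this patch (``TICK_NSEC / slack / NSEC_PER_MSEC``) can be checked with a few lines of user-space C. This is only a minimal sketch and not kernel code: msleep_min_msecs() is a hypothetical helper, the slack is passed as its denominator (4 stands for 1/4), and TICK_NSEC is assumed to be NSEC_PER_SEC / HZ rounded to the nearest integer.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC	1000000000ULL
#define NSEC_PER_MSEC	1000000ULL

/*
 * Hypothetical user-space helper, not a kernel API.
 *
 * Returns the shortest msleep() duration in milliseconds which satisfies
 * the rule "TICK_NSEC / slack / NSEC_PER_MSEC" from the msleep()
 * kernel-doc. The slack is given as its denominator: slack_den = 4
 * means a slack of 1/4.
 */
static uint64_t msleep_min_msecs(unsigned int hz, unsigned int slack_den)
{
	uint64_t tick_nsec;

	/* The kernel-doc asks for a slack of 1/8 or larger. */
	assert(hz > 0 && slack_den >= 1 && slack_den <= 8);

	/* Assumption: TICK_NSEC is NSEC_PER_SEC / HZ, rounded to nearest. */
	tick_nsec = (NSEC_PER_SEC + hz / 2) / hz;

	/* Dividing by 1/slack_den is the same as multiplying by slack_den. */
	return tick_nsec * slack_den / NSEC_PER_MSEC;
}
```

With these assumptions msleep_min_msecs(1000, 4) yields 4 and msleep_min_msecs(250, 4) yields 16, which matches the two examples given in the kernel-doc.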
* Re: [PATCH 06/15] timers: Update function descriptions of sleep/delay related functions
@ 2024-09-04 14:30 Arnd Bergmann
From: Arnd Bergmann @ 2024-09-04 14:30 UTC
To: Anna-Maria Gleixner, Frederic Weisbecker, Thomas Gleixner, Jonathan Corbet
Cc: linux-kernel, Len Brown, Rafael J . Wysocki, Linux-Arch

On Wed, Sep 4, 2024, at 13:04, Anna-Maria Behnsen wrote:
> A lot of commonly used functions for inserting a sleep or delay lack a
> proper function description. Add function descriptions to all of them to
> have important information in a central place close to the code.
>
> No functional change.
>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: linux-arch@vger.kernel.org
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
>  include/asm-generic/delay.h | 46 ++++++++++++++++++++++++++++++++++-----
>  include/linux/delay.h       | 48 ++++++++++++++++++++++++++++++----------
>  kernel/time/sleep_timeout.c | 53 ++++++++++++++++++++++++++++++++++++++++-----
>  3 files changed, 123 insertions(+), 24 deletions(-)

Acked-by: Arnd Bergmann <arnd@arndb.de> # asm-generic

Arnd
* Re: [PATCH 06/15] timers: Update function descriptions of sleep/delay related functions
@ 2024-09-05 6:59 Thomas Gleixner
From: Thomas Gleixner @ 2024-09-05 6:59 UTC
To: Anna-Maria Behnsen, Frederic Weisbecker, Jonathan Corbet
Cc: linux-kernel, Len Brown, Rafael J. Wysocki, Anna-Maria Behnsen,
Arnd Bergmann, linux-arch, Andrew Morton, Nathan Chancellor, Nick Desaulniers

On Wed, Sep 04 2024 at 15:04, Anna-Maria Behnsen wrote:
> +/**
> + * udelay - Inserting a delay based on microseconds with busy waiting
> + * @n: requested delay in microseconds

....

> + * Impelementation details for udelay() only:

Implementation

> + * * The weird n/20000 thing suppresses a "comparison is always false due to
> + *   limited range of data type" warning with non-const 8-bit arguments.
> + * * 0x10c7 is 2**32 / 1000000 (rounded up)
>   */

That spello aside, I don't see how this information is interesting for
the user of udelay(). It's really an implementation detail and the user
does not care about this piece of art at all.

Though that made me look at this voodoo and the magic constants in
detail.

The division was added in a87e553fabe8 ("asm-generic: delay.h fix udelay
and ndelay for 8 bit args") to work around a compiler which is upset
about the comparison even when __builtin_constant_p(arg) is false:

  warning: comparison is always false due to limited range of data type

The changelog is silent about the compiler version. I assume it's clang
because clang still complains on a plain (n) > 20000 when udelay() is
invoked with a u8 variable as argument:

  warning: result of comparison of constant 20000 with expression of type
  'unsigned char' is always false [-Wtautological-constant-out-of-range-compare]

while gcc does not care.

The change log explains further that type casting 'n' in the comparison
does not cure it. Contemporary clang seems to be less stupid and

	if ((unsigned long)(n) >= 20000) \

compiles just fine. Though assuming that some older clang version failed
and is still allowed to be used for compiling the kernel, we have to work
around it.

However, instead of proliferating this voodoo can we please convert it
into something comprehensible?

/*
 * The microseconds delay multiplicator is used to convert a constant
 * microseconds value to a <INSERT COHERENT EXPLANATION>.
 */
#define UDELAY_CONST_MULT ((unsigned long)DIV_ROUND_UP(1ULL << 32, USEC_PER_SEC))

/*
 * The maximum constant udelay value picked out of thin air
 * to avoid <INSERT COHERENT EXPLANATION>.
 */
#define UDELAY_CONST_MAX 20000

/**
 * udelay - .....
 */
static __always_inline void udelay(unsigned long usec)
{
	/*
	 * <INSERT COHERENT EXPLANATION> for this construct
	 */
	if (__builtin_constant_p(usec)) {
		if (usec >= UDELAY_CONST_MAX)
			__bad_udelay();
		else
			__const_udelay(usec * UDELAY_CONST_MULT);
	} else {
		__udelay(usec);
	}
}

Both gcc and clang optimize this correctly with -O2. If there are
ancient compilers which fail to do so: *shrug*.

> + * See udelay() for basic information about ndelay() and it's variants.
> + *
> + * Impelmentation details for ndelay():

vs.

> + * Impelementation details for udelay() only:

above. Can you please make your mind up which mis-spelled variant to
pick? :)

> /**
>  * msleep_interruptible - sleep waiting for signals
> - * @msecs: Time in milliseconds to sleep for
> + * @msecs: Requested sleep duration in milliseconds
> + *
> + * See msleep() for some basic information.
> + *
> + * The difference between msleep() and msleep_interruptible() is that the sleep
> + * could be interrupted by a signal delivery and then returns early.
> + *
> + * Returns the remaining time of the sleep duration transformed to msecs (see
> + * schedule_timeout() for details).

Returns: The remaining ...

Thanks,

tglx
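
The magic numbers discussed in this reply can be reproduced with plain integer arithmetic. The sketch below is user-space C and not the kernel implementation: div_round_up() merely stands in for the kernel's DIV_ROUND_UP(), and all helper names are hypothetical. It also computes the largest microsecond value whose product with the multiplier still fits into 32 bits, under the assumption of a 32-bit unsigned long.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC	1000000ULL
#define NSEC_PER_SEC	1000000000ULL

/* Stand-in for the kernel's DIV_ROUND_UP(); the name is hypothetical. */
static uint64_t div_round_up(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/* 2**32 / USEC_PER_SEC, rounded up: the 0x10c7 used for udelay(). */
static uint64_t udelay_const_mult(void)
{
	return div_round_up(1ULL << 32, USEC_PER_SEC);
}

/* 2**32 / NSEC_PER_SEC, rounded up: the 0x5 used for ndelay(). */
static uint64_t ndelay_const_mult(void)
{
	return div_round_up(1ULL << 32, NSEC_PER_SEC);
}

/*
 * Largest usec value for which usec * multiplier does not overflow
 * 32 bits (assumption: unsigned long is 32 bits wide).
 */
static uint64_t udelay_max_usec_32bit(void)
{
	return UINT32_MAX / udelay_const_mult();
}
```

udelay_const_mult() evaluates to 4295 (0x10c7) and ndelay_const_mult() to 5. The proposed UDELAY_CONST_MAX of 20000 stays far below the 32-bit overflow limit of 999992 microseconds returned by udelay_max_usec_32bit().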
* Re: [PATCH 06/15] timers: Update function descriptions of sleep/delay related functions
@ 2024-09-05 16:07 Thomas Gleixner
From: Thomas Gleixner @ 2024-09-05 16:07 UTC
To: Anna-Maria Behnsen, Frederic Weisbecker, Jonathan Corbet
Cc: linux-kernel, Len Brown, Rafael J. Wysocki, Anna-Maria Behnsen,
Arnd Bergmann, linux-arch, Andrew Morton, Nathan Chancellor, Nick Desaulniers

On Thu, Sep 05 2024 at 08:59, Thomas Gleixner wrote:
> On Wed, Sep 04 2024 at 15:04, Anna-Maria Behnsen wrote:
> However, instead of proliferating this voodoo can we please convert it
> into something comprehensible?
>
> /*
>  * The microseconds delay multiplicator is used to convert a constant
>  * microseconds value to a <INSERT COHERENT EXPLANATION>.
>  */
> #define UDELAY_CONST_MULT ((unsigned long)DIV_ROUND_UP(1ULL << 32, USEC_PER_SEC))
>
> /*
>  * The maximum constant udelay value picked out of thin air
>  * to avoid <INSERT COHERENT EXPLANATION>.
>  */
> #define UDELAY_CONST_MAX 20000
>
> /**
>  * udelay - .....
>  */
> static __always_inline void udelay(unsigned long usec)
> {
> 	/*
> 	 * <INSERT COHERENT EXPLANATION> for this construct
> 	 */
> 	if (__builtin_constant_p(usec)) {
> 		if (usec >= UDELAY_CONST_MAX)
> 			__bad_udelay();
> 		else
> 			__const_udelay(usec * UDELAY_CONST_MULT);
> 	} else {
> 		__udelay(usec);

And of course these magic numeric constants have been copied all over
the place.

  git grep '__const_udelay(' arch/ ....

Just SH managed to use 0x10c6 instead of 0x10c7.

ARM has its own udelay implementation:

#define udelay(n) \
	(__builtin_constant_p(n) ? \
	  ((n) > (MAX_UDELAY_MS * 1000) ? __bad_udelay() : \
			__const_udelay((n) * UDELAY_MULT)) : \
	  __udelay(n))

Amazingly this uses the same comparison construct which was in the
generic udelay implementation...

Same for arc, m68k and microblaze.

Plus the default implementation for mdelay() in linux/delay.h:

#define mdelay(n) (\
	(__builtin_constant_p(n) && (n)<=MAX_UDELAY_MS) ? udelay((n)*1000) : \
	({unsigned long __ms=(n); while (__ms--) udelay(1000);}))

Oh well....

What's truly amazing is that all __udelay() implementations, which
invoke __const_udelay() under the hood, do:

	__const_udelay(usec * 0x10c7);

So we have an arbitrary range limit for constants, which makes the build
fail. But the variable based udelays can hand in whatever they want and
__udelay() happily ignores it including the possible multiplication
overflow.

That's all really consistently copy and pasted voodoo.

The other architecture implementations are not much better in that
regard. The main difference is their cutoff value for __const_udelay()
and the multiplication factors.

The below uncompiled and untested pile is an attempt to consolidate this
mess as far as it goes. There is probably more to mop up, but for a
start this makes already sense.

Thanks,

tglx
---
 arch/Kconfig                        |    3
 arch/arc/include/asm/delay.h        |   43 ------------
 arch/arm64/Kconfig                  |    1
 arch/arm64/lib/delay.c              |   29 --------
 arch/csky/Kconfig                   |    1
 arch/csky/lib/delay.c               |   22 ------
 arch/loongarch/Kconfig              |    1
 arch/loongarch/include/asm/delay.h  |   16 ----
 arch/loongarch/lib/delay.c          |   23 ------
 arch/microblaze/Kconfig             |    1
 arch/microblaze/include/asm/delay.h |   48 --------------
 arch/mips/Kconfig                   |    1
 arch/mips/include/asm/delay.h       |   18 ++---
 arch/mips/lib/delay.c               |   29 +-------
 arch/nios2/Kconfig                  |    1
 arch/nios2/lib/delay.c              |   22 ------
 arch/openrisc/Kconfig               |    1
 arch/openrisc/lib/delay.c           |   22 ------
 arch/sh/Kconfig                     |    1
 arch/sh/kernel/sh_ksyms_32.c        |    4 -
 arch/sh/lib/delay.c                 |   23 +-----
 arch/x86/Kconfig                    |    1
 arch/x86/include/asm/delay.h        |    9 ++
 arch/x86/lib/delay.c                |   23 ------
 arch/x86/um/delay.c                 |   24 -------
 include/asm-generic/delay.h         |  120 +++++++++++++++++++++++++++---------
 26 files changed, 145 insertions(+), 342 deletions(-)

--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -55,6 +55,9 @@ config HOTPLUG_PARALLEL
 	bool
 	select HOTPLUG_SPLIT_STARTUP

+config GENERIC_DELAY
+	bool
+
 config GENERIC_ENTRY
 	bool

--- a/arch/arc/include/asm/delay.h
+++ b/arch/arc/include/asm/delay.h
@@ -14,11 +14,6 @@
 #ifndef __ASM_ARC_UDELAY_H
 #define __ASM_ARC_UDELAY_H

-#include <asm-generic/types.h>
-#include <asm/param.h> /* HZ */
-
-extern unsigned long loops_per_jiffy;
-
 static inline void __delay(unsigned long loops)
 {
 	__asm__ __volatile__(
@@ -27,43 +22,11 @@ static inline void __delay(unsigned long
 	" nop \n"
 	"1: \n"
 	:
-	: "r"(loops)
-	: "lp_count");
+	: "r"(loops)
+	: "lp_count");
 }

-extern void __bad_udelay(void);
-
-/*
- * Normal Math for computing loops in "N" usecs
- * -we have precomputed @loops_per_jiffy
- * -1 sec has HZ jiffies
- * loops per "N" usecs = ((loops_per_jiffy * HZ / 1000000) * N)
- *
- * Approximate Division by multiplication:
- * -Mathematically if we multiply and divide a number by same value the
- * result remains unchanged: In this case, we use 2^32
- * -> (loops_per_N_usec * 2^32 ) / 2^32
- * -> (((loops_per_jiffy * HZ / 1000000) * N) * 2^32) / 2^32
- * -> (loops_per_jiffy * HZ * N * 4295) / 2^32
- *
- * -Divide by 2^32 is very simply right shift by 32
- * -We simply need to ensure that the multiply per above eqn happens in
- * 64-bit precision (if CPU doesn't support it - gcc can emaulate it)
- */
-
-static inline void __udelay(unsigned long usecs)
-{
-	unsigned long loops;
-
-	/* (u64) cast ensures 64 bit MPY - real or emulated
-	 * HZ * 4295 is pre-evaluated by gcc - hence only 2 mpy ops
-	 */
-	loops = ((u64) usecs * 4295 * HZ * loops_per_jiffy) >> 32;
-
-	__delay(loops);
-}
+#include <asm-generic/delay.h>

-#define udelay(n) (__builtin_constant_p(n) ? ((n) > 20000 ? __bad_udelay() \
-				: __udelay(n)) : __udelay(n))

 #endif /* __ASM_ARC_UDELAY_H */
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -138,6 +138,7 @@ config ARM64
 	select GENERIC_CPU_AUTOPROBE
 	select GENERIC_CPU_DEVICES
 	select GENERIC_CPU_VULNERABILITIES
+	select GENERIC_DELAY
 	select GENERIC_EARLY_IOREMAP
 	select GENERIC_IDLE_POLL_SETUP
 	select GENERIC_IOREMAP
--- a/arch/arm64/lib/delay.c
+++ b/arch/arm64/lib/delay.c
@@ -15,13 +15,8 @@

 #include <clocksource/arm_arch_timer.h>

-#define USECS_TO_CYCLES(time_usecs) \
-	xloops_to_cycles((time_usecs) * 0x10C7UL)
-
-static inline unsigned long xloops_to_cycles(unsigned long xloops)
-{
-	return (xloops * loops_per_jiffy * HZ) >> 32;
-}
+#define USECS_TO_CYCLES(time_usecs) \
+	(usec * DELAY_MULT_LPJ * UDELAY_MULT) >> UDELAY_SHIFT)

 void __delay(unsigned long cycles)
 {
@@ -37,7 +32,7 @@ void __delay(unsigned long cycles)
 			wfit(end);
 		while ((get_cycles() - start) < cycles)
 			wfet(end);
-	} else if (arch_timer_evtstrm_available()) {
+	} else if (arch_timer_evtstrm_available()) {
 		const cycles_t timer_evt_period =
 			USECS_TO_CYCLES(ARCH_TIMER_EVT_STREAM_PERIOD_US);

@@ -49,21 +44,3 @@ void __delay(unsigned long cycles)
 		cpu_relax();
 }
 EXPORT_SYMBOL(__delay);
-
-inline void __const_udelay(unsigned long xloops)
-{
-	__delay(xloops_to_cycles(xloops));
-}
-EXPORT_SYMBOL(__const_udelay);
-
-void __udelay(unsigned long usecs)
-{
-	__const_udelay(usecs * 0x10C7UL); /* 2**32 / 1000000 (rounded up) */
-}
-EXPORT_SYMBOL(__udelay);
-
-void __ndelay(unsigned long nsecs)
-{
-	__const_udelay(nsecs * 0x5UL); /* 2**32 / 1000000000 (rounded up) */
-}
-EXPORT_SYMBOL(__ndelay);
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -48,6 +48,7 @@ config CSKY
 	select DMA_DIRECT_REMAP
 	select IRQ_DOMAIN
 	select DW_APB_TIMER_OF
+	select GENERIC_DELAY
 	select GENERIC_IOREMAP
 	select GENERIC_LIB_ASHLDI3
 	select GENERIC_LIB_ASHRDI3
--- a/arch/csky/lib/delay.c
+++ b/arch/csky/lib/delay.c
@@ -15,25 +15,3 @@ void __aligned(8) __delay(unsigned long
 	: "0"(loops));
 }
 EXPORT_SYMBOL(__delay);
-
-void __const_udelay(unsigned long xloops)
-{
-	unsigned long long loops;
-
-	loops = (unsigned long long)xloops * loops_per_jiffy * HZ;
-
-	__delay(loops >> 32);
-}
-EXPORT_SYMBOL(__const_udelay);
-
-void __udelay(unsigned long usecs)
-{
-	__const_udelay(usecs * 0x10C7UL); /* 2**32 / 1000000 (rounded up) */
-}
-EXPORT_SYMBOL(__udelay);
-
-void __ndelay(unsigned long nsecs)
-{
-	__const_udelay(nsecs * 0x5UL); /* 2**32 / 1000000000 (rounded up) */
-}
-EXPORT_SYMBOL(__ndelay);
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -82,6 +82,7 @@ config LOONGARCH
 	select GENERIC_CMOS_UPDATE
 	select GENERIC_CPU_AUTOPROBE
 	select GENERIC_CPU_DEVICES
+	select GENERIC_DELAY
 	select GENERIC_ENTRY
 	select GENERIC_GETTIMEOFDAY
 	select GENERIC_IOREMAP if !ARCH_IOREMAP
--- a/arch/loongarch/include/asm/delay.h
+++ b/arch/loongarch/include/asm/delay.h
@@ -7,20 +7,8 @@

 #include <linux/param.h>

-extern void __delay(unsigned long cycles);
-extern void __ndelay(unsigned long ns);
-extern void __udelay(unsigned long us);
+#define DELAY_LPJ_MULT lpj_fine

-#define ndelay(ns) __ndelay(ns)
-#define udelay(us) __udelay(us)
-
-/* make sure "usecs *= ..." in udelay do not overflow. */
-#if HZ >= 1000
-#define MAX_UDELAY_MS 1
-#elif HZ <= 200
-#define MAX_UDELAY_MS 5
-#else
-#define MAX_UDELAY_MS (1000 / HZ)
-#endif
+#include <asm-generic/delay.h>

 #endif /* _ASM_DELAY_H */
--- a/arch/loongarch/lib/delay.c
+++ b/arch/loongarch/lib/delay.c
@@ -17,26 +17,3 @@ void __delay(unsigned long cycles)
 		cpu_relax();
 }
 EXPORT_SYMBOL(__delay);
-
-/*
- * Division by multiplication: you don't have to worry about
- * loss of precision.
- *
- * Use only for very small delays ( < 1 msec). Should probably use a
- * lookup table, really, as the multiplications take much too long with
- * short delays. This is a "reasonable" implementation, though (and the
- * first constant multiplications gets optimized away if the delay is
- * a constant)
- */
-
-void __udelay(unsigned long us)
-{
-	__delay((us * 0x000010c7ull * HZ * lpj_fine) >> 32);
-}
-EXPORT_SYMBOL(__udelay);
-
-void __ndelay(unsigned long ns)
-{
-	__delay((ns * 0x00000005ull * HZ * lpj_fine) >> 32);
-}
-EXPORT_SYMBOL(__ndelay);
--- a/arch/microblaze/Kconfig
+++ b/arch/microblaze/Kconfig
@@ -16,6 +16,7 @@ config MICROBLAZE
 	select DMA_DIRECT_REMAP
 	select GENERIC_ATOMIC64
 	select GENERIC_CPU_DEVICES
+	select GENERIC_DELAY
 	select GENERIC_IDLE_POLL_SETUP
 	select GENERIC_IRQ_PROBE
 	select GENERIC_IRQ_SHOW
--- a/arch/microblaze/include/asm/delay.h
+++ b/arch/microblaze/include/asm/delay.h
@@ -33,53 +33,9 @@ static inline void __delay(unsigned long
  * (which corresponds to ~3800 bogomips at HZ = 100).
  * -- paulus
  */
-#define __MAX_UDELAY (226050910UL/HZ) /* maximum udelay argument */
-#define __MAX_NDELAY (4294967295UL/HZ) /* maximum ndelay argument */

-extern unsigned long loops_per_jiffy;
+#define UDELAY_ARCH_MULT (19 * 226)

-static inline void __udelay(unsigned int x)
-{
-
-	unsigned long long tmp =
-		(unsigned long long)x * (unsigned long long)loops_per_jiffy \
-			* 226LL;
-	unsigned loops = tmp >> 32;
-
-/*
-	__asm__("mulxuu %0,%1,%2" : "=r" (loops) :
-		"r" (x), "r" (loops_per_jiffy * 226));
-*/
-	__delay(loops);
-}
-
-extern void __bad_udelay(void); /* deliberately undefined */
-extern void __bad_ndelay(void); /* deliberately undefined */
-
-#define udelay(n) \
-	({ \
-		if (__builtin_constant_p(n)) { \
-			if ((n) / __MAX_UDELAY >= 1) \
-				__bad_udelay(); \
-			else \
-				__udelay((n) * (19 * HZ)); \
-		} else { \
-			__udelay((n) * (19 * HZ)); \
-		} \
-	})
-
-#define ndelay(n) \
-	({ \
-		if (__builtin_constant_p(n)) { \
-			if ((n) / __MAX_NDELAY >= 1) \
-				__bad_ndelay(); \
-			else \
-				__udelay((n) * HZ); \
-		} else { \
-			__udelay((n) * HZ); \
-		} \
-	})
-
-#define muldiv(a, b, c) (((a)*(b))/(c))
+#include <asm-generic/delay.h>

 #endif /* _ASM_MICROBLAZE_DELAY_H */
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -34,6 +34,7 @@ config MIPS
 	select GENERIC_ATOMIC64 if !64BIT
 	select GENERIC_CMOS_UPDATE
 	select GENERIC_CPU_AUTOPROBE
+	select GENERIC_DELAY if !CAVIUM_OCTEON_SOC
 	select GENERIC_GETTIMEOFDAY
 	select GENERIC_IOMAP
 	select GENERIC_IRQ_PROBE
--- a/arch/mips/include/asm/delay.h
+++ b/arch/mips/include/asm/delay.h
@@ -11,22 +11,22 @@
 #ifndef _ASM_DELAY_H
 #define _ASM_DELAY_H

-#include <linux/param.h>
+#ifdef CONFIG GENERIC_DELAY

+void __delay_loops(unsigned long long delay);
+#define __delay_loops __delay_loops
+#define DELAY_MULT_LPJ 1
+#define UDELAY_ARCH_SHIFT 0
+
+#include <asm-generic/delay.h>
+
+#else
 extern void __delay(unsigned long loops);
 extern void __ndelay(unsigned long ns);
 extern void __udelay(unsigned long us);

 #define ndelay(ns) __ndelay(ns)
 #define udelay(us) __udelay(us)
-
-/* make sure "usecs *= ..." in udelay do not overflow. */
-#if HZ >= 1000
-#define MAX_UDELAY_MS 1
-#elif HZ <= 200
-#define MAX_UDELAY_MS 5
-#else
-#define MAX_UDELAY_MS (1000 / HZ)
 #endif

 #endif /* _ASM_DELAY_H */
--- a/arch/mips/lib/delay.c
+++ b/arch/mips/lib/delay.c
@@ -38,31 +38,12 @@ void __delay(unsigned long loops)
 }
 EXPORT_SYMBOL(__delay);

-/*
- * Division by multiplication: you don't have to worry about
- * loss of precision.
- *
- * Use only for very small delays ( < 1 msec). Should probably use a
- * lookup table, really, as the multiplications take much too long with
- * short delays. This is a "reasonable" implementation, though (and the
- * first constant multiplications gets optimized away if the delay is
- * a constant)
- */
-
-void __udelay(unsigned long us)
-{
-	unsigned int lpj = raw_current_cpu_data.udelay_val;
-
-	__delay((us * 0x000010c7ull * HZ * lpj) >> 32);
-}
-EXPORT_SYMBOL(__udelay);
-
-void __ndelay(unsigned long ns)
+void __delay_loops(unsigned long long delay)
 {
-	unsigned int lpj = raw_current_cpu_data.udelay_val;
+	unsigned long lpj = raw_current_cpu_data.udelay_val;
+	unsigned long xloops = (delay * lpj) >> 32;

-	__delay((ns * 0x00000005ull * HZ * lpj) >> 32);
+	__delay(++xloops);
 }
-EXPORT_SYMBOL(__ndelay);
-
+EXPORT_SYMBOL(__delay_loops);
 #endif
--- a/arch/nios2/Kconfig
+++ b/arch/nios2/Kconfig
@@ -12,6 +12,7 @@ config NIOS2
 	select TIMER_OF
 	select GENERIC_ATOMIC64
 	select GENERIC_CPU_DEVICES
+	select GENERIC_DELAY
 	select GENERIC_IRQ_PROBE
 	select GENERIC_IRQ_SHOW
 	select HAVE_ARCH_TRACEHOOK
--- a/arch/nios2/lib/delay.c
+++ b/arch/nios2/lib/delay.c
@@ -16,25 +16,3 @@ void __delay(unsigned long cycles)
 		cpu_relax();
 }
 EXPORT_SYMBOL(__delay);
-
-void __const_udelay(unsigned long xloops)
-{
-	u64 loops;
-
-	loops = (u64)xloops * loops_per_jiffy * HZ;
-
-	__delay(loops >> 32);
-}
-EXPORT_SYMBOL(__const_udelay);
-
-void __udelay(unsigned long usecs)
-{
-	__const_udelay(usecs * 0x10C7UL); /* 2**32 / 1000000 (rounded up) */
-}
-EXPORT_SYMBOL(__udelay);
-
-void __ndelay(unsigned long nsecs)
-{
-	__const_udelay(nsecs * 0x5UL); /* 2**32 / 1000000000 (rounded up) */
-}
-EXPORT_SYMBOL(__ndelay);
--- a/arch/openrisc/Kconfig
+++ b/arch/openrisc/Kconfig
@@ -17,6 +17,7 @@ config OPENRISC
 	select GPIOLIB
 	select HAVE_ARCH_TRACEHOOK
 	select SPARSE_IRQ
+	select GENERIC_DELAY
 	select GENERIC_IRQ_CHIP
 	select GENERIC_IRQ_PROBE
 	select GENERIC_IRQ_SHOW
--- a/arch/openrisc/lib/delay.c
+++ b/arch/openrisc/lib/delay.c
@@ -35,25 +35,3 @@ void __delay(unsigned long cycles)
 		cpu_relax();
 }
 EXPORT_SYMBOL(__delay);
-
-inline void __const_udelay(unsigned long xloops)
-{
-	unsigned long long loops;
-
-	loops = (unsigned long long)xloops * loops_per_jiffy * HZ;
-
-	__delay(loops >> 32);
-}
-EXPORT_SYMBOL(__const_udelay);
-
-void __udelay(unsigned long usecs)
-{
-	__const_udelay(usecs * 0x10C7UL); /* 2**32 / 1000000 (rounded up) */
-}
-EXPORT_SYMBOL(__udelay);
-
-void __ndelay(unsigned long nsecs)
-{
-	__const_udelay(nsecs * 0x5UL); /* 2**32 / 1000000000 (rounded up) */
-}
-EXPORT_SYMBOL(__ndelay);
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -18,6 +18,7 @@ config SUPERH
 	select DMA_DECLARE_COHERENT
 	select GENERIC_ATOMIC64
 	select GENERIC_CMOS_UPDATE if SH_SH03 || SH_DREAMCAST
+	select GENERIC_DELAY
 	select GENERIC_IDLE_POLL_SETUP
 	select GENERIC_IRQ_SHOW
 	select GENERIC_LIB_ASHLDI3
--- a/arch/sh/kernel/sh_ksyms_32.c
+++ b/arch/sh/kernel/sh_ksyms_32.c
@@ -12,9 +12,7 @@ EXPORT_SYMBOL(memcpy);
 EXPORT_SYMBOL(memset);
 EXPORT_SYMBOL(memmove);
 EXPORT_SYMBOL(__copy_user);
-EXPORT_SYMBOL(__udelay);
-EXPORT_SYMBOL(__ndelay);
-EXPORT_SYMBOL(__const_udelay);
+EXPORT_SYMBOL(__delay_loops);
 EXPORT_SYMBOL(strlen);
 EXPORT_SYMBOL(csum_partial);
 EXPORT_SYMBOL(csum_partial_copy_generic);
--- a/arch/sh/lib/delay.c
+++ b/arch/sh/lib/delay.c
@@ -30,25 +30,10 @@ void __delay(unsigned long loops)
 	: "t");
 }

-inline void __const_udelay(unsigned long xloops)
+void __delay_loops(unsigned long long delay)
 {
-	xloops *= 4;
-	__asm__("dmulu.l %0, %2\n\t"
-		"sts mach, %0"
-		: "=r" (xloops)
-		: "0" (xloops),
-		  "r" (cpu_data[raw_smp_processor_id()].loops_per_jiffy * (HZ/4))
-		: "macl", "mach");
-	__delay(++xloops);
-}
-
-void __udelay(unsigned long usecs)
-{
-	__const_udelay(usecs * 0x000010c6); /* 2**32 / 1000000 */
-}
+	unsigned long lpj = cpu_data[raw_smp_processor_id()].loops_per_jiffy;
+	unsigned long xloops = (delay * lpj) >> 32;

-void __ndelay(unsigned long nsecs)
-{
-	__const_udelay(nsecs * 0x00000005);
+	__delay(++xloops);
 }
-
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -156,6 +156,7 @@ config X86
 	select GENERIC_CPU_AUTOPROBE
 	select
GENERIC_CPU_DEVICES select GENERIC_CPU_VULNERABILITIES + select GENERIC_DELAY select GENERIC_EARLY_IOREMAP select GENERIC_ENTRY select GENERIC_IOMAP --- a/arch/x86/include/asm/delay.h +++ b/arch/x86/include/asm/delay.h @@ -2,9 +2,16 @@ #ifndef _ASM_X86_DELAY_H #define _ASM_X86_DELAY_H -#include <asm-generic/delay.h> #include <linux/init.h> +void __delay_loops(unsigned long long delay); + +#define __delay_loops __delay_loops +#define DELAY_MULT_LPJ 1 +#define UDELAY_ARCH_SHIFT 0 + +#include <asm-generic/delay.h> + void __init use_tsc_delay(void); void __init use_tpause_delay(void); void use_mwaitx_delay(void); --- a/arch/x86/lib/delay.c +++ b/arch/x86/lib/delay.c @@ -204,28 +204,11 @@ void __delay(unsigned long loops) } EXPORT_SYMBOL(__delay); -noinline void __const_udelay(unsigned long xloops) +void __delay_loops(unsigned long long delay) { unsigned long lpj = this_cpu_read(cpu_info.loops_per_jiffy) ? : loops_per_jiffy; - int d0; - - xloops *= 4; - asm("mull %%edx" - :"=d" (xloops), "=&a" (d0) - :"1" (xloops), "0" (lpj * (HZ / 4))); + unsigned long xloops = (delay * lpj) >> 32; __delay(++xloops); } -EXPORT_SYMBOL(__const_udelay); - -void __udelay(unsigned long usecs) -{ - __const_udelay(usecs * 0x000010c7); /* 2**32 / 1000000 (rounded up) */ -} -EXPORT_SYMBOL(__udelay); - -void __ndelay(unsigned long nsecs) -{ - __const_udelay(nsecs * 0x00005); /* 2**32 / 1000000000 (rounded up) */ -} -EXPORT_SYMBOL(__ndelay); +EXPORT_SYMBOL(__delay_loops); --- a/arch/x86/um/delay.c +++ b/arch/x86/um/delay.c @@ -30,28 +30,10 @@ void __delay(unsigned long loops) } EXPORT_SYMBOL(__delay); -inline void __const_udelay(unsigned long xloops) +void __delay_loops(unsigned long long delay) { - int d0; - - xloops *= 4; - asm("mull %%edx" - : "=d" (xloops), "=&a" (d0) - : "1" (xloops), "0" - (loops_per_jiffy * (HZ/4))); + unsigned long xloops = (delay * loops_per_jiffy) >> 32; __delay(++xloops); } -EXPORT_SYMBOL(__const_udelay); - -void __udelay(unsigned long usecs) -{ - __const_udelay(usecs 
* 0x000010c7); /* 2**32 / 1000000 (rounded up) */ -} -EXPORT_SYMBOL(__udelay); - -void __ndelay(unsigned long nsecs) -{ - __const_udelay(nsecs * 0x00005); /* 2**32 / 1000000000 (rounded up) */ -} -EXPORT_SYMBOL(__ndelay); +EXPORT_SYMBOL(__delay_loops); --- a/include/asm-generic/delay.h +++ b/include/asm-generic/delay.h @@ -2,44 +2,104 @@ #ifndef __ASM_GENERIC_DELAY_H #define __ASM_GENERIC_DELAY_H +#include <vdso/time64.h> + /* Undefined functions to get compile-time errors */ extern void __bad_udelay(void); extern void __bad_ndelay(void); -extern void __udelay(unsigned long usecs); -extern void __ndelay(unsigned long nsecs); -extern void __const_udelay(unsigned long xloops); extern void __delay(unsigned long loops); +#ifdef CONFIG_GENERIC_UDELAY +#ifndef UDELAY_ARCH_MULT +# define UDELAY_ARCH_MULT 1ULL +#endif + +#ifdef UDELAY_ARCH_SHIFT +# define UDELAY_SHIFT UDELAY_ARCH_SHIFT +#else +# define UDELAY_SHIFT 32 +#endif + +#define __UDELAY_MULT ((unsigned long long)UDELAY_ARCH_MULT * HZ) +#define UDELAY_MULT ((unsigned long)DIV_ROUND_UP(__UDELAY_MULT << 32, USEC_PER_SEC)) +#define NDELAY_MULT DIV_ROUND_UP(UDELAY_MULT, NSEC_PER_USEC) + /* - * The weird n/20000 thing suppresses a "comparison is always false due to - * limited range of data type" warning with non-const 8-bit arguments. + * Generous upper bound for loops per jiffy assuming a maximal CPU + * frequency of 8GHz and 1 cycle per loop. 
*/ +#define LPJ_MAX ((8ULL * NSEC_PER_SEC) / HZ) -/* 0x10c7 is 2**32 / 1000000 (rounded up) */ -#define udelay(n) \ - ({ \ - if (__builtin_constant_p(n)) { \ - if ((n) / 20000 >= 1) \ - __bad_udelay(); \ - else \ - __const_udelay((n) * 0x10c7ul); \ - } else { \ - __udelay(n); \ - } \ - }) - -/* 0x5 is 2**32 / 1000000000 (rounded up) */ -#define ndelay(n) \ - ({ \ - if (__builtin_constant_p(n)) { \ - if ((n) / 20000 >= 1) \ - __bad_ndelay(); \ - else \ - __const_udelay((n) * 5ul); \ - } else { \ - __ndelay(n); \ - } \ - }) +/* + * The maximum usec value depends on the multiplication factor and the + * maximum upper bound for loops_per_jiffy to guarantee that there is + * no multiplication overflow when __delay_loops() multiplies the + * argument with the actual loops_per_jiffy value. + */ +#define UDELAY_CONST_MAX (unsigned long)(U64_MAX / (LPJ_MAX * UDELAY_MULT)) +#define NDELAY_CONST_MAX (UDELAY_CONST_MAX * NSEC_PER_USEC) + +#ifndef DELAY_MULT_LPJ +#define DELAY_MULT_LPJ loops_per_jiffy +#endif + +#ifndef __delay_loops +#define __delay_loops(x) __delay(x) +#endif + +static __always_inline void __udelay(unsigned long usec) +{ + /* FIXME: Add a debug sanity check for usec > UDELAY_CONST_MAX */ + __delay_loops((usec * DELAY_MULT_LPJ * UDELAY_MULT) >> UDELAY_SHIFT); +} + +static __always_inline void __ndelay(unsigned long nsec) +{ + /* FIXME: Add a debug sanity check for usec > NDELAY_CONST_MAX */ + __delay_loops((nsec * DELAY_MULT_LPJ * NDELAY_MULT) >> UDELAY_SHIFT); +} + +#define __const_udelay_wrapper(usec, mult) __udelay(usec) +#define __const_ndelay_wrapper(usec, mult) __ndelay(usec) + +#else +/* Does any of this make sense? No. 
*/ +#define UDELAY_CONST_MULT ((unsigned long)DIV_ROUND_UP((1ULL << 32), USEC_PER_SEC)) +#define NDELAY_CONST_MULT (UDELAY_CONST_MULT / NSEC_PER_USEC) +#define UDELAY_CONST_MAX 20000 +#define NDELAY_CONST_MAX 20000 +extern void __const_udelay(unsigned long xloops); +extern void __udelay(unsigned long usecs); +extern void __ndelay(unsigned long nsecs); +#define __const_udelay_wrapper(usec) __const_udelay(usec * UDELAY_CONST_MULT) +#define __const_ndelay_wrapper(nsec) __const_udelay(nsec * NDELAY_CONST_MULT) +#endif + +static __always_inline void _udelay(unsigned long usec) +{ + if (__builtin_constant_p(usec)) { + if (usec >= UDELAY_CONST_MAX) + __bad_udelay(); + else + __const_udelay_wrapper(usec); + } else { + __udelay(usec); + } +} +#define udelay(x) _udelay(x) + +static __always_inline void _ndelay(unsigned long usec) +{ + if (__builtin_constant_p(usec)) { + if (usec >= NDELAY_CONST_MAX) + __bad_ndelay(); + else + __const_ndelay_wrapper(usec); + } else { + __ndelay(usec); + } +} +#define ndelay(x) _ndelay(x) #endif /* __ASM_GENERIC_DELAY_H */ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 06/15] timers: Update function descriptions of sleep/delay related functions
2024-09-05 16:07 ` Thomas Gleixner
@ 2024-09-05 19:49 ` Anna-Maria Behnsen
0 siblings, 0 replies; 8+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-05 19:49 UTC (permalink / raw)
To: Thomas Gleixner, Frederic Weisbecker, Jonathan Corbet
Cc: linux-kernel, Len Brown, Rafael J. Wysocki, Arnd Bergmann,
linux-arch, Andrew Morton, Nathan Chancellor, Nick Desaulniers

Thomas Gleixner <tglx@linutronix.de> writes:

> On Thu, Sep 05 2024 at 08:59, Thomas Gleixner wrote:
>> On Wed, Sep 04 2024 at 15:04, Anna-Maria Behnsen wrote:
>> However, instead of proliferating this voodoo can we please convert it
>> into something comprehensible?
>>
>> /*
>>  * The microseconds delay multiplicator is used to convert a constant
>>  * microseconds value to a <INSERT COHERENT EXPLANATION>.
>>  */
>> #define UDELAY_CONST_MULT ((unsigned long)DIV_ROUND_UP(1ULL << 32, USEC_PER_SEC))
>>
>> /*
>>  * The maximum constant udelay value picked out of thin air
>>  * to avoid <INSERT COHERENT EXPLANATION>.
>>  */
>> #define UDELAY_CONST_MAX 20000
>>
>> /**
>>  * udelay - .....
>>  */
>> static __always_inline void udelay(unsigned long usec)
>> {
>> 	/*
>> 	 * <INSERT COHERENT EXPLANATION> for this construct
>> 	 */
>> 	if (__builtin_constant_p(usec)) {
>> 		if (usec >= UDELAY_CONST_MAX)
>> 			__bad_udelay();
>> 		else
>> 			__const_udelay(usec * UDELAY_CONST_MULT);
>> 	} else {
>> 		__udelay(usec);
>
> And of course these magic numeric constants have been copied all over
> the place. git grep '__const_udelay(' arch/ .... Just SH managed to use
> 0x10c6 instead of 0x10c7.
>
> ARM has its own udelay implementation:
>
> #define udelay(n) \
> 	(__builtin_constant_p(n) ? \
> 	 ((n) > (MAX_UDELAY_MS * 1000) ? __bad_udelay() : \
> 	  __const_udelay((n) * UDELAY_MULT)) : \
> 	 __udelay(n))
>
> Amazingly this uses the same comparison construct which was in the
> generic udelay implementation... Same for arc, m68k and microblaze.
>
> Plus the default implementation for mdelay() in linux/delay.h:
>
> #define mdelay(n) (\
> 	(__builtin_constant_p(n) && (n)<=MAX_UDELAY_MS) ? udelay((n)*1000) : \
> 	({unsigned long __ms=(n); while (__ms--) udelay(1000);}))
>
> Oh well....
>
> What's truly amazing is that all __udelay() implementations, which
> invoke __const_udelay() under the hood, do:
>
> 	__const_udelay(usec * 0x10c7);
>
> So we have an arbitrary range limit for constants, which makes the build
> fail. But the variable based udelays can hand in whatever they want and
> __udelay() happily ignores it including the possible multiplication
> overflow.
>
> That's all really consistently copy and pasted voodoo. The other
> architecture implementations are not much better in that regard. The
> main difference is their cutoff value for __const_udelay() and the
> multiplication factors.
>
> The below uncompiled and untested pile is an attempt to consolidate this
> mess as far as it goes. There is probably more to mop up, but for a
> start this makes already sense.

Thanks for this first step in dissecting the mess! I'll take a closer look
at it soon.

But as this has been in the tree for a lot longer than a single day, can
we please do this cleanup on top of the original queue and get fsleep()
and the outdated documentation fixed soon? In your previous answer you
made a proposal to convert it into something comprehensible. If there are
no concerns, I would integrate it and prepare a v2 of the queue.

Thanks,

Anna-Maria

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 00/15] timers: Cleanup delay/sleep related mess
2024-09-04 13:04 [PATCH 00/15] timers: Cleanup delay/sleep related mess Anna-Maria Behnsen
2024-09-04 13:04 ` [PATCH 06/15] timers: Update function descriptions of sleep/delay related functions Anna-Maria Behnsen
@ 2024-09-04 14:44 ` Rafael J. Wysocki
2024-10-17 14:19 ` (subset) " Mark Brown
2 siblings, 0 replies; 8+ messages in thread
From: Rafael J. Wysocki @ 2024-09-04 14:44 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, Jonathan Corbet,
linux-kernel, Len Brown, Rafael J. Wysocki, Peter Zijlstra,
SeongJae Park, Andrew Morton, damon, linux-mm, Arnd Bergmann,
linux-arch, Heiner Kallweit, David S. Miller, Andy Whitcroft,
Joe Perches, Dwaipayan Ray, Liam Girdwood, Mark Brown, Andrew Lunn,
Jaroslav Kysela, Takashi Iwai, netdev, linux-sound,
Michael Ellerman, Nathan Lynch, linuxppc-dev,
Mauro Carvalho Chehab, linux-media

On Wed, Sep 4, 2024 at 3:05 PM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> Hi,
>
> a question about which sleeping function should be used in acpi_os_sleep()
> started a discussion and examination about the existing documentation and
> implementation of functions which insert a sleep/delay.
>
> The result of the discussion was, that the documentation is outdated and
> the implemented fsleep() reflects the outdated documentation but doesn't
> help to reflect reality which in turns leads to the queue which covers the
> following things:
>
> - Minor changes (naming and typo fixes)
>
> - Split out all timeout and sleep related functions from hrtimer.c and timer.c
>   into a separate file
>
> - Update function descriptions of sleep related functions
>
> - Change fsleep() to reflect reality
>
> - Rework all comments or users which obviously rely on the outdated
>   documentation as they reference "Documentation/timers/timers-howto.rst"
>
> - Last but not least (as there are no more references): Update the outdated
>   documentation and move it into a file with a self explaining file name
>
> The queue is available here and applies on top of tip/timers/core:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/anna-maria/linux-devel.git timers/misc
>
> Cc: linux-kernel@vger.kernel.org
> Cc: Len Brown <len.brown@intel.com>
> Cc: Rafael J. Wysocki <rafael@kernel.org>
> To: Frederic Weisbecker <frederic@kernel.org>
> To: Thomas Gleixner <tglx@linutronix.de>
> To: Jonathan Corbet <corbet@lwn.net>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Thanks,
>
> Anna-Maria
>
> ---
> Anna-Maria Behnsen (15):
>       timers: Rename next_expiry_recalc() to be unique
>       cpu: Use already existing usleep_range()
>       Comments: Fix wrong singular form of jiffies
>       timers: Move *sleep*() and timeout functions into a separate file
>       timers: Rename sleep_idle_range() to sleep_range_idle()
>       timers: Update function descriptions of sleep/delay related functions
>       timers: Adjust flseep() to reflect reality
>       mm/damon/core: Use generic upper bound recommondation for usleep_range()
>       timers: Add a warning to usleep_range_state() for wrong order of arguments
>       checkpatch: Remove broken sleep/delay related checks
>       regulator: core: Use fsleep() to get best sleep mechanism
>       iopoll/regmap/phy/snd: Fix comment referencing outdated timer documentation
>       powerpc/rtas: Use fsleep() to minimize additional sleep duration
>       media: anysee: Fix link to outdated sleep function documentation
>       timers/Documentation: Cleanup delay/sleep documentation

I like the changes, so

Acked-by: Rafael J. Wysocki <rafael@kernel.org>

for the series.

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: (subset) [PATCH 00/15] timers: Cleanup delay/sleep related mess
2024-09-04 13:04 [PATCH 00/15] timers: Cleanup delay/sleep related mess Anna-Maria Behnsen
2024-09-04 13:04 ` [PATCH 06/15] timers: Update function descriptions of sleep/delay related functions Anna-Maria Behnsen
2024-09-04 14:44 ` [PATCH 00/15] timers: Cleanup delay/sleep related mess Rafael J. Wysocki
@ 2024-10-17 14:19 ` Mark Brown
2 siblings, 0 replies; 8+ messages in thread
From: Mark Brown @ 2024-10-17 14:19 UTC (permalink / raw)
To: Frederic Weisbecker, Thomas Gleixner, Jonathan Corbet,
Anna-Maria Behnsen
Cc: linux-kernel, Len Brown, Rafael J. Wysocki, Peter Zijlstra,
SeongJae Park, Andrew Morton, damon, linux-mm, Arnd Bergmann,
linux-arch, Heiner Kallweit, David S. Miller, Andy Whitcroft,
Joe Perches, Dwaipayan Ray, Liam Girdwood, Andrew Lunn,
Jaroslav Kysela, Takashi Iwai, netdev, linux-sound,
Michael Ellerman, Nathan Lynch, linuxppc-dev,
Mauro Carvalho Chehab, linux-media

On Wed, 04 Sep 2024 15:04:50 +0200, Anna-Maria Behnsen wrote:
> a question about which sleeping function should be used in acpi_os_sleep()
> started a discussion and examination about the existing documentation and
> implementation of functions which insert a sleep/delay.
>
> The result of the discussion was, that the documentation is outdated and
> the implemented fsleep() reflects the outdated documentation but doesn't
> help to reflect reality which in turns leads to the queue which covers the
> following things:
>
> [...]

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator.git for-next

Thanks!

[11/15] regulator: core: Use fsleep() to get best sleep mechanism
        commit: f20669fbcf99d0e15e94fb50929bb1c41618e197

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

^ permalink raw reply [flat|nested] 8+ messages in thread