* [PATCH 0/3] time: use __builtin_constant_p() in msecs_to_jiffies
@ 2015-04-05 7:23 Nicholas Mc Guire
2015-04-05 7:23 ` [PATCH 1/3] time: move timeconst.h into include/generated Nicholas Mc Guire
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Nicholas Mc Guire @ 2015-04-05 7:23 UTC (permalink / raw)
To: Michal Marek
Cc: Masahiro Yamada, Sam Ravnborg, Thomas Gleixner, H. Peter Alvin,
Joe Perches, John Stultz, Andrew Hunter, Paul Turner,
linux-kernel, Nicholas Mc Guire
In the overall kernel source there currently are
2544 msecs_to_jiffies
126 usecs_to_jiffies
and a few places that are using var * HZ / 1000 constructs
which are not always safe (no check of corner cases) and should
be switched to msecs_to_jiffies (roughly 25 left).
Allowing gcc to fold constants for these calls that in most
cases are passing in constants (roughly 95%) has some potential
to improve performance (and should save a few bytes).
size impact:
x86_64_defconfig:
without patches vmlinux 24628843
with patches vmlinux 24628797
ppc64_defconfig:
without patches 27412024
with patches 27412040 (no clue why it is bigger!)
multi_v7_defconfig:
without patches vmlinux.o 22901462
with patches vmlinux.o 22899843
As the changes to the top level Kbuild will impact every architecture
this is probably not enough testing - but should be suitable for a first
review.
Once this is clean a patch for usecs_to_jiffies will be provided as well
The patch set:
0001 moves timeconst.h from kernel/time/ to include/generated/ and makes
it available early enough so that the build can use the constants
for msecs_to_jiffies
0002 rename msecs_to_jiffies to __msecs_to_jiffies, the only modification
is that the < 0 is moved out.
modified version of msecs_to_jiffies that checks for constant via
call to __builtin_constant_p() and calls __msecs_to_jiffies if it
can't determine that the argument is constant.
0003 documentation update and reformatting to kernel-doc format
^ permalink raw reply [flat|nested] 14+ messages in thread* [PATCH 1/3] time: move timeconst.h into include/generated 2015-04-05 7:23 [PATCH 0/3] time: use __builtin_constant_p() in msecs_to_jiffies Nicholas Mc Guire @ 2015-04-05 7:23 ` Nicholas Mc Guire 2015-04-05 7:23 ` [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies Nicholas Mc Guire ` (2 subsequent siblings) 3 siblings, 0 replies; 14+ messages in thread From: Nicholas Mc Guire @ 2015-04-05 7:23 UTC (permalink / raw) To: Michal Marek Cc: Masahiro Yamada, Sam Ravnborg, Thomas Gleixner, H. Peter Alvin, Joe Perches, John Stultz, Andrew Hunter, Paul Turner, linux-kernel, Nicholas Mc Guire kernel/time/timeconst.h is moved to include/generated/ and generated in an early build stage by top level Kbuild. This allows using timeconst.h in an earlier stage of the build. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> --- Thanks to Joe Perches <joe@perches.com> for suggesting this approach and catching the unconditional rebuild (should be fixed here now properly) as well as for his review comments on the first attempts. Some questions: * Kbuild - is passing in arguments to .bc files via pipe and read(); rather than using multiple files acceptable or is there some reason for the original multifile solution that Im overlooking ? * conditional rebuild of timeconst.h with $(call filechk,gentimeconst) not really clear if this is going to do all rebuilds that could be necessary and not clear how to verify this. currently only checked by 1) defconfig -> build -> rebuild -> CHK is executed timeconst.h unmodified 2) defconfig -> build -> change HZ -> rebuild -> UPD timeconst.h resulting in a rebuild of most files. Patch was compile tested for x86_64_defconfig,multi_v7_defconfig, ppc64_defconfig, and boot/run tested on x86_64. Further a test-case with an invalid HZ value to trigger the error output in kernel/time/timeconst.bc was run. Patch is against 4.0-rc6 (localversion-next is -next-20150402) Kbuild | 30 +++++++++++++++++++++++++----- kernel/time/Makefile | 17 +---------------- kernel/time/time.c | 2 +- kernel/time/timeconst.bc | 3 ++- 4 files changed, 29 insertions(+), 23 deletions(-) diff --git a/Kbuild b/Kbuild index 96d0629..4e1ed0b 100644 --- a/Kbuild +++ b/Kbuild @@ -2,8 +2,9 @@ # Kbuild for top-level directory of the kernel # This file takes care of the following: # 1) Generate bounds.h -# 2) Generate asm-offsets.h (may need bounds.h) -# 3) Check for missing system calls +# 2) Generate timeconst.h +# 3) Generate asm-offsets.h (may need bounds.h) +# 4) Check for missing system calls # Default sed regexp - multiline due to syntax constraints define sed-y @@ -47,7 +48,26 @@ $(obj)/$(bounds-file): kernel/bounds.s FORCE $(call filechk,offsets,__LINUX_BOUNDS_H__) ##### -# 2) Generate asm-offsets.h +# 2) Generate timeconst.h + +timeconst-file := include/generated/timeconst.h + +always += $(timeconst-file) +targets += $(timeconst-file) + +quiet_cmd_gentimeconst = GEN $@ +define cmd_gentimeconst + (echo $(CONFIG_HZ) | bc -q $< ) > $@ +endef +define filechk_gentimeconst + (echo $(CONFIG_HZ) | bc -q $< ) +endef + +$(obj)/$(timeconst-file): kernel/time/timeconst.bc FORCE + $(call filechk,gentimeconst) + +##### +# 3) Generate asm-offsets.h # offsets-file := include/generated/asm-offsets.h @@ -66,7 +86,7 @@ $(obj)/$(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE $(call filechk,offsets,__ASM_OFFSETS_H__) ##### -# 3) Check for missing system calls +# 4) Check for missing system calls # always += missing-syscalls @@ -79,4 +99,4 @@ missing-syscalls: scripts/checksyscalls.sh $(offsets-file) FORCE $(call cmd,syscalls) # Keep these two files during make clean -no-clean-files := $(bounds-file) $(offsets-file) +no-clean-files := $(bounds-file) $(offsets-file) $(timeconst-file) diff --git a/kernel/time/Makefile b/kernel/time/Makefile index 01f0312..ffc4cc3 100644 --- a/kernel/time/Makefile +++ b/kernel/time/Makefile @@ -13,19 +13,4 @@ obj-$(CONFIG_TIMER_STATS) += timer_stats.o obj-$(CONFIG_DEBUG_FS) += timekeeping_debug.o obj-$(CONFIG_TEST_UDELAY) += test_udelay.o -$(obj)/time.o: $(obj)/timeconst.h - -quiet_cmd_hzfile = HZFILE $@ - cmd_hzfile = echo "hz=$(CONFIG_HZ)" > $@ - -targets += hz.bc -$(obj)/hz.bc: $(objtree)/include/config/hz.h FORCE - $(call if_changed,hzfile) - -quiet_cmd_bc = BC $@ - cmd_bc = bc -q $(filter-out FORCE,$^) > $@ - -targets += timeconst.h -$(obj)/timeconst.h: $(obj)/hz.bc $(src)/timeconst.bc FORCE - $(call if_changed,bc) - +$(obj)/time.o: $(objtree)/include/config/ diff --git a/kernel/time/time.c b/kernel/time/time.c index 2c85b77..4fa1d26 100644 --- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -41,7 +41,7 @@ #include <asm/uaccess.h> #include <asm/unistd.h> -#include "timeconst.h" +#include <generated/timeconst.h> #include "timekeeping.h" /* diff --git a/kernel/time/timeconst.bc b/kernel/time/timeconst.bc index 511bdf2..c7388de 100644 --- a/kernel/time/timeconst.bc +++ b/kernel/time/timeconst.bc @@ -50,7 +50,7 @@ define timeconst(hz) { print "#include <linux/types.h>\n\n" print "#if HZ != ", hz, "\n" - print "#error \qkernel/timeconst.h has the wrong HZ value!\q\n" + print "#error \qinclude/generated/timeconst.h has the wrong HZ value!\q\n" print "#endif\n\n" if (hz < 2) { @@ -105,4 +105,5 @@ define timeconst(hz) { halt } +hz = read(); timeconst(hz) -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies 2015-04-05 7:23 [PATCH 0/3] time: use __builtin_constant_p() in msecs_to_jiffies Nicholas Mc Guire 2015-04-05 7:23 ` [PATCH 1/3] time: move timeconst.h into include/generated Nicholas Mc Guire @ 2015-04-05 7:23 ` Nicholas Mc Guire 2015-04-06 0:03 ` Joe Perches 2015-04-05 7:23 ` [PATCH 3/3] time: update msecs_to_jiffies doc and move to kernel-doc format Nicholas Mc Guire 2015-04-05 9:33 ` [PATCH 0/3] time: use __builtin_constant_p() in msecs_to_jiffies Joe Perches 3 siblings, 1 reply; 14+ messages in thread From: Nicholas Mc Guire @ 2015-04-05 7:23 UTC (permalink / raw) To: Michal Marek Cc: Masahiro Yamada, Sam Ravnborg, Thomas Gleixner, H. Peter Alvin, Joe Perches, John Stultz, Andrew Hunter, Paul Turner, linux-kernel, Nicholas Mc Guire The majority of the msecs_to_jiffies() users in the kernel are passing in constants which would allow gcc to do constant folding by checking with __builtin_constant_p() in msecs_to_jiffies(). The original msecs_to_jiffies is renamed to __msecs_to_jiffies and aside from the removal of the check for negative values being moved out, is unaltered. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> --- Thanks to Joe Perches <joe@perches.com> for suggesting this approach and for his review comments on the first attempts. Patch was compile and boot tested (x86_64) and a few msecs_to_jiffies() locations verified by inspection of the .lst files. Patch is against 4.0-rc6 (localversion-next is -next-20150402) include/linux/jiffies.h | 29 ++++++++++++++++++++++++++++- kernel/time/time.c | 13 +++++-------- 2 files changed, 33 insertions(+), 9 deletions(-) diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h index c367cbd..dcd8ba5 100644 --- a/include/linux/jiffies.h +++ b/include/linux/jiffies.h @@ -7,6 +7,7 @@ #include <linux/time.h> #include <linux/timex.h> #include <asm/param.h> /* for HZ */ +#include <generated/timeconst.h> /* * The following defines establish the engineering parameters of the PLL @@ -288,7 +289,33 @@ static inline u64 jiffies_to_nsecs(const unsigned long j) return (u64)jiffies_to_usecs(j) * NSEC_PER_USEC; } -extern unsigned long msecs_to_jiffies(const unsigned int m); +extern unsigned long __msecs_to_jiffies(const unsigned int m); +static inline unsigned long msecs_to_jiffies(const unsigned int m) +{ + /* + * Negative value, means infinite timeout: + */ + if ((int)m < 0) + return MAX_JIFFY_OFFSET; + + if (__builtin_constant_p(m)) { +#if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ) + return (m + (MSEC_PER_SEC / HZ) - 1) / (MSEC_PER_SEC / HZ); +#elif HZ > MSEC_PER_SEC && !(HZ % MSEC_PER_SEC) + if (m > jiffies_to_msecs(MAX_JIFFY_OFFSET)) + return MAX_JIFFY_OFFSET; + return m * (HZ / MSEC_PER_SEC); +#else + if (HZ > MSEC_PER_SEC && m > jiffies_to_msecs(MAX_JIFFY_OFFSET)) + return MAX_JIFFY_OFFSET; + + return (MSEC_TO_HZ_MUL32 * m + MSEC_TO_HZ_ADJ32) + >> MSEC_TO_HZ_SHR32; +#endif + } else + return __msecs_to_jiffies(m); +} + extern unsigned long usecs_to_jiffies(const unsigned int u); extern unsigned long timespec_to_jiffies(const struct timespec *value); extern void jiffies_to_timespec(const unsigned long jiffies, diff --git a/kernel/time/time.c b/kernel/time/time.c index 4fa1d26..3797540 100644 --- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -488,6 +488,8 @@ EXPORT_SYMBOL(ns_to_timespec64); * the following way: * * - negative values mean 'infinite timeout' (MAX_JIFFY_OFFSET) + * negative values are handled in msecs_to_jiffies in + * include/linux/jiffies.h * * - 'too large' values [that would result in larger than * MAX_JIFFY_OFFSET values] mean 'infinite timeout' too. @@ -496,15 +498,10 @@ EXPORT_SYMBOL(ns_to_timespec64); * the input value by a factor or dividing it with a factor * * We must also be careful about 32-bit overflows. + * */ -unsigned long msecs_to_jiffies(const unsigned int m) +unsigned long __msecs_to_jiffies(const unsigned int m) { - /* - * Negative value, means infinite timeout: - */ - if ((int)m < 0) - return MAX_JIFFY_OFFSET; - #if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ) /* * HZ is equal to or smaller than 1000, and 1000 is a nice @@ -537,7 +534,7 @@ unsigned long msecs_to_jiffies(const unsigned int m) >> MSEC_TO_HZ_SHR32; #endif } -EXPORT_SYMBOL(msecs_to_jiffies); +EXPORT_SYMBOL(__msecs_to_jiffies); unsigned long usecs_to_jiffies(const unsigned int u) { -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies 2015-04-05 7:23 ` [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies Nicholas Mc Guire @ 2015-04-06 0:03 ` Joe Perches 2015-04-06 1:00 ` Nicholas Mc Guire 0 siblings, 1 reply; 14+ messages in thread From: Joe Perches @ 2015-04-06 0:03 UTC (permalink / raw) To: Nicholas Mc Guire Cc: Michal Marek, Masahiro Yamada, Sam Ravnborg, Thomas Gleixner, H. Peter Alvin, John Stultz, Andrew Hunter, Paul Turner, linux-kernel On Sun, 2015-04-05 at 09:23 +0200, Nicholas Mc Guire wrote: > The majority of the msecs_to_jiffies() users in the kernel are passing in > constants which would allow gcc to do constant folding by checking with > __builtin_constant_p() in msecs_to_jiffies(). > > The original msecs_to_jiffies is renamed to __msecs_to_jiffies and aside > from the removal of the check for negative values being moved out, is > unaltered. At least for gcc 4.9, this doesn't allow the compiler to optimize / precalculation msecs_to_jiffies calls with a constant. This does: (on top of your patch x86-64 defconfig) $ size vmlinux.o.* text data bss dec hex filename 11770523 1505971 1018454 14294948 da1fa4 vmlinux.o.next-b0a12fb5bc8 11770530 1505971 1018454 14294955 da1fab vmlinux.o.next-b0a12fb5bc8-inline 11768734 1505971 1018454 14293159 da18a7 vmlinux.o.next-b0a12fb5bc8-macro I think this should still move the if (m) < 0 back into the original __msecs_to_jiffies function. --- include/linux/jiffies.h | 71 ++++++++++++++++++++++++++++++++----------------- 1 file changed, 46 insertions(+), 25 deletions(-) diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h index a75158e..f8fe9f7 100644 --- a/include/linux/jiffies.h +++ b/include/linux/jiffies.h @@ -291,6 +291,39 @@ static inline u64 jiffies_to_nsecs(const unsigned long j) extern unsigned long __msecs_to_jiffies(const unsigned int m); +#if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ) +#define const_msecs_to_jiffies(m) \ +({ \ + unsigned long j; \ + if ((int)m < 0) \ + j = MAX_JIFFY_OFFSET; \ + else \ + j = (m + (MSEC_PER_SEC / HZ) - 1) / (MSEC_PER_SEC / HZ); \ + j; \ +}) +#elif HZ > MSEC_PER_SEC && !(HZ % MSEC_PER_SEC) +#define const_msecs_to_jiffies(m) \ +({ \ + unsigned long j; \ + if (m > jiffies_to_msecs(MAX_JIFFY_OFFSET)) \ + j = MAX_JIFFY_OFFSET; \ + else \ + j = m * (HZ / MSEC_PER_SEC); \ + j; \ +}) +#else +#define const_msecs_to_jiffies(m) \ +({ \ + unsigned long j; \ + if (HZ > MSEC_PER_SEC && m > jiffies_to_msecs(MAX_JIFFY_OFFSET))\ + j = MAX_JIFFY_OFFSET; \ + else \ + j = (MSEC_TO_HZ_MUL32 * m + MSEC_TO_HZ_ADJ32) \ + >> MSEC_TO_HZ_SHR32; \ + j; \ +}) +#endif + /** * msecs_to_jiffies: - convert milliseconds to jiffies * @m: time in millisecons @@ -313,31 +346,19 @@ extern unsigned long __msecs_to_jiffies(const unsigned int m); * allow constant folding and the actual conversion must be done at * runtime. */ -static inline unsigned long msecs_to_jiffies(const unsigned int m) -{ - /* - * Negative value, means infinite timeout: - */ - if ((int)m < 0) - return MAX_JIFFY_OFFSET; - - if (__builtin_constant_p(m)) { -#if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ) - return (m + (MSEC_PER_SEC / HZ) - 1) / (MSEC_PER_SEC / HZ); -#elif HZ > MSEC_PER_SEC && !(HZ % MSEC_PER_SEC) - if (m > jiffies_to_msecs(MAX_JIFFY_OFFSET)) - return MAX_JIFFY_OFFSET; - return m * (HZ / MSEC_PER_SEC); -#else - if (HZ > MSEC_PER_SEC && m > jiffies_to_msecs(MAX_JIFFY_OFFSET)) - return MAX_JIFFY_OFFSET; - - return (MSEC_TO_HZ_MUL32 * m + MSEC_TO_HZ_ADJ32) - >> MSEC_TO_HZ_SHR32; -#endif - } else - return __msecs_to_jiffies(m); -} +#define msecs_to_jiffies(m) \ +({ \ + unsigned long j; \ + if (__builtin_constant_p(m)) { \ + if ((int)m < 0) \ + j = MAX_JIFFY_OFFSET; \ + else \ + j = const_msecs_to_jiffies(m); \ + } else { \ + j = __msecs_to_jiffies(m); \ + } \ + j; \ +}) extern unsigned long usecs_to_jiffies(const unsigned int u); extern unsigned long timespec_to_jiffies(const struct timespec *value); ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies 2015-04-06 0:03 ` Joe Perches @ 2015-04-06 1:00 ` Nicholas Mc Guire 2015-04-06 2:15 ` Joe Perches 0 siblings, 1 reply; 14+ messages in thread From: Nicholas Mc Guire @ 2015-04-06 1:00 UTC (permalink / raw) To: Joe Perches Cc: Nicholas Mc Guire, Michal Marek, Masahiro Yamada, Sam Ravnborg, Thomas Gleixner, H. Peter Alvin, John Stultz, Andrew Hunter, Paul Turner, linux-kernel On Sun, 05 Apr 2015, Joe Perches wrote: > On Sun, 2015-04-05 at 09:23 +0200, Nicholas Mc Guire wrote: > > The majority of the msecs_to_jiffies() users in the kernel are passing in > > constants which would allow gcc to do constant folding by checking with > > __builtin_constant_p() in msecs_to_jiffies(). > > > > The original msecs_to_jiffies is renamed to __msecs_to_jiffies and aside > > from the removal of the check for negative values being moved out, is > > unaltered. > > At least for gcc 4.9, this doesn't allow the compiler > to optimize / precalculation msecs_to_jiffies calls > with a constant. > > This does: (on top of your patch x86-64 defconfig) > > $ size vmlinux.o.* > text data bss dec hex filename > 11770523 1505971 1018454 14294948 da1fa4 vmlinux.o.next-b0a12fb5bc8 > 11770530 1505971 1018454 14294955 da1fab vmlinux.o.next-b0a12fb5bc8-inline > 11768734 1505971 1018454 14293159 da18a7 vmlinux.o.next-b0a12fb5bc8-macro > > I think this should still move the if (m) < 0 back into the > original __msecs_to_jiffies function. > could you check if you can reproduce the results below ? my assumption was that gcc would always optimize out an if(CONST < 0) return CONST; reducing it to the return CONST; only and thus this should not make any difference but Im not that familiar with gcc. gcc versions here are: for x86 gcc version 4.7.2 (Debian 4.7.2-5) for powerpc it is a gcc version 4.9.2 (crosstool-NG 1.20.0) for arm gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09) Procedure used: root@debian:~/linux-next# make distclean root@debian:~/linux-next# make defconfig root@debian:~/linux-next# make drivers/net/wireless/p54/p54usb.lst root@debian:~/linux-next# make drivers/net/wireless/p54/p54usb.s same setup in unpatched /usr/src/linux-next/ e.g: root@debian:/usr/src/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.c timeout = jiffies + msecs_to_jiffies(1000); timeout = jiffies + msecs_to_jiffies(1000); So both calls are constants and should be optimized out if it works as expected. without the patch applied: root@debian:/usr/src/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.s call msecs_to_jiffies # call msecs_to_jiffies # root@debian:/usr/src/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.lst timeout = jiffies + msecs_to_jiffies(1000); e19: R_X86_64_PC32 msecs_to_jiffies+0xfffffffffffffffc timeout = jiffies + msecs_to_jiffies(1000); timeout = jiffies + msecs_to_jiffies(1000); fd8: R_X86_64_PC32 msecs_to_jiffies+0xfffffffffffffffc timeout = jiffies + msecs_to_jiffies(1000); with the patch applied this then gives me: root@debian:~/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.s root@debian:~/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.lst timeout = jiffies + msecs_to_jiffies(1000); timeout = jiffies + msecs_to_jiffies(1000); timeout = jiffies + msecs_to_jiffies(1000); timeout = jiffies + msecs_to_jiffies(1000); Conversely in kernel/sched/core.c the msecs_to_jiffies is not a constant and the result is that it calls __msecs_to_jiffies patched: root@debian:~/linux-next# grep msecs_to_jiffies kernel/sched/core.s call __msecs_to_jiffies # unpatched: root@debian:/usr/src/linux-next# grep msecs_to_jiffies kernel/sched/core.s call msecs_to_jiffies # Could you check if you get these results for this test-case ? If this really were compiler dependant that would be very bad. I did move the < 0 check - but that did not change the situation here. but it well may be that there are some cases where this does make a difference thx! hofrat ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies 2015-04-06 1:00 ` Nicholas Mc Guire @ 2015-04-06 2:15 ` Joe Perches 2015-04-06 4:26 ` Nicholas Mc Guire 2015-04-12 8:36 ` Nicholas Mc Guire 0 siblings, 2 replies; 14+ messages in thread From: Joe Perches @ 2015-04-06 2:15 UTC (permalink / raw) To: Nicholas Mc Guire Cc: Nicholas Mc Guire, Michal Marek, Masahiro Yamada, Sam Ravnborg, Thomas Gleixner, H. Peter Alvin, John Stultz, Andrew Hunter, Paul Turner, linux-kernel On Mon, 2015-04-06 at 03:00 +0200, Nicholas Mc Guire wrote: > On Sun, 05 Apr 2015, Joe Perches wrote: > > > On Sun, 2015-04-05 at 09:23 +0200, Nicholas Mc Guire wrote: > > > The majority of the msecs_to_jiffies() users in the kernel are passing in > > > constants which would allow gcc to do constant folding by checking with > > > __builtin_constant_p() in msecs_to_jiffies(). > > > > > > The original msecs_to_jiffies is renamed to __msecs_to_jiffies and aside > > > from the removal of the check for negative values being moved out, is > > > unaltered. > > > > At least for gcc 4.9, this doesn't allow the compiler > > to optimize / precalculation msecs_to_jiffies calls > > with a constant. > > > > This does: (on top of your patch x86-64 defconfig) > > > > $ size vmlinux.o.* > > text data bss dec hex filename > > 11770523 1505971 1018454 14294948 da1fa4 vmlinux.o.next-b0a12fb5bc8 > > 11770530 1505971 1018454 14294955 da1fab vmlinux.o.next-b0a12fb5bc8-inline > > 11768734 1505971 1018454 14293159 da18a7 vmlinux.o.next-b0a12fb5bc8-macro > > > > I think this should still move the if (m) < 0 back into the > > original __msecs_to_jiffies function. > > > > could you check if you can reproduce the results below ? > my assumption was that gcc would always optimize out an > if(CONST < 0) return CONST; reducing it to the return CONST; > only and thus this should not make any difference but Im not > that familiar with gcc. > > gcc versions here are: > for x86 gcc version 4.7.2 (Debian 4.7.2-5) > for powerpc it is a gcc version 4.9.2 (crosstool-NG 1.20.0) > for arm gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09) > > Procedure used: > root@debian:~/linux-next# make distclean > root@debian:~/linux-next# make defconfig > root@debian:~/linux-next# make drivers/net/wireless/p54/p54usb.lst > root@debian:~/linux-next# make drivers/net/wireless/p54/p54usb.s > > same setup in unpatched /usr/src/linux-next/ > > e.g: > root@debian:/usr/src/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.c > timeout = jiffies + msecs_to_jiffies(1000); > timeout = jiffies + msecs_to_jiffies(1000); > > So both calls are constants and should be optimized out if it works as > expected. > > without the patch applied: > > root@debian:/usr/src/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.s > call msecs_to_jiffies # > call msecs_to_jiffies # > root@debian:/usr/src/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.lst > timeout = jiffies + msecs_to_jiffies(1000); > e19: R_X86_64_PC32 msecs_to_jiffies+0xfffffffffffffffc > timeout = jiffies + msecs_to_jiffies(1000); > timeout = jiffies + msecs_to_jiffies(1000); > fd8: R_X86_64_PC32 msecs_to_jiffies+0xfffffffffffffffc > timeout = jiffies + msecs_to_jiffies(1000); > > > with the patch applied this then gives me: > > root@debian:~/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.s > root@debian:~/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.lst > timeout = jiffies + msecs_to_jiffies(1000); > timeout = jiffies + msecs_to_jiffies(1000); > timeout = jiffies + msecs_to_jiffies(1000); > timeout = jiffies + msecs_to_jiffies(1000); > > Conversely in kernel/sched/core.c the msecs_to_jiffies is not a constant > and the result is that it calls __msecs_to_jiffies > > patched: > root@debian:~/linux-next# grep msecs_to_jiffies kernel/sched/core.s > call __msecs_to_jiffies # > > unpatched: > root@debian:/usr/src/linux-next# grep msecs_to_jiffies kernel/sched/core.s > call msecs_to_jiffies # > > > Could you check if you get these results for this test-case ? > If this really were compiler dependant that would be very bad. Hi Nicholas. Your inline version has not worked with any of x86-64 gcc 4.4, 4.6, 4.7, or 4.9 I suggest you add some lines to lib/test_module.c/test_module_init like: unsigned int m; for (m = 10; m < 200; m += 10) pr_info("msecs_to_jiffies(%u) is %lu\n", m, msecs_to_jiffies(m)); pr_info("msecs_to_jiffies(%u) is %lu\n", 10, msecs_to_jiffies(10)); pr_info("msecs_to_jiffies(%u) is %lu\n", 100, msecs_to_jiffies(100)); pr_info("msecs_to_jiffies(%u) is %lu\n", 1000, msecs_to_jiffies(1000)); Then it's pretty easy to look at the assembly/.lst file Your inline function doesn't allow gcc to precompute the msecs_to_jiffies value. The macro one does for all those gcc versions. Try it and look at the generated .lst files with and without the patch I sent. cheers, Joe ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies 2015-04-06 2:15 ` Joe Perches @ 2015-04-06 4:26 ` Nicholas Mc Guire 2015-04-06 4:33 ` Joe Perches 2015-04-12 8:36 ` Nicholas Mc Guire 1 sibling, 1 reply; 14+ messages in thread From: Nicholas Mc Guire @ 2015-04-06 4:26 UTC (permalink / raw) To: Joe Perches Cc: Nicholas Mc Guire, Michal Marek, Masahiro Yamada, Sam Ravnborg, Thomas Gleixner, H. Peter Alvin, John Stultz, Andrew Hunter, Paul Turner, linux-kernel On Sun, 05 Apr 2015, Joe Perches wrote: > On Mon, 2015-04-06 at 03:00 +0200, Nicholas Mc Guire wrote: > > On Sun, 05 Apr 2015, Joe Perches wrote: > > > > > On Sun, 2015-04-05 at 09:23 +0200, Nicholas Mc Guire wrote: > > > > The majority of the msecs_to_jiffies() users in the kernel are passing in > > > > constants which would allow gcc to do constant folding by checking with > > > > __builtin_constant_p() in msecs_to_jiffies(). > > > > > > > > The original msecs_to_jiffies is renamed to __msecs_to_jiffies and aside > > > > from the removal of the check for negative values being moved out, is > > > > unaltered. > > > > > > At least for gcc 4.9, this doesn't allow the compiler > > > to optimize / precalculation msecs_to_jiffies calls > > > with a constant. > > > > > > This does: (on top of your patch x86-64 defconfig) > > > > > > $ size vmlinux.o.* > > > text data bss dec hex filename > > > 11770523 1505971 1018454 14294948 da1fa4 vmlinux.o.next-b0a12fb5bc8 > > > 11770530 1505971 1018454 14294955 da1fab vmlinux.o.next-b0a12fb5bc8-inline > > > 11768734 1505971 1018454 14293159 da18a7 vmlinux.o.next-b0a12fb5bc8-macro > > > > > > I think this should still move the if (m) < 0 back into the > > > original __msecs_to_jiffies function. > > > > > > > could you check if you can reproduce the results below ? > > my assumption was that gcc would always optimize out an > > if(CONST < 0) return CONST; reducing it to the return CONST; > > only and thus this should not make any difference but Im not > > that familiar with gcc. > > > > gcc versions here are: > > for x86 gcc version 4.7.2 (Debian 4.7.2-5) > > for powerpc it is a gcc version 4.9.2 (crosstool-NG 1.20.0) > > for arm gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09) > > > > Procedure used: > > root@debian:~/linux-next# make distclean > > root@debian:~/linux-next# make defconfig > > root@debian:~/linux-next# make drivers/net/wireless/p54/p54usb.lst > > root@debian:~/linux-next# make drivers/net/wireless/p54/p54usb.s > > > > same setup in unpatched /usr/src/linux-next/ > > > > e.g: > > root@debian:/usr/src/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.c > > timeout = jiffies + msecs_to_jiffies(1000); > > timeout = jiffies + msecs_to_jiffies(1000); > > > > So both calls are constants and should be optimized out if it works as > > expected. > > > > without the patch applied: > > > > root@debian:/usr/src/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.s > > call msecs_to_jiffies # > > call msecs_to_jiffies # > > root@debian:/usr/src/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.lst > > timeout = jiffies + msecs_to_jiffies(1000); > > e19: R_X86_64_PC32 msecs_to_jiffies+0xfffffffffffffffc > > timeout = jiffies + msecs_to_jiffies(1000); > > timeout = jiffies + msecs_to_jiffies(1000); > > fd8: R_X86_64_PC32 msecs_to_jiffies+0xfffffffffffffffc > > timeout = jiffies + msecs_to_jiffies(1000); > > > > > > with the patch applied this then gives me: > > > > root@debian:~/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.s > > root@debian:~/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.lst > > timeout = jiffies + msecs_to_jiffies(1000); > > timeout = jiffies + msecs_to_jiffies(1000); > > timeout = jiffies + msecs_to_jiffies(1000); > > timeout = jiffies + msecs_to_jiffies(1000); > > > > Conversely in kernel/sched/core.c the msecs_to_jiffies is not a constant > > and the result is that it calls __msecs_to_jiffies > > > > patched: > > root@debian:~/linux-next# grep msecs_to_jiffies kernel/sched/core.s > > call __msecs_to_jiffies # > > > > unpatched: > > root@debian:/usr/src/linux-next# grep msecs_to_jiffies kernel/sched/core.s > > call msecs_to_jiffies # > > > > > > Could you check if you get these results for this test-case ? > > If this really were compiler dependant that would be very bad. > > Hi Nicholas. > > Your inline version has not worked with any of > x86-64 gcc 4.4, 4.6, 4.7, or 4.9 > > I suggest you add some lines to > lib/test_module.c/test_module_init like: > > unsigned int m; > > for (m = 10; m < 200; m += 10) > pr_info("msecs_to_jiffies(%u) is %lu\n", > m, msecs_to_jiffies(m)); > > pr_info("msecs_to_jiffies(%u) is %lu\n", > 10, msecs_to_jiffies(10)); > pr_info("msecs_to_jiffies(%u) is %lu\n", > 100, msecs_to_jiffies(100)); > pr_info("msecs_to_jiffies(%u) is %lu\n", > 1000, msecs_to_jiffies(1000)); > > Then it's pretty easy to look at the assembly/.lst file > > Your inline function doesn't allow gcc to precompute > the msecs_to_jiffies value. The macro one does for all > those gcc versions. > > Try it and look at the generated .lst files with and > without the patch I sent. > will do that - thanks ! Managed to fool my self - the difference in the .lst/.s files is simply due to chaning msecs_to_jiffies being inline (which it was not) and thus it "vanished" - kind of stupid - sorry back to gcc manual first - need to understand __builtin_constant_p better and the constraints - from all that I understood it should be doable both as macro and inline. thx! hofrat ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies 2015-04-06 4:26 ` Nicholas Mc Guire @ 2015-04-06 4:33 ` Joe Perches 2015-04-06 6:40 ` Nicholas Mc Guire 0 siblings, 1 reply; 14+ messages in thread From: Joe Perches @ 2015-04-06 4:33 UTC (permalink / raw) To: Nicholas Mc Guire Cc: Nicholas Mc Guire, Michal Marek, Masahiro Yamada, Sam Ravnborg, Thomas Gleixner, H. Peter Alvin, John Stultz, Andrew Hunter, Paul Turner, linux-kernel On Mon, 2015-04-06 at 06:26 +0200, Nicholas Mc Guire wrote: > On Sun, 05 Apr 2015, Joe Perches wrote: > > Try it and look at the generated .lst files with and > > without the patch I sent. [] > from all that I understood it should > be doable both as macro and inline. I think it _should_ be doable too but I also think the only reason gcc doesn't optimize the inline is because gcc's optimizer isn't good enough yet. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies 2015-04-06 4:33 ` Joe Perches @ 2015-04-06 6:40 ` Nicholas Mc Guire 2015-04-06 7:12 ` Joe Perches 0 siblings, 1 reply; 14+ messages in thread From: Nicholas Mc Guire @ 2015-04-06 6:40 UTC (permalink / raw) To: Joe Perches Cc: Nicholas Mc Guire, Michal Marek, Masahiro Yamada, Sam Ravnborg, Thomas Gleixner, H. Peter Alvin, John Stultz, Andrew Hunter, Paul Turner, linux-kernel On Sun, 05 Apr 2015, Joe Perches wrote: > On Mon, 2015-04-06 at 06:26 +0200, Nicholas Mc Guire wrote: > > On Sun, 05 Apr 2015, Joe Perches wrote: > > > Try it and look at the generated .lst files with and > > > without the patch I sent. > [] > > from all that I understood it should > > be doable both as macro and inline. > > I think it _should_ be doable too but I also think > the only reason gcc doesn't optimize the inline > is because gcc's optimizer isn't good enough yet. > "unfortunately" I can't blame it on gcc - here is the initial toy-case - test.c and either testi.h or testm.h included - m = TIMEOUT or m = atoi(argv[1]); both in the inline and the macro case gcc reduced the code to a single load mediate or register instruction for the constant - so the optimizer is doing its job. test.c: #include <stdio.h> #define HZ 100 #define MSECS_PER_SEC 1000 #define TIMEOUT 100 #include "testi.h" /* inline msecs_to_jiffies */ //#include "testm.h" /* macro versions */ int main(int argc, char **argv) { //int m = atoi(argv[0]); /* non-const */ int m = TIMEOUT; /* const */ printf("%lu\n",msecs_to_jiffies(m)); return 0; } testm.h: #define msecs_to_jiffies(m) \ (__builtin_constant_p (m) \ ? ((m) * HZ / MSECS_PER_SEC ) : __msecs_to_jiffies(m)) unsigned long __msecs_to_jiffies(int m) { return m * HZ / MSECS_PER_SEC ; } first case with a non-const main: .LFB12: .cfi_startproc subq $8, %rsp #, .cfi_def_cfa_offset 16 movq 8(%rsi), %rdi # MEM[(char * *)argv_2(D) + 8B], MEM[(char * *)argv_2(D) + 8B] xorl %eax, %eax # call atoi # movl $1717986919, %edx #, tmp69 movl %eax, %ecx #, m movl $.LC0, %edi #, imull %edx # tmp69 sarl $31, %ecx #, tmp71 xorl %eax, %eax # sarl $2, %edx #, tmp67 subl %ecx, %edx # tmp71, tmp67 movslq %edx, %rsi # tmp67, tmp72 call printf # o second with a constant: main: .LFB12: .cfi_startproc subq $8, %rsp #, .cfi_def_cfa_offset 16 movl $10, %esi #, movl $.LC0, %edi #, xorl %eax, %eax # call printf # inline: ------- testi.h: static inline unsigned long __msecs_to_jiffies(int m) { return m * HZ / MSECS_PER_SEC; } static inline unsigned long msecs_to_jiffies(int m) { return __builtin_constant_p (m) ? (m) * HZ / MSECS_PER_SEC : __msecs_to_jiffies(m); } first case with a non-const main: .LFB13: .cfi_startproc subq $8, %rsp #, .cfi_def_cfa_offset 16 movq (%rsi), %rdi # *argv_1(D), xorl %eax, %eax # call atoi # movl $1717986919, %edx #, tmp68 movl %eax, %ecx #, m movl $.LC0, %edi #, imull %edx # tmp68 sarl $31, %ecx #, tmp70 xorl %eax, %eax # sarl $2, %edx #, tmp66 subl %ecx, %edx # tmp70, tmp66 movslq %edx, %rsi # tmp66, tmp71 call printf # second with a constant: main: .LFB13: .cfi_startproc subq $8, %rsp #, .cfi_def_cfa_offset 16 xorl %esi, %esi # movl $.LC0, %edi #, xorl %eax, %eax # call printf # giving it another run from scratch somewhere I simply screwed up or overlooked some detail. thx! hofrat ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies 2015-04-06 6:40 ` Nicholas Mc Guire @ 2015-04-06 7:12 ` Joe Perches 2015-04-06 7:21 ` Nicholas Mc Guire 0 siblings, 1 reply; 14+ messages in thread From: Joe Perches @ 2015-04-06 7:12 UTC (permalink / raw) To: Nicholas Mc Guire Cc: Nicholas Mc Guire, Michal Marek, Masahiro Yamada, Sam Ravnborg, Thomas Gleixner, H. Peter Alvin, John Stultz, Andrew Hunter, Paul Turner, linux-kernel On Mon, 2015-04-06 at 08:40 +0200, Nicholas Mc Guire wrote: > #define msecs_to_jiffies(m) \ > (__builtin_constant_p (m) \ > ? ((m) * HZ / MSECS_PER_SEC ) : __msecs_to_jiffies(m)) [] > main: > .LFB12: > .cfi_startproc > subq $8, %rsp #, > .cfi_def_cfa_offset 16 > movl $10, %esi #, > movl $.LC0, %edi #, > xorl %eax, %eax # > call printf # vs: > static inline unsigned long msecs_to_jiffies(int m) > { > return __builtin_constant_p (m) ? > (m) * HZ / MSECS_PER_SEC : __msecs_to_jiffies(m); > } [] > main: > .LFB13: > .cfi_startproc > subq $8, %rsp #, > .cfi_def_cfa_offset 16 > xorl %esi, %esi # > movl $.LC0, %edi #, > xorl %eax, %eax # > call printf # > > giving it another run from scratch somewhere I simply screwed up or > overlooked some detail. If the optimizer was doing it's job properly, wouldn't the macro and inline output object code be the same? ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies 2015-04-06 7:12 ` Joe Perches @ 2015-04-06 7:21 ` Nicholas Mc Guire 0 siblings, 0 replies; 14+ messages in thread From: Nicholas Mc Guire @ 2015-04-06 7:21 UTC (permalink / raw) To: Joe Perches Cc: Nicholas Mc Guire, Michal Marek, Masahiro Yamada, Sam Ravnborg, Thomas Gleixner, H. Peter Alvin, John Stultz, Andrew Hunter, Paul Turner, linux-kernel On Mon, 06 Apr 2015, Joe Perches wrote: > On Mon, 2015-04-06 at 08:40 +0200, Nicholas Mc Guire wrote: > > #define msecs_to_jiffies(m) \ > > (__builtin_constant_p (m) \ > > ? ((m) * HZ / MSECS_PER_SEC ) : __msecs_to_jiffies(m)) > [] > > main: > > .LFB12: > > .cfi_startproc > > subq $8, %rsp #, > > .cfi_def_cfa_offset 16 > > movl $10, %esi #, > > movl $.LC0, %edi #, > > xorl %eax, %eax # > > call printf # > > vs: > > > static inline unsigned long msecs_to_jiffies(int m) > > { > > return __builtin_constant_p (m) ? > > (m) * HZ / MSECS_PER_SEC : __msecs_to_jiffies(m); > > } > [] > > main: > > .LFB13: > > .cfi_startproc > > subq $8, %rsp #, > > .cfi_def_cfa_offset 16 > > xorl %esi, %esi # > > movl $.LC0, %edi #, > > xorl %eax, %eax # > > call printf # > > > > giving it another run from scratch somewhere I simply screwed up or > > overlooked some detail. > > If the optimizer was doing it's job properly, wouldn't > the macro and inline output object code be the same? > yes - and they are - that was my mistake I grabed the wrong asm snippet - here is the complete test case also made a mess of the code while trimming down the mail - so here is the single test case showing, I think, that inline works as well and as expected. testi.h: #define HZ 100 #define MSECS_PER_SEC 1000 #define TIMEOUT 100 extern inline unsigned long __msecs_to_jiffies(int m); unsigned long msecs_to_jiffies(int m) { return __builtin_constant_p(m) ? ((m) * HZ / MSECS_PER_SEC ) : __msecs_to_jiffies(m); } test.c: #include <stdio.h> #include "testi.h" unsigned long __msecs_to_jiffies(int m) { return (m * HZ / MSECS_PER_SEC); } int main(int argc, char **argv) { //int m = atoi(argv[1]); int m = TIMEOUT; printf("%lu\n",msecs_to_jiffies(m)); return 0; } compiled with: gcc -O2 -S --verbose-asm test.c <snip> main: .LFB13: .cfi_startproc subq $8, %rsp #, .cfi_def_cfa_offset 16 movl $10, %esi #, movl $.LC0, %edi #, xorl %eax, %eax # call printf # <snip> need to cleanup here :) thx! hofrat ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies 2015-04-06 2:15 ` Joe Perches 2015-04-06 4:26 ` Nicholas Mc Guire @ 2015-04-12 8:36 ` Nicholas Mc Guire 1 sibling, 0 replies; 14+ messages in thread From: Nicholas Mc Guire @ 2015-04-12 8:36 UTC (permalink / raw) To: Joe Perches Cc: Nicholas Mc Guire, Michal Marek, Masahiro Yamada, Sam Ravnborg, Thomas Gleixner, H. Peter Alvin, John Stultz, Andrew Hunter, Paul Turner, linux-kernel > Your inline version has not worked with any of > x86-64 gcc 4.4, 4.6, 4.7, or 4.9 > > I suggest you add some lines to > lib/test_module.c/test_module_init like: > > unsigned int m; > > for (m = 10; m < 200; m += 10) > pr_info("msecs_to_jiffies(%u) is %lu\n", > m, msecs_to_jiffies(m)); > > pr_info("msecs_to_jiffies(%u) is %lu\n", > 10, msecs_to_jiffies(10)); > pr_info("msecs_to_jiffies(%u) is %lu\n", > 100, msecs_to_jiffies(100)); > pr_info("msecs_to_jiffies(%u) is %lu\n", > 1000, msecs_to_jiffies(1000)); > > Then it's pretty easy to look at the assembly/.lst file > > Your inline function doesn't allow gcc to precompute > the msecs_to_jiffies value. The macro one does for all > those gcc versions. > > Try it and look at the generated .lst files with and > without the patch I sent. > I have checked it with the testcase you proposed - and I quite sure it is working, find the .s file snippets from the test_module.c modified as you suggested below. HZ is set to 300 so that it is easy to differenciate the parameter to msecs_to_jiffies and the printed const. there is no call and no calculation for the constant cases - just load as it should be. with the patch applied: test_module_init: pushq %rbp # movq %rsp, %rbp #, pushq %rbx # movl $10, %ebx #, m pushq %rcx # .L4: movl %ebx, %edi # m, call __msecs_to_jiffies # movl %ebx, %esi # m, movq %rax, %rdx # D.14515, movq $.LC1, %rdi #, xorl %eax, %eax # addl $10, %ebx #, m call printk # cmpl $200, %ebx #, m jne .L4 #, <---end of for loop movl $3, %edx #, <---msecs_to_jiffies(10) movl $10, %esi #, movq $.LC1, %rdi #, xorl %eax, %eax # call printk # movl $30, %edx #, <---msecs_to_jiffies(100) movl $100, %esi #, movq $.LC1, %rdi #, xorl %eax, %eax # call printk # movl $300, %edx #, <---msecs_to_jiffies(1000) movl $1000, %esi #, movq $.LC1, %rdi #, xorl %eax, %eax # call printk # without the patch applied: pushq %rbp # movq %rsp, %rbp #, pushq %rbx # movl $10, %ebx #, m pushq %rcx # .L2: movl %ebx, %edi # m, call msecs_to_jiffies # movl %ebx, %esi # m, movq %rax, %rdx # D.14503, movq $.LC0, %rdi #, xorl %eax, %eax # addl $10, %ebx #, m call printk # cmpl $200, %ebx #, m jne .L2 #, movl $10, %edi #, call msecs_to_jiffies # movl $10, %esi #, movq %rax, %rdx # D.14504, movq $.LC0, %rdi #, xorl %eax, %eax # call printk # movl $100, %edi #, call msecs_to_jiffies # movl $100, %esi #, movq %rax, %rdx # D.14505, movq $.LC0, %rdi #, xorl %eax, %eax # call printk # movl $1000, %edi #, could you check the .s file for you test module ? I'm a bit lost on why you are not seeing this - also checked with cross-build (multi_v7_defconfig) and it looks like thats working as well. thx! hofrat ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 3/3] time: update msecs_to_jiffies doc and move to kernel-doc format 2015-04-05 7:23 [PATCH 0/3] time: use __builtin_constant_p() in msecs_to_jiffies Nicholas Mc Guire 2015-04-05 7:23 ` [PATCH 1/3] time: move timeconst.h into include/generated Nicholas Mc Guire 2015-04-05 7:23 ` [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies Nicholas Mc Guire @ 2015-04-05 7:23 ` Nicholas Mc Guire 2015-04-05 9:33 ` [PATCH 0/3] time: use __builtin_constant_p() in msecs_to_jiffies Joe Perches 3 siblings, 0 replies; 14+ messages in thread From: Nicholas Mc Guire @ 2015-04-05 7:23 UTC (permalink / raw) To: Michal Marek Cc: Masahiro Yamada, Sam Ravnborg, Thomas Gleixner, H. Peter Alvin, Joe Perches, John Stultz, Andrew Hunter, Paul Turner, linux-kernel, Nicholas Mc Guire update the documentation of msecs_to_jiffies and move to kernel-doc format Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> --- Patch was compile with x86_64_defconfig Patch is against 4.0-rc6 (localversion-next is -next-20150402) include/linux/jiffies.h | 23 +++++++++++++++++++++++ kernel/time/time.c | 19 ++++++++++--------- 2 files changed, 33 insertions(+), 9 deletions(-) diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h index dcd8ba5..a75158e 100644 --- a/include/linux/jiffies.h +++ b/include/linux/jiffies.h @@ -290,6 +290,29 @@ static inline u64 jiffies_to_nsecs(const unsigned long j) } extern unsigned long __msecs_to_jiffies(const unsigned int m); + +/** + * msecs_to_jiffies: - convert milliseconds to jiffies + * @m: time in millisecons + * + * conversion is done as follows: + * + * - negative values mean 'infinite timeout' (MAX_JIFFY_OFFSET) + * + * - 'too large' values [that would result in larger than + * MAX_JIFFY_OFFSET values] mean 'infinite timeout' too. + * + * - all other values are converted to jiffies by either multiplying + * the input value by a factor or dividing it with a factor and + * handling any 32-bit overflows. + * for the details see __msecs_to_jiffies() + * + * msecs_to_jiffies() checks for the passed in value being a constant + * via __builtin_constant_p() allowing gcc to eliminate most of the + * code, __msecs_to_jiffies() is called if the value passed does not + * allow constant folding and the actual conversion must be done at + * runtime. + */ static inline unsigned long msecs_to_jiffies(const unsigned int m) { /* diff --git a/kernel/time/time.c b/kernel/time/time.c index 3797540..5d97610 100644 --- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -483,22 +483,23 @@ struct timespec64 ns_to_timespec64(const s64 nsec) } EXPORT_SYMBOL(ns_to_timespec64); #endif -/* - * When we convert to jiffies then we interpret incoming values - * the following way: + +/** + * __msecs_to_jiffies: - convert milliseconds to jiffies + * @m: time in millisecons * - * - negative values mean 'infinite timeout' (MAX_JIFFY_OFFSET) - * negative values are handled in msecs_to_jiffies in - * include/linux/jiffies.h + * conversion is done as follows: * * - 'too large' values [that would result in larger than * MAX_JIFFY_OFFSET values] mean 'infinite timeout' too. * * - all other values are converted to jiffies by either multiplying - * the input value by a factor or dividing it with a factor - * - * We must also be careful about 32-bit overflows. + * the input value by a factor or dividing it with a factor and + * handling any 32-bit overflows. * + * __msecs_to_jiffies() is called if the value passed to + * msecs_to_jiffies() does not allow constant folding and the actual + * conversion must be done at runtime. */ unsigned long __msecs_to_jiffies(const unsigned int m) { -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 0/3] time: use __builtin_constant_p() in msecs_to_jiffies 2015-04-05 7:23 [PATCH 0/3] time: use __builtin_constant_p() in msecs_to_jiffies Nicholas Mc Guire ` (2 preceding siblings ...) 2015-04-05 7:23 ` [PATCH 3/3] time: update msecs_to_jiffies doc and move to kernel-doc format Nicholas Mc Guire @ 2015-04-05 9:33 ` Joe Perches 3 siblings, 0 replies; 14+ messages in thread From: Joe Perches @ 2015-04-05 9:33 UTC (permalink / raw) To: Nicholas Mc Guire Cc: Michal Marek, Masahiro Yamada, Sam Ravnborg, Thomas Gleixner, H. Peter Alvin, John Stultz, Andrew Hunter, Paul Turner, linux-kernel On Sun, 2015-04-05 at 09:23 +0200, Nicholas Mc Guire wrote: > Allowing gcc to fold constants for these calls that in most > cases are passing in constants (roughly 95%) has some potential > to improve performance (and should save a few bytes). Thanks for working on this Nicholas. ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2015-04-12 8:36 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-04-05 7:23 [PATCH 0/3] time: use __builtin_constant_p() in msecs_to_jiffies Nicholas Mc Guire 2015-04-05 7:23 ` [PATCH 1/3] time: move timeconst.h into include/generated Nicholas Mc Guire 2015-04-05 7:23 ` [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies Nicholas Mc Guire 2015-04-06 0:03 ` Joe Perches 2015-04-06 1:00 ` Nicholas Mc Guire 2015-04-06 2:15 ` Joe Perches 2015-04-06 4:26 ` Nicholas Mc Guire 2015-04-06 4:33 ` Joe Perches 2015-04-06 6:40 ` Nicholas Mc Guire 2015-04-06 7:12 ` Joe Perches 2015-04-06 7:21 ` Nicholas Mc Guire 2015-04-12 8:36 ` Nicholas Mc Guire 2015-04-05 7:23 ` [PATCH 3/3] time: update msecs_to_jiffies doc and move to kernel-doc format Nicholas Mc Guire 2015-04-05 9:33 ` [PATCH 0/3] time: use __builtin_constant_p() in msecs_to_jiffies Joe Perches
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox