* [PATCH] vsyscall: use __iter_div_u64_rem()
@ 2019-07-10 13:01 Arnd Bergmann
2019-07-10 13:01 ` Arnd Bergmann
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Arnd Bergmann @ 2019-07-10 13:01 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino
Cc: Arnd Bergmann, linux-arch, linux-arm-kernel, linux-mips,
linux-kselftest, Catalin Marinas, Will Deacon, Russell King,
Ralf Baechle, Paul Burton, Daniel Lezcano, Mark Salyzyn,
Peter Collingbourne, Shuah Khan, Dmitry Safonov, Rasmus Villemoes,
Huw Davies, linux-kernel, clang-built-linux
On 32-bit x86 when building with clang-9, the loop gets turned back into
an inefficient division that causes a link error:
kernel/time/vsyscall.o: In function `update_vsyscall':
vsyscall.c:(.text+0xe3): undefined reference to `__udivdi3'
Use the provided __iter_div_u64_rem() function that is meant to address
the same case in other files.
Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
kernel/time/vsyscall.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/kernel/time/vsyscall.c b/kernel/time/vsyscall.c
index a80893180826..8cf3596a4ce6 100644
--- a/kernel/time/vsyscall.c
+++ b/kernel/time/vsyscall.c
@@ -104,11 +104,7 @@ void update_vsyscall(struct timekeeper *tk)
vdso_ts->sec = tk->xtime_sec + tk->wall_to_monotonic.tv_sec;
nsec = tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift;
nsec = nsec + tk->wall_to_monotonic.tv_nsec;
- while (nsec >= NSEC_PER_SEC) {
- nsec = nsec - NSEC_PER_SEC;
- vdso_ts->sec++;
- }
- vdso_ts->nsec = nsec;
+ vdso_ts->sec += __iter_div_u64_rem(nsec, NSEC_PER_SEC, &vdso_ts->nsec);
if (__arch_use_vsyscall(vdata))
update_vdso_data(vdata, tk);
--
2.20.0
^ permalink raw reply related [flat|nested] 14+ messages in thread* [PATCH] vsyscall: use __iter_div_u64_rem() 2019-07-10 13:01 [PATCH] vsyscall: use __iter_div_u64_rem() Arnd Bergmann @ 2019-07-10 13:01 ` Arnd Bergmann 2019-07-10 16:56 ` Nathan Chancellor 2019-07-11 12:14 ` Vincenzo Frascino 2 siblings, 0 replies; 14+ messages in thread From: Arnd Bergmann @ 2019-07-10 13:01 UTC (permalink / raw) To: Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino Cc: Arnd Bergmann, linux-arch, linux-arm-kernel, linux-mips, linux-kselftest, Catalin Marinas, Will Deacon, Russell King, Ralf Baechle, Paul Burton, Daniel Lezcano, Mark Salyzyn, Peter Collingbourne, Shuah Khan, Dmitry Safonov, Rasmus Villemoes, Huw Davies, linux-kernel, clang-built-linux On 32-bit x86 when building with clang-9, the loop gets turned back into an inefficient division that causes a link error: kernel/time/vsyscall.o: In function `update_vsyscall': vsyscall.c:(.text+0xe3): undefined reference to `__udivdi3' Use the provided __iter_div_u64_rem() function that is meant to address the same case in other files. Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation") Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- kernel/time/vsyscall.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/kernel/time/vsyscall.c b/kernel/time/vsyscall.c index a80893180826..8cf3596a4ce6 100644 --- a/kernel/time/vsyscall.c +++ b/kernel/time/vsyscall.c @@ -104,11 +104,7 @@ void update_vsyscall(struct timekeeper *tk) vdso_ts->sec = tk->xtime_sec + tk->wall_to_monotonic.tv_sec; nsec = tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift; nsec = nsec + tk->wall_to_monotonic.tv_nsec; - while (nsec >= NSEC_PER_SEC) { - nsec = nsec - NSEC_PER_SEC; - vdso_ts->sec++; - } - vdso_ts->nsec = nsec; + vdso_ts->sec += __iter_div_u64_rem(nsec, NSEC_PER_SEC, &vdso_ts->nsec); if (__arch_use_vsyscall(vdata)) update_vdso_data(vdata, tk); -- 2.20.0 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] vsyscall: use __iter_div_u64_rem() 2019-07-10 13:01 [PATCH] vsyscall: use __iter_div_u64_rem() Arnd Bergmann 2019-07-10 13:01 ` Arnd Bergmann @ 2019-07-10 16:56 ` Nathan Chancellor 2019-07-10 16:56 ` Nathan Chancellor 2019-07-11 12:14 ` Vincenzo Frascino 2 siblings, 1 reply; 14+ messages in thread From: Nathan Chancellor @ 2019-07-10 16:56 UTC (permalink / raw) To: Arnd Bergmann Cc: Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino, linux-arch, linux-arm-kernel, linux-mips, linux-kselftest, Catalin Marinas, Will Deacon, Russell King, Ralf Baechle, Paul Burton, Daniel Lezcano, Mark Salyzyn, Peter Collingbourne, Shuah Khan, Dmitry Safonov, Rasmus Villemoes, Huw Davies, linux-kernel, clang-built-linux On Wed, Jul 10, 2019 at 03:01:53PM +0200, Arnd Bergmann wrote: > On 32-bit x86 when building with clang-9, the loop gets turned back into > an inefficient division that causes a link error: > > kernel/time/vsyscall.o: In function `update_vsyscall': > vsyscall.c:(.text+0xe3): undefined reference to `__udivdi3' > > Use the provided __iter_div_u64_rem() function that is meant to address > the same case in other files. > > Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation") > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > --- > kernel/time/vsyscall.c | 6 +----- > 1 file changed, 1 insertion(+), 5 deletions(-) > > diff --git a/kernel/time/vsyscall.c b/kernel/time/vsyscall.c > index a80893180826..8cf3596a4ce6 100644 > --- a/kernel/time/vsyscall.c > +++ b/kernel/time/vsyscall.c > @@ -104,11 +104,7 @@ void update_vsyscall(struct timekeeper *tk) > vdso_ts->sec = tk->xtime_sec + tk->wall_to_monotonic.tv_sec; > nsec = tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift; > nsec = nsec + tk->wall_to_monotonic.tv_nsec; > - while (nsec >= NSEC_PER_SEC) { > - nsec = nsec - NSEC_PER_SEC; > - vdso_ts->sec++; > - } > - vdso_ts->nsec = nsec; > + vdso_ts->sec += __iter_div_u64_rem(nsec, NSEC_PER_SEC, &vdso_ts->nsec); > > if (__arch_use_vsyscall(vdata)) > update_vdso_data(vdata, tk); > -- > 2.20.0 > What an interesting function. Looks good to me and I can confirm it fixes the link error. Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] vsyscall: use __iter_div_u64_rem() 2019-07-10 16:56 ` Nathan Chancellor @ 2019-07-10 16:56 ` Nathan Chancellor 0 siblings, 0 replies; 14+ messages in thread From: Nathan Chancellor @ 2019-07-10 16:56 UTC (permalink / raw) To: Arnd Bergmann Cc: Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino, linux-arch, linux-arm-kernel, linux-mips, linux-kselftest, Catalin Marinas, Will Deacon, Russell King, Ralf Baechle, Paul Burton, Daniel Lezcano, Mark Salyzyn, Peter Collingbourne, Shuah Khan, Dmitry Safonov, Rasmus Villemoes, Huw Davies, linux-kernel, clang-built-linux On Wed, Jul 10, 2019 at 03:01:53PM +0200, Arnd Bergmann wrote: > On 32-bit x86 when building with clang-9, the loop gets turned back into > an inefficient division that causes a link error: > > kernel/time/vsyscall.o: In function `update_vsyscall': > vsyscall.c:(.text+0xe3): undefined reference to `__udivdi3' > > Use the provided __iter_div_u64_rem() function that is meant to address > the same case in other files. > > Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation") > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > --- > kernel/time/vsyscall.c | 6 +----- > 1 file changed, 1 insertion(+), 5 deletions(-) > > diff --git a/kernel/time/vsyscall.c b/kernel/time/vsyscall.c > index a80893180826..8cf3596a4ce6 100644 > --- a/kernel/time/vsyscall.c > +++ b/kernel/time/vsyscall.c > @@ -104,11 +104,7 @@ void update_vsyscall(struct timekeeper *tk) > vdso_ts->sec = tk->xtime_sec + tk->wall_to_monotonic.tv_sec; > nsec = tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift; > nsec = nsec + tk->wall_to_monotonic.tv_nsec; > - while (nsec >= NSEC_PER_SEC) { > - nsec = nsec - NSEC_PER_SEC; > - vdso_ts->sec++; > - } > - vdso_ts->nsec = nsec; > + vdso_ts->sec += __iter_div_u64_rem(nsec, NSEC_PER_SEC, &vdso_ts->nsec); > > if (__arch_use_vsyscall(vdata)) > update_vdso_data(vdata, tk); > -- > 2.20.0 > What an interesting function. Looks good to me and I can confirm it fixes the link error. Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] vsyscall: use __iter_div_u64_rem() 2019-07-10 13:01 [PATCH] vsyscall: use __iter_div_u64_rem() Arnd Bergmann 2019-07-10 13:01 ` Arnd Bergmann 2019-07-10 16:56 ` Nathan Chancellor @ 2019-07-11 12:14 ` Vincenzo Frascino 2019-07-11 12:14 ` Vincenzo Frascino 2019-07-11 12:28 ` Arnd Bergmann 2 siblings, 2 replies; 14+ messages in thread From: Vincenzo Frascino @ 2019-07-11 12:14 UTC (permalink / raw) To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner Cc: linux-arch, linux-arm-kernel, linux-mips, linux-kselftest, Catalin Marinas, Will Deacon, Russell King, Ralf Baechle, Paul Burton, Daniel Lezcano, Mark Salyzyn, Peter Collingbourne, Shuah Khan, Dmitry Safonov, Rasmus Villemoes, Huw Davies, linux-kernel, clang-built-linux Hi Arnd, On 10/07/2019 14:01, Arnd Bergmann wrote: > On 32-bit x86 when building with clang-9, the loop gets turned back into > an inefficient division that causes a link error: > > kernel/time/vsyscall.o: In function `update_vsyscall': > vsyscall.c:(.text+0xe3): undefined reference to `__udivdi3' > > Use the provided __iter_div_u64_rem() function that is meant to address > the same case in other files. > > Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation") > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > --- > kernel/time/vsyscall.c | 6 +----- > 1 file changed, 1 insertion(+), 5 deletions(-) > > diff --git a/kernel/time/vsyscall.c b/kernel/time/vsyscall.c > index a80893180826..8cf3596a4ce6 100644 > --- a/kernel/time/vsyscall.c > +++ b/kernel/time/vsyscall.c > @@ -104,11 +104,7 @@ void update_vsyscall(struct timekeeper *tk) > vdso_ts->sec = tk->xtime_sec + tk->wall_to_monotonic.tv_sec; > nsec = tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift; > nsec = nsec + tk->wall_to_monotonic.tv_nsec; > - while (nsec >= NSEC_PER_SEC) { > - nsec = nsec - NSEC_PER_SEC; > - vdso_ts->sec++; > - } > - vdso_ts->nsec = nsec; > + vdso_ts->sec += __iter_div_u64_rem(nsec, NSEC_PER_SEC, &vdso_ts->nsec); > > if (__arch_use_vsyscall(vdata)) > update_vdso_data(vdata, tk); > I am trying to test this patch using clang-9 tip: # clang -v clang version 9.0.0 (git@github.com:llvm-mirror/clang.git 6ed0749151866894a67a3e7eefdc1f3a547daa0e) (git@github.com:llvm-mirror/llvm.git a10a70238ace1093cad3adeb94814b422bd1b5c1) but I get a lot of errors similar to the one below: In file included from ~/linux/arch/x86/events/amd/core.c:11: ~/linux/arch/x86/events/amd/../perf_event.h:824:21: error: invalid output size for constraint '=q' u64 disable_mask = __this_cpu_read(cpu_hw_events.perf_ctr_virt_mask); ^ ~/linux/include/linux/percpu-defs.h:447:2: note: expanded from macro '__this_cpu_read' raw_cpu_read(pcp); \ ^ ~/linux/include/linux/percpu-defs.h:421:28: note: expanded from macro 'raw_cpu_read' #define raw_cpu_read(pcp) __pcpu_size_call_return(raw_cpu_read_, pcp) ^ ~/linux/include/linux/percpu-defs.h:322:23: note: expanded from macro '__pcpu_size_call_return' case 1: pscr_ret__ = stem##1(variable); break; \ ^ <scratch space>:110:1: note: expanded from here raw_cpu_read_1 ^ ~/linux/arch/x86/include/asm/percpu.h:394:30: note: expanded from macro 'raw_cpu_read_1' #define raw_cpu_read_1(pcp) percpu_from_op(, "mov", pcp) ^ ~/linux/arch/x86/include/asm/percpu.h:189:15: note: expanded from macro 'percpu_from_op' : "=q" (pfo_ret__) \ Could you please tell me which version of the compiler did you use? My building command is: # make mrproper && make CC=clang HOSTCC=clang i386_defconfig && make ARCH=i386 CC=clang HOSTCC=clang -j56 -- Regards, Vincenzo ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] vsyscall: use __iter_div_u64_rem() 2019-07-11 12:14 ` Vincenzo Frascino @ 2019-07-11 12:14 ` Vincenzo Frascino 2019-07-11 12:28 ` Arnd Bergmann 1 sibling, 0 replies; 14+ messages in thread From: Vincenzo Frascino @ 2019-07-11 12:14 UTC (permalink / raw) To: Arnd Bergmann, Andy Lutomirski, Thomas Gleixner Cc: linux-arch, linux-arm-kernel, linux-mips, linux-kselftest, Catalin Marinas, Will Deacon, Russell King, Ralf Baechle, Paul Burton, Daniel Lezcano, Mark Salyzyn, Peter Collingbourne, Shuah Khan, Dmitry Safonov, Rasmus Villemoes, Huw Davies, linux-kernel, clang-built-linux Hi Arnd, On 10/07/2019 14:01, Arnd Bergmann wrote: > On 32-bit x86 when building with clang-9, the loop gets turned back into > an inefficient division that causes a link error: > > kernel/time/vsyscall.o: In function `update_vsyscall': > vsyscall.c:(.text+0xe3): undefined reference to `__udivdi3' > > Use the provided __iter_div_u64_rem() function that is meant to address > the same case in other files. > > Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation") > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > --- > kernel/time/vsyscall.c | 6 +----- > 1 file changed, 1 insertion(+), 5 deletions(-) > > diff --git a/kernel/time/vsyscall.c b/kernel/time/vsyscall.c > index a80893180826..8cf3596a4ce6 100644 > --- a/kernel/time/vsyscall.c > +++ b/kernel/time/vsyscall.c > @@ -104,11 +104,7 @@ void update_vsyscall(struct timekeeper *tk) > vdso_ts->sec = tk->xtime_sec + tk->wall_to_monotonic.tv_sec; > nsec = tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift; > nsec = nsec + tk->wall_to_monotonic.tv_nsec; > - while (nsec >= NSEC_PER_SEC) { > - nsec = nsec - NSEC_PER_SEC; > - vdso_ts->sec++; > - } > - vdso_ts->nsec = nsec; > + vdso_ts->sec += __iter_div_u64_rem(nsec, NSEC_PER_SEC, &vdso_ts->nsec); > > if (__arch_use_vsyscall(vdata)) > update_vdso_data(vdata, tk); > I am trying to test this patch using clang-9 tip: # clang -v clang version 9.0.0 (git@github.com:llvm-mirror/clang.git 6ed0749151866894a67a3e7eefdc1f3a547daa0e) (git@github.com:llvm-mirror/llvm.git a10a70238ace1093cad3adeb94814b422bd1b5c1) but I get a lot of errors similar to the one below: In file included from ~/linux/arch/x86/events/amd/core.c:11: ~/linux/arch/x86/events/amd/../perf_event.h:824:21: error: invalid output size for constraint '=q' u64 disable_mask = __this_cpu_read(cpu_hw_events.perf_ctr_virt_mask); ^ ~/linux/include/linux/percpu-defs.h:447:2: note: expanded from macro '__this_cpu_read' raw_cpu_read(pcp); \ ^ ~/linux/include/linux/percpu-defs.h:421:28: note: expanded from macro 'raw_cpu_read' #define raw_cpu_read(pcp) __pcpu_size_call_return(raw_cpu_read_, pcp) ^ ~/linux/include/linux/percpu-defs.h:322:23: note: expanded from macro '__pcpu_size_call_return' case 1: pscr_ret__ = stem##1(variable); break; \ ^ <scratch space>:110:1: note: expanded from here raw_cpu_read_1 ^ ~/linux/arch/x86/include/asm/percpu.h:394:30: note: expanded from macro 'raw_cpu_read_1' #define raw_cpu_read_1(pcp) percpu_from_op(, "mov", pcp) ^ ~/linux/arch/x86/include/asm/percpu.h:189:15: note: expanded from macro 'percpu_from_op' : "=q" (pfo_ret__) \ Could you please tell me which version of the compiler did you use? My building command is: # make mrproper && make CC=clang HOSTCC=clang i386_defconfig && make ARCH=i386 CC=clang HOSTCC=clang -j56 -- Regards, Vincenzo ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] vsyscall: use __iter_div_u64_rem() 2019-07-11 12:14 ` Vincenzo Frascino 2019-07-11 12:14 ` Vincenzo Frascino @ 2019-07-11 12:28 ` Arnd Bergmann 2019-07-11 12:28 ` Arnd Bergmann ` (2 more replies) 1 sibling, 3 replies; 14+ messages in thread From: Arnd Bergmann @ 2019-07-11 12:28 UTC (permalink / raw) To: Vincenzo Frascino Cc: Andy Lutomirski, Thomas Gleixner, linux-arch, Linux ARM, linux-mips, open list:KERNEL SELFTEST FRAMEWORK, Catalin Marinas, Will Deacon, Russell King, Ralf Baechle, Paul Burton, Daniel Lezcano, Mark Salyzyn, Peter Collingbourne, Shuah Khan, Dmitry Safonov, Rasmus Villemoes, Huw Davies On Thu, Jul 11, 2019 at 2:14 PM Vincenzo Frascino <vincenzo.frascino@arm.com> wrote: > > > Could you please tell me which version of the compiler did you use? > > My building command is: > > # make mrproper && make CC=clang HOSTCC=clang i386_defconfig && make ARCH=i386 > CC=clang HOSTCC=clang -j56 > See below for the patch I am using locally to work around this. That patch is probably wrong, so I have not submitted it yet, but it gives you a clean build ;-) Arnd 8<--- Subject: [PATCH] x86: percpu: fix clang 32-bit build clang does not like an inline assembly with a "=q" contraint for a 64-bit output: arch/x86/events/perf_event.h:824:21: error: invalid output size for constraint '=q' u64 disable_mask = __this_cpu_read(cpu_hw_events.perf_ctr_virt_mask); ^ include/linux/percpu-defs.h:447:2: note: expanded from macro '__this_cpu_read' raw_cpu_read(pcp); \ ^ include/linux/percpu-defs.h:421:28: note: expanded from macro 'raw_cpu_read' #define raw_cpu_read(pcp) __pcpu_size_call_return(raw_cpu_read_, pcp) ^ include/linux/percpu-defs.h:322:23: note: expanded from macro '__pcpu_size_call_return' case 1: pscr_ret__ = stem##1(variable); break; \ ^ <scratch space>:357:1: note: expanded from here raw_cpu_read_1 ^ arch/x86/include/asm/percpu.h:394:30: note: expanded from macro 'raw_cpu_read_1' #define raw_cpu_read_1(pcp) percpu_from_op(, "mov", pcp) ^ arch/x86/include/asm/percpu.h:189:15: note: expanded from macro 'percpu_from_op' : "=q" (pfo_ret__) \ ^ According to the commit that introduced the "q" constraint, this was needed to fix miscompilation, but it gives no further detail. Using the normal "=r" constraint seems to work so far. Fixes: 3c598766a2ba ("x86: fix percpu_{to,from}_op()") Cc: Jan Beulich <jbeulich@suse.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h index 2278797c769d..e791fbf4018f 100644 --- a/arch/x86/include/asm/percpu.h +++ b/arch/x86/include/asm/percpu.h @@ -99,7 +99,7 @@ do { \ case 1: \ asm qual (op "b %1,"__percpu_arg(0) \ : "+m" (var) \ - : "qi" ((pto_T__)(val))); \ + : "ri" ((pto_T__)(val))); \ break; \ case 2: \ asm qual (op "w %1,"__percpu_arg(0) \ @@ -144,7 +144,7 @@ do { \ else \ asm qual ("addb %1, "__percpu_arg(0) \ : "+m" (var) \ - : "qi" ((pao_T__)(val))); \ + : "ri" ((pao_T__)(val))); \ break; \ case 2: \ if (pao_ID__ == 1) \ @@ -186,7 +186,7 @@ do { \ switch (sizeof(var)) { \ case 1: \ asm qual (op "b "__percpu_arg(1)",%0" \ - : "=q" (pfo_ret__) \ + : "=r" (pfo_ret__) \ : "m" (var)); \ break; \ case 2: \ @@ -215,7 +215,7 @@ do { \ switch (sizeof(var)) { \ case 1: \ asm(op "b "__percpu_arg(P1)",%0" \ - : "=q" (pfo_ret__) \ + : "=r" (pfo_ret__) \ : "p" (&(var))); \ break; \ case 2: \ ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] vsyscall: use __iter_div_u64_rem() 2019-07-11 12:28 ` Arnd Bergmann @ 2019-07-11 12:28 ` Arnd Bergmann 2019-07-11 13:08 ` Vincenzo Frascino 2019-07-11 17:14 ` Nick Desaulniers 2 siblings, 0 replies; 14+ messages in thread From: Arnd Bergmann @ 2019-07-11 12:28 UTC (permalink / raw) To: Vincenzo Frascino Cc: Andy Lutomirski, Thomas Gleixner, linux-arch, Linux ARM, linux-mips, open list:KERNEL SELFTEST FRAMEWORK, Catalin Marinas, Will Deacon, Russell King, Ralf Baechle, Paul Burton, Daniel Lezcano, Mark Salyzyn, Peter Collingbourne, Shuah Khan, Dmitry Safonov, Rasmus Villemoes, Huw Davies, Linux Kernel Mailing List, clang-built-linux On Thu, Jul 11, 2019 at 2:14 PM Vincenzo Frascino <vincenzo.frascino@arm.com> wrote: > > > Could you please tell me which version of the compiler did you use? > > My building command is: > > # make mrproper && make CC=clang HOSTCC=clang i386_defconfig && make ARCH=i386 > CC=clang HOSTCC=clang -j56 > See below for the patch I am using locally to work around this. That patch is probably wrong, so I have not submitted it yet, but it gives you a clean build ;-) Arnd 8<--- Subject: [PATCH] x86: percpu: fix clang 32-bit build clang does not like an inline assembly with a "=q" contraint for a 64-bit output: arch/x86/events/perf_event.h:824:21: error: invalid output size for constraint '=q' u64 disable_mask = __this_cpu_read(cpu_hw_events.perf_ctr_virt_mask); ^ include/linux/percpu-defs.h:447:2: note: expanded from macro '__this_cpu_read' raw_cpu_read(pcp); \ ^ include/linux/percpu-defs.h:421:28: note: expanded from macro 'raw_cpu_read' #define raw_cpu_read(pcp) __pcpu_size_call_return(raw_cpu_read_, pcp) ^ include/linux/percpu-defs.h:322:23: note: expanded from macro '__pcpu_size_call_return' case 1: pscr_ret__ = stem##1(variable); break; \ ^ <scratch space>:357:1: note: expanded from here raw_cpu_read_1 ^ arch/x86/include/asm/percpu.h:394:30: note: expanded from macro 'raw_cpu_read_1' #define raw_cpu_read_1(pcp) percpu_from_op(, "mov", pcp) ^ arch/x86/include/asm/percpu.h:189:15: note: expanded from macro 'percpu_from_op' : "=q" (pfo_ret__) \ ^ According to the commit that introduced the "q" constraint, this was needed to fix miscompilation, but it gives no further detail. Using the normal "=r" constraint seems to work so far. Fixes: 3c598766a2ba ("x86: fix percpu_{to,from}_op()") Cc: Jan Beulich <jbeulich@suse.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h index 2278797c769d..e791fbf4018f 100644 --- a/arch/x86/include/asm/percpu.h +++ b/arch/x86/include/asm/percpu.h @@ -99,7 +99,7 @@ do { \ case 1: \ asm qual (op "b %1,"__percpu_arg(0) \ : "+m" (var) \ - : "qi" ((pto_T__)(val))); \ + : "ri" ((pto_T__)(val))); \ break; \ case 2: \ asm qual (op "w %1,"__percpu_arg(0) \ @@ -144,7 +144,7 @@ do { \ else \ asm qual ("addb %1, "__percpu_arg(0) \ : "+m" (var) \ - : "qi" ((pao_T__)(val))); \ + : "ri" ((pao_T__)(val))); \ break; \ case 2: \ if (pao_ID__ == 1) \ @@ -186,7 +186,7 @@ do { \ switch (sizeof(var)) { \ case 1: \ asm qual (op "b "__percpu_arg(1)",%0" \ - : "=q" (pfo_ret__) \ + : "=r" (pfo_ret__) \ : "m" (var)); \ break; \ case 2: \ @@ -215,7 +215,7 @@ do { \ switch (sizeof(var)) { \ case 1: \ asm(op "b "__percpu_arg(P1)",%0" \ - : "=q" (pfo_ret__) \ + : "=r" (pfo_ret__) \ : "p" (&(var))); \ break; \ case 2: \ ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] vsyscall: use __iter_div_u64_rem() 2019-07-11 12:28 ` Arnd Bergmann 2019-07-11 12:28 ` Arnd Bergmann @ 2019-07-11 13:08 ` Vincenzo Frascino 2019-07-11 13:08 ` Vincenzo Frascino 2019-07-11 17:14 ` Nick Desaulniers 2 siblings, 1 reply; 14+ messages in thread From: Vincenzo Frascino @ 2019-07-11 13:08 UTC (permalink / raw) To: Arnd Bergmann Cc: Andy Lutomirski, Thomas Gleixner, linux-arch, Linux ARM, linux-mips, open list:KERNEL SELFTEST FRAMEWORK, Catalin Marinas, Will Deacon, Russell King, Ralf Baechle, Paul Burton, Daniel Lezcano, Mark Salyzyn, Peter Collingbourne, Shuah Khan, Dmitry Safonov, Rasmus Villemoes, Huw Davies Hi Arnd, On 11/07/2019 13:28, Arnd Bergmann wrote: > On Thu, Jul 11, 2019 at 2:14 PM Vincenzo Frascino > <vincenzo.frascino@arm.com> wrote: >> >> >> Could you please tell me which version of the compiler did you use? >> >> My building command is: >> >> # make mrproper && make CC=clang HOSTCC=clang i386_defconfig && make ARCH=i386 >> CC=clang HOSTCC=clang -j56 >> > > See below for the patch I am using locally to work around this. > That patch is probably wrong, so I have not submitted it yet, but it > gives you a clean build ;-) > > Arnd > Thank you, I will give it a go :-) > 8<--- > Subject: [PATCH] x86: percpu: fix clang 32-bit build > > clang does not like an inline assembly with a "=q" contraint for > a 64-bit output: > > arch/x86/events/perf_event.h:824:21: error: invalid output size for > constraint '=q' > u64 disable_mask = __this_cpu_read(cpu_hw_events.perf_ctr_virt_mask); > ^ > include/linux/percpu-defs.h:447:2: note: expanded from macro '__this_cpu_read' > raw_cpu_read(pcp); \ > ^ > include/linux/percpu-defs.h:421:28: note: expanded from macro 'raw_cpu_read' > #define raw_cpu_read(pcp) > __pcpu_size_call_return(raw_cpu_read_, pcp) > ^ > include/linux/percpu-defs.h:322:23: note: expanded from macro > '__pcpu_size_call_return' > case 1: pscr_ret__ = stem##1(variable); break; \ > ^ > <scratch space>:357:1: note: expanded from here > raw_cpu_read_1 > ^ > arch/x86/include/asm/percpu.h:394:30: note: expanded from macro 'raw_cpu_read_1' > #define raw_cpu_read_1(pcp) percpu_from_op(, "mov", pcp) > ^ > arch/x86/include/asm/percpu.h:189:15: note: expanded from macro 'percpu_from_op' > : "=q" (pfo_ret__) \ > ^ > > According to the commit that introduced the "q" constraint, this was > needed to fix miscompilation, but it gives no further detail. > > Using the normal "=r" constraint seems to work so far. > > Fixes: 3c598766a2ba ("x86: fix percpu_{to,from}_op()") > Cc: Jan Beulich <jbeulich@suse.com> > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > > diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h > index 2278797c769d..e791fbf4018f 100644 > --- a/arch/x86/include/asm/percpu.h > +++ b/arch/x86/include/asm/percpu.h > @@ -99,7 +99,7 @@ do { \ > case 1: \ > asm qual (op "b %1,"__percpu_arg(0) \ > : "+m" (var) \ > - : "qi" ((pto_T__)(val))); \ > + : "ri" ((pto_T__)(val))); \ > break; \ > case 2: \ > asm qual (op "w %1,"__percpu_arg(0) \ > @@ -144,7 +144,7 @@ do { > \ > else \ > asm qual ("addb %1, "__percpu_arg(0) \ > : "+m" (var) \ > - : "qi" ((pao_T__)(val))); \ > + : "ri" ((pao_T__)(val))); \ > break; \ > case 2: \ > if (pao_ID__ == 1) \ > @@ -186,7 +186,7 @@ do { > \ > switch (sizeof(var)) { \ > case 1: \ > asm qual (op "b "__percpu_arg(1)",%0" \ > - : "=q" (pfo_ret__) \ > + : "=r" (pfo_ret__) \ > : "m" (var)); \ > break; \ > case 2: \ > @@ -215,7 +215,7 @@ do { > \ > switch (sizeof(var)) { \ > case 1: \ > asm(op "b "__percpu_arg(P1)",%0" \ > - : "=q" (pfo_ret__) \ > + : "=r" (pfo_ret__) \ > : "p" (&(var))); \ > break; \ > case 2: \ > -- Regards, Vincenzo ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] vsyscall: use __iter_div_u64_rem() 2019-07-11 13:08 ` Vincenzo Frascino @ 2019-07-11 13:08 ` Vincenzo Frascino 0 siblings, 0 replies; 14+ messages in thread From: Vincenzo Frascino @ 2019-07-11 13:08 UTC (permalink / raw) To: Arnd Bergmann Cc: Andy Lutomirski, Thomas Gleixner, linux-arch, Linux ARM, linux-mips, open list:KERNEL SELFTEST FRAMEWORK, Catalin Marinas, Will Deacon, Russell King, Ralf Baechle, Paul Burton, Daniel Lezcano, Mark Salyzyn, Peter Collingbourne, Shuah Khan, Dmitry Safonov, Rasmus Villemoes, Huw Davies, Linux Kernel Mailing List, clang-built-linux Hi Arnd, On 11/07/2019 13:28, Arnd Bergmann wrote: > On Thu, Jul 11, 2019 at 2:14 PM Vincenzo Frascino > <vincenzo.frascino@arm.com> wrote: >> >> >> Could you please tell me which version of the compiler did you use? >> >> My building command is: >> >> # make mrproper && make CC=clang HOSTCC=clang i386_defconfig && make ARCH=i386 >> CC=clang HOSTCC=clang -j56 >> > > See below for the patch I am using locally to work around this. > That patch is probably wrong, so I have not submitted it yet, but it > gives you a clean build ;-) > > Arnd > Thank you, I will give it a go :-) > 8<--- > Subject: [PATCH] x86: percpu: fix clang 32-bit build > > clang does not like an inline assembly with a "=q" contraint for > a 64-bit output: > > arch/x86/events/perf_event.h:824:21: error: invalid output size for > constraint '=q' > u64 disable_mask = __this_cpu_read(cpu_hw_events.perf_ctr_virt_mask); > ^ > include/linux/percpu-defs.h:447:2: note: expanded from macro '__this_cpu_read' > raw_cpu_read(pcp); \ > ^ > include/linux/percpu-defs.h:421:28: note: expanded from macro 'raw_cpu_read' > #define raw_cpu_read(pcp) > __pcpu_size_call_return(raw_cpu_read_, pcp) > ^ > include/linux/percpu-defs.h:322:23: note: expanded from macro > '__pcpu_size_call_return' > case 1: pscr_ret__ = stem##1(variable); break; \ > ^ > <scratch space>:357:1: note: expanded from here > raw_cpu_read_1 > ^ > arch/x86/include/asm/percpu.h:394:30: note: expanded from macro 'raw_cpu_read_1' > #define raw_cpu_read_1(pcp) percpu_from_op(, "mov", pcp) > ^ > arch/x86/include/asm/percpu.h:189:15: note: expanded from macro 'percpu_from_op' > : "=q" (pfo_ret__) \ > ^ > > According to the commit that introduced the "q" constraint, this was > needed to fix miscompilation, but it gives no further detail. > > Using the normal "=r" constraint seems to work so far. > > Fixes: 3c598766a2ba ("x86: fix percpu_{to,from}_op()") > Cc: Jan Beulich <jbeulich@suse.com> > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > > diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h > index 2278797c769d..e791fbf4018f 100644 > --- a/arch/x86/include/asm/percpu.h > +++ b/arch/x86/include/asm/percpu.h > @@ -99,7 +99,7 @@ do { \ > case 1: \ > asm qual (op "b %1,"__percpu_arg(0) \ > : "+m" (var) \ > - : "qi" ((pto_T__)(val))); \ > + : "ri" ((pto_T__)(val))); \ > break; \ > case 2: \ > asm qual (op "w %1,"__percpu_arg(0) \ > @@ -144,7 +144,7 @@ do { > \ > else \ > asm qual ("addb %1, "__percpu_arg(0) \ > : "+m" (var) \ > - : "qi" ((pao_T__)(val))); \ > + : "ri" ((pao_T__)(val))); \ > break; \ > case 2: \ > if (pao_ID__ == 1) \ > @@ -186,7 +186,7 @@ do { > \ > switch (sizeof(var)) { \ > case 1: \ > asm qual (op "b "__percpu_arg(1)",%0" \ > - : "=q" (pfo_ret__) \ > + : "=r" (pfo_ret__) \ > : "m" (var)); \ > break; \ > case 2: \ > @@ -215,7 +215,7 @@ do { > \ > switch (sizeof(var)) { \ > case 1: \ > asm(op "b "__percpu_arg(P1)",%0" \ > - : "=q" (pfo_ret__) \ > + : "=r" (pfo_ret__) \ > : "p" (&(var))); \ > break; \ > case 2: \ > -- Regards, Vincenzo ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] vsyscall: use __iter_div_u64_rem() 2019-07-11 12:28 ` Arnd Bergmann 2019-07-11 12:28 ` Arnd Bergmann 2019-07-11 13:08 ` Vincenzo Frascino @ 2019-07-11 17:14 ` Nick Desaulniers 2019-07-11 17:14 ` Nick Desaulniers 2019-07-11 20:55 ` Arnd Bergmann 2 siblings, 2 replies; 14+ messages in thread From: Nick Desaulniers @ 2019-07-11 17:14 UTC (permalink / raw) To: Arnd Bergmann Cc: Vincenzo Frascino, Andy Lutomirski, Thomas Gleixner, linux-arch, Linux ARM, linux-mips, open list:KERNEL SELFTEST FRAMEWORK, Catalin Marinas, Will Deacon, Russell King, Ralf Baechle, Paul Burton, Daniel Lezcano, Mark Salyzyn, Peter Collingbourne, Shuah Khan, Dmitry Safonov, Rasmus Villemoes, Huw On Thu, Jul 11, 2019 at 5:28 AM Arnd Bergmann <arnd@arndb.de> wrote: > clang does not like an inline assembly with a "=q" contraint for > a 64-bit output: Seems like starting in GCC 7, GCC may not like it either: https://godbolt.org/z/UyBUfh it simply warns then proceeds with code gen. Another difference may come from when GCC vs Clang perform dead code elimination (DCE) vs semantic analysis. -- Thanks, ~Nick Desaulniers ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] vsyscall: use __iter_div_u64_rem() 2019-07-11 17:14 ` Nick Desaulniers @ 2019-07-11 17:14 ` Nick Desaulniers 2019-07-11 20:55 ` Arnd Bergmann 1 sibling, 0 replies; 14+ messages in thread From: Nick Desaulniers @ 2019-07-11 17:14 UTC (permalink / raw) To: Arnd Bergmann Cc: Vincenzo Frascino, Andy Lutomirski, Thomas Gleixner, linux-arch, Linux ARM, linux-mips, open list:KERNEL SELFTEST FRAMEWORK, Catalin Marinas, Will Deacon, Russell King, Ralf Baechle, Paul Burton, Daniel Lezcano, Mark Salyzyn, Peter Collingbourne, Shuah Khan, Dmitry Safonov, Rasmus Villemoes, Huw Davies, Linux Kernel Mailing List, clang-built-linux On Thu, Jul 11, 2019 at 5:28 AM Arnd Bergmann <arnd@arndb.de> wrote: > clang does not like an inline assembly with a "=q" contraint for > a 64-bit output: Seems like starting in GCC 7, GCC may not like it either: https://godbolt.org/z/UyBUfh it simply warns then proceeds with code gen. Another difference may come from when GCC vs Clang perform dead code elimination (DCE) vs semantic analysis. -- Thanks, ~Nick Desaulniers ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] vsyscall: use __iter_div_u64_rem() 2019-07-11 17:14 ` Nick Desaulniers 2019-07-11 17:14 ` Nick Desaulniers @ 2019-07-11 20:55 ` Arnd Bergmann 2019-07-11 20:55 ` Arnd Bergmann 1 sibling, 1 reply; 14+ messages in thread From: Arnd Bergmann @ 2019-07-11 20:55 UTC (permalink / raw) To: Nick Desaulniers Cc: Vincenzo Frascino, Andy Lutomirski, Thomas Gleixner, linux-arch, Linux ARM, linux-mips, open list:KERNEL SELFTEST FRAMEWORK, Catalin Marinas, Will Deacon, Russell King, Ralf Baechle, Paul Burton, Daniel Lezcano, Mark Salyzyn, Peter Collingbourne, Shuah Khan, Dmitry Safonov, Rasmus Villemoes, Huw On Thu, Jul 11, 2019 at 7:14 PM 'Nick Desaulniers' via Clang Built Linux <clang-built-linux@googlegroups.com> wrote: > > On Thu, Jul 11, 2019 at 5:28 AM Arnd Bergmann <arnd@arndb.de> wrote: > > clang does not like an inline assembly with a "=q" contraint for > > a 64-bit output: > > Seems like starting in GCC 7, GCC may not like it either: > https://godbolt.org/z/UyBUfh > it simply warns then proceeds with code gen. Another difference may > come from when GCC vs Clang perform dead code elimination (DCE) vs > semantic analysis. Right, I also had the idea to work around it with a set of __builtin_choos_expr() instead of the switch()/case but did not complete that patch as the percpu code is rather complex and this would touch lots of code. Arnd ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] vsyscall: use __iter_div_u64_rem() 2019-07-11 20:55 ` Arnd Bergmann @ 2019-07-11 20:55 ` Arnd Bergmann 0 siblings, 0 replies; 14+ messages in thread From: Arnd Bergmann @ 2019-07-11 20:55 UTC (permalink / raw) To: Nick Desaulniers Cc: Vincenzo Frascino, Andy Lutomirski, Thomas Gleixner, linux-arch, Linux ARM, linux-mips, open list:KERNEL SELFTEST FRAMEWORK, Catalin Marinas, Will Deacon, Russell King, Ralf Baechle, Paul Burton, Daniel Lezcano, Mark Salyzyn, Peter Collingbourne, Shuah Khan, Dmitry Safonov, Rasmus Villemoes, Huw Davies, Linux Kernel Mailing List, clang-built-linux On Thu, Jul 11, 2019 at 7:14 PM 'Nick Desaulniers' via Clang Built Linux <clang-built-linux@googlegroups.com> wrote: > > On Thu, Jul 11, 2019 at 5:28 AM Arnd Bergmann <arnd@arndb.de> wrote: > > clang does not like an inline assembly with a "=q" contraint for > > a 64-bit output: > > Seems like starting in GCC 7, GCC may not like it either: > https://godbolt.org/z/UyBUfh > it simply warns then proceeds with code gen. Another difference may > come from when GCC vs Clang perform dead code elimination (DCE) vs > semantic analysis. Right, I also had the idea to work around it with a set of __builtin_choos_expr() instead of the switch()/case but did not complete that patch as the percpu code is rather complex and this would touch lots of code. Arnd ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2019-07-11 20:55 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2019-07-10 13:01 [PATCH] vsyscall: use __iter_div_u64_rem() Arnd Bergmann 2019-07-10 13:01 ` Arnd Bergmann 2019-07-10 16:56 ` Nathan Chancellor 2019-07-10 16:56 ` Nathan Chancellor 2019-07-11 12:14 ` Vincenzo Frascino 2019-07-11 12:14 ` Vincenzo Frascino 2019-07-11 12:28 ` Arnd Bergmann 2019-07-11 12:28 ` Arnd Bergmann 2019-07-11 13:08 ` Vincenzo Frascino 2019-07-11 13:08 ` Vincenzo Frascino 2019-07-11 17:14 ` Nick Desaulniers 2019-07-11 17:14 ` Nick Desaulniers 2019-07-11 20:55 ` Arnd Bergmann 2019-07-11 20:55 ` Arnd Bergmann
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox