* [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang
@ 2020-11-23 7:36 Antony Yu
2020-11-23 18:16 ` Nathan Chancellor
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: Antony Yu @ 2020-11-23 7:36 UTC (permalink / raw)
Cc: swpenim, Nick Desaulniers, Russell King, linux-kernel,
clang-built-linux, Nathan Chancellor, linux-arm-kernel
__do_div64 clobbers the input register r0 in little endian system.
According to the inline assembly document, if an input operand is
modified, it should be tied to a output operand. This patch can
prevent compilers from reusing r0 register after asm statements.
Signed-off-by: Antony Yu <swpenim@gmail.com>
---
arch/arm/include/asm/div64.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
index 898e9c78a7e7..809efc51e90f 100644
--- a/arch/arm/include/asm/div64.h
+++ b/arch/arm/include/asm/div64.h
@@ -39,9 +39,10 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
asm( __asmeq("%0", __xh)
__asmeq("%1", "r2")
__asmeq("%2", "r0")
- __asmeq("%3", "r4")
+ __asmeq("%3", "r0")
+ __asmeq("%4", "r4")
"bl __do_div64"
- : "=r" (__rem), "=r" (__res)
+ : "=r" (__rem), "=r" (__res), "=r" (__n)
: "r" (__n), "r" (__base)
: "ip", "lr", "cc");
*n = __res;
--
2.23.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 16+ messages in thread* Re: [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang 2020-11-23 7:36 [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang Antony Yu @ 2020-11-23 18:16 ` Nathan Chancellor [not found] ` <20201124074211.GA26157@penyung-VirtualBox> 2020-11-24 23:16 ` [RESEND,PATCH] " kernel test robot 2020-11-30 10:11 ` [RESEND, PATCH] " Ard Biesheuvel 2 siblings, 1 reply; 16+ messages in thread From: Nathan Chancellor @ 2020-11-23 18:16 UTC (permalink / raw) To: Antony Yu Cc: clang-built-linux, Nick Desaulniers, Russell King, linux-arm-kernel, linux-kernel On Mon, Nov 23, 2020 at 03:36:32PM +0800, Antony Yu wrote: > __do_div64 clobbers the input register r0 in little endian system. > According to the inline assembly document, if an input operand is > modified, it should be tied to a output operand. This patch can > prevent compilers from reusing r0 register after asm statements. > > Signed-off-by: Antony Yu <swpenim@gmail.com> > --- > arch/arm/include/asm/div64.h | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h > index 898e9c78a7e7..809efc51e90f 100644 > --- a/arch/arm/include/asm/div64.h > +++ b/arch/arm/include/asm/div64.h > @@ -39,9 +39,10 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base) > asm( __asmeq("%0", __xh) > __asmeq("%1", "r2") > __asmeq("%2", "r0") > - __asmeq("%3", "r4") > + __asmeq("%3", "r0") > + __asmeq("%4", "r4") > "bl __do_div64" > - : "=r" (__rem), "=r" (__res) > + : "=r" (__rem), "=r" (__res), "=r" (__n) > : "r" (__n), "r" (__base) > : "ip", "lr", "cc"); > *n = __res; > -- > 2.23.0 > I am not sure that I am qualified to review this (my assembly knowledge is not the best) but your commit title mentions an error when compiling with clang. What is the exact error, what configuration generates it, and what version of clang? We have done fairly decent testing for 32-bit ARM, I would like to know what we are missing. Cheers, Nathan _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 16+ messages in thread
[parent not found: <20201124074211.GA26157@penyung-VirtualBox>]
* Re: [RESEND, PATCH] ARM: fix __div64_32() error when compiling with clang [not found] ` <20201124074211.GA26157@penyung-VirtualBox> @ 2020-11-24 10:14 ` Antony Yu 2020-11-24 21:06 ` Nick Desaulniers 0 siblings, 1 reply; 16+ messages in thread From: Antony Yu @ 2020-11-24 10:14 UTC (permalink / raw) To: Nathan Chancellor Cc: Pen-Yung Yu, Nick Desaulniers, Russell King, linux-kernel, clang-built-linux, linux-arm-kernel Antony Yu <swpenim@gmail.com> 於 2020年11月24日 週二 下午3:43寫道: > > On Mon, Nov 23, 2020 at 11:16:02AM -0700, Nathan Chancellor wrote: > > On Mon, Nov 23, 2020 at 03:36:32PM +0800, Antony Yu wrote: > > > __do_div64 clobbers the input register r0 in little endian system. > > > According to the inline assembly document, if an input operand is > > > modified, it should be tied to a output operand. This patch can > > > prevent compilers from reusing r0 register after asm statements. > > > > > > Signed-off-by: Antony Yu <swpenim@gmail.com> > > > --- > > > arch/arm/include/asm/div64.h | 5 +++-- > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h > > > index 898e9c78a7e7..809efc51e90f 100644 > > > --- a/arch/arm/include/asm/div64.h > > > +++ b/arch/arm/include/asm/div64.h > > > @@ -39,9 +39,10 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base) > > > asm( __asmeq("%0", __xh) > > > __asmeq("%1", "r2") > > > __asmeq("%2", "r0") > > > - __asmeq("%3", "r4") > > > + __asmeq("%3", "r0") > > > + __asmeq("%4", "r4") > > > "bl __do_div64" > > > - : "=r" (__rem), "=r" (__res) > > > + : "=r" (__rem), "=r" (__res), "=r" (__n) > > > : "r" (__n), "r" (__base) > > > : "ip", "lr", "cc"); > > > *n = __res; > > > -- > > > 2.23.0 > > > > > > > I am not sure that I am qualified to review this (my assembly knowledge > > is not the best) but your commit title mentions an error when compiling > > with clang. What is the exact error, what configuration generates it, > > and what version of clang? We have done fairly decent testing for > > 32-bit ARM, I would like to know what we are missing. > > > > Cheers, > > Nathan > > We have run fail on android R vts vts_libsnapshot_test with kernel 4.14. > This bug is triggered accidently by a workaround patch in our code base. > It is fine on a pure clean 4.14 branch since __do_div64 may not be > executed in skip_metadata. > > The attachment are .i and generated .s file. .s file can be reproduced > with clang -target arm-linux-eabi -march=armv8.2-a -O2. > > In function skip_metadata, it loads some value to r0, calls __do_div64, > adds 1 to r0 and stores it to [r5]. It gets wrong value since __do_div64 > clobbers r0 register. > > We have tried clang-10, clang-11 and android prebuilt clang-r383902b. All > of them have the same problem. Sorry for the large attachment. I put .i and .s files on https://gist.github.com/penyung/274b0c697062a1c776994bb40243cfff Antony Yu _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RESEND, PATCH] ARM: fix __div64_32() error when compiling with clang 2020-11-24 10:14 ` [RESEND, PATCH] " Antony Yu @ 2020-11-24 21:06 ` Nick Desaulniers 2020-11-30 8:20 ` [PATCH v2] " Antony Yu 0 siblings, 1 reply; 16+ messages in thread From: Nick Desaulniers @ 2020-11-24 21:06 UTC (permalink / raw) To: Antony Yu Cc: clang-built-linux, Nathan Chancellor, Russell King, Linux ARM, LKML Thanks for the report, it probably was not fun to debug. I'll take a closer look at this after the Thanksgiving holiday. On Tue, Nov 24, 2020 at 2:14 AM Antony Yu <swpenim@gmail.com> wrote: > > Antony Yu <swpenim@gmail.com> 於 2020年11月24日 週二 下午3:43寫道: > > > > On Mon, Nov 23, 2020 at 11:16:02AM -0700, Nathan Chancellor wrote: > > > On Mon, Nov 23, 2020 at 03:36:32PM +0800, Antony Yu wrote: > > > > __do_div64 clobbers the input register r0 in little endian system. > > > > According to the inline assembly document, if an input operand is > > > > modified, it should be tied to a output operand. This patch can > > > > prevent compilers from reusing r0 register after asm statements. > > > > > > > > Signed-off-by: Antony Yu <swpenim@gmail.com> > > > > --- > > > > arch/arm/include/asm/div64.h | 5 +++-- > > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h > > > > index 898e9c78a7e7..809efc51e90f 100644 > > > > --- a/arch/arm/include/asm/div64.h > > > > +++ b/arch/arm/include/asm/div64.h > > > > @@ -39,9 +39,10 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base) > > > > asm( __asmeq("%0", __xh) > > > > __asmeq("%1", "r2") > > > > __asmeq("%2", "r0") > > > > - __asmeq("%3", "r4") > > > > + __asmeq("%3", "r0") > > > > + __asmeq("%4", "r4") > > > > "bl __do_div64" > > > > - : "=r" (__rem), "=r" (__res) > > > > + : "=r" (__rem), "=r" (__res), "=r" (__n) > > > > : "r" (__n), "r" (__base) > > > > : "ip", "lr", "cc"); > > > > *n = __res; > > > > -- > > > > 2.23.0 > > > > > > > > > > I am not sure that I am qualified to review this (my assembly knowledge > > > is not the best) but your commit title mentions an error when compiling > > > with clang. What is the exact error, what configuration generates it, > > > and what version of clang? We have done fairly decent testing for > > > 32-bit ARM, I would like to know what we are missing. > > > > > > Cheers, > > > Nathan > > > > We have run fail on android R vts vts_libsnapshot_test with kernel 4.14. > > This bug is triggered accidently by a workaround patch in our code base. > > It is fine on a pure clean 4.14 branch since __do_div64 may not be > > executed in skip_metadata. > > > > The attachment are .i and generated .s file. .s file can be reproduced > > with clang -target arm-linux-eabi -march=armv8.2-a -O2. > > > > In function skip_metadata, it loads some value to r0, calls __do_div64, > > adds 1 to r0 and stores it to [r5]. It gets wrong value since __do_div64 > > clobbers r0 register. > > > > We have tried clang-10, clang-11 and android prebuilt clang-r383902b. All > > of them have the same problem. > > Sorry for the large attachment. > I put .i and .s files on > https://gist.github.com/penyung/274b0c697062a1c776994bb40243cfff > > Antony Yu -- Thanks, ~Nick Desaulniers _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v2] ARM: fix __div64_32() error when compiling with clang 2020-11-24 21:06 ` Nick Desaulniers @ 2020-11-30 8:20 ` Antony Yu 0 siblings, 0 replies; 16+ messages in thread From: Antony Yu @ 2020-11-30 8:20 UTC (permalink / raw) To: Nick Desaulniers Cc: swpenim, Russell King, LKML, clang-built-linux, Nathan Chancellor, Linux ARM [-- Attachment #1: v2-0001-ARM-fix-__div64_32-error-when-compiling-with-clan.patch --] [-- Type: text/x-diff, Size: 1327 bytes --] From 3a4c19e09079324a625941ea3c16fe4b0df2ed86 Mon Sep 17 00:00:00 2001 From: Antony Yu <swpenim@gmail.com> Date: Mon, 9 Nov 2020 17:31:52 +0800 Subject: [PATCH v2] ARM: fix __div64_32() error when compiling with clang __do_div64 clobbers the input register r0 in little endian system. According to the inline assembly document, if an input operand is modified, it should be tied to a output operand. This patch can prevent compilers from reusing r0 register after asm statements. Signed-off-by: Antony Yu <swpenim@gmail.com> Reported-by: kernel test robot <lkp@intel.com> --- changed in v2 - add kernel-test-robot tag - fix build failure on big endian arch/arm/include/asm/div64.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h index 898e9c78a7e7..961a129eb41f 100644 --- a/arch/arm/include/asm/div64.h +++ b/arch/arm/include/asm/div64.h @@ -39,9 +39,10 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base) asm( __asmeq("%0", __xh) __asmeq("%1", "r2") __asmeq("%2", "r0") - __asmeq("%3", "r4") + __asmeq("%3", "r0") + __asmeq("%4", "r4") "bl __do_div64" - : "=r" (__rem), "=r" (__res) + : "=r" (__rem), "=r" (__res), "=r" (__xl) : "r" (__n), "r" (__base) : "ip", "lr", "cc"); *n = __res; -- 2.29.0 [-- Attachment #2: Type: text/plain, Size: 176 bytes --] _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang 2020-11-23 7:36 [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang Antony Yu 2020-11-23 18:16 ` Nathan Chancellor @ 2020-11-24 23:16 ` kernel test robot 2020-11-30 10:11 ` [RESEND, PATCH] " Ard Biesheuvel 2 siblings, 0 replies; 16+ messages in thread From: kernel test robot @ 2020-11-24 23:16 UTC (permalink / raw) To: Antony Yu Cc: kbuild-all, swpenim, Nick Desaulniers, Russell King, linux-kernel, clang-built-linux, Nathan Chancellor, linux-arm-kernel [-- Attachment #1: Type: text/plain, Size: 5215 bytes --] Hi Antony, Thank you for the patch! Yet something to improve: [auto build test ERROR on linus/master] [also build test ERROR on v5.10-rc5 next-20201124] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Antony-Yu/ARM-fix-__div64_32-error-when-compiling-with-clang/20201123-154454 base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 418baf2c28f3473039f2f7377760bd8f6897ae18 config: arm-randconfig-r036-20201124 (attached as .config) compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project df9ae5992889560a8f3c6760b54d5051b47c7bf5) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install arm cross compiling tool for clang build # apt-get install binutils-arm-linux-gnueabi # https://github.com/0day-ci/linux/commit/83df1f983ed4219556020bf30cc819e66bb7e69c git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Antony-Yu/ARM-fix-__div64_32-error-when-compiling-with-clang/20201123-154454 git checkout 83df1f983ed4219556020bf30cc819e66bb7e69c # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): drivers/tty/serial/imx.c:359:19: warning: unused function 'imx_uart_is_imx21' [-Wunused-function] static inline int imx_uart_is_imx21(struct imx_port *sport) ^ drivers/tty/serial/imx.c:364:19: warning: unused function 'imx_uart_is_imx53' [-Wunused-function] static inline int imx_uart_is_imx53(struct imx_port *sport) ^ drivers/tty/serial/imx.c:369:19: warning: unused function 'imx_uart_is_imx6q' [-Wunused-function] static inline int imx_uart_is_imx6q(struct imx_port *sport) ^ In file included from drivers/tty/serial/imx.c:11: In file included from include/linux/module.h:12: In file included from include/linux/list.h:9: In file included from include/linux/kernel.h:19: >> arch/arm/include/asm/div64.h:39:2: error: multiple outputs to hard register: r0 asm( __asmeq("%0", __xh) ^ 3 warnings and 1 error generated. -- In file included from drivers/gpu/drm/tegra/dc.c:7: In file included from include/linux/clk.h:13: In file included from include/linux/kernel.h:19: >> arch/arm/include/asm/div64.h:39:2: error: multiple outputs to hard register: r0 asm( __asmeq("%0", __xh) ^ 1 error generated. vim +39 arch/arm/include/asm/div64.h ^1da177e4c3f415 include/asm-arm/div64.h Linus Torvalds 2005-04-16 32 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 33 static inline uint32_t __div64_32(uint64_t *n, uint32_t base) 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 34 { 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 35 register unsigned int __base asm("r4") = base; 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 36 register unsigned long long __n asm("r0") = *n; 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 37 register unsigned long long __res asm("r2"); 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 38 register unsigned int __rem asm(__xh); 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 @39 asm( __asmeq("%0", __xh) 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 40 __asmeq("%1", "r2") 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 41 __asmeq("%2", "r0") 83df1f983ed4219 arch/arm/include/asm/div64.h Antony Yu 2020-11-23 42 __asmeq("%3", "r0") 83df1f983ed4219 arch/arm/include/asm/div64.h Antony Yu 2020-11-23 43 __asmeq("%4", "r4") 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 44 "bl __do_div64" 83df1f983ed4219 arch/arm/include/asm/div64.h Antony Yu 2020-11-23 45 : "=r" (__rem), "=r" (__res), "=r" (__n) 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 46 : "r" (__n), "r" (__base) 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 47 : "ip", "lr", "cc"); 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 48 *n = __res; 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 49 return __rem; 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 50 } 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 51 #define __div64_32 __div64_32 040b323b5012b55 arch/arm/include/asm/div64.h Nicolas Pitre 2015-11-02 52 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 33586 bytes --] [-- Attachment #3: Type: text/plain, Size: 176 bytes --] _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RESEND, PATCH] ARM: fix __div64_32() error when compiling with clang 2020-11-23 7:36 [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang Antony Yu 2020-11-23 18:16 ` Nathan Chancellor 2020-11-24 23:16 ` [RESEND,PATCH] " kernel test robot @ 2020-11-30 10:11 ` Ard Biesheuvel 2020-11-30 10:12 ` Ard Biesheuvel 2 siblings, 1 reply; 16+ messages in thread From: Ard Biesheuvel @ 2020-11-30 10:11 UTC (permalink / raw) To: Antony Yu Cc: Nick Desaulniers, Russell King, Linux Kernel Mailing List, clang-built-linux, Nathan Chancellor, Linux ARM On Mon, 23 Nov 2020 at 08:39, Antony Yu <swpenim@gmail.com> wrote: > > __do_div64 clobbers the input register r0 in little endian system. > According to the inline assembly document, if an input operand is > modified, it should be tied to a output operand. This patch can > prevent compilers from reusing r0 register after asm statements. > > Signed-off-by: Antony Yu <swpenim@gmail.com> > --- > arch/arm/include/asm/div64.h | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h > index 898e9c78a7e7..809efc51e90f 100644 > --- a/arch/arm/include/asm/div64.h > +++ b/arch/arm/include/asm/div64.h > @@ -39,9 +39,10 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base) > asm( __asmeq("%0", __xh) > __asmeq("%1", "r2") > __asmeq("%2", "r0") > - __asmeq("%3", "r4") > + __asmeq("%3", "r0") > + __asmeq("%4", "r4") > "bl __do_div64" > - : "=r" (__rem), "=r" (__res) > + : "=r" (__rem), "=r" (__res), "=r" (__n) > : "r" (__n), "r" (__base) > : "ip", "lr", "cc"); > *n = __res; > -- > 2.23.0 > Agree that using r0 as an input operand only is incorrect, and not only on Clang. The compiler might assume that r0 will retain its value across the asm() block, which is obviously not the case. However, your patch will likely break big-endian, since in that case, __xh == r0, and so it will appear twice. Perhaps it would be better to change the type of __rem to unsigned long long as well? _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RESEND, PATCH] ARM: fix __div64_32() error when compiling with clang 2020-11-30 10:11 ` [RESEND, PATCH] " Ard Biesheuvel @ 2020-11-30 10:12 ` Ard Biesheuvel 2020-11-30 10:21 ` [RESEND,PATCH] " Russell King - ARM Linux admin 2020-11-30 15:50 ` Nicolas Pitre 0 siblings, 2 replies; 16+ messages in thread From: Ard Biesheuvel @ 2020-11-30 10:12 UTC (permalink / raw) To: Antony Yu, Nicolas Pitre Cc: Nick Desaulniers, Russell King, Linux Kernel Mailing List, clang-built-linux, Nathan Chancellor, Linux ARM (+ Nico) On Mon, 30 Nov 2020 at 11:11, Ard Biesheuvel <ardb@kernel.org> wrote: > > On Mon, 23 Nov 2020 at 08:39, Antony Yu <swpenim@gmail.com> wrote: > > > > __do_div64 clobbers the input register r0 in little endian system. > > According to the inline assembly document, if an input operand is > > modified, it should be tied to a output operand. This patch can > > prevent compilers from reusing r0 register after asm statements. > > > > Signed-off-by: Antony Yu <swpenim@gmail.com> > > --- > > arch/arm/include/asm/div64.h | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h > > index 898e9c78a7e7..809efc51e90f 100644 > > --- a/arch/arm/include/asm/div64.h > > +++ b/arch/arm/include/asm/div64.h > > @@ -39,9 +39,10 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base) > > asm( __asmeq("%0", __xh) > > __asmeq("%1", "r2") > > __asmeq("%2", "r0") > > - __asmeq("%3", "r4") > > + __asmeq("%3", "r0") > > + __asmeq("%4", "r4") > > "bl __do_div64" > > - : "=r" (__rem), "=r" (__res) > > + : "=r" (__rem), "=r" (__res), "=r" (__n) > > : "r" (__n), "r" (__base) > > : "ip", "lr", "cc"); > > *n = __res; > > -- > > 2.23.0 > > > > Agree that using r0 as an input operand only is incorrect, and not > only on Clang. The compiler might assume that r0 will retain its value > across the asm() block, which is obviously not the case. > > However, your patch will likely break big-endian, since in that case, > __xh == r0, and so it will appear twice. > > Perhaps it would be better to change the type of __rem to unsigned > long long as well? _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang 2020-11-30 10:12 ` Ard Biesheuvel @ 2020-11-30 10:21 ` Russell King - ARM Linux admin 2020-11-30 10:40 ` [RESEND, PATCH] " Ard Biesheuvel 2020-11-30 15:50 ` Nicolas Pitre 1 sibling, 1 reply; 16+ messages in thread From: Russell King - ARM Linux admin @ 2020-11-30 10:21 UTC (permalink / raw) To: Ard Biesheuvel Cc: Nicolas Pitre, Antony Yu, Nick Desaulniers, Linux Kernel Mailing List, clang-built-linux, Nathan Chancellor, Linux ARM On Mon, Nov 30, 2020 at 11:12:33AM +0100, Ard Biesheuvel wrote: > (+ Nico) > > On Mon, 30 Nov 2020 at 11:11, Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Mon, 23 Nov 2020 at 08:39, Antony Yu <swpenim@gmail.com> wrote: > > > > > > __do_div64 clobbers the input register r0 in little endian system. > > > According to the inline assembly document, if an input operand is > > > modified, it should be tied to a output operand. This patch can > > > prevent compilers from reusing r0 register after asm statements. > > > > > > Signed-off-by: Antony Yu <swpenim@gmail.com> > > > --- > > > arch/arm/include/asm/div64.h | 5 +++-- > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h > > > index 898e9c78a7e7..809efc51e90f 100644 > > > --- a/arch/arm/include/asm/div64.h > > > +++ b/arch/arm/include/asm/div64.h > > > @@ -39,9 +39,10 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base) > > > asm( __asmeq("%0", __xh) > > > __asmeq("%1", "r2") > > > __asmeq("%2", "r0") > > > - __asmeq("%3", "r4") > > > + __asmeq("%3", "r0") > > > + __asmeq("%4", "r4") > > > "bl __do_div64" > > > - : "=r" (__rem), "=r" (__res) > > > + : "=r" (__rem), "=r" (__res), "=r" (__n) > > > : "r" (__n), "r" (__base) > > > : "ip", "lr", "cc"); > > > *n = __res; > > > -- > > > 2.23.0 > > > > > > > Agree that using r0 as an input operand only is incorrect, and not > > only on Clang. The compiler might assume that r0 will retain its value > > across the asm() block, which is obviously not the case. However, you can _not_ have an asm block that names two outputs using the same physical register - that's why both the original patch and the posted v2 will fail. You also can't mark r0 as clobbered because it's used as an operand and that is not allowed by gcc. The fact is, we have two register variables occupying the same register, which are __n and __rem. It doesn't matter which endian-ness __rem is, r0 will be used for both __n (input) and __rem (output). If the compiler can't work out that if a physical register used as an output operand will be written by the assembler, then the compiler is quite simply buggy. The code is correct as it stands; Clang is buggy. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RESEND, PATCH] ARM: fix __div64_32() error when compiling with clang 2020-11-30 10:21 ` [RESEND,PATCH] " Russell King - ARM Linux admin @ 2020-11-30 10:40 ` Ard Biesheuvel 2020-11-30 13:58 ` [RESEND,PATCH] " David Laight 0 siblings, 1 reply; 16+ messages in thread From: Ard Biesheuvel @ 2020-11-30 10:40 UTC (permalink / raw) To: Russell King - ARM Linux admin Cc: Nicolas Pitre, Antony Yu, Nick Desaulniers, Linux Kernel Mailing List, clang-built-linux, Nathan Chancellor, Linux ARM On Mon, 30 Nov 2020 at 11:21, Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote: > > On Mon, Nov 30, 2020 at 11:12:33AM +0100, Ard Biesheuvel wrote: > > (+ Nico) > > > > On Mon, 30 Nov 2020 at 11:11, Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Mon, 23 Nov 2020 at 08:39, Antony Yu <swpenim@gmail.com> wrote: > > > > > > > > __do_div64 clobbers the input register r0 in little endian system. > > > > According to the inline assembly document, if an input operand is > > > > modified, it should be tied to a output operand. This patch can > > > > prevent compilers from reusing r0 register after asm statements. > > > > > > > > Signed-off-by: Antony Yu <swpenim@gmail.com> > > > > --- > > > > arch/arm/include/asm/div64.h | 5 +++-- > > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h > > > > index 898e9c78a7e7..809efc51e90f 100644 > > > > --- a/arch/arm/include/asm/div64.h > > > > +++ b/arch/arm/include/asm/div64.h > > > > @@ -39,9 +39,10 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base) > > > > asm( __asmeq("%0", __xh) > > > > __asmeq("%1", "r2") > > > > __asmeq("%2", "r0") > > > > - __asmeq("%3", "r4") > > > > + __asmeq("%3", "r0") > > > > + __asmeq("%4", "r4") > > > > "bl __do_div64" > > > > - : "=r" (__rem), "=r" (__res) > > > > + : "=r" (__rem), "=r" (__res), "=r" (__n) > > > > : "r" (__n), "r" (__base) > > > > : "ip", "lr", "cc"); > > > > *n = __res; > > > > -- > > > > 2.23.0 > > > > > > > > > > Agree that using r0 as an input operand only is incorrect, and not > > > only on Clang. The compiler might assume that r0 will retain its value > > > across the asm() block, which is obviously not the case. > > However, you can _not_ have an asm block that names two outputs using > the same physical register - that's why both the original patch and > the posted v2 will fail. > > You also can't mark r0 as clobbered because it's used as an operand > and that is not allowed by gcc. > > The fact is, we have two register variables occupying the same register, > which are __n and __rem. It doesn't matter which endian-ness __rem is, > r0 will be used for both __n (input) and __rem (output). > __rem is a 32-bit variable, so in LE mode, only r1 is used for __rem, not r0. So r0/r1 are used as an input operand pair, and r1 is used as an output operand. So I don't think the compiler has to be buggy in order for it to assume that r0 will still contain the low word of the dividend afterwards. And actually, the same applies on BE, but the other way around. So we should mark __xl as an output register as well, as __xl will assume the right value depending on the endianness. I suggest something like the below, diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h index 898e9c78a7e7..85ff9109595e 100644 --- a/arch/arm/include/asm/div64.h +++ b/arch/arm/include/asm/div64.h @@ -36,12 +36,14 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base) register unsigned long long __n asm("r0") = *n; register unsigned long long __res asm("r2"); register unsigned int __rem asm(__xh); + register unsigned int __dummy asm(__xl); asm( __asmeq("%0", __xh) __asmeq("%1", "r2") - __asmeq("%2", "r0") - __asmeq("%3", "r4") + __asmeq("%2", __xl) + __asmeq("%3", "r0") + __asmeq("%4", "r4") "bl __do_div64" - : "=r" (__rem), "=r" (__res) + : "=r" (__rem), "=r" (__res), "=r"(__dummy) : "r" (__n), "r" (__base) : "ip", "lr", "cc"); *n = __res; > If the compiler can't work out that if a physical register used as an > output operand will be written by the assembler, then the compiler is > quite simply buggy. > > The code is correct as it stands; Clang is buggy. > > -- > RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ > FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 16+ messages in thread
* RE: [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang 2020-11-30 10:40 ` [RESEND, PATCH] " Ard Biesheuvel @ 2020-11-30 13:58 ` David Laight 2020-11-30 14:18 ` Russell King - ARM Linux admin 0 siblings, 1 reply; 16+ messages in thread From: David Laight @ 2020-11-30 13:58 UTC (permalink / raw) To: 'Ard Biesheuvel', Russell King - ARM Linux admin Cc: Nicolas Pitre, Antony Yu, Nick Desaulniers, Linux Kernel Mailing List, clang-built-linux, Nathan Chancellor, Linux ARM > And actually, the same applies on BE, but the other way around. So we > should mark __xl as an output register as well, as __xl will assume > the right value depending on the endianness. Why not use "+r" to indicate than an 'output' parameter is also used as an input. Rather cleaner than specifying the same C variable as both input and output. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang 2020-11-30 13:58 ` [RESEND,PATCH] " David Laight @ 2020-11-30 14:18 ` Russell King - ARM Linux admin 0 siblings, 0 replies; 16+ messages in thread From: Russell King - ARM Linux admin @ 2020-11-30 14:18 UTC (permalink / raw) To: David Laight Cc: Nicolas Pitre, Antony Yu, Nick Desaulniers, Linux Kernel Mailing List, clang-built-linux, Nathan Chancellor, 'Ard Biesheuvel', Linux ARM On Mon, Nov 30, 2020 at 01:58:27PM +0000, David Laight wrote: > > And actually, the same applies on BE, but the other way around. So we > > should mark __xl as an output register as well, as __xl will assume > > the right value depending on the endianness. > > Why not use "+r" to indicate than an 'output' parameter is also > used as an input. > > Rather cleaner than specifying the same C variable as both > input and output. You have an incorrect understanding. "__n" is the input operand in r0. "__rem" is the output operand in r0/r1. No single C variable is used as both an input and an output. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang 2020-11-30 10:12 ` Ard Biesheuvel 2020-11-30 10:21 ` [RESEND,PATCH] " Russell King - ARM Linux admin @ 2020-11-30 15:50 ` Nicolas Pitre 2020-11-30 17:18 ` [RESEND, PATCH] " Ard Biesheuvel 1 sibling, 1 reply; 16+ messages in thread From: Nicolas Pitre @ 2020-11-30 15:50 UTC (permalink / raw) To: Ard Biesheuvel Cc: Antony Yu, Nick Desaulniers, Russell King, Linux Kernel Mailing List, clang-built-linux, Nathan Chancellor, Linux ARM On Mon, 30 Nov 2020, Ard Biesheuvel wrote: > (+ Nico) > > On Mon, 30 Nov 2020 at 11:11, Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Mon, 23 Nov 2020 at 08:39, Antony Yu <swpenim@gmail.com> wrote: > > > > > > __do_div64 clobbers the input register r0 in little endian system. > > > According to the inline assembly document, if an input operand is > > > modified, it should be tied to a output operand. This patch can > > > prevent compilers from reusing r0 register after asm statements. > > > > > > Signed-off-by: Antony Yu <swpenim@gmail.com> > > > --- > > > arch/arm/include/asm/div64.h | 5 +++-- > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h > > > index 898e9c78a7e7..809efc51e90f 100644 > > > --- a/arch/arm/include/asm/div64.h > > > +++ b/arch/arm/include/asm/div64.h > > > @@ -39,9 +39,10 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base) > > > asm( __asmeq("%0", __xh) > > > __asmeq("%1", "r2") > > > __asmeq("%2", "r0") > > > - __asmeq("%3", "r4") > > > + __asmeq("%3", "r0") > > > + __asmeq("%4", "r4") > > > "bl __do_div64" > > > - : "=r" (__rem), "=r" (__res) > > > + : "=r" (__rem), "=r" (__res), "=r" (__n) > > > : "r" (__n), "r" (__base) > > > : "ip", "lr", "cc"); > > > *n = __res; > > > -- > > > 2.23.0 > > > > > > > Agree that using r0 as an input operand only is incorrect, and not > > only on Clang. The compiler might assume that r0 will retain its value > > across the asm() block, which is obviously not the case. You're right. This was done like that most likely to work around some stupid code generation with "__n >> 32" while using gcc from about 20 years ago. IOW I don't exactly remember why I did it like that, but it is certainly flawed. Here's my version of the fix which should be correct. Warning: this is completely untested, but should in theory produce the same code on modern gcc. diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h index 898e9c78a7..595e538f5b 100644 --- a/arch/arm/include/asm/div64.h +++ b/arch/arm/include/asm/div64.h @@ -21,29 +21,20 @@ * assembly implementation with completely non standard calling convention * for arguments and results (beware). */ - -#ifdef __ARMEB__ -#define __xh "r0" -#define __xl "r1" -#else -#define __xl "r0" -#define __xh "r1" -#endif - static inline uint32_t __div64_32(uint64_t *n, uint32_t base) { register unsigned int __base asm("r4") = base; register unsigned long long __n asm("r0") = *n; register unsigned long long __res asm("r2"); - register unsigned int __rem asm(__xh); - asm( __asmeq("%0", __xh) + unsigned int __rem; + asm( __asmeq("%0", "r0") __asmeq("%1", "r2") - __asmeq("%2", "r0") - __asmeq("%3", "r4") + __asmeq("%2", "r4") "bl __do_div64" - : "=r" (__rem), "=r" (__res) - : "r" (__n), "r" (__base) + : "+r" (__n), "=r" (__res) + : "r" (__base) : "ip", "lr", "cc"); + __rem = __n >> 32; *n = __res; return __rem; } I'll submit it if someone confirms it works. Nicolas _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [RESEND, PATCH] ARM: fix __div64_32() error when compiling with clang 2020-11-30 15:50 ` Nicolas Pitre @ 2020-11-30 17:18 ` Ard Biesheuvel 2020-11-30 17:52 ` [RESEND,PATCH] " Nicolas Pitre 0 siblings, 1 reply; 16+ messages in thread From: Ard Biesheuvel @ 2020-11-30 17:18 UTC (permalink / raw) To: Nicolas Pitre Cc: Antony Yu, Nick Desaulniers, Russell King, Linux Kernel Mailing List, clang-built-linux, Nathan Chancellor, Linux ARM On Mon, 30 Nov 2020 at 16:51, Nicolas Pitre <nico@fluxnic.net> wrote: > > On Mon, 30 Nov 2020, Ard Biesheuvel wrote: > > > (+ Nico) > > > > On Mon, 30 Nov 2020 at 11:11, Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Mon, 23 Nov 2020 at 08:39, Antony Yu <swpenim@gmail.com> wrote: > > > > > > > > __do_div64 clobbers the input register r0 in little endian system. > > > > According to the inline assembly document, if an input operand is > > > > modified, it should be tied to a output operand. This patch can > > > > prevent compilers from reusing r0 register after asm statements. > > > > > > > > Signed-off-by: Antony Yu <swpenim@gmail.com> > > > > --- > > > > arch/arm/include/asm/div64.h | 5 +++-- > > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h > > > > index 898e9c78a7e7..809efc51e90f 100644 > > > > --- a/arch/arm/include/asm/div64.h > > > > +++ b/arch/arm/include/asm/div64.h > > > > @@ -39,9 +39,10 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base) > > > > asm( __asmeq("%0", __xh) > > > > __asmeq("%1", "r2") > > > > __asmeq("%2", "r0") > > > > - __asmeq("%3", "r4") > > > > + __asmeq("%3", "r0") > > > > + __asmeq("%4", "r4") > > > > "bl __do_div64" > > > > - : "=r" (__rem), "=r" (__res) > > > > + : "=r" (__rem), "=r" (__res), "=r" (__n) > > > > : "r" (__n), "r" (__base) > > > > : "ip", "lr", "cc"); > > > > *n = __res; > > > > -- > > > > 2.23.0 > > > > > > > > > > Agree that using r0 as an input operand only is incorrect, and not > > > only on Clang. The compiler might assume that r0 will retain its value > > > across the asm() block, which is obviously not the case. > > You're right. > > This was done like that most likely to work around some stupid code > generation with "__n >> 32" while using gcc from about 20 years ago. IOW > I don't exactly remember why I did it like that, but it is certainly > flawed. > > Here's my version of the fix which should be correct. Warning: this > is completely untested, but should in theory produce the same code on > modern gcc. > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h > index 898e9c78a7..595e538f5b 100644 > --- a/arch/arm/include/asm/div64.h > +++ b/arch/arm/include/asm/div64.h > @@ -21,29 +21,20 @@ > * assembly implementation with completely non standard calling convention > * for arguments and results (beware). > */ > - > -#ifdef __ARMEB__ > -#define __xh "r0" > -#define __xl "r1" > -#else > -#define __xl "r0" > -#define __xh "r1" > -#endif > - > static inline uint32_t __div64_32(uint64_t *n, uint32_t base) > { > register unsigned int __base asm("r4") = base; > register unsigned long long __n asm("r0") = *n; > register unsigned long long __res asm("r2"); > - register unsigned int __rem asm(__xh); > - asm( __asmeq("%0", __xh) > + unsigned int __rem; > + asm( __asmeq("%0", "r0") > __asmeq("%1", "r2") > - __asmeq("%2", "r0") > - __asmeq("%3", "r4") > + __asmeq("%2", "r4") > "bl __do_div64" > - : "=r" (__rem), "=r" (__res) > - : "r" (__n), "r" (__base) > + : "+r" (__n), "=r" (__res) > + : "r" (__base) > : "ip", "lr", "cc"); > + __rem = __n >> 32; This treats {r0, r1} as a {low, high} pair, regardless of endianness, and so it puts the value of r0 into r1. Doesn't that mean the shift should only be done on little endian? > *n = __res; > return __rem; > } > > I'll submit it if someone confirms it works. > > > Nicolas > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang 2020-11-30 17:18 ` [RESEND, PATCH] " Ard Biesheuvel @ 2020-11-30 17:52 ` Nicolas Pitre 2020-11-30 18:08 ` [RESEND, PATCH] " Ard Biesheuvel 0 siblings, 1 reply; 16+ messages in thread From: Nicolas Pitre @ 2020-11-30 17:52 UTC (permalink / raw) To: Ard Biesheuvel Cc: Antony Yu, Nick Desaulniers, Russell King, Linux Kernel Mailing List, clang-built-linux, Nathan Chancellor, Linux ARM On Mon, 30 Nov 2020, Ard Biesheuvel wrote: > On Mon, 30 Nov 2020 at 16:51, Nicolas Pitre <nico@fluxnic.net> wrote: > > > Here's my version of the fix which should be correct. Warning: this > > is completely untested, but should in theory produce the same code on > > modern gcc. > > > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h > > index 898e9c78a7..595e538f5b 100644 > > --- a/arch/arm/include/asm/div64.h > > +++ b/arch/arm/include/asm/div64.h > > @@ -21,29 +21,20 @@ > > * assembly implementation with completely non standard calling convention > > * for arguments and results (beware). > > */ > > - > > -#ifdef __ARMEB__ > > -#define __xh "r0" > > -#define __xl "r1" > > -#else > > -#define __xl "r0" > > -#define __xh "r1" > > -#endif > > - > > static inline uint32_t __div64_32(uint64_t *n, uint32_t base) > > { > > register unsigned int __base asm("r4") = base; > > register unsigned long long __n asm("r0") = *n; > > register unsigned long long __res asm("r2"); > > - register unsigned int __rem asm(__xh); > > - asm( __asmeq("%0", __xh) > > + unsigned int __rem; > > + asm( __asmeq("%0", "r0") > > __asmeq("%1", "r2") > > - __asmeq("%2", "r0") > > - __asmeq("%3", "r4") > > + __asmeq("%2", "r4") > > "bl __do_div64" > > - : "=r" (__rem), "=r" (__res) > > - : "r" (__n), "r" (__base) > > + : "+r" (__n), "=r" (__res) > > + : "r" (__base) > > : "ip", "lr", "cc"); > > + __rem = __n >> 32; > > This treats {r0, r1} as a {low, high} pair, regardless of endianness, > and so it puts the value of r0 into r1. Doesn't that mean the shift > should only be done on little endian? Not quite. r0-r1 = low-high is for little endian. Then "__n >> 32" is actually translated into "mov r0, r1" to move it into __rem and returned through r0. On big endial it is r0-r1 = high-low. Here "__n >> 32" picks r0 and moves it to __rem which is returned through r0 so no extra instruction needed. Of course the function is inlined so r0 can be anything, or optimized away if__rem is not used. Nicolas _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RESEND, PATCH] ARM: fix __div64_32() error when compiling with clang 2020-11-30 17:52 ` [RESEND,PATCH] " Nicolas Pitre @ 2020-11-30 18:08 ` Ard Biesheuvel 0 siblings, 0 replies; 16+ messages in thread From: Ard Biesheuvel @ 2020-11-30 18:08 UTC (permalink / raw) To: Nicolas Pitre Cc: Antony Yu, Nick Desaulniers, Russell King, Linux Kernel Mailing List, clang-built-linux, Nathan Chancellor, Linux ARM On Mon, 30 Nov 2020 at 18:52, Nicolas Pitre <nico@fluxnic.net> wrote: > > On Mon, 30 Nov 2020, Ard Biesheuvel wrote: > > > On Mon, 30 Nov 2020 at 16:51, Nicolas Pitre <nico@fluxnic.net> wrote: > > > > > Here's my version of the fix which should be correct. Warning: this > > > is completely untested, but should in theory produce the same code on > > > modern gcc. > > > > > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h > > > index 898e9c78a7..595e538f5b 100644 > > > --- a/arch/arm/include/asm/div64.h > > > +++ b/arch/arm/include/asm/div64.h > > > @@ -21,29 +21,20 @@ > > > * assembly implementation with completely non standard calling convention > > > * for arguments and results (beware). > > > */ > > > - > > > -#ifdef __ARMEB__ > > > -#define __xh "r0" > > > -#define __xl "r1" > > > -#else > > > -#define __xl "r0" > > > -#define __xh "r1" > > > -#endif > > > - > > > static inline uint32_t __div64_32(uint64_t *n, uint32_t base) > > > { > > > register unsigned int __base asm("r4") = base; > > > register unsigned long long __n asm("r0") = *n; > > > register unsigned long long __res asm("r2"); > > > - register unsigned int __rem asm(__xh); > > > - asm( __asmeq("%0", __xh) > > > + unsigned int __rem; > > > + asm( __asmeq("%0", "r0") > > > __asmeq("%1", "r2") > > > - __asmeq("%2", "r0") > > > - __asmeq("%3", "r4") > > > + __asmeq("%2", "r4") > > > "bl __do_div64" > > > - : "=r" (__rem), "=r" (__res) > > > - : "r" (__n), "r" (__base) > > > + : "+r" (__n), "=r" (__res) > > > + : "r" (__base) > > > : "ip", "lr", "cc"); > > > + __rem = __n >> 32; > > > > This treats {r0, r1} as a {low, high} pair, regardless of endianness, > > and so it puts the value of r0 into r1. Doesn't that mean the shift > > should only be done on little endian? > > Not quite. r0-r1 = low-high is for little endian. Then "__n >> 32" is > actually translated into "mov r0, r1" to move it into __rem and returned > through r0. > > On big endial it is r0-r1 = high-low. Here "__n >> 32" picks r0 and > moves it to __rem which is returned through r0 so no extra instruction > needed. > > Of course the function is inlined so r0 can be anything, or optimized > away if__rem is not used. > OK, you're right. I got myself confused there, but a quick test with GCC confirms your explanation: $ arm-linux-gnueabihf-gcc -mbig-endian -O2 -S -o - \ -xc - <<<"long f(long long l) { return l >> 32; }" just produces bx lr whereas removing the -mbig-endian gives mov r0, r1 bx lr I tested the change and it builds and runs fine (although I am not sure how much coverage this code gets on an ordinary boot): Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2020-11-30 18:09 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-11-23 7:36 [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang Antony Yu
2020-11-23 18:16 ` Nathan Chancellor
[not found] ` <20201124074211.GA26157@penyung-VirtualBox>
2020-11-24 10:14 ` [RESEND, PATCH] " Antony Yu
2020-11-24 21:06 ` Nick Desaulniers
2020-11-30 8:20 ` [PATCH v2] " Antony Yu
2020-11-24 23:16 ` [RESEND,PATCH] " kernel test robot
2020-11-30 10:11 ` [RESEND, PATCH] " Ard Biesheuvel
2020-11-30 10:12 ` Ard Biesheuvel
2020-11-30 10:21 ` [RESEND,PATCH] " Russell King - ARM Linux admin
2020-11-30 10:40 ` [RESEND, PATCH] " Ard Biesheuvel
2020-11-30 13:58 ` [RESEND,PATCH] " David Laight
2020-11-30 14:18 ` Russell King - ARM Linux admin
2020-11-30 15:50 ` Nicolas Pitre
2020-11-30 17:18 ` [RESEND, PATCH] " Ard Biesheuvel
2020-11-30 17:52 ` [RESEND,PATCH] " Nicolas Pitre
2020-11-30 18:08 ` [RESEND, PATCH] " Ard Biesheuvel
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).