* [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
From: Ammar Faizi @ 2022-03-22 10:21 UTC
  To: Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
      Linux Kernel Mailing List, GNU/Weeb Mailing List, Ammar Faizi,
      llvm, Nick Desaulniers

Building with clang yields the following error:
```
  <inline asm>:3:1: error: _start changed binding to STB_GLOBAL
  .global _start
  ^
  1 error generated.
```
Make sure only specify one between `.global _start` and `.weak _start`.
Removing `.global _start`.

Cc: llvm@lists.linux.dev
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---

@@ Changelog:

   Link RFC v1: https://lore.kernel.org/llvm/20220320093750.159991-3-ammarfaizi2@gnuweeb.org
   RFC v1 -> RFC v2:
    - Remove all `.global _start` for all build (GCC and Clang) instead of
      removing all `.weak _start` for clang build (Comment from Willy).
---
 tools/include/nolibc/arch-aarch64.h | 1 -
 tools/include/nolibc/arch-arm.h     | 1 -
 tools/include/nolibc/arch-i386.h    | 1 -
 tools/include/nolibc/arch-mips.h    | 1 -
 tools/include/nolibc/arch-riscv.h   | 1 -
 tools/include/nolibc/arch-x86_64.h  | 1 -
 6 files changed, 6 deletions(-)

diff --git a/tools/include/nolibc/arch-aarch64.h b/tools/include/nolibc/arch-aarch64.h
index 87d9e434820c..2dbd80d633cb 100644
--- a/tools/include/nolibc/arch-aarch64.h
+++ b/tools/include/nolibc/arch-aarch64.h
@@ -184,7 +184,6 @@ struct sys_stat_struct {
 /* startup code */
 asm(".section .text\n"
     ".weak _start\n"
-    ".global _start\n"
     "_start:\n"
     "ldr x0, [sp]\n"              // argc (x0) was in the stack
     "add x1, sp, 8\n"             // argv (x1) = sp
diff --git a/tools/include/nolibc/arch-arm.h b/tools/include/nolibc/arch-arm.h
index 001a3c8c9ad5..1191395b5acd 100644
--- a/tools/include/nolibc/arch-arm.h
+++ b/tools/include/nolibc/arch-arm.h
@@ -177,7 +177,6 @@ struct sys_stat_struct {
 /* startup code */
 asm(".section .text\n"
     ".weak _start\n"
-    ".global _start\n"
     "_start:\n"
 #if defined(__THUMBEB__) || defined(__THUMBEL__)
     /* We enter here in 32-bit mode but if some previous functions were in
diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
index d7e4d53325a3..125a691fc631 100644
--- a/tools/include/nolibc/arch-i386.h
+++ b/tools/include/nolibc/arch-i386.h
@@ -176,7 +176,6 @@ struct sys_stat_struct {
  */
 asm(".section .text\n"
     ".weak _start\n"
-    ".global _start\n"
     "_start:\n"
     "pop %eax\n"                // argc   (first arg, %eax)
     "mov %esp, %ebx\n"          // argv[] (second arg, %ebx)
diff --git a/tools/include/nolibc/arch-mips.h b/tools/include/nolibc/arch-mips.h
index c9a6aac87c6d..1a124790c99f 100644
--- a/tools/include/nolibc/arch-mips.h
+++ b/tools/include/nolibc/arch-mips.h
@@ -192,7 +192,6 @@ struct sys_stat_struct {
 asm(".section .text\n"
     ".weak __start\n"
     ".set nomips16\n"
-    ".global __start\n"
     ".set    noreorder\n"
     ".option pic0\n"
     ".ent __start\n"
diff --git a/tools/include/nolibc/arch-riscv.h b/tools/include/nolibc/arch-riscv.h
index bc10b7b5706d..511d67fc534e 100644
--- a/tools/include/nolibc/arch-riscv.h
+++ b/tools/include/nolibc/arch-riscv.h
@@ -185,7 +185,6 @@ struct sys_stat_struct {
 /* startup code */
 asm(".section .text\n"
     ".weak _start\n"
-    ".global _start\n"
     "_start:\n"
     ".option push\n"
     ".option norelax\n"
diff --git a/tools/include/nolibc/arch-x86_64.h b/tools/include/nolibc/arch-x86_64.h
index a7b70ea51b68..84c174181425 100644
--- a/tools/include/nolibc/arch-x86_64.h
+++ b/tools/include/nolibc/arch-x86_64.h
@@ -199,7 +199,6 @@ struct sys_stat_struct {
  */
 asm(".section .text\n"
     ".weak _start\n"
-    ".global _start\n"
     "_start:\n"
     "pop %rdi\n"                // argc   (first arg, %rdi)
     "mov %rsp, %rsi\n"          // argv[] (second arg, %rsi)
-- 
Ammar Faizi

* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
From: Nick Desaulniers @ 2022-03-22 17:09 UTC
  To: Ammar Faizi
  Cc: Willy Tarreau, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
      Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm

On Tue, Mar 22, 2022 at 3:21 AM Ammar Faizi <ammarfaizi2@gnuweeb.org> wrote:
>
> Building with clang yields the following error:
> ```
>   <inline asm>:3:1: error: _start changed binding to STB_GLOBAL
>   .global _start
>   ^
>   1 error generated.
> ```
> Make sure only specify one between `.global _start` and `.weak _start`.
> Removing `.global _start`.

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

Yes, symbols should either be `.weak` or `.global`. The warning from
Clang's integrated assembler is meant to flush out funny business.

I assume there's a good reason _why_ _start is weak and not strong?
Then again, I'm not familiar with nolibc.

>
> Cc: llvm@lists.linux.dev
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
> ---
>
> @@ Changelog:
>
>    Link RFC v1: https://lore.kernel.org/llvm/20220320093750.159991-3-ammarfaizi2@gnuweeb.org
>    RFC v1 -> RFC v2:
>     - Remove all `.global _start` for all build (GCC and Clang) instead of
>       removing all `.weak _start` for clang build (Comment from Willy).
> ...

-- 
Thanks,
~Nick Desaulniers

* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
From: Willy Tarreau @ 2022-03-22 17:25 UTC
  To: Nick Desaulniers
  Cc: Ammar Faizi, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
      Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm

Hi Nick,

On Tue, Mar 22, 2022 at 10:09:18AM -0700, Nick Desaulniers wrote:
> On Tue, Mar 22, 2022 at 3:21 AM Ammar Faizi <ammarfaizi2@gnuweeb.org> wrote:
> >
> > Building with clang yields the following error:
> > ```
> >   <inline asm>:3:1: error: _start changed binding to STB_GLOBAL
> >   .global _start
> >   ^
> >   1 error generated.
> > ```
> > Make sure only specify one between `.global _start` and `.weak _start`.
> > Removing `.global _start`.
>
> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
>
> Yes, symbols should either be `.weak` or `.global`. The warning from
> Clang's integrated assembler is meant to flush out funny business.
>
> I assume there's a good reason _why_ _start is weak and not strong?

Yes, the issue appears when you start to build programs made of more than
one C file. That's why we have a few weak symbols here and there (others
like errno are static and the lack of inter-unit portability is assumed).

> Then again, I'm not familiar with nolibc.

No problem. The purpose is clearly *not* to implement a libc, but to have
something very lightweight that allows to compile trivial programs. A good
example of this is tools/testing/selftests/rcutorture/bin/mkinitrd.sh. I'm
personally using a tiny pre-init shell that I always package with my
kernels and that builds with them [1]. It will never do big things but
the balance between ease of use and coding effort is pretty good in my
experience. And I'm also careful not to make it complicated to use nor
to maintain, pragmatism is important and the effort should remain on the
program developer if some arbitration is needed.

Regards,
Willy

[1] https://github.com/formilux/flxutils/tree/master/init

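The multi-file situation described in the message above can be sketched at
the C level. This is only an illustration under stated assumptions, not
nolibc's code: `entry_hook` and `show_entry_hook` are hypothetical names, and
the weak attribute plays the role that `.weak _start` plays in the inline
assembly.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for nolibc's _start: imagine this definition lives
 * in a header that every .c file of the program includes.
 *
 * Without the attribute, each translation unit would emit a strong
 * (STB_GLOBAL) definition and the link would fail with a "multiple
 * definition" error. With it, every copy is emitted as STB_WEAK and the
 * linker keeps a single one.
 */
__attribute__((weak)) int entry_hook(void)
{
	return 42;
}

/* Prints the value returned by whichever copy the linker kept. */
void show_entry_hook(void)
{
	printf("entry_hook() = %d\n", entry_hook());
}
```

Running `nm` on the resulting object should list `entry_hook` with a `W`
binding rather than `T`.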
* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
From: Nick Desaulniers @ 2022-03-22 17:30 UTC
  To: Willy Tarreau
  Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm

(Moving folks to bcc; check the lists if you're interested)

On Tue, Mar 22, 2022 at 10:25 AM Willy Tarreau <w@1wt.eu> wrote:
>
> Hi Nick,
>
> On Tue, Mar 22, 2022 at 10:09:18AM -0700, Nick Desaulniers wrote:
> > Then again, I'm not familiar with nolibc.
>
> No problem. The purpose is clearly *not* to implement a libc, but to have
> something very lightweight that allows to compile trivial programs. A good
> example of this is tools/testing/selftests/rcutorture/bin/mkinitrd.sh. I'm
> personally using a tiny pre-init shell that I always package with my
> kernels and that builds with them [1]. It will never do big things but
> the balance between ease of use and coding effort is pretty good in my
> experience. And I'm also careful not to make it complicated to use nor
> to maintain, pragmatism is important and the effort should remain on the
> program developer if some arbitration is needed.

Neat, I bet that helps generate very small initrd! Got any quick size
measurements?

-- 
Thanks,
~Nick Desaulniers

* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
From: Willy Tarreau @ 2022-03-22 17:58 UTC
  To: Nick Desaulniers
  Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm

On Tue, Mar 22, 2022 at 10:30:53AM -0700, Nick Desaulniers wrote:
> (Moving folks to bcc; check the lists if you're interested)

Yes, agreed :-)

> On Tue, Mar 22, 2022 at 10:25 AM Willy Tarreau <w@1wt.eu> wrote:
> > The purpose is clearly *not* to implement a libc, but to have
> > something very lightweight that allows to compile trivial programs. A good
> > example of this is tools/testing/selftests/rcutorture/bin/mkinitrd.sh. I'm
> > personally using a tiny pre-init shell that I always package with my
> > kernels and that builds with them [1]. It will never do big things but
> > the balance between ease of use and coding effort is pretty good in my
> > experience. And I'm also careful not to make it complicated to use nor
> > to maintain, pragmatism is important and the effort should remain on the
> > program developer if some arbitration is needed.
>
> Neat, I bet that helps generate very small initrd! Got any quick size
> measurements?

Yep:

First, the usual static printf("hello world!\n"):

  $ ll hello-*libc
  -rwxrwxr-x 1 willy dev 719232 Mar 22 18:50 hello-glibc*
  -rwxrwxr-x 1 willy dev   1248 Mar 22 18:51 hello-nolibc*

  $ objdump -h hello-nolibc

  hello-nolibc:     file format elf64-x86-64

  Sections:
  Idx Name          Size      VMA               LMA               File off  Algn
    0 .text         00000300  00000000004000b0  00000000004000b0  000000b0  2**0
                    CONTENTS, ALLOC, LOAD, READONLY, CODE
    1 .rodata       00000015  00000000004003b0  00000000004003b0  000003b0  2**0
                    CONTENTS, ALLOC, LOAD, READONLY, DATA

Then the preinit stuff:

  $ ll initramfs/init
  -rwxr-xr-x 1 willy users 13936 Mar 22 18:40 initramfs/init*

  $ xz -c9 < initramfs/init | wc -c
  8392

  $ size initramfs/init
     text    data     bss     dec     hex filename
    13348       0   23016   36364    8e0c init

  $ objdump -h initramfs/init

  initramfs/init:     file format elf64-x86-64

  Sections:
  Idx Name          Size      VMA               LMA               File off  Algn
    0 .text         00002b74  00000000004000e8  00000000004000e8  000000e8  2**0
                    CONTENTS, ALLOC, LOAD, READONLY, CODE
    1 .rodata       000008b0  0000000000402c60  0000000000402c60  00002c60  2**5
                    CONTENTS, ALLOC, LOAD, READONLY, DATA
    2 .bss          000059e8  0000000000404520  0000000000404520  00003520  2**5
                    ALLOC

This one supports ~30-40 simple commands (mount/unmount, mknod, ls, ln),
a tar extractor, multi-level braces, and boolean expression evaluation,
variable expansion, and a config file parser to script all this. The code
is 20 years old and is really ugly (even uglier than you think). But that
gives an idea. 20 years ago the init was much simpler and 800 bytes (my
constraint was for single floppies containing kernel+rootfs) and strings
were manually merged by tails and put in .text to drop .rodata.

You'll also note that there's 0 data segment above. That used to be
convenient to further shrink programs, but these days given how linkers
arrange segments by permissions that doesn't save as much as it used to,
and it's likely that at some points I'll assume that there must be some
variables by default (errno, environ, etc) and that we'll accept to invest
a few extra tens of bytes by default for more convenience.

Cheers,
Willy

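The "strings manually merged by tails" trick mentioned in the message above
can be illustrated as follows. This is a minimal sketch and not the actual
preinit code: the variable and function names are made up here, only the two
command names come from the message, and placing the strings in .text is not
shown.

```c
#include <string.h>

/*
 * One buffer holds the longer command name; the shorter name is a pointer
 * into its tail, so the bytes "mount\0" are stored only once.
 */
static const char cmd_unmount[] = "unmount";
static const char *const cmd_mount = cmd_unmount + 2;	/* points at "mount" */

/* Returns 1 if `word` is one of the two commands, 0 otherwise. */
int is_mount_command(const char *word)
{
	return strcmp(word, cmd_mount) == 0 ||
	       strcmp(word, cmd_unmount) == 0;
}
```

Doing this by hand only pays off when every byte counts, as with the
floppy-sized images described above.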
* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
From: Nick Desaulniers @ 2022-03-22 18:07 UTC
  To: Willy Tarreau
  Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm

On Tue, Mar 22, 2022 at 10:58 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Tue, Mar 22, 2022 at 10:30:53AM -0700, Nick Desaulniers wrote:
> > On Tue, Mar 22, 2022 at 10:25 AM Willy Tarreau <w@1wt.eu> wrote:
> > > The purpose is clearly *not* to implement a libc, but to have
> > > something very lightweight that allows to compile trivial programs. A good
> > > example of this is tools/testing/selftests/rcutorture/bin/mkinitrd.sh. I'm
> > > personally using a tiny pre-init shell that I always package with my
> > > kernels and that builds with them [1]. It will never do big things but
> > > the balance between ease of use and coding effort is pretty good in my
> > > experience. And I'm also careful not to make it complicated to use nor
> > > to maintain, pragmatism is important and the effort should remain on the
> > > program developer if some arbitration is needed.
> >
> > Neat, I bet that helps generate very small initrd! Got any quick size
> > measurements?
>
> Yep:
>
> First, the usual static printf("hello world!\n"):
>
>   $ ll hello-*libc
>   -rwxrwxr-x 1 willy dev 719232 Mar 22 18:50 hello-glibc*
>   -rwxrwxr-x 1 willy dev   1248 Mar 22 18:51 hello-nolibc*

! What! Are those both statically linked?

> This one supports ~30-40 simple commands (mount/unmount, mknod, ls, ln),
> a tar extractor, multi-level braces, and boolean expression evaluation,
> variable expansion, and a config file parser to script all this. The code
> is 20 years old and is really ugly (even uglier than you think). But that
> gives an idea. 20 years ago the init was much simpler and 800 bytes (my
> constraint was for single floppies containing kernel+rootfs) and strings
> were manually merged by tails and put in .text to drop .rodata.

Oh, so nolibc has been around for a while then?

ld.lld will do string merging in that fashion at -O2 (the linker can
accept an optimization level). I did have a kernel patch for that
somewhere, need to update it for CC_OPTIMIZE_FOR_SIZE...

I guess the tradeoff with strings in .text is that now the strings
themselves are r+x and not just r?

>
> You'll also note that there's 0 data segment above. That used to be
> convenient to further shrink programs, but these days given how linkers
> arrange segments by permissions that doesn't save as much as it used to,
> and it's likely that at some points I'll assume that there must be some
> variables by default (errno, environ, etc) and that we'll accept to invest
> a few extra tens of bytes by default for more convenience.

Thanks for the measurements.

-- 
Thanks,
~Nick Desaulniers

* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
From: Willy Tarreau @ 2022-03-22 18:24 UTC
  To: Nick Desaulniers
  Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm

On Tue, Mar 22, 2022 at 11:07:17AM -0700, Nick Desaulniers wrote:
> > First, the usual static printf("hello world!\n"):
> >
> >   $ ll hello-*libc
> >   -rwxrwxr-x 1 willy dev 719232 Mar 22 18:50 hello-glibc*
> >   -rwxrwxr-x 1 willy dev   1248 Mar 22 18:51 hello-nolibc*
>
> ! What! Are those both statically linked?

Yes:

  $ file hello-nolibc
  hello-nolibc: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped

(rebuilding without stripping)

  $ nm --size hello-nolibc
  000000000000000f T main
  0000000000000053 t u64toa_r
  0000000000000280 t printf.constprop.0

  $ nm hello-nolibc
  00000000004013c5 R __bss_start
  00000000004013c5 R _edata
  00000000004013c8 R _end
  00000000004000bf W _start
  00000000004000b0 T main
  0000000000400130 t printf.constprop.0
  00000000004000dd t u64toa_r

> > This one supports ~30-40 simple commands (mount/unmount, mknod, ls, ln),
> > a tar extractor, multi-level braces, and boolean expression evaluation,
> > variable expansion, and a config file parser to script all this. The code
> > is 20 years old and is really ugly (even uglier than you think). But that
> > gives an idea. 20 years ago the init was much simpler and 800 bytes (my
> > constraint was for single floppies containing kernel+rootfs) and strings
> > were manually merged by tails and put in .text to drop .rodata.
>
> Oh, so nolibc has been around for a while then?

Not exactly. Over time I collected some of my stuff out of preinit to make
more reusable code for other tools, and eventually created a separate
project for it 5 years ago [1]. I then changed my mind a few times on how
to arrange all this and over time it became a bit easier to use. One day
Paul asked how to make less invasive static binaries for rcutorture and I
found that it was the perfect match so we agreed to integrate it there. It
was still a single file by then. And as usual when some code starts to get
more exposure it receives more contribs and feature requests ;-)

> ld.lld will do string merging in that fashion at -O2 (the linker can
> accept and optimization level). I did have a kernel patch for that
> somewhere, need to update it for CC_OPTIMIZE_FOR_SIZE...

Ah I didn't know, that's good to know!

> I guess the tradeoff with strings in .text is that now the strings
> themselves are r+x and not just r?

Yes but when you're writing a small shell to allow you to manually mount
your rootfs from the kernel, you don't really care if someone might try to
use some of your strings as code gadgets for ROP exploits :-)

I would really not want to see this used for general programs, but it
does fit well with hacking stuff for initramfs, and what lies in the
selftests directory in general I guess.

What I particularly like is that I don't need a full toolchain, so if
I can build a kernel with the bare-metal compilers from kernel.org then
I know I can also build my initramfs that's packaged in it using the
exact same compiler. This significantly simplifies the build process.

Willy

[1] https://github.com/wtarreau/nolibc

* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
From: Nick Desaulniers @ 2022-03-22 18:38 UTC
  To: Willy Tarreau
  Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm

On Tue, Mar 22, 2022 at 11:24 AM Willy Tarreau <w@1wt.eu> wrote:
>
> What I particularly like is that I don't need a full toolchain, so if
> I can build a kernel with the bare-metal compilers from kernel.org then
> I know I can also build my initramfs that's packaged in it using the
> exact same compiler. This significantly simplifies the build process.

Neat; yeah that coincides a bit with my interest in having builds of
llvm on kernel.org; having/needing a libc is a PITA and building a
full cross toolchain is also more difficult than I think it needs to
be. The libc will depend on kernel headers, for each target. LLVM
currently has a WIP libc in its tree; I'm looking for something I can
statically link into the toolchain images (even LTO them into the
image). Will probably pursue musl (if I ever get time for this,
though maybe a project for my summer intern...).

One thing I've been looking at is a utility called llvm-ifs [1]; it can
generate .so stubs from a textual description that can be more easily
read, diff'ed, and committed. These are much faster to build and
reduce the chain of build dependencies (when dynamically linking).
Last I checked it had issues with versioned symbols, and I'm not sure
if/what it does for headers, which are still needed. Within Android,
libabigail is being used to dump+diff xml descriptions of parts of an
ABI, it looks like llvm-ifs might be useful for that as well. Not
sure if it's interesting but thought I'd share.

[1] https://www.youtube.com/watch?v=_pIorUFavc8

-- 
Thanks,
~Nick Desaulniers

* [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
From: Ammar Faizi @ 2022-03-22 10:21 UTC
  To: Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
      Linux Kernel Mailing List, GNU/Weeb Mailing List, Ammar Faizi,
      x86, llvm, David Laight

On i386, the 6th argument of syscall goes in %ebp. However, both Clang
and GCC cannot use %ebp in the clobber list and in the "r" constraint
without using -fomit-frame-pointer. To make it always available for
any kind of compilation, the below workaround is implemented.

For clang (the Assembly statement can't clobber %ebp):
  1) Push the 6-th argument.
  2) Push %ebp.
  3) Load the 6-th argument from 4(%esp) to %ebp.
  4) Do the syscall (int $0x80).
  5) Pop %ebp (restore the old value of %ebp).
  6) Add %esp by 4 (undo the stack pointer).

For GCC, fortunately it has a #pragma that can force a specific function
to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
var is a variable bound to %ebp.

Cc: x86@kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/lkml/2e335ac54db44f1d8496583d97f9dab0@AcuMS.aculab.com
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---

@@ Changelog:

   Link RFC v1: https://lore.kernel.org/llvm/20220320093750.159991-4-ammarfaizi2@gnuweeb.org
   RFC v1 -> RFC v2:
    - Fix %ebp saving method. Don't use redzone, i386 doesn't have a redzone
      (comment from David and Alviro).
---
 tools/include/nolibc/arch-i386.h | 66 ++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
index 125a691fc631..9f4dc36e6ac2 100644
--- a/tools/include/nolibc/arch-i386.h
+++ b/tools/include/nolibc/arch-i386.h
@@ -167,6 +167,72 @@ struct sys_stat_struct {
 	_ret;                                                                 \
 })
 
+
+/*
+ * Both Clang and GCC cannot use %ebp in the clobber list and in the "r"
+ * constraint without using -fomit-frame-pointer. To make it always
+ * available for any kind of compilation, the below workaround is
+ * implemented.
+ *
+ * For clang (the Assembly statement can't clobber %ebp):
+ *   1) Push the 6-th argument.
+ *   2) Push %ebp.
+ *   3) Load the 6-th argument from 4(%esp) to %ebp.
+ *   4) Do the syscall (int $0x80).
+ *   5) Pop %ebp (restore the old value of %ebp).
+ *   6) Add %esp by 4 (undo the stack pointer).
+ *
+ * For GCC, fortunately it has a #pragma that can force a specific function
+ * to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
+ * var is a variable bound to %ebp.
+ *
+ */
+#if defined(__clang__)
+static inline long ____do_syscall6(long eax, long ebx, long ecx, long edx,
+				   long esi, long edi, long ebp)
+{
+	__asm__ volatile (
+		"pushl	%[arg6]\n\t"
+		"pushl	%%ebp\n\t"
+		"movl	4(%%esp), %%ebp\n\t"
+		"int	$0x80\n\t"
+		"popl	%%ebp\n\t"
+		"addl	$4,%%esp\n\t"
+		: "=a"(eax)
+		: "a"(eax), "b"(ebx), "c"(ecx), "d"(edx), "S"(esi), "D"(edi),
+		  [arg6]"m"(ebp)
+		: "memory", "cc"
+	);
+	return eax;
+}
+
+#else /* #if defined(__clang__) */
+#pragma GCC push_options
+#pragma GCC optimize "-fomit-frame-pointer"
+static long ____do_syscall6(long eax, long ebx, long ecx, long edx, long esi,
+			    long edi, long ebp)
+{
+	register long __ebp __asm__("ebp") = ebp;
+	__asm__ volatile (
+		"int	$0x80"
+		: "=a"(eax)
+		: "a"(eax), "b"(ebx), "c"(ecx), "d"(edx), "S"(esi), "D"(edi),
+		  "r"(__ebp)
+		: "memory", "cc"
+	);
+	return eax;
+}
+#pragma GCC pop_options
+#endif /* #if defined(__clang__) */
+
+#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) (	\
+	____do_syscall6((long)(num), (long)(arg1),		\
+			(long)(arg2), (long)(arg3),		\
+			(long)(arg4), (long)(arg5),		\
+			(long)(arg6))				\
+)
+
+
 /* startup code */
 /*
  * i386 System V ABI mandates:
-- 
Ammar Faizi

* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
From: David Laight @ 2022-03-22 10:57 UTC
  To: 'Ammar Faizi', Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
      Linux Kernel Mailing List, GNU/Weeb Mailing List,
      x86@kernel.org, llvm@lists.linux.dev

From: Ammar Faizi
> Sent: 22 March 2022 10:21
>
> On i386, the 6th argument of syscall goes in %ebp. However, both Clang
> and GCC cannot use %ebp in the clobber list and in the "r" constraint
> without using -fomit-frame-pointer. To make it always available for
> any kind of compilation, the below workaround is implemented.
>
> For clang (the Assembly statement can't clobber %ebp):
>   1) Push the 6-th argument.
>   2) Push %ebp.
>   3) Load the 6-th argument from 4(%esp) to %ebp.
>   4) Do the syscall (int $0x80).
>   5) Pop %ebp (restore the old value of %ebp).
>   6) Add %esp by 4 (undo the stack pointer).
>
> For GCC, fortunately it has a #pragma that can force a specific function
> to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
> var is a variable bound to %ebp.

You need to use the 'clang' pattern for gcc.
#pragma optimise is fundamentally broken.
What actually happens here is the 'inline' gets lost
(because of the implied -O0) and you get far worse code
than you might expect.

Since you need the 'clang' version, use it all the time.

	David

* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 10:57 ` David Laight @ 2022-03-22 11:23 ` Willy Tarreau 0 siblings, 0 replies; 30+ messages in thread From: Willy Tarreau @ 2022-03-22 11:23 UTC (permalink / raw) To: David Laight Cc: 'Ammar Faizi', Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev On Tue, Mar 22, 2022 at 10:57:01AM +0000, David Laight wrote: > From: Ammar Faizi > > Sent: 22 March 2022 10:21 > > > > On i386, the 6th argument of syscall goes in %ebp. However, both Clang > > and GCC cannot use %ebp in the clobber list and in the "r" constraint > > without using -fomit-frame-pointer. To make it always available for > > any kind of compilation, the below workaround is implemented. > > > > For clang (the Assembly statement can't clobber %ebp): > > 1) Push the 6-th argument. > > 2) Push %ebp. > > 3) Load the 6-th argument from 4(%esp) to %ebp. > > 4) Do the syscall (int $0x80). > > 5) Pop %ebp (restore the old value of %ebp). > > 6) Add %esp by 4 (undo the stack pointer). > > > > For GCC, fortunately it has a #pragma that can force a specific function > > to be compiled with -fomit-frame-pointer, so it can use "r"(var) where > > var is a variable bound to %ebp. > > You need to use the 'clang' pattern for gcc. > #pragma optimise is fundamentally broken. > What actually happens here is the 'inline' gets lost > (because of the implied -O0) and you get far worse code > than you might expect. > > Since you need the 'clang' version, use it all the time. I clearly prefer it as well, it looks much cleaner! Willy ^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 10:21 ` [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments Ammar Faizi 2022-03-22 10:57 ` David Laight @ 2022-03-22 11:39 ` David Laight 2022-03-22 12:02 ` Ammar Faizi 1 sibling, 1 reply; 30+ messages in thread From: David Laight @ 2022-03-22 11:39 UTC (permalink / raw) To: 'Ammar Faizi', Willy Tarreau Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev From: Ammar Faizi > Sent: 22 March 2022 10:21 > On i386, the 6th argument of syscall goes in %ebp. However, both Clang > and GCC cannot use %ebp in the clobber list and in the "r" constraint > without using -fomit-frame-pointer. To make it always available for > any kind of compilation, the below workaround is implemented. > ... > diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h > index 125a691fc631..9f4dc36e6ac2 100644 > --- a/tools/include/nolibc/arch-i386.h > +++ b/tools/include/nolibc/arch-i386.h > @@ -167,6 +167,72 @@ struct sys_stat_struct { > _ret; \ > }) > > + > +/* > + * Both Clang and GCC cannot use %ebp in the clobber list and in the "r" > + * constraint without using -fomit-frame-pointer. To make it always > + * available for any kind of compilation, the below workaround is > + * implemented. > + * > + * For clang (the Assembly statement can't clobber %ebp): > + * 1) Push the 6-th argument. > + * 2) Push %ebp. > + * 3) Load the 6-th argument from 4(%esp) to %ebp. > + * 4) Do the syscall (int $0x80). > + * 5) Pop %ebp (restore the old value of %ebp). > + * 6) Add %esp by 4 (undo the stack pointer). > + * > + * For GCC, fortunately it has a #pragma that can force a specific function > + * to be compiled with -fomit-frame-pointer, so it can use "r"(var) where > + * var is a variable bound to %ebp. 
> + * > + */ > +#if defined(__clang__) > +static inline long ____do_syscall6(long eax, long ebx, long ecx, long edx, > + long esi, long edi, long ebp) That should probably be: static inline long ____do_syscall6(long nr, long arg1, long arg2, long arg3, long arg4, long arg5, long arg6) and the input constraints changed to match. > +{ > + __asm__ volatile ( > + "pushl %[arg6]\n\t" > + "pushl %%ebp\n\t" > + "movl 4(%%esp), %%ebp\n\t" > + "int $0x80\n\t" > + "popl %%ebp\n\t" > + "addl $4,%%esp\n\t" > + : "=a"(eax) > + : "a"(eax), "b"(ebx), "c"(ecx), "d"(edx), "S"(esi), "D"(edi), Does having "=a" for an output constraint and "a" for an input constraint actually DTRT? There is a special syntax for tying input and output to the same register. Or you could use "+a"(nr_rval) and 'return nr_rval'. David > + [arg6]"m"(ebp) > + : "memory", "cc" > + ); > + return eax; > +} - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 30+ messages in thread
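The two alternatives David mentions (a tied input operand, or a single read-write operand) can be sketched as below. This is only an illustration, not code from the patch: the function names `add_tied` and `add_rw` are hypothetical, the asm body is a plain `add` rather than a syscall, and a plain C fallback is assumed for non-x86 targets.

```c
/* Sketch: two ways to say "this register is both an input and an output". */
#if defined(__i386__) || defined(__x86_64__)
static long add_tied(long acc, long x)
{
	long out;

	/* "0" ties this input to operand 0, so acc starts out in the "a" register. */
	__asm__ ("add %2, %0" : "=a"(out) : "0"(acc), "r"(x) : "cc");
	return out;
}

static long add_rw(long acc, long x)
{
	/* "+a" declares acc as a read-write operand living in the "a" register. */
	__asm__ ("add %1, %0" : "+a"(acc) : "r"(x) : "cc");
	return acc;
}
#else
/* Assumed fallback so the sketch still builds on other architectures. */
static long add_tied(long acc, long x) { return acc + x; }
static long add_rw(long acc, long x) { return acc + x; }
#endif
```

With the "+a" form the same variable carries the syscall number in and the return value out, which is what the 'return nr_rval' suggestion amounts to.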
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 11:39 ` David Laight @ 2022-03-22 12:02 ` Ammar Faizi 2022-03-22 12:07 ` Ammar Faizi 2022-03-22 12:13 ` Willy Tarreau 0 siblings, 2 replies; 30+ messages in thread From: Ammar Faizi @ 2022-03-22 12:02 UTC (permalink / raw) To: David Laight, Willy Tarreau Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev

On 3/22/22 6:39 PM, David Laight wrote:
>> +	__asm__ volatile (
>> +		"pushl	%[arg6]\n\t"
>> +		"pushl	%%ebp\n\t"
>> +		"movl	4(%%esp), %%ebp\n\t"
>> +		"int	$0x80\n\t"
>> +		"popl	%%ebp\n\t"
>> +		"addl	$4,%%esp\n\t"
>> +		: "=a"(eax)
>> +		: "a"(eax), "b"(ebx), "c"(ecx), "d"(edx), "S"(esi), "D"(edi),
>
> Does having "=a" for an output constraint and "a" for an input
> constraint actually DTRT?
> There is a special syntax for tying input and output to
> the same register.
> Or you could use "+a"(nr_rval) and 'return nr_rval'.

Well, I agree with your previous email. Since we no longer use a #pragma
optimize with -fomit-frame-pointer, the function is not needed. I propose the
following macro (it is not much different from the other my_syscall macros),
except that the 6th argument can be in a register or in memory.

The "rm" constraint here gives the compiler the opportunity to use %ebp
instead of memory if -fomit-frame-pointer is turned on.
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)	\
({								\
	long _ret;						\
	register long _num asm("eax") = (num);			\
	register long _arg1 asm("ebx") = (long)(arg1);		\
	register long _arg2 asm("ecx") = (long)(arg2);		\
	register long _arg3 asm("edx") = (long)(arg3);		\
	register long _arg4 asm("esi") = (long)(arg4);		\
	register long _arg5 asm("edi") = (long)(arg5);		\
	long _arg6 = (long)(arg6); /* Might be in memory */	\
								\
	asm volatile (						\
		"pushl	%[_arg6]\n\t"				\
		"pushl	%%ebp\n\t"				\
		"movl	4(%%esp), %%ebp\n\t"			\
		"int	$0x80\n\t"				\
		"popl	%%ebp\n\t"				\
		"addl	$4,%%esp\n\t"				\
		: "=a"(_ret)					\
		: "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3),	\
		  "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6)	\
		: "memory", "cc"				\
	);							\
	_ret;							\
})

What do you think?

--
Ammar Faizi

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 12:02 ` Ammar Faizi @ 2022-03-22 12:07 ` Ammar Faizi 2022-03-22 12:13 ` Willy Tarreau 1 sibling, 0 replies; 30+ messages in thread From: Ammar Faizi @ 2022-03-22 12:07 UTC (permalink / raw) To: David Laight, Willy Tarreau Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev On 3/22/22 7:02 PM, Ammar Faizi wrote: > Well, I agree with your previous email. Now since we no longer use a #pragma > optimize with -fomit-frame-pointer, the function is not needed. I propose the > following macro (this is not so much different with other my_syscall macro), > expect the 6th argument can be in reg or mem. > > The "rm" constraint here gives the opportunity for the compiler to use %ebp > instead of memory if -fomit-frame-pointer is turned on. > > #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \ > ({ \ > long _ret; \ > register long _num asm("eax") = (num); \ > register long _arg1 asm("ebx") = (long)(arg1); \ > register long _arg2 asm("ecx") = (long)(arg2); \ > register long _arg3 asm("edx") = (long)(arg3); \ > register long _arg4 asm("esi") = (long)(arg4); \ > register long _arg5 asm("edi") = (long)(arg5); \ > long _arg6 = (long)(arg6); /* Might be in memory */ \ > \ > asm volatile ( \ > "pushl %[_arg6]\n\t" \ > "pushl %%ebp\n\t" \ > "movl 4(%%esp), %%ebp\n\t" \ > "int $0x80\n\t" \ > "popl %%ebp\n\t" \ > "addl $4,%%esp\n\t" \ > : "=a"(_ret) \ > : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \ > "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6) \ > : "memory", "cc" \ > ); \ > _ret; \ > }) > > What do you think? 
>

For the following code:

    int main()
    {
        mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
        return 0;
    }

GCC generates this:

    00001000 <main>:
        1000:  push   %ebp
        1001:  mov    $0xc0,%eax
        1006:  mov    $0x1000,%ecx
        100b:  mov    $0x3,%edx
        1010:  push   %edi
        1011:  xor    %ebp,%ebp
        1013:  mov    $0xffffffff,%edi
        1018:  push   %esi
        1019:  mov    $0x22,%esi
        101e:  push   %ebx
        101f:  xor    %ebx,%ebx
        1021:  push   %ebp              <--- arg6 here
        1022:  push   %ebp
        1023:  mov    0x4(%esp),%ebp
        1027:  int    $0x80
        1029:  pop    %ebp
        102a:  add    $0x4,%esp
        102d:  xor    %eax,%eax
        102f:  pop    %ebx
        1030:  pop    %esi
        1031:  pop    %edi
        1032:  pop    %ebp
        1033:  ret

--
Ammar Faizi

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 12:02 ` Ammar Faizi 2022-03-22 12:07 ` Ammar Faizi @ 2022-03-22 12:13 ` Willy Tarreau 2022-03-22 13:26 ` Ammar Faizi 2022-03-22 13:37 ` David Laight 1 sibling, 2 replies; 30+ messages in thread From: Willy Tarreau @ 2022-03-22 12:13 UTC (permalink / raw) To: Ammar Faizi Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev On Tue, Mar 22, 2022 at 07:02:53PM +0700, Ammar Faizi wrote: > I propose the > following macro (this is not so much different with other my_syscall macro), > expect the 6th argument can be in reg or mem. > > The "rm" constraint here gives the opportunity for the compiler to use %ebp > instead of memory if -fomit-frame-pointer is turned on. > > #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \ > ({ \ > long _ret; \ > register long _num asm("eax") = (num); \ > register long _arg1 asm("ebx") = (long)(arg1); \ > register long _arg2 asm("ecx") = (long)(arg2); \ > register long _arg3 asm("edx") = (long)(arg3); \ > register long _arg4 asm("esi") = (long)(arg4); \ > register long _arg5 asm("edi") = (long)(arg5); \ > long _arg6 = (long)(arg6); /* Might be in memory */ \ > \ > asm volatile ( \ > "pushl %[_arg6]\n\t" \ > "pushl %%ebp\n\t" \ > "movl 4(%%esp), %%ebp\n\t" \ > "int $0x80\n\t" \ > "popl %%ebp\n\t" \ > "addl $4,%%esp\n\t" \ > : "=a"(_ret) \ > : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \ > "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6) \ > : "memory", "cc" \ > ); \ > _ret; \ > }) > > What do you think? Hmmm indeed that comes back to the existing constructs and is certainly more in line with the rest of the code (plus it will not be affected by -O0). 
I seem to remember a register allocation issue which kept me away from implementing it this way on i386 back then, but given that my focus was not as much on i386 as it was on other platforms, it's likely that I have not insisted too much and not tried this one which looks like the way to go to me. Willy ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 12:13 ` Willy Tarreau @ 2022-03-22 13:26 ` Ammar Faizi 2022-03-22 13:34 ` Willy Tarreau 2022-03-22 13:37 ` David Laight 1 sibling, 1 reply; 30+ messages in thread From: Ammar Faizi @ 2022-03-22 13:26 UTC (permalink / raw) To: Willy Tarreau Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev On 3/22/22 7:13 PM, Willy Tarreau wrote: > On Tue, Mar 22, 2022 at 07:02:53PM +0700, Ammar Faizi wrote: >> I propose the >> following macro (this is not so much different with other my_syscall macro), >> expect the 6th argument can be in reg or mem. >> >> The "rm" constraint here gives the opportunity for the compiler to use %ebp >> instead of memory if -fomit-frame-pointer is turned on. >> >> #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \ >> ({ \ >> long _ret; \ >> register long _num asm("eax") = (num); \ >> register long _arg1 asm("ebx") = (long)(arg1); \ >> register long _arg2 asm("ecx") = (long)(arg2); \ >> register long _arg3 asm("edx") = (long)(arg3); \ >> register long _arg4 asm("esi") = (long)(arg4); \ >> register long _arg5 asm("edi") = (long)(arg5); \ >> long _arg6 = (long)(arg6); /* Might be in memory */ \ >> \ >> asm volatile ( \ >> "pushl %[_arg6]\n\t" \ >> "pushl %%ebp\n\t" \ >> "movl 4(%%esp), %%ebp\n\t" \ >> "int $0x80\n\t" \ >> "popl %%ebp\n\t" \ >> "addl $4,%%esp\n\t" \ >> : "=a"(_ret) \ >> : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \ >> "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6) \ >> : "memory", "cc" \ >> ); \ >> _ret; \ >> }) >> >> What do you think? > > Hmmm indeed that comes back to the existing constructs and is certainly > more in line with the rest of the code (plus it will not be affected by > -O0). 
>
> I seem to remember a register allocation issue which kept me away from
> implementing it this way on i386 back then, but given that my focus was
> not as much on i386 as it was on other platforms, it's likely that I have
> not insisted too much and not tried this one which looks like the way to
> go to me.

It turned out that GCC refuses to use "rm" if we compile without
-fomit-frame-pointer (e.g. without optimization / -O0). So I will still use
"m" here.

--
Ammar Faizi

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 13:26 ` Ammar Faizi @ 2022-03-22 13:34 ` Willy Tarreau 2022-03-22 13:37 ` Ammar Faizi 0 siblings, 1 reply; 30+ messages in thread From: Willy Tarreau @ 2022-03-22 13:34 UTC (permalink / raw) To: Ammar Faizi Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev On Tue, Mar 22, 2022 at 08:26:37PM +0700, Ammar Faizi wrote: > On 3/22/22 7:13 PM, Willy Tarreau wrote: > > On Tue, Mar 22, 2022 at 07:02:53PM +0700, Ammar Faizi wrote: > > > I propose the > > > following macro (this is not so much different with other my_syscall macro), > > > expect the 6th argument can be in reg or mem. > > > > > > The "rm" constraint here gives the opportunity for the compiler to use %ebp > > > instead of memory if -fomit-frame-pointer is turned on. > > > > > > #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \ > > > ({ \ > > > long _ret; \ > > > register long _num asm("eax") = (num); \ > > > register long _arg1 asm("ebx") = (long)(arg1); \ > > > register long _arg2 asm("ecx") = (long)(arg2); \ > > > register long _arg3 asm("edx") = (long)(arg3); \ > > > register long _arg4 asm("esi") = (long)(arg4); \ > > > register long _arg5 asm("edi") = (long)(arg5); \ > > > long _arg6 = (long)(arg6); /* Might be in memory */ \ > > > \ > > > asm volatile ( \ > > > "pushl %[_arg6]\n\t" \ > > > "pushl %%ebp\n\t" \ > > > "movl 4(%%esp), %%ebp\n\t" \ > > > "int $0x80\n\t" \ > > > "popl %%ebp\n\t" \ > > > "addl $4,%%esp\n\t" \ > > > : "=a"(_ret) \ > > > : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \ > > > "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6) \ > > > : "memory", "cc" \ > > > ); \ > > > _ret; \ > > > }) > > > > > > What do you think? > > > > Hmmm indeed that comes back to the existing constructs and is certainly > > more in line with the rest of the code (plus it will not be affected by > > -O0). 
> > > > I seem to remember a register allocation issue which kept me away from > > implementing it this way on i386 back then, but given that my focus was > > not as much on i386 as it was on other platforms, it's likely that I have > > not insisted too much and not tried this one which looks like the way to > > go to me. > > I turned out GCC refuses to use "rm" if we compile without -fomit-frame-pointer > (e.g. without optimization / -O0). So I will still use "m" here. OK that's fine. then you can probably simplify it like this: long _arg6 = (long)(arg6); /* Might be in memory */ \ \ asm volatile ( \ "pushl %%ebp\n\t" \ "movl %[_arg6], %%ebp\n\t" \ "int $0x80\n\t" \ "popl %%ebp\n\t" \ : "=a"(_ret) \ : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \ "r"(_arg4),"r"(_arg5), [_arg6]"m"(_arg6) \ : "memory", "cc" \ ); \ See ? no more push, no more addl, direct load from memory. Willy ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 13:34 ` Willy Tarreau @ 2022-03-22 13:37 ` Ammar Faizi 2022-03-22 13:39 ` David Laight 0 siblings, 1 reply; 30+ messages in thread From: Ammar Faizi @ 2022-03-22 13:37 UTC (permalink / raw) To: Willy Tarreau Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev

On 3/22/22 8:34 PM, Willy Tarreau wrote:
>> I turned out GCC refuses to use "rm" if we compile without -fomit-frame-pointer
>> (e.g. without optimization / -O0). So I will still use "m" here.
>
> OK that's fine. then you can probably simplify it like this:
>
>	long _arg6 = (long)(arg6); /* Might be in memory */	\
>								\
>	asm volatile (						\
>		"pushl	%%ebp\n\t"				\
>		"movl	%[_arg6], %%ebp\n\t"			\
>		"int	$0x80\n\t"				\
>		"popl	%%ebp\n\t"				\
>		: "=a"(_ret)					\
>		: "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3),	\
>		  "r"(_arg4),"r"(_arg5), [_arg6]"m"(_arg6)	\
>		: "memory", "cc"				\
>	);							\
>
> See ? no more push, no more addl, direct load from memory.

Uggh... I wrote the same code you suggest earlier, but then I realized it is
buggy: %[_arg6] may live in N(%esp), i.e. be addressed relative to the stack
pointer.

When you pushl %ebp, %esp changes, so N(%esp) no longer points to the
6th argument.

--
Ammar Faizi

^ permalink raw reply [flat|nested] 30+ messages in thread
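The stale-offset problem described above can be illustrated with a toy model of the stack in portable C. This is a sketch only: it is not machine code, slots stand in for 4-byte words, and every name in it (`toy_stack`, `load_arg6_buggy`, `load_arg6_safe`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy downward-growing stack; esp is the index of the top-of-stack slot. */
struct toy_stack {
	long slots[16];
	size_t esp;
};

static void toy_push(struct toy_stack *s, long v)
{
	assert(s->esp > 0);
	s->slots[--s->esp] = v;
}

static long toy_pop(struct toy_stack *s)
{
	return s->slots[s->esp++];
}

/* Models a load from N(%esp): the slot n entries above the current top. */
static long toy_load(const struct toy_stack *s, size_t n)
{
	return s->slots[s->esp + n];
}

/* "pushl %ebp; movl N(%esp), %ebp": N was computed before the push. */
static long load_arg6_buggy(struct toy_stack *s, size_t arg6_off, long ebp)
{
	long loaded;

	toy_push(s, ebp);
	loaded = toy_load(s, arg6_off);	/* now one slot short of arg6 */
	toy_pop(s);
	return loaded;
}

/* "pushl arg6; pushl %ebp; movl 4(%esp), %ebp; ...; popl %ebp; addl $4, %esp" */
static long load_arg6_safe(struct toy_stack *s, size_t arg6_off, long ebp)
{
	long loaded;

	toy_push(s, toy_load(s, arg6_off));	/* operand read while N is still valid */
	toy_push(s, ebp);
	loaded = toy_load(s, 1);		/* fixed offset, independent of N */
	toy_pop(s);				/* popl %ebp */
	s->esp += 1;				/* addl $4, %esp */
	return loaded;
}
```

In the model, the buggy variant returns whatever sits one slot below the 6th argument, while the safe variant returns the argument itself and leaves the stack pointer where it started.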
* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 13:37 ` Ammar Faizi @ 2022-03-22 13:39 ` David Laight 2022-03-22 13:41 ` Willy Tarreau 0 siblings, 1 reply; 30+ messages in thread From: David Laight @ 2022-03-22 13:39 UTC (permalink / raw) To: 'Ammar Faizi', Willy Tarreau Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev From: Ammar Faizi > Sent: 22 March 2022 13:37 > > On 3/22/22 8:34 PM, Willy Tarreau wrote: > >> I turned out GCC refuses to use "rm" if we compile without -fomit-frame-pointer > >> (e.g. without optimization / -O0). So I will still use "m" here. > > > > OK that's fine. then you can probably simplify it like this: > > > > long _arg6 = (long)(arg6); /* Might be in memory */ \ > > \ > > asm volatile ( \ > > "pushl %%ebp\n\t" \ > > "movl %[_arg6], %%ebp\n\t" \ > > "int $0x80\n\t" \ > > "popl %%ebp\n\t" \ > > : "=a"(_ret) \ > > : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \ > > "r"(_arg4),"r"(_arg5), [_arg6]"m"(_arg6) \ > > : "memory", "cc" \ > > ); \ > > > > See ? no more push, no more addl, direct load from memory. > > Uggh... I crafted the same code like you suggested before, but then > I realized it's buggy, it's buggy because %[_arg6] may live in N(%esp). > > When you pushl %ebp, the %esp changes, N(%esp) no longer points to the > 6-th argument. Yep - that is why I wrote the 'push arg6'. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 13:39 ` David Laight @ 2022-03-22 13:41 ` Willy Tarreau 2022-03-22 13:45 ` Ammar Faizi 0 siblings, 1 reply; 30+ messages in thread From: Willy Tarreau @ 2022-03-22 13:41 UTC (permalink / raw) To: David Laight Cc: 'Ammar Faizi', Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev On Tue, Mar 22, 2022 at 01:39:41PM +0000, David Laight wrote: > From: Ammar Faizi > > Sent: 22 March 2022 13:37 > > > > On 3/22/22 8:34 PM, Willy Tarreau wrote: > > >> I turned out GCC refuses to use "rm" if we compile without -fomit-frame-pointer > > >> (e.g. without optimization / -O0). So I will still use "m" here. > > > > > > OK that's fine. then you can probably simplify it like this: > > > > > > long _arg6 = (long)(arg6); /* Might be in memory */ \ > > > \ > > > asm volatile ( \ > > > "pushl %%ebp\n\t" \ > > > "movl %[_arg6], %%ebp\n\t" \ > > > "int $0x80\n\t" \ > > > "popl %%ebp\n\t" \ > > > : "=a"(_ret) \ > > > : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \ > > > "r"(_arg4),"r"(_arg5), [_arg6]"m"(_arg6) \ > > > : "memory", "cc" \ > > > ); \ > > > > > > See ? no more push, no more addl, direct load from memory. > > > > Uggh... I crafted the same code like you suggested before, but then > > I realized it's buggy, it's buggy because %[_arg6] may live in N(%esp). > > > > When you pushl %ebp, the %esp changes, N(%esp) no longer points to the > > 6-th argument. > > Yep - that is why I wrote the 'push arg6'. Got it and you're right indeed, sorry for the noise :-) Willy ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 13:41 ` Willy Tarreau @ 2022-03-22 13:45 ` Ammar Faizi 2022-03-22 13:54 ` Ammar Faizi 0 siblings, 1 reply; 30+ messages in thread From: Ammar Faizi @ 2022-03-22 13:45 UTC (permalink / raw) To: Willy Tarreau, David Laight Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev

On 3/22/22 8:41 PM, Willy Tarreau wrote:
[...]
>>> When you pushl %ebp, the %esp changes, N(%esp) no longer points to the
>>> 6-th argument.
>>
>> Yep - that is why I wrote the 'push arg6'.
>
> Got it and you're right indeed, sorry for the noise :-)

Uggh... it seems I hit a GCC bug when playing with -m32 (32-bit code).
I am on Linux x86-64. Compiling without optimization causes GCC to get stuck
in an endless loop with 100% CPU usage. I will try to narrow it down and see
if I can create a simple reproducer for this issue.

ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ gcc --version
gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ time taskset -c 0 gcc -m32 -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc
^C

real	0m46.696s
user	0m0.000s
sys	0m0.002s
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ time taskset -c 0 gcc -O1 -m32 -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc

real	0m0.054s
user	0m0.046s
sys	0m0.008s
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ time taskset -c 0 gcc -O2 -m32 -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc

real	0m0.079s
user	0m0.067s
sys	0m0.012s
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ time taskset -c 0 gcc -O3 -m32 -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc

real	0m0.110s
user	0m0.097s
sys	0m0.013s
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$

--
Ammar Faizi

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 13:45 ` Ammar Faizi @ 2022-03-22 13:54 ` Ammar Faizi 2022-03-22 13:56 ` Ammar Faizi 0 siblings, 1 reply; 30+ messages in thread From: Ammar Faizi @ 2022-03-22 13:54 UTC (permalink / raw) To: Willy Tarreau, David Laight Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev

Willy, something is going wrong here...

ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ taskset -c 0 gcc -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc
/usr/bin/ld: /tmp/ccHiYiks.o: warning: relocation against `environ' in read-only section `.text'
/usr/bin/ld: /tmp/ccHiYiks.o: in function `getenv':
test.c:(.text+0x1f76): undefined reference to `environ'
/usr/bin/ld: test.c:(.text+0x1fc3): undefined reference to `environ'
/usr/bin/ld: test.c:(.text+0x1ffc): undefined reference to `environ'
/usr/bin/ld: test.c:(.text+0x2021): undefined reference to `environ'
/usr/bin/ld: test.c:(.text+0x2049): undefined reference to `environ'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE
collect2: error: ld returned 1 exit status
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$

I suspect it's caused by commit:

commit c970abe796019b3d576fd154a54b94efb35c02b1
Author: Willy Tarreau <w@1wt.eu>
Date: Mon Mar 21 18:33:08 2022 +0100

    tools/nolibc/stdlib: add a simple getenv() implementation

    This implementation relies on an extern definition of the environ
    variable, that the caller must declare and initialize from envp.

    Signed-off-by: Willy Tarreau <w@1wt.eu>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

I will take a deeper look at this...

--
Ammar Faizi

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 13:54 ` Ammar Faizi @ 2022-03-22 13:56 ` Ammar Faizi 2022-03-22 14:02 ` Willy Tarreau 0 siblings, 1 reply; 30+ messages in thread From: Ammar Faizi @ 2022-03-22 13:56 UTC (permalink / raw) To: Willy Tarreau, David Laight Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev On 3/22/22 8:54 PM, Ammar Faizi wrote: > > Willy, something goes wrong here... > > ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ taskset -c 0 gcc -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc > /usr/bin/ld: /tmp/ccHiYiks.o: warning: relocation against `environ' in read-only section `.text' > /usr/bin/ld: /tmp/ccHiYiks.o: in function `getenv': > test.c:(.text+0x1f76): undefined reference to `environ' > /usr/bin/ld: test.c:(.text+0x1fc3): undefined reference to `environ' > /usr/bin/ld: test.c:(.text+0x1ffc): undefined reference to `environ' > /usr/bin/ld: test.c:(.text+0x2021): undefined reference to `environ' > /usr/bin/ld: test.c:(.text+0x2049): undefined reference to `environ' > /usr/bin/ld: warning: creating DT_TEXTREL in a PIE > collect2: error: ld returned 1 exit status > ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ > > > I suspect it's caused by commit: > > commit c970abe796019b3d576fd154a54b94efb35c02b1 > Author: Willy Tarreau <w@1wt.eu> > Date: Mon Mar 21 18:33:08 2022 +0100 > > tools/nolibc/stdlib: add a simple getenv() implementation > This implementation relies on an extern definition of the environ > variable, that the caller must declare and initialize from envp. > Signed-off-by: Willy Tarreau <w@1wt.eu> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org> > > I will take a look deeper on this... This bug only exists when compiling without optimization. -- Ammar Faizi ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 13:56 ` Ammar Faizi @ 2022-03-22 14:02 ` Willy Tarreau 0 siblings, 0 replies; 30+ messages in thread From: Willy Tarreau @ 2022-03-22 14:02 UTC (permalink / raw) To: Ammar Faizi Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev On Tue, Mar 22, 2022 at 08:56:44PM +0700, Ammar Faizi wrote: > On 3/22/22 8:54 PM, Ammar Faizi wrote: > > > > Willy, something goes wrong here... > > > > ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ taskset -c 0 gcc -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc > > /usr/bin/ld: /tmp/ccHiYiks.o: warning: relocation against `environ' in read-only section `.text' > > /usr/bin/ld: /tmp/ccHiYiks.o: in function `getenv': > > test.c:(.text+0x1f76): undefined reference to `environ' > > /usr/bin/ld: test.c:(.text+0x1fc3): undefined reference to `environ' > > /usr/bin/ld: test.c:(.text+0x1ffc): undefined reference to `environ' > > /usr/bin/ld: test.c:(.text+0x2021): undefined reference to `environ' > > /usr/bin/ld: test.c:(.text+0x2049): undefined reference to `environ' > > /usr/bin/ld: warning: creating DT_TEXTREL in a PIE > > collect2: error: ld returned 1 exit status > > ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ > > > > > > I suspect it's caused by commit: > > > > commit c970abe796019b3d576fd154a54b94efb35c02b1 > > Author: Willy Tarreau <w@1wt.eu> > > Date: Mon Mar 21 18:33:08 2022 +0100 > > > > tools/nolibc/stdlib: add a simple getenv() implementation > > This implementation relies on an extern definition of the environ > > variable, that the caller must declare and initialize from envp. > > Signed-off-by: Willy Tarreau <w@1wt.eu> > > Signed-off-by: Paul E. McKenney <paulmck@kernel.org> > > > > I will take a look deeper on this... > > This bug only exists when compiling without optimization. 
Indeed, reproduced. I can bypass it by adding __attribute__((weak)) on the environ declaration in getenv(). Will send a patch later. Thanks, Willy ^ permalink raw reply [flat|nested] 30+ messages in thread
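A rough sketch of the kind of change Willy describes, assuming a hosted build rather than nolibc itself: `environ` is declared weak, so an object that never defines it should still link. The helper names `sketch_find_env` and `sketch_getenv` are hypothetical, and the NULL-address check relies on the usual ELF behaviour for undefined weak symbols; the actual patch may look different.

```c
#include <stddef.h>

/* Weak declaration: an undefined `environ` should no longer fail the link. */
extern char **environ __attribute__((weak));

/* Search an envp-style array for NAME and return the text after '='. */
static char *sketch_find_env(char **envp, const char *name)
{
	size_t i;

	if (!envp)
		return NULL;
	for (; *envp; envp++) {
		for (i = 0; name[i] && name[i] == (*envp)[i]; i++)
			;
		if (!name[i] && (*envp)[i] == '=')
			return *envp + i + 1;
	}
	return NULL;
}

static char *sketch_getenv(const char *name)
{
	/* On ELF, the address of an undefined weak symbol resolves to NULL. */
	if (&environ == NULL)
		return NULL;
	return sketch_find_env(environ, name);
}
```

The likely reason the failure only shows up without optimization is that the compiler keeps an unused getenv() at -O0, so its reference to `environ` survives to link time; that is an inference from the thread, not something it states.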
* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments 2022-03-22 12:13 ` Willy Tarreau 2022-03-22 13:26 ` Ammar Faizi @ 2022-03-22 13:37 ` David Laight 2022-03-22 14:47 ` Alviro Iskandar Setiawan 2022-03-23 6:29 ` Ammar Faizi 1 sibling, 2 replies; 30+ messages in thread From: David Laight @ 2022-03-22 13:37 UTC (permalink / raw) To: 'Willy Tarreau', Ammar Faizi Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev From: Willy Tarreau > Sent: 22 March 2022 12:14 > > On Tue, Mar 22, 2022 at 07:02:53PM +0700, Ammar Faizi wrote: > > I propose the > > following macro (this is not so much different with other my_syscall macro), > > expect the 6th argument can be in reg or mem. > > > > The "rm" constraint here gives the opportunity for the compiler to use %ebp > > instead of memory if -fomit-frame-pointer is turned on. > > > > #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \ > > ({ \ > > long _ret; \ > > register long _num asm("eax") = (num); \ > > register long _arg1 asm("ebx") = (long)(arg1); \ > > register long _arg2 asm("ecx") = (long)(arg2); \ > > register long _arg3 asm("edx") = (long)(arg3); \ > > register long _arg4 asm("esi") = (long)(arg4); \ > > register long _arg5 asm("edi") = (long)(arg5); \ > > long _arg6 = (long)(arg6); /* Might be in memory */ \ > > \ > > asm volatile ( \ > > "pushl %[_arg6]\n\t" \ > > "pushl %%ebp\n\t" \ > > "movl 4(%%esp), %%ebp\n\t" \ > > "int $0x80\n\t" \ > > "popl %%ebp\n\t" \ > > "addl $4,%%esp\n\t" \ > > : "=a"(_ret) \ > > : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \ > > "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6) \ > > : "memory", "cc" \ > > ); \ > > _ret; \ > > }) > > > > What do you think? > > Hmmm indeed that comes back to the existing constructs and is certainly > more in line with the rest of the code (plus it will not be affected by > -O0). I'd add an 'always_inline' to the function. 
That will force inline even with -O0. > I seem to remember a register allocation issue which kept me away from > implementing it this way on i386 back then, but given that my focus was > not as much on i386 as it was on other platforms, it's likely that I have > not insisted too much and not tried this one which looks like the way to > go to me. dunno, 'asm' register variables are rather more horrid and should probably only be used (for asm statements) when there aren't suitable register constraints. (I'm sure there is a comment about that in the gcc docs.) David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 30+ messages in thread
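For reference, the 'always_inline' suggestion would look roughly like the sketch below. The names `sketch_call6` and `sketch_syscall6` are hypothetical, and the body is a stand-in that just adds its arguments; in the real code the body would be the asm block.

```c
/*
 * Sketch: GCC and Clang inline an always_inline function even at -O0,
 * so the wrapper does not turn into an out-of-line call.
 */
static inline __attribute__((always_inline))
long sketch_call6(long nr, long a1, long a2, long a3, long a4, long a5, long a6)
{
	/* Stand-in body; a real version would hold the int $0x80 sequence. */
	return nr + a1 + a2 + a3 + a4 + a5 + a6;
}

/* Same shape as the my_syscall6() wrapper from the patch: cast and forward. */
#define sketch_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)	\
	sketch_call6((long)(num), (long)(arg1), (long)(arg2),		\
		     (long)(arg3), (long)(arg4), (long)(arg5),		\
		     (long)(arg6))
```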
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 13:37 ` David Laight
@ 2022-03-22 14:47 ` Alviro Iskandar Setiawan
  2022-03-22 15:11 ` David Laight
  2022-03-23  6:29 ` Ammar Faizi
  1 sibling, 1 reply; 30+ messages in thread
From: Alviro Iskandar Setiawan @ 2022-03-22 14:47 UTC
To: David Laight
Cc: Willy Tarreau, Ammar Faizi, Paul E. McKenney, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev

On Tue, Mar 22, 2022 at 8:37 PM David Laight wrote:
> dunno, 'asm' register variables are rather more horrid and
> should probably only be used (for asm statements) when there aren't
> suitable register constraints.
>
> (I'm sure there is a comment about that in the gcc docs.)

I can't find a comment that says so here:
https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html

The current code looks valid to me, but I would still prefer to use the
explicit register constraints, where available, instead of always using
"r"(var). No strong reason to reject the current code, tho. It still
looks good.

-- Viro
* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 14:47 ` Alviro Iskandar Setiawan
@ 2022-03-22 15:11 ` David Laight
  0 siblings, 0 replies; 30+ messages in thread
From: David Laight @ 2022-03-22 15:11 UTC
To: 'Alviro Iskandar Setiawan'
Cc: Willy Tarreau, Ammar Faizi, Paul E. McKenney, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev

From: Alviro Iskandar Setiawan
> Sent: 22 March 2022 14:48
>
> On Tue, Mar 22, 2022 at 8:37 PM David Laight wrote:
> > dunno, 'asm' register variables are rather more horrid and
> > should probably only be used (for asm statements) when there aren't
> > suitable register constraints.
> >
> > (I'm sure there is a comment about that in the gcc docs.)
>
> I don't find the comment that says so here:
> https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html

I've probably inferred it from:

"The only supported use for this feature is to specify registers for input and
output operands when calling Extended asm (see Extended Asm). This may be
necessary if the constraints for a particular machine don’t provide sufficient
control to select the desired register."

Here it isn't necessary because the required constraints exist.

	David
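The point about constraints can be sketched on a syscall that takes no arguments. On x86 the `a` constraint letter already names the accumulator register, so no `register ... asm("eax")` variable is needed. `raw_getpid` is a hypothetical helper, not a nolibc function; the x86_64 branch and the libc fallback are assumptions added only so the sketch builds outside i386.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE        /* for syscall() in <unistd.h> */
#endif
#include <sys/syscall.h>   /* SYS_getpid */
#include <unistd.h>        /* syscall(), used only by the fallback */

/* Hypothetical helper, not part of nolibc. */
static long raw_getpid(void)
{
#if defined(__i386__)
	long ret = SYS_getpid;

	/* "+a" pins the operand to %eax: syscall number in, result out. */
	__asm__ volatile ("int $0x80"
			  : "+a"(ret)
			  :
			  : "memory", "cc");
	return ret;
#elif defined(__x86_64__)
	long ret = SYS_getpid;

	/* "+a" pins the operand to %rax; syscall clobbers %rcx and %r11. */
	__asm__ volatile ("syscall"
			  : "+a"(ret)
			  :
			  : "rcx", "r11", "memory", "cc");
	return ret;
#else
	/* Assumption: on other architectures, defer to the libc wrapper. */
	return syscall(SYS_getpid);
#endif
}
```

Because the register is chosen by the constraint, the compiler sees an ordinary `long` operand and is free to place `ret` wherever it likes before and after the asm statement.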
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 13:37 ` David Laight
  2022-03-22 14:47 ` Alviro Iskandar Setiawan
@ 2022-03-23  6:29 ` Ammar Faizi
  2022-03-23  6:32 ` Ammar Faizi
  2022-03-23  7:10 ` Willy Tarreau
  1 sibling, 2 replies; 30+ messages in thread
From: Ammar Faizi @ 2022-03-23 6:29 UTC
To: David Laight, 'Willy Tarreau'
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev

On 3/22/22 8:37 PM, David Laight wrote:
> dunno, 'asm' register variables are rather more horrid and
> should probably only be used (for asm statements) when there aren't
> suitable register constraints.
>
> (I'm sure there is a comment about that in the gcc docs.)

^ Hey David, yes you're right, that is very interesting...

I hit a GCC bug when playing with the syscall6() implementation here.

Using register variables for all inputs of syscall6() causes GCC 11.2 to
get stuck in an endless loop with 100% CPU usage. This is reproducible
with several versions of GCC.

In GCC 6.3, the syscall6() implementation above yields an ICE (Internal
Compiler Error):
```
<source>: In function '__sys_mmap':
<source>:35:1: error: unable to find a register to spill
 }
 ^
<source>:35:1: error: this is the insn:
(insn 14 13 30 2 (set (reg:SI 95 [92])
        (mem/c:SI (plus:SI (reg/f:SI 16 argp)
                (const_int 28 [0x1c])) [1 offset+0 S4 A32])) <source>:33 86 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 16 argp)
        (nil)))
<source>:35: confused by earlier errors, bailing out
Compiler returned: 1
```
See the full show here: https://godbolt.org/z/dYeKaYWY3

With the appropriate constraints, it compiles nicely. It now looks like
this:
```
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
({ \
	long _eax = (long)(num); \
	long _arg6 = (long)(arg6); /* Always in memory */ \
	asm volatile ( \
		"pushl %[_arg6]\n\t" \
		"pushl %%ebp\n\t" \
		"movl 4(%%esp), %%ebp\n\t" \
		"int $0x80\n\t" \
		"popl %%ebp\n\t" \
		"addl $4,%%esp\n\t" \
		: "+a"(_eax)		/* %eax */ \
		: "b"(arg1),		/* %ebx */ \
		  "c"(arg2),		/* %ecx */ \
		  "d"(arg3),		/* %edx */ \
		  "S"(arg4),		/* %esi */ \
		  "D"(arg5),		/* %edi */ \
		  [_arg6]"m"(_arg6)	/* memory */ \
		: "memory", "cc" \
	); \
	_eax; \
})
```
Link: https://godbolt.org/z/ozGbYWbPY

Will use that in the next patchset version.

-- 
Ammar Faizi
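For illustration, here is one way the constraint-based macro might be exercised. This is a sketch under stated assumptions, not code from the patchset: `sketch_mmap_anon` is a hypothetical wrapper, the non-i386 definition of `my_syscall6` is a stand-in that defers to libc's `syscall()` so the example builds elsewhere, and the file offset is fixed at 0 so that `mmap` and `mmap2` (which counts the offset in pages) behave the same.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE        /* for syscall() and MAP_ANONYMOUS */
#endif
#include <sys/mman.h>      /* PROT_*, MAP_*, MAP_FAILED */
#include <sys/syscall.h>   /* SYS_mmap, SYS_mmap2 */
#include <unistd.h>        /* syscall(), used only by the fallback */

#if defined(__i386__)
/* The macro from the mail above: the 6th argument travels via the stack. */
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
({ \
	long _eax = (long)(num); \
	long _arg6 = (long)(arg6); \
	asm volatile ( \
		"pushl %[_arg6]\n\t" \
		"pushl %%ebp\n\t" \
		"movl 4(%%esp), %%ebp\n\t" \
		"int $0x80\n\t" \
		"popl %%ebp\n\t" \
		"addl $4,%%esp\n\t" \
		: "+a"(_eax) \
		: "b"(arg1), "c"(arg2), "d"(arg3), "S"(arg4), "D"(arg5), \
		  [_arg6]"m"(_arg6) \
		: "memory", "cc" \
	); \
	_eax; \
})
#else
/* Assumption: stand-in for other architectures, not part of the patch. */
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
	syscall((long)(num), (long)(arg1), (long)(arg2), (long)(arg3), \
		(long)(arg4), (long)(arg5), (long)(arg6))
#endif

/* Hypothetical wrapper: map len bytes of anonymous read/write memory. */
static void *sketch_mmap_anon(unsigned long len)
{
#ifdef SYS_mmap2
	long nr = SYS_mmap2;   /* 32-bit targets such as i386 */
#else
	long nr = SYS_mmap;
#endif
	long ret = my_syscall6(nr, 0, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	/* A raw syscall returns -errno; libc's syscall() returns -1. */
	if ((unsigned long)ret >= (unsigned long)-4095L)
		return MAP_FAILED;
	return (void *)ret;
}
```

`sketch_mmap_anon(4096)` should return a writable, zero-filled page, or `MAP_FAILED` if the kernel rejects the request. `mmap` is a natural test case here because it is one of the few syscalls that needs all six arguments.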
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-23  6:29 ` Ammar Faizi
@ 2022-03-23  6:32 ` Ammar Faizi
  2022-03-23  7:10 ` Willy Tarreau
  1 sibling, 0 replies; 30+ messages in thread
From: Ammar Faizi @ 2022-03-23 6:32 UTC
To: David Laight, 'Willy Tarreau'
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev

I have reported this bug to the GCC developers:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

-- 
Ammar Faizi
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-23  6:29 ` Ammar Faizi
  2022-03-23  6:32 ` Ammar Faizi
@ 2022-03-23  7:10 ` Willy Tarreau
  1 sibling, 0 replies; 30+ messages in thread
From: Willy Tarreau @ 2022-03-23 7:10 UTC
To: Ammar Faizi
Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org, llvm@lists.linux.dev

On Wed, Mar 23, 2022 at 01:29:39PM +0700, Ammar Faizi wrote:
> On 3/22/22 8:37 PM, David Laight wrote:
> > dunno, 'asm' register variables are rather more horrid and
> > should probably only be used (for asm statements) when there aren't
> > suitable register constraints.
> >
> > (I'm sure there is a comment about that in the gcc docs.)
>
> ^ Hey David, yes you're right, that is very interesting...
>
> I hit a GCC bug when playing with the syscall6() implementation here.
>
> Using register variables for all inputs of syscall6() causes GCC 11.2 to
> get stuck in an endless loop with 100% CPU usage. This is reproducible
> with several versions of GCC.
>
> In GCC 6.3, the syscall6() implementation above yields an ICE (Internal
> Compiler Error):
> ```
> <source>: In function '__sys_mmap':
> <source>:35:1: error: unable to find a register to spill

Now I'm pretty sure this was the issue I faced when I tried long ago.
I remember this error message from just before I decided it was wiser
to give up.

Willy
end of thread

Thread overview: 30+ messages
[not found] <20220322102115.186179-1-ammarfaizi2@gnuweeb.org>
2022-03-22 10:21 ` [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code Ammar Faizi
2022-03-22 17:09 ` Nick Desaulniers
2022-03-22 17:25 ` Willy Tarreau
2022-03-22 17:30 ` Nick Desaulniers
2022-03-22 17:58 ` Willy Tarreau
2022-03-22 18:07 ` Nick Desaulniers
2022-03-22 18:24 ` Willy Tarreau
2022-03-22 18:38 ` Nick Desaulniers
2022-03-22 10:21 ` [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments Ammar Faizi
2022-03-22 10:57 ` David Laight
2022-03-22 11:23 ` Willy Tarreau
2022-03-22 11:39 ` David Laight
2022-03-22 12:02 ` Ammar Faizi
2022-03-22 12:07 ` Ammar Faizi
2022-03-22 12:13 ` Willy Tarreau
2022-03-22 13:26 ` Ammar Faizi
2022-03-22 13:34 ` Willy Tarreau
2022-03-22 13:37 ` Ammar Faizi
2022-03-22 13:39 ` David Laight
2022-03-22 13:41 ` Willy Tarreau
2022-03-22 13:45 ` Ammar Faizi
2022-03-22 13:54 ` Ammar Faizi
2022-03-22 13:56 ` Ammar Faizi
2022-03-22 14:02 ` Willy Tarreau
2022-03-22 13:37 ` David Laight
2022-03-22 14:47 ` Alviro Iskandar Setiawan
2022-03-22 15:11 ` David Laight
2022-03-23 6:29 ` Ammar Faizi
2022-03-23 6:32 ` Ammar Faizi
2022-03-23 7:10 ` Willy Tarreau