* [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
[not found] <20220322102115.186179-1-ammarfaizi2@gnuweeb.org>
@ 2022-03-22 10:21 ` Ammar Faizi
2022-03-22 17:09 ` Nick Desaulniers
2022-03-22 10:21 ` [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments Ammar Faizi
1 sibling, 1 reply; 30+ messages in thread
From: Ammar Faizi @ 2022-03-22 10:21 UTC (permalink / raw)
To: Willy Tarreau
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, Ammar Faizi,
llvm, Nick Desaulniers
Building with clang yields the following error:
```
<inline asm>:3:1: error: _start changed binding to STB_GLOBAL
.global _start
^
1 error generated.
```
Make sure to specify only one of `.global _start` and `.weak _start`.
Remove `.global _start`.
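For illustration only (this is not part of the patch): a minimal sketch of weak binding seen from C. It assumes GCC or Clang on an ELF target, where `__attribute__((weak))` gives a definition weak binding, the same kind of binding that the `.weak` directive requests in the startup code. The names `demo_entry_name` and `demo_weak_check` are hypothetical. A weak symbol remains visible to the linker and is used unless a non-weak definition of the same name is linked in, which is presumably why `.weak _start` alone is sufficient.

```c
#include <assert.h>
#include <string.h>

/*
 * Weak definition (hypothetical name). If another object file linked into
 * the same program provides a non-weak demo_entry_name(), the linker picks
 * that one instead; otherwise this definition is used.
 */
__attribute__((weak)) const char *demo_entry_name(void)
{
	return "weak default";
}

/* Usage check: with no strong definition linked in, the weak one is called. */
static void demo_weak_check(void)
{
	assert(strcmp(demo_entry_name(), "weak default") == 0);
}
```

The sketch does not reproduce the clang error itself; it only shows that a weakly bound symbol can be resolved and called without any additional global declaration.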
Cc: llvm@lists.linux.dev
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---
@@ Changelog:
Link RFC v1: https://lore.kernel.org/llvm/20220320093750.159991-3-ammarfaizi2@gnuweeb.org
RFC v1 -> RFC v2:
- Remove all `.global _start` for all builds (GCC and Clang) instead of
removing all `.weak _start` for clang builds (comment from Willy).
---
tools/include/nolibc/arch-aarch64.h | 1 -
tools/include/nolibc/arch-arm.h | 1 -
tools/include/nolibc/arch-i386.h | 1 -
tools/include/nolibc/arch-mips.h | 1 -
tools/include/nolibc/arch-riscv.h | 1 -
tools/include/nolibc/arch-x86_64.h | 1 -
6 files changed, 6 deletions(-)
diff --git a/tools/include/nolibc/arch-aarch64.h b/tools/include/nolibc/arch-aarch64.h
index 87d9e434820c..2dbd80d633cb 100644
--- a/tools/include/nolibc/arch-aarch64.h
+++ b/tools/include/nolibc/arch-aarch64.h
@@ -184,7 +184,6 @@ struct sys_stat_struct {
/* startup code */
asm(".section .text\n"
".weak _start\n"
- ".global _start\n"
"_start:\n"
"ldr x0, [sp]\n" // argc (x0) was in the stack
"add x1, sp, 8\n" // argv (x1) = sp
diff --git a/tools/include/nolibc/arch-arm.h b/tools/include/nolibc/arch-arm.h
index 001a3c8c9ad5..1191395b5acd 100644
--- a/tools/include/nolibc/arch-arm.h
+++ b/tools/include/nolibc/arch-arm.h
@@ -177,7 +177,6 @@ struct sys_stat_struct {
/* startup code */
asm(".section .text\n"
".weak _start\n"
- ".global _start\n"
"_start:\n"
#if defined(__THUMBEB__) || defined(__THUMBEL__)
/* We enter here in 32-bit mode but if some previous functions were in
diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
index d7e4d53325a3..125a691fc631 100644
--- a/tools/include/nolibc/arch-i386.h
+++ b/tools/include/nolibc/arch-i386.h
@@ -176,7 +176,6 @@ struct sys_stat_struct {
*/
asm(".section .text\n"
".weak _start\n"
- ".global _start\n"
"_start:\n"
"pop %eax\n" // argc (first arg, %eax)
"mov %esp, %ebx\n" // argv[] (second arg, %ebx)
diff --git a/tools/include/nolibc/arch-mips.h b/tools/include/nolibc/arch-mips.h
index c9a6aac87c6d..1a124790c99f 100644
--- a/tools/include/nolibc/arch-mips.h
+++ b/tools/include/nolibc/arch-mips.h
@@ -192,7 +192,6 @@ struct sys_stat_struct {
asm(".section .text\n"
".weak __start\n"
".set nomips16\n"
- ".global __start\n"
".set noreorder\n"
".option pic0\n"
".ent __start\n"
diff --git a/tools/include/nolibc/arch-riscv.h b/tools/include/nolibc/arch-riscv.h
index bc10b7b5706d..511d67fc534e 100644
--- a/tools/include/nolibc/arch-riscv.h
+++ b/tools/include/nolibc/arch-riscv.h
@@ -185,7 +185,6 @@ struct sys_stat_struct {
/* startup code */
asm(".section .text\n"
".weak _start\n"
- ".global _start\n"
"_start:\n"
".option push\n"
".option norelax\n"
diff --git a/tools/include/nolibc/arch-x86_64.h b/tools/include/nolibc/arch-x86_64.h
index a7b70ea51b68..84c174181425 100644
--- a/tools/include/nolibc/arch-x86_64.h
+++ b/tools/include/nolibc/arch-x86_64.h
@@ -199,7 +199,6 @@ struct sys_stat_struct {
*/
asm(".section .text\n"
".weak _start\n"
- ".global _start\n"
"_start:\n"
"pop %rdi\n" // argc (first arg, %rdi)
"mov %rsp, %rsi\n" // argv[] (second arg, %rsi)
--
Ammar Faizi
* [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
[not found] <20220322102115.186179-1-ammarfaizi2@gnuweeb.org>
2022-03-22 10:21 ` [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code Ammar Faizi
@ 2022-03-22 10:21 ` Ammar Faizi
2022-03-22 10:57 ` David Laight
2022-03-22 11:39 ` David Laight
1 sibling, 2 replies; 30+ messages in thread
From: Ammar Faizi @ 2022-03-22 10:21 UTC (permalink / raw)
To: Willy Tarreau
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, Ammar Faizi,
x86, llvm, David Laight
On i386, the 6th argument of a syscall goes in %ebp. However, neither Clang
nor GCC can use %ebp in the clobber list or in the "r" constraint
without -fomit-frame-pointer. To make it always available for
any kind of compilation, the workaround below is implemented.
For clang (the asm statement can't clobber %ebp):
1) Push the 6th argument.
2) Push %ebp.
3) Load the 6th argument from 4(%esp) into %ebp.
4) Do the syscall (int $0x80).
5) Pop %ebp (restore the old value of %ebp).
6) Add 4 to %esp (restore the stack pointer).
For GCC, fortunately there is a #pragma that can force a specific function
to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
var is a variable bound to %ebp.
Cc: x86@kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/lkml/2e335ac54db44f1d8496583d97f9dab0@AcuMS.aculab.com
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---
@@ Changelog:
Link RFC v1: https://lore.kernel.org/llvm/20220320093750.159991-4-ammarfaizi2@gnuweeb.org
RFC v1 -> RFC v2:
- Fix %ebp saving method. Don't use redzone, i386 doesn't have a redzone
(comment from David and Alviro).
---
tools/include/nolibc/arch-i386.h | 66 ++++++++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
index 125a691fc631..9f4dc36e6ac2 100644
--- a/tools/include/nolibc/arch-i386.h
+++ b/tools/include/nolibc/arch-i386.h
@@ -167,6 +167,72 @@ struct sys_stat_struct {
_ret; \
})
+
+/*
+ * Neither Clang nor GCC can use %ebp in the clobber list or in the "r"
+ * constraint without -fomit-frame-pointer. To make it always
+ * available for any kind of compilation, the workaround below is
+ * implemented.
+ *
+ * For clang (the asm statement can't clobber %ebp):
+ * 1) Push the 6th argument.
+ * 2) Push %ebp.
+ * 3) Load the 6th argument from 4(%esp) into %ebp.
+ * 4) Do the syscall (int $0x80).
+ * 5) Pop %ebp (restore the old value of %ebp).
+ * 6) Add 4 to %esp (restore the stack pointer).
+ *
+ * For GCC, fortunately there is a #pragma that can force a specific function
+ * to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
+ * var is a variable bound to %ebp.
+ *
+ */
+#if defined(__clang__)
+static inline long ____do_syscall6(long eax, long ebx, long ecx, long edx,
+ long esi, long edi, long ebp)
+{
+ __asm__ volatile (
+ "pushl %[arg6]\n\t"
+ "pushl %%ebp\n\t"
+ "movl 4(%%esp), %%ebp\n\t"
+ "int $0x80\n\t"
+ "popl %%ebp\n\t"
+ "addl $4,%%esp\n\t"
+ : "=a"(eax)
+ : "a"(eax), "b"(ebx), "c"(ecx), "d"(edx), "S"(esi), "D"(edi),
+ [arg6]"m"(ebp)
+ : "memory", "cc"
+ );
+ return eax;
+}
+
+#else /* #if defined(__clang__) */
+#pragma GCC push_options
+#pragma GCC optimize "-fomit-frame-pointer"
+static long ____do_syscall6(long eax, long ebx, long ecx, long edx, long esi,
+ long edi, long ebp)
+{
+ register long __ebp __asm__("ebp") = ebp;
+ __asm__ volatile (
+ "int $0x80"
+ : "=a"(eax)
+ : "a"(eax), "b"(ebx), "c"(ecx), "d"(edx), "S"(esi), "D"(edi),
+ "r"(__ebp)
+ : "memory", "cc"
+ );
+ return eax;
+}
+#pragma GCC pop_options
+#endif /* #if defined(__clang__) */
+
+#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) ( \
+ ____do_syscall6((long)(num), (long)(arg1), \
+ (long)(arg2), (long)(arg3), \
+ (long)(arg4), (long)(arg5), \
+ (long)(arg6)) \
+)
+
+
/* startup code */
/*
* i386 System V ABI mandates:
--
Ammar Faizi
* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 10:21 ` [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments Ammar Faizi
@ 2022-03-22 10:57 ` David Laight
2022-03-22 11:23 ` Willy Tarreau
2022-03-22 11:39 ` David Laight
1 sibling, 1 reply; 30+ messages in thread
From: David Laight @ 2022-03-22 10:57 UTC (permalink / raw)
To: 'Ammar Faizi', Willy Tarreau
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
From: Ammar Faizi
> Sent: 22 March 2022 10:21
>
> On i386, the 6th argument of syscall goes in %ebp. However, both Clang
> and GCC cannot use %ebp in the clobber list and in the "r" constraint
> without using -fomit-frame-pointer. To make it always available for
> any kind of compilation, the below workaround is implemented.
>
> For clang (the Assembly statement can't clobber %ebp):
> 1) Push the 6-th argument.
> 2) Push %ebp.
> 3) Load the 6-th argument from 4(%esp) to %ebp.
> 4) Do the syscall (int $0x80).
> 5) Pop %ebp (restore the old value of %ebp).
> 6) Add %esp by 4 (undo the stack pointer).
>
> For GCC, fortunately it has a #pragma that can force a specific function
> to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
> var is a variable bound to %ebp.
You need to use the 'clang' pattern for gcc.
#pragma optimise is fundamentally broken.
What actually happens here is the 'inline' gets lost
(because of the implied -O0) and you get far worse code
than you might expect.
Since you need the 'clang' version, use it all the time.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 10:57 ` David Laight
@ 2022-03-22 11:23 ` Willy Tarreau
0 siblings, 0 replies; 30+ messages in thread
From: Willy Tarreau @ 2022-03-22 11:23 UTC (permalink / raw)
To: David Laight
Cc: 'Ammar Faizi', Paul E. McKenney, Alviro Iskandar Setiawan,
Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List,
x86@kernel.org, llvm@lists.linux.dev
On Tue, Mar 22, 2022 at 10:57:01AM +0000, David Laight wrote:
> From: Ammar Faizi
> > Sent: 22 March 2022 10:21
> >
> > On i386, the 6th argument of syscall goes in %ebp. However, both Clang
> > and GCC cannot use %ebp in the clobber list and in the "r" constraint
> > without using -fomit-frame-pointer. To make it always available for
> > any kind of compilation, the below workaround is implemented.
> >
> > For clang (the Assembly statement can't clobber %ebp):
> > 1) Push the 6-th argument.
> > 2) Push %ebp.
> > 3) Load the 6-th argument from 4(%esp) to %ebp.
> > 4) Do the syscall (int $0x80).
> > 5) Pop %ebp (restore the old value of %ebp).
> > 6) Add %esp by 4 (undo the stack pointer).
> >
> > For GCC, fortunately it has a #pragma that can force a specific function
> > to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
> > var is a variable bound to %ebp.
>
> You need to use the 'clang' pattern for gcc.
> #pragma optimise is fundamentally broken.
> What actually happens here is the 'inline' gets lost
> (because of the implied -O0) and you get far worse code
> than you might expect.
>
> Since you need the 'clang' version, use it all the time.
I clearly prefer it as well, it looks much cleaner!
Willy
* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 10:21 ` [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments Ammar Faizi
2022-03-22 10:57 ` David Laight
@ 2022-03-22 11:39 ` David Laight
2022-03-22 12:02 ` Ammar Faizi
1 sibling, 1 reply; 30+ messages in thread
From: David Laight @ 2022-03-22 11:39 UTC (permalink / raw)
To: 'Ammar Faizi', Willy Tarreau
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
From: Ammar Faizi
> Sent: 22 March 2022 10:21
> On i386, the 6th argument of syscall goes in %ebp. However, both Clang
> and GCC cannot use %ebp in the clobber list and in the "r" constraint
> without using -fomit-frame-pointer. To make it always available for
> any kind of compilation, the below workaround is implemented.
>
...
> diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
> index 125a691fc631..9f4dc36e6ac2 100644
> --- a/tools/include/nolibc/arch-i386.h
> +++ b/tools/include/nolibc/arch-i386.h
> @@ -167,6 +167,72 @@ struct sys_stat_struct {
> _ret; \
> })
>
> +
> +/*
> + * Both Clang and GCC cannot use %ebp in the clobber list and in the "r"
> + * constraint without using -fomit-frame-pointer. To make it always
> + * available for any kind of compilation, the below workaround is
> + * implemented.
> + *
> + * For clang (the Assembly statement can't clobber %ebp):
> + * 1) Push the 6-th argument.
> + * 2) Push %ebp.
> + * 3) Load the 6-th argument from 4(%esp) to %ebp.
> + * 4) Do the syscall (int $0x80).
> + * 5) Pop %ebp (restore the old value of %ebp).
> + * 6) Add %esp by 4 (undo the stack pointer).
> + *
> + * For GCC, fortunately it has a #pragma that can force a specific function
> + * to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
> + * var is a variable bound to %ebp.
> + *
> + */
> +#if defined(__clang__)
> +static inline long ____do_syscall6(long eax, long ebx, long ecx, long edx,
> + long esi, long edi, long ebp)
That should probably be:
static inline long ____do_syscall6(long nr, long arg1, long arg2, long arg3,
long arg4, long arg5, long arg6)
and the input constraints changed to match.
> +{
> + __asm__ volatile (
> + "pushl %[arg6]\n\t"
> + "pushl %%ebp\n\t"
> + "movl 4(%%esp), %%ebp\n\t"
> + "int $0x80\n\t"
> + "popl %%ebp\n\t"
> + "addl $4,%%esp\n\t"
> + : "=a"(eax)
> + : "a"(eax), "b"(ebx), "c"(ecx), "d"(edx), "S"(esi), "D"(edi),
Does having "=a" for an output constraint and "a" for an input
constraint actually DTRT?
There is a special syntax for tying input and output to
the same register.
Or you could use "+a"(nr_rval) and 'return nr_rval'.
David
> + [arg6]"m"(ebp)
> + : "memory", "cc"
> + );
> + return eax;
> +}
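For illustration only (not taken from the thread): a minimal sketch of the two alternatives mentioned above for using one register as both input and output, assuming GCC or Clang extended asm. The function names are hypothetical, and a plain C fallback is included so the sketch also builds on non-x86 targets. The first variant uses a matching constraint ("0") to tie the input to output operand 0; the second uses the "+" modifier to mark a single operand as read and written.

```c
#include <assert.h>

/* Variant 1: the matching constraint "0" ties the input to output operand 0. */
static long demo_inc_matching(long in)
{
	long out;
#if defined(__i386__) || defined(__x86_64__)
	__asm__ volatile ("inc %0" : "=a"(out) : "0"(in) : "cc");
#else
	out = in + 1; /* portable fallback so the sketch builds elsewhere */
#endif
	return out;
}

/* Variant 2: "+a" marks one operand as both read and written. */
static long demo_inc_readwrite(long val)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ volatile ("inc %0" : "+a"(val) : : "cc");
#else
	val = val + 1; /* portable fallback so the sketch builds elsewhere */
#endif
	return val;
}

/* Usage check: both variants compute the same result. */
static void demo_constraint_check(void)
{
	assert(demo_inc_matching(41) == 42);
	assert(demo_inc_readwrite(41) == 42);
}
```

In a syscall wrapper the same pattern would let the syscall number go in and the return value come out through one operand.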
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 11:39 ` David Laight
@ 2022-03-22 12:02 ` Ammar Faizi
2022-03-22 12:07 ` Ammar Faizi
2022-03-22 12:13 ` Willy Tarreau
0 siblings, 2 replies; 30+ messages in thread
From: Ammar Faizi @ 2022-03-22 12:02 UTC (permalink / raw)
To: David Laight, Willy Tarreau
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
On 3/22/22 6:39 PM, David Laight wrote:
>> + __asm__ volatile (
>> + "pushl %[arg6]\n\t"
>> + "pushl %%ebp\n\t"
>> + "movl 4(%%esp), %%ebp\n\t"
>> + "int $0x80\n\t"
>> + "popl %%ebp\n\t"
>> + "addl $4,%%esp\n\t"
>> + : "=a"(eax)
>> + : "a"(eax), "b"(ebx), "c"(ecx), "d"(edx), "S"(esi), "D"(edi),
>
> Does having "=a" for an output constraint and "a" for an input
> constraint actually DTRT?
> There is a special syntax for tying input and output to
> the same register.
> Or you could use "+a"(nr_rval) and 'return nr_rval'.
Well, I agree with your previous email. Since we no longer use #pragma
optimize with -fomit-frame-pointer, the function is not needed. I propose the
following macro (it is not much different from the other my_syscall macros),
except that the 6th argument can be in a register or in memory.
The "rm" constraint here gives the opportunity for the compiler to use %ebp
instead of memory if -fomit-frame-pointer is turned on.
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
({ \
long _ret; \
register long _num asm("eax") = (num); \
register long _arg1 asm("ebx") = (long)(arg1); \
register long _arg2 asm("ecx") = (long)(arg2); \
register long _arg3 asm("edx") = (long)(arg3); \
register long _arg4 asm("esi") = (long)(arg4); \
register long _arg5 asm("edi") = (long)(arg5); \
long _arg6 = (long)(arg6); /* Might be in memory */ \
\
asm volatile ( \
"pushl %[_arg6]\n\t" \
"pushl %%ebp\n\t" \
"movl 4(%%esp), %%ebp\n\t" \
"int $0x80\n\t" \
"popl %%ebp\n\t" \
"addl $4,%%esp\n\t" \
: "=a"(_ret) \
: "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \
"r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6) \
: "memory", "cc" \
); \
_ret; \
})
What do you think?
--
Ammar Faizi
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 12:02 ` Ammar Faizi
@ 2022-03-22 12:07 ` Ammar Faizi
2022-03-22 12:13 ` Willy Tarreau
1 sibling, 0 replies; 30+ messages in thread
From: Ammar Faizi @ 2022-03-22 12:07 UTC (permalink / raw)
To: David Laight, Willy Tarreau
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
On 3/22/22 7:02 PM, Ammar Faizi wrote:
> Well, I agree with your previous email. Now since we no longer use a #pragma
> optimize with -fomit-frame-pointer, the function is not needed. I propose the
> following macro (it is not much different from the other my_syscall macros),
> except that the 6th argument can be in a register or in memory.
>
> The "rm" constraint here gives the opportunity for the compiler to use %ebp
> instead of memory if -fomit-frame-pointer is turned on.
>
> #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
> ({ \
> long _ret; \
> register long _num asm("eax") = (num); \
> register long _arg1 asm("ebx") = (long)(arg1); \
> register long _arg2 asm("ecx") = (long)(arg2); \
> register long _arg3 asm("edx") = (long)(arg3); \
> register long _arg4 asm("esi") = (long)(arg4); \
> register long _arg5 asm("edi") = (long)(arg5); \
> long _arg6 = (long)(arg6); /* Might be in memory */ \
> \
> asm volatile ( \
> "pushl %[_arg6]\n\t" \
> "pushl %%ebp\n\t" \
> "movl 4(%%esp), %%ebp\n\t" \
> "int $0x80\n\t" \
> "popl %%ebp\n\t" \
> "addl $4,%%esp\n\t" \
> : "=a"(_ret) \
> : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \
> "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6) \
> : "memory", "cc" \
> ); \
> _ret; \
> })
>
> What do you think?
>
For the following code:
int main()
{
mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
return 0;
}
GCC generates this:
00001000 <main>:
1000: push %ebp
1001: mov $0xc0,%eax
1006: mov $0x1000,%ecx
100b: mov $0x3,%edx
1010: push %edi
1011: xor %ebp,%ebp
1013: mov $0xffffffff,%edi
1018: push %esi
1019: mov $0x22,%esi
101e: push %ebx
101f: xor %ebx,%ebx
1021: push %ebp <--- arg6 here
1022: push %ebp
1023: mov 0x4(%esp),%ebp
1027: int $0x80
1029: pop %ebp
102a: add $0x4,%esp
102d: xor %eax,%eax
102f: pop %ebx
1030: pop %esi
1031: pop %edi
1032: pop %ebp
1033: ret
--
Ammar Faizi
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 12:02 ` Ammar Faizi
2022-03-22 12:07 ` Ammar Faizi
@ 2022-03-22 12:13 ` Willy Tarreau
2022-03-22 13:26 ` Ammar Faizi
2022-03-22 13:37 ` David Laight
1 sibling, 2 replies; 30+ messages in thread
From: Willy Tarreau @ 2022-03-22 12:13 UTC (permalink / raw)
To: Ammar Faizi
Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
On Tue, Mar 22, 2022 at 07:02:53PM +0700, Ammar Faizi wrote:
> I propose the
> following macro (it is not much different from the other my_syscall macros),
> except that the 6th argument can be in a register or in memory.
>
> The "rm" constraint here gives the opportunity for the compiler to use %ebp
> instead of memory if -fomit-frame-pointer is turned on.
>
> #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
> ({ \
> long _ret; \
> register long _num asm("eax") = (num); \
> register long _arg1 asm("ebx") = (long)(arg1); \
> register long _arg2 asm("ecx") = (long)(arg2); \
> register long _arg3 asm("edx") = (long)(arg3); \
> register long _arg4 asm("esi") = (long)(arg4); \
> register long _arg5 asm("edi") = (long)(arg5); \
> long _arg6 = (long)(arg6); /* Might be in memory */ \
> \
> asm volatile ( \
> "pushl %[_arg6]\n\t" \
> "pushl %%ebp\n\t" \
> "movl 4(%%esp), %%ebp\n\t" \
> "int $0x80\n\t" \
> "popl %%ebp\n\t" \
> "addl $4,%%esp\n\t" \
> : "=a"(_ret) \
> : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \
> "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6) \
> : "memory", "cc" \
> ); \
> _ret; \
> })
>
> What do you think?
Hmmm indeed that comes back to the existing constructs and is certainly
more in line with the rest of the code (plus it will not be affected by
-O0).
I seem to remember a register allocation issue which kept me away from
implementing it this way on i386 back then, but given that my focus was
not as much on i386 as it was on other platforms, it's likely that I have
not insisted too much and not tried this one which looks like the way to
go to me.
Willy
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 12:13 ` Willy Tarreau
@ 2022-03-22 13:26 ` Ammar Faizi
2022-03-22 13:34 ` Willy Tarreau
2022-03-22 13:37 ` David Laight
1 sibling, 1 reply; 30+ messages in thread
From: Ammar Faizi @ 2022-03-22 13:26 UTC (permalink / raw)
To: Willy Tarreau
Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
On 3/22/22 7:13 PM, Willy Tarreau wrote:
> On Tue, Mar 22, 2022 at 07:02:53PM +0700, Ammar Faizi wrote:
>> I propose the
>> following macro (it is not much different from the other my_syscall macros),
>> except that the 6th argument can be in a register or in memory.
>>
>> The "rm" constraint here gives the opportunity for the compiler to use %ebp
>> instead of memory if -fomit-frame-pointer is turned on.
>>
>> #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
>> ({ \
>> long _ret; \
>> register long _num asm("eax") = (num); \
>> register long _arg1 asm("ebx") = (long)(arg1); \
>> register long _arg2 asm("ecx") = (long)(arg2); \
>> register long _arg3 asm("edx") = (long)(arg3); \
>> register long _arg4 asm("esi") = (long)(arg4); \
>> register long _arg5 asm("edi") = (long)(arg5); \
>> long _arg6 = (long)(arg6); /* Might be in memory */ \
>> \
>> asm volatile ( \
>> "pushl %[_arg6]\n\t" \
>> "pushl %%ebp\n\t" \
>> "movl 4(%%esp), %%ebp\n\t" \
>> "int $0x80\n\t" \
>> "popl %%ebp\n\t" \
>> "addl $4,%%esp\n\t" \
>> : "=a"(_ret) \
>> : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \
>> "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6) \
>> : "memory", "cc" \
>> ); \
>> _ret; \
>> })
>>
>> What do you think?
>
> Hmmm indeed that comes back to the existing constructs and is certainly
> more in line with the rest of the code (plus it will not be affected by
> -O0).
>
> I seem to remember a register allocation issue which kept me away from
> implementing it this way on i386 back then, but given that my focus was
> not as much on i386 as it was on other platforms, it's likely that I have
> not insisted too much and not tried this one which looks like the way to
> go to me.
It turned out that GCC refuses to use "rm" if we compile without -fomit-frame-pointer
(e.g. without optimization / -O0). So I will still use "m" here.
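For illustration only (this is not the proposed macro): a minimal sketch of what the "m" constraint does on its own, assuming GCC or Clang extended asm. With "m", the compiler supplies the operand as a memory reference, so the asm template can load it with an ordinary `mov`. The function names are hypothetical, and a plain C fallback keeps the sketch buildable on non-x86 targets.

```c
#include <assert.h>

/* Load a value through a memory operand chosen by the compiler. */
static long demo_load_from_memory(long value)
{
	long out;
#if defined(__i386__) || defined(__x86_64__)
	__asm__ ("mov %[src], %[dst]"
		 : [dst] "=r"(out)   /* any general-purpose register */
		 : [src] "m"(value)); /* operand must live in memory */
#else
	out = value; /* portable fallback so the sketch builds elsewhere */
#endif
	return out;
}

/* Usage check: the value read back through memory is unchanged. */
static void demo_mem_constraint_check(void)
{
	assert(demo_load_from_memory(0x1000) == 0x1000);
	assert(demo_load_from_memory(-1) == -1);
}
```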
--
Ammar Faizi
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 13:26 ` Ammar Faizi
@ 2022-03-22 13:34 ` Willy Tarreau
2022-03-22 13:37 ` Ammar Faizi
0 siblings, 1 reply; 30+ messages in thread
From: Willy Tarreau @ 2022-03-22 13:34 UTC (permalink / raw)
To: Ammar Faizi
Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
On Tue, Mar 22, 2022 at 08:26:37PM +0700, Ammar Faizi wrote:
> On 3/22/22 7:13 PM, Willy Tarreau wrote:
> > On Tue, Mar 22, 2022 at 07:02:53PM +0700, Ammar Faizi wrote:
> > > I propose the
> > > following macro (it is not much different from the other my_syscall macros),
> > > except that the 6th argument can be in a register or in memory.
> > >
> > > The "rm" constraint here gives the opportunity for the compiler to use %ebp
> > > instead of memory if -fomit-frame-pointer is turned on.
> > >
> > > #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
> > > ({ \
> > > long _ret; \
> > > register long _num asm("eax") = (num); \
> > > register long _arg1 asm("ebx") = (long)(arg1); \
> > > register long _arg2 asm("ecx") = (long)(arg2); \
> > > register long _arg3 asm("edx") = (long)(arg3); \
> > > register long _arg4 asm("esi") = (long)(arg4); \
> > > register long _arg5 asm("edi") = (long)(arg5); \
> > > long _arg6 = (long)(arg6); /* Might be in memory */ \
> > > \
> > > asm volatile ( \
> > > "pushl %[_arg6]\n\t" \
> > > "pushl %%ebp\n\t" \
> > > "movl 4(%%esp), %%ebp\n\t" \
> > > "int $0x80\n\t" \
> > > "popl %%ebp\n\t" \
> > > "addl $4,%%esp\n\t" \
> > > : "=a"(_ret) \
> > > : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \
> > > "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6) \
> > > : "memory", "cc" \
> > > ); \
> > > _ret; \
> > > })
> > >
> > > What do you think?
> >
> > Hmmm indeed that comes back to the existing constructs and is certainly
> > more in line with the rest of the code (plus it will not be affected by
> > -O0).
> >
> > I seem to remember a register allocation issue which kept me away from
> > implementing it this way on i386 back then, but given that my focus was
> > not as much on i386 as it was on other platforms, it's likely that I have
> > not insisted too much and not tried this one which looks like the way to
> > go to me.
>
> It turned out that GCC refuses to use "rm" if we compile without -fomit-frame-pointer
> (e.g. without optimization / -O0). So I will still use "m" here.
OK that's fine. then you can probably simplify it like this:
long _arg6 = (long)(arg6); /* Might be in memory */ \
\
asm volatile ( \
"pushl %%ebp\n\t" \
"movl %[_arg6], %%ebp\n\t" \
"int $0x80\n\t" \
"popl %%ebp\n\t" \
: "=a"(_ret) \
: "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \
"r"(_arg4),"r"(_arg5), [_arg6]"m"(_arg6) \
: "memory", "cc" \
); \
See ? no more push, no more addl, direct load from memory.
Willy
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 13:34 ` Willy Tarreau
@ 2022-03-22 13:37 ` Ammar Faizi
2022-03-22 13:39 ` David Laight
0 siblings, 1 reply; 30+ messages in thread
From: Ammar Faizi @ 2022-03-22 13:37 UTC (permalink / raw)
To: Willy Tarreau
Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
On 3/22/22 8:34 PM, Willy Tarreau wrote:
>> It turned out that GCC refuses to use "rm" if we compile without -fomit-frame-pointer
>> (e.g. without optimization / -O0). So I will still use "m" here.
>
> OK that's fine. then you can probably simplify it like this:
>
> long _arg6 = (long)(arg6); /* Might be in memory */ \
> \
> asm volatile ( \
> "pushl %%ebp\n\t" \
> "movl %[_arg6], %%ebp\n\t" \
> "int $0x80\n\t" \
> "popl %%ebp\n\t" \
> : "=a"(_ret) \
> : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \
> "r"(_arg4),"r"(_arg5), [_arg6]"m"(_arg6) \
> : "memory", "cc" \
> ); \
>
> See ? no more push, no more addl, direct load from memory.
Uggh... I had crafted the same code as you suggest, but then
I realized it's buggy, because %[_arg6] may live in N(%esp).
When you pushl %ebp, %esp changes, so N(%esp) no longer points to the
6th argument.
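For illustration only: the problem described above can be sketched with a tiny stack simulation in plain C. Everything here is hypothetical (the `demo_` names, the 16-slot stack, the starting offsets); it only models `pushl`, `popl` and an %esp-relative load. It relies on the x86 rule that a `pushl` with an %esp-relative memory operand forms the operand address before %esp is decremented, which is why pushing the argument first is safe.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_STACK_WORDS 16

struct demo_cpu {
	uint32_t stack[DEMO_STACK_WORDS]; /* simulated stack, one slot per 4 bytes */
	uint32_t esp;                     /* simulated %esp, byte offset into stack[] */
	uint32_t ebp;                     /* simulated %ebp */
};

/* movl off(%esp), <reg>: read the slot at %esp + off */
static uint32_t demo_load(const struct demo_cpu *c, uint32_t off)
{
	uint32_t idx = (c->esp + off) / 4;

	assert(idx < DEMO_STACK_WORDS);
	return c->stack[idx];
}

/* pushl <val>: decrement %esp by 4, then store val at (%esp) */
static void demo_push(struct demo_cpu *c, uint32_t val)
{
	assert(c->esp >= 4);
	c->esp -= 4;
	c->stack[c->esp / 4] = val;
}

/* popl <reg>: read (%esp), then increment %esp by 4 */
static uint32_t demo_pop(struct demo_cpu *c)
{
	uint32_t val = demo_load(c, 0);

	c->esp += 4;
	return val;
}

/* Fill the stack with recognizable junk and place arg6 at off(%esp). */
static void demo_init(struct demo_cpu *c, uint32_t off, uint32_t arg6)
{
	for (uint32_t i = 0; i < DEMO_STACK_WORDS; i++)
		c->stack[i] = 0xbad00000u + i;
	c->esp = 32;
	c->ebp = 0xf00dcafeu; /* stands in for the caller's frame pointer */
	c->stack[(c->esp + off) / 4] = arg6;
}

/* pushl %ebp; movl off(%esp), %ebp; [int $0x80]; popl %ebp */
static uint32_t demo_single_push(struct demo_cpu *c, uint32_t off)
{
	uint32_t seen;

	demo_push(c, c->ebp);
	seen = demo_load(c, off); /* %esp moved, so this reads the wrong slot */
	c->ebp = demo_pop(c);
	return seen;
}

/* pushl off(%esp); pushl %ebp; movl 4(%esp), %ebp; [int $0x80];
 * popl %ebp; addl $4, %esp */
static uint32_t demo_double_push(struct demo_cpu *c, uint32_t off)
{
	uint32_t seen;

	demo_push(c, demo_load(c, off)); /* operand is read before %esp changes */
	demo_push(c, c->ebp);
	seen = demo_load(c, 4);
	c->ebp = demo_pop(c);
	c->esp += 4;
	return seen;
}
```

With the argument placed at 8(%esp), `demo_single_push` returns a junk slot instead of the argument, while `demo_double_push` returns the argument and leaves the simulated %esp and %ebp as they were.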
--
Ammar Faizi
* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 12:13 ` Willy Tarreau
2022-03-22 13:26 ` Ammar Faizi
@ 2022-03-22 13:37 ` David Laight
2022-03-22 14:47 ` Alviro Iskandar Setiawan
2022-03-23 6:29 ` Ammar Faizi
1 sibling, 2 replies; 30+ messages in thread
From: David Laight @ 2022-03-22 13:37 UTC (permalink / raw)
To: 'Willy Tarreau', Ammar Faizi
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
From: Willy Tarreau
> Sent: 22 March 2022 12:14
>
> On Tue, Mar 22, 2022 at 07:02:53PM +0700, Ammar Faizi wrote:
> > I propose the
> > following macro (it is not much different from the other my_syscall macros),
> > except that the 6th argument can be in a register or in memory.
> >
> > The "rm" constraint here gives the opportunity for the compiler to use %ebp
> > instead of memory if -fomit-frame-pointer is turned on.
> >
> > #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
> > ({ \
> > long _ret; \
> > register long _num asm("eax") = (num); \
> > register long _arg1 asm("ebx") = (long)(arg1); \
> > register long _arg2 asm("ecx") = (long)(arg2); \
> > register long _arg3 asm("edx") = (long)(arg3); \
> > register long _arg4 asm("esi") = (long)(arg4); \
> > register long _arg5 asm("edi") = (long)(arg5); \
> > long _arg6 = (long)(arg6); /* Might be in memory */ \
> > \
> > asm volatile ( \
> > "pushl %[_arg6]\n\t" \
> > "pushl %%ebp\n\t" \
> > "movl 4(%%esp), %%ebp\n\t" \
> > "int $0x80\n\t" \
> > "popl %%ebp\n\t" \
> > "addl $4,%%esp\n\t" \
> > : "=a"(_ret) \
> > : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \
> > "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6) \
> > : "memory", "cc" \
> > ); \
> > _ret; \
> > })
> >
> > What do you think?
>
> Hmmm indeed that comes back to the existing constructs and is certainly
> more in line with the rest of the code (plus it will not be affected by
> -O0).
I'd add an 'always_inline' to the function.
That will force inline even with -O0.
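
A minimal sketch of what such a function could look like, assuming the
GCC/Clang attribute syntax. The helper name and its error convention are
hypothetical and simplified; they are not taken from nolibc:

```c
/*
 * Hypothetical helper: fold a raw kernel return value into the usual
 * "-1 plus an error code" convention. The always_inline attribute asks
 * the compiler to inline it even when optimization is disabled (-O0).
 */
static inline __attribute__((always_inline))
long sketch_syscall_ret(long raw, int *err)
{
	if (raw < 0) {
		*err = (int)-raw;	/* store the positive error code */
		return -1;
	}
	return raw;
}
```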
> I seem to remember a register allocation issue which kept me away from
> implementing it this way on i386 back then, but given that my focus was
> not as much on i386 as it was on other platforms, it's likely that I have
> not insisted too much and not tried this one which looks like the way to
> go to me.
dunno, 'asm' register variables are rather more horrid and
should probably only be used (for asm statements) when there aren't
suitable register constraints.
(I'm sure there is a comment about that in the gcc docs.)
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 13:37 ` Ammar Faizi
@ 2022-03-22 13:39 ` David Laight
2022-03-22 13:41 ` Willy Tarreau
0 siblings, 1 reply; 30+ messages in thread
From: David Laight @ 2022-03-22 13:39 UTC (permalink / raw)
To: 'Ammar Faizi', Willy Tarreau
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
From: Ammar Faizi
> Sent: 22 March 2022 13:37
>
> On 3/22/22 8:34 PM, Willy Tarreau wrote:
> >> It turned out GCC refuses to use "rm" if we compile without -fomit-frame-pointer
> >> (e.g. without optimization / -O0). So I will still use "m" here.
> >
> > OK that's fine. then you can probably simplify it like this:
> >
> > long _arg6 = (long)(arg6); /* Might be in memory */ \
> > \
> > asm volatile ( \
> > "pushl %%ebp\n\t" \
> > "movl %[_arg6], %%ebp\n\t" \
> > "int $0x80\n\t" \
> > "popl %%ebp\n\t" \
> > : "=a"(_ret) \
> > : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \
> > "r"(_arg4),"r"(_arg5), [_arg6]"m"(_arg6) \
> > : "memory", "cc" \
> > ); \
> >
> > See ? no more push, no more addl, direct load from memory.
>
> Uggh... I crafted the same code like you suggested before, but then
> I realized it's buggy, it's buggy because %[_arg6] may live in N(%esp).
>
> When you pushl %ebp, the %esp changes, N(%esp) no longer points to the
> 6-th argument.
Yep - that is why I wrote the 'push arg6'.
David
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 13:39 ` David Laight
@ 2022-03-22 13:41 ` Willy Tarreau
2022-03-22 13:45 ` Ammar Faizi
0 siblings, 1 reply; 30+ messages in thread
From: Willy Tarreau @ 2022-03-22 13:41 UTC (permalink / raw)
To: David Laight
Cc: 'Ammar Faizi', Paul E. McKenney, Alviro Iskandar Setiawan,
Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List,
x86@kernel.org, llvm@lists.linux.dev
On Tue, Mar 22, 2022 at 01:39:41PM +0000, David Laight wrote:
> From: Ammar Faizi
> > Sent: 22 March 2022 13:37
> >
> > On 3/22/22 8:34 PM, Willy Tarreau wrote:
> > >> It turned out GCC refuses to use "rm" if we compile without -fomit-frame-pointer
> > >> (e.g. without optimization / -O0). So I will still use "m" here.
> > >
> > > OK that's fine. then you can probably simplify it like this:
> > >
> > > long _arg6 = (long)(arg6); /* Might be in memory */ \
> > > \
> > > asm volatile ( \
> > > "pushl %%ebp\n\t" \
> > > "movl %[_arg6], %%ebp\n\t" \
> > > "int $0x80\n\t" \
> > > "popl %%ebp\n\t" \
> > > : "=a"(_ret) \
> > > : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3), \
> > > "r"(_arg4),"r"(_arg5), [_arg6]"m"(_arg6) \
> > > : "memory", "cc" \
> > > ); \
> > >
> > > See ? no more push, no more addl, direct load from memory.
> >
> > Uggh... I crafted the same code like you suggested before, but then
> > I realized it's buggy, it's buggy because %[_arg6] may live in N(%esp).
> >
> > When you pushl %ebp, the %esp changes, N(%esp) no longer points to the
> > 6-th argument.
>
> Yep - that is why I wrote the 'push arg6'.
Got it and you're right indeed, sorry for the noise :-)
Willy
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 13:41 ` Willy Tarreau
@ 2022-03-22 13:45 ` Ammar Faizi
2022-03-22 13:54 ` Ammar Faizi
0 siblings, 1 reply; 30+ messages in thread
From: Ammar Faizi @ 2022-03-22 13:45 UTC (permalink / raw)
To: Willy Tarreau, David Laight
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
On 3/22/22 8:41 PM, Willy Tarreau wrote:
[...]
>>> When you pushl %ebp, the %esp changes, N(%esp) no longer points to the
>>> 6-th argument.
>>
>> Yep - that is why I wrote the 'push arg6'.
>
> Got it and you're right indeed, sorry for the noise :-)
Uggh... it seems I hit a GCC bug when playing with -m32 (32-bit code).
I am on Linux x86-64. Compiling without optimization causes GCC to get stuck
in an endless loop at 100% CPU usage.
I will try to narrow it down and see if I can create a simple reproducer
for this issue.
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ gcc --version
gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ time taskset -c 0 gcc -m32 -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc
^C
real 0m46.696s
user 0m0.000s
sys 0m0.002s
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ time taskset -c 0 gcc -O1 -m32 -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc
real 0m0.054s
user 0m0.046s
sys 0m0.008s
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ time taskset -c 0 gcc -O2 -m32 -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc
real 0m0.079s
user 0m0.067s
sys 0m0.012s
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ time taskset -c 0 gcc -O3 -m32 -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc
real 0m0.110s
user 0m0.097s
sys 0m0.013s
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$
--
Ammar Faizi
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 13:45 ` Ammar Faizi
@ 2022-03-22 13:54 ` Ammar Faizi
2022-03-22 13:56 ` Ammar Faizi
0 siblings, 1 reply; 30+ messages in thread
From: Ammar Faizi @ 2022-03-22 13:54 UTC (permalink / raw)
To: Willy Tarreau, David Laight
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
Willy, something goes wrong here...
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ taskset -c 0 gcc -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc
/usr/bin/ld: /tmp/ccHiYiks.o: warning: relocation against `environ' in read-only section `.text'
/usr/bin/ld: /tmp/ccHiYiks.o: in function `getenv':
test.c:(.text+0x1f76): undefined reference to `environ'
/usr/bin/ld: test.c:(.text+0x1fc3): undefined reference to `environ'
/usr/bin/ld: test.c:(.text+0x1ffc): undefined reference to `environ'
/usr/bin/ld: test.c:(.text+0x2021): undefined reference to `environ'
/usr/bin/ld: test.c:(.text+0x2049): undefined reference to `environ'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE
collect2: error: ld returned 1 exit status
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$
I suspect it's caused by commit:
commit c970abe796019b3d576fd154a54b94efb35c02b1
Author: Willy Tarreau <w@1wt.eu>
Date: Mon Mar 21 18:33:08 2022 +0100
tools/nolibc/stdlib: add a simple getenv() implementation
This implementation relies on an extern definition of the environ
variable, that the caller must declare and initialize from envp.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
I will take a deeper look at this...
--
Ammar Faizi
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 13:54 ` Ammar Faizi
@ 2022-03-22 13:56 ` Ammar Faizi
2022-03-22 14:02 ` Willy Tarreau
0 siblings, 1 reply; 30+ messages in thread
From: Ammar Faizi @ 2022-03-22 13:56 UTC (permalink / raw)
To: Willy Tarreau, David Laight
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
On 3/22/22 8:54 PM, Ammar Faizi wrote:
>
> Willy, something goes wrong here...
>
> ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ taskset -c 0 gcc -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc
> /usr/bin/ld: /tmp/ccHiYiks.o: warning: relocation against `environ' in read-only section `.text'
> /usr/bin/ld: /tmp/ccHiYiks.o: in function `getenv':
> test.c:(.text+0x1f76): undefined reference to `environ'
> /usr/bin/ld: test.c:(.text+0x1fc3): undefined reference to `environ'
> /usr/bin/ld: test.c:(.text+0x1ffc): undefined reference to `environ'
> /usr/bin/ld: test.c:(.text+0x2021): undefined reference to `environ'
> /usr/bin/ld: test.c:(.text+0x2049): undefined reference to `environ'
> /usr/bin/ld: warning: creating DT_TEXTREL in a PIE
> collect2: error: ld returned 1 exit status
> ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$
>
>
> I suspect it's caused by commit:
>
> commit c970abe796019b3d576fd154a54b94efb35c02b1
> Author: Willy Tarreau <w@1wt.eu>
> Date: Mon Mar 21 18:33:08 2022 +0100
>
> tools/nolibc/stdlib: add a simple getenv() implementation
> This implementation relies on an extern definition of the environ
> variable, that the caller must declare and initialize from envp.
> Signed-off-by: Willy Tarreau <w@1wt.eu>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>
> I will take a look deeper on this...
This bug only exists when compiling without optimization.
--
Ammar Faizi
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 13:56 ` Ammar Faizi
@ 2022-03-22 14:02 ` Willy Tarreau
0 siblings, 0 replies; 30+ messages in thread
From: Willy Tarreau @ 2022-03-22 14:02 UTC (permalink / raw)
To: Ammar Faizi
Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
On Tue, Mar 22, 2022 at 08:56:44PM +0700, Ammar Faizi wrote:
> On 3/22/22 8:54 PM, Ammar Faizi wrote:
> >
> > Willy, something goes wrong here...
> >
> > ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ taskset -c 0 gcc -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc
> > /usr/bin/ld: /tmp/ccHiYiks.o: warning: relocation against `environ' in read-only section `.text'
> > /usr/bin/ld: /tmp/ccHiYiks.o: in function `getenv':
> > test.c:(.text+0x1f76): undefined reference to `environ'
> > /usr/bin/ld: test.c:(.text+0x1fc3): undefined reference to `environ'
> > /usr/bin/ld: test.c:(.text+0x1ffc): undefined reference to `environ'
> > /usr/bin/ld: test.c:(.text+0x2021): undefined reference to `environ'
> > /usr/bin/ld: test.c:(.text+0x2049): undefined reference to `environ'
> > /usr/bin/ld: warning: creating DT_TEXTREL in a PIE
> > collect2: error: ld returned 1 exit status
> > ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$
> >
> >
> > I suspect it's caused by commit:
> >
> > commit c970abe796019b3d576fd154a54b94efb35c02b1
> > Author: Willy Tarreau <w@1wt.eu>
> > Date: Mon Mar 21 18:33:08 2022 +0100
> >
> > tools/nolibc/stdlib: add a simple getenv() implementation
> > This implementation relies on an extern definition of the environ
> > variable, that the caller must declare and initialize from envp.
> > Signed-off-by: Willy Tarreau <w@1wt.eu>
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> >
> > I will take a look deeper on this...
>
> This bug only exists when compiling without optimization.
Indeed, reproduced. I can bypass it by adding __attribute__((weak)) on
the environ declaration in getenv(). Will send a patch later.
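
A rough sketch of that workaround, not the actual patch. The function names
are hypothetical, and the explicit check of the symbol's address is an
assumption added here so the sketch stays safe when nothing defines environ:

```c
#include <stddef.h>

/*
 * Weak reference: if no object file defines environ, the link still
 * succeeds and the symbol's address resolves to 0.
 */
extern char **environ __attribute__((weak));

/* Look up "name" in a NULL-terminated array of "NAME=value" strings. */
static char *sketch_getenv_from(char **envp, const char *name)
{
	if (!envp || !name)
		return NULL;

	for (; *envp; envp++) {
		const char *e = *envp;
		const char *n = name;

		/* Walk both strings while they match. */
		while (*n && *e == *n) {
			e++;
			n++;
		}
		/* Whole name consumed and followed by '=': found it. */
		if (!*n && *e == '=')
			return (char *)e + 1;
	}
	return NULL;
}

static char *sketch_getenv(const char *name)
{
	/* An unresolved weak symbol has address 0; do not dereference it. */
	if (&environ == NULL)
		return NULL;
	return sketch_getenv_from(environ, name);
}
```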
Thanks,
Willy
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 13:37 ` David Laight
@ 2022-03-22 14:47 ` Alviro Iskandar Setiawan
2022-03-22 15:11 ` David Laight
2022-03-23 6:29 ` Ammar Faizi
1 sibling, 1 reply; 30+ messages in thread
From: Alviro Iskandar Setiawan @ 2022-03-22 14:47 UTC (permalink / raw)
To: David Laight
Cc: Willy Tarreau, Ammar Faizi, Paul E. McKenney, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
On Tue, Mar 22, 2022 at 8:37 PM David Laight wrote:
> dunno, 'asm' register variables are rather more horrid and
> should probably only be used (for asm statements) when there aren't
> suitable register constraints.
>
> (I'm sure there is a comment about that in the gcc docs.)
I can't find a comment that says so here:
https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html
The current code looks valid to me, but I would still prefer to use
explicit register constraints instead of always using "r"(var), where
they are available. I have no strong objection, though. Still looks good.
-- Viro
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 14:47 ` Alviro Iskandar Setiawan
@ 2022-03-22 15:11 ` David Laight
0 siblings, 0 replies; 30+ messages in thread
From: David Laight @ 2022-03-22 15:11 UTC (permalink / raw)
To: 'Alviro Iskandar Setiawan'
Cc: Willy Tarreau, Ammar Faizi, Paul E. McKenney, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
From: Alviro Iskandar Setiawan
> Sent: 22 March 2022 14:48
>
> On Tue, Mar 22, 2022 at 8:37 PM David Laight wrote:
> > dunno, 'asm' register variables are rather more horrid and
> > should probably only be used (for asm statements) when there aren't
> > suitable register constraints.
> >
> > (I'm sure there is a comment about that in the gcc docs.)
>
> I don't find the comment that says so here:
> https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html
I've probably inferred it from:
"The only supported use for this feature is to specify registers for
input and output operands when calling Extended asm (see Extended Asm).
This may be necessary if the constraints for a particular machine don’t
provide sufficient control to select the desired register."
Here it isn't necessary because the required constraints exist.
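
To illustrate, a hedged sketch of a three-argument syscall helper that pins
registers through x86 constraint letters ("a", "b", "c", "d", "S", "D")
rather than through register asm variables. The helper names are
hypothetical, and the x86-64 branch and the libc fallback are additions so
that the sketch also builds outside i386:

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

static inline long sketch_syscall3(long num, long a1, long a2, long a3)
{
	long ret;

#if defined(__i386__)
	/* i386: number in eax, arguments in ebx, ecx, edx. */
	__asm__ volatile ("int $0x80"
			  : "=a"(ret)
			  : "a"(num), "b"(a1), "c"(a2), "d"(a3)
			  : "memory", "cc");
#elif defined(__x86_64__)
	/* x86-64: number in rax, arguments in rdi, rsi, rdx. */
	__asm__ volatile ("syscall"
			  : "=a"(ret)
			  : "a"(num), "D"(a1), "S"(a2), "d"(a3)
			  : "rcx", "r11", "memory", "cc");
#else
	/* Other architectures: fall back to the libc wrapper. */
	ret = syscall(num, a1, a2, a3);
#endif
	return ret;
}

/* Usage example: getpid takes no arguments, so the extras are zero. */
static inline long sketch_getpid(void)
{
	return sketch_syscall3(SYS_getpid, 0, 0, 0);
}
```

A sixth argument in %ebp has no constraint letter of its own, which is why
that operand still needs the push/pop handling discussed in this thread.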
David
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
2022-03-22 10:21 ` [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code Ammar Faizi
@ 2022-03-22 17:09 ` Nick Desaulniers
2022-03-22 17:25 ` Willy Tarreau
0 siblings, 1 reply; 30+ messages in thread
From: Nick Desaulniers @ 2022-03-22 17:09 UTC (permalink / raw)
To: Ammar Faizi
Cc: Willy Tarreau, Paul E. McKenney, Alviro Iskandar Setiawan,
Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm
On Tue, Mar 22, 2022 at 3:21 AM Ammar Faizi <ammarfaizi2@gnuweeb.org> wrote:
>
> Building with clang yields the following error:
> ```
> <inline asm>:3:1: error: _start changed binding to STB_GLOBAL
> .global _start
> ^
> 1 error generated.
> ```
> Make sure only specify one between `.global _start` and `.weak _start`.
> Removing `.global _start`.
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Yes, symbols should either be `.weak` or `.global`. The error from
Clang's integrated assembler is meant to flush out funny business.
I assume there's a good reason _why_ _start is weak and not strong?
Then again, I'm not familiar with nolibc.
>
> Cc: llvm@lists.linux.dev
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
> ---
>
> @@ Changelog:
>
> Link RFC v1: https://lore.kernel.org/llvm/20220320093750.159991-3-ammarfaizi2@gnuweeb.org
> RFC v1 -> RFC v2:
> - Remove all `.global _start` for all build (GCC and Clang) instead of
> removing all `.weak _start` for clang build (Comment from Willy).
> ---
> tools/include/nolibc/arch-aarch64.h | 1 -
> tools/include/nolibc/arch-arm.h | 1 -
> tools/include/nolibc/arch-i386.h | 1 -
> tools/include/nolibc/arch-mips.h | 1 -
> tools/include/nolibc/arch-riscv.h | 1 -
> tools/include/nolibc/arch-x86_64.h | 1 -
> 6 files changed, 6 deletions(-)
>
> diff --git a/tools/include/nolibc/arch-aarch64.h b/tools/include/nolibc/arch-aarch64.h
> index 87d9e434820c..2dbd80d633cb 100644
> --- a/tools/include/nolibc/arch-aarch64.h
> +++ b/tools/include/nolibc/arch-aarch64.h
> @@ -184,7 +184,6 @@ struct sys_stat_struct {
> /* startup code */
> asm(".section .text\n"
> ".weak _start\n"
> - ".global _start\n"
> "_start:\n"
> "ldr x0, [sp]\n" // argc (x0) was in the stack
> "add x1, sp, 8\n" // argv (x1) = sp
> diff --git a/tools/include/nolibc/arch-arm.h b/tools/include/nolibc/arch-arm.h
> index 001a3c8c9ad5..1191395b5acd 100644
> --- a/tools/include/nolibc/arch-arm.h
> +++ b/tools/include/nolibc/arch-arm.h
> @@ -177,7 +177,6 @@ struct sys_stat_struct {
> /* startup code */
> asm(".section .text\n"
> ".weak _start\n"
> - ".global _start\n"
> "_start:\n"
> #if defined(__THUMBEB__) || defined(__THUMBEL__)
> /* We enter here in 32-bit mode but if some previous functions were in
> diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
> index d7e4d53325a3..125a691fc631 100644
> --- a/tools/include/nolibc/arch-i386.h
> +++ b/tools/include/nolibc/arch-i386.h
> @@ -176,7 +176,6 @@ struct sys_stat_struct {
> */
> asm(".section .text\n"
> ".weak _start\n"
> - ".global _start\n"
> "_start:\n"
> "pop %eax\n" // argc (first arg, %eax)
> "mov %esp, %ebx\n" // argv[] (second arg, %ebx)
> diff --git a/tools/include/nolibc/arch-mips.h b/tools/include/nolibc/arch-mips.h
> index c9a6aac87c6d..1a124790c99f 100644
> --- a/tools/include/nolibc/arch-mips.h
> +++ b/tools/include/nolibc/arch-mips.h
> @@ -192,7 +192,6 @@ struct sys_stat_struct {
> asm(".section .text\n"
> ".weak __start\n"
> ".set nomips16\n"
> - ".global __start\n"
> ".set noreorder\n"
> ".option pic0\n"
> ".ent __start\n"
> diff --git a/tools/include/nolibc/arch-riscv.h b/tools/include/nolibc/arch-riscv.h
> index bc10b7b5706d..511d67fc534e 100644
> --- a/tools/include/nolibc/arch-riscv.h
> +++ b/tools/include/nolibc/arch-riscv.h
> @@ -185,7 +185,6 @@ struct sys_stat_struct {
> /* startup code */
> asm(".section .text\n"
> ".weak _start\n"
> - ".global _start\n"
> "_start:\n"
> ".option push\n"
> ".option norelax\n"
> diff --git a/tools/include/nolibc/arch-x86_64.h b/tools/include/nolibc/arch-x86_64.h
> index a7b70ea51b68..84c174181425 100644
> --- a/tools/include/nolibc/arch-x86_64.h
> +++ b/tools/include/nolibc/arch-x86_64.h
> @@ -199,7 +199,6 @@ struct sys_stat_struct {
> */
> asm(".section .text\n"
> ".weak _start\n"
> - ".global _start\n"
> "_start:\n"
> "pop %rdi\n" // argc (first arg, %rdi)
> "mov %rsp, %rsi\n" // argv[] (second arg, %rsi)
> --
> Ammar Faizi
>
--
Thanks,
~Nick Desaulniers
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
2022-03-22 17:09 ` Nick Desaulniers
@ 2022-03-22 17:25 ` Willy Tarreau
2022-03-22 17:30 ` Nick Desaulniers
0 siblings, 1 reply; 30+ messages in thread
From: Willy Tarreau @ 2022-03-22 17:25 UTC (permalink / raw)
To: Nick Desaulniers
Cc: Ammar Faizi, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm
Hi Nick,
On Tue, Mar 22, 2022 at 10:09:18AM -0700, Nick Desaulniers wrote:
> On Tue, Mar 22, 2022 at 3:21 AM Ammar Faizi <ammarfaizi2@gnuweeb.org> wrote:
> >
> > Building with clang yields the following error:
> > ```
> > <inline asm>:3:1: error: _start changed binding to STB_GLOBAL
> > .global _start
> > ^
> > 1 error generated.
> > ```
> > Make sure only specify one between `.global _start` and `.weak _start`.
> > Removing `.global _start`.
>
> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
>
> Yes, symbols should either be `.weak` or `.global`. The warning from
> Clang's integrated assembler is meant to flush out funny business.
>
> I assume there's a good reason _why_ _start is weak and not strong?
Yes, the issue appears when you start to build programs made of more than
one C file. That's why we have a few weak symbols here and there (others
like errno are static and the lack of inter-unit portability is assumed).
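
A small sketch of the idea at the C level, assuming the GCC/Clang weak
attribute. The names are hypothetical and stand in for a symbol such as
_start that is defined in a header included by every C file:

```c
/*
 * Imagine this block sits in a header included by several .c files.
 * Each object file then carries its own definition of both symbols.
 * Because they are weak, the linker keeps a single copy instead of
 * reporting a "multiple definition" error.
 */
__attribute__((weak)) int sketch_shared_counter;

__attribute__((weak)) int sketch_bump(void)
{
	return ++sketch_shared_counter;
}
```

A static symbol would also link cleanly, but every unit would then get a
private copy, which is the trade-off mentioned for errno.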
> Then again, I'm not familiar with nolibc.
No problem. The purpose is clearly *not* to implement a libc, but to have
something very lightweight that allows compiling trivial programs. A good
example of this is tools/testing/selftests/rcutorture/bin/mkinitrd.sh. I'm
personally using a tiny pre-init shell that I always package with my
kernels and that builds with them [1]. It will never do big things but
the balance between ease of use and coding effort is pretty good in my
experience. And I'm also careful not to make it complicated to use nor
to maintain, pragmatism is important and the effort should remain on the
program developer if some arbitration is needed.
Regards,
Willy
[1] https://github.com/formilux/flxutils/tree/master/init
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
2022-03-22 17:25 ` Willy Tarreau
@ 2022-03-22 17:30 ` Nick Desaulniers
2022-03-22 17:58 ` Willy Tarreau
0 siblings, 1 reply; 30+ messages in thread
From: Nick Desaulniers @ 2022-03-22 17:30 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm
(Moving folks to bcc; check the lists if you're interested)
On Tue, Mar 22, 2022 at 10:25 AM Willy Tarreau <w@1wt.eu> wrote:
>
> Hi Nick,
>
> On Tue, Mar 22, 2022 at 10:09:18AM -0700, Nick Desaulniers wrote:
> > Then again, I'm not familiar with nolibc.
>
> No problem. The purpose is clearly *not* to implement a libc, but to have
> something very lightweight that allows to compile trivial programs. A good
> example of this is tools/testing/selftests/rcutorture/bin/mkinitrd.sh. I'm
> personally using a tiny pre-init shell that I always package with my
> kernels and that builds with them [1]. It will never do big things but
> the balance between ease of use and coding effort is pretty good in my
> experience. And I'm also careful not to make it complicated to use nor
> to maintain, pragmatism is important and the effort should remain on the
> program developer if some arbitration is needed.
Neat, I bet that helps generate very small initrd! Got any quick size
measurements?
--
Thanks,
~Nick Desaulniers
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
2022-03-22 17:30 ` Nick Desaulniers
@ 2022-03-22 17:58 ` Willy Tarreau
2022-03-22 18:07 ` Nick Desaulniers
0 siblings, 1 reply; 30+ messages in thread
From: Willy Tarreau @ 2022-03-22 17:58 UTC (permalink / raw)
To: Nick Desaulniers; +Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm
On Tue, Mar 22, 2022 at 10:30:53AM -0700, Nick Desaulniers wrote:
> (Moving folks to bcc; check the lists if you're interested)
Yes, agreed :-)
> On Tue, Mar 22, 2022 at 10:25 AM Willy Tarreau <w@1wt.eu> wrote:
> > The purpose is clearly *not* to implement a libc, but to have
> > something very lightweight that allows to compile trivial programs. A good
> > example of this is tools/testing/selftests/rcutorture/bin/mkinitrd.sh. I'm
> > personally using a tiny pre-init shell that I always package with my
> > kernels and that builds with them [1]. It will never do big things but
> > the balance between ease of use and coding effort is pretty good in my
> > experience. And I'm also careful not to make it complicated to use nor
> > to maintain, pragmatism is important and the effort should remain on the
> > program developer if some arbitration is needed.
>
> Neat, I bet that helps generate very small initrd! Got any quick size
> measurements?
Yep:
First, the usual static printf("hello world!\n"):
$ ll hello-*libc
-rwxrwxr-x 1 willy dev 719232 Mar 22 18:50 hello-glibc*
-rwxrwxr-x 1 willy dev 1248 Mar 22 18:51 hello-nolibc*
$ objdump -h hello-nolibc
hello-nolibc: file format elf64-x86-64
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000300 00000000004000b0 00000000004000b0 000000b0 2**0
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .rodata 00000015 00000000004003b0 00000000004003b0 000003b0 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
Then the preinit stuff:
$ ll initramfs/init
-rwxr-xr-x 1 willy users 13936 Mar 22 18:40 initramfs/init*
$ xz -c9 < initramfs/init | wc -c
8392
$ size initramfs/init
text data bss dec hex filename
13348 0 23016 36364 8e0c init
$ objdump -h initramfs/init
initramfs/init: file format elf64-x86-64
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00002b74 00000000004000e8 00000000004000e8 000000e8 2**0
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .rodata 000008b0 0000000000402c60 0000000000402c60 00002c60 2**5
CONTENTS, ALLOC, LOAD, READONLY, DATA
2 .bss 000059e8 0000000000404520 0000000000404520 00003520 2**5
ALLOC
This one supports ~30-40 simple commands (mount/unmount, mknod, ls, ln),
a tar extractor, multi-level braces, and boolean expression evaluation,
variable expansion, and a config file parser to script all this. The code
is 20 years old and is really ugly (even uglier than you think). But that
gives an idea. 20 years ago the init was much simpler and 800 bytes (my
constraint was for single floppies containing kernel+rootfs) and strings
were manually merged by tails and put in .text to drop .rodata.
You'll also note that there's 0 data segment above. That used to be
convenient to further shrink programs, but these days, given how linkers
arrange segments by permissions, that doesn't save as much as it used to,
and it's likely that at some point I'll assume that there must be some
variables by default (errno, environ, etc) and that we'll accept spending
a few extra tens of bytes by default for more convenience.
Cheers,
Willy
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
2022-03-22 17:58 ` Willy Tarreau
@ 2022-03-22 18:07 ` Nick Desaulniers
2022-03-22 18:24 ` Willy Tarreau
0 siblings, 1 reply; 30+ messages in thread
From: Nick Desaulniers @ 2022-03-22 18:07 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm
On Tue, Mar 22, 2022 at 10:58 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Tue, Mar 22, 2022 at 10:30:53AM -0700, Nick Desaulniers wrote:
> > On Tue, Mar 22, 2022 at 10:25 AM Willy Tarreau <w@1wt.eu> wrote:
> > > The purpose is clearly *not* to implement a libc, but to have
> > > something very lightweight that allows to compile trivial programs. A good
> > > example of this is tools/testing/selftests/rcutorture/bin/mkinitrd.sh. I'm
> > > personally using a tiny pre-init shell that I always package with my
> > > kernels and that builds with them [1]. It will never do big things but
> > > the balance between ease of use and coding effort is pretty good in my
> > > experience. And I'm also careful not to make it complicated to use nor
> > > to maintain, pragmatism is important and the effort should remain on the
> > > program developer if some arbitration is needed.
> >
> > Neat, I bet that helps generate very small initrd! Got any quick size
> > measurements?
>
> Yep:
>
> First, the usual static printf("hello world!\n"):
>
> $ ll hello-*libc
> -rwxrwxr-x 1 willy dev 719232 Mar 22 18:50 hello-glibc*
> -rwxrwxr-x 1 willy dev 1248 Mar 22 18:51 hello-nolibc*
! What! Are those both statically linked?
> This one supports ~30-40 simple commands (mount/unmount, mknod, ls, ln),
> a tar extractor, multi-level braces, and boolean expression evaluation,
> variable expansion, and a config file parser to script all this. The code
> is 20 years old and is really ugly (even uglier than you think). But that
> gives an idea. 20 years ago the init was much simpler and 800 bytes (my
> constraint was for single floppies containing kernel+rootfs) and strings
> were manually merged by tails and put in .text to drop .rodata.
Oh, so nolibc has been around for a while then?
ld.lld will do string merging in that fashion at -O2 (the linker can
accept an optimization level). I did have a kernel patch for that
somewhere, need to update it for CC_OPTIMIZE_FOR_SIZE...
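
For readers unfamiliar with tail merging, a toy sketch of the underlying
check. The helper name is hypothetical, and this only illustrates the idea,
not how any linker implements it: a string that is a suffix of another one
needs no storage of its own, because it can point into the longer string.

```c
#include <string.h>

/*
 * If "shorter" is a tail of "longer", return the address inside "longer"
 * where it starts; both strings then share the same bytes and the same
 * terminating NUL. Return NULL when no sharing is possible.
 */
static const char *sketch_tail_share(const char *longer, const char *shorter)
{
	size_t ll = strlen(longer);
	size_t sl = strlen(shorter);
	const char *tail;

	if (sl > ll)
		return NULL;

	tail = longer + (ll - sl);
	return strcmp(tail, shorter) == 0 ? tail : NULL;
}
```

For instance, "mount" can be served from the storage of "unmount" at an
offset of two bytes.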
I guess the tradeoff with strings in .text is that now the strings
themselves are r+x and not just r?
>
> You'll also note that there's 0 data segment above. That used to be
> convenient to further shrink programs, but these days given how linkers
> arrange segments by permissions that doesn't save as much as it used to,
> and it's likely that at some points I'll assume that there must be some
> variables by default (errno, environ, etc) and that we'll accept to invest
> a few extra tens of bytes by default for more convenience.
Thanks for the measurements.
--
Thanks,
~Nick Desaulniers
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
2022-03-22 18:07 ` Nick Desaulniers
@ 2022-03-22 18:24 ` Willy Tarreau
2022-03-22 18:38 ` Nick Desaulniers
0 siblings, 1 reply; 30+ messages in thread
From: Willy Tarreau @ 2022-03-22 18:24 UTC (permalink / raw)
To: Nick Desaulniers; +Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm
On Tue, Mar 22, 2022 at 11:07:17AM -0700, Nick Desaulniers wrote:
> > First, the usual static printf("hello world!\n"):
> >
> > $ ll hello-*libc
> > -rwxrwxr-x 1 willy dev 719232 Mar 22 18:50 hello-glibc*
> > -rwxrwxr-x 1 willy dev 1248 Mar 22 18:51 hello-nolibc*
>
> ! What! Are those both statically linked?
Yes:
$ file hello-nolibc
hello-nolibc: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped
(rebuilding without stripping)
$ nm --size hello-nolibc
000000000000000f T main
0000000000000053 t u64toa_r
0000000000000280 t printf.constprop.0
$ nm hello-nolibc
00000000004013c5 R __bss_start
00000000004013c5 R _edata
00000000004013c8 R _end
00000000004000bf W _start
00000000004000b0 T main
0000000000400130 t printf.constprop.0
00000000004000dd t u64toa_r
> > This one supports ~30-40 simple commands (mount/unmount, mknod, ls, ln),
> > a tar extractor, multi-level braces, and boolean expression evaluation,
> > variable expansion, and a config file parser to script all this. The code
> > is 20 years old and is really ugly (even uglier than you think). But that
> > gives an idea. 20 years ago the init was much simpler and 800 bytes (my
> > constraint was for single floppies containing kernel+rootfs) and strings
> > were manually merged by tails and put in .text to drop .rodata.
>
> Oh, so nolibc has been around for a while then?
Not exactly. Over time I collected some of my stuff out of preinit to
make more reusable code for other tools, and eventually created a separate
project for it 5 years ago [1]. I then changed my mind a few times on how
to arrange all this and over time it became a bit easier to use. One day
Paul asked how to make less invasive static binaries for rcutorture and I
found that it was the perfect match so we agreed to integrate it there. It
was still a single file by then. And as usual when some code starts to get
more exposure it receives more contribs and feature requests ;-)
> ld.lld will do string merging in that fashion at -O2 (the linker can
> accept an optimization level). I did have a kernel patch for that
> somewhere, need to update it for CC_OPTIMIZE_FOR_SIZE...
Ah I didn't know, that's good to know!
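As an illustration of what merging strings by their tails means, here is a hand-written sketch. It is not code from preinit, and the names `demo_pool`, `DEMO_STR_UNMOUNT` and `DEMO_STR_MOUNT` are hypothetical. Because "mount" is a suffix of "unmount", both can be served from a single array, so only 8 bytes are stored instead of 8 + 6:

```c
#include <stddef.h>

/* One backing array holds the longer string, including its NUL. */
static const char demo_pool[] = "unmount";

/* The full string starts at the beginning of the array. */
#define DEMO_STR_UNMOUNT (demo_pool)

/* "mount" is the tail of "unmount", so skip the leading "un". */
#define DEMO_STR_MOUNT   (demo_pool + 2)

/* Bytes actually stored for both strings together. */
static size_t demo_pool_size(void)
{
	return sizeof(demo_pool);
}
```

A linker that merges string tails does the same thing automatically for string literals that end up in a mergeable string section.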
> I guess the tradeoff with strings in .text is that now the strings
> themselves are r+x and not just r?
Yes but when you're writing a small shell to allow you to manually
mount your rootfs from the kernel, you don't really care if someone
might try to use some of your strings as code gadgets for ROP exploits :-)
I would really not want to see this used for general programs, but it
does fit well with hacking stuff for initramfs, and what lies in the
selftests directory in general I guess.
What I particularly like is that I don't need a full toolchain, so if
I can build a kernel with the bare-metal compilers from kernel.org then
I know I can also build my initramfs that's packaged in it using the
exact same compiler. This significantly simplifies the build process.
Willy
[1] https://github.com/wtarreau/nolibc
* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
2022-03-22 18:24 ` Willy Tarreau
@ 2022-03-22 18:38 ` Nick Desaulniers
0 siblings, 0 replies; 30+ messages in thread
From: Nick Desaulniers @ 2022-03-22 18:38 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm
On Tue, Mar 22, 2022 at 11:24 AM Willy Tarreau <w@1wt.eu> wrote:
>
> What I particularly like is that I don't need a full toolchain, so if
> I can build a kernel with the bare-metal compilers from kernel.org then
> I know I can also build my initramfs that's packaged in it using the
> exact same compiler. This significantly simplifies the build process.
Neat; yeah that coincides a bit with my interest in having builds of
llvm on kernel.org; having/needing a libc is a PITA and building a
full cross toolchain is also more difficult than I think it needs to
be. The libc will depend on kernel headers, for each target. LLVM
currently has a WIP libc in its tree; I'm looking for something I can
statically link into the toolchain images (even LTO them into the
image). Will probably pursue musl (if I ever get time for this,
though maybe a project for my summer intern...).
One thing I've been looking at is a utility called llvm-ifs [1]; it
can generate .so stubs from a textual description that can be more
easily read, diff'ed, and committed. These are much faster to build
and reduce the chain of build dependencies (when dynamically linking).
Last I checked it had issues with versioned symbols, and I'm not sure
if/what it does for headers, which are still needed. Within Android,
libabigail is being used to dump+diff XML descriptions of parts of an
ABI; it looks like llvm-ifs might be useful for that as well. Not
sure if it's interesting but thought I'd share.
[1] https://www.youtube.com/watch?v=_pIorUFavc8
--
Thanks,
~Nick Desaulniers
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-22 13:37 ` David Laight
2022-03-22 14:47 ` Alviro Iskandar Setiawan
@ 2022-03-23 6:29 ` Ammar Faizi
2022-03-23 6:32 ` Ammar Faizi
2022-03-23 7:10 ` Willy Tarreau
1 sibling, 2 replies; 30+ messages in thread
From: Ammar Faizi @ 2022-03-23 6:29 UTC (permalink / raw)
To: David Laight, 'Willy Tarreau'
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
On 3/22/22 8:37 PM, David Laight wrote:
> dunno, 'asm' register variables are rather more horrid and
> should probably only be used (for asm statements) when there aren't
> suitable register constraints.
>
> (I'm sure there is a comment about that in the gcc docs.)
^ Hey David, yes you're right, that is very interesting...
I hit a GCC bug when playing with the syscall6() implementation here.
Using register variables for all inputs to syscall6() causes GCC 11.2 to
get stuck in an endless loop with 100% CPU usage. This is reproducible
with several versions of GCC.
In GCC 6.3, the syscall6() implementation above yields an ICE (Internal
Compiler Error):
```
<source>: In function '__sys_mmap':
<source>:35:1: error: unable to find a register to spill
}
^
<source>:35:1: error: this is the insn:
(insn 14 13 30 2 (set (reg:SI 95 [92])
(mem/c:SI (plus:SI (reg/f:SI 16 argp)
(const_int 28 [0x1c])) [1 offset+0 S4 A32])) <source>:33 86 {*movsi_internal}
(expr_list:REG_DEAD (reg:SI 16 argp)
(nil)))
<source>:35: confused by earlier errors, bailing out
Compiler returned: 1
```
See the full show here: https://godbolt.org/z/dYeKaYWY3
With the appropriate constraints it compiles nicely. It now looks
like this:
```
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
({ \
long _eax = (long)(num); \
long _arg6 = (long)(arg6); /* Always in memory */ \
asm volatile ( \
"pushl %[_arg6]\n\t" \
"pushl %%ebp\n\t" \
"movl 4(%%esp), %%ebp\n\t" \
"int $0x80\n\t" \
"popl %%ebp\n\t" \
"addl $4,%%esp\n\t" \
: "+a"(_eax) /* %eax */ \
: "b"(arg1), /* %ebx */ \
"c"(arg2), /* %ecx */ \
"d"(arg3), /* %edx */ \
"S"(arg4), /* %esi */ \
"D"(arg5), /* %edi */ \
[_arg6]"m"(_arg6) /* memory */ \
: "memory", "cc" \
); \
_eax; \
})
```
Link: https://godbolt.org/z/ozGbYWbPY
Will use that in the next patchset version.
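For illustration, a minimal sketch of how a six-argument call could go through this macro, using an anonymous mmap as the example. The helper `demo_anon_map()`, the `DEMO_MMAP_NR` selection and the non-i386 fallback to libc's `syscall()` are assumptions added here so the sketch also builds on other targets; they are not part of the patch, and nolibc itself does not use libc headers.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#if defined(__i386__)
/* The macro from this mail: arg6 is pushed, then loaded into %ebp. */
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)	\
({								\
	long _eax  = (long)(num);				\
	long _arg6 = (long)(arg6); /* Always in memory */	\
	asm volatile (						\
		"pushl	%[_arg6]\n\t"				\
		"pushl	%%ebp\n\t"				\
		"movl	4(%%esp), %%ebp\n\t"			\
		"int	$0x80\n\t"				\
		"popl	%%ebp\n\t"				\
		"addl	$4,%%esp\n\t"				\
		: "+a"(_eax)					\
		: "b"(arg1), "c"(arg2), "d"(arg3),		\
		  "S"(arg4), "D"(arg5), [_arg6]"m"(_arg6)	\
		: "memory", "cc"				\
	);							\
	_eax;							\
})
#else
/* Assumption: on any other target, defer to libc's syscall(). */
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)	\
	syscall((long)(num), (long)(arg1), (long)(arg2),	\
		(long)(arg3), (long)(arg4), (long)(arg5),	\
		(long)(arg6))
#endif

/* 32-bit targets such as i386 expose mmap2; 64-bit ones expose mmap. */
#ifdef SYS_mmap2
#define DEMO_MMAP_NR SYS_mmap2
#else
#define DEMO_MMAP_NR SYS_mmap
#endif

/* Hypothetical helper: map len bytes of anonymous read/write memory. */
static void *demo_anon_map(size_t len)
{
	long ret = my_syscall6(DEMO_MMAP_NR, 0, len,
			       PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	/* Both -1 (libc fallback) and -errno (raw int $0x80) land here. */
	if ((unsigned long)ret >= (unsigned long)-4095L)
		return NULL;
	return (void *)ret;
}
```

The offset argument is 0 here, so the difference between mmap2 (offset in pages) and mmap (offset in bytes) does not matter for this sketch.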
--
Ammar Faizi
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-23 6:29 ` Ammar Faizi
@ 2022-03-23 6:32 ` Ammar Faizi
2022-03-23 7:10 ` Willy Tarreau
1 sibling, 0 replies; 30+ messages in thread
From: Ammar Faizi @ 2022-03-23 6:32 UTC (permalink / raw)
To: David Laight, 'Willy Tarreau'
Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
I have reported this bug to the GCC developers:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032
--
Ammar Faizi
* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
2022-03-23 6:29 ` Ammar Faizi
2022-03-23 6:32 ` Ammar Faizi
@ 2022-03-23 7:10 ` Willy Tarreau
1 sibling, 0 replies; 30+ messages in thread
From: Willy Tarreau @ 2022-03-23 7:10 UTC (permalink / raw)
To: Ammar Faizi
Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
Linux Kernel Mailing List, GNU/Weeb Mailing List, x86@kernel.org,
llvm@lists.linux.dev
On Wed, Mar 23, 2022 at 01:29:39PM +0700, Ammar Faizi wrote:
> On 3/22/22 8:37 PM, David Laight wrote:
> > dunno, 'asm' register variables are rather more horrid and
> > should probably only be used (for asm statements) when there aren't
> > suitable register constraints.
> >
> > (I'm sure there is a comment about that in the gcc docs.)
>
> ^ Hey David, yes you're right, that is very interesting...
>
> I hit a GCC bug when playing with the syscall6() implementation here.
>
> Using register variables for all inputs to syscall6() causes GCC 11.2 to
> get stuck in an endless loop with 100% CPU usage. This is reproducible
> with several versions of GCC.
>
> In GCC 6.3, the syscall6() implementation above yields an ICE (Internal
> Compiler Error):
> ```
> <source>: In function '__sys_mmap':
> <source>:35:1: error: unable to find a register to spill
Now I'm pretty sure that this was the issue I faced when trying this long
ago; I remember this error message from before I found it wiser to give up.
Willy
Thread overview: 30+ messages
[not found] <20220322102115.186179-1-ammarfaizi2@gnuweeb.org>
2022-03-22 10:21 ` [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code Ammar Faizi
2022-03-22 17:09 ` Nick Desaulniers
2022-03-22 17:25 ` Willy Tarreau
2022-03-22 17:30 ` Nick Desaulniers
2022-03-22 17:58 ` Willy Tarreau
2022-03-22 18:07 ` Nick Desaulniers
2022-03-22 18:24 ` Willy Tarreau
2022-03-22 18:38 ` Nick Desaulniers
2022-03-22 10:21 ` [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments Ammar Faizi
2022-03-22 10:57 ` David Laight
2022-03-22 11:23 ` Willy Tarreau
2022-03-22 11:39 ` David Laight
2022-03-22 12:02 ` Ammar Faizi
2022-03-22 12:07 ` Ammar Faizi
2022-03-22 12:13 ` Willy Tarreau
2022-03-22 13:26 ` Ammar Faizi
2022-03-22 13:34 ` Willy Tarreau
2022-03-22 13:37 ` Ammar Faizi
2022-03-22 13:39 ` David Laight
2022-03-22 13:41 ` Willy Tarreau
2022-03-22 13:45 ` Ammar Faizi
2022-03-22 13:54 ` Ammar Faizi
2022-03-22 13:56 ` Ammar Faizi
2022-03-22 14:02 ` Willy Tarreau
2022-03-22 13:37 ` David Laight
2022-03-22 14:47 ` Alviro Iskandar Setiawan
2022-03-22 15:11 ` David Laight
2022-03-23 6:29 ` Ammar Faizi
2022-03-23 6:32 ` Ammar Faizi
2022-03-23 7:10 ` Willy Tarreau