* GCC 15 -fzero-init-padding-bits= option and redzone clobber
From: Jakub Jelinek @ 2024-11-28 11:19 UTC
To: linux-toolchains
Hi!
This is just an FYI: as of today, GCC 15 no longer zero-initializes padding
bits in unions where the standard doesn't require it.
So e.g.
void foo (void)
{
  union U { int a; long b[64]; };
  /* This clears everything including padding bits,
     required by at least C23 (note, GCC 15 defaults to -std=gnu23) */
  union U u = {};
  /* This used to clear everything, but only clears
     v.a in GCC 15 by default. */
  union U v = {0};
}
If you want to keep the old behavior, e.g. for security purposes (the whole
union may be copied to user space etc.), you can use
-fzero-init-padding-bits=unions to restore the GCC 14 and older behavior.
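
For code that has to behave the same with or without the flag, one option is to not rely on the initializer at all and zero the whole object explicitly. The following is only a minimal sketch, not something taken from the GCC change: the helper names nonzero_bytes and check_union_init are made up for illustration, and the 0xff fill merely stands in for stale stack contents.

```c
#include <stddef.h>
#include <string.h>

union U { int a; long b[64]; };

/* Hypothetical helper: count the non-zero bytes in an object's representation. */
static size_t nonzero_bytes(const void *p, size_t n)
{
    const unsigned char *bytes = p;
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        count += (bytes[i] != 0);
    return count;
}

/* Hypothetical helper: returns how many bytes of the union are still
   non-zero after an explicit memset (expected: none). */
size_t check_union_init(void)
{
    union U v;

    memset(&v, 0xff, sizeof v);  /* simulate stale stack contents */
    memset(&v, 0, sizeof v);     /* zero every byte, whichever member is used later */
    return nonzero_bytes(&v, sizeof v);
}
```

The memset covers the bytes beyond v.a regardless of which initializer semantics the compiler implements, which is why it is a common choice for objects that get copied out wholesale.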
In addition, -fzero-init-padding-bits=all can be used to clear padding bits,
in structures as well, even in cases where the standard doesn't require it, e.g.
void bar (void)
{
  struct S { char a; int b; };
  /* C23 requires padding bits to be cleared here. */
  struct S s = {};
  /* But not here. -fzero-init-padding-bits=all does that anyway. */
  struct S t = { 1, 2 };
}
Note, there has also been a __builtin_clear_padding builtin for clearing padding
bits since GCC 11, though it doesn't clear bits in unions unless they
are padding bits for all possible members, as it doesn't know which union
member is current.
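
As a rough illustration of the builtin on the struct S from the example above, here is a sketch that clears the padding after the members have been assigned. The names HAVE_CLEAR_PADDING, clear_padding_S and nonzero_padding_after_clear are invented for this example, and the fallback branch (for compilers that don't advertise the builtin) is an assumption that simply zeroes the gap between the two members by hand.

```c
#include <stddef.h>
#include <string.h>

struct S { char a; int b; };

/* Use the builtin only where the compiler advertises it (GCC 11 and later). */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_clear_padding)
#    define HAVE_CLEAR_PADDING 1
#  endif
#endif

/* Hypothetical wrapper: zero the padding bytes of *s, leaving members alone. */
static void clear_padding_S(struct S *s)
{
#ifdef HAVE_CLEAR_PADDING
    __builtin_clear_padding(s);
#else
    /* Assumed fallback: zero the bytes between the end of 'a' and the start of 'b'. */
    unsigned char *bytes = (unsigned char *)s;
    size_t start = offsetof(struct S, a) + sizeof s->a;
    size_t end = offsetof(struct S, b);

    memset(bytes + start, 0, end - start);
#endif
}

/* Hypothetical check: number of non-zero padding bytes left after clearing. */
size_t nonzero_padding_after_clear(void)
{
    struct S t;
    unsigned char raw[sizeof t];
    size_t count = 0;

    memset(&t, 0xff, sizeof t);  /* simulate stale stack contents */
    t.a = 1;
    t.b = 2;
    clear_padding_S(&t);         /* clear padding after the member stores */

    memcpy(raw, &t, sizeof t);
    for (size_t i = offsetof(struct S, a) + sizeof t.a; i < offsetof(struct S, b); i++)
        count += (raw[i] != 0);
    return count;
}
```

The clearing is done after the member assignments because storing to a member may leave the padding bytes with unspecified values.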
Another new feature as of today that might be relevant to the kernel is
the "redzone" inline asm clobber.
It can/should be used on inline asm which does or could clobber memory
below the stack pointer, so that its presence disables use of the red zone
(currently on x86_64 and powerpc*). This applies whether the asm contains,
say, a pushf/pop pair or performs calls without taking the red zone into
account (taking it into account on x86_64 would be something like subtracting
128 from %rsp at the start and restoring it at the end).
In the past I think the kernel used some hacks like clobbering rsp. That is
something that really shouldn't be used even if it happened to work.
Inline asm is of course allowed to change the stack pointer temporarily,
but before returning (if it returns at all) it needs to restore it,
and clobbers are not about temporary changes during the execution of inline
asm, but about changes from the start to the end of inline asm.
So
  asm ("call something" : ... : ... : "redzone");
(of course it likely needs tons of other clobbers for call clobbered
registers unless it saves them and restores them in the inline asm or
in whatever it calls).
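
To make the pushf/pop case mentioned above concrete, here is a small user-space sketch for x86_64. The function name read_flags and the HAVE_REDZONE_CLOBBER macro are made up for this example. The version check is an assumption: it treats any non-clang compiler reporting __GNUC__ >= 15 as accepting the new clobber, which will not hold for GCC 15 development snapshots older than this change. Where the clobber is unavailable, the sketch falls back to the manual approach described above, stepping %rsp over the 128-byte red zone and restoring it at the end.

```c
#if defined(__x86_64__)

/* Assumption: GCC 15 and later accept "redzone" in the clobber list. */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 15
#define HAVE_REDZONE_CLOBBER 1
#endif

/* Hypothetical helper: return the current RFLAGS value via pushfq/popq. */
unsigned long long read_flags(void)
{
    unsigned long long flags;

#ifdef HAVE_REDZONE_CLOBBER
    /* pushfq writes below %rsp, so tell the compiler not to keep
       anything in the red zone across this asm. */
    __asm__ volatile ("pushfq\n\t"
                      "popq %0"
                      : "=r" (flags)
                      :
                      : "redzone");
#else
    /* Without the clobber: skip over the red zone by hand, and restore
       %rsp before the asm ends. */
    __asm__ volatile ("leaq -128(%%rsp), %%rsp\n\t"
                      "pushfq\n\t"
                      "popq %0\n\t"
                      "leaq 128(%%rsp), %%rsp"
                      : "=r" (flags));
#endif
    return flags;
}

#endif /* __x86_64__ */
```

The fallback uses leaq rather than subq/addq so that the adjustment itself does not modify the flags being read. In both variants %rsp has the same value at the start and the end of the asm, so nothing about the stack pointer is listed as an operand or clobber.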
Jakub
* Re: GCC 15 -fzero-init-padding-bits= option and redzone clobber
From: Peter Zijlstra @ 2024-11-29 12:52 UTC
To: Jakub Jelinek; +Cc: linux-toolchains, Linus Torvalds, x86

On Thu, Nov 28, 2024 at 12:19:10PM +0100, Jakub Jelinek wrote:
> Hi!
>
> This is just a FYI, since today GCC 15 no longer zero initializes padding
> bits in unions where the standard doesn't require it.
> So e.g.
> void foo (void)
> {
>   union U { int a; long b[64]; };
>   /* This clears everything including padding bits,
>      required by at least C23 (note, GCC 15 defaults to -std=gnu23) */
>   union U u = {};
>   /* This used to clear everything, but only clears
>      v.a in GCC 15 by default. */
>   union U v = {0};
> }
> If you want to keep the old behavior e.g. for security purposes (the whole
> union can be copied to user etc.), one can use
> -fzero-init-padding-bits=unions to restore the GCC 14 and older behavior.
> And, -fzero-init-padding-bits=all can be used to clear padding bits even
> in cases where the standard doesn't require them even in structures, e.g.
> void bar (void)
> {
>   struct S { char a; int b; };
>   /* C23 requires padding bits to be cleared here. */
>   struct S s = {};
>   /* But not here. -fzero-init-padding-bits=all does that anyway. */
>   struct S t = { 1, 2 };
> }
> Note, there is also __builtin_clear_padding builtin to clear padding bits
> already since GCC 11, though it doesn't clear bits in unions unless they
> are padding bits for all possible members, as it doesn't know which union
> member is current.

*groan* I suppose we should enable that flag when present :/

> Another new feature since today that might be relevant to kernel is
> the "redzone" inline asm clobber.
> It can/should be used on inline asm which does or could clobber memory
> below the stack pointer and so its presence must disable use of redzone
> (currently on x86_64 and powerpc*),

At least on x86_64 we don't currently have a redzone. I'm assuming the
"memory" clobber still very much includes everything?

And why was it deemed okay to change behaviour that might break existing
code?

> whether because say pushf/pop pair
> or because the inline asm performs calls without taking into account
> the red zone (e.g. on x86_64 that would be something like subtracting
> 128 from %rsp at the start and restoring at the end).
> In the past I think kernel used some hacks like clobbering rsp, that is
> something that really shouldn't be used even if it happened to work,
> inline asm is of course allowed to change the stack pointer temporarily,
> but before returning (if it returns at all) it needs to restore it back,
> and clobbers are not about temporary changes during the execution of inline
> asm, but about changes from the start to the end of inline asm.

Mostly we call a full C function on another stack, I don't think we ever
swizzle the stack while inside a C function.

> So
> asm ("call something" : ... : ... : "redzone");
> (of course it likely needs tons of other clobbers for call clobbered
> registers unless it saves them and restores them in the inline asm or
> in whatever it calls).

We have this thing:

/*
 * This output constraint should be used for any inline asm which has a "call"
 * instruction. Otherwise the asm may be inserted before the frame pointer
 * gets set up by the containing function. If you forget to do this, objtool
 * may print a "call without frame pointer save/setup" warning.
 */
register unsigned long current_stack_pointer asm(_ASM_SP);
#define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer)

which we use a *LOT*.
* Re: GCC 15 -fzero-init-padding-bits= option and redzone clobber
From: Jakub Jelinek @ 2024-11-29 13:23 UTC
To: Peter Zijlstra; +Cc: linux-toolchains, Linus Torvalds, x86

On Fri, Nov 29, 2024 at 01:52:38PM +0100, Peter Zijlstra wrote:
> > Note, there is also __builtin_clear_padding builtin to clear padding bits
> > already since GCC 11, though it doesn't clear bits in unions unless they
> > are padding bits for all possible members, as it doesn't know which union
> > member is current.
>
> *groan* I suppose we should enable that flag when present :/

If you want to keep previous behavior, sure. That is why I posted this FYI.

> > Another new feature since today that might be relevant to kernel is
> > the "redzone" inline asm clobber.
> > It can/should be used on inline asm which does or could clobber memory
> > below the stack pointer and so its presence must disable use of redzone
> > (currently on x86_64 and powerpc*),
>
> At least on x86_64 we don't currently have a redzone.

Ah, I see, kernel builds with -mno-red-zone, missed that being mentioned
in https://gcc.gnu.org/PR117312 that led to this.

> I'm assuming the
> "memory" clobber still very much includes everything?

No. "memory" clobber is about clobbering memory which could be clobbered
e.g. by a call to a function one doesn't know anything about and which one
passes the inline asm operands to. So it can certainly clobber global
variables, or address-taken variables whose address escaped to global state
(or could escape), or memory referenced from the operands of the inline asm.
It can't clobber random stack locations it knows nothing about, say
clobbering the saved registers on the stack in the stack frame, return
address, non-address taken variables which were just spilled to stack due
to register allocator decisions, etc. There would be nothing the compiler
could do about that.

> And why was it deemed okay to change behaviour that might break existing
> code?

You mean for unions? It can only change behavior of invalid code (relying
on undefined behavior). Most compiler optimizations change behavior of code
with UB in it, otherwise the compiler couldn't optimize anything.

> We have this thing:
>
> /*
>  * This output constraint should be used for any inline asm which has a "call"
>  * instruction. Otherwise the asm may be inserted before the frame pointer
>  * gets set up by the containing function. If you forget to do this, objtool
>  * may print a "call without frame pointer save/setup" warning.
>  */
> register unsigned long current_stack_pointer asm(_ASM_SP);
> #define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer)
>
> which we use a *LOT*.

And wrongly so. Inline asm really can't change the stack pointer (with the
meaning that rsp would be different between the entry to the inline asm and
its exit(s) (multiple for asm goto)). So telling the compiler it does change
is wrong.

In GCC 9 we've added a warning for "rsp" and similar clobbers of inline asm,
  warning: listing the stack pointer register ‘rsp’ in a clobber list is deprecated
There is none currently for the "+r" (current_stack_pointer), but e.g. doing
current_stack_pointer += 16; or current_stack_pointer = whatever; will
certainly cause UB. The behavior of "+r" (current_stack_pointer) you are
relying on certainly isn't guaranteed.

See https://gcc.gnu.org/PR117312 for some details. There are surely other
bugs that talk about that.

Jakub
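
The scope of the "memory" clobber described above can be sketched with a plain compiler barrier. This is only an illustration under assumptions: shared_counter and publish_then_read are invented names, and the empty asm stands in for a real instruction sequence.

```c
/* A global: exactly the kind of memory a "memory" clobber covers. */
static int shared_counter;

/* Hypothetical example: store to a global, pass a "memory" clobber,
   then read the global back. */
int publish_then_read(void)
{
    shared_counter = 41;

    /* The compiler must assume the asm may read or write shared_counter,
       so the store above is emitted before it and the value is reloaded
       after it. */
    __asm__ volatile ("" : : : "memory");

    shared_counter += 1;
    return shared_counter;
}
```

A local variable whose address is never taken gets no such treatment: the compiler is free to keep it in a register or in a spill slot across the asm, which is the part of the stack that the "memory" clobber says nothing about.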
* Re: GCC 15 -fzero-init-padding-bits= option and redzone clobber
From: Linus Torvalds @ 2024-11-29 17:55 UTC
To: Jakub Jelinek; +Cc: Peter Zijlstra, linux-toolchains, x86

On Fri, 29 Nov 2024 at 05:23, Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Fri, Nov 29, 2024 at 01:52:38PM +0100, Peter Zijlstra wrote:
> > register unsigned long current_stack_pointer asm(_ASM_SP);
> > #define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer)
> >
> > which we use a *LOT*.
>
> And wrongly so. Inline asm really can't change the stack pointer (with the
> meaning that rsp would be different between the entry to the inline asm and
> its exit(s) (multiple for asm goto). So telling the compiler it does change
> is wrong.

The value of %rsp doesn't change in the end, but we do read and write
it, because that's what a "call" instruction does.

So see that ASM_CALL_CONSTRAINT as us just telling the compiler that
we *use* rsp, because otherwise we've had issues with the compiler
inserting the inline asm before the frame pointer is set up, which
breaks all the usual tracing etc stuff. You can't do function calls
without a properly set up frame.

This has happened with both gcc and clang, and telling the compiler
that we need the stack pointer fixes it. I don't actually remember who
told us to do that, but I think it was a gcc person.

In fact, if I were a betting man, I would have thought it was you ;)

The comment above the ASM_CALL_CONSTRAINT definition actually explains it:

 * This output constraint should be used for any inline asm which has a "call"
 * instruction. Otherwise the asm may be inserted before the frame pointer
 * gets set up by the containing function. If you forget to do this, objtool
 * may print a "call without frame pointer save/setup" warning.

but we're open to other ways to tell the compiler that "we need the
stack pointer to be set up".

Linus
* Re: GCC 15 -fzero-init-padding-bits= option and redzone clobber
From: Linus Torvalds @ 2024-11-29 18:21 UTC
To: Jakub Jelinek; +Cc: Peter Zijlstra, linux-toolchains, x86

On Fri, 29 Nov 2024 at 09:55, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> This has happened with both gcc and clang, and telling the compiler
> that we need the stack pointer fixes it. I don't actually remember who
> told us to do that, but I think it was a gcc person.
>
> In fact, if I were a betting man, I would have thought it was you ;)

Ahh, found it. Not you. Segher. Doing some archeology finds this:

https://gcc.gnu.org/legacy-ml/gcc/2015-07/msg00080.html

and even at the time, Segher suggested maybe having a separate "stack"
clobber. But obviously that wasn't available at the time, and afaik
has never happened.

So we used that gcc suggestion, and it has worked fine for most of a
decade by now.

Linus
* Re: GCC 15 -fzero-init-padding-bits= option and redzone clobber
From: Segher Boessenkool @ 2024-11-30 11:10 UTC
To: Linus Torvalds; +Cc: Jakub Jelinek, Peter Zijlstra, linux-toolchains, x86

Hi!

On Fri, Nov 29, 2024 at 10:21:13AM -0800, Linus Torvalds wrote:
> On Fri, 29 Nov 2024 at 09:55, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > This has happened with both gcc and clang, and telling the compiler
> > that we need the stack pointer fixes it. I don't actually remember who
> > told us to do that, but I think it was a gcc person.
> >
> > In fact, if I were a betting man, I would have thought it was you ;)
>
> Ahh, found it. Not you. Segher. Doing some archeology finds this:
>
> https://gcc.gnu.org/legacy-ml/gcc/2015-07/msg00080.html
>
> and even at the time, Segher suggested maybe having a separate "stack"
> clobber. But obviously that wasn't available at the time, and afaik
> has never happened.

But that would not mean changing the stack pointer, writing to something
on the stack instead. Something like this "redzone" thing maybe. I did
say it would have to have precise semantics specified :-)

> So we used that gcc suggestion, and it has worked fine for most of a
> decade by now.

Interesting. And I was even accused of being "clever" in that thread,
wow! :-)

But of course, GCC assumes there is a properly set up stack
*everywhere*, it can in principle insert calls *anywhere*. So these
asm constraints are totally superfluous anyway!

(Input constraints do not say a reg is read, and output constraints do
not say it is written to. Instead, they express the data flow,
something that is actually useful to the compiler, and very much
required as well).

Segher
* Re: GCC 15 -fzero-init-padding-bits= option and redzone clobber
From: Linus Torvalds @ 2024-11-30 17:43 UTC
To: Segher Boessenkool; +Cc: Jakub Jelinek, Peter Zijlstra, linux-toolchains, x86

On Sat, 30 Nov 2024 at 03:13, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> But of course, GCC assumes there is a properly set up stack
> *everywhere*, it can in principle insert calls *anywhere*. So these
> asm constraints are totally superfluous anyway!

Well, I wish that were true. But we used your trick for a reason: it
fixed a real and present issue.

So no. It's not superfluous. It's required.

Without that "add sp register as an input/output reg", at least some
versions of gcc did in fact put inline asm in places where the frame
was not set up.

You seem to imply that that may be a gcc bug, of course. Par for the
course - we have had other odd things in inline asms (or around them)
for other gcc bugs.

We do have commentary in some of our commits to that "it's a gcc bug"
effect, ie this commit message from "only" seven years ago states:

    With GCC 7.2, however, GCC's behavior has changed. It now changes its
    behavior based on the conversion of the register variable to a global.
    That somehow convinces it to *always* set up the frame pointer before
    inserting *any* inline asm. (Therefore, listing the variable as an
    output constraint is a no-op and is no longer necessary.)

So if it makes you feel any better, the trick now works for a
_different_reason_. Just the existence of the global register variable
seems to matter to those newer versions of gcc.

But we technically still support those older gcc versions that require
the old format (we still support back to gcc-5.1, although I think
we're about to make the jump up to 8.1 based on staid enterprise
distro people finally having left some of the ancient stuff behind).

So we *may* be able to remove this hack, if gcc people can actually
pinky promise that it's not required with anything newer than 8.1

Linus
* Re: GCC 15 -fzero-init-padding-bits= option and redzone clobber
From: Segher Boessenkool @ 2024-11-30 22:19 UTC
To: Linus Torvalds; +Cc: Jakub Jelinek, Peter Zijlstra, linux-toolchains, x86

Hi!

On Sat, Nov 30, 2024 at 09:43:53AM -0800, Linus Torvalds wrote:
> On Sat, 30 Nov 2024 at 03:13, Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> >
> > But of course, GCC assumes there is a properly set up stack
> > *everywhere*, it can in principle insert calls *anywhere*. So these
> > asm constraints are totally superfluous anyway!
>
> Well, I wish that were true. But we used your trick for a reason: it
> fixed a real and present issue.
>
> So no. It's not superfluous. It's required.

It expresses a fact that is true always anyway. But it is required to
get a *side effect*. "It is a hack", if you want.

(And "is true always" means "is a requirement on any program", of course
a user can blatantly violate those rules by writing incorrect programs).

> Without that "add sp register as an input/output reg", at least some
> versions of gcc did in fact put inline asm in places where the frame
> was not set up.

Yes. And that was not a bug, either: nothing expressed that there has
to be a frame set up for this asm to work, so GCC felt free to make more
optimal code (or what it thought was more optimal code, or potentially
anyway; "could be more optimal code").

> You seem to imply that that may be a gcc bug, of course. Par for the
> course - we have had other odd things in inline asms (or around them)
> for other gcc bugs.

Nope, it is not a GCC bug when users expect things that were not
promised to them.

> We do have commentary in some of our commits to that "it's a gcc bug"
> effect, ie this commit message from "only" seven years ago states:
>
>     With GCC 7.2, however, GCC's behavior has changed. It now changes its
>     behavior based on the conversion of the register variable to a global.
>     That somehow convinces it to *always* set up the frame pointer before
>     inserting *any* inline asm. (Therefore, listing the variable as an
>     output constraint is a no-op and is no longer necessary.)

I have absolutely no idea what "conversion of the register variable to
a global" would mean, so I cannot parse what this sentence is meant to
mean.

There are two kinds of register variable: global register variables,
and local register variables. Global register variables are declared
at global scope, and local register variables are declared within a
function. There obviously is no way the compiler could decide to make a
local register variable a global one (that would change semantics!), so
it probably means something else, but I have no idea what.

> So if it makes you feel any better, the trick now works for a
> _different_reason_. Just the existence of the global register variable
> seems to matter to those newer versions of gcc.

Yeah, and I don't see how. If there was a full testcase I could take a
look :-)

> But we technically still support those older gcc versions that require
> the old format (we still support back to gcc-5.1, although I think
> we're about to make the jump up to 8.1 based on staid enterprise
> distro people finally having left some of the ancient stuff behind).

Why 8.1? 8.5 was a bugfix release (in general, always require the
latest version in a release series, or recommend it at least: we cannot
fix any bug in older releases, but we can do new releases :-) )

> So we *may* be able to remove this hack, if gcc people can actually
> pinky promise that it's not required with anything newer than 8.1

The problem is still not completely clear to me. Maybe some other GCC
people will make such a promise, but at least I won't. Sorry.

Segher
* Re: GCC 15 -fzero-init-padding-bits= option and redzone clobber
From: Linus Torvalds @ 2024-11-30 22:43 UTC
To: Segher Boessenkool; +Cc: Jakub Jelinek, Peter Zijlstra, linux-toolchains, x86

On Sat, 30 Nov 2024 at 14:21, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> I have absolutely no idea what "conversion of the register variable to
> a global" would mean, so I cannot parse what this sentence is meant to
> mean.

That commit was for a clang issue, where clang was unhappy with a
local definition of the stack pointer register. gcc didn't care, clang
did.

So that commit changed it from a local register variable to a global
one, and the commit message talked about how gcc doesn't seem to care
any more as long as it's there.

> Yeah, and I don't see how. If there was a full testcase I could take a
> look :-)

There was at the time, but that "at the time" was 10 years ago. The
gmane links no longer work, and unlike the more modern lore.kernel.org
archives, the link itself contains no useful information for me to try
to find it.

So this is not the kernel one, but a gcc test at the time you replied to:

https://gcc.gnu.org/legacy-ml/gcc/2015-07/msg00079.html

> > But we technically still support those older gcc versions that require
> > the old format (we still support back to gcc-5.1, although I think
> > we're about to make the jump up to 8.1 based on staid enterprise
> > distro people finally having left some of the ancient stuff behind).
>
> Why 8.1?

All our "we support this compiler version" tend to be about what
distros still have.

And "enterprise" in technology means "old and crappy, but some company
supports it". So enterprise distros tend to use some god-awful 5+ year
old setup, because the companies that pay the big bucks don't like to
see version changes.

> The problem is still not completely clear to me. Maybe some other GCC
> people will do such a promise, but at least I won't. Sorry.

So this is why we are sometimes forced to use hacks. You may not like
it, but to *us*, the important part is "this works and doesn't
generate buggy code", not "this is documented".

Guess why we have random unnecessary "asm volatile" things too? Yeah.
gcc bugs. They have been fixed, but not in older versions.

Linus
* Re: GCC 15 -fzero-init-padding-bits= option and redzone clobber
From: Linus Torvalds @ 2024-11-30 22:45 UTC
To: Segher Boessenkool; +Cc: Jakub Jelinek, Peter Zijlstra, linux-toolchains, x86

On Sat, 30 Nov 2024 at 14:43, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Guess why we have random unnecessary "asm volatile" things too? Yeah.
> gcc bugs. They have been fixed, but not in older versions.

Btw, don't feel bad. There are clang bugs too. Bugs happen. And they
happen more in code bases that push the boundaries. 99% of all
projects never use inline asm at all, or do it purely with headers
that came with the system.

For the kernel, it's non-negotiable.

Linus