linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
@ 2025-06-13 19:15 marc.herbert
  2025-06-13 20:20 ` Miguel Ojeda
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: marc.herbert @ 2025-06-13 19:15 UTC (permalink / raw)
  To: Jonathan Cameron, Greg KH, Dan Williams, rafael.j.wysocki,
	linux-cxl, linux-acpi, linux-kernel, Rafael J. Wysocki,
	Sudeep Holla, Ben Cheatham, Danilo Krummrich
  Cc: Marc Herbert

From: Marc Herbert <marc.herbert@linux.intel.com>

Fixes undefined behavior that was spotted by Jonathan Cameron in
https://lore.kernel.org/linux-cxl/20250609170509.00003625@huawei.com/

The possible consequences of the undefined behavior fixed here are fairly
well documented across the Internet but to save research time and avoid
doubts, I include a very short and simple demo below. I imagine kernel
compilation flags and various other conditions may not make the
consequences as bad as this example, however those conditions could change
and this type of code is still Undefined Behavior no matter what.
One of the best articles - there are many others:
https://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html

Since commit b5ec6fd286dfa4 ("kbuild: Drop -Wdeclaration-after-statement"),
it's now possible to use C99 declarations; the kernel is not constrained
anymore to group all declarations at the top of a block like single-pass
compilers used to require. This allows combining declarations and
definitions in one place - like literally every other language and project
does - and trivially fix undefined behavior like this.  This also reduces
variable scope and avoids misuse between declaration and definition like
uninitialized reads or writing to the wrong variable by mistake. C99
declarations also allow using a lot more `const` (the default in some
languages) which avoids some misuse after legitimate use.
tl;dr: C99 declarations are not just a "codestyle" or "taste" issue;
they are an important (and not mandatory) feature.

cc --version
  cc (GCC) 15.1.1 20250425

for i in 0 1 2 g; do printf "gcc -O$i: "; gcc -O$i nullptrUB.c &&
   ./a.out; done

gcc -O0: Segmentation fault (core dumped)
gcc -O1: ptr is zero
gcc -O2: ptr is NOT zero!!!
gcc -O3: ptr is NOT zero!!!
gcc -Og: ptr is zero

clang --version
  clang version 19.1.7

clang -O0: Segmentation fault (core dumped)
clang -O1: ptr is NOT zero!!!
clang -O2: ptr is NOT zero!!!
clang -O3: ptr is NOT zero!!!
clang -Og: ptr is NOT zero!!!

int faux_device_destroy(int *ptr)
{
  int i = *ptr;  i++;

  // Because we dereferenced ptr, the compiler knows the pointer cannot
  // be null (even when it is!) and can optimize this away.
  if (!ptr) {
    printf("ptr is zero\n");
    return 0;
  }

  printf("ptr is NOT zero!!!\n");
  return 1;
}

int main()
{
  struct timespec t1, t2;
  clock_gettime(CLOCK_MONOTONIC, &t1);
  clock_gettime(CLOCK_MONOTONIC, &t2);

  // Use the clock to hide zero from the compiler
  int * zeroptr = (int *)(t2.tv_sec - t1.tv_sec);

  return faux_device_destroy(zeroptr);
}

Fixes: 35fa2d88ca94 ("driver core: add a faux bus for use when a simple device/bus is needed")
Signed-off-by: Marc Herbert <marc.herbert@linux.intel.com>
---
 drivers/base/faux.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/base/faux.c b/drivers/base/faux.c
index 9054d346bd7f..94392d397986 100644
--- a/drivers/base/faux.c
+++ b/drivers/base/faux.c
@@ -218,11 +218,11 @@ EXPORT_SYMBOL_GPL(faux_device_create);
  */
 void faux_device_destroy(struct faux_device *faux_dev)
 {
-	struct device *dev = &faux_dev->dev;
-
 	if (!faux_dev)
 		return;
 
+	struct device *dev = &faux_dev->dev;
+
 	device_del(dev);
 
 	/* The final put_device() will clean up the memory we allocated for this device. */
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-13 19:15 [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy() marc.herbert
@ 2025-06-13 20:20 ` Miguel Ojeda
  2025-06-14  0:33 ` Greg KH
  2025-06-25 15:21 ` Dan Carpenter
  2 siblings, 0 replies; 19+ messages in thread
From: Miguel Ojeda @ 2025-06-13 20:20 UTC (permalink / raw)
  To: marc.herbert
  Cc: Benjamin.Cheatham, Jonathan.Cameron, dakr, dan.j.williams, gregkh,
	linux-acpi, linux-cxl, linux-kernel, rafael.j.wysocki, rafael,
	sudeep.holla, Kees Cook

On Fri, 13 Jun 2025 19:15:56 +0000 marc.herbert@linux.intel.com wrote:
>
> doubts, I include a very short and simple demo below. I imagine kernel
> compilation flags and various other conditions may not make the
> consequences as bad as this example, however those conditions could change

Yeah, the kernel uses `-fno-delete-null-pointer-checks`, so e.g. for your
example:

    https://godbolt.org/z/hs49jb98o

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-13 19:15 [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy() marc.herbert
  2025-06-13 20:20 ` Miguel Ojeda
@ 2025-06-14  0:33 ` Greg KH
  2025-06-14 10:50   ` Miguel Ojeda
  2025-06-25 15:21 ` Dan Carpenter
  2 siblings, 1 reply; 19+ messages in thread
From: Greg KH @ 2025-06-14  0:33 UTC (permalink / raw)
  To: marc.herbert
  Cc: Jonathan Cameron, Dan Williams, rafael.j.wysocki, linux-cxl,
	linux-acpi, linux-kernel, Rafael J. Wysocki, Sudeep Holla,
	Ben Cheatham, Danilo Krummrich

On Fri, Jun 13, 2025 at 07:15:56PM +0000, marc.herbert@linux.intel.com wrote:
> From: Marc Herbert <marc.herbert@linux.intel.com>
> 
> Fixes undefined behavior that was spotted by Jonathan Cameron in
> https://lore.kernel.org/linux-cxl/20250609170509.00003625@huawei.com/
> 
> The possible consequences of the undefined behavior fixed here are fairly
> well documented across the Internet but to save research time and avoid
> doubts, I include a very short and simple demo below. I imagine kernel
> compilation flags and various other conditions may not make the
> consequences as bad as this example, however those conditions could change
> and this type of code is still Undefined Behavior no matter what.
> One of the best articles - there are many others:
> https://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
> 
> Since commit b5ec6fd286dfa4 ("kbuild: Drop -Wdeclaration-after-statement"),
> it's now possible to use C99 declarations; the kernel is not constrained
> anymore to group all declarations at the top of a block like single-pass
> compilers used to require. This allows combining declarations and
> definitions in one place - like literally every other language and project
> does - and trivially fix undefined behavior like this.  This also reduces
> variable scope and avoids misuse between declaration and definition like
> uninitialized reads or writing to the wrong variable by mistake. C99
> declarations also allow using a lot more `const` (the default in some
> languages) which avoids some misuse after legitimate use.
> tl;dr: C99 declarations are not just a "codestyle" or "taste" issue;
> they are an important (and not mandatory) feature.
> 
> cc --version
>   cc (GCC) 15.1.1 20250425
> 
> for i in 0 1 2 g; do printf "gcc -O$i: "; gcc -O$i nullptrUB.c &&
>    ./a.out; done
> 
> gcc -O0: Segmentation fault (core dumped)
> gcc -O1: ptr is zero
> gcc -O2: ptr is NOT zero!!!
> gcc -O3: ptr is NOT zero!!!
> gcc -Og: ptr is zero
> 
> clang --version
>   clang version 19.1.7
> 
> clang -O0: Segmentation fault (core dumped)
> clang -O1: ptr is NOT zero!!!
> clang -O2: ptr is NOT zero!!!
> clang -O3: ptr is NOT zero!!!
> clang -Og: ptr is NOT zero!!!
> 
> int faux_device_destroy(int *ptr)
> {
>   int i = *ptr;  i++;
> 
>   // Because we dereferenced ptr, the compiler knows the pointer cannot
>   // be null (even when it is!) and can optimize this away.
>   if (!ptr) {
>     printf("ptr is zero\n");
>     return 0;
>   }
> 
>   printf("ptr is NOT zero!!!\n");
>   return 1;
> }
> 
> int main()
> {
>   struct timespec t1, t2;
>   clock_gettime(CLOCK_MONOTONIC, &t1);
>   clock_gettime(CLOCK_MONOTONIC, &t2);
> 
>   // Use the clock to hide zero from the compiler
>   int * zeroptr = (int *)(t2.tv_sec - t1.tv_sec);
> 
>   return faux_device_destroy(zeroptr);
> }
> 
> Fixes: 35fa2d88ca94 ("driver core: add a faux bus for use when a simple device/bus is needed")
> Signed-off-by: Marc Herbert <marc.herbert@linux.intel.com>

Great writeup, but as Miguel says, this isn't needed at all, the kernel
relies on the compiler to be sane :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-14  0:33 ` Greg KH
@ 2025-06-14 10:50   ` Miguel Ojeda
  2025-06-14 11:53     ` Greg KH
  2025-06-25 15:20     ` Dan Carpenter
  0 siblings, 2 replies; 19+ messages in thread
From: Miguel Ojeda @ 2025-06-14 10:50 UTC (permalink / raw)
  To: gregkh
  Cc: Benjamin.Cheatham, Jonathan.Cameron, dakr, dan.j.williams,
	linux-acpi, linux-cxl, linux-kernel, marc.herbert,
	rafael.j.wysocki, rafael, sudeep.holla, Dan Carpenter, Kees Cook

On Fri, 13 Jun 2025 20:33:42 -0400 Greg KH <gregkh@linuxfoundation.org> wrote:
>
> Great writeup, but as Miguel says, this isn't needed at all, the kernel
> relies on the compiler to be sane :)

We may still want to clean them up, e.g. for tooling -- Kees/Dan: do we?
e.g. I see a similar case with discussion at:

    https://lore.kernel.org/lkml/3f1e7aaa-501a-44f1-8122-28e9efa0a33c@web.de/

Which in the end was picked up as commit 2df2c0caaecf ("fbdev: au1100fb:
Move a variable assignment behind a null pointer check").

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-14 10:50   ` Miguel Ojeda
@ 2025-06-14 11:53     ` Greg KH
  2025-06-14 14:53       ` Marc Herbert
  2025-06-25 15:20     ` Dan Carpenter
  1 sibling, 1 reply; 19+ messages in thread
From: Greg KH @ 2025-06-14 11:53 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Benjamin.Cheatham, Jonathan.Cameron, dakr, dan.j.williams,
	linux-acpi, linux-cxl, linux-kernel, marc.herbert,
	rafael.j.wysocki, rafael, sudeep.holla, Dan Carpenter, Kees Cook

On Sat, Jun 14, 2025 at 12:50:37PM +0200, Miguel Ojeda wrote:
> On Fri, 13 Jun 2025 20:33:42 -0400 Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > Great writeup, but as Miguel says, this isn't needed at all, the kernel
> > relies on the compiler to be sane :)
> 
> We may still want to clean them up, e.g. for tooling -- Kees/Dan: do we?
> e.g. I see a similar case with discussion at:
> 
>     https://lore.kernel.org/lkml/3f1e7aaa-501a-44f1-8122-28e9efa0a33c@web.de/
> 
> Which in the end was picked up as commit 2df2c0caaecf ("fbdev: au1100fb:
> Move a variable assignment behind a null pointer check").

If "tooling" trips over stuff like this, then we should fix the tooling
as again, the kernel relies on this not being "optimized away" by the
compiler in many places.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-14 11:53     ` Greg KH
@ 2025-06-14 14:53       ` Marc Herbert
  2025-06-16  3:35         ` Greg KH
  0 siblings, 1 reply; 19+ messages in thread
From: Marc Herbert @ 2025-06-14 14:53 UTC (permalink / raw)
  To: Greg KH, Miguel Ojeda
  Cc: Benjamin.Cheatham, Jonathan.Cameron, dakr, dan.j.williams,
	linux-acpi, linux-cxl, linux-kernel, rafael.j.wysocki, rafael,
	sudeep.holla, Dan Carpenter, Kees Cook

> the kernel relies on this not being "optimized away" by the compiler
> in many places.

I think "undefined behavior" is the more general topic, more important
than null pointer checks specifically?

> the kernel relies on the compiler to be sane :)

Undefined behavior is... insane by essence? I'm afraid a few custom
compiler options can never fully address that.  While we might get away
with -fno-delete-null-pointer-checks right here right now, who knows
what else could happen in some future compiler version or future
combination of flags. No one: that's why it's called "undefined"
behavior!

> If "tooling" trips over stuff like this, then we should fix the tooling

Because of its old age, many quirks and limitations, C needs and has a
pretty large number of external "tools": static and run-time analyzers,
coding rules (CERT, MISRA,...) and what not. It's not realistic to "fix"
them all so they all "support" undefined behaviors like this one. It's
already hard enough for them to agree on false positives with a somewhat
"standard" version of C. The kernel wields a massive influence but
I'm afraid its power is not big enough to impose its own C "flavor". It
has influence on gcc and a couple others but not on the language as a
whole. The alternative is for the kernel to stay incompatible by choice
with most C "tooling" available - and find fewer issues :-(

(For even more diverse language mess, take a look at the Safe C++
"standardization" attempt and at C++ "profiles". I digress)

> I see a similar case with discussion at:
> https://lore.kernel.org/lkml/3f1e7aaa-501a-44f1-8122-28e9efa0a33c@web.de/

Thanks Miguel for these near-identical examples. While more verbose and more
error-prone, this can indeed be fixed with pre-C99, separate definitions
as it was done in multiple places there.

This is just moving one line of code a few lines down. I think there are
many more "interesting" and much more complex C flaws to waste time on :-)
My 2 cents.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-14 14:53       ` Marc Herbert
@ 2025-06-16  3:35         ` Greg KH
  2025-06-16 14:02           ` Alice Ryhl
  2025-06-18 23:43           ` Marc Herbert
  0 siblings, 2 replies; 19+ messages in thread
From: Greg KH @ 2025-06-16  3:35 UTC (permalink / raw)
  To: Marc Herbert
  Cc: Miguel Ojeda, Benjamin.Cheatham, Jonathan.Cameron, dakr,
	dan.j.williams, linux-acpi, linux-cxl, linux-kernel,
	rafael.j.wysocki, rafael, sudeep.holla, Dan Carpenter, Kees Cook

On Sat, Jun 14, 2025 at 07:53:34AM -0700, Marc Herbert wrote:
> > the kernel relies on this not being "optimized away" by the compiler
> > in many places.
> 
> I think "undefined behavior" is the more general topic, more important
> than null pointer checks specifically?

Is this really "undefined behaviour"?  There are a lot of things that
the kernel requires for a compiler to be able to build it, and this is
one of those things, it can't do this type of "optimization" and expect
the output to actually work properly.

> > the kernel relies on the compiler to be sane :)
> 
> Undefined behavior is... insane by essence? I'm afraid a few custom
> compiler options can never fully address that.  While we might get away
> with -fno-delete-null-pointer-checks right here right now, who knows
> what else could happen in some future compiler version or future
> combination of flags. No one: that's why it's called "undefined"
> behavior!

Again, that's not the issue here.  The issue is that we rely on this
type of optimization to not happen in order to work properly.  So no
need to "fix" anything here except perhaps the compiler for not
attempting to do foolish things like this :)

> > If "tooling" trips over stuff like this, then we should fix the tooling
> 
> Because of its old age, many quirks and limitations, C needs and has a
> pretty large number of external "tools": static and run-time analyzers,
> coding rules (CERT, MISRA,...) and what not. It's not realistic to "fix"
> them all so they all "support" undefined behaviors like this one.

If they wish to analize Linux, then yes, they do need to be fixed to
recognize that this is not an issue for us.  There is no requirement
that we have that _all_ tools must be able to parse our source code.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-16  3:35         ` Greg KH
@ 2025-06-16 14:02           ` Alice Ryhl
  2025-06-18 23:43           ` Marc Herbert
  1 sibling, 0 replies; 19+ messages in thread
From: Alice Ryhl @ 2025-06-16 14:02 UTC (permalink / raw)
  To: Greg KH
  Cc: Marc Herbert, Miguel Ojeda, Benjamin.Cheatham, Jonathan.Cameron,
	dakr, dan.j.williams, linux-acpi, linux-cxl, linux-kernel,
	rafael.j.wysocki, rafael, sudeep.holla, Dan Carpenter, Kees Cook

On Mon, Jun 16, 2025 at 5:36 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Sat, Jun 14, 2025 at 07:53:34AM -0700, Marc Herbert wrote:
> > > the kernel relies on this not being "optimized away" by the compiler
> > > in many places.
> >
> > I think "undefined behavior" is the more general topic, more important
> > than null pointer checks specifically?
>
> Is this really "undefined behaviour"?  There are a lot of things that
> the kernel requires for a compiler to be able to build it, and this is
> one of those things, it can't do this type of "optimization" and expect
> the output to actually work properly.

My understanding is that -fno-delete-null-pointer-checks changes the
language semantics so that nullptr deref isn't UB anymore and instead
becomes a guaranteed crash.

Alice

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-16  3:35         ` Greg KH
  2025-06-16 14:02           ` Alice Ryhl
@ 2025-06-18 23:43           ` Marc Herbert
  2025-06-19  0:23             ` Dan Williams
  1 sibling, 1 reply; 19+ messages in thread
From: Marc Herbert @ 2025-06-18 23:43 UTC (permalink / raw)
  To: Greg KH
  Cc: Miguel Ojeda, Benjamin.Cheatham, Jonathan.Cameron, dakr,
	dan.j.williams, linux-acpi, linux-cxl, linux-kernel,
	rafael.j.wysocki, rafael, sudeep.holla, Dan Carpenter, Kees Cook


On 2025-06-15 20:35, Greg KH wrote:
> On Sat, Jun 14, 2025 at 07:53:34AM -0700, Marc Herbert wrote:
>>> the kernel relies on this not being "optimized away" by the compiler
>>> in many places.
>>
>> I think "undefined behavior" is the more general topic, more important
>> than null pointer checks specifically?
> 
> Is this really "undefined behaviour"? 

This is apparently debatable:

https://stackoverflow.com/questions/26906621/does-struct-name-null-b-cause-undefined-behaviour-in-c11

Also note: https://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html

  Dereferencing a NULL Pointer: contrary to popular belief,
  dereferencing a null pointer in C is undefined. [...] NULL pointer
  dereferences being undefined enables a broad range of optimizations
  [...]  This significantly punishes scheduling and other
  optimizations. In C-based languages, NULL being undefined enables a
  large number of simple scalar optimizations that are exposed as a
  result of macro expansion and inlining.

> There are a lot of things that the kernel requires for a compiler to
> be able to build it, and this is one of those things, it can't do this
> type of "optimization" and expect the output to actually work
> properly.

According to page 2, this type of optimizations exists for a reason and
makes a real impact
https://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html

  While this is intentionally a simple and contrived example, this sort
  of thing happens all the time with inlining: inlining a function often
  exposes a number of secondary optimization opportunities. This means
  that if the optimizer decides to inline a function, a variety of local
  optimizations can kick in, which change the behavior of the code. This
  is both perfectly valid according to the standard, and _important for
  performance in practice_. [emphasis mine]

In other words, by turning this off unconditionally at the global level,
the kernel could actually lose (surprise!) some performance.

> Again, that's not the issue here.  The issue is that we rely on this
> type of optimization to not happen in order to work properly.

I'm interested in examples where this deviation is actually required, if
anyone can think of some from the top of their head. But in this
particular case it is not required because it's trivial and enough to
swap the two lines and check the pointer first. Even if the language
lawyers eventually agree that this particular case is not UB (for
instance: because it does not dereference "for real"), I still miss the
value of involving lawyers (or tripping some analyzers) at all in
situations where this can be avoided so easily. There are plenty enough
complex situations already, no need for even more C torture :-)


> So no need to "fix" anything here except perhaps the compiler for not
> attempting to do foolish things like this :)

It looks foolish when assuming that C is some sort of low-level language
a.k.a. "portable assembly", which it stopped being a long time ago;
for performance reasons:

https://queue.acm.org/detail.cfm?id=3212479
https://stefansf.de/post/pointers-are-more-abstract-than-you-might-expect/

Does this mean C has evolved into some hybrid monster good at neither
low-level nor high-level stuff? I think yes.

>>> If "tooling" trips over stuff like this, then we should fix the tooling
>>
>> Because of its old age, many quirks and limitations, C needs and has a
>> pretty large number of external "tools": static and run-time analyzers,
>> coding rules (CERT, MISRA,...) and what not. It's not realistic to "fix"
>> them all so they all "support" undefined behaviors like this one.
> 
> If they wish to analize Linux, then yes, they do need to be fixed to
> recognize that this is not an issue for us.  There is no requirement
> that we have that _all_ tools must be able to parse our source code.

Not sure why I wrote "all", this should have been "any".

Undefined behavior is in the "Most Wanted" list of pretty much all these
tools for obvious reasons.

Most these tools want to make C _in general_ less broken; not any specific
C project in particular. If some projects prefer being left out and use
some project-specific C dialect instead then it's their loss.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-18 23:43           ` Marc Herbert
@ 2025-06-19  0:23             ` Dan Williams
  2025-06-19  2:35               ` Dan Carpenter
  2025-06-19  3:33               ` Marc Herbert
  0 siblings, 2 replies; 19+ messages in thread
From: Dan Williams @ 2025-06-19  0:23 UTC (permalink / raw)
  To: Marc Herbert, Greg KH
  Cc: Miguel Ojeda, Benjamin.Cheatham, Jonathan.Cameron, dakr,
	dan.j.williams, linux-acpi, linux-cxl, linux-kernel,
	rafael.j.wysocki, rafael, sudeep.holla, Dan Carpenter, Kees Cook

Marc Herbert wrote:
[..]
> In other words, by turning this off unconditionally at the global level,
> the kernel could actually lose (surprise!) some performance.

I expect the answer is that any compiler that does fail to convert this
to defined behavior is not suitable for compiling the kernel.

The issue is not "oh hey, this fixup in this case is tiny", it is
"changing this precedent implicates a large flag day audit". I am
certain this is one of many optimizations that the kernel is willing to
sacrifice.

Otherwise, the massive effort to remove undefined behavior from the
kernel and allow for complier optimzations around that removal is called
the "Rust for Linux" project.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-19  0:23             ` Dan Williams
@ 2025-06-19  2:35               ` Dan Carpenter
  2025-06-19  3:33               ` Marc Herbert
  1 sibling, 0 replies; 19+ messages in thread
From: Dan Carpenter @ 2025-06-19  2:35 UTC (permalink / raw)
  To: Dan Williams
  Cc: Marc Herbert, Greg KH, Miguel Ojeda, Benjamin.Cheatham,
	Jonathan.Cameron, dakr, linux-acpi, linux-cxl, linux-kernel,
	rafael.j.wysocki, rafael, sudeep.holla, Kees Cook

On Wed, Jun 18, 2025 at 05:23:10PM -0700, Dan Williams wrote:
> Marc Herbert wrote:
> [..]
> > In other words, by turning this off unconditionally at the global level,
> > the kernel could actually lose (surprise!) some performance.
> 
> I expect the answer is that any compiler that does fail to convert this
> to defined behavior is not suitable for compiling the kernel.
> 
> The issue is not "oh hey, this fixup in this case is tiny", it is
> "changing this precedent implicates a large flag day audit". I am
> certain this is one of many optimizations that the kernel is willing to
> sacrifice.
> 
> Otherwise, the massive effort to remove undefined behavior from the
> kernel and allow for complier optimzations around that removal is called
> the "Rust for Linux" project.

We turned it off because of the tun.c bug.  CVE-2009-1897.  It was a fun
story:

https://lwn.net/Articles/342330/
https://lwn.net/Articles/342420/

I would say that if having the compiler automatically delete nonsensical
NULL checks leads to a performance improvement in your code then you're
doing something wrong.  Potentially there could be nonsense NULL checks
embedded inside macros, I guess.

But, again, this is a totally different thing from what the patch does.
The faux_device_destroy() code is not doing a dereference, it's doing
pointer math.

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-19  0:23             ` Dan Williams
  2025-06-19  2:35               ` Dan Carpenter
@ 2025-06-19  3:33               ` Marc Herbert
  2025-06-19  4:02                 ` Dan Carpenter
  2025-06-26  0:55                 ` Kent Overstreet
  1 sibling, 2 replies; 19+ messages in thread
From: Marc Herbert @ 2025-06-19  3:33 UTC (permalink / raw)
  To: Dan Williams, Greg KH
  Cc: Miguel Ojeda, Benjamin.Cheatham, Jonathan.Cameron, dakr,
	linux-acpi, linux-cxl, linux-kernel, rafael.j.wysocki, rafael,
	sudeep.holla, Dan Carpenter, Kees Cook




On 2025-06-18 17:23, Dan Williams wrote:
> Marc Herbert wrote:
> [..]
>> In other words, by turning this off unconditionally at the global level,
>> the kernel could actually lose (surprise!) some performance.
> 
> I expect the answer is that any compiler that does fail to convert this
> to defined behavior is not suitable for compiling the kernel.
> 
> The issue is not "oh hey, this fixup in this case is tiny", it is
> "changing this precedent implicates a large flag day audit". I am
> certain this is one of many optimizations that the kernel is willing to
> sacrifice.

None of these ideas crossed my mind:
- dropping -fno-delete-null-pointer-checks
- anything "large" like a "flag day audit" or any large cleanup/refactoring/etc.

Sorry for the confusion.

During the discussion, some seemed to perceive
-fno-delete-null-pointer-checks as a "performance-neutral" choice. So I
just tried to correct that impression "in passing", but please do _not_
read too much into it.


What I was really interested in:

1. Is it acceptable to swap two lines to _locally_ get rid of C Fear,
   Uncertainty and Doubt and time-consuming consultations with language
   lawyers. On a _case-by-case_ basis.

2. Are C99 declarations acceptable.

3. Do tooling and "convergence" with other C projects matter.

Note "acceptable" != mandatory; _allowing_ C99 declarations does NOT
imply scanning existing code and systematically reducing variable scope
everywhere possible. Same as every other "new" C feature.

I think these were valid "policy" questions, that this "poster child"
was an efficient way to ask all of them with a ridiculously small amount
of code and I think I got loud and clear answers. Case closed, moving on!


> Otherwise, the massive effort to remove undefined behavior from the
> kernel and allow for complier optimzations around that removal is called
> the "Rust for Linux" project.

Nice one!


On 2025-06-18 19:35, Dan Carpenter wrote:

> But, again, this is a totally different thing from what the patch does.
> The faux_device_destroy() code is not doing a dereference, it's doing
> pointer math.

pointer math is what we _want_ the code to do. But if that relies on
some undefined behavior(s) then the bets are off again. Check
https://stackoverflow.com/questions/26906621/does-struct-name-null-b-cause-undefined-behaviour-in-c11
where offsetof() is a suggested alternative.
Spoiler alert: more language lawyers. Do not click ;-)



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-19  3:33               ` Marc Herbert
@ 2025-06-19  4:02                 ` Dan Carpenter
  2025-06-26  0:55                 ` Kent Overstreet
  1 sibling, 0 replies; 19+ messages in thread
From: Dan Carpenter @ 2025-06-19  4:02 UTC (permalink / raw)
  To: Marc Herbert
  Cc: Dan Williams, Greg KH, Miguel Ojeda, Benjamin.Cheatham,
	Jonathan.Cameron, dakr, linux-acpi, linux-cxl, linux-kernel,
	rafael.j.wysocki, rafael, sudeep.holla, Kees Cook

On Wed, Jun 18, 2025 at 08:33:27PM -0700, Marc Herbert wrote:
> > But, again, this is a totally different thing from what the patch does.
> > The faux_device_destroy() code is not doing a dereference, it's doing
> > pointer math.
> 
> pointer math is what we _want_ the code to do. But if that relies on
> some undefined behavior(s) then the bets are off again. Check
> https://stackoverflow.com/questions/26906621/does-struct-name-null-b-cause-undefined-behaviour-in-c11
> where offsetof() is a suggested alternative.

The answers talk about "But the value of a null pointer constant is not
defined as 0." which is some trivia that had heard before.  Probably
I heard it in the context of someone saying that we should check
"if (p == NULL)" instead of "if (!p)"...

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-14 10:50   ` Miguel Ojeda
  2025-06-14 11:53     ` Greg KH
@ 2025-06-25 15:20     ` Dan Carpenter
  2025-06-25 22:30       ` Marc Herbert
  1 sibling, 1 reply; 19+ messages in thread
From: Dan Carpenter @ 2025-06-25 15:20 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: gregkh, Benjamin.Cheatham, Jonathan.Cameron, dakr, dan.j.williams,
	linux-acpi, linux-cxl, linux-kernel, marc.herbert,
	rafael.j.wysocki, rafael, sudeep.holla, Kees Cook

On Sat, Jun 14, 2025 at 12:50:37PM +0200, Miguel Ojeda wrote:
> On Fri, 13 Jun 2025 20:33:42 -0400 Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > Great writeup, but as Miguel says, this isn't needed at all, the kernel
> > relies on the compiler to be sane :)
> 
> We may still want to clean them up, e.g. for tooling -- Kees/Dan: do we?
> e.g. I see a similar case with discussion at:
> 
>     https://lore.kernel.org/lkml/3f1e7aaa-501a-44f1-8122-28e9efa0a33c@web.de/
> 
> Which in the end was picked up as commit 2df2c0caaecf ("fbdev: au1100fb:
> Move a variable assignment behind a null pointer check").

Putting the declarations at the top was always just a style preference.
Putting declarations at the top causes issues for __cleanup magic and
also bcachefs puts the declarations where ever it wants, but otherwise
people generally still put the declarations at the stop.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-13 19:15 [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy() marc.herbert
  2025-06-13 20:20 ` Miguel Ojeda
  2025-06-14  0:33 ` Greg KH
@ 2025-06-25 15:21 ` Dan Carpenter
  2 siblings, 0 replies; 19+ messages in thread
From: Dan Carpenter @ 2025-06-25 15:21 UTC (permalink / raw)
  To: marc.herbert
  Cc: Jonathan Cameron, Greg KH, Dan Williams, rafael.j.wysocki,
	linux-cxl, linux-acpi, linux-kernel, Rafael J. Wysocki,
	Sudeep Holla, Ben Cheatham, Danilo Krummrich

On Fri, Jun 13, 2025 at 07:15:56PM +0000, marc.herbert@linux.intel.com wrote:
> gcc -O0: Segmentation fault (core dumped)
> gcc -O1: ptr is zero
> gcc -O2: ptr is NOT zero!!!
> gcc -O3: ptr is NOT zero!!!
> gcc -Og: ptr is zero

Btw, this is testing dereferences where the kernel code is doing pointer
math.  No one disagrees about dereferences after a check.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-25 15:20     ` Dan Carpenter
@ 2025-06-25 22:30       ` Marc Herbert
  2025-06-25 23:18         ` Dan Carpenter
  0 siblings, 1 reply; 19+ messages in thread
From: Marc Herbert @ 2025-06-25 22:30 UTC (permalink / raw)
  To: Dan Carpenter, Miguel Ojeda
  Cc: gregkh, Benjamin.Cheatham, Jonathan.Cameron, dakr, dan.j.williams,
	linux-acpi, linux-cxl, linux-kernel, rafael.j.wysocki, rafael,
	sudeep.holla, Kees Cook



On 2025-06-25 08:20, Dan Carpenter wrote:
> On Sat, Jun 14, 2025 at 12:50:37PM +0200, Miguel Ojeda wrote:
>> On Fri, 13 Jun 2025 20:33:42 -0400 Greg KH <gregkh@linuxfoundation.org> wrote:
>>>
>>> Great writeup, but as Miguel says, this isn't needed at all, the kernel
>>> relies on the compiler to be sane :)
>>
>> We may still want to clean them up, e.g. for tooling -- Kees/Dan: do we?
>> e.g. I see a similar case with discussion at:
>>
>>     https://lore.kernel.org/lkml/3f1e7aaa-501a-44f1-8122-28e9efa0a33c@web.de/
>>
>> Which in the end was picked up as commit 2df2c0caaecf ("fbdev: au1100fb:
>> Move a variable assignment behind a null pointer check").
> 
> Putting the declarations at the top was always just a style preference.

No, "const" and variable scopes are not just "style", please do a
bit of research. For instance...

> Putting declarations at the top causes issues for __cleanup magic and...

https://stackoverflow.com/questions/368385/implementing-raii-in-pure-c
https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization#Compiler_%22cleanup%22_extensions

Not just "style" either:
- Automagically avoiding exploits like TUN https://lwn.net/Articles/342330/
- The unusual flag -fno-delete-null-pointer-checks and incompatibility
  with other analyzers and compilers
- All the complex compiler discussions around those.

Declaration-after-statement was an important (and obviously: optional)
C99 feature that let C catch up with every other language. Forbidding it
just for "style" would be a serious misunderstanding of that feature. I
don't know any yet but there has to be some more important reason(s)
than "style".

From https://lore.kernel.org/lkml/4d54e4f6-0d98-4b42-9bea-169f3b8772bb@sabinyo.mountain/
> Btw, this is testing dereferences where the kernel code is doing pointer math.

Compiler optimizations may or may not care about that difference.  It
seems gcc and clang both do care... for now (and even if that changes
then I guess -fno-delete-null-pointer-checks would still be enough)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-25 22:30       ` Marc Herbert
@ 2025-06-25 23:18         ` Dan Carpenter
  0 siblings, 0 replies; 19+ messages in thread
From: Dan Carpenter @ 2025-06-25 23:18 UTC (permalink / raw)
  To: Marc Herbert
  Cc: Miguel Ojeda, gregkh, Benjamin.Cheatham, Jonathan.Cameron, dakr,
	dan.j.williams, linux-acpi, linux-cxl, linux-kernel,
	rafael.j.wysocki, rafael, sudeep.holla, Kees Cook

On Wed, Jun 25, 2025 at 03:30:47PM -0700, Marc Herbert wrote:
> 
> 
> On 2025-06-25 08:20, Dan Carpenter wrote:
> > On Sat, Jun 14, 2025 at 12:50:37PM +0200, Miguel Ojeda wrote:
> >> On Fri, 13 Jun 2025 20:33:42 -0400 Greg KH <gregkh@linuxfoundation.org> wrote:
> >>>
> >>> Great writeup, but as Miguel says, this isn't needed at all, the kernel
> >>> relies on the compiler to be sane :)
> >>
> >> We may still want to clean them up, e.g. for tooling -- Kees/Dan: do we?
> >> e.g. I see a similar case with discussion at:
> >>
> >>     https://lore.kernel.org/lkml/3f1e7aaa-501a-44f1-8122-28e9efa0a33c@web.de/
> >>
> >> Which in the end was picked up as commit 2df2c0caaecf ("fbdev: au1100fb:
> >> Move a variable assignment behind a null pointer check").
> > 
> > Putting the declarations at the top was always just a style preference.
> 
> No, "const" and variable scopes are not just "style", please do a
> bit of research. For instance...
> 

No, I meant it was a style issue for *us* as kernel developers.  It
wasn't like kernel developers had not heard that c99 let you put
variable declarations randomly all over the place.  We knew about it
and hated it.  We only changed the rules because of __cleanup magic.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-19  3:33               ` Marc Herbert
  2025-06-19  4:02                 ` Dan Carpenter
@ 2025-06-26  0:55                 ` Kent Overstreet
  2025-06-30 23:24                   ` Marc Herbert
  1 sibling, 1 reply; 19+ messages in thread
From: Kent Overstreet @ 2025-06-26  0:55 UTC (permalink / raw)
  To: Marc Herbert
  Cc: Dan Williams, Greg KH, Miguel Ojeda, Benjamin.Cheatham,
	Jonathan.Cameron, dakr, linux-acpi, linux-cxl, linux-kernel,
	rafael.j.wysocki, rafael, sudeep.holla, Dan Carpenter, Kees Cook

On Wed, Jun 18, 2025 at 08:33:27PM -0700, Marc Herbert wrote:
> 2. Are C99 declarations acceptable.

To comment on this one, since I was linked - I'd say there's definite
pros and cons to C99 declarations to be aware of

pros:
- not separating the definition and the assignment is better style, and
  does lead to a reduction in bugs
- discourages reusing the same variable for multiple purposes, better
  and more readable code

the big con:
- they interact badly with gotos, you can get undefined behaviour from
  using a variable that wasn't actually defined _and the compiler will
  not warn you_

So, same issue as __cleanup.

There are no downsides to c99 style for loop declations that I've ever
seen, and we have seen definite cases where using those would have
prevented bugs.

But the issue with gotos is worth highlighting. Be careful when using
them in code that hasn't been converted to __cleanup.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy()
  2025-06-26  0:55                 ` Kent Overstreet
@ 2025-06-30 23:24                   ` Marc Herbert
  0 siblings, 0 replies; 19+ messages in thread
From: Marc Herbert @ 2025-06-30 23:24 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Dan Williams, Greg KH, Miguel Ojeda, Benjamin.Cheatham,
	Jonathan.Cameron, dakr, linux-acpi, linux-cxl, linux-kernel,
	rafael.j.wysocki, rafael, sudeep.holla, Dan Carpenter, Kees Cook

On 2025-06-25 17:55, Kent Overstreet wrote:
> the big con:
> - they interact badly with gotos, you can get undefined behaviour from
>   using a variable that wasn't actually defined _and the compiler will
>   not warn you_
> [...]
> But the issue with gotos is worth highlighting. Be careful when using
> them in code that hasn't been converted to __cleanup.

Thanks Kent for sharing this.

I got curious and found that clang -Wall is actually able to warn,
at least in simple cases:

int goto_uninitialized_C99(int *ptr)
{
  if (!ptr)
    goto cleanup;
  const int i = 42;

cleanup:
  // clang warning, no gcc warning
  printf("fin: i=%d\n", i);


warning: variable 'i' is used uninitialized whenever 'if' condition
   is true [-Wsometimes-uninitialized]


gcc -Wall -Wextra does not say anything.
Tested with clang version 18.1.3 and gcc 13.3.0


Interestingly, there is no warning difference between C89 and C99 code
for such a simple example. gcc warns for neither C89 code nor C99 code
and clang warns for both.

int goto_uninitialized_C89(int *ptr)
{
  int i;
  if (!ptr)
    goto cleanup;
  i = 42

cleanup:
  /* clang warning, no gcc warning */
  printf("fin: i=%d\n", i);


(finally getting rid of gotos is one of the main purposes of RAII)

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2025-06-30 23:24 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-13 19:15 [PATCH] driver core: faux: fix Undefined Behavior in faux_device_destroy() marc.herbert
2025-06-13 20:20 ` Miguel Ojeda
2025-06-14  0:33 ` Greg KH
2025-06-14 10:50   ` Miguel Ojeda
2025-06-14 11:53     ` Greg KH
2025-06-14 14:53       ` Marc Herbert
2025-06-16  3:35         ` Greg KH
2025-06-16 14:02           ` Alice Ryhl
2025-06-18 23:43           ` Marc Herbert
2025-06-19  0:23             ` Dan Williams
2025-06-19  2:35               ` Dan Carpenter
2025-06-19  3:33               ` Marc Herbert
2025-06-19  4:02                 ` Dan Carpenter
2025-06-26  0:55                 ` Kent Overstreet
2025-06-30 23:24                   ` Marc Herbert
2025-06-25 15:20     ` Dan Carpenter
2025-06-25 22:30       ` Marc Herbert
2025-06-25 23:18         ` Dan Carpenter
2025-06-25 15:21 ` Dan Carpenter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).