public inbox for linux-modules@vger.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 3/3] module: Add compile-time check for embedded NUL characters
From: Sami Tolvanen @ 2025-12-11 17:51 UTC (permalink / raw)
  To: Daniel Gomez
  Cc: Dan Carpenter, Luck, Tony, Chris Li, Eric Biggers, Kees Cook,
	Luis Chamberlain, Rusty Russell, Petr Pavlu,
	linux-modules@vger.kernel.org, Malcolm Priestley,
	Mauro Carvalho Chehab, Hans Verkuil, Uwe Kleine-König,
	linux-kernel@vger.kernel.org, linux-media@vger.kernel.org,
	linux-hardening@vger.kernel.org
In-Reply-To: <083ebd92-4b3f-47f8-bf0f-395a604b5f05@kernel.org>

On Fri, Dec 12, 2025 at 02:30:48AM +0900, Daniel Gomez wrote:
> 
> 
> On 12/12/2025 02.03, Sami Tolvanen wrote:
> > On Thu, Dec 11, 2025 at 12:28 AM Dan Carpenter <dan.carpenter@linaro.org> wrote:
> >>
> >> On Wed, Dec 10, 2025 at 02:29:45PM -0800, Luck, Tony wrote:
> >>>> diff --git a/expand.c b/expand.c
> >>>> index f14e7181..71221d35 100644
> >>>> --- a/expand.c
> >>>> +++ b/expand.c
> >>>> @@ -535,6 +535,8 @@ static int expand_compare(struct expression *expr)
> >>>>                     expr->taint = 0;
> >>>>                     return 0;
> >>>>             }
> >>>> +           if (left->flags & CEF_ICE && right->flags & CEF_ICE)
> >>>> +                   expr->flags |= CEF_SET_ICE;
> >>>>             if (simplify_cmp_binop(expr, left->ctype))
> >>>>                     return 0;
> >>>>             if (simplify_float_cmp(expr, left->ctype))
> >>
> >> I'm not an expert in the C standard, but this feels correct to me.
> > 
> > It only fixes comparisons though, the problem still exists for other
> > expressions. For example, while `_Static_assert(__builtin_strlen("")
> > == 0);` works with this change,
> > `_Static_assert(!__builtin_strlen(""));` still fails. Perhaps there's
> > a better way to fix this than changing each expression expansion
> > function to handle this flag?
> 
> Maybe the flag fix just needs to be applied to the evaluation? Other op
> structs do the same. But Dan's patch did not implement evaluate. E.g.:
> 
> static struct symbol_op constant_p_op = {
> 	.evaluate = evaluate_to_int_const_expr,
> 	.expand = expand_constant_p
> };

Nice catch! This seems to fix the issue for me:

diff --git a/builtin.c b/builtin.c
index 9149c43d..7573abf8 100644
--- a/builtin.c
+++ b/builtin.c
@@ -616,6 +616,7 @@ static int expand_strlen(struct expression *expr, int cost)
 }
 
 static struct symbol_op strlen_op = {
+	.evaluate = evaluate_to_int_const_expr,
 	.expand = expand_strlen,
 };


I wonder if there are any other __builtin_* functions that need this too?
Looks like __builtin_object_size doesn't have this either.

Sami

^ permalink raw reply related

* Re: [PATCH 3/3] module: Add compile-time check for embedded NUL characters
From: Daniel Gomez @ 2025-12-11 17:30 UTC (permalink / raw)
  To: Sami Tolvanen, Dan Carpenter
  Cc: Luck, Tony, Chris Li, Eric Biggers, Kees Cook, Luis Chamberlain,
	Rusty Russell, Petr Pavlu, linux-modules@vger.kernel.org,
	Malcolm Priestley, Mauro Carvalho Chehab, Hans Verkuil,
	Uwe Kleine-König, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org, linux-hardening@vger.kernel.org
In-Reply-To: <CABCJKufWw4VQA_k6Deuf5Bn6401cbYv_St8VV_0-LAau6F0nTw@mail.gmail.com>



On 12/12/2025 02.03, Sami Tolvanen wrote:
> On Thu, Dec 11, 2025 at 12:28 AM Dan Carpenter <dan.carpenter@linaro.org> wrote:
>>
>> On Wed, Dec 10, 2025 at 02:29:45PM -0800, Luck, Tony wrote:
>>>> diff --git a/expand.c b/expand.c
>>>> index f14e7181..71221d35 100644
>>>> --- a/expand.c
>>>> +++ b/expand.c
>>>> @@ -535,6 +535,8 @@ static int expand_compare(struct expression *expr)
>>>>                     expr->taint = 0;
>>>>                     return 0;
>>>>             }
>>>> +           if (left->flags & CEF_ICE && right->flags & CEF_ICE)
>>>> +                   expr->flags |= CEF_SET_ICE;
>>>>             if (simplify_cmp_binop(expr, left->ctype))
>>>>                     return 0;
>>>>             if (simplify_float_cmp(expr, left->ctype))
>>
>> I'm not an expert in the C standard, but this feels correct to me.
> 
> It only fixes comparisons though, the problem still exists for other
> expressions. For example, while `_Static_assert(__builtin_strlen("")
> == 0);` works with this change,
> `_Static_assert(!__builtin_strlen(""));` still fails. Perhaps there's
> a better way to fix this than changing each expression expansion
> function to handle this flag?

Maybe the flag fix just needs to be applied to the evaluation? Other op
structs do the same. But Dan's patch did not implement evaluate. E.g.:

static struct symbol_op constant_p_op = {
	.evaluate = evaluate_to_int_const_expr,
	.expand = expand_constant_p
};


> 
> Sami

^ permalink raw reply

* Re: [PATCH 3/3] module: Add compile-time check for embedded NUL characters
From: Sami Tolvanen @ 2025-12-11 17:03 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Luck, Tony, Chris Li, Daniel Gomez, Eric Biggers, Kees Cook,
	Luis Chamberlain, Rusty Russell, Petr Pavlu,
	linux-modules@vger.kernel.org, Malcolm Priestley,
	Mauro Carvalho Chehab, Hans Verkuil, Uwe Kleine-König,
	linux-kernel@vger.kernel.org, linux-media@vger.kernel.org,
	linux-hardening@vger.kernel.org
In-Reply-To: <aTqAxfiVCR2ch4I5@stanley.mountain>

On Thu, Dec 11, 2025 at 12:28 AM Dan Carpenter <dan.carpenter@linaro.org> wrote:
>
> On Wed, Dec 10, 2025 at 02:29:45PM -0800, Luck, Tony wrote:
> > > diff --git a/expand.c b/expand.c
> > > index f14e7181..71221d35 100644
> > > --- a/expand.c
> > > +++ b/expand.c
> > > @@ -535,6 +535,8 @@ static int expand_compare(struct expression *expr)
> > >                     expr->taint = 0;
> > >                     return 0;
> > >             }
> > > +           if (left->flags & CEF_ICE && right->flags & CEF_ICE)
> > > +                   expr->flags |= CEF_SET_ICE;
> > >             if (simplify_cmp_binop(expr, left->ctype))
> > >                     return 0;
> > >             if (simplify_float_cmp(expr, left->ctype))
>
> I'm not an expert in the C standard, but this feels correct to me.

It only fixes comparisons though, the problem still exists for other
expressions. For example, while `_Static_assert(__builtin_strlen("")
== 0);` works with this change,
`_Static_assert(!__builtin_strlen(""));` still fails. Perhaps there's
a better way to fix this than changing each expression expansion
function to handle this flag?

Sami

^ permalink raw reply

* Re: [PATCH 3/3] module: Add compile-time check for embedded NUL characters
From: Dan Carpenter @ 2025-12-11  8:28 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Sami Tolvanen, Chris Li, Daniel Gomez, Eric Biggers, Kees Cook,
	Luis Chamberlain, Rusty Russell, Petr Pavlu,
	linux-modules@vger.kernel.org, Malcolm Priestley,
	Mauro Carvalho Chehab, Hans Verkuil, Uwe Kleine-König,
	linux-kernel@vger.kernel.org, linux-media@vger.kernel.org,
	linux-hardening@vger.kernel.org
In-Reply-To: <aTn0WdVv-S_EpQmS@agluck-desk3>

On Wed, Dec 10, 2025 at 02:29:45PM -0800, Luck, Tony wrote:
> > diff --git a/expand.c b/expand.c
> > index f14e7181..71221d35 100644
> > --- a/expand.c
> > +++ b/expand.c
> > @@ -535,6 +535,8 @@ static int expand_compare(struct expression *expr)
> >  			expr->taint = 0;
> >  			return 0;
> >  		}
> > +		if (left->flags & CEF_ICE && right->flags & CEF_ICE)
> > +			expr->flags |= CEF_SET_ICE;
> >  		if (simplify_cmp_binop(expr, left->ctype))
> >  			return 0;
> >  		if (simplify_float_cmp(expr, left->ctype))

I'm not an expert in the C standard, but this feels correct to me.

regards,
dan carpenter

^ permalink raw reply

* Re: [PATCH 3/3] module: Add compile-time check for embedded NUL characters
From: Luck, Tony @ 2025-12-10 22:29 UTC (permalink / raw)
  To: Sami Tolvanen, Dan Carpenter, Chris Li
  Cc: Daniel Gomez, Eric Biggers, Kees Cook, Luis Chamberlain,
	Rusty Russell, Petr Pavlu, linux-modules@vger.kernel.org,
	Malcolm Priestley, Mauro Carvalho Chehab, Hans Verkuil,
	Uwe Kleine-König, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org, linux-hardening@vger.kernel.org
In-Reply-To: <20251210010020.GA2522829@google.com>

[Added Dan Carpenter and Chris Li]

Dan and Chris,

You're in the git log for sparse wrting/commiting fixes. Can you take
a look at this?  Sami's fix works for me.

-Tony

On Wed, Dec 10, 2025 at 01:00:20AM +0000, Sami Tolvanen wrote:
> On Tue, Dec 09, 2025 at 10:29:06AM -0800, Luck, Tony wrote:
> > On Tue, Dec 09, 2025 at 08:45:14AM -0800, Luck, Tony wrote:
> > > On Tue, Dec 09, 2025 at 04:20:06PM +0000, Luck, Tony wrote:
> > > > >> Likewise, I just got the following kernel test robot report sent to me,
> > > > >> where it's warning about MODULE_LICENSE("GPL"):
> > > > >> https://lore.kernel.org/all/202512090359.7BkUaiC9-lkp@intel.com/
> > > > >
> > > > > Can you both confirm which version of sparse are you using?
> > > > >
> > > > > My understanding was that this patch fixed that problem:
> > > > > >https://lore.kernel.org/linux-sparse/CACePvbVG2KrGQq4cNKV=wbO5h=jp3M0RO1SdfX8kV4OukjPG8A@mail.gmail.com/T/#mf838b3e2e3245d88c30a801ea7473d5a5c0eb121
> > > > 
> > > > > The patch is already merged into the sparse tree, and I was not able to
> > > > > reproduce the issue.
> > > > 
> > > > I pulled the latest sparse source and re-checked before reporting. Top commit I have is the one you mention:
> > > > 
> > > > fbdde3127b83 ("builtin: implement __builtin_strlen() for constants")
> > > > 
> > > > I'm building latest Linus tree from the current merge window (well latest as-of yesterday):
> > > > 
> > > > c2f2b01b74be ("Merge tag 'i3c/for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux")
> > > 
> > > I added a debug trace to the new expand_strlen() function added to
> > > sparse. It is being called and doing the right thing. My trace says:
> > > 
> > > 	len(GPL) = 3
> > 
> > Simple test case:
> > 
> > $ cat -n s.c
> >      1
> >      2  _Static_assert(sizeof("GPL") - 1 == 3, "sizeof");
> >      3
> >      4  _Static_assert(__builtin_strlen("GPL") == 3, "strlen");
> > 
> > $ sparse s.c
> > s.c:4:40: error: bad integer constant expression
> > s.c:4:40: error: static assertion failed: "strlen"
> > 
> > So the "sizeof" bit is OK. But the __builtin_strlen() isn't.
> 
> This looks like a bug in Sparse. The CEF_ICE flag isn't propagated to
> the comparison expression, which it presumably should be when both
> sides are integer constant expressions.
> 
> I'm not really familiar enough with Sparse to know whether this is the
> correct place to handle this case, but this quick hack fixes the issue
> for me:
> 
> diff --git a/expand.c b/expand.c
> index f14e7181..71221d35 100644
> --- a/expand.c
> +++ b/expand.c
> @@ -535,6 +535,8 @@ static int expand_compare(struct expression *expr)
>  			expr->taint = 0;
>  			return 0;
>  		}
> +		if (left->flags & CEF_ICE && right->flags & CEF_ICE)
> +			expr->flags |= CEF_SET_ICE;
>  		if (simplify_cmp_binop(expr, left->ctype))
>  			return 0;
>  		if (simplify_float_cmp(expr, left->ctype))
> 
> Sami

^ permalink raw reply

* Re: [PATCH 3/3] module: Add compile-time check for embedded NUL characters
From: Sami Tolvanen @ 2025-12-10  1:00 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Daniel Gomez, Eric Biggers, Kees Cook, Luis Chamberlain,
	Rusty Russell, Petr Pavlu, linux-modules@vger.kernel.org,
	Malcolm Priestley, Mauro Carvalho Chehab, Hans Verkuil,
	Uwe Kleine-König, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org, linux-hardening@vger.kernel.org
In-Reply-To: <aThqcq0iGge1pQCr@agluck-desk3>

On Tue, Dec 09, 2025 at 10:29:06AM -0800, Luck, Tony wrote:
> On Tue, Dec 09, 2025 at 08:45:14AM -0800, Luck, Tony wrote:
> > On Tue, Dec 09, 2025 at 04:20:06PM +0000, Luck, Tony wrote:
> > > >> Likewise, I just got the following kernel test robot report sent to me,
> > > >> where it's warning about MODULE_LICENSE("GPL"):
> > > >> https://lore.kernel.org/all/202512090359.7BkUaiC9-lkp@intel.com/
> > > >
> > > > Can you both confirm which version of sparse are you using?
> > > >
> > > > My understanding was that this patch fixed that problem:
> > > > >https://lore.kernel.org/linux-sparse/CACePvbVG2KrGQq4cNKV=wbO5h=jp3M0RO1SdfX8kV4OukjPG8A@mail.gmail.com/T/#mf838b3e2e3245d88c30a801ea7473d5a5c0eb121
> > > 
> > > > The patch is already merged into the sparse tree, and I was not able to
> > > > reproduce the issue.
> > > 
> > > I pulled the latest sparse source and re-checked before reporting. Top commit I have is the one you mention:
> > > 
> > > fbdde3127b83 ("builtin: implement __builtin_strlen() for constants")
> > > 
> > > I'm building latest Linus tree from the current merge window (well latest as-of yesterday):
> > > 
> > > c2f2b01b74be ("Merge tag 'i3c/for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux")
> > 
> > I added a debug trace to the new expand_strlen() function added to
> > sparse. It is being called and doing the right thing. My trace says:
> > 
> > 	len(GPL) = 3
> 
> Simple test case:
> 
> $ cat -n s.c
>      1
>      2  _Static_assert(sizeof("GPL") - 1 == 3, "sizeof");
>      3
>      4  _Static_assert(__builtin_strlen("GPL") == 3, "strlen");
> 
> $ sparse s.c
> s.c:4:40: error: bad integer constant expression
> s.c:4:40: error: static assertion failed: "strlen"
> 
> So the "sizeof" bit is OK. But the __builtin_strlen() isn't.

This looks like a bug in Sparse. The CEF_ICE flag isn't propagated to
the comparison expression, which it presumably should be when both
sides are integer constant expressions.

I'm not really familiar enough with Sparse to know whether this is the
correct place to handle this case, but this quick hack fixes the issue
for me:

diff --git a/expand.c b/expand.c
index f14e7181..71221d35 100644
--- a/expand.c
+++ b/expand.c
@@ -535,6 +535,8 @@ static int expand_compare(struct expression *expr)
 			expr->taint = 0;
 			return 0;
 		}
+		if (left->flags & CEF_ICE && right->flags & CEF_ICE)
+			expr->flags |= CEF_SET_ICE;
 		if (simplify_cmp_binop(expr, left->ctype))
 			return 0;
 		if (simplify_float_cmp(expr, left->ctype))

Sami

^ permalink raw reply related

* Re: [PATCH 3/3] module: Add compile-time check for embedded NUL characters
From: Luck, Tony @ 2025-12-09 18:29 UTC (permalink / raw)
  To: Daniel Gomez, Eric Biggers
  Cc: Kees Cook, Luis Chamberlain, Rusty Russell, Petr Pavlu,
	Sami Tolvanen, linux-modules@vger.kernel.org, Malcolm Priestley,
	Mauro Carvalho Chehab, Hans Verkuil, Uwe Kleine-König,
	linux-kernel@vger.kernel.org, linux-media@vger.kernel.org,
	linux-hardening@vger.kernel.org
In-Reply-To: <aThSGiKwJRYOB6kx@agluck-desk3>

On Tue, Dec 09, 2025 at 08:45:14AM -0800, Luck, Tony wrote:
> On Tue, Dec 09, 2025 at 04:20:06PM +0000, Luck, Tony wrote:
> > >> Likewise, I just got the following kernel test robot report sent to me,
> > >> where it's warning about MODULE_LICENSE("GPL"):
> > >> https://lore.kernel.org/all/202512090359.7BkUaiC9-lkp@intel.com/
> > >
> > > Can you both confirm which version of sparse are you using?
> > >
> > > My understanding was that this patch fixed that problem:
> > > >https://lore.kernel.org/linux-sparse/CACePvbVG2KrGQq4cNKV=wbO5h=jp3M0RO1SdfX8kV4OukjPG8A@mail.gmail.com/T/#mf838b3e2e3245d88c30a801ea7473d5a5c0eb121
> > 
> > > The patch is already merged into the sparse tree, and I was not able to
> > > reproduce the issue.
> > 
> > I pulled the latest sparse source and re-checked before reporting. Top commit I have is the one you mention:
> > 
> > fbdde3127b83 ("builtin: implement __builtin_strlen() for constants")
> > 
> > I'm building latest Linus tree from the current merge window (well latest as-of yesterday):
> > 
> > c2f2b01b74be ("Merge tag 'i3c/for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux")
> 
> I added a debug trace to the new expand_strlen() function added to
> sparse. It is being called and doing the right thing. My trace says:
> 
> 	len(GPL) = 3

Simple test case:

$ cat -n s.c
     1
     2  _Static_assert(sizeof("GPL") - 1 == 3, "sizeof");
     3
     4  _Static_assert(__builtin_strlen("GPL") == 3, "strlen");

$ sparse s.c
s.c:4:40: error: bad integer constant expression
s.c:4:40: error: static assertion failed: "strlen"

So the "sizeof" bit is OK. But the __builtin_strlen() isn't.

-Tony

^ permalink raw reply

* Re: [PATCH 3/3] module: Add compile-time check for embedded NUL characters
From: Luck, Tony @ 2025-12-09 16:45 UTC (permalink / raw)
  To: Daniel Gomez, Eric Biggers
  Cc: Kees Cook, Luis Chamberlain, Rusty Russell, Petr Pavlu,
	Sami Tolvanen, linux-modules@vger.kernel.org, Malcolm Priestley,
	Mauro Carvalho Chehab, Hans Verkuil, Uwe Kleine-König,
	linux-kernel@vger.kernel.org, linux-media@vger.kernel.org,
	linux-hardening@vger.kernel.org
In-Reply-To: <SJ1PR11MB6083C6D97484ED4E4710268EFCA3A@SJ1PR11MB6083.namprd11.prod.outlook.com>

On Tue, Dec 09, 2025 at 04:20:06PM +0000, Luck, Tony wrote:
> >> Likewise, I just got the following kernel test robot report sent to me,
> >> where it's warning about MODULE_LICENSE("GPL"):
> >> https://lore.kernel.org/all/202512090359.7BkUaiC9-lkp@intel.com/
> >
> > Can you both confirm which version of sparse are you using?
> >
> > My understanding was that this patch fixed that problem:
> > >https://lore.kernel.org/linux-sparse/CACePvbVG2KrGQq4cNKV=wbO5h=jp3M0RO1SdfX8kV4OukjPG8A@mail.gmail.com/T/#mf838b3e2e3245d88c30a801ea7473d5a5c0eb121
> 
> > The patch is already merged into the sparse tree, and I was not able to
> > reproduce the issue.
> 
> I pulled the latest sparse source and re-checked before reporting. Top commit I have is the one you mention:
> 
> fbdde3127b83 ("builtin: implement __builtin_strlen() for constants")
> 
> I'm building latest Linus tree from the current merge window (well latest as-of yesterday):
> 
> c2f2b01b74be ("Merge tag 'i3c/for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux")

I added a debug trace to the new expand_strlen() function added to
sparse. It is being called and doing the right thing. My trace says:

	len(GPL) = 3

-Tony

^ permalink raw reply

* RE: [PATCH 3/3] module: Add compile-time check for embedded NUL characters
From: Luck, Tony @ 2025-12-09 16:20 UTC (permalink / raw)
  To: Daniel Gomez, Eric Biggers
  Cc: Kees Cook, Luis Chamberlain, Rusty Russell, Petr Pavlu,
	Sami Tolvanen, linux-modules@vger.kernel.org, Malcolm Priestley,
	Mauro Carvalho Chehab, Hans Verkuil, Uwe Kleine-König,
	linux-kernel@vger.kernel.org, linux-media@vger.kernel.org,
	linux-hardening@vger.kernel.org
In-Reply-To: <d3dd8fc8-ab46-4cf1-87a9-0324685ba2e0@kernel.org>

>> Likewise, I just got the following kernel test robot report sent to me,
>> where it's warning about MODULE_LICENSE("GPL"):
>> https://lore.kernel.org/all/202512090359.7BkUaiC9-lkp@intel.com/
>
> Can you both confirm which version of sparse are you using?
>
> My understanding was that this patch fixed that problem:
> >https://lore.kernel.org/linux-sparse/CACePvbVG2KrGQq4cNKV=wbO5h=jp3M0RO1SdfX8kV4OukjPG8A@mail.gmail.com/T/#mf838b3e2e3245d88c30a801ea7473d5a5c0eb121

> The patch is already merged into the sparse tree, and I was not able to
> reproduce the issue.

I pulled the latest sparse source and re-checked before reporting. Top commit I have is the one you mention:

fbdde3127b83 ("builtin: implement __builtin_strlen() for constants")

I'm building latest Linus tree from the current merge window (well latest as-of yesterday):

c2f2b01b74be ("Merge tag 'i3c/for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux")

-Tony





^ permalink raw reply

* Re: [PATCH v3 1/4] kernel.h: drop STACK_MAGIC macro
From: Andi Shyti @ 2025-12-09 15:58 UTC (permalink / raw)
  To: Yury Norov (NVIDIA)
  Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Andy Shevchenko, Christophe Leroy, Randy Dunlap, Ingo Molnar,
	Jani Nikula, Joonas Lahtinen, David Laight, Petr Pavlu,
	Rodrigo Vivi, Tvrtko Ursulin, Daniel Gomez, Greg Kroah-Hartman,
	Rafael J. Wysocki, Danilo Krummrich, Andrew Morton, linux-kernel,
	intel-gfx, dri-devel, linux-modules, linux-trace-kernel,
	Jani Nikula
In-Reply-To: <20251205175237.242022-2-yury.norov@gmail.com>

Hi Yuri,

On Fri, Dec 05, 2025 at 12:52:32PM -0500, Yury Norov (NVIDIA) wrote:
> The macro was introduced in 1994, v1.0.4, for stacks protection. Since
> that, people found better ways to protect stacks, and now the macro is
> only used by i915 selftests. Move it to a local header and drop from
> the kernel.h.
> 
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Acked-by: Randy Dunlap <rdunlap@infradead.org>
> Acked-by: Jani Nikula <jani.nikula@intel.com>
> Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>

Thanks,
Andi

^ permalink raw reply

* Re: [PATCH 3/3] module: Add compile-time check for embedded NUL characters
From: Daniel Gomez @ 2025-12-09  8:18 UTC (permalink / raw)
  To: Eric Biggers, Luck, Tony
  Cc: Kees Cook, Luis Chamberlain, Rusty Russell, Petr Pavlu,
	Sami Tolvanen, linux-modules, Malcolm Priestley,
	Mauro Carvalho Chehab, Hans Verkuil, Uwe Kleine-König,
	linux-kernel, linux-media, linux-hardening
In-Reply-To: <20251209001139.GA7982@quark>



On 09/12/2025 09.11, Eric Biggers wrote:
> On Mon, Dec 08, 2025 at 01:05:55PM -0800, Luck, Tony wrote:
>> On Tue, Oct 07, 2025 at 08:59:35PM -0700, Kees Cook wrote:
>>> Long ago, the kernel module license checks were bypassed by embedding a
>>> NUL character in the MODULE_LICENSE() string[1]. By using a string like
>>> "GPL\0proprietary text", the kernel would only read "GPL" due to C string
>>> termination at the NUL byte, allowing proprietary modules to avoid kernel
>>> tainting and access GPL-only symbols.
>>>
>>> The MODULE_INFO() macro stores these strings in the .modinfo ELF
>>> section, and get_next_modinfo() uses strcmp()-family functions
>>> which stop at the first NUL. This split the embedded string into two
>>> separate .modinfo entries, with only the first part being processed by
>>> license_is_gpl_compatible().
>>>
>>> Add a compile-time check using _Static_assert that compares the full
>>> string length (sizeof - 1) against __builtin_strlen(), which stops at
>>> the first NUL. If they differ, compilation fails with a clear error
>>> message.
>>>
>>> While this check can still be circumvented by modifying the ELF binary
>>> post-compilation, it prevents accidental embedded NULs and forces
>>> intentional abuse to require deliberate binary manipulation rather than
>>> simple source-level tricks.
>>>
>>> Build tested with test modules containing both valid and invalid license
>>> strings. The check correctly rejects:
>>>
>>>     MODULE_LICENSE("GPL\0proprietary")
>>>
>>> while accepting normal declarations:
>>>
>>>     MODULE_LICENSE("GPL")
>>
>>  
>> I did a "make W=1 C=1" and found that sparse is now unhappy with all MODULE_LICENSE(),
>> MODULE_PARM_DESC(), MODULE_DESCRIPTION(), MODULE_AUTHOR() defintions (with no NUL byte).
>>
>> I see:
>>
>> error: bad integer constant expression
>> error: static assertion failed: "MODULE_INFO(parmtype, ...) contains embedded NUL byte"
>>
>> for every use.
> 
> Likewise, I just got the following kernel test robot report sent to me,
> where it's warning about MODULE_LICENSE("GPL"):
> https://lore.kernel.org/all/202512090359.7BkUaiC9-lkp@intel.com/

Can you both confirm which version of sparse are you using?

My understanding was that this patch fixed that problem:
https://lore.kernel.org/linux-sparse/CACePvbVG2KrGQq4cNKV=wbO5h=jp3M0RO1SdfX8kV4OukjPG8A@mail.gmail.com/T/#mf838b3e2e3245d88c30a801ea7473d5a5c0eb121

The patch is already merged into the sparse tree, and I was not able to
reproduce the issue.

^ permalink raw reply

* Re: [PATCH 3/3] module: Add compile-time check for embedded NUL characters
From: Eric Biggers @ 2025-12-09  0:11 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Kees Cook, Luis Chamberlain, Rusty Russell, Petr Pavlu,
	Daniel Gomez, Sami Tolvanen, linux-modules, Malcolm Priestley,
	Mauro Carvalho Chehab, Hans Verkuil, Uwe Kleine-König,
	linux-kernel, linux-media, linux-hardening
In-Reply-To: <aTc9s210am0YqMV4@agluck-desk3>

On Mon, Dec 08, 2025 at 01:05:55PM -0800, Luck, Tony wrote:
> On Tue, Oct 07, 2025 at 08:59:35PM -0700, Kees Cook wrote:
> > Long ago, the kernel module license checks were bypassed by embedding a
> > NUL character in the MODULE_LICENSE() string[1]. By using a string like
> > "GPL\0proprietary text", the kernel would only read "GPL" due to C string
> > termination at the NUL byte, allowing proprietary modules to avoid kernel
> > tainting and access GPL-only symbols.
> > 
> > The MODULE_INFO() macro stores these strings in the .modinfo ELF
> > section, and get_next_modinfo() uses strcmp()-family functions
> > which stop at the first NUL. This split the embedded string into two
> > separate .modinfo entries, with only the first part being processed by
> > license_is_gpl_compatible().
> > 
> > Add a compile-time check using _Static_assert that compares the full
> > string length (sizeof - 1) against __builtin_strlen(), which stops at
> > the first NUL. If they differ, compilation fails with a clear error
> > message.
> > 
> > While this check can still be circumvented by modifying the ELF binary
> > post-compilation, it prevents accidental embedded NULs and forces
> > intentional abuse to require deliberate binary manipulation rather than
> > simple source-level tricks.
> > 
> > Build tested with test modules containing both valid and invalid license
> > strings. The check correctly rejects:
> > 
> >     MODULE_LICENSE("GPL\0proprietary")
> > 
> > while accepting normal declarations:
> > 
> >     MODULE_LICENSE("GPL")
> 
>  
> I did a "make W=1 C=1" and found that sparse is now unhappy with all MODULE_LICENSE(),
> MODULE_PARM_DESC(), MODULE_DESCRIPTION(), MODULE_AUTHOR() defintions (with no NUL byte).
> 
> I see:
> 
> error: bad integer constant expression
> error: static assertion failed: "MODULE_INFO(parmtype, ...) contains embedded NUL byte"
> 
> for every use.

Likewise, I just got the following kernel test robot report sent to me,
where it's warning about MODULE_LICENSE("GPL"):
https://lore.kernel.org/all/202512090359.7BkUaiC9-lkp@intel.com/

- Eric

^ permalink raw reply

* Re: [PATCH 3/3] module: Add compile-time check for embedded NUL characters
From: Luck, Tony @ 2025-12-08 21:05 UTC (permalink / raw)
  To: Kees Cook
  Cc: Luis Chamberlain, Rusty Russell, Petr Pavlu, Daniel Gomez,
	Sami Tolvanen, linux-modules, Malcolm Priestley,
	Mauro Carvalho Chehab, Hans Verkuil, Uwe Kleine-König,
	linux-kernel, linux-media, linux-hardening
In-Reply-To: <20251008035938.838263-3-kees@kernel.org>

On Tue, Oct 07, 2025 at 08:59:35PM -0700, Kees Cook wrote:
> Long ago, the kernel module license checks were bypassed by embedding a
> NUL character in the MODULE_LICENSE() string[1]. By using a string like
> "GPL\0proprietary text", the kernel would only read "GPL" due to C string
> termination at the NUL byte, allowing proprietary modules to avoid kernel
> tainting and access GPL-only symbols.
> 
> The MODULE_INFO() macro stores these strings in the .modinfo ELF
> section, and get_next_modinfo() uses strcmp()-family functions
> which stop at the first NUL. This split the embedded string into two
> separate .modinfo entries, with only the first part being processed by
> license_is_gpl_compatible().
> 
> Add a compile-time check using _Static_assert that compares the full
> string length (sizeof - 1) against __builtin_strlen(), which stops at
> the first NUL. If they differ, compilation fails with a clear error
> message.
> 
> While this check can still be circumvented by modifying the ELF binary
> post-compilation, it prevents accidental embedded NULs and forces
> intentional abuse to require deliberate binary manipulation rather than
> simple source-level tricks.
> 
> Build tested with test modules containing both valid and invalid license
> strings. The check correctly rejects:
> 
>     MODULE_LICENSE("GPL\0proprietary")
> 
> while accepting normal declarations:
> 
>     MODULE_LICENSE("GPL")

 
I did a "make W=1 C=1" and found that sparse is now unhappy with all MODULE_LICENSE(),
MODULE_PARM_DESC(), MODULE_DESCRIPTION(), MODULE_AUTHOR() defintions (with no NUL byte).

I see:

error: bad integer constant expression
error: static assertion failed: "MODULE_INFO(parmtype, ...) contains embedded NUL byte"

for every use.

-Tony

^ permalink raw reply

* Re: [PATCH V3] mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction
From: Vlastimil Babka @ 2025-12-07 17:11 UTC (permalink / raw)
  To: Harry Yoo
  Cc: surenb, Liam.Howlett, cl, rientjes, roman.gushchin, urezki,
	sidhartha.kumar, linux-mm, linux-kernel, rcu, maple-tree,
	linux-modules, mcgrof, petr.pavlu, samitolvanen, atomlin,
	lucas.demarchi, akpm, jonathanh, stable, Daniel Gomez
In-Reply-To: <20251207154148.117723-1-harry.yoo@oracle.com>

On 12/7/25 16:41, Harry Yoo wrote:
> Currently, kvfree_rcu_barrier() flushes RCU sheaves across all slab
> caches when a cache is destroyed. This is unnecessary; only the RCU
> sheaves belonging to the cache being destroyed need to be flushed.
> 
> As suggested by Vlastimil Babka, introduce a weaker form of
> kvfree_rcu_barrier() that operates on a specific slab cache.
> 
> Factor out flush_rcu_sheaves_on_cache() from flush_all_rcu_sheaves() and
> call it from flush_all_rcu_sheaves() and kvfree_rcu_barrier_on_cache().
> 
> Call kvfree_rcu_barrier_on_cache() instead of kvfree_rcu_barrier() on
> cache destruction.
> 
> The performance benefit is evaluated on a 12 core 24 threads AMD Ryzen
> 5900X machine (1 socket), by loading slub_kunit module.
> 
> Before:
>   Total calls: 19
>   Average latency (us): 18127
>   Total time (us): 344414
> 
> After:
>   Total calls: 19
>   Average latency (us): 10066
>   Total time (us): 191264
> 
> Two performance regression have been reported:
>   - stress module loader test's runtime increases by 50-60% (Daniel)
>   - internal graphics test's runtime on Tegra23 increases by 35% (Jon)
> 
> They are fixed by this change.
> 
> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Fixes: ec66e0d59952 ("slab: add sheaf support for batching kfree_rcu() operations")
> Cc: <stable@vger.kernel.org>
> Link: https://lore.kernel.org/linux-mm/1bda09da-93be-4737-aef0-d47f8c5c9301@suse.cz
> Reported-and-tested-by: Daniel Gomez <da.gomez@samsung.com>
> Closes: https://lore.kernel.org/linux-mm/0406562e-2066-4cf8-9902-b2b0616dd742@kernel.org
> Reported-and-tested-by: Jon Hunter <jonathanh@nvidia.com>
> Closes: https://lore.kernel.org/linux-mm/e988eff6-1287-425e-a06c-805af5bbf262@nvidia.com
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> ---
> 
> v2 -> v3:
> - Addressed Suren's comment [1], thanks!

Thanks, replaced.


^ permalink raw reply

* [PATCH V3] mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction
From: Harry Yoo @ 2025-12-07 15:41 UTC (permalink / raw)
  To: vbabka
  Cc: surenb, Liam.Howlett, cl, rientjes, roman.gushchin, harry.yoo,
	urezki, sidhartha.kumar, linux-mm, linux-kernel, rcu, maple-tree,
	linux-modules, mcgrof, petr.pavlu, samitolvanen, atomlin,
	lucas.demarchi, akpm, jonathanh, stable, Daniel Gomez

Currently, kvfree_rcu_barrier() flushes RCU sheaves across all slab
caches when a cache is destroyed. This is unnecessary; only the RCU
sheaves belonging to the cache being destroyed need to be flushed.

As suggested by Vlastimil Babka, introduce a weaker form of
kvfree_rcu_barrier() that operates on a specific slab cache.

Factor out flush_rcu_sheaves_on_cache() from flush_all_rcu_sheaves() and
call it from flush_all_rcu_sheaves() and kvfree_rcu_barrier_on_cache().

Call kvfree_rcu_barrier_on_cache() instead of kvfree_rcu_barrier() on
cache destruction.

The performance benefit is evaluated on a 12 core 24 threads AMD Ryzen
5900X machine (1 socket), by loading slub_kunit module.

Before:
  Total calls: 19
  Average latency (us): 18127
  Total time (us): 344414

After:
  Total calls: 19
  Average latency (us): 10066
  Total time (us): 191264

Two performance regression have been reported:
  - stress module loader test's runtime increases by 50-60% (Daniel)
  - internal graphics test's runtime on Tegra23 increases by 35% (Jon)

They are fixed by this change.

Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: ec66e0d59952 ("slab: add sheaf support for batching kfree_rcu() operations")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/linux-mm/1bda09da-93be-4737-aef0-d47f8c5c9301@suse.cz
Reported-and-tested-by: Daniel Gomez <da.gomez@samsung.com>
Closes: https://lore.kernel.org/linux-mm/0406562e-2066-4cf8-9902-b2b0616dd742@kernel.org
Reported-and-tested-by: Jon Hunter <jonathanh@nvidia.com>
Closes: https://lore.kernel.org/linux-mm/e988eff6-1287-425e-a06c-805af5bbf262@nvidia.com
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
---

v2 -> v3:
- Addressed Suren's comment [1], thanks!

[1] https://lore.kernel.org/linux-mm/CAJuCfpE+g65Dm8-r=psDJQf_O1rfBG62DOzx4mE1mb+ottUKmQ@mail.gmail.com 

 include/linux/slab.h |  7 ++++++
 mm/slab.h            |  1 +
 mm/slab_common.c     | 52 +++++++++++++++++++++++++++++------------
 mm/slub.c            | 55 ++++++++++++++++++++++++--------------------
 4 files changed, 75 insertions(+), 40 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index cf443f064a66..2482992248dc 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -1150,10 +1150,17 @@ static inline void kvfree_rcu_barrier(void)
 	rcu_barrier();
 }
 
+static inline void kvfree_rcu_barrier_on_cache(struct kmem_cache *s)
+{
+	rcu_barrier();
+}
+
 static inline void kfree_rcu_scheduler_running(void) { }
 #else
 void kvfree_rcu_barrier(void);
 
+void kvfree_rcu_barrier_on_cache(struct kmem_cache *s);
+
 void kfree_rcu_scheduler_running(void);
 #endif
 
diff --git a/mm/slab.h b/mm/slab.h
index f730e012553c..e767aa7e91b0 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -422,6 +422,7 @@ static inline bool is_kmalloc_normal(struct kmem_cache *s)
 
 bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj);
 void flush_all_rcu_sheaves(void);
+void flush_rcu_sheaves_on_cache(struct kmem_cache *s);
 
 #define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | \
 			 SLAB_CACHE_DMA32 | SLAB_PANIC | \
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 84dfff4f7b1f..dd8a49d6f9cc 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -492,7 +492,7 @@ void kmem_cache_destroy(struct kmem_cache *s)
 		return;
 
 	/* in-flight kfree_rcu()'s may include objects from our cache */
-	kvfree_rcu_barrier();
+	kvfree_rcu_barrier_on_cache(s);
 
 	if (IS_ENABLED(CONFIG_SLUB_RCU_DEBUG) &&
 	    (s->flags & SLAB_TYPESAFE_BY_RCU)) {
@@ -2038,25 +2038,13 @@ void kvfree_call_rcu(struct rcu_head *head, void *ptr)
 }
 EXPORT_SYMBOL_GPL(kvfree_call_rcu);
 
-/**
- * kvfree_rcu_barrier - Wait until all in-flight kvfree_rcu() complete.
- *
- * Note that a single argument of kvfree_rcu() call has a slow path that
- * triggers synchronize_rcu() following by freeing a pointer. It is done
- * before the return from the function. Therefore for any single-argument
- * call that will result in a kfree() to a cache that is to be destroyed
- * during module exit, it is developer's responsibility to ensure that all
- * such calls have returned before the call to kmem_cache_destroy().
- */
-void kvfree_rcu_barrier(void)
+static inline void __kvfree_rcu_barrier(void)
 {
 	struct kfree_rcu_cpu_work *krwp;
 	struct kfree_rcu_cpu *krcp;
 	bool queued;
 	int i, cpu;
 
-	flush_all_rcu_sheaves();
-
 	/*
 	 * Firstly we detach objects and queue them over an RCU-batch
 	 * for all CPUs. Finally queued works are flushed for each CPU.
@@ -2118,8 +2106,43 @@ void kvfree_rcu_barrier(void)
 		}
 	}
 }
+
+/**
+ * kvfree_rcu_barrier - Wait until all in-flight kvfree_rcu() complete.
+ *
+ * Note that a single argument of kvfree_rcu() call has a slow path that
+ * triggers synchronize_rcu() following by freeing a pointer. It is done
+ * before the return from the function. Therefore for any single-argument
+ * call that will result in a kfree() to a cache that is to be destroyed
+ * during module exit, it is developer's responsibility to ensure that all
+ * such calls have returned before the call to kmem_cache_destroy().
+ */
+void kvfree_rcu_barrier(void)
+{
+	flush_all_rcu_sheaves();
+	__kvfree_rcu_barrier();
+}
 EXPORT_SYMBOL_GPL(kvfree_rcu_barrier);
 
+/**
+ * kvfree_rcu_barrier_on_cache - Wait for in-flight kvfree_rcu() calls on a
+ *                               specific slab cache.
+ * @s: slab cache to wait for
+ *
+ * See the description of kvfree_rcu_barrier() for details.
+ */
+void kvfree_rcu_barrier_on_cache(struct kmem_cache *s)
+{
+	if (s->cpu_sheaves)
+		flush_rcu_sheaves_on_cache(s);
+	/*
+	 * TODO: Introduce a version of __kvfree_rcu_barrier() that works
+	 * on a specific slab cache.
+	 */
+	__kvfree_rcu_barrier();
+}
+EXPORT_SYMBOL_GPL(kvfree_rcu_barrier_on_cache);
+
 static unsigned long
 kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 {
@@ -2215,4 +2238,3 @@ void __init kvfree_rcu_init(void)
 }
 
 #endif /* CONFIG_KVFREE_RCU_BATCHED */
-
diff --git a/mm/slub.c b/mm/slub.c
index 785e25a14999..7cec2220712b 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4118,42 +4118,47 @@ static void flush_rcu_sheaf(struct work_struct *w)
 
 
 /* needed for kvfree_rcu_barrier() */
-void flush_all_rcu_sheaves(void)
+void flush_rcu_sheaves_on_cache(struct kmem_cache *s)
 {
 	struct slub_flush_work *sfw;
-	struct kmem_cache *s;
 	unsigned int cpu;
 
-	cpus_read_lock();
-	mutex_lock(&slab_mutex);
+	mutex_lock(&flush_lock);
 
-	list_for_each_entry(s, &slab_caches, list) {
-		if (!s->cpu_sheaves)
-			continue;
+	for_each_online_cpu(cpu) {
+		sfw = &per_cpu(slub_flush, cpu);
 
-		mutex_lock(&flush_lock);
+		/*
+		 * we don't check if rcu_free sheaf exists - racing
+		 * __kfree_rcu_sheaf() might have just removed it.
+		 * by executing flush_rcu_sheaf() on the cpu we make
+		 * sure the __kfree_rcu_sheaf() finished its call_rcu()
+		 */
 
-		for_each_online_cpu(cpu) {
-			sfw = &per_cpu(slub_flush, cpu);
+		INIT_WORK(&sfw->work, flush_rcu_sheaf);
+		sfw->s = s;
+		queue_work_on(cpu, flushwq, &sfw->work);
+	}
 
-			/*
-			 * we don't check if rcu_free sheaf exists - racing
-			 * __kfree_rcu_sheaf() might have just removed it.
-			 * by executing flush_rcu_sheaf() on the cpu we make
-			 * sure the __kfree_rcu_sheaf() finished its call_rcu()
-			 */
+	for_each_online_cpu(cpu) {
+		sfw = &per_cpu(slub_flush, cpu);
+		flush_work(&sfw->work);
+	}
 
-			INIT_WORK(&sfw->work, flush_rcu_sheaf);
-			sfw->s = s;
-			queue_work_on(cpu, flushwq, &sfw->work);
-		}
+	mutex_unlock(&flush_lock);
+}
 
-		for_each_online_cpu(cpu) {
-			sfw = &per_cpu(slub_flush, cpu);
-			flush_work(&sfw->work);
-		}
+void flush_all_rcu_sheaves(void)
+{
+	struct kmem_cache *s;
+
+	cpus_read_lock();
+	mutex_lock(&slab_mutex);
 
-		mutex_unlock(&flush_lock);
+	list_for_each_entry(s, &slab_caches, list) {
+		if (!s->cpu_sheaves)
+			continue;
+		flush_rcu_sheaves_on_cache(s);
 	}
 
 	mutex_unlock(&slab_mutex);
-- 
2.43.0


^ permalink raw reply related

* Re: [GIT PULL] Modules changes for v6.19-rc1
From: pr-tracker-bot @ 2025-12-06 17:22 UTC (permalink / raw)
  To: Daniel Gomez
  Cc: Linus Torvalds, Luis Chamberlain, Petr Pavlu, Sami Tolvanen,
	Daniel Gomez, Mauro Carvalho Chehab, Hans Verkuil,
	Andreas Hindborg, Uwe Kleine-König, Aaron Tomlin, Kees Cook,
	linux-modules, linux-kernel
In-Reply-To: <20251203234840.3720-1-da.gomez@kernel.org>

The pull request you sent on Thu,  4 Dec 2025 00:48:37 +0100:

> ssh://git@gitolite.kernel.org/pub/scm/linux/kernel/git/modules/linux.git/ tags/modules-6.19-rc1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/c84d574698bad2c02aad506dfe712f83cbe3b771

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply

* Re: [GIT PULL] Modules changes for v6.19-rc1
From: Linus Torvalds @ 2025-12-06 16:31 UTC (permalink / raw)
  To: Daniel Gomez
  Cc: Luis Chamberlain, Petr Pavlu, Sami Tolvanen, Daniel Gomez,
	Mauro Carvalho Chehab, Hans Verkuil, Andreas Hindborg,
	Uwe Kleine-König, Aaron Tomlin, Kees Cook, linux-modules,
	linux-kernel
In-Reply-To: <7dc0f103-ab0b-40db-b246-a1af40afe4d3@kernel.org>

On Thu, 4 Dec 2025 at 03:52, Daniel Gomez <da.gomez@kernel.org> wrote:
> >
> >   ssh://git@gitolite.kernel.org/pub/scm/linux/kernel/git/modules/linux.git/ tags/modules-6.19-rc1
>
> I know the preference is to use git:// but my git config resolved to the above
> link.

I do automatic conversion from the private ssh address to the
externally visible git one for kernel.org addresses.

So when a maintainer sends me one of those "only available to
kernel.org accounts" links, my scripting will make it show up as the
public one instead.

I do that for visual consistency, but also so that people who _don't_
have an account can follow along at home.

Put another way: no worries. As long as you use the standard syntax,
it gets corrected for public use anyway.

              Linus

^ permalink raw reply

* Re: [PATCH v18 25/42] dept: add documents for dept
From: Bagas Sanjaya @ 2025-12-06  0:25 UTC (permalink / raw)
  To: Byungchul Park, linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, x86, hpa, luto, sumit.semwal,
	gustavo, christian.koenig, andi.shyti, arnd, lorenzo.stoakes,
	Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu, da.gomez,
	samitolvanen, paulmck, frederic, neeraj.upadhyay, joelagnelf,
	josh, urezki, mathieu.desnoyers, jiangshanlai, qiang.zhang,
	juri.lelli, vincent.guittot, dietmar.eggemann, bsegall, mgorman,
	vschneid, chuck.lever, neil, okorniev, Dai.Ngo, tom, trondmy,
	anna, kees, bigeasy, clrkwllms, mark.rutland, ada.coupriediaz,
	kristina.martsenko, wangkefeng.wang, broonie, kevin.brodsky, dwmw,
	shakeel.butt, ast, ziy, yuzhao, baolin.wang, usamaarif642,
	joel.granados, richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel,
	2407018371, dakr, miguel.ojeda.sandonis, neilb, wsa+renesas,
	dave.hansen, geert, ojeda, alex.gaynor, gary, bjorn3_gh, lossin,
	a.hindborg, aliceryhl, tmgross, rust-for-linux
In-Reply-To: <20251205071855.72743-26-byungchul@sk.com>

[-- Attachment #1: Type: text/plain, Size: 9470 bytes --]

On Fri, Dec 05, 2025 at 04:18:38PM +0900, Byungchul Park wrote:
> Add documents describing the concept and APIs of dept.
> 
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> ---
>  Documentation/dev-tools/dept.rst     | 778 +++++++++++++++++++++++++++
>  Documentation/dev-tools/dept_api.rst | 125 +++++

You forget to add toctree entries:

---- >8 ----
diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 4b8425e348abd1..02c858f5ed1fa2 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -22,6 +22,8 @@ Documentation/process/debugging/index.rst
    clang-format
    coccinelle
    sparse
+   dept
+   dept_api
    kcov
    gcov
    kasan

> +Lockdep detects a deadlock by checking lock acquisition order.  For
> +example, a graph to track acquisition order built by lockdep might look
> +like:
> +
> +.. literal::
> +
> +   A -> B -
> +           \
> +            -> E
> +           /
> +   C -> D -
> +
> +   where 'A -> B' means that acquisition A is prior to acquisition B
> +   with A still held.

Use code-block directive for literal code blocks:

---- >8 ----
diff --git a/Documentation/dev-tools/dept.rst b/Documentation/dev-tools/dept.rst
index 333166464543d7..8394c4ea81bc2a 100644
--- a/Documentation/dev-tools/dept.rst
+++ b/Documentation/dev-tools/dept.rst
@@ -10,7 +10,7 @@ Lockdep detects a deadlock by checking lock acquisition order.  For
 example, a graph to track acquisition order built by lockdep might look
 like:
 
-.. literal::
+.. code-block::
 
    A -> B -
            \
@@ -25,7 +25,7 @@ Lockdep keeps adding each new acquisition order into the graph at
 runtime.  For example, 'E -> C' will be added when the two locks have
 been acquired in the order, E and then C.  The graph will look like:
 
-.. literal::
+.. code-block::
 
        A -> B -
                \
@@ -41,7 +41,7 @@ been acquired in the order, E and then C.  The graph will look like:
 
 This graph contains a subgraph that demonstrates a loop like:
 
-.. literal::
+.. code-block::
 
                 -> E -
                /      \
@@ -76,7 +76,7 @@ e.g. irq context, normal process context, wq worker context, or so on.
 
 Can lockdep detect the following deadlock?
 
-.. literal::
+.. code-block::
 
    context X	   context Y	   context Z
 
@@ -91,7 +91,7 @@ Can lockdep detect the following deadlock?
 
 No.  What about the following?
 
-.. literal::
+.. code-block::
 
    context X		   context Y
 
@@ -116,7 +116,7 @@ What leads a deadlock
 A deadlock occurs when one or multi contexts are waiting for events that
 will never happen.  For example:
 
-.. literal::
+.. code-block::
 
    context X	   context Y	   context Z
 
@@ -148,7 +148,7 @@ In terms of dependency:
 
 Dependency graph reflecting this example will look like:
 
-.. literal::
+.. code-block::
 
     -> C -> A -> B -
    /                \
@@ -171,7 +171,7 @@ Introduce DEPT
 DEPT(DEPendency Tracker) tracks wait and event instead of lock
 acquisition order so as to recognize the following situation:
 
-.. literal::
+.. code-block::
 
    context X	   context Y	   context Z
 
@@ -186,7 +186,7 @@ acquisition order so as to recognize the following situation:
 and builds up a dependency graph at runtime that is similar to lockdep.
 The graph might look like:
 
-.. literal::
+.. code-block::
 
     -> C -> A -> B -
    /                \
@@ -199,7 +199,7 @@ DEPT keeps adding each new dependency into the graph at runtime.  For
 example, 'B -> D' will be added when event D occurrence is a
 prerequisite to reaching event B like:
 
-.. literal::
+.. code-block::
 
    context W
 
@@ -211,7 +211,7 @@ prerequisite to reaching event B like:
 
 After the addition, the graph will look like:
 
-.. literal::
+.. code-block::
 
                      -> D
                     /
@@ -236,7 +236,7 @@ How DEPT works
 Let's take a look how DEPT works with the 1st example in the section
 'Limitation of lockdep'.
 
-.. literal::
+.. code-block::
 
    context X	   context Y	   context Z
 
@@ -256,7 +256,7 @@ event.
 
 Adding comments to describe DEPT's view in detail:
 
-.. literal::
+.. code-block::
 
    context X	   context Y	   context Z
 
@@ -293,7 +293,7 @@ Adding comments to describe DEPT's view in detail:
 
 Let's build up dependency graph with this example.  Firstly, context X:
 
-.. literal::
+.. code-block::
 
    context X
 
@@ -304,7 +304,7 @@ Let's build up dependency graph with this example.  Firstly, context X:
 
 There are no events to create dependency.  Next, context Y:
 
-.. literal::
+.. code-block::
 
    context Y
 
@@ -332,7 +332,7 @@ event A cannot be triggered if wait B cannot be awakened by event B.
 Therefore, we can say event A depends on event B, say, 'A -> B'.  The
 graph will look like after adding the dependency:
 
-.. literal::
+.. code-block::
 
    A -> B
 
@@ -340,7 +340,7 @@ graph will look like after adding the dependency:
 
 Lastly, context Z:
 
-.. literal::
+.. code-block::
 
    context Z
 
@@ -362,7 +362,7 @@ triggered if wait A cannot be awakened by event A.  Therefore, we can
 say event B depends on event A, say, 'B -> A'.  The graph will look like
 after adding the dependency:
 
-.. literal::
+.. code-block::
 
     -> A -> B -
    /           \
@@ -386,7 +386,7 @@ Interpret DEPT report
 
 The following is the same example in the section 'How DEPT works'.
 
-.. literal::
+.. code-block::
 
    context X	   context Y	   context Z
 
@@ -425,7 +425,7 @@ We can simplify this by labeling each waiting point with [W], each
 point where its event's context starts with [S] and each event with [E].
 This example will look like after the labeling:
 
-.. literal::
+.. code-block::
 
    context X	   context Y	   context Z
 
@@ -443,7 +443,7 @@ DEPT uses the symbols [W], [S] and [E] in its report as described above.
 The following is an example reported by DEPT for a real problem in
 practice.
 
-.. literal::
+.. code-block::
 
    Link: https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a485657d@I-love.SAKURA.ne.jp/#t
    Link: https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.park@lge.com/
@@ -646,7 +646,7 @@ practice.
 
 Let's take a look at the summary that is the most important part.
 
-.. literal::
+.. code-block::
 
    ---------------------------------------------------
    summary
@@ -669,7 +669,7 @@ Let's take a look at the summary that is the most important part.
 
 The summary shows the following scenario:
 
-.. literal::
+.. code-block::
 
    context A	   context B	   context ?(unknown)
 
@@ -684,7 +684,7 @@ The summary shows the following scenario:
 
 Adding comments to describe DEPT's view in detail:
 
-.. literal::
+.. code-block::
 
    context A	   context B	   context ?(unknown)
 
@@ -711,7 +711,7 @@ Adding comments to describe DEPT's view in detail:
 
 Let's build up dependency graph with this report. Firstly, context A:
 
-.. literal::
+.. code-block::
 
    context A
 
@@ -735,7 +735,7 @@ unlock(&ni->ni_lock:0) depends on folio_unlock(&f1), say,
 
 The graph will look like after adding the dependency:
 
-.. literal::
+.. code-block::
 
    unlock(&ni->ni_lock:0) -> folio_unlock(&f1)
 
@@ -743,7 +743,7 @@ The graph will look like after adding the dependency:
 
 Secondly, context B:
 
-.. literal::
+.. code-block::
 
    context B
 
@@ -762,7 +762,7 @@ folio_unlock(&f1) depends on unlock(&ni->ni_lock:0), say,
 
 The graph will look like after adding the dependency:
 
-.. literal::
+.. code-block::
 
     -> unlock(&ni->ni_lock:0) -> folio_unlock(&f1) -
    /                                                \

> +Limitation of lockdep
> +---------------------
> +
> +Lockdep deals with a deadlock by typical lock e.g. spinlock and mutex,
> +that are supposed to be released within the acquisition context.
> +However, when it comes to a deadlock by folio lock that is not supposed
> +to be released within the acquisition context or other general
> +synchronization mechanisms, lockdep doesn't work.
> +
> +NOTE:  In this document, 'context' refers to any type of unique context
> +e.g. irq context, normal process context, wq worker context, or so on.
> +
> +Can lockdep detect the following deadlock?
> +
> +.. literal::
> +
> +   context X	   context Y	   context Z
> +
> +		   mutex_lock A
> +   folio_lock B
> +		   folio_lock B <- DEADLOCK
> +				   mutex_lock A <- DEADLOCK
> +				   folio_unlock B
> +		   folio_unlock B
> +		   mutex_unlock A
> +				   mutex_unlock A
> +
> +No.  What about the following?
> +
> +.. literal::
> +
> +   context X		   context Y
> +
> +			   mutex_lock A
> +   mutex_lock A <- DEADLOCK
> +			   wait_for_complete B <- DEADLOCK
> +   complete B
> +			   mutex_unlock A
> +   mutex_unlock A
> +
> +No.

One unanswered question from my v17 review [1]: You explain in "How DEPT works"
section how DEPT detects deadlock in the first example (the former with three
contexts). Can you do the same on the second example (the latter with two
contexts)?

Thanks.

[1]: https://lore.kernel.org/linux-doc/aN84jKyrE1BumpLj@archie.me/

-- 
An old man doll... just what I always wanted! - Clara

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply related

* Re: [PATCH v3 4/4] tracing: move tracing declarations from kernel.h to a dedicated header
From: Andy Shevchenko @ 2025-12-05 20:55 UTC (permalink / raw)
  To: Yury Norov (NVIDIA)
  Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Christophe Leroy, Randy Dunlap, Ingo Molnar, Jani Nikula,
	Joonas Lahtinen, David Laight, Petr Pavlu, Andi Shyti,
	Rodrigo Vivi, Tvrtko Ursulin, Daniel Gomez, Greg Kroah-Hartman,
	Rafael J. Wysocki, Danilo Krummrich, Andrew Morton, linux-kernel,
	intel-gfx, dri-devel, linux-modules, linux-trace-kernel
In-Reply-To: <20251205175237.242022-5-yury.norov@gmail.com>

On Fri, Dec 05, 2025 at 12:52:35PM -0500, Yury Norov (NVIDIA) wrote:
> Tracing is a half of the kernel.h in terms of LOCs, although it's
> a self-consistent part. It is intended for quick debugging purposes
> and isn't used by the normal tracing utilities.
> 
> Move it to a separate header. If someone needs to just throw a
> trace_printk() in their driver, they will not have to pull all
> the heavy tracing machinery.
> 
> This is a pure move, except for removing a few 'extern's.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply

* [PATCH v3 4/4] tracing: move tracing declarations from kernel.h to a dedicated header
From: Yury Norov (NVIDIA) @ 2025-12-05 17:52 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Andy Shevchenko, Christophe Leroy, Randy Dunlap, Ingo Molnar,
	Jani Nikula, Joonas Lahtinen, David Laight, Petr Pavlu,
	Andi Shyti, Rodrigo Vivi, Tvrtko Ursulin, Daniel Gomez,
	Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
	Andrew Morton, linux-kernel, intel-gfx, dri-devel, linux-modules,
	linux-trace-kernel
  Cc: Yury Norov (NVIDIA)
In-Reply-To: <20251205175237.242022-1-yury.norov@gmail.com>

Tracing is a half of the kernel.h in terms of LOCs, although it's
a self-consistent part. It is intended for quick debugging purposes
and isn't used by the normal tracing utilities.

Move it to a separate header. If someone needs to just throw a
trace_printk() in their driver, they will not have to pull all
the heavy tracing machinery.

This is a pure move, except for removing a few 'extern's.

Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
---
 include/linux/kernel.h       | 196 +--------------------------------
 include/linux/trace_printk.h | 205 +++++++++++++++++++++++++++++++++++
 2 files changed, 206 insertions(+), 195 deletions(-)
 create mode 100644 include/linux/trace_printk.h

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 5b879bfea948..a377335e01da 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -32,7 +32,7 @@
 #include <linux/build_bug.h>
 #include <linux/sprintf.h>
 #include <linux/static_call_types.h>
-#include <linux/instruction_pointer.h>
+#include <linux/trace_printk.h>
 #include <linux/util_macros.h>
 #include <linux/wordpart.h>
 
@@ -190,200 +190,6 @@ enum system_states {
 };
 extern enum system_states system_state;
 
-/*
- * General tracing related utility functions - trace_printk(),
- * tracing_on/tracing_off and tracing_start()/tracing_stop
- *
- * Use tracing_on/tracing_off when you want to quickly turn on or off
- * tracing. It simply enables or disables the recording of the trace events.
- * This also corresponds to the user space /sys/kernel/tracing/tracing_on
- * file, which gives a means for the kernel and userspace to interact.
- * Place a tracing_off() in the kernel where you want tracing to end.
- * From user space, examine the trace, and then echo 1 > tracing_on
- * to continue tracing.
- *
- * tracing_stop/tracing_start has slightly more overhead. It is used
- * by things like suspend to ram where disabling the recording of the
- * trace is not enough, but tracing must actually stop because things
- * like calling smp_processor_id() may crash the system.
- *
- * Most likely, you want to use tracing_on/tracing_off.
- */
-
-enum ftrace_dump_mode {
-	DUMP_NONE,
-	DUMP_ALL,
-	DUMP_ORIG,
-	DUMP_PARAM,
-};
-
-#ifdef CONFIG_TRACING
-void tracing_on(void);
-void tracing_off(void);
-int tracing_is_on(void);
-void tracing_snapshot(void);
-void tracing_snapshot_alloc(void);
-
-extern void tracing_start(void);
-extern void tracing_stop(void);
-
-static inline __printf(1, 2)
-void ____trace_printk_check_format(const char *fmt, ...)
-{
-}
-#define __trace_printk_check_format(fmt, args...)			\
-do {									\
-	if (0)								\
-		____trace_printk_check_format(fmt, ##args);		\
-} while (0)
-
-/**
- * trace_printk - printf formatting in the ftrace buffer
- * @fmt: the printf format for printing
- *
- * Note: __trace_printk is an internal function for trace_printk() and
- *       the @ip is passed in via the trace_printk() macro.
- *
- * This function allows a kernel developer to debug fast path sections
- * that printk is not appropriate for. By scattering in various
- * printk like tracing in the code, a developer can quickly see
- * where problems are occurring.
- *
- * This is intended as a debugging tool for the developer only.
- * Please refrain from leaving trace_printks scattered around in
- * your code. (Extra memory is used for special buffers that are
- * allocated when trace_printk() is used.)
- *
- * A little optimization trick is done here. If there's only one
- * argument, there's no need to scan the string for printf formats.
- * The trace_puts() will suffice. But how can we take advantage of
- * using trace_puts() when trace_printk() has only one argument?
- * By stringifying the args and checking the size we can tell
- * whether or not there are args. __stringify((__VA_ARGS__)) will
- * turn into "()\0" with a size of 3 when there are no args, anything
- * else will be bigger. All we need to do is define a string to this,
- * and then take its size and compare to 3. If it's bigger, use
- * do_trace_printk() otherwise, optimize it to trace_puts(). Then just
- * let gcc optimize the rest.
- */
-
-#define trace_printk(fmt, ...)				\
-do {							\
-	char _______STR[] = __stringify((__VA_ARGS__));	\
-	if (sizeof(_______STR) > 3)			\
-		do_trace_printk(fmt, ##__VA_ARGS__);	\
-	else						\
-		trace_puts(fmt);			\
-} while (0)
-
-#define do_trace_printk(fmt, args...)					\
-do {									\
-	static const char *trace_printk_fmt __used			\
-		__section("__trace_printk_fmt") =			\
-		__builtin_constant_p(fmt) ? fmt : NULL;			\
-									\
-	__trace_printk_check_format(fmt, ##args);			\
-									\
-	if (__builtin_constant_p(fmt))					\
-		__trace_bprintk(_THIS_IP_, trace_printk_fmt, ##args);	\
-	else								\
-		__trace_printk(_THIS_IP_, fmt, ##args);			\
-} while (0)
-
-extern __printf(2, 3)
-int __trace_bprintk(unsigned long ip, const char *fmt, ...);
-
-extern __printf(2, 3)
-int __trace_printk(unsigned long ip, const char *fmt, ...);
-
-/**
- * trace_puts - write a string into the ftrace buffer
- * @str: the string to record
- *
- * Note: __trace_bputs is an internal function for trace_puts and
- *       the @ip is passed in via the trace_puts macro.
- *
- * This is similar to trace_printk() but is made for those really fast
- * paths that a developer wants the least amount of "Heisenbug" effects,
- * where the processing of the print format is still too much.
- *
- * This function allows a kernel developer to debug fast path sections
- * that printk is not appropriate for. By scattering in various
- * printk like tracing in the code, a developer can quickly see
- * where problems are occurring.
- *
- * This is intended as a debugging tool for the developer only.
- * Please refrain from leaving trace_puts scattered around in
- * your code. (Extra memory is used for special buffers that are
- * allocated when trace_puts() is used.)
- *
- * Returns: 0 if nothing was written, positive # if string was.
- *  (1 when __trace_bputs is used, strlen(str) when __trace_puts is used)
- */
-
-#define trace_puts(str) ({						\
-	static const char *trace_printk_fmt __used			\
-		__section("__trace_printk_fmt") =			\
-		__builtin_constant_p(str) ? str : NULL;			\
-									\
-	if (__builtin_constant_p(str))					\
-		__trace_bputs(_THIS_IP_, trace_printk_fmt);		\
-	else								\
-		__trace_puts(_THIS_IP_, str, strlen(str));		\
-})
-extern int __trace_bputs(unsigned long ip, const char *str);
-extern int __trace_puts(unsigned long ip, const char *str, int size);
-
-extern void trace_dump_stack(int skip);
-
-/*
- * The double __builtin_constant_p is because gcc will give us an error
- * if we try to allocate the static variable to fmt if it is not a
- * constant. Even with the outer if statement.
- */
-#define ftrace_vprintk(fmt, vargs)					\
-do {									\
-	if (__builtin_constant_p(fmt)) {				\
-		static const char *trace_printk_fmt __used		\
-		  __section("__trace_printk_fmt") =			\
-			__builtin_constant_p(fmt) ? fmt : NULL;		\
-									\
-		__ftrace_vbprintk(_THIS_IP_, trace_printk_fmt, vargs);	\
-	} else								\
-		__ftrace_vprintk(_THIS_IP_, fmt, vargs);		\
-} while (0)
-
-extern __printf(2, 0) int
-__ftrace_vbprintk(unsigned long ip, const char *fmt, va_list ap);
-
-extern __printf(2, 0) int
-__ftrace_vprintk(unsigned long ip, const char *fmt, va_list ap);
-
-extern void ftrace_dump(enum ftrace_dump_mode oops_dump_mode);
-#else
-static inline void tracing_start(void) { }
-static inline void tracing_stop(void) { }
-static inline void trace_dump_stack(int skip) { }
-
-static inline void tracing_on(void) { }
-static inline void tracing_off(void) { }
-static inline int tracing_is_on(void) { return 0; }
-static inline void tracing_snapshot(void) { }
-static inline void tracing_snapshot_alloc(void) { }
-
-static inline __printf(1, 2)
-int trace_printk(const char *fmt, ...)
-{
-	return 0;
-}
-static __printf(1, 0) inline int
-ftrace_vprintk(const char *fmt, va_list ap)
-{
-	return 0;
-}
-static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { }
-#endif /* CONFIG_TRACING */
-
 /* Rebuild everything on CONFIG_DYNAMIC_FTRACE */
 #ifdef CONFIG_DYNAMIC_FTRACE
 # define REBUILD_DUE_TO_DYNAMIC_FTRACE
diff --git a/include/linux/trace_printk.h b/include/linux/trace_printk.h
new file mode 100644
index 000000000000..5c9785c49c8e
--- /dev/null
+++ b/include/linux/trace_printk.h
@@ -0,0 +1,205 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_TRACE_PRINTK_H
+#define _LINUX_TRACE_PRINTK_H
+
+#include <linux/compiler_attributes.h>
+#include <linux/instruction_pointer.h>
+#include <linux/stddef.h>
+#include <linux/stringify.h>
+#include <linux/string.h>
+
+/*
+ * General tracing related utility functions - trace_printk(),
+ * tracing_on/tracing_off and tracing_start()/tracing_stop
+ *
+ * Use tracing_on/tracing_off when you want to quickly turn on or off
+ * tracing. It simply enables or disables the recording of the trace events.
+ * This also corresponds to the user space /sys/kernel/tracing/tracing_on
+ * file, which gives a means for the kernel and userspace to interact.
+ * Place a tracing_off() in the kernel where you want tracing to end.
+ * From user space, examine the trace, and then echo 1 > tracing_on
+ * to continue tracing.
+ *
+ * tracing_stop/tracing_start has slightly more overhead. It is used
+ * by things like suspend to ram where disabling the recording of the
+ * trace is not enough, but tracing must actually stop because things
+ * like calling smp_processor_id() may crash the system.
+ *
+ * Most likely, you want to use tracing_on/tracing_off.
+ */
+
+enum ftrace_dump_mode {
+	DUMP_NONE,
+	DUMP_ALL,
+	DUMP_ORIG,
+	DUMP_PARAM,
+};
+
+#ifdef CONFIG_TRACING
+void tracing_on(void);
+void tracing_off(void);
+int tracing_is_on(void);
+void tracing_snapshot(void);
+void tracing_snapshot_alloc(void);
+
+void tracing_start(void);
+void tracing_stop(void);
+
+static inline __printf(1, 2)
+void ____trace_printk_check_format(const char *fmt, ...)
+{
+}
+#define __trace_printk_check_format(fmt, args...)			\
+do {									\
+	if (0)								\
+		____trace_printk_check_format(fmt, ##args);		\
+} while (0)
+
+/**
+ * trace_printk - printf formatting in the ftrace buffer
+ * @fmt: the printf format for printing
+ *
+ * Note: __trace_printk is an internal function for trace_printk() and
+ *       the @ip is passed in via the trace_printk() macro.
+ *
+ * This function allows a kernel developer to debug fast path sections
+ * that printk is not appropriate for. By scattering in various
+ * printk like tracing in the code, a developer can quickly see
+ * where problems are occurring.
+ *
+ * This is intended as a debugging tool for the developer only.
+ * Please refrain from leaving trace_printks scattered around in
+ * your code. (Extra memory is used for special buffers that are
+ * allocated when trace_printk() is used.)
+ *
+ * A little optimization trick is done here. If there's only one
+ * argument, there's no need to scan the string for printf formats.
+ * The trace_puts() will suffice. But how can we take advantage of
+ * using trace_puts() when trace_printk() has only one argument?
+ * By stringifying the args and checking the size we can tell
+ * whether or not there are args. __stringify((__VA_ARGS__)) will
+ * turn into "()\0" with a size of 3 when there are no args, anything
+ * else will be bigger. All we need to do is define a string to this,
+ * and then take its size and compare to 3. If it's bigger, use
+ * do_trace_printk() otherwise, optimize it to trace_puts(). Then just
+ * let gcc optimize the rest.
+ */
+
+#define trace_printk(fmt, ...)				\
+do {							\
+	char _______STR[] = __stringify((__VA_ARGS__));	\
+	if (sizeof(_______STR) > 3)			\
+		do_trace_printk(fmt, ##__VA_ARGS__);	\
+	else						\
+		trace_puts(fmt);			\
+} while (0)
+
+#define do_trace_printk(fmt, args...)					\
+do {									\
+	static const char *trace_printk_fmt __used			\
+		__section("__trace_printk_fmt") =			\
+		__builtin_constant_p(fmt) ? fmt : NULL;			\
+									\
+	__trace_printk_check_format(fmt, ##args);			\
+									\
+	if (__builtin_constant_p(fmt))					\
+		__trace_bprintk(_THIS_IP_, trace_printk_fmt, ##args);	\
+	else								\
+		__trace_printk(_THIS_IP_, fmt, ##args);			\
+} while (0)
+
+__printf(2, 3)
+int __trace_bprintk(unsigned long ip, const char *fmt, ...);
+
+__printf(2, 3)
+int __trace_printk(unsigned long ip, const char *fmt, ...);
+
+/**
+ * trace_puts - write a string into the ftrace buffer
+ * @str: the string to record
+ *
+ * Note: __trace_bputs is an internal function for trace_puts and
+ *       the @ip is passed in via the trace_puts macro.
+ *
+ * This is similar to trace_printk() but is made for those really fast
+ * paths that a developer wants the least amount of "Heisenbug" effects,
+ * where the processing of the print format is still too much.
+ *
+ * This function allows a kernel developer to debug fast path sections
+ * that printk is not appropriate for. By scattering in various
+ * printk like tracing in the code, a developer can quickly see
+ * where problems are occurring.
+ *
+ * This is intended as a debugging tool for the developer only.
+ * Please refrain from leaving trace_puts scattered around in
+ * your code. (Extra memory is used for special buffers that are
+ * allocated when trace_puts() is used.)
+ *
+ * Returns: 0 if nothing was written, positive # if string was.
+ *  (1 when __trace_bputs is used, strlen(str) when __trace_puts is used)
+ */
+
+#define trace_puts(str) ({						\
+	static const char *trace_printk_fmt __used			\
+		__section("__trace_printk_fmt") =			\
+		__builtin_constant_p(str) ? str : NULL;			\
+									\
+	if (__builtin_constant_p(str))					\
+		__trace_bputs(_THIS_IP_, trace_printk_fmt);		\
+	else								\
+		__trace_puts(_THIS_IP_, str, strlen(str));		\
+})
+int __trace_bputs(unsigned long ip, const char *str);
+int __trace_puts(unsigned long ip, const char *str, int size);
+
+void trace_dump_stack(int skip);
+
+/*
+ * The double __builtin_constant_p is because gcc will give us an error
+ * if we try to allocate the static variable to fmt if it is not a
+ * constant. Even with the outer if statement.
+ */
+#define ftrace_vprintk(fmt, vargs)					\
+do {									\
+	if (__builtin_constant_p(fmt)) {				\
+		static const char *trace_printk_fmt __used		\
+		  __section("__trace_printk_fmt") =			\
+			__builtin_constant_p(fmt) ? fmt : NULL;		\
+									\
+		__ftrace_vbprintk(_THIS_IP_, trace_printk_fmt, vargs);	\
+	} else								\
+		__ftrace_vprintk(_THIS_IP_, fmt, vargs);		\
+} while (0)
+
+__printf(2, 0) int
+__ftrace_vbprintk(unsigned long ip, const char *fmt, va_list ap);
+
+__printf(2, 0) int
+__ftrace_vprintk(unsigned long ip, const char *fmt, va_list ap);
+
+void ftrace_dump(enum ftrace_dump_mode oops_dump_mode);
+#else
+static inline void tracing_start(void) { }
+static inline void tracing_stop(void) { }
+static inline void trace_dump_stack(int skip) { }
+
+static inline void tracing_on(void) { }
+static inline void tracing_off(void) { }
+static inline int tracing_is_on(void) { return 0; }
+static inline void tracing_snapshot(void) { }
+static inline void tracing_snapshot_alloc(void) { }
+
+static inline __printf(1, 2)
+int trace_printk(const char *fmt, ...)
+{
+	return 0;
+}
+static __printf(1, 0) inline int
+ftrace_vprintk(const char *fmt, va_list ap)
+{
+	return 0;
+}
+static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { }
+#endif /* CONFIG_TRACING */
+
+#endif
-- 
2.43.0


^ permalink raw reply related

* [PATCH v3 3/4] kernel.h: move VERIFY_OCTAL_PERMISSIONS() to sysfs.h
From: Yury Norov (NVIDIA) @ 2025-12-05 17:52 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Andy Shevchenko, Christophe Leroy, Randy Dunlap, Ingo Molnar,
	Jani Nikula, Joonas Lahtinen, David Laight, Petr Pavlu,
	Andi Shyti, Rodrigo Vivi, Tvrtko Ursulin, Daniel Gomez,
	Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
	Andrew Morton, linux-kernel, intel-gfx, dri-devel, linux-modules,
	linux-trace-kernel
  Cc: Yury Norov (NVIDIA)
In-Reply-To: <20251205175237.242022-1-yury.norov@gmail.com>

The macro is related to sysfs, but is defined in kernel.h. Move it to
the proper header, and unload the generic kernel.h.

Now that the macro is removed from kernel.h, linux/moduleparam.h is
decoupled, and kernel.h inclusion can be removed.

Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
---
 Documentation/filesystems/sysfs.rst |  2 +-
 include/linux/kernel.h              | 12 ------------
 include/linux/moduleparam.h         |  2 +-
 include/linux/sysfs.h               | 13 +++++++++++++
 4 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/Documentation/filesystems/sysfs.rst b/Documentation/filesystems/sysfs.rst
index 2703c04af7d0..ffcef4d6bc8d 100644
--- a/Documentation/filesystems/sysfs.rst
+++ b/Documentation/filesystems/sysfs.rst
@@ -120,7 +120,7 @@ is equivalent to doing::
 	    .store = store_foo,
     };
 
-Note as stated in include/linux/kernel.h "OTHER_WRITABLE?  Generally
+Note as stated in include/linux/sysfs.h "OTHER_WRITABLE?  Generally
 considered a bad idea." so trying to set a sysfs file writable for
 everyone will fail reverting to RO mode for "Others".
 
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 61d63c57bc2d..5b879bfea948 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -389,16 +389,4 @@ static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { }
 # define REBUILD_DUE_TO_DYNAMIC_FTRACE
 #endif
 
-/* Permissions on a sysfs file: you didn't miss the 0 prefix did you? */
-#define VERIFY_OCTAL_PERMISSIONS(perms)						\
-	(BUILD_BUG_ON_ZERO((perms) < 0) +					\
-	 BUILD_BUG_ON_ZERO((perms) > 0777) +					\
-	 /* USER_READABLE >= GROUP_READABLE >= OTHER_READABLE */		\
-	 BUILD_BUG_ON_ZERO((((perms) >> 6) & 4) < (((perms) >> 3) & 4)) +	\
-	 BUILD_BUG_ON_ZERO((((perms) >> 3) & 4) < ((perms) & 4)) +		\
-	 /* USER_WRITABLE >= GROUP_WRITABLE */					\
-	 BUILD_BUG_ON_ZERO((((perms) >> 6) & 2) < (((perms) >> 3) & 2)) +	\
-	 /* OTHER_WRITABLE?  Generally considered a bad idea. */		\
-	 BUILD_BUG_ON_ZERO((perms) & 2) +					\
-	 (perms))
 #endif
diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
index ca7c8107c7c8..dd2d990b2611 100644
--- a/include/linux/moduleparam.h
+++ b/include/linux/moduleparam.h
@@ -8,7 +8,7 @@
 #include <linux/compiler.h>
 #include <linux/init.h>
 #include <linux/stringify.h>
-#include <linux/kernel.h>
+#include <linux/sysfs.h>
 #include <linux/types.h>
 
 /*
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 9a25a2911652..15ee3ef33991 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -798,4 +798,17 @@ static inline void sysfs_put(struct kernfs_node *kn)
 	kernfs_put(kn);
 }
 
+/* Permissions on a sysfs file: you didn't miss the 0 prefix did you? */
+#define VERIFY_OCTAL_PERMISSIONS(perms)						\
+	(BUILD_BUG_ON_ZERO((perms) < 0) +					\
+	 BUILD_BUG_ON_ZERO((perms) > 0777) +					\
+	 /* USER_READABLE >= GROUP_READABLE >= OTHER_READABLE */		\
+	 BUILD_BUG_ON_ZERO((((perms) >> 6) & 4) < (((perms) >> 3) & 4)) +	\
+	 BUILD_BUG_ON_ZERO((((perms) >> 3) & 4) < ((perms) & 4)) +		\
+	 /* USER_WRITABLE >= GROUP_WRITABLE */					\
+	 BUILD_BUG_ON_ZERO((((perms) >> 6) & 2) < (((perms) >> 3) & 2)) +	\
+	 /* OTHER_WRITABLE?  Generally considered a bad idea. */		\
+	 BUILD_BUG_ON_ZERO((perms) & 2) +					\
+	 (perms))
+
 #endif /* _SYSFS_H_ */
-- 
2.43.0


^ permalink raw reply related

* [PATCH v3 2/4] moduleparam: include required headers explicitly
From: Yury Norov (NVIDIA) @ 2025-12-05 17:52 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Andy Shevchenko, Christophe Leroy, Randy Dunlap, Ingo Molnar,
	Jani Nikula, Joonas Lahtinen, David Laight, Petr Pavlu,
	Andi Shyti, Rodrigo Vivi, Tvrtko Ursulin, Daniel Gomez,
	Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
	Andrew Morton, linux-kernel, intel-gfx, dri-devel, linux-modules,
	linux-trace-kernel
  Cc: Yury Norov (NVIDIA)
In-Reply-To: <20251205175237.242022-1-yury.norov@gmail.com>

The following patch drops moduleparam.h dependency on kernel.h. In
preparation to it, list all the required headers explicitly.

Suggested-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
---
 include/linux/moduleparam.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
index 6907aedc4f74..ca7c8107c7c8 100644
--- a/include/linux/moduleparam.h
+++ b/include/linux/moduleparam.h
@@ -2,9 +2,14 @@
 #ifndef _LINUX_MODULE_PARAMS_H
 #define _LINUX_MODULE_PARAMS_H
 /* (C) Copyright 2001, 2002 Rusty Russell IBM Corporation */
+
+#include <linux/array_size.h>
+#include <linux/build_bug.h>
+#include <linux/compiler.h>
 #include <linux/init.h>
 #include <linux/stringify.h>
 #include <linux/kernel.h>
+#include <linux/types.h>
 
 /*
  * The maximum module name length, including the NUL byte.
-- 
2.43.0


^ permalink raw reply related

* [PATCH v3 1/4] kernel.h: drop STACK_MAGIC macro
From: Yury Norov (NVIDIA) @ 2025-12-05 17:52 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Andy Shevchenko, Christophe Leroy, Randy Dunlap, Ingo Molnar,
	Jani Nikula, Joonas Lahtinen, David Laight, Petr Pavlu,
	Andi Shyti, Rodrigo Vivi, Tvrtko Ursulin, Daniel Gomez,
	Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
	Andrew Morton, linux-kernel, intel-gfx, dri-devel, linux-modules,
	linux-trace-kernel
  Cc: Yury Norov (NVIDIA), Jani Nikula
In-Reply-To: <20251205175237.242022-1-yury.norov@gmail.com>

The macro was introduced in 1994, v1.0.4, for stacks protection. Since
that, people found better ways to protect stacks, and now the macro is
only used by i915 selftests. Move it to a local header and drop from
the kernel.h.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
---
 drivers/gpu/drm/i915/gt/selftest_ring_submission.c | 1 +
 drivers/gpu/drm/i915/i915_selftest.h               | 2 ++
 include/linux/kernel.h                             | 2 --
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_ring_submission.c b/drivers/gpu/drm/i915/gt/selftest_ring_submission.c
index 87ceb0f374b6..600333ae6c8c 100644
--- a/drivers/gpu/drm/i915/gt/selftest_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/selftest_ring_submission.c
@@ -3,6 +3,7 @@
  * Copyright © 2020 Intel Corporation
  */
 
+#include "i915_selftest.h"
 #include "intel_engine_pm.h"
 #include "selftests/igt_flush_test.h"
 
diff --git a/drivers/gpu/drm/i915/i915_selftest.h b/drivers/gpu/drm/i915/i915_selftest.h
index bdf3e22c0a34..72922028f4ba 100644
--- a/drivers/gpu/drm/i915/i915_selftest.h
+++ b/drivers/gpu/drm/i915/i915_selftest.h
@@ -26,6 +26,8 @@
 
 #include <linux/types.h>
 
+#define STACK_MAGIC	0xdeadbeef
+
 struct pci_dev;
 struct drm_i915_private;
 
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 5b46924fdff5..61d63c57bc2d 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -40,8 +40,6 @@
 
 #include <uapi/linux/kernel.h>
 
-#define STACK_MAGIC	0xdeadbeef
-
 struct completion;
 struct user;
 
-- 
2.43.0


^ permalink raw reply related

* [PATCH v3 0/4] Unload linux/kernel.h
From: Yury Norov (NVIDIA) @ 2025-12-05 17:52 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Andy Shevchenko, Christophe Leroy, Randy Dunlap, Ingo Molnar,
	Jani Nikula, Joonas Lahtinen, David Laight, Petr Pavlu,
	Andi Shyti, Rodrigo Vivi, Tvrtko Ursulin, Daniel Gomez,
	Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
	Andrew Morton, linux-kernel, intel-gfx, dri-devel, linux-modules,
	linux-trace-kernel
  Cc: Yury Norov (NVIDIA)

kernel.h hosts declarations that can be placed better.

No major changes since v2. For testing details, see v2.

v1: https://lore.kernel.org/all/20251129195304.204082-1-yury.norov@gmail.com/
v2: https://lore.kernel.org/all/20251203162329.280182-1-yury.norov@gmail.com/
v3:
 - rename linux/tracing.h to linux/trace_printk.h (Steven);
 - cleanup headers better (Andy);

Yury Norov (NVIDIA) (4):
  kernel.h: drop STACK_MAGIC macro
  moduleparam: include required headers explicitly
  kernel.h: move VERIFY_OCTAL_PERMISSIONS() to sysfs.h
  tracing: move tracing declarations from kernel.h to a dedicated header

 Documentation/filesystems/sysfs.rst           |   2 +-
 .../drm/i915/gt/selftest_ring_submission.c    |   1 +
 drivers/gpu/drm/i915/i915_selftest.h          |   2 +
 include/linux/kernel.h                        | 210 +-----------------
 include/linux/moduleparam.h                   |   7 +-
 include/linux/sysfs.h                         |  13 ++
 include/linux/trace_printk.h                  | 205 +++++++++++++++++
 7 files changed, 229 insertions(+), 211 deletions(-)
 create mode 100644 include/linux/trace_printk.h

-- 
2.43.0


^ permalink raw reply

* Re: [PATCH v2 3/4] kernel.h: move VERIFY_OCTAL_PERMISSIONS() to sysfs.h
From: Petr Pavlu @ 2025-12-05 12:58 UTC (permalink / raw)
  To: Yury Norov (NVIDIA)
  Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Andy Shevchenko, Christophe Leroy, Randy Dunlap, Ingo Molnar,
	Jani Nikula, Joonas Lahtinen, David Laight, Andi Shyti,
	Rodrigo Vivi, Tvrtko Ursulin, Daniel Gomez, Greg Kroah-Hartman,
	Rafael J. Wysocki, Danilo Krummrich, Andrew Morton, linux-kernel,
	intel-gfx, dri-devel, linux-modules, linux-trace-kernel
In-Reply-To: <20251203162329.280182-4-yury.norov@gmail.com>

On 12/3/25 5:23 PM, Yury Norov (NVIDIA) wrote:
> The macro is related to sysfs, but is defined in kernel.h. Move it to
> the proper header, and unload the generic kernel.h.
> 
> Now that the macro is removed from kernel.h, linux/moduleparam.h is
> decoupled, and kernel.h inclusion can be removed.
> 
> Acked-by: Randy Dunlap <rdunlap@infradead.org>
> Tested-by: Randy Dunlap <rdunlap@infradead.org>
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>

Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>

-- 
Thanks,
Petr

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox