* [PATCH] rv: Fix boot failure when kernel lockdown is active
@ 2025-09-17 12:57 Xiu Jianfeng
2025-09-17 13:57 ` Gabriele Monaco
0 siblings, 1 reply; 7+ messages in thread
From: Xiu Jianfeng @ 2025-09-17 12:57 UTC (permalink / raw)
To: rostedt, mhiramat, mathieu.desnoyers, gmonaco, namcao
Cc: linux-trace-kernel, linux-kernel, nicolas.bouchinet, xiujianfeng
From: Xiu Jianfeng <xiujianfeng@huawei.com>
When booting the kernel with the lockdown=confidentiality parameter, the
system hangs in rv_register_reactor() waiting for rv_interface_lock,
as shown in the following log:
INFO: task swapper/0:1 blocked for more than 122 seconds.
Not tainted 6.17.0-rc6-next-20250915+ #29
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0
Call Trace:
<TASK>
__schedule+0x492/0x1600
schedule+0x27/0xf0
schedule_preempt_disabled+0x15/0x30
__mutex_lock.constprop.0+0x538/0x9e0
? vprintk+0x18/0x50
? _printk+0x5f/0x90
__mutex_lock_slowpath+0x13/0x20
mutex_lock+0x3b/0x50
rv_register_reactor+0x48/0xe0
? __pfx_register_react_printk+0x10/0x10
register_react_printk+0x15/0x20
do_one_initcall+0x5d/0x340
kernel_init_freeable+0x351/0x540
? __pfx_kernel_init+0x10/0x10
kernel_init+0x1b/0x200
? __pfx_kernel_init+0x10/0x10
ret_from_fork+0x1fb/0x220
? __pfx_kernel_init+0x10/0x10
ret_from_fork_asm+0x1a/0x30
The root cause is that, when kernel lockdown is in confidentiality
mode, rv_create_dir(), which is essentially tracefs_create_dir(),
returns NULL. This, in turn, causes create_monitor_dir() to return
-ENOMEM, and rv_register_monitor() then returns without unlocking
rv_interface_lock. The next caller that takes the lock, here
rv_register_reactor(), blocks forever.
Fixes: 24cbfe18d55a ("rv: Merge struct rv_monitor_def into struct rv_monitor")
Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
---
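The failure mode described in the changelog can be sketched in user space.
This is only an illustration, not the RV code: it uses a pthread mutex in
place of rv_interface_lock, and every name in it (interface_lock,
create_dir, register_buggy, register_fixed, lock_is_held) is hypothetical.
The "lockdown" flag simply forces the directory-creation stand-in to fail
with -ENOMEM.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for rv_interface_lock; all names here are hypothetical. */
static pthread_mutex_t interface_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for create_monitor_dir(): fails when "lockdown" is set. */
static int create_dir(bool lockdown)
{
	return lockdown ? -ENOMEM : 0;
}

/* Buggy shape: the error path returns with the lock still held. */
static int register_buggy(bool lockdown)
{
	int retval;

	pthread_mutex_lock(&interface_lock);
	retval = create_dir(lockdown);
	if (retval)
		return retval;	/* lock is leaked here */
	pthread_mutex_unlock(&interface_lock);
	return 0;
}

/* Fixed shape: every exit goes through the unlock label. */
static int register_fixed(bool lockdown)
{
	int retval;

	pthread_mutex_lock(&interface_lock);
	retval = create_dir(lockdown);
	if (retval)
		goto out_unlock;
	/* ... the rest of the registration would go here ... */
out_unlock:
	pthread_mutex_unlock(&interface_lock);
	return retval;
}

/* Report whether the lock is currently held, without blocking. */
static bool lock_is_held(void)
{
	if (pthread_mutex_trylock(&interface_lock) == EBUSY)
		return true;
	pthread_mutex_unlock(&interface_lock);
	return false;
}
```

Calling register_fixed(true) and then lock_is_held() shows the lock
released after the failure, while register_buggy(true) leaves it held, so
a later blocking lock attempt would never return.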
kernel/trace/rv/rv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/trace/rv/rv.c b/kernel/trace/rv/rv.c
index 1482e91c39f4..e35565dd2dc5 100644
--- a/kernel/trace/rv/rv.c
+++ b/kernel/trace/rv/rv.c
@@ -805,7 +805,7 @@ int rv_register_monitor(struct rv_monitor *monitor, struct rv_monitor *parent)
 
 	retval = create_monitor_dir(monitor, parent);
 	if (retval)
-		return retval;
+		goto out_unlock;
 
 	/* keep children close to the parent for easier visualisation */
 	if (parent)
--
2.43.0
* Re: [PATCH] rv: Fix boot failure when kernel lockdown is active
2025-09-17 12:57 [PATCH] rv: Fix boot failure when kernel lockdown is active Xiu Jianfeng
@ 2025-09-17 13:57 ` Gabriele Monaco
2025-09-17 14:07 ` Nam Cao
0 siblings, 1 reply; 7+ messages in thread
From: Gabriele Monaco @ 2025-09-17 13:57 UTC (permalink / raw)
To: Xiu Jianfeng, rostedt, mhiramat, namcao
Cc: linux-trace-kernel, linux-kernel, nicolas.bouchinet, xiujianfeng
On Wed, 2025-09-17 at 12:57 +0000, Xiu Jianfeng wrote:
> From: Xiu Jianfeng <xiujianfeng@huawei.com>
>
> When booting kernel with lockdown=confidentiality parameter, the
> system
> will hang at rv_register_reactor() due to waiting for
> rv_interface_lock,
> as shown in the following log,
>
Thanks for finding this, the problem was already fixed in [1], which is
on its way to getting merged.
[1] -
https://lore.kernel.org/all/20250903065112.1878330-1-zhen.ni@easystack.cn
> INFO: task swapper/0:1 blocked for more than 122 seconds.
> Not tainted 6.17.0-rc6-next-20250915+ #29
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
> message.
> task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0
> Call Trace:
> <TASK>
> __schedule+0x492/0x1600
> schedule+0x27/0xf0
> schedule_preempt_disabled+0x15/0x30
> __mutex_lock.constprop.0+0x538/0x9e0
> ? vprintk+0x18/0x50
> ? _printk+0x5f/0x90
> __mutex_lock_slowpath+0x13/0x20
> mutex_lock+0x3b/0x50
> rv_register_reactor+0x48/0xe0
> ? __pfx_register_react_printk+0x10/0x10
> register_react_printk+0x15/0x20
> do_one_initcall+0x5d/0x340
> kernel_init_freeable+0x351/0x540
> ? __pfx_kernel_init+0x10/0x10
> kernel_init+0x1b/0x200
> ? __pfx_kernel_init+0x10/0x10
> ret_from_fork+0x1fb/0x220
> ? __pfx_kernel_init+0x10/0x10
> ret_from_fork_asm+0x1a/0x30
>
> The root cause is that, when the kernel lockdown is in
> confidentiality
> mode, rv_create_dir(), which is essentially tracefs_create_dir(),
> will
> return NULL. This, in turn, causes create_monitor_dir() to return
> -ENOMEM, and finally leading to the mutex not being unlocked.
>
> Fixes: 24cbfe18d55a ("rv: Merge struct rv_monitor_def into struct
> rv_monitor")
> Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
> ---
> kernel/trace/rv/rv.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/trace/rv/rv.c b/kernel/trace/rv/rv.c
> index 1482e91c39f4..e35565dd2dc5 100644
> --- a/kernel/trace/rv/rv.c
> +++ b/kernel/trace/rv/rv.c
> @@ -805,7 +805,7 @@ int rv_register_monitor(struct rv_monitor
> *monitor, struct rv_monitor *parent)
>
> retval = create_monitor_dir(monitor, parent);
> if (retval)
> - return retval;
> + goto out_unlock;
>
> /* keep children close to the parent for easier
> visualisation */
> if (parent)
* Re: [PATCH] rv: Fix boot failure when kernel lockdown is active
2025-09-17 13:57 ` Gabriele Monaco
@ 2025-09-17 14:07 ` Nam Cao
2025-09-18 8:36 ` Gabriele Monaco
0 siblings, 1 reply; 7+ messages in thread
From: Nam Cao @ 2025-09-17 14:07 UTC (permalink / raw)
To: Gabriele Monaco, Xiu Jianfeng, rostedt, mhiramat
Cc: linux-trace-kernel, linux-kernel, nicolas.bouchinet, xiujianfeng
Gabriele Monaco <gmonaco@redhat.com> writes:
> On Wed, 2025-09-17 at 12:57 +0000, Xiu Jianfeng wrote:
>> From: Xiu Jianfeng <xiujianfeng@huawei.com>
>>
>> When booting kernel with lockdown=confidentiality parameter, the
>> system
>> will hang at rv_register_reactor() due to waiting for
>> rv_interface_lock,
>> as shown in the following log,
>>
>
> Thanks for finding this, the problem was already fixed in [1], which is
> on its way to getting merged.
>
> [1] -
> https://lore.kernel.org/all/20250903065112.1878330-1-zhen.ni@easystack.cn
Yeah, but it is interesting that this is causing a real boot problem. I
thought that commit merely fixed a theoretical bug. I guess this is an
even stronger motivation to use lock guards.
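
For readers unfamiliar with the idea, a lock guard ties the unlock to the
scope of a variable, so an early return cannot skip it. The kernel provides
this through guard(mutex)() in <linux/cleanup.h>, which is built on the
compiler's cleanup attribute. The sketch below is a user-space analogue
only, assuming GCC or Clang; DEMO_GUARD, demo_lock, demo_unlock,
demo_register and demo_lock_is_held are hypothetical names, not kernel API.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Cleanup handler: runs when the guard variable goes out of scope. */
static void demo_unlock(pthread_mutex_t **lock)
{
	pthread_mutex_unlock(*lock);
}

/* Hypothetical analogue of guard(mutex)(): lock now, unlock at scope exit. */
#define DEMO_GUARD(lock) \
	pthread_mutex_t *demo_guard \
		__attribute__((cleanup(demo_unlock), unused)) = \
		(pthread_mutex_lock(lock), (lock))

static int demo_register(bool fail)
{
	DEMO_GUARD(&demo_lock);

	if (fail)
		return -ENOMEM;	/* the unlock still runs on this path */
	return 0;
}

/* Report whether the lock is currently held, without blocking. */
static bool demo_lock_is_held(void)
{
	if (pthread_mutex_trylock(&demo_lock) == EBUSY)
		return true;
	pthread_mutex_unlock(&demo_lock);
	return false;
}
```

With this shape, demo_register(true) returns -ENOMEM and the lock is free
again afterwards, with no out_unlock label to forget.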
Nam
* Re: [PATCH] rv: Fix boot failure when kernel lockdown is active
2025-09-17 14:07 ` Nam Cao
@ 2025-09-18 8:36 ` Gabriele Monaco
2025-09-18 9:48 ` Tomas Glozar
0 siblings, 1 reply; 7+ messages in thread
From: Gabriele Monaco @ 2025-09-18 8:36 UTC (permalink / raw)
To: Nam Cao
Cc: linux-trace-kernel, linux-kernel, nicolas.bouchinet, Xiu Jianfeng,
rostedt, mhiramat
On Wed, 2025-09-17 at 16:07 +0200, Nam Cao wrote:
> Gabriele Monaco <gmonaco@redhat.com> writes:
> > On Wed, 2025-09-17 at 12:57 +0000, Xiu Jianfeng wrote:
> > > From: Xiu Jianfeng <xiujianfeng@huawei.com>
> > >
> > > When booting kernel with lockdown=confidentiality parameter, the
> > > system will hang at rv_register_reactor() due to waiting for
> > > rv_interface_lock, as shown in the following log,
> > >
> >
> > Thanks for finding this, the problem was already fixed in [1],
> > which is on its way to getting merged.
> >
> > [1] -
> > https://lore.kernel.org/all/20250903065112.1878330-1-zhen.ni@easystack.cn
>
> Yeah, but it is interesting that this is causing real boot problem. I
> thought that commit merely fixes a theoretical bug. I guess this is
> an even stronger motivation to use lock guards.
Yeah totally, I have the feeling that with the kernel there's no such
thing as a "theoretical bug", kinda like a good consequence of Murphy's
Law.
But I agree, that's something we may want to do sooner rather than later.
I'm currently not refactoring the RV core so I won't be touching that
for a while, but any patch is more than welcome!
Gabriele
* Re: [PATCH] rv: Fix boot failure when kernel lockdown is active
2025-09-18 8:36 ` Gabriele Monaco
@ 2025-09-18 9:48 ` Tomas Glozar
2025-09-19 8:52 ` Gabriele Monaco
0 siblings, 1 reply; 7+ messages in thread
From: Tomas Glozar @ 2025-09-18 9:48 UTC (permalink / raw)
To: Gabriele Monaco
Cc: Nam Cao, linux-trace-kernel, linux-kernel, nicolas.bouchinet,
Xiu Jianfeng, rostedt, mhiramat
On Thu, 18 Sep 2025 at 10:36, Gabriele Monaco <gmonaco@redhat.com> wrote:
>
> Yeah totally, I have the feeling that with the kernel there's no such a
> thing as a "theoretical bug", kinda like a good consequence of Murphy's
> Law.
>
My understanding of "theoretical bug" is that it's code that is
semantically equivalent to bug-free code, but becomes buggy after
an "innocent" change. The bug might be more or less
"theoretical" based on how "innocent" that change is. Of course, in a
codebase the size of the Linux kernel, this tends to happen quite
often, and it is not always possible to get rid of it completely...
Tomas
* Re: [PATCH] rv: Fix boot failure when kernel lockdown is active
2025-09-18 9:48 ` Tomas Glozar
@ 2025-09-19 8:52 ` Gabriele Monaco
2025-09-19 9:03 ` Nam Cao
0 siblings, 1 reply; 7+ messages in thread
From: Gabriele Monaco @ 2025-09-19 8:52 UTC (permalink / raw)
To: Tomas Glozar
Cc: Nam Cao, linux-trace-kernel, linux-kernel, nicolas.bouchinet,
Xiu Jianfeng, rostedt, mhiramat
On Thu, 2025-09-18 at 11:48 +0200, Tomas Glozar wrote:
> On Thu, 18 Sep 2025 at 10:36, Gabriele Monaco <gmonaco@redhat.com> wrote:
> >
> > Yeah totally, I have the feeling that with the kernel there's no such a
> > thing as a "theoretical bug", kinda like a good consequence of Murphy's
> > Law.
>
> My understanding of "theoretical bug" is that it's code that is
> semantically equivalent to a bug-free code, but becomes buggy after
> doing an "innocent" change. The bug might be more or less
> "theoretical" based on how "innocent" that change is. Of course, in a
> codebase of the size of a Linux kernel, this tends to happen quite
> often, and is not always possible to get rid of completely...
Yeah, good point, we are getting philosophical here :) . This wasn't a
theoretical bug then, just something you don't think will really happen (a
failure creating a tracefs directory) ... until it happens.
The fact that there is a way to make that function fail on demand (kernel
lockdown) just makes it more "real". Moral of the story: better to let the
compiler check things for you (lock guards).
Anyway the fix is now upstream.
Gabriele
* Re: [PATCH] rv: Fix boot failure when kernel lockdown is active
2025-09-19 8:52 ` Gabriele Monaco
@ 2025-09-19 9:03 ` Nam Cao
0 siblings, 0 replies; 7+ messages in thread
From: Nam Cao @ 2025-09-19 9:03 UTC (permalink / raw)
To: Gabriele Monaco, Tomas Glozar
Cc: linux-trace-kernel, linux-kernel, nicolas.bouchinet, Xiu Jianfeng,
rostedt, mhiramat
Gabriele Monaco <gmonaco@redhat.com> writes:
> Moral of the story, better get the compiler check things for you (lock
> guards).
I will make the patches once I'm done with some other things.
Nam