* [PATCH] sched: Fix NULL mm dereference in sched_mm_cid_after_execve()
@ 2025-12-23 21:51 Cong Wang
  2025-12-24 14:28 ` Mathieu Desnoyers
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Cong Wang @ 2025-12-23 21:51 UTC
  To: linux-kernel; +Cc: peterz, mathieu.desnoyers, Cong Wang, Thomas Gleixner

From: Cong Wang <cwang@multikernel.io>

sched_mm_cid_after_execve() is called in bprm_execve()'s cleanup path
even when exec_binprm() fails. For the init task's first execve, this
causes a problem:

1. current->mm is NULL (kernel threads don't have an mm)
2. sched_mm_cid_before_execve() exits early because mm is NULL
3. exec_binprm() fails (e.g., ENOENT for missing script interpreter)
4. sched_mm_cid_after_execve() is called with mm still NULL
5. sched_mm_cid_fork() is called unconditionally, triggering WARN_ON

This is easily reproduced by booting with an init that is a shell script
(#!/bin/sh) where the interpreter doesn't exist in the initramfs.

Fix this by checking if t->mm is NULL before calling sched_mm_cid_fork(),
matching the behavior of sched_mm_cid_before_execve() which already
handles this case via sched_mm_cid_exit()'s early return.

Fixes: b0c3d51b54f8 ("sched/mmcid: Provide precomputed maximal value")
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Cong Wang <cwang@multikernel.io>
---
 kernel/sched/core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 41ba0be16911..60afadb6eede 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10694,10 +10694,11 @@ void sched_mm_cid_before_execve(struct task_struct *t)
 	sched_mm_cid_exit(t);
 }
 
-/* Reactivate MM CID after successful execve() */
+/* Reactivate MM CID after execve() */
 void sched_mm_cid_after_execve(struct task_struct *t)
 {
-	sched_mm_cid_fork(t);
+	if (t->mm)
+		sched_mm_cid_fork(t);
 }
 
 static void mm_cid_work_fn(struct work_struct *work)
-- 
2.34.1
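
The five-step failure sequence in the changelog can be modelled in userspace. What follows is a minimal sketch, not kernel code: the struct layouts, the cid_active flag, the warn_count counter and every model_* name are hypothetical stand-ins, and the modelled warning only increments a counter where the changelog says the kernel triggers WARN_ON.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct mm_struct {
	int users;
};

struct task_struct {
	struct mm_struct *mm;	/* NULL for a kernel thread */
	bool cid_active;	/* assumed flag: "MM CID tracking is on" */
};

static int warn_count;		/* how often the modelled warning fired */

/* Models sched_mm_cid_exit(): returns early when the task has no mm. */
static void model_cid_exit(struct task_struct *t)
{
	if (!t->mm)
		return;
	t->cid_active = false;
}

/* Models sched_mm_cid_fork(): records a warning when there is no mm. */
static void model_cid_fork(struct task_struct *t)
{
	if (!t->mm) {
		warn_count++;
		return;
	}
	t->cid_active = true;
}

/* Behaviour before the patch: unconditional call. */
static void model_after_execve_unpatched(struct task_struct *t)
{
	model_cid_fork(t);
}

/* Behaviour with the patch: skip tasks without an mm. */
static void model_after_execve_patched(struct task_struct *t)
{
	if (t->mm)
		model_cid_fork(t);
}

/* Models a failed exec and returns the number of warnings raised. */
static int model_failed_exec(struct task_struct *t, bool patched)
{
	assert(t != NULL);
	warn_count = 0;

	model_cid_exit(t);	/* the "before execve" hook */
	/* The modelled exec fails here, so t->mm is never replaced. */
	if (patched)
		model_after_execve_patched(t);
	else
		model_after_execve_unpatched(t);

	return warn_count;
}
```

In this model, a task with mm == NULL raises one warning through the unpatched path and none through the patched path, while a task that does have an mm gets its tracking reactivated either way.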



* Re: [PATCH] sched: Fix NULL mm dereference in sched_mm_cid_after_execve()
  2025-12-23 21:51 [PATCH] sched: Fix NULL mm dereference in sched_mm_cid_after_execve() Cong Wang
@ 2025-12-24 14:28 ` Mathieu Desnoyers
  2025-12-30 17:03 ` Qing Wang
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Mathieu Desnoyers @ 2025-12-24 14:28 UTC
  To: Cong Wang, linux-kernel, Thomas Gleixner; +Cc: peterz, Cong Wang, Ingo Molnar

On 2025-12-23 16:51, Cong Wang wrote:
> From: Cong Wang <cwang@multikernel.io>
> 
> sched_mm_cid_after_execve() is called in bprm_execve()'s cleanup path
> even when exec_binprm() fails. For the init task's first execve, this
> causes a problem:
> 
> 1. current->mm is NULL (kernel threads don't have an mm)
> 2. sched_mm_cid_before_execve() exits early because mm is NULL
> 3. exec_binprm() fails (e.g., ENOENT for missing script interpreter)
> 4. sched_mm_cid_after_execve() is called with mm still NULL
> 5. sched_mm_cid_fork() is called unconditionally, triggering WARN_ON
> 
> This is easily reproduced by booting with an init that is a shell script
> (#!/bin/sh) where the interpreter doesn't exist in the initramfs.
> 
> Fix this by checking if t->mm is NULL before calling sched_mm_cid_fork(),
> matching the behavior of sched_mm_cid_before_execve() which already
> handles this case via sched_mm_cid_exit()'s early return.
> 
> Fixes: b0c3d51b54f8 ("sched/mmcid: Provide precomputed maximal value")
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Cong Wang <cwang@multikernel.io>

Thanks for the detailed explanation.

Indeed, the offending commit removes a pre-existing NULL mm check:

  void sched_mm_cid_after_execve(struct task_struct *t)
  {
-       struct mm_struct *mm = t->mm;
-
-       if (!mm)
-               return;

Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Thomas, Peter, Ingo, can we fast-track this fix for upstream?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


* Re: [PATCH] sched: Fix NULL mm dereference in sched_mm_cid_after_execve()
  2025-12-23 21:51 [PATCH] sched: Fix NULL mm dereference in sched_mm_cid_after_execve() Cong Wang
  2025-12-24 14:28 ` Mathieu Desnoyers
@ 2025-12-30 17:03 ` Qing Wang
  2026-01-07 18:00 ` Will Deacon
  2026-01-09 12:05 ` [tip: sched/urgent] sched/mm_cid: Prevent " tip-bot2 for Cong Wang
  3 siblings, 0 replies; 8+ messages in thread
From: Qing Wang @ 2025-12-30 17:03 UTC
  To: xiyou.wangcong; +Cc: cwang, linux-kernel, mathieu.desnoyers, peterz, tglx

>  void sched_mm_cid_after_execve(struct task_struct *t)
>  {
> -    sched_mm_cid_fork(t);
> +    if (t->mm)
> +        sched_mm_cid_fork(t);
>  }

Hi,

This is a correct fix, but I have a small suggestion: put the 'mm' check
into sched_mm_cid_fork() itself, just as sched_mm_cid_exit() does.

Best regards,
Qing Wang
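
The alternative suggested above, with the check moved into the callee, could look roughly like the following. This is a userspace sketch under assumptions, not the kernel implementation: the struct layouts, the cid_active flag and the model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct mm_struct {
	int users;
};

struct task_struct {
	struct mm_struct *mm;	/* NULL for a kernel thread */
	bool cid_active;	/* assumed flag: "MM CID tracking is on" */
};

/* Models sched_mm_cid_fork() with the NULL mm check inside the callee. */
static void model_cid_fork_checked(struct task_struct *t)
{
	assert(t != NULL);
	if (!t->mm)
		return;
	t->cid_active = true;
}

/* The caller stays a plain call, as it was before the patch. */
static void model_after_execve(struct task_struct *t)
{
	model_cid_fork_checked(t);
}
```

The trade-off in this sketch is that the callee then silently accepts a NULL mm from every caller, whereas the posted patch keeps the check at the one call site where a NULL mm is expected.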



* Re: [PATCH] sched: Fix NULL mm dereference in sched_mm_cid_after_execve()
  2025-12-23 21:51 [PATCH] sched: Fix NULL mm dereference in sched_mm_cid_after_execve() Cong Wang
  2025-12-24 14:28 ` Mathieu Desnoyers
  2025-12-30 17:03 ` Qing Wang
@ 2026-01-07 18:00 ` Will Deacon
  2026-01-08 15:28   ` Mathieu Desnoyers
  2026-01-09 12:05 ` [tip: sched/urgent] sched/mm_cid: Prevent " tip-bot2 for Cong Wang
  3 siblings, 1 reply; 8+ messages in thread
From: Will Deacon @ 2026-01-07 18:00 UTC
  To: Cong Wang
  Cc: linux-kernel, peterz, mathieu.desnoyers, Cong Wang,
	Thomas Gleixner

On Tue, Dec 23, 2025 at 01:51:13PM -0800, Cong Wang wrote:
> From: Cong Wang <cwang@multikernel.io>
> 
> sched_mm_cid_after_execve() is called in bprm_execve()'s cleanup path
> even when exec_binprm() fails. For the init task's first execve, this
> causes a problem:
> 
> 1. current->mm is NULL (kernel threads don't have an mm)
> 2. sched_mm_cid_before_execve() exits early because mm is NULL
> 3. exec_binprm() fails (e.g., ENOENT for missing script interpreter)
> 4. sched_mm_cid_after_execve() is called with mm still NULL
> 5. sched_mm_cid_fork() is called unconditionally, triggering WARN_ON
> 
> This is easily reproduced by booting with an init that is a shell script
> (#!/bin/sh) where the interpreter doesn't exist in the initramfs.
> 
> Fix this by checking if t->mm is NULL before calling sched_mm_cid_fork(),
> matching the behavior of sched_mm_cid_before_execve() which already
> handles this case via sched_mm_cid_exit()'s early return.
> 
> Fixes: b0c3d51b54f8 ("sched/mmcid: Provide precomputed maximal value")
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Cong Wang <cwang@multikernel.io>
> ---
>  kernel/sched/core.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 41ba0be16911..60afadb6eede 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -10694,10 +10694,11 @@ void sched_mm_cid_before_execve(struct task_struct *t)
>  	sched_mm_cid_exit(t);
>  }
>  
> -/* Reactivate MM CID after successful execve() */
> +/* Reactivate MM CID after execve() */
>  void sched_mm_cid_after_execve(struct task_struct *t)
>  {
> -	sched_mm_cid_fork(t);
> +	if (t->mm)
> +		sched_mm_cid_fork(t);
>  }
>  
>  static void mm_cid_work_fn(struct work_struct *work)

This addresses a panic reported on arm64 when trying to execute x86
binaries using TCG on an Apple device:

https://lore.kernel.org/all/20251226192506.88593-1-za4emsu@gmail.com/

so:

Acked-by: Will Deacon <will@kernel.org>

Please can we land this for 6.19?

Cheers,

Will


* Re: [PATCH] sched: Fix NULL mm dereference in sched_mm_cid_after_execve()
  2026-01-07 18:00 ` Will Deacon
@ 2026-01-08 15:28   ` Mathieu Desnoyers
  2026-01-09 11:53     ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: Mathieu Desnoyers @ 2026-01-08 15:28 UTC
  To: Will Deacon, Cong Wang
  Cc: linux-kernel, peterz, Cong Wang, Thomas Gleixner, Ingo Molnar,
	Linus Torvalds

On 2026-01-07 13:00, Will Deacon wrote:
> On Tue, Dec 23, 2025 at 01:51:13PM -0800, Cong Wang wrote:
>> From: Cong Wang <cwang@multikernel.io>
>>
>> sched_mm_cid_after_execve() is called in bprm_execve()'s cleanup path
>> even when exec_binprm() fails. For the init task's first execve, this
>> causes a problem:
>>
>> 1. current->mm is NULL (kernel threads don't have an mm)
>> 2. sched_mm_cid_before_execve() exits early because mm is NULL
>> 3. exec_binprm() fails (e.g., ENOENT for missing script interpreter)
>> 4. sched_mm_cid_after_execve() is called with mm still NULL
>> 5. sched_mm_cid_fork() is called unconditionally, triggering WARN_ON
>>
>> This is easily reproduced by booting with an init that is a shell script
>> (#!/bin/sh) where the interpreter doesn't exist in the initramfs.
>>
>> Fix this by checking if t->mm is NULL before calling sched_mm_cid_fork(),
>> matching the behavior of sched_mm_cid_before_execve() which already
>> handles this case via sched_mm_cid_exit()'s early return.
>>
>> Fixes: b0c3d51b54f8 ("sched/mmcid: Provide precomputed maximal value")
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Signed-off-by: Cong Wang <cwang@multikernel.io>
>> ---
>>   kernel/sched/core.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 41ba0be16911..60afadb6eede 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -10694,10 +10694,11 @@ void sched_mm_cid_before_execve(struct task_struct *t)
>>   	sched_mm_cid_exit(t);
>>   }
>>   
>> -/* Reactivate MM CID after successful execve() */
>> +/* Reactivate MM CID after execve() */
>>   void sched_mm_cid_after_execve(struct task_struct *t)
>>   {
>> -	sched_mm_cid_fork(t);
>> +	if (t->mm)
>> +		sched_mm_cid_fork(t);
>>   }
>>   
>>   static void mm_cid_work_fn(struct work_struct *work)
> 
> This addresses a panic reported on arm64 when trying to execute x86
> binaries using TCG on an Apple device:
> 
> https://lore.kernel.org/all/20251226192506.88593-1-za4emsu@gmail.com/
> 
> so:
> 
> Acked-by: Will Deacon <will@kernel.org>
> 
> Please can we land this for 6.19?

Yes, please. I already gave my Reviewed-by two weeks ago:

https://lore.kernel.org/lkml/d6ac8fe9-5fc0-4042-9592-cde3db82b65e@efficios.com/

I guess the relevant maintainers are gradually coming back from the holiday break.
I will ask again here: can we fast-track this fix for upstream?

Thanks,

Mathieu

> 
> Cheers,
> 
> Will


-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


* Re: [PATCH] sched: Fix NULL mm dereference in sched_mm_cid_after_execve()
  2026-01-08 15:28   ` Mathieu Desnoyers
@ 2026-01-09 11:53     ` Thomas Gleixner
  2026-01-09 13:23       ` Mathieu Desnoyers
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2026-01-09 11:53 UTC
  To: Mathieu Desnoyers, Will Deacon, Cong Wang
  Cc: linux-kernel, peterz, Cong Wang, Ingo Molnar, Linus Torvalds

On Thu, Jan 08 2026 at 10:28, Mathieu Desnoyers wrote:
> On 2026-01-07 13:00, Will Deacon wrote:
> I guess the relevant maintainers are gradually coming back from the holiday break.
> I will ask again here: can we fast-track this fix for upstream?

Yes, people are coming back from vacation and are picking up stuff.


* [tip: sched/urgent] sched/mm_cid: Prevent NULL mm dereference in sched_mm_cid_after_execve()
  2025-12-23 21:51 [PATCH] sched: Fix NULL mm dereference in sched_mm_cid_after_execve() Cong Wang
                   ` (2 preceding siblings ...)
  2026-01-07 18:00 ` Will Deacon
@ 2026-01-09 12:05 ` tip-bot2 for Cong Wang
  3 siblings, 0 replies; 8+ messages in thread
From: tip-bot2 for Cong Wang @ 2026-01-09 12:05 UTC
  To: linux-tip-commits
  Cc: Cong Wang, Thomas Gleixner, Mathieu Desnoyers, Will Deacon, x86,
	linux-kernel

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     2bdf777410dc6e022d1081885ff34673b5dfee99
Gitweb:        https://git.kernel.org/tip/2bdf777410dc6e022d1081885ff34673b5dfee99
Author:        Cong Wang <cwang@multikernel.io>
AuthorDate:    Tue, 23 Dec 2025 13:51:13 -08:00
Committer:     Thomas Gleixner <tglx@kernel.org>
CommitterDate: Fri, 09 Jan 2026 13:02:57 +01:00

sched/mm_cid: Prevent NULL mm dereference in sched_mm_cid_after_execve()

sched_mm_cid_after_execve() is called in bprm_execve()'s cleanup path even
when exec_binprm() fails. For the init task's first execve(), this causes a
problem:

  1. current->mm is NULL (kernel threads don't have an mm)
  2. sched_mm_cid_before_execve() exits early because mm is NULL
  3. exec_binprm() fails (e.g., ENOENT for missing script interpreter)
  4. sched_mm_cid_after_execve() is called with mm still NULL
  5. sched_mm_cid_fork() is called unconditionally, triggering WARN_ON

This is easily reproduced by booting with an init that is a shell script
(#!/bin/sh) where the interpreter doesn't exist in the initramfs.

Fix this by checking if t->mm is NULL before calling sched_mm_cid_fork(),
matching the behavior of sched_mm_cid_before_execve() which already
handles this case via sched_mm_cid_exit()'s early return.

Fixes: b0c3d51b54f8 ("sched/mmcid: Provide precomputed maximal value")
Signed-off-by: Cong Wang <cwang@multikernel.io>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20251223215113.639686-1-xiyou.wangcong@gmail.com
---
 kernel/sched/core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 41ba0be..60afadb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10694,10 +10694,11 @@ void sched_mm_cid_before_execve(struct task_struct *t)
 	sched_mm_cid_exit(t);
 }
 
-/* Reactivate MM CID after successful execve() */
+/* Reactivate MM CID after execve() */
 void sched_mm_cid_after_execve(struct task_struct *t)
 {
-	sched_mm_cid_fork(t);
+	if (t->mm)
+		sched_mm_cid_fork(t);
 }
 
 static void mm_cid_work_fn(struct work_struct *work)


* Re: [PATCH] sched: Fix NULL mm dereference in sched_mm_cid_after_execve()
  2026-01-09 11:53     ` Thomas Gleixner
@ 2026-01-09 13:23       ` Mathieu Desnoyers
  0 siblings, 0 replies; 8+ messages in thread
From: Mathieu Desnoyers @ 2026-01-09 13:23 UTC
  To: Thomas Gleixner, Will Deacon, Cong Wang
  Cc: linux-kernel, peterz, Cong Wang, Ingo Molnar, Linus Torvalds

On 2026-01-09 06:53, Thomas Gleixner wrote:
> On Thu, Jan 08 2026 at 10:28, Mathieu Desnoyers wrote:
>> On 2026-01-07 13:00, Will Deacon wrote:
>> I guess the relevant maintainers are gradually coming back from the holiday break.
>> I will ask again here: can we fast-track this fix for upstream?
> 
> Yes, people are coming back from vacation and are picking up stuff.

That's what I figured. Thanks for picking this up!

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

