linux-kernel.vger.kernel.org archive mirror
* [PATCH] sched/fair: Fix division-by-zero error in task_scan_max()
@ 2025-08-27 12:34 Xia Fukun
  2025-09-02  8:08 ` Xia Fukun
  2025-09-02  8:57 ` Peter Zijlstra
  0 siblings, 2 replies; 3+ messages in thread
From: Xia Fukun @ 2025-08-27 12:34 UTC
  To: mingo, peterz, juri.lelli, vincent.guittot, mgorman, riel
  Cc: dietmar.eggemann, rostedt, bsegall, vschneid, linux-kernel,
	xiafukun

The error can be reproduced by following these steps:
First, set sysctl_numa_balancing_scan_size to 0:

echo 0 > /sys/kernel/debug/sched/numa_balancing/scan_size_mb

Then trigger the clone system call, for example, by using
pthread_create to create a new thread.

	Oops: divide error: 0000 [#1] SMP NOPTI
	CPU: 11 UID: 0 PID: 1 Comm: systemd Tainted: G S 6.17.0xfk_v2 #6
	Tainted: [S]=CPU_OUT_OF_SPEC
	Hardware name: SuperCloud R5210 G12/X12DPi-N6, BIOS 1.1c 08/30/2021
	RIP: 0010:task_scan_max+0x24/0x190
	RSP: 0018:ff56485a001ebc98 EFLAGS: 00010246
	...
	Call Trace:
	<TASK>
	init_numa_balancing+0xdb/0x1e0
	__sched_fork+0x110/0x180
	sched_fork+0xd/0x170
	copy_process+0x821/0x1aa0
	kernel_clone+0xbc/0x400
	__do_sys_clone3+0xde/0x120
	do_syscall_64+0xa4/0x260
	entry_SYSCALL_64_after_hwframe+0x77/0x7f
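
For illustration, the second reproduction step could look like the minimal
sketch below. It assumes scan_size_mb has already been set to 0 as root via
debugfs, which the code deliberately does not do itself. The helper names
thread_fn() and spawn_thread_once() are made up for this example. On an
unpatched kernel the thread creation is expected to hit the divide error, so
this should only be tried on a disposable test machine.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical thread body: it does nothing, the creation itself is the point. */
static void *thread_fn(void *arg)
{
	(void)arg;
	return NULL;
}

/*
 * Create and join one thread. On Linux, pthread_create() is built on the
 * clone family of system calls, which is the path seen in the call trace
 * (kernel_clone -> copy_process -> sched_fork).
 * Returns 0 on success, or a pthread error number on failure.
 */
static int spawn_thread_once(void)
{
	pthread_t tid;
	int err = pthread_create(&tid, NULL, thread_fn, NULL);

	if (err)
		return err;
	return pthread_join(tid, NULL);
}
```

Calling spawn_thread_once() from main() and building with -pthread is enough
to exercise the fork path once.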

Fix this by clamping the scan size to at least 1 MB where it is used as a
divisor: in task_nr_scan_windows() and task_scan_min(), the helpers that
task_scan_max() relies on.
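
To make the arithmetic concrete, here is a small userspace model of the clamp
added to the nr_scan_pages computation in the diff below. It is only a sketch
under stated assumptions: PAGE_SHIFT is taken to be 12 (4 KiB pages), the
function names are invented for this example, and the round-up-then-divide
step stands in for the kernel's use of nr_scan_pages as a divisor, which the
diff context does not show.

```c
#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Pages covered by one scan of scan_size_mb megabytes, clamped to >= 1 MB. */
static unsigned long model_nr_scan_pages(unsigned int scan_size_mb)
{
	unsigned long nr = (unsigned long)scan_size_mb << (20 - PAGE_SHIFT);
	unsigned long floor = 1UL << (20 - PAGE_SHIFT);

	return nr > floor ? nr : floor;	/* open-coded equivalent of max_t() */
}

/* Number of scan windows needed to cover rss pages (assumed divisor use). */
static unsigned long model_nr_scan_windows(unsigned long rss,
					   unsigned int scan_size_mb)
{
	unsigned long nr = model_nr_scan_pages(scan_size_mb);

	if (!rss)
		rss = nr;
	rss = (rss + nr - 1) / nr * nr;	/* round rss up to a multiple of nr */
	return rss / nr;		/* nr is at least one megabyte of pages */
}
```

Under these assumptions a scan size of 0 behaves like 1 MB:
model_nr_scan_pages(0) yields 256 pages rather than a zero divisor.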

Fixes: 598f0ec0bc99 ("sched/numa: Set the scan rate proportional to the memory usage of the task being scanned")
Signed-off-by: Xia Fukun <xiafukun@huawei.com>
---
 kernel/sched/fair.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b173a059315c..ea962e3bcb13 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1496,6 +1496,7 @@ static unsigned int task_nr_scan_windows(struct task_struct *p)
 	 * on resident pages
 	 */
 	nr_scan_pages = sysctl_numa_balancing_scan_size << (20 - PAGE_SHIFT);
+	nr_scan_pages = max_t(unsigned long, nr_scan_pages, 1UL << (20 - PAGE_SHIFT));
 	rss = get_mm_rss(p->mm);
 	if (!rss)
 		rss = nr_scan_pages;
@@ -1510,6 +1511,7 @@ static unsigned int task_nr_scan_windows(struct task_struct *p)
 static unsigned int task_scan_min(struct task_struct *p)
 {
 	unsigned int scan_size = READ_ONCE(sysctl_numa_balancing_scan_size);
+	scan_size = max_t(unsigned int, scan_size, 1);
 	unsigned int scan, floor;
 	unsigned int windows = 1;
 
-- 
2.34.1



* Re: [PATCH] sched/fair: Fix division-by-zero error in task_scan_max()
  2025-08-27 12:34 [PATCH] sched/fair: Fix division-by-zero error in task_scan_max() Xia Fukun
@ 2025-09-02  8:08 ` Xia Fukun
  2025-09-02  8:57 ` Peter Zijlstra
  1 sibling, 0 replies; 3+ messages in thread
From: Xia Fukun @ 2025-09-02  8:08 UTC
  To: mingo, peterz, juri.lelli, vincent.guittot, mgorman, riel
  Cc: dietmar.eggemann, rostedt, bsegall, vschneid, linux-kernel

On 8/27/2025 8:34 PM, Xia Fukun wrote:
> The error can be reproduced by following these steps:
> First, set sysctl_numa_balancing_scan_size to 0:
> 
> echo 0 > /sys/kernel/debug/sched/numa_balancing/scan_size_mb
> 
> Then trigger the clone system call, for example, by using
> pthread_create to create a new thread.
> 
> 	Oops: divide error: 0000 [#1] SMP NOPTI
> 	CPU: 11 UID: 0 PID: 1 Comm: systemd Tainted: G S 6.17.0xfk_v2 #6
> 	Tainted: [S]=CPU_OUT_OF_SPEC
> 	Hardware name: SuperCloud R5210 G12/X12DPi-N6, BIOS 1.1c 08/30/2021
> 	RIP: 0010:task_scan_max+0x24/0x190

Gentle ping ...



* Re: [PATCH] sched/fair: Fix division-by-zero error in task_scan_max()
  2025-08-27 12:34 [PATCH] sched/fair: Fix division-by-zero error in task_scan_max() Xia Fukun
  2025-09-02  8:08 ` Xia Fukun
@ 2025-09-02  8:57 ` Peter Zijlstra
  1 sibling, 0 replies; 3+ messages in thread
From: Peter Zijlstra @ 2025-09-02  8:57 UTC
  To: Xia Fukun
  Cc: mingo, juri.lelli, vincent.guittot, mgorman, riel,
	dietmar.eggemann, rostedt, bsegall, vschneid, linux-kernel

On Wed, Aug 27, 2025 at 12:34:27PM +0000, Xia Fukun wrote:
> The error can be reproduced by following these steps:
> First, set sysctl_numa_balancing_scan_size to 0:
> 
> echo 0 > /sys/kernel/debug/sched/numa_balancing/scan_size_mb
> 
> Then trigger the clone system call, for example, by using
> pthread_create to create a new thread.

How about rejecting 0 instead?
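
One way to read this suggestion, sketched in userspace C rather than as kernel
code: validate the value when it is written and refuse 0, so that readers of
the tunable never see a zero divisor. The function set_scan_size_mb() and the
variable scan_size_mb are hypothetical stand-ins, not the actual debugfs
handler, and the initial value of 256 is only a placeholder.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

static unsigned int scan_size_mb = 256;	/* placeholder initial value */

/*
 * Parse buf as a decimal scan size in MB and store it.
 * Returns 0 on success, or -EINVAL for 0, non-numeric input, or a value
 * that does not fit in unsigned int. The stored value is left untouched
 * on failure.
 */
static int set_scan_size_mb(const char *buf)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno || end == buf || (*end != '\0' && *end != '\n'))
		return -EINVAL;
	if (val == 0 || val > UINT_MAX)
		return -EINVAL;	/* 0 would later be used as a divisor */

	scan_size_mb = (unsigned int)val;
	return 0;
}
```

With this shape the write of 0 from the reproduction steps would fail with
-EINVAL, and the division sites would need no clamp of their own.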

