From: Tim Chen <tim.c.chen@linux.intel.com>
To: Shrikanth Hegde <sshegde@linux.ibm.com>,
linux-kernel@vger.kernel.org,
"Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: linux-tip-commits@vger.kernel.org, Chen Yu <yu.c.chen@intel.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
K Prateek Nayak <kprateek.nayak@amd.com>,
Srikar Dronamraju <srikar@linux.ibm.com>,
Mohini Narkhede <mohini.narkhede@intel.com>,
x86@kernel.org
Subject: Re: [tip: sched/core] sched/fair: Skip sched_balance_running cmpxchg when balance is not due
Date: Mon, 17 Nov 2025 10:55:07 -0800
Message-ID: <ceffc6f7870711d40f195191d298ca9bf1def022.camel@linux.intel.com>
In-Reply-To: <dffe53a4-0ef2-4346-ad73-c4b71a734b3a@linux.ibm.com>
On Sun, 2025-11-16 at 02:26 +0530, Shrikanth Hegde wrote:
> Hi Peter.
>
> On 11/14/25 5:49 PM, tip-bot2 for Tim Chen wrote:
> > The following commit has been merged into the sched/core branch of tip:
> >
> > Commit-ID: 2265c5d4deeff3bfe4580d9ffe718fd80a414cac
> > Gitweb: https://git.kernel.org/tip/2265c5d4deeff3bfe4580d9ffe718fd80a414cac
> > Author: Tim Chen <tim.c.chen@linux.intel.com>
> > AuthorDate: Mon, 10 Nov 2025 10:47:35 -08:00
> > Committer: Peter Zijlstra <peterz@infradead.org>
> > CommitterDate: Fri, 14 Nov 2025 13:03:05 +01:00
> >
> > sched/fair: Skip sched_balance_running cmpxchg when balance is not due
> >
> >
>
>
> > + if (!need_unlock && (sd->flags & SD_SERIALIZE) && idle != CPU_NEWLY_IDLE) {
> > + if (!atomic_try_cmpxchg_acquire(&sched_balance_running, 0, 1))
>
> This should be atomic_cmpxchg_acquire?
>
> I booted the system with latest sched/core and it crashes at the boot.
>
> BUG: Kernel NULL pointer dereference on read at 0x00000000
> Faulting instruction address: 0xc0000000001db57c
> Oops: Kernel access of bad area, sig: 7 [#1]
> LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=8192 NUMA pSeries
> Modules linked in:
> CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.18.0-rc3+ #242 PREEMPT(lazy)
> NIP [c0000000001db57c] sched_balance_rq+0x560/0x92c
> LR [c0000000001db198] sched_balance_rq+0x17c/0x92c
> Call Trace:
> [c00000111ffdfd10] [c0000000001db198] sched_balance_rq+0x17c/0x92c (unreliable)
> [c00000111ffdfe50] [c0000000001dc598] sched_balance_domains+0x2c4/0x3d0
> [c00000111ffdff00] [c000000000168958] handle_softirqs+0x138/0x414
> [c00000111ffdffe0] [c000000000017d80] do_softirq_own_stack+0x3c/0x50
> [c000000008a57a60] [c000000000168048] __irq_exit_rcu+0x18c/0x1b4
> [c000000008a57a90] [c0000000001691a8] irq_exit+0x20/0x38
> [c000000008a57ab0] [c000000000028c18] timer_interrupt+0x174/0x394
> [c000000008a57b10] [c000000000009f8c] decrementer_common_virt+0x28c/0x290
>
>
> Bisect pointed to:
> git bisect bad 2265c5d4deeff3bfe4580d9ffe718fd80a414cac
> # first bad commit: [2265c5d4deeff3bfe4580d9ffe718fd80a414cac] sched/fair: Skip sched_balance_running cmpxchg when balance is not due
>
>
> I wondered what is really different since the tim's v4 boots fine.
> There is try instead in the tip, i think that is messing it since likely
> we are dereferencing 0?
>
>
> With this diff it boots fine.
>
> ---
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index aaa47ece6a8e..01814b10b833 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11841,7 +11841,7 @@ static int sched_balance_rq(int this_cpu, struct rq *this_rq,
> }
>
> if (!need_unlock && (sd->flags & SD_SERIALIZE)) {
> - if (!atomic_try_cmpxchg_acquire(&sched_balance_running, 0, 1))

The second argument of atomic_try_cmpxchg_acquire() is "int *old", while that of
atomic_cmpxchg_acquire() is "int old". Passing 0 as in the check above therefore
hands the function a NULL pointer, which it then dereferences. To keep using
atomic_try_cmpxchg_acquire(), we would probably have to do something like the
following:

	int zero = 0;
	if (!atomic_try_cmpxchg_acquire(&sched_balance_running, &zero, 1))

Otherwise we should use atomic_cmpxchg_acquire() as below:

> + if (!atomic_cmpxchg_acquire(&sched_balance_running, 0, 1))
Tim
> goto out_balanced;
>
> need_unlock = true;
>
>
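
The pointer-versus-value difference discussed above can be sketched in
userspace with C11 atomics. This is only an analogy, not the kernel
implementation: cmpxchg_acquire(), try_cmpxchg_acquire(), try_lock_balance(),
unlock_balance() and the balance_running flag are hypothetical stand-ins built
on <stdatomic.h>, and the relaxed failure ordering is an assumption made for
the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the sched_balance_running flag. */
static atomic_int balance_running;

/*
 * Value-returning form, modelled on atomic_cmpxchg_acquire():
 * "old" is passed by value and the value observed in *v is returned.
 */
static int cmpxchg_acquire(atomic_int *v, int old, int new_val)
{
	/* On failure, C11 writes the observed value back into "old". */
	atomic_compare_exchange_strong_explicit(v, &old, new_val,
						memory_order_acquire,
						memory_order_relaxed);
	return old;
}

/*
 * Boolean form, modelled on atomic_try_cmpxchg_acquire():
 * "old" is a pointer and is dereferenced, so it must not be NULL.
 */
static bool try_cmpxchg_acquire(atomic_int *v, int *old, int new_val)
{
	return atomic_compare_exchange_strong_explicit(v, old, new_val,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Take the flag with the boolean form; returns true if we now own it. */
static bool try_lock_balance(void)
{
	int zero = 0;	/* expected value lives in a real object */

	return try_cmpxchg_acquire(&balance_running, &zero, 1);
}

/* Drop the flag; the caller must currently hold it. */
static void unlock_balance(void)
{
	assert(atomic_load_explicit(&balance_running, memory_order_relaxed) == 1);
	atomic_store_explicit(&balance_running, 0, memory_order_release);
}
```

Note that the two forms report success differently: the boolean form returns
true when the exchange happened, whereas the value-returning form returns the
old value, so a return of 0 is what indicates that the flag was free and has
now been taken. A "!" in front of one call therefore does not carry over
unchanged to the other.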