The Linux Kernel Mailing List
From: "Aaron Lu" <ziqianlu@bytedance.com>
To: "Peter Zijlstra" <peterz@infradead.org>
Cc: <mingo@kernel.org>, <linux-kernel@vger.kernel.org>,
	 <juri.lelli@redhat.com>, <vincent.guittot@linaro.org>,
	 <dietmar.eggemann@arm.com>, <rostedt@goodmis.org>,
	<bsegall@google.com>,  <mgorman@suse.de>, <vschneid@redhat.com>,
	<kprateek.nayak@amd.com>,  <tj@kernel.org>, <williams@redhat.com>,
	<jkacur@redhat.com>
Subject: Re: [PATCH 0/2] sched: Remove sched_class::balance()
Date: Thu, 2 Jul 2026 19:49:19 +0800
Message-ID: <20260702114919.GA186418@bytedance.com>
In-Reply-To: <20260624121327.190063948@infradead.org>

Hi Peter,

On Wed, Jun 24, 2026 at 02:13:27PM +0200, Peter Zijlstra wrote:
> Hi!
> 
> As mentioned [1], I was going to move sched_balance_newidle() into
> balance_fair(). And while the patch was quickly done, something didn't feel
> right.
>

Sorry for the late reply, I have been busy with other things.

The test caused a panic with the following message:

[  110.613929] BUG: kernel NULL pointer dereference, address: 0000000000000014
[  110.615988] #PF: supervisor read access in kernel mode
[  110.617444] #PF: error_code(0x0000) - not-present page
[  110.618929] PGD 1044c2067 P4D 0
[  110.619938] Oops: Oops: 0000 [#1] SMP NOPTI
[  110.621491] CPU: 34 UID: 1000 PID: 1394 Comm: sh Not tainted 7.1.0-rc2-00109-gb17be96b4d6c #4 PREEMPT(lazy)
[  110.624764] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[  110.627559] RIP: 0010:pick_next_task+0x42b/0xa20
[  110.628970] Code: 28 60 00 49 89 c7 8b 05 b3 8d 73 05 41 39 c7 0f 83 c3 01 00 00 49 63 cf 48 8b 1c cd 80 7b 96 82 48 01 eb 48 8b b3 50 0f 00 00 <8b> 4e 14 48 8b 0c cd 80 7b 96 82 48 8b 4c 29 20 48 39 ce 74 32 48
[  110.634160] RSP: 0000:ffa000000435bc48 EFLAGS: 00010083
[  110.635261] RAX: 0000000000000040 RBX: ff11000ff85fe140 RCX: 0000000000000022
[  110.636753] RDX: 0000000000000022 RSI: 0000000000000000 RDI: ff110001008b54d8
[  110.638230] RBP: ffffffff87452140 R08: 0000000000000022 R09: 00000000aed806e6
[  110.639719] R10: 0000000000000001 R11: 0000000000000000 R12: ff11000ff85fe140
[  110.641185] R13: 0000000000000000 R14: ff11000102373e88 R15: 0000000000000022
[  110.642688] FS:  00007f698e381780(0000) GS:ff110010711ac000(0000) knlGS:0000000000000000
[  110.644359] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  110.645255] CR2: 0000000000000014 CR3: 00000001044c1001 CR4: 0000000000371ef0
[  110.646366] Call Trace:
[  110.646781]  <TASK>
[  110.647123]  ? sched_clock+0x10/0x30
[  110.647704]  __schedule+0x1a4/0x7c0
[  110.648249]  ? local_clock_noinstr+0xd/0xc0
[  110.648914]  preempt_schedule+0x37/0x50
[  110.649528]  preempt_schedule_thunk+0x16/0x30
[  110.650217]  _raw_spin_unlock+0x37/0x40
[  110.650833]  wp_page_copy+0x2b7/0x5f0
[  110.651408]  __handle_mm_fault+0x45f/0x6e0
[  110.652068]  handle_mm_fault+0x96/0x250
[  110.652685]  do_user_addr_fault+0x236/0x6a0
[  110.653333]  exc_page_fault+0x7a/0x210
[  110.653937]  asm_exc_page_fault+0x26/0x30
[  110.654577] RIP: 0033:0x7f698e3a1f84
[  110.655077] Code: 8d 3d e0 d4 1c 00 e9 0b 15 05 00 0f 1f 00 48 89 7d f8 48 8d 3d cd d4 1c 00 e8 38 14 05 00 48 8b 55 f8 eb 8c 66 90 f3 0f 1e fa <c7> 05 b2 d4 1c 00 00 00 00 00 c3 90 f3 0f 1e fa 31 c9 e9 65 f9 ff
[  110.657442] RSP: 002b:00007fff9a5ff9e8 EFLAGS: 00010246
[  110.658113] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 00007f698e443f63
[  110.659037] RDX: 0000000000000000 RSI: 0000000000000018 RDI: 00007f698e381a60
[  110.659973] RBP: 00007fff9a5ffa10 R08: 0000000000000000 R09: 000000000000000f
[  110.660896] R10: 00007f698e381a50 R11: 0000000000000246 R12: 0000000000000000
[  110.661829] R13: 00000000ffffffff R14: 0000000000000000 R15: 00007fff9a5ffc20
[  110.662745]  </TASK>
[  110.663040] Modules linked in:
[  110.663451] CR2: 0000000000000014
[  110.663893] ---[ end trace 0000000000000000 ]---
[  110.664487] RIP: 0010:pick_next_task+0x42b/0xa20
[  110.665090] Code: 28 60 00 49 89 c7 8b 05 b3 8d 73 05 41 39 c7 0f 83 c3 01 00 00 49 63 cf 48 8b 1c cd 80 7b 96 82 48 01 eb 48 8b b3 50 0f 00 00 <8b> 4e 14 48 8b 0c cd 80 7b 96 82 48 8b 4c 29 20 48 39 ce 74 32 48
[  110.667457] RSP: 0000:ffa000000435bc48 EFLAGS: 00010083
[  110.668146] RAX: 0000000000000040 RBX: ff11000ff85fe140 RCX: 0000000000000022
[  110.669074] RDX: 0000000000000022 RSI: 0000000000000000 RDI: ff110001008b54d8
[  110.670009] RBP: ffffffff87452140 R08: 0000000000000022 R09: 00000000aed806e6
[  110.670942] R10: 0000000000000001 R11: 0000000000000000 R12: ff11000ff85fe140
[  110.671861] R13: 0000000000000000 R14: ff11000102373e88 R15: 0000000000000022
[  110.672787] FS:  00007f698e381780(0000) GS:ff110010711ac000(0000) knlGS:0000000000000000
[  110.673826] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  110.674567] CR2: 0000000000000014 CR3: 00000001044c1001 CR4: 0000000000371ef0
[  110.675487] Kernel panic - not syncing: Fatal exception
[  111.737349] Shutting down cpus with NMI
[  111.738019] Kernel Offset: disabled
[  111.738373] Rebooting in 30 seconds..

$ ./scripts/faddr2line ../obj/guest/vmlinux pick_next_task+0x42b
pick_next_task+0x42b/0xa20:
task_cpu at include/linux/sched.h:2270
(inlined by) is_task_rq_idle at kernel/sched/core.c:6144
(inlined by) cookie_equals at kernel/sched/core.c:6149
(inlined by) pick_next_task at kernel/sched/core.c:6328

This is the cookie_equals(p, cookie) call in pick_next_task().
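
For reference, the fault address is consistent with a NULL task pointer: the faulting instruction bytes are <8b> 4e 14, i.e. a 32-bit load from 0x14(%rsi), RSI is 0 and CR2 is 0x14. Below is a minimal userspace sketch of why task_cpu() on a NULL task would touch 0x14. The mock structs and the cpu_field_address() helper are hypothetical, and the field layout (two 64-bit words, then status, then cpu, with thread_info as the first member of the task) is my assumption about the x86-64 config used here, not something taken from the oops.

```c
#include <stddef.h>
#include <stdint.h>

/* Mock of the leading thread_info fields; the layout is an assumption. */
struct mock_thread_info {
	uint64_t flags;        /* offset 0x00 */
	uint64_t syscall_work; /* offset 0x08 */
	uint32_t status;       /* offset 0x10 */
	uint32_t cpu;          /* offset 0x14 */
};

/* Mock task with thread_info assumed to be the first member. */
struct mock_task {
	struct mock_thread_info thread_info;
};

/* Address that a task_cpu(p)-style load would touch for a given p. */
static uintptr_t cpu_field_address(uintptr_t task_ptr)
{
	return task_ptr + offsetof(struct mock_task, thread_info)
			+ offsetof(struct mock_thread_info, cpu);
}
```

With a task pointer of 0, cpu_field_address() gives 0x14, which matches the reported address.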

Assuming cpuX and cpuY are siblings, it appears the following happened:

     cpuX                                      cpuY

   pick_next_task()
   goto restart_multi

   rqX->core_pick = pick_task(rqX)

   pick_task(rqY)
     pick_task_fair(rqY)
       sched_balance_newidle(rqY)
         raw_spin_rq_unlock(rqY)  // drops core lock

                                           pick_next_task()
                                           goto restart_multi
                                           rqY->core_pick = pick_task(rqY)
                                           rqX->core_pick = pick_task(rqX)

                                           if (rqX->curr == rqX->core_pick)
                                             rqX->core_pick = NULL

                                           UNLOCK rq_lockp(rqY)

         raw_spin_rq_lock(rqY)

   rqY->core_pick = pick_task(rqY)

   p = rqX->core_pick        // NULL
   cookie_equals(p, cookie)  // NULL deref
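
The interleaving above can be replayed in a single thread as a rough userspace sketch. Everything here is hypothetical (mock_rq, mock_pick_task(), sibling_pick(), replay_race()); it is not the kernel code. It assumes pick_task() simply returns the runqueue's current task, and it models the lock drop in sched_balance_newidle() as the point where the sibling's pass runs.

```c
#include <stddef.h>

struct mock_task { int id; };

struct mock_rq {
	struct mock_task *curr;
	struct mock_task *core_pick;
};

/* Hypothetical stand-in for pick_task(): assume it picks the running task. */
static struct mock_task *mock_pick_task(struct mock_rq *rq)
{
	return rq->curr;
}

/* What cpuY does while cpuX has dropped the core lock. */
static void sibling_pick(struct mock_rq *rq_x, struct mock_rq *rq_y)
{
	rq_y->core_pick = mock_pick_task(rq_y);
	rq_x->core_pick = mock_pick_task(rq_x);

	/* The sibling keeps its current task, so its pick is cleared. */
	if (rq_x->curr == rq_x->core_pick)
		rq_x->core_pick = NULL;
}

/* Replays cpuX's path; returns the p that would reach cookie_equals(). */
static struct mock_task *replay_race(struct mock_rq *rq_x, struct mock_rq *rq_y)
{
	/* cpuX picks for itself first. */
	rq_x->core_pick = mock_pick_task(rq_x);

	/* pick_task(rqY): newidle balance drops the lock, cpuY runs here. */
	sibling_pick(rq_x, rq_y);
	/* Lock re-acquired; cpuX continues with its stale view. */

	rq_y->core_pick = mock_pick_task(rq_y);

	/* p = rqX->core_pick, which cpuY has cleared in the meantime. */
	return rq_x->core_pick;
}
```

Calling replay_race() on two runqueues that each have a current task returns NULL, which is the p that cookie_equals() then dereferences.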

BTW, I applied this series on top of commit c095741713d1b ("sched/fair:
Fix newidle vs core-sched") from the queue/sched/core branch.

Thread overview: 8+ messages
2026-06-24 12:13 [PATCH 0/2] sched: Remove sched_class::balance() Peter Zijlstra
2026-06-24 12:13 ` [PATCH 1/2] sched/core: Allow newidle for core-sched Peter Zijlstra
2026-06-24 23:56   ` K Prateek Nayak
2026-06-25 12:41     ` Peter Zijlstra
2026-06-25 12:42     ` Peter Zijlstra
2026-06-24 12:13 ` [PATCH 2/2] sched: Remove sched_class::balance() Peter Zijlstra
2026-07-02 11:49 ` Aaron Lu [this message]
2026-07-03  3:31   ` [PATCH 0/2] " K Prateek Nayak
