From: Thomas Gleixner <tglx@linutronix.de>
To: Shrikanth Hegde <sshegde@linux.ibm.com>,
LKML <linux-kernel@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Gabriele Monaco <gmonaco@redhat.com>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Michael Jeanson <mjeanson@efficios.com>,
Jens Axboe <axboe@kernel.dk>,
"Paul E. McKenney" <paulmck@kernel.org>,
"Gautham R. Shenoy" <gautham.shenoy@amd.com>,
Florian Weimer <fweimer@redhat.com>,
Tim Chen <tim.c.chen@intel.com>,
Yury Norov <yury.norov@gmail.com>
Subject: Re: [patch V3 00/20] sched: Rewrite MM CID management
Date: Fri, 31 Oct 2025 20:36:06 +0100
Message-ID: <87ecqisw2h.ffs@tglx>
In-Reply-To: <ae8de225-3c07-43ac-80d1-ecb6ac45d5d9@linux.ibm.com>
On Thu, Oct 30 2025 at 12:10, Shrikanth Hegde wrote:
>> I am running into crash at boot on power10 pseries.
>> Thought of putting it here first. Me trying to figure out why.
>>
>> I am using your tree.
>> git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git
Can you update and revalidate? There are a couple of fixes there though
I don't know how they would be related.
>> Oops: Kernel access of bad area, sig: 7 [#3]
>> LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=8192 NUMA pSeries
>> Modules linked in: drm drm_panel_orientation_quirks xfs sd_mod sg
>> ibmvscsi ibmveth scsi_transport_srp pseries_wdt dm_mirror dm_region_hash
>> dm_log dm_mod fuse
>> CPU: 96 UID: 0 PID: 0 Comm: swapper/96 Tainted: G D W
>> 6.18.0-rc3+ #4 PREEMPT(lazy)
>> Tainted: [D]=DIE, [W]=WARN
>> NIP [c0000000001b5c10] mm_cid_switch_to+0x58/0x52c
If it happens again, can you decode the source line?
>> LR [c000000001117c84] __schedule+0x4bc/0x760
>> Call Trace:
>> [c00000668367fde0] [c0000000001b53c8] __pick_next_task+0x60/0x2ac
>> (unreliable)
>> [c00000668367fe40] [c000000001117a14] __schedule+0x24c/0x760
>> [c00000668367fee0] [c0000000011183d0] schedule_idle+0x3c/0x64
>> [c00000668367ff10] [c0000000001f2470] do_idle+0x15c/0x1ac
>> [c00000668367ff60] [c0000000001f2788] cpu_startup_entry+0x4c/0x50
>> [c00000668367ff90] [c00000000005ef20] start_secondary+0x284/0x288
>> [c00000668367ffe0] [c00000000000e158] start_secondary_prolog+0x10/0x14
>>
>
> Issue happens with NR_CPUS=8192. System boots fine with NR_CPUS=2048
Hmm. Let me build a kernel with 8K and throw it at a VM then.
Thanks,
tglx