From: Usama Arif <usama.arif@bytedance.com>
To: paulmck@kernel.org, dwmw2@infradead.org, tglx@linutronix.de
Cc: kim.phillips@amd.com, arjan@linux.intel.com, mingo@redhat.com,
bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
x86@kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
rcu@vger.kernel.org, mimoja@mimoja.de, hewenliang4@huawei.com,
thomas.lendacky@amd.com, seanjc@google.com,
pmenzel@molgen.mpg.de, fam.zheng@bytedance.com,
punit.agrawal@bytedance.com, simon.evans@bytedance.com,
liangma@liangbit.com
Subject: Re: [External] Re: [PATCH v7 0/9] Parallel CPU bringup for x86_64
Date: Thu, 9 Feb 2023 09:49:04 +0000
Message-ID: <8e2f03e2-9517-aeb4-df60-b36ef3ff3a75@bytedance.com>
In-Reply-To: <20230209035300.GA3216394@paulmck-ThinkPad-P17-Gen-1>
On 09/02/2023 03:53, Paul E. McKenney wrote:
> On Tue, Feb 07, 2023 at 11:04:27PM +0000, Usama Arif wrote:
>> Tested on v7, doing INIT/SIPI/SIPI in parallel brings down the time for
>> smpboot from ~700ms to 100ms (85% improvement) on a server with 128 CPUs
>> split across 2 NUMA nodes.
>>
>> The major change over v6 is keeping parallel smp support enabled in AMD.
>> APIC ID for parallel CPU bringup is now obtained from CPUID leaf 0x0B
>> (for x2APIC mode) otherwise CPUID leaf 0x1 (8 bits).
>>
>> The patch for reusing timer calibration for secondary CPUs is also removed
>> from the series as it's not part of parallel smp bringup and needs to be
>> further thought about.
>
> Running rcutorture on this got me the following NULL pointer dereference
> on scenario TREE01:
>
> ------------------------------------------------------------------------
>
> [ 34.662066] smpboot: CPU 0 is now offline
> [ 34.674075] rcu: NOCB: Cannot CB-offload offline CPU 25
> [ 35.038003] rcu: De-offloading 5
> [ 35.112997] rcu: Offloading 12
> [ 35.716011] smpboot: Booting Node 0 Processor 0 APIC 0x0
> [ 35.762685] BUG: kernel NULL pointer dereference, address: 0000000000000001
> [ 35.764278] #PF: supervisor instruction fetch in kernel mode
> [ 35.765530] #PF: error_code(0x0010) - not-present page
> [ 35.766700] PGD 0 P4D 0
> [ 35.767278] Oops: 0010 [#1] PREEMPT SMP PTI
> [ 35.768223] CPU: 36 PID: 0 Comm: swapper/36 Not tainted 6.2.0-rc1-00206-g18a37610b632-dirty #3563
> [ 35.770201] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
>
> ------------------------------------------------------------------------
>
> Given an x86 system with KVM and qemu, this can be reproduced by running
> the following from the top-level directory in the Linux-kernel source
> tree:
>
> tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --configs "TREE01 TINY01" --trust-make
>
> Out of 15 runs, 14 blew up just after the first attempt to bring CPU
> 0 back online. The 15th run blew up just after the second attempt to
> bring CPU 0 online, the first attempt having succeeded.
>
> My guess is that the CONFIG_BOOTPARAM_HOTPLUG_CPU0=y Kconfig option is
> tickling this bug. This Kconfig option has been added to the TREE01
> scenario in the -rcu tree's "dev" branch, which might mean that this test
> would pass on mainline. But CONFIG_BOOTPARAM_HOTPLUG_CPU0=y is not new,
> only rcutorture's testing of it.
>
> Thoughts?
>
> Thanx, Paul

It looks like it's because initial_gs, initial_stack and
early_gdt_descr are not being set up properly for CPU0 hotplug, i.e.
init_cpu_data isn't called in the CPU0 hotplug case.

It's easy to test, just by doing:

  echo 0 > /sys/devices/system/cpu/cpu0/online;
  echo 1 > /sys/devices/system/cpu/cpu0/online;

As a quick check, if we do something like the below (there is probably a
much better place to set these), the above hotplug commands will work.

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3ec5182d9698..184135c47ee5 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1080,6 +1080,7 @@ wakeup_cpu_via_init_nmi(int cpu, unsigned long start_ip, int apicid,
 					  wakeup_cpu0_nmi, 0, "wake_cpu0");
 
 	if (!boot_error) {
+		initial_gs = per_cpu_offset(cpu);
 		enable_start_cpu0 = 1;
 		*cpu0_nmi_registered = 1;
 		id = apic->dest_mode_logical ? cpu0_logical_apicid : apicid;
@@ -1188,10 +1189,14 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle,
 		boot_error = apic->wakeup_secondary_cpu_64(apicid, start_ip);
 	else if (apic->wakeup_secondary_cpu)
 		boot_error = apic->wakeup_secondary_cpu(apicid, start_ip);
-	else
+	else {
+		if (!cpu) {
+			early_gdt_descr.address = (unsigned long)get_cpu_gdt_rw(cpu);
+			initial_stack = idle->thread.sp;
+		}
 		boot_error = wakeup_cpu_via_init_nmi(cpu, start_ip, apicid,
 						     cpu0_nmi_registered);
-
+	}
 	return boot_error;
 }
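
Since one of the rcutorture runs only failed on the second attempt to
bring CPU 0 online, it may be worth repeating the offline/online cycle
a few times rather than doing it once. A minimal sketch of that,
assuming a POSIX shell and root privileges; the function name cycle_cpu
and the CPU_SYSFS_ROOT override are hypothetical names made up for this
example (the override only exists so the loop can be dry-run against a
scratch directory), and on an affected kernel the box will most likely
oops before the script gets to report anything:

```shell
#!/bin/sh
# Sketch: cycle one CPU offline and back online through sysfs, N times.
# cycle_cpu and CPU_SYSFS_ROOT are illustrative names, not existing tools.

cycle_cpu() {
	cpu=$1
	rounds=${2:-1}
	ctl="${CPU_SYSFS_ROOT:-/sys/devices/system/cpu}/cpu${cpu}/online"

	# The control file is absent or read-only when the CPU cannot be
	# hotplugged, or when we are not running as root.
	if [ ! -w "$ctl" ]; then
		echo "cpu${cpu}: ${ctl} is not writable" >&2
		return 1
	fi

	i=1
	while [ "$i" -le "$rounds" ]; do
		echo 0 > "$ctl" || return 1
		echo 1 > "$ctl" || return 1

		# Read the state back to confirm the CPU really came up again.
		state=$(cat "$ctl")
		if [ "$state" != "1" ]; then
			echo "cpu${cpu}: round ${i}: still offline" >&2
			return 1
		fi

		echo "cpu${cpu}: round ${i} ok"
		i=$((i + 1))
	done
}
```

With the function sourced into a root shell, "cycle_cpu 0 5" would take
CPU 0 through five offline/online rounds and stop at the first failure.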