public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"H . Peter Anvin" <hpa@zytor.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"rcu@vger.kernel.org" <rcu@vger.kernel.org>,
	"mimoja@mimoja.de" <mimoja@mimoja.de>,
	"hewenliang4@huawei.com" <hewenliang4@huawei.com>,
	"hushiyuan@huawei.com" <hushiyuan@huawei.com>,
	"luolongjun@huawei.com" <luolongjun@huawei.com>,
	"hejingxian@huawei.com" <hejingxian@huawei.com>
Subject: Re: [PATCH v3 0/9] Parallel CPU bringup for x86_64
Date: Fri, 28 Jan 2022 21:40:42 +0000	[thread overview]
Message-ID: <YfRi2sY0hVfri5eR@google.com> (raw)
In-Reply-To: <7e92a196e67b1bfa37c1e61a789f2b75a735c06f.camel@infradead.org>

On Fri, Jan 28, 2022, David Woodhouse wrote:
> On Fri, 2021-12-17 at 14:55 -0600, Tom Lendacky wrote:
> > On 12/17/21 2:13 PM, David Woodhouse wrote:
> > > On Fri, 2021-12-17 at 13:46 -0600, Tom Lendacky wrote:
> > > > There's no WARN or PANIC, just a reset. I can look to try and capture some
> > > > KVM trace data if that would help. If so, let me know what events you'd
> > > > like captured.
> > > 
> > > 
> > > Could start with just kvm_run_exit?
> > > 
> > > Reason 8 would be KVM_EXIT_SHUTDOWN and would potentially indicate a
> > > triple fault.
> > 
> > qemu-system-x86-24093   [005] .....  1601.759486: kvm_exit: vcpu 112 reason shutdown rip 0xffffffff81070574 info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x80000b08 error_code 0x00000000
> > 
> > # addr2line -e woodhouse-build-x86_64/vmlinux 0xffffffff81070574
> > /root/kernels/woodhouse-build-x86_64/./arch/x86/include/asm/desc.h:272
> > 
> > Which is: asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8));
> 
> So, I remain utterly bemused by this, and the Milan *guests* I have
> access to can't even kexec with a stock kernel; that is also "too fast"
> and they take a triple fault during the bringup in much the same way —
> even without my parallel patches, and even going back to fairly old
> kernels.
> 
> I wasn't able to follow up with raw serial output during the bringup to
> pinpoint precisely where it happens, because the VM would tear itself
> down in response to the triple fault without actually flushing the last
> virtual serial output :)
> 
> It would be really useful to get access to a suitable host where I can
> spawn this in qemu and watch it fail. I am suspecting a chip-specific
> quirk or bug at this point.

Nope.  You missed a spot.  This also reproduces on a sufficiently large Intel
system (and Milan).  initial_gs gets overwritten by common_cpu_up(), which leads
to a CPU getting the wrong MSR_GS_BASE and then the wrong raw_smp_processor_id(),
resulting in cpu_init_exception_handling() stuffing the wrong GDT and leaving a
NULL TR descriptor for itself.

You also have a lurking bug in the x2APIC ID handling.  Stripping the boot flags
from the prescribed APICID needs to happen before retrieving the x2APIC ID from
CPUID, otherwise bits 31:16 of the ID will be lost.

You owe me two beers ;-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index dcdf49a137d6..23df88c86a0e 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -208,11 +208,14 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
         * in smpboot_control:
         * Bit 0-15     APICID if STARTUP_USE_CPUID_0B is not set
         * Bit 16       Secondary boot flag
-        * Bit 17       Parallel boot flag
+        * Bit 17       Parallel boot flag (STARTUP_USE_CPUID_0B)
         */
        testl   $STARTUP_USE_CPUID_0B, %eax
-       jz      .Lsetup_AP
+       jnz     .Luse_cpuid_0b
+       andl    $0xFFFF, %eax
+       jmp     .Lsetup_AP

+.Luse_cpuid_0b:
        mov     $0x0B, %eax
        xorl    %ecx, %ecx
        cpuid
@@ -220,7 +223,6 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)

 .Lsetup_AP:
        /* EAX contains the APICID of the current CPU */
-       andl    $0xFFFF, %eax
        xorl    %ecx, %ecx
        leaq    cpuid_to_apicid(%rip), %rbx

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 04f5c8de5606..e7fda406f39a 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1093,6 +1093,17 @@ wakeup_cpu_via_init_nmi(int cpu, unsigned long start_ip, int apicid,
        return boot_error;
 }

+static bool do_parallel_bringup = true;
+
+static int __init no_parallel_bringup(char *str)
+{
+       do_parallel_bringup = false;
+
+       return 0;
+}
+early_param("no_parallel_bringup", no_parallel_bringup);
+
+
 int common_cpu_up(unsigned int cpu, struct task_struct *idle)
 {
        int ret;
@@ -1112,7 +1123,8 @@ int common_cpu_up(unsigned int cpu, struct task_struct *idle)
        /* Stack for startup_32 can be just as for start_secondary onwards */
        per_cpu(cpu_current_top_of_stack, cpu) = task_top_of_stack(idle);
 #else
-       initial_gs = per_cpu_offset(cpu);
+       if (!do_parallel_bringup)
+               initial_gs = per_cpu_offset(cpu);
 #endif
        return 0;
 }
@@ -1336,16 +1348,6 @@ int do_cpu_up(unsigned int cpu, struct task_struct *tidle)
        return ret;
 }

-static bool do_parallel_bringup = true;
-
-static int __init no_parallel_bringup(char *str)
-{
-       do_parallel_bringup = false;
-
-       return 0;
-}
-early_param("no_parallel_bringup", no_parallel_bringup);
-
 int native_cpu_up(unsigned int cpu, struct task_struct *tidle)
 {
        int ret;


  reply	other threads:[~2022-01-28 21:40 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-15 14:56 [PATCH v3 0/9] Parallel CPU bringup for x86_64 David Woodhouse
2021-12-15 14:56 ` [PATCH v3 1/9] x86/apic/x2apic: Fix parallel handling of cluster_mask David Woodhouse
2021-12-15 14:56 ` [PATCH v3 2/9] cpu/hotplug: Move idle_thread_get() to <linux/smpboot.h> David Woodhouse
2021-12-15 14:56 ` [PATCH v3 3/9] cpu/hotplug: Add dynamic parallel bringup states before CPUHP_BRINGUP_CPU David Woodhouse
2021-12-15 14:56 ` [PATCH v3 4/9] x86/smpboot: Reference count on smpboot_setup_warm_reset_vector() David Woodhouse
2021-12-15 14:56 ` [PATCH v3 5/9] x86/smpboot: Split up native_cpu_up into separate phases and document them David Woodhouse
2021-12-15 14:56 ` [PATCH v3 6/9] x86/smpboot: Support parallel startup of secondary CPUs David Woodhouse
2021-12-16 14:24   ` Tom Lendacky
2021-12-16 18:24     ` David Woodhouse
2021-12-16 19:00       ` Tom Lendacky
2021-12-16 19:20         ` David Woodhouse
2022-01-29 12:04           ` David Woodhouse
2022-01-31 13:59             ` Borislav Petkov
2022-02-01 10:25               ` David Woodhouse
2022-02-01 10:56                 ` Borislav Petkov
2022-02-01 12:39                   ` David Woodhouse
2022-02-01 12:56                     ` Borislav Petkov
2022-02-01 13:02                       ` David Woodhouse
2021-12-15 14:56 ` [PATCH v3 7/9] x86/smpboot: Send INIT/SIPI/SIPI to secondary CPUs in parallel David Woodhouse
2021-12-15 14:56 ` [PATCH v3 8/9] x86/mtrr: Avoid repeated save of MTRRs on boot-time CPU bringup David Woodhouse
2021-12-15 14:56 ` [PATCH v3 9/9] x86/smpboot: Serialize topology updates for secondary bringup David Woodhouse
2021-12-16 16:27 ` [PATCH v3 0/9] Parallel CPU bringup for x86_64 Tom Lendacky
2021-12-16 19:24   ` David Woodhouse
2021-12-16 22:52     ` Tom Lendacky
2021-12-17  0:13       ` David Woodhouse
2021-12-17 10:09         ` Igor Mammedov
2021-12-17 15:40           ` David Woodhouse
2021-12-20 17:10           ` David Woodhouse
2021-12-20 18:54             ` Tom Lendacky
2021-12-20 21:29               ` David Woodhouse
2021-12-20 21:47                 ` Tom Lendacky
2021-12-21 22:25                   ` Tom Lendacky
2021-12-21 22:33                     ` David Woodhouse
2021-12-17 17:48         ` Tom Lendacky
2021-12-17 19:11           ` David Woodhouse
2021-12-17 19:26             ` David Woodhouse
2021-12-17 20:15               ` Tom Lendacky
2021-12-17 19:46             ` Tom Lendacky
2021-12-17 20:13               ` David Woodhouse
2021-12-17 20:55                 ` Tom Lendacky
2021-12-17 22:48                   ` David Woodhouse
2022-01-28  9:54                   ` David Woodhouse
2022-01-28 21:40                     ` Sean Christopherson [this message]
2022-01-28 21:48                       ` David Woodhouse
2022-01-29  9:22                       ` David Woodhouse
2021-12-16 19:52   ` David Woodhouse
2021-12-16 19:55     ` Tom Lendacky
2021-12-16 19:59       ` David Woodhouse
2021-12-27 16:57 ` Paul Menzel
2021-12-28 11:34   ` Paul Menzel
2021-12-28 14:18     ` David Woodhouse
2021-12-29 13:18       ` Paul Menzel
2021-12-29 13:54         ` David Woodhouse
2022-02-14 13:45           ` Paul Menzel
2022-04-21 10:00             ` Mimoja
2022-04-22 21:19               ` Tom Lendacky
2022-06-01  8:30                 ` David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YfRi2sY0hVfri5eR@google.com \
    --to=seanjc@google.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=dwmw2@infradead.org \
    --cc=hejingxian@huawei.com \
    --cc=hewenliang4@huawei.com \
    --cc=hpa@zytor.com \
    --cc=hushiyuan@huawei.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luolongjun@huawei.com \
    --cc=mimoja@mimoja.de \
    --cc=mingo@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=rcu@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox