From: Thomas Gleixner <tglx@linutronix.de>
To: Ashok Raj <ashok.raj@intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
x86@kernel.org, Mario Limonciello <mario.limonciello@amd.com>,
Tom Lendacky <thomas.lendacky@amd.com>,
Tony Battersby <tonyb@cybernetics.com>,
Ashok Raj <ashok.raj@linux.intel.com>,
Tony Luck <tony.luck@intel.com>,
Arjan van de Veen <arjan@linux.intel.com>,
Eric Biederman <ebiederm@xmission.com>,
Ashok Raj <ashok.raj@intel.com>
Subject: Re: [patch V2 1/8] x86/smp: Make stop_other_cpus() more robust
Date: Wed, 14 Jun 2023 21:53:21 +0200 [thread overview]
Message-ID: <87ilbp95xq.ffs@tglx> (raw)
In-Reply-To: <ZIoYMakfbAU9EOjc@a4bf019067fa.jf.intel.com>
On Wed, Jun 14 2023 at 12:42, Ashok Raj wrote:
> On Tue, Jun 13, 2023 at 02:17:55PM +0200, Thomas Gleixner wrote:
>>
>> WBINVD is an expensive operation and if multiple CPUs issue it at the same
>> time the resulting delays are even larger.
>
> Is this situation similar to what happened with the unexpected wakeups from
> mwait_play_dead()?
>
> i.e the wbinvd() takes a while, and when CPU0 moves ahead, the previous
> kernel marches past the wbinvd() instruction since we didn't wait to ensure
> this has indeed completed?
This is about reboot and I don't know how wbinvd() interacts with
that. But yes, this could be an issue vs. kexec() too.
> native_machine_halt()
> {
> machine_shutdown()->stop_other_cpus()
> stop_this_cpu();<---- Unbalanced atomic_dec()?
> }
>
> But the last stop_this_cpu() in native_machine_halt() would
> make the count go negative? Maybe that's OK since no one is waiting for it
> to go zero at that point?
Correct it's the CPU which stopped the others.
>> timeout = USEC_PER_MSEC * 10;
>> - while (num_online_cpus() > 1 && (wait || timeout--))
>> + while (atomic_read(&stop_cpus_count) > 0 && (wait || timeout--))
>> udelay(1);
>> }
>
> If we go down the INIT path, life is less complicated..
>
> After REBOOT_VECTOR IPI, if stop_cpus_count > 0, we send NMI to all CPUs.
> Won't this completely update the atomic_dec() since CPUs in hlt() will also
> take the NMI correct? I'm not sure if this is problematic.
>
> Or should we reinitialize stop_cpus_count before the NMI hurrah
Bah. Didn't think about HLT. Let me go back to the drawing board. Good catch!
>> + /*
>> + * Ensure that the cache line is invalidated on the other CPUs. See
>> + * comment vs. SME in stop_this_cpu().
>> + */
>> + atomic_set(&stop_cpus_count, INT_MAX);
>
> Didn't understand why INT_MAX here?
Any random number will do. The only purpose is to ensure that there is
no dirty cache line on the other (stopped) CPUs.
Now let me look into this NMI cruft.
Thanks,
tglx
next prev parent reply other threads:[~2023-06-14 19:53 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-13 12:17 [patch V2 0/8] x86/smp: Cure stop_other_cpus() and kexec() troubles Thomas Gleixner
2023-06-13 12:17 ` [patch V2 1/8] x86/smp: Make stop_other_cpus() more robust Thomas Gleixner
2023-06-14 19:42 ` Ashok Raj
2023-06-14 19:53 ` Thomas Gleixner [this message]
2023-06-14 20:47 ` Ashok Raj
2023-06-14 22:40 ` Thomas Gleixner
2023-06-13 12:17 ` [patch V2 2/8] x86/smp: Dont access non-existing CPUID leaf Thomas Gleixner
2023-06-13 12:17 ` [patch V2 3/8] x86/smp: Remove pointless wmb() from native_stop_other_cpus() Thomas Gleixner
2023-06-15 8:58 ` Peter Zijlstra
2023-06-15 10:57 ` Thomas Gleixner
2023-06-13 12:17 ` [patch V2 4/8] x86/smp: Acquire stopping_cpu unconditionally Thomas Gleixner
2023-06-15 9:02 ` Peter Zijlstra
2023-06-13 12:18 ` [patch V2 5/8] x86/smp: Use dedicated cache-line for mwait_play_dead() Thomas Gleixner
2023-06-13 12:18 ` [patch V2 6/8] x86/smp: Cure kexec() vs. mwait_play_dead() breakage Thomas Gleixner
2023-06-13 12:18 ` [patch V2 7/8] x86/smp: Split sending INIT IPI out into a helper function Thomas Gleixner
2023-06-13 12:18 ` [patch V2 8/8] x86/smp: Put CPUs into INIT on shutdown if possible Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ilbp95xq.ffs@tglx \
--to=tglx@linutronix.de \
--cc=arjan@linux.intel.com \
--cc=ashok.raj@intel.com \
--cc=ashok.raj@linux.intel.com \
--cc=ebiederm@xmission.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mario.limonciello@amd.com \
--cc=thomas.lendacky@amd.com \
--cc=tony.luck@intel.com \
--cc=tonyb@cybernetics.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox