From: David Gibson <david@gibson.dropbear.id.au>
To: Daniel Henrique Barboza <danielhb413@gmail.com>
Cc: Xujun Ma <xuma@redhat.com>,
qemu-ppc@nongnu.org, Greg Kurz <groug@kaod.org>,
qemu-devel@nongnu.org
Subject: Re: [PATCH v1 7/7] spapr.c: consider CPU core online state before allowing unplug
Date: Mon, 18 Jan 2021 12:18:48 +1100 [thread overview]
Message-ID: <20210118011848.GD2089552@yekko.fritz.box> (raw)
In-Reply-To: <14cfa384-9972-6817-5c5f-f37bc1880043@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 6876 bytes --]
On Fri, Jan 15, 2021 at 03:52:56PM -0300, Daniel Henrique Barboza wrote:
>
>
> On 1/15/21 2:22 PM, Greg Kurz wrote:
> > On Thu, 14 Jan 2021 15:06:28 -0300
> > Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
> >
> > > The only restriction we have when unplugging CPUs is to forbid unplug of
> > > the boot cpu core. spapr_core_unplug_possible() does not contemplate the
>
> I'll look into it.
>
> >
> > I can't remember why this restriction was introduced in the first place...
> > This should be investigated and documented if the limitation still stands.
> >
> > > possibility of some cores being offlined by the guest, meaning that we're
> > > rolling the dice regarding on whether we're unplugging the last online
> > > CPU core the guest has.
> > >
> >
> > Trying to unplug the last CPU is obviously something that deserves
> > special care. LoPAPR is quite explicit on the outcome : this should
> > terminate the partition.
> >
> > 13.7.4.1.1. Isolation of CPUs
> >
> > The isolation of a CPU, in all cases, is preceded by the stop-self
> > RTAS function for all processor threads, and the OS insures that all
> > the CPU’s threads are in the RTAS stopped state prior to isolating the
> > CPU. Isolation of a processor that is not stopped produces unpredictable
> > results. The stopping of the last processor thread of a LPAR partition
> > effectively kills the partition, and at that point, ownership of all
> > partition resources reverts to the platform firmware.
>
>
> I was just investigating a reason why we should check for all thread
> states before unplugging the core, like David suggested in his reply.
> rtas_stop_self() was setting 'cs->halted = 1' without a thread activity
> check like ibm,suspend-me() does and I was wondering why. This text you sent
> explains it, quoting:
>
> "> The isolation of a CPU, in all cases, is preceded by the stop-self
> RTAS function for all processor threads, and the OS insures that all
> the CPU’s threads are in the RTAS stopped state prior to isolating the
> CPU."
>
>
> This seems to be corroborated by arch/powerpc/platform/pseries/hotplug-cpu.c:
Um... this seems like you're overcomplicating this. The crucial point
here is that 'start-cpu' and 'stop-self' operate on individual
threads, where as dynamic reconfiguration hotplug and unplug works on
whole cores.
> static void pseries_cpu_offline_self(void)
> {
> unsigned int hwcpu = hard_smp_processor_id();
>
> local_irq_disable();
> idle_task_exit();
> if (xive_enabled())
> xive_teardown_cpu();
> else
> xics_teardown_cpu();
>
> unregister_slb_shadow(hwcpu);
> rtas_stop_self();
>
> /* Should never get here... */
> BUG();
> for(;;);
> }
>
>
> IIUC this means that we can rely on cs->halted = 1 since it's coming right
> after a rtas_stop_self() call. This is still a bit confusing though and I
> wouldn't mind standardizing the 'CPU core is offline' condition with what
> ibm,suspend-me is already doing.
At the moment we have no tracking of whether a core is online. We
kinda-sorta track whether a *thread* is online through stop-self /
start-cpu.
> David, what do you think?
I think we can rely on cs->halted = 1 when the thread is offline.
What I'm much less certain about is whether we can count on the thread
being offline when cs->halted = 1.
> > R1-13.7.4.1.1-1. For the LRDR option: Prior to issuing the RTAS
> > set-indicator specifying isolate isolation-state of a CPU DR
> > connector type, all the CPU threads must be in the RTAS stopped
> > state.
> >
> > R1-13.7.4.1.1-2. For the LRDR option: Stopping of the last processor
> > thread of a LPAR partition with the stop-self RTAS function, must kill
> > the partition, with ownership of all partition resources reverting to
> > the platform firmware.
> >
> > This is clearly not how things work today : linux doesn't call
> > "stop-self" on the last vCPU and even if it did, QEMU doesn't
> > terminate the VM.
> >
> > If there's a valid reason to not implement this PAPR behavior, I'd like
> > it to be documented.
>
>
> I can only speculate. This would create a unorthodox way of shutting down
> the guest, when the user can just shutdown the whole thing gracefully.
>
> Unless we're considering borderline cases, like the security risk mentioned
> in the kernel docs (Documentation/core-api/cpu_hotplug.rst):
>
> "Such advances require CPUs available to a kernel to be removed either for
> provisioning reasons, or for RAS purposes to keep an offending CPU off
> system execution path. Hence the need for CPU hotplug support in the
> Linux kernel."
>
> In this extreme scenario I can see a reason to kill the partition/guest
> by offlining the last online CPU - if it's compromising the host we'd
> rather terminate immediately instead of waiting for graceful
> shutdown.
I'm a bit confused by this comment. You seem to be conflating
online/offline operations (start-cpu/stop-self) with hot plug/unplug
operations - they're obviously related, but they're not the same
thing.
> > > If we hit the jackpot, we're going to detach the core DRC and pulse the
> > > hotplug IRQ, but the guest OS will refuse to release the CPU. Our
> > > spapr_core_unplug() DRC release callback will never be called and the CPU
> > > core object will keep existing in QEMU. No error message will be sent
> > > to the user, but the CPU core wasn't unplugged from the guest.
> > >
> > > If the guest OS onlines the CPU core again we won't be able to hotunplug it
> > > either. 'dmesg' inside the guest will report a failed attempt to offline an
> > > unknown CPU:
> > >
> > > [ 923.003994] pseries-hotplug-cpu: Failed to offline CPU <NULL>, rc: -16
> > >
> > > This is the result of stopping the DRC state transition in the middle in the
> > > first failed attempt.
> > >
> >
> > Yes, at this point only a machine reset can fix things up.
> >
> > Given this is linux's choice not to call "stop-self" as it should do, I'm not
> > super fan of hardcoding this logic in QEMU, unless there are really good
> > reasons to do so.
>
> I understand where are you coming from and I sympathize. Not sure about how users
> would feel about that though. They expect a somewhat compatible behavior of
> multi-arch features like hotplug/hotunplug, and x86 will neither hotunplug nor offline
> the last CPU as well.
>
> There is a very high chance that, even if we pull this design off, I'll need to go to
> Libvirt and disable it because we broke compatibility with how vcpu unplug operated
> earlier.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2021-01-18 2:39 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-14 18:06 [PATCH v1 0/7] pseries: avoid unplug the last online CPU core + assorted fixes Daniel Henrique Barboza
2021-01-14 18:06 ` [PATCH v1 1/7] spapr.h: fix trailing whitespace in phb_placement Daniel Henrique Barboza
2021-01-15 0:42 ` David Gibson
2021-01-14 18:06 ` [PATCH v1 2/7] spapr_hcall.c: make do_client_architecture_support static Daniel Henrique Barboza
2021-01-15 0:43 ` David Gibson
2021-01-14 18:06 ` [PATCH v1 3/7] spapr_rtas.c: fix identation in rtas_ibm_nmi_interlock() string Daniel Henrique Barboza
2021-01-15 0:44 ` David Gibson
2021-01-14 18:06 ` [PATCH v1 4/7] spapr_rtas.c: fix identation of rtas_ibm_suspend_me() args Daniel Henrique Barboza
2021-01-15 0:45 ` David Gibson
2021-01-14 18:06 ` [PATCH v1 5/7] spapr_cpu_core.c: use g_auto* in spapr_create_vcpu() Daniel Henrique Barboza
2021-01-15 0:49 ` David Gibson
2021-01-14 18:06 ` [PATCH v1 6/7] spapr.c: introduce spapr_core_unplug_possible() Daniel Henrique Barboza
2021-01-15 0:52 ` David Gibson
2021-01-14 18:06 ` [PATCH v1 7/7] spapr.c: consider CPU core online state before allowing unplug Daniel Henrique Barboza
2021-01-15 1:03 ` David Gibson
2021-01-15 17:22 ` Greg Kurz
2021-01-15 18:52 ` Daniel Henrique Barboza
2021-01-18 1:18 ` David Gibson [this message]
2021-01-18 10:28 ` Daniel Henrique Barboza
2021-01-15 21:43 ` Daniel Henrique Barboza
2021-01-15 21:43 ` Daniel Henrique Barboza
2021-01-18 1:12 ` David Gibson
2021-01-18 11:03 ` Greg Kurz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210118011848.GD2089552@yekko.fritz.box \
--to=david@gibson.dropbear.id.au \
--cc=danielhb413@gmail.com \
--cc=groug@kaod.org \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
--cc=xuma@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).