From: Daniel Henrique Barboza <danielhb413@gmail.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: Xujun Ma <xuma@redhat.com>,
	qemu-ppc@nongnu.org, Greg Kurz <groug@kaod.org>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH v1 7/7] spapr.c: consider CPU core online state before allowing unplug
Date: Mon, 18 Jan 2021 07:28:25 -0300
Message-ID: <5a89f311-9c67-fcd0-8b6c-4dcc9be66c2d@gmail.com>
In-Reply-To: <20210118011848.GD2089552@yekko.fritz.box>



On 1/17/21 10:18 PM, David Gibson wrote:
> On Fri, Jan 15, 2021 at 03:52:56PM -0300, Daniel Henrique Barboza wrote:
>>
>>
>> On 1/15/21 2:22 PM, Greg Kurz wrote:
>>> On Thu, 14 Jan 2021 15:06:28 -0300
>>> Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
>>>
>>>> The only restriction we have when unplugging CPUs is to forbid unplug of
>>>> the boot CPU core. spapr_core_unplug_possible() does not contemplate the
>>
>> I'll look into it.
>>
>>>
>>> I can't remember why this restriction was introduced in the first place...
>>> This should be investigated and documented if the limitation still stands.
>>>
>>>> possibility of some cores being offlined by the guest, meaning that we're
>>>> rolling the dice on whether we're unplugging the last online CPU core the
>>>> guest has.
>>>>
>>>
>>> Trying to unplug the last CPU is obviously something that deserves
>>> special care. LoPAPR is quite explicit on the outcome: this should
>>> terminate the partition.
>>>
>>> 13.7.4.1.1. Isolation of CPUs
>>>
>>> The isolation of a CPU, in all cases, is preceded by the stop-self
>>> RTAS function for all processor threads, and the OS insures that all
>>> the CPU’s threads are in the RTAS stopped state prior to isolating the
>>> CPU. Isolation of a processor that is not stopped produces unpredictable
>>> results. The stopping of the last processor thread of a LPAR partition
>>> effectively kills the partition, and at that point, ownership of all
>>> partition resources reverts to the platform firmware.
>>
>>
>> I was just investigating a reason why we should check for all thread
>> states before unplugging the core, like David suggested in his reply.
>> rtas_stop_self() was setting 'cs->halted = 1' without a thread activity
>> check like ibm,suspend-me() does and I was wondering why. This text you sent
>> explains it, quoting:
>>
>> "> The isolation of a CPU, in all cases, is preceded by the stop-self
>>   RTAS function for all processor threads, and the OS insures that all
>>   the CPU’s threads are in the RTAS stopped state prior to isolating the
>>   CPU."
>>
>>
>> This seems to be corroborated by arch/powerpc/platforms/pseries/hotplug-cpu.c:
> 
> Um... this seems like you're overcomplicating this.  The crucial point
> here is that 'start-cpu' and 'stop-self' operate on individual
> threads, whereas dynamic reconfiguration hotplug and unplug work on
> whole cores.
> 
>> static void pseries_cpu_offline_self(void)
>> {
>> 	unsigned int hwcpu = hard_smp_processor_id();
>>
>> 	local_irq_disable();
>> 	idle_task_exit();
>> 	if (xive_enabled())
>> 		xive_teardown_cpu();
>> 	else
>> 		xics_teardown_cpu();
>>
>> 	unregister_slb_shadow(hwcpu);
>> 	rtas_stop_self();
>>
>> 	/* Should never get here... */
>> 	BUG();
>> 	for(;;);
>> }
>>
>>
>> IIUC this means that we can rely on cs->halted = 1 since it's coming right
>> after a rtas_stop_self() call. This is still a bit confusing though and I
>> wouldn't mind standardizing the 'CPU core is offline' condition with what
>> ibm,suspend-me is already doing.
> 
> At the moment we have no tracking of whether a core is online.  We
> kinda-sorta track whether a *thread* is online through stop-self /
> start-cpu.
> 
>> David, what do you think?
> 
> I think we can rely on cs->halted = 1 when the thread is offline.
> What I'm much less certain about is whether we can count on the thread
> being offline when cs->halted = 1.

I guess we should just stick with your first suggestion then. I'll check
both cs->halted and the MSR to assert whether a whole core is offline, like
ibm,suspend-me does:

         if (!cs->halted || (e->msr & (1ULL << MSR_EE))) {
             rtas_st(rets, 0, H_MULTI_THREADS_ACTIVE);
             return;
         }
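
For the unplug path I'm thinking of something along these lines (a rough,
untested sketch; spapr_core_threads_offline() is just a name I'm making up
for illustration, iterating over the core's threads):

     static bool spapr_core_threads_offline(SpaprCpuCore *score)
     {
         CPUCore *cc = CPU_CORE(score);
         int i;

         for (i = 0; i < cc->nr_threads; i++) {
             CPUState *cs = CPU(score->threads[i]);
             CPUPPCState *e = &score->threads[i]->env;

             /* same "thread is active" condition ibm,suspend-me uses */
             if (!cs->halted || (e->msr & (1ULL << MSR_EE))) {
                 return false;
             }
         }

         return true;
     }

spapr_core_unplug_possible() could then fail the unplug request with a
proper error message whenever this returns false.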

Another question, not necessarily related to this fix: we do a similar check
at the start of do_client_architecture_support(), returning the same
H_MULTI_THREADS_ACTIVE error, but without checking e->msr:


     /* CAS is supposed to be called early when only the boot vCPU is active. */
     CPU_FOREACH(cs) {
         if (cs == CPU(cpu)) {
             continue;
         }
         if (!cs->halted) {
             warn_report("guest has multiple active vCPUs at CAS, which is not allowed");
             return H_MULTI_THREADS_ACTIVE;
         }
     }


Should we bother changing this logic to check e->msr as well? If there is no
possible harm done, I'd rather have the same check returning H_MULTI_THREADS_ACTIVE
in both places. If there is a special condition in early boot that makes the
existing check in do_client_architecture_support() sufficient, I would like to
put a comment there to make it clear why.
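
If we end up adding it, the loop would become something like this (again just
a sketch; 'c' is a throwaway local I'm making up here, and I'm assuming there
is no early-boot subtlety that the extra msr check would break):

     /* CAS is supposed to be called early when only the boot vCPU is active. */
     CPU_FOREACH(cs) {
         PowerPCCPU *c = POWERPC_CPU(cs);

         if (cs == CPU(cpu)) {
             continue;
         }
         if (!cs->halted || (c->env.msr & (1ULL << MSR_EE))) {
             warn_report("guest has multiple active vCPUs at CAS, which is not allowed");
             return H_MULTI_THREADS_ACTIVE;
         }
     }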



Thanks,


DHB


> 
>>> R1-13.7.4.1.1-1. For the LRDR option: Prior to issuing the RTAS
>>> set-indicator specifying isolate isolation-state of a CPU DR
>>> connector type, all the CPU threads must be in the RTAS stopped
>>> state.
>>>
>>> R1-13.7.4.1.1-2. For the LRDR option: Stopping of the last processor
>>> thread of a LPAR partition with the stop-self RTAS function, must kill
>>> the partition, with ownership of all partition resources reverting to
>>> the platform firmware.
>>>
>>> This is clearly not how things work today: Linux doesn't call
>>> "stop-self" on the last vCPU and even if it did, QEMU doesn't
>>> terminate the VM.
>>>
>>> If there's a valid reason to not implement this PAPR behavior, I'd like
>>> it to be documented.
>>
>>
>> I can only speculate. This would create an unorthodox way of shutting down
>> the guest, when the user can just shut the whole thing down gracefully.
>>
>> Unless we're considering borderline cases, like the security risk mentioned
>> in the kernel docs (Documentation/core-api/cpu_hotplug.rst):
>>
>> "Such advances require CPUs available to a kernel to be removed either for
>> provisioning reasons, or for RAS purposes to keep an offending CPU off
>> system execution path. Hence the need for CPU hotplug support in the
>> Linux kernel."
>>
>> In this extreme scenario I can see a reason to kill the partition/guest
>> by offlining the last online CPU - if it's compromising the host we'd
>> rather terminate immediately instead of waiting for a graceful
>> shutdown.
> 
> I'm a bit confused by this comment.  You seem to be conflating
> online/offline operations (start-cpu/stop-self) with hot plug/unplug
> operations - they're obviously related, but they're not the same
> thing.
> 
>>>> If we hit the jackpot, we're going to detach the core DRC and pulse the
>>>> hotplug IRQ, but the guest OS will refuse to release the CPU. Our
>>>> spapr_core_unplug() DRC release callback will never be called and the CPU
>>>> core object will continue to exist in QEMU. No error message is sent to
>>>> the user, but the CPU core wasn't unplugged from the guest.
>>>>
>>>> If the guest OS onlines the CPU core again we won't be able to hotunplug it
>>>> either. 'dmesg' inside the guest will report a failed attempt to offline an
>>>> unknown CPU:
>>>>
>>>> [  923.003994] pseries-hotplug-cpu: Failed to offline CPU <NULL>, rc: -16
>>>>
>>>> This is the result of stopping the DRC state transition midway through
>>>> the first failed attempt.
>>>>
>>>
>>> Yes, at this point only a machine reset can fix things up.
>>>
>>> Given it is Linux's choice not to call "stop-self" as it should, I'm not
>>> a big fan of hardcoding this logic in QEMU, unless there are really good
>>> reasons to do so.
>>
>> I understand where you are coming from and I sympathize. Not sure how users
>> would feel about that though. They expect somewhat consistent behavior from
>> multi-arch features like hotplug/hotunplug, and x86 will neither hotunplug
>> nor offline the last CPU either.
>>
>> There is a very high chance that, even if we pull this design off, I'll need
>> to go into Libvirt and disable it because it breaks compatibility with how
>> vCPU unplug operated earlier.
> 



Thread overview: 23+ messages
2021-01-14 18:06 [PATCH v1 0/7] pseries: avoid unplug the last online CPU core + assorted fixes Daniel Henrique Barboza
2021-01-14 18:06 ` [PATCH v1 1/7] spapr.h: fix trailing whitespace in phb_placement Daniel Henrique Barboza
2021-01-15  0:42   ` David Gibson
2021-01-14 18:06 ` [PATCH v1 2/7] spapr_hcall.c: make do_client_architecture_support static Daniel Henrique Barboza
2021-01-15  0:43   ` David Gibson
2021-01-14 18:06 ` [PATCH v1 3/7] spapr_rtas.c: fix identation in rtas_ibm_nmi_interlock() string Daniel Henrique Barboza
2021-01-15  0:44   ` David Gibson
2021-01-14 18:06 ` [PATCH v1 4/7] spapr_rtas.c: fix identation of rtas_ibm_suspend_me() args Daniel Henrique Barboza
2021-01-15  0:45   ` David Gibson
2021-01-14 18:06 ` [PATCH v1 5/7] spapr_cpu_core.c: use g_auto* in spapr_create_vcpu() Daniel Henrique Barboza
2021-01-15  0:49   ` David Gibson
2021-01-14 18:06 ` [PATCH v1 6/7] spapr.c: introduce spapr_core_unplug_possible() Daniel Henrique Barboza
2021-01-15  0:52   ` David Gibson
2021-01-14 18:06 ` [PATCH v1 7/7] spapr.c: consider CPU core online state before allowing unplug Daniel Henrique Barboza
2021-01-15  1:03   ` David Gibson
2021-01-15 17:22   ` Greg Kurz
2021-01-15 18:52     ` Daniel Henrique Barboza
2021-01-18  1:18       ` David Gibson
2021-01-18 10:28         ` Daniel Henrique Barboza [this message]
2021-01-15 21:43     ` Daniel Henrique Barboza
2021-01-15 21:43     ` Daniel Henrique Barboza
2021-01-18  1:12     ` David Gibson
2021-01-18 11:03       ` Greg Kurz
