qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Harsh Prateek Bora <harshpb@linux.ibm.com>
To: Nicholas Piggin <npiggin@gmail.com>,
	qemu-ppc@nongnu.org, qemu-devel@nongnu.org
Cc: danielhb413@gmail.com, vaibhav@linux.ibm.com, sbhat@linux.ibm.com
Subject: Re: [PATCH] target/ppc: handle vcpu hotplug failure gracefully
Date: Mon, 20 May 2024 15:25:51 +0530	[thread overview]
Message-ID: <85fd9f8a-0a46-4a3e-ab8c-e63633760836@linux.ibm.com> (raw)
In-Reply-To: <D1BMB9HRF50E.2UC36WJ1ZHAFU@gmail.com>



On 5/17/24 09:30, Nicholas Piggin wrote:
> On Thu May 16, 2024 at 2:31 PM AEST, Harsh Prateek Bora wrote:
>> Hi Nick,
>>
>> On 5/14/24 08:39, Nicholas Piggin wrote:
>>> On Tue Apr 23, 2024 at 4:30 PM AEST, Harsh Prateek Bora wrote:
>>>> + qemu-devel
>>>>
>>>> On 4/23/24 11:40, Harsh Prateek Bora wrote:
>>>>> On ppc64, the PowerVM hypervisor runs with limited memory and a VCPU
>>>>> creation during hotplug may fail during kvm_ioctl for KVM_CREATE_VCPU,
>>>>> leading to termination of guest since errp is set to &error_fatal while
>>>>> calling kvm_init_vcpu. This unexpected behaviour can be avoided by
>>>>> pre-creating vcpu and parking it on success or return error otherwise.
>>>>> This enables graceful error delivery for any vcpu hotplug failures while
>>>>> the guest can keep running.
>>>
>>> So this puts in on the park list so when kvm_init_vcpu() later runs it
>>> will just take it off the park list instead of issuing another
>>> KVM_CREATE_VCPU ioctl.
>>>
>>> And kvm_init_vcpu() runs in the vcpu thread function, which does not
>>> have a good way to indicate failure to the caller.
>>>
>>> I'm don't know a lot about this part of qemu but it seems like a good
>>> idea to move fail-able initialisation out of the vcpu thread in that
>>> case. So the general idea seems good to me.
>>>
>>
>> Yeh ..
>>
>>>>>
>>>>> Based on api refactoring to create/park vcpus introduced in 1/8 of patch series:
>>>>> https://lore.kernel.org/qemu-devel/20240312020000.12992-2-salil.mehta@huawei.com/
>>>
>>> So from this series AFAIKS you're just using kvm_create / kvm_park
>>> routines? You could easily pull that patch 1 out ahead of that larger
>>> series if progress is slow on it, it's a decent cleanup by itself by
>>> the looks.
>>>
>>
>> Yeh, patch 1 of that series is only we need but the author mentioned on
>> the list that he is about to post next version soon.
>>
>>>>>
>>>>> Tested OK by repeatedly doing a hotplug/unplug of vcpus as below:
>>>>>
>>>>>     #virsh setvcpus hotplug 40
>>>>>     #virsh setvcpus hotplug 70
>>>>> error: internal error: unable to execute QEMU command 'device_add':
>>>>> kvmppc_cpu_realize: vcpu hotplug failed with -12
>>>>>
>>>>> Reported-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
>>>>> Suggested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
>>>>> Suggested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>>>>> Signed-off by: Harsh Prateek Bora <harshpb@linux.ibm.com>
>>>>> ---
>>>>> ---
>>>>>     target/ppc/kvm.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>>>>>     1 file changed, 42 insertions(+)
>>>>>
>>>>> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
>>>>> index 8231feb2d4..c887f6dfa0 100644
>>>>> --- a/target/ppc/kvm.c
>>>>> +++ b/target/ppc/kvm.c
>>>>> @@ -48,6 +48,8 @@
>>>>>     #include "qemu/mmap-alloc.h"
>>>>>     #include "elf.h"
>>>>>     #include "sysemu/kvm_int.h"
>>>>> +#include "sysemu/kvm.h"
>>>>> +#include "hw/core/accel-cpu.h"
>>>>>     
>>>>>     #define PROC_DEVTREE_CPU      "/proc/device-tree/cpus/"
>>>>>     
>>>>> @@ -2339,6 +2341,43 @@ static void alter_insns(uint64_t *word, uint64_t flags, bool on)
>>>>>         }
>>>>>     }
>>>>>     
>>>>> +static int max_cpu_index = 0;
>>>>> +
>>>>> +static bool kvmppc_cpu_realize(CPUState *cs, Error **errp)
>>>>> +{
>>>>> +    int ret;
>>>>> +
>>>>> +    cs->cpu_index = max_cpu_index++;
>>>>> +
>>>>> +    POWERPC_CPU(cs)->vcpu_id = cs->cpu_index;
>>>
>>> So you're overriding the cpu_get_free_index() allocator here.
>>> And you need to because vcpu_id needs to be assigned before
>>> the KVM create, I guess.
>>>
>>
>> Yes ..
>>
>>> I guess it works. I would add a comment like s390x has.
>>>
>> Not sure which comment you were referring to but with exporting
>> cpu_get_free_index as suggested later, not sure if we still need any
>> comment.
> 
> Yeah that's true.
> 
>>>>> +
>>>>> +    if (cs->parent_obj.hotplugged) {
>>>
>>> Can _all_ kvm cpu creation go via this path? Why just limit it to
>>> hotplugged?
>>
>> For the initial bootup, we actually want to abort if the requested vCPUs
>> cant be allocated so that user can retry until the requested vCPUs are
>> allocated. For hotplug failure, bringing down entire guest isn't fair,
>> hence the fix.
> 
> But you could make the error handling depend on hotplugged, no?
> Perhaps put that error handling decision in common code so policy
> is the same for all targets and back ends.

Hmm, I think just setting errp appropriately would suffice for both
cases as existing behaviour takes care of the rest of handling.
Something like below:

+static bool kvmppc_cpu_realize(CPUState *cs, Error **errp)
+{
+    int ret;
+    const char *vcpu_str = (cs->parent_obj.hotplugged == true) ?
+                           "hotplug" : "create";
+    cs->cpu_index = cpu_get_free_index();
+
+    POWERPC_CPU(cs)->vcpu_id = cs->cpu_index;
+
+    /* create and park to fail gracefully in case vcpu hotplug fails */
+    ret = kvm_create_and_park_vcpu(cs);
+    if (ret) {
+        error_setg(errp, "%s: vcpu %s failed with %d",
+                         __func__, vcpu_str, ret);
+        return false;
+    }
+    return true;
+}


> 
> [...]
> 
>>>>> +}
>>>>> +
>>>>>     static void kvmppc_host_cpu_class_init(ObjectClass *oc, void *data)
>>>>>     {
>>>>>         PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
>>>>> @@ -2963,4 +3002,7 @@ bool kvm_arch_cpu_check_are_resettable(void)
>>>>>     
>>>>>     void kvm_arch_accel_class_init(ObjectClass *oc)
>>>>>     {
>>>>> +    AccelClass *ac = ACCEL_CLASS(oc);
>>>>> +    ac->cpu_common_realize = kvmppc_cpu_realize;
>>>>> +    ac->cpu_common_unrealize = kvmppc_cpu_unrealize;
>>>>>     }
> 
> One other thing I noticed -- cpu_common_realize seems to be for
> core code and cpu_target_realize for targets. Should we be
> using the latter here? If not, a comment would be warranted and
> probably also a comment in accel_cpu_common_realize().

Well, I looked at that initially but looks like accel_cpu itself (which
contains vector for cpu_target_realize) doesnt get initialized in case
of kvm on ppc for some reason and therefore wouldnt work. I think
this change is generic enough for cpu_common_realize as well (now having
similar behaviour for initial vcpus also), so I guess comment may not 
really be needed, but let me know if still need any comment.

Else, I can post next version with above proposed diff.

regards,
Harsh

> 
> Thanks,
> Nick


      reply	other threads:[~2024-05-20  9:56 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20240423061058.595674-1-harshpb@linux.ibm.com>
2024-04-23  6:30 ` [PATCH] target/ppc: handle vcpu hotplug failure gracefully Harsh Prateek Bora
2024-05-14  3:09   ` Nicholas Piggin
2024-05-16  4:31     ` Harsh Prateek Bora
2024-05-17  4:00       ` Nicholas Piggin
2024-05-20  9:55         ` Harsh Prateek Bora [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=85fd9f8a-0a46-4a3e-ab8c-e63633760836@linux.ibm.com \
    --to=harshpb@linux.ibm.com \
    --cc=danielhb413@gmail.com \
    --cc=npiggin@gmail.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=sbhat@linux.ibm.com \
    --cc=vaibhav@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).