From: Harsh Prateek Bora <harshpb@linux.ibm.com>
To: Nicholas Piggin <npiggin@gmail.com>,
qemu-ppc@nongnu.org, qemu-devel@nongnu.org
Cc: danielhb413@gmail.com, vaibhav@linux.ibm.com, sbhat@linux.ibm.com
Subject: Re: [PATCH] target/ppc: handle vcpu hotplug failure gracefully
Date: Mon, 20 May 2024 15:25:51 +0530 [thread overview]
Message-ID: <85fd9f8a-0a46-4a3e-ab8c-e63633760836@linux.ibm.com> (raw)
In-Reply-To: <D1BMB9HRF50E.2UC36WJ1ZHAFU@gmail.com>
On 5/17/24 09:30, Nicholas Piggin wrote:
> On Thu May 16, 2024 at 2:31 PM AEST, Harsh Prateek Bora wrote:
>> Hi Nick,
>>
>> On 5/14/24 08:39, Nicholas Piggin wrote:
>>> On Tue Apr 23, 2024 at 4:30 PM AEST, Harsh Prateek Bora wrote:
>>>> + qemu-devel
>>>>
>>>> On 4/23/24 11:40, Harsh Prateek Bora wrote:
>>>>> On ppc64, the PowerVM hypervisor runs with limited memory and a VCPU
>>>>> creation during hotplug may fail during kvm_ioctl for KVM_CREATE_VCPU,
>>>>> leading to termination of guest since errp is set to &error_fatal while
>>>>> calling kvm_init_vcpu. This unexpected behaviour can be avoided by
>>>>> pre-creating vcpu and parking it on success or return error otherwise.
>>>>> This enables graceful error delivery for any vcpu hotplug failures while
>>>>> the guest can keep running.
>>>
>>> So this puts in on the park list so when kvm_init_vcpu() later runs it
>>> will just take it off the park list instead of issuing another
>>> KVM_CREATE_VCPU ioctl.
>>>
>>> And kvm_init_vcpu() runs in the vcpu thread function, which does not
>>> have a good way to indicate failure to the caller.
>>>
>>> I'm don't know a lot about this part of qemu but it seems like a good
>>> idea to move fail-able initialisation out of the vcpu thread in that
>>> case. So the general idea seems good to me.
>>>
>>
>> Yeh ..
>>
>>>>>
>>>>> Based on api refactoring to create/park vcpus introduced in 1/8 of patch series:
>>>>> https://lore.kernel.org/qemu-devel/20240312020000.12992-2-salil.mehta@huawei.com/
>>>
>>> So from this series AFAIKS you're just using kvm_create / kvm_park
>>> routines? You could easily pull that patch 1 out ahead of that larger
>>> series if progress is slow on it, it's a decent cleanup by itself by
>>> the looks.
>>>
>>
>> Yeh, patch 1 of that series is only we need but the author mentioned on
>> the list that he is about to post next version soon.
>>
>>>>>
>>>>> Tested OK by repeatedly doing a hotplug/unplug of vcpus as below:
>>>>>
>>>>> #virsh setvcpus hotplug 40
>>>>> #virsh setvcpus hotplug 70
>>>>> error: internal error: unable to execute QEMU command 'device_add':
>>>>> kvmppc_cpu_realize: vcpu hotplug failed with -12
>>>>>
>>>>> Reported-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
>>>>> Suggested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
>>>>> Suggested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>>>>> Signed-off by: Harsh Prateek Bora <harshpb@linux.ibm.com>
>>>>> ---
>>>>> ---
>>>>> target/ppc/kvm.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>>>>> 1 file changed, 42 insertions(+)
>>>>>
>>>>> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
>>>>> index 8231feb2d4..c887f6dfa0 100644
>>>>> --- a/target/ppc/kvm.c
>>>>> +++ b/target/ppc/kvm.c
>>>>> @@ -48,6 +48,8 @@
>>>>> #include "qemu/mmap-alloc.h"
>>>>> #include "elf.h"
>>>>> #include "sysemu/kvm_int.h"
>>>>> +#include "sysemu/kvm.h"
>>>>> +#include "hw/core/accel-cpu.h"
>>>>>
>>>>> #define PROC_DEVTREE_CPU "/proc/device-tree/cpus/"
>>>>>
>>>>> @@ -2339,6 +2341,43 @@ static void alter_insns(uint64_t *word, uint64_t flags, bool on)
>>>>> }
>>>>> }
>>>>>
>>>>> +static int max_cpu_index = 0;
>>>>> +
>>>>> +static bool kvmppc_cpu_realize(CPUState *cs, Error **errp)
>>>>> +{
>>>>> + int ret;
>>>>> +
>>>>> + cs->cpu_index = max_cpu_index++;
>>>>> +
>>>>> + POWERPC_CPU(cs)->vcpu_id = cs->cpu_index;
>>>
>>> So you're overriding the cpu_get_free_index() allocator here.
>>> And you need to because vcpu_id needs to be assigned before
>>> the KVM create, I guess.
>>>
>>
>> Yes ..
>>
>>> I guess it works. I would add a comment like s390x has.
>>>
>> Not sure which comment you were referring to but with exporting
>> cpu_get_free_index as suggested later, not sure if we still need any
>> comment.
>
> Yeah that's true.
>
>>>>> +
>>>>> + if (cs->parent_obj.hotplugged) {
>>>
>>> Can _all_ kvm cpu creation go via this path? Why just limit it to
>>> hotplugged?
>>
>> For the initial bootup, we actually want to abort if the requested vCPUs
>> cant be allocated so that user can retry until the requested vCPUs are
>> allocated. For hotplug failure, bringing down entire guest isn't fair,
>> hence the fix.
>
> But you could make the error handling depend on hotplugged, no?
> Perhaps put that error handling decision in common code so policy
> is the same for all targets and back ends.
Hmm, I think just setting errp appropriately would suffice for both
cases as existing behaviour takes care of the rest of handling.
Something like below:
+static bool kvmppc_cpu_realize(CPUState *cs, Error **errp)
+{
+ int ret;
+ const char *vcpu_str = (cs->parent_obj.hotplugged == true) ?
+ "hotplug" : "create";
+ cs->cpu_index = cpu_get_free_index();
+
+ POWERPC_CPU(cs)->vcpu_id = cs->cpu_index;
+
+ /* create and park to fail gracefully in case vcpu hotplug fails */
+ ret = kvm_create_and_park_vcpu(cs);
+ if (ret) {
+ error_setg(errp, "%s: vcpu %s failed with %d",
+ __func__, vcpu_str, ret);
+ return false;
+ }
+ return true;
+}
>
> [...]
>
>>>>> +}
>>>>> +
>>>>> static void kvmppc_host_cpu_class_init(ObjectClass *oc, void *data)
>>>>> {
>>>>> PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
>>>>> @@ -2963,4 +3002,7 @@ bool kvm_arch_cpu_check_are_resettable(void)
>>>>>
>>>>> void kvm_arch_accel_class_init(ObjectClass *oc)
>>>>> {
>>>>> + AccelClass *ac = ACCEL_CLASS(oc);
>>>>> + ac->cpu_common_realize = kvmppc_cpu_realize;
>>>>> + ac->cpu_common_unrealize = kvmppc_cpu_unrealize;
>>>>> }
>
> One other thing I noticed -- cpu_common_realize seems to be for
> core code and cpu_target_realize for targets. Should we be
> using the latter here? If not, a comment would be warranted and
> probably also a comment in accel_cpu_common_realize().
Well, I looked at that initially but looks like accel_cpu itself (which
contains vector for cpu_target_realize) doesnt get initialized in case
of kvm on ppc for some reason and therefore wouldnt work. I think
this change is generic enough for cpu_common_realize as well (now having
similar behaviour for initial vcpus also), so I guess comment may not
really be needed, but let me know if still need any comment.
Else, I can post next version with above proposed diff.
regards,
Harsh
>
> Thanks,
> Nick
prev parent reply other threads:[~2024-05-20 9:56 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20240423061058.595674-1-harshpb@linux.ibm.com>
2024-04-23 6:30 ` [PATCH] target/ppc: handle vcpu hotplug failure gracefully Harsh Prateek Bora
2024-05-14 3:09 ` Nicholas Piggin
2024-05-16 4:31 ` Harsh Prateek Bora
2024-05-17 4:00 ` Nicholas Piggin
2024-05-20 9:55 ` Harsh Prateek Bora [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=85fd9f8a-0a46-4a3e-ab8c-e63633760836@linux.ibm.com \
--to=harshpb@linux.ibm.com \
--cc=danielhb413@gmail.com \
--cc=npiggin@gmail.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
--cc=sbhat@linux.ibm.com \
--cc=vaibhav@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).