RE: [PATCH V8 1/8] accel/kvm: Extract common KVM vCPU {creation, parking} code

From: Salil Mehta via <qemu-devel@nongnu.org>
To: Harsh Prateek Bora <harshpb@linux.ibm.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"qemu-arm@nongnu.org" <qemu-arm@nongnu.org>
Cc: "maz@kernel.org" <maz@kernel.org>,
	"jean-philippe@linaro.org" <jean-philippe@linaro.org>,
	Jonathan Cameron <jonathan.cameron@huawei.com>,
	"lpieralisi@kernel.org" <lpieralisi@kernel.org>,
	"peter.maydell@linaro.org" <peter.maydell@linaro.org>,
	"richard.henderson@linaro.org" <richard.henderson@linaro.org>,
	"imammedo@redhat.com" <imammedo@redhat.com>,
	"andrew.jones@linux.dev" <andrew.jones@linux.dev>,
	"david@redhat.com" <david@redhat.com>,
	"philmd@linaro.org" <philmd@linaro.org>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"oliver.upton@linux.dev" <oliver.upton@linux.dev>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"mst@redhat.com" <mst@redhat.com>,
	"will@kernel.org" <will@kernel.org>,
	"gshan@redhat.com" <gshan@redhat.com>,
	"rafael@kernel.org" <rafael@kernel.org>,
	"alex.bennee@linaro.org" <alex.bennee@linaro.org>,
	"linux@armlinux.org.uk" <linux@armlinux.org.uk>,
	"darren@os.amperecomputing.com" <darren@os.amperecomputing.com>,
	"ilkka@os.amperecomputing.com" <ilkka@os.amperecomputing.com>,
	"vishnu@os.amperecomputing.com" <vishnu@os.amperecomputing.com>,
	"karl.heubaum@oracle.com" <karl.heubaum@oracle.com>,
	"miguel.luis@oracle.com" <miguel.luis@oracle.com>,
	"salil.mehta@opnsrc.net" <salil.mehta@opnsrc.net>,
	zhukeqian <zhukeqian1@huawei.com>,
	"wangxiongfeng (C)" <wangxiongfeng2@huawei.com>,
	"wangyanan (Y)" <wangyanan55@huawei.com>,
	"jiakernel2@gmail.com" <jiakernel2@gmail.com>,
	"maobibo@loongson.cn" <maobibo@loongson.cn>,
	"lixianglai@loongson.cn" <lixianglai@loongson.cn>,
	Linuxarm <linuxarm@huawei.com>,
	Vaibhav Jain <vaibhav@linux.ibm.com>,
	"sbhat@linux.ibm.com" <sbhat@linux.ibm.com>
Subject: RE: [PATCH V8 1/8] accel/kvm: Extract common KVM vCPU {creation, parking} code
Date: Fri, 3 May 2024 18:43:27 +0000
Message-ID: <ab66fa4ded96458cac2df04f44d53e14@huawei.com>
In-Reply-To: <ca178aae-82b9-4150-9965-50d968787d23@linux.ibm.com>

Hi Harsh,

Sorry for the delay in my reply. I've been off the grid for some time, so I missed
this earlier mail. Please find my reply to your query below.

Thanks

>  From: Harsh Prateek Bora <harshpb@linux.ibm.com>
>  Sent: Friday, March 22, 2024 8:15 AM
>  
>  + Vaibhav, Shiva
>  
>  Hi Salil,
>  
>  I came across your patch while trying to solve a related problem on spapr.
>  One query below ..
>  
>  On 3/12/24 07:29, Salil Mehta via wrote:
>  > KVM vCPU creation is done once during the vCPU realization when Qemu
>  > vCPU thread is spawned. This is common to all the architectures as of now.
>  >
>  > Hot-unplug of vCPU results in destruction of the vCPU object in QOM
>  > but the corresponding KVM vCPU object in the Host KVM is not destroyed
>  > as KVM doesn't support vCPU removal. Therefore, its representative KVM
>  > vCPU object/context in Qemu is parked.
>  >
>  > Refactor architecture common logic so that some APIs could be reused
>  > by vCPU Hotplug code of some architectures likes ARM, Loongson etc.
>  > Update new/old APIs with trace events instead of DPRINTF. No functional
>  > change is intended here.
>  >
>  > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > Reviewed-by: Gavin Shan <gshan@redhat.com>
>  > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
>  > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>  > Tested-by: Xianglai Li <lixianglai@loongson.cn>
>  > Tested-by: Miguel Luis <miguel.luis@oracle.com>
>  > Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
>  > ---
>  >   accel/kvm/kvm-all.c    | 64 ++++++++++++++++++++++++++++++++----------
>  >   accel/kvm/trace-events |  5 +++-
>  >   include/sysemu/kvm.h   | 16 +++++++++++
>  >   3 files changed, 69 insertions(+), 16 deletions(-)
>  >
>  > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>  > index a8cecd040e..3bc3207bda 100644
>  > --- a/accel/kvm/kvm-all.c
>  > +++ b/accel/kvm/kvm-all.c
>  > @@ -126,6 +126,7 @@ static QemuMutex kml_slots_lock;
>  >   #define kvm_slots_unlock()  qemu_mutex_unlock(&kml_slots_lock)
>  >
>  >   static void kvm_slot_init_dirty_bitmap(KVMSlot *mem);
>  > +static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id);
>  >
>  >   static inline void kvm_resample_fd_remove(int gsi)
>  >   {
>  > @@ -314,14 +315,53 @@ err:
>  >       return ret;
>  >   }
>  >
>  > +void kvm_park_vcpu(CPUState *cpu)
>  > +{
>  > +    struct KVMParkedVcpu *vcpu;
>  > +
>  > +    trace_kvm_park_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>  > +
>  > +    vcpu = g_malloc0(sizeof(*vcpu));
>  > +    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
>  > +    vcpu->kvm_fd = cpu->kvm_fd;
>  > +    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
>  > +}
>  > +
>  > +int kvm_create_vcpu(CPUState *cpu)
>  > +{
>  > +    unsigned long vcpu_id = kvm_arch_vcpu_id(cpu);
>  > +    KVMState *s = kvm_state;
>  > +    int kvm_fd;
>  > +
>  > +    trace_kvm_create_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>  > +
>  > +    /* check if the KVM vCPU already exist but is parked */
>  > +    kvm_fd = kvm_get_vcpu(s, vcpu_id);
>  > +    if (kvm_fd < 0) {
>  > +        /* vCPU not parked: create a new KVM vCPU */
>  > +        kvm_fd = kvm_vm_ioctl(s, KVM_CREATE_VCPU, vcpu_id);
>  > +        if (kvm_fd < 0) {
>  > +            error_report("KVM_CREATE_VCPU IOCTL failed for vCPU %lu", vcpu_id);
>  > +            return kvm_fd;
>  > +        }
>  > +    }
>  > +
>  > +    cpu->kvm_fd = kvm_fd;
>  > +    cpu->kvm_state = s;
>  > +    cpu->vcpu_dirty = true;
>  > +    cpu->dirty_pages = 0;
>  > +    cpu->throttle_us_per_full = 0;
>  > +
>  > +    return 0;
>  > +}
>  > +
>  >   static int do_kvm_destroy_vcpu(CPUState *cpu)
>  >   {
>  >       KVMState *s = kvm_state;
>  >       long mmap_size;
>  > -    struct KVMParkedVcpu *vcpu = NULL;
>  >       int ret = 0;
>  >
>  > -    trace_kvm_destroy_vcpu();
>  > +    trace_kvm_destroy_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>  >
>  >       ret = kvm_arch_destroy_vcpu(cpu);
>  >       if (ret < 0) {
>  > @@ -347,10 +387,7 @@ static int do_kvm_destroy_vcpu(CPUState *cpu)
>  >           }
>  >       }
>  >
>  > -    vcpu = g_malloc0(sizeof(*vcpu));
>  > -    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
>  > -    vcpu->kvm_fd = cpu->kvm_fd;
>  > -    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
>  > +    kvm_park_vcpu(cpu);
>  >   err:
>  >       return ret;
>  >   }
>  > @@ -371,6 +408,8 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
>  >           if (cpu->vcpu_id == vcpu_id) {
>  >               int kvm_fd;
>  >
>  > +            trace_kvm_get_vcpu(vcpu_id);
>  > +
>  >               QLIST_REMOVE(cpu, node);
>  >               kvm_fd = cpu->kvm_fd;
>  >               g_free(cpu);
>  > @@ -378,7 +417,7 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
>  >           }
>  >       }
>  >
>  > -    return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
>  > +    return -ENOENT;
>  >   }
>  >
>  >   int kvm_init_vcpu(CPUState *cpu, Error **errp)
>  > @@ -389,19 +428,14 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
>  >
>  >       trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>  >
>  > -    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
>  > +    ret = kvm_create_vcpu(cpu);
>  >       if (ret < 0) {
>  > -        error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
>  > +        error_setg_errno(errp, -ret,
>  > +                         "kvm_init_vcpu: kvm_create_vcpu failed (%lu)",
>  >                            kvm_arch_vcpu_id(cpu));
>  
>  If a vcpu hotplug fails due to failure with kvm_create_vcpu ioctl, current
>  behaviour would be to bring down the guest as errp is &error_fatal. Any
>  thoughts on how do we ensure that a failure with kvm_create_vcpu ioctl for
>  hotplugged cpus (only) doesnt bring down the guest and fail gracefully (by
>  reporting error to user on monitor?)?

On ARM, by design, we pre-create all the vCPUs in KVM during Qemu/KVM init.
This satisfies the constraints posed by the ARM architecture: we are not
allowed to meddle with any initialization at the KVM level or the Guest
kernel level after the system has booted. The constraints mainly come from
the GIC and related per-CPU features, which can only be initialized once
during init in KVM. Their presence is then made known to the Guest kernel
only once, during enumeration of the CPUs and related GIC CPU interfaces,
and this cannot be changed later either. Hence, if all of the KVM vCPUs have
been created successfully during init, later hot(un)plug operations won't
hit fatal initialization errors in KVM, as all operations for the
hot(un)plugged vCPUs get handled at the QOM level only.
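
As a rough illustration of that flow, here is a toy model in plain C. It is
not QEMU code: fake_kvm_create_vcpu(), park_vcpu(), unpark_vcpu(),
precreate_all_vcpus() and hotplug_vcpu() are hypothetical names, and the
file descriptors are fake integers rather than results of a real
KVM_CREATE_VCPU ioctl. It only sketches the idea that every vCPU is created
and parked at init, so that a later hotplug just looks up a parked entry.

```c
#include <errno.h>
#include <stdlib.h>

/* Toy stand-in for a parked KVM vCPU: its id plus the fd handed out. */
struct parked_vcpu {
    unsigned long vcpu_id;
    int fd;
    struct parked_vcpu *next;
};

static struct parked_vcpu *parked_list;
static int next_fake_fd = 100;   /* fake fds; no real KVM is involved */

/* Hypothetical stand-in for the KVM_CREATE_VCPU ioctl. */
static int fake_kvm_create_vcpu(unsigned long vcpu_id)
{
    (void)vcpu_id;
    return next_fake_fd++;
}

/* Remember an existing vCPU fd so that it can be reused later. */
static void park_vcpu(unsigned long vcpu_id, int fd)
{
    struct parked_vcpu *p = calloc(1, sizeof(*p));

    if (!p) {
        abort();
    }
    p->vcpu_id = vcpu_id;
    p->fd = fd;
    p->next = parked_list;
    parked_list = p;
}

/* Return the parked fd and drop the entry, or -ENOENT if not parked. */
static int unpark_vcpu(unsigned long vcpu_id)
{
    struct parked_vcpu **pp;

    for (pp = &parked_list; *pp; pp = &(*pp)->next) {
        if ((*pp)->vcpu_id == vcpu_id) {
            struct parked_vcpu *p = *pp;
            int fd = p->fd;

            *pp = p->next;
            free(p);
            return fd;
        }
    }
    return -ENOENT;
}

/* Init time: create every possible vCPU up front and park it.
 * A failure is returned to the caller, which may treat it as fatal. */
static int precreate_all_vcpus(unsigned long max_cpus)
{
    for (unsigned long id = 0; id < max_cpus; id++) {
        int fd = fake_kvm_create_vcpu(id);

        if (fd < 0) {
            return fd;
        }
        park_vcpu(id, fd);
    }
    return 0;
}

/* Hotplug time: only reuse a parked vCPU; no create call is issued. */
static int hotplug_vcpu(unsigned long vcpu_id)
{
    return unpark_vcpu(vcpu_id);
}
```

In this model the only place a create call can fail is
precreate_all_vcpus(), i.e. at init; hotplug_vcpu() can at worst report
that the requested id was never pre-created.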

I feel that if KVM vCPU creation fails at Qemu/KVM init time, then something
is severely wrong with either the inputs or the system. Hence, to keep the
handling simple, I was in favor of aborting the initialization.


But all of the above is ARM arch specific. Do you have anything specific in
mind for why you need graceful handling at init time?

Thanks
Salil.

>  
>  regards,
>  Harsh
>  >           goto err;
>  >       }
>  >
