From: David Gibson <david@gibson.dropbear.id.au>
To: Daniel Henrique Barboza <danielhb413@gmail.com>
Cc: Xujun Ma <xuma@redhat.com>,
qemu-ppc@nongnu.org, qemu-devel@nongnu.org, groug@kaod.org
Subject: Re: [PATCH v3 6/7] spapr_drc.c: add hotunplug timeout for CPUs
Date: Wed, 17 Feb 2021 12:23:30 +1100
Message-ID: <YCxwEkS7EGsLhdqI@yekko.fritz.box>
In-Reply-To: <20210211225246.17315-7-danielhb413@gmail.com>
On Thu, Feb 11, 2021 at 07:52:45PM -0300, Daniel Henrique Barboza wrote:
> There is a reliable way to make a CPU hotunplug fail in the pseries
> machine. Hotplug a CPU A, then offline all other CPUs inside the guest
> but A. When trying to hotunplug A the guest kernel will refuse to do
> it, because A is now the last online CPU of the guest. PAPR has no
> 'error callback' in this situation to report back to the platform,
> so the guest kernel will silently deny the unplug and QEMU will never
> know what happened. The unplug pending state of A will remain until
> the guest is shut down or rebooted.
>
> Previous attempts at fixing it (see [1] and [2]) were aimed at trying to
> mitigate the effects of the problem. In [1] we were trying to guess which
> guest CPUs were online to forbid hotunplug of the last online CPU in the QEMU
> layer, avoiding the scenario described above because QEMU is now failing
> on behalf of the guest. This is not robust because the last online CPU of
> the guest can change while we're in the middle of the unplug process, and
> our initial assumptions are now invalid. In [2] we were accepting that our
> unplug process is uncertain and the user should be allowed to spam the IRQ
> hotunplug queue of the guest in case the CPU hotunplug fails.
>
> This patch presents another alternative, using the timeout infrastructure
> introduced in the previous patch. CPU hotunplugs in the pSeries machine will
> now time out after 15 seconds. This is a long time for a single CPU unplug
> to occur, regardless of guest load - although the user is *strongly* encouraged
> to *not* hotunplug devices from a guest under high load - and we can be sure
> that something went wrong if it takes longer than that for the guest to release
> the CPU (the same can't be said about memory hotunplug - more on that in the
> next patch).
>
> Timing out the unplug operation will reset the unplug state of the CPU and
> allow the user to try it again, regardless of the error situation that
> prevented the hotunplug from occurring. Of all the not so pretty fixes/mitigations
> for CPU hotunplug errors in pSeries, timing out the operation is an admission
> that we have no control over the process, and must assume the worst case if
> the operation doesn't succeed in a sensible time frame.
>
> [1] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg03353.html
> [2] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg04400.html
>
> Reported-by: Xujun Ma <xuma@redhat.com>
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1911414
> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> hw/ppc/spapr.c | 4 ++++
> hw/ppc/spapr_drc.c | 17 +++++++++++++++++
> include/hw/ppc/spapr_drc.h | 3 +++
> 3 files changed, 24 insertions(+)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index b066df68cb..ecce8abf14 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3724,6 +3724,10 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
>      if (!spapr_drc_unplug_requested(drc)) {
>          spapr_drc_unplug_request(drc);
>          spapr_hotplug_req_remove_by_index(drc);
> +    } else {
> +        error_setg(errp, "core-id %d unplug is still pending, %d seconds "
> +                   "timeout remaining",
> +                   cc->core_id, spapr_drc_unplug_timeout_remaining_sec(drc));
Reporting this information is a nice touch.
>      }
>  }
>
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index c88bb524c5..c143bfb6d3 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -398,6 +398,12 @@ void spapr_drc_unplug_request(SpaprDrc *drc)
>
>      drc->unplug_requested = true;
>
> +    if (drck->unplug_timeout_seconds != 0) {
> +        timer_mod(drc->unplug_timeout_timer,
> +                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
> +                  drck->unplug_timeout_seconds * 1000);
> +    }
> +
>      if (drc->state != drck->empty_state) {
>          trace_spapr_drc_awaiting_quiesce(spapr_drc_index(drc));
>          return;
> @@ -406,6 +412,16 @@ void spapr_drc_unplug_request(SpaprDrc *drc)
>      spapr_drc_release(drc);
>  }
>
> +int spapr_drc_unplug_timeout_remaining_sec(SpaprDrc *drc)
> +{
> +    if (drc->unplug_requested && timer_pending(drc->unplug_timeout_timer)) {
> +        return (qemu_timeout_ns_to_ms(drc->unplug_timeout_timer->expire_time) -
> +                qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL)) / 1000;
Hmm. Reaching into the timer's internal fields isn't ideal. I wonder
if we should add a helper in the timer code for reporting this information.
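
As a rough illustration of the idea, here is a minimal, self-contained
sketch of what such a helper could look like. The SketchTimer type, its
fields, and the sketch_* function names are hypothetical stand-ins made
up for this example; they are not QEMU's actual timer API, and the
current time is passed in explicitly rather than read from a clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NS_PER_MS 1000000LL

/* Hypothetical stand-in for a timer object; not QEMU's real structure. */
typedef struct SketchTimer {
    int64_t expire_time_ns;   /* absolute deadline, in nanoseconds */
    bool pending;             /* true while the timer is armed */
} SketchTimer;

/*
 * Hypothetical helper owned by the timer code: milliseconds left until
 * the deadline, or 0 if the timer is not armed or has already expired.
 * Callers never need to look at expire_time_ns themselves.
 */
static int64_t sketch_timer_remaining_ms(const SketchTimer *t, int64_t now_ns)
{
    assert(t != NULL);

    if (!t->pending || t->expire_time_ns <= now_ns) {
        return 0;
    }
    return (t->expire_time_ns - now_ns) / SKETCH_NS_PER_MS;
}

/*
 * How a DRC-side caller might use it: report whole seconds remaining,
 * and 0 when no unplug request is outstanding.
 */
static int sketch_unplug_timeout_remaining_sec(bool unplug_requested,
                                               const SketchTimer *t,
                                               int64_t now_ns)
{
    if (!unplug_requested) {
        return 0;
    }
    return (int)(sketch_timer_remaining_ms(t, now_ns) / 1000);
}
```

With something along these lines living next to the timer implementation,
the DRC code would only ask "how long until this fires?" and the unit
conversion would stay in one place.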
> +    }
> +
> +    return 0;
> +}
> +
>  bool spapr_drc_reset(SpaprDrc *drc)
>  {
>      SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> @@ -706,6 +722,7 @@ static void spapr_drc_cpu_class_init(ObjectClass *k, void *data)
>      drck->drc_name_prefix = "CPU ";
>      drck->release = spapr_core_release;
>      drck->dt_populate = spapr_core_dt_populate;
> +    drck->unplug_timeout_seconds = 15;
>  }
>
> static void spapr_drc_pci_class_init(ObjectClass *k, void *data)
> diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
> index b2e6222d09..26599c385a 100644
> --- a/include/hw/ppc/spapr_drc.h
> +++ b/include/hw/ppc/spapr_drc.h
> @@ -211,6 +211,8 @@ typedef struct SpaprDrcClass {
>
>      int (*dt_populate)(SpaprDrc *drc, struct SpaprMachineState *spapr,
>                         void *fdt, int *fdt_start_offset, Error **errp);
> +
> +    int unplug_timeout_seconds;
>  } SpaprDrcClass;
>
> typedef struct SpaprDrcPhysical {
> @@ -246,6 +248,7 @@ int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask);
>   */
>  void spapr_drc_attach(SpaprDrc *drc, DeviceState *d);
>  void spapr_drc_unplug_request(SpaprDrc *drc);
> +int spapr_drc_unplug_timeout_remaining_sec(SpaprDrc *drc);
>
>  /*
>   * Reset all DRCs, causing pending hot-plug/unplug requests to complete.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson