From: David Gibson <david@gibson.dropbear.id.au>
To: Daniel Henrique Barboza <danielhb413@gmail.com>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org, groug@kaod.org
Subject: Re: [PATCH v3 5/7] spapr_drc.c: introduce unplug_timeout_timer
Date: Wed, 17 Feb 2021 12:14:09 +1100
Message-ID: <YCxt4VVYe5FBQX42@yekko.fritz.box>
In-Reply-To: <20210211225246.17315-6-danielhb413@gmail.com>
On Thu, Feb 11, 2021 at 07:52:44PM -0300, Daniel Henrique Barboza wrote:
> The LoPAR spec provides no way for the guest kernel to report failure of
> hotplug/hotunplug events. This wouldn't be bad if those operations were
> guaranteed to always succeed, but that's far from reality.
>
> What ends up happening is that, in the case of a failed hotunplug,
> regardless of whether it was a QEMU error or guest misbehavior, the pSeries
> machine retains the unplug state of the device in the running guest.
> This state is cleaned up at machine reset, where it is assumed that this state
> represents a device that is pending unplug, and the device is hotunplugged
> from the board. Until the reset occurs, any hotunplug operation on the same
> device is forbidden because there is a pending unplug state.
>
> This behavior has at least one undesirable side effect. A long-standing pending
> unplug state is, more often than not, the result of a hotunplug error. The user
> has to deal with it, since retrying the unplug is not allowed, and then
> at machine reset we remove the device from the guest. This means that
> we're failing the user twice - failed to hotunplug when asked, then hotunplugged
> without notice.
>
> Solutions to this problem range from trying to predict when the hotunplug will
> fail and forbidding the operation from the QEMU layer, to opening up the IRQ queue
> to allow for multiple hotunplug attempts, to telling the users to 'reboot the
> machine if something goes wrong'. The first solution is flawed because we can't
> fully predict guest behavior from QEMU, the second solution is a trial and error
> remediation that counts on a hope that the unplug will eventually succeed, and the
> third is ... well.
>
> This patch introduces a crude, but effective solution to hotunplug errors in
> the pSeries machine. Each unplug request gets a timeout. If the timeout
> expires, we'll clean up the hotunplug state from the machine.
> During the timeout period, any unplug operations on the same device will still
> be blocked. After that, we'll assume that the guest failed the operation, and
> allow the user to try again. If the timeout is too short we'll prevent legitimate
> hotunplugs from completing, so we'll need to overestimate the regular time
> an unplug operation takes to succeed to account for that.
>
> The true solution for the hotunplug errors in the pSeries machines is a PAPR change
> to allow the guest to warn the platform about it. For now, the work done in this
> timeout design can be used for the new PAPR 'abort hcall' in the future, given that
> for both cases we'll need code to clean up the existing unplug states of the DRCs.
>
> At this moment we're adding the basic wiring of the timer into the DRC. The next
> patch will use the timer to time out failed CPU hotunplugs.
>
> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> ---
> hw/ppc/spapr_drc.c | 36 ++++++++++++++++++++++++++++++++++++
> include/hw/ppc/spapr_drc.h | 2 ++
> 2 files changed, 38 insertions(+)
>
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index 67041fb212..c88bb524c5 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -57,6 +57,8 @@ static void spapr_drc_release(SpaprDrc *drc)
> drck->release(drc->dev);
>
> drc->unplug_requested = false;
> + timer_del(drc->unplug_timeout_timer);
> +
> g_free(drc->fdt);
> drc->fdt = NULL;
> drc->fdt_start_offset = 0;
> @@ -453,6 +455,24 @@ static const VMStateDescription vmstate_spapr_drc_unplug_requested = {
> }
> };
>
> +static bool spapr_drc_unplug_timeout_timer_needed(void *opaque)
> +{
> + SpaprDrc *drc = opaque;
> +
> + return timer_pending(drc->unplug_timeout_timer);
> +}
> +
> +static const VMStateDescription vmstate_spapr_drc_unplug_timeout_timer = {
> + .name = "DRC unplug timeout timer",
> + .version_id = 1,
> + .minimum_version_id = 1,
> + .needed = spapr_drc_unplug_timeout_timer_needed,
> + .fields = (VMStateField[]) {
> + VMSTATE_TIMER_PTR(unplug_timeout_timer, SpaprDrc),
> + VMSTATE_END_OF_LIST()
> + }
> +};
I think we can probably avoid adding extra data to the migration
stream. The exact length of the timeout isn't super important, so
long as it's "long enough", so I think it's acceptable if we restart
the timeout period after a migration. That can be accomplished with
a post-load hook that just restarts the timer at the initial duration
if the DRC is in the unplug_requested state.
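
To make the idea concrete, here is a minimal stand-alone sketch of the
post-load approach. It is a toy model, not QEMU code: ModelTimer,
ModelDrc, the model_* helpers and the 15000 ms duration are all
hypothetical names and values made up for illustration. In QEMU proper
the hook would presumably hang off the .post_load member of
vmstate_spapr_drc and re-arm the real timer with timer_mod(), but the
details are up to the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for QEMUTimer and SpaprDrc; not the real types. */
typedef struct ModelTimer {
    bool pending;       /* true while the timer is armed */
    int64_t expire_ms;  /* absolute expiry time, in milliseconds */
} ModelTimer;

typedef struct ModelDrc {
    bool unplug_requested;
    ModelTimer unplug_timeout_timer;
} ModelDrc;

/* Assumed initial timeout; the real value is a separate design decision. */
#define MODEL_UNPLUG_TIMEOUT_MS 15000

/* Arm (or re-arm) the timer to fire at the given absolute time. */
static void model_timer_mod(ModelTimer *t, int64_t expire_ms)
{
    t->pending = true;
    t->expire_ms = expire_ms;
}

/* Mirrors drc_unplug_timeout_cb() from the patch: drop the pending state. */
static void model_unplug_timeout_cb(ModelDrc *drc)
{
    if (drc->unplug_requested) {
        drc->unplug_requested = false;
    }
}

/*
 * Post-load hook: the timer itself is not migrated. If the incoming
 * state says an unplug is pending, restart the timeout from scratch
 * at the initial duration on the destination.
 */
static int model_drc_post_load(ModelDrc *drc, int64_t now_ms)
{
    if (drc->unplug_requested) {
        model_timer_mod(&drc->unplug_timeout_timer,
                        now_ms + MODEL_UNPLUG_TIMEOUT_MS);
    }
    return 0;
}

/* Advance the model clock and fire the timer if it has expired. */
static void model_clock_advance(ModelDrc *drc, int64_t now_ms)
{
    ModelTimer *t = &drc->unplug_timeout_timer;

    if (t->pending && now_ms >= t->expire_ms) {
        t->pending = false;
        model_unplug_timeout_cb(drc);
    }
}
```

The cost is that a guest migrated mid-unplug gets up to twice the
nominal timeout, which seems fine given that the duration only needs
to be "long enough". In exchange the subsection and its .needed
callback go away entirely.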
> static bool spapr_drc_needed(void *opaque)
> {
> SpaprDrc *drc = opaque;
> @@ -486,10 +506,20 @@ static const VMStateDescription vmstate_spapr_drc = {
> },
> .subsections = (const VMStateDescription * []) {
> &vmstate_spapr_drc_unplug_requested,
> + &vmstate_spapr_drc_unplug_timeout_timer,
> NULL
> }
> };
>
> +static void drc_unplug_timeout_cb(void *opaque)
> +{
> + SpaprDrc *drc = opaque;
> +
> + if (drc->unplug_requested) {
> + drc->unplug_requested = false;
> + }
> +}
> +
> static void drc_realize(DeviceState *d, Error **errp)
> {
> SpaprDrc *drc = SPAPR_DR_CONNECTOR(d);
> @@ -512,6 +542,11 @@ static void drc_realize(DeviceState *d, Error **errp)
> object_property_add_alias(root_container, link_name,
> drc->owner, child_name);
> g_free(link_name);
> +
> + drc->unplug_timeout_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
> + drc_unplug_timeout_cb,
> + drc);
> +
> vmstate_register(VMSTATE_IF(drc), spapr_drc_index(drc), &vmstate_spapr_drc,
> drc);
> trace_spapr_drc_realize_complete(spapr_drc_index(drc));
> @@ -529,6 +564,7 @@ static void drc_unrealize(DeviceState *d)
> name = g_strdup_printf("%x", spapr_drc_index(drc));
> object_property_del(root_container, name);
> g_free(name);
> + timer_free(drc->unplug_timeout_timer);
> }
>
> SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type,
> diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
> index 02a63b3666..b2e6222d09 100644
> --- a/include/hw/ppc/spapr_drc.h
> +++ b/include/hw/ppc/spapr_drc.h
> @@ -187,6 +187,8 @@ typedef struct SpaprDrc {
> bool unplug_requested;
> void *fdt;
> int fdt_start_offset;
> +
> + QEMUTimer *unplug_timeout_timer;
> } SpaprDrc;
>
> struct SpaprMachineState;
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson