qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>,
	lvivier@redhat.com, sursingh@redhat.com, qemu-devel@nongnu.org,
	groug@kaod.org, qemu-ppc@nongnu.org, bharata@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI)
Date: Thu, 6 Jul 2017 19:46:26 +1000	[thread overview]
Message-ID: <20170706094626.GQ2180@umbus.fritz.box> (raw)
In-Reply-To: <149929450093.3492.12871680935157233569@loki>

[-- Attachment #1: Type: text/plain, Size: 3828 bytes --]

On Wed, Jul 05, 2017 at 05:41:40PM -0500, Michael Roth wrote:
> Quoting Daniel Henrique Barboza (2017-07-05 16:53:57)
> > 
> > 
> > On 07/05/2017 08:04 AM, David Gibson wrote:
> > > On Tue, Jul 04, 2017 at 06:13:31PM -0300, Daniel Henrique Barboza wrote:
> > >> I just tested this patch set on top of current ppc-for-2.10 branch (which
> > >> contains
> > >> the patches from part V). It applied cleanly but required a couple of
> > >> trivial
> > >> fixes to build probably because it was made on top of an older code base.
> > > Right, I fixed that up locally already, but haven't gotten around to
> > > reposting yet.  You can look at the 'drcVI' branch on my github tree,
> > > if you're interested.
> > >
> > >> The trivial migration test worked fine. The libvirt scenario (attaching a
> > >> device on
> > >> target before migration, try to unplug after migration) isn't working as
> > >> expected
> > >> but we have a different result with this series. Instead of silently failing
> > >> to unplug
> > >> with error messages on dmesg, the hot unplug works on QEMU level:
> > > Thanks for testing.  Just to clarify what you're saying here, you
> > > haven't spotted a regression with this series, but there is a case
> > > which was broken and is still broken with slightly different
> > > symptoms.  Yes?
> > In my opinion, yes. It is debatable if the patch series made it worse 
> > because
> > the guest is now misbehaving, but the feature per se wasn't working
> > prior to it.
> 
> I think it's the removal of awaiting_allocation.

So.. yes, in the sense that I think we've rediscovered the problem
which prompter the awaiting_allocation flag in the first place.  I
still don't think awaiting_allocation is a sensible fix for.. well,
anything.

So we need to find the right fix for this problem.

> We know currently that
> in the libvirt scenario the DRC is exposed in an pre-hotplug state of
> ISOLATED/UNALLOCATED.

Right, need to understand exactly how we get there.

> In that state, spapr_drc_detach() completes
> immediately because from the perspective of QEMU it apparently has not
> been exposed to the guest yet, or the guest has already quiesced it on
> it's end.

Right.

> awaiting_allocation guarded against this, as it's intention was to make
> sure that resource was put into an ALLOCATED state prior to getting moved
> back into an UNALLOCATED state,

Right, but that seems broken to me.  It means if a cpu is hotplug when
the guest isn't paying attention (e.g. early boot, guest is
halted/crashed), then you can't remove it until you either boot an OS
that allocates then releases it, or you reset (and then release).

> so we didn't immediately unplug a CPU
> while the hotplug was in progress.

But in what sense is the hotplug "in progress"?  The guest has been
given a notification, but it hasn't touched the device yet.  Moving
the state away from UNALLOCATED is never guaranteed to work,
regardless of what notifications have been received.

> So in your scenario the CPU is just mysteriously vanishing out from
> under the guest,

Because the DRC is ending up in UNUSABLE state on the destination when
it should be in some other state (and was on the source)?  Yeah, that
sounds plausible.

> which probably explains the hang. The fix for this
> particular scenario is to fix the initial DRC on the target. I think
> we have a plan for that so I wouldn't consider this a regression
> necessarily.

Yeah.. I'm not quite sure how it's ending up wrong; why isn't the
incoming migration state setting it correctly?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

  reply	other threads:[~2017-07-06  9:49 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-21  9:18 [Qemu-devel] [PATCH 0/5] spapr: DRC cleanups (part VI) David Gibson
2017-06-21  9:18 ` [Qemu-devel] [PATCH 1/5] spapr: Remove 'awaiting_allocation' DRC flag David Gibson
2017-06-21  9:18 ` [Qemu-devel] [PATCH 2/5] spapr: Refactor spapr_drc_detach() David Gibson
2017-06-22  9:32   ` Greg Kurz
2017-07-04 14:54     ` David Gibson
2017-06-21  9:18 ` [Qemu-devel] [PATCH 3/5] spapr: Cleanups relating to DRC awaiting_release field David Gibson
2017-06-22 12:51   ` Greg Kurz
2017-06-21  9:18 ` [Qemu-devel] [PATCH 4/5] spapr: Consolidate DRC state variables David Gibson
2017-06-21  9:18 ` [Qemu-devel] [PATCH 5/5] spapr: Remove sPAPRConfigureConnectorState sub-structure David Gibson
2017-07-04 21:13 ` [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI) Daniel Henrique Barboza
2017-07-05 11:04   ` David Gibson
2017-07-05 21:53     ` Daniel Henrique Barboza
2017-07-05 22:41       ` Michael Roth
2017-07-06  9:46         ` David Gibson [this message]
2017-07-06 14:31       ` Daniel Henrique Barboza
2017-07-07  7:14         ` David Gibson
2017-07-07 10:23           ` Daniel Henrique Barboza
2017-07-07 21:24             ` Daniel Henrique Barboza

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170706094626.GQ2180@umbus.fritz.box \
    --to=david@gibson.dropbear.id.au \
    --cc=bharata@linux.vnet.ibm.com \
    --cc=danielhb@linux.vnet.ibm.com \
    --cc=groug@kaod.org \
    --cc=lvivier@redhat.com \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=sursingh@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).