qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Cc: mdroth@linux.vnet.ibm.com, lvivier@redhat.com,
	sursingh@redhat.com, qemu-devel@nongnu.org, groug@kaod.org,
	qemu-ppc@nongnu.org, bharata@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI)
Date: Wed, 5 Jul 2017 21:04:14 +1000	[thread overview]
Message-ID: <20170705110414.GN2180@umbus.fritz.box> (raw)
In-Reply-To: <f89c6897-9555-100a-7252-a705fcbc8ee6@linux.vnet.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 5366 bytes --]

On Tue, Jul 04, 2017 at 06:13:31PM -0300, Daniel Henrique Barboza wrote:
> I just tested this patch set on top of current ppc-for-2.10 branch (which
> contains
> the patches from part V). It applied cleanly but required a couple of
> trivial
> fixes to build probably because it was made on top of an older code base.

Right, I fixed that up locally already, but haven't gotten around to
reposting yet.  You can look at the 'drcVI' branch on my github tree,
if you're interested.

> The trivial migration test worked fine. The libvirt scenario (attaching a
> device on
> target before migration, try to unplug after migration) isn't working as
> expected
> but we have a different result with this series. Instead of silently failing
> to unplug
> with error messages on dmesg, the hot unplug works on QEMU level:

Thanks for testing.  Just to clarify what you're saying here, you
haven't spotted a regression with this series, but there is a case
which was broken and is still broken with slightly different
symptoms.  Yes?

> 
> (qemu) device_del core1
> (qemu)
> (qemu) info cpus
> * CPU #0: nip=0xc0000000000a3e0c thread_id=86162
> (qemu) info hotpluggable-cpus
> Hotpluggable CPUs:
>   type: "host-spapr-cpu-core"
>   vcpus_count: "1"
>   CPUInstance Properties:
>     core-id: "3"
>   type: "host-spapr-cpu-core"
>   vcpus_count: "1"
>   CPUInstance Properties:
>     core-id: "2"
>   type: "host-spapr-cpu-core"
>   vcpus_count: "1"
>   CPUInstance Properties:
>     core-id: "1"
>   type: "host-spapr-cpu-core"
>   vcpus_count: "1"
>   qom_path: "/machine/unattached/device[0]"
>   CPUInstance Properties:
>     core-id: "0"
> (qemu)
> 
> 
> However, any operation on the guest afterwards (tried with lscpu and dmesg)
> seems
> to hung the guest. This is what I got when trying to do a dmesg after the
> hot unplug:

Ouch.  That's bad.  I'll have to look into it.

I have rather a lot on my plate at the moment - if you get a chance to
work out which of the patches in the series causes this behaviour,
that could be handy.

> danielhb@ubuntu1704:~$
> danielhb@ubuntu1704:~$
> danielhb@ubuntu1704:~$ dmesg | tail -n 5
> ^C
> 
> [  362.749693] INFO: task kworker/u8:1:30 blocked for more than 120 seconds.
> [  362.749819]       Not tainted 4.10.0-26-generic #30-Ubuntu
> [  362.749892] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  362.750224] INFO: task kworker/0:3:1887 blocked for more than 120
> seconds.
> [  362.750320]       Not tainted 4.10.0-26-generic #30-Ubuntu
> [  362.750394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  483.568842] INFO: task kworker/u8:1:30 blocked for more than 120 seconds.
> [  483.568997]       Not tainted 4.10.0-26-generic #30-Ubuntu
> [  483.569067] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  483.569364] INFO: task kworker/0:3:1887 blocked for more than 120
> seconds.
> (...)
> 
> 
> I am not sure if it was intended for this scenario to work with this patch
> set already. Figured
> it can serve as a FYI for the next series.

This might be due to a problem I've discussed previously with
Laurent.  libvirt uses device_add on the destination's monitor before
sending the migration stream, rather than starting the destination
with already-hotplugged devices as -device.

To my surprise, there doesn't seem to be a system reset between that
device_add phase and processing the incoming stream, which means DRCs
for those devices might be in the wrong state.

I might need to explicitly trigger a reset of DRC state at incoming
migration time to handle this.

> Given that hotplug/unplug without migration still works and on the
> assumption that
> we'll look more into this libvirt scenario in the next spins, +1.
> 
> 
> Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
> 
> 
> 
> On 06/21/2017 06:18 AM, David Gibson wrote:
> > This sixth set of DRC cleanup patches (to be applied on top of the
> > patches from part V) is a complete rework of DRC state management.  We
> > stop tracking some unnecessary things, and change the basic state
> > representation to a simpler and more robust model.
> > 
> > Most of the patches in this set "break" migration from earlier git
> > snapshots, but not from any released qemu version.  The previous
> > migration stream format had multiple problems, so better to fix them
> > now, before 2.10 is out.
> > 
> > David Gibson (5):
> >    spapr: Remove 'awaiting_allocation' DRC flag
> >    spapr: Refactor spapr_drc_detach()
> >    spapr: Cleanups relating to DRC awaiting_release field
> >    spapr: Consolidate DRC state variables
> >    spapr: Remove sPAPRConfigureConnectorState sub-structure
> > 
> >   hw/ppc/spapr.c             |   9 +-
> >   hw/ppc/spapr_drc.c         | 321 +++++++++++++++++++++------------------------
> >   hw/ppc/spapr_pci.c         |  13 +-
> >   hw/ppc/trace-events        |   3 +-
> >   include/hw/ppc/spapr_drc.h |  53 +++++---
> >   5 files changed, 187 insertions(+), 212 deletions(-)
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

  reply	other threads:[~2017-07-05 11:04 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-21  9:18 [Qemu-devel] [PATCH 0/5] spapr: DRC cleanups (part VI) David Gibson
2017-06-21  9:18 ` [Qemu-devel] [PATCH 1/5] spapr: Remove 'awaiting_allocation' DRC flag David Gibson
2017-06-21  9:18 ` [Qemu-devel] [PATCH 2/5] spapr: Refactor spapr_drc_detach() David Gibson
2017-06-22  9:32   ` Greg Kurz
2017-07-04 14:54     ` David Gibson
2017-06-21  9:18 ` [Qemu-devel] [PATCH 3/5] spapr: Cleanups relating to DRC awaiting_release field David Gibson
2017-06-22 12:51   ` Greg Kurz
2017-06-21  9:18 ` [Qemu-devel] [PATCH 4/5] spapr: Consolidate DRC state variables David Gibson
2017-06-21  9:18 ` [Qemu-devel] [PATCH 5/5] spapr: Remove sPAPRConfigureConnectorState sub-structure David Gibson
2017-07-04 21:13 ` [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI) Daniel Henrique Barboza
2017-07-05 11:04   ` David Gibson [this message]
2017-07-05 21:53     ` Daniel Henrique Barboza
2017-07-05 22:41       ` Michael Roth
2017-07-06  9:46         ` David Gibson
2017-07-06 14:31       ` Daniel Henrique Barboza
2017-07-07  7:14         ` David Gibson
2017-07-07 10:23           ` Daniel Henrique Barboza
2017-07-07 21:24             ` Daniel Henrique Barboza

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170705110414.GN2180@umbus.fritz.box \
    --to=david@gibson.dropbear.id.au \
    --cc=bharata@linux.vnet.ibm.com \
    --cc=danielhb@linux.vnet.ibm.com \
    --cc=groug@kaod.org \
    --cc=lvivier@redhat.com \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=sursingh@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).