Re: [Qemu-devel] [PATCH] spapr: manage hotplugged devices while the VM is not started

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>,
	Laurent Vivier <lvivier@redhat.com>,
	Thomas Huth <thuth@redhat.com>,
	qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	David Gibson <david@gibson.dropbear.id.au>,
	Juan Quintela <quintela@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] spapr: manage hotplugged devices while the VM is not started
Date: Wed, 14 Jun 2017 12:59:10 +0100	[thread overview]
Message-ID: <20170614115909.GC2687@work-vm> (raw)
In-Reply-To: <20170614110001.7816209b@nial.brq.redhat.com>

* Igor Mammedov (imammedo@redhat.com) wrote:
> On Tue, 13 Jun 2017 16:42:45 -0500
> Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> 
> > Quoting Igor Mammedov (2017-06-09 03:27:33)
> > > On Thu, 08 Jun 2017 15:00:53 -0500
> > > Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> > >   
> > > > Quoting David Gibson (2017-05-30 23:35:57)  
> > > > > On Tue, May 30, 2017 at 06:04:45PM +0200, Laurent Vivier wrote:    
> > > > > > For QEMU, a hotlugged device is a device added using the HMP/QMP
> > > > > > interface.
> > > > > > For SPAPR, a hotplugged device is a device added while the
> > > > > > machine is running. In this case QEMU doesn't update internal
> > > > > > state but relies on the OS for this part
> > > > > > 
> > > > > > In the case of migration, when we (libvirt) hotplug a device
> > > > > > on the source guest, we (libvirt) generally hotplug the same
> > > > > > device on the destination guest. But in this case, the machine
> > > > > > is stopped (RUN_STATE_INMIGRATE) and QEMU must not expect
> > > > > > the OS will manage it as an hotplugged device as it will
> > > > > > be "imported" by the migration.
> > > > > > 
> > > > > > This patch changes the meaning of "hotplugged" in spapr.c
> > > > > > to manage a QEMU hotplugged device like a "coldplugged" one
> > > > > > when the machine is awaiting an incoming migration.
> > > > > > 
> > > > > > Signed-off-by: Laurent Vivier <lvivier@redhat.com>    
> > > > > 
> > > > > So, I think this is a reasonable concept, at least in terms of
> > > > > cleanliness and not doing unnecessary work.  However, if it's fixing
> > > > > bugs, I suspect that means we still have problems elsewhere.    
> > > > 
> > > > I was hoping a lot of these issues would go away once we default
> > > > the initial/reset DRC states to "coldplugged". I think your pending
> > > > patch:
> > > > 
> > > >   "spapr: Make DRC reset force DRC into known state"
> > > > 
> > > > But I didn't consider the fact that libvirt will be issuing these
> > > > hotplugs *after* reset, so those states would indeed need to
> > > > be fixed up again to reflect boot-time,attached as opposed to
> > > > boot-time,unattached before starting the target.
> > > > 
> > > > So I do think this patch addresses a specific bug that isn't
> > > > obviously fixable elsewhere.
> > > > 
> > > > To me it seems like the only way to avoid doing something like
> > > > what this patch does is to migrate all attached DRCs from the
> > > > source in all cases.
> > > > 
> > > > This would break backward-migration though, unless we switch from
> > > > using subregions for DRCs to explicitly disabling DRC migration
> > > > based on machine type.  
> > > we could leave old machines broken and fix only new machine types,
> > > then it would be easy ot migrate 'additional' DRC state as subsection
> > > only on new for new machines.  
> > 
> > That's an option, but subsections were only really used for backward
> > compatibility. Not sure how much we have to gain from using both.
> If I remember correctly subsections could be/are used for forward compat stuff
> i.e. subsection is generated on source side when .needed callback returns
> true and destinations will just consume whatever data were sent
> without looking at .need callback. So source could generate extra
> DRC subsection when cpu hotplug is enabled for new machine types,
> ex: f816a62daa
> 
> adding David/Juan to CC list to correct me if I'm wrong.

Yes I think that's right; but note that the destination does have to
know about the subsection definition to consume it.
It can't consume a subsection sent by a newer qemu which it doesn't
have a definition of and so can't parse.

Dave

> > > > That approach seems to similar to what x86 does, e.g.
> > > > hw/acpi/ich9.c and hw/acpi/piix.c migrate vmstate_memhp_state
> > > > (corresponding to all DIMMs' slot status) in all cases where
> > > > memory hotplug is enabled. If they were to do this using
> > > > subregions for DIMMs in a transitional state I think similar
> > > > issues would pop up in that code as well.
> > > > 
> > > > Even if we take this route, we still need to explicitly suppress
> > > > hotplug events during INMIGRATE to avoid extra events going on
> > > > the queue. *Unless* we similarly rely purely on the ones sent by
> > > > the source.  
> > > pc/q35 might also lose events if device is hotplugged during migration,
> > > in addition migration would fail anyway since dst qemu
> > > should be launched with all devices that are present on src.
> > > 
> > > ex: consider if one hotplugs DIMM during migration, it creates
> > > RAM region mapped into guest and that region might be transferred
> > > as part of VMState (not sure if it even works)
> > > and considering dst qemu has no idea about hotplugged memory mapping,
> > > the migration would fail on receiving unknown VMState.
> > > 
> > > Hotplug generally doesn't work during migration, so it should be disabled
> > > in a generic way on migration start and re-enabled on target
> > > on migration completion. How about blocking device_add when
> > > INMIGRATE state and unblocking it when switching to runnig on dst?  
> > 
> > Maybe I'm misunderstanding the intent of this patch, but in our own
> > testing we've seen that even for CPUs hotplugged *before* migration
> > starts, libvirt will add them to the dest via device_add instead of
> > via the command-line.
> the way migration currently works, this behavior seems fine to me,
> whether hotplugged CPUs on target side are specified with -device or
> device_add, it shouldn't affect machine behavior. It doesn't in
> case of cpu for x86, where hotplug process state is
> migrated as VMSTATE_CPU_HOTPLUG subsection of piix4_pm/ich9
> device.
> 
> 
> > If the CPUs were all specified via command-line, I don't think these
> > patches would be needed, since the coldplug hooks would be executed
> > without the need to make any special considerations for INMIGRATE.
> I don't think that it's libvirt's problem, user if free to use either
> -device or device_add to add devices on target side.
> 
> We might re-enable writing to hotplugged property (36cccb8c5)
> and ask mgmt to set its value on target but that would not work
> fine for old mgmt tools.
> Perhaps we should migrate DeviceState::hotplugged property state
> as part of every device so that target could fixup device state
> according to its value, but I'm not sure if it is useful.
> 
> 
> > This libvirt commit seems to confirm that the CPUs are added via
> > device_add, and we've seen similar behavior in our testing:
> > 
> > 
> > commit 9eb9106ea51b43102ee51132f69780b2c86ccbca
> > Author: Peter Krempa <pkrempa@redhat.com>
> > Date:   Thu Aug 4 14:36:24 2016 +0200
> > 
> >     qemu: command: Add support for sparse vcpu topologies
> >     
> >     Add support for using the new approach to hotplug vcpus using device_add
> >     during startup of qemu to allow sparse vcpu topologies.
> >     
> >     There are a few limitations imposed by qemu on the supported
> >     configuration:
> >     - vcpu0 needs to be always present and not hotpluggable
> >     - non-hotpluggable cpus need to be ordered at the beginning
> >     - order of the vcpus needs to be unique for every single hotpluggable
> >       entity
> >     
> >     Qemu also doesn't really allow to query the information necessary to
> >     start a VM with the vcpus directly on the commandline. Fortunately they
> >     can be hotplugged during startup.
> >     
> >     The new hotplug code uses the following approach:
> >     - non-hotpluggable vcpus are counted and put to the -smp option
> >     - qemu is started
> >     - qemu is queried for the necessary information
> >     - the configuration is checked
> >     - the hotpluggable vcpus are hotplugged
> >     - vcpus are started
> >     
> >     This patch adds a lot of checking code and enables the support to
> >     specify the individual vcpu element with qemu.
> > 
> > 
> > So I don't think disabling migration during inmigrate is a possible
> > alternative unless we rework how libvirt handles this. The only
> > alternative to this patch that I'm aware of would be to always
> > migrate DRCs when dev->hotplugged == true.
> Currently I'd suggest to look into always migrate DRCs if cpu hotplug
> is enabled even if dev->hotplugged is false (not nice but it might work).
> Consider:
>   SRC1: hotplug CPU1 => CPU1.hotplugged = true
>   DST1: -device CPU1 => CPU1.hotplugged = false
> so in current code relying on CPU1.hotplugged would not work as expected,
> it works by accident because libvirt uses device_add on target
>   DST1: device_add CPU1 => CPU1.hotplugged = true
> 
> If we try to fix it by migrating 'DeviceState::hotplugged' flag,
> we would need CPU/memory/machine specific migration hooks which will
> fix device/machine state as by the time migration stream is processed
> on target side, all devices are already wired up using -device or
> device_add paths (cold/hotplugged paths).
> Approach doesn't seem robust to me.
> 
> May be we should
>  1. make DeviceState:hotpluggable property write-able again
>  2. transmit DeviceState:hotpluggable as part of migration stream
>  3. add generic migration hook which will check if target and
>     source value match, if value differs => fail/abort migration.
>  4. in case values mismatch mgmt will be forced to explicitly
>     provide hotplugged property value on -device/device_add
> That would enforce consistent DeviceState:hotpluggable value
> on target and source.
> We can enforce it only for new machine types so it won't break
> old mgmt tools with old machine types but would force mgmt
> for new machines to use hotplugged property on target
> so QEMU could rely on its value for migration purposes.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

next prev parent reply	other threads:[~2017-06-14 11:59 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-30 16:04 [Qemu-devel] [PATCH] spapr: manage hotplugged devices while the VM is not started Laurent Vivier
2017-05-30 16:35 ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
2017-05-31  4:35 ` [Qemu-devel] " David Gibson
2017-05-31  7:12   ` Laurent Vivier
2017-05-31  9:06   ` Igor Mammedov
2017-06-08 20:00   ` Michael Roth
2017-06-09  8:27     ` Igor Mammedov
2017-06-09 10:53       ` David Gibson
2017-06-13 21:42       ` Michael Roth
2017-06-14  9:00         ` Igor Mammedov
2017-06-14  9:26           ` Juan Quintela
2017-06-14 11:59           ` Dr. David Alan Gilbert [this message]
2017-06-15  0:27           ` Michael Roth
2017-06-16 13:53             ` Igor Mammedov
2017-06-16 14:40               ` David Gibson
2017-06-16 15:58                 ` Igor Mammedov
2017-06-16 16:15                 ` Michael Roth
2017-06-18  9:59                   ` David Gibson
2017-06-18 12:37                     ` Michael Roth
2017-06-18 13:38                       ` David Gibson
2017-06-16 17:19               ` Michael Roth
2017-06-18 10:24                 ` David Gibson
2017-06-18 12:52                   ` Michael Roth
2017-06-19 12:40               ` Marcel Apfelbaum
2017-06-19 13:30                 ` [Qemu-devel] [Qemu-ppc] " Daniel Henrique Barboza
2017-06-09 10:49     ` [Qemu-devel] " David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170614115909.GC2687@work-vm \
    --to=dgilbert@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=imammedo@redhat.com \
    --cc=lvivier@redhat.com \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).