From: David Hildenbrand <david@redhat.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-ppc@nongnu.org,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Richard Henderson <rth@twiddle.net>,
	Eduardo Habkost <ehabkost@redhat.com>,
	Eric Blake <eblake@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	Pankaj Gupta <pagupta@redhat.com>,
	Luiz Capitulino <lcapitul@redhat.com>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
	David Gibson <david@gibson.dropbear.id.au>,
	Alexander Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] [PATCH v4 18/24] qdev: hotplug: provide do_unplug handler
Date: Tue, 2 Oct 2018 11:49:09 +0200
Message-ID: <518ce438-7d6b-6f36-8bac-d834b21fb8bc@redhat.com>
In-Reply-To: <20181001152443.2cca5c5e@redhat.com>

On 01/10/2018 15:24, Igor Mammedov wrote:
> On Fri, 28 Sep 2018 14:21:33 +0200
> David Hildenbrand <david@redhat.com> wrote:
> 
>> On 27/09/2018 15:01, Igor Mammedov wrote:
>>> On Wed, 26 Sep 2018 11:42:13 +0200
>>> David Hildenbrand <david@redhat.com> wrote:
>>>   
>>>> The unplug and unplug_request handlers are special: They are not
>>>> executed when unrealizing a device, but rather trigger the removal of a
>>>> device from device_del() via object_unparent() - to effectively
>>>> unrealize a device.
>>>>
>>>> If such a device has a child bus and another device attached to
>>>> that bus (e.g. how virtio devices are created with their proxy device),
>>>> we will not get a call to the unplug handler. As we want to support
>>>> hotplug handlers (and especially also some unplug logic to undo resource
>>>> assignment) for such devices, we cannot simply call the unplug handler
>>>> when unrealizing - it has a different semantic ("trigger removal").
>>>>
>>>> To handle this scenario, we need a do_unplug handler, that will be
>>>> executed for all devices with a hotplug handler.  
>>> could you clarify what would be call flow for unplug in this case
>>> starting from 'device_del'?  
>>
>> Let's work it through for virtio-pmem:
>>
>> qemu-system-x86_64 -machine pc -m 8G,maxmem=20G \
>>   [...] \
>>   -object memory-backend-file,id=mem1,share,mem-path=/dev/zero,size=4G \
>>   -device virtio-pmem-pci,id=vp1,memdev=mem1 -monitor stdio
>>
>> info qtree gives us:
>>
>>    bus: pci.0
>>       type PCI
>>       dev: virtio-pmem-pci, id "vp1"
>> 	[...]
>>         bus: virtio-bus
>>           type virtio-pci-bus
>>           dev: virtio-pmem, id ""
>>             memaddr = 9663676416 (0x240000000)
>>             memdev = "/objects/mem1"
>> 	    [...]
>>
>> "device_del vp1":
>>
>> qmp_device_del(vp1)->qdev_unplug(vp1)->hotplug_handler_unplug_request(vp1)
>>
>> piix4_device_unplug_request_cb(vp1)->acpi_pcihp_device_unplug_cb(vp1)
>>
>> -> Guest has to process the request and respond  
>>
>> acpi_pcihp_eject_slot(vp1)->object_unparent(vp1)
> that's one of the possible call flows, unplug could also originate
> from shpc or native pci-e hot-plug.
> PCI unplug hasn't ever been factored out from old PCI device/bus code,
> so PCIDevice::unrealize takes care of parent resource teardown.
> (well, there wasn't any reason to factor it out till we started
> talking about hybrid devices).
> We probably should do the same refactoring like it was done for
> pc-dimm/cpu unplug
> (see qdev_get_hotplug_handler()+hotplug_handler_unplug() usage)
> 
>> Now, this triggers the unplug of the device hierarchy:
>>
>> object_unparent(vp1)->device_unparent(vp1)->device_set_realized(vp1, 0)
>>
>> ->bus_set_realized(virtio-bus, 0)->device_set_realized(virtio-pmem, 0)  
>>
>> This is the place where this hook comes into play:
>>
>> ->hotplug_handler_do_unplug(virtio-pmem)->machine  
>> handler->virtio_pmem_do_unplug(virtio-pmem)
>>
>> Followed by object_unparent(virtio-bus)->bus_unparent(virtio-bus)
>> Followed by object_unparent(virtio-pmem)->device_unparent(virtio-pmem)
>>
>>
>> At this place, the hierarchy is gone. Hotplug succeeded and the
>> virtio-pmem device (memory device) has been properly unplugged.
> I'm concerned that both plug and unplug flows are implicit
> and handled as if it were separate devices without enforcing
> a particular ordering of (un)plug handlers.
> It would work right now but it looks rather fragile to me.

In my ideal world, the plug+unplug handlers would only perform checks
and essentially trigger an object_unparent() (either directly or via
some guest action).

Inside object_unparent(), the call flow of unrealize steps is defined.
By moving the "real unplug" part into "do_unplug" and therefore
essentially calling it when unrealizing, we could generalize this for
all unplug handlers.

I think the order of realization, and therefore the order of hotplug
handler calls, is strictly defined already. The same applies to
unrealization if we factor the essential parts out into e.g. "do_unplug".
That order is strictly encoded in device_set_realized() and
bus_set_realized().
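
To illustrate the ordering argument, here is a minimal toy model in
plain C. It is not QEMU code: ToyDevice, toy_unrealize() and
toy_do_unplug() are hypothetical names, and where exactly do_unplug runs
relative to the device's own unrealize step is an assumption of this
sketch. It only shows that a recursive unrealize fixes the order in
which such a hook would fire.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Toy model, NOT QEMU code: a device with at most one child device
 * sitting on its (single) child bus. */
typedef struct ToyDevice ToyDevice;
struct ToyDevice {
    const char *name;
    ToyDevice *child;     /* device on this device's child bus, or NULL */
    int has_do_unplug;    /* nonzero if a handler wants the do_unplug hook */
    int realized;
};

/* Records the order of calls so it can be inspected afterwards. */
static char toy_log[256];

static void toy_log_append(const char *event, const char *name)
{
    size_t used = strlen(toy_log);
    snprintf(toy_log + used, sizeof(toy_log) - used, "%s(%s);", event, name);
}

/* Hypothetical stand-in for the proposed do_unplug hook. */
static void toy_do_unplug(ToyDevice *dev)
{
    toy_log_append("do_unplug", dev->name);
}

/* Children are unrealized first; then the hook runs for this device
 * (if it has one); then the device itself is unrealized. */
static void toy_unrealize(ToyDevice *dev)
{
    assert(dev != NULL);
    if (!dev->realized) {
        return;
    }
    if (dev->child) {
        toy_unrealize(dev->child);
    }
    if (dev->has_do_unplug) {
        toy_do_unplug(dev);
    }
    toy_log_append("unrealize", dev->name);
    dev->realized = 0;
}
```

With a "virtio-pmem-pci" toy device whose child "virtio-pmem" has the
hook set, calling toy_unrealize() on the parent always runs the hook for
the child before the parent is unrealized, regardless of who triggered
the removal.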

> 
> If I remember right, the suggested and partially implemented idea
> in one of your previous series was to override default hotplug
> handler with a machine one for plugged in device [1][2].
> However impl. wasn't exactly what I've suggested since it matches
> all memory-devices.
> 
> 1) qdev: let machine hotplug handler to override bus hotplug handler
> 2) pc: route all memory devices through  the machine hotplug handler
> 
> So let's reiterate: we have TYPE_VIRTIO_PMEM and TYPE_VIRTIO_PMEM_PCI;
> the former implements the TYPE_MEMORY_DEVICE interface and the latter is
> a wrapper PCI/whatnot device shim.
> So when you plug that composite device you'd get 2 independent
> plug hooks called, which makes it an unreliable/broken design.

Can you elaborate why this is broken? I don't consider the
realize/unrealize order broken, and that is where we plug into. But yes,
we logically plug a device hierarchy and therefore get separate
hotplug handler calls.

> 
> My next question would be why TYPE_VIRTIO_PMEM_PCI can't implement 
> TYPE_MEMORY_DEVICE interface and TYPE_VIRTIO_PMEM be a simple VIRTIO
> device without any hotplug hooks (so shim device would proxy all
> memory-device logic to its child)?
> /huh, then you don't need get_device_id() as well/

I had the same idea while going through different options. Then we would
have to forward all calls directly to the child. We cannot reuse
TYPE_MEMORY_DEVICE, so we would either need a new interface or define
the functions we want manually for each such device.
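
A rough sketch of what "forwarding all calls to the child" means, again
as a toy model in plain C rather than QEMU/QOM code. ToyMemDev,
ToyMemDevOps and the proxy_*/leaf_* functions are hypothetical names
made up for this illustration; the real interface has more callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyMemDev ToyMemDev;

/* Toy "memory device" interface: a table of function pointers. */
typedef struct {
    uint64_t (*get_addr)(const ToyMemDev *md);
    uint64_t (*get_size)(const ToyMemDev *md);
} ToyMemDevOps;

struct ToyMemDev {
    const ToyMemDevOps *ops;
    ToyMemDev *child;     /* only set for the proxy device */
    uint64_t addr;        /* only meaningful for the leaf device */
    uint64_t size;        /* only meaningful for the leaf device */
};

/* The leaf (think: virtio-pmem) answers from its own state. */
static uint64_t leaf_get_addr(const ToyMemDev *md) { return md->addr; }
static uint64_t leaf_get_size(const ToyMemDev *md) { return md->size; }

/* The proxy (think: virtio-pmem-pci) has no state of its own and
 * passes every call through to its child. */
static uint64_t proxy_get_addr(const ToyMemDev *md)
{
    assert(md->child != NULL);
    return md->child->ops->get_addr(md->child);
}

static uint64_t proxy_get_size(const ToyMemDev *md)
{
    assert(md->child != NULL);
    return md->child->ops->get_size(md->child);
}

static const ToyMemDevOps leaf_ops  = { leaf_get_addr,  leaf_get_size  };
static const ToyMemDevOps proxy_ops = { proxy_get_addr, proxy_get_size };
```

Every callback of the interface needs such a pass-through in each proxy
type, which is the per-device boilerplate I am referring to.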

> 
> That way using [2] and [1 - modulo it should match only concrete type]
> machine would be able to override hotplug handlers for TYPE_VIRTIO_PMEM_PCI
> and explicitly call machine + pci hotplug handlers in necessary order.
> 
> flow would look like:
>   [acpi|shpc|native pci-e eject]->
>        hotplug_ctrl = qdev_get_hotplug_handler(dev);
>        hotplug_handler_unplug(hotplug_ctrl, dev, &local_err); ->
>             machine_unplug()
>                machine_virtio_pci_pmem_cb():
>                   // we know that this device has 2-stage hotplug handlers,
>                   // so we can arrange hotplug sequence in necessary order
>                   hotplug_ctrl2 = qdev_get_bus_hotplug_handler(dev);
>
>                   // then do unplug in whatever order is correct;
>                   // I'd assume tear down/stop the PCI device first, flushing
>                   // virtio command queues, and then unplug the memory itself.
>                   hotplug_handler_unplug(hotplug_ctrl2, dev, &local_err);
>                   memory_device_unplug()
> 
> Similar logic applies to device_add/device_del paths, with a difference that
> origin point would be monitor/qmp.

Let's see. User calls device_del(). That triggers an unplug_request. For
virtio-pmem, there is nothing to do.

The eject hook is then triggered by the guest. For now we do an
object_unparent() there. This would now be wrong. We would have to call
a proper hotplug handler chain (I guess that's the refactoring you
mentioned above).
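
As I understand the proposal, the eject path would look roughly like the
following toy model (plain C, not QEMU code). ChainDev, chain_eject()
and the chain_* handlers are hypothetical names, and the "PCI first,
then memory" order is simply the one you assumed above, not something I
have verified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct ChainDev ChainDev;
typedef void (*ChainUnplugFn)(ChainDev *dev);

struct ChainDev {
    const char *name;
    ChainUnplugFn machine_unplug;  /* machine-level override, may be NULL */
    ChainUnplugFn bus_unplug;      /* bus-level handler, may be NULL */
};

/* Records the order of steps so it can be inspected afterwards. */
static char chain_log[128];

static void chain_note(const char *step)
{
    size_t used = strlen(chain_log);
    snprintf(chain_log + used, sizeof(chain_log) - used, "%s;", step);
}

static void chain_pci_unplug(ChainDev *dev)
{
    (void)dev;
    chain_note("pci_unplug");
}

static void chain_memory_unplug(ChainDev *dev)
{
    (void)dev;
    chain_note("memory_unplug");
}

/* Machine handler for the composite device: it knows about both stages
 * and calls them in one explicit order. */
static void chain_machine_pmem_unplug(ChainDev *dev)
{
    if (dev->bus_unplug) {
        dev->bus_unplug(dev);      /* stage 1: tear down the PCI side */
    }
    chain_memory_unplug(dev);      /* stage 2: undo the memory assignment */
}

/* Eject path: run the handler chain first, only then drop the device. */
static void chain_eject(ChainDev *dev)
{
    assert(dev != NULL);
    ChainUnplugFn handler =
        dev->machine_unplug ? dev->machine_unplug : dev->bus_unplug;
    if (handler) {
        handler(dev);
    }
    chain_note("unparent");
}
```

A device without a machine override would fall back to its bus handler,
so plain PCI devices would keep their current behavior in this model.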

> 
> Point is to have a single explicit callback chain that applies to a concrete
> device type. That way if we ever change the ordering of plug callbacks
> in qdev core, the callback sequence expected for a device would still
> remain in place, ensuring that the device is (un)plugged as expected.

I haven't tested yet if this will work, but I can give it a try. I
learned that in QEMU things often seem easier than they actually are :)

> 
> Also it should be easier to trace for a reader than 2 disjoint callbacks of
> a composite device (which are only known to the author of that device, and
> then only while he/she remembers how that thing works).

In my view it makes things slightly more complicated, because you have
to follow a hotplug handler chain that plugs devices via proxy devices
(e.g. passing through TYPE_MEMORY_DEVICE calls to a child, and therefore
essentially hotplugging the child), instead of only watching out for
which device gets hotplugged and finding exactly one hotplug handler.
Of course, for a device hierarchy, multiple devices get logically
hotplugged.

-- 

Thanks,

David / dhildenb
