From: David Hildenbrand <david@redhat.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-ppc@nongnu.org,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
"Michael S . Tsirkin" <mst@redhat.com>,
Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Richard Henderson <rth@twiddle.net>,
Eduardo Habkost <ehabkost@redhat.com>,
Eric Blake <eblake@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
Pankaj Gupta <pagupta@redhat.com>,
Luiz Capitulino <lcapitul@redhat.com>,
Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
David Gibson <david@gibson.dropbear.id.au>,
Alexander Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] [PATCH v4 18/24] qdev: hotplug: provide do_unplug handler
Date: Tue, 2 Oct 2018 11:49:09 +0200
Message-ID: <518ce438-7d6b-6f36-8bac-d834b21fb8bc@redhat.com>
In-Reply-To: <20181001152443.2cca5c5e@redhat.com>
On 01/10/2018 15:24, Igor Mammedov wrote:
> On Fri, 28 Sep 2018 14:21:33 +0200
> David Hildenbrand <david@redhat.com> wrote:
>
>> On 27/09/2018 15:01, Igor Mammedov wrote:
>>> On Wed, 26 Sep 2018 11:42:13 +0200
>>> David Hildenbrand <david@redhat.com> wrote:
>>>
>>>> The unplug and unplug_request handlers are special: They are not
>>>> executed when unrealizing a device, but rather trigger the removal of a
>>>> device from device_del() via object_unparent() - to effectively
>>>> unrealize a device.
>>>>
>>>> If such a device has a child bus and another device attached to
>>>> that bus (e.g. how virtio devices are created with their proxy device),
>>>> we will not get a call to the unplug handler. As we want to support
>>>> hotplug handlers (and especially also some unplug logic to undo resource
>>>> assignment) for such devices, we cannot simply call the unplug handler
>>>> when unrealizing - it has a different semantic ("trigger removal").
>>>>
>>>> To handle this scenario, we need a do_unplug handler, that will be
>>>> executed for all devices with a hotplug handler.
>>> could you clarify what would be call flow for unplug in this case
>>> starting from 'device_del'?
>>
>> Let's work it through for virtio-pmem:
>>
>> qemu-system-x86_64 -machine pc -m 8G,maxmem=20G \
>> [...] \
>> -object memory-backend-file,id=mem1,share,mem-path=/dev/zero,size=4G \
>> -device virtio-pmem-pci,id=vp1,memdev=mem1 -monitor stdio
>>
>> info qtree gives us:
>>
>> bus: pci.0
>> type PCI
>> dev: virtio-pmem-pci, id "vp1"
>> [...]
>> bus: virtio-bus
>> type virtio-pci-bus
>> dev: virtio-pmem, id ""
>> memaddr = 9663676416 (0x240000000)
>> memdev = "/objects/mem1"
>> [...]
>>
>> "device_del vp1":
>>
>> qmp_device_del(vp1)->qdev_unplug(vp1)->hotplug_handler_unplug_request(vp1)
>>
>> piix4_device_unplug_request_cb(vp1)->acpi_pcihp_device_unplug_cb(vp1)
>>
>> -> Guest has to process the request and respond
>>
>> acpi_pcihp_eject_slot(vp1)->object_unparent(vp1)
> that's one of the possible call flows, unplug could also originate
> from shpc or native pci-e hot-plug.
> PCI unplug hasn't ever been factored out from old PCI device/bus code,
> so PCIDevice::unrealize takes care of parent resource teardown.
> (well, there wasn't any reason to factor it out till we started
> talking about hybrid devices).
> We probably should do the same refactoring like it was done for
> pc-dimm/cpu unplug
> (see qdev_get_hotplug_handler()+hotplug_handler_unplug() usage)
>
>> Now, this triggers the unplug of the device hierarchy:
>>
>> object_unparent(vp1)->device_unparent(vp1)->device_set_realized(vp1, 0)
>>
>> ->bus_set_realized(virtio-bus, 0)->device_set_realized(virtio-pmem, 0)
>>
>> This is the place where this hook comes into play:
>>
>> ->hotplug_handler_do_unplug(virtio-pmem)->machine
>> handler->virtio_pmem_do_unplug(virtio-pmem)
>>
>> Followed by object_unparent(virtio-bus)->bus_unparent(virtio-bus)
>> Followed by object_unparent(virtio-pmem)->device_unparent(virtio-pmem)
>>
>>
>> At this point, the hierarchy is gone. The unplug succeeded and the
>> virtio-pmem device (memory device) has been properly unplugged.
> I'm concerned that both plug and unplug flows are implicit
> and handled as if it were separate devices without enforcing
> a particular ordering of (un)plug handlers.
> It would work right now but it looks rather fragile to me.
In my ideal world, the plug+unplug handlers would only perform checks
and essentially trigger an object_unparent() (either directly or via
some guest action).
Inside object_unparent(), the call flow of the unrealize steps is defined.
By moving the "real unplug" part into "do_unplug" and therefore
essentially calling it when unrealizing, we could generalize this for
all unplug handlers.
I think the order of realization, and therefore the order of hotplug
handler calls, is strictly defined already. The same applies to
unrealization if we factor the essential parts out into e.g. "do_unplug".
That order is strictly encoded in device_set_realized() and
bus_set_realized().
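To illustrate the ordering argument, here is a minimal toy model in C. It is only a sketch and not QEMU code: ToyDevice, ToyBus and all toy_* functions are hypothetical names, and it assumes that do_unplug runs after the child bus has been unrealized and just before the device itself is marked unrealized. The exact position of the hook inside device_set_realized() is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: these are NOT QEMU's DeviceState/BusState types. */
#define TOY_TRACE_MAX 16

typedef struct ToyDevice ToyDevice;

typedef struct ToyBus {
    const char *name;
    ToyDevice *child;        /* at most one device per bus in this sketch */
    int realized;
} ToyBus;

struct ToyDevice {
    const char *name;
    ToyBus *child_bus;       /* NULL if the device exposes no bus */
    int has_hotplug_handler; /* assumed: do_unplug is called only if set */
    int realized;
};

char toy_trace[TOY_TRACE_MAX][64];
size_t toy_trace_len;

/* Record one step of the call flow as "what(name)". */
void toy_trace_add(const char *what, const char *name)
{
    assert(toy_trace_len < TOY_TRACE_MAX);
    snprintf(toy_trace[toy_trace_len++], sizeof(toy_trace[0]),
             "%s(%s)", what, name);
}

/* Return 1 if trace entry i equals the expected string. */
int toy_trace_is(size_t i, const char *expected)
{
    return i < toy_trace_len && strcmp(toy_trace[i], expected) == 0;
}

void toy_device_unrealize(ToyDevice *dev);

/* Unrealize the device on the bus first, then the bus itself. */
void toy_bus_unrealize(ToyBus *bus)
{
    if (bus->child) {
        toy_device_unrealize(bus->child);
    }
    bus->realized = 0;
    toy_trace_add("bus_unrealized", bus->name);
}

/* Child bus first, then the (hypothetical) do_unplug hook, then the device. */
void toy_device_unrealize(ToyDevice *dev)
{
    if (!dev->realized) {
        return;
    }
    if (dev->child_bus) {
        toy_bus_unrealize(dev->child_bus);
    }
    if (dev->has_hotplug_handler) {
        toy_trace_add("do_unplug", dev->name);
    }
    dev->realized = 0;
    toy_trace_add("unrealized", dev->name);
}
```

Because the recursion is fixed in one place, the hook for the child (virtio-pmem) always fires before the hook for the proxy (vp1) in this model, which is the property I am relying on.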
>
> If I remember right, the suggested and partially implemented idea
> in one of your previous series was to override default hotplug
> handler with a machine one for plugged in device [1][2].
> However impl. wasn't exactly what I've suggested since it matches
> all memory-devices.
>
> 1) qdev: let machine hotplug handler to override bus hotplug handler
> 2) pc: route all memory devices through the machine hotplug handler
>
> So let's reiterate, we have TYPE_VIRTIO_PMEM and TYPE_VIRTIO_PMEM_PCI
> the former implements TYPE_MEMORY_DEVICE interface and the latter is
> a wrapper PCI/whatnot device shim.
> So when you plug that composite device you'd get 2 independent
> plug hooks called, which makes it an unreliable/broken design.
Can you elaborate why this is broken? I don't consider the
realize/unrealize order broken, and that is where we plug into. But yes,
we logically plug a device hierarchy and therefore get separate
hotplug handler calls.
>
> My next question would be why TYPE_VIRTIO_PMEM_PCI can't implement
> TYPE_MEMORY_DEVICE interface and TYPE_VIRTIO_PMEM be a simple VIRTIO
> device without any hotplug hooks (so shim device would proxy all
> memory-device logic to its child)?
> /huh, then you don't need get_device_id() as well/
I had the same idea while going through different options. Then we would
have to forward all calls directly to the child. We cannot reuse
TYPE_MEMORY_DEVICE, so we would either need a new interface or define
the functions we want manually for each such device.
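A rough sketch of what "forwarding all calls to the child" could look like, again as a toy model in C. FwdMemOps, FwdChild and FwdProxy are hypothetical names and not the real MemoryDeviceClass; the address and size are the ones from the virtio-pmem example quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a "new interface" the proxy would implement. */
typedef struct FwdMemOps {
    uint64_t (*get_addr)(const void *obj);
    uint64_t (*get_size)(const void *obj);
} FwdMemOps;

/* The child owns the actual memory-device state. */
typedef struct FwdChild {
    uint64_t addr;
    uint64_t size;
} FwdChild;

/* The proxy (PCI shim) holds no memory state, only a link to its child. */
typedef struct FwdProxy {
    FwdChild *child;
} FwdProxy;

uint64_t fwd_child_get_addr(const void *obj)
{
    return ((const FwdChild *)obj)->addr;
}

uint64_t fwd_child_get_size(const void *obj)
{
    return ((const FwdChild *)obj)->size;
}

/* Every proxy callback is a one-line forward to the child. */
uint64_t fwd_proxy_get_addr(const void *obj)
{
    const FwdProxy *proxy = obj;
    assert(proxy->child != NULL);
    return fwd_child_get_addr(proxy->child);
}

uint64_t fwd_proxy_get_size(const void *obj)
{
    const FwdProxy *proxy = obj;
    assert(proxy->child != NULL);
    return fwd_child_get_size(proxy->child);
}

const FwdMemOps fwd_proxy_ops = {
    fwd_proxy_get_addr,
    fwd_proxy_get_size,
};
```

Each callback has to be repeated per proxy device type (or hidden behind a shared interface), which is the duplication I am worried about.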
>
> That way using [2] and [1 - modulo it should match only concrete type]
> machine would be able to override hotplug handlers for TYPE_VIRTIO_PMEM_PCI
> and explicitly call machine + pci hotplug handlers in necessary order.
>
> flow would look like:
> [acpi|shpc|native pci-e eject]->
> hotplug_ctrl = qdev_get_hotplug_handler(dev);
> hotplug_handler_unplug(hotplug_ctrl, dev, &local_err); ->
> machine_unplug()
> machine_virtio_pci_pmem_cb():
> // we know that the device has 2-stage hotplug handlers,
> // so we can arrange the hotplug sequence in the necessary order
> hotplug_ctrl2 = qdev_get_bus_hotplug_handler(dev);
>
> // then do the unplug in whatever order is correct,
> // I'd assume tear down/stop the PCI device first, flushing
> // virtio command queues, and then unplug the memory itself.
> hotplug_handler_unplug(hotplug_ctrl2, dev, &local_err);
> memory_device_unplug()
>
> Similar logic applies to device_add/device_del paths, with a difference that
> origin point would be monitor/qmp.
Let's see. The user calls device_del(). That triggers an unplug_request.
For virtio-pmem, there is nothing to do.
The eject hook is then called by the guest. For now we do an
object_unparent() there. This would now be wrong. We would have to call
a proper hotplug handler chain (I guess that's the refactoring you
mentioned above).
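To make sure I understand the proposal, here is how I picture such an explicit chain, as a toy model in C. This is only a sketch under my own assumptions: ChainDev and the chain_* functions are hypothetical, and the ordering (PCI part first, then memory) is taken from your comment in the flow above.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: one composite device with a PCI part and a memory part. */
typedef struct ChainDev {
    const char *type;
    int pci_plugged;    /* 1 while the PCI shim is still active */
    int memory_mapped;  /* 1 while the memory region is still mapped */
} ChainDev;

/* Stand-in for the bus (PCI) hotplug handler's unplug step. */
int chain_pci_unplug(ChainDev *dev)
{
    if (!dev->pci_plugged) {
        return -1;      /* nothing to unplug */
    }
    dev->pci_plugged = 0;
    return 0;
}

/* Stand-in for the memory-device unplug; refuses to run out of order. */
int chain_memory_unplug(ChainDev *dev)
{
    if (dev->pci_plugged) {
        return -1;      /* PCI part must be torn down first */
    }
    dev->memory_mapped = 0;
    return 0;
}

/*
 * Stand-in for the machine hotplug handler: for the one concrete composite
 * type it spells out the sequence, for everything else it only calls the
 * bus handler.
 */
int chain_machine_unplug(ChainDev *dev)
{
    assert(dev != NULL && dev->type != NULL);
    if (strcmp(dev->type, "virtio-pmem-pci") != 0) {
        return chain_pci_unplug(dev);
    }
    if (chain_pci_unplug(dev) != 0) {
        return -1;
    }
    return chain_memory_unplug(dev);
}
```

In this model the eject path would call chain_machine_unplug() instead of going straight to object_unparent(), and the order no longer depends on how the unrealize recursion happens to be implemented.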
>
> The point is to have a single explicit callback chain that applies to a
> concrete device type. That way, if we ever change the ordering of plug
> callbacks in qdev core, the callback sequence expected for a device would
> still remain in place, ensuring that the device is (un)plugged as expected.
I haven't tested yet if this will work, but I can give it a try. I
learned that in QEMU things often seem easier than they actually are :)
>
> Also it should be easier to trace for a reader than 2 disjoint callbacks of
> a composite device (which are only known to the author of that device, and
> then only till he/she remembers how that thing works).
In my view it makes things slightly more complicated, because you have
to follow a hotplug handler chain that plugs devices via proxy devices
(e.g. passing through TYPE_MEMORY_DEVICE calls to a child, and therefore
essentially hotplugging the child), instead of only watching out for
which device gets hotplugged and finding exactly one hotplug handler.
Of course, for a device hierarchy, multiple devices get logically
hotplugged.
--
Thanks,
David / dhildenb