From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: Alex Williamson <alex.williamson@redhat.com>,
qemu-ppc@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH qemu v14 16/18] spapr_iommu, vfio, memory: Notify IOMMU about starting/stopping being used by VFIO
Date: Tue, 22 Mar 2016 17:24:33 +1100 [thread overview]
Message-ID: <56F0E521.6060506@ozlabs.ru> (raw)
In-Reply-To: <20160322044559.GE23586@voom.redhat.com>
On 03/22/2016 03:45 PM, David Gibson wrote:
> On Mon, Mar 21, 2016 at 06:47:04PM +1100, Alexey Kardashevskiy wrote:
>> The sPAPR TCE tables manage 2 copies when VFIO is using an IOMMU -
>> a guest view of the table and a hardware TCE table. If there is no VFIO
>> presense in the address space, then just the guest view is used, if
>> this is the case, it is allocated in the KVM. However since there is no
>> support yet for VFIO in KVM TCE hypercalls, when we start using VFIO,
>> we need to move the guest view from KVM to the userspace; and we need
>> to do this for every IOMMU on a bus with VFIO devices.
>>
>> This adds vfio_start/vfio_stop callbacks in MemoryRegionIOMMUOps to
>> notifiy IOMMU about changing environment so it can reallocate the table
>> to/from KVM or (when available) hook the IOMMU groups with the logical
>> bus (LIOBN) in the KVM.
>>
>> This removes explicit spapr_tce_set_need_vfio() call from PCI hotplug
>> path as the new callbacks do this better - they notify IOMMU at
>> the exact moment when the configuration is changed, and this also
>> includes the case of PCI hot unplug.
>>
>> TODO: split into 2 or 3 patches, per maintainership area.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>
> I'm finding this one much easier to follow than the previous revision.
>
>> ---
>> hw/ppc/spapr_iommu.c | 12 ++++++++++++
>> hw/ppc/spapr_pci.c | 6 ------
>> hw/vfio/common.c | 9 +++++++++
>> include/exec/memory.h | 4 ++++
>> 4 files changed, 25 insertions(+), 6 deletions(-)
>>
>> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
>> index 6dc3c45..702075d 100644
>> --- a/hw/ppc/spapr_iommu.c
>> +++ b/hw/ppc/spapr_iommu.c
>> @@ -151,6 +151,16 @@ static uint64_t spapr_tce_get_page_sizes(MemoryRegion *iommu)
>> return 1ULL << tcet->page_shift;
>> }
>>
>> +static void spapr_tce_vfio_start(MemoryRegion *iommu)
>> +{
>> + spapr_tce_set_need_vfio(container_of(iommu, sPAPRTCETable, iommu), true);
>> +}
>> +
>> +static void spapr_tce_vfio_stop(MemoryRegion *iommu)
>> +{
>> + spapr_tce_set_need_vfio(container_of(iommu, sPAPRTCETable, iommu), false);
>> +}
>
> Wonder if a single callback which takes a boolean might be a little
> less clunky.
I have a feeling that at least once I was asked to do the opposite and now
we have take_ownership/release_ownership. This does not seem to be much
different and the existing names are more self-documenting than the
previous vfio_notify() or whatever name I could think of.
>> static void spapr_tce_table_do_enable(sPAPRTCETable *tcet);
>> static void spapr_tce_table_do_disable(sPAPRTCETable *tcet);
>>
>> @@ -211,6 +221,8 @@ static const VMStateDescription vmstate_spapr_tce_table = {
>> static MemoryRegionIOMMUOps spapr_iommu_ops = {
>> .translate = spapr_tce_translate_iommu,
>> .get_page_sizes = spapr_tce_get_page_sizes,
>> + .vfio_start = spapr_tce_vfio_start,
>> + .vfio_stop = spapr_tce_vfio_stop,
>> };
>>
>> static int spapr_tce_table_realize(DeviceState *dev)
>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>> index bfcafdf..af99a36 100644
>> --- a/hw/ppc/spapr_pci.c
>> +++ b/hw/ppc/spapr_pci.c
>> @@ -1121,12 +1121,6 @@ static void spapr_phb_add_pci_device(sPAPRDRConnector *drc,
>> void *fdt = NULL;
>> int fdt_start_offset = 0, fdt_size;
>>
>> - if (object_dynamic_cast(OBJECT(pdev), "vfio-pci")) {
>> - sPAPRTCETable *tcet = spapr_tce_find_by_liobn(phb->dma_liobn);
>> -
>> - spapr_tce_set_need_vfio(tcet, true);
>> - }
>> -
>> if (dev->hotplugged) {
>> fdt = create_device_tree(&fdt_size);
>> fdt_start_offset = spapr_create_pci_child_dt(phb, pdev, fdt, 0);
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index b257655..4e873b7 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -421,6 +421,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
>> QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
>>
>> memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
>> + if (section->mr->iommu_ops && section->mr->iommu_ops->vfio_start) {
>> + section->mr->iommu_ops->vfio_start(section->mr);
>> + }
>> memory_region_iommu_replay(giommu->iommu, &giommu->n,
>> false);
>>
>> @@ -466,6 +469,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
>> VFIOContainer *container = container_of(listener, VFIOContainer, listener);
>> hwaddr iova, end;
>> int ret;
>> + MemoryRegion *iommu = NULL;
>>
>> if (vfio_listener_skipped_section(section)) {
>> trace_vfio_listener_region_del_skip(
>> @@ -487,6 +491,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
>> QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
>> if (giommu->iommu == section->mr) {
>> memory_region_unregister_iommu_notifier(&giommu->n);
>> + iommu = giommu->iommu;
>> QLIST_REMOVE(giommu, giommu_next);
>> g_free(giommu);
>> break;
>> @@ -519,6 +524,10 @@ static void vfio_listener_region_del(MemoryListener *listener,
>> "0x%"HWADDR_PRIx") = %d (%m)",
>> container, iova, end - iova, ret);
>> }
>> +
>> + if (iommu && iommu->iommu_ops && iommu->iommu_ops->vfio_stop) {
>> + iommu->iommu_ops->vfio_stop(section->mr);
>> + }
>
> IIRC there can be multiple containers listening on the same PCI
> address space. In that case, this won't be correct, because once one
> of the VFIO containers is removed, it will call vfio_stop, even though
> the other VFIO container still needs the guest IOMMU to support it.
>
> So I think you need some sort of refcounting here.
Right, missed this bit, good finding.
>
>> }
>>
>> static const MemoryListener vfio_memory_listener = {
>> diff --git a/include/exec/memory.h b/include/exec/memory.h
>> index eb5ce67..f1de133f 100644
>> --- a/include/exec/memory.h
>> +++ b/include/exec/memory.h
>> @@ -152,6 +152,10 @@ struct MemoryRegionIOMMUOps {
>> IOMMUTLBEntry (*translate)(MemoryRegion *iommu, hwaddr addr, bool is_write);
>> /* Returns supported page sizes */
>> uint64_t (*get_page_sizes)(MemoryRegion *iommu);
>> + /* Called when VFIO starts using this */
>> + void (*vfio_start)(MemoryRegion *iommu);
>> + /* Called when VFIO stops using this */
>> + void (*vfio_stop)(MemoryRegion *iommu);
>> };
>>
>> typedef struct CoalescedMemoryRange CoalescedMemoryRange;
>
--
Alexey
next prev parent reply other threads:[~2016-03-22 6:24 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-03-21 7:46 [Qemu-devel] [PATCH qemu v14 00/18] spapr: vfio: Enable Dynamic DMA windows (DDW) Alexey Kardashevskiy
2016-03-21 7:46 ` [Qemu-devel] [PATCH qemu v14 01/18] memory: Fix IOMMU replay base address Alexey Kardashevskiy
2016-03-22 0:49 ` David Gibson
2016-03-22 3:12 ` Alexey Kardashevskiy
2016-03-22 3:26 ` David Gibson
2016-03-22 4:28 ` Alexey Kardashevskiy
2016-03-22 4:59 ` David Gibson
2016-03-22 7:19 ` Alexey Kardashevskiy
2016-03-22 23:07 ` David Gibson
2016-03-23 10:58 ` Paolo Bonzini
2016-03-21 7:46 ` [Qemu-devel] [PATCH qemu v14 02/18] vmstate: Define VARRAY with VMS_ALLOC Alexey Kardashevskiy
2016-03-21 7:46 ` [Qemu-devel] [PATCH qemu v14 03/18] spapr_pci: Move DMA window enablement to a helper Alexey Kardashevskiy
2016-03-22 1:02 ` David Gibson
2016-03-22 3:17 ` Alexey Kardashevskiy
2016-03-22 3:28 ` David Gibson
2016-03-21 7:46 ` [Qemu-devel] [PATCH qemu v14 04/18] spapr_iommu: Move table allocation to helpers Alexey Kardashevskiy
2016-03-21 7:46 ` [Qemu-devel] [PATCH qemu v14 05/18] spapr_iommu: Introduce "enabled" state for TCE table Alexey Kardashevskiy
2016-03-22 1:11 ` David Gibson
2016-03-21 7:46 ` [Qemu-devel] [PATCH qemu v14 06/18] spapr_iommu: Finish renaming vfio_accel to need_vfio Alexey Kardashevskiy
2016-03-22 1:18 ` David Gibson
2016-03-21 7:46 ` [Qemu-devel] [PATCH qemu v14 07/18] spapr_iommu: Realloc table during migration Alexey Kardashevskiy
2016-03-22 1:23 ` David Gibson
2016-03-21 7:46 ` [Qemu-devel] [PATCH qemu v14 08/18] spapr_iommu: Migrate full state Alexey Kardashevskiy
2016-03-22 1:31 ` David Gibson
2016-03-21 7:46 ` [Qemu-devel] [PATCH qemu v14 09/18] spapr_iommu: Add root memory region Alexey Kardashevskiy
2016-03-21 7:46 ` [Qemu-devel] [PATCH qemu v14 10/18] spapr_pci: Reset DMA config on PHB reset Alexey Kardashevskiy
2016-03-21 7:46 ` [Qemu-devel] [PATCH qemu v14 11/18] memory: Add reporting of supported page sizes Alexey Kardashevskiy
2016-03-22 3:02 ` David Gibson
2016-03-21 7:47 ` [Qemu-devel] [PATCH qemu v14 12/18] vfio: Check that IOMMU MR translates to system address space Alexey Kardashevskiy
2016-03-22 3:05 ` David Gibson
2016-03-22 15:47 ` Alex Williamson
2016-03-23 0:43 ` David Gibson
2016-03-23 0:44 ` Alexey Kardashevskiy
2016-03-21 7:47 ` [Qemu-devel] [PATCH qemu v14 13/18] vfio: spapr: Add SPAPR IOMMU v2 support (DMA memory preregistering) Alexey Kardashevskiy
2016-03-22 4:04 ` David Gibson
2016-03-21 7:47 ` [Qemu-devel] [PATCH qemu v14 14/18] spapr_pci: Add and export DMA resetting helper Alexey Kardashevskiy
2016-03-21 7:47 ` [Qemu-devel] [PATCH qemu v14 15/18] vfio: Add host side IOMMU capabilities Alexey Kardashevskiy
2016-03-22 4:20 ` David Gibson
2016-03-22 6:47 ` Alexey Kardashevskiy
2016-03-21 7:47 ` [Qemu-devel] [PATCH qemu v14 16/18] spapr_iommu, vfio, memory: Notify IOMMU about starting/stopping being used by VFIO Alexey Kardashevskiy
2016-03-22 4:45 ` David Gibson
2016-03-22 6:24 ` Alexey Kardashevskiy [this message]
2016-03-22 10:22 ` David Gibson
2016-03-21 7:47 ` [Qemu-devel] [PATCH qemu v14 17/18] vfio/spapr: Use VFIO_SPAPR_TCE_v2_IOMMU Alexey Kardashevskiy
2016-03-22 5:14 ` David Gibson
2016-03-22 5:54 ` Alexey Kardashevskiy
2016-03-23 1:08 ` David Gibson
2016-03-23 2:12 ` Alexey Kardashevskiy
2016-03-23 2:53 ` David Gibson
2016-03-23 3:06 ` Alexey Kardashevskiy
2016-03-23 6:03 ` David Gibson
2016-03-24 0:03 ` Alexey Kardashevskiy
2016-03-24 9:10 ` Alexey Kardashevskiy
2016-03-29 5:30 ` David Gibson
2016-03-29 5:44 ` Alexey Kardashevskiy
2016-03-29 6:44 ` David Gibson
2016-03-21 7:47 ` [Qemu-devel] [PATCH qemu v14 18/18] spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW) Alexey Kardashevskiy
2016-03-23 2:13 ` David Gibson
2016-03-23 3:28 ` Alexey Kardashevskiy
2016-03-23 6:11 ` David Gibson
2016-03-24 2:32 ` Alexey Kardashevskiy
2016-03-29 5:22 ` David Gibson
2016-03-29 6:23 ` Alexey Kardashevskiy
2016-03-31 3:19 ` David Gibson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56F0E521.6060506@ozlabs.ru \
--to=aik@ozlabs.ru \
--cc=alex.williamson@redhat.com \
--cc=david@gibson.dropbear.id.au \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).