From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	qemu-ppc@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH qemu v14 16/18] spapr_iommu, vfio, memory: Notify IOMMU about starting/stopping being used by VFIO
Date: Tue, 22 Mar 2016 17:24:33 +1100
Message-ID: <56F0E521.6060506@ozlabs.ru>
In-Reply-To: <20160322044559.GE23586@voom.redhat.com>

On 03/22/2016 03:45 PM, David Gibson wrote:
> On Mon, Mar 21, 2016 at 06:47:04PM +1100, Alexey Kardashevskiy wrote:
>> The sPAPR TCE tables manage 2 copies when VFIO is using an IOMMU -
>> a guest view of the table and a hardware TCE table. If there is no VFIO
>> presence in the address space, then just the guest view is used; in
>> that case, it is allocated in KVM. However since there is no
>> support yet for VFIO in KVM TCE hypercalls, when we start using VFIO,
>> we need to move the guest view from KVM to the userspace; and we need
>> to do this for every IOMMU on a bus with VFIO devices.
>>
>> This adds vfio_start/vfio_stop callbacks in MemoryRegionIOMMUOps to
>> notify IOMMU about changing environment so it can reallocate the table
>> to/from KVM or (when available) hook the IOMMU groups with the logical
>> bus (LIOBN) in the KVM.
>>
>> This removes explicit spapr_tce_set_need_vfio() call from PCI hotplug
>> path as the new callbacks do this better - they notify IOMMU at
>> the exact moment when the configuration is changed, and this also
>> includes the case of PCI hot unplug.
>>
>> TODO: split into 2 or 3 patches, per maintainership area.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>
> I'm finding this one much easier to follow than the previous revision.
>
>> ---
>>   hw/ppc/spapr_iommu.c  | 12 ++++++++++++
>>   hw/ppc/spapr_pci.c    |  6 ------
>>   hw/vfio/common.c      |  9 +++++++++
>>   include/exec/memory.h |  4 ++++
>>   4 files changed, 25 insertions(+), 6 deletions(-)
>>
>> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
>> index 6dc3c45..702075d 100644
>> --- a/hw/ppc/spapr_iommu.c
>> +++ b/hw/ppc/spapr_iommu.c
>> @@ -151,6 +151,16 @@ static uint64_t spapr_tce_get_page_sizes(MemoryRegion *iommu)
>>       return 1ULL << tcet->page_shift;
>>   }
>>
>> +static void spapr_tce_vfio_start(MemoryRegion *iommu)
>> +{
>> +    spapr_tce_set_need_vfio(container_of(iommu, sPAPRTCETable, iommu), true);
>> +}
>> +
>> +static void spapr_tce_vfio_stop(MemoryRegion *iommu)
>> +{
>> +    spapr_tce_set_need_vfio(container_of(iommu, sPAPRTCETable, iommu), false);
>> +}
>
> Wonder if a single callback which takes a boolean might be a little
> less clunky.

I have a feeling that at least once I was asked to do the opposite and now 
we have take_ownership/release_ownership. This does not seem to be much 
different and the existing names are more self-documenting than the 
previous vfio_notify() or whatever name I could think of.
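
To make the two shapes concrete, here is a rough sketch of both options side by side. This is not QEMU code: DemoRegion, DemoOpsPair, DemoOpsSingle and every demo_* name are hypothetical stand-ins, and vfio_notify is only the placeholder name mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for MemoryRegion; it only tracks the flag. */
typedef struct DemoRegion {
    bool need_vfio;
} DemoRegion;

/* Shape A: two callbacks, as in this patch (vfio_start/vfio_stop). */
typedef struct DemoOpsPair {
    void (*vfio_start)(DemoRegion *mr);
    void (*vfio_stop)(DemoRegion *mr);
} DemoOpsPair;

/* Shape B: one callback taking a boolean (hypothetical name). */
typedef struct DemoOpsSingle {
    void (*vfio_notify)(DemoRegion *mr, bool in_use);
} DemoOpsSingle;

static void demo_start(DemoRegion *mr) { mr->need_vfio = true; }
static void demo_stop(DemoRegion *mr) { mr->need_vfio = false; }
static void demo_notify(DemoRegion *mr, bool in_use) { mr->need_vfio = in_use; }

static const DemoOpsPair demo_pair = {
    .vfio_start = demo_start,
    .vfio_stop = demo_stop,
};

static const DemoOpsSingle demo_single = {
    .vfio_notify = demo_notify,
};

/* A boolean-style caller can still be layered on top of the pair;
 * a missing callback is skipped, as the patch does in hw/vfio/common.c. */
static void demo_dispatch_pair(const DemoOpsPair *ops, DemoRegion *mr,
                               bool in_use)
{
    assert(ops);
    void (*cb)(DemoRegion *) = in_use ? ops->vfio_start : ops->vfio_stop;
    if (cb) {
        cb(mr);
    }
}
```

Either shape carries the same information; the difference is only whether the call site or the callback name says which direction the change goes.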


>>   static void spapr_tce_table_do_enable(sPAPRTCETable *tcet);
>>   static void spapr_tce_table_do_disable(sPAPRTCETable *tcet);
>>
>> @@ -211,6 +221,8 @@ static const VMStateDescription vmstate_spapr_tce_table = {
>>   static MemoryRegionIOMMUOps spapr_iommu_ops = {
>>       .translate = spapr_tce_translate_iommu,
>>       .get_page_sizes = spapr_tce_get_page_sizes,
>> +    .vfio_start = spapr_tce_vfio_start,
>> +    .vfio_stop = spapr_tce_vfio_stop,
>>   };
>>
>>   static int spapr_tce_table_realize(DeviceState *dev)
>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>> index bfcafdf..af99a36 100644
>> --- a/hw/ppc/spapr_pci.c
>> +++ b/hw/ppc/spapr_pci.c
>> @@ -1121,12 +1121,6 @@ static void spapr_phb_add_pci_device(sPAPRDRConnector *drc,
>>       void *fdt = NULL;
>>       int fdt_start_offset = 0, fdt_size;
>>
>> -    if (object_dynamic_cast(OBJECT(pdev), "vfio-pci")) {
>> -        sPAPRTCETable *tcet = spapr_tce_find_by_liobn(phb->dma_liobn);
>> -
>> -        spapr_tce_set_need_vfio(tcet, true);
>> -    }
>> -
>>       if (dev->hotplugged) {
>>           fdt = create_device_tree(&fdt_size);
>>           fdt_start_offset = spapr_create_pci_child_dt(phb, pdev, fdt, 0);
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index b257655..4e873b7 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -421,6 +421,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
>>           QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
>>
>>           memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
>> +        if (section->mr->iommu_ops && section->mr->iommu_ops->vfio_start) {
>> +            section->mr->iommu_ops->vfio_start(section->mr);
>> +        }
>>           memory_region_iommu_replay(giommu->iommu, &giommu->n,
>>                                      false);
>>
>> @@ -466,6 +469,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
>>       VFIOContainer *container = container_of(listener, VFIOContainer, listener);
>>       hwaddr iova, end;
>>       int ret;
>> +    MemoryRegion *iommu = NULL;
>>
>>       if (vfio_listener_skipped_section(section)) {
>>           trace_vfio_listener_region_del_skip(
>> @@ -487,6 +491,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
>>           QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
>>               if (giommu->iommu == section->mr) {
>>                   memory_region_unregister_iommu_notifier(&giommu->n);
>> +                iommu = giommu->iommu;
>>                   QLIST_REMOVE(giommu, giommu_next);
>>                   g_free(giommu);
>>                   break;
>> @@ -519,6 +524,10 @@ static void vfio_listener_region_del(MemoryListener *listener,
>>                        "0x%"HWADDR_PRIx") = %d (%m)",
>>                        container, iova, end - iova, ret);
>>       }
>> +
>> +    if (iommu && iommu->iommu_ops && iommu->iommu_ops->vfio_stop) {
>> +        iommu->iommu_ops->vfio_stop(section->mr);
>> +    }
>
> IIRC there can be multiple containers listening on the same PCI
> address space.  In that case, this won't be correct, because once one
> of the VFIO containers is removed, it will call vfio_stop, even though
> the other VFIO container still needs the guest IOMMU to support it.
>
> So I think you need some sort of refcounting here.


Right, missed this bit, good finding.
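
A minimal sketch of what such refcounting could look like, assuming the counter lives next to the table and counts attached VFIO containers. DemoTCETable and the demo_* functions are hypothetical stand-ins for sPAPRTCETable and spapr_tce_set_need_vfio(), not the actual implementation; where the counter should really live is still open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for sPAPRTCETable; only what the sketch needs. */
typedef struct DemoTCETable {
    unsigned vfio_users;   /* VFIO containers currently using the table */
    bool need_vfio;        /* what spapr_tce_set_need_vfio() would set */
    unsigned transitions;  /* how many times need_vfio actually flipped */
} DemoTCETable;

/* Stand-in for spapr_tce_set_need_vfio(): records real state changes. */
static void demo_set_need_vfio(DemoTCETable *t, bool need)
{
    if (t->need_vfio != need) {
        t->need_vfio = need;
        t->transitions++;
    }
}

/* Only the first user moves the table to the VFIO-capable state. */
static void demo_vfio_start(DemoTCETable *t)
{
    if (t->vfio_users++ == 0) {
        demo_set_need_vfio(t, true);
    }
}

/* Only the last user leaving moves the table back. */
static void demo_vfio_stop(DemoTCETable *t)
{
    assert(t->vfio_users > 0);
    if (--t->vfio_users == 0) {
        demo_set_need_vfio(t, false);
    }
}
```

With two containers on the same address space, removing the first one leaves need_vfio set; it is cleared only when the second one goes away too.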


>
>>   }
>>
>>   static const MemoryListener vfio_memory_listener = {
>> diff --git a/include/exec/memory.h b/include/exec/memory.h
>> index eb5ce67..f1de133f 100644
>> --- a/include/exec/memory.h
>> +++ b/include/exec/memory.h
>> @@ -152,6 +152,10 @@ struct MemoryRegionIOMMUOps {
>>       IOMMUTLBEntry (*translate)(MemoryRegion *iommu, hwaddr addr, bool is_write);
>>       /* Returns supported page sizes */
>>       uint64_t (*get_page_sizes)(MemoryRegion *iommu);
>> +    /* Called when VFIO starts using this */
>> +    void (*vfio_start)(MemoryRegion *iommu);
>> +    /* Called when VFIO stops using this */
>> +    void (*vfio_stop)(MemoryRegion *iommu);
>>   };
>>
>>   typedef struct CoalescedMemoryRange CoalescedMemoryRange;
>


-- 
Alexey
