qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	Alexander Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] [PATCH v4 08/18] vfio: Add DMA memory registering [repost]
Date: Tue, 17 Feb 2015 13:20:12 +1100	[thread overview]
Message-ID: <54E2A55C.9060107@ozlabs.ru> (raw)
In-Reply-To: <20150202070407.GN28703@voom.fritz.box>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Repost as I think I pressed "crtl-r" and thunberbird killed the formatting :)


On 02/02/2015 06:04 PM, David Gibson wrote:
> On Thu, Jan 29, 2015 at 08:27:20PM +1100, Alexey Kardashevskiy wrote:
>> This makes use of the new "memory registering" feature. The idea is 
>> to provide the guest ability to notify the host kernel about pages
>> which are going to be used for DMA. Having this information, the
>> host kernel can pin them all, do locked pages accounting and not
>> spent time on doing that in real time with possible failures which
>> cannot be handled nicely in some cases.
>> 
>> This adds a memory listener which notifies a VFIO container about
>> memory which needs to be pinned/unpinned. VFIO MMIO regions are
>> skipped.
>> 
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> --- 
>> hw/vfio/common.c              | 99
>> ++++++++++++++++++++++++++++++++++++++++++- 
>> include/hw/vfio/vfio-common.h |  3 +- 2 files changed, 100
>> insertions(+), 2 deletions(-)
>> 
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index
>> cf483ff..c08f9ab 100644 --- a/hw/vfio/common.c +++
>> b/hw/vfio/common.c @@ -485,6 +485,99 @@ void
>> vfio_listener_release(VFIOContainer *container) 
>> memory_listener_unregister(&container->iommu_data.type1.listener); 
>> }
>> 
>> +static int vfio_mem_register(VFIOContainer *container, +
>> void *vaddr, ram_addr_t size) +{ +    struct
>> vfio_iommu_type1_register_memory reg = { +        .argsz =
>> sizeof(reg), +        .vaddr = (__u64)(uintptr_t)vaddr, +
>> .size = size, +    }; +    long ret = ioctl(container->fd,
>> VFIO_IOMMU_REGISTER_MEMORY, &reg); + +    DPRINTF("(%u) %s %u:
>> va=%lx size=%lx, ret=%ld\n", getpid(), +            __func__,
>> __LINE__, reg.vaddr, reg.size, ret); + +    return ret; +} + +static
>> int vfio_mem_unregister(VFIOContainer *container, +
>> void *vaddr, ram_addr_t size) +{ +    struct
>> vfio_iommu_type1_unregister_memory reg = { +        .argsz =
>> sizeof(reg), +        .vaddr = (__u64)(uintptr_t)vaddr, +
>> .size = size, +    }; +    long ret = ioctl(container->fd,
>> VFIO_IOMMU_UNREGISTER_MEMORY, &reg); + +    DPRINTF("(%u) %s %u:
>> va=%lx size=%lx, ret=%ld\n", getpid(), +            __func__,
>> __LINE__, reg.vaddr, reg.size, ret); + +    return ret; +} + +static
>> void vfio_ram_region_add(MemoryListener *listener, +
>> MemoryRegionSection *section) +{ +    VFIOContainer *container =
>> container_of(listener, VFIOContainer, +
>> iommu_data.type1.ramlistener); +    hwaddr end; +    Int128 llend; +
>> void *vaddr; + +    if (!memory_region_is_ram(section->mr) || +
>> memory_region_is_skip_dump(section->mr)) { +        return; +    } 
>> + +    llend = int128_make64(section->offset_within_address_space); 
>> +    llend = int128_add(llend, section->size);
> 
> Does this need an add TARGET_PAGE_SIZE-1 in order to make it round up 
> to a page boundary, rather than truncate to a page boundary?

Fixed.

> 
>> +    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK)); + +
>> memory_region_ref(section->mr); + +    end = int128_get64(llend); +
>> vaddr = memory_region_get_ram_ptr(section->mr) + +
>> section->offset_within_region; +    vfio_mem_register(container,
>> vaddr, end); +} + +static void vfio_ram_region_del(MemoryListener
>> *listener, +                                MemoryRegionSection
>> *section) +{ +    VFIOContainer *container = container_of(listener,
>> VFIOContainer, +
>> iommu_data.type1.ramlistener); +    hwaddr iova, end; +    void
>> *vaddr; + +    if (!memory_region_is_ram(section->mr) || +
>> memory_region_is_skip_dump(section->mr)) { +        return; +    } 
>> + + +    iova =
>> TARGET_PAGE_ALIGN(section->offset_within_address_space); +    end =
>> (section->offset_within_address_space + int128_get64(section->size))
>> & +        TARGET_PAGE_MASK; +    vaddr =
>> memory_region_get_ram_ptr(section->mr) + +
>> section->offset_within_region + +        (iova -
>> section->offset_within_address_space);
> 
> It's not clear to me why region_add and region_del have a different 
> set of address calculations.
> 

Cut-n-paste from the other listener; I will rework it; this version was
too raw.


>> +    vfio_mem_unregister(container, vaddr, end - iova); +} + +const
>> MemoryListener vfio_ram_listener = { +    .region_add =
>> vfio_ram_region_add, +    .region_del = vfio_ram_region_del, +}; + 
>> +static void vfio_spapr_listener_release(VFIOContainer *container) 
>> +{ +
>> memory_listener_unregister(&container->iommu_data.type1.listener); +
>> memory_listener_unregister(&container->iommu_data.type1.ramlistener);
>
>> 
> Accessing fields within type1 from a function whose name says it's 
> spapr specific seems very wrong.


Kind of ugly, yes. But we share the common memory listener with Type1 so...


> 
>> +} + int vfio_mmap_region(Object *obj, VFIORegion *region, 
>> MemoryRegion *mem, MemoryRegion *submem, void **map, size_t size,
>> off_t offset, @@ -705,6 +798,10 @@ static int
>> vfio_connect_container(VFIOGroup *group, AddressSpace *as) goto
>> free_container_exit; }
>> 
>> +        container->iommu_data.type1.ramlistener =
>> vfio_ram_listener; +
>> memory_listener_register(&container->iommu_data.type1.ramlistener, +
>> &address_space_memory);
> 
> Why two separate listeners, rather than doing both jobs from a single
> listener?
> 

... I actually like the idea to have this separated from the rest of the
code. Furthermore, now I think we better have separate memory listeners
for Type1 and SPAPR as the current vfio_listener_region_add()/del() look
quite ugly trying to do different things depending on
memory_region_is_iommu().

Any objection to separating SPAPR's listener (and merging it with the one
introduced by this patch)?


>> /* * The host kernel code implementing VFIO_IOMMU_DISABLE is called 
>> * when container fd is closed so we do not call it explicitly @@
>> -718,7 +815,7 @@ static int vfio_connect_container(VFIOGroup *group,
>> AddressSpace *as) }
>> 
>> container->iommu_data.type1.listener = vfio_memory_listener; -
>> container->iommu_data.release = vfio_listener_release; +
>> container->iommu_data.release = vfio_spapr_listener_release;
>> 
>> memory_listener_register(&container->iommu_data.type1.listener, 
>> container->space->as); diff --git a/include/hw/vfio/vfio-common.h
>> b/include/hw/vfio/vfio-common.h index 1d5bfe8..3f91216 100644 ---
>> a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h 
>> @@ -66,6 +66,7 @@ struct VFIOGroup;
>> 
>> typedef struct VFIOType1 { MemoryListener listener; +
>> MemoryListener ramlistener; int error; bool initialized; }
>> VFIOType1; @@ -144,7 +145,7 @@ int vfio_get_device(VFIOGroup *group,
>> const char *name, VFIODevice *vbasedev);
>> 
>> extern const MemoryRegionOps vfio_region_ops; -extern const
>> MemoryListener vfio_memory_listener; +extern const MemoryListener
>> vfio_memory_listener, vfio_ram_listener; extern
>> QLIST_HEAD(vfio_group_head, VFIOGroup) vfio_group_list; extern
>> QLIST_HEAD(vfio_as_head, VFIOAddressSpace) vfio_address_spaces;
>> 
> 


- -- 
Alexey
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBAgAGBQJU4qVcAAoJEIYTPdgrwSC5czMP/Aui1yqy0a1/MZ69FlvZwYPR
V/wtTcy04IVBhBIxDq9BYMLOuWrliCZWbE172pXEVwcwD/JQedvYrgF2f+zZ+NUm
exoScPwG+nWS3iy9Xw5pLuqYbQVkIIoJ/cyTqjZesqmc++Og1FiRh8Wh1qttn8Dl
KSqnn30hVGdjZgIf6sztMOw3a9eAt+mRpdcjhIiMiilbYhoWwvsVNH9nIC4WNkX8
nky8+DarT4Tlwif9wns/BubtYtzPhUHeaIHjv4c7NSbSInXVRJ0bUPprxJse0l3c
lRSBNYp/cA4Qsw1Cec5ZC911OQDuIVGUrv80iXi2urFPViuma8HxYRxadJ3K2u/n
DCw+rvwB8dTvMviwYG0PyWrSSduUbC8riThypvw+yvfgC5qZi6KlxBXROoFy1AQN
YaeS5Mb7tTgmeYogClslu5MYQtSkNH7G8oD8NqNpXD6gA8oloCK2U0cLg6XYtH04
DtGkk9AoTTsDrAXPNX+OuGWQOLQAiR8rpLqq59J59ug+6uObrrGDoovCZH9Vk4rJ
F/gdogD/D/uO213QhfxJdCav/E0AxJ7LglCgvDKUk73Jy9es11s+Xy2augkfexHL
4foua0OqJl34hhtLubv6W0YZLx6yUZaUQoypOmuLJYfrb4TNlEHXcOGdsDmkk7No
yflIC9eNtNDR0wdWQqo+
=iytn
-----END PGP SIGNATURE-----

  parent reply	other threads:[~2015-02-17  2:20 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-29  9:27 [Qemu-devel] [PATCH v4 00/18] spapr: vfio: Enable Dynamic DMA windows (DDW) Alexey Kardashevskiy
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 01/18] spapr_iommu: Disable in-kernel IOMMU tables for >4GB windows Alexey Kardashevskiy
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 02/18] spapr_iommu: Make H_PUT_TCE_INDIRECT endian-safe Alexey Kardashevskiy
2015-02-02  6:30   ` David Gibson
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 03/18] spapr_pci: Introduce a liobn number generating macros Alexey Kardashevskiy
2015-02-04 15:31   ` Alexander Graf
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 04/18] spapr_vio: " Alexey Kardashevskiy
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 05/18] spapr_pci: Make find_phb()/find_dev() public Alexey Kardashevskiy
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 06/18] spapr_iommu: Make spapr_tce_find_by_liobn() public Alexey Kardashevskiy
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 07/18] spapr_iommu: Implement free_table() helper Alexey Kardashevskiy
2015-02-02  6:37   ` David Gibson
2015-02-03  1:32     ` Alexey Kardashevskiy
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 08/18] vfio: Add DMA memory registering Alexey Kardashevskiy
2015-02-02  7:04   ` David Gibson
2015-02-17  2:14     ` Alexey Kardashevskiy
2015-02-17 23:53       ` David Gibson
2015-02-17  2:20     ` Alexey Kardashevskiy [this message]
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 09/18] spapr_rtas: Reserve DDW RTAS token numbers Alexey Kardashevskiy
2015-02-02  7:09   ` David Gibson
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 10/18] spapr_pci: Define DDW callbacks Alexey Kardashevskiy
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 11/18] spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW) Alexey Kardashevskiy
2015-02-05  3:51   ` David Gibson
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 12/18] spapr_rtas: Add Dynamic DMA windows (DDW) RTAS handlers Alexey Kardashevskiy
2015-02-05  4:05   ` David Gibson
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 13/18] spapr_pci: Advertise dynamic DMA windows to guest Alexey Kardashevskiy
2015-02-05  4:10   ` David Gibson
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 14/18] vfio: Enable DDW ioctls to VFIO IOMMU driver Alexey Kardashevskiy
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 15/18] spapr_pci_vfio: Enable multiple groups per container Alexey Kardashevskiy
2015-02-05  4:19   ` David Gibson
2015-02-17  0:34     ` Alexey Kardashevskiy
2015-02-17 23:52       ` David Gibson
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 16/18] spapr_rtas_ddw: Workaround broken LE guests Alexey Kardashevskiy
2015-02-05  4:23   ` David Gibson
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 17/18] target-ppc: kvm: make use of KVM_CREATE_SPAPR_TCE_64 Alexey Kardashevskiy
2015-02-05  4:30   ` David Gibson
2015-01-29  9:27 ` [Qemu-devel] [PATCH v4 18/18] vfio: Enable in-kernel acceleration via VFIO KVM device Alexey Kardashevskiy
2015-02-05  4:49   ` David Gibson
2015-02-17  2:36     ` Alexey Kardashevskiy
2015-02-17 23:56       ` David Gibson
2015-01-30  4:01 ` [Qemu-devel] [PATCH v4 00/18] spapr: vfio: Enable Dynamic DMA windows (DDW) Alexey Kardashevskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54E2A55C.9060107@ozlabs.ru \
    --to=aik@ozlabs.ru \
    --cc=agraf@suse.de \
    --cc=alex.williamson@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).