From: Alex Williamson <alex.williamson@redhat.com>
To: David Hildenbrand <david@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>,
Wei Yang <richard.weiyang@linux.alibaba.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
qemu-devel@nongnu.org, Peter Xu <peterx@redhat.com>,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
Auger Eric <eric.auger@redhat.com>,
teawater <teawaterz@linux.alibaba.com>,
Igor Mammedov <imammedo@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Marek Kedzierski <mkedzier@redhat.com>
Subject: Re: [PATCH v3 04/10] vfio: Query and store the maximum number of DMA mappings
Date: Thu, 17 Dec 2020 10:55:12 -0700
Message-ID: <20201217105512.78a2ef71@omen.home>
In-Reply-To: <20201216141200.118742-5-david@redhat.com>

On Wed, 16 Dec 2020 15:11:54 +0100
David Hildenbrand <david@redhat.com> wrote:

> Let's query the maximum number of DMA mappings by querying the available
> mappings when creating the container.
>
> In addition, count the number of DMA mappings and warn when we would
> exceed it. This is a preparation for RamDiscardMgr, which might
> create quite a few DMA mappings over time, and we at least want to warn
> early that the QEMU setup might be problematic. Use "reserved"
> terminology, so we can use this to reserve mappings before they are
> actually created.

This terminology doesn't make much sense to me; we're not actually
performing any kind of reservation.

> Note: don't reserve vIOMMU DMA mappings - using the vIOMMU region size
> divided by the mapping page size might be a bad indication of what will
> happen in practice - we might end up warning all the time.

This suggests we're not really tracking DMA "reservations" at all.
Would something like dma_regions_mappings be a more appropriate
identifier for the thing you're trying to count? We might as well also
keep a counter for dma_iommu_mappings, where the sum of those two
should stay below dma_max_mappings.

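For illustration, something like this is what I have in mind (just a
sketch; the field names and the helper are made up, nothing here
exists yet):

    /* In VFIOContainer: counters sized to match dma_max. */
    unsigned int dma_max;              /* limit reported by the kernel */
    unsigned int dma_regions_mappings; /* mappings for RAM sections */
    unsigned int dma_iommu_mappings;   /* mappings via vIOMMU notifiers */

    /* Warn once the combined count exceeds the reported limit. */
    static void vfio_container_check_mappings(VFIOContainer *container)
    {
        unsigned int used = container->dma_regions_mappings +
                            container->dma_iommu_mappings;

        if (container->dma_max && used > container->dma_max) {
            warn_report("vfio: %u DMA mappings in use, limit is %u",
                        used, container->dma_max);
        }
    }
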
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Auger Eric <eric.auger@redhat.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: teawater <teawaterz@linux.alibaba.com>
> Cc: Marek Kedzierski <mkedzier@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  hw/vfio/common.c              | 34 ++++++++++++++++++++++++++++++++++
>  include/hw/vfio/vfio-common.h |  2 ++
>  2 files changed, 36 insertions(+)
>
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 6ff1daa763..5ad88d476f 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -288,6 +288,26 @@ const MemoryRegionOps vfio_region_ops = {
>      },
>  };
>
> +static void vfio_container_dma_reserve(VFIOContainer *container,
> +                                       unsigned long dma_mappings)
> +{
> +    bool warned = container->dma_reserved > container->dma_max;
> +
> +    container->dma_reserved += dma_mappings;
> +    if (!warned && container->dma_max &&
> +        container->dma_reserved > container->dma_max) {
> +        warn_report("%s: possibly running out of DMA mappings. "
> +                    " Maximum number of DMA mappings: %d", __func__,
> +                    container->dma_max);

If we kept track of all the mappings, we could predict better than
"possibly". Tracing support to track a high-water mark might be useful
too.

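Just a sketch of what I mean (event and field names are made up): one
line in hw/vfio/trace-events plus an update wherever the counter is
bumped:

    # hw/vfio/trace-events
    vfio_dma_mappings_high_water(unsigned int mark, unsigned int max) "mark %u max %u"

    /* hw/vfio/common.c, after incrementing the mapping count */
    if (container->dma_mappings > container->dma_high_water) {
        container->dma_high_water = container->dma_mappings;
        trace_vfio_dma_mappings_high_water(container->dma_high_water,
                                           container->dma_max);
    }
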
> +    }
> +}
> +
> +static void vfio_container_dma_unreserve(VFIOContainer *container,
> +                                         unsigned long dma_mappings)
> +{
> +    container->dma_reserved -= dma_mappings;
> +}
> +
>  /*
>   * Device state interfaces
>   */
> @@ -835,6 +855,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
>          }
>      }
>
> +    /* We'll need one DMA mapping. */
> +    vfio_container_dma_reserve(container, 1);
> +
>      ret = vfio_dma_map(container, iova, int128_get64(llsize),
>                         vaddr, section->readonly);
>      if (ret) {
> @@ -879,6 +902,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
>                                       MemoryRegionSection *section)
>  {
>      VFIOContainer *container = container_of(listener, VFIOContainer, listener);
> +    bool unreserve_on_unmap = true;
>      hwaddr iova, end;
>      Int128 llend, llsize;
>      int ret;
> @@ -919,6 +943,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
>           * based IOMMU where a big unmap flattens a large range of IO-PTEs.
>           * That may not be true for all IOMMU types.
>           */
> +        unreserve_on_unmap = false;
>      }
>
>      iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
> @@ -970,6 +995,11 @@ static void vfio_listener_region_del(MemoryListener *listener,
>                           "0x%"HWADDR_PRIx") = %d (%m)",
>                           container, iova, int128_get64(llsize), ret);
>          }
> +
> +        /* We previously reserved one DMA mapping. */
> +        if (unreserve_on_unmap) {
> +            vfio_container_dma_unreserve(container, 1);
> +        }
>      }
>
>      memory_region_unref(section->mr);
> @@ -1735,6 +1765,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
>      container->fd = fd;
>      container->error = NULL;
>      container->dirty_pages_supported = false;
> +    container->dma_max = 0;
>      QLIST_INIT(&container->giommu_list);
>      QLIST_INIT(&container->hostwin_list);
>
> @@ -1765,7 +1796,10 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
>          vfio_host_win_add(container, 0, (hwaddr)-1, info->iova_pgsizes);
>          container->pgsizes = info->iova_pgsizes;
>
> +        /* The default in the kernel ("dma_entry_limit") is 65535. */
> +        container->dma_max = 65535;
>          if (!ret) {
> +            vfio_get_info_dma_avail(info, &container->dma_max);
>              vfio_get_iommu_info_migration(container, info);
>          }
>          g_free(info);
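
FWIW, the avail value comes from the VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL
capability in the iommu info cap chain (struct
vfio_iommu_type1_info_dma_avail in the kernel UAPI). A minimal sketch
of the lookup, assuming a cap-chain walker along the lines of
vfio_get_iommu_info_cap() (name illustrative):

    static bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
                                        unsigned int *avail)
    {
        struct vfio_info_cap_header *hdr;
        struct vfio_iommu_type1_info_dma_avail *cap;

        hdr = vfio_get_iommu_info_cap(info, VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL);
        if (!hdr) {
            /* Older kernels don't report a limit; keep the default. */
            return false;
        }

        cap = (struct vfio_iommu_type1_info_dma_avail *)hdr;
        *avail = cap->avail;
        return true;
    }
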
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 6141162d7a..fed0e85f66 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -88,6 +88,8 @@ typedef struct VFIOContainer {
>      uint64_t dirty_pgsizes;
>      uint64_t max_dirty_bitmap_size;
>      unsigned long pgsizes;
> +    unsigned int dma_max;
> +    unsigned long dma_reserved;

If dma_max is unsigned int, why do we need an unsigned long to track
how many are in use? Thanks,

Alex

>      QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
>      QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
>      QLIST_HEAD(, VFIOGroup) group_list;