From: Jason Gunthorpe <jgg@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
Thanos Makatos <thanos.makatos@nutanix.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"Martins, Joao" <joao.m.martins@oracle.com>,
John Levon <john.levon@nutanix.com>,
"john.g.johnson@oracle.com" <john.g.johnson@oracle.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
Eric Auger <eric.auger@redhat.com>,
David Gibson <david@gibson.dropbear.id.au>,
"Liu, Yi L" <yi.l.liu@intel.com>
Subject: Re: iommufd dirty page logging overview
Date: Fri, 18 Mar 2022 12:55:36 -0300
Message-ID: <20220318155536.GQ11336@nvidia.com>
In-Reply-To: <20220318090636.6ea05cfd.alex.williamson@redhat.com>
On Fri, Mar 18, 2022 at 09:06:36AM -0600, Alex Williamson wrote:
> There are advantages to each, the 2nd option gives the user more
> visibility, more options to thread, but it also possibly duplicates
> significant data.
The coming mlx5 tracker won't require kernel storage at all, so I
think this is something to tackle if/when someone comes with a device
that uses the CPU to somehow track dirties (probably via an mdev that
is already tracking DMA?)
One thought is to let vfio coordinate a single allocation of a dirty
bitmap xarray among drivers.
Even in the worst case of duplicated bitmaps the memory usage is not
fatally terrible: at one bit per 4K page it is about 32MB per 1TB of
guest memory.
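As a rough check of that figure, here is a minimal sketch of the
arithmetic. It assumes one dirty bit per page and a 4KiB page size;
dirty_bitmap_bytes() is a hypothetical helper for illustration, not an
existing kernel or iommufd function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes needed for a dirty bitmap with one bit per page covering
 * mem_bytes of guest memory. Both divisions round up.
 */
static uint64_t dirty_bitmap_bytes(uint64_t mem_bytes, uint64_t page_size)
{
	uint64_t pages;

	assert(page_size != 0);
	pages = (mem_bytes + page_size - 1) / page_size;
	return (pages + 7) / 8;
}
```

With mem_bytes = 1TiB (1ULL << 40) and page_size = 4096 this gives
2^28 pages, so 2^25 bytes, which is 32MiB per copy of the bitmap.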
> The unmap scenario above is also not quite as cohesive if the user
> needs to poll devices for dirty pages in the unmapped range after
> performing the unmap. It might make sense if the iommufd could
> generate the merged bitmap on unmap as the threading optimization
> probably has less value in that case.
I don't think of it this way. The device tracker has no idea about
munmap/mmap; it just tracks IOVA dirties.
Which is a problem because any time we alter the IOVA to PFN map we
need to read the device dirties and correlate them back to the actual
CPU pages that were dirtied.
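To make the correlation step concrete, here is a minimal sketch,
assuming a flat per-range table of the IOVA to PFN map as it stood
before the change. The names fold_iova_dirties() and iova_to_pfn, and
the flat-table layout, are hypothetical and only for illustration; a
real implementation would walk whatever structure holds the mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int test_bit64(const uint64_t *bm, size_t n)
{
	return (bm[n / 64] >> (n % 64)) & 1;
}

static void set_bit64(uint64_t *bm, size_t n)
{
	bm[n / 64] |= 1ULL << (n % 64);
}

/*
 * Fold a device's IOVA-indexed dirty bitmap into a PFN-indexed one.
 * iova_to_pfn[i] is the PFN that backed IOVA page i when the device
 * wrote to it, so this must run before the mapping is altered.
 * Returns the number of dirty IOVA pages that were folded in.
 */
static size_t fold_iova_dirties(const uint64_t *iova_dirty,
				const uint64_t *iova_to_pfn, size_t npages,
				uint64_t *pfn_dirty, size_t npfns)
{
	size_t i, count = 0;

	for (i = 0; i < npages; i++) {
		if (!test_bit64(iova_dirty, i))
			continue;
		assert(iova_to_pfn[i] < npfns);
		set_bit64(pfn_dirty, iova_to_pfn[i]);
		count++;
	}
	return count;
}
```

The point of the sketch is the ordering: once the IOVA to PFN map has
changed, iova_to_pfn no longer describes the pages the device actually
dirtied, so the fold has to happen first.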
unmap is one case, but nested paging invalidation is another much
nastier problem. How exactly that can work is a bit of a mystery to me
as the ultimate IOVA to PFN mapping is rather fuzzy/racy from the view
of the hypervisor.
So, I wouldn't invest effort to make a special kernel API to link
unmap and leave invalidate unsolved. Just keeping them separated seems
to make more sense, and userspace knows better what it is doing. E.g.
vIOMMU cases need to synchronize the dirty, but other things like
memory unplug don't.
From another perspective, what we want is for the system iommu to be
as lean and fast as possible because in, say, 10 years it will be the
dominant way to do this job. This is another reason I'm reluctant to
commingle it with device trackers in a way that might limit it.
Though overall I think threading is the key argument. Given this is
time critical for stop copy, and all trackers can report fully in
parallel, we should strive to allow userspace to thread them.
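As a sketch of what userspace threading could look like, assuming each
tracker exposes some blocking "read dirty bitmap" call: the read_dirty
callback, struct tracker_job and the two fake readers below are
hypothetical stand-ins, not an existing vfio or iommufd interface.
Every tracker fills its own private bitmap in its own thread, and the
results are ORed together once all threads have finished.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_TRACKERS 8

/* One dirty tracker (system iommu or a device) and its private bitmap. */
struct tracker_job {
	/* Hypothetical stand-in for the tracker's read-dirty-bitmap call. */
	void (*read_dirty)(uint64_t *bitmap, size_t nwords);
	uint64_t *bitmap;
	size_t nwords;
};

/* Fake trackers for illustration: each reports one dirty page (nwords >= 1). */
static void fake_iommu_read(uint64_t *bitmap, size_t nwords)
{
	memset(bitmap, 0, nwords * sizeof(*bitmap));
	bitmap[0] |= 1ULL << 0;
}

static void fake_device_read(uint64_t *bitmap, size_t nwords)
{
	memset(bitmap, 0, nwords * sizeof(*bitmap));
	bitmap[0] |= 1ULL << 1;
}

static void *tracker_thread(void *arg)
{
	struct tracker_job *job = arg;

	job->read_dirty(job->bitmap, job->nwords);
	return NULL;
}

/*
 * Read all trackers in parallel, then OR their bitmaps into merged.
 * Returns 0 on success or the pthread_create() error number.
 */
static int collect_dirty_parallel(struct tracker_job *jobs, size_t njobs,
				  uint64_t *merged, size_t nwords)
{
	pthread_t tids[MAX_TRACKERS];
	size_t started, i, w;
	int err = 0;

	assert(njobs <= MAX_TRACKERS);
	for (started = 0; started < njobs; started++) {
		err = pthread_create(&tids[started], NULL, tracker_thread,
				     &jobs[started]);
		if (err)
			break;
	}
	/* Always join what was started, even on a partial failure. */
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	if (err)
		return err;

	memset(merged, 0, nwords * sizeof(*merged));
	for (i = 0; i < njobs; i++) {
		assert(jobs[i].nwords == nwords);
		for (w = 0; w < nwords; w++)
			merged[w] |= jobs[i].bitmap[w];
	}
	return 0;
}
```

Giving each tracker a private bitmap is what costs the duplicated
memory discussed earlier, but it means the threads share no state and
need no locking until the final merge.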
Jason
Thread overview: 12+ messages
2022-03-16 23:29 iommufd dirty page logging overview Thanos Makatos
2022-03-16 23:50 ` Jason Gunthorpe
2022-03-18 9:23 ` Tian, Kevin
2022-03-18 12:41 ` Jason Gunthorpe
2022-03-18 15:06 ` Alex Williamson
2022-03-18 15:55 ` Jason Gunthorpe [this message]
2022-03-19 7:54 ` Tian, Kevin
2022-03-19 8:14 ` Tian, Kevin
2022-03-20 3:34 ` Tian, Kevin
2022-03-21 13:30 ` Jason Gunthorpe
2022-03-22 2:40 ` Tian, Kevin
2022-03-17 12:39 ` Joao Martins