From: Peter Xu <peterx@redhat.com>
To: Zhenzhong Duan <zhenzhong.duan@intel.com>,
Jason Gunthorpe <jgg@nvidia.com>
Cc: qemu-devel@nongnu.org, mst@redhat.com, jasowang@redhat.com,
pbonzini@redhat.com, richard.henderson@linaro.org,
eduardo@habkost.net, marcel.apfelbaum@gmail.com,
alex.williamson@redhat.com, clg@redhat.com, david@redhat.com,
philmd@linaro.org, kwankhede@nvidia.com, cjia@nvidia.com,
yi.l.liu@intel.com, chao.p.peng@intel.com
Subject: Re: [PATCH v3 5/5] intel_iommu: Optimize out some unnecessary UNMAP calls
Date: Thu, 8 Jun 2023 10:05:08 -0400 [thread overview]
Message-ID: <ZIHgFFSaBJWFUNd7@x1n> (raw)
In-Reply-To: <20230608095231.225450-6-zhenzhong.duan@intel.com>
On Thu, Jun 08, 2023 at 05:52:31PM +0800, Zhenzhong Duan wrote:
> Commit 63b88968f1 ("intel-iommu: rework the page walk logic") adds logic
> to record mapped IOVA ranges so we only need to send MAP or UNMAP when
> necessary. But there is still a corner case of unnecessary UNMAP.
>
> During invalidation, either domain- or device-selective, we only need to
> unmap when there are recorded mapped IOVA ranges, presuming most OSes
> allocate IOVA ranges contiguously, e.g. on x86, Linux sets up mappings
> from 0xffffffff downwards.
>
> Strace shows each UNMAP ioctl taking 0.000014s (14us), and we have 28 such
> ioctl()s in one invalidation, as the two notifiers on x86 are split into
> power-of-2 pieces.
>
> ioctl(48, VFIO_IOMMU_UNMAP_DMA, 0x7ffffd5c42f0) = 0 <0.000014>
Thanks for the numbers, but for a fair comparison IMHO it needs to be a
before/after comparison of the whole time used to unmap the AS.  It'll be
great to have finer-grained measurements like each ioctl, but the total
time used should be more important (especially to capture the "after"
case).  Side note: I don't think each UNMAP ioctl will take the same time;
it should matter whether a mapping exists.
Actually it's hard to tell because this also depends on what's in the iova
tree... but still, at least we know how it works in some cases.
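To illustrate the check being discussed (skip the UNMAP notification when no
recorded mapping overlaps the invalidated range), here is a minimal standalone
sketch.  The types and helper names are made up for illustration; this is not
QEMU's actual iova-tree API:

```c
/* Illustration only: a simplified stand-in for the recorded-mapping check.
 * Names (MappedRange, maybe_unmap) are hypothetical, not QEMU code. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t iova;
    uint64_t size;
} MappedRange;

/* Return true iff any recorded range overlaps [iova, iova + size). */
static bool ranges_overlap(const MappedRange *recorded, size_t n,
                           uint64_t iova, uint64_t size)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t a_end = recorded[i].iova + recorded[i].size;
        uint64_t b_end = iova + size;
        if (recorded[i].iova < b_end && iova < a_end) {
            return true;
        }
    }
    return false;
}

/* Only issue the (comparatively expensive) UNMAP ioctl when something
 * recorded actually overlaps the invalidated range. */
static int maybe_unmap(const MappedRange *recorded, size_t n,
                       uint64_t iova, uint64_t size,
                       int (*do_unmap)(uint64_t iova, uint64_t size))
{
    if (!ranges_overlap(recorded, n, iova, size)) {
        return 0;               /* nothing recorded: skip the ioctl */
    }
    return do_unmap(iova, size);
}
```

The point of the patch is exactly the early-return branch: an invalidation over
a range with nothing recorded costs no ioctl at all.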
>
> The other purpose of this patch is to eliminate a noisy error log when we
> work with IOMMUFD.  It looks like the duplicate UNMAP call will fail with
> IOMMUFD while it always succeeds with the legacy container.  This behavior
> difference leads to the error log below for IOMMUFD:
>
> IOMMU_IOAS_UNMAP failed: No such file or directory
> vfio_container_dma_unmap(0x562012d6b6d0, 0x0, 0x80000000) = -2 (No such file or directory)
> IOMMU_IOAS_UNMAP failed: No such file or directory
> vfio_container_dma_unmap(0x562012d6b6d0, 0x80000000, 0x40000000) = -2 (No such file or directory)
> ...
My gut feeling is that the major motivation is actually this (not the perf):
tens of ~14us ioctls are really nothing for a rare event.
Jason Wang raised a question in previous version and I think JasonG's reply
is here:
https://lore.kernel.org/r/ZHTaQXd3ZybmhCLb@nvidia.com
JasonG: sorry, I know next to zero about the iommufd API yet, but you said:
The VFIO emulation functions should do whatever VFIO does, is there
a mistake there?
IIUC what VFIO does here is return success when unmapping over nothing,
rather than failing like iommufd does.  Curious (like JasonW) why that
retval?  I'd assume that for returning "how much was unmapped" we can at
least still return 0 for nothing.
Are you perhaps suggesting that we can handle the -ENOENT on the QEMU side
for iommufd only (a question to Yi?)?
If that's already a kernel ABI, I'm not sure whether it's even discussable,
but just raising it.
--
Peter Xu