From: Jacob Pan <jacob.jun.pan@linux.intel.com>
To: Baolu Lu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux.dev, Kevin Tian <kevin.tian@intel.com>,
Yi Liu <yi.l.liu@intel.com>, Joerg Roedel <joro@8bytes.org>,
Will Deacon <will@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
linux-kernel@vger.kernel.org, jacob.jun.pan@linux.intel.com
Subject: Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
Date: Tue, 9 Apr 2024 10:31:46 -0700
Message-ID: <20240409103146.0d155e45@jacob-builder>
In-Reply-To: <aff42b8f-b757-4422-9ebe-741a4b894b6c@linux.intel.com>
Hi Baolu,
On Tue, 9 Apr 2024 11:12:20 +0800, Baolu Lu <baolu.lu@linux.intel.com>
wrote:
> On 4/9/24 5:03 AM, Jacob Pan wrote:
> > Hi Lu,
>
> Hi Jacob,
>
> >
> > On Sun, 7 Apr 2024 22:42:32 +0800, Lu Baolu<baolu.lu@linux.intel.com>
> > wrote:
> >
> >> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
> >> implementation caches not-present or erroneous translation-structure
> >> entries except the first-stage translation. The caching mode is
> >> unrelated to the device TLB , therefore there is no need to check
> >> it before a device TLB invalidation operation.
> >>
> >> Before the scalable mode is introduced, caching mode is treated as
> >> an indication that the driver is running in a VM guest. This is just
> >> a software contract as shadow page table is the only way to implement
> >> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
> >> the scalable mode is introduced, this doesn't stand for anymore, as
> >> caching mode is not relevant for the first-stage translation. A virtual
> >> IOMMU implementation is free to support first-stage translation only
> >> with caching mode cleared.
> >>
> >> Remove the caching mode check before device TLB invalidation to ensure
> >> compatibility with the scalable mode use cases.
> >>
> > I agree with the changes below, what about this CM check:
> >
> > /* Notification for newly created mappings */
> > static void __mapping_notify_one(struct intel_iommu *iommu,
> >                                  struct dmar_domain *domain,
> >                                  unsigned long pfn, unsigned int pages)
> > {
> >         /*
> >          * It's a non-present to present mapping. Only flush if caching mode
> >          * and second level.
> >          */
> >         if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
> >                 iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
> >
> > We are still tying devTLB flush to CM=1, no?
>
> __mapping_notify_one() is called in the path where some PTEs are changed
> from non-present to present.
>
> In this scenario,
>
> - if CM is set and first-stage translation is not used, the IOTLB caches
> are required to be explicitly flushed.
> - else if hardware requires write buffer flushing, do it.
> - Otherwise, no op.
> - devtlb invalidation is irrelevant to this path.
>
> The code after the fix appears to do the right thing. devTLB is not
> invalidated in iommu_flush_iotlb_psi() since it's a map (map == 1).
>
> Or perhaps I overlooked anything?
My confusion is that, on one hand, this patch says the devTLB flush has
nothing to do with CM. But here, if CM==1, we don't flush the devTLB since
map==1.

If the guest uses SL page tables in the vIOMMU, we don't expose ATS to the
guest. So ATS is not relevant here; it doesn't matter whether it is a map
or an unmap.

Can we remove the map argument from iommu_flush_iotlb_psi(iommu, domain,
pfn, pages, 0, 1)?

Then the devTLB flush will naturally be skipped in the guest (CM=1, SL)
since ATS is not enabled.
iommu_flush_dev_iotlb(domain, addr, mask);
i.e.
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 50eb9aed47cc..ee3e5a1af0c5 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1483,7 +1483,7 @@ static void __iommu_flush_iotlb_psi(struct intel_iommu *iommu, u16 did,
 static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
 				  struct dmar_domain *domain,
 				  unsigned long pfn, unsigned int pages,
-				  int ih, int map)
+				  int ih)
 {
 	unsigned int aligned_pages = __roundup_pow_of_two(pages);
 	unsigned int mask = ilog2(aligned_pages);
@@ -1501,12 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
 	else
 		__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
 
-	/*
-	 * In caching mode, changes of pages from non-present to present require
-	 * flush. However, device IOTLB doesn't need to be flushed in this case.
-	 */
-	if (!cap_caching_mode(iommu->cap) || !map)
-		iommu_flush_dev_iotlb(domain, addr, mask);
+	iommu_flush_dev_iotlb(domain, addr, mask);
 }
 
 /* Notification for newly created mappings */
@@ -1518,7 +1513,7 @@ static void __mapping_notify_one(struct intel_iommu *iommu, struct dmar_domain *
 	 * and second level.
 	 */
 	if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
-		iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
+		iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0);
 	else
 		iommu_flush_write_buffer(iommu);
 }
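
To make the comparison concrete, here is a small stand-alone C model of the
decision logic before and after the diff above. It is only a sketch, not
kernel code: struct flush_ctx and the three helper functions are hypothetical
names made up for illustration, and it assumes (as argued above) that
iommu_flush_dev_iotlb() only issues invalidations when ATS is enabled.

```c
#include <stdbool.h>

/* Hypothetical model of the inputs to the flush decision (not a kernel type). */
struct flush_ctx {
	bool caching_mode;	/* stands in for cap_caching_mode(iommu->cap) */
	bool first_level;	/* stands in for domain->use_first_level */
	bool ats_enabled;	/* assumption: ATS is enabled for an attached device */
};

/* Does __mapping_notify_one() take the iommu_flush_iotlb_psi() branch? */
static bool map_path_calls_psi(const struct flush_ctx *c)
{
	return c->caching_mode && !c->first_level;
}

/*
 * Existing logic in iommu_flush_iotlb_psi(): the devTLB flush is attempted
 * only when CM is clear or the call is not for a map, and (by assumption)
 * it has an effect only if ATS is enabled.
 */
static bool devtlb_flushed_current(const struct flush_ctx *c, bool map)
{
	if (!c->caching_mode || !map)
		return c->ats_enabled;
	return false;
}

/*
 * Logic with the diff applied: the map argument and the CM check are gone,
 * so only the ATS state decides whether a devTLB flush happens.
 */
static bool devtlb_flushed_proposed(const struct flush_ctx *c)
{
	return c->ats_enabled;
}
```

In this model, a guest with CM=1, SL and no ATS gets no devTLB flush either
way. The two versions differ only when ATS is enabled together with CM=1 on
the map path, where the diff adds a devTLB flush that the current code skips.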
> >
> > If we are running in the guest with second level page table (shadowed),
> > can we decide if devTLB flush is needed based on ATS enable just as the
> > rest of the cases?
>
> I think the ATS check should be consistent. It's generic no matter how
> the IOMMU is implemented (in hardware or emulated in software).
>
> Best regards,
> baolu
Thanks,
Jacob