From: Jason Gunthorpe <jgg@nvidia.com>
To: Baolu Lu <baolu.lu@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>,
iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
Robin Murphy <robin.murphy@arm.com>,
Will Deacon <will@kernel.org>, Kevin Tian <kevin.tian@intel.com>,
patches@lists.linux.dev, Tina Zhang <tina.zhang@intel.com>,
Wei Wang <wei.w.wang@intel.com>
Subject: Re: [PATCH 5/9] iommupt: Add the Intel VT-D second stage page table format
Date: Tue, 29 Jul 2025 20:05:18 -0300 [thread overview]
Message-ID: <20250729230518.GD82395@nvidia.com> (raw)
In-Reply-To: <529ca56f-5bdc-47ad-8d0d-6b3b5c800c7a@linux.intel.com>
On Tue, Jul 22, 2025 at 11:11:02AM +0800, Baolu Lu wrote:
> > diff --git a/drivers/iommu/generic_pt/Kconfig b/drivers/iommu/generic_pt/Kconfig
> > index 953856e4b48369..b631cf00eba559 100644
> > --- a/drivers/iommu/generic_pt/Kconfig
> > +++ b/drivers/iommu/generic_pt/Kconfig
> > @@ -56,6 +56,17 @@ config IOMMU_PT_RISCV64
> > Selected automatically by an IOMMU driver that uses this format.
> > +config IOMMU_PT_VTDSS
> > + tristate "IOMMU page table for Intel VT-D IOMMU Second Stage"
> > + depends on !GENERIC_ATOMIC64 # for cmpxchg64
> > + default n
>
> The default value is a "n". So what's the value of putting a "default n"
> here?
I do not know, I cargo culted this from somewhere else, lots of
examples. Do you think we should drop it?
> > @@ -72,6 +83,7 @@ config IOMMU_PT_KUNIT_TEST
> > depends on KUNIT
> > depends on IOMMU_PT_AMDV1 || !IOMMU_PT_AMDV1
> > depends on IOMMU_PT_RISCV64 || !IOMMU_PT_RISCV64
> > + depends on IOMMU_PT_VTDSS || !IOMMU_PT_VTDSS
>
> This line implies that the IOMMU_PT kunit test functions regardless of
> whether IOMMU_PT_VTDSS is enabled. But if IOMMU_PT_VTDSS is enabled,
> this kunit test will also cover it. Do I understand this correctly?
Yes.
The kunit test will build a unique:
kunit_test_suites(&NS(generic_pt_suite));
For VTDSS if it is compiled in and that macro eventually drops an ELF section:
__used __section(".kunit_test_suites") = { __VA_ARGS__ }
And then the linker and some kunit magic will automatically run it
just be virtue of having compiled it.
The odd || expression is a kconfig trick that ensures that the kunit
and vtdss have compatible modularity. ie the kunit cannot be built in
while the vtdss is modular.
> > +static inline enum pt_entry_type vtdss_pt_load_entry_raw(struct pt_state *pts)
> > +{
> > + const u64 *tablep = pt_cur_table(pts, u64);
> > + u64 entry;
> > +
> > + pts->entry = entry = READ_ONCE(tablep[pts->index]);
> > + if (!entry)
> > + return PT_ENTRY_EMPTY;
>
> Would it be more reasonable to check the present bit of the entry
> here?
VTDSS has no present bit? Did I misunderstand that in the spec?
IIRC this design uses all bits as 0 to mean non-present.
> Otherwise, it implies that when a PTE is non-present, all fields must be
> cleared. I'm concerned about any potential corner cases.
Since this code makes all the PTEs it does do that correctly, and we
have a great test suite that looks for corner cases :)
> > +static inline int vtdss_pt_iommu_set_prot(struct pt_common *common,
> > + struct pt_write_attrs *attrs,
> > + unsigned int iommu_prot)
> > +{
> > + u64 pte = 0;
> > +
> > + /*
> > + * VTDSS does not have a present bit, so we tell if any entry is present
> > + * by checking for R or W.
> > + */
> > + if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
> > + return -EINVAL;
> > +
> > + if (iommu_prot & IOMMU_READ)
> > + pte |= VTDSS_FMT_R;
> > + if (iommu_prot & IOMMU_WRITE)
> > + pte |= VTDSS_FMT_W;
> > + if (pt_feature(common, PT_FEAT_VTDSS_FORCE_COHERENCE))
> > + pte |= VTDSS_FMT_SNP;
>
> The comment says:
>
> /*
> * The PTEs are set to prevent cache incoherent traffic, such as PCI no
> * snoop. This is set either at creation time or before the first map
> * operation.
> */
> PT_FEAT_VTDSS_FORCE_COHERENCE = PT_FEAT_FMT_START,
>
> It seems that you are okay with setting this feature after iommu_pt
> creation and before the first map operation?
Yes it works as it is now.
> Do we still need to reform the enforce_cache_coherency callback
> mechanism?
I think that was motivated by the code in the driver, not so much this
code? I can't recall the detail right now but I didn't think it was a
very high priority.
Jason
next prev parent reply other threads:[~2025-07-29 23:10 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-16 19:57 [PATCH 0/9] Convert Intel VT-D to use the generic iommu page table Jason Gunthorpe
2025-07-16 19:57 ` [PATCH 1/9] iommu/pages: Add support for a incoherent IOMMU page walker Jason Gunthorpe
2025-07-21 8:41 ` Baolu Lu
2025-07-29 22:32 ` Jason Gunthorpe
2025-07-30 1:49 ` Baolu Lu
2025-08-11 21:21 ` Jason Gunthorpe
2025-08-15 11:28 ` Tian, Kevin
2025-08-22 21:13 ` Jason Gunthorpe
2025-07-16 19:57 ` [PATCH 2/9] iommupt: Add basic support for SW bits in the page table Jason Gunthorpe
2025-08-15 11:29 ` Tian, Kevin
2025-08-18 23:35 ` Jason Gunthorpe
2025-07-16 19:57 ` [PATCH 3/9] iommupt: Use the incoherent start/stop functions for PT_FEAT_DMA_INCOHERENT Jason Gunthorpe
2025-08-15 11:35 ` Tian, Kevin
2025-08-22 20:45 ` Jason Gunthorpe
2025-07-16 19:57 ` [PATCH 4/9] iommupt: Flush the CPU cache after any writes to the page table Jason Gunthorpe
2025-07-16 19:57 ` [PATCH 5/9] iommupt: Add the Intel VT-D second stage page table format Jason Gunthorpe
2025-07-22 3:11 ` Baolu Lu
2025-07-29 23:05 ` Jason Gunthorpe [this message]
2025-07-30 2:00 ` Baolu Lu
2025-08-22 9:14 ` Tian, Kevin
2025-08-22 14:53 ` Jason Gunthorpe
2025-07-16 19:57 ` [PATCH 6/9] iommupt/x86: Set the dirty bit only for writable PTEs Jason Gunthorpe
2025-07-21 10:02 ` Baolu Lu
2025-07-16 19:57 ` [PATCH 7/9] iommupt/x86: Support SW bits and permit PT_FEAT_DMA_INCOHERENT Jason Gunthorpe
2025-07-22 5:17 ` Baolu Lu
2025-07-29 23:13 ` Jason Gunthorpe
2025-07-30 2:35 ` Baolu Lu
2025-08-22 9:17 ` Tian, Kevin
2025-08-22 14:55 ` Jason Gunthorpe
2025-07-16 19:57 ` [PATCH 8/9] iommu/vt-d: Use the generic iommu page table Jason Gunthorpe
2025-07-22 6:44 ` Baolu Lu
2025-07-29 23:39 ` Jason Gunthorpe
2025-08-22 9:35 ` Tian, Kevin
2025-08-22 20:43 ` Jason Gunthorpe
2025-07-16 19:57 ` [PATCH 9/9] iommupt: Add a kunit test for the SW bits Jason Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250729230518.GD82395@nvidia.com \
--to=jgg@nvidia.com \
--cc=baolu.lu@linux.intel.com \
--cc=dwmw2@infradead.org \
--cc=iommu@lists.linux.dev \
--cc=joro@8bytes.org \
--cc=kevin.tian@intel.com \
--cc=patches@lists.linux.dev \
--cc=robin.murphy@arm.com \
--cc=tina.zhang@intel.com \
--cc=wei.w.wang@intel.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.