From: Jason Gunthorpe <jgg@nvidia.com>
To: Baolu Lu <baolu.lu@linux.intel.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
"Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com>,
"intel-gfx@lists.freedesktop.org"
<intel-gfx@lists.freedesktop.org>,
"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
"De Marchi, Lucas" <lucas.demarchi@intel.com>,
"Kurmi, Suresh Kumar" <suresh.kumar.kurmi@intel.com>,
"Saarinen, Jani" <jani.saarinen@intel.com>,
"Auld, Matthew" <matthew.auld@intel.com>,
"iommu@lists.linux.dev" <iommu@lists.linux.dev>
Subject: Re: REGRESSION on linux-next (next-20251106)
Date: Tue, 18 Nov 2025 08:35:13 -0400
Message-ID: <20251118123513.GJ10864@nvidia.com>
In-Reply-To: <1843821d-c3ca-480d-909c-2331521f6932@linux.intel.com>
On Tue, Nov 18, 2025 at 07:29:22PM +0800, Baolu Lu wrote:
> On 11/18/2025 3:47 PM, Tian, Kevin wrote:
> > > From: Baolu Lu <baolu.lu@linux.intel.com>
> > > Sent: Tuesday, November 18, 2025 2:24 PM
> > >
> > > On 11/18/25 12:04, Tian, Kevin wrote:
> > > > > 46 bits is not particularly big... Hmm, I wonder if we have some issue
> > > > > with the sign-extend? iommupt does that properly and IIRC the old code
> > > > > did not. Which of the page table formats is this using second stage or
> > > > > first stage?
> > > > Assume it's first stage for kernel IOVA, if available in hw
> > >
> > > It's the first stage (x86_64 fmt) according to the PASID entry setup:
> > >
> > > IOMMU dmar0: Root Table Address: 0x105a82000
> > > B.D.F Root_entry Context_entry
> > > PASID PASID_table_entry
> > > 00:02.0 0x0000000000000000:0x0000000105a85001
> > > 0x0000000000000000:0x0000000105a84405 0
> > > 0x0000000105a86000:0x0000000000000002:0x0000000000000049
> > >
> >
> > so the 3rd experiment (if the former two doesn't show difference) is
> > to force using second stage to see whether it's caused by the
> > sign-extend logic.
>
> I hardcoded the driver to always use the second stage for paging domain
> translation, and it works now.
>
> IOMMU dmar0: Root Table Address: 0x1049b6000
> B.D.F Root_entry Context_entry PASID PASID_table_entry
> 00:02.0 0x0000000000000000:0x00000001049ba001
> 0x0000000000000000:0x00000001049b9405 0
> 0x0000000000000000:0x0000000000000002:0x00000001049bb089
Okay, that is a great finding!

So either it is something about the sign extend or something about
the x86_64 format. vtdss shares all of the code around cache/iotlb
flushing, and it works, so we can say the flushing is working.

1) Can you run the test with CONFIG_DEBUG_GENERIC_PT=y? Let's see if
pt_check_install_leaf_args() fails.

2) Let's try disabling the sign extend feature:
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2818,8 +2818,7 @@ intel_iommu_domain_alloc_first_stage(struct device *dev,
 	else
 		cfg.common.hw_max_vasz_lg2 = 48;
 	cfg.common.hw_max_oasz_lg2 = 52;
-	cfg.common.features = BIT(PT_FEAT_SIGN_EXTEND) |
-			      BIT(PT_FEAT_FLUSH_RANGE);
+	cfg.common.features = BIT(PT_FEAT_FLUSH_RANGE);
 	/* First stage always uses scalable mode */
 	if (!ecap_smpwc(iommu->ecap))
 		cfg.common.features |= BIT(PT_FEAT_DMA_INCOHERENT);
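
For reference, a minimal user-space sketch of what sign extension means for a
48-bit first-stage address, which is the behavior PT_FEAT_SIGN_EXTEND is
assumed to model. The helper names are hypothetical and are not kernel APIs.
The sketch also assumes the compiler implements right shift of a negative
value as an arithmetic shift, as GCC and Clang do.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): sign-extend a virtual address
 * from bit (vasz_lg2 - 1) up to bit 63.
 */
static uint64_t sign_extend_va(uint64_t va, unsigned int vasz_lg2)
{
	unsigned int shift;

	/* A shift by 64 would be undefined, so reject vasz_lg2 == 0. */
	assert(vasz_lg2 > 0 && vasz_lg2 <= 64);
	shift = 64 - vasz_lg2;

	/*
	 * Move the top implemented bit into bit 63, then shift back.
	 * Assumption: >> on a negative int64_t is an arithmetic shift.
	 */
	return (uint64_t)((int64_t)(va << shift) >> shift);
}

/* An address is canonical when sign-extending it changes nothing. */
static int is_canonical(uint64_t va, unsigned int vasz_lg2)
{
	return sign_extend_va(va, vasz_lg2) == va;
}
```

With vasz_lg2 = 48, sign_extend_va(1ULL << 47, 48) yields 0xffff800000000000,
while 1ULL << 46 is returned unchanged, so an IOVA that only needs 46 bits
is not altered by sign extension.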
3) Let's validate the mapping:
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2572,6 +2572,21 @@ int iommu_map_nosync(struct iommu_domain *domain, unsigned long iova,
 	else
 		trace_map(orig_iova, orig_paddr, orig_size);
+	if (!ret) {
+		paddr = orig_paddr;
+		for (iova = orig_iova; iova < orig_iova + orig_size; iova += PAGE_SIZE) {
+			phys_addr_t pt_paddr = ops->iova_to_phys(domain, iova);
+
+			if (pt_paddr != paddr) {
+				pr_warn("mapping: Bad physical storage %lx != %lx at %lx\n",
+					(unsigned long)paddr,
+					(unsigned long)pt_paddr, iova);
+				break;
+			}
+			paddr += PAGE_SIZE;
+		}
+	}
+
Maybe the physical address is getting truncated for some reason?
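
To illustrate the kind of failure that check is looking for, here is a
user-space sketch of the same page-by-page comparison. Everything in it is
hypothetical: struct toy_domain and toy_iova_to_phys() stand in for the
domain and ops->iova_to_phys(), and the oasz_lg2 mask models a page table
that silently drops high output-address bits. It is not how iommupt stores
addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL

/* Hypothetical linear mapping whose output is truncated to oasz_lg2 bits. */
struct toy_domain {
	uint64_t iova_base;
	uint64_t paddr_base;
	unsigned int oasz_lg2;
};

/* Stand-in for ops->iova_to_phys(); models the suspected truncation. */
static uint64_t toy_iova_to_phys(const struct toy_domain *d, uint64_t iova)
{
	uint64_t paddr = d->paddr_base + (iova - d->iova_base);
	uint64_t mask = d->oasz_lg2 >= 64 ? UINT64_MAX
					  : (1ULL << d->oasz_lg2) - 1;

	return paddr & mask;
}

/*
 * Walk the range one page at a time and return the offset of the first
 * page whose translation differs from the requested physical address,
 * or size if every page matches.
 */
static uint64_t first_bad_offset(const struct toy_domain *d, uint64_t size)
{
	uint64_t off;

	assert(d != NULL);
	for (off = 0; off < size; off += SKETCH_PAGE_SIZE) {
		uint64_t want = d->paddr_base + off;

		if (toy_iova_to_phys(d, d->iova_base + off) != want)
			return off;
	}
	return size;
}
```

In this model, a physical base of 0x105a86000 with oasz_lg2 = 52 passes,
while the same base with oasz_lg2 = 32 mismatches on the first page because
bit 32 is lost.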
4) Please collect the map/unmap traces, including the return code.
Jason