From: Lu Baolu <baolu.lu@linux.intel.com>
To: Joerg Roedel <joro@8bytes.org>
Cc: Zhenzhong Duan <zhenzhong.duan@intel.com>,
Bjorn Helgaas <bhelgaas@google.com>,
Jason Gunthorpe <jgg@nvidia.com>,
iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH 10/10] iommu/vt-d: Simplify calculate_psi_aligned_address()
Date: Thu, 2 Apr 2026 14:57:33 +0800
Message-ID: <20260402065734.1687476-11-baolu.lu@linux.intel.com>
In-Reply-To: <20260402065734.1687476-1-baolu.lu@linux.intel.com>
From: Jason Gunthorpe <jgg@nvidia.com>
This does far too much math for the simple task of finding a power
of 2 that fully spans the given range. Use fls() directly on the XOR
of the start and last addresses, which exposes where their common
binary prefix ends.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v1-f175e27af136+11647-iommupt_inv_vtd_jgg@nvidia.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
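As an aside for reviewers (not part of the patch): a minimal userspace
sketch of the fls-on-XOR idea. fls_long() is open-coded here, and all
names and values below are illustrative, not the kernel's.

	#include <stdio.h>

	/* 1-based index of the most significant set bit, 0 if none
	 * (mirrors the kernel's fls_long() semantics). */
	static unsigned int fls_long(unsigned long x)
	{
		return x ? (unsigned int)(8 * sizeof(long)) - __builtin_clzl(x) : 0;
	}

	int main(void)
	{
		unsigned long start = 0x12345000UL; /* first byte of the range */
		unsigned long last  = 0x12346fffUL; /* last byte of the range */
		unsigned int sz_lg2 = fls_long(start ^ last);

		/* Bits below bit sz_lg2 may differ between start and last;
		 * everything from bit sz_lg2 upward is their common prefix,
		 * so aligning start down to a 1 << sz_lg2 boundary yields a
		 * power-of-2 region that spans the whole range. */
		printf("base 0x%lx, size 2^%u bytes\n",
		       start & ~((1UL << sz_lg2) - 1), sz_lg2);
		return 0;
	}

For these values the XOR is 0x3fff, fls gives 14, and the output is a
16 KiB region at 0x12344000 covering [start, last], matching what the
new calculate_psi_aligned_address() computes before converting to a
page-based size_order.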
drivers/iommu/intel/cache.c | 49 ++++++++++++-------------------------
1 file changed, 16 insertions(+), 33 deletions(-)
diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c
index be8410f0e841..54dd9f7323bd 100644
--- a/drivers/iommu/intel/cache.c
+++ b/drivers/iommu/intel/cache.c
@@ -254,37 +254,25 @@ void cache_tag_unassign_domain(struct dmar_domain *domain,
}
static unsigned long calculate_psi_aligned_address(unsigned long start,
- unsigned long end,
- unsigned long *_mask)
+ unsigned long last,
+ unsigned long *size_order)
{
- unsigned long pages = aligned_nrpages(start, end - start + 1);
- unsigned long aligned_pages = __roundup_pow_of_two(pages);
- unsigned long bitmask = aligned_pages - 1;
- unsigned long mask = ilog2(aligned_pages);
- unsigned long pfn = IOVA_PFN(start);
+ unsigned int sz_lg2;
- /*
- * PSI masks the low order bits of the base address. If the
- * address isn't aligned to the mask, then compute a mask value
- * needed to ensure the target range is flushed.
- */
- if (unlikely(bitmask & pfn)) {
- unsigned long end_pfn = pfn + pages - 1, shared_bits;
-
- /*
- * Since end_pfn <= pfn + bitmask, the only way bits
- * higher than bitmask can differ in pfn and end_pfn is
- * by carrying. This means after masking out bitmask,
- * high bits starting with the first set bit in
- * shared_bits are all equal in both pfn and end_pfn.
- */
- shared_bits = ~(pfn ^ end_pfn) & ~bitmask;
- mask = shared_bits ? __ffs(shared_bits) : MAX_AGAW_PFN_WIDTH;
+ /* Compute a sz_lg2 that spans start and last */
+ start &= GENMASK(BITS_PER_LONG - 1, VTD_PAGE_SHIFT);
+ sz_lg2 = fls_long(start ^ last);
+ if (sz_lg2 <= 12) {
+ *size_order = 0;
+ return start;
+ }
+ if (unlikely(sz_lg2 >= MAX_AGAW_PFN_WIDTH)) {
+ *size_order = MAX_AGAW_PFN_WIDTH;
+ return 0;
}
- *_mask = mask;
-
- return ALIGN_DOWN(start, VTD_PAGE_SIZE << mask);
+ *size_order = sz_lg2 - VTD_PAGE_SHIFT;
+ return start & GENMASK(BITS_PER_LONG - 1, sz_lg2);
}
static void qi_batch_flush_descs(struct intel_iommu *iommu, struct qi_batch *batch)
@@ -441,12 +429,7 @@ void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start,
struct cache_tag *tag;
unsigned long flags;
- if (start == 0 && end == ULONG_MAX) {
- addr = 0;
- mask = MAX_AGAW_PFN_WIDTH;
- } else {
- addr = calculate_psi_aligned_address(start, end, &mask);
- }
+ addr = calculate_psi_aligned_address(start, end, &mask);
spin_lock_irqsave(&domain->cache_lock, flags);
list_for_each_entry(tag, &domain->cache_tags, node) {
--
2.43.0