From: Sasha Levin <sashal@kernel.org>
To: stable@vger.kernel.org
Cc: Vasant Hegde <vasant.hegde@amd.com>,
Alejandro Jimenez <alejandro.j.jimenez@oracle.com>,
Joao Martins <joao.m.martins@oracle.com>,
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
Joerg Roedel <joerg.roedel@amd.com>,
Sasha Levin <sashal@kernel.org>
Subject: [PATCH 6.6.y] iommu/amd/pgtbl: Fix possible race while increase page table level
Date: Sun, 21 Sep 2025 11:39:16 -0400
Message-ID: <20250921153916.2944533-1-sashal@kernel.org>
In-Reply-To: <2025092102-unbutton-entire-9371@gregkh>

From: Vasant Hegde <vasant.hegde@amd.com>

[ Upstream commit 1e56310b40fd2e7e0b9493da9ff488af145bdd0c ]

The AMD IOMMU host page table implementation supports dynamic page table levels
(up to 6 levels), starting with a 3-level configuration that expands based on
IOVA address. The kernel maintains a root pointer and current page table level
to enable proper page table walks in alloc_pte()/fetch_pte() operations.

The IOMMU IOVA allocator initially starts with 32-bit addresses and, once
they are exhausted, switches to 64-bit addresses (the maximum address is
determined by the IOMMU and device DMA capability). To support larger IOVAs,
the AMD IOMMU driver increases the page table level.
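
As a rough illustration of why a larger IOVA needs a deeper table, the
sketch below computes how many levels are required to cover an address. It
assumes 4 KiB pages with 9 index bits per level and a 6-level table covering
the full 64-bit space. The helper names (level_shift, level_max_iova,
levels_needed) are hypothetical and are not the driver's own macros.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 12 offset bits (4 KiB pages) plus 9 index bits per level. */
static unsigned int level_shift(int level)
{
	return 12u + 9u * (unsigned int)level;
}

/* Highest IOVA reachable with a table of the given depth (1..6). */
static uint64_t level_max_iova(int level)
{
	assert(level >= 1 && level <= 6);
	/* Assumption: the 6-level case is treated as the full 64-bit space. */
	return level < 6 ? (UINT64_C(1) << level_shift(level)) - 1 : UINT64_MAX;
}

/* Start at 3 levels, as the commit message describes, and grow on demand. */
static int levels_needed(uint64_t iova)
{
	int level = 3;

	while (iova > level_max_iova(level))
		level++;
	return level;
}
```

Under these assumptions a 3-level table covers 39 bits of IOVA, so the first
level increase would be expected only after the allocator leaves the 32-bit
range and hands out an address at or above 1 << 39.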

But in the unmap path (iommu_v1_unmap_pages()), fetch_pte() reads
pgtable->[root/mode] without a lock. So it is possible that, in an extreme
corner case, while increase_address_space() is updating pgtable->[root/mode],
fetch_pte() reads the wrong page table level (pgtable->mode). It then compares
that value with the level encoded in the page table and returns NULL. This
causes the iommu_unmap ops to fail, and the upper layer may retry or log a
WARN_ON.

  CPU 0                                        CPU 1
  ------                                       ------
  map pages                                    unmap pages
  alloc_pte() -> increase_address_space()      iommu_v1_unmap_pages() -> fetch_pte()
    pgtable->root = pte (new root value)
                                                 READ pgtable->[mode/root]
                                                   Reads new root, old mode
    Updates mode (pgtable->mode += 1)

Since page table level updates are infrequent and already synchronized with a
spinlock, use a seqcount to enable lock-free reads on the read path.
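
The seqcount idea can be sketched in userspace with C11 atomics. This is not
the kernel's seqcount_t implementation, only a minimal model of the same
protocol under stated assumptions: the writer makes the counter odd while it
updates root and mode, and the reader retries until it sees the same even
value before and after its reads. struct pgtable_sketch and the sketch_*
functions are hypothetical names, and writers are assumed to be serialized by
the caller (the driver uses a spinlock for that).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the root/mode pair guarded by the seqcount. */
struct pgtable_sketch {
	atomic_uint seq;          /* even: stable, odd: update in progress */
	_Atomic(uint64_t *) root;
	atomic_int mode;
};

static void sketch_init(struct pgtable_sketch *pt, uint64_t *root, int mode)
{
	atomic_init(&pt->seq, 0);
	atomic_init(&pt->root, root);
	atomic_init(&pt->mode, mode);
}

/* Writer side: publish a new root and bump the level as one unit. */
static void sketch_grow(struct pgtable_sketch *pt, uint64_t *new_root)
{
	unsigned int s = atomic_load_explicit(&pt->seq, memory_order_relaxed);
	int mode = atomic_load_explicit(&pt->mode, memory_order_relaxed);

	assert((s & 1u) == 0);    /* no other writer may be in progress */
	atomic_store_explicit(&pt->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&pt->root, new_root, memory_order_relaxed);
	atomic_store_explicit(&pt->mode, mode + 1, memory_order_relaxed);
	atomic_store_explicit(&pt->seq, s + 2, memory_order_release);
}

/* Reader side: retry until root and mode come from the same generation. */
static void sketch_snapshot(struct pgtable_sketch *pt, uint64_t **root, int *mode)
{
	unsigned int s;

	do {
		/* Wait out an in-progress update (odd counter). */
		do {
			s = atomic_load_explicit(&pt->seq, memory_order_acquire);
		} while (s & 1u);

		*root = atomic_load_explicit(&pt->root, memory_order_relaxed);
		*mode = atomic_load_explicit(&pt->mode, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
	} while (atomic_load_explicit(&pt->seq, memory_order_relaxed) != s);
}
```

In this model a reader never returns a new root paired with an old mode,
which is the mismatch shown in the timeline above. The cost is an occasional
retry on the read path while an update is in flight.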
Fixes: 754265bcab7 ("iommu/amd: Fix race in increase_address_space()")
Reported-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Cc: stable@vger.kernel.org
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
[ Adapted pgtable->mode and pgtable->root to use domain->iop.mode and domain->iop.root ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/iommu/amd/amd_iommu_types.h | 1 +
drivers/iommu/amd/io_pgtable.c | 26 ++++++++++++++++++++++----
2 files changed, 23 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 7dc30c2b56b30..d872054b874fa 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -540,6 +540,7 @@ struct amd_irte_ops;
container_of((x), struct amd_io_pgtable, pgtbl_cfg)
struct amd_io_pgtable {
+ seqcount_t seqcount; /* Protects root/mode update */
struct io_pgtable_cfg pgtbl_cfg;
struct io_pgtable iop;
int mode;
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 2892aa1b4dc1d..b785d82399983 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -17,6 +17,7 @@
#include <linux/slab.h>
#include <linux/types.h>
#include <linux/dma-mapping.h>
+#include <linux/seqlock.h>
#include <asm/barrier.h>
@@ -171,8 +172,11 @@ static bool increase_address_space(struct protection_domain *domain,
*pte = PM_LEVEL_PDE(domain->iop.mode, iommu_virt_to_phys(domain->iop.root));
+ write_seqcount_begin(&domain->iop.seqcount);
domain->iop.root = pte;
domain->iop.mode += 1;
+ write_seqcount_end(&domain->iop.seqcount);
+
amd_iommu_update_and_flush_device_table(domain);
amd_iommu_domain_flush_complete(domain);
@@ -199,6 +203,7 @@ static u64 *alloc_pte(struct protection_domain *domain,
gfp_t gfp,
bool *updated)
{
+ unsigned int seqcount;
int level, end_lvl;
u64 *pte, *page;
@@ -214,8 +219,14 @@ static u64 *alloc_pte(struct protection_domain *domain,
}
- level = domain->iop.mode - 1;
- pte = &domain->iop.root[PM_LEVEL_INDEX(level, address)];
+ do {
+ seqcount = read_seqcount_begin(&domain->iop.seqcount);
+
+ level = domain->iop.mode - 1;
+ pte = &domain->iop.root[PM_LEVEL_INDEX(level, address)];
+ } while (read_seqcount_retry(&domain->iop.seqcount, seqcount));
+
+
address = PAGE_SIZE_ALIGN(address, page_size);
end_lvl = PAGE_SIZE_LEVEL(page_size);
@@ -292,6 +303,7 @@ static u64 *fetch_pte(struct amd_io_pgtable *pgtable,
unsigned long *page_size)
{
int level;
+ unsigned int seqcount;
u64 *pte;
*page_size = 0;
@@ -299,8 +311,12 @@ static u64 *fetch_pte(struct amd_io_pgtable *pgtable,
if (address > PM_LEVEL_SIZE(pgtable->mode))
return NULL;
- level = pgtable->mode - 1;
- pte = &pgtable->root[PM_LEVEL_INDEX(level, address)];
+ do {
+ seqcount = read_seqcount_begin(&pgtable->seqcount);
+ level = pgtable->mode - 1;
+ pte = &pgtable->root[PM_LEVEL_INDEX(level, address)];
+ } while (read_seqcount_retry(&pgtable->seqcount, seqcount));
+
*page_size = PTE_LEVEL_PAGE_SIZE(level);
while (level > 0) {
@@ -524,6 +540,8 @@ static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *coo
cfg->oas = IOMMU_OUT_ADDR_BIT_SIZE,
cfg->tlb = &v1_flush_ops;
+ seqcount_init(&pgtable->seqcount);
+
pgtable->iop.ops.map_pages = iommu_v1_map_pages;
pgtable->iop.ops.unmap_pages = iommu_v1_unmap_pages;
pgtable->iop.ops.iova_to_phys = iommu_v1_iova_to_phys;
--
2.51.0