* [PATCH v2 00/11] iommu/arm-smmu-v3: Allocate vmid per vsmmu instead of s2_parent
@ 2025-04-15 4:57 Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 01/11] iommu/arm-smmu-v3: Pass in vmid to arm_smmu_make_s2_domain_ste() Nicolin Chen
` (10 more replies)
0 siblings, 11 replies; 16+ messages in thread
From: Nicolin Chen @ 2025-04-15 4:57 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, kevin.tian, praan, nathan, yi.l.liu, peterz, mshavit,
jsnitsel, smostafa, jeff.johnson, zhangzekun11, linux-arm-kernel,
iommu, linux-kernel, shameerali.kolothum.thodi
On a system with multiple physical SMMU instances, multiple vSMMUs can be
allocated for a VM to which devices behind different SMMUs are assigned.
In such a use case, the IPA->PA mappings (i.e. the stage-2 I/O page table)
can be shared across the vSMMU instances.
With a shareable S2 parent domain, it is more natural to store a vmid per
vSMMU instance rather than in the shared S2 domain, since each physical
SMMU instance maintains its own vmid bitmap.
Add a few patches to get the functions ready for the vmid migration. Then
decouple the vmid from S2 parent domains and move its allocation to vSMMU
instances. Note that a regular S2 domain (!nest_parent) has to retain the
s2_cfg and vmid for non-nesting use cases, if the SMMU HW doesn't support
stage 1. Maintain a per-S2-domain vsmmus list and a per-vSMMU ats_devices
list to iterate for S2 cache invalidation cmds and ATC invalidation cmds.
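As a rough illustration of the ownership model described above (not code
from this series), the userspace sketch below lets each physical SMMU keep
its own vmid bitmap while two vSMMUs share one S2 parent. All sketch_*
names and SKETCH_MAX_VMID are hypothetical, and reserving vmid 0 is an
assumption of the sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_VMID 16 /* hypothetical; the real range depends on the HW */

/* Hypothetical stand-in for a physical SMMU owning its own vmid bitmap */
struct sketch_smmu {
	uint32_t vmid_map; /* bit n set => vmid n is in use */
};

/* Hypothetical stand-in for the shared stage-2 (IPA->PA) parent domain */
struct sketch_s2_parent {
	int nr_vsmmus; /* number of vSMMUs currently sharing this domain */
};

/* Hypothetical stand-in for a vSMMU: it owns the vmid, not the S2 parent */
struct sketch_vsmmu {
	struct sketch_smmu *smmu;
	struct sketch_s2_parent *s2_parent;
	uint16_t vmid;
};

static void sketch_smmu_init(struct sketch_smmu *smmu)
{
	smmu->vmid_map = 1u; /* assumption: vmid 0 is never handed out */
}

/* Return the lowest free vmid of this physical SMMU, or -1 if none is left */
static int sketch_vmid_alloc(struct sketch_smmu *smmu)
{
	for (int i = 1; i < SKETCH_MAX_VMID; i++) {
		if (!(smmu->vmid_map & (1u << i))) {
			smmu->vmid_map |= 1u << i;
			return i;
		}
	}
	return -1;
}

/* Bind a vSMMU to a physical SMMU and to a (possibly shared) S2 parent */
static int sketch_vsmmu_init(struct sketch_vsmmu *vsmmu,
			     struct sketch_smmu *smmu,
			     struct sketch_s2_parent *s2_parent)
{
	int vmid = sketch_vmid_alloc(smmu);

	if (vmid < 0)
		return -1;
	vsmmu->smmu = smmu;
	vsmmu->s2_parent = s2_parent;
	vsmmu->vmid = (uint16_t)vmid;
	s2_parent->nr_vsmmus++;
	return 0;
}
```

The point of the sketch is that the two vSMMUs may end up with different
vmids for the same S2 page table, because each vmid comes from a different
physical SMMU's bitmap.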
This is on GitHub:
https://github.com/nicolinc/iommufd/commits/smmuv3_vmid-v2
Changelog
v2
* Add Reviewed-by from Jason and Pranjal
* Add WARN_ON_ONCE(!vmid) in arm_smmu_make_s2_domain_ste()
* Add arm_smmu_s2_parent_can_share() for a compatibility check
* Introduce arm_smmu_s2_parent_tlb_inv_* helpers replacing the non-nesting
routines
* Introduce arm_vsmmu_atc_inv_domain() using a per-vSMMU ats_devices list,
replacing the nested_ats_flush in struct arm_smmu_master_domain
v1
https://lore.kernel.org/all/cover.1741150594.git.nicolinc@nvidia.com/
Thanks
Nicolin
Nicolin Chen (11):
iommu/arm-smmu-v3: Pass in vmid to arm_smmu_make_s2_domain_ste()
iommu/arm-smmu-v3: Pass in smmu/iommu_domain to
__arm_smmu_tlb_inv_range()
iommu/arm-smmu-v3: Share cmdq/cmd helpers with arm-smmu-v3-iommufd
iommu/arm-smmu-v3: Add an inline arm_smmu_tlb_inv_vmid helper
iommu/arm-smmu-v3: Rename arm_smmu_attach_prepare_vmaster
iommu/arm-smmu-v3: Introduce arm_smmu_s2_parent_tlb_ invalidation
helpers
iommu/arm-smmu-v3: Introduce arm_vsmmu_atc_inv_domain()
iommu/arm-smmu-v3: Use vSMMU helpers for S2 and ATC invalidations
iommu/arm-smmu-v3: Clean up nested_ats_flush from master_domain
iommu/arm-smmu-v3: Decouple vmid from S2 nest_parent domain
iommu/arm-smmu-v3: Allow to share S2 nest_parent domain across vSMMUs
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 77 +++++++-
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 175 ++++++++++++++++--
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 3 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 139 +++++++-------
4 files changed, 297 insertions(+), 97 deletions(-)
--
2.43.0
* [PATCH v2 01/11] iommu/arm-smmu-v3: Pass in vmid to arm_smmu_make_s2_domain_ste()
2025-04-15 4:57 [PATCH v2 00/11] iommu/arm-smmu-v3: Allocate vmid per vsmmu instead of s2_parent Nicolin Chen
@ 2025-04-15 4:57 ` Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 02/11] iommu/arm-smmu-v3: Pass in smmu/iommu_domain to __arm_smmu_tlb_inv_range() Nicolin Chen
` (9 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Nicolin Chen @ 2025-04-15 4:57 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, kevin.tian, praan, nathan, yi.l.liu, peterz, mshavit,
jsnitsel, smostafa, jeff.johnson, zhangzekun11, linux-arm-kernel,
iommu, linux-kernel, shameerali.kolothum.thodi
A stage-2 STE requires a vmid, which so far has been allocated per domain,
so arm_smmu_make_s2_domain_ste() has been extracting the vmid from the S2
domain.
To share an S2 parent domain across vSMMUs in the same VM, a vmid will no
longer be allocated for or stored in the S2 domain, but per vSMMU. This
means arm_smmu_make_s2_domain_ste() can get a vmid either from an S2
domain (non-nesting parent) or from a vSMMU.
Allow passing in the vmid explicitly to arm_smmu_make_s2_domain_ste(), so
its callers can pick the vmid from either a domain or a vSMMU. Add a
WARN_ON_ONCE to validate the input vmid.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 6 ++++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 3 ++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 8 +++++---
4 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index dd1ad56ce863..d4837a33fb81 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -894,7 +894,7 @@ struct arm_smmu_entry_writer_ops {
void arm_smmu_make_abort_ste(struct arm_smmu_ste *target);
void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_domain *smmu_domain, u16 vmid,
bool ats_enabled);
#if IS_ENABLED(CONFIG_KUNIT)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index e4fd8d522af8..d86dba6691e8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -34,8 +34,9 @@ static void arm_smmu_make_nested_cd_table_ste(
struct arm_smmu_ste *target, struct arm_smmu_master *master,
struct arm_smmu_nested_domain *nested_domain, bool ats_enabled)
{
- arm_smmu_make_s2_domain_ste(
- target, master, nested_domain->vsmmu->s2_parent, ats_enabled);
+ arm_smmu_make_s2_domain_ste(target, master,
+ nested_domain->vsmmu->s2_parent,
+ nested_domain->vsmmu->vmid, ats_enabled);
target->data[0] = cpu_to_le64(STRTAB_STE_0_V |
FIELD_PREP(STRTAB_STE_0_CFG,
@@ -78,6 +79,7 @@ static void arm_smmu_make_nested_domain_ste(
case STRTAB_STE_0_CFG_BYPASS:
arm_smmu_make_s2_domain_ste(target, master,
nested_domain->vsmmu->s2_parent,
+ nested_domain->vsmmu->vmid,
ats_enabled);
break;
case STRTAB_STE_0_CFG_ABORT:
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
index d2671bfd3798..7fac5a112c5c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
@@ -316,7 +316,8 @@ static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste,
io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.sl = 3;
io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.tsz = 4;
- arm_smmu_make_s2_domain_ste(ste, &master, &smmu_domain, ats_enabled);
+ arm_smmu_make_s2_domain_ste(ste, &master, &smmu_domain,
+ smmu_domain.s2_cfg.vmid, ats_enabled);
}
static void arm_smmu_v3_write_ste_test_s2_to_abort(struct kunit *test)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c32c0b92dc69..1ec5efca1d42 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1656,10 +1656,9 @@ EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_cdtable_ste);
void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_domain *smmu_domain, u16 vmid,
bool ats_enabled)
{
- struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
const struct io_pgtable_cfg *pgtbl_cfg =
&io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops)->cfg;
typeof(&pgtbl_cfg->arm_lpae_s2_cfg.vtcr) vtcr =
@@ -1667,6 +1666,8 @@ void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
u64 vtcr_val;
struct arm_smmu_device *smmu = master->smmu;
+ WARN_ON_ONCE(!vmid);
+
memset(target, 0, sizeof(*target));
target->data[0] = cpu_to_le64(
STRTAB_STE_0_V |
@@ -1690,7 +1691,7 @@ void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
FIELD_PREP(STRTAB_STE_2_VTCR_S2TG, vtcr->tg) |
FIELD_PREP(STRTAB_STE_2_VTCR_S2PS, vtcr->ps);
target->data[2] = cpu_to_le64(
- FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
+ FIELD_PREP(STRTAB_STE_2_S2VMID, vmid) |
FIELD_PREP(STRTAB_STE_2_VTCR, vtcr_val) |
STRTAB_STE_2_S2AA64 |
#ifdef __BIG_ENDIAN
@@ -2990,6 +2991,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
}
case ARM_SMMU_DOMAIN_S2:
arm_smmu_make_s2_domain_ste(&target, master, smmu_domain,
+ smmu_domain->s2_cfg.vmid,
state.ats_enabled);
arm_smmu_install_ste_for_dev(master, &target);
arm_smmu_clear_cd(master, IOMMU_NO_PASID);
--
2.43.0
* [PATCH v2 02/11] iommu/arm-smmu-v3: Pass in smmu/iommu_domain to __arm_smmu_tlb_inv_range()
2025-04-15 4:57 [PATCH v2 00/11] iommu/arm-smmu-v3: Allocate vmid per vsmmu instead of s2_parent Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 01/11] iommu/arm-smmu-v3: Pass in vmid to arm_smmu_make_s2_domain_ste() Nicolin Chen
@ 2025-04-15 4:57 ` Nicolin Chen
2025-05-15 15:06 ` Will Deacon
2025-04-15 4:57 ` [PATCH v2 03/11] iommu/arm-smmu-v3: Share cmdq/cmd helpers with arm-smmu-v3-iommufd Nicolin Chen
` (8 subsequent siblings)
10 siblings, 1 reply; 16+ messages in thread
From: Nicolin Chen @ 2025-04-15 4:57 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, kevin.tian, praan, nathan, yi.l.liu, peterz, mshavit,
jsnitsel, smostafa, jeff.johnson, zhangzekun11, linux-arm-kernel,
iommu, linux-kernel, shameerali.kolothum.thodi
What __arm_smmu_tlb_inv_range() really needs is the smmu and iommu_domain
pointers from the smmu_domain.
For a nest_parent smmu_domain, it will no longer store an smmu pointer as
it can be shared across vSMMU instances. A vSMMU structure sharing the S2
smmu_domain instead would hold the smmu pointer.
Pass them in explicitly to fit both !nest_parent and nest_parent cases.
While changing it, share it in the header with arm-smmu-v3-iommufd, which
will call it too.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 ++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 17 +++++++++--------
2 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index d4837a33fb81..5dbdc61558a9 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -955,6 +955,10 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
struct arm_smmu_domain *smmu_domain);
int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
unsigned long iova, size_t size);
+void __arm_smmu_tlb_inv_range(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq_ent *cmd, unsigned long iova,
+ size_t size, size_t granule,
+ struct iommu_domain *domain);
void __arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq *cmdq);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 1ec5efca1d42..e9d4bbdacc99 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2267,12 +2267,11 @@ static void arm_smmu_tlb_inv_context(void *cookie)
arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
}
-static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
- unsigned long iova, size_t size,
- size_t granule,
- struct arm_smmu_domain *smmu_domain)
+void __arm_smmu_tlb_inv_range(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq_ent *cmd, unsigned long iova,
+ size_t size, size_t granule,
+ struct iommu_domain *domain)
{
- struct arm_smmu_device *smmu = smmu_domain->smmu;
unsigned long end = iova + size, num_pages = 0, tg = 0;
size_t inv_range = granule;
struct arm_smmu_cmdq_batch cmds;
@@ -2282,7 +2281,7 @@ static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
if (smmu->features & ARM_SMMU_FEAT_RANGE_INV) {
/* Get the leaf page size */
- tg = __ffs(smmu_domain->domain.pgsize_bitmap);
+ tg = __ffs(domain->pgsize_bitmap);
num_pages = size >> tg;
@@ -2356,7 +2355,8 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
cmd.opcode = CMDQ_OP_TLBI_S2_IPA;
cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
}
- __arm_smmu_tlb_inv_range(&cmd, iova, size, granule, smmu_domain);
+ __arm_smmu_tlb_inv_range(smmu_domain->smmu, &cmd, iova, size, granule,
+ &smmu_domain->domain);
if (smmu_domain->nest_parent) {
/*
@@ -2387,7 +2387,8 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
},
};
- __arm_smmu_tlb_inv_range(&cmd, iova, size, granule, smmu_domain);
+ __arm_smmu_tlb_inv_range(smmu_domain->smmu, &cmd, iova, size, granule,
+ &smmu_domain->domain);
}
static void arm_smmu_tlb_inv_page_nosync(struct iommu_iotlb_gather *gather,
--
2.43.0
* [PATCH v2 03/11] iommu/arm-smmu-v3: Share cmdq/cmd helpers with arm-smmu-v3-iommufd
2025-04-15 4:57 [PATCH v2 00/11] iommu/arm-smmu-v3: Allocate vmid per vsmmu instead of s2_parent Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 01/11] iommu/arm-smmu-v3: Pass in vmid to arm_smmu_make_s2_domain_ste() Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 02/11] iommu/arm-smmu-v3: Pass in smmu/iommu_domain to __arm_smmu_tlb_inv_range() Nicolin Chen
@ 2025-04-15 4:57 ` Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 04/11] iommu/arm-smmu-v3: Add an inline arm_smmu_tlb_inv_vmid helper Nicolin Chen
` (7 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Nicolin Chen @ 2025-04-15 4:57 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, kevin.tian, praan, nathan, yi.l.liu, peterz, mshavit,
jsnitsel, smostafa, jeff.johnson, zhangzekun11, linux-arm-kernel,
iommu, linux-kernel, shameerali.kolothum.thodi
Allow arm-smmu-v3-iommufd to call them for nested/S2 cache invalidations.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 12 ++++++++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 25 ++++++++++-----------
2 files changed, 24 insertions(+), 13 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 5dbdc61558a9..4f3f4a40a755 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -959,6 +959,8 @@ void __arm_smmu_tlb_inv_range(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq_ent *cmd, unsigned long iova,
size_t size, size_t granule,
struct iommu_domain *domain);
+void arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size,
+ struct arm_smmu_cmdq_ent *cmd);
void __arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq *cmdq);
@@ -996,6 +998,16 @@ void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master,
int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq *cmdq, u64 *cmds, int n,
bool sync);
+int arm_smmu_cmdq_issue_cmd_with_sync(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq_ent *ent);
+void arm_smmu_cmdq_batch_init(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq_batch *cmds,
+ struct arm_smmu_cmdq_ent *ent);
+void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq_batch *cmds,
+ struct arm_smmu_cmdq_ent *cmd);
+int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq_batch *cmds);
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index e9d4bbdacc99..8ad249f7dcbf 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -929,23 +929,23 @@ static int arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu,
return __arm_smmu_cmdq_issue_cmd(smmu, ent, false);
}
-static int arm_smmu_cmdq_issue_cmd_with_sync(struct arm_smmu_device *smmu,
- struct arm_smmu_cmdq_ent *ent)
+int arm_smmu_cmdq_issue_cmd_with_sync(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq_ent *ent)
{
return __arm_smmu_cmdq_issue_cmd(smmu, ent, true);
}
-static void arm_smmu_cmdq_batch_init(struct arm_smmu_device *smmu,
- struct arm_smmu_cmdq_batch *cmds,
- struct arm_smmu_cmdq_ent *ent)
+void arm_smmu_cmdq_batch_init(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq_batch *cmds,
+ struct arm_smmu_cmdq_ent *ent)
{
cmds->num = 0;
cmds->cmdq = arm_smmu_get_cmdq(smmu, ent);
}
-static void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
- struct arm_smmu_cmdq_batch *cmds,
- struct arm_smmu_cmdq_ent *cmd)
+void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq_batch *cmds,
+ struct arm_smmu_cmdq_ent *cmd)
{
bool unsupported_cmd = !arm_smmu_cmdq_supports_cmd(cmds->cmdq, cmd);
bool force_sync = (cmds->num == CMDQ_BATCH_ENTRIES - 1) &&
@@ -974,8 +974,8 @@ static void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
cmds->num++;
}
-static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
- struct arm_smmu_cmdq_batch *cmds)
+int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq_batch *cmds)
{
return arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmdq, cmds->cmds,
cmds->num, true);
@@ -2096,9 +2096,8 @@ static irqreturn_t arm_smmu_combined_irq_handler(int irq, void *dev)
return IRQ_WAKE_THREAD;
}
-static void
-arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size,
- struct arm_smmu_cmdq_ent *cmd)
+void arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size,
+ struct arm_smmu_cmdq_ent *cmd)
{
size_t log2_span;
size_t span_mask;
--
2.43.0
* [PATCH v2 04/11] iommu/arm-smmu-v3: Add an inline arm_smmu_tlb_inv_vmid helper
2025-04-15 4:57 [PATCH v2 00/11] iommu/arm-smmu-v3: Allocate vmid per vsmmu instead of s2_parent Nicolin Chen
` (2 preceding siblings ...)
2025-04-15 4:57 ` [PATCH v2 03/11] iommu/arm-smmu-v3: Share cmdq/cmd helpers with arm-smmu-v3-iommufd Nicolin Chen
@ 2025-04-15 4:57 ` Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 05/11] iommu/arm-smmu-v3: Rename arm_smmu_attach_prepare_vmaster Nicolin Chen
` (6 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Nicolin Chen @ 2025-04-15 4:57 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, kevin.tian, praan, nathan, yi.l.liu, peterz, mshavit,
jsnitsel, smostafa, jeff.johnson, zhangzekun11, linux-arm-kernel,
iommu, linux-kernel, shameerali.kolothum.thodi
Both arm-smmu-v3 and arm-smmu-v3-iommufd will use this by passing in a vmid
from s2_cfg/vsmmu.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 10 ++++++++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 10 +++-------
2 files changed, 13 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 4f3f4a40a755..2f8928971716 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1009,6 +1009,16 @@ void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq_batch *cmds);
+static inline void arm_smmu_tlb_inv_vmid(struct arm_smmu_device *smmu, u16 vmid)
+{
+ struct arm_smmu_cmdq_ent cmd = {
+ .opcode = CMDQ_OP_TLBI_S12_VMALL,
+ .tlbi.vmid = vmid,
+ };
+
+ arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
+}
+
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
bool arm_smmu_master_sva_supported(struct arm_smmu_master *master);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 8ad249f7dcbf..bafe7c7c2769 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2247,7 +2247,6 @@ static void arm_smmu_tlb_inv_context(void *cookie)
{
struct arm_smmu_domain *smmu_domain = cookie;
struct arm_smmu_device *smmu = smmu_domain->smmu;
- struct arm_smmu_cmdq_ent cmd;
/*
* NOTE: when io-pgtable is in non-strict mode, we may get here with
@@ -2256,13 +2255,10 @@ static void arm_smmu_tlb_inv_context(void *cookie)
* insertion to guarantee those are observed before the TLBI. Do be
* careful, 007.
*/
- if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+ if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1)
arm_smmu_tlb_inv_asid(smmu, smmu_domain->cd.asid);
- } else {
- cmd.opcode = CMDQ_OP_TLBI_S12_VMALL;
- cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
- arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
- }
+ else
+ arm_smmu_tlb_inv_vmid(smmu, smmu_domain->s2_cfg.vmid);
arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
}
--
2.43.0
* [PATCH v2 05/11] iommu/arm-smmu-v3: Rename arm_smmu_attach_prepare_vmaster
2025-04-15 4:57 [PATCH v2 00/11] iommu/arm-smmu-v3: Allocate vmid per vsmmu instead of s2_parent Nicolin Chen
` (3 preceding siblings ...)
2025-04-15 4:57 ` [PATCH v2 04/11] iommu/arm-smmu-v3: Add an inline arm_smmu_tlb_inv_vmid helper Nicolin Chen
@ 2025-04-15 4:57 ` Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 06/11] iommu/arm-smmu-v3: Introduce arm_smmu_s2_parent_tlb_ invalidation helpers Nicolin Chen
` (5 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Nicolin Chen @ 2025-04-15 4:57 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, kevin.tian, praan, nathan, yi.l.liu, peterz, mshavit,
jsnitsel, smostafa, jeff.johnson, zhangzekun11, linux-arm-kernel,
iommu, linux-kernel, shameerali.kolothum.thodi
More vsmmu-related routines need to go into the prepare() function.
arm_smmu_attach_prepare_vmaster() is only called when the domain is a
nested domain, which always has a valid vsmmu pointer. So, rename it to
arm_vsmmu_attach_prepare() and pass in the vsmmu pointer directly.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 9 ++++-----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 8 ++++----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 5 +++--
3 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 2f8928971716..7b47f4408a7a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1089,8 +1089,8 @@ struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
struct iommu_domain *parent,
struct iommufd_ctx *ictx,
unsigned int viommu_type);
-int arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
- struct arm_smmu_nested_domain *nested_domain);
+int arm_vsmmu_attach_prepare(struct arm_smmu_attach_state *state,
+ struct arm_vsmmu *vsmmu);
void arm_smmu_attach_commit_vmaster(struct arm_smmu_attach_state *state);
void arm_smmu_master_clear_vmaster(struct arm_smmu_master *master);
int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster, u64 *evt);
@@ -1098,9 +1098,8 @@ int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster, u64 *evt);
#define arm_smmu_hw_info NULL
#define arm_vsmmu_alloc NULL
-static inline int
-arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
- struct arm_smmu_nested_domain *nested_domain)
+static inline int arm_vsmmu_attach_prepare(struct arm_smmu_attach_state *state,
+ struct arm_vsmmu *vsmmu)
{
return 0;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index d86dba6691e8..6cd01536c966 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -89,8 +89,8 @@ static void arm_smmu_make_nested_domain_ste(
}
}
-int arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
- struct arm_smmu_nested_domain *nested_domain)
+int arm_vsmmu_attach_prepare(struct arm_smmu_attach_state *state,
+ struct arm_vsmmu *vsmmu)
{
struct arm_smmu_vmaster *vmaster;
unsigned long vsid;
@@ -98,7 +98,7 @@ int arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
iommu_group_mutex_assert(state->master->dev);
- ret = iommufd_viommu_get_vdev_id(&nested_domain->vsmmu->core,
+ ret = iommufd_viommu_get_vdev_id(&vsmmu->core,
state->master->dev, &vsid);
if (ret)
return ret;
@@ -106,7 +106,7 @@ int arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
vmaster = kzalloc(sizeof(*vmaster), GFP_KERNEL);
if (!vmaster)
return -ENOMEM;
- vmaster->vsmmu = nested_domain->vsmmu;
+ vmaster->vsmmu = vsmmu;
vmaster->vsid = vsid;
state->vmaster = vmaster;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index bafe7c7c2769..07d435562da2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2839,8 +2839,9 @@ int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
if (smmu_domain) {
if (new_domain->type == IOMMU_DOMAIN_NESTED) {
- ret = arm_smmu_attach_prepare_vmaster(
- state, to_smmu_nested_domain(new_domain));
+ ret = arm_vsmmu_attach_prepare(
+ state,
+ to_smmu_nested_domain(new_domain)->vsmmu);
if (ret)
return ret;
}
--
2.43.0
* [PATCH v2 06/11] iommu/arm-smmu-v3: Introduce arm_smmu_s2_parent_tlb_ invalidation helpers
2025-04-15 4:57 [PATCH v2 00/11] iommu/arm-smmu-v3: Allocate vmid per vsmmu instead of s2_parent Nicolin Chen
` (4 preceding siblings ...)
2025-04-15 4:57 ` [PATCH v2 05/11] iommu/arm-smmu-v3: Rename arm_smmu_attach_prepare_vmaster Nicolin Chen
@ 2025-04-15 4:57 ` Nicolin Chen
2025-04-15 12:50 ` Jason Gunthorpe
2025-04-15 4:57 ` [PATCH v2 07/11] iommu/arm-smmu-v3: Introduce arm_vsmmu_atc_inv_domain() Nicolin Chen
` (4 subsequent siblings)
10 siblings, 1 reply; 16+ messages in thread
From: Nicolin Chen @ 2025-04-15 4:57 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, kevin.tian, praan, nathan, yi.l.liu, peterz, mshavit,
jsnitsel, smostafa, jeff.johnson, zhangzekun11, linux-arm-kernel,
iommu, linux-kernel, shameerali.kolothum.thodi
An S2 nest_parent domain can be shared across vSMMUs in the same VM, since
the S2 domain is basically the IPA mappings for the entire RAM of the VM.
Meanwhile, each vSMMU can have its own VMID, so the VMID allocation should
be done per vSMMU instance rather than per S2 nest_parent domain.
However, an S2 domain can also be allocated when a physical SMMU instance
doesn't support S1. So, the structure has to retain the s2_cfg and vmid.
Add a per-domain "vsmmus" list paired with a spinlock, maintaining a list
of vSMMUs in the S2 parent domain.
Provide two arm_smmu_s2_parent_tlb_ helpers that will be used for nesting
cases to invalidate S2 cache using vsmmu->vmid by iterating this "vsmmus"
list.
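To illustrate the iteration described above, here is a minimal userspace
sketch, not the kernel implementation. It walks a shared S2 parent's list
of vSMMUs and records one invalidation per (SMMU, vmid) pair. The sketch_*
names are hypothetical, a plain singly linked list stands in for the
kernel list_head, and the spinlock is left out for brevity:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One recorded invalidation: which SMMU it targets and with which vmid */
struct sketch_inv {
	int smmu_id;
	uint16_t vmid;
};

/* Hypothetical vSMMU node, linked into its S2 parent's list */
struct sketch_vsmmu {
	int smmu_id;
	uint16_t vmid;
	struct sketch_vsmmu *next;
};

/* Hypothetical shared S2 parent domain holding the "vsmmus" list */
struct sketch_s2_parent {
	struct sketch_vsmmu *vsmmus;
};

/* Append a vSMMU at the tail, similar in spirit to list_add_tail() */
static void sketch_s2_parent_link(struct sketch_s2_parent *parent,
				  struct sketch_vsmmu *vsmmu)
{
	struct sketch_vsmmu **pos = &parent->vsmmus;

	while (*pos)
		pos = &(*pos)->next;
	vsmmu->next = NULL;
	*pos = vsmmu;
}

/*
 * Record one whole-vmid invalidation per linked vSMMU, using that vSMMU's
 * own vmid. Returns the number of entries written to log (at most cap).
 */
static size_t sketch_s2_parent_tlb_inv_domain(
	const struct sketch_s2_parent *parent, struct sketch_inv *log,
	size_t cap)
{
	size_t n = 0;

	for (const struct sketch_vsmmu *v = parent->vsmmus; v && n < cap;
	     v = v->next) {
		log[n].smmu_id = v->smmu_id;
		log[n].vmid = v->vmid;
		n++;
	}
	return n;
}
```

A single change to the shared S2 mappings thus fans out into one
invalidation per vSMMU, each tagged with that vSMMU's vmid.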
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 22 ++++++++
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 53 +++++++++++++++++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +
3 files changed, 77 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 7b47f4408a7a..7d76d8ac9acc 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -859,6 +859,10 @@ struct arm_smmu_domain {
struct arm_smmu_ctx_desc cd;
struct arm_smmu_s2_cfg s2_cfg;
};
+ struct {
+ struct list_head list;
+ spinlock_t lock;
+ } vsmmus;
struct iommu_domain domain;
@@ -1081,6 +1085,7 @@ struct arm_vsmmu {
struct arm_smmu_device *smmu;
struct arm_smmu_domain *s2_parent;
u16 vmid;
+ struct list_head vsmmus_elm; /* arm_smmu_domain::vsmmus::list */
};
#if IS_ENABLED(CONFIG_ARM_SMMU_V3_IOMMUFD)
@@ -1094,6 +1099,11 @@ int arm_vsmmu_attach_prepare(struct arm_smmu_attach_state *state,
void arm_smmu_attach_commit_vmaster(struct arm_smmu_attach_state *state);
void arm_smmu_master_clear_vmaster(struct arm_smmu_master *master);
int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster, u64 *evt);
+
+void arm_smmu_s2_parent_tlb_inv_domain(struct arm_smmu_domain *s2_parent);
+void arm_smmu_s2_parent_tlb_inv_range(struct arm_smmu_domain *s2_parent,
+ unsigned long iova, size_t size,
+ size_t granule, bool leaf);
#else
#define arm_smmu_hw_info NULL
#define arm_vsmmu_alloc NULL
@@ -1119,6 +1129,18 @@ static inline int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster,
{
return -EOPNOTSUPP;
}
+
+static inline void
+arm_smmu_s2_parent_tlb_inv_domain(struct arm_smmu_domain *s2_parent)
+{
+}
+
+static inline void
+arm_smmu_s2_parent_tlb_inv_range(struct arm_smmu_domain *s2_parent,
+ unsigned long iova, size_t size,
+ size_t granule, bool leaf)
+{
+}
#endif /* CONFIG_ARM_SMMU_V3_IOMMUFD */
#endif /* _ARM_SMMU_V3_H */
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index 6cd01536c966..45ba68a1b59a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -30,6 +30,54 @@ void *arm_smmu_hw_info(struct device *dev, u32 *length, u32 *type)
return info;
}
+void arm_smmu_s2_parent_tlb_inv_domain(struct arm_smmu_domain *s2_parent)
+{
+ struct arm_vsmmu *vsmmu, *next;
+ unsigned long flags;
+
+ spin_lock_irqsave(&s2_parent->vsmmus.lock, flags);
+ list_for_each_entry_safe(vsmmu, next, &s2_parent->vsmmus.list,
+ vsmmus_elm) {
+ arm_smmu_tlb_inv_vmid(vsmmu->smmu, vsmmu->vmid);
+ }
+ spin_unlock_irqrestore(&s2_parent->vsmmus.lock, flags);
+}
+
+void arm_smmu_s2_parent_tlb_inv_range(struct arm_smmu_domain *s2_parent,
+ unsigned long iova, size_t size,
+ size_t granule, bool leaf)
+{
+ struct arm_smmu_cmdq_ent cmd = { .tlbi = { .leaf = leaf } };
+ struct arm_vsmmu *vsmmu, *next;
+ unsigned long flags;
+
+ spin_lock_irqsave(&s2_parent->vsmmus.lock, flags);
+ list_for_each_entry_safe(vsmmu, next, &s2_parent->vsmmus.list,
+ vsmmus_elm) {
+ cmd.tlbi.vmid = vsmmu->vmid;
+
+ /* Must flush all the nested S1 ASIDs when S2 domain changes */
+ cmd.opcode = CMDQ_OP_TLBI_NH_ALL;
+ arm_smmu_cmdq_issue_cmd_with_sync(vsmmu->smmu, &cmd);
+ cmd.opcode = CMDQ_OP_TLBI_S2_IPA;
+ __arm_smmu_tlb_inv_range(vsmmu->smmu, &cmd, iova, size, granule,
+ &s2_parent->domain);
+ }
+ spin_unlock_irqrestore(&s2_parent->vsmmus.lock, flags);
+}
+
+static void arm_vsmmu_destroy(struct iommufd_viommu *viommu)
+{
+ struct arm_vsmmu *vsmmu = container_of(viommu, struct arm_vsmmu, core);
+ unsigned long flags;
+
+ spin_lock_irqsave(&vsmmu->s2_parent->vsmmus.lock, flags);
+ list_del(&vsmmu->vsmmus_elm);
+ spin_unlock_irqrestore(&vsmmu->s2_parent->vsmmus.lock, flags);
+ /* Must flush S2 vmid after delinking vSMMU */
+ arm_smmu_tlb_inv_vmid(vsmmu->smmu, vsmmu->vmid);
+}
+
static void arm_smmu_make_nested_cd_table_ste(
struct arm_smmu_ste *target, struct arm_smmu_master *master,
struct arm_smmu_nested_domain *nested_domain, bool ats_enabled)
@@ -380,6 +428,7 @@ static int arm_vsmmu_cache_invalidate(struct iommufd_viommu *viommu,
}
static const struct iommufd_viommu_ops arm_vsmmu_ops = {
+ .destroy = arm_vsmmu_destroy,
.alloc_domain_nested = arm_vsmmu_alloc_domain_nested,
.cache_invalidate = arm_vsmmu_cache_invalidate,
};
@@ -394,6 +443,7 @@ struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct arm_smmu_domain *s2_parent = to_smmu_domain(parent);
struct arm_vsmmu *vsmmu;
+ unsigned long flags;
if (viommu_type != IOMMU_VIOMMU_TYPE_ARM_SMMUV3)
return ERR_PTR(-EOPNOTSUPP);
@@ -433,6 +483,9 @@ struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
vsmmu->s2_parent = s2_parent;
/* FIXME Move VMID allocation from the S2 domain allocation to here */
vsmmu->vmid = s2_parent->s2_cfg.vmid;
+ spin_lock_irqsave(&s2_parent->vsmmus.lock, flags);
+ list_add_tail(&vsmmu->vsmmus_elm, &s2_parent->vsmmus.list);
+ spin_unlock_irqrestore(&s2_parent->vsmmus.lock, flags);
return &vsmmu->core;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 07d435562da2..df87880e2a29 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3256,6 +3256,8 @@ arm_smmu_domain_alloc_paging_flags(struct device *dev, u32 flags,
}
smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
smmu_domain->nest_parent = true;
+ INIT_LIST_HEAD(&smmu_domain->vsmmus.list);
+ spin_lock_init(&smmu_domain->vsmmus.lock);
break;
case IOMMU_HWPT_ALLOC_DIRTY_TRACKING:
case IOMMU_HWPT_ALLOC_DIRTY_TRACKING | IOMMU_HWPT_ALLOC_PASID:
--
2.43.0
* [PATCH v2 07/11] iommu/arm-smmu-v3: Introduce arm_vsmmu_atc_inv_domain()
2025-04-15 4:57 [PATCH v2 00/11] iommu/arm-smmu-v3: Allocate vmid per vsmmu instead of s2_parent Nicolin Chen
` (5 preceding siblings ...)
2025-04-15 4:57 ` [PATCH v2 06/11] iommu/arm-smmu-v3: Introduce arm_smmu_s2_parent_tlb_ invalidation helpers Nicolin Chen
@ 2025-04-15 4:57 ` Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 08/11] iommu/arm-smmu-v3: Use vSMMU helpers for S2 and ATC invalidations Nicolin Chen
` (3 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Nicolin Chen @ 2025-04-15 4:57 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, kevin.tian, praan, nathan, yi.l.liu, peterz, mshavit,
jsnitsel, smostafa, jeff.johnson, zhangzekun11, linux-arm-kernel,
iommu, linux-kernel, shameerali.kolothum.thodi
Currently, all nested domains that enable ATS (i.e. nested_ats_flush) are
added to the devices list in the S2 parent domain via a master_domain. On
the other hand, an S2 parent domain can be shared across vSMMU instances.
So, storing all devices behind different vSMMU instances in a shared S2
parent domain isn't ideal.
Add a new per-vSMMU ats_devices list (with a pairing lock), which stores
the devices whose ATS features are enabled.
Using this ats_devices list, add an arm_vsmmu_atc_inv_domain() helper that
sends an ATC invalidation request to every device on the list, so that the
s2_parent invalidation routines can do ATC invalidation properly.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 6 +++
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 45 +++++++++++++++++++
2 files changed, 51 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 7d76d8ac9acc..d130d723cc33 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -840,6 +840,7 @@ struct arm_smmu_master {
bool sva_enabled;
bool iopf_enabled;
unsigned int ssid_bits;
+ struct list_head devices_elm; /* vsmmu->ats_devices */
};
/* SMMU private data for an IOMMU domain */
@@ -1086,6 +1087,11 @@ struct arm_vsmmu {
struct arm_smmu_domain *s2_parent;
u16 vmid;
struct list_head vsmmus_elm; /* arm_smmu_domain::vsmmus::list */
+ /* List of struct arm_smmu_master that enables ATS */
+ struct {
+ struct list_head list;
+ spinlock_t lock;
+ } ats_devices;
};
#if IS_ENABLED(CONFIG_ARM_SMMU_V3_IOMMUFD)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index 45ba68a1b59a..4730ff56cf04 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -30,6 +30,41 @@ void *arm_smmu_hw_info(struct device *dev, u32 *length, u32 *type)
return info;
}
+static void arm_vsmmu_cmdq_batch_add_atc_inv(struct arm_vsmmu *vsmmu,
+ struct arm_smmu_master *master,
+ struct arm_smmu_cmdq_batch *cmds,
+ struct arm_smmu_cmdq_ent *cmd)
+{
+ int i;
+
+ lockdep_assert_held(&vsmmu->ats_devices.lock);
+
+ arm_smmu_atc_inv_to_cmd(IOMMU_NO_PASID, 0, 0, cmd);
+ for (i = 0; i < master->num_streams; i++) {
+ cmd->atc.sid = master->streams[i].id;
+ arm_smmu_cmdq_batch_add(vsmmu->smmu, cmds, cmd);
+ }
+}
+
+static int arm_vsmmu_atc_inv_domain(struct arm_vsmmu *vsmmu, unsigned long iova,
+ size_t size)
+{
+ struct arm_smmu_cmdq_ent cmd = { .opcode = CMDQ_OP_ATC_INV };
+ struct arm_smmu_master *master, *next;
+ struct arm_smmu_cmdq_batch cmds;
+ unsigned long flags;
+
+ arm_smmu_cmdq_batch_init(vsmmu->smmu, &cmds, &cmd);
+
+ spin_lock_irqsave(&vsmmu->ats_devices.lock, flags);
+ list_for_each_entry_safe(master, next, &vsmmu->ats_devices.list,
+ devices_elm)
+ arm_vsmmu_cmdq_batch_add_atc_inv(vsmmu, master, &cmds, &cmd);
+ spin_unlock_irqrestore(&vsmmu->ats_devices.lock, flags);
+
+ return arm_smmu_cmdq_batch_submit(vsmmu->smmu, &cmds);
+}
+
void arm_smmu_s2_parent_tlb_inv_domain(struct arm_smmu_domain *s2_parent)
{
struct arm_vsmmu *vsmmu, *next;
@@ -39,6 +74,7 @@ void arm_smmu_s2_parent_tlb_inv_domain(struct arm_smmu_domain *s2_parent)
list_for_each_entry_safe(vsmmu, next, &s2_parent->vsmmus.list,
vsmmus_elm) {
arm_smmu_tlb_inv_vmid(vsmmu->smmu, vsmmu->vmid);
+ arm_vsmmu_atc_inv_domain(vsmmu, 0, 0);
}
spin_unlock_irqrestore(&s2_parent->vsmmus.lock, flags);
}
@@ -62,6 +98,11 @@ void arm_smmu_s2_parent_tlb_inv_range(struct arm_smmu_domain *s2_parent,
cmd.opcode = CMDQ_OP_TLBI_S2_IPA;
__arm_smmu_tlb_inv_range(vsmmu->smmu, &cmd, iova, size, granule,
&s2_parent->domain);
+ /*
+ * Unfortunately, this can't be leaf-only since we may have
+ * zapped an entire table.
+ */
+ arm_vsmmu_atc_inv_domain(vsmmu, iova, size);
}
spin_unlock_irqrestore(&s2_parent->vsmmus.lock, flags);
}
@@ -76,6 +117,7 @@ static void arm_vsmmu_destroy(struct iommufd_viommu *viommu)
spin_unlock_irqrestore(&vsmmu->s2_parent->vsmmus.lock, flags);
/* Must flush S2 vmid after delinking vSMMU */
arm_smmu_tlb_inv_vmid(vsmmu->smmu, vsmmu->vmid);
+ arm_vsmmu_atc_inv_domain(vsmmu, 0, 0);
}
static void arm_smmu_make_nested_cd_table_ste(
@@ -487,6 +529,9 @@ struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
list_add_tail(&vsmmu->vsmmus_elm, &s2_parent->vsmmus.list);
spin_unlock_irqrestore(&s2_parent->vsmmus.lock, flags);
+ INIT_LIST_HEAD(&vsmmu->ats_devices.list);
+ spin_lock_init(&vsmmu->ats_devices.lock);
+
return &vsmmu->core;
}
--
2.43.0
* [PATCH v2 08/11] iommu/arm-smmu-v3: Use vSMMU helpers for S2 and ATC invalidations
2025-04-15 4:57 [PATCH v2 00/11] iommu/arm-smmu-v3: Allocate vmid per vsmmu instead of s2_parent Nicolin Chen
` (6 preceding siblings ...)
2025-04-15 4:57 ` [PATCH v2 07/11] iommu/arm-smmu-v3: Introduce arm_vsmmu_atc_inv_domain() Nicolin Chen
@ 2025-04-15 4:57 ` Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 09/11] iommu/arm-smmu-v3: Clean up nested_ats_flush from master_domain Nicolin Chen
` (2 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Nicolin Chen @ 2025-04-15 4:57 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, kevin.tian, praan, nathan, yi.l.liu, peterz, mshavit,
jsnitsel, smostafa, jeff.johnson, zhangzekun11, linux-arm-kernel,
iommu, linux-kernel, shameerali.kolothum.thodi
Now the driver can do per-vSMMU S2 cache and ATC invalidations, given the
pair of arm_smmu_s2_parent_* helpers. Use them in the arm_smmu_tlb_inv_*
functions, replacing the existing per-domain invalidations.
This also requires adding/removing the device to/from the ats_devices list
of the vSMMU. Note that this shifts away from the nested_ats_flush in
struct arm_smmu_master_domain, which now becomes dead code that requires
a cleanup.
Move the arm_vsmmu_attach_prepare() call in arm_smmu_attach_prepare() out
of the !IOMMU_DOMAIN_NESTED routine, so that the error paths don't need to
revert arm_vsmmu_attach_prepare(), which would take more than a simple
kfree().
All of these have to be done in one single patch, so nothing is broken.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 +++
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 27 ++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 45 +++++++++----------
3 files changed, 55 insertions(+), 24 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index d130d723cc33..c9b9c7921bee 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1104,6 +1104,8 @@ int arm_vsmmu_attach_prepare(struct arm_smmu_attach_state *state,
struct arm_vsmmu *vsmmu);
void arm_smmu_attach_commit_vmaster(struct arm_smmu_attach_state *state);
void arm_smmu_master_clear_vmaster(struct arm_smmu_master *master);
+void arm_vsmmu_remove_ats_device(struct arm_vsmmu *vsmmu,
+ struct arm_smmu_master *master);
int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster, u64 *evt);
void arm_smmu_s2_parent_tlb_inv_domain(struct arm_smmu_domain *s2_parent);
@@ -1130,6 +1132,11 @@ arm_smmu_master_clear_vmaster(struct arm_smmu_master *master)
{
}
+static inline void arm_vsmmu_remove_ats_device(struct arm_vsmmu *vsmmu,
+ struct arm_smmu_master *master)
+{
+}
+
static inline int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster,
u64 *evt)
{
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index 4730ff56cf04..491f2b88e30b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -182,11 +182,13 @@ static void arm_smmu_make_nested_domain_ste(
int arm_vsmmu_attach_prepare(struct arm_smmu_attach_state *state,
struct arm_vsmmu *vsmmu)
{
+ struct arm_smmu_master *master = state->master;
struct arm_smmu_vmaster *vmaster;
+ unsigned long flags;
unsigned long vsid;
int ret;
- iommu_group_mutex_assert(state->master->dev);
+ iommu_group_mutex_assert(master->dev);
ret = iommufd_viommu_get_vdev_id(&vsmmu->core,
state->master->dev, &vsid);
@@ -200,6 +202,12 @@ int arm_vsmmu_attach_prepare(struct arm_smmu_attach_state *state,
vmaster->vsid = vsid;
state->vmaster = vmaster;
+ if (state->ats_enabled) {
+ spin_lock_irqsave(&vsmmu->ats_devices.lock, flags);
+ list_add(&master->devices_elm, &vsmmu->ats_devices.list);
+ spin_unlock_irqrestore(&vsmmu->ats_devices.lock, flags);
+ }
+
return 0;
}
@@ -220,6 +228,23 @@ void arm_smmu_master_clear_vmaster(struct arm_smmu_master *master)
arm_smmu_attach_commit_vmaster(&state);
}
+void arm_vsmmu_remove_ats_device(struct arm_vsmmu *vsmmu,
+ struct arm_smmu_master *master)
+{
+ struct arm_smmu_cmdq_ent cmd = { .opcode = CMDQ_OP_ATC_INV };
+ struct arm_smmu_cmdq_batch cmds;
+ unsigned long flags;
+
+ arm_smmu_cmdq_batch_init(vsmmu->smmu, &cmds, &cmd);
+
+ spin_lock_irqsave(&vsmmu->ats_devices.lock, flags);
+ list_del(&master->devices_elm);
+ arm_vsmmu_cmdq_batch_add_atc_inv(vsmmu, master, &cmds, &cmd);
+ spin_unlock_irqrestore(&vsmmu->ats_devices.lock, flags);
+
+ arm_smmu_cmdq_batch_submit(vsmmu->smmu, &cmds);
+}
+
static int arm_smmu_attach_dev_nested(struct iommu_domain *domain,
struct device *dev)
{
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index df87880e2a29..483ef9e2c6b7 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2255,6 +2255,10 @@ static void arm_smmu_tlb_inv_context(void *cookie)
* insertion to guarantee those are observed before the TLBI. Do be
* careful, 007.
*/
+
+ if (smmu_domain->nest_parent)
+ return arm_smmu_s2_parent_tlb_inv_domain(smmu_domain);
+
if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1)
arm_smmu_tlb_inv_asid(smmu, smmu_domain->cd.asid);
else
@@ -2342,6 +2346,11 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
},
};
+ if (smmu_domain->nest_parent) {
+ return arm_smmu_s2_parent_tlb_inv_range(smmu_domain, iova, size,
+ granule, leaf);
+ }
+
if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
cmd.opcode = smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ?
CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA;
@@ -2353,15 +2362,6 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
__arm_smmu_tlb_inv_range(smmu_domain->smmu, &cmd, iova, size, granule,
&smmu_domain->domain);
- if (smmu_domain->nest_parent) {
- /*
- * When the S2 domain changes all the nested S1 ASIDs have to be
- * flushed too.
- */
- cmd.opcode = CMDQ_OP_TLBI_NH_ALL;
- arm_smmu_cmdq_issue_cmd_with_sync(smmu_domain->smmu, &cmd);
- }
-
/*
* Unfortunately, this can't be leaf-only since we may have
* zapped an entire table.
@@ -2765,8 +2765,11 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
if (!smmu_domain)
return;
- if (domain->type == IOMMU_DOMAIN_NESTED)
- nested_ats_flush = to_smmu_nested_domain(domain)->enable_ats;
+ if (domain->type == IOMMU_DOMAIN_NESTED &&
+ to_smmu_nested_domain(domain)->enable_ats) {
+ return arm_vsmmu_remove_ats_device(
+ to_smmu_nested_domain(domain)->vsmmu, master);
+ }
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
master_domain = arm_smmu_find_master_domain(smmu_domain, master, ssid,
@@ -2837,20 +2840,17 @@ int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
arm_smmu_ats_supported(master);
}
- if (smmu_domain) {
- if (new_domain->type == IOMMU_DOMAIN_NESTED) {
- ret = arm_vsmmu_attach_prepare(
- state,
- to_smmu_nested_domain(new_domain)->vsmmu);
- if (ret)
- return ret;
- }
+ if (new_domain->type == IOMMU_DOMAIN_NESTED) {
+ struct arm_smmu_nested_domain *nested_domain =
+ to_smmu_nested_domain(new_domain);
+ ret = arm_vsmmu_attach_prepare(state, nested_domain->vsmmu);
+ if (ret)
+ return ret;
+ } else if (smmu_domain) {
master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
- if (!master_domain) {
- kfree(state->vmaster);
+ if (!master_domain)
return -ENOMEM;
- }
master_domain->master = master;
master_domain->ssid = state->ssid;
if (new_domain->type == IOMMU_DOMAIN_NESTED)
@@ -2877,7 +2877,6 @@ int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
spin_unlock_irqrestore(&smmu_domain->devices_lock,
flags);
kfree(master_domain);
- kfree(state->vmaster);
return -EINVAL;
}
--
2.43.0
* [PATCH v2 09/11] iommu/arm-smmu-v3: Clean up nested_ats_flush from master_domain
2025-04-15 4:57 [PATCH v2 00/11] iommu/arm-smmu-v3: Allocate vmid per vsmmu instead of s2_parent Nicolin Chen
` (7 preceding siblings ...)
2025-04-15 4:57 ` [PATCH v2 08/11] iommu/arm-smmu-v3: Use vSMMU helpers for S2 and ATC invalidations Nicolin Chen
@ 2025-04-15 4:57 ` Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 10/11] iommu/arm-smmu-v3: Decouple vmid from S2 nest_parent domain Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 11/11] iommu/arm-smmu-v3: Allow to share S2 nest_parent domain across vSMMUs Nicolin Chen
10 siblings, 0 replies; 16+ messages in thread
From: Nicolin Chen @ 2025-04-15 4:57 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, kevin.tian, praan, nathan, yi.l.liu, peterz, mshavit,
jsnitsel, smostafa, jeff.johnson, zhangzekun11, linux-arm-kernel,
iommu, linux-kernel, shameerali.kolothum.thodi
Now the ats_devices list is maintained per vSMMU, since an S2 domain could
be shared among vSMMU instances.
Drop the nested_ats_flush from struct arm_smmu_master_domain, and clean up
the dead code in the related functions.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 -
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 ++++-----------------
2 files changed, 4 insertions(+), 21 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index c9b9c7921bee..477d4d2f19a6 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -921,7 +921,6 @@ struct arm_smmu_master_domain {
struct list_head devices_elm;
struct arm_smmu_master *master;
ioasid_t ssid;
- bool nested_ats_flush : 1;
};
static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 483ef9e2c6b7..4b9cdfb177ca 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2221,16 +2221,7 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
if (!master->ats_enabled)
continue;
- if (master_domain->nested_ats_flush) {
- /*
- * If a S2 used as a nesting parent is changed we have
- * no option but to completely flush the ATC.
- */
- arm_smmu_atc_inv_to_cmd(IOMMU_NO_PASID, 0, 0, &cmd);
- } else {
- arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size,
- &cmd);
- }
+ arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size, &cmd);
for (i = 0; i < master->num_streams; i++) {
cmd.atc.sid = master->streams[i].id;
@@ -2717,8 +2708,7 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
static struct arm_smmu_master_domain *
arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
- struct arm_smmu_master *master,
- ioasid_t ssid, bool nested_ats_flush)
+ struct arm_smmu_master *master, ioasid_t ssid)
{
struct arm_smmu_master_domain *master_domain;
@@ -2727,8 +2717,7 @@ arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
list_for_each_entry(master_domain, &smmu_domain->devices,
devices_elm) {
if (master_domain->master == master &&
- master_domain->ssid == ssid &&
- master_domain->nested_ats_flush == nested_ats_flush)
+ master_domain->ssid == ssid)
return master_domain;
}
return NULL;
@@ -2759,7 +2748,6 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
{
struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain);
struct arm_smmu_master_domain *master_domain;
- bool nested_ats_flush = false;
unsigned long flags;
if (!smmu_domain)
@@ -2772,8 +2760,7 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
}
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- master_domain = arm_smmu_find_master_domain(smmu_domain, master, ssid,
- nested_ats_flush);
+ master_domain = arm_smmu_find_master_domain(smmu_domain, master, ssid);
if (master_domain) {
list_del(&master_domain->devices_elm);
kfree(master_domain);
@@ -2853,9 +2840,6 @@ int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
return -ENOMEM;
master_domain->master = master;
master_domain->ssid = state->ssid;
- if (new_domain->type == IOMMU_DOMAIN_NESTED)
- master_domain->nested_ats_flush =
- to_smmu_nested_domain(new_domain)->enable_ats;
/*
* During prepare we want the current smmu_domain and new
--
2.43.0
* [PATCH v2 10/11] iommu/arm-smmu-v3: Decouple vmid from S2 nest_parent domain
2025-04-15 4:57 [PATCH v2 00/11] iommu/arm-smmu-v3: Allocate vmid per vsmmu instead of s2_parent Nicolin Chen
` (8 preceding siblings ...)
2025-04-15 4:57 ` [PATCH v2 09/11] iommu/arm-smmu-v3: Clean up nested_ats_flush from master_domain Nicolin Chen
@ 2025-04-15 4:57 ` Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 11/11] iommu/arm-smmu-v3: Allow to share S2 nest_parent domain across vSMMUs Nicolin Chen
10 siblings, 0 replies; 16+ messages in thread
From: Nicolin Chen @ 2025-04-15 4:57 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, kevin.tian, praan, nathan, yi.l.liu, peterz, mshavit,
jsnitsel, smostafa, jeff.johnson, zhangzekun11, linux-arm-kernel,
iommu, linux-kernel, shameerali.kolothum.thodi
Now the new S2 invalidation routines in arm-smmu-v3-iommufd are ready to
support a shared S2 nest_parent domain across multiple vSMMU instances.
Move the vmid allocation/releasing to the vSMMU allocator/destroyer too.
Then, move the vsmmus list next to s2_cfg in the struct arm_smmu_domain,
as the two can be mutually exclusive now.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 12 ++++++------
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 15 ++++++++++++---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 ++++++---
3 files changed, 24 insertions(+), 12 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 477d4d2f19a6..dfb9d5f935e4 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -857,13 +857,13 @@ struct arm_smmu_domain {
enum arm_smmu_domain_stage stage;
union {
- struct arm_smmu_ctx_desc cd;
- struct arm_smmu_s2_cfg s2_cfg;
+ struct arm_smmu_ctx_desc cd; /* S1 */
+ struct arm_smmu_s2_cfg s2_cfg; /* S2 && !nest_parent */
+ struct { /* S2 && nest_parent */
+ struct list_head list;
+ spinlock_t lock;
+ } vsmmus;
};
- struct {
- struct list_head list;
- spinlock_t lock;
- } vsmmus;
struct iommu_domain domain;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index 491f2b88e30b..5d05f8a78215 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -118,6 +118,7 @@ static void arm_vsmmu_destroy(struct iommufd_viommu *viommu)
/* Must flush S2 vmid after delinking vSMMU */
arm_smmu_tlb_inv_vmid(vsmmu->smmu, vsmmu->vmid);
arm_vsmmu_atc_inv_domain(vsmmu, 0, 0);
+ ida_free(&vsmmu->smmu->vmid_map, vsmmu->vmid);
}
static void arm_smmu_make_nested_cd_table_ste(
@@ -511,6 +512,7 @@ struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
struct arm_smmu_domain *s2_parent = to_smmu_domain(parent);
struct arm_vsmmu *vsmmu;
unsigned long flags;
+ int vmid;
if (viommu_type != IOMMU_VIOMMU_TYPE_ARM_SMMUV3)
return ERR_PTR(-EOPNOTSUPP);
@@ -541,15 +543,22 @@ struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
!(smmu->features & ARM_SMMU_FEAT_S2FWB))
return ERR_PTR(-EOPNOTSUPP);
+ vmid = ida_alloc_range(&smmu->vmid_map, 1, (1 << smmu->vmid_bits) - 1,
+ GFP_KERNEL);
+ if (vmid < 0)
+ return ERR_PTR(vmid);
+
vsmmu = iommufd_viommu_alloc(ictx, struct arm_vsmmu, core,
&arm_vsmmu_ops);
- if (IS_ERR(vsmmu))
+ if (IS_ERR(vsmmu)) {
+ ida_free(&smmu->vmid_map, vmid);
return ERR_CAST(vsmmu);
+ }
vsmmu->smmu = smmu;
+ vsmmu->vmid = (u16)vmid;
vsmmu->s2_parent = s2_parent;
- /* FIXME Move VMID allocation from the S2 domain allocation to here */
- vsmmu->vmid = s2_parent->s2_cfg.vmid;
+
spin_lock_irqsave(&s2_parent->vsmmus.lock, flags);
list_add_tail(&vsmmu->vsmmus_elm, &s2_parent->vsmmus.list);
spin_unlock_irqrestore(&s2_parent->vsmmus.lock, flags);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 4b9cdfb177ca..8047b60ec024 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2474,7 +2474,7 @@ static void arm_smmu_domain_free_paging(struct iommu_domain *domain)
mutex_lock(&arm_smmu_asid_lock);
xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
mutex_unlock(&arm_smmu_asid_lock);
- } else {
+ } else if (!smmu_domain->nest_parent) {
struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
if (cfg->vmid)
ida_free(&smmu->vmid_map, cfg->vmid);
@@ -2503,7 +2503,10 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_device *smmu,
struct arm_smmu_domain *smmu_domain)
{
int vmid;
- struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
+
+ /* nest_parent stores vmid in vSMMU instead of a shared S2 domain */
+ if (smmu_domain->nest_parent)
+ return 0;
/* Reserve VMID 0 for stage-2 bypass STEs */
vmid = ida_alloc_range(&smmu->vmid_map, 1, (1 << smmu->vmid_bits) - 1,
@@ -2511,7 +2514,7 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_device *smmu,
if (vmid < 0)
return vmid;
- cfg->vmid = (u16)vmid;
+ smmu_domain->s2_cfg.vmid = (u16)vmid;
return 0;
}
--
2.43.0
* [PATCH v2 11/11] iommu/arm-smmu-v3: Allow to share S2 nest_parent domain across vSMMUs
2025-04-15 4:57 [PATCH v2 00/11] iommu/arm-smmu-v3: Allocate vmid per vsmmu instead of s2_parent Nicolin Chen
` (9 preceding siblings ...)
2025-04-15 4:57 ` [PATCH v2 10/11] iommu/arm-smmu-v3: Decouple vmid from S2 nest_parent domain Nicolin Chen
@ 2025-04-15 4:57 ` Nicolin Chen
10 siblings, 0 replies; 16+ messages in thread
From: Nicolin Chen @ 2025-04-15 4:57 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, kevin.tian, praan, nathan, yi.l.liu, peterz, mshavit,
jsnitsel, smostafa, jeff.johnson, zhangzekun11, linux-arm-kernel,
iommu, linux-kernel, shameerali.kolothum.thodi
An S2 nest_parent domain used by one vSMMU can be shared with another vSMMU
so long as the underlying stage-2 page table is compatible with the physical
SMMU instance.
There is no direct information about the page table from the master device,
but a comparison can be done between the physical SMMU that the nest_parent
domain was allocated for and the physical SMMU that the device is behind.
Replace the smmu test in arm_vsmmu_alloc() with a compatibility test, which
goes through the physical SMMU parameters that were used to decide the page
table formats.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 21 ++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index 5d05f8a78215..f654e665739a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -501,6 +501,25 @@ static const struct iommufd_viommu_ops arm_vsmmu_ops = {
.cache_invalidate = arm_vsmmu_cache_invalidate,
};
+static bool arm_smmu_s2_parent_can_share(struct arm_smmu_domain *s2_parent,
+ struct arm_smmu_device *smmu)
+{
+ struct arm_smmu_device *s2_smmu = s2_parent->smmu;
+
+ if (s2_smmu == smmu)
+ return true;
+ if (s2_smmu->iommu.ops != smmu->iommu.ops)
+ return false;
+ if (s2_smmu->ias > smmu->ias || s2_smmu->oas > smmu->oas)
+ return false;
+ if (s2_smmu->pgsize_bitmap != smmu->pgsize_bitmap)
+ return false;
+ if ((s2_smmu->features & ARM_SMMU_FEAT_COHERENCY) !=
+ (smmu->features & ARM_SMMU_FEAT_COHERENCY))
+ return false;
+ return true;
+}
+
struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
struct iommu_domain *parent,
struct iommufd_ctx *ictx,
@@ -520,7 +539,7 @@ struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
if (!(smmu->features & ARM_SMMU_FEAT_NESTING))
return ERR_PTR(-EOPNOTSUPP);
- if (s2_parent->smmu != master->smmu)
+ if (!arm_smmu_s2_parent_can_share(s2_parent, master->smmu))
return ERR_PTR(-EINVAL);
/*
--
2.43.0
* Re: [PATCH v2 06/11] iommu/arm-smmu-v3: Introduce arm_smmu_s2_parent_tlb_ invalidation helpers
2025-04-15 4:57 ` [PATCH v2 06/11] iommu/arm-smmu-v3: Introduce arm_smmu_s2_parent_tlb_ invalidation helpers Nicolin Chen
@ 2025-04-15 12:50 ` Jason Gunthorpe
2025-04-15 20:10 ` Nicolin Chen
0 siblings, 1 reply; 16+ messages in thread
From: Jason Gunthorpe @ 2025-04-15 12:50 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, joro, kevin.tian, praan, nathan, yi.l.liu,
peterz, mshavit, jsnitsel, smostafa, jeff.johnson, zhangzekun11,
linux-arm-kernel, iommu, linux-kernel, shameerali.kolothum.thodi
On Mon, Apr 14, 2025 at 09:57:41PM -0700, Nicolin Chen wrote:
> An S2 nest_parent domain can be shared across vSMMUs in the same VM, since
> the S2 domain is basically the IPA mappings for the entire RAM of the VM.
>
> Meanwhile, each vSMMU can have its own VMID, so the VMID allocation should
> be done per vSMMU instance v.s. per S2 nest_parent domain.
>
> However, an S2 domain can be also allocated when a physical SMMU instance
> doesn't support S1. So, the structure has to retain the s2_cfg and vmid.
>
> Add a per-domain "vsmmus" list pairing with a spinlock, maintaining a list
> of vSMMUs in the S2 parent domain.
>
> Provide two arm_smmu_s2_parent_tlb_ helpers that will be used for nesting
> cases to invalidate S2 cache using vsmmu->vmid by iterating this "vsmmus"
> list.
I was rather hoping to fix the normal S2 case as well, the nested case
is really not so different.
The challenge with that is to rework the list of invalidation
instructions stored in the smmu_domain to be more general and have
more information, how to invalidate for vsmmu is just another special
case.
> @@ -859,6 +859,10 @@ struct arm_smmu_domain {
> struct arm_smmu_ctx_desc cd;
> struct arm_smmu_s2_cfg s2_cfg;
> };
> + struct {
> + struct list_head list;
> + spinlock_t lock;
> + } vsmmus;
So this approach of just adding more lists is functional, but it isn't
very general :\
This is why it is a tough project, because carefully generalizing the
invalidation data without degrading the performance is certainly
somewhat tricky.
But what I was broadly thinking is to have an allocated array attached
to each domain with something like:
struct invalidation_op {
struct arm_smmu_device *smmu;
enum {ATS,S2_VMDIA_IPA,S2_VMID,S1_ASID} invalidation_op;
union {
u16 vmid;
u32 asid;
u32 ats_id;
};
refcount_t users;
};
Then invalidation would just iterate over this list following each
instruction.
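To make the sweep concrete, here is a rough userspace sketch. It is not
driver code: the inv_kind names, struct fake_smmu, struct inv_stats and
inv_sweep() are all made up for illustration, and the counters merely
stand in for commands that would be pushed to each instance's command
queue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical op kinds, loosely following the enum in the proposal. */
enum inv_kind { INV_S1_ASID, INV_S2_VMID, INV_S2_VMID_IPA, INV_ATS };

/* Stand-in for struct arm_smmu_device; only an index is needed here. */
struct fake_smmu { int idx; };

struct inv_op {
	struct fake_smmu *smmu;
	enum inv_kind kind;
	union {
		uint16_t vmid;
		uint32_t asid;
		uint32_t ats_id;
	};
	unsigned int users;	/* plain counter standing in for refcount_t */
};

/* Counters standing in for the commands that would be queued. */
struct inv_stats {
	unsigned int tlbi;	/* TLB invalidations (ASID or VMID based) */
	unsigned int atc;	/* ATC invalidations (per stream) */
};

/* Walk the array once and do whatever each entry says. */
static void inv_sweep(const struct inv_op *ops, size_t n, struct inv_stats *st)
{
	assert(st != NULL);

	for (size_t i = 0; i < n; i++) {
		switch (ops[i].kind) {
		case INV_S1_ASID:
		case INV_S2_VMID:
		case INV_S2_VMID_IPA:
			st->tlbi++;
			break;
		case INV_ATS:
			st->atc++;
			break;
		}
	}
}
```

The point of the sketch is only that the invalidation side needs no
knowledge of domain types: it reads the kind and the ID from each entry.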
When things are attached the list is mutated:
- Normal S1/S2 attach would reuse an ASID for the same instance or
allocate a new list entry, users keeps track of ID sharing
- VMID attach would use the VMID of the vSMMU
- ATS enabled would add entries for each PCI device instead of the
separate ATS list
To do this without locking on the invalidation side would require
using RCU to manage the list, which suggests it is probably an array
that is re-allocated each time it is changed.
That means some fancy algorithms to copy and mutate the array, deal
with error cases and sort it (ATS must follow ID, want things grouped
by instance).
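A minimal userspace sketch of the copy-and-mutate step, under loud
assumptions: struct inv_op, inv_op_cmp() and inv_array_add() are
hypothetical names, an integer index stands in for the SMMU instance
pointer, and there is no RCU here at all. In a real driver the new array
would be published with an RCU pointer assignment and the old one freed
only after a grace period.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* INV_ATS is deliberately the largest value so that it sorts last. */
enum inv_kind { INV_S1_ASID, INV_S2_VMID, INV_ATS };

struct inv_op {
	int smmu_idx;		/* stand-in for the SMMU instance pointer */
	enum inv_kind kind;
	uint32_t id;		/* ASID, VMID or stream ID, depending on kind */
};

/* Group by instance; within an instance, ATS entries follow the IDs. */
static int inv_op_cmp(const void *a, const void *b)
{
	const struct inv_op *x = a, *y = b;

	if (x->smmu_idx != y->smmu_idx)
		return x->smmu_idx < y->smmu_idx ? -1 : 1;
	if (x->kind != y->kind)
		return x->kind < y->kind ? -1 : 1;
	if (x->id != y->id)
		return x->id < y->id ? -1 : 1;
	return 0;
}

/*
 * Build a new sorted array holding the old entries plus one new entry.
 * The old array is never modified, so a reader still walking it sees a
 * consistent view. Returns NULL on allocation failure, in which case the
 * old array simply stays in use.
 */
static struct inv_op *inv_array_add(const struct inv_op *old, size_t n,
				    const struct inv_op *new_op)
{
	struct inv_op *arr;

	assert(new_op != NULL);

	arr = malloc((n + 1) * sizeof(*arr));
	if (!arr)
		return NULL;
	if (n)
		memcpy(arr, old, n * sizeof(*arr));
	arr[n] = *new_op;
	qsort(arr, n + 1, sizeof(*arr), inv_op_cmp);
	return arr;
}
```

Failing before the new array is published keeps the error case simple:
nothing has changed from the reader's point of view.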
There are some tricky memory barriers needed and RCU would require that
SMMU unplug do a synchronize_rcu(). IIRC riscv did this in their
driver.
But the end result is we fully disconnect the domain from the smmu
instance and all domain types can be shared across all instances if
they support the pagetable layout. The invalidation also becomes
somewhat simpler as it just sweeps the list and does what it is
told. The special ATS list, counter and locking are removed too.
Jason
* Re: [PATCH v2 06/11] iommu/arm-smmu-v3: Introduce arm_smmu_s2_parent_tlb_ invalidation helpers
2025-04-15 12:50 ` Jason Gunthorpe
@ 2025-04-15 20:10 ` Nicolin Chen
2025-04-15 23:46 ` Jason Gunthorpe
0 siblings, 1 reply; 16+ messages in thread
From: Nicolin Chen @ 2025-04-15 20:10 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: will, robin.murphy, joro, kevin.tian, praan, nathan, yi.l.liu,
peterz, mshavit, jsnitsel, smostafa, jeff.johnson, zhangzekun11,
linux-arm-kernel, iommu, linux-kernel, shameerali.kolothum.thodi
On Tue, Apr 15, 2025 at 09:50:42AM -0300, Jason Gunthorpe wrote:
> struct invalidation_op {
> struct arm_smmu_device *smmu;
> enum {ATS,S2_VMDIA_IPA,S2_VMID,S1_ASID} invalidation_op;
> union {
> u16 vmid;
> u32 asid;
> u32 ats_id;
> };
> refcount_t users;
> };
>
> Then invalidation would just iterate over this list following each
> instruction.
>
> When things are attached the list is mutated:
> - Normal S1/S2 attach would reuse an ASID for the same instance or
> allocate a new list entry, users keeps track of ID sharing
> - VMID attach would use the VMID of the vSMMU
> - ATS enabled would add entries for each PCI device instead of the
> separate ATS list
Interesting. I can see it generalizing all the use cases.
Yet are you expecting a big list combining TLBI and ATC_INV cmds?
I think the ATC_INV entries don't need a refcount? And finding
an SID (to remove the device, for example) would take long when
there are a lot of entries in the list?
Should the ATS list still be separate, or even an xarray?
> To do this without locking on the invalidation side would require
> using RCU to manage the list, which suggests it is probably an array
> that is re-allocated each time it is changed.
>
> That means some fancy algorithms to copy and mutate the array, deal
> with error cases and sort it (ATS must follow ID, want things grouped
> by instance).
>
> There is some tricky memory barriers needed and RCU would require that
> SMMU unplug do a synchronize_rcu(). IIRC riscv did this in their
> driver.
I will refer to their driver. Yet, I wonder what we will gain from
RCU here? Race condition? Would you elaborate with some use case?
> But the end result is we fully disconnect the domain from the smmu
> instance and all domain types can be shared across all instances if
> they support the pagetable layout. The invalidation also becomes
> somewhat simpler as it just sweeps the list and does what it is
> told. The special ATS list, counter and locking is removed too.
OK. I'd like to give it another try. Or would you prefer to write
yourself?
Thanks
Nicolin
* Re: [PATCH v2 06/11] iommu/arm-smmu-v3: Introduce arm_smmu_s2_parent_tlb_ invalidation helpers
2025-04-15 20:10 ` Nicolin Chen
@ 2025-04-15 23:46 ` Jason Gunthorpe
0 siblings, 0 replies; 16+ messages in thread
From: Jason Gunthorpe @ 2025-04-15 23:46 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, joro, kevin.tian, praan, nathan, yi.l.liu,
peterz, mshavit, jsnitsel, smostafa, jeff.johnson, zhangzekun11,
linux-arm-kernel, iommu, linux-kernel, shameerali.kolothum.thodi
On Tue, Apr 15, 2025 at 01:10:37PM -0700, Nicolin Chen wrote:
> On Tue, Apr 15, 2025 at 09:50:42AM -0300, Jason Gunthorpe wrote:
> > struct invalidation_op {
> >         struct arm_smmu_device *smmu;
> >         enum {ATS,S2_VMID_IPA,S2_VMID,S1_ASID} invalidation_op;
> >         union {
> >                 u16 vmid;
> >                 u32 asid;
> >                 u32 ats_id;
> >         };
> >         refcount_t users;
> > };
> >
> > Then invalidation would just iterate over this list following each
> > instruction.
> >
> > When things are attached the list is mutated:
> >  - Normal S1/S2 attach would reuse an ASID for the same instance or
> >    allocate a new list entry, users keeps track of ID sharing
> >  - VMID attach would use the VMID of the vSMMU
> >  - ATS enabled would add entries for each PCI device instead of the
> >    separate ATS list
>
> Interesting. I can see it generalize all the use cases.
>
> Yet are you expecting a big list combining TLBI and ATC_INV cmds?
It is the idea I had in my head. There isn't really a great reason to
have two lists if one list can handle the required updating and
locking needs.. I imagine the IOTLB entries would be sorted first and
the ATC entries last.
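
As a rough userspace illustration of that ordering (not driver code), the sketch below sorts one combined array so that all IOTLB entries come before all ATC entries, grouped by instance inside each class. The names `inv_entry`, `inv_entry_cmp()` and `smmu_idx` are made up for the example; `smmu_idx` merely stands in for the `smmu` pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are not the driver's types. */
enum inv_entry_kind { INV_S1_ASID, INV_S2_VMID, INV_ATS };

struct inv_entry {
	int smmu_idx;			/* stands in for struct arm_smmu_device * */
	enum inv_entry_kind kind;
	uint32_t id;			/* ASID, VMID or ATS stream ID */
	unsigned int users;		/* number of masters sharing the entry */
};

/* IOTLB entries first, ATC entries last; then by instance, kind and ID. */
static int inv_entry_cmp(const void *a, const void *b)
{
	const struct inv_entry *x = a, *y = b;
	int x_ats = x->kind == INV_ATS, y_ats = y->kind == INV_ATS;

	if (x_ats != y_ats)
		return x_ats - y_ats;
	if (x->smmu_idx != y->smmu_idx)
		return x->smmu_idx < y->smmu_idx ? -1 : 1;
	if (x->kind != y->kind)
		return x->kind < y->kind ? -1 : 1;
	if (x->id != y->id)
		return x->id < y->id ? -1 : 1;
	return 0;
}

static void inv_entries_sort(struct inv_entry *ents, size_t n)
{
	qsort(ents, n, sizeof(*ents), inv_entry_cmp);
}

/* Index of the first ATC entry, i.e. the number of IOTLB entries. */
static size_t inv_entries_first_ats(const struct inv_entry *ents, size_t n)
{
	size_t i;

	for (i = 0; i < n && ents[i].kind != INV_ATS; i++)
		;
	return i;
}
```

With this order a sweep can issue every TLBI first and only then move on to the ATC_INV entries, without keeping a second list.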
> I think the ATC_INV entries don't need a refcount?
Probably in almost all cases.
But see below about needing two domains in the list at once and recall
that today we temporarily put the same domain in the list twice
sometimes. So it may make a lot of sense to use the refcount in every
entry to track how many masters are using that entry just to keep the
design simple.
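
A minimal sketch of what a per-entry user count buys, assuming a small fixed-size table in plain userspace C. `counted_list`, `counted_get()` and `counted_put()` are hypothetical names, and the plain `unsigned int` only imitates what `refcount_t` would do in the kernel:

```c
#include <stddef.h>
#include <stdint.h>

#define COUNTED_MAX 8

struct counted_entry {
	uint32_t id;		/* e.g. an ATS stream ID */
	unsigned int users;	/* masters currently relying on this entry */
};

struct counted_list {
	struct counted_entry e[COUNTED_MAX];
	size_t n;
};

/* Take a reference: reuse a matching entry or append a new one. */
static int counted_get(struct counted_list *l, uint32_t id)
{
	size_t i;

	for (i = 0; i < l->n; i++) {
		if (l->e[i].id == id) {
			l->e[i].users++;
			return 0;
		}
	}
	if (l->n == COUNTED_MAX)
		return -1;
	l->e[l->n].id = id;
	l->e[l->n].users = 1;
	l->n++;
	return 0;
}

/* Drop a reference: the entry leaves the list only with its last user. */
static int counted_put(struct counted_list *l, uint32_t id)
{
	size_t i;

	for (i = 0; i < l->n; i++) {
		if (l->e[i].id != id)
			continue;
		/* Swap-remove; a sorted list would need a re-sort afterwards. */
		if (--l->e[i].users == 0)
			l->e[i] = l->e[--l->n];
		return 0;
	}
	return -1;
}
```

Adding the same ID twice then simply bumps `users`, so the temporary double-add case does not need a second entry or a special flag.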
> And finding an SID (to remove the device for example) would take
> long, when there are a lot of entries in the list?
It depends how smart you get, bisection search on a sorted linear list
would scale fine. But I don't think we care much about attach/detach
performance, or have such high numbers of attachments that this is
worth optimizing for.
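
For reference, a bisection lookup over a sorted array of stream IDs is only a few lines with the C library's `bsearch()`. This is a userspace sketch; `sid_cmp()` and `sid_find()` are illustrative names, not driver functions:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Three-way compare of two stream IDs without overflow. */
static int sid_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Return the index of sid in the sorted array sids[], or -1 if absent. */
static long sid_find(const uint32_t *sids, size_t n, uint32_t sid)
{
	const uint32_t *hit = bsearch(&sid, sids, n, sizeof(*sids), sid_cmp);

	return hit ? (long)(hit - sids) : -1;
}
```

The lookup is O(log n) in the number of entries, which should be plenty for the attach/detach path.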
> Should the ATS list still be separate, or even an xarray?
I haven't gone through it in any detail to know.. If the invalidation
can use the structure above for ATS and nothing else needs the ATS
list, then perhaps it doesn't need to exist.
> I will refer to their driver. Yet, I wonder what we will gain from
> RCU here? Race condition? Would you elaborate with some use case?
The invalidation path was optimized to avoid locking, look at the
stuff in arm_smmu_atc_inv_domain() to try to avoid the spinlock
protecting the ATS invalidations read from the devices list.
So, I imagine a similar lock free scheme would be
 invalidation:
    rcu_read_lock()
    list = READ_ONCE(domain->invalidation_ops);
    [execute invalidation on list]
    rcu_read_unlock()

 mutate:
    mutex_lock(domain->lock for attachment)
    new_list = kcalloc()
    copy_and_mutate(domain->invalidation_ops, new_list);
    rcu_assign_pointer(domain->invalidation_ops, new_list);
    mutex_unlock(domain->lock for attachment)
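
The same copy-then-publish shape can be tried out in userspace with C11 atomics. This is only an analogy under stated assumptions: a release store plays the role of rcu_assign_pointer(), an acquire load plays the role of the reader's pointer fetch, and there is no grace period here, so the caller decides when the old array may be freed. `ops_array`, `reader_sum_ids()` and `writer_append()` are invented names.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct ops_array {
	size_t n;
	uint32_t ids[];		/* never modified once published */
};

static _Atomic(struct ops_array *) published_ops;

/* Reader side: one acquire load, then walk an immutable snapshot. */
static uint64_t reader_sum_ids(void)
{
	struct ops_array *a =
		atomic_load_explicit(&published_ops, memory_order_acquire);
	uint64_t sum = 0;
	size_t i;

	for (i = 0; a && i < a->n; i++)
		sum += a->ids[i];
	return sum;
}

/*
 * Writer side: the caller is assumed to hold the writer lock. Copy the
 * current array, append id and publish the copy. The previous array is
 * handed back through old_out; in the kernel it would be freed only after
 * a grace period.
 */
static int writer_append(uint32_t id, struct ops_array **old_out)
{
	struct ops_array *old =
		atomic_load_explicit(&published_ops, memory_order_relaxed);
	size_t n = old ? old->n : 0;
	struct ops_array *new_ops =
		malloc(sizeof(*new_ops) + (n + 1) * sizeof(new_ops->ids[0]));

	if (!new_ops)
		return -1;
	if (n)
		memcpy(new_ops->ids, old->ids, n * sizeof(new_ops->ids[0]));
	new_ops->ids[n] = id;
	new_ops->n = n + 1;
	atomic_store_explicit(&published_ops, new_ops, memory_order_release);
	*old_out = old;
	return 0;
}
```

Readers never take a lock and never see a half-built array, because the array is fully written before the pointer to it becomes visible.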
Then because of RCU you have to deal with some races.
1) HW flushing must be synchronous with the domain attach:
      CPU 1                         CPU 2
   change an IOPTE
   release IOPTEs
                                 attach a domain
                                 release invalidation_ops
   invalidation
   acquire READ_ONCE()
                                 acquire IOPTEs
                                 update the STE/CD
Such that either:
 a) the HW is guaranteed to see the new value of the IOPTE before seeing
    the STE/CD that could cause it to be fetched, or
 b) the invalidation is guaranteed to see the invalidation_op for the
    new STE prior to the STE being installed.
IIRC the riscv folks determined that this was a simple smp_mb()..
On the detaching side spurious IOTLB invalidation is OK, that will
just cause some performance anomaly. And I think spurious ATC
invalidation is OK too, though maybe need a synchronize_rcu() in
device removal due to friendly hot unplug.. IDK
2) Safe domain replacement
The existing code double adds devices to the invalidation lists for
safety. So it would need an algorithm like this:
 prepare:
   middle_list = copy_and_mutate_add_master(domain->list, new_master);
   final_list = copy_and_mutate_remove_master(middle_list, old_master);

 commit:
   // Invalidate both new/old master while we mess with the STE/CD
   rcu_assign_pointer(domain->list, middle_list);
   install_ste()
   // Only invalidate new master
   rcu_assign_pointer(domain->list, final_list);
   kfree_rcu(middle_list);
   kfree_rcu(old_list);
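
The prepare half can be sketched in userspace as below, assuming a small fixed-size set of stream IDs. `sid_set`, `set_copy_add()`, `set_copy_remove()` and `plan_replace()` are hypothetical names. Both lists are built before anything is published, so a failure leaves the current list untouched:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SET_MAX 8

struct sid_set {
	size_t n;
	uint32_t sid[SET_MAX];
};

/* Copy src into dst and append sid; fails if src is already full. */
static bool set_copy_add(struct sid_set *dst, const struct sid_set *src,
			 uint32_t sid)
{
	if (src->n >= SET_MAX)
		return false;
	*dst = *src;
	dst->sid[dst->n++] = sid;
	return true;
}

/* Copy src into dst, leaving out every occurrence of sid. */
static void set_copy_remove(struct sid_set *dst, const struct sid_set *src,
			    uint32_t sid)
{
	size_t i;

	dst->n = 0;
	for (i = 0; i < src->n; i++)
		if (src->sid[i] != sid)
			dst->sid[dst->n++] = src->sid[i];
}

static bool set_has(const struct sid_set *s, uint32_t sid)
{
	size_t i;

	for (i = 0; i < s->n; i++)
		if (s->sid[i] == sid)
			return true;
	return false;
}

/*
 * Prepare step: middle covers both the old and the new master, final covers
 * only the new one. Nothing is published here.
 */
static bool plan_replace(const struct sid_set *cur, uint32_t old_sid,
			 uint32_t new_sid, struct sid_set *middle,
			 struct sid_set *final)
{
	if (!set_copy_add(middle, cur, new_sid))
		return false;
	set_copy_remove(final, middle, old_sid);
	return true;
}
```

The commit step would then publish `middle`, install the STE, and publish `final`, with no allocation and hence no error path in between.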
As there is an intrinsic time window after the STE is written to
memory but before the STE invalidation sync has been completed in HW
where we have no idea which of the two domains the HW is fetching
from.
3) IOMMU Device removal
Since the RCU is also protecting the smmu instance memory and queues:
      CPU 1                         CPU 2
   invalidation
   rcu_read_lock()
                                 domain detach
                                 arm_smmu_release_device()
                                 iommu_device_unregister()
   list = READ_ONCE()
   .. list[i]->smmu ..
   rcu_read_unlock()
                                 synchronize_rcu()
                                 kfree(smmu);
But that's easy and we never hotunplug smmu's anyhow.
> > But the end result is we fully disconnect the domain from the smmu
> > instance and all domain types can be shared across all instances if
> > they support the pagetable layout. The invalidation also becomes
> > somewhat simpler as it just sweeps the list and does what it is
> > told. The special ATS list, counter and locking is removed too.
>
> OK. I'd like to give it another try. Or would you prefer to write
> yourself?
I'd be happy if you can knock it out, or at least determine whether it
is too hard or a bad idea. I'm trying to push out the io page table
stuff this cycle.
The only thing that gives me pause is the complexity of the list copy
and mutate, but I didn't try to enumerate all the mutations that are
required. Maybe if this is done in a very simple unoptimized way it is
good enough: 'mutate add master' and 'mutate remove master', each
allocating a new list copy for each operation.
Scan the list and calculate the new size. Copy the list discarding
things to delete. Add the new things to the end. Sort.
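
Those four steps, written out as an unoptimized userspace sketch over plain `uint32_t` IDs. `list_mutate()` and its helpers are made-up names, and real entries would carry more than an ID:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static int u32_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

static int u32_in(const uint32_t *set, size_t n, uint32_t v)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (set[i] == v)
			return 1;
	return 0;
}

/*
 * Build a fresh sorted array from old[], dropping everything listed in
 * del[] and adding everything in add[]. old[] itself is left untouched.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static uint32_t *list_mutate(const uint32_t *old, size_t n_old,
			     const uint32_t *del, size_t n_del,
			     const uint32_t *add, size_t n_add,
			     size_t *n_new)
{
	size_t i, keep = 0, out = 0, total;
	uint32_t *res;

	/* Scan the list and calculate the new size. */
	for (i = 0; i < n_old; i++)
		if (!u32_in(del, n_del, old[i]))
			keep++;
	total = keep + n_add;

	res = malloc((total ? total : 1) * sizeof(*res));
	if (!res)
		return NULL;

	/* Copy the list, discarding things to delete. */
	for (i = 0; i < n_old; i++)
		if (!u32_in(del, n_del, old[i]))
			res[out++] = old[i];

	/* Add the new things to the end, then sort. */
	for (i = 0; i < n_add; i++)
		res[out++] = add[i];
	qsort(res, out, sizeof(*res), u32_cmp);

	*n_new = out;
	return res;
}
```

The delete check is quadratic, which seems acceptable as long as attach/detach is not a hot path.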
I'd probably start here, try to write the two mutate functions, check
if those are enough mutate functions, then try to migrate the
invalidation logic over to use the new lists part by part. Building
the new lists can be done first in a series.
From here a future project would be to optimize the invalidation for
multi-SMMU and multi-device... The current code runs everything
serially, but we could push all the invalidation commands to all the
instances, then wait for the sync's to come back from each instance
allowing the HW invalidation to be in parallel. Then similarly do the
ATC in parallel. It is easy to do if the list is sorted already in
order of required operations. This might make most sense for ATC
invalidation since it is always range based and only needs two command
entries?
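
A toy model of that two-phase idea, assuming the list is already sorted by instance. Nothing here talks to hardware: `fake_smmu`, `par_op` and `invalidate_all()` are invented for the example and only count what would be queued and synced.

```c
#include <stddef.h>

struct fake_smmu {
	unsigned int queued;	/* commands pushed to this instance */
	unsigned int syncs;	/* syncs waited on for this instance */
};

struct par_op {
	int smmu_idx;		/* which instance the command goes to */
	unsigned int id;
};

/*
 * ops[] must be sorted by smmu_idx. Phase 1 queues every command on its
 * instance without waiting. Phase 2 then waits once per instance, so the
 * instances can work through their queues concurrently. Returns the number
 * of syncs issued.
 */
static unsigned int invalidate_all(struct fake_smmu *smmus,
				   const struct par_op *ops, size_t n)
{
	unsigned int syncs = 0;
	size_t i;

	for (i = 0; i < n; i++)
		smmus[ops[i].smmu_idx].queued++;

	for (i = 0; i < n; i++) {
		if (i && ops[i].smmu_idx == ops[i - 1].smmu_idx)
			continue;	/* this instance is already synced */
		smmus[ops[i].smmu_idx].syncs++;
		syncs++;
	}
	return syncs;
}
```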
Jason
* Re: [PATCH v2 02/11] iommu/arm-smmu-v3: Pass in smmu/iommu_domain to __arm_smmu_tlb_inv_range()
2025-04-15 4:57 ` [PATCH v2 02/11] iommu/arm-smmu-v3: Pass in smmu/iommu_domain to __arm_smmu_tlb_inv_range() Nicolin Chen
@ 2025-05-15 15:06 ` Will Deacon
0 siblings, 0 replies; 16+ messages in thread
From: Will Deacon @ 2025-05-15 15:06 UTC (permalink / raw)
To: Nicolin Chen
Cc: robin.murphy, jgg, joro, kevin.tian, praan, nathan, yi.l.liu,
peterz, mshavit, jsnitsel, smostafa, jeff.johnson, zhangzekun11,
linux-arm-kernel, iommu, linux-kernel, shameerali.kolothum.thodi
On Mon, Apr 14, 2025 at 09:57:37PM -0700, Nicolin Chen wrote:
> What __arm_smmu_tlb_inv_range() really needs is the smmu and iommu_domain
> pointers from the smmu_domain.
>
> For a nest_parent smmu_domain, it will no longer store an smmu pointer as
> it can be shared across vSMMU instances. A vSMMU structure sharing the S2
> smmu_domain instead would hold the smmu pointer.
>
> Pass them in explicitly to fit both !nest_parent and nest_parent cases.
>
> While changing it, share it in the header with arm-smmu-v3-iommufd that
> will call it too.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 ++++
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 17 +++++++++--------
> 2 files changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index d4837a33fb81..5dbdc61558a9 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -955,6 +955,10 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
> struct arm_smmu_domain *smmu_domain);
> int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
> unsigned long iova, size_t size);
> +void __arm_smmu_tlb_inv_range(struct arm_smmu_device *smmu,
> + struct arm_smmu_cmdq_ent *cmd, unsigned long iova,
> + size_t size, size_t granule,
> + struct iommu_domain *domain);
I don't think this function makes a particularly good "public" API --
the caller even sets the cmd opcode!
Can we expose some TLB invalidation helpers instead rather than the
low-level helpers?
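
To illustrate the shape I have in mind, here is a userspace mock, not a proposal for exact names or signatures. Everything prefixed `demo_` is hypothetical. The point is only that the opcode is chosen inside the helper and the generic issue function stays file-local:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes and command layout, for illustration only. */
enum demo_opcode { DEMO_OP_TLBI_S1_VA, DEMO_OP_TLBI_S2_IPA };

struct demo_cmd {
	enum demo_opcode opcode;
	uint16_t asid;
	uint16_t vmid;
	uint64_t addr;
	size_t size;
};

/* Stands in for the command queue so the result can be inspected. */
static struct demo_cmd demo_last_cmd;

/* Low-level routine: stays private to the file, callers never see it. */
static void demo_issue(const struct demo_cmd *cmd)
{
	demo_last_cmd = *cmd;
}

/* Helper for stage-1 ranges: the opcode is picked here, not by the caller. */
void demo_tlb_inv_range_asid(uint64_t iova, size_t size, uint16_t asid)
{
	struct demo_cmd cmd = {
		.opcode = DEMO_OP_TLBI_S1_VA,
		.asid = asid,
		.addr = iova,
		.size = size,
	};

	demo_issue(&cmd);
}

/* Helper for stage-2 ranges, keyed by VMID. */
void demo_tlb_inv_range_vmid(uint64_t ipa, size_t size, uint16_t vmid)
{
	struct demo_cmd cmd = {
		.opcode = DEMO_OP_TLBI_S2_IPA,
		.vmid = vmid,
		.addr = ipa,
		.size = size,
	};

	demo_issue(&cmd);
}
```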
Will
Thread overview: 16+ messages
2025-04-15 4:57 [PATCH v2 00/11] iommu/arm-smmu-v3: Allocate vmid per vsmmu instead of s2_parent Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 01/11] iommu/arm-smmu-v3: Pass in vmid to arm_smmu_make_s2_domain_ste() Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 02/11] iommu/arm-smmu-v3: Pass in smmu/iommu_domain to __arm_smmu_tlb_inv_range() Nicolin Chen
2025-05-15 15:06 ` Will Deacon
2025-04-15 4:57 ` [PATCH v2 03/11] iommu/arm-smmu-v3: Share cmdq/cmd helpers with arm-smmu-v3-iommufd Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 04/11] iommu/arm-smmu-v3: Add an inline arm_smmu_tlb_inv_vmid helper Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 05/11] iommu/arm-smmu-v3: Rename arm_smmu_attach_prepare_vmaster Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 06/11] iommu/arm-smmu-v3: Introduce arm_smmu_s2_parent_tlb_ invalidation helpers Nicolin Chen
2025-04-15 12:50 ` Jason Gunthorpe
2025-04-15 20:10 ` Nicolin Chen
2025-04-15 23:46 ` Jason Gunthorpe
2025-04-15 4:57 ` [PATCH v2 07/11] iommu/arm-smmu-v3: Introduce arm_vsmmu_atc_inv_domain() Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 08/11] iommu/arm-smmu-v3: Use vSMMU helpers for S2 and ATC invalidations Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 09/11] iommu/arm-smmu-v3: Clean up nested_ats_flush from master_domain Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 10/11] iommu/arm-smmu-v3: Decouple vmid from S2 nest_parent domain Nicolin Chen
2025-04-15 4:57 ` [PATCH v2 11/11] iommu/arm-smmu-v3: Allow to share S2 nest_parent domain across vSMMUs Nicolin Chen