* [PATCH v2 0/4] iommu/smmuv3: Add IOMMUFD dirty tracking support for SMMUv3
@ 2024-02-22 9:49 Shameer Kolothum
2024-02-22 9:49 ` [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU Shameer Kolothum
` (3 more replies)
0 siblings, 4 replies; 36+ messages in thread
From: Shameer Kolothum @ 2024-02-22 9:49 UTC (permalink / raw)
To: iommu, linux-arm-kernel
Cc: joro, jgg, kevin.tian, nicolinc, mshavit, robin.murphy, will,
joao.m.martins, jiangkunkun, zhukeqian1, linuxarm
Hi,
This revisits the earlier attempts [1, 2] to use the SMMUv3 HTTU feature
for dirty page tracking. The Intel/AMD support is already mainline.
The code is now based on the new IOMMUFD APIs and is rebased on
Jason's SMMUv3 driver refactor series [3].
Basic sanity tests were done on an emulation setup and on a test
hardware setup. Block page split/merge (BBML) support is not part of
this series; I am planning to send it separately.
v1 --> v2:
https://lore.kernel.org/linux-iommu/20231128094940.1344-1-shameerali.kolothum.thodi@huawei.com/
Addressed review comments from Jason and Joao (thanks!):
- Moved the dirty_ops setting to domain finalise (patch #3).
- Only enable DBM for stage 1 if domain_alloc_user() requests it.
- Changed the IO page table walker (patch #2) and tested with
  4KB/16KB/64KB granules and l1/l2/l3 traversal (the earlier version
  had a bug where it failed to walk the L3 level).
- Rearranged the patches a bit to improve bisectability.
- Rebased on top of v5 of Jason's SMMUv3 new API series git.
Please take a look and let me know your feedback.
Thanks,
Shameer
1. https://lore.kernel.org/lkml/20210413085457.25400-1-zhukeqian1@huawei.com/
2. https://lore.kernel.org/linux-iommu/20230518204650.14541-1-joao.m.martins@oracle.com/
3. https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
Jean-Philippe Brucker (1):
iommu/arm-smmu-v3: Add feature detection for HTTU
Joao Martins (1):
iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc
Kunkun Jiang (1):
iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping
Shameer Kolothum (1):
iommu/io-pgtable-arm: Add read_and_clear_dirty() support
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 142 +++++++++++++++++---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 8 ++
drivers/iommu/io-pgtable-arm.c | 133 +++++++++++++++++-
include/linux/io-pgtable.h | 4 +
4 files changed, 263 insertions(+), 24 deletions(-)
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU
2024-02-22 9:49 [PATCH v2 0/4] iommu/smmuv3: Add IOMMUFD dirty tracking support for SMMUv3 Shameer Kolothum
@ 2024-02-22 9:49 ` Shameer Kolothum
2024-04-23 14:41 ` Ryan Roberts
2024-02-22 9:49 ` [PATCH v2 2/4] iommu/io-pgtable-arm: Add read_and_clear_dirty() support Shameer Kolothum
` (2 subsequent siblings)
3 siblings, 1 reply; 36+ messages in thread
From: Shameer Kolothum @ 2024-02-22 9:49 UTC (permalink / raw)
To: iommu, linux-arm-kernel
Cc: joro, jgg, kevin.tian, nicolinc, mshavit, robin.murphy, will,
joao.m.martins, jiangkunkun, zhukeqian1, linuxarm
From: Jean-Philippe Brucker <jean-philippe@linaro.org>
Probe support for Hardware Translation Table Update (HTTU), which
enables hardware update of the access and dirty flags in the
translation tables.
Probe and set the smmu::features bits for Hardware Access (HA) and
Hardware Dirty (HD) support. This is in preparation for enabling HTTU
on stage 1 format context descriptors.
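The IDR0.HTTU decode above can be illustrated with a small stand-alone
sketch (the field position and feature-bit values mirror the patch;
decode_httu() itself is a hypothetical helper, not kernel code). A
higher HTTU level implies the lower one, hence the fallthrough-style
cascade:

```c
#include <assert.h>
#include <stdint.h>

/* Values from the patch: IDR0.HTTU is bits [7:6] of SMMU_IDR0 */
#define IDR0_HTTU_SHIFT		6
#define IDR0_HTTU_MASK		(0x3u << IDR0_HTTU_SHIFT)
#define IDR0_HTTU_ACCESS	1	/* HW updates the Access flag */
#define IDR0_HTTU_ACCESS_DIRTY	2	/* ... and the dirty state */

#define FEAT_HA			(1u << 20)
#define FEAT_HD			(1u << 21)

/* Hypothetical stand-in for arm_smmu_get_httu()'s switch */
static uint32_t decode_httu(uint32_t idr0)
{
	uint32_t features = 0;

	switch ((idr0 & IDR0_HTTU_MASK) >> IDR0_HTTU_SHIFT) {
	case IDR0_HTTU_ACCESS_DIRTY:
		features |= FEAT_HD;
		/* fallthrough: dirty implies access */
	case IDR0_HTTU_ACCESS:
		features |= FEAT_HA;
	}
	return features;
}
```

The same cascade appears twice in the patch: once for IDR0 and once for
the ACPI IORT override field.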
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 32 +++++++++++++++++++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 ++++
2 files changed, 37 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 0606166a8781..bd30739e3588 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4196,6 +4196,28 @@ static void arm_smmu_device_iidr_probe(struct arm_smmu_device *smmu)
}
}
+static void arm_smmu_get_httu(struct arm_smmu_device *smmu, u32 reg)
+{
+ u32 fw_features = smmu->features & (ARM_SMMU_FEAT_HA | ARM_SMMU_FEAT_HD);
+ u32 features = 0;
+
+ switch (FIELD_GET(IDR0_HTTU, reg)) {
+ case IDR0_HTTU_ACCESS_DIRTY:
+ features |= ARM_SMMU_FEAT_HD;
+ fallthrough;
+ case IDR0_HTTU_ACCESS:
+ features |= ARM_SMMU_FEAT_HA;
+ }
+
+ if (smmu->dev->of_node)
+ smmu->features |= features;
+ else if (features != fw_features)
+ /* ACPI IORT sets the HTTU bits */
+ dev_warn(smmu->dev,
+ "IDR0.HTTU overridden by FW configuration (0x%x)\n",
+ fw_features);
+}
+
static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
{
u32 reg;
@@ -4256,6 +4278,8 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
smmu->features |= ARM_SMMU_FEAT_E2H;
}
+ arm_smmu_get_httu(smmu, reg);
+
/*
* The coherency feature as set by FW is used in preference to the ID
* register, but warn on mismatch.
@@ -4448,6 +4472,14 @@ static int arm_smmu_device_acpi_probe(struct platform_device *pdev,
if (iort_smmu->flags & ACPI_IORT_SMMU_V3_COHACC_OVERRIDE)
smmu->features |= ARM_SMMU_FEAT_COHERENCY;
+ switch (FIELD_GET(ACPI_IORT_SMMU_V3_HTTU_OVERRIDE, iort_smmu->flags)) {
+ case IDR0_HTTU_ACCESS_DIRTY:
+ smmu->features |= ARM_SMMU_FEAT_HD;
+ fallthrough;
+ case IDR0_HTTU_ACCESS:
+ smmu->features |= ARM_SMMU_FEAT_HA;
+ }
+
return 0;
}
#else
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 45bcd72fcda4..5e51a6c1d55f 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -33,6 +33,9 @@
#define IDR0_ASID16 (1 << 12)
#define IDR0_ATS (1 << 10)
#define IDR0_HYP (1 << 9)
+#define IDR0_HTTU GENMASK(7, 6)
+#define IDR0_HTTU_ACCESS 1
+#define IDR0_HTTU_ACCESS_DIRTY 2
#define IDR0_COHACC (1 << 4)
#define IDR0_TTF GENMASK(3, 2)
#define IDR0_TTF_AARCH64 2
@@ -668,6 +671,8 @@ struct arm_smmu_device {
#define ARM_SMMU_FEAT_SVA (1 << 17)
#define ARM_SMMU_FEAT_E2H (1 << 18)
#define ARM_SMMU_FEAT_NESTING (1 << 19)
+#define ARM_SMMU_FEAT_HA (1 << 20)
+#define ARM_SMMU_FEAT_HD (1 << 21)
u32 features;
#define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
--
2.34.1
* [PATCH v2 2/4] iommu/io-pgtable-arm: Add read_and_clear_dirty() support
2024-02-22 9:49 [PATCH v2 0/4] iommu/smmuv3: Add IOMMUFD dirty tracking support for SMMUv3 Shameer Kolothum
2024-02-22 9:49 ` [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU Shameer Kolothum
@ 2024-02-22 9:49 ` Shameer Kolothum
2024-04-23 15:56 ` Ryan Roberts
2024-02-22 9:49 ` [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc Shameer Kolothum
2024-02-22 9:49 ` [PATCH v2 4/4] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping Shameer Kolothum
3 siblings, 1 reply; 36+ messages in thread
From: Shameer Kolothum @ 2024-02-22 9:49 UTC (permalink / raw)
To: iommu, linux-arm-kernel
Cc: joro, jgg, kevin.tian, nicolinc, mshavit, robin.murphy, will,
joao.m.martins, jiangkunkun, zhukeqian1, linuxarm
The .read_and_clear_dirty() IOMMU domain op takes care of reading the
dirty bits (i.e. the PTE has DBM set and AP[2] clear) and marshalling
them into a bitmap of a given page size.
While reading the dirty bits we also clear the PTE AP[2] bit to mark it
as writable-clean, depending on the read_and_clear_dirty() flags.
The IOPTE walker is structured to be generic, so we pass a function
pointer deciding what to do on a per-PTE basis.
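The walker-plus-callback structure can be sketched outside the kernel
with a toy two-level table instead of real LPAE descriptors (names and
the leaf encoding are illustrative, not the patch's):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES	4
#define LEAF	0x1u	/* low bit marks a leaf in this toy format */

/* Per-PTE visitor, mirroring io_pgtable_visit_fn_t in the patch */
typedef int (*visit_fn_t)(unsigned long iova, size_t size,
			  uint64_t *ptep, void *opaque);

struct walk_data {
	visit_fn_t fn;
	void *opaque;
	unsigned long addr;
};

/* Recursively walk non-leaf tables; hand every leaf to the callback */
static int walk(uint64_t *table, int lvl, size_t lvl_size,
		struct walk_data *wd)
{
	for (int i = 0; i < ENTRIES; i++) {
		uint64_t pte = table[i];
		int ret;

		if (!pte) {		/* hole: just advance the cursor */
			wd->addr += lvl_size;
			continue;
		}
		if (pte & LEAF) {
			ret = wd->fn(wd->addr, lvl_size, &table[i],
				     wd->opaque);
			if (ret)
				return ret;
			wd->addr += lvl_size;
		} else {		/* descend into the next level */
			ret = walk((uint64_t *)(uintptr_t)pte, lvl + 1,
				   lvl_size / ENTRIES, wd);
			if (ret)
				return ret;
		}
	}
	return 0;
}
```

The dirty-tracking user of this walker is then just a callback that
tests the PTE's dirty state and records the (iova, size) range in a
bitmap.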
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
drivers/iommu/io-pgtable-arm.c | 128 ++++++++++++++++++++++++++++++++-
1 file changed, 127 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index f7828a7aad41..1ce7b7a3c1e8 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -75,6 +75,7 @@
#define ARM_LPAE_PTE_NSTABLE (((arm_lpae_iopte)1) << 63)
#define ARM_LPAE_PTE_XN (((arm_lpae_iopte)3) << 53)
+#define ARM_LPAE_PTE_DBM (((arm_lpae_iopte)1) << 51)
#define ARM_LPAE_PTE_AF (((arm_lpae_iopte)1) << 10)
#define ARM_LPAE_PTE_SH_NS (((arm_lpae_iopte)0) << 8)
#define ARM_LPAE_PTE_SH_OS (((arm_lpae_iopte)2) << 8)
@@ -84,7 +85,7 @@
#define ARM_LPAE_PTE_ATTR_LO_MASK (((arm_lpae_iopte)0x3ff) << 2)
/* Ignore the contiguous bit for block splitting */
-#define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)6) << 52)
+#define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)0xd) << 51)
#define ARM_LPAE_PTE_ATTR_MASK (ARM_LPAE_PTE_ATTR_LO_MASK | \
ARM_LPAE_PTE_ATTR_HI_MASK)
/* Software bit for solving coherency races */
@@ -93,6 +94,9 @@
/* Stage-1 PTE */
#define ARM_LPAE_PTE_AP_UNPRIV (((arm_lpae_iopte)1) << 6)
#define ARM_LPAE_PTE_AP_RDONLY (((arm_lpae_iopte)2) << 6)
+#define ARM_LPAE_PTE_AP_RDONLY_BIT 7
+#define ARM_LPAE_PTE_AP_WRITABLE (ARM_LPAE_PTE_AP_RDONLY | \
+ ARM_LPAE_PTE_DBM)
#define ARM_LPAE_PTE_ATTRINDX_SHIFT 2
#define ARM_LPAE_PTE_nG (((arm_lpae_iopte)1) << 11)
@@ -729,6 +733,127 @@ static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
return iopte_to_paddr(pte, data) | iova;
}
+struct arm_lpae_iopte_read_dirty {
+ unsigned long flags;
+ struct iommu_dirty_bitmap *dirty;
+};
+
+typedef int (*io_pgtable_visit_fn_t)(unsigned long iova, size_t size,
+ arm_lpae_iopte *ptep, void *opaque);
+struct io_pgtable_walk_data {
+ const io_pgtable_visit_fn_t fn;
+ void *opaque;
+ u64 addr;
+ const u64 end;
+};
+
+static int __arm_lpae_iopte_walk(struct arm_lpae_io_pgtable *data,
+ struct io_pgtable_walk_data *walk_data,
+ arm_lpae_iopte *ptep,
+ int lvl);
+
+static int __arm_lpae_read_and_clear_dirty(unsigned long iova, size_t size,
+ arm_lpae_iopte *ptep, void *opaque)
+{
+ struct arm_lpae_iopte_read_dirty *arg = opaque;
+ struct iommu_dirty_bitmap *dirty = arg->dirty;
+ arm_lpae_iopte pte;
+
+ pte = READ_ONCE(*ptep);
+ if (WARN_ON(!pte))
+ return -EINVAL;
+
+ if ((pte & ARM_LPAE_PTE_AP_WRITABLE) == ARM_LPAE_PTE_AP_WRITABLE)
+ return 0;
+
+ iommu_dirty_bitmap_record(dirty, iova, size);
+ if (!(arg->flags & IOMMU_DIRTY_NO_CLEAR))
+ set_bit(ARM_LPAE_PTE_AP_RDONLY_BIT, (unsigned long *)ptep);
+ return 0;
+}
+
+static int io_pgtable_visit(struct arm_lpae_io_pgtable *data,
+ struct io_pgtable_walk_data *walk_data,
+ arm_lpae_iopte *ptep, int lvl)
+{
+ struct io_pgtable *iop = &data->iop;
+ arm_lpae_iopte pte = READ_ONCE(*ptep);
+
+ if (WARN_ON(!pte))
+ return -EINVAL;
+
+ if (iopte_leaf(pte, lvl, iop->fmt)) {
+ size_t size = ARM_LPAE_BLOCK_SIZE(lvl, data);
+
+ walk_data->fn(walk_data->addr, size, ptep, walk_data->opaque);
+ walk_data->addr += size;
+ return 0;
+ }
+
+ ptep = iopte_deref(pte, data);
+ return __arm_lpae_iopte_walk(data, walk_data, ptep, lvl + 1);
+}
+
+static int __arm_lpae_iopte_walk(struct arm_lpae_io_pgtable *data,
+ struct io_pgtable_walk_data *walk_data,
+ arm_lpae_iopte *ptep,
+ int lvl)
+{
+ u32 idx;
+ int max_entries, ret;
+
+ if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
+ return -EINVAL;
+
+ if (lvl == data->start_level)
+ max_entries = ARM_LPAE_PGD_SIZE(data) / sizeof(arm_lpae_iopte);
+ else
+ max_entries = ARM_LPAE_PTES_PER_TABLE(data);
+
+ for (idx = ARM_LPAE_LVL_IDX(walk_data->addr, lvl, data);
+ (idx < max_entries) && (walk_data->addr < walk_data->end); ++idx) {
+ arm_lpae_iopte *pteref = &ptep[idx];
+
+ ret = io_pgtable_visit(data, walk_data, pteref, lvl);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static int arm_lpae_read_and_clear_dirty(struct io_pgtable_ops *ops,
+ unsigned long iova, size_t size,
+ unsigned long flags,
+ struct iommu_dirty_bitmap *dirty)
+{
+ struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
+ struct io_pgtable_cfg *cfg = &data->iop.cfg;
+ struct arm_lpae_iopte_read_dirty arg = {
+ .flags = flags, .dirty = dirty,
+ };
+ struct io_pgtable_walk_data walk_data = {
+ .fn = __arm_lpae_read_and_clear_dirty,
+ .opaque = &arg,
+ .addr = iova,
+ .end = iova + size - 1,
+ };
+ arm_lpae_iopte *ptep = data->pgd;
+ int lvl = data->start_level;
+ long iaext = (s64)iova >> cfg->ias;
+
+ if (WARN_ON(!size))
+ return -EINVAL;
+
+ if (WARN_ON(iaext))
+ return -EINVAL;
+
+ if (data->iop.fmt != ARM_64_LPAE_S1)
+ return -EINVAL;
+
+ return __arm_lpae_iopte_walk(data, &walk_data, ptep, lvl);
+}
+
static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
{
unsigned long granule, page_sizes;
@@ -807,6 +932,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
.map_pages = arm_lpae_map_pages,
.unmap_pages = arm_lpae_unmap_pages,
.iova_to_phys = arm_lpae_iova_to_phys,
+ .read_and_clear_dirty = arm_lpae_read_and_clear_dirty,
};
return data;
--
2.34.1
* [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc
2024-02-22 9:49 [PATCH v2 0/4] iommu/smmuv3: Add IOMMUFD dirty tracking support for SMMUv3 Shameer Kolothum
2024-02-22 9:49 ` [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU Shameer Kolothum
2024-02-22 9:49 ` [PATCH v2 2/4] iommu/io-pgtable-arm: Add read_and_clear_dirty() support Shameer Kolothum
@ 2024-02-22 9:49 ` Shameer Kolothum
2024-02-22 11:04 ` Joao Martins
` (2 more replies)
2024-02-22 9:49 ` [PATCH v2 4/4] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping Shameer Kolothum
3 siblings, 3 replies; 36+ messages in thread
From: Shameer Kolothum @ 2024-02-22 9:49 UTC (permalink / raw)
To: iommu, linux-arm-kernel
Cc: joro, jgg, kevin.tian, nicolinc, mshavit, robin.murphy, will,
joao.m.martins, jiangkunkun, zhukeqian1, linuxarm
From: Joao Martins <joao.m.martins@oracle.com>
This provides all the infrastructure to enable dirty tracking if the
hardware has the capability and the domain alloc request asks for it.
Please note, we still report no support for IOMMU_CAP_DIRTY_TRACKING,
as it is only finally enabled in a subsequent patch.
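The flag handling in domain_alloc_user() boils down to masking the
request against what the paging path accepts and refusing dirty
tracking when the device lacks the capability. A hedged sketch (the
flag values match the IOMMUFD uapi as used here, but the helper and
the boolean standing in for device_iommu_capable() are illustrative):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* uapi-style flag values, as used by the patch */
#define HWPT_ALLOC_NEST_PARENT		(1u << 0)
#define HWPT_ALLOC_DIRTY_TRACKING	(1u << 1)

/* dev_dirty_capable stands in for
 * device_iommu_capable(dev, IOMMU_CAP_DIRTY_TRACKING) */
static int check_alloc_flags(uint32_t flags, bool dev_dirty_capable)
{
	const uint32_t paging_flags = HWPT_ALLOC_NEST_PARENT |
				      HWPT_ALLOC_DIRTY_TRACKING;

	if (flags & ~paging_flags)
		return -EOPNOTSUPP;	/* unknown flag requested */
	if ((flags & HWPT_ALLOC_DIRTY_TRACKING) && !dev_dirty_capable)
		return -EOPNOTSUPP;	/* HW can't track dirty state */
	return 0;
}
```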
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 95 ++++++++++++++++-----
include/linux/io-pgtable.h | 4 +
2 files changed, 77 insertions(+), 22 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index bd30739e3588..058bbb0dbe2e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -43,6 +43,7 @@ MODULE_PARM_DESC(disable_msipolling,
"Disable MSI-based polling for CMD_SYNC completion.");
static struct iommu_ops arm_smmu_ops;
+static struct iommu_dirty_ops arm_smmu_dirty_ops;
enum arm_smmu_msi_index {
EVTQ_MSI_INDEX,
@@ -86,7 +87,8 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu);
static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
- struct arm_smmu_device *smmu);
+ struct arm_smmu_device *smmu,
+ bool enable_dirty);
static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
static void arm_smmu_tlb_inv_all_s2(struct arm_smmu_domain *smmu_domain);
@@ -2378,7 +2380,7 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
int ret;
- ret = arm_smmu_domain_finalise(smmu_domain, master->smmu);
+ ret = arm_smmu_domain_finalise(smmu_domain, master->smmu, false);
if (ret) {
kfree(smmu_domain);
return ERR_PTR(ret);
@@ -2445,10 +2447,11 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
}
static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
- struct arm_smmu_device *smmu)
+ struct arm_smmu_device *smmu,
+ bool enable_dirty)
{
int ret;
- unsigned long ias, oas;
+ unsigned long ias;
enum io_pgtable_fmt fmt;
struct io_pgtable_cfg pgtbl_cfg;
struct io_pgtable_ops *pgtbl_ops;
@@ -2459,31 +2462,31 @@ static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2))
smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
+ pgtbl_cfg = (struct io_pgtable_cfg) {
+ .pgsize_bitmap = smmu->pgsize_bitmap,
+ .coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENCY,
+ .tlb = &arm_smmu_flush_ops,
+ .iommu_dev = smmu->dev,
+ };
+
switch (smmu_domain->stage) {
case ARM_SMMU_DOMAIN_S1:
ias = (smmu->features & ARM_SMMU_FEAT_VAX) ? 52 : 48;
- ias = min_t(unsigned long, ias, VA_BITS);
- oas = smmu->ias;
+ pgtbl_cfg.ias = min_t(unsigned long, ias, VA_BITS);
+ pgtbl_cfg.oas = smmu->ias;
+ if (enable_dirty)
+ pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_HD;
fmt = ARM_64_LPAE_S1;
break;
case ARM_SMMU_DOMAIN_S2:
- ias = smmu->ias;
- oas = smmu->oas;
+ pgtbl_cfg.ias = smmu->ias;
+ pgtbl_cfg.oas = smmu->oas;
fmt = ARM_64_LPAE_S2;
break;
default:
return -EINVAL;
}
- pgtbl_cfg = (struct io_pgtable_cfg) {
- .pgsize_bitmap = smmu->pgsize_bitmap,
- .ias = ias,
- .oas = oas,
- .coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENCY,
- .tlb = &arm_smmu_flush_ops,
- .iommu_dev = smmu->dev,
- };
-
pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
if (!pgtbl_ops)
return -ENOMEM;
@@ -2491,7 +2494,8 @@ static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
smmu_domain->domain.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
smmu_domain->domain.geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1;
smmu_domain->domain.geometry.force_aperture = true;
-
+ if (enable_dirty && smmu_domain->stage == ARM_SMMU_DOMAIN_S1)
+ smmu_domain->domain.dirty_ops = &arm_smmu_dirty_ops;
ret = arm_smmu_domain_alloc_id(smmu, smmu_domain);
if (ret < 0) {
free_io_pgtable_ops(pgtbl_ops);
@@ -2811,7 +2815,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
mutex_lock(&smmu_domain->init_mutex);
if (!smmu_domain->smmu) {
- ret = arm_smmu_domain_finalise(smmu_domain, smmu);
+ ret = arm_smmu_domain_finalise(smmu_domain, smmu, false);
} else if (smmu_domain->smmu != smmu)
ret = -EINVAL;
@@ -2876,7 +2880,7 @@ static int arm_smmu_s1_set_dev_pasid(struct iommu_domain *domain,
mutex_lock(&smmu_domain->init_mutex);
if (!smmu_domain->smmu)
- ret = arm_smmu_domain_finalise(smmu_domain, smmu);
+ ret = arm_smmu_domain_finalise(smmu_domain, smmu, false);
else if (smmu_domain->smmu != smmu)
ret = -EINVAL;
mutex_unlock(&smmu_domain->init_mutex);
@@ -3193,7 +3197,9 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
const struct iommu_user_data *user_data)
{
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
- const u32 paging_flags = IOMMU_HWPT_ALLOC_NEST_PARENT;
+ const u32 paging_flags = IOMMU_HWPT_ALLOC_NEST_PARENT |
+ IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
+ bool enforce_dirty = flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
struct arm_smmu_domain *smmu_domain;
int ret;
@@ -3206,6 +3212,10 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
if (user_data)
return ERR_PTR(-EINVAL);
+ if (enforce_dirty &&
+ !device_iommu_capable(dev, IOMMU_CAP_DIRTY_TRACKING))
+ return ERR_PTR(-EOPNOTSUPP);
+
smmu_domain = arm_smmu_domain_alloc();
if (!smmu_domain)
return ERR_PTR(-ENOMEM);
@@ -3221,7 +3231,7 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
smmu_domain->domain.type = IOMMU_DOMAIN_UNMANAGED;
smmu_domain->domain.ops = arm_smmu_ops.default_domain_ops;
- ret = arm_smmu_domain_finalise(smmu_domain, master->smmu);
+ ret = arm_smmu_domain_finalise(smmu_domain, master->smmu, enforce_dirty);
if (ret)
goto err_free;
return &smmu_domain->domain;
@@ -3470,6 +3480,42 @@ static void arm_smmu_release_device(struct device *dev)
kfree(master);
}
+static int arm_smmu_read_and_clear_dirty(struct iommu_domain *domain,
+ unsigned long iova, size_t size,
+ unsigned long flags,
+ struct iommu_dirty_bitmap *dirty)
+{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+ struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+
+ if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
+ return -EINVAL;
+
+ if (WARN_ON_ONCE(!ops || !ops->read_and_clear_dirty))
+ return -ENODEV;
+
+ return ops->read_and_clear_dirty(ops, iova, size, flags, dirty);
+}
+
+static int arm_smmu_set_dirty_tracking(struct iommu_domain *domain,
+ bool enabled)
+{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+ struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+
+ if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
+ return -EINVAL;
+
+ if (WARN_ON_ONCE(!ops))
+ return -ENODEV;
+
+ /*
+ * Always enabled and the dirty bitmap is cleared prior to
+ * set_dirty_tracking().
+ */
+ return 0;
+}
+
static struct iommu_group *arm_smmu_device_group(struct device *dev)
{
struct iommu_group *group;
@@ -3612,6 +3658,11 @@ static struct iommu_ops arm_smmu_ops = {
}
};
+static struct iommu_dirty_ops arm_smmu_dirty_ops = {
+ .read_and_clear_dirty = arm_smmu_read_and_clear_dirty,
+ .set_dirty_tracking = arm_smmu_set_dirty_tracking,
+};
+
/* Probing and initialisation functions */
static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
struct arm_smmu_queue *q,
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 86cf1f7ae389..8e75f944f07a 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -85,6 +85,8 @@ struct io_pgtable_cfg {
*
* IO_PGTABLE_QUIRK_ARM_OUTER_WBWA: Override the outer-cacheability
* attributes set in the TCR for a non-coherent page-table walker.
+ *
+ * IO_PGTABLE_QUIRK_ARM_HD: Enables dirty tracking in stage 1 pagetable.
*/
#define IO_PGTABLE_QUIRK_ARM_NS BIT(0)
#define IO_PGTABLE_QUIRK_NO_PERMS BIT(1)
@@ -92,6 +94,8 @@ struct io_pgtable_cfg {
#define IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT BIT(4)
#define IO_PGTABLE_QUIRK_ARM_TTBR1 BIT(5)
#define IO_PGTABLE_QUIRK_ARM_OUTER_WBWA BIT(6)
+ #define IO_PGTABLE_QUIRK_ARM_HD BIT(7)
+
unsigned long quirks;
unsigned long pgsize_bitmap;
unsigned int ias;
--
2.34.1
* [PATCH v2 4/4] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping
2024-02-22 9:49 [PATCH v2 0/4] iommu/smmuv3: Add IOMMUFD dirty tracking support for SMMUv3 Shameer Kolothum
` (2 preceding siblings ...)
2024-02-22 9:49 ` [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc Shameer Kolothum
@ 2024-02-22 9:49 ` Shameer Kolothum
2024-03-08 14:32 ` Jason Gunthorpe
2024-04-23 16:45 ` Ryan Roberts
3 siblings, 2 replies; 36+ messages in thread
From: Shameer Kolothum @ 2024-02-22 9:49 UTC (permalink / raw)
To: iommu, linux-arm-kernel
Cc: joro, jgg, kevin.tian, nicolinc, mshavit, robin.murphy, will,
joao.m.martins, jiangkunkun, zhukeqian1, linuxarm
From: Kunkun Jiang <jiangkunkun@huawei.com>
If the io-pgtable quirk flag indicates support for hardware update of
the dirty state, enable the HA/HD bits in the SMMU CD and also set the
DBM bit in the page descriptor.
Now also report the dirty page tracking capability of SMMUv3.
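With DBM set, AP[2] no longer simply means read-only: DBM=1/AP[2]=1 is
writable-clean and DBM=1/AP[2]=0 is writable-dirty (hardware clears
AP[2] on the first write). The per-PTE dirty test used by patch #2 can
be sketched as (bit positions mirror the patches; the helper name is
illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_DBM		(1ull << 51)	/* Dirty Bit Modifier */
#define PTE_AP_RDONLY	(1ull << 7)	/* AP[2]: writes disabled */
#define PTE_AP_WRITABLE	(PTE_AP_RDONLY | PTE_DBM)

/* A DBM-managed PTE is dirty once hardware has cleared AP[2].
 * As in the patch, this assumes the walker only visits mappings
 * created with the HD quirk, where writable PTEs carry DBM. */
static bool pte_is_writable_dirty(uint64_t pte)
{
	return (pte & PTE_AP_WRITABLE) != PTE_AP_WRITABLE;
}
```

Clearing dirty state is then just setting AP[2] back, which is what
read_and_clear_dirty() does unless IOMMU_DIRTY_NO_CLEAR is passed.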
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++++++++++++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +++
drivers/iommu/io-pgtable-arm.c | 5 ++++-
3 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 058bbb0dbe2e..4423cc7e48cf 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1271,6 +1271,12 @@ void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
CTXDESC_CD_0_ASET |
FIELD_PREP(CTXDESC_CD_0_ASID, smmu_domain->asid)
);
+
+ /* To enable dirty flag update, set both Access flag and dirty state update */
+ if (pgtbl_cfg->quirks & IO_PGTABLE_QUIRK_ARM_HD)
+ target->data[0] |= cpu_to_le64(CTXDESC_CD_0_TCR_HA |
+ CTXDESC_CD_0_TCR_HD);
+
target->data[1] = cpu_to_le64(pgtbl_cfg->arm_lpae_s1_cfg.ttbr &
CTXDESC_CD_1_TTB0_MASK);
target->data[3] = cpu_to_le64(pgtbl_cfg->arm_lpae_s1_cfg.mair);
@@ -2307,6 +2313,13 @@ static const struct iommu_flush_ops arm_smmu_flush_ops = {
.tlb_add_page = arm_smmu_tlb_inv_page_nosync,
};
+static bool arm_smmu_dbm_capable(struct arm_smmu_device *smmu)
+{
+ u32 features = (ARM_SMMU_FEAT_HD | ARM_SMMU_FEAT_COHERENCY);
+
+ return (smmu->features & features) == features;
+}
+
/* IOMMU API */
static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
{
@@ -2319,6 +2332,8 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
case IOMMU_CAP_NOEXEC:
case IOMMU_CAP_DEFERRED_FLUSH:
return true;
+ case IOMMU_CAP_DIRTY_TRACKING:
+ return arm_smmu_dbm_capable(master->smmu);
default:
return false;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 5e51a6c1d55f..a9cd805e5a1b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -313,6 +313,9 @@ struct arm_smmu_cd {
#define CTXDESC_CD_0_TCR_IPS GENMASK_ULL(34, 32)
#define CTXDESC_CD_0_TCR_TBI0 (1ULL << 38)
+#define CTXDESC_CD_0_TCR_HA (1UL << 43)
+#define CTXDESC_CD_0_TCR_HD (1UL << 42)
+
#define CTXDESC_CD_0_AA64 (1UL << 41)
#define CTXDESC_CD_0_S (1UL << 44)
#define CTXDESC_CD_0_R (1UL << 45)
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 1ce7b7a3c1e8..56e88e555fc7 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -429,6 +429,8 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
pte = ARM_LPAE_PTE_nG;
if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ))
pte |= ARM_LPAE_PTE_AP_RDONLY;
+ else if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_HD)
+ pte |= ARM_LPAE_PTE_AP_WRITABLE;
if (!(prot & IOMMU_PRIV))
pte |= ARM_LPAE_PTE_AP_UNPRIV;
} else {
@@ -948,7 +950,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
IO_PGTABLE_QUIRK_ARM_TTBR1 |
- IO_PGTABLE_QUIRK_ARM_OUTER_WBWA))
+ IO_PGTABLE_QUIRK_ARM_OUTER_WBWA |
+ IO_PGTABLE_QUIRK_ARM_HD))
return NULL;
data = arm_lpae_alloc_pgtable(cfg);
--
2.34.1
* Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc
2024-02-22 9:49 ` [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc Shameer Kolothum
@ 2024-02-22 11:04 ` Joao Martins
2024-02-22 11:31 ` Shameerali Kolothum Thodi
2024-03-08 14:31 ` Jason Gunthorpe
2024-04-23 16:27 ` Ryan Roberts
2 siblings, 1 reply; 36+ messages in thread
From: Joao Martins @ 2024-02-22 11:04 UTC (permalink / raw)
To: Shameer Kolothum
Cc: joro, jgg, kevin.tian, nicolinc, iommu, mshavit, robin.murphy,
will, jiangkunkun, zhukeqian1, linuxarm, linux-arm-kernel
On 22/02/2024 09:49, Shameer Kolothum wrote:
> From: Joao Martins <joao.m.martins@oracle.com>
>
> This provides all the infrastructure to enable dirty tracking if the
> hardware has the capability and domain alloc request for it.
>
> Please note, we still report no support for IOMMU_CAP_DIRTY_TRACKING
> as it will finally be enabled in a subsequent patch.
>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 95 ++++++++++++++++-----
> include/linux/io-pgtable.h | 4 +
> 2 files changed, 77 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index bd30739e3588..058bbb0dbe2e 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -43,6 +43,7 @@ MODULE_PARM_DESC(disable_msipolling,
> "Disable MSI-based polling for CMD_SYNC completion.");
>
> static struct iommu_ops arm_smmu_ops;
> +static struct iommu_dirty_ops arm_smmu_dirty_ops;
>
> enum arm_smmu_msi_index {
> EVTQ_MSI_INDEX,
> @@ -86,7 +87,8 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
>
> static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu);
> static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> - struct arm_smmu_device *smmu);
> + struct arm_smmu_device *smmu,
> + bool enable_dirty);
> static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
> static void arm_smmu_tlb_inv_all_s2(struct arm_smmu_domain *smmu_domain);
>
> @@ -2378,7 +2380,7 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
> struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> int ret;
>
> - ret = arm_smmu_domain_finalise(smmu_domain, master->smmu);
> + ret = arm_smmu_domain_finalise(smmu_domain, master->smmu, false);
> if (ret) {
> kfree(smmu_domain);
> return ERR_PTR(ret);
> @@ -2445,10 +2447,11 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
> }
>
> static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> - struct arm_smmu_device *smmu)
> + struct arm_smmu_device *smmu,
> + bool enable_dirty)
> {
> int ret;
> - unsigned long ias, oas;
> + unsigned long ias;
> enum io_pgtable_fmt fmt;
> struct io_pgtable_cfg pgtbl_cfg;
> struct io_pgtable_ops *pgtbl_ops;
> @@ -2459,31 +2462,31 @@ static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2))
> smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>
> + pgtbl_cfg = (struct io_pgtable_cfg) {
> + .pgsize_bitmap = smmu->pgsize_bitmap,
> + .coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENCY,
> + .tlb = &arm_smmu_flush_ops,
> + .iommu_dev = smmu->dev,
> + };
> +
> switch (smmu_domain->stage) {
> case ARM_SMMU_DOMAIN_S1:
> ias = (smmu->features & ARM_SMMU_FEAT_VAX) ? 52 : 48;
> - ias = min_t(unsigned long, ias, VA_BITS);
> - oas = smmu->ias;
> + pgtbl_cfg.ias = min_t(unsigned long, ias, VA_BITS);
> + pgtbl_cfg.oas = smmu->ias;
> + if (enable_dirty)
> + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_HD;
> fmt = ARM_64_LPAE_S1;
> break;
> case ARM_SMMU_DOMAIN_S2:
> - ias = smmu->ias;
> - oas = smmu->oas;
> + pgtbl_cfg.ias = smmu->ias;
> + pgtbl_cfg.oas = smmu->oas;
> fmt = ARM_64_LPAE_S2;
> break;
> default:
> return -EINVAL;
> }
>
> - pgtbl_cfg = (struct io_pgtable_cfg) {
> - .pgsize_bitmap = smmu->pgsize_bitmap,
> - .ias = ias,
> - .oas = oas,
> - .coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENCY,
> - .tlb = &arm_smmu_flush_ops,
> - .iommu_dev = smmu->dev,
> - };
> -
> pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> if (!pgtbl_ops)
> return -ENOMEM;
> @@ -2491,7 +2494,8 @@ static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> smmu_domain->domain.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> smmu_domain->domain.geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1;
> smmu_domain->domain.geometry.force_aperture = true;
> -
> + if (enable_dirty && smmu_domain->stage == ARM_SMMU_DOMAIN_S1)
> + smmu_domain->domain.dirty_ops = &arm_smmu_dirty_ops;
> ret = arm_smmu_domain_alloc_id(smmu, smmu_domain);
> if (ret < 0) {
> free_io_pgtable_ops(pgtbl_ops);
> @@ -2811,7 +2815,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> mutex_lock(&smmu_domain->init_mutex);
>
> if (!smmu_domain->smmu) {
> - ret = arm_smmu_domain_finalise(smmu_domain, smmu);
> + ret = arm_smmu_domain_finalise(smmu_domain, smmu, false);
> } else if (smmu_domain->smmu != smmu)
> ret = -EINVAL;
>
I think we are missing the domain attach_dev check for dirty tracking.
Something like:
if (domain->dirty_ops && !arm_smmu_dbm_capable(smmu))
return -EINVAL;
But that helper is only introduced in the last patch, so maybe:
if (domain->dirty_ops &&
!device_iommu_capable(dev, IOMMU_CAP_DIRTY_TRACKING))
return -EINVAL;
> @@ -2876,7 +2880,7 @@ static int arm_smmu_s1_set_dev_pasid(struct iommu_domain *domain,
>
> mutex_lock(&smmu_domain->init_mutex);
> if (!smmu_domain->smmu)
> - ret = arm_smmu_domain_finalise(smmu_domain, smmu);
> + ret = arm_smmu_domain_finalise(smmu_domain, smmu, false);
> else if (smmu_domain->smmu != smmu)
> ret = -EINVAL;
> mutex_unlock(&smmu_domain->init_mutex);
> @@ -3193,7 +3197,9 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
> const struct iommu_user_data *user_data)
> {
> struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> - const u32 paging_flags = IOMMU_HWPT_ALLOC_NEST_PARENT;
> + const u32 paging_flags = IOMMU_HWPT_ALLOC_NEST_PARENT |
> + IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
> + bool enforce_dirty = flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
> struct arm_smmu_domain *smmu_domain;
> int ret;
>
> @@ -3206,6 +3212,10 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
> if (user_data)
> return ERR_PTR(-EINVAL);
>
> + if (enforce_dirty &&
> + !device_iommu_capable(dev, IOMMU_CAP_DIRTY_TRACKING))
> + return ERR_PTR(-EOPNOTSUPP);
> +
> smmu_domain = arm_smmu_domain_alloc();
> if (!smmu_domain)
> return ERR_PTR(-ENOMEM);
> @@ -3221,7 +3231,7 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
>
> smmu_domain->domain.type = IOMMU_DOMAIN_UNMANAGED;
> smmu_domain->domain.ops = arm_smmu_ops.default_domain_ops;
> - ret = arm_smmu_domain_finalise(smmu_domain, master->smmu);
> + ret = arm_smmu_domain_finalise(smmu_domain, master->smmu, enforce_dirty);
> if (ret)
> goto err_free;
> return &smmu_domain->domain;
> @@ -3470,6 +3480,42 @@ static void arm_smmu_release_device(struct device *dev)
> kfree(master);
> }
>
> +static int arm_smmu_read_and_clear_dirty(struct iommu_domain *domain,
> + unsigned long iova, size_t size,
> + unsigned long flags,
> + struct iommu_dirty_bitmap *dirty)
> +{
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> + struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
> +
> + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
> + return -EINVAL;
> +
> + if (WARN_ON_ONCE(!ops || !ops->read_and_clear_dirty))
> + return -ENODEV;
> +
> + return ops->read_and_clear_dirty(ops, iova, size, flags, dirty);
> +}
> +
> +static int arm_smmu_set_dirty_tracking(struct iommu_domain *domain,
> + bool enabled)
> +{
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> + struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
> +
> + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
> + return -EINVAL;
> +
> + if (WARN_ON_ONCE(!ops))
> + return -ENODEV;
> +
> + /*
> + * Always enabled and the dirty bitmap is cleared prior to
> + * set_dirty_tracking().
> + */
> + return 0;
> +}
> +
> static struct iommu_group *arm_smmu_device_group(struct device *dev)
> {
> struct iommu_group *group;
> @@ -3612,6 +3658,11 @@ static struct iommu_ops arm_smmu_ops = {
> }
> };
>
> +static struct iommu_dirty_ops arm_smmu_dirty_ops = {
> + .read_and_clear_dirty = arm_smmu_read_and_clear_dirty,
> + .set_dirty_tracking = arm_smmu_set_dirty_tracking,
> +};
> +
> /* Probing and initialisation functions */
> static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
> struct arm_smmu_queue *q,
> diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> index 86cf1f7ae389..8e75f944f07a 100644
> --- a/include/linux/io-pgtable.h
> +++ b/include/linux/io-pgtable.h
> @@ -85,6 +85,8 @@ struct io_pgtable_cfg {
> *
> * IO_PGTABLE_QUIRK_ARM_OUTER_WBWA: Override the outer-cacheability
> * attributes set in the TCR for a non-coherent page-table walker.
> + *
> + * IO_PGTABLE_QUIRK_ARM_HD: Enables dirty tracking in stage 1 pagetable.
> */
> #define IO_PGTABLE_QUIRK_ARM_NS BIT(0)
> #define IO_PGTABLE_QUIRK_NO_PERMS BIT(1)
> @@ -92,6 +94,8 @@ struct io_pgtable_cfg {
> #define IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT BIT(4)
> #define IO_PGTABLE_QUIRK_ARM_TTBR1 BIT(5)
> #define IO_PGTABLE_QUIRK_ARM_OUTER_WBWA BIT(6)
> + #define IO_PGTABLE_QUIRK_ARM_HD BIT(7)
> +
> unsigned long quirks;
> unsigned long pgsize_bitmap;
> unsigned int ias;
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* RE: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc
2024-02-22 11:04 ` Joao Martins
@ 2024-02-22 11:31 ` Shameerali Kolothum Thodi
2024-02-22 11:37 ` Joao Martins
0 siblings, 1 reply; 36+ messages in thread
From: Shameerali Kolothum Thodi @ 2024-02-22 11:31 UTC (permalink / raw)
To: Joao Martins
Cc: joro@8bytes.org, jgg@nvidia.com, kevin.tian@intel.com,
nicolinc@nvidia.com, iommu@lists.linux.dev, mshavit@google.com,
robin.murphy@arm.com, will@kernel.org, jiangkunkun, zhukeqian,
Linuxarm, linux-arm-kernel@lists.infradead.org
> -----Original Message-----
> From: Joao Martins <joao.m.martins@oracle.com>
> Sent: Thursday, February 22, 2024 11:04 AM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: joro@8bytes.org; jgg@nvidia.com; kevin.tian@intel.com;
> nicolinc@nvidia.com; iommu@lists.linux.dev; mshavit@google.com;
> robin.murphy@arm.com; will@kernel.org; jiangkunkun
> <jiangkunkun@huawei.com>; zhukeqian <zhukeqian1@huawei.com>;
> Linuxarm <linuxarm@huawei.com>; linux-arm-kernel@lists.infradead.org
> Subject: Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty
> tracking in domain alloc
>
> On 22/02/2024 09:49, Shameer Kolothum wrote:
> > From: Joao Martins <joao.m.martins@oracle.com>
> >
> > This provides all the infrastructure to enable dirty tracking if the
> > hardware has the capability and domain alloc request for it.
> >
> > Please note, we still report no support for IOMMU_CAP_DIRTY_TRACKING
> > as it will finally be enabled in a subsequent patch.
> >
> > Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> > Signed-off-by: Shameer Kolothum
> <shameerali.kolothum.thodi@huawei.com>
> > [...]
> > @@ -2811,7 +2815,7 @@ static int arm_smmu_attach_dev(struct
> iommu_domain *domain, struct device *dev)
> > mutex_lock(&smmu_domain->init_mutex);
> >
> > if (!smmu_domain->smmu) {
> > - ret = arm_smmu_domain_finalise(smmu_domain, smmu);
> > + ret = arm_smmu_domain_finalise(smmu_domain, smmu,
> false);
> > } else if (smmu_domain->smmu != smmu)
> > ret = -EINVAL;
> >
>
>
> I think we are missing the domain attach_dev check for dirty tracking.
>
> Something like:
>
> if (domain->dirty_ops && !arm_smmu_dbm_capable(smmu))
> return -EINVAL;
>
> But that helper is only introduced in the last patch, so maybe:
>
> if (domain->dirty_ops &&
> !device_iommu_capable(dev, IOMMU_CAP_DIRTY_TRACKING))
> return -EINVAL;
Ok. But do we really need to check this in attach()? dirty_ops is only set
when it is requested in alloc_user(), and there we return an error when the
hardware doesn't have the capability. So I'm not sure how this matters in
the attach() path. Maybe I am missing something.
Thanks,
Shameer
* Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc
2024-02-22 11:31 ` Shameerali Kolothum Thodi
@ 2024-02-22 11:37 ` Joao Martins
2024-02-22 12:24 ` Shameerali Kolothum Thodi
0 siblings, 1 reply; 36+ messages in thread
From: Joao Martins @ 2024-02-22 11:37 UTC (permalink / raw)
To: Shameerali Kolothum Thodi
Cc: joro@8bytes.org, jgg@nvidia.com, kevin.tian@intel.com,
nicolinc@nvidia.com, iommu@lists.linux.dev, mshavit@google.com,
robin.murphy@arm.com, will@kernel.org, jiangkunkun, zhukeqian,
Linuxarm, linux-arm-kernel@lists.infradead.org
On 22/02/2024 11:31, Shameerali Kolothum Thodi wrote:
>
>
>> -----Original Message-----
>> From: Joao Martins <joao.m.martins@oracle.com>
>> Sent: Thursday, February 22, 2024 11:04 AM
>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
>> Cc: joro@8bytes.org; jgg@nvidia.com; kevin.tian@intel.com;
>> nicolinc@nvidia.com; iommu@lists.linux.dev; mshavit@google.com;
>> robin.murphy@arm.com; will@kernel.org; jiangkunkun
>> <jiangkunkun@huawei.com>; zhukeqian <zhukeqian1@huawei.com>;
>> Linuxarm <linuxarm@huawei.com>; linux-arm-kernel@lists.infradead.org
>> Subject: Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty
>> tracking in domain alloc
>>
>> On 22/02/2024 09:49, Shameer Kolothum wrote:
>>> From: Joao Martins <joao.m.martins@oracle.com>
>>>
>>> This provides all the infrastructure to enable dirty tracking if the
>>> hardware has the capability and domain alloc request for it.
>>>
>>> Please note, we still report no support for IOMMU_CAP_DIRTY_TRACKING
>>> as it will finally be enabled in a subsequent patch.
>>>
>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>> Signed-off-by: Shameer Kolothum
>> <shameerali.kolothum.thodi@huawei.com>
>>> [...]
>>> @@ -2811,7 +2815,7 @@ static int arm_smmu_attach_dev(struct
>> iommu_domain *domain, struct device *dev)
>>> mutex_lock(&smmu_domain->init_mutex);
>>>
>>> if (!smmu_domain->smmu) {
>>> - ret = arm_smmu_domain_finalise(smmu_domain, smmu);
>>> + ret = arm_smmu_domain_finalise(smmu_domain, smmu,
>> false);
>>> } else if (smmu_domain->smmu != smmu)
>>> ret = -EINVAL;
>>>
>>
>>
>> I think we are missing the domain attach_dev check for dirty tracking.
>>
>> Something like:
>>
>> if (domain->dirty_ops && !arm_smmu_dbm_capable(smmu))
>> return -EINVAL;
>>
>> But that helper is only introduced in the last patch, so maybe:
>>
>> if (domain->dirty_ops &&
>> !device_iommu_capable(dev, IOMMU_CAP_DIRTY_TRACKING))
>> return -EINVAL;
>
> Ok. But do we really need to check this in attach()? As dirty_ops are added only
> if it is requested in alloc_user() and there we return err when hardware doesn't
> have the capability. So not sure how this matters in attach() path. May be I am
> missing something.
That's when you create the domain with dev A. Dev A is attached right after,
but later on you can attach another device B to the same domain. The check is
there so that a domain with dirty tracking ops set only ever contains devices
that support dirty tracking.
Joao
* RE: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc
2024-02-22 11:37 ` Joao Martins
@ 2024-02-22 12:24 ` Shameerali Kolothum Thodi
2024-02-22 13:11 ` Jason Gunthorpe
2024-02-22 13:23 ` Joao Martins
0 siblings, 2 replies; 36+ messages in thread
From: Shameerali Kolothum Thodi @ 2024-02-22 12:24 UTC (permalink / raw)
To: Joao Martins
Cc: joro@8bytes.org, jgg@nvidia.com, kevin.tian@intel.com,
nicolinc@nvidia.com, iommu@lists.linux.dev, mshavit@google.com,
robin.murphy@arm.com, will@kernel.org, jiangkunkun, zhukeqian,
Linuxarm, linux-arm-kernel@lists.infradead.org
> -----Original Message-----
> From: Joao Martins <joao.m.martins@oracle.com>
> Sent: Thursday, February 22, 2024 11:38 AM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: joro@8bytes.org; jgg@nvidia.com; kevin.tian@intel.com;
> nicolinc@nvidia.com; iommu@lists.linux.dev; mshavit@google.com;
> robin.murphy@arm.com; will@kernel.org; jiangkunkun
> <jiangkunkun@huawei.com>; zhukeqian <zhukeqian1@huawei.com>; Linuxarm
> <linuxarm@huawei.com>; linux-arm-kernel@lists.infradead.org
> Subject: Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking
> in domain alloc
>
> On 22/02/2024 11:31, Shameerali Kolothum Thodi wrote:
> >
> >
> >> -----Original Message-----
> >> From: Joao Martins <joao.m.martins@oracle.com>
> >> Sent: Thursday, February 22, 2024 11:04 AM
> >> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> >> Cc: joro@8bytes.org; jgg@nvidia.com; kevin.tian@intel.com;
> >> nicolinc@nvidia.com; iommu@lists.linux.dev; mshavit@google.com;
> >> robin.murphy@arm.com; will@kernel.org; jiangkunkun
> >> <jiangkunkun@huawei.com>; zhukeqian <zhukeqian1@huawei.com>;
> >> Linuxarm <linuxarm@huawei.com>; linux-arm-kernel@lists.infradead.org
> >> Subject: Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty
> >> tracking in domain alloc
> >>
> >> On 22/02/2024 09:49, Shameer Kolothum wrote:
> >>> From: Joao Martins <joao.m.martins@oracle.com>
> >>>
> >>> This provides all the infrastructure to enable dirty tracking if the
> >>> hardware has the capability and domain alloc request for it.
> >>>
> >>> Please note, we still report no support for IOMMU_CAP_DIRTY_TRACKING
> >>> as it will finally be enabled in a subsequent patch.
> >>>
> >>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> >>> Signed-off-by: Shameer Kolothum
> >> <shameerali.kolothum.thodi@huawei.com>
> >>> ---
> >>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 95
> >> ++++++++++++++++-----
> >>> include/linux/io-pgtable.h | 4 +
> >>> 2 files changed, 77 insertions(+), 22 deletions(-)
> >>>
> >>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >>> index bd30739e3588..058bbb0dbe2e 100644
> >>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >>> @@ -43,6 +43,7 @@ MODULE_PARM_DESC(disable_msipolling,
> >>> "Disable MSI-based polling for CMD_SYNC completion.");
> >>>
> >>> static struct iommu_ops arm_smmu_ops;
> >>> +static struct iommu_dirty_ops arm_smmu_dirty_ops;
> >>>
> >>> enum arm_smmu_msi_index {
> >>> EVTQ_MSI_INDEX,
> >>> @@ -86,7 +87,8 @@ static struct arm_smmu_option_prop
> >>> arm_smmu_options[] = {
> >>>
> >>> static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device
> >>> *smmu); static int arm_smmu_domain_finalise(struct arm_smmu_domain
> >> *smmu_domain,
> >>> - struct arm_smmu_device *smmu);
> >>> + struct arm_smmu_device *smmu,
> >>> + bool enable_dirty);
> >>> static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
> >>> static void arm_smmu_tlb_inv_all_s2(struct arm_smmu_domain
> >>> *smmu_domain);
> >>>
> >>> @@ -2378,7 +2380,7 @@ static struct iommu_domain
> >> *arm_smmu_domain_alloc_paging(struct device *dev)
> >>> struct arm_smmu_master *master =
> >> dev_iommu_priv_get(dev);
> >>> int ret;
> >>>
> >>> - ret = arm_smmu_domain_finalise(smmu_domain, master-
> >>> smmu);
> >>> + ret = arm_smmu_domain_finalise(smmu_domain, master-
> >>> smmu, false);
> >>> if (ret) {
> >>> kfree(smmu_domain);
> >>> return ERR_PTR(ret);
> >>> @@ -2445,10 +2447,11 @@ static void arm_smmu_domain_free(struct
> >>> iommu_domain *domain) }
> >>>
> >>> static int arm_smmu_domain_finalise(struct arm_smmu_domain
> >> *smmu_domain,
> >>> - struct arm_smmu_device *smmu)
> >>> + struct arm_smmu_device *smmu,
> >>> + bool enable_dirty)
> >>> {
> >>> int ret;
> >>> - unsigned long ias, oas;
> >>> + unsigned long ias;
> >>> enum io_pgtable_fmt fmt;
> >>> struct io_pgtable_cfg pgtbl_cfg;
> >>> struct io_pgtable_ops *pgtbl_ops;
> >>> @@ -2459,31 +2462,31 @@ static int arm_smmu_domain_finalise(struct
> >> arm_smmu_domain *smmu_domain,
> >>> if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2))
> >>> smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
> >>>
> >>> + pgtbl_cfg = (struct io_pgtable_cfg) {
> >>> + .pgsize_bitmap = smmu->pgsize_bitmap,
> >>> + .coherent_walk = smmu->features &
> >> ARM_SMMU_FEAT_COHERENCY,
> >>> + .tlb = &arm_smmu_flush_ops,
> >>> + .iommu_dev = smmu->dev,
> >>> + };
> >>> +
> >>> switch (smmu_domain->stage) {
> >>> case ARM_SMMU_DOMAIN_S1:
> >>> ias = (smmu->features & ARM_SMMU_FEAT_VAX) ? 52 : 48;
> >>> - ias = min_t(unsigned long, ias, VA_BITS);
> >>> - oas = smmu->ias;
> >>> + pgtbl_cfg.ias = min_t(unsigned long, ias, VA_BITS);
> >>> + pgtbl_cfg.oas = smmu->ias;
> >>> + if (enable_dirty)
> >>> + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_HD;
> >>> fmt = ARM_64_LPAE_S1;
> >>> break;
> >>> case ARM_SMMU_DOMAIN_S2:
> >>> - ias = smmu->ias;
> >>> - oas = smmu->oas;
> >>> + pgtbl_cfg.ias = smmu->ias;
> >>> + pgtbl_cfg.oas = smmu->oas;
> >>> fmt = ARM_64_LPAE_S2;
> >>> break;
> >>> default:
> >>> return -EINVAL;
> >>> }
> >>>
> >>> - pgtbl_cfg = (struct io_pgtable_cfg) {
> >>> - .pgsize_bitmap = smmu->pgsize_bitmap,
> >>> - .ias = ias,
> >>> - .oas = oas,
> >>> - .coherent_walk = smmu->features &
> >> ARM_SMMU_FEAT_COHERENCY,
> >>> - .tlb = &arm_smmu_flush_ops,
> >>> - .iommu_dev = smmu->dev,
> >>> - };
> >>> -
> >>> pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> >>> if (!pgtbl_ops)
> >>> return -ENOMEM;
> >>> @@ -2491,7 +2494,8 @@ static int arm_smmu_domain_finalise(struct
> >> arm_smmu_domain *smmu_domain,
> >>> smmu_domain->domain.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> >>> smmu_domain->domain.geometry.aperture_end = (1UL <<
> >> pgtbl_cfg.ias) - 1;
> >>> smmu_domain->domain.geometry.force_aperture = true;
> >>> -
> >>> + if (enable_dirty && smmu_domain->stage ==
> >> ARM_SMMU_DOMAIN_S1)
> >>> + smmu_domain->domain.dirty_ops =
> >> &arm_smmu_dirty_ops;
> >>> ret = arm_smmu_domain_alloc_id(smmu, smmu_domain);
> >>> if (ret < 0) {
> >>> free_io_pgtable_ops(pgtbl_ops);
> >>> @@ -2811,7 +2815,7 @@ static int arm_smmu_attach_dev(struct
> >> iommu_domain *domain, struct device *dev)
> >>> mutex_lock(&smmu_domain->init_mutex);
> >>>
> >>> if (!smmu_domain->smmu) {
> >>> - ret = arm_smmu_domain_finalise(smmu_domain, smmu);
> >>> + ret = arm_smmu_domain_finalise(smmu_domain, smmu,
> >> false);
> >>> } else if (smmu_domain->smmu != smmu)
> >>> ret = -EINVAL;
> >>>
> >>
> >>
> >> I think we are missing the domain attach_dev check for dirty tracking.
> >>
> >> Something like:
> >>
> >> if (domain->dirty_ops && !arm_smmu_dbm_capable(smmu))
> >> return -EINVAL;
> >>
> >> But that helper is only introduced in the last patch, so maybe:
> >>
> >> if (domain->dirty_ops &&
> >> !device_iommu_capable(dev, IOMMU_CAP_DIRTY_TRACKING))
> >> return -EINVAL;
> >
> > Ok. But do we really need to check this in attach()? As dirty_ops are added only
> > if it is requested in alloc_user() and there we return err when hardware doesn't
> > have the capability. So not sure how this matters in attach() path. May be I am
> > missing something.
>
> That's when you create the domain with dev A. Afterwards that dev A is
> attached,
> but later on you can attach another device B to the domain. So this check is
> there such that a domain with dirty tracking ops set will only have devices in
> there that support dirty tracking.
But that only matters if dev B is on a different SMMU without the DBM
capability, right? In that case we already fail the attach with:
>>> } else if (smmu_domain->smmu != smmu)
> >>> ret = -EINVAL;
> >>>
Is that right?
Thanks,
Shameer
* Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc
2024-02-22 12:24 ` Shameerali Kolothum Thodi
@ 2024-02-22 13:11 ` Jason Gunthorpe
2024-02-22 13:23 ` Joao Martins
1 sibling, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2024-02-22 13:11 UTC (permalink / raw)
To: Shameerali Kolothum Thodi
Cc: Joao Martins, joro@8bytes.org, kevin.tian@intel.com,
nicolinc@nvidia.com, iommu@lists.linux.dev, mshavit@google.com,
robin.murphy@arm.com, will@kernel.org, jiangkunkun, zhukeqian,
Linuxarm, linux-arm-kernel@lists.infradead.org
On Thu, Feb 22, 2024 at 12:24:06PM +0000, Shameerali Kolothum Thodi wrote:
> > > Ok. But do we really need to check this in attach()? As dirty_ops are added only
> > > if it is requested in alloc_user() and there we return err when hardware doesn't
> > > have the capability. So not sure how this matters in attach() path. May be I am
> > > missing something.
> >
> > That's when you create the domain with dev A. Afterwards that dev A is
> > attached,
> > but later on you can attach another device B to the domain. So this check is
> > there such that a domain with dirty tracking ops set will only have devices in
> > there that support dirty tracking.
>
> But that only matters if dev B is on different smmu without dbm capability, right?
> In that case we already fail the attach with,
>
> >>> } else if (smmu_domain->smmu != smmu)
> > >>> ret = -EINVAL;
> > >>>
>
> Is that right?
Right. Relaxing this is on my list :\
The other two drivers don't have this limitation, so they needed the check
that Joao described.
Jason
* Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc
2024-02-22 12:24 ` Shameerali Kolothum Thodi
2024-02-22 13:11 ` Jason Gunthorpe
@ 2024-02-22 13:23 ` Joao Martins
1 sibling, 0 replies; 36+ messages in thread
From: Joao Martins @ 2024-02-22 13:23 UTC (permalink / raw)
To: Shameerali Kolothum Thodi
Cc: joro@8bytes.org, jgg@nvidia.com, kevin.tian@intel.com,
nicolinc@nvidia.com, iommu@lists.linux.dev, mshavit@google.com,
robin.murphy@arm.com, will@kernel.org, jiangkunkun, zhukeqian,
Linuxarm, linux-arm-kernel@lists.infradead.org
On 22/02/2024 12:24, Shameerali Kolothum Thodi wrote:
>
>
>> -----Original Message-----
>> From: Joao Martins <joao.m.martins@oracle.com>
>> Sent: Thursday, February 22, 2024 11:38 AM
>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
>> Cc: joro@8bytes.org; jgg@nvidia.com; kevin.tian@intel.com;
>> nicolinc@nvidia.com; iommu@lists.linux.dev; mshavit@google.com;
>> robin.murphy@arm.com; will@kernel.org; jiangkunkun
>> <jiangkunkun@huawei.com>; zhukeqian <zhukeqian1@huawei.com>; Linuxarm
>> <linuxarm@huawei.com>; linux-arm-kernel@lists.infradead.org
>> Subject: Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking
>> in domain alloc
>>
>> On 22/02/2024 11:31, Shameerali Kolothum Thodi wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Joao Martins <joao.m.martins@oracle.com>
>>>> Sent: Thursday, February 22, 2024 11:04 AM
>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
>>>> Cc: joro@8bytes.org; jgg@nvidia.com; kevin.tian@intel.com;
>>>> nicolinc@nvidia.com; iommu@lists.linux.dev; mshavit@google.com;
>>>> robin.murphy@arm.com; will@kernel.org; jiangkunkun
>>>> <jiangkunkun@huawei.com>; zhukeqian <zhukeqian1@huawei.com>;
>>>> Linuxarm <linuxarm@huawei.com>; linux-arm-kernel@lists.infradead.org
>>>> Subject: Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty
>>>> tracking in domain alloc
>>>>
>>>> On 22/02/2024 09:49, Shameer Kolothum wrote:
>>>>> From: Joao Martins <joao.m.martins@oracle.com>
>>>>>
>>>>> This provides all the infrastructure to enable dirty tracking if the
>>>>> hardware has the capability and domain alloc request for it.
>>>>>
>>>>> Please note, we still report no support for IOMMU_CAP_DIRTY_TRACKING
>>>>> as it will finally be enabled in a subsequent patch.
>>>>>
>>>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>>>> Signed-off-by: Shameer Kolothum
>>>> <shameerali.kolothum.thodi@huawei.com>
>>>>> ---
>>>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 95
>>>> ++++++++++++++++-----
>>>>> include/linux/io-pgtable.h | 4 +
>>>>> 2 files changed, 77 insertions(+), 22 deletions(-)
>>>>>
>>>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>>> index bd30739e3588..058bbb0dbe2e 100644
>>>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>>> @@ -43,6 +43,7 @@ MODULE_PARM_DESC(disable_msipolling,
>>>>> "Disable MSI-based polling for CMD_SYNC completion.");
>>>>>
>>>>> static struct iommu_ops arm_smmu_ops;
>>>>> +static struct iommu_dirty_ops arm_smmu_dirty_ops;
>>>>>
>>>>> enum arm_smmu_msi_index {
>>>>> EVTQ_MSI_INDEX,
>>>>> @@ -86,7 +87,8 @@ static struct arm_smmu_option_prop
>>>>> arm_smmu_options[] = {
>>>>>
>>>>> static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device
>>>>> *smmu); static int arm_smmu_domain_finalise(struct arm_smmu_domain
>>>> *smmu_domain,
>>>>> - struct arm_smmu_device *smmu);
>>>>> + struct arm_smmu_device *smmu,
>>>>> + bool enable_dirty);
>>>>> static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
>>>>> static void arm_smmu_tlb_inv_all_s2(struct arm_smmu_domain
>>>>> *smmu_domain);
>>>>>
>>>>> @@ -2378,7 +2380,7 @@ static struct iommu_domain
>>>> *arm_smmu_domain_alloc_paging(struct device *dev)
>>>>> struct arm_smmu_master *master =
>>>> dev_iommu_priv_get(dev);
>>>>> int ret;
>>>>>
>>>>> - ret = arm_smmu_domain_finalise(smmu_domain, master-
>>>>> smmu);
>>>>> + ret = arm_smmu_domain_finalise(smmu_domain, master-
>>>>> smmu, false);
>>>>> if (ret) {
>>>>> kfree(smmu_domain);
>>>>> return ERR_PTR(ret);
>>>>> @@ -2445,10 +2447,11 @@ static void arm_smmu_domain_free(struct
>>>>> iommu_domain *domain) }
>>>>>
>>>>> static int arm_smmu_domain_finalise(struct arm_smmu_domain
>>>> *smmu_domain,
>>>>> - struct arm_smmu_device *smmu)
>>>>> + struct arm_smmu_device *smmu,
>>>>> + bool enable_dirty)
>>>>> {
>>>>> int ret;
>>>>> - unsigned long ias, oas;
>>>>> + unsigned long ias;
>>>>> enum io_pgtable_fmt fmt;
>>>>> struct io_pgtable_cfg pgtbl_cfg;
>>>>> struct io_pgtable_ops *pgtbl_ops;
>>>>> @@ -2459,31 +2462,31 @@ static int arm_smmu_domain_finalise(struct
>>>> arm_smmu_domain *smmu_domain,
>>>>> if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2))
>>>>> smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>>>>>
>>>>> + pgtbl_cfg = (struct io_pgtable_cfg) {
>>>>> + .pgsize_bitmap = smmu->pgsize_bitmap,
>>>>> + .coherent_walk = smmu->features &
>>>> ARM_SMMU_FEAT_COHERENCY,
>>>>> + .tlb = &arm_smmu_flush_ops,
>>>>> + .iommu_dev = smmu->dev,
>>>>> + };
>>>>> +
>>>>> switch (smmu_domain->stage) {
>>>>> case ARM_SMMU_DOMAIN_S1:
>>>>> ias = (smmu->features & ARM_SMMU_FEAT_VAX) ? 52 : 48;
>>>>> - ias = min_t(unsigned long, ias, VA_BITS);
>>>>> - oas = smmu->ias;
>>>>> + pgtbl_cfg.ias = min_t(unsigned long, ias, VA_BITS);
>>>>> + pgtbl_cfg.oas = smmu->ias;
>>>>> + if (enable_dirty)
>>>>> + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_HD;
>>>>> fmt = ARM_64_LPAE_S1;
>>>>> break;
>>>>> case ARM_SMMU_DOMAIN_S2:
>>>>> - ias = smmu->ias;
>>>>> - oas = smmu->oas;
>>>>> + pgtbl_cfg.ias = smmu->ias;
>>>>> + pgtbl_cfg.oas = smmu->oas;
>>>>> fmt = ARM_64_LPAE_S2;
>>>>> break;
>>>>> default:
>>>>> return -EINVAL;
>>>>> }
>>>>>
>>>>> - pgtbl_cfg = (struct io_pgtable_cfg) {
>>>>> - .pgsize_bitmap = smmu->pgsize_bitmap,
>>>>> - .ias = ias,
>>>>> - .oas = oas,
>>>>> - .coherent_walk = smmu->features &
>>>> ARM_SMMU_FEAT_COHERENCY,
>>>>> - .tlb = &arm_smmu_flush_ops,
>>>>> - .iommu_dev = smmu->dev,
>>>>> - };
>>>>> -
>>>>> pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
>>>>> if (!pgtbl_ops)
>>>>> return -ENOMEM;
>>>>> @@ -2491,7 +2494,8 @@ static int arm_smmu_domain_finalise(struct
>>>> arm_smmu_domain *smmu_domain,
>>>>> smmu_domain->domain.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
>>>>> smmu_domain->domain.geometry.aperture_end = (1UL <<
>>>> pgtbl_cfg.ias) - 1;
>>>>> smmu_domain->domain.geometry.force_aperture = true;
>>>>> -
>>>>> + if (enable_dirty && smmu_domain->stage ==
>>>> ARM_SMMU_DOMAIN_S1)
>>>>> + smmu_domain->domain.dirty_ops =
>>>> &arm_smmu_dirty_ops;
>>>>> ret = arm_smmu_domain_alloc_id(smmu, smmu_domain);
>>>>> if (ret < 0) {
>>>>> free_io_pgtable_ops(pgtbl_ops);
>>>>> @@ -2811,7 +2815,7 @@ static int arm_smmu_attach_dev(struct
>>>> iommu_domain *domain, struct device *dev)
>>>>> mutex_lock(&smmu_domain->init_mutex);
>>>>>
>>>>> if (!smmu_domain->smmu) {
>>>>> - ret = arm_smmu_domain_finalise(smmu_domain, smmu);
>>>>> + ret = arm_smmu_domain_finalise(smmu_domain, smmu,
>>>> false);
>>>>> } else if (smmu_domain->smmu != smmu)
>>>>> ret = -EINVAL;
>>>>>
>>>>
>>>>
>>>> I think we are missing the domain attach_dev check for dirty tracking.
>>>>
>>>> Something like:
>>>>
>>>> if (domain->dirty_ops && !arm_smmu_dbm_capable(smmu))
>>>> return -EINVAL;
>>>>
>>>> But that helper is only introduced in the last patch, so maybe:
>>>>
>>>> if (domain->dirty_ops &&
>>>> !device_iommu_capable(dev, IOMMU_CAP_DIRTY_TRACKING))
>>>> return -EINVAL;
>>>
>>> Ok. But do we really need to check this in attach()? As dirty_ops are added only
>>> if it is requested in alloc_user(), and there we return an error when the hardware
>>> doesn't have the capability. So I'm not sure how this matters in the attach() path.
>>> Maybe I am missing something.
>>
>> That's when you create the domain with dev A. Afterwards that dev A is
>> attached,
>> but later on you can attach another device B to the domain. So this check is
>> there such that a domain with dirty tracking ops set will only have devices in
>> there that support dirty tracking.
>
> But that only matters if dev B is on a different smmu without DBM capability, right?
> In that case we already fail the attach with,
>
>>>> } else if (smmu_domain->smmu != smmu)
>>>>> ret = -EINVAL;
>>>>>
>
> Is that right?
>
Oh yes, I missed that. That does indeed look equivalent.
Perhaps at least add a comment there, to remind the reader in case that
check is ever lifted.
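(For what it's worth, the invariant being relied on here can be modelled in a
few standalone lines. The struct and helper names below are invented for
illustration, not the driver code: dirty_ops are only installed when the
allocating smmu supports DBM, and attach rejects any device behind a different
smmu, so every device that can attach transitively has the capability.)

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the attach-path invariant discussed above. */
struct smmu { bool dbm_capable; };
struct smmu_domain { struct smmu *smmu; bool has_dirty_ops; };
struct master { struct smmu *smmu; };

/* dirty_ops are only installed when the finalising smmu supports DBM */
static struct smmu_domain domain_alloc(struct smmu *s, bool want_dirty)
{
	struct smmu_domain d = {
		.smmu = s,
		.has_dirty_ops = want_dirty && s->dbm_capable,
	};
	return d;
}

/* mirrors the "smmu_domain->smmu != smmu" rejection in attach_dev */
static int attach(struct smmu_domain *d, struct master *m)
{
	return d->smmu == m->smmu ? 0 : -1; /* -EINVAL in the driver */
}
```

So a cross-smmu device never reaches the dirty-tracking path at all, which is
why the explicit capability check is redundant today.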
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc
2024-02-22 9:49 ` [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc Shameer Kolothum
2024-02-22 11:04 ` Joao Martins
@ 2024-03-08 14:31 ` Jason Gunthorpe
2024-04-23 16:27 ` Ryan Roberts
2 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2024-03-08 14:31 UTC (permalink / raw)
To: Shameer Kolothum
Cc: iommu, linux-arm-kernel, joro, kevin.tian, nicolinc, mshavit,
robin.murphy, will, joao.m.martins, jiangkunkun, zhukeqian1,
linuxarm
On Thu, Feb 22, 2024 at 09:49:22AM +0000, Shameer Kolothum wrote:
> From: Joao Martins <joao.m.martins@oracle.com>
>
> This provides all the infrastructure to enable dirty tracking if the
> hardware has the capability and domain alloc request for it.
>
> Please note, we still report no support for IOMMU_CAP_DIRTY_TRACKING
> as it will finally be enabled in a subsequent patch.
>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 95 ++++++++++++++++-----
> include/linux/io-pgtable.h | 4 +
> 2 files changed, 77 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index bd30739e3588..058bbb0dbe2e 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -43,6 +43,7 @@ MODULE_PARM_DESC(disable_msipolling,
> "Disable MSI-based polling for CMD_SYNC completion.");
>
> static struct iommu_ops arm_smmu_ops;
> +static struct iommu_dirty_ops arm_smmu_dirty_ops;
>
> enum arm_smmu_msi_index {
> EVTQ_MSI_INDEX,
> @@ -86,7 +87,8 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
>
> static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu);
> static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> - struct arm_smmu_device *smmu);
> + struct arm_smmu_device *smmu,
> + bool enable_dirty);
> static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
> static void arm_smmu_tlb_inv_all_s2(struct arm_smmu_domain *smmu_domain);
>
> @@ -2378,7 +2380,7 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
> struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> int ret;
>
> - ret = arm_smmu_domain_finalise(smmu_domain, master->smmu);
> + ret = arm_smmu_domain_finalise(smmu_domain, master->smmu, false);
> if (ret) {
> kfree(smmu_domain);
> return ERR_PTR(ret);
> @@ -2445,10 +2447,11 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
> }
>
> static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> - struct arm_smmu_device *smmu)
> + struct arm_smmu_device *smmu,
> + bool enable_dirty)
It is possibly a bit more readable if this is a flags argument and the
test is on IOMMU_HWPT_ALLOC_DIRTY_TRACKING.
> {
> int ret;
> - unsigned long ias, oas;
> + unsigned long ias;
Isn't ias unused too?
> @@ -3193,7 +3197,9 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
> const struct iommu_user_data *user_data)
> {
> struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> - const u32 paging_flags = IOMMU_HWPT_ALLOC_NEST_PARENT;
> + const u32 paging_flags = IOMMU_HWPT_ALLOC_NEST_PARENT |
> + IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
> + bool enforce_dirty = flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
Ah you based this on the nesting series.. Once part 2 is done we
should try to figure out what the order should be..
> +static int arm_smmu_read_and_clear_dirty(struct iommu_domain *domain,
> + unsigned long iova, size_t size,
> + unsigned long flags,
> + struct iommu_dirty_bitmap *dirty)
> +{
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> + struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
> +
> + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
> + return -EINVAL;
This is not possible; these ops are only installed on S1 domains.
> + if (WARN_ON_ONCE(!ops || !ops->read_and_clear_dirty))
> + return -ENODEV;
This is not really needed; if this is corrupted then this will have a
tidy crash:
> + return ops->read_and_clear_dirty(ops, iova, size, flags, dirty);
If you are worried then move the WARN_ON into the finalize function to
ensure that the io_pgtable_ops is properly formed after creating it.
> +static int arm_smmu_set_dirty_tracking(struct iommu_domain *domain,
> + bool enabled)
> +{
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> + struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
> +
> + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
> + return -EINVAL;
> + if (WARN_ON_ONCE(!ops))
> + return -ENODEV;
Ditto for both of these
Otherwise it looks OK to me
Jason
* Re: [PATCH v2 4/4] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping
2024-02-22 9:49 ` [PATCH v2 4/4] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping Shameer Kolothum
@ 2024-03-08 14:32 ` Jason Gunthorpe
2024-04-23 16:45 ` Ryan Roberts
1 sibling, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2024-03-08 14:32 UTC (permalink / raw)
To: Shameer Kolothum
Cc: iommu, linux-arm-kernel, joro, kevin.tian, nicolinc, mshavit,
robin.murphy, will, joao.m.martins, jiangkunkun, zhukeqian1,
linuxarm
On Thu, Feb 22, 2024 at 09:49:23AM +0000, Shameer Kolothum wrote:
> From: Kunkun Jiang <jiangkunkun@huawei.com>
>
> If io-pgtable quirk flag indicates support for hardware update of
> dirty state, enable HA/HD bits in the SMMU CD and also set the DBM
> bit in the page descriptor.
>
> And now report the dirty page tracking capability of SMMUv3.
>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++++++++++++++
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +++
> drivers/iommu/io-pgtable-arm.c | 5 ++++-
> 3 files changed, 22 insertions(+), 1 deletion(-)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason
* Re: [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU
2024-02-22 9:49 ` [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU Shameer Kolothum
@ 2024-04-23 14:41 ` Ryan Roberts
2024-04-23 14:52 ` Jason Gunthorpe
2024-04-24 8:01 ` Shameerali Kolothum Thodi
0 siblings, 2 replies; 36+ messages in thread
From: Ryan Roberts @ 2024-04-23 14:41 UTC (permalink / raw)
To: Shameer Kolothum, iommu, linux-arm-kernel
Cc: joro, jgg, kevin.tian, nicolinc, mshavit, robin.murphy, will,
joao.m.martins, jiangkunkun, zhukeqian1, linuxarm
Hi,
I'm aiming to (slowly) get more involved with SMMU activities, although I'm sure
it will take a while to get up to speed and provide useful input. It was
suggested that this series would be a useful starting point to dip my toe in.
Please bear with me while I ask stupid questions...
On 22/02/2024 09:49, Shameer Kolothum wrote:
> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
>
> If the SMMU supports it and the kernel was built with HTTU support,
> Probe support for Hardware Translation Table Update (HTTU) which is
> essentially to enable hardware update of access and dirty flags.
>
> Probe and set the smmu::features for Hardware Dirty and Hardware Access
> bits. This is in preparation, to enable it on the context descriptors of
> stage 1 format.
>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Except for the nit (feel free to ignore it), LGTM!
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 32 +++++++++++++++++++++
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 ++++
> 2 files changed, 37 insertions(+)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 0606166a8781..bd30739e3588 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -4196,6 +4196,28 @@ static void arm_smmu_device_iidr_probe(struct arm_smmu_device *smmu)
> }
> }
>
> +static void arm_smmu_get_httu(struct arm_smmu_device *smmu, u32 reg)
> +{
> + u32 fw_features = smmu->features & (ARM_SMMU_FEAT_HA | ARM_SMMU_FEAT_HD);
> + u32 features = 0;
> +
> + switch (FIELD_GET(IDR0_HTTU, reg)) {
> + case IDR0_HTTU_ACCESS_DIRTY:
> + features |= ARM_SMMU_FEAT_HD;
> + fallthrough;
> + case IDR0_HTTU_ACCESS:
> + features |= ARM_SMMU_FEAT_HA;
> + }
> +
> + if (smmu->dev->of_node)
> + smmu->features |= features;
> + else if (features != fw_features)
> + /* ACPI IORT sets the HTTU bits */
> + dev_warn(smmu->dev,
> + "IDR0.HTTU overridden by FW configuration (0x%x)\n",
> + fw_features);
> +}
> +
> static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
> {
> u32 reg;
> @@ -4256,6 +4278,8 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
> smmu->features |= ARM_SMMU_FEAT_E2H;
> }
>
> + arm_smmu_get_httu(smmu, reg);
> +
> /*
> * The coherency feature as set by FW is used in preference to the ID
> * register, but warn on mismatch.
> @@ -4448,6 +4472,14 @@ static int arm_smmu_device_acpi_probe(struct platform_device *pdev,
> if (iort_smmu->flags & ACPI_IORT_SMMU_V3_COHACC_OVERRIDE)
> smmu->features |= ARM_SMMU_FEAT_COHERENCY;
>
> + switch (FIELD_GET(ACPI_IORT_SMMU_V3_HTTU_OVERRIDE, iort_smmu->flags)) {
> + case IDR0_HTTU_ACCESS_DIRTY:
> + smmu->features |= ARM_SMMU_FEAT_HD;
> + fallthrough;
> + case IDR0_HTTU_ACCESS:
> + smmu->features |= ARM_SMMU_FEAT_HA;
> + }
> +
> return 0;
> }
> #else
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 45bcd72fcda4..5e51a6c1d55f 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -33,6 +33,9 @@
> #define IDR0_ASID16 (1 << 12)
> #define IDR0_ATS (1 << 10)
> #define IDR0_HYP (1 << 9)
> +#define IDR0_HTTU GENMASK(7, 6)
> +#define IDR0_HTTU_ACCESS 1
> +#define IDR0_HTTU_ACCESS_DIRTY 2
> #define IDR0_COHACC (1 << 4)
> #define IDR0_TTF GENMASK(3, 2)
> #define IDR0_TTF_AARCH64 2
> @@ -668,6 +671,8 @@ struct arm_smmu_device {
> #define ARM_SMMU_FEAT_SVA (1 << 17)
> #define ARM_SMMU_FEAT_E2H (1 << 18)
> #define ARM_SMMU_FEAT_NESTING (1 << 19)
> +#define ARM_SMMU_FEAT_HA (1 << 20)
> +#define ARM_SMMU_FEAT_HD (1 << 21)
nit: HA and HD are a bit opaque, at least to me. I guess they are HW Access and
HW Dirty? Perhaps ARM_SMMU_FEAT_HW_ACCESS and ARM_SMMU_FEAT_HW_DIRTY are more
expressive?
> u32 features;
>
> #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
* Re: [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU
2024-04-23 14:41 ` Ryan Roberts
@ 2024-04-23 14:52 ` Jason Gunthorpe
2024-04-24 10:04 ` Ryan Roberts
2024-04-24 8:01 ` Shameerali Kolothum Thodi
1 sibling, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2024-04-23 14:52 UTC (permalink / raw)
To: Ryan Roberts
Cc: Shameer Kolothum, iommu, linux-arm-kernel, joro, kevin.tian,
nicolinc, mshavit, robin.murphy, will, joao.m.martins,
jiangkunkun, zhukeqian1, linuxarm
On Tue, Apr 23, 2024 at 03:41:30PM +0100, Ryan Roberts wrote:
> Hi,
>
> I'm aiming to (slowly) get more involved with SMMU activities, although I'm sure
> it will take a while to get up to speed and provide useful input. It was
> suggested that this series would be a useful starting point to dip my toe in.
> Please bear with me while I ask stupid questions...
Nice!
I would like to see this merged as it was part of the original three
implementations for iommufd dirty tracking. The other two have been
merged for a long time now..
Jason
* Re: [PATCH v2 2/4] iommu/io-pgtable-arm: Add read_and_clear_dirty() support
2024-02-22 9:49 ` [PATCH v2 2/4] iommu/io-pgtable-arm: Add read_and_clear_dirty() support Shameer Kolothum
@ 2024-04-23 15:56 ` Ryan Roberts
2024-04-24 8:01 ` Shameerali Kolothum Thodi
0 siblings, 1 reply; 36+ messages in thread
From: Ryan Roberts @ 2024-04-23 15:56 UTC (permalink / raw)
To: Shameer Kolothum, iommu, linux-arm-kernel
Cc: joro, jgg, kevin.tian, nicolinc, mshavit, robin.murphy, will,
joao.m.martins, jiangkunkun, zhukeqian1, linuxarm
On 22/02/2024 09:49, Shameer Kolothum wrote:
> .read_and_clear_dirty() IOMMU domain op takes care of reading the dirty
> bits (i.e. PTE has both DBM and AP[2] set) and marshalling into a bitmap of
Err... a PTE is HW-dirty if DBM is set and AP[2] is *clear*, right? (AP[2] is
the "RDONLY" bit; it is initially set, then the HW clears it on the first
write. See pte_hw_dirty() in linux/arch/arm64/include/asm/pgtable.h.)
> a given page size.
>
> While reading the dirty bits we also clear the PTE AP[2] bit to mark it as
> writable-clean depending on read_and_clear_dirty() flags.
You would need to *set* AP[2] to mark it clean.
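A standalone model of the state machine described above (bit positions taken
from the ARM_LPAE_PTE_* defines in the quoted diff; the helper names are
invented): with DBM set, AP[2]=1 means writable-clean, AP[2]=0 means
hardware-dirty, and marking a page clean means setting AP[2] again.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t iopte;

#define PTE_DBM       (UINT64_C(1) << 51) /* Dirty Bit Modifier */
#define PTE_AP_RDONLY (UINT64_C(1) << 7)  /* AP[2]: 1 = read-only/clean */

/* hardware-dirty: DBM set and AP[2] cleared by the SMMU on first write */
static int iopte_hw_dirty(iopte pte)
{
	return (pte & (PTE_DBM | PTE_AP_RDONLY)) == PTE_DBM;
}

/* write-protect again, i.e. mark the pte writable-clean */
static iopte iopte_mkclean(iopte pte)
{
	return pte | PTE_AP_RDONLY;
}

/* what the SMMU does on the first write to a DBM, writable-clean page */
static iopte hw_write_update(iopte pte)
{
	return (pte & PTE_DBM) ? pte & ~PTE_AP_RDONLY : pte;
}
```

A non-DBM pte with AP[2] set stays read-only forever; only the DBM+AP[2]
combination encodes the clean/dirty distinction.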
>
> Structure it in a way that the IOPTE walker is generic, and so we pass a
> function pointer over what to do on a per-PTE basis.
>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
> drivers/iommu/io-pgtable-arm.c | 128 ++++++++++++++++++++++++++++++++-
> 1 file changed, 127 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index f7828a7aad41..1ce7b7a3c1e8 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -75,6 +75,7 @@
>
> #define ARM_LPAE_PTE_NSTABLE (((arm_lpae_iopte)1) << 63)
> #define ARM_LPAE_PTE_XN (((arm_lpae_iopte)3) << 53)
> +#define ARM_LPAE_PTE_DBM (((arm_lpae_iopte)1) << 51)
> #define ARM_LPAE_PTE_AF (((arm_lpae_iopte)1) << 10)
> #define ARM_LPAE_PTE_SH_NS (((arm_lpae_iopte)0) << 8)
> #define ARM_LPAE_PTE_SH_OS (((arm_lpae_iopte)2) << 8)
> @@ -84,7 +85,7 @@
>
> #define ARM_LPAE_PTE_ATTR_LO_MASK (((arm_lpae_iopte)0x3ff) << 2)
> /* Ignore the contiguous bit for block splitting */
> -#define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)6) << 52)
> +#define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)0xd) << 51)
Perhaps this is easier to understand:
#define ARM_LPAE_PTE_ATTR_HI_MASK (ARM_LPAE_PTE_XN | ARM_LPAE_PTE_DBM)
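(Sanity check: with the bit positions from the quoted defines, the symbolic
form is bit-for-bit identical to the 0xd << 51 literal in the patch. A
standalone check:)

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t arm_lpae_iopte;

#define ARM_LPAE_PTE_XN  (((arm_lpae_iopte)3) << 53)
#define ARM_LPAE_PTE_DBM (((arm_lpae_iopte)1) << 51)

/* bits 51 (DBM) and 53-54 (XN) together are 0b1101 starting at bit 51 */
#define ATTR_HI_MASK_LITERAL  (((arm_lpae_iopte)0xd) << 51)
#define ATTR_HI_MASK_SYMBOLIC (ARM_LPAE_PTE_XN | ARM_LPAE_PTE_DBM)
```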
> #define ARM_LPAE_PTE_ATTR_MASK (ARM_LPAE_PTE_ATTR_LO_MASK | \
> ARM_LPAE_PTE_ATTR_HI_MASK)
> /* Software bit for solving coherency races */
> @@ -93,6 +94,9 @@
> /* Stage-1 PTE */
> #define ARM_LPAE_PTE_AP_UNPRIV (((arm_lpae_iopte)1) << 6)
> #define ARM_LPAE_PTE_AP_RDONLY (((arm_lpae_iopte)2) << 6)
> +#define ARM_LPAE_PTE_AP_RDONLY_BIT 7
Perhaps:
#define ARM_LPAE_PTE_AP_RDONLY_BIT 7
#define ARM_LPAE_PTE_AP_RDONLY (((arm_lpae_iopte)1) << ARM_LPAE_PTE_AP_RDONLY_BIT)
> +#define ARM_LPAE_PTE_AP_WRITABLE (ARM_LPAE_PTE_AP_RDONLY | \
> + ARM_LPAE_PTE_DBM)
Perhaps: ARM_LPAE_PTE_AP_WRITABLE_CLEAN ?
> #define ARM_LPAE_PTE_ATTRINDX_SHIFT 2
> #define ARM_LPAE_PTE_nG (((arm_lpae_iopte)1) << 11)
>
> @@ -729,6 +733,127 @@ static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
> return iopte_to_paddr(pte, data) | iova;
> }
>
> +struct arm_lpae_iopte_read_dirty {
> + unsigned long flags;
> + struct iommu_dirty_bitmap *dirty;
> +};
> +
> +typedef int (*io_pgtable_visit_fn_t)(unsigned long iova, size_t size,
> + arm_lpae_iopte *ptep, void *opaque);
> +struct io_pgtable_walk_data {
> + const io_pgtable_visit_fn_t fn;
> + void *opaque;
> + u64 addr;
> + const u64 end;
> +};
> +
> +static int __arm_lpae_iopte_walk(struct arm_lpae_io_pgtable *data,
> + struct io_pgtable_walk_data *walk_data,
> + arm_lpae_iopte *ptep,
> + int lvl);
> +
> +static int __arm_lpae_read_and_clear_dirty(unsigned long iova, size_t size,
> + arm_lpae_iopte *ptep, void *opaque)
> +{
> + struct arm_lpae_iopte_read_dirty *arg = opaque;
> + struct iommu_dirty_bitmap *dirty = arg->dirty;
> + arm_lpae_iopte pte;
> +
> + pte = READ_ONCE(*ptep);
> + if (WARN_ON(!pte))
> + return -EINVAL;
You've already read and checked that it's not zero in io_pgtable_visit(). Either
the walker is considered generic and probably shouldn't care if the pte is 0 (so
just do the check here), or the walker is specific to this use case, in which
case there is no need for the function pointer callback inefficiencies (and
you've already READ_ONCE() the ptep, so no need to repeat it; multiple
ptep_get()s do make a meaningful performance impact in the core-mm).
> +
> + if ((pte & ARM_LPAE_PTE_AP_WRITABLE) == ARM_LPAE_PTE_AP_WRITABLE)
> + return 0;
What about RO ptes? This check only checks that the pte is writable-clean, so
there are two remaining options: writable-dirty and read-only. Suggest:
if (pte_hw_dirty(pte)) {
iommu_dirty_bitmap_record(dirty, iova, size);
if (!(arg->flags & IOMMU_DIRTY_NO_CLEAR))
set_bit(ARM_LPAE_PTE_AP_RDONLY_BIT, (unsigned long *)ptep);
}
> +
> + iommu_dirty_bitmap_record(dirty, iova, size);
> + if (!(arg->flags & IOMMU_DIRTY_NO_CLEAR))
> + set_bit(ARM_LPAE_PTE_AP_RDONLY_BIT, (unsigned long *)ptep);
> + return 0;
> +}
> +
> +static int io_pgtable_visit(struct arm_lpae_io_pgtable *data,
> + struct io_pgtable_walk_data *walk_data,
> + arm_lpae_iopte *ptep, int lvl)
> +{
> + struct io_pgtable *iop = &data->iop;
> + arm_lpae_iopte pte = READ_ONCE(*ptep);
> +
> + if (WARN_ON(!pte))
> + return -EINVAL;
> +
> + if (iopte_leaf(pte, lvl, iop->fmt)) {
> + size_t size = ARM_LPAE_BLOCK_SIZE(lvl, data);
> +
> + walk_data->fn(walk_data->addr, size, ptep, walk_data->opaque);
> + walk_data->addr += size;
> + return 0;
> + }
> +
> + ptep = iopte_deref(pte, data);
> + return __arm_lpae_iopte_walk(data, walk_data, ptep, lvl + 1);
> +}
> +
> +static int __arm_lpae_iopte_walk(struct arm_lpae_io_pgtable *data,
> + struct io_pgtable_walk_data *walk_data,
> + arm_lpae_iopte *ptep,
> + int lvl)
> +{
> + u32 idx;
> + int max_entries, ret;
> +
> + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
> + return -EINVAL;
> +
> + if (lvl == data->start_level)
> + max_entries = ARM_LPAE_PGD_SIZE(data) / sizeof(arm_lpae_iopte);
> + else
> + max_entries = ARM_LPAE_PTES_PER_TABLE(data);
> +
> + for (idx = ARM_LPAE_LVL_IDX(walk_data->addr, lvl, data);
> + (idx < max_entries) && (walk_data->addr < walk_data->end); ++idx) {
> + arm_lpae_iopte *pteref = &ptep[idx];
nit: Personally I would get rid of this and just pass `ptep + idx` to io_pgtable_visit()
> +
> + ret = io_pgtable_visit(data, walk_data, pteref, lvl);
> + if (ret)
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +static int arm_lpae_read_and_clear_dirty(struct io_pgtable_ops *ops,
> + unsigned long iova, size_t size,
> + unsigned long flags,
> + struct iommu_dirty_bitmap *dirty)
> +{
> + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
> + struct io_pgtable_cfg *cfg = &data->iop.cfg;
> + struct arm_lpae_iopte_read_dirty arg = {
> + .flags = flags, .dirty = dirty,
> + };
> + struct io_pgtable_walk_data walk_data = {
> + .fn = __arm_lpae_read_and_clear_dirty,
> + .opaque = &arg,
Do you have plans to reuse the walker? If not, then I wonder whether separating
the argument and callback function is valuable. It will certainly be more
efficient without the per-pte indirect call.
> + .addr = iova,
> + .end = iova + size - 1,
Bug?: You are initializing end to be inclusive (i.e. last valid address in range). But __arm_lpae_iopte_walk() is using it as exclusive (i.e. first address after range) - see "walk_data->addr < walk_data->end".
mm code usually treats end as exclusive, so suggest removing the "- 1".
> + };
> + arm_lpae_iopte *ptep = data->pgd;
> + int lvl = data->start_level;
> + long iaext = (s64)iova >> cfg->ias;
Why cast this to signed? And why is the cast s64 and the result long? Suggest they should both be consistent at least. But probably clearer to just do:
WARN_ON((iova + size - 1) >> cfg->ias)
in the below if()? That way you are checking the full range too.
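A standalone sketch of that full-range check (the helper name is invented):
shifting the last byte's address right by ias is nonzero exactly when the
range overflows an ias-bit input address space.

```c
#include <assert.h>
#include <stdint.h>

/* [iova, iova + size) fits in an ias-bit input address space iff the
 * last byte's address has no bits set at or above bit ias.  Shifting
 * right by ias tests exactly that, matching the existing
 * "iova >> cfg->ias" idiom in the walker. */
static int range_fits_ias(uint64_t iova, uint64_t size, unsigned int ias)
{
	if (size == 0)
		return 0;
	return ((iova + size - 1) >> ias) == 0;
}
```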
> +
> + if (WARN_ON(!size))
> + return -EINVAL;
> +
> + if (WARN_ON(iaext))
> + return -EINVAL;
> +
> + if (data->iop.fmt != ARM_64_LPAE_S1)
> + return -EINVAL;
> +
> + return __arm_lpae_iopte_walk(data, &walk_data, ptep, lvl);
> +}
> +
> static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
> {
> unsigned long granule, page_sizes;
> @@ -807,6 +932,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
> .map_pages = arm_lpae_map_pages,
> .unmap_pages = arm_lpae_unmap_pages,
> .iova_to_phys = arm_lpae_iova_to_phys,
> + .read_and_clear_dirty = arm_lpae_read_and_clear_dirty,
> };
>
> return data;
* Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc
2024-02-22 9:49 ` [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc Shameer Kolothum
2024-02-22 11:04 ` Joao Martins
2024-03-08 14:31 ` Jason Gunthorpe
@ 2024-04-23 16:27 ` Ryan Roberts
2024-04-23 16:39 ` Jason Gunthorpe
2024-04-24 8:27 ` Shameerali Kolothum Thodi
2 siblings, 2 replies; 36+ messages in thread
From: Ryan Roberts @ 2024-04-23 16:27 UTC (permalink / raw)
To: Shameer Kolothum, iommu, linux-arm-kernel
Cc: joro, jgg, kevin.tian, nicolinc, mshavit, robin.murphy, will,
joao.m.martins, jiangkunkun, zhukeqian1, linuxarm
On 22/02/2024 09:49, Shameer Kolothum wrote:
> From: Joao Martins <joao.m.martins@oracle.com>
>
> This provides all the infrastructure to enable dirty tracking if the
> hardware has the capability and domain alloc request for it.
>
> Please note, we still report no support for IOMMU_CAP_DIRTY_TRACKING
> as it will finally be enabled in a subsequent patch.
>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 95 ++++++++++++++++-----
> include/linux/io-pgtable.h | 4 +
> 2 files changed, 77 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index bd30739e3588..058bbb0dbe2e 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -43,6 +43,7 @@ MODULE_PARM_DESC(disable_msipolling,
> "Disable MSI-based polling for CMD_SYNC completion.");
>
> static struct iommu_ops arm_smmu_ops;
> +static struct iommu_dirty_ops arm_smmu_dirty_ops;
>
> enum arm_smmu_msi_index {
> EVTQ_MSI_INDEX,
> @@ -86,7 +87,8 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
>
> static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu);
> static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> - struct arm_smmu_device *smmu);
> + struct arm_smmu_device *smmu,
> + bool enable_dirty);
> static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
> static void arm_smmu_tlb_inv_all_s2(struct arm_smmu_domain *smmu_domain);
>
> @@ -2378,7 +2380,7 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
> struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> int ret;
>
> - ret = arm_smmu_domain_finalise(smmu_domain, master->smmu);
> + ret = arm_smmu_domain_finalise(smmu_domain, master->smmu, false);
> if (ret) {
> kfree(smmu_domain);
> return ERR_PTR(ret);
> @@ -2445,10 +2447,11 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
> }
>
> static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> - struct arm_smmu_device *smmu)
> + struct arm_smmu_device *smmu,
> + bool enable_dirty)
> {
> int ret;
> - unsigned long ias, oas;
> + unsigned long ias;
> enum io_pgtable_fmt fmt;
> struct io_pgtable_cfg pgtbl_cfg;
> struct io_pgtable_ops *pgtbl_ops;
> @@ -2459,31 +2462,31 @@ static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2))
> smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>
> + pgtbl_cfg = (struct io_pgtable_cfg) {
> + .pgsize_bitmap = smmu->pgsize_bitmap,
> + .coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENCY,
> + .tlb = &arm_smmu_flush_ops,
> + .iommu_dev = smmu->dev,
> + };
> +
> switch (smmu_domain->stage) {
> case ARM_SMMU_DOMAIN_S1:
> ias = (smmu->features & ARM_SMMU_FEAT_VAX) ? 52 : 48;
> - ias = min_t(unsigned long, ias, VA_BITS);
> - oas = smmu->ias;
> + pgtbl_cfg.ias = min_t(unsigned long, ias, VA_BITS);
I know this isn't changed by this patch, but do we really mean VA_BITS here?
Don't we want vabits_actual? I'm guessing we are intending to limit ias to the
size the kernel is using.
> + pgtbl_cfg.oas = smmu->ias;
> + if (enable_dirty)
> + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_HD;
> fmt = ARM_64_LPAE_S1;
> break;
> case ARM_SMMU_DOMAIN_S2:
> - ias = smmu->ias;
> - oas = smmu->oas;
> + pgtbl_cfg.ias = smmu->ias;
> + pgtbl_cfg.oas = smmu->oas;
> fmt = ARM_64_LPAE_S2;
Is it worth adding a WARN_ON(enable_dirty) here?
> break;
> default:
> return -EINVAL;
> }
>
> - pgtbl_cfg = (struct io_pgtable_cfg) {
> - .pgsize_bitmap = smmu->pgsize_bitmap,
> - .ias = ias,
> - .oas = oas,
> - .coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENCY,
> - .tlb = &arm_smmu_flush_ops,
> - .iommu_dev = smmu->dev,
> - };
> -
> pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> if (!pgtbl_ops)
> return -ENOMEM;
> @@ -2491,7 +2494,8 @@ static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> smmu_domain->domain.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> smmu_domain->domain.geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1;
> smmu_domain->domain.geometry.force_aperture = true;
> -
> + if (enable_dirty && smmu_domain->stage == ARM_SMMU_DOMAIN_S1)
> + smmu_domain->domain.dirty_ops = &arm_smmu_dirty_ops;
> ret = arm_smmu_domain_alloc_id(smmu, smmu_domain);
> if (ret < 0) {
> free_io_pgtable_ops(pgtbl_ops);
> @@ -2811,7 +2815,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> mutex_lock(&smmu_domain->init_mutex);
>
> if (!smmu_domain->smmu) {
> - ret = arm_smmu_domain_finalise(smmu_domain, smmu);
> + ret = arm_smmu_domain_finalise(smmu_domain, smmu, false);
> } else if (smmu_domain->smmu != smmu)
> ret = -EINVAL;
>
> @@ -2876,7 +2880,7 @@ static int arm_smmu_s1_set_dev_pasid(struct iommu_domain *domain,
>
> mutex_lock(&smmu_domain->init_mutex);
> if (!smmu_domain->smmu)
> - ret = arm_smmu_domain_finalise(smmu_domain, smmu);
> + ret = arm_smmu_domain_finalise(smmu_domain, smmu, false);
> else if (smmu_domain->smmu != smmu)
> ret = -EINVAL;
> mutex_unlock(&smmu_domain->init_mutex);
> @@ -3193,7 +3197,9 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
> const struct iommu_user_data *user_data)
> {
> struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> - const u32 paging_flags = IOMMU_HWPT_ALLOC_NEST_PARENT;
> + const u32 paging_flags = IOMMU_HWPT_ALLOC_NEST_PARENT |
> + IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
> + bool enforce_dirty = flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
nit: It's called enable_dirty in other places; I think that is more appropriate
here?
> struct arm_smmu_domain *smmu_domain;
> int ret;
>
> @@ -3206,6 +3212,10 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
> if (user_data)
> return ERR_PTR(-EINVAL);
>
> + if (enforce_dirty &&
> + !device_iommu_capable(dev, IOMMU_CAP_DIRTY_TRACKING))
> + return ERR_PTR(-EOPNOTSUPP);
I'm guessing the intention is that only a stage 1 will ever be marked with
IOMMU_CAP_DIRTY_TRACKING (there are a few places that assume/check we are
dealing with S1)? But is there a reason why stage 2 can't be supported as well?
> +
> smmu_domain = arm_smmu_domain_alloc();
> if (!smmu_domain)
> return ERR_PTR(-ENOMEM);
> @@ -3221,7 +3231,7 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
>
> smmu_domain->domain.type = IOMMU_DOMAIN_UNMANAGED;
> smmu_domain->domain.ops = arm_smmu_ops.default_domain_ops;
> - ret = arm_smmu_domain_finalise(smmu_domain, master->smmu);
> + ret = arm_smmu_domain_finalise(smmu_domain, master->smmu, enforce_dirty);
> if (ret)
> goto err_free;
> return &smmu_domain->domain;
> @@ -3470,6 +3480,42 @@ static void arm_smmu_release_device(struct device *dev)
> kfree(master);
> }
>
> +static int arm_smmu_read_and_clear_dirty(struct iommu_domain *domain,
> + unsigned long iova, size_t size,
> + unsigned long flags,
> + struct iommu_dirty_bitmap *dirty)
> +{
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> + struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
> +
> + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
> + return -EINVAL;
You've only attached the dirty_ops if it was S1 in the first place, so this
check seems overkill to me.
> +
> + if (WARN_ON_ONCE(!ops || !ops->read_and_clear_dirty))
> + return -ENODEV;
And here; could this be moved to where you attach the dirty_ops?
> +
> + return ops->read_and_clear_dirty(ops, iova, size, flags, dirty);
> +}
> +
> +static int arm_smmu_set_dirty_tracking(struct iommu_domain *domain,
> + bool enabled)
> +{
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> + struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
> +
> + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
> + return -EINVAL;
> +
> + if (WARN_ON_ONCE(!ops))
> + return -ENODEV;
Same comments for the 2 checks.
> +
> + /*
> + * Always enabled and the dirty bitmap is cleared prior to
> + * set_dirty_tracking().
> + */
> + return 0;
> +}
> +
> static struct iommu_group *arm_smmu_device_group(struct device *dev)
> {
> struct iommu_group *group;
> @@ -3612,6 +3658,11 @@ static struct iommu_ops arm_smmu_ops = {
> }
> };
>
> +static struct iommu_dirty_ops arm_smmu_dirty_ops = {
> + .read_and_clear_dirty = arm_smmu_read_and_clear_dirty,
> + .set_dirty_tracking = arm_smmu_set_dirty_tracking,
> +};
> +
> /* Probing and initialisation functions */
> static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
> struct arm_smmu_queue *q,
> diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> index 86cf1f7ae389..8e75f944f07a 100644
> --- a/include/linux/io-pgtable.h
> +++ b/include/linux/io-pgtable.h
> @@ -85,6 +85,8 @@ struct io_pgtable_cfg {
> *
> * IO_PGTABLE_QUIRK_ARM_OUTER_WBWA: Override the outer-cacheability
> * attributes set in the TCR for a non-coherent page-table walker.
> + *
> + * IO_PGTABLE_QUIRK_ARM_HD: Enables dirty tracking in stage 1 pagetable.
> */
> #define IO_PGTABLE_QUIRK_ARM_NS BIT(0)
> #define IO_PGTABLE_QUIRK_NO_PERMS BIT(1)
> @@ -92,6 +94,8 @@ struct io_pgtable_cfg {
> #define IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT BIT(4)
> #define IO_PGTABLE_QUIRK_ARM_TTBR1 BIT(5)
> #define IO_PGTABLE_QUIRK_ARM_OUTER_WBWA BIT(6)
> + #define IO_PGTABLE_QUIRK_ARM_HD BIT(7)
> +
> unsigned long quirks;
> unsigned long pgsize_bitmap;
> unsigned int ias;
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc
2024-04-23 16:27 ` Ryan Roberts
@ 2024-04-23 16:39 ` Jason Gunthorpe
2024-04-23 16:50 ` Ryan Roberts
2024-04-24 8:27 ` Shameerali Kolothum Thodi
1 sibling, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2024-04-23 16:39 UTC (permalink / raw)
To: Ryan Roberts
Cc: Shameer Kolothum, iommu, linux-arm-kernel, joro, kevin.tian,
nicolinc, mshavit, robin.murphy, will, joao.m.martins,
jiangkunkun, zhukeqian1, linuxarm
On Tue, Apr 23, 2024 at 05:27:09PM +0100, Ryan Roberts wrote:
> > switch (smmu_domain->stage) {
> > case ARM_SMMU_DOMAIN_S1:
> > ias = (smmu->features & ARM_SMMU_FEAT_VAX) ? 52 : 48;
> > - ias = min_t(unsigned long, ias, VA_BITS);
> > - oas = smmu->ias;
> > + pgtbl_cfg.ias = min_t(unsigned long, ias, VA_BITS);
>
> I know this isn't changed by this patch, but do we really mean VA_BITS here?
> Don't we want vabits_actual? I'm guessing we are intending to limit ias to the
> size the kernel is using.
I think the intention here is to allow CONFIG_ARM64_VA_BITS to choose
the number of page table levels in the iommu. Perhaps a SMMU specific
config option would be clearer.
There is no relationship to the configuration of the S1 paging domain
and what the MM is doing, there should be no crossing between those
two worlds.
vabits_actual is used in the SVA code to match the SMMU SVA S1 to the
mm_struct.
> > @@ -3206,6 +3212,10 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
> > if (user_data)
> > return ERR_PTR(-EINVAL);
> >
> > + if (enforce_dirty &&
> > + !device_iommu_capable(dev, IOMMU_CAP_DIRTY_TRACKING))
> > + return ERR_PTR(-EOPNOTSUPP);
>
> I'm guessing the intention is that only a stage 1 will ever be marked with
> IOMMU_CAP_DIRTY_TRACKING (there are a few places that assume/check we are
> dealing with S1)? But is there a reason why stage 2 can't be supported as well?
Stage 2 really should ultimately be supported..
I'd say this is a first step as iommufd will naturally prefer the S1
configuration when not using nesting. After the nesting support is
merged the vSMMU will require the S2 to do the dirty tracking.
Jason
* Re: [PATCH v2 4/4] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping
2024-02-22 9:49 ` [PATCH v2 4/4] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping Shameer Kolothum
2024-03-08 14:32 ` Jason Gunthorpe
@ 2024-04-23 16:45 ` Ryan Roberts
2024-04-23 17:32 ` Jason Gunthorpe
1 sibling, 1 reply; 36+ messages in thread
From: Ryan Roberts @ 2024-04-23 16:45 UTC (permalink / raw)
To: Shameer Kolothum, iommu, linux-arm-kernel
Cc: joro, jgg, kevin.tian, nicolinc, mshavit, robin.murphy, will,
joao.m.martins, jiangkunkun, zhukeqian1, linuxarm
On 22/02/2024 09:49, Shameer Kolothum wrote:
> From: Kunkun Jiang <jiangkunkun@huawei.com>
>
> If io-pgtable quirk flag indicates support for hardware update of
> dirty state, enable HA/HD bits in the SMMU CD and also set the DBM
> bit in the page descriptor.
>
> And now report the dirty page tracking capability of SMMUv3.
>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Section 3.13 of the spec states: "Where translation tables are shared between
CDs that contain the same ASID (within a translation regime), the CD HA and HD
fields must be identical."
I don't think, the way that smmu domains work, it's possible to end up with a
single pgtable shared between multiple CDs? So the driver should be able to
guarantee this constraint is met?
Assuming yes:
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++++++++++++++
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +++
> drivers/iommu/io-pgtable-arm.c | 5 ++++-
> 3 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 058bbb0dbe2e..4423cc7e48cf 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1271,6 +1271,12 @@ void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
> CTXDESC_CD_0_ASET |
> FIELD_PREP(CTXDESC_CD_0_ASID, smmu_domain->asid)
> );
> +
> + /* To enable dirty flag update, set both Access flag and dirty state update */
> + if (pgtbl_cfg->quirks & IO_PGTABLE_QUIRK_ARM_HD)
> + target->data[0] |= cpu_to_le64(CTXDESC_CD_0_TCR_HA |
> + CTXDESC_CD_0_TCR_HD);
> +
> target->data[1] = cpu_to_le64(pgtbl_cfg->arm_lpae_s1_cfg.ttbr &
> CTXDESC_CD_1_TTB0_MASK);
> target->data[3] = cpu_to_le64(pgtbl_cfg->arm_lpae_s1_cfg.mair);
> @@ -2307,6 +2313,13 @@ static const struct iommu_flush_ops arm_smmu_flush_ops = {
> .tlb_add_page = arm_smmu_tlb_inv_page_nosync,
> };
>
> +static bool arm_smmu_dbm_capable(struct arm_smmu_device *smmu)
> +{
> + u32 features = (ARM_SMMU_FEAT_HD | ARM_SMMU_FEAT_COHERENCY);
> +
> + return (smmu->features & features) == features;
> +}
> +
> /* IOMMU API */
> static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
> {
> @@ -2319,6 +2332,8 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
> case IOMMU_CAP_NOEXEC:
> case IOMMU_CAP_DEFERRED_FLUSH:
> return true;
> + case IOMMU_CAP_DIRTY_TRACKING:
> + return arm_smmu_dbm_capable(master->smmu);
> default:
> return false;
> }
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 5e51a6c1d55f..a9cd805e5a1b 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -313,6 +313,9 @@ struct arm_smmu_cd {
> #define CTXDESC_CD_0_TCR_IPS GENMASK_ULL(34, 32)
> #define CTXDESC_CD_0_TCR_TBI0 (1ULL << 38)
>
> +#define CTXDESC_CD_0_TCR_HA (1UL << 43)
> +#define CTXDESC_CD_0_TCR_HD (1UL << 42)
> +
> #define CTXDESC_CD_0_AA64 (1UL << 41)
> #define CTXDESC_CD_0_S (1UL << 44)
> #define CTXDESC_CD_0_R (1UL << 45)
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 1ce7b7a3c1e8..56e88e555fc7 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -429,6 +429,8 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
> pte = ARM_LPAE_PTE_nG;
> if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ))
> pte |= ARM_LPAE_PTE_AP_RDONLY;
> + else if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_HD)
> + pte |= ARM_LPAE_PTE_AP_WRITABLE;
> if (!(prot & IOMMU_PRIV))
> pte |= ARM_LPAE_PTE_AP_UNPRIV;
> } else {
> @@ -948,7 +950,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
>
> if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
> IO_PGTABLE_QUIRK_ARM_TTBR1 |
> - IO_PGTABLE_QUIRK_ARM_OUTER_WBWA))
> + IO_PGTABLE_QUIRK_ARM_OUTER_WBWA |
> + IO_PGTABLE_QUIRK_ARM_HD))
> return NULL;
>
> data = arm_lpae_alloc_pgtable(cfg);
* Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc
2024-04-23 16:39 ` Jason Gunthorpe
@ 2024-04-23 16:50 ` Ryan Roberts
0 siblings, 0 replies; 36+ messages in thread
From: Ryan Roberts @ 2024-04-23 16:50 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Shameer Kolothum, iommu, linux-arm-kernel, joro, kevin.tian,
nicolinc, mshavit, robin.murphy, will, joao.m.martins,
jiangkunkun, zhukeqian1, linuxarm
On 23/04/2024 17:39, Jason Gunthorpe wrote:
> On Tue, Apr 23, 2024 at 05:27:09PM +0100, Ryan Roberts wrote:
>
>>> switch (smmu_domain->stage) {
>>> case ARM_SMMU_DOMAIN_S1:
>>> ias = (smmu->features & ARM_SMMU_FEAT_VAX) ? 52 : 48;
>>> - ias = min_t(unsigned long, ias, VA_BITS);
>>> - oas = smmu->ias;
>>> + pgtbl_cfg.ias = min_t(unsigned long, ias, VA_BITS);
>>
>> I know this isn't changed by this patch, but do we really mean VA_BITS here?
>> Don't we want vabits_actual? I'm guessing we are intending to limit ias to the
>> size the kernel is using.
>
> I think the intention here is to allow CONFIG_ARM64_VA_BITS to choose
> the number of page table levels in the iommu.
Hmm ok, got it. But it seems odd...
> Perhaps a SMMU specific
> config option would be clearer.
Yes, it would certainly be clearer. Anyway, I don't think there is any required
action here. Thanks for explaining.
>
> There is no relationship to the configuration of the S1 paging domain
> and what the MM is doing, there should be no crossing between those
> two worlds.
>
> vabits_actual is used in the SVA code to match the SMMU SVA S1 to the
> mm_struct.
>
>>> @@ -3206,6 +3212,10 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
>>> if (user_data)
>>> return ERR_PTR(-EINVAL);
>>>
>>> + if (enforce_dirty &&
>>> + !device_iommu_capable(dev, IOMMU_CAP_DIRTY_TRACKING))
>>> + return ERR_PTR(-EOPNOTSUPP);
>>
>> I'm guessing the intention is that only a stage 1 will ever be marked with
>> IOMMU_CAP_DIRTY_TRACKING (there are a few places that assume/check we are
>> dealing with S1)? But is there a reason why stage 2 can't be supported as well?
>
> Stage 2 really should ultimately be supported..
>
> I'd say this is a first step as iommufd will naturally prefer the S1
> configuration when not using nesting. After the nesting support is
> merged the vSMMU will require the S2 to do the dirty tracking.
Yep, ok, so we should plan to support S2 as a later step. Sounds good.
>
> Jason
* Re: [PATCH v2 4/4] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping
2024-04-23 16:45 ` Ryan Roberts
@ 2024-04-23 17:32 ` Jason Gunthorpe
2024-04-24 7:58 ` Ryan Roberts
0 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2024-04-23 17:32 UTC (permalink / raw)
To: Ryan Roberts
Cc: Shameer Kolothum, iommu, linux-arm-kernel, joro, kevin.tian,
nicolinc, mshavit, robin.murphy, will, joao.m.martins,
jiangkunkun, zhukeqian1, linuxarm
On Tue, Apr 23, 2024 at 05:45:16PM +0100, Ryan Roberts wrote:
> On 22/02/2024 09:49, Shameer Kolothum wrote:
> > From: Kunkun Jiang <jiangkunkun@huawei.com>
> >
> > If io-pgtable quirk flag indicates support for hardware update of
> > dirty state, enable HA/HD bits in the SMMU CD and also set the DBM
> > bit in the page descriptor.
> >
> > And now report the dirty page tracking capability of SMMUv3.
> >
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
> > Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>
> Section 3.13 of the spec states: "Where translation tables are shared between
> CDs that contain the same ASID (within a translation regime), the CD HA and HD
> fields must be identical."
>
> I don't think the way that smmu domains work, its possible to end up with a
> single pgtable shared between multiple CDs?
It is possible. iommufd can link a single hwpt -> iommu_domain ->
smmu_domain to many different RIDs and to different PASIDs each with
their own CD.
> So the driver should be able to guarantee this constraint is met?
It is expected to be done by this:
> > @@ -1271,6 +1271,12 @@ void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
> > CTXDESC_CD_0_ASET |
> > FIELD_PREP(CTXDESC_CD_0_ASID, smmu_domain->asid)
> > );
> > +
> > + /* To enable dirty flag update, set both Access flag and dirty state update */
> > + if (pgtbl_cfg->quirks & IO_PGTABLE_QUIRK_ARM_HD)
> > + target->data[0] |= cpu_to_le64(CTXDESC_CD_0_TCR_HA |
> > + CTXDESC_CD_0_TCR_HD);
> > +
This function is the only place that programs the ASID into a CD entry
for the domain, and it always derives the HA/HD bits in the same way
from some immutable value stored in the iommu_domain structure.
Jason
* Re: [PATCH v2 4/4] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping
2024-04-23 17:32 ` Jason Gunthorpe
@ 2024-04-24 7:58 ` Ryan Roberts
2024-04-24 12:15 ` Jason Gunthorpe
0 siblings, 1 reply; 36+ messages in thread
From: Ryan Roberts @ 2024-04-24 7:58 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Shameer Kolothum, iommu, linux-arm-kernel, joro, kevin.tian,
nicolinc, mshavit, robin.murphy, will, joao.m.martins,
jiangkunkun, zhukeqian1, linuxarm
On 23/04/2024 18:32, Jason Gunthorpe wrote:
> On Tue, Apr 23, 2024 at 05:45:16PM +0100, Ryan Roberts wrote:
>> On 22/02/2024 09:49, Shameer Kolothum wrote:
>>> From: Kunkun Jiang <jiangkunkun@huawei.com>
>>>
>>> If io-pgtable quirk flag indicates support for hardware update of
>>> dirty state, enable HA/HD bits in the SMMU CD and also set the DBM
>>> bit in the page descriptor.
>>>
>>> And now report the dirty page tracking capability of SMMUv3.
>>>
>>> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>>> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>>> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>>
>> Section 3.13 of the spec states: "Where translation tables are shared between
>> CDs that contain the same ASID (within a translation regime), the CD HA and HD
>> fields must be identical."
>>
>> I don't think the way that smmu domains work, its possible to end up with a
>> single pgtable shared between multiple CDs?
>
> It is possible. iommufd can link a single hwpt -> iommu_domain ->
> smmu_domain to many different RIDs and to different PASIDs each with
> their own CD.
Ahh ok.
>
>> So the driver should be able to guarantee this constraint is met?
>
> It is expected to be done by this:
>
>>> @@ -1271,6 +1271,12 @@ void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
>>> CTXDESC_CD_0_ASET |
>>> FIELD_PREP(CTXDESC_CD_0_ASID, smmu_domain->asid)
>>> );
>>> +
>>> + /* To enable dirty flag update, set both Access flag and dirty state update */
>>> + if (pgtbl_cfg->quirks & IO_PGTABLE_QUIRK_ARM_HD)
>>> + target->data[0] |= cpu_to_le64(CTXDESC_CD_0_TCR_HA |
>>> + CTXDESC_CD_0_TCR_HD);
>>> +
>
> This function is the only place that programs the ASID into a CD entry
> for the domain, and it always derives the HA/HD bits in the same way
> from some immutable value stored in the iommu_domain structure.
But then I don't understand how this code can make the guarantee? Whether or
not dirty tracking is enabled seems to be a property of the domain, not the page
table. (See arm_smmu_domain_finalise()). So if multiple domains can share a page
table surely some of those domains could end up with dirty tracking on and
others with it off. So there would be multiple CDs, all pointing to the same
pgtable but with different HA/HD values?
Or perhaps I've misunderstood your comment and there is actually always a 1:1
relationship between domain and pgtable? But a 1:* relationship between domain
and CD?
>
> Jason
* RE: [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU
2024-04-23 14:41 ` Ryan Roberts
2024-04-23 14:52 ` Jason Gunthorpe
@ 2024-04-24 8:01 ` Shameerali Kolothum Thodi
2024-04-24 8:28 ` Ryan Roberts
1 sibling, 1 reply; 36+ messages in thread
From: Shameerali Kolothum Thodi @ 2024-04-24 8:01 UTC (permalink / raw)
To: Ryan Roberts, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org
Cc: joro@8bytes.org, jgg@nvidia.com, kevin.tian@intel.com,
nicolinc@nvidia.com, mshavit@google.com, robin.murphy@arm.com,
will@kernel.org, joao.m.martins@oracle.com, jiangkunkun,
zhukeqian, Linuxarm
Hi,
>
> Hi,
>
> I'm aiming to (slowly) get more involved with SMMU activities, although I'm
> sure it will take a while to get up to speed and provide useful input. It was
> suggested that this series would be a useful starting point to dip my toe in.
> Please bear with me while I ask stupid questions...
Thanks for going through the series. I am planning to respin this one soon, once
Jason's SMMUv3 refactor series is in good shape.
>
>
> On 22/02/2024 09:49, Shameer Kolothum wrote:
> > From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >
> > If the SMMU supports it and the kernel was built with HTTU support,
> > Probe support for Hardware Translation Table Update (HTTU) which is
> > essentially to enable hardware update of access and dirty flags.
> >
> > Probe and set the smmu::features for Hardware Dirty and Hardware Access
> > bits. This is in preparation, to enable it on the context descriptors of
> > stage 1 format.
> >
> > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> > Signed-off-by: Shameer Kolothum
> <shameerali.kolothum.thodi@huawei.com>
>
> Except for the nit (feel free to ignore it), LGTM!
>
> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
>
>
> > ---
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 32 +++++++++++++++++++++
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 ++++
> > 2 files changed, 37 insertions(+)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 0606166a8781..bd30739e3588 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -4196,6 +4196,28 @@ static void arm_smmu_device_iidr_probe(struct arm_smmu_device *smmu)
> > }
> > }
> >
> > +static void arm_smmu_get_httu(struct arm_smmu_device *smmu, u32 reg)
> > +{
> > + u32 fw_features = smmu->features & (ARM_SMMU_FEAT_HA | ARM_SMMU_FEAT_HD);
> > + u32 features = 0;
> > +
> > + switch (FIELD_GET(IDR0_HTTU, reg)) {
> > + case IDR0_HTTU_ACCESS_DIRTY:
> > + features |= ARM_SMMU_FEAT_HD;
> > + fallthrough;
> > + case IDR0_HTTU_ACCESS:
> > + features |= ARM_SMMU_FEAT_HA;
> > + }
> > +
> > + if (smmu->dev->of_node)
> > + smmu->features |= features;
> > + else if (features != fw_features)
> > + /* ACPI IORT sets the HTTU bits */
> > + dev_warn(smmu->dev,
> > + "IDR0.HTTU overridden by FW configuration (0x%x)\n",
> > + fw_features);
> > +}
> > +
> > static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
> > {
> > u32 reg;
> > @@ -4256,6 +4278,8 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
> > smmu->features |= ARM_SMMU_FEAT_E2H;
> > }
> >
> > + arm_smmu_get_httu(smmu, reg);
> > +
> > /*
> > * The coherency feature as set by FW is used in preference to the ID
> > * register, but warn on mismatch.
> > @@ -4448,6 +4472,14 @@ static int arm_smmu_device_acpi_probe(struct platform_device *pdev,
> > if (iort_smmu->flags & ACPI_IORT_SMMU_V3_COHACC_OVERRIDE)
> > smmu->features |= ARM_SMMU_FEAT_COHERENCY;
> >
> > + switch (FIELD_GET(ACPI_IORT_SMMU_V3_HTTU_OVERRIDE, iort_smmu->flags)) {
> > + case IDR0_HTTU_ACCESS_DIRTY:
> > + smmu->features |= ARM_SMMU_FEAT_HD;
> > + fallthrough;
> > + case IDR0_HTTU_ACCESS:
> > + smmu->features |= ARM_SMMU_FEAT_HA;
> > + }
> > +
> > return 0;
> > }
> > #else
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > index 45bcd72fcda4..5e51a6c1d55f 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > @@ -33,6 +33,9 @@
> > #define IDR0_ASID16 (1 << 12)
> > #define IDR0_ATS (1 << 10)
> > #define IDR0_HYP (1 << 9)
> > +#define IDR0_HTTU GENMASK(7, 6)
> > +#define IDR0_HTTU_ACCESS 1
> > +#define IDR0_HTTU_ACCESS_DIRTY 2
> > #define IDR0_COHACC (1 << 4)
> > #define IDR0_TTF GENMASK(3, 2)
> > #define IDR0_TTF_AARCH64 2
> > @@ -668,6 +671,8 @@ struct arm_smmu_device {
> > #define ARM_SMMU_FEAT_SVA (1 << 17)
> > #define ARM_SMMU_FEAT_E2H (1 << 18)
> > #define ARM_SMMU_FEAT_NESTING (1 << 19)
> > +#define ARM_SMMU_FEAT_HA (1 << 20)
> > +#define ARM_SMMU_FEAT_HD (1 << 21)
>
> nit: HA and HD are a bit opaque, at least to me. I guess they are HW Access
> and HW Dirty? Perhaps ARM_SMMU_FEAT_HW_ACCESS and ARM_SMMU_FEAT_HW_DIRTY
> are more expressive?
Nothing against it. Just that _HW_ is not used in the driver anywhere (AFAICS)
and HA/HD are used in the specification. How about we rename them to
ARM_SMMU_FEAT_HTTU_HA and ARM_SMMU_FEAT_HTTU_HD?
Thanks,
Shameer
>
> > u32 features;
> >
> > #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
>
* RE: [PATCH v2 2/4] iommu/io-pgtable-arm: Add read_and_clear_dirty() support
2024-04-23 15:56 ` Ryan Roberts
@ 2024-04-24 8:01 ` Shameerali Kolothum Thodi
2024-04-24 8:36 ` Ryan Roberts
0 siblings, 1 reply; 36+ messages in thread
From: Shameerali Kolothum Thodi @ 2024-04-24 8:01 UTC (permalink / raw)
To: Ryan Roberts, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org
Cc: joro@8bytes.org, jgg@nvidia.com, kevin.tian@intel.com,
nicolinc@nvidia.com, mshavit@google.com, robin.murphy@arm.com,
will@kernel.org, joao.m.martins@oracle.com, jiangkunkun,
zhukeqian, Linuxarm
>
> On 22/02/2024 09:49, Shameer Kolothum wrote:
> > .read_and_clear_dirty() IOMMU domain op takes care of reading the
> > dirty bits (i.e. PTE has both DBM and AP[2] set) and marshalling into
> > a bitmap of
>
> Err... a PTE is HW-dirty if DBM is set and AP[2] is *clear*, right? (AP[2] is
> "RDONLY" bit, and is initially set, then the HW clears it on first write. See
> pte_hw_dirty() in linux/arch/arm64/include/asm/pgtable.h).
Oops..yes, the comment here is indeed wrong.
I will update and add something like below to make it clear:
                   DBM bit   AP[2] ("RDONLY" bit)
1. writeable_clean    1         1
2. writeable_dirty    1         0
3. read-only          0         1
>
> > a given page size.
> >
> > While reading the dirty bits we also clear the PTE AP[2] bit to mark
> > it as writable-clean depending on read_and_clear_dirty() flags.
>
> You would need to *set* AP[2] to mark it clean.
>
> >
> > Structure it in a way that the IOPTE walker is generic, and so we pass
> > a function pointer over what to do on a per-PTE basis.
> >
> > Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> > Signed-off-by: Shameer Kolothum
> <shameerali.kolothum.thodi@huawei.com>
> > ---
> > drivers/iommu/io-pgtable-arm.c | 128 ++++++++++++++++++++++++++++++++-
> > 1 file changed, 127 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index f7828a7aad41..1ce7b7a3c1e8 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -75,6 +75,7 @@
> >
> > #define ARM_LPAE_PTE_NSTABLE (((arm_lpae_iopte)1) << 63)
> > #define ARM_LPAE_PTE_XN (((arm_lpae_iopte)3) << 53)
> > +#define ARM_LPAE_PTE_DBM (((arm_lpae_iopte)1) << 51)
> > #define ARM_LPAE_PTE_AF (((arm_lpae_iopte)1) << 10)
> > #define ARM_LPAE_PTE_SH_NS (((arm_lpae_iopte)0) << 8)
> > #define ARM_LPAE_PTE_SH_OS (((arm_lpae_iopte)2) << 8)
> > @@ -84,7 +85,7 @@
> >
> > #define ARM_LPAE_PTE_ATTR_LO_MASK (((arm_lpae_iopte)0x3ff) << 2)
> > /* Ignore the contiguous bit for block splitting */
> > -#define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)6) << 52)
> > +#define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)0xd) << 51)
>
> Perhaps this is easier to understand:
>
> #define ARM_LPAE_PTE_ATTR_HI_MASK (ARM_LPAE_PTE_XN |
> ARM_LPAE_PTE_DBM)
Ok.
>
> > #define ARM_LPAE_PTE_ATTR_MASK
> (ARM_LPAE_PTE_ATTR_LO_MASK | \
> > ARM_LPAE_PTE_ATTR_HI_MASK)
> > /* Software bit for solving coherency races */ @@ -93,6 +94,9 @@
> > /* Stage-1 PTE */
> > #define ARM_LPAE_PTE_AP_UNPRIV (((arm_lpae_iopte)1) << 6)
> > #define ARM_LPAE_PTE_AP_RDONLY (((arm_lpae_iopte)2)
> << 6)
> > +#define ARM_LPAE_PTE_AP_RDONLY_BIT 7
>
> Perhaps:
>
> #define ARM_LPAE_PTE_AP_RDONLY_BIT 7
> #define ARM_LPAE_PTE_AP_RDONLY (((arm_lpae_iopte)1) <<
> ARM_LPAE_PTE_AP_RDONLY_BIT)
Ok.
>
> > +#define ARM_LPAE_PTE_AP_WRITABLE (ARM_LPAE_PTE_AP_RDONLY
> | \
> > + ARM_LPAE_PTE_DBM)
>
> Perhaps: ARM_LPAE_PTE_AP_WRITABLE_CLEAN ?
Sure. That’s more clear.
>
> > #define ARM_LPAE_PTE_ATTRINDX_SHIFT 2
> > #define ARM_LPAE_PTE_nG (((arm_lpae_iopte)1) << 11)
> >
> > @@ -729,6 +733,127 @@ static phys_addr_t
> arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
> >  	return iopte_to_paddr(pte, data) | iova;
> > }
> >
> > +struct arm_lpae_iopte_read_dirty {
> > + unsigned long flags;
> > + struct iommu_dirty_bitmap *dirty;
> > +};
> > +
> > +typedef int (*io_pgtable_visit_fn_t)(unsigned long iova, size_t size,
> > +				     arm_lpae_iopte *ptep, void *opaque);
> > +struct io_pgtable_walk_data {
> > +	const io_pgtable_visit_fn_t fn;
> > +	void *opaque;
> > +	u64 addr;
> > +	const u64 end;
> > +};
> > +
> > +static int __arm_lpae_iopte_walk(struct arm_lpae_io_pgtable *data,
> > + struct io_pgtable_walk_data *walk_data,
> > + arm_lpae_iopte *ptep,
> > + int lvl);
> > +
> > +static int __arm_lpae_read_and_clear_dirty(unsigned long iova, size_t size,
> > +					   arm_lpae_iopte *ptep, void *opaque)
> > +{
> > + struct arm_lpae_iopte_read_dirty *arg = opaque;
> > + struct iommu_dirty_bitmap *dirty = arg->dirty;
> > + arm_lpae_iopte pte;
> > +
> > + pte = READ_ONCE(*ptep);
> > + if (WARN_ON(!pte))
> > + return -EINVAL;
>
> You've already read and checked its not zero in io_pgtable_visit(). Either the
> walker is considered generic and probably shouldn't care if the pte is 0, (so
> just do check here). Or the walker is specific for this use case, in which case
> there is no need for the function pointer callback inefficiencies (and you've
> already READ_ONCE() the ptep so no need to repeat. (Multiple ptep_get()s
> does make a meaningful performance impact in the core-mm).
Yes, that check is a duplication. Will remove it. I do have plans to support block
page split/merge functionality at migration start, and with that in mind I tried
to make the walker generic with a callback pointer. Maybe it is not worth it at
this time; I will revisit it when we add the split/merge functionality.
> > +
> > +	if ((pte & ARM_LPAE_PTE_AP_WRITABLE) == ARM_LPAE_PTE_AP_WRITABLE)
> > + return 0;
>
> What about RO ptes? This check only checks that it is writable-clean. So you
> have 2 remaining options; writable-dirty and read-only. Suggest:
Ok. Got it.
>
> if (pte_hw_dirty(pte)) {
> 	iommu_dirty_bitmap_record(dirty, iova, size);
> 	if (!(arg->flags & IOMMU_DIRTY_NO_CLEAR))
> 		set_bit(ARM_LPAE_PTE_AP_RDONLY_BIT, (unsigned long *)ptep);
> }
>
> > +
> > + iommu_dirty_bitmap_record(dirty, iova, size);
> > + if (!(arg->flags & IOMMU_DIRTY_NO_CLEAR))
> > +		set_bit(ARM_LPAE_PTE_AP_RDONLY_BIT, (unsigned long *)ptep);
> > + return 0;
> > +}
> > +
> > +static int io_pgtable_visit(struct arm_lpae_io_pgtable *data,
> > +			    struct io_pgtable_walk_data *walk_data,
> > +			    arm_lpae_iopte *ptep, int lvl)
> > +{
> > + struct io_pgtable *iop = &data->iop;
> > + arm_lpae_iopte pte = READ_ONCE(*ptep);
> > +
> > + if (WARN_ON(!pte))
> > + return -EINVAL;
> > +
> > + if (iopte_leaf(pte, lvl, iop->fmt)) {
> > + size_t size = ARM_LPAE_BLOCK_SIZE(lvl, data);
> > +
> > +		walk_data->fn(walk_data->addr, size, ptep, walk_data->opaque);
> > + walk_data->addr += size;
> > + return 0;
> > + }
> > +
> > +	ptep = iopte_deref(pte, data);
> > +	return __arm_lpae_iopte_walk(data, walk_data, ptep, lvl + 1);
> > +}
> > +
> > +static int __arm_lpae_iopte_walk(struct arm_lpae_io_pgtable *data,
> > + struct io_pgtable_walk_data *walk_data,
> > + arm_lpae_iopte *ptep,
> > + int lvl)
> > +{
> > + u32 idx;
> > + int max_entries, ret;
> > +
> > + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
> > + return -EINVAL;
> > +
> > + if (lvl == data->start_level)
> > + max_entries = ARM_LPAE_PGD_SIZE(data) /
> sizeof(arm_lpae_iopte);
> > + else
> > + max_entries = ARM_LPAE_PTES_PER_TABLE(data);
> > +
> > + for (idx = ARM_LPAE_LVL_IDX(walk_data->addr, lvl, data);
> > + (idx < max_entries) && (walk_data->addr < walk_data->end);
> ++idx) {
> > + arm_lpae_iopte *pteref = &ptep[idx];
>
> nit: Personally I would get rid of this and just pass `ptep + idx` to
> io_pgtable_visit()
Ack.
> > +
> > + ret = io_pgtable_visit(data, walk_data, pteref, lvl);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static int arm_lpae_read_and_clear_dirty(struct io_pgtable_ops *ops,
> > +					 unsigned long iova, size_t size,
> > +					 unsigned long flags,
> > +					 struct iommu_dirty_bitmap *dirty)
> > +{
> > + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
> > + struct io_pgtable_cfg *cfg = &data->iop.cfg;
> > + struct arm_lpae_iopte_read_dirty arg = {
> > + .flags = flags, .dirty = dirty,
> > + };
> > + struct io_pgtable_walk_data walk_data = {
> > + .fn = __arm_lpae_read_and_clear_dirty,
> > + .opaque = &arg,
>
> Do you have plans to reuse the walker? If not, then I wonder if separating
> the argument and callback function are valuable? It will certainly be more
> efficient without the per-pte indirect call.
Ok. I will remove indirection for now and may revisit it when we add split/merge
functionality.
>
> > + .addr = iova,
> > + .end = iova + size - 1,
>
> Bug?: You are initializing end to be inclusive (i.e. last valid address in range).
> But __arm_lpae_iopte_walk() is using it as exclusive (i.e. first address after
> range) - see "walk_data->addr < walk_data->end".
>
> mm code usually treats end as exclusive, so suggest removing the "- 1".
I will double check this.
>
> > + };
> > + arm_lpae_iopte *ptep = data->pgd;
> > + int lvl = data->start_level;
> > + long iaext = (s64)iova >> cfg->ias;
>
> Why cast this to signed? And why is the cast s64 and the result long? Suggest
> they should both be consistent at least. But probably clearer to just do:
>
> WARN_ON((iova + size - 1) & ~(BIT(cfg->ias) - 1))
>
> in the below if()? That way you are checking the full range too.
Ok.
Thanks,
Shameer
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* RE: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc
2024-04-23 16:27 ` Ryan Roberts
2024-04-23 16:39 ` Jason Gunthorpe
@ 2024-04-24 8:27 ` Shameerali Kolothum Thodi
1 sibling, 0 replies; 36+ messages in thread
From: Shameerali Kolothum Thodi @ 2024-04-24 8:27 UTC (permalink / raw)
To: Ryan Roberts, Shameerali Kolothum Thodi, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org
Cc: joro@8bytes.org, jgg@nvidia.com, kevin.tian@intel.com,
nicolinc@nvidia.com, mshavit@google.com, robin.murphy@arm.com,
will@kernel.org, joao.m.martins@oracle.com, jiangkunkun,
zhukeqian
> -----Original Message-----
> From: Ryan Roberts <ryan.roberts@arm.com>
> Sent: Tuesday, April 23, 2024 5:27 PM
> To: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>;
> iommu@lists.linux.dev; linux-arm-kernel@lists.infradead.org
> Cc: joro@8bytes.org; jgg@nvidia.com; kevin.tian@intel.com;
> nicolinc@nvidia.com; mshavit@google.com; robin.murphy@arm.com;
> will@kernel.org; joao.m.martins@oracle.com; jiangkunkun@huawei.com;
> zhukeqian1@huawei.com; linuxarm@huawei.com
> Subject: Re: [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking
> in domain alloc
>
> On 22/02/2024 09:49, Shameer Kolothum wrote:
> > From: Joao Martins <joao.m.martins@oracle.com>
> >
> > This provides all the infrastructure to enable dirty tracking if the
> > hardware has the capability and domain alloc request for it.
> >
> > Please note, we still report no support for IOMMU_CAP_DIRTY_TRACKING
> > as it will finally be enabled in a subsequent patch.
> >
> > Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > ---
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 95 ++++++++++++++++--
> ---
> > include/linux/io-pgtable.h | 4 +
> > 2 files changed, 77 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index bd30739e3588..058bbb0dbe2e 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -43,6 +43,7 @@ MODULE_PARM_DESC(disable_msipolling,
> > "Disable MSI-based polling for CMD_SYNC completion.");
> >
> > static struct iommu_ops arm_smmu_ops;
> > +static struct iommu_dirty_ops arm_smmu_dirty_ops;
> >
> > enum arm_smmu_msi_index {
> > EVTQ_MSI_INDEX,
> > @@ -86,7 +87,8 @@ static struct arm_smmu_option_prop
> arm_smmu_options[] = {
> >
> > static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device
> *smmu);
> > static int arm_smmu_domain_finalise(struct arm_smmu_domain
> *smmu_domain,
> > - struct arm_smmu_device *smmu);
> > + struct arm_smmu_device *smmu,
> > + bool enable_dirty);
> > static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
> > static void arm_smmu_tlb_inv_all_s2(struct arm_smmu_domain
> *smmu_domain);
> >
> > @@ -2378,7 +2380,7 @@ static struct iommu_domain
> *arm_smmu_domain_alloc_paging(struct device *dev)
> > struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> > int ret;
> >
> > - ret = arm_smmu_domain_finalise(smmu_domain, master-
> >smmu);
> > + ret = arm_smmu_domain_finalise(smmu_domain, master-
> >smmu, false);
> > if (ret) {
> > kfree(smmu_domain);
> > return ERR_PTR(ret);
> > @@ -2445,10 +2447,11 @@ static void arm_smmu_domain_free(struct
> iommu_domain *domain)
> > }
> >
> > static int arm_smmu_domain_finalise(struct arm_smmu_domain
> *smmu_domain,
> > - struct arm_smmu_device *smmu)
> > + struct arm_smmu_device *smmu,
> > + bool enable_dirty)
> > {
> > int ret;
> > - unsigned long ias, oas;
> > + unsigned long ias;
> > enum io_pgtable_fmt fmt;
> > struct io_pgtable_cfg pgtbl_cfg;
> > struct io_pgtable_ops *pgtbl_ops;
> > @@ -2459,31 +2462,31 @@ static int arm_smmu_domain_finalise(struct
> arm_smmu_domain *smmu_domain,
> > if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2))
> > smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
> >
> > + pgtbl_cfg = (struct io_pgtable_cfg) {
> > + .pgsize_bitmap = smmu->pgsize_bitmap,
> > + .coherent_walk = smmu->features &
> ARM_SMMU_FEAT_COHERENCY,
> > + .tlb = &arm_smmu_flush_ops,
> > + .iommu_dev = smmu->dev,
> > + };
> > +
> > switch (smmu_domain->stage) {
> > case ARM_SMMU_DOMAIN_S1:
> > ias = (smmu->features & ARM_SMMU_FEAT_VAX) ? 52 : 48;
> > - ias = min_t(unsigned long, ias, VA_BITS);
> > - oas = smmu->ias;
> > + pgtbl_cfg.ias = min_t(unsigned long, ias, VA_BITS);
>
> I know this isn't changed by this patch, but do we really mean VA_BITS here?
> Don't we want vabits_actual? I'm guessing we are intending to limit ias to the
> size the kernel is using.
I see Jason has replied to this.
>
> > + pgtbl_cfg.oas = smmu->ias;
> > + if (enable_dirty)
> > + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_HD;
> > fmt = ARM_64_LPAE_S1;
> > break;
> > case ARM_SMMU_DOMAIN_S2:
> > - ias = smmu->ias;
> > - oas = smmu->oas;
> > + pgtbl_cfg.ias = smmu->ias;
> > + pgtbl_cfg.oas = smmu->oas;
> > fmt = ARM_64_LPAE_S2;
>
> Is it worth adding a WARN_ON(enable_dirty) here?
Not sure it makes a difference as we don’t set the quirk flag here.
>
> > break;
> > default:
> > return -EINVAL;
> > }
> >
> > - pgtbl_cfg = (struct io_pgtable_cfg) {
> > - .pgsize_bitmap = smmu->pgsize_bitmap,
> > - .ias = ias,
> > - .oas = oas,
> > - .coherent_walk = smmu->features &
> ARM_SMMU_FEAT_COHERENCY,
> > - .tlb = &arm_smmu_flush_ops,
> > - .iommu_dev = smmu->dev,
> > - };
> > -
> > pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> > if (!pgtbl_ops)
> > return -ENOMEM;
> > @@ -2491,7 +2494,8 @@ static int arm_smmu_domain_finalise(struct
> arm_smmu_domain *smmu_domain,
> > smmu_domain->domain.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> > smmu_domain->domain.geometry.aperture_end = (1UL <<
> pgtbl_cfg.ias) - 1;
> > smmu_domain->domain.geometry.force_aperture = true;
> > -
> > + if (enable_dirty && smmu_domain->stage ==
> ARM_SMMU_DOMAIN_S1)
> > + smmu_domain->domain.dirty_ops = &arm_smmu_dirty_ops;
> > ret = arm_smmu_domain_alloc_id(smmu, smmu_domain);
> > if (ret < 0) {
> > free_io_pgtable_ops(pgtbl_ops);
> > @@ -2811,7 +2815,7 @@ static int arm_smmu_attach_dev(struct
> iommu_domain *domain, struct device *dev)
> > mutex_lock(&smmu_domain->init_mutex);
> >
> > if (!smmu_domain->smmu) {
> > - ret = arm_smmu_domain_finalise(smmu_domain, smmu);
> > + ret = arm_smmu_domain_finalise(smmu_domain, smmu, false);
> > } else if (smmu_domain->smmu != smmu)
> > ret = -EINVAL;
> >
> > @@ -2876,7 +2880,7 @@ static int arm_smmu_s1_set_dev_pasid(struct
> iommu_domain *domain,
> >
> > mutex_lock(&smmu_domain->init_mutex);
> > if (!smmu_domain->smmu)
> > - ret = arm_smmu_domain_finalise(smmu_domain, smmu);
> > + ret = arm_smmu_domain_finalise(smmu_domain, smmu, false);
> > else if (smmu_domain->smmu != smmu)
> > ret = -EINVAL;
> > mutex_unlock(&smmu_domain->init_mutex);
> > @@ -3193,7 +3197,9 @@ arm_smmu_domain_alloc_user(struct device *dev,
> u32 flags,
> > const struct iommu_user_data *user_data)
> > {
> > struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> > - const u32 paging_flags = IOMMU_HWPT_ALLOC_NEST_PARENT;
> > + const u32 paging_flags = IOMMU_HWPT_ALLOC_NEST_PARENT |
> > + IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
> > + bool enforce_dirty = flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
>
> nit: It's called enable_dirty in other places; I think that is more appropriate
> here?
Ok.
> > struct arm_smmu_domain *smmu_domain;
> > int ret;
> >
> > @@ -3206,6 +3212,10 @@ arm_smmu_domain_alloc_user(struct device
> *dev, u32 flags,
> > if (user_data)
> > return ERR_PTR(-EINVAL);
> >
> > + if (enforce_dirty &&
> > + !device_iommu_capable(dev, IOMMU_CAP_DIRTY_TRACKING))
> > + return ERR_PTR(-EOPNOTSUPP);
>
> I'm guessing the intention is that only a stage 1 will ever be marked with
> IOMMU_CAP_DIRTY_TRACKING (there are a few places that assume/check we
> are
> dealing with S1)? But is there a reason why stage 2 can't be supported as well?
We don’t have nested support yet. S2 support will be added in the future.
> > +
> > smmu_domain = arm_smmu_domain_alloc();
> > if (!smmu_domain)
> > return ERR_PTR(-ENOMEM);
> > @@ -3221,7 +3231,7 @@ arm_smmu_domain_alloc_user(struct device *dev,
> u32 flags,
> >
> > smmu_domain->domain.type = IOMMU_DOMAIN_UNMANAGED;
> > smmu_domain->domain.ops = arm_smmu_ops.default_domain_ops;
> > - ret = arm_smmu_domain_finalise(smmu_domain, master->smmu);
> > + ret = arm_smmu_domain_finalise(smmu_domain, master->smmu,
> enforce_dirty);
> > if (ret)
> > goto err_free;
> > return &smmu_domain->domain;
> > @@ -3470,6 +3480,42 @@ static void arm_smmu_release_device(struct
> device *dev)
> > kfree(master);
> > }
> >
> > +static int arm_smmu_read_and_clear_dirty(struct iommu_domain *domain,
> > + unsigned long iova, size_t size,
> > + unsigned long flags,
> > + struct iommu_dirty_bitmap *dirty)
> > +{
> > + struct arm_smmu_domain *smmu_domain =
> to_smmu_domain(domain);
> > + struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
> > +
> > + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
> > + return -EINVAL;
>
> You've only attached the dirty_ops if it was S1 in the first place, so this
> check seems overkill to me.
>
> > +
> > + if (WARN_ON_ONCE(!ops || !ops->read_and_clear_dirty))
> > + return -ENODEV;
>
> And here; could this be moved to where you attach the dirty_ops?
Yes. Jason has also made the same comment on this and will be removed in
next revision.
> > +
> > + return ops->read_and_clear_dirty(ops, iova, size, flags, dirty);
> > +}
> > +
> > +static int arm_smmu_set_dirty_tracking(struct iommu_domain *domain,
> > + bool enabled)
> > +{
> > + struct arm_smmu_domain *smmu_domain =
> to_smmu_domain(domain);
> > + struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
> > +
> > + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
> > + return -EINVAL;
> > +
> > + if (WARN_ON_ONCE(!ops))
> > + return -ENODEV;
>
> Same comments for the 2 checks.
Sure.
Thanks,
Shameer
* Re: [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU
2024-04-24 8:01 ` Shameerali Kolothum Thodi
@ 2024-04-24 8:28 ` Ryan Roberts
0 siblings, 0 replies; 36+ messages in thread
From: Ryan Roberts @ 2024-04-24 8:28 UTC (permalink / raw)
To: Shameerali Kolothum Thodi, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org
Cc: joro@8bytes.org, jgg@nvidia.com, kevin.tian@intel.com,
nicolinc@nvidia.com, mshavit@google.com, robin.murphy@arm.com,
will@kernel.org, joao.m.martins@oracle.com, jiangkunkun,
zhukeqian, Linuxarm
On 24/04/2024 09:01, Shameerali Kolothum Thodi wrote:
> Hi,
>
>> -----Original Message-----
>> From: Ryan Roberts <ryan.roberts@arm.com>
>> Sent: Tuesday, April 23, 2024 3:42 PM
>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>> iommu@lists.linux.dev; linux-arm-kernel@lists.infradead.org
>> Cc: joro@8bytes.org; jgg@nvidia.com; kevin.tian@intel.com;
>> nicolinc@nvidia.com; mshavit@google.com; robin.murphy@arm.com;
>> will@kernel.org; joao.m.martins@oracle.com; jiangkunkun
>> <jiangkunkun@huawei.com>; zhukeqian <zhukeqian1@huawei.com>;
>> Linuxarm <linuxarm@huawei.com>
>> Subject: Re: [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for
>> HTTU
>>
>> Hi,
>>
>> I'm aiming to (slowly) get more involved with SMMU activities, although I'm
>> sure
>> it will take a while to get up to speed and provide useful input. It was
>> suggested that this series would be a useful starting point to dip my toe in.
>> Please bear with me while I ask stupid questions...
>
> Thanks for going through the series. I am planning to respin this one soon, once
> Jason's SMMUv3 refactor series is in good shape.
>
>>
>>
>> On 22/02/2024 09:49, Shameer Kolothum wrote:
>>> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>>
>>> If the SMMU supports it and the kernel was built with HTTU support,
>>> probe support for Hardware Translation Table Update (HTTU), which
>>> essentially enables hardware update of the access and dirty flags.
>>>
>>> Probe and set the smmu::features for Hardware Dirty and Hardware Access
>>> bits. This is in preparation, to enable it on the context descriptors of
>>> stage 1 format.
>>>
>>> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
>>> Signed-off-by: Shameer Kolothum
>> <shameerali.kolothum.thodi@huawei.com>
>>
>> Except for the nit (feel free to ignore it), LGTM!
>>
>> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
>>
>>
>>> ---
>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 32
>> +++++++++++++++++++++
>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 ++++
>>> 2 files changed, 37 insertions(+)
>>>
>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> index 0606166a8781..bd30739e3588 100644
>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> @@ -4196,6 +4196,28 @@ static void arm_smmu_device_iidr_probe(struct
>> arm_smmu_device *smmu)
>>> }
>>> }
>>>
>>> +static void arm_smmu_get_httu(struct arm_smmu_device *smmu, u32
>> reg)
>>> +{
>>> + u32 fw_features = smmu->features & (ARM_SMMU_FEAT_HA |
>> ARM_SMMU_FEAT_HD);
>>> + u32 features = 0;
>>> +
>>> + switch (FIELD_GET(IDR0_HTTU, reg)) {
>>> + case IDR0_HTTU_ACCESS_DIRTY:
>>> + features |= ARM_SMMU_FEAT_HD;
>>> + fallthrough;
>>> + case IDR0_HTTU_ACCESS:
>>> + features |= ARM_SMMU_FEAT_HA;
>>> + }
>>> +
>>> + if (smmu->dev->of_node)
>>> + smmu->features |= features;
>>> + else if (features != fw_features)
>>> + /* ACPI IORT sets the HTTU bits */
>>> + dev_warn(smmu->dev,
>>> + "IDR0.HTTU overridden by FW configuration
>> (0x%x)\n",
>>> + fw_features);
>>> +}
>>> +
>>> static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
>>> {
>>> u32 reg;
>>> @@ -4256,6 +4278,8 @@ static int arm_smmu_device_hw_probe(struct
>> arm_smmu_device *smmu)
>>> smmu->features |= ARM_SMMU_FEAT_E2H;
>>> }
>>>
>>> + arm_smmu_get_httu(smmu, reg);
>>> +
>>> /*
>>> * The coherency feature as set by FW is used in preference to the ID
>>> * register, but warn on mismatch.
>>> @@ -4448,6 +4472,14 @@ static int arm_smmu_device_acpi_probe(struct
>> platform_device *pdev,
>>> if (iort_smmu->flags & ACPI_IORT_SMMU_V3_COHACC_OVERRIDE)
>>> smmu->features |= ARM_SMMU_FEAT_COHERENCY;
>>>
>>> + switch (FIELD_GET(ACPI_IORT_SMMU_V3_HTTU_OVERRIDE,
>> iort_smmu->flags)) {
>>> + case IDR0_HTTU_ACCESS_DIRTY:
>>> + smmu->features |= ARM_SMMU_FEAT_HD;
>>> + fallthrough;
>>> + case IDR0_HTTU_ACCESS:
>>> + smmu->features |= ARM_SMMU_FEAT_HA;
>>> + }
>>> +
>>> return 0;
>>> }
>>> #else
>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>> index 45bcd72fcda4..5e51a6c1d55f 100644
>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>> @@ -33,6 +33,9 @@
>>> #define IDR0_ASID16 (1 << 12)
>>> #define IDR0_ATS (1 << 10)
>>> #define IDR0_HYP (1 << 9)
>>> +#define IDR0_HTTU GENMASK(7, 6)
>>> +#define IDR0_HTTU_ACCESS 1
>>> +#define IDR0_HTTU_ACCESS_DIRTY 2
>>> #define IDR0_COHACC (1 << 4)
>>> #define IDR0_TTF GENMASK(3, 2)
>>> #define IDR0_TTF_AARCH64 2
>>> @@ -668,6 +671,8 @@ struct arm_smmu_device {
>>> #define ARM_SMMU_FEAT_SVA (1 << 17)
>>> #define ARM_SMMU_FEAT_E2H (1 << 18)
>>> #define ARM_SMMU_FEAT_NESTING (1 << 19)
>>> +#define ARM_SMMU_FEAT_HA (1 << 20)
>>> +#define ARM_SMMU_FEAT_HD (1 << 21)
>>
>> nit: HA and HD are a bit opaque, at least to me. I guess they are HW Access
>> and
>> HW Dirty? Perhaps ARM_SMMU_FEAT_HW_ACCESS and
>> ARM_SMMU_FEAT_HW_DIRTY are more
>> expressive?
>
> Nothing against it. Just that _HW_ is not used anywhere in the driver (AFAICS)
> and HA/HD are the names used in the specification.
Ahh yes, I see that now.
> How about we rename to
> ARM_SMMU_FEAT_HTTU_HA and ARM_SMMU_FEAT_HTTU_HD?
Yes, that's much more obvious from where I'm standing. But given HA and HD are
fields in the spec, I'm also fine with leaving as is. Up to you.
>
> Thanks,
> Shameer
>
>>
>>> u32 features;
>>>
>>> #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
>>
>
* Re: [PATCH v2 2/4] iommu/io-pgtable-arm: Add read_and_clear_dirty() support
2024-04-24 8:01 ` Shameerali Kolothum Thodi
@ 2024-04-24 8:36 ` Ryan Roberts
0 siblings, 0 replies; 36+ messages in thread
From: Ryan Roberts @ 2024-04-24 8:36 UTC (permalink / raw)
To: Shameerali Kolothum Thodi, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org
Cc: joro@8bytes.org, jgg@nvidia.com, kevin.tian@intel.com,
nicolinc@nvidia.com, mshavit@google.com, robin.murphy@arm.com,
will@kernel.org, joao.m.martins@oracle.com, jiangkunkun,
zhukeqian, Linuxarm
On 24/04/2024 09:01, Shameerali Kolothum Thodi wrote:
>
>
>> -----Original Message-----
>> From: Ryan Roberts <ryan.roberts@arm.com>
>> Sent: Tuesday, April 23, 2024 4:56 PM
>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>> iommu@lists.linux.dev; linux-arm-kernel@lists.infradead.org
>> Cc: joro@8bytes.org; jgg@nvidia.com; kevin.tian@intel.com;
>> nicolinc@nvidia.com; mshavit@google.com; robin.murphy@arm.com;
>> will@kernel.org; joao.m.martins@oracle.com; jiangkunkun
>> <jiangkunkun@huawei.com>; zhukeqian <zhukeqian1@huawei.com>;
>> Linuxarm <linuxarm@huawei.com>
>> Subject: Re: [PATCH v2 2/4] iommu/io-pgtable-arm: Add
>> read_and_clear_dirty() support
>>
>> On 22/02/2024 09:49, Shameer Kolothum wrote:
>>> .read_and_clear_dirty() IOMMU domain op takes care of reading the
>>> dirty bits (i.e. PTE has both DBM and AP[2] set) and marshalling into
>>> a bitmap of
>>
>> Err... a PTE is HW-dirty if DBM is set and AP[2] is *clear*, right? (AP[2] is
>> "RDONLY" bit, and is initially set, then the HW clears it on first write. See
>> pte_hw_dirty() in linux/arch/arm64/include/asm/pgtable.h).
>
> Oops..yes, the comment here is indeed wrong.
>
> I will update and add something like below to make it clear:
>
>                      DBM bit   AP[2] ("RDONLY" bit)
> 1. writeable_clean      1            1
> 2. writeable_dirty      1            0
> 3. read-only            0            1
Yes, sounds good!
>
>>
>>> a given page size.
>>>
>>> While reading the dirty bits we also clear the PTE AP[2] bit to mark
>>> it as writable-clean depending on read_and_clear_dirty() flags.
>>
>> You would need to *set* AP[2] to mark it clean.
>>
>>>
>>> Structure it in a way that the IOPTE walker is generic, and so we pass
>>> a function pointer over what to do on a per-PTE basis.
>>>
>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>> Signed-off-by: Shameer Kolothum
>> <shameerali.kolothum.thodi@huawei.com>
>>> ---
>>> drivers/iommu/io-pgtable-arm.c | 128
>>> ++++++++++++++++++++++++++++++++-
>>> 1 file changed, 127 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/iommu/io-pgtable-arm.c
>>> b/drivers/iommu/io-pgtable-arm.c index f7828a7aad41..1ce7b7a3c1e8
>>> 100644
>>> --- a/drivers/iommu/io-pgtable-arm.c
>>> +++ b/drivers/iommu/io-pgtable-arm.c
>>> @@ -75,6 +75,7 @@
>>>
>>> #define ARM_LPAE_PTE_NSTABLE (((arm_lpae_iopte)1) << 63)
>>> #define ARM_LPAE_PTE_XN (((arm_lpae_iopte)3) << 53)
>>> +#define ARM_LPAE_PTE_DBM (((arm_lpae_iopte)1) << 51)
>>> #define ARM_LPAE_PTE_AF (((arm_lpae_iopte)1) << 10)
>>> #define ARM_LPAE_PTE_SH_NS (((arm_lpae_iopte)0) << 8)
>>> #define ARM_LPAE_PTE_SH_OS (((arm_lpae_iopte)2) << 8)
>>> @@ -84,7 +85,7 @@
>>>
>>> #define ARM_LPAE_PTE_ATTR_LO_MASK (((arm_lpae_iopte)0x3ff) <<
>> 2)
>>> /* Ignore the contiguous bit for block splitting */
>>> -#define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)6) << 52)
>>> +#define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)0xd) <<
>> 51)
>>
>> Perhaps this is easier to understand:
>>
>> #define ARM_LPAE_PTE_ATTR_HI_MASK (ARM_LPAE_PTE_XN |
>> ARM_LPAE_PTE_DBM)
>
> Ok.
>
>>
>>> #define ARM_LPAE_PTE_ATTR_MASK
>> (ARM_LPAE_PTE_ATTR_LO_MASK | \
>>> ARM_LPAE_PTE_ATTR_HI_MASK)
>>> /* Software bit for solving coherency races */ @@ -93,6 +94,9 @@
>>> /* Stage-1 PTE */
>>> #define ARM_LPAE_PTE_AP_UNPRIV (((arm_lpae_iopte)1) << 6)
>>> #define ARM_LPAE_PTE_AP_RDONLY (((arm_lpae_iopte)2)
>> << 6)
>>> +#define ARM_LPAE_PTE_AP_RDONLY_BIT 7
>>
>> Perhaps:
>>
>> #define ARM_LPAE_PTE_AP_RDONLY_BIT 7
>> #define ARM_LPAE_PTE_AP_RDONLY (((arm_lpae_iopte)1) <<
>> ARM_LPAE_PTE_AP_RDONLY_BIT)
>
> Ok.
>
>>
>>> +#define ARM_LPAE_PTE_AP_WRITABLE (ARM_LPAE_PTE_AP_RDONLY
>> | \
>>> + ARM_LPAE_PTE_DBM)
>>
>> Perhaps: ARM_LPAE_PTE_AP_WRITABLE_CLEAN ?
>
> Sure. That’s more clear.
>
> >
>>> #define ARM_LPAE_PTE_ATTRINDX_SHIFT 2
>>> #define ARM_LPAE_PTE_nG (((arm_lpae_iopte)1) << 11)
>>>
>>> @@ -729,6 +733,127 @@ static phys_addr_t
>> arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
>>>  	return iopte_to_paddr(pte, data) | iova;
>>> }
>>>
>>> +struct arm_lpae_iopte_read_dirty {
>>> + unsigned long flags;
>>> + struct iommu_dirty_bitmap *dirty;
>>> +};
>>> +
>>> +typedef int (*io_pgtable_visit_fn_t)(unsigned long iova, size_t size,
>>> +				     arm_lpae_iopte *ptep, void *opaque);
>>> +struct io_pgtable_walk_data {
>>> +	const io_pgtable_visit_fn_t fn;
>>> +	void *opaque;
>>> +	u64 addr;
>>> +	const u64 end;
>>> +};
>>> +
>>> +static int __arm_lpae_iopte_walk(struct arm_lpae_io_pgtable *data,
>>> + struct io_pgtable_walk_data *walk_data,
>>> + arm_lpae_iopte *ptep,
>>> + int lvl);
>>> +
>>> +static int __arm_lpae_read_and_clear_dirty(unsigned long iova, size_t size,
>>> +					   arm_lpae_iopte *ptep, void *opaque)
>>> +{
>>> + struct arm_lpae_iopte_read_dirty *arg = opaque;
>>> + struct iommu_dirty_bitmap *dirty = arg->dirty;
>>> + arm_lpae_iopte pte;
>>> +
>>> + pte = READ_ONCE(*ptep);
>>> + if (WARN_ON(!pte))
>>> + return -EINVAL;
>>
>> You've already read and checked its not zero in io_pgtable_visit(). Either the
>> walker is considered generic and probably shouldn't care if the pte is 0, (so
>> just do check here). Or the walker is specific for this use case, in which case
>> there is no need for the function pointer callback inefficiencies (and you've
>> already READ_ONCE() the ptep so no need to repeat. (Multiple ptep_get()s
>> does make a meaningful performance impact in the core-mm).
>
> Yes, that check is a duplication. Will remove it. I do have plans to support block
> page split/merge functionality at migration start, and with that in mind I tried
> to make the walker generic with a callback pointer. Maybe it is not worth it at
> this time; I will revisit it when we add the split/merge functionality.
Ahh ok. My preference would be to keep it simple while this is the only user,
then generalize it when you add a second user. But if you plan to add the second
user imminently, then feel free to leave it general.
>
> > > +
>>> +	if ((pte & ARM_LPAE_PTE_AP_WRITABLE) == ARM_LPAE_PTE_AP_WRITABLE)
>>> + return 0;
>>
>> What about RO ptes? This check only checks that it is writable-clean. So you
>> have 2 remaining options; writable-dirty and read-only. Suggest:
>
> Ok. Got it.
>
>>
>> if (pte_hw_dirty(pte)) {
>> 	iommu_dirty_bitmap_record(dirty, iova, size);
>> 	if (!(arg->flags & IOMMU_DIRTY_NO_CLEAR))
>> 		set_bit(ARM_LPAE_PTE_AP_RDONLY_BIT, (unsigned long *)ptep);
>> }
>>
>>> +
>>> + iommu_dirty_bitmap_record(dirty, iova, size);
>>> + if (!(arg->flags & IOMMU_DIRTY_NO_CLEAR))
>>> +		set_bit(ARM_LPAE_PTE_AP_RDONLY_BIT, (unsigned long *)ptep);
>>> + return 0;
>>> +}
>>> +
>>> +static int io_pgtable_visit(struct arm_lpae_io_pgtable *data,
>>> +			    struct io_pgtable_walk_data *walk_data,
>>> +			    arm_lpae_iopte *ptep, int lvl)
>>> +{
>>> + struct io_pgtable *iop = &data->iop;
>>> + arm_lpae_iopte pte = READ_ONCE(*ptep);
>>> +
>>> + if (WARN_ON(!pte))
>>> + return -EINVAL;
>>> +
>>> + if (iopte_leaf(pte, lvl, iop->fmt)) {
>>> + size_t size = ARM_LPAE_BLOCK_SIZE(lvl, data);
>>> +
>>> +		walk_data->fn(walk_data->addr, size, ptep, walk_data->opaque);
>>> + walk_data->addr += size;
>>> + return 0;
>>> + }
>>> +
>>> +	ptep = iopte_deref(pte, data);
>>> +	return __arm_lpae_iopte_walk(data, walk_data, ptep, lvl + 1);
>>> +}
>>> +
>>> +static int __arm_lpae_iopte_walk(struct arm_lpae_io_pgtable *data,
>>> + struct io_pgtable_walk_data *walk_data,
>>> + arm_lpae_iopte *ptep,
>>> + int lvl)
>>> +{
>>> + u32 idx;
>>> + int max_entries, ret;
>>> +
>>> + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
>>> + return -EINVAL;
>>> +
>>> + if (lvl == data->start_level)
>>> + max_entries = ARM_LPAE_PGD_SIZE(data) /
>> sizeof(arm_lpae_iopte);
>>> + else
>>> + max_entries = ARM_LPAE_PTES_PER_TABLE(data);
>>> +
>>> + for (idx = ARM_LPAE_LVL_IDX(walk_data->addr, lvl, data);
>>> + (idx < max_entries) && (walk_data->addr < walk_data->end);
>> ++idx) {
>>> + arm_lpae_iopte *pteref = &ptep[idx];
>>
>> nit: Personally I would get rid of this and just pass `ptep + idx` to
>> io_pgtable_visit()
>
> Ack.
>
>>> +
>>> + ret = io_pgtable_visit(data, walk_data, pteref, lvl);
>>> + if (ret)
>>> + return ret;
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static int arm_lpae_read_and_clear_dirty(struct io_pgtable_ops *ops,
>>> +					 unsigned long iova, size_t size,
>>> +					 unsigned long flags,
>>> +					 struct iommu_dirty_bitmap *dirty)
>>> +{
>>> +	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
>>> +	struct io_pgtable_cfg *cfg = &data->iop.cfg;
>>> +	struct arm_lpae_iopte_read_dirty arg = {
>>> +		.flags = flags, .dirty = dirty,
>>> +	};
>>> +	struct io_pgtable_walk_data walk_data = {
>>> +		.fn = __arm_lpae_read_and_clear_dirty,
>>> +		.opaque = &arg,
>>
>> Do you have plans to reuse the walker? If not, then I wonder whether separating
>> the argument and callback function is valuable? It will certainly be more
>> efficient without the per-pte indirect call.
>
> Ok. I will remove indirection for now and may revisit it when we add split/merge
> functionality.
>
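The refactor agreed above (dropping the `fn`/`opaque` indirection and calling the dirty-read logic directly) can be sketched roughly as below. This is a toy model with invented names and types, not the kernel's; the point is only that the leaf visitor is a direct, inlinable call rather than an indirect call taken once per PTE:

```c
#include <stddef.h>

/* Illustrative stand-ins for the kernel types; not the real definitions. */
typedef unsigned long iopte;
#define PTE_DIRTY 0x1UL

struct walk_data {
	unsigned long addr;
	unsigned long end;	/* exclusive end of the walked range */
	unsigned long nr_dirty;	/* stands in for the dirty bitmap */
};

/* Direct call instead of walk_data->fn(): the walker knows it is doing
 * read-and-clear-dirty, so no per-PTE indirect call is taken. */
static void visit_leaf(struct walk_data *wd, iopte *ptep, size_t size)
{
	if (*ptep & PTE_DIRTY) {
		wd->nr_dirty++;
		*ptep &= ~PTE_DIRTY;	/* clear the dirty state */
	}
	wd->addr += size;
}

static unsigned long walk_leaves(struct walk_data *wd, iopte *ptes,
				 size_t n, size_t pgsize)
{
	for (size_t i = 0; i < n && wd->addr < wd->end; i++)
		visit_leaf(wd, &ptes[i], pgsize);
	return wd->nr_dirty;
}

/* Tiny driver: four leaf PTEs, three of them dirty. */
static unsigned long demo_walk(void)
{
	iopte ptes[4] = { PTE_DIRTY, 0, PTE_DIRTY, PTE_DIRTY };
	struct walk_data wd = { .addr = 0, .end = 0x4000, .nr_dirty = 0 };

	return walk_leaves(&wd, ptes, 4, 0x1000);
}
```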
>>
>>> +		.addr = iova,
>>> +		.end = iova + size - 1,
>>
>> Bug?: You are initializing end to be inclusive (i.e. last valid address in range).
>> But __arm_lpae_iopte_walk() is using it as exclusive (i.e. first address after
>> range) - see "walk_data->addr < walk_data->end".
>>
>> mm code usually treats end as exclusive, so suggest removing the "- 1".
>
> I will double check this.
Yes please do. It certainly looks like a bug, but I may be wrong!
>
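The inclusive-vs-exclusive distinction queried above can be illustrated with a minimal sketch (invented names, not the kernel code). With an exclusive `end` the loop condition `addr < end` walks exactly the requested range, and the degenerate zero-size case terminates immediately, whereas `end = iova + size - 1` underflows for `size == 0`:

```c
#include <stddef.h>

/* Counts pages visited when "end" is exclusive (first address after the
 * range), the convention mm code usually follows. */
static size_t pages_walked(unsigned long addr, unsigned long end,
			   unsigned long pgsize)
{
	size_t n = 0;

	for (; addr < end; addr += pgsize)
		n++;
	return n;
}
```

With `end = iova + size`, a 3-page range yields 3 visits and a zero-size range yields none; the inclusive `- 1` form would wrap `end` to ULONG_MAX for `size == 0`.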
>>
>>> +	};
>>> +	arm_lpae_iopte *ptep = data->pgd;
>>> +	int lvl = data->start_level;
>>> +	long iaext = (s64)iova >> cfg->ias;
>>
>> Why cast this to signed? And why is the cast s64 and the result long? Suggest
>> they should both be consistent at least. But probably clearer to just do:
>>
>> WARN_ON((iova + size - 1) & ~(BIT_ULL(cfg->ias) - 1))
>>
>> in the below if()? That way you are checking the full range too.
>
> Ok.
>
> Thanks,
> Shameer
>
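The suggested full-range check can be modelled like so (a toy sketch with invented names; `ias` is the input address size in bits). Checking `iova + size - 1` against the bits above the IAS also catches ranges that start in bounds but run past the end of the input address space:

```c
/* Returns nonzero if any byte of [iova, iova + size) lies above the
 * 2^ias input address space. */
static int range_exceeds_ias(unsigned long long iova,
			     unsigned long long size, unsigned int ias)
{
	unsigned long long oob_mask = ~((1ULL << ias) - 1);

	return ((iova + size - 1) & oob_mask) != 0;
}
```

Note the third case below: a start-only check (like the `iova >> ias` test) would miss a range that crosses the IAS boundary.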
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU
2024-04-23 14:52 ` Jason Gunthorpe
@ 2024-04-24 10:04 ` Ryan Roberts
2024-04-24 12:23 ` Jason Gunthorpe
0 siblings, 1 reply; 36+ messages in thread
From: Ryan Roberts @ 2024-04-24 10:04 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Shameer Kolothum, iommu, linux-arm-kernel, joro, kevin.tian,
nicolinc, mshavit, robin.murphy, will, joao.m.martins,
jiangkunkun, zhukeqian1, linuxarm
On 23/04/2024 15:52, Jason Gunthorpe wrote:
> On Tue, Apr 23, 2024 at 03:41:30PM +0100, Ryan Roberts wrote:
>> Hi,
>>
>> I'm aiming to (slowly) get more involved with SMMU activities, although I'm sure
>> it will take a while to get up to speed and provide useful input. It was
>> suggested that this series would be a useful starting point to dip my toe in.
>> Please bear with me while I ask stupid questions...
>
> Nice!
>
> I would like to see this merged as it was part of the original three
> implementations for iommufd dirty tracking. The other two have been
> merged for a long time now..
My understanding is that this series should pretty much apply on top of
mainline, is that correct? (in practice there are some conflicts, but I think
they are trivial).
I don't think it depends on anything new in your smmuv3_newapi branch?
Could you point me to any series you have on the list that I could help review?
>
> Jason
* Re: [PATCH v2 4/4] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping
2024-04-24 7:58 ` Ryan Roberts
@ 2024-04-24 12:15 ` Jason Gunthorpe
2024-04-24 12:45 ` Ryan Roberts
0 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2024-04-24 12:15 UTC (permalink / raw)
To: Ryan Roberts
Cc: Shameer Kolothum, iommu, linux-arm-kernel, joro, kevin.tian,
nicolinc, mshavit, robin.murphy, will, joao.m.martins,
jiangkunkun, zhukeqian1, linuxarm
On Wed, Apr 24, 2024 at 08:58:41AM +0100, Ryan Roberts wrote:
> Or perhaps I've misunderstood your comment and there is actually always a 1:1
> relationship between domain and pgtable? But a 1:* relationship between domain
> and CD?
Yes, this is the case
Jason
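The ownership model confirmed here can be sketched with hypothetical struct names (not the driver's actual definitions): each domain owns exactly one io-pgtable, while any number of Context Descriptors may reference the same domain, one per attached device/stream:

```c
#include <stddef.h>

struct io_pgtable_sketch { unsigned long pgd; };

struct smmu_domain_sketch {
	struct io_pgtable_sketch pgtbl;	/* 1:1 - owned by the domain */
};

struct context_desc_sketch {
	struct smmu_domain_sketch *domain;	/* many CDs -> one domain */
};

/* Two devices attached to the same domain share one page table. */
static int cds_share_pgtable(void)
{
	static struct smmu_domain_sketch dom;
	struct context_desc_sketch cd_dev0 = { .domain = &dom };
	struct context_desc_sketch cd_dev1 = { .domain = &dom };

	return &cd_dev0.domain->pgtbl == &cd_dev1.domain->pgtbl;
}
```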
* Re: [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU
2024-04-24 10:04 ` Ryan Roberts
@ 2024-04-24 12:23 ` Jason Gunthorpe
2024-04-24 12:59 ` Ryan Roberts
2024-04-24 13:20 ` Shameerali Kolothum Thodi
0 siblings, 2 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2024-04-24 12:23 UTC (permalink / raw)
To: Ryan Roberts
Cc: Shameer Kolothum, iommu, linux-arm-kernel, joro, kevin.tian,
nicolinc, mshavit, robin.murphy, will, joao.m.martins,
jiangkunkun, zhukeqian1, linuxarm
On Wed, Apr 24, 2024 at 11:04:43AM +0100, Ryan Roberts wrote:
> On 23/04/2024 15:52, Jason Gunthorpe wrote:
> > On Tue, Apr 23, 2024 at 03:41:30PM +0100, Ryan Roberts wrote:
> >> Hi,
> >>
> >> I'm aiming to (slowly) get more involved with SMMU activities, although I'm sure
> >> it will take a while to get up to speed and provide useful input. It was
> >> suggested that this series would be a useful starting point to dip my toe in.
> >> Please bear with me while I ask stupid questions...
> >
> > Nice!
> >
> > I would like to see this merged as it was part of the original three
> > implementations for iommufd dirty tracking. The other two have been
> > merged for a long time now..
>
> My understanding is that this series should pretty much apply on top of
> mainline, is that correct? (in practice there are some conflicts, but I think
> they are trivial).
It was originally based on my part 2, but I suspect it could apply
easily on top of part 2a. I don't think it is worth rebasing to
v6.9-rc since I expect Will to take 2a.
> I don't think it depends on anything new in your smmuv3_newapi branch?
I feel part 2a is done, hopefully Will will take it soon:
https://lore.kernel.org/linux-iommu/0-v8-4c4298c63951+13484-smmuv3_newapi_p2_jgg@nvidia.com/
There are quite a few distros waiting on this:
https://lore.kernel.org/linux-iommu/cover.1712977210.git.nicolinc@nvidia.com/
The cmdq area is not something I've studied deeply
I will be reposting the latter half of this as a part 2b:
https://lore.kernel.org/linux-iommu/0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com/
Which is the prerequisite for the part 3 series to enable nested
translation and vSMMUv3 for iommufd.
Nested translation and a few more bits is prerequisite to finally enable BTM:
https://lore.kernel.org/linux-iommu/20240208151837.35068-1-shameerali.kolothum.thodi@huawei.com/
Thanks,
Jason
* Re: [PATCH v2 4/4] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping
2024-04-24 12:15 ` Jason Gunthorpe
@ 2024-04-24 12:45 ` Ryan Roberts
0 siblings, 0 replies; 36+ messages in thread
From: Ryan Roberts @ 2024-04-24 12:45 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Shameer Kolothum, iommu, linux-arm-kernel, joro, kevin.tian,
nicolinc, mshavit, robin.murphy, will, joao.m.martins,
jiangkunkun, zhukeqian1, linuxarm
On 24/04/2024 13:15, Jason Gunthorpe wrote:
> On Wed, Apr 24, 2024 at 08:58:41AM +0100, Ryan Roberts wrote:
>
>> Or perhaps I've misunderstood your comment and there is actually always a 1:1
>> relationship between domain and pgtable? But a 1:* relationship between domain
>> and CD?
>
> Yes, this is the case
Great thanks! See - I said I'd be asking stupid questions...
>
> Jason
* Re: [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU
2024-04-24 12:23 ` Jason Gunthorpe
@ 2024-04-24 12:59 ` Ryan Roberts
2024-04-24 13:20 ` Shameerali Kolothum Thodi
1 sibling, 0 replies; 36+ messages in thread
From: Ryan Roberts @ 2024-04-24 12:59 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Shameer Kolothum, iommu, linux-arm-kernel, joro, kevin.tian,
nicolinc, mshavit, robin.murphy, will, joao.m.martins,
jiangkunkun, zhukeqian1, linuxarm
On 24/04/2024 13:23, Jason Gunthorpe wrote:
> On Wed, Apr 24, 2024 at 11:04:43AM +0100, Ryan Roberts wrote:
>> On 23/04/2024 15:52, Jason Gunthorpe wrote:
>>> On Tue, Apr 23, 2024 at 03:41:30PM +0100, Ryan Roberts wrote:
>>>> Hi,
>>>>
>>>> I'm aiming to (slowly) get more involved with SMMU activities, although I'm sure
>>>> it will take a while to get up to speed and provide useful input. It was
>>>> suggested that this series would be a useful starting point to dip my toe in.
>>>> Please bear with me while I ask stupid questions...
>>>
>>> Nice!
>>>
>>> I would like to see this merged as it was part of the original three
>>> implementations for iommufd dirty tracking. The other two have been
>>> merged for a long time now..
>>
>> My understanding is that this series should pretty much apply on top of
>> mainline, is that correct? (in practice there are some conflicts, but I think
>> they are trivial).
>
> It was originally based on my part 2, but I suspect it could apply
> easily on top of part 2a. I don't think it is worth rebasing to
> v6.9-rc since I expect Will to take 2a.
>
>> I don't think it depends on anything new in your smmuv3_newapi branch?
>
> I feel part 2a is done, hopefully Will will take it soon:
>
> https://lore.kernel.org/linux-iommu/0-v8-4c4298c63951+13484-smmuv3_newapi_p2_jgg@nvidia.com/
>
> There are quite a few distros waiting on this:
>
> https://lore.kernel.org/linux-iommu/cover.1712977210.git.nicolinc@nvidia.com/
Great, thanks for the links. I'll aim to start working through from this one
onwards as and when I can.
>
> The cmdq area is not something I've studied deeply
>
> I will be reposting the latter half of this as a part 2b:
>
> https://lore.kernel.org/linux-iommu/0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com/
>
> Which is the prerequisite for the part 3 series to enable nested
> translation and vSMMUv3 for iommufd.
>
> Nested translation and a few more bits is prerequisite to finally enable BTM:
>
> https://lore.kernel.org/linux-iommu/20240208151837.35068-1-shameerali.kolothum.thodi@huawei.com/
>
> Thanks,
> Jason
* RE: [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU
2024-04-24 12:23 ` Jason Gunthorpe
2024-04-24 12:59 ` Ryan Roberts
@ 2024-04-24 13:20 ` Shameerali Kolothum Thodi
2024-04-24 13:32 ` Jason Gunthorpe
1 sibling, 1 reply; 36+ messages in thread
From: Shameerali Kolothum Thodi @ 2024-04-24 13:20 UTC (permalink / raw)
To: Jason Gunthorpe, Ryan Roberts
Cc: iommu@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
joro@8bytes.org, kevin.tian@intel.com, nicolinc@nvidia.com,
mshavit@google.com, robin.murphy@arm.com, will@kernel.org,
joao.m.martins@oracle.com, jiangkunkun, zhukeqian, Linuxarm
On Wednesday, April 24, 2024 1:24 PM, Jason Gunthorpe wrote:
> On Wed, Apr 24, 2024 at 11:04:43AM +0100, Ryan Roberts wrote:
> > On 23/04/2024 15:52, Jason Gunthorpe wrote:
> > > On Tue, Apr 23, 2024 at 03:41:30PM +0100, Ryan Roberts wrote:
> > >> Hi,
> > >>
> > >> I'm aiming to (slowly) get more involved with SMMU activities, although I'm
> sure
> > >> it will take a while to get up to speed and provide useful input. It was
> > >> suggested that this series would be a useful starting point to dip my toe in.
> > >> Please bear with me while I ask stupid questions...
> > >
> > > Nice!
> > >
> > > I would like to see this merged as it was part of the original three
> > > implementations for iommufd dirty tracking. The other two have been
> > > merged for a long time now..
> >
> > My understanding is that this series should pretty much apply on top of
> > mainline, is that correct? (in practice there are some conflicts, but I think
> > they are trivial).
>
> It was originally based on my part 2, but I suspect it could apply
> easily on top of part 2a. I don't think it is worth rebasing to
> v6.9-rc since I expect Will to take 2a.
But isn't arm_smmu_domain_alloc_user() introduced only in part 3
of the series? Do we plan to support DIRTY_TRACKING with domain_alloc()
itself by default?
Thanks,
Shameer
* Re: [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU
2024-04-24 13:20 ` Shameerali Kolothum Thodi
@ 2024-04-24 13:32 ` Jason Gunthorpe
2024-04-24 13:43 ` Shameerali Kolothum Thodi
0 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2024-04-24 13:32 UTC (permalink / raw)
To: Shameerali Kolothum Thodi
Cc: Ryan Roberts, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org, joro@8bytes.org,
kevin.tian@intel.com, nicolinc@nvidia.com, mshavit@google.com,
robin.murphy@arm.com, will@kernel.org, joao.m.martins@oracle.com,
jiangkunkun, zhukeqian, Linuxarm
On Wed, Apr 24, 2024 at 01:20:53PM +0000, Shameerali Kolothum Thodi wrote:
> > On Wed, Apr 24, 2024 at 11:04:43AM +0100, Ryan Roberts wrote:
> > > On 23/04/2024 15:52, Jason Gunthorpe wrote:
> > > > On Tue, Apr 23, 2024 at 03:41:30PM +0100, Ryan Roberts wrote:
> > > >> Hi,
> > > >>
> > > >> I'm aiming to (slowly) get more involved with SMMU activities, although I'm
> > sure
> > > >> it will take a while to get up to speed and provide useful input. It was
> > > >> suggested that this series would be a useful starting point to dip my toe in.
> > > >> Please bear with me while I ask stupid questions...
> > > >
> > > > Nice!
> > > >
> > > > I would like to see this merged as it was part of the original three
> > > > implementations for iommufd dirty tracking. The other two have been
> > > > merged for a long time now..
> > >
> > > My understanding is that this series should pretty much apply on top of
> > > mainline, is that correct? (in practice there are some conflicts, but I think
> > > they are trivial).
> >
> > It was originally based on my part 2, but I suspect it could apply
> > easily on top of part 2a. I don't think it is worth rebasing to
> > v6.9-rc since I expect Will to take 2a.
>
> But isn't arm_smmu_domain_alloc_user() introduced only in part 3
> of the series?
Right. "easily" was perhaps a bit too strong. It would have to be
rebased and pull back a little bit of the domain_alloc_user()
implementation.
If it is otherwise good, let's consider doing this. Maybe HTTU and part
2b can go together during the v6.11 cycle?
> Do we plan to support DIRTY_TRACKING with domain_alloc()
> itself by default?
No
Jason
* RE: [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU
2024-04-24 13:32 ` Jason Gunthorpe
@ 2024-04-24 13:43 ` Shameerali Kolothum Thodi
2024-04-24 14:21 ` Jason Gunthorpe
0 siblings, 1 reply; 36+ messages in thread
From: Shameerali Kolothum Thodi @ 2024-04-24 13:43 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Ryan Roberts, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org, joro@8bytes.org,
kevin.tian@intel.com, nicolinc@nvidia.com, mshavit@google.com,
robin.murphy@arm.com, will@kernel.org, joao.m.martins@oracle.com,
jiangkunkun, zhukeqian, Linuxarm
On Wednesday, April 24, 2024 2:32 PM, Jason Gunthorpe wrote:
> On Wed, Apr 24, 2024 at 01:20:53PM +0000, Shameerali Kolothum Thodi wrote:
>
> > > On Wed, Apr 24, 2024 at 11:04:43AM +0100, Ryan Roberts wrote:
> > > > On 23/04/2024 15:52, Jason Gunthorpe wrote:
> > > > > On Tue, Apr 23, 2024 at 03:41:30PM +0100, Ryan Roberts wrote:
> > > > >> Hi,
> > > > >>
> > > > >> I'm aiming to (slowly) get more involved with SMMU activities, although
> I'm
> > > sure
> > > > >> it will take a while to get up to speed and provide useful input. It was
> > > > >> suggested that this series would be a useful starting point to dip my toe
> in.
> > > > >> Please bear with me while I ask stupid questions...
> > > > >
> > > > > Nice!
> > > > >
> > > > > I would like to see this merged as it was part of the original three
> > > > > implementations for iommufd dirty tracking. The other two have been
> > > > > merged for a long time now..
> > > >
> > > > My understanding is that this series should pretty much apply on top of
> > > > mainline, is that correct? (in practice there are some conflicts, but I think
> > > > they are trivial).
> > >
> > > It was originally based on my part 2, but I suspect it could apply
> > > easily on top of part 2a. I don't think it is worth rebasing to
> > > v6.9-rc since I expect Will to take 2a.
> >
> > But isn't arm_smmu_domain_alloc_user() introduced only in part 3
> > of the series?
>
> Right. "easily" was perhaps a bit too strong. It would have to be
> rebased and pull back a little bit of the domain_alloc_user()
> implementation.
>
> If it is otherwise good, let's consider doing this. Maybe HTTU and part
> 2b can go together during the v6.11 cycle?
Sure. Let us target that.
Thanks,
Shameer
* Re: [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU
2024-04-24 13:43 ` Shameerali Kolothum Thodi
@ 2024-04-24 14:21 ` Jason Gunthorpe
0 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2024-04-24 14:21 UTC (permalink / raw)
To: Shameerali Kolothum Thodi
Cc: Ryan Roberts, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org, joro@8bytes.org,
kevin.tian@intel.com, nicolinc@nvidia.com, mshavit@google.com,
robin.murphy@arm.com, will@kernel.org, joao.m.martins@oracle.com,
jiangkunkun, zhukeqian, Linuxarm
On Wed, Apr 24, 2024 at 01:43:33PM +0000, Shameerali Kolothum Thodi wrote:
> > Right. "easially" was perhaps a bit too strong. It would have to be
> > rebased and pull back a little bit of the domain_alloc_user()
> > implementation.
> >
> > If it is good otherwise lets consider doing this. Maybe HTTU and part
> > 2b can go together during the v6.11 cycle?
>
> Sure. Let us target that.
Okay, post a v3 soon on top of the latest 2b patch with the comments
addressed and let's see.
Jason
end of thread, other threads:[~2024-04-24 14:21 UTC | newest]
Thread overview: 36+ messages
-- links below jump to the message on this page --
2024-02-22 9:49 [PATCH v2 0/4] iommu/smmuv3: Add IOMMUFD dirty tracking support for SMMUv3 Shameer Kolothum
2024-02-22 9:49 ` [PATCH v2 1/4] iommu/arm-smmu-v3: Add feature detection for HTTU Shameer Kolothum
2024-04-23 14:41 ` Ryan Roberts
2024-04-23 14:52 ` Jason Gunthorpe
2024-04-24 10:04 ` Ryan Roberts
2024-04-24 12:23 ` Jason Gunthorpe
2024-04-24 12:59 ` Ryan Roberts
2024-04-24 13:20 ` Shameerali Kolothum Thodi
2024-04-24 13:32 ` Jason Gunthorpe
2024-04-24 13:43 ` Shameerali Kolothum Thodi
2024-04-24 14:21 ` Jason Gunthorpe
2024-04-24 8:01 ` Shameerali Kolothum Thodi
2024-04-24 8:28 ` Ryan Roberts
2024-02-22 9:49 ` [PATCH v2 2/4] iommu/io-pgtable-arm: Add read_and_clear_dirty() support Shameer Kolothum
2024-04-23 15:56 ` Ryan Roberts
2024-04-24 8:01 ` Shameerali Kolothum Thodi
2024-04-24 8:36 ` Ryan Roberts
2024-02-22 9:49 ` [PATCH v2 3/4] iommu/arm-smmu-v3: Add support for dirty tracking in domain alloc Shameer Kolothum
2024-02-22 11:04 ` Joao Martins
2024-02-22 11:31 ` Shameerali Kolothum Thodi
2024-02-22 11:37 ` Joao Martins
2024-02-22 12:24 ` Shameerali Kolothum Thodi
2024-02-22 13:11 ` Jason Gunthorpe
2024-02-22 13:23 ` Joao Martins
2024-03-08 14:31 ` Jason Gunthorpe
2024-04-23 16:27 ` Ryan Roberts
2024-04-23 16:39 ` Jason Gunthorpe
2024-04-23 16:50 ` Ryan Roberts
2024-04-24 8:27 ` Shameerali Kolothum Thodi
2024-02-22 9:49 ` [PATCH v2 4/4] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping Shameer Kolothum
2024-03-08 14:32 ` Jason Gunthorpe
2024-04-23 16:45 ` Ryan Roberts
2024-04-23 17:32 ` Jason Gunthorpe
2024-04-24 7:58 ` Ryan Roberts
2024-04-24 12:15 ` Jason Gunthorpe
2024-04-24 12:45 ` Ryan Roberts