* [RFC PATCH 0/4] drm/panfrost: Support ARM_64_LPAE_S1 page table
@ 2025-02-26 18:30 Ariel D'Alessandro
2025-02-26 18:30 ` [RFC PATCH 1/4] drm/panfrost: Use GPU_MMU_FEATURES_VA_BITS/PA_BITS macros Ariel D'Alessandro
` (3 more replies)
0 siblings, 4 replies; 17+ messages in thread
From: Ariel D'Alessandro @ 2025-02-26 18:30 UTC (permalink / raw)
To: dri-devel, linux-kernel
Cc: boris.brezillon, robh, steven.price, maarten.lankhorst, mripard,
tzimmermann, airlied, simona, Ariel D'Alessandro
Hi all,
This is an RFC related to AArch64 page table format support in panfrost.
Currently, only MMU in legacy mode is supported, but Bifrost GPUs use
the standard format LPAE S1 page tables.
There's a previous similar thread on this topic from 2019-May [0], which
got stalled. This RFC is an attempt to bring this discussion back in
order to properly support this mode.
So far, this patchset has been tested on a Mediatek Genio 700 EVK
(MT8390) board, with an integrated Mali-G57 MC3 GPU using
`glmark2-es2-drm` OpenGL 2.0 benchmark tests.
However, Mesa CI dEQP tests for GLES2, GLES3+ and EGL already reported
possible regressions on this patchset, still under investigation.
Due to the possible impact of this patchset, exhaustive testing should
be done before merging, but in any case, let's start kicking this thread
for discussion.
Any comments or feedback are welcome :)
[0] https://lists.freedesktop.org/archives/dri-devel/2019-May/217617.html
Thanks!
Ariel D'Alessandro (4):
drm/panfrost: Use GPU_MMU_FEATURES_VA_BITS/PA_BITS macros
drm/panfrost: Split LPAE MMU TRANSTAB register values
drm/panfrost: Support ARM_64_LPAE_S1 page table
drm/panfrost: Set HW_FEATURE_AARCH64_MMU feature flag on Bifrost
models
drivers/gpu/drm/panfrost/panfrost_device.h | 1 +
drivers/gpu/drm/panfrost/panfrost_features.h | 3 +
drivers/gpu/drm/panfrost/panfrost_mmu.c | 124 +++++++++++++++----
drivers/gpu/drm/panfrost/panfrost_regs.h | 50 ++++++--
4 files changed, 149 insertions(+), 29 deletions(-)
--
2.47.2
^ permalink raw reply [flat|nested] 17+ messages in thread
* [RFC PATCH 1/4] drm/panfrost: Use GPU_MMU_FEATURES_VA_BITS/PA_BITS macros
2025-02-26 18:30 [RFC PATCH 0/4] drm/panfrost: Support ARM_64_LPAE_S1 page table Ariel D'Alessandro
@ 2025-02-26 18:30 ` Ariel D'Alessandro
2025-02-27 8:21 ` Boris Brezillon
2025-02-27 14:44 ` Steven Price
2025-02-26 18:30 ` [RFC PATCH 2/4] drm/panfrost: Split LPAE MMU TRANSTAB register values Ariel D'Alessandro
` (2 subsequent siblings)
3 siblings, 2 replies; 17+ messages in thread
From: Ariel D'Alessandro @ 2025-02-26 18:30 UTC (permalink / raw)
To: dri-devel, linux-kernel
Cc: boris.brezillon, robh, steven.price, maarten.lankhorst, mripard,
tzimmermann, airlied, simona, Ariel D'Alessandro
As done in panthor, define and use these GPU_MMU_FEATURES_* macros,
which makes code easier to read and reuse.
Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com>
---
drivers/gpu/drm/panfrost/panfrost_mmu.c | 6 ++++--
drivers/gpu/drm/panfrost/panfrost_regs.h | 2 ++
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index b91019cd5acb..7df2c8d5b0ae 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -615,6 +615,8 @@ static void panfrost_drm_mm_color_adjust(const struct drm_mm_node *node,
struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
{
+ u32 va_bits = GPU_MMU_FEATURES_VA_BITS(pfdev->features.mmu_features);
+ u32 pa_bits = GPU_MMU_FEATURES_PA_BITS(pfdev->features.mmu_features);
struct panfrost_mmu *mmu;
mmu = kzalloc(sizeof(*mmu), GFP_KERNEL);
@@ -633,8 +635,8 @@ struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
mmu->pgtbl_cfg = (struct io_pgtable_cfg) {
.pgsize_bitmap = SZ_4K | SZ_2M,
- .ias = FIELD_GET(0xff, pfdev->features.mmu_features),
- .oas = FIELD_GET(0xff00, pfdev->features.mmu_features),
+ .ias = va_bits,
+ .oas = pa_bits,
.coherent_walk = pfdev->coherent,
.tlb = &mmu_tlb_ops,
.iommu_dev = pfdev->dev,
diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h
index c7bba476ab3f..b5f279a19a08 100644
--- a/drivers/gpu/drm/panfrost/panfrost_regs.h
+++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
@@ -16,6 +16,8 @@
#define GROUPS_L2_COHERENT BIT(0) /* Cores groups are l2 coherent */
#define GPU_MMU_FEATURES 0x014 /* (RO) MMU features */
+#define GPU_MMU_FEATURES_VA_BITS(x) ((x) & GENMASK(7, 0))
+#define GPU_MMU_FEATURES_PA_BITS(x) (((x) >> 8) & GENMASK(7, 0))
#define GPU_AS_PRESENT 0x018 /* (RO) Address space slots present */
#define GPU_JS_PRESENT 0x01C /* (RO) Job slots present */
--
2.47.2
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [RFC PATCH 2/4] drm/panfrost: Split LPAE MMU TRANSTAB register values
2025-02-26 18:30 [RFC PATCH 0/4] drm/panfrost: Support ARM_64_LPAE_S1 page table Ariel D'Alessandro
2025-02-26 18:30 ` [RFC PATCH 1/4] drm/panfrost: Use GPU_MMU_FEATURES_VA_BITS/PA_BITS macros Ariel D'Alessandro
@ 2025-02-26 18:30 ` Ariel D'Alessandro
2025-02-27 8:25 ` Boris Brezillon
2025-02-26 18:30 ` [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table Ariel D'Alessandro
2025-02-26 18:30 ` [RFC PATCH 4/4] drm/panfrost: Set HW_FEATURE_AARCH64_MMU feature flag on Bifrost models Ariel D'Alessandro
3 siblings, 1 reply; 17+ messages in thread
From: Ariel D'Alessandro @ 2025-02-26 18:30 UTC (permalink / raw)
To: dri-devel, linux-kernel
Cc: boris.brezillon, robh, steven.price, maarten.lankhorst, mripard,
tzimmermann, airlied, simona, Ariel D'Alessandro
The TRANSTAB (Translation table base address) layout is different
depending on the legacy mode configuration.
Currently, the defined values apply to the legacy mode. Let's rename
them so we can add the ones for no-legacy mode.
Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com>
---
drivers/gpu/drm/panfrost/panfrost_regs.h | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h
index b5f279a19a08..4e6064d5feaa 100644
--- a/drivers/gpu/drm/panfrost/panfrost_regs.h
+++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
@@ -317,14 +317,19 @@
#define MMU_AS_STRIDE (1 << MMU_AS_SHIFT)
/*
- * Begin LPAE MMU TRANSTAB register values
+ * Begin LPAE MMU TRANSTAB register values (legacy mode)
*/
-#define AS_TRANSTAB_LPAE_ADDR_SPACE_MASK 0xfffffffffffff000
-#define AS_TRANSTAB_LPAE_ADRMODE_IDENTITY 0x2
-#define AS_TRANSTAB_LPAE_ADRMODE_TABLE 0x3
-#define AS_TRANSTAB_LPAE_ADRMODE_MASK 0x3
-#define AS_TRANSTAB_LPAE_READ_INNER BIT(2)
-#define AS_TRANSTAB_LPAE_SHARE_OUTER BIT(4)
+#define AS_TRANSTAB_LEGACY_ADDR_SPACE_MASK 0xfffffffffffff000
+#define AS_TRANSTAB_LEGACY_ADRMODE_IDENTITY 0x2
+#define AS_TRANSTAB_LEGACY_ADRMODE_TABLE 0x3
+#define AS_TRANSTAB_LEGACY_ADRMODE_MASK 0x3
+#define AS_TRANSTAB_LEGACY_READ_INNER BIT(2)
+#define AS_TRANSTAB_LEGACY_SHARE_OUTER BIT(4)
+
+/*
+ * Begin LPAE MMU TRANSTAB register values (no-legacy mode)
+ */
+#define AS_TRANSTAB_LPAE_ADDR_SPACE_MASK 0xfffffffffffffff0
#define AS_STATUS_AS_ACTIVE 0x01
--
2.47.2
^ permalink raw reply related [flat|nested] 17+ messages in thread
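The difference between the two TRANSTAB layouts in patch 2/4 can be sketched as follows. This is only an illustration under assumptions: the helper names `legacy_transtab()` and `aarch64_transtab()` are hypothetical, and in the driver the legacy value arrives ready-made in `cfg->arm_mali_lpae_cfg.transtab` rather than being assembled like this.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)	(UINT64_C(1) << (n))

/* Legacy layout: 4kB-aligned base, mode and attribute bits in the low bits. */
#define AS_TRANSTAB_LEGACY_ADDR_SPACE_MASK	UINT64_C(0xfffffffffffff000)
#define AS_TRANSTAB_LEGACY_ADRMODE_TABLE	UINT64_C(0x3)
#define AS_TRANSTAB_LEGACY_READ_INNER		BIT(2)
#define AS_TRANSTAB_LEGACY_SHARE_OUTER		BIT(4)

/* Non-legacy layout: only the low 4 bits are outside the address field. */
#define AS_TRANSTAB_LPAE_ADDR_SPACE_MASK	UINT64_C(0xfffffffffffffff0)

/* Hypothetical helper: build a legacy-mode TRANSTAB value from a table base. */
static uint64_t legacy_transtab(uint64_t pgd_phys, bool coherent)
{
	uint64_t v = (pgd_phys & AS_TRANSTAB_LEGACY_ADDR_SPACE_MASK) |
		     AS_TRANSTAB_LEGACY_ADRMODE_TABLE |
		     AS_TRANSTAB_LEGACY_READ_INNER;

	if (coherent)
		v |= AS_TRANSTAB_LEGACY_SHARE_OUTER;
	return v;
}

/* Hypothetical helper: non-legacy mode keeps only the base address bits. */
static uint64_t aarch64_transtab(uint64_t ttbr)
{
	return ttbr & AS_TRANSTAB_LPAE_ADDR_SPACE_MASK;
}
```

In the legacy layout the addressing mode lives in the TRANSTAB register itself, whereas in the non-legacy layout it moves to TRANSCFG (see the `AS_TRANSCFG_ADRMODE_*` values in patch 3/4), which is why the second mask leaves far fewer low bits free.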
* [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table
2025-02-26 18:30 [RFC PATCH 0/4] drm/panfrost: Support ARM_64_LPAE_S1 page table Ariel D'Alessandro
2025-02-26 18:30 ` [RFC PATCH 1/4] drm/panfrost: Use GPU_MMU_FEATURES_VA_BITS/PA_BITS macros Ariel D'Alessandro
2025-02-26 18:30 ` [RFC PATCH 2/4] drm/panfrost: Split LPAE MMU TRANSTAB register values Ariel D'Alessandro
@ 2025-02-26 18:30 ` Ariel D'Alessandro
2025-02-27 8:30 ` Boris Brezillon
` (2 more replies)
2025-02-26 18:30 ` [RFC PATCH 4/4] drm/panfrost: Set HW_FEATURE_AARCH64_MMU feature flag on Bifrost models Ariel D'Alessandro
3 siblings, 3 replies; 17+ messages in thread
From: Ariel D'Alessandro @ 2025-02-26 18:30 UTC (permalink / raw)
To: dri-devel, linux-kernel
Cc: boris.brezillon, robh, steven.price, maarten.lankhorst, mripard,
tzimmermann, airlied, simona, Ariel D'Alessandro
Bifrost MMUs support AArch64 4kB granule specification. However,
panfrost only enables MMU in legacy mode, despite the presence of the
HW_FEATURE_AARCH64_MMU feature flag.
This commit adds support to use page tables according to AArch64 4kB
granule specification. This feature is enabled conditionally based on
the GPU model's HW_FEATURE_AARCH64_MMU feature flag.
Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com>
---
drivers/gpu/drm/panfrost/panfrost_device.h | 1 +
drivers/gpu/drm/panfrost/panfrost_mmu.c | 118 +++++++++++++++++----
drivers/gpu/drm/panfrost/panfrost_regs.h | 29 +++++
3 files changed, 128 insertions(+), 20 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
index cffcb0ac7c11..dea252f43c58 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -153,6 +153,7 @@ struct panfrost_device {
};
struct panfrost_mmu {
+ void (*enable)(struct panfrost_device *pfdev, struct panfrost_mmu *mmu);
struct panfrost_device *pfdev;
struct kref refcount;
struct io_pgtable_cfg pgtbl_cfg;
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 7df2c8d5b0ae..30b8e2723254 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -26,6 +26,48 @@
#define mmu_write(dev, reg, data) writel(data, dev->iomem + reg)
#define mmu_read(dev, reg) readl(dev->iomem + reg)
+static u64 mair_to_memattr(u64 mair, bool coherent)
+{
+ u64 memattr = 0;
+ u32 i;
+
+ for (i = 0; i < 8; i++) {
+ u8 in_attr = mair >> (8 * i), out_attr;
+ u8 outer = in_attr >> 4, inner = in_attr & 0xf;
+
+ /* For caching to be enabled, inner and outer caching policy
+ * have to be both write-back, if one of them is write-through
+ * or non-cacheable, we just choose non-cacheable. Device
+ * memory is also translated to non-cacheable.
+ */
+ if (!(outer & 3) || !(outer & 4) || !(inner & 4)) {
+ out_attr = AS_MEMATTR_AARCH64_INNER_OUTER_NC |
+ AS_MEMATTR_AARCH64_SH_MIDGARD_INNER |
+ AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(false, false);
+ } else {
+ out_attr = AS_MEMATTR_AARCH64_INNER_OUTER_WB |
+ AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(inner & 1, inner & 2);
+ /* Use SH_MIDGARD_INNER mode when device isn't coherent,
+ * so SH_IS, which is used when IOMMU_CACHE is set, maps
+ * to Mali's internal-shareable mode. As per the Mali
+ * Spec, inner and outer-shareable modes aren't allowed
+ * for WB memory when coherency is disabled.
+ * Use SH_CPU_INNER mode when coherency is enabled, so
+ * that SH_IS actually maps to the standard definition of
+ * inner-shareable.
+ */
+ if (!coherent)
+ out_attr |= AS_MEMATTR_AARCH64_SH_MIDGARD_INNER;
+ else
+ out_attr |= AS_MEMATTR_AARCH64_SH_CPU_INNER;
+ }
+
+ memattr |= (u64)out_attr << (8 * i);
+ }
+
+ return memattr;
+}
+
static int wait_ready(struct panfrost_device *pfdev, u32 as_nr)
{
int ret;
@@ -121,38 +163,66 @@ static int mmu_hw_do_operation(struct panfrost_device *pfdev,
return ret;
}
-static void panfrost_mmu_enable(struct panfrost_device *pfdev, struct panfrost_mmu *mmu)
+static void
+_panfrost_mmu_as_control_write(struct panfrost_device *pfdev, u32 as_nr,
+ u64 transtab, u64 memattr, u64 transcfg)
{
- int as_nr = mmu->as;
- struct io_pgtable_cfg *cfg = &mmu->pgtbl_cfg;
- u64 transtab = cfg->arm_mali_lpae_cfg.transtab;
- u64 memattr = cfg->arm_mali_lpae_cfg.memattr;
-
mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0ULL, AS_COMMAND_FLUSH_MEM);
mmu_write(pfdev, AS_TRANSTAB_LO(as_nr), lower_32_bits(transtab));
mmu_write(pfdev, AS_TRANSTAB_HI(as_nr), upper_32_bits(transtab));
- /* Need to revisit mem attrs.
- * NC is the default, Mali driver is inner WT.
- */
mmu_write(pfdev, AS_MEMATTR_LO(as_nr), lower_32_bits(memattr));
mmu_write(pfdev, AS_MEMATTR_HI(as_nr), upper_32_bits(memattr));
+ mmu_write(pfdev, AS_TRANSCFG_LO(as_nr), lower_32_bits(transcfg));
+ mmu_write(pfdev, AS_TRANSCFG_HI(as_nr), upper_32_bits(transcfg));
+
write_cmd(pfdev, as_nr, AS_COMMAND_UPDATE);
+
+ dev_dbg(pfdev->dev, "mmu_as_control: as=%d, transtab=0x%016llx, memattr=0x%016llx, transcfg=0x%016llx",
+ as_nr, transtab, memattr, transcfg);
}
-static void panfrost_mmu_disable(struct panfrost_device *pfdev, u32 as_nr)
+static void mmu_lpae_s1_enable(struct panfrost_device *pfdev,
+ struct panfrost_mmu *mmu)
{
- mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0ULL, AS_COMMAND_FLUSH_MEM);
+ struct io_pgtable_cfg *cfg = &mmu->pgtbl_cfg;
+ int as_nr = mmu->as;
- mmu_write(pfdev, AS_TRANSTAB_LO(as_nr), 0);
- mmu_write(pfdev, AS_TRANSTAB_HI(as_nr), 0);
+ u64 transtab =
+ cfg->arm_lpae_s1_cfg.ttbr & AS_TRANSTAB_LPAE_ADDR_SPACE_MASK;
+ u64 memattr =
+ mair_to_memattr(cfg->arm_lpae_s1_cfg.mair, pfdev->coherent);
+ u32 va_bits = GPU_MMU_FEATURES_VA_BITS(pfdev->features.mmu_features);
+ u64 transcfg = AS_TRANSCFG_PTW_MEMATTR_WB |
+ AS_TRANSCFG_PTW_RA |
+ AS_TRANSCFG_ADRMODE_AARCH64_4K |
+ AS_TRANSCFG_INA_BITS(55 - va_bits);
- mmu_write(pfdev, AS_MEMATTR_LO(as_nr), 0);
- mmu_write(pfdev, AS_MEMATTR_HI(as_nr), 0);
+ if (pfdev->coherent)
+ transcfg |= AS_TRANSCFG_PTW_SH_OS;
- write_cmd(pfdev, as_nr, AS_COMMAND_UPDATE);
+ _panfrost_mmu_as_control_write(pfdev, as_nr, transtab, memattr,
+ transcfg);
+}
+
+static void mmu_mali_lpae_enable(struct panfrost_device *pfdev,
+ struct panfrost_mmu *mmu)
+{
+ struct io_pgtable_cfg *cfg = &mmu->pgtbl_cfg;
+ int as_nr = mmu->as;
+
+ _panfrost_mmu_as_control_write(pfdev, as_nr,
+ cfg->arm_mali_lpae_cfg.transtab,
+ cfg->arm_mali_lpae_cfg.memattr,
+ AS_TRANSCFG_ADRMODE_LEGACY);
+}
+
+static void panfrost_mmu_disable(struct panfrost_device *pfdev, u32 as_nr)
+{
+ _panfrost_mmu_as_control_write(pfdev, as_nr, 0, 0,
+ AS_TRANSCFG_ADRMODE_UNMAPPED);
}
u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, struct panfrost_mmu *mmu)
@@ -182,7 +252,7 @@ u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, struct panfrost_mmu *mmu)
mmu_write(pfdev, MMU_INT_CLEAR, mask);
mmu_write(pfdev, MMU_INT_MASK, ~pfdev->as_faulty_mask);
pfdev->as_faulty_mask &= ~mask;
- panfrost_mmu_enable(pfdev, mmu);
+ mmu->enable(pfdev, mmu);
}
goto out;
@@ -214,7 +284,7 @@ u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, struct panfrost_mmu *mmu)
dev_dbg(pfdev->dev, "Assigned AS%d to mmu %p, alloc_mask=%lx", as, mmu, pfdev->as_alloc_mask);
- panfrost_mmu_enable(pfdev, mmu);
+ mmu->enable(pfdev, mmu);
out:
spin_unlock(&pfdev->as_lock);
@@ -618,6 +688,7 @@ struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
u32 va_bits = GPU_MMU_FEATURES_VA_BITS(pfdev->features.mmu_features);
u32 pa_bits = GPU_MMU_FEATURES_PA_BITS(pfdev->features.mmu_features);
struct panfrost_mmu *mmu;
+ enum io_pgtable_fmt fmt;
mmu = kzalloc(sizeof(*mmu), GFP_KERNEL);
if (!mmu)
@@ -642,8 +713,15 @@ struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
.iommu_dev = pfdev->dev,
};
- mmu->pgtbl_ops = alloc_io_pgtable_ops(ARM_MALI_LPAE, &mmu->pgtbl_cfg,
- mmu);
+ if (panfrost_has_hw_feature(pfdev, HW_FEATURE_AARCH64_MMU)) {
+ fmt = ARM_64_LPAE_S1;
+ mmu->enable = mmu_lpae_s1_enable;
+ } else {
+ fmt = ARM_MALI_LPAE;
+ mmu->enable = mmu_mali_lpae_enable;
+ }
+ mmu->pgtbl_ops = alloc_io_pgtable_ops(fmt, &mmu->pgtbl_cfg, mmu);
+
if (!mmu->pgtbl_ops) {
kfree(mmu);
return ERR_PTR(-EINVAL);
diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h
index 4e6064d5feaa..a5ca36f583ff 100644
--- a/drivers/gpu/drm/panfrost/panfrost_regs.h
+++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
@@ -301,6 +301,17 @@
#define AS_TRANSTAB_HI(as) (MMU_AS(as) + 0x04) /* (RW) Translation Table Base Address for address space n, high word */
#define AS_MEMATTR_LO(as) (MMU_AS(as) + 0x08) /* (RW) Memory attributes for address space n, low word. */
#define AS_MEMATTR_HI(as) (MMU_AS(as) + 0x0C) /* (RW) Memory attributes for address space n, high word. */
+#define AS_MEMATTR_AARCH64_INNER_ALLOC_IMPL (2 << 2)
+#define AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(w, r) ((3 << 2) | \
+ ((w) ? BIT(0) : 0) | \
+ ((r) ? BIT(1) : 0))
+#define AS_MEMATTR_AARCH64_SH_MIDGARD_INNER (0 << 4)
+#define AS_MEMATTR_AARCH64_SH_CPU_INNER (1 << 4)
+#define AS_MEMATTR_AARCH64_SH_CPU_INNER_SHADER_COH (2 << 4)
+#define AS_MEMATTR_AARCH64_SHARED (0 << 6)
+#define AS_MEMATTR_AARCH64_INNER_OUTER_NC (1 << 6)
+#define AS_MEMATTR_AARCH64_INNER_OUTER_WB (2 << 6)
+#define AS_MEMATTR_AARCH64_FAULT (3 << 6)
#define AS_LOCKADDR_LO(as) (MMU_AS(as) + 0x10) /* (RW) Lock region address for address space n, low word */
#define AS_LOCKADDR_HI(as) (MMU_AS(as) + 0x14) /* (RW) Lock region address for address space n, high word */
#define AS_COMMAND(as) (MMU_AS(as) + 0x18) /* (WO) MMU command register for address space n */
@@ -311,6 +322,24 @@
/* Additional Bifrost AS registers */
#define AS_TRANSCFG_LO(as) (MMU_AS(as) + 0x30) /* (RW) Translation table configuration for address space n, low word */
#define AS_TRANSCFG_HI(as) (MMU_AS(as) + 0x34) /* (RW) Translation table configuration for address space n, high word */
+#define AS_TRANSCFG_ADRMODE_LEGACY (0 << 0)
+#define AS_TRANSCFG_ADRMODE_UNMAPPED (1 << 0)
+#define AS_TRANSCFG_ADRMODE_IDENTITY (2 << 0)
+#define AS_TRANSCFG_ADRMODE_AARCH64_4K (6 << 0)
+#define AS_TRANSCFG_ADRMODE_AARCH64_64K (8 << 0)
+#define AS_TRANSCFG_INA_BITS(x) ((x) << 6)
+#define AS_TRANSCFG_OUTA_BITS(x) ((x) << 14)
+#define AS_TRANSCFG_SL_CONCAT BIT(22)
+#define AS_TRANSCFG_PTW_MEMATTR_NC (1 << 24)
+#define AS_TRANSCFG_PTW_MEMATTR_WB (2 << 24)
+#define AS_TRANSCFG_PTW_SH_NS (0 << 28)
+#define AS_TRANSCFG_PTW_SH_OS (2 << 28)
+#define AS_TRANSCFG_PTW_SH_IS (3 << 28)
+#define AS_TRANSCFG_PTW_RA BIT(30)
+#define AS_TRANSCFG_DISABLE_HIER_AP BIT(33)
+#define AS_TRANSCFG_DISABLE_AF_FAULT BIT(34)
+#define AS_TRANSCFG_WXN BIT(35)
+#define AS_TRANSCFG_XREADABLE BIT(36)
#define AS_FAULTEXTRA_LO(as) (MMU_AS(as) + 0x38) /* (RO) Secondary fault address for address space n, low word */
#define AS_FAULTEXTRA_HI(as) (MMU_AS(as) + 0x3C) /* (RO) Secondary fault address for address space n, high word */
--
2.47.2
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [RFC PATCH 4/4] drm/panfrost: Set HW_FEATURE_AARCH64_MMU feature flag on Bifrost models
2025-02-26 18:30 [RFC PATCH 0/4] drm/panfrost: Support ARM_64_LPAE_S1 page table Ariel D'Alessandro
` (2 preceding siblings ...)
2025-02-26 18:30 ` [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table Ariel D'Alessandro
@ 2025-02-26 18:30 ` Ariel D'Alessandro
3 siblings, 0 replies; 17+ messages in thread
From: Ariel D'Alessandro @ 2025-02-26 18:30 UTC (permalink / raw)
To: dri-devel, linux-kernel
Cc: boris.brezillon, robh, steven.price, maarten.lankhorst, mripard,
tzimmermann, airlied, simona, Ariel D'Alessandro
Mali Bifrost MMUs support AArch64 4kB page tables. In panfrost, this
feature is enabled based on the HW_FEATURE_AARCH64_MMU feature flag.
Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com>
---
drivers/gpu/drm/panfrost/panfrost_features.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/panfrost/panfrost_features.h b/drivers/gpu/drm/panfrost/panfrost_features.h
index 7ed0cd3ea2d4..52f9d69f6db9 100644
--- a/drivers/gpu/drm/panfrost/panfrost_features.h
+++ b/drivers/gpu/drm/panfrost/panfrost_features.h
@@ -54,6 +54,7 @@ enum panfrost_hw_feature {
BIT_ULL(HW_FEATURE_THREAD_GROUP_SPLIT) | \
BIT_ULL(HW_FEATURE_FLUSH_REDUCTION) | \
BIT_ULL(HW_FEATURE_PROTECTED_MODE) | \
+ BIT_ULL(HW_FEATURE_AARCH64_MMU) | \
BIT_ULL(HW_FEATURE_COHERENCY_REG))
#define hw_features_g72 (\
@@ -64,6 +65,7 @@ enum panfrost_hw_feature {
BIT_ULL(HW_FEATURE_FLUSH_REDUCTION) | \
BIT_ULL(HW_FEATURE_PROTECTED_MODE) | \
BIT_ULL(HW_FEATURE_PROTECTED_DEBUG_MODE) | \
+ BIT_ULL(HW_FEATURE_AARCH64_MMU) | \
BIT_ULL(HW_FEATURE_COHERENCY_REG))
#define hw_features_g51 hw_features_g72
@@ -77,6 +79,7 @@ enum panfrost_hw_feature {
BIT_ULL(HW_FEATURE_PROTECTED_MODE) | \
BIT_ULL(HW_FEATURE_PROTECTED_DEBUG_MODE) | \
BIT_ULL(HW_FEATURE_IDVS_GROUP_SIZE) | \
+ BIT_ULL(HW_FEATURE_AARCH64_MMU) | \
BIT_ULL(HW_FEATURE_COHERENCY_REG))
#define hw_features_g76 (\
--
2.47.2
^ permalink raw reply related [flat|nested] 17+ messages in thread
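The effect of adding the flag in patch 4/4 can be sketched as a simple bitmask test that selects the page table format. Everything here is a simplified assumption for illustration: the enum numbering, the `has_hw_feature()` and `pick_fmt()` helpers, and the `FMT_*` names are hypothetical and do not mirror the driver's actual feature bitmap handling.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT_ULL(n)	(1ULL << (n))

/* Hypothetical numbering; the real enum panfrost_hw_feature differs. */
enum hw_feature {
	HW_FEATURE_COHERENCY_REG,
	HW_FEATURE_PROTECTED_MODE,
	HW_FEATURE_AARCH64_MMU,
};

/* Hypothetical stand-ins for ARM_MALI_LPAE and ARM_64_LPAE_S1. */
enum pgtable_fmt {
	FMT_MALI_LPAE,
	FMT_64_LPAE_S1,
};

/* Simplified feature test on a plain 64-bit mask. */
static bool has_hw_feature(uint64_t features, enum hw_feature feat)
{
	return (features & BIT_ULL(feat)) != 0;
}

/* Mirrors the selection added to panfrost_mmu_ctx_create() in patch 3/4. */
static enum pgtable_fmt pick_fmt(uint64_t features)
{
	return has_hw_feature(features, HW_FEATURE_AARCH64_MMU) ?
	       FMT_64_LPAE_S1 : FMT_MALI_LPAE;
}
```

A model whose feature mask lacks the flag keeps using the legacy format, so GPUs not touched by this patch should see no behavioural change.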
* Re: [RFC PATCH 1/4] drm/panfrost: Use GPU_MMU_FEATURES_VA_BITS/PA_BITS macros
2025-02-26 18:30 ` [RFC PATCH 1/4] drm/panfrost: Use GPU_MMU_FEATURES_VA_BITS/PA_BITS macros Ariel D'Alessandro
@ 2025-02-27 8:21 ` Boris Brezillon
2025-02-27 14:44 ` Steven Price
1 sibling, 0 replies; 17+ messages in thread
From: Boris Brezillon @ 2025-02-27 8:21 UTC (permalink / raw)
To: Ariel D'Alessandro
Cc: dri-devel, linux-kernel, robh, steven.price, maarten.lankhorst,
mripard, tzimmermann, airlied, simona
On Wed, 26 Feb 2025 15:30:40 -0300
Ariel D'Alessandro <ariel.dalessandro@collabora.com> wrote:
> As done in panthor, define and use these GPU_MMU_FEATURES_* macros,
> which makes code easier to read and reuse.
>
> Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
> drivers/gpu/drm/panfrost/panfrost_mmu.c | 6 ++++--
> drivers/gpu/drm/panfrost/panfrost_regs.h | 2 ++
> 2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index b91019cd5acb..7df2c8d5b0ae 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -615,6 +615,8 @@ static void panfrost_drm_mm_color_adjust(const struct drm_mm_node *node,
>
> struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
> {
> + u32 va_bits = GPU_MMU_FEATURES_VA_BITS(pfdev->features.mmu_features);
> + u32 pa_bits = GPU_MMU_FEATURES_PA_BITS(pfdev->features.mmu_features);
> struct panfrost_mmu *mmu;
>
> mmu = kzalloc(sizeof(*mmu), GFP_KERNEL);
> @@ -633,8 +635,8 @@ struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
>
> mmu->pgtbl_cfg = (struct io_pgtable_cfg) {
> .pgsize_bitmap = SZ_4K | SZ_2M,
> - .ias = FIELD_GET(0xff, pfdev->features.mmu_features),
> - .oas = FIELD_GET(0xff00, pfdev->features.mmu_features),
> + .ias = va_bits,
> + .oas = pa_bits,
> .coherent_walk = pfdev->coherent,
> .tlb = &mmu_tlb_ops,
> .iommu_dev = pfdev->dev,
> diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h
> index c7bba476ab3f..b5f279a19a08 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_regs.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
> @@ -16,6 +16,8 @@
> #define GROUPS_L2_COHERENT BIT(0) /* Cores groups are l2 coherent */
>
> #define GPU_MMU_FEATURES 0x014 /* (RO) MMU features */
> +#define GPU_MMU_FEATURES_VA_BITS(x) ((x) & GENMASK(7, 0))
> +#define GPU_MMU_FEATURES_PA_BITS(x) (((x) >> 8) & GENMASK(7, 0))
> #define GPU_AS_PRESENT 0x018 /* (RO) Address space slots present */
> #define GPU_JS_PRESENT 0x01C /* (RO) Job slots present */
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 2/4] drm/panfrost: Split LPAE MMU TRANSTAB register values
2025-02-26 18:30 ` [RFC PATCH 2/4] drm/panfrost: Split LPAE MMU TRANSTAB register values Ariel D'Alessandro
@ 2025-02-27 8:25 ` Boris Brezillon
2025-03-07 14:02 ` Ariel D'Alessandro
0 siblings, 1 reply; 17+ messages in thread
From: Boris Brezillon @ 2025-02-27 8:25 UTC (permalink / raw)
To: Ariel D'Alessandro
Cc: dri-devel, linux-kernel, robh, steven.price, maarten.lankhorst,
mripard, tzimmermann, airlied, simona
On Wed, 26 Feb 2025 15:30:41 -0300
Ariel D'Alessandro <ariel.dalessandro@collabora.com> wrote:
> The TRANSTAB (Translation table base address) layout is different
> depending on the legacy mode configuration.
>
> Currently, the defined values apply to the legacy mode. Let's rename
> them so we can add the ones for no-legacy mode.
>
> Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com>
> ---
> drivers/gpu/drm/panfrost/panfrost_regs.h | 19 ++++++++++++-------
> 1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h
> index b5f279a19a08..4e6064d5feaa 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_regs.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
> @@ -317,14 +317,19 @@
> #define MMU_AS_STRIDE (1 << MMU_AS_SHIFT)
>
> /*
> - * Begin LPAE MMU TRANSTAB register values
> + * Begin LPAE MMU TRANSTAB register values (legacy mode)
> */
> -#define AS_TRANSTAB_LPAE_ADDR_SPACE_MASK 0xfffffffffffff000
> -#define AS_TRANSTAB_LPAE_ADRMODE_IDENTITY 0x2
> -#define AS_TRANSTAB_LPAE_ADRMODE_TABLE 0x3
> -#define AS_TRANSTAB_LPAE_ADRMODE_MASK 0x3
> -#define AS_TRANSTAB_LPAE_READ_INNER BIT(2)
> -#define AS_TRANSTAB_LPAE_SHARE_OUTER BIT(4)
> +#define AS_TRANSTAB_LEGACY_ADDR_SPACE_MASK 0xfffffffffffff000
> +#define AS_TRANSTAB_LEGACY_ADRMODE_IDENTITY 0x2
> +#define AS_TRANSTAB_LEGACY_ADRMODE_TABLE 0x3
> +#define AS_TRANSTAB_LEGACY_ADRMODE_MASK 0x3
> +#define AS_TRANSTAB_LEGACY_READ_INNER BIT(2)
> +#define AS_TRANSTAB_LEGACY_SHARE_OUTER BIT(4)
How about we keep AS_TRANSTAB_LPAE_ here and prefix the new reg values
with AS_xxx_AARCH64_ when there's a collision between the two formats?
> +
> +/*
> + * Begin LPAE MMU TRANSTAB register values (no-legacy mode)
> + */
> +#define AS_TRANSTAB_LPAE_ADDR_SPACE_MASK 0xfffffffffffffff0
It looks like we're not using AS_TRANSTAB_LPAE_ADDR_SPACE_MASK, so I'm
not sure it's worth defining the mask for the AARCH64 format.
>
> #define AS_STATUS_AS_ACTIVE 0x01
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table
2025-02-26 18:30 ` [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table Ariel D'Alessandro
@ 2025-02-27 8:30 ` Boris Brezillon
2025-02-27 8:32 ` Boris Brezillon
2025-03-07 14:42 ` Ariel D'Alessandro
2025-02-27 14:44 ` Steven Price
2025-02-27 14:55 ` Boris Brezillon
2 siblings, 2 replies; 17+ messages in thread
From: Boris Brezillon @ 2025-02-27 8:30 UTC (permalink / raw)
To: Ariel D'Alessandro
Cc: dri-devel, linux-kernel, robh, steven.price, maarten.lankhorst,
mripard, tzimmermann, airlied, simona
On Wed, 26 Feb 2025 15:30:42 -0300
Ariel D'Alessandro <ariel.dalessandro@collabora.com> wrote:
> Bifrost MMUs support AArch64 4kB granule specification. However,
> panfrost only enables MMU in legacy mode, despite the presence of the
> HW_FEATURE_AARCH64_MMU feature flag.
>
> This commit adds support to use page tables according to AArch64 4kB
> granule specification. This feature is enabled conditionally based on
> the GPU model's HW_FEATURE_AARCH64_MMU feature flag.
>
> Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com>
> ---
> drivers/gpu/drm/panfrost/panfrost_device.h | 1 +
> drivers/gpu/drm/panfrost/panfrost_mmu.c | 118 +++++++++++++++++----
> drivers/gpu/drm/panfrost/panfrost_regs.h | 29 +++++
> 3 files changed, 128 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
> index cffcb0ac7c11..dea252f43c58 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -153,6 +153,7 @@ struct panfrost_device {
> };
>
> struct panfrost_mmu {
> + void (*enable)(struct panfrost_device *pfdev, struct panfrost_mmu *mmu);
The enable sequence is the same, it's just the transtab, memattr and
transcfg values that differ depending on the format, so let's prepare
them at panfrost_mmu init time, and cache them here.
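One possible shape for this suggestion is sketched below, purely as an illustration: the `struct as_cfg` layout and the `prepare_aarch64_cfg()` helper are hypothetical names, and the register values are copied from the patch. The values are computed once at init time, so the enable path would only need to write the three cached fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* TRANSCFG and TRANSTAB values copied from the patch, widened to 64 bits. */
#define AS_TRANSTAB_LPAE_ADDR_SPACE_MASK	UINT64_C(0xfffffffffffffff0)
#define AS_TRANSCFG_ADRMODE_AARCH64_4K		UINT64_C(6)
#define AS_TRANSCFG_INA_BITS(x)			((uint64_t)(x) << 6)
#define AS_TRANSCFG_PTW_MEMATTR_WB		(UINT64_C(2) << 24)
#define AS_TRANSCFG_PTW_SH_OS			(UINT64_C(2) << 28)
#define AS_TRANSCFG_PTW_RA			(UINT64_C(1) << 30)

/* Hypothetical cache of the per-address-space register values. */
struct as_cfg {
	uint64_t transtab;
	uint64_t memattr;
	uint64_t transcfg;
};

/* Hypothetical init-time helper for the AArch64 4kB format. */
static struct as_cfg prepare_aarch64_cfg(uint64_t ttbr, uint64_t memattr,
					 uint32_t va_bits, bool coherent)
{
	struct as_cfg cfg = {
		.transtab = ttbr & AS_TRANSTAB_LPAE_ADDR_SPACE_MASK,
		.memattr = memattr,
		.transcfg = AS_TRANSCFG_PTW_MEMATTR_WB |
			    AS_TRANSCFG_PTW_RA |
			    AS_TRANSCFG_ADRMODE_AARCH64_4K |
			    AS_TRANSCFG_INA_BITS(55 - va_bits),
	};

	/* Page table walks are outer-shareable only on coherent devices. */
	if (coherent)
		cfg.transcfg |= AS_TRANSCFG_PTW_SH_OS;

	return cfg;
}
```

A matching helper for the legacy format would fill the same struct from `arm_mali_lpae_cfg` with `AS_TRANSCFG_ADRMODE_LEGACY`, which would let a single enable function serve both formats without the function pointer.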
> struct panfrost_device *pfdev;
> struct kref refcount;
> struct io_pgtable_cfg pgtbl_cfg;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index 7df2c8d5b0ae..30b8e2723254 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -26,6 +26,48 @@
> #define mmu_write(dev, reg, data) writel(data, dev->iomem + reg)
> #define mmu_read(dev, reg) readl(dev->iomem + reg)
>
> +static u64 mair_to_memattr(u64 mair, bool coherent)
> +{
> + u64 memattr = 0;
> + u32 i;
> +
> + for (i = 0; i < 8; i++) {
> + u8 in_attr = mair >> (8 * i), out_attr;
> + u8 outer = in_attr >> 4, inner = in_attr & 0xf;
> +
> + /* For caching to be enabled, inner and outer caching policy
> + * have to be both write-back, if one of them is write-through
> + * or non-cacheable, we just choose non-cacheable. Device
> + * memory is also translated to non-cacheable.
> + */
> + if (!(outer & 3) || !(outer & 4) || !(inner & 4)) {
> + out_attr = AS_MEMATTR_AARCH64_INNER_OUTER_NC |
> + AS_MEMATTR_AARCH64_SH_MIDGARD_INNER |
> + AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(false, false);
> + } else {
> + out_attr = AS_MEMATTR_AARCH64_INNER_OUTER_WB |
> + AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(inner & 1, inner & 2);
> + /* Use SH_MIDGARD_INNER mode when device isn't coherent,
> + * so SH_IS, which is used when IOMMU_CACHE is set, maps
> + * to Mali's internal-shareable mode. As per the Mali
> + * Spec, inner and outer-shareable modes aren't allowed
> + * for WB memory when coherency is disabled.
> + * Use SH_CPU_INNER mode when coherency is enabled, so
> + * that SH_IS actually maps to the standard definition of
> + * inner-shareable.
> + */
> + if (!coherent)
> + out_attr |= AS_MEMATTR_AARCH64_SH_MIDGARD_INNER;
> + else
> + out_attr |= AS_MEMATTR_AARCH64_SH_CPU_INNER;
> + }
> +
> + memattr |= (u64)out_attr << (8 * i);
> + }
> +
> + return memattr;
> +}
> +
> static int wait_ready(struct panfrost_device *pfdev, u32 as_nr)
> {
> int ret;
> @@ -121,38 +163,66 @@ static int mmu_hw_do_operation(struct panfrost_device *pfdev,
> return ret;
> }
>
> -static void panfrost_mmu_enable(struct panfrost_device *pfdev, struct panfrost_mmu *mmu)
> +static void
> +_panfrost_mmu_as_control_write(struct panfrost_device *pfdev, u32 as_nr,
> + u64 transtab, u64 memattr, u64 transcfg)
> {
> - int as_nr = mmu->as;
> - struct io_pgtable_cfg *cfg = &mmu->pgtbl_cfg;
> - u64 transtab = cfg->arm_mali_lpae_cfg.transtab;
> - u64 memattr = cfg->arm_mali_lpae_cfg.memattr;
> -
> mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0ULL, AS_COMMAND_FLUSH_MEM);
>
> mmu_write(pfdev, AS_TRANSTAB_LO(as_nr), lower_32_bits(transtab));
> mmu_write(pfdev, AS_TRANSTAB_HI(as_nr), upper_32_bits(transtab));
>
> - /* Need to revisit mem attrs.
> - * NC is the default, Mali driver is inner WT.
> - */
> mmu_write(pfdev, AS_MEMATTR_LO(as_nr), lower_32_bits(memattr));
> mmu_write(pfdev, AS_MEMATTR_HI(as_nr), upper_32_bits(memattr));
>
> + mmu_write(pfdev, AS_TRANSCFG_LO(as_nr), lower_32_bits(transcfg));
> + mmu_write(pfdev, AS_TRANSCFG_HI(as_nr), upper_32_bits(transcfg));
> +
> write_cmd(pfdev, as_nr, AS_COMMAND_UPDATE);
> +
> + dev_dbg(pfdev->dev, "mmu_as_control: as=%d, transtab=0x%016llx, memattr=0x%016llx, transcfg=0x%016llx",
> + as_nr, transtab, memattr, transcfg);
> }
>
> -static void panfrost_mmu_disable(struct panfrost_device *pfdev, u32 as_nr)
> +static void mmu_lpae_s1_enable(struct panfrost_device *pfdev,
> + struct panfrost_mmu *mmu)
> {
> - mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0ULL, AS_COMMAND_FLUSH_MEM);
> + struct io_pgtable_cfg *cfg = &mmu->pgtbl_cfg;
> + int as_nr = mmu->as;
>
> - mmu_write(pfdev, AS_TRANSTAB_LO(as_nr), 0);
> - mmu_write(pfdev, AS_TRANSTAB_HI(as_nr), 0);
> + u64 transtab =
> + cfg->arm_lpae_s1_cfg.ttbr & AS_TRANSTAB_LPAE_ADDR_SPACE_MASK;
> + u64 memattr =
> + mair_to_memattr(cfg->arm_lpae_s1_cfg.mair, pfdev->coherent);
> + u32 va_bits = GPU_MMU_FEATURES_VA_BITS(pfdev->features.mmu_features);
> + u64 transcfg = AS_TRANSCFG_PTW_MEMATTR_WB |
> + AS_TRANSCFG_PTW_RA |
> + AS_TRANSCFG_ADRMODE_AARCH64_4K |
> + AS_TRANSCFG_INA_BITS(55 - va_bits);
>
> - mmu_write(pfdev, AS_MEMATTR_LO(as_nr), 0);
> - mmu_write(pfdev, AS_MEMATTR_HI(as_nr), 0);
> + if (pfdev->coherent)
> + transcfg |= AS_TRANSCFG_PTW_SH_OS;
>
> - write_cmd(pfdev, as_nr, AS_COMMAND_UPDATE);
> + _panfrost_mmu_as_control_write(pfdev, as_nr, transtab, memattr,
> + transcfg);
> +}
> +
> +static void mmu_mali_lpae_enable(struct panfrost_device *pfdev,
> + struct panfrost_mmu *mmu)
> +{
> + struct io_pgtable_cfg *cfg = &mmu->pgtbl_cfg;
> + int as_nr = mmu->as;
> +
> + _panfrost_mmu_as_control_write(pfdev, as_nr,
> + cfg->arm_mali_lpae_cfg.transtab,
> + cfg->arm_mali_lpae_cfg.memattr,
> + AS_TRANSCFG_ADRMODE_LEGACY);
> +}
> +
> +static void panfrost_mmu_disable(struct panfrost_device *pfdev, u32 as_nr)
> +{
> + _panfrost_mmu_as_control_write(pfdev, as_nr, 0, 0,
> + AS_TRANSCFG_ADRMODE_UNMAPPED);
> }
>
> u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, struct panfrost_mmu *mmu)
> @@ -182,7 +252,7 @@ u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, struct panfrost_mmu *mmu)
> mmu_write(pfdev, MMU_INT_CLEAR, mask);
> mmu_write(pfdev, MMU_INT_MASK, ~pfdev->as_faulty_mask);
> pfdev->as_faulty_mask &= ~mask;
> - panfrost_mmu_enable(pfdev, mmu);
> + mmu->enable(pfdev, mmu);
> }
>
> goto out;
> @@ -214,7 +284,7 @@ u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, struct panfrost_mmu *mmu)
>
> dev_dbg(pfdev->dev, "Assigned AS%d to mmu %p, alloc_mask=%lx", as, mmu, pfdev->as_alloc_mask);
>
> - panfrost_mmu_enable(pfdev, mmu);
> + mmu->enable(pfdev, mmu);
>
> out:
> spin_unlock(&pfdev->as_lock);
> @@ -618,6 +688,7 @@ struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
> u32 va_bits = GPU_MMU_FEATURES_VA_BITS(pfdev->features.mmu_features);
> u32 pa_bits = GPU_MMU_FEATURES_PA_BITS(pfdev->features.mmu_features);
> struct panfrost_mmu *mmu;
> + enum io_pgtable_fmt fmt;
>
> mmu = kzalloc(sizeof(*mmu), GFP_KERNEL);
> if (!mmu)
> @@ -642,8 +713,15 @@ struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
> .iommu_dev = pfdev->dev,
> };
>
> - mmu->pgtbl_ops = alloc_io_pgtable_ops(ARM_MALI_LPAE, &mmu->pgtbl_cfg,
> - mmu);
> + if (panfrost_has_hw_feature(pfdev, HW_FEATURE_AARCH64_MMU)) {
> + fmt = ARM_64_LPAE_S1;
> + mmu->enable = mmu_lpae_s1_enable;
> + } else {
> + fmt = ARM_MALI_LPAE;
> + mmu->enable = mmu_mali_lpae_enable;
> + }
> + mmu->pgtbl_ops = alloc_io_pgtable_ops(fmt, &mmu->pgtbl_cfg, mmu);
> +
> if (!mmu->pgtbl_ops) {
> kfree(mmu);
> return ERR_PTR(-EINVAL);
> diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h
> index 4e6064d5feaa..a5ca36f583ff 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_regs.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
> @@ -301,6 +301,17 @@
> #define AS_TRANSTAB_HI(as) (MMU_AS(as) + 0x04) /* (RW) Translation Table Base Address for address space n, high word */
> #define AS_MEMATTR_LO(as) (MMU_AS(as) + 0x08) /* (RW) Memory attributes for address space n, low word. */
> #define AS_MEMATTR_HI(as) (MMU_AS(as) + 0x0C) /* (RW) Memory attributes for address space n, high word. */
> +#define AS_MEMATTR_AARCH64_INNER_ALLOC_IMPL (2 << 2)
> +#define AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(w, r) ((3 << 2) | \
> + ((w) ? BIT(0) : 0) | \
> + ((r) ? BIT(1) : 0))
> +#define AS_MEMATTR_AARCH64_SH_MIDGARD_INNER (0 << 4)
> +#define AS_MEMATTR_AARCH64_SH_CPU_INNER (1 << 4)
> +#define AS_MEMATTR_AARCH64_SH_CPU_INNER_SHADER_COH (2 << 4)
> +#define AS_MEMATTR_AARCH64_SHARED (0 << 6)
> +#define AS_MEMATTR_AARCH64_INNER_OUTER_NC (1 << 6)
> +#define AS_MEMATTR_AARCH64_INNER_OUTER_WB (2 << 6)
> +#define AS_MEMATTR_AARCH64_FAULT (3 << 6)
> #define AS_LOCKADDR_LO(as) (MMU_AS(as) + 0x10) /* (RW) Lock region address for address space n, low word */
> #define AS_LOCKADDR_HI(as) (MMU_AS(as) + 0x14) /* (RW) Lock region address for address space n, high word */
> #define AS_COMMAND(as) (MMU_AS(as) + 0x18) /* (WO) MMU command register for address space n */
> @@ -311,6 +322,24 @@
> /* Additional Bifrost AS registers */
> #define AS_TRANSCFG_LO(as) (MMU_AS(as) + 0x30) /* (RW) Translation table configuration for address space n, low word */
> #define AS_TRANSCFG_HI(as) (MMU_AS(as) + 0x34) /* (RW) Translation table configuration for address space n, high word */
> +#define AS_TRANSCFG_ADRMODE_LEGACY (0 << 0)
> +#define AS_TRANSCFG_ADRMODE_UNMAPPED (1 << 0)
> +#define AS_TRANSCFG_ADRMODE_IDENTITY (2 << 0)
> +#define AS_TRANSCFG_ADRMODE_AARCH64_4K (6 << 0)
> +#define AS_TRANSCFG_ADRMODE_AARCH64_64K (8 << 0)
> +#define AS_TRANSCFG_INA_BITS(x) ((x) << 6)
> +#define AS_TRANSCFG_OUTA_BITS(x) ((x) << 14)
> +#define AS_TRANSCFG_SL_CONCAT BIT(22)
> +#define AS_TRANSCFG_PTW_MEMATTR_NC (1 << 24)
> +#define AS_TRANSCFG_PTW_MEMATTR_WB (2 << 24)
> +#define AS_TRANSCFG_PTW_SH_NS (0 << 28)
> +#define AS_TRANSCFG_PTW_SH_OS (2 << 28)
> +#define AS_TRANSCFG_PTW_SH_IS (3 << 28)
> +#define AS_TRANSCFG_PTW_RA BIT(30)
> +#define AS_TRANSCFG_DISABLE_HIER_AP BIT(33)
> +#define AS_TRANSCFG_DISABLE_AF_FAULT BIT(34)
> +#define AS_TRANSCFG_WXN BIT(35)
> +#define AS_TRANSCFG_XREADABLE BIT(36)
> #define AS_FAULTEXTRA_LO(as) (MMU_AS(as) + 0x38) /* (RO) Secondary fault address for address space n, low word */
> #define AS_FAULTEXTRA_HI(as) (MMU_AS(as) + 0x3C) /* (RO) Secondary fault address for address space n, high word */
>
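For reference, a minimal userspace sketch of how the AARCH64_4K transcfg
value in the hunk above is composed. The field values follow the quoted
panfrost_regs.h hunk (widened to 64 bits here); the helper name
aarch64_4k_transcfg() and the 48-bit VA width used in the note below are
assumptions for illustration, not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field values follow the quoted panfrost_regs.h hunk, widened to 64 bits. */
#define AS_TRANSCFG_ADRMODE_AARCH64_4K	(6ULL << 0)
#define AS_TRANSCFG_INA_BITS(x)		((uint64_t)(x) << 6)
#define AS_TRANSCFG_PTW_MEMATTR_WB	(2ULL << 24)
#define AS_TRANSCFG_PTW_SH_OS		(2ULL << 28)
#define AS_TRANSCFG_PTW_RA		(1ULL << 30)

/*
 * Hypothetical helper mirroring the transcfg computation done in
 * mmu_lpae_s1_enable(): write-back page table walks with read-allocate,
 * 4kB granule, and an input address range derived from va_bits.
 */
static uint64_t aarch64_4k_transcfg(uint32_t va_bits, bool coherent)
{
	uint64_t transcfg = AS_TRANSCFG_PTW_MEMATTR_WB |
			    AS_TRANSCFG_PTW_RA |
			    AS_TRANSCFG_ADRMODE_AARCH64_4K |
			    AS_TRANSCFG_INA_BITS(55 - va_bits);

	/* Page table walks are outer-shareable only on coherent systems. */
	if (coherent)
		transcfg |= AS_TRANSCFG_PTW_SH_OS;

	return transcfg;
}
```

With va_bits = 48 this yields 0x420001c6 on a non-coherent system and
0x620001c6 on a coherent one.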
* Re: [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table
2025-02-27 8:30 ` Boris Brezillon
@ 2025-02-27 8:32 ` Boris Brezillon
2025-03-07 14:42 ` Ariel D'Alessandro
1 sibling, 0 replies; 17+ messages in thread
From: Boris Brezillon @ 2025-02-27 8:32 UTC (permalink / raw)
To: Ariel D'Alessandro
Cc: dri-devel, linux-kernel, robh, steven.price, maarten.lankhorst,
mripard, tzimmermann, airlied, simona
On Thu, 27 Feb 2025 09:30:30 +0100
Boris Brezillon <boris.brezillon@collabora.com> wrote:
> > Bifrost MMUs support AArch64 4kB granule specification. However,
> > panfrost only enables MMU in legacy mode, despite the presence of the
> > HW_FEATURE_AARCH64_MMU feature flag.
> >
> > This commit adds support to use page tables according to AArch64 4kB
> > granule specification. This feature is enabled conditionally based on
> > the GPU model's HW_FEATURE_AARCH64_MMU feature flag.
> >
> > Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com>
> > ---
> > drivers/gpu/drm/panfrost/panfrost_device.h | 1 +
> > drivers/gpu/drm/panfrost/panfrost_mmu.c | 118 +++++++++++++++++----
> > drivers/gpu/drm/panfrost/panfrost_regs.h | 29 +++++
> > 3 files changed, 128 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
> > index cffcb0ac7c11..dea252f43c58 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> > +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> > @@ -153,6 +153,7 @@ struct panfrost_device {
> > };
> >
> > struct panfrost_mmu {
> > + void (*enable)(struct panfrost_device *pfdev, struct panfrost_mmu *mmu);
>
> The enable sequence is the same, it's just the transtab, memattr and
> transcfg values that differ depending on the format, so let's prepare
> them at panfrost_mmu init time, and cache them here.
Just to be clear, I meant replace this ->enable() function pointer by a
struct {
u64 transtab;
u64 memattr;
u64 transcfg;
} cfg;
field.
>
> > struct panfrost_device *pfdev;
> > struct kref refcount;
> > struct io_pgtable_cfg pgtbl_cfg;
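The suggestion above can be sketched as follows. This is a minimal
userspace illustration, not driver code: struct mmu_sketch is a
trimmed-down stand-in for struct panfrost_mmu, and struct reg_log and
mmu_enable_sketch() are hypothetical names that replace the real register
writes so that the idea can be shown without hardware.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct panfrost_mmu. */
struct mmu_sketch {
	/* Prepared once at context creation, per page table format. */
	struct {
		uint64_t transtab;
		uint64_t memattr;
		uint64_t transcfg;
	} cfg;
	int as;
};

/* Records the values that would be written to the AS registers. */
struct reg_log {
	uint64_t transtab;
	uint64_t memattr;
	uint64_t transcfg;
	unsigned int writes;
};

/* One enable path for every format: only the cached values differ. */
static void mmu_enable_sketch(struct reg_log *hw, const struct mmu_sketch *mmu)
{
	hw->transtab = mmu->cfg.transtab;
	hw->memattr = mmu->cfg.memattr;
	hw->transcfg = mmu->cfg.transcfg;
	hw->writes++;
}
```

The per-format differences then live entirely in whatever code fills
cfg at init time, and no function pointer is needed.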
* Re: [RFC PATCH 1/4] drm/panfrost: Use GPU_MMU_FEATURES_VA_BITS/PA_BITS macros
2025-02-26 18:30 ` [RFC PATCH 1/4] drm/panfrost: Use GPU_MMU_FEATURES_VA_BITS/PA_BITS macros Ariel D'Alessandro
2025-02-27 8:21 ` Boris Brezillon
@ 2025-02-27 14:44 ` Steven Price
1 sibling, 0 replies; 17+ messages in thread
From: Steven Price @ 2025-02-27 14:44 UTC (permalink / raw)
To: Ariel D'Alessandro, dri-devel, linux-kernel
Cc: boris.brezillon, robh, maarten.lankhorst, mripard, tzimmermann,
airlied, simona
On 26/02/2025 18:30, Ariel D'Alessandro wrote:
> As done in panthor, define and use these GPU_MMU_FEATURES_* macros,
> which makes code easier to read and reuse.
>
> Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
> ---
> drivers/gpu/drm/panfrost/panfrost_mmu.c | 6 ++++--
> drivers/gpu/drm/panfrost/panfrost_regs.h | 2 ++
> 2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index b91019cd5acb..7df2c8d5b0ae 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -615,6 +615,8 @@ static void panfrost_drm_mm_color_adjust(const struct drm_mm_node *node,
>
> struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
> {
> + u32 va_bits = GPU_MMU_FEATURES_VA_BITS(pfdev->features.mmu_features);
> + u32 pa_bits = GPU_MMU_FEATURES_PA_BITS(pfdev->features.mmu_features);
> struct panfrost_mmu *mmu;
>
> mmu = kzalloc(sizeof(*mmu), GFP_KERNEL);
> @@ -633,8 +635,8 @@ struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
>
> mmu->pgtbl_cfg = (struct io_pgtable_cfg) {
> .pgsize_bitmap = SZ_4K | SZ_2M,
> - .ias = FIELD_GET(0xff, pfdev->features.mmu_features),
> - .oas = FIELD_GET(0xff00, pfdev->features.mmu_features),
> + .ias = va_bits,
> + .oas = pa_bits,
> .coherent_walk = pfdev->coherent,
> .tlb = &mmu_tlb_ops,
> .iommu_dev = pfdev->dev,
> diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h
> index c7bba476ab3f..b5f279a19a08 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_regs.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
> @@ -16,6 +16,8 @@
> #define GROUPS_L2_COHERENT BIT(0) /* Cores groups are l2 coherent */
>
> #define GPU_MMU_FEATURES 0x014 /* (RO) MMU features */
> +#define GPU_MMU_FEATURES_VA_BITS(x) ((x) & GENMASK(7, 0))
> +#define GPU_MMU_FEATURES_PA_BITS(x) (((x) >> 8) & GENMASK(7, 0))
> #define GPU_AS_PRESENT 0x018 /* (RO) Address space slots present */
> #define GPU_JS_PRESENT 0x01C /* (RO) Job slots present */
>
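As a quick illustration of what the two new macros extract, here is a
minimal userspace sketch. The GENMASK() stand-in, struct
mmu_features_sketch and decode_mmu_features() are assumptions made for
this example, and the register value mentioned below is made up, not
read from any GPU.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(), for 32-bit values only. */
#define GENMASK(h, l)	((~0U >> (31 - (h))) & (~0U << (l)))

/* Copied from the quoted panfrost_regs.h hunk. */
#define GPU_MMU_FEATURES_VA_BITS(x)	((x) & GENMASK(7, 0))
#define GPU_MMU_FEATURES_PA_BITS(x)	(((x) >> 8) & GENMASK(7, 0))

/* Hypothetical container for the two decoded fields. */
struct mmu_features_sketch {
	uint32_t va_bits;	/* bits [7:0] of GPU_MMU_FEATURES */
	uint32_t pa_bits;	/* bits [15:8] of GPU_MMU_FEATURES */
};

static struct mmu_features_sketch decode_mmu_features(uint32_t mmu_features)
{
	struct mmu_features_sketch f = {
		.va_bits = GPU_MMU_FEATURES_VA_BITS(mmu_features),
		.pa_bits = GPU_MMU_FEATURES_PA_BITS(mmu_features),
	};

	return f;
}
```

For a made-up register value of 0x2830 this gives va_bits = 48 and
pa_bits = 40, the same fields the removed FIELD_GET(0xff, ...) and
FIELD_GET(0xff00, ...) calls extracted.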
* Re: [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table
2025-02-26 18:30 ` [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table Ariel D'Alessandro
2025-02-27 8:30 ` Boris Brezillon
@ 2025-02-27 14:44 ` Steven Price
2025-03-10 15:46 ` Ariel D'Alessandro
2025-02-27 14:55 ` Boris Brezillon
2 siblings, 1 reply; 17+ messages in thread
From: Steven Price @ 2025-02-27 14:44 UTC (permalink / raw)
To: Ariel D'Alessandro, dri-devel, linux-kernel
Cc: boris.brezillon, robh, maarten.lankhorst, mripard, tzimmermann,
airlied, simona
On 26/02/2025 18:30, Ariel D'Alessandro wrote:
> Bifrost MMUs support AArch64 4kB granule specification. However,
> panfrost only enables MMU in legacy mode, despite the presence of the
> HW_FEATURE_AARCH64_MMU feature flag.
>
> This commit adds support to use page tables according to AArch64 4kB
> granule specification. This feature is enabled conditionally based on
> the GPU model's HW_FEATURE_AARCH64_MMU feature flag.
>
> Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com>
I find some of the naming confusing here. The subject calls it
'ARM_64_LPAE_S1' which is an unfortunate name from the iommu code.
AIUI, LPAE is the "Large Physical Address Extension" and is a v7 feature
for 32 bit. "LEGACY" (as Bifrost calls it) mode is a (modified) version
of LPAE, which in Linux we've called "mali_lpae".
What you're adding support for is AARCH64_4K which is the v8 64 bit
mode. So I think it's worth including the "64" part of the name of the
mmu_lpae_s1_enable() function. Personally I'd be tempted to drop the
"_s1" part, but I guess there's a small chance someone will find a use
for the second stage one day.
Note also that it's not necessarily a clear-cut improvement to use
AARCH64_4K over LEGACY. I wouldn't be surprised if this actually causes
(minor) performance regressions on some platforms. Sadly I don't have
access to a range of hardware to test this on.
Steve
> ---
> drivers/gpu/drm/panfrost/panfrost_device.h | 1 +
> drivers/gpu/drm/panfrost/panfrost_mmu.c | 118 +++++++++++++++++----
> drivers/gpu/drm/panfrost/panfrost_regs.h | 29 +++++
> 3 files changed, 128 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
> index cffcb0ac7c11..dea252f43c58 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -153,6 +153,7 @@ struct panfrost_device {
> };
>
> struct panfrost_mmu {
> + void (*enable)(struct panfrost_device *pfdev, struct panfrost_mmu *mmu);
> struct panfrost_device *pfdev;
> struct kref refcount;
> struct io_pgtable_cfg pgtbl_cfg;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index 7df2c8d5b0ae..30b8e2723254 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -26,6 +26,48 @@
> #define mmu_write(dev, reg, data) writel(data, dev->iomem + reg)
> #define mmu_read(dev, reg) readl(dev->iomem + reg)
>
> +static u64 mair_to_memattr(u64 mair, bool coherent)
> +{
> + u64 memattr = 0;
> + u32 i;
> +
> + for (i = 0; i < 8; i++) {
> + u8 in_attr = mair >> (8 * i), out_attr;
> + u8 outer = in_attr >> 4, inner = in_attr & 0xf;
> +
> + /* For caching to be enabled, inner and outer caching policy
> + * have to be both write-back, if one of them is write-through
> + * or non-cacheable, we just choose non-cacheable. Device
> + * memory is also translated to non-cacheable.
> + */
> + if (!(outer & 3) || !(outer & 4) || !(inner & 4)) {
> + out_attr = AS_MEMATTR_AARCH64_INNER_OUTER_NC |
> + AS_MEMATTR_AARCH64_SH_MIDGARD_INNER |
> + AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(false, false);
> + } else {
> + out_attr = AS_MEMATTR_AARCH64_INNER_OUTER_WB |
> + AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(inner & 1, inner & 2);
> + /* Use SH_MIDGARD_INNER mode when device isn't coherent,
> + * so SH_IS, which is used when IOMMU_CACHE is set, maps
> + * to Mali's internal-shareable mode. As per the Mali
> + * Spec, inner and outer-shareable modes aren't allowed
> + * for WB memory when coherency is disabled.
> + * Use SH_CPU_INNER mode when coherency is enabled, so
> + * that SH_IS actually maps to the standard definition of
> + * inner-shareable.
> + */
> + if (!coherent)
> + out_attr |= AS_MEMATTR_AARCH64_SH_MIDGARD_INNER;
> + else
> + out_attr |= AS_MEMATTR_AARCH64_SH_CPU_INNER;
> + }
> +
> + memattr |= (u64)out_attr << (8 * i);
> + }
> +
> + return memattr;
> +}
> +
> static int wait_ready(struct panfrost_device *pfdev, u32 as_nr)
> {
> int ret;
> @@ -121,38 +163,66 @@ static int mmu_hw_do_operation(struct panfrost_device *pfdev,
> return ret;
> }
>
> -static void panfrost_mmu_enable(struct panfrost_device *pfdev, struct panfrost_mmu *mmu)
> +static void
> +_panfrost_mmu_as_control_write(struct panfrost_device *pfdev, u32 as_nr,
> + u64 transtab, u64 memattr, u64 transcfg)
> {
> - int as_nr = mmu->as;
> - struct io_pgtable_cfg *cfg = &mmu->pgtbl_cfg;
> - u64 transtab = cfg->arm_mali_lpae_cfg.transtab;
> - u64 memattr = cfg->arm_mali_lpae_cfg.memattr;
> -
> mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0ULL, AS_COMMAND_FLUSH_MEM);
>
> mmu_write(pfdev, AS_TRANSTAB_LO(as_nr), lower_32_bits(transtab));
> mmu_write(pfdev, AS_TRANSTAB_HI(as_nr), upper_32_bits(transtab));
>
> - /* Need to revisit mem attrs.
> - * NC is the default, Mali driver is inner WT.
> - */
> mmu_write(pfdev, AS_MEMATTR_LO(as_nr), lower_32_bits(memattr));
> mmu_write(pfdev, AS_MEMATTR_HI(as_nr), upper_32_bits(memattr));
>
> + mmu_write(pfdev, AS_TRANSCFG_LO(as_nr), lower_32_bits(transcfg));
> + mmu_write(pfdev, AS_TRANSCFG_HI(as_nr), upper_32_bits(transcfg));
> +
> write_cmd(pfdev, as_nr, AS_COMMAND_UPDATE);
> +
> + dev_dbg(pfdev->dev, "mmu_as_control: as=%d, transtab=0x%016llx, memattr=0x%016llx, transcfg=0x%016llx",
> + as_nr, transtab, memattr, transcfg);
> }
>
> -static void panfrost_mmu_disable(struct panfrost_device *pfdev, u32 as_nr)
> +static void mmu_lpae_s1_enable(struct panfrost_device *pfdev,
> + struct panfrost_mmu *mmu)
> {
> - mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0ULL, AS_COMMAND_FLUSH_MEM);
> + struct io_pgtable_cfg *cfg = &mmu->pgtbl_cfg;
> + int as_nr = mmu->as;
>
> - mmu_write(pfdev, AS_TRANSTAB_LO(as_nr), 0);
> - mmu_write(pfdev, AS_TRANSTAB_HI(as_nr), 0);
> + u64 transtab =
> + cfg->arm_lpae_s1_cfg.ttbr & AS_TRANSTAB_LPAE_ADDR_SPACE_MASK;
> + u64 memattr =
> + mair_to_memattr(cfg->arm_lpae_s1_cfg.mair, pfdev->coherent);
> + u32 va_bits = GPU_MMU_FEATURES_VA_BITS(pfdev->features.mmu_features);
> + u64 transcfg = AS_TRANSCFG_PTW_MEMATTR_WB |
> + AS_TRANSCFG_PTW_RA |
> + AS_TRANSCFG_ADRMODE_AARCH64_4K |
> + AS_TRANSCFG_INA_BITS(55 - va_bits);
>
> - mmu_write(pfdev, AS_MEMATTR_LO(as_nr), 0);
> - mmu_write(pfdev, AS_MEMATTR_HI(as_nr), 0);
> + if (pfdev->coherent)
> + transcfg |= AS_TRANSCFG_PTW_SH_OS;
>
> - write_cmd(pfdev, as_nr, AS_COMMAND_UPDATE);
> + _panfrost_mmu_as_control_write(pfdev, as_nr, transtab, memattr,
> + transcfg);
> +}
> +
> +static void mmu_mali_lpae_enable(struct panfrost_device *pfdev,
> + struct panfrost_mmu *mmu)
> +{
> + struct io_pgtable_cfg *cfg = &mmu->pgtbl_cfg;
> + int as_nr = mmu->as;
> +
> + _panfrost_mmu_as_control_write(pfdev, as_nr,
> + cfg->arm_mali_lpae_cfg.transtab,
> + cfg->arm_mali_lpae_cfg.memattr,
> + AS_TRANSCFG_ADRMODE_LEGACY);
> +}
> +
> +static void panfrost_mmu_disable(struct panfrost_device *pfdev, u32 as_nr)
> +{
> + _panfrost_mmu_as_control_write(pfdev, as_nr, 0, 0,
> + AS_TRANSCFG_ADRMODE_UNMAPPED);
> }
>
> u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, struct panfrost_mmu *mmu)
> @@ -182,7 +252,7 @@ u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, struct panfrost_mmu *mmu)
> mmu_write(pfdev, MMU_INT_CLEAR, mask);
> mmu_write(pfdev, MMU_INT_MASK, ~pfdev->as_faulty_mask);
> pfdev->as_faulty_mask &= ~mask;
> - panfrost_mmu_enable(pfdev, mmu);
> + mmu->enable(pfdev, mmu);
> }
>
> goto out;
> @@ -214,7 +284,7 @@ u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, struct panfrost_mmu *mmu)
>
> dev_dbg(pfdev->dev, "Assigned AS%d to mmu %p, alloc_mask=%lx", as, mmu, pfdev->as_alloc_mask);
>
> - panfrost_mmu_enable(pfdev, mmu);
> + mmu->enable(pfdev, mmu);
>
> out:
> spin_unlock(&pfdev->as_lock);
> @@ -618,6 +688,7 @@ struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
> u32 va_bits = GPU_MMU_FEATURES_VA_BITS(pfdev->features.mmu_features);
> u32 pa_bits = GPU_MMU_FEATURES_PA_BITS(pfdev->features.mmu_features);
> struct panfrost_mmu *mmu;
> + enum io_pgtable_fmt fmt;
>
> mmu = kzalloc(sizeof(*mmu), GFP_KERNEL);
> if (!mmu)
> @@ -642,8 +713,15 @@ struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
> .iommu_dev = pfdev->dev,
> };
>
> - mmu->pgtbl_ops = alloc_io_pgtable_ops(ARM_MALI_LPAE, &mmu->pgtbl_cfg,
> - mmu);
> + if (panfrost_has_hw_feature(pfdev, HW_FEATURE_AARCH64_MMU)) {
> + fmt = ARM_64_LPAE_S1;
> + mmu->enable = mmu_lpae_s1_enable;
> + } else {
> + fmt = ARM_MALI_LPAE;
> + mmu->enable = mmu_mali_lpae_enable;
> + }
> + mmu->pgtbl_ops = alloc_io_pgtable_ops(fmt, &mmu->pgtbl_cfg, mmu);
> +
> if (!mmu->pgtbl_ops) {
> kfree(mmu);
> return ERR_PTR(-EINVAL);
> diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h
> index 4e6064d5feaa..a5ca36f583ff 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_regs.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
> @@ -301,6 +301,17 @@
> #define AS_TRANSTAB_HI(as) (MMU_AS(as) + 0x04) /* (RW) Translation Table Base Address for address space n, high word */
> #define AS_MEMATTR_LO(as) (MMU_AS(as) + 0x08) /* (RW) Memory attributes for address space n, low word. */
> #define AS_MEMATTR_HI(as) (MMU_AS(as) + 0x0C) /* (RW) Memory attributes for address space n, high word. */
> +#define AS_MEMATTR_AARCH64_INNER_ALLOC_IMPL (2 << 2)
> +#define AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(w, r) ((3 << 2) | \
> + ((w) ? BIT(0) : 0) | \
> + ((r) ? BIT(1) : 0))
> +#define AS_MEMATTR_AARCH64_SH_MIDGARD_INNER (0 << 4)
> +#define AS_MEMATTR_AARCH64_SH_CPU_INNER (1 << 4)
> +#define AS_MEMATTR_AARCH64_SH_CPU_INNER_SHADER_COH (2 << 4)
> +#define AS_MEMATTR_AARCH64_SHARED (0 << 6)
> +#define AS_MEMATTR_AARCH64_INNER_OUTER_NC (1 << 6)
> +#define AS_MEMATTR_AARCH64_INNER_OUTER_WB (2 << 6)
> +#define AS_MEMATTR_AARCH64_FAULT (3 << 6)
> #define AS_LOCKADDR_LO(as) (MMU_AS(as) + 0x10) /* (RW) Lock region address for address space n, low word */
> #define AS_LOCKADDR_HI(as) (MMU_AS(as) + 0x14) /* (RW) Lock region address for address space n, high word */
> #define AS_COMMAND(as) (MMU_AS(as) + 0x18) /* (WO) MMU command register for address space n */
> @@ -311,6 +322,24 @@
> /* Additional Bifrost AS registers */
> #define AS_TRANSCFG_LO(as) (MMU_AS(as) + 0x30) /* (RW) Translation table configuration for address space n, low word */
> #define AS_TRANSCFG_HI(as) (MMU_AS(as) + 0x34) /* (RW) Translation table configuration for address space n, high word */
> +#define AS_TRANSCFG_ADRMODE_LEGACY (0 << 0)
> +#define AS_TRANSCFG_ADRMODE_UNMAPPED (1 << 0)
> +#define AS_TRANSCFG_ADRMODE_IDENTITY (2 << 0)
> +#define AS_TRANSCFG_ADRMODE_AARCH64_4K (6 << 0)
> +#define AS_TRANSCFG_ADRMODE_AARCH64_64K (8 << 0)
> +#define AS_TRANSCFG_INA_BITS(x) ((x) << 6)
> +#define AS_TRANSCFG_OUTA_BITS(x) ((x) << 14)
> +#define AS_TRANSCFG_SL_CONCAT BIT(22)
> +#define AS_TRANSCFG_PTW_MEMATTR_NC (1 << 24)
> +#define AS_TRANSCFG_PTW_MEMATTR_WB (2 << 24)
> +#define AS_TRANSCFG_PTW_SH_NS (0 << 28)
> +#define AS_TRANSCFG_PTW_SH_OS (2 << 28)
> +#define AS_TRANSCFG_PTW_SH_IS (3 << 28)
> +#define AS_TRANSCFG_PTW_RA BIT(30)
> +#define AS_TRANSCFG_DISABLE_HIER_AP BIT(33)
> +#define AS_TRANSCFG_DISABLE_AF_FAULT BIT(34)
> +#define AS_TRANSCFG_WXN BIT(35)
> +#define AS_TRANSCFG_XREADABLE BIT(36)
> #define AS_FAULTEXTRA_LO(as) (MMU_AS(as) + 0x38) /* (RO) Secondary fault address for address space n, low word */
> #define AS_FAULTEXTRA_HI(as) (MMU_AS(as) + 0x3C) /* (RO) Secondary fault address for address space n, high word */
>
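To make the MAIR to MEMATTR translation in the quoted patch easier to
follow, here is a userspace copy of its logic, split into a
per-attribute helper. The macro values are taken from the quoted
panfrost_regs.h hunk; mair_attr_to_memattr() is a hypothetical name
introduced for this sketch, and the MAIR values in the note below are
picked for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the quoted panfrost_regs.h hunk. */
#define AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(w, r) \
	((3 << 2) | ((w) ? 1 : 0) | ((r) ? 2 : 0))
#define AS_MEMATTR_AARCH64_SH_MIDGARD_INNER	(0 << 4)
#define AS_MEMATTR_AARCH64_SH_CPU_INNER		(1 << 4)
#define AS_MEMATTR_AARCH64_INNER_OUTER_NC	(1 << 6)
#define AS_MEMATTR_AARCH64_INNER_OUTER_WB	(2 << 6)

/* Translate one 8-bit MAIR attribute into one 8-bit MEMATTR attribute. */
static uint8_t mair_attr_to_memattr(uint8_t in_attr, bool coherent)
{
	uint8_t outer = in_attr >> 4, inner = in_attr & 0xf;
	uint8_t out_attr;

	/* Anything that is not write-back on both levels becomes NC. */
	if (!(outer & 3) || !(outer & 4) || !(inner & 4))
		return AS_MEMATTR_AARCH64_INNER_OUTER_NC |
		       AS_MEMATTR_AARCH64_SH_MIDGARD_INNER |
		       AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(false, false);

	/* Write-back: keep the inner allocation hints from the MAIR entry. */
	out_attr = AS_MEMATTR_AARCH64_INNER_OUTER_WB |
		   AS_MEMATTR_AARCH64_INNER_ALLOC_EXPL(inner & 1, inner & 2);

	/* Shareability depends on whether the GPU is I/O coherent. */
	out_attr |= coherent ? AS_MEMATTR_AARCH64_SH_CPU_INNER :
			       AS_MEMATTR_AARCH64_SH_MIDGARD_INNER;

	return out_attr;
}

/* Same loop as in the patch: eight attributes, one byte each. */
static uint64_t mair_to_memattr(uint64_t mair, bool coherent)
{
	uint64_t memattr = 0;
	unsigned int i;

	for (i = 0; i < 8; i++) {
		uint8_t in_attr = (uint8_t)(mair >> (8 * i));

		memattr |= (uint64_t)mair_attr_to_memattr(in_attr, coherent) << (8 * i);
	}

	return memattr;
}
```

A write-back read/write-allocate attribute (0xff) maps to 0x8f without
coherency and 0x9f with it, while non-cacheable (0x44) and device (0x04)
attributes both map to 0x4c.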
* Re: [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table
2025-02-26 18:30 ` [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table Ariel D'Alessandro
2025-02-27 8:30 ` Boris Brezillon
2025-02-27 14:44 ` Steven Price
@ 2025-02-27 14:55 ` Boris Brezillon
2025-03-10 15:34 ` Ariel D'Alessandro
2 siblings, 1 reply; 17+ messages in thread
From: Boris Brezillon @ 2025-02-27 14:55 UTC (permalink / raw)
To: Ariel D'Alessandro
Cc: dri-devel, linux-kernel, robh, steven.price, maarten.lankhorst,
mripard, tzimmermann, airlied, simona
On Wed, 26 Feb 2025 15:30:42 -0300
Ariel D'Alessandro <ariel.dalessandro@collabora.com> wrote:
> @@ -642,8 +713,15 @@ struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
> .iommu_dev = pfdev->dev,
> };
>
> - mmu->pgtbl_ops = alloc_io_pgtable_ops(ARM_MALI_LPAE, &mmu->pgtbl_cfg,
> - mmu);
> + if (panfrost_has_hw_feature(pfdev, HW_FEATURE_AARCH64_MMU)) {
> + fmt = ARM_64_LPAE_S1;
> + mmu->enable = mmu_lpae_s1_enable;
> + } else {
> + fmt = ARM_MALI_LPAE;
> + mmu->enable = mmu_mali_lpae_enable;
> + }
How about we stick to the legacy pgtable format for all currently
supported GPUs, and make this an opt-in property attached to the
compatible. This way, we can progressively move away from the legacy
format once enough testing has been done, while allowing support for
GPUs that can't use the old format because the cacheability/shareability
configuration is too limited.
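A minimal sketch of what such an opt-in could look like, assuming a
boolean in the per-compatible data. Every name here is hypothetical
(struct compatible_sketch, use_aarch64_4k, pick_pgtable_fmt()), and
requiring the hardware feature flag in addition to the opt-in is a design
assumption of this sketch rather than something stated in the thread.

```c
#include <stdbool.h>

/* Stand-ins for ARM_MALI_LPAE and ARM_64_LPAE_S1. */
enum pgtable_fmt_sketch {
	FMT_MALI_LPAE,
	FMT_AARCH64_4K,
};

/* Hypothetical per-compatible data; the real struct has other fields. */
struct compatible_sketch {
	bool use_aarch64_4k;	/* hypothetical opt-in flag */
};

/*
 * Legacy stays the default. AARCH64_4K is used only when the platform
 * opted in and the GPU model advertises the feature.
 */
static enum pgtable_fmt_sketch
pick_pgtable_fmt(const struct compatible_sketch *comp, bool hw_has_aarch64_mmu)
{
	if (comp->use_aarch64_4k && hw_has_aarch64_mmu)
		return FMT_AARCH64_4K;

	return FMT_MALI_LPAE;
}
```

Existing platforms would leave the flag unset and keep their current
behaviour, so the new format can be enabled one compatible at a time.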
* Re: [RFC PATCH 2/4] drm/panfrost: Split LPAE MMU TRANSTAB register values
2025-02-27 8:25 ` Boris Brezillon
@ 2025-03-07 14:02 ` Ariel D'Alessandro
0 siblings, 0 replies; 17+ messages in thread
From: Ariel D'Alessandro @ 2025-03-07 14:02 UTC (permalink / raw)
To: Boris Brezillon
Cc: dri-devel, linux-kernel, robh, steven.price, maarten.lankhorst,
mripard, tzimmermann, airlied, simona
Boris,
On 2/27/25 5:25 AM, Boris Brezillon wrote:
> On Wed, 26 Feb 2025 15:30:41 -0300
> Ariel D'Alessandro <ariel.dalessandro@collabora.com> wrote:
[snip]
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h
>> index b5f279a19a08..4e6064d5feaa 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_regs.h
>> +++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
>> @@ -317,14 +317,19 @@
>> #define MMU_AS_STRIDE (1 << MMU_AS_SHIFT)
>>
>> /*
>> - * Begin LPAE MMU TRANSTAB register values
>> + * Begin LPAE MMU TRANSTAB register values (legacy mode)
>> */
>> -#define AS_TRANSTAB_LPAE_ADDR_SPACE_MASK 0xfffffffffffff000
>> -#define AS_TRANSTAB_LPAE_ADRMODE_IDENTITY 0x2
>> -#define AS_TRANSTAB_LPAE_ADRMODE_TABLE 0x3
>> -#define AS_TRANSTAB_LPAE_ADRMODE_MASK 0x3
>> -#define AS_TRANSTAB_LPAE_READ_INNER BIT(2)
>> -#define AS_TRANSTAB_LPAE_SHARE_OUTER BIT(4)
>> +#define AS_TRANSTAB_LEGACY_ADDR_SPACE_MASK 0xfffffffffffff000
>> +#define AS_TRANSTAB_LEGACY_ADRMODE_IDENTITY 0x2
>> +#define AS_TRANSTAB_LEGACY_ADRMODE_TABLE 0x3
>> +#define AS_TRANSTAB_LEGACY_ADRMODE_MASK 0x3
>> +#define AS_TRANSTAB_LEGACY_READ_INNER BIT(2)
>> +#define AS_TRANSTAB_LEGACY_SHARE_OUTER BIT(4)
>
> How about we keep AS_TRANSTAB_LPAE_ here and prefix the new reg values
> with AS_xxx_AARCH64_ when there's a collision between the two formats.
Agreed. Will use AS_TRANSTAB_AARCH64_4K_ prefix for the new ones.
>
>> +
>> +/*
>> + * Begin LPAE MMU TRANSTAB register values (no-legacy mode)
>> + */
>> +#define AS_TRANSTAB_LPAE_ADDR_SPACE_MASK 0xfffffffffffffff0
>
> It looks like we're not using AS_TRANSTAB_LPAE_ADDR_SPACE_MASK, so I'm
> not sure it's worth defining the mask for the AARCH64 format.
None of the original AS_TRANSTAB_LPAE_* values are used, but these refer
to the LPAE (legacy mode) format.
The new mask for the AARCH64 format is required by the follow-up patch
`[RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table`. It
probably makes sense to just squash it now that this patch got
simplified and the naming will be clearer.
I'll send a new patchset version with these changes.
Thanks!
--
Ariel D'Alessandro
Software Engineer
Collabora Ltd.
Platinum Building, St John's Innovation Park, Cambridge CB4 0DS, UK
Registered in England & Wales, no. 5513718
* Re: [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table
2025-02-27 8:30 ` Boris Brezillon
2025-02-27 8:32 ` Boris Brezillon
@ 2025-03-07 14:42 ` Ariel D'Alessandro
1 sibling, 0 replies; 17+ messages in thread
From: Ariel D'Alessandro @ 2025-03-07 14:42 UTC (permalink / raw)
To: Boris Brezillon
Cc: dri-devel, linux-kernel, robh, steven.price, maarten.lankhorst,
mripard, tzimmermann, airlied, simona
Boris,
On 2/27/25 5:30 AM, Boris Brezillon wrote:
> On Wed, 26 Feb 2025 15:30:42 -0300
> Ariel D'Alessandro <ariel.dalessandro@collabora.com> wrote:
>
>> Bifrost MMUs support AArch64 4kB granule specification. However,
>> panfrost only enables MMU in legacy mode, despite the presence of the
>> HW_FEATURE_AARCH64_MMU feature flag.
>>
>> This commit adds support to use page tables according to AArch64 4kB
>> granule specification. This feature is enabled conditionally based on
>> the GPU model's HW_FEATURE_AARCH64_MMU feature flag.
>>
>> Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com>
>> ---
>> drivers/gpu/drm/panfrost/panfrost_device.h | 1 +
>> drivers/gpu/drm/panfrost/panfrost_mmu.c | 118 +++++++++++++++++----
>> drivers/gpu/drm/panfrost/panfrost_regs.h | 29 +++++
>> 3 files changed, 128 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
>> index cffcb0ac7c11..dea252f43c58 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
>> @@ -153,6 +153,7 @@ struct panfrost_device {
>> };
>>
>> struct panfrost_mmu {
>> + void (*enable)(struct panfrost_device *pfdev, struct panfrost_mmu *mmu);
>
> The enable sequence is the same, it's just the transtab, memattr and
> transcfg values that differ depending on the format, so let's prepare
> them at panfrost_mmu init time, and cache them here.
Agreed. AFAICS, this would be:
Add the following to struct panfrost_mmu:
struct {
u64 transtab;
u64 memattr;
u64 transcfg;
} cfg;
and have it initialized in panfrost_mmu_ctx_create().
For consistency, we should do this for both modes MALI_LPAE and
AARCH64_4K. As for the MALI_LPAE case, I'd move out the initialization
done in drivers/iommu/io-pgtable-arm.c for:
struct {
u64 transtab;
u64 memattr;
} arm_mali_lpae_cfg;
I'll send a proposal for this in the next patchset v1.
Thanks!
--
Ariel D'Alessandro
* Re: [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table
2025-02-27 14:55 ` Boris Brezillon
@ 2025-03-10 15:34 ` Ariel D'Alessandro
2025-03-10 19:25 ` Boris Brezillon
0 siblings, 1 reply; 17+ messages in thread
From: Ariel D'Alessandro @ 2025-03-10 15:34 UTC (permalink / raw)
To: Boris Brezillon
Cc: dri-devel, linux-kernel, robh, steven.price, maarten.lankhorst,
mripard, tzimmermann, airlied, simona
Hi Boris,
On 2/27/25 11:55 AM, Boris Brezillon wrote:
> On Wed, 26 Feb 2025 15:30:42 -0300
> Ariel D'Alessandro <ariel.dalessandro@collabora.com> wrote:
>
>> @@ -642,8 +713,15 @@ struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
>> .iommu_dev = pfdev->dev,
>> };
>>
>> - mmu->pgtbl_ops = alloc_io_pgtable_ops(ARM_MALI_LPAE, &mmu->pgtbl_cfg,
>> - mmu);
>> + if (panfrost_has_hw_feature(pfdev, HW_FEATURE_AARCH64_MMU)) {
>> + fmt = ARM_64_LPAE_S1;
>> + mmu->enable = mmu_lpae_s1_enable;
>> + } else {
>> + fmt = ARM_MALI_LPAE;
>> + mmu->enable = mmu_mali_lpae_enable;
>> + }
>
> How about we stick to the legacy pgtable format for all currently
> supported GPUs, and make this an opt-in property attached to the
> compatible. This way, we can progressively move away from the legacy
> format once enough testing has been done, while allowing support for
> GPUs that can't use the old format because the cacheability/shareability
> configuration is too limited.
Indeed, that's a better way to go.
Specifically, what you mean is: keep the same compatible string and add
a new property to the `panfrost_compatible` private data for that
specific variant? E.g.
In drivers/gpu/drm/panfrost/panfrost_drv.c:
```
struct panfrost_compatible mediatek_mt8188_data
[...]
{ .compatible = "mediatek,mt8188-mali", .data = &mediatek_mt8188_data },
```
Thanks,
--
Ariel D'Alessandro
* Re: [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table
2025-02-27 14:44 ` Steven Price
@ 2025-03-10 15:46 ` Ariel D'Alessandro
0 siblings, 0 replies; 17+ messages in thread
From: Ariel D'Alessandro @ 2025-03-10 15:46 UTC (permalink / raw)
To: Steven Price, dri-devel, linux-kernel
Cc: boris.brezillon, robh, maarten.lankhorst, mripard, tzimmermann,
airlied, simona
Hi Steven,
On 2/27/25 11:44 AM, Steven Price wrote:
> On 26/02/2025 18:30, Ariel D'Alessandro wrote:
>> Bifrost MMUs support AArch64 4kB granule specification. However,
>> panfrost only enables MMU in legacy mode, despite the presence of the
>> HW_FEATURE_AARCH64_MMU feature flag.
>>
>> This commit adds support to use page tables according to AArch64 4kB
>> granule specification. This feature is enabled conditionally based on
>> the GPU model's HW_FEATURE_AARCH64_MMU feature flag.
>>
>> Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com>
>
> I find some of the naming confusing here. The subject calls it
> 'ARM_64_LPAE_S1' which is an unfortunate name from the iommu code.
>
> AIUI, LPAE is the "Large Physical Address Extension" and is a v7 feature
> for 32 bit. "LEGACY" (as Bifrost calls it) mode is a (modified) version
> of LPAE, which in Linux we've called "mali_lpae".
>
> What you're adding support for is AARCH64_4K which is the v8 64 bit
> mode. So I think it's worth including the "64" part of the name of the
> mmu_lpae_s1_enable() function. Personally I'd be tempted to drop the
> "_s1" part, but I guess there's a small chance someone will find a use
> for the second stage one day.
Yes, overall agreed. I'll definitely keep the "64_4k" part of the name
explicit and also keep "mali_lpae" for the legacy mode. I'll send a
v1 patchset soon with all of this, and we can continue the review there.
>
> Note also that it's not necessarily a clear-cut improvement to use
> AARCH64_4K over LEGACY. I wouldn't be surprised if this actually causes
> (minor) performance regressions on some platforms. Sadly I don't have
> access to a range of hardware to test this on.
FWIW, I'm using Mesa CI [0] to test this as much as possible, at least
to detect any (major) regressions. As proposed by Boris, we should use a
property to progressively enable this feature.
Thanks!
[0] https://gitlab.freedesktop.org/mesa/mesa/
--
Ariel D'Alessandro
Software Engineer
Collabora Ltd.
Platinum Building, St John's Innovation Park, Cambridge CB4 0DS, UK
Registered in England & Wales, no. 5513718
* Re: [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table
2025-03-10 15:34 ` Ariel D'Alessandro
@ 2025-03-10 19:25 ` Boris Brezillon
0 siblings, 0 replies; 17+ messages in thread
From: Boris Brezillon @ 2025-03-10 19:25 UTC (permalink / raw)
To: Ariel D'Alessandro
Cc: dri-devel, linux-kernel, robh, steven.price, maarten.lankhorst,
mripard, tzimmermann, airlied, simona
On Mon, 10 Mar 2025 12:34:30 -0300
Ariel D'Alessandro <ariel.dalessandro@collabora.com> wrote:
> Hi Boris,
>
> On 2/27/25 11:55 AM, Boris Brezillon wrote:
> > On Wed, 26 Feb 2025 15:30:42 -0300
> > Ariel D'Alessandro <ariel.dalessandro@collabora.com> wrote:
> >
> >> @@ -642,8 +713,15 @@ struct panfrost_mmu *panfrost_mmu_ctx_create(struct panfrost_device *pfdev)
> >> .iommu_dev = pfdev->dev,
> >> };
> >>
> >> - mmu->pgtbl_ops = alloc_io_pgtable_ops(ARM_MALI_LPAE, &mmu->pgtbl_cfg,
> >> - mmu);
> >> + if (panfrost_has_hw_feature(pfdev, HW_FEATURE_AARCH64_MMU)) {
> >> + fmt = ARM_64_LPAE_S1;
> >> + mmu->enable = mmu_lpae_s1_enable;
> >> + } else {
> >> + fmt = ARM_MALI_LPAE;
> >> + mmu->enable = mmu_mali_lpae_enable;
> >> + }
> >
> > How about we stick to the legacy pgtable format for all currently
> > supported GPUs, and make this an opt-in property attached to the
> > compatible. This way, we can progressively move away from the legacy
> > format once enough testing has been done, while allowing support for
> > GPUs that can't use the old format because the cacheability/shareability
> > configuration is too limited.
>
> Indeed, that's a better way to go.
>
> Specifically, what you mean is: keep the same compatible string and add
> a new property to the `panfrost_compatible` private data for that
> specific variant?
Exactly.
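
For illustration only, the opt-in approach agreed on above could look roughly
like the sketch below. It is a standalone userspace model, not driver code:
the types are simplified stand-ins for the kernel ones, and the
`use_aarch64_4k` field, the `lookup_compat()` and `select_fmt()` helpers and
the rule that both the opt-in and the hardware feature must be present are
assumptions made for the example. Only the "mediatek,mt8188-mali" compatible
string comes from this thread; the other table entry is a placeholder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
enum pgtable_fmt { FMT_MALI_LPAE, FMT_AARCH64_4K };

struct compat_data {
	/* Hypothetical opt-in flag, set only for variants that were tested. */
	bool use_aarch64_4k;
};

struct of_match {
	const char *compatible;
	const struct compat_data *data;
};

static const struct compat_data default_data = { .use_aarch64_4k = false };
static const struct compat_data mt8188_data = { .use_aarch64_4k = true };

static const struct of_match matches[] = {
	/* Placeholder entry for an existing GPU that stays on the legacy format. */
	{ .compatible = "arm,mali-t860", .data = &default_data },
	/* Variant that opts in to the AArch64 4K format. */
	{ .compatible = "mediatek,mt8188-mali", .data = &mt8188_data },
};

/* Return the private data attached to a compatible string, or NULL. */
static const struct compat_data *lookup_compat(const char *compatible)
{
	for (size_t i = 0; i < sizeof(matches) / sizeof(matches[0]); i++) {
		if (strcmp(matches[i].compatible, compatible) == 0)
			return matches[i].data;
	}
	return NULL;
}

/*
 * Pick the page table format. Assumption: the new format is used only when
 * the variant opted in AND the hardware advertises the AArch64 MMU feature;
 * everything else keeps the legacy format.
 */
static enum pgtable_fmt select_fmt(const struct compat_data *comp,
				   bool hw_has_aarch64_mmu)
{
	if (comp && comp->use_aarch64_4k && hw_has_aarch64_mmu)
		return FMT_AARCH64_4K;
	return FMT_MALI_LPAE;
}

/* Usage example: exercise the selection for both table entries. */
static void self_check(void)
{
	assert(select_fmt(lookup_compat("mediatek,mt8188-mali"), true) ==
	       FMT_AARCH64_4K);
	assert(select_fmt(lookup_compat("mediatek,mt8188-mali"), false) ==
	       FMT_MALI_LPAE);
	assert(select_fmt(lookup_compat("arm,mali-t860"), true) ==
	       FMT_MALI_LPAE);
}
```

Keeping the default at the legacy format means no currently supported GPU
changes behaviour until its compatible data is explicitly updated, which
matches the progressive rollout discussed in this thread.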
Thread overview: 17+ messages
2025-02-26 18:30 [RFC PATCH 0/4] drm/panfrost: Support ARM_64_LPAE_S1 page table Ariel D'Alessandro
2025-02-26 18:30 ` [RFC PATCH 1/4] drm/panfrost: Use GPU_MMU_FEATURES_VA_BITS/PA_BITS macros Ariel D'Alessandro
2025-02-27 8:21 ` Boris Brezillon
2025-02-27 14:44 ` Steven Price
2025-02-26 18:30 ` [RFC PATCH 2/4] drm/panfrost: Split LPAE MMU TRANSTAB register values Ariel D'Alessandro
2025-02-27 8:25 ` Boris Brezillon
2025-03-07 14:02 ` Ariel D'Alessandro
2025-02-26 18:30 ` [RFC PATCH 3/4] drm/panfrost: Support ARM_64_LPAE_S1 page table Ariel D'Alessandro
2025-02-27 8:30 ` Boris Brezillon
2025-02-27 8:32 ` Boris Brezillon
2025-03-07 14:42 ` Ariel D'Alessandro
2025-02-27 14:44 ` Steven Price
2025-03-10 15:46 ` Ariel D'Alessandro
2025-02-27 14:55 ` Boris Brezillon
2025-03-10 15:34 ` Ariel D'Alessandro
2025-03-10 19:25 ` Boris Brezillon
2025-02-26 18:30 ` [RFC PATCH 4/4] drm/panfrost: Set HW_FEATURE_AARCH64_MMU feature flag on Bifrost models Ariel D'Alessandro