* [PATCH v2 0/2] drm/panthor: Fix panthor+FEX-Emu
@ 2025-04-17 14:49 Boris Brezillon
2025-04-17 14:49 ` [PATCH v2 1/2] drm/panthor: Add missing explicit padding in drm_panthor_gpu_info Boris Brezillon
2025-04-17 14:49 ` [PATCH v2 2/2] drm/panthor: Fix the user MMIO offset logic for emulators Boris Brezillon
0 siblings, 2 replies; 9+ messages in thread
From: Boris Brezillon @ 2025-04-17 14:49 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Adrián Larumbe
Cc: dri-devel, kernel
Hello,
This is an attempt a couple bugs exposed by FEX-Emu. The first one
is pretty trivial and should be uncontroversial, since it's just
a missing padding field in one of our uAPI structs. We are getting
away with it on arm32 because of the alignment rules provided by
the Arm ABI, but x86 has relaxed constraints for u64 fields, and
this bug is definitely hit when running a 32-bit x86 mesa binary
under FEX Emu.
The second fix is addressing a problem we have because FEX-Emu is
an aarch64 process executing 32-bit x86 code, meaning the check
we do on the is-32bit-task check we do to figure out the MMIO
offset seen by the user won't work. In order to fix that, we add
an ioctl to let the user explicitly set this offset. The offset
can only be set early on, if no MMIO range has been mapped before.
With those, and the mesa MR at [1], I managed to run a 32-bit x86
glmark2 through FEX without using the host mesa (if we were to use
the thunked mesa lib, both the kernel and mesa would use
MMIO_OFFSET_64BIT, and the problem doesn't exist anymore).
Regards,
Boris
Changes in v2:
- Simplify the logic in patch2 to have a lockless solution that's
still safe for what we need
[1]https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34573
Boris Brezillon (2):
drm/panthor: Add missing explicit padding in drm_panthor_gpu_info
drm/panthor: Fix the user MMIO offset logic for emulators
drivers/gpu/drm/panthor/panthor_device.h | 18 ++++++++
drivers/gpu/drm/panthor/panthor_drv.c | 56 +++++++++++++++++-------
include/uapi/drm/panthor_drm.h | 41 +++++++++++++++++
3 files changed, 99 insertions(+), 16 deletions(-)
--
2.49.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v2 1/2] drm/panthor: Add missing explicit padding in drm_panthor_gpu_info
2025-04-17 14:49 [PATCH v2 0/2] drm/panthor: Fix panthor+FEX-Emu Boris Brezillon
@ 2025-04-17 14:49 ` Boris Brezillon
2025-04-17 15:07 ` Steven Price
2025-04-24 14:32 ` Adrián Larumbe
2025-04-17 14:49 ` [PATCH v2 2/2] drm/panthor: Fix the user MMIO offset logic for emulators Boris Brezillon
1 sibling, 2 replies; 9+ messages in thread
From: Boris Brezillon @ 2025-04-17 14:49 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Adrián Larumbe
Cc: dri-devel, kernel
drm_panthor_gpu_info::shader_present is currently automatically offset
by 4 byte to meet Arm's 32-bit/64-bit field alignment rules, but those
constraints don't stand on 32-bit x86 and cause a mismatch when running
an x86 binary in a user emulated environment like FEX. It's also
generally agreed that uAPIs should explicitly pad their struct fields,
which we originally intended to do, but a mistake slipped through during
the submission process, leading drm_panthor_gpu_info::shader_present to
be misaligned.
This uAPI change doesn't break any of the existing users of panthor
which are either arm32 or arm64 where the 64-bit alignment of
u64 fields is already enforced a the compiler level.
Changes in v2:
- Rename the garbage field into pad0 and adjust the comment accordingly
- Add Liviu's R-b
Fixes: 0f25e493a246 ("drm/panthor: Add uAPI")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
---
include/uapi/drm/panthor_drm.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
index 97e2c4510e69..dbb907eae443 100644
--- a/include/uapi/drm/panthor_drm.h
+++ b/include/uapi/drm/panthor_drm.h
@@ -293,6 +293,9 @@ struct drm_panthor_gpu_info {
/** @as_present: Bitmask encoding the number of address-space exposed by the MMU. */
__u32 as_present;
+ /** @pad0: MBZ. */
+ __u32 pad0;
+
/** @shader_present: Bitmask encoding the shader cores exposed by the GPU. */
__u64 shader_present;
--
2.49.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v2 2/2] drm/panthor: Fix the user MMIO offset logic for emulators
2025-04-17 14:49 [PATCH v2 0/2] drm/panthor: Fix panthor+FEX-Emu Boris Brezillon
2025-04-17 14:49 ` [PATCH v2 1/2] drm/panthor: Add missing explicit padding in drm_panthor_gpu_info Boris Brezillon
@ 2025-04-17 14:49 ` Boris Brezillon
2025-04-17 15:13 ` Steven Price
` (2 more replies)
1 sibling, 3 replies; 9+ messages in thread
From: Boris Brezillon @ 2025-04-17 14:49 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Adrián Larumbe
Cc: dri-devel, kernel
Currently, we pick the MMIO offset based on the size of the pgoff_t
type seen by the process that manipulates the FD, such that a 32-bit
process can always map the user MMIO ranges. But this approach doesn't
work well for emulators like FEX, where the emulator is a 64-bit binary
which might be executing 32-bit code. In that case, the kernel thinks
it's the 64-bit process and assumes DRM_PANTHOR_USER_MMIO_OFFSET_64BIT
is in use, but the UMD library expects DRM_PANTHOR_USER_MMIO_OFFSET_32BIT,
because it can't mmap() anything above the pgoff_t size.
In order to solve that, we need a way to explicitly set the user MMIO
offset from the UMD, such that the kernel doesn't have to guess it
from the TIF_32BIT flag set on user thread. We keep the old behavior
if DRM_PANTHOR_SET_USER_MMIO_OFFSET is never called.
Changes:
- Drop the lock/immutable fields and allow SET_USER_MMIO_OFFSET
requests to race with mmap() requests
- Don't do the is_user_mmio_offset test twice in panthor_mmap()
- Improve the uAPI docs
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panthor/panthor_device.h | 18 ++++++++
drivers/gpu/drm/panthor/panthor_drv.c | 56 +++++++++++++++++-------
include/uapi/drm/panthor_drm.h | 38 ++++++++++++++++
3 files changed, 96 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index 4c27b6d85f46..6d8c2d5042f2 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -219,6 +219,24 @@ struct panthor_file {
/** @ptdev: Device attached to this file. */
struct panthor_device *ptdev;
+ /** @user_mmio: User MMIO related fields. */
+ struct {
+ /**
+ * @offset: Offset used for user MMIO mappings.
+ *
+ * This offset should not be used to check the type of mapping
+ * except in panthor_mmap(). After that point, MMIO mapping
+ * offsets have been adjusted to match
+ * DRM_PANTHOR_USER_MMIO_OFFSET and this macro should be used
+ * instead.
+ * Make sure this rule is followed at all times, because
+ * userspace is in control of the offset, and can change the
+ * value behind out back, potentially leading to erronous
+ * branching happening in kernel space.
+ */
+ u64 offset;
+ } user_mmio;
+
/** @vms: VM pool attached to this file. */
struct panthor_vm_pool *vms;
diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
index 4d4a52a033f6..aedef2bfa7ac 100644
--- a/drivers/gpu/drm/panthor/panthor_drv.c
+++ b/drivers/gpu/drm/panthor/panthor_drv.c
@@ -1338,6 +1338,20 @@ static int panthor_ioctl_vm_get_state(struct drm_device *ddev, void *data,
return 0;
}
+static int panthor_ioctl_set_user_mmio_offset(struct drm_device *ddev,
+ void *data, struct drm_file *file)
+{
+ struct drm_panthor_set_user_mmio_offset *args = data;
+ struct panthor_file *pfile = file->driver_priv;
+
+ if (args->offset != DRM_PANTHOR_USER_MMIO_OFFSET_32BIT &&
+ args->offset != DRM_PANTHOR_USER_MMIO_OFFSET_64BIT)
+ return -EINVAL;
+
+ WRITE_ONCE(pfile->user_mmio.offset, args->offset);
+ return 0;
+}
+
static int
panthor_open(struct drm_device *ddev, struct drm_file *file)
{
@@ -1355,6 +1369,18 @@ panthor_open(struct drm_device *ddev, struct drm_file *file)
}
pfile->ptdev = ptdev;
+ pfile->user_mmio.offset = DRM_PANTHOR_USER_MMIO_OFFSET;
+
+#ifdef CONFIG_ARM64
+ /*
+ * With 32-bit systems being limited by the 32-bit representation of
+ * mmap2's pgoffset field, we need to make the MMIO offset arch
+ * specific.
+ */
+ if (test_tsk_thread_flag(current, TIF_32BIT))
+ pfile->user_mmio.offset = DRM_PANTHOR_USER_MMIO_OFFSET_32BIT;
+#endif
+
ret = panthor_vm_pool_create(pfile);
if (ret)
@@ -1407,6 +1433,7 @@ static const struct drm_ioctl_desc panthor_drm_driver_ioctls[] = {
PANTHOR_IOCTL(TILER_HEAP_CREATE, tiler_heap_create, DRM_RENDER_ALLOW),
PANTHOR_IOCTL(TILER_HEAP_DESTROY, tiler_heap_destroy, DRM_RENDER_ALLOW),
PANTHOR_IOCTL(GROUP_SUBMIT, group_submit, DRM_RENDER_ALLOW),
+ PANTHOR_IOCTL(SET_USER_MMIO_OFFSET, set_user_mmio_offset, DRM_RENDER_ALLOW),
};
static int panthor_mmap(struct file *filp, struct vm_area_struct *vma)
@@ -1415,30 +1442,26 @@ static int panthor_mmap(struct file *filp, struct vm_area_struct *vma)
struct panthor_file *pfile = file->driver_priv;
struct panthor_device *ptdev = pfile->ptdev;
u64 offset = (u64)vma->vm_pgoff << PAGE_SHIFT;
+ u64 user_mmio_offset;
int ret, cookie;
if (!drm_dev_enter(file->minor->dev, &cookie))
return -ENODEV;
-#ifdef CONFIG_ARM64
- /*
- * With 32-bit systems being limited by the 32-bit representation of
- * mmap2's pgoffset field, we need to make the MMIO offset arch
- * specific. This converts a user MMIO offset into something the kernel
- * driver understands.
+ /* Adjust the user MMIO offset to match the offset used kernel side.
+ * We use a local variable with a READ_ONCE() here to make sure
+ * the user_mmio_offset we use for the is_user_mmio_mapping() check
+ * hasn't changed when we do the offset adjustment.
*/
- if (test_tsk_thread_flag(current, TIF_32BIT) &&
- offset >= DRM_PANTHOR_USER_MMIO_OFFSET_32BIT) {
- offset += DRM_PANTHOR_USER_MMIO_OFFSET_64BIT -
- DRM_PANTHOR_USER_MMIO_OFFSET_32BIT;
+ user_mmio_offset = READ_ONCE(pfile->user_mmio.offset);
+ if (offset >= user_mmio_offset) {
+ offset -= user_mmio_offset;
+ offset += DRM_PANTHOR_USER_MMIO_OFFSET;
vma->vm_pgoff = offset >> PAGE_SHIFT;
- }
-#endif
-
- if (offset >= DRM_PANTHOR_USER_MMIO_OFFSET)
ret = panthor_device_mmap_io(ptdev, vma);
- else
+ } else {
ret = drm_gem_mmap(filp, vma);
+ }
drm_dev_exit(cookie);
return ret;
@@ -1516,6 +1539,7 @@ static void panthor_debugfs_init(struct drm_minor *minor)
* - 1.2 - adds DEV_QUERY_GROUP_PRIORITIES_INFO query
* - adds PANTHOR_GROUP_PRIORITY_REALTIME priority
* - 1.3 - adds DRM_PANTHOR_GROUP_STATE_INNOCENT flag
+ * - 1.4 - adds DRM_PANTHOR_SET_USER_MMIO_OFFSET ioctl
*/
static const struct drm_driver panthor_drm_driver = {
.driver_features = DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ |
@@ -1529,7 +1553,7 @@ static const struct drm_driver panthor_drm_driver = {
.name = "panthor",
.desc = "Panthor DRM driver",
.major = 1,
- .minor = 3,
+ .minor = 4,
.gem_create_object = panthor_gem_create_object,
.gem_prime_import_sg_table = drm_gem_shmem_prime_import_sg_table,
diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
index dbb907eae443..1d1282f2c9fa 100644
--- a/include/uapi/drm/panthor_drm.h
+++ b/include/uapi/drm/panthor_drm.h
@@ -127,6 +127,20 @@ enum drm_panthor_ioctl_id {
/** @DRM_PANTHOR_TILER_HEAP_DESTROY: Destroy a tiler heap. */
DRM_PANTHOR_TILER_HEAP_DESTROY,
+
+ /**
+ * @DRM_PANTHOR_SET_USER_MMIO_OFFSET: Set the offset to use as the user MMIO offset.
+ *
+ * The default behavior is to pick the MMIO offset based on the size of the pgoff_t
+ * type seen by the process that manipulates the FD, such that a 32-bit process can
+ * always map the user MMIO ranges. But this approach doesn't work well for emulators
+ * like FEX, where the emulator is an 64-bit binary which might be executing 32-bit
+ * code. In that case, the kernel thinks it's the 64-bit process and assumes
+ * DRM_PANTHOR_USER_MMIO_OFFSET_64BIT is in use, but the UMD library expects
+ * DRM_PANTHOR_USER_MMIO_OFFSET_32BIT, because it can't mmap() anything above the
+ * pgoff_t size.
+ */
+ DRM_PANTHOR_SET_USER_MMIO_OFFSET,
};
/**
@@ -980,6 +994,28 @@ struct drm_panthor_tiler_heap_destroy {
__u32 pad;
};
+/**
+ * struct drm_panthor_set_user_mmio_offset - Arguments passed to
+ * DRM_IOCTL_PANTHOR_SET_USER_MMIO_OFFSET
+ *
+ * This ioctl is only really useful if you want to support userspace
+ * CPU emulation environments where the size of an unsigned long differs
+ * between the host and the guest architectures.
+ */
+struct drm_panthor_set_user_mmio_offset {
+ /**
+ * @offset: User MMIO offset to use.
+ *
+ * Must be either DRM_PANTHOR_USER_MMIO_OFFSET_32BIT or
+ * DRM_PANTHOR_USER_MMIO_OFFSET_64BIT.
+ *
+ * Use DRM_PANTHOR_USER_MMIO_OFFSET (which selects OFFSET_32BIT or
+ * OFFSET_64BIT based on the size of an unsigned long) unless you
+ * have a very good reason to overrule this decision.
+ */
+ __u64 offset;
+};
+
/**
* DRM_IOCTL_PANTHOR() - Build a Panthor IOCTL number
* @__access: Access type. Must be R, W or RW.
@@ -1022,6 +1058,8 @@ enum {
DRM_IOCTL_PANTHOR(WR, TILER_HEAP_CREATE, tiler_heap_create),
DRM_IOCTL_PANTHOR_TILER_HEAP_DESTROY =
DRM_IOCTL_PANTHOR(WR, TILER_HEAP_DESTROY, tiler_heap_destroy),
+ DRM_IOCTL_PANTHOR_SET_USER_MMIO_OFFSET =
+ DRM_IOCTL_PANTHOR(WR, SET_USER_MMIO_OFFSET, set_user_mmio_offset),
};
#if defined(__cplusplus)
--
2.49.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] drm/panthor: Add missing explicit padding in drm_panthor_gpu_info
2025-04-17 14:49 ` [PATCH v2 1/2] drm/panthor: Add missing explicit padding in drm_panthor_gpu_info Boris Brezillon
@ 2025-04-17 15:07 ` Steven Price
2025-04-24 14:32 ` Adrián Larumbe
1 sibling, 0 replies; 9+ messages in thread
From: Steven Price @ 2025-04-17 15:07 UTC (permalink / raw)
To: Boris Brezillon, Liviu Dudau, Adrián Larumbe; +Cc: dri-devel, kernel
On 17/04/2025 15:49, Boris Brezillon wrote:
> drm_panthor_gpu_info::shader_present is currently automatically offset
> by 4 byte to meet Arm's 32-bit/64-bit field alignment rules, but those
> constraints don't stand on 32-bit x86 and cause a mismatch when running
> an x86 binary in a user emulated environment like FEX. It's also
> generally agreed that uAPIs should explicitly pad their struct fields,
> which we originally intended to do, but a mistake slipped through during
> the submission process, leading drm_panthor_gpu_info::shader_present to
> be misaligned.
>
> This uAPI change doesn't break any of the existing users of panthor
> which are either arm32 or arm64 where the 64-bit alignment of
> u64 fields is already enforced a the compiler level.
>
> Changes in v2:
> - Rename the garbage field into pad0 and adjust the comment accordingly
> - Add Liviu's R-b
>
> Fixes: 0f25e493a246 ("drm/panthor: Add uAPI")
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
> ---
> include/uapi/drm/panthor_drm.h | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> index 97e2c4510e69..dbb907eae443 100644
> --- a/include/uapi/drm/panthor_drm.h
> +++ b/include/uapi/drm/panthor_drm.h
> @@ -293,6 +293,9 @@ struct drm_panthor_gpu_info {
> /** @as_present: Bitmask encoding the number of address-space exposed by the MMU. */
> __u32 as_present;
>
> + /** @pad0: MBZ. */
> + __u32 pad0;
> +
> /** @shader_present: Bitmask encoding the shader cores exposed by the GPU. */
> __u64 shader_present;
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 2/2] drm/panthor: Fix the user MMIO offset logic for emulators
2025-04-17 14:49 ` [PATCH v2 2/2] drm/panthor: Fix the user MMIO offset logic for emulators Boris Brezillon
@ 2025-04-17 15:13 ` Steven Price
2025-04-17 15:37 ` Boris Brezillon
2025-04-22 9:20 ` Liviu Dudau
2025-04-24 14:31 ` Adrián Larumbe
2 siblings, 1 reply; 9+ messages in thread
From: Steven Price @ 2025-04-17 15:13 UTC (permalink / raw)
To: Boris Brezillon, Liviu Dudau, Adrián Larumbe; +Cc: dri-devel, kernel
On 17/04/2025 15:49, Boris Brezillon wrote:
> Currently, we pick the MMIO offset based on the size of the pgoff_t
> type seen by the process that manipulates the FD, such that a 32-bit
> process can always map the user MMIO ranges. But this approach doesn't
> work well for emulators like FEX, where the emulator is a 64-bit binary
> which might be executing 32-bit code. In that case, the kernel thinks
> it's the 64-bit process and assumes DRM_PANTHOR_USER_MMIO_OFFSET_64BIT
> is in use, but the UMD library expects DRM_PANTHOR_USER_MMIO_OFFSET_32BIT,
> because it can't mmap() anything above the pgoff_t size.
>
> In order to solve that, we need a way to explicitly set the user MMIO
> offset from the UMD, such that the kernel doesn't have to guess it
> from the TIF_32BIT flag set on user thread. We keep the old behavior
> if DRM_PANTHOR_SET_USER_MMIO_OFFSET is never called.
>
> Changes:
> - Drop the lock/immutable fields and allow SET_USER_MMIO_OFFSET
> requests to race with mmap() requests
> - Don't do the is_user_mmio_offset test twice in panthor_mmap()
> - Improve the uAPI docs
>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Much nicer, thanks!
Reviewed-by: Steven Price <steven.price@arm.com>
One note for merging - both this and Adrián's series are introducing the
new 1.4 version. So we either need to switch one to 1.5 or combine the
series.
Thanks,
Steve
> ---
> drivers/gpu/drm/panthor/panthor_device.h | 18 ++++++++
> drivers/gpu/drm/panthor/panthor_drv.c | 56 +++++++++++++++++-------
> include/uapi/drm/panthor_drm.h | 38 ++++++++++++++++
> 3 files changed, 96 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index 4c27b6d85f46..6d8c2d5042f2 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -219,6 +219,24 @@ struct panthor_file {
> /** @ptdev: Device attached to this file. */
> struct panthor_device *ptdev;
>
> + /** @user_mmio: User MMIO related fields. */
> + struct {
> + /**
> + * @offset: Offset used for user MMIO mappings.
> + *
> + * This offset should not be used to check the type of mapping
> + * except in panthor_mmap(). After that point, MMIO mapping
> + * offsets have been adjusted to match
> + * DRM_PANTHOR_USER_MMIO_OFFSET and this macro should be used
> + * instead.
> + * Make sure this rule is followed at all times, because
> + * userspace is in control of the offset, and can change the
> + * value behind out back, potentially leading to erronous
> + * branching happening in kernel space.
> + */
> + u64 offset;
> + } user_mmio;
> +
> /** @vms: VM pool attached to this file. */
> struct panthor_vm_pool *vms;
>
> diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
> index 4d4a52a033f6..aedef2bfa7ac 100644
> --- a/drivers/gpu/drm/panthor/panthor_drv.c
> +++ b/drivers/gpu/drm/panthor/panthor_drv.c
> @@ -1338,6 +1338,20 @@ static int panthor_ioctl_vm_get_state(struct drm_device *ddev, void *data,
> return 0;
> }
>
> +static int panthor_ioctl_set_user_mmio_offset(struct drm_device *ddev,
> + void *data, struct drm_file *file)
> +{
> + struct drm_panthor_set_user_mmio_offset *args = data;
> + struct panthor_file *pfile = file->driver_priv;
> +
> + if (args->offset != DRM_PANTHOR_USER_MMIO_OFFSET_32BIT &&
> + args->offset != DRM_PANTHOR_USER_MMIO_OFFSET_64BIT)
> + return -EINVAL;
> +
> + WRITE_ONCE(pfile->user_mmio.offset, args->offset);
> + return 0;
> +}
> +
> static int
> panthor_open(struct drm_device *ddev, struct drm_file *file)
> {
> @@ -1355,6 +1369,18 @@ panthor_open(struct drm_device *ddev, struct drm_file *file)
> }
>
> pfile->ptdev = ptdev;
> + pfile->user_mmio.offset = DRM_PANTHOR_USER_MMIO_OFFSET;
> +
> +#ifdef CONFIG_ARM64
> + /*
> + * With 32-bit systems being limited by the 32-bit representation of
> + * mmap2's pgoffset field, we need to make the MMIO offset arch
> + * specific.
> + */
> + if (test_tsk_thread_flag(current, TIF_32BIT))
> + pfile->user_mmio.offset = DRM_PANTHOR_USER_MMIO_OFFSET_32BIT;
> +#endif
> +
>
> ret = panthor_vm_pool_create(pfile);
> if (ret)
> @@ -1407,6 +1433,7 @@ static const struct drm_ioctl_desc panthor_drm_driver_ioctls[] = {
> PANTHOR_IOCTL(TILER_HEAP_CREATE, tiler_heap_create, DRM_RENDER_ALLOW),
> PANTHOR_IOCTL(TILER_HEAP_DESTROY, tiler_heap_destroy, DRM_RENDER_ALLOW),
> PANTHOR_IOCTL(GROUP_SUBMIT, group_submit, DRM_RENDER_ALLOW),
> + PANTHOR_IOCTL(SET_USER_MMIO_OFFSET, set_user_mmio_offset, DRM_RENDER_ALLOW),
> };
>
> static int panthor_mmap(struct file *filp, struct vm_area_struct *vma)
> @@ -1415,30 +1442,26 @@ static int panthor_mmap(struct file *filp, struct vm_area_struct *vma)
> struct panthor_file *pfile = file->driver_priv;
> struct panthor_device *ptdev = pfile->ptdev;
> u64 offset = (u64)vma->vm_pgoff << PAGE_SHIFT;
> + u64 user_mmio_offset;
> int ret, cookie;
>
> if (!drm_dev_enter(file->minor->dev, &cookie))
> return -ENODEV;
>
> -#ifdef CONFIG_ARM64
> - /*
> - * With 32-bit systems being limited by the 32-bit representation of
> - * mmap2's pgoffset field, we need to make the MMIO offset arch
> - * specific. This converts a user MMIO offset into something the kernel
> - * driver understands.
> + /* Adjust the user MMIO offset to match the offset used kernel side.
> + * We use a local variable with a READ_ONCE() here to make sure
> + * the user_mmio_offset we use for the is_user_mmio_mapping() check
> + * hasn't changed when we do the offset adjustment.
> */
> - if (test_tsk_thread_flag(current, TIF_32BIT) &&
> - offset >= DRM_PANTHOR_USER_MMIO_OFFSET_32BIT) {
> - offset += DRM_PANTHOR_USER_MMIO_OFFSET_64BIT -
> - DRM_PANTHOR_USER_MMIO_OFFSET_32BIT;
> + user_mmio_offset = READ_ONCE(pfile->user_mmio.offset);
> + if (offset >= user_mmio_offset) {
> + offset -= user_mmio_offset;
> + offset += DRM_PANTHOR_USER_MMIO_OFFSET;
> vma->vm_pgoff = offset >> PAGE_SHIFT;
> - }
> -#endif
> -
> - if (offset >= DRM_PANTHOR_USER_MMIO_OFFSET)
> ret = panthor_device_mmap_io(ptdev, vma);
> - else
> + } else {
> ret = drm_gem_mmap(filp, vma);
> + }
>
> drm_dev_exit(cookie);
> return ret;
> @@ -1516,6 +1539,7 @@ static void panthor_debugfs_init(struct drm_minor *minor)
> * - 1.2 - adds DEV_QUERY_GROUP_PRIORITIES_INFO query
> * - adds PANTHOR_GROUP_PRIORITY_REALTIME priority
> * - 1.3 - adds DRM_PANTHOR_GROUP_STATE_INNOCENT flag
> + * - 1.4 - adds DRM_PANTHOR_SET_USER_MMIO_OFFSET ioctl
> */
> static const struct drm_driver panthor_drm_driver = {
> .driver_features = DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ |
> @@ -1529,7 +1553,7 @@ static const struct drm_driver panthor_drm_driver = {
> .name = "panthor",
> .desc = "Panthor DRM driver",
> .major = 1,
> - .minor = 3,
> + .minor = 4,
>
> .gem_create_object = panthor_gem_create_object,
> .gem_prime_import_sg_table = drm_gem_shmem_prime_import_sg_table,
> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> index dbb907eae443..1d1282f2c9fa 100644
> --- a/include/uapi/drm/panthor_drm.h
> +++ b/include/uapi/drm/panthor_drm.h
> @@ -127,6 +127,20 @@ enum drm_panthor_ioctl_id {
>
> /** @DRM_PANTHOR_TILER_HEAP_DESTROY: Destroy a tiler heap. */
> DRM_PANTHOR_TILER_HEAP_DESTROY,
> +
> + /**
> + * @DRM_PANTHOR_SET_USER_MMIO_OFFSET: Set the offset to use as the user MMIO offset.
> + *
> + * The default behavior is to pick the MMIO offset based on the size of the pgoff_t
> + * type seen by the process that manipulates the FD, such that a 32-bit process can
> + * always map the user MMIO ranges. But this approach doesn't work well for emulators
> + * like FEX, where the emulator is an 64-bit binary which might be executing 32-bit
> + * code. In that case, the kernel thinks it's the 64-bit process and assumes
> + * DRM_PANTHOR_USER_MMIO_OFFSET_64BIT is in use, but the UMD library expects
> + * DRM_PANTHOR_USER_MMIO_OFFSET_32BIT, because it can't mmap() anything above the
> + * pgoff_t size.
> + */
> + DRM_PANTHOR_SET_USER_MMIO_OFFSET,
> };
>
> /**
> @@ -980,6 +994,28 @@ struct drm_panthor_tiler_heap_destroy {
> __u32 pad;
> };
>
> +/**
> + * struct drm_panthor_set_user_mmio_offset - Arguments passed to
> + * DRM_IOCTL_PANTHOR_SET_USER_MMIO_OFFSET
> + *
> + * This ioctl is only really useful if you want to support userspace
> + * CPU emulation environments where the size of an unsigned long differs
> + * between the host and the guest architectures.
> + */
> +struct drm_panthor_set_user_mmio_offset {
> + /**
> + * @offset: User MMIO offset to use.
> + *
> + * Must be either DRM_PANTHOR_USER_MMIO_OFFSET_32BIT or
> + * DRM_PANTHOR_USER_MMIO_OFFSET_64BIT.
> + *
> + * Use DRM_PANTHOR_USER_MMIO_OFFSET (which selects OFFSET_32BIT or
> + * OFFSET_64BIT based on the size of an unsigned long) unless you
> + * have a very good reason to overrule this decision.
> + */
> + __u64 offset;
> +};
> +
> /**
> * DRM_IOCTL_PANTHOR() - Build a Panthor IOCTL number
> * @__access: Access type. Must be R, W or RW.
> @@ -1022,6 +1058,8 @@ enum {
> DRM_IOCTL_PANTHOR(WR, TILER_HEAP_CREATE, tiler_heap_create),
> DRM_IOCTL_PANTHOR_TILER_HEAP_DESTROY =
> DRM_IOCTL_PANTHOR(WR, TILER_HEAP_DESTROY, tiler_heap_destroy),
> + DRM_IOCTL_PANTHOR_SET_USER_MMIO_OFFSET =
> + DRM_IOCTL_PANTHOR(WR, SET_USER_MMIO_OFFSET, set_user_mmio_offset),
> };
>
> #if defined(__cplusplus)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 2/2] drm/panthor: Fix the user MMIO offset logic for emulators
2025-04-17 15:13 ` Steven Price
@ 2025-04-17 15:37 ` Boris Brezillon
0 siblings, 0 replies; 9+ messages in thread
From: Boris Brezillon @ 2025-04-17 15:37 UTC (permalink / raw)
To: Steven Price; +Cc: Liviu Dudau, Adrián Larumbe, dri-devel, kernel
On Thu, 17 Apr 2025 16:13:49 +0100
Steven Price <steven.price@arm.com> wrote:
> On 17/04/2025 15:49, Boris Brezillon wrote:
> > Currently, we pick the MMIO offset based on the size of the pgoff_t
> > type seen by the process that manipulates the FD, such that a 32-bit
> > process can always map the user MMIO ranges. But this approach doesn't
> > work well for emulators like FEX, where the emulator is a 64-bit binary
> > which might be executing 32-bit code. In that case, the kernel thinks
> > it's the 64-bit process and assumes DRM_PANTHOR_USER_MMIO_OFFSET_64BIT
> > is in use, but the UMD library expects DRM_PANTHOR_USER_MMIO_OFFSET_32BIT,
> > because it can't mmap() anything above the pgoff_t size.
> >
> > In order to solve that, we need a way to explicitly set the user MMIO
> > offset from the UMD, such that the kernel doesn't have to guess it
> > from the TIF_32BIT flag set on user thread. We keep the old behavior
> > if DRM_PANTHOR_SET_USER_MMIO_OFFSET is never called.
> >
> > Changes:
> > - Drop the lock/immutable fields and allow SET_USER_MMIO_OFFSET
> > requests to race with mmap() requests
> > - Don't do the is_user_mmio_offset test twice in panthor_mmap()
> > - Improve the uAPI docs
> >
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
>
> Much nicer, thanks!
>
> Reviewed-by: Steven Price <steven.price@arm.com>
>
> One note for merging - both this and Adrián's series are introducing the
> new 1.4 version. So we either need to switch one to 1.5 or combine the
> series.
I'll let Adrian series go first. I want to leave some time for others
to chime in anyway.
Thanks for the reviews/suggestions.
Boris
>
> Thanks,
> Steve
>
> > ---
> > drivers/gpu/drm/panthor/panthor_device.h | 18 ++++++++
> > drivers/gpu/drm/panthor/panthor_drv.c | 56 +++++++++++++++++-------
> > include/uapi/drm/panthor_drm.h | 38 ++++++++++++++++
> > 3 files changed, 96 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> > index 4c27b6d85f46..6d8c2d5042f2 100644
> > --- a/drivers/gpu/drm/panthor/panthor_device.h
> > +++ b/drivers/gpu/drm/panthor/panthor_device.h
> > @@ -219,6 +219,24 @@ struct panthor_file {
> > /** @ptdev: Device attached to this file. */
> > struct panthor_device *ptdev;
> >
> > + /** @user_mmio: User MMIO related fields. */
> > + struct {
> > + /**
> > + * @offset: Offset used for user MMIO mappings.
> > + *
> > + * This offset should not be used to check the type of mapping
> > + * except in panthor_mmap(). After that point, MMIO mapping
> > + * offsets have been adjusted to match
> > + * DRM_PANTHOR_USER_MMIO_OFFSET and this macro should be used
> > + * instead.
> > + * Make sure this rule is followed at all times, because
> > + * userspace is in control of the offset, and can change the
> > + * value behind out back, potentially leading to erronous
Oops, typo here ^ our
> > + * branching happening in kernel space.
> > + */
> > + u64 offset;
> > + } user_mmio;
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 2/2] drm/panthor: Fix the user MMIO offset logic for emulators
2025-04-17 14:49 ` [PATCH v2 2/2] drm/panthor: Fix the user MMIO offset logic for emulators Boris Brezillon
2025-04-17 15:13 ` Steven Price
@ 2025-04-22 9:20 ` Liviu Dudau
2025-04-24 14:31 ` Adrián Larumbe
2 siblings, 0 replies; 9+ messages in thread
From: Liviu Dudau @ 2025-04-22 9:20 UTC (permalink / raw)
To: Boris Brezillon; +Cc: Steven Price, Adrián Larumbe, dri-devel, kernel
On Thu, Apr 17, 2025 at 04:49:07PM +0200, Boris Brezillon wrote:
> Currently, we pick the MMIO offset based on the size of the pgoff_t
> type seen by the process that manipulates the FD, such that a 32-bit
> process can always map the user MMIO ranges. But this approach doesn't
> work well for emulators like FEX, where the emulator is a 64-bit binary
> which might be executing 32-bit code. In that case, the kernel thinks
> it's the 64-bit process and assumes DRM_PANTHOR_USER_MMIO_OFFSET_64BIT
> is in use, but the UMD library expects DRM_PANTHOR_USER_MMIO_OFFSET_32BIT,
> because it can't mmap() anything above the pgoff_t size.
>
> In order to solve that, we need a way to explicitly set the user MMIO
> offset from the UMD, such that the kernel doesn't have to guess it
> from the TIF_32BIT flag set on user thread. We keep the old behavior
> if DRM_PANTHOR_SET_USER_MMIO_OFFSET is never called.
>
> Changes:
> - Drop the lock/immutable fields and allow SET_USER_MMIO_OFFSET
> requests to race with mmap() requests
> - Don't do the is_user_mmio_offset test twice in panthor_mmap()
> - Improve the uAPI docs
>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
> drivers/gpu/drm/panthor/panthor_device.h | 18 ++++++++
> drivers/gpu/drm/panthor/panthor_drv.c | 56 +++++++++++++++++-------
> include/uapi/drm/panthor_drm.h | 38 ++++++++++++++++
> 3 files changed, 96 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index 4c27b6d85f46..6d8c2d5042f2 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -219,6 +219,24 @@ struct panthor_file {
> /** @ptdev: Device attached to this file. */
> struct panthor_device *ptdev;
>
> + /** @user_mmio: User MMIO related fields. */
> + struct {
> + /**
> + * @offset: Offset used for user MMIO mappings.
> + *
> + * This offset should not be used to check the type of mapping
> + * except in panthor_mmap(). After that point, MMIO mapping
> + * offsets have been adjusted to match
> + * DRM_PANTHOR_USER_MMIO_OFFSET and this macro should be used
Small nit: "that" macro, rather than "this". Given that comment starts with "This offset"
I was expecting some other local macro.
> + * instead.
> + * Make sure this rule is followed at all times, because
> + * userspace is in control of the offset, and can change the
> + * value behind out back, potentially leading to erronous
s/out/our/, s/erronous/erroneous/
> + * branching happening in kernel space.
Is "potentially" here still valid if someone follows the rule? Maybe
s/back, potentially leading/back. Otherwise it can lead/ ?
Otherwise, LGTM!
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Best regards,
Liviu
> + */
> + u64 offset;
> + } user_mmio;
> +
> /** @vms: VM pool attached to this file. */
> struct panthor_vm_pool *vms;
>
> diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
> index 4d4a52a033f6..aedef2bfa7ac 100644
> --- a/drivers/gpu/drm/panthor/panthor_drv.c
> +++ b/drivers/gpu/drm/panthor/panthor_drv.c
> @@ -1338,6 +1338,20 @@ static int panthor_ioctl_vm_get_state(struct drm_device *ddev, void *data,
> return 0;
> }
>
> +static int panthor_ioctl_set_user_mmio_offset(struct drm_device *ddev,
> + void *data, struct drm_file *file)
> +{
> + struct drm_panthor_set_user_mmio_offset *args = data;
> + struct panthor_file *pfile = file->driver_priv;
> +
> + if (args->offset != DRM_PANTHOR_USER_MMIO_OFFSET_32BIT &&
> + args->offset != DRM_PANTHOR_USER_MMIO_OFFSET_64BIT)
> + return -EINVAL;
> +
> + WRITE_ONCE(pfile->user_mmio.offset, args->offset);
> + return 0;
> +}
> +
> static int
> panthor_open(struct drm_device *ddev, struct drm_file *file)
> {
> @@ -1355,6 +1369,18 @@ panthor_open(struct drm_device *ddev, struct drm_file *file)
> }
>
> pfile->ptdev = ptdev;
> + pfile->user_mmio.offset = DRM_PANTHOR_USER_MMIO_OFFSET;
> +
> +#ifdef CONFIG_ARM64
> + /*
> + * With 32-bit systems being limited by the 32-bit representation of
> + * mmap2's pgoffset field, we need to make the MMIO offset arch
> + * specific.
> + */
> + if (test_tsk_thread_flag(current, TIF_32BIT))
> + pfile->user_mmio.offset = DRM_PANTHOR_USER_MMIO_OFFSET_32BIT;
> +#endif
> +
>
> ret = panthor_vm_pool_create(pfile);
> if (ret)
> @@ -1407,6 +1433,7 @@ static const struct drm_ioctl_desc panthor_drm_driver_ioctls[] = {
> PANTHOR_IOCTL(TILER_HEAP_CREATE, tiler_heap_create, DRM_RENDER_ALLOW),
> PANTHOR_IOCTL(TILER_HEAP_DESTROY, tiler_heap_destroy, DRM_RENDER_ALLOW),
> PANTHOR_IOCTL(GROUP_SUBMIT, group_submit, DRM_RENDER_ALLOW),
> + PANTHOR_IOCTL(SET_USER_MMIO_OFFSET, set_user_mmio_offset, DRM_RENDER_ALLOW),
> };
>
> static int panthor_mmap(struct file *filp, struct vm_area_struct *vma)
> @@ -1415,30 +1442,26 @@ static int panthor_mmap(struct file *filp, struct vm_area_struct *vma)
> struct panthor_file *pfile = file->driver_priv;
> struct panthor_device *ptdev = pfile->ptdev;
> u64 offset = (u64)vma->vm_pgoff << PAGE_SHIFT;
> + u64 user_mmio_offset;
> int ret, cookie;
>
> if (!drm_dev_enter(file->minor->dev, &cookie))
> return -ENODEV;
>
> -#ifdef CONFIG_ARM64
> - /*
> - * With 32-bit systems being limited by the 32-bit representation of
> - * mmap2's pgoffset field, we need to make the MMIO offset arch
> - * specific. This converts a user MMIO offset into something the kernel
> - * driver understands.
> + /* Adjust the user MMIO offset to match the offset used kernel side.
> + * We use a local variable with a READ_ONCE() here to make sure
> + * the user_mmio_offset we use for the is_user_mmio_mapping() check
> + * hasn't changed when we do the offset adjustment.
> */
> - if (test_tsk_thread_flag(current, TIF_32BIT) &&
> - offset >= DRM_PANTHOR_USER_MMIO_OFFSET_32BIT) {
> - offset += DRM_PANTHOR_USER_MMIO_OFFSET_64BIT -
> - DRM_PANTHOR_USER_MMIO_OFFSET_32BIT;
> + user_mmio_offset = READ_ONCE(pfile->user_mmio.offset);
> + if (offset >= user_mmio_offset) {
> + offset -= user_mmio_offset;
> + offset += DRM_PANTHOR_USER_MMIO_OFFSET;
> vma->vm_pgoff = offset >> PAGE_SHIFT;
> - }
> -#endif
> -
> - if (offset >= DRM_PANTHOR_USER_MMIO_OFFSET)
> ret = panthor_device_mmap_io(ptdev, vma);
> - else
> + } else {
> ret = drm_gem_mmap(filp, vma);
> + }
>
> drm_dev_exit(cookie);
> return ret;
> @@ -1516,6 +1539,7 @@ static void panthor_debugfs_init(struct drm_minor *minor)
> * - 1.2 - adds DEV_QUERY_GROUP_PRIORITIES_INFO query
> * - adds PANTHOR_GROUP_PRIORITY_REALTIME priority
> * - 1.3 - adds DRM_PANTHOR_GROUP_STATE_INNOCENT flag
> + * - 1.4 - adds DRM_PANTHOR_SET_USER_MMIO_OFFSET ioctl
> */
> static const struct drm_driver panthor_drm_driver = {
> .driver_features = DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ |
> @@ -1529,7 +1553,7 @@ static const struct drm_driver panthor_drm_driver = {
> .name = "panthor",
> .desc = "Panthor DRM driver",
> .major = 1,
> - .minor = 3,
> + .minor = 4,
>
> .gem_create_object = panthor_gem_create_object,
> .gem_prime_import_sg_table = drm_gem_shmem_prime_import_sg_table,
> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> index dbb907eae443..1d1282f2c9fa 100644
> --- a/include/uapi/drm/panthor_drm.h
> +++ b/include/uapi/drm/panthor_drm.h
> @@ -127,6 +127,20 @@ enum drm_panthor_ioctl_id {
>
> /** @DRM_PANTHOR_TILER_HEAP_DESTROY: Destroy a tiler heap. */
> DRM_PANTHOR_TILER_HEAP_DESTROY,
> +
> + /**
> + * @DRM_PANTHOR_SET_USER_MMIO_OFFSET: Set the offset to use as the user MMIO offset.
> + *
> + * The default behavior is to pick the MMIO offset based on the size of the pgoff_t
> + * type seen by the process that manipulates the FD, such that a 32-bit process can
> + * always map the user MMIO ranges. But this approach doesn't work well for emulators
> + * like FEX, where the emulator is an 64-bit binary which might be executing 32-bit
> + * code. In that case, the kernel thinks it's the 64-bit process and assumes
> + * DRM_PANTHOR_USER_MMIO_OFFSET_64BIT is in use, but the UMD library expects
> + * DRM_PANTHOR_USER_MMIO_OFFSET_32BIT, because it can't mmap() anything above the
> + * pgoff_t size.
> + */
> + DRM_PANTHOR_SET_USER_MMIO_OFFSET,
> };
>
> /**
> @@ -980,6 +994,28 @@ struct drm_panthor_tiler_heap_destroy {
> __u32 pad;
> };
>
> +/**
> + * struct drm_panthor_set_user_mmio_offset - Arguments passed to
> + * DRM_IOCTL_PANTHOR_SET_USER_MMIO_OFFSET
> + *
> + * This ioctl is only really useful if you want to support userspace
> + * CPU emulation environments where the size of an unsigned long differs
> + * between the host and the guest architectures.
> + */
> +struct drm_panthor_set_user_mmio_offset {
> + /**
> + * @offset: User MMIO offset to use.
> + *
> + * Must be either DRM_PANTHOR_USER_MMIO_OFFSET_32BIT or
> + * DRM_PANTHOR_USER_MMIO_OFFSET_64BIT.
> + *
> + * Use DRM_PANTHOR_USER_MMIO_OFFSET (which selects OFFSET_32BIT or
> + * OFFSET_64BIT based on the size of an unsigned long) unless you
> + * have a very good reason to overrule this decision.
> + */
> + __u64 offset;
> +};
> +
> /**
> * DRM_IOCTL_PANTHOR() - Build a Panthor IOCTL number
> * @__access: Access type. Must be R, W or RW.
> @@ -1022,6 +1058,8 @@ enum {
> DRM_IOCTL_PANTHOR(WR, TILER_HEAP_CREATE, tiler_heap_create),
> DRM_IOCTL_PANTHOR_TILER_HEAP_DESTROY =
> DRM_IOCTL_PANTHOR(WR, TILER_HEAP_DESTROY, tiler_heap_destroy),
> + DRM_IOCTL_PANTHOR_SET_USER_MMIO_OFFSET =
> + DRM_IOCTL_PANTHOR(WR, SET_USER_MMIO_OFFSET, set_user_mmio_offset),
> };
>
> #if defined(__cplusplus)
> --
> 2.49.0
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 2/2] drm/panthor: Fix the user MMIO offset logic for emulators
2025-04-17 14:49 ` [PATCH v2 2/2] drm/panthor: Fix the user MMIO offset logic for emulators Boris Brezillon
2025-04-17 15:13 ` Steven Price
2025-04-22 9:20 ` Liviu Dudau
@ 2025-04-24 14:31 ` Adrián Larumbe
2 siblings, 0 replies; 9+ messages in thread
From: Adrián Larumbe @ 2025-04-24 14:31 UTC (permalink / raw)
To: Boris Brezillon; +Cc: Steven Price, Liviu Dudau, dri-devel, kernel
On 17.04.2025 16:49, Boris Brezillon wrote:
> Currently, we pick the MMIO offset based on the size of the pgoff_t
> type seen by the process that manipulates the FD, such that a 32-bit
> process can always map the user MMIO ranges. But this approach doesn't
> work well for emulators like FEX, where the emulator is a 64-bit binary
> which might be executing 32-bit code. In that case, the kernel thinks
> it's the 64-bit process and assumes DRM_PANTHOR_USER_MMIO_OFFSET_64BIT
> is in use, but the UMD library expects DRM_PANTHOR_USER_MMIO_OFFSET_32BIT,
> because it can't mmap() anything above the pgoff_t size.
>
> In order to solve that, we need a way to explicitly set the user MMIO
> offset from the UMD, such that the kernel doesn't have to guess it
> from the TIF_32BIT flag set on user thread. We keep the old behavior
> if DRM_PANTHOR_SET_USER_MMIO_OFFSET is never called.
>
> Changes:
> - Drop the lock/immutable fields and allow SET_USER_MMIO_OFFSET
> requests to race with mmap() requests
> - Don't do the is_user_mmio_offset test twice in panthor_mmap()
> - Improve the uAPI docs
>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> ---
> drivers/gpu/drm/panthor/panthor_device.h | 18 ++++++++
> drivers/gpu/drm/panthor/panthor_drv.c | 56 +++++++++++++++++-------
> include/uapi/drm/panthor_drm.h | 38 ++++++++++++++++
> 3 files changed, 96 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index 4c27b6d85f46..6d8c2d5042f2 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -219,6 +219,24 @@ struct panthor_file {
> /** @ptdev: Device attached to this file. */
> struct panthor_device *ptdev;
>
> + /** @user_mmio: User MMIO related fields. */
> + struct {
> + /**
> + * @offset: Offset used for user MMIO mappings.
> + *
> + * This offset should not be used to check the type of mapping
> + * except in panthor_mmap(). After that point, MMIO mapping
> + * offsets have been adjusted to match
> + * DRM_PANTHOR_USER_MMIO_OFFSET and this macro should be used
> + * instead.
> + * Make sure this rule is followed at all times, because
> + * userspace is in control of the offset, and can change the
> + * value behind out back, potentially leading to erronous
> + * branching happening in kernel space.
> + */
> + u64 offset;
> + } user_mmio;
> +
> /** @vms: VM pool attached to this file. */
> struct panthor_vm_pool *vms;
>
> diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
> index 4d4a52a033f6..aedef2bfa7ac 100644
> --- a/drivers/gpu/drm/panthor/panthor_drv.c
> +++ b/drivers/gpu/drm/panthor/panthor_drv.c
> @@ -1338,6 +1338,20 @@ static int panthor_ioctl_vm_get_state(struct drm_device *ddev, void *data,
> return 0;
> }
>
> +static int panthor_ioctl_set_user_mmio_offset(struct drm_device *ddev,
> + void *data, struct drm_file *file)
> +{
> + struct drm_panthor_set_user_mmio_offset *args = data;
> + struct panthor_file *pfile = file->driver_priv;
> +
> + if (args->offset != DRM_PANTHOR_USER_MMIO_OFFSET_32BIT &&
> + args->offset != DRM_PANTHOR_USER_MMIO_OFFSET_64BIT)
> + return -EINVAL;
> +
> + WRITE_ONCE(pfile->user_mmio.offset, args->offset);
> + return 0;
> +}
> +
> static int
> panthor_open(struct drm_device *ddev, struct drm_file *file)
> {
> @@ -1355,6 +1369,18 @@ panthor_open(struct drm_device *ddev, struct drm_file *file)
> }
>
> pfile->ptdev = ptdev;
> + pfile->user_mmio.offset = DRM_PANTHOR_USER_MMIO_OFFSET;
> +
> +#ifdef CONFIG_ARM64
> + /*
> + * With 32-bit systems being limited by the 32-bit representation of
> + * mmap2's pgoffset field, we need to make the MMIO offset arch
> + * specific.
> + */
> + if (test_tsk_thread_flag(current, TIF_32BIT))
> + pfile->user_mmio.offset = DRM_PANTHOR_USER_MMIO_OFFSET_32BIT;
> +#endif
> +
>
> ret = panthor_vm_pool_create(pfile);
> if (ret)
> @@ -1407,6 +1433,7 @@ static const struct drm_ioctl_desc panthor_drm_driver_ioctls[] = {
> PANTHOR_IOCTL(TILER_HEAP_CREATE, tiler_heap_create, DRM_RENDER_ALLOW),
> PANTHOR_IOCTL(TILER_HEAP_DESTROY, tiler_heap_destroy, DRM_RENDER_ALLOW),
> PANTHOR_IOCTL(GROUP_SUBMIT, group_submit, DRM_RENDER_ALLOW),
> + PANTHOR_IOCTL(SET_USER_MMIO_OFFSET, set_user_mmio_offset, DRM_RENDER_ALLOW),
> };
>
> static int panthor_mmap(struct file *filp, struct vm_area_struct *vma)
> @@ -1415,30 +1442,26 @@ static int panthor_mmap(struct file *filp, struct vm_area_struct *vma)
> struct panthor_file *pfile = file->driver_priv;
> struct panthor_device *ptdev = pfile->ptdev;
> u64 offset = (u64)vma->vm_pgoff << PAGE_SHIFT;
> + u64 user_mmio_offset;
> int ret, cookie;
>
> if (!drm_dev_enter(file->minor->dev, &cookie))
> return -ENODEV;
>
> -#ifdef CONFIG_ARM64
> - /*
> - * With 32-bit systems being limited by the 32-bit representation of
> - * mmap2's pgoffset field, we need to make the MMIO offset arch
> - * specific. This converts a user MMIO offset into something the kernel
> - * driver understands.
> + /* Adjust the user MMIO offset to match the offset used kernel side.
> + * We use a local variable with a READ_ONCE() here to make sure
> + * the user_mmio_offset we use for the is_user_mmio_mapping() check
> + * hasn't changed when we do the offset adjustment.
> */
> - if (test_tsk_thread_flag(current, TIF_32BIT) &&
> - offset >= DRM_PANTHOR_USER_MMIO_OFFSET_32BIT) {
> - offset += DRM_PANTHOR_USER_MMIO_OFFSET_64BIT -
> - DRM_PANTHOR_USER_MMIO_OFFSET_32BIT;
> + user_mmio_offset = READ_ONCE(pfile->user_mmio.offset);
> + if (offset >= user_mmio_offset) {
> + offset -= user_mmio_offset;
> + offset += DRM_PANTHOR_USER_MMIO_OFFSET;
> vma->vm_pgoff = offset >> PAGE_SHIFT;
> - }
> -#endif
> -
> - if (offset >= DRM_PANTHOR_USER_MMIO_OFFSET)
> ret = panthor_device_mmap_io(ptdev, vma);
> - else
> + } else {
> ret = drm_gem_mmap(filp, vma);
> + }
>
> drm_dev_exit(cookie);
> return ret;
> @@ -1516,6 +1539,7 @@ static void panthor_debugfs_init(struct drm_minor *minor)
> * - 1.2 - adds DEV_QUERY_GROUP_PRIORITIES_INFO query
> * - adds PANTHOR_GROUP_PRIORITY_REALTIME priority
> * - 1.3 - adds DRM_PANTHOR_GROUP_STATE_INNOCENT flag
> + * - 1.4 - adds DRM_PANTHOR_SET_USER_MMIO_OFFSET ioctl
> */
> static const struct drm_driver panthor_drm_driver = {
> .driver_features = DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ |
> @@ -1529,7 +1553,7 @@ static const struct drm_driver panthor_drm_driver = {
> .name = "panthor",
> .desc = "Panthor DRM driver",
> .major = 1,
> - .minor = 3,
> + .minor = 4,
>
> .gem_create_object = panthor_gem_create_object,
> .gem_prime_import_sg_table = drm_gem_shmem_prime_import_sg_table,
> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> index dbb907eae443..1d1282f2c9fa 100644
> --- a/include/uapi/drm/panthor_drm.h
> +++ b/include/uapi/drm/panthor_drm.h
> @@ -127,6 +127,20 @@ enum drm_panthor_ioctl_id {
>
> /** @DRM_PANTHOR_TILER_HEAP_DESTROY: Destroy a tiler heap. */
> DRM_PANTHOR_TILER_HEAP_DESTROY,
> +
> + /**
> + * @DRM_PANTHOR_SET_USER_MMIO_OFFSET: Set the offset to use as the user MMIO offset.
> + *
> + * The default behavior is to pick the MMIO offset based on the size of the pgoff_t
> + * type seen by the process that manipulates the FD, such that a 32-bit process can
> + * always map the user MMIO ranges. But this approach doesn't work well for emulators
> + * like FEX, where the emulator is an 64-bit binary which might be executing 32-bit
> + * code. In that case, the kernel thinks it's the 64-bit process and assumes
> + * DRM_PANTHOR_USER_MMIO_OFFSET_64BIT is in use, but the UMD library expects
> + * DRM_PANTHOR_USER_MMIO_OFFSET_32BIT, because it can't mmap() anything above the
> + * pgoff_t size.
> + */
> + DRM_PANTHOR_SET_USER_MMIO_OFFSET,
> };
>
> /**
> @@ -980,6 +994,28 @@ struct drm_panthor_tiler_heap_destroy {
> __u32 pad;
> };
>
> +/**
> + * struct drm_panthor_set_user_mmio_offset - Arguments passed to
> + * DRM_IOCTL_PANTHOR_SET_USER_MMIO_OFFSET
> + *
> + * This ioctl is only really useful if you want to support userspace
> + * CPU emulation environments where the size of an unsigned long differs
> + * between the host and the guest architectures.
> + */
> +struct drm_panthor_set_user_mmio_offset {
> + /**
> + * @offset: User MMIO offset to use.
> + *
> + * Must be either DRM_PANTHOR_USER_MMIO_OFFSET_32BIT or
> + * DRM_PANTHOR_USER_MMIO_OFFSET_64BIT.
> + *
> + * Use DRM_PANTHOR_USER_MMIO_OFFSET (which selects OFFSET_32BIT or
> + * OFFSET_64BIT based on the size of an unsigned long) unless you
> + * have a very good reason to overrule this decision.
> + */
> + __u64 offset;
> +};
> +
> /**
> * DRM_IOCTL_PANTHOR() - Build a Panthor IOCTL number
> * @__access: Access type. Must be R, W or RW.
> @@ -1022,6 +1058,8 @@ enum {
> DRM_IOCTL_PANTHOR(WR, TILER_HEAP_CREATE, tiler_heap_create),
> DRM_IOCTL_PANTHOR_TILER_HEAP_DESTROY =
> DRM_IOCTL_PANTHOR(WR, TILER_HEAP_DESTROY, tiler_heap_destroy),
> + DRM_IOCTL_PANTHOR_SET_USER_MMIO_OFFSET =
> + DRM_IOCTL_PANTHOR(WR, SET_USER_MMIO_OFFSET, set_user_mmio_offset),
> };
>
> #if defined(__cplusplus)
> --
> 2.49.0
Adrian Larumbe
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] drm/panthor: Add missing explicit padding in drm_panthor_gpu_info
2025-04-17 14:49 ` [PATCH v2 1/2] drm/panthor: Add missing explicit padding in drm_panthor_gpu_info Boris Brezillon
2025-04-17 15:07 ` Steven Price
@ 2025-04-24 14:32 ` Adrián Larumbe
1 sibling, 0 replies; 9+ messages in thread
From: Adrián Larumbe @ 2025-04-24 14:32 UTC (permalink / raw)
To: Boris Brezillon; +Cc: Steven Price, Liviu Dudau, dri-devel, kernel
On 17.04.2025 16:49, Boris Brezillon wrote:
> drm_panthor_gpu_info::shader_present is currently automatically offset
> by 4 byte to meet Arm's 32-bit/64-bit field alignment rules, but those
> constraints don't stand on 32-bit x86 and cause a mismatch when running
> an x86 binary in a user emulated environment like FEX. It's also
> generally agreed that uAPIs should explicitly pad their struct fields,
> which we originally intended to do, but a mistake slipped through during
> the submission process, leading drm_panthor_gpu_info::shader_present to
> be misaligned.
>
> This uAPI change doesn't break any of the existing users of panthor
> which are either arm32 or arm64 where the 64-bit alignment of
> u64 fields is already enforced a the compiler level.
>
> Changes in v2:
> - Rename the garbage field into pad0 and adjust the comment accordingly
> - Add Liviu's R-b
>
> Fixes: 0f25e493a246 ("drm/panthor: Add uAPI")
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> ---
> include/uapi/drm/panthor_drm.h | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> index 97e2c4510e69..dbb907eae443 100644
> --- a/include/uapi/drm/panthor_drm.h
> +++ b/include/uapi/drm/panthor_drm.h
> @@ -293,6 +293,9 @@ struct drm_panthor_gpu_info {
> /** @as_present: Bitmask encoding the number of address-space exposed by the MMU. */
> __u32 as_present;
>
> + /** @pad0: MBZ. */
> + __u32 pad0;
> +
> /** @shader_present: Bitmask encoding the shader cores exposed by the GPU. */
> __u64 shader_present;
>
> --
> 2.49.0
Adrian Larumbe
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-04-24 14:32 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-17 14:49 [PATCH v2 0/2] drm/panthor: Fix panthor+FEX-Emu Boris Brezillon
2025-04-17 14:49 ` [PATCH v2 1/2] drm/panthor: Add missing explicit padding in drm_panthor_gpu_info Boris Brezillon
2025-04-17 15:07 ` Steven Price
2025-04-24 14:32 ` Adrián Larumbe
2025-04-17 14:49 ` [PATCH v2 2/2] drm/panthor: Fix the user MMIO offset logic for emulators Boris Brezillon
2025-04-17 15:13 ` Steven Price
2025-04-17 15:37 ` Boris Brezillon
2025-04-22 9:20 ` Liviu Dudau
2025-04-24 14:31 ` Adrián Larumbe
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.