* [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path
@ 2025-02-10 12:41 Adrián Larumbe
2025-02-10 12:42 ` [PATCH 2/2] drm/panthor: Avoid sleep locking in the internal BO size path Adrián Larumbe
` (3 more replies)
0 siblings, 4 replies; 15+ messages in thread
From: Adrián Larumbe @ 2025-02-10 12:41 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Adrián Larumbe
Cc: kernel, dri-devel, linux-kernel
Panthor's fdinfo handler is routed through the /proc file system, which
executes in an atomic context. That means we cannot use mutexes because
they might sleep.
This bug was uncovered by enabling some of the kernel's mutex-debugging
features:
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_MUTEXES=y
Replace Panthor's group fdinfo data mutex with a guarded spinlock.
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Fixes: e16635d88fa0 ("drm/panthor: add DRM fdinfo support")
---
drivers/gpu/drm/panthor/panthor_sched.c | 26 ++++++++++++-------------
1 file changed, 12 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 0b93bf83a9b2..7a267d1efeb6 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -9,6 +9,7 @@
#include <drm/panthor_drm.h>
#include <linux/build_bug.h>
+#include <linux/cleanup.h>
#include <linux/clk.h>
#include <linux/delay.h>
#include <linux/dma-mapping.h>
@@ -631,10 +632,10 @@ struct panthor_group {
struct panthor_gpu_usage data;
/**
- * @lock: Mutex to govern concurrent access from drm file's fdinfo callback
- * and job post-completion processing function
+ * @fdinfo.lock: Spinlock to govern concurrent access from drm file's fdinfo
+ * callback and job post-completion processing function
*/
- struct mutex lock;
+ spinlock_t lock;
/** @fdinfo.kbo_sizes: Aggregate size of private kernel BO's held by the group. */
size_t kbo_sizes;
@@ -910,8 +911,6 @@ static void group_release_work(struct work_struct *work)
release_work);
u32 i;
- mutex_destroy(&group->fdinfo.lock);
-
for (i = 0; i < group->queue_count; i++)
group_free_queue(group, group->queues[i]);
@@ -2861,12 +2860,12 @@ static void update_fdinfo_stats(struct panthor_job *job)
struct panthor_job_profiling_data *slots = queue->profiling.slots->kmap;
struct panthor_job_profiling_data *data = &slots[job->profiling.slot];
- mutex_lock(&group->fdinfo.lock);
- if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
- fdinfo->cycles += data->cycles.after - data->cycles.before;
- if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
- fdinfo->time += data->time.after - data->time.before;
- mutex_unlock(&group->fdinfo.lock);
+ scoped_guard(spinlock, &group->fdinfo.lock) {
+ if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
+ fdinfo->cycles += data->cycles.after - data->cycles.before;
+ if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
+ fdinfo->time += data->time.after - data->time.before;
+ }
}
void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
@@ -2880,12 +2879,11 @@ void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
xa_lock(&gpool->xa);
xa_for_each(&gpool->xa, i, group) {
- mutex_lock(&group->fdinfo.lock);
+ guard(spinlock)(&group->fdinfo.lock);
pfile->stats.cycles += group->fdinfo.data.cycles;
pfile->stats.time += group->fdinfo.data.time;
group->fdinfo.data.cycles = 0;
group->fdinfo.data.time = 0;
- mutex_unlock(&group->fdinfo.lock);
}
xa_unlock(&gpool->xa);
}
@@ -3531,7 +3529,7 @@ int panthor_group_create(struct panthor_file *pfile,
mutex_unlock(&sched->reset.lock);
add_group_kbo_sizes(group->ptdev, group);
- mutex_init(&group->fdinfo.lock);
+ spin_lock_init(&group->fdinfo.lock);
return gid;
base-commit: 2eca617f12586abff62038db1c14cb3aa60a15aa
prerequisite-patch-id: 7e787ce5973b5fc7e9f69f26aa4d7e5ec03d5caa
prerequisite-patch-id: 03a619b8c741444b28435850e23d9ec463171c13
prerequisite-patch-id: 290e1053f8bf4a8b80fb037a87cae7e096b5aa96
prerequisite-patch-id: bc49bb8c29905650fb4788acc528bb44013c0240
prerequisite-patch-id: 46cab4c980824c03e5164afc43085fec23e1cba5
--
2.47.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/2] drm/panthor: Avoid sleep locking in the internal BO size path
2025-02-10 12:41 [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path Adrián Larumbe
@ 2025-02-10 12:42 ` Adrián Larumbe
2025-02-10 13:18 ` Boris Brezillon
2025-02-10 13:08 ` [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path Boris Brezillon
` (2 subsequent siblings)
3 siblings, 1 reply; 15+ messages in thread
From: Adrián Larumbe @ 2025-02-10 12:42 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Adrián Larumbe, Mihail Atanassov
Cc: kernel, dri-devel, linux-kernel
A previous commit dealt with a similar situation, whereby upon enabling
some mutex debug features, a warning about sleep muteces being used in a
/proc file read atomic context was being triggered.
Because in this case replacing the heap mutex with a spinlock isn't
feasible, the fdinfo handler no longer traverses the list of heaps for
every single VM associated with an open DRM file. Instad, when a new heap
chunk is allocated, its size is accumulated into a VM-wide tally, which
also makes the atomic context code path somewhat faster.
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Fixes: 3e2c8c718567 ("drm/panthor: Expose size of driver internal BO's over fdinfo")
---
drivers/gpu/drm/panthor/panthor_heap.c | 38 ++++++++------------------
drivers/gpu/drm/panthor/panthor_heap.h | 2 --
drivers/gpu/drm/panthor/panthor_mmu.c | 18 +++++++-----
drivers/gpu/drm/panthor/panthor_mmu.h | 1 +
4 files changed, 23 insertions(+), 36 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
index db0285ce5812..686f209f5b09 100644
--- a/drivers/gpu/drm/panthor/panthor_heap.c
+++ b/drivers/gpu/drm/panthor/panthor_heap.c
@@ -127,6 +127,8 @@ static void panthor_free_heap_chunk(struct panthor_vm *vm,
heap->chunk_count--;
mutex_unlock(&heap->lock);
+ panthor_vm_heaps_accumulate(vm, -heap->chunk_size);
+
panthor_kernel_bo_destroy(chunk->bo);
kfree(chunk);
}
@@ -180,6 +182,8 @@ static int panthor_alloc_heap_chunk(struct panthor_device *ptdev,
heap->chunk_count++;
mutex_unlock(&heap->lock);
+ panthor_vm_heaps_accumulate(vm, heap->chunk_size);
+
return 0;
err_destroy_bo:
@@ -389,6 +393,7 @@ int panthor_heap_return_chunk(struct panthor_heap_pool *pool,
removed = chunk;
list_del(&chunk->node);
heap->chunk_count--;
+ panthor_vm_heaps_accumulate(chunk->bo->vm, -heap->chunk_size);
break;
}
}
@@ -560,6 +565,8 @@ panthor_heap_pool_create(struct panthor_device *ptdev, struct panthor_vm *vm)
if (ret)
goto err_destroy_pool;
+ panthor_vm_heaps_accumulate(vm, pool->gpu_contexts->obj->size);
+
return pool;
err_destroy_pool:
@@ -594,8 +601,11 @@ void panthor_heap_pool_destroy(struct panthor_heap_pool *pool)
xa_for_each(&pool->xa, i, heap)
drm_WARN_ON(&pool->ptdev->base, panthor_heap_destroy_locked(pool, i));
- if (!IS_ERR_OR_NULL(pool->gpu_contexts))
+ if (!IS_ERR_OR_NULL(pool->gpu_contexts)) {
+ panthor_vm_heaps_accumulate(pool->gpu_contexts->vm,
+ -pool->gpu_contexts->obj->size);
panthor_kernel_bo_destroy(pool->gpu_contexts);
+ }
/* Reflects the fact the pool has been destroyed. */
pool->vm = NULL;
@@ -603,29 +613,3 @@ void panthor_heap_pool_destroy(struct panthor_heap_pool *pool)
panthor_heap_pool_put(pool);
}
-
-/**
- * panthor_heap_pool_size() - Calculate size of all chunks across all heaps in a pool
- * @pool: Pool whose total chunk size to calculate.
- *
- * This function adds the size of all heap chunks across all heaps in the
- * argument pool. It also adds the size of the gpu contexts kernel bo.
- * It is meant to be used by fdinfo for displaying the size of internal
- * driver BO's that aren't exposed to userspace through a GEM handle.
- *
- */
-size_t panthor_heap_pool_size(struct panthor_heap_pool *pool)
-{
- struct panthor_heap *heap;
- unsigned long i;
- size_t size = 0;
-
- down_read(&pool->lock);
- xa_for_each(&pool->xa, i, heap)
- size += heap->chunk_size * heap->chunk_count;
- up_read(&pool->lock);
-
- size += pool->gpu_contexts->obj->size;
-
- return size;
-}
diff --git a/drivers/gpu/drm/panthor/panthor_heap.h b/drivers/gpu/drm/panthor/panthor_heap.h
index e3358d4e8edb..25a5f2bba445 100644
--- a/drivers/gpu/drm/panthor/panthor_heap.h
+++ b/drivers/gpu/drm/panthor/panthor_heap.h
@@ -27,8 +27,6 @@ struct panthor_heap_pool *
panthor_heap_pool_get(struct panthor_heap_pool *pool);
void panthor_heap_pool_put(struct panthor_heap_pool *pool);
-size_t panthor_heap_pool_size(struct panthor_heap_pool *pool);
-
int panthor_heap_grow(struct panthor_heap_pool *pool,
u64 heap_gpu_va,
u32 renderpasses_in_flight,
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index 0a4e352b5505..aaad1a560805 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -345,6 +345,10 @@ struct panthor_vm {
/** @heaps.lock: Lock used to protect access to @pool. */
struct mutex lock;
+
+ /** @heaps.size: Size of all chunks across all heaps in the pool. */
+ ssize_t size;
+
} heaps;
/** @node: Used to insert the VM in the panthor_mmu::vm::list. */
@@ -1539,6 +1543,7 @@ static void panthor_vm_destroy(struct panthor_vm *vm)
mutex_lock(&vm->heaps.lock);
panthor_heap_pool_destroy(vm->heaps.pool);
vm->heaps.pool = NULL;
+ vm->heaps.size = 0;
mutex_unlock(&vm->heaps.lock);
drm_WARN_ON(&vm->ptdev->base,
@@ -1963,13 +1968,7 @@ void panthor_vm_heaps_sizes(struct panthor_file *pfile, struct drm_memory_stats
xa_lock(&pfile->vms->xa);
xa_for_each(&pfile->vms->xa, i, vm) {
- size_t size = 0;
-
- mutex_lock(&vm->heaps.lock);
- if (vm->heaps.pool)
- size = panthor_heap_pool_size(vm->heaps.pool);
- mutex_unlock(&vm->heaps.lock);
-
+ size_t size = vm->heaps.size;
stats->resident += size;
if (vm->as.id >= 0)
stats->active += size;
@@ -1977,6 +1976,11 @@ void panthor_vm_heaps_sizes(struct panthor_file *pfile, struct drm_memory_stats
xa_unlock(&pfile->vms->xa);
}
+void panthor_vm_heaps_accumulate(struct panthor_vm *vm, ssize_t acc)
+{
+ vm->heaps.size += acc;
+}
+
static u64 mair_to_memattr(u64 mair, bool coherent)
{
u64 memattr = 0;
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.h b/drivers/gpu/drm/panthor/panthor_mmu.h
index fc274637114e..b6686896f8ef 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.h
+++ b/drivers/gpu/drm/panthor/panthor_mmu.h
@@ -39,6 +39,7 @@ struct panthor_heap_pool *
panthor_vm_get_heap_pool(struct panthor_vm *vm, bool create);
void panthor_vm_heaps_sizes(struct panthor_file *pfile, struct drm_memory_stats *stats);
+void panthor_vm_heaps_accumulate(struct panthor_vm *vm, ssize_t acc);
struct panthor_vm *panthor_vm_get(struct panthor_vm *vm);
void panthor_vm_put(struct panthor_vm *vm);
--
2.47.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path
2025-02-10 12:41 [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path Adrián Larumbe
2025-02-10 12:42 ` [PATCH 2/2] drm/panthor: Avoid sleep locking in the internal BO size path Adrián Larumbe
@ 2025-02-10 13:08 ` Boris Brezillon
2025-02-10 13:37 ` Liviu Dudau
2025-02-10 15:52 ` Tvrtko Ursulin
3 siblings, 0 replies; 15+ messages in thread
From: Boris Brezillon @ 2025-02-10 13:08 UTC (permalink / raw)
To: Adrián Larumbe
Cc: Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, kernel, dri-devel,
linux-kernel
On Mon, 10 Feb 2025 12:41:59 +0000
Adrián Larumbe <adrian.larumbe@collabora.com> wrote:
> Panthor's fdinfo handler is routed through the /proc file system, which
> executes in an atomic context. That means we cannot use mutexes because
> they might sleep.
>
> This bug was uncovered by enabling some of the kernel's mutex-debugging
> features:
>
> CONFIG_DEBUG_RT_MUTEXES=y
> CONFIG_DEBUG_MUTEXES=y
>
> Replace Panthor's group fdinfo data mutex with a guarded spinlock.
>
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> Fixes: e16635d88fa0 ("drm/panthor: add DRM fdinfo support")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
> drivers/gpu/drm/panthor/panthor_sched.c | 26 ++++++++++++-------------
> 1 file changed, 12 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index 0b93bf83a9b2..7a267d1efeb6 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -9,6 +9,7 @@
> #include <drm/panthor_drm.h>
>
> #include <linux/build_bug.h>
> +#include <linux/cleanup.h>
> #include <linux/clk.h>
> #include <linux/delay.h>
> #include <linux/dma-mapping.h>
> @@ -631,10 +632,10 @@ struct panthor_group {
> struct panthor_gpu_usage data;
>
> /**
> - * @lock: Mutex to govern concurrent access from drm file's fdinfo callback
> - * and job post-completion processing function
> + * @fdinfo.lock: Spinlock to govern concurrent access from drm file's fdinfo
> + * callback and job post-completion processing function
> */
> - struct mutex lock;
> + spinlock_t lock;
>
> /** @fdinfo.kbo_sizes: Aggregate size of private kernel BO's held by the group. */
> size_t kbo_sizes;
> @@ -910,8 +911,6 @@ static void group_release_work(struct work_struct *work)
> release_work);
> u32 i;
>
> - mutex_destroy(&group->fdinfo.lock);
> -
> for (i = 0; i < group->queue_count; i++)
> group_free_queue(group, group->queues[i]);
>
> @@ -2861,12 +2860,12 @@ static void update_fdinfo_stats(struct panthor_job *job)
> struct panthor_job_profiling_data *slots = queue->profiling.slots->kmap;
> struct panthor_job_profiling_data *data = &slots[job->profiling.slot];
>
> - mutex_lock(&group->fdinfo.lock);
> - if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
> - fdinfo->cycles += data->cycles.after - data->cycles.before;
> - if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
> - fdinfo->time += data->time.after - data->time.before;
> - mutex_unlock(&group->fdinfo.lock);
> + scoped_guard(spinlock, &group->fdinfo.lock) {
> + if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
> + fdinfo->cycles += data->cycles.after - data->cycles.before;
> + if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
> + fdinfo->time += data->time.after - data->time.before;
> + }
> }
>
> void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
> @@ -2880,12 +2879,11 @@ void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
>
> xa_lock(&gpool->xa);
> xa_for_each(&gpool->xa, i, group) {
> - mutex_lock(&group->fdinfo.lock);
> + guard(spinlock)(&group->fdinfo.lock);
> pfile->stats.cycles += group->fdinfo.data.cycles;
> pfile->stats.time += group->fdinfo.data.time;
> group->fdinfo.data.cycles = 0;
> group->fdinfo.data.time = 0;
> - mutex_unlock(&group->fdinfo.lock);
> }
> xa_unlock(&gpool->xa);
> }
> @@ -3531,7 +3529,7 @@ int panthor_group_create(struct panthor_file *pfile,
> mutex_unlock(&sched->reset.lock);
>
> add_group_kbo_sizes(group->ptdev, group);
> - mutex_init(&group->fdinfo.lock);
> + spin_lock_init(&group->fdinfo.lock);
>
> return gid;
>
>
> base-commit: 2eca617f12586abff62038db1c14cb3aa60a15aa
> prerequisite-patch-id: 7e787ce5973b5fc7e9f69f26aa4d7e5ec03d5caa
> prerequisite-patch-id: 03a619b8c741444b28435850e23d9ec463171c13
> prerequisite-patch-id: 290e1053f8bf4a8b80fb037a87cae7e096b5aa96
> prerequisite-patch-id: bc49bb8c29905650fb4788acc528bb44013c0240
> prerequisite-patch-id: 46cab4c980824c03e5164afc43085fec23e1cba5
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] drm/panthor: Avoid sleep locking in the internal BO size path
2025-02-10 12:42 ` [PATCH 2/2] drm/panthor: Avoid sleep locking in the internal BO size path Adrián Larumbe
@ 2025-02-10 13:18 ` Boris Brezillon
2025-02-10 13:43 ` Liviu Dudau
0 siblings, 1 reply; 15+ messages in thread
From: Boris Brezillon @ 2025-02-10 13:18 UTC (permalink / raw)
To: Adrián Larumbe
Cc: Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Mihail Atanassov,
kernel, dri-devel, linux-kernel
On Mon, 10 Feb 2025 12:42:00 +0000
Adrián Larumbe <adrian.larumbe@collabora.com> wrote:
> A previous commit dealt with a similar situation, whereby upon enabling
> some mutex debug features, a warning about sleep muteces being used in a
^ mutexes
> /proc file read atomic context was being triggered.
>
> Because in this case replacing the heap mutex with a spinlock isn't
> feasible, the fdinfo handler no longer traverses the list of heaps for
> every single VM associated with an open DRM file. Instad, when a new heap
> chunk is allocated, its size is accumulated into a VM-wide tally, which
> also makes the atomic context code path somewhat faster.
>
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> Fixes: 3e2c8c718567 ("drm/panthor: Expose size of driver internal BO's over fdinfo")
> ---
> drivers/gpu/drm/panthor/panthor_heap.c | 38 ++++++++------------------
> drivers/gpu/drm/panthor/panthor_heap.h | 2 --
> drivers/gpu/drm/panthor/panthor_mmu.c | 18 +++++++-----
> drivers/gpu/drm/panthor/panthor_mmu.h | 1 +
> 4 files changed, 23 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> index db0285ce5812..686f209f5b09 100644
> --- a/drivers/gpu/drm/panthor/panthor_heap.c
> +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> @@ -127,6 +127,8 @@ static void panthor_free_heap_chunk(struct panthor_vm *vm,
> heap->chunk_count--;
> mutex_unlock(&heap->lock);
>
> + panthor_vm_heaps_accumulate(vm, -heap->chunk_size);
> +
> panthor_kernel_bo_destroy(chunk->bo);
> kfree(chunk);
> }
> @@ -180,6 +182,8 @@ static int panthor_alloc_heap_chunk(struct panthor_device *ptdev,
> heap->chunk_count++;
> mutex_unlock(&heap->lock);
>
> + panthor_vm_heaps_accumulate(vm, heap->chunk_size);
> +
> return 0;
>
> err_destroy_bo:
> @@ -389,6 +393,7 @@ int panthor_heap_return_chunk(struct panthor_heap_pool *pool,
> removed = chunk;
> list_del(&chunk->node);
> heap->chunk_count--;
> + panthor_vm_heaps_accumulate(chunk->bo->vm, -heap->chunk_size);
> break;
> }
> }
> @@ -560,6 +565,8 @@ panthor_heap_pool_create(struct panthor_device *ptdev, struct panthor_vm *vm)
> if (ret)
> goto err_destroy_pool;
>
> + panthor_vm_heaps_accumulate(vm, pool->gpu_contexts->obj->size);
> +
> return pool;
>
> err_destroy_pool:
> @@ -594,8 +601,11 @@ void panthor_heap_pool_destroy(struct panthor_heap_pool *pool)
> xa_for_each(&pool->xa, i, heap)
> drm_WARN_ON(&pool->ptdev->base, panthor_heap_destroy_locked(pool, i));
>
> - if (!IS_ERR_OR_NULL(pool->gpu_contexts))
> + if (!IS_ERR_OR_NULL(pool->gpu_contexts)) {
> + panthor_vm_heaps_accumulate(pool->gpu_contexts->vm,
> + -pool->gpu_contexts->obj->size);
> panthor_kernel_bo_destroy(pool->gpu_contexts);
> + }
>
> /* Reflects the fact the pool has been destroyed. */
> pool->vm = NULL;
> @@ -603,29 +613,3 @@ void panthor_heap_pool_destroy(struct panthor_heap_pool *pool)
>
> panthor_heap_pool_put(pool);
> }
> -
> -/**
> - * panthor_heap_pool_size() - Calculate size of all chunks across all heaps in a pool
> - * @pool: Pool whose total chunk size to calculate.
> - *
> - * This function adds the size of all heap chunks across all heaps in the
> - * argument pool. It also adds the size of the gpu contexts kernel bo.
> - * It is meant to be used by fdinfo for displaying the size of internal
> - * driver BO's that aren't exposed to userspace through a GEM handle.
> - *
> - */
> -size_t panthor_heap_pool_size(struct panthor_heap_pool *pool)
> -{
> - struct panthor_heap *heap;
> - unsigned long i;
> - size_t size = 0;
> -
> - down_read(&pool->lock);
> - xa_for_each(&pool->xa, i, heap)
> - size += heap->chunk_size * heap->chunk_count;
> - up_read(&pool->lock);
> -
> - size += pool->gpu_contexts->obj->size;
> -
> - return size;
> -}
> diff --git a/drivers/gpu/drm/panthor/panthor_heap.h b/drivers/gpu/drm/panthor/panthor_heap.h
> index e3358d4e8edb..25a5f2bba445 100644
> --- a/drivers/gpu/drm/panthor/panthor_heap.h
> +++ b/drivers/gpu/drm/panthor/panthor_heap.h
> @@ -27,8 +27,6 @@ struct panthor_heap_pool *
> panthor_heap_pool_get(struct panthor_heap_pool *pool);
> void panthor_heap_pool_put(struct panthor_heap_pool *pool);
>
> -size_t panthor_heap_pool_size(struct panthor_heap_pool *pool);
> -
> int panthor_heap_grow(struct panthor_heap_pool *pool,
> u64 heap_gpu_va,
> u32 renderpasses_in_flight,
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> index 0a4e352b5505..aaad1a560805 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> @@ -345,6 +345,10 @@ struct panthor_vm {
>
> /** @heaps.lock: Lock used to protect access to @pool. */
> struct mutex lock;
> +
> + /** @heaps.size: Size of all chunks across all heaps in the pool. */
> + ssize_t size;
Let's put that into an fdinfo struct.
> +
Drop the extra blank-line.
> } heaps;
>
> /** @node: Used to insert the VM in the panthor_mmu::vm::list. */
> @@ -1539,6 +1543,7 @@ static void panthor_vm_destroy(struct panthor_vm *vm)
> mutex_lock(&vm->heaps.lock);
> panthor_heap_pool_destroy(vm->heaps.pool);
> vm->heaps.pool = NULL;
> + vm->heaps.size = 0;
> mutex_unlock(&vm->heaps.lock);
>
> drm_WARN_ON(&vm->ptdev->base,
> @@ -1963,13 +1968,7 @@ void panthor_vm_heaps_sizes(struct panthor_file *pfile, struct drm_memory_stats
>
> xa_lock(&pfile->vms->xa);
> xa_for_each(&pfile->vms->xa, i, vm) {
> - size_t size = 0;
> -
> - mutex_lock(&vm->heaps.lock);
> - if (vm->heaps.pool)
> - size = panthor_heap_pool_size(vm->heaps.pool);
> - mutex_unlock(&vm->heaps.lock);
> -
> + size_t size = vm->heaps.size;
> stats->resident += size;
> if (vm->as.id >= 0)
> stats->active += size;
> @@ -1977,6 +1976,11 @@ void panthor_vm_heaps_sizes(struct panthor_file *pfile, struct drm_memory_stats
> xa_unlock(&pfile->vms->xa);
> }
>
> +void panthor_vm_heaps_accumulate(struct panthor_vm *vm, ssize_t acc)
> +{
Either there's some lock protecting this operation and we want a
lockdep_assert_held(), or we need to make it an atomic operation (and
make the size an atomic_t) to avoid races.
> + vm->heaps.size += acc;
> +}
> +
> static u64 mair_to_memattr(u64 mair, bool coherent)
> {
> u64 memattr = 0;
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.h b/drivers/gpu/drm/panthor/panthor_mmu.h
> index fc274637114e..b6686896f8ef 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.h
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.h
> @@ -39,6 +39,7 @@ struct panthor_heap_pool *
> panthor_vm_get_heap_pool(struct panthor_vm *vm, bool create);
>
> void panthor_vm_heaps_sizes(struct panthor_file *pfile, struct drm_memory_stats *stats);
> +void panthor_vm_heaps_accumulate(struct panthor_vm *vm, ssize_t acc);
>
> struct panthor_vm *panthor_vm_get(struct panthor_vm *vm);
> void panthor_vm_put(struct panthor_vm *vm);
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path
2025-02-10 12:41 [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path Adrián Larumbe
2025-02-10 12:42 ` [PATCH 2/2] drm/panthor: Avoid sleep locking in the internal BO size path Adrián Larumbe
2025-02-10 13:08 ` [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path Boris Brezillon
@ 2025-02-10 13:37 ` Liviu Dudau
2025-02-10 15:52 ` Tvrtko Ursulin
3 siblings, 0 replies; 15+ messages in thread
From: Liviu Dudau @ 2025-02-10 13:37 UTC (permalink / raw)
To: Adrián Larumbe
Cc: Boris Brezillon, Steven Price, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, kernel, dri-devel,
linux-kernel
On Mon, Feb 10, 2025 at 12:41:59PM +0000, Adrián Larumbe wrote:
> Panthor's fdinfo handler is routed through the /proc file system, which
> executes in an atomic context. That means we cannot use mutexes because
> they might sleep.
>
> This bug was uncovered by enabling some of the kernel's mutex-debugging
> features:
>
> CONFIG_DEBUG_RT_MUTEXES=y
> CONFIG_DEBUG_MUTEXES=y
>
> Replace Panthor's group fdinfo data mutex with a guarded spinlock.
>
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> Fixes: e16635d88fa0 ("drm/panthor: add DRM fdinfo support")
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Best regards,
Liviu
> ---
> drivers/gpu/drm/panthor/panthor_sched.c | 26 ++++++++++++-------------
> 1 file changed, 12 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index 0b93bf83a9b2..7a267d1efeb6 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -9,6 +9,7 @@
> #include <drm/panthor_drm.h>
>
> #include <linux/build_bug.h>
> +#include <linux/cleanup.h>
> #include <linux/clk.h>
> #include <linux/delay.h>
> #include <linux/dma-mapping.h>
> @@ -631,10 +632,10 @@ struct panthor_group {
> struct panthor_gpu_usage data;
>
> /**
> - * @lock: Mutex to govern concurrent access from drm file's fdinfo callback
> - * and job post-completion processing function
> + * @fdinfo.lock: Spinlock to govern concurrent access from drm file's fdinfo
> + * callback and job post-completion processing function
> */
> - struct mutex lock;
> + spinlock_t lock;
>
> /** @fdinfo.kbo_sizes: Aggregate size of private kernel BO's held by the group. */
> size_t kbo_sizes;
> @@ -910,8 +911,6 @@ static void group_release_work(struct work_struct *work)
> release_work);
> u32 i;
>
> - mutex_destroy(&group->fdinfo.lock);
> -
> for (i = 0; i < group->queue_count; i++)
> group_free_queue(group, group->queues[i]);
>
> @@ -2861,12 +2860,12 @@ static void update_fdinfo_stats(struct panthor_job *job)
> struct panthor_job_profiling_data *slots = queue->profiling.slots->kmap;
> struct panthor_job_profiling_data *data = &slots[job->profiling.slot];
>
> - mutex_lock(&group->fdinfo.lock);
> - if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
> - fdinfo->cycles += data->cycles.after - data->cycles.before;
> - if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
> - fdinfo->time += data->time.after - data->time.before;
> - mutex_unlock(&group->fdinfo.lock);
> + scoped_guard(spinlock, &group->fdinfo.lock) {
> + if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
> + fdinfo->cycles += data->cycles.after - data->cycles.before;
> + if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
> + fdinfo->time += data->time.after - data->time.before;
> + }
> }
>
> void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
> @@ -2880,12 +2879,11 @@ void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
>
> xa_lock(&gpool->xa);
> xa_for_each(&gpool->xa, i, group) {
> - mutex_lock(&group->fdinfo.lock);
> + guard(spinlock)(&group->fdinfo.lock);
> pfile->stats.cycles += group->fdinfo.data.cycles;
> pfile->stats.time += group->fdinfo.data.time;
> group->fdinfo.data.cycles = 0;
> group->fdinfo.data.time = 0;
> - mutex_unlock(&group->fdinfo.lock);
> }
> xa_unlock(&gpool->xa);
> }
> @@ -3531,7 +3529,7 @@ int panthor_group_create(struct panthor_file *pfile,
> mutex_unlock(&sched->reset.lock);
>
> add_group_kbo_sizes(group->ptdev, group);
> - mutex_init(&group->fdinfo.lock);
> + spin_lock_init(&group->fdinfo.lock);
>
> return gid;
>
>
> base-commit: 2eca617f12586abff62038db1c14cb3aa60a15aa
> prerequisite-patch-id: 7e787ce5973b5fc7e9f69f26aa4d7e5ec03d5caa
> prerequisite-patch-id: 03a619b8c741444b28435850e23d9ec463171c13
> prerequisite-patch-id: 290e1053f8bf4a8b80fb037a87cae7e096b5aa96
> prerequisite-patch-id: bc49bb8c29905650fb4788acc528bb44013c0240
> prerequisite-patch-id: 46cab4c980824c03e5164afc43085fec23e1cba5
> --
> 2.47.1
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] drm/panthor: Avoid sleep locking in the internal BO size path
2025-02-10 13:18 ` Boris Brezillon
@ 2025-02-10 13:43 ` Liviu Dudau
2025-02-10 14:15 ` Boris Brezillon
0 siblings, 1 reply; 15+ messages in thread
From: Liviu Dudau @ 2025-02-10 13:43 UTC (permalink / raw)
To: Boris Brezillon
Cc: Adrián Larumbe, Steven Price, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Mihail Atanassov, kernel, dri-devel, linux-kernel
On Mon, Feb 10, 2025 at 02:18:07PM +0100, Boris Brezillon wrote:
> On Mon, 10 Feb 2025 12:42:00 +0000
> Adrián Larumbe <adrian.larumbe@collabora.com> wrote:
>
> > A previous commit dealt with a similar situation, whereby upon enabling
> > some mutex debug features, a warning about sleep muteces being used in a
>
> ^ mutexes
>
> > /proc file read atomic context was being triggered.
> >
> > Because in this case replacing the heap mutex with a spinlock isn't
> > feasible, the fdinfo handler no longer traverses the list of heaps for
> > every single VM associated with an open DRM file. Instad, when a new heap
> > chunk is allocated, its size is accumulated into a VM-wide tally, which
> > also makes the atomic context code path somewhat faster.
> >
> > Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> > Fixes: 3e2c8c718567 ("drm/panthor: Expose size of driver internal BO's over fdinfo")
> > ---
> > drivers/gpu/drm/panthor/panthor_heap.c | 38 ++++++++------------------
> > drivers/gpu/drm/panthor/panthor_heap.h | 2 --
> > drivers/gpu/drm/panthor/panthor_mmu.c | 18 +++++++-----
> > drivers/gpu/drm/panthor/panthor_mmu.h | 1 +
> > 4 files changed, 23 insertions(+), 36 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> > index db0285ce5812..686f209f5b09 100644
> > --- a/drivers/gpu/drm/panthor/panthor_heap.c
> > +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> > @@ -127,6 +127,8 @@ static void panthor_free_heap_chunk(struct panthor_vm *vm,
> > heap->chunk_count--;
> > mutex_unlock(&heap->lock);
> >
> > + panthor_vm_heaps_accumulate(vm, -heap->chunk_size);
> > +
> > panthor_kernel_bo_destroy(chunk->bo);
> > kfree(chunk);
> > }
> > @@ -180,6 +182,8 @@ static int panthor_alloc_heap_chunk(struct panthor_device *ptdev,
> > heap->chunk_count++;
> > mutex_unlock(&heap->lock);
> >
> > + panthor_vm_heaps_accumulate(vm, heap->chunk_size);
> > +
> > return 0;
> >
> > err_destroy_bo:
> > @@ -389,6 +393,7 @@ int panthor_heap_return_chunk(struct panthor_heap_pool *pool,
> > removed = chunk;
> > list_del(&chunk->node);
> > heap->chunk_count--;
> > + panthor_vm_heaps_accumulate(chunk->bo->vm, -heap->chunk_size);
> > break;
> > }
> > }
> > @@ -560,6 +565,8 @@ panthor_heap_pool_create(struct panthor_device *ptdev, struct panthor_vm *vm)
> > if (ret)
> > goto err_destroy_pool;
> >
> > + panthor_vm_heaps_accumulate(vm, pool->gpu_contexts->obj->size);
> > +
> > return pool;
> >
> > err_destroy_pool:
> > @@ -594,8 +601,11 @@ void panthor_heap_pool_destroy(struct panthor_heap_pool *pool)
> > xa_for_each(&pool->xa, i, heap)
> > drm_WARN_ON(&pool->ptdev->base, panthor_heap_destroy_locked(pool, i));
> >
> > - if (!IS_ERR_OR_NULL(pool->gpu_contexts))
> > + if (!IS_ERR_OR_NULL(pool->gpu_contexts)) {
> > + panthor_vm_heaps_accumulate(pool->gpu_contexts->vm,
> > + -pool->gpu_contexts->obj->size);
> > panthor_kernel_bo_destroy(pool->gpu_contexts);
> > + }
> >
> > /* Reflects the fact the pool has been destroyed. */
> > pool->vm = NULL;
> > @@ -603,29 +613,3 @@ void panthor_heap_pool_destroy(struct panthor_heap_pool *pool)
> >
> > panthor_heap_pool_put(pool);
> > }
> > -
> > -/**
> > - * panthor_heap_pool_size() - Calculate size of all chunks across all heaps in a pool
> > - * @pool: Pool whose total chunk size to calculate.
> > - *
> > - * This function adds the size of all heap chunks across all heaps in the
> > - * argument pool. It also adds the size of the gpu contexts kernel bo.
> > - * It is meant to be used by fdinfo for displaying the size of internal
> > - * driver BO's that aren't exposed to userspace through a GEM handle.
> > - *
> > - */
> > -size_t panthor_heap_pool_size(struct panthor_heap_pool *pool)
> > -{
> > - struct panthor_heap *heap;
> > - unsigned long i;
> > - size_t size = 0;
> > -
> > - down_read(&pool->lock);
> > - xa_for_each(&pool->xa, i, heap)
> > - size += heap->chunk_size * heap->chunk_count;
> > - up_read(&pool->lock);
> > -
> > - size += pool->gpu_contexts->obj->size;
> > -
> > - return size;
> > -}
> > diff --git a/drivers/gpu/drm/panthor/panthor_heap.h b/drivers/gpu/drm/panthor/panthor_heap.h
> > index e3358d4e8edb..25a5f2bba445 100644
> > --- a/drivers/gpu/drm/panthor/panthor_heap.h
> > +++ b/drivers/gpu/drm/panthor/panthor_heap.h
> > @@ -27,8 +27,6 @@ struct panthor_heap_pool *
> > panthor_heap_pool_get(struct panthor_heap_pool *pool);
> > void panthor_heap_pool_put(struct panthor_heap_pool *pool);
> >
> > -size_t panthor_heap_pool_size(struct panthor_heap_pool *pool);
> > -
> > int panthor_heap_grow(struct panthor_heap_pool *pool,
> > u64 heap_gpu_va,
> > u32 renderpasses_in_flight,
> > diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> > index 0a4e352b5505..aaad1a560805 100644
> > --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> > +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> > @@ -345,6 +345,10 @@ struct panthor_vm {
> >
> > /** @heaps.lock: Lock used to protect access to @pool. */
> > struct mutex lock;
> > +
> > + /** @heaps.size: Size of all chunks across all heaps in the pool. */
> > + ssize_t size;
>
> Let's put that into an fdinfo struct.
>
> > +
>
> Drop the extra blank-line.
>
> > } heaps;
> >
> > /** @node: Used to insert the VM in the panthor_mmu::vm::list. */
> > @@ -1539,6 +1543,7 @@ static void panthor_vm_destroy(struct panthor_vm *vm)
> > mutex_lock(&vm->heaps.lock);
> > panthor_heap_pool_destroy(vm->heaps.pool);
> > vm->heaps.pool = NULL;
> > + vm->heaps.size = 0;
> > mutex_unlock(&vm->heaps.lock);
> >
> > drm_WARN_ON(&vm->ptdev->base,
> > @@ -1963,13 +1968,7 @@ void panthor_vm_heaps_sizes(struct panthor_file *pfile, struct drm_memory_stats
> >
> > xa_lock(&pfile->vms->xa);
> > xa_for_each(&pfile->vms->xa, i, vm) {
> > - size_t size = 0;
> > -
> > - mutex_lock(&vm->heaps.lock);
> > - if (vm->heaps.pool)
> > - size = panthor_heap_pool_size(vm->heaps.pool);
> > - mutex_unlock(&vm->heaps.lock);
> > -
> > + size_t size = vm->heaps.size;
> > stats->resident += size;
> > if (vm->as.id >= 0)
> > stats->active += size;
> > @@ -1977,6 +1976,11 @@ void panthor_vm_heaps_sizes(struct panthor_file *pfile, struct drm_memory_stats
> > xa_unlock(&pfile->vms->xa);
> > }
> >
> > +void panthor_vm_heaps_accumulate(struct panthor_vm *vm, ssize_t acc)
> > +{
>
> Either there's some lock protecting this operation and we want a
> lockdep_assert_held(), or we need to make it an atomic operation (and
> make the size an atomic_t) to avoid races.
If Adrián moves the call site in panthor_{alloc,free}_heap_chunk() before he drops the heap->lock,
would that be sufficient? Pool create and destroy are hopefully race-free and panthor_heap_return_chunk()
should be also safe.
Best regards,
Liviu
>
> > + vm->heaps.size += acc;
> > +}
> > +
> > static u64 mair_to_memattr(u64 mair, bool coherent)
> > {
> > u64 memattr = 0;
> > diff --git a/drivers/gpu/drm/panthor/panthor_mmu.h b/drivers/gpu/drm/panthor/panthor_mmu.h
> > index fc274637114e..b6686896f8ef 100644
> > --- a/drivers/gpu/drm/panthor/panthor_mmu.h
> > +++ b/drivers/gpu/drm/panthor/panthor_mmu.h
> > @@ -39,6 +39,7 @@ struct panthor_heap_pool *
> > panthor_vm_get_heap_pool(struct panthor_vm *vm, bool create);
> >
> > void panthor_vm_heaps_sizes(struct panthor_file *pfile, struct drm_memory_stats *stats);
> > +void panthor_vm_heaps_accumulate(struct panthor_vm *vm, ssize_t acc);
> >
> > struct panthor_vm *panthor_vm_get(struct panthor_vm *vm);
> > void panthor_vm_put(struct panthor_vm *vm);
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] drm/panthor: Avoid sleep locking in the internal BO size path
2025-02-10 13:43 ` Liviu Dudau
@ 2025-02-10 14:15 ` Boris Brezillon
0 siblings, 0 replies; 15+ messages in thread
From: Boris Brezillon @ 2025-02-10 14:15 UTC (permalink / raw)
To: Liviu Dudau
Cc: Adrián Larumbe, Steven Price, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Mihail Atanassov, kernel, dri-devel, linux-kernel
On Mon, 10 Feb 2025 13:43:17 +0000
Liviu Dudau <liviu.dudau@arm.com> wrote:
> On Mon, Feb 10, 2025 at 02:18:07PM +0100, Boris Brezillon wrote:
> > On Mon, 10 Feb 2025 12:42:00 +0000
> > Adrián Larumbe <adrian.larumbe@collabora.com> wrote:
> >
> > > A previous commit dealt with a similar situation, whereby upon enabling
> > > some mutex debug features, a warning about sleep muteces being used in a
> >
> > ^ mutexes
> >
> > > /proc file read atomic context was being triggered.
> > >
> > > Because in this case replacing the heap mutex with a spinlock isn't
> > > feasible, the fdinfo handler no longer traverses the list of heaps for
> > > every single VM associated with an open DRM file. Instad, when a new heap
> > > chunk is allocated, its size is accumulated into a VM-wide tally, which
> > > also makes the atomic context code path somewhat faster.
> > >
> > > Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> > > Fixes: 3e2c8c718567 ("drm/panthor: Expose size of driver internal BO's over fdinfo")
> > > ---
> > > drivers/gpu/drm/panthor/panthor_heap.c | 38 ++++++++------------------
> > > drivers/gpu/drm/panthor/panthor_heap.h | 2 --
> > > drivers/gpu/drm/panthor/panthor_mmu.c | 18 +++++++-----
> > > drivers/gpu/drm/panthor/panthor_mmu.h | 1 +
> > > 4 files changed, 23 insertions(+), 36 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> > > index db0285ce5812..686f209f5b09 100644
> > > --- a/drivers/gpu/drm/panthor/panthor_heap.c
> > > +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> > > @@ -127,6 +127,8 @@ static void panthor_free_heap_chunk(struct panthor_vm *vm,
> > > heap->chunk_count--;
> > > mutex_unlock(&heap->lock);
> > >
> > > + panthor_vm_heaps_accumulate(vm, -heap->chunk_size);
> > > +
> > > panthor_kernel_bo_destroy(chunk->bo);
> > > kfree(chunk);
> > > }
> > > @@ -180,6 +182,8 @@ static int panthor_alloc_heap_chunk(struct panthor_device *ptdev,
> > > heap->chunk_count++;
> > > mutex_unlock(&heap->lock);
> > >
> > > + panthor_vm_heaps_accumulate(vm, heap->chunk_size);
> > > +
> > > return 0;
> > >
> > > err_destroy_bo:
> > > @@ -389,6 +393,7 @@ int panthor_heap_return_chunk(struct panthor_heap_pool *pool,
> > > removed = chunk;
> > > list_del(&chunk->node);
> > > heap->chunk_count--;
> > > + panthor_vm_heaps_accumulate(chunk->bo->vm, -heap->chunk_size);
> > > break;
> > > }
> > > }
> > > @@ -560,6 +565,8 @@ panthor_heap_pool_create(struct panthor_device *ptdev, struct panthor_vm *vm)
> > > if (ret)
> > > goto err_destroy_pool;
> > >
> > > + panthor_vm_heaps_accumulate(vm, pool->gpu_contexts->obj->size);
> > > +
> > > return pool;
> > >
> > > err_destroy_pool:
> > > @@ -594,8 +601,11 @@ void panthor_heap_pool_destroy(struct panthor_heap_pool *pool)
> > > xa_for_each(&pool->xa, i, heap)
> > > drm_WARN_ON(&pool->ptdev->base, panthor_heap_destroy_locked(pool, i));
> > >
> > > - if (!IS_ERR_OR_NULL(pool->gpu_contexts))
> > > + if (!IS_ERR_OR_NULL(pool->gpu_contexts)) {
> > > + panthor_vm_heaps_accumulate(pool->gpu_contexts->vm,
> > > + -pool->gpu_contexts->obj->size);
> > > panthor_kernel_bo_destroy(pool->gpu_contexts);
> > > + }
> > >
> > > /* Reflects the fact the pool has been destroyed. */
> > > pool->vm = NULL;
> > > @@ -603,29 +613,3 @@ void panthor_heap_pool_destroy(struct panthor_heap_pool *pool)
> > >
> > > panthor_heap_pool_put(pool);
> > > }
> > > -
> > > -/**
> > > - * panthor_heap_pool_size() - Calculate size of all chunks across all heaps in a pool
> > > - * @pool: Pool whose total chunk size to calculate.
> > > - *
> > > - * This function adds the size of all heap chunks across all heaps in the
> > > - * argument pool. It also adds the size of the gpu contexts kernel bo.
> > > - * It is meant to be used by fdinfo for displaying the size of internal
> > > - * driver BO's that aren't exposed to userspace through a GEM handle.
> > > - *
> > > - */
> > > -size_t panthor_heap_pool_size(struct panthor_heap_pool *pool)
> > > -{
> > > - struct panthor_heap *heap;
> > > - unsigned long i;
> > > - size_t size = 0;
> > > -
> > > - down_read(&pool->lock);
> > > - xa_for_each(&pool->xa, i, heap)
> > > - size += heap->chunk_size * heap->chunk_count;
> > > - up_read(&pool->lock);
> > > -
> > > - size += pool->gpu_contexts->obj->size;
> > > -
> > > - return size;
> > > -}
> > > diff --git a/drivers/gpu/drm/panthor/panthor_heap.h b/drivers/gpu/drm/panthor/panthor_heap.h
> > > index e3358d4e8edb..25a5f2bba445 100644
> > > --- a/drivers/gpu/drm/panthor/panthor_heap.h
> > > +++ b/drivers/gpu/drm/panthor/panthor_heap.h
> > > @@ -27,8 +27,6 @@ struct panthor_heap_pool *
> > > panthor_heap_pool_get(struct panthor_heap_pool *pool);
> > > void panthor_heap_pool_put(struct panthor_heap_pool *pool);
> > >
> > > -size_t panthor_heap_pool_size(struct panthor_heap_pool *pool);
> > > -
> > > int panthor_heap_grow(struct panthor_heap_pool *pool,
> > > u64 heap_gpu_va,
> > > u32 renderpasses_in_flight,
> > > diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> > > index 0a4e352b5505..aaad1a560805 100644
> > > --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> > > +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> > > @@ -345,6 +345,10 @@ struct panthor_vm {
> > >
> > > /** @heaps.lock: Lock used to protect access to @pool. */
> > > struct mutex lock;
> > > +
> > > + /** @heaps.size: Size of all chunks across all heaps in the pool. */
> > > + ssize_t size;
> >
> > Let's put that into an fdinfo struct.
> >
> > > +
> >
> > Drop the extra blank-line.
> >
> > > } heaps;
> > >
> > > /** @node: Used to insert the VM in the panthor_mmu::vm::list. */
> > > @@ -1539,6 +1543,7 @@ static void panthor_vm_destroy(struct panthor_vm *vm)
> > > mutex_lock(&vm->heaps.lock);
> > > panthor_heap_pool_destroy(vm->heaps.pool);
> > > vm->heaps.pool = NULL;
> > > + vm->heaps.size = 0;
> > > mutex_unlock(&vm->heaps.lock);
> > >
> > > drm_WARN_ON(&vm->ptdev->base,
> > > @@ -1963,13 +1968,7 @@ void panthor_vm_heaps_sizes(struct panthor_file *pfile, struct drm_memory_stats
> > >
> > > xa_lock(&pfile->vms->xa);
> > > xa_for_each(&pfile->vms->xa, i, vm) {
> > > - size_t size = 0;
> > > -
> > > - mutex_lock(&vm->heaps.lock);
> > > - if (vm->heaps.pool)
> > > - size = panthor_heap_pool_size(vm->heaps.pool);
> > > - mutex_unlock(&vm->heaps.lock);
> > > -
> > > + size_t size = vm->heaps.size;
> > > stats->resident += size;
> > > if (vm->as.id >= 0)
> > > stats->active += size;
> > > @@ -1977,6 +1976,11 @@ void panthor_vm_heaps_sizes(struct panthor_file *pfile, struct drm_memory_stats
> > > xa_unlock(&pfile->vms->xa);
> > > }
> > >
> > > +void panthor_vm_heaps_accumulate(struct panthor_vm *vm, ssize_t acc)
> > > +{
> >
> > Either there's some lock protecting this operation and we want a
> > lockdep_assert_held(), or we need to make it an atomic operation (and
> > make the size an atomic_t) to avoid races.
>
> If Adrián moves the call site in panthor_{alloc,free}_heap_chunk() before he drops the heap->lock,
> would that be sufficient? Pool create and destroy are hopefully race-free and
> panthor_heap_return_chunk() should be also safe.
I'm not sure that's enough. The tiler_oom_work is per-group, and you
can have multiple groups sharing the same VM. The workqueue we schedule
these works on is also not single-threaded, so you can potentially have
multiple threads updating the total VM tiler heap size at the same time.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path
2025-02-10 12:41 [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path Adrián Larumbe
` (2 preceding siblings ...)
2025-02-10 13:37 ` Liviu Dudau
@ 2025-02-10 15:52 ` Tvrtko Ursulin
2025-02-10 16:08 ` Adrián Larumbe
2025-02-10 16:10 ` Adrián Larumbe
3 siblings, 2 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2025-02-10 15:52 UTC (permalink / raw)
To: Adrián Larumbe, Boris Brezillon, Steven Price, Liviu Dudau,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: kernel, dri-devel, linux-kernel
On 10/02/2025 12:41, Adrián Larumbe wrote:
> Panthor's fdinfo handler is routed through the /proc file system, which
> executes in an atomic context. That means we cannot use mutexes because
> they might sleep.
Have the debug splat at hand? I am thinking it is not because of fdinfo
reads, which are allowed to sleep, but has to be something else.
Regards,
Tvrtko
> This bug was uncovered by enabling some of the kernel's mutex-debugging
> features:
>
> CONFIG_DEBUG_RT_MUTEXES=y
> CONFIG_DEBUG_MUTEXES=y
>
> Replace Panthor's group fdinfo data mutex with a guarded spinlock.
>
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> Fixes: e16635d88fa0 ("drm/panthor: add DRM fdinfo support")
> ---
> drivers/gpu/drm/panthor/panthor_sched.c | 26 ++++++++++++-------------
> 1 file changed, 12 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index 0b93bf83a9b2..7a267d1efeb6 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -9,6 +9,7 @@
> #include <drm/panthor_drm.h>
>
> #include <linux/build_bug.h>
> +#include <linux/cleanup.h>
> #include <linux/clk.h>
> #include <linux/delay.h>
> #include <linux/dma-mapping.h>
> @@ -631,10 +632,10 @@ struct panthor_group {
> struct panthor_gpu_usage data;
>
> /**
> - * @lock: Mutex to govern concurrent access from drm file's fdinfo callback
> - * and job post-completion processing function
> + * @fdinfo.lock: Spinlock to govern concurrent access from drm file's fdinfo
> + * callback and job post-completion processing function
> */
> - struct mutex lock;
> + spinlock_t lock;
>
> /** @fdinfo.kbo_sizes: Aggregate size of private kernel BO's held by the group. */
> size_t kbo_sizes;
> @@ -910,8 +911,6 @@ static void group_release_work(struct work_struct *work)
> release_work);
> u32 i;
>
> - mutex_destroy(&group->fdinfo.lock);
> -
> for (i = 0; i < group->queue_count; i++)
> group_free_queue(group, group->queues[i]);
>
> @@ -2861,12 +2860,12 @@ static void update_fdinfo_stats(struct panthor_job *job)
> struct panthor_job_profiling_data *slots = queue->profiling.slots->kmap;
> struct panthor_job_profiling_data *data = &slots[job->profiling.slot];
>
> - mutex_lock(&group->fdinfo.lock);
> - if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
> - fdinfo->cycles += data->cycles.after - data->cycles.before;
> - if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
> - fdinfo->time += data->time.after - data->time.before;
> - mutex_unlock(&group->fdinfo.lock);
> + scoped_guard(spinlock, &group->fdinfo.lock) {
> + if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
> + fdinfo->cycles += data->cycles.after - data->cycles.before;
> + if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
> + fdinfo->time += data->time.after - data->time.before;
> + }
> }
>
> void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
> @@ -2880,12 +2879,11 @@ void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
>
> xa_lock(&gpool->xa);
> xa_for_each(&gpool->xa, i, group) {
> - mutex_lock(&group->fdinfo.lock);
> + guard(spinlock)(&group->fdinfo.lock);
> pfile->stats.cycles += group->fdinfo.data.cycles;
> pfile->stats.time += group->fdinfo.data.time;
> group->fdinfo.data.cycles = 0;
> group->fdinfo.data.time = 0;
> - mutex_unlock(&group->fdinfo.lock);
> }
> xa_unlock(&gpool->xa);
> }
> @@ -3531,7 +3529,7 @@ int panthor_group_create(struct panthor_file *pfile,
> mutex_unlock(&sched->reset.lock);
>
> add_group_kbo_sizes(group->ptdev, group);
> - mutex_init(&group->fdinfo.lock);
> + spin_lock_init(&group->fdinfo.lock);
>
> return gid;
>
>
> base-commit: 2eca617f12586abff62038db1c14cb3aa60a15aa
> prerequisite-patch-id: 7e787ce5973b5fc7e9f69f26aa4d7e5ec03d5caa
> prerequisite-patch-id: 03a619b8c741444b28435850e23d9ec463171c13
> prerequisite-patch-id: 290e1053f8bf4a8b80fb037a87cae7e096b5aa96
> prerequisite-patch-id: bc49bb8c29905650fb4788acc528bb44013c0240
> prerequisite-patch-id: 46cab4c980824c03e5164afc43085fec23e1cba5
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path
2025-02-10 15:52 ` Tvrtko Ursulin
@ 2025-02-10 16:08 ` Adrián Larumbe
2025-02-11 11:39 ` Tvrtko Ursulin
2025-02-10 16:10 ` Adrián Larumbe
1 sibling, 1 reply; 15+ messages in thread
From: Adrián Larumbe @ 2025-02-10 16:08 UTC (permalink / raw)
To: Tvrtko Ursulin
Cc: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
kernel, dri-devel, linux-kernel
Hi Tvrtko,
[18153.770244] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562
[18153.771059] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 203412, name: cat
[18153.771757] preempt_count: 1, expected: 0
[18153.772164] RCU nest depth: 0, expected: 0
[18153.772538] INFO: lockdep is turned off.
[18153.772898] CPU: 4 UID: 0 PID: 203412 Comm: cat Tainted: G W 6.14.0-rc1-panthor-next-rk3588-fdinfo+ #1
[18153.772906] Tainted: [W]=WARN
[18153.772908] Hardware name: Radxa ROCK 5B (DT)
[18153.772911] Call trace:
[18153.772913] show_stack+0x24/0x38 (C)
[18153.772927] dump_stack_lvl+0x3c/0x98
[18153.772935] dump_stack+0x18/0x24
[18153.772941] __might_resched+0x298/0x2b0
[18153.772948] __might_sleep+0x6c/0xb0
[18153.772953] __mutex_lock_common+0x7c/0x1950
[18153.772962] mutex_lock_nested+0x38/0x50
[18153.772969] panthor_fdinfo_gather_group_samples+0x80/0x138 [panthor]
[18153.773042] panthor_show_fdinfo+0x80/0x228 [panthor]
[18153.773109] drm_show_fdinfo+0x1a4/0x1e0 [drm]
[18153.773397] seq_show+0x274/0x358
[18153.773404] seq_read_iter+0x1d4/0x630
[18153.773411] seq_read+0x148/0x1a0
[18153.773416] vfs_read+0x114/0x3e0
[18153.773423] ksys_read+0x90/0x110
[18153.773429] __arm64_sys_read+0x50/0x70
[18153.773435] invoke_syscall+0x60/0x178
[18153.773442] el0_svc_common+0x104/0x148
[18153.773447] do_el0_svc+0x3c/0x58
[18153.773453] el0_svc+0x50/0xa8
[18153.773459] el0t_64_sync_handler+0x78/0x108
[18153.773465] el0t_64_sync+0x198/0x1a0
On 10.02.2025 15:52, Tvrtko Ursulin wrote:
> On 10/02/2025 12:41, Adrián Larumbe wrote:
> > Panthor's fdinfo handler is routed through the /proc file system, which
> > executes in an atomic context. That means we cannot use mutexes because
> > they might sleep.
>
> Have the debug splat at hand? I am thinking it is not because of fdinfo reads,
> which are allowed to sleep, but has to be something else.
>
> Regards,
>
> Tvrtko
>
> > This bug was uncovered by enabling some of the kernel's mutex-debugging
> > features:
> >
> > CONFIG_DEBUG_RT_MUTEXES=y
> > CONFIG_DEBUG_MUTEXES=y
> >
> > Replace Panthor's group fdinfo data mutex with a guarded spinlock.
> >
> > Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> > Fixes: e16635d88fa0 ("drm/panthor: add DRM fdinfo support")
> > ---
> > drivers/gpu/drm/panthor/panthor_sched.c | 26 ++++++++++++-------------
> > 1 file changed, 12 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> > index 0b93bf83a9b2..7a267d1efeb6 100644
> > --- a/drivers/gpu/drm/panthor/panthor_sched.c
> > +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> > @@ -9,6 +9,7 @@
> > #include <drm/panthor_drm.h>
> > #include <linux/build_bug.h>
> > +#include <linux/cleanup.h>
> > #include <linux/clk.h>
> > #include <linux/delay.h>
> > #include <linux/dma-mapping.h>
> > @@ -631,10 +632,10 @@ struct panthor_group {
> > struct panthor_gpu_usage data;
> > /**
> > - * @lock: Mutex to govern concurrent access from drm file's fdinfo callback
> > - * and job post-completion processing function
> > + * @fdinfo.lock: Spinlock to govern concurrent access from drm file's fdinfo
> > + * callback and job post-completion processing function
> > */
> > - struct mutex lock;
> > + spinlock_t lock;
> > /** @fdinfo.kbo_sizes: Aggregate size of private kernel BO's held by the group. */
> > size_t kbo_sizes;
> > @@ -910,8 +911,6 @@ static void group_release_work(struct work_struct *work)
> > release_work);
> > u32 i;
> > - mutex_destroy(&group->fdinfo.lock);
> > -
> > for (i = 0; i < group->queue_count; i++)
> > group_free_queue(group, group->queues[i]);
> > @@ -2861,12 +2860,12 @@ static void update_fdinfo_stats(struct panthor_job *job)
> > struct panthor_job_profiling_data *slots = queue->profiling.slots->kmap;
> > struct panthor_job_profiling_data *data = &slots[job->profiling.slot];
> > - mutex_lock(&group->fdinfo.lock);
> > - if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
> > - fdinfo->cycles += data->cycles.after - data->cycles.before;
> > - if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
> > - fdinfo->time += data->time.after - data->time.before;
> > - mutex_unlock(&group->fdinfo.lock);
> > + scoped_guard(spinlock, &group->fdinfo.lock) {
> > + if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
> > + fdinfo->cycles += data->cycles.after - data->cycles.before;
> > + if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
> > + fdinfo->time += data->time.after - data->time.before;
> > + }
> > }
> > void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
> > @@ -2880,12 +2879,11 @@ void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
> > xa_lock(&gpool->xa);
> > xa_for_each(&gpool->xa, i, group) {
> > - mutex_lock(&group->fdinfo.lock);
> > + guard(spinlock)(&group->fdinfo.lock);
> > pfile->stats.cycles += group->fdinfo.data.cycles;
> > pfile->stats.time += group->fdinfo.data.time;
> > group->fdinfo.data.cycles = 0;
> > group->fdinfo.data.time = 0;
> > - mutex_unlock(&group->fdinfo.lock);
> > }
> > xa_unlock(&gpool->xa);
> > }
> > @@ -3531,7 +3529,7 @@ int panthor_group_create(struct panthor_file *pfile,
> > mutex_unlock(&sched->reset.lock);
> > add_group_kbo_sizes(group->ptdev, group);
> > - mutex_init(&group->fdinfo.lock);
> > + spin_lock_init(&group->fdinfo.lock);
> > return gid;
> >
> > base-commit: 2eca617f12586abff62038db1c14cb3aa60a15aa
> > prerequisite-patch-id: 7e787ce5973b5fc7e9f69f26aa4d7e5ec03d5caa
> > prerequisite-patch-id: 03a619b8c741444b28435850e23d9ec463171c13
> > prerequisite-patch-id: 290e1053f8bf4a8b80fb037a87cae7e096b5aa96
> > prerequisite-patch-id: bc49bb8c29905650fb4788acc528bb44013c0240
> > prerequisite-patch-id: 46cab4c980824c03e5164afc43085fec23e1cba5
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path
2025-02-10 15:52 ` Tvrtko Ursulin
2025-02-10 16:08 ` Adrián Larumbe
@ 2025-02-10 16:10 ` Adrián Larumbe
1 sibling, 0 replies; 15+ messages in thread
From: Adrián Larumbe @ 2025-02-10 16:10 UTC (permalink / raw)
To: Tvrtko Ursulin
Cc: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
kernel, dri-devel, linux-kernel
This is the debug log for the second patch in the series:
[18325.029172] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562
[18325.029947] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 203483, name: cat
[18325.030646] preempt_count: 1, expected: 0
[18325.031012] RCU nest depth: 0, expected: 0
[18325.031387] INFO: lockdep is turned off.
[18325.031754] CPU: 4 UID: 0 PID: 203483 Comm: cat Tainted: G W 6.14.0-rc1-panthor-next-rk3588-fdinfo+ #1
[18325.031763] Tainted: [W]=WARN
[18325.031767] Hardware name: Radxa ROCK 5B (DT)
[18325.031772] Call trace:
[18325.031777] show_stack+0x24/0x38 (C)
[18325.031813] dump_stack_lvl+0x3c/0x98
[18325.031827] dump_stack+0x18/0x24
[18325.031833] __might_resched+0x298/0x2b0
[18325.031844] __might_sleep+0x6c/0xb0
[18325.031850] __mutex_lock_common+0x7c/0x1950
[18325.031875] mutex_lock_nested+0x38/0x50
[18325.031889] panthor_vm_heaps_sizes+0xa8/0x158 [panthor]
[18325.031973] panthor_show_fdinfo+0x1a0/0x228 [panthor]
[18325.032040] drm_show_fdinfo+0x1a4/0x1e0 [drm]
[18325.032328] seq_show+0x274/0x358
[18325.032343] seq_read_iter+0x1d4/0x630
[18325.032358] seq_read+0x148/0x1a0
[18325.032364] vfs_read+0x114/0x3e0
[18325.032377] ksys_read+0x90/0x110
[18325.032383] __arm64_sys_read+0x50/0x70
[18325.032389] invoke_syscall+0x60/0x178
[18325.032400] el0_svc_common+0x104/0x148
[18325.032406] do_el0_svc+0x3c/0x58
[18325.032413] el0_svc+0x50/0xa8
[18325.032422] el0t_64_sync_handler+0x78/0x108
[18325.032427] el0t_64_sync+0x198/0x1a0
On 10.02.2025 15:52, Tvrtko Ursulin wrote:
>
> On 10/02/2025 12:41, Adrián Larumbe wrote:
> > Panthor's fdinfo handler is routed through the /proc file system, which
> > executes in an atomic context. That means we cannot use mutexes because
> > they might sleep.
>
> Have the debug splat at hand? I am thinking it is not because of fdinfo reads,
> which are allowed to sleep, but has to be something else.
>
> Regards,
>
> Tvrtko
>
> > This bug was uncovered by enabling some of the kernel's mutex-debugging
> > features:
> >
> > CONFIG_DEBUG_RT_MUTEXES=y
> > CONFIG_DEBUG_MUTEXES=y
> >
> > Replace Panthor's group fdinfo data mutex with a guarded spinlock.
> >
> > Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> > Fixes: e16635d88fa0 ("drm/panthor: add DRM fdinfo support")
> > ---
> > drivers/gpu/drm/panthor/panthor_sched.c | 26 ++++++++++++-------------
> > 1 file changed, 12 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> > index 0b93bf83a9b2..7a267d1efeb6 100644
> > --- a/drivers/gpu/drm/panthor/panthor_sched.c
> > +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> > @@ -9,6 +9,7 @@
> > #include <drm/panthor_drm.h>
> > #include <linux/build_bug.h>
> > +#include <linux/cleanup.h>
> > #include <linux/clk.h>
> > #include <linux/delay.h>
> > #include <linux/dma-mapping.h>
> > @@ -631,10 +632,10 @@ struct panthor_group {
> > struct panthor_gpu_usage data;
> > /**
> > - * @lock: Mutex to govern concurrent access from drm file's fdinfo callback
> > - * and job post-completion processing function
> > + * @fdinfo.lock: Spinlock to govern concurrent access from drm file's fdinfo
> > + * callback and job post-completion processing function
> > */
> > - struct mutex lock;
> > + spinlock_t lock;
> > /** @fdinfo.kbo_sizes: Aggregate size of private kernel BO's held by the group. */
> > size_t kbo_sizes;
> > @@ -910,8 +911,6 @@ static void group_release_work(struct work_struct *work)
> > release_work);
> > u32 i;
> > - mutex_destroy(&group->fdinfo.lock);
> > -
> > for (i = 0; i < group->queue_count; i++)
> > group_free_queue(group, group->queues[i]);
> > @@ -2861,12 +2860,12 @@ static void update_fdinfo_stats(struct panthor_job *job)
> > struct panthor_job_profiling_data *slots = queue->profiling.slots->kmap;
> > struct panthor_job_profiling_data *data = &slots[job->profiling.slot];
> > - mutex_lock(&group->fdinfo.lock);
> > - if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
> > - fdinfo->cycles += data->cycles.after - data->cycles.before;
> > - if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
> > - fdinfo->time += data->time.after - data->time.before;
> > - mutex_unlock(&group->fdinfo.lock);
> > + scoped_guard(spinlock, &group->fdinfo.lock) {
> > + if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
> > + fdinfo->cycles += data->cycles.after - data->cycles.before;
> > + if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
> > + fdinfo->time += data->time.after - data->time.before;
> > + }
> > }
> > void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
> > @@ -2880,12 +2879,11 @@ void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
> > xa_lock(&gpool->xa);
> > xa_for_each(&gpool->xa, i, group) {
> > - mutex_lock(&group->fdinfo.lock);
> > + guard(spinlock)(&group->fdinfo.lock);
> > pfile->stats.cycles += group->fdinfo.data.cycles;
> > pfile->stats.time += group->fdinfo.data.time;
> > group->fdinfo.data.cycles = 0;
> > group->fdinfo.data.time = 0;
> > - mutex_unlock(&group->fdinfo.lock);
> > }
> > xa_unlock(&gpool->xa);
> > }
> > @@ -3531,7 +3529,7 @@ int panthor_group_create(struct panthor_file *pfile,
> > mutex_unlock(&sched->reset.lock);
> > add_group_kbo_sizes(group->ptdev, group);
> > - mutex_init(&group->fdinfo.lock);
> > + spin_lock_init(&group->fdinfo.lock);
> > return gid;
> >
> > base-commit: 2eca617f12586abff62038db1c14cb3aa60a15aa
> > prerequisite-patch-id: 7e787ce5973b5fc7e9f69f26aa4d7e5ec03d5caa
> > prerequisite-patch-id: 03a619b8c741444b28435850e23d9ec463171c13
> > prerequisite-patch-id: 290e1053f8bf4a8b80fb037a87cae7e096b5aa96
> > prerequisite-patch-id: bc49bb8c29905650fb4788acc528bb44013c0240
> > prerequisite-patch-id: 46cab4c980824c03e5164afc43085fec23e1cba5
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path
2025-02-10 16:08 ` Adrián Larumbe
@ 2025-02-11 11:39 ` Tvrtko Ursulin
2025-02-11 12:41 ` Boris Brezillon
2025-02-12 16:33 ` Boris Brezillon
0 siblings, 2 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2025-02-11 11:39 UTC (permalink / raw)
To: Adrián Larumbe
Cc: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
kernel, dri-devel, linux-kernel
On 10/02/2025 16:08, Adrián Larumbe wrote:
> Hi Tvrtko,
Thanks!
> [18153.770244] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562
> [18153.771059] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 203412, name: cat
> [18153.771757] preempt_count: 1, expected: 0
> [18153.772164] RCU nest depth: 0, expected: 0
> [18153.772538] INFO: lockdep is turned off.
> [18153.772898] CPU: 4 UID: 0 PID: 203412 Comm: cat Tainted: G W 6.14.0-rc1-panthor-next-rk3588-fdinfo+ #1
> [18153.772906] Tainted: [W]=WARN
> [18153.772908] Hardware name: Radxa ROCK 5B (DT)
> [18153.772911] Call trace:
> [18153.772913] show_stack+0x24/0x38 (C)
> [18153.772927] dump_stack_lvl+0x3c/0x98
> [18153.772935] dump_stack+0x18/0x24
> [18153.772941] __might_resched+0x298/0x2b0
> [18153.772948] __might_sleep+0x6c/0xb0
> [18153.772953] __mutex_lock_common+0x7c/0x1950
> [18153.772962] mutex_lock_nested+0x38/0x50
> [18153.772969] panthor_fdinfo_gather_group_samples+0x80/0x138 [panthor]
> [18153.773042] panthor_show_fdinfo+0x80/0x228 [panthor]
> [18153.773109] drm_show_fdinfo+0x1a4/0x1e0 [drm]
> [18153.773397] seq_show+0x274/0x358
> [18153.773404] seq_read_iter+0x1d4/0x630
There is a mutex_lock literally in seq_read_iter.
So colour me confused. What created the atomic context between then and
Panthor code?! I just don't see it.
Regards,
Tvrtko
> [18153.773411] seq_read+0x148/0x1a0
> [18153.773416] vfs_read+0x114/0x3e0
> [18153.773423] ksys_read+0x90/0x110
> [18153.773429] __arm64_sys_read+0x50/0x70
> [18153.773435] invoke_syscall+0x60/0x178
> [18153.773442] el0_svc_common+0x104/0x148
> [18153.773447] do_el0_svc+0x3c/0x58
> [18153.773453] el0_svc+0x50/0xa8
> [18153.773459] el0t_64_sync_handler+0x78/0x108
> [18153.773465] el0t_64_sync+0x198/0x1a0
>
>
> On 10.02.2025 15:52, Tvrtko Ursulin wrote:
>> On 10/02/2025 12:41, Adrián Larumbe wrote:
>>> Panthor's fdinfo handler is routed through the /proc file system, which
>>> executes in an atomic context. That means we cannot use mutexes because
>>> they might sleep.
>>
>> Have the debug splat at hand? I am thinking it is not because of fdinfo reads,
>> which are allowed to sleep, but has to be something else.
>>
>> Regards,
>>
>> Tvrtko
>>
>>> This bug was uncovered by enabling some of the kernel's mutex-debugging
>>> features:
>>>
>>> CONFIG_DEBUG_RT_MUTEXES=y
>>> CONFIG_DEBUG_MUTEXES=y
>>>
>>> Replace Panthor's group fdinfo data mutex with a guarded spinlock.
>>>
>>> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
>>> Fixes: e16635d88fa0 ("drm/panthor: add DRM fdinfo support")
>>> ---
>>> drivers/gpu/drm/panthor/panthor_sched.c | 26 ++++++++++++-------------
>>> 1 file changed, 12 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
>>> index 0b93bf83a9b2..7a267d1efeb6 100644
>>> --- a/drivers/gpu/drm/panthor/panthor_sched.c
>>> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
>>> @@ -9,6 +9,7 @@
>>> #include <drm/panthor_drm.h>
>>> #include <linux/build_bug.h>
>>> +#include <linux/cleanup.h>
>>> #include <linux/clk.h>
>>> #include <linux/delay.h>
>>> #include <linux/dma-mapping.h>
>>> @@ -631,10 +632,10 @@ struct panthor_group {
>>> struct panthor_gpu_usage data;
>>> /**
>>> - * @lock: Mutex to govern concurrent access from drm file's fdinfo callback
>>> - * and job post-completion processing function
>>> + * @fdinfo.lock: Spinlock to govern concurrent access from drm file's fdinfo
>>> + * callback and job post-completion processing function
>>> */
>>> - struct mutex lock;
>>> + spinlock_t lock;
>>> /** @fdinfo.kbo_sizes: Aggregate size of private kernel BO's held by the group. */
>>> size_t kbo_sizes;
>>> @@ -910,8 +911,6 @@ static void group_release_work(struct work_struct *work)
>>> release_work);
>>> u32 i;
>>> - mutex_destroy(&group->fdinfo.lock);
>>> -
>>> for (i = 0; i < group->queue_count; i++)
>>> group_free_queue(group, group->queues[i]);
>>> @@ -2861,12 +2860,12 @@ static void update_fdinfo_stats(struct panthor_job *job)
>>> struct panthor_job_profiling_data *slots = queue->profiling.slots->kmap;
>>> struct panthor_job_profiling_data *data = &slots[job->profiling.slot];
>>> - mutex_lock(&group->fdinfo.lock);
>>> - if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
>>> - fdinfo->cycles += data->cycles.after - data->cycles.before;
>>> - if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
>>> - fdinfo->time += data->time.after - data->time.before;
>>> - mutex_unlock(&group->fdinfo.lock);
>>> + scoped_guard(spinlock, &group->fdinfo.lock) {
>>> + if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
>>> + fdinfo->cycles += data->cycles.after - data->cycles.before;
>>> + if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
>>> + fdinfo->time += data->time.after - data->time.before;
>>> + }
>>> }
>>> void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
>>> @@ -2880,12 +2879,11 @@ void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
>>> xa_lock(&gpool->xa);
>>> xa_for_each(&gpool->xa, i, group) {
>>> - mutex_lock(&group->fdinfo.lock);
>>> + guard(spinlock)(&group->fdinfo.lock);
>>> pfile->stats.cycles += group->fdinfo.data.cycles;
>>> pfile->stats.time += group->fdinfo.data.time;
>>> group->fdinfo.data.cycles = 0;
>>> group->fdinfo.data.time = 0;
>>> - mutex_unlock(&group->fdinfo.lock);
>>> }
>>> xa_unlock(&gpool->xa);
>>> }
>>> @@ -3531,7 +3529,7 @@ int panthor_group_create(struct panthor_file *pfile,
>>> mutex_unlock(&sched->reset.lock);
>>> add_group_kbo_sizes(group->ptdev, group);
>>> - mutex_init(&group->fdinfo.lock);
>>> + spin_lock_init(&group->fdinfo.lock);
>>> return gid;
>>>
>>> base-commit: 2eca617f12586abff62038db1c14cb3aa60a15aa
>>> prerequisite-patch-id: 7e787ce5973b5fc7e9f69f26aa4d7e5ec03d5caa
>>> prerequisite-patch-id: 03a619b8c741444b28435850e23d9ec463171c13
>>> prerequisite-patch-id: 290e1053f8bf4a8b80fb037a87cae7e096b5aa96
>>> prerequisite-patch-id: bc49bb8c29905650fb4788acc528bb44013c0240
>>> prerequisite-patch-id: 46cab4c980824c03e5164afc43085fec23e1cba5
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path
2025-02-11 11:39 ` Tvrtko Ursulin
@ 2025-02-11 12:41 ` Boris Brezillon
2025-02-12 14:23 ` Adrián Larumbe
2025-02-12 16:33 ` Boris Brezillon
1 sibling, 1 reply; 15+ messages in thread
From: Boris Brezillon @ 2025-02-11 12:41 UTC (permalink / raw)
To: Tvrtko Ursulin
Cc: Adrián Larumbe, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
kernel, dri-devel, linux-kernel
On Tue, 11 Feb 2025 11:39:49 +0000
Tvrtko Ursulin <tursulin@ursulin.net> wrote:
> On 10/02/2025 16:08, Adrián Larumbe wrote:
> > Hi Tvrtko,
>
> Thanks!
>
> > [18153.770244] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562
> > [18153.771059] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 203412, name: cat
> > [18153.771757] preempt_count: 1, expected: 0
> > [18153.772164] RCU nest depth: 0, expected: 0
> > [18153.772538] INFO: lockdep is turned off.
> > [18153.772898] CPU: 4 UID: 0 PID: 203412 Comm: cat Tainted: G W 6.14.0-rc1-panthor-next-rk3588-fdinfo+ #1
> > [18153.772906] Tainted: [W]=WARN
> > [18153.772908] Hardware name: Radxa ROCK 5B (DT)
> > [18153.772911] Call trace:
> > [18153.772913] show_stack+0x24/0x38 (C)
> > [18153.772927] dump_stack_lvl+0x3c/0x98
> > [18153.772935] dump_stack+0x18/0x24
> > [18153.772941] __might_resched+0x298/0x2b0
> > [18153.772948] __might_sleep+0x6c/0xb0
> > [18153.772953] __mutex_lock_common+0x7c/0x1950
> > [18153.772962] mutex_lock_nested+0x38/0x50
> > [18153.772969] panthor_fdinfo_gather_group_samples+0x80/0x138 [panthor]
> > [18153.773042] panthor_show_fdinfo+0x80/0x228 [panthor]
> > [18153.773109] drm_show_fdinfo+0x1a4/0x1e0 [drm]
> > [18153.773397] seq_show+0x274/0x358
> > [18153.773404] seq_read_iter+0x1d4/0x630
>
> There is a mutex_lock literally in seq_read_iter.
>
> So colour me confused. What created the atomic context between then and
> Panthor code?! I just don't see it.
Uh, looks like we've leaked an atomic context somewhere, indeed.
Adrian, do you have a reliable reproducer for this bug?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path
2025-02-11 12:41 ` Boris Brezillon
@ 2025-02-12 14:23 ` Adrián Larumbe
0 siblings, 0 replies; 15+ messages in thread
From: Adrián Larumbe @ 2025-02-12 14:23 UTC (permalink / raw)
To: Boris Brezillon
Cc: Tvrtko Ursulin, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
kernel, dri-devel, linux-kernel
Hi Boris,
Here are the branch and kernel config I used for reproducing the bug:
https://gitlab.freedesktop.org/larumbe/drm-misc/-/tree/panthor-fdinfo-internalbosizes-10
https://gitlab.collabora.com/-/snippets/426
On 11.02.2025 13:41, Boris Brezillon wrote:
> On Tue, 11 Feb 2025 11:39:49 +0000
> Tvrtko Ursulin <tursulin@ursulin.net> wrote:
>
> > On 10/02/2025 16:08, Adrián Larumbe wrote:
> > > Hi Tvrtko,
> >
> > Thanks!
> >
> > > [18153.770244] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562
> > > [18153.771059] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 203412, name: cat
> > > [18153.771757] preempt_count: 1, expected: 0
> > > [18153.772164] RCU nest depth: 0, expected: 0
> > > [18153.772538] INFO: lockdep is turned off.
> > > [18153.772898] CPU: 4 UID: 0 PID: 203412 Comm: cat Tainted: G W 6.14.0-rc1-panthor-next-rk3588-fdinfo+ #1
> > > [18153.772906] Tainted: [W]=WARN
> > > [18153.772908] Hardware name: Radxa ROCK 5B (DT)
> > > [18153.772911] Call trace:
> > > [18153.772913] show_stack+0x24/0x38 (C)
> > > [18153.772927] dump_stack_lvl+0x3c/0x98
> > > [18153.772935] dump_stack+0x18/0x24
> > > [18153.772941] __might_resched+0x298/0x2b0
> > > [18153.772948] __might_sleep+0x6c/0xb0
> > > [18153.772953] __mutex_lock_common+0x7c/0x1950
> > > [18153.772962] mutex_lock_nested+0x38/0x50
> > > [18153.772969] panthor_fdinfo_gather_group_samples+0x80/0x138 [panthor]
> > > [18153.773042] panthor_show_fdinfo+0x80/0x228 [panthor]
> > > [18153.773109] drm_show_fdinfo+0x1a4/0x1e0 [drm]
> > > [18153.773397] seq_show+0x274/0x358
> > > [18153.773404] seq_read_iter+0x1d4/0x630
> >
> > There is a mutex_lock literally in seq_read_iter.
> >
> > So colour me confused. What created the atomic context between then and
> > Panthor code?! I just don't see it.
>
> Uh, looks like we've leaked an atomic context somewhere, indeed.
> Adrian, do you have a reliable reproducer for this bug?
Adrian Larumbe
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path
2025-02-11 11:39 ` Tvrtko Ursulin
2025-02-11 12:41 ` Boris Brezillon
@ 2025-02-12 16:33 ` Boris Brezillon
2025-02-12 16:47 ` Tvrtko Ursulin
1 sibling, 1 reply; 15+ messages in thread
From: Boris Brezillon @ 2025-02-12 16:33 UTC (permalink / raw)
To: Tvrtko Ursulin
Cc: Adrián Larumbe, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
kernel, dri-devel, linux-kernel
On Tue, 11 Feb 2025 11:39:49 +0000
Tvrtko Ursulin <tursulin@ursulin.net> wrote:
> On 10/02/2025 16:08, Adrián Larumbe wrote:
> > Hi Tvrtko,
>
> Thanks!
>
> > [18153.770244] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562
> > [18153.771059] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 203412, name: cat
> > [18153.771757] preempt_count: 1, expected: 0
> > [18153.772164] RCU nest depth: 0, expected: 0
> > [18153.772538] INFO: lockdep is turned off.
> > [18153.772898] CPU: 4 UID: 0 PID: 203412 Comm: cat Tainted: G W 6.14.0-rc1-panthor-next-rk3588-fdinfo+ #1
> > [18153.772906] Tainted: [W]=WARN
> > [18153.772908] Hardware name: Radxa ROCK 5B (DT)
> > [18153.772911] Call trace:
> > [18153.772913] show_stack+0x24/0x38 (C)
> > [18153.772927] dump_stack_lvl+0x3c/0x98
> > [18153.772935] dump_stack+0x18/0x24
> > [18153.772941] __might_resched+0x298/0x2b0
> > [18153.772948] __might_sleep+0x6c/0xb0
> > [18153.772953] __mutex_lock_common+0x7c/0x1950
> > [18153.772962] mutex_lock_nested+0x38/0x50
> > [18153.772969] panthor_fdinfo_gather_group_samples+0x80/0x138 [panthor]
> > [18153.773042] panthor_show_fdinfo+0x80/0x228 [panthor]
> > [18153.773109] drm_show_fdinfo+0x1a4/0x1e0 [drm]
> > [18153.773397] seq_show+0x274/0x358
> > [18153.773404] seq_read_iter+0x1d4/0x630
>
> There is a mutex_lock literally in seq_read_iter.
>
> So colour me confused. What created the atomic context between then and
> Panthor code?! I just don't see it.
It's actually the xa_lock() we take when gathering fdinfo data that
puts us in this atomic context. In other words, the fix is correct, but
the explanation is wrong :-).
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path
2025-02-12 16:33 ` Boris Brezillon
@ 2025-02-12 16:47 ` Tvrtko Ursulin
0 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2025-02-12 16:47 UTC (permalink / raw)
To: Boris Brezillon
Cc: Adrián Larumbe, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
kernel, dri-devel, linux-kernel
On 12/02/2025 16:33, Boris Brezillon wrote:
> On Tue, 11 Feb 2025 11:39:49 +0000
> Tvrtko Ursulin <tursulin@ursulin.net> wrote:
>
>> On 10/02/2025 16:08, Adrián Larumbe wrote:
>>> Hi Tvrtko,
>>
>> Thanks!
>>
>>> [18153.770244] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562
>>> [18153.771059] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 203412, name: cat
>>> [18153.771757] preempt_count: 1, expected: 0
>>> [18153.772164] RCU nest depth: 0, expected: 0
>>> [18153.772538] INFO: lockdep is turned off.
>>> [18153.772898] CPU: 4 UID: 0 PID: 203412 Comm: cat Tainted: G W 6.14.0-rc1-panthor-next-rk3588-fdinfo+ #1
>>> [18153.772906] Tainted: [W]=WARN
>>> [18153.772908] Hardware name: Radxa ROCK 5B (DT)
>>> [18153.772911] Call trace:
>>> [18153.772913] show_stack+0x24/0x38 (C)
>>> [18153.772927] dump_stack_lvl+0x3c/0x98
>>> [18153.772935] dump_stack+0x18/0x24
>>> [18153.772941] __might_resched+0x298/0x2b0
>>> [18153.772948] __might_sleep+0x6c/0xb0
>>> [18153.772953] __mutex_lock_common+0x7c/0x1950
>>> [18153.772962] mutex_lock_nested+0x38/0x50
>>> [18153.772969] panthor_fdinfo_gather_group_samples+0x80/0x138 [panthor]
>>> [18153.773042] panthor_show_fdinfo+0x80/0x228 [panthor]
>>> [18153.773109] drm_show_fdinfo+0x1a4/0x1e0 [drm]
>>> [18153.773397] seq_show+0x274/0x358
>>> [18153.773404] seq_read_iter+0x1d4/0x630
>>
>> There is a mutex_lock literally in seq_read_iter.
>>
>> So colour me confused. What created the atomic context between then and
>> Panthor code?! I just don't see it.
>
> It's actually the xa_lock() we take when gathering fdinfo data that
> puts us in this atomic context. In other words, the fix is correct, but
> the explanation is wrong :-).
Huh I did not spot that when glancing over it yesterday or so, but
anyway, excellent you figured it out! Now the commit message can be
fixed and not mislead someone in the future to think mutexes cannot be
used in fdinfo. Thanks for taking time to check it!
Regards,
Tvrtko
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-02-12 16:47 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-10 12:41 [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path Adrián Larumbe
2025-02-10 12:42 ` [PATCH 2/2] drm/panthor: Avoid sleep locking in the internal BO size path Adrián Larumbe
2025-02-10 13:18 ` Boris Brezillon
2025-02-10 13:43 ` Liviu Dudau
2025-02-10 14:15 ` Boris Brezillon
2025-02-10 13:08 ` [PATCH 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path Boris Brezillon
2025-02-10 13:37 ` Liviu Dudau
2025-02-10 15:52 ` Tvrtko Ursulin
2025-02-10 16:08 ` Adrián Larumbe
2025-02-11 11:39 ` Tvrtko Ursulin
2025-02-11 12:41 ` Boris Brezillon
2025-02-12 14:23 ` Adrián Larumbe
2025-02-12 16:33 ` Boris Brezillon
2025-02-12 16:47 ` Tvrtko Ursulin
2025-02-10 16:10 ` Adrián Larumbe
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.