* [PATCH] drm/amdgpu: Make the submission path memory reclaim safe
@ 2024-11-13 13:48 Tvrtko Ursulin
2024-11-13 14:26 ` Christian König
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Tvrtko Ursulin @ 2024-11-13 13:48 UTC (permalink / raw)
To: amd-gfx, dri-devel
Cc: kernel-dev, Tvrtko Ursulin, stable, Matthew Brost,
Danilo Krummrich, Philipp Stanner, Alex Deucher,
Christian König
From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
As commit 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
points out, ever since
a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread"),
any workqueue flushing done from the job submission path must only
involve memory reclaim safe workqueues to be safe against reclaim
deadlocks.
This is also pointed out by workqueue sanity checks:
[ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work [gpu_sched] is flushing !WQ_MEM_RECLAIM events:amdgpu_device_delay_enable_gfx_off [amdgpu]
...
[ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
...
[ ] Call Trace:
[ ] <TASK>
...
[ ] ? check_flush_dependency+0xf5/0x110
...
[ ] cancel_delayed_work_sync+0x6e/0x80
[ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
[ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu]
[ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
[ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
[ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu]
[ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched]
[ ] process_one_work+0x217/0x720
...
[ ] </TASK>
Fix this by creating a memory reclaim safe driver workqueue and make the
submission path use it.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
References: 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread")
Cc: stable@vger.kernel.org
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Philipp Stanner <pstanner@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 25 +++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 5 +++--
3 files changed, 30 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 7645e498faa4..a6aad687537e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -268,6 +268,8 @@ extern int amdgpu_agp;
extern int amdgpu_wbrf;
+extern struct workqueue_struct *amdgpu_reclaim_wq;
+
#define AMDGPU_VM_MAX_NUM_CTX 4096
#define AMDGPU_SG_THRESHOLD (256*1024*1024)
#define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 38686203bea6..f5b7172e8042 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -255,6 +255,8 @@ struct amdgpu_watchdog_timer amdgpu_watchdog_timer = {
.period = 0x0, /* default to 0x0 (timeout disable) */
};
+struct workqueue_struct *amdgpu_reclaim_wq;
+
/**
* DOC: vramlimit (int)
* Restrict the total amount of VRAM in MiB for testing. The default is 0 (Use full VRAM).
@@ -2971,6 +2973,21 @@ static struct pci_driver amdgpu_kms_pci_driver = {
.dev_groups = amdgpu_sysfs_groups,
};
+static int amdgpu_wq_init(void)
+{
+ amdgpu_reclaim_wq =
+ alloc_workqueue("amdgpu-reclaim", WQ_MEM_RECLAIM, 0);
+ if (!amdgpu_reclaim_wq)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void amdgpu_wq_fini(void)
+{
+ destroy_workqueue(amdgpu_reclaim_wq);
+}
+
static int __init amdgpu_init(void)
{
int r;
@@ -2978,6 +2995,10 @@ static int __init amdgpu_init(void)
if (drm_firmware_drivers_only())
return -EINVAL;
+ r = amdgpu_wq_init();
+ if (r)
+ goto error_wq;
+
r = amdgpu_sync_init();
if (r)
goto error_sync;
@@ -3006,6 +3027,9 @@ static int __init amdgpu_init(void)
amdgpu_sync_fini();
error_sync:
+ amdgpu_wq_fini();
+
+error_wq:
return r;
}
@@ -3017,6 +3041,7 @@ static void __exit amdgpu_exit(void)
amdgpu_acpi_release();
amdgpu_sync_fini();
amdgpu_fence_slab_fini();
+ amdgpu_wq_fini();
mmu_notifier_synchronize();
amdgpu_xcp_drv_release();
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 2f3f09dfb1fd..f8fd71d9382f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -790,8 +790,9 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable)
AMD_IP_BLOCK_TYPE_GFX, true))
adev->gfx.gfx_off_state = true;
} else {
- schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
- delay);
+ queue_delayed_work(amdgpu_reclaim_wq,
+ &adev->gfx.gfx_off_delay_work,
+ delay);
}
}
} else {
--
2.46.0
* Re: [PATCH] drm/amdgpu: Make the submission path memory reclaim safe
2024-11-13 13:48 [PATCH] drm/amdgpu: Make the submission path memory reclaim safe Tvrtko Ursulin
@ 2024-11-13 14:26 ` Christian König
2024-11-13 14:42 ` Tvrtko Ursulin
2024-12-03 14:24 ` Oleksandr Natalenko
2024-12-17 10:39 ` Philipp Stanner
2 siblings, 1 reply; 12+ messages in thread
From: Christian König @ 2024-11-13 14:26 UTC (permalink / raw)
To: Tvrtko Ursulin, amd-gfx, dri-devel
Cc: kernel-dev, Tvrtko Ursulin, stable, Matthew Brost,
Danilo Krummrich, Philipp Stanner, Alex Deucher
On 13.11.24 at 14:48, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>
> As commit 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
> points out, ever since
> a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread"),
> any workqueue flushing done from the job submission path must only
> involve memory reclaim safe workqueues to be safe against reclaim
> deadlocks.
>
> This is also pointed out by workqueue sanity checks:
>
> [ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work [gpu_sched] is flushing !WQ_MEM_RECLAIM events:amdgpu_device_delay_enable_gfx_off [amdgpu]
> ...
> [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
> ...
> [ ] Call Trace:
> [ ] <TASK>
> ...
> [ ] ? check_flush_dependency+0xf5/0x110
> ...
> [ ] cancel_delayed_work_sync+0x6e/0x80
> [ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
> [ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu]
> [ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
> [ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
> [ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu]
> [ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched]
> [ ] process_one_work+0x217/0x720
> ...
> [ ] </TASK>
>
> Fix this by creating a memory reclaim safe driver workqueue and make the
> submission path use it.
Oh well, that is a really good catch! I wasn't aware the workqueues
could be blocked by memory reclaim as well.
Do we have system-wide workqueues for that? It seems a bit overkill that
amdgpu has to allocate one on its own.
Apart from that looks good to me.
Regards,
Christian.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> References: 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
> Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread")
> Cc: stable@vger.kernel.org
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Philipp Stanner <pstanner@redhat.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 25 +++++++++++++++++++++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 5 +++--
> 3 files changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 7645e498faa4..a6aad687537e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -268,6 +268,8 @@ extern int amdgpu_agp;
>
> extern int amdgpu_wbrf;
>
> +extern struct workqueue_struct *amdgpu_reclaim_wq;
> +
> #define AMDGPU_VM_MAX_NUM_CTX 4096
> #define AMDGPU_SG_THRESHOLD (256*1024*1024)
> #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 38686203bea6..f5b7172e8042 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -255,6 +255,8 @@ struct amdgpu_watchdog_timer amdgpu_watchdog_timer = {
> .period = 0x0, /* default to 0x0 (timeout disable) */
> };
>
> +struct workqueue_struct *amdgpu_reclaim_wq;
> +
> /**
> * DOC: vramlimit (int)
> * Restrict the total amount of VRAM in MiB for testing. The default is 0 (Use full VRAM).
> @@ -2971,6 +2973,21 @@ static struct pci_driver amdgpu_kms_pci_driver = {
> .dev_groups = amdgpu_sysfs_groups,
> };
>
> +static int amdgpu_wq_init(void)
> +{
> + amdgpu_reclaim_wq =
> + alloc_workqueue("amdgpu-reclaim", WQ_MEM_RECLAIM, 0);
> + if (!amdgpu_reclaim_wq)
> + return -ENOMEM;
> +
> + return 0;
> +}
> +
> +static void amdgpu_wq_fini(void)
> +{
> + destroy_workqueue(amdgpu_reclaim_wq);
> +}
> +
> static int __init amdgpu_init(void)
> {
> int r;
> @@ -2978,6 +2995,10 @@ static int __init amdgpu_init(void)
> if (drm_firmware_drivers_only())
> return -EINVAL;
>
> + r = amdgpu_wq_init();
> + if (r)
> + goto error_wq;
> +
> r = amdgpu_sync_init();
> if (r)
> goto error_sync;
> @@ -3006,6 +3027,9 @@ static int __init amdgpu_init(void)
> amdgpu_sync_fini();
>
> error_sync:
> + amdgpu_wq_fini();
> +
> +error_wq:
> return r;
> }
>
> @@ -3017,6 +3041,7 @@ static void __exit amdgpu_exit(void)
> amdgpu_acpi_release();
> amdgpu_sync_fini();
> amdgpu_fence_slab_fini();
> + amdgpu_wq_fini();
> mmu_notifier_synchronize();
> amdgpu_xcp_drv_release();
> }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 2f3f09dfb1fd..f8fd71d9382f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -790,8 +790,9 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable)
> AMD_IP_BLOCK_TYPE_GFX, true))
> adev->gfx.gfx_off_state = true;
> } else {
> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> - delay);
> + queue_delayed_work(amdgpu_reclaim_wq,
> + &adev->gfx.gfx_off_delay_work,
> + delay);
> }
> }
> } else {
* Re: [PATCH] drm/amdgpu: Make the submission path memory reclaim safe
2024-11-13 14:26 ` Christian König
@ 2024-11-13 14:42 ` Tvrtko Ursulin
2024-11-22 11:34 ` Tvrtko Ursulin
0 siblings, 1 reply; 12+ messages in thread
From: Tvrtko Ursulin @ 2024-11-13 14:42 UTC (permalink / raw)
To: Christian König, Tvrtko Ursulin, amd-gfx, dri-devel
Cc: kernel-dev, stable, Matthew Brost, Danilo Krummrich,
Philipp Stanner, Alex Deucher
On 13/11/2024 14:26, Christian König wrote:
> Am 13.11.24 um 14:48 schrieb Tvrtko Ursulin:
>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>
>> As commit 746ae46c1113 ("drm/sched: Mark scheduler work queues with
>> WQ_MEM_RECLAIM")
>> points out, ever since
>> a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue
>> rather than kthread"),
>> any workqueue flushing done from the job submission path must only
>> involve memory reclaim safe workqueues to be safe against reclaim
>> deadlocks.
>>
>> This is also pointed out by workqueue sanity checks:
>>
>> [ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work
>> [gpu_sched] is flushing !WQ_MEM_RECLAIM
>> events:amdgpu_device_delay_enable_gfx_off [amdgpu]
>> ...
>> [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
>> ...
>> [ ] Call Trace:
>> [ ] <TASK>
>> ...
>> [ ] ? check_flush_dependency+0xf5/0x110
>> ...
>> [ ] cancel_delayed_work_sync+0x6e/0x80
>> [ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
>> [ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu]
>> [ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
>> [ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
>> [ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu]
>> [ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched]
>> [ ] process_one_work+0x217/0x720
>> ...
>> [ ] </TASK>
>>
>> Fix this by creating a memory reclaim safe driver workqueue and make the
>> submission path use it.
>
> Oh well, that is a really good catch! I wasn't aware the workqueues
> could be blocked by memory reclaim as well.
The only credit I can take is for my habit of often running with many
kernel debugging aids enabled.
> Do we have system wide workqueues for that? It seems a bit overkill that
> amdgpu has to allocate one on his own.
I wondered the same but did not find any. The only ones I am aware of are
system_wq & co, created in workqueue_init_early().
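For reference, paraphrased from kernel/workqueue.c (from memory, so the
details may not match the exact source), the early init does roughly:

/* Sketch of workqueue_init_early(), not verbatim kernel source. */
system_wq = alloc_workqueue("events", 0, 0);
system_highpri_wq = alloc_workqueue("events_highpri", WQ_HIGHPRI, 0);
system_long_wq = alloc_workqueue("events_long", 0, 0);
system_unbound_wq = alloc_workqueue("events_unbound", WQ_UNBOUND,
                                    WQ_UNBOUND_MAX_ACTIVE);
system_freezable_wq = alloc_workqueue("events_freezable", WQ_FREEZABLE, 0);
system_power_efficient_wq = alloc_workqueue("events_power_efficient",
                                            WQ_POWER_EFFICIENT, 0);
system_freezable_power_efficient_wq =
        alloc_workqueue("events_freezable_power_efficient",
                        WQ_FREEZABLE | WQ_POWER_EFFICIENT, 0);
/* None of these pass WQ_MEM_RECLAIM, so none of them get a rescuer. */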
Regards,
Tvrtko
> Apart from that looks good to me.
>
> Regards,
> Christian.
>
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>> References: 746ae46c1113 ("drm/sched: Mark scheduler work queues with
>> WQ_MEM_RECLAIM")
>> Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a work
>> queue rather than kthread")
>> Cc: stable@vger.kernel.org
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Cc: Danilo Krummrich <dakr@kernel.org>
>> Cc: Philipp Stanner <pstanner@redhat.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Cc: Christian König <christian.koenig@amd.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 25 +++++++++++++++++++++++++
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 5 +++--
>> 3 files changed, 30 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 7645e498faa4..a6aad687537e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -268,6 +268,8 @@ extern int amdgpu_agp;
>> extern int amdgpu_wbrf;
>> +extern struct workqueue_struct *amdgpu_reclaim_wq;
>> +
>> #define AMDGPU_VM_MAX_NUM_CTX 4096
>> #define AMDGPU_SG_THRESHOLD (256*1024*1024)
>> #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index 38686203bea6..f5b7172e8042 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -255,6 +255,8 @@ struct amdgpu_watchdog_timer amdgpu_watchdog_timer
>> = {
>> .period = 0x0, /* default to 0x0 (timeout disable) */
>> };
>> +struct workqueue_struct *amdgpu_reclaim_wq;
>> +
>> /**
>> * DOC: vramlimit (int)
>> * Restrict the total amount of VRAM in MiB for testing. The
>> default is 0 (Use full VRAM).
>> @@ -2971,6 +2973,21 @@ static struct pci_driver amdgpu_kms_pci_driver = {
>> .dev_groups = amdgpu_sysfs_groups,
>> };
>> +static int amdgpu_wq_init(void)
>> +{
>> + amdgpu_reclaim_wq =
>> + alloc_workqueue("amdgpu-reclaim", WQ_MEM_RECLAIM, 0);
>> + if (!amdgpu_reclaim_wq)
>> + return -ENOMEM;
>> +
>> + return 0;
>> +}
>> +
>> +static void amdgpu_wq_fini(void)
>> +{
>> + destroy_workqueue(amdgpu_reclaim_wq);
>> +}
>> +
>> static int __init amdgpu_init(void)
>> {
>> int r;
>> @@ -2978,6 +2995,10 @@ static int __init amdgpu_init(void)
>> if (drm_firmware_drivers_only())
>> return -EINVAL;
>> + r = amdgpu_wq_init();
>> + if (r)
>> + goto error_wq;
>> +
>> r = amdgpu_sync_init();
>> if (r)
>> goto error_sync;
>> @@ -3006,6 +3027,9 @@ static int __init amdgpu_init(void)
>> amdgpu_sync_fini();
>> error_sync:
>> + amdgpu_wq_fini();
>> +
>> +error_wq:
>> return r;
>> }
>> @@ -3017,6 +3041,7 @@ static void __exit amdgpu_exit(void)
>> amdgpu_acpi_release();
>> amdgpu_sync_fini();
>> amdgpu_fence_slab_fini();
>> + amdgpu_wq_fini();
>> mmu_notifier_synchronize();
>> amdgpu_xcp_drv_release();
>> }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> index 2f3f09dfb1fd..f8fd71d9382f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> @@ -790,8 +790,9 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
>> *adev, bool enable)
>> AMD_IP_BLOCK_TYPE_GFX, true))
>> adev->gfx.gfx_off_state = true;
>> } else {
>> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
>> - delay);
>> + queue_delayed_work(amdgpu_reclaim_wq,
>> + &adev->gfx.gfx_off_delay_work,
>> + delay);
>> }
>> }
>> } else {
>
* Re: [PATCH] drm/amdgpu: Make the submission path memory reclaim safe
2024-11-13 14:42 ` Tvrtko Ursulin
@ 2024-11-22 11:34 ` Tvrtko Ursulin
2024-11-22 13:46 ` Christian König
0 siblings, 1 reply; 12+ messages in thread
From: Tvrtko Ursulin @ 2024-11-22 11:34 UTC (permalink / raw)
To: Christian König, Tvrtko Ursulin, amd-gfx, dri-devel
Cc: kernel-dev, stable, Matthew Brost, Danilo Krummrich,
Philipp Stanner, Alex Deucher
On 13/11/2024 14:42, Tvrtko Ursulin wrote:
>
> On 13/11/2024 14:26, Christian König wrote:
>> Am 13.11.24 um 14:48 schrieb Tvrtko Ursulin:
>>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>
>>> As commit 746ae46c1113 ("drm/sched: Mark scheduler work queues with
>>> WQ_MEM_RECLAIM")
>>> points out, ever since
>>> a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue
>>> rather than kthread"),
>>> any workqueue flushing done from the job submission path must only
>>> involve memory reclaim safe workqueues to be safe against reclaim
>>> deadlocks.
>>>
>>> This is also pointed out by workqueue sanity checks:
>>>
>>> [ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work
>>> [gpu_sched] is flushing !WQ_MEM_RECLAIM
>>> events:amdgpu_device_delay_enable_gfx_off [amdgpu]
>>> ...
>>> [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
>>> ...
>>> [ ] Call Trace:
>>> [ ] <TASK>
>>> ...
>>> [ ] ? check_flush_dependency+0xf5/0x110
>>> ...
>>> [ ] cancel_delayed_work_sync+0x6e/0x80
>>> [ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
>>> [ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu]
>>> [ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
>>> [ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
>>> [ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu]
>>> [ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched]
>>> [ ] process_one_work+0x217/0x720
>>> ...
>>> [ ] </TASK>
>>>
>>> Fix this by creating a memory reclaim safe driver workqueue and make the
>>> submission path use it.
>>
>> Oh well, that is a really good catch! I wasn't aware the workqueues
>> could be blocked by memory reclaim as well.
>
> Only credit I can take is for the habit that I often run with many
> kernel debugging aids enabled.
Although this one actually isn't even under "Kernel hacking".
>> Do we have system wide workqueues for that? It seems a bit overkill
>> that amdgpu has to allocate one on his own.
>
> I wondered the same but did not find any. Only ones I am aware of are
> system_wq&co created in workqueue_init_early().
Gentle ping on this. I don't have any better ideas than creating a new wq.
Regards,
Tvrtko
>> Apart from that looks good to me.
>>
>> Regards,
>> Christian.
>>
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>> References: 746ae46c1113 ("drm/sched: Mark scheduler work queues with
>>> WQ_MEM_RECLAIM")
>>> Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a work
>>> queue rather than kthread")
>>> Cc: stable@vger.kernel.org
>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>> Cc: Danilo Krummrich <dakr@kernel.org>
>>> Cc: Philipp Stanner <pstanner@redhat.com>
>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 25 +++++++++++++++++++++++++
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 5 +++--
>>> 3 files changed, 30 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 7645e498faa4..a6aad687537e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -268,6 +268,8 @@ extern int amdgpu_agp;
>>> extern int amdgpu_wbrf;
>>> +extern struct workqueue_struct *amdgpu_reclaim_wq;
>>> +
>>> #define AMDGPU_VM_MAX_NUM_CTX 4096
>>> #define AMDGPU_SG_THRESHOLD (256*1024*1024)
>>> #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> index 38686203bea6..f5b7172e8042 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> @@ -255,6 +255,8 @@ struct amdgpu_watchdog_timer
>>> amdgpu_watchdog_timer = {
>>> .period = 0x0, /* default to 0x0 (timeout disable) */
>>> };
>>> +struct workqueue_struct *amdgpu_reclaim_wq;
>>> +
>>> /**
>>> * DOC: vramlimit (int)
>>> * Restrict the total amount of VRAM in MiB for testing. The
>>> default is 0 (Use full VRAM).
>>> @@ -2971,6 +2973,21 @@ static struct pci_driver amdgpu_kms_pci_driver
>>> = {
>>> .dev_groups = amdgpu_sysfs_groups,
>>> };
>>> +static int amdgpu_wq_init(void)
>>> +{
>>> + amdgpu_reclaim_wq =
>>> + alloc_workqueue("amdgpu-reclaim", WQ_MEM_RECLAIM, 0);
>>> + if (!amdgpu_reclaim_wq)
>>> + return -ENOMEM;
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static void amdgpu_wq_fini(void)
>>> +{
>>> + destroy_workqueue(amdgpu_reclaim_wq);
>>> +}
>>> +
>>> static int __init amdgpu_init(void)
>>> {
>>> int r;
>>> @@ -2978,6 +2995,10 @@ static int __init amdgpu_init(void)
>>> if (drm_firmware_drivers_only())
>>> return -EINVAL;
>>> + r = amdgpu_wq_init();
>>> + if (r)
>>> + goto error_wq;
>>> +
>>> r = amdgpu_sync_init();
>>> if (r)
>>> goto error_sync;
>>> @@ -3006,6 +3027,9 @@ static int __init amdgpu_init(void)
>>> amdgpu_sync_fini();
>>> error_sync:
>>> + amdgpu_wq_fini();
>>> +
>>> +error_wq:
>>> return r;
>>> }
>>> @@ -3017,6 +3041,7 @@ static void __exit amdgpu_exit(void)
>>> amdgpu_acpi_release();
>>> amdgpu_sync_fini();
>>> amdgpu_fence_slab_fini();
>>> + amdgpu_wq_fini();
>>> mmu_notifier_synchronize();
>>> amdgpu_xcp_drv_release();
>>> }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>> index 2f3f09dfb1fd..f8fd71d9382f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>> @@ -790,8 +790,9 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
>>> *adev, bool enable)
>>> AMD_IP_BLOCK_TYPE_GFX, true))
>>> adev->gfx.gfx_off_state = true;
>>> } else {
>>> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
>>> - delay);
>>> + queue_delayed_work(amdgpu_reclaim_wq,
>>> + &adev->gfx.gfx_off_delay_work,
>>> + delay);
>>> }
>>> }
>>> } else {
>>
* Re: [PATCH] drm/amdgpu: Make the submission path memory reclaim safe
2024-11-22 11:34 ` Tvrtko Ursulin
@ 2024-11-22 13:46 ` Christian König
2024-11-22 14:36 ` Tvrtko Ursulin
0 siblings, 1 reply; 12+ messages in thread
From: Christian König @ 2024-11-22 13:46 UTC (permalink / raw)
To: Tvrtko Ursulin, Tvrtko Ursulin, amd-gfx, dri-devel
Cc: kernel-dev, stable, Matthew Brost, Danilo Krummrich,
Philipp Stanner, Alex Deucher
On 22.11.24 at 12:34, Tvrtko Ursulin wrote:
> On 13/11/2024 14:42, Tvrtko Ursulin wrote:
>> On 13/11/2024 14:26, Christian König wrote:
>>> Am 13.11.24 um 14:48 schrieb Tvrtko Ursulin:
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>>
>>>> As commit 746ae46c1113 ("drm/sched: Mark scheduler work queues with
>>>> WQ_MEM_RECLAIM")
>>>> points out, ever since
>>>> a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue
>>>> rather than kthread"),
>>>> any workqueue flushing done from the job submission path must only
>>>> involve memory reclaim safe workqueues to be safe against reclaim
>>>> deadlocks.
>>>>
>>>> This is also pointed out by workqueue sanity checks:
>>>>
>>>> [ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work
>>>> [gpu_sched] is flushing !WQ_MEM_RECLAIM
>>>> events:amdgpu_device_delay_enable_gfx_off [amdgpu]
>>>> ...
>>>> [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
>>>> ...
>>>> [ ] Call Trace:
>>>> [ ] <TASK>
>>>> ...
>>>> [ ] ? check_flush_dependency+0xf5/0x110
>>>> ...
>>>> [ ] cancel_delayed_work_sync+0x6e/0x80
>>>> [ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
>>>> [ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu]
>>>> [ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
>>>> [ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
>>>> [ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu]
>>>> [ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched]
>>>> [ ] process_one_work+0x217/0x720
>>>> ...
>>>> [ ] </TASK>
>>>>
>>>> Fix this by creating a memory reclaim safe driver workqueue and
>>>> make the
>>>> submission path use it.
>>>
>>> Oh well, that is a really good catch! I wasn't aware the workqueues
>>> could be blocked by memory reclaim as well.
>>
>> Only credit I can take is for the habit that I often run with many
>> kernel debugging aids enabled.
>
> Although this one actually isn't even under "Kernel hacking".
>
>>> Do we have system wide workqueues for that? It seems a bit overkill
>>> that amdgpu has to allocate one on his own.
>>
>> I wondered the same but did not find any. Only ones I am aware of are
>> system_wq&co created in workqueue_init_early().
>
> Gentle ping on this. I don't have any better ideas that creating a new
> wq.
It took me a moment to realize, but I now think this warning message is
a false positive.
What happens is that the code calls cancel_delayed_work_sync().
If the work item never ran because of lack of memory, then it can just be
canceled.
If the work item is running then we will block for it to finish.
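In pseudo-code, what I mean is roughly this (simplified, not the real
implementation; the helper names are made up):

bool cancel_delayed_work_sync_simplified(struct delayed_work *dwork)
{
        /* Case 1: still pending, just take it off the timer/queue. */
        if (grab_pending(dwork))                /* made-up helper */
                return true;

        /* Case 2: already running, wait for that execution to finish. */
        wait_for_running_instance(dwork);       /* made-up helper */
        return false;
}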
There is no need to use WQ_MEM_RECLAIM for the workqueue, or am I missing
something?
If I'm not completely mistaken you stumbled over a bug in the warning
code instead :)
Regards,
Christian.
>
> Regards,
>
> Tvrtko
>>> Apart from that looks good to me.
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>> References: 746ae46c1113 ("drm/sched: Mark scheduler work queues
>>>> with WQ_MEM_RECLAIM")
>>>> Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a
>>>> work queue rather than kthread")
>>>> Cc: stable@vger.kernel.org
>>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>>> Cc: Danilo Krummrich <dakr@kernel.org>
>>>> Cc: Philipp Stanner <pstanner@redhat.com>
>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>> Cc: Christian König <christian.koenig@amd.com>
>>>> ---
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 25
>>>> +++++++++++++++++++++++++
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 5 +++--
>>>> 3 files changed, 30 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> index 7645e498faa4..a6aad687537e 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> @@ -268,6 +268,8 @@ extern int amdgpu_agp;
>>>> extern int amdgpu_wbrf;
>>>> +extern struct workqueue_struct *amdgpu_reclaim_wq;
>>>> +
>>>> #define AMDGPU_VM_MAX_NUM_CTX 4096
>>>> #define AMDGPU_SG_THRESHOLD (256*1024*1024)
>>>> #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> index 38686203bea6..f5b7172e8042 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> @@ -255,6 +255,8 @@ struct amdgpu_watchdog_timer
>>>> amdgpu_watchdog_timer = {
>>>> .period = 0x0, /* default to 0x0 (timeout disable) */
>>>> };
>>>> +struct workqueue_struct *amdgpu_reclaim_wq;
>>>> +
>>>> /**
>>>> * DOC: vramlimit (int)
>>>> * Restrict the total amount of VRAM in MiB for testing. The
>>>> default is 0 (Use full VRAM).
>>>> @@ -2971,6 +2973,21 @@ static struct pci_driver
>>>> amdgpu_kms_pci_driver = {
>>>> .dev_groups = amdgpu_sysfs_groups,
>>>> };
>>>> +static int amdgpu_wq_init(void)
>>>> +{
>>>> + amdgpu_reclaim_wq =
>>>> + alloc_workqueue("amdgpu-reclaim", WQ_MEM_RECLAIM, 0);
>>>> + if (!amdgpu_reclaim_wq)
>>>> + return -ENOMEM;
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static void amdgpu_wq_fini(void)
>>>> +{
>>>> + destroy_workqueue(amdgpu_reclaim_wq);
>>>> +}
>>>> +
>>>> static int __init amdgpu_init(void)
>>>> {
>>>> int r;
>>>> @@ -2978,6 +2995,10 @@ static int __init amdgpu_init(void)
>>>> if (drm_firmware_drivers_only())
>>>> return -EINVAL;
>>>> + r = amdgpu_wq_init();
>>>> + if (r)
>>>> + goto error_wq;
>>>> +
>>>> r = amdgpu_sync_init();
>>>> if (r)
>>>> goto error_sync;
>>>> @@ -3006,6 +3027,9 @@ static int __init amdgpu_init(void)
>>>> amdgpu_sync_fini();
>>>> error_sync:
>>>> + amdgpu_wq_fini();
>>>> +
>>>> +error_wq:
>>>> return r;
>>>> }
>>>> @@ -3017,6 +3041,7 @@ static void __exit amdgpu_exit(void)
>>>> amdgpu_acpi_release();
>>>> amdgpu_sync_fini();
>>>> amdgpu_fence_slab_fini();
>>>> + amdgpu_wq_fini();
>>>> mmu_notifier_synchronize();
>>>> amdgpu_xcp_drv_release();
>>>> }
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>> index 2f3f09dfb1fd..f8fd71d9382f 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>> @@ -790,8 +790,9 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
>>>> *adev, bool enable)
>>>> AMD_IP_BLOCK_TYPE_GFX, true))
>>>> adev->gfx.gfx_off_state = true;
>>>> } else {
>>>> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
>>>> - delay);
>>>> + queue_delayed_work(amdgpu_reclaim_wq,
>>>> + &adev->gfx.gfx_off_delay_work,
>>>> + delay);
>>>> }
>>>> }
>>>> } else {
>>>
* Re: [PATCH] drm/amdgpu: Make the submission path memory reclaim safe
2024-11-22 13:46 ` Christian König
@ 2024-11-22 14:36 ` Tvrtko Ursulin
2024-12-17 0:26 ` Matthew Brost
0 siblings, 1 reply; 12+ messages in thread
From: Tvrtko Ursulin @ 2024-11-22 14:36 UTC (permalink / raw)
To: Christian König, Tvrtko Ursulin, amd-gfx, dri-devel
Cc: kernel-dev, stable, Matthew Brost, Danilo Krummrich,
Philipp Stanner, Alex Deucher, Tejun Heo
On 22/11/2024 13:46, Christian König wrote:
> Am 22.11.24 um 12:34 schrieb Tvrtko Ursulin:
>> On 13/11/2024 14:42, Tvrtko Ursulin wrote:
>>> On 13/11/2024 14:26, Christian König wrote:
>>>> Am 13.11.24 um 14:48 schrieb Tvrtko Ursulin:
>>>>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>>>
>>>>> As commit 746ae46c1113 ("drm/sched: Mark scheduler work queues with
>>>>> WQ_MEM_RECLAIM")
>>>>> points out, ever since
>>>>> a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue
>>>>> rather than kthread"),
>>>>> any workqueue flushing done from the job submission path must only
>>>>> involve memory reclaim safe workqueues to be safe against reclaim
>>>>> deadlocks.
>>>>>
>>>>> This is also pointed out by workqueue sanity checks:
>>>>>
>>>>> [ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work
>>>>> [gpu_sched] is flushing !WQ_MEM_RECLAIM
>>>>> events:amdgpu_device_delay_enable_gfx_off [amdgpu]
>>>>> ...
>>>>> [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
>>>>> ...
>>>>> [ ] Call Trace:
>>>>> [ ] <TASK>
>>>>> ...
>>>>> [ ] ? check_flush_dependency+0xf5/0x110
>>>>> ...
>>>>> [ ] cancel_delayed_work_sync+0x6e/0x80
>>>>> [ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
>>>>> [ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu]
>>>>> [ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
>>>>> [ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
>>>>> [ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu]
>>>>> [ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched]
>>>>> [ ] process_one_work+0x217/0x720
>>>>> ...
>>>>> [ ] </TASK>
>>>>>
>>>>> Fix this by creating a memory reclaim safe driver workqueue and
>>>>> make the
>>>>> submission path use it.
>>>>
>>>> Oh well, that is a really good catch! I wasn't aware the workqueues
>>>> could be blocked by memory reclaim as well.
>>>
>>> Only credit I can take is for the habit that I often run with many
>>> kernel debugging aids enabled.
>>
>> Although this one actually isn't even under "Kernel hacking".
>>
>>>> Do we have system wide workqueues for that? It seems a bit overkill
>>>> that amdgpu has to allocate one on his own.
>>>
>>> I wondered the same but did not find any. Only ones I am aware of are
>>> system_wq&co created in workqueue_init_early().
>>
>> Gentle ping on this. I don't have any better ideas that creating a new
>> wq.
>
> It took me a moment to realize, but I now think this warning message is
> a false positive.
>
> What happens is that the code calls cancel_delayed_work_sync().
>
> If the work item never run because of lack of memory then it can just be
> canceled.
>
> If the work item is running then we will block for it to finish.
>
> There is no need to use WQ_MEM_RECLAIM for the workqueue or do I miss
> something?
>
> If I'm not completely mistaken you stumbled over a bug in the warning
> code instead :)
Hmm your thinking sounds convincing.
Adding Tejun if he has time to help brainstorm this.
The question is: does check_flush_dependency() need to skip the
"!WQ_MEM_RECLAIM flushing WQ_MEM_RECLAIM" warning *if* the work is already
running *and* it was called from cancel_delayed_work_sync()?
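For context, the check lives in kernel/workqueue.c and, if I read it
correctly, boils down to something like this (paraphrased from memory, the
exact condition and signature may differ):

static void check_flush_dependency(struct workqueue_struct *target_wq,
                                   struct work_struct *target_work)
{
        struct worker *worker = current_wq_worker();

        /* Warn when a reclaim worker waits on a !WQ_MEM_RECLAIM queue. */
        WARN_ONCE(worker &&
                  (worker->current_pwq->wq->flags & WQ_MEM_RECLAIM) &&
                  !(target_wq->flags & WQ_MEM_RECLAIM),
                  "workqueue: WQ_MEM_RECLAIM %s:%ps is flushing !WQ_MEM_RECLAIM %s:%ps",
                  worker->current_pwq->wq->name, worker->current_func,
                  target_wq->name,
                  target_work ? target_work->func : NULL);
}

As far as I can see it keys purely off the two workqueues' flags and does
not know whether we got here via a plain flush or via cancelling an already
running item.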
Regards,
Tvrtko
>>>> Apart from that looks good to me.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>
>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>>> References: 746ae46c1113 ("drm/sched: Mark scheduler work queues
>>>>> with WQ_MEM_RECLAIM")
>>>>> Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a
>>>>> work queue rather than kthread")
>>>>> Cc: stable@vger.kernel.org
>>>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>>>> Cc: Danilo Krummrich <dakr@kernel.org>
>>>>> Cc: Philipp Stanner <pstanner@redhat.com>
>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>> Cc: Christian König <christian.koenig@amd.com>
>>>>> ---
>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 25
>>>>> +++++++++++++++++++++++++
>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 5 +++--
>>>>> 3 files changed, 30 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> index 7645e498faa4..a6aad687537e 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> @@ -268,6 +268,8 @@ extern int amdgpu_agp;
>>>>> extern int amdgpu_wbrf;
>>>>> +extern struct workqueue_struct *amdgpu_reclaim_wq;
>>>>> +
>>>>> #define AMDGPU_VM_MAX_NUM_CTX 4096
>>>>> #define AMDGPU_SG_THRESHOLD (256*1024*1024)
>>>>> #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> index 38686203bea6..f5b7172e8042 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> @@ -255,6 +255,8 @@ struct amdgpu_watchdog_timer
>>>>> amdgpu_watchdog_timer = {
>>>>> .period = 0x0, /* default to 0x0 (timeout disable) */
>>>>> };
>>>>> +struct workqueue_struct *amdgpu_reclaim_wq;
>>>>> +
>>>>> /**
>>>>> * DOC: vramlimit (int)
>>>>> * Restrict the total amount of VRAM in MiB for testing. The
>>>>> default is 0 (Use full VRAM).
>>>>> @@ -2971,6 +2973,21 @@ static struct pci_driver
>>>>> amdgpu_kms_pci_driver = {
>>>>> .dev_groups = amdgpu_sysfs_groups,
>>>>> };
>>>>> +static int amdgpu_wq_init(void)
>>>>> +{
>>>>> + amdgpu_reclaim_wq =
>>>>> + alloc_workqueue("amdgpu-reclaim", WQ_MEM_RECLAIM, 0);
>>>>> + if (!amdgpu_reclaim_wq)
>>>>> + return -ENOMEM;
>>>>> +
>>>>> + return 0;
>>>>> +}
>>>>> +
>>>>> +static void amdgpu_wq_fini(void)
>>>>> +{
>>>>> + destroy_workqueue(amdgpu_reclaim_wq);
>>>>> +}
>>>>> +
>>>>> static int __init amdgpu_init(void)
>>>>> {
>>>>> int r;
>>>>> @@ -2978,6 +2995,10 @@ static int __init amdgpu_init(void)
>>>>> if (drm_firmware_drivers_only())
>>>>> return -EINVAL;
>>>>> + r = amdgpu_wq_init();
>>>>> + if (r)
>>>>> + goto error_wq;
>>>>> +
>>>>> r = amdgpu_sync_init();
>>>>> if (r)
>>>>> goto error_sync;
>>>>> @@ -3006,6 +3027,9 @@ static int __init amdgpu_init(void)
>>>>> amdgpu_sync_fini();
>>>>> error_sync:
>>>>> + amdgpu_wq_fini();
>>>>> +
>>>>> +error_wq:
>>>>> return r;
>>>>> }
>>>>> @@ -3017,6 +3041,7 @@ static void __exit amdgpu_exit(void)
>>>>> amdgpu_acpi_release();
>>>>> amdgpu_sync_fini();
>>>>> amdgpu_fence_slab_fini();
>>>>> + amdgpu_wq_fini();
>>>>> mmu_notifier_synchronize();
>>>>> amdgpu_xcp_drv_release();
>>>>> }
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>>> index 2f3f09dfb1fd..f8fd71d9382f 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>>> @@ -790,8 +790,9 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
>>>>> *adev, bool enable)
>>>>> AMD_IP_BLOCK_TYPE_GFX, true))
>>>>> adev->gfx.gfx_off_state = true;
>>>>> } else {
>>>>> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
>>>>> - delay);
>>>>> + queue_delayed_work(amdgpu_reclaim_wq,
>>>>> + &adev->gfx.gfx_off_delay_work,
>>>>> + delay);
>>>>> }
>>>>> }
>>>>> } else {
>>>>
>
* Re: [PATCH] drm/amdgpu: Make the submission path memory reclaim safe
2024-11-13 13:48 [PATCH] drm/amdgpu: Make the submission path memory reclaim safe Tvrtko Ursulin
2024-11-13 14:26 ` Christian König
@ 2024-12-03 14:24 ` Oleksandr Natalenko
2024-12-03 14:54 ` Christian König
2024-12-17 10:39 ` Philipp Stanner
2 siblings, 1 reply; 12+ messages in thread
From: Oleksandr Natalenko @ 2024-12-03 14:24 UTC (permalink / raw)
To: amd-gfx, dri-devel, Tvrtko Ursulin
Cc: kernel-dev, Tvrtko Ursulin, stable, Matthew Brost,
Danilo Krummrich, Philipp Stanner, Alex Deucher,
Christian König, Tejun Heo
On Wednesday, 13 November 2024 at 14:48:38 CET, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>
> As commit 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
> points out, ever since
> a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread"),
> any workqueue flushing done from the job submission path must only
> involve memory reclaim safe workqueues to be safe against reclaim
> deadlocks.
>
> This is also pointed out by workqueue sanity checks:
>
> [ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work [gpu_sched] is flushing !WQ_MEM_RECLAIM events:amdgpu_device_delay_enable_gfx_off [amdgpu]
> ...
> [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
> ...
> [ ] Call Trace:
> [ ] <TASK>
> ...
> [ ] ? check_flush_dependency+0xf5/0x110
> ...
> [ ] cancel_delayed_work_sync+0x6e/0x80
> [ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
> [ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu]
> [ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
> [ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
> [ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu]
> [ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched]
> [ ] process_one_work+0x217/0x720
> ...
> [ ] </TASK>
>
> Fix this by creating a memory reclaim safe driver workqueue and make the
> submission path use it.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> References: 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
> Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread")
> Cc: stable@vger.kernel.org
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Philipp Stanner <pstanner@redhat.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 25 +++++++++++++++++++++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 5 +++--
> 3 files changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 7645e498faa4..a6aad687537e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -268,6 +268,8 @@ extern int amdgpu_agp;
>
> extern int amdgpu_wbrf;
>
> +extern struct workqueue_struct *amdgpu_reclaim_wq;
> +
> #define AMDGPU_VM_MAX_NUM_CTX 4096
> #define AMDGPU_SG_THRESHOLD (256*1024*1024)
> #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 38686203bea6..f5b7172e8042 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -255,6 +255,8 @@ struct amdgpu_watchdog_timer amdgpu_watchdog_timer = {
> .period = 0x0, /* default to 0x0 (timeout disable) */
> };
>
> +struct workqueue_struct *amdgpu_reclaim_wq;
> +
> /**
> * DOC: vramlimit (int)
> * Restrict the total amount of VRAM in MiB for testing. The default is 0 (Use full VRAM).
> @@ -2971,6 +2973,21 @@ static struct pci_driver amdgpu_kms_pci_driver = {
> .dev_groups = amdgpu_sysfs_groups,
> };
>
> +static int amdgpu_wq_init(void)
> +{
> + amdgpu_reclaim_wq =
> + alloc_workqueue("amdgpu-reclaim", WQ_MEM_RECLAIM, 0);
> + if (!amdgpu_reclaim_wq)
> + return -ENOMEM;
> +
> + return 0;
> +}
> +
> +static void amdgpu_wq_fini(void)
> +{
> + destroy_workqueue(amdgpu_reclaim_wq);
> +}
> +
> static int __init amdgpu_init(void)
> {
> int r;
> @@ -2978,6 +2995,10 @@ static int __init amdgpu_init(void)
> if (drm_firmware_drivers_only())
> return -EINVAL;
>
> + r = amdgpu_wq_init();
> + if (r)
> + goto error_wq;
> +
> r = amdgpu_sync_init();
> if (r)
> goto error_sync;
> @@ -3006,6 +3027,9 @@ static int __init amdgpu_init(void)
> amdgpu_sync_fini();
>
> error_sync:
> + amdgpu_wq_fini();
> +
> +error_wq:
> return r;
> }
>
> @@ -3017,6 +3041,7 @@ static void __exit amdgpu_exit(void)
> amdgpu_acpi_release();
> amdgpu_sync_fini();
> amdgpu_fence_slab_fini();
> + amdgpu_wq_fini();
> mmu_notifier_synchronize();
> amdgpu_xcp_drv_release();
> }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 2f3f09dfb1fd..f8fd71d9382f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -790,8 +790,9 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable)
> AMD_IP_BLOCK_TYPE_GFX, true))
> adev->gfx.gfx_off_state = true;
> } else {
> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> - delay);
> + queue_delayed_work(amdgpu_reclaim_wq,
> + &adev->gfx.gfx_off_delay_work,
> + delay);
> }
> }
> } else {
I can confirm this fixed the warning for me.
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
What's the fate of this submission?
Thank you.
--
Oleksandr Natalenko, MSE
* Re: [PATCH] drm/amdgpu: Make the submission path memory reclaim safe
2024-12-03 14:24 ` Oleksandr Natalenko
@ 2024-12-03 14:54 ` Christian König
0 siblings, 0 replies; 12+ messages in thread
From: Christian König @ 2024-12-03 14:54 UTC (permalink / raw)
To: Oleksandr Natalenko, amd-gfx, dri-devel, Tvrtko Ursulin
Cc: kernel-dev, Tvrtko Ursulin, stable, Matthew Brost,
Danilo Krummrich, Philipp Stanner, Alex Deucher, Tejun Heo
On 03.12.24 at 15:24, Oleksandr Natalenko wrote:
> On středa 13. listopadu 2024 14:48:38, středoevropský standardní čas Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>
>> As commit 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
>> points out, ever since
>> a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread"),
>> any workqueue flushing done from the job submission path must only
>> involve memory reclaim safe workqueues to be safe against reclaim
>> deadlocks.
>>
>> This is also pointed out by workqueue sanity checks:
>>
>> [ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work [gpu_sched] is flushing !WQ_MEM_RECLAIM events:amdgpu_device_delay_enable_gfx_off [amdgpu]
>> ...
>> [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
>> ...
>> [ ] Call Trace:
>> [ ] <TASK>
>> ...
>> [ ] ? check_flush_dependency+0xf5/0x110
>> ...
>> [ ] cancel_delayed_work_sync+0x6e/0x80
>> [ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
>> [ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu]
>> [ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
>> [ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
>> [ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu]
>> [ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched]
>> [ ] process_one_work+0x217/0x720
>> ...
>> [ ] </TASK>
>>
>> Fix this by creating a memory reclaim safe driver workqueue and make the
>> submission path use it.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>> References: 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
>> Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread")
>> Cc: stable@vger.kernel.org
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Cc: Danilo Krummrich <dakr@kernel.org>
>> Cc: Philipp Stanner <pstanner@redhat.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Cc: Christian König <christian.koenig@amd.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 25 +++++++++++++++++++++++++
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 5 +++--
>> 3 files changed, 30 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 7645e498faa4..a6aad687537e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -268,6 +268,8 @@ extern int amdgpu_agp;
>>
>> extern int amdgpu_wbrf;
>>
>> +extern struct workqueue_struct *amdgpu_reclaim_wq;
>> +
>> #define AMDGPU_VM_MAX_NUM_CTX 4096
>> #define AMDGPU_SG_THRESHOLD (256*1024*1024)
>> #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index 38686203bea6..f5b7172e8042 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -255,6 +255,8 @@ struct amdgpu_watchdog_timer amdgpu_watchdog_timer = {
>> .period = 0x0, /* default to 0x0 (timeout disable) */
>> };
>>
>> +struct workqueue_struct *amdgpu_reclaim_wq;
>> +
>> /**
>> * DOC: vramlimit (int)
>> * Restrict the total amount of VRAM in MiB for testing. The default is 0 (Use full VRAM).
>> @@ -2971,6 +2973,21 @@ static struct pci_driver amdgpu_kms_pci_driver = {
>> .dev_groups = amdgpu_sysfs_groups,
>> };
>>
>> +static int amdgpu_wq_init(void)
>> +{
>> + amdgpu_reclaim_wq =
>> + alloc_workqueue("amdgpu-reclaim", WQ_MEM_RECLAIM, 0);
>> + if (!amdgpu_reclaim_wq)
>> + return -ENOMEM;
>> +
>> + return 0;
>> +}
>> +
>> +static void amdgpu_wq_fini(void)
>> +{
>> + destroy_workqueue(amdgpu_reclaim_wq);
>> +}
>> +
>> static int __init amdgpu_init(void)
>> {
>> int r;
>> @@ -2978,6 +2995,10 @@ static int __init amdgpu_init(void)
>> if (drm_firmware_drivers_only())
>> return -EINVAL;
>>
>> + r = amdgpu_wq_init();
>> + if (r)
>> + goto error_wq;
>> +
>> r = amdgpu_sync_init();
>> if (r)
>> goto error_sync;
>> @@ -3006,6 +3027,9 @@ static int __init amdgpu_init(void)
>> amdgpu_sync_fini();
>>
>> error_sync:
>> + amdgpu_wq_fini();
>> +
>> +error_wq:
>> return r;
>> }
>>
>> @@ -3017,6 +3041,7 @@ static void __exit amdgpu_exit(void)
>> amdgpu_acpi_release();
>> amdgpu_sync_fini();
>> amdgpu_fence_slab_fini();
>> + amdgpu_wq_fini();
>> mmu_notifier_synchronize();
>> amdgpu_xcp_drv_release();
>> }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> index 2f3f09dfb1fd..f8fd71d9382f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> @@ -790,8 +790,9 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable)
>> AMD_IP_BLOCK_TYPE_GFX, true))
>> adev->gfx.gfx_off_state = true;
>> } else {
>> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
>> - delay);
>> + queue_delayed_work(amdgpu_reclaim_wq,
>> + &adev->gfx.gfx_off_delay_work,
>> + delay);
>> }
>> }
>> } else {
> I can confirm this fixed the warning for me.
>
> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
>
> What's the fate of this submission?
It turned out that the warning is actually a false positive, i.e. the
warning shouldn't be printed in the first place.
Tvrtko pinged the relevant maintainer, but so far I haven't seen a reply.
Regards,
Christian.
>
> Thank you.
>
* Re: [PATCH] drm/amdgpu: Make the submission path memory reclaim safe
2024-11-22 14:36 ` Tvrtko Ursulin
@ 2024-12-17 0:26 ` Matthew Brost
2024-12-17 9:30 ` Christian König
0 siblings, 1 reply; 12+ messages in thread
From: Matthew Brost @ 2024-12-17 0:26 UTC (permalink / raw)
To: Tvrtko Ursulin
Cc: Christian König, Tvrtko Ursulin, amd-gfx, dri-devel,
kernel-dev, stable, Danilo Krummrich, Philipp Stanner,
Alex Deucher, Tejun Heo
On Fri, Nov 22, 2024 at 02:36:59PM +0000, Tvrtko Ursulin wrote:
>
> On 22/11/2024 13:46, Christian König wrote:
> > Am 22.11.24 um 12:34 schrieb Tvrtko Ursulin:
> > > On 13/11/2024 14:42, Tvrtko Ursulin wrote:
> > > > On 13/11/2024 14:26, Christian König wrote:
> > > > > Am 13.11.24 um 14:48 schrieb Tvrtko Ursulin:
> > > > > > From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> > > > > >
> > > > > > As commit 746ae46c1113 ("drm/sched: Mark scheduler work
> > > > > > queues with WQ_MEM_RECLAIM")
> > > > > > points out, ever since
> > > > > > a6149f039369 ("drm/sched: Convert drm scheduler to use a
> > > > > > work queue rather than kthread"),
> > > > > > any workqueue flushing done from the job submission path must only
> > > > > > involve memory reclaim safe workqueues to be safe against reclaim
> > > > > > deadlocks.
> > > > > >
> > > > > > This is also pointed out by workqueue sanity checks:
> > > > > >
> > > > > > [ ] workqueue: WQ_MEM_RECLAIM
> > > > > > sdma0:drm_sched_run_job_work [gpu_sched] is flushing
> > > > > > !WQ_MEM_RECLAIM
> > > > > > events:amdgpu_device_delay_enable_gfx_off [amdgpu]
> > > > > > ...
> > > > > > [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
> > > > > > ...
> > > > > > [ ] Call Trace:
> > > > > > [ ] <TASK>
> > > > > > ...
> > > > > > [ ] ? check_flush_dependency+0xf5/0x110
> > > > > > ...
> > > > > > [ ] cancel_delayed_work_sync+0x6e/0x80
> > > > > > [ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
> > > > > > [ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu]
> > > > > > [ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
> > > > > > [ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
> > > > > > [ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu]
> > > > > > [ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched]
> > > > > > [ ] process_one_work+0x217/0x720
> > > > > > ...
> > > > > > [ ] </TASK>
> > > > > >
> > > > > > Fix this by creating a memory reclaim safe driver
> > > > > > workqueue and make the
> > > > > > submission path use it.
> > > > >
> > > > > Oh well, that is a really good catch! I wasn't aware the
> > > > > workqueues could be blocked by memory reclaim as well.
> > > >
> > > > Only credit I can take is for the habit that I often run with
> > > > many kernel debugging aids enabled.
> > >
> > > Although this one actually isn't even under "Kernel hacking".
> > >
> > > > > Do we have system wide workqueues for that? It seems a bit
> > > > > overkill that amdgpu has to allocate one on his own.
> > > >
> > > > I wondered the same but did not find any. Only ones I am aware
> > > > of are system_wq&co created in workqueue_init_early().
> > >
> > > Gentle ping on this. I don't have any better ideas that creating a
> > > new wq.
> >
> > It took me a moment to realize, but I now think this warning message is
> > a false positive.
> >
> > What happens is that the code calls cancel_delayed_work_sync().
> >
> > If the work item never run because of lack of memory then it can just be
> > canceled.
> >
> > If the work item is running then we will block for it to finish.
> >
Apologies for the late reply. Alex responded to another thread and CC'd
me, which reminded me to reply here.
The execution of the non-reclaim worker could have led to a few scenarios:
- It might have triggered reclaim through its own memory allocation.
- It could have been running and then context-switched out, with reclaim
being triggered elsewhere in the meantime, pausing the execution of
the non-reclaim worker.
In either case, during reclaim, if you wait on a DMA fence that depends
on the DRM scheduler worker, and that worker attempts to flush the above
non-reclaim worker, it will result in a deadlock.
The annotation appears correct to me, and I believe Tvrtko's patch is
indeed accurate. For what it's worth, we encountered several similar
bugs in Xe that emerged once we added the correct work queue
annotations.
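To spell out the chain I have in mind (illustrative only, not actual code):

/*
 * Direct reclaim in some task:
 *   shrinker / eviction
 *     -> dma_fence_wait(job_fence)     // waits on the GPU job fence
 *
 * That fence only signals once the scheduler worker has run:
 *   drm_sched_run_job_work()           // WQ_MEM_RECLAIM workqueue
 *     -> amdgpu_job_run()
 *       -> amdgpu_ib_schedule()
 *         -> amdgpu_gfx_off_ctrl()
 *           -> cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work)
 *              // waits on a work item queued on the system (!WQ_MEM_RECLAIM) wq
 *
 * If that work item is itself stuck behind the reclaim above (either
 * allocating memory or starved of worker threads), nothing makes progress:
 * reclaim waits on the fence, the fence waits on the flush, and the flush
 * waits on reclaim.
 */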
> > There is no need to use WQ_MEM_RECLAIM for the workqueue or do I miss
> > something?
> >
> > If I'm not completely mistaken you stumbled over a bug in the warning
> > code instead :)
>
> Hmm your thinking sounds convincing.
>
> Adding Tejun if he has time to help brainstorm this.
>
Tejun could likely provide insight into whether my above assessment is
correct.
Matt
> Question is - does check_flush_dependency() need to skip the !WQ_MEM_RECLAIM
> flushing WQ_MEM_RECLAIM warning *if* the work is already running *and* it
> was called from cancel_delayed_work_sync()?
>
> Regards,
>
> Tvrtko
>
> > > > > Apart from that looks good to me.
> > > > >
> > > > > Regards,
> > > > > Christian.
> > > > >
> > > > > >
> > > > > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> > > > > > References: 746ae46c1113 ("drm/sched: Mark scheduler
> > > > > > work queues with WQ_MEM_RECLAIM")
> > > > > > Fixes: a6149f039369 ("drm/sched: Convert drm scheduler
> > > > > > to use a work queue rather than kthread")
> > > > > > Cc: stable@vger.kernel.org
> > > > > > Cc: Matthew Brost <matthew.brost@intel.com>
> > > > > > Cc: Danilo Krummrich <dakr@kernel.org>
> > > > > > Cc: Philipp Stanner <pstanner@redhat.com>
> > > > > > Cc: Alex Deucher <alexander.deucher@amd.com>
> > > > > > Cc: Christian König <christian.koenig@amd.com>
> > > > > > ---
> > > > > > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
> > > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 25
> > > > > > +++++++++++++++++++++++++
> > > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 5 +++--
> > > > > > 3 files changed, 30 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > > > > > index 7645e498faa4..a6aad687537e 100644
> > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > > > > > @@ -268,6 +268,8 @@ extern int amdgpu_agp;
> > > > > > extern int amdgpu_wbrf;
> > > > > > +extern struct workqueue_struct *amdgpu_reclaim_wq;
> > > > > > +
> > > > > > #define AMDGPU_VM_MAX_NUM_CTX 4096
> > > > > > #define AMDGPU_SG_THRESHOLD (256*1024*1024)
> > > > > > #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
> > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > > > > index 38686203bea6..f5b7172e8042 100644
> > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > > > > @@ -255,6 +255,8 @@ struct amdgpu_watchdog_timer
> > > > > > amdgpu_watchdog_timer = {
> > > > > > .period = 0x0, /* default to 0x0 (timeout disable) */
> > > > > > };
> > > > > > +struct workqueue_struct *amdgpu_reclaim_wq;
> > > > > > +
> > > > > > /**
> > > > > > * DOC: vramlimit (int)
> > > > > > * Restrict the total amount of VRAM in MiB for
> > > > > > testing. The default is 0 (Use full VRAM).
> > > > > > @@ -2971,6 +2973,21 @@ static struct pci_driver
> > > > > > amdgpu_kms_pci_driver = {
> > > > > > .dev_groups = amdgpu_sysfs_groups,
> > > > > > };
> > > > > > +static int amdgpu_wq_init(void)
> > > > > > +{
> > > > > > + amdgpu_reclaim_wq =
> > > > > > + alloc_workqueue("amdgpu-reclaim", WQ_MEM_RECLAIM, 0);
> > > > > > + if (!amdgpu_reclaim_wq)
> > > > > > + return -ENOMEM;
> > > > > > +
> > > > > > + return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static void amdgpu_wq_fini(void)
> > > > > > +{
> > > > > > + destroy_workqueue(amdgpu_reclaim_wq);
> > > > > > +}
> > > > > > +
> > > > > > static int __init amdgpu_init(void)
> > > > > > {
> > > > > > int r;
> > > > > > @@ -2978,6 +2995,10 @@ static int __init amdgpu_init(void)
> > > > > > if (drm_firmware_drivers_only())
> > > > > > return -EINVAL;
> > > > > > + r = amdgpu_wq_init();
> > > > > > + if (r)
> > > > > > + goto error_wq;
> > > > > > +
> > > > > > r = amdgpu_sync_init();
> > > > > > if (r)
> > > > > > goto error_sync;
> > > > > > @@ -3006,6 +3027,9 @@ static int __init amdgpu_init(void)
> > > > > > amdgpu_sync_fini();
> > > > > > error_sync:
> > > > > > + amdgpu_wq_fini();
> > > > > > +
> > > > > > +error_wq:
> > > > > > return r;
> > > > > > }
> > > > > > @@ -3017,6 +3041,7 @@ static void __exit amdgpu_exit(void)
> > > > > > amdgpu_acpi_release();
> > > > > > amdgpu_sync_fini();
> > > > > > amdgpu_fence_slab_fini();
> > > > > > + amdgpu_wq_fini();
> > > > > > mmu_notifier_synchronize();
> > > > > > amdgpu_xcp_drv_release();
> > > > > > }
> > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > > > > > index 2f3f09dfb1fd..f8fd71d9382f 100644
> > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > > > > > @@ -790,8 +790,9 @@ void amdgpu_gfx_off_ctrl(struct
> > > > > > amdgpu_device *adev, bool enable)
> > > > > > AMD_IP_BLOCK_TYPE_GFX, true))
> > > > > > adev->gfx.gfx_off_state = true;
> > > > > > } else {
> > > > > > - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> > > > > > - delay);
> > > > > > + queue_delayed_work(amdgpu_reclaim_wq,
> > > > > > + &adev->gfx.gfx_off_delay_work,
> > > > > > + delay);
> > > > > > }
> > > > > > }
> > > > > > } else {
> > > > >
> >
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] drm/amdgpu: Make the submission path memory reclaim safe
2024-12-17 0:26 ` Matthew Brost
@ 2024-12-17 9:30 ` Christian König
2024-12-17 9:33 ` Fwd: " Christian König
0 siblings, 1 reply; 12+ messages in thread
From: Christian König @ 2024-12-17 9:30 UTC (permalink / raw)
To: Matthew Brost, Tvrtko Ursulin
Cc: Tvrtko Ursulin, amd-gfx, dri-devel, kernel-dev, stable,
Danilo Krummrich, Philipp Stanner, Alex Deucher, Tejun Heo
[-- Attachment #1: Type: text/plain, Size: 8589 bytes --]
On 17.12.24 at 01:26, Matthew Brost wrote:
> On Fri, Nov 22, 2024 at 02:36:59PM +0000, Tvrtko Ursulin wrote:
>> [SNIP]
>>>>>> Do we have system wide workqueues for that? It seems a bit
>>>>>> overkill that amdgpu has to allocate one on its own.
>>>>> I wondered the same but did not find any. Only ones I am aware
>>>>> of are system_wq&co created in workqueue_init_early().
>>>> Gentle ping on this. I don't have any better ideas than creating a
>>>> new wq.
>>> It took me a moment to realize, but I now think this warning message is
>>> a false positive.
>>>
>>> What happens is that the code calls cancel_delayed_work_sync().
>>>
>>> If the work item never ran because of lack of memory then it can just be
>>> canceled.
>>>
>>> If the work item is running then we will block for it to finish.
>>>
> Apologies for the late reply. Alex responded to another thread and CC'd
> me, which reminded me to reply here.
>
> The execution of the non-reclaim worker could have led to a few scenarios:
>
> - It might have triggered reclaim through its own memory allocation.
That is unrelated and has nothing to do with WQ_MEM_RECLAIM.
What we should do is make sure that the lockdep annotation covers all
workers that play a role in fence signaling.
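A minimal sketch of the kind of coverage that means, using the existing
dma_fence_begin_signalling()/dma_fence_end_signalling() helpers; the worker
function name here is made up purely for illustration:

	#include <linux/dma-fence.h>
	#include <linux/workqueue.h>

	/* Any worker on the path towards signalling a dma_fence wraps its
	 * body in the fence signalling critical section so lockdep can flag
	 * waits or allocations that are not allowed there. */
	static void example_run_job_work(struct work_struct *work)
	{
		bool fence_cookie;

		fence_cookie = dma_fence_begin_signalling();

		/* ... hardware submission that eventually signals a fence ... */

		dma_fence_end_signalling(fence_cookie);
	}

With that in place lockdep can report illegal waits or allocations inside the
signalling path even when the deadlock never actually triggers at runtime.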
> - It could have been running and then context-switched out, with reclaim
> being triggered elsewhere in the meantime, pausing the execution of
> the non-reclaim worker.
As far as I know non-reclaim workers are not paused because a reclaim
worker is running; that would be really new to me.
What happens is explained here (comment from workqueue.c):
 * Workqueue rescuer thread function.  There's one rescuer for each
 * workqueue which has WQ_MEM_RECLAIM set.
 *
 * Regular work processing on a pool may block trying to create a new
 * worker which uses GFP_KERNEL allocation which has slight chance of
 * developing into deadlock if some works currently on the same queue
 * need to be processed to satisfy the GFP_KERNEL allocation.  This is
 * the problem rescuer solves.
> In either case, during reclaim, if you wait on a DMA fence that depends
> on the DRM scheduler worker, and that worker attempts to flush the above
> non-reclaim worker, it will result in a deadlock.
Well that is only partially correct.
It's true that the worker we wait for can't wait on a DMA-fence or do
memory allocations that wait on DMA-fences. But WQ_MEM_RECLAIM is not
related to any DMA fence annotation.
What happens instead is that the kernel always keeps a kernel thread
pre-allocated so that it can guarantee that the worker can start without
allocating memory.
As soon as the worker runs there shouldn't be any difference in the
handling as far as I know.
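A small illustration of that guarantee, with all names invented for the
example rather than taken from amdgpu:

	#include <linux/workqueue.h>

	static struct workqueue_struct *example_reclaim_wq;
	static struct delayed_work example_dwork;

	static void example_work_fn(struct work_struct *work)
	{
		/* ... e.g. re-enable GFXOFF ... */
	}

	static int example_init(void)
	{
		/* WQ_MEM_RECLAIM is what requests the pre-allocated rescuer
		 * thread, so queued work can always start executing even
		 * when creating a new worker would need a GFP_KERNEL
		 * allocation that is itself stuck behind reclaim. */
		example_reclaim_wq = alloc_workqueue("example-reclaim",
						     WQ_MEM_RECLAIM, 0);
		if (!example_reclaim_wq)
			return -ENOMEM;

		INIT_DELAYED_WORK(&example_dwork, example_work_fn);
		queue_delayed_work(example_reclaim_wq, &example_dwork, HZ);

		return 0;
	}

The rescuer thread is created once at alloc_workqueue() time, which is why the
allocation itself can fail with -ENOMEM but queued work never has to allocate
memory in order to start running.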
> The annotation appears correct to me, and I believe Tvrtko's patch is
> indeed accurate. For what it's worth, we encountered several similar
> bugs in Xe that emerged once we added the correct work queue
> annotations.
I think you mean something different. This is the lockdep annotation for
the workers and not WQ_MEM_RECLAIM.
Regards,
Christian.
>>> There is no need to use WQ_MEM_RECLAIM for the workqueue or do I miss
>>> something?
>>>
>>> If I'm not completely mistaken you stumbled over a bug in the warning
>>> code instead :)
>> Hmm your thinking sounds convincing.
>>
>> Adding Tejun if he has time to help brainstorm this.
>>
> Tejun could likely provide insight into whether my above assessment is
> correct.
>
> Matt
>
>> Question is - does check_flush_dependency() need to skip the "WQ_MEM_RECLAIM
>> is flushing !WQ_MEM_RECLAIM" warning *if* the work is already running *and* it
>> was called from cancel_delayed_work_sync()?
>>
>> Regards,
>>
>> Tvrtko
>>
>>>>>> Apart from that looks good to me.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@igalia.com>
>>>>>>> References: 746ae46c1113 ("drm/sched: Mark scheduler
>>>>>>> work queues with WQ_MEM_RECLAIM")
>>>>>>> Fixes: a6149f039369 ("drm/sched: Convert drm scheduler
>>>>>>> to use a work queue rather than kthread")
>>>>>>> Cc:stable@vger.kernel.org
>>>>>>> Cc: Matthew Brost<matthew.brost@intel.com>
>>>>>>> Cc: Danilo Krummrich<dakr@kernel.org>
>>>>>>> Cc: Philipp Stanner<pstanner@redhat.com>
>>>>>>> Cc: Alex Deucher<alexander.deucher@amd.com>
>>>>>>> Cc: Christian König<christian.koenig@amd.com>
>>>>>>> ---
>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 25
>>>>>>> +++++++++++++++++++++++++
>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 5 +++--
>>>>>>> 3 files changed, 30 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>> index 7645e498faa4..a6aad687537e 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>> @@ -268,6 +268,8 @@ extern int amdgpu_agp;
>>>>>>> extern int amdgpu_wbrf;
>>>>>>> +extern struct workqueue_struct *amdgpu_reclaim_wq;
>>>>>>> +
>>>>>>> #define AMDGPU_VM_MAX_NUM_CTX 4096
>>>>>>> #define AMDGPU_SG_THRESHOLD (256*1024*1024)
>>>>>>> #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>> index 38686203bea6..f5b7172e8042 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>> @@ -255,6 +255,8 @@ struct amdgpu_watchdog_timer
>>>>>>> amdgpu_watchdog_timer = {
>>>>>>> .period = 0x0, /* default to 0x0 (timeout disable) */
>>>>>>> };
>>>>>>> +struct workqueue_struct *amdgpu_reclaim_wq;
>>>>>>> +
>>>>>>> /**
>>>>>>> * DOC: vramlimit (int)
>>>>>>> * Restrict the total amount of VRAM in MiB for
>>>>>>> testing. The default is 0 (Use full VRAM).
>>>>>>> @@ -2971,6 +2973,21 @@ static struct pci_driver
>>>>>>> amdgpu_kms_pci_driver = {
>>>>>>> .dev_groups = amdgpu_sysfs_groups,
>>>>>>> };
>>>>>>> +static int amdgpu_wq_init(void)
>>>>>>> +{
>>>>>>> + amdgpu_reclaim_wq =
>>>>>>> + alloc_workqueue("amdgpu-reclaim", WQ_MEM_RECLAIM, 0);
>>>>>>> + if (!amdgpu_reclaim_wq)
>>>>>>> + return -ENOMEM;
>>>>>>> +
>>>>>>> + return 0;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void amdgpu_wq_fini(void)
>>>>>>> +{
>>>>>>> + destroy_workqueue(amdgpu_reclaim_wq);
>>>>>>> +}
>>>>>>> +
>>>>>>> static int __init amdgpu_init(void)
>>>>>>> {
>>>>>>> int r;
>>>>>>> @@ -2978,6 +2995,10 @@ static int __init amdgpu_init(void)
>>>>>>> if (drm_firmware_drivers_only())
>>>>>>> return -EINVAL;
>>>>>>> + r = amdgpu_wq_init();
>>>>>>> + if (r)
>>>>>>> + goto error_wq;
>>>>>>> +
>>>>>>> r = amdgpu_sync_init();
>>>>>>> if (r)
>>>>>>> goto error_sync;
>>>>>>> @@ -3006,6 +3027,9 @@ static int __init amdgpu_init(void)
>>>>>>> amdgpu_sync_fini();
>>>>>>> error_sync:
>>>>>>> + amdgpu_wq_fini();
>>>>>>> +
>>>>>>> +error_wq:
>>>>>>> return r;
>>>>>>> }
>>>>>>> @@ -3017,6 +3041,7 @@ static void __exit amdgpu_exit(void)
>>>>>>> amdgpu_acpi_release();
>>>>>>> amdgpu_sync_fini();
>>>>>>> amdgpu_fence_slab_fini();
>>>>>>> + amdgpu_wq_fini();
>>>>>>> mmu_notifier_synchronize();
>>>>>>> amdgpu_xcp_drv_release();
>>>>>>> }
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>>>>> index 2f3f09dfb1fd..f8fd71d9382f 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>>>>> @@ -790,8 +790,9 @@ void amdgpu_gfx_off_ctrl(struct
>>>>>>> amdgpu_device *adev, bool enable)
>>>>>>> AMD_IP_BLOCK_TYPE_GFX, true))
>>>>>>> adev->gfx.gfx_off_state = true;
>>>>>>> } else {
>>>>>>> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
>>>>>>> - delay);
>>>>>>> + queue_delayed_work(amdgpu_reclaim_wq,
>>>>>>> + &adev->gfx.gfx_off_delay_work,
>>>>>>> + delay);
>>>>>>> }
>>>>>>> }
>>>>>>> } else {
[-- Attachment #2: Type: text/html, Size: 14836 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Fwd: [PATCH] drm/amdgpu: Make the submission path memory reclaim safe
2024-12-17 9:30 ` Christian König
@ 2024-12-17 9:33 ` Christian König
0 siblings, 0 replies; 12+ messages in thread
From: Christian König @ 2024-12-17 9:33 UTC (permalink / raw)
To: amd-gfx list, dri-devel, kernel-dev@igalia.com, stable
Sending it out to the mailing lists once more because AMD mail servers
tried to convert it to HTML :(
On 17.12.24 at 01:26, Matthew Brost wrote:
> On Fri, Nov 22, 2024 at 02:36:59PM +0000, Tvrtko Ursulin wrote:
>> [SNIP]
>>>>>> Do we have system wide workqueues for that? It seems a bit
>>>>>> overkill that amdgpu has to allocate one on its own.
>>>>> I wondered the same but did not find any. Only ones I am aware
>>>>> of are system_wq&co created in workqueue_init_early().
>>>> Gentle ping on this. I don't have any better ideas than creating a
>>>> new wq.
>>> It took me a moment to realize, but I now think this warning message is
>>> a false positive.
>>>
>>> What happens is that the code calls cancel_delayed_work_sync().
>>>
>>> If the work item never ran because of lack of memory then it can just be
>>> canceled.
>>>
>>> If the work item is running then we will block for it to finish.
>>>
> Apologies for the late reply. Alex responded to another thread and CC'd
> me, which reminded me to reply here.
>
> The execution of the non-reclaim worker could have led to a few scenarios:
>
> - It might have triggered reclaim through its own memory allocation.
That is unrelated and has nothing to do with WQ_MEM_RECLAIM.
What we should do is make sure that the lockdep annotation covers all
workers that play a role in fence signaling.
> - It could have been running and then context-switched out, with reclaim
> being triggered elsewhere in the meantime, pausing the execution of
> the non-reclaim worker.
As far as I know non-reclaim workers are not paused because a reclaim
worker is running; that would be really new to me.
What happens is explained here (comment from workqueue.c):
 * Workqueue rescuer thread function.  There's one rescuer for each
 * workqueue which has WQ_MEM_RECLAIM set.
 *
 * Regular work processing on a pool may block trying to create a new
 * worker which uses GFP_KERNEL allocation which has slight chance of
 * developing into deadlock if some works currently on the same queue
 * need to be processed to satisfy the GFP_KERNEL allocation.  This is
 * the problem rescuer solves.
> In either case, during reclaim, if you wait on a DMA fence that depends
> on the DRM scheduler worker, and that worker attempts to flush the above
> non-reclaim worker, it will result in a deadlock.
Well that is only partially correct.
It's true that the worker we wait for can't wait on a DMA-fence or do
memory allocations that wait on DMA-fences. But WQ_MEM_RECLAIM is not
related to any DMA fence annotation.
What happens instead is that the kernel always keeps a kernel thread
pre-allocated so that it can guarantee that the worker can start without
allocating memory.
As soon as the worker runs there shouldn't be any difference in the
handling as far as I know.
> The annotation appears correct to me, and I believe Tvrtko's patch is
> indeed accurate. For what it's worth, we encountered several similar
> bugs in Xe that emerged once we added the correct work queue
> annotations.
I think you mean something different. This is the lockdep annotation for
the workers and not WQ_MEM_RECLAIM.
Regards,
Christian.
>>> There is no need to use WQ_MEM_RECLAIM for the workqueue or do I miss
>>> something?
>>>
>>> If I'm not completely mistaken you stumbled over a bug in the warning
>>> code instead :)
>> Hmm your thinking sounds convincing.
>>
>> Adding Tejun if he has time to help brainstorm this.
>>
> Tejun could likely provide insight into whether my above assessment is
> correct.
> Matt
>
>> Question is - does check_flush_dependency() need to skip the "WQ_MEM_RECLAIM
>> is flushing !WQ_MEM_RECLAIM" warning *if* the work is already running *and* it
>> was called from cancel_delayed_work_sync()?
>>
>> Regards,
>>
>> Tvrtko
>>
>>>>>> Apart from that looks good to me.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@igalia.com>
>>>>>>> References: 746ae46c1113 ("drm/sched: Mark scheduler
>>>>>>> work queues with WQ_MEM_RECLAIM")
>>>>>>> Fixes: a6149f039369 ("drm/sched: Convert drm scheduler
>>>>>>> to use a work queue rather than kthread")
>>>>>>> Cc:stable@vger.kernel.org
>>>>>>> Cc: Matthew Brost<matthew.brost@intel.com>
>>>>>>> Cc: Danilo Krummrich<dakr@kernel.org>
>>>>>>> Cc: Philipp Stanner<pstanner@redhat.com>
>>>>>>> Cc: Alex Deucher<alexander.deucher@amd.com>
>>>>>>> Cc: Christian König<christian.koenig@amd.com>
>>>>>>> ---
>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 25
>>>>>>> +++++++++++++++++++++++++
>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 5 +++--
>>>>>>> 3 files changed, 30 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>> index 7645e498faa4..a6aad687537e 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>> @@ -268,6 +268,8 @@ extern int amdgpu_agp;
>>>>>>> extern int amdgpu_wbrf;
>>>>>>> +extern struct workqueue_struct *amdgpu_reclaim_wq;
>>>>>>> +
>>>>>>> #define AMDGPU_VM_MAX_NUM_CTX 4096
>>>>>>> #define AMDGPU_SG_THRESHOLD (256*1024*1024)
>>>>>>> #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>> index 38686203bea6..f5b7172e8042 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>> @@ -255,6 +255,8 @@ struct amdgpu_watchdog_timer
>>>>>>> amdgpu_watchdog_timer = {
>>>>>>> .period = 0x0, /* default to 0x0 (timeout disable) */
>>>>>>> };
>>>>>>> +struct workqueue_struct *amdgpu_reclaim_wq;
>>>>>>> +
>>>>>>> /**
>>>>>>> * DOC: vramlimit (int)
>>>>>>> * Restrict the total amount of VRAM in MiB for
>>>>>>> testing. The default is 0 (Use full VRAM).
>>>>>>> @@ -2971,6 +2973,21 @@ static struct pci_driver
>>>>>>> amdgpu_kms_pci_driver = {
>>>>>>> .dev_groups = amdgpu_sysfs_groups,
>>>>>>> };
>>>>>>> +static int amdgpu_wq_init(void)
>>>>>>> +{
>>>>>>> + amdgpu_reclaim_wq =
>>>>>>> + alloc_workqueue("amdgpu-reclaim", WQ_MEM_RECLAIM, 0);
>>>>>>> + if (!amdgpu_reclaim_wq)
>>>>>>> + return -ENOMEM;
>>>>>>> +
>>>>>>> + return 0;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void amdgpu_wq_fini(void)
>>>>>>> +{
>>>>>>> + destroy_workqueue(amdgpu_reclaim_wq);
>>>>>>> +}
>>>>>>> +
>>>>>>> static int __init amdgpu_init(void)
>>>>>>> {
>>>>>>> int r;
>>>>>>> @@ -2978,6 +2995,10 @@ static int __init amdgpu_init(void)
>>>>>>> if (drm_firmware_drivers_only())
>>>>>>> return -EINVAL;
>>>>>>> + r = amdgpu_wq_init();
>>>>>>> + if (r)
>>>>>>> + goto error_wq;
>>>>>>> +
>>>>>>> r = amdgpu_sync_init();
>>>>>>> if (r)
>>>>>>> goto error_sync;
>>>>>>> @@ -3006,6 +3027,9 @@ static int __init amdgpu_init(void)
>>>>>>> amdgpu_sync_fini();
>>>>>>> error_sync:
>>>>>>> + amdgpu_wq_fini();
>>>>>>> +
>>>>>>> +error_wq:
>>>>>>> return r;
>>>>>>> }
>>>>>>> @@ -3017,6 +3041,7 @@ static void __exit amdgpu_exit(void)
>>>>>>> amdgpu_acpi_release();
>>>>>>> amdgpu_sync_fini();
>>>>>>> amdgpu_fence_slab_fini();
>>>>>>> + amdgpu_wq_fini();
>>>>>>> mmu_notifier_synchronize();
>>>>>>> amdgpu_xcp_drv_release();
>>>>>>> }
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>>>>> index 2f3f09dfb1fd..f8fd71d9382f 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>>>>>> @@ -790,8 +790,9 @@ void amdgpu_gfx_off_ctrl(struct
>>>>>>> amdgpu_device *adev, bool enable)
>>>>>>> AMD_IP_BLOCK_TYPE_GFX, true))
>>>>>>> adev->gfx.gfx_off_state = true;
>>>>>>> } else {
>>>>>>> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
>>>>>>> - delay);
>>>>>>> + queue_delayed_work(amdgpu_reclaim_wq,
>>>>>>> + &adev->gfx.gfx_off_delay_work,
>>>>>>> + delay);
>>>>>>> }
>>>>>>> }
>>>>>>> } else {
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] drm/amdgpu: Make the submission path memory reclaim safe
2024-11-13 13:48 [PATCH] drm/amdgpu: Make the submission path memory reclaim safe Tvrtko Ursulin
2024-11-13 14:26 ` Christian König
2024-12-03 14:24 ` Oleksandr Natalenko
@ 2024-12-17 10:39 ` Philipp Stanner
2 siblings, 0 replies; 12+ messages in thread
From: Philipp Stanner @ 2024-12-17 10:39 UTC (permalink / raw)
To: Tvrtko Ursulin, amd-gfx, dri-devel
Cc: kernel-dev, Tvrtko Ursulin, stable, Matthew Brost,
Danilo Krummrich, Alex Deucher, Christian König,
Krzysztof Wilczyński
[+cc Krzysztof, who I think witnessed a possibly related Kernel crash
in the wild]
P.
On Wed, 2024-11-13 at 13:48 +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>
> As commit 746ae46c1113 ("drm/sched: Mark scheduler work queues with
> WQ_MEM_RECLAIM")
> points out, ever since
> a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue
> rather than kthread"),
> any workqueue flushing done from the job submission path must only
> involve memory reclaim safe workqueues to be safe against reclaim
> deadlocks.
>
> This is also pointed out by workqueue sanity checks:
>
> [ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work
> [gpu_sched] is flushing !WQ_MEM_RECLAIM
> events:amdgpu_device_delay_enable_gfx_off [amdgpu]
> ...
> [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
> ...
> [ ] Call Trace:
> [ ] <TASK>
> ...
> [ ] ? check_flush_dependency+0xf5/0x110
> ...
> [ ] cancel_delayed_work_sync+0x6e/0x80
> [ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
> [ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu]
> [ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
> [ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
> [ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu]
> [ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched]
> [ ] process_one_work+0x217/0x720
> ...
> [ ] </TASK>
>
> Fix this by creating a memory reclaim safe driver workqueue and make the
> submission path use it.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> References: 746ae46c1113 ("drm/sched: Mark scheduler work queues with
> WQ_MEM_RECLAIM")
> Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a work
> queue rather than kthread")
> Cc: stable@vger.kernel.org
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Philipp Stanner <pstanner@redhat.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 25
> +++++++++++++++++++++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 5 +++--
> 3 files changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 7645e498faa4..a6aad687537e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -268,6 +268,8 @@ extern int amdgpu_agp;
>
> extern int amdgpu_wbrf;
>
> +extern struct workqueue_struct *amdgpu_reclaim_wq;
> +
> #define AMDGPU_VM_MAX_NUM_CTX 4096
> #define AMDGPU_SG_THRESHOLD (256*1024*1024)
> #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 38686203bea6..f5b7172e8042 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -255,6 +255,8 @@ struct amdgpu_watchdog_timer
> amdgpu_watchdog_timer = {
> .period = 0x0, /* default to 0x0 (timeout disable) */
> };
>
> +struct workqueue_struct *amdgpu_reclaim_wq;
> +
> /**
> * DOC: vramlimit (int)
> * Restrict the total amount of VRAM in MiB for testing. The
> default is 0 (Use full VRAM).
> @@ -2971,6 +2973,21 @@ static struct pci_driver amdgpu_kms_pci_driver
> = {
> .dev_groups = amdgpu_sysfs_groups,
> };
>
> +static int amdgpu_wq_init(void)
> +{
> + amdgpu_reclaim_wq =
> + alloc_workqueue("amdgpu-reclaim", WQ_MEM_RECLAIM, 0);
> + if (!amdgpu_reclaim_wq)
> + return -ENOMEM;
> +
> + return 0;
> +}
> +
> +static void amdgpu_wq_fini(void)
> +{
> + destroy_workqueue(amdgpu_reclaim_wq);
> +}
> +
> static int __init amdgpu_init(void)
> {
> int r;
> @@ -2978,6 +2995,10 @@ static int __init amdgpu_init(void)
> if (drm_firmware_drivers_only())
> return -EINVAL;
>
> + r = amdgpu_wq_init();
> + if (r)
> + goto error_wq;
> +
> r = amdgpu_sync_init();
> if (r)
> goto error_sync;
> @@ -3006,6 +3027,9 @@ static int __init amdgpu_init(void)
> amdgpu_sync_fini();
>
> error_sync:
> + amdgpu_wq_fini();
> +
> +error_wq:
> return r;
> }
>
> @@ -3017,6 +3041,7 @@ static void __exit amdgpu_exit(void)
> amdgpu_acpi_release();
> amdgpu_sync_fini();
> amdgpu_fence_slab_fini();
> + amdgpu_wq_fini();
> mmu_notifier_synchronize();
> amdgpu_xcp_drv_release();
> }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 2f3f09dfb1fd..f8fd71d9382f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -790,8 +790,9 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
> *adev, bool enable)
> AMD_IP_BLOCK_TYPE_GFX, true))
> adev->gfx.gfx_off_state = true;
> } else {
> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> - delay);
> + queue_delayed_work(amdgpu_reclaim_wq,
> + &adev->gfx.gfx_off_delay_work,
> + delay);
> }
> }
> } else {
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-12-17 10:39 UTC | newest]
Thread overview: 12+ messages
2024-11-13 13:48 [PATCH] drm/amdgpu: Make the submission path memory reclaim safe Tvrtko Ursulin
2024-11-13 14:26 ` Christian König
2024-11-13 14:42 ` Tvrtko Ursulin
2024-11-22 11:34 ` Tvrtko Ursulin
2024-11-22 13:46 ` Christian König
2024-11-22 14:36 ` Tvrtko Ursulin
2024-12-17 0:26 ` Matthew Brost
2024-12-17 9:30 ` Christian König
2024-12-17 9:33 ` Fwd: " Christian König
2024-12-03 14:24 ` Oleksandr Natalenko
2024-12-03 14:54 ` Christian König
2024-12-17 10:39 ` Philipp Stanner