Intel-XE Archive on lore.kernel.org
From: Matthew Auld <matthew.auld@intel.com>
To: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-xe@lists.freedesktop.org
Subject: Re: [RFC 16/20] drm/xe: Remove mem_access calls from migration
Date: Tue, 9 Jan 2024 18:49:47 +0000	[thread overview]
Message-ID: <062c816f-b825-4b8c-aacf-2e4221398455@intel.com> (raw)
In-Reply-To: <ZZ2JW1NQVVkNMsAw@intel.com>

On 09/01/2024 17:58, Rodrigo Vivi wrote:
> On Tue, Jan 09, 2024 at 12:33:25PM +0000, Matthew Auld wrote:
>> On 28/12/2023 02:12, Rodrigo Vivi wrote:
>>> The sched jobs' runtime pm calls already protect every execution,
>>> including these migration ones.
>>
>> Is a job really enough here? I assume the queue is only destroyed once
>> it has no more jobs and the final queue ref is dropped. And destroying
>> the queue might involve stuff like de-registering the context with the
>> GuC, which needs to use CT, which in turn needs an rpm ref. What is
>> holding the rpm if not the vm or queue?
> 
> The exec queue is holding it until the end.

Can you share some more details? AFAIK the queue destruction is async,
and previously the vm underneath was holding the rpm, or in the case of
the migration vm it was the queue itself. But for the migration vm case
that ref is removed below. I guess I'm missing something here.
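
Just to make it concrete, here is a rough, untested sketch of what I
assumed would still be needed on top of this patch (reusing the
xe_pm_runtime_{get,put} helpers from earlier in the series), so that
such a queue keeps the device awake until its teardown is done:

	/*
	 * __xe_exec_queue_create(): the queue takes its own ref when
	 * there is no user vm holding one (kernel migrate vm, or no vm).
	 */
	if (!(q->flags & EXEC_QUEUE_FLAG_PERMANENT) &&
	    (q->flags & EXEC_QUEUE_FLAG_VM || !vm))
		xe_pm_runtime_get(xe);

	/*
	 * xe_exec_queue_fini(): drop it only once the async teardown,
	 * including the GuC CT deregistration, no longer needs the device.
	 */
	if (!(q->flags & EXEC_QUEUE_FLAG_PERMANENT) &&
	    (q->flags & EXEC_QUEUE_FLAG_VM || !q->vm))
		xe_pm_runtime_put(gt_to_xe(q->gt));

If the plan is that the job refs alone cover this, it would help to
spell out how that works for the destroy path.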

> 
>>
>>>
>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>> ---
>>>    drivers/gpu/drm/xe/tests/xe_migrate.c |  2 --
>>>    drivers/gpu/drm/xe/xe_device.c        | 17 -----------------
>>>    drivers/gpu/drm/xe/xe_device.h        |  1 -
>>>    drivers/gpu/drm/xe/xe_exec_queue.c    | 18 ------------------
>>>    4 files changed, 38 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/tests/xe_migrate.c b/drivers/gpu/drm/xe/tests/xe_migrate.c
>>> index 7a32faa2f6888..2257f0a28435b 100644
>>> --- a/drivers/gpu/drm/xe/tests/xe_migrate.c
>>> +++ b/drivers/gpu/drm/xe/tests/xe_migrate.c
>>> @@ -428,9 +428,7 @@ static int migrate_test_run_device(struct xe_device *xe)
>>>    		kunit_info(test, "Testing tile id %d.\n", id);
>>>    		xe_vm_lock(m->q->vm, true);
>>> -		xe_device_mem_access_get(xe);
>>>    		xe_migrate_sanity_test(m, test);
>>> -		xe_device_mem_access_put(xe);
>>>    		xe_vm_unlock(m->q->vm);
>>>    	}
>>> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>>> index ee9b6612eec43..a7bec49da49fa 100644
>>> --- a/drivers/gpu/drm/xe/xe_device.c
>>> +++ b/drivers/gpu/drm/xe/xe_device.c
>>> @@ -675,23 +675,6 @@ void xe_device_assert_mem_access(struct xe_device *xe)
>>>    	XE_WARN_ON(xe_pm_runtime_suspended(xe));
>>>    }
>>> -bool xe_device_mem_access_get_if_ongoing(struct xe_device *xe)
>>> -{
>>> -	bool active;
>>> -
>>> -	if (xe_pm_read_callback_task(xe) == current)
>>> -		return true;
>>> -
>>> -	active = xe_pm_runtime_get_if_active(xe);
>>> -	if (active) {
>>> -		int ref = atomic_inc_return(&xe->mem_access.ref);
>>> -
>>> -		xe_assert(xe, ref != S32_MAX);
>>> -	}
>>> -
>>> -	return active;
>>> -}
>>> -
>>>    void xe_device_mem_access_get(struct xe_device *xe)
>>>    {
>>>    	int ref;
>>> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
>>> index af8ac2e9e2709..4acf4c2973390 100644
>>> --- a/drivers/gpu/drm/xe/xe_device.h
>>> +++ b/drivers/gpu/drm/xe/xe_device.h
>>> @@ -142,7 +142,6 @@ static inline struct xe_force_wake *gt_to_fw(struct xe_gt *gt)
>>>    }
>>>    void xe_device_mem_access_get(struct xe_device *xe);
>>> -bool xe_device_mem_access_get_if_ongoing(struct xe_device *xe);
>>>    void xe_device_mem_access_put(struct xe_device *xe);
>>>    void xe_device_assert_mem_access(struct xe_device *xe);
>>> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
>>> index 44fe8097b7cda..d3a8d2d8caaaf 100644
>>> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
>>> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
>>> @@ -87,17 +87,6 @@ static struct xe_exec_queue *__xe_exec_queue_create(struct xe_device *xe,
>>>    	if (err)
>>>    		goto err_lrc;
>>> -	/*
>>> -	 * Normally the user vm holds an rpm ref to keep the device
>>> -	 * awake, and the context holds a ref for the vm, however for
>>> -	 * some engines we use the kernels migrate vm underneath which offers no
>>> -	 * such rpm ref, or we lack a vm. Make sure we keep a ref here, so we
>>> -	 * can perform GuC CT actions when needed. Caller is expected to have
>>> -	 * already grabbed the rpm ref outside any sensitive locks.
>>> -	 */
>>> -	if (!(q->flags & EXEC_QUEUE_FLAG_PERMANENT) && (q->flags & EXEC_QUEUE_FLAG_VM || !vm))
>>> -		drm_WARN_ON(&xe->drm, !xe_device_mem_access_get_if_ongoing(xe));
>>> -
>>>    	return q;
>>>    err_lrc:
>>> @@ -172,8 +161,6 @@ void xe_exec_queue_fini(struct xe_exec_queue *q)
>>>    	for (i = 0; i < q->width; ++i)
>>>    		xe_lrc_finish(q->lrc + i);
>>> -	if (!(q->flags & EXEC_QUEUE_FLAG_PERMANENT) && (q->flags & EXEC_QUEUE_FLAG_VM || !q->vm))
>>> -		xe_device_mem_access_put(gt_to_xe(q->gt));
>>>    	if (q->vm)
>>>    		xe_vm_put(q->vm);
>>> @@ -643,9 +630,6 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
>>>    			if (XE_IOCTL_DBG(xe, !hwe))
>>>    				return -EINVAL;
>>> -			/* The migration vm doesn't hold rpm ref */
>>> -			xe_device_mem_access_get(xe);
>>> -
>>>    			migrate_vm = xe_migrate_get_vm(gt_to_tile(gt)->migrate);
>>>    			new = xe_exec_queue_create(xe, migrate_vm, logical_mask,
>>>    						   args->width, hwe,
>>> @@ -655,8 +639,6 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
>>>    						    EXEC_QUEUE_FLAG_BIND_ENGINE_CHILD :
>>>    						    0));
>>> -			xe_device_mem_access_put(xe); /* now held by engine */
>>> -
>>>    			xe_vm_put(migrate_vm);
>>>    			if (IS_ERR(new)) {
>>>    				err = PTR_ERR(new);
