From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Matthew Auld <matthew.auld@intel.com>
Cc: intel-xe@lists.freedesktop.org
Subject: Re: [RFC 08/20] drm/xe: Runtime PM wake on every exec
Date: Tue, 9 Jan 2024 12:41:53 -0500
Message-ID: <ZZ2FYVEN1xzDQMVm@intel.com>
In-Reply-To: <11d0bf86-c011-4761-895f-8cce0a7e071c@intel.com>
On Tue, Jan 09, 2024 at 11:24:34AM +0000, Matthew Auld wrote:
> On 28/12/2023 02:12, Rodrigo Vivi wrote:
> > Let's ensure our PCI device is awakened on every GT execution and
> > stays awake until the end of the execution.
> > Let's increase the runtime_pm protection and start moving
> > it to the outer bounds.
> >
> > Let's also remove the unnecessary mem_access get/put.
> >
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_sched_job.c | 10 +++++-----
> > 1 file changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_sched_job.c b/drivers/gpu/drm/xe/xe_sched_job.c
> > index 01106a1156ad8..0b30ec77fc5ad 100644
> > --- a/drivers/gpu/drm/xe/xe_sched_job.c
> > +++ b/drivers/gpu/drm/xe/xe_sched_job.c
> > @@ -15,6 +15,7 @@
> > #include "xe_hw_fence.h"
> > #include "xe_lrc.h"
> > #include "xe_macros.h"
> > +#include "xe_pm.h"
> > #include "xe_trace.h"
> > #include "xe_vm.h"
> > @@ -67,6 +68,8 @@ static void job_free(struct xe_sched_job *job)
> > struct xe_exec_queue *q = job->q;
> > bool is_migration = xe_sched_job_is_migration(q);
> > + xe_pm_runtime_put(gt_to_xe(q->gt));
> > +
> > kmem_cache_free(xe_exec_queue_is_parallel(job->q) || is_migration ?
> > xe_sched_job_parallel_slab : xe_sched_job_slab, job);
> > }
> > @@ -86,6 +89,8 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
> > int i, j;
> > u32 width;
> > + xe_pm_runtime_get(gt_to_xe(q->gt));
> > +
>
> This seems way too deep in the call chain. If this actually wakes up the
> device we will end up with all of the same d3cold deadlock issues. Like here
> we are for sure holding stuff like dma-resv, but the rpm callbacks also want
> to grab it. IMO this needs to be something like runtime_get_if_active(),
> with the upper layers already ensuring device is awake (like ioctl), so here
> we are just keeping it awake until the job is done. Or maybe this is how it
> is by the end of the series?
We have two cases here: one where the device is already awake because of
the ioctl, and the other the eviction preparation path, which runs in the
context of the 'current' task. So we should be good either way, but you
are right, using get_if_active() here is probably better for clarity.
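
Roughly what I have in mind for the exec path is below (untested sketch,
not the final patch; it assumes xe_pm_runtime_get_if_active() follows the
pm_runtime_get_if_active() semantics and only takes a reference when the
device is already runtime-active):

    struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
                                             u64 *batch_addr)
    {
            struct xe_device *xe = gt_to_xe(q->gt);

            /*
             * The device is expected to be awake already: either the exec
             * ioctl took a runtime_pm reference, or we are in the eviction
             * preparation path, which runs in the 'current' task context.
             * Only take an extra reference if the device is really active,
             * so the rpm resume callbacks can never run under locks (e.g.
             * dma-resv) that are held at this point.
             */
            if (!xe_pm_runtime_get_if_active(xe))
                    return ERR_PTR(-EAGAIN); /* error code just illustrative */

            /* ... rest of the job creation stays as in this patch ... */
    }

job_free() would then keep the xe_pm_runtime_put() from this patch, so the
reference is dropped once the job is gone.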
>
> > /* only a kernel context can submit a vm-less job */
> > XE_WARN_ON(!q->vm && !(q->flags & EXEC_QUEUE_FLAG_KERNEL));
> > @@ -155,9 +160,6 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
> > for (i = 0; i < width; ++i)
> > job->batch_addr[i] = batch_addr[i];
> > - /* All other jobs require a VM to be open which has a ref */
> > - if (unlikely(q->flags & EXEC_QUEUE_FLAG_KERNEL))
> > - xe_device_mem_access_get(job_to_xe(job));
> > xe_device_assert_mem_access(job_to_xe(job));
> > trace_xe_sched_job_create(job);
> > @@ -189,8 +191,6 @@ void xe_sched_job_destroy(struct kref *ref)
> > struct xe_sched_job *job =
> > container_of(ref, struct xe_sched_job, refcount);
> > - if (unlikely(job->q->flags & EXEC_QUEUE_FLAG_KERNEL))
> > - xe_device_mem_access_put(job_to_xe(job));
> > xe_exec_queue_put(job->q);
> > dma_fence_put(job->fence);
> > drm_sched_job_cleanup(&job->drm);
Thread overview: 46+ messages in thread
2023-12-28 2:12 [RFC 00/20] First attempt to kill mem_access Rodrigo Vivi
2023-12-28 2:12 ` [RFC 01/20] drm/xe: Document Xe PM component Rodrigo Vivi
2023-12-28 2:12 ` [RFC 02/20] drm/xe: Fix display runtime_pm handling Rodrigo Vivi
2023-12-28 2:12 ` [RFC 03/20] drm/xe: Create a xe_pm_runtime_resume_and_get variant for display Rodrigo Vivi
2023-12-28 2:12 ` [RFC 04/20] drm/xe: Convert xe_pm_runtime_{get, put} to void and protect from recursion Rodrigo Vivi
2023-12-28 2:12 ` [RFC 05/20] drm/xe: Prepare display for D3Cold Rodrigo Vivi
2023-12-28 2:12 ` [RFC 06/20] drm/xe: Convert mem_access assertion towards the runtime_pm state Rodrigo Vivi
2024-01-09 11:06 ` Matthew Auld
2024-01-09 17:50 ` Rodrigo Vivi
2023-12-28 2:12 ` [RFC 07/20] drm/xe: Runtime PM wake on every IOCTL Rodrigo Vivi
2024-01-02 11:30 ` Gupta, Anshuman
2024-01-09 17:57 ` Rodrigo Vivi
2023-12-28 2:12 ` [RFC 08/20] drm/xe: Runtime PM wake on every exec Rodrigo Vivi
2024-01-09 11:24 ` Matthew Auld
2024-01-09 17:41 ` Rodrigo Vivi [this message]
2024-01-09 18:40 ` Matthew Auld
2023-12-28 2:12 ` [RFC 09/20] drm/xe: Runtime PM wake on every sysfs call Rodrigo Vivi
2023-12-28 2:12 ` [RFC 10/20] drm/xe: Sort some xe_pm_runtime related functions Rodrigo Vivi
2024-01-09 11:26 ` Matthew Auld
2023-12-28 2:12 ` [RFC 11/20] drm/xe: Ensure device is awake before removing it Rodrigo Vivi
2023-12-28 2:12 ` [RFC 12/20] drm/xe: Remove mem_access from guc_pc calls Rodrigo Vivi
2023-12-28 2:12 ` [RFC 13/20] drm/xe: Runtime PM wake on every debugfs call Rodrigo Vivi
2023-12-28 2:12 ` [RFC 14/20] drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls Rodrigo Vivi
2023-12-28 2:12 ` [RFC 15/20] drm/xe: Allow GuC CT fast path and worker regardless of runtime_pm Rodrigo Vivi
2024-01-09 12:09 ` Matthew Auld
2023-12-28 2:12 ` [RFC 16/20] drm/xe: Remove mem_access calls from migration Rodrigo Vivi
2024-01-09 12:33 ` Matthew Auld
2024-01-09 17:58 ` Rodrigo Vivi
2024-01-09 18:49 ` Matthew Auld
2024-01-09 22:40 ` Rodrigo Vivi
2024-01-11 14:17 ` Matthew Brost
2023-12-28 2:12 ` [RFC 17/20] drm/xe: Removing extra mem_access protection from runtime pm Rodrigo Vivi
2023-12-28 2:12 ` [RFC 18/20] drm/xe: Convert hwmon from mem_access to xe_pm_runtime calls Rodrigo Vivi
2023-12-28 2:12 ` [RFC 19/20] drm/xe: Remove unused runtime pm helper Rodrigo Vivi
2023-12-28 2:12 ` [RFC 20/20] drm/xe: Mega Kill of mem_access Rodrigo Vivi
2024-01-09 11:41 ` Matthew Auld
2024-01-09 17:39 ` Rodrigo Vivi
2024-01-09 18:27 ` Matthew Auld
2024-01-09 22:34 ` Rodrigo Vivi
2024-01-04 5:40 ` ✓ CI.Patch_applied: success for First attempt to kill mem_access Patchwork
2024-01-04 5:40 ` ✗ CI.checkpatch: warning " Patchwork
2024-01-04 5:41 ` ✗ CI.KUnit: failure " Patchwork
2024-01-10 5:21 ` [RFC 00/20] " Matthew Brost
2024-01-10 14:06 ` Rodrigo Vivi
2024-01-10 14:08 ` Vivi, Rodrigo
2024-01-10 14:33 ` Matthew Brost