From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Matthew Auld <matthew.auld@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
Matthew Brost <matthew.brost@intel.com>
Subject: Re: [RFC 21/34] drm/xe: Convert GuC CT paths from mem_access to xe_pm_runtime
Date: Wed, 28 Feb 2024 11:51:31 -0500 [thread overview]
Message-ID: <Zd9kkxpCZ174GzwF@intel.com> (raw)
In-Reply-To: <1460e9d5-fb69-43a1-8be7-ae08152bb15b@intel.com>
On Mon, Feb 05, 2024 at 12:23:41PM +0000, Matthew Auld wrote:
> On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > For the G2H and TLB invalidation cases, the outer bounds
> > protection of the pm_runtime are not going to work. So
> > we need to have the inner side protections here.
> >
> > Cc: Matthew Auld <matthew.auld@intel.com>
> > Suggested-by: Matthew Brost <matthew.brost@intel.com>
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_debugfs.c | 3 ++
> > drivers/gpu/drm/xe/xe_guc_ct.c | 79 +++++++++++++---------------
> > drivers/gpu/drm/xe/xe_guc_ct_types.h | 2 +
> > 3 files changed, 41 insertions(+), 43 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> > index 8abdf3c17e1d..b3669eac23c9 100644
> > --- a/drivers/gpu/drm/xe/xe_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> > @@ -64,6 +64,9 @@ static int info(struct seq_file *m, void *data)
> > xe_force_wake_ref(gt_to_fw(gt), XE_FW_GT));
> > drm_printf(&p, "gt%d engine_mask 0x%llx\n", id,
> > gt->info.engine_mask);
> > + drm_printf(&p, "gt%d g2h_outstanding %d g2h_pm_refs %d\n", id,
> > + gt->uc.guc.ct.g2h_outstanding,
> > + gt->uc.guc.ct.g2h_pm_refs);
> > }
> > xe_pm_runtime_put(xe);
> > diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> > index f3d356383ced..139cfe733661 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> > @@ -287,16 +287,49 @@ static int guc_ct_control_toggle(struct xe_guc_ct *ct, bool enable)
> > return ret > 0 ? -EPROTO : ret;
> > }
> > +static void guc_ct_g2h_outstanding_clear(struct xe_guc_ct *ct)
> > +{
> > + int i;
> > +
> > + for (i = 0; i < ct->g2h_pm_refs; i++)
> > + xe_pm_runtime_put(ct_to_xe(ct));
> > + ct->g2h_outstanding = 0;
> > + ct->g2h_pm_refs = 0;
> > +}
> > +
> > +static void guc_ct_g2h_outstanding_dec(struct xe_guc_ct *ct)
> > +{
> > + if (ct->g2h_pm_refs > 0)
> > + xe_pm_runtime_put(ct_to_xe(ct));
> > + ct->g2h_pm_refs--;
> > + ct->g2h_outstanding--;
> > +}
> > +
> > +static void guc_ct_g2h_outstanding_add(struct xe_guc_ct *ct, int num_g2h)
> > +{
> > + struct xe_device *xe = ct_to_xe(ct);
> > + int i;
> > +
> > + for (i = 0; i < num_g2h; i++) {
> > + if (xe_pm_runtime_get_if_in_use(xe))
> > + ct->g2h_pm_refs++;
> > + else
> > + drm_err(&xe->drm, "Failed to grab RPM ref for outstanding g2h\n");
> > + }
>
> I guess we could maybe drop the g2h_pm_refs, and rather just track single
> rpm ref for entire ct when g2h_outstanding 0 -> 1, and likewise drop the rpm
> ref when g2h_outstanding reaches zero?
I thought about that, but I was afraid of some unbalanced case when the
get_if_in_use fails.
>
> > + ct->g2h_outstanding += num_g2h;
> > +}
> > +
> > static void xe_guc_ct_set_state(struct xe_guc_ct *ct,
> > enum xe_guc_ct_state state)
> > {
> > mutex_lock(&ct->lock); /* Serialise dequeue_one_g2h() */
> > spin_lock_irq(&ct->fast_lock); /* Serialise CT fast-path */
> > +
> > xe_gt_assert(ct_to_gt(ct), ct->g2h_outstanding == 0 ||
> > state == XE_GUC_CT_STATE_STOPPED);
> > - ct->g2h_outstanding = 0;
> > + guc_ct_g2h_outstanding_clear(ct);
> > ct->state = state;
> > spin_unlock_irq(&ct->fast_lock);
> > @@ -428,7 +461,7 @@ static void __g2h_reserve_space(struct xe_guc_ct *ct, u32 g2h_len, u32 num_g2h)
> > lockdep_assert_held(&ct->fast_lock);
> > ct->ctbs.g2h.info.space -= g2h_len;
> > - ct->g2h_outstanding += num_g2h;
> > + guc_ct_g2h_outstanding_add(ct, num_g2h);
> > }
> > }
> > @@ -439,7 +472,7 @@ static void __g2h_release_space(struct xe_guc_ct *ct, u32 g2h_len)
> > ct->ctbs.g2h.info.size - ct->ctbs.g2h.info.resv_space);
> > ct->ctbs.g2h.info.space += g2h_len;
> > - --ct->g2h_outstanding;
> > + guc_ct_g2h_outstanding_dec(ct);
> > }
> > static void g2h_release_space(struct xe_guc_ct *ct, u32 g2h_len)
> > @@ -1199,14 +1232,8 @@ static void g2h_fast_path(struct xe_guc_ct *ct, u32 *msg, u32 len)
> > */
> > void xe_guc_ct_fast_path(struct xe_guc_ct *ct)
> > {
> > - struct xe_device *xe = ct_to_xe(ct);
> > - bool ongoing;
> > int len;
> > - ongoing = xe_device_mem_access_get_if_ongoing(ct_to_xe(ct));
> > - if (!ongoing && xe_pm_read_callback_task(ct_to_xe(ct)) == NULL)
> > - return;
> > -
> > spin_lock(&ct->fast_lock);
> > do {
> > len = g2h_read(ct, ct->fast_msg, true);
> > @@ -1214,9 +1241,6 @@ void xe_guc_ct_fast_path(struct xe_guc_ct *ct)
> > g2h_fast_path(ct, ct->fast_msg, len);
> > } while (len > 0);
> > spin_unlock(&ct->fast_lock);
> > -
> > - if (ongoing)
> > - xe_device_mem_access_put(xe);
> > }
> > /* Returns less than zero on error, 0 on done, 1 on more available */
> > @@ -1247,36 +1271,8 @@ static int dequeue_one_g2h(struct xe_guc_ct *ct)
> > static void g2h_worker_func(struct work_struct *w)
> > {
> > struct xe_guc_ct *ct = container_of(w, struct xe_guc_ct, g2h_worker);
> > - bool ongoing;
> > int ret;
> > - /*
> > - * Normal users must always hold mem_access.ref around CT calls. However
> > - * during the runtime pm callbacks we rely on CT to talk to the GuC, but
> > - * at this stage we can't rely on mem_access.ref and even the
> > - * callback_task will be different than current. For such cases we just
> > - * need to ensure we always process the responses from any blocking
> > - * ct_send requests or where we otherwise expect some response when
> > - * initiated from those callbacks (which will need to wait for the below
> > - * dequeue_one_g2h()). The dequeue_one_g2h() will gracefully fail if
> > - * the device has suspended to the point that the CT communication has
> > - * been disabled.
> > - *
> > - * If we are inside the runtime pm callback, we can be the only task
> > - * still issuing CT requests (since that requires having the
> > - * mem_access.ref). It seems like it might in theory be possible to
> > - * receive unsolicited events from the GuC just as we are
> > - * suspending-resuming, but those will currently anyway be lost when
> > - * eventually exiting from suspend, hence no need to wake up the device
> > - * here. If we ever need something stronger than get_if_ongoing() then
> > - * we need to be careful with blocking the pm callbacks from getting CT
> > - * responses, if the worker here is blocked on those callbacks
> > - * completing, creating a deadlock.
> > - */
> > - ongoing = xe_device_mem_access_get_if_ongoing(ct_to_xe(ct));
> > - if (!ongoing && xe_pm_read_callback_task(ct_to_xe(ct)) == NULL)
> > - return;
> > -
> > do {
> > mutex_lock(&ct->lock);
> > ret = dequeue_one_g2h(ct);
> > @@ -1290,9 +1286,6 @@ static void g2h_worker_func(struct work_struct *w)
> > kick_reset(ct);
> > }
> > } while (ret == 1);
> > -
> > - if (ongoing)
> > - xe_device_mem_access_put(ct_to_xe(ct));
> > }
> > static void guc_ctb_snapshot_capture(struct xe_device *xe, struct guc_ctb *ctb,
> > diff --git a/drivers/gpu/drm/xe/xe_guc_ct_types.h b/drivers/gpu/drm/xe/xe_guc_ct_types.h
> > index d29144c9f20b..4220230e7be4 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_ct_types.h
> > +++ b/drivers/gpu/drm/xe/xe_guc_ct_types.h
> > @@ -108,6 +108,8 @@ struct xe_guc_ct {
> > } ctbs;
> > /** @g2h_outstanding: number of outstanding G2H */
> > u32 g2h_outstanding;
> > + /** @g2h_pm_refs: number of pm_refs for pending G2H */
> > + u32 g2h_pm_refs;
> > /** @g2h_worker: worker to process G2H messages */
> > struct work_struct g2h_worker;
> > /** @state: CT state */
next prev parent reply other threads:[~2024-02-28 16:51 UTC|newest]
Thread overview: 77+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
2024-01-26 20:30 ` [RFC 01/34] Revert "drm/xe/uc: Store firmware binary in system-memory backed BO" Rodrigo Vivi
2024-01-26 20:30 ` [RFC 02/34] drm/xe: Document Xe PM component Rodrigo Vivi
2024-01-29 10:38 ` Francois Dugast
2024-01-26 20:30 ` [RFC 03/34] drm/xe: Fix display runtime_pm handling Rodrigo Vivi
2024-02-05 9:11 ` Matthew Auld
2024-02-14 18:05 ` Rodrigo Vivi
2024-02-15 9:30 ` Matthew Auld
2024-02-15 22:19 ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 04/34] drm/xe: Create a xe_pm_runtime_resume_and_get variant for display Rodrigo Vivi
2024-01-26 20:30 ` [RFC 05/34] drm/xe: Convert xe_pm_runtime_{get, put} to void and protect from recursion Rodrigo Vivi
2024-01-26 20:30 ` [RFC 06/34] drm/xe: Prepare display for D3Cold Rodrigo Vivi
2024-01-26 20:30 ` [RFC 07/34] drm/xe: Convert mem_access assertion towards the runtime_pm state Rodrigo Vivi
2024-02-05 9:55 ` Matthew Auld
2024-02-14 18:15 ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 08/34] drm/xe: Runtime PM wake on every IOCTL Rodrigo Vivi
2024-02-05 9:39 ` Matthew Auld
2024-01-26 20:30 ` [RFC 09/34] drm/xe: Convert kunit tests from mem_access to xe_pm_runtime Rodrigo Vivi
2024-02-05 9:57 ` Matthew Auld
2024-01-26 20:30 ` [RFC 10/34] drm/xe: Convert scheduler towards direct pm_runtime Rodrigo Vivi
2024-02-05 10:46 ` Matthew Auld
2024-01-26 20:30 ` [RFC 11/34] drm/xe: Runtime PM wake on every sysfs call Rodrigo Vivi
2024-02-05 10:55 ` Matthew Auld
2024-02-14 18:48 ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 12/34] drm/xe: Ensure device is awake before removing it Rodrigo Vivi
2024-02-05 11:05 ` Matthew Auld
2024-02-14 18:51 ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 13/34] drm/xe: Remove mem_access from guc_pc calls Rodrigo Vivi
2024-02-05 11:08 ` Matthew Auld
2024-01-26 20:30 ` [RFC 14/34] drm/xe: Runtime PM wake on every debugfs call Rodrigo Vivi
2024-02-05 11:10 ` Matthew Auld
2024-02-14 18:57 ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 15/34] drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls Rodrigo Vivi
2024-02-05 11:15 ` Matthew Auld
2024-01-26 20:30 ` [RFC 16/34] drm/xe: Removing extra mem_access protection from runtime pm Rodrigo Vivi
2024-02-05 11:23 ` Matthew Auld
2024-01-26 20:30 ` [RFC 17/34] drm/xe: Convert hwmon from mem_access to xe_pm_runtime calls Rodrigo Vivi
2024-02-05 11:25 ` Matthew Auld
2024-01-26 20:30 ` [RFC 18/34] drm/xe: Move lockdep protection from mem_access to xe_pm_runtime Rodrigo Vivi
2024-02-05 11:31 ` Matthew Auld
2024-01-26 20:30 ` [RFC 19/34] drm/xe: Remove pm_runtime lockdep Rodrigo Vivi
2024-02-05 11:54 ` Matthew Auld
2024-02-15 22:47 ` Rodrigo Vivi
2024-02-20 17:48 ` Matthew Auld
2024-02-28 16:53 ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 20/34] drm/xe: Stop checking for power_lost on D3Cold Rodrigo Vivi
2024-01-26 20:30 ` [RFC 21/34] drm/xe: Convert GuC CT paths from mem_access to xe_pm_runtime Rodrigo Vivi
2024-02-05 12:23 ` Matthew Auld
2024-02-28 16:51 ` Rodrigo Vivi [this message]
2024-01-26 20:30 ` [RFC 22/34] drm/xe: Keep D0 for the entire duration of a LR VM Rodrigo Vivi
2024-01-26 20:30 ` [RFC 23/34] drm/xe: Ensure D0 on TLB invalidation Rodrigo Vivi
2024-02-05 12:41 ` Matthew Auld
2024-01-26 20:30 ` [RFC 24/34] drm/xe: Remove useless mem_access protection for query ioctls Rodrigo Vivi
2024-02-05 12:43 ` Matthew Auld
2024-01-26 20:30 ` [RFC 25/34] drm/xe: Convert gsc_work from mem_access to xe_pm_runtime Rodrigo Vivi
2024-02-05 13:11 ` Matthew Auld
2024-01-26 20:30 ` [RFC 26/34] drm/xe: VMs don't need the mem_access protection anymore Rodrigo Vivi
2024-02-05 13:29 ` Matthew Auld
2024-02-15 22:37 ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 27/34] drm/xe: Remove useless mem_access during probe Rodrigo Vivi
2024-02-05 13:18 ` Matthew Auld
2024-01-26 20:30 ` [RFC 28/34] drm/xe: Remove mem_access from suspend and resume functions Rodrigo Vivi
2024-02-05 13:30 ` Matthew Auld
2024-01-26 20:30 ` [RFC 29/34] drm/xe: Convert gt_reset from mem_access to xe_pm_runtime Rodrigo Vivi
2024-02-05 13:33 ` Matthew Auld
2024-01-26 20:30 ` [RFC 30/34] drm/xe: Remove useless mem_access on PAT dumps Rodrigo Vivi
2024-02-05 13:34 ` Matthew Auld
2024-01-26 20:30 ` [RFC 31/34] drm/xe: Remove inner mem_access protections Rodrigo Vivi
2024-01-26 20:30 ` [RFC 32/34] drm/xe: Kill xe_device_mem_access_{get*,put} Rodrigo Vivi
2024-01-26 20:30 ` [RFC 33/34] drm/xe: Remove unused runtime pm helper Rodrigo Vivi
2024-01-26 20:30 ` [RFC 34/34] drm/xe: Enable D3Cold on 'low' VRAM utilization Rodrigo Vivi
2024-01-29 12:12 ` Matthew Auld
2024-01-29 19:01 ` Vivi, Rodrigo
2024-01-30 15:01 ` Gupta, Anshuman
2024-01-26 20:39 ` ✓ CI.Patch_applied: success for Kill mem_access v2 Patchwork
2024-01-26 20:40 ` ✗ CI.checkpatch: warning " Patchwork
2024-01-26 20:40 ` ✗ CI.KUnit: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zd9kkxpCZ174GzwF@intel.com \
--to=rodrigo.vivi@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=matthew.auld@intel.com \
--cc=matthew.brost@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.