From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: intel-xe@lists.freedesktop.org
Subject: Re: [Intel-xe] [PATCH] drm/xe: Fix locking in CT fast path
Date: Tue, 21 Mar 2023 17:14:06 -0400
Message-ID: <ZBoeHi82lpK+w0vg@intel.com>
In-Reply-To: <ZBnAdXk4+Idtg9lo@DUT025-TGLU.fm.intel.com>
On Tue, Mar 21, 2023 at 02:34:29PM +0000, Matthew Brost wrote:
> On Tue, Mar 21, 2023 at 01:25:51PM +0100, Maarten Lankhorst wrote:
> > Hey,
> >
> > I'm afraid this is not allowed, you can't take a mutex in an irq handler, not even a trylock.
> >
> > From Documentation/locking/mutex-design.rst:
> >
> > The mutex subsystem checks and enforces the following rules:
> > ...
> > - Mutexes may not be used in hardware or software interrupt
> > contexts such as tasklets and timers.
> >
>
> I wasn't aware of this, but the DOC makes it clear this isn't allowed.
>
> > Lockdep will likely still splat too as a result.
> >
>
> Lockdep is happy which is very odd since clearly this isn't allowed per
> the DOC.
This is strange... in general lockdep is loud when you try to take a mutex
inside an irq. Is some .config option missing? Or maybe the trylock itself
is misleading lockdep?!
>
> Anyways, I'm thinking your atomic fix is needed
Yes, this is becoming unavoidable. I would prefer some lock over an atomic.
Maybe a spinlock?
Or we need to be really sure that there won't be any race where we end
up with an access before the wakeup.
The runtime_pm doc even suggests that all the memory accesses should be
serialized instead of what we are trying to do currently with the
mem_access. Any thoughts on whether it is possible to serialize them in our
case? Check the 'foo_' examples in Documentation/power/runtime_pm.rst
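To make the spinlock idea a bit more concrete, here is a minimal userspace
sketch in C11, not driver code. The names locked_mem_access, lma_lock,
lma_unlock, lma_run_if_active and count_call are all hypothetical, and the
atomic_flag loop only stands in for a kernel spinlock. It shows the reference
check and the work running under one lock that never sleeps, so the last
reference cannot be dropped in between, provided the put side takes the same
lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mem_access state; not the driver's layout. */
struct locked_mem_access {
	atomic_flag lock;	/* toy spinlock, stands in for a kernel spinlock */
	int ref;		/* protected by lock */
};

static void lma_lock(struct locked_mem_access *ma)
{
	/* Busy-wait until the flag was previously clear; never sleeps. */
	while (atomic_flag_test_and_set_explicit(&ma->lock, memory_order_acquire))
		;
}

static void lma_unlock(struct locked_mem_access *ma)
{
	atomic_flag_clear_explicit(&ma->lock, memory_order_release);
}

/* Run work() only while a reference is held; returns whether it ran. */
static bool lma_run_if_active(struct locked_mem_access *ma,
			      void (*work)(void *), void *arg)
{
	bool active;

	lma_lock(ma);
	active = ma->ref > 0;
	if (active)
		work(arg);
	lma_unlock(ma);

	return active;
}

/* Example callback for illustration only: counts how often it was called. */
static void count_call(void *arg)
{
	++*(int *)arg;
}
```

In the driver this would presumably map onto a real spinlock, with the
irq-safe variants where needed. That mapping is an assumption on my side, not
something taken from the patch.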
> but likely also need a
> follow on to this patch as well something like:
>
> xe_device_mem_access_get_if_active();
hmmm... I didn't want to grow the mem_access into an rpm wrapper for all
cases like we ended up with in i915... but this might be unavoidable for
this case...
> do CT fast path...
> xe_device_mem_access_put_async();
>
> The key being we can't sleep but also can't power down access to the
> VRAM when the CT fast path is executing.
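For reference, the get_if_active half of this could look roughly like the
userspace C11 sketch below. It is only an illustration under assumptions:
struct mem_access, mem_access_get_if_active and mem_access_put are
hypothetical names, the counter is a plain atomic_int rather than whatever
the driver ends up using, and the "async" part of the put (deferring the
actual power-down) is not modelled at all. The point is the
increment-only-if-nonzero loop, which never blocks.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for xe->mem_access.ref; not the driver's layout. */
struct mem_access {
	atomic_int ref;
};

/* Take a reference only if one is already held; never blocks or sleeps. */
static bool mem_access_get_if_active(struct mem_access *ma)
{
	int old = atomic_load(&ma->ref);

	while (old > 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&ma->ref, &old, old + 1))
			return true;
	}

	return false;
}

/* Drop a reference; returns true if the caller released the last one. */
static bool mem_access_put(struct mem_access *ma)
{
	return atomic_fetch_sub(&ma->ref, 1) == 1;
}
```

A caller in the fast path would return early when the get fails and fall back
to the slow path, and would pair a successful get with a put once the
messages have been processed.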
>
> Matt
>
> > Cheers,
> > ~Maarten
> >
> > On 2023-03-17 01:22, Matthew Brost wrote:
> > > > We can't sleep in the CT fast path but need to ensure we can access VRAM. Use
> > > a trylock + reference counter check to ensure safe access to VRAM, if
> > > either check fails, fall back to slow path.
> > >
> > > VLK-45296
> > >
> > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > ---
> > > drivers/gpu/drm/xe/xe_device.h | 9 ++++++++-
> > > drivers/gpu/drm/xe/xe_guc_ct.c | 11 ++++++++++-
> > > 2 files changed, 18 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> > > index 25c5087f5aad..0cc4f52098a1 100644
> > > --- a/drivers/gpu/drm/xe/xe_device.h
> > > +++ b/drivers/gpu/drm/xe/xe_device.h
> > > @@ -95,12 +95,19 @@ static inline void xe_device_assert_mem_access(struct xe_device *xe)
> > > XE_WARN_ON(!xe->mem_access.ref);
> > > }
> > > +static inline bool __xe_device_mem_access_ongoing(struct xe_device *xe)
> > > +{
> > > + lockdep_assert_held(&xe->mem_access.lock);
> > > +
> > > + return xe->mem_access.ref;
> > > +}
> > > +
> > > static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
> > > {
> > > bool ret;
> > > mutex_lock(&xe->mem_access.lock);
> > > - ret = xe->mem_access.ref;
> > > + ret = __xe_device_mem_access_ongoing(xe);
> > > mutex_unlock(&xe->mem_access.lock);
> > > return ret;
> > > diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> > > index e5ed9022a0a2..bba0ef21c9e5 100644
> > > --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> > > +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> > > @@ -1030,9 +1030,15 @@ void xe_guc_ct_fast_path(struct xe_guc_ct *ct)
> > > struct xe_device *xe = ct_to_xe(ct);
> > > int len;
> > > - if (!xe_device_in_fault_mode(xe) || !xe_device_mem_access_ongoing(xe))
> > > + if (!xe_device_in_fault_mode(xe))
> > > return;
> > > + if (!mutex_trylock(&xe->mem_access.lock))
> > > + return;
> > > +
> > > + if (!__xe_device_mem_access_ongoing(xe))
> > > + goto unlock;
> > > +
> > > spin_lock(&ct->fast_lock);
> > > do {
> > > len = g2h_read(ct, ct->fast_msg, true);
> > > @@ -1040,6 +1046,9 @@ void xe_guc_ct_fast_path(struct xe_guc_ct *ct)
> > > g2h_fast_path(ct, ct->fast_msg, len);
> > > } while (len > 0);
> > > spin_unlock(&ct->fast_lock);
> > > +
> > > +unlock:
> > > + mutex_unlock(&xe->mem_access.lock);
> > > }
> > > /* Returns less than zero on error, 0 on done, 1 on more available */