Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: intel-xe@lists.freedesktop.org
Subject: Re: [Intel-xe] [PATCH] drm/xe: Fix locking in CT fast path
Date: Tue, 21 Mar 2023 17:14:06 -0400	[thread overview]
Message-ID: <ZBoeHi82lpK+w0vg@intel.com> (raw)
In-Reply-To: <ZBnAdXk4+Idtg9lo@DUT025-TGLU.fm.intel.com>

On Tue, Mar 21, 2023 at 02:34:29PM +0000, Matthew Brost wrote:
> On Tue, Mar 21, 2023 at 01:25:51PM +0100, Maarten Lankhorst wrote:
> > Hey,
> > 
> > I'm afraid this is not allowed, you can't take a mutex in an irq handler, not even a trylock.
> > 
> > From Documentation/locking/mutex-design.rst:
> > 
> > The mutex subsystem checks and enforces the following rules:
> > ...
> >     - Mutexes may not be used in hardware or software interrupt
> >       contexts such as tasklets and timers.
> > 
> 
> I wasn't aware of this byr DOC makes it clear this isn;t allowed. 
> 
> > Lockdep will likely still splat too as a result.
> >
> 
> Lockdep is happy which is very odd since clearly this isn't allowed per
> the DOC.

This is strange... in general it is loud when you try mutex inside irq.
some .config missing? or maybe the trylock itself misleading lockdep?!

> 
> Anyways, I'm thinking your atomic fix is needed

Yes, this is becoming un-avoidable. I would prefer some lock than atomic.
Maybe a spinlock?
Or we need to be really sure that there won't be any race where we end
with an access before the wakeup.

The runtime_pm doc even suggest that all the memory accesses should be
serialized instead of what we are trying to do currently with the
mem_access. Thoughts on if it is possible to serialize them on our cases?

check for the 'foo_' examples at Documentation/power/runtime_pm.txt

> but likely also need a
> follow on to this patch as well something like:
> 
> xe_device_mem_access_get_if_active();

hmmm... I didn't want to grow the mem_access into an rpm wrapper for all
cases like we ended up in i915... but this might be unavoidable for this
case...

> do CT fast path... 
> xe_device_mem_access_put_async();
> 
> The key being we can't sleep but also can't power down access to the
> VRAM when the CT fast path is executing.
> 
> Matt
> 
> > Cheers,
> > ~Maarten
> > 
> > On 2023-03-17 01:22, Matthew Brost wrote:
> > > We can't sleep in the CT fast but need to ensure we can access VRAM. Use
> > > a trylock + reference counter check to ensure safe access to VRAM, if
> > > either check fails, fall back to slow path.
> > > 
> > > VLK-45296
> > > 
> > > Signed-off-by: Matthew Brost<matthew.brost@intel.com>
> > > ---
> > >   drivers/gpu/drm/xe/xe_device.h |  9 ++++++++-
> > >   drivers/gpu/drm/xe/xe_guc_ct.c | 11 ++++++++++-
> > >   2 files changed, 18 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> > > index 25c5087f5aad..0cc4f52098a1 100644
> > > --- a/drivers/gpu/drm/xe/xe_device.h
> > > +++ b/drivers/gpu/drm/xe/xe_device.h
> > > @@ -95,12 +95,19 @@ static inline void xe_device_assert_mem_access(struct xe_device *xe)
> > >   	XE_WARN_ON(!xe->mem_access.ref);
> > >   }
> > > +static inline bool __xe_device_mem_access_ongoing(struct xe_device *xe)
> > > +{
> > > +	lockdep_assert_held(&xe->mem_access.lock);
> > > +
> > > +	return xe->mem_access.ref;
> > > +}
> > > +
> > >   static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
> > >   {
> > >   	bool ret;
> > >   	mutex_lock(&xe->mem_access.lock);
> > > -	ret = xe->mem_access.ref;
> > > +	ret = __xe_device_mem_access_ongoing(xe);
> > >   	mutex_unlock(&xe->mem_access.lock);
> > >   	return ret;
> > > diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> > > index e5ed9022a0a2..bba0ef21c9e5 100644
> > > --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> > > +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> > > @@ -1030,9 +1030,15 @@ void xe_guc_ct_fast_path(struct xe_guc_ct *ct)
> > >   	struct xe_device *xe = ct_to_xe(ct);
> > >   	int len;
> > > -	if (!xe_device_in_fault_mode(xe) || !xe_device_mem_access_ongoing(xe))
> > > +	if (!xe_device_in_fault_mode(xe))
> > >   		return;
> > > +	if (!mutex_trylock(&xe->mem_access.lock))
> > > +		return;
> > > +
> > > +	if (!__xe_device_mem_access_ongoing(xe))
> > > +		goto unlock;
> > > +
> > >   	spin_lock(&ct->fast_lock);
> > >   	do {
> > >   		len = g2h_read(ct, ct->fast_msg, true);
> > > @@ -1040,6 +1046,9 @@ void xe_guc_ct_fast_path(struct xe_guc_ct *ct)
> > >   			g2h_fast_path(ct, ct->fast_msg, len);
> > >   	} while (len > 0);
> > >   	spin_unlock(&ct->fast_lock);
> > > +
> > > +unlock:
> > > +	mutex_unlock(&xe->mem_access.lock);
> > >   }
> > >   /* Returns less than zero on error, 0 on done, 1 on more available */

  reply	other threads:[~2023-03-21 21:14 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-17  0:22 [Intel-xe] [PATCH] drm/xe: Fix locking in CT fast path Matthew Brost
2023-03-17  0:24 ` [Intel-xe] ✓ CI.Patch_applied: success for " Patchwork
2023-03-17  0:25 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork
2023-03-21 12:25 ` [Intel-xe] [PATCH] " Maarten Lankhorst
2023-03-21 14:34   ` Matthew Brost
2023-03-21 21:14     ` Rodrigo Vivi [this message]
2023-03-22 12:18       ` Maarten Lankhorst

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZBoeHi82lpK+w0vg@intel.com \
    --to=rodrigo.vivi@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=matthew.brost@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox