From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
	Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Subject: Re: [PATCH v2 2/2] drm/xe/oa: Fix locking for stream->pollin
Date: Thu, 23 Jan 2025 05:25:45 -0500	[thread overview]
Message-ID: <Z5IZKcUpYwKMPVbc@intel.com> (raw)
In-Reply-To: <85frlafcy2.wl-ashutosh.dixit@intel.com>

On Wed, Jan 22, 2025 at 07:04:21PM -0800, Dixit, Ashutosh wrote:
> On Wed, 22 Jan 2025 02:23:23 -0800, Rodrigo Vivi wrote:
> >
> 
> Hi Rodrigo,
> 
> Thanks for your inputs, it helped me look at the code more closely and
> clarify some things.
> 
> I have responded to your comments below, but in case you don't want to go
> through that, the summary is that I would like to drop this patch from
> consideration at this time, basically because it needs more changes.
> 
> > On Tue, Jan 21, 2025 at 08:02:04PM -0800, Ashutosh Dixit wrote:
> > > Previously locking was skipped for stream->pollin. This "mostly" worked
> > > because pollin is u32/bool, even when stream->pollin is accessed
> > > concurrently in multiple threads. However, with stream->pollin moving under
> > > stream->oa_buffer.ptr_lock in this series, implement the correct way to
> > > access stream->pollin, which is to access it under
> > > stream->oa_buffer.ptr_lock.
> > >
> > > v2: Update commit message to explain the "why" of this change (Rodrigo)
> > >     Document the change in scope for stream->oa_buffer.ptr_lock (Rodrigo)
> >
> > First of all thanks for the rework.
> > But I believe I didn't have enough coffee today yet, because
> > I'm still failing to understand why...
> >
> > Breaking your explanation to see if I can understand:
> > 'mostly' - Why mostly? Did we face bugs?
> >
> > 'worked because pollin is u32/bool' - this sounds like 'works by luck'
> 
> No, there are no known bugs in the current code in the kernel. That is what
> 'mostly' is supposed to mean: basically the requirements on this code are
> pretty lax, and even if writes into the boolean stream->pollin from
> different threads cross each other, or they stomp on each other, things
> work out. So not 'works by luck' or 'works most of the time' but the code
> works even in absence of this locking.
> 
> >
> > 'with stream->pollin moving under stream->oa_buffer.ptr_lock' - Why?
> >
> > I believe this is the main why that I had yesterday and that continues
> > today. Why are we using the oa_buffer pointer lock to also protect
> > a stream variable?
> >
> > Why don't you use the stream_lock? Or why don't you create a dedicated
> > polling_lock?
> >
> 
> So, stream->oa_buffer.ptr_lock _is_ the correct lock to use for
> 'stream->pollin', because 'stream->pollin' is intimately connected to the
> same pointers 'ptr_lock' is protecting. You can see this in Patch 1.
> 
> Also, there is basically one 'struct xe_oa_buffer' per stream (and stream_lock
> is a mutex which can't be used from interrupt context).
> 
> So rather than using a stream level lock, we can just move pollin into
> stream->oa_buffer. So it will become stream->oa_buffer.pollin.
> 
> > I'm sorry for not being clear yesterday, wasting your time and 1 cycle...
> 
> No, thanks for your comments, it made me think about all this.
> 
> > > @@ -562,8 +563,10 @@ static ssize_t xe_oa_read(struct file *file, char __user *buf,
> > >	 * Also in case of -EIO, we have already waited for data before returning
> > >	 * -EIO, so need to wait again
> > >	 */
> > > +	spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
> > >	if (ret != -ENOSPC && ret != -EIO)
> > >		stream->pollin = false;
> > > +	spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
> 
> Also this is the part I don't like, this should be moved into
> xe_oa_append_reports() where stream->oa_buffer.ptr_lock is already
> held. Otherwise we are dropping and re-grabbing the lock etc.
> 
> So imo the basic direction of this patch is correct but I want to make all
> these changes and resubmit this as part of a separate series.

Everything makes sense now, thanks for the clarifications and for the work.
The lock might even deserve a more generic name in that series,
e.g. s/ptr_lock/lock/.

> 
> So please
> ignore this patch for now. I will go ahead and merge Patch 1 since that is
> ok and was already reviewed as a separate series:
> 
> https://patchwork.freedesktop.org/series/143575/

on merging that other patch: Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

> 
> Thanks.
> --
> Ashutosh
> 
> 
> 
> > >
> > >	/* Possible values for ret are 0, -EFAULT, -ENOSPC, -EIO, -EINVAL, ... */
> > >	return offset ?: (ret ?: -EAGAIN);
> > > @@ -573,6 +576,7 @@ static __poll_t xe_oa_poll_locked(struct xe_oa_stream *stream,
> > >				  struct file *file, poll_table *wait)
> > >  {
> > >	__poll_t events = 0;
> > > +	unsigned long flags;
> > >
> > >	poll_wait(file, &stream->poll_wq, wait);
> > >
> > > @@ -582,8 +586,10 @@ static __poll_t xe_oa_poll_locked(struct xe_oa_stream *stream,
> > >	 * in use. We rely on hrtimer xe_oa_poll_check_timer_cb to notify us when there
> > >	 * are samples to read
> > >	 */
> > > +	spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
> > >	if (stream->pollin)
> > >		events |= EPOLLIN;
> > > +	spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
> > >
> > >	return events;
> > >  }
> > > diff --git a/drivers/gpu/drm/xe/xe_oa_types.h b/drivers/gpu/drm/xe/xe_oa_types.h
> > > index 52e33c37d5ee8..5c4ea13f646fc 100644
> > > --- a/drivers/gpu/drm/xe/xe_oa_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_oa_types.h
> > > @@ -159,7 +159,7 @@ struct xe_oa_buffer {
> > >	/** @vaddr: mapped vaddr of the OA buffer */
> > >	u8 *vaddr;
> > >
> > > -	/** @ptr_lock: Lock protecting reads/writes to head/tail pointers */
> > > +	/** @ptr_lock: Lock protecting reads/writes to head/tail pointers and stream->pollin */
> > >	spinlock_t ptr_lock;
> > >
> > >	/** @head: Cached head to read from */
> > > --
> > > 2.47.1
> > >


Thread overview: 14+ messages
2025-01-22  4:02 [PATCH v2 0/2] stream->pollin fixes Ashutosh Dixit
2025-01-22  4:02 ` [PATCH 1/2] drm/xe/oa: Set stream->pollin in xe_oa_buffer_check_unlocked Ashutosh Dixit
2025-01-22  4:02 ` [PATCH v2 2/2] drm/xe/oa: Fix locking for stream->pollin Ashutosh Dixit
2025-01-22 10:23   ` Rodrigo Vivi
2025-01-23  3:04     ` Dixit, Ashutosh
2025-01-23 10:25       ` Rodrigo Vivi [this message]
2025-01-22  4:11 ` ✓ CI.Patch_applied: success for stream->pollin fixes (rev2) Patchwork
2025-01-22  4:12 ` ✓ CI.checkpatch: " Patchwork
2025-01-22  4:13 ` ✓ CI.KUnit: " Patchwork
2025-01-22  4:29 ` ✓ CI.Build: " Patchwork
2025-01-22  4:31 ` ✓ CI.Hooks: " Patchwork
2025-01-22  4:33 ` ✓ CI.checksparse: " Patchwork
2025-01-22  4:59 ` ✓ Xe.CI.BAT: " Patchwork
2025-01-22 14:14 ` ✗ Xe.CI.Full: failure " Patchwork