Intel-XE Archive on lore.kernel.org
From: Harish Chegondi <harish.chegondi@intel.com>
To: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
	Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Subject: Re: [PATCH 1/1] drm/xe/eustall: Return EBADFD from read if EU stall registers get reset
Date: Tue, 16 Dec 2025 15:53:01 -0800
Message-ID: <aUHw3Td486iBv0Up@intel.com>
In-Reply-To: <87zf7n4cx9.wl-ashutosh.dixit@intel.com>

On Fri, Dec 12, 2025 at 01:18:10PM -0800, Dixit, Ashutosh wrote:
> On Sun, 07 Dec 2025 22:16:11 -0800, Harish Chegondi wrote:
> >
> 
Hi Ashutosh,
> Hi Harish,
> 
> > @@ -541,9 +541,24 @@ static ssize_t xe_eu_stall_stream_read_locked(struct xe_eu_stall_data_stream *st
> >	size_t total_size = 0;
> >	u16 group, instance;
> >	unsigned int xecore;
> > +	u32 base_reg_value;
> >	int ret = 0;
> >
> >	mutex_lock(&stream->xecore_buf_lock);
> > +	/* If a GT or engine reset happens during EU stall data sampling,
> > +	 * all EU stall registers get reset to 0 and the cached values of
> > +	 * EU stall data buffers' read and write pointers are out of sync
> > +	 * with the register values. This can cause invalid data to be
> > +	 * returned from read(). To prevent this, check the value of a
> > +	 * EU stall base register. If it is zero, return -EBADFD. The
> > +	 * user is expected to close the fd and open a new fd.
> > +	 */
> > +	base_reg_value = xe_gt_mcr_unicast_read_any(gt, XEHPC_EUSTALL_BASE);
> > +	if (unlikely(!base_reg_value)) {
> > +		xe_gt_dbg(gt, "EU stall base register has been reset to 0\n");
> > +		mutex_unlock(&stream->xecore_buf_lock);
> > +		return -EBADFD;
> > +	}
> 
> Since we are introducing an extra register read every read() call here,
> does it make sense to first check if there's a real userland need for this?
I had discussions with the UMD folks, and the feedback I received is that
it would be better to return an error than to return bad EU stall data.
If a reset happens in the middle of EU stall sampling, the cached
circular buffer pointers go out of sync with the registers, leading to
invalid data. However, my understanding is that if a reset happens while
a workload is executing, the workload will fail with an error. So, the
user would probably discard any EU stall data collected. This error code
is an additional signal to the user not to trust the EU stall data
collected so far.
> And actually have a UMD PR which will consume this -EBADFD return value,
> before we merge this?
I agree that the UMDs may have to do additional work on their end, but
their PRs don't have to be merged before this patch. If EU stall read()
returns any error, the UMDs would probably stop reading EU stall data,
so even without changes they would stop on this new error code as well.
Upon receiving this new error, UMDs should stop reading the data, close
the fd, open a new fd and read again.
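
For illustration, here is a minimal sketch of that UMD flow. It is not
taken from any existing UMD: the stall_ops callbacks, stall_read_reopen()
and the fake_* backend are hypothetical names, and a real UMD would wrap
its own stream-open ioctl, read() and close() behind them. The sketch
also assumes the callbacks return -errno directly, whereas a real read()
returns -1 and sets errno to EBADFD.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical callbacks, not part of the xe uAPI. */
struct stall_ops {
	int (*open_stream)(void *ctx);		/* returns fd or -errno */
	ssize_t (*read_stream)(void *ctx, int fd, void *buf, size_t len);
	void (*close_stream)(void *ctx, int fd);
};

/*
 * Read from *fd. On -EBADFD, close the stale stream, open a new one and
 * retry, at most max_reopens times. Data read before the reset should be
 * treated as untrustworthy by the caller.
 */
static ssize_t stall_read_reopen(const struct stall_ops *ops, void *ctx,
				 int *fd, void *buf, size_t len,
				 int max_reopens)
{
	for (;;) {
		ssize_t n = ops->read_stream(ctx, *fd, buf, len);

		if (n != -EBADFD)
			return n;	/* data, or an unrelated error */
		if (max_reopens-- <= 0)
			return -EBADFD;	/* give up, report the reset */

		ops->close_stream(ctx, *fd);
		*fd = ops->open_stream(ctx);
		if (*fd < 0)
			return *fd;	/* reopen failed */
	}
}

/* Simulated backend, for demonstration only. */
struct fake_dev {
	int opens, closes;
	int regs_reset;		/* set to 1 to emulate a GT/engine reset */
};

static int fake_open(void *ctx)
{
	struct fake_dev *d = ctx;

	d->regs_reset = 0;	/* a fresh stream reprograms the registers */
	return 3 + d->opens++;
}

static ssize_t fake_read(void *ctx, int fd, void *buf, size_t len)
{
	struct fake_dev *d = ctx;
	size_t n = len < 16 ? len : 16;

	(void)fd;
	if (d->regs_reset)
		return -EBADFD;
	memset(buf, 0, n);
	return (ssize_t)n;
}

static void fake_close(void *ctx, int fd)
{
	struct fake_dev *d = ctx;

	(void)fd;
	d->closes++;
}

static const struct stall_ops fake_ops = {
	.open_stream = fake_open,
	.read_stream = fake_read,
	.close_stream = fake_close,
};
```

Bounding the number of reopens keeps a UMD from looping forever if resets
keep happening; whether to retry at all or to abort the profiling session
is a policy choice for the UMD.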
> 
> Thanks.
> --
> Ashutosh

Thank You
Harish.
