From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 23 Mar 2026 13:17:24 -0700
Message-ID: <877br2qp0r.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Harish Chegondi <harish.chegondi@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <felix.j.degrood@intel.com>,
 <matias.a.cabral@intel.com>, <joshua.santosh.ranjan@intel.com>
Subject: Re: [PATCH v2 1/1] drm/xe/eustall: Return EBADFD from read if EU
 stall registers get reset
In-Reply-To: <ab21JvEQ9nQJzZ2A@intel.com>
References: <52d991cc7e8bec514bb582717a1c42033672d4a5.1773683739.git.harish.chegondi@intel.com>	<87se9xpqub.wl-ashutosh.dixit@intel.com>	<absZXhGMwjcNb-gn@intel.com>	<878qboqxq9.wl-ashutosh.dixit@intel.com>	<ab21JvEQ9nQJzZ2A@intel.com>
User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue)
 FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0
 Emacs/30.2 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset=US-ASCII

On Fri, 20 Mar 2026 13:59:18 -0700, Harish Chegondi wrote:
>
> On Wed, Mar 18, 2026 at 08:55:42PM -0700, Dixit, Ashutosh wrote:
> > On Wed, 18 Mar 2026 14:30:06 -0700, Harish Chegondi wrote:
> > >
> > > On Tue, Mar 17, 2026 at 11:57:32PM -0700, Dixit, Ashutosh wrote:
> > > > On Mon, 16 Mar 2026 10:58:56 -0700, Harish Chegondi wrote:
> > > > >
> > > > > If a reset (GT or engine) happens during EU stall data sampling, all the
> > > > > > EU stall registers can get reset to 0. This leaves the EU stall data
> > > > > > buffers' read and write pointer register values out of sync with
> > > > > > the cached values, so read() returns invalid data. To
> > > > > > prevent this, check the value of an EU stall base register. If it is zero,
> > > > > it indicates a reset may have happened that wiped the register to zero.
> > > > > If this happens, return EBADFD from read() upon which the user space
> > > > > should close the fd and open a new fd for a new EU stall data
> > > > > collection session.
> > > > >
> > > > > Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > > > > Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
> > > > > ---
> > > > > v2: Move base register check from read to the poll function
> > > > >
> > > > >  drivers/gpu/drm/xe/xe_eu_stall.c | 24 +++++++++++++++++++++++-
> > > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/xe_eu_stall.c b/drivers/gpu/drm/xe/xe_eu_stall.c
> > > > > index c34408cfd292..7e14de73a2c9 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_eu_stall.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_eu_stall.c
> > > > > @@ -44,6 +44,7 @@ struct per_xecore_buf {
> > > > >  struct xe_eu_stall_data_stream {
> > > > >	bool pollin;
> > > > >	bool enabled;
> > > > > +	bool reset_detected;
> > > > >	int wait_num_reports;
> > > > >	int sampling_rate_mult;
> > > > >	wait_queue_head_t poll_wq;
> > > > > @@ -428,6 +429,17 @@ static bool eu_stall_data_buf_poll(struct xe_eu_stall_data_stream *stream)
> > > > >			set_bit(xecore, stream->data_drop.mask);
> > > > >		xecore_buf->write = write_ptr;
> > > > >	}
> > > > > +	/* If a GT or engine reset happens during EU stall sampling,
> > > > > +	 * all EU stall registers get reset to 0 and the cached values of
> > > > > +	 * the EU stall data buffers' read pointers are out of sync with
> > > > > +	 * the register values. This causes invalid data to be returned
> > > > > +	 * from read(). To prevent this, check the value of a EU stall base
> > > > > +	 * register. If it is zero, there has been a reset.
> > > > > +	 */
> > > >
> > > > As previously discussed, the best way would have been to not have to do
> > > > this. We would just plug into the handler for the reset message from GuC,
> > > > rather than to implement a reset detection here (and in other places such
> > > > as OA). But looks like if we do that, because of the way EUSS registers are
> > > > reset, we can return bad EUSS data. So looks like there is no way around
> > > > doing this "reset detection" here and a solution with the GuC reset handler
> > > > would always be racy. Just for the record.
> > >
> > > Thanks for the summary of the previous discussion. Yes, hooking into the
> > > GuC reset notification handler will be racy and bad EUSS data will be
> > > returned to the user space if read() happens after the reset but before
> > > the GuC reset notification message is processed. That's the reason for
> > > not taking that approach.
> > >
> > > >
> > > > > +	if (unlikely(!xe_gt_mcr_unicast_read_any(gt, XEHPC_EUSTALL_BASE))) {
> > > > > +		stream->reset_detected = true;
> > > > > +		min_data_present = true;
> > > >
> > > > I don't believe we need to set 'min_data_present = true' if we are setting
> > > > 'stream->reset_detected = true', correct? See if statement at the bottom.
> > >
> > > Agree. The only difference is that the if statement at the bottom will
> > > evaluate true in the current execution of eu_stall_data_buf_poll_work_fn
> > > if min_data_present is set to true. If min_data_present is not set to
> > > true, the if statement will evaluate to true in the subsequent execution
> > > of eu_stall_data_buf_poll_work_fn() which is still okay. So, yes, we
> > > don't have to set min_data_present to true here. Will fix in the next
> > > version.
> >
> > Just switch the order of the two OR operands and you don't have that issue?
> But switching the order of the two OR operands would cause the function
> eu_stall_data_buf_poll() to be unnecessarily called even after a reset
> is detected.

Hmm, in that case why even keep rescheduling the work? I took a stab at
doing this a little differently and have sent a v3 patch. Can you take a
look at that and see what you think? The patch is only compile-tested.
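
To illustrate the question above, here is a minimal userspace model of a
work function that wakes the reader and stops rescheduling itself once a
reset is latched. It is a sketch under assumptions, not the v3 patch:
struct model_stream, model_buf_poll() and model_poll_work() are
hypothetical stand-ins for the driver code, and base_reg stands in for a
read of XEHPC_EUSTALL_BASE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of the stream state; not the xe structs. */
struct model_stream {
	bool pollin;
	bool reset_detected;
	bool min_data_present;   /* stands in for the write pointer scan result */
	uint32_t base_reg;       /* stands in for XEHPC_EUSTALL_BASE */
	unsigned int poll_calls; /* number of times the poll helper ran */
};

/* Model of the poll helper: latch the flag if the base register reads 0. */
static bool model_buf_poll(struct model_stream *s)
{
	assert(s != NULL);
	s->poll_calls++;
	if (s->base_reg == 0)
		s->reset_detected = true;
	return s->min_data_present;
}

/*
 * Model of the work function. Returns true if the work should be queued
 * again, false if polling should stop because a reset was detected.
 */
static bool model_poll_work(struct model_stream *s)
{
	bool data = model_buf_poll(s);

	if (s->reset_detected) {
		s->pollin = true; /* wake the reader so read() can fail */
		return false;     /* do not reschedule after a reset */
	}
	if (data)
		s->pollin = true;
	return true;
}
```

In this model the reader is woken in the same execution that detects the
reset, without touching min_data_present, and the poll helper is never
called again afterwards because the work is not requeued.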

I also have a different idea now: should we just call release() and close
the fd in the kernel itself if a reset is detected? I will check with Umesh
about this; I think there was something in perf_pmu which was doing it. The
kernel will automatically return -EBADFD to user land when user land tries
to use the closed fd. I am not sure if we'll eventually do this, but at
least it's good to understand if doing this is feasible.
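
For reference, the user space side of the errno contract discussed in this
thread could look roughly like the sketch below: -EIO signals dropped data
and reading continues, while -EBADFD means the fd must be closed and a new
stream opened. The enum and classify_read_result() are hypothetical names
for illustration only, and EBADFD is Linux-specific.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Hypothetical actions a reader loop could take after each read(). */
enum stall_read_action {
	STALL_READ_CONSUME,  /* read() returned data (or 0 bytes) */
	STALL_READ_CONTINUE, /* data was dropped; keep reading on the same fd */
	STALL_READ_REOPEN,   /* stream is unusable; close and open a new one */
	STALL_READ_FAIL,     /* any other error; let the caller decide */
};

/* Map a read() return value and the saved errno to an action. */
static enum stall_read_action classify_read_result(ssize_t ret, int err)
{
	if (ret >= 0)
		return STALL_READ_CONSUME;

	assert(err > 0); /* errno values are positive */
	switch (err) {
	case EIO:
		return STALL_READ_CONTINUE;
	case EBADFD:
		return STALL_READ_REOPEN;
	default:
		return STALL_READ_FAIL;
	}
}
```

A caller would save errno right after a failed read() and pass it in, for
example classify_read_result(n, errno).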

> >
> > > >
> > > > Also, since the write pointer itself gets reset during reset, didn't we
> > > > want to do this register read only when the write pointer is 0 (to avoid an
> > > > extra register read every 5 ms)?
> > >
> > > Good point. I have thought about reducing the number of this register
> > > reads. The poll function reads the write pointers of all the xecores.
> > > A reset can happen anytime the poll function is reading the write
> > > pointers of the xecores. If the reset happens before the poll function
> > > started reading the write pointers, all write pointers are zeros.
> > > If the reset happens during the poll function, several write pointers
> > > read so far can be non-zero while the rest of the pointers after reset
> > > are all zeros. If the reset happens right after the poll function, the
> > > write pointers can be a mix of zeros and non-zeros.
> > > I think the only time this register read can be skipped is if the
> > > LAST write pointer read is non-zero which means a reset did not happen
> > > before or during the poll function. Do you agree? I thought of adding a
> > > check to the if statement to check if the last write pointer is
> > > non-zero, but to keep the code clean, I didn't. Also, if there are
> > > n xecores, there will be n write pointer register reads plus one
> > > additional base register read, which isn't too bad? Also, hoping the use
> > > of unlikely macro would not impact the performance too much.
> >
> > OK, leave this as is I think. Otherwise we'll need to read the base
> > register each time we see a zero write pointer. So it's ok, leave as is.
> >
> > > >
> > > > > +	}
> > > > >	mutex_unlock(&stream->xecore_buf_lock);
> > > > >
> > > > >	return min_data_present;
> > > > > @@ -554,6 +566,15 @@ static ssize_t xe_eu_stall_stream_read_locked(struct xe_eu_stall_data_stream *st
> > > > >		}
> > > > >		stream->data_drop.reported_to_user = false;
> > > > >	}
> > > > > +	/* If EU stall registers got reset due to a GT/engine reset,
> > > > > +	 * continuing with the read() will return invalid data to
> > > > > +	 * the user space. Just return -EBADFD instead.
> > > > > +	 */
> > > > > +	if (unlikely(stream->reset_detected)) {
> > > > > +		xe_gt_dbg(gt, "EU stall base register has been reset\n");
> > > > > +		mutex_unlock(&stream->xecore_buf_lock);
> > > > > +		return -EBADFD;
> > > >
> > > > The other option is to return -EIO here and implement
> > > > DRM_XE_OBSERVATION_IOCTL_STATUS and return status from that. Let me think
> > > > some more about this.
> > >
> > > I think EBADFD is more appropriate errno than EIO in this case since the
> > > fd is in a corrupted state and user has to close and re-open the fd.
> > > Currently, -EIO is used to indicate dropped data, in which case the
> > > user space can continue to read the data (faster) without closing the fd.
> >
> > OK, we can go with -EBADFD, though still thinking about it.
> >
> > > >
> > > > > +	}
> > > > >
> > > > >	for_each_dss_steering(xecore, gt, group, instance) {
> > > > >		ret = xe_eu_stall_data_buf_read(stream, buf, count, &total_size,
> > > > > @@ -692,6 +713,7 @@ static int xe_eu_stall_stream_enable(struct xe_eu_stall_data_stream *stream)
> > > > >		xecore_buf->write = write_ptr;
> > > > >		xecore_buf->read = write_ptr;
> > > > >	}
> > > > > +	stream->reset_detected = false;
> > > >
> > > > So after reset, if a stream is disabled and re-enabled, we expect things to
> > > > work again and EUSS data to be correct (without re-opening a new
> > > > stream)?
> > >
> > > Technically, yes, since the EU stall registers programming is done in
> > > enable, things will work again if the stream is disabled and re-enabled.
> > > But if the EUSS registers programming is moved into open() in the
> > > future, things may not work by disabling and re-enabling the stream. So,
> > > I think we suggest to the UMDs to close the stream and open a new
> > > stream.
> >
> > No, we don't suggest anything to UMDs. We decide what we want to do,
> > implement and enforce it that way and then maintain that uapi.
> >
> > OK, then let us make sure after disable/enable, the reset_detected flag
> > remains set.
> Okay, I will make sure the reset_detected flag remains set even after a
> disable and an enable.

I think we just need to not reset it to false here. I took out this line
in the v3 I sent.
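
As a small illustration of the intended behavior (a userspace model only;
struct sticky_stream and the sticky_*() helpers are hypothetical and not
driver code), the flag stays latched across disable/enable because enable
does not clear it, so read keeps failing until the stream is reopened:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two stream fields that matter here. */
struct sticky_stream {
	bool enabled;
	bool reset_detected;
};

static void sticky_disable(struct sticky_stream *s)
{
	assert(s != NULL);
	s->enabled = false;
}

/* Enable reprograms the stream but deliberately leaves reset_detected alone. */
static void sticky_enable(struct sticky_stream *s)
{
	assert(s != NULL);
	s->enabled = true;
}

/* Model of read(): fail with -EBADFD for as long as the flag is latched. */
static int sticky_read(const struct sticky_stream *s)
{
	assert(s != NULL);
	if (s->reset_detected)
		return -EBADFD;
	return 0;
}
```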

> > > >
> > > > >	stream->data_drop.reported_to_user = false;
> > > > >	bitmap_zero(stream->data_drop.mask, XE_MAX_DSS_FUSE_BITS);
> > > > >
> > > > > @@ -717,7 +739,7 @@ static void eu_stall_data_buf_poll_work_fn(struct work_struct *work)
> > > > >		container_of(work, typeof(*stream), buf_poll_work.work);
> > > > >	struct xe_gt *gt = stream->gt;
> > > > >
> > > > > -	if (eu_stall_data_buf_poll(stream)) {
> > > > > +	if (stream->reset_detected || eu_stall_data_buf_poll(stream)) {
> > > > >		stream->pollin = true;
> > > > >		wake_up(&stream->poll_wq);
> > > > >	}
> > > > > --
> > > > > 2.43.0
> > > > >