intel-xe.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Harish Chegondi <harish.chegondi@intel.com>
Cc: <intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH v8 4/7] drm/xe/eustall: Return -EIO error from read() if HW drops data
Date: Wed, 29 Jan 2025 20:45:59 -0800	[thread overview]
Message-ID: <857c6cgb94.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <9dc0f747ae215f9704fe1edec2e7007e19f8dd0b.1736970203.git.harish.chegondi@intel.com>

On Wed, 15 Jan 2025 12:02:10 -0800, Harish Chegondi wrote:
>
> If the user space doesn't read the EU stall data fast enough,
> it is possible that the EU stall data buffer can get filled,
> and if the hardware wants to write more data, it simply drops
> data due to unavailable buffer space. In that case, hardware
> sets a bit in a register. If the driver detects data drop,
> the driver read() returns -EIO error to let the user space
> know that HW has dropped data. The -EIO error is returned
> even if there is EU stall data in the buffer. A subsequent
> read by the user space returns the remaining EU stall data.

As I mentioned earlier, entire dropped packet handling should be in this
patch, so we can see the entire logic around this. So data_drop struct
should be defined in this patch.

>
> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_eu_stall.c | 12 ++++++++++++
>  drivers/gpu/drm/xe/xe_eu_stall.h |  1 +
>  2 files changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_eu_stall.c b/drivers/gpu/drm/xe/xe_eu_stall.c
> index c388d733b857..437782f8433c 100644
> --- a/drivers/gpu/drm/xe/xe_eu_stall.c
> +++ b/drivers/gpu/drm/xe/xe_eu_stall.c
> @@ -472,6 +472,7 @@ xe_eu_stall_stream_read_locked(struct xe_eu_stall_data_stream *stream,
>   * before calling read().
>   *
>   * Returns: The number of bytes copied or a negative error code on failure.
> + *	    -EIO if HW drops any EU stall data when the buffer is full.
>   */
>  static ssize_t xe_eu_stall_stream_read(struct file *file, char __user *buf,
>				       size_t count, loff_t *ppos)
> @@ -485,6 +486,16 @@ static ssize_t xe_eu_stall_stream_read(struct file *file, char __user *buf,
>		return -EINVAL;
>	}
>
> +	if (bitmap_weight(stream->data_drop.mask, XE_MAX_DSS_FUSE_BITS)) {

Since data_drop.mask is being touched elsewhere under xecore_buf->lock,
here also it should be accessed under the same lock. So this returning -EIO
should probably be moved into xe_eu_stall_stream_read_locked?

> +		if (!stream->data_drop.reported_to_user) {
> +			stream->data_drop.reported_to_user = true;
> +			xe_gt_dbg(gt, "EU stall data dropped in XeCores: %*pb\n",
> +				  XE_MAX_DSS_FUSE_BITS, stream->data_drop.mask);
> +			return -EIO;
> +		}
> +		stream->data_drop.reported_to_user = false;

I don't think this logic is correct. We should set this to false only after
we have cleared all set bits (e.g. only after bitmap_weight) otherwise we
might keep returning -EIO multiple times?

If HW continues to drop data and keep setting the line, while we are
resetting the bit, it is possible bitmap_weight might never become 0. I
think that is ok, we have returned -EIO at least once to indicate to
userspace that it is not reading data fast enough and HW is dropping data.

Or we may return -EIO multiple times as is happening here, where
reported_to_user is set to 0 before all bits might have been cleared. So
what is happening here might be ok too.

To see this clearly and evaluate it is why I am saying move all of this
data drop handling and -EIO return into this one patch. So we can decide
which approach to take: return -EIO just once or return multiple times.

We can also maybe defer this patch and merge the other stuff first if it's
a separate patch.

So maybe this is ok, maybe not, anyway something to think about.

> +	}
> +
>	if (!(file->f_flags & O_NONBLOCK)) {
>		do {
>			if (!stream->pollin) {
> @@ -680,6 +691,7 @@ static int xe_eu_stall_stream_init(struct xe_eu_stall_data_stream *stream,
>	if (!stream->xecore_buf)
>		return -ENOMEM;
>
> +	stream->data_drop.reported_to_user = false;
>	bitmap_zero(stream->data_drop.mask, XE_MAX_DSS_FUSE_BITS);

Stream is kzalloc'd, why do you need to init these?

>
>	xe_pm_runtime_get(gt_to_xe(gt));
> diff --git a/drivers/gpu/drm/xe/xe_eu_stall.h b/drivers/gpu/drm/xe/xe_eu_stall.h
> index f97c8bf8e852..8bc44e9e98af 100644
> --- a/drivers/gpu/drm/xe/xe_eu_stall.h
> +++ b/drivers/gpu/drm/xe/xe_eu_stall.h
> @@ -31,6 +31,7 @@ struct xe_eu_stall_data_stream {
>	struct xe_bo *bo;
>	struct per_xecore_buf *xecore_buf;
>	struct {
> +		bool reported_to_user;
>		xe_dss_mask_t mask;
>	} data_drop;
>	struct hrtimer poll_check_timer;
> --
> 2.47.1
>

  reply	other threads:[~2025-01-30  4:46 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-15 20:02 [PATCH v8 0/7] Add support for EU stall sampling Harish Chegondi
2025-01-15 20:02 ` [PATCH v8 1/7] drm/xe/topology: Add a function to find the index of the last enabled DSS in a mask Harish Chegondi
2025-01-17 17:25   ` Dixit, Ashutosh
2025-01-22  5:18     ` Harish Chegondi
2025-01-15 20:02 ` [PATCH v8 2/7] drm/xe/uapi: Introduce API for EU stall sampling Harish Chegondi
2025-01-17 19:02   ` Dixit, Ashutosh
2025-01-22 23:44     ` Harish Chegondi
2025-01-23  2:19       ` Dixit, Ashutosh
2025-01-15 20:02 ` [PATCH v8 3/7] drm/xe/eustall: Implement EU stall sampling APIs for Xe_HPC Harish Chegondi
2025-01-18  2:34   ` Dixit, Ashutosh
2025-01-23 18:51   ` Dixit, Ashutosh
2025-01-25  3:09   ` [PATCH v8 3 " Dixit, Ashutosh
2025-01-29  4:12   ` [PATCH v8 4 " Dixit, Ashutosh
2025-01-29  4:32     ` Dixit, Ashutosh
2025-01-30 18:46     ` Harish Chegondi
2025-01-31  3:23       ` Dixit, Ashutosh
2025-01-15 20:02 ` [PATCH v8 4/7] drm/xe/eustall: Return -EIO error from read() if HW drops data Harish Chegondi
2025-01-30  4:45   ` Dixit, Ashutosh [this message]
2025-01-30 17:05     ` Dixit, Ashutosh
2025-01-31 21:50       ` Harish Chegondi
2025-01-31 19:30     ` Harish Chegondi
2025-01-31 20:19       ` Dixit, Ashutosh
2025-01-31 22:59         ` Harish Chegondi
2025-02-01  0:13           ` Dixit, Ashutosh
2025-02-01  6:57             ` Dixit, Ashutosh
2025-01-15 20:02 ` [PATCH v8 5/7] drm/xe/eustall: Add EU stall sampling support for Xe2 Harish Chegondi
2025-01-30  4:55   ` Dixit, Ashutosh
2025-02-05  1:16     ` Olson, Matthew
2025-02-05  1:57       ` Dixit, Ashutosh
2025-02-05 19:03         ` Olson, Matthew
2025-02-05 20:02           ` Dixit, Ashutosh
2025-01-15 20:02 ` [PATCH v8 6/7] drm/xe/uapi: Add a device query to get EU stall sampling information Harish Chegondi
2025-01-16 22:34   ` Dixit, Ashutosh
2025-01-22  2:48     ` Harish Chegondi
2025-01-22  3:00       ` Dixit, Ashutosh
2025-01-30 17:36   ` Dixit, Ashutosh
2025-01-15 20:02 ` [PATCH v8 7/7] drm/xe/eustall: Add workaround 22016596838 which applies to PVC Harish Chegondi
2025-01-30  5:14   ` Dixit, Ashutosh
2025-01-15 20:46 ` ✓ CI.Patch_applied: success for Add support for EU stall sampling Patchwork
2025-01-15 20:46 ` ✗ CI.checkpatch: warning " Patchwork
2025-01-15 20:48 ` ✓ CI.KUnit: success " Patchwork
2025-01-15 21:14 ` ✓ CI.Build: " Patchwork
2025-01-15 21:16 ` ✗ CI.Hooks: failure " Patchwork
2025-01-15 21:18 ` ✓ CI.checksparse: success " Patchwork
2025-01-15 21:43 ` ✗ Xe.CI.BAT: failure " Patchwork
2025-01-16  0:37 ` ✗ Xe.CI.Full: " Patchwork
2025-01-16  0:51 ` [PATCH v8 0/7] " Degrood, Felix J
2025-01-16 21:50 ` Olson, Matthew
2025-01-18  5:19   ` Harish Chegondi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=857c6cgb94.wl-ashutosh.dixit@intel.com \
    --to=ashutosh.dixit@intel.com \
    --cc=harish.chegondi@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).