All of lore.kernel.org
 help / color / mirror / Atom feed
From: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
To: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>,
	<intel-xe@lists.freedesktop.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Subject: Re: [Intel-xe] [PATCH v2 1/1] drm/xe: Add a debugfs for faking gt reset failure.
Date: Wed, 24 May 2023 21:57:00 +0530	[thread overview]
Message-ID: <ZG461NbVPLb7utFZ@bvivekan-mobl> (raw)
In-Reply-To: <20230524070641.1601255-1-himal.prasad.ghimiray@intel.com>

On 24.05.2023 12:36, Himal Prasad Ghimiray wrote:
> In case of gt reset failure, KMD notifies userspace about failure
> via uevent. To validate this notification we need to ensure gt
> reset fails and there is no mechanism to cause failure from hardware.
> Hence added a debugfs which will cause fake reset failure.
> 
> v1(Rodrigo)
> - Change the variable to fake_reset_failure_in_progress.
> - Drop usage of READ_ONCE and WRITE_ONCE.
> - Follow consistency for variable assignment. Either use
>   functions for all the assignments or don't use for any.
> 
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_gt.c         | 15 ++++++++++++++-
>  drivers/gpu/drm/xe/xe_gt_debugfs.c | 11 +++++++++++
>  drivers/gpu/drm/xe/xe_gt_types.h   |  1 +
>  3 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> index 80d42c7c7cfa..b3ac8b3ab455 100644
> --- a/drivers/gpu/drm/xe/xe_gt.c
> +++ b/drivers/gpu/drm/xe/xe_gt.c
> @@ -506,6 +506,9 @@ int xe_gt_init(struct xe_gt *gt)
>  	int err;
>  	int i;
>  
> +	/*Fake reset failure should be disabled by default*/
> +	gt->reset.fake_reset_failure_in_progress = false;
> +
>  	INIT_WORK(&gt->reset.worker, gt_reset_worker);
>  
>  	for (i = 0; i < XE_ENGINE_CLASS_MAX; ++i) {
> @@ -601,8 +604,18 @@ static int gt_reset(struct xe_gt *gt)
>  	xe_gt_info(gt, "reset started\n");
>  
>  	xe_gt_sanitize(gt);
> -
>  	xe_device_mem_access_get(gt_to_xe(gt));
> +
> +	err = gt->reset.fake_reset_failure_in_progress;
> +	if (err) {

This doesn't look the right way. It should be something like

```
if (gt->reset.fake_reset_failure_in_progress) {
	
	err = -ECANCELED; or whatever appropriate
	...
```

> +		xe_gt_info(gt, "Fake GT reset failure is in progress\n");
> +
> +		/* Disable fake reset failure for next call */
> +		gt->reset.fake_reset_failure_in_progress = false;
> +
> +		goto err_msg;
> +	}
> +
>  	err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
>  	if (err)
>  		goto err_msg;
> diff --git a/drivers/gpu/drm/xe/xe_gt_debugfs.c b/drivers/gpu/drm/xe/xe_gt_debugfs.c
> index 8bf441e850a0..13d2a974bc00 100644
> --- a/drivers/gpu/drm/xe/xe_gt_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_gt_debugfs.c
> @@ -127,6 +127,16 @@ static int register_save_restore(struct seq_file *m, void *data)
>  	return 0;
>  }
>  
> +static int fake_reset_failure(struct seq_file *m, void *data)
> +{
> +	struct xe_gt *gt = node_to_gt(m->private);
> +
> +	gt->reset.fake_reset_failure_in_progress = true;
> +	xe_gt_reset_async(gt);
> +
> +	return 0;
> +}
> +
>  static const struct drm_info_list debugfs_list[] = {
>  	{"hw_engines", hw_engines, 0},
>  	{"force_reset", force_reset, 0},
> @@ -135,6 +145,7 @@ static const struct drm_info_list debugfs_list[] = {
>  	{"steering", steering, 0},
>  	{"ggtt", ggtt, 0},
>  	{"register-save-restore", register_save_restore, 0},
> +	{"fake_reset_failure", fake_reset_failure, 0},
>  };
>  
>  void xe_gt_debugfs_register(struct xe_gt *gt)
> diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
> index 7c47d67aa8be..746d65c4aacc 100644
> --- a/drivers/gpu/drm/xe/xe_gt_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_types.h
> @@ -168,6 +168,7 @@ struct xe_gt {
>  
>  	/** @reset: state for GT resets */
>  	struct {
> +		bool fake_reset_failure_in_progress;
>  		/**
>  		 * @worker: work so GT resets can done async allowing to reset
>  		 * code to safely flush all code paths

@Rodrigo
This feature is purely to support test the driver and not something which
helps debugging. I am concerned whether such code should be part of
production build. I am thinking if this feature or any such patches in
future could be made part of a kernel config, which can be enabled in CI
when the driver is subjected to testing.

Regards,
Bala

> -- 
> 2.25.1
> 

  parent reply	other threads:[~2023-05-24 16:27 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-24  7:06 [Intel-xe] [PATCH v2 1/1] drm/xe: Add a debugfs for faking gt reset failure Himal Prasad Ghimiray
2023-05-24  7:04 ` [Intel-xe] ✓ CI.Patch_applied: success for series starting with [v2,1/1] " Patchwork
2023-05-24  7:06 ` [Intel-xe] ✓ CI.KUnit: " Patchwork
2023-05-24  7:10 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-05-24  7:40 ` [Intel-xe] ○ CI.BAT: info " Patchwork
2023-05-24 13:23 ` [Intel-xe] [PATCH v2 1/1] " Rodrigo Vivi
2023-05-29  4:02   ` Ghimiray, Himal Prasad
2023-05-24 16:27 ` Balasubramani Vivekanandan [this message]
2023-05-24 16:43   ` Rodrigo Vivi
2023-05-24 16:50     ` Vivi, Rodrigo
2023-05-25  6:06       ` Mauro Carvalho Chehab
2023-06-06  4:03         ` Ghimiray, Himal Prasad
2023-06-06  8:23           ` Mauro Carvalho Chehab
2023-05-29  4:00   ` Ghimiray, Himal Prasad
2023-05-24 16:52 ` [Intel-xe] ✗ CI.Patch_applied: failure for series starting with [v2,1/1] drm/xe: Add a debugfs for faking gt reset failure. (rev2) Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2023-05-29 12:33 [Intel-xe] [PATCH v2 1/1] drm/xe: Add a debugfs for faking gt reset failure Himal Prasad Ghimiray
2023-06-08 14:54 ` Francois Dugast
2023-06-15  8:58 Himal Prasad Ghimiray
2023-06-15  9:00 ` Ghimiray, Himal Prasad
2023-06-15  9:02 Himal Prasad Ghimiray
2023-06-15  9:03 ` Ghimiray, Himal Prasad

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZG461NbVPLb7utFZ@bvivekan-mobl \
    --to=balasubramani.vivekanandan@intel.com \
    --cc=himal.prasad.ghimiray@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=rodrigo.vivi@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.