Igt-dev Archive on lore.kernel.org
From: "Ghimiray, Himal Prasad" <himal.prasad.ghimiray@intel.com>
To: <igt-dev@lists.freedesktop.org>
Subject: Re: [PATCH i-g-t 3/3] tests/intel/xe_wedged: Introduce test for wedged_mode=2
Date: Thu, 18 Apr 2024 09:25:12 +0530
Message-ID: <975774a1-f75b-4515-9a40-b97f96e39cfd@intel.com>
In-Reply-To: <20240409221908.1077893-3-rodrigo.vivi@intel.com>


On 10-04-2024 03:49, Rodrigo Vivi wrote:
> In this mode, selected via debugfs, the GPU is declared
> wedged on any timeout. So, let's also introduce a command
> that is guaranteed to time out, based on the xe_exec_threads hang.
>
> Then we confirm the GPU is back alive after a rebind.

Patch LGTM.

Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
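
For readers following along: the mode switch in the test goes through
igt_debugfs_write(), which in the end is a plain write of "2" to the
wedged_mode control file. Below is a minimal sketch of that idea in
plain C. write_ctl() is a hypothetical helper, not an IGT function, and
the debugfs location mentioned in the comment is an assumption about a
typical setup rather than something stated in the patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a short string to an existing control file.
 * On a real system the file would presumably be something like
 * /sys/kernel/debug/dri/<minor>/wedged_mode (assumed path, needs root).
 * Returns 0 if the whole value was written, -1 otherwise.
 */
static int write_ctl(const char *path, const char *val)
{
	size_t len;
	ssize_t ret;
	int fd;

	assert(path && val);
	len = strlen(val);

	/* debugfs entries already exist, so no O_CREAT here */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	ret = write(fd, val, len);
	close(fd);

	return (ret == (ssize_t)len) ? 0 : -1;
}
```

A caller would skip the test when the file cannot be opened, which is
what the igt_require(igt_debugfs_exists(...)) check does in the patch.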


>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   tests/intel/xe_wedged.c | 69 +++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 69 insertions(+)
>
> diff --git a/tests/intel/xe_wedged.c b/tests/intel/xe_wedged.c
> index ab9bf23d5..35fc905e7 100644
> --- a/tests/intel/xe_wedged.c
> +++ b/tests/intel/xe_wedged.c
> @@ -162,10 +162,60 @@ simple_exec(int fd, struct drm_xe_engine_class_instance *eci)
>   	xe_vm_destroy(fd, vm);
>   }
>   
> +static void
> +simple_hang(int fd)
> +{
> +	struct drm_xe_engine_class_instance *eci = &xe_engine(fd, 0)->instance;
> +	uint32_t vm;
> +	uint64_t addr = 0x1a0000;
> +	struct drm_xe_exec exec_hang = {
> +		.num_batch_buffer = 1,
> +	};
> +	uint64_t spin_offset;
> +	uint32_t hang_exec_queue;
> +	size_t bo_size;
> +	uint32_t bo = 0;
> +	struct {
> +		struct xe_spin spin;
> +		uint32_t batch[16];
> +		uint64_t pad;
> +		uint32_t data;
> +	} *data;
> +	struct xe_spin_opts spin_opts = { .preempt = false };
> +	int err;
> +
> +	vm = xe_vm_create(fd, 0, 0);
> +	bo_size = xe_bb_size(fd, sizeof(*data));
> +	bo = xe_bo_create(fd, vm, bo_size,
> +			  vram_if_possible(fd, eci->gt_id),
> +			  DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM);
> +	data = xe_bo_map(fd, bo, bo_size);
> +	hang_exec_queue = xe_exec_queue_create(fd, vm, eci, 0);
> +
> +	spin_offset = (char *)&data[0].spin - (char *)data;
> +	spin_opts.addr = addr + spin_offset;
> +	xe_spin_init(&data[0].spin, &spin_opts);
> +	exec_hang.exec_queue_id = hang_exec_queue;
> +	exec_hang.address = spin_opts.addr;
> +
> +	do {
> +		err = igt_ioctl(fd, DRM_IOCTL_XE_EXEC, &exec_hang);
> +	} while (err && errno == ENOMEM);
> +
> +	xe_exec_queue_destroy(fd, hang_exec_queue);
> +	munmap(data, bo_size);
> +	gem_close(fd, bo);
> +	xe_vm_destroy(fd, vm);
> +}
> +
>   /**
>    * SUBTEST: basic-wedged
>    * Description: Force Xe device wedged after injecting a failure in GT reset
>    */
> +/**
> + * SUBTEST: wedged-at-any-timeout
> + * Description: Force Xe device wedged after a simple GuC timeout
> + */
>   igt_main
>   {
>   	struct drm_xe_engine_class_instance *hwe;
> @@ -188,6 +238,25 @@ igt_main
>   			simple_exec(fd, hwe);
>   	}
>   
> +	igt_subtest_f("wedged-at-any-timeout") {
> +		igt_require(igt_debugfs_exists(fd, "wedged_mode", O_RDWR));
> +
> +		igt_debugfs_write(fd, "wedged_mode", "2");
> +		simple_hang(fd);
> +		/*
> +		 * Any ioctl after the first timeout on wedged_mode=2 is blocked
> +		 * so we cannot rely on sync objects. Let's wait a bit for
> +		 * things to settle before we confirm device as wedged and
> +		 * rebind.
> +		 */
> +		sleep(1);
> +		igt_assert_neq(simple_ioctl(fd), 0);
> +		fd = rebind_xe(fd);
> +		igt_assert_eq(simple_ioctl(fd), 0);
> +		xe_for_each_engine(fd, hwe)
> +			simple_exec(fd, hwe);
> +	}
> +
>   	igt_fixture {
>   		if (igt_debugfs_exists(fd, "fail_gt_reset/probability", O_RDWR)) {
>   			igt_debugfs_write(fd, "fail_gt_reset/probability", "0");

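The submission loop in simple_hang() retries the exec ioctl only while
it fails with ENOMEM. A self-contained sketch of that pattern follows.
submit_fn, submit_retry_enomem() and flaky_submit() are hypothetical
names standing in for the igt_ioctl(fd, DRM_IOCTL_XE_EXEC, ...) call;
unlike the patch, the sketch also returns the final error to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback standing in for the exec ioctl. */
typedef int (*submit_fn)(void *ctx);

/*
 * Call submit() until it succeeds or fails with anything other than
 * ENOMEM. Returns 0 on success or a negative errno value on failure.
 */
static int submit_retry_enomem(submit_fn submit, void *ctx)
{
	int err;

	assert(submit != NULL);

	do {
		errno = 0;
		err = submit(ctx);
	} while (err && errno == ENOMEM);

	return err ? -errno : 0;
}

/* Test double: fails with ENOMEM until the counter reaches zero. */
static int flaky_submit(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		errno = ENOMEM;
		return -1;
	}
	return 0;
}
```

With a counter of 3, flaky_submit() fails three times and the fourth
call succeeds, so the loop exits with 0. Any other errno ends the loop
on the first failure.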
Thread overview: 10+ messages
2024-04-09 22:19 [PATCH i-g-t 1/3] tests/intel/xe_wedged: Introduce a new test for Xe device wedged state Rodrigo Vivi
2024-04-09 22:19 ` [PATCH i-g-t 2/3] tests/intel/xe_wedged: Also add a simple exec to confirm GPU health Rodrigo Vivi
2024-04-18 14:35   ` Ghimiray, Himal Prasad
2024-04-09 22:19 ` [PATCH i-g-t 3/3] tests/intel/xe_wedged: Introduce test for wedged_mode=2 Rodrigo Vivi
2024-04-18  3:55   ` Ghimiray, Himal Prasad [this message]
2024-04-09 23:22 ` ✓ Fi.CI.BAT: success for series starting with [i-g-t,1/3] tests/intel/xe_wedged: Introduce a new test for Xe device wedged state Patchwork
2024-04-09 23:22 ` ✓ CI.xeBAT: " Patchwork
2024-04-10  4:17 ` [PATCH i-g-t 1/3] " Ghimiray, Himal Prasad
2024-04-10 20:12 ` ✗ Fi.CI.IGT: failure for series starting with [i-g-t,1/3] " Patchwork
2024-04-18 14:28 ` [PATCH i-g-t 1/3] " Ghimiray, Himal Prasad
