From: "Anirban, Sk" <sk.anirban@intel.com>
To: Riana Tauro <riana.tauro@intel.com>, <igt-dev@lists.freedesktop.org>
Cc: <anshuman.gupta@intel.com>, <badal.nilawar@intel.com>,
<rodrigo.vivi@intel.com>
Subject: Re: [i-g-t] tests/intel/xe_wedged: Add new test csc-wedged
Date: Mon, 14 Jul 2025 10:41:43 +0530
Message-ID: <ebc6a1cd-9418-4a82-9239-61ba2eec025a@intel.com>
In-Reply-To: <f054aafd-37fc-42ac-a343-d29ba8a05b6a@intel.com>
Hi,
On 10-07-2025 12:53, Riana Tauro wrote:
> Hi Anirban
>
> On 7/8/2025 2:53 PM, Sk Anirban wrote:
>> Inject a CSC error through uevent to cause the Xe device to enter a
>> wedged
>
> Add details about survivability mode. What is the expectation of the test
>
> Add a link to kernel patches
Sure, I will add this.
>
>> state. To return the device to a normal state, reload the driver, as
>> the wedged state can only be resolved by rebinding/reprobing the driver.
>>
>> Signed-off-by: Sk Anirban <sk.anirban@intel.com>
>> ---
>> tests/intel/xe_wedged.c | 85 +++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 85 insertions(+)
>>
>> diff --git a/tests/intel/xe_wedged.c b/tests/intel/xe_wedged.c
>> index 7fc7ca9eb..b29e9bcb5 100644
>> --- a/tests/intel/xe_wedged.c
>> +++ b/tests/intel/xe_wedged.c
>> @@ -14,6 +14,7 @@
>> #include <limits.h>
>> #include <dirent.h>
>> +#include <libudev.h>
>> #include "igt.h"
>> #include "igt_device.h"
>> @@ -46,6 +47,46 @@ static void force_wedged(int fd)
>> sleep(1);
>> }
>> +static void force_wedged_csc_error(int fd)
>> +{
>> + igt_debugfs_write(fd, "inject_csc_hw_error/probability", "100");
>> + igt_debugfs_write(fd, "inject_csc_hw_error/times", "1");
>> +
>> + xe_force_gt_reset_sync(fd, 0);
>> + sleep(1);
>> +}
>> +
>> +static char bus_addr[NAME_MAX];
>> +
>> +static int check_survivability_mode(int fd)
>> +{
>> + struct pci_device *pci_dev;
>> + char path[PATH_MAX];
>> + int dirfd;
>> +
>> + pci_dev = igt_device_get_pci_device(fd);
>> + snprintf(bus_addr, sizeof(bus_addr), "%04x:%02x:%02x.%01x",
>> + pci_dev->domain, pci_dev->bus, pci_dev->dev, pci_dev->func);
>> + snprintf(path, PATH_MAX,
>> "/sys/bus/pci/devices/%s/survivability_mode", bus_addr);
>> + dirfd = open(path, O_RDONLY);
>> +
>> + return dirfd;
>> +}
>> +
>> +static void intercept_udev_events(struct udev_device *device)
>> +{
>> + const char *dev_path = udev_device_get_property_value(device,
>> "DEVPATH");
>> + const char *wedged = udev_device_get_property_value(device,
>> "WEDGED");
>> +
>> + igt_assert_f(wedged && !strcmp(wedged, "vendor-specific"),
>> + "Expected WEDGED property to be 'vendor-specific', got
>> '%s'",
>> + wedged);
>> +
>> + igt_assert_f(dev_path && strstr(dev_path, bus_addr),
>> + "Expected bus address '%s' to be part of DEVPATH '%s'",
>> + bus_addr, dev_path);
>> +}
>> +
>> static int simple_ioctl(int fd)
>> {
>> int ret;
>> @@ -208,6 +249,11 @@ simple_hang(int fd, struct drm_xe_sync *sync)
>> * SUBTEST: basic-wedged-read
>> * Description: Read wedged_mode debugfs
>> */
>> +/**
>> + * SUBTEST: csc-wedged
>> + * Description: Force Xe device wedged after injecting a failure in CSC
>> + */
>> +
>> igt_main
>> {
>> struct drm_xe_engine_class_instance *hwe;
>> @@ -300,12 +346,51 @@ igt_main
>> igt_assert_f(str[0] != '\0', "Failed to read wedged_mode
>> from debugfs!\n");
>> }
>> + igt_subtest("csc-wedged") {
>> + struct udev *udev = udev_new();
>> + struct udev_monitor *monitor;
>
> We can use this instead
>
> struct udev_monitor *mon = igt_watch_uevents();
Sure, I will check this.
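Whichever monitor is used, it may deliver events other than the one the test is waiting for, so I would probably keep the property check as a plain predicate and call it on each received event. A minimal sketch of that check, assuming the same WEDGED and DEVPATH properties as in the patch; is_csc_wedged_uevent is only a hypothetical name and is not part of IGT:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical predicate (not an IGT API): returns true when the uevent
 * properties look like the wedged event expected after a CSC error.
 * wedged and dev_path are the values of the WEDGED and DEVPATH properties,
 * bus_addr is the PCI address of the device under test.
 */
static bool is_csc_wedged_uevent(const char *wedged, const char *dev_path,
				 const char *bus_addr)
{
	/* A missing property means this is not the event we are looking for. */
	if (!wedged || !dev_path || !bus_addr)
		return false;

	return !strcmp(wedged, "vendor-specific") &&
	       strstr(dev_path, bus_addr) != NULL;
}
```

The test could then assert once on the result instead of asserting on each property separately.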
>
>
>> + struct udev_device *device;
>> +
>> + igt_require(igt_debugfs_exists(fd,
>> "inject_csc_hw_error/probability",
>> + O_RDWR));
>> +
>> + igt_assert_f(check_survivability_mode(fd) < 0,
>> + "survivability_mode sysfs available");
>
> why?
Just to check that the node is not present before the CSC error wedges the device.
>> +
>> + igt_debugfs_write(fd, "inject_csc_hw_error/verbose", "1");
>> + igt_assert_eq(simple_ioctl(fd), 0);
> Is this required?
As discussed offline, I will remove this.
>> + ignore_wedged_in_dmesg();
>
> Ignoring the interrupt message and runtime survivability also might be
> needed. Can check once kernel patches are merged
Sure, I'll align this with the kernel patches once they are merged.
>> +
>> + monitor = udev_monitor_new_from_netlink(udev, "kernel");
>> + udev_monitor_enable_receiving(monitor);
>> +
>> + force_wedged_csc_error(fd);
>> +
>> + device = udev_monitor_receive_device(monitor);
>
>
>> + intercept_udev_events(device);
>> +
>> + igt_assert_f(check_survivability_mode(fd) >= 0,
>> + "survivability_mode sysfs not available");
>
> you can add both of this in a single function
> (check_runtime_survivability_mode)
Sure, I will modify this.
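A rough sketch of what the combined check could look like, written without the IGT helpers so it can be tried standalone. The names format_bus_addr and survivability_mode_present are hypothetical, and the sysfs_root parameter exists only so the sketch can be pointed at a scratch directory; the patch itself uses /sys/bus/pci/devices. It uses access() rather than open(), so no file descriptor is left open after the check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: format a PCI bus address such as "0000:03:00.0". */
static int format_bus_addr(char *buf, size_t len, unsigned int domain,
			   unsigned int bus, unsigned int dev,
			   unsigned int func)
{
	assert(buf && len > 0);
	return snprintf(buf, len, "%04x:%02x:%02x.%01x",
			domain, bus, dev, func);
}

/*
 * Hypothetical helper: true if the survivability_mode attribute exists
 * for the device at bus_addr under sysfs_root.
 */
static bool survivability_mode_present(const char *sysfs_root,
				       const char *bus_addr)
{
	char path[4096];
	int n;

	n = snprintf(path, sizeof(path), "%s/%s/survivability_mode",
		     sysfs_root, bus_addr);
	/* Treat a truncated path as "not present" rather than probing it. */
	if (n < 0 || (size_t)n >= sizeof(path))
		return false;

	/* access() only tests for existence; nothing needs to be closed. */
	return access(path, F_OK) == 0;
}
```

The subtest would then expect the helper to return false before the error is injected and true afterwards.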
>
> Thanks
> Riana
Thanks,
Anirban
>> +
>> + drm_close_driver(fd);
>> + igt_kmod_rebind("xe", pci_slot);
>> + fd = drm_open_driver(DRIVER_XE);
>> + igt_assert_eq(simple_ioctl(fd), 0);
>> + xe_for_each_engine(fd, hwe)
>> + simple_exec(fd, hwe);
>> + }
>> +
>> igt_fixture {
>> if (igt_debugfs_exists(fd, "fail_gt_reset/probability",
>> O_RDWR)) {
>> igt_debugfs_write(fd, "fail_gt_reset/probability", "0");
>> igt_debugfs_write(fd, "fail_gt_reset/times", "1");
>> }
>> + if (igt_debugfs_exists(fd,
>> "inject_csc_hw_error/probability", O_RDWR)) {
>> + igt_debugfs_write(fd, "inject_csc_hw_error/probability",
>> "0");
>> + igt_debugfs_write(fd, "inject_csc_hw_error/times", "1");
>> + }
>> +
>> /* Tests might have failed, force a rebind before exiting */
>> drm_close_driver(fd);
>> igt_kmod_rebind("xe", pci_slot);
>