From: "Anirban, Sk" <sk.anirban@intel.com>
To: Riana Tauro <riana.tauro@intel.com>, <igt-dev@lists.freedesktop.org>
Cc: <anshuman.gupta@intel.com>, <badal.nilawar@intel.com>,
<rodrigo.vivi@intel.com>
Subject: Re: [i-g-t] tests/intel/xe_wedged: Add new test csc-wedged
Date: Mon, 14 Jul 2025 10:41:43 +0530
Message-ID: <ebc6a1cd-9418-4a82-9239-61ba2eec025a@intel.com>
In-Reply-To: <f054aafd-37fc-42ac-a343-d29ba8a05b6a@intel.com>
Hi,
On 10-07-2025 12:53, Riana Tauro wrote:
> Hi Anirban
>
> On 7/8/2025 2:53 PM, Sk Anirban wrote:
>> Inject a CSC error through uevent to cause the Xe device to enter a wedged
>
> Add details about survivability mode. What is the expectation of the test?
>
> Add a link to kernel patches
Sure, I will add this.
>
>> state. To return the device to a normal state, reload the driver, as
>> the wedged state can only be resolved by rebinding/reprobing the driver.
>>
>> Signed-off-by: Sk Anirban <sk.anirban@intel.com>
>> ---
>> tests/intel/xe_wedged.c | 85 +++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 85 insertions(+)
>>
>> diff --git a/tests/intel/xe_wedged.c b/tests/intel/xe_wedged.c
>> index 7fc7ca9eb..b29e9bcb5 100644
>> --- a/tests/intel/xe_wedged.c
>> +++ b/tests/intel/xe_wedged.c
>> @@ -14,6 +14,7 @@
>> #include <limits.h>
>> #include <dirent.h>
>> +#include <libudev.h>
>> #include "igt.h"
>> #include "igt_device.h"
>> @@ -46,6 +47,46 @@ static void force_wedged(int fd)
>> sleep(1);
>> }
>> +static void force_wedged_csc_error(int fd)
>> +{
>> + igt_debugfs_write(fd, "inject_csc_hw_error/probability", "100");
>> + igt_debugfs_write(fd, "inject_csc_hw_error/times", "1");
>> +
>> + xe_force_gt_reset_sync(fd, 0);
>> + sleep(1);
>> +}
>> +
>> +static char bus_addr[NAME_MAX];
>> +
>> +static int check_survivability_mode(int fd)
>> +{
>> + struct pci_device *pci_dev;
>> + char path[PATH_MAX];
>> + int dirfd;
>> +
>> + pci_dev = igt_device_get_pci_device(fd);
>> + snprintf(bus_addr, sizeof(bus_addr), "%04x:%02x:%02x.%01x",
>> + pci_dev->domain, pci_dev->bus, pci_dev->dev, pci_dev->func);
>> + snprintf(path, PATH_MAX, "/sys/bus/pci/devices/%s/survivability_mode", bus_addr);
>> + dirfd = open(path, O_RDONLY);
>> +
>> + return dirfd;
>> +}
>> +
>> +static void intercept_udev_events(struct udev_device *device)
>> +{
>> + const char *dev_path = udev_device_get_property_value(device, "DEVPATH");
>> + const char *wedged = udev_device_get_property_value(device, "WEDGED");
>> +
>> + igt_assert_f(wedged && !strcmp(wedged, "vendor-specific"),
>> + "Expected WEDGED property to be 'vendor-specific', got '%s'",
>> + wedged);
>> +
>> + igt_assert_f(dev_path && strstr(dev_path, bus_addr),
>> + "Expected bus address '%s' to be part of DEVPATH '%s'",
>> + bus_addr, dev_path);
>> +}
>> +
>> static int simple_ioctl(int fd)
>> {
>> int ret;
>> @@ -208,6 +249,11 @@ simple_hang(int fd, struct drm_xe_sync *sync)
>> * SUBTEST: basic-wedged-read
>> * Description: Read wedged_mode debugfs
>> */
>> +/**
>> + * SUBTEST: csc-wedged
>> + * Description: Force Xe device wedged after injecting a failure in CSC
>> + */
>> +
>> igt_main
>> {
>> struct drm_xe_engine_class_instance *hwe;
>> @@ -300,12 +346,51 @@ igt_main
>> igt_assert_f(str[0] != '\0', "Failed to read wedged_mode from debugfs!\n");
>> }
>> + igt_subtest("csc-wedged") {
>> + struct udev *udev = udev_new();
>> + struct udev_monitor *monitor;
>
> We can use this instead
>
> struct udev_monitor *mon = igt_watch_uevents();
Sure, I will check this.
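
Whichever monitor ends up being used, a "kernel" netlink monitor without a filter delivers every kernel uevent, so the first device returned is not guaranteed to be the wedged event. A minimal sketch of one way to handle that: keep the property check as a plain predicate so a receive loop can skip unrelated events and assert only on a match. The name is_csc_wedged_event() is hypothetical and not part of the patch or of IGT; the two conditions are the ones from intercept_udev_events().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical predicate mirroring the two asserts in
 * intercept_udev_events(): the WEDGED property must equal
 * "vendor-specific" and DEVPATH must contain the PCI bus address.
 * A NULL input (property missing on this uevent) never matches.
 */
static bool is_csc_wedged_event(const char *wedged, const char *devpath,
				const char *bus_addr)
{
	if (!wedged || !devpath || !bus_addr)
		return false;

	return !strcmp(wedged, "vendor-specific") &&
	       strstr(devpath, bus_addr) != NULL;
}
```

The strings would still come from udev_device_get_property_value(), as in the patch; only the decision is moved out of the assert.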
>
>
>> + struct udev_device *device;
>> +
>> + igt_require(igt_debugfs_exists(fd, "inject_csc_hw_error/probability",
>> + O_RDWR));
>> +
>> + igt_assert_f(check_survivability_mode(fd) < 0,
>> + "survivability_mode sysfs available");
>
> why?
Just to check that the node is not available before the CSC error wedges the device.
>> +
>> + igt_debugfs_write(fd, "inject_csc_hw_error/verbose", "1");
>> + igt_assert_eq(simple_ioctl(fd), 0);
> Is this required?
As discussed offline, I will remove this.
>> + ignore_wedged_in_dmesg();
>
> Ignoring the interrupt message and runtime survivability also might be
> needed. Can check once kernel patches are merged
Sure, I’ll align this with the kernel patches once they are merged.
>> +
>> + monitor = udev_monitor_new_from_netlink(udev, "kernel");
>> + udev_monitor_enable_receiving(monitor);
>> +
>> + force_wedged_csc_error(fd);
>> +
>> + device = udev_monitor_receive_device(monitor);
>
>
>> + intercept_udev_events(device);
>> +
>> + igt_assert_f(check_survivability_mode(fd) >= 0,
>> + "survivability_mode sysfs not available");
>
> you can add both of these in a single function
> (check_runtime_survivability_mode)
Sure, I will modify this.
>
> Thanks
> Riana
Thanks,
Anirban
>> +
>> + drm_close_driver(fd);
>> + igt_kmod_rebind("xe", pci_slot);
>> + fd = drm_open_driver(DRIVER_XE);
>> + igt_assert_eq(simple_ioctl(fd), 0);
>> + xe_for_each_engine(fd, hwe)
>> + simple_exec(fd, hwe);
>> + }
>> +
>> igt_fixture {
>> if (igt_debugfs_exists(fd, "fail_gt_reset/probability", O_RDWR)) {
>> igt_debugfs_write(fd, "fail_gt_reset/probability", "0");
>> igt_debugfs_write(fd, "fail_gt_reset/times", "1");
>> }
>> + if (igt_debugfs_exists(fd, "inject_csc_hw_error/probability", O_RDWR)) {
>> + igt_debugfs_write(fd, "inject_csc_hw_error/probability", "0");
>> + igt_debugfs_write(fd, "inject_csc_hw_error/times", "1");
>> + }
>> +
>> /* Tests might have failed, force a rebind before exiting */
>> drm_close_driver(fd);
>> igt_kmod_rebind("xe", pci_slot);
>
Thread overview: 7+ messages
2025-07-08 9:23 [i-g-t] tests/intel/xe_wedged: Add new test csc-wedged Sk Anirban
2025-07-08 10:16 ` ✓ Xe.CI.BAT: success for tests/intel/xe_wedged: Add new test csc-wedged (rev2) Patchwork
2025-07-08 10:23 ` ✓ i915.CI.BAT: " Patchwork
2025-07-08 11:41 ` ✓ Xe.CI.Full: " Patchwork
2025-07-08 13:04 ` ✗ i915.CI.Full: failure " Patchwork
2025-07-10 7:23 ` [i-g-t] tests/intel/xe_wedged: Add new test csc-wedged Riana Tauro
2025-07-14 5:11 ` Anirban, Sk [this message]