From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Oak Zeng <oak.zeng@intel.com>, intel-xe@lists.freedesktop.org
Cc: matthew.brost@intel.com, jonathan.cavitt@intel.com
Subject: Re: [PATCH 3/3] drm/xe: Allow scratch page under fault mode for certain platform
Date: Wed, 05 Feb 2025 14:14:15 +0100 [thread overview]
Message-ID: <68c890a3baff21199f6aca82dccdd024f56de199.camel@linux.intel.com> (raw)
In-Reply-To: <20250204184558.4181478-3-oak.zeng@intel.com>
On Tue, 2025-02-04 at 13:45 -0500, Oak Zeng wrote:
> Normally scratch page is not allowed when a vm is operate under page
> fault mode, i.e., in the existing codes,
> DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE
> and DRM_XE_VM_CREATE_FLAG_FAULT_MODE are mutual exclusive. The reason
> is fault mode relies on recoverable page to work, while scratch page
> can mute recoverable page fault.
>
> On xe2 and xe3, out of bound prefetch can cause page fault and
> further
> system hang because xekmd can't resolve such page fault. SYCL and OCL
> language runtime requires out of bound prefetch to be silently
> dropped
> without causing any functional problem, thus the existing behavior
> doesn't meet language runtime requirement.
>
> At the same time, HW prefetching can cause page fault interrupt. Due
> to
> page fault interrupt overhead (i.e., need Guc and KMD involved to fix
> the page fault), HW prefetching can be slowed by many orders of
> magnitude.
>
> Fix those problems by allowing scratch page under fault mode for xe2
> and
> xe3. With scratch page in place, HW prefetching could always hit
> scratch
> page instead of causing interrupt.
>
> A side effect is, scratch page could hide application program error.
> Application out of bound accesses are hided
s/hided/hidden/
> by scratch page mapping,
> instead of get reported to user.
>
> igt test: https://patchwork.freedesktop.org/series/144334/. Test
> result on
> BMG:
>
> root@DUT1130BMGFRD:/home/szeng/dii-tools/igt-public/build/tests#
> ./xe_exec_fault_mode --run-subtest scratch-fault
> IGT-Version: 1.30-gde1a3cb42 (x86_64) (Linux: 6.13.0-xe x86_64)
> Using IGT_SRANDOM=1738684805 for randomisation
> Opened device: /dev/dri/card0
> Starting subtest: scratch-fault
> Subtest scratch-fault: SUCCESS (0.080s)
>
> Without this series, the test result is:
>
> root@DUT1130BMGFRD:/home/szeng/dii-tools/igt-public/build/tests#
> ./xe_exec_fault_mode --run-subtest scratch-fault
> IGT-Version: 1.30-gde1a3cb42 (x86_64) (Linux: 6.13.0-xe x86_64)
> Using IGT_SRANDOM=1738686046 for randomisation
> Opened device: /dev/dri/card0
> Starting subtest: scratch-fault
> (xe_exec_fault_mode:5047) CRITICAL: Test assertion failure function
> test_exec, file ../tests/intel/xe_exec_fault_mode.c:349:
> (xe_exec_fault_mode:5047) CRITICAL: Failed assertion:
> __xe_wait_ufence(fd, &exec_sync[i], 0xdeadbeefdeadbeefull,
> exec_queues[i % n_exec_queues], &timeout) == 0
> (xe_exec_fault_mode:5047) CRITICAL: Last errno: 62, Timer expired
> (xe_exec_fault_mode:5047) CRITICAL: error: -62 != 0
> Stack trace:
> #0 ../lib/igt_core.c:2266 __igt_fail_assert()
> #1 ../tests/intel/xe_exec_fault_mode.c:346 test_exec()
> #2 ../tests/intel/xe_exec_fault_mode.c:537
> __igt_unique____real_main407()
> #3 ../tests/intel/xe_exec_fault_mode.c:407 main()
> #4 ../sysdeps/nptl/libc_start_call_main.h:74
> __libc_start_call_main()
> #5 ../csu/libc-start.c:128 __libc_start_main@@GLIBC_2.34()
> #6 [_start+0x2e]
> Subtest scratch-fault failed.
>
> v2: Refine commit message (Thomas)
>
> Signed-off-by: Oak Zeng <oak.zeng@intel.com>
> ---
> drivers/gpu/drm/xe/xe_vm.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 813d893d9b63..c0372f083d42 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -1752,6 +1752,11 @@ int xe_vm_create_ioctl(struct drm_device *dev,
> void *data,
> if (XE_IOCTL_DBG(xe, args->extensions))
> return -EINVAL;
>
> + if (XE_IOCTL_DBG(xe, args->flags &
> DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE &&
> + args->flags &
> DRM_XE_VM_CREATE_FLAG_FAULT_MODE &&
> + !(NEEDS_SCRATCH(xe))))
> + return -EINVAL;
> +
We should probably move this test to where the old test were below,
since the WA below enables scratch pages.
/Thomas
> if (XE_WA(xe_root_mmio_gt(xe), 14016763929))
> args->flags |= DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE;
>
> @@ -1765,10 +1770,6 @@ int xe_vm_create_ioctl(struct drm_device *dev,
> void *data,
> if (XE_IOCTL_DBG(xe, args->flags &
> ~ALL_DRM_XE_VM_CREATE_FLAGS))
> return -EINVAL;
>
> - if (XE_IOCTL_DBG(xe, args->flags &
> DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE &&
> - args->flags &
> DRM_XE_VM_CREATE_FLAG_FAULT_MODE))
> - return -EINVAL;
> -
> if (XE_IOCTL_DBG(xe, !(args->flags &
> DRM_XE_VM_CREATE_FLAG_LR_MODE) &&
> args->flags &
> DRM_XE_VM_CREATE_FLAG_FAULT_MODE))
> return -EINVAL;
next prev parent reply other threads:[~2025-02-05 13:14 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-04 18:45 [PATCH 1/3] drm/xe: Introduced needs_scratch bit in device descriptor Oak Zeng
2025-02-04 18:45 ` [PATCH 2/3] drm/xe: Clear scratch page before vm_bind Oak Zeng
2025-02-05 14:35 ` Thomas Hellström
2025-02-06 1:52 ` Zeng, Oak
2025-02-06 12:51 ` Thomas Hellström
2025-02-06 18:56 ` Zeng, Oak
2025-02-06 20:11 ` Thomas Hellström
2025-02-06 21:05 ` Zeng, Oak
2025-02-06 10:34 ` Matthew Brost
2025-02-06 10:43 ` Thomas Hellström
2025-02-10 17:08 ` Matthew Brost
2025-02-06 15:16 ` Zeng, Oak
2025-02-10 17:09 ` Matthew Brost
2025-02-04 18:45 ` [PATCH 3/3] drm/xe: Allow scratch page under fault mode for certain platform Oak Zeng
2025-02-05 13:14 ` Thomas Hellström [this message]
2025-02-06 1:54 ` Zeng, Oak
2025-02-06 9:29 ` Thomas Hellström
2025-02-06 15:14 ` Zeng, Oak
2025-02-25 22:05 ` Matthew Brost
2025-02-04 23:27 ` ✓ CI.Patch_applied: success for series starting with [1/3] drm/xe: Introduced needs_scratch bit in device descriptor Patchwork
2025-02-04 23:27 ` ✗ CI.checkpatch: warning " Patchwork
2025-02-04 23:28 ` ✓ CI.KUnit: success " Patchwork
2025-02-04 23:45 ` ✓ CI.Build: " Patchwork
2025-02-04 23:47 ` ✓ CI.Hooks: " Patchwork
2025-02-04 23:48 ` ✓ CI.checksparse: " Patchwork
2025-02-05 6:05 ` ✓ Xe.CI.BAT: " Patchwork
2025-02-05 13:16 ` [PATCH 1/3] " Thomas Hellström
2025-02-25 21:37 ` Matthew Brost
-- strict thread matches above, loose matches on Subject: below --
2025-02-13 2:23 Oak Zeng
2025-02-13 2:23 ` [PATCH 3/3] drm/xe: Allow scratch page under fault mode for certain platform Oak Zeng
2025-02-25 22:10 ` Matthew Brost
2025-02-26 22:12 ` Zeng, Oak
2025-02-27 0:22 ` Matthew Brost
2025-02-06 21:38 [PATCH 1/3] drm/xe: Introduced needs_scratch bit in device descriptor Oak Zeng
2025-02-06 21:38 ` [PATCH 3/3] drm/xe: Allow scratch page under fault mode for certain platform Oak Zeng
2025-02-06 2:11 [PATCH 1/3] drm/xe: Introduced needs_scratch bit in device descriptor Oak Zeng
2025-02-06 2:11 ` [PATCH 3/3] drm/xe: Allow scratch page under fault mode for certain platform Oak Zeng
2025-01-28 22:21 [PATCH 1/3] drm/xe: Add a function to zap page table by address range Oak Zeng
2025-01-28 22:21 ` [PATCH 3/3] drm/xe: Allow scratch page under fault mode for certain platform Oak Zeng
2025-01-28 23:05 ` Cavitt, Jonathan
2025-01-29 8:52 ` Thomas Hellström
2025-01-29 16:41 ` Matthew Brost
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=68c890a3baff21199f6aca82dccdd024f56de199.camel@linux.intel.com \
--to=thomas.hellstrom@linux.intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=jonathan.cavitt@intel.com \
--cc=matthew.brost@intel.com \
--cc=oak.zeng@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox