From: Matthew Brost <matthew.brost@intel.com>
To: Oak Zeng <oak.zeng@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
<Thomas.Hellstrom@linux.intel.com>, <jonathan.cavitt@intel.com>
Subject: Re: [PATCH v7 3/3] drm/xe: Allow scratch page under fault mode for certain platform
Date: Thu, 6 Mar 2025 12:16:07 -0800
Message-ID: <Z8oCh9kRdckHCrvy@lstrano-desk.jf.intel.com>
In-Reply-To: <20250228153058.1039188-4-oak.zeng@intel.com>
On Fri, Feb 28, 2025 at 10:30:58AM -0500, Oak Zeng wrote:
> Normally a scratch page is not allowed when a VM operates in page
> fault mode, i.e., in the existing code, DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE
> and DRM_XE_VM_CREATE_FLAG_FAULT_MODE are mutually exclusive. The reason
> is that fault mode relies on recoverable page faults to work, while a
> scratch page can mute recoverable page faults.
>
> On xe2 and xe3, an out-of-bounds prefetch can cause a page fault and then
> a system hang, because xekmd can't resolve such a page fault. The SYCL and
> OCL language runtimes require out-of-bounds prefetches to be silently
> dropped without causing any functional problem, so the existing behavior
> doesn't meet the language runtime requirements.
>
> At the same time, HW prefetching can cause page fault interrupts. Due to
> the page fault interrupt overhead (i.e., the GuC and KMD must be involved
> to fix the page fault), HW prefetching can be slowed by many orders of
> magnitude.
>
> Fix those problems by allowing a scratch page under fault mode on xe2 and
> xe3. With a scratch page in place, HW prefetching can always hit the
> scratch page instead of causing an interrupt.
>
> A side effect is that the scratch page can hide application programming
> errors. Application out-of-bounds accesses are hidden by the scratch page
> mapping instead of being reported to the user.
>
> v2: Refine commit message (Thomas)
>
> v3: Move the scratch page flag check to after scratch page wa (Thomas)
>
> v4: Drop NEEDS_SCRATCH macro (Matt)
> Add a comment to DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE
>
> Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/xe/xe_vm.c | 3 ++-
> include/uapi/drm/xe_drm.h | 6 +++++-
> 2 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 47051735f0e1..2356f12392a2 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -1791,7 +1791,8 @@ int xe_vm_create_ioctl(struct drm_device *dev, void *data,
> return -EINVAL;
>
> if (XE_IOCTL_DBG(xe, args->flags & DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE &&
> - args->flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE))
> + args->flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE &&
> + !xe->info.needs_scratch))
> return -EINVAL;
>
> if (XE_IOCTL_DBG(xe, !(args->flags & DRM_XE_VM_CREATE_FLAG_LR_MODE) &&
> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> index 76a462fae05f..7471eaa669bc 100644
> --- a/include/uapi/drm/xe_drm.h
> +++ b/include/uapi/drm/xe_drm.h
> @@ -911,7 +911,11 @@ struct drm_xe_gem_mmap_offset {
> * struct drm_xe_vm_create - Input of &DRM_IOCTL_XE_VM_CREATE
> *
> * The @flags can be:
> - * - %DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE
> + * - %DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE - Map the whole virtual address
> + * space of the VM to scratch page. A vm_bind would overwrite the scratch
> + * page mapping. This flag is mutually exclusive with the
> + * %DRM_XE_VM_CREATE_FLAG_FAULT_MODE flag, except on xe2 and xe3
> + * platforms.
> * - %DRM_XE_VM_CREATE_FLAG_LR_MODE - An LR, or Long Running VM accepts
> * exec submissions to its exec_queues that don't have an upper time
> * limit on the job execution time. But exec submissions to these
> --
> 2.26.3
>
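For reference, the check after this change can be sketched in isolation
as below. This is only a minimal sketch under stated assumptions: the
SKETCH_* flag bits and sketch_check_scratch_fault() are hypothetical
stand-ins rather than the real uAPI values or a driver function, and the
needs_scratch parameter models xe->info.needs_scratch from patch 1/3.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag bits: illustrative values only, not copied from xe_drm.h. */
#define SKETCH_FLAG_SCRATCH_PAGE (1u << 0)
#define SKETCH_FLAG_FAULT_MODE   (1u << 2)

/*
 * Hypothetical helper modelling the modified condition in
 * xe_vm_create_ioctl(): scratch page plus fault mode is rejected
 * unless the platform sets needs_scratch.
 */
static int sketch_check_scratch_fault(uint32_t flags, bool needs_scratch)
{
	bool scratch = (flags & SKETCH_FLAG_SCRATCH_PAGE) != 0;
	bool fault = (flags & SKETCH_FLAG_FAULT_MODE) != 0;

	if (scratch && fault && !needs_scratch)
		return -EINVAL;

	return 0;
}
```

With both flags set, the helper returns -EINVAL when needs_scratch is
false and 0 when it is true. Either flag on its own passes this
particular check regardless of platform.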
Thread overview: 14+ messages
2025-02-28 15:30 [PATCH v7 0/3] Allow scratch page under fault mode for certain platform Oak Zeng
2025-02-28 15:30 ` [PATCH v7 1/3] drm/xe: Introduced needs_scratch bit in device descriptor Oak Zeng
2025-02-28 15:30 ` [PATCH v7 2/3] drm/xe: Clear scratch page on vm_bind Oak Zeng
2025-02-28 18:44 ` Matthew Brost
2025-02-28 15:30 ` [PATCH v7 3/3] drm/xe: Allow scratch page under fault mode for certain platform Oak Zeng
2025-03-06 20:16 ` Matthew Brost [this message]
2025-03-04 6:00 ` ✓ CI.Patch_applied: success for Allow scratch page under fault mode for certain platform (rev3) Patchwork
2025-03-04 6:01 ` ✗ CI.checkpatch: warning " Patchwork
2025-03-04 6:02 ` ✓ CI.KUnit: success " Patchwork
2025-03-04 6:19 ` ✓ CI.Build: " Patchwork
2025-03-04 6:21 ` ✓ CI.Hooks: " Patchwork
2025-03-04 6:22 ` ✓ CI.checksparse: " Patchwork
2025-03-04 6:40 ` ✓ Xe.CI.BAT: " Patchwork
2025-03-04 7:41 ` ✗ Xe.CI.Full: failure " Patchwork