From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Arvind Yadav <arvind.yadav@intel.com>, intel-xe@lists.freedesktop.org
Cc: matthew.brost@intel.com, himal.prasad.ghimiray@intel.com,
pallavi.mishra@intel.com
Subject: Re: [PATCH v5 3/9] drm/xe/madvise: Implement purgeable buffer object support
Date: Tue, 24 Feb 2026 13:21:03 +0100 [thread overview]
Message-ID: <932bd347295960286b9a30db776ac3e2e24cdd2d.camel@linux.intel.com> (raw)
In-Reply-To: <20260211152644.1661165-4-arvind.yadav@intel.com>
On Wed, 2026-02-11 at 20:56 +0530, Arvind Yadav wrote:
> This allows userspace applications to provide memory usage hints to
> the kernel for better memory management under pressure:
>
> Add the core implementation for purgeable buffer objects, enabling
> memory
> reclamation of user-designated DONTNEED buffers during eviction.
>
> This patch implements the purge operation and state machine
> transitions:
>
> Purgeable States (from xe_madv_purgeable_state):
> - WILLNEED (0): BO should be retained, actively used
> - DONTNEED (1): BO eligible for purging, not currently needed
> - PURGED (2): BO backing store reclaimed, permanently invalid
>
> Design Rationale:
> - Async TLB invalidation via trigger_rebind (no blocking
> xe_vm_invalidate_vma)
> - i915 compatibility: retained field, "once purged always purged"
> semantics
> - Shared BO protection prevents multi-process memory corruption
> - Scratch PTE reuse avoids new infrastructure, safe for fault mode
>
> Note: The madvise_purgeable() function is implemented but not hooked
> into
> the IOCTL handler (madvise_funcs[] entry is NULL) to maintain
> bisectability.
> The feature will be enabled in the final patch when all supporting
> infrastructure (shrinker, per-VMA tracking) is complete.
>
> v2:
> - Use xe_bo_trigger_rebind() for async TLB invalidation (Thomas
> Hellström)
> - Add NULL rebind with scratch PTEs for fault mode (Thomas
> Hellström)
> - Implement i915-compatible retained field logic (Thomas Hellström)
> - Skip BO validation for purged BOs in page fault handler (crash
> fix)
> - Add scratch VM check in page fault path (non-scratch VMs fail
> fault)
> - Force clear_pt for non-scratch VMs to avoid phys addr 0 mapping
> (review fix)
> - Add !is_purged check to resource cursor setup to prevent stale
> access
>
> v3:
> - Rebase as xe_gt_pagefault.c is gone upstream and replaced
> with xe_pagefault.c (Matthew Brost)
> - Xe specific warn on (Matthew Brost)
> - Call helpers for madv_purgeable access(Matthew Brost)
> - Remove bo NULL check(Matthew Brost)
> - Use xe_bo_assert_held instead of dma assert(Matthew Brost)
> - Move the xe_bo_is_purged check under the dma-resv lock( by Matt)
> - Drop is_purged from xe_pt_stage_bind_entry and just set is_null
> to true
> for purged BO rename s/is_null/is_null_or_purged (by Matt)
> - UAPI rule should not be changed.(Matthew Brost)
> - Make 'retained' a userptr (Matthew Brost)
>
> v4:
> - @madv_purgeable atomic_t → u32 change across all relevant patches
> (Matt)
>
> v5:
> - Introduce xe_bo_set_purgeable_state() helper (void return) to
> centralize
> madv_purgeable updates with xe_bo_assert_held() and state
> transition
> validation using explicit enum checks (no transition out of
> PURGED) (Matt)
> - Make xe_ttm_bo_purge() return int and propagate failures from
> xe_bo_move(); handle xe_bo_trigger_rebind() failures (e.g.
> no_wait_gpu
> paths) rather than silently ignoring (Matt)
> - Replace drm_WARN_ON with xe_assert for better Xe-specific
> assertions (Matt)
> - Hook purgeable handling into
> madvise_funcs[DRM_XE_VMA_ATTR_PURGEABLE_STATE]
> instead of special-case path in xe_vm_madvise_ioctl() (Matt)
> - Track purgeable retained return via xe_madvise_details and
> perform
> copy_to_user() from xe_madvise_details_fini() after locks are
> dropped (Matt)
> - Set madvise_funcs[DRM_XE_VMA_ATTR_PURGEABLE_STATE] to NULL with
> __maybe_unused on madvise_purgeable() to maintain bisectability
> until
> shrinker integration is complete in final patch (Matt)
> - Use put_user() instead of copy_to_user() for single u32 retained
> value (Thomas)
> - Return -EFAULT from ioctl if put_user() fails (Thomas)
> - Validate userspace initialized retained to 0 before ioctl,
> ensuring safe
> default (0 = "assume purged") if put_user() fails (Thomas)
> - Refactor error handling: separate fallible put_user from
> infallible cleanup
> - xe_madvise_purgeable_retained_to_user(): separate helper for
> fallible put_user
> - Call put_user() after releasing all locks to avoid circular
> dependencies
> - Use xe_bo_move_notify() instead of xe_bo_trigger_rebind() in
> xe_ttm_bo_purge()
> for proper abstraction - handles vunmap, dma-buf notifications,
> and VRAM
> userfault cleanup (Thomas)
> - Fix LRU crash while running shrink test
> - Skip xe_bo_validate() for purged BOs in xe_gpuvm_validate()
>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
> ---
> drivers/gpu/drm/xe/xe_bo.c | 106 ++++++++++++++++++++---
> drivers/gpu/drm/xe/xe_bo.h | 2 +
> drivers/gpu/drm/xe/xe_pagefault.c | 12 +++
> drivers/gpu/drm/xe/xe_pt.c | 40 +++++++--
> drivers/gpu/drm/xe/xe_vm.c | 20 ++++-
> drivers/gpu/drm/xe/xe_vm_madvise.c | 133
> +++++++++++++++++++++++++++++
> 6 files changed, 292 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index 8bf16d60b9a5..87cde4b2fe59 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -835,6 +835,83 @@ static int xe_bo_move_notify(struct xe_bo *bo,
> return 0;
> }
>
> +/**
> + * xe_bo_set_purgeable_state() - Set BO purgeable state with
> validation
> + * @bo: Buffer object
> + * @new_state: New purgeable state
> + *
> + * Sets the purgeable state with lockdep assertions and validates
> state
> + * transitions. Once a BO is PURGED, it cannot transition to any
> other state.
> + * Invalid transitions are caught with xe_assert().
> + */
> +void xe_bo_set_purgeable_state(struct xe_bo *bo,
> + enum xe_madv_purgeable_state
> new_state)
> +{
> + struct xe_device *xe = xe_bo_device(bo);
> +
> + xe_bo_assert_held(bo);
> +
> + /* Validate state is one of the known values */
> + xe_assert(xe, new_state == XE_MADV_PURGEABLE_WILLNEED ||
> + new_state == XE_MADV_PURGEABLE_DONTNEED ||
> + new_state == XE_MADV_PURGEABLE_PURGED);
> +
> + /* Once purged, always purged - cannot transition out */
> + xe_assert(xe, !(bo->madv_purgeable ==
> XE_MADV_PURGEABLE_PURGED &&
> + new_state != XE_MADV_PURGEABLE_PURGED));
> +
> + bo->madv_purgeable = new_state;
> +}
> +
> +/**
> + * xe_ttm_bo_purge() - Purge buffer object backing store
> + * @ttm_bo: The TTM buffer object to purge
> + * @ctx: TTM operation context
> + *
> + * This function purges the backing store of a BO marked as DONTNEED
> and
> + * triggers rebind to invalidate stale GPU mappings. For fault-mode
> VMs,
> + * this zaps the PTEs. The next GPU access will trigger a page fault
> and
> + * perform NULL rebind (scratch pages or clear PTEs based on VM
> config).
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +static int xe_ttm_bo_purge(struct ttm_buffer_object *ttm_bo, struct
> ttm_operation_ctx *ctx)
> +{
> + struct xe_bo *bo = ttm_to_xe_bo(ttm_bo);
> + struct ttm_placement place = {};
> + int ret;
> +
> + xe_bo_assert_held(bo);
> +
> + if (!ttm_bo->ttm)
> + return 0;
> +
> + if (!xe_bo_madv_is_dontneed(bo))
> + return 0;
> +
> + ret = ttm_bo_validate(ttm_bo, &place, ctx);
> + if (ret)
> + return ret;
> +
> + /*
> + * Use the standard pre-move hook so we share the same
> cleanup/invalidate
> + * path as migrations: drop any CPU vmap and schedule the
> necessary GPU
> + * unbind/rebind work.
> + *
> + * This may fail in no-wait contexts (fault/shrinker) or if
> the BO is
> + * pinned. Keep state unchanged on failure so we don't end
> up "PURGED"
> + * with stale mappings.
> + */
> + ret = xe_bo_move_notify(bo, ctx);
> + if (ret)
> + return ret;
move_notify() must be called *before* pages are actually freed, that is
before ttm_bo_validate().
Other than that LGTM.
next prev parent reply other threads:[~2026-02-24 12:21 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-11 15:26 [PATCH v5 0/9] drm/xe/madvise: Add support for purgeable buffer objects Arvind Yadav
2026-02-11 15:26 ` [PATCH v5 1/9] drm/xe/uapi: Add UAPI " Arvind Yadav
2026-02-24 10:50 ` Thomas Hellström
2026-02-26 17:58 ` Souza, Jose
2026-02-27 9:32 ` Yadav, Arvind
2026-02-11 15:26 ` [PATCH v5 2/9] drm/xe/bo: Add purgeable bo state tracking and field madv to xe_bo Arvind Yadav
2026-02-11 16:00 ` Matthew Brost
2026-02-11 15:26 ` [PATCH v5 3/9] drm/xe/madvise: Implement purgeable buffer object support Arvind Yadav
2026-02-24 12:21 ` Thomas Hellström [this message]
2026-02-24 14:56 ` Yadav, Arvind
2026-02-11 15:26 ` [PATCH v5 4/9] drm/xe/bo: Handle CPU faults on purged buffer objects Arvind Yadav
2026-02-11 15:26 ` [PATCH v5 5/9] drm/xe/vm: Prevent binding of " Arvind Yadav
2026-02-11 16:17 ` Matthew Brost
2026-02-11 15:26 ` [PATCH v5 6/9] drm/xe/madvise: Implement per-VMA purgeable state tracking Arvind Yadav
2026-02-24 12:48 ` Thomas Hellström
2026-02-24 15:07 ` Yadav, Arvind
2026-02-24 16:36 ` Matthew Brost
2026-02-25 5:35 ` Yadav, Arvind
2026-02-25 8:21 ` Thomas Hellström
2026-02-25 9:04 ` Matthew Brost
2026-02-25 9:18 ` Thomas Hellström
2026-02-25 9:40 ` Yadav, Arvind
2026-02-25 18:32 ` Matthew Brost
2026-02-11 15:26 ` [PATCH v5 7/9] drm/xe/madvise: Block imported and exported dma-bufs Arvind Yadav
2026-02-24 14:15 ` Thomas Hellström
2026-02-11 15:26 ` [PATCH v5 8/9] drm/xe/bo: Add purgeable shrinker state helpers Arvind Yadav
2026-02-24 14:21 ` Thomas Hellström
2026-02-24 15:09 ` Yadav, Arvind
2026-02-11 15:26 ` [PATCH v5 9/9] drm/xe/madvise: Enable purgeable buffer object IOCTL support Arvind Yadav
2026-02-11 15:40 ` Matthew Brost
2026-02-11 15:46 ` [PATCH v5 0/9] drm/xe/madvise: Add support for purgeable buffer objects Matthew Brost
2026-02-25 10:10 ` Yadav, Arvind
2026-02-11 16:21 ` ✗ CI.checkpatch: warning for drm/xe/madvise: Add support for purgeable buffer objects (rev6) Patchwork
2026-02-11 16:22 ` ✓ CI.KUnit: success " Patchwork
2026-02-11 17:11 ` ✗ Xe.CI.BAT: failure " Patchwork
2026-02-13 1:15 ` ✗ Xe.CI.FULL: " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=932bd347295960286b9a30db776ac3e2e24cdd2d.camel@linux.intel.com \
--to=thomas.hellstrom@linux.intel.com \
--cc=arvind.yadav@intel.com \
--cc=himal.prasad.ghimiray@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=matthew.brost@intel.com \
--cc=pallavi.mishra@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox