From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Arvind Yadav <arvind.yadav@intel.com>, intel-xe@lists.freedesktop.org
Cc: matthew.brost@intel.com, himal.prasad.ghimiray@intel.com,
pallavi.mishra@intel.com
Subject: Re: [PATCH v5 3/9] drm/xe/madvise: Implement purgeable buffer object support
Date: Tue, 24 Feb 2026 13:21:03 +0100
Message-ID: <932bd347295960286b9a30db776ac3e2e24cdd2d.camel@linux.intel.com>
In-Reply-To: <20260211152644.1661165-4-arvind.yadav@intel.com>

On Wed, 2026-02-11 at 20:56 +0530, Arvind Yadav wrote:
> This allows userspace applications to provide memory usage hints to
> the kernel for better memory management under pressure:
>
> Add the core implementation for purgeable buffer objects, enabling
> memory
> reclamation of user-designated DONTNEED buffers during eviction.
>
> This patch implements the purge operation and state machine
> transitions:
>
> Purgeable States (from xe_madv_purgeable_state):
> - WILLNEED (0): BO should be retained, actively used
> - DONTNEED (1): BO eligible for purging, not currently needed
> - PURGED (2): BO backing store reclaimed, permanently invalid
>
> Design Rationale:
> - Async TLB invalidation via trigger_rebind (no blocking
> xe_vm_invalidate_vma)
> - i915 compatibility: retained field, "once purged always purged"
> semantics
> - Shared BO protection prevents multi-process memory corruption
> - Scratch PTE reuse avoids new infrastructure, safe for fault mode
>
> Note: The madvise_purgeable() function is implemented but not hooked
> into
> the IOCTL handler (madvise_funcs[] entry is NULL) to maintain
> bisectability.
> The feature will be enabled in the final patch when all supporting
> infrastructure (shrinker, per-VMA tracking) is complete.
>
> v2:
> - Use xe_bo_trigger_rebind() for async TLB invalidation (Thomas
> Hellström)
> - Add NULL rebind with scratch PTEs for fault mode (Thomas
> Hellström)
> - Implement i915-compatible retained field logic (Thomas Hellström)
> - Skip BO validation for purged BOs in page fault handler (crash
> fix)
> - Add scratch VM check in page fault path (non-scratch VMs fail
> fault)
> - Force clear_pt for non-scratch VMs to avoid phys addr 0 mapping
> (review fix)
> - Add !is_purged check to resource cursor setup to prevent stale
> access
>
> v3:
> - Rebase as xe_gt_pagefault.c is gone upstream and replaced
> with xe_pagefault.c (Matthew Brost)
> - Xe specific warn on (Matthew Brost)
> - Call helpers for madv_purgeable access(Matthew Brost)
> - Remove bo NULL check(Matthew Brost)
> - Use xe_bo_assert_held instead of dma assert(Matthew Brost)
> - Move the xe_bo_is_purged check under the dma-resv lock( by Matt)
> - Drop is_purged from xe_pt_stage_bind_entry and just set is_null
> to true
> for purged BO rename s/is_null/is_null_or_purged (by Matt)
> - UAPI rule should not be changed.(Matthew Brost)
> - Make 'retained' a userptr (Matthew Brost)
>
> v4:
> - @madv_purgeable atomic_t → u32 change across all relevant patches
> (Matt)
>
> v5:
> - Introduce xe_bo_set_purgeable_state() helper (void return) to
> centralize
> madv_purgeable updates with xe_bo_assert_held() and state
> transition
> validation using explicit enum checks (no transition out of
> PURGED) (Matt)
> - Make xe_ttm_bo_purge() return int and propagate failures from
> xe_bo_move(); handle xe_bo_trigger_rebind() failures (e.g.
> no_wait_gpu
> paths) rather than silently ignoring (Matt)
> - Replace drm_WARN_ON with xe_assert for better Xe-specific
> assertions (Matt)
> - Hook purgeable handling into
> madvise_funcs[DRM_XE_VMA_ATTR_PURGEABLE_STATE]
> instead of special-case path in xe_vm_madvise_ioctl() (Matt)
> - Track purgeable retained return via xe_madvise_details and
> perform
> copy_to_user() from xe_madvise_details_fini() after locks are
> dropped (Matt)
> - Set madvise_funcs[DRM_XE_VMA_ATTR_PURGEABLE_STATE] to NULL with
> __maybe_unused on madvise_purgeable() to maintain bisectability
> until
> shrinker integration is complete in final patch (Matt)
> - Use put_user() instead of copy_to_user() for single u32 retained
> value (Thomas)
> - Return -EFAULT from ioctl if put_user() fails (Thomas)
> - Validate userspace initialized retained to 0 before ioctl,
> ensuring safe
> default (0 = "assume purged") if put_user() fails (Thomas)
> - Refactor error handling: separate fallible put_user from
> infallible cleanup
> - xe_madvise_purgeable_retained_to_user(): separate helper for
> fallible put_user
> - Call put_user() after releasing all locks to avoid circular
> dependencies
> - Use xe_bo_move_notify() instead of xe_bo_trigger_rebind() in
> xe_ttm_bo_purge()
> for proper abstraction - handles vunmap, dma-buf notifications,
> and VRAM
> userfault cleanup (Thomas)
> - Fix LRU crash while running shrink test
> - Skip xe_bo_validate() for purged BOs in xe_gpuvm_validate()
>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
> ---
> drivers/gpu/drm/xe/xe_bo.c | 106 ++++++++++++++++++++---
> drivers/gpu/drm/xe/xe_bo.h | 2 +
> drivers/gpu/drm/xe/xe_pagefault.c | 12 +++
> drivers/gpu/drm/xe/xe_pt.c | 40 +++++++--
> drivers/gpu/drm/xe/xe_vm.c | 20 ++++-
> drivers/gpu/drm/xe/xe_vm_madvise.c | 133
> +++++++++++++++++++++++++++++
> 6 files changed, 292 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index 8bf16d60b9a5..87cde4b2fe59 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -835,6 +835,83 @@ static int xe_bo_move_notify(struct xe_bo *bo,
> return 0;
> }
>
> +/**
> + * xe_bo_set_purgeable_state() - Set BO purgeable state with
> validation
> + * @bo: Buffer object
> + * @new_state: New purgeable state
> + *
> + * Sets the purgeable state with lockdep assertions and validates
> state
> + * transitions. Once a BO is PURGED, it cannot transition to any
> other state.
> + * Invalid transitions are caught with xe_assert().
> + */
> +void xe_bo_set_purgeable_state(struct xe_bo *bo,
> + enum xe_madv_purgeable_state
> new_state)
> +{
> + struct xe_device *xe = xe_bo_device(bo);
> +
> + xe_bo_assert_held(bo);
> +
> + /* Validate state is one of the known values */
> + xe_assert(xe, new_state == XE_MADV_PURGEABLE_WILLNEED ||
> + new_state == XE_MADV_PURGEABLE_DONTNEED ||
> + new_state == XE_MADV_PURGEABLE_PURGED);
> +
> + /* Once purged, always purged - cannot transition out */
> + xe_assert(xe, !(bo->madv_purgeable ==
> XE_MADV_PURGEABLE_PURGED &&
> + new_state != XE_MADV_PURGEABLE_PURGED));
> +
> + bo->madv_purgeable = new_state;
> +}
> +
> +/**
> + * xe_ttm_bo_purge() - Purge buffer object backing store
> + * @ttm_bo: The TTM buffer object to purge
> + * @ctx: TTM operation context
> + *
> + * This function purges the backing store of a BO marked as DONTNEED
> and
> + * triggers rebind to invalidate stale GPU mappings. For fault-mode
> VMs,
> + * this zaps the PTEs. The next GPU access will trigger a page fault
> and
> + * perform NULL rebind (scratch pages or clear PTEs based on VM
> config).
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +static int xe_ttm_bo_purge(struct ttm_buffer_object *ttm_bo, struct
> ttm_operation_ctx *ctx)
> +{
> + struct xe_bo *bo = ttm_to_xe_bo(ttm_bo);
> + struct ttm_placement place = {};
> + int ret;
> +
> + xe_bo_assert_held(bo);
> +
> + if (!ttm_bo->ttm)
> + return 0;
> +
> + if (!xe_bo_madv_is_dontneed(bo))
> + return 0;
> +
> + ret = ttm_bo_validate(ttm_bo, &place, ctx);
> + if (ret)
> + return ret;
> +
> + /*
> + * Use the standard pre-move hook so we share the same
> cleanup/invalidate
> + * path as migrations: drop any CPU vmap and schedule the
> necessary GPU
> + * unbind/rebind work.
> + *
> + * This may fail in no-wait contexts (fault/shrinker) or if
> the BO is
> + * pinned. Keep state unchanged on failure so we don't end
> up "PURGED"
> + * with stale mappings.
> + */
> + ret = xe_bo_move_notify(bo, ctx);
> + if (ret)
> + return ret;

move_notify() must be called *before* the pages are actually freed, that is,
before ttm_bo_validate().

Other than that LGTM.
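
The ordering point above can be illustrated with a minimal userspace sketch.
This is not driver code and makes no claim about the final patch. Every
mock_* name is hypothetical: mock_move_notify() stands in for
xe_bo_move_notify(), and mock_validate_empty() stands in for
ttm_bo_validate() with an empty placement. The sketch assumes only what the
review states, namely that the notify step tears down mappings and may fail,
and that the validate step is what actually frees the pages.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; not the real struct xe_bo. */
struct mock_bo {
	bool pages_present;	/* backing store is still allocated */
	bool mappings_live;	/* CPU/GPU mappings still point at the pages */
	bool notify_fails;	/* simulate a failure in the notify step */
	bool stale_access;	/* set when pages are freed under live mappings */
};

/* Stand-in for xe_bo_move_notify(): tears down mappings, may fail. */
static int mock_move_notify(struct mock_bo *bo)
{
	if (bo->notify_fails)
		return -EBUSY;
	bo->mappings_live = false;
	return 0;
}

/* Stand-in for ttm_bo_validate() with an empty placement: frees the pages. */
static int mock_validate_empty(struct mock_bo *bo)
{
	if (bo->mappings_live)
		bo->stale_access = true;	/* mappings now reference freed pages */
	bo->pages_present = false;
	return 0;
}

/* Ordering requested in the review: notify first, free pages second. */
static int mock_purge(struct mock_bo *bo)
{
	int ret;

	assert(bo != NULL);

	ret = mock_move_notify(bo);
	if (ret)
		return ret;	/* nothing has been freed, state is unchanged */

	return mock_validate_empty(bo);
}

/* Ordering in the quoted hunk: free pages first, notify second. */
static int mock_purge_free_first(struct mock_bo *bo)
{
	int ret;

	assert(bo != NULL);

	ret = mock_validate_empty(bo);
	if (ret)
		return ret;

	return mock_move_notify(bo);	/* too late: pages are already gone */
}
```

In this model, a failure in the notify step leaves the pages intact only when
notify runs first. With the free-first ordering the pages are gone before the
mappings are torn down, and a notify failure can no longer be unwound.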