From: "Yadav, Arvind" <arvind.yadav@intel.com>
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
intel-xe@lists.freedesktop.org
Cc: <matthew.brost@intel.com>, <himal.prasad.ghimiray@intel.com>,
<pallavi.mishra@intel.com>
Subject: Re: [PATCH v5 3/9] drm/xe/madvise: Implement purgeable buffer object support
Date: Tue, 24 Feb 2026 20:26:59 +0530
Message-ID: <c46707ff-ba40-450c-8b37-42cfb491ece0@intel.com>
In-Reply-To: <932bd347295960286b9a30db776ac3e2e24cdd2d.camel@linux.intel.com>
On 24-02-2026 17:51, Thomas Hellström wrote:
> On Wed, 2026-02-11 at 20:56 +0530, Arvind Yadav wrote:
>> This allows userspace applications to provide memory usage hints to
>> the kernel for better memory management under pressure:
>>
>> Add the core implementation for purgeable buffer objects, enabling
>> memory
>> reclamation of user-designated DONTNEED buffers during eviction.
>>
>> This patch implements the purge operation and state machine
>> transitions:
>>
>> Purgeable States (from xe_madv_purgeable_state):
>> - WILLNEED (0): BO should be retained, actively used
>> - DONTNEED (1): BO eligible for purging, not currently needed
>> - PURGED (2): BO backing store reclaimed, permanently invalid
>>
>> Design Rationale:
>> - Async TLB invalidation via trigger_rebind (no blocking
>> xe_vm_invalidate_vma)
>> - i915 compatibility: retained field, "once purged always purged"
>> semantics
>> - Shared BO protection prevents multi-process memory corruption
>> - Scratch PTE reuse avoids new infrastructure, safe for fault mode
>>
>> Note: The madvise_purgeable() function is implemented but not hooked
>> into
>> the IOCTL handler (madvise_funcs[] entry is NULL) to maintain
>> bisectability.
>> The feature will be enabled in the final patch when all supporting
>> infrastructure (shrinker, per-VMA tracking) is complete.
>>
>> v2:
>> - Use xe_bo_trigger_rebind() for async TLB invalidation (Thomas
>> Hellström)
>> - Add NULL rebind with scratch PTEs for fault mode (Thomas
>> Hellström)
>> - Implement i915-compatible retained field logic (Thomas Hellström)
>> - Skip BO validation for purged BOs in page fault handler (crash
>> fix)
>> - Add scratch VM check in page fault path (non-scratch VMs fail
>> fault)
>> - Force clear_pt for non-scratch VMs to avoid phys addr 0 mapping
>> (review fix)
>> - Add !is_purged check to resource cursor setup to prevent stale
>> access
>>
>> v3:
>> - Rebase as xe_gt_pagefault.c is gone upstream and replaced
>> with xe_pagefault.c (Matthew Brost)
>> - Xe specific warn on (Matthew Brost)
>> - Call helpers for madv_purgeable access(Matthew Brost)
>> - Remove bo NULL check(Matthew Brost)
>> - Use xe_bo_assert_held instead of dma assert(Matthew Brost)
>> - Move the xe_bo_is_purged check under the dma-resv lock( by Matt)
>> - Drop is_purged from xe_pt_stage_bind_entry and just set is_null
>> to true
>> for purged BO rename s/is_null/is_null_or_purged (by Matt)
>> - UAPI rule should not be changed.(Matthew Brost)
>> - Make 'retained' a userptr (Matthew Brost)
>>
>> v4:
>> - @madv_purgeable atomic_t → u32 change across all relevant patches
>> (Matt)
>>
>> v5:
>> - Introduce xe_bo_set_purgeable_state() helper (void return) to
>> centralize
>> madv_purgeable updates with xe_bo_assert_held() and state
>> transition
>> validation using explicit enum checks (no transition out of
>> PURGED) (Matt)
>> - Make xe_ttm_bo_purge() return int and propagate failures from
>> xe_bo_move(); handle xe_bo_trigger_rebind() failures (e.g.
>> no_wait_gpu
>> paths) rather than silently ignoring (Matt)
>> - Replace drm_WARN_ON with xe_assert for better Xe-specific
>> assertions (Matt)
>> - Hook purgeable handling into
>> madvise_funcs[DRM_XE_VMA_ATTR_PURGEABLE_STATE]
>> instead of special-case path in xe_vm_madvise_ioctl() (Matt)
>> - Track purgeable retained return via xe_madvise_details and
>> perform
>> copy_to_user() from xe_madvise_details_fini() after locks are
>> dropped (Matt)
>> - Set madvise_funcs[DRM_XE_VMA_ATTR_PURGEABLE_STATE] to NULL with
>> __maybe_unused on madvise_purgeable() to maintain bisectability
>> until
>> shrinker integration is complete in final patch (Matt)
>> - Use put_user() instead of copy_to_user() for single u32 retained
>> value (Thomas)
>> - Return -EFAULT from ioctl if put_user() fails (Thomas)
>> - Validate userspace initialized retained to 0 before ioctl,
>> ensuring safe
>> default (0 = "assume purged") if put_user() fails (Thomas)
>> - Refactor error handling: separate fallible put_user from
>> infallible cleanup
>> - xe_madvise_purgeable_retained_to_user(): separate helper for
>> fallible put_user
>> - Call put_user() after releasing all locks to avoid circular
>> dependencies
>> - Use xe_bo_move_notify() instead of xe_bo_trigger_rebind() in
>> xe_ttm_bo_purge()
>> for proper abstraction - handles vunmap, dma-buf notifications,
>> and VRAM
>> userfault cleanup (Thomas)
>> - Fix LRU crash while running shrink test
>> - Skip xe_bo_validate() for purged BOs in xe_gpuvm_validate()
>>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
>> Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_bo.c | 106 ++++++++++++++++++++---
>> drivers/gpu/drm/xe/xe_bo.h | 2 +
>> drivers/gpu/drm/xe/xe_pagefault.c | 12 +++
>> drivers/gpu/drm/xe/xe_pt.c | 40 +++++++--
>> drivers/gpu/drm/xe/xe_vm.c | 20 ++++-
>> drivers/gpu/drm/xe/xe_vm_madvise.c | 133
>> +++++++++++++++++++++++++++++
>> 6 files changed, 292 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
>> index 8bf16d60b9a5..87cde4b2fe59 100644
>> --- a/drivers/gpu/drm/xe/xe_bo.c
>> +++ b/drivers/gpu/drm/xe/xe_bo.c
>> @@ -835,6 +835,83 @@ static int xe_bo_move_notify(struct xe_bo *bo,
>> return 0;
>> }
>>
>> +/**
>> + * xe_bo_set_purgeable_state() - Set BO purgeable state with
>> validation
>> + * @bo: Buffer object
>> + * @new_state: New purgeable state
>> + *
>> + * Sets the purgeable state with lockdep assertions and validates
>> state
>> + * transitions. Once a BO is PURGED, it cannot transition to any
>> other state.
>> + * Invalid transitions are caught with xe_assert().
>> + */
>> +void xe_bo_set_purgeable_state(struct xe_bo *bo,
>> + enum xe_madv_purgeable_state
>> new_state)
>> +{
>> + struct xe_device *xe = xe_bo_device(bo);
>> +
>> + xe_bo_assert_held(bo);
>> +
>> + /* Validate state is one of the known values */
>> + xe_assert(xe, new_state == XE_MADV_PURGEABLE_WILLNEED ||
>> + new_state == XE_MADV_PURGEABLE_DONTNEED ||
>> + new_state == XE_MADV_PURGEABLE_PURGED);
>> +
>> + /* Once purged, always purged - cannot transition out */
>> + xe_assert(xe, !(bo->madv_purgeable ==
>> XE_MADV_PURGEABLE_PURGED &&
>> + new_state != XE_MADV_PURGEABLE_PURGED));
>> +
>> + bo->madv_purgeable = new_state;
>> +}
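
As a standalone illustration of the transition rule in the
xe_bo_set_purgeable_state() hunk quoted above, here is a minimal userspace
sketch. The names madv_state, madv_transition_ok() and madv_set_state() are
hypothetical and not part of the patch. The kernel helper asserts with
xe_assert() and returns void; this sketch returns a bool instead so the rule
can be exercised outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace mirror of the three states in the commit message. */
enum madv_state {
	MADV_WILLNEED = 0,	/* BO should be retained */
	MADV_DONTNEED = 1,	/* BO eligible for purging */
	MADV_PURGED = 2,	/* backing store reclaimed, terminal state */
};

/* Returns true if moving from cur to next is allowed. */
static bool madv_transition_ok(enum madv_state cur, enum madv_state next)
{
	/* Reject values outside the known set. */
	if (next != MADV_WILLNEED && next != MADV_DONTNEED &&
	    next != MADV_PURGED)
		return false;

	/* Once purged, always purged: no transition out of PURGED. */
	if (cur == MADV_PURGED && next != MADV_PURGED)
		return false;

	return true;
}

/* Applies the transition if valid; leaves *state untouched otherwise. */
static bool madv_set_state(enum madv_state *state, enum madv_state next)
{
	if (!madv_transition_ok(*state, next))
		return false;

	*state = next;
	return true;
}
```

WILLNEED and DONTNEED can be toggled freely in this model; only PURGED is
sticky, which matches the "once purged always purged" semantics described in
the design rationale.
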
>> +
>> +/**
>> + * xe_ttm_bo_purge() - Purge buffer object backing store
>> + * @ttm_bo: The TTM buffer object to purge
>> + * @ctx: TTM operation context
>> + *
>> + * This function purges the backing store of a BO marked as DONTNEED
>> and
>> + * triggers rebind to invalidate stale GPU mappings. For fault-mode
>> VMs,
>> + * this zaps the PTEs. The next GPU access will trigger a page fault
>> and
>> + * perform NULL rebind (scratch pages or clear PTEs based on VM
>> config).
>> + *
>> + * Return: 0 on success, negative error code on failure
>> + */
>> +static int xe_ttm_bo_purge(struct ttm_buffer_object *ttm_bo, struct
>> ttm_operation_ctx *ctx)
>> +{
>> + struct xe_bo *bo = ttm_to_xe_bo(ttm_bo);
>> + struct ttm_placement place = {};
>> + int ret;
>> +
>> + xe_bo_assert_held(bo);
>> +
>> + if (!ttm_bo->ttm)
>> + return 0;
>> +
>> + if (!xe_bo_madv_is_dontneed(bo))
>> + return 0;
>> +
>> + ret = ttm_bo_validate(ttm_bo, &place, ctx);
>> + if (ret)
>> + return ret;
>> +
>> + /*
>> + * Use the standard pre-move hook so we share the same
>> cleanup/invalidate
>> + * path as migrations: drop any CPU vmap and schedule the
>> necessary GPU
>> + * unbind/rebind work.
>> + *
>> + * This may fail in no-wait contexts (fault/shrinker) or if
>> the BO is
>> + * pinned. Keep state unchanged on failure so we don't end
>> up "PURGED"
>> + * with stale mappings.
>> + */
>> + ret = xe_bo_move_notify(bo, ctx);
>> + if (ret)
>> + return ret;
> move_notify() must be called *before* pages are actually freed, that is
> before ttm_bo_validate().
Noted, I will move this before ttm_bo_validate().
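
A minimal userspace sketch of the ordering requested in the review, under
simplified assumptions: struct fake_bo, fake_move_notify(),
fake_release_pages() and fake_purge() are hypothetical stand-ins that only
model the sequence "notify first, then free". They are not the Xe or TTM API,
and the actual change in the next revision may look different.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a buffer object, for illustration only. */
struct fake_bo {
	bool has_pages;		/* backing store still present */
	bool mappings_live;	/* CPU/GPU mappings still reference the pages */
	bool dontneed;		/* userspace marked the BO DONTNEED */
	bool purged;		/* terminal PURGED state reached */
};

/* Stand-in for the pre-move hook: tears down mappings, may fail. */
static int fake_move_notify(struct fake_bo *bo, bool fail)
{
	if (fail)
		return -EBUSY;	/* e.g. a no-wait context or a pinned BO */

	bo->mappings_live = false;
	return 0;
}

/* Stand-in for the step that actually frees the backing pages. */
static int fake_release_pages(struct fake_bo *bo)
{
	/* Pages must not be freed while mappings still point at them. */
	assert(!bo->mappings_live);
	bo->has_pages = false;
	return 0;
}

static int fake_purge(struct fake_bo *bo, bool notify_fails)
{
	int ret;

	if (!bo->has_pages || !bo->dontneed)
		return 0;

	/* Step 1: notify and invalidate while the pages still exist. */
	ret = fake_move_notify(bo, notify_fails);
	if (ret)
		return ret;	/* nothing freed, state unchanged */

	/* Step 2: only now drop the backing store. */
	ret = fake_release_pages(bo);
	if (ret)
		return ret;

	bo->purged = true;
	return 0;
}
```

With this ordering, a failing notify step returns early with the pages and
mappings both intact, so the object never ends up purged with stale mappings.
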
thanks,
Arvind
>
> Other than that LGTM.
>