From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Arvind Yadav <arvind.yadav@intel.com>, intel-xe@lists.freedesktop.org
Cc: matthew.brost@intel.com, himal.prasad.ghimiray@intel.com
Subject: Re: [RFC PATCH 3/9] drm/xe/madvise: Implement purgeable buffer object support
Date: Wed, 29 Oct 2025 11:51:53 +0100
Message-ID: <93be3dde56c24ce83c28a2dfe3bffeaf9a47b25d.camel@linux.intel.com>
In-Reply-To: <f976fdad404f4e8704ad2974dfed06f80ba31e6e.camel@linux.intel.com>
On Wed, 2025-10-29 at 09:55 +0100, Thomas Hellström wrote:
> On Tue, 2025-10-28 at 17:54 +0530, Arvind Yadav wrote:
> > This allows userspace applications to provide memory usage hints to
> > the kernel for better memory management under pressure:
> >
> > - WILLNEED: BO will be needed again, re-validate if purged
> > - DONTNEED: BO not currently needed, may be purged if needed
> >
> > When userspace marks BO as DONTNEED, the kernel can reclaim
> > their memory during memory pressure. BO transition to PURGED
> > state when reclaimed, and attempting to access purged buffers
> > triggers appropriate fault handling.
> >
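For anyone following the thread, the state transitions described above can be modelled in plain userspace C roughly as follows. This is only an illustrative sketch, not driver code: the `model_*` names and the `purge_state` enum are hypothetical stand-ins for `bo->madv_purgeable` and the `XE_MADV_PURGEABLE_*` values, and "re-validation" is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the XE_MADV_PURGEABLE_* states. */
enum purge_state { PURGE_WILLNEED, PURGE_DONTNEED, PURGE_PURGED };

/* Toy buffer object: advisory state plus "does it have backing pages". */
struct model_bo {
	enum purge_state state;
	bool has_pages;
};

/*
 * Userspace hint. Returns 0 on success, -1 for an invalid request.
 * PURGED can only be reached through eviction, never requested directly.
 */
static int model_madvise(struct model_bo *bo, enum purge_state advice)
{
	assert(bo != NULL);

	switch (advice) {
	case PURGE_WILLNEED:
		/* A purged BO gets fresh backing store; old contents are gone. */
		if (bo->state == PURGE_PURGED)
			bo->has_pages = true;
		bo->state = PURGE_WILLNEED;
		return 0;
	case PURGE_DONTNEED:
		/* Once purged, a BO stays PURGED until WILLNEED is requested. */
		if (bo->state != PURGE_PURGED)
			bo->state = PURGE_DONTNEED;
		return 0;
	default:
		return -1;
	}
}

/* Eviction under memory pressure. Returns true if the BO was purged. */
static bool model_evict(struct model_bo *bo)
{
	assert(bo != NULL);

	if (bo->state != PURGE_DONTNEED)
		return false;	/* a normal move/swap would happen instead */

	bo->has_pages = false;
	bo->state = PURGE_PURGED;
	return true;
}
```

The point of the model is that only a DONTNEED BO is ever discarded on eviction, and that PURGED is sticky until userspace explicitly asks for WILLNEED again.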
> > Cc: Matthew Brost <matthew.brost@intel.com>
> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> > Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_bo.c | 75 +++++++++++++++++++++++++-
> > --
> > --
> > drivers/gpu/drm/xe/xe_vm_madvise.c | 67 ++++++++++++++++++++++++++
> > 2 files changed, 130 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_bo.c
> > b/drivers/gpu/drm/xe/xe_bo.c
> > index cbc3ee157218..3b3eb83658cc 100644
> > --- a/drivers/gpu/drm/xe/xe_bo.c
> > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > @@ -836,6 +836,60 @@ static int xe_bo_move_notify(struct xe_bo *bo,
> > return 0;
> > }
> >
> > +static int xe_bo_invalidate_tlb_before_purge(struct xe_bo *bo)
>
> In the future someone might want to reuse this function for
> invalidating somewhere else. Could we perhaps rename to
> xe_bo_invalidate_vmas() or something like that?
>
>
> > +{
> > + struct drm_gpuvm_bo *vm_bo;
> > + struct drm_gpuva *gpuva;
> > + struct drm_gem_object *obj = &bo->ttm.base;
> > + int ret;
> > +
> > + /* BO must be locked before invalidating */
> > + dma_resv_assert_held(bo->ttm.base.resv);
> > +
> > + drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
> > + drm_gpuvm_bo_for_each_va(gpuva, vm_bo) {
> > + struct xe_vma *vma = gpuva_to_vma(gpuva);
> > +
> > + ret = xe_vm_invalidate_vma(vma);
> > + if (ret)
> > + return ret;
> > + }
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static void xe_bo_set_purged(struct xe_bo *bo)
> > +{
> > + /* BO must be locked before modifying madv state */
> > + dma_resv_assert_held(bo->ttm.base.resv);
> > +
> > + atomic_set(&bo->madv_purgeable, XE_MADV_PURGEABLE_PURGED);
> > +}
> > +
> > +static void xe_ttm_bo_purge(struct ttm_buffer_object *ttm_bo, struct ttm_operation_ctx *ctx)
> > +{
> > + struct xe_device *xe = ttm_to_xe_device(ttm_bo->bdev);
> > + struct xe_bo *bo = ttm_to_xe_bo(ttm_bo);
> > +
> > + if (ttm_bo->ttm) {
> > + struct ttm_placement place = {};
> > + int ret = ttm_bo_validate(ttm_bo, &place, ctx);
> > + int ret_inval;
>
> Christian from AMD once mentioned that instead of implicitly calling
> ttm_bo_validate() with an empty placement, we could send the null
> placement through the evict_flags callback. Would that work?
>
>
Actually, it doesn't, since we don't get to call move_notify().
>
>
> > +
> > + drm_WARN_ON(&xe->drm, ret);
> > + if (!ret && bo) {
> > + if (atomic_read(&bo->madv_purgeable) == XE_MADV_PURGEABLE_DONTNEED) {
> > + /* Invalidate TLB before marking BO as purged */
> > + ret_inval = xe_bo_invalidate_tlb_before_purge(bo);
>
> Since the page-table update and page freeing are really intended to be
> asynchronous operations, and the GPU bindings are intended to be
> invalidated in move_notify() / trigger_rebind(), where we properly take
> care of special cases like faulting VMs etc., can we move the
> invalidation logic there?
>
> Perhaps it is even possible to skip the synchronous page-table zeroing
> here in favour of a NULL rebind (when rebinding a purged BO we set up
> all-zero mappings, or whatever mappings are required given scratch page
> mode etc.). Then the page-table clearing will be properly inserted in
> the asynchronous execution.
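To make the ordering in that suggestion concrete, here is a rough userspace model of it. It is a sketch under loud assumptions: none of these names exist in xe, `model_move_notify()` only stands in for what move_notify() / trigger_rebind() do conceptually, and the "worker" is an ordinary function call rather than real asynchronous execution.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What a (hypothetical) VMA currently maps. */
enum map_kind { MAP_BO_PAGES, MAP_NULL };

struct model_vma {
	enum map_kind mapping;
	bool needs_rebind;
};

struct model_bo {
	bool purged;
	bool pages_freed;
	struct model_vma *vmas;
	size_t num_vmas;
};

/* Notification step: only flags VMAs for rebind, no page-table writes. */
static void model_move_notify(struct model_bo *bo)
{
	for (size_t i = 0; i < bo->num_vmas; i++)
		bo->vmas[i].needs_rebind = true;
}

/* Purge path: notify and mark the BO, nothing synchronous beyond that. */
static void model_purge(struct model_bo *bo)
{
	assert(bo->vmas != NULL || bo->num_vmas == 0);

	model_move_notify(bo);
	bo->purged = true;
}

/*
 * Deferred rebind step. A purged BO gets NULL (zero/scratch) mappings,
 * and its pages are released only after every mapping has been replaced.
 * Returns the number of VMAs that were rebound.
 */
static size_t model_rebind_worker(struct model_bo *bo)
{
	size_t rebound = 0;

	for (size_t i = 0; i < bo->num_vmas; i++) {
		struct model_vma *vma = &bo->vmas[i];

		if (!vma->needs_rebind)
			continue;

		vma->mapping = bo->purged ? MAP_NULL : MAP_BO_PAGES;
		vma->needs_rebind = false;
		rebound++;
	}

	if (bo->purged)
		bo->pages_freed = true;

	return rebound;
}
```

In this model the purge path itself never touches a mapping; the page-table clearing and the page freeing both happen in the deferred step, in that order.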
>
>
> > + if (!ret_inval)
> > + xe_bo_set_purged(bo);
> > +
> > + }
> > + }
>
>
>
> > + }
> > +}
> > +
> > static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool
> > evict,
> > struct ttm_operation_ctx *ctx,
> > struct ttm_resource *new_mem,
> > @@ -853,8 +907,14 @@ static int xe_bo_move(struct ttm_buffer_object
> > *ttm_bo, bool evict,
> > bool needs_clear;
> > bool handle_system_ccs = (!IS_DGFX(xe) &&
> > xe_bo_needs_ccs_pages(bo) &&
> > ttm && ttm_tt_is_populated(ttm))
> > ?
> > true : false;
> > + int state = atomic_read(&bo->madv_purgeable);
> > int ret = 0;
> >
> > + if (evict && state == XE_MADV_PURGEABLE_DONTNEED) {
> > + xe_ttm_bo_purge(ttm_bo, ctx);
> > + return 0;
> > + }
> > +
> > /* Bo creation path, moving to system or TT. */
> > if ((!old_mem && ttm) && !handle_system_ccs) {
> > if (new_mem->mem_type == XE_PL_TT)
> > @@ -1606,18 +1666,6 @@ static void
> > xe_ttm_bo_delete_mem_notify(struct
> > ttm_buffer_object *ttm_bo)
> > }
> > }
> >
> > -static void xe_ttm_bo_purge(struct ttm_buffer_object *ttm_bo,
> > struct
> > ttm_operation_ctx *ctx)
> > -{
> > - struct xe_device *xe = ttm_to_xe_device(ttm_bo->bdev);
> > -
> > - if (ttm_bo->ttm) {
> > - struct ttm_placement place = {};
> > - int ret = ttm_bo_validate(ttm_bo, &place, ctx);
> > -
> > - drm_WARN_ON(&xe->drm, ret);
> > - }
> > -}
> > -
> > static void xe_ttm_bo_swap_notify(struct ttm_buffer_object
> > *ttm_bo)
> > {
> > struct ttm_operation_ctx ctx = {
> > @@ -2472,6 +2520,9 @@ struct xe_bo *xe_bo_create_user(struct
> > xe_device *xe,
> > ttm_bo_type_device, flags,
> > 0,
> > true);
> > }
> >
> > + /* Initialize purge advisory state */
> > + atomic_set(&bo->madv_purgeable,
> > XE_MADV_PURGEABLE_WILLNEED);
> > +
> > return bo;
> > }
> >
> > diff --git a/drivers/gpu/drm/xe/xe_vm_madvise.c
> > b/drivers/gpu/drm/xe/xe_vm_madvise.c
> > index cad3cf627c3f..1f0356ea4403 100644
> > --- a/drivers/gpu/drm/xe/xe_vm_madvise.c
> > +++ b/drivers/gpu/drm/xe/xe_vm_madvise.c
> > @@ -158,6 +158,54 @@ static void madvise_pat_index(struct xe_device
> > *xe, struct xe_vm *vm,
> > }
> > }
> >
> > +/*
> > + * Handle purgeable buffer object advice for
> > DONTNEED/WILLNEED/PURGED.
> > + * Returns 0 on success, negative errno on error.
> > + */
> > +static void xe_vm_madvise_purgeable_bo(struct xe_device *xe,
> > struct
> > xe_vm *vm,
> > + struct xe_vma **vmas, int
> > num_vmas,
> > + struct drm_xe_madvise *op,
> > struct drm_exec *exec)
> > +{
> > +
> > + xe_assert(vm->xe, op->type ==
> > DRM_XE_VMA_ATTR_PURGEABLE_STATE);
> > +
> > + for (int i = 0; i < num_vmas; i++) {
> > + struct xe_bo *bo = xe_vma_bo(vmas[i]);
> > + int state;
> > + int ret;
> > +
> > + if (!bo)
> > + continue;
> > +
> > + /* BO must be locked before modifying madv state
> > */
> > + dma_resv_assert_held(bo->ttm.base.resv);
> > +
> > + switch (op->purge_state_val.val) {
> > + case DRM_XE_VMA_PURGEABLE_STATE_WILLNEED:
> > + state = atomic_read(&bo->madv_purgeable);
> > + if (state == XE_MADV_PURGEABLE_PURGED) {
> > + ret = xe_bo_validate(bo, NULL,
> > true,
> > exec);
> > + if (ret) {
> > + drm_err(&vm->xe->drm,
> > + "Failed to
> > validate
> > purged BO: %d\n", ret);
> > + return;
> > + }
> > + }
> > + atomic_set(&bo->madv_purgeable,
> > XE_MADV_PURGEABLE_WILLNEED);
> > + break;
> > + case DRM_XE_VMA_PURGEABLE_STATE_DONTNEED:
> > + state = atomic_read(&bo->madv_purgeable);
> > + if (state != XE_MADV_PURGEABLE_PURGED)
> > + atomic_set(&bo->madv_purgeable,
> > XE_MADV_PURGEABLE_DONTNEED);
> > + break;
> > + default:
> > + drm_warn(&vm->xe->drm, "Invalid madvice
> > value = %d\n",
> > + op->purge_state_val.val);
> > + return;
> > + }
> > + }
> > +}
> > +
> > typedef void (*madvise_func)(struct xe_device *xe, struct xe_vm
> > *vm,
> > struct xe_vma **vmas, int num_vmas,
> > struct drm_xe_madvise *op);
> > @@ -283,6 +331,19 @@ static bool madvise_args_are_sane(struct
> > xe_device *xe, const struct drm_xe_madv
> > return false;
> > break;
> > }
> > + case DRM_XE_VMA_ATTR_PURGEABLE_STATE:
> > + {
> > + u32 val = args->purge_state_val.val;
> > +
> > + if (XE_IOCTL_DBG(xe, !((val ==
> > DRM_XE_VMA_PURGEABLE_STATE_WILLNEED) ||
> > + (val ==
> > DRM_XE_VMA_PURGEABLE_STATE_DONTNEED))))
> > + return false;
> > +
> > + if (XE_IOCTL_DBG(xe, args->purge_state_val.reserved))
> > + return false;
> > +
> > + break;
> > + }
> > default:
> > if (XE_IOCTL_DBG(xe, 1))
> > return false;
> > @@ -402,6 +463,12 @@ int xe_vm_madvise_ioctl(struct drm_device
> > *dev,
> > void *data, struct drm_file *fil
> > goto err_fini;
> > }
> > }
> > + if (args->type == DRM_XE_VMA_ATTR_PURGEABLE_STATE)
> > {
> > + xe_vm_madvise_purgeable_bo(xe, vm,
> > madvise_range.vmas,
> > +
> > madvise_range.num_vmas, args, &exec);
> > + goto err_fini;
> > +
> > + }
> > }
> >
> > if (madvise_range.has_svm_userptr_vmas) {
>