From: Matthew Brost <matthew.brost@intel.com>
To: Brian Nguyen <brian3.nguyen@intel.com>
Cc: <intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH 2/4] drm/xe: Add explicit abort page reclaim list
Date: Mon, 5 Jan 2026 18:23:36 -0800
Message-ID: <aVxyKJf2fWBmISqL@lstrano-desk.jf.intel.com>
In-Reply-To: <20260105233351.3753716-8-brian3.nguyen@intel.com>
On Tue, Jan 06, 2026 at 07:33:54AM +0800, Brian Nguyen wrote:
> PRLs could be invalidated to indicate its getting dropped from current
> scope but are still valid. So standardize calls and add abort to clearly
> define when an invalidation is a real abort and PRL should fallback.
>
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> ---
> drivers/gpu/drm/xe/xe_page_reclaim.c | 23 +++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_page_reclaim.h | 3 +++
> drivers/gpu/drm/xe/xe_pt.c | 21 +++++++++------------
> drivers/gpu/drm/xe/xe_tlb_inval_job.c | 2 +-
> 4 files changed, 36 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c b/drivers/gpu/drm/xe/xe_page_reclaim.c
> index 94d4608ebd74..9f086067a4a1 100644
> --- a/drivers/gpu/drm/xe/xe_page_reclaim.c
> +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> @@ -100,6 +100,29 @@ void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl)
> prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> }
>
> +/**
> + * xe_page_reclaim_list_abort() - Abort a PRL, invalidate it, and log
> + * @gt: GT associated with this PRL
> + * @prl: Page reclaim list to invalidate
> + * @format: printf-style format string for vm_dbg
s/@format/@fmt/ so the kernel-doc matches the parameter name.
> + *
> + * This is intended for PRL abort paths where we want to track aborted PRLs
> + */
> +void xe_page_reclaim_list_abort(struct xe_gt *gt, struct xe_page_reclaim_list *prl,
> + const char *fmt, ...)
> +{
> + struct va_format vaf;
> + va_list va_args;
> +
> + xe_page_reclaim_list_invalidate(prl);
> +
> + va_start(va_args, fmt);
If vm_dbg calls va_start (it does, in __drm_dev_dbg), this won't work, as
calling va_start twice is undefined behavior. IIRC, this works on
some compilers but not others, so as soon as the kernel test robot tries a
certain build we will get bug reports. The W/A here is to make
xe_page_reclaim_list_abort a macro.
> + vaf.fmt = fmt;
> + vaf.va = &va_args;
> + vm_dbg(&gt_to_xe(gt)->drm, "PRL aborted: %pV", &vaf);
> + va_end(va_args);
> +}
> +
> /**
> * xe_page_reclaim_list_init() - Initialize a page reclaim list
> * @prl: Page reclaim list to initialize
> diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h b/drivers/gpu/drm/xe/xe_page_reclaim.h
> index a4f58e0ce9b4..2464268dba69 100644
> --- a/drivers/gpu/drm/xe/xe_page_reclaim.h
> +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> @@ -19,6 +19,7 @@
> struct xe_tlb_inval;
> struct xe_tlb_inval_fence;
> struct xe_tile;
> +struct xe_gt;
> struct xe_vma;
>
> struct xe_guc_page_reclaim_entry {
> @@ -75,6 +76,8 @@ struct drm_suballoc *xe_page_reclaim_create_prl_bo(struct xe_tlb_inval *tlb_inva
> struct xe_page_reclaim_list *prl,
> struct xe_tlb_inval_fence *fence);
> void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
> +void xe_page_reclaim_list_abort(struct xe_gt *gt, struct xe_page_reclaim_list *prl,
> + const char *format, ...);
So I think here, make xe_page_reclaim_list_abort a macro in this header
file.
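Roughly the shape I have in mind, as an untested user-space sketch rather
than a drop-in: the struct definitions, the value of
XE_PAGE_RECLAIM_INVALID_LIST, and mock_vm_dbg() are stand-ins made up for
illustration. In the driver the log call would be
vm_dbg(&gt_to_xe(gt)->drm, ...) as in your patch.

```c
#include <stdarg.h>
#include <stdio.h>

/* Stand-ins for the driver types; the real definitions live in the xe driver. */
#define XE_PAGE_RECLAIM_INVALID_LIST	(-1)	/* assumed value, illustration only */

struct xe_gt {
	int id;
};

struct xe_page_reclaim_list {
	int num_entries;
};

static char mock_log[256];	/* holds the most recent debug message */

/* Hypothetical replacement for vm_dbg(): a printf-style logger. */
static void mock_vm_dbg(const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	vsnprintf(mock_log, sizeof(mock_log), fmt, args);
	va_end(args);
}

static void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl)
{
	prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
}

/*
 * Macro form: the format string and its arguments are forwarded straight to
 * the logger, so the wrapper never handles a va_list itself. fmt must be a
 * string literal so it can be concatenated with the "PRL aborted: " prefix.
 * ##__VA_ARGS__ is a GNU extension (accepted by gcc and clang).
 */
#define xe_page_reclaim_list_abort(gt, prl, fmt, ...)			\
	do {								\
		(void)(gt);	/* driver version: &gt_to_xe(gt)->drm */ \
		xe_page_reclaim_list_invalidate(prl);			\
		mock_vm_dbg("PRL aborted: " fmt, ##__VA_ARGS__);	\
	} while (0)
```

The call sites in xe_pt.c would stay as they are in this patch; only the
definition moves from the .c file into the header.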
> void xe_page_reclaim_list_init(struct xe_page_reclaim_list *prl);
> int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
> /**
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index 6cd78bb2b652..2752a5a48a97 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -1618,10 +1618,9 @@ static int generate_reclaim_entry(struct xe_tile *tile,
> } else if (is_2m_pte(xe_child)) {
> reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_2M); /* reclamation_size = 9 */
> } else {
> - xe_page_reclaim_list_invalidate(prl);
> - vm_dbg(&tile_to_xe(tile)->drm,
> - "PRL invalidate: unsupported PTE level=%u pte=%#llx\n",
> - xe_child->level, pte);
> + xe_page_reclaim_list_abort(tile->primary_gt, prl,
> + "unsupported PTE level=%u pte=%#llx",
> + xe_child->level, pte);
> return -EINVAL;
> }
>
> @@ -1670,10 +1669,9 @@ static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> break;
> } else {
> /* overflow, mark as invalid */
> - xe_page_reclaim_list_invalidate(xe_walk->prl);
> - vm_dbg(&xe->drm,
> - "PRL invalidate: overflow while adding pte=%#llx",
> - pte);
> + xe_page_reclaim_list_abort(xe_walk->tile->primary_gt, xe_walk->prl,
> + "overflow while adding pte=%#llx",
> + pte);
> break;
> }
> }
> @@ -1682,10 +1680,9 @@ static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> /* If aborting page walk early, invalidate PRL since PTE may be dropped from this abort */
> if (xe_pt_check_kill(addr, next, level - 1, xe_child, action, walk) &&
> xe_walk->prl && level > 1 && xe_child->base.children && xe_child->num_live != 0) {
> - xe_page_reclaim_list_invalidate(xe_walk->prl);
> - vm_dbg(&xe->drm,
> - "PRL invalidate: kill at level=%u addr=%#llx next=%#llx num_live=%u\n",
> - level, addr, next, xe_child->num_live);
> + xe_page_reclaim_list_abort(xe_walk->tile->primary_gt, xe_walk->prl,
> + "kill at level=%u addr=%#llx next=%#llx num_live=%u\n",
> + level, addr, next, xe_child->num_live);
> }
>
> return 0;
> diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_job.c b/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> index 6a7bd6315797..b8916552101c 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> @@ -182,7 +182,7 @@ static void xe_tlb_inval_job_destroy(struct kref *ref)
> struct xe_vm *vm = job->vm;
>
> /* BO creation retains a copy (if used), so no longer needed */
> - xe_page_reclaim_entries_put(job->prl.entries);
> + xe_page_reclaim_list_invalidate(&job->prl);
This doesn't look right, or at minimum it is unrelated to this patch.
Matt
>
> if (!job->fence_armed)
> kfree(ifence);
> --
> 2.52.0
>