From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Matthew Brost <matthew.brost@intel.com>,
Matthew Auld <matthew.auld@intel.com>
Cc: intel-xe@lists.freedesktop.org
Subject: Re: [RFC PATCH] drm/xe/bo: Honor madvise(2) advices
Date: Sat, 29 Nov 2025 13:51:38 +0100 [thread overview]
Message-ID: <b7c3969245a5db71ced0c3aadc52c9531e68141d.camel@linux.intel.com> (raw)
In-Reply-To: <aSoNkE3dldrSbbF9@lstrano-desk.jf.intel.com>
On Fri, 2025-11-28 at 13:01 -0800, Matthew Brost wrote:
> On Fri, Nov 28, 2025 at 12:57:15PM +0000, Matthew Auld wrote:
> > On 28/11/2025 10:46, Thomas Hellström wrote:
> > > The user can give advice as to how the CPU will access an
> > > address range. Use that advice to determine the number of
> > > bo pages to prefault on a page-fault.
> > >
> > > Do this regardless of whether we can find a way to avoid the
> > > fairly slow vm_insert_pfn_prot() to populate buffer
> > > object maps.
> > >
> > > Initially, fault up to 512 pages on sequential access and
> > > a single page on random access.
> > >
> > > Cc: Matthew Brost <matthew.brost@intel.com>
> > > Cc: Matthew Auld <matthew.auld@intel.com>
> > > Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > > ---
> > > drivers/gpu/drm/xe/xe_bo.c | 18 +++++++++++++++++-
> > > 1 file changed, 17 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> > > index 6fd6ce6c6586..07d0d954f826 100644
> > > --- a/drivers/gpu/drm/xe/xe_bo.c
> > > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > > @@ -1821,15 +1821,31 @@ static int xe_bo_fault_migrate(struct xe_bo *bo, struct ttm_operation_ctx *ctx,
> > > return err;
> > > }
> > > +/*
> > > + * Number of prefaulted pages for the MADV_SEQUENTIAL and
> > > + * MADV_RANDOM madvise() advices.
> > > + */
> > > +#define XE_BO_VM_NUM_PREFAULT_SEQ 512
> > > +#define XE_BO_VM_NUM_PREFAULT_RAND 1
> > > +
> > > /* Call into TTM to populate PTEs, and register bo for PTE removal on runtime suspend. */
> > > static vm_fault_t __xe_bo_cpu_fault(struct vm_fault *vmf,
> > > struct xe_device *xe, struct xe_bo *bo)
> > > {
> > > + const struct vm_area_struct *vma = vmf->vma;
> > > + pgoff_t num_prefault;
> > > vm_fault_t ret;
> > > trace_xe_bo_cpu_fault(bo);
> > > + if (vma->vm_flags & VM_SEQ_READ)
> > > + num_prefault = XE_BO_VM_NUM_PREFAULT_SEQ;
> > > + else if (vma->vm_flags & VM_RAND_READ)
> > > + num_prefault = XE_BO_VM_NUM_PREFAULT_RAND;
> > > + else
> > > + num_prefault = TTM_BO_VM_NUM_PREFAULT;
> >
> > Ah, interesting. Do we know if any UMD is making use of these
> > special flags
> > today? Just wondering if this might be a visible change or not?
> > Also would
> > it make sense to document/advertise this somewhere for UMD folks,
> > in case
> > this has an immediate benefit for them?
> >
>
> I also have a question here - does Xe / TTM support faulting in THP
> on
> the CPU side? Is that something we should also look at doing based on
> madvise / global THP settings? Would that help mitigate the slow
> vm_insert_pfn_prot too?
It would probably help a lot, as long as we actually get 2MiB pages
from TTM.
I had that implemented in TTM once, with vmwgfx as the only user, and
it worked fine except for one very important detail: I had based it on
vma information rather than PTE-based information, so
get_user_pages_fast() didn't recognize these pages and got terribly
confused. So it had to be ripped out.
If we're going to try that again, we need to talk to the x86 arch
maintainers about a PMD_PUD_SPECIAL pmd/pud flag that behaves just like
PTE_SPECIAL, so that things like get_user_pages_fast() ignore these
huge entries. Auditing all page-walks in core-mm for this is
non-trivial.
But if that is done, we could bring that code back, although Christian
wasn't very fond of having it in TTM.
I think it would also be very beneficial for things like ioremap() and
friends.
/Thomas
>
> Matt
>
> > I guess would be good to add an IGT which uses both flags, if we
> > don't
> > already?
> >
> > Anyway, I think the change makes sense,
> > Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> >
> > > +
> > > 	ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
> > > - TTM_BO_VM_NUM_PREFAULT);
> > > + num_prefault);
> > > /*
> > > 	 * When TTM is actually called to insert PTEs, ensure no blocking conditions
> > > 	 * remain, in which case TTM may drop locks and return VM_FAULT_RETRY.
> >