intel-xe.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Matthew Auld <matthew.auld@intel.com>, intel-xe@lists.freedesktop.org
Cc: melvyn <melvyn2@dnsense.pub>, Summers Stuart <stuart.summers@intel.com>
Subject: Re: [PATCH v2] drm/xe: Defer buffer object shrinker write-backs and GPU waits
Date: Thu, 07 Aug 2025 21:13:59 +0200	[thread overview]
Message-ID: <06cb2b34897415e574a7c55b54757000958cdcee.camel@linux.intel.com> (raw)
In-Reply-To: <3e90708318a368704a8aa7aa89682fb3743d7f3f.camel@linux.intel.com>

On Tue, 2025-08-05 at 18:27 +0200, Thomas Hellström wrote:
> On Tue, 2025-08-05 at 14:40 +0100, Matthew Auld wrote:
> > On 05/08/2025 08:48, Thomas Hellström wrote:
> > > When the xe buffer-object shrinker allows GPU waits and write-
> > > back,
> > > (typically from kswapd), perform multiple passes, skipping
> > > subsequent passes if the shrinker number of scanned objects
> > > target
> > > is reached.
> > > 
> > > 1) Without GPU waits and write-back
> > > 2) Without write-back
> > > 3) With both GPU-waits and write-back
> > > 
> > > This is to avoid stalls and costly write- and readbacks unless
> > > they
> > > are really necessary.
> > > 
> > > v2:
> > > - Don't test for scan completion twice. (Stuart Summers)
> > > - Update tags.
> > > 
> > > Reported-by: melvyn <melvyn2@dnsense.pub>
> > > Closes:
> > > https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5557
> > > Cc: Summers Stuart <stuart.summers@intel.com>
> > > Fixes: 00c8efc3180f ("drm/xe: Add a shrinker for xe bos")
> > > Cc: <stable@vger.kernel.org> # v6.15+
> > > Signed-off-by: Thomas Hellström
> > > <thomas.hellstrom@linux.intel.com>
> > > ---
> > >   drivers/gpu/drm/xe/xe_shrinker.c | 51
> > > +++++++++++++++++++++++++++++---
> > >   1 file changed, 47 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_shrinker.c
> > > b/drivers/gpu/drm/xe/xe_shrinker.c
> > > index 1c3c04d52f55..90244fe59b59 100644
> > > --- a/drivers/gpu/drm/xe/xe_shrinker.c
> > > +++ b/drivers/gpu/drm/xe/xe_shrinker.c
> > > @@ -54,10 +54,10 @@ xe_shrinker_mod_pages(struct xe_shrinker
> > > *shrinker, long shrinkable, long purgea
> > >   	write_unlock(&shrinker->lock);
> > >   }
> > >   
> > > -static s64 xe_shrinker_walk(struct xe_device *xe,
> > > -			    struct ttm_operation_ctx *ctx,
> > > -			    const struct xe_bo_shrink_flags
> > > flags,
> > > -			    unsigned long to_scan, unsigned long
> > > *scanned)
> > > +static s64 __xe_shrinker_walk(struct xe_device *xe,
> > > +			      struct ttm_operation_ctx *ctx,
> > > +			      const struct xe_bo_shrink_flags
> > > flags,
> > > +			      unsigned long to_scan, unsigned
> > > long
> > > *scanned)
> > >   {
> > >   	unsigned int mem_type;
> > >   	s64 freed = 0, lret;
> > > @@ -93,6 +93,48 @@ static s64 xe_shrinker_walk(struct xe_device
> > > *xe,
> > >   	return freed;
> > >   }
> > >   
> > > +/*
> > > + * Try shrinking idle objects without writeback first, then if
> > > not
> > > sufficient,
> > > + * try also non-idle objects and finally if that's not
> > > sufficient
> > > either,
> > > + * add writeback. This avoids stalls and explicit writebacks
> > > with
> > > light or
> > > + * moderate memory pressure.
> > 
> > Just one question here, with writeback=false it doesn't really
> > influence 
> > which objects are chosen for shrinking, unlike with no_wait_gpu,
> > right? 
> > Will having another pass just with writeback=true yield anything 
> > different, assuming here that the previous two passes would have
> > already 
> > hoovered ~everything up that was a possible candidate, so this pass
> > won't really find anything in practice? If so, does that also mean
> > we
> > never really end up using the writeback=true behaviour any more?
> 
> Good point.
> 
> The assumption is that if allocating shmem backing-store fails during
> shrinking, we'd see an -ENOMEM and fail our target, and the next pass
> with writeback would help avoiding that.
> 
> Ofc that requires that a shmem_read_folio() from within reclaim
> returns
> an ERR_PTR(-ENOMEM) if the kernel reserves are depleted rather than
> to
> invoke the OOM killer. I should perhaps test that.

So for reference I tested allocing pages from a loop with
__GFP_RETRY_MAYFAIl | __GFP_NOWARN from reclaim (on each shrinker scan)
until I received a failure, (which is the GFP flags we use for shmem
page allocation), and then freed them again. That worked and no OOM
killer was invoked.

So this makes me more confident in merging v2. We need to keep an eye
on reports for unexpected OOM kills, though.

/Thomas


> 
> Other options would ofc be to include the writeback in pass 2, which
> would be similar to what the i915 shrinker does.
> 
> Thoughts?
> 
> Thanks,
> Thomas
> 
> 
> 
> > 
> > > + */
> > > +static s64 xe_shrinker_walk(struct xe_device *xe,
> > > +			    struct ttm_operation_ctx *ctx,
> > > +			    const struct xe_bo_shrink_flags
> > > flags,
> > > +			    unsigned long to_scan, unsigned long
> > > *scanned)
> > > +{
> > > +	bool no_wait_gpu = true;
> > > +	struct xe_bo_shrink_flags save_flags = flags;
> > > +	s64 lret, freed;
> > > +
> > > +	swap(no_wait_gpu, ctx->no_wait_gpu);
> > > +	save_flags.writeback = false;
> > > +	lret = __xe_shrinker_walk(xe, ctx, save_flags, to_scan,
> > > scanned);
> > > +	swap(no_wait_gpu, ctx->no_wait_gpu);
> > > +	if (lret < 0 || *scanned >= to_scan)
> > > +		return lret;
> > > +
> > > +	freed = lret;
> > > +	if (!ctx->no_wait_gpu) {
> > > +		lret = __xe_shrinker_walk(xe, ctx, save_flags,
> > > to_scan, scanned);
> > > +		if (lret < 0)
> > > +			return lret;
> > > +		freed += lret;
> > > +		if (*scanned >= to_scan)
> > > +			return freed;
> > > +	}
> > > +
> > > +	if (flags.writeback) {
> > > +		lret = __xe_shrinker_walk(xe, ctx, flags,
> > > to_scan,
> > > scanned);
> > > +		if (lret < 0)
> > > +			return lret;
> > > +		freed += lret;
> > > +	}
> > > +
> > > +	return freed;
> > > +}
> > > +
> > >   static unsigned long
> > >   xe_shrinker_count(struct shrinker *shrink, struct
> > > shrink_control
> > > *sc)
> > >   {
> > > @@ -199,6 +241,7 @@ static unsigned long xe_shrinker_scan(struct
> > > shrinker *shrink, struct shrink_con
> > >   		runtime_pm =
> > > xe_shrinker_runtime_pm_get(shrinker,
> > > true, 0, can_backup);
> > >   
> > >   	shrink_flags.purge = false;
> > > +
> > >   	lret = xe_shrinker_walk(shrinker->xe, &ctx,
> > > shrink_flags,
> > >   				nr_to_scan, &nr_scanned);
> > >   	if (lret >= 0)
> > 
> 


  parent reply	other threads:[~2025-08-07 19:14 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-05  7:48 [PATCH v2] drm/xe: Defer buffer object shrinker write-backs and GPU waits Thomas Hellström
2025-08-05  8:55 ` ✓ CI.KUnit: success for drm/xe: Defer buffer object shrinker write-backs and GPU waits (rev2) Patchwork
2025-08-05  9:29 ` ✓ Xe.CI.BAT: " Patchwork
2025-08-05 11:31 ` ✗ Xe.CI.Full: failure " Patchwork
2025-08-05 13:40 ` [PATCH v2] drm/xe: Defer buffer object shrinker write-backs and GPU waits Matthew Auld
2025-08-05 16:27   ` Thomas Hellström
2025-08-06 14:45     ` Matthew Auld
2025-08-07 19:13     ` Thomas Hellström [this message]
2025-08-05 14:50 ` Summers, Stuart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=06cb2b34897415e574a7c55b54757000958cdcee.camel@linux.intel.com \
    --to=thomas.hellstrom@linux.intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=matthew.auld@intel.com \
    --cc=melvyn2@dnsense.pub \
    --cc=stuart.summers@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).