From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: "Christian König" <christian.koenig@amd.com>,
"Daniel Vetter" <daniel.vetter@ffwll.ch>
Cc: Matthew Brost <matthew.brost@intel.com>, dri-devel@lists.freedesktop.org
Subject: Re: [PATCH 4/7] drm/ttm: move LRU walk defines into new internal header
Date: Tue, 20 Aug 2024 18:00:55 +0200 [thread overview]
Message-ID: <bb0a31ea3d82ee370873ca5f1c66ec4eeafabffe.camel@linux.intel.com> (raw)
In-Reply-To: <4d4c532a-ff35-4172-9b71-93f5d130711b@amd.com>
On Tue, 2024-08-20 at 17:45 +0200, Christian König wrote:
> Am 20.08.24 um 12:37 schrieb Thomas Hellström:
> > Hi,
> >
> > On Mon, 2024-08-19 at 17:26 +0200, Christian König wrote:
> > > Am 19.08.24 um 16:14 schrieb Daniel Vetter:
> > > > On Mon, Aug 19, 2024 at 01:38:56PM +0200, Thomas Hellström
> > > > wrote:
> > > > > Hi, Christian,
> > > > >
> > > > > On Mon, 2024-08-19 at 13:03 +0200, Christian König wrote:
> > > > > > Am 06.08.24 um 10:29 schrieb Thomas Hellström:
> > > > > > > Hi, Christian.
> > > > > > >
> > > > > > > On Thu, 2024-07-11 at 14:01 +0200, Christian König wrote:
> > > > > > > > Am 10.07.24 um 20:19 schrieb Matthew Brost:
> > > > > > > > > On Wed, Jul 10, 2024 at 02:42:58PM +0200, Christian
> > > > > > > > > König
> > > > > > > > > wrote:
> > > > > > > > > > That is something drivers really shouldn't mess
> > > > > > > > > > with.
> > > > > > > > > >
> > > > > > > > > Thomas uses this in Xe to implement a shrinker [1].
> > > > > > > > > Seems
> > > > > > > > > to
> > > > > > > > > need
> > > > > > > > > to
> > > > > > > > > remain available for drivers.
> > > > > > > > No, that is exactly what I try to prevent with that.
> > > > > > > >
> > > > > > > > This is an internally thing of TTM and drivers should
> > > > > > > > never
> > > > > > > > use
> > > > > > > > it
> > > > > > > > directly.
> > > > > > > That driver-facing LRU walker is a direct
> > > > > > > response/solution
> > > > > > > to this
> > > > > > > comment that you made in the first shrinker series:
> > > > > > >
> > > > > > > https://lore.kernel.org/linux-mm/b7491378-defd-4f1c-31e2-29e4c77e2d67@amd.com/T/#ma918844aa8a6efe8768fdcda0c6590d5c93850c9
> > > > > > Ah, yeah that was about how we are should be avoiding
> > > > > > middle
> > > > > > layer
> > > > > > design.
> > > > > >
> > > > > > But a function which returns the next candidate for
> > > > > > eviction
> > > > > > and a
> > > > > > function which calls a callback for eviction is exactly the
> > > > > > opposite.
> > > > > >
> > > > > > > That is also mentioned in the cover letter of the recent
> > > > > > > shrinker
> > > > > > > series, and this walker has been around in that shrinker
> > > > > > > series for
> > > > > > > more than half a year now so if you think it's not the
> > > > > > > correct
> > > > > > > driver
> > > > > > > facing API IMHO that should be addressed by a review
> > > > > > > comment
> > > > > > > in
> > > > > > > that
> > > > > > > series rather than in posting a conflicting patch?
> > > > > > I actually outlined that in the review comments for the
> > > > > > patch
> > > > > > series.
> > > > > > E.g. a walker function with a callback is basically a
> > > > > > middle
> > > > > > layer.
> > > > > >
> > > > > > What outlined in the link above is that a function which
> > > > > > returns the
> > > > > > next eviction candidate is a better approach than a
> > > > > > callback.
> > > > > >
> > > > > > > So assuming that we still want the driver to register the
> > > > > > > shrinker,
> > > > > > > IMO that helper abstracts away all the nasty locking and
> > > > > > > pitfalls
> > > > > > > for a
> > > > > > > driver-registered shrinker, and is similar in structure
> > > > > > > to
> > > > > > > for
> > > > > > > example
> > > > > > > the pagewalk helper in mm/pagewalk.c.
> > > > > > >
> > > > > > > An alternative that could be tried as a driver-facing API
> > > > > > > is
> > > > > > > to
> > > > > > > provide
> > > > > > > a for_each_bo_in_lru_lock() macro where the driver open-
> > > > > > > codes
> > > > > > > "process_bo()" inside the for loop but I tried this and
> > > > > > > found
> > > > > > > it
> > > > > > > quite
> > > > > > > fragile since the driver might exit the loop without
> > > > > > > performing the
> > > > > > > necessary cleanup.
> > > > > > The point is that the shrinker should *never* need to have
> > > > > > context.
> > > > > > E.g.
> > > > > > a walker which allows going over multiple BOs for eviction
> > > > > > is
> > > > > > exactly
> > > > > > the wrong approach for that.
> > > > > >
> > > > > > The shrinker should evict always only exactly one BO and
> > > > > > the
> > > > > > next
> > > > > > invocation of a shrinker should not depend on the result of
> > > > > > the
> > > > > > previous
> > > > > > one.
> > > > > >
> > > > > > Or am I missing something vital here?
> > > > > A couple of things,
> > > > >
> > > > > 1) I'd like to think of the middle-layer vs helper like the
> > > > > helper has
> > > > > its own ops, and can be used optionally from the driver.
> > > > > IIRC,
> > > > > the
> > > > > atomic modesetting / pageflip ops are implemented in exactly
> > > > > this
> > > > > way.
> > > > >
> > > > > Sometimes a certain loop operation can't be easily or at
> > > > > least
> > > > > robustly
> > > > > implemented using a for_each_.. approach. Like for example
> > > > > mm/pagewalk.c. In this shrinking case I think it's probably
> > > > > possible
> > > > > using the scoped_guard() in cleanup.h. This way we could get
> > > > > rid
> > > > > of
> > > > > this middle layer discussion by turning the interface inside-
> > > > > out:
> > > > >
> > > > > for_each_bo_on_lru_locked(xxx)
> > > > > driver_shrink();
> > > > >
> > > > > But I do think the currently suggested approach is less
> > > > > fragile
> > > > > and
> > > > > prone to driver error.
> > > > >
> > > > > FWIW in addition to the above examples, also drm_gem_lru_scan
> > > > > works
> > > > > like this.
> > > > a iteration state structure (like drm_connector_iter) plus then
> > > > a
> > > > macro
> > > > for the common case that uses cleanup.h should get the job
> > > > done.
> > > Yeah, completely agree. It basically boils down to which one
> > > needs to
> > > pack a state bag.
> > >
> > > In a mid-layer design it's the driver or the caller of functions,
> > > e.g.
> > > the much hated init callback in the DRM layer was a perfect
> > > example
> > > of that.
> > >
> > > In a non mid-layer approach it's the framework or the called
> > > function,
> > > examples are how the fence iterators in the dma_resv or the
> > > drm_connector, plane etc.. work.
> > >
> > > One big difference between the two approach is who and where
> > > things
> > > like
> > > preparation and cleanup are done, e.g. who takes locks for
> > > example.
> > So what in your opinion is an acceptable solution here?
> > The "get next object to shrink" approach won't work, since it's
> > subject
> > to the old TTM swap problem now removed:
> > If shrinking fails we will get the same object upon subsequent
> > calls.
>
> But and that is expected behavior. If shrinking fails just going to
> the
> next object is invalid as far as I can see.
>
> That's why I was so puzzled why you tried to apply the walker to the
> TTM
> shrinker.
>
> Or why exactly should shrinking fail?
A common example would be not having runtime pm and the particular bo
needs it to unbind, we want to try the next bo. Example: i915 GGTT
bound bos and Lunar Lake PL_TT bos.
And again, all other drm bo shrinkers do this. We just want to do the
same.
>
> > If we bump LRU we could end up with infinite loops.
> > So IMO we need to be able to loop. I don't really care wether we do
> > this as an explicit loop or whether we use the LRU walker, but I
> > think
> > from a maintainability point-of-view it is better to keep LRU
> > walking
> > in a single place.
> >
> > If we return an unlocked object, we'd need to refcount and drop the
> > lru
> > lock, but maybe that's not a bad thing.
> >
> > But what's the main drawback of exporting the existing helper.
>
> Well that we re-creates exactly the mid-layer mess I worked so hard
> to
> remove from TTM.
It doesn't IMO. I agree the first attempt did. This affects only the
LRU iteration itself and I'm even fine to get rid of the callback using
a for_ macro.
>
> > > > > 2) The shrinkers suggest to shrink a number of pages, not a
> > > > > single bo,
> > > > > again drm_gem_lru_scan works like this. i915 works like this.
> > > > > I
> > > > > think
> > > > > we should align with those.
> > > > Yeah that's how shrinkers work, so if we demidlayer then it
> > > > realls
> > > > needs
> > > > to be a loop in the driver, not a "here's the next bo to nuke"
> > > > I
> > > > think.
> > > Hui? Well that's not how I understand shrinkers.
> > >
> > > The shrinker gives you the maximum number of objects to scan and
> > > not
> > > how
> > > many pages to free. In return you give the actual number of
> > > objects
> > > to
> > > scanned and pages freed.
> > >
> > > As far as I know the number of objects are in the sense of SLUBs
> > > or
> > > rather different LRU lists.
> > >
> > > So for BO shrinking we should probably only free or rather unpin
> > > a
> > > single BO per callback otherwise we mess up the fairness between
> > > shrinkers in the MM layer.
> > Hm. AFAICT all drm shrinkers use pages as objects, and the docs of
> > nr_to_scan says it's the number of objects to scan and try to
> > reclaim.
> > I think this strategy has had a fair amount of testing with the
> > i915
> > driver.
> > In any case, I don't think TTM should enforce a different way of
> > shrinking by the means of a severely restricted helper?
>
> Well, as far as I can see that is exactly what TTM should do.
>
> I mean the main advantage to make a common component is to enforce
> correct behavior.
But if all other drivers don't agree this as correct behavior and
instead want to keep behavior that is proven to work, that's a dead
end.
/Thomas
>
> Regards,
> Christian.
>
> >
> > /Thomas
> >
next prev parent reply other threads:[~2024-08-20 16:01 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-10 12:42 Using drm_exec in TTM Christian König
2024-07-10 12:42 ` [PATCH 1/7] dma-buf/dma-resv: Introduce dma_resv_trylock_ctx() Christian König
2024-07-10 12:42 ` [PATCH 2/7] drm/exec: don't immediately add the prelocked obj Christian König
2024-07-10 12:42 ` [PATCH 3/7] drm/exec: provide trylock interface for eviction Christian König
2024-07-10 12:42 ` [PATCH 4/7] drm/ttm: move LRU walk defines into new internal header Christian König
2024-07-10 18:19 ` Matthew Brost
2024-07-11 12:01 ` Christian König
2024-07-11 16:00 ` Matthew Brost
2024-08-06 8:29 ` Thomas Hellström
2024-08-19 11:03 ` Christian König
2024-08-19 11:38 ` Thomas Hellström
2024-08-19 14:14 ` Daniel Vetter
2024-08-19 15:26 ` Christian König
2024-08-20 10:37 ` Thomas Hellström
2024-08-20 15:45 ` Christian König
2024-08-20 16:00 ` Thomas Hellström [this message]
2024-08-21 8:14 ` Christian König
2024-08-21 8:57 ` Thomas Hellström
2024-08-21 9:31 ` Thomas Hellström
2024-08-21 9:48 ` Christian König
2024-08-21 12:00 ` Thomas Hellström
2024-08-22 6:47 ` Thomas Hellström
2024-08-22 7:55 ` Christian König
2024-08-22 8:21 ` Thomas Hellström
2024-08-22 8:36 ` Thomas Hellström
2024-08-22 9:29 ` Christian König
2024-08-22 13:16 ` Thomas Hellström
2024-08-27 16:58 ` Daniel Vetter
2024-08-22 9:23 ` Daniel Vetter
2024-08-22 13:19 ` Christian König
2024-08-27 16:52 ` Daniel Vetter
2024-08-27 17:53 ` Daniel Vetter
2024-08-28 12:20 ` Christian König
2024-08-28 14:05 ` Thomas Hellström
2024-08-28 15:25 ` Christian König
2024-08-28 15:35 ` Alex Deucher
2024-08-28 15:45 ` Thomas Hellström
2024-09-02 11:07 ` Daniel Vetter
2024-09-18 12:57 ` Thomas Hellström
2024-10-02 11:30 ` Thomas Hellström
2024-10-02 11:32 ` Christian König
2024-10-02 11:36 ` Thomas Hellström
2024-10-07 9:08 ` Christian König
2024-10-09 13:39 ` Thomas Hellström
2024-10-09 14:17 ` Thomas Hellström
2024-10-10 8:00 ` Christian König
2024-10-11 16:52 ` Thomas Hellström
2024-08-27 18:24 ` Alex Deucher
2024-09-02 11:00 ` Daniel Vetter
2024-08-21 7:02 ` Thomas Hellström
2024-07-10 12:42 ` [PATCH 5/7] drm/ttm: move needs_unlock into the walk Christian König
2024-07-10 12:43 ` [PATCH 6/7] drm/ttm: support using drm_exec during eviction v2 Christian König
2024-07-10 12:43 ` [PATCH 7/7] drm/amdgpu: use drm_exec during BO validation Christian König
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bb0a31ea3d82ee370873ca5f1c66ec4eeafabffe.camel@linux.intel.com \
--to=thomas.hellstrom@linux.intel.com \
--cc=christian.koenig@amd.com \
--cc=daniel.vetter@ffwll.ch \
--cc=dri-devel@lists.freedesktop.org \
--cc=matthew.brost@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.