From: David Airlie <airlied@redhat.com>
To: "Christian König" <christian.koenig@amd.com>
Cc: Dave Airlie <airlied@gmail.com>,
dri-devel@lists.freedesktop.org, linux-mm@kvack.org,
Johannes Weiner <hannes@cmpxchg.org>,
Dave Chinner <david@fromorbit.com>,
Kairui Song <kasong@tencent.com>
Subject: Re: [PATCH 12/17] ttm: add objcg pointer to bo and tt
Date: Wed, 2 Jul 2025 17:57:35 +1000
Message-ID: <CAMwc25oYx1L9H+GCE95ZAxXAwqCDQVjpJfre_Ndv=Z8j8KOyYw@mail.gmail.com>
In-Reply-To: <54b2ee4a-0f2f-49a1-a680-8dc1193e2d30@amd.com>
> >
> > It makes it easier now, but when we have to solve swapping, step one
> > will be moving all this code around to what I have now, and starting
> > from there.
> >
> > This just raises the bar to solving the next problem.
> >
> > We need to find incremental approaches to getting all the pieces of
> > the puzzle solved, or else we will still be here in 10 years.
> >
> > The steps I've formulated (none of them are perfect, but they all seem
> > better than status quo)
> >
> > 1. add global counters for pages - now we can at least see things in
> > vmstat and per-node
> > 2. add numa to the pool lru - we can remove our own numa code and
> > align with core kernel - probably doesn't help anything
>
> So far no objections from my side to that.
>
> > 3. add memcg awareness to the pool and pool shrinker.
> > if you are on a APU with no swap configured - you have a lot better time.
> > if you are on a dGPU or APU with swap - you have a moderately
> > better time, but I can't see you having a worse time.
>
> Well that's what I'm strongly disagreeing on.
>
> Adding memcg to the pool has no value at all and complicates things massively when moving forward.
>
> What exactly should be the benefit of that?
I'm already showing the benefit of moving the pool to memcg, and we've
even talked about it multiple times on the list. It's not an OMG
change-the-world benefit, but it definitely provides better alignment
between the pool and memcg allocations.

We expose a userspace API to allocate write-combined (WC) memory, and we
do this for all currently supported CPUs/GPUs. We might decide in the
future that we don't want to continue doing this, but we do it now. My
Fedora 42 desktop uses it, even if you say there is no need.

If I allocate 100% of my memcg budget as WC memory, free it, then
allocate 100% of my budget as non-WC memory, we break container
containment, because we can force other cgroups to run out of memory
budget and have to call the global shrinker. With this in place, the
container that allocated the WC memory also pays the price of switching
it back. Again, this is just correctness. It's not going to fix any
major workloads, but I also don't think it should cause any
regressions, since it won't be worse than the current worst-case
expectation for most workloads.
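
To make the scenario concrete, here is a rough toy model. It is plain
Python, not kernel code, and it is only a sketch of the accounting
argument: ToySystem, alloc(), free_wc() and the page counts are all made
up for illustration, and it assumes that without a memcg-aware pool the
charge is dropped as soon as WC pages are returned to the pool.

```python
class ToySystem:
    """Toy page accounting: two cgroups sharing one WC page pool."""

    def __init__(self, total_pages, memcg_aware_pool):
        self.free = total_pages
        self.memcg_aware_pool = memcg_aware_pool
        self.limit = {}        # cgroup -> budget in pages
        self.charged = {}      # cgroup -> pages currently charged
        self.pool = {}         # cgroup that freed them -> pooled WC pages
        self.shrink_cost = {}  # cgroup whose allocation forced a shrink -> pages

    def add_cgroup(self, name, limit):
        self.limit[name] = limit
        self.charged[name] = 0
        self.pool[name] = 0
        self.shrink_cost[name] = 0

    def _shrink(self, owner, pages, payer):
        # Return up to `pages` pooled pages of `owner` to the free list.
        pages = min(pages, self.pool[owner])
        self.pool[owner] -= pages
        self.free += pages
        if self.memcg_aware_pool:
            self.charged[owner] -= pages  # pooled pages were still charged
        self.shrink_cost[payer] += pages
        return pages

    def alloc(self, name, pages):
        over = self.charged[name] + pages - self.limit[name]
        if over > 0 and self.memcg_aware_pool:
            # Per-cgroup reclaim: shrink only this cgroup's own pooled pages.
            over -= self._shrink(name, over, payer=name)
        if over > 0:
            raise MemoryError(f"{name} is over its budget")
        # Global pressure: shrink anyone's pooled pages until there is room.
        short = pages - self.free
        for owner in list(self.pool):
            if short <= 0:
                break
            short -= self._shrink(owner, short, payer=name)
        if short > 0:
            raise MemoryError("system is out of pages")
        self.free -= pages
        self.charged[name] += pages

    def free_wc(self, name, pages):
        # Freed WC pages go to the pool, not back to the free list.
        self.pool[name] += pages
        if not self.memcg_aware_pool:
            self.charged[name] -= pages  # assumption: charge is dropped here


def run(memcg_aware_pool):
    s = ToySystem(total_pages=250, memcg_aware_pool=memcg_aware_pool)
    s.add_cgroup("A", limit=100)
    s.add_cgroup("B", limit=100)
    s.alloc("A", 100)    # A spends its whole budget on WC memory
    s.free_wc("A", 100)  # ... and frees it into the pool
    s.alloc("A", 100)    # A spends its whole budget again, non-WC
    s.alloc("B", 100)    # B stays within its own budget
    return s.shrink_cost
```

In this model, run(False) leaves B triggering the global shrink of A's
pooled pages, while run(True) makes A shrink its own pooled pages before
its second allocation can succeed.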
I'm not just adding memcg awareness to the pool though; that is just
for completeness. I'm adding memcg awareness to all GPU system memory
allocations and making sure swapout works (which it does); swapin
probably needs more work.

The future work is integrating the TTM swap mechanisms with memcg to get
it right.
> >
> > Accounting at the resource level makes stuff better, but I don't
> > believe after implementing it that it is consistent with solving the
> > overall problem.
>
> Exactly that's my point. See accounting is no problem at all, that can be done on any possible level.
>
> What is tricky is shrinking, e.g. either core MM or memcg asking to reduce the usage of memory and moving things into swap.
>
> And that can only be done either on the resource level or the tt object, but not the pool level.
I understand we have to add more code at the tt level, and that's fine.
I just don't see why you think starting at the bottom level is wrong.
It clearly has a use, and it's just cleaning up and preparing the
levels so we can move up and solve the next problem.
> The whole TTM pool is to aid a 28 year old HW design which has no practical relevance on modern systems and we should really not touch that in any way possible.
Modern systems are still using it, I'm still seeing WC allocations,
they still seem to have some cost associated with them on x86-64, they
certainly aren't free. I don't care if they aren't practical, but if
they are a way to route around container containment, they need to be
fixed.
Dave.