From: "Christian König" <christian.koenig@amd.com>
To: David Airlie <airlied@redhat.com>
Cc: Dave Airlie <airlied@gmail.com>,
	dri-devel@lists.freedesktop.org, linux-mm@kvack.org,
	Johannes Weiner <hannes@cmpxchg.org>,
	Dave Chinner <david@fromorbit.com>,
	Kairui Song <kasong@tencent.com>
Subject: Re: [PATCH 13/18] ttm/pool: enable memcg tracking and shrinker. (v2)
Date: Mon, 4 Aug 2025 11:22:05 +0200
Message-ID: <903cbf42-2fde-4e38-89e4-2d7287b845bf@amd.com>
In-Reply-To: <CAMwc25pyqhcq-8ubGZT5UX5AYroewBYP6oFN-JmjzEkHgFLTrg@mail.gmail.com>

Sorry for the delayed response, just back from vacation.

On 22.07.25 01:16, David Airlie wrote:
>>>> @@ -162,7 +164,10 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
>>>>               p = alloc_pages_node(pool->nid, gfp_flags, order);
>>>>               if (p) {
>>>>                       p->private = order;
>>>> -                     mod_node_page_state(NODE_DATA(page_to_nid(p)), NR_GPU_ACTIVE, (1 << order));
>>>> +                     if (!mem_cgroup_charge_gpu_page(objcg, p, order, gfp_flags, false)) {
>>>
>>> Thinking more about it, that is way too late. At this point we can't fail the allocation any more.
>>>
>>
>> I've tested it at least works, but there is a bit of a problem with
>> it, because if we fail a 10 order allocation, it tries to fallback
>> down the order hierarchy, when there is no point since it can't
>> account the maximum size.
>>
>>> Otherwise we either completely break suspend or don't account system allocations to the correctly any more after resume.
>>
>> When you say suspend here, do you mean for VRAM allocations, normal
>> system RAM allocations which are accounted here shouldn't have any
>> effect on suspend/resume since they stay where they are. Currently it
>> also doesn't try account for evictions at all.

Good point, I was not considering moves during suspend as evictions. But from the code flow that should indeed work for now.

What I meant is that after resume BOs are usually not moved back into VRAM immediately. Filling VRAM is rate limited to allow quick response of desktop applications after resume.

So at least temporarily we hopelessly overcommit system memory after resume. But that problem potentially goes into the same bucket as general eviction.

> I've just traced the global swapin/out paths as well and those seem
> fine for memcg at this point, since they are called only after
> populate/unpopulate. Now I haven't addressed the new xe swap paths,
> because I don't have a test path, since amdgpu doesn't support those,
> I was thinking I'd leave it on the list for when amdgpu goes to that
> path, or I can spend some time on xe.

I would really prefer that, before we commit this, we have patches for both amdgpu and XE which at least demonstrate the functionality.

We are essentially defining uAPI here, and if we get that wrong we can't change it any more once people start depending on it.

> 
> Dave.
> 
>>>
>>> What we need is to reserve the memory on BO allocation and commit it when the TT backend is populated.
>>
>> I'm not sure what reserve vs commit is here, mem cgroup is really just
>> reserve until you can reserve no more, it's just a single
>> charge/uncharge stage. If we try and charge and we are over the limit,
>> bad things will happen, either fail allocation or reclaim for the
>> cgroup.

Yeah, exactly that is what I think is highly problematic.

When the allocation of a buffer for an application fails in the display server, you basically open up the possibility of a denial of service.

E.g. imagine that an application allocates a 4GiB BO while its cgroup says it can only allocate 2GiB. That will work because the backing store is only allocated later. Now send that BO to the display server, and the command submission in the display server will fail with -ENOMEM because we exceed the cgroup limit of the application.
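
To make the scenario concrete, here is a minimal userspace sketch of charging at populate time. It is a toy model, not TTM or memcg code: `toy_cgroup`, `toy_bo`, `toy_bo_create()` and `toy_bo_populate()` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory cgroup: a limit and a running charge. */
struct toy_cgroup {
	uint64_t limit;
	uint64_t charged;
};

/* Hypothetical stand-in for a BO whose backing store is allocated later. */
struct toy_bo {
	struct toy_cgroup *owner;	/* cgroup of the creating application */
	uint64_t size;
	bool populated;
};

/* Creation never looks at the cgroup, so it always succeeds. */
static void toy_bo_create(struct toy_bo *bo, struct toy_cgroup *owner,
			  uint64_t size)
{
	bo->owner = owner;
	bo->size = size;
	bo->populated = false;
}

/*
 * The backing store is charged to the owner on first use, no matter which
 * process triggers it (e.g. a command submission in the display server).
 */
static int toy_bo_populate(struct toy_bo *bo)
{
	struct toy_cgroup *cg = bo->owner;

	assert(cg->charged <= cg->limit);
	if (bo->populated)
		return 0;
	if (bo->size > cg->limit - cg->charged)
		return -ENOMEM;	/* surfaces in whoever called populate */
	cg->charged += bo->size;
	bo->populated = true;
	return 0;
}
```

With a 2GiB limit and a 4GiB BO, `toy_bo_create()` succeeds in the application, and the -ENOMEM only shows up in whichever process later calls `toy_bo_populate()`.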

As far as I can see we also need to limit how much an application can overcommit by creating BOs without backing store.
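
A rough sketch of the reserve/commit split, in its strictest form where no overcommit is allowed at all. This is only a userspace toy under stated assumptions: `toy_account`, `toy_reserve()`, `toy_commit()` and `toy_unreserve()` are hypothetical and do not exist in TTM or memcg, and a real design might well allow some bounded overcommit instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical accounting state for one cgroup. */
struct toy_account {
	uint64_t limit;		/* cgroup memory limit */
	uint64_t committed;	/* backing store actually allocated */
	uint64_t reserved;	/* promised to BOs without backing store yet */
};

/* At BO creation: fail early, inside the application that overcommits. */
static int toy_reserve(struct toy_account *acc, uint64_t size)
{
	assert(acc->committed + acc->reserved <= acc->limit);
	if (size > acc->limit - acc->committed - acc->reserved)
		return -ENOMEM;
	acc->reserved += size;
	return 0;
}

/* At populate: turn a reservation into a charge; this step cannot fail. */
static void toy_commit(struct toy_account *acc, uint64_t size)
{
	assert(size <= acc->reserved);
	acc->reserved -= size;
	acc->committed += size;
}

/* When a BO is destroyed before it was ever populated. */
static void toy_unreserve(struct toy_account *acc, uint64_t size)
{
	assert(size <= acc->reserved);
	acc->reserved -= size;
}
```

In this model the 4GiB BO from the example is rejected at creation time under a 2GiB limit, so the failure never reaches the display server.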

Alternatively, disallow creating BOs without backing store, but that is a uAPI change and will break at least some use cases.

Regards,
Christian.

>>
>> Regards,
>> Dave.
> 