public inbox for cgroups@vger.kernel.org
From: Shakeel Butt <shakeel.butt@linux.dev>
To: "Christian König" <christian.koenig@amd.com>
Cc: Dave Airlie <airlied@gmail.com>,
	dri-devel@lists.freedesktop.org,  tj@kernel.org,
	Johannes Weiner <hannes@cmpxchg.org>,
	 Michal Hocko <mhocko@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	 Muchun Song <muchun.song@linux.dev>,
	cgroups@vger.kernel.org, Dave Chinner <david@fromorbit.com>,
	 Waiman Long <longman@redhat.com>,
	simona@ffwll.ch, tjmercier@google.com
Subject: Re: [PATCH 07/16] memcg: add support for GPU page counters. (v4)
Date: Mon, 2 Mar 2026 09:16:59 -0800	[thread overview]
Message-ID: <aaXEDLpXLROBO7To@linux.dev> (raw)
In-Reply-To: <63dccd9c-f2e5-421e-ac3a-a7c13cec9121@amd.com>

On Mon, Mar 02, 2026 at 04:51:12PM +0100, Christian König wrote:
> On 3/2/26 16:40, Shakeel Butt wrote:
> > +TJ
> > 
> > On Mon, Mar 02, 2026 at 03:37:37PM +0100, Christian König wrote:
> >> On 3/2/26 15:15, Shakeel Butt wrote:
> >>> On Wed, Feb 25, 2026 at 10:09:55AM +0100, Christian König wrote:
> >>>> On 2/24/26 20:28, Dave Airlie wrote:
> >>> [...]
> >>>>
> >>>>> This has been a pain in the ass for desktop for years, and I'd like to
> >>>>> fix it; the HPC use case is purely a driver for me doing the work.
> >>>>
> >>>> Wait a second. How does accounting to cgroups help with that in any way?
> >>>>
> >>>> The last time I looked into this problem the OOM killer worked based on the per task_struct stats which couldn't be influenced this way.
> >>>>
> >>>
> >>> It depends on the context of the oom-killer. If the oom-killer is triggered due
> >>> to memcg limits then only the processes in the scope of that memcg will be
> >>> targeted by the oom-killer. With memory.oom.group enabled, the oom-killer will
> >>> kill all the processes in the target memcg.
> >>>
> >>> However, nowadays userspace oom-killers are preferred over the kernel
> >>> oom-killer due to their flexibility and configurability. Userspace oom-killers
> >>> like systemd-oomd, Android's LMKD or fb-oomd are being used in containerized
> >>> environments. Such oom-killers look at memcg stats, so hiding something
> >>> from memcg, i.e. not charging it to a memcg, will hide that usage from these
> >>> oom-killers.
> >>
> >> Well exactly that's the problem. Android's oom killer is *not* using memcg exactly because of this inflexibility.
> > 
> > Are you sure Android's oom killer is not using memcg? From what I see in the
> > documentation [1], it requires memcg.
> 
> My bad, I should have been wording that better.
> 
> The Android OOM killer is not using memcg for tracking GPU memory allocations, because memcg doesn't have proper support for tracking shared buffers.

Yes indeed memcg is bad with buffers shared between memcgs (shmem, shared
filesystems).
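To illustrate the mismatch, here is a toy model of "first toucher pays" charging (a simplification for discussion, not the kernel's actual implementation; the cgroup names and the charging rule are illustrative assumptions):

```python
# Toy model of memcg "first toucher pays" charging, to illustrate why
# buffers shared across cgroups are hard to account sensibly.
# This is a sketch, not kernel code.

class Memcg:
    def __init__(self, name):
        self.name = name
        self.charged = 0  # bytes charged to this cgroup

class SharedBuffer:
    def __init__(self, size):
        self.size = size
        self.owner = None  # the memcg that got charged

    def map_into(self, memcg):
        # Only the first cgroup to touch the buffer is charged;
        # later users reference it without being charged.
        if self.owner is None:
            self.owner = memcg
            memcg.charged += self.size

# Hypothetical processes: one allocates the buffer, another uses it.
allocator = Memcg("compositor")
user = Memcg("app")

buf = SharedBuffer(64 << 20)  # a 64 MiB GPU buffer
buf.map_into(allocator)       # the allocator is charged 64 MiB
buf.map_into(user)            # the actual heavy user is charged nothing

print(allocator.charged, user.charged)
```

Under this model the charge stays with whoever touched the buffer first, so an oom-killer looking at the "app" cgroup sees none of the GPU memory it is actually keeping alive, which is the mismatch being described here.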

> 
> In other words GPU memory allocations are shared by design and it is the norm that the process which is using it is not the process which has allocated it.

Here the GPU memory can be system memory or the actual memory on GPU, right?

I think I discussed with TJ the possibility of moving the allocations into the
context of the process using them, through a custom fault handler in the GPU
drivers. I don't remember the conclusion but I am assuming that is not possible.

> 
> What we would need (as a start) to handle all of this with memcg would be to account the resources to the process which references them and not the one which allocated them.

Irrespective of the memcg charging decision, one of my requests would be to at
least have global counters for the GPU memory, which this series is adding. That
would be very similar to NR_KERNEL_FILE_PAGES, where we explicitly opt out of
memcg charging but keep the global counter, so the admin can identify the
reasons behind high unaccounted memory on the system.
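Such global counters would surface through the usual /proc/vmstat-style "name value" lines. A sketch of how an admin tool might consume them (the nr_gpu_active / nr_gpu_reclaim names are assumptions based on the series' patch titles, not a settled ABI, and the sample text below is made up):

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style "name value" lines into a dict."""
    stats = {}
    for line in text.splitlines():
        name, _, value = line.partition(" ")
        if value.strip().isdigit():
            stats[name] = int(value.strip())
    return stats

# Hypothetical sample; in practice this would come from
# open("/proc/vmstat").read(). Counter names are assumptions.
sample = """\
nr_free_pages 123456
nr_gpu_active 20480
nr_gpu_reclaim 4096
"""

stats = parse_vmstat(sample)
gpu_pages = stats["nr_gpu_active"] + stats["nr_gpu_reclaim"]
print(gpu_pages * 4096, "bytes of GPU-driver system memory")
```

The point is simply that even memory exempted from memcg charging stays visible system-wide, so "high unaccounted memory" can be attributed to the GPU drivers rather than being invisible.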

> 
> I can give a full list of requirements which would be needed by cgroups to cover all the different use cases, but it basically means tons of extra complexity.
> 
> Regards,
> Christian.
> 
> > 
> > [1] https://source.android.com/docs/core/perf/lmkd
> > 
> >>
> >> See the multiple iterations we already had on that topic. Even including reverting already upstream uAPI.
> >>
> >> The latest incarnation is that BPF is used for this task on Android.
> >>
> >> Regards,
> >> Christian.
> 


Thread overview: 35+ messages
2026-02-24  2:06 drm/ttm/memcg/lru: enable memcg tracking for ttm and amdgpu driver (complete series v5) Dave Airlie
2026-02-24  2:06 ` [PATCH 01/16] mm: add gpu active/reclaim per-node stat counters (v2) Dave Airlie
2026-02-24  2:06 ` [PATCH 02/16] drm/ttm: use gpu mm stats to track gpu memory allocations. (v4) Dave Airlie
2026-02-24  2:06 ` [PATCH 03/16] ttm/pool: port to list_lru. (v2) Dave Airlie
2026-02-24  2:06 ` [PATCH 04/16] ttm/pool: drop numa specific pools Dave Airlie
2026-02-24  2:06 ` [PATCH 05/16] ttm/pool: make pool shrinker NUMA aware (v2) Dave Airlie
2026-02-24  2:06 ` [PATCH 06/16] ttm/pool: track allocated_pages per numa node Dave Airlie
2026-02-24  2:06 ` [PATCH 07/16] memcg: add support for GPU page counters. (v4) Dave Airlie
2026-02-24  7:20   ` kernel test robot
2026-02-24  7:50   ` Christian König
2026-02-24 19:28     ` Dave Airlie
2026-02-25  9:09       ` Christian König
2026-03-02 14:15         ` Shakeel Butt
2026-03-02 14:37           ` Christian König
2026-03-02 15:40             ` Shakeel Butt
2026-03-02 15:51               ` Christian König
2026-03-02 17:16                 ` Shakeel Butt [this message]
2026-03-02 19:36                   ` Christian König
2026-03-05  3:23                     ` Dave Airlie
2026-03-02 19:35                 ` T.J. Mercier
2026-03-03  9:29                   ` Christian König
2026-03-03 17:25                     ` T.J. Mercier
2026-03-05  3:19                   ` Dave Airlie
2026-03-05  9:25                     ` Christian König
2026-03-10  1:27                     ` T.J. Mercier
2026-02-24  2:06 ` [PATCH 08/16] ttm: add a memcg accounting flag to the alloc/populate APIs Dave Airlie
2026-02-24  8:42   ` kernel test robot
2026-02-24  2:06 ` [PATCH 09/16] ttm/pool: initialise the shrinker earlier Dave Airlie
2026-02-24  2:06 ` [PATCH 10/16] ttm: add objcg pointer to bo and tt (v2) Dave Airlie
2026-02-24  2:06 ` [PATCH 11/16] ttm/pool: enable memcg tracking and shrinker. (v3) Dave Airlie
2026-02-24  2:06 ` [PATCH 12/16] ttm: hook up memcg placement flags Dave Airlie
2026-02-24  2:06 ` [PATCH 13/16] memcontrol: allow objcg api when memcg is config off Dave Airlie
2026-02-24  2:06 ` [PATCH 14/16] amdgpu: add support for memory cgroups Dave Airlie
2026-02-24  2:06 ` [PATCH 15/16] ttm: add support for a module option to disable memcg integration Dave Airlie
2026-02-24  2:06 ` [PATCH 16/16] xe: create a flag to enable memcg accounting for XE as well Dave Airlie
