Re: drm/ttm/memcg/lru: enable memcg tracking for ttm and amdgpu driver (complete series v2)


From: Dave Airlie <airlied@gmail.com>
To: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: dri-devel@lists.freedesktop.org, linux-mm@kvack.org,
	 Johannes Weiner <hannes@cmpxchg.org>,
	Christian Koenig <christian.koenig@amd.com>,
	 Dave Chinner <david@fromorbit.com>,
	Kairui Song <kasong@tencent.com>
Subject: Re: drm/ttm/memcg/lru: enable memcg tracking for ttm and amdgpu driver (complete series v2)
Date: Wed, 6 Aug 2025 12:39:51 +1000
Message-ID: <CAPM=9twKRN=GXNhyUZEd4b7StCN8WuhG_h6BCZX+ONgSW268=g@mail.gmail.com>
In-Reply-To: <4a45548a-ad37-4778-b6de-1cda7ce258dc@linux.intel.com>

On Tue, 5 Aug 2025 at 20:58, Maarten Lankhorst
<maarten.lankhorst@linux.intel.com> wrote:
>
> Hey,
>
> On 2025-07-14 at 07:18, Dave Airlie wrote:
> > Hi all,
> >
> > This is a repost with some fixes and cleanups.
> >
> > Differences since last posting:
> > 1. Added patch 18: add a module option to allow pooled pages to not be stored in the lru per-memcg
> >    (Requested by Christian Konig)
> > 2. Converged the naming and stats between vmstat and memcg (Suggested by Shakeel Butt)
> > 3. Cleaned up the charge/uncharge code and some other bits.
> >
> > Dave.
> >
> > Original cover letter:
> > tl;dr: start using list_lru/numa/memcg in GPU driver core and amdgpu driver for now.
> >
> > This is a complete series of patches, some of which have been sent before and reviewed,
> > but I want to get the complete picture for others, and try to figure out how best to land this.
> >
> > There are 3 pieces to this:
> > 01->02: add support for global gpu stat counters (previously posted, patch 2 is newer)
> > 03->07: port ttm pools to list_lru for numa awareness
> > 08->14: add memcg stats + gpu apis, then port ttm pools to memcg aware list_lru and shrinker
> > 15->17: enable amdgpu to use new functionality.
> >
> > The biggest difference in the memcg code from previously is that I discovered what
> > obj cgroups were designed for, and I'm reusing the page/objcg integration that
> > already exists, to avoid reinventing that wheel right now.
> >
> > There are some igt-gpu-tools tests I've written at:
> > https://gitlab.freedesktop.org/airlied/igt-gpu-tools/-/tree/amdgpu-cgroups?ref_type=heads
> >
> > One problem is that there are a lot of delayed actions, which probably means the testing
> > needs a bit more robustness, but the tests validate all the basic paths.
> >
> > Regards,
> > Dave.
> >
> Patch below to enable on xe as well, I ran into some issues though when testing.
> After shutting down gdm3/sddm, I ran into a null dereference in mem_cgroup_uncharge_gpu_page()
> from ttm_pool_free_page(), presumably because of the objects that were created without a
> cgroup set. I tried to fix it in mem_cgroup_uncharge_gpu_page() by conditionally calling
> refill_stock(), but that ran into an underflow instead.

There should be a check that memcg is not NULL before calling into
refill. Where are you seeing the underflow?
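
To make the kind of guard I mean concrete, here is a minimal userspace
sketch. It is not the code from the series: the mock_* types and
functions are hypothetical stand-ins, and the only assumption carried
over is that an unaccounted page has no memcg attached. The uncharge
path returns early when nothing was charged, and clears the pointer so
a repeated uncharge cannot decrement the counter twice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's memcg structures. */
struct mock_memcg {
	long gpu_pages;	/* pages currently charged to this group */
};

struct mock_page {
	struct mock_memcg *memcg;	/* NULL if the page was never charged */
};

/* Charge one page; memcg may be NULL for unaccounted allocations. */
static void mock_charge_gpu_page(struct mock_page *page,
				 struct mock_memcg *memcg)
{
	page->memcg = memcg;
	if (memcg)
		memcg->gpu_pages++;
}

/* Uncharge one page; returns 1 if a charge was released, 0 otherwise. */
static int mock_uncharge_gpu_page(struct mock_page *page)
{
	struct mock_memcg *memcg = page->memcg;

	if (!memcg)
		return 0;	/* nothing was charged, so nothing to give back */

	memcg->gpu_pages--;
	page->memcg = NULL;	/* a second uncharge becomes a no-op */
	return 1;
}

/* Walks the accounted and unaccounted paths; returns the final counter. */
long mock_uncharge_demo(void)
{
	struct mock_memcg cg = { .gpu_pages = 0 };
	struct mock_page accounted = { NULL };
	struct mock_page unaccounted = { NULL };

	mock_charge_gpu_page(&accounted, &cg);
	mock_charge_gpu_page(&unaccounted, NULL);

	assert(mock_uncharge_gpu_page(&unaccounted) == 0);
	assert(mock_uncharge_gpu_page(&accounted) == 1);
	assert(mock_uncharge_gpu_page(&accounted) == 0);

	return cg.gpu_pages;
}
```

In this model the counter can only drop for a page that was charged
earlier, so it ends where it started. Whether the real underflow comes
from an unbalanced charge/uncharge pair is a guess until we know where
it triggers.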

Thanks for the patch. I've booted one of my Meteor Lake systems with
this applied, forcing xe, and it seems to be working in my basic
testing so far.

Dave.

>
> Anyway, patch for xe below:
> ----->8-----------
> drm/xe: Enable memcg accounting for TT/system
>
> Create a flag to enable memcg accounting for XE as well.
>
> Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
>
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index 867087c2d1534..fd93374967c9e 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -54,6 +54,7 @@ static const struct ttm_place sys_placement_flags = {
>         .flags = 0,
>  };
>
> +/* TTM_PL_FLAG_MEMCG is not set, those placements are used for eviction */
>  static struct ttm_placement sys_placement = {
>         .num_placement = 1,
>         .placement = &sys_placement_flags,
> @@ -188,6 +189,7 @@ static void try_add_system(struct xe_device *xe, struct xe_bo *bo,
>
>                 bo->placements[*c] = (struct ttm_place) {
>                         .mem_type = XE_PL_TT,
> +                       .flags = TTM_PL_FLAG_MEMCG,
>                 };
>                 *c += 1;
>         }
> @@ -1696,6 +1698,8 @@ static void xe_ttm_bo_destroy(struct ttm_buffer_object *ttm_bo)
>
>  static void xe_gem_object_free(struct drm_gem_object *obj)
>  {
> +       struct xe_bo *bo = gem_to_xe_bo(obj);
> +
>         /* Our BO reference counting scheme works as follows:
>          *
>          * The gem object kref is typically used throughout the driver,
> @@ -1709,8 +1713,9 @@ static void xe_gem_object_free(struct drm_gem_object *obj)
>          * driver ttm callbacks is allowed to use the ttm_buffer_object
>          * refcount directly if needed.
>          */
> -       __xe_bo_vunmap(gem_to_xe_bo(obj));
> -       ttm_bo_put(container_of(obj, struct ttm_buffer_object, base));
> +       __xe_bo_vunmap(bo);
> +       obj_cgroup_put(bo->ttm.objcg);
> +       ttm_bo_put(&bo->ttm);
>  }
>
>  static void xe_gem_object_close(struct drm_gem_object *obj,
> @@ -1951,6 +1956,9 @@ struct xe_bo *___xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
>         placement = (type == ttm_bo_type_sg ||
>                      bo->flags & XE_BO_FLAG_DEFER_BACKING) ? &sys_placement :
>                 &bo->placement;
> +
> +       if (bo->flags & XE_BO_FLAG_ACCOUNTED)
> +               bo->ttm.objcg = get_obj_cgroup_from_current();
>         err = ttm_bo_init_reserved(&xe->ttm, &bo->ttm, type,
>                                    placement, alignment,
>                                    &ctx, NULL, resv, xe_ttm_bo_destroy);
> @@ -2726,7 +2734,7 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data,
>         if (XE_IOCTL_DBG(xe, args->size & ~PAGE_MASK))
>                 return -EINVAL;
>
> -       bo_flags = 0;
> +       bo_flags = XE_BO_FLAG_ACCOUNTED;
>         if (args->flags & DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING)
>                 bo_flags |= XE_BO_FLAG_DEFER_BACKING;
>
> diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
> index 6134d82e80554..e44fc58d9a00f 100644
> --- a/drivers/gpu/drm/xe/xe_bo.h
> +++ b/drivers/gpu/drm/xe/xe_bo.h
> @@ -48,6 +48,7 @@
>  #define XE_BO_FLAG_GGTT2               BIT(22)
>  #define XE_BO_FLAG_GGTT3               BIT(23)
>  #define XE_BO_FLAG_CPU_ADDR_MIRROR     BIT(24)
> +#define XE_BO_FLAG_ACCOUNTED           BIT(25)
>
>  /* this one is trigger internally only */
>  #define XE_BO_FLAG_INTERNAL_TEST       BIT(30)
> diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
> index 540f044bf4255..4db3227d65c04 100644
> --- a/drivers/gpu/drm/xe/xe_lrc.c
> +++ b/drivers/gpu/drm/xe/xe_lrc.c
> @@ -1266,7 +1266,8 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
>         bo_flags = XE_BO_FLAG_VRAM_IF_DGFX(tile) | XE_BO_FLAG_GGTT |
>                    XE_BO_FLAG_GGTT_INVALIDATE;
>         if (vm && vm->xef) /* userspace */
> -               bo_flags |= XE_BO_FLAG_PINNED_LATE_RESTORE;
> +               bo_flags |= XE_BO_FLAG_PINNED_LATE_RESTORE |
> +                           XE_BO_FLAG_ACCOUNTED;
>
>         lrc->bo = xe_bo_create_pin_map(xe, tile, NULL, bo_size,
>                                        ttm_bo_type_kernel,
> diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c
> index 5729e7d3e3356..569035630ffdf 100644
> --- a/drivers/gpu/drm/xe/xe_oa.c
> +++ b/drivers/gpu/drm/xe/xe_oa.c
> @@ -885,7 +885,7 @@ static int xe_oa_alloc_oa_buffer(struct xe_oa_stream *stream, size_t size)
>
>         bo = xe_bo_create_pin_map(stream->oa->xe, stream->gt->tile, NULL,
>                                   size, ttm_bo_type_kernel,
> -                                 XE_BO_FLAG_SYSTEM | XE_BO_FLAG_GGTT);
> +                                 XE_BO_FLAG_SYSTEM | XE_BO_FLAG_GGTT | XE_BO_FLAG_ACCOUNTED);
>         if (IS_ERR(bo))
>                 return PTR_ERR(bo);
>
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index 330cc0f54a3f4..efcd54ab75e92 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -120,7 +120,8 @@ struct xe_pt *xe_pt_create(struct xe_vm *vm, struct xe_tile *tile,
>                    XE_BO_FLAG_IGNORE_MIN_PAGE_SIZE |
>                    XE_BO_FLAG_NO_RESV_EVICT | XE_BO_FLAG_PAGETABLE;
>         if (vm->xef) /* userspace */
> -               bo_flags |= XE_BO_FLAG_PINNED_LATE_RESTORE;
> +               bo_flags |= XE_BO_FLAG_PINNED_LATE_RESTORE |
> +                           XE_BO_FLAG_ACCOUNTED;
>
>         pt->level = level;
>         bo = xe_bo_create_pin_map(vm->xe, tile, vm, SZ_4K,
> diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> index 10c8a1bcb86e8..fdf845bb717e0 100644
> --- a/drivers/gpu/drm/xe/xe_svm.c
> +++ b/drivers/gpu/drm/xe/xe_svm.c
> @@ -700,6 +700,7 @@ static int xe_drm_pagemap_populate_mm(struct drm_pagemap *dpagemap,
>         bo = xe_bo_create_locked(vr->xe, NULL, NULL, end - start,
>                                  ttm_bo_type_device,
>                                  (IS_DGFX(xe) ? XE_BO_FLAG_VRAM(vr) : XE_BO_FLAG_SYSTEM) |
> +                                XE_BO_FLAG_ACCOUNTED |
>                                  XE_BO_FLAG_CPU_ADDR_MIRROR);
>         if (IS_ERR(bo)) {
>                 err = PTR_ERR(bo);
>
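
The reference handling in the quoted patch can be modelled in userspace
as below. This is only a sketch under stated assumptions: every mock_*
name is hypothetical, the flag value simply mirrors BIT(25) from the
patch, and the put helper is written to tolerate NULL so that buffers
created without the accounted flag can go through the same free path.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors BIT(25) from the patch; the name itself is hypothetical. */
#define MOCK_BO_FLAG_ACCOUNTED (1u << 25)

/* Hypothetical stand-ins for an obj cgroup and a buffer object. */
struct mock_objcg {
	int refcount;
};

struct mock_bo {
	unsigned int flags;
	struct mock_objcg *objcg;	/* NULL when the BO is not accounted */
};

static struct mock_objcg *mock_objcg_get(struct mock_objcg *objcg)
{
	if (objcg)
		objcg->refcount++;
	return objcg;
}

/* Tolerates NULL so the free path needs no special case. */
static void mock_objcg_put(struct mock_objcg *objcg)
{
	if (!objcg)
		return;
	assert(objcg->refcount > 0);
	objcg->refcount--;
}

/* Take a reference at creation only when the BO is accounted. */
static void mock_bo_create(struct mock_bo *bo, unsigned int flags,
			   struct mock_objcg *current_cg)
{
	bo->flags = flags;
	bo->objcg = NULL;
	if (flags & MOCK_BO_FLAG_ACCOUNTED)
		bo->objcg = mock_objcg_get(current_cg);
}

/* Drop the reference taken at creation, if any. */
static void mock_bo_free(struct mock_bo *bo)
{
	mock_objcg_put(bo->objcg);
	bo->objcg = NULL;
}

/* Creates one accounted and one unaccounted BO; returns the final refcount. */
int mock_refcount_demo(void)
{
	struct mock_objcg cg = { .refcount = 1 };	/* held by the "task" */
	struct mock_bo accounted, unaccounted;

	mock_bo_create(&accounted, MOCK_BO_FLAG_ACCOUNTED, &cg);
	mock_bo_create(&unaccounted, 0, &cg);
	assert(cg.refcount == 2);	/* only the accounted BO took a ref */

	mock_bo_free(&accounted);
	mock_bo_free(&unaccounted);
	return cg.refcount;
}
```

Each get at creation is matched by exactly one put at free, so the
refcount returns to its starting value once both buffers are gone.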

Thread overview: 30+ messages
2025-07-14  5:18 drm/ttm/memcg/lru: enable memcg tracking for ttm and amdgpu driver (complete series v2) Dave Airlie
2025-07-14  5:18 ` [PATCH 01/18] mm: add gpu active/reclaim per-node stat counters (v2) Dave Airlie
2025-07-14  5:18 ` [PATCH 02/18] drm/ttm: use gpu mm stats to track gpu memory allocations. (v3) Dave Airlie
2025-07-14 18:33   ` Shakeel Butt
2025-07-14  5:18 ` [PATCH 03/18] mm/list_lru: export list_lru_add Dave Airlie
2025-07-14 19:01   ` Shakeel Butt
2025-07-14  5:18 ` [PATCH 04/18] ttm/pool: port to list_lru. (v2) Dave Airlie
2025-07-14  5:18 ` [PATCH 05/18] ttm/pool: drop numa specific pools Dave Airlie
2025-07-14  5:18 ` [PATCH 06/18] ttm/pool: make pool shrinker NUMA aware Dave Airlie
2025-07-14  5:18 ` [PATCH 07/18] ttm/pool: track allocated_pages per numa node Dave Airlie
2025-07-14  5:18 ` [PATCH 08/18] memcg: add support for GPU page counters. (v2) Dave Airlie
2025-07-14  5:18 ` [PATCH 09/18] memcg: export memcg_list_lru_alloc Dave Airlie
2025-07-14  5:18 ` [PATCH 10/18] ttm: add a memcg accounting flag to the alloc/populate APIs Dave Airlie
2025-07-14  5:18 ` [PATCH 11/18] ttm/pool: initialise the shrinker earlier Dave Airlie
2025-07-14  5:18 ` [PATCH 12/18] ttm: add objcg pointer to bo and tt Dave Airlie
2025-07-14  5:18 ` [PATCH 13/18] ttm/pool: enable memcg tracking and shrinker. (v2) Dave Airlie
2025-07-15  7:34   ` Christian König
2025-07-21  5:56     ` David Airlie
2025-07-21 23:16       ` David Airlie
2025-08-04  9:22         ` Christian König
2025-08-06  2:43           ` Dave Airlie
2025-08-06 13:04             ` Christian König
2025-07-14  5:18 ` [PATCH 14/18] ttm: hook up memcg placement flags Dave Airlie
2025-07-14  5:18 ` [PATCH 15/18] memcontrol: allow objcg api when memcg is config off Dave Airlie
2025-07-14  5:18 ` [PATCH 16/18] memcontrol: export current_obj_cgroup Dave Airlie
2025-07-14  5:18 ` [PATCH 17/18] amdgpu: add support for memory cgroups Dave Airlie
2025-07-14  5:18 ` [PATCH 18/18] ttm: add support for a module option to disable memcg pool Dave Airlie
2025-07-14 11:49   ` Christian König
2025-08-05 10:58 ` drm/ttm/memcg/lru: enable memcg tracking for ttm and amdgpu driver (complete series v2) Maarten Lankhorst
2025-08-06  2:39   ` Dave Airlie [this message]
