From: Dave Airlie
To: dri-devel@lists.freedesktop.org, linux-mm@kvack.org, Johannes Weiner, Christian Koenig
Cc: Dave Chinner, Kairui Song, Dave Airlie
Subject: [PATCH 13/17] ttm/pool: enable memcg tracking and shrinker.
Date: Mon, 30 Jun 2025 14:49:32 +1000
Message-ID: <20250630045005.1337339-14-airlied@gmail.com>
In-Reply-To: <20250630045005.1337339-1-airlied@gmail.com>
References: <20250630045005.1337339-1-airlied@gmail.com>

From: Dave Airlie

This enables all the backend code to use the list_lru in memcg mode and
sets the shrinker to be memcg aware.

It adds a loop for the case where pooled pages end up being reparented
to a higher memcg, so that a newer memcg can search for them there and
take them back.

Signed-off-by: Dave Airlie
---
 drivers/gpu/drm/ttm/ttm_pool.c | 123 ++++++++++++++++++++++++++++-----
 1 file changed, 105 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index 210f4ac4de67..49e92f40ab23 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -143,7 +143,9 @@ static int ttm_pool_nid(struct ttm_pool *pool) {
 }
 
 /* Allocate pages of size 1 << order with the given gfp_flags */
-static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
+static struct page *ttm_pool_alloc_page(struct ttm_pool *pool,
+					struct obj_cgroup *objcg,
+					gfp_t gfp_flags,
 					unsigned int order)
 {
 	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
@@ -163,6 +165,10 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 		p = alloc_pages_node(pool->nid, gfp_flags, order);
 		if (p) {
 			p->private = order;
+			if (!mem_cgroup_charge_gpu_page(objcg, p, order, gfp_flags, false)) {
+				__free_pages(p, order);
+				return NULL;
+			}
 			mod_node_page_state(NODE_DATA(ttm_pool_nid(pool)), NR_GPU_ACTIVE, (1 << order));
 		}
 		return p;
@@ -214,6 +220,7 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
 #endif
 
 	if (!pool || !pool->use_dma_alloc) {
+		mem_cgroup_uncharge_gpu_page(p, order, reclaim);
 		mod_node_page_state(NODE_DATA(ttm_pool_nid(pool)),
 				    reclaim ? NR_GPU_RECLAIM : NR_GPU_ACTIVE,
 				    -(1 << order));
@@ -303,12 +310,13 @@ static void ttm_pool_type_give(struct ttm_pool_type *pt, struct page *p)
 
 	INIT_LIST_HEAD(&p->lru);
 	rcu_read_lock();
-	list_lru_add(&pt->pages, &p->lru, nid, NULL);
+	list_lru_add(&pt->pages, &p->lru, nid, page_memcg_check(p));
 	rcu_read_unlock();
 
 	atomic_long_add(num_pages, &allocated_pages[nid]);
 	mod_node_page_state(NODE_DATA(nid), NR_GPU_ACTIVE, -num_pages);
 	mod_node_page_state(NODE_DATA(nid), NR_GPU_RECLAIM, num_pages);
+	mem_cgroup_move_gpu_page_reclaim(NULL, p, pt->order, true);
 }
 
 static enum lru_status take_one_from_lru(struct list_head *item,
@@ -323,20 +331,59 @@ static enum lru_status take_one_from_lru(struct list_head *item,
 	return LRU_REMOVED;
 }
 
-/* Take pages from a specific pool_type, return NULL when nothing available */
-static struct page *ttm_pool_type_take(struct ttm_pool_type *pt, int nid)
+static int pool_lru_get_page(struct ttm_pool_type *pt, int nid,
+			     struct page **page_out,
+			     struct obj_cgroup *objcg,
+			     struct mem_cgroup *memcg)
 {
 	int ret;
 	struct page *p = NULL;
 	unsigned long nr_to_walk = 1;
+	unsigned int num_pages = 1 << pt->order;
 
-	ret = list_lru_walk_node(&pt->pages, nid, take_one_from_lru, (void *)&p, &nr_to_walk);
+	ret = list_lru_walk_one(&pt->pages, nid, memcg, take_one_from_lru, (void *)&p, &nr_to_walk);
 	if (ret == 1 && p) {
-		atomic_long_sub(1 << pt->order, &allocated_pages[nid]);
-		mod_node_page_state(NODE_DATA(nid), NR_GPU_ACTIVE, (1 << pt->order));
-		mod_node_page_state(NODE_DATA(nid), NR_GPU_RECLAIM, -(1 << pt->order));
+		atomic_long_sub(num_pages, &allocated_pages[nid]);
+		mod_node_page_state(NODE_DATA(nid), NR_GPU_RECLAIM, -num_pages);
+
+		if (!mem_cgroup_move_gpu_page_reclaim(objcg, p, pt->order, false)) {
+			__free_pages(p, pt->order);
+			p = NULL;
+		}
+		if (p)
+			mod_node_page_state(NODE_DATA(nid), NR_GPU_ACTIVE, num_pages);
 	}
-	return p;
+	*page_out = p;
+	return ret;
+}
+
+/* Take pages from a specific pool_type, return NULL when nothing available */
+static struct page *ttm_pool_type_take(struct ttm_pool_type *pt, int nid,
+				       struct obj_cgroup *orig_objcg)
+{
+	struct page *page_out = NULL;
+	int ret;
+	struct mem_cgroup *orig_memcg = orig_objcg ? get_mem_cgroup_from_objcg(orig_objcg) : NULL;
+	struct mem_cgroup *memcg = orig_memcg;
+
+	/*
+	 * Attempt to get a page from the current memcg, but if it hasn't got any in its level,
+	 * go up to the parent and check there. This helps the scenario where multiple apps get
+	 * started into their own cgroup from a common parent and want to reuse the pools.
+	 */
+	while (!page_out) {
+		ret = pool_lru_get_page(pt, nid, &page_out, orig_objcg, memcg);
+		if (ret == 1)
+			break;
+		if (!memcg)
+			break;
+		memcg = parent_mem_cgroup(memcg);
+		if (!memcg)
+			break;
+	}
+
+	mem_cgroup_put(orig_memcg);
+	return page_out;
 }
 
 /* Initialize and add a pool type to the global shrinker list */
@@ -346,7 +393,7 @@ static void ttm_pool_type_init(struct ttm_pool_type *pt, struct ttm_pool *pool,
 	pt->pool = pool;
 	pt->caching = caching;
 	pt->order = order;
-	list_lru_init(&pt->pages);
+	list_lru_init_memcg(&pt->pages, mm_shrinker);
 
 	spin_lock(&shrinker_lock);
 	list_add_tail(&pt->shrinker_list, &shrinker_list);
@@ -389,6 +436,30 @@ static void ttm_pool_type_fini(struct ttm_pool_type *pt)
 	ttm_pool_dispose_list(pt, &dispose);
 }
 
+static int ttm_pool_check_objcg(struct obj_cgroup *objcg)
+{
+#ifdef CONFIG_MEMCG
+	int r = 0;
+	struct mem_cgroup *memcg;
+	if (!objcg)
+		return 0;
+
+	memcg = get_mem_cgroup_from_objcg(objcg);
+	for (unsigned i = 0; i < NR_PAGE_ORDERS; i++) {
+		r = memcg_list_lru_alloc(memcg, &global_write_combined[i].pages, GFP_KERNEL);
+		if (r) {
+			break;
+		}
+		r = memcg_list_lru_alloc(memcg, &global_uncached[i].pages, GFP_KERNEL);
+		if (r) {
+			break;
+		}
+	}
+	css_put(&memcg->css);
+#endif
+	return 0;
+}
+
 /* Return the pool_type to use for the given caching and order */
 static struct ttm_pool_type *ttm_pool_select_type(struct ttm_pool *pool,
 						  enum ttm_caching caching,
@@ -418,7 +489,9 @@ static struct ttm_pool_type *ttm_pool_select_type(struct ttm_pool *pool,
 }
 
 /* Free pages using the per-node shrinker list */
-static unsigned int ttm_pool_shrink(int nid, unsigned long num_to_free)
+static unsigned int ttm_pool_shrink(int nid,
+				    struct mem_cgroup *memcg,
+				    unsigned long num_to_free)
 {
 	LIST_HEAD(dispose);
 	struct ttm_pool_type *pt;
@@ -430,7 +503,11 @@ static unsigned int ttm_pool_shrink(int nid, unsigned long num_to_free)
 	list_move_tail(&pt->shrinker_list, &shrinker_list);
 	spin_unlock(&shrinker_lock);
 
-	num_pages = list_lru_walk_node(&pt->pages, nid, pool_move_to_dispose_list, &dispose, &num_to_free);
+	if (!memcg) {
+		num_pages = list_lru_walk_node(&pt->pages, nid, pool_move_to_dispose_list, &dispose, &num_to_free);
+	} else {
+		num_pages = list_lru_walk_one(&pt->pages, nid, memcg, pool_move_to_dispose_list, &dispose, &num_to_free);
+	}
 	num_pages *= 1 << pt->order;
 
 	ttm_pool_dispose_list(pt, &dispose);
@@ -595,6 +672,7 @@ static int ttm_pool_restore_commit(struct ttm_pool_tt_restore *restore,
 			 */
 			ttm_pool_split_for_swap(restore->pool, p);
 			copy_highpage(restore->alloced_page + i, p);
+			p->memcg_data = 0;
 			__free_pages(p, 0);
 		}
 
@@ -756,6 +834,7 @@ static int __ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
 	bool allow_pools;
 	struct page *p;
 	int r;
+	struct obj_cgroup *objcg = memcg_account ? tt->objcg : NULL;
 
 	WARN_ON(!alloc->remaining_pages || ttm_tt_is_populated(tt));
 	WARN_ON(alloc->dma_addr && !pool->dev);
@@ -773,6 +852,9 @@ static int __ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
 
 	page_caching = tt->caching;
 	allow_pools = true;
+
+	ttm_pool_check_objcg(objcg);
+
 	for (order = ttm_pool_alloc_find_order(MAX_PAGE_ORDER, alloc);
 	     alloc->remaining_pages;
 	     order = ttm_pool_alloc_find_order(order, alloc)) {
@@ -782,7 +864,7 @@ static int __ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
 		p = NULL;
 		pt = ttm_pool_select_type(pool, page_caching, order);
 		if (pt && allow_pools)
-			p = ttm_pool_type_take(pt, ttm_pool_nid(pool));
+			p = ttm_pool_type_take(pt, ttm_pool_nid(pool), objcg);
 
 		/*
 		 * If that fails or previously failed, allocate from system.
@@ -793,7 +875,7 @@ static int __ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
 		if (!p) {
 			page_caching = ttm_cached;
 			allow_pools = false;
-			p = ttm_pool_alloc_page(pool, gfp_flags, order);
+			p = ttm_pool_alloc_page(pool, objcg, gfp_flags, order);
 		}
 		/* If that fails, lower the order if possible and retry. */
 		if (!p) {
@@ -937,7 +1019,7 @@ void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt)
 
 	while (atomic_long_read(&allocated_pages[nid]) > pool_node_limit[nid]) {
 		unsigned long diff = pool_node_limit[nid] - atomic_long_read(&allocated_pages[nid]);
-		ttm_pool_shrink(nid, diff);
+		ttm_pool_shrink(nid, NULL, diff);
 	}
 }
 EXPORT_SYMBOL(ttm_pool_free);
@@ -1057,6 +1139,7 @@ long ttm_pool_backup(struct ttm_pool *pool, struct ttm_tt *tt,
 			if (flags->purge) {
 				shrunken += num_pages;
 				page->private = 0;
+				page->memcg_data = 0;
 				__free_pages(page, order);
 				memset(tt->pages + i, 0,
 				       num_pages * sizeof(*tt->pages));
@@ -1193,10 +1276,14 @@ static unsigned long ttm_pool_shrinker_scan(struct shrinker *shrink,
 					    struct shrink_control *sc)
 {
 	unsigned long num_freed = 0;
+	int num_pools;
+	spin_lock(&shrinker_lock);
+	num_pools = list_count_nodes(&shrinker_list);
+	spin_unlock(&shrinker_lock);
 
 	do
-		num_freed += ttm_pool_shrink(sc->nid, sc->nr_to_scan);
-	while (num_freed < sc->nr_to_scan &&
+		num_freed += ttm_pool_shrink(sc->nid, sc->memcg, sc->nr_to_scan);
+	while (num_pools-- >= 0 && num_freed < sc->nr_to_scan &&
 	       atomic_long_read(&allocated_pages[sc->nid]));
 
 	sc->nr_scanned = num_freed;
@@ -1388,7 +1475,7 @@ int ttm_pool_mgr_init(unsigned long num_pages)
 	spin_lock_init(&shrinker_lock);
 	INIT_LIST_HEAD(&shrinker_list);
 
-	mm_shrinker = shrinker_alloc(SHRINKER_NUMA_AWARE, "drm-ttm_pool");
+	mm_shrinker = shrinker_alloc(SHRINKER_MEMCG_AWARE | SHRINKER_NUMA_AWARE, "drm-ttm_pool");
 	if (!mm_shrinker)
 		return -ENOMEM;
 
-- 
2.49.0
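
For anyone reading along without the rest of the series, the parent walk in ttm_pool_type_take() can be modelled in userspace roughly as below. This is only a toy sketch under stated assumptions, not kernel code: struct toy_cgroup, its pooled counter and toy_take_page() are hypothetical stand-ins for the memcg, its per-memcg list_lru and the take path, and the sketch ignores charging, NUMA nodes and page orders entirely.

```c
#include <stddef.h>

/* Hypothetical stand-in for a memcg: a parent link plus a count of pooled pages. */
struct toy_cgroup {
	const char *name;
	struct toy_cgroup *parent;
	unsigned int pooled;
};

/*
 * Try to take one pooled page, starting at cg and walking towards the root.
 * Returns the cgroup the page was taken from, or NULL if no level had one.
 */
static struct toy_cgroup *toy_take_page(struct toy_cgroup *cg)
{
	for (; cg; cg = cg->parent) {
		if (cg->pooled) {
			cg->pooled--;
			return cg;
		}
	}
	return NULL;
}
```

With an "app" cgroup that has no pooled pages and a parent holding one, the first toy_take_page(&app) returns the parent and the second returns NULL, which is the point where the real code would fall back to a fresh allocation.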