From: Johannes Weiner <hannes@cmpxchg.org>
To: Nhat Pham <nphamcs@gmail.com>
Cc: akpm@linux-foundation.org, riel@surriel.com, mhocko@kernel.org,
	roman.gushchin@linux.dev, shakeelb@google.com,
	muchun.song@linux.dev, tj@kernel.org, lizefan.x@bytedance.com,
	shuah@kernel.org, mike.kravetz@oracle.com, yosryahmed@google.com,
	fvdl@google.com, linux-mm@kvack.org, kernel-team@meta.com,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Subject: Re: [PATCH v3 2/3] hugetlb: memcg: account hugetlb-backed memory in memory controller
Date: Tue, 3 Oct 2023 08:54:44 -0400
Message-ID: <20231003125444.GB17012@cmpxchg.org>
In-Reply-To: <20231003001828.2554080-3-nphamcs@gmail.com>

On Mon, Oct 02, 2023 at 05:18:27PM -0700, Nhat Pham wrote:
> Currently, hugetlb memory usage is not accounted for in the memory
> controller, which could lead to memory overprotection for cgroups with
> hugetlb-backed memory. This has been observed in our production system.
> 
> For instance, here is one of our use cases: suppose there are two 32G
> containers. The machine is booted with hugetlb_cma=6G, and each
> container may or may not use up to 3 gigantic pages, depending on the
> workload within it. The rest is anon, cache, slab, etc. We can set the
> hugetlb cgroup limit of each cgroup to 3G to enforce hugetlb fairness.
> But it is very difficult to configure memory.max to keep overall
> consumption, including anon, cache, slab, etc., fair.
> 
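As a rough illustration of the scenario above (not part of the patch;
the cgroup paths, the 1G page size, and the byte values are
assumptions), the per-container limits could be pinned like this on
cgroup v2:

/* Illustrative sketch: cap a container at 3 x 1G gigantic pages via
 * the hugetlb controller and set an overall memory.max.  Paths and
 * values are assumptions for the scenario described above. */
#include <stdio.h>

static int write_limit(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	fputs(val, f);
	return fclose(f);
}

int main(void)
{
	/* hugetlb controller: at most 3G of 1G pages for this container */
	write_limit("/sys/fs/cgroup/ct1/hugetlb.1GB.max", "3221225472");
	/* memcg: 32G overall, but without hugetlb accounting this limit
	 * does not see the gigantic pages at all */
	write_limit("/sys/fs/cgroup/ct1/memory.max", "34359738368");
	return 0;
}

Note that memory.max can only meaningfully bound the non-hugetlb part
of the container as long as hugetlb pages are invisible to the memcg.
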
> What we have had to resort to is constantly polling hugetlb usage and
> readjusting memory.max. A similar procedure is applied to other memory
> limits (memory.low, for example). However, this is cumbersome and
> error-prone. Furthermore, when there is a delay in correcting the
> memory limits (e.g. when hugetlb usage changes between consecutive
> runs of the userspace agent), the system can be left in an over- or
> underprotected state.
> 
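The polling workaround described above amounts to an agent roughly
like the following (a minimal sketch under assumed cgroup paths and an
assumed 32G budget; error handling and memory.low adjustment omitted):

/* Minimal sketch of the polling workaround; paths and the 32G budget
 * are assumptions. */
#include <stdio.h>
#include <unistd.h>

#define BUDGET (32ULL << 30)	/* assumed per-container budget: 32G */

static unsigned long long read_ull(const char *path)
{
	unsigned long long v = 0;
	FILE *f = fopen(path, "r");

	if (f) {
		if (fscanf(f, "%llu", &v) != 1)
			v = 0;
		fclose(f);
	}
	return v;
}

int main(void)
{
	for (;;) {
		unsigned long long hugetlb =
			read_ull("/sys/fs/cgroup/ct1/hugetlb.1GB.current");
		FILE *f = fopen("/sys/fs/cgroup/ct1/memory.max", "w");

		if (f) {
			/* leave memory.max with only the non-hugetlb share */
			fprintf(f, "%llu", BUDGET - hugetlb);
			fclose(f);
		}
		sleep(5);	/* hugetlb usage can change in between */
	}
}

Any hugetlb change that lands between two iterations leaves memory.max
stale, which is exactly the over-/underprotection window mentioned
above.
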
> This patch rectifies this issue by charging the memcg when the hugetlb
> folio is utilized, and uncharging when the folio is freed (analogous to
> the hugetlb controller). Note that we do not charge when the folio is
> allocated to the hugetlb pool, because at this point it is not owned by
> any memcg.
> 
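To make the described lifecycle concrete, here is a toy userspace
model of the accounting (my illustration, not the kernel code): pages
entering the hugetlb pool are unaccounted, the charge happens when a
page is actually handed out, and the uncharge happens when it is freed
back to the pool.

#include <stdio.h>

struct memcg {
	long usage;
	long limit;
};

/* pages enter the reserve pool: no memcg is involved yet */
static void pool_grow(long *pool, long nr)
{
	*pool += nr;
}

/* a task faults a hugetlb page in: charge its memcg; failure here is
 * what the real code turns into SIGBUS */
static int page_use(long *pool, struct memcg *cg)
{
	if (*pool <= 0 || cg->usage + 1 > cg->limit)
		return -1;
	(*pool)--;
	cg->usage++;
	return 0;
}

/* the page is freed back to the pool: uncharge the memcg */
static void page_free(long *pool, struct memcg *cg)
{
	(*pool)++;
	cg->usage--;
}

int main(void)
{
	long pool = 0;
	struct memcg cg = { .usage = 0, .limit = 2 };

	pool_grow(&pool, 3);			/* boot-time pool: unaccounted */
	printf("%d\n", page_use(&pool, &cg));	/*  0: charged */
	printf("%d\n", page_use(&pool, &cg));	/*  0: charged */
	printf("%d\n", page_use(&pool, &cg));	/* -1: limit hit, pool not empty */
	page_free(&pool, &cg);			/* uncharged on free */
	return 0;
}
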
> Some caveats to consider:
>   * This feature is only available on cgroup v2.
>   * There is no hugetlb pool management involved in the memory
>     controller. As stated above, hugetlb folios are only charged towards
>     the memory controller when they are used. Host overcommit management
>     has to take this into account when configuring hard limits.
>   * Failure to charge towards the memcg results in SIGBUS. This could
>     happen even if the hugetlb pool still has pages (but the cgroup
>     limit is hit and the reclaim attempt fails); see the sketch after
>     this list.
>   * When this feature is enabled, hugetlb pages contribute to memory
>     reclaim protection. memory.low and memory.min tuning must take
>     hugetlb memory into account.
>   * Hugetlb pages utilized while this option is not selected will not
>     be tracked by the memory controller (even if cgroup v2 is remounted
>     later on).
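From the application side, the SIGBUS caveat looks like an ordinary
hugetlb allocation failure. A sketch (mine, not from the patch; the 2M
page size is an assumption):

/* Touching a MAP_HUGETLB mapping can SIGBUS when no hugetlb page can
 * be provided; with this series, also when the memcg charge fails. */
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

static sigjmp_buf env;

static void on_sigbus(int sig)
{
	(void)sig;
	siglongjmp(env, 1);
}

int main(void)
{
	size_t len = 2UL << 20;		/* one 2M hugetlb page, assumed */
	struct sigaction sa = { .sa_handler = on_sigbus };
	char *p;

	sigemptyset(&sa.sa_mask);
	sigaction(SIGBUS, &sa, NULL);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (p == MAP_FAILED)
		return 1;

	if (sigsetjmp(env, 1) == 0)
		memset(p, 0, len);	/* fault the page in */
	else
		fprintf(stderr, "SIGBUS: pool empty or memcg limit hit\n");

	munmap(p, len);
	return 0;
}
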
> 
> Signed-off-by: Nhat Pham <nphamcs@gmail.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Thread overview: 26+ messages
2023-10-03  0:18 [PATCH v3 0/3] hugetlb memcg accounting Nhat Pham
2023-10-03  0:18 ` [PATCH v3 1/3] memcontrol: add helpers for " Nhat Pham
2023-10-03 11:50   ` Michal Hocko
2023-10-03 12:47   ` Johannes Weiner
2023-10-03  0:18 ` [PATCH v3 2/3] hugetlb: memcg: account hugetlb-backed memory in memory controller Nhat Pham
2023-10-03  0:26   ` Nhat Pham
2023-10-03 12:54   ` Johannes Weiner [this message]
2023-10-03 12:58   ` Michal Hocko
2023-10-03 15:59     ` Johannes Weiner
2023-10-03 17:13   ` Mike Kravetz
2023-10-03 18:01     ` Nhat Pham
2023-10-03 18:39       ` Johannes Weiner
2023-10-03 22:09         ` Nhat Pham
2023-10-03 22:42           ` Mike Kravetz
2023-10-03 23:26             ` Nhat Pham
2023-10-03 23:14     ` [PATCH] memcontrol: only transfer the memcg data for migration Nhat Pham
2023-10-03 23:22       ` Yosry Ahmed
2023-10-03 23:31         ` Nhat Pham
2023-10-03 23:54           ` Yosry Ahmed
2023-10-04  0:02             ` Nhat Pham
2023-10-04  0:02       ` Nhat Pham
2023-10-04 14:17       ` Johannes Weiner
2023-10-04 19:45   ` [PATCH v3 2/3] hugetlb: memcg: account hugetlb-backed memory in memory controller (fix) Nhat Pham
2023-10-06 17:25     ` Andrew Morton
2023-10-06 18:23       ` Nhat Pham
2023-10-03  0:18 ` [PATCH v3 3/3] selftests: add a selftest to verify hugetlb usage in memcg Nhat Pham
