From: Nico Pache
Date: Fri, 18 Jul 2025 15:00:25 -0600
Subject: Re: [PATCH v9 13/14] khugepaged: add per-order mTHP khugepaged stats
To: Baolin Wang
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
    david@redhat.com, ziy@nvidia.com, lorenzo.stoakes@oracle.com,
    Liam.Howlett@oracle.com, ryan.roberts@arm.com, dev.jain@arm.com,
    corbet@lwn.net, rostedt@goodmis.org, mhiramat@kernel.org,
    mathieu.desnoyers@efficios.com, akpm@linux-foundation.org,
    baohua@kernel.org, willy@infradead.org, peterx@redhat.com,
    wangkefeng.wang@huawei.com, usamaarif642@gmail.com,
    sunnanyong@huawei.com, vishal.moola@gmail.com,
    thomas.hellstrom@linux.intel.com, yang@os.amperecomputing.com,
    kirill.shutemov@linux.intel.com, aarcange@redhat.com,
    raquini@redhat.com, anshuman.khandual@arm.com,
    catalin.marinas@arm.com, tiwai@suse.de, will@kernel.org,
    dave.hansen@linux.intel.com, jack@suse.cz, cl@gentwo.org,
    jglisse@google.com, surenb@google.com, zokeefe@google.com,
    hannes@cmpxchg.org, rientjes@google.com, mhocko@suse.com,
    rdunlap@infradead.org, hughd@google.com
References: <20250714003207.113275-1-npache@redhat.com>
    <20250714003207.113275-14-npache@redhat.com>
    <94c8899a-f116-4b6a-94d3-f8295ee3f535@linux.alibaba.com>
In-Reply-To: <94c8899a-f116-4b6a-94d3-f8295ee3f535@linux.alibaba.com>

On Thu, Jul 17, 2025 at 11:05 PM Baolin Wang wrote:
>
>
>
> On 2025/7/14 08:32, Nico Pache wrote:
> > With mTHP support inplace, let add the per-order mTHP stats for
> > exceeding NONE, SWAP, and SHARED.
> >
> > Signed-off-by: Nico Pache
> > ---
> >  Documentation/admin-guide/mm/transhuge.rst | 17 +++++++++++++++++
> >  include/linux/huge_mm.h                    |  3 +++
> >  mm/huge_memory.c                           |  7 +++++++
> >  mm/khugepaged.c                            | 15 ++++++++++++---
> >  4 files changed, 39 insertions(+), 3 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> > index 2c523dce6bc7..28c8af61efba 100644
> > --- a/Documentation/admin-guide/mm/transhuge.rst
> > +++ b/Documentation/admin-guide/mm/transhuge.rst
> > @@ -658,6 +658,23 @@ nr_anon_partially_mapped
> >         an anonymous THP as "partially mapped" and count it here, even though it
> >         is not actually partially mapped anymore.
> >
> > +collapse_exceed_swap_pte
> > +       The number of anonymous THP which contain at least one swap PTE.
> > +       Currently khugepaged does not support collapsing mTHP regions that
> > +       contain a swap PTE.
> > +
> > +collapse_exceed_none_pte
> > +       The number of anonymous THP which have exceeded the none PTE threshold.
> > +       With mTHP collapse, a bitmap is used to gather the state of a PMD region
> > +       and is then recursively checked from largest to smallest order against
> > +       the scaled max_ptes_none count. This counter indicates that the next
> > +       enabled order will be checked.
> > +
> > +collapse_exceed_shared_pte
> > +       The number of anonymous THP which contain at least one shared PTE.
> > +       Currently khugepaged does not support collapsing mTHP regions that
> > +       contain a shared PTE.
> > +
> >  As the system ages, allocating huge pages may be expensive as the
> >  system uses memory compaction to copy data around memory to free a
> >  huge page for use. There are some counters in ``/proc/vmstat`` to help
> > diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> > index 4042078e8cc9..e0a27f80f390 100644
> > --- a/include/linux/huge_mm.h
> > +++ b/include/linux/huge_mm.h
> > @@ -141,6 +141,9 @@ enum mthp_stat_item {
> >         MTHP_STAT_SPLIT_DEFERRED,
> >         MTHP_STAT_NR_ANON,
> >         MTHP_STAT_NR_ANON_PARTIALLY_MAPPED,
> > +       MTHP_STAT_COLLAPSE_EXCEED_SWAP,
> > +       MTHP_STAT_COLLAPSE_EXCEED_NONE,
> > +       MTHP_STAT_COLLAPSE_EXCEED_SHARED,
> >         __MTHP_STAT_COUNT
> >  };
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index e2ed9493df77..57e5699cf638 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -632,6 +632,10 @@ DEFINE_MTHP_STAT_ATTR(split_failed, MTHP_STAT_SPLIT_FAILED);
> >  DEFINE_MTHP_STAT_ATTR(split_deferred, MTHP_STAT_SPLIT_DEFERRED);
> >  DEFINE_MTHP_STAT_ATTR(nr_anon, MTHP_STAT_NR_ANON);
> >  DEFINE_MTHP_STAT_ATTR(nr_anon_partially_mapped, MTHP_STAT_NR_ANON_PARTIALLY_MAPPED);
> > +DEFINE_MTHP_STAT_ATTR(collapse_exceed_swap_pte, MTHP_STAT_COLLAPSE_EXCEED_SWAP);
> > +DEFINE_MTHP_STAT_ATTR(collapse_exceed_none_pte, MTHP_STAT_COLLAPSE_EXCEED_NONE);
> > +DEFINE_MTHP_STAT_ATTR(collapse_exceed_shared_pte, MTHP_STAT_COLLAPSE_EXCEED_SHARED);
> > +
> >
> >  static struct attribute *anon_stats_attrs[] = {
> >         &anon_fault_alloc_attr.attr,
> > @@ -648,6 +652,9 @@ static struct attribute *anon_stats_attrs[] = {
> >         &split_deferred_attr.attr,
> >         &nr_anon_attr.attr,
> >         &nr_anon_partially_mapped_attr.attr,
> > +       &collapse_exceed_swap_pte_attr.attr,
> > +       &collapse_exceed_none_pte_attr.attr,
> > +       &collapse_exceed_shared_pte_attr.attr,
> >         NULL,
> >  };
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index d0c99b86b304..8a5873d0a23a 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -594,7 +594,10 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> >                                 continue;
> >                         } else {
> >                                 result = SCAN_EXCEED_NONE_PTE;
> > -                               count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
> > +                               if (order == HPAGE_PMD_ORDER)
> > +                                       count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
> > +                               else
> > +                                       count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_NONE);
>
> Please follow the same logic as other mTHP statistics, meaning there is
> no need to filter out PMD-sized orders, because mTHP also supports
> PMD-sized orders. So logic should be:
>
> if (order == HPAGE_PMD_ORDER)
>         count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
>
> count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_NONE);

Good point, I will fix that!

>
> >                                 goto out;
> >                         }
> >                 }
> > @@ -623,8 +626,14 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> >                 /* See khugepaged_scan_pmd(). */
> >                 if (folio_maybe_mapped_shared(folio)) {
> >                         ++shared;
> > -                       if (order != HPAGE_PMD_ORDER || (cc->is_khugepaged &&
> > -                           shared > khugepaged_max_ptes_shared)) {
> > +                       if (order != HPAGE_PMD_ORDER) {
> > +                               result = SCAN_EXCEED_SHARED_PTE;
> > +                               count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_SHARED);
> > +                               goto out;
> > +                       }
>
> Ditto.

Thanks! There is also the SWAP one, which is slightly different: it is
counted during the scan phase and, in the mTHP case, in the swap-in
faulting code. I am not sure whether the scan phase should also increment
the per-order counter for the PMD order, or whether we should leave it as
a general vm_event counter, since it is not attributed to an order during
the scan. I believe the latter is the correct approach: only attribute an
order to it in __collapse_huge_page_swapin(), and only for mTHP collapses.

>
> > +
> > +                       if (cc->is_khugepaged &&
> > +                           shared > khugepaged_max_ptes_shared) {
> >                                 result = SCAN_EXCEED_SHARED_PTE;
> >                                 count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
> >                                 goto out;
>
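
For reference, here is a minimal userspace sketch of the counting rule
suggested in the review above: bump the legacy vm_event only for
PMD-order collapses, and bump the per-order stat for every order, PMD
included. Everything named sketch_* or SKETCH_* is a hypothetical
stand-in rather than the kernel API, and the PMD order of 9 assumes 4K
base pages with a 2M PMD.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel counters; not the real API. */
#define SKETCH_PMD_ORDER 9	/* assumption: 4K base pages, 2M PMD */

enum sketch_stat {
	SKETCH_EXCEED_NONE,
	SKETCH_EXCEED_SHARED,
	SKETCH_STAT_COUNT
};

/* Legacy system-wide counters, only meaningful for PMD-sized collapse. */
static unsigned long sketch_vm_events[SKETCH_STAT_COUNT];

/* Per-order counters, indexed by order and then by stat item. */
static unsigned long sketch_mthp_stats[SKETCH_PMD_ORDER + 1][SKETCH_STAT_COUNT];

/*
 * Counting rule from the review: the PMD order is not filtered out of
 * the per-order stats. The vm_event is incremented in addition to the
 * per-order stat, never instead of it.
 */
static void sketch_count_exceed(unsigned int order, enum sketch_stat item)
{
	assert(order <= SKETCH_PMD_ORDER);
	assert(item < SKETCH_STAT_COUNT);

	if (order == SKETCH_PMD_ORDER)
		sketch_vm_events[item]++;

	sketch_mthp_stats[order][item]++;
}
```

With this shape, a PMD-order failure shows up both in the legacy counter
and in the PMD-order slot of the per-order table, while a smaller order
only touches its own per-order slot.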