References: <20250714003207.113275-1-npache@redhat.com>
 <20250714003207.113275-6-npache@redhat.com>
 <5ff595db-3720-4ce3-8d92-5f08d0625c75@redhat.com>
In-Reply-To: <5ff595db-3720-4ce3-8d92-5f08d0625c75@redhat.com>
From: Nico Pache <npache@redhat.com>
Date: Thu, 17 Jul 2025 01:23:12 -0600
Subject: Re: [PATCH v9 05/14] khugepaged: generalize __collapse_huge_page_* for mTHP support
To: David Hildenbrand
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-trace-kernel@vger.kernel.org, ziy@nvidia.com, baolin.wang@linux.alibaba.com,
 lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, ryan.roberts@arm.com,
 dev.jain@arm.com, corbet@lwn.net, rostedt@goodmis.org, mhiramat@kernel.org,
 mathieu.desnoyers@efficios.com, akpm@linux-foundation.org, baohua@kernel.org,
 willy@infradead.org, peterx@redhat.com, wangkefeng.wang@huawei.com,
 usamaarif642@gmail.com, sunnanyong@huawei.com, vishal.moola@gmail.com,
 thomas.hellstrom@linux.intel.com, yang@os.amperecomputing.com,
 kirill.shutemov@linux.intel.com, aarcange@redhat.com, raquini@redhat.com,
 anshuman.khandual@arm.com, catalin.marinas@arm.com, tiwai@suse.de,
 will@kernel.org, dave.hansen@linux.intel.com, jack@suse.cz, cl@gentwo.org,
 jglisse@google.com, surenb@google.com, zokeefe@google.com, hannes@cmpxchg.org,
 rientjes@google.com, mhocko@suse.com, rdunlap@infradead.org, hughd@google.com
Content-Type: text/plain; charset="UTF-8"
On Wed, Jul 16, 2025 at 8:03 AM David Hildenbrand wrote:
>
> On 14.07.25 02:31, Nico Pache wrote:
> > generalize the order of the __collapse_huge_page_* functions
> > to support future mTHP collapse.
> >
> > mTHP collapse can suffer from inconsistent behavior and memory waste
> > "creep". Disable swapin and shared support for mTHP collapse.
> >
> > No functional changes in this patch.
> >
> > Reviewed-by: Baolin Wang
> > Co-developed-by: Dev Jain
> > Signed-off-by: Dev Jain
> > Signed-off-by: Nico Pache
> > ---
> >   mm/khugepaged.c | 49 +++++++++++++++++++++++++++++++-------------------
> >   1 file changed, 31 insertions(+), 18 deletions(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index cc9a35185604..ee54e3c1db4e 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -552,15 +552,17 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> >                                          unsigned long address,
> >                                          pte_t *pte,
> >                                          struct collapse_control *cc,
> > -                                        struct list_head *compound_pagelist)
> > +                                        struct list_head *compound_pagelist,
> > +                                        u8 order)
> >   {
> >          struct page *page = NULL;
> >          struct folio *folio = NULL;
> >          pte_t *_pte;
> >          int none_or_zero = 0, shared = 0, result = SCAN_FAIL, referenced = 0;
> >          bool writable = false;
> > +        int scaled_none = khugepaged_max_ptes_none >> (HPAGE_PMD_ORDER - order);
> >
> > -        for (_pte = pte; _pte < pte + HPAGE_PMD_NR;
> > +        for (_pte = pte; _pte < pte + (1 << order);
> >               _pte++, address += PAGE_SIZE) {
> >                  pte_t pteval = ptep_get(_pte);
> >                  if (pte_none(pteval) || (pte_present(pteval) &&
> > @@ -568,7 +570,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> >                          ++none_or_zero;
> >                          if (!userfaultfd_armed(vma) &&
> >                              (!cc->is_khugepaged ||
> > -                             none_or_zero <= khugepaged_max_ptes_none)) {
> > +                             none_or_zero <= scaled_none)) {
> >                                  continue;
> >                          } else {
> >                                  result = SCAN_EXCEED_NONE_PTE;
> > @@ -596,8 +598,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> >                  /* See hpage_collapse_scan_pmd(). */
> >                  if (folio_maybe_mapped_shared(folio)) {
> >                          ++shared;
> > -                        if (cc->is_khugepaged &&
> > -                            shared > khugepaged_max_ptes_shared) {
> > +                        if (order != HPAGE_PMD_ORDER || (cc->is_khugepaged &&
> > +                            shared > khugepaged_max_ptes_shared)) {
> >                                  result = SCAN_EXCEED_SHARED_PTE;
> >                                  count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
> >                                  goto out;
> > @@ -698,13 +700,14 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte,
> >                                                  struct vm_area_struct *vma,
> >                                                  unsigned long address,
> >                                                  spinlock_t *ptl,
> > -                                                struct list_head *compound_pagelist)
> > +                                                struct list_head *compound_pagelist,
> > +                                                u8 order)
> >   {
> >          struct folio *src, *tmp;
> >          pte_t *_pte;
> >          pte_t pteval;
> >
> > -        for (_pte = pte; _pte < pte + HPAGE_PMD_NR;
> > +        for (_pte = pte; _pte < pte + (1 << order);
> >               _pte++, address += PAGE_SIZE) {
> >                  pteval = ptep_get(_pte);
> >                  if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
> > @@ -751,7 +754,8 @@ static void __collapse_huge_page_copy_failed(pte_t *pte,
> >                                               pmd_t *pmd,
> >                                               pmd_t orig_pmd,
> >                                               struct vm_area_struct *vma,
> > -                                             struct list_head *compound_pagelist)
> > +                                             struct list_head *compound_pagelist,
> > +                                             u8 order)
> >   {
> >          spinlock_t *pmd_ptl;
> >
> > @@ -768,7 +772,7 @@ static void __collapse_huge_page_copy_failed(pte_t *pte,
> >           * Release both raw and compound pages isolated
> >           * in __collapse_huge_page_isolate.
> >           */
> > -        release_pte_pages(pte, pte + HPAGE_PMD_NR, compound_pagelist);
> > +        release_pte_pages(pte, pte + (1 << order), compound_pagelist);
> >   }
> >
> >   /*
> > @@ -789,7 +793,7 @@ static void __collapse_huge_page_copy_failed(pte_t *pte,
> >   static int __collapse_huge_page_copy(pte_t *pte, struct folio *folio,
> >                  pmd_t *pmd, pmd_t orig_pmd, struct vm_area_struct *vma,
> >                  unsigned long address, spinlock_t *ptl,
> > -                struct list_head *compound_pagelist)
> > +                struct list_head *compound_pagelist, u8 order)
> >   {
> >          unsigned int i;
> >          int result = SCAN_SUCCEED;
> > @@ -797,7 +801,7 @@ static int __collapse_huge_page_copy(pte_t *pte, struct folio *folio,
> >          /*
> >           * Copying pages' contents is subject to memory poison at any iteration.
> >           */
> > -        for (i = 0; i < HPAGE_PMD_NR; i++) {
> > +        for (i = 0; i < (1 << order); i++) {
> >                  pte_t pteval = ptep_get(pte + i);
> >                  struct page *page = folio_page(folio, i);
> >                  unsigned long src_addr = address + i * PAGE_SIZE;
> > @@ -816,10 +820,10 @@ static int __collapse_huge_page_copy(pte_t *pte, struct folio *folio,
> >
> >          if (likely(result == SCAN_SUCCEED))
> >                  __collapse_huge_page_copy_succeeded(pte, vma, address, ptl,
> > -                                                    compound_pagelist);
> > +                                                    compound_pagelist, order);
> >          else
> >                  __collapse_huge_page_copy_failed(pte, pmd, orig_pmd, vma,
> > -                                                 compound_pagelist);
> > +                                                 compound_pagelist, order);
> >
> >          return result;
> >   }
> > @@ -994,11 +998,11 @@ static int check_pmd_still_valid(struct mm_struct *mm,
> >   static int __collapse_huge_page_swapin(struct mm_struct *mm,
> >                                          struct vm_area_struct *vma,
> >                                          unsigned long haddr, pmd_t *pmd,
> > -                                        int referenced)
> > +                                        int referenced, u8 order)
> >   {
> >          int swapped_in = 0;
> >          vm_fault_t ret = 0;
> > -        unsigned long address, end = haddr + (HPAGE_PMD_NR * PAGE_SIZE);
> > +        unsigned long address, end = haddr + (PAGE_SIZE << order);
> >          int result;
> >          pte_t *pte = NULL;
> >          spinlock_t *ptl;
> > @@ -1029,6 +1033,15 @@ static int __collapse_huge_page_swapin(struct mm_struct *mm,
> >                  if (!is_swap_pte(vmf.orig_pte))
> >                          continue;
> >
> > +                /* Dont swapin for mTHP collapse */
> > +                if (order != HPAGE_PMD_ORDER) {
> > +                        count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_SWAP);
>
> Doesn't compile. This is introduced way later in this series.

Whoops I stupidly applied this fixup to the wrong commit.

> Using something like
>
> git rebase -i mm/mm-unstable --exec "make -j16"

Ah I remember you showing me this in the past! Need to start using it
more -- Thank you.

> You can efficiently make sure that individual patches compile cleanly.
>
> --
> Cheers,
>
> David / dhildenb
>
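As an aside, the scaled_none shift in the quoted patch can be sanity-checked numerically. The sketch below assumes x86-64 with 4 KiB base pages (HPAGE_PMD_ORDER is 9) and the default khugepaged_max_ptes_none of 511, i.e. HPAGE_PMD_NR - 1; the order values 6 and 4 are just illustrative mTHP sizes:

```shell
# Sketch: how khugepaged_max_ptes_none >> (HPAGE_PMD_ORDER - order)
# behaves for smaller (mTHP) collapse orders. Assumes x86-64/4K defaults:
# HPAGE_PMD_ORDER = 9, khugepaged_max_ptes_none = 511.
max_ptes_none=511
pmd_order=9
for order in 9 6 4; do
    scaled=$(( max_ptes_none >> (pmd_order - order) ))
    echo "order $order: $((1 << order)) PTEs, up to $scaled may be pte_none"
done
# The shift preserves the default "all but one PTE may be none" ratio at
# every order: 511, 63 and 15 for 512-, 64- and 16-PTE regions.
```

Note that the shift truncates, so a tuned max_ptes_none that is not of the form 2^k - 1 only scales approximately at lower orders.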
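The rebase tip David mentions can be tried end to end. The sketch below uses a throwaway repository and `echo build OK` as a stand-in for the real build; against a kernel tree the base and command would be the ones David gave, e.g. `git rebase -i mm/mm-unstable --exec "make -j16"`:

```shell
# Sketch of a per-patch build check with git rebase --exec. A scratch
# repo stands in for the kernel tree; rebase replays each commit and runs
# the --exec command after it, stopping at the first commit that fails.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q -b main .
git config user.name demo
git config user.email demo@example.com
echo base > file && git add file && git commit -qm "base"
echo one >> file && git commit -qam "patch 1"
echo two >> file && git commit -qam "patch 2"
# Replay the last two commits, running the check after each one.
git rebase --exec "echo build OK" HEAD~2
```

Without `-i`, `--exec` still uses the interactive machinery but does not open the todo list for editing, so this is scriptable; in a real series a failing `make` leaves the rebase stopped at the offending commit.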