From: Nico Pache
Date: Wed, 21 May 2025 04:23:09 -0600
Subject: Re: [PATCH v7 06/12] khugepaged: introduce khugepaged_scan_bitmap for mTHP support
To: Baolin Wang, David Rientjes, zokeefe@google.com
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-trace-kernel@vger.kernel.org, david@redhat.com, ziy@nvidia.com,
 lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, ryan.roberts@arm.com,
 dev.jain@arm.com, corbet@lwn.net, rostedt@goodmis.org, mhiramat@kernel.org,
 mathieu.desnoyers@efficios.com, akpm@linux-foundation.org, baohua@kernel.org,
 willy@infradead.org, peterx@redhat.com, wangkefeng.wang@huawei.com,
 usamaarif642@gmail.com, sunnanyong@huawei.com, vishal.moola@gmail.com,
 thomas.hellstrom@linux.intel.com, yang@os.amperecomputing.com,
 kirill.shutemov@linux.intel.com, aarcange@redhat.com, raquini@redhat.com,
 anshuman.khandual@arm.com, catalin.marinas@arm.com, tiwai@suse.de,
 will@kernel.org, dave.hansen@linux.intel.com, jack@suse.cz, cl@gentwo.org,
 jglisse@google.com, surenb@google.com, hannes@cmpxchg.org, mhocko@suse.com,
 rdunlap@infradead.org
References: <20250515032226.128900-1-npache@redhat.com>
 <20250515032226.128900-7-npache@redhat.com>
 <9c54397f-3cbf-4fa2-bf69-ba89613d355f@linux.alibaba.com>

On Tue, May 20, 2025 at 4:09 AM Baolin Wang wrote:
>
> Sorry for late reply.
>
> On 2025/5/17 14:47, Nico Pache wrote:
> > On Thu, May 15, 2025 at 9:20 PM Baolin Wang
> > wrote:
> >>
> >>
> >>
> >> On 2025/5/15 11:22, Nico Pache wrote:
> >>> khugepaged scans anons PMD ranges for potential collapse to a hugepage.
> >>> To add mTHP support we use this scan to instead record chunks of utilized
> >>> sections of the PMD.
> >>>
> >>> khugepaged_scan_bitmap uses a stack struct to recursively scan a bitmap
> >>> that represents chunks of utilized regions. We can then determine what
> >>> mTHP size fits best and in the following patch, we set this bitmap while
> >>> scanning the anon PMD. A minimum collapse order of 2 is used as this is
> >>> the lowest order supported by anon memory.
> >>>
> >>> max_ptes_none is used as a scale to determine how "full" an order must
> >>> be before being considered for collapse.
> >>>
> >>> When attempting to collapse an order that has its order set to "always"
> >>> lets always collapse to that order in a greedy manner without
> >>> considering the number of bits set.
> >>>
> >>> Signed-off-by: Nico Pache
> >>
> >> Sigh. You still haven't addressed or explained the issues I previously
> >> raised [1], so I don't know how to review this patch again...
> > Can you still reproduce this issue?
>
> Yes, I can still reproduce this issue with today's (5/20) mm-new branch.
>
> I've disabled PMD-sized THP in my system:
> [root]# cat /sys/kernel/mm/transparent_hugepage/enabled
> always madvise [never]
> [root]# cat /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
> always inherit madvise [never]
>
> And I tried calling madvise() with MADV_COLLAPSE for anonymous memory,
> and I can still see it collapsing to a PMD-sized THP.

Hi Baolin!

Thank you for your reply and willingness to test again :)

I didn't realize we were talking about madvise collapse; this makes
sense now. I also figured out why I could "reproduce" it before. My
script was always enabling the THP settings in two places, and I only
commented out one to test this. But this time I was doing more manual
testing.

The original design of madvise_collapse ignores the sysfs settings and
collapses even if you have an order disabled. I believe this behavior
is wrong, but by design.
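For anyone who wants to try this themselves, here is a rough sketch of
the kind of test described above. It is only an illustration, not the
script either of us actually ran. It assumes a 2 MiB PMD size (x86-64,
or arm64 with 4K pages) and a kernel new enough to have MADV_COLLAPSE.
The helper names (selected_mode, read_mode, try_collapse) and the
PMD_SIZE_ASSUMED macro are made up for this example.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* asm-generic uapi value; assumed if libc headers are older */
#endif

#define PMD_SIZE_ASSUMED (2UL << 20) /* assumption: 2 MiB PMD */

/* Copy the bracketed token of a sysfs line such as "always madvise [never]". */
int selected_mode(const char *line, char *out, size_t outlen)
{
	const char *lb = strchr(line, '[');
	const char *rb = lb ? strchr(lb, ']') : NULL;
	size_t n;

	if (!rb)
		return -1;
	n = (size_t)(rb - lb - 1);
	if (n + 1 > outlen)
		return -1;
	memcpy(out, lb + 1, n);
	out[n] = '\0';
	return 0;
}

/* Read one THP sysfs knob and store its currently selected mode in out. */
int read_mode(const char *path, char *out, size_t outlen)
{
	char line[128];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fgets(line, sizeof(line), f))
		ret = selected_mode(line, out, outlen);
	fclose(f);
	return ret;
}

/*
 * Fault in one PMD-aligned anonymous range and request a synchronous
 * collapse. Returns madvise()'s result: 0 on success, -1 otherwise.
 */
int try_collapse(void)
{
	size_t len = 2 * PMD_SIZE_ASSUMED; /* over-allocate so we can align */
	char *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	char *aligned;
	int ret;

	if (map == MAP_FAILED)
		return -1;
	aligned = (char *)(((uintptr_t)map + PMD_SIZE_ASSUMED - 1) &
			   ~(PMD_SIZE_ASSUMED - 1));
	memset(aligned, 1, PMD_SIZE_ASSUMED); /* touch every page in the range */
	ret = madvise(aligned, PMD_SIZE_ASSUMED, MADV_COLLAPSE);
	munmap(map, len);
	return ret;
}
```

A small main() could call read_mode() on
/sys/kernel/mm/transparent_hugepage/enabled and on
hugepages-2048kB/enabled, print both results, and then call
try_collapse(). If try_collapse() returns 0 while both knobs read
"never", that would match what Baolin reported.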

I spent some time playing around with madvise collapses with and
without my changes. This is not a new thing: I reproduced the issue in
6.11 (Fedora 41), and I think it's been possible since the inception
of madvise collapse 3 years ago. I noticed similar behavior in one of
my RFCs, since it was "breaking" selftests, and the fix was to
reincorporate this broken sysfs behavior.

7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse")
"This call is independent of the system-wide THP sysfs settings, but
will fail for memory marked VM_NOHUGEPAGE."

The second condition holds true (and fails for VM_NOHUGEPAGE), but I
don't know if we actually want madvise_collapse to be independent of
the system-wide settings. So I'll ask the authors:

+David Rientjes +zokeefe@google.com

Was this brought up as a concern when this feature was first
introduced? Was there any pushback, and if so, what was the outcome of
the discussion?

I can easily fix this and it would further simplify the code (by
removing the is_khugepaged and friends). As David H. has brought up in
other discussions around similar topics, never should mean never. Is
this the only exception we should allow?

Thanks!

>
> > I can no longer reproduce this issue, that's why I posted... although
> > I should have followed up, and looked into what the original issue
> > was. Nothing really sticks out so perhaps something in mm-new was
> > broken and pulled out... not sure.
> >
> > It should now follow the expected behavior, which is that no mTHP
> > collapse occurs because if the PMD size is disabled so is khugepaged
> > collapse.
> >
> > Lmk if you are still experiencing this issue please.
> >
> > Cheers,
> > -- Nico
> >>
> >> [1]
> >> https://lore.kernel.org/all/83a66442-b7c7-42e7-af4e-fd211d8ed6f8@linux.alibaba.com/
> >>
>