From: Baolin Wang
Date: Thu, 10 Jul 2025 09:58:00 +0800
Subject: Re: [PATCH v4 0/2] fix MADV_COLLAPSE issue if THP settings are disabled
To: Lorenzo Stoakes
Cc: akpm@linux-foundation.org, hughd@google.com, david@redhat.com, ziy@nvidia.com, Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, baohua@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Pedro Falcato
References: <573eb43a-8536-4206-a7c6-d0daa1fd7e70@lucifer.local>
In-Reply-To: <573eb43a-8536-4206-a7c6-d0daa1fd7e70@lucifer.local>

On 2025/7/9 20:36, Lorenzo Stoakes wrote:
> +cc Pedro as he'd raised concerns here also.
>
> Hi Baolin,
>
> Just for some clarification on this - thank you very much for this series,
> but based on discussion with David and concerns raised by Hugh + others,
> overall it feels as if, while the documentation is no doubt vague in ways
> it ought not to be, this behaviour is something we have put out into the
> world and we should continue to support it.
>
> So overall I feel that this series should not be applied.

Fair enough.

>
> Your work here is great, and really massive apologies for this after all
> the work you've put in (and of course the review work here also), but on
> reflection I think it's a risk we shouldn't take.

Consensus is the key. Thank you and David for the discussion and suggestions.

> I understand this means that MADV_COLLAPSE can't be used to collapse at a
> mTHP granularity - we definitely need to have a think about how we might
> provide this sensibly.
>
> As for how to move forward - I will go ahead and update documentation to
> make the situation absolutely crystal clear, both in the man page and the
> rst.

OK. Great. Thanks.

> Thanks, Lorenzo
>
> On Wed, Jun 25, 2025 at 09:40:08AM +0800, Baolin Wang wrote:
>> When invoking thp_vma_allowable_orders(), if the TVA_ENFORCE_SYSFS flag is not
>> specified, we will ignore the THP sysfs settings. Whilst it makes sense for the
>> callers who do not specify this flag, it creates an odd and surprising situation
>> where a sysadmin specifying 'never' for all THP sizes still observes THP pages
>> being allocated and used on the system. MADV_COLLAPSE is an example of
>> such a case: it does not set TVA_ENFORCE_SYSFS when calling
>> thp_vma_allowable_orders().
>>
>> As we discussed in the previous thread [1], MADV_COLLAPSE will ignore
>> the system-wide anon/shmem THP sysfs settings, which means that even though
>> we have disabled the anon/shmem THP configuration, MADV_COLLAPSE will still
>> attempt to collapse into an anon/shmem THP.
>> This violates the rule we have agreed upon: never means never.
>>
>> For example, system administrators who disabled THP everywhere must indeed very
>> much not want THP to be used for whatever reason - having individual programs
>> being able to quietly override this is very surprising and likely to cause headaches
>> for those who desire this not to happen on their systems.
>>
>> This patch set will address the MADV_COLLAPSE issue.
>>
>> Test
>> ====
>> 1. Tested the mm selftests and found no regressions.
>> 2. With toggling different Anon mTHP settings, the allocation and madvise collapse for
>> anonymous pages work well.
>> 3. With toggling different shmem mTHP settings, the allocation and madvise collapse for
>> shmem work well.
>> 4. Tested the large order allocation for tmpfs, and works as expected.
>>
>> [1] https://lore.kernel.org/all/1f00fdc3-a3a3-464b-8565-4c1b23d34f8d@linux.alibaba.com/
>>
>> Changes from v3:
>> - Collect reviewed tags. Thanks.
>> - Update the commit message, per David.
>>
>> Changes from v2:
>> - Update the commit message and cover letter, per Lorenzo. Thanks.
>> - Simplify the logic in thp_vma_allowable_orders(), per Lorenzo and David. Thanks.
>>
>> Changes from v1:
>> - Update the commit message, per Zi.
>> - Add Zi's reviewed tag. Thanks.
>> - Update the shmem logic.
>>
>> Baolin Wang (2):
>>   mm: huge_memory: disallow hugepages if the system-wide THP sysfs
>>     settings are disabled
>>   mm: shmem: disallow hugepages if the system-wide shmem THP sysfs
>>     settings are disabled
>>
>>  include/linux/huge_mm.h                 | 51 ++++++++++++++++++-------
>>  mm/shmem.c                              |  6 +--
>>  tools/testing/selftests/mm/khugepaged.c |  8 +---
>>  3 files changed, 43 insertions(+), 22 deletions(-)
>>
>> --
>> 2.43.5
>>
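
For anyone skimming the thread, the difference between the behaviour we are
keeping and the behaviour the series proposed can be sketched as a toy model.
This is not kernel code: the enum, the struct and all three function names
below are hypothetical, and the model assumes only what the cover letter
states (callers that do not pass TVA_ENFORCE_SYSFS skip the sysfs check, and
the series wanted 'never' to win regardless). The real
thp_vma_allowable_orders() works on order bitmasks and many more conditions.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; these are NOT the kernel's types. */
enum thp_sysfs_mode { THP_NEVER, THP_MADVISE, THP_ALWAYS };

struct thp_query {
	enum thp_sysfs_mode mode;  /* sysfs setting for the THP size in question */
	bool vma_madv_hugepage;    /* region was marked with MADV_HUGEPAGE */
	bool enforce_sysfs;        /* models a caller passing TVA_ENFORCE_SYSFS */
};

/* Sysfs policy as applied to callers that do enforce it. */
static bool sysfs_allows(const struct thp_query *q)
{
	switch (q->mode) {
	case THP_ALWAYS:
		return true;
	case THP_MADVISE:
		return q->vma_madv_hugepage;
	case THP_NEVER:
	default:
		return false;
	}
}

/* Existing behaviour per the cover letter: without the flag, no sysfs check. */
static bool thp_allowed_existing(const struct thp_query *q)
{
	return q->enforce_sysfs ? sysfs_allows(q) : true;
}

/* Proposed (not applied) behaviour, as assumed here: 'never' always wins. */
static bool thp_allowed_proposed(const struct thp_query *q)
{
	if (q->mode == THP_NEVER)
		return false;
	return thp_allowed_existing(q);
}
```

With mode = THP_NEVER and enforce_sysfs = false (the MADV_COLLAPSE case),
thp_allowed_existing() returns true while thp_allowed_proposed() returns
false. Since the series is not being applied, the first result is the one
userspace should continue to expect.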