Date: Wed, 25 Jun 2025 10:40:23 +0200
Subject: Re: [PATCH v4 0/2] fix MADV_COLLAPSE issue if THP settings are disabled
To: Lorenzo Stoakes
Cc: Hugh Dickins, Baolin Wang, akpm@linux-foundation.org, ziy@nvidia.com,
 Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
 dev.jain@arm.com, baohua@kernel.org, zokeefe@google.com,
 shy828301@gmail.com, usamaarif642@gmail.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
References: <75c02dbf-4189-958d-515e-fa80bb2187fc@google.com>
 <9cb94544-f65a-4394-b1e2-bfb226ead31c@redhat.com>
 <8691d74b-67ee-4e26-81ac-f6bf1725361e@redhat.com>
 <3b6db0c3-aef3-4a21-a154-6aafd639dbc7@lucifer.local>
 <6bda0de6-1ade-40c9-aa52-16bc02d98bee@redhat.com>
 <28051538-d3ea-4064-aef3-89f6dd98b119@redhat.com>
From: David Hildenbrand
Organization: Red Hat

On
25.06.25 10:22, Lorenzo Stoakes wrote:
> On Wed, Jun 25, 2025 at 10:16:46AM +0200, David Hildenbrand wrote:
>> On 25.06.25 09:49, David Hildenbrand wrote:
>>> I think the whole use case of using MADV_COLLAPSE to completely control
>>> THP allocation in a system is otherwise pretty hard to achieve, if there
>>> is no other way to tame THP allocation through page faults+khugepaged.
>>
>> Just want to add: for an app itself, it's doable in "madvise" mode perfectly
>> fine.
>>
>> If your app does a MADV_HUGEPAGE, it can get a THP during page-fault +
>> khugepaged.
>>
>> If your app does not do a MADV_HUGEPAGE, it can get a THP through
>> MADV_COLLAPSE.
>>
>> So the "madvise" mode actually works.
>
> Right, but for me MADV_COLLAPSE is more about 'I want THPs _now_ (if available),
> not when khugepaged decides to give me some'.
>
> So we have multiple semantics at work here, unfortunately.
>
>>
>> The problem appears as soon as we want to control other processes that might
>> be setting MADV_HUGEPAGE, and we actually want to control the behavior using
>> process_madvise(MADV_COLLAPSE), to say "well, the MADV_HUGEPAGE" should be
>> ignored.
>
> This is a _very_ specialist use.
>
> I'd argue for a 'manual' mode to be added to sysfs to cover this case, with
> 'never' having the 'actually means never' semantics.
>
> You might argue that could confuse things, but it'd retain the 'de facto'
> understanding nearly everybody has about what these flags mean, but give
> whatever user is out there that needs this the ability to continue doing what
> they want.
>
> And we get into philosophy about not 'breaking' userland, not sure we have a
> TLB/page fault/folio allocation efficiency contract with userland :)
>
> No program will break with this patch applied. Just potentially get performance
> degradation in a very, very specialist case.
>
>>
>> Then, you configure "never" system-wide and use
>> process_madvise(MADV_COLLAPSE) to drive it all manually.
>>
>> Curious to learn if there is such a user out there.
>
> Oh me too :)

I just looked at the original use cases [1]; such a use case is not mentioned.

But it did add process_madvise(MADV_COLLAPSE) in
876b4a1896646cc85ec6b1fc1c9270928b7e0831 where we document

"
This is useful for the development of userspace agents that seek to
optimize THP utilization system-wide by using userspace signals to
prioritize what memory is most deserving of being THP-backed.
"

The "prioritize" might indicate that this is used in combination with
"madvise", not with "never".

So yeah, it all boils down to

(1) If there is no such use case, "never can mean never". Because there
    is nothing to break, really.

(2) If there is such a use case, we might be breaking it.

[1] https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@google.com/

-- 
Cheers,

David / dhildenb