From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 19 Apr 2025 18:25:58 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH v3 13/20] mm: Copy-on-Write (COW) reuse support for
 PTE-mapped THP
To: Kairui Song
Cc: linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 cgroups@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-api@vger.kernel.org, Andrew Morton, "Matthew Wilcox (Oracle)",
 Tejun Heo, Zefan Li, Johannes Weiner, Michal Koutný, Jonathan Corbet,
 Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, Muchun Song, "Liam R. Howlett", Lorenzo Stoakes,
 Vlastimil Babka, Jann Horn
References: <20250303163014.1128035-1-david@redhat.com>
 <20250303163014.1128035-14-david@redhat.com>
From: David Hildenbrand <david@redhat.com>
Organization: Red Hat
Content-Language: en-US
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 19.04.25 18:02, Kairui Song wrote:
> On Tue, Mar 4, 2025 at 12:46 AM David Hildenbrand <david@redhat.com> wrote:
>>
>> Currently, we never end up reusing PTE-mapped THPs after fork. This
>> wasn't really a problem with PMD-sized THPs, because they would have
>> to be PTE-mapped first, but it's becoming a problem with smaller THP
>> sizes that are effectively always PTE-mapped.
>>
>> With our new "mapped exclusively" vs. "maybe mapped shared" logic
>> for large folios, implementing CoW reuse for PTE-mapped THPs is
>> straightforward: if the folio is mapped exclusively, make sure that
>> all references are from these (our) mappings. Add some helpful
>> comments to explain the details.
>>
>> CONFIG_TRANSPARENT_HUGEPAGE selects CONFIG_MM_ID. If we spot an anon
>> large folio without CONFIG_TRANSPARENT_HUGEPAGE in that code,
>> something is seriously messed up.
>>
>> There are plenty of things we can optimize in the future: for
>> example, we could remember that the folio is fully exclusive, so we
>> could speed up the next fault further. Also, we could try "faulting
>> around", turning surrounding PTEs that map the same folio writable.
>> But especially the latter might increase COW latency, so it would
>> need further investigation.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  mm/memory.c | 83 +++++++++++++++++++++++++++++++++++++++++++++++------
>>  1 file changed, 75 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 73b783c7d7d51..bb245a8fe04bc 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -3729,19 +3729,86 @@ static vm_fault_t wp_page_shared(struct vm_fault *vmf, struct folio *folio)
>>          return ret;
>>  }
>>
>> -static bool wp_can_reuse_anon_folio(struct folio *folio,
>> -                                    struct vm_area_struct *vma)
>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> +static bool __wp_can_reuse_large_anon_folio(struct folio *folio,
>> +                                            struct vm_area_struct *vma)
>>  {
>> +        bool exclusive = false;
>> +
>> +        /* Let's just free up a large folio if only a single page is mapped. */
>> +        if (folio_large_mapcount(folio) <= 1)
>> +                return false;
>> +
>>          /*
>> -         * We could currently only reuse a subpage of a large folio if no
>> -         * other subpages of the large folios are still mapped. However,
>> -         * let's just consistently not reuse subpages even if we could
>> -         * reuse in that scenario, and give back a large folio a bit
>> -         * sooner.
>> +         * The assumption for anonymous folios is that each page can only get
>> +         * mapped once into each MM. The only exception are KSM folios, which
>> +         * are always small.
>> +         *
>> +         * Each taken mapcount must be paired with exactly one taken reference,
>> +         * whereby the refcount must be incremented before the mapcount when
>> +         * mapping a page, and the refcount must be decremented after the
>> +         * mapcount when unmapping a page.
>> +         *
>> +         * If all folio references are from mappings, and all mappings are in
>> +         * the page tables of this MM, then this folio is exclusive to this MM.
>>           */
>> -        if (folio_test_large(folio))
>> +        if (folio_test_large_maybe_mapped_shared(folio))
>> +                return false;
>> +
>> +        VM_WARN_ON_ONCE(folio_test_ksm(folio));
>> +        VM_WARN_ON_ONCE(folio_mapcount(folio) > folio_nr_pages(folio));
>> +        VM_WARN_ON_ONCE(folio_entire_mapcount(folio));
>> +
>> +        if (unlikely(folio_test_swapcache(folio))) {
>> +                /*
>> +                 * Note: freeing up the swapcache will fail if some PTEs are
>> +                 * still swap entries.
>> +                 */
>> +                if (!folio_trylock(folio))
>> +                        return false;
>> +                folio_free_swap(folio);
>> +                folio_unlock(folio);
>> +        }
>> +
>> +        if (folio_large_mapcount(folio) != folio_ref_count(folio))
>>                  return false;
>>
>> +        /* Stabilize the mapcount vs. refcount and recheck. */
>> +        folio_lock_large_mapcount(folio);
>> +        VM_WARN_ON_ONCE(folio_large_mapcount(folio) < folio_ref_count(folio));
>
> Hi David, I'm seeing this WARN_ON being triggered on my test machine:

Hi! So I assume the following will not sort out the issue for you,
correct?

https://lore.kernel.org/all/20250415095007.569836-1-david@redhat.com/T/#u

>
> I'm currently working on my swap table series and testing heavily with
> swap-related workloads. I thought my patch might have broken the
> kernel, but after more investigation and reverting to the current
> mm-unstable, it still occurs (with a much lower chance, though; I
> think my series changed the timing, so it's more frequent in my case).
>
> The test is simple: I just enable all mTHP sizes and repeatedly build
> the Linux kernel in a 1G memcg using tmpfs.
>
> The WARN is reproducible with current mm-unstable
> (dc683247117ee018e5da6b04f1c499acdc2a1418):
>
> [ 5268.100379] ------------[ cut here ]------------
> [ 5268.105925] WARNING: CPU: 2 PID: 700274 at mm/memory.c:3792 do_wp_page+0xfc5/0x1080
> [ 5268.112437] Modules linked in: zram virtiofs
> [ 5268.115507] CPU: 2 UID: 0 PID: 700274 Comm: cc1 Kdump: loaded Not tainted 6.15.0-rc2.ptch-gdc683247117e #1434 PREEMPT(voluntary)
> [ 5268.120562] Hardware name: Red Hat KVM/RHEL-AV, BIOS 0.0.0 02/06/2015
> [ 5268.123025] RIP: 0010:do_wp_page+0xfc5/0x1080
> [ 5268.124807] Code: 0d 80 77 32 02 0f 85 3e f1 ff ff 0f 1f 44 00 00 e9 34 f1 ff ff 48 0f ba 75 00 1f 65 ff 0d 63 77 32 02 0f 85 21 f1 ff ff eb e1 <0f> 0b e9 10 fd ff ff 65 ff 00 f0 48 0f ba 6d 00 1f 0f 83 ec fc ff
> [ 5268.132034] RSP: 0000:ffffc900234efd48 EFLAGS: 00010297
> [ 5268.134002] RAX: 0000000000000080 RBX: 0000000000000000 RCX: 000fffffffe00000
> [ 5268.136609] RDX: 0000000000000081 RSI: 00007f009cbad000 RDI: ffffea0012da0000
> [ 5268.139371] RBP: ffffea0012da0068 R08: 80000004b682d025 R09: 00007f009c7c0000
> [ 5268.142183] R10: ffff88839c48b8c0 R11: 0000000000000000 R12: ffff88839c48b8c0
> [ 5268.144738] R13: ffffea0012da0000 R14: 00007f009cbadf10 R15: ffffc900234efdd8
> [ 5268.147540] FS: 00007f009d1fdac0(0000) GS:ffff88a07ae14000(0000) knlGS:0000000000000000
> [ 5268.150715] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 5268.153270] CR2: 00007f009cbadf10 CR3: 000000016c7c0001 CR4: 0000000000770eb0
> [ 5268.155674] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 5268.158100] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 5268.160613] PKRU: 55555554
> [ 5268.161662] Call Trace:
> [ 5268.162609]  <TASK>
> [ 5268.163438]  ? ___pte_offset_map+0x1b/0x110
> [ 5268.165309]  __handle_mm_fault+0xa51/0xf00
> [ 5268.166848]  ? update_load_avg+0x80/0x760
> [ 5268.168376]  handle_mm_fault+0x13d/0x360
> [ 5268.169930]  do_user_addr_fault+0x2f2/0x7f0
> [ 5268.171630]  exc_page_fault+0x6a/0x140
> [ 5268.173278]  asm_exc_page_fault+0x26/0x30
> [ 5268.174866] RIP: 0033:0x120e8e4
> [ 5268.176272] Code: 84 a9 00 00 00 48 39 c3 0f 85 ae 00 00 00 48 8b 43 20 48 89 45 38 48 85 c0 0f 85 b7 00 00 00 48 8b 43 18 48 8b 15 6c 08 42 01 <0f> 11 43 10 48 89 1d 61 08 42 01 48 89 53 18 0f 11 03 0f 11 43 20
> [ 5268.184121] RSP: 002b:00007fff8a855160 EFLAGS: 00010246
> [ 5268.186343] RAX: 00007f009cbadbd0 RBX: 00007f009cbadf00 RCX: 0000000000000000
> [ 5268.189209] RDX: 00007f009cbba030 RSI: 00000000000006f4 RDI: 0000000000000000
> [ 5268.192145] RBP: 00007f009cbb6460 R08: 00007f009d10f000 R09: 000000000000016c
> [ 5268.194687] R10: 0000000000000000 R11: 0000000000000010 R12: 00007f009cf97660
> [ 5268.197172] R13: 00007f009756ede0 R14: 00007f0097582348 R15: 0000000000000002
> [ 5268.199419]  </TASK>
> [ 5268.200227] ---[ end trace 0000000000000000 ]---
>
> I also once changed the WARN_ON to WARN_ON_FOLIO, and I got more
> info here:
>
> [ 3994.907255] page: refcount:9 mapcount:1 mapping:0000000000000000 index:0x7f90b3e98 pfn:0x615028
> [ 3994.914449] head: order:3 mapcount:8 entire_mapcount:0 nr_pages_mapped:8 pincount:0
> [ 3994.924534] memcg:ffff888106746000
> [ 3994.927868] anon flags: 0x17ffffc002084c(referenced|uptodate|owner_2|head|swapbacked|node=0|zone=2|lastcpupid=0x1fffff)
> [ 3994.933479] raw: 0017ffffc002084c ffff88816edd9128 ffffea000beac108 ffff8882e8ba6bc9
> [ 3994.936251] raw: 00000007f90b3e98 0000000000000000 0000000900000000 ffff888106746000
> [ 3994.939466] head: 0017ffffc002084c ffff88816edd9128 ffffea000beac108 ffff8882e8ba6bc9
> [ 3994.943355] head: 00000007f90b3e98 0000000000000000 0000000900000000 ffff888106746000
> [ 3994.946988] head: 0017ffffc0000203 ffffea0018540a01 0000000800000007 00000000ffffffff
> [ 3994.950328] head: ffffffff00000007 00000000800000a3 0000000000000000 0000000000000008
> [ 3994.953684] page dumped because: VM_WARN_ON_FOLIO(folio_large_mapcount(folio) < folio_ref_count(folio))
> [ 3994.957534] ------------[ cut here ]------------
> [ 3994.959917] WARNING: CPU: 16 PID: 555282 at mm/memory.c:3794 do_wp_page+0x10c0/0x1110
> [ 3994.963069] Modules linked in: zram virtiofs
> [ 3994.964726] CPU: 16 UID: 0 PID: 555282 Comm: sh Kdump: loaded Not tainted 6.15.0-rc1.ptch-ge39aef85f4c0-dirty #1431 PREEMPT(voluntary)
> [ 3994.969985] Hardware name: Red Hat KVM/RHEL-AV, BIOS 0.0.0 02/06/2015
> [ 3994.972905] RIP: 0010:do_wp_page+0x10c0/0x1110
> [ 3994.974477] Code: fe ff 0f 0b bd f5 ff ff ff e9 16 fb ff ff 41 83 a9 bc 12 00 00 01 e9 2f fb ff ff 48 c7 c6 90 c2 49 82 4c 89 ef e8 40 fd fe ff <0f> 0b e9 6a fc ff ff 65 ff 00 f0 48 0f ba 6d 00 1f 0f 83 46 fc ff
> [ 3994.981033] RSP: 0000:ffffc9002b3c7d40 EFLAGS: 00010246
> [ 3994.982636] RAX: 000000000000005b RBX: 0000000000000000 RCX: 0000000000000000
> [ 3994.984778] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff889ffea16a80
> [ 3994.986865] RBP: ffffea0018540a68 R08: 0000000000000000 R09: c0000000ffff7fff
> [ 3994.989316] R10: 0000000000000001 R11: ffffc9002b3c7b80 R12: ffff88810cfd7d40
> [ 3994.991654] R13: ffffea0018540a00 R14: 00007f90b3e9d620 R15: ffffc9002b3c7dd8
> [ 3994.994076] FS: 00007f90b3caa740(0000) GS:ffff88a07b194000(0000) knlGS:0000000000000000
> [ 3994.996939] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3994.998902] CR2: 00007f90b3e9d620 CR3: 0000000104088004 CR4: 0000000000770eb0
> [ 3995.001314] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 3995.003746] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 3995.006173] PKRU: 55555554
> [ 3995.007117] Call Trace:
> [ 3995.007988]  <TASK>
> [ 3995.008755]  ? __pfx_default_wake_function+0x10/0x10
> [ 3995.010490]  ? ___pte_offset_map+0x1b/0x110
> [ 3995.011929]  __handle_mm_fault+0xa51/0xf00
> [ 3995.013346]  handle_mm_fault+0x13d/0x360
> [ 3995.014796]  do_user_addr_fault+0x2f2/0x7f0
> [ 3995.016331]  ? sigprocmask+0x77/0xa0
> [ 3995.017656]  exc_page_fault+0x6a/0x140
> [ 3995.018978]  asm_exc_page_fault+0x26/0x30
> [ 3995.020309] RIP: 0033:0x7f90b3d881a7
> [ 3995.021461] Code: e8 4e b1 f8 ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 55 31 c0 ba 01 00 00 00 48 89 e5 53 48 89 fb 48 83 ec 08 0f b1 15 71 54 11 00 0f 85 3b 01 00 00 48 8b 35 84 54 11 00 48
> [ 3995.028091] RSP: 002b:00007ffc33632c90 EFLAGS: 00010206
> [ 3995.029992] RAX: 0000000000000000 RBX: 0000560cfbfc0a40 RCX: 0000000000000000
> [ 3995.032456] RDX: 0000000000000001 RSI: 0000000000000005 RDI: 0000560cfbfc0a40
> [ 3995.034794] RBP: 00007ffc33632ca0 R08: 00007ffc33632d50 R09: 00007ffc33632cff
> [ 3995.037534] R10: 00007ffc33632c70 R11: 00007ffc33632d00 R12: 0000560cfbfc0a40
> [ 3995.041063] R13: 00007f90b3e97fd0 R14: 00007f90b3e97fa8 R15: 0000000000000000
> [ 3995.044390]  </TASK>
> [ 3995.045510] ---[ end trace 0000000000000000 ]---
>
> My guess is that folio_ref_count is not a reliable thing to check
> here: anything can increase the folio's refcount even without locking
> it, for example, a swap cache lookup, or maybe anything iterating the
> LRU.

It is reliable: we are holding the mapcount lock, so for each mapcount
we must have a corresponding refcount. If that is not the case, we have
an issue elsewhere. Other references may only increase the refcount;
they cannot violate the mapcount vs. refcount condition.

Can you also reproduce this with swap disabled?

-- 
Cheers,

David / dhildenb
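
For reference, the invariant at the heart of this exchange can be modeled in
a few lines. The following is a minimal, self-contained user-space sketch of
the ordering rule stated in the patch's comment (struct folio_model,
map_page(), unmap_page(), and is_exclusive() are invented for illustration
and are not kernel API; only the ordering rule itself comes from the patch):

	#include <assert.h>
	#include <stdatomic.h>

	/* Toy model of a folio's two counters; not the kernel's struct folio. */
	struct folio_model {
		atomic_int refcount;
		atomic_int mapcount;
	};

	/* Mapping a page: take the reference first, then raise the mapcount. */
	static void map_page(struct folio_model *f)
	{
		atomic_fetch_add(&f->refcount, 1);
		atomic_fetch_add(&f->mapcount, 1);
	}

	/* Unmapping a page: lower the mapcount first, then drop the reference. */
	static void unmap_page(struct folio_model *f)
	{
		atomic_fetch_sub(&f->mapcount, 1);
		atomic_fetch_sub(&f->refcount, 1);
	}

	/*
	 * Because of that ordering, refcount >= mapcount holds at every
	 * instant, regardless of how mappers and other reference holders
	 * (e.g., a swap cache lookup) interleave: extra references can
	 * only push the refcount further above the mapcount. With the
	 * mapcount stabilized (the large-mapcount lock in the patch),
	 * equality means every reference comes from this MM's mappings.
	 */
	static int is_exclusive(struct folio_model *f)
	{
		int map = atomic_load(&f->mapcount);
		int ref = atomic_load(&f->refcount);

		assert(ref >= map); /* the condition the VM_WARN_ON_ONCE checks */
		return ref == map;
	}

In this model, a refcount observed below the mapcount (the triggered warning
above) can only mean a reference was dropped without its mapping being torn
down first, i.e., an ordering violation somewhere else, which is the point
David makes in his reply.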