From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, nvdimm@lists.linux.dev,
	David Hildenbrand, Andrew Morton, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko, Dan Williams, Alistair Popple, Matthew Wilcox,
	Jan Kara, Alexander Viro, Christian Brauner, Zi Yan, Baolin Wang,
	Lorenzo Stoakes, "Liam R. Howlett", Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan,
	Michal Hocko, Jann Horn, Pedro Falcato
Subject: [PATCH RFC 08/14] mm/huge_memory: mark PMD mappings of the huge zero folio special
Date: Tue, 17 Jun 2025 17:43:39 +0200
Message-ID: <20250617154345.2494405-9-david@redhat.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250617154345.2494405-1-david@redhat.com>
References: <20250617154345.2494405-1-david@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="US-ASCII"

The huge zero folio is refcounted (+mapcounted -- is that a word?)
differently than "normal" folios, similar (but not identical) to the
ordinary shared zeropage.

For this reason, we special-case these pages in
vm_normal_page*/vm_normal_folio*, and only allow selected callers to
still use them (e.g., GUP can still take a reference on them).

vm_normal_page_pmd() already filters out the huge zero folio. However,
so far we are not marking it as special like we do with the ordinary
shared zeropage. Let's mark it as special, so we can further refactor
vm_normal_page_pmd() and vm_normal_page().

While at it, update the doc regarding the shared zero folios.

Signed-off-by: David Hildenbrand
---
 mm/huge_memory.c |  5 ++++-
 mm/memory.c      | 13 +++++++++----
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 92400f3baa9ff..8f03cd4e40397 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1309,6 +1309,7 @@ static void set_huge_zero_folio(pgtable_t pgtable, struct mm_struct *mm,
 {
 	pmd_t entry;
 	entry = folio_mk_pmd(zero_folio, vma->vm_page_prot);
+	entry = pmd_mkspecial(entry);
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
 	set_pmd_at(mm, haddr, pmd, entry);
 	mm_inc_nr_ptes(mm);
@@ -1418,7 +1419,9 @@ static vm_fault_t insert_pmd(struct vm_area_struct *vma, unsigned long addr,
 
 	if (fop.is_folio) {
 		entry = folio_mk_pmd(fop.folio, vma->vm_page_prot);
-		if (!is_huge_zero_folio(fop.folio)) {
+		if (is_huge_zero_folio(fop.folio)) {
+			entry = pmd_mkspecial(entry);
+		} else {
 			folio_get(fop.folio);
 			folio_add_file_rmap_pmd(fop.folio, &fop.folio->page, vma);
 			add_mm_counter(mm, mm_counter_file(fop.folio), HPAGE_PMD_NR);
diff --git a/mm/memory.c b/mm/memory.c
index 9a1acd057ce59..ef277dab69e33 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -541,7 +541,13 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
  *
  * "Special" mappings do not wish to be associated with a "struct page" (either
  * it doesn't exist, or it exists but they don't want to touch it). In this
- * case, NULL is returned here. "Normal" mappings do have a struct page.
+ * case, NULL is returned here. "Normal" mappings do have a struct page and
+ * are ordinarily refcounted.
+ *
+ * Page mappings of the shared zero folios are always considered "special", as
+ * they are not ordinarily refcounted. However, selected page table walkers
+ * (such as GUP) can still identify these mappings and work with the
+ * underlying "struct page".
  *
  * There are 2 broad cases. Firstly, an architecture may define a pte_special()
  * pte bit, in which case this function is trivial. Secondly, an architecture
@@ -571,9 +577,8 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
  *
  * VM_MIXEDMAP mappings can likewise contain memory with or without "struct
  * page" backing, however the difference is that _all_ pages with a struct
- * page (that is, those where pfn_valid is true) are refcounted and considered
- * normal pages by the VM. The only exception are zeropages, which are
- * *never* refcounted.
+ * page (that is, those where pfn_valid is true, except the shared zero
+ * folios) are refcounted and considered normal pages by the VM.
  *
  * The disadvantage is that pages are refcounted (which can be slower and
  * simply not an option for some PFNMAP users). The advantage is that we
-- 
2.49.0