From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, xen-devel@lists.xenproject.org,
 linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
 David Hildenbrand, Andrew Morton, Juergen Gross, Stefano Stabellini,
 Oleksandr Tyshchenko, Dan Williams, Matthew Wilcox, Jan Kara,
 Alexander Viro, Christian Brauner, Lorenzo Stoakes, "Liam R. Howlett",
 Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
 Zi Yan, Baolin Wang, Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
 Jann Horn, Pedro Falcato, Hugh Dickins, Oscar Salvador, Lance Yang
Subject: [PATCH v1 5/9] mm/huge_memory: mark PMD mappings of the huge zero folio special
Date: Tue, 15 Jul 2025 15:23:46 +0200
Message-ID: <20250715132350.2448901-6-david@redhat.com>
X-Mailer: git-send-email 2.50.1
In-Reply-To: <20250715132350.2448901-1-david@redhat.com>
References: <20250715132350.2448901-1-david@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 8bit
The huge zero folio is refcounted (+mapcounted -- is that a word?)
differently than "normal" folios, similar to (but different from) the
ordinary shared zeropage.
For this reason, we special-case these pages in
vm_normal_page*/vm_normal_folio*, and only allow selected callers to
still use them (e.g., GUP can still take a reference on them).

vm_normal_page_pmd() already filters out the huge zero folio. However,
so far we are not marking it as special like we do with the ordinary
shared zeropage. Let's mark it as special, so we can further refactor
vm_normal_page_pmd() and vm_normal_page().

While at it, update the doc regarding the shared zero folios.

Reviewed-by: Oscar Salvador
Signed-off-by: David Hildenbrand
---
 mm/huge_memory.c |  5 ++++-
 mm/memory.c      | 14 +++++++++-----
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9ec7f48efde09..24aff14d22a1e 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1320,6 +1320,7 @@ static void set_huge_zero_folio(pgtable_t pgtable, struct mm_struct *mm,
 {
 	pmd_t entry;
 	entry = folio_mk_pmd(zero_folio, vma->vm_page_prot);
+	entry = pmd_mkspecial(entry);
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
 	set_pmd_at(mm, haddr, pmd, entry);
 	mm_inc_nr_ptes(mm);
@@ -1429,7 +1430,9 @@ static vm_fault_t insert_pmd(struct vm_area_struct *vma, unsigned long addr,
 
 	if (fop.is_folio) {
 		entry = folio_mk_pmd(fop.folio, vma->vm_page_prot);
-		if (!is_huge_zero_folio(fop.folio)) {
+		if (is_huge_zero_folio(fop.folio)) {
+			entry = pmd_mkspecial(entry);
+		} else {
 			folio_get(fop.folio);
 			folio_add_file_rmap_pmd(fop.folio, &fop.folio->page, vma);
 			add_mm_counter(mm, mm_counter_file(fop.folio), HPAGE_PMD_NR);
diff --git a/mm/memory.c b/mm/memory.c
index 3dd6c57e6511e..a4f62923b961c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -543,7 +543,13 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
  *
  * "Special" mappings do not wish to be associated with a "struct page" (either
  * it doesn't exist, or it exists but they don't want to touch it). In this
- * case, NULL is returned here. "Normal" mappings do have a struct page.
+ * case, NULL is returned here. "Normal" mappings do have a struct page and
+ * are ordinarily refcounted.
+ *
+ * Page mappings of the shared zero folios are always considered "special", as
+ * they are not ordinarily refcounted. However, selected page table walkers
+ * (such as GUP) can still identify these mappings and work with the
+ * underlying "struct page".
  *
  * There are 2 broad cases. Firstly, an architecture may define a pte_special()
  * pte bit, in which case this function is trivial. Secondly, an architecture
@@ -573,9 +579,8 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
  *
  * VM_MIXEDMAP mappings can likewise contain memory with or without "struct
  * page" backing, however the difference is that _all_ pages with a struct
- * page (that is, those where pfn_valid is true) are refcounted and considered
- * normal pages by the VM. The only exception are zeropages, which are
- * *never* refcounted.
+ * page (that is, those where pfn_valid is true, except the shared zero
+ * folios) are refcounted and considered normal pages by the VM.
  *
  * The disadvantage is that pages are refcounted (which can be slower and
  * simply not an option for some PFNMAP users). The advantage is that we
@@ -655,7 +660,6 @@ struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
 {
 	unsigned long pfn = pmd_pfn(pmd);
 
-	/* Currently it's only used for huge pfnmaps */
 	if (unlikely(pmd_special(pmd)))
 		return NULL;
 
-- 
2.50.1