From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Kalyazin, Nikita"
To: "kvm@vger.kernel.org", "linux-doc@vger.kernel.org",
 "linux-kernel@vger.kernel.org", "linux-arm-kernel@lists.infradead.org",
 "kvmarm@lists.linux.dev", "linux-fsdevel@vger.kernel.org",
 "linux-mm@kvack.org", "bpf@vger.kernel.org",
 "linux-kselftest@vger.kernel.org", "kernel@xen0n.name",
 "linux-riscv@lists.infradead.org", "linux-s390@vger.kernel.org",
 "loongarch@lists.linux.dev", "linux-pm@vger.kernel.org"
CC: "pbonzini@redhat.com", "corbet@lwn.net", "maz@kernel.org",
 "oupton@kernel.org", "joey.gouly@arm.com", "suzuki.poulose@arm.com",
 "yuzenghui@huawei.com", "catalin.marinas@arm.com", "will@kernel.org",
 "seanjc@google.com", "tglx@kernel.org", "mingo@redhat.com",
 "bp@alien8.de", "dave.hansen@linux.intel.com", "x86@kernel.org",
 "hpa@zytor.com", "luto@kernel.org", "peterz@infradead.org",
 "willy@infradead.org", "akpm@linux-foundation.org", "david@kernel.org",
 "lorenzo.stoakes@oracle.com", "vbabka@kernel.org", "rppt@kernel.org",
 "surenb@google.com", "mhocko@suse.com", "ast@kernel.org",
 "daniel@iogearbox.net", "andrii@kernel.org", "martin.lau@linux.dev",
 "eddyz87@gmail.com", "song@kernel.org", "yonghong.song@linux.dev",
 "john.fastabend@gmail.com", "kpsingh@kernel.org", "sdf@fomichev.me",
 "haoluo@google.com", "jolsa@kernel.org", "jgg@ziepe.ca",
 "jhubbard@nvidia.com", "peterx@redhat.com", "jannh@google.com",
 "pfalcato@suse.de", "skhan@linuxfoundation.org", "riel@surriel.com",
 "ryan.roberts@arm.com", "jgross@suse.com", "yu-cheng.yu@intel.com",
 "kas@kernel.org", "coxu@redhat.com", "kevin.brodsky@arm.com",
 "ackerleytng@google.com", "yosry@kernel.org", "ajones@ventanamicro.com",
 "maobibo@loongson.cn", "tabba@google.com", "prsampat@amd.com",
 "wu.fei9@sanechips.com.cn", "mlevitsk@redhat.com", "jmattson@google.com",
 "jthoughton@google.com", "agordeev@linux.ibm.com", "alex@ghiti.fr",
 "aou@eecs.berkeley.edu", "borntraeger@linux.ibm.com",
 "chenhuacai@kernel.org", "dev.jain@arm.com", "gor@linux.ibm.com",
 "hca@linux.ibm.com", "palmer@dabbelt.com", "pjw@kernel.org",
 "shijie@os.amperecomputing.com", "svens@linux.ibm.com",
 "thuth@redhat.com", "wyihan@google.com", "yang@os.amperecomputing.com",
 "Jonathan.Cameron@huawei.com", "Liam.Howlett@oracle.com",
 "urezki@gmail.com", "zhengqi.arch@bytedance.com",
 "gerald.schaefer@linux.ibm.com", "jiayuan.chen@shopee.com",
 "lenb@kernel.org", "osalvador@suse.de", "pavel@kernel.org",
 "rafael@kernel.org", "vannapurve@google.com", "jackmanb@google.com",
 "aneesh.kumar@kernel.org", "patrick.roy@linux.dev", "Thomson, Jack",
 "Itazuri, Takahiro", "Manwaring, Derek", "Kalyazin, Nikita",
 Vlastimil Babka
Subject: [PATCH v11 06/16] mm: introduce AS_NO_DIRECT_MAP
Date: Tue, 17 Mar 2026 14:11:41 +0000
Message-ID: <20260317141031.514-7-kalyazin@amazon.com>
References: <20260317141031.514-1-kalyazin@amazon.com>
In-Reply-To: <20260317141031.514-1-kalyazin@amazon.com>
X-Mailing-List: linux-doc@vger.kernel.org

From: Patrick Roy

Add AS_NO_DIRECT_MAP for mappings where direct map entries of folios are
set to not present. Currently, mappings that match this description are
secretmem mappings (memfd_secret()).
Later, some guest_memfd configurations will also fall into this
category.

Reject this new type of mapping in all locations that currently reject
secretmem mappings, on the assumption that if secretmem mappings are
rejected somewhere, it is precisely because of an inability to deal with
folios without direct map entries. Then make memfd_secret() use
AS_NO_DIRECT_MAP on its address_space to drop its special
vma_is_secretmem()/secretmem_mapping() checks.

Use a new flag instead of overloading AS_INACCESSIBLE (which is already
set by guest_memfd) because not all guest_memfd mappings will end up
with their direct map entries removed (e.g. in pKVM setups, parts of
guest_memfd that can be mapped to userspace should also be GUP-able, and
generally not have restrictions on who can access them).

Acked-by: Mike Rapoport (Microsoft)
Acked-by: David Hildenbrand (Red Hat)
Signed-off-by: Patrick Roy
Acked-by: Vlastimil Babka
Reviewed-by: Ackerley Tng
Signed-off-by: Nikita Kalyazin
---
 include/linux/pagemap.h   | 16 ++++++++++++++++
 include/linux/secretmem.h | 18 ------------------
 lib/buildid.c             |  8 ++++++--
 mm/gup.c                  |  9 ++++-----
 mm/mlock.c                |  2 +-
 mm/secretmem.c            |  8 ++------
 6 files changed, 29 insertions(+), 32 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ec442af3f886..68c075502d91 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -211,6 +211,7 @@ enum mapping_flags {
 	AS_KERNEL_FILE = 10,	/* mapping for a fake kernel file that shouldn't
 				   account usage to user cgroups */
 	AS_NO_DATA_INTEGRITY = 11, /* no data integrity guarantees */
+	AS_NO_DIRECT_MAP = 12,	/* Folios in the mapping are not in the direct map */
 	/* Bits 16-25 are used for FOLIO_ORDER */
 	AS_FOLIO_ORDER_BITS = 5,
 	AS_FOLIO_ORDER_MIN = 16,
@@ -356,6 +357,21 @@ static inline bool mapping_no_data_integrity(const struct address_space *mapping
 	return test_bit(AS_NO_DATA_INTEGRITY, &mapping->flags);
 }
 
+static inline void mapping_set_no_direct_map(struct address_space *mapping)
+{
+	set_bit(AS_NO_DIRECT_MAP, &mapping->flags);
+}
+
+static inline bool mapping_no_direct_map(const struct address_space *mapping)
+{
+	return test_bit(AS_NO_DIRECT_MAP, &mapping->flags);
+}
+
+static inline bool vma_has_no_direct_map(const struct vm_area_struct *vma)
+{
+	return vma->vm_file && mapping_no_direct_map(vma->vm_file->f_mapping);
+}
+
 static inline gfp_t mapping_gfp_mask(const struct address_space *mapping)
 {
 	return mapping->gfp_mask;
diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
index e918f96881f5..0ae1fb057b3d 100644
--- a/include/linux/secretmem.h
+++ b/include/linux/secretmem.h
@@ -4,28 +4,10 @@
 
 #ifdef CONFIG_SECRETMEM
 
-extern const struct address_space_operations secretmem_aops;
-
-static inline bool secretmem_mapping(struct address_space *mapping)
-{
-	return mapping->a_ops == &secretmem_aops;
-}
-
-bool vma_is_secretmem(struct vm_area_struct *vma);
 bool secretmem_active(void);
 
 #else
 
-static inline bool vma_is_secretmem(struct vm_area_struct *vma)
-{
-	return false;
-}
-
-static inline bool secretmem_mapping(struct address_space *mapping)
-{
-	return false;
-}
-
 static inline bool secretmem_active(void)
 {
 	return false;
diff --git a/lib/buildid.c b/lib/buildid.c
index c4b737640621..ba79bf28f7e6 100644
--- a/lib/buildid.c
+++ b/lib/buildid.c
@@ -47,6 +47,10 @@ static int freader_get_folio(struct freader *r, loff_t file_off)
 
 	freader_put_folio(r);
 
+	/* reject folios without direct map entries (e.g. from memfd_secret() or guest_memfd()) */
+	if (mapping_no_direct_map(r->file->f_mapping))
+		return -EFAULT;
+
 	/* only use page cache lookup - fail if not already cached */
 	r->folio = filemap_get_folio(r->file->f_mapping, file_off >> PAGE_SHIFT);
 
@@ -87,8 +91,8 @@ const void *freader_fetch(struct freader *r, loff_t file_off, size_t sz)
 		return r->data + file_off;
 	}
 
-	/* reject secretmem folios created with memfd_secret() */
-	if (secretmem_mapping(r->file->f_mapping)) {
+	/* reject folios without direct map entries (e.g. from memfd_secret() or guest_memfd()) */
+	if (mapping_no_direct_map(r->file->f_mapping)) {
 		r->err = -EFAULT;
 		return NULL;
 	}
diff --git a/mm/gup.c b/mm/gup.c
index 869d79c8daa4..a5a753da66aa 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -11,7 +11,6 @@
 #include
 #include
 #include
-#include
 
 #include
 #include
@@ -1216,7 +1215,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
 	if ((gup_flags & FOLL_SPLIT_PMD) && is_vm_hugetlb_page(vma))
 		return -EOPNOTSUPP;
 
-	if (vma_is_secretmem(vma))
+	if (vma_has_no_direct_map(vma))
 		return -EFAULT;
 
 	if (write) {
@@ -2724,7 +2723,7 @@ EXPORT_SYMBOL(get_user_pages_unlocked);
  * This call assumes the caller has pinned the folio, that the lowest page table
  * level still points to this folio, and that interrupts have been disabled.
  *
- * GUP-fast must reject all secretmem folios.
+ * GUP-fast must reject all folios without direct map entries (such as secretmem).
  *
  * Writing to pinned file-backed dirty tracked folios is inherently problematic
  * (see comment describing the writable_file_mapping_allowed() function). We
@@ -2744,7 +2743,7 @@ static bool gup_fast_folio_allowed(struct folio *folio, unsigned int flags)
 	if (WARN_ON_ONCE(folio_test_slab(folio)))
 		return false;
 
-	/* hugetlb neither requires dirty-tracking nor can be secretmem. */
+	/* hugetlb neither requires dirty-tracking nor can be without direct map. */
 	if (folio_test_hugetlb(folio))
 		return true;
 
@@ -2782,7 +2781,7 @@ static bool gup_fast_folio_allowed(struct folio *folio, unsigned int flags)
 	 * At this point, we know the mapping is non-null and points to an
 	 * address_space object.
 	 */
-	if (secretmem_mapping(mapping))
+	if (mapping_no_direct_map(mapping))
 		return false;
 
 	/*
diff --git a/mm/mlock.c b/mm/mlock.c
index 2f699c3497a5..a6f4b3df4f3f 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -474,7 +474,7 @@ static int mlock_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
 
 	if (newflags == oldflags || (oldflags & VM_SPECIAL) ||
 	    is_vm_hugetlb_page(vma) || vma == get_gate_vma(current->mm) ||
-	    vma_is_dax(vma) || vma_is_secretmem(vma) || (oldflags & VM_DROPPABLE))
+	    vma_is_dax(vma) || vma_has_no_direct_map(vma) || (oldflags & VM_DROPPABLE))
 		/* don't set VM_LOCKED or VM_LOCKONFAULT and don't count */
 		goto out;
 
diff --git a/mm/secretmem.c b/mm/secretmem.c
index 27b176af8fc4..d32e1be1eb35 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -129,11 +129,6 @@ static int secretmem_mmap_prepare(struct vm_area_desc *desc)
 	return 0;
 }
 
-bool vma_is_secretmem(struct vm_area_struct *vma)
-{
-	return vma->vm_ops == &secretmem_vm_ops;
-}
-
 static const struct file_operations secretmem_fops = {
 	.release = secretmem_release,
 	.mmap_prepare = secretmem_mmap_prepare,
@@ -151,7 +146,7 @@ static void secretmem_free_folio(struct folio *folio)
 	folio_zero_segment(folio, 0, folio_size(folio));
 }
 
-const struct address_space_operations secretmem_aops = {
+static const struct address_space_operations secretmem_aops = {
 	.dirty_folio = noop_dirty_folio,
 	.free_folio = secretmem_free_folio,
 	.migrate_folio = secretmem_migrate_folio,
@@ -200,6 +195,7 @@ static struct file *secretmem_file_create(unsigned long flags)
 
 	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
 	mapping_set_unevictable(inode->i_mapping);
+	mapping_set_no_direct_map(inode->i_mapping);
 
 	inode->i_op = &secretmem_iops;
 	inode->i_mapping->a_ops = &secretmem_aops;
-- 
2.50.1
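For readers who want to poke at the helper semantics outside a kernel tree, the following is a minimal userspace sketch, not kernel code. It assumes heavily simplified stand-ins for struct address_space, struct file and struct vm_area_struct, replaces the atomic set_bit()/test_bit() with plain bit operations, and introduces a hypothetical model_check_vma() and MODEL_EFAULT constant that only imitate the shape of the check_vma_flags() test in the patch. Only the bit number (12) and the three helper names are taken from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit number as introduced by the patch in enum mapping_flags. */
enum { AS_NO_DIRECT_MAP = 12 };

/* Simplified stand-ins (assumption): the real kernel structs are far larger. */
struct address_space { unsigned long flags; };
struct file { struct address_space *f_mapping; };
struct vm_area_struct { struct file *vm_file; };

/* Non-atomic stand-in for set_bit(AS_NO_DIRECT_MAP, &mapping->flags). */
static inline void mapping_set_no_direct_map(struct address_space *mapping)
{
	mapping->flags |= 1UL << AS_NO_DIRECT_MAP;
}

/* Non-atomic stand-in for test_bit(AS_NO_DIRECT_MAP, &mapping->flags). */
static inline bool mapping_no_direct_map(const struct address_space *mapping)
{
	return (mapping->flags & (1UL << AS_NO_DIRECT_MAP)) != 0;
}

/* Anonymous VMAs have no vm_file, so they never match. */
static inline bool vma_has_no_direct_map(const struct vm_area_struct *vma)
{
	return vma->vm_file && mapping_no_direct_map(vma->vm_file->f_mapping);
}

/* Hypothetical constant for this model; it mirrors the usual EFAULT value. */
#define MODEL_EFAULT 14

/*
 * Hypothetical caller imitating the shape of the GUP check: refuse any
 * VMA whose backing mapping has no direct map entries.
 */
static inline int model_check_vma(const struct vm_area_struct *vma)
{
	assert(vma != NULL);
	return vma_has_no_direct_map(vma) ? -MODEL_EFAULT : 0;
}
```

In this model, a file-backed VMA is accepted until mapping_set_no_direct_map() is called on its mapping, after which model_check_vma() returns -MODEL_EFAULT; a VMA with a NULL vm_file is always accepted, which matches the vm_file check in vma_has_no_direct_map().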