Date: Wed, 14 May 2025 16:41:39 -0700
Subject: [RFC PATCH v2 00/51] 1G page support for guest_memfd
From: Ackerley Tng
To: kvm@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    x86@kernel.org, linux-fsdevel@vger.kernel.org
Cc: ackerleytng@google.com, aik@amd.com, ajones@ventanamicro.com,
    akpm@linux-foundation.org, amoorthy@google.com, anthony.yznaga@oracle.com,
    anup@brainfault.org, aou@eecs.berkeley.edu, bfoster@redhat.com,
    binbin.wu@linux.intel.com, brauner@kernel.org, catalin.marinas@arm.com,
    chao.p.peng@intel.com, chenhuacai@kernel.org, dave.hansen@intel.com,
    david@redhat.com, dmatlack@google.com, dwmw@amazon.co.uk,
    erdemaktas@google.com, fan.du@intel.com, fvdl@google.com, graf@amazon.com,
    haibo1.xu@intel.com, hch@infradead.org, hughd@google.com,
    ira.weiny@intel.com, isaku.yamahata@intel.com, jack@suse.cz,
    james.morse@arm.com, jarkko@kernel.org, jgg@ziepe.ca, jgowans@amazon.com,
    jhubbard@nvidia.com, jroedel@suse.de, jthoughton@google.com,
    jun.miao@intel.com, kai.huang@intel.com, keirf@google.com,
    kent.overstreet@linux.dev, kirill.shutemov@intel.com,
    liam.merwick@oracle.com, maciej.wieczor-retman@intel.com,
    mail@maciej.szmigiero.name, maz@kernel.org, mic@digikod.net,
    michael.roth@amd.com, mpe@ellerman.id.au, muchun.song@linux.dev,
    nikunj@amd.com, nsaenz@amazon.es, oliver.upton@linux.dev,
    palmer@dabbelt.com, pankaj.gupta@amd.com, paul.walmsley@sifive.com,
    pbonzini@redhat.com, pdurrant@amazon.co.uk, peterx@redhat.com,
    pgonda@google.com, pvorel@suse.cz, qperret@google.com,
    quic_cvanscha@quicinc.com, quic_eberman@quicinc.com,
    quic_mnalajal@quicinc.com, quic_pderrin@quicinc.com,
    quic_pheragu@quicinc.com, quic_svaddagi@quicinc.com,
    quic_tsoni@quicinc.com, richard.weiyang@gmail.com,
    rick.p.edgecombe@intel.com, rientjes@google.com, roypat@amazon.co.uk,
    rppt@kernel.org, seanjc@google.com, shuah@kernel.org,
    steven.price@arm.com, steven.sistare@oracle.com, suzuki.poulose@arm.com,
    tabba@google.com, thomas.lendacky@amd.com, usama.arif@bytedance.com,
    vannapurve@google.com, vbabka@suse.cz, viro@zeniv.linux.org.uk,
    vkuznets@redhat.com, wei.w.wang@intel.com, will@kernel.org,
    willy@infradead.org, xiaoyao.li@intel.com, yan.y.zhao@intel.com,
    yilun.xu@intel.com, yuzenghui@huawei.com, zhiquan1.li@intel.com

Hello,

This patchset builds upon discussion at LPC 2024 and many guest_memfd
upstream calls to provide 1G page support for guest_memfd by taking
pages from HugeTLB.

This patchset is based on Linux v6.15-rc6, and requires the mmap support
for guest_memfd patchset (Thanks Fuad!) [1].

For ease of testing, this series is also available, stitched together,
at https://github.com/googleprodkernel/linux-cc/tree/gmem-1g-page-support-rfc-v2

This patchset can be divided into two sections:

(a) Patches from the beginning up to and including "KVM: selftests:
    Update script to map shared memory from guest_memfd" are a modified
    version of "conversion support for guest_memfd", which Fuad is
    managing [2].

(b) Patches after "KVM: selftests: Update script to map shared memory
    from guest_memfd" till the end are patches that actually bring in 1G
    page support for guest_memfd.

These are the significant differences between (a) and [2]:

+ [2] uses an xarray to track shareability, but I used a maple tree
  because for 1G pages, iterating pagewise to update shareability was
  prohibitively slow even for testing. I was choosing from among
  multi-index xarrays, interval trees and maple trees [3], and picked
  maple trees because
    + Maple trees were easier to figure out since I didn't have to
      compute the correct multi-index order and handle edge cases if the
      converted range wasn't a neat power of 2.
    + Maple trees were easier to figure out as compared to updating
      parts of a multi-index xarray.
    + Maple trees had an easier API to use than interval trees.
+ [2] doesn't yet have a conversion ioctl, but I needed it to test 1G
  support end-to-end.
+ (a) removes guest_memfd from participating in LRU, which I needed to
  get conversion selftests to work as expected, since participation in
  LRU was causing some unexpected refcounts on folios, which was
  blocking conversions.
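To illustrate why a range-based structure helps with the shareability tracking described above, here is a minimal userspace sketch. It is not the code from this series. It models shareability as a sorted list of inclusive [first, last] index ranges, so converting an arbitrary range (not a power of 2 in size) is a single store instead of one update per 4K index. The names range_map, range_map_init(), range_map_store(), range_map_load() and both state names are hypothetical; in the kernel, the maple tree's mtree_store_range() and mtree_load() provide the analogous range store and lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical states: guest-only, or host may also fault the index. */
enum share_state { SHARE_GUEST_ONLY, SHARE_HOST_ALLOWED };

struct range {
	unsigned long first, last;	/* inclusive page indices */
	enum share_state state;
};

#define MAX_RANGES 64

struct range_map {
	struct range r[MAX_RANGES];	/* sorted, non-overlapping */
	size_t n;
};

/* Start with every index in [0, nr_pages) marked guest-only. */
static void range_map_init(struct range_map *m, unsigned long nr_pages)
{
	assert(nr_pages > 0);
	m->r[0] = (struct range){ 0, nr_pages - 1, SHARE_GUEST_ONLY };
	m->n = 1;
}

/* Set [first, last] to state. Returns 0, or -1 if the map is full. */
static int range_map_store(struct range_map *m, unsigned long first,
			   unsigned long last, enum share_state state)
{
	struct range out[MAX_RANGES + 2];
	struct range ins = { first, last, state };
	size_t n = 0, k = 0, i;
	int placed = 0;

	assert(first <= last);
	for (i = 0; i < m->n; i++) {
		struct range cur = m->r[i];

		if (cur.last >= first && cur.first <= last) {
			/* Overlap: keep only the parts outside [first, last]. */
			if (cur.first < first)
				out[n++] = (struct range){ cur.first, first - 1, cur.state };
			if (!placed) {
				out[n++] = ins;
				placed = 1;
			}
			if (cur.last > last)
				out[n++] = (struct range){ last + 1, cur.last, cur.state };
			continue;
		}
		if (!placed && cur.first > last) {
			out[n++] = ins;
			placed = 1;
		}
		out[n++] = cur;
	}
	if (!placed)
		out[n++] = ins;

	/* Merge neighbours that touch and share the same state. */
	for (i = 0; i < n; i++) {
		if (k > 0 && out[k - 1].state == out[i].state &&
		    out[k - 1].last + 1 == out[i].first)
			out[k - 1].last = out[i].last;
		else
			out[k++] = out[i];
	}
	if (k > MAX_RANGES)
		return -1;
	memcpy(m->r, out, k * sizeof(out[0]));
	m->n = k;
	return 0;
}

/* Look up the state of one index; unknown indices read as guest-only. */
static enum share_state range_map_load(const struct range_map *m,
				       unsigned long index)
{
	size_t i;

	for (i = 0; i < m->n; i++)
		if (index >= m->r[i].first && index <= m->r[i].last)
			return m->r[i].state;
	return SHARE_GUEST_ONLY;
}
```

With a 1G page modelled as 262144 indices of 4K each, storing the 1000-index range [5, 1004] leaves three entries in the map rather than touching 1000 per-index slots, and converting the same range back merges everything into one entry again.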
I am sending (a) in emails as well, as opposed to just leaving it on
GitHub, so that we can discuss by commenting inline on emails. If you'd
like to just look at 1G page support, here are some key takeaways from
the first section (a):

+ If GUEST_MEMFD_FLAG_SUPPORT_SHARED is requested during guest_memfd
  creation, guest_memfd will
    + Track shareability (whether an index in the inode is guest-only or
      if the host is allowed to fault memory at a given index).
    + Always be used for guest faults - specifically, kvm_gmem_get_pfn()
      will be used to provide pages for the guest.
    + Always be used by KVM to check private/shared status of a gfn.
+ guest_memfd now has conversion ioctls, allowing conversion to
  private/shared
    + Conversion can fail if there are unexpected refcounts on any
      folios in the range.

Focusing on (b) 1G page support, here's an overview:

1. A bunch of refactoring patches for HugeTLB that isolate the
   allocation of a HugeTLB folio from other HugeTLB concepts such as
   VMA-level reservations, and HugeTLBfs-specific concepts, such as
   where memory policy is stored in the VMA, or where the subpool is
   stored on the inode.
2. A few patches that add a guestmem_hugetlb allocator within mm/. The
   guestmem_hugetlb allocator is a wrapper around HugeTLB to modularize
   the memory management functions, and to cleanly handle cleanup, so
   that folio cleanup can happen after the guest_memfd inode (and even
   KVM) goes away.
3. Some updates to guest_memfd to use the guestmem_hugetlb allocator.
4. Selftests for 1G page support.

Here are some remaining issues/TODOs:

1. Memory error handling, such as for machine check errors, has not
   been implemented.
2. I've not looked into preparedness of pages; only zeroing has been
   considered.
3. When allocating HugeTLB pages, if two threads allocate indices
   mapping to the same huge page, the utilization in guest_memfd inode's
   subpool may momentarily go over the subpool limit (the requested size
   of the inode at guest_memfd creation time), causing one of the two
   threads to get -ENOMEM. Suggestions to solve this are appreciated!
4. max_usage_in_bytes statistic (cgroups v1) for guest_memfd HugeTLB
   pages should be correct but needs testing and could be wrong.
5. memcg charging (charge_memcg()) for cgroups v2 for guest_memfd
   HugeTLB pages after splitting should be correct but needs testing and
   could be wrong.
6. Page cache accounting: When a hugetlb page is split, guest_memfd will
   incur page count in both NR_HUGETLB (counted at hugetlb allocation
   time) and NR_FILE_PAGES stats (counted when split pages are added to
   the filemap). Is this aligned with what people expect?

Here are some optimizations that could be explored in future series:

1. Pages could be split from 1G to 2M first and only split to 4K if
   necessary.
2. Zeroing could be skipped for CoCo VMs if hardware already zeroes the
   pages.

Here's RFC v1 [4] if you're interested in the motivation behind choosing
HugeTLB, or the history of this patch series.
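As a rough illustration of the first optimization listed above (splitting 1G to 2M first, and to 4K only where necessary), the sketch below counts how many folios result from each strategy when a byte range inside one 1G page is converted. It is arithmetic only, not code from this series. It assumes, as a simplification, that a 2M chunk needs 4K granularity only when the converted range covers it partially. The names split_count, split_direct() and split_two_level() are hypothetical.

```c
#include <assert.h>

#define SZ_4K (1UL << 12)
#define SZ_2M (1UL << 21)
#define SZ_1G (1UL << 30)

struct split_count {
	unsigned long nr_2m;	/* folios left at 2M */
	unsigned long nr_4k;	/* folios split down to 4K */
};

/* Splitting the whole 1G page straight to 4K. */
static struct split_count split_direct(void)
{
	return (struct split_count){ 0, SZ_1G / SZ_4K };
}

/*
 * Splitting to 2M first, then to 4K only for 2M chunks that the
 * converted range [offset, offset + len) covers partially.
 * offset and len are in bytes, relative to the start of the 1G page.
 */
static struct split_count split_two_level(unsigned long offset,
					  unsigned long len)
{
	unsigned long end = offset + len;	/* exclusive */
	unsigned long head_chunk, tail_chunk, partial;
	int head_partial, tail_partial;

	assert(len > 0 && end <= SZ_1G);
	assert(offset % SZ_4K == 0 && len % SZ_4K == 0);

	head_chunk = offset / SZ_2M;
	tail_chunk = (end - 1) / SZ_2M;
	head_partial = offset % SZ_2M != 0;
	tail_partial = end % SZ_2M != 0;

	if (head_chunk == tail_chunk)
		partial = (head_partial || tail_partial) ? 1 : 0;
	else
		partial = (unsigned long)head_partial + tail_partial;

	return (struct split_count){
		.nr_2m = SZ_1G / SZ_2M - partial,
		.nr_4k = partial * (SZ_2M / SZ_4K),
	};
}
```

Under these assumptions, converting a single 4K page at offset 0 yields 511 folios of 2M plus 512 folios of 4K, compared with 262144 folios of 4K for the direct split. A 2M-aligned, 2M-sized conversion needs no 4K folios at all.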

[1] https://lore.kernel.org/all/20250513163438.3942405-11-tabba@google.com/T/
[2] https://lore.kernel.org/all/20250328153133.3504118-1-tabba@google.com/T/
[3] https://lore.kernel.org/all/diqzzfih8q7r.fsf@ackerleytng-ctop.c.googlers.com/
[4] https://lore.kernel.org/all/cover.1726009989.git.ackerleytng@google.com/T/

---

Ackerley Tng (49):
  KVM: guest_memfd: Make guest mem use guest mem inodes instead of
    anonymous inodes
  KVM: guest_memfd: Introduce and use shareability to guard faulting
  KVM: selftests: Update guest_memfd_test for INIT_PRIVATE flag
  KVM: guest_memfd: Introduce KVM_GMEM_CONVERT_SHARED/PRIVATE ioctls
  KVM: guest_memfd: Skip LRU for guest_memfd folios
  KVM: Query guest_memfd for private/shared status
  KVM: guest_memfd: Add CAP KVM_CAP_GMEM_CONVERSION
  KVM: selftests: Test flag validity after guest_memfd supports
    conversions
  KVM: selftests: Test faulting with respect to
    GUEST_MEMFD_FLAG_INIT_PRIVATE
  KVM: selftests: Refactor vm_mem_add to be more flexible
  KVM: selftests: Allow cleanup of ucall_pool from host
  KVM: selftests: Test conversion flows for guest_memfd
  KVM: selftests: Add script to exercise private_mem_conversions_test
  KVM: selftests: Update private_mem_conversions_test to mmap
    guest_memfd
  KVM: selftests: Update script to map shared memory from guest_memfd
  mm: hugetlb: Consolidate interpretation of gbl_chg within
    alloc_hugetlb_folio()
  mm: hugetlb: Cleanup interpretation of gbl_chg in
    alloc_hugetlb_folio()
  mm: hugetlb: Cleanup interpretation of map_chg_state within
    alloc_hugetlb_folio()
  mm: hugetlb: Rename alloc_surplus_hugetlb_folio
  mm: mempolicy: Refactor out policy_node_nodemask()
  mm: hugetlb: Inline huge_node() into callers
  mm: hugetlb: Refactor hugetlb allocation functions
  mm: hugetlb: Refactor out hugetlb_alloc_folio()
  mm: hugetlb: Add option to create new subpool without using surplus
  mm: truncate: Expose preparation steps for
    truncate_inode_pages_final
  mm: hugetlb: Expose hugetlb_subpool_{get,put}_pages()
  mm: Introduce guestmem_hugetlb to support folio_put() handling of
    guestmem pages
  mm: guestmem_hugetlb: Wrap HugeTLB as an allocator for guest_memfd
  mm: truncate: Expose truncate_inode_folio()
  KVM: x86: Set disallow_lpage on base_gfn and guest_memfd pgoff
    misalignment
  KVM: guest_memfd: Support guestmem_hugetlb as custom allocator
  KVM: guest_memfd: Allocate and truncate from custom allocator
  mm: hugetlb: Add functions to add/delete folio from hugetlb lists
  mm: guestmem_hugetlb: Add support for splitting and merging pages
  mm: Convert split_folio() macro to function
  KVM: guest_memfd: Split allocator pages for guest_memfd use
  KVM: guest_memfd: Merge and truncate on fallocate(PUNCH_HOLE)
  KVM: guest_memfd: Update kvm_gmem_mapping_order to account for page
    status
  KVM: Add CAP to indicate support for HugeTLB as custom allocator
  KVM: selftests: Add basic selftests for hugetlb-backed guest_memfd
  KVM: selftests: Update conversion flows test for HugeTLB
  KVM: selftests: Test truncation paths of guest_memfd
  KVM: selftests: Test allocation and conversion of subfolios
  KVM: selftests: Test that guest_memfd usage is reported via hugetlb
  KVM: selftests: Support various types of backing sources for private
    memory
  KVM: selftests: Update test for various private memory backing source
    types
  KVM: selftests: Update private_mem_conversions_test.sh to test with
    HugeTLB pages
  KVM: selftests: Add script to test HugeTLB statistics
  KVM: selftests: Test guest_memfd for accuracy of st_blocks

Elliot Berman (1):
  filemap: Pass address_space mapping to ->free_folio()

Fuad Tabba (1):
  mm: Consolidate freeing of typed folios on final folio_put()

 Documentation/filesystems/locking.rst         |    2 +-
 Documentation/filesystems/vfs.rst             |   15 +-
 Documentation/virt/kvm/api.rst                |    5 +
 arch/arm64/include/asm/kvm_host.h             |    5 -
 arch/x86/include/asm/kvm_host.h               |   10 -
 arch/x86/kvm/x86.c                            |   53 +-
 fs/hugetlbfs/inode.c                          |    2 +-
 fs/nfs/dir.c                                  |    9 +-
 fs/orangefs/inode.c                           |    3 +-
 include/linux/fs.h                            |    2 +-
 include/linux/guestmem.h                      |   23 +
 include/linux/huge_mm.h                       |    6 +-
 include/linux/hugetlb.h                       |   19 +-
 include/linux/kvm_host.h                      |   32 +-
 include/linux/mempolicy.h                     |   11 +-
 include/linux/mm.h                            |    2 +
 include/linux/page-flags.h                    |   32 +
 include/uapi/linux/guestmem.h                 |   29 +
 include/uapi/linux/kvm.h                      |   16 +
 include/uapi/linux/magic.h                    |    1 +
 mm/Kconfig                                    |   13 +
 mm/Makefile                                   |    1 +
 mm/debug.c                                    |    1 +
 mm/filemap.c                                  |   12 +-
 mm/guestmem_hugetlb.c                         |  512 +++++
 mm/guestmem_hugetlb.h                         |    9 +
 mm/hugetlb.c                                  |  488 ++---
 mm/internal.h                                 |    1 -
 mm/memcontrol.c                               |    2 +
 mm/memory.c                                   |    1 +
 mm/mempolicy.c                                |   44 +-
 mm/secretmem.c                                |    3 +-
 mm/swap.c                                     |   32 +-
 mm/truncate.c                                 |   27 +-
 mm/vmscan.c                                   |    4 +-
 tools/testing/selftests/kvm/Makefile.kvm      |    2 +
 .../kvm/guest_memfd_conversions_test.c        |  797 ++++++++
 .../kvm/guest_memfd_hugetlb_reporting_test.c  |  384 ++++
 ...uest_memfd_provide_hugetlb_cgroup_mount.sh |   36 +
 .../testing/selftests/kvm/guest_memfd_test.c  |  293 ++-
 ...memfd_wrap_test_check_hugetlb_reporting.sh |   95 +
 .../testing/selftests/kvm/include/kvm_util.h  |  104 +-
 .../testing/selftests/kvm/include/test_util.h |   20 +-
 .../selftests/kvm/include/ucall_common.h      |    1 +
 tools/testing/selftests/kvm/lib/kvm_util.c    |  465 +++--
 tools/testing/selftests/kvm/lib/test_util.c   |  102 +
 .../testing/selftests/kvm/lib/ucall_common.c  |   16 +-
 .../kvm/x86/private_mem_conversions_test.c    |  195 +-
 .../kvm/x86/private_mem_conversions_test.sh   |  100 +
 virt/kvm/Kconfig                              |    5 +
 virt/kvm/guest_memfd.c                        | 1655 ++++++++++++++++-
 virt/kvm/kvm_main.c                           |   14 +-
 virt/kvm/kvm_mm.h                             |    9 +-
 53 files changed, 5080 insertions(+), 640 deletions(-)
 create mode 100644 include/linux/guestmem.h
 create mode 100644 include/uapi/linux/guestmem.h
 create mode 100644 mm/guestmem_hugetlb.c
 create mode 100644 mm/guestmem_hugetlb.h
 create mode 100644 tools/testing/selftests/kvm/guest_memfd_conversions_test.c
 create mode 100644 tools/testing/selftests/kvm/guest_memfd_hugetlb_reporting_test.c
 create mode 100755 tools/testing/selftests/kvm/guest_memfd_provide_hugetlb_cgroup_mount.sh
 create mode 100755 tools/testing/selftests/kvm/guest_memfd_wrap_test_check_hugetlb_reporting.sh
 create mode 100755 tools/testing/selftests/kvm/x86/private_mem_conversions_test.sh

--
2.49.0.1045.g170613ef41-goog