Message-ID: <9b55acfa-688e-49da-9599-f35aee351e3d@intel.com>
Date: Thu, 19 Jun 2025 17:18:44 +0800
Subject: Re: [RFC PATCH v2 00/51] 1G page support for guest_memfd
From: Xiaoyao Li
To: Yan Zhao, Ackerley Tng
Cc: kvm@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 x86@kernel.org, linux-fsdevel@vger.kernel.org, aik@amd.com,
 ajones@ventanamicro.com, akpm@linux-foundation.org, amoorthy@google.com,
 anthony.yznaga@oracle.com, anup@brainfault.org, aou@eecs.berkeley.edu,
 bfoster@redhat.com, binbin.wu@linux.intel.com, brauner@kernel.org,
 catalin.marinas@arm.com, chao.p.peng@intel.com, chenhuacai@kernel.org,
 dave.hansen@intel.com, david@redhat.com, dmatlack@google.com,
 dwmw@amazon.co.uk, erdemaktas@google.com, fan.du@intel.com, fvdl@google.com,
 graf@amazon.com, haibo1.xu@intel.com, hch@infradead.org, hughd@google.com,
 ira.weiny@intel.com, isaku.yamahata@intel.com, jack@suse.cz,
 james.morse@arm.com, jarkko@kernel.org, jgg@ziepe.ca, jgowans@amazon.com,
 jhubbard@nvidia.com, jroedel@suse.de, jthoughton@google.com,
 jun.miao@intel.com, kai.huang@intel.com, keirf@google.com,
 kent.overstreet@linux.dev, kirill.shutemov@intel.com,
 liam.merwick@oracle.com, maciej.wieczor-retman@intel.com,
 mail@maciej.szmigiero.name, maz@kernel.org, mic@digikod.net,
 michael.roth@amd.com, mpe@ellerman.id.au, muchun.song@linux.dev,
 nikunj@amd.com, nsaenz@amazon.es, oliver.upton@linux.dev,
 palmer@dabbelt.com, pankaj.gupta@amd.com, paul.walmsley@sifive.com,
 pbonzini@redhat.com, pdurrant@amazon.co.uk, peterx@redhat.com,
 pgonda@google.com, pvorel@suse.cz, qperret@google.com,
 quic_cvanscha@quicinc.com, quic_eberman@quicinc.com,
 quic_mnalajal@quicinc.com, quic_pderrin@quicinc.com,
 quic_pheragu@quicinc.com, quic_svaddagi@quicinc.com, quic_tsoni@quicinc.com,
 richard.weiyang@gmail.com, rick.p.edgecombe@intel.com, rientjes@google.com,
 roypat@amazon.co.uk, rppt@kernel.org, seanjc@google.com, shuah@kernel.org,
 steven.price@arm.com, steven.sistare@oracle.com, suzuki.poulose@arm.com,
 tabba@google.com, thomas.lendacky@amd.com, usama.arif@bytedance.com,
 vannapurve@google.com, vbabka@suse.cz, viro@zeniv.linux.org.uk,
 vkuznets@redhat.com, wei.w.wang@intel.com, will@kernel.org,
 willy@infradead.org, yilun.xu@intel.com, yuzenghui@huawei.com,
 zhiquan1.li@intel.com

On 6/19/2025 4:59 PM, Xiaoyao Li wrote:
> On 6/19/2025 4:13 PM, Yan Zhao wrote:
>> On Wed, May 14, 2025 at 04:41:39PM -0700, Ackerley Tng wrote:
>>> Hello,
>>>
>>> This patchset builds upon discussion at LPC 2024 and many guest_memfd
>>> upstream calls to provide 1G page support for guest_memfd by taking
>>> pages from HugeTLB.
>>>
>>> This patchset is based on Linux v6.15-rc6, and requires the mmap support
>>> for guest_memfd patchset (Thanks Fuad!) [1].
>>>
>>> For ease of testing, this series is also available, stitched together,
>>> at https://github.com/googleprodkernel/linux-cc/tree/gmem-1g-page-support-rfc-v2
>> Just to record a found issue -- not one that must be fixed.
>>
>> In TDX, the initial memory region is added as private memory during TD's
>> build time, with its initial content copied from source pages in shared
>> memory. The copy operation requires simultaneous access to both shared
>> source memory and private target memory.
>>
>> Therefore, userspace cannot store the initial content in shared memory at
>> the mmap-ed VA of a guest_memfd that performs in-place conversion between
>> shared and private memory. This is because the guest_memfd will first
>> unmap a PFN in shared page tables and then check for any extra refcount
>> held for the shared PFN before converting it to private.
>
> I have an idea.
>
> If I understand correctly, the KVM_GMEM_CONVERT_PRIVATE of in-place
> conversion unmaps the PFN in shared page tables while keeping the content
> of the page unchanged, right?
>
> So KVM_GMEM_CONVERT_PRIVATE can actually be used to initialize the private
> memory for the non-CoCo case: userspace first mmap()s it, ensures it's
> shared, and writes the initial content to it; after that, userspace
> converts it to private with KVM_GMEM_CONVERT_PRIVATE.
>
> For the CoCo case, like TDX, it can hook into KVM_GMEM_CONVERT_PRIVATE if
> it wants the private memory to be initialized with initial content, and
> just do in-place TDH.PAGE.ADD in the hook.

And maybe add a new flag to KVM_GMEM_CONVERT_PRIVATE so that userspace can
explicitly request that the page range be converted to private with its
content retained. That way TDX can identify which cases need to call the
in-place TDH.PAGE.ADD.
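
To make the proposed flow concrete, here is a rough userspace toy model, not
kernel code. Every name in it (toy_page, TOY_CONVERT_PRESERVE,
toy_convert_private, toy_tdx_hook) is hypothetical and made up for
illustration; none of it comes from the series, and the real ioctl argument
layout is not modeled. It only assumes what is described above: conversion
unmaps the page from the shared side, an optional arch hook runs, and a flag
tells the hook whether the content must be retained. Dropping the content
when the flag is absent is modeled as zeroing, which is also just an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

/* Hypothetical flag: keep the page content across shared->private conversion. */
#define TOY_CONVERT_PRESERVE (1u << 0)

enum toy_state { TOY_SHARED, TOY_PRIVATE };

struct toy_page {
	enum toy_state state;
	bool mapped_shared;   /* still present in the "shared page tables" */
	bool added_in_place;  /* stand-in for an in-place TDH.PAGE.ADD */
	uint8_t data[TOY_PAGE_SIZE];
};

/* Hypothetical arch hook invoked during conversion; returns 0 on success. */
typedef int (*toy_convert_hook)(struct toy_page *page, uint32_t flags);

/* Toy TDX hook: only do the in-place add when userspace asked to keep content. */
static int toy_tdx_hook(struct toy_page *page, uint32_t flags)
{
	if (flags & TOY_CONVERT_PRESERVE)
		page->added_in_place = true;
	else
		memset(page->data, 0, sizeof(page->data)); /* assumption: content dropped */
	return 0;
}

/*
 * Toy conversion: unmap from the shared side, run the optional hook, then
 * mark the page private. With no hook (non-CoCo case) the content is simply
 * left unchanged.
 */
static int toy_convert_private(struct toy_page *page, uint32_t flags,
			       toy_convert_hook hook)
{
	assert(page != NULL);

	if (page->state != TOY_SHARED)
		return -1;

	page->mapped_shared = false;

	if (hook) {
		int ret = hook(page, flags);

		if (ret)
			return ret;
	}

	page->state = TOY_PRIVATE;
	return 0;
}
```

In this model, userspace would fill data[] while the page is shared and then
call toy_convert_private() with TOY_CONVERT_PRESERVE; without the flag the
TDX hook skips the in-place add, which is the distinction the new flag is
meant to give TDX.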