From: Sean Christopherson
To: Vishal Annapurve
Cc: Xiaoyao Li, Fuad Tabba, kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	linux-mm@kvack.org, kvmarm@lists.linux.dev, pbonzini@redhat.com,
	chenhuacai@kernel.org, mpe@ellerman.id.au, anup@brainfault.org,
	paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
	viro@zeniv.linux.org.uk, brauner@kernel.org, willy@infradead.org,
	akpm@linux-foundation.org, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
	jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
	isaku.yamahata@intel.com, mic@digikod.net, vbabka@suse.cz,
	ackerleytng@google.com, mail@maciej.szmigiero.name, david@redhat.com,
	michael.roth@amd.com, wei.w.wang@intel.com, liam.merwick@oracle.com,
	isaku.yamahata@gmail.com, kirill.shutemov@linux.intel.com,
	suzuki.poulose@arm.com, steven.price@arm.com, quic_eberman@quicinc.com,
	quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com,
	quic_cvanscha@quicinc.com, quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
	catalin.marinas@arm.com, james.morse@arm.com, yuzenghui@huawei.com,
	oliver.upton@linux.dev, maz@kernel.org, will@kernel.org, qperret@google.com,
	keirf@google.com, roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org,
	jgg@nvidia.com, rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com,
	hughd@google.com, jthoughton@google.com, peterx@redhat.com,
	pankaj.gupta@amd.com, ira.weiny@intel.com
Subject: Re: [PATCH v15 14/21] KVM: x86: Enable guest_memfd mmap for default VM type
Date: Mon, 21 Jul 2025 07:42:14 -0700
References: <20250717162731.446579-1-tabba@google.com>
	<20250717162731.446579-15-tabba@google.com>
	<505a30a3-4c55-434c-86a5-f86d2e9dc78a@intel.com>

On Mon, Jul 21, 2025, Vishal Annapurve wrote:
> On Mon, Jul 21, 2025 at 5:22 AM Xiaoyao Li wrote:
> >
> > On 7/18/2025 12:27 AM, Fuad Tabba wrote:
> > > +/*
> > > + * CoCo VMs with hardware support that use guest_memfd only for backing private
> > > + * memory, e.g., TDX, cannot use guest_memfd with userspace mapping enabled.
> > > + */
> > > +#define kvm_arch_supports_gmem_mmap(kvm)			\
> > > +	(IS_ENABLED(CONFIG_KVM_GMEM_SUPPORTS_MMAP) &&		\
> > > +	 (kvm)->arch.vm_type == KVM_X86_DEFAULT_VM)
> >
> > I want to share the findings from my POC to enable gmem mmap in QEMU.
> >
> > Actually, QEMU can use gmem with mmap support as the normal memory even
> > without passing the gmem fd to kvm_userspace_memory_region2.guest_memfd
> > on KVM_SET_USER_MEMORY_REGION2.
> >
> > Since the gmem is mmapable, QEMU can pass the userspace addr obtained from
> > mmap() on the gmem fd to kvm_userspace_memory_region(2).userspace_addr.  It
> > works well for non-CoCo VMs on x86.
> >
> > Then it seems feasible to use gmem with mmap for the shared memory of
> > TDX, and an additional gmem without mmap for the private memory.  I.e.,
> > for struct kvm_userspace_memory_region, @userspace_addr is passed
> > the uaddr returned from gmem0 with mmap, while @guest_memfd is
> > passed another gmem1 fd without mmap.
> >
> > However, it actually fails, because kvm_arch_supports_gmem_mmap()
> > returns false for TDX VMs, which means userspace cannot allocate gmem
> > with mmap just for shared memory for TDX.
>
> Why do you want such a usecase to work?
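
[Aside: for concreteness, a rough userspace sketch of the split Xiaoyao describes,
assuming the KVM_CREATE_GUEST_MEMFD / KVM_SET_USER_MEMORY_REGION2 uAPI from
<linux/kvm.h> and the GUEST_MEMFD_FLAG_MMAP flag added by this series; error
handling is elided and the helper is purely illustrative, not QEMU code.]

#include <linux/kvm.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/* One gmem with mmap for shared memory, one without mmap for private memory. */
static int back_slot_with_two_gmems(int vm_fd, uint64_t gpa, uint64_t size)
{
	struct kvm_create_guest_memfd shared_args = { .size = size, .flags = GUEST_MEMFD_FLAG_MMAP };
	struct kvm_create_guest_memfd priv_args   = { .size = size, .flags = 0 };

	/* For TDX this first call is where the series currently says no: the mmap flag is rejected. */
	int shared_fd = ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, &shared_args);
	int priv_fd   = ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, &priv_args);

	/* Shared memory: mmap() gmem0 and hand KVM the resulting uaddr. */
	void *uaddr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, shared_fd, 0);

	struct kvm_userspace_memory_region2 region = {
		.slot = 0,
		.flags = KVM_MEM_GUEST_MEMFD,
		.guest_phys_addr = gpa,
		.memory_size = size,
		.userspace_addr = (uint64_t)(uintptr_t)uaddr,	/* shared faults resolved via gmem0's uaddr */
		.guest_memfd = (uint32_t)priv_fd,		/* private faults resolved via gmem1 directly */
		.guest_memfd_offset = 0,
	};

	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION2, &region);
}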
I'm guessing Xiaoyao was asking an honest question in response to finding a
perceived flaw when trying to get this all working in QEMU.

> If kvm allows mappable guest_memfd files for TDX VMs without
> conversion support, userspace will be able to use those for backing

s/able/unable?

> private memory unless:
> 1) KVM checks at binding time if the guest_memfd passed during memslot
> creation is not a mappable one and doesn't enforce "not mappable"
> requirement for TDX VMs at creation time.

Xiaoyao's question is about "just for shared memory", so this is irrelevant for
the question at hand.

> 2) KVM fetches shared faults through userspace page tables and not
> guest_memfd directly.

This is also irrelevant.  KVM _already_ supports resolving shared faults through
userspace page tables.  That support won't go away, as KVM will always need/want
to support mapping VM_IO and/or VM_PFNMAP memory into the guest (even for TDX).

> I don't see value in trying to go out of way to support such a usecase.

But if/when KVM gains support for tracking shared vs. private in guest_memfd
itself, i.e. when TDX _does_ support mmap() on guest_memfd, KVM won't have to go
out of its way to support using guest_memfd for the @userspace_addr backing store.

Unless I'm missing something, the only thing needed to "support" this scenario is:

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index d01bd7a2c2bd..34403d2f1eeb 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -533,7 +533,7 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
 	u64 flags = args->flags;
 	u64 valid_flags = 0;
 
-	if (kvm_arch_supports_gmem_mmap(kvm))
+	// if (kvm_arch_supports_gmem_mmap(kvm))
 		valid_flags |= GUEST_MEMFD_FLAG_MMAP;
 
 	if (flags & ~valid_flags)

I think the question we actually want to answer is: do we want to go out of our
way to *prevent* such a usecase?  E.g. is there any risk/danger that we need to
mitigate, and would the cost of the mitigation be acceptable?

I think the answer is "no", because preventing userspace from using guest_memfd
as shared-only memory would require resolving the VMA during hva_to_pfn() in
order to fully prevent such behavior, and I definitely don't want to take
mmap_lock around hva_to_pfn_fast().

I don't see any obvious danger lurking.  KVM's pre-guest_memfd memory management
scheme is all about effectively making KVM behave like "just another" userspace
agent.  E.g. if/when TDX/SNP support comes along, guest_memfd must not allow
mapping private memory into userspace regardless of what KVM supports for page
faults.

So unless I'm missing something, for now we do nothing, and let this support
come along naturally once TDX supports mmap() on guest_memfd.
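
[Aside: a hypothetical sketch of the enforcement being argued against, i.e.
resolving the VMA for the hva to reject shared-only guest_memfd backing.  The
is_guest_memfd_file() helper is invented for illustration; none of this is
proposed code, it only shows why mmap_lock would land in the fault path.]

/* Hypothetical sketch only -- NOT proposed code. */
static bool hva_backed_by_gmem(unsigned long hva)
{
	struct vm_area_struct *vma;
	bool backed = false;

	/*
	 * Resolving the VMA means taking mmap_lock, defeating the point of
	 * the lock-free hva_to_pfn_fast() path.
	 */
	mmap_read_lock(current->mm);
	vma = vma_lookup(current->mm, hva);
	if (vma && vma->vm_file && is_guest_memfd_file(vma->vm_file))
		backed = true;	/* is_guest_memfd_file() is a made-up helper */
	mmap_read_unlock(current->mm);

	return backed;
}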