Date: Thu, 10 Jul 2025 14:54:49 -0300
From: Jason Gunthorpe
To: Xu Yilun
Cc: Vishal Annapurve, Yan Zhao, Alexey Kardashevskiy, Fuad Tabba,
 Ackerley Tng, kvm@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, x86@kernel.org,
 linux-fsdevel@vger.kernel.org, ajones@ventanamicro.com,
 akpm@linux-foundation.org, amoorthy@google.com,
 anthony.yznaga@oracle.com, anup@brainfault.org, aou@eecs.berkeley.edu,
 bfoster@redhat.com, binbin.wu@linux.intel.com, brauner@kernel.org,
 catalin.marinas@arm.com, chao.p.peng@intel.com,
 chenhuacai@kernel.org, dave.hansen@intel.com, david@redhat.com,
 dmatlack@google.com, dwmw@amazon.co.uk, erdemaktas@google.com,
 fan.du@intel.com, fvdl@google.com, graf@amazon.com,
 haibo1.xu@intel.com, hch@infradead.org, hughd@google.com,
 ira.weiny@intel.com, isaku.yamahata@intel.com, jack@suse.cz,
 james.morse@arm.com, jarkko@kernel.org, jgowans@amazon.com,
 jhubbard@nvidia.com, jroedel@suse.de, jthoughton@google.com,
 jun.miao@intel.com, kai.huang@intel.com, keirf@google.com,
 kent.overstreet@linux.dev, kirill.shutemov@intel.com,
 liam.merwick@oracle.com, maciej.wieczor-retman@intel.com,
 mail@maciej.szmigiero.name, maz@kernel.org, mic@digikod.net,
 michael.roth@amd.com, mpe@ellerman.id.au, muchun.song@linux.dev,
 nikunj@amd.com, nsaenz@amazon.es, oliver.upton@linux.dev,
 palmer@dabbelt.com, pankaj.gupta@amd.com, paul.walmsley@sifive.com,
 pbonzini@redhat.com, pdurrant@amazon.co.uk, peterx@redhat.com,
 pgonda@google.com, pvorel@suse.cz, qperret@google.com,
 quic_cvanscha@quicinc.com, quic_eberman@quicinc.com,
 quic_mnalajal@quicinc.com, quic_pderrin@quicinc.com,
 quic_pheragu@quicinc.com, quic_svaddagi@quicinc.com,
 quic_tsoni@quicinc.com, richard.weiyang@gmail.com,
 rick.p.edgecombe@intel.com, rientjes@google.com, roypat@amazon.co.uk,
 rppt@kernel.org, seanjc@google.com, shuah@kernel.org,
 steven.price@arm.com, steven.sistare@oracle.com,
 suzuki.poulose@arm.com, thomas.lendacky@amd.com,
 usama.arif@bytedance.com, vbabka@suse.cz, viro@zeniv.linux.org.uk,
 vkuznets@redhat.com, wei.w.wang@intel.com, will@kernel.org,
 willy@infradead.org, xiaoyao.li@intel.com, yilun.xu@intel.com,
 yuzenghui@huawei.com, zhiquan1.li@intel.com
Subject: Re: [RFC PATCH v2 04/51] KVM: guest_memfd: Introduce KVM_GMEM_CONVERT_SHARED/PRIVATE ioctls
Message-ID: <20250710175449.GA1870174@ziepe.ca>
References: <9502503f-e0c2-489e-99b0-94146f9b6f85@amd.com>
 <20250624130811.GB72557@ziepe.ca>
 <20250702141321.GC904431@ziepe.ca>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

On Thu, Jul 10, 2025 at 06:50:09PM +0800, Xu Yilun wrote:
> On Wed, Jul 02, 2025 at 07:32:36AM -0700, Vishal Annapurve wrote:
> > On Wed, Jul 2, 2025 at 7:13 AM Jason Gunthorpe wrote:
> > >
> > > On Wed, Jul 02, 2025 at 06:54:10AM -0700, Vishal Annapurve wrote:
> > > > On Wed, Jul 2, 2025 at 1:38 AM Yan Zhao wrote:
> > > > >
> > > > > On Tue, Jun 24, 2025 at 07:10:38AM -0700, Vishal Annapurve wrote:
> > > > > > On Tue, Jun 24, 2025 at 6:08 AM Jason Gunthorpe wrote:
> > > > > > >
> > > > > > > On Tue, Jun 24, 2025 at 06:23:54PM +1000, Alexey Kardashevskiy wrote:
> > > > > > >
> > > > > > > > Now, I am rebasing my RFC on top of this patchset and it fails in
> > > > > > > > kvm_gmem_has_safe_refcount() as IOMMU holds references to all these
> > > > > > > > folios in my RFC.
> > > > > > > >
> > > > > > > > So what is the expected sequence here? The userspace unmaps a DMA
> > > > > > > > page and maps it back right away, all from the userspace? The end
> > > > > > > > result will be the exactly same which seems useless. And IOMMU TLB
> > > > > >
> > > > > > As Jason described, ideally IOMMU just like KVM, should just:
> > > > > > 1) Directly rely on guest_memfd for pinning -> no page refcounts taken
> > > > > > by IOMMU stack
> > > > > In TDX connect, TDX module and TDs do not trust VMM. So, it's the TDs to inform
> > > > > TDX module about which pages are used by it for DMAs purposes.
> > > > > So, if a page is regarded as pinned by TDs for DMA, the TDX module will fail the
> > > > > unmap of the pages from S-EPT.
> > >
> > > I don't see this as having much to do with iommufd.
> > >
> > > iommufd will somehow support the T=1 iommu inside the TDX module but
> > > it won't have an IOAS for it since the VMM does not control the
> > > translation.
>
> I partially agree with this.
>
> This is still the DMA Silent drop issue for security. The HW (Also
> applicable to AMD/ARM) screams out if the trusted DMA path (IOMMU
> mapping, or access control table like RMP) is changed out of TD's
> expectation. So from HW POV, it is the iommu problem.

I thought the basic idea was that the secure world would sanity check
what the insecure is doing and if it is not OK then it blows up. So if
the DMA fails because the untrusted world revoked sharability when it
should not have then this is correct and expected?

> For SW, if we don't blame iommu, maybe we rephrase as gmemfd can't
> invalidate private pages unless TD agrees.

I think you mean guestmemfd in the kernel cannot autonomously change
'something' unless instructed to explicitly by userspace.

The expectation is the userspace will only give such instructions
based on the VM telling it to do a shared/private change.

If userspace gives an instruction that was not agreed with the guest
then the secure world can police the error and blow up.

> Just to be clear. With In-place conversion, it is not KVM gives pages
> to become secure, it is gmemfd. Or maybe you mean gmemfd is part of KVM.

Yeah, I mean part of.

> > > Obviously in a mode where there is a vPCI device we will need all the
> > > pages to be pinned in the guestmemfd to prevent any kind of
> > > migrations. Only shared/private conversions should change the page
> > > around.
>
> Only *guest permitted* conversion should change the page. I.e only when
> VMM is dealing with the KVM_HC_MAP_GPA_RANGE hypercall. Not sure if we
> could just let QEMU ensure this or KVM/guestmemfd should ensure this.

I think it should not be part of the kernel, no need. From a kernel
perspective userspace has requested a shared/private conversion and if
it wasn't agreed with the VM then it will explode.

Jason
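
The division of responsibility argued for above can be sketched as a
toy model. This is only an illustration and not code from the patch
series: ToyGuestMemfd, ToySecureWorld, SecureWorldFault and every
method name below are hypothetical, and the real TDX module or RMP
checks are far more involved. The point it tries to show is that the
kernel side converts whatever userspace explicitly asks for, and only
the secure side compares the result with what the guest agreed to.

```python
from enum import Enum


class State(Enum):
    SHARED = "shared"
    PRIVATE = "private"


class SecureWorldFault(Exception):
    """Raised when the trusted side sees a state the guest never agreed to."""


class ToyGuestMemfd:
    """Hypothetical stand-in for the kernel side of guest_memfd."""

    def __init__(self, npages):
        self.state = [State.PRIVATE] * npages

    def convert(self, page, new_state):
        # No policy check: the kernel acts on the explicit userspace request
        # and does not try to verify that the guest asked for it.
        self.state[page] = new_state


class ToySecureWorld:
    """Hypothetical stand-in for the trusted checker (TDX module, RMP, ...)."""

    def __init__(self, npages):
        # Per-page state that the guest has agreed to.
        self.agreed = [State.PRIVATE] * npages

    def guest_agrees(self, page, new_state):
        self.agreed[page] = new_state

    def access_check(self, gmem, page):
        # Police the untrusted side: a mismatch makes the access blow up.
        if gmem.state[page] is not self.agreed[page]:
            raise SecureWorldFault(
                f"page {page}: host={gmem.state[page].value}, "
                f"guest={self.agreed[page].value}"
            )
        return True


if __name__ == "__main__":
    gmem = ToyGuestMemfd(2)
    secure = ToySecureWorld(2)

    # Conversion the guest asked for: userspace relays it, access is fine.
    secure.guest_agrees(0, State.SHARED)
    gmem.convert(0, State.SHARED)
    secure.access_check(gmem, 0)

    # Conversion the guest never asked for: the kernel still performs it,
    # and the secure side is the one that catches the mismatch.
    gmem.convert(1, State.SHARED)
    try:
        secure.access_check(gmem, 1)
    except SecureWorldFault as err:
        print("secure world blew up:", err)
```

In this sketch nothing in ToyGuestMemfd.convert() knows about guest
consent, which mirrors the position that such policing does not need to
live in the kernel.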