Date: Wed, 14 Jan 2026 16:19:18 -0800
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 00/24]
 KVM: TDX huge page support for private memory
From: Sean Christopherson
To: Dave Hansen
Cc: Yan Zhao, Ackerley Tng, Vishal Annapurve, pbonzini@redhat.com,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org, x86@kernel.org,
	rick.p.edgecombe@intel.com, kas@kernel.org, tabba@google.com,
	michael.roth@amd.com, david@kernel.org, sagis@google.com,
	vbabka@suse.cz, thomas.lendacky@amd.com, nik.borisov@suse.com,
	pgonda@google.com, fan.du@intel.com, jun.miao@intel.com,
	francescolavra.fl@gmail.com, jgross@suse.com, ira.weiny@intel.com,
	isaku.yamahata@intel.com, xiaoyao.li@intel.com, kai.huang@intel.com,
	binbin.wu@linux.intel.com, chao.p.peng@intel.com, chao.gao@intel.com
Content-Type: text/plain; charset="us-ascii"

On Wed, Jan 14, 2026, Dave Hansen wrote:
> On 1/14/26 07:26, Sean Christopherson wrote:
> ...
> > Dave may feel differently, but I am not going to budge on this. I am not going
> > to bake in assumptions throughout KVM about memory being backed by page+folio.
> > We _just_ cleaned up that mess in the aforementioned "Stop grabbing references to
> > PFNMAP'd pages" series, I am NOT reintroducing such assumptions.
> >
> > NAK to any KVM TDX code that pulls a page or folio out of a guest_memfd pfn.
>
> 'struct page' gives us two things: One is the type safety, but I'm
> pretty flexible on how that's implemented as long as it's not a raw u64
> getting passed around everywhere.

I don't necessarily disagree on the type safety front, but for the specific code
in question, any type safety is a facade.  Everything leading up to the TDX code
is dealing with raw PFNs and/or PTEs.  Then the TDX code assumes that the PFN
being mapped into the guest is backed by a struct page, and that the folio size
is consistent with @level, without _any_ checks whatsoever.  This is providing
the exact opposite of safety.
  static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn, enum pg_level level,
			      kvm_pfn_t pfn)
  {
	int tdx_level = pg_level_to_tdx_sept_level(level);
	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
	struct page *page = pfn_to_page(pfn);    <==================
	struct folio *folio = page_folio(page);
	gpa_t gpa = gfn_to_gpa(gfn);
	u64 entry, level_state;
	u64 err;

	err = tdh_mem_page_aug(&kvm_tdx->td, gpa, tdx_level, folio,
			       folio_page_idx(folio, page), &entry, &level_state);
	...
  }

I've no objection if e.g. tdh_mem_page_aug() wants to sanity check that a PFN
is backed by a struct page with a valid refcount, it's code like that above
that I don't want.

> The second thing is a (near) guarantee that the backing memory is RAM.
> Not only RAM, but RAM that the TDX module knows about and has a PAMT and
> TDMR and all that TDX jazz.

I'm not at all opposed to backing guest_memfd with "struct page", quite the
opposite.  What I don't want is to bake assumptions into KVM code that doesn't
_require_ struct page, because that has caused KVM immense pain in the past.
And I'm strongly opposed to KVM special-casing TDX or anything else, precisely
because we struggled through all that pain so that KVM would work better with
memory that isn't backed by "struct page", or more specifically, memory that
has an associated "struct page", but isn't managed by core MM, e.g. isn't
refcounted.

> We've also done things like stopping memory hotplug because you can't
> amend TDX page metadata at runtime. So we prevent new 'struct pages'
> from coming into existence. So 'struct page' is a quite useful choke
> point for TDX.
>
> I'd love to hear more about how guest_memfd is going to tie all the
> pieces together and give the same straightforward guarantees without
> leaning on the core mm the same way we do now.

I don't think guest_memfd needs to be different, and that's not what I'm
advocating.  What I don't want is to make KVM TDX's handling of memory different
from the rest of KVM and KVM's MMU(s).