From: "Huang, Kai" <kai.huang@intel.com>
To: "kirill.shutemov@linux.intel.com" <kirill.shutemov@linux.intel.com>
Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>,
"seanjc@google.com" <seanjc@google.com>,
"x86@kernel.org" <x86@kernel.org>, "bp@alien8.de" <bp@alien8.de>,
"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
"mingo@redhat.com" <mingo@redhat.com>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"Zhao, Yan Y" <yan.y.zhao@intel.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"Yamahata, Isaku" <isaku.yamahata@intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-coco@lists.linux.dev" <linux-coco@lists.linux.dev>
Subject: Re: [RFC, PATCH 08/12] KVM: x86/tdp_mmu: Add phys_prepare() and phys_cleanup() to kvm_x86_ops
Date: Mon, 19 May 2025 05:00:48 +0000
Message-ID: <dfe459c48f3b73cfe2d5878b0804f8d01d13e0e7.camel@intel.com>
In-Reply-To: <6b5pkr4eh3l6c2ovp6t2m7phonp4kr2z5k5facrsktcmsyztqo@2hjgi7c455km>
On Wed, 2025-05-14 at 09:43 +0300, kirill.shutemov@linux.intel.com wrote:
> On Wed, May 14, 2025 at 12:00:17AM +0000, Huang, Kai wrote:
> > On Mon, 2025-05-12 at 12:55 +0300, Kirill A. Shutemov wrote:
> > > On Fri, May 09, 2025 at 09:25:58AM +0800, Yan Zhao wrote:
> > > > On Thu, May 08, 2025 at 04:23:56PM +0300, Kirill A. Shutemov wrote:
> > > > > On Tue, May 06, 2025 at 07:55:17PM +0800, Yan Zhao wrote:
> > > > > > On Fri, May 02, 2025 at 04:08:24PM +0300, Kirill A. Shutemov wrote:
> > > > > > > The functions kvm_x86_ops::link_external_spt() and
> > > > > > > kvm_x86_ops::set_external_spte() are used to assign new memory to a VM.
> > > > > > > When using TDX with Dynamic PAMT enabled, the assigned memory must be
> > > > > > > covered by PAMT.
> > > > > > >
> > > > > > > The new function kvm_x86_ops::phys_prepare() is called before
> > > > > > > link_external_spt() and set_external_spte() to ensure that the memory is
> > > > > > > ready to be assigned to the virtual machine. In the case of TDX, it
> > > > > > > makes sure that the memory is covered by PAMT.
> > > > > > >
> > > > > > > kvm_x86_ops::phys_prepare() is called in a context where struct kvm_vcpu
> > > > > > > is available, allowing the implementation to allocate memory from a
> > > > > > > per-VCPU pool.
> > > > > > >
> > > > > > Why not invoke phys_prepare() and phys_cleanup() in set_external_spte_present()?
> > > > > > Or in tdx_sept_set_private_spte()/tdx_sept_link_private_spt()?
> > > > >
> > > > > Because the memory pool we allocate from is per-vcpu and we have lost
> > > > > access to the vcpu by then. And not all callers provide a vcpu.
> > > > Maybe we can get vcpu via kvm_get_running_vcpu(), as in [1].
> > > > Then for callers not providing a vcpu (where vcpu is NULL), we can use a
> > > > per-KVM cache?
> > >
> > > Hm. I was not aware of kvm_get_running_vcpu(). Will play with it, thanks.
> >
> > I am not sure why per-vcpu cache matters.
> >
> > For non-leaf SEPT pages, AFAICT the "vcpu->arch.mmu_external_spt_cache" is just
> > an empty cache, and eventually __get_free_page() is used to allocate in:
> >
> > 	sp->external_spt =
> > 		kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_external_spt_cache);
> >
> > So why don't we actually create a kmem_cache for it with an actual 'ctor', and
> > call tdx_alloc_page() in that? This makes sure that when the "external_spt" is
> > allocated, the underlying PAMT entry is there.
>
> This would make it hard to debug PAMT memory leaks. external_spt pages in the
> pool will have PAMT memory tied to them, so we will have non-zero PAMT
> memory usage with zero TDs running.
Why is that? AFAICT all 'external_spt' pages are freed when the TD is gone.
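
To make the disagreement above concrete, here is a minimal userspace sketch of
a constructor-backed object pool. It is only a model under stated assumptions,
not kernel code: spt_pool, spt_ctor() and pamt_pages_in_use are hypothetical
names, and the counter merely stands in for PAMT memory. As with a kmem_cache
'ctor', the constructor runs when an object enters the pool rather than when
it is handed out, so objects that go back to the pool (instead of being freed
outright) keep their backing until the pool is shrunk.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the amount of PAMT memory currently attached. */
static long pamt_pages_in_use;

struct spt_page {
	int has_pamt;            /* 1 while backing metadata is attached */
	struct spt_page *next;   /* free-list link while the object is pooled */
};

struct spt_pool {
	struct spt_page *free;   /* constructed, idle objects */
};

/* Constructor: runs when an object is added to the pool. */
static void spt_ctor(struct spt_page *p)
{
	p->has_pamt = 1;
	pamt_pages_in_use++;
}

/* Add n freshly constructed objects to the pool; -1 on allocation failure. */
static int pool_fill(struct spt_pool *pool, int n)
{
	for (int i = 0; i < n; i++) {
		struct spt_page *p = calloc(1, sizeof(*p));

		if (!p)
			return -1;
		spt_ctor(p);
		p->next = pool->free;
		pool->free = p;
	}
	return 0;
}

/* Hand out an already-constructed object, or NULL if the pool is empty. */
static struct spt_page *pool_alloc(struct spt_pool *pool)
{
	struct spt_page *p = pool->free;

	if (p)
		pool->free = p->next;
	return p;
}

/* Return an object to the pool; it stays constructed, backing included. */
static void pool_free(struct spt_pool *pool, struct spt_page *p)
{
	assert(p->has_pamt);
	p->next = pool->free;
	pool->free = p;
}

/* Only shrinking the pool releases the backing of idle objects. */
static void pool_shrink(struct spt_pool *pool)
{
	while (pool->free) {
		struct spt_page *p = pool->free;

		pool->free = p->next;
		p->has_pamt = 0;
		pamt_pages_in_use--;
		free(p);
	}
}
```

In this model, pool_fill() followed by any number of pool_alloc()/pool_free()
pairs leaves pamt_pages_in_use unchanged; it drops to zero only after
pool_shrink(). Whether the real 'external_spt' pages would linger like this or
be freed outright at TD teardown is exactly the open question here.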
>
> > For the last level guest memory page, similar to SEV, we can hook the
> > kvm_arch_gmem_prepare() to call tdx_alloc_page() to make PAMT entry ready.
>
> I don't think kvm_arch_gmem_prepare() is the right place to allocate PAMT
> memory. THPs are dynamic and page order can change due to split or
> collapse between the time the page is allocated and gets mapped into EPT.
> I am not sure if SEV code is correct in this regard.
Yeah, agreed. Not sure how SEV-SNP handles large page split/merge either.
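
For reference, the kvm_get_running_vcpu() suggestion quoted earlier in this
thread (use the per-vCPU cache when a vCPU is running, otherwise fall back to
a per-KVM cache) can be sketched as a small userspace model. Everything below
is an assumption made for illustration: struct page_cache, pick_cache() and
get_running_vcpu() are hypothetical, the latter standing in for the kernel's
kvm_get_running_vcpu(), and the check that the running vCPU belongs to the
same VM is an extra guard not discussed in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pre-filled cache of pages. */
struct page_cache {
	int nobjs;
};

struct vm {
	struct page_cache cache;     /* per-VM fallback cache */
};

struct vcpu {
	struct vm *vm;
	struct page_cache cache;     /* per-vCPU cache */
};

/* Models the per-CPU "currently running vCPU" pointer; NULL if none. */
static struct vcpu *running_vcpu;

/* Stand-in for kvm_get_running_vcpu(). */
static struct vcpu *get_running_vcpu(void)
{
	return running_vcpu;
}

/*
 * Prefer the running vCPU's cache; use the per-VM cache when no vCPU is
 * running or when the running vCPU belongs to a different VM.
 */
static struct page_cache *pick_cache(struct vm *vm)
{
	struct vcpu *vcpu = get_running_vcpu();

	if (vcpu && vcpu->vm == vm)
		return &vcpu->cache;
	return &vm->cache;
}

/* Take one object from a cache; -1 if it is empty. */
static int cache_take(struct page_cache *c)
{
	assert(c != NULL);
	if (c->nobjs == 0)
		return -1;
	c->nobjs--;
	return 0;
}
```

A real per-KVM cache would be shared between contexts and would need its own
locking and refill policy, which this sketch leaves out.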