From: "Huang, Kai" <kai.huang@intel.com>
To: "kirill.shutemov@linux.intel.com"
<kirill.shutemov@linux.intel.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"seanjc@google.com" <seanjc@google.com>
Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>,
"bp@alien8.de" <bp@alien8.de>,
"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
"x86@kernel.org" <x86@kernel.org>,
"mingo@redhat.com" <mingo@redhat.com>,
"Zhao, Yan Y" <yan.y.zhao@intel.com>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"linux-coco@lists.linux.dev" <linux-coco@lists.linux.dev>,
"Yamahata, Isaku" <isaku.yamahata@intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC, PATCH 02/12] x86/virt/tdx: Allocate reference counters for PAMT memory
Date: Mon, 5 May 2025 11:05:12 +0000
Message-ID: <1e939e994d4f1f36d0a15a18dd66c5fe9864f2e2.camel@intel.com>
In-Reply-To: <20250502130828.4071412-3-kirill.shutemov@linux.intel.com>
> +static atomic_t *pamt_refcounts;
> +
> static enum tdx_module_status_t tdx_module_status;
> static DEFINE_MUTEX(tdx_module_lock);
>
> @@ -1035,9 +1038,108 @@ static int config_global_keyid(void)
> return ret;
> }
>
> +atomic_t *tdx_get_pamt_refcount(unsigned long hpa)
> +{
> + return &pamt_refcounts[hpa / PMD_SIZE];
> +}
> +EXPORT_SYMBOL_GPL(tdx_get_pamt_refcount);
It's not clear why this function needs to be exported in this patch. IMO
it's better to move the export to the patch that actually needs it.

Looking at patch 5, tdx_pamt_get()/put() use it, and they live in KVM code. But
I think we should just put them here in this file. tdx_alloc_page() and
tdx_free_page() should be in this file too.

Then, instead of exporting tdx_get_pamt_refcount(), the TDX core code here can
export tdx_alloc_page() and tdx_free_page() as two high-level helpers that
let TDX users (e.g., KVM) allocate and free TDX private pages. How PAMT
pages are allocated is then hidden inside the core TDX code.
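
To illustrate the layering suggested above, here is a minimal userspace model
(not kernel code, untested against the series). The refcount array and the
get/put helpers stay private to the "core"; only tdx_alloc_page() and
tdx_free_page() are visible to users. Everything here is an assumption made for
illustration: the signatures (the real helpers would presumably deal in
struct page), the plain int counters standing in for atomic_t, the toy page
pool, and the call counters standing in for TDH.PHYMEM.PAMT.ADD/REMOVE.

```c
#include <assert.h>

#define PAGE_SIZE      4096UL
#define PMD_SIZE       (2UL * 1024 * 1024)
#define MODEL_MAX_HPA  (64UL * PMD_SIZE)           /* hypothetical: 128M of "memory" */
#define MODEL_PAGES    (MODEL_MAX_HPA / PAGE_SIZE)
#define TDX_NO_PAGE    (~0UL)                      /* hypothetical failure value */

/* Private to the core: one counter per 2M physical range. */
static int pamt_refcounts[MODEL_MAX_HPA / PMD_SIZE];
static unsigned char page_used[MODEL_PAGES];

/* Stand-ins for the PAMT add/remove SEAMCALLs: they only count calls. */
static int pamt_add_calls, pamt_remove_calls;

static int *pamt_refcount(unsigned long hpa)
{
	assert(hpa < MODEL_MAX_HPA);
	return &pamt_refcounts[hpa / PMD_SIZE];
}

/* First user of a 2M range triggers adding its PAMT backing. */
static void tdx_pamt_get(unsigned long hpa)
{
	int *cnt = pamt_refcount(hpa);

	if ((*cnt)++ == 0)
		pamt_add_calls++;
}

/* Last user of a 2M range triggers removing its PAMT backing. */
static void tdx_pamt_put(unsigned long hpa)
{
	int *cnt = pamt_refcount(hpa);

	assert(*cnt > 0);
	if (--(*cnt) == 0)
		pamt_remove_calls++;
}

/* The only two entry points a user such as KVM would see. */
unsigned long tdx_alloc_page(void)
{
	for (unsigned long pfn = 0; pfn < MODEL_PAGES; pfn++) {
		if (!page_used[pfn]) {
			page_used[pfn] = 1;
			tdx_pamt_get(pfn * PAGE_SIZE);
			return pfn * PAGE_SIZE;
		}
	}
	return TDX_NO_PAGE;
}

void tdx_free_page(unsigned long hpa)
{
	assert(hpa < MODEL_MAX_HPA && page_used[hpa / PAGE_SIZE]);
	tdx_pamt_put(hpa);
	page_used[hpa / PAGE_SIZE] = 0;
}
```

With this split, a caller never touches the refcounts: two pages allocated from
the same 2M range cause one "add", and the "remove" happens only when the second
page is freed.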
> +
> +static int pamt_refcount_populate(pte_t *pte, unsigned long addr, void *data)
> +{
> + unsigned long vaddr;
> + pte_t entry;
> +
> + if (!pte_none(ptep_get(pte)))
> + return 0;
> +
> + vaddr = __get_free_page(GFP_KERNEL | __GFP_ZERO);
> + if (!vaddr)
> + return -ENOMEM;
> +
> + entry = pfn_pte(PFN_DOWN(__pa(vaddr)), PAGE_KERNEL);
> +
> + spin_lock(&init_mm.page_table_lock);
> + if (pte_none(ptep_get(pte)))
> + set_pte_at(&init_mm, addr, pte, entry);
> + else
> + free_page(vaddr);
> + spin_unlock(&init_mm.page_table_lock);
> +
> + return 0;
> +}
> +
> +static int pamt_refcount_depopulate(pte_t *pte, unsigned long addr,
> + void *data)
> +{
> + unsigned long vaddr;
> +
> + vaddr = (unsigned long)__va(PFN_PHYS(pte_pfn(ptep_get(pte))));
> +
> + spin_lock(&init_mm.page_table_lock);
> + if (!pte_none(ptep_get(pte))) {
> + pte_clear(&init_mm, addr, pte);
> + free_page(vaddr);
> + }
> + spin_unlock(&init_mm.page_table_lock);
> +
> + return 0;
> +}
> +
> +static int alloc_tdmr_pamt_refcount(struct tdmr_info *tdmr)
> +{
> + unsigned long start, end;
> +
> + start = (unsigned long)tdx_get_pamt_refcount(tdmr->base);
> + end = (unsigned long)tdx_get_pamt_refcount(tdmr->base + tdmr->size);
> + start = round_down(start, PAGE_SIZE);
> + end = round_up(end, PAGE_SIZE);
> +
> + return apply_to_page_range(&init_mm, start, end - start,
> + pamt_refcount_populate, NULL);
> +}
IIUC, populating the refcounts per TDMR slightly wastes memory: we don't
need to populate the refcount for a 2M range that is completely marked as
reserved in the TDMR, because the kernel can never use such a range for TDX.

Populating based on the list of TDX memory blocks should be better. In
practice the difference should be unnoticeable, but conceptually, using TDX
memory blocks is the right granularity.
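
To put a rough number on the difference, here is a small userspace sketch (not
kernel code) that counts how many 4K backing pages get populated for a set of
physical ranges, using the same round-down/round-up as
alloc_tdmr_pamt_refcount(). The 4-byte counter size matches atomic_t; the
struct, the helper names, and the example ranges in the note below are made up
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE      4096ULL
#define PMD_SIZE       (2ULL << 20)
#define REFCOUNT_SIZE  4ULL            /* sizeof(atomic_t) */
#define GiB            (1ULL << 30)
#define MODEL_PAGES    64              /* hypothetical cap: covers 128G */

struct phys_range {
	uint64_t start, end;           /* physical range [start, end) */
};

/* One flag per 4K page of the refcount array; set once it is populated. */
static unsigned char populated[MODEL_PAGES];

/* Mark the backing pages holding the refcounts for one physical range. */
static void populate_range(struct phys_range r)
{
	uint64_t first = (r.start / PMD_SIZE) * REFCOUNT_SIZE / PAGE_SIZE;
	uint64_t end_off = (r.end / PMD_SIZE) * REFCOUNT_SIZE;
	uint64_t end = (end_off + PAGE_SIZE - 1) / PAGE_SIZE;

	assert(r.start <= r.end && end <= MODEL_PAGES);
	for (uint64_t i = first; i < end; i++)
		populated[i] = 1;      /* idempotent, like the pte_none() check */
}

/* Populate all ranges from scratch and return the number of backing pages. */
static unsigned int count_populated(const struct phys_range *ranges, size_t n)
{
	unsigned int count = 0;

	memset(populated, 0, sizeof(populated));
	for (size_t i = 0; i < n; i++)
		populate_range(ranges[i]);
	for (size_t i = 0; i < MODEL_PAGES; i++)
		count += populated[i];
	return count;
}
```

One backing page holds 1024 counters and so covers 2G of physical memory. For a
hypothetical 16G TDMR whose usable memory is only [0, 3G) and [12G, 16G),
populating per TDMR takes 8 pages while populating per memory block takes 4, so
the saving is 16K. That is consistent with the difference being unnoticeable in
practice unless reserved holes are very large.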
> +
> +static int init_pamt_metadata(void)
> +{
> + size_t size = max_pfn / PTRS_PER_PTE * sizeof(*pamt_refcounts);
> + struct vm_struct *area;
> +
> + if (!tdx_supports_dynamic_pamt(&tdx_sysinfo))
> + return 0;
> +
> + /*
> + * Reserve vmalloc range for PAMT reference counters. It covers all
> + * physical address space up to max_pfn. It is going to be populated
> + * from init_tdmr() only for present memory that available for TDX use.
> + */
> + area = get_vm_area(size, VM_IOREMAP);
> + if (!area)
> + return -ENOMEM;
> +
> + pamt_refcounts = area->addr;
> + return 0;
> +}
> +
> +static void free_pamt_metadata(void)
> +{
> + size_t size = max_pfn / PTRS_PER_PTE * sizeof(*pamt_refcounts);
> +
> + size = round_up(size, PAGE_SIZE);
> + apply_to_existing_page_range(&init_mm,
> + (unsigned long)pamt_refcounts,
> + size, pamt_refcount_depopulate,
> + NULL);
> + vfree(pamt_refcounts);
> + pamt_refcounts = NULL;
> +}
> +
> static int init_tdmr(struct tdmr_info *tdmr)
> {
> u64 next;
> + int ret;
> +
> + ret = alloc_tdmr_pamt_refcount(tdmr);
> + if (ret)
> + return ret;
>
> /*
> * Initializing a TDMR can be time consuming. To avoid long
> @@ -1048,7 +1150,6 @@ static int init_tdmr(struct tdmr_info *tdmr)
> struct tdx_module_args args = {
> .rcx = tdmr->base,
> };
> - int ret;
>
> ret = seamcall_prerr_ret(TDH_SYS_TDMR_INIT, &args);
> if (ret)
> @@ -1134,10 +1235,15 @@ static int init_tdx_module(void)
> if (ret)
> goto err_reset_pamts;
>
> + /* Reserve vmalloc range for PAMT reference counters */
> + ret = init_pamt_metadata();
> + if (ret)
> + goto err_reset_pamts;
> +
> /* Initialize TDMRs to complete the TDX module initialization */
> ret = init_tdmrs(&tdx_tdmr_list);
> if (ret)
> - goto err_reset_pamts;
> + goto err_free_pamt_metadata;
>
> pr_info("%lu KB allocated for PAMT\n", tdmrs_count_pamt_kb(&tdx_tdmr_list));
>
> @@ -1149,6 +1255,9 @@ static int init_tdx_module(void)
> put_online_mems();
> return ret;
>
> +err_free_pamt_metadata:
> + free_pamt_metadata();
> +
> err_reset_pamts:
> /*
> * Part of PAMTs may already have been initialized by the