From: Mike Rapoport <rppt@kernel.org>
To: Kai Huang <kai.huang@intel.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
x86@kernel.org, dave.hansen@intel.com,
kirill.shutemov@linux.intel.com, peterz@infradead.org,
tony.luck@intel.com, tglx@linutronix.de, bp@alien8.de,
mingo@redhat.com, hpa@zytor.com, seanjc@google.com,
pbonzini@redhat.com, rafael@kernel.org, david@redhat.com,
dan.j.williams@intel.com, len.brown@intel.com,
ak@linux.intel.com, isaku.yamahata@intel.com,
ying.huang@intel.com, chao.gao@intel.com,
sathyanarayanan.kuppuswamy@linux.intel.com, nik.borisov@suse.com,
bagasdotme@gmail.com, sagis@google.com, imammedo@redhat.com
Subject: Re: [PATCH v15 08/23] x86/virt/tdx: Use all system memory when initializing TDX module as TDX memory
Date: Thu, 5 Dec 2024 09:57:38 +0200 [thread overview]
Message-ID: <Z1Fc8g47vfpz9EVW@kernel.org> (raw)
In-Reply-To: <87e19d1931e33bfaece5b79602cfbd517df891f1.1699527082.git.kai.huang@intel.com>
Hi,
I've been auditing for_each_mem_pfn_range() users and it's usage in TDX is
dubious for me.
On Fri, Nov 10, 2023 at 12:55:45AM +1300, Kai Huang wrote:
>
> As TDX-usable memory is a fixed configuration, take a snapshot of the
> memory configuration from memblocks at the time of module initialization
> (memblocks are modified on memory hotplug). This snapshot is used to
AFAUI this could happen long after free_initmem() which discards all
memblock data on x86.
> enable TDX support for *this* memory configuration only. Use a memory
> hotplug notifier to ensure that no other RAM can be added outside of
> this configuration.
...
> +/*
> + * Ensure that all memblock memory regions are convertible to TDX
> + * memory. Once this has been established, stash the memblock
> + * ranges off in a secondary structure because memblock is modified
> + * in memory hotplug while TDX memory regions are fixed.
> + */
> +static int build_tdx_memlist(struct list_head *tmb_list)
> +{
> + unsigned long start_pfn, end_pfn;
> + int i, ret;
> +
> + for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL) {
Unles ARCH_KEEP_MEMBLOCK is defined this won't work after free_initmem()
> + /*
> + * The first 1MB is not reported as TDX convertible memory.
> + * Although the first 1MB is always reserved and won't end up
> + * to the page allocator, it is still in memblock's memory
> + * regions. Skip them manually to exclude them as TDX memory.
> + */
> + start_pfn = max(start_pfn, PHYS_PFN(SZ_1M));
> + if (start_pfn >= end_pfn)
> + continue;
> +
> + /*
> + * Add the memory regions as TDX memory. The regions in
> + * memblock has already guaranteed they are in address
> + * ascending order and don't overlap.
> + */
> + ret = add_tdx_memblock(tmb_list, start_pfn, end_pfn);
> + if (ret)
> + goto err;
> + }
> +
> + return 0;
> +err:
> + free_tdx_memlist(tmb_list);
> + return ret;
> +}
> +
> static int init_tdx_module(void)
> {
> + int ret;
> +
> + /*
> + * To keep things simple, assume that all TDX-protected memory
> + * will come from the page allocator. Make sure all pages in the
> + * page allocator are TDX-usable memory.
> + *
> + * Build the list of "TDX-usable" memory regions which cover all
> + * pages in the page allocator to guarantee that. Do it while
> + * holding mem_hotplug_lock read-lock as the memory hotplug code
> + * path reads the @tdx_memlist to reject any new memory.
> + */
> + get_online_mems();
> +
> + ret = build_tdx_memlist(&tdx_memlist);
> + if (ret)
> + goto out_put_tdxmem;
> +
> /*
> * TODO:
> *
> - * - Build the list of TDX-usable memory regions.
> * - Get TDX module "TD Memory Region" (TDMR) global metadata.
> * - Construct a list of TDMRs to cover all TDX-usable memory
> * regions.
> @@ -168,7 +267,14 @@ static int init_tdx_module(void)
> *
> * Return error before all steps are done.
> */
> - return -EINVAL;
> + ret = -EINVAL;
> +out_put_tdxmem:
> + /*
> + * @tdx_memlist is written here and read at memory hotplug time.
> + * Lock out memory hotplug code while building it.
> + */
> + put_online_mems();
> + return ret;
> }
>
> static int __tdx_enable(void)
> @@ -258,6 +364,56 @@ static int __init record_keyid_partitioning(u32 *tdx_keyid_start,
> return 0;
> }
>
> +static bool is_tdx_memory(unsigned long start_pfn, unsigned long end_pfn)
> +{
> + struct tdx_memblock *tmb;
> +
> + /*
> + * This check assumes that the start_pfn<->end_pfn range does not
> + * cross multiple @tdx_memlist entries. A single memory online
> + * event across multiple memblocks (from which @tdx_memlist
> + * entries are derived at the time of module initialization) is
> + * not possible. This is because memory offline/online is done
> + * on granularity of 'struct memory_block', and the hotpluggable
> + * memory region (one memblock) must be multiple of memory_block.
> + */
> + list_for_each_entry(tmb, &tdx_memlist, list) {
> + if (start_pfn >= tmb->start_pfn && end_pfn <= tmb->end_pfn)
> + return true;
> + }
> + return false;
> +}
> +
> +static int tdx_memory_notifier(struct notifier_block *nb, unsigned long action,
> + void *v)
> +{
> + struct memory_notify *mn = v;
> +
> + if (action != MEM_GOING_ONLINE)
> + return NOTIFY_OK;
> +
> + /*
> + * Empty list means TDX isn't enabled. Allow any memory
> + * to go online.
> + */
> + if (list_empty(&tdx_memlist))
> + return NOTIFY_OK;
> +
> + /*
> + * The TDX memory configuration is static and can not be
> + * changed. Reject onlining any memory which is outside of
> + * the static configuration whether it supports TDX or not.
> + */
> + if (is_tdx_memory(mn->start_pfn, mn->start_pfn + mn->nr_pages))
> + return NOTIFY_OK;
> +
> + return NOTIFY_BAD;
> +}
> +
> +static struct notifier_block tdx_memory_nb = {
> + .notifier_call = tdx_memory_notifier,
> +};
> +
> static int __init tdx_init(void)
> {
> u32 tdx_keyid_start, nr_tdx_keyids;
> @@ -281,6 +437,13 @@ static int __init tdx_init(void)
> return -ENODEV;
> }
>
> + err = register_memory_notifier(&tdx_memory_nb);
> + if (err) {
> + pr_err("initialization failed: register_memory_notifier() failed (%d)\n",
> + err);
> + return -ENODEV;
> + }
> +
> /*
> * Just use the first TDX KeyID as the 'global KeyID' and
> * leave the rest for TDX guests.
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> index a3c52270df5b..c11e0a7ca664 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.h
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -27,4 +27,10 @@ enum tdx_module_status_t {
> TDX_MODULE_ERROR
> };
>
> +struct tdx_memblock {
> + struct list_head list;
> + unsigned long start_pfn;
> + unsigned long end_pfn;
> +};
> +
> #endif
> --
> 2.41.0
>
--
Sincerely yours,
Mike.
next prev parent reply other threads:[~2024-12-05 7:57 UTC|newest]
Thread overview: 70+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-09 11:55 [PATCH v15 00/23] TDX host kernel support Kai Huang
2023-11-09 11:55 ` [PATCH v15 01/23] x86/virt/tdx: Detect TDX during kernel boot Kai Huang
2023-11-09 11:55 ` [PATCH v15 02/23] x86/tdx: Define TDX supported page sizes as macros Kai Huang
2023-11-09 11:55 ` [PATCH v15 03/23] x86/virt/tdx: Make INTEL_TDX_HOST depend on X86_X2APIC Kai Huang
2023-11-09 11:55 ` [PATCH v15 04/23] x86/cpu: Detect TDX partial write machine check erratum Kai Huang
2023-11-09 11:55 ` [PATCH v15 05/23] x86/virt/tdx: Handle SEAMCALL no entropy error in common code Kai Huang
2023-11-09 16:38 ` Dave Hansen
2023-11-14 19:24 ` Isaku Yamahata
2023-11-15 10:41 ` Huang, Kai
2023-11-15 19:26 ` Isaku Yamahata
2023-11-09 11:55 ` [PATCH v15 06/23] x86/virt/tdx: Add SEAMCALL error printing for module initialization Kai Huang
2023-11-09 11:55 ` [PATCH v15 07/23] x86/virt/tdx: Add skeleton to enable TDX on demand Kai Huang
2023-11-09 11:55 ` [PATCH v15 08/23] x86/virt/tdx: Use all system memory when initializing TDX module as TDX memory Kai Huang
2024-12-05 7:57 ` Mike Rapoport [this message]
2024-12-05 9:06 ` Nikolay Borisov
2024-12-05 12:25 ` Huang, Kai
2024-12-05 16:30 ` Mike Rapoport
2023-11-09 11:55 ` [PATCH v15 09/23] x86/virt/tdx: Get module global metadata for module initialization Kai Huang
2023-11-09 23:29 ` Dave Hansen
2023-11-10 2:23 ` Huang, Kai
2023-11-15 19:35 ` Isaku Yamahata
2023-11-16 3:19 ` Huang, Kai
2023-11-09 11:55 ` [PATCH v15 10/23] x86/virt/tdx: Add placeholder to construct TDMRs to cover all TDX memory regions Kai Huang
2023-11-09 11:55 ` [PATCH v15 11/23] x86/virt/tdx: Fill out " Kai Huang
2023-11-09 11:55 ` [PATCH v15 12/23] x86/virt/tdx: Allocate and set up PAMTs for TDMRs Kai Huang
2023-11-09 11:55 ` [PATCH v15 13/23] x86/virt/tdx: Designate reserved areas for all TDMRs Kai Huang
2023-11-09 11:55 ` [PATCH v15 14/23] x86/virt/tdx: Configure TDX module with the TDMRs and global KeyID Kai Huang
2023-11-09 11:55 ` [PATCH v15 15/23] x86/virt/tdx: Configure global KeyID on all packages Kai Huang
2023-11-09 11:55 ` [PATCH v15 16/23] x86/virt/tdx: Initialize all TDMRs Kai Huang
2023-11-09 11:55 ` [PATCH v15 17/23] x86/kexec: Flush cache of TDX private memory Kai Huang
2023-11-27 18:13 ` Dave Hansen
2023-11-27 19:33 ` Huang, Kai
2023-11-27 20:02 ` Huang, Kai
2023-11-27 20:05 ` Dave Hansen
2023-11-27 20:52 ` Huang, Kai
2023-11-27 21:06 ` Dave Hansen
2023-11-27 22:09 ` Huang, Kai
2023-11-09 11:55 ` [PATCH v15 18/23] x86/virt/tdx: Keep TDMRs when module initialization is successful Kai Huang
2023-11-09 11:55 ` [PATCH v15 19/23] x86/virt/tdx: Improve readability of module initialization error handling Kai Huang
2023-11-09 11:55 ` [PATCH v15 20/23] x86/kexec(): Reset TDX private memory on platforms with TDX erratum Kai Huang
2023-11-09 11:55 ` [PATCH v15 21/23] x86/virt/tdx: Handle TDX interaction with ACPI S3 and deeper states Kai Huang
2023-11-30 17:20 ` Dave Hansen
2023-11-09 11:55 ` [PATCH v15 22/23] x86/mce: Improve error log of kernel space TDX #MC due to erratum Kai Huang
2023-11-30 18:01 ` Tony Luck
2023-12-01 20:35 ` Dave Hansen
2023-12-03 11:44 ` Huang, Kai
2023-12-04 17:07 ` Dave Hansen
2023-12-04 21:00 ` Huang, Kai
2023-12-04 22:04 ` Dave Hansen
2023-12-04 23:24 ` Huang, Kai
2023-12-04 23:39 ` Dave Hansen
2023-12-04 23:56 ` Huang, Kai
2023-12-05 2:04 ` Sean Christopherson
2023-12-05 16:36 ` Dave Hansen
2023-12-05 16:53 ` Sean Christopherson
2023-12-05 16:36 ` Luck, Tony
2023-12-05 16:57 ` Sean Christopherson
2023-12-04 23:41 ` Huang, Kai
2023-12-05 14:25 ` Borislav Petkov
2023-12-05 19:41 ` Huang, Kai
2023-12-05 19:56 ` Borislav Petkov
2023-12-05 20:08 ` Huang, Kai
2023-12-05 20:29 ` Borislav Petkov
2023-12-05 20:33 ` Huang, Kai
2023-12-05 20:41 ` Borislav Petkov
2023-12-05 20:49 ` Dave Hansen
2023-12-05 20:58 ` Huang, Kai
2023-11-09 11:56 ` [PATCH v15 23/23] Documentation/x86: Add documentation for TDX host support Kai Huang
2023-11-13 8:40 ` [PATCH v15 00/23] TDX host kernel support Nikolay Borisov
2023-11-13 9:11 ` Huang, Kai
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z1Fc8g47vfpz9EVW@kernel.org \
--to=rppt@kernel.org \
--cc=ak@linux.intel.com \
--cc=bagasdotme@gmail.com \
--cc=bp@alien8.de \
--cc=chao.gao@intel.com \
--cc=dan.j.williams@intel.com \
--cc=dave.hansen@intel.com \
--cc=david@redhat.com \
--cc=hpa@zytor.com \
--cc=imammedo@redhat.com \
--cc=isaku.yamahata@intel.com \
--cc=kai.huang@intel.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=kvm@vger.kernel.org \
--cc=len.brown@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=nik.borisov@suse.com \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=rafael@kernel.org \
--cc=sagis@google.com \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=seanjc@google.com \
--cc=tglx@linutronix.de \
--cc=tony.luck@intel.com \
--cc=x86@kernel.org \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.