From: David Hildenbrand <david@redhat.com>
To: Jonathan Cameron <jonathan.cameron@huawei.com>,
fanhuang <FangSheng.Huang@amd.com>
Cc: qemu-devel@nongnu.org, imammedo@redhat.com, Zhigang.Luo@amd.com,
Lianjie.Shi@amd.com, Oscar Salvador <osalvador@suse.de>
Subject: Re: [PATCH v2] numa: add 'spm' option for Specific Purpose Memory
Date: Mon, 20 Oct 2025 22:03:51 +0200
Message-ID: <40005e02-17df-43d1-a1d7-0b3bcfefdbf1@redhat.com>
In-Reply-To: <20251020111534.00004a29@huawei.com>
On 20.10.25 12:15, Jonathan Cameron wrote:
> On Mon, 20 Oct 2025 17:07:00 +0800
> fanhuang <FangSheng.Huang@amd.com> wrote:
>
>> Hi David and Igor,
>>
>> I apologize for the delayed response. Thank you very much for your thoughtful
>> questions and feedback on the SPM patch series.
>>
>> Before addressing your questions, I'd like to briefly mention what the new
>> QEMU patch series additionally resolves:
>>
>> 1. **Corrected SPM terminology**: Fixed the description error from the previous
>> version. The correct acronym is "Specific Purpose Memory" (not "special
>> purpose memory" as previously stated).
>>
>> 2. **Fixed overlapping E820 entries**: Updated the implementation to properly
>> handle overlapping E820 RAM entries before adding E820_SOFT_RESERVED
>> regions.
>>
>> The previous implementation created overlapping E820 entries by first adding
>> a large E820_RAM entry covering the entire above-4GB memory range, then
>> adding E820_SOFT_RESERVED entries for SPM regions that overlapped with the
>> RAM entry. This violated the E820 specification and caused OVMF/UEFI
>> firmware to receive conflicting memory type information for the same
>> physical addresses.
>>
>> The new implementation processes SPM regions first to identify reserved
>> areas, then adds RAM entries around the SPM regions, generating a clean,
>> non-overlapping E820 map.
>
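To make the "SPM first, then RAM around it" ordering concrete, here is a minimal sketch of how such a non-overlapping map could be built. This is not the code from the patch: the `sketch_*` names and type values are hypothetical, and it assumes the SPM regions are already sorted by base address, non-empty, non-overlapping, and contained in the range being split.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type values for this sketch; not QEMU's E820 definitions. */
enum sketch_type { SKETCH_RAM = 1, SKETCH_SOFT_RESERVED = 2 };

struct sketch_range {
    uint64_t base;
    uint64_t size;
    enum sketch_type type;
};

/*
 * Split [base, base + size) into RAM and soft-reserved entries.
 * Assumption: spm[] is sorted by base, non-overlapping, and lies
 * entirely inside the range. Returns the number of entries written
 * to out[], or 0 if out[] is too small.
 */
static size_t sketch_build_map(uint64_t base, uint64_t size,
                               const struct sketch_range *spm, size_t nr_spm,
                               struct sketch_range *out, size_t max_out)
{
    uint64_t cursor = base;          /* first address not yet described */
    uint64_t end = base + size;
    size_t n = 0;

    for (size_t i = 0; i < nr_spm; i++) {
        uint64_t spm_end = spm[i].base + spm[i].size;

        assert(spm[i].base >= cursor);   /* sorted, no overlap */
        assert(spm_end <= end);          /* inside the range */

        /* RAM gap in front of this SPM region, if any. */
        if (spm[i].base > cursor) {
            if (n == max_out) {
                return 0;
            }
            out[n++] = (struct sketch_range){ cursor, spm[i].base - cursor,
                                              SKETCH_RAM };
        }

        /* The SPM region itself. */
        if (n == max_out) {
            return 0;
        }
        out[n++] = (struct sketch_range){ spm[i].base, spm[i].size,
                                          SKETCH_SOFT_RESERVED };
        cursor = spm_end;
    }

    /* Trailing RAM after the last SPM region. */
    if (cursor < end) {
        if (n == max_out) {
            return 0;
        }
        out[n++] = (struct sketch_range){ cursor, end - cursor, SKETCH_RAM };
    }
    return n;
}
```

With a 4 GiB range starting at 4 GiB and a single 1 GiB SPM region at 5 GiB, this yields three entries: RAM, soft-reserved, RAM, with no address described twice.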
> I'm definitely in favor of this support for testing purposes as well as
> for the GPU cases you describe.
Thanks for taking a look!
>
> Given I took your brief comment on hotplug and expanded on it +CC David
> and Oscar.
>
>>
>> Now, regarding your questions:
>>
>> ========================================================================
>> Why SPM Must Be Boot Memory
>> ========================================================================
>>
>> SPM cannot be implemented as hotplug memory (DIMM/NVDIMM) because:
>>
>> The primary goal of SPM is to ensure that memory is managed by guest
>> device drivers, not the guest OS. This requires boot-time discovery
>> for three key reasons:
>>
>> 1. SPM regions must appear in the E820 memory map as `E820_SOFT_RESERVED`
>> during firmware initialization, before the OS starts.
>>
>> 2. Hotplug memory is integrated into kernel memory management, making
>> it unavailable for device-specific use.
>
> This is only sort of true and perhaps reflects support in the kernel for ACPI
> features being missing as no one has yet been interested in them.
> See 9.11.3 Hot-pluggable Memory Description Illustrated in the 6.6 ACPI spec.
> That has an example where the EFI_MEMORY_SP bit is provided.
> I had a dig around and for now ACPICA / kernel doesn't seem to put that alongside
> write_protect and the other bits that IIUC come from the same field.
> It would be relatively easy to pipe that through and potentially add handling
> in the memory hotplug path to allow for drivers to pick these regions up
> (which boils down I think to making them visible in some way but doing nothing
> else with them)
Considering something like DIMMs, one challenge is also that hotplugged
memory in QEMU is never advertised in e820 (we only indicate the
hotpluggable region), which is different to real hardware but lets us
stop the early kernel that is booting up from considering these areas
"initial memory" and effectively turning them hot-unpluggable in the
default case.
Then, the question is what happens when someone plugs such a DIMM,
unplugs it, and plugs something else in there that's not supposed to be SP.
I assume that's all solvable, just want to point out that the default
memory hotplug path in QEMU is not really suitable for that right now I
think.
>
> Other path would be to use a discoverable path such as emulating CXL memory.
> Hotplug of that would work fine from point of view of coming up as driver managed
> SPM style (the flag is in runtime data provided by the device). It would however
> look different to the firmware managed approach you are using in the host.
Right.
--
Cheers
David / dhildenb