From: "Huang, FangSheng (Jerry)" <FangSheng.Huang@amd.com>
To: David Hildenbrand <david@redhat.com>, <qemu-devel@nongnu.org>,
<imammedo@redhat.com>
Cc: <Zhigang.Luo@amd.com>, <Lianjie.Shi@amd.com>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>
Subject: Re: [PATCH v2] numa: add 'spm' option for Specific Purpose Memory
Date: Mon, 3 Nov 2025 11:01:25 +0800
Message-ID: <1fc33dfc-ae73-4d23-a21a-a3a5ed480dd1@amd.com>
In-Reply-To: <eb1b524d-3a8c-481b-85eb-6697f5ee332b@redhat.com>
Hi David,
I hope this email finds you well. I wanted to follow up on the SPM
patch series we discussed back in October and check on its current
status. Please let me know if there's anything else I should address
or any additional information I can provide.
Thank you for your time and guidance on this!
Best regards,
Jerry Huang
On 10/22/2025 6:28 PM, David Hildenbrand wrote:
> On 22.10.25 12:09, Huang, FangSheng (Jerry) wrote:
>>
>>
>> On 10/21/2025 4:10 AM, David Hildenbrand wrote:
>>> On 20.10.25 11:07, fanhuang wrote:
>>>> Hi David and Igor,
>>>>
>>>> I apologize for the delayed response. Thank you very much for your
>>>> thoughtful questions and feedback on the SPM patch series.
>>>>
>>>> Before addressing your questions, I'd like to briefly mention what
>>>> the new QEMU patch series additionally resolves:
>>>>
>>>> 1. **Corrected SPM terminology**: Fixed the description error from
>>>>    the previous version. The correct acronym is "Specific Purpose
>>>>    Memory" (not "special purpose memory" as previously stated).
>>>>
>>>> 2. **Fixed overlapping E820 entries**: Updated the implementation
>>>>    to properly handle overlapping E820 RAM entries before adding
>>>>    E820_SOFT_RESERVED regions.
>>>>
>>>> The previous implementation created overlapping E820 entries by
>>>> first adding a large E820_RAM entry covering the entire above-4GB
>>>> memory range, then adding E820_SOFT_RESERVED entries for SPM
>>>> regions that overlapped with the RAM entry. This violated the E820
>>>> specification and caused OVMF/UEFI firmware to receive conflicting
>>>> memory type information for the same physical addresses.
>>>>
>>>> The new implementation processes SPM regions first to identify
>>>> reserved areas, then adds RAM entries around the SPM regions,
>>>> generating a clean, non-overlapping E820 map.
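>>>>
>>>> To illustrate the idea, here is a simplified sketch (not the patch
>>>> code; add_entry() and build_e820() are hypothetical stand-ins for
>>>> the QEMU-internal helpers, and the E820_SOFT_RESERVED type value is
>>>> only illustrative):
>>>>
>>>>     #include <stdint.h>
>>>>     #include <stdio.h>
>>>>
>>>>     #define E820_RAM            1
>>>>     #define E820_SOFT_RESERVED  0xefffffff  /* illustrative type value */
>>>>
>>>>     struct range { uint64_t base, len; };
>>>>
>>>>     static void add_entry(uint64_t base, uint64_t len, uint32_t type)
>>>>     {
>>>>         /* stand-in for the real "append an e820 entry" helper */
>>>>         printf("e820: 0x%llx-0x%llx type 0x%x\n",
>>>>                (unsigned long long)base,
>>>>                (unsigned long long)(base + len - 1), (unsigned)type);
>>>>     }
>>>>
>>>>     /*
>>>>      * Emit a non-overlapping map for [ram_base, ram_base + ram_len):
>>>>      * soft-reserved entries for the (sorted, in-range) SPM regions,
>>>>      * RAM entries only for the gaps around them.
>>>>      */
>>>>     static void build_e820(uint64_t ram_base, uint64_t ram_len,
>>>>                            const struct range *spm, int nr_spm)
>>>>     {
>>>>         uint64_t cur = ram_base;
>>>>         uint64_t end = ram_base + ram_len;
>>>>
>>>>         for (int i = 0; i < nr_spm; i++) {
>>>>             if (spm[i].base > cur) {
>>>>                 add_entry(cur, spm[i].base - cur, E820_RAM);
>>>>             }
>>>>             add_entry(spm[i].base, spm[i].len, E820_SOFT_RESERVED);
>>>>             cur = spm[i].base + spm[i].len;
>>>>         }
>>>>         if (cur < end) {
>>>>             add_entry(cur, end - cur, E820_RAM);
>>>>         }
>>>>     }
>>>>
>>>>     int main(void)
>>>>     {
>>>>         /* hypothetical layout: RAM from 4G to 12G, one 2G SPM region at 6G */
>>>>         struct range spm[] = { { 0x180000000ULL, 0x80000000ULL } };
>>>>
>>>>         build_e820(0x100000000ULL, 0x200000000ULL, spm, 1);
>>>>         return 0;
>>>>     }
>>>>
>>>> The point is simply that each byte of guest-physical address space
>>>> ends up in exactly one entry, which is what OVMF/UEFI expects.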
>>>>
>>>> Now, regarding your questions:
>>>>
>>>> ========================================================================
>>>> Why SPM Must Be Boot Memory
>>>> ========================================================================
>>>>
>>>> SPM cannot be implemented as hotplug memory (DIMM/NVDIMM) because:
>>>>
>>>> The primary goal of SPM is to ensure that the memory is managed by
>>>> guest device drivers, not by the guest OS. This requires boot-time
>>>> discovery for two key reasons:
>>>>
>>>> 1. SPM regions must appear in the E820 memory map as
>>>>    `E820_SOFT_RESERVED` during firmware initialization, before the
>>>>    OS starts (a small guest-side check is sketched after this list).
>>>>
>>>> 2. Hotplug memory is integrated into kernel memory management,
>>>>    making it unavailable for device-specific use.
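>>>>
>>>> As a guest-side illustration of (1): Linux lists E820_SOFT_RESERVED
>>>> ranges in /proc/iomem under the label "Soft Reserved", so a trivial
>>>> check (a hypothetical helper, not part of this series; run it as
>>>> root so the addresses are not masked) could look like:
>>>>
>>>>     #include <stdio.h>
>>>>     #include <string.h>
>>>>
>>>>     int main(void)
>>>>     {
>>>>         FILE *f = fopen("/proc/iomem", "r");
>>>>         char line[256];
>>>>
>>>>         if (!f) {
>>>>             perror("/proc/iomem");
>>>>             return 1;
>>>>         }
>>>>         while (fgets(line, sizeof(line), f)) {
>>>>             /* SPM regions advertised by the firmware show up here */
>>>>             if (strstr(line, "Soft Reserved")) {
>>>>                 fputs(line, stdout);
>>>>             }
>>>>         }
>>>>         fclose(f);
>>>>         return 0;
>>>>     }
>>>>
>>>> If nothing is printed, the firmware did not mark any region as soft
>>>> reserved.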
>>>>
>>>> ========================================================================
>>>> Detailed Use Case
>>>> ========================================================================
>>>>
>>>> **Background**
>>>> Unified Address Space for CPU and GPU:
>>>>
>>>> Modern heterogeneous computing architectures implement a coherent,
>>>> unified address space shared between CPUs and GPUs. Unlike
>>>> traditional discrete GPU designs with a dedicated frame buffer,
>>>> these accelerators connect the CPU and GPU through high-speed
>>>> interconnects (e.g., XGMI):
>>>>
>>>> - **HBM (High Bandwidth Memory)**: Physically attached to each GPU
>>>>   and reported to the OS as driver-managed system memory.
>>>>
>>>> - **XGMI (eXternal Global Memory Interconnect, a.k.a. Infinity
>>>>   Fabric)**: Maintains data coherence between the CPU and GPU,
>>>>   enabling direct CPU access to GPU HBM without data copying.
>>>>
>>>> In this architecture, GPU HBM is reported to the OS as system
>>>> memory, but it needs to be managed exclusively by the GPU driver
>>>> rather than by the general OS memory allocator. This driver-managed
>>>> memory provides optimal performance for GPU workloads while
>>>> enabling coherent CPU-GPU data sharing over XGMI. This is where SPM
>>>> (Specific Purpose Memory) becomes essential.
>>>>
>>>> **Virtualization Scenario**
>>>>
>>>> In virtualization, the hypervisor needs to expose this memory
>>>> topology to guest VMs while maintaining the same driver-managed vs.
>>>> OS-managed distinction.
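>>>>
>>>> For example, a guest with one ordinary NUMA node and one SPM node
>>>> could be started along these lines (a sketch only: the 'spm' option
>>>> name comes from this patch, but the 'spm=on' spelling and placement
>>>> are assumed here; the rest is standard -object/-numa syntax):
>>>>
>>>>     qemu-system-x86_64 -machine q35 -smp 4 -m 16G \
>>>>       -object memory-backend-ram,id=m0,size=8G \
>>>>       -object memory-backend-ram,id=m1,size=8G \
>>>>       -numa node,nodeid=0,cpus=0-3,memdev=m0 \
>>>>       -numa node,nodeid=1,memdev=m1,spm=on
>>>>
>>>> Node 1's memory would then be advertised to the firmware as
>>>> E820_SOFT_RESERVED rather than ordinary RAM, matching how the
>>>> driver-managed HBM is meant to appear inside the guest.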
>>>
>>> Just wondering, could device hotplug in that model ever work? I
>>> guess we wouldn't expose the memory at all in e820 (after all, it
>>> gets hotplugged later) and instead the device driver in the guest
>>> would have to detect+hotplug that memory.
>>>
>>> But that sounds weird, because the device driver in the VM shouldn't do
>>> something virt specific.
>>>
>>> Which raises the question: how is device hotplug of such GPUs
>>> handled on bare metal? Or does it simply not work? :)
>>>
>> Hi David, thank you for your thoughtful feedback.
>>
>> To directly answer your question: in our use case, GPU device
>> hotplug does NOT work on bare metal, and this is by design.
>
> Cool, thanks for clarifying!
>