From: David Hildenbrand <david@redhat.com>
To: fanhuang <FangSheng.Huang@amd.com>,
qemu-devel@nongnu.org, imammedo@redhat.com
Cc: Zhigang.Luo@amd.com, Lianjie.Shi@amd.com,
Jonathan Cameron <Jonathan.Cameron@huawei.com>
Subject: Re: [PATCH v2] numa: add 'spm' option for Specific Purpose Memory
Date: Mon, 20 Oct 2025 22:10:21 +0200
Message-ID: <c35a21dd-40e5-4fa5-87c4-18ebe8ca73ca@redhat.com>
In-Reply-To: <20251020090701.4036748-1-FangSheng.Huang@amd.com>
On 20.10.25 11:07, fanhuang wrote:
> Hi David and Igor,
>
> I apologize for the delayed response. Thank you very much for your thoughtful
> questions and feedback on the SPM patch series.
>
> Before addressing your questions, I'd like to briefly mention what the new
> QEMU patch series additionally resolves:
>
> 1. **Corrected SPM terminology**: Fixed the description error from the previous
> version. SPM stands for "Specific Purpose Memory" (not "special purpose
> memory" as previously stated).
>
> 2. **Fixed overlapping E820 entries**: Updated the implementation so that
> E820_SOFT_RESERVED regions no longer overlap the E820 RAM entries.
>
> The previous implementation created overlapping E820 entries by first adding
> a large E820_RAM entry covering the entire above-4GB memory range, then
> adding E820_SOFT_RESERVED entries for SPM regions that overlapped with the
> RAM entry. This violated the E820 specification and caused OVMF/UEFI
> firmware to receive conflicting memory type information for the same
> physical addresses.
>
> The new implementation processes SPM regions first to identify reserved
> areas, then adds RAM entries around the SPM regions, generating a clean,
> non-overlapping E820 map.
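>
> To make the carving step concrete, here is a minimal standalone C sketch
> of the idea (not the actual QEMU code; the range values and helper names
> are illustrative, the SPM list is assumed sorted, non-overlapping, and
> inside the boot RAM range, and the type value mirrors Linux's
> E820_TYPE_SOFT_RESERVED):
>
>   #include <inttypes.h>
>   #include <stdio.h>
>
>   #define E820_RAM           1
>   #define E820_SOFT_RESERVED 0xefffffffu  /* as in Linux e820/types.h */
>
>   struct range { uint64_t start, end; };          /* [start, end) */
>
>   /* SPM regions, assumed sorted and non-overlapping. */
>   static const struct range spm[] = {
>       { 0x180000000ULL, 0x1c0000000ULL },         /* illustrative 1 GiB */
>   };
>
>   static void emit(uint64_t start, uint64_t end, uint32_t type)
>   {
>       if (end > start)                            /* skip empty gaps */
>           printf("e820: [0x%" PRIx64 "-0x%" PRIx64 "] type 0x%x\n",
>                  start, end - 1, (unsigned)type);
>   }
>
>   /* Emit RAM for the gaps and SOFT_RESERVED for the SPM regions:
>    * the map is built in one pass and no two entries ever overlap. */
>   static void build_e820(uint64_t ram_start, uint64_t ram_end)
>   {
>       uint64_t cur = ram_start;
>       for (size_t i = 0; i < sizeof(spm) / sizeof(spm[0]); i++) {
>           emit(cur, spm[i].start, E820_RAM);
>           emit(spm[i].start, spm[i].end, E820_SOFT_RESERVED);
>           cur = spm[i].end;
>       }
>       emit(cur, ram_end, E820_RAM);               /* trailing RAM */
>   }
>
>   int main(void)
>   {
>       build_e820(0x100000000ULL, 0x200000000ULL); /* 4 GiB .. 8 GiB */
>       return 0;
>   }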
>
> Now, regarding your questions:
>
> ========================================================================
> Why SPM Must Be Boot Memory
> ========================================================================
>
> SPM cannot be implemented as hotplug memory (DIMM/NVDIMM). The primary
> goal of SPM is to ensure that the memory is managed by guest device
> drivers, not the guest OS, which requires boot-time discovery for two
> key reasons:
>
> 1. SPM regions must appear in the E820 memory map as `E820_SOFT_RESERVED`
> during firmware initialization, before the OS starts.
>
> 2. Hotplug memory is integrated into kernel memory management, making
> it unavailable for device-specific use.
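>
> To illustrate point 1: on a Linux guest that honors E820_SOFT_RESERVED
> (CONFIG_EFI_SOFT_RESERVE), such a range is kept out of the general-purpose
> page allocator and shows up in /proc/iomem as "Soft Reserved" rather than
> "System RAM", where a driver (e.g. device-dax/hmem) can later claim it.
> The addresses below are made up:
>
>   100000000-17fffffff : System RAM
>   180000000-1bfffffff : Soft Reserved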
>
> ========================================================================
> Detailed Use Case
> ========================================================================
>
> **Background**
> Unified Address Space for CPU and GPU:
>
> Modern heterogeneous computing architectures implement a coherent,
> unified address space shared between CPUs and GPUs. Unlike traditional
> discrete GPU designs with a dedicated frame buffer, these accelerators
> connect CPU and GPU through high-speed interconnects (e.g., XGMI):
>
> - **HBM (High Bandwidth Memory)**: Physically attached to each GPU,
> reported to the OS as driver-managed system memory
>
> - **XGMI (eXternal Global Memory Interconnect, a.k.a. Infinity Fabric)**:
> Maintains data coherence between CPU and GPU, enabling direct CPU
> access to GPU HBM without data copying
>
> In this architecture, GPU HBM is reported to the OS as system memory,
> but it must be managed exclusively by the GPU driver rather than by
> the general OS memory allocator. This driver-managed memory provides
> optimal performance for GPU workloads while enabling coherent CPU-GPU
> data sharing over XGMI. This is where SPM (Specific Purpose Memory)
> becomes essential.
>
> **Virtualization Scenario**
>
> In virtualization, the hypervisor needs to expose this memory topology
> to guest VMs while maintaining the same driver-managed vs. OS-managed
> distinction.
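>
> As a rough sketch of the intended usage (option placement, sizes and
> IDs here are illustrative, not the authoritative syntax of this
> series):
>
>   qemu-system-x86_64 -m 8G \
>       -object memory-backend-ram,id=m0,size=4G \
>       -object memory-backend-ram,id=m1,size=4G \
>       -numa node,nodeid=0,memdev=m0 \
>       -numa node,nodeid=1,memdev=m1,spm=on
>
> The spm=on node's memory would then be described to the firmware as
> specific purpose memory (EFI_MEMORY_SP / E820_SOFT_RESERVED), leaving
> it for the guest GPU driver to claim.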
Just wondering, could device hotplug in that model ever work? I guess we
wouldn't expose the memory at all in e820 (after all, it gets hotplugged
later) and instead the device driver in the guest would have to
detect+hotplug that memory.
But that sounds weird, because the device driver in the VM shouldn't do
something virt specific.
Which raises the question: how is device hotplug of such GPUs handled on
bare metal? Or does it simply not work? :)
--
Cheers
David / dhildenb