From: "Michael S. Tsirkin" <mst@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
"Igor Mammedov" <imammedo@redhat.com>,
"Eduardo Habkost" <eduardo@habkost.net>,
"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Yanan Wang" <wangyanan55@huawei.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Richard Henderson" <richard.henderson@linaro.org>,
"Ani Sinha" <anisinha@redhat.com>,
qemu-arm@nongnu.org
Subject: [PULL 37/63] smbios: make memory device size configurable per Machine
Date: Sun, 21 Jul 2024 20:18:06 -0400
Message-ID: <0dd7f4777d1d44fe67f3323b462bc0f0a2f686a2.1721607331.git.mst@redhat.com>
In-Reply-To: <cover.1721607331.git.mst@redhat.com>
From: Igor Mammedov <imammedo@redhat.com>
Currently QEMU describes initial[1] RAM* in SMBIOS as a series of
virtual DIMMs (capped at 16 GiB each) using type 17 structure entries.
This is fine in most cases. However, when starting a guest
with terabytes of RAM, this leads to too many memory device
structures, which eventually upsets the Linux kernel: it reserves
only 64K for these entries, and once that limit is exceeded
it runs out of reserved memory.
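To make the partitioning described above concrete, here is a minimal standalone sketch of the arithmetic. It mirrors the dimm_cnt / GET_DIMM_SZ logic touched by this patch, but dimm_count() and dimm_size() are hypothetical helper names used only for illustration; they are not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)
#define TiB (1ULL << 40)

/* Number of type 17 entries needed to cover ram_size with chunk-sized DIMMs. */
static uint64_t dimm_count(uint64_t ram_size, uint64_t chunk)
{
    assert(ram_size > 0 && chunk > 0);
    return (ram_size + chunk - 1) / chunk;   /* round up */
}

/* Size of DIMM number i (0-based): full chunks first, remainder in the last. */
static uint64_t dimm_size(uint64_t i, uint64_t ram_size, uint64_t chunk)
{
    uint64_t cnt = dimm_count(ram_size, chunk);

    assert(i < cnt);
    return (i < cnt - 1) ? chunk : ((ram_size - 1) % chunk) + 1;
}
```

For a 4 TiB guest, dimm_count(4 * TiB, 16 * GiB) yields 256 entries, while a 2047 TiB chunk yields a single entry covering all of the RAM.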
Instead of partitioning initial RAM into 16 GiB DIMMs, use the maximum
possible chunk size that the SMBIOS spec allows[2]. The spec lets us
encode the DIMM size in the lower 31 bits of a 32-bit field (which
amounts to up to 2047 TiB per DIMM).
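As a rough illustration of the encoding referred to above: per the SMBIOS spec, the 16-bit Size field holds the size in MiB, and the value 0x7FFF means the real size is in the 32-bit Extended Size field, whose bits 30:0 carry the size in MiB. The sketch below is an assumption-laden model, not QEMU's code: struct mem_dev_size and encode_mem_dev_size() are made-up names, and byte order is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1ULL << 20)
#define TiB (1ULL << 40)

/* Hypothetical holder for the two size fields of a type 17 structure. */
struct mem_dev_size {
    uint16_t size;           /* MiB, or 0x7FFF meaning "see extended_size" */
    uint32_t extended_size;  /* bits 30:0: size in MiB, bit 31 reserved */
};

static struct mem_dev_size encode_mem_dev_size(uint64_t bytes)
{
    struct mem_dev_size f = { 0, 0 };
    uint64_t size_mb = (bytes + MiB - 1) / MiB;   /* round up to whole MiB */

    if (size_mb < 0x7FFF) {
        /* Small enough for the legacy 16-bit field. */
        f.size = (uint16_t)size_mb;
    } else {
        /* Must fit into the 31 usable bits of Extended Size. */
        assert(size_mb <= 0x7FFFFFFFULL);
        f.size = 0x7FFF;
        f.extended_size = (uint32_t)size_mb;
    }
    return f;
}
```

The largest encodable value is 2^31 - 1 MiB, which is 1 MiB short of 2048 TiB, so 2047 TiB is the largest whole-TiB chunk that fits.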
As a result, initial RAM will generate only one type 17 structure
until hosts/guests become able to use more RAM in the future.
Compat changes:
We can't unconditionally change the chunk size, as that would break
the QEMU<->guest ABI (and migration). Thus introduce a new machine
class field that lets older versioned machine types keep using
legacy 16 GiB chunks, while new(er) machine type[s] use the maximum
possible chunk size.
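The compat scheme above can be modelled in isolation as follows. This is only a sketch: struct machine_class is a stripped-down, hypothetical stand-in for QEMU's MachineClass, and the function names merely imitate the versioned *_options() chaining seen in the diff below.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)
#define TiB (1ULL << 40)

/* Hypothetical, stripped-down stand-in for QEMU's MachineClass. */
struct machine_class {
    uint64_t smbios_memory_device_size;
};

/* Base class init: every machine type starts with the new default. */
static void base_class_init(struct machine_class *mc)
{
    mc->smbios_memory_device_size = 2047 * TiB;
}

/* Newest versioned machine: keeps whatever the base class set. */
static void machine_9_1_options(struct machine_class *mc)
{
    (void)mc;
}

/* Older versioned machine: chain to the newer one, then pin the legacy size. */
static void machine_9_0_options(struct machine_class *mc)
{
    machine_9_1_options(mc);
    mc->smbios_memory_device_size = 16 * GiB;
}
```

Because each older version calls the next newer one before applying its own overrides, 9.0 and anything chaining through it keeps the 16 GiB layout, so the guest-visible tables stay the same across migration.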
PS:
While it might seem risky to raise the max entry size this much
(far beyond what current physical RAM modules support),
I wouldn't expect it to cause many issues, modulo uncovering bugs
in software running within the guest. Those should be fixed
on the guest side to handle the SMBIOS spec properly, especially if
the guest is expected to support such huge RAM configs.
In the worst case, QEMU can reduce the chunk size later if we
care enough about introducing a workaround for some 'unfixable'
guest OS, either by fixing up the next machine type or by
giving users a CLI option to customize it.
1) Initial RAM is RAM configured with the '-m SIZE' CLI option or
implicitly defined by the machine. It doesn't include memory
configured with the '-device' option[s] (pc-dimm, nvdimm, ...)
2) SMBIOS 3.1.0 7.18.5 Memory Device — Extended Size
PS:
* Tested on an 8 TB host with a RHEL6 guest, which seems to parse
type 17 SMBIOS table entries correctly (according to 'dmidecode').
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240715122417.4059293-1-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
include/hw/boards.h | 4 ++++
hw/arm/virt.c | 1 +
hw/core/machine.c | 6 ++++++
hw/i386/pc_piix.c | 1 +
hw/i386/pc_q35.c | 1 +
hw/smbios/smbios.c | 11 ++++++-----
6 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/include/hw/boards.h b/include/hw/boards.h
index ef6f18f2c1..48ff6d8b93 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -237,6 +237,9 @@ typedef struct {
  * purposes only.
  * Applies only to default memory backend, i.e., explicit memory backend
  * wasn't used.
+ * @smbios_memory_device_size:
+ * Default size of memory device,
+ * SMBIOS 3.1.0 "7.18 Memory Device (Type 17)"
  */
 struct MachineClass {
     /*< private >*/
@@ -304,6 +307,7 @@ struct MachineClass {
     const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
     int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
     ram_addr_t (*fixup_ram_size)(ram_addr_t size);
+    uint64_t smbios_memory_device_size;
 };
 /**
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b0c68d66a3..719e83e6a1 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3308,6 +3308,7 @@ DEFINE_VIRT_MACHINE_AS_LATEST(9, 1)
 static void virt_machine_9_0_options(MachineClass *mc)
 {
     virt_machine_9_1_options(mc);
+    mc->smbios_memory_device_size = 16 * GiB;
     compat_props_add(mc->compat_props, hw_compat_9_0, hw_compat_9_0_len);
 }
 DEFINE_VIRT_MACHINE(9, 0)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index bc38cad7f2..ac30544e7f 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1004,6 +1004,12 @@ static void machine_class_init(ObjectClass *oc, void *data)
     /* Default 128 MB as guest ram size */
     mc->default_ram_size = 128 * MiB;
     mc->rom_file_has_mr = true;
+    /*
+     * SMBIOS 3.1.0 7.18.5 Memory Device — Extended Size
+     * use max possible value that could be encoded into
+     * 'Extended Size' field (2047Tb).
+     */
+    mc->smbios_memory_device_size = 2047 * TiB;
     /* numa node memory size aligned on 8MB by default.
      * On Linux, each node's border has to be 8MB aligned
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 9445b07b4f..d9e69243b4 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -495,6 +495,7 @@ static void pc_i440fx_machine_9_0_options(MachineClass *m)
     pc_i440fx_machine_9_1_options(m);
     m->alias = NULL;
     m->is_default = false;
+    m->smbios_memory_device_size = 16 * GiB;
     compat_props_add(m->compat_props, hw_compat_9_0, hw_compat_9_0_len);
     compat_props_add(m->compat_props, pc_compat_9_0, pc_compat_9_0_len);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 71d3c6d122..9d108b194e 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -374,6 +374,7 @@ static void pc_q35_machine_9_0_options(MachineClass *m)
     PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
     pc_q35_machine_9_1_options(m);
     m->alias = NULL;
+    m->smbios_memory_device_size = 16 * GiB;
     compat_props_add(m->compat_props, hw_compat_9_0, hw_compat_9_0_len);
     compat_props_add(m->compat_props, pc_compat_9_0, pc_compat_9_0_len);
     pcmc->isa_bios_alias = false;
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index 3b7703489d..a394514264 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -1093,6 +1093,7 @@ static bool smbios_get_tables_ep(MachineState *ms,
                                  Error **errp)
 {
     unsigned i, dimm_cnt, offset;
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
     ERRP_GUARD();
     assert(ep_type == SMBIOS_ENTRY_POINT_TYPE_32 ||
@@ -1123,12 +1124,12 @@ static bool smbios_get_tables_ep(MachineState *ms,
     smbios_build_type_9_table(errp);
     smbios_build_type_11_table();
-#define MAX_DIMM_SZ (16 * GiB)
-#define GET_DIMM_SZ ((i < dimm_cnt - 1) ? MAX_DIMM_SZ \
-                                        : ((current_machine->ram_size - 1) % MAX_DIMM_SZ) + 1)
+#define GET_DIMM_SZ ((i < dimm_cnt - 1) ? mc->smbios_memory_device_size \
+    : ((current_machine->ram_size - 1) % mc->smbios_memory_device_size) + 1)
-    dimm_cnt = QEMU_ALIGN_UP(current_machine->ram_size, MAX_DIMM_SZ) /
-               MAX_DIMM_SZ;
+    dimm_cnt = QEMU_ALIGN_UP(current_machine->ram_size,
+                             mc->smbios_memory_device_size) /
+               mc->smbios_memory_device_size;
     /*
      * The offset determines if we need to keep additional space between
--
MST