From: "Michael S. Tsirkin" <mst@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
"Igor Mammedov" <imammedo@redhat.com>,
"Eduardo Habkost" <eduardo@habkost.net>,
"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Yanan Wang" <wangyanan55@huawei.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Richard Henderson" <richard.henderson@linaro.org>,
"Ani Sinha" <anisinha@redhat.com>,
qemu-arm@nongnu.org
Subject: [PULL v2 36/61] smbios: make memory device size configurable per Machine
Date: Tue, 23 Jul 2024 06:58:31 -0400
Message-ID: <62f182c97b31445012d37181005a83ff8453edaa.1721731723.git.mst@redhat.com>
In-Reply-To: <cover.1721731723.git.mst@redhat.com>
From: Igor Mammedov <imammedo@redhat.com>
Currently QEMU describes initial[1] RAM* in SMBIOS as a series of
virtual DIMMs (capped at 16GiB each) using type 17 structure entries.
This is fine for most cases. However, when starting a guest
with terabytes of RAM, this leads to too many memory device
structures, which eventually upsets the Linux kernel: it reserves
only 64K for these entries, and once that limit is crossed
it runs out of reserved memory.
Instead of partitioning initial RAM into 16GiB DIMMs, use the maximum
possible chunk size that the SMBIOS spec allows[2], which lets us
encode the RAM size in the lower 31 bits of a 32-bit field (which
amounts to up to 2047TiB per DIMM).
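
To make the field arithmetic above concrete, here is a minimal sketch. It
assumes the 'Extended Size' value is expressed in megabytes and that only
bits 30:0 carry the size, which is my reading of the spec section cited
in [2]. The helper name encode_extended_size() and the EXT_SIZE_MASK
constant are hypothetical and are not part of this patch or of QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MiB (1ULL << 20)
#define TiB (1ULL << 40)

/* Assumption: only the lower 31 bits of the 32-bit field hold the size. */
#define EXT_SIZE_MASK 0x7FFFFFFFu

/*
 * Hypothetical helper: convert a size in bytes to a megabyte count and
 * store it in *out if it fits in 31 bits. Returns 1 on success, 0 if the
 * size is too large to be encoded.
 */
static int encode_extended_size(uint64_t size_bytes, uint32_t *out)
{
    uint64_t mb = size_bytes / MiB;

    assert(out != NULL);
    if (mb > EXT_SIZE_MASK) {
        return 0;
    }
    *out = (uint32_t)mb;
    return 1;
}
```

Under these assumptions 2047 * TiB fits in the field, while 2048 * TiB
needs bit 31 and is rejected, which is consistent with the 2047 figure
used as the per-DIMM limit.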
As a result, initial RAM will generate only one type 17 structure
until hosts/guests become able to use more RAM in the future.
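
The effect on the number of type 17 entries can be sketched as follows.
This is an illustration only: dimm_count() and dimm_size() are
hypothetical names that restate the dimm_cnt computation and the
GET_DIMM_SZ macro from hw/smbios/smbios.c as plain functions, with the
chunk size passed in as a parameter.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)
#define TiB (1ULL << 40)

/*
 * Number of memory device entries needed to describe ram_size when it is
 * split into chunks of at most 'chunk' bytes. Equivalent to
 * QEMU_ALIGN_UP(ram_size, chunk) / chunk.
 */
static uint64_t dimm_count(uint64_t ram_size, uint64_t chunk)
{
    assert(chunk > 0);
    return (ram_size + chunk - 1) / chunk;
}

/*
 * Size of entry i: every entry except the last is a full chunk, the last
 * one holds whatever remains (a full chunk if ram_size divides evenly).
 */
static uint64_t dimm_size(uint64_t ram_size, uint64_t chunk, uint64_t i)
{
    uint64_t cnt = dimm_count(ram_size, chunk);

    assert(i < cnt);
    return (i < cnt - 1) ? chunk : ((ram_size - 1) % chunk) + 1;
}
```

For an 8 TiB guest, dimm_count(8 * TiB, 16 * GiB) gives 512 entries,
whereas dimm_count(8 * TiB, 2047 * TiB) gives a single entry whose size
is the whole 8 TiB.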
Compat changes:
We can't unconditionally change the chunk size, as it would break
QEMU<->guest ABI (and migration). Thus introduce a new machine
class field that lets older versioned machines use
legacy 16GiB chunks, while new(er) machine type[s] use the maximum
possible chunk size.
PS:
While it might seem risky to raise the max entry size this much
(far beyond what current physical RAM modules support),
I wouldn't expect it to cause many issues, modulo uncovering bugs
in software running within the guest. Those should be fixed
on the guest side to handle the SMBIOS spec properly, especially if
the guest is expected to support such huge RAM configs.
In the worst case, QEMU can reduce the chunk size later if we
care enough about introducing a workaround for some 'unfixable'
guest OS, either by fixing up the next machine type or by
giving users a CLI option to customize it.
1) Initial RAM is RAM configured with the '-m SIZE' CLI option or
implicitly defined by the machine. It doesn't include memory
configured with the '-device' option[s] (pcdimm,nvdimm,...)
2) SMBIOS 3.1.0 7.18.5 Memory Device — Extended Size
PS:
* tested on an 8TiB host with a RHEL6 guest, which seems to parse
type 17 SMBIOS table entries correctly (according to 'dmidecode').
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240715122417.4059293-1-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
include/hw/boards.h | 4 ++++
hw/arm/virt.c | 1 +
hw/core/machine.c | 6 ++++++
hw/i386/pc_piix.c | 1 +
hw/i386/pc_q35.c | 1 +
hw/smbios/smbios.c | 11 ++++++-----
6 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/include/hw/boards.h b/include/hw/boards.h
index ef6f18f2c1..48ff6d8b93 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -237,6 +237,9 @@ typedef struct {
* purposes only.
* Applies only to default memory backend, i.e., explicit memory backend
* wasn't used.
+ * @smbios_memory_device_size:
+ * Default size of memory device,
+ * SMBIOS 3.1.0 "7.18 Memory Device (Type 17)"
*/
struct MachineClass {
/*< private >*/
@@ -304,6 +307,7 @@ struct MachineClass {
const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
ram_addr_t (*fixup_ram_size)(ram_addr_t size);
+ uint64_t smbios_memory_device_size;
};
/**
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b0c68d66a3..719e83e6a1 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3308,6 +3308,7 @@ DEFINE_VIRT_MACHINE_AS_LATEST(9, 1)
static void virt_machine_9_0_options(MachineClass *mc)
{
virt_machine_9_1_options(mc);
+ mc->smbios_memory_device_size = 16 * GiB;
compat_props_add(mc->compat_props, hw_compat_9_0, hw_compat_9_0_len);
}
DEFINE_VIRT_MACHINE(9, 0)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index bc38cad7f2..ac30544e7f 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1004,6 +1004,12 @@ static void machine_class_init(ObjectClass *oc, void *data)
/* Default 128 MB as guest ram size */
mc->default_ram_size = 128 * MiB;
mc->rom_file_has_mr = true;
+ /*
+ * SMBIOS 3.1.0 7.18.5 Memory Device — Extended Size
+ * use max possible value that could be encoded into
+ * 'Extended Size' field (2047Tb).
+ */
+ mc->smbios_memory_device_size = 2047 * TiB;
/* numa node memory size aligned on 8MB by default.
* On Linux, each node's border has to be 8MB aligned
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 9445b07b4f..d9e69243b4 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -495,6 +495,7 @@ static void pc_i440fx_machine_9_0_options(MachineClass *m)
pc_i440fx_machine_9_1_options(m);
m->alias = NULL;
m->is_default = false;
+ m->smbios_memory_device_size = 16 * GiB;
compat_props_add(m->compat_props, hw_compat_9_0, hw_compat_9_0_len);
compat_props_add(m->compat_props, pc_compat_9_0, pc_compat_9_0_len);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 71d3c6d122..9d108b194e 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -374,6 +374,7 @@ static void pc_q35_machine_9_0_options(MachineClass *m)
PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
pc_q35_machine_9_1_options(m);
m->alias = NULL;
+ m->smbios_memory_device_size = 16 * GiB;
compat_props_add(m->compat_props, hw_compat_9_0, hw_compat_9_0_len);
compat_props_add(m->compat_props, pc_compat_9_0, pc_compat_9_0_len);
pcmc->isa_bios_alias = false;
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index 3b7703489d..a394514264 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -1093,6 +1093,7 @@ static bool smbios_get_tables_ep(MachineState *ms,
Error **errp)
{
unsigned i, dimm_cnt, offset;
+ MachineClass *mc = MACHINE_GET_CLASS(ms);
ERRP_GUARD();
assert(ep_type == SMBIOS_ENTRY_POINT_TYPE_32 ||
@@ -1123,12 +1124,12 @@ static bool smbios_get_tables_ep(MachineState *ms,
smbios_build_type_9_table(errp);
smbios_build_type_11_table();
-#define MAX_DIMM_SZ (16 * GiB)
-#define GET_DIMM_SZ ((i < dimm_cnt - 1) ? MAX_DIMM_SZ \
- : ((current_machine->ram_size - 1) % MAX_DIMM_SZ) + 1)
+#define GET_DIMM_SZ ((i < dimm_cnt - 1) ? mc->smbios_memory_device_size \
+ : ((current_machine->ram_size - 1) % mc->smbios_memory_device_size) + 1)
- dimm_cnt = QEMU_ALIGN_UP(current_machine->ram_size, MAX_DIMM_SZ) /
- MAX_DIMM_SZ;
+ dimm_cnt = QEMU_ALIGN_UP(current_machine->ram_size,
+ mc->smbios_memory_device_size) /
+ mc->smbios_memory_device_size;
/*
* The offset determines if we need to keep additional space between
--
MST