* [PATCH v2 0/3] RISC-V virt MHP support
@ 2024-05-21 10:56 Björn Töpel
2024-05-21 10:56 ` [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support Björn Töpel
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Björn Töpel @ 2024-05-21 10:56 UTC (permalink / raw)
To: Palmer Dabbelt, Alistair Francis, Bin Meng, Weiwei Li,
Daniel Henrique Barboza, Liu Zhiwei, qemu-riscv, qemu-devel,
David Hildenbrand, Atish Patra, Atish Patra
Cc: Björn Töpel, Sunil V L, Santosh Mamila,
Chethan Seshadri, Sivakumar Munnangi
From: Björn Töpel <bjorn@rivosinc.com>
The RISC-V "virt" machine is currently missing memory hotplugging
support (MHP). This series adds the missing virtio-md, and PC-DIMM
support.
I've include both the ACPI MHP/GED/PC-DIMM, and virtio-md in the same
series. The two patches (virtio-md) are probably less controversial,
and can be pulled separately from the last patch.
The first patch includes MHP that works with DT. The second patch is
basic device memory ACPI SRAT plumbing.
The third patch includes ACPI GED, and ACPI MHP/PC-DIMM support.
Changelog
v1->v2:
* Cap the maximum number of supported ram slots (Daniel)
* Split the ACPI parts out from virtio-md patch (Daniel)
* Allow for max 1G PC-DIMM alignment per ram slot (Daniel)
* Add missing defined(__riscv__) for virtio-mem code
* Include PC-DIMM MHP supportin the series
* Misc code structure changes
Cheers,
Björn
Björn Töpel (3):
hw/riscv/virt: Add memory hotplugging and virtio-md-pci support
hw/riscv/virt-acpi-build: Expose device memory in ACPI SRAT
hw/riscv/virt: Add ACPI GED and PC-DIMM MHP support
hw/riscv/Kconfig | 5 +
hw/riscv/virt-acpi-build.c | 23 +++++
hw/riscv/virt.c | 183 ++++++++++++++++++++++++++++++++++++-
hw/virtio/virtio-mem.c | 5 +-
include/hw/riscv/virt.h | 6 +-
5 files changed, 218 insertions(+), 4 deletions(-)
base-commit: 01782d6b294f95bcde334386f0aaac593cd28c0d
--
2.40.1
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support
2024-05-21 10:56 [PATCH v2 0/3] RISC-V virt MHP support Björn Töpel
@ 2024-05-21 10:56 ` Björn Töpel
2024-05-24 8:53 ` Daniel Henrique Barboza
2024-05-24 13:14 ` Daniel Henrique Barboza
2024-05-21 10:56 ` [PATCH v2 2/3] hw/riscv/virt-acpi-build: Expose device memory in ACPI SRAT Björn Töpel
2024-05-21 10:56 ` [PATCH v2 3/3] hw/riscv/virt: Add ACPI GED and PC-DIMM MHP support Björn Töpel
2 siblings, 2 replies; 14+ messages in thread
From: Björn Töpel @ 2024-05-21 10:56 UTC (permalink / raw)
To: Palmer Dabbelt, Alistair Francis, Bin Meng, Weiwei Li,
Daniel Henrique Barboza, Liu Zhiwei, qemu-riscv, qemu-devel,
David Hildenbrand, Atish Patra, Atish Patra
Cc: Björn Töpel, Sunil V L, Santosh Mamila,
Chethan Seshadri, Sivakumar Munnangi
From: Björn Töpel <bjorn@rivosinc.com>
Virtio-based memory devices (virtio-mem/virtio-pmem) allows for
dynamic resizing of virtual machine memory, and requires proper
hotplugging (add/remove) support to work.
Add device memory support for RISC-V "virt" machine, and enable
virtio-md-pci with the corresponding missing hotplugging callbacks.
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
---
hw/riscv/Kconfig | 2 +
hw/riscv/virt.c | 83 +++++++++++++++++++++++++++++++++++++++++-
hw/virtio/virtio-mem.c | 5 ++-
3 files changed, 87 insertions(+), 3 deletions(-)
diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
index a2030e3a6ff0..08f82dbb681a 100644
--- a/hw/riscv/Kconfig
+++ b/hw/riscv/Kconfig
@@ -56,6 +56,8 @@ config RISCV_VIRT
select PLATFORM_BUS
select ACPI
select ACPI_PCI
+ select VIRTIO_MEM_SUPPORTED
+ select VIRTIO_PMEM_SUPPORTED
config SHAKTI_C
bool
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 4fdb66052587..443902f919d2 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -53,6 +53,8 @@
#include "hw/pci-host/gpex.h"
#include "hw/display/ramfb.h"
#include "hw/acpi/aml-build.h"
+#include "hw/mem/memory-device.h"
+#include "hw/virtio/virtio-mem-pci.h"
#include "qapi/qapi-visit-common.h"
#include "hw/virtio/virtio-iommu.h"
@@ -1407,6 +1409,7 @@ static void virt_machine_init(MachineState *machine)
DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip;
int i, base_hartid, hart_count;
int socket_count = riscv_socket_count(machine);
+ hwaddr device_memory_base, device_memory_size;
/* Check socket count limit */
if (VIRT_SOCKETS_MAX < socket_count) {
@@ -1420,6 +1423,12 @@ static void virt_machine_init(MachineState *machine)
exit(1);
}
+ if (machine->ram_slots > ACPI_MAX_RAM_SLOTS) {
+ error_report("unsupported amount of memory slots: %"PRIu64,
+ machine->ram_slots);
+ exit(EXIT_FAILURE);
+ }
+
/* Initialize sockets */
mmio_irqchip = virtio_irqchip = pcie_irqchip = NULL;
for (i = 0; i < socket_count; i++) {
@@ -1553,6 +1562,37 @@ static void virt_machine_init(MachineState *machine)
memory_region_add_subregion(system_memory, memmap[VIRT_MROM].base,
mask_rom);
+ /* device memory */
+ device_memory_base = ROUND_UP(s->memmap[VIRT_DRAM].base + machine->ram_size,
+ GiB);
+ device_memory_size = machine->maxram_size - machine->ram_size;
+ if (device_memory_size > 0) {
+ /*
+ * Each DIMM is aligned based on the backend's alignment value.
+ * Assume max 1G hugepage alignment per slot.
+ */
+ device_memory_size += machine->ram_slots * GiB;
+
+ if (riscv_is_32bit(&s->soc[0])) {
+ hwaddr memtop = device_memory_base + ROUND_UP(device_memory_size,
+ GiB);
+
+ if (memtop > UINT32_MAX) {
+ error_report("memory exceeds 32-bit limit by %lu bytes",
+ memtop - UINT32_MAX);
+ exit(EXIT_FAILURE);
+ }
+ }
+
+ if (device_memory_base + device_memory_size < device_memory_size) {
+ error_report("unsupported amount of device memory");
+ exit(EXIT_FAILURE);
+ }
+
+ machine_memory_devices_init(machine, device_memory_base,
+ device_memory_size);
+ }
+
/*
* Init fw_cfg. Must be done before riscv_load_fdt, otherwise the
* device tree cannot be altered and we get FDT_ERR_NOSPACE.
@@ -1712,12 +1752,21 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
MachineClass *mc = MACHINE_GET_CLASS(machine);
if (device_is_dynamic_sysbus(mc, dev) ||
- object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
+ object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI) ||
+ object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
return HOTPLUG_HANDLER(machine);
}
return NULL;
}
+static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
+ DeviceState *dev, Error **errp)
+{
+ if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
+ virtio_md_pci_pre_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
+ }
+}
+
static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
DeviceState *dev, Error **errp)
{
@@ -1735,6 +1784,35 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
create_fdt_virtio_iommu(s, pci_get_bdf(PCI_DEVICE(dev)));
}
+
+ if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
+ virtio_md_pci_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
+ }
+}
+
+static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
+ DeviceState *dev,
+ Error **errp)
+{
+ if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
+ virtio_md_pci_unplug_request(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev),
+ errp);
+ } else {
+ error_setg(errp,
+ "device unplug request for unsupported device type: %s",
+ object_get_typename(OBJECT(dev)));
+ }
+}
+
+static void virt_machine_device_unplug_cb(HotplugHandler *hotplug_dev,
+ DeviceState *dev, Error **errp)
+{
+ if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
+ virtio_md_pci_unplug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
+ } else {
+ error_setg(errp, "virt: device unplug for unsupported device"
+ " type: %s", object_get_typename(OBJECT(dev)));
+ }
}
static void virt_machine_class_init(ObjectClass *oc, void *data)
@@ -1757,7 +1835,10 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
assert(!mc->get_hotplug_handler);
mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
+ hc->pre_plug = virt_machine_device_pre_plug_cb;
hc->plug = virt_machine_device_plug_cb;
+ hc->unplug_request = virt_machine_device_unplug_request_cb;
+ hc->unplug = virt_machine_device_unplug_cb;
machine_class_allow_dynamic_sysbus_dev(mc, TYPE_RAMFB_DEVICE);
#ifdef CONFIG_TPM
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index ffd119ebacb7..6636e5e1089c 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -51,7 +51,8 @@ static uint32_t virtio_mem_default_thp_size(void)
{
uint32_t default_thp_size = VIRTIO_MEM_MIN_BLOCK_SIZE;
-#if defined(__x86_64__) || defined(__arm__) || defined(__powerpc64__)
+#if defined(__x86_64__) || defined(__arm__) || defined(__powerpc64__) \
+ || defined(__riscv__)
default_thp_size = 2 * MiB;
#elif defined(__aarch64__)
if (qemu_real_host_page_size() == 4 * KiB) {
@@ -161,7 +162,7 @@ static bool virtio_mem_has_shared_zeropage(RAMBlock *rb)
* necessary (as the section size can change). But it's more likely that the
* section size will rather get smaller and not bigger over time.
*/
-#if defined(TARGET_X86_64) || defined(TARGET_I386)
+#if defined(TARGET_X86_64) || defined(TARGET_I386) || defined(TARGET_RISCV)
#define VIRTIO_MEM_USABLE_EXTENT (2 * (128 * MiB))
#elif defined(TARGET_ARM)
#define VIRTIO_MEM_USABLE_EXTENT (2 * (512 * MiB))
--
2.40.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2 2/3] hw/riscv/virt-acpi-build: Expose device memory in ACPI SRAT
2024-05-21 10:56 [PATCH v2 0/3] RISC-V virt MHP support Björn Töpel
2024-05-21 10:56 ` [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support Björn Töpel
@ 2024-05-21 10:56 ` Björn Töpel
2024-05-21 10:56 ` [PATCH v2 3/3] hw/riscv/virt: Add ACPI GED and PC-DIMM MHP support Björn Töpel
2 siblings, 0 replies; 14+ messages in thread
From: Björn Töpel @ 2024-05-21 10:56 UTC (permalink / raw)
To: Palmer Dabbelt, Alistair Francis, Bin Meng, Weiwei Li,
Daniel Henrique Barboza, Liu Zhiwei, qemu-riscv, qemu-devel,
David Hildenbrand, Atish Patra, Atish Patra
Cc: Björn Töpel, Sunil V L, Santosh Mamila,
Chethan Seshadri, Sivakumar Munnangi
From: Björn Töpel <bjorn@rivosinc.com>
Now that device memory is supported by RISC-V 'virt', expose that
region in the ACPI SRAT (System/Static Resource Affinity Table).
ACPI SRAT is used by, e.g., the virtio-mem Linux kernel driver [1].
Link: https://virtio-mem.gitlab.io/user-guide/user-guide-linux.html # [1]
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
---
hw/riscv/virt-acpi-build.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/hw/riscv/virt-acpi-build.c b/hw/riscv/virt-acpi-build.c
index 0925528160f8..6dc3baa9ec86 100644
--- a/hw/riscv/virt-acpi-build.c
+++ b/hw/riscv/virt-acpi-build.c
@@ -610,6 +610,13 @@ build_srat(GArray *table_data, BIOSLinker *linker, RISCVVirtState *vms)
}
}
+ if (ms->device_memory) {
+ build_srat_memory(table_data, ms->device_memory->base,
+ memory_region_size(&ms->device_memory->mr),
+ ms->numa_state->num_nodes - 1,
+ MEM_AFFINITY_HOTPLUGGABLE | MEM_AFFINITY_ENABLED);
+ }
+
acpi_table_end(linker, &table);
}
--
2.40.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2 3/3] hw/riscv/virt: Add ACPI GED and PC-DIMM MHP support
2024-05-21 10:56 [PATCH v2 0/3] RISC-V virt MHP support Björn Töpel
2024-05-21 10:56 ` [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support Björn Töpel
2024-05-21 10:56 ` [PATCH v2 2/3] hw/riscv/virt-acpi-build: Expose device memory in ACPI SRAT Björn Töpel
@ 2024-05-21 10:56 ` Björn Töpel
2024-05-24 9:10 ` Daniel Henrique Barboza
2 siblings, 1 reply; 14+ messages in thread
From: Björn Töpel @ 2024-05-21 10:56 UTC (permalink / raw)
To: Palmer Dabbelt, Alistair Francis, Bin Meng, Weiwei Li,
Daniel Henrique Barboza, Liu Zhiwei, qemu-riscv, qemu-devel,
David Hildenbrand, Atish Patra, Atish Patra
Cc: Björn Töpel, Sunil V L, Santosh Mamila,
Chethan Seshadri, Sivakumar Munnangi
From: Björn Töpel <bjorn@rivosinc.com>
Add ACPI GED for the RISC-V "virt" machine, and wire up PC-DIMM memory
hotplugging support. Heavily based/copied from hw/arm/virt.c.
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
---
hw/riscv/Kconfig | 3 ++
hw/riscv/virt-acpi-build.c | 16 ++++++
hw/riscv/virt.c | 104 ++++++++++++++++++++++++++++++++++++-
include/hw/riscv/virt.h | 6 ++-
4 files changed, 126 insertions(+), 3 deletions(-)
diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
index 08f82dbb681a..bebe412f2107 100644
--- a/hw/riscv/Kconfig
+++ b/hw/riscv/Kconfig
@@ -56,6 +56,9 @@ config RISCV_VIRT
select PLATFORM_BUS
select ACPI
select ACPI_PCI
+ select MEM_DEVICE
+ select DIMM
+ select ACPI_HW_REDUCED
select VIRTIO_MEM_SUPPORTED
select VIRTIO_PMEM_SUPPORTED
diff --git a/hw/riscv/virt-acpi-build.c b/hw/riscv/virt-acpi-build.c
index 6dc3baa9ec86..61813abdef3f 100644
--- a/hw/riscv/virt-acpi-build.c
+++ b/hw/riscv/virt-acpi-build.c
@@ -27,6 +27,8 @@
#include "hw/acpi/acpi-defs.h"
#include "hw/acpi/acpi.h"
#include "hw/acpi/aml-build.h"
+#include "hw/acpi/memory_hotplug.h"
+#include "hw/acpi/generic_event_device.h"
#include "hw/acpi/pci.h"
#include "hw/acpi/utils.h"
#include "hw/intc/riscv_aclint.h"
@@ -432,6 +434,20 @@ static void build_dsdt(GArray *table_data,
acpi_dsdt_add_gpex_host(scope, PCIE_IRQ + VIRT_IRQCHIP_NUM_SOURCES * 2);
}
+ if (s->acpi_dev) {
+ uint32_t event = object_property_get_uint(OBJECT(s->acpi_dev),
+ "ged-event", &error_abort);
+
+ build_ged_aml(scope, "\\_SB."GED_DEVICE, HOTPLUG_HANDLER(s->acpi_dev),
+ GED_IRQ, AML_SYSTEM_MEMORY, memmap[VIRT_ACPI_GED].base);
+
+ if (event & ACPI_GED_MEM_HOTPLUG_EVT) {
+ build_memory_hotplug_aml(scope, ms->ram_slots, "\\_SB", NULL,
+ AML_SYSTEM_MEMORY,
+ memmap[VIRT_PCDIMM_ACPI].base);
+ }
+ }
+
aml_append(dsdt, scope);
/* copy AML table into ACPI tables blob and patch header there */
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 443902f919d2..2e35890187f2 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -53,10 +53,13 @@
#include "hw/pci-host/gpex.h"
#include "hw/display/ramfb.h"
#include "hw/acpi/aml-build.h"
+#include "hw/acpi/generic_event_device.h"
+#include "hw/acpi/memory_hotplug.h"
#include "hw/mem/memory-device.h"
#include "hw/virtio/virtio-mem-pci.h"
#include "qapi/qapi-visit-common.h"
#include "hw/virtio/virtio-iommu.h"
+#include "hw/mem/pc-dimm.h"
/* KVM AIA only supports APLIC MSI. APLIC Wired is always emulated by QEMU. */
static bool virt_use_kvm_aia(RISCVVirtState *s)
@@ -84,6 +87,8 @@ static const MemMapEntry virt_memmap[] = {
[VIRT_UART0] = { 0x10000000, 0x100 },
[VIRT_VIRTIO] = { 0x10001000, 0x1000 },
[VIRT_FW_CFG] = { 0x10100000, 0x18 },
+ [VIRT_PCDIMM_ACPI] = { 0x10200000, MEMORY_HOTPLUG_IO_LEN },
+ [VIRT_ACPI_GED] = { 0x10210000, ACPI_GED_EVT_SEL_LEN },
[VIRT_FLASH] = { 0x20000000, 0x4000000 },
[VIRT_IMSIC_M] = { 0x24000000, VIRT_IMSIC_MAX_SIZE },
[VIRT_IMSIC_S] = { 0x28000000, VIRT_IMSIC_MAX_SIZE },
@@ -1400,6 +1405,28 @@ static void virt_machine_done(Notifier *notifier, void *data)
}
}
+static DeviceState *create_acpi_ged(RISCVVirtState *s)
+{
+ DeviceState *dev;
+ MachineState *ms = MACHINE(s);
+ uint32_t event = 0;
+
+ if (ms->ram_slots) {
+ event |= ACPI_GED_MEM_HOTPLUG_EVT;
+ }
+
+ dev = qdev_new(TYPE_ACPI_GED);
+ qdev_prop_set_uint32(dev, "ged-event", event);
+ sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+
+ sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, s->memmap[VIRT_ACPI_GED].base);
+ sysbus_mmio_map(SYS_BUS_DEVICE(dev), 1, s->memmap[VIRT_PCDIMM_ACPI].base);
+ sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, qdev_get_gpio_in(s->irqchip[0],
+ GED_IRQ));
+
+ return dev;
+}
+
static void virt_machine_init(MachineState *machine)
{
const MemMapEntry *memmap = virt_memmap;
@@ -1612,6 +1639,10 @@ static void virt_machine_init(MachineState *machine)
gpex_pcie_init(system_memory, pcie_irqchip, s);
+ if (virt_is_acpi_enabled(s)) {
+ s->acpi_dev = create_acpi_ged(s);
+ }
+
create_platform_bus(s, mmio_irqchip);
serial_mm_init(system_memory, memmap[VIRT_UART0].base,
@@ -1752,6 +1783,7 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
MachineClass *mc = MACHINE_GET_CLASS(machine);
if (device_is_dynamic_sysbus(mc, dev) ||
+ object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI) ||
object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
return HOTPLUG_HANDLER(machine);
@@ -1759,14 +1791,42 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
return NULL;
}
+static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+ Error **errp)
+{
+ RISCVVirtState *s = RISCV_VIRT_MACHINE(hotplug_dev);
+
+ if (!s->acpi_dev) {
+ error_setg(errp,
+ "memory hotplug is not enabled: missing acpi-ged device");
+ return;
+ }
+
+ pc_dimm_pre_plug(PC_DIMM(dev), MACHINE(hotplug_dev), NULL, errp);
+}
+
static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
DeviceState *dev, Error **errp)
{
+ if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+ virt_memory_pre_plug(hotplug_dev, dev, errp);
+ }
+
if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
virtio_md_pci_pre_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
}
}
+static void virt_memory_plug(HotplugHandler *hotplug_dev,
+ DeviceState *dev, Error **errp)
+{
+ RISCVVirtState *s = RISCV_VIRT_MACHINE(hotplug_dev);
+
+ pc_dimm_plug(PC_DIMM(dev), MACHINE(s));
+
+ hotplug_handler_plug(HOTPLUG_HANDLER(s->acpi_dev), dev, &error_abort);
+}
+
static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
DeviceState *dev, Error **errp)
{
@@ -1785,16 +1845,36 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
create_fdt_virtio_iommu(s, pci_get_bdf(PCI_DEVICE(dev)));
}
+ if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+ virt_memory_plug(hotplug_dev, dev, errp);
+ }
+
if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
virtio_md_pci_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
}
}
+static void virt_dimm_unplug_request(HotplugHandler *hotplug_dev,
+ DeviceState *dev, Error **errp)
+{
+ RISCVVirtState *s = RISCV_VIRT_MACHINE(hotplug_dev);
+
+ if (!s->acpi_dev) {
+ error_setg(errp,
+ "memory hotplug is not enabled: missing acpi-ged device");
+ return;
+ }
+
+ hotplug_handler_unplug_request(HOTPLUG_HANDLER(s->acpi_dev), dev, errp);
+}
+
static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
DeviceState *dev,
Error **errp)
{
- if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
+ if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+ virt_dimm_unplug_request(hotplug_dev, dev, errp);
+ } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
virtio_md_pci_unplug_request(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev),
errp);
} else {
@@ -1804,10 +1884,30 @@ static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
}
}
+static void virt_dimm_unplug(HotplugHandler *hotplug_dev,
+ DeviceState *dev, Error **errp)
+{
+ RISCVVirtState *s = RISCV_VIRT_MACHINE(hotplug_dev);
+ Error *local_err = NULL;
+
+ hotplug_handler_unplug(HOTPLUG_HANDLER(s->acpi_dev), dev, &local_err);
+ if (local_err) {
+ goto out;
+ }
+
+ pc_dimm_unplug(PC_DIMM(dev), MACHINE(s));
+ qdev_unrealize(dev);
+
+out:
+ error_propagate(errp, local_err);
+}
+
static void virt_machine_device_unplug_cb(HotplugHandler *hotplug_dev,
DeviceState *dev, Error **errp)
{
- if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
+ if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+ virt_dimm_unplug(hotplug_dev, dev, errp);
+ } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
virtio_md_pci_unplug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
} else {
error_setg(errp, "virt: device unplug for unsupported device"
diff --git a/include/hw/riscv/virt.h b/include/hw/riscv/virt.h
index 3db839160f95..adf460a4021e 100644
--- a/include/hw/riscv/virt.h
+++ b/include/hw/riscv/virt.h
@@ -62,6 +62,7 @@ struct RISCVVirtState {
OnOffAuto acpi;
const MemMapEntry *memmap;
struct GPEXHost *gpex_host;
+ DeviceState *acpi_dev;
};
enum {
@@ -84,12 +85,15 @@ enum {
VIRT_PCIE_MMIO,
VIRT_PCIE_PIO,
VIRT_PLATFORM_BUS,
- VIRT_PCIE_ECAM
+ VIRT_PCIE_ECAM,
+ VIRT_PCDIMM_ACPI,
+ VIRT_ACPI_GED,
};
enum {
UART0_IRQ = 10,
RTC_IRQ = 11,
+ GED_IRQ = 12,
VIRTIO_IRQ = 1, /* 1 to 8 */
VIRTIO_COUNT = 8,
PCIE_IRQ = 0x20, /* 32 to 35 */
--
2.40.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support
2024-05-21 10:56 ` [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support Björn Töpel
@ 2024-05-24 8:53 ` Daniel Henrique Barboza
2024-05-27 11:49 ` Björn Töpel
2024-05-24 13:14 ` Daniel Henrique Barboza
1 sibling, 1 reply; 14+ messages in thread
From: Daniel Henrique Barboza @ 2024-05-24 8:53 UTC (permalink / raw)
To: Björn Töpel, Palmer Dabbelt, Alistair Francis, Bin Meng,
Weiwei Li, Liu Zhiwei, qemu-riscv, qemu-devel, David Hildenbrand,
Atish Patra, Atish Patra
Cc: Björn Töpel, Sunil V L, Santosh Mamila,
Chethan Seshadri, Sivakumar Munnangi
On 5/21/24 07:56, Björn Töpel wrote:
> From: Björn Töpel <bjorn@rivosinc.com>
>
> Virtio-based memory devices (virtio-mem/virtio-pmem) allows for
> dynamic resizing of virtual machine memory, and requires proper
> hotplugging (add/remove) support to work.
>
> Add device memory support for RISC-V "virt" machine, and enable
> virtio-md-pci with the corresponding missing hotplugging callbacks.
>
> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
> ---
> hw/riscv/Kconfig | 2 +
> hw/riscv/virt.c | 83 +++++++++++++++++++++++++++++++++++++++++-
> hw/virtio/virtio-mem.c | 5 ++-
> 3 files changed, 87 insertions(+), 3 deletions(-)
>
> diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
> index a2030e3a6ff0..08f82dbb681a 100644
> --- a/hw/riscv/Kconfig
> +++ b/hw/riscv/Kconfig
> @@ -56,6 +56,8 @@ config RISCV_VIRT
> select PLATFORM_BUS
> select ACPI
> select ACPI_PCI
> + select VIRTIO_MEM_SUPPORTED
> + select VIRTIO_PMEM_SUPPORTED
>
> config SHAKTI_C
> bool
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 4fdb66052587..443902f919d2 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -53,6 +53,8 @@
> #include "hw/pci-host/gpex.h"
> #include "hw/display/ramfb.h"
> #include "hw/acpi/aml-build.h"
> +#include "hw/mem/memory-device.h"
> +#include "hw/virtio/virtio-mem-pci.h"
> #include "qapi/qapi-visit-common.h"
> #include "hw/virtio/virtio-iommu.h"
>
> @@ -1407,6 +1409,7 @@ static void virt_machine_init(MachineState *machine)
> DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip;
> int i, base_hartid, hart_count;
> int socket_count = riscv_socket_count(machine);
> + hwaddr device_memory_base, device_memory_size;
>
> /* Check socket count limit */
> if (VIRT_SOCKETS_MAX < socket_count) {
> @@ -1420,6 +1423,12 @@ static void virt_machine_init(MachineState *machine)
> exit(1);
> }
>
> + if (machine->ram_slots > ACPI_MAX_RAM_SLOTS) {
> + error_report("unsupported amount of memory slots: %"PRIu64,
> + machine->ram_slots);
Let's also add the maximum amount allowed in this message, e.g. this error:
$ (...) -m 2G,slots=512,maxmem=8G
qemu-system-riscv64: unsupported amount of memory slots: 512
could be something like:
qemu-system-riscv64: unsupported amount of memory slots (512), maximum amount: 256
LGTM otherwise. Thanks,
Daniel
> + exit(EXIT_FAILURE);
> + }
> +
> /* Initialize sockets */
> mmio_irqchip = virtio_irqchip = pcie_irqchip = NULL;
> for (i = 0; i < socket_count; i++) {
> @@ -1553,6 +1562,37 @@ static void virt_machine_init(MachineState *machine)
> memory_region_add_subregion(system_memory, memmap[VIRT_MROM].base,
> mask_rom);
>
> + /* device memory */
> + device_memory_base = ROUND_UP(s->memmap[VIRT_DRAM].base + machine->ram_size,
> + GiB);
> + device_memory_size = machine->maxram_size - machine->ram_size;
> + if (device_memory_size > 0) {
> + /*
> + * Each DIMM is aligned based on the backend's alignment value.
> + * Assume max 1G hugepage alignment per slot.
> + */
> + device_memory_size += machine->ram_slots * GiB;
> +
> + if (riscv_is_32bit(&s->soc[0])) {
> + hwaddr memtop = device_memory_base + ROUND_UP(device_memory_size,
> + GiB);
> +
> + if (memtop > UINT32_MAX) {
> + error_report("memory exceeds 32-bit limit by %lu bytes",
> + memtop - UINT32_MAX);
> + exit(EXIT_FAILURE);
> + }
> + }
> +
> + if (device_memory_base + device_memory_size < device_memory_size) {
> + error_report("unsupported amount of device memory");
> + exit(EXIT_FAILURE);
> + }
> +
> + machine_memory_devices_init(machine, device_memory_base,
> + device_memory_size);
> + }
> +> /*
> * Init fw_cfg. Must be done before riscv_load_fdt, otherwise the
> * device tree cannot be altered and we get FDT_ERR_NOSPACE.
> @@ -1712,12 +1752,21 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
> MachineClass *mc = MACHINE_GET_CLASS(machine);
>
> if (device_is_dynamic_sysbus(mc, dev) ||
> - object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
> + object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI) ||
> + object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> return HOTPLUG_HANDLER(machine);
> }
> return NULL;
> }
>
> +static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
> + DeviceState *dev, Error **errp)
> +{
> + if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> + virtio_md_pci_pre_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
> + }
> +}
> +
> static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
> DeviceState *dev, Error **errp)
> {
> @@ -1735,6 +1784,35 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
> if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
> create_fdt_virtio_iommu(s, pci_get_bdf(PCI_DEVICE(dev)));
> }
> +
> + if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> + virtio_md_pci_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
> + }
> +}
> +
> +static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
> + DeviceState *dev,
> + Error **errp)
> +{
> + if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> + virtio_md_pci_unplug_request(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev),
> + errp);
> + } else {
> + error_setg(errp,
> + "device unplug request for unsupported device type: %s",
> + object_get_typename(OBJECT(dev)));
> + }
> +}
> +
> +static void virt_machine_device_unplug_cb(HotplugHandler *hotplug_dev,
> + DeviceState *dev, Error **errp)
> +{
> + if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> + virtio_md_pci_unplug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
> + } else {
> + error_setg(errp, "virt: device unplug for unsupported device"
> + " type: %s", object_get_typename(OBJECT(dev)));
> + }
> }
>
> static void virt_machine_class_init(ObjectClass *oc, void *data)
> @@ -1757,7 +1835,10 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
> assert(!mc->get_hotplug_handler);
> mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
>
> + hc->pre_plug = virt_machine_device_pre_plug_cb;
> hc->plug = virt_machine_device_plug_cb;
> + hc->unplug_request = virt_machine_device_unplug_request_cb;
> + hc->unplug = virt_machine_device_unplug_cb;
>
> machine_class_allow_dynamic_sysbus_dev(mc, TYPE_RAMFB_DEVICE);
> #ifdef CONFIG_TPM
> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
> index ffd119ebacb7..6636e5e1089c 100644
> --- a/hw/virtio/virtio-mem.c
> +++ b/hw/virtio/virtio-mem.c
> @@ -51,7 +51,8 @@ static uint32_t virtio_mem_default_thp_size(void)
> {
> uint32_t default_thp_size = VIRTIO_MEM_MIN_BLOCK_SIZE;
>
> -#if defined(__x86_64__) || defined(__arm__) || defined(__powerpc64__)
> +#if defined(__x86_64__) || defined(__arm__) || defined(__powerpc64__) \
> + || defined(__riscv__)
> default_thp_size = 2 * MiB;
> #elif defined(__aarch64__)
> if (qemu_real_host_page_size() == 4 * KiB) {
> @@ -161,7 +162,7 @@ static bool virtio_mem_has_shared_zeropage(RAMBlock *rb)
> * necessary (as the section size can change). But it's more likely that the
> * section size will rather get smaller and not bigger over time.
> */
> -#if defined(TARGET_X86_64) || defined(TARGET_I386)
> +#if defined(TARGET_X86_64) || defined(TARGET_I386) || defined(TARGET_RISCV)
> #define VIRTIO_MEM_USABLE_EXTENT (2 * (128 * MiB))
> #elif defined(TARGET_ARM)
> #define VIRTIO_MEM_USABLE_EXTENT (2 * (512 * MiB))
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 3/3] hw/riscv/virt: Add ACPI GED and PC-DIMM MHP support
2024-05-21 10:56 ` [PATCH v2 3/3] hw/riscv/virt: Add ACPI GED and PC-DIMM MHP support Björn Töpel
@ 2024-05-24 9:10 ` Daniel Henrique Barboza
2024-05-27 12:08 ` Björn Töpel
0 siblings, 1 reply; 14+ messages in thread
From: Daniel Henrique Barboza @ 2024-05-24 9:10 UTC (permalink / raw)
To: Björn Töpel, Palmer Dabbelt, Alistair Francis, Bin Meng,
Weiwei Li, Liu Zhiwei, qemu-riscv, qemu-devel, David Hildenbrand,
Atish Patra, Atish Patra
Cc: Björn Töpel, Sunil V L, Santosh Mamila,
Chethan Seshadri, Sivakumar Munnangi
On 5/21/24 07:56, Björn Töpel wrote:
> From: Björn Töpel <bjorn@rivosinc.com>
>
> Add ACPI GED for the RISC-V "virt" machine, and wire up PC-DIMM memory
> hotplugging support. Heavily based/copied from hw/arm/virt.c.
>
> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
> ---
> hw/riscv/Kconfig | 3 ++
> hw/riscv/virt-acpi-build.c | 16 ++++++
> hw/riscv/virt.c | 104 ++++++++++++++++++++++++++++++++++++-
> include/hw/riscv/virt.h | 6 ++-
> 4 files changed, 126 insertions(+), 3 deletions(-)
>
> diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
> index 08f82dbb681a..bebe412f2107 100644
> --- a/hw/riscv/Kconfig
> +++ b/hw/riscv/Kconfig
> @@ -56,6 +56,9 @@ config RISCV_VIRT
> select PLATFORM_BUS
> select ACPI
> select ACPI_PCI
> + select MEM_DEVICE
> + select DIMM
> + select ACPI_HW_REDUCED
> select VIRTIO_MEM_SUPPORTED
> select VIRTIO_PMEM_SUPPORTED
>
> diff --git a/hw/riscv/virt-acpi-build.c b/hw/riscv/virt-acpi-build.c
> index 6dc3baa9ec86..61813abdef3f 100644
> --- a/hw/riscv/virt-acpi-build.c
> +++ b/hw/riscv/virt-acpi-build.c
> @@ -27,6 +27,8 @@
> #include "hw/acpi/acpi-defs.h"
> #include "hw/acpi/acpi.h"
> #include "hw/acpi/aml-build.h"
> +#include "hw/acpi/memory_hotplug.h"
> +#include "hw/acpi/generic_event_device.h"
> #include "hw/acpi/pci.h"
> #include "hw/acpi/utils.h"
> #include "hw/intc/riscv_aclint.h"
> @@ -432,6 +434,20 @@ static void build_dsdt(GArray *table_data,
> acpi_dsdt_add_gpex_host(scope, PCIE_IRQ + VIRT_IRQCHIP_NUM_SOURCES * 2);
> }
>
> + if (s->acpi_dev) {
> + uint32_t event = object_property_get_uint(OBJECT(s->acpi_dev),
> + "ged-event", &error_abort);
> +
> + build_ged_aml(scope, "\\_SB."GED_DEVICE, HOTPLUG_HANDLER(s->acpi_dev),
> + GED_IRQ, AML_SYSTEM_MEMORY, memmap[VIRT_ACPI_GED].base);
> +
> + if (event & ACPI_GED_MEM_HOTPLUG_EVT) {
> + build_memory_hotplug_aml(scope, ms->ram_slots, "\\_SB", NULL,
> + AML_SYSTEM_MEMORY,
> + memmap[VIRT_PCDIMM_ACPI].base);
> + }
> + }
> +
> aml_append(dsdt, scope);
>
> /* copy AML table into ACPI tables blob and patch header there */
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 443902f919d2..2e35890187f2 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -53,10 +53,13 @@
> #include "hw/pci-host/gpex.h"
> #include "hw/display/ramfb.h"
> #include "hw/acpi/aml-build.h"
> +#include "hw/acpi/generic_event_device.h"
> +#include "hw/acpi/memory_hotplug.h"
> #include "hw/mem/memory-device.h"
> #include "hw/virtio/virtio-mem-pci.h"
> #include "qapi/qapi-visit-common.h"
> #include "hw/virtio/virtio-iommu.h"
> +#include "hw/mem/pc-dimm.h"
>
> /* KVM AIA only supports APLIC MSI. APLIC Wired is always emulated by QEMU. */
> static bool virt_use_kvm_aia(RISCVVirtState *s)
> @@ -84,6 +87,8 @@ static const MemMapEntry virt_memmap[] = {
> [VIRT_UART0] = { 0x10000000, 0x100 },
> [VIRT_VIRTIO] = { 0x10001000, 0x1000 },
> [VIRT_FW_CFG] = { 0x10100000, 0x18 },
> + [VIRT_PCDIMM_ACPI] = { 0x10200000, MEMORY_HOTPLUG_IO_LEN },
> + [VIRT_ACPI_GED] = { 0x10210000, ACPI_GED_EVT_SEL_LEN },
> [VIRT_FLASH] = { 0x20000000, 0x4000000 },
> [VIRT_IMSIC_M] = { 0x24000000, VIRT_IMSIC_MAX_SIZE },
> [VIRT_IMSIC_S] = { 0x28000000, VIRT_IMSIC_MAX_SIZE },
> @@ -1400,6 +1405,28 @@ static void virt_machine_done(Notifier *notifier, void *data)
> }
> }
>
> +static DeviceState *create_acpi_ged(RISCVVirtState *s)
> +{
> + DeviceState *dev;
> + MachineState *ms = MACHINE(s);
> + uint32_t event = 0;
> +
> + if (ms->ram_slots) {
> + event |= ACPI_GED_MEM_HOTPLUG_EVT;
> + }
> +
> + dev = qdev_new(TYPE_ACPI_GED);
> + qdev_prop_set_uint32(dev, "ged-event", event);
> + sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
> +
> + sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, s->memmap[VIRT_ACPI_GED].base);
> + sysbus_mmio_map(SYS_BUS_DEVICE(dev), 1, s->memmap[VIRT_PCDIMM_ACPI].base);
> + sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, qdev_get_gpio_in(s->irqchip[0],
> + GED_IRQ));
> +
> + return dev;
> +}
> +
> static void virt_machine_init(MachineState *machine)
> {
> const MemMapEntry *memmap = virt_memmap;
> @@ -1612,6 +1639,10 @@ static void virt_machine_init(MachineState *machine)
>
> gpex_pcie_init(system_memory, pcie_irqchip, s);
>
> + if (virt_is_acpi_enabled(s)) {
> + s->acpi_dev = create_acpi_ged(s);
> + }
> +
> create_platform_bus(s, mmio_irqchip);
>
> serial_mm_init(system_memory, memmap[VIRT_UART0].base,
> @@ -1752,6 +1783,7 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
> MachineClass *mc = MACHINE_GET_CLASS(machine);
>
> if (device_is_dynamic_sysbus(mc, dev) ||
> + object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
> object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI) ||
> object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> return HOTPLUG_HANDLER(machine);
> @@ -1759,14 +1791,42 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
> return NULL;
> }
>
> +static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> + Error **errp)
> +{
> + RISCVVirtState *s = RISCV_VIRT_MACHINE(hotplug_dev);
> +
> + if (!s->acpi_dev) {
> + error_setg(errp,
> + "memory hotplug is not enabled: missing acpi-ged device");
> + return;
> + }
> +
> + pc_dimm_pre_plug(PC_DIMM(dev), MACHINE(hotplug_dev), NULL, errp);
> +}
> +
Note that we're not doing any aligment checks in this pre_plug(), meaning that
we're kind of accepting whatever the pc-dimm device throw at us.
Testing in an AMD x86 machine will force the pc-dimm to be 2Mb aligned:
$ ./build/qemu-system-riscv64 -M virt -m 2G,slots=4,maxmem=8G -nographic
(...)
(qemu) object_add memory-backend-ram,id=ram0,size=111M
(qemu) device_add pc-dimm,id=dimm0,memdev=ram0
Error: backend memory size must be multiple of 0x200000
(qemu) object_del ram0
This happens because the DIMM must be aligned with its own backend, in this case
the host memory itself (backends/hostmem.c).
There's no guarantee that we'll always run in a host that is mem aligned with the board,
so it would be nice to add align checks in virt_memory_pre_plug().
> static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
> DeviceState *dev, Error **errp)
> {
> + if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> + virt_memory_pre_plug(hotplug_dev, dev, errp);
> + }
> +
> if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> virtio_md_pci_pre_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
> }
> }
>
> +static void virt_memory_plug(HotplugHandler *hotplug_dev,
> + DeviceState *dev, Error **errp)
> +{
> + RISCVVirtState *s = RISCV_VIRT_MACHINE(hotplug_dev);
> +
> + pc_dimm_plug(PC_DIMM(dev), MACHINE(s));
> +
> + hotplug_handler_plug(HOTPLUG_HANDLER(s->acpi_dev), dev, &error_abort);
> +}
> +
> static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
> DeviceState *dev, Error **errp)
> {
> @@ -1785,16 +1845,36 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
> create_fdt_virtio_iommu(s, pci_get_bdf(PCI_DEVICE(dev)));
> }
>
> + if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> + virt_memory_plug(hotplug_dev, dev, errp);
> + }
> +
> if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> virtio_md_pci_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
> }
> }
>
> +static void virt_dimm_unplug_request(HotplugHandler *hotplug_dev,
> + DeviceState *dev, Error **errp)
> +{
> + RISCVVirtState *s = RISCV_VIRT_MACHINE(hotplug_dev);
> +
> + if (!s->acpi_dev) {
> + error_setg(errp,
> + "memory hotplug is not enabled: missing acpi-ged device");
> + return;
> + }
> +
> + hotplug_handler_unplug_request(HOTPLUG_HANDLER(s->acpi_dev), dev, errp);
> +}
> +
I'm unsure if we're ready to support both hotplug and hot-unplug, but I tested anyway.
Hotplug seems to work but hot-unplug doesn't:
$ ./build/qemu-system-riscv64 -M virt -m 2G,slots=4,maxmem=8G -nographic
(...)
(qemu) object_add memory-backend-ram,id=ram0,size=112M
(qemu) device_add pc-dimm,id=dimm0,memdev=ram0
(qemu)
(qemu) info memory-devices
Memory device [dimm]: "dimm0"
addr: 0x100000000
slot: 0
node: 0
size: 117440512
memdev: /objects/ram0
hotplugged: true
hotpluggable: true
(qemu)
(qemu) device_del dimm0
(qemu) object_del ram0
Error: object 'ram0' is in use, can not be deleted
(qemu) info memory-devices
Memory device [dimm]: "dimm0"
addr: 0x100000000
slot: 0
node: 0
size: 117440512
memdev: /objects/ram0
hotplugged: true
hotpluggable: true
(qemu)
In short: hotplugged a 112Mb DIMM, then tried to remove it. 'device_del' doesn't error
out but doesn't let me remove the memory backing created, i.e. the dimm0 device is
still around.
In a quick digging I see that we're hitting virt_dimm_unplug_request() all the way
down to acpi_memory_unplug_request_cb(), where an ACPI_MEMORY_HOTPLUG_STATUS is being
sent. We never reach virt_dimm_unplug() afterwards, so the PC_DIMM is never removed.
I'm not acquainted with ACPI enough to say if we're missing stuff in QEMU, or if we
need SW to be aware of this ACPI HP event to trigger the release of the dimm, or
anything in between.
I consider this more as a FYI. If we're up to the point of hotplugging pc-dimms it's
already an improvement worth having. Hot-unplugging can come later.
Thanks,
Daniel
> static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
> DeviceState *dev,
> Error **errp)
> {
> - if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> + if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> + virt_dimm_unplug_request(hotplug_dev, dev, errp);
> + } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> virtio_md_pci_unplug_request(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev),
> errp);
> } else {
> @@ -1804,10 +1884,30 @@ static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
> }
> }
>
> +static void virt_dimm_unplug(HotplugHandler *hotplug_dev,
> + DeviceState *dev, Error **errp)
> +{
> + RISCVVirtState *s = RISCV_VIRT_MACHINE(hotplug_dev);
> + Error *local_err = NULL;
> +
> + hotplug_handler_unplug(HOTPLUG_HANDLER(s->acpi_dev), dev, &local_err);
> + if (local_err) {
> + goto out;
> + }
> +
> + pc_dimm_unplug(PC_DIMM(dev), MACHINE(s));
> + qdev_unrealize(dev);
> +
> +out:
> + error_propagate(errp, local_err);
> +}
> +
> static void virt_machine_device_unplug_cb(HotplugHandler *hotplug_dev,
> DeviceState *dev, Error **errp)
> {
> - if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> + if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> + virt_dimm_unplug(hotplug_dev, dev, errp);
> + } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> virtio_md_pci_unplug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
> } else {
> error_setg(errp, "virt: device unplug for unsupported device"
> diff --git a/include/hw/riscv/virt.h b/include/hw/riscv/virt.h
> index 3db839160f95..adf460a4021e 100644
> --- a/include/hw/riscv/virt.h
> +++ b/include/hw/riscv/virt.h
> @@ -62,6 +62,7 @@ struct RISCVVirtState {
> OnOffAuto acpi;
> const MemMapEntry *memmap;
> struct GPEXHost *gpex_host;
> + DeviceState *acpi_dev;
> };
>
> enum {
> @@ -84,12 +85,15 @@ enum {
> VIRT_PCIE_MMIO,
> VIRT_PCIE_PIO,
> VIRT_PLATFORM_BUS,
> - VIRT_PCIE_ECAM
> + VIRT_PCIE_ECAM,
> + VIRT_PCDIMM_ACPI,
> + VIRT_ACPI_GED,
> };
>
> enum {
> UART0_IRQ = 10,
> RTC_IRQ = 11,
> + GED_IRQ = 12,
> VIRTIO_IRQ = 1, /* 1 to 8 */
> VIRTIO_COUNT = 8,
> PCIE_IRQ = 0x20, /* 32 to 35 */
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support
2024-05-21 10:56 ` [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support Björn Töpel
2024-05-24 8:53 ` Daniel Henrique Barboza
@ 2024-05-24 13:14 ` Daniel Henrique Barboza
2024-05-24 15:02 ` David Hildenbrand
1 sibling, 1 reply; 14+ messages in thread
From: Daniel Henrique Barboza @ 2024-05-24 13:14 UTC (permalink / raw)
To: Björn Töpel, Palmer Dabbelt, Alistair Francis, Bin Meng,
Weiwei Li, Liu Zhiwei, qemu-riscv, qemu-devel, David Hildenbrand,
Atish Patra, Atish Patra
Cc: Björn Töpel, Sunil V L, Santosh Mamila,
Chethan Seshadri, Sivakumar Munnangi
On 5/21/24 07:56, Björn Töpel wrote:
> From: Björn Töpel <bjorn@rivosinc.com>
>
> Virtio-based memory devices (virtio-mem/virtio-pmem) allows for
> dynamic resizing of virtual machine memory, and requires proper
> hotplugging (add/remove) support to work.
>
> Add device memory support for RISC-V "virt" machine, and enable
> virtio-md-pci with the corresponding missing hotplugging callbacks.
>
> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
> ---
> hw/riscv/Kconfig | 2 +
> hw/riscv/virt.c | 83 +++++++++++++++++++++++++++++++++++++++++-
> hw/virtio/virtio-mem.c | 5 ++-
> 3 files changed, 87 insertions(+), 3 deletions(-)
>
> diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
> index a2030e3a6ff0..08f82dbb681a 100644
> --- a/hw/riscv/Kconfig
> +++ b/hw/riscv/Kconfig
> @@ -56,6 +56,8 @@ config RISCV_VIRT
> select PLATFORM_BUS
> select ACPI
> select ACPI_PCI
> + select VIRTIO_MEM_SUPPORTED
> + select VIRTIO_PMEM_SUPPORTED
>
> config SHAKTI_C
> bool
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 4fdb66052587..443902f919d2 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -53,6 +53,8 @@
> #include "hw/pci-host/gpex.h"
> #include "hw/display/ramfb.h"
> #include "hw/acpi/aml-build.h"
> +#include "hw/mem/memory-device.h"
> +#include "hw/virtio/virtio-mem-pci.h"
> #include "qapi/qapi-visit-common.h"
> #include "hw/virtio/virtio-iommu.h"
>
> @@ -1407,6 +1409,7 @@ static void virt_machine_init(MachineState *machine)
> DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip;
> int i, base_hartid, hart_count;
> int socket_count = riscv_socket_count(machine);
> + hwaddr device_memory_base, device_memory_size;
>
> /* Check socket count limit */
> if (VIRT_SOCKETS_MAX < socket_count) {
> @@ -1420,6 +1423,12 @@ static void virt_machine_init(MachineState *machine)
> exit(1);
> }
>
> + if (machine->ram_slots > ACPI_MAX_RAM_SLOTS) {
> + error_report("unsupported amount of memory slots: %"PRIu64,
> + machine->ram_slots);
> + exit(EXIT_FAILURE);
> + }
> +
> /* Initialize sockets */
> mmio_irqchip = virtio_irqchip = pcie_irqchip = NULL;
> for (i = 0; i < socket_count; i++) {
> @@ -1553,6 +1562,37 @@ static void virt_machine_init(MachineState *machine)
> memory_region_add_subregion(system_memory, memmap[VIRT_MROM].base,
> mask_rom);
>
> + /* device memory */
> + device_memory_base = ROUND_UP(s->memmap[VIRT_DRAM].base + machine->ram_size,
> + GiB);
> + device_memory_size = machine->maxram_size - machine->ram_size;
> + if (device_memory_size > 0) {
> + /*
> + * Each DIMM is aligned based on the backend's alignment value.
> + * Assume max 1G hugepage alignment per slot.
> + */
> + device_memory_size += machine->ram_slots * GiB;
We don't need to align to 1GiB. This calc can use 2MiB instead (or 4MiB if we're
running 32 bits).
> +
> + if (riscv_is_32bit(&s->soc[0])) {
> + hwaddr memtop = device_memory_base + ROUND_UP(device_memory_size,
> + GiB);
Same here - alignment is 2/4 MiB.
> +
> + if (memtop > UINT32_MAX) {
> + error_report("memory exceeds 32-bit limit by %lu bytes",
> + memtop - UINT32_MAX);
> + exit(EXIT_FAILURE);
> + }
> + }
> +
> + if (device_memory_base + device_memory_size < device_memory_size) {
> + error_report("unsupported amount of device memory");
> + exit(EXIT_FAILURE);
> + }
Took another look and found this a bit strange. These are all unsigned vars, so
if (unsigned a + unsigned b < unsigned b) will always be 'false'. The compiler is
probably cropping this out.
The calc we need to do is to ensure that the extra ram_slots * alignment will fit into
the VIRT_DRAM block, i.e. maxram_size + (ram_slots * alignment) < memmap[VIRT_DRAM].size.
TBH I'm starting to have second thoughts about letting users hotplug whatever they want.
It seems cleaner to just force the 2/4 Mb alignment in pre_plug() and be done with it,
no need to allocate ram_slots * alignment and doing all these extra checks.
As I sent in an earlier email, users must already comply to the alignment of the host
memory when plugging pc-dimms, so I'm not sure our value/proposition with all this
extra code is worth it - the alignment will most likely be forced by the host memory
backend, so might as well force ourselves in pre_plug().
Thanks,
Daniel
> +
> + machine_memory_devices_init(machine, device_memory_base,
> + device_memory_size);
> + }
> +
> /*
> * Init fw_cfg. Must be done before riscv_load_fdt, otherwise the
> * device tree cannot be altered and we get FDT_ERR_NOSPACE.
> @@ -1712,12 +1752,21 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
> MachineClass *mc = MACHINE_GET_CLASS(machine);
>
> if (device_is_dynamic_sysbus(mc, dev) ||
> - object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
> + object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI) ||
> + object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> return HOTPLUG_HANDLER(machine);
> }
> return NULL;
> }
>
> +static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
> + DeviceState *dev, Error **errp)
> +{
> + if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> + virtio_md_pci_pre_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
> + }
> +}
> +
> static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
> DeviceState *dev, Error **errp)
> {
> @@ -1735,6 +1784,35 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
> if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
> create_fdt_virtio_iommu(s, pci_get_bdf(PCI_DEVICE(dev)));
> }
> +
> + if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> + virtio_md_pci_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
> + }
> +}
> +
> +static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
> + DeviceState *dev,
> + Error **errp)
> +{
> + if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> + virtio_md_pci_unplug_request(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev),
> + errp);
> + } else {
> + error_setg(errp,
> + "device unplug request for unsupported device type: %s",
> + object_get_typename(OBJECT(dev)));
> + }
> +}
> +
> +static void virt_machine_device_unplug_cb(HotplugHandler *hotplug_dev,
> + DeviceState *dev, Error **errp)
> +{
> + if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
> + virtio_md_pci_unplug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
> + } else {
> + error_setg(errp, "virt: device unplug for unsupported device"
> + " type: %s", object_get_typename(OBJECT(dev)));
> + }
> }
>
> static void virt_machine_class_init(ObjectClass *oc, void *data)
> @@ -1757,7 +1835,10 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
> assert(!mc->get_hotplug_handler);
> mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
>
> + hc->pre_plug = virt_machine_device_pre_plug_cb;
> hc->plug = virt_machine_device_plug_cb;
> + hc->unplug_request = virt_machine_device_unplug_request_cb;
> + hc->unplug = virt_machine_device_unplug_cb;
>
> machine_class_allow_dynamic_sysbus_dev(mc, TYPE_RAMFB_DEVICE);
> #ifdef CONFIG_TPM
> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
> index ffd119ebacb7..6636e5e1089c 100644
> --- a/hw/virtio/virtio-mem.c
> +++ b/hw/virtio/virtio-mem.c
> @@ -51,7 +51,8 @@ static uint32_t virtio_mem_default_thp_size(void)
> {
> uint32_t default_thp_size = VIRTIO_MEM_MIN_BLOCK_SIZE;
>
> -#if defined(__x86_64__) || defined(__arm__) || defined(__powerpc64__)
> +#if defined(__x86_64__) || defined(__arm__) || defined(__powerpc64__) \
> + || defined(__riscv__)
> default_thp_size = 2 * MiB;
> #elif defined(__aarch64__)
> if (qemu_real_host_page_size() == 4 * KiB) {
> @@ -161,7 +162,7 @@ static bool virtio_mem_has_shared_zeropage(RAMBlock *rb)
> * necessary (as the section size can change). But it's more likely that the
> * section size will rather get smaller and not bigger over time.
> */
> -#if defined(TARGET_X86_64) || defined(TARGET_I386)
> +#if defined(TARGET_X86_64) || defined(TARGET_I386) || defined(TARGET_RISCV)
> #define VIRTIO_MEM_USABLE_EXTENT (2 * (128 * MiB))
> #elif defined(TARGET_ARM)
> #define VIRTIO_MEM_USABLE_EXTENT (2 * (512 * MiB))
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support
2024-05-24 13:14 ` Daniel Henrique Barboza
@ 2024-05-24 15:02 ` David Hildenbrand
2024-05-24 15:29 ` Daniel Henrique Barboza
2024-05-27 12:03 ` Björn Töpel
0 siblings, 2 replies; 14+ messages in thread
From: David Hildenbrand @ 2024-05-24 15:02 UTC (permalink / raw)
To: Daniel Henrique Barboza, Björn Töpel, Palmer Dabbelt,
Alistair Francis, Bin Meng, Weiwei Li, Liu Zhiwei, qemu-riscv,
qemu-devel, Atish Patra, Atish Patra
Cc: Björn Töpel, Sunil V L, Santosh Mamila,
Chethan Seshadri, Sivakumar Munnangi
On 24.05.24 15:14, Daniel Henrique Barboza wrote:
>
>
> On 5/21/24 07:56, Björn Töpel wrote:
>> From: Björn Töpel <bjorn@rivosinc.com>
>>
>> Virtio-based memory devices (virtio-mem/virtio-pmem) allows for
>> dynamic resizing of virtual machine memory, and requires proper
>> hotplugging (add/remove) support to work.
>>
>> Add device memory support for RISC-V "virt" machine, and enable
>> virtio-md-pci with the corresponding missing hotplugging callbacks.
>>
>> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
>> ---
>> hw/riscv/Kconfig | 2 +
>> hw/riscv/virt.c | 83 +++++++++++++++++++++++++++++++++++++++++-
>> hw/virtio/virtio-mem.c | 5 ++-
>> 3 files changed, 87 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
>> index a2030e3a6ff0..08f82dbb681a 100644
>> --- a/hw/riscv/Kconfig
>> +++ b/hw/riscv/Kconfig
>> @@ -56,6 +56,8 @@ config RISCV_VIRT
>> select PLATFORM_BUS
>> select ACPI
>> select ACPI_PCI
>> + select VIRTIO_MEM_SUPPORTED
>> + select VIRTIO_PMEM_SUPPORTED
>>
>> config SHAKTI_C
>> bool
>> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
>> index 4fdb66052587..443902f919d2 100644
>> --- a/hw/riscv/virt.c
>> +++ b/hw/riscv/virt.c
>> @@ -53,6 +53,8 @@
>> #include "hw/pci-host/gpex.h"
>> #include "hw/display/ramfb.h"
>> #include "hw/acpi/aml-build.h"
>> +#include "hw/mem/memory-device.h"
>> +#include "hw/virtio/virtio-mem-pci.h"
>> #include "qapi/qapi-visit-common.h"
>> #include "hw/virtio/virtio-iommu.h"
>>
>> @@ -1407,6 +1409,7 @@ static void virt_machine_init(MachineState *machine)
>> DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip;
>> int i, base_hartid, hart_count;
>> int socket_count = riscv_socket_count(machine);
>> + hwaddr device_memory_base, device_memory_size;
>>
>> /* Check socket count limit */
>> if (VIRT_SOCKETS_MAX < socket_count) {
>> @@ -1420,6 +1423,12 @@ static void virt_machine_init(MachineState *machine)
>> exit(1);
>> }
>>
>> + if (machine->ram_slots > ACPI_MAX_RAM_SLOTS) {
>> + error_report("unsupported amount of memory slots: %"PRIu64,
>> + machine->ram_slots);
>> + exit(EXIT_FAILURE);
>> + }
>> +
>> /* Initialize sockets */
>> mmio_irqchip = virtio_irqchip = pcie_irqchip = NULL;
>> for (i = 0; i < socket_count; i++) {
>> @@ -1553,6 +1562,37 @@ static void virt_machine_init(MachineState *machine)
>> memory_region_add_subregion(system_memory, memmap[VIRT_MROM].base,
>> mask_rom);
>>
>> + /* device memory */
>> + device_memory_base = ROUND_UP(s->memmap[VIRT_DRAM].base + machine->ram_size,
>> + GiB);
>> + device_memory_size = machine->maxram_size - machine->ram_size;
>> + if (device_memory_size > 0) {
>> + /*
>> + * Each DIMM is aligned based on the backend's alignment value.
>> + * Assume max 1G hugepage alignment per slot.
>> + */
>> + device_memory_size += machine->ram_slots * GiB;
>
> We don't need to align to 1GiB. This calc can use 2MiB instead (or 4MiB if we're
> running 32 bits).
>
>> +
>> + if (riscv_is_32bit(&s->soc[0])) {
>> + hwaddr memtop = device_memory_base + ROUND_UP(device_memory_size,
>> + GiB);
>
> Same here - alignment is 2/4 MiB.
>
>> +
>> + if (memtop > UINT32_MAX) {
>> + error_report("memory exceeds 32-bit limit by %lu bytes",
>> + memtop - UINT32_MAX);
>> + exit(EXIT_FAILURE);
>> + }
>> + }
>> +
>> + if (device_memory_base + device_memory_size < device_memory_size) {
>> + error_report("unsupported amount of device memory");
>> + exit(EXIT_FAILURE);
>> + }
>
> Took another look and found this a bit strange. These are all unsigned vars, so
> if (unsigned a + unsigned b < unsigned b) will always be 'false'. The compiler is
> probably cropping this out.
No. Unsigned interger overflow is defined behavior and this is a common
check to detect such overflow. tI's consistent with what we do for other
architectures.
>
> The calc we need to do is to ensure that the extra ram_slots * alignment will fit into
> the VIRT_DRAM block, i.e. maxram_size + (ram_slots * alignment) < memmap[VIRT_DRAM].size.
>
>
> TBH I'm starting to have second thoughts about letting users hotplug whatever they want.
> It seems cleaner to just force the 2/4 Mb alignment in pre_plug() and be done with it,
> no need to allocate ram_slots * alignment and doing all these extra checks.
It's worth noting that if user space decides to specify addresses
manually, it can mess up everything already. There are other events that
can result in fragmentation of the memory device area (repeated
hot(un)plug of differing DIMMs).
Assume you have 1 GiB range and hotplug a 512 MiB DIMM at offset 256
MiB. You won't be able to hotplug another 512 MiB DIMM even though we
reserved space.
My take so far is: if the user wants to do such stuff it should size the
area (maxmem) much larger or deal with the concequences (not being able
to hotplug memory).
It usually does not happen in practice ...
>
> As I sent in an earlier email, users must already comply to the alignment of the host
> memory when plugging pc-dimms, so I'm not sure our value/proposition with all this
> extra code is worth it - the alignment will most likely be forced by the host memory
> backend, so might as well force ourselves in pre_plug().
At least on x86-64, the 2 MiB alignment requirement is handled
automatically. QEMU_VMALLOC_ALIGN implicitly enforces that.
Maybe RISCV also wants to set QEMU_VMALLOC_ALIGN?
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support
2024-05-24 15:02 ` David Hildenbrand
@ 2024-05-24 15:29 ` Daniel Henrique Barboza
2024-05-27 7:19 ` David Hildenbrand
2024-05-27 12:03 ` Björn Töpel
1 sibling, 1 reply; 14+ messages in thread
From: Daniel Henrique Barboza @ 2024-05-24 15:29 UTC (permalink / raw)
To: David Hildenbrand, Björn Töpel, Palmer Dabbelt,
Alistair Francis, Bin Meng, Weiwei Li, Liu Zhiwei, qemu-riscv,
qemu-devel, Atish Patra, Atish Patra
Cc: Björn Töpel, Sunil V L, Santosh Mamila,
Chethan Seshadri, Sivakumar Munnangi
On 5/24/24 12:02, David Hildenbrand wrote:
> On 24.05.24 15:14, Daniel Henrique Barboza wrote:
>>
>>
>> On 5/21/24 07:56, Björn Töpel wrote:
>>> From: Björn Töpel <bjorn@rivosinc.com>
>>>
>>> Virtio-based memory devices (virtio-mem/virtio-pmem) allows for
>>> dynamic resizing of virtual machine memory, and requires proper
>>> hotplugging (add/remove) support to work.
>>>
>>> Add device memory support for RISC-V "virt" machine, and enable
>>> virtio-md-pci with the corresponding missing hotplugging callbacks.
>>>
>>> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
>>> ---
>>> hw/riscv/Kconfig | 2 +
>>> hw/riscv/virt.c | 83 +++++++++++++++++++++++++++++++++++++++++-
>>> hw/virtio/virtio-mem.c | 5 ++-
>>> 3 files changed, 87 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
>>> index a2030e3a6ff0..08f82dbb681a 100644
>>> --- a/hw/riscv/Kconfig
>>> +++ b/hw/riscv/Kconfig
>>> @@ -56,6 +56,8 @@ config RISCV_VIRT
>>> select PLATFORM_BUS
>>> select ACPI
>>> select ACPI_PCI
>>> + select VIRTIO_MEM_SUPPORTED
>>> + select VIRTIO_PMEM_SUPPORTED
>>> config SHAKTI_C
>>> bool
>>> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
>>> index 4fdb66052587..443902f919d2 100644
>>> --- a/hw/riscv/virt.c
>>> +++ b/hw/riscv/virt.c
>>> @@ -53,6 +53,8 @@
>>> #include "hw/pci-host/gpex.h"
>>> #include "hw/display/ramfb.h"
>>> #include "hw/acpi/aml-build.h"
>>> +#include "hw/mem/memory-device.h"
>>> +#include "hw/virtio/virtio-mem-pci.h"
>>> #include "qapi/qapi-visit-common.h"
>>> #include "hw/virtio/virtio-iommu.h"
>>> @@ -1407,6 +1409,7 @@ static void virt_machine_init(MachineState *machine)
>>> DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip;
>>> int i, base_hartid, hart_count;
>>> int socket_count = riscv_socket_count(machine);
>>> + hwaddr device_memory_base, device_memory_size;
>>> /* Check socket count limit */
>>> if (VIRT_SOCKETS_MAX < socket_count) {
>>> @@ -1420,6 +1423,12 @@ static void virt_machine_init(MachineState *machine)
>>> exit(1);
>>> }
>>> + if (machine->ram_slots > ACPI_MAX_RAM_SLOTS) {
>>> + error_report("unsupported amount of memory slots: %"PRIu64,
>>> + machine->ram_slots);
>>> + exit(EXIT_FAILURE);
>>> + }
>>> +
>>> /* Initialize sockets */
>>> mmio_irqchip = virtio_irqchip = pcie_irqchip = NULL;
>>> for (i = 0; i < socket_count; i++) {
>>> @@ -1553,6 +1562,37 @@ static void virt_machine_init(MachineState *machine)
>>> memory_region_add_subregion(system_memory, memmap[VIRT_MROM].base,
>>> mask_rom);
>>> + /* device memory */
>>> + device_memory_base = ROUND_UP(s->memmap[VIRT_DRAM].base + machine->ram_size,
>>> + GiB);
>>> + device_memory_size = machine->maxram_size - machine->ram_size;
>>> + if (device_memory_size > 0) {
>>> + /*
>>> + * Each DIMM is aligned based on the backend's alignment value.
>>> + * Assume max 1G hugepage alignment per slot.
>>> + */
>>> + device_memory_size += machine->ram_slots * GiB;
>>
>> We don't need to align to 1GiB. This calc can use 2MiB instead (or 4MiB if we're
>> running 32 bits).
>>
>>> +
>>> + if (riscv_is_32bit(&s->soc[0])) {
>>> + hwaddr memtop = device_memory_base + ROUND_UP(device_memory_size,
>>> + GiB);
>>
>> Same here - alignment is 2/4 MiB.
>>
>>> +
>>> + if (memtop > UINT32_MAX) {
>>> + error_report("memory exceeds 32-bit limit by %lu bytes",
>>> + memtop - UINT32_MAX);
>>> + exit(EXIT_FAILURE);
>>> + }
>>> + }
>>> +
>>> + if (device_memory_base + device_memory_size < device_memory_size) {
>>> + error_report("unsupported amount of device memory");
>>> + exit(EXIT_FAILURE);
>>> + }
>>
>> Took another look and found this a bit strange. These are all unsigned vars, so
>> if (unsigned a + unsigned b < unsigned b) will always be 'false'. The compiler is
>> probably cropping this out.
>
> No. Unsigned interger overflow is defined behavior and this is a common check to detect such overflow. tI's consistent with what we do for other architectures.
>
Oh, ok. We're so far away from UINT64_MAX that it didn't occur to me doing an overflow
check here. Fair enough.
>>
>> The calc we need to do is to ensure that the extra ram_slots * alignment will fit into
>> the VIRT_DRAM block, i.e. maxram_size + (ram_slots * alignment) < memmap[VIRT_DRAM].size.
>>
>>
>> TBH I'm starting to have second thoughts about letting users hotplug whatever they want.
>> It seems cleaner to just force the 2/4 Mb alignment in pre_plug() and be done with it,
>> no need to allocate ram_slots * alignment and doing all these extra checks.
>
> It's worth noting that if user space decides to specify addresses manually, it can mess up everything already. There are other events that can result in fragmentation of the memory device area (repeated hot(un)plug of differing DIMMs).
>
> Assume you have 1 GiB range and hotplug a 512 MiB DIMM at offset 256 MiB. You won't be able to hotplug another 512 MiB DIMM even though we reserved space.
>
> My take so far is: if the user wants to do such stuff it should size the area (maxmem) much larger or deal with the concequences (not being able to hotplug memory).
>
> It usually does not happen in practice ...
>
I wonder if we should forbid users from removing memory that is 'out of place', i.e. users
should always remove pc-dimms in LIFO order. Usually this kind of control is done by
management, e.g. libvirt, but if we're not sure we'll keep consistency during memory
unplugs ...
>>
>> As I sent in an earlier email, users must already comply to the alignment of the host
>> memory when plugging pc-dimms, so I'm not sure our value/proposition with all this
>> extra code is worth it - the alignment will most likely be forced by the host memory
>> backend, so might as well force ourselves in pre_plug().
>
> At least on x86-64, the 2 MiB alignment requirement is handled automatically. QEMU_VMALLOC_ALIGN implicitly enforces that.
>
> Maybe RISCV also wants to set QEMU_VMALLOC_ALIGN?
Might be a good idea to experiment.
Thanks,
Daniel
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support
2024-05-24 15:29 ` Daniel Henrique Barboza
@ 2024-05-27 7:19 ` David Hildenbrand
0 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2024-05-27 7:19 UTC (permalink / raw)
To: Daniel Henrique Barboza, Björn Töpel, Palmer Dabbelt,
Alistair Francis, Bin Meng, Weiwei Li, Liu Zhiwei, qemu-riscv,
qemu-devel, Atish Patra, Atish Patra
Cc: Björn Töpel, Sunil V L, Santosh Mamila,
Chethan Seshadri, Sivakumar Munnangi
>
> I wonder if we should forbid users from removing memory that is 'out of place',
> i.e. users
> should always remove pc-dimms in LIFO order. Usually this kind of control is
> done by
> management, e.g. libvirt, but if we're not sure we'll keep consistency during
> memory
> unplugs ...
I really don't think we should do any of that. ;)
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support
2024-05-24 8:53 ` Daniel Henrique Barboza
@ 2024-05-27 11:49 ` Björn Töpel
0 siblings, 0 replies; 14+ messages in thread
From: Björn Töpel @ 2024-05-27 11:49 UTC (permalink / raw)
To: Daniel Henrique Barboza, Palmer Dabbelt, Alistair Francis,
Bin Meng, Weiwei Li, Liu Zhiwei, qemu-riscv, qemu-devel,
David Hildenbrand, Atish Patra, Atish Patra
Cc: Björn Töpel, Sunil V L, Santosh Mamila,
Chethan Seshadri, Sivakumar Munnangi
Daniel Henrique Barboza <dbarboza@ventanamicro.com> writes:
> On 5/21/24 07:56, Björn Töpel wrote:
>> From: Björn Töpel <bjorn@rivosinc.com>
>>
>> Virtio-based memory devices (virtio-mem/virtio-pmem) allows for
>> dynamic resizing of virtual machine memory, and requires proper
>> hotplugging (add/remove) support to work.
>>
>> Add device memory support for RISC-V "virt" machine, and enable
>> virtio-md-pci with the corresponding missing hotplugging callbacks.
>>
>> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
>> ---
>> hw/riscv/Kconfig | 2 +
>> hw/riscv/virt.c | 83 +++++++++++++++++++++++++++++++++++++++++-
>> hw/virtio/virtio-mem.c | 5 ++-
>> 3 files changed, 87 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
>> index a2030e3a6ff0..08f82dbb681a 100644
>> --- a/hw/riscv/Kconfig
>> +++ b/hw/riscv/Kconfig
>> @@ -56,6 +56,8 @@ config RISCV_VIRT
>> select PLATFORM_BUS
>> select ACPI
>> select ACPI_PCI
>> + select VIRTIO_MEM_SUPPORTED
>> + select VIRTIO_PMEM_SUPPORTED
>>
>> config SHAKTI_C
>> bool
>> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
>> index 4fdb66052587..443902f919d2 100644
>> --- a/hw/riscv/virt.c
>> +++ b/hw/riscv/virt.c
>> @@ -53,6 +53,8 @@
>> #include "hw/pci-host/gpex.h"
>> #include "hw/display/ramfb.h"
>> #include "hw/acpi/aml-build.h"
>> +#include "hw/mem/memory-device.h"
>> +#include "hw/virtio/virtio-mem-pci.h"
>> #include "qapi/qapi-visit-common.h"
>> #include "hw/virtio/virtio-iommu.h"
>>
>> @@ -1407,6 +1409,7 @@ static void virt_machine_init(MachineState *machine)
>> DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip;
>> int i, base_hartid, hart_count;
>> int socket_count = riscv_socket_count(machine);
>> + hwaddr device_memory_base, device_memory_size;
>>
>> /* Check socket count limit */
>> if (VIRT_SOCKETS_MAX < socket_count) {
>> @@ -1420,6 +1423,12 @@ static void virt_machine_init(MachineState *machine)
>> exit(1);
>> }
>>
>> + if (machine->ram_slots > ACPI_MAX_RAM_SLOTS) {
>> + error_report("unsupported amount of memory slots: %"PRIu64,
>> + machine->ram_slots);
>
> Let's also add the maximum amount allowed in this message, e.g. this error:
>
> $ (...) -m 2G,slots=512,maxmem=8G
> qemu-system-riscv64: unsupported amount of memory slots: 512
>
> could be something like:
>
> qemu-system-riscv64: unsupported amount of memory slots (512), maximum amount: 256
>
>
> LGTM otherwise. Thanks,
Thanks for getting back!
Sure! I'll fix this in the next rev.
Björn
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support
2024-05-24 15:02 ` David Hildenbrand
2024-05-24 15:29 ` Daniel Henrique Barboza
@ 2024-05-27 12:03 ` Björn Töpel
2024-05-27 13:19 ` Daniel Henrique Barboza
1 sibling, 1 reply; 14+ messages in thread
From: Björn Töpel @ 2024-05-27 12:03 UTC (permalink / raw)
To: David Hildenbrand, Daniel Henrique Barboza, Palmer Dabbelt,
Alistair Francis, Bin Meng, Weiwei Li, Liu Zhiwei, qemu-riscv,
qemu-devel, Atish Patra, Atish Patra
Cc: Björn Töpel, Sunil V L, Santosh Mamila,
Chethan Seshadri, Sivakumar Munnangi
David Hildenbrand <david@redhat.com> writes:
> On 24.05.24 15:14, Daniel Henrique Barboza wrote:
>>
>>
>> On 5/21/24 07:56, Björn Töpel wrote:
>>> From: Björn Töpel <bjorn@rivosinc.com>
>>>
>>> Virtio-based memory devices (virtio-mem/virtio-pmem) allows for
>>> dynamic resizing of virtual machine memory, and requires proper
>>> hotplugging (add/remove) support to work.
>>>
>>> Add device memory support for RISC-V "virt" machine, and enable
>>> virtio-md-pci with the corresponding missing hotplugging callbacks.
>>>
>>> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
>>> ---
>>> hw/riscv/Kconfig | 2 +
>>> hw/riscv/virt.c | 83 +++++++++++++++++++++++++++++++++++++++++-
>>> hw/virtio/virtio-mem.c | 5 ++-
>>> 3 files changed, 87 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
>>> index a2030e3a6ff0..08f82dbb681a 100644
>>> --- a/hw/riscv/Kconfig
>>> +++ b/hw/riscv/Kconfig
>>> @@ -56,6 +56,8 @@ config RISCV_VIRT
>>> select PLATFORM_BUS
>>> select ACPI
>>> select ACPI_PCI
>>> + select VIRTIO_MEM_SUPPORTED
>>> + select VIRTIO_PMEM_SUPPORTED
>>>
>>> config SHAKTI_C
>>> bool
>>> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
>>> index 4fdb66052587..443902f919d2 100644
>>> --- a/hw/riscv/virt.c
>>> +++ b/hw/riscv/virt.c
>>> @@ -53,6 +53,8 @@
>>> #include "hw/pci-host/gpex.h"
>>> #include "hw/display/ramfb.h"
>>> #include "hw/acpi/aml-build.h"
>>> +#include "hw/mem/memory-device.h"
>>> +#include "hw/virtio/virtio-mem-pci.h"
>>> #include "qapi/qapi-visit-common.h"
>>> #include "hw/virtio/virtio-iommu.h"
>>>
>>> @@ -1407,6 +1409,7 @@ static void virt_machine_init(MachineState *machine)
>>> DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip;
>>> int i, base_hartid, hart_count;
>>> int socket_count = riscv_socket_count(machine);
>>> + hwaddr device_memory_base, device_memory_size;
>>>
>>> /* Check socket count limit */
>>> if (VIRT_SOCKETS_MAX < socket_count) {
>>> @@ -1420,6 +1423,12 @@ static void virt_machine_init(MachineState *machine)
>>> exit(1);
>>> }
>>>
>>> + if (machine->ram_slots > ACPI_MAX_RAM_SLOTS) {
>>> + error_report("unsupported amount of memory slots: %"PRIu64,
>>> + machine->ram_slots);
>>> + exit(EXIT_FAILURE);
>>> + }
>>> +
>>> /* Initialize sockets */
>>> mmio_irqchip = virtio_irqchip = pcie_irqchip = NULL;
>>> for (i = 0; i < socket_count; i++) {
>>> @@ -1553,6 +1562,37 @@ static void virt_machine_init(MachineState *machine)
>>> memory_region_add_subregion(system_memory, memmap[VIRT_MROM].base,
>>> mask_rom);
>>>
>>> + /* device memory */
>>> + device_memory_base = ROUND_UP(s->memmap[VIRT_DRAM].base + machine->ram_size,
>>> + GiB);
>>> + device_memory_size = machine->maxram_size - machine->ram_size;
>>> + if (device_memory_size > 0) {
>>> + /*
>>> + * Each DIMM is aligned based on the backend's alignment value.
>>> + * Assume max 1G hugepage alignment per slot.
>>> + */
>>> + device_memory_size += machine->ram_slots * GiB;
>>
>> We don't need to align to 1GiB. This calc can use 2MiB instead (or 4MiB if we're
>> running 32 bits).
>>
>>> +
>>> + if (riscv_is_32bit(&s->soc[0])) {
>>> + hwaddr memtop = device_memory_base + ROUND_UP(device_memory_size,
>>> + GiB);
>>
>> Same here - alignment is 2/4 MiB.
>>
>>> +
>>> + if (memtop > UINT32_MAX) {
>>> + error_report("memory exceeds 32-bit limit by %lu bytes",
>>> + memtop - UINT32_MAX);
>>> + exit(EXIT_FAILURE);
>>> + }
>>> + }
>>> +
>>> + if (device_memory_base + device_memory_size < device_memory_size) {
>>> + error_report("unsupported amount of device memory");
>>> + exit(EXIT_FAILURE);
>>> + }
>>
>> Took another look and found this a bit strange. These are all unsigned vars, so
>> if (unsigned a + unsigned b < unsigned b) will always be 'false'. The compiler is
>> probably cropping this out.
>
> No. Unsigned interger overflow is defined behavior and this is a common
> check to detect such overflow. tI's consistent with what we do for other
> architectures.
>
>>
>> The calc we need to do is to ensure that the extra ram_slots * alignment will fit into
>> the VIRT_DRAM block, i.e. maxram_size + (ram_slots * alignment) < memmap[VIRT_DRAM].size.
>>
>>
>> TBH I'm starting to have second thoughts about letting users hotplug whatever they want.
>> It seems cleaner to just force the 2/4 Mb alignment in pre_plug() and be done with it,
>> no need to allocate ram_slots * alignment and doing all these extra checks.
>
> It's worth noting that if user space decides to specify addresses
> manually, it can mess up everything already. There are other events that
> can result in fragmentation of the memory device area (repeated
> hot(un)plug of differing DIMMs).
>
> Assume you have 1 GiB range and hotplug a 512 MiB DIMM at offset 256
> MiB. You won't be able to hotplug another 512 MiB DIMM even though we
> reserved space.
>
> My take so far is: if the user wants to do such stuff it should size the
> area (maxmem) much larger or deal with the concequences (not being able
> to hotplug memory).
>
> It usually does not happen in practice ...
Daniel/David, again thanks for spending time on the patches.
The reason I picked 1 GiB per slot was because the alignment is also
dependent on the backend AFAIU. Rationale in commit 085f8e88ba73 ("pc:
count in 1Gb hugepage alignment when sizing hotplug-memory container").
What I'm reading from you guys are: Just depend on whatever maxmem says
(modulo 2/4M alignment), and leave at that. I agree that that's much
easier to reason about.
Correct?
>> As I sent in an earlier email, users must already comply to the alignment of the host
>> memory when plugging pc-dimms, so I'm not sure our value/proposition with all this
>> extra code is worth it - the alignment will most likely be forced by the host memory
>> backend, so might as well force ourselves in pre_plug().
>
> At least on x86-64, the 2 MiB alignment requirement is handled
> automatically. QEMU_VMALLOC_ALIGN implicitly enforces that.
>
> Maybe RISCV also wants to set QEMU_VMALLOC_ALIGN?
I'll look into this for the next rev!
Björn
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 3/3] hw/riscv/virt: Add ACPI GED and PC-DIMM MHP support
2024-05-24 9:10 ` Daniel Henrique Barboza
@ 2024-05-27 12:08 ` Björn Töpel
0 siblings, 0 replies; 14+ messages in thread
From: Björn Töpel @ 2024-05-27 12:08 UTC (permalink / raw)
To: Daniel Henrique Barboza, Palmer Dabbelt, Alistair Francis,
Bin Meng, Weiwei Li, Liu Zhiwei, qemu-riscv, qemu-devel,
David Hildenbrand, Atish Patra, Atish Patra
Cc: Björn Töpel, Sunil V L, Santosh Mamila,
Chethan Seshadri, Sivakumar Munnangi
Daniel Henrique Barboza <dbarboza@ventanamicro.com> writes:
> On 5/21/24 07:56, Björn Töpel wrote:
>> From: Björn Töpel <bjorn@rivosinc.com>
>>
>> Add ACPI GED for the RISC-V "virt" machine, and wire up PC-DIMM memory
>> hotplugging support. Heavily based/copied from hw/arm/virt.c.
>>
>> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
>> ---
>> hw/riscv/Kconfig | 3 ++
>> hw/riscv/virt-acpi-build.c | 16 ++++++
>> hw/riscv/virt.c | 104 ++++++++++++++++++++++++++++++++++++-
>> include/hw/riscv/virt.h | 6 ++-
>> 4 files changed, 126 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
>> index 08f82dbb681a..bebe412f2107 100644
>> --- a/hw/riscv/Kconfig
>> +++ b/hw/riscv/Kconfig
>> @@ -56,6 +56,9 @@ config RISCV_VIRT
>> select PLATFORM_BUS
>> select ACPI
>> select ACPI_PCI
>> + select MEM_DEVICE
>> + select DIMM
>> + select ACPI_HW_REDUCED
>> select VIRTIO_MEM_SUPPORTED
>> select VIRTIO_PMEM_SUPPORTED
>>
>> diff --git a/hw/riscv/virt-acpi-build.c b/hw/riscv/virt-acpi-build.c
>> index 6dc3baa9ec86..61813abdef3f 100644
>> --- a/hw/riscv/virt-acpi-build.c
>> +++ b/hw/riscv/virt-acpi-build.c
>> @@ -27,6 +27,8 @@
>> #include "hw/acpi/acpi-defs.h"
>> #include "hw/acpi/acpi.h"
>> #include "hw/acpi/aml-build.h"
>> +#include "hw/acpi/memory_hotplug.h"
>> +#include "hw/acpi/generic_event_device.h"
>> #include "hw/acpi/pci.h"
>> #include "hw/acpi/utils.h"
>> #include "hw/intc/riscv_aclint.h"
>> @@ -432,6 +434,20 @@ static void build_dsdt(GArray *table_data,
>> acpi_dsdt_add_gpex_host(scope, PCIE_IRQ + VIRT_IRQCHIP_NUM_SOURCES * 2);
>> }
>>
>> + if (s->acpi_dev) {
>> + uint32_t event = object_property_get_uint(OBJECT(s->acpi_dev),
>> + "ged-event", &error_abort);
>> +
>> + build_ged_aml(scope, "\\_SB."GED_DEVICE, HOTPLUG_HANDLER(s->acpi_dev),
>> + GED_IRQ, AML_SYSTEM_MEMORY, memmap[VIRT_ACPI_GED].base);
>> +
>> + if (event & ACPI_GED_MEM_HOTPLUG_EVT) {
>> + build_memory_hotplug_aml(scope, ms->ram_slots, "\\_SB", NULL,
>> + AML_SYSTEM_MEMORY,
>> + memmap[VIRT_PCDIMM_ACPI].base);
>> + }
>> + }
>> +
>> aml_append(dsdt, scope);
>>
>> /* copy AML table into ACPI tables blob and patch header there */
>> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
>> index 443902f919d2..2e35890187f2 100644
>> --- a/hw/riscv/virt.c
>> +++ b/hw/riscv/virt.c
>> @@ -53,10 +53,13 @@
>> #include "hw/pci-host/gpex.h"
>> #include "hw/display/ramfb.h"
>> #include "hw/acpi/aml-build.h"
>> +#include "hw/acpi/generic_event_device.h"
>> +#include "hw/acpi/memory_hotplug.h"
>> #include "hw/mem/memory-device.h"
>> #include "hw/virtio/virtio-mem-pci.h"
>> #include "qapi/qapi-visit-common.h"
>> #include "hw/virtio/virtio-iommu.h"
>> +#include "hw/mem/pc-dimm.h"
>>
>> /* KVM AIA only supports APLIC MSI. APLIC Wired is always emulated by QEMU. */
>> static bool virt_use_kvm_aia(RISCVVirtState *s)
>> @@ -84,6 +87,8 @@ static const MemMapEntry virt_memmap[] = {
>> [VIRT_UART0] = { 0x10000000, 0x100 },
>> [VIRT_VIRTIO] = { 0x10001000, 0x1000 },
>> [VIRT_FW_CFG] = { 0x10100000, 0x18 },
>> + [VIRT_PCDIMM_ACPI] = { 0x10200000, MEMORY_HOTPLUG_IO_LEN },
>> + [VIRT_ACPI_GED] = { 0x10210000, ACPI_GED_EVT_SEL_LEN },
>> [VIRT_FLASH] = { 0x20000000, 0x4000000 },
>> [VIRT_IMSIC_M] = { 0x24000000, VIRT_IMSIC_MAX_SIZE },
>> [VIRT_IMSIC_S] = { 0x28000000, VIRT_IMSIC_MAX_SIZE },
>> @@ -1400,6 +1405,28 @@ static void virt_machine_done(Notifier *notifier, void *data)
>> }
>> }
>>
>> +static DeviceState *create_acpi_ged(RISCVVirtState *s)
>> +{
>> + DeviceState *dev;
>> + MachineState *ms = MACHINE(s);
>> + uint32_t event = 0;
>> +
>> + if (ms->ram_slots) {
>> + event |= ACPI_GED_MEM_HOTPLUG_EVT;
>> + }
>> +
>> + dev = qdev_new(TYPE_ACPI_GED);
>> + qdev_prop_set_uint32(dev, "ged-event", event);
>> + sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
>> +
>> + sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, s->memmap[VIRT_ACPI_GED].base);
>> + sysbus_mmio_map(SYS_BUS_DEVICE(dev), 1, s->memmap[VIRT_PCDIMM_ACPI].base);
>> + sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, qdev_get_gpio_in(s->irqchip[0],
>> + GED_IRQ));
>> +
>> + return dev;
>> +}
>> +
>> static void virt_machine_init(MachineState *machine)
>> {
>> const MemMapEntry *memmap = virt_memmap;
>> @@ -1612,6 +1639,10 @@ static void virt_machine_init(MachineState *machine)
>>
>> gpex_pcie_init(system_memory, pcie_irqchip, s);
>>
>> + if (virt_is_acpi_enabled(s)) {
>> + s->acpi_dev = create_acpi_ged(s);
>> + }
>> +
>> create_platform_bus(s, mmio_irqchip);
>>
>> serial_mm_init(system_memory, memmap[VIRT_UART0].base,
>> @@ -1752,6 +1783,7 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
>> MachineClass *mc = MACHINE_GET_CLASS(machine);
>>
>> if (device_is_dynamic_sysbus(mc, dev) ||
>> + object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
>> object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI) ||
>> object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
>> return HOTPLUG_HANDLER(machine);
>> @@ -1759,14 +1791,42 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
>> return NULL;
>> }
>>
>> +static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>> + Error **errp)
>> +{
>> + RISCVVirtState *s = RISCV_VIRT_MACHINE(hotplug_dev);
>> +
>> + if (!s->acpi_dev) {
>> + error_setg(errp,
>> + "memory hotplug is not enabled: missing acpi-ged device");
>> + return;
>> + }
>> +
>> + pc_dimm_pre_plug(PC_DIMM(dev), MACHINE(hotplug_dev), NULL, errp);
>> +}
>> +
>
> Note that we're not doing any aligment checks in this pre_plug(), meaning that
> we're kind of accepting whatever the pc-dimm device throw at us.
>
> Testing in an AMD x86 machine will force the pc-dimm to be 2Mb aligned:
>
> $ ./build/qemu-system-riscv64 -M virt -m 2G,slots=4,maxmem=8G -nographic
> (...)
> (qemu) object_add memory-backend-ram,id=ram0,size=111M
> (qemu) device_add pc-dimm,id=dimm0,memdev=ram0
> Error: backend memory size must be multiple of 0x200000
> (qemu) object_del ram0
>
> This happens because the DIMM must be aligned with its own backend, in this case
> the host memory itself (backends/hostmem.c).
>
> There's no guarantee that we'll always run in a host that is mem aligned with the board,
> so it would be nice to add align checks in virt_memory_pre_plug().
Indeed! I'll look into that.
>> static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
>> DeviceState *dev, Error **errp)
>> {
>> + if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
>> + virt_memory_pre_plug(hotplug_dev, dev, errp);
>> + }
>> +
>> if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
>> virtio_md_pci_pre_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
>> }
>> }
>>
>> +static void virt_memory_plug(HotplugHandler *hotplug_dev,
>> + DeviceState *dev, Error **errp)
>> +{
>> + RISCVVirtState *s = RISCV_VIRT_MACHINE(hotplug_dev);
>> +
>> + pc_dimm_plug(PC_DIMM(dev), MACHINE(s));
>> +
>> + hotplug_handler_plug(HOTPLUG_HANDLER(s->acpi_dev), dev, &error_abort);
>> +}
>> +
>> static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
>> DeviceState *dev, Error **errp)
>> {
>> @@ -1785,16 +1845,36 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
>> create_fdt_virtio_iommu(s, pci_get_bdf(PCI_DEVICE(dev)));
>> }
>>
>> + if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
>> + virt_memory_plug(hotplug_dev, dev, errp);
>> + }
>> +
>> if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
>> virtio_md_pci_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
>> }
>> }
>>
>> +static void virt_dimm_unplug_request(HotplugHandler *hotplug_dev,
>> + DeviceState *dev, Error **errp)
>> +{
>> + RISCVVirtState *s = RISCV_VIRT_MACHINE(hotplug_dev);
>> +
>> + if (!s->acpi_dev) {
>> + error_setg(errp,
>> + "memory hotplug is not enabled: missing acpi-ged device");
>> + return;
>> + }
>> +
>> + hotplug_handler_unplug_request(HOTPLUG_HANDLER(s->acpi_dev), dev, errp);
>> +}
>> +
>
> I'm unsure if we're ready to support both hotplug and hot-unplug, but I tested anyway.
> Hotplug seems to work but hot-unplug doesn't:
>
> $ ./build/qemu-system-riscv64 -M virt -m 2G,slots=4,maxmem=8G -nographic
> (...)
> (qemu) object_add memory-backend-ram,id=ram0,size=112M
> (qemu) device_add pc-dimm,id=dimm0,memdev=ram0
> (qemu)
> (qemu) info memory-devices
> Memory device [dimm]: "dimm0"
> addr: 0x100000000
> slot: 0
> node: 0
> size: 117440512
> memdev: /objects/ram0
> hotplugged: true
> hotpluggable: true
> (qemu)
> (qemu) device_del dimm0
> (qemu) object_del ram0
> Error: object 'ram0' is in use, can not be deleted
> (qemu) info memory-devices
> Memory device [dimm]: "dimm0"
> addr: 0x100000000
> slot: 0
> node: 0
> size: 117440512
> memdev: /objects/ram0
> hotplugged: true
> hotpluggable: true
> (qemu)
>
> In short: hotplugged a 112Mb DIMM, then tried to remove it. 'device_del' doesn't error
> out but doesn't let me remove the memory backing created, i.e. the dimm0 device is
> still around.
>
> In a quick digging I see that we're hitting virt_dimm_unplug_request() all the way
> down to acpi_memory_unplug_request_cb(), where an ACPI_MEMORY_HOTPLUG_STATUS is being
> sent. We never reach virt_dimm_unplug() afterwards, so the PC_DIMM is never removed.
>
> I'm not acquainted with ACPI enough to say if we're missing stuff in QEMU, or if we
> need SW to be aware of this ACPI HP event to trigger the release of the dimm, or
> anything in between.
>
> I consider this more as a FYI. If we're up to the point of hotplugging pc-dimms it's
> already an improvement worth having. Hot-unplugging can come later.
I'll debug this as well!
Thanks for all the input, Daniel!
Björn
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support
2024-05-27 12:03 ` Björn Töpel
@ 2024-05-27 13:19 ` Daniel Henrique Barboza
0 siblings, 0 replies; 14+ messages in thread
From: Daniel Henrique Barboza @ 2024-05-27 13:19 UTC (permalink / raw)
To: Björn Töpel, David Hildenbrand, Palmer Dabbelt,
Alistair Francis, Bin Meng, Weiwei Li, Liu Zhiwei, qemu-riscv,
qemu-devel, Atish Patra, Atish Patra
Cc: Björn Töpel, Sunil V L, Santosh Mamila,
Chethan Seshadri, Sivakumar Munnangi
On 5/27/24 09:03, Björn Töpel wrote:
> David Hildenbrand <david@redhat.com> writes:
>
>> On 24.05.24 15:14, Daniel Henrique Barboza wrote:
>>>
>>>
>>> On 5/21/24 07:56, Björn Töpel wrote:
>>>> From: Björn Töpel <bjorn@rivosinc.com>
>>>>
>>>> Virtio-based memory devices (virtio-mem/virtio-pmem) allows for
>>>> dynamic resizing of virtual machine memory, and requires proper
>>>> hotplugging (add/remove) support to work.
>>>>
>>>> Add device memory support for RISC-V "virt" machine, and enable
>>>> virtio-md-pci with the corresponding missing hotplugging callbacks.
>>>>
>>>> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
>>>> ---
>>>> hw/riscv/Kconfig | 2 +
>>>> hw/riscv/virt.c | 83 +++++++++++++++++++++++++++++++++++++++++-
>>>> hw/virtio/virtio-mem.c | 5 ++-
>>>> 3 files changed, 87 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
>>>> index a2030e3a6ff0..08f82dbb681a 100644
>>>> --- a/hw/riscv/Kconfig
>>>> +++ b/hw/riscv/Kconfig
>>>> @@ -56,6 +56,8 @@ config RISCV_VIRT
>>>> select PLATFORM_BUS
>>>> select ACPI
>>>> select ACPI_PCI
>>>> + select VIRTIO_MEM_SUPPORTED
>>>> + select VIRTIO_PMEM_SUPPORTED
>>>>
>>>> config SHAKTI_C
>>>> bool
>>>> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
>>>> index 4fdb66052587..443902f919d2 100644
>>>> --- a/hw/riscv/virt.c
>>>> +++ b/hw/riscv/virt.c
>>>> @@ -53,6 +53,8 @@
>>>> #include "hw/pci-host/gpex.h"
>>>> #include "hw/display/ramfb.h"
>>>> #include "hw/acpi/aml-build.h"
>>>> +#include "hw/mem/memory-device.h"
>>>> +#include "hw/virtio/virtio-mem-pci.h"
>>>> #include "qapi/qapi-visit-common.h"
>>>> #include "hw/virtio/virtio-iommu.h"
>>>>
>>>> @@ -1407,6 +1409,7 @@ static void virt_machine_init(MachineState *machine)
>>>> DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip;
>>>> int i, base_hartid, hart_count;
>>>> int socket_count = riscv_socket_count(machine);
>>>> + hwaddr device_memory_base, device_memory_size;
>>>>
>>>> /* Check socket count limit */
>>>> if (VIRT_SOCKETS_MAX < socket_count) {
>>>> @@ -1420,6 +1423,12 @@ static void virt_machine_init(MachineState *machine)
>>>> exit(1);
>>>> }
>>>>
>>>> + if (machine->ram_slots > ACPI_MAX_RAM_SLOTS) {
>>>> + error_report("unsupported amount of memory slots: %"PRIu64,
>>>> + machine->ram_slots);
>>>> + exit(EXIT_FAILURE);
>>>> + }
>>>> +
>>>> /* Initialize sockets */
>>>> mmio_irqchip = virtio_irqchip = pcie_irqchip = NULL;
>>>> for (i = 0; i < socket_count; i++) {
>>>> @@ -1553,6 +1562,37 @@ static void virt_machine_init(MachineState *machine)
>>>> memory_region_add_subregion(system_memory, memmap[VIRT_MROM].base,
>>>> mask_rom);
>>>>
>>>> + /* device memory */
>>>> + device_memory_base = ROUND_UP(s->memmap[VIRT_DRAM].base + machine->ram_size,
>>>> + GiB);
>>>> + device_memory_size = machine->maxram_size - machine->ram_size;
>>>> + if (device_memory_size > 0) {
>>>> + /*
>>>> + * Each DIMM is aligned based on the backend's alignment value.
>>>> + * Assume max 1G hugepage alignment per slot.
>>>> + */
>>>> + device_memory_size += machine->ram_slots * GiB;
>>>
>>> We don't need to align to 1GiB. This calc can use 2MiB instead (or 4MiB if we're
>>> running 32 bits).
>>>
>>>> +
>>>> + if (riscv_is_32bit(&s->soc[0])) {
>>>> + hwaddr memtop = device_memory_base + ROUND_UP(device_memory_size,
>>>> + GiB);
>>>
>>> Same here - alignment is 2/4 MiB.
>>>
>>>> +
>>>> + if (memtop > UINT32_MAX) {
>>>> + error_report("memory exceeds 32-bit limit by %lu bytes",
>>>> + memtop - UINT32_MAX);
>>>> + exit(EXIT_FAILURE);
>>>> + }
>>>> + }
>>>> +
>>>> + if (device_memory_base + device_memory_size < device_memory_size) {
>>>> + error_report("unsupported amount of device memory");
>>>> + exit(EXIT_FAILURE);
>>>> + }
>>>
>>> Took another look and found this a bit strange. These are all unsigned vars, so
>>> if (unsigned a + unsigned b < unsigned b) will always be 'false'. The compiler is
>>> probably cropping this out.
>>
>> No. Unsigned interger overflow is defined behavior and this is a common
>> check to detect such overflow. tI's consistent with what we do for other
>> architectures.
>>
>>>
>>> The calc we need to do is to ensure that the extra ram_slots * alignment will fit into
>>> the VIRT_DRAM block, i.e. maxram_size + (ram_slots * alignment) < memmap[VIRT_DRAM].size.
>>>
>>>
>>> TBH I'm starting to have second thoughts about letting users hotplug whatever they want.
>>> It seems cleaner to just force the 2/4 Mb alignment in pre_plug() and be done with it,
>>> no need to allocate ram_slots * alignment and doing all these extra checks.
>>
>> It's worth noting that if user space decides to specify addresses
>> manually, it can mess up everything already. There are other events that
>> can result in fragmentation of the memory device area (repeated
>> hot(un)plug of differing DIMMs).
>>
>> Assume you have 1 GiB range and hotplug a 512 MiB DIMM at offset 256
>> MiB. You won't be able to hotplug another 512 MiB DIMM even though we
>> reserved space.
>>
>> My take so far is: if the user wants to do such stuff it should size the
>> area (maxmem) much larger or deal with the concequences (not being able
>> to hotplug memory).
>>
>> It usually does not happen in practice ...
>
> Daniel/David, again thanks for spending time on the patches.
>
> The reason I picked 1 GiB per slot was because the alignment is also
> dependent on the backend AFAIU. Rationale in commit 085f8e88ba73 ("pc:
> count in 1Gb hugepage alignment when sizing hotplug-memory container").
>
> What I'm reading from you guys are: Just depend on whatever maxmem says
> (modulo 2/4M alignment), and leave at that. I agree that that's much
> easier to reason about.
>
> Correct?
We could opt for the more conservative approach and keep the 1Gb per dimm to
account for hugepage mem fragmentation that the commit is talking about, assuming
that this scenario is still feasible to happen.
riscv64 can take this extra restriction (i.e. an extra 256GiB RAM just for that)
but I'm afraid this would eradicate hotplug support for riscv32 since it'll never
has enough memory to deal with a spare 1Gb per slot.
I'm aware that you're working in the RISC-V kernel support for memory hotplug, so
you can answer this better than I can: do we care about hugepage support for
riscv32? Assuming we don't, then we could keep 4MiB alignment for riscv32 but do
1GiB for riscv64 to account for this potential hugepage fragmentation.
Thanks,
Daniel
>
>>> As I sent in an earlier email, users must already comply to the alignment of the host
>>> memory when plugging pc-dimms, so I'm not sure our value/proposition with all this
>>> extra code is worth it - the alignment will most likely be forced by the host memory
>>> backend, so might as well force ourselves in pre_plug().
>>
>> At least on x86-64, the 2 MiB alignment requirement is handled
>> automatically. QEMU_VMALLOC_ALIGN implicitly enforces that.
>>
>> Maybe RISCV also wants to set QEMU_VMALLOC_ALIGN?
>
> I'll look into this for the next rev!
>
>
> Björn
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-05-27 13:19 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-21 10:56 [PATCH v2 0/3] RISC-V virt MHP support Björn Töpel
2024-05-21 10:56 ` [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support Björn Töpel
2024-05-24 8:53 ` Daniel Henrique Barboza
2024-05-27 11:49 ` Björn Töpel
2024-05-24 13:14 ` Daniel Henrique Barboza
2024-05-24 15:02 ` David Hildenbrand
2024-05-24 15:29 ` Daniel Henrique Barboza
2024-05-27 7:19 ` David Hildenbrand
2024-05-27 12:03 ` Björn Töpel
2024-05-27 13:19 ` Daniel Henrique Barboza
2024-05-21 10:56 ` [PATCH v2 2/3] hw/riscv/virt-acpi-build: Expose device memory in ACPI SRAT Björn Töpel
2024-05-21 10:56 ` [PATCH v2 3/3] hw/riscv/virt: Add ACPI GED and PC-DIMM MHP support Björn Töpel
2024-05-24 9:10 ` Daniel Henrique Barboza
2024-05-27 12:08 ` Björn Töpel
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).