* [Qemu-arm] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support
@ 2019-03-21 10:47 Shameer Kolothum
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 01/10] hw/acpi: Make ACPI IO address space configurable Shameer Kolothum
                   ` (10 more replies)
  0 siblings, 11 replies; 95+ messages in thread
From: Shameer Kolothum @ 2019-03-21 10:47 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

This series adds device memory hotplug support to the ARM virt
platform. It is based on Eric's recent work[1] and carries some of
the pc-dimm related patches dropped from his series.

Kernel support for arm64 memory hot add was added recently by
Robin, so the guest kernel should be >= 5.0-rc1.

NVDIMM support is not included currently as we still have an unresolved
issue while hot adding NVDIMM[2]. The NVDIMM cold plug patches could
be included, but are left out for now to keep the series simple.

This makes use of the GED device to send hotplug ACPI events to the
Guest. The GED code is based on Nemu. Thanks to Samuel and Sebastien
for their efforts in adding hardware-reduced support to Nemu using the
GED device[3]. (Please shout if I got the author/signed-off wrong for
those patches or missed any names).

Thanks,
Shameer

[1] https://patchwork.kernel.org/cover/10837565/
[2] https://patchwork.kernel.org/cover/10783589/
[3] https://github.com/intel/nemu/blob/topic/virt-x86/hw/acpi/ged.c

v2 --> v3

Addressed comments from Igor and Eric,
-Made virt acpi device platform independent and moved
 to hw/acpi/generic_event_device.c
-Moved ged specific code into hw/acpi/generic_event_device.c
-Introduced an opt-in feature "fdt" to resolve device-memory being
 treated as early boot memory.
-Dropped patch #1 from v2.

RFC --> v2

-Use GED device instead of GPIO for ACPI hotplug events.
-Removed NVDIMM support for now.
-Includes dropped patches from Eric's v9 series.

Eric Auger (1):
  hw/arm/virt: Add memory hotplug framework

Samuel Ortiz (3):
  hw/acpi: Do not create memory hotplug method when handler is not
    defined
  hw/arm/virt: Add virtual ACPI device
  hw/acpi: Add ACPI Generic Event Device Support

Shameer Kolothum (6):
  hw/acpi: Make ACPI IO address space configurable
  hw/arm/virt: Add ACPI support for device memory cold-plug
  hw/arm/virt-acpi-build: Add PC-DIMM in SRAT
  hw/arm/virt: Introduce opt-in feature "fdt"
  hw/arm/boot: Expose the PC-DIMM nodes in the DT
  hw/arm/virt: Init GED device and enable memory hotplug

 default-configs/arm-softmmu.mak        |   5 +
 hw/acpi/Kconfig                        |   4 +
 hw/acpi/Makefile.objs                  |   1 +
 hw/acpi/generic_event_device.c         | 313 +++++++++++++++++++++++++++++++++
 hw/acpi/memory_hotplug.c               |  34 ++--
 hw/arm/boot.c                          |  44 +++++
 hw/arm/virt-acpi-build.c               |  27 +++
 hw/arm/virt.c                          | 121 ++++++++++++-
 hw/i386/acpi-build.c                   |   3 +-
 include/hw/acpi/generic_event_device.h |  68 +++++++
 include/hw/acpi/memory_hotplug.h       |   8 +-
 include/hw/arm/virt.h                  |   5 +
 12 files changed, 614 insertions(+), 19 deletions(-)
 create mode 100644 hw/acpi/generic_event_device.c
 create mode 100644 include/hw/acpi/generic_event_device.h

-- 
2.7.4




* [Qemu-devel] [PATCH v3 01/10] hw/acpi: Make ACPI IO address space configurable
  2019-03-21 10:47 [Qemu-arm] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support Shameer Kolothum
@ 2019-03-21 10:47 ` Shameer Kolothum
  2019-04-01 12:58     ` [Qemu-devel] " Igor Mammedov
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 02/10] hw/acpi: Do not create memory hotplug method when handler is not defined Shameer Kolothum
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 95+ messages in thread
From: Shameer Kolothum @ 2019-03-21 10:47 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

This is in preparation for adding support for ARM64 platforms,
which do not use port-mapped IO for the ACPI IO space.

Also move the MEMORY_SLOT_SCAN_METHOD/MEMORY_DEVICES_CONTAINER
definitions to the header so that other memory hotplug event
signalling mechanisms (e.g. the Generic Event Device on HW-reduced
ACPI platforms) can use them from their respective event
handler AML code.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
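Editorial note, not part of the patch: the choice this change introduces
can be pictured with a small standalone sketch. It only models which kind
of _CRS resource descriptor would be emitted for a given region space.
All Sketch*/sketch_* names are hypothetical stand-ins, not QEMU APIs; the
24-byte length mirrors MEMORY_HOTPLUG_IO_LEN, and the base addresses in
the usage note are example values only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two region spaces the patch distinguishes. */
typedef enum {
    SKETCH_SYSTEM_MEMORY,
    SKETCH_SYSTEM_IO
} SketchRegionSpace;

/* Which kind of _CRS resource descriptor would be emitted. */
typedef enum {
    SKETCH_DESC_IO16,
    SKETCH_DESC_MEMORY32_FIXED
} SketchDescKind;

typedef struct {
    SketchDescKind kind;
    uint64_t base;
    uint32_t len;
} SketchCrsEntry;

/*
 * Pick the descriptor kind from the region space, as the patched
 * build_memory_hotplug_aml() does with aml_io() vs aml_memory32_fixed().
 */
SketchCrsEntry sketch_build_crs(SketchRegionSpace rs, uint64_t base,
                                uint32_t len)
{
    SketchCrsEntry e = { SKETCH_DESC_MEMORY32_FIXED, base, len };

    if (rs == SKETCH_SYSTEM_IO) {
        /* A 16-bit decoded I/O port range cannot hold a wider base. */
        assert(base <= UINT16_MAX);
        e.kind = SKETCH_DESC_IO16;
    } else {
        /* A fixed 32-bit memory descriptor needs a 32-bit base. */
        assert(base <= UINT32_MAX);
    }
    return e;
}
```

For instance, sketch_build_crs(SKETCH_SYSTEM_IO, 0x0a00, 24) would yield an
I/O descriptor, while sketch_build_crs(SKETCH_SYSTEM_MEMORY, 0x09070000, 24)
would yield a fixed memory descriptor. Widening the base from uint16_t to
hwaddr is what makes the second case representable at all.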
 hw/acpi/memory_hotplug.c         | 24 ++++++++++++++----------
 hw/i386/acpi-build.c             |  3 ++-
 include/hw/acpi/memory_hotplug.h |  8 ++++++--
 3 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
index 297812d..80e25f0 100644
--- a/hw/acpi/memory_hotplug.c
+++ b/hw/acpi/memory_hotplug.c
@@ -29,12 +29,10 @@
 #define MEMORY_SLOT_PROXIMITY_METHOD "MPXM"
 #define MEMORY_SLOT_EJECT_METHOD     "MEJ0"
 #define MEMORY_SLOT_NOTIFY_METHOD    "MTFY"
-#define MEMORY_SLOT_SCAN_METHOD      "MSCN"
 #define MEMORY_HOTPLUG_DEVICE        "MHPD"
 #define MEMORY_HOTPLUG_IO_LEN         24
-#define MEMORY_DEVICES_CONTAINER     "\\_SB.MHPC"
 
-static uint16_t memhp_io_base;
+static hwaddr memhp_io_base;
 
 static ACPIOSTInfo *acpi_memory_device_status(int slot, MemStatus *mdev)
 {
@@ -209,7 +207,7 @@ static const MemoryRegionOps acpi_memory_hotplug_ops = {
 };
 
 void acpi_memory_hotplug_init(MemoryRegion *as, Object *owner,
-                              MemHotplugState *state, uint16_t io_base)
+                              MemHotplugState *state, hwaddr io_base)
 {
     MachineState *machine = MACHINE(qdev_get_machine());
 
@@ -342,7 +340,8 @@ const VMStateDescription vmstate_memory_hotplug = {
 
 void build_memory_hotplug_aml(Aml *table, uint32_t nr_mem,
                               const char *res_root,
-                              const char *event_handler_method)
+                              const char *event_handler_method,
+                              AmlRegionSpace rs)
 {
     int i;
     Aml *ifctx;
@@ -365,14 +364,19 @@ void build_memory_hotplug_aml(Aml *table, uint32_t nr_mem,
             aml_name_decl("_UID", aml_string("Memory hotplug resources")));
 
         crs = aml_resource_template();
-        aml_append(crs,
-            aml_io(AML_DECODE16, memhp_io_base, memhp_io_base, 0,
-                   MEMORY_HOTPLUG_IO_LEN)
-        );
+        if (rs == AML_SYSTEM_IO) {
+            aml_append(crs,
+                aml_io(AML_DECODE16, memhp_io_base, memhp_io_base, 0,
+                       MEMORY_HOTPLUG_IO_LEN)
+            );
+        } else {
+            aml_append(crs, aml_memory32_fixed(memhp_io_base,
+                            MEMORY_HOTPLUG_IO_LEN, AML_READ_WRITE));
+        }
         aml_append(mem_ctrl_dev, aml_name_decl("_CRS", crs));
 
         aml_append(mem_ctrl_dev, aml_operation_region(
-            MEMORY_HOTPLUG_IO_REGION, AML_SYSTEM_IO,
+            MEMORY_HOTPLUG_IO_REGION, rs,
             aml_int(memhp_io_base), MEMORY_HOTPLUG_IO_LEN)
         );
 
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 416da31..6d6de44 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1852,7 +1852,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         build_cpus_aml(dsdt, machine, opts, pm->cpu_hp_io_base,
                        "\\_SB.PCI0", "\\_GPE._E02");
     }
-    build_memory_hotplug_aml(dsdt, nr_mem, "\\_SB.PCI0", "\\_GPE._E03");
+    build_memory_hotplug_aml(dsdt, nr_mem, "\\_SB.PCI0",
+                             "\\_GPE._E03", AML_SYSTEM_IO);
 
     scope =  aml_scope("_GPE");
     {
diff --git a/include/hw/acpi/memory_hotplug.h b/include/hw/acpi/memory_hotplug.h
index 77c6576..f95aa1f 100644
--- a/include/hw/acpi/memory_hotplug.h
+++ b/include/hw/acpi/memory_hotplug.h
@@ -5,6 +5,9 @@
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/aml-build.h"
 
+#define MEMORY_SLOT_SCAN_METHOD      "MSCN"
+#define MEMORY_DEVICES_CONTAINER     "\\_SB.MHPC"
+
 /**
  * MemStatus:
  * @is_removing: the memory device in slot has been requested to be ejected.
@@ -29,7 +32,7 @@ typedef struct MemHotplugState {
 } MemHotplugState;
 
 void acpi_memory_hotplug_init(MemoryRegion *as, Object *owner,
-                              MemHotplugState *state, uint16_t io_base);
+                              MemHotplugState *state, hwaddr io_base);
 
 void acpi_memory_plug_cb(HotplugHandler *hotplug_dev, MemHotplugState *mem_st,
                          DeviceState *dev, Error **errp);
@@ -48,5 +51,6 @@ void acpi_memory_ospm_status(MemHotplugState *mem_st, ACPIOSTInfoList ***list);
 
 void build_memory_hotplug_aml(Aml *table, uint32_t nr_mem,
                               const char *res_root,
-                              const char *event_handler_method);
+                              const char *event_handler_method,
+                              AmlRegionSpace rs);
 #endif
-- 
2.7.4




* [Qemu-devel] [PATCH v3 02/10] hw/acpi: Do not create memory hotplug method when handler is not defined
  2019-03-21 10:47 [Qemu-arm] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support Shameer Kolothum
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 01/10] hw/acpi: Make ACPI IO address space configurable Shameer Kolothum
@ 2019-03-21 10:47 ` Shameer Kolothum
  2019-03-28 14:14   ` [Qemu-arm] " Auger Eric
  2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device Shameer Kolothum
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 95+ messages in thread
From: Shameer Kolothum @ 2019-03-21 10:47 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

From: Samuel Ortiz <sameo@linux.intel.com>

With Hardware-reduced ACPI, the GED device will manage ACPI
hotplug entirely. As a consequence, make generation of the
memory-specific event handler AML optional. The handler method
is only emitted when the method name is not NULL.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
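Editorial note, not part of the patch: a minimal standalone sketch of the
"optional handler" behaviour described above. The SketchTable type and
sketch_build_memory_hotplug() are hypothetical; the table here only counts
emitted methods instead of building real AML.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature "table": it only records emitted methods. */
typedef struct {
    int nr_methods;
    const char *last_method;
} SketchTable;

/*
 * The device container would always be emitted (omitted here). The event
 * handler method is emitted only when the caller passes a name.
 */
void sketch_build_memory_hotplug(SketchTable *table,
                                 const char *event_handler_method)
{
    assert(table != NULL);

    if (event_handler_method) {
        table->last_method = event_handler_method;
        table->nr_methods++;
    }
}
```

A GPE-based caller would pass a method name such as "\\_GPE._E03", while a
GED-based caller would pass NULL and provide its own event method elsewhere.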
 hw/acpi/memory_hotplug.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
index 80e25f0..98407e3 100644
--- a/hw/acpi/memory_hotplug.c
+++ b/hw/acpi/memory_hotplug.c
@@ -720,10 +720,12 @@ void build_memory_hotplug_aml(Aml *table, uint32_t nr_mem,
     }
     aml_append(table, dev_container);
 
-    method = aml_method(event_handler_method, 0, AML_NOTSERIALIZED);
-    aml_append(method,
-        aml_call0(MEMORY_DEVICES_CONTAINER "." MEMORY_SLOT_SCAN_METHOD));
-    aml_append(table, method);
+    if (event_handler_method) {
+        method = aml_method(event_handler_method, 0, AML_NOTSERIALIZED);
+        aml_append(method,
+                   aml_call0(MEMORY_DEVICES_CONTAINER "." MEMORY_SLOT_SCAN_METHOD));
+        aml_append(table, method);
+    }
 
     g_free(mhp_res_path);
 }
-- 
2.7.4




* [Qemu-arm] [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device
  2019-03-21 10:47 [Qemu-arm] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support Shameer Kolothum
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 01/10] hw/acpi: Make ACPI IO address space configurable Shameer Kolothum
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 02/10] hw/acpi: Do not create memory hotplug method when handler is not defined Shameer Kolothum
@ 2019-03-21 10:47 ` Shameer Kolothum
  2019-03-28 14:14   ` Auger Eric
  2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 04/10] hw/arm/virt: Add memory hotplug framework Shameer Kolothum
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 95+ messages in thread
From: Shameer Kolothum @ 2019-03-21 10:47 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

From: Samuel Ortiz <sameo@linux.intel.com>

This adds the skeleton to support an acpi device interface
for HW-reduced acpi platforms via ACPI GED - Generic Event
Device (ACPI v6.1 5.6.9).

This will be used by Arm/Virt to add hotplug support.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 hw/acpi/Kconfig                        |  4 ++
 hw/acpi/Makefile.objs                  |  1 +
 hw/acpi/generic_event_device.c         | 72 ++++++++++++++++++++++++++++++++++
 include/hw/acpi/generic_event_device.h | 29 ++++++++++++++
 4 files changed, 106 insertions(+)
 create mode 100644 hw/acpi/generic_event_device.c
 create mode 100644 include/hw/acpi/generic_event_device.h

diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index eca3bee..01a8b41 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -27,3 +27,7 @@ config ACPI_VMGENID
     bool
     default y
     depends on PC
+
+config ACPI_HW_REDUCED
+    bool
+    depends on ACPI
diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
index 2d46e37..b753232 100644
--- a/hw/acpi/Makefile.objs
+++ b/hw/acpi/Makefile.objs
@@ -6,6 +6,7 @@ common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o
 common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
 common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
 common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
+common-obj-$(CONFIG_ACPI_HW_REDUCED) += generic_event_device.o
 common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
 
 common-obj-y += acpi_interface.o
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
new file mode 100644
index 0000000..b21a551
--- /dev/null
+++ b/hw/acpi/generic_event_device.c
@@ -0,0 +1,72 @@
+/*
+ *
+ * Copyright (c) 2018 Intel Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/sysbus.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/generic_event_device.h"
+
+static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
+                                DeviceState *dev, Error **errp)
+{
+}
+
+static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
+{
+}
+
+static void virt_device_realize(DeviceState *dev, Error **errp)
+{
+}
+
+static Property virt_acpi_properties[] = {
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void virt_acpi_class_init(ObjectClass *class, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(class);
+    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(class);
+    AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_CLASS(class);
+
+    dc->desc = "ACPI";
+    dc->props = virt_acpi_properties;
+    dc->realize = virt_device_realize;
+
+    hc->plug = virt_device_plug_cb;
+
+    adevc->send_event = virt_send_ged;
+}
+
+static const TypeInfo virt_acpi_info = {
+    .name          = TYPE_VIRT_ACPI,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(VirtAcpiState),
+    .class_init    = virt_acpi_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_HOTPLUG_HANDLER },
+        { TYPE_ACPI_DEVICE_IF },
+        { }
+    }
+};
+
+static void virt_acpi_register_types(void)
+{
+    type_register_static(&virt_acpi_info);
+}
+
+type_init(virt_acpi_register_types)
diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
new file mode 100644
index 0000000..f314515
--- /dev/null
+++ b/include/hw/acpi/generic_event_device.h
@@ -0,0 +1,29 @@
+/*
+ *
+ * Copyright (c) 2018 Intel Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ACPI_GED_H
+#define HW_ACPI_GED_H
+
+#define TYPE_VIRT_ACPI "virt-acpi"
+#define VIRT_ACPI(obj) \
+    OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
+
+typedef struct VirtAcpiState {
+    SysBusDevice parent_obj;
+} VirtAcpiState;
+
+#endif
-- 
2.7.4




* [Qemu-arm] [PATCH v3 04/10] hw/arm/virt: Add memory hotplug framework
  2019-03-21 10:47 [Qemu-arm] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support Shameer Kolothum
                   ` (2 preceding siblings ...)
  2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device Shameer Kolothum
@ 2019-03-21 10:47 ` Shameer Kolothum
  2019-03-28 15:37   ` Auger Eric
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug Shameer Kolothum
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 95+ messages in thread
From: Shameer Kolothum @ 2019-03-21 10:47 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

From: Eric Auger <eric.auger@redhat.com>

This patch adds the memory hot-plug/hot-unplug infrastructure
in machvirt. It is not yet enabled, as device memory is not yet
reported to the guest.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
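Editorial note, not part of the patch: a standalone sketch of the pre-plug
checks this framework performs, using hypothetical names (SketchDevice,
sketch_memory_pre_plug). It returns a message instead of filling an Error
object. The sketch returns right after the first failed check so that at
most one error is ever reported; this is a choice made for the sketch,
based on the understanding that QEMU's error_setg() expects the destination
error to be unset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the two device properties the checks look at. */
typedef struct {
    bool hotplugged;   /* device added after machine start */
    bool is_nvdimm;    /* device is an NVDIMM rather than a plain PC-DIMM */
} SketchDevice;

/*
 * Returns NULL when the device may proceed to the plug stage, or a
 * static error message describing the first check that failed.
 */
const char *sketch_memory_pre_plug(const SketchDevice *dev)
{
    assert(dev != NULL);

    if (dev->hotplugged) {
        return "memory hotplug is not supported";
    }
    if (dev->is_nvdimm) {
        return "nvdimm is not yet supported";
    }
    return NULL;
}
```

At this point in the series only a cold-plugged, non-NVDIMM device passes
both checks.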
 default-configs/arm-softmmu.mak |  3 +++
 hw/arm/virt.c                   | 53 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index 2a7efc1..795cb89 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -159,3 +159,6 @@ CONFIG_MUSICPAL=y
 
 # for realview and versatilepb
 CONFIG_LSI_SCSI_PCI=y
+
+CONFIG_MEM_DEVICE=y
+CONFIG_DIMM=y
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ce2664a..d0ff20d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -61,6 +61,8 @@
 #include "hw/arm/smmuv3.h"
 #include "hw/acpi/acpi.h"
 #include "target/arm/internals.h"
+#include "hw/mem/pc-dimm.h"
+#include "hw/mem/nvdimm.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
     static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -1806,6 +1808,42 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
     return ms->possible_cpus;
 }
 
+static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                                 Error **errp)
+{
+    const bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
+
+    if (dev->hotplugged) {
+        error_setg(errp, "memory hotplug is not supported");
+    }
+
+    if (is_nvdimm) {
+        error_setg(errp, "nvdimm is not yet supported");
+        return;
+    }
+
+    pc_dimm_pre_plug(PC_DIMM(dev), MACHINE(hotplug_dev), NULL, errp);
+}
+
+static void virt_memory_plug(HotplugHandler *hotplug_dev,
+                             DeviceState *dev, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
+    Error *local_err = NULL;
+
+    pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
+
+    error_propagate(errp, local_err);
+}
+
+static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
+                                            DeviceState *dev, Error **errp)
+{
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        virt_memory_pre_plug(hotplug_dev, dev, errp);
+    }
+}
+
 static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
                                         DeviceState *dev, Error **errp)
 {
@@ -1817,12 +1855,23 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
                                      SYS_BUS_DEVICE(dev));
         }
     }
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        virt_memory_plug(hotplug_dev, dev, errp);
+    }
+}
+
+static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
+                                          DeviceState *dev, Error **errp)
+{
+    error_setg(errp, "device unplug request for unsupported device"
+               " type: %s", object_get_typename(OBJECT(dev)));
 }
 
 static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
                                                         DeviceState *dev)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_SYS_BUS_DEVICE)) {
+    if (object_dynamic_cast(OBJECT(dev), TYPE_SYS_BUS_DEVICE) ||
+       (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM))) {
         return HOTPLUG_HANDLER(machine);
     }
 
@@ -1886,7 +1935,9 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     mc->kvm_type = virt_kvm_type;
     assert(!mc->get_hotplug_handler);
     mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
+    hc->pre_plug = virt_machine_device_pre_plug_cb;
     hc->plug = virt_machine_device_plug_cb;
+    hc->unplug_request = virt_machine_device_unplug_request_cb;
 }
 
 static void virt_instance_init(Object *obj)
-- 
2.7.4




* [Qemu-devel] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug
  2019-03-21 10:47 [Qemu-arm] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support Shameer Kolothum
                   ` (3 preceding siblings ...)
  2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 04/10] hw/arm/virt: Add memory hotplug framework Shameer Kolothum
@ 2019-03-21 10:47 ` Shameer Kolothum
  2019-03-29  9:31   ` [Qemu-arm] " Auger Eric
  2019-04-01 13:34     ` [Qemu-devel] " Igor Mammedov
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 06/10] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT Shameer Kolothum
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 95+ messages in thread
From: Shameer Kolothum @ 2019-03-21 10:47 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

This adds support to build the AML code so that the Guest (ACPI boot)
can see the cold-plugged device memory. Memory cold plug support
with DT boot is not yet enabled.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
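Editorial note, not part of the patch: a standalone sketch of the two-stage
plug path this patch sets up, where the machine first plugs the DIMM and
then forwards the device to the ACPI device, which accepts it only if
memory hotplug is enabled. All Sketch*/sketch_* names are hypothetical,
and the DIMM plug step is reduced to a boolean outcome.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    SKETCH_PLUG_OK,
    SKETCH_PLUG_DIMM_FAILED,
    SKETCH_PLUG_UNSUPPORTED
} SketchPlugStatus;

/* Hypothetical stand-in for the ACPI device state. */
typedef struct {
    bool memhp_enabled;  /* models the "memory-hotplug-support" property */
    int nr_plugged;      /* how many DIMMs the ACPI side has recorded */
} SketchAcpiDev;

/* ACPI device side: accept only PC-DIMMs, and only if enabled. */
SketchPlugStatus sketch_acpi_plug(SketchAcpiDev *acpi, bool is_pc_dimm)
{
    assert(acpi != NULL);

    if (acpi->memhp_enabled && is_pc_dimm) {
        acpi->nr_plugged++;
        return SKETCH_PLUG_OK;
    }
    return SKETCH_PLUG_UNSUPPORTED;
}

/* Machine side: stop on a DIMM plug failure, else forward to ACPI. */
SketchPlugStatus sketch_memory_plug(SketchAcpiDev *acpi, bool dimm_plug_ok)
{
    if (!dimm_plug_ok) {
        return SKETCH_PLUG_DIMM_FAILED;
    }
    return sketch_acpi_plug(acpi, true);
}
```

The point of the ordering is that the ACPI side never records a slot for a
DIMM whose own plug step failed.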
 default-configs/arm-softmmu.mak        |  2 ++
 hw/acpi/generic_event_device.c         | 23 +++++++++++++++++++++++
 hw/arm/virt-acpi-build.c               |  9 +++++++++
 hw/arm/virt.c                          | 23 +++++++++++++++++++++++
 include/hw/acpi/generic_event_device.h |  5 +++++
 include/hw/arm/virt.h                  |  2 ++
 6 files changed, 64 insertions(+)

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index 795cb89..6db444e 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -162,3 +162,5 @@ CONFIG_LSI_SCSI_PCI=y
 
 CONFIG_MEM_DEVICE=y
 CONFIG_DIMM=y
+CONFIG_ACPI_MEMORY_HOTPLUG=y
+CONFIG_ACPI_HW_REDUCED=y
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index b21a551..0b32fc9 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -16,13 +16,26 @@
  */
 
 #include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "exec/address-spaces.h"
 #include "hw/sysbus.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/generic_event_device.h"
+#include "hw/mem/pc-dimm.h"
 
 static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
                                 DeviceState *dev, Error **errp)
 {
+    VirtAcpiState *s = VIRT_ACPI(hotplug_dev);
+
+    if (s->memhp_state.is_enabled &&
+        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+            acpi_memory_plug_cb(hotplug_dev, &s->memhp_state,
+                                dev, errp);
+    } else {
+        error_setg(errp, "virt: device plug request for unsupported device"
+                   " type: %s", object_get_typename(OBJECT(dev)));
+    }
 }
 
 static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
@@ -31,9 +44,19 @@ static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
 
 static void virt_device_realize(DeviceState *dev, Error **errp)
 {
+    VirtAcpiState *s = VIRT_ACPI(dev);
+
+    if (s->memhp_state.is_enabled) {
+        acpi_memory_hotplug_init(get_system_memory(), OBJECT(dev),
+                                 &s->memhp_state,
+                                 s->memhp_base);
+    }
 }
 
 static Property virt_acpi_properties[] = {
+    DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base, 0),
+    DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
+                     memhp_state.is_enabled, true),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index bf9c0bc..20d3c83 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -40,6 +40,7 @@
 #include "hw/loader.h"
 #include "hw/hw.h"
 #include "hw/acpi/aml-build.h"
+#include "hw/acpi/memory_hotplug.h"
 #include "hw/pci/pcie_host.h"
 #include "hw/pci/pci.h"
 #include "hw/arm/virt.h"
@@ -49,6 +50,13 @@
 #define ARM_SPI_BASE 32
 #define ACPI_POWER_BUTTON_DEVICE "PWRB"
 
+static void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState *ms)
+{
+    uint32_t nr_mem = ms->ram_slots;
+
+    build_memory_hotplug_aml(scope, nr_mem, "\\_SB", NULL, AML_SYSTEM_MEMORY);
+}
+
 static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
 {
     uint16_t i;
@@ -740,6 +748,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
      * the RTC ACPI device at all when using UEFI.
      */
     scope = aml_scope("\\_SB");
+    acpi_dsdt_add_memory_hotplug(scope, MACHINE(vms));
     acpi_dsdt_add_cpus(scope, vms->smp_cpus);
     acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
                        (irqmap[VIRT_UART] + ARM_SPI_BASE));
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index d0ff20d..13db0e9 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -133,6 +133,7 @@ static const MemMapEntry base_memmap[] = {
     [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
     [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
     [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
+    [VIRT_PCDIMM_ACPI] =        { 0x09070000, 0x00010000 },
     [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
     /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
     [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
@@ -516,6 +517,18 @@ static void fdt_add_pmu_nodes(const VirtMachineState *vms)
     }
 }
 
+static DeviceState *create_virt_acpi(VirtMachineState *vms)
+{
+    DeviceState *dev;
+
+    dev = qdev_create(NULL, "virt-acpi");
+    qdev_prop_set_uint64(dev, "memhp_base",
+                         vms->memmap[VIRT_PCDIMM_ACPI].base);
+    qdev_init_nofail(dev);
+
+    return dev;
+}
+
 static void create_its(VirtMachineState *vms, DeviceState *gicdev)
 {
     const char *itsclass = its_class_name();
@@ -1644,6 +1657,8 @@ static void machvirt_init(MachineState *machine)
 
     create_platform_bus(vms, pic);
 
+    vms->acpi = create_virt_acpi(vms);
+
     vms->bootinfo.ram_size = machine->ram_size;
     vms->bootinfo.kernel_filename = machine->kernel_filename;
     vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
@@ -1828,11 +1843,19 @@ static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
 static void virt_memory_plug(HotplugHandler *hotplug_dev,
                              DeviceState *dev, Error **errp)
 {
+    HotplugHandlerClass *hhc;
     VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
     Error *local_err = NULL;
 
     pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi);
+    hhc->plug(HOTPLUG_HANDLER(vms->acpi), dev, &error_abort);
 
+out:
     error_propagate(errp, local_err);
 }
 
diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
index f314515..262ca7d 100644
--- a/include/hw/acpi/generic_event_device.h
+++ b/include/hw/acpi/generic_event_device.h
@@ -18,12 +18,17 @@
 #ifndef HW_ACPI_GED_H
 #define HW_ACPI_GED_H
 
+#include "hw/acpi/memory_hotplug.h"
+
 #define TYPE_VIRT_ACPI "virt-acpi"
 #define VIRT_ACPI(obj) \
     OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
 
 typedef struct VirtAcpiState {
     SysBusDevice parent_obj;
+    MemHotplugState memhp_state;
+    hwaddr memhp_base;
 } VirtAcpiState;
 
+
 #endif
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 507517c..c5e4c96 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -77,6 +77,7 @@ enum {
     VIRT_GPIO,
     VIRT_SECURE_UART,
     VIRT_SECURE_MEM,
+    VIRT_PCDIMM_ACPI,
     VIRT_LOWMEMMAP_LAST,
 };
 
@@ -132,6 +133,7 @@ typedef struct {
     uint32_t iommu_phandle;
     int psci_conduit;
     hwaddr highest_gpa;
+    DeviceState *acpi;
 } VirtMachineState;
 
 #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
-- 
2.7.4




* [Qemu-devel] [PATCH v3 06/10] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT
  2019-03-21 10:47 [Qemu-arm] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support Shameer Kolothum
                   ` (4 preceding siblings ...)
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug Shameer Kolothum
@ 2019-03-21 10:47 ` Shameer Kolothum
  2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt" Shameer Kolothum
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 95+ messages in thread
From: Shameer Kolothum @ 2019-03-21 10:47 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

Generate Memory Affinity Structures for PC-DIMM ranges.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
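Editorial note, not part of the patch: a standalone sketch of the single
Memory Affinity entry this change adds for the device memory region. The
struct below is a logical view with hypothetical names, not the packed
on-disk ACPI layout. It assumes at least one NUMA node is configured, and
the base and size in the usage note are made-up example values.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits of the SRAT Memory Affinity Structure. */
#define SKETCH_MEM_AFFINITY_ENABLED      (1u << 0)
#define SKETCH_MEM_AFFINITY_HOTPLUGGABLE (1u << 1)

/* Logical (unpacked) view of one Memory Affinity entry. */
typedef struct {
    uint8_t  type;              /* 1 = Memory Affinity Structure */
    uint8_t  length;            /* 40 bytes in the ACPI table */
    uint32_t proximity_domain;  /* NUMA node the range belongs to */
    uint64_t base;
    uint64_t range_length;
    uint32_t flags;
} SketchSratMemory;

/*
 * Describe the whole device memory region as one hot-pluggable range
 * assigned to the last NUMA node, mirroring the patch.
 */
SketchSratMemory sketch_srat_device_memory(uint64_t base, uint64_t size,
                                           uint32_t nb_numa_nodes)
{
    SketchSratMemory m = { 0 };

    assert(nb_numa_nodes > 0);  /* otherwise "last node" is undefined */

    m.type = 1;
    m.length = 40;
    m.proximity_domain = nb_numa_nodes - 1;
    m.base = base;
    m.range_length = size;
    m.flags = SKETCH_MEM_AFFINITY_HOTPLUGGABLE | SKETCH_MEM_AFFINITY_ENABLED;
    return m;
}
```

With two NUMA nodes, sketch_srat_device_memory(0xc0000000, 0x40000000, 2)
would place the range in proximity domain 1 with both flag bits set.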
 hw/arm/virt-acpi-build.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 20d3c83..1887531 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -524,6 +524,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     int i, srat_start;
     uint64_t mem_base;
     MachineClass *mc = MACHINE_GET_CLASS(vms);
+    MachineState *ms = MACHINE(vms);
     const CPUArchIdList *cpu_list = mc->possible_cpu_arch_ids(MACHINE(vms));
 
     srat_start = table_data->len;
@@ -549,6 +550,14 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         }
     }
 
+    if (ms->device_memory) {
+        numamem = acpi_data_push(table_data, sizeof *numamem);
+        build_srat_memory(numamem, ms->device_memory->base,
+                          memory_region_size(&ms->device_memory->mr),
+                          nb_numa_nodes - 1,
+                          MEM_AFFINITY_HOTPLUGGABLE | MEM_AFFINITY_ENABLED);
+    }
+
     build_header(linker, table_data, (void *)(table_data->data + srat_start),
                  "SRAT", table_data->len - srat_start, 3, NULL, NULL);
 }
-- 
2.7.4




* [Qemu-arm] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-03-21 10:47 [Qemu-arm] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support Shameer Kolothum
                   ` (5 preceding siblings ...)
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 06/10] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT Shameer Kolothum
@ 2019-03-21 10:47 ` Shameer Kolothum
  2019-03-29  9:31   ` Auger Eric
  2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 08/10] hw/arm/boot: Expose the PC-DIMM nodes in the DT Shameer Kolothum
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 95+ messages in thread
From: Shameer Kolothum @ 2019-03-21 10:47 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

Add a machine option, "fdt", to enable or disable populating
DT nodes that may conflict with ACPI tables. The default is "off".

This will be used in a subsequent patch where cold-plug
device-memory support is added for DT boot.

If DT memory nodes are added for cold-plugged device memory,
that memory becomes visible to the Guest kernel via UEFI
GetMemoryMap() and is treated as early boot memory. Hence the
memory cannot be hot-unplugged, even if the Guest is booted
in ACPI mode.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 hw/arm/virt.c         | 23 +++++++++++++++++++++++
 include/hw/arm/virt.h |  1 +
 2 files changed, 24 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 13db0e9..b602151 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1717,6 +1717,20 @@ static void virt_set_highmem(Object *obj, bool value, Error **errp)
     vms->highmem = value;
 }
 
+static bool virt_get_fdt(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->use_fdt;
+}
+
+static void virt_set_fdt(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->use_fdt = value;
+}
+
 static bool virt_get_its(Object *obj, Error **errp)
 {
     VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -2005,6 +2019,15 @@ static void virt_instance_init(Object *obj)
     object_property_set_description(obj, "gic-version",
                                     "Set GIC version. "
                                     "Valid values are 2, 3 and host", NULL);
+    /* fdt is disabled by default */
+    vms->use_fdt = false;
+    object_property_add_bool(obj, "fdt", virt_get_fdt,
+                             virt_set_fdt, NULL);
+    object_property_set_description(obj, "fdt",
+                                    "Set on/off to enable/disable device tree "
+                                    "nodes in case any conflict with ACPI "
+                                    "(eg: device memory node)",
+                                    NULL);
 
     vms->highmem_ecam = !vmc->no_highmem_ecam;
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index c5e4c96..14b2e0a 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -119,6 +119,7 @@ typedef struct {
     bool highmem_ecam;
     bool its;
     bool virt;
+    bool use_fdt;
     int32_t gic_version;
     VirtIOMMUType iommu;
     struct arm_boot_info bootinfo;
-- 
2.7.4




* [Qemu-arm] [PATCH v3 08/10] hw/arm/boot: Expose the PC-DIMM nodes in the DT
  2019-03-21 10:47 [Qemu-arm] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support Shameer Kolothum
                   ` (6 preceding siblings ...)
  2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt" Shameer Kolothum
@ 2019-03-21 10:47 ` Shameer Kolothum
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 09/10] hw/acpi: Add ACPI Generic Event Device Support Shameer Kolothum
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 95+ messages in thread
From: Shameer Kolothum @ 2019-03-21 10:47 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

This patch adds DT memory nodes corresponding to PC-DIMM regions.
This is an opt-in feature and needs to be enabled ("fdt=on") when
the Guest is booted with DT.

The NVDIMM and ACPI_NVDIMM configs are not yet set for ARM, so we
don't need to care about NVDIMM at this stage.
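
For illustration only, the "reg" property of each /memory@<addr> node
is a sequence of big-endian 32-bit cells: acells cells of address
followed by scells cells of size. A minimal sketch of that encoding,
assuming hypothetical helper names (put_cells() and sketch_memory_reg()
are not QEMU functions), could look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: write 'val' as 'ncells' big-endian 32-bit cells
 * at 'out' and return the number of bytes written.
 */
static size_t put_cells(uint8_t *out, uint64_t val, uint32_t ncells)
{
    assert(ncells == 1 || ncells == 2);
    for (uint32_t c = 0; c < ncells; c++) {
        /* Most significant cell first. */
        uint32_t cell = (uint32_t)(val >> (32 * (ncells - 1 - c)));

        out[4 * c + 0] = (uint8_t)(cell >> 24);
        out[4 * c + 1] = (uint8_t)(cell >> 16);
        out[4 * c + 2] = (uint8_t)(cell >> 8);
        out[4 * c + 3] = (uint8_t)cell;
    }
    return 4 * (size_t)ncells;
}

/* Hypothetical helper: encode reg = <addr size> for one memory node. */
static size_t sketch_memory_reg(uint8_t *out, uint64_t addr, uint32_t acells,
                                uint64_t size, uint32_t scells)
{
    size_t n = put_cells(out, addr, acells);

    n += put_cells(out + n, size, scells);
    return n;
}
```

With acells = scells = 2, a 1 GiB DIMM at 0x100000000 encodes to 16
bytes: the address cells 0x00000001 0x00000000 followed by the size
cells 0x00000000 0x40000000.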

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
 hw/arm/boot.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index a830655..5b9a994 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -19,6 +19,7 @@
 #include "sysemu/numa.h"
 #include "hw/boards.h"
 #include "hw/loader.h"
+#include "hw/mem/memory-device.h"
 #include "elf.h"
 #include "sysemu/device_tree.h"
 #include "qemu/config-file.h"
@@ -522,6 +523,41 @@ static void fdt_add_psci_node(void *fdt)
     qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);
 }
 
+static int fdt_add_hotpluggable_memory_nodes(void *fdt,
+                                             uint32_t acells, uint32_t scells) {
+    MemoryDeviceInfoList *info, *info_list = qmp_memory_device_list();
+    MemoryDeviceInfo *mi;
+    int ret = 0;
+
+    for (info = info_list; info != NULL; info = info->next) {
+        mi = info->value;
+        switch (mi->type) {
+        case MEMORY_DEVICE_INFO_KIND_DIMM:
+        {
+            PCDIMMDeviceInfo *di = mi->u.dimm.data;
+
+            ret = fdt_add_memory_node(fdt, acells, di->addr,
+                                      scells, di->size, di->node);
+            if (ret) {
+                fprintf(stderr,
+                        "couldn't add PCDIMM /memory@%"PRIx64" node\n",
+                        di->addr);
+                goto out;
+            }
+            break;
+        }
+        default:
+            fprintf(stderr, "%s memory nodes are not yet supported\n",
+                    MemoryDeviceInfoKind_str(mi->type));
+            ret = -ENOENT;
+            goto out;
+        }
+    }
+out:
+    qapi_free_MemoryDeviceInfoList(info_list);
+    return ret;
+}
+
 int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
                  hwaddr addr_limit, AddressSpace *as)
 {
@@ -621,6 +657,14 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
         }
     }
 
+    if (object_property_get_bool(OBJECT(qdev_get_machine()), "fdt", NULL)) {
+        rc = fdt_add_hotpluggable_memory_nodes(fdt, acells, scells);
+        if (rc < 0) {
+            fprintf(stderr, "couldn't add hotpluggable memory nodes\n");
+            goto fail;
+        }
+    }
+
     rc = fdt_path_offset(fdt, "/chosen");
     if (rc < 0) {
         qemu_fdt_add_subnode(fdt, "/chosen");
-- 
2.7.4




* [Qemu-devel] [PATCH v3 09/10] hw/acpi: Add ACPI Generic Event Device Support
  2019-03-21 10:47 [Qemu-arm] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support Shameer Kolothum
                   ` (7 preceding siblings ...)
  2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 08/10] hw/arm/boot: Expose the PC-DIMM nodes in the DT Shameer Kolothum
@ 2019-03-21 10:47 ` Shameer Kolothum
  2019-03-29 13:09   ` [Qemu-arm] " Auger Eric
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 10/10] hw/arm/virt: Init GED device and enable memory hotplug Shameer Kolothum
  2019-03-21 11:06 ` [Qemu-arm] [Qemu-devel] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support no-reply
  10 siblings, 1 reply; 95+ messages in thread
From: Shameer Kolothum @ 2019-03-21 10:47 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

From: Samuel Ortiz <sameo@linux.intel.com>

The ACPI Generic Event Device (GED) is a device specific to
hardware-reduced ACPI platforms that handles all platform events,
including the hotplug ones. This patch generates the AML code that
defines GEDs.

Platforms need to specify their own GedEvent array to describe what
kind of events they want to support through GED. Also, this uses a
single interrupt for the GED device, relying on an IO memory region
to communicate the type of device affected by the interrupt. This
way, we can support up to 32 events with a single interrupt.

This is in preparation for making use of GED for the ARM/virt
platform and for now supports only memory hotplug.
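
The selector scheme can be modelled roughly as follows. This is a
standalone sketch, not the code in this patch: the SketchGed type and
the sketch_* functions are hypothetical, and locking is omitted. The
host stores a selector and pulses the interrupt; the guest's _EVT
method reads the 32-bit ISEL field (which clears it) and tests one bit
mask per event, which is where the limit of 32 events comes from.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SEL_MEM 0x1u     /* mirrors ACPI_GED_IRQ_SEL_MEM */

typedef struct {
    uint32_t sel;               /* pending selector value */
    unsigned irq_pulses;        /* stands in for qemu_irq_pulse() */
} SketchGed;

/* Host side: record which event fired, then raise the one interrupt. */
static void sketch_ged_event(SketchGed *g, uint32_t sel)
{
    assert(g);
    g->sel = sel;               /* the patch assigns the selector as well */
    g->irq_pulses++;
}

/* Guest read of ISEL: returns the selector and resets it to zero. */
static uint32_t sketch_ged_read(SketchGed *g)
{
    uint32_t val = g->sel;

    g->sel = 0;
    return val;
}

/*
 * Guest side, modelling _EVT: one independent "if" per known selector.
 * Counts matches in hits[] and returns how many handlers ran.
 */
static unsigned sketch_evt(SketchGed *g, const uint32_t *selectors,
                           unsigned n, unsigned *hits)
{
    uint32_t local0 = sketch_ged_read(g);
    unsigned ran = 0;

    for (unsigned i = 0; i < n; i++) {
        if ((local0 & selectors[i]) == selectors[i]) {
            hits[i]++;
            ran++;
        }
    }
    return ran;
}
```

Because the read resets the selector, a second _EVT pass without a new
event matches nothing.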

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 hw/acpi/generic_event_device.c         | 200 +++++++++++++++++++++++++++++++++
 include/hw/acpi/generic_event_device.h |  34 ++++++
 2 files changed, 234 insertions(+)

diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 0b32fc9..9deaa33 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -23,6 +23,183 @@
 #include "hw/acpi/generic_event_device.h"
 #include "hw/mem/pc-dimm.h"
 
+static hwaddr ged_io_base;
+static GedEvent *ged_events;
+static uint32_t ged_events_size;
+
+static Aml *ged_event_aml(const GedEvent *event)
+{
+
+    if (!event) {
+        return NULL;
+    }
+
+    switch (event->event) {
+    case GED_MEMORY_HOTPLUG:
+        /* We run a complete memory SCAN when getting a memory hotplug event */
+        return aml_call0(MEMORY_DEVICES_CONTAINER "." MEMORY_SLOT_SCAN_METHOD);
+    default:
+        break;
+    }
+
+    return NULL;
+}
+
+/*
+ * The ACPI Generic Event Device (GED) is a device specific to
+ * hardware-reduced ACPI platforms [ACPI v6.1 Section 5.6.9] that handles
+ * all platform events, including the hotplug ones. Platforms need to
+ * specify their own GedEvent array to describe what kind of events they
+ * want to support through GED. This routine uses a single interrupt for
+ * the GED device, relying on an IO memory region to communicate the type
+ * of device affected by the interrupt. This way, we can support up to
+ * 32 events with a single interrupt.
+ */
+void build_ged_aml(Aml *table, const char *name, uint32_t ged_irq,
+                   AmlRegionSpace rs)
+{
+    Aml *crs = aml_resource_template();
+    Aml *evt, *field;
+    Aml *dev = aml_device("%s", name);
+    Aml *irq_sel = aml_local(0);
+    Aml *isel = aml_name(AML_GED_IRQ_SEL);
+    uint32_t i;
+
+    if (!ged_io_base || !ged_events || !ged_events_size) {
+        return;
+    }
+
+    /* _CRS interrupt */
+    aml_append(crs, aml_interrupt(AML_CONSUMER, AML_EDGE, AML_ACTIVE_HIGH,
+                                  AML_EXCLUSIVE, &ged_irq, 1));
+    /*
+     * All GED events share the single interrupt added to _CRS above.
+     * For each GED event we add a conditional block to the _EVT method,
+     * keyed on the selector value read from ISEL.
+     * This is semantically equivalent to a switch/case implementation.
+     */
+    evt = aml_method("_EVT", 1, AML_SERIALIZED);
+    {
+        Aml *ged_aml;
+        Aml *if_ctx;
+
+        /* Local0 = ISEL */
+        aml_append(evt, aml_store(isel, irq_sel));
+
+        /*
+         * Here we want to call a method for each supported GED event type.
+         * The resulting ASL code looks like:
+         *
+         * Local0 = ISEL
+         * If ((Local0 & irq0) == irq0)
+         * {
+         *     MethodEvent0()
+         * }
+         *
+         * If ((Local0 & irq1) == irq1)
+         * {
+         *     MethodEvent1()
+         * }
+         *
+         * If ((Local0 & irq2) == irq2)
+         * {
+         *     MethodEvent2()
+         * }
+         */
+
+        for (i = 0; i < ged_events_size; i++) {
+            ged_aml = ged_event_aml(&ged_events[i]);
+            if (!ged_aml) {
+                continue;
+            }
+
+            /* If ((Local0 & selector) == selector) */
+            if_ctx = aml_if(aml_equal(aml_and(irq_sel, aml_int(ged_events[i].selector), NULL), aml_int(ged_events[i].selector)));
+            {
+                /* AML for this specific type of event */
+                aml_append(if_ctx, ged_aml);
+            }
+
+            /*
+             * Append this "if" block to the _EVT method. The blocks are
+             * independent of each other; there is no "else" chaining.
+             */
+            aml_append(evt, if_ctx);
+        }
+    }
+
+    aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0013")));
+    aml_append(dev, aml_name_decl("_UID", aml_string(GED_DEVICE)));
+    aml_append(dev, aml_name_decl("_CRS", crs));
+
+    /* Append IO region */
+    aml_append(dev, aml_operation_region(AML_GED_IRQ_REG, rs,
+               aml_int(ged_io_base + ACPI_GED_IRQ_SEL_OFFSET),
+               ACPI_GED_IRQ_SEL_LEN));
+    field = aml_field(AML_GED_IRQ_REG, AML_DWORD_ACC, AML_NOLOCK,
+                      AML_WRITE_AS_ZEROS);
+    aml_append(field, aml_named_field(AML_GED_IRQ_SEL,
+                                      ACPI_GED_IRQ_SEL_LEN * 8));
+    aml_append(dev, field);
+
+    /* Append _EVT method */
+    aml_append(dev, evt);
+
+    aml_append(table, dev);
+}
+
+/* Memory read by the GED _EVT AML dynamic method */
+static uint64_t ged_read(void *opaque, hwaddr addr, unsigned size)
+{
+    uint64_t val = 0;
+    GEDState *ged_st = opaque;
+
+    switch (addr) {
+    case ACPI_GED_IRQ_SEL_OFFSET:
+        /* Read the selector value and reset it */
+        qemu_mutex_lock(&ged_st->lock);
+        val = ged_st->sel;
+        ged_st->sel = 0;
+        qemu_mutex_unlock(&ged_st->lock);
+        break;
+    default:
+        break;
+    }
+
+    return val;
+}
+
+/* Nothing is expected to be written to the GED memory region */
+static void ged_write(void *opaque, hwaddr addr, uint64_t data,
+                      unsigned int size)
+{
+}
+
+static const MemoryRegionOps ged_ops = {
+    .read = ged_read,
+    .write = ged_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+    },
+};
+
+static void acpi_ged_event(GEDState *ged_st, uint32_t ged_irq_sel)
+{
+    /*
+     * Set the GED IRQ selector to the expected device type value. This
+     * way, the ACPI method will be able to trigger the right code based
+     * on a unique IRQ.
+     */
+    qemu_mutex_lock(&ged_st->lock);
+    ged_st->sel = ged_irq_sel;
+    qemu_mutex_unlock(&ged_st->lock);
+
+    /* Trigger the event by sending an interrupt to the guest. */
+    qemu_irq_pulse(ged_st->gsi[ged_st->irq]);
+}
+
 static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
                                 DeviceState *dev, Error **errp)
 {
@@ -40,6 +217,21 @@ static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
 
 static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
 {
+    VirtAcpiState *s = VIRT_ACPI(adev);
+    uint32_t sel;
+
+    if (ev & ACPI_MEMORY_HOTPLUG_STATUS) {
+        sel = ACPI_GED_IRQ_SEL_MEM;
+    } else {
+        /* Unknown event. Return without generating interrupt. */
+        return;
+    }
+
+    /*
+     * We inject the hotplug interrupt. The IRQ selector lets the ACPI
+     * _EVT method tell which event was raised.
+     */
+    acpi_ged_event(&s->ged_state, sel);
 }
 
 static void virt_device_realize(DeviceState *dev, Error **errp)
@@ -57,6 +249,11 @@ static Property virt_acpi_properties[] = {
     DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base, 0),
     DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
                      memhp_state.is_enabled, true),
+    DEFINE_PROP_PTR("gsi", VirtAcpiState, gsi),
+    DEFINE_PROP_UINT64("ged_base", VirtAcpiState, ged_base, 0),
+    DEFINE_PROP_UINT32("ged_irq", VirtAcpiState, ged_irq, 0),
+    DEFINE_PROP_PTR("ged_events", VirtAcpiState, ged_events),
+    DEFINE_PROP_UINT32("ged_events_size", VirtAcpiState, ged_events_size, 0),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -70,6 +267,9 @@ static void virt_acpi_class_init(ObjectClass *class, void *data)
     dc->props = virt_acpi_properties;
     dc->realize = virt_device_realize;
 
+    /* Reason: pointer properties "gsi" and "ged_events" */
+    dc->user_creatable = false;
+
     hc->plug = virt_device_plug_cb;
 
     adevc->send_event = virt_send_ged;
diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
index 262ca7d..7f130f3 100644
--- a/include/hw/acpi/generic_event_device.h
+++ b/include/hw/acpi/generic_event_device.h
@@ -24,11 +24,45 @@
 #define VIRT_ACPI(obj) \
     OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
 
+#define ACPI_GED_IRQ_SEL_OFFSET 0x0
+#define ACPI_GED_IRQ_SEL_LEN    0x4
+#define ACPI_GED_IRQ_SEL_MEM    0x1
+#define ACPI_GED_REG_LEN        0x4
+
+#define GED_DEVICE      "GED"
+#define AML_GED_IRQ_REG "IREG"
+#define AML_GED_IRQ_SEL "ISEL"
+
+typedef enum {
+    GED_MEMORY_HOTPLUG = 1,
+} GedEventType;
+
+typedef struct GedEvent {
+    uint32_t     selector;
+    GedEventType event;
+} GedEvent;
+
+typedef struct GEDState {
+    MemoryRegion io;
+    uint32_t     sel;
+    uint32_t     irq;
+    qemu_irq     *gsi;
+    QemuMutex    lock;
+} GEDState;
+
 typedef struct VirtAcpiState {
     SysBusDevice parent_obj;
     MemHotplugState memhp_state;
     hwaddr memhp_base;
+    void *gsi;
+    hwaddr ged_base;
+    GEDState ged_state;
+    uint32_t ged_irq;
+    void *ged_events;
+    uint32_t ged_events_size;
 } VirtAcpiState;
 
+void build_ged_aml(Aml *table, const char *name, uint32_t ged_irq,
+                   AmlRegionSpace rs);
 
 #endif
-- 
2.7.4




* [Qemu-devel] [PATCH v3 10/10] hw/arm/virt: Init GED device and enable memory hotplug
  2019-03-21 10:47 [Qemu-arm] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support Shameer Kolothum
                   ` (8 preceding siblings ...)
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 09/10] hw/acpi: Add ACPI Generic Event Device Support Shameer Kolothum
@ 2019-03-21 10:47 ` Shameer Kolothum
  2019-03-29 14:16   ` [Qemu-arm] " Auger Eric
  2019-03-21 11:06 ` [Qemu-arm] [Qemu-devel] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support no-reply
  10 siblings, 1 reply; 95+ messages in thread
From: Shameer Kolothum @ 2019-03-21 10:47 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

This initializes the GED device with its base address and IRQ,
configures the GED memory hotplug event and builds the
corresponding AML code.

GED IRQ routing to the Guest is also enabled. Memory hotplug
should now work.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 hw/acpi/generic_event_device.c | 18 ++++++++++++++++++
 hw/arm/virt-acpi-build.c       |  9 +++++++++
 hw/arm/virt.c                  | 30 +++++++++++++++++++++++++-----
 include/hw/arm/virt.h          |  2 ++
 4 files changed, 54 insertions(+), 5 deletions(-)

diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 9deaa33..02d5e66 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -200,6 +200,23 @@ static void acpi_ged_event(GEDState *ged_st, uint32_t ged_irq_sel)
     qemu_irq_pulse(ged_st->gsi[ged_st->irq]);
 }
 
+static void acpi_ged_init(MemoryRegion *as, DeviceState *dev, GEDState *ged_st)
+{
+    VirtAcpiState *s = VIRT_ACPI(dev);
+
+    assert(!ged_io_base && !ged_events && !ged_events_size);
+
+    ged_io_base = s->ged_base;
+    ged_events = s->ged_events;
+    ged_events_size = s->ged_events_size;
+    ged_st->irq = s->ged_irq;
+    ged_st->gsi = s->gsi;
+    qemu_mutex_init(&ged_st->lock);
+    memory_region_init_io(&ged_st->io, OBJECT(dev), &ged_ops, ged_st,
+                          "acpi-ged-event", ACPI_GED_REG_LEN);
+    memory_region_add_subregion(as, ged_io_base, &ged_st->io);
+}
+
 static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
                                 DeviceState *dev, Error **errp)
 {
@@ -242,6 +259,7 @@ static void virt_device_realize(DeviceState *dev, Error **errp)
         acpi_memory_hotplug_init(get_system_memory(), OBJECT(dev),
                                  &s->memhp_state,
                                  s->memhp_base);
+        acpi_ged_init(get_system_memory(), dev, &s->ged_state);
     }
 }
 
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 1887531..116e9c9 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -41,6 +41,7 @@
 #include "hw/hw.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/memory_hotplug.h"
+#include "hw/acpi/generic_event_device.h"
 #include "hw/pci/pcie_host.h"
 #include "hw/pci/pci.h"
 #include "hw/arm/virt.h"
@@ -50,6 +51,13 @@
 #define ARM_SPI_BASE 32
 #define ACPI_POWER_BUTTON_DEVICE "PWRB"
 
+static void acpi_dsdt_add_ged(Aml *scope, VirtMachineState *vms)
+{
+    int irq = vms->irqmap[VIRT_ACPI_GED] + ARM_SPI_BASE;
+
+    build_ged_aml(scope, "\\_SB."GED_DEVICE, irq, AML_SYSTEM_MEMORY);
+}
+
 static void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState *ms)
 {
     uint32_t nr_mem = ms->ram_slots;
@@ -758,6 +766,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
      */
     scope = aml_scope("\\_SB");
     acpi_dsdt_add_memory_hotplug(scope, MACHINE(vms));
+    acpi_dsdt_add_ged(scope, vms);
     acpi_dsdt_add_cpus(scope, vms->smp_cpus);
     acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
                        (irqmap[VIRT_UART] + ARM_SPI_BASE));
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b602151..e3f8aa7 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -63,6 +63,7 @@
 #include "target/arm/internals.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/acpi/generic_event_device.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
     static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -134,6 +135,7 @@ static const MemMapEntry base_memmap[] = {
     [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
     [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
     [VIRT_PCDIMM_ACPI] =        { 0x09070000, 0x00010000 },
+    [VIRT_ACPI_GED] =           { 0x09080000, 0x00010000 },
     [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
     /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
     [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
@@ -169,6 +171,7 @@ static const int a15irqmap[] = {
     [VIRT_PCIE] = 3, /* ... to 6 */
     [VIRT_GPIO] = 7,
     [VIRT_SECURE_UART] = 8,
+    [VIRT_ACPI_GED] = 9,
     [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
     [VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
     [VIRT_SMMU] = 74,    /* ...to 74 + NUM_SMMU_IRQS - 1 */
@@ -184,6 +187,13 @@ static const char *valid_cpus[] = {
     ARM_CPU_TYPE_NAME("max"),
 };
 
+static GedEvent ged_events[] = {
+    {
+        .selector = ACPI_GED_IRQ_SEL_MEM,
+        .event    = GED_MEMORY_HOTPLUG,
+    },
+};
+
 static bool cpu_type_valid(const char *cpu)
 {
     int i;
@@ -524,6 +534,11 @@ static DeviceState *create_virt_acpi(VirtMachineState *vms)
     dev = qdev_create(NULL, "virt-acpi");
     qdev_prop_set_uint64(dev, "memhp_base",
                          vms->memmap[VIRT_PCDIMM_ACPI].base);
+    qdev_prop_set_ptr(dev, "gsi", vms->gsi);
+    qdev_prop_set_uint64(dev, "ged_base", vms->memmap[VIRT_ACPI_GED].base);
+    qdev_prop_set_uint32(dev, "ged_irq", vms->irqmap[VIRT_ACPI_GED]);
+    qdev_prop_set_ptr(dev, "ged_events", ged_events);
+    qdev_prop_set_uint32(dev, "ged_events_size", ARRAY_SIZE(ged_events));
     qdev_init_nofail(dev);
 
     return dev;
@@ -568,6 +583,12 @@ static void create_v2m(VirtMachineState *vms, qemu_irq *pic)
     fdt_add_v2m_gic_node(vms);
 }
 
+static void virt_gsi_handler(void *opaque, int n, int level)
+{
+    qemu_irq *gic_irq = opaque;
+    qemu_set_irq(gic_irq[n], level);
+}
+
 static void create_gic(VirtMachineState *vms, qemu_irq *pic)
 {
     /* We create a standalone GIC */
@@ -683,6 +704,8 @@ static void create_gic(VirtMachineState *vms, qemu_irq *pic)
         pic[i] = qdev_get_gpio_in(gicdev, i);
     }
 
+    vms->gsi = qemu_allocate_irqs(virt_gsi_handler, pic, NUM_IRQS);
+
     fdt_add_gic_node(vms);
 
     if (type == 3 && vms->its) {
@@ -1431,7 +1454,7 @@ static void machvirt_init(MachineState *machine)
     VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(machine);
     MachineClass *mc = MACHINE_GET_CLASS(machine);
     const CPUArchIdList *possible_cpus;
-    qemu_irq pic[NUM_IRQS];
+    qemu_irq *pic;
     MemoryRegion *sysmem = get_system_memory();
     MemoryRegion *secure_sysmem = NULL;
     int n, virt_max_cpus;
@@ -1627,6 +1650,7 @@ static void machvirt_init(MachineState *machine)
 
     create_flash(vms, sysmem, secure_sysmem ? secure_sysmem : sysmem);
 
+    pic = g_new0(qemu_irq, NUM_IRQS);
     create_gic(vms, pic);
 
     fdt_add_pmu_nodes(vms);
@@ -1842,10 +1866,6 @@ static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
 {
     const bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
 
-    if (dev->hotplugged) {
-        error_setg(errp, "memory hotplug is not supported");
-    }
-
     if (is_nvdimm) {
         error_setg(errp, "nvdimm is not yet supported");
         return;
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 14b2e0a..850296a 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -78,6 +78,7 @@ enum {
     VIRT_SECURE_UART,
     VIRT_SECURE_MEM,
     VIRT_PCDIMM_ACPI,
+    VIRT_ACPI_GED,
     VIRT_LOWMEMMAP_LAST,
 };
 
@@ -135,6 +136,7 @@ typedef struct {
     int psci_conduit;
     hwaddr highest_gpa;
     DeviceState *acpi;
+    qemu_irq *gsi;
 } VirtMachineState;
 
 #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
-- 
2.7.4




* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support
  2019-03-21 10:47 [Qemu-arm] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support Shameer Kolothum
                   ` (9 preceding siblings ...)
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 10/10] hw/arm/virt: Init GED device and enable memory hotplug Shameer Kolothum
@ 2019-03-21 11:06 ` no-reply
  10 siblings, 0 replies; 95+ messages in thread
From: no-reply @ 2019-03-21 11:06 UTC (permalink / raw)
  To: shameerali.kolothum.thodi
  Cc: fam, peter.maydell, sameo, shannon.zhaosl, qemu-devel, xuwei5,
	linuxarm, eric.auger, qemu-arm, imammedo, sebastien.boeuf

Patchew URL: https://patchew.org/QEMU/20190321104745.28068-1-shameerali.kolothum.thodi@huawei.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20190321104745.28068-1-shameerali.kolothum.thodi@huawei.com
Subject: [Qemu-devel] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
   62a172e6a7..6532dcebb6  master     -> master
 t [tag update]            patchew/1553090847-11300-1-git-send-email-lizhengui@huawei.com -> patchew/1553090847-11300-1-git-send-email-lizhengui@huawei.com
 t [tag update]            patchew/20190313162812.8885-1-armbru@redhat.com -> patchew/20190313162812.8885-1-armbru@redhat.com
 t [tag update]            patchew/20190314131049.23175-1-marcandre.lureau@redhat.com -> patchew/20190314131049.23175-1-marcandre.lureau@redhat.com
 t [tag update]            patchew/20190319163551.32499-1-armbru@redhat.com -> patchew/20190319163551.32499-1-armbru@redhat.com
 * [new tag]               patchew/20190321085212.10796-1-lvivier@redhat.com -> patchew/20190321085212.10796-1-lvivier@redhat.com
 * [new tag]               patchew/20190321094012.36541-1-sgarzare@redhat.com -> patchew/20190321094012.36541-1-sgarzare@redhat.com
 * [new tag]               patchew/20190321104745.28068-1-shameerali.kolothum.thodi@huawei.com -> patchew/20190321104745.28068-1-shameerali.kolothum.thodi@huawei.com
Switched to a new branch 'test'
125f46770b hw/arm/virt: Init GED device and enable memory hotplug
8b3a542021 hw/acpi: Add ACPI Generic Event Device Support
bbb60b29a5 hw/arm/boot: Expose the PC-DIMM nodes in the DT
8617e42573 hw/arm/virt: Introduce opt-in feature "fdt"
9cd5da671c hw/arm/virt-acpi-build: Add PC-DIMM in SRAT
b25387b9d3 hw/arm/virt: Add ACPI support for device memory cold-plug
ab6d232c4d hw/arm/virt: Add memory hotplug framework
ea71a92426 hw/arm/virt: Add virtual ACPI device
f093e1aaf2 hw/acpi: Do not create memory hotplug method when handler is not defined
37eb6a4ab4 hw/acpi: Make ACPI IO address space configurable

=== OUTPUT BEGIN ===
1/10 Checking commit 37eb6a4ab493 (hw/acpi: Make ACPI IO address space configurable)
2/10 Checking commit f093e1aaf21d (hw/acpi: Do not create memory hotplug method when handler is not defined)
WARNING: line over 80 characters
#31: FILE: hw/acpi/memory_hotplug.c:726:
+                   aml_call0(MEMORY_DEVICES_CONTAINER "." MEMORY_SLOT_SCAN_METHOD));

total: 0 errors, 1 warnings, 16 lines checked

Patch 2/10 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
3/10 Checking commit ea71a92426cc (hw/arm/virt: Add virtual ACPI device)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#42: 
new file mode 100644

total: 0 errors, 1 warnings, 115 lines checked

Patch 3/10 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
4/10 Checking commit ab6d232c4d16 (hw/arm/virt: Add memory hotplug framework)
5/10 Checking commit b25387b9d3e5 (hw/arm/virt: Add ACPI support for device memory cold-plug)
6/10 Checking commit 9cd5da671c49 (hw/arm/virt-acpi-build: Add PC-DIMM in SRAT)
7/10 Checking commit 8617e4257385 (hw/arm/virt: Introduce opt-in feature "fdt")
8/10 Checking commit bbb60b29a575 (hw/arm/boot: Expose the PC-DIMM nodes in the DT)
9/10 Checking commit 8b3a54202156 (hw/acpi: Add ACPI Generic Event Device Support)
ERROR: line over 90 characters
#124: FILE: hw/acpi/generic_event_device.c:117:
+            if_ctx = aml_if(aml_equal(aml_and(irq_sel, aml_int(ged_events[i].selector), NULL), aml_int(ged_events[i].selector)));

total: 1 errors, 0 warnings, 269 lines checked

Patch 9/10 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

10/10 Checking commit 125f46770b37 (hw/arm/virt: Init GED device and enable memory hotplug)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20190321104745.28068-1-shameerali.kolothum.thodi@huawei.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [Qemu-arm] [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device
  2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device Shameer Kolothum
@ 2019-03-28 14:14   ` Auger Eric
  2019-03-29 11:22     ` Shameerali Kolothum Thodi
  0 siblings, 1 reply; 95+ messages in thread
From: Auger Eric @ 2019-03-28 14:14 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-devel, qemu-arm, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

Hi Shameer,

On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> From: Samuel Ortiz <sameo@linux.intel.com>
> 
> This adds the skeleton to support an acpi device interface
> for HW-reduced acpi platforms via ACPI GED - Generic Event
> Device (ACPI v6.1 5.6.9).
> 
> This will be used by Arm/Virt to add hotplug support.
> 
> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  hw/acpi/Kconfig                        |  4 ++
>  hw/acpi/Makefile.objs                  |  1 +
>  hw/acpi/generic_event_device.c         | 72 ++++++++++++++++++++++++++++++++++
>  include/hw/acpi/generic_event_device.h | 29 ++++++++++++++
>  4 files changed, 106 insertions(+)
>  create mode 100644 hw/acpi/generic_event_device.c
>  create mode 100644 include/hw/acpi/generic_event_device.h
> 
> diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
> index eca3bee..01a8b41 100644
> --- a/hw/acpi/Kconfig
> +++ b/hw/acpi/Kconfig
> @@ -27,3 +27,7 @@ config ACPI_VMGENID
>      bool
>      default y
>      depends on PC
> +
> +config ACPI_HW_REDUCED
> +    bool
> +    depends on ACPI
> diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
> index 2d46e37..b753232 100644
> --- a/hw/acpi/Makefile.objs
> +++ b/hw/acpi/Makefile.objs
> @@ -6,6 +6,7 @@ common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o
>  common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
>  common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
>  common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
> +common-obj-$(CONFIG_ACPI_HW_REDUCED) += generic_event_device.o
>  common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
>  
>  common-obj-y += acpi_interface.o
> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> new file mode 100644
> index 0000000..b21a551
> --- /dev/null
> +++ b/hw/acpi/generic_event_device.c
> @@ -0,0 +1,72 @@
> +/*
> + *
> + * Copyright (c) 2018 Intel Corporation
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2 or later, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/sysbus.h"
> +#include "hw/acpi/acpi.h"
> +#include "hw/acpi/generic_event_device.h"
The files are named generic_event_device.c/h while the device is named
"virt-acpi". I would suggest using the same naming as in nemu, i.e. ged
or acpi_ged.

I think you should clarify the exact scope of this device. The patch
title suggests it is bound to be used only in machvirt (as does the
virt prefix used in numerous functions). Is it also meant to be used by
other architectures?
> +
> +static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> +                                DeviceState *dev, Error **errp)
> +{
> +}
> +
> +static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> +{
> +}
> +
> +static void virt_device_realize(DeviceState *dev, Error **errp)
> +{
> +}
> +
> +static Property virt_acpi_properties[] = {
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void virt_acpi_class_init(ObjectClass *class, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(class);
> +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(class);
> +    AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_CLASS(class);
> +
> +    dc->desc = "ACPI";
> +    dc->props = virt_acpi_properties;
> +    dc->realize = virt_device_realize;
> +
> +    hc->plug = virt_device_plug_cb;
> +
> +    adevc->send_event = virt_send_ged;
> +}
> +
> +static const TypeInfo virt_acpi_info = {
> +    .name          = TYPE_VIRT_ACPI,
> +    .parent        = TYPE_SYS_BUS_DEVICE,
> +    .instance_size = sizeof(VirtAcpiState),
> +    .class_init    = virt_acpi_class_init,
> +    .interfaces = (InterfaceInfo[]) {
> +        { TYPE_HOTPLUG_HANDLER },
> +        { TYPE_ACPI_DEVICE_IF },
> +        { }
> +    }
> +};
> +
> +static void virt_acpi_register_types(void)
> +{
> +    type_register_static(&virt_acpi_info);
> +}
> +
> +type_init(virt_acpi_register_types)
> diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
> new file mode 100644
> index 0000000..f314515
> --- /dev/null
> +++ b/include/hw/acpi/generic_event_device.h
> @@ -0,0 +1,29 @@
> +/*
> + *
> + * Copyright (c) 2018 Intel Corporation
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2 or later, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
Add a comment in the header introducing the role of this device?
Link to the GED spec? Explain the subset of the interfaces being
implemented by the device.
> +
> +#ifndef HW_ACPI_GED_H
> +#define HW_ACPI_GED_H
> +
> +#define TYPE_VIRT_ACPI "virt-acpi"
> +#define VIRT_ACPI(obj) \
> +    OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> +
> +typedef struct VirtAcpiState {
> +    SysBusDevice parent_obj;
> +} VirtAcpiState;
> +
> +#endif
> 


* Re: [Qemu-arm] [PATCH v3 02/10] hw/acpi: Do not create memory hotplug method when handler is not defined
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 02/10] hw/acpi: Do not create memory hotplug method when handler is not defined Shameer Kolothum
@ 2019-03-28 14:14   ` Auger Eric
  0 siblings, 0 replies; 95+ messages in thread
From: Auger Eric @ 2019-03-28 14:14 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-devel, qemu-arm, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

Hi Shameer,
On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> From: Samuel Ortiz <sameo@linux.intel.com>
> 
> With Hardware-reduced ACPI, the GED device will manage ACPI
s/Hardware-reduced/hardware-reduced
> hotplug entirely. As a consequence, make the memory specific
> events AML generation optional The code will only be added
> when the method name is not NULL.
> 
> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric
> ---
>  hw/acpi/memory_hotplug.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
> index 80e25f0..98407e3 100644
> --- a/hw/acpi/memory_hotplug.c
> +++ b/hw/acpi/memory_hotplug.c
> @@ -720,10 +720,12 @@ void build_memory_hotplug_aml(Aml *table, uint32_t nr_mem,
>      }
>      aml_append(table, dev_container);
>  
> -    method = aml_method(event_handler_method, 0, AML_NOTSERIALIZED);
> -    aml_append(method,
> -        aml_call0(MEMORY_DEVICES_CONTAINER "." MEMORY_SLOT_SCAN_METHOD));
> -    aml_append(table, method);
> +    if (event_handler_method) {
> +        method = aml_method(event_handler_method, 0, AML_NOTSERIALIZED);
> +        aml_append(method,
> +                   aml_call0(MEMORY_DEVICES_CONTAINER "." MEMORY_SLOT_SCAN_METHOD));
> +        aml_append(table, method);
> +    }
>  
>      g_free(mhp_res_path);
>  }
> 


* Re: [Qemu-arm] [PATCH v3 04/10] hw/arm/virt: Add memory hotplug framework
  2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 04/10] hw/arm/virt: Add memory hotplug framework Shameer Kolothum
@ 2019-03-28 15:37   ` Auger Eric
  2019-03-29 12:03     ` Shameerali Kolothum Thodi
  0 siblings, 1 reply; 95+ messages in thread
From: Auger Eric @ 2019-03-28 15:37 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-devel, qemu-arm, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

Hi Shameer,

On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> From: Eric Auger <eric.auger@redhat.com>
> 
> This patch adds the the memory hot-plug/hot-unplug infrastructure
s/the the/the (sorry my fault)
> in machvirt. It is still not enabled as device memory is not yet
> reported to guest.
s/ reported to guest / exposed to the guest either through DT or ACPI.
Note even cold-plug would not work yet.


> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  default-configs/arm-softmmu.mak |  3 +++
>  hw/arm/virt.c                   | 53 ++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 55 insertions(+), 1 deletion(-)
> 
> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> index 2a7efc1..795cb89 100644
> --- a/default-configs/arm-softmmu.mak
> +++ b/default-configs/arm-softmmu.mak
> @@ -159,3 +159,6 @@ CONFIG_MUSICPAL=y
>  
>  # for realview and versatilepb
>  CONFIG_LSI_SCSI_PCI=y
> +
> +CONFIG_MEM_DEVICE=y
> +CONFIG_DIMM=y
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index ce2664a..d0ff20d 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -61,6 +61,8 @@
>  #include "hw/arm/smmuv3.h"
>  #include "hw/acpi/acpi.h"
>  #include "target/arm/internals.h"
> +#include "hw/mem/pc-dimm.h"
> +#include "hw/mem/nvdimm.h"
>  
>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
> @@ -1806,6 +1808,42 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>      return ms->possible_cpus;
>  }
>  
> +static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                                 Error **errp)
> +{
> +    const bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
> +
> +    if (dev->hotplugged) {
> +        error_setg(errp, "memory hotplug is not supported");
> +    }
> +
> +    if (is_nvdimm) {
> +        error_setg(errp, "nvdimm is not yet supported");
> +        return;
> +    }
Maybe this patch should be moved to the end, since from this patch
onwards it becomes possible to instantiate a DIMM slot (cold-plugged on
the qemu command line) that won't work properly. Or should we simply
disable cold-plug for PCDIMM as well for the moment?
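
A minimal standalone sketch of the second option (reject every memory
device until device memory is actually exposed to the guest). This is not
QEMU code: `MemDev`, `guest_sees_device_memory` and
`memory_pre_plug_check()` are hypothetical names assumed only for
illustration. Each rejection returns immediately, so at most one reason
is ever reported.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device description; not a QEMU type. */
typedef struct MemDev {
    bool hotplugged;   /* added at runtime rather than on the command line */
    bool is_nvdimm;    /* NVDIMM rather than a plain PC-DIMM */
} MemDev;

/*
 * Assumption: a single flag saying whether the machine already describes
 * device memory to the guest (through DT or ACPI).
 */
static bool guest_sees_device_memory = false;

/*
 * Returns NULL when the device may be plugged, or a reason string when it
 * is rejected. Returning right after each failed check guarantees that
 * only one reason is produced per call.
 */
static const char *memory_pre_plug_check(const MemDev *dev)
{
    if (!guest_sees_device_memory) {
        return "device memory is not yet exposed to the guest";
    }
    if (dev->hotplugged) {
        return "memory hotplug is not supported";
    }
    if (dev->is_nvdimm) {
        return "nvdimm is not yet supported";
    }
    return NULL;
}
```

With such a gate in place, the patch could stay where it is in the
series and the flag would be flipped by the patch that adds the DT or
ACPI description.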

Thanks

Eric
> +
> +    pc_dimm_pre_plug(PC_DIMM(dev), MACHINE(hotplug_dev), NULL, errp);
> +}
> +
> +static void virt_memory_plug(HotplugHandler *hotplug_dev,
> +                             DeviceState *dev, Error **errp)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> +    Error *local_err = NULL;
> +
> +    pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> +
> +    error_propagate(errp, local_err);
> +}
> +
> +static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
> +                                            DeviceState *dev, Error **errp)
> +{
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> +        virt_memory_pre_plug(hotplug_dev, dev, errp);
> +    }
> +}
> +
>  static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
>                                          DeviceState *dev, Error **errp)
>  {
> @@ -1817,12 +1855,23 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
>                                       SYS_BUS_DEVICE(dev));
>          }
>      }
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> +        virt_memory_plug(hotplug_dev, dev, errp);
> +    }
> +}
> +
> +static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
> +                                          DeviceState *dev, Error **errp)
> +{
> +    error_setg(errp, "device unplug request for unsupported device"
> +               " type: %s", object_get_typename(OBJECT(dev)));
>  }
>  
>  static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
>                                                          DeviceState *dev)
>  {
> -    if (object_dynamic_cast(OBJECT(dev), TYPE_SYS_BUS_DEVICE)) {
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_SYS_BUS_DEVICE) ||
> +       (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM))) {
>          return HOTPLUG_HANDLER(machine);
>      }
>  
> @@ -1886,7 +1935,9 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
>      mc->kvm_type = virt_kvm_type;
>      assert(!mc->get_hotplug_handler);
>      mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
> +    hc->pre_plug = virt_machine_device_pre_plug_cb;
>      hc->plug = virt_machine_device_plug_cb;
> +    hc->unplug_request = virt_machine_device_unplug_request_cb;
>  }
>  
>  static void virt_instance_init(Object *obj)
> 


* Re: [Qemu-arm] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug Shameer Kolothum
@ 2019-03-29  9:31   ` Auger Eric
  2019-03-29 10:54     ` Shameerali Kolothum Thodi
  2019-04-01 13:43       ` [Qemu-devel] " Igor Mammedov
  2019-04-01 13:34     ` [Qemu-devel] " Igor Mammedov
  1 sibling, 2 replies; 95+ messages in thread
From: Auger Eric @ 2019-03-29  9:31 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-devel, qemu-arm, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

Hi Shameer,

On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> This adds support to build the aml code so that Guest(ACPI boot)
> can see the cold-plugged device memory. Memory cold plug support
> with DT boot is not yet enabled.
> 
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  default-configs/arm-softmmu.mak        |  2 ++
>  hw/acpi/generic_event_device.c         | 23 +++++++++++++++++++++++
>  hw/arm/virt-acpi-build.c               |  9 +++++++++
>  hw/arm/virt.c                          | 23 +++++++++++++++++++++++
>  include/hw/acpi/generic_event_device.h |  5 +++++
>  include/hw/arm/virt.h                  |  2 ++
>  6 files changed, 64 insertions(+)
> 
> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> index 795cb89..6db444e 100644
> --- a/default-configs/arm-softmmu.mak
> +++ b/default-configs/arm-softmmu.mak
> @@ -162,3 +162,5 @@ CONFIG_LSI_SCSI_PCI=y
>  
>  CONFIG_MEM_DEVICE=y
>  CONFIG_DIMM=y
> +CONFIG_ACPI_MEMORY_HOTPLUG=y
> +CONFIG_ACPI_HW_REDUCED=y
> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> index b21a551..0b32fc9 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -16,13 +16,26 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "exec/address-spaces.h"
>  #include "hw/sysbus.h"
>  #include "hw/acpi/acpi.h"
>  #include "hw/acpi/generic_event_device.h"
> +#include "hw/mem/pc-dimm.h"
>  
>  static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
>                                  DeviceState *dev, Error **errp)
>  {
> +    VirtAcpiState *s = VIRT_ACPI(hotplug_dev);
> +
> +    if (s->memhp_state.is_enabled &&
> +        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> +            acpi_memory_plug_cb(hotplug_dev, &s->memhp_state,
> +                                dev, errp);
> +    } else {
> +        error_setg(errp, "virt: device plug request for unsupported device"
> +                   " type: %s", object_get_typename(OBJECT(dev)));
> +    }
>  }
>  
>  static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> @@ -31,9 +44,19 @@ static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
>  
>  static void virt_device_realize(DeviceState *dev, Error **errp)
>  {
> +    VirtAcpiState *s = VIRT_ACPI(dev);
> +
> +    if (s->memhp_state.is_enabled) {
> +        acpi_memory_hotplug_init(get_system_memory(), OBJECT(dev),
> +                                 &s->memhp_state,
> +                                 s->memhp_base);
> +    }
>  }
>  
>  static Property virt_acpi_properties[] = {
> +    DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base, 0),
> +    DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
> +                     memhp_state.is_enabled, true),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index bf9c0bc..20d3c83 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -40,6 +40,7 @@
>  #include "hw/loader.h"
>  #include "hw/hw.h"
>  #include "hw/acpi/aml-build.h"
> +#include "hw/acpi/memory_hotplug.h"
>  #include "hw/pci/pcie_host.h"
>  #include "hw/pci/pci.h"
>  #include "hw/arm/virt.h"
> @@ -49,6 +50,13 @@
>  #define ARM_SPI_BASE 32
>  #define ACPI_POWER_BUTTON_DEVICE "PWRB"
>  
> +static void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState *ms)
> +{
> +    uint32_t nr_mem = ms->ram_slots;
> +
> +    build_memory_hotplug_aml(scope, nr_mem, "\\_SB", NULL, AML_SYSTEM_MEMORY);
> +}
> +
>  static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
>  {
>      uint16_t i;
> @@ -740,6 +748,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>       * the RTC ACPI device at all when using UEFI.
>       */
>      scope = aml_scope("\\_SB");
> +    acpi_dsdt_add_memory_hotplug(scope, MACHINE(vms));
>      acpi_dsdt_add_cpus(scope, vms->smp_cpus);
>      acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
>                         (irqmap[VIRT_UART] + ARM_SPI_BASE));
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index d0ff20d..13db0e9 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -133,6 +133,7 @@ static const MemMapEntry base_memmap[] = {
>      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
>      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
>      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
> +    [VIRT_PCDIMM_ACPI] =        { 0x09070000, 0x00010000 },
>      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
>      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
>      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> @@ -516,6 +517,18 @@ static void fdt_add_pmu_nodes(const VirtMachineState *vms)
>      }
>  }
>  
> +static DeviceState *create_virt_acpi(VirtMachineState *vms)
> +{
> +    DeviceState *dev;
> +
> +    dev = qdev_create(NULL, "virt-acpi");
> +    qdev_prop_set_uint64(dev, "memhp_base",
> +                         vms->memmap[VIRT_PCDIMM_ACPI].base);
Maybe add a comment that a property is required to integrate with
acpi_memory_hotplug_init() (if I am not wrong). Otherwise one may wonder
why sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, <base>) is not used, as for
standard sysbus devices.

> +    qdev_init_nofail(dev);
> +
> +    return dev;
> +}
> +
>  static void create_its(VirtMachineState *vms, DeviceState *gicdev)
>  {
>      const char *itsclass = its_class_name();
> @@ -1644,6 +1657,8 @@ static void machvirt_init(MachineState *machine)
>  
>      create_platform_bus(vms, pic);
>  
> +    vms->acpi = create_virt_acpi(vms);
I can see that on PC machines, they use a link property to set the
acpi_dev. I am unsure about the exact reason, any idea?
> +
>      vms->bootinfo.ram_size = machine->ram_size;
>      vms->bootinfo.kernel_filename = machine->kernel_filename;
>      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
> @@ -1828,11 +1843,19 @@ static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>  static void virt_memory_plug(HotplugHandler *hotplug_dev,
>                               DeviceState *dev, Error **errp)
>  {
> +    HotplugHandlerClass *hhc;
>      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>      Error *local_err = NULL;
>  
>      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi);
> +    hhc->plug(HOTPLUG_HANDLER(vms->acpi), dev, &error_abort);
Why error_abort instead of propagating the error?
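
To make the question concrete, here is a minimal standalone sketch of the
propagating variant. It does not use QEMU's real Error API: `Error`,
`set_error()`, `propagate()`, `acpi_stage()` and `plug_propagate()` are
hypothetical stand-ins assumed only for illustration. The point is that a
failure in the second stage reaches the caller instead of terminating the
process, which is what an abort-style error destination would do.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an error object; not QEMU's Error type. */
typedef struct Error {
    const char *msg;
} Error;

/* Record a failure in *errp, if the caller asked for errors at all. */
static void set_error(Error **errp, const char *msg)
{
    if (errp) {
        Error *e;

        /* Setting an error twice is treated as a programming bug here. */
        assert(*errp == NULL);
        e = malloc(sizeof(*e));
        if (!e) {
            abort();
        }
        e->msg = msg;
        *errp = e;
    }
}

/* Hand a local error to the caller, or drop it if the caller passed NULL. */
static void propagate(Error **errp, Error *local)
{
    if (!local) {
        return;
    }
    if (errp && !*errp) {
        *errp = local;
    } else {
        free(local);
    }
}

/* Second stage of the plug operation (stand-in for the ACPI device hook). */
static void acpi_stage(int fail, Error **errp)
{
    if (fail) {
        set_error(errp, "acpi stage failed");
    }
}

/* Propagating variant: a failure in either stage is reported to the caller. */
static void plug_propagate(int fail_first, int fail_second, Error **errp)
{
    Error *local = NULL;

    if (fail_first) {
        set_error(&local, "first stage failed");
        goto out;
    }
    acpi_stage(fail_second, &local);
out:
    propagate(errp, local);
}
```

Passing the same local error to both stages keeps a single exit path, so
the caller decides whether a failed plug is fatal.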
>  
> +out:
>      error_propagate(errp, local_err);
>  }
>  
> diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
> index f314515..262ca7d 100644
> --- a/include/hw/acpi/generic_event_device.h
> +++ b/include/hw/acpi/generic_event_device.h
> @@ -18,12 +18,17 @@
>  #ifndef HW_ACPI_GED_H
>  #define HW_ACPI_GED_H
>  
> +#include "hw/acpi/memory_hotplug.h"
> +
>  #define TYPE_VIRT_ACPI "virt-acpi"
>  #define VIRT_ACPI(obj) \
>      OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
>  
>  typedef struct VirtAcpiState {
>      SysBusDevice parent_obj;
> +    MemHotplugState memhp_state;
> +    hwaddr memhp_base;
>  } VirtAcpiState;
>  
> +
spurious newline

Thanks

Eric
>  #endif
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index 507517c..c5e4c96 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -77,6 +77,7 @@ enum {
>      VIRT_GPIO,
>      VIRT_SECURE_UART,
>      VIRT_SECURE_MEM,
> +    VIRT_PCDIMM_ACPI,
>      VIRT_LOWMEMMAP_LAST,
>  };
>  
> @@ -132,6 +133,7 @@ typedef struct {
>      uint32_t iommu_phandle;
>      int psci_conduit;
>      hwaddr highest_gpa;
> +    DeviceState *acpi;
>  } VirtMachineState;
>  
>  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
> 


* Re: [Qemu-arm] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt" Shameer Kolothum
@ 2019-03-29  9:31   ` Auger Eric
  2019-03-29  9:41     ` Shameerali Kolothum Thodi
  2019-03-29  9:59     ` Shameerali Kolothum Thodi
  0 siblings, 2 replies; 95+ messages in thread
From: Auger Eric @ 2019-03-29  9:31 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-devel, qemu-arm, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: Leif Lindholm, Laszlo Ersek, linuxarm, xuwei5, Ard Biesheuvel

Hi Shameer,

[ + Laszlo, Ard, Leif ]

On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> This is to disable/enable populating DT nodes in case
> any conflict with acpi tables. The default is "off".
The name of the option sounds misleading to me. Also we don't really
know the scope of the disablement. At the moment this just aims to
prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.

> 
> This will be used in subsequent patch where cold plug
> device-memory support is added for DT boot.
I am concerned about the fact that in dt mode, by default, you won't see
any PCDIMM nodes.
> 
> If DT memory node support is added for cold-plugged device
> memory, those memory will be visible to Guest kernel via
> UEFI GetMemoryMap() and gets treated as early boot memory.
Don't we have an issue in UEFI then? Normally the SRAT indicates whether
the slots are hotpluggable or not. Shouldn't the UEFI code look at this
info?
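
As an illustration of what "looking at this info" could mean, here is a
minimal standalone sketch. It is not firmware code: `MemAffinity` is a
simplified, hypothetical view of an SRAT Memory Affinity entry and
`range_is_hotpluggable()` is an assumed helper name. Only the flag bit
positions (Enabled, Hot Pluggable, NonVolatile) follow the ACPI
specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits of the SRAT Memory Affinity Structure (ACPI specification). */
#define SRAT_MEM_ENABLED        (1u << 0)
#define SRAT_MEM_HOT_PLUGGABLE  (1u << 1)
#define SRAT_MEM_NON_VOLATILE   (1u << 2)

/* Simplified view of one entry; the real table layout is different. */
typedef struct MemAffinity {
    uint64_t base;
    uint64_t length;
    uint32_t flags;
} MemAffinity;

/*
 * True if [addr, addr + size) overlaps an enabled, hot-pluggable range.
 * Assumes addr + size and base + length do not overflow 64 bits.
 */
static bool range_is_hotpluggable(const MemAffinity *tbl, size_t n,
                                  uint64_t addr, uint64_t size)
{
    const uint32_t want = SRAT_MEM_ENABLED | SRAT_MEM_HOT_PLUGGABLE;
    size_t i;

    for (i = 0; i < n; i++) {
        bool overlaps = addr < tbl[i].base + tbl[i].length &&
                        tbl[i].base < addr + size;

        if (overlaps && (tbl[i].flags & want) == want) {
            return true;
        }
    }
    return false;
}
```

A consumer of the memory map could use such a check to avoid treating a
hot-pluggable range as ordinary early boot memory.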
> Hence memory becomes non hot-un-unpluggable even if Guest
> is booted in ACPI mode.



> 
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  hw/arm/virt.c         | 23 +++++++++++++++++++++++
>  include/hw/arm/virt.h |  1 +
>  2 files changed, 24 insertions(+)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 13db0e9..b602151 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1717,6 +1717,20 @@ static void virt_set_highmem(Object *obj, bool value, Error **errp)
>      vms->highmem = value;
>  }
>  
> +static bool virt_get_fdt(Object *obj, Error **errp)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(obj);
> +
> +    return vms->use_fdt;
> +}
> +
> +static void virt_set_fdt(Object *obj, bool value, Error **errp)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(obj);
> +
> +    vms->use_fdt = value;
> +}
> +
>  static bool virt_get_its(Object *obj, Error **errp)
>  {
>      VirtMachineState *vms = VIRT_MACHINE(obj);
> @@ -2005,6 +2019,15 @@ static void virt_instance_init(Object *obj)
>      object_property_set_description(obj, "gic-version",
>                                      "Set GIC version. "
>                                      "Valid values are 2, 3 and host", NULL);
> +    /* fdt is disabled by default */
> +    vms->use_fdt = false;
> +    object_property_add_bool(obj, "fdt", virt_get_fdt,
> +                             virt_set_fdt, NULL);
> +    object_property_set_description(obj, "fdt",
> +                                    "Set on/off to enable/disable device tree "
> +                                    "nodes in case any conflict with ACPI"
in case of

Thanks

Eric
> +                                    "(eg: device memory node)",
> +                                    NULL);
>  
>      vms->highmem_ecam = !vmc->no_highmem_ecam;
>  
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index c5e4c96..14b2e0a 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -119,6 +119,7 @@ typedef struct {
>      bool highmem_ecam;
>      bool its;
>      bool virt;
> +    bool use_fdt;
>      int32_t gic_version;
>      VirtIOMMUType iommu;
>      struct arm_boot_info bootinfo;
> 


* Re: [Qemu-arm] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-03-29  9:31   ` Auger Eric
@ 2019-03-29  9:41     ` Shameerali Kolothum Thodi
  2019-03-29 13:41       ` Auger Eric
  2019-03-29  9:59     ` Shameerali Kolothum Thodi
  1 sibling, 1 reply; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-03-29  9:41 UTC (permalink / raw)
  To: Auger Eric, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	imammedo@redhat.com, peter.maydell@linaro.org,
	shannon.zhaosl@gmail.com, sameo@linux.intel.com
  Cc: Leif Lindholm, Laszlo Ersek, Linuxarm, xuwei (O), Ard Biesheuvel

Hi Eric,

> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 29 March 2019 09:32
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> sameo@linux.intel.com; sebastien.boeuf@intel.com
> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
> 
> Hi Shameer,
> 
> [ + Laszlo, Ard, Leif ]
> 
> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> > This is to disable/enable populating DT nodes in case
> > any conflict with acpi tables. The default is "off".
> The name of the option sounds misleading to me. Also we don't really
> know the scope of the disablement. At the moment this just aims to
> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.

Yes, I was not very happy with the name "fdt". If this is not useful other than
for this device memory conflict case, then we can be more specific. But I am not
sure whether we might need this for other future conflicts, hence the more
generic name.

Maybe "force_fdt" or "dimm_fdt"? I am open to suggestions.

> >
> > This will be used in subsequent patch where cold plug
> > device-memory support is added for DT boot.
> I am concerned about the fact that in dt mode, by default, you won't see
> any PCDIMM nodes.

True. But is there any other way to detect that the Guest is using DT?

Thanks,
Shameer

> > If DT memory node support is added for cold-plugged device
> > memory, those memory will be visible to Guest kernel via
> > UEFI GetMemoryMap() and gets treated as early boot memory.
> Don't we have an issue in UEFI then. Normally the SRAT indicates whether
> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
> info.
> > Hence memory becomes non hot-un-unpluggable even if Guest
> > is booted in ACPI mode.
> 
> 
> 
> >
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > ---
> >  hw/arm/virt.c         | 23 +++++++++++++++++++++++
> >  include/hw/arm/virt.h |  1 +
> >  2 files changed, 24 insertions(+)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 13db0e9..b602151 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -1717,6 +1717,20 @@ static void virt_set_highmem(Object *obj, bool
> value, Error **errp)
> >      vms->highmem = value;
> >  }
> >
> > +static bool virt_get_fdt(Object *obj, Error **errp)
> > +{
> > +    VirtMachineState *vms = VIRT_MACHINE(obj);
> > +
> > +    return vms->use_fdt;
> > +}
> > +
> > +static void virt_set_fdt(Object *obj, bool value, Error **errp)
> > +{
> > +    VirtMachineState *vms = VIRT_MACHINE(obj);
> > +
> > +    vms->use_fdt = value;
> > +}
> > +
> >  static bool virt_get_its(Object *obj, Error **errp)
> >  {
> >      VirtMachineState *vms = VIRT_MACHINE(obj);
> > @@ -2005,6 +2019,15 @@ static void virt_instance_init(Object *obj)
> >      object_property_set_description(obj, "gic-version",
> >                                      "Set GIC version. "
> >                                      "Valid values are 2, 3 and host",
> NULL);
> > +    /* fdt is disabled by default */
> > +    vms->use_fdt = false;
> > +    object_property_add_bool(obj, "fdt", virt_get_fdt,
> > +                             virt_set_fdt, NULL);
> > +    object_property_set_description(obj, "fdt",
> > +                                    "Set on/off to enable/disable
> device tree "
> > +                                    "nodes in case any conflict with
> ACPI"
> in case of
> 
> Thanks
> 
> Eric
> > +                                    "(eg: device memory node)",
> > +                                    NULL);
> >
> >      vms->highmem_ecam = !vmc->no_highmem_ecam;
> >
> > diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> > index c5e4c96..14b2e0a 100644
> > --- a/include/hw/arm/virt.h
> > +++ b/include/hw/arm/virt.h
> > @@ -119,6 +119,7 @@ typedef struct {
> >      bool highmem_ecam;
> >      bool its;
> >      bool virt;
> > +    bool use_fdt;
> >      int32_t gic_version;
> >      VirtIOMMUType iommu;
> >      struct arm_boot_info bootinfo;
> >


* Re: [Qemu-arm] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-03-29  9:31   ` Auger Eric
  2019-03-29  9:41     ` Shameerali Kolothum Thodi
@ 2019-03-29  9:59     ` Shameerali Kolothum Thodi
  2019-03-29 13:12       ` Auger Eric
  1 sibling, 1 reply; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-03-29  9:59 UTC (permalink / raw)
  To: Auger Eric, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	imammedo@redhat.com, peter.maydell@linaro.org,
	shannon.zhaosl@gmail.com, sameo@linux.intel.com,
	sebastien.boeuf@intel.com
  Cc: Leif Lindholm, Laszlo Ersek, Linuxarm, xuwei (O), Ard Biesheuvel



> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 29 March 2019 09:32
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> sameo@linux.intel.com; sebastien.boeuf@intel.com
> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
> 
> Hi Shameer,
> 
> [ + Laszlo, Ard, Leif ]
> 
> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> > This is to disable/enable populating DT nodes in case
> > any conflict with acpi tables. The default is "off".
> The name of the option sounds misleading to me. Also we don't really
> know the scope of the disablement. At the moment this just aims to
> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
> 
> >
> > This will be used in subsequent patch where cold plug
> > device-memory support is added for DT boot.
> I am concerned about the fact that in dt mode, by default, you won't see
> any PCDIMM nodes.
> >
> > If DT memory node support is added for cold-plugged device
> > memory, those memory will be visible to Guest kernel via
> > UEFI GetMemoryMap() and gets treated as early boot memory.
> Don't we have an issue in UEFI then? Normally the SRAT indicates whether
> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
> info?

Sorry, I missed this part. Yes, that would be a cleaner solution.

Also, to be clearer about what happens:

Guest ACPI boot with "fdt=on", from the kernel log:

[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
[    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
[    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
[    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
[    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
[    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
[    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
[    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
[    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
[    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
[    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
[    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]


Guest ACPI boot with "fdt=off":

[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
[    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
[    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
[    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
[    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
[    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
[    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
[    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
[    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
[    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
[    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]

The hotpluggable memory node is absent from the early memory node ranges here.

As you said, it could be possible to detect this node using SRAT in UEFI.
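
A rough sketch of that idea, in plain C rather than real EDK2 or QEMU code:
firmware walks the memory ranges described by SRAT and leaves out of the boot
memory map any range whose Memory Affinity flags mark it as hot-pluggable.
The struct, the function name and the sample table are made up for
illustration; only the flag bit positions (bit 0 Enabled, bit 1 Hot
Pluggable) come from the ACPI SRAT Memory Affinity Structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits of the ACPI SRAT Memory Affinity Structure. */
#define SRAT_MEM_ENABLED       (1u << 0)
#define SRAT_MEM_HOT_PLUGGABLE (1u << 1)

/* Hypothetical, simplified view of one SRAT memory affinity entry. */
struct mem_range {
    uint64_t base;
    uint64_t length;
    uint32_t flags;
};

/*
 * Return the last address (inclusive) of boot-time memory, ignoring
 * disabled ranges and, unless include_hotplug is set, hot-pluggable ones.
 * Returns 0 if no range qualifies.
 */
static uint64_t early_memory_end(const struct mem_range *r, size_t n,
                                 int include_hotplug)
{
    uint64_t end = 0;
    size_t i;

    assert(r != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (!(r[i].flags & SRAT_MEM_ENABLED) || r[i].length == 0) {
            continue;
        }
        if ((r[i].flags & SRAT_MEM_HOT_PLUGGABLE) && !include_hotplug) {
            continue;
        }
        if (r[i].base + r[i].length - 1 > end) {
            end = r[i].base + r[i].length - 1;
        }
    }
    return end;
}

/* Layout from the logs above: 2 GiB of RAM plus 1 GiB of device memory. */
static const struct mem_range example_ranges[] = {
    { 0x40000000ULL, 0x80000000ULL, SRAT_MEM_ENABLED },
    { 0xc0000000ULL, 0x40000000ULL,
      SRAT_MEM_ENABLED | SRAT_MEM_HOT_PLUGGABLE },
};
```

With the hot-pluggable range filtered out, early_memory_end(example_ranges, 2, 0)
gives 0xbfffffff, which matches the "fdt=off" log; including it gives
0xffffffff, as in the "fdt=on" case.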

Cheers,
Shameer

> > Hence memory becomes non hot-unpluggable even if Guest
> > is booted in ACPI mode.
> 
> 
> 
> >
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > ---
> >  hw/arm/virt.c         | 23 +++++++++++++++++++++++
> >  include/hw/arm/virt.h |  1 +
> >  2 files changed, 24 insertions(+)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 13db0e9..b602151 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -1717,6 +1717,20 @@ static void virt_set_highmem(Object *obj, bool
> value, Error **errp)
> >      vms->highmem = value;
> >  }
> >
> > +static bool virt_get_fdt(Object *obj, Error **errp)
> > +{
> > +    VirtMachineState *vms = VIRT_MACHINE(obj);
> > +
> > +    return vms->use_fdt;
> > +}
> > +
> > +static void virt_set_fdt(Object *obj, bool value, Error **errp)
> > +{
> > +    VirtMachineState *vms = VIRT_MACHINE(obj);
> > +
> > +    vms->use_fdt = value;
> > +}
> > +
> >  static bool virt_get_its(Object *obj, Error **errp)
> >  {
> >      VirtMachineState *vms = VIRT_MACHINE(obj);
> > @@ -2005,6 +2019,15 @@ static void virt_instance_init(Object *obj)
> >      object_property_set_description(obj, "gic-version",
> >                                      "Set GIC version. "
> >                                      "Valid values are 2, 3 and host",
> NULL);
> > +    /* fdt is disabled by default */
> > +    vms->use_fdt = false;
> > +    object_property_add_bool(obj, "fdt", virt_get_fdt,
> > +                             virt_set_fdt, NULL);
> > +    object_property_set_description(obj, "fdt",
> > +                                    "Set on/off to enable/disable
> device tree "
> > +                                    "nodes in case any conflict with
> ACPI"
> in case of
> 
> Thanks
> 
> Eric
> > +                                    "(eg: device memory node)",
> > +                                    NULL);
> >
> >      vms->highmem_ecam = !vmc->no_highmem_ecam;
> >
> > diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> > index c5e4c96..14b2e0a 100644
> > --- a/include/hw/arm/virt.h
> > +++ b/include/hw/arm/virt.h
> > @@ -119,6 +119,7 @@ typedef struct {
> >      bool highmem_ecam;
> >      bool its;
> >      bool virt;
> > +    bool use_fdt;
> >      int32_t gic_version;
> >      VirtIOMMUType iommu;
> >      struct arm_boot_info bootinfo;
> >

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug
  2019-03-29  9:31   ` [Qemu-arm] " Auger Eric
@ 2019-03-29 10:54     ` Shameerali Kolothum Thodi
  2019-04-01 13:43       ` [Qemu-devel] " Igor Mammedov
  1 sibling, 0 replies; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-03-29 10:54 UTC (permalink / raw)
  To: Auger Eric, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	imammedo@redhat.com, peter.maydell@linaro.org,
	shannon.zhaosl@gmail.com, sameo@linux.intel.com,
	sebastien.boeuf@intel.com
  Cc: Linuxarm, xuwei (O)

Hi Eric,

> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 29 March 2019 09:31
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> sameo@linux.intel.com; sebastien.boeuf@intel.com
> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>
> Subject: Re: [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device
> memory cold-plug
> 
> Hi Shameer,
> 
> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> > This adds support to build the aml code so that Guest(ACPI boot)
> > can see the cold-plugged device memory. Memory cold plug support
> > with DT boot is not yet enabled.
> >
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > ---
> >  default-configs/arm-softmmu.mak        |  2 ++
> >  hw/acpi/generic_event_device.c         | 23
> +++++++++++++++++++++++
> >  hw/arm/virt-acpi-build.c               |  9 +++++++++
> >  hw/arm/virt.c                          | 23
> +++++++++++++++++++++++
> >  include/hw/acpi/generic_event_device.h |  5 +++++
> >  include/hw/arm/virt.h                  |  2 ++
> >  6 files changed, 64 insertions(+)
> >
> > diff --git a/default-configs/arm-softmmu.mak
> b/default-configs/arm-softmmu.mak
> > index 795cb89..6db444e 100644
> > --- a/default-configs/arm-softmmu.mak
> > +++ b/default-configs/arm-softmmu.mak
> > @@ -162,3 +162,5 @@ CONFIG_LSI_SCSI_PCI=y
> >
> >  CONFIG_MEM_DEVICE=y
> >  CONFIG_DIMM=y
> > +CONFIG_ACPI_MEMORY_HOTPLUG=y
> > +CONFIG_ACPI_HW_REDUCED=y
> > diff --git a/hw/acpi/generic_event_device.c
> b/hw/acpi/generic_event_device.c
> > index b21a551..0b32fc9 100644
> > --- a/hw/acpi/generic_event_device.c
> > +++ b/hw/acpi/generic_event_device.c
> > @@ -16,13 +16,26 @@
> >   */
> >
> >  #include "qemu/osdep.h"
> > +#include "qapi/error.h"
> > +#include "exec/address-spaces.h"
> >  #include "hw/sysbus.h"
> >  #include "hw/acpi/acpi.h"
> >  #include "hw/acpi/generic_event_device.h"
> > +#include "hw/mem/pc-dimm.h"
> >
> >  static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> >                                  DeviceState *dev, Error **errp)
> >  {
> > +    VirtAcpiState *s = VIRT_ACPI(hotplug_dev);
> > +
> > +    if (s->memhp_state.is_enabled &&
> > +        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> > +            acpi_memory_plug_cb(hotplug_dev, &s->memhp_state,
> > +                                dev, errp);
> > +    } else {
> > +        error_setg(errp, "virt: device plug request for unsupported
> device"
> > +                   " type: %s", object_get_typename(OBJECT(dev)));
> > +    }
> >  }
> >
> >  static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > @@ -31,9 +44,19 @@ static void virt_send_ged(AcpiDeviceIf *adev,
> AcpiEventStatusBits ev)
> >
> >  static void virt_device_realize(DeviceState *dev, Error **errp)
> >  {
> > +    VirtAcpiState *s = VIRT_ACPI(dev);
> > +
> > +    if (s->memhp_state.is_enabled) {
> > +        acpi_memory_hotplug_init(get_system_memory(), OBJECT(dev),
> > +                                 &s->memhp_state,
> > +                                 s->memhp_base);
> > +    }
> >  }
> >
> >  static Property virt_acpi_properties[] = {
> > +    DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base,
> 0),
> > +    DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
> > +                     memhp_state.is_enabled, true),
> >      DEFINE_PROP_END_OF_LIST(),
> >  };
> >
> > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > index bf9c0bc..20d3c83 100644
> > --- a/hw/arm/virt-acpi-build.c
> > +++ b/hw/arm/virt-acpi-build.c
> > @@ -40,6 +40,7 @@
> >  #include "hw/loader.h"
> >  #include "hw/hw.h"
> >  #include "hw/acpi/aml-build.h"
> > +#include "hw/acpi/memory_hotplug.h"
> >  #include "hw/pci/pcie_host.h"
> >  #include "hw/pci/pci.h"
> >  #include "hw/arm/virt.h"
> > @@ -49,6 +50,13 @@
> >  #define ARM_SPI_BASE 32
> >  #define ACPI_POWER_BUTTON_DEVICE "PWRB"
> >
> > +static void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState
> *ms)
> > +{
> > +    uint32_t nr_mem = ms->ram_slots;
> > +
> > +    build_memory_hotplug_aml(scope, nr_mem, "\\_SB", NULL,
> AML_SYSTEM_MEMORY);
> > +}
> > +
> >  static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
> >  {
> >      uint16_t i;
> > @@ -740,6 +748,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> VirtMachineState *vms)
> >       * the RTC ACPI device at all when using UEFI.
> >       */
> >      scope = aml_scope("\\_SB");
> > +    acpi_dsdt_add_memory_hotplug(scope, MACHINE(vms));
> >      acpi_dsdt_add_cpus(scope, vms->smp_cpus);
> >      acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
> >                         (irqmap[VIRT_UART] + ARM_SPI_BASE));
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index d0ff20d..13db0e9 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -133,6 +133,7 @@ static const MemMapEntry base_memmap[] = {
> >      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
> >      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
> >      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
> > +    [VIRT_PCDIMM_ACPI] =        { 0x09070000, 0x00010000 },
> >      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
> >      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that
> size */
> >      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> > @@ -516,6 +517,18 @@ static void fdt_add_pmu_nodes(const
> VirtMachineState *vms)
> >      }
> >  }
> >
> > +static DeviceState *create_virt_acpi(VirtMachineState *vms)
> > +{
> > +    DeviceState *dev;
> > +
> > +    dev = qdev_create(NULL, "virt-acpi");
> > +    qdev_prop_set_uint64(dev, "memhp_base",
> > +                         vms->memmap[VIRT_PCDIMM_ACPI].base);
> Maybe add a comment that a property is required to integrate with
> acpi_memory_hotplug_init() (if I am not wrong). Otherwise one may wonder
> why sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, <base>) is not used, as for
> standard sysbus devices.

Ok.
 
> > +    qdev_init_nofail(dev);
> > +
> > +    return dev;
> > +}
> > +
> >  static void create_its(VirtMachineState *vms, DeviceState *gicdev)
> >  {
> >      const char *itsclass = its_class_name();
> > @@ -1644,6 +1657,8 @@ static void machvirt_init(MachineState *machine)
> >
> >      create_platform_bus(vms, pic);
> >
> > +    vms->acpi = create_virt_acpi(vms);
> I can see that on PC machines, they use a link property to set the
> acpi_dev. I am unsure about the exact reason, any idea?
> > +
> >      vms->bootinfo.ram_size = machine->ram_size;
> >      vms->bootinfo.kernel_filename = machine->kernel_filename;
> >      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
> > @@ -1828,11 +1843,19 @@ static void
> virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >  static void virt_memory_plug(HotplugHandler *hotplug_dev,
> >                               DeviceState *dev, Error **errp)
> >  {
> > +    HotplugHandlerClass *hhc;
> >      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> >      Error *local_err = NULL;
> >
> >      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +
> > +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi);
> > +    hhc->plug(HOTPLUG_HANDLER(vms->acpi), dev, &error_abort);
> Why error_abort instead of propagating the error?

I could change that but this is what pc code does. Not sure the error is critical
though.
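
For reference, propagating instead of aborting would look roughly like the
stand-alone sketch below. It does not use the real QEMU Error API:
`struct err`, `step()` and `plug_two_steps()` are invented stand-ins, and
the only point is the control flow (stop at the first failing step and hand
the error back to the caller rather than aborting).

```c
#include <assert.h>
#include <stddef.h>

/* Invented stand-in for an error object; a NULL msg means "no error". */
struct err {
    const char *msg;
};

/* Hypothetical plug step: fails when asked to, reporting through *e. */
static void step(int fail, const char *what, struct err *e)
{
    assert(e != NULL && e->msg == NULL); /* never overwrite an error */
    if (fail) {
        e->msg = what;
    }
}

/*
 * Run two plug steps in order. If the first one fails the second is
 * skipped. The first error message (or NULL on success) is returned to
 * the caller instead of the process aborting.
 */
static const char *plug_two_steps(int fail_first, int fail_second)
{
    struct err local = { NULL };

    step(fail_first, "dimm plug failed", &local);
    if (local.msg) {
        goto out;
    }
    step(fail_second, "acpi plug failed", &local);

out:
    return local.msg;
}
```

The caller then decides what to do with a failure of the second step, which
is the part that &error_abort takes away.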

Thanks,
Shameer

> >
> > +out:
> >      error_propagate(errp, local_err);
> >  }
> >
> > diff --git a/include/hw/acpi/generic_event_device.h
> b/include/hw/acpi/generic_event_device.h
> > index f314515..262ca7d 100644
> > --- a/include/hw/acpi/generic_event_device.h
> > +++ b/include/hw/acpi/generic_event_device.h
> > @@ -18,12 +18,17 @@
> >  #ifndef HW_ACPI_GED_H
> >  #define HW_ACPI_GED_H
> >
> > +#include "hw/acpi/memory_hotplug.h"
> > +
> >  #define TYPE_VIRT_ACPI "virt-acpi"
> >  #define VIRT_ACPI(obj) \
> >      OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> >
> >  typedef struct VirtAcpiState {
> >      SysBusDevice parent_obj;
> > +    MemHotplugState memhp_state;
> > +    hwaddr memhp_base;
> >  } VirtAcpiState;
> >
> > +
> spurious newline
> 
> Thanks
> 
> Eric
> >  #endif
> > diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> > index 507517c..c5e4c96 100644
> > --- a/include/hw/arm/virt.h
> > +++ b/include/hw/arm/virt.h
> > @@ -77,6 +77,7 @@ enum {
> >      VIRT_GPIO,
> >      VIRT_SECURE_UART,
> >      VIRT_SECURE_MEM,
> > +    VIRT_PCDIMM_ACPI,
> >      VIRT_LOWMEMMAP_LAST,
> >  };
> >
> > @@ -132,6 +133,7 @@ typedef struct {
> >      uint32_t iommu_phandle;
> >      int psci_conduit;
> >      hwaddr highest_gpa;
> > +    DeviceState *acpi;
> >  } VirtMachineState;
> >
> >  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM :
> VIRT_PCIE_ECAM)
> >

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device
  2019-03-28 14:14   ` Auger Eric
@ 2019-03-29 11:22     ` Shameerali Kolothum Thodi
  2019-04-01 13:08         ` Igor Mammedov
  0 siblings, 1 reply; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-03-29 11:22 UTC (permalink / raw)
  To: Auger Eric, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	imammedo@redhat.com, peter.maydell@linaro.org,
	shannon.zhaosl@gmail.com, sameo@linux.intel.com,
	sebastien.boeuf@intel.com
  Cc: Linuxarm, xuwei (O)



> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 28 March 2019 14:15
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> sameo@linux.intel.com; sebastien.boeuf@intel.com
> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>
> Subject: Re: [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device
> 
> Hi Shameer,
> 
> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> > From: Samuel Ortiz <sameo@linux.intel.com>
> >
> > This adds the skeleton to support an acpi device interface for
> > HW-reduced acpi platforms via ACPI GED - Generic Event Device (ACPI
> > v6.1 5.6.9).
> >
> > This will be used by Arm/Virt to add hotplug support.
> >
> > Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > ---
> >  hw/acpi/Kconfig                        |  4 ++
> >  hw/acpi/Makefile.objs                  |  1 +
> >  hw/acpi/generic_event_device.c         | 72
> ++++++++++++++++++++++++++++++++++
> >  include/hw/acpi/generic_event_device.h | 29 ++++++++++++++
> >  4 files changed, 106 insertions(+)
> >  create mode 100644 hw/acpi/generic_event_device.c
> >  create mode 100644 include/hw/acpi/generic_event_device.h
> >
> > diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
> > index eca3bee..01a8b41 100644
> > --- a/hw/acpi/Kconfig
> > +++ b/hw/acpi/Kconfig
> > @@ -27,3 +27,7 @@ config ACPI_VMGENID
> >      bool
> >      default y
> >      depends on PC
> > +
> > +config ACPI_HW_REDUCED
> > +    bool
> > +    depends on ACPI
> > diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
> > index 2d46e37..b753232 100644
> > --- a/hw/acpi/Makefile.objs
> > +++ b/hw/acpi/Makefile.objs
> > @@ -6,6 +6,7 @@ common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o
> >  common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
> >  common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
> >  common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
> > +common-obj-$(CONFIG_ACPI_HW_REDUCED) += generic_event_device.o
> >  common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
> >
> >  common-obj-y += acpi_interface.o
> > diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> > new file mode 100644
> > index 0000000..b21a551
> > --- /dev/null
> > +++ b/hw/acpi/generic_event_device.c
> > @@ -0,0 +1,72 @@
> > +/*
> > + *
> > + * Copyright (c) 2018 Intel Corporation
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2 or later, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> > + * more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along with
> > + * this program.  If not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "hw/sysbus.h"
> > +#include "hw/acpi/acpi.h"
> > +#include "hw/acpi/generic_event_device.h"
> The files are named generic_event_device.c/h while the device is named
> "virt-acpi". I would suggest using the same naming as in nemu, i.e. ged or
> acpi_ged.

Agree. The naming is a bit confusing. In nemu they have a separate virt-acpi
dev which makes use of GED. Here, we are rolling those two into one. I am
still not sure whether we should leave it as virt-acpi, because the actual
device on which this is implemented could change, e.g. GED vs GPIO.

> I think you should clarify the exact scope of this device. The patch title
> makes one think this is bound to be used only in machvirt (+ the virt prefix
> used in numerous functions). Is it also bound to be used by other architectures?
> > +
> > +static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> > +                                DeviceState *dev, Error **errp)
> > +{
> > +}
> > +
> > +static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > +{
> > +}
> > +
> > +static void virt_device_realize(DeviceState *dev, Error **errp)
> > +{
> > +}
> > +
> > +static Property virt_acpi_properties[] = {
> > +    DEFINE_PROP_END_OF_LIST(),
> > +};
> > +
> > +static void virt_acpi_class_init(ObjectClass *class, void *data)
> > +{
> > +    DeviceClass *dc = DEVICE_CLASS(class);
> > +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(class);
> > +    AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_CLASS(class);
> > +
> > +    dc->desc = "ACPI";
> > +    dc->props = virt_acpi_properties;
> > +    dc->realize = virt_device_realize;
> > +
> > +    hc->plug = virt_device_plug_cb;
> > +
> > +    adevc->send_event = virt_send_ged;
> > +}
> > +
> > +static const TypeInfo virt_acpi_info = {
> > +    .name          = TYPE_VIRT_ACPI,
> > +    .parent        = TYPE_SYS_BUS_DEVICE,
> > +    .instance_size = sizeof(VirtAcpiState),
> > +    .class_init    = virt_acpi_class_init,
> > +    .interfaces = (InterfaceInfo[]) {
> > +        { TYPE_HOTPLUG_HANDLER },
> > +        { TYPE_ACPI_DEVICE_IF },
> > +        { }
> > +    }
> > +};
> > +
> > +static void virt_acpi_register_types(void)
> > +{
> > +    type_register_static(&virt_acpi_info);
> > +}
> > +
> > +type_init(virt_acpi_register_types)
> > diff --git a/include/hw/acpi/generic_event_device.h
> > b/include/hw/acpi/generic_event_device.h
> > new file mode 100644
> > index 0000000..f314515
> > --- /dev/null
> > +++ b/include/hw/acpi/generic_event_device.h
> > @@ -0,0 +1,29 @@
> > +/*
> > + *
> > + * Copyright (c) 2018 Intel Corporation
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2 or later, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> > + * more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along with
> > + * this program.  If not, see <http://www.gnu.org/licenses/>.
> > + */
> Add a comment in the header introducing what is the role of this device?
> link to GED spec? Explain the subset of the interfaces being implemented by
> the device.

Ok. I have added comments to that effect in patch #10, but I think I will make it
clear here as well.

Cheers,
Shameer

> > +
> > +#ifndef HW_ACPI_GED_H
> > +#define HW_ACPI_GED_H
> > +
> > +#define TYPE_VIRT_ACPI "virt-acpi"
> > +#define VIRT_ACPI(obj) \
> > +    OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> > +
> > +typedef struct VirtAcpiState {
> > +    SysBusDevice parent_obj;
> > +} VirtAcpiState;
> > +
> > +#endif
> >

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [PATCH v3 04/10] hw/arm/virt: Add memory hotplug framework
  2019-03-28 15:37   ` Auger Eric
@ 2019-03-29 12:03     ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-03-29 12:03 UTC (permalink / raw)
  To: Auger Eric, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	imammedo@redhat.com, peter.maydell@linaro.org,
	shannon.zhaosl@gmail.com, sameo@linux.intel.com,
	sebastien.boeuf@intel.com
  Cc: Linuxarm, xuwei (O)



> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 28 March 2019 15:38
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> sameo@linux.intel.com; sebastien.boeuf@intel.com
> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>
> Subject: Re: [PATCH v3 04/10] hw/arm/virt: Add memory hotplug framework
> 
> Hi Shameer,
> 
> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> > From: Eric Auger <eric.auger@redhat.com>
> >
> > This patch adds the the memory hot-plug/hot-unplug infrastructure
> s/the the/the (sorry my fault)
> > in machvirt. It is still not enabled as device memory is not yet
> > reported to guest.
> s/ reported to guest / exposed to the guest either through DT or ACPI.
> Note even cold-plug would not work yet.

Ok. 

> >
> > Signed-off-by: Eric Auger <eric.auger@redhat.com>
> > Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > ---
> >  default-configs/arm-softmmu.mak |  3 +++
> >  hw/arm/virt.c                   | 53
> ++++++++++++++++++++++++++++++++++++++++-
> >  2 files changed, 55 insertions(+), 1 deletion(-)
> >
> > diff --git a/default-configs/arm-softmmu.mak
> b/default-configs/arm-softmmu.mak
> > index 2a7efc1..795cb89 100644
> > --- a/default-configs/arm-softmmu.mak
> > +++ b/default-configs/arm-softmmu.mak
> > @@ -159,3 +159,6 @@ CONFIG_MUSICPAL=y
> >
> >  # for realview and versatilepb
> >  CONFIG_LSI_SCSI_PCI=y
> > +
> > +CONFIG_MEM_DEVICE=y
> > +CONFIG_DIMM=y
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index ce2664a..d0ff20d 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -61,6 +61,8 @@
> >  #include "hw/arm/smmuv3.h"
> >  #include "hw/acpi/acpi.h"
> >  #include "target/arm/internals.h"
> > +#include "hw/mem/pc-dimm.h"
> > +#include "hw/mem/nvdimm.h"
> >
> >  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
> >      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
> > @@ -1806,6 +1808,42 @@ static const CPUArchIdList
> *virt_possible_cpu_arch_ids(MachineState *ms)
> >      return ms->possible_cpus;
> >  }
> >
> > +static void virt_memory_pre_plug(HotplugHandler *hotplug_dev,
> DeviceState *dev,
> > +                                 Error **errp)
> > +{
> > +    const bool is_nvdimm = object_dynamic_cast(OBJECT(dev),
> TYPE_NVDIMM);
> > +
> > +    if (dev->hotplugged) {
> > +        error_setg(errp, "memory hotplug is not supported");
> > +    }
> > +
> > +    if (is_nvdimm) {
> > +        error_setg(errp, "nvdimm is not yet supported");
> > +        return;
> > +    }
> Maybe this patch should be moved to the end since, from this patch
> onwards, it becomes possible to instantiate a DIMM slot (with a cold-plug
> qemu command line) that won't work properly. Or we simply disable
> cold-plug for PCDIMM as well for the moment?

Ok. I will check this and change accordingly.

Thanks,
Shameer

> Thanks
> 
> Eric
> > +
> > +    pc_dimm_pre_plug(PC_DIMM(dev), MACHINE(hotplug_dev), NULL,
> errp);
> > +}
> > +
> > +static void virt_memory_plug(HotplugHandler *hotplug_dev,
> > +                             DeviceState *dev, Error **errp)
> > +{
> > +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> > +    Error *local_err = NULL;
> > +
> > +    pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> > +
> > +    error_propagate(errp, local_err);
> > +}
> > +
> > +static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
> > +                                            DeviceState *dev,
> Error **errp)
> > +{
> > +    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> > +        virt_memory_pre_plug(hotplug_dev, dev, errp);
> > +    }
> > +}
> > +
> >  static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
> >                                          DeviceState *dev, Error
> **errp)
> >  {
> > @@ -1817,12 +1855,23 @@ static void
> virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
> >                                       SYS_BUS_DEVICE(dev));
> >          }
> >      }
> > +    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> > +        virt_memory_plug(hotplug_dev, dev, errp);
> > +    }
> > +}
> > +
> > +static void virt_machine_device_unplug_request_cb(HotplugHandler
> *hotplug_dev,
> > +                                          DeviceState *dev, Error
> **errp)
> > +{
> > +    error_setg(errp, "device unplug request for unsupported device"
> > +               " type: %s", object_get_typename(OBJECT(dev)));
> >  }
> >
> >  static HotplugHandler *virt_machine_get_hotplug_handler(MachineState
> *machine,
> >
> DeviceState *dev)
> >  {
> > -    if (object_dynamic_cast(OBJECT(dev), TYPE_SYS_BUS_DEVICE)) {
> > +    if (object_dynamic_cast(OBJECT(dev), TYPE_SYS_BUS_DEVICE) ||
> > +       (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM))) {
> >          return HOTPLUG_HANDLER(machine);
> >      }
> >
> > @@ -1886,7 +1935,9 @@ static void virt_machine_class_init(ObjectClass
> *oc, void *data)
> >      mc->kvm_type = virt_kvm_type;
> >      assert(!mc->get_hotplug_handler);
> >      mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
> > +    hc->pre_plug = virt_machine_device_pre_plug_cb;
> >      hc->plug = virt_machine_device_plug_cb;
> > +    hc->unplug_request = virt_machine_device_unplug_request_cb;
> >  }
> >
> >  static void virt_instance_init(Object *obj)
> >

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [PATCH v3 09/10] hw/acpi: Add ACPI Generic Event Device Support
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 09/10] hw/acpi: Add ACPI Generic Event Device Support Shameer Kolothum
@ 2019-03-29 13:09   ` Auger Eric
  2019-03-29 13:44     ` Shameerali Kolothum Thodi
  0 siblings, 1 reply; 95+ messages in thread
From: Auger Eric @ 2019-03-29 13:09 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-devel, qemu-arm, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

Hi Shameer,

On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> From: Samuel Ortiz <sameo@linux.intel.com>
> 
> The ACPI Generic Event Device (GED) is a hardware-reduced specific
> device that handles all platform events, including the hotplug ones.
> This patch generates the AML code that defines GEDs.
> 
> Platforms need to specify their own GedEvent array to describe what
> kind of events they want to support through GED.  Also this uses a
> single interrupt for the GED device, relying on an IO memory region
> to communicate the type of device affected by the interrupt. This
> way, we can support up to 32 events with a unique interrupt.
> 
> This is in preparation for making use of GED for ARM/virt
> platform and for now supports only memory hotplug.
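
As a rough picture of the single-interrupt scheme (plain C, not the AML this
patch generates): the selector value is read once and each set bit maps to
one event handler, which is why up to 32 events fit behind one interrupt.
The bit assignment, the handler table and all names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GED_MAX_EVENTS 32

/* Hypothetical bit assignment; the patch only handles memory hotplug. */
#define EVT_MEMORY_HOTPLUG (1u << 0)

typedef void (*ged_handler)(void *opaque);

/*
 * Dispatch every event whose bit is set in the selector value read from
 * the GED IO region. Returns the number of handlers that were called.
 */
static unsigned ged_dispatch(uint32_t sel, const ged_handler *handlers,
                             void *opaque)
{
    unsigned called = 0;
    unsigned bit;

    assert(handlers != NULL);

    for (bit = 0; bit < GED_MAX_EVENTS; bit++) {
        uint32_t mask = UINT32_C(1) << bit;

        /* Mirrors the per-event "If ((Local0 & irqN) == irqN)" test. */
        if ((sel & mask) == mask && handlers[bit]) {
            handlers[bit](opaque);
            called++;
        }
    }
    return called;
}

/* Example handler: counts how many memory scans were requested. */
static void memory_scan(void *opaque)
{
    (*(unsigned *)opaque)++;
}
```

Bits without a registered handler are simply ignored here; whether the real
device should treat them as an error is a separate question.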

Personally I would squash this with PATCH 3.
> 
> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
> Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  hw/acpi/generic_event_device.c         | 200 +++++++++++++++++++++++++++++++++
>  include/hw/acpi/generic_event_device.h |  34 ++++++
>  2 files changed, 234 insertions(+)
> 
> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> index 0b32fc9..9deaa33 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -23,6 +23,183 @@
>  #include "hw/acpi/generic_event_device.h"
>  #include "hw/mem/pc-dimm.h"
>  
> +static hwaddr ged_io_base;
> +static GedEvent *ged_events;
> +static uint32_t ged_events_size;
> +
> +static Aml *ged_event_aml(const GedEvent *event)
> +{
> +
> +    if (!event) {
> +        return NULL;
> +    }
> +
> +    switch (event->event) {
> +    case GED_MEMORY_HOTPLUG:
> +        /* We run a complete memory SCAN when getting a memory hotplug event */
> +        return aml_call0(MEMORY_DEVICES_CONTAINER "." MEMORY_SLOT_SCAN_METHOD);
> +    default:
> +        break;
> +    }
> +
> +    return NULL;
> +}
> +
> +/*
> + * The ACPI Generic Event Device (GED) is a hardware-reduced specific
> + * device[ACPI v6.1 Section 5.6.9] that handles all platform events,
> + * including the hotplug ones. Platforms need to specify their own
> + * GedEvent array to describe what kind of events they want to support
> + * through GED. This routine uses a single interrupt for the GED device,
> + * relying on IO memory region to communicate the type of device
> + * affected by the interrupt. This way, we can support up to 32 events
> + * with a unique interrupt.

During the last review I asked the question below. I may have missed
your answer, so I am asking again just in case.

5.6.9.1 says:
"
The platform declare its support for the GED, and query whether an OS
supports it, via the _OSC method
"
Is this done anywhere?
> + */
> +void build_ged_aml(Aml *table, const char *name, uint32_t ged_irq,
> +                   AmlRegionSpace rs)
> +{
> +    Aml *crs = aml_resource_template();
> +    Aml *evt, *field;
> +    Aml *dev = aml_device("%s", name);
> +    Aml *irq_sel = aml_local(0);
> +    Aml *isel = aml_name(AML_GED_IRQ_SEL);
> +    uint32_t i;
> +
> +    if (!ged_io_base || !ged_events || !ged_events_size) {
> +        return;
> +    }
> +
> +    /* _CRS interrupt */
> +    aml_append(crs, aml_interrupt(AML_CONSUMER, AML_EDGE, AML_ACTIVE_HIGH,
> +                                  AML_EXCLUSIVE, &ged_irq, 1));
> +    /*
> +     * For each GED event we:
> +     * - Add an interrupt to the CRS section.
> +     * - Add a conditional block for each event, inside a while loop.
> +     *   This is semantically equivalent to a switch/case implementation.
> +     */
> +    evt = aml_method("_EVT", 1, AML_SERIALIZED);
> +    {
> +        Aml *ged_aml;
> +        Aml *if_ctx;
> +
> +        /* Local0 = ISEL */
> +        aml_append(evt, aml_store(isel, irq_sel));
> +
> +        /*
> +         * Here we want to call a method for each supported GED event type.
> +         * The resulting ASL code looks like:
> +         *
> +         * Local0 = ISEL
> +         * If ((Local0 & irq0) == irq0)
> +         * {
> +         *     MethodEvent0()
> +         * }
> +         *
> +         * If ((Local0 & irq1) == irq1)
> +         * {
> +         *     MethodEvent1()
> +         * }
I think we could have stopped here ;-) with a ../..
> +         *
> +         * If ((Local0 & irq2) == irq2)
> +         * {
> +         *     MethodEvent2()
> +         * }
> +         */
> +
> +        for (i = 0; i < ged_events_size; i++) {
> +            ged_aml = ged_event_aml(&ged_events[i]);
> +            if (!ged_aml) {
> +                continue;
> +            }
> +
> +            /* If ((Local1 == irq))*/
> +            if_ctx = aml_if(aml_equal(aml_and(irq_sel, aml_int(ged_events[i].selector), NULL), aml_int(ged_events[i].selector)));
Doesn't checkpatch complain here?
> +            {
> +                /* AML for this specific type of event */
> +                aml_append(if_ctx, ged_aml);
> +            }
> +
> +            /*
> +             * We append the first "if" to the "while" context.
> +             * Other "ifs" will be "elseifs".
> +             */
> +            aml_append(evt, if_ctx);
> +        }
> +    }
> +
> +    aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0013")));
> +    aml_append(dev, aml_name_decl("_UID", aml_string(GED_DEVICE)));
> +    aml_append(dev, aml_name_decl("_CRS", crs));
> +
> +    /* Append IO region */
> +    aml_append(dev, aml_operation_region(AML_GED_IRQ_REG, rs,
> +               aml_int(ged_io_base + ACPI_GED_IRQ_SEL_OFFSET),
> +               ACPI_GED_IRQ_SEL_LEN));
> +    field = aml_field(AML_GED_IRQ_REG, AML_DWORD_ACC, AML_NOLOCK,
> +                      AML_WRITE_AS_ZEROS);
> +    aml_append(field, aml_named_field(AML_GED_IRQ_SEL,
> +                                      ACPI_GED_IRQ_SEL_LEN * 8));
> +    aml_append(dev, field);
> +
> +    /* Append _EVT method */
> +    aml_append(dev, evt);
> +
> +    aml_append(table, dev);
> +}
> +
> +/* Memory read by the GED _EVT AML dynamic method */
> +static uint64_t ged_read(void *opaque, hwaddr addr, unsigned size)
> +{
> +    uint64_t val = 0;
> +    GEDState *ged_st = opaque;
> +
> +    switch (addr) {
> +    case ACPI_GED_IRQ_SEL_OFFSET:
> +        /* Read the selector value and reset it */
> +        qemu_mutex_lock(&ged_st->lock);
> +        val = ged_st->sel;
> +        ged_st->sel = 0;
> +        qemu_mutex_unlock(&ged_st->lock);
> +        break;
> +    default:
> +        break;
> +    }
> +
> +    return val;
> +}
> +
> +/* Nothing is expected to be written to the GED memory region */
> +static void ged_write(void *opaque, hwaddr addr, uint64_t data,
> +                      unsigned int size)
> +{
> +}
> +
> +static const MemoryRegionOps ged_ops = {
> +    .read = ged_read,
> +    .write = ged_write,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 4,
> +        .max_access_size = 4,
> +    },
> +};
> +
> +static void acpi_ged_event(GEDState *ged_st, uint32_t ged_irq_sel)
> +{
> +    /*
> +     * Set the GED IRQ selector to the expected device type value. This
> +     * way, the ACPI method will be able to trigger the right code based
> +     * on a unique IRQ.
> +     */
> +    qemu_mutex_lock(&ged_st->lock);
> +    ged_st->sel = ged_irq_sel;
> +    qemu_mutex_unlock(&ged_st->lock);
> +
> +    /* Trigger the event by sending an interrupt to the guest. */
> +    qemu_irq_pulse(ged_st->gsi[ged_st->irq]);
I don't get this. The device uses a single irq, right?

Why can't we do what other sysbus devices do, i.e. call
sysbus_init_irq(dev, &s->irq); and then use
sysbus_connect_irq(SYS_BUS_DEVICE(dev), i, pic); in virt?
> +}
> +
>  static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
>                                  DeviceState *dev, Error **errp)
>  {
> @@ -40,6 +217,21 @@ static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
>  
>  static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
>  {
> +    VirtAcpiState *s = VIRT_ACPI(adev);
> +    uint32_t sel;
> +
> +    if (ev & ACPI_MEMORY_HOTPLUG_STATUS) {
> +        sel = ACPI_GED_IRQ_SEL_MEM;
> +    } else {
> +        /* Unknown event. Return without generating interrupt. */
> +        return;
> +    }
> +
> +    /*
> +     * We inject the hotplug interrupt. The IRQ selector will make
> +     * the difference from the ACPI table.
> +     */
> +    acpi_ged_event(&s->ged_state, sel);
>  }
>  
>  static void virt_device_realize(DeviceState *dev, Error **errp)
> @@ -57,6 +249,11 @@ static Property virt_acpi_properties[] = {
>      DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base, 0),
>      DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
>                       memhp_state.is_enabled, true),
It may be worth explaining why the GED device owns the
MEMORY_HOTPLUG_MMIO region. I admit this confused me before and
still does.
> +    DEFINE_PROP_PTR("gsi", VirtAcpiState, gsi),
See the comment about the irq above.
> +    DEFINE_PROP_UINT64("ged_base", VirtAcpiState, ged_base, 0),
> +    DEFINE_PROP_UINT32("ged_irq", VirtAcpiState, ged_irq, 0),
> +    DEFINE_PROP_PTR("ged_events", VirtAcpiState, ged_events),
> +    DEFINE_PROP_UINT32("ged_events_size", VirtAcpiState, ged_events_size, 0),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> @@ -70,6 +267,9 @@ static void virt_acpi_class_init(ObjectClass *class, void *data)
>      dc->props = virt_acpi_properties;
>      dc->realize = virt_device_realize;
>  
> +    /* Reason: pointer properties "gsi" and "gde_events" */
ged_events
> +    dc->user_creatable = false;
> +
>      hc->plug = virt_device_plug_cb;
>  
>      adevc->send_event = virt_send_ged;
> diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
> index 262ca7d..7f130f3 100644
> --- a/include/hw/acpi/generic_event_device.h
> +++ b/include/hw/acpi/generic_event_device.h
> @@ -24,11 +24,45 @@
>  #define VIRT_ACPI(obj) \
>      OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
>  
> +#define ACPI_GED_IRQ_SEL_OFFSET 0x0
> +#define ACPI_GED_IRQ_SEL_LEN    0x4
> +#define ACPI_GED_IRQ_SEL_MEM    0x1
> +#define ACPI_GED_REG_LEN        0x4
> +
> +#define GED_DEVICE      "GED"
> +#define AML_GED_IRQ_REG "IREG"
> +#define AML_GED_IRQ_SEL "ISEL"
> +
> +typedef enum {
> +    GED_MEMORY_HOTPLUG = 1,
> +} GedEventType;
> +
> +typedef struct GedEvent {
> +    uint32_t     selector;
> +    GedEventType event;
> +} GedEvent;
> +
> +typedef struct GEDState {
> +    MemoryRegion io;
> +    uint32_t     sel;
> +    uint32_t     irq;
> +    qemu_irq     *gsi;
> +    QemuMutex    lock;
> +} GEDState;
> +
>  typedef struct VirtAcpiState {
>      SysBusDevice parent_obj;
>      MemHotplugState memhp_state;
>      hwaddr memhp_base;
> +    void *gsi;
> +    hwaddr ged_base;
> +    GEDState ged_state;
> +    uint32_t ged_irq;
> +    void *ged_events;
> +    uint32_t ged_events_size;
>  } VirtAcpiState;
>  
> +void build_ged_aml(Aml *table, const char* name, uint32_t ged_irq,
> +                   AmlRegionSpace rs);
>  
>  #endif
> 
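
To make the selector mechanism discussed above easier to follow, here is
a small standalone C model of it. It is only a sketch, not QEMU code:
GedModel, ged_model_raise(), ged_model_read(), ged_model_evt() and
SEL_DEMO are invented names, and there is no locking or interrupt
delivery. It shows the device side storing a selector bit, the read side
returning and clearing it (as ged_read() does), and the _EVT side testing
each event mask with an independent If, which is what the posted loop
appears to generate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEL_MEM  0x1u   /* mirrors ACPI_GED_IRQ_SEL_MEM from the patch */
#define SEL_DEMO 0x2u   /* hypothetical second event, illustration only */

typedef struct {
    uint32_t sel;       /* pending selector value, like GEDState.sel */
} GedModel;

/* Device side: store the selector, as acpi_ged_event() does with '='. */
static void ged_model_raise(GedModel *g, uint32_t mask)
{
    assert(mask != 0);
    g->sel = mask;
}

/* Guest read of ISEL: return the selector and reset it to zero. */
static uint32_t ged_model_read(GedModel *g)
{
    uint32_t val = g->sel;

    g->sel = 0;
    return val;
}

/*
 * Model of the generated _EVT body: Local0 = ISEL, then one independent
 * If per event. Returns the bitmap of events whose handler would run.
 */
static uint32_t ged_model_evt(GedModel *g)
{
    static const uint32_t masks[] = { SEL_MEM, SEL_DEMO };
    uint32_t local0 = ged_model_read(g);
    uint32_t ran = 0;
    size_t i;

    for (i = 0; i < sizeof(masks) / sizeof(masks[0]); i++) {
        if ((local0 & masks[i]) == masks[i]) {
            ran |= masks[i];
        }
    }
    return ran;
}

/* Usage example: raise a memory event, handle it, then read again. */
static void ged_model_selfcheck(void)
{
    GedModel g = { 0 };

    ged_model_raise(&g, SEL_MEM);
    assert(ged_model_evt(&g) == SEL_MEM);
    assert(ged_model_evt(&g) == 0);   /* selector was cleared by the read */
}
```

Since each mask is tested on its own, a selector value with several bits
set would run several handlers in one _EVT call in this model.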

Thanks

Eric


* Re: [Qemu-arm] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-03-29  9:59     ` Shameerali Kolothum Thodi
@ 2019-03-29 13:12       ` Auger Eric
  2019-03-29 13:14         ` Ard Biesheuvel
  0 siblings, 1 reply; 95+ messages in thread
From: Auger Eric @ 2019-03-29 13:12 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, imammedo@redhat.com,
	peter.maydell@linaro.org, shannon.zhaosl@gmail.com,
	sameo@linux.intel.com, sebastien.boeuf@intel.com
  Cc: Leif Lindholm, Laszlo Ersek, Linuxarm, xuwei (O), Ard Biesheuvel

Hi Shameer,

On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:
> 
> 
>> -----Original Message-----
>> From: Auger Eric [mailto:eric.auger@redhat.com]
>> Sent: 29 March 2019 09:32
>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>
>> Hi Shameer,
>>
>> [ + Laszlo, Ard, Leif ]
>>
>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
>>> This is to disable/enable populating DT nodes in case
>>> any conflict with acpi tables. The default is "off".
>> The name of the option sounds misleading to me. Also we don't really
>> know the scope of the disablement. At the moment this just aims to
>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
>>
>>>
>>> This will be used in subsequent patch where cold plug
>>> device-memory support is added for DT boot.
>> I am concerned about the fact that in dt mode, by default, you won't see
>> any PCDIMM nodes.
>>>
>>> If DT memory node support is added for cold-plugged device
>>> memory, those memory will be visible to Guest kernel via
>>> UEFI GetMemoryMap() and gets treated as early boot memory.
>> Don't we have an issue in UEFI then. Normally the SRAT indicates whether
>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>> info.
> 
> Sorry I missed this part. Yes, that will be a more cleaner solution.
> 
> Also, to be more clear on what happens,
> 
> Guest ACPI boot with "fdt=on" ,
> 
> From kernel log,
> 
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
> 
> 
> Guest ACPI boot with "fdt=off" ,
> 
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff
> 
> The hotpluggable memory node is absent from early memory nodes here.

OK thank you for the example illustrating the concern.
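
For reference, the sizes behind the two logs can be checked with a few
lines of C. This is just arithmetic on the ranges quoted above; MemRange,
range_size() and range_selfcheck() are names made up for the sketch. The
"Initmem setup" span is 3 GiB with fdt=on and 2 GiB with fdt=off, and the
1 GiB difference is exactly the 0xc0000000-0xffffffff node from DT.

```c
#include <assert.h>
#include <stdint.h>

/* Physical range as printed in the kernel log; 'end' is inclusive. */
typedef struct {
    uint64_t start;
    uint64_t end;
} MemRange;

static uint64_t range_size(MemRange r)
{
    assert(r.end >= r.start);
    return r.end - r.start + 1;
}

/* Check the ranges taken from the two boot logs against each other. */
static void range_selfcheck(void)
{
    const uint64_t GiB = 1ULL << 30;
    MemRange fdt_on  = { 0x0000000040000000ULL, 0x00000000ffffffffULL };
    MemRange fdt_off = { 0x0000000040000000ULL, 0x00000000bfffffffULL };
    MemRange dt_node = { 0x00000000c0000000ULL, 0x00000000ffffffffULL };

    assert(range_size(fdt_on) == 3 * GiB);
    assert(range_size(fdt_off) == 2 * GiB);
    assert(range_size(dt_node) == 1 * GiB);
    assert(range_size(fdt_on) - range_size(fdt_off) == range_size(dt_node));
}
```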
> 
> As you said, it could be possible to detect this node using SRAT in UEFI.

Let's wait for EDK2 experts on this.

Thanks

Eric
> 
> Cheers,
> Shameer
> 
>>> Hence memory becomes non hot-un-unpluggable even if Guest
>>> is booted in ACPI mode.
>>
>>
>>
>>>
>>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>>> ---
>>>  hw/arm/virt.c         | 23 +++++++++++++++++++++++
>>>  include/hw/arm/virt.h |  1 +
>>>  2 files changed, 24 insertions(+)
>>>
>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>> index 13db0e9..b602151 100644
>>> --- a/hw/arm/virt.c
>>> +++ b/hw/arm/virt.c
>>> @@ -1717,6 +1717,20 @@ static void virt_set_highmem(Object *obj, bool
>> value, Error **errp)
>>>      vms->highmem = value;
>>>  }
>>>
>>> +static bool virt_get_fdt(Object *obj, Error **errp)
>>> +{
>>> +    VirtMachineState *vms = VIRT_MACHINE(obj);
>>> +
>>> +    return vms->use_fdt;
>>> +}
>>> +
>>> +static void virt_set_fdt(Object *obj, bool value, Error **errp)
>>> +{
>>> +    VirtMachineState *vms = VIRT_MACHINE(obj);
>>> +
>>> +    vms->use_fdt = value;
>>> +}
>>> +
>>>  static bool virt_get_its(Object *obj, Error **errp)
>>>  {
>>>      VirtMachineState *vms = VIRT_MACHINE(obj);
>>> @@ -2005,6 +2019,15 @@ static void virt_instance_init(Object *obj)
>>>      object_property_set_description(obj, "gic-version",
>>>                                      "Set GIC version. "
>>>                                      "Valid values are 2, 3 and host",
>> NULL);
>>> +    /* fdt is disabled by default */
>>> +    vms->use_fdt = false;
>>> +    object_property_add_bool(obj, "fdt", virt_get_fdt,
>>> +                             virt_set_fdt, NULL);
>>> +    object_property_set_description(obj, "fdt",
>>> +                                    "Set on/off to enable/disable
>> device tree "
>>> +                                    "nodes in case any conflict with
>> ACPI"
>> in case of
>>
>> Thanks
>>
>> Eric
>>> +                                    "(eg: device memory node)",
>>> +                                    NULL);
>>>
>>>      vms->highmem_ecam = !vmc->no_highmem_ecam;
>>>
>>> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
>>> index c5e4c96..14b2e0a 100644
>>> --- a/include/hw/arm/virt.h
>>> +++ b/include/hw/arm/virt.h
>>> @@ -119,6 +119,7 @@ typedef struct {
>>>      bool highmem_ecam;
>>>      bool its;
>>>      bool virt;
>>> +    bool use_fdt;
>>>      int32_t gic_version;
>>>      VirtIOMMUType iommu;
>>>      struct arm_boot_info bootinfo;
>>>


* Re: [Qemu-arm] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-03-29 13:12       ` Auger Eric
@ 2019-03-29 13:14         ` Ard Biesheuvel
  2019-03-29 13:56           ` [Qemu-arm] [Qemu-devel] " Auger Eric
  0 siblings, 1 reply; 95+ messages in thread
From: Ard Biesheuvel @ 2019-03-29 13:14 UTC (permalink / raw)
  To: Auger Eric
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com,
	qemu-devel@nongnu.org, Shameerali Kolothum Thodi, Linuxarm,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	imammedo@redhat.com, sebastien.boeuf@intel.com, Laszlo Ersek,
	Leif Lindholm

On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:
>
> Hi Shameer,
>
> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:
> >
> >
> >> -----Original Message-----
> >> From: Auger Eric [mailto:eric.auger@redhat.com]
> >> Sent: 29 March 2019 09:32
> >> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> >> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
> >> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> >> sameo@linux.intel.com; sebastien.boeuf@intel.com
> >> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
> >> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
> >> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
> >> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
> >>
> >> Hi Shameer,
> >>
> >> [ + Laszlo, Ard, Leif ]
> >>
> >> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> >>> This is to disable/enable populating DT nodes in case
> >>> any conflict with acpi tables. The default is "off".
> >> The name of the option sounds misleading to me. Also we don't really
> >> know the scope of the disablement. At the moment this just aims to
> >> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
> >>
> >>>
> >>> This will be used in subsequent patch where cold plug
> >>> device-memory support is added for DT boot.
> >> I am concerned about the fact that in dt mode, by default, you won't see
> >> any PCDIMM nodes.
> >>>
> >>> If DT memory node support is added for cold-plugged device
> >>> memory, those memory will be visible to Guest kernel via
> >>> UEFI GetMemoryMap() and gets treated as early boot memory.
> >> Don't we have an issue in UEFI then. Normally the SRAT indicates whether
> >> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
> >> info.
> >
> > Sorry I missed this part. Yes, that will be a more cleaner solution.
> >
> > Also, to be more clear on what happens,
> >
> > Guest ACPI boot with "fdt=on" ,
> >
> > From kernel log,
> >
> > [    0.000000] Early memory node ranges
> > [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
> > [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
> > [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
> > [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
> > [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
> > [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
> > [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
> > [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
> > [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
> > [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
> > [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
> > [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
> >
> >
> > Guest ACPI boot with "fdt=off" ,
> >
> > [    0.000000] Movable zone start for each node
> > [    0.000000] Early memory node ranges
> > [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
> > [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
> > [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
> > [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
> > [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
> > [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
> > [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
> > [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
> > [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
> > [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
> > [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff
> >
> > The hotpluggable memory node is absent from early memory nodes here.
>
> OK thank you for the example illustrating the concern.
> >
> > As you said, it could be possible to detect this node using SRAT in UEFI.
>
> Let's wait for EDK2 experts on this.
>

Happy to chime in, but I need a bit more context here.

What is the problem, how does this patch try to solve it, and why is
that a bad idea?


* Re: [Qemu-arm] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-03-29  9:41     ` Shameerali Kolothum Thodi
@ 2019-03-29 13:41       ` Auger Eric
  0 siblings, 0 replies; 95+ messages in thread
From: Auger Eric @ 2019-03-29 13:41 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, imammedo@redhat.com,
	peter.maydell@linaro.org, shannon.zhaosl@gmail.com,
	sameo@linux.intel.com
  Cc: Leif Lindholm, Laszlo Ersek, Linuxarm, xuwei (O), Ard Biesheuvel

Hi Shameer,
On 3/29/19 10:41 AM, Shameerali Kolothum Thodi wrote:
> Hi Eric,
> 
>> -----Original Message-----
>> From: Auger Eric [mailto:eric.auger@redhat.com]
>> Sent: 29 March 2019 09:32
>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>
>> Hi Shameer,
>>
>> [ + Laszlo, Ard, Leif ]
>>
>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
>>> This is to disable/enable populating DT nodes in case
>>> any conflict with acpi tables. The default is "off".
>> The name of the option sounds misleading to me. Also we don't really
>> know the scope of the disablement. At the moment this just aims to
>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
> 
> Yes, I was not very happy with the name "fdt". If this is not useful other than
> this device memory conflict case, then we can be more specific. But I am not sure
> we might need this for any other future conflicts, hence a more generic name. 
> 
> May be "force_fdt" or "dimm_fdt"? I am open to suggestions.
mask_spurious_dt_nodes. But I am unsure this is the way we should go.
> 
>>>
>>> This will be used in subsequent patch where cold plug
>>> device-memory support is added for DT boot.
>> I am concerned about the fact that in dt mode, by default, you won't see
>> any PCDIMM nodes.
> 
> True. But is there any other way to detect that the Guest is using DT?
I don't know of any.

In machvirt_init there is firmware_loaded, which tells you whether you
have a FW image. If it is not set, you can infer DT. But if there
is a FW, the guest can be booted with either DT or ACPI. You also have
the acpi_enabled knob.
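
Roughly, the inference would look like the standalone sketch below. It is
not taken from virt.c: BootTables, infer_boot_tables() and
infer_selfcheck() are invented names, the two bools only stand in for
firmware_loaded and acpi_enabled, and treating "ACPI disabled" as "DT
only" is my assumption.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    BOOT_DT_ONLY,    /* the guest can only be consuming the device tree */
    BOOT_UNDECIDED,  /* firmware present: guest may use DT or ACPI */
} BootTables;

/*
 * Without a firmware image the guest has to boot from DT. With one,
 * QEMU cannot tell which description the guest will pick, unless ACPI
 * is switched off (assumed here to leave DT as the only option).
 */
static BootTables infer_boot_tables(bool firmware_loaded, bool acpi_enabled)
{
    if (!firmware_loaded || !acpi_enabled) {
        return BOOT_DT_ONLY;
    }
    return BOOT_UNDECIDED;
}

/* Usage example covering the cases mentioned above. */
static void infer_selfcheck(void)
{
    assert(infer_boot_tables(false, true) == BOOT_DT_ONLY);
    assert(infer_boot_tables(true, false) == BOOT_DT_ONLY);
    assert(infer_boot_tables(true, true) == BOOT_UNDECIDED);
}
```

The undecided case is the one that matters here: with a FW image and ACPI
enabled there is nothing on the QEMU side that says whether the guest
ends up using DT or ACPI.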

Thanks

Eric
> 
> Thanks,
> Shameer
> 
>>> If DT memory node support is added for cold-plugged device
>>> memory, those memory will be visible to Guest kernel via
>>> UEFI GetMemoryMap() and gets treated as early boot memory.
>> Don't we have an issue in UEFI then. Normally the SRAT indicates whether
>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>> info.
>>> Hence memory becomes non hot-un-unpluggable even if Guest
>>> is booted in ACPI mode.
>>
>>
>>
>>>
>>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>>> ---
>>>  hw/arm/virt.c         | 23 +++++++++++++++++++++++
>>>  include/hw/arm/virt.h |  1 +
>>>  2 files changed, 24 insertions(+)
>>>
>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>> index 13db0e9..b602151 100644
>>> --- a/hw/arm/virt.c
>>> +++ b/hw/arm/virt.c
>>> @@ -1717,6 +1717,20 @@ static void virt_set_highmem(Object *obj, bool
>> value, Error **errp)
>>>      vms->highmem = value;
>>>  }
>>>
>>> +static bool virt_get_fdt(Object *obj, Error **errp)
>>> +{
>>> +    VirtMachineState *vms = VIRT_MACHINE(obj);
>>> +
>>> +    return vms->use_fdt;
>>> +}
>>> +
>>> +static void virt_set_fdt(Object *obj, bool value, Error **errp)
>>> +{
>>> +    VirtMachineState *vms = VIRT_MACHINE(obj);
>>> +
>>> +    vms->use_fdt = value;
>>> +}
>>> +
>>>  static bool virt_get_its(Object *obj, Error **errp)
>>>  {
>>>      VirtMachineState *vms = VIRT_MACHINE(obj);
>>> @@ -2005,6 +2019,15 @@ static void virt_instance_init(Object *obj)
>>>      object_property_set_description(obj, "gic-version",
>>>                                      "Set GIC version. "
>>>                                      "Valid values are 2, 3 and host",
>> NULL);
>>> +    /* fdt is disabled by default */
>>> +    vms->use_fdt = false;
>>> +    object_property_add_bool(obj, "fdt", virt_get_fdt,
>>> +                             virt_set_fdt, NULL);
>>> +    object_property_set_description(obj, "fdt",
>>> +                                    "Set on/off to enable/disable
>> device tree "
>>> +                                    "nodes in case any conflict with
>> ACPI"
>> in case of
>>
>> Thanks
>>
>> Eric
>>> +                                    "(eg: device memory node)",
>>> +                                    NULL);
>>>
>>>      vms->highmem_ecam = !vmc->no_highmem_ecam;
>>>
>>> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
>>> index c5e4c96..14b2e0a 100644
>>> --- a/include/hw/arm/virt.h
>>> +++ b/include/hw/arm/virt.h
>>> @@ -119,6 +119,7 @@ typedef struct {
>>>      bool highmem_ecam;
>>>      bool its;
>>>      bool virt;
>>> +    bool use_fdt;
>>>      int32_t gic_version;
>>>      VirtIOMMUType iommu;
>>>      struct arm_boot_info bootinfo;
>>>


* Re: [Qemu-arm] [PATCH v3 09/10] hw/acpi: Add ACPI Generic Event Device Support
  2019-03-29 13:09   ` [Qemu-arm] " Auger Eric
@ 2019-03-29 13:44     ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-03-29 13:44 UTC (permalink / raw)
  To: Auger Eric, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	imammedo@redhat.com, peter.maydell@linaro.org,
	shannon.zhaosl@gmail.com, sameo@linux.intel.com,
	sebastien.boeuf@intel.com
  Cc: Linuxarm, xuwei (O)

Hi Eric,

> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 29 March 2019 13:09
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> sameo@linux.intel.com; sebastien.boeuf@intel.com
> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>
> Subject: Re: [PATCH v3 09/10] hw/acpi: Add ACPI Generic Event Device Support
> 
> Hi Shameer,
> 
> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> > From: Samuel Ortiz <sameo@linux.intel.com>
> >
> > The ACPI Generic Event Device (GED) is a hardware-reduced specific
> > device that handles all platform events, including the hotplug ones.
> > This patch generates the AML code that defines GEDs.
> >
> > Platforms need to specify their own GedEvent array to describe what
> > kind of events they want to support through GED.  Also this uses a
> > a single interrupt for the  GED device, relying on IO memory region
> > to communicate the type of device affected by the interrupt. This
> > way, we can support up to 32 events with a unique interrupt.
> >
> > This is in preparation for making use of GED for ARM/virt
> > platform and for now supports only memory hotplug.
> 
> Personally I would squash this with PATCH 3.

Ok.

> > Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
> > Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > ---
> >  hw/acpi/generic_event_device.c         | 200
> +++++++++++++++++++++++++++++++++
> >  include/hw/acpi/generic_event_device.h |  34 ++++++
> >  2 files changed, 234 insertions(+)
> >
> > diff --git a/hw/acpi/generic_event_device.c
> b/hw/acpi/generic_event_device.c
> > index 0b32fc9..9deaa33 100644
> > --- a/hw/acpi/generic_event_device.c
> > +++ b/hw/acpi/generic_event_device.c
> > @@ -23,6 +23,183 @@
> >  #include "hw/acpi/generic_event_device.h"
> >  #include "hw/mem/pc-dimm.h"
> >
> > +static hwaddr ged_io_base;
> > +static GedEvent *ged_events;
> > +static uint32_t ged_events_size;
> > +
> > +static Aml *ged_event_aml(const GedEvent *event)
> > +{
> > +
> > +    if (!event) {
> > +        return NULL;
> > +    }
> > +
> > +    switch (event->event) {
> > +    case GED_MEMORY_HOTPLUG:
> > +        /* We run a complete memory SCAN when getting a memory
> hotplug event */
> > +        return aml_call0(MEMORY_DEVICES_CONTAINER "."
> MEMORY_SLOT_SCAN_METHOD);
> > +    default:
> > +        break;
> > +    }
> > +
> > +    return NULL;
> > +}
> > +
> > +/*
> > + * The ACPI Generic Event Device (GED) is a hardware-reduced specific
> > + * device[ACPI v6.1 Section 5.6.9] that handles all platform events,
> > + * including the hotplug ones. Platforms need to specify their own
> > + * GedEvent array to describe what kind of events they want to support
> > + * through GED. This routine uses a single interrupt for the GED device,
> > + * relying on IO memory region to communicate the type of device
> > + * affected by the interrupt. This way, we can support up to 32 events
> > + * with a unique interrupt.
> 
> During last review I asked the question herefater. I may have missed
> your answer but just in case.
> 
> 5.6.9.1 says:
> "
> The platform declare its support for the GED, and query whether an OS
> supports it, via the _OSC method
> "
> Is it something done?

Yes you did raise this earlier and I had replied as well :).

https://patchwork.kernel.org/patch/10844557/

" Right. _OSC is not defined and I don’t see Qemu ever defined platform wide 
OSPM capabilities. I can see that it does that for PCI/PCIe hierarchies. 

Moreover looking at the Linux code, it doesn’t seems to care about
GED definition either,
https://elixir.bootlin.com/linux/v5.0/source/include/linux/acpi.h#L490

I can try adding _OSC declaring just the GED bit for future, but not sure
It makes much difference as of now. 

Please let me know if there is a strong argument for it.
"

> > + */
> > +void build_ged_aml(Aml *table, const char *name, uint32_t ged_irq,
> > +                   AmlRegionSpace rs)
> > +{
> > +    Aml *crs = aml_resource_template();
> > +    Aml *evt, *field;
> > +    Aml *dev = aml_device("%s", name);
> > +    Aml *irq_sel = aml_local(0);
> > +    Aml *isel = aml_name(AML_GED_IRQ_SEL);
> > +    uint32_t i;
> > +
> > +    if (!ged_io_base || !ged_events || !ged_events_size) {
> > +        return;
> > +    }
> > +
> > +    /* _CRS interrupt */
> > +    aml_append(crs, aml_interrupt(AML_CONSUMER, AML_EDGE,
> AML_ACTIVE_HIGH,
> > +                                  AML_EXCLUSIVE, &ged_irq, 1));
> > +    /*
> > +     * For each GED event we:
> > +     * - Add an interrupt to the CRS section.
> > +     * - Add a conditional block for each event, inside a while loop.
> > +     *   This is semantically equivalent to a switch/case implementation.
> > +     */
> > +    evt = aml_method("_EVT", 1, AML_SERIALIZED);
> > +    {
> > +        Aml *ged_aml;
> > +        Aml *if_ctx;
> > +
> > +        /* Local0 = ISEL */
> > +        aml_append(evt, aml_store(isel, irq_sel));
> > +
> > +        /*
> > +         * Here we want to call a method for each supported GED event type.
> > +         * The resulting ASL code looks like:
> > +         *
> > +         * Local0 = ISEL
> > +         * If ((Local0 & irq0) == irq0)
> > +         * {
> > +         *     MethodEvent0()
> > +         * }
> > +         *
> > +         * If ((Local0 & irq1) == irq1)
> > +         * {
> > +         *     MethodEvent1()
> > +         * }
> I think we could have stopped here ;-) with a ../..

Ok

> > +         *
> > +         * If ((Local0 & irq2) == irq2)
> > +         * {
> > +         *     MethodEvent2()
> > +         * }
> > +         */
> > +
> > +        for (i = 0; i < ged_events_size; i++) {
> > +            ged_aml = ged_event_aml(&ged_events[i]);
> > +            if (!ged_aml) {
> > +                continue;
> > +            }
> > +
> > +            /* If ((Local1 == irq))*/
> > +            if_ctx = aml_if(aml_equal(aml_and(irq_sel, aml_int(ged_events[i].selector), NULL), aml_int(ged_events[i].selector)));
> doesn't checkpatch complain here?

It did warn, but I thought this was more readable. I will address this.
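
For reference, the selector handling can be modelled outside QEMU. The
following is only a standalone sketch, not the patch code: ToyGed, the
toy_* helpers and SEL_OTHER are made up for illustration, and it ORs
pending bits into the selector (an assumption; the patch above assigns
the value). It shows the read-and-clear register together with the
one-test-per-event dispatch that the generated _EVT method performs.

```c
#include <assert.h>
#include <stdint.h>

/* Selector bits. Only the memory bit mirrors the patch
 * (ACPI_GED_IRQ_SEL_MEM); SEL_OTHER is a hypothetical second event. */
#define SEL_MEM   0x1u
#define SEL_OTHER 0x2u

/* Toy stand-in for GEDState: just the pending-event selector. */
typedef struct {
    uint32_t sel;
} ToyGed;

/* Host side: latch an event before pulsing the interrupt.
 * Assumption: bits are accumulated with OR so that two events raised
 * before the guest reads the register are both kept. */
static void toy_ged_raise(ToyGed *g, uint32_t sel_bit)
{
    assert(sel_bit != 0);
    g->sel |= sel_bit;
}

/* Guest read of the selector register: return the value and reset it. */
static uint32_t toy_ged_read_sel(ToyGed *g)
{
    uint32_t val = g->sel;

    g->sel = 0;
    return val;
}

/* _EVT-style dispatch: one independent "If ((Local0 & bit) == bit)"
 * test per event. Returns the mask of events that were handled. */
static uint32_t toy_evt_dispatch(uint32_t local0)
{
    uint32_t handled = 0;

    if ((local0 & SEL_MEM) == SEL_MEM) {
        handled |= SEL_MEM;      /* would call the memory scan method */
    }
    if ((local0 & SEL_OTHER) == SEL_OTHER) {
        handled |= SEL_OTHER;    /* would call another event method */
    }
    return handled;
}
```

Because every test is independent, a single interrupt can service
several pending events, and a second read of the register returns 0.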

> > +            {
> > +                /* AML for this specific type of event */
> > +                aml_append(if_ctx, ged_aml);
> > +            }
> > +
> > +            /*
> > +             * We append the first "if" to the "while" context.
> > +             * Other "ifs" will be "elseifs".
> > +             */
> > +            aml_append(evt, if_ctx);
> > +        }
> > +    }
> > +
> > +    aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0013")));
> > +    aml_append(dev, aml_name_decl("_UID", aml_string(GED_DEVICE)));
> > +    aml_append(dev, aml_name_decl("_CRS", crs));
> > +
> > +    /* Append IO region */
> > +    aml_append(dev, aml_operation_region(AML_GED_IRQ_REG, rs,
> > +               aml_int(ged_io_base + ACPI_GED_IRQ_SEL_OFFSET),
> > +               ACPI_GED_IRQ_SEL_LEN));
> > +    field = aml_field(AML_GED_IRQ_REG, AML_DWORD_ACC, AML_NOLOCK,
> > +                      AML_WRITE_AS_ZEROS);
> > +    aml_append(field, aml_named_field(AML_GED_IRQ_SEL,
> > +                                      ACPI_GED_IRQ_SEL_LEN * 8));
> > +    aml_append(dev, field);
> > +
> > +    /* Append _EVT method */
> > +    aml_append(dev, evt);
> > +
> > +    aml_append(table, dev);
> > +}
> > +
> > +/* Memory read by the GED _EVT AML dynamic method */
> > +static uint64_t ged_read(void *opaque, hwaddr addr, unsigned size)
> > +{
> > +    uint64_t val = 0;
> > +    GEDState *ged_st = opaque;
> > +
> > +    switch (addr) {
> > +    case ACPI_GED_IRQ_SEL_OFFSET:
> > +        /* Read the selector value and reset it */
> > +        qemu_mutex_lock(&ged_st->lock);
> > +        val = ged_st->sel;
> > +        ged_st->sel = 0;
> > +        qemu_mutex_unlock(&ged_st->lock);
> > +        break;
> > +    default:
> > +        break;
> > +    }
> > +
> > +    return val;
> > +}
> > +
> > +/* Nothing is expected to be written to the GED memory region */
> > +static void ged_write(void *opaque, hwaddr addr, uint64_t data,
> > +                      unsigned int size)
> > +{
> > +}
> > +
> > +static const MemoryRegionOps ged_ops = {
> > +    .read = ged_read,
> > +    .write = ged_write,
> > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > +    .valid = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 4,
> > +    },
> > +};
> > +
> > +static void acpi_ged_event(GEDState *ged_st, uint32_t ged_irq_sel)
> > +{
> > +    /*
> > +     * Set the GED IRQ selector to the expected device type value. This
> > +     * way, the ACPI method will be able to trigger the right code based
> > +     * on a unique IRQ.
> > +     */
> > +    qemu_mutex_lock(&ged_st->lock);
> > +    ged_st->sel = ged_irq_sel;
> > +    qemu_mutex_unlock(&ged_st->lock);
> > +
> > +    /* Trigger the event by sending an interrupt to the guest. */
> > +    qemu_irq_pulse(ged_st->gsi[ged_st->irq]);
> I don't get this. The device uses a single irq, right?

Yes, it uses a single irq.

> Why can't we do like other sysbus devices, sysbus_init_irq(dev, &s->irq);
> and use
> sysbus_connect_irq(SYS_BUS_DEVICE(dev), i, pic); in virt?

I have to take a look at this.

> > +}
> > +
> >  static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> >                                  DeviceState *dev, Error **errp)
> >  {
> > @@ -40,6 +217,21 @@ static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> >
> >  static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> >  {
> > +    VirtAcpiState *s = VIRT_ACPI(adev);
> > +    uint32_t sel;
> > +
> > +    if (ev & ACPI_MEMORY_HOTPLUG_STATUS) {
> > +        sel = ACPI_GED_IRQ_SEL_MEM;
> > +    } else {
> > +        /* Unknown event. Return without generating interrupt. */
> > +        return;
> > +    }
> > +
> > +    /*
> > +     * We inject the hotplug interrupt. The IRQ selector will make
> > +     * the difference from the ACPI table.
> > +     */
> > +    acpi_ged_event(&s->ged_state, sel);
> >  }
> >
> >  static void virt_device_realize(DeviceState *dev, Error **errp)
> > @@ -57,6 +249,11 @@ static Property virt_acpi_properties[] = {
> >      DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base, 0),
> >      DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
> >                       memhp_state.is_enabled, true),
> It may be worth explaining why the GED device owns the
> MEMORY_HOTPLUG_MMIO region. I acknowledge this was, and still is,
> confusing me.

I will try to add the comment here saying this is the hotplug handler/
acpi interface device and needs to initialize the memory hotplug base
region.

> > +    DEFINE_PROP_PTR("gsi", VirtAcpiState, gsi),
> see the comment about irq above.
> > +    DEFINE_PROP_UINT64("ged_base", VirtAcpiState, ged_base, 0),
> > +    DEFINE_PROP_UINT32("ged_irq", VirtAcpiState, ged_irq, 0),
> > +    DEFINE_PROP_PTR("ged_events", VirtAcpiState, ged_events),
> > +    DEFINE_PROP_UINT32("ged_events_size", VirtAcpiState, ged_events_size, 0),
> >      DEFINE_PROP_END_OF_LIST(),
> >  };
> >
> > @@ -70,6 +267,9 @@ static void virt_acpi_class_init(ObjectClass *class, void *data)
> >      dc->props = virt_acpi_properties;
> >      dc->realize = virt_device_realize;
> >
> > +    /* Reason: pointer properties "gsi" and "gde_events" */
> ged_events

Thanks,
Shameer

> > +    dc->user_creatable = false;
> > +
> >      hc->plug = virt_device_plug_cb;
> >
> >      adevc->send_event = virt_send_ged;
> > diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
> > index 262ca7d..7f130f3 100644
> > --- a/include/hw/acpi/generic_event_device.h
> > +++ b/include/hw/acpi/generic_event_device.h
> > @@ -24,11 +24,45 @@
> >  #define VIRT_ACPI(obj) \
> >      OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> >
> > +#define ACPI_GED_IRQ_SEL_OFFSET 0x0
> > +#define ACPI_GED_IRQ_SEL_LEN    0x4
> > +#define ACPI_GED_IRQ_SEL_MEM    0x1
> > +#define ACPI_GED_REG_LEN        0x4
> > +
> > +#define GED_DEVICE      "GED"
> > +#define AML_GED_IRQ_REG "IREG"
> > +#define AML_GED_IRQ_SEL "ISEL"
> > +
> > +typedef enum {
> > +    GED_MEMORY_HOTPLUG = 1,
> > +} GedEventType;
> > +
> > +typedef struct GedEvent {
> > +    uint32_t     selector;
> > +    GedEventType event;
> > +} GedEvent;
> > +
> > +typedef struct GEDState {
> > +    MemoryRegion io;
> > +    uint32_t     sel;
> > +    uint32_t     irq;
> > +    qemu_irq     *gsi;
> > +    QemuMutex    lock;
> > +} GEDState;
> > +
> >  typedef struct VirtAcpiState {
> >      SysBusDevice parent_obj;
> >      MemHotplugState memhp_state;
> >      hwaddr memhp_base;
> > +    void *gsi;
> > +    hwaddr ged_base;
> > +    GEDState ged_state;
> > +    uint32_t ged_irq;
> > +    void *ged_events;
> > +    uint32_t ged_events_size;
> >  } VirtAcpiState;
> >
> > +void build_ged_aml(Aml *table, const char* name, uint32_t ged_irq,
> > +                   AmlRegionSpace rs);
> >
> >  #endif
> >
> 
> Thanks
> 
> Eric

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-03-29 13:14         ` Ard Biesheuvel
@ 2019-03-29 13:56           ` Auger Eric
  2019-03-29 14:08             ` Shameerali Kolothum Thodi
                               ` (2 more replies)
  0 siblings, 3 replies; 95+ messages in thread
From: Auger Eric @ 2019-03-29 13:56 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Linuxarm,
	Shameerali Kolothum Thodi, qemu-devel@nongnu.org,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	imammedo@redhat.com, sebastien.boeuf@intel.com, Laszlo Ersek,
	Leif Lindholm

Hi Ard,

On 3/29/19 2:14 PM, Ard Biesheuvel wrote:
> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:
>>
>> Hi Shameer,
>>
>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>> Sent: 29 March 2019 09:32
>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>>>
>>>> Hi Shameer,
>>>>
>>>> [ + Laszlo, Ard, Leif ]
>>>>
>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
>>>>> This is to disable/enable populating DT nodes in case
>>>>> any conflict with acpi tables. The default is "off".
>>>> The name of the option sounds misleading to me. Also we don't really
>>>> know the scope of the disablement. At the moment this just aims to
>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
>>>>
>>>>>
>>>>> This will be used in subsequent patch where cold plug
>>>>> device-memory support is added for DT boot.
>>>> I am concerned about the fact that in dt mode, by default, you won't see
>>>> any PCDIMM nodes.
>>>>>
>>>>> If DT memory node support is added for cold-plugged device
>>>>> memory, those memory will be visible to Guest kernel via
>>>>> UEFI GetMemoryMap() and gets treated as early boot memory.
>>>> Don't we have an issue in UEFI then. Normally the SRAT indicates whether
>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>>>> info.
>>>
>>> Sorry I missed this part. Yes, that will be a cleaner solution.
>>>
>>> Also, to be more clear on what happens,
>>>
>>> Guest ACPI boot with "fdt=on" ,
>>>
>>> From kernel log,
>>>
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
>>>
>>>
>>> Guest ACPI boot with "fdt=off" ,
>>>
>>> [    0.000000] Movable zone start for each node
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
>>>
>>> The hotpluggable memory node is absent from early memory nodes here.
>>
>> OK thank you for the example illustrating the concern.
>>>
>>> As you said, it could be possible to detect this node using SRAT in UEFI.
>>
>> Let's wait for EDK2 experts on this.
>>
> 
> Happy to chime in, but I need a bit more context here.
> 
> What is the problem, how does this patch try to solve it, and why is
> that a bad idea?
> 
Sure, sorry.

This series:
- [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
https://patchwork.kernel.org/cover/10863301/

aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
SRAT and DSDT parts and relies on GED to trigger the hotplug.

We noticed that if we build the hotpluggable memory dt nodes on top of
the above ACPI tables, the DIMM slots are interpreted as not
hotpluggable memory slots (at least we think so).

We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
fact that those slots are exposed as hotpluggable in the SRAT for example.

So in this series, we are forced to not generate the hotpluggable memory
dt nodes if we want the DIMM slots to be effectively recognized as
hotpluggable.

Could you confirm we have a correct understanding of the EDK2 behaviour
and, if so, would there be any solution for EDK2 to absorb both the DT
nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable?
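
To make the question concrete, here is a rough standalone sketch of the
kind of filtering we have in mind. It is not EDK2 code and does not
claim to describe what EDK2 does today: ToyMemRange and
toy_early_boot_bytes() are hypothetical names, and the two ranges are
simply taken from Shameer's kernel log. The only external fact used is
the Flags layout of the SRAT Memory Affinity Structure (bit 0 Enabled,
bit 1 Hot Pluggable).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flags field of the SRAT Memory Affinity Structure (ACPI spec). */
#define SRAT_MEM_ENABLED       (1u << 0)
#define SRAT_MEM_HOT_PLUGGABLE (1u << 1)

/* Hypothetical merged view of one memory range: the address range as a
 * DT memory node would describe it, plus the SRAT flags covering it. */
typedef struct {
    uint64_t base;
    uint64_t size;
    uint32_t flags;
} ToyMemRange;

/* Sum the bytes that would be exposed as early boot memory: ranges that
 * are enabled and NOT marked hot pluggable in the SRAT. */
static uint64_t toy_early_boot_bytes(const ToyMemRange *r, size_t n)
{
    uint64_t total = 0;

    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!(r[i].flags & SRAT_MEM_ENABLED)) {
            continue;   /* disabled entry, ignore */
        }
        if (r[i].flags & SRAT_MEM_HOT_PLUGGABLE) {
            continue;   /* leave it to the OS hotplug path */
        }
        total += r[i].size;
    }
    return total;
}
```

With the base RAM at 0x40000000 (2 GiB) and the device memory at
0xc0000000 (1 GiB, hot pluggable), only the first range would be counted
as early boot memory, which matches the "fdt=off" log.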

At qemu level, detecting we are booting in ACPI mode and purposely
removing the above mentioned DT nodes does not look straightforward.

Hope this clarifies

Thanks

Eric

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-03-29 13:56           ` [Qemu-arm] [Qemu-devel] " Auger Eric
@ 2019-03-29 14:08             ` Shameerali Kolothum Thodi
  2019-04-01 13:07               ` Laszlo Ersek
  2019-04-02  8:39               ` Peter Maydell
  2 siblings, 0 replies; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-03-29 14:08 UTC (permalink / raw)
  To: Auger Eric, Ard Biesheuvel
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Linuxarm,
	qemu-devel@nongnu.org, shannon.zhaosl@gmail.com,
	qemu-arm@nongnu.org, xuwei (O), imammedo@redhat.com,
	sebastien.boeuf@intel.com, Laszlo Ersek, Leif Lindholm



> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 29 March 2019 13:56
> To: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: peter.maydell@linaro.org; sameo@linux.intel.com;
> qemu-devel@nongnu.org; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>; Linuxarm
> <linuxarm@huawei.com>; shannon.zhaosl@gmail.com;
> qemu-arm@nongnu.org; xuwei (O) <xuwei5@huawei.com>;
> imammedo@redhat.com; sebastien.boeuf@intel.com; Laszlo Ersek
> <lersek@redhat.com>; Leif Lindholm <Leif.Lindholm@arm.com>
> Subject: Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in
> feature "fdt"
> 
> Hi Ard,
> 
> [...]
> 
> Hope this clarifies

Thanks Eric for this. And this is where you can find the initial discussion on this,

https://www.mail-archive.com/qemu-devel@nongnu.org/msg599076.html

Thanks,
Shameer


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [PATCH v3 10/10] hw/arm/virt: Init GED device and enable memory hotplug
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 10/10] hw/arm/virt: Init GED device and enable memory hotplug Shameer Kolothum
@ 2019-03-29 14:16   ` Auger Eric
  0 siblings, 0 replies; 95+ messages in thread
From: Auger Eric @ 2019-03-29 14:16 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-devel, qemu-arm, imammedo, peter.maydell,
	shannon.zhaosl, sameo, sebastien.boeuf
  Cc: linuxarm, xuwei5

Hi Shameer,
On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> This initializes the GED device with base memory and irq,
> configures ged memory hotplug event and builds the
> corresponding aml code.
> 
> GED irq routing to Guest is also enabled.
I am not clear about why we would need gsi and virt_gsi_handler for this patch.

> Memory hotplug
> should now work.
Time to be confident? ;-)
> 
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  hw/acpi/generic_event_device.c | 18 ++++++++++++++++++
>  hw/arm/virt-acpi-build.c       |  9 +++++++++
>  hw/arm/virt.c                  | 30 +++++++++++++++++++++++++-----
>  include/hw/arm/virt.h          |  2 ++
>  4 files changed, 54 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> index 9deaa33..02d5e66 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -200,6 +200,23 @@ static void acpi_ged_event(GEDState *ged_st, uint32_t ged_irq_sel)
>      qemu_irq_pulse(ged_st->gsi[ged_st->irq]);
>  }
>  
> +static void acpi_ged_init(MemoryRegion *as, DeviceState *dev, GEDState *ged_st)
> +{
> +    VirtAcpiState *s = VIRT_ACPI(dev);
> +
> +    assert(!ged_io_base && !ged_events && !ged_events_size);
> +
> +    ged_io_base = s->ged_base;
> +    ged_events = s->ged_events;
> +    ged_events_size = s->ged_events_size;
Are you obliged to use those global variables? The ged device handle can
be accessed from vms. build_ged_aml could be passed those values through
args, whose values would be extracted from VirtAcpiState.
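
Something along these lines, as a standalone sketch only. The names
ToyGedEvent, ToyGedAmlConfig and toy_build_ged_aml() are hypothetical
and no QEMU API is used; the point is just that the builder receives
everything through one argument instead of reading file-scope statics.
The early return mirrors the "!ged_io_base || !ged_events ||
!ged_events_size" check in the current build_ged_aml().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced version of GedEvent: only the selector bit. */
typedef struct {
    uint32_t selector;
} ToyGedEvent;

/* Everything the AML builder needs, filled in by the caller
 * (for example from the device state) and passed explicitly. */
typedef struct {
    uint64_t           io_base;
    const ToyGedEvent *events;
    uint32_t           nr_events;
} ToyGedAmlConfig;

/* Stand-in for the builder: returns the OR of all selectors it would
 * emit an If block for, or 0 when the configuration is incomplete. */
static uint32_t toy_build_ged_aml(const ToyGedAmlConfig *cfg)
{
    uint32_t mask = 0;

    assert(cfg != NULL);
    if (!cfg->io_base || !cfg->events || !cfg->nr_events) {
        return 0;
    }
    for (uint32_t i = 0; i < cfg->nr_events; i++) {
        mask |= cfg->events[i].selector;
    }
    return mask;
}
```

The caller owns the configuration, so two machines (or two GED
instances) would not step on each other's state.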
> +    ged_st->irq = s->ged_irq;
> +    ged_st->gsi = s->gsi;
> +    qemu_mutex_init(&ged_st->lock);
> +    memory_region_init_io(&ged_st->io, OBJECT(dev), &ged_ops, ged_st,
> +                          "acpi-ged-event", ACPI_GED_REG_LEN);
> +    memory_region_add_subregion(as, ged_io_base, &ged_st->io);

This time you are not stuck with the hotplug framework; couldn't you do
here:
    memory_region_init_io(&ged_st->io, OBJECT(dev), &ged_ops, ged_st,
                          "acpi-ged-event", ACPI_GED_REG_LEN);
    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &ged_st->io);

and in virt create_ged

sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, vms->memmap[VIRT_ACPI_GED].base);

Then you wouldn't need a property for ged_io_base, nor a static
variable?
> +}
> +
>  static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
>                                  DeviceState *dev, Error **errp)
>  {
> @@ -242,6 +259,7 @@ static void virt_device_realize(DeviceState *dev, Error **errp)
>          acpi_memory_hotplug_init(get_system_memory(), OBJECT(dev),
>                                   &s->memhp_state,
>                                   s->memhp_base);
> +        acpi_ged_init(get_system_memory(), dev, &s->ged_state);
>      }
>  }
>  
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 1887531..116e9c9 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -41,6 +41,7 @@
>  #include "hw/hw.h"
>  #include "hw/acpi/aml-build.h"
>  #include "hw/acpi/memory_hotplug.h"
> +#include "hw/acpi/generic_event_device.h"
>  #include "hw/pci/pcie_host.h"
>  #include "hw/pci/pci.h"
>  #include "hw/arm/virt.h"
> @@ -50,6 +51,13 @@
>  #define ARM_SPI_BASE 32
>  #define ACPI_POWER_BUTTON_DEVICE "PWRB"
>  
> +static void acpi_dsdt_add_ged(Aml *scope, VirtMachineState *vms)
> +{
> +    int irq =  vms->irqmap[VIRT_ACPI_GED] + ARM_SPI_BASE;
> +
> +    build_ged_aml(scope, "\\_SB."GED_DEVICE, irq, AML_SYSTEM_MEMORY);
> +}
> +
>  static void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState *ms)
>  {
>      uint32_t nr_mem = ms->ram_slots;
> @@ -758,6 +766,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>       */
>      scope = aml_scope("\\_SB");
>      acpi_dsdt_add_memory_hotplug(scope, MACHINE(vms));
> +    acpi_dsdt_add_ged(scope, vms);
>      acpi_dsdt_add_cpus(scope, vms->smp_cpus);
>      acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
>                         (irqmap[VIRT_UART] + ARM_SPI_BASE));
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index b602151..e3f8aa7 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -63,6 +63,7 @@
>  #include "target/arm/internals.h"
>  #include "hw/mem/pc-dimm.h"
>  #include "hw/mem/nvdimm.h"
> +#include "hw/acpi/generic_event_device.h"
>  
>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
> @@ -134,6 +135,7 @@ static const MemMapEntry base_memmap[] = {
>      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
>      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
>      [VIRT_PCDIMM_ACPI] =        { 0x09070000, 0x00010000 },
> +    [VIRT_ACPI_GED] =           { 0x09080000, 0x00010000 },
>      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
>      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
>      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> @@ -169,6 +171,7 @@ static const int a15irqmap[] = {
>      [VIRT_PCIE] = 3, /* ... to 6 */
>      [VIRT_GPIO] = 7,
>      [VIRT_SECURE_UART] = 8,
> +    [VIRT_ACPI_GED] = 9,
>      [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
>      [VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
>      [VIRT_SMMU] = 74,    /* ...to 74 + NUM_SMMU_IRQS - 1 */
> @@ -184,6 +187,13 @@ static const char *valid_cpus[] = {
>      ARM_CPU_TYPE_NAME("max"),
>  };
>  
> +static GedEvent ged_events[] = {
> +    {
> +        .selector = ACPI_GED_IRQ_SEL_MEM,
> +        .event    = GED_MEMORY_HOTPLUG,
> +    },
> +};
> +
>  static bool cpu_type_valid(const char *cpu)
>  {
>      int i;
> @@ -524,6 +534,11 @@ static DeviceState *create_virt_acpi(VirtMachineState *vms)
>      dev = qdev_create(NULL, "virt-acpi");
>      qdev_prop_set_uint64(dev, "memhp_base",
>                           vms->memmap[VIRT_PCDIMM_ACPI].base);
> +    qdev_prop_set_ptr(dev, "gsi", vms->gsi);
> +    qdev_prop_set_uint64(dev, "ged_base", vms->memmap[VIRT_ACPI_GED].base);
> +    qdev_prop_set_uint32(dev, "ged_irq", vms->irqmap[VIRT_ACPI_GED]);
Can't you avoid using this ged_irq prop as well? How is it different
from the other sysbus devices, which do not rely on those props?
> +    qdev_prop_set_ptr(dev, "ged_events", ged_events);
> +    qdev_prop_set_uint32(dev, "ged_events_size", ARRAY_SIZE(ged_events));
>      qdev_init_nofail(dev);
>  
>      return dev;
> @@ -568,6 +583,12 @@ static void create_v2m(VirtMachineState *vms, qemu_irq *pic)
>      fdt_add_v2m_gic_node(vms);
>  }
>  
> +static void virt_gsi_handler(void *opaque, int n, int level)
> +{
> +    qemu_irq *gic_irq = opaque;
> +    qemu_set_irq(gic_irq[n], level);
> +}
> +
>  static void create_gic(VirtMachineState *vms, qemu_irq *pic)
>  {
>      /* We create a standalone GIC */
> @@ -683,6 +704,8 @@ static void create_gic(VirtMachineState *vms, qemu_irq *pic)
>          pic[i] = qdev_get_gpio_in(gicdev, i);
>      }
>  
> +    vms->gsi = qemu_allocate_irqs(virt_gsi_handler, pic, NUM_IRQS);
> +
>      fdt_add_gic_node(vms);
>  
>      if (type == 3 && vms->its) {
> @@ -1431,7 +1454,7 @@ static void machvirt_init(MachineState *machine)
>      VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(machine);
>      MachineClass *mc = MACHINE_GET_CLASS(machine);
>      const CPUArchIdList *possible_cpus;
> -    qemu_irq pic[NUM_IRQS];
> +    qemu_irq *pic;
>      MemoryRegion *sysmem = get_system_memory();
>      MemoryRegion *secure_sysmem = NULL;
>      int n, virt_max_cpus;
> @@ -1627,6 +1650,7 @@ static void machvirt_init(MachineState *machine)
>  
>      create_flash(vms, sysmem, secure_sysmem ? secure_sysmem : sysmem);
>  
> +    pic = g_new0(qemu_irq, NUM_IRQS);
>      create_gic(vms, pic);
>  
>      fdt_add_pmu_nodes(vms);
> @@ -1842,10 +1866,6 @@ static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>  {
>      const bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
>  
> -    if (dev->hotplugged) {
> -        error_setg(errp, "memory hotplug is not supported");
> -    }
> -
>      if (is_nvdimm) {
>          error_setg(errp, "nvdimm is not yet supported");
>          return;
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index 14b2e0a..850296a 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -78,6 +78,7 @@ enum {
>      VIRT_SECURE_UART,
>      VIRT_SECURE_MEM,
>      VIRT_PCDIMM_ACPI,
> +    VIRT_ACPI_GED,
>      VIRT_LOWMEMMAP_LAST,
>  };
>  
> @@ -135,6 +136,7 @@ typedef struct {
>      int psci_conduit;
>      hwaddr highest_gpa;
>      DeviceState *acpi;
> +    qemu_irq *gsi;
>  } VirtMachineState;
>  
>  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
> 

Thanks

Eric

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [PATCH v3 01/10] hw/acpi: Make ACPI IO address space configurable
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 01/10] hw/acpi: Make ACPI IO address space configurable Shameer Kolothum
@ 2019-04-01 12:58     ` Igor Mammedov
  0 siblings, 0 replies; 95+ messages in thread
From: Igor Mammedov @ 2019-04-01 12:58 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: peter.maydell, sameo, shannon.zhaosl, qemu-devel, xuwei5,
	linuxarm, eric.auger, qemu-arm, sebastien.boeuf

On Thu, 21 Mar 2019 10:47:36 +0000
Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:

> This is in preparation for adding support for ARM64 platforms
> where it doesn't use port mapped IO for ACPI IO space.
> 
> Also move the MEMORY_SLOT_SCAN_METHOD/MEMORY_DEVICES_CONTAINER
> definitions to header so that other memory hotplug event
> signalling mechanisms (eg. Generic Event Device on HW-reduced
> acpi platforms) can use the same from their respective event
> handler aml code.
> 
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Reviewed-by: Eric Auger <eric.auger@redhat.com>

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
>  hw/acpi/memory_hotplug.c         | 24 ++++++++++++++----------
>  hw/i386/acpi-build.c             |  3 ++-
>  include/hw/acpi/memory_hotplug.h |  8 ++++++--
>  3 files changed, 22 insertions(+), 13 deletions(-)
> 
> diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
> index 297812d..80e25f0 100644
> --- a/hw/acpi/memory_hotplug.c
> +++ b/hw/acpi/memory_hotplug.c
> @@ -29,12 +29,10 @@
>  #define MEMORY_SLOT_PROXIMITY_METHOD "MPXM"
>  #define MEMORY_SLOT_EJECT_METHOD     "MEJ0"
>  #define MEMORY_SLOT_NOTIFY_METHOD    "MTFY"
> -#define MEMORY_SLOT_SCAN_METHOD      "MSCN"
>  #define MEMORY_HOTPLUG_DEVICE        "MHPD"
>  #define MEMORY_HOTPLUG_IO_LEN         24
> -#define MEMORY_DEVICES_CONTAINER     "\\_SB.MHPC"
>  
> -static uint16_t memhp_io_base;
> +static hwaddr memhp_io_base;
>  
>  static ACPIOSTInfo *acpi_memory_device_status(int slot, MemStatus *mdev)
>  {
> @@ -209,7 +207,7 @@ static const MemoryRegionOps acpi_memory_hotplug_ops = {
>  };
>  
>  void acpi_memory_hotplug_init(MemoryRegion *as, Object *owner,
> -                              MemHotplugState *state, uint16_t io_base)
> +                              MemHotplugState *state, hwaddr io_base)
>  {
>      MachineState *machine = MACHINE(qdev_get_machine());
>  
> @@ -342,7 +340,8 @@ const VMStateDescription vmstate_memory_hotplug = {
>  
>  void build_memory_hotplug_aml(Aml *table, uint32_t nr_mem,
>                                const char *res_root,
> -                              const char *event_handler_method)
> +                              const char *event_handler_method,
> +                              AmlRegionSpace rs)
>  {
>      int i;
>      Aml *ifctx;
> @@ -365,14 +364,19 @@ void build_memory_hotplug_aml(Aml *table, uint32_t nr_mem,
>              aml_name_decl("_UID", aml_string("Memory hotplug resources")));
>  
>          crs = aml_resource_template();
> -        aml_append(crs,
> -            aml_io(AML_DECODE16, memhp_io_base, memhp_io_base, 0,
> -                   MEMORY_HOTPLUG_IO_LEN)
> -        );
> +        if (rs == AML_SYSTEM_IO) {
> +            aml_append(crs,
> +                aml_io(AML_DECODE16, memhp_io_base, memhp_io_base, 0,
> +                       MEMORY_HOTPLUG_IO_LEN)
> +            );
> +        } else {
> +            aml_append(crs, aml_memory32_fixed(memhp_io_base,
> +                            MEMORY_HOTPLUG_IO_LEN, AML_READ_WRITE));
> +        }
>          aml_append(mem_ctrl_dev, aml_name_decl("_CRS", crs));
>  
>          aml_append(mem_ctrl_dev, aml_operation_region(
> -            MEMORY_HOTPLUG_IO_REGION, AML_SYSTEM_IO,
> +            MEMORY_HOTPLUG_IO_REGION, rs,
>              aml_int(memhp_io_base), MEMORY_HOTPLUG_IO_LEN)
>          );
>  
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 416da31..6d6de44 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1852,7 +1852,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          build_cpus_aml(dsdt, machine, opts, pm->cpu_hp_io_base,
>                         "\\_SB.PCI0", "\\_GPE._E02");
>      }
> -    build_memory_hotplug_aml(dsdt, nr_mem, "\\_SB.PCI0", "\\_GPE._E03");
> +    build_memory_hotplug_aml(dsdt, nr_mem, "\\_SB.PCI0",
> +                             "\\_GPE._E03", AML_SYSTEM_IO);
>  
>      scope =  aml_scope("_GPE");
>      {
> diff --git a/include/hw/acpi/memory_hotplug.h b/include/hw/acpi/memory_hotplug.h
> index 77c6576..f95aa1f 100644
> --- a/include/hw/acpi/memory_hotplug.h
> +++ b/include/hw/acpi/memory_hotplug.h
> @@ -5,6 +5,9 @@
>  #include "hw/acpi/acpi.h"
>  #include "hw/acpi/aml-build.h"
>  
> +#define MEMORY_SLOT_SCAN_METHOD      "MSCN"
> +#define MEMORY_DEVICES_CONTAINER     "\\_SB.MHPC"
> +
>  /**
>   * MemStatus:
>   * @is_removing: the memory device in slot has been requested to be ejected.
> @@ -29,7 +32,7 @@ typedef struct MemHotplugState {
>  } MemHotplugState;
>  
>  void acpi_memory_hotplug_init(MemoryRegion *as, Object *owner,
> -                              MemHotplugState *state, uint16_t io_base);
> +                              MemHotplugState *state, hwaddr io_base);
>  
>  void acpi_memory_plug_cb(HotplugHandler *hotplug_dev, MemHotplugState *mem_st,
>                           DeviceState *dev, Error **errp);
> @@ -48,5 +51,6 @@ void acpi_memory_ospm_status(MemHotplugState *mem_st, ACPIOSTInfoList ***list);
>  
>  void build_memory_hotplug_aml(Aml *table, uint32_t nr_mem,
>                                const char *res_root,
> -                              const char *event_handler_method);
> +                              const char *event_handler_method,
> +                              AmlRegionSpace rs);
>  #endif
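
For readers following the change, the two effects of this patch can be
sketched outside of QEMU as below. This is only a model under stated
assumptions: RegionSpace, CrsKind, memhp_crs_kind() and
base_fits_io_port() are hypothetical names that mirror the
"rs == AML_SYSTEM_IO" branch and the uint16_t -> hwaddr widening in the
diff; they are not QEMU APIs, and the addresses used with them are
example values only.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Local stand-in for the region space argument. The values follow the
 * ACPI operation region space IDs (SystemMemory = 0, SystemIO = 1).
 */
typedef enum {
    REGION_SYSTEM_MEMORY = 0,
    REGION_SYSTEM_IO = 1
} RegionSpace;

/* Which resource descriptor ends up in _CRS (illustrative names). */
typedef enum {
    CRS_IO_PORT,        /* corresponds to the aml_io() branch */
    CRS_MEMORY32_FIXED  /* corresponds to the aml_memory32_fixed() branch */
} CrsKind;

/* Mirrors the if/else added to build_memory_hotplug_aml(). */
static CrsKind memhp_crs_kind(RegionSpace rs)
{
    return rs == REGION_SYSTEM_IO ? CRS_IO_PORT : CRS_MEMORY32_FIXED;
}

/*
 * A 16-bit base can only describe an IO port. An MMIO base does not fit,
 * which is why the base is widened to a 64-bit address type.
 */
static bool base_fits_io_port(uint64_t base)
{
    return base <= UINT16_MAX;
}
```

A platform without port-mapped IO would pass the memory region space and
an MMIO base, and so take the second branch.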


* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-03-29 13:56           ` [Qemu-arm] [Qemu-devel] " Auger Eric
@ 2019-04-01 13:07               ` Laszlo Ersek
  2019-04-01 13:07               ` Laszlo Ersek
  2019-04-02  8:39               ` Peter Maydell
  2 siblings, 0 replies; 95+ messages in thread
From: Laszlo Ersek @ 2019-04-01 13:07 UTC (permalink / raw)
  To: Auger Eric, Ard Biesheuvel
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Linuxarm,
	Shameerali Kolothum Thodi, qemu-devel@nongnu.org,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	imammedo@redhat.com, sebastien.boeuf@intel.com, Leif Lindholm

On 03/29/19 14:56, Auger Eric wrote:
> Hi Ard,
> 
> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:
>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:
>>>
>>> Hi Shameer,
>>>
>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>>> Sent: 29 March 2019 09:32
>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>>>>
>>>>> Hi Shameer,
>>>>>
>>>>> [ + Laszlo, Ard, Leif ]
>>>>>
>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
>>>>>> This is to disable/enable populating DT nodes in case of
>>>>>> any conflict with ACPI tables. The default is "off".
>>>>> The name of the option sounds misleading to me. Also we don't really
>>>>> know the scope of the disablement. At the moment this just aims to
>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
>>>>>
>>>>>>
>>>>>> This will be used in subsequent patch where cold plug
>>>>>> device-memory support is added for DT boot.
>>>>> I am concerned about the fact that in dt mode, by default, you won't see
>>>>> any PCDIMM nodes.
>>>>>>
>>>>>> If DT memory node support is added for cold-plugged device
>>>>>> memory, that memory will be visible to the guest kernel via
>>>>>> UEFI GetMemoryMap() and get treated as early boot memory.
>>>>> Don't we have an issue in UEFI then? Normally the SRAT indicates whether
>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>>>>> info?
>>>>
>>>> Sorry I missed this part. Yes, that will be a cleaner solution.
>>>>
>>>> Also, to be more clear on what happens,
>>>>
>>>> Guest ACPI boot with "fdt=on" ,
>>>>
>>>> From kernel log,
>>>>
>>>> [    0.000000] Early memory node ranges
>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
>>>>
>>>>
>>>> Guest ACPI boot with "fdt=off" ,
>>>>
>>>> [    0.000000] Movable zone start for each node
>>>> [    0.000000] Early memory node ranges
>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
>>>>
>>>> The hotpluggable memory node is absent from early memory nodes here.
>>>
>>> OK thank you for the example illustrating the concern.
>>>>
>>>> As you said, it could be possible to detect this node using SRAT in UEFI.
>>>
>>> Let's wait for EDK2 experts on this.
>>>
>>
>> Happy to chime in, but I need a bit more context here.
>>
>> What is the problem, how does this patch try to solve it, and why is
>> that a bad idea?
>>
> Sure, sorry.
> 
> This series:
> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
> https://patchwork.kernel.org/cover/10863301/
> 
> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
> SRAT and DSDT parts and relies on GED to trigger the hotplug.
> 
> We noticed that if we build the hotpluggable memory dt nodes on top of
> the above ACPI tables, the DIMM slots are interpreted as not
> hotpluggable memory slots (at least we think so).
> 
> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
> fact that those slots are exposed as hotpluggable in the SRAT for example.
> 
> So in this series, we are forced to not generate the hotpluggable memory
> dt nodes if we want the DIMM slots to be effectively recognized as
> hotpluggable.
> 
> Could you confirm we have a correct understanding of the EDK2 behaviour
> and if so, would there be any solution for EDK2 to absorb both the DT
> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable?
> 
> At qemu level, detecting we are booting in ACPI mode and purposely
> removing the above mentioned DT nodes does not look straightforward.

The firmware is not enlightened about the ACPI content that comes from
QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
as instructed through the ACPI linker/loader script, in order to install
the ACPI content for the OS. No actual information is consumed by the
firmware from the ACPI payload -- and that's a feature.

The firmware does consume DT:

- If you start QEMU *with* "-no-acpi", then the DT is both consumed by
the firmware (for its own information needs), and passed on to the OS.

- If you start QEMU *without* "-no-acpi" (the default), then the DT is
consumed only by the firmware (for its own information needs), and the
DT is hidden from the OS. The OS gets only the ACPI content
(processed/prepared as described above).


In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
base/size pairs in all the memory nodes in the DT. For each such base
address that is currently tracked as "nonexistent" in the GCD memory
space map, the driver currently adds the base/size range as "system
memory". This in turn is reflected by the UEFI memmap that the OS gets
to see as "conventional memory".
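
The loop described above can be modelled roughly as follows. This is a
simplified sketch, not the edk2 code: GcdMap, MemRange, gcd_type_at()
and add_dt_memory() are hypothetical stand-ins for the GCD memory space
map and the driver logic, and the fixed-size array is a limit of the
model only.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the GCD memory types; names are illustrative. */
typedef enum { GCD_NONEXISTENT, GCD_SYSTEM_MEMORY } GcdType;

typedef struct {
    uint64_t base;
    uint64_t size;
} MemRange;

#define GCD_MAX_ENTRIES 16

typedef struct {
    MemRange range[GCD_MAX_ENTRIES];
    size_t count;
} GcdMap;

/* An address is "nonexistent" unless some tracked range covers it. */
static GcdType gcd_type_at(const GcdMap *map, uint64_t addr)
{
    for (size_t i = 0; i < map->count; i++) {
        const MemRange *r = &map->range[i];
        if (addr >= r->base && addr - r->base < r->size) {
            return GCD_SYSTEM_MEMORY;
        }
    }
    return GCD_NONEXISTENT;
}

/*
 * Walk every base/size pair taken from the DT memory nodes and add the
 * ones whose base is still nonexistent as system memory. Returns the
 * number of ranges added.
 */
static size_t add_dt_memory(GcdMap *map, const MemRange *dt, size_t n)
{
    size_t added = 0;

    for (size_t i = 0; i < n; i++) {
        if (gcd_type_at(map, dt[i].base) != GCD_NONEXISTENT) {
            continue; /* base already tracked, e.g. the initial RAM */
        }
        if (map->count == GCD_MAX_ENTRIES) {
            break; /* limit of this model, not of the firmware */
        }
        map->range[map->count++] = dt[i];
        added++;
    }
    return added;
}
```

With the ranges from the kernel log quoted above, a DT node at
0xc0000000 with size 0x40000000 is added on top of the boot RAM, which
is how the hotpluggable range ends up as conventional memory.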

If you need some memory ranges to show up as "special" in the UEFI
memmap, then you need to distinguish them somehow from the "regular"
memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
firmware, so that it acts upon the discriminator that you set in the DT.


Now... from a brief look at the Platform Init and UEFI specs, my
impression is that the hotpluggable (but presently not plugged) DIMM
ranges should simply be *absent* from the UEFI memmap; is that correct?
(I didn't check the ACPI spec, maybe it specifies the expected behavior
in full.) If my impression is correct, then two options (alternatives)
exist:

(1) Hide the affected memory nodes -- or at least the affected base/size
pairs -- from the DT, in case you boot without "-no-acpi" but with an
external firmware loaded. Then the firmware will not expose those ranges
as "conventional memory" in the UEFI memmap. This approach requires no
changes to edk2.

This option is precisely what Eric described up-thread, at
<http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:

> in machvirt_init, there is firmware_loaded that tells you whether you
> have a FW image. If this one is not set, you can infer a DT boot. But if
> there is a FW it can be either DT or ACPI booted. You also have the
> acpi_enabled knob.

(The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
"vl.c").

So, the condition for hiding the hotpluggable memory nodes in question
from the DT is:

  (aarch64 && firmware_loaded && acpi_enabled)
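
Written out as a predicate, that could look like the sketch below. The
function name and the plain bool parameters are hypothetical; in QEMU
the three inputs would come from the machine state and from the
"acpi_enabled" variable mentioned above.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not existing QEMU code: returns true when the
 * hotpluggable memory nodes should be left out of the DT.
 */
static bool hide_hotpluggable_dt_nodes(bool aarch64, bool firmware_loaded,
                                       bool acpi_enabled)
{
    return aarch64 && firmware_loaded && acpi_enabled;
}
```

With "-no-acpi", or without a firmware image, the predicate is false and
the nodes stay in the DT.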


(2) Invent and set an "ignore me, firmware" property for the
hotpluggable memory nodes in the DT, and update the firmware to honor
that property.
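
A rough model of option (2), assuming the marker is a simple per-node
flag. DtMemNode, its fw_ignore field and conventional_bytes() are
invented for illustration; no such DT property or firmware helper exists
today.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t base;
    uint64_t size;
    bool fw_ignore; /* hypothetical "ignore me, firmware" marker */
} DtMemNode;

/* Sum the bytes the firmware would still report as conventional memory. */
static uint64_t conventional_bytes(const DtMemNode *nodes, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++) {
        if (nodes[i].fw_ignore) {
            continue; /* node is marked: keep it out of the UEFI memmap */
        }
        total += nodes[i].size;
    }
    return total;
}
```

The OS would still learn about the marked ranges from the ACPI tables,
while the firmware leaves them out of the memmap.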

Thanks
Laszlo

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device
  2019-03-29 11:22     ` Shameerali Kolothum Thodi
@ 2019-04-01 13:08         ` Igor Mammedov
  0 siblings, 0 replies; 95+ messages in thread
From: Igor Mammedov @ 2019-04-01 13:08 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com,
	shannon.zhaosl@gmail.com, qemu-devel@nongnu.org, Linuxarm,
	Auger Eric, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com

On Fri, 29 Mar 2019 11:22:02 +0000
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:

> > -----Original Message-----
> > From: Auger Eric [mailto:eric.auger@redhat.com]
> > Sent: 28 March 2019 14:15
> > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> > qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
> > peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> > sameo@linux.intel.com; sebastien.boeuf@intel.com
> > Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>
> > Subject: Re: [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device
> > 
> > Hi Shameer,
> > 
> > On 3/21/19 11:47 AM, Shameer Kolothum wrote:  
> > > From: Samuel Ortiz <sameo@linux.intel.com>
> > >
> > > This adds the skeleton to support an acpi device interface for
> > > HW-reduced acpi platforms via ACPI GED - Generic Event Device (ACPI
> > > v6.1 5.6.9).
> > >
> > > This will be used by Arm/Virt to add hotplug support.
> > >
> > > Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
> > > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > > ---
> > >  hw/acpi/Kconfig                        |  4 ++
> > >  hw/acpi/Makefile.objs                  |  1 +
> > >  hw/acpi/generic_event_device.c         | 72 ++++++++++++++++++++++++++++++++++
> > >  include/hw/acpi/generic_event_device.h | 29 ++++++++++++++
> > >  4 files changed, 106 insertions(+)
> > >  create mode 100644 hw/acpi/generic_event_device.c
> > >  create mode 100644 include/hw/acpi/generic_event_device.h
> > >
> > > diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
> > > index eca3bee..01a8b41 100644
> > > --- a/hw/acpi/Kconfig
> > > +++ b/hw/acpi/Kconfig
> > > @@ -27,3 +27,7 @@ config ACPI_VMGENID
> > >      bool
> > >      default y
> > >      depends on PC
> > > +
> > > +config ACPI_HW_REDUCED
> > > +    bool
> > > +    depends on ACPI
> > > diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
> > > index 2d46e37..b753232 100644
> > > --- a/hw/acpi/Makefile.objs
> > > +++ b/hw/acpi/Makefile.objs
> > > @@ -6,6 +6,7 @@ common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) +=
> > > memory_hotplug.o
> > >  common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
> > >  common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
> > >  common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
> > > +common-obj-$(CONFIG_ACPI_HW_REDUCED) += generic_event_device.o
> > >  common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
> > >
> > >  common-obj-y += acpi_interface.o
> > > diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> > > new file mode 100644
> > > index 0000000..b21a551
> > > --- /dev/null
> > > +++ b/hw/acpi/generic_event_device.c
> > > @@ -0,0 +1,72 @@
> > > +/*
> > > + *
> > > + * Copyright (c) 2018 Intel Corporation
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify it
> > > + * under the terms and conditions of the GNU General Public License,
> > > + * version 2 or later, as published by the Free Software Foundation.
> > > + *
> > > + * This program is distributed in the hope it will be useful, but WITHOUT
> > > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> > > + * more details.
> > > + *
> > > + * You should have received a copy of the GNU General Public License along with
> > > + * this program.  If not, see <http://www.gnu.org/licenses/>.
> > > + */
> > > +
> > > +#include "qemu/osdep.h"
> > > +#include "hw/sysbus.h"
> > > +#include "hw/acpi/acpi.h"
> > > +#include "hw/acpi/generic_event_device.h"  
> > The files are named generic_event_device.c/h while the device is named
> > "virt-acpi". I would suggest using the same naming as in nemu, i.e. ged or
> > acpi_ged.
> 
> Agree. The naming is a bit confusing. In nemu they have a separate virt-acpi
> dev which makes use of GED. Here, we are rolling those two into one. I am
> still not very sure whether we should leave it as virt-acpi, because the actual
> device on which this is implemented can be changed eg, GED vs GPIO. 

I'm probably lacking context here. Could you clarify, and maybe compare the
differences between the x86 and ARM implementations, and explain why they
should be different devices?


> > I think you should clarify what the exact scope of this device is. The patch title
> > suggests it is bound to be used only in machvirt (plus the virt prefix used in
> > numerous functions). Is it also bound to be used by other architectures?
> > > +
> > > +static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> > > +                                DeviceState *dev, Error **errp)
> > > +{
> > > +}
> > > +
> > > +static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > > +{
> > > +}
> > > +
> > > +static void virt_device_realize(DeviceState *dev, Error **errp)
> > > +{
> > > +}
> > > +
> > > +static Property virt_acpi_properties[] = {
> > > +    DEFINE_PROP_END_OF_LIST(),
> > > +};
> > > +
> > > +static void virt_acpi_class_init(ObjectClass *class, void *data)
> > > +{
> > > +    DeviceClass *dc = DEVICE_CLASS(class);
> > > +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(class);
> > > +    AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_CLASS(class);
> > > +
> > > +    dc->desc = "ACPI";
> > > +    dc->props = virt_acpi_properties;
> > > +    dc->realize = virt_device_realize;
> > > +
> > > +    hc->plug = virt_device_plug_cb;
> > > +
> > > +    adevc->send_event = virt_send_ged;
> > > +}
> > > +
> > > +static const TypeInfo virt_acpi_info = {
> > > +    .name          = TYPE_VIRT_ACPI,
> > > +    .parent        = TYPE_SYS_BUS_DEVICE,
> > > +    .instance_size = sizeof(VirtAcpiState),
> > > +    .class_init    = virt_acpi_class_init,
> > > +    .interfaces = (InterfaceInfo[]) {
> > > +        { TYPE_HOTPLUG_HANDLER },
> > > +        { TYPE_ACPI_DEVICE_IF },
> > > +        { }
> > > +    }
> > > +};
> > > +
> > > +static void virt_acpi_register_types(void)
> > > +{
> > > +    type_register_static(&virt_acpi_info);
> > > +}
> > > +
> > > +type_init(virt_acpi_register_types)
> > > diff --git a/include/hw/acpi/generic_event_device.h
> > > b/include/hw/acpi/generic_event_device.h
> > > new file mode 100644
> > > index 0000000..f314515
> > > --- /dev/null
> > > +++ b/include/hw/acpi/generic_event_device.h
> > > @@ -0,0 +1,29 @@
> > > +/*
> > > + *
> > > + * Copyright (c) 2018 Intel Corporation
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify it
> > > + * under the terms and conditions of the GNU General Public License,
> > > + * version 2 or later, as published by the Free Software Foundation.
> > > + *
> > > + * This program is distributed in the hope it will be useful, but WITHOUT
> > > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> > > + * more details.
> > > + *
> > > + * You should have received a copy of the GNU General Public License along with
> > > + * this program.  If not, see <http://www.gnu.org/licenses/>.
> > > + */  
> > Add a comment in the header introducing what is the role of this device?
> > link to GED spec? Explain the subset of the interfaces being implemented by
> > the device.  
> 
> Ok. I have added comments to that effect in patch #10, but I think I will make it
> clear here as well.
> 
> Cheers,
> Shameer
> 
> > > +
> > > +#ifndef HW_ACPI_GED_H
> > > +#define HW_ACPI_GED_H
> > > +
> > > +#define TYPE_VIRT_ACPI "virt-acpi"
> > > +#define VIRT_ACPI(obj) \
> > > +    OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> > > +
> > > +typedef struct VirtAcpiState {
> > > +    SysBusDevice parent_obj;
> > > +} VirtAcpiState;
> > > +
> > > +#endif
> > >  


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug
  2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug Shameer Kolothum
@ 2019-04-01 13:34     ` Igor Mammedov
  2019-04-01 13:34     ` [Qemu-devel] " Igor Mammedov
  1 sibling, 0 replies; 95+ messages in thread
From: Igor Mammedov @ 2019-04-01 13:34 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: peter.maydell, sameo, shannon.zhaosl, qemu-devel, xuwei5,
	linuxarm, eric.auger, qemu-arm, sebastien.boeuf

On Thu, 21 Mar 2019 10:47:40 +0000
Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:

> This adds support to build the aml code so that Guest(ACPI boot)
> can see the cold-plugged device memory. Memory cold plug support
> with DT boot is not yet enabled.
> 
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  default-configs/arm-softmmu.mak        |  2 ++
>  hw/acpi/generic_event_device.c         | 23 +++++++++++++++++++++++
>  hw/arm/virt-acpi-build.c               |  9 +++++++++
>  hw/arm/virt.c                          | 23 +++++++++++++++++++++++
>  include/hw/acpi/generic_event_device.h |  5 +++++
>  include/hw/arm/virt.h                  |  2 ++
>  6 files changed, 64 insertions(+)
> 
> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> index 795cb89..6db444e 100644
> --- a/default-configs/arm-softmmu.mak
> +++ b/default-configs/arm-softmmu.mak
> @@ -162,3 +162,5 @@ CONFIG_LSI_SCSI_PCI=y
>  
>  CONFIG_MEM_DEVICE=y
>  CONFIG_DIMM=y
> +CONFIG_ACPI_MEMORY_HOTPLUG=y
> +CONFIG_ACPI_HW_REDUCED=y
> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> index b21a551..0b32fc9 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -16,13 +16,26 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "exec/address-spaces.h"
>  #include "hw/sysbus.h"
>  #include "hw/acpi/acpi.h"
>  #include "hw/acpi/generic_event_device.h"
> +#include "hw/mem/pc-dimm.h"
>  
>  static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
>                                  DeviceState *dev, Error **errp)
>  {
> +    VirtAcpiState *s = VIRT_ACPI(hotplug_dev);
> +
> +    if (s->memhp_state.is_enabled &&
> +        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> +            acpi_memory_plug_cb(hotplug_dev, &s->memhp_state,
> +                                dev, errp);
> +    } else {
> +        error_setg(errp, "virt: device plug request for unsupported device"
> +                   " type: %s", object_get_typename(OBJECT(dev)));
> +    }
>  }
>  
>  static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> @@ -31,9 +44,19 @@ static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
>  
>  static void virt_device_realize(DeviceState *dev, Error **errp)
>  {
> +    VirtAcpiState *s = VIRT_ACPI(dev);
> +
> +    if (s->memhp_state.is_enabled) {
> +        acpi_memory_hotplug_init(get_system_memory(), OBJECT(dev),
> +                                 &s->memhp_state,
> +                                 s->memhp_base);
> +    }
>  }
>  
>  static Property virt_acpi_properties[] = {
> +    DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base, 0),

It's preferred to use '-' rather than '_' in property names.

> +    DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
> +                     memhp_state.is_enabled, true),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index bf9c0bc..20d3c83 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -40,6 +40,7 @@
>  #include "hw/loader.h"
>  #include "hw/hw.h"
>  #include "hw/acpi/aml-build.h"
> +#include "hw/acpi/memory_hotplug.h"
>  #include "hw/pci/pcie_host.h"
>  #include "hw/pci/pci.h"
>  #include "hw/arm/virt.h"
> @@ -49,6 +50,13 @@
>  #define ARM_SPI_BASE 32
>  #define ACPI_POWER_BUTTON_DEVICE "PWRB"
>  
> +static void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState *ms)
> +{
This is a dummy wrapper that will never be reused.
I suggest inlining its contents at the call site and dropping the wrapper.
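
A minimal standalone sketch of the inlining suggested above. The Aml,
MachineState and AmlRegionSpace types and the body of
build_memory_hotplug_aml() are simplified stand-ins written for this
illustration only (they are not QEMU's definitions); only the call shape is
taken from the patch. build_dsdt_with_wrapper() and build_dsdt_inlined() are
hypothetical names for the "before" and "after" call sites.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in types, NOT the real QEMU definitions. */
typedef struct Aml { const char *name; uint32_t mem_slots; } Aml;
typedef struct MachineState { uint32_t ram_slots; } MachineState;
typedef enum { AML_SYSTEM_MEMORY, AML_SYSTEM_IO } AmlRegionSpace;

/* Stand-in builder: only records how many memory slots were requested. */
void build_memory_hotplug_aml(Aml *table, uint32_t nr_mem,
                              const char *res_root,
                              const char *event_handler_method,
                              AmlRegionSpace rs)
{
    assert(table != NULL);
    (void)res_root;
    (void)event_handler_method;
    (void)rs;
    table->mem_slots = nr_mem;
}

/* Before: a wrapper with exactly one caller. */
void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState *ms)
{
    uint32_t nr_mem = ms->ram_slots;

    build_memory_hotplug_aml(scope, nr_mem, "\\_SB", NULL, AML_SYSTEM_MEMORY);
}

void build_dsdt_with_wrapper(Aml *scope, MachineState *ms)
{
    acpi_dsdt_add_memory_hotplug(scope, ms);
}

/* After: the single call is made directly at the call site. */
void build_dsdt_inlined(Aml *scope, MachineState *ms)
{
    build_memory_hotplug_aml(scope, ms->ram_slots, "\\_SB", NULL,
                             AML_SYSTEM_MEMORY);
}
```

Both variants leave the scope in the same state; the inlined form just
removes one level of indirection for a helper that has a single caller.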

> +    uint32_t nr_mem = ms->ram_slots;
> +
> +    build_memory_hotplug_aml(scope, nr_mem, "\\_SB", NULL, AML_SYSTEM_MEMORY);
> +}
> +
>  static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
>  {
>      uint16_t i;
> @@ -740,6 +748,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>       * the RTC ACPI device at all when using UEFI.
>       */
>      scope = aml_scope("\\_SB");
> +    acpi_dsdt_add_memory_hotplug(scope, MACHINE(vms));
>      acpi_dsdt_add_cpus(scope, vms->smp_cpus);
>      acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
>                         (irqmap[VIRT_UART] + ARM_SPI_BASE));
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index d0ff20d..13db0e9 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -133,6 +133,7 @@ static const MemMapEntry base_memmap[] = {
>      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
>      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
>      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
> +    [VIRT_PCDIMM_ACPI] =        { 0x09070000, 0x00010000 },
                                                 ^^^^^^^^^^^
Where does this magic number come from?
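
One way to address the question above is to give the region length a name
tied to the register block it covers. The sketch below is standalone and
illustrative: PCDIMM_ACPI_IO_LEN is a hypothetical macro and its value (24)
is a placeholder assumption, not something stated in the patch; the enum is
cut down to three entries, and region_end() and region_fits_before() are
helpers invented for this example. Base addresses and the other sizes are
copied from the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t hwaddr;
typedef struct MemMapEntry { hwaddr base; hwaddr size; } MemMapEntry;

/* Hypothetical named length of the memory hotplug register block.
 * The value is a placeholder for illustration only. */
#define PCDIMM_ACPI_IO_LEN 24

/* Reduced enum: only the neighbours of the new entry. */
enum { VIRT_SMMU, VIRT_PCDIMM_ACPI, VIRT_MMIO, VIRT_NR };

static const MemMapEntry memmap[VIRT_NR] = {
    [VIRT_SMMU] =        { 0x09050000, 0x00020000 },
    [VIRT_PCDIMM_ACPI] = { 0x09070000, PCDIMM_ACPI_IO_LEN },
    [VIRT_MMIO] =        { 0x0a000000, 0x00000200 },
};

/* First address past the end of a region. */
hwaddr region_end(const MemMapEntry *e)
{
    assert(e->size > 0);
    return e->base + e->size;
}

/* Returns 1 if region a ends at or before the start of region b. */
int region_fits_before(const MemMapEntry *a, const MemMapEntry *b)
{
    return region_end(a) <= b->base;
}
```

With a named length, the table entry documents what the size covers, and a
check like region_fits_before() can confirm the new region does not overlap
its neighbours.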

>      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
>      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
>      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> @@ -516,6 +517,18 @@ static void fdt_add_pmu_nodes(const VirtMachineState *vms)
>      }
>  }
>  
> +static DeviceState *create_virt_acpi(VirtMachineState *vms)
> +{
> +    DeviceState *dev;
> +
> +    dev = qdev_create(NULL, "virt-acpi");
> +    qdev_prop_set_uint64(dev, "memhp_base",
> +                         vms->memmap[VIRT_PCDIMM_ACPI].base);
> +    qdev_init_nofail(dev);
> +
> +    return dev;

Probably not worth a wrapper either, since the code is trivial and isn't reused elsewhere.

> +}
> +
>  static void create_its(VirtMachineState *vms, DeviceState *gicdev)
>  {
>      const char *itsclass = its_class_name();
> @@ -1644,6 +1657,8 @@ static void machvirt_init(MachineState *machine)
>  
>      create_platform_bus(vms, pic);
>  
> +    vms->acpi = create_virt_acpi(vms);
> +
>      vms->bootinfo.ram_size = machine->ram_size;
>      vms->bootinfo.kernel_filename = machine->kernel_filename;
>      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
> @@ -1828,11 +1843,19 @@ static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>  static void virt_memory_plug(HotplugHandler *hotplug_dev,
>                               DeviceState *dev, Error **errp)
>  {
> +    HotplugHandlerClass *hhc;
>      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>      Error *local_err = NULL;
>  
>      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi);
> +    hhc->plug(HOTPLUG_HANDLER(vms->acpi), dev, &error_abort);
>  
> +out:
>      error_propagate(errp, local_err);
>  }
>  
> diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
> index f314515..262ca7d 100644
> --- a/include/hw/acpi/generic_event_device.h
> +++ b/include/hw/acpi/generic_event_device.h
> @@ -18,12 +18,17 @@
>  #ifndef HW_ACPI_GED_H
>  #define HW_ACPI_GED_H
>  
> +#include "hw/acpi/memory_hotplug.h"
> +
>  #define TYPE_VIRT_ACPI "virt-acpi"
>  #define VIRT_ACPI(obj) \
>      OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
>  
>  typedef struct VirtAcpiState {
>      SysBusDevice parent_obj;
> +    MemHotplugState memhp_state;
> +    hwaddr memhp_base;
>  } VirtAcpiState;
>  
> +
>  #endif
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index 507517c..c5e4c96 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -77,6 +77,7 @@ enum {
>      VIRT_GPIO,
>      VIRT_SECURE_UART,
>      VIRT_SECURE_MEM,
> +    VIRT_PCDIMM_ACPI,
>      VIRT_LOWMEMMAP_LAST,
>  };
>  
> @@ -132,6 +133,7 @@ typedef struct {
>      uint32_t iommu_phandle;
>      int psci_conduit;
>      hwaddr highest_gpa;
> +    DeviceState *acpi;
>  } VirtMachineState;
>  
>  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug
  2019-03-29  9:31   ` [Qemu-arm] " Auger Eric
@ 2019-04-01 13:43       ` Igor Mammedov
  2019-04-01 13:43       ` [Qemu-devel] " Igor Mammedov
  1 sibling, 0 replies; 95+ messages in thread
From: Igor Mammedov @ 2019-04-01 13:43 UTC (permalink / raw)
  To: Auger Eric
  Cc: peter.maydell, sameo, qemu-devel, Shameer Kolothum, linuxarm,
	shannon.zhaosl, qemu-arm, xuwei5, sebastien.boeuf

On Fri, 29 Mar 2019 10:31:14 +0100
Auger Eric <eric.auger@redhat.com> wrote:

> Hi Shameer,
> 
> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> > This adds support to build the aml code so that Guest(ACPI boot)
> > can see the cold-plugged device memory. Memory cold plug support
> > with DT boot is not yet enabled.
> > 
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > ---
> >  default-configs/arm-softmmu.mak        |  2 ++
> >  hw/acpi/generic_event_device.c         | 23 +++++++++++++++++++++++
> >  hw/arm/virt-acpi-build.c               |  9 +++++++++
> >  hw/arm/virt.c                          | 23 +++++++++++++++++++++++
> >  include/hw/acpi/generic_event_device.h |  5 +++++
> >  include/hw/arm/virt.h                  |  2 ++
> >  6 files changed, 64 insertions(+)
> > 
> > diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> > index 795cb89..6db444e 100644
> > --- a/default-configs/arm-softmmu.mak
> > +++ b/default-configs/arm-softmmu.mak
> > @@ -162,3 +162,5 @@ CONFIG_LSI_SCSI_PCI=y
> >  
> >  CONFIG_MEM_DEVICE=y
> >  CONFIG_DIMM=y
> > +CONFIG_ACPI_MEMORY_HOTPLUG=y
> > +CONFIG_ACPI_HW_REDUCED=y
> > diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> > index b21a551..0b32fc9 100644
> > --- a/hw/acpi/generic_event_device.c
> > +++ b/hw/acpi/generic_event_device.c
> > @@ -16,13 +16,26 @@
> >   */
> >  
> >  #include "qemu/osdep.h"
> > +#include "qapi/error.h"
> > +#include "exec/address-spaces.h"
> >  #include "hw/sysbus.h"
> >  #include "hw/acpi/acpi.h"
> >  #include "hw/acpi/generic_event_device.h"
> > +#include "hw/mem/pc-dimm.h"
> >  
> >  static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> >                                  DeviceState *dev, Error **errp)
> >  {
> > +    VirtAcpiState *s = VIRT_ACPI(hotplug_dev);
> > +
> > +    if (s->memhp_state.is_enabled &&
> > +        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> > +            acpi_memory_plug_cb(hotplug_dev, &s->memhp_state,
> > +                                dev, errp);
> > +    } else {
> > +        error_setg(errp, "virt: device plug request for unsupported device"
> > +                   " type: %s", object_get_typename(OBJECT(dev)));
> > +    }
> >  }
> >  
> >  static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > @@ -31,9 +44,19 @@ static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> >  
> >  static void virt_device_realize(DeviceState *dev, Error **errp)
> >  {
> > +    VirtAcpiState *s = VIRT_ACPI(dev);
> > +
> > +    if (s->memhp_state.is_enabled) {
> > +        acpi_memory_hotplug_init(get_system_memory(), OBJECT(dev),
> > +                                 &s->memhp_state,
> > +                                 s->memhp_base);
> > +    }
> >  }
> >  
> >  static Property virt_acpi_properties[] = {
> > +    DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base, 0),
> > +    DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
> > +                     memhp_state.is_enabled, true),
> >      DEFINE_PROP_END_OF_LIST(),
> >  };
> >  
> > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > index bf9c0bc..20d3c83 100644
> > --- a/hw/arm/virt-acpi-build.c
> > +++ b/hw/arm/virt-acpi-build.c
> > @@ -40,6 +40,7 @@
> >  #include "hw/loader.h"
> >  #include "hw/hw.h"
> >  #include "hw/acpi/aml-build.h"
> > +#include "hw/acpi/memory_hotplug.h"
> >  #include "hw/pci/pcie_host.h"
> >  #include "hw/pci/pci.h"
> >  #include "hw/arm/virt.h"
> > @@ -49,6 +50,13 @@
> >  #define ARM_SPI_BASE 32
> >  #define ACPI_POWER_BUTTON_DEVICE "PWRB"
> >  
> > +static void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState *ms)
> > +{
> > +    uint32_t nr_mem = ms->ram_slots;
> > +
> > +    build_memory_hotplug_aml(scope, nr_mem, "\\_SB", NULL, AML_SYSTEM_MEMORY);
> > +}
> > +
> >  static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
> >  {
> >      uint16_t i;
> > @@ -740,6 +748,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> >       * the RTC ACPI device at all when using UEFI.
> >       */
> >      scope = aml_scope("\\_SB");
> > +    acpi_dsdt_add_memory_hotplug(scope, MACHINE(vms));
> >      acpi_dsdt_add_cpus(scope, vms->smp_cpus);
> >      acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
> >                         (irqmap[VIRT_UART] + ARM_SPI_BASE));
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index d0ff20d..13db0e9 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -133,6 +133,7 @@ static const MemMapEntry base_memmap[] = {
> >      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
> >      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
> >      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
> > +    [VIRT_PCDIMM_ACPI] =        { 0x09070000, 0x00010000 },
> >      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
> >      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
> >      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> > @@ -516,6 +517,18 @@ static void fdt_add_pmu_nodes(const VirtMachineState *vms)
> >      }
> >  }
> >  
> > +static DeviceState *create_virt_acpi(VirtMachineState *vms)
> > +{
> > +    DeviceState *dev;
> > +
> > +    dev = qdev_create(NULL, "virt-acpi");
> > +    qdev_prop_set_uint64(dev, "memhp_base",
> > +                         vms->memmap[VIRT_PCDIMM_ACPI].base);  
> Maybe add a comment that a property is needed to integrate with
> acpi_memory_hotplug_init() (if I am not wrong). Otherwise we can wonder
> why sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, <base>) is not used, as for
> standard sysbus devices?

Why is it inherited from SYS_BUS_DEVICE to begin with?

> 
> > +    qdev_init_nofail(dev);
> > +
> > +    return dev;
> > +}
> > +
> >  static void create_its(VirtMachineState *vms, DeviceState *gicdev)
> >  {
> >      const char *itsclass = its_class_name();
> > @@ -1644,6 +1657,8 @@ static void machvirt_init(MachineState *machine)
> >  
> >      create_platform_bus(vms, pic);
> >  
> > +    vms->acpi = create_virt_acpi(vms);
> I can see that on PC machines, they use a link property to set the
> acpi_dev. I am unsure about the exact reason, any idea?

The pc and q35 machines have different devices that implement the ACPI
interface and live somewhere else in the system, and they also honor the
-no-acpi CLI option. The link lets the machine cache a reference to whichever
device is in use and manage CLI expectations (if I recall correctly).
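
To illustrate the link idea, here is a minimal standalone sketch. It is not
QEMU code and does not use QOM; every Toy* name is hypothetical. The machine
only stores a pointer to whichever object implements the ACPI interface, and
a NULL link stands in for the -no-acpi case.

```c
#include <stddef.h>

/* Hypothetical stand-in for any device implementing the ACPI interface. */
typedef struct ToyAcpiIf ToyAcpiIf;
struct ToyAcpiIf {
    const char *name;
    unsigned last_event;
    void (*send_event)(ToyAcpiIf *self, unsigned ev);
};

/* Trivial implementation: just remember the last event delivered. */
static void toy_record_event(ToyAcpiIf *self, unsigned ev)
{
    self->last_event = ev;
}

/* The machine caches a reference (the "link"); it does not own the device. */
typedef struct {
    ToyAcpiIf *acpi_dev;    /* NULL models a machine started without ACPI */
} ToyMachine;

/* Returns 0 on delivery, -1 when no ACPI device is linked. */
static int toy_machine_send_event(ToyMachine *m, unsigned ev)
{
    if (m->acpi_dev == NULL) {
        return -1;
    }
    m->acpi_dev->send_event(m->acpi_dev, ev);
    return 0;
}
```

The machine code never needs to know which concrete device sits behind the
pointer, which is the property the link gives pc/q35 in the description above.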

> > +
> >      vms->bootinfo.ram_size = machine->ram_size;
> >      vms->bootinfo.kernel_filename = machine->kernel_filename;
> >      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
> > @@ -1828,11 +1843,19 @@ static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >  static void virt_memory_plug(HotplugHandler *hotplug_dev,
> >                               DeviceState *dev, Error **errp)
> >  {
> > +    HotplugHandlerClass *hhc;
> >      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> >      Error *local_err = NULL;
> >  
> >      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +
> > +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi);
> > +    hhc->plug(HOTPLUG_HANDLER(vms->acpi), dev, &error_abort);  
> Why error_abort instead of propagating the error?

After the last round of changes to the hotplug handler, it was decided that
the plug() handler should not fail (I haven't got around to removing the error
argument from the interface yet). All checks and any graceful abort should
happen at the pre_plug() stage.
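
A minimal standalone sketch of that split, assuming nothing about the real
HotplugHandler interface (all Toy* types and toy_* functions are hypothetical
and only model the idea): every check that can fail sits in pre_plug and
reports an error, while plug only commits state and treats a failure as a
programming error, much like passing &error_abort does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are QEMU types. */
typedef struct {
    bool memhp_enabled;   /* models memhp_state.is_enabled */
    int  slots_total;
    int  slots_used;
} ToyAcpiDev;

typedef struct {
    bool is_dimm;         /* models the TYPE_PC_DIMM type check */
} ToyDevice;

/* pre_plug: all fallible checks live here and report an error string. */
static bool toy_pre_plug(const ToyAcpiDev *acpi, const ToyDevice *dev,
                         const char **err)
{
    if (!acpi->memhp_enabled) {
        *err = "memory hotplug is not enabled";
        return false;
    }
    if (!dev->is_dimm) {
        *err = "unsupported device type";
        return false;
    }
    if (acpi->slots_used >= acpi->slots_total) {
        *err = "no free memory slots";
        return false;
    }
    return true;
}

/* plug: commits state only; a violated invariant here is a bug. */
static void toy_plug(ToyAcpiDev *acpi, const ToyDevice *dev)
{
    assert(acpi->memhp_enabled && dev->is_dimm);
    assert(acpi->slots_used < acpi->slots_total);
    acpi->slots_used++;
}

/* Caller: can back out gracefully after pre_plug, never after plug. */
static bool toy_hotplug(ToyAcpiDev *acpi, const ToyDevice *dev,
                        const char **err)
{
    if (!toy_pre_plug(acpi, dev, err)) {
        return false;
    }
    toy_plug(acpi, dev);
    return true;
}
```

In this model, the "unsupported device type" branch that the patch puts in the
plug callback would belong in the pre_plug step instead.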

> >  
> > +out:
> >      error_propagate(errp, local_err);
> >  }
> >  
> > diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
> > index f314515..262ca7d 100644
> > --- a/include/hw/acpi/generic_event_device.h
> > +++ b/include/hw/acpi/generic_event_device.h
> > @@ -18,12 +18,17 @@
> >  #ifndef HW_ACPI_GED_H
> >  #define HW_ACPI_GED_H
> >  
> > +#include "hw/acpi/memory_hotplug.h"
> > +
> >  #define TYPE_VIRT_ACPI "virt-acpi"
> >  #define VIRT_ACPI(obj) \
> >      OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> >  
> >  typedef struct VirtAcpiState {
> >      SysBusDevice parent_obj;
> > +    MemHotplugState memhp_state;
> > +    hwaddr memhp_base;
> >  } VirtAcpiState;
> >  
> > +  
> spurious newline
> 
> Thanks
> 
> Eric
> >  #endif
> > diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> > index 507517c..c5e4c96 100644
> > --- a/include/hw/arm/virt.h
> > +++ b/include/hw/arm/virt.h
> > @@ -77,6 +77,7 @@ enum {
> >      VIRT_GPIO,
> >      VIRT_SECURE_UART,
> >      VIRT_SECURE_MEM,
> > +    VIRT_PCDIMM_ACPI,
> >      VIRT_LOWMEMMAP_LAST,
> >  };
> >  
> > @@ -132,6 +133,7 @@ typedef struct {
> >      uint32_t iommu_phandle;
> >      int psci_conduit;
> >      hwaddr highest_gpa;
> > +    DeviceState *acpi;
> >  } VirtMachineState;
> >  
> >  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
> >   


* Re: [Qemu-devel] [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device
  2019-04-01 13:08         ` Igor Mammedov
@ 2019-04-01 14:21           ` Shameerali Kolothum Thodi
  -1 siblings, 0 replies; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-04-01 14:21 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com,
	shannon.zhaosl@gmail.com, qemu-devel@nongnu.org, Linuxarm,
	Auger Eric, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com

Hi Igor,

> -----Original Message-----
> From: Igor Mammedov [mailto:imammedo@redhat.com]
> Sent: 01 April 2019 14:09
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: Auger Eric <eric.auger@redhat.com>; qemu-devel@nongnu.org;
> qemu-arm@nongnu.org; peter.maydell@linaro.org;
> shannon.zhaosl@gmail.com; sameo@linux.intel.com;
> sebastien.boeuf@intel.com; Linuxarm <linuxarm@huawei.com>; xuwei (O)
> <xuwei5@huawei.com>
> Subject: Re: [Qemu-devel] [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI
> device
> 
> On Fri, 29 Mar 2019 11:22:02 +0000
> Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:
> 
> > > -----Original Message-----
> > > From: Auger Eric [mailto:eric.auger@redhat.com]
> > > Sent: 28 March 2019 14:15
> > > To: Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>;
> > > qemu-devel@nongnu.org; qemu-arm@nongnu.org;
> imammedo@redhat.com;
> > > peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> > > sameo@linux.intel.com; sebastien.boeuf@intel.com
> > > Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>
> > > Subject: Re: [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device
> > >
> > > Hi Shameer,
> > >
> > > On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> > > > From: Samuel Ortiz <sameo@linux.intel.com>
> > > >
> > > > This adds the skeleton to support an acpi device interface for
> > > > HW-reduced acpi platforms via ACPI GED - Generic Event Device (ACPI
> > > > v6.1 5.6.9).
> > > >
> > > > This will be used by Arm/Virt to add hotplug support.
> > > >
> > > > Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
> > > > Signed-off-by: Shameer Kolothum
> <shameerali.kolothum.thodi@huawei.com>
> > > > ---
> > > >  hw/acpi/Kconfig                        |  4 ++
> > > >  hw/acpi/Makefile.objs                  |  1 +
> > > >  hw/acpi/generic_event_device.c         | 72
> > > ++++++++++++++++++++++++++++++++++
> > > >  include/hw/acpi/generic_event_device.h | 29 ++++++++++++++
> > > >  4 files changed, 106 insertions(+)
> > > >  create mode 100644 hw/acpi/generic_event_device.c  create mode
> > > 100644
> > > > include/hw/acpi/generic_event_device.h
> > > >
> > > > diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig index eca3bee..01a8b41
> > > > 100644
> > > > --- a/hw/acpi/Kconfig
> > > > +++ b/hw/acpi/Kconfig
> > > > @@ -27,3 +27,7 @@ config ACPI_VMGENID
> > > >      bool
> > > >      default y
> > > >      depends on PC
> > > > +
> > > > +config ACPI_HW_REDUCED
> > > > +    bool
> > > > +    depends on ACPI
> > > > diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs index
> > > > 2d46e37..b753232 100644
> > > > --- a/hw/acpi/Makefile.objs
> > > > +++ b/hw/acpi/Makefile.objs
> > > > @@ -6,6 +6,7 @@ common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG)
> +=
> > > > memory_hotplug.o
> > > >  common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
> > > >  common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
> > > >  common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
> > > > +common-obj-$(CONFIG_ACPI_HW_REDUCED) +=
> generic_event_device.o
> > > >  common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
> > > >
> > > >  common-obj-y += acpi_interface.o
> > > > diff --git a/hw/acpi/generic_event_device.c
> > > > b/hw/acpi/generic_event_device.c new file mode 100644 index
> > > > 0000000..b21a551
> > > > --- /dev/null
> > > > +++ b/hw/acpi/generic_event_device.c
> > > > @@ -0,0 +1,72 @@
> > > > +/*
> > > > + *
> > > > + * Copyright (c) 2018 Intel Corporation
> > > > + *
> > > > + * This program is free software; you can redistribute it and/or
> > > > +modify it
> > > > + * under the terms and conditions of the GNU General Public License,
> > > > + * version 2 or later, as published by the Free Software Foundation.
> > > > + *
> > > > + * This program is distributed in the hope it will be useful, but
> > > > +WITHOUT
> > > > + * ANY WARRANTY; without even the implied warranty of
> > > MERCHANTABILITY
> > > > +or
> > > > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> > > > +License for
> > > > + * more details.
> > > > + *
> > > > + * You should have received a copy of the GNU General Public License
> > > > +along with
> > > > + * this program.  If not, see <http://www.gnu.org/licenses/>.
> > > > + */
> > > > +
> > > > +#include "qemu/osdep.h"
> > > > +#include "hw/sysbus.h"
> > > > +#include "hw/acpi/acpi.h"
> > > > +#include "hw/acpi/generic_event_device.h"
> > > the files are named generic_event_device.c/h while the device is named
> > > "virt-acpi". I would suggest to use the same naming as in nemu ie. ged or
> > > acpi_ged.
> >
> > Agree. The naming is a bit confusing. In nemu they have a separate virt-acpi
> > dev which makes use of GED. Here, we are rolling those two into one. I am
> > still not very sure whether we should leave it as virt-acpi, because the actual
> > device on which this is implemented can be changed eg, GED vs GPIO.
> 
> I'm probably lacking context here, could you clarify and maybe compare
> differences between x86 and ARM implementations and why it should be
> different devices?
> 

Right. I was not comparing against x86, just pointing out how Nemu has
done this. They seem to have a virt-acpi dev specific to virt platforms
(hw/i386/virt/acpi.c), with all GED-related code moved into a separate file
(hw/acpi/ged.c) [1].

I was just wondering whether that approach makes sense going forward, for
cases where platforms support either GED or GPIO for hotplug and the
virt-acpi dev can be configured to use either of those. Maybe not.
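
Roughly what I had in mind, as a standalone sketch only (not code from this
series or from Nemu; the Toy* names, the event bit and the GPIO behaviour are
all assumptions made for illustration): a single front-end device whose event
delivery is switched between a GED-style and a GPIO-style source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event bit; not the value used by any real device. */
#define TOY_EVT_MEM_HOTPLUG (1u << 0)

typedef enum {
    TOY_SRC_GED,    /* event bits latched in a selector field + interrupt */
    TOY_SRC_GPIO    /* assumed: one dedicated pin raised per event */
} ToyEventSource;

typedef struct {
    ToyEventSource source;  /* chosen once, when the device is configured */
    uint32_t ged_sel;       /* pending event bits (GED-style source) */
    bool gpio_level;        /* pin state (GPIO-style source) */
    unsigned irq_count;     /* interrupts raised so far */
} ToyVirtAcpi;

/* Deliver an event through whichever source the device was configured for. */
static void toy_send_event(ToyVirtAcpi *s, uint32_t ev)
{
    switch (s->source) {
    case TOY_SRC_GED:
        s->ged_sel |= ev;
        s->irq_count++;
        break;
    case TOY_SRC_GPIO:
        s->gpio_level = true;
        break;
    }
}
```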

Thanks,
Shameer

[1] https://github.com/intel/nemu/commit/bcff7ee8588f7049cd919ee8b349f219a873ec41#diff-82ce92e28467c5894c90311f0e6a75fb

> 
> > > I think you should clarify what is the exact scope of this device. The
> > > patch title makes one think this is bound to be used only in machvirt
> > > (+ the virt prefix used in numerous functions?). Is it also bound to be
> > > used by other architectures?
> > > > +
> > > > +static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> > > > +                                DeviceState *dev, Error **errp)
> { }
> > > > +
> > > > +static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > > > +{ }
> > > > +
> > > > +static void virt_device_realize(DeviceState *dev, Error **errp) { }
> > > > +
> > > > +static Property virt_acpi_properties[] = {
> > > > +    DEFINE_PROP_END_OF_LIST(),
> > > > +};
> > > > +
> > > > +static void virt_acpi_class_init(ObjectClass *class, void *data) {
> > > > +    DeviceClass *dc = DEVICE_CLASS(class);
> > > > +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(class);
> > > > +    AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_CLASS(class);
> > > > +
> > > > +    dc->desc = "ACPI";
> > > > +    dc->props = virt_acpi_properties;
> > > > +    dc->realize = virt_device_realize;
> > > > +
> > > > +    hc->plug = virt_device_plug_cb;
> > > > +
> > > > +    adevc->send_event = virt_send_ged; }
> > > > +
> > > > +static const TypeInfo virt_acpi_info = {
> > > > +    .name          = TYPE_VIRT_ACPI,
> > > > +    .parent        = TYPE_SYS_BUS_DEVICE,
> > > > +    .instance_size = sizeof(VirtAcpiState),
> > > > +    .class_init    = virt_acpi_class_init,
> > > > +    .interfaces = (InterfaceInfo[]) {
> > > > +        { TYPE_HOTPLUG_HANDLER },
> > > > +        { TYPE_ACPI_DEVICE_IF },
> > > > +        { }
> > > > +    }
> > > > +};
> > > > +
> > > > +static void virt_acpi_register_types(void) {
> > > > +    type_register_static(&virt_acpi_info);
> > > > +}
> > > > +
> > > > +type_init(virt_acpi_register_types)
> > > > diff --git a/include/hw/acpi/generic_event_device.h
> > > > b/include/hw/acpi/generic_event_device.h
> > > > new file mode 100644
> > > > index 0000000..f314515
> > > > --- /dev/null
> > > > +++ b/include/hw/acpi/generic_event_device.h
> > > > @@ -0,0 +1,29 @@
> > > > +/*
> > > > + *
> > > > + * Copyright (c) 2018 Intel Corporation
> > > > + *
> > > > + * This program is free software; you can redistribute it and/or
> > > > +modify it
> > > > + * under the terms and conditions of the GNU General Public License,
> > > > + * version 2 or later, as published by the Free Software Foundation.
> > > > + *
> > > > + * This program is distributed in the hope it will be useful, but
> > > > +WITHOUT
> > > > + * ANY WARRANTY; without even the implied warranty of
> > > MERCHANTABILITY
> > > > +or
> > > > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> > > > +License for
> > > > + * more details.
> > > > + *
> > > > + * You should have received a copy of the GNU General Public License
> > > > +along with
> > > > + * this program.  If not, see <http://www.gnu.org/licenses/>.
> > > > + */
> > > Add a comment in the header introducing what is the role of this device?
> > > link to GED spec? Explain the subset of the interfaces being implemented by
> > > the device.
> >
> > Ok. I have added comments to that effect in patch #10, but I think I will make
> it
> > clear here as well.
> >
> > Cheers,
> > Shameer
> >
> > > > +
> > > > +#ifndef HW_ACPI_GED_H
> > > > +#define HW_ACPI_GED_H
> > > > +
> > > > +#define TYPE_VIRT_ACPI "virt-acpi"
> > > > +#define VIRT_ACPI(obj) \
> > > > +    OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> > > > +
> > > > +typedef struct VirtAcpiState {
> > > > +    SysBusDevice parent_obj;
> > > > +} VirtAcpiState;
> > > > +
> > > > +#endif
> > > >


* Re: [Qemu-devel] [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device
@ 2019-04-01 14:21           ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-04-01 14:21 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Auger Eric, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	peter.maydell@linaro.org, shannon.zhaosl@gmail.com,
	sameo@linux.intel.com, sebastien.boeuf@intel.com, Linuxarm,
	xuwei (O)

Hi Igor,

> -----Original Message-----
> From: Igor Mammedov [mailto:imammedo@redhat.com]
> Sent: 01 April 2019 14:09
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: Auger Eric <eric.auger@redhat.com>; qemu-devel@nongnu.org;
> qemu-arm@nongnu.org; peter.maydell@linaro.org;
> shannon.zhaosl@gmail.com; sameo@linux.intel.com;
> sebastien.boeuf@intel.com; Linuxarm <linuxarm@huawei.com>; xuwei (O)
> <xuwei5@huawei.com>
> Subject: Re: [Qemu-devel] [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI
> device
> 
> On Fri, 29 Mar 2019 11:22:02 +0000
> Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:
> 
> > > -----Original Message-----
> > > From: Auger Eric [mailto:eric.auger@redhat.com]
> > > Sent: 28 March 2019 14:15
> > > To: Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>;
> > > qemu-devel@nongnu.org; qemu-arm@nongnu.org;
> imammedo@redhat.com;
> > > peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> > > sameo@linux.intel.com; sebastien.boeuf@intel.com
> > > Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>
> > > Subject: Re: [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device
> > >
> > > Hi Shameer,
> > >
> > > On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> > > > From: Samuel Ortiz <sameo@linux.intel.com>
> > > >
> > > > This adds the skeleton to support an acpi device interface for
> > > > HW-reduced acpi platforms via ACPI GED - Generic Event Device (ACPI
> > > > v6.1 5.6.9).
> > > >
> > > > This will be used by Arm/Virt to add hotplug support.
> > > >
> > > > Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
> > > > Signed-off-by: Shameer Kolothum
> <shameerali.kolothum.thodi@huawei.com>
> > > > ---
> > > >  hw/acpi/Kconfig                        |  4 ++
> > > >  hw/acpi/Makefile.objs                  |  1 +
> > > >  hw/acpi/generic_event_device.c         | 72
> > > ++++++++++++++++++++++++++++++++++
> > > >  include/hw/acpi/generic_event_device.h | 29 ++++++++++++++
> > > >  4 files changed, 106 insertions(+)
> > > >  create mode 100644 hw/acpi/generic_event_device.c  create mode
> > > 100644
> > > > include/hw/acpi/generic_event_device.h
> > > >
> > > > diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig index eca3bee..01a8b41
> > > > 100644
> > > > --- a/hw/acpi/Kconfig
> > > > +++ b/hw/acpi/Kconfig
> > > > @@ -27,3 +27,7 @@ config ACPI_VMGENID
> > > >      bool
> > > >      default y
> > > >      depends on PC
> > > > +
> > > > +config ACPI_HW_REDUCED
> > > > +    bool
> > > > +    depends on ACPI
> > > > diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs index
> > > > 2d46e37..b753232 100644
> > > > --- a/hw/acpi/Makefile.objs
> > > > +++ b/hw/acpi/Makefile.objs
> > > > @@ -6,6 +6,7 @@ common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG)
> +=
> > > > memory_hotplug.o
> > > >  common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
> > > >  common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
> > > >  common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
> > > > +common-obj-$(CONFIG_ACPI_HW_REDUCED) +=
> generic_event_device.o
> > > >  common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
> > > >
> > > >  common-obj-y += acpi_interface.o
> > > > diff --git a/hw/acpi/generic_event_device.c
> > > > b/hw/acpi/generic_event_device.c new file mode 100644 index
> > > > 0000000..b21a551
> > > > --- /dev/null
> > > > +++ b/hw/acpi/generic_event_device.c
> > > > @@ -0,0 +1,72 @@
> > > > +/*
> > > > + *
> > > > + * Copyright (c) 2018 Intel Corporation
> > > > + *
> > > > + * This program is free software; you can redistribute it and/or modify it
> > > > + * under the terms and conditions of the GNU General Public License,
> > > > + * version 2 or later, as published by the Free Software Foundation.
> > > > + *
> > > > + * This program is distributed in the hope it will be useful, but WITHOUT
> > > > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > > > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> > > > + * more details.
> > > > + *
> > > > + * You should have received a copy of the GNU General Public License along with
> > > > + * this program.  If not, see <http://www.gnu.org/licenses/>.
> > > > + */
> > > > +
> > > > +#include "qemu/osdep.h"
> > > > +#include "hw/sysbus.h"
> > > > +#include "hw/acpi/acpi.h"
> > > > +#include "hw/acpi/generic_event_device.h"
> > > the files are named generic_event_device.c/h while the device is named
> > > "virt-acpi". I would suggest to use the same naming as in nemu ie. ged or
> > > acpi_ged.
> >
> > Agree. The naming is a bit confusing. In nemu they have a separate virt-acpi
> > dev which makes use of GED. Here, we are rolling those two into one. I am
> > still not very sure whether we should leave it as virt-acpi, because the actual
> > device on which this is implemented can be changed eg, GED vs GPIO.
> 
> I am probably lacking context here; could you clarify and maybe compare the
> differences between the x86 and ARM implementations, and explain why they
> should be different devices?
> 

Right. I was not comparing against x86, but just pointing out how Nemu has
done this. They seem to have a virt-acpi dev specific to virt platforms
(hw/i386/virt/acpi.c) and have moved all GED-related code into a separate file
(hw/acpi/ged.c) [1].

I was just wondering whether that approach makes sense going forward, for
cases where platforms support either GED or GPIO for hotplug and the
virt-acpi dev can be configured to use either of those. Maybe not.
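
For illustration only, the split being discussed (a platform-facing virt-acpi
device that delegates event delivery to either a GED or a GPIO backend) could
be sketched roughly as below. This is plain C and not QEMU code:
EventBackend, VirtAcpiFront and every function name are hypothetical, and the
one-pulse-per-bit GPIO behaviour is an assumed policy chosen only so that the
two backends differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event-delivery backend; none of these names exist in QEMU. */
typedef struct EventBackend EventBackend;
struct EventBackend {
    const char *name;
    void (*send_event)(EventBackend *be, uint32_t event_bits);
    uint32_t pending;       /* event bits latched for the guest to read */
    unsigned irq_count;     /* number of notifications raised so far */
};

/* GED-like backend: latch all bits, then raise a single interrupt. */
static void ged_send_event(EventBackend *be, uint32_t event_bits)
{
    be->pending |= event_bits;
    be->irq_count++;
}

/* GPIO-like backend (assumed policy): one pin pulse per event bit. */
static void gpio_send_event(EventBackend *be, uint32_t event_bits)
{
    for (uint32_t bit = 1; bit != 0; bit <<= 1) {
        if (event_bits & bit) {
            be->pending |= bit;
            be->irq_count++;
        }
    }
}

static EventBackend make_ged_backend(void)
{
    EventBackend be = { "ged", ged_send_event, 0, 0 };
    return be;
}

static EventBackend make_gpio_backend(void)
{
    EventBackend be = { "gpio", gpio_send_event, 0, 0 };
    return be;
}

/* Platform-facing device: knows nothing about how events are delivered. */
typedef struct VirtAcpiFront {
    EventBackend *backend;
} VirtAcpiFront;

/* Returns 0 on success, -1 if no backend has been configured. */
static int virt_acpi_front_notify(VirtAcpiFront *front, uint32_t event_bits)
{
    if (front->backend == NULL || front->backend->send_event == NULL) {
        return -1;
    }
    front->backend->send_event(front->backend, event_bits);
    return 0;
}
```

The front device only ever calls through the backend pointer, so switching
between the two delivery mechanisms would be a configuration choice rather
than a code change in the platform.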

Thanks,
Shameer

[1] https://github.com/intel/nemu/commit/bcff7ee8588f7049cd919ee8b349f219a873ec41#diff-82ce92e28467c5894c90311f0e6a75fb

> 
> > > I think you should clarify what the exact scope of this device is. The patch
> > > title makes me think it is bound to be used only in machvirt (+ the virt
> > > prefix used in numerous functions?). Is it also bound to be used by other
> > > architectures?
> > > > +
> > > > +static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> > > > +                                DeviceState *dev, Error **errp)
> > > > +{
> > > > +}
> > > > +
> > > > +static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > > > +{
> > > > +}
> > > > +
> > > > +static void virt_device_realize(DeviceState *dev, Error **errp)
> > > > +{
> > > > +}
> > > > +
> > > > +static Property virt_acpi_properties[] = {
> > > > +    DEFINE_PROP_END_OF_LIST(),
> > > > +};
> > > > +
> > > > +static void virt_acpi_class_init(ObjectClass *class, void *data)
> > > > +{
> > > > +    DeviceClass *dc = DEVICE_CLASS(class);
> > > > +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(class);
> > > > +    AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_CLASS(class);
> > > > +
> > > > +    dc->desc = "ACPI";
> > > > +    dc->props = virt_acpi_properties;
> > > > +    dc->realize = virt_device_realize;
> > > > +
> > > > +    hc->plug = virt_device_plug_cb;
> > > > +
> > > > +    adevc->send_event = virt_send_ged;
> > > > +}
> > > > +
> > > > +static const TypeInfo virt_acpi_info = {
> > > > +    .name          = TYPE_VIRT_ACPI,
> > > > +    .parent        = TYPE_SYS_BUS_DEVICE,
> > > > +    .instance_size = sizeof(VirtAcpiState),
> > > > +    .class_init    = virt_acpi_class_init,
> > > > +    .interfaces = (InterfaceInfo[]) {
> > > > +        { TYPE_HOTPLUG_HANDLER },
> > > > +        { TYPE_ACPI_DEVICE_IF },
> > > > +        { }
> > > > +    }
> > > > +};
> > > > +
> > > > +static void virt_acpi_register_types(void)
> > > > +{
> > > > +    type_register_static(&virt_acpi_info);
> > > > +}
> > > > +
> > > > +type_init(virt_acpi_register_types)
> > > > diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
> > > > new file mode 100644
> > > > index 0000000..f314515
> > > > --- /dev/null
> > > > +++ b/include/hw/acpi/generic_event_device.h
> > > > @@ -0,0 +1,29 @@
> > > > +/*
> > > > + *
> > > > + * Copyright (c) 2018 Intel Corporation
> > > > + *
> > > > + * This program is free software; you can redistribute it and/or modify it
> > > > + * under the terms and conditions of the GNU General Public License,
> > > > + * version 2 or later, as published by the Free Software Foundation.
> > > > + *
> > > > + * This program is distributed in the hope it will be useful, but WITHOUT
> > > > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > > > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> > > > + * more details.
> > > > + *
> > > > + * You should have received a copy of the GNU General Public License along with
> > > > + * this program.  If not, see <http://www.gnu.org/licenses/>.
> > > > + */
> > > Add a comment in the header introducing what is the role of this device?
> > > link to GED spec? Explain the subset of the interfaces being implemented by
> > > the device.
> >
> > Ok. I have added comments to that effect in patch #10, but I think I will make
> > it clear here as well.
> >
> > Cheers,
> > Shameer
> >
> > > > +
> > > > +#ifndef HW_ACPI_GED_H
> > > > +#define HW_ACPI_GED_H
> > > > +
> > > > +#define TYPE_VIRT_ACPI "virt-acpi"
> > > > +#define VIRT_ACPI(obj) \
> > > > +    OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> > > > +
> > > > +typedef struct VirtAcpiState {
> > > > +    SysBusDevice parent_obj;
> > > > +} VirtAcpiState;
> > > > +
> > > > +#endif
> > > >

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug
  2019-04-01 13:43       ` [Qemu-devel] " Igor Mammedov
@ 2019-04-01 14:51         ` Shameerali Kolothum Thodi
  -1 siblings, 0 replies; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-04-01 14:51 UTC (permalink / raw)
  To: Igor Mammedov, Auger Eric
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com,
	qemu-devel@nongnu.org, Linuxarm, shannon.zhaosl@gmail.com,
	qemu-arm@nongnu.org, xuwei (O), sebastien.boeuf@intel.com

Hi Igor,

> -----Original Message-----
> From: Igor Mammedov [mailto:imammedo@redhat.com]
> Sent: 01 April 2019 14:43
> To: Auger Eric <eric.auger@redhat.com>
> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> qemu-devel@nongnu.org; qemu-arm@nongnu.org; peter.maydell@linaro.org;
> shannon.zhaosl@gmail.com; sameo@linux.intel.com;
> sebastien.boeuf@intel.com; Linuxarm <linuxarm@huawei.com>; xuwei (O)
> <xuwei5@huawei.com>
> Subject: Re: [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device
> memory cold-plug
> 
> On Fri, 29 Mar 2019 10:31:14 +0100
> Auger Eric <eric.auger@redhat.com> wrote:
> 
> > Hi Shameer,
> >
> > On 3/21/19 11:47 AM, Shameer Kolothum wrote:
> > > This adds support to build the aml code so that Guest(ACPI boot)
> > > can see the cold-plugged device memory. Memory cold plug support
> > > with DT boot is not yet enabled.
> > >
> > > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > > ---
> > >  default-configs/arm-softmmu.mak        |  2 ++
> > >  hw/acpi/generic_event_device.c         | 23 +++++++++++++++++++++++
> > >  hw/arm/virt-acpi-build.c               |  9 +++++++++
> > >  hw/arm/virt.c                          | 23 +++++++++++++++++++++++
> > >  include/hw/acpi/generic_event_device.h |  5 +++++
> > >  include/hw/arm/virt.h                  |  2 ++
> > >  6 files changed, 64 insertions(+)
> > >
> > > diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> > > index 795cb89..6db444e 100644
> > > --- a/default-configs/arm-softmmu.mak
> > > +++ b/default-configs/arm-softmmu.mak
> > > @@ -162,3 +162,5 @@ CONFIG_LSI_SCSI_PCI=y
> > >
> > >  CONFIG_MEM_DEVICE=y
> > >  CONFIG_DIMM=y
> > > +CONFIG_ACPI_MEMORY_HOTPLUG=y
> > > +CONFIG_ACPI_HW_REDUCED=y
> > > diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> > > index b21a551..0b32fc9 100644
> > > --- a/hw/acpi/generic_event_device.c
> > > +++ b/hw/acpi/generic_event_device.c
> > > @@ -16,13 +16,26 @@
> > >   */
> > >
> > >  #include "qemu/osdep.h"
> > > +#include "qapi/error.h"
> > > +#include "exec/address-spaces.h"
> > >  #include "hw/sysbus.h"
> > >  #include "hw/acpi/acpi.h"
> > >  #include "hw/acpi/generic_event_device.h"
> > > +#include "hw/mem/pc-dimm.h"
> > >
> > >  static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> > >                                  DeviceState *dev, Error **errp)
> > >  {
> > > +    VirtAcpiState *s = VIRT_ACPI(hotplug_dev);
> > > +
> > > +    if (s->memhp_state.is_enabled &&
> > > +        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> > > +            acpi_memory_plug_cb(hotplug_dev, &s->memhp_state,
> > > +                                dev, errp);
> > > +    } else {
> > > +        error_setg(errp, "virt: device plug request for unsupported device"
> > > +                   " type: %s", object_get_typename(OBJECT(dev)));
> > > +    }
> > >  }
> > >
> > >  static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > > @@ -31,9 +44,19 @@ static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > >
> > >  static void virt_device_realize(DeviceState *dev, Error **errp)
> > >  {
> > > +    VirtAcpiState *s = VIRT_ACPI(dev);
> > > +
> > > +    if (s->memhp_state.is_enabled) {
> > > +        acpi_memory_hotplug_init(get_system_memory(), OBJECT(dev),
> > > +                                 &s->memhp_state,
> > > +                                 s->memhp_base);
> > > +    }
> > >  }
> > >
> > >  static Property virt_acpi_properties[] = {
> > > +    DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base, 0),
> > > +    DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
> > > +                     memhp_state.is_enabled, true),
> > >      DEFINE_PROP_END_OF_LIST(),
> > >  };
> > >
> > > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > > index bf9c0bc..20d3c83 100644
> > > --- a/hw/arm/virt-acpi-build.c
> > > +++ b/hw/arm/virt-acpi-build.c
> > > @@ -40,6 +40,7 @@
> > >  #include "hw/loader.h"
> > >  #include "hw/hw.h"
> > >  #include "hw/acpi/aml-build.h"
> > > +#include "hw/acpi/memory_hotplug.h"
> > >  #include "hw/pci/pcie_host.h"
> > >  #include "hw/pci/pci.h"
> > >  #include "hw/arm/virt.h"
> > > @@ -49,6 +50,13 @@
> > >  #define ARM_SPI_BASE 32
> > >  #define ACPI_POWER_BUTTON_DEVICE "PWRB"
> > >
> > > +static void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState *ms)
> > > +{
> > > +    uint32_t nr_mem = ms->ram_slots;
> > > +
> > > +    build_memory_hotplug_aml(scope, nr_mem, "\\_SB", NULL, AML_SYSTEM_MEMORY);
> > > +}
> > > +
> > >  static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
> > >  {
> > >      uint16_t i;
> > > @@ -740,6 +748,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> > >       * the RTC ACPI device at all when using UEFI.
> > >       */
> > >      scope = aml_scope("\\_SB");
> > > +    acpi_dsdt_add_memory_hotplug(scope, MACHINE(vms));
> > >      acpi_dsdt_add_cpus(scope, vms->smp_cpus);
> > >      acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
> > >                         (irqmap[VIRT_UART] + ARM_SPI_BASE));
> > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > > index d0ff20d..13db0e9 100644
> > > --- a/hw/arm/virt.c
> > > +++ b/hw/arm/virt.c
> > > @@ -133,6 +133,7 @@ static const MemMapEntry base_memmap[] = {
> > >      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
> > >      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
> > >      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
> > > +    [VIRT_PCDIMM_ACPI] =        { 0x09070000, 0x00010000 },
> > >      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
> > >      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
> > >      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> > > @@ -516,6 +517,18 @@ static void fdt_add_pmu_nodes(const VirtMachineState *vms)
> > >      }
> > >  }
> > >
> > > +static DeviceState *create_virt_acpi(VirtMachineState *vms)
> > > +{
> > > +    DeviceState *dev;
> > > +
> > > +    dev = qdev_create(NULL, "virt-acpi");
> > > +    qdev_prop_set_uint64(dev, "memhp_base",
> > > +                         vms->memmap[VIRT_PCDIMM_ACPI].base);
> > Maybe add a comment that a property is required to integrate with
> > acpi_memory_hotplug_init() (if I am not wrong). Otherwise one may wonder
> > why sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, <base>) is not used, as for
> > standard sysbus devices.
> 
> Why is it inherited from SYS_BUS_DEVICE to begin with?

Hmm, I don't have a clear answer to that, other than that I just reused the
way other platform devices (pl011, pl061, smmu, etc.) are created. Also, PCI
doesn't look like an obvious choice here. Please let me know if there is a
better way of doing this.

> >
> > > +    qdev_init_nofail(dev);
> > > +
> > > +    return dev;
> > > +}
> > > +
> > >  static void create_its(VirtMachineState *vms, DeviceState *gicdev)
> > >  {
> > >      const char *itsclass = its_class_name();
> > > @@ -1644,6 +1657,8 @@ static void machvirt_init(MachineState *machine)
> > >
> > >      create_platform_bus(vms, pic);
> > >
> > > +    vms->acpi = create_virt_acpi(vms);
> > I can see that on PC machines, they use a link property to set the
> > acpi_dev. I am unsure about the exact reason, any idea?
> 
> The pc and q35 machines have different devices that implement the ACPI
> interface and live elsewhere in the system, and they also honor the -no-acpi
> CLI option. The link allows caching a reference to whichever device is in use
> and managing CLI expectations (if I recall correctly).

Thanks for clarifying this.
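
A rough way to picture the link described above, purely as an illustration:
the machine holds one cached pointer to whichever device implements the ACPI
interface, and leaves it unset when ACPI is disabled. The sketch below is
standalone C, not QOM code; AcpiDev, Machine and the helper names are all
hypothetical, and the "acpi_enabled" flag merely models -no-acpi.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for "a device implementing the ACPI interface". */
typedef struct AcpiDev AcpiDev;
struct AcpiDev {
    const char *type_name;
    unsigned events_seen;
    void (*send_event)(AcpiDev *dev, unsigned event);
};

/* Trivial handler that only counts events, so behaviour can be observed. */
static void acpi_dev_count_event(AcpiDev *dev, unsigned event)
{
    (void)event;
    dev->events_seen++;
}

typedef struct Machine {
    bool acpi_enabled;   /* false models "-no-acpi" */
    AcpiDev *acpi_dev;   /* the "link": cached reference, NULL if unset */
} Machine;

/* Cache the device only if ACPI is enabled for this machine. */
static void machine_link_acpi_dev(Machine *m, AcpiDev *dev)
{
    m->acpi_dev = m->acpi_enabled ? dev : NULL;
}

/* Returns true if an event was delivered, false if no device is linked. */
static bool machine_send_acpi_event(Machine *m, unsigned event)
{
    if (m->acpi_dev == NULL) {
        return false;
    }
    m->acpi_dev->send_event(m->acpi_dev, event);
    return true;
}
```

Machine code then never needs to know which concrete device sits behind the
pointer, which is presumably less of a concern on virt if only a single
device ever implements the interface.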

> 
> > > +
> > >      vms->bootinfo.ram_size = machine->ram_size;
> > >      vms->bootinfo.kernel_filename = machine->kernel_filename;
> > >      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
> > > @@ -1828,11 +1843,19 @@ static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > >  static void virt_memory_plug(HotplugHandler *hotplug_dev,
> > >                               DeviceState *dev, Error **errp)
> > >  {
> > > +    HotplugHandlerClass *hhc;
> > >      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> > >      Error *local_err = NULL;
> > >
> > >      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> > > +    if (local_err) {
> > > +        goto out;
> > > +    }
> > > +
> > > +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi);
> > > +    hhc->plug(HOTPLUG_HANDLER(vms->acpi), dev, &error_abort);
> > Why error_abort instead of propagating the error?
> 
> After the last round of changes to the hotplug handler, it's deemed that the
> plug() handler should not fail (I haven't gotten around to removing the error
> argument from the interface yet). All checks and graceful abort should happen
> at the pre_plug() stage.

Ok. I will address this in next revision.
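
The pre_plug()/plug() split described above can be sketched in isolation as
follows. This is a minimal standalone model, not the QEMU HotplugHandler API:
Machine, Dimm and the three functions are hypothetical names, and the
particular checks are assumptions picked for illustration. The assert() in
the plug step plays the role that passing &error_abort plays in the patch:
any failure there is treated as a programming error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical machine and device state, for illustration only. */
typedef struct Machine {
    unsigned ram_slots;    /* total number of memory slots */
    unsigned used_slots;   /* slots already populated */
    bool acpi_enabled;
} Machine;

typedef struct Dimm {
    unsigned long long size;   /* bytes */
} Dimm;

/* Phase 1: every check lives here. May fail; never modifies the machine. */
static bool machine_pre_plug(const Machine *m, const Dimm *d,
                             const char **errp)
{
    if (!m->acpi_enabled) {
        *errp = "memory hotplug requires ACPI";
        return false;
    }
    if (d->size == 0) {
        *errp = "DIMM size must be non-zero";
        return false;
    }
    if (m->used_slots >= m->ram_slots) {
        *errp = "no free memory slots";
        return false;
    }
    *errp = NULL;
    return true;
}

/* Phase 2: commits the change. Must not fail once pre_plug has passed. */
static void machine_plug(Machine *m, const Dimm *d)
{
    assert(m->acpi_enabled && d->size != 0 && m->used_slots < m->ram_slots);
    m->used_slots++;
}

/* Caller-side sequence: validate first, then commit unconditionally. */
static bool machine_hotplug(Machine *m, const Dimm *d, const char **errp)
{
    if (!machine_pre_plug(m, d, errp)) {
        return false;
    }
    machine_plug(m, d);
    return true;
}
```

With this shape the error-reporting path ends at the pre_plug step, so the
plug step needs no rollback logic.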

Thanks,
Shameer
 
> > >
> > > +out:
> > >      error_propagate(errp, local_err);
> > >  }
> > >
> > > diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
> > > index f314515..262ca7d 100644
> > > --- a/include/hw/acpi/generic_event_device.h
> > > +++ b/include/hw/acpi/generic_event_device.h
> > > @@ -18,12 +18,17 @@
> > >  #ifndef HW_ACPI_GED_H
> > >  #define HW_ACPI_GED_H
> > >
> > > +#include "hw/acpi/memory_hotplug.h"
> > > +
> > >  #define TYPE_VIRT_ACPI "virt-acpi"
> > >  #define VIRT_ACPI(obj) \
> > >      OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> > >
> > >  typedef struct VirtAcpiState {
> > >      SysBusDevice parent_obj;
> > > +    MemHotplugState memhp_state;
> > > +    hwaddr memhp_base;
> > >  } VirtAcpiState;
> > >
> > > +
> > spurious newline
> >
> > Thanks
> >
> > Eric
> > >  #endif
> > > diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> > > index 507517c..c5e4c96 100644
> > > --- a/include/hw/arm/virt.h
> > > +++ b/include/hw/arm/virt.h
> > > @@ -77,6 +77,7 @@ enum {
> > >      VIRT_GPIO,
> > >      VIRT_SECURE_UART,
> > >      VIRT_SECURE_MEM,
> > > +    VIRT_PCDIMM_ACPI,
> > >      VIRT_LOWMEMMAP_LAST,
> > >  };
> > >
> > > @@ -132,6 +133,7 @@ typedef struct {
> > >      uint32_t iommu_phandle;
> > >      int psci_conduit;
> > >      hwaddr highest_gpa;
> > > +    DeviceState *acpi;
> > >  } VirtMachineState;
> > >
> > >  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
> > >


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug
  2019-04-01 13:43       ` [Qemu-devel] " Igor Mammedov
@ 2019-04-01 14:59         ` Auger Eric
  -1 siblings, 0 replies; 95+ messages in thread
From: Auger Eric @ 2019-04-01 14:59 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: peter.maydell, sameo, linuxarm, Shameer Kolothum, qemu-devel,
	shannon.zhaosl, qemu-arm, xuwei5, sebastien.boeuf

Hi Igor,

On 4/1/19 3:43 PM, Igor Mammedov wrote:
> On Fri, 29 Mar 2019 10:31:14 +0100
> Auger Eric <eric.auger@redhat.com> wrote:
> 
>> Hi Shameer,
>>
>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
>>> This adds support to build the aml code so that Guest(ACPI boot)
>>> can see the cold-plugged device memory. Memory cold plug support
>>> with DT boot is not yet enabled.
>>>
>>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>>> ---
>>>  default-configs/arm-softmmu.mak        |  2 ++
>>>  hw/acpi/generic_event_device.c         | 23 +++++++++++++++++++++++
>>>  hw/arm/virt-acpi-build.c               |  9 +++++++++
>>>  hw/arm/virt.c                          | 23 +++++++++++++++++++++++
>>>  include/hw/acpi/generic_event_device.h |  5 +++++
>>>  include/hw/arm/virt.h                  |  2 ++
>>>  6 files changed, 64 insertions(+)
>>>
>>> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
>>> index 795cb89..6db444e 100644
>>> --- a/default-configs/arm-softmmu.mak
>>> +++ b/default-configs/arm-softmmu.mak
>>> @@ -162,3 +162,5 @@ CONFIG_LSI_SCSI_PCI=y
>>>  
>>>  CONFIG_MEM_DEVICE=y
>>>  CONFIG_DIMM=y
>>> +CONFIG_ACPI_MEMORY_HOTPLUG=y
>>> +CONFIG_ACPI_HW_REDUCED=y
>>> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
>>> index b21a551..0b32fc9 100644
>>> --- a/hw/acpi/generic_event_device.c
>>> +++ b/hw/acpi/generic_event_device.c
>>> @@ -16,13 +16,26 @@
>>>   */
>>>  
>>>  #include "qemu/osdep.h"
>>> +#include "qapi/error.h"
>>> +#include "exec/address-spaces.h"
>>>  #include "hw/sysbus.h"
>>>  #include "hw/acpi/acpi.h"
>>>  #include "hw/acpi/generic_event_device.h"
>>> +#include "hw/mem/pc-dimm.h"
>>>  
>>>  static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
>>>                                  DeviceState *dev, Error **errp)
>>>  {
>>> +    VirtAcpiState *s = VIRT_ACPI(hotplug_dev);
>>> +
>>> +    if (s->memhp_state.is_enabled &&
>>> +        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
>>> +            acpi_memory_plug_cb(hotplug_dev, &s->memhp_state,
>>> +                                dev, errp);
>>> +    } else {
>>> +        error_setg(errp, "virt: device plug request for unsupported device"
>>> +                   " type: %s", object_get_typename(OBJECT(dev)));
>>> +    }
>>>  }
>>>  
>>>  static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
>>> @@ -31,9 +44,19 @@ static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
>>>  
>>>  static void virt_device_realize(DeviceState *dev, Error **errp)
>>>  {
>>> +    VirtAcpiState *s = VIRT_ACPI(dev);
>>> +
>>> +    if (s->memhp_state.is_enabled) {
>>> +        acpi_memory_hotplug_init(get_system_memory(), OBJECT(dev),
>>> +                                 &s->memhp_state,
>>> +                                 s->memhp_base);
>>> +    }
>>>  }
>>>  
>>>  static Property virt_acpi_properties[] = {
>>> +    DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base, 0),
>>> +    DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
>>> +                     memhp_state.is_enabled, true),
>>>      DEFINE_PROP_END_OF_LIST(),
>>>  };
>>>  
>>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>>> index bf9c0bc..20d3c83 100644
>>> --- a/hw/arm/virt-acpi-build.c
>>> +++ b/hw/arm/virt-acpi-build.c
>>> @@ -40,6 +40,7 @@
>>>  #include "hw/loader.h"
>>>  #include "hw/hw.h"
>>>  #include "hw/acpi/aml-build.h"
>>> +#include "hw/acpi/memory_hotplug.h"
>>>  #include "hw/pci/pcie_host.h"
>>>  #include "hw/pci/pci.h"
>>>  #include "hw/arm/virt.h"
>>> @@ -49,6 +50,13 @@
>>>  #define ARM_SPI_BASE 32
>>>  #define ACPI_POWER_BUTTON_DEVICE "PWRB"
>>>  
>>> +static void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState *ms)
>>> +{
>>> +    uint32_t nr_mem = ms->ram_slots;
>>> +
>>> +    build_memory_hotplug_aml(scope, nr_mem, "\\_SB", NULL, AML_SYSTEM_MEMORY);
>>> +}
>>> +
>>>  static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
>>>  {
>>>      uint16_t i;
>>> @@ -740,6 +748,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>>       * the RTC ACPI device at all when using UEFI.
>>>       */
>>>      scope = aml_scope("\\_SB");
>>> +    acpi_dsdt_add_memory_hotplug(scope, MACHINE(vms));
>>>      acpi_dsdt_add_cpus(scope, vms->smp_cpus);
>>>      acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
>>>                         (irqmap[VIRT_UART] + ARM_SPI_BASE));
>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>> index d0ff20d..13db0e9 100644
>>> --- a/hw/arm/virt.c
>>> +++ b/hw/arm/virt.c
>>> @@ -133,6 +133,7 @@ static const MemMapEntry base_memmap[] = {
>>>      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
>>>      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
>>>      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
>>> +    [VIRT_PCDIMM_ACPI] =        { 0x09070000, 0x00010000 },
>>>      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
>>>      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
>>>      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
>>> @@ -516,6 +517,18 @@ static void fdt_add_pmu_nodes(const VirtMachineState *vms)
>>>      }
>>>  }
>>>  
>>> +static DeviceState *create_virt_acpi(VirtMachineState *vms)
>>> +{
>>> +    DeviceState *dev;
>>> +
>>> +    dev = qdev_create(NULL, "virt-acpi");
>>> +    qdev_prop_set_uint64(dev, "memhp_base",
>>> +                         vms->memmap[VIRT_PCDIMM_ACPI].base);  
>> Maybe add a comment that a property is required to integrate with
>> acpi_memory_hotplug_init() (if I am not wrong). Otherwise we can wonder
>> why sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, <base>) is not used as for
>> standard sysbus devices?
> 
> Why is it inherited from SYS_BUS_DEVICE to begin with?
it is:

static const TypeInfo virt_acpi_info = {
    .name          = TYPE_VIRT_ACPI,
    .parent        = TYPE_SYS_BUS_DEVICE,
    .instance_size = sizeof(VirtAcpiState),
    .class_init    = virt_acpi_class_init,
    .interfaces = (InterfaceInfo[]) {
        { TYPE_HOTPLUG_HANDLER },
        { TYPE_ACPI_DEVICE_IF },
        { }
    }
};
> 
>>
>>> +    qdev_init_nofail(dev);
>>> +
>>> +    return dev;
>>> +}
>>> +
>>>  static void create_its(VirtMachineState *vms, DeviceState *gicdev)
>>>  {
>>>      const char *itsclass = its_class_name();
>>> @@ -1644,6 +1657,8 @@ static void machvirt_init(MachineState *machine)
>>>  
>>>      create_platform_bus(vms, pic);
>>>  
>>> +    vms->acpi = create_virt_acpi(vms);
>> I can see that on PC machines, they use a link property to set the
>> acpi_dev. I am unsure about the exact reason, any idea?
> 
> pc and q35 machine have different devices that implement ACPI interface
> and live somewhere else in the system and also honor -no-acpi CLI option.
> Link allows to cache reference to whatever device in use and manage CLI
> expectations (if I recall it correctly).

OK thank you for the clarification.
> 
>>> +
>>>      vms->bootinfo.ram_size = machine->ram_size;
>>>      vms->bootinfo.kernel_filename = machine->kernel_filename;
>>>      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
>>> @@ -1828,11 +1843,19 @@ static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>>>  static void virt_memory_plug(HotplugHandler *hotplug_dev,
>>>                               DeviceState *dev, Error **errp)
>>>  {
>>> +    HotplugHandlerClass *hhc;
>>>      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>>>      Error *local_err = NULL;
>>>  
>>>      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
>>> +    if (local_err) {
>>> +        goto out;
>>> +    }
>>> +
>>> +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi);
>>> +    hhc->plug(HOTPLUG_HANDLER(vms->acpi), dev, &error_abort);  
>> Why error_abort instead of propagating the error?
> 
> After last round of changes to hotplug handler, it's deemed that plug() handler
> should not fail (I didn't get my hands on removing error argument from interface
> yet). All checks and graceful abort should happen at pre_plug() stage.
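
For reference, the split described above can be sketched outside QEMU as follows. This is only an illustration under assumed names: SketchSlots, sketch_pre_plug(), sketch_plug() and sketch_hotplug() are hypothetical and are not QEMU APIs. Every check that can fail sits in the pre-plug step; the plug step treats a failure as a programming error and aborts, which is roughly what passing &error_abort expresses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a machine with a fixed number of memory slots. */
#define SKETCH_NR_SLOTS 4

typedef struct {
    bool used[SKETCH_NR_SLOTS];
} SketchSlots;

/*
 * pre-plug stage: all checks that may fail gracefully live here.
 * Returns NULL on success, or a static error message.
 */
static const char *sketch_pre_plug(const SketchSlots *s, int slot)
{
    if (slot < 0 || slot >= SKETCH_NR_SLOTS) {
        return "slot out of range";
    }
    if (s->used[slot]) {
        return "slot already in use";
    }
    return NULL;
}

/*
 * plug stage: must not fail. If the pre-plug checks do not hold here,
 * that is a bug in the caller, so abort instead of reporting an error.
 */
static void sketch_plug(SketchSlots *s, int slot)
{
    if (sketch_pre_plug(s, slot) != NULL) {
        abort();
    }
    s->used[slot] = true;
}

/* Caller: validate first, then commit. Only the first step can fail. */
static const char *sketch_hotplug(SketchSlots *s, int slot)
{
    const char *err = sketch_pre_plug(s, slot);

    if (err) {
        return err;
    }
    sketch_plug(s, slot);
    return NULL;
}
```

With this split, a caller only has to handle errors coming from the pre-plug step.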

Thanks

Eric
> 
>>>  
>>> +out:
>>>      error_propagate(errp, local_err);
>>>  }
>>>  
>>> diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
>>> index f314515..262ca7d 100644
>>> --- a/include/hw/acpi/generic_event_device.h
>>> +++ b/include/hw/acpi/generic_event_device.h
>>> @@ -18,12 +18,17 @@
>>>  #ifndef HW_ACPI_GED_H
>>>  #define HW_ACPI_GED_H
>>>  
>>> +#include "hw/acpi/memory_hotplug.h"
>>> +
>>>  #define TYPE_VIRT_ACPI "virt-acpi"
>>>  #define VIRT_ACPI(obj) \
>>>      OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
>>>  
>>>  typedef struct VirtAcpiState {
>>>      SysBusDevice parent_obj;
>>> +    MemHotplugState memhp_state;
>>> +    hwaddr memhp_base;
>>>  } VirtAcpiState;
>>>  
>>> +  
>> spurious newline
>>
>> Thanks
>>
>> Eric
>>>  #endif
>>> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
>>> index 507517c..c5e4c96 100644
>>> --- a/include/hw/arm/virt.h
>>> +++ b/include/hw/arm/virt.h
>>> @@ -77,6 +77,7 @@ enum {
>>>      VIRT_GPIO,
>>>      VIRT_SECURE_UART,
>>>      VIRT_SECURE_MEM,
>>> +    VIRT_PCDIMM_ACPI,
>>>      VIRT_LOWMEMMAP_LAST,
>>>  };
>>>  
>>> @@ -132,6 +133,7 @@ typedef struct {
>>>      uint32_t iommu_phandle;
>>>      int psci_conduit;
>>>      hwaddr highest_gpa;
>>> +    DeviceState *acpi;
>>>  } VirtMachineState;
>>>  
>>>  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
>>>   
> 
> 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug
  2019-04-01 13:34     ` [Qemu-devel] " Igor Mammedov
@ 2019-04-01 16:24       ` Shameerali Kolothum Thodi
  -1 siblings, 0 replies; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-04-01 16:24 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com,
	shannon.zhaosl@gmail.com, qemu-devel@nongnu.org, Linuxarm,
	eric.auger@redhat.com, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com



> -----Original Message-----
> From: Igor Mammedov [mailto:imammedo@redhat.com]
> Sent: 01 April 2019 14:34
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org;
> eric.auger@redhat.com; peter.maydell@linaro.org;
> shannon.zhaosl@gmail.com; sameo@linux.intel.com;
> sebastien.boeuf@intel.com; Linuxarm <linuxarm@huawei.com>; xuwei (O)
> <xuwei5@huawei.com>
> Subject: Re: [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device
> memory cold-plug
> 
> On Thu, 21 Mar 2019 10:47:40 +0000
> Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:
> 
> > This adds support to build the aml code so that Guest(ACPI boot)
> > can see the cold-plugged device memory. Memory cold plug support
> > with DT boot is not yet enabled.
> >
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > ---
> >  default-configs/arm-softmmu.mak        |  2 ++
> >  hw/acpi/generic_event_device.c         | 23 +++++++++++++++++++++++
> >  hw/arm/virt-acpi-build.c               |  9 +++++++++
> >  hw/arm/virt.c                          | 23 +++++++++++++++++++++++
> >  include/hw/acpi/generic_event_device.h |  5 +++++
> >  include/hw/arm/virt.h                  |  2 ++
> >  6 files changed, 64 insertions(+)
> >
> > diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> > index 795cb89..6db444e 100644
> > --- a/default-configs/arm-softmmu.mak
> > +++ b/default-configs/arm-softmmu.mak
> > @@ -162,3 +162,5 @@ CONFIG_LSI_SCSI_PCI=y
> >
> >  CONFIG_MEM_DEVICE=y
> >  CONFIG_DIMM=y
> > +CONFIG_ACPI_MEMORY_HOTPLUG=y
> > +CONFIG_ACPI_HW_REDUCED=y
> > diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> > index b21a551..0b32fc9 100644
> > --- a/hw/acpi/generic_event_device.c
> > +++ b/hw/acpi/generic_event_device.c
> > @@ -16,13 +16,26 @@
> >   */
> >
> >  #include "qemu/osdep.h"
> > +#include "qapi/error.h"
> > +#include "exec/address-spaces.h"
> >  #include "hw/sysbus.h"
> >  #include "hw/acpi/acpi.h"
> >  #include "hw/acpi/generic_event_device.h"
> > +#include "hw/mem/pc-dimm.h"
> >
> >  static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> >                                  DeviceState *dev, Error **errp)
> >  {
> > +    VirtAcpiState *s = VIRT_ACPI(hotplug_dev);
> > +
> > +    if (s->memhp_state.is_enabled &&
> > +        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> > +            acpi_memory_plug_cb(hotplug_dev, &s->memhp_state,
> > +                                dev, errp);
> > +    } else {
> > +        error_setg(errp, "virt: device plug request for unsupported device"
> > +                   " type: %s", object_get_typename(OBJECT(dev)));
> > +    }
> >  }
> >
> >  static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > @@ -31,9 +44,19 @@ static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> >
> >  static void virt_device_realize(DeviceState *dev, Error **errp)
> >  {
> > +    VirtAcpiState *s = VIRT_ACPI(dev);
> > +
> > +    if (s->memhp_state.is_enabled) {
> > +        acpi_memory_hotplug_init(get_system_memory(), OBJECT(dev),
> > +                                 &s->memhp_state,
> > +                                 s->memhp_base);
> > +    }
> >  }
> >
> >  static Property virt_acpi_properties[] = {
> > +    DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base, 0),
> 
> it's preferred to use '-' in property names

Ok.

> > +    DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
> > +                     memhp_state.is_enabled, true),
> >      DEFINE_PROP_END_OF_LIST(),
> >  };
> >
> > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > index bf9c0bc..20d3c83 100644
> > --- a/hw/arm/virt-acpi-build.c
> > +++ b/hw/arm/virt-acpi-build.c
> > @@ -40,6 +40,7 @@
> >  #include "hw/loader.h"
> >  #include "hw/hw.h"
> >  #include "hw/acpi/aml-build.h"
> > +#include "hw/acpi/memory_hotplug.h"
> >  #include "hw/pci/pcie_host.h"
> >  #include "hw/pci/pci.h"
> >  #include "hw/arm/virt.h"
> > @@ -49,6 +50,13 @@
> >  #define ARM_SPI_BASE 32
> >  #define ACPI_POWER_BUTTON_DEVICE "PWRB"
> >
> > +static void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState *ms)
> > +{
> it's a dummy wrapper that will never be reused,
> I suggest just inlining its contents at the call site and dropping the wrapper.

Ok. I will move it then.

> 
> > +    uint32_t nr_mem = ms->ram_slots;
> > +
> > +    build_memory_hotplug_aml(scope, nr_mem, "\\_SB", NULL, AML_SYSTEM_MEMORY);
> > +}
> > +
> >  static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
> >  {
> >      uint16_t i;
> > @@ -740,6 +748,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> >       * the RTC ACPI device at all when using UEFI.
> >       */
> >      scope = aml_scope("\\_SB");
> > +    acpi_dsdt_add_memory_hotplug(scope, MACHINE(vms));
> >      acpi_dsdt_add_cpus(scope, vms->smp_cpus);
> >      acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
> >                         (irqmap[VIRT_UART] + ARM_SPI_BASE));
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index d0ff20d..13db0e9 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -133,6 +133,7 @@ static const MemMapEntry base_memmap[] = {
> >      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
> >      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
> >      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
> > +    [VIRT_PCDIMM_ACPI] =        { 0x09070000, 0x00010000 },
>                                                  ^^^^^^^^^^^
> where does this magic number come from?

I think the only requirement for the size is >= MEMORY_HOTPLUG_IO_LEN (24).
So maybe 64K is a bit too much; 4K might do the job just as well.

Or is it best to just use MEMORY_HOTPLUG_IO_LEN directly here?
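
To make the size constraint concrete, here is a minimal standalone sketch, not QEMU code. The value 24 is the MEMORY_HOTPLUG_IO_LEN figure quoted above; SketchMemMapEntry and memhp_window_fits() are hypothetical names, with the struct only assumed to mirror a base/size pair like the memmap entries in the patch.

```c
#include <stdint.h>

/* Register block length as quoted in this mail; redefined so the sketch builds alone. */
#define MEMORY_HOTPLUG_IO_LEN 24

/* Hypothetical local stand-in for a memory map entry (base address and size). */
typedef struct {
    uint64_t base;
    uint64_t size;
} SketchMemMapEntry;

/* Returns 1 if the reserved window can hold the memory hotplug registers. */
static int memhp_window_fits(const SketchMemMapEntry *e)
{
    return e->size >= MEMORY_HOTPLUG_IO_LEN;
}
```

Under this check the 64K window from the patch ({ 0x09070000, 0x00010000 }), a 4K window and an exact MEMORY_HOTPLUG_IO_LEN window would all pass, so the choice between them comes down to memory map layout and alignment preferences rather than to the register block itself.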
 
> 
> >      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
> >      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
> >      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> > @@ -516,6 +517,18 @@ static void fdt_add_pmu_nodes(const VirtMachineState *vms)
> >      }
> >  }
> >
> > +static DeviceState *create_virt_acpi(VirtMachineState *vms)
> > +{
> > +    DeviceState *dev;
> > +
> > +    dev = qdev_create(NULL, "virt-acpi");
> > +    qdev_prop_set_uint64(dev, "memhp_base",
> > +                         vms->memmap[VIRT_PCDIMM_ACPI].base);
> > +    qdev_init_nofail(dev);
> > +
> > +    return dev;
> 
> Probably not worth a wrapper either, since the code is trivial and isn't reused
> elsewhere.

Ok, I will inline it then.

Thanks,
Shameer
 
> > +}
> > +
> >  static void create_its(VirtMachineState *vms, DeviceState *gicdev)
> >  {
> >      const char *itsclass = its_class_name();
> > @@ -1644,6 +1657,8 @@ static void machvirt_init(MachineState *machine)
> >
> >      create_platform_bus(vms, pic);
> >
> > +    vms->acpi = create_virt_acpi(vms);
> > +
> >      vms->bootinfo.ram_size = machine->ram_size;
> >      vms->bootinfo.kernel_filename = machine->kernel_filename;
> >      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
> > @@ -1828,11 +1843,19 @@ static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >  static void virt_memory_plug(HotplugHandler *hotplug_dev,
> >                               DeviceState *dev, Error **errp)
> >  {
> > +    HotplugHandlerClass *hhc;
> >      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> >      Error *local_err = NULL;
> >
> >      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +
> > +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi);
> > +    hhc->plug(HOTPLUG_HANDLER(vms->acpi), dev, &error_abort);
> >
> > +out:
> >      error_propagate(errp, local_err);
> >  }
> >
> > diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
> > index f314515..262ca7d 100644
> > --- a/include/hw/acpi/generic_event_device.h
> > +++ b/include/hw/acpi/generic_event_device.h
> > @@ -18,12 +18,17 @@
> >  #ifndef HW_ACPI_GED_H
> >  #define HW_ACPI_GED_H
> >
> > +#include "hw/acpi/memory_hotplug.h"
> > +
> >  #define TYPE_VIRT_ACPI "virt-acpi"
> >  #define VIRT_ACPI(obj) \
> >      OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> >
> >  typedef struct VirtAcpiState {
> >      SysBusDevice parent_obj;
> > +    MemHotplugState memhp_state;
> > +    hwaddr memhp_base;
> >  } VirtAcpiState;
> >
> > +
> >  #endif
> > diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> > index 507517c..c5e4c96 100644
> > --- a/include/hw/arm/virt.h
> > +++ b/include/hw/arm/virt.h
> > @@ -77,6 +77,7 @@ enum {
> >      VIRT_GPIO,
> >      VIRT_SECURE_UART,
> >      VIRT_SECURE_MEM,
> > +    VIRT_PCDIMM_ACPI,
> >      VIRT_LOWMEMMAP_LAST,
> >  };
> >
> > @@ -132,6 +133,7 @@ typedef struct {
> >      uint32_t iommu_phandle;
> >      int psci_conduit;
> >      hwaddr highest_gpa;
> > +    DeviceState *acpi;
> >  } VirtMachineState;
> >
> >  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-devel] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug
@ 2019-04-01 16:24       ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-04-01 16:24 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org, eric.auger@redhat.com,
	peter.maydell@linaro.org, shannon.zhaosl@gmail.com,
	sameo@linux.intel.com, sebastien.boeuf@intel.com, Linuxarm,
	xuwei (O)



> -----Original Message-----
> From: Igor Mammedov [mailto:imammedo@redhat.com]
> Sent: 01 April 2019 14:34
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org;
> eric.auger@redhat.com; peter.maydell@linaro.org;
> shannon.zhaosl@gmail.com; sameo@linux.intel.com;
> sebastien.boeuf@intel.com; Linuxarm <linuxarm@huawei.com>; xuwei (O)
> <xuwei5@huawei.com>
> Subject: Re: [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device
> memory cold-plug
> 
> On Thu, 21 Mar 2019 10:47:40 +0000
> Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:
> 
> > This adds support to build the aml code so that Guest(ACPI boot)
> > can see the cold-plugged device memory. Memory cold plug support
> > with DT boot is not yet enabled.
> >
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > ---
> >  default-configs/arm-softmmu.mak        |  2 ++
> >  hw/acpi/generic_event_device.c         | 23 +++++++++++++++++++++++
> >  hw/arm/virt-acpi-build.c               |  9 +++++++++
> >  hw/arm/virt.c                          | 23 +++++++++++++++++++++++
> >  include/hw/acpi/generic_event_device.h |  5 +++++
> >  include/hw/arm/virt.h                  |  2 ++
> >  6 files changed, 64 insertions(+)
> >
> > diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> > index 795cb89..6db444e 100644
> > --- a/default-configs/arm-softmmu.mak
> > +++ b/default-configs/arm-softmmu.mak
> > @@ -162,3 +162,5 @@ CONFIG_LSI_SCSI_PCI=y
> >
> >  CONFIG_MEM_DEVICE=y
> >  CONFIG_DIMM=y
> > +CONFIG_ACPI_MEMORY_HOTPLUG=y
> > +CONFIG_ACPI_HW_REDUCED=y
> > diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> > index b21a551..0b32fc9 100644
> > --- a/hw/acpi/generic_event_device.c
> > +++ b/hw/acpi/generic_event_device.c
> > @@ -16,13 +16,26 @@
> >   */
> >
> >  #include "qemu/osdep.h"
> > +#include "qapi/error.h"
> > +#include "exec/address-spaces.h"
> >  #include "hw/sysbus.h"
> >  #include "hw/acpi/acpi.h"
> >  #include "hw/acpi/generic_event_device.h"
> > +#include "hw/mem/pc-dimm.h"
> >
> >  static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> >                                  DeviceState *dev, Error **errp)
> >  {
> > +    VirtAcpiState *s = VIRT_ACPI(hotplug_dev);
> > +
> > +    if (s->memhp_state.is_enabled &&
> > +        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> > +            acpi_memory_plug_cb(hotplug_dev, &s->memhp_state,
> > +                                dev, errp);
> > +    } else {
> > +        error_setg(errp, "virt: device plug request for unsupported device"
> > +                   " type: %s", object_get_typename(OBJECT(dev)));
> > +    }
> >  }
> >
> >  static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > @@ -31,9 +44,19 @@ static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> >
> >  static void virt_device_realize(DeviceState *dev, Error **errp)
> >  {
> > +    VirtAcpiState *s = VIRT_ACPI(dev);
> > +
> > +    if (s->memhp_state.is_enabled) {
> > +        acpi_memory_hotplug_init(get_system_memory(), OBJECT(dev),
> > +                                 &s->memhp_state,
> > +                                 s->memhp_base);
> > +    }
> >  }
> >
> >  static Property virt_acpi_properties[] = {
> > +    DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base, 0),
> 
> it's preferred to use '-' in property names

Ok.

> > +    DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
> > +                     memhp_state.is_enabled, true),
> >      DEFINE_PROP_END_OF_LIST(),
> >  };
> >
> > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > index bf9c0bc..20d3c83 100644
> > --- a/hw/arm/virt-acpi-build.c
> > +++ b/hw/arm/virt-acpi-build.c
> > @@ -40,6 +40,7 @@
> >  #include "hw/loader.h"
> >  #include "hw/hw.h"
> >  #include "hw/acpi/aml-build.h"
> > +#include "hw/acpi/memory_hotplug.h"
> >  #include "hw/pci/pcie_host.h"
> >  #include "hw/pci/pci.h"
> >  #include "hw/arm/virt.h"
> > @@ -49,6 +50,13 @@
> >  #define ARM_SPI_BASE 32
> >  #define ACPI_POWER_BUTTON_DEVICE "PWRB"
> >
> > +static void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState *ms)
> > +{
> it's a dummy wrapper that will never be reused,
> I suggest just inlining its contents at the call site and dropping the wrapper.

Ok. I will move it then.

> 
> > +    uint32_t nr_mem = ms->ram_slots;
> > +
> > +    build_memory_hotplug_aml(scope, nr_mem, "\\_SB", NULL, AML_SYSTEM_MEMORY);
> > +}
> > +
> >  static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
> >  {
> >      uint16_t i;
> > @@ -740,6 +748,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> >       * the RTC ACPI device at all when using UEFI.
> >       */
> >      scope = aml_scope("\\_SB");
> > +    acpi_dsdt_add_memory_hotplug(scope, MACHINE(vms));
> >      acpi_dsdt_add_cpus(scope, vms->smp_cpus);
> >      acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
> >                         (irqmap[VIRT_UART] + ARM_SPI_BASE));
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index d0ff20d..13db0e9 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -133,6 +133,7 @@ static const MemMapEntry base_memmap[] = {
> >      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
> >      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
> >      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
> > +    [VIRT_PCDIMM_ACPI] =        { 0x09070000, 0x00010000 },
>                                                  ^^^^^^^^^^^
> where does this magic number come from?

I think the only requirement for the size is >= MEMORY_HOTPLUG_IO_LEN (24).
So maybe 64K is a bit too much; 4K might do the job just as well.

Or is it best to just use MEMORY_HOTPLUG_IO_LEN directly here?
 
> 
> >      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
> >      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
> >      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> > @@ -516,6 +517,18 @@ static void fdt_add_pmu_nodes(const VirtMachineState *vms)
> >      }
> >  }
> >
> > +static DeviceState *create_virt_acpi(VirtMachineState *vms)
> > +{
> > +    DeviceState *dev;
> > +
> > +    dev = qdev_create(NULL, "virt-acpi");
> > +    qdev_prop_set_uint64(dev, "memhp_base",
> > +                         vms->memmap[VIRT_PCDIMM_ACPI].base);
> > +    qdev_init_nofail(dev);
> > +
> > +    return dev;
> 
> Probably not worth a wrapper either, since the code is trivial and isn't reused
> elsewhere.

Ok, I will make it an inline then.

Thanks,
Shameer
 
> > +}
> > +
> >  static void create_its(VirtMachineState *vms, DeviceState *gicdev)
> >  {
> >      const char *itsclass = its_class_name();
> > @@ -1644,6 +1657,8 @@ static void machvirt_init(MachineState *machine)
> >
> >      create_platform_bus(vms, pic);
> >
> > +    vms->acpi = create_virt_acpi(vms);
> > +
> >      vms->bootinfo.ram_size = machine->ram_size;
> >      vms->bootinfo.kernel_filename = machine->kernel_filename;
> >      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
> > @@ -1828,11 +1843,19 @@ static void
> virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >  static void virt_memory_plug(HotplugHandler *hotplug_dev,
> >                               DeviceState *dev, Error **errp)
> >  {
> > +    HotplugHandlerClass *hhc;
> >      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> >      Error *local_err = NULL;
> >
> >      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +
> > +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi);
> > +    hhc->plug(HOTPLUG_HANDLER(vms->acpi), dev, &error_abort);
> >
> > +out:
> >      error_propagate(errp, local_err);
> >  }
> >
> > diff --git a/include/hw/acpi/generic_event_device.h
> b/include/hw/acpi/generic_event_device.h
> > index f314515..262ca7d 100644
> > --- a/include/hw/acpi/generic_event_device.h
> > +++ b/include/hw/acpi/generic_event_device.h
> > @@ -18,12 +18,17 @@
> >  #ifndef HW_ACPI_GED_H
> >  #define HW_ACPI_GED_H
> >
> > +#include "hw/acpi/memory_hotplug.h"
> > +
> >  #define TYPE_VIRT_ACPI "virt-acpi"
> >  #define VIRT_ACPI(obj) \
> >      OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> >
> >  typedef struct VirtAcpiState {
> >      SysBusDevice parent_obj;
> > +    MemHotplugState memhp_state;
> > +    hwaddr memhp_base;
> >  } VirtAcpiState;
> >
> > +
> >  #endif
> > diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> > index 507517c..c5e4c96 100644
> > --- a/include/hw/arm/virt.h
> > +++ b/include/hw/arm/virt.h
> > @@ -77,6 +77,7 @@ enum {
> >      VIRT_GPIO,
> >      VIRT_SECURE_UART,
> >      VIRT_SECURE_MEM,
> > +    VIRT_PCDIMM_ACPI,
> >      VIRT_LOWMEMMAP_LAST,
> >  };
> >
> > @@ -132,6 +133,7 @@ typedef struct {
> >      uint32_t iommu_phandle;
> >      int psci_conduit;
> >      hwaddr highest_gpa;
> > +    DeviceState *acpi;
> >  } VirtMachineState;
> >
> >  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM :
> VIRT_PCIE_ECAM)

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device
  2019-04-01 14:21           ` Shameerali Kolothum Thodi
@ 2019-04-02  6:31             ` Igor Mammedov
  -1 siblings, 0 replies; 95+ messages in thread
From: Igor Mammedov @ 2019-04-02  6:31 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com,
	shannon.zhaosl@gmail.com, qemu-devel@nongnu.org, Linuxarm,
	Auger Eric, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com

On Mon, 1 Apr 2019 14:21:40 +0000
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:

> Hi Igor,
> 
> > -----Original Message-----
> > From: Igor Mammedov [mailto:imammedo@redhat.com]
> > Sent: 01 April 2019 14:09
> > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> > Cc: Auger Eric <eric.auger@redhat.com>; qemu-devel@nongnu.org;
> > qemu-arm@nongnu.org; peter.maydell@linaro.org;
> > shannon.zhaosl@gmail.com; sameo@linux.intel.com;
> > sebastien.boeuf@intel.com; Linuxarm <linuxarm@huawei.com>; xuwei (O)
> > <xuwei5@huawei.com>
> > Subject: Re: [Qemu-devel] [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI
> > device
> > 
> > On Fri, 29 Mar 2019 11:22:02 +0000
> > Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:
> >   
> > > > -----Original Message-----
> > > > From: Auger Eric [mailto:eric.auger@redhat.com]
> > > > Sent: 28 March 2019 14:15
> > > > To: Shameerali Kolothum Thodi  
> > <shameerali.kolothum.thodi@huawei.com>;  
> > > > qemu-devel@nongnu.org; qemu-arm@nongnu.org;  
> > imammedo@redhat.com;  
> > > > peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> > > > sameo@linux.intel.com; sebastien.boeuf@intel.com
> > > > Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>
> > > > Subject: Re: [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device
> > > >
> > > > Hi Shameer,
> > > >
> > > > On 3/21/19 11:47 AM, Shameer Kolothum wrote:  
> > > > > From: Samuel Ortiz <sameo@linux.intel.com>
> > > > >
> > > > > This adds the skeleton to support an acpi device interface for
> > > > > HW-reduced acpi platforms via ACPI GED - Generic Event Device (ACPI
> > > > > v6.1 5.6.9).
> > > > >
> > > > > This will be used by Arm/Virt to add hotplug support.
> > > > >
> > > > > Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
> > > > > Signed-off-by: Shameer Kolothum  
> > <shameerali.kolothum.thodi@huawei.com>  
> > > > > ---
> > > > >  hw/acpi/Kconfig                        |  4 ++
> > > > >  hw/acpi/Makefile.objs                  |  1 +
> > > > >  hw/acpi/generic_event_device.c         | 72  
> > > > ++++++++++++++++++++++++++++++++++  
> > > > >  include/hw/acpi/generic_event_device.h | 29 ++++++++++++++
> > > > >  4 files changed, 106 insertions(+)
> > > > >  create mode 100644 hw/acpi/generic_event_device.c  create mode  
> > > > 100644  
> > > > > include/hw/acpi/generic_event_device.h
> > > > >
> > > > > diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig index eca3bee..01a8b41
> > > > > 100644
> > > > > --- a/hw/acpi/Kconfig
> > > > > +++ b/hw/acpi/Kconfig
> > > > > @@ -27,3 +27,7 @@ config ACPI_VMGENID
> > > > >      bool
> > > > >      default y
> > > > >      depends on PC
> > > > > +
> > > > > +config ACPI_HW_REDUCED
> > > > > +    bool
> > > > > +    depends on ACPI
> > > > > diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs index
> > > > > 2d46e37..b753232 100644
> > > > > --- a/hw/acpi/Makefile.objs
> > > > > +++ b/hw/acpi/Makefile.objs
> > > > > @@ -6,6 +6,7 @@ common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG)  
> > +=  
> > > > > memory_hotplug.o
> > > > >  common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
> > > > >  common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
> > > > >  common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
> > > > > +common-obj-$(CONFIG_ACPI_HW_REDUCED) +=  
> > generic_event_device.o  
> > > > >  common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
> > > > >
> > > > >  common-obj-y += acpi_interface.o
> > > > > diff --git a/hw/acpi/generic_event_device.c
> > > > > b/hw/acpi/generic_event_device.c new file mode 100644 index
> > > > > 0000000..b21a551
> > > > > --- /dev/null
> > > > > +++ b/hw/acpi/generic_event_device.c
> > > > > @@ -0,0 +1,72 @@
> > > > > +/*
> > > > > + *
> > > > > + * Copyright (c) 2018 Intel Corporation
> > > > > + *
> > > > > + * This program is free software; you can redistribute it and/or
> > > > > +modify it
> > > > > + * under the terms and conditions of the GNU General Public License,
> > > > > + * version 2 or later, as published by the Free Software Foundation.
> > > > > + *
> > > > > + * This program is distributed in the hope it will be useful, but
> > > > > +WITHOUT
> > > > > + * ANY WARRANTY; without even the implied warranty of  
> > > > MERCHANTABILITY  
> > > > > +or
> > > > > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> > > > > +License for
> > > > > + * more details.
> > > > > + *
> > > > > + * You should have received a copy of the GNU General Public License
> > > > > +along with
> > > > > + * this program.  If not, see <http://www.gnu.org/licenses/>.
> > > > > + */
> > > > > +
> > > > > +#include "qemu/osdep.h"
> > > > > +#include "hw/sysbus.h"
> > > > > +#include "hw/acpi/acpi.h"
> > > > > +#include "hw/acpi/generic_event_device.h"  
> > > > The files are named generic_event_device.c/h while the device is named
> > > > "virt-acpi". I would suggest using the same naming as in nemu, i.e. ged or
> > > > acpi_ged.
> > >
> > > Agreed. The naming is a bit confusing. In nemu they have a separate virt-acpi
> > > dev which makes use of GED. Here, we are rolling those two into one. I am
> > > still not very sure whether we should leave it as virt-acpi, because the actual
> > > device on which this is implemented can be changed, e.g. GED vs GPIO.
> > 
> > I'm probably lacking context here; could you clarify and maybe compare the
> > differences between the x86 and ARM implementations, and why they should be
> > different devices?
> >   
> 
> Right. I was not comparing against x86, but just pointing out how Nemu has
> done this. They seem to have a virt-acpi dev specific to virt platforms
> (hw/i386/virt/acpi.c) and then moved all GED related code into a separate file
> (hw/acpi/ged.c) [1].
> 
> I was just wondering whether that approach makes sense going forward, for
> cases where platforms support either GED or GPIO for hotplug and the
> virt-acpi dev can be configured to use either of those. Maybe not.

From what I see, nemu uses GED only as ACPI AML code, while TYPE_VIRT_ACPI
actually implements the hardware part of GED (i.e. initializes and owns MMIO/IRQs).
So it is a GED device in practice.
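
To make the "hardware part" concrete, here is a toy model, not QEMU code, of
what owning the MMIO register and the IRQ means for such a device. All names
(ToyGedState, toy_ged_send_event, toy_ged_read_sel, the event bit) are
hypothetical, and the read-to-clear behaviour is an assumption made for the
sketch rather than something taken from this series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical event bit; the real series defines its own encoding. */
#define TOY_GED_MEM_HOTPLUG_EVT (1u << 0)

typedef struct {
    uint32_t sel;      /* pending-event bitmap exposed through MMIO */
    int irq_level;     /* 1 while the interrupt line is asserted */
} ToyGedState;

/* Device side: latch the event and assert the interrupt line. */
static void toy_ged_send_event(ToyGedState *s, uint32_t ev)
{
    assert(s);
    s->sel |= ev;
    s->irq_level = 1;
}

/*
 * Guest side: the AML handler reads the register to learn which events
 * are pending. In this toy model the read also clears the bitmap and
 * deasserts the interrupt.
 */
static uint32_t toy_ged_read_sel(ToyGedState *s)
{
    uint32_t pending;

    assert(s);
    pending = s->sel;
    s->sel = 0;
    s->irq_level = 0;
    return pending;
}
```

The AML side only needs to know where the register lives and which interrupt
fires; the state and the interrupt line belong to the device model.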

If the ACPI spec allows using GPIO with a GED device, then I'd add that
later, when there is an actual use case for it. Otherwise GPIO is just another
device with its own AML part to go with it.

So I'd second Eric's suggestion to rename virt-acpi to acpi-ged.

> Thanks,
> Shameer
> 
> [1]https://github.com/intel/nemu/commit/bcff7ee8588f7049cd919ee8b349f219a873ec41#diff-82ce92e28467c5894c90311f0e6a75fb
> 
> >   
> > > > I think you should clarify the exact scope of this device. The patch title
> > > > makes me think it is bound to be used only in machvirt (plus the virt prefix
> > > > used in numerous functions?). Is it also meant to be used by other architectures?
> > > > > +
> > > > > +static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> > > > > +                                DeviceState *dev, Error **errp)  
> > { }  
> > > > > +
> > > > > +static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > > > > +{ }
> > > > > +
> > > > > +static void virt_device_realize(DeviceState *dev, Error **errp) { }
> > > > > +
> > > > > +static Property virt_acpi_properties[] = {
> > > > > +    DEFINE_PROP_END_OF_LIST(),
> > > > > +};
> > > > > +
> > > > > +static void virt_acpi_class_init(ObjectClass *class, void *data) {
> > > > > +    DeviceClass *dc = DEVICE_CLASS(class);
> > > > > +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(class);
> > > > > +    AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_CLASS(class);
> > > > > +
> > > > > +    dc->desc = "ACPI";
> > > > > +    dc->props = virt_acpi_properties;
> > > > > +    dc->realize = virt_device_realize;
> > > > > +
> > > > > +    hc->plug = virt_device_plug_cb;
> > > > > +
> > > > > +    adevc->send_event = virt_send_ged; }
> > > > > +
> > > > > +static const TypeInfo virt_acpi_info = {
> > > > > +    .name          = TYPE_VIRT_ACPI,
> > > > > +    .parent        = TYPE_SYS_BUS_DEVICE,
> > > > > +    .instance_size = sizeof(VirtAcpiState),
> > > > > +    .class_init    = virt_acpi_class_init,
> > > > > +    .interfaces = (InterfaceInfo[]) {
> > > > > +        { TYPE_HOTPLUG_HANDLER },
> > > > > +        { TYPE_ACPI_DEVICE_IF },
> > > > > +        { }
> > > > > +    }
> > > > > +};
> > > > > +
> > > > > +static void virt_acpi_register_types(void) {
> > > > > +    type_register_static(&virt_acpi_info);
> > > > > +}
> > > > > +
> > > > > +type_init(virt_acpi_register_types)
> > > > > diff --git a/include/hw/acpi/generic_event_device.h
> > > > > b/include/hw/acpi/generic_event_device.h
> > > > > new file mode 100644
> > > > > index 0000000..f314515
> > > > > --- /dev/null
> > > > > +++ b/include/hw/acpi/generic_event_device.h
> > > > > @@ -0,0 +1,29 @@
> > > > > +/*
> > > > > + *
> > > > > + * Copyright (c) 2018 Intel Corporation
> > > > > + *
> > > > > + * This program is free software; you can redistribute it and/or
> > > > > +modify it
> > > > > + * under the terms and conditions of the GNU General Public License,
> > > > > + * version 2 or later, as published by the Free Software Foundation.
> > > > > + *
> > > > > + * This program is distributed in the hope it will be useful, but
> > > > > +WITHOUT
> > > > > + * ANY WARRANTY; without even the implied warranty of  
> > > > MERCHANTABILITY  
> > > > > +or
> > > > > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> > > > > +License for
> > > > > + * more details.
> > > > > + *
> > > > > + * You should have received a copy of the GNU General Public License
> > > > > +along with
> > > > > + * this program.  If not, see <http://www.gnu.org/licenses/>.
> > > > > + */  
> > > > Add a comment in the header introducing what is the role of this device?
> > > > link to GED spec? Explain the subset of the interfaces being implemented by
> > > > the device.  
> > >
> > > Ok. I have added comments to that effect in patch #10, but I think I will make it
> > > clear here as well.
> > >
> > > Cheers,
> > > Shameer
> > >  
> > > > > +
> > > > > +#ifndef HW_ACPI_GED_H
> > > > > +#define HW_ACPI_GED_H
> > > > > +
> > > > > +#define TYPE_VIRT_ACPI "virt-acpi"
> > > > > +#define VIRT_ACPI(obj) \
> > > > > +    OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> > > > > +
> > > > > +typedef struct VirtAcpiState {
> > > > > +    SysBusDevice parent_obj;
> > > > > +} VirtAcpiState;
> > > > > +
> > > > > +#endif
> > > > >  
> 
> 


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug
  2019-04-01 14:51         ` [Qemu-devel] " Shameerali Kolothum Thodi
@ 2019-04-02  7:19           ` Igor Mammedov
  -1 siblings, 0 replies; 95+ messages in thread
From: Igor Mammedov @ 2019-04-02  7:19 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com,
	shannon.zhaosl@gmail.com, qemu-devel@nongnu.org, Linuxarm,
	Auger Eric, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com

On Mon, 1 Apr 2019 14:51:51 +0000
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:

> Hi Igor,
> 
> > -----Original Message-----
> > From: Igor Mammedov [mailto:imammedo@redhat.com]
> > Sent: 01 April 2019 14:43
> > To: Auger Eric <eric.auger@redhat.com>
> > Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> > qemu-devel@nongnu.org; qemu-arm@nongnu.org; peter.maydell@linaro.org;
> > shannon.zhaosl@gmail.com; sameo@linux.intel.com;
> > sebastien.boeuf@intel.com; Linuxarm <linuxarm@huawei.com>; xuwei (O)
> > <xuwei5@huawei.com>
> > Subject: Re: [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device
> > memory cold-plug
> > 
> > On Fri, 29 Mar 2019 10:31:14 +0100
> > Auger Eric <eric.auger@redhat.com> wrote:
> >   
> > > Hi Shameer,
> > >
> > > On 3/21/19 11:47 AM, Shameer Kolothum wrote:  
> > > > This adds support to build the aml code so that Guest(ACPI boot)
> > > > can see the cold-plugged device memory. Memory cold plug support
> > > > with DT boot is not yet enabled.
> > > >
> > > > Signed-off-by: Shameer Kolothum  
> > <shameerali.kolothum.thodi@huawei.com>  
> > > > ---
> > > >  default-configs/arm-softmmu.mak        |  2 ++
> > > >  hw/acpi/generic_event_device.c         | 23  
> > +++++++++++++++++++++++  
> > > >  hw/arm/virt-acpi-build.c               |  9 +++++++++
> > > >  hw/arm/virt.c                          | 23  
> > +++++++++++++++++++++++  
> > > >  include/hw/acpi/generic_event_device.h |  5 +++++
> > > >  include/hw/arm/virt.h                  |  2 ++
> > > >  6 files changed, 64 insertions(+)
> > > >
> > > > diff --git a/default-configs/arm-softmmu.mak  
> > b/default-configs/arm-softmmu.mak  
> > > > index 795cb89..6db444e 100644
> > > > --- a/default-configs/arm-softmmu.mak
> > > > +++ b/default-configs/arm-softmmu.mak
> > > > @@ -162,3 +162,5 @@ CONFIG_LSI_SCSI_PCI=y
> > > >
> > > >  CONFIG_MEM_DEVICE=y
> > > >  CONFIG_DIMM=y
> > > > +CONFIG_ACPI_MEMORY_HOTPLUG=y
> > > > +CONFIG_ACPI_HW_REDUCED=y
> > > > diff --git a/hw/acpi/generic_event_device.c  
> > b/hw/acpi/generic_event_device.c  
> > > > index b21a551..0b32fc9 100644
> > > > --- a/hw/acpi/generic_event_device.c
> > > > +++ b/hw/acpi/generic_event_device.c
> > > > @@ -16,13 +16,26 @@
> > > >   */
> > > >
> > > >  #include "qemu/osdep.h"
> > > > +#include "qapi/error.h"
> > > > +#include "exec/address-spaces.h"
> > > >  #include "hw/sysbus.h"
> > > >  #include "hw/acpi/acpi.h"
> > > >  #include "hw/acpi/generic_event_device.h"
> > > > +#include "hw/mem/pc-dimm.h"
> > > >
> > > >  static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> > > >                                  DeviceState *dev, Error **errp)
> > > >  {
> > > > +    VirtAcpiState *s = VIRT_ACPI(hotplug_dev);
> > > > +
> > > > +    if (s->memhp_state.is_enabled &&
> > > > +        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> > > > +            acpi_memory_plug_cb(hotplug_dev, &s->memhp_state,
> > > > +                                dev, errp);
> > > > +    } else {
> > > > +        error_setg(errp, "virt: device plug request for unsupported  
> > device"  
> > > > +                   " type: %s", object_get_typename(OBJECT(dev)));
> > > > +    }
> > > >  }
> > > >
> > > >  static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > > > @@ -31,9 +44,19 @@ static void virt_send_ged(AcpiDeviceIf *adev,  
> > AcpiEventStatusBits ev)  
> > > >
> > > >  static void virt_device_realize(DeviceState *dev, Error **errp)
> > > >  {
> > > > +    VirtAcpiState *s = VIRT_ACPI(dev);
> > > > +
> > > > +    if (s->memhp_state.is_enabled) {
> > > > +        acpi_memory_hotplug_init(get_system_memory(),  
> > OBJECT(dev),  
> > > > +                                 &s->memhp_state,
> > > > +                                 s->memhp_base);
> > > > +    }
> > > >  }
> > > >
> > > >  static Property virt_acpi_properties[] = {
> > > > +    DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base,  
> > 0),  
> > > > +    DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
> > > > +                     memhp_state.is_enabled, true),
> > > >      DEFINE_PROP_END_OF_LIST(),
> > > >  };
> > > >
> > > > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > > > index bf9c0bc..20d3c83 100644
> > > > --- a/hw/arm/virt-acpi-build.c
> > > > +++ b/hw/arm/virt-acpi-build.c
> > > > @@ -40,6 +40,7 @@
> > > >  #include "hw/loader.h"
> > > >  #include "hw/hw.h"
> > > >  #include "hw/acpi/aml-build.h"
> > > > +#include "hw/acpi/memory_hotplug.h"
> > > >  #include "hw/pci/pcie_host.h"
> > > >  #include "hw/pci/pci.h"
> > > >  #include "hw/arm/virt.h"
> > > > @@ -49,6 +50,13 @@
> > > >  #define ARM_SPI_BASE 32
> > > >  #define ACPI_POWER_BUTTON_DEVICE "PWRB"
> > > >
> > > > +static void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState  
> > *ms)  
> > > > +{
> > > > +    uint32_t nr_mem = ms->ram_slots;
> > > > +
> > > > +    build_memory_hotplug_aml(scope, nr_mem, "\\_SB", NULL,  
> > AML_SYSTEM_MEMORY);  
> > > > +}
> > > > +
> > > >  static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
> > > >  {
> > > >      uint16_t i;
> > > > @@ -740,6 +748,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,  
> > VirtMachineState *vms)  
> > > >       * the RTC ACPI device at all when using UEFI.
> > > >       */
> > > >      scope = aml_scope("\\_SB");
> > > > +    acpi_dsdt_add_memory_hotplug(scope, MACHINE(vms));
> > > >      acpi_dsdt_add_cpus(scope, vms->smp_cpus);
> > > >      acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
> > > >                         (irqmap[VIRT_UART] + ARM_SPI_BASE));
> > > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > > > index d0ff20d..13db0e9 100644
> > > > --- a/hw/arm/virt.c
> > > > +++ b/hw/arm/virt.c
> > > > @@ -133,6 +133,7 @@ static const MemMapEntry base_memmap[] = {
> > > >      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
> > > >      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
> > > >      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
> > > > +    [VIRT_PCDIMM_ACPI] =        { 0x09070000, 0x00010000 },
> > > >      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
> > > >      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of  
> > that size */  
> > > >      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> > > > @@ -516,6 +517,18 @@ static void fdt_add_pmu_nodes(const  
> > VirtMachineState *vms)  
> > > >      }
> > > >  }
> > > >
> > > > +static DeviceState *create_virt_acpi(VirtMachineState *vms)
> > > > +{
> > > > +    DeviceState *dev;
> > > > +
> > > > +    dev = qdev_create(NULL, "virt-acpi");
> > > > +    qdev_prop_set_uint64(dev, "memhp_base",
> > > > +                         vms->memmap[VIRT_PCDIMM_ACPI].base);
> > > Maybe add a comment that a property is needed to integrate with
> > > acpi_memory_hotplug_init() (if I am not wrong). Otherwise one may wonder
> > > why sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, <base>) is not used, as it is
> > > for standard sysbus devices.
> > 
> > Why is it inherited from SYS_BUS_DEVICE to begin with?
> 
> Hmm.. I don't have a clear answer to that, other than that I just reused
> the way other platform devices (pl011/pl061/smmu etc.) are created. Also, PCI
> doesn't look like an obvious choice here. Please let me know if there is a
> better way of doing this.
If we don't need any of the SYSBUS facilities, then it's possible to inherit
from plain DEVICE and use object_property_add_child() to tie it to the machine
object explicitly.


> > >  
> > > > +    qdev_init_nofail(dev);
> > > > +
> > > > +    return dev;
> > > > +}
> > > > +
> > > >  static void create_its(VirtMachineState *vms, DeviceState *gicdev)
> > > >  {
> > > >      const char *itsclass = its_class_name();
> > > > @@ -1644,6 +1657,8 @@ static void machvirt_init(MachineState  
> > *machine)  
> > > >
> > > >      create_platform_bus(vms, pic);
> > > >
> > > > +    vms->acpi = create_virt_acpi(vms);
> > > I can see that on PC machines, they use a link property to set the
> > > acpi_dev. I am unsure about the exact reason, any idea?
> > 
> > The pc and q35 machines have different devices that implement the ACPI
> > interface and live somewhere else in the system, and they also honor the
> > -no-acpi CLI option. A link allows caching a reference to whatever device is
> > in use and managing CLI expectations (if I recall correctly).
> 
> Thanks for clarifying this.
> 
> >   
> > > > +
> > > >      vms->bootinfo.ram_size = machine->ram_size;
> > > >      vms->bootinfo.kernel_filename = machine->kernel_filename;
> > > >      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
> > > > @@ -1828,11 +1843,19 @@ static void  
> > virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,  
> > > >  static void virt_memory_plug(HotplugHandler *hotplug_dev,
> > > >                               DeviceState *dev, Error **errp)
> > > >  {
> > > > +    HotplugHandlerClass *hhc;
> > > >      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> > > >      Error *local_err = NULL;
> > > >
> > > >      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> > > > +    if (local_err) {
> > > > +        goto out;
> > > > +    }
> > > > +
> > > > +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi);
> > > > +    hhc->plug(HOTPLUG_HANDLER(vms->acpi), dev, &error_abort);  
> > > Why error_abort instead of propagating the error?  
> > 
> > After the last round of changes to the hotplug handler, it's deemed that the
> > plug() handler should not fail (I haven't gotten around to removing the error
> > argument from the interface yet). All checks and graceful abort should happen
> > at the pre_plug() stage.
> 
> Ok. I will address this in the next revision.
> 
> Thanks,
> Shameer
>  
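
The pre_plug()/plug() split described above can be illustrated with a small
standalone toy model. This is only a sketch and not QEMU code: SlotTable,
NR_SLOTS and the toy_* functions are hypothetical names invented for this
example. The idea it tries to show is that every check that can fail lives in
the pre-plug step, while the plug step only commits state and treats a
violated precondition as a programming error, much like passing &error_abort
does in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SLOTS 4              /* hypothetical slot count for the toy model */

typedef struct {
    bool occupied[NR_SLOTS];
} SlotTable;                    /* hypothetical stand-in for hotplug state */

/*
 * Pre-plug stage: every check that can fail lives here.
 * Returns NULL on success or a static error string on failure.
 */
static const char *toy_pre_plug(const SlotTable *t, int slot)
{
    if (slot < 0 || slot >= NR_SLOTS) {
        return "slot out of range";
    }
    if (t->occupied[slot]) {
        return "slot already in use";
    }
    return NULL;
}

/*
 * Plug stage: only commits state. A violated precondition here is a
 * programming error, so it asserts instead of reporting an error.
 */
static void toy_plug(SlotTable *t, int slot)
{
    assert(slot >= 0 && slot < NR_SLOTS);
    assert(!t->occupied[slot]);
    t->occupied[slot] = true;
}

/* Caller: run the checks first and commit only when they all passed. */
static const char *toy_hotplug(SlotTable *t, int slot)
{
    const char *err = toy_pre_plug(t, slot);

    if (err) {
        return err;             /* graceful failure, nothing was changed */
    }
    toy_plug(t, slot);          /* cannot fail once the checks passed */
    return NULL;
}
```

A caller would invoke toy_hotplug(&table, slot) and report the returned
string, if any; the table is left untouched whenever a check fails.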
> > > >
> > > > +out:
> > > >      error_propagate(errp, local_err);
> > > >  }
> > > >
> > > > diff --git a/include/hw/acpi/generic_event_device.h  
> > b/include/hw/acpi/generic_event_device.h  
> > > > index f314515..262ca7d 100644
> > > > --- a/include/hw/acpi/generic_event_device.h
> > > > +++ b/include/hw/acpi/generic_event_device.h
> > > > @@ -18,12 +18,17 @@
> > > >  #ifndef HW_ACPI_GED_H
> > > >  #define HW_ACPI_GED_H
> > > >
> > > > +#include "hw/acpi/memory_hotplug.h"
> > > > +
> > > >  #define TYPE_VIRT_ACPI "virt-acpi"
> > > >  #define VIRT_ACPI(obj) \
> > > >      OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> > > >
> > > >  typedef struct VirtAcpiState {
> > > >      SysBusDevice parent_obj;
> > > > +    MemHotplugState memhp_state;
> > > > +    hwaddr memhp_base;
> > > >  } VirtAcpiState;
> > > >
> > > > +  
> > > spurious newline
> > >
> > > Thanks
> > >
> > > Eric  
> > > >  #endif
> > > > diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> > > > index 507517c..c5e4c96 100644
> > > > --- a/include/hw/arm/virt.h
> > > > +++ b/include/hw/arm/virt.h
> > > > @@ -77,6 +77,7 @@ enum {
> > > >      VIRT_GPIO,
> > > >      VIRT_SECURE_UART,
> > > >      VIRT_SECURE_MEM,
> > > > +    VIRT_PCDIMM_ACPI,
> > > >      VIRT_LOWMEMMAP_LAST,
> > > >  };
> > > >
> > > > @@ -132,6 +133,7 @@ typedef struct {
> > > >      uint32_t iommu_phandle;
> > > >      int psci_conduit;
> > > >      hwaddr highest_gpa;
> > > > +    DeviceState *acpi;
> > > >  } VirtMachineState;
> > > >
> > > >  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM :  
> > VIRT_PCIE_ECAM)  
> > > >  
> 
> 


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-devel] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug
  2019-04-01 16:24       ` [Qemu-devel] " Shameerali Kolothum Thodi
@ 2019-04-02  7:22         ` Igor Mammedov
  -1 siblings, 0 replies; 95+ messages in thread
From: Igor Mammedov @ 2019-04-02  7:22 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com,
	shannon.zhaosl@gmail.com, qemu-devel@nongnu.org, Linuxarm,
	eric.auger@redhat.com, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com

On Mon, 1 Apr 2019 16:24:40 +0000
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:

> > -----Original Message-----
> > From: Igor Mammedov [mailto:imammedo@redhat.com]
> > Sent: 01 April 2019 14:34
> > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> > Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org;
> > eric.auger@redhat.com; peter.maydell@linaro.org;
> > shannon.zhaosl@gmail.com; sameo@linux.intel.com;
> > sebastien.boeuf@intel.com; Linuxarm <linuxarm@huawei.com>; xuwei (O)
> > <xuwei5@huawei.com>
> > Subject: Re: [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device
> > memory cold-plug
> > 
> > On Thu, 21 Mar 2019 10:47:40 +0000
> > Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:
> >   
> > > This adds support to build the aml code so that Guest(ACPI boot)
> > > can see the cold-plugged device memory. Memory cold plug support
> > > with DT boot is not yet enabled.
> > >
> > > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > > ---
> > >  default-configs/arm-softmmu.mak        |  2 ++
> > >  hw/acpi/generic_event_device.c         | 23  
> > +++++++++++++++++++++++  
> > >  hw/arm/virt-acpi-build.c               |  9 +++++++++
> > >  hw/arm/virt.c                          | 23  
> > +++++++++++++++++++++++  
> > >  include/hw/acpi/generic_event_device.h |  5 +++++
> > >  include/hw/arm/virt.h                  |  2 ++
> > >  6 files changed, 64 insertions(+)
> > >
> > > diff --git a/default-configs/arm-softmmu.mak  
> > b/default-configs/arm-softmmu.mak  
> > > index 795cb89..6db444e 100644
> > > --- a/default-configs/arm-softmmu.mak
> > > +++ b/default-configs/arm-softmmu.mak
> > > @@ -162,3 +162,5 @@ CONFIG_LSI_SCSI_PCI=y
> > >
> > >  CONFIG_MEM_DEVICE=y
> > >  CONFIG_DIMM=y
> > > +CONFIG_ACPI_MEMORY_HOTPLUG=y
> > > +CONFIG_ACPI_HW_REDUCED=y
> > > diff --git a/hw/acpi/generic_event_device.c  
> > b/hw/acpi/generic_event_device.c  
> > > index b21a551..0b32fc9 100644
> > > --- a/hw/acpi/generic_event_device.c
> > > +++ b/hw/acpi/generic_event_device.c
> > > @@ -16,13 +16,26 @@
> > >   */
> > >
> > >  #include "qemu/osdep.h"
> > > +#include "qapi/error.h"
> > > +#include "exec/address-spaces.h"
> > >  #include "hw/sysbus.h"
> > >  #include "hw/acpi/acpi.h"
> > >  #include "hw/acpi/generic_event_device.h"
> > > +#include "hw/mem/pc-dimm.h"
> > >
> > >  static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> > >                                  DeviceState *dev, Error **errp)
> > >  {
> > > +    VirtAcpiState *s = VIRT_ACPI(hotplug_dev);
> > > +
> > > +    if (s->memhp_state.is_enabled &&
> > > +        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> > > +            acpi_memory_plug_cb(hotplug_dev, &s->memhp_state,
> > > +                                dev, errp);
> > > +    } else {
> > > +        error_setg(errp, "virt: device plug request for unsupported  
> > device"  
> > > +                   " type: %s", object_get_typename(OBJECT(dev)));
> > > +    }
> > >  }
> > >
> > >  static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > > @@ -31,9 +44,19 @@ static void virt_send_ged(AcpiDeviceIf *adev,  
> > AcpiEventStatusBits ev)  
> > >
> > >  static void virt_device_realize(DeviceState *dev, Error **errp)
> > >  {
> > > +    VirtAcpiState *s = VIRT_ACPI(dev);
> > > +
> > > +    if (s->memhp_state.is_enabled) {
> > > +        acpi_memory_hotplug_init(get_system_memory(), OBJECT(dev),
> > > +                                 &s->memhp_state,
> > > +                                 s->memhp_base);
> > > +    }
> > >  }
> > >
> > >  static Property virt_acpi_properties[] = {
> > > +    DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base, 0),
> > 
> > it's preferred to use '-' in property names  
> 
> Ok.
> 
> > > +    DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
> > > +                     memhp_state.is_enabled, true),
> > >      DEFINE_PROP_END_OF_LIST(),
> > >  };
> > >
> > > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > > index bf9c0bc..20d3c83 100644
> > > --- a/hw/arm/virt-acpi-build.c
> > > +++ b/hw/arm/virt-acpi-build.c
> > > @@ -40,6 +40,7 @@
> > >  #include "hw/loader.h"
> > >  #include "hw/hw.h"
> > >  #include "hw/acpi/aml-build.h"
> > > +#include "hw/acpi/memory_hotplug.h"
> > >  #include "hw/pci/pcie_host.h"
> > >  #include "hw/pci/pci.h"
> > >  #include "hw/arm/virt.h"
> > > @@ -49,6 +50,13 @@
> > >  #define ARM_SPI_BASE 32
> > >  #define ACPI_POWER_BUTTON_DEVICE "PWRB"
> > >
> > > +static void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState *ms)
> > > +{
> > It's a dummy wrapper that will never be reused;
> > I suggest just inlining its contents at the call site and dropping the wrapper.
> 
> Ok. I will move it then.
> 
> >   
> > > +    uint32_t nr_mem = ms->ram_slots;
> > > +
> > > +    build_memory_hotplug_aml(scope, nr_mem, "\\_SB", NULL,  
> > AML_SYSTEM_MEMORY);  
> > > +}
> > > +
> > >  static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
> > >  {
> > >      uint16_t i;
> > > @@ -740,6 +748,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,  
> > VirtMachineState *vms)  
> > >       * the RTC ACPI device at all when using UEFI.
> > >       */
> > >      scope = aml_scope("\\_SB");
> > > +    acpi_dsdt_add_memory_hotplug(scope, MACHINE(vms));
> > >      acpi_dsdt_add_cpus(scope, vms->smp_cpus);
> > >      acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
> > >                         (irqmap[VIRT_UART] + ARM_SPI_BASE));
> > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > > index d0ff20d..13db0e9 100644
> > > --- a/hw/arm/virt.c
> > > +++ b/hw/arm/virt.c
> > > @@ -133,6 +133,7 @@ static const MemMapEntry base_memmap[] = {
> > >      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
> > >      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
> > >      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
> > > +    [VIRT_PCDIMM_ACPI] =        { 0x09070000, 0x00010000 },  
> >                                                  ^^^^^^^^^^^
> > Where does this magic number come from?
> 
> I think the only requirement is that the size is >= MEMORY_HOTPLUG_IO_LEN (24).
> So maybe 64K is a bit too much; 4K might do the job just as well.
> 
> Or is it best to just use MEMORY_HOTPLUG_IO_LEN directly here?
4K is a waste for handling a handful of bytes, so I'd go with
MEMORY_HOTPLUG_IO_LEN unless there is a compelling reason for using
page size granularity.
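
To make the sizing question concrete, here is a small standalone C sketch
(not QEMU code). Region is a hypothetical stand-in for MemMapEntry and the
helper names are invented for illustration; the value 24 is
MEMORY_HOTPLUG_IO_LEN as quoted above, and the addresses are copied from the
base_memmap[] hunk in the patch. It only checks the arithmetic: where the
preceding region ends, whether the new window overlaps its neighbours, and
how many reserved bytes the register block would never use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value quoted in the thread for MEMORY_HOTPLUG_IO_LEN. */
#define MEMHP_IO_LEN 24u

typedef struct {
    uint64_t base;
    uint64_t size;
} Region;                       /* hypothetical stand-in for MemMapEntry */

/* First address past the end of a region. */
static uint64_t region_end(Region r)
{
    return r.base + r.size;
}

/* True when two regions share at least one address. */
static bool regions_overlap(Region a, Region b)
{
    return a.base < region_end(b) && b.base < region_end(a);
}

/* Bytes reserved in the map that the register block never uses. */
static uint64_t unused_bytes(Region r)
{
    return r.size - MEMHP_IO_LEN;
}

/* Entries copied from the base_memmap[] hunk of the patch. */
static const Region smmu        = { 0x09050000, 0x00020000 };
static const Region memhp_64k   = { 0x09070000, 0x00010000 };
static const Region virtio_mmio = { 0x0a000000, 0x00000200 };

/* Alternative discussed in the thread: size the window by the register block. */
static const Region memhp_min   = { 0x09070000, MEMHP_IO_LEN };

/* Sanity checks a candidate window has to satisfy in this toy layout. */
static void check_layout(Region memhp)
{
    assert(memhp.size >= MEMHP_IO_LEN);
    assert(!regions_overlap(smmu, memhp));
    assert(!regions_overlap(memhp, virtio_mmio));
}
```

Under these assumptions, check_layout() passes for both memhp_64k and
memhp_min, and unused_bytes(memhp_64k) comes to 65512 bytes of the 64K window
that the 24-byte register block never touches.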

>  
> >   
> > >      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
> > >      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that  
> > size */  
> > >      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> > > @@ -516,6 +517,18 @@ static void fdt_add_pmu_nodes(const  
> > VirtMachineState *vms)  
> > >      }
> > >  }
> > >
> > > +static DeviceState *create_virt_acpi(VirtMachineState *vms)
> > > +{
> > > +    DeviceState *dev;
> > > +
> > > +    dev = qdev_create(NULL, "virt-acpi");
> > > +    qdev_prop_set_uint64(dev, "memhp_base",
> > > +                         vms->memmap[VIRT_PCDIMM_ACPI].base);
> > > +    qdev_init_nofail(dev);
> > > +
> > > +    return dev;  
> > 
> > Probably not worth a wrapper either, since the code is trivial and isn't
> > reused elsewhere.
> 
> Ok, I will make it an inline then.
> 
> Thanks,
> Shameer
>  
> > > +}
> > > +
> > >  static void create_its(VirtMachineState *vms, DeviceState *gicdev)
> > >  {
> > >      const char *itsclass = its_class_name();
> > > @@ -1644,6 +1657,8 @@ static void machvirt_init(MachineState *machine)
> > >
> > >      create_platform_bus(vms, pic);
> > >
> > > +    vms->acpi = create_virt_acpi(vms);
> > > +
> > >      vms->bootinfo.ram_size = machine->ram_size;
> > >      vms->bootinfo.kernel_filename = machine->kernel_filename;
> > >      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
> > > @@ -1828,11 +1843,19 @@ static void  
> > virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,  
> > >  static void virt_memory_plug(HotplugHandler *hotplug_dev,
> > >                               DeviceState *dev, Error **errp)
> > >  {
> > > +    HotplugHandlerClass *hhc;
> > >      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> > >      Error *local_err = NULL;
> > >
> > >      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> > > +    if (local_err) {
> > > +        goto out;
> > > +    }
> > > +
> > > +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi);
> > > +    hhc->plug(HOTPLUG_HANDLER(vms->acpi), dev, &error_abort);
> > >
> > > +out:
> > >      error_propagate(errp, local_err);
> > >  }
> > >
> > > diff --git a/include/hw/acpi/generic_event_device.h  
> > b/include/hw/acpi/generic_event_device.h  
> > > index f314515..262ca7d 100644
> > > --- a/include/hw/acpi/generic_event_device.h
> > > +++ b/include/hw/acpi/generic_event_device.h
> > > @@ -18,12 +18,17 @@
> > >  #ifndef HW_ACPI_GED_H
> > >  #define HW_ACPI_GED_H
> > >
> > > +#include "hw/acpi/memory_hotplug.h"
> > > +
> > >  #define TYPE_VIRT_ACPI "virt-acpi"
> > >  #define VIRT_ACPI(obj) \
> > >      OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> > >
> > >  typedef struct VirtAcpiState {
> > >      SysBusDevice parent_obj;
> > > +    MemHotplugState memhp_state;
> > > +    hwaddr memhp_base;
> > >  } VirtAcpiState;
> > >
> > > +
> > >  #endif
> > > diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> > > index 507517c..c5e4c96 100644
> > > --- a/include/hw/arm/virt.h
> > > +++ b/include/hw/arm/virt.h
> > > @@ -77,6 +77,7 @@ enum {
> > >      VIRT_GPIO,
> > >      VIRT_SECURE_UART,
> > >      VIRT_SECURE_MEM,
> > > +    VIRT_PCDIMM_ACPI,
> > >      VIRT_LOWMEMMAP_LAST,
> > >  };
> > >
> > > @@ -132,6 +133,7 @@ typedef struct {
> > >      uint32_t iommu_phandle;
> > >      int psci_conduit;
> > >      hwaddr highest_gpa;
> > > +    DeviceState *acpi;
> > >  } VirtMachineState;
> > >
> > >  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM :  
> > VIRT_PCIE_ECAM)  
> 


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-devel] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug
@ 2019-04-02  7:22         ` Igor Mammedov
  0 siblings, 0 replies; 95+ messages in thread
From: Igor Mammedov @ 2019-04-02  7:22 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi
  Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org, eric.auger@redhat.com,
	peter.maydell@linaro.org, shannon.zhaosl@gmail.com,
	sameo@linux.intel.com, sebastien.boeuf@intel.com, Linuxarm,
	xuwei (O)

On Mon, 1 Apr 2019 16:24:40 +0000
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:

> > -----Original Message-----
> > From: Igor Mammedov [mailto:imammedo@redhat.com]
> > Sent: 01 April 2019 14:34
> > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> > Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org;
> > eric.auger@redhat.com; peter.maydell@linaro.org;
> > shannon.zhaosl@gmail.com; sameo@linux.intel.com;
> > sebastien.boeuf@intel.com; Linuxarm <linuxarm@huawei.com>; xuwei (O)
> > <xuwei5@huawei.com>
> > Subject: Re: [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device
> > memory cold-plug
> > 
> > On Thu, 21 Mar 2019 10:47:40 +0000
> > Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:
> >   
> > > This adds support to build the aml code so that Guest(ACPI boot)
> > > can see the cold-plugged device memory. Memory cold plug support
> > > with DT boot is not yet enabled.
> > >
> > > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > > ---
> > >  default-configs/arm-softmmu.mak        |  2 ++
> > >  hw/acpi/generic_event_device.c         | 23  
> > +++++++++++++++++++++++  
> > >  hw/arm/virt-acpi-build.c               |  9 +++++++++
> > >  hw/arm/virt.c                          | 23  
> > +++++++++++++++++++++++  
> > >  include/hw/acpi/generic_event_device.h |  5 +++++
> > >  include/hw/arm/virt.h                  |  2 ++
> > >  6 files changed, 64 insertions(+)
> > >
> > > diff --git a/default-configs/arm-softmmu.mak  
> > b/default-configs/arm-softmmu.mak  
> > > index 795cb89..6db444e 100644
> > > --- a/default-configs/arm-softmmu.mak
> > > +++ b/default-configs/arm-softmmu.mak
> > > @@ -162,3 +162,5 @@ CONFIG_LSI_SCSI_PCI=y
> > >
> > >  CONFIG_MEM_DEVICE=y
> > >  CONFIG_DIMM=y
> > > +CONFIG_ACPI_MEMORY_HOTPLUG=y
> > > +CONFIG_ACPI_HW_REDUCED=y
> > > diff --git a/hw/acpi/generic_event_device.c  
> > b/hw/acpi/generic_event_device.c  
> > > index b21a551..0b32fc9 100644
> > > --- a/hw/acpi/generic_event_device.c
> > > +++ b/hw/acpi/generic_event_device.c
> > > @@ -16,13 +16,26 @@
> > >   */
> > >
> > >  #include "qemu/osdep.h"
> > > +#include "qapi/error.h"
> > > +#include "exec/address-spaces.h"
> > >  #include "hw/sysbus.h"
> > >  #include "hw/acpi/acpi.h"
> > >  #include "hw/acpi/generic_event_device.h"
> > > +#include "hw/mem/pc-dimm.h"
> > >
> > >  static void virt_device_plug_cb(HotplugHandler *hotplug_dev,
> > >                                  DeviceState *dev, Error **errp)
> > >  {
> > > +    VirtAcpiState *s = VIRT_ACPI(hotplug_dev);
> > > +
> > > +    if (s->memhp_state.is_enabled &&
> > > +        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> > > +            acpi_memory_plug_cb(hotplug_dev, &s->memhp_state,
> > > +                                dev, errp);
> > > +    } else {
> > > +        error_setg(errp, "virt: device plug request for unsupported  
> > device"  
> > > +                   " type: %s", object_get_typename(OBJECT(dev)));
> > > +    }
> > >  }
> > >
> > >  static void virt_send_ged(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > > @@ -31,9 +44,19 @@ static void virt_send_ged(AcpiDeviceIf *adev,  
> > AcpiEventStatusBits ev)  
> > >
> > >  static void virt_device_realize(DeviceState *dev, Error **errp)
> > >  {
> > > +    VirtAcpiState *s = VIRT_ACPI(dev);
> > > +
> > > +    if (s->memhp_state.is_enabled) {
> > > +        acpi_memory_hotplug_init(get_system_memory(), OBJECT(dev),
> > > +                                 &s->memhp_state,
> > > +                                 s->memhp_base);
> > > +    }
> > >  }
> > >
> > >  static Property virt_acpi_properties[] = {
> > > +    DEFINE_PROP_UINT64("memhp_base", VirtAcpiState, memhp_base, 0),
> > 
> > it's preferred to use '-' in property names  
> 
> Ok.
> 
> > > +    DEFINE_PROP_BOOL("memory-hotplug-support", VirtAcpiState,
> > > +                     memhp_state.is_enabled, true),
> > >      DEFINE_PROP_END_OF_LIST(),
> > >  };
> > >
> > > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > > index bf9c0bc..20d3c83 100644
> > > --- a/hw/arm/virt-acpi-build.c
> > > +++ b/hw/arm/virt-acpi-build.c
> > > @@ -40,6 +40,7 @@
> > >  #include "hw/loader.h"
> > >  #include "hw/hw.h"
> > >  #include "hw/acpi/aml-build.h"
> > > +#include "hw/acpi/memory_hotplug.h"
> > >  #include "hw/pci/pcie_host.h"
> > >  #include "hw/pci/pci.h"
> > >  #include "hw/arm/virt.h"
> > > @@ -49,6 +50,13 @@
> > >  #define ARM_SPI_BASE 32
> > >  #define ACPI_POWER_BUTTON_DEVICE "PWRB"
> > >
> > > +static void acpi_dsdt_add_memory_hotplug(Aml *scope, MachineState *ms)
> > > +{  
> > it's dummy wrapper that never will be reused,
> > I suggest to just inline contents at call site and drop wrapper.  
> 
> Ok. I will move it then.
> 
> >   
> > > +    uint32_t nr_mem = ms->ram_slots;
> > > +
> > > +    build_memory_hotplug_aml(scope, nr_mem, "\\_SB", NULL, AML_SYSTEM_MEMORY);
> > > +}
> > > +
> > >  static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
> > >  {
> > >      uint16_t i;
> > > @@ -740,6 +748,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> > >       * the RTC ACPI device at all when using UEFI.
> > >       */
> > >      scope = aml_scope("\\_SB");
> > > +    acpi_dsdt_add_memory_hotplug(scope, MACHINE(vms));
> > >      acpi_dsdt_add_cpus(scope, vms->smp_cpus);
> > >      acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
> > >                         (irqmap[VIRT_UART] + ARM_SPI_BASE));
> > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > > index d0ff20d..13db0e9 100644
> > > --- a/hw/arm/virt.c
> > > +++ b/hw/arm/virt.c
> > > @@ -133,6 +133,7 @@ static const MemMapEntry base_memmap[] = {
> > >      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
> > >      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
> > >      [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
> > > +    [VIRT_PCDIMM_ACPI] =        { 0x09070000, 0x00010000 },  
> >                                                  ^^^^^^^^^^^
> > where from this magic number comes?  
> 
> I think the only requirement for size is >= MEMORY_HOTPLUG_IO_LEN(24).
> So may be 64K is bit too much, 4K might as well do the job.
> 
> Or is it best to just use MEMORY_HOTPLUG_IO_LEN directly here?
4K is a waste for handling a handful of bytes, so I'd go with
MEMORY_HOTPLUG_IO_LEN unless there is a compelling reason for using
page size granularity.
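
For illustration only, a minimal sketch of what the memmap entry could
look like when sized with MEMORY_HOTPLUG_IO_LEN. The MemMapEntry type,
the enum and the macro below are stand-ins defined locally so that the
snippet compiles on its own (the value 24 and the base address are the
ones quoted in this thread), and region_fits_memhp_regs() is a
hypothetical helper, not an existing QEMU function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the QEMU macro; 24 is the value quoted in this thread. */
#define MEMORY_HOTPLUG_IO_LEN 24

/* Local stand-in for the machine's memmap entry type (assumption). */
typedef struct MemMapEntry {
    uint64_t base;
    uint64_t size;
} MemMapEntry;

/* Only the region discussed here; the real enum has many more entries. */
enum { VIRT_PCDIMM_ACPI, VIRT_NR_REGIONS };

/* Size the region by the register block length instead of 64K or 4K. */
static const MemMapEntry memmap_sketch[VIRT_NR_REGIONS] = {
    [VIRT_PCDIMM_ACPI] = { 0x09070000, MEMORY_HOTPLUG_IO_LEN },
};

/* Hypothetical check: the region must cover the hotplug registers. */
static inline bool region_fits_memhp_regs(int idx)
{
    return memmap_sketch[idx].size >= MEMORY_HOTPLUG_IO_LEN;
}
```

With this sizing, region_fits_memhp_regs(VIRT_PCDIMM_ACPI) holds by
construction, and no address space is reserved beyond the registers.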

>  
> >   
> > >      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
> > >      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
> > >      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> > > @@ -516,6 +517,18 @@ static void fdt_add_pmu_nodes(const VirtMachineState *vms)
> > >      }
> > >  }
> > >
> > > +static DeviceState *create_virt_acpi(VirtMachineState *vms)
> > > +{
> > > +    DeviceState *dev;
> > > +
> > > +    dev = qdev_create(NULL, "virt-acpi");
> > > +    qdev_prop_set_uint64(dev, "memhp_base",
> > > +                         vms->memmap[VIRT_PCDIMM_ACPI].base);
> > > +    qdev_init_nofail(dev);
> > > +
> > > +    return dev;  
> > 
> > Probably no worth a wrapper either, since code is trivial and isn't reused
> > elsewhere.  
> 
> Ok, I will make it an inline then.
> 
> Thanks,
> Shameer
>  
> > > +}
> > > +
> > >  static void create_its(VirtMachineState *vms, DeviceState *gicdev)
> > >  {
> > >      const char *itsclass = its_class_name();
> > > @@ -1644,6 +1657,8 @@ static void machvirt_init(MachineState *machine)
> > >
> > >      create_platform_bus(vms, pic);
> > >
> > > +    vms->acpi = create_virt_acpi(vms);
> > > +
> > >      vms->bootinfo.ram_size = machine->ram_size;
> > >      vms->bootinfo.kernel_filename = machine->kernel_filename;
> > >      vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
> > > @@ -1828,11 +1843,19 @@ static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > >  static void virt_memory_plug(HotplugHandler *hotplug_dev,
> > >                               DeviceState *dev, Error **errp)
> > >  {
> > > +    HotplugHandlerClass *hhc;
> > >      VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> > >      Error *local_err = NULL;
> > >
> > >      pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
> > > +    if (local_err) {
> > > +        goto out;
> > > +    }
> > > +
> > > +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi);
> > > +    hhc->plug(HOTPLUG_HANDLER(vms->acpi), dev, &error_abort);
> > >
> > > +out:
> > >      error_propagate(errp, local_err);
> > >  }
> > >
> > > diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
> > > index f314515..262ca7d 100644
> > > --- a/include/hw/acpi/generic_event_device.h
> > > +++ b/include/hw/acpi/generic_event_device.h
> > > @@ -18,12 +18,17 @@
> > >  #ifndef HW_ACPI_GED_H
> > >  #define HW_ACPI_GED_H
> > >
> > > +#include "hw/acpi/memory_hotplug.h"
> > > +
> > >  #define TYPE_VIRT_ACPI "virt-acpi"
> > >  #define VIRT_ACPI(obj) \
> > >      OBJECT_CHECK(VirtAcpiState, (obj), TYPE_VIRT_ACPI)
> > >
> > >  typedef struct VirtAcpiState {
> > >      SysBusDevice parent_obj;
> > > +    MemHotplugState memhp_state;
> > > +    hwaddr memhp_base;
> > >  } VirtAcpiState;
> > >
> > > +
> > >  #endif
> > > diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> > > index 507517c..c5e4c96 100644
> > > --- a/include/hw/arm/virt.h
> > > +++ b/include/hw/arm/virt.h
> > > @@ -77,6 +77,7 @@ enum {
> > >      VIRT_GPIO,
> > >      VIRT_SECURE_UART,
> > >      VIRT_SECURE_MEM,
> > > +    VIRT_PCDIMM_ACPI,
> > >      VIRT_LOWMEMMAP_LAST,
> > >  };
> > >
> > > @@ -132,6 +133,7 @@ typedef struct {
> > >      uint32_t iommu_phandle;
> > >      int psci_conduit;
> > >      hwaddr highest_gpa;
> > > +    DeviceState *acpi;
> > >  } VirtMachineState;
> > >
> > >  #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
> 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-01 13:07               ` Laszlo Ersek
@ 2019-04-02  7:42                 ` Igor Mammedov
  -1 siblings, 0 replies; 95+ messages in thread
From: Igor Mammedov @ 2019-04-02  7:42 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Shameerali Kolothum Thodi, Linuxarm,
	Auger Eric, shannon.zhaosl@gmail.com, qemu-arm@nongnu.org,
	xuwei (O), sebastien.boeuf@intel.com, Leif Lindholm

On Mon, 1 Apr 2019 15:07:05 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

> On 03/29/19 14:56, Auger Eric wrote:
> > Hi Ard,
> > 
> > On 3/29/19 2:14 PM, Ard Biesheuvel wrote:  
> >> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:  
> >>>
> >>> Hi Shameer,
> >>>
> >>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:  
> >>>>
> >>>>  
> >>>>> -----Original Message-----
> >>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
> >>>>> Sent: 29 March 2019 09:32
> >>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> >>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
> >>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> >>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
> >>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
> >>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
> >>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
> >>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
> >>>>>
> >>>>> Hi Shameer,
> >>>>>
> >>>>> [ + Laszlo, Ard, Leif ]
> >>>>>
> >>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:  
> >>>>>> This is to disable/enable populating DT nodes in case
> >>>>>> any conflict with acpi tables. The default is "off".  
> >>>>> The name of the option sounds misleading to me. Also we don't really
> >>>>> know the scope of the disablement. At the moment this just aims to
> >>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
> >>>>>  
> >>>>>>
> >>>>>> This will be used in subsequent patch where cold plug
> >>>>>> device-memory support is added for DT boot.  
> >>>>> I am concerned about the fact that in dt mode, by default, you won't see
> >>>>> any PCDIMM nodes.  
> >>>>>>
> >>>>>> If DT memory node support is added for cold-plugged device
> >>>>>> memory, those memory will be visible to Guest kernel via
> >>>>>> UEFI GetMemoryMap() and gets treated as early boot memory.  
> >>>>> Don't we have an issue in UEFI then. Normally the SRAT indicates whether
> >>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
> >>>>> info.  
> >>>>
> >>>> Sorry I missed this part. Yes, that will be a more cleaner solution.
> >>>>
> >>>> Also, to be more clear on what happens,
> >>>>
> >>>> Guest ACPI boot with "fdt=on" ,
> >>>>
> >>>> From kernel log,
> >>>>
> >>>> [    0.000000] Early memory node ranges
> >>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
> >>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
> >>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
> >>>>
> >>>>
> >>>> Guest ACPI boot with "fdt=off" ,
> >>>>
> >>>> [    0.000000] Movable zone start for each node
> >>>> [    0.000000] Early memory node ranges
> >>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
> >>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
> >>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
> >>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff
> >>>>
> >>>> The hotpluggable memory node is absent from early memory nodes here.  
> >>>
> >>> OK thank you for the example illustrating the concern.  
> >>>>
> >>>> As you said, it could be possible to detect this node using SRAT in UEFI.  
> >>>
> >>> Let's wait for EDK2 experts on this.
> >>>  
> >>
> >> Happy to chime in, but I need a bit more context here.
> >>
> >> What is the problem, how does this path try to solve it, and why is
> >> that a bad idea?
> >>  
> > Sure, sorry.
> > 
> > This series:
> > - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
> > https://patchwork.kernel.org/cover/10863301/
> > 
> > aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
> > SRAT and DSDT parts and relies on GED to trigger the hotplug.
> > 
> > We noticed that if we build the hotpluggable memory dt nodes on top of
> > the above ACPI tables, the DIMM slots are interpreted as not
> > hotpluggable memory slots (at least we think so).
> > 
> > We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
> > fact that those slots are exposed as hotpluggable in the SRAT for example.
> > 
> > So in this series, we are forced to not generate the hotpluggable memory
> > dt nodes if we want the DIMM slots to be effectively recognized as
> > hotpluggable.
> > 
> > Could you confirm we have a correct understanding of the EDK2 behaviour
> > and if so, would there be any solution for EDK2 to absorb both the DT
> > nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
> > 
> > At qemu level, detecting we are booting in ACPI mode and purposely
> > removing the above mentioned DT nodes does not look straightforward.  
> 
> The firmware is not enlightened about the ACPI content that comes from
> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
> as instructed through the ACPI linker/loader script, in order to install
> the ACPI content for the OS. No actual information is consumed by the
> firmware from the ACPI payload -- and that's a feature.
> 
> The firmware does consume DT:
> 
> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
> the firmware (for its own information needs), and passed on to the OS.
> 
> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
> consumed only by the firmware (for its own information needs), and the
> DT is hidden from the OS. The OS gets only the ACPI content
> (processed/prepared as described above).
> 
> 
> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
> base/size pairs in all the memory nodes in the DT. For each such base
> address that is currently tracked as "nonexistent" in the GCD memory
> space map, the driver currently adds the base/size range as "system
> memory". This in turn is reflected by the UEFI memmap that the OS gets
> to see as "conventional memory".
> 
> If you need some memory ranges to show up as "special" in the UEFI
> memmap, then you need to distinguish them somehow from the "regular"
> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
> firmware, so that it act upon the discriminator that you set in the DT.
> 
> 
> Now... from a brief look at the Platform Init and UEFI specs, my
> impression is that the hotpluggable (but presently not plugged) DIMM
> ranges should simply be *absent* from the UEFI memmap; is that correct?
> (I didn't check the ACPI spec, maybe it specifies the expected behavior
> in full.) If my impression is correct, then two options (alternatives)
> exist:
> 
> (1) Hide the affected memory nodes -- or at least the affected base/size
> pairs -- from the DT, in case you boot without "-no-acpi" but with an
> external firmware loaded. Then the firmware will not expose those ranges
> as "conventional memory" in the UEFI memmap. This approach requires no
> changes to edk2.
> 
> This option is precisely what Eric described up-thread, at
> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
> 
> > in machvirt_init, there is firmware_loaded that tells you whether you
> > have a FW image. If this one is not set, you can induce dt. But if
> > there is a FW it can be either DT or ACPI booted. You also have the
> > acpi_enabled knob.  
> 
> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
> "vl.c").
> 
> So, the condition for hiding the hotpluggable memory nodes in question
> from the DT is:
> 
>   (aarch64 && firmware_loaded && acpi_enabled)
I'd go with this one, though I have a question for the firmware side.
Let's assume that in the future we want to expose hotpluggable & present
memory via GetMemoryMap(), like bare metal does. The guest OS can
theoretically avoid using it for the Normal zone, based on the hint from
the SRAT table early at boot. But what about the firmware: can it inspect
the SRAT table and avoid taking hotpluggable ranges for its own use (or
at least not cannibalize them permanently)?
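
As a rough sketch of option (1), the condition above could be wrapped in
a small predicate. The helper name and its plain bool parameters are
hypothetical; in QEMU the inputs would come from the machine state
(firmware_loaded, acpi_enabled), which is not modelled here.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not existing QEMU code): decide whether the
 * hotpluggable memory nodes should be left out of the DT.
 *
 * They are hidden only when all three hold:
 *   - the guest is aarch64,
 *   - an external firmware image is loaded,
 *   - ACPI is enabled (i.e. "-no-acpi" was not given).
 */
static inline bool hide_hotplug_mem_nodes_from_dt(bool aarch64,
                                                  bool firmware_loaded,
                                                  bool acpi_enabled)
{
    return aarch64 && firmware_loaded && acpi_enabled;
}
```

With "-no-acpi", or without a firmware image, the predicate is false and
the DT keeps the memory nodes, so a DT boot still sees them.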
 

> 
> (2) Invent and set an "ignore me, firmware" property for the
> hotpluggable memory nodes in the DT, and update the firmware to honor
> that property.
> 
> Thanks
> Laszlo


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-03-29 13:56           ` [Qemu-arm] [Qemu-devel] " Auger Eric
@ 2019-04-02  8:39               ` Peter Maydell
  2019-04-01 13:07               ` Laszlo Ersek
  2019-04-02  8:39               ` Peter Maydell
  2 siblings, 0 replies; 95+ messages in thread
From: Peter Maydell @ 2019-04-02  8:39 UTC (permalink / raw)
  To: Auger Eric
  Cc: sameo@linux.intel.com, Ard Biesheuvel, qemu-devel@nongnu.org,
	Shameerali Kolothum Thodi, Linuxarm, shannon.zhaosl@gmail.com,
	qemu-arm@nongnu.org, xuwei (O), imammedo@redhat.com,
	sebastien.boeuf@intel.com, Laszlo Ersek, Leif Lindholm

On Fri, 29 Mar 2019 at 20:56, Auger Eric <eric.auger@redhat.com> wrote:
> This series:
> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
> https://patchwork.kernel.org/cover/10863301/
>
> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
> SRAT and DSDT parts and relies on GED to trigger the hotplug.
>
> We noticed that if we build the hotpluggable memory dt nodes on top of
> the above ACPI tables, the DIMM slots are interpreted as not
> hotpluggable memory slots (at least we think so).
>
> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
> fact that those slots are exposed as hotpluggable in the SRAT for example.
>
> So in this series, we are forced to not generate the hotpluggable memory
> dt nodes if we want the DIMM slots to be effectively recognized as
> hotpluggable.
>
> Could you confirm we have a correct understanding of the EDK2 behaviour
> and if so, would there be any solution for EDK2 to absorb both the DT
> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
>
> At qemu level, detecting we are booting in ACPI mode and purposely
> removing the above mentioned DT nodes does not look straightforward.

My initial response would be to say that hotpluggable memory
should be suitably marked up in both the DTB and the ACPI tables
so that guest software that cares can tell that it is hotplugged
whether it is choosing to consume the DT or the ACPI tables.
QEMU should definitely not be reporting the hardware as looking
different to the guest based on some guess about what guest
software it is booting.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-02  7:42                 ` Igor Mammedov
@ 2019-04-02 10:33                   ` Laszlo Ersek
  -1 siblings, 0 replies; 95+ messages in thread
From: Laszlo Ersek @ 2019-04-02 10:33 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Shameerali Kolothum Thodi, Linuxarm,
	Auger Eric, shannon.zhaosl@gmail.com, qemu-arm@nongnu.org,
	xuwei (O), sebastien.boeuf@intel.com, Leif Lindholm

On 04/02/19 09:42, Igor Mammedov wrote:
> On Mon, 1 Apr 2019 15:07:05 +0200
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
>> On 03/29/19 14:56, Auger Eric wrote:
>>> Hi Ard,
>>>
>>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:  
>>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:  
>>>>>
>>>>> Hi Shameer,
>>>>>
>>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:  
>>>>>>
>>>>>>  
>>>>>>> -----Original Message-----
>>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>>>>> Sent: 29 March 2019 09:32
>>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>>>>>>
>>>>>>> Hi Shameer,
>>>>>>>
>>>>>>> [ + Laszlo, Ard, Leif ]
>>>>>>>
>>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:  
>>>>>>>> This is to disable/enable populating DT nodes in case of
>>>>>>>> any conflict with ACPI tables. The default is "off".
>>>>>>> The name of the option sounds misleading to me. Also we don't really
>>>>>>> know the scope of the disablement. At the moment this just aims to
>>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
>>>>>>>  
>>>>>>>>
>>>>>>>> This will be used in subsequent patch where cold plug
>>>>>>>> device-memory support is added for DT boot.  
>>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
>>>>>>> any PCDIMM nodes.  
>>>>>>>>
>>>>>>>> If DT memory node support is added for cold-plugged device
>>>>>>>> memory, that memory will be visible to the guest kernel via
>>>>>>>> UEFI GetMemoryMap() and will be treated as early boot memory.
>>>>>>> Don't we have an issue in UEFI then? Normally the SRAT indicates whether
>>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>>>>>>> info?
>>>>>>
>>>>>> Sorry, I missed this part. Yes, that would be a cleaner solution.
>>>>>>
>>>>>> Also, to be more clear on what happens,
>>>>>>
>>>>>> Guest ACPI boot with "fdt=on" ,
>>>>>>
>>>>>> From kernel log,
>>>>>>
>>>>>> [    0.000000] Early memory node ranges
>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
>>>>>>
>>>>>>
>>>>>> Guest ACPI boot with "fdt=off" ,
>>>>>>
>>>>>> [    0.000000] Movable zone start for each node
>>>>>> [    0.000000] Early memory node ranges
>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>>
>>>>>> The hotpluggable memory node is absent from early memory nodes here.  
>>>>>
>>>>> OK thank you for the example illustrating the concern.  
>>>>>>
>>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.  
>>>>>
>>>>> Let's wait for EDK2 experts on this.
>>>>>  
>>>>
>>>> Happy to chime in, but I need a bit more context here.
>>>>
>>>> What is the problem, how does this patch try to solve it, and why is
>>>> that a bad idea?
>>>>  
>>> Sure, sorry.
>>>
>>> This series:
>>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
>>> https://patchwork.kernel.org/cover/10863301/
>>>
>>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
>>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
>>>
>>> We noticed that if we build the hotpluggable memory dt nodes on top of
>>> the above ACPI tables, the DIMM slots are interpreted as not
>>> hotpluggable memory slots (at least we think so).
>>>
>>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
>>> fact that those slots are exposed as hotpluggable in the SRAT for example.
>>>
>>> So in this series, we are forced to not generate the hotpluggable memory
>>> dt nodes if we want the DIMM slots to be effectively recognized as
>>> hotpluggable.
>>>
>>> Could you confirm we have a correct understanding of the EDK2 behaviour?
>>> If so, would there be any solution for EDK2 to absorb both the DT
>>> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable?
>>>
>>> At qemu level, detecting we are booting in ACPI mode and purposely
>>> removing the above mentioned DT nodes does not look straightforward.  
>>
>> The firmware is not enlightened about the ACPI content that comes from
>> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
>> as instructed through the ACPI linker/loader script, in order to install
>> the ACPI content for the OS. No actual information is consumed by the
>> firmware from the ACPI payload -- and that's a feature.
>>
>> The firmware does consume DT:
>>
>> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
>> the firmware (for its own information needs), and passed on to the OS.
>>
>> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
>> consumed only by the firmware (for its own information needs), and the
>> DT is hidden from the OS. The OS gets only the ACPI content
>> (processed/prepared as described above).
>>
>>
>> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
>> base/size pairs in all the memory nodes in the DT. For each such base
>> address that is currently tracked as "nonexistent" in the GCD memory
>> space map, the driver currently adds the base/size range as "system
>> memory". This in turn is reflected by the UEFI memmap that the OS gets
>> to see as "conventional memory".
>>
>> If you need some memory ranges to show up as "special" in the UEFI
>> memmap, then you need to distinguish them somehow from the "regular"
>> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
>> firmware, so that it acts upon the discriminator that you set in the DT.
>>
>>
>> Now... from a brief look at the Platform Init and UEFI specs, my
>> impression is that the hotpluggable (but presently not plugged) DIMM
>> ranges should simply be *absent* from the UEFI memmap; is that correct?
>> (I didn't check the ACPI spec, maybe it specifies the expected behavior
>> in full.) If my impression is correct, then two options (alternatives)
>> exist:
>>
>> (1) Hide the affected memory nodes -- or at least the affected base/size
>> pairs -- from the DT, in case you boot without "-no-acpi" but with an
>> external firmware loaded. Then the firmware will not expose those ranges
>> as "conventional memory" in the UEFI memmap. This approach requires no
>> changes to edk2.
>>
>> This option is precisely what Eric described up-thread, at
>> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
>>
>>> in machvirt_init, there is firmware_loaded that tells you whether you
>>> have a FW image. If this one is not set, you can induce dt. But if
>>> there is a FW it can be either DT or ACPI booted. You also have the
>>> acpi_enabled knob.  
>>
>> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
>> "vl.c").
>>
>> So, the condition for hiding the hotpluggable memory nodes in question
>> from the DT is:
>>
>>   (aarch64 && firmware_loaded && acpi_enabled)
> I'd go with this one, though I have a question for the firmware side.
> Let's assume that in the future we want to expose hotpluggable & present
> memory via GetMemoryMap() (like bare metal does); the guest OS can
> theoretically avoid using it for the Normal zone based on a hint from
> the SRAT table early at boot. But what about the firmware: can it
> inspect the SRAT table and not use hotpluggable ranges for its own
> purposes (or at least not cannibalize them permanently)?

This is actually two questions:

(a) Can the firmware inspect SRAT?

If the SRAT table structure isn't very complex, this is technically
doable, but the wrong thing to do, IMO.

First, we've tried hard to avoid enlightening the firmware about the
semantics of QEMU's ACPI tables.

Second, this would introduce an ordering constraint (or callbacks) in
the firmware, between the driver that processes & installs the ACPI
tables, and the driver that translates the memory nodes of the DT to the
memory ranges known to UEFI and the OS.

If we need such hinting, then option (2) below (from earlier context)
would be better:
- If it's OK to use an arm/aarch64 specific solution, then new DT
properties should work.
- If it should be arch-independent, then a dedicated fw_cfg file would
be better.
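
To make the DT-property variant concrete, here is a minimal sketch of
the filtering step. It is a toy model, not edk2 code: "struct mem_node",
the "hotpluggable" flag and select_boot_memory() are hypothetical names
standing in for whatever discriminator QEMU would set in the DT and
whatever loop the firmware driver would use to act on it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a DT memory node; "hotpluggable" is a hypothetical hint. */
struct mem_node {
    uint64_t base;
    uint64_t size;
    bool     hotpluggable;
};

/*
 * Copy the nodes that the firmware should add as system memory into out[],
 * skipping those that carry the hint. out[] must have room for n entries.
 * Returns the number of entries written.
 */
static size_t select_boot_memory(const struct mem_node *in, size_t n,
                                 struct mem_node *out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (in[i].hotpluggable) {
            continue;   /* keep this range out of the UEFI memmap */
        }
        out[kept++] = in[i];
    }
    return kept;
}
```

With the two ranges from the kernel log quoted above (base 0x40000000
and the device-memory range at 0xc0000000), only the first would be
kept.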

(b) Assuming we have the information from some source, can the firmware
expose some memory ranges as "usable RAM" to the OS, while staying away
from them for its own (firmware) purposes?

After consulting

  Table 25. Memory Type Usage before ExitBootServices()
  Table 26. Memory Type Usage after ExitBootServices()

in UEFI-2.7, I would say that the firmware driver that installs these
ranges to the memory (space) map should also allocate the ranges right
after, as EfiBootServicesData. This will prevent other drivers /
applications in the firmware from allocating chunks out of those areas,
and the OS will be at liberty to release and repurpose the ranges after
ExitBootServices().
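
As a rough illustration of why the early allocation helps, the sketch
below models a memory map with three memory types. It is a toy model
under stated assumptions, not firmware code: the types, "struct range"
and both helper functions are hypothetical and only mirror the rules
from the two tables named above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Only the memory types needed for the argument are modelled. */
enum mem_type { CONVENTIONAL, BOOT_SERVICES_DATA, RUNTIME_SERVICES_DATA };

struct range {
    uint64_t base;
    uint64_t size;
    enum mem_type type;
};

/*
 * Firmware-side view: a later driver or application can only be given
 * memory that is still CONVENTIONAL. Stores the base of the first such
 * range and returns true, or returns false if none is left.
 */
static bool fw_first_allocatable(const struct range *map, size_t n,
                                 uint64_t *base)
{
    for (size_t i = 0; i < n; i++) {
        if (map[i].type == CONVENTIONAL) {
            *base = map[i].base;
            return true;
        }
    }
    return false;
}

/*
 * OS-side view after ExitBootServices(): conventional memory and boot
 * services data are both available for general use; runtime services
 * data must be preserved and is therefore not counted.
 */
static uint64_t os_usable_bytes(const struct range *map, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++) {
        if (map[i].type == CONVENTIONAL ||
            map[i].type == BOOT_SERVICES_DATA) {
            total += map[i].size;
        }
    }
    return total;
}
```

In this model, a range that was marked BOOT_SERVICES_DATA right after
being installed is never returned by fw_first_allocatable(), yet it is
still counted by os_usable_bytes().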

Thanks,
Laszlo

>> (2) Invent and set an "ignore me, firmware" property for the
>> hotpluggable memory nodes in the DT, and update the firmware to honor
>> that property.
>>
>> Thanks
>> Laszlo
> 



* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-01 13:07               ` Laszlo Ersek
@ 2019-04-02 14:26                 ` Shameerali Kolothum Thodi
  -1 siblings, 0 replies; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-04-02 14:26 UTC (permalink / raw)
  To: Laszlo Ersek, Auger Eric, Ard Biesheuvel
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Linuxarm,
	qemu-devel@nongnu.org, shannon.zhaosl@gmail.com,
	qemu-arm@nongnu.org, xuwei (O), imammedo@redhat.com,
	sebastien.boeuf@intel.com, Leif Lindholm

Hi Laszlo,

> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: 01 April 2019 14:07
> To: Auger Eric <eric.auger@redhat.com>; Ard Biesheuvel
> <ard.biesheuvel@linaro.org>
> Cc: peter.maydell@linaro.org; sameo@linux.intel.com;
> qemu-devel@nongnu.org; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>; Linuxarm
> <linuxarm@huawei.com>; shannon.zhaosl@gmail.com;
> qemu-arm@nongnu.org; xuwei (O) <xuwei5@huawei.com>;
> imammedo@redhat.com; sebastien.boeuf@intel.com; Leif Lindholm
> <Leif.Lindholm@arm.com>
> Subject: Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in
> feature "fdt"
> 
> On 03/29/19 14:56, Auger Eric wrote:

[...]

> > Sure, sorry.
> >
> > This series:
> > - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
> > https://patchwork.kernel.org/cover/10863301/
> >
> > aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
> > SRAT and DSDT parts and relies on GED to trigger the hotplug.
> >
> > We noticed that if we build the hotpluggable memory dt nodes on top of
> > the above ACPI tables, the DIMM slots are interpreted as not
> > hotpluggable memory slots (at least we think so).
> >
> > We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
> > fact that those slots are exposed as hotpluggable in the SRAT for example.
> >
> > So in this series, we are forced to not generate the hotpluggable memory
> > dt nodes if we want the DIMM slots to be effectively recognized as
> > hotpluggable.
> >
> > Could you confirm we have a correct understanding of the EDK2 behaviour?
> > If so, would there be any solution for EDK2 to absorb both the DT
> > nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable?
> >
> > At qemu level, detecting we are booting in ACPI mode and purposely
> > removing the above mentioned DT nodes does not look straightforward.
> 
> The firmware is not enlightened about the ACPI content that comes from
> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
> as instructed through the ACPI linker/loader script, in order to install
> the ACPI content for the OS. No actual information is consumed by the
> firmware from the ACPI payload -- and that's a feature.
> 
> The firmware does consume DT:
> 
> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
> the firmware (for its own information needs), and passed on to the OS.
> 
> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
> consumed only by the firmware (for its own information needs), and the
> DT is hidden from the OS. The OS gets only the ACPI content
> (processed/prepared as described above).
> 
> 
> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
> base/size pairs in all the memory nodes in the DT. For each such base
> address that is currently tracked as "nonexistent" in the GCD memory
> space map, the driver currently adds the base/size range as "system
> memory". This in turn is reflected by the UEFI memmap that the OS gets
> to see as "conventional memory".
> 
> If you need some memory ranges to show up as "special" in the UEFI
> memmap, then you need to distinguish them somehow from the "regular"
> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
> firmware, so that it acts upon the discriminator that you set in the DT.
> 
> 
> Now... from a brief look at the Platform Init and UEFI specs, my
> impression is that the hotpluggable (but presently not plugged) DIMM
> ranges should simply be *absent* from the UEFI memmap; is that correct?
> (I didn't check the ACPI spec, maybe it specifies the expected behavior
> in full.) If my impression is correct, then two options (alternatives)
> exist:
> 
> (1) Hide the affected memory nodes -- or at least the affected base/size
> pairs -- from the DT, in case you boot without "-no-acpi" but with an
> external firmware loaded. Then the firmware will not expose those ranges
> as "conventional memory" in the UEFI memmap. This approach requires no
> changes to edk2.
> 
> This option is precisely what Eric described up-thread, at
> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
> 
> > in machvirt_init, there is firmware_loaded that tells you whether you
> > have a FW image. If this one is not set, you can induce dt. But if
> > there is a FW it can be either DT or ACPI booted. You also have the
> > acpi_enabled knob.
> 
> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
> "vl.c").
> 
> So, the condition for hiding the hotpluggable memory nodes in question
> from the DT is:
> 
>   (aarch64 && firmware_loaded && acpi_enabled)

Thanks for your explanation and suggestions. I had a quick run with the above
and it seems to do the job. I will drop this extra opt-in feature patch from this
series and instead have this check.
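
For illustration, the check could be expressed roughly as below. This
is a minimal sketch and not the eventual patch: the helper name and its
parameters are hypothetical, and only the three conditions themselves
come from the suggestion quoted above.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not taken from the QEMU tree: returns true when
 * the hotpluggable (device-memory) nodes should be left out of the DT,
 * i.e. when an aarch64 guest boots through a firmware image with ACPI
 * enabled, so that the OS only sees the ACPI description.
 */
static bool hide_hotplug_mem_from_dt(bool is_aarch64, bool firmware_loaded,
                                     bool acpi_enabled)
{
    return is_aarch64 && firmware_loaded && acpi_enabled;
}
```

With "-no-acpi", or without a firmware image, the helper returns false
and the DT memory nodes would still be generated.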

Thanks,
Shameer

> 
> 
> (2) Invent and set an "ignore me, firmware" property for the
> hotpluggable memory nodes in the DT, and update the firmware to honor
> that property.
> 
> Thanks
> Laszlo


* Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
@ 2019-04-02 14:26                 ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-04-02 14:26 UTC (permalink / raw)
  To: Laszlo Ersek, Auger Eric, Ard Biesheuvel
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com,
	qemu-devel@nongnu.org, Linuxarm, shannon.zhaosl@gmail.com,
	qemu-arm@nongnu.org, xuwei (O), imammedo@redhat.com,
	sebastien.boeuf@intel.com, Leif Lindholm

Hi Laszlo,

> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: 01 April 2019 14:07
> To: Auger Eric <eric.auger@redhat.com>; Ard Biesheuvel
> <ard.biesheuvel@linaro.org>
> Cc: peter.maydell@linaro.org; sameo@linux.intel.com;
> qemu-devel@nongnu.org; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>; Linuxarm
> <linuxarm@huawei.com>; shannon.zhaosl@gmail.com;
> qemu-arm@nongnu.org; xuwei (O) <xuwei5@huawei.com>;
> imammedo@redhat.com; sebastien.boeuf@intel.com; Leif Lindholm
> <Leif.Lindholm@arm.com>
> Subject: Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in
> feature "fdt"
> 
> On 03/29/19 14:56, Auger Eric wrote:

[...]

> > Sure, sorry.
> >
> > This series:
> > - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
> > https://patchwork.kernel.org/cover/10863301/
> >
> > aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
> > SRAT and DSDT parts and relies on GED to trigger the hotplug.
> >
> > We noticed that if we build the hotpluggable memory dt nodes on top of
> > the above ACPI tables, the DIMM slots are interpreted as not
> > hotpluggable memory slots (at least we think so).
> >
> > We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
> > fact that those slots are exposed as hotpluggable in the SRAT for example.
> >
> > So in this series, we are forced to not generate the hotpluggable memory
> > dt nodes if we want the DIMM slots to be effectively recognized as
> > hotpluggable.
> >
> > Could you confirm we have a correct understanding of the EDK2 behaviour
> > and if so, would there be any solution for EDK2 to absorb both the DT
> > nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
> >
> > At qemu level, detecting we are booting in ACPI mode and purposely
> > removing the above mentioned DT nodes does not look straightforward.
> 
> The firmware is not enlightened about the ACPI content that comes from
> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
> as instructed through the ACPI linker/loader script, in order to install
> the ACPI content for the OS. No actual information is consumed by the
> firmware from the ACPI payload -- and that's a feature.
> 
> The firmware does consume DT:
> 
> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
> the firmware (for its own information needs), and passed on to the OS.
> 
> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
> consumed only by the firmware (for its own information needs), and the
> DT is hidden from the OS. The OS gets only the ACPI content
> (processed/prepared as described above).
> 
> 
> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
> base/size pairs in all the memory nodes in the DT. For each such base
> address that is currently tracked as "nonexistent" in the GCD memory
> space map, the driver currently adds the base/size range as "system
> memory". This in turn is reflected by the UEFI memmap that the OS gets
> to see as "conventional memory".
> 
> If you need some memory ranges to show up as "special" in the UEFI
> memmap, then you need to distinguish them somehow from the "regular"
> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
> firmware, so that it act upon the discriminator that you set in the DT.
> 
> 
> Now... from a brief look at the Platform Init and UEFI specs, my
> impression is that the hotpluggable (but presently not plugged) DIMM
> ranges should simply be *absent* from the UEFI memmap; is that correct?
> (I didn't check the ACPI spec, maybe it specifies the expected behavior
> in full.) If my impression is correct, then two options (alternatives)
> exist:
> 
> (1) Hide the affected memory nodes -- or at least the affected base/size
> pairs -- from the DT, in case you boot without "-no-acpi" but with an
> external firmware loaded. Then the firmware will not expose those ranges
> as "conventional memory" in the UEFI memmap. This approach requires no
> changes to edk2.
> 
> This option is precisely what Eric described up-thread, at
> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
> 
> > in machvirt_init, there is firmware_loaded that tells you whether you
> > have a FW image. If this one is not set, you can induce dt. But if
> > there is a FW it can be either DT or ACPI booted. You also have the
> > acpi_enabled knob.
> 
> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
> "vl.c").
> 
> So, the condition for hiding the hotpluggable memory nodes in question
> from the DT is:
> 
>   (aarch64 && firmware_loaded && acpi_enabled)

Thanks for your explanation and suggestions. I had a quick run with the above
and it seems to do the job. I will drop this extra opt-in feature patch from this
series and instead have this check.
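
As a rough illustration of that check, here is a minimal C sketch. The
struct and the helper name hide_hotplug_mem_nodes() are hypothetical and
exist only for this example; in QEMU the three inputs would come from the
virt machine state rather than from a standalone struct.

```c
#include <stdbool.h>

/*
 * Hypothetical container for the three inputs named in the thread.
 * Not a QEMU type; it only groups the flags for this sketch.
 */
struct boot_inputs {
    bool aarch64;          /* guest is 64-bit */
    bool firmware_loaded;  /* an external firmware image is loaded */
    bool acpi_enabled;     /* cleared by the "-no-acpi" option */
};

/*
 * Returns true when the hotpluggable memory nodes should be left out
 * of the DT, i.e. when all three conditions from the thread hold.
 */
static bool hide_hotplug_mem_nodes(const struct boot_inputs *in)
{
    return in->aarch64 && in->firmware_loaded && in->acpi_enabled;
}
```

With this shape, the DT builder would skip the hotpluggable memory nodes
only when the helper returns true, and would keep generating them in every
other configuration.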

Thanks,
Shameer

> 
> 
> (2) Invent and set an "ignore me, firmware" property for the
> hotpluggable memory nodes in the DT, and update the firmware to honor
> that property.
> 
> Thanks
> Laszlo

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-01 13:07               ` Laszlo Ersek
@ 2019-04-02 15:29                 ` Auger Eric
  -1 siblings, 0 replies; 95+ messages in thread
From: Auger Eric @ 2019-04-02 15:29 UTC (permalink / raw)
  To: Laszlo Ersek, Ard Biesheuvel
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Linuxarm,
	Shameerali Kolothum Thodi, qemu-devel@nongnu.org,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	imammedo@redhat.com, sebastien.boeuf@intel.com, Leif Lindholm

Hi Laszlo,

On 4/1/19 3:07 PM, Laszlo Ersek wrote:
> On 03/29/19 14:56, Auger Eric wrote:
>> Hi Ard,
>>
>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:
>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:
>>>>
>>>> Hi Shameer,
>>>>
>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>>>> Sent: 29 March 2019 09:32
>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>>>>>
>>>>>> Hi Shameer,
>>>>>>
>>>>>> [ + Laszlo, Ard, Leif ]
>>>>>>
>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
>>>>>>> This is to disable/enable populating DT nodes in case
>>>>>>> any conflict with acpi tables. The default is "off".
>>>>>> The name of the option sounds misleading to me. Also we don't really
>>>>>> know the scope of the disablement. At the moment this just aims to
>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
>>>>>>
>>>>>>>
>>>>>>> This will be used in subsequent patch where cold plug
>>>>>>> device-memory support is added for DT boot.
>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
>>>>>> any PCDIMM nodes.
>>>>>>>
>>>>>>> If DT memory node support is added for cold-plugged device
>>>>>>> memory, those memory will be visible to Guest kernel via
>>>>>>> UEFI GetMemoryMap() and gets treated as early boot memory.
>>>>>> Don't we have an issue in UEFI then. Normally the SRAT indicates whether
>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>>>>>> info.
>>>>>
>>>>> Sorry I missed this part. Yes, that will be a more cleaner solution.
>>>>>
>>>>> Also, to be more clear on what happens,
>>>>>
>>>>> Guest ACPI boot with "fdt=on" ,
>>>>>
>>>>> From kernel log,
>>>>>
>>>>> [    0.000000] Early memory node ranges
>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
>>>>>
>>>>>
>>>>> Guest ACPI boot with "fdt=off" ,
>>>>>
>>>>> [    0.000000] Movable zone start for each node
>>>>> [    0.000000] Early memory node ranges
>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff
>>>>>
>>>>> The hotpluggable memory node is absent from early memory nodes here.
>>>>
>>>> OK thank you for the example illustrating the concern.
>>>>>
>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.
>>>>
>>>> Let's wait for EDK2 experts on this.
>>>>
>>>
>>> Happy to chime in, but I need a bit more context here.
>>>
>>> What is the problem, how does this path try to solve it, and why is
>>> that a bad idea?
>>>
>> Sure, sorry.
>>
>> This series:
>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
>> https://patchwork.kernel.org/cover/10863301/
>>
>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
>>
>> We noticed that if we build the hotpluggable memory dt nodes on top of
>> the above ACPI tables, the DIMM slots are interpreted as not
>> hotpluggable memory slots (at least we think so).
>>
>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
>> fact that those slots are exposed as hotpluggable in the SRAT for example.
>>
>> So in this series, we are forced to not generate the hotpluggable memory
>> dt nodes if we want the DIMM slots to be effectively recognized as
>> hotpluggable.
>>
>> Could you confirm we have a correct understanding of the EDK2 behaviour
>> and if so, would there be any solution for EDK2 to absorb both the DT
>> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
>>
>> At qemu level, detecting we are booting in ACPI mode and purposely
>> removing the above mentioned DT nodes does not look straightforward.
> 
> The firmware is not enlightened about the ACPI content that comes from
> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
> as instructed through the ACPI linker/loader script, in order to install
> the ACPI content for the OS. No actual information is consumed by the
> firmware from the ACPI payload -- and that's a feature.
> 
> The firmware does consume DT:
> 
> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
> the firmware (for its own information needs), and passed on to the OS.
> 
> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
> consumed only by the firmware (for its own information needs), and the
> DT is hidden from the OS. The OS gets only the ACPI content
> (processed/prepared as described above).
> 
> 
> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
> base/size pairs in all the memory nodes in the DT. For each such base
> address that is currently tracked as "nonexistent" in the GCD memory
> space map, the driver currently adds the base/size range as "system
> memory". This in turn is reflected by the UEFI memmap that the OS gets
> to see as "conventional memory".
> 
> If you need some memory ranges to show up as "special" in the UEFI
> memmap, then you need to distinguish them somehow from the "regular"
> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
> firmware, so that it act upon the discriminator that you set in the DT.
> 
> 
> Now... from a brief look at the Platform Init and UEFI specs, my
> impression is that the hotpluggable (but presently not plugged) DIMM
> ranges should simply be *absent* from the UEFI memmap; is that correct?
> (I didn't check the ACPI spec, maybe it specifies the expected behavior
> in full.) If my impression is correct, then two options (alternatives)
> exist:
> 
> (1) Hide the affected memory nodes -- or at least the affected base/size
> pairs -- from the DT, in case you boot without "-no-acpi" but with an
> external firmware loaded. Then the firmware will not expose those ranges
> as "conventional memory" in the UEFI memmap. This approach requires no
> changes to edk2.
> 
> This option is precisely what Eric described up-thread, at
> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
> 
>> in machvirt_init, there is firmware_loaded that tells you whether you
>> have a FW image. If this one is not set, you can induce dt. But if
>> there is a FW it can be either DT or ACPI booted. You also have the
>> acpi_enabled knob.
> 
> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
> "vl.c").
> 
> So, the condition for hiding the hotpluggable memory nodes in question
> from the DT is:

> 
>   (aarch64 && firmware_loaded && acpi_enabled)

Thanks a lot for all those inputs!

I don't get why we test aarch64 in the above condition. This was useful
for the high ECAM range, as the aarch32 FW did not support it, but is it
still meaningful here?

Thanks

Eric

> 
> 
> (2) Invent and set an "ignore me, firmware" property for the
> hotpluggable memory nodes in the DT, and update the firmware to honor
> that property.
> 
> Thanks
> Laszlo
> 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-02 15:29                 ` Auger Eric
@ 2019-04-02 15:38                   ` Laszlo Ersek
  -1 siblings, 0 replies; 95+ messages in thread
From: Laszlo Ersek @ 2019-04-02 15:38 UTC (permalink / raw)
  To: Auger Eric, Ard Biesheuvel
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Linuxarm,
	Shameerali Kolothum Thodi, qemu-devel@nongnu.org,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	imammedo@redhat.com, sebastien.boeuf@intel.com, Leif Lindholm

On 04/02/19 17:29, Auger Eric wrote:
> Hi Laszlo,
> 
> On 4/1/19 3:07 PM, Laszlo Ersek wrote:
>> On 03/29/19 14:56, Auger Eric wrote:
>>> Hi Ard,
>>>
>>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:
>>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:
>>>>>
>>>>> Hi Shameer,
>>>>>
>>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>>>>> Sent: 29 March 2019 09:32
>>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>>>>>>
>>>>>>> Hi Shameer,
>>>>>>>
>>>>>>> [ + Laszlo, Ard, Leif ]
>>>>>>>
>>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
>>>>>>>> This is to disable/enable populating DT nodes in case
>>>>>>>> any conflict with acpi tables. The default is "off".
>>>>>>> The name of the option sounds misleading to me. Also we don't really
>>>>>>> know the scope of the disablement. At the moment this just aims to
>>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
>>>>>>>
>>>>>>>>
>>>>>>>> This will be used in subsequent patch where cold plug
>>>>>>>> device-memory support is added for DT boot.
>>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
>>>>>>> any PCDIMM nodes.
>>>>>>>>
>>>>>>>> If DT memory node support is added for cold-plugged device
>>>>>>>> memory, those memory will be visible to Guest kernel via
>>>>>>>> UEFI GetMemoryMap() and gets treated as early boot memory.
>>>>>>> Don't we have an issue in UEFI then. Normally the SRAT indicates whether
>>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>>>>>>> info.
>>>>>>
>>>>>> Sorry I missed this part. Yes, that will be a more cleaner solution.
>>>>>>
>>>>>> Also, to be more clear on what happens,
>>>>>>
>>>>>> Guest ACPI boot with "fdt=on" ,
>>>>>>
>>>>>> From kernel log,
>>>>>>
>>>>>> [    0.000000] Early memory node ranges
>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
>>>>>>
>>>>>>
>>>>>> Guest ACPI boot with "fdt=off" ,
>>>>>>
>>>>>> [    0.000000] Movable zone start for each node
>>>>>> [    0.000000] Early memory node ranges
>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff
>>>>>>
>>>>>> The hotpluggable memory node is absent from early memory nodes here.
>>>>>
>>>>> OK thank you for the example illustrating the concern.
>>>>>>
>>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.
>>>>>
>>>>> Let's wait for EDK2 experts on this.
>>>>>
>>>>
>>>> Happy to chime in, but I need a bit more context here.
>>>>
>>>> What is the problem, how does this path try to solve it, and why is
>>>> that a bad idea?
>>>>
>>> Sure, sorry.
>>>
>>> This series:
>>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
>>> https://patchwork.kernel.org/cover/10863301/
>>>
>>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
>>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
>>>
>>> We noticed that if we build the hotpluggable memory dt nodes on top of
>>> the above ACPI tables, the DIMM slots are interpreted as not
>>> hotpluggable memory slots (at least we think so).
>>>
>>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
>>> fact that those slots are exposed as hotpluggable in the SRAT for example.
>>>
>>> So in this series, we are forced to not generate the hotpluggable memory
>>> dt nodes if we want the DIMM slots to be effectively recognized as
>>> hotpluggable.
>>>
>>> Could you confirm we have a correct understanding of the EDK2 behaviour
>>> and if so, would there be any solution for EDK2 to absorb both the DT
>>> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
>>>
>>> At qemu level, detecting we are booting in ACPI mode and purposely
>>> removing the above mentioned DT nodes does not look straightforward.
>>
>> The firmware is not enlightened about the ACPI content that comes from
>> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
>> as instructed through the ACPI linker/loader script, in order to install
>> the ACPI content for the OS. No actual information is consumed by the
>> firmware from the ACPI payload -- and that's a feature.
>>
>> The firmware does consume DT:
>>
>> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
>> the firmware (for its own information needs), and passed on to the OS.
>>
>> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
>> consumed only by the firmware (for its own information needs), and the
>> DT is hidden from the OS. The OS gets only the ACPI content
>> (processed/prepared as described above).
>>
>>
>> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
>> base/size pairs in all the memory nodes in the DT. For each such base
>> address that is currently tracked as "nonexistent" in the GCD memory
>> space map, the driver currently adds the base/size range as "system
>> memory". This in turn is reflected by the UEFI memmap that the OS gets
>> to see as "conventional memory".
>>
>> If you need some memory ranges to show up as "special" in the UEFI
>> memmap, then you need to distinguish them somehow from the "regular"
>> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
>> firmware, so that it act upon the discriminator that you set in the DT.
>>
>>
>> Now... from a brief look at the Platform Init and UEFI specs, my
>> impression is that the hotpluggable (but presently not plugged) DIMM
>> ranges should simply be *absent* from the UEFI memmap; is that correct?
>> (I didn't check the ACPI spec, maybe it specifies the expected behavior
>> in full.) If my impression is correct, then two options (alternatives)
>> exist:
>>
>> (1) Hide the affected memory nodes -- or at least the affected base/size
>> pairs -- from the DT, in case you boot without "-no-acpi" but with an
>> external firmware loaded. Then the firmware will not expose those ranges
>> as "conventional memory" in the UEFI memmap. This approach requires no
>> changes to edk2.
>>
>> This option is precisely what Eric described up-thread, at
>> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
>>
>>> in machvirt_init, there is firmware_loaded that tells you whether you
>>> have a FW image. If this one is not set, you can induce dt. But if
>>> there is a FW it can be either DT or ACPI booted. You also have the
>>> acpi_enabled knob.
>>
>> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
>> "vl.c").
>>
>> So, the condition for hiding the hotpluggable memory nodes in question
>> from the DT is:
> 
>>
>>   (aarch64 && firmware_loaded && acpi_enabled)
> 
> Thanks a lot for all those inputs!
> 
> I don't get why we test aarch64 in above condition (this was useful for
> high ECAM range as the aarch32 FW was not supporting it but here, is it
> still meaningful?)

Sorry, I should have clarified that. Yes, it is meaningful:

While UEFI has bindings for both 32-bit and 64-bit ARM, ACPI has a
64-bit-only binding for ARM. (And you can have UEFI without ACPI, but
not the reverse, on ARM.) So if you run the 32-bit build of the
ArmVirtQemu firmware, you get no ACPI at all; all you can rely on with
the OS is the DT.

This "bitness distinction" is implemented in the firmware already. If
you hid the memory nodes from the DT under the condition

  (!aarch64 && firmware_loaded && acpi_enabled)

then the nodes would not be seen by the OS at all (because
"acpi_enabled" is irrelevant for the 32-bit build of ArmVirtQemu, and
all the OS can ever get is DT).
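
The reasoning above can be sketched as a small C model of which hardware
description the OS ends up with. The enum and both function names are
hypothetical, chosen for this example only. The "no firmware means DT"
branch is an assumption taken from Eric's quoted remark about
firmware_loaded.

```c
#include <stdbool.h>

/* Which hardware description the guest OS gets to see. */
enum hw_description { HW_DESC_DT, HW_DESC_ACPI };

/*
 * Model of the cases discussed in the thread:
 * - no external firmware: assumed DT boot
 * - 32-bit firmware build: no ACPI at all, so DT regardless of acpi_enabled
 * - 64-bit firmware: ACPI unless "-no-acpi" cleared acpi_enabled
 */
static enum hw_description os_visible_description(bool aarch64,
                                                  bool firmware_loaded,
                                                  bool acpi_enabled)
{
    if (!firmware_loaded) {
        return HW_DESC_DT;
    }
    if (!aarch64) {
        return HW_DESC_DT;
    }
    return acpi_enabled ? HW_DESC_ACPI : HW_DESC_DT;
}

/*
 * The hotpluggable memory nodes may be hidden from the DT only when the
 * OS does not rely on the DT, which is the ACPI case.
 */
static bool may_hide_hotplug_mem_nodes(bool aarch64,
                                       bool firmware_loaded,
                                       bool acpi_enabled)
{
    return os_visible_description(aarch64, firmware_loaded, acpi_enabled)
           == HW_DESC_ACPI;
}
```

In this model, a 32-bit guest with firmware loaded and acpi_enabled set
still maps to DT, so hiding the nodes there would leave the OS without any
description of that memory. That is the case the aarch64 term guards
against.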

Thanks,
Laszlo

> 
> Thanks
> 
> Eric
> 
>>
>>
>> (2) Invent and set an "ignore me, firmware" property for the
>> hotpluggable memory nodes in the DT, and update the firmware to honor
>> that property.
>>
>> Thanks
>> Laszlo
>>


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-02 10:33                   ` Laszlo Ersek
@ 2019-04-02 15:42                     ` Auger Eric
  -1 siblings, 0 replies; 95+ messages in thread
From: Auger Eric @ 2019-04-02 15:42 UTC (permalink / raw)
  To: Laszlo Ersek, Igor Mammedov
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Shameerali Kolothum Thodi, Linuxarm,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com, Leif Lindholm

Hi Laszlo,

On 4/2/19 12:33 PM, Laszlo Ersek wrote:
> On 04/02/19 09:42, Igor Mammedov wrote:
>> On Mon, 1 Apr 2019 15:07:05 +0200
>> Laszlo Ersek <lersek@redhat.com> wrote:
>>
>>> On 03/29/19 14:56, Auger Eric wrote:
>>>> Hi Ard,
>>>>
>>>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:  
>>>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:  
>>>>>>
>>>>>> Hi Shameer,
>>>>>>
>>>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:  
>>>>>>>
>>>>>>>  
>>>>>>>> -----Original Message-----
>>>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>>>>>> Sent: 29 March 2019 09:32
>>>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>>>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>>>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>>>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>>>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>>>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>>>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>>>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>>>>>>>
>>>>>>>> Hi Shameer,
>>>>>>>>
>>>>>>>> [ + Laszlo, Ard, Leif ]
>>>>>>>>
>>>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:  
>>>>>>>>> This is to disable/enable populating DT nodes in case of
>>>>>>>>> any conflict with ACPI tables. The default is "off".
>>>>>>>> The name of the option sounds misleading to me. Also we don't really
>>>>>>>> know the scope of the disablement. At the moment this just aims to
>>>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
>>>>>>>>  
>>>>>>>>>
>>>>>>>>> This will be used in subsequent patch where cold plug
>>>>>>>>> device-memory support is added for DT boot.  
>>>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
>>>>>>>> any PCDIMM nodes.  
>>>>>>>>>
>>>>>>>>> If DT memory node support is added for cold-plugged device
>>>>>>>>> memory, that memory will be visible to the guest kernel via
>>>>>>>>> UEFI GetMemoryMap() and will be treated as early boot memory.
>>>>>>>> Don't we have an issue in UEFI then? Normally the SRAT indicates whether
>>>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>>>>>>>> info?
>>>>>>>
>>>>>>> Sorry I missed this part. Yes, that will be a cleaner solution.
>>>>>>>
>>>>>>> Also, to be more clear on what happens,
>>>>>>>
>>>>>>> Guest ACPI boot with "fdt=on" ,
>>>>>>>
>>>>>>> From kernel log,
>>>>>>>
>>>>>>> [    0.000000] Early memory node ranges
>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
>>>>>>>
>>>>>>>
>>>>>>> Guest ACPI boot with "fdt=off" ,
>>>>>>>
>>>>>>> [    0.000000] Movable zone start for each node
>>>>>>> [    0.000000] Early memory node ranges
>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>>>
>>>>>>> The hotpluggable memory node is absent from early memory nodes here.  
>>>>>>
>>>>>> OK thank you for the example illustrating the concern.  
>>>>>>>
>>>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.  
>>>>>>
>>>>>> Let's wait for EDK2 experts on this.
>>>>>>  
>>>>>
>>>>> Happy to chime in, but I need a bit more context here.
>>>>>
>>>>> What is the problem, how does this patch try to solve it, and why is
>>>>> that a bad idea?
>>>>>  
>>>> Sure, sorry.
>>>>
>>>> This series:
>>>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
>>>> https://patchwork.kernel.org/cover/10863301/
>>>>
>>>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
>>>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
>>>>
>>>> We noticed that if we build the hotpluggable memory dt nodes on top of
>>>> the above ACPI tables, the DIMM slots are interpreted as not
>>>> hotpluggable memory slots (at least we think so).
>>>>
>>>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
>>>> fact that those slots are exposed as hotpluggable in the SRAT for example.
>>>>
>>>> So in this series, we are forced to not generate the hotpluggable memory
>>>> dt nodes if we want the DIMM slots to be effectively recognized as
>>>> hotpluggable.
>>>>
>>>> Could you confirm that we understand the EDK2 behaviour correctly?
>>>> If so, is there any way for EDK2 to absorb both the DT nodes and the
>>>> relevant SRAT/DSDT tables and still make the slots hotpluggable?
>>>>
>>>> At qemu level, detecting we are booting in ACPI mode and purposely
>>>> removing the above mentioned DT nodes does not look straightforward.  
>>>
>>> The firmware is not enlightened about the ACPI content that comes from
>>> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
>>> as instructed through the ACPI linker/loader script, in order to install
>>> the ACPI content for the OS. No actual information is consumed by the
>>> firmware from the ACPI payload -- and that's a feature.
>>>
>>> The firmware does consume DT:
>>>
>>> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
>>> the firmware (for its own information needs), and passed on to the OS.
>>>
>>> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
>>> consumed only by the firmware (for its own information needs), and the
>>> DT is hidden from the OS. The OS gets only the ACPI content
>>> (processed/prepared as described above).
I am confused by the above statement, actually. In the above case, what
happens if you pass acpi=off in the kernel boot parameters?

Thanks

Eric
>>>
>>>
>>> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
>>> base/size pairs in all the memory nodes in the DT. For each such base
>>> address that is currently tracked as "nonexistent" in the GCD memory
>>> space map, the driver currently adds the base/size range as "system
>>> memory". This in turn is reflected by the UEFI memmap that the OS gets
>>> to see as "conventional memory".
>>>
>>> If you need some memory ranges to show up as "special" in the UEFI
>>> memmap, then you need to distinguish them somehow from the "regular"
>>> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
>>> firmware, so that it acts upon the discriminator that you set in the DT.
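The HighMemDxe behaviour described above, plus a discriminator in the DT,
can be modelled in plain C roughly as follows. This is a toy model and
not edk2 code: the types, the fixed-size map and the fw_ignore flag are
all assumptions made up for the example. The addresses in the usage note
are the ones from the kernel log quoted earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 16

/* One base/size pair, as found in a DT memory node. */
struct mem_range {
    uint64_t base;
    uint64_t size;
    bool     fw_ignore;   /* hypothetical discriminator set in the DT */
};

/* Toy stand-in for the GCD memory space map: ranges already known. */
struct gcd_map {
    struct mem_range known[MAX_RANGES];
    size_t           count;
};

/* True if addr falls inside a range that the map already tracks. */
static bool gcd_exists(const struct gcd_map *map, uint64_t addr)
{
    for (size_t i = 0; i < map->count; i++) {
        const struct mem_range *r = &map->known[i];
        if (addr >= r->base && addr - r->base < r->size) {
            return true;
        }
    }
    return false;
}

/*
 * Walk the DT memory nodes and add every range whose base address is
 * still "nonexistent" in the map. Ranges carrying the discriminator
 * are skipped. Returns the number of ranges added.
 */
static size_t add_high_memory(struct gcd_map *map,
                              const struct mem_range *nodes, size_t n)
{
    size_t added = 0;

    assert(map != NULL && nodes != NULL);
    for (size_t i = 0; i < n; i++) {
        if (nodes[i].fw_ignore || gcd_exists(map, nodes[i].base)) {
            continue;
        }
        if (map->count == MAX_RANGES) {
            break;                      /* toy map is full */
        }
        map->known[map->count++] = nodes[i];
        added++;
    }
    return added;
}
```

If the map already tracks 0x40000000-0xbfffffff and the DT marks the
0xc0000000-0xffffffff node with the flag, that node is never added, so it
would not show up as conventional memory in the UEFI memmap.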
>>>
>>>
>>> Now... from a brief look at the Platform Init and UEFI specs, my
>>> impression is that the hotpluggable (but presently not plugged) DIMM
>>> ranges should simply be *absent* from the UEFI memmap; is that correct?
>>> (I didn't check the ACPI spec, maybe it specifies the expected behavior
>>> in full.) If my impression is correct, then two options (alternatives)
>>> exist:
>>>
>>> (1) Hide the affected memory nodes -- or at least the affected base/size
>>> pairs -- from the DT, in case you boot without "-no-acpi" but with an
>>> external firmware loaded. Then the firmware will not expose those ranges
>>> as "conventional memory" in the UEFI memmap. This approach requires no
>>> changes to edk2.
>>>
>>> This option is precisely what Eric described up-thread, at
>>> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
>>>
>>>> in machvirt_init, there is firmware_loaded that tells you whether you
>>>> have a FW image. If this one is not set, you can induce dt. But if
>>>> there is a FW it can be either DT or ACPI booted. You also have the
>>>> acpi_enabled knob.  
>>>
>>> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
>>> "vl.c").
>>>
>>> So, the condition for hiding the hotpluggable memory nodes in question
>>> from the DT is:
>>>
>>>   (aarch64 && firmware_loaded && acpi_enabled)
>> I'd go with this one, though I have a question for the firmware side.
>> Let's assume that in the future we want to expose hotpluggable & present
>> memory via GetMemoryMap() (like bare metal does). The guest OS can
>> theoretically avoid using it for the Normal zone, based on a hint from
>> the SRAT table early at boot. But what about the firmware: can it inspect
>> the SRAT table and not use hotpluggable ranges for its own purposes (or
>> at least not cannibalize them permanently)?
> 
> This is actually two questions:
> 
> (a) Can the firmware inspect SRAT?
> 
> If the SRAT table structure isn't very complex, this is technically
> doable, but the wrong thing to do, IMO.
> 
> First, we've tried hard to avoid enlightening the firmware about the
> semantics of QEMU's ACPI tables.
> 
> Second, this would introduce an ordering constraint (or callbacks) in
> the firmware, between the driver that processes & installs the ACPI
> tables, and the driver that translates the memory nodes of the DT to the
> memory ranges known to UEFI and the OS.
> 
> If we need such hinting, then option (2) below (from earlier context)
> would be better:
> - If it's OK to use an arm/aarch64 specific solution, then new DT
> properties should work.
> - If it should be arch-independent, then a dedicated fw_cfg file would
> be better.
> 
> (b) Assuming we have the information from some source, can the firmware
> expose some memory ranges as "usable RAM" to the OS, while staying away
> from them for its own (firmware) purposes?
> 
> After consulting
> 
>   Table 25. Memory Type Usage before ExitBootServices()
>   Table 26. Memory Type Usage after ExitBootServices()
> 
> in UEFI-2.7, I would say that the firmware driver that installs these
> ranges to the memory (space) map should also allocate the ranges right
> after, as EfiBootServicesData. This will prevent other drivers /
> applications in the firmware from allocating chunks out of those areas,
> and the OS will be at liberty to release and repurpose the ranges after
> ExitBootServices().
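The idea in (b) can be illustrated with a small C model. It is only a
sketch under assumptions: the enum covers just the two memory types
discussed here, and the struct and helper names are hypothetical, not
edk2 or UEFI identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The two UEFI memory types relevant to the idea above. */
enum mem_type { CONVENTIONAL, BOOT_SERVICES_DATA };

struct map_entry {
    uint64_t      base;
    uint64_t      size;
    enum mem_type type;
};

/* Install a hotpluggable range and claim it right away. */
static void install_reserved(struct map_entry *e,
                             uint64_t base, uint64_t size)
{
    assert(e != NULL);
    e->base = base;
    e->size = size;
    e->type = BOOT_SERVICES_DATA;   /* already allocated, nobody else can */
}

/* Before ExitBootServices(): firmware allocations come from free memory. */
static bool fw_may_allocate_from(const struct map_entry *e)
{
    return e->type == CONVENTIONAL;
}

/* After ExitBootServices(): the OS may use both types for general purposes. */
static bool os_may_use(const struct map_entry *e)
{
    return e->type == CONVENTIONAL || e->type == BOOT_SERVICES_DATA;
}
```

A range installed through install_reserved() is off limits to other
firmware drivers and applications, yet the OS can still treat it as
usable RAM after ExitBootServices().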
> 
> Thanks,
> Laszlo
> 
>>> (2) Invent and set an "ignore me, firmware" property for the
>>> hotpluggable memory nodes in the DT, and update the firmware to honor
>>> that property.
>>>
>>> Thanks
>>> Laszlo
>>
> 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
@ 2019-04-02 15:42                     ` Auger Eric
  0 siblings, 0 replies; 95+ messages in thread
From: Auger Eric @ 2019-04-02 15:42 UTC (permalink / raw)
  To: Laszlo Ersek, Igor Mammedov
  Cc: Ard Biesheuvel, peter.maydell@linaro.org, sameo@linux.intel.com,
	qemu-devel@nongnu.org, Shameerali Kolothum Thodi, Linuxarm,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com, Leif Lindholm

Hi Laszlo,

On 4/2/19 12:33 PM, Laszlo Ersek wrote:
> On 04/02/19 09:42, Igor Mammedov wrote:
>> On Mon, 1 Apr 2019 15:07:05 +0200
>> Laszlo Ersek <lersek@redhat.com> wrote:
>>
>>> On 03/29/19 14:56, Auger Eric wrote:
>>>> Hi Ard,
>>>>
>>>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:  
>>>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:  
>>>>>>
>>>>>> Hi Shameer,
>>>>>>
>>>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:  
>>>>>>>
>>>>>>>  
>>>>>>>> -----Original Message-----
>>>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>>>>>> Sent: 29 March 2019 09:32
>>>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>>>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>>>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>>>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>>>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>>>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>>>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>>>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>>>>>>>
>>>>>>>> Hi Shameer,
>>>>>>>>
>>>>>>>> [ + Laszlo, Ard, Leif ]
>>>>>>>>
>>>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:  
>>>>>>>>> This is to disable/enable populating DT nodes in case of
>>>>>>>>> any conflict with ACPI tables. The default is "off".
>>>>>>>> The name of the option sounds misleading to me. Also we don't really
>>>>>>>> know the scope of the disablement. At the moment this just aims to
>>>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
>>>>>>>>  
>>>>>>>>>
>>>>>>>>> This will be used in subsequent patch where cold plug
>>>>>>>>> device-memory support is added for DT boot.  
>>>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
>>>>>>>> any PCDIMM nodes.  
>>>>>>>>>
>>>>>>>>> If DT memory node support is added for cold-plugged device
>>>>>>>>> memory, that memory will be visible to the guest kernel via
>>>>>>>>> UEFI GetMemoryMap() and will be treated as early boot memory.
>>>>>>>> Don't we have an issue in UEFI then? Normally the SRAT indicates whether
>>>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>>>>>>>> info?
>>>>>>>
>>>>>>> Sorry I missed this part. Yes, that will be a cleaner solution.
>>>>>>>
>>>>>>> Also, to be more clear on what happens,
>>>>>>>
>>>>>>> Guest ACPI boot with "fdt=on" ,
>>>>>>>
>>>>>>> From kernel log,
>>>>>>>
>>>>>>> [    0.000000] Early memory node ranges
>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
>>>>>>>
>>>>>>>
>>>>>>> Guest ACPI boot with "fdt=off" ,
>>>>>>>
>>>>>>> [    0.000000] Movable zone start for each node
>>>>>>> [    0.000000] Early memory node ranges
>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>>>
>>>>>>> The hotpluggable memory node is absent from early memory nodes here.  
>>>>>>
>>>>>> OK thank you for the example illustrating the concern.  
>>>>>>>
>>>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.  
>>>>>>
>>>>>> Let's wait for EDK2 experts on this.
>>>>>>  
>>>>>
>>>>> Happy to chime in, but I need a bit more context here.
>>>>>
>>>>> What is the problem, how does this patch try to solve it, and why is
>>>>> that a bad idea?
>>>>>  
>>>> Sure, sorry.
>>>>
>>>> This series:
>>>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
>>>> https://patchwork.kernel.org/cover/10863301/
>>>>
>>>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
>>>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
>>>>
>>>> We noticed that if we build the hotpluggable memory dt nodes on top of
>>>> the above ACPI tables, the DIMM slots are interpreted as not
>>>> hotpluggable memory slots (at least we think so).
>>>>
>>>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
>>>> fact that those slots are exposed as hotpluggable in the SRAT for example.
>>>>
>>>> So in this series, we are forced to not generate the hotpluggable memory
>>>> dt nodes if we want the DIMM slots to be effectively recognized as
>>>> hotpluggable.
>>>>
>>>> Could you confirm that we understand the EDK2 behaviour correctly?
>>>> If so, is there any way for EDK2 to absorb both the DT nodes and the
>>>> relevant SRAT/DSDT tables and still make the slots hotpluggable?
>>>>
>>>> At qemu level, detecting we are booting in ACPI mode and purposely
>>>> removing the above mentioned DT nodes does not look straightforward.  
>>>
>>> The firmware is not enlightened about the ACPI content that comes from
>>> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
>>> as instructed through the ACPI linker/loader script, in order to install
>>> the ACPI content for the OS. No actual information is consumed by the
>>> firmware from the ACPI payload -- and that's a feature.
>>>
>>> The firmware does consume DT:
>>>
>>> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
>>> the firmware (for its own information needs), and passed on to the OS.
>>>
>>> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
>>> consumed only by the firmware (for its own information needs), and the
>>> DT is hidden from the OS. The OS gets only the ACPI content
>>> (processed/prepared as described above).
I am confused by the above statement, actually. In the above case, what
happens if you pass acpi=off in the kernel boot parameters?

Thanks

Eric
>>>
>>>
>>> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
>>> base/size pairs in all the memory nodes in the DT. For each such base
>>> address that is currently tracked as "nonexistent" in the GCD memory
>>> space map, the driver currently adds the base/size range as "system
>>> memory". This in turn is reflected by the UEFI memmap that the OS gets
>>> to see as "conventional memory".
>>>
>>> If you need some memory ranges to show up as "special" in the UEFI
>>> memmap, then you need to distinguish them somehow from the "regular"
>>> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
>>> firmware, so that it acts upon the discriminator that you set in the DT.
>>>
>>>
>>> Now... from a brief look at the Platform Init and UEFI specs, my
>>> impression is that the hotpluggable (but presently not plugged) DIMM
>>> ranges should simply be *absent* from the UEFI memmap; is that correct?
>>> (I didn't check the ACPI spec, maybe it specifies the expected behavior
>>> in full.) If my impression is correct, then two options (alternatives)
>>> exist:
>>>
>>> (1) Hide the affected memory nodes -- or at least the affected base/size
>>> pairs -- from the DT, in case you boot without "-no-acpi" but with an
>>> external firmware loaded. Then the firmware will not expose those ranges
>>> as "conventional memory" in the UEFI memmap. This approach requires no
>>> changes to edk2.
>>>
>>> This option is precisely what Eric described up-thread, at
>>> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
>>>
>>>> in machvirt_init, there is firmware_loaded that tells you whether you
>>>> have a FW image. If this one is not set, you can induce dt. But if
>>>> there is a FW it can be either DT or ACPI booted. You also have the
>>>> acpi_enabled knob.  
>>>
>>> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
>>> "vl.c").
>>>
>>> So, the condition for hiding the hotpluggable memory nodes in question
>>> from the DT is:
>>>
>>>   (aarch64 && firmware_loaded && acpi_enabled)
>> I'd go with this one, though I have a question for the firmware side.
>> Let's assume that in future we want to expose hotpluggable & present
>> memory via GetMemoryMap() (like bare metal does); the guest OS can
>> theoretically avoid using it for the Normal zone, based on a hint from
>> the SRAT table early at boot. But what about the firmware: can it
>> inspect the SRAT table and not use hotpluggable ranges for its own use
>> (or at least not cannibalize them permanently)?
> 
> This is actually two questions:
> 
> (a) Can the firmware inspect SRAT?
> 
> If the SRAT table structure isn't very complex, this is technically
> doable, but the wrong thing to do, IMO.
> 
> First, we've tried hard to avoid enlightening the firmware about the
> semantics of QEMU's ACPI tables.
> 
> Second, this would introduce an ordering constraint (or callbacks) in
> the firmware, between the driver that processes & installs the ACPI
> tables, and the driver that translates the memory nodes of the DT to the
> memory ranges known to UEFI and the OS.
> 
> If we need such hinting, then option (2) below (from earlier context)
> would be better:
> - If it's OK to use an arm/aarch64 specific solution, then new DT
> properties should work.
> - If it should be arch-independent, then a dedicated fw_cfg file would
> be better.
> 
> (b) Assuming we have the information from some source, can the firmware
> expose some memory ranges as "usable RAM" to the OS, while staying away
> from them for its own (firmware) purposes?
> 
> After consulting
> 
>   Table 25. Memory Type Usage before ExitBootServices()
>   Table 26. Memory Type Usage after ExitBootServices()
> 
> in UEFI-2.7, I would say that the firmware driver that installs these
> ranges to the memory (space) map should also allocate the ranges right
> after, as EfiBootServicesData. This will prevent other drivers /
> applications in the firmware from allocating chunks out of those areas,
> and the OS will be at liberty to release and repurpose the ranges after
> ExitBootServices().
> 
> Thanks,
> Laszlo
> 
>>> (2) Invent and set an "ignore me, firmware" property for the
>>> hotpluggable memory nodes in the DT, and update the firmware to honor
>>> that property.
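
To illustrate the HighMemDxe behaviour described above, and what option
(2) would change, here is a toy model in Python. It is a sketch under
assumptions, not edk2 code: the property name "firmware-ignore", the
one-range-per-node simplification and all function names are invented
for illustration. The addresses are the ones from the kernel log quoted
in this thread.

```python
from dataclasses import dataclass, field

# Hypothetical property name; the thread only says "ignore me, firmware".
IGNORE_PROP = "firmware-ignore"


@dataclass
class MemoryNode:
    """One DT memory node, simplified to a single base/size pair."""
    base: int
    size: int
    properties: set = field(default_factory=set)


def ranges_added_as_system_memory(nodes, already_tracked, honor_ignore):
    """Model of the driver loop: add every range whose base is still
    "nonexistent" in the GCD map, optionally skipping marked nodes."""
    added = []
    for node in nodes:
        if honor_ignore and IGNORE_PROP in node.properties:
            continue  # option (2): the firmware honors the marker
        if node.base in already_tracked:
            continue  # range is already known to the firmware
        added.append((node.base, node.size))
    return added


nodes = [
    MemoryNode(0x40000000, 0x80000000),                 # initial RAM
    MemoryNode(0xC0000000, 0x40000000, {IGNORE_PROP}),  # hotpluggable
]
tracked = {0x40000000}

today = ranges_added_as_system_memory(nodes, tracked, honor_ignore=False)
option2 = ranges_added_as_system_memory(nodes, tracked, honor_ignore=True)
```

In this model the current behaviour adds the 0xc0000000 range as system
memory, while a firmware that honors the marker adds nothing.
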
>>>
>>> Thanks
>>> Laszlo
>>
> 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-02 15:38                   ` Laszlo Ersek
@ 2019-04-02 15:50                     ` Auger Eric
  -1 siblings, 0 replies; 95+ messages in thread
From: Auger Eric @ 2019-04-02 15:50 UTC (permalink / raw)
  To: Laszlo Ersek, Ard Biesheuvel
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Linuxarm,
	Shameerali Kolothum Thodi, qemu-devel@nongnu.org,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	imammedo@redhat.com, sebastien.boeuf@intel.com, Leif Lindholm

Laszlo,

On 4/2/19 5:38 PM, Laszlo Ersek wrote:
> On 04/02/19 17:29, Auger Eric wrote:
>> Hi Laszlo,
>>
>> On 4/1/19 3:07 PM, Laszlo Ersek wrote:
>>> On 03/29/19 14:56, Auger Eric wrote:
>>>> Hi Ard,
>>>>
>>>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:
>>>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:
>>>>>>
>>>>>> Hi Shameer,
>>>>>>
>>>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>>>>>> Sent: 29 March 2019 09:32
>>>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>>>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>>>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>>>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>>>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>>>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>>>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>>>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>>>>>>>
>>>>>>>> Hi Shameer,
>>>>>>>>
>>>>>>>> [ + Laszlo, Ard, Leif ]
>>>>>>>>
>>>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:
>>>>>>>>> This is to disable/enable populating DT nodes in case
>>>>>>>>> any conflict with acpi tables. The default is "off".
>>>>>>>> The name of the option sounds misleading to me. Also we don't really
>>>>>>>> know the scope of the disablement. At the moment this just aims to
>>>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> This will be used in subsequent patch where cold plug
>>>>>>>>> device-memory support is added for DT boot.
>>>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
>>>>>>>> any PCDIMM nodes.
>>>>>>>>>
>>>>>>>>> If DT memory node support is added for cold-plugged device
>>>>>>>>> memory, that memory will be visible to the guest kernel via
>>>>>>>>> UEFI GetMemoryMap() and will be treated as early boot memory.
>>>>>>>> Don't we have an issue in UEFI then? Normally the SRAT indicates whether
>>>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>>>>>>>> info?
>>>>>>>
>>>>>>> Sorry I missed this part. Yes, that will be a cleaner solution.
>>>>>>>
>>>>>>> Also, to be more clear on what happens,
>>>>>>>
>>>>>>> Guest ACPI boot with "fdt=on" ,
>>>>>>>
>>>>>>> From kernel log,
>>>>>>>
>>>>>>> [    0.000000] Early memory node ranges
>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
>>>>>>>
>>>>>>>
>>>>>>> Guest ACPI boot with "fdt=off" ,
>>>>>>>
>>>>>>> [    0.000000] Movable zone start for each node
>>>>>>> [    0.000000] Early memory node ranges
>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>>>
>>>>>>> The hotpluggable memory node is absent from early memory nodes here.
>>>>>>
>>>>>> OK thank you for the example illustrating the concern.
>>>>>>>
>>>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.
>>>>>>
>>>>>> Let's wait for EDK2 experts on this.
>>>>>>
>>>>>
>>>>> Happy to chime in, but I need a bit more context here.
>>>>>
>>>>> What is the problem, how does this patch try to solve it, and why is
>>>>> that a bad idea?
>>>>>
>>>> Sure, sorry.
>>>>
>>>> This series:
>>>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
>>>> https://patchwork.kernel.org/cover/10863301/
>>>>
>>>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
>>>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
>>>>
>>>> We noticed that if we build the hotpluggable memory dt nodes on top of
>>>> the above ACPI tables, the DIMM slots are interpreted as not
>>>> hotpluggable memory slots (at least we think so).
>>>>
>>>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
>>>> fact that those slots are exposed as hotpluggable in the SRAT for example.
>>>>
>>>> So in this series, we are forced to not generate the hotpluggable memory
>>>> dt nodes if we want the DIMM slots to be effectively recognized as
>>>> hotpluggable.
>>>>
>>>> Could you confirm we have a correct understanding of the EDK2 behaviour
>>>> and if so, would there be any solution for EDK2 to absorb both the DT
>>>> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
>>>>
>>>> At qemu level, detecting we are booting in ACPI mode and purposely
>>>> removing the above mentioned DT nodes does not look straightforward.
>>>
>>> The firmware is not enlightened about the ACPI content that comes from
>>> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
>>> as instructed through the ACPI linker/loader script, in order to install
>>> the ACPI content for the OS. No actual information is consumed by the
>>> firmware from the ACPI payload -- and that's a feature.
>>>
>>> The firmware does consume DT:
>>>
>>> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
>>> the firmware (for its own information needs), and passed on to the OS.
>>>
>>> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
>>> consumed only by the firmware (for its own information needs), and the
>>> DT is hidden from the OS. The OS gets only the ACPI content
>>> (processed/prepared as described above).
>>>
>>>
>>> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
>>> base/size pairs in all the memory nodes in the DT. For each such base
>>> address that is currently tracked as "nonexistent" in the GCD memory
>>> space map, the driver currently adds the base/size range as "system
>>> memory". This in turn is reflected by the UEFI memmap that the OS gets
>>> to see as "conventional memory".
>>>
>>> If you need some memory ranges to show up as "special" in the UEFI
>>> memmap, then you need to distinguish them somehow from the "regular"
>>> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
>>> firmware, so that it acts upon the discriminator that you set in the DT.
>>>
>>>
>>> Now... from a brief look at the Platform Init and UEFI specs, my
>>> impression is that the hotpluggable (but presently not plugged) DIMM
>>> ranges should simply be *absent* from the UEFI memmap; is that correct?
>>> (I didn't check the ACPI spec, maybe it specifies the expected behavior
>>> in full.) If my impression is correct, then two options (alternatives)
>>> exist:
>>>
>>> (1) Hide the affected memory nodes -- or at least the affected base/size
>>> pairs -- from the DT, in case you boot without "-no-acpi" but with an
>>> external firmware loaded. Then the firmware will not expose those ranges
>>> as "conventional memory" in the UEFI memmap. This approach requires no
>>> changes to edk2.
>>>
>>> This option is precisely what Eric described up-thread, at
>>> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
>>>
>>>> in machvirt_init, there is firmware_loaded that tells you whether you
>>>> have a FW image. If this one is not set, you can induce dt. But if
>>>> there is a FW it can be either DT or ACPI booted. You also have the
>>>> acpi_enabled knob.
>>>
>>> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
>>> "vl.c").
>>>
>>> So, the condition for hiding the hotpluggable memory nodes in question
>>> from the DT is:
>>
>>>
>>>   (aarch64 && firmware_loaded && acpi_enabled)
>>
>> Thanks a lot for all those inputs!
>>
>> I don't get why we test aarch64 in above condition (this was useful for
>> high ECAM range as the aarch32 FW was not supporting it but here, is it
>> still meaningful?)
> 
> Sorry, I should have clarified that. Yes, it is meaningful:
> 
> While UEFI has bindings for both 32-bit and 64-bit ARM, ACPI has a
> 64-bit-only binding for ARM. (And you can have UEFI without ACPI, but
> not the reverse, on ARM.) So if you run the 32-bit build of the
> ArmVirtQemu firmware, you get no ACPI at all; all you can rely on with
> the OS is the DT.

OK. Thank you for the clarification!

Eric
> 
> This "bitness distinction" is implemented in the firmware already. If
> you hid the memory nodes from the DT under the condition
> 
>   (!aarch64 && firmware_loaded && acpi_enabled)
> 
> then the nodes would not be seen by the OS at all (because
> "acpi_enabled" is irrelevant for the 32-bit build of ArmVirtQemu, and
> all the OS can ever get is DT).
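
The condition under discussion can be written out as a truth table. A
minimal sketch in Python, purely illustrative: the three flags mirror
the names used in this thread, while the function name is made up here
and is not a QEMU symbol.

```python
from itertools import product


def hide_hotplug_nodes_from_dt(aarch64: bool,
                               firmware_loaded: bool,
                               acpi_enabled: bool) -> bool:
    """Return True when the hotpluggable memory nodes should be left
    out of the DT, per the condition proposed in the thread."""
    return aarch64 and firmware_loaded and acpi_enabled


# Enumerate all eight combinations of the three flags.
table = {
    flags: hide_hotplug_nodes_from_dt(*flags)
    for flags in product((False, True), repeat=3)
}

for (aarch64, fw, acpi), hide in sorted(table.items()):
    print(f"aarch64={aarch64!s:5} firmware_loaded={fw!s:5} "
          f"acpi_enabled={acpi!s:5} -> hide={hide}")
```

Only one row hides the nodes. In particular a 32-bit guest keeps them
in the DT whatever "acpi_enabled" says, which matches the point that
the 32-bit firmware build can only hand a DT to the OS.
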
> 
> Thanks,
> Laszlo
> 
>>
>> Thanks
>>
>> Eric
>>
>>>
>>>
>>> (2) Invent and set an "ignore me, firmware" property for the
>>> hotpluggable memory nodes in the DT, and update the firmware to honor
>>> that property.
>>>
>>> Thanks
>>> Laszlo
>>>
> 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-02 15:42                     ` Auger Eric
@ 2019-04-02 15:52                       ` Laszlo Ersek
  -1 siblings, 0 replies; 95+ messages in thread
From: Laszlo Ersek @ 2019-04-02 15:52 UTC (permalink / raw)
  To: Auger Eric, Igor Mammedov
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Shameerali Kolothum Thodi, Linuxarm,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com, Leif Lindholm

On 04/02/19 17:42, Auger Eric wrote:
> Hi Laszlo,
> 
> On 4/2/19 12:33 PM, Laszlo Ersek wrote:
>> On 04/02/19 09:42, Igor Mammedov wrote:
>>> On Mon, 1 Apr 2019 15:07:05 +0200
>>> Laszlo Ersek <lersek@redhat.com> wrote:
>>>
>>>> On 03/29/19 14:56, Auger Eric wrote:
>>>>> Hi Ard,
>>>>>
>>>>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:  
>>>>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:  
>>>>>>>
>>>>>>> Hi Shameer,
>>>>>>>
>>>>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:  
>>>>>>>>
>>>>>>>>  
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>>>>>>> Sent: 29 March 2019 09:32
>>>>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>>>>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>>>>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>>>>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>>>>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>>>>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>>>>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>>>>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>>>>>>>>
>>>>>>>>> Hi Shameer,
>>>>>>>>>
>>>>>>>>> [ + Laszlo, Ard, Leif ]
>>>>>>>>>
>>>>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:  
>>>>>>>>>> This is to disable/enable populating DT nodes in case
>>>>>>>>>> any conflict with acpi tables. The default is "off".  
>>>>>>>>> The name of the option sounds misleading to me. Also we don't really
>>>>>>>>> know the scope of the disablement. At the moment this just aims to
>>>>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
>>>>>>>>>  
>>>>>>>>>>
>>>>>>>>>> This will be used in subsequent patch where cold plug
>>>>>>>>>> device-memory support is added for DT boot.  
>>>>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
>>>>>>>>> any PCDIMM nodes.  
>>>>>>>>>>
>>>>>>>>>> If DT memory node support is added for cold-plugged device
>>>>>>>>>> memory, that memory will be visible to the guest kernel via
>>>>>>>>>> UEFI GetMemoryMap() and will be treated as early boot memory.
>>>>>>>>> Don't we have an issue in UEFI then? Normally the SRAT indicates whether
>>>>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>>>>>>>>> info?
>>>>>>>>
>>>>>>>> Sorry I missed this part. Yes, that will be a cleaner solution.
>>>>>>>>
>>>>>>>> Also, to be more clear on what happens,
>>>>>>>>
>>>>>>>> Guest ACPI boot with "fdt=on" ,
>>>>>>>>
>>>>>>>> From kernel log,
>>>>>>>>
>>>>>>>> [    0.000000] Early memory node ranges
>>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
>>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
>>>>>>>>
>>>>>>>>
>>>>>>>> Guest ACPI boot with "fdt=off" ,
>>>>>>>>
>>>>>>>> [    0.000000] Movable zone start for each node
>>>>>>>> [    0.000000] Early memory node ranges
>>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>>>>
>>>>>>>> The hotpluggable memory node is absent from early memory nodes here.  
>>>>>>>
>>>>>>> OK thank you for the example illustrating the concern.  
>>>>>>>>
>>>>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.  
>>>>>>>
>>>>>>> Let's wait for EDK2 experts on this.
>>>>>>>  
>>>>>>
>>>>>> Happy to chime in, but I need a bit more context here.
>>>>>>
>>>>>> What is the problem, how does this patch try to solve it, and why is
>>>>>> that a bad idea?
>>>>>>  
>>>>> Sure, sorry.
>>>>>
>>>>> This series:
>>>>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
>>>>> https://patchwork.kernel.org/cover/10863301/
>>>>>
>>>>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
>>>>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
>>>>>
>>>>> We noticed that if we build the hotpluggable memory dt nodes on top of
>>>>> the above ACPI tables, the DIMM slots are interpreted as not
>>>>> hotpluggable memory slots (at least we think so).
>>>>>
>>>>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
>>>>> fact that those slots are exposed as hotpluggable in the SRAT for example.
>>>>>
>>>>> So in this series, we are forced to not generate the hotpluggable memory
>>>>> dt nodes if we want the DIMM slots to be effectively recognized as
>>>>> hotpluggable.
>>>>>
>>>>> Could you confirm we have a correct understanding of the EDK2 behaviour
>>>>> and if so, would there be any solution for EDK2 to absorb both the DT
>>>>> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
>>>>>
>>>>> At qemu level, detecting we are booting in ACPI mode and purposely
>>>>> removing the above mentioned DT nodes does not look straightforward.  
>>>>
>>>> The firmware is not enlightened about the ACPI content that comes from
>>>> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
>>>> as instructed through the ACPI linker/loader script, in order to install
>>>> the ACPI content for the OS. No actual information is consumed by the
>>>> firmware from the ACPI payload -- and that's a feature.
>>>>
>>>> The firmware does consume DT:
>>>>
>>>> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
>>>> the firmware (for its own information needs), and passed on to the OS.
>>>>
>>>> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
>>>> consumed only by the firmware (for its own information needs), and the
>>>> DT is hidden from the OS. The OS gets only the ACPI content
>>>> (processed/prepared as described above).

> I am actually confused by the above statement. In the above case, what
> happens if you pass acpi=off in the kernel boot parameters?

If you launch QEMU with "-no-acpi" and you pass "acpi=off" to the guest
kernel, then the kernel will not boot successfully, as it will not get
DT from the firmware, and it will ignore the ACPI tables that it does
get from the firmware.

Thanks
Laszlo

> 
> Thanks
> 
> Eric
>>>>
>>>>
>>>> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
>>>> base/size pairs in all the memory nodes in the DT. For each such base
>>>> address that is currently tracked as "nonexistent" in the GCD memory
>>>> space map, the driver currently adds the base/size range as "system
>>>> memory". This in turn is reflected by the UEFI memmap that the OS gets
>>>> to see as "conventional memory".
>>>>
>>>> If you need some memory ranges to show up as "special" in the UEFI
>>>> memmap, then you need to distinguish them somehow from the "regular"
>>>> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
>>>> firmware, so that it acts upon the discriminator that you set in the DT.
>>>>
>>>>
>>>> Now... from a brief look at the Platform Init and UEFI specs, my
>>>> impression is that the hotpluggable (but presently not plugged) DIMM
>>>> ranges should simply be *absent* from the UEFI memmap; is that correct?
>>>> (I didn't check the ACPI spec, maybe it specifies the expected behavior
>>>> in full.) If my impression is correct, then two options (alternatives)
>>>> exist:
>>>>
>>>> (1) Hide the affected memory nodes -- or at least the affected base/size
>>>> pairs -- from the DT, in case you boot without "-no-acpi" but with an
>>>> external firmware loaded. Then the firmware will not expose those ranges
>>>> as "conventional memory" in the UEFI memmap. This approach requires no
>>>> changes to edk2.
>>>>
>>>> This option is precisely what Eric described up-thread, at
>>>> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
>>>>
>>>>> in machvirt_init, there is firmware_loaded that tells you whether you
>>>>> have a FW image. If this one is not set, you can infer dt. But if
>>>>> there is a FW it can be either DT or ACPI booted. You also have the
>>>>> acpi_enabled knob.  
>>>>
>>>> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
>>>> "vl.c").
>>>>
>>>> So, the condition for hiding the hotpluggable memory nodes in question
>>>> from the DT is:
>>>>
>>>>   (aarch64 && firmware_loaded && acpi_enabled)
>>> I'd go with this one, though I have a question for the firmware side.
>>> Let's assume that in the future we want to expose hotpluggable & present
>>> memory via GetMemoryMap(), as bare metal does. The guest OS can
>>> theoretically avoid using it for the Normal zone, based on the hint from
>>> the SRAT table early at boot. But what about the firmware: can it inspect
>>> the SRAT table and not use hotpluggable ranges for its own purposes (or at
>>> least not cannibalize them permanently)?
>>
>> This is actually two questions:
>>
>> (a) Can the firmware inspect SRAT?
>>
>> If the SRAT table structure isn't very complex, this is technically
>> doable, but the wrong thing to do, IMO.
>>
>> First, we've tried hard to avoid enlightening the firmware about the
>> semantics of QEMU's ACPI tables.
>>
>> Second, this would introduce an ordering constraint (or callbacks) in
>> the firmware, between the driver that processes & installs the ACPI
>> tables, and the driver that translates the memory nodes of the DT to the
>> memory ranges known to UEFI and the OS.
>>
>> If we need such hinting, then option (2) below (from earlier context)
>> would be better:
>> - If it's OK to use an arm/aarch64 specific solution, then new DT
>> properties should work.
>> - If it should be arch-independent, then a dedicated fw_cfg file would
>> be better.
>>
>> (b) Assuming we have the information from some source, can the firmware
>> expose some memory ranges as "usable RAM" to the OS, while staying away
>> from them for its own (firmware) purposes?
>>
>> After consulting
>>
>>   Table 25. Memory Type Usage before ExitBootServices()
>>   Table 26. Memory Type Usage after ExitBootServices()
>>
>> in UEFI-2.7, I would say that the firmware driver that installs these
>> ranges to the memory (space) map should also allocate the ranges right
>> after, as EfiBootServicesData. This will prevent other drivers /
>> applications in the firmware from allocating chunks out of those areas,
>> and the OS will be at liberty to release and repurpose the ranges after
>> ExitBootServices().
>>
>> Thanks,
>> Laszlo
>>
>>>> (2) Invent and set an "ignore me, firmware" property for the
>>>> hotpluggable memory nodes in the DT, and update the firmware to honor
>>>> that property.
>>>>
>>>> Thanks
>>>> Laszlo
>>>
>>
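
Option (1) quoted above can be sketched in C roughly as follows. This is
only an illustrative model under stated assumptions, not the series'
actual code: struct mem_range, its "hotpluggable" flag and both helper
functions are hypothetical names, and only the condition
(aarch64 && firmware_loaded && acpi_enabled) comes from the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one base/size pair from a DT memory node. */
struct mem_range {
    uint64_t base;
    uint64_t size;
    bool hotpluggable;  /* assumed marker for device-memory (DIMM) ranges */
};

/*
 * Option (1): hide hotpluggable ranges from the DT exactly when
 * (aarch64 && firmware_loaded && acpi_enabled) holds.
 */
static bool hide_hotplug_nodes_from_dt(bool aarch64, bool firmware_loaded,
                                       bool acpi_enabled)
{
    return aarch64 && firmware_loaded && acpi_enabled;
}

/*
 * Copy into 'out' the ranges that would still be described in the DT and
 * return how many were copied. 'out' must have room for 'n' entries.
 */
static size_t build_dt_memory_ranges(const struct mem_range *in, size_t n,
                                     struct mem_range *out,
                                     bool aarch64, bool firmware_loaded,
                                     bool acpi_enabled)
{
    bool hide = hide_hotplug_nodes_from_dt(aarch64, firmware_loaded,
                                           acpi_enabled);
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (hide && in[i].hotpluggable) {
            /* The firmware never sees this range in the DT, so it cannot
             * add it to the UEFI memmap as conventional memory. */
            continue;
        }
        out[count++] = in[i];
    }
    return count;
}
```

With the two ranges from the kernel log (0x40000000 of size 0x80000000 as
regular RAM, 0xc0000000 of size 0x40000000 as device memory), an ACPI boot
with firmware keeps only the first range in the DT, while a "-no-acpi" boot
keeps both.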


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-02 15:52                       ` Laszlo Ersek
@ 2019-04-02 15:56                         ` Laszlo Ersek
  -1 siblings, 0 replies; 95+ messages in thread
From: Laszlo Ersek @ 2019-04-02 15:56 UTC (permalink / raw)
  To: Auger Eric, Igor Mammedov
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Shameerali Kolothum Thodi, Linuxarm,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com, Leif Lindholm

On 04/02/19 17:52, Laszlo Ersek wrote:
> On 04/02/19 17:42, Auger Eric wrote:

>>>>> The firmware does consume DT:
>>>>>
>>>>> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
>>>>> the firmware (for its own information needs), and passed on to the OS.
>>>>>
>>>>> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
>>>>> consumed only by the firmware (for its own information needs), and the
>>>>> DT is hidden from the OS. The OS gets only the ACPI content
>>>>> (processed/prepared as described above).
> 
>> I am actually confused by the above statement. In the above case, what
>> happens if you pass acpi=off in the kernel boot parameters?
> 
> If you launch QEMU with "-no-acpi" and you pass "acpi=off" to the guest
> kernel, then the kernel will not boot successfully, as it will not get
> DT from the firmware, and it will ignore the ACPI tables that it does
> get from the firmware.

Sorry, I ended up answering "what happens when you run QEMU *without*
-no-acpi and pass acpi=off to the guest kernel".

To explain what happens when you boot *with* -no-acpi: in that case,
"acpi=off" doesn't matter, since the guest kernel doesn't get ACPI
tables anyway. The kernel will go for DT.
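
As a compact summary of the four combinations, here is a small C sketch.
It assumes the guest is booted through UEFI firmware as described in this
thread; the enum and the function name are made up for illustration and
do not exist in QEMU or the kernel.

```c
#include <stdbool.h>

/* Which hardware description the guest kernel ends up booting with. */
enum hw_desc {
    HW_DESC_DT,    /* kernel uses the device tree */
    HW_DESC_ACPI,  /* kernel uses the ACPI tables */
    HW_DESC_NONE   /* nothing usable is left, boot fails */
};

/*
 * qemu_no_acpi:    QEMU was started with "-no-acpi"
 * kernel_acpi_off: "acpi=off" is on the guest kernel command line
 */
static enum hw_desc guest_hw_description(bool qemu_no_acpi,
                                         bool kernel_acpi_off)
{
    if (qemu_no_acpi) {
        /* No ACPI tables reach the guest; the firmware passes the DT on,
         * so "acpi=off" makes no difference. */
        return HW_DESC_DT;
    }
    /* Default: the DT is hidden from the OS, which gets only ACPI. If the
     * kernel is told to ignore ACPI, it has nothing to boot from. */
    return kernel_acpi_off ? HW_DESC_NONE : HW_DESC_ACPI;
}
```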

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-02 15:52                       ` Laszlo Ersek
@ 2019-04-02 16:07                         ` Auger Eric
  -1 siblings, 0 replies; 95+ messages in thread
From: Auger Eric @ 2019-04-02 16:07 UTC (permalink / raw)
  To: Laszlo Ersek, Igor Mammedov
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Shameerali Kolothum Thodi, Linuxarm,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com, Leif Lindholm

Laszlo,

On 4/2/19 5:52 PM, Laszlo Ersek wrote:
> On 04/02/19 17:42, Auger Eric wrote:
>> Hi Laszlo,
>>
>> On 4/2/19 12:33 PM, Laszlo Ersek wrote:
>>> On 04/02/19 09:42, Igor Mammedov wrote:
>>>> On Mon, 1 Apr 2019 15:07:05 +0200
>>>> Laszlo Ersek <lersek@redhat.com> wrote:
>>>>
>>>>> On 03/29/19 14:56, Auger Eric wrote:
>>>>>> Hi Ard,
>>>>>>
>>>>>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:  
>>>>>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:  
>>>>>>>>
>>>>>>>> Hi Shameer,
>>>>>>>>
>>>>>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:  
>>>>>>>>>
>>>>>>>>>  
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>>>>>>>> Sent: 29 March 2019 09:32
>>>>>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>>>>>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>>>>>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>>>>>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>>>>>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>>>>>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>>>>>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>>>>>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>>>>>>>>>
>>>>>>>>>> Hi Shameer,
>>>>>>>>>>
>>>>>>>>>> [ + Laszlo, Ard, Leif ]
>>>>>>>>>>
>>>>>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:  
>>>>>>>>>>> This is to disable/enable populating DT nodes in case of
>>>>>>>>>>> any conflict with ACPI tables. The default is "off".  
>>>>>>>>>> The name of the option sounds misleading to me. Also we don't really
>>>>>>>>>> know the scope of the disablement. At the moment this just aims to
>>>>>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
>>>>>>>>>>  
>>>>>>>>>>>
>>>>>>>>>>> This will be used in subsequent patch where cold plug
>>>>>>>>>>> device-memory support is added for DT boot.  
>>>>>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
>>>>>>>>>> any PCDIMM nodes.  
>>>>>>>>>>>
>>>>>>>>>>> If DT memory node support is added for cold-plugged device
>>>>>>>>>>> memory, that memory will be visible to the guest kernel via
>>>>>>>>>>> UEFI GetMemoryMap() and gets treated as early boot memory.  
>>>>>>>>>> Don't we have an issue in UEFI then? Normally the SRAT indicates whether
>>>>>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>>>>>>>>>> info?  
>>>>>>>>>
>>>>>>>>> Sorry I missed this part. Yes, that will be a cleaner solution.
>>>>>>>>>
>>>>>>>>> Also, to be more clear on what happens,
>>>>>>>>>
>>>>>>>>> Guest ACPI boot with "fdt=on" ,
>>>>>>>>>
>>>>>>>>> From kernel log,
>>>>>>>>>
>>>>>>>>> [    0.000000] Early memory node ranges
>>>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
>>>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Guest ACPI boot with "fdt=off" ,
>>>>>>>>>
>>>>>>>>> [    0.000000] Movable zone start for each node
>>>>>>>>> [    0.000000] Early memory node ranges
>>>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>>>>>
>>>>>>>>> The hotpluggable memory node is absent from early memory nodes here.  
>>>>>>>>
>>>>>>>> OK thank you for the example illustrating the concern.  
>>>>>>>>>
>>>>>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.  
>>>>>>>>
>>>>>>>> Let's wait for EDK2 experts on this.
>>>>>>>>  
>>>>>>>
>>>>>>> Happy to chime in, but I need a bit more context here.
>>>>>>>
>>>>>>> What is the problem, how does this patch try to solve it, and why is
>>>>>>> that a bad idea?
>>>>>>>  
>>>>>> Sure, sorry.
>>>>>>
>>>>>> This series:
>>>>>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
>>>>>> https://patchwork.kernel.org/cover/10863301/
>>>>>>
>>>>>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
>>>>>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
>>>>>>
>>>>>> We noticed that if we build the hotpluggable memory dt nodes on top of
>>>>>> the above ACPI tables, the DIMM slots are interpreted as not
>>>>>> hotpluggable memory slots (at least we think so).
>>>>>>
>>>>>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
>>>>>> fact that those slots are exposed as hotpluggable in the SRAT for example.
>>>>>>
>>>>>> So in this series, we are forced to not generate the hotpluggable memory
>>>>>> dt nodes if we want the DIMM slots to be effectively recognized as
>>>>>> hotpluggable.
>>>>>>
>>>>>> Could you confirm we have a correct understanding of the EDK2 behaviour?
>>>>>> If so, would there be any solution for EDK2 to absorb both the DT
>>>>>> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable?
>>>>>>
>>>>>> At qemu level, detecting we are booting in ACPI mode and purposely
>>>>>> removing the above mentioned DT nodes does not look straightforward.  
>>>>>
>>>>> The firmware is not enlightened about the ACPI content that comes from
>>>>> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
>>>>> as instructed through the ACPI linker/loader script, in order to install
>>>>> the ACPI content for the OS. No actual information is consumed by the
>>>>> firmware from the ACPI payload -- and that's a feature.
>>>>>
>>>>> The firmware does consume DT:
>>>>>
>>>>> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
>>>>> the firmware (for its own information needs), and passed on to the OS.
>>>>>
>>>>> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
>>>>> consumed only by the firmware (for its own information needs), and the
>>>>> DT is hidden from the OS. The OS gets only the ACPI content
>>>>> (processed/prepared as described above).
> 
>> I am actually confused by the above statement. In the above case, what
>> happens if you pass acpi=off in the kernel boot parameters?
> 
> If you launch QEMU with "-no-acpi" and you pass "acpi=off" to the guest
> kernel, then the kernel will not boot successfully, as it will not get
> DT from the firmware, and it will ignore the ACPI tables that it does
> get from the firmware.
Yup. Sorry this was hidden in my launch scripts.

Thanks!

Eric
> 
> Thanks
> Laszlo
> 
>>
>> Thanks
>>
>> Eric
>>>>>
>>>>>
>>>>> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
>>>>> base/size pairs in all the memory nodes in the DT. For each such base
>>>>> address that is currently tracked as "nonexistent" in the GCD memory
>>>>> space map, the driver currently adds the base/size range as "system
>>>>> memory". This in turn is reflected by the UEFI memmap that the OS gets
>>>>> to see as "conventional memory".
>>>>>
>>>>> If you need some memory ranges to show up as "special" in the UEFI
>>>>> memmap, then you need to distinguish them somehow from the "regular"
>>>>> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
>>>>> firmware, so that it acts upon the discriminator that you set in the DT.
>>>>>
>>>>>
>>>>> Now... from a brief look at the Platform Init and UEFI specs, my
>>>>> impression is that the hotpluggable (but presently not plugged) DIMM
>>>>> ranges should simply be *absent* from the UEFI memmap; is that correct?
>>>>> (I didn't check the ACPI spec, maybe it specifies the expected behavior
>>>>> in full.) If my impression is correct, then two options (alternatives)
>>>>> exist:
>>>>>
>>>>> (1) Hide the affected memory nodes -- or at least the affected base/size
>>>>> pairs -- from the DT, in case you boot without "-no-acpi" but with an
>>>>> external firmware loaded. Then the firmware will not expose those ranges
>>>>> as "conventional memory" in the UEFI memmap. This approach requires no
>>>>> changes to edk2.
>>>>>
>>>>> This option is precisely what Eric described up-thread, at
>>>>> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
>>>>>
>>>>>> in machvirt_init, there is firmware_loaded that tells you whether you
>>>>>> have a FW image. If this one is not set, you can induce dt. But if
>>>>>> there is a FW it can be either DT or ACPI booted. You also have the
>>>>>> acpi_enabled knob.  
>>>>>
>>>>> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
>>>>> "vl.c").
>>>>>
>>>>> So, the condition for hiding the hotpluggable memory nodes in question
>>>>> from the DT is:
>>>>>
>>>>>   (aarch64 && firmware_loaded && acpi_enabled)
>>>> I'd go with this one, though I have a question for firmware side.
>>>> Let's assume we would want in future to expose hotpluggable & present
>>>> memory via GetMemoryMap() (like bare-metal does) (guest OS theoretically
>>>> can avoid using it for Normal zone based on hint from SRAT table early
>>>> at boot), but what about firmware can it inspect SRAT table and not use
>>>> hotpluggable ranges for its own use (or at least do not cannibalize
>>>> them permanently)?
>>>
>>> This is actually two questions:
>>>
>>> (a) Can the firmware inspect SRAT?
>>>
>>> If the SRAT table structure isn't very complex, this is technically
>>> doable, but the wrong thing to do, IMO.
>>>
>>> First, we've tried hard to avoid enlightening the firmware about the
>>> semantics of QEMU's ACPI tables.
>>>
>>> Second, this would introduce an ordering constraint (or callbacks) in
>>> the firmware, between the driver that processes & installs the ACPI
>>> tables, and the driver that translates the memory nodes of the DT to the
>>> memory ranges known to UEFI and the OS.
>>>
>>> If we need such hinting, then option (2) below (from earlier context)
>>> would be better:
>>> - If it's OK to use an arm/aarch64 specific solution, then new DT
>>> properties should work.
>>> - If it should be arch-independent, then a dedicated fw_cfg file would
>>> be better.
>>>
>>> (b) Assuming we have the information from some source, can the firmware
>>> expose some memory ranges as "usable RAM" to the OS, while staying away
>>> from them for its own (firmware) purposes?
>>>
>>> After consulting
>>>
>>>   Table 25. Memory Type Usage before ExitBootServices()
>>>   Table 26. Memory Type Usage after ExitBootServices()
>>>
>>> in UEFI-2.7, I would say that the firmware driver that installs these
>>> ranges to the memory (space) map should also allocate the ranges right
>>> after, as EfiBootServicesData. This will prevent other drivers /
>>> applications in the firmware from allocating chunks out of those areas,
>>> and the OS will be at liberty to release and repurpose the ranges after
>>> ExitBootServices().
>>>
>>> Thanks,
>>> Laszlo
>>>
>>>>> (2) Invent and set an "ignore me, firmware" property for the
>>>>> hotpluggable memory nodes in the DT, and update the firmware to honor
>>>>> that property.
>>>>>
>>>>> Thanks
>>>>> Laszlo
>>>>
>>>
> 
> 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
@ 2019-04-02 16:07                         ` Auger Eric
  0 siblings, 0 replies; 95+ messages in thread
From: Auger Eric @ 2019-04-02 16:07 UTC (permalink / raw)
  To: Laszlo Ersek, Igor Mammedov
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Shameerali Kolothum Thodi, Linuxarm,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com, Leif Lindholm

Laszlo,

On 4/2/19 5:52 PM, Laszlo Ersek wrote:
> On 04/02/19 17:42, Auger Eric wrote:
>> Hi Laszlo,
>>
>> On 4/2/19 12:33 PM, Laszlo Ersek wrote:
>>> On 04/02/19 09:42, Igor Mammedov wrote:
>>>> On Mon, 1 Apr 2019 15:07:05 +0200
>>>> Laszlo Ersek <lersek@redhat.com> wrote:
>>>>
>>>>> On 03/29/19 14:56, Auger Eric wrote:
>>>>>> Hi Ard,
>>>>>>
>>>>>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:  
>>>>>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:  
>>>>>>>>
>>>>>>>> Hi Shameer,
>>>>>>>>
>>>>>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:  
>>>>>>>>>
>>>>>>>>>  
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>>>>>>>> Sent: 29 March 2019 09:32
>>>>>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>>>>>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>>>>>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>>>>>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>>>>>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>>>>>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>>>>>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>>>>>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>>>>>>>>>
>>>>>>>>>> Hi Shameer,
>>>>>>>>>>
>>>>>>>>>> [ + Laszlo, Ard, Leif ]
>>>>>>>>>>
>>>>>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:  
>>>>>>>>>>> This is to disable/enable populating DT nodes in case
>>>>>>>>>>> any conflict with acpi tables. The default is "off".  
>>>>>>>>>> The name of the option sounds misleading to me. Also we don't really
>>>>>>>>>> know the scope of the disablement. At the moment this just aims to
>>>>>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
>>>>>>>>>>  
>>>>>>>>>>>
>>>>>>>>>>> This will be used in subsequent patch where cold plug
>>>>>>>>>>> device-memory support is added for DT boot.  
>>>>>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
>>>>>>>>>> any PCDIMM nodes.  
>>>>>>>>>>>
>>>>>>>>>>> If DT memory node support is added for cold-plugged device
>>>>>>>>>>> memory, those memory will be visible to Guest kernel via
>>>>>>>>>>> UEFI GetMemoryMap() and gets treated as early boot memory.  
>>>>>>>>>> Don't we have an issue in UEFI then. Normally the SRAT indicates whether
>>>>>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>>>>>>>>>> info.  
>>>>>>>>>
>>>>>>>>> Sorry I missed this part. Yes, that will be a cleaner solution.
>>>>>>>>>
>>>>>>>>> Also, to be more clear on what happens,
>>>>>>>>>
>>>>>>>>> Guest ACPI boot with "fdt=on" ,
>>>>>>>>>
>>>>>>>>> From kernel log,
>>>>>>>>>
>>>>>>>>> [    0.000000] Early memory node ranges
>>>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
>>>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Guest ACPI boot with "fdt=off" ,
>>>>>>>>>
>>>>>>>>> [    0.000000] Movable zone start for each node
>>>>>>>>> [    0.000000] Early memory node ranges
>>>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>>>>>
>>>>>>>>> The hotpluggable memory node is absent from early memory nodes here.  
>>>>>>>>
>>>>>>>> OK thank you for the example illustrating the concern.  
>>>>>>>>>
>>>>>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.  
>>>>>>>>
>>>>>>>> Let's wait for EDK2 experts on this.
>>>>>>>>  
>>>>>>>
>>>>>>> Happy to chime in, but I need a bit more context here.
>>>>>>>
>>>>>>> What is the problem, how does this patch try to solve it, and why is
>>>>>>> that a bad idea?
>>>>>>>  
>>>>>> Sure, sorry.
>>>>>>
>>>>>> This series:
>>>>>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
>>>>>> https://patchwork.kernel.org/cover/10863301/
>>>>>>
>>>>>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
>>>>>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
>>>>>>
>>>>>> We noticed that if we build the hotpluggable memory dt nodes on top of
>>>>>> the above ACPI tables, the DIMM slots are interpreted as not
>>>>>> hotpluggable memory slots (at least we think so).
>>>>>>
>>>>>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
>>>>>> fact that those slots are exposed as hotpluggable in the SRAT for example.
>>>>>>
>>>>>> So in this series, we are forced to not generate the hotpluggable memory
>>>>>> dt nodes if we want the DIMM slots to be effectively recognized as
>>>>>> hotpluggable.
>>>>>>
>>>>>> Could you confirm we have a correct understanding of the EDK2 behaviour
>>>>>> and if so, would there be any solution for EDK2 to absorb both the DT
>>>>>> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
>>>>>>
>>>>>> At qemu level, detecting we are booting in ACPI mode and purposely
>>>>>> removing the above mentioned DT nodes does not look straightforward.  
>>>>>
>>>>> The firmware is not enlightened about the ACPI content that comes from
>>>>> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
>>>>> as instructed through the ACPI linker/loader script, in order to install
>>>>> the ACPI content for the OS. No actual information is consumed by the
>>>>> firmware from the ACPI payload -- and that's a feature.
>>>>>
>>>>> The firmware does consume DT:
>>>>>
>>>>> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
>>>>> the firmware (for its own information needs), and passed on to the OS.
>>>>>
>>>>> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
>>>>> consumed only by the firmware (for its own information needs), and the
>>>>> DT is hidden from the OS. The OS gets only the ACPI content
>>>>> (processed/prepared as described above).
> 
>> I am confused by the above statement actually. In the above case what
>> does happen if you pass the acpi=off in the kernel boot parameters?
> 
> If you launch QEMU without "-no-acpi" and you pass "acpi=off" to the guest
> kernel, then the kernel will not boot successfully, as it will not get
> DT from the firmware, and it will ignore the ACPI tables that it does
> get from the firmware.
Yup. Sorry this was hidden in my launch scripts.

Thanks!

Eric
> 
> Thanks
> Laszlo
> 
>>
>> Thanks
>>
>> Eric
>>>>>
>>>>>
>>>>> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
>>>>> base/size pairs in all the memory nodes in the DT. For each such base
>>>>> address that is currently tracked as "nonexistent" in the GCD memory
>>>>> space map, the driver currently adds the base/size range as "system
>>>>> memory". This in turn is reflected by the UEFI memmap that the OS gets
>>>>> to see as "conventional memory".
>>>>>
>>>>> If you need some memory ranges to show up as "special" in the UEFI
>>>>> memmap, then you need to distinguish them somehow from the "regular"
>>>>> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
>>>>> firmware, so that it acts upon the discriminator that you set in the DT.
>>>>>
>>>>>
>>>>> Now... from a brief look at the Platform Init and UEFI specs, my
>>>>> impression is that the hotpluggable (but presently not plugged) DIMM
>>>>> ranges should simply be *absent* from the UEFI memmap; is that correct?
>>>>> (I didn't check the ACPI spec, maybe it specifies the expected behavior
>>>>> in full.) If my impression is correct, then two options (alternatives)
>>>>> exist:
>>>>>
>>>>> (1) Hide the affected memory nodes -- or at least the affected base/size
>>>>> pairs -- from the DT, in case you boot without "-no-acpi" but with an
>>>>> external firmware loaded. Then the firmware will not expose those ranges
>>>>> as "conventional memory" in the UEFI memmap. This approach requires no
>>>>> changes to edk2.
>>>>>
>>>>> This option is precisely what Eric described up-thread, at
>>>>> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
>>>>>
>>>>>> in machvirt_init, there is firmware_loaded that tells you whether you
>>>>>> have a FW image. If this one is not set, you can induce dt. But if
>>>>>> there is a FW it can be either DT or ACPI booted. You also have the
>>>>>> acpi_enabled knob.  
>>>>>
>>>>> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
>>>>> "vl.c").
>>>>>
>>>>> So, the condition for hiding the hotpluggable memory nodes in question
>>>>> from the DT is:
>>>>>
>>>>>   (aarch64 && firmware_loaded && acpi_enabled)
>>>> I'd go with this one, though I have a question for firmware side.
>>>> Let's assume we would want in future to expose hotpluggable & present
>>>> memory via GetMemoryMap() (like bare-metal does) (guest OS theoretically
>>>> can avoid using it for Normal zone based on hint from SRAT table early
>>>> at boot), but what about firmware can it inspect SRAT table and not use
>>>> hotpluggable ranges for its own use (or at least do not cannibalize
>>>> them permanently)?
>>>
>>> This is actually two questions:
>>>
>>> (a) Can the firmware inspect SRAT?
>>>
>>> If the SRAT table structure isn't very complex, this is technically
>>> doable, but the wrong thing to do, IMO.
>>>
>>> First, we've tried hard to avoid enlightening the firmware about the
>>> semantics of QEMU's ACPI tables.
>>>
>>> Second, this would introduce an ordering constraint (or callbacks) in
>>> the firmware, between the driver that processes & installs the ACPI
>>> tables, and the driver that translates the memory nodes of the DT to the
>>> memory ranges known to UEFI and the OS.
>>>
>>> If we need such hinting, then option (2) below (from earlier context)
>>> would be better:
>>> - If it's OK to use an arm/aarch64 specific solution, then new DT
>>> properties should work.
>>> - If it should be arch-independent, then a dedicated fw_cfg file would
>>> be better.
>>>
>>> (b) Assuming we have the information from some source, can the firmware
>>> expose some memory ranges as "usable RAM" to the OS, while staying away
>>> from them for its own (firmware) purposes?
>>>
>>> After consulting
>>>
>>>   Table 25. Memory Type Usage before ExitBootServices()
>>>   Table 26. Memory Type Usage after ExitBootServices()
>>>
>>> in UEFI-2.7, I would say that the firmware driver that installs these
>>> ranges to the memory (space) map should also allocate the ranges right
>>> after, as EfiBootServicesData. This will prevent other drivers /
>>> applications in the firmware from allocating chunks out of those areas,
>>> and the OS will be at liberty to release and repurpose the ranges after
>>> ExitBootServices().
>>>
>>> Thanks,
>>> Laszlo
>>>
>>>>> (2) Invent and set an "ignore me, firmware" property for the
>>>>> hotpluggable memory nodes in the DT, and update the firmware to honor
>>>>> that property.
>>>>>
>>>>> Thanks
>>>>> Laszlo
>>>>
>>>
> 
> 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-02 15:38                   ` Laszlo Ersek
@ 2019-04-03  9:49                     ` Igor Mammedov
  -1 siblings, 0 replies; 95+ messages in thread
From: Igor Mammedov @ 2019-04-03  9:49 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Shameerali Kolothum Thodi, Linuxarm,
	Auger Eric, shannon.zhaosl@gmail.com, qemu-arm@nongnu.org,
	xuwei (O), sebastien.boeuf@intel.com, Leif Lindholm

On Tue, 2 Apr 2019 17:38:26 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

> On 04/02/19 17:29, Auger Eric wrote:
> > Hi Laszlo,
> > 
> > On 4/1/19 3:07 PM, Laszlo Ersek wrote:  
> >> On 03/29/19 14:56, Auger Eric wrote:  
> >>> Hi Ard,
> >>>
> >>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:  
> >>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:  
> >>>>>
> >>>>> Hi Shameer,
> >>>>>
> >>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:  
> >>>>>>
> >>>>>>  
> >>>>>>> -----Original Message-----
> >>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
> >>>>>>> Sent: 29 March 2019 09:32
> >>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> >>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
> >>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> >>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
> >>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
> >>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
> >>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
> >>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
> >>>>>>>
> >>>>>>> Hi Shameer,
> >>>>>>>
> >>>>>>> [ + Laszlo, Ard, Leif ]
> >>>>>>>
> >>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:  
> >>>>>>>> This is to disable/enable populating DT nodes in case
> >>>>>>>> any conflict with acpi tables. The default is "off".  
> >>>>>>> The name of the option sounds misleading to me. Also we don't really
> >>>>>>> know the scope of the disablement. At the moment this just aims to
> >>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
> >>>>>>>  
> >>>>>>>>
> >>>>>>>> This will be used in subsequent patch where cold plug
> >>>>>>>> device-memory support is added for DT boot.  
> >>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
> >>>>>>> any PCDIMM nodes.  
> >>>>>>>>
> >>>>>>>> If DT memory node support is added for cold-plugged device
> >>>>>>>> memory, those memory will be visible to Guest kernel via
> >>>>>>>> UEFI GetMemoryMap() and gets treated as early boot memory.  
> >>>>>>> Don't we have an issue in UEFI then. Normally the SRAT indicates whether
> >>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
> >>>>>>> info.  
> >>>>>>
> >>>>>> Sorry I missed this part. Yes, that will be a cleaner solution.
> >>>>>>
> >>>>>> Also, to be more clear on what happens,
> >>>>>>
> >>>>>> Guest ACPI boot with "fdt=on" ,
> >>>>>>
> >>>>>> From kernel log,
> >>>>>>
> >>>>>> [    0.000000] Early memory node ranges
> >>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
> >>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
> >>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
> >>>>>>
> >>>>>>
> >>>>>> Guest ACPI boot with "fdt=off" ,
> >>>>>>
> >>>>>> [    0.000000] Movable zone start for each node
> >>>>>> [    0.000000] Early memory node ranges
> >>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
> >>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
> >>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
> >>>>>>
> >>>>>> The hotpluggable memory node is absent from early memory nodes here.  
> >>>>>
> >>>>> OK thank you for the example illustrating the concern.  
> >>>>>>
> >>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.  
> >>>>>
> >>>>> Let's wait for EDK2 experts on this.
> >>>>>  
> >>>>
> >>>> Happy to chime in, but I need a bit more context here.
> >>>>
> >>>> What is the problem, how does this patch try to solve it, and why is
> >>>> that a bad idea?
> >>>>  
> >>> Sure, sorry.
> >>>
> >>> This series:
> >>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
> >>> https://patchwork.kernel.org/cover/10863301/
> >>>
> >>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
> >>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
> >>>
> >>> We noticed that if we build the hotpluggable memory dt nodes on top of
> >>> the above ACPI tables, the DIMM slots are interpreted as not
> >>> hotpluggable memory slots (at least we think so).
> >>>
> >>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
> >>> fact that those slots are exposed as hotpluggable in the SRAT for example.
> >>>
> >>> So in this series, we are forced to not generate the hotpluggable memory
> >>> dt nodes if we want the DIMM slots to be effectively recognized as
> >>> hotpluggable.
> >>>
> >>> Could you confirm we have a correct understanding of the EDK2 behaviour
> >>> and if so, would there be any solution for EDK2 to absorb both the DT
> >>> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
> >>>
> >>> At qemu level, detecting we are booting in ACPI mode and purposely
> >>> removing the above mentioned DT nodes does not look straightforward.  
> >>
> >> The firmware is not enlightened about the ACPI content that comes from
> >> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
> >> as instructed through the ACPI linker/loader script, in order to install
> >> the ACPI content for the OS. No actual information is consumed by the
> >> firmware from the ACPI payload -- and that's a feature.
> >>
> >> The firmware does consume DT:
> >>
> >> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
> >> the firmware (for its own information needs), and passed on to the OS.
> >>
> >> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
> >> consumed only by the firmware (for its own information needs), and the
> >> DT is hidden from the OS. The OS gets only the ACPI content
> >> (processed/prepared as described above).
> >>
> >>
> >> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
> >> base/size pairs in all the memory nodes in the DT. For each such base
> >> address that is currently tracked as "nonexistent" in the GCD memory
> >> space map, the driver currently adds the base/size range as "system
> >> memory". This in turn is reflected by the UEFI memmap that the OS gets
> >> to see as "conventional memory".
> >>
> >> If you need some memory ranges to show up as "special" in the UEFI
> >> memmap, then you need to distinguish them somehow from the "regular"
> >> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
> >> firmware, so that it acts upon the discriminator that you set in the DT.
> >>
> >>
> >> Now... from a brief look at the Platform Init and UEFI specs, my
> >> impression is that the hotpluggable (but presently not plugged) DIMM
> >> ranges should simply be *absent* from the UEFI memmap; is that correct?
> >> (I didn't check the ACPI spec, maybe it specifies the expected behavior
> >> in full.) If my impression is correct, then two options (alternatives)
> >> exist:
> >>
> >> (1) Hide the affected memory nodes -- or at least the affected base/size
> >> pairs -- from the DT, in case you boot without "-no-acpi" but with an
> >> external firmware loaded. Then the firmware will not expose those ranges
> >> as "conventional memory" in the UEFI memmap. This approach requires no
> >> changes to edk2.
> >>
> >> This option is precisely what Eric described up-thread, at
> >> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
> >>  
> >>> in machvirt_init, there is firmware_loaded that tells you whether you
> >>> have a FW image. If this one is not set, you can induce dt. But if
> >>> there is a FW it can be either DT or ACPI booted. You also have the
> >>> acpi_enabled knob.  
> >>
> >> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
> >> "vl.c").
> >>
> >> So, the condition for hiding the hotpluggable memory nodes in question
> >> from the DT is:  
> >   
> >>
> >>   (aarch64 && firmware_loaded && acpi_enabled)  
> > 
> > Thanks a lot for all those inputs!
> > 
> > I don't get why we test aarch64 in above condition (this was useful for
> > high ECAM range as the aarch32 FW was not supporting it but here, is it
> > still meaningful?)  
> 
> Sorry, I should have clarified that. Yes, it is meaningful:
> 
> While UEFI has bindings for both 32-bit and 64-bit ARM, ACPI has a
> 64-bit-only binding for ARM. (And you can have UEFI without ACPI, but
> not the reverse, on ARM.) So if you run the 32-bit build of the
> ArmVirtQemu firmware, you get no ACPI at all; all you can rely on with
> the OS is the DT.
> 
> This "bitness distinction" is implemented in the firmware already. If
> you hid the memory nodes from the DT under the condition
> 
>   (!aarch64 && firmware_loaded && acpi_enabled)
> 
> then the nodes would not be seen by the OS at all (because
> "acpi_enabled" is irrelevant for the 32-bit build of ArmVirtQemu, and
> all the OS can ever get is DT).

It's getting tricky, and I don't like it one bit that we are trying to cater
to a 64-bit-only UEFI build (or any other build) on the QEMU side. Also, Peter
has a valid point about guessing on the QEMU side (that's usually a source of
problems in the future).

Perhaps we should reconsider and think about marking hotpluggable RAM
in the DT and letting the firmware exclude it from the memory map.
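
As a rough illustration of that idea, here is a sketch of a firmware-side
filter. Everything in it is hypothetical: the struct, the "hotpluggable"
flag (standing in for whatever DT property would be agreed on) and the
function name are assumptions, and this is not HighMemDxe code. It only
shows the intended effect, i.e. marked nodes never reach the memory map:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of one DT memory node. */
struct mem_node {
    uint64_t base;
    uint64_t size;
    bool hotpluggable;   /* assumed marker property, name not settled */
};

/*
 * Copy every node that is not marked hotpluggable into `out` and return
 * how many were copied. `out` must have room for `n` entries.
 */
size_t build_boot_memmap(const struct mem_node *nodes, size_t n,
                         struct mem_node *out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (nodes[i].hotpluggable) {
            continue;   /* left out of the UEFI memory map */
        }
        out[kept++] = nodes[i];
    }
    return kept;
}
```

With the ranges from the kernel log quoted above (boot RAM at 0x40000000
with size 0x80000000, device memory at 0xc0000000 with size 0x40000000
marked as hotpluggable), only the first range would be kept.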

> Thanks,
> Laszlo
> 
> > 
> > Thanks
> > 
> > Eric
> >   
> >>
> >>
> >> (2) Invent and set an "ignore me, firmware" property for the
> >> hotpluggable memory nodes in the DT, and update the firmware to honor
> >> that property.
> >>
> >> Thanks
> >> Laszlo
> >>  
> 
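
For reference, the option (1) condition quoted above can be written out
as a small predicate. This is only a sketch: the function name is made
up, and the three inputs simply mirror the terms of the quoted condition,
with the reasoning from the thread attached as comments:

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not QEMU code: returns true when the hotpluggable
 * memory nodes would be left out of the DT under option (1).
 */
bool hide_hotplug_nodes_from_dt(bool aarch64, bool firmware_loaded,
                                bool acpi_enabled)
{
    /* The 32-bit firmware build exposes no ACPI; the OS only has the DT. */
    if (!aarch64) {
        return false;
    }
    /* Without a firmware image the guest is booted with the DT. */
    if (!firmware_loaded) {
        return false;
    }
    /* "-no-acpi" clears acpi_enabled, and the DT is passed on to the OS. */
    return acpi_enabled;
}
```

The nodes are hidden only when all three terms are true; every other
combination keeps them in the DT.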


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
@ 2019-04-03  9:49                     ` Igor Mammedov
  0 siblings, 0 replies; 95+ messages in thread
From: Igor Mammedov @ 2019-04-03  9:49 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Auger Eric, Ard Biesheuvel, peter.maydell@linaro.org,
	sameo@linux.intel.com, qemu-devel@nongnu.org,
	Shameerali Kolothum Thodi, Linuxarm, shannon.zhaosl@gmail.com,
	qemu-arm@nongnu.org, xuwei (O), sebastien.boeuf@intel.com,
	Leif Lindholm

On Tue, 2 Apr 2019 17:38:26 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

> On 04/02/19 17:29, Auger Eric wrote:
> > Hi Laszlo,
> > 
> > On 4/1/19 3:07 PM, Laszlo Ersek wrote:  
> >> On 03/29/19 14:56, Auger Eric wrote:  
> >>> Hi Ard,
> >>>
> >>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:  
> >>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:  
> >>>>>
> >>>>> Hi Shameer,
> >>>>>
> >>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:  
> >>>>>>
> >>>>>>  
> >>>>>>> -----Original Message-----
> >>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
> >>>>>>> Sent: 29 March 2019 09:32
> >>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> >>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
> >>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> >>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
> >>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
> >>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
> >>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
> >>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
> >>>>>>>
> >>>>>>> Hi Shameer,
> >>>>>>>
> >>>>>>> [ + Laszlo, Ard, Leif ]
> >>>>>>>
> >>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:  
> >>>>>>>> This is to disable/enable populating DT nodes in case
> >>>>>>>> any conflict with acpi tables. The default is "off".  
> >>>>>>> The name of the option sounds misleading to me. Also we don't really
> >>>>>>> know the scope of the disablement. At the moment this just aims to
> >>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
> >>>>>>>  
> >>>>>>>>
> >>>>>>>> This will be used in subsequent patch where cold plug
> >>>>>>>> device-memory support is added for DT boot.  
> >>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
> >>>>>>> any PCDIMM nodes.  
> >>>>>>>>
> >>>>>>>> If DT memory node support is added for cold-plugged device
> >>>>>>>> memory, those memory will be visible to Guest kernel via
> >>>>>>>> UEFI GetMemoryMap() and gets treated as early boot memory.  
> >>>>>>> Don't we have an issue in UEFI then. Normally the SRAT indicates whether
> >>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
> >>>>>>> info.  
> >>>>>>
> >>>>>> Sorry I missed this part. Yes, that will be a cleaner solution.
> >>>>>>
> >>>>>> Also, to be more clear on what happens,
> >>>>>>
> >>>>>> Guest ACPI boot with "fdt=on" ,
> >>>>>>
> >>>>>> From kernel log,
> >>>>>>
> >>>>>> [    0.000000] Early memory node ranges
> >>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
> >>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
> >>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
> >>>>>>
> >>>>>>
> >>>>>> Guest ACPI boot with "fdt=off" ,
> >>>>>>
> >>>>>> [    0.000000] Movable zone start for each node
> >>>>>> [    0.000000] Early memory node ranges
> >>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
> >>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
> >>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
> >>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
> >>>>>>
> >>>>>> The hotpluggable memory node is absent from early memory nodes here.  
> >>>>>
> >>>>> OK thank you for the example illustrating the concern.  
> >>>>>>
> >>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.  
> >>>>>
> >>>>> Let's wait for EDK2 experts on this.
> >>>>>  
> >>>>
> >>>> Happy to chime in, but I need a bit more context here.
> >>>>
> >>>> What is the problem, how does this patch try to solve it, and why is
> >>>> that a bad idea?
> >>>>  
> >>> Sure, sorry.
> >>>
> >>> This series:
> >>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
> >>> https://patchwork.kernel.org/cover/10863301/
> >>>
> >>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
> >>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
> >>>
> >>> We noticed that if we build the hotpluggable memory dt nodes on top of
> >>> the above ACPI tables, the DIMM slots are interpreted as not
> >>> hotpluggable memory slots (at least we think so).
> >>>
> >>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
> >>> fact that those slots are exposed as hotpluggable in the SRAT for example.
> >>>
> >>> So in this series, we are forced to not generate the hotpluggable memory
> >>> dt nodes if we want the DIMM slots to be effectively recognized as
> >>> hotpluggable.
> >>>
> >>> Could you confirm we have a correct understanding of the EDK2 behaviour
> >>> and if so, would there be any solution for EDK2 to absorb both the DT
> >>> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
> >>>
> >>> At qemu level, detecting we are booting in ACPI mode and purposely
> >>> removing the above mentioned DT nodes does not look straightforward.  
> >>
> >> The firmware is not enlightened about the ACPI content that comes from
> >> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
> >> as instructed through the ACPI linker/loader script, in order to install
> >> the ACPI content for the OS. No actual information is consumed by the
> >> firmware from the ACPI payload -- and that's a feature.
> >>
> >> The firmware does consume DT:
> >>
> >> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
> >> the firmware (for its own information needs), and passed on to the OS.
> >>
> >> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
> >> consumed only by the firmware (for its own information needs), and the
> >> DT is hidden from the OS. The OS gets only the ACPI content
> >> (processed/prepared as described above).
> >>
> >>
> >> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
> >> base/size pairs in all the memory nodes in the DT. For each such base
> >> address that is currently tracked as "nonexistent" in the GCD memory
> >> space map, the driver currently adds the base/size range as "system
> >> memory". This in turn is reflected by the UEFI memmap that the OS gets
> >> to see as "conventional memory".
> >>
> >> If you need some memory ranges to show up as "special" in the UEFI
> >> memmap, then you need to distinguish them somehow from the "regular"
> >> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
> >> firmware, so that it acts upon the discriminator that you set in the DT.
> >>
> >>
> >> Now... from a brief look at the Platform Init and UEFI specs, my
> >> impression is that the hotpluggable (but presently not plugged) DIMM
> >> ranges should simply be *absent* from the UEFI memmap; is that correct?
> >> (I didn't check the ACPI spec, maybe it specifies the expected behavior
> >> in full.) If my impression is correct, then two options (alternatives)
> >> exist:
> >>
> >> (1) Hide the affected memory nodes -- or at least the affected base/size
> >> pairs -- from the DT, in case you boot without "-no-acpi" but with an
> >> external firmware loaded. Then the firmware will not expose those ranges
> >> as "conventional memory" in the UEFI memmap. This approach requires no
> >> changes to edk2.
> >>
> >> This option is precisely what Eric described up-thread, at
> >> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
> >>  
> >>> in machvirt_init, there is firmware_loaded that tells you whether you
> >>> have a FW image. If this one is not set, you can induce dt. But if
> >>> there is a FW it can be either DT or ACPI booted. You also have the
> >>> acpi_enabled knob.  
> >>
> >> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
> >> "vl.c").
> >>
> >> So, the condition for hiding the hotpluggable memory nodes in question
> >> from the DT is:  
> >   
> >>
> >>   (aarch64 && firmware_loaded && acpi_enabled)  
> > 
> > Thanks a lot for all those inputs!
> > 
> > I don't get why we test aarch64 in above condition (this was useful for
> > high ECAM range as the aarch32 FW was not supporting it but here, is it
> > still meaningful?)  
> 
> Sorry, I should have clarified that. Yes, it is meaningful:
> 
> While UEFI has bindings for both 32-bit and 64-bit ARM, ACPI has a
> 64-bit-only binding for ARM. (And you can have UEFI without ACPI, but
> not the reverse, on ARM.) So if you run the 32-bit build of the
> ArmVirtQemu firmware, you get no ACPI at all; all you can rely on with
> the OS is the DT.
> 
> This "bitness distinction" is implemented in the firmware already. If
> you hid the memory nodes from the DT under the condition
> 
>   (!aarch64 && firmware_loaded && acpi_enabled)
> 
> then the nodes would not be seen by the OS at all (because
> "acpi_enabled" is irrelevant for the 32-bit build of ArmVirtQemu, and
> all the OS can ever get is DT).

It's getting tricky, and I don't like one bit that we are trying to cater to
a 64-bit-only UEFI build (or any other build) on the QEMU side. Also, Peter has
a valid point about guessing on the QEMU side (that's usually a source of
problems in the future).

Perhaps we should reconsider and think about marking hotpluggable RAM
in the DT and letting the firmware exclude it from the memory map.
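
To make that a bit more concrete, here is a rough sketch of what the
firmware side could do. This is not edk2 code: struct mem_range, its
"hotpluggable" flag and filter_boot_memory() are made-up names for
illustration only, and how the marker would actually be encoded in the DT
is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one base/size pair of a DT memory node. */
struct mem_range {
    uint64_t base;
    uint64_t size;
    bool hotpluggable;   /* hypothetical marker set by QEMU in the DT */
};

/*
 * Copy the ranges that should be reported as conventional memory into
 * 'out' and return how many were kept. Ranges carrying the hotpluggable
 * marker are skipped, so they never reach the memory map.
 * 'out' must have room for at least 'n' entries.
 */
size_t filter_boot_memory(const struct mem_range *in, size_t n,
                          struct mem_range *out)
{
    size_t kept = 0;

    assert(n == 0 || (in != NULL && out != NULL));

    for (size_t i = 0; i < n; i++) {
        if (in[i].hotpluggable) {
            continue;   /* leave it for ACPI hotplug to announce later */
        }
        out[kept++] = in[i];
    }
    return kept;
}
```

With the two ranges from the kernel log quoted above (0x40000000-0xbfffffff
as boot RAM, 0xc0000000-0xffffffff as the hotpluggable node), only the first
one would be kept, assuming the second carries the marker.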

> Thanks,
> Laszlo
> 
> > 
> > Thanks
> > 
> > Eric
> >   
> >>
> >>
> >> (2) Invent and set an "ignore me, firmware" property for the
> >> hotpluggable memory nodes in the DT, and update the firmware to honor
> >> that property.
> >>
> >> Thanks
> >> Laszlo
> >>  
> 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-03  9:49                     ` Igor Mammedov
@ 2019-04-03 12:10                       ` Shameerali Kolothum Thodi
  -1 siblings, 0 replies; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-04-03 12:10 UTC (permalink / raw)
  To: Igor Mammedov, Laszlo Ersek
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Linuxarm, Auger Eric,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com, Leif Lindholm



> -----Original Message-----
> From: Igor Mammedov [mailto:imammedo@redhat.com]
> Sent: 03 April 2019 10:49
> To: Laszlo Ersek <lersek@redhat.com>
> Cc: Auger Eric <eric.auger@redhat.com>; Ard Biesheuvel
> <ard.biesheuvel@linaro.org>; peter.maydell@linaro.org;
> sameo@linux.intel.com; qemu-devel@nongnu.org; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>; Linuxarm
> <linuxarm@huawei.com>; shannon.zhaosl@gmail.com;
> qemu-arm@nongnu.org; xuwei (O) <xuwei5@huawei.com>;
> sebastien.boeuf@intel.com; Leif Lindholm <Leif.Lindholm@arm.com>
> Subject: Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in
> feature "fdt"
> 
> On Tue, 2 Apr 2019 17:38:26 +0200
> Laszlo Ersek <lersek@redhat.com> wrote:

[...]

> > >>> Sure, sorry.
> > >>>
> > >>> This series:
> > >>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
> > >>> https://patchwork.kernel.org/cover/10863301/
> > >>>
> > >>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
> > >>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
> > >>>
> > >>> We noticed that if we build the hotpluggable memory dt nodes on top of
> > >>> the above ACPI tables, the DIMM slots are interpreted as not
> > >>> hotpluggable memory slots (at least we think so).
> > >>>
> > >>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
> > >>> fact that those slots are exposed as hotpluggable in the SRAT for example.
> > >>>
> > >>> So in this series, we are forced to not generate the hotpluggable memory
> > >>> dt nodes if we want the DIMM slots to be effectively recognized as
> > >>> hotpluggable.
> > >>>
> > >>> Could you confirm we have a correct understanding of the EDK2 behaviour
> > >>> and if so, would there be any solution for EDK2 to absorb both the DT
> > >>> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
> > >>>
> > >>> At qemu level, detecting we are booting in ACPI mode and purposely
> > >>> removing the above mentioned DT nodes does not look straightforward.
> > >>
> > >> The firmware is not enlightened about the ACPI content that comes from
> > >> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
> > >> as instructed through the ACPI linker/loader script, in order to install
> > >> the ACPI content for the OS. No actual information is consumed by the
> > >> firmware from the ACPI payload -- and that's a feature.
> > >>
> > >> The firmware does consume DT:
> > >>
> > >> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
> > >> the firmware (for its own information needs), and passed on to the OS.
> > >>
> > >> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
> > >> consumed only by the firmware (for its own information needs), and the
> > >> DT is hidden from the OS. The OS gets only the ACPI content
> > >> (processed/prepared as described above).
> > >>
> > >>
> > >> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
> > >> base/size pairs in all the memory nodes in the DT. For each such base
> > >> address that is currently tracked as "nonexistent" in the GCD memory
> > >> space map, the driver currently adds the base/size range as "system
> > >> memory". This in turn is reflected by the UEFI memmap that the OS gets
> > >> to see as "conventional memory".
> > >>
> > >> If you need some memory ranges to show up as "special" in the UEFI
> > >> memmap, then you need to distinguish them somehow from the "regular"
> > >> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
> > >> firmware, so that it acts upon the discriminator that you set in the DT.
> > >>
> > >>
> > >> Now... from a brief look at the Platform Init and UEFI specs, my
> > >> impression is that the hotpluggable (but presently not plugged) DIMM
> > >> ranges should simply be *absent* from the UEFI memmap; is that correct?
> > >> (I didn't check the ACPI spec, maybe it specifies the expected behavior
> > >> in full.) If my impression is correct, then two options (alternatives)
> > >> exist:
> > >>
> > >> (1) Hide the affected memory nodes -- or at least the affected base/size
> > >> pairs -- from the DT, in case you boot without "-no-acpi" but with an
> > >> external firmware loaded. Then the firmware will not expose those ranges
> > >> as "conventional memory" in the UEFI memmap. This approach requires no
> > >> changes to edk2.
> > >>
> > >> This option is precisely what Eric described up-thread, at
> > >> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
> > >>
> > >>> in machvirt_init, there is firmware_loaded that tells you whether you
> > >>> have a FW image. If this one is not set, you can induce dt. But if
> > >>> there is a FW it can be either DT or ACPI booted. You also have the
> > >>> acpi_enabled knob.
> > >>
> > >> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
> > >> "vl.c").
> > >>
> > >> So, the condition for hiding the hotpluggable memory nodes in question
> > >> from the DT is:
> > >
> > >>
> > >>   (aarch64 && firmware_loaded && acpi_enabled)
> > >
> > > Thanks a lot for all those inputs!
> > >
> > > I don't get why we test aarch64 in above condition (this was useful for
> > > high ECAM range as the aarch32 FW was not supporting it but here, is it
> > > still meaningful?)
> >
> > Sorry, I should have clarified that. Yes, it is meaningful:
> >
> > While UEFI has bindings for both 32-bit and 64-bit ARM, ACPI has a
> > 64-bit-only binding for ARM. (And you can have UEFI without ACPI, but
> > not the reverse, on ARM.) So if you run the 32-bit build of the
> > ArmVirtQemu firmware, you get no ACPI at all; all you can rely on with
> > the OS is the DT.

Just to confirm: does that mean that with the 32-bit build of UEFI, the OS
cannot boot with ACPI and uses DT only? So,

if (aarch64 && firmware_loaded && acpi_enabled) {
   hide_hotpluggable_memory_nodes();
} else {
   add_hotpluggable_memory_nodes();
}

should work for all cases?
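
To spell out what I am assuming, here is a small stand-alone sketch (not
QEMU code; all function names below are made up for illustration). It
takes Laszlo's explanation as given, namely that the OS only ever gets
ACPI through a 64-bit firmware build with ACPI enabled, and then walks all
eight combinations to check that a DT-booted OS never loses the nodes.

```c
#include <assert.h>
#include <stdbool.h>

enum os_hw_desc { OS_USES_DT, OS_USES_ACPI };

/*
 * Assumption taken from the thread: ACPI reaches the OS only through a
 * 64-bit firmware build with ACPI enabled; otherwise the OS gets the DT.
 */
enum os_hw_desc os_hw_description(bool aarch64, bool firmware_loaded,
                                  bool acpi_enabled)
{
    return (aarch64 && firmware_loaded && acpi_enabled) ? OS_USES_ACPI
                                                        : OS_USES_DT;
}

/* The proposed condition for leaving the hotpluggable nodes out of the DT. */
bool hide_hotpluggable_nodes(bool aarch64, bool firmware_loaded,
                             bool acpi_enabled)
{
    return aarch64 && firmware_loaded && acpi_enabled;
}

/* Walk all eight combinations; return how many of them hide the nodes. */
int count_hidden_cases(void)
{
    int hidden = 0;

    for (int bits = 0; bits < 8; bits++) {
        bool aarch64 = bits & 1;
        bool firmware_loaded = bits & 2;
        bool acpi_enabled = bits & 4;
        bool hide = hide_hotpluggable_nodes(aarch64, firmware_loaded,
                                            acpi_enabled);

        /* A DT-booted OS must still see the hotpluggable nodes. */
        assert(!(hide && os_hw_description(aarch64, firmware_loaded,
                                           acpi_enabled) == OS_USES_DT));
        hidden += hide;
    }
    return hidden;
}
```

Under that assumption the nodes are hidden in exactly one of the eight
cases. It does not cover a guest that is offered ACPI but chooses to boot
with DT anyway, which QEMU cannot know in advance.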

> > This "bitness distinction" is implemented in the firmware already. If
> > you hid the memory nodes from the DT under the condition
> >
> >   (!aarch64 && firmware_loaded && acpi_enabled)
> >
> > then the nodes would not be seen by the OS at all (because
> > "acpi_enabled" is irrelevant for the 32-bit build of ArmVirtQemu, and
> > all the OS can ever get is DT).
> 
> It's getting tricky, and I don't like one bit that we are trying to cater to
> a 64-bit-only UEFI build (or any other build) on the QEMU side. Also, Peter has
> a valid point about guessing on the QEMU side (that's usually a source of
> problems in the future).

If the above is correct (with the 32-bit variant of UEFI, the OS cannot boot
with ACPI), then do we really have the issue of memory becoming
non-hot-unpluggable? Maybe I am missing something.

Thanks,
Shameer
 
> Perhaps we should reconsider and think about marking hotpluggable RAM
> in the DT and letting the firmware exclude it from the memory map.
> 
> > Thanks,
> > Laszlo
> >
> > >
> > > Thanks
> > >
> > > Eric
> > >
> > >>
> > >>
> > >> (2) Invent and set an "ignore me, firmware" property for the
> > >> hotpluggable memory nodes in the DT, and update the firmware to honor
> > >> that property.
> > >>
> > >> Thanks
> > >> Laszlo
> > >>
> >


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-03  9:49                     ` Igor Mammedov
@ 2019-04-03 13:19                       ` Laszlo Ersek
  -1 siblings, 0 replies; 95+ messages in thread
From: Laszlo Ersek @ 2019-04-03 13:19 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Shameerali Kolothum Thodi, Linuxarm,
	Auger Eric, shannon.zhaosl@gmail.com, qemu-arm@nongnu.org,
	xuwei (O), sebastien.boeuf@intel.com, Leif Lindholm

On 04/03/19 11:49, Igor Mammedov wrote:
> On Tue, 2 Apr 2019 17:38:26 +0200
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
>> On 04/02/19 17:29, Auger Eric wrote:
>>> Hi Laszlo,
>>>
>>> On 4/1/19 3:07 PM, Laszlo Ersek wrote:  
>>>> On 03/29/19 14:56, Auger Eric wrote:  
>>>>> Hi Ard,
>>>>>
>>>>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:  
>>>>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:  
>>>>>>>
>>>>>>> Hi Shameer,
>>>>>>>
>>>>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:  
>>>>>>>>
>>>>>>>>  
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>>>>>>> Sent: 29 March 2019 09:32
>>>>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>>>>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>>>>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>>>>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>>>>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>>>>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>>>>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>>>>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>>>>>>>>
>>>>>>>>> Hi Shameer,
>>>>>>>>>
>>>>>>>>> [ + Laszlo, Ard, Leif ]
>>>>>>>>>
>>>>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:  
>>>>>>>>>> This is to disable/enable populating DT nodes in case
>>>>>>>>>> any conflict with acpi tables. The default is "off".  
>>>>>>>>> The name of the option sounds misleading to me. Also we don't really
>>>>>>>>> know the scope of the disablement. At the moment this just aims to
>>>>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
>>>>>>>>>  
>>>>>>>>>>
>>>>>>>>>> This will be used in subsequent patch where cold plug
>>>>>>>>>> device-memory support is added for DT boot.  
>>>>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
>>>>>>>>> any PCDIMM nodes.  
>>>>>>>>>>
>>>>>>>>>> If DT memory node support is added for cold-plugged device
>>>>>>>>>> memory, those memory will be visible to Guest kernel via
>>>>>>>>>> UEFI GetMemoryMap() and gets treated as early boot memory.  
>>>>>>>>> Don't we have an issue in UEFI then. Normally the SRAT indicates whether
>>>>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>>>>>>>>> info.  
>>>>>>>>
>>>>>>>> Sorry I missed this part. Yes, that will be a cleaner solution.
>>>>>>>>
>>>>>>>> Also, to be more clear on what happens,
>>>>>>>>
>>>>>>>> Guest ACPI boot with "fdt=on" ,
>>>>>>>>
>>>>>>>> From kernel log,
>>>>>>>>
>>>>>>>> [    0.000000] Early memory node ranges
>>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
>>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
>>>>>>>>
>>>>>>>>
>>>>>>>> Guest ACPI boot with "fdt=off" ,
>>>>>>>>
>>>>>>>> [    0.000000] Movable zone start for each node
>>>>>>>> [    0.000000] Early memory node ranges
>>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>>>>
>>>>>>>> The hotpluggable memory node is absent from early memory nodes here.  
>>>>>>>
>>>>>>> OK thank you for the example illustrating the concern.  
>>>>>>>>
>>>>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.  
>>>>>>>
>>>>>>> Let's wait for EDK2 experts on this.
>>>>>>>  
>>>>>>
>>>>>> Happy to chime in, but I need a bit more context here.
>>>>>>
>>>>>> What is the problem, how does this patch try to solve it, and why is
>>>>>> that a bad idea?
>>>>>>  
>>>>> Sure, sorry.
>>>>>
>>>>> This series:
>>>>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
>>>>> https://patchwork.kernel.org/cover/10863301/
>>>>>
>>>>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
>>>>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
>>>>>
>>>>> We noticed that if we build the hotpluggable memory dt nodes on top of
>>>>> the above ACPI tables, the DIMM slots are interpreted as not
>>>>> hotpluggable memory slots (at least we think so).
>>>>>
>>>>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
>>>>> fact that those slots are exposed as hotpluggable in the SRAT for example.
>>>>>
>>>>> So in this series, we are forced to not generate the hotpluggable memory
>>>>> dt nodes if we want the DIMM slots to be effectively recognized as
>>>>> hotpluggable.
>>>>>
>>>>> Could you confirm we have a correct understanding of the EDK2 behaviour
>>>>> and if so, would there be any solution for EDK2 to absorb both the DT
>>>>> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
>>>>>
>>>>> At qemu level, detecting we are booting in ACPI mode and purposely
>>>>> removing the above mentioned DT nodes does not look straightforward.  
>>>>
>>>> The firmware is not enlightened about the ACPI content that comes from
>>>> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
>>>> as instructed through the ACPI linker/loader script, in order to install
>>>> the ACPI content for the OS. No actual information is consumed by the
>>>> firmware from the ACPI payload -- and that's a feature.
>>>>
>>>> The firmware does consume DT:
>>>>
>>>> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
>>>> the firmware (for its own information needs), and passed on to the OS.
>>>>
>>>> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
>>>> consumed only by the firmware (for its own information needs), and the
>>>> DT is hidden from the OS. The OS gets only the ACPI content
>>>> (processed/prepared as described above).
>>>>
>>>>
>>>> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
>>>> base/size pairs in all the memory nodes in the DT. For each such base
>>>> address that is currently tracked as "nonexistent" in the GCD memory
>>>> space map, the driver currently adds the base/size range as "system
>>>> memory". This in turn is reflected by the UEFI memmap that the OS gets
>>>> to see as "conventional memory".
>>>>
>>>> If you need some memory ranges to show up as "special" in the UEFI
>>>> memmap, then you need to distinguish them somehow from the "regular"
>>>> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
>>>> firmware, so that it acts upon the discriminator that you set in the DT.
>>>>
>>>>
>>>> Now... from a brief look at the Platform Init and UEFI specs, my
>>>> impression is that the hotpluggable (but presently not plugged) DIMM
>>>> ranges should simply be *absent* from the UEFI memmap; is that correct?
>>>> (I didn't check the ACPI spec, maybe it specifies the expected behavior
>>>> in full.) If my impression is correct, then two options (alternatives)
>>>> exist:
>>>>
>>>> (1) Hide the affected memory nodes -- or at least the affected base/size
>>>> pairs -- from the DT, in case you boot without "-no-acpi" but with an
>>>> external firmware loaded. Then the firmware will not expose those ranges
>>>> as "conventional memory" in the UEFI memmap. This approach requires no
>>>> changes to edk2.
>>>>
>>>> This option is precisely what Eric described up-thread, at
>>>> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
>>>>  
>>>>> in machvirt_init, there is firmware_loaded that tells you whether you
>>>>> have a FW image. If this one is not set, you can induce dt. But if
>>>>> there is a FW it can be either DT or ACPI booted. You also have the
>>>>> acpi_enabled knob.  
>>>>
>>>> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
>>>> "vl.c").
>>>>
>>>> So, the condition for hiding the hotpluggable memory nodes in question
>>>> from the DT is:  
>>>   
>>>>
>>>>   (aarch64 && firmware_loaded && acpi_enabled)  
>>>
>>> Thanks a lot for all those inputs!
>>>
>>> I don't get why we test aarch64 in above condition (this was useful for
>>> high ECAM range as the aarch32 FW was not supporting it but here, is it
>>> still meaningful?)  
>>
>> Sorry, I should have clarified that. Yes, it is meaningful:
>>
>> While UEFI has bindings for both 32-bit and 64-bit ARM, ACPI has a
>> 64-bit-only binding for ARM. (And you can have UEFI without ACPI, but
>> not the reverse, on ARM.) So if you run the 32-bit build of the
>> ArmVirtQemu firmware, you get no ACPI at all; all you can rely on with
>> the OS is the DT.
>>
>> This "bitness distinction" is implemented in the firmware already. If
>> you hid the memory nodes from the DT under the condition
>>
>>   (!aarch64 && firmware_loaded && acpi_enabled)
>>
>> then the nodes would not be seen by the OS at all (because
>> "acpi_enabled" is irrelevant for the 32-bit build of ArmVirtQemu, and
>> all the OS can ever get is DT).
> 
> It's getting tricky, and I don't like one bit that we are trying to cater to
> a 64-bit-only UEFI build (or any other build) on the QEMU side. Also, Peter has
> a valid point about guessing on the QEMU side (that's usually a source of
> problems in the future).
> 
> Perhaps we should reconsider and think about marking hotpluggable RAM
> in the DT and letting the firmware exclude it from the memory map.

I'm fine either way.

(I'm glad to continue discussing either option; that shouldn't be taken
as a preference on my end.)

With option (2), please consider the new version dependency between QEMU
and the firmware -- this may or may not affect migration. (Thinking
about migration is difficult, so I'll leave that to you all :) )

Thanks
Laszlo

>>>> (2) Invent and set an "ignore me, firmware" property for the
>>>> hotpluggable memory nodes in the DT, and update the firmware to honor
>>>> that property.
>>>>
>>>> Thanks
>>>> Laszlo
>>>>  
>>
> 


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
@ 2019-04-03 13:19                       ` Laszlo Ersek
  0 siblings, 0 replies; 95+ messages in thread
From: Laszlo Ersek @ 2019-04-03 13:19 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Auger Eric, Ard Biesheuvel, peter.maydell@linaro.org,
	sameo@linux.intel.com, qemu-devel@nongnu.org,
	Shameerali Kolothum Thodi, Linuxarm, shannon.zhaosl@gmail.com,
	qemu-arm@nongnu.org, xuwei (O), sebastien.boeuf@intel.com,
	Leif Lindholm

On 04/03/19 11:49, Igor Mammedov wrote:
> On Tue, 2 Apr 2019 17:38:26 +0200
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
>> On 04/02/19 17:29, Auger Eric wrote:
>>> Hi Laszlo,
>>>
>>> On 4/1/19 3:07 PM, Laszlo Ersek wrote:  
>>>> On 03/29/19 14:56, Auger Eric wrote:  
>>>>> Hi Ard,
>>>>>
>>>>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:  
>>>>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:  
>>>>>>>
>>>>>>> Hi Shameer,
>>>>>>>
>>>>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:  
>>>>>>>>
>>>>>>>>  
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>>>>>>> Sent: 29 March 2019 09:32
>>>>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>>>>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
>>>>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
>>>>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
>>>>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
>>>>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
>>>>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
>>>>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
>>>>>>>>>
>>>>>>>>> Hi Shameer,
>>>>>>>>>
>>>>>>>>> [ + Laszlo, Ard, Leif ]
>>>>>>>>>
>>>>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:  
>>>>>>>>>> This is to disable/enable populating DT nodes in case of
>>>>>>>>>> any conflict with the ACPI tables. The default is "off".  
>>>>>>>>> The name of the option sounds misleading to me. Also we don't really
>>>>>>>>> know the scope of the disablement. At the moment this just aims to
>>>>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
>>>>>>>>>  
>>>>>>>>>>
>>>>>>>>>> This will be used in subsequent patch where cold plug
>>>>>>>>>> device-memory support is added for DT boot.  
>>>>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
>>>>>>>>> any PCDIMM nodes.  
>>>>>>>>>>
>>>>>>>>>> If DT memory node support is added for cold-plugged device
>>>>>>>>>> memory, that memory will be visible to the Guest kernel via
>>>>>>>>>> UEFI GetMemoryMap() and get treated as early boot memory.  
>>>>>>>>> Don't we have an issue in UEFI then? Normally the SRAT indicates whether
>>>>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
>>>>>>>>> info?  
>>>>>>>>
>>>>>>>> Sorry, I missed this part. Yes, that will be a cleaner solution.
>>>>>>>>
>>>>>>>> Also, to be more clear on what happens,
>>>>>>>>
>>>>>>>> Guest ACPI boot with "fdt=on" ,
>>>>>>>>
>>>>>>>> From kernel log,
>>>>>>>>
>>>>>>>> [    0.000000] Early memory node ranges
>>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
>>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
>>>>>>>>
>>>>>>>>
>>>>>>>> Guest ACPI boot with "fdt=off" ,
>>>>>>>>
>>>>>>>> [    0.000000] Movable zone start for each node
>>>>>>>> [    0.000000] Early memory node ranges
>>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
>>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
>>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
>>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>>>>
>>>>>>>> The hotpluggable memory node is absent from early memory nodes here.  
>>>>>>>
>>>>>>> OK thank you for the example illustrating the concern.  
>>>>>>>>
>>>>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.  
>>>>>>>
>>>>>>> Let's wait for EDK2 experts on this.
>>>>>>>  
>>>>>>
>>>>>> Happy to chime in, but I need a bit more context here.
>>>>>>
>>>>>> What is the problem, how does this patch try to solve it, and why is
>>>>>> that a bad idea?
>>>>>>  
>>>>> Sure, sorry.
>>>>>
>>>>> This series:
>>>>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
>>>>> https://patchwork.kernel.org/cover/10863301/
>>>>>
>>>>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
>>>>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
>>>>>
>>>>> We noticed that if we build the hotpluggable memory dt nodes on top of
>>>>> the above ACPI tables, the DIMM slots are interpreted as not
>>>>> hotpluggable memory slots (at least we think so).
>>>>>
>>>>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
>>>>> fact that those slots are exposed as hotpluggable in the SRAT for example.
>>>>>
>>>>> So in this series, we are forced to not generate the hotpluggable memory
>>>>> dt nodes if we want the DIMM slots to be effectively recognized as
>>>>> hotpluggable.
>>>>>
>>>>> Could you confirm we have a correct understanding of the EDK2 behaviour
>>>>> and if so, would there be any solution for EDK2 to absorb both the DT
>>>>> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
>>>>>
>>>>> At qemu level, detecting we are booting in ACPI mode and purposely
>>>>> removing the above mentioned DT nodes does not look straightforward.  
>>>>
>>>> The firmware is not enlightened about the ACPI content that comes from
>>>> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
>>>> as instructed through the ACPI linker/loader script, in order to install
>>>> the ACPI content for the OS. No actual information is consumed by the
>>>> firmware from the ACPI payload -- and that's a feature.
>>>>
>>>> The firmware does consume DT:
>>>>
>>>> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
>>>> the firmware (for its own information needs), and passed on to the OS.
>>>>
>>>> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
>>>> consumed only by the firmware (for its own information needs), and the
>>>> DT is hidden from the OS. The OS gets only the ACPI content
>>>> (processed/prepared as described above).
>>>>
>>>>
>>>> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
>>>> base/size pairs in all the memory nodes in the DT. For each such base
>>>> address that is currently tracked as "nonexistent" in the GCD memory
>>>> space map, the driver currently adds the base/size range as "system
>>>> memory". This in turn is reflected by the UEFI memmap that the OS gets
>>>> to see as "conventional memory".
>>>>
>>>> If you need some memory ranges to show up as "special" in the UEFI
>>>> memmap, then you need to distinguish them somehow from the "regular"
>>>> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
>>>> firmware, so that it acts upon the discriminator that you set in the DT.
>>>>
>>>>
>>>> Now... from a brief look at the Platform Init and UEFI specs, my
>>>> impression is that the hotpluggable (but presently not plugged) DIMM
>>>> ranges should simply be *absent* from the UEFI memmap; is that correct?
>>>> (I didn't check the ACPI spec, maybe it specifies the expected behavior
>>>> in full.) If my impression is correct, then two options (alternatives)
>>>> exist:
>>>>
>>>> (1) Hide the affected memory nodes -- or at least the affected base/size
>>>> pairs -- from the DT, in case you boot without "-no-acpi" but with an
>>>> external firmware loaded. Then the firmware will not expose those ranges
>>>> as "conventional memory" in the UEFI memmap. This approach requires no
>>>> changes to edk2.
>>>>
>>>> This option is precisely what Eric described up-thread, at
>>>> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
>>>>  
>>>>> in machvirt_init, there is firmware_loaded that tells you whether you
>>>>> have a FW image. If this one is not set, you can induce dt. But if
>>>>> there is a FW it can be either DT or ACPI booted. You also have the
>>>>> acpi_enabled knob.  
>>>>
>>>> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
>>>> "vl.c").
>>>>
>>>> So, the condition for hiding the hotpluggable memory nodes in question
>>>> from the DT is:  
>>>   
>>>>
>>>>   (aarch64 && firmware_loaded && acpi_enabled)  
>>>
>>> Thanks a lot for all those inputs!
>>>
>>> I don't get why we test aarch64 in above condition (this was useful for
>>> high ECAM range as the aarch32 FW was not supporting it but here, is it
>>> still meaningful?)  
>>
>> Sorry, I should have clarified that. Yes, it is meaningful:
>>
>> While UEFI has bindings for both 32-bit and 64-bit ARM, ACPI has a
>> 64-bit-only binding for ARM. (And you can have UEFI without ACPI, but
>> not the reverse, on ARM.) So if you run the 32-bit build of the
>> ArmVirtQemu firmware, you get no ACPI at all; all you can rely on with
>> the OS is the DT.
>>
>> This "bitness distinction" is implemented in the firmware already. If
>> you hid the memory nodes from the DT under the condition
>>
>>   (!aarch64 && firmware_loaded && acpi_enabled)
>>
>> then the nodes would not be seen by the OS at all (because
>> "acpi_enabled" is irrelevant for the 32-bit build of ArmVirtQemu, and
>> all the OS can ever get is DT).
> 
> It's getting tricky, and I don't like one bit that we are trying to cater to
> a 64-bit-only UEFI build (or any other build) on the QEMU side. Also, Peter has
> a valid point about guessing on the QEMU side (that's usually a source of
> problems in the future).
> 
> Perhaps we should reconsider and think about marking hotpluggable RAM
> in the DT and letting the firmware exclude it from the memory map.

I'm fine either way.

(I'm glad to continue discussing either option; that shouldn't be taken
as a preference on my end.)

With option (2), please consider the new version dependency between QEMU
and the firmware -- this may or may not affect migration. (Thinking
about migration is difficult, so I'll leave that to you all :) )

Thanks
Laszlo

>>>> (2) Invent and set an "ignore me, firmware" property for the
>>>> hotpluggable memory nodes in the DT, and update the firmware to honor
>>>> that property.
>>>>
>>>> Thanks
>>>> Laszlo
>>>>  
>>
> 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-03 12:10                       ` Shameerali Kolothum Thodi
@ 2019-04-03 13:29                         ` Laszlo Ersek
  -1 siblings, 0 replies; 95+ messages in thread
From: Laszlo Ersek @ 2019-04-03 13:29 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi, Igor Mammedov
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Linuxarm, Auger Eric,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com, Leif Lindholm

On 04/03/19 14:10, Shameerali Kolothum Thodi wrote:

>>>>> So, the condition for hiding the hotpluggable memory nodes in question
>>>>> from the DT is:
>>>>
>>>>>
>>>>>   (aarch64 && firmware_loaded && acpi_enabled)

>>> While UEFI has bindings for both 32-bit and 64-bit ARM, ACPI has a
>>> 64-bit-only binding for ARM. (And you can have UEFI without ACPI, but
>>> not the reverse, on ARM.) So if you run the 32-bit build of the
>>> ArmVirtQemu firmware, you get no ACPI at all; all you can rely on with
>>> the OS is the DT.
> 
> Just to confirm, does that mean that with a 32-bit build of UEFI, the OS
> cannot boot with ACPI and uses DT only?

Indeed.

> So,
> 
> if (aarch64 && firmware_loaded && acpi_enabled) {
>    hide_hotpluggable_memory_nodes();
> } else {
>    add_hotpluggable_memory_nodes();
> }
> 
> should work for all cases?

Yes.

Here's what happens when any one of the subconditions evaluates to false:

- ARM32 has no ACPI bindings, so the guest kernel can only use DT.

- On AARCH64, if you don't "load the firmware" (= don't use UEFI), then
  there won't be an ACPI entry point for the OS to locate (the RSD PTR
  is defined by the ACPI spec in UEFI terms, for AARCH64). So the guest
  kernel can only use DT.

- When on AARCH64 and using UEFI, but asking QEMU not to generate ACPI
  content, the firmware will not install any ACPI tables, so the guest
  kernel can only use DT.
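
The three cases above can be written down as a single predicate. This is only a sketch in plain C, not actual QEMU code: the helper name is made up for illustration, and the three parameters simply mirror the subconditions discussed in this thread.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not a QEMU function. Returns true when the guest OS
 * is expected to boot via ACPI, which is the only case in which the
 * hotpluggable memory nodes would be kept out of the DT under option (1).
 */
static bool hide_hotpluggable_dt_nodes(bool aarch64, bool firmware_loaded,
                                       bool acpi_enabled)
{
    /* Any false subcondition means the guest kernel can only use DT. */
    return aarch64 && firmware_loaded && acpi_enabled;
}
```

Of the eight input combinations, only aarch64 with firmware loaded and ACPI enabled hides the nodes; in the other seven the nodes are added, because DT is all the OS can get.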

>>> This "bitness distinction" is implemented in the firmware already. If
>>> you hid the memory nodes from the DT under the condition
>>>
>>>   (!aarch64 && firmware_loaded && acpi_enabled)
>>>
>>> then the nodes would not be seen by the OS at all (because
>>> "acpi_enabled" is irrelevant for the 32-bit build of ArmVirtQemu, and
>>> all the OS can ever get is DT).
>>
>> It's getting tricky, and I don't like one bit that we are trying to cater to
>> a 64-bit-only UEFI build (or any other build) on the QEMU side. Also, Peter has
>> a valid point about guessing on the QEMU side (that's usually a source of
>> problems in the future).
> 
> If the above is correct (with the 32-bit variant of UEFI, the OS cannot boot
> with ACPI), then do we really have the issue of memory becoming
> non-hot-unpluggable? Maybe I am missing something.

I think Igor and Peter dislike adding complex logic to QEMU that
reflects the behavior of a specific firmware. AIUI their objection isn't
that it wouldn't work, but that it's not the right thing to do, from a
design perspective.

Thanks,
Laszlo

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-03 13:29                         ` Laszlo Ersek
@ 2019-04-03 16:25                           ` Shameerali Kolothum Thodi
  -1 siblings, 0 replies; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-04-03 16:25 UTC (permalink / raw)
  To: Laszlo Ersek, Igor Mammedov
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Linuxarm, Auger Eric,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com, Leif Lindholm

Hi Laszlo,

> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: 03 April 2019 14:29
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> Igor Mammedov <imammedo@redhat.com>
> Cc: Auger Eric <eric.auger@redhat.com>; Ard Biesheuvel
> <ard.biesheuvel@linaro.org>; peter.maydell@linaro.org;
> sameo@linux.intel.com; qemu-devel@nongnu.org; Linuxarm
> <linuxarm@huawei.com>; shannon.zhaosl@gmail.com;
> qemu-arm@nongnu.org; xuwei (O) <xuwei5@huawei.com>;
> sebastien.boeuf@intel.com; Leif Lindholm <Leif.Lindholm@arm.com>
> Subject: Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in
> feature "fdt"
> 
> On 04/03/19 14:10, Shameerali Kolothum Thodi wrote:
> 
> >>>>> So, the condition for hiding the hotpluggable memory nodes in question
> >>>>> from the DT is:
> >>>>
> >>>>>
> >>>>>   (aarch64 && firmware_loaded && acpi_enabled)
> 
> >>> While UEFI has bindings for both 32-bit and 64-bit ARM, ACPI has a
> >>> 64-bit-only binding for ARM. (And you can have UEFI without ACPI, but
> >>> not the reverse, on ARM.) So if you run the 32-bit build of the
> >>> ArmVirtQemu firmware, you get no ACPI at all; all you can rely on with
> >>> the OS is the DT.
> >
> > Just to confirm, does that mean that with a 32-bit build of UEFI, the OS
> > cannot boot with ACPI and uses DT only?
> 
> Indeed.
> 
> > So,
> >
> > if (aarch64 && firmware_loaded && acpi_enabled) {
> >    hide_hotpluggable_memory_nodes();
> > } else {
> >    add_hotpluggable_memory_nodes();
> > }
> >
> > should work for all cases?
> 
> Yes.
> 
> Here's what happens when any one of the subconditions evaluates to false:
> 
> - ARM32 has no ACPI bindings, so the guest kernel can only use DT.
> 
> - On AARCH64, if you don't "load the firmware" (= don't use UEFI), then
>   there won't be an ACPI entry point for the OS to locate (the RSD PTR
>   is defined by the ACPI spec in UEFI terms, for AARCH64). So the guest
>   kernel can only use DT.
> 
> - When on AARCH64 and using UEFI, but asking QEMU not to generate ACPI
>   content, the firmware will not install any ACPI tables, so the guest
>   kernel can only use DT.
> 

Thanks. That makes it very clear. Much appreciated.

> >>> This "bitness distinction" is implemented in the firmware already. If
> >>> you hid the memory nodes from the DT under the condition
> >>>
> >>>   (!aarch64 && firmware_loaded && acpi_enabled)
> >>>
> >>> then the nodes would not be seen by the OS at all (because
> >>> "acpi_enabled" is irrelevant for the 32-bit build of ArmVirtQemu, and
> >>> all the OS can ever get is DT).
> >>
> >> It's getting tricky, and I don't like one bit that we are trying to cater to
> >> a 64-bit-only UEFI build (or any other build) on the QEMU side. Also, Peter has
> >> a valid point about guessing on the QEMU side (that's usually a source of
> >> problems in the future).
> >
> > If the above is correct (with the 32-bit variant of UEFI, the OS cannot boot
> > with ACPI), then do we really have the issue of memory becoming
> > non-hot-unpluggable?
> > Maybe I am missing something.
> 
> I think Igor and Peter dislike adding complex logic to QEMU that
> reflects the behavior of a specific firmware. AIUI their objection isn't
> that it wouldn't work, but that it's not the right thing to do, from a
> design perspective.

Understood. Hope we can converge on something soon.

Cheers,
Shameer

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-03 16:25                           ` Shameerali Kolothum Thodi
  (?)
@ 2019-04-08  8:11                             ` Igor Mammedov
  -1 siblings, 0 replies; 95+ messages in thread
From: Igor Mammedov @ 2019-04-08  8:11 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Linuxarm, Auger Eric,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com, Laszlo Ersek, Leif Lindholm

On Wed, 3 Apr 2019 16:25:49 +0000
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:

> Hi Laszlo,
> 
> > -----Original Message-----
> > From: Laszlo Ersek [mailto:lersek@redhat.com]
> > Sent: 03 April 2019 14:29
> > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> > Igor Mammedov <imammedo@redhat.com>
> > Cc: Auger Eric <eric.auger@redhat.com>; Ard Biesheuvel
> > <ard.biesheuvel@linaro.org>; peter.maydell@linaro.org;
> > sameo@linux.intel.com; qemu-devel@nongnu.org; Linuxarm
> > <linuxarm@huawei.com>; shannon.zhaosl@gmail.com;
> > qemu-arm@nongnu.org; xuwei (O) <xuwei5@huawei.com>;
> > sebastien.boeuf@intel.com; Leif Lindholm <Leif.Lindholm@arm.com>
> > Subject: Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in
> > feature "fdt"
> > 
> > On 04/03/19 14:10, Shameerali Kolothum Thodi wrote:
> >   
> > >>>>> So, the condition for hiding the hotpluggable memory nodes in question
> > >>>>> from the DT is:  
> > >>>>  
> > >>>>>
> > >>>>>   (aarch64 && firmware_loaded && acpi_enabled)  
> >   
> > >>> While UEFI has bindings for both 32-bit and 64-bit ARM, ACPI has a
> > >>> 64-bit-only binding for ARM. (And you can have UEFI without ACPI, but
> > >>> not the reverse, on ARM.) So if you run the 32-bit build of the
> > >>> ArmVirtQemu firmware, you get no ACPI at all; all you can rely on with
> > >>> the OS is the DT.  
> > >
> > > Just to confirm, does that mean that with a 32-bit build of UEFI, the OS
> > > cannot boot with ACPI and uses DT only?
> > 
> > Indeed.
> >   
> > > So,
> > >
> > > if (aarch64 && firmware_loaded && acpi_enabled) {
> > >    hide_hotpluggable_memory_nodes();
> > > } else {
> > >    add_hotpluggable_memory_nodes();
> > > }
> > >
> > > should work for all cases?  
> > 
> > Yes.
> > 
> > Here's what happens when any one of the subconditions evaluates to false:
> > 
> > - ARM32 has no ACPI bindings, so the guest kernel can only use DT.
> > 
> > - On AARCH64, if you don't "load the firmware" (= don't use UEFI), then
> >   there won't be an ACPI entry point for the OS to locate (the RSD PTR
> >   is defined by the ACPI spec in UEFI terms, for AARCH64). So the guest
> >   kernel can only use DT.
> > 
> > - When on AARCH64 and using UEFI, but asking QEMU not to generate ACPI
> >   content, the firmware will not install any ACPI tables, so the guest
> >   kernel can only use DT.
> >   
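For reference, a minimal C sketch of the condition and its three failure
cases as described above. The enum and function names are made up for
illustration and are not QEMU code; the sketch only encodes the reasoning
in this thread (an ACPI boot needs a 64-bit guest, a loaded UEFI firmware,
and ACPI content generated by QEMU).

```c
#include <stdbool.h>

/* Which hardware description the guest kernel ends up using. */
enum hw_description {
    HW_DESC_DT,
    HW_DESC_ACPI
};

/*
 * Hypothetical helper: ACPI is only reachable when all three inputs hold.
 * - !aarch64:         no ACPI binding for 32-bit ARM, so DT only.
 * - !firmware_loaded: no UEFI, so no ACPI entry point, so DT only.
 * - !acpi_enabled:    firmware installs no ACPI tables, so DT only.
 */
enum hw_description guest_hw_description(bool aarch64,
                                         bool firmware_loaded,
                                         bool acpi_enabled)
{
    if (aarch64 && firmware_loaded && acpi_enabled) {
        return HW_DESC_ACPI;
    }
    return HW_DESC_DT;
}

/* Hide the hotpluggable memory nodes from the DT only for an ACPI boot. */
bool hide_hotpluggable_dt_nodes(bool aarch64,
                                bool firmware_loaded,
                                bool acpi_enabled)
{
    return guest_hw_description(aarch64, firmware_loaded, acpi_enabled)
           == HW_DESC_ACPI;
}
```

Only the (true, true, true) combination hides the nodes; any single false
input falls back to DT and keeps them.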
> 
> Thanks. That makes it very clear. Much appreciated.
> 
> > >>> This "bitness distinction" is implemented in the firmware already. If
> > >>> you hid the memory nodes from the DT under the condition
> > >>>
> > >>>   (!aarch64 && firmware_loaded && acpi_enabled)
> > >>>
> > >>> then the nodes would not be seen by the OS at all (because
> > >>> "acpi_enabled" is irrelevant for the 32-bit build of ArmVirtQemu, and
> > >>> all the OS can ever get is DT).  
> > >>
> > >> It's getting tricky, and I don't like one bit that we are trying to cater to a
> > >> 64-bit-only UEFI build (or any other build) on the QEMU side. Also, Peter has
> > >> a valid point about guessing on the QEMU side (that's usually a source of
> > >> problems in the future).
> > >
> > > If the above is correct (with the 32-bit variant of UEFI, the OS cannot boot with ACPI),
> > > then do we really have the issue of memory becoming non-hot-unpluggable?
> > > Maybe I am missing something.
> > 
> > I think Igor and Peter dislike adding complex logic to QEMU that
> > reflects the behavior of a specific firmware. AIUI their objection isn't
> > that it wouldn't work, but that it's not the right thing to do, from a
> > design perspective.  
> 
> Understood. Hope we can converge on something soon.
Let's try adding a parameter to the memory descriptors in the DT that would mark
them as hotpluggable.
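
A rough sketch of what a DT consumer could do with such a marker, assuming
it ends up as a per-node flag. The struct, the "hotpluggable" field name and
the function are hypothetical; neither the property name nor the firmware
change has been agreed on in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for one DT memory node; "hotpluggable" is a placeholder. */
struct dt_mem_node {
    uint64_t base;
    uint64_t size;
    bool hotpluggable;
};

/*
 * Copy only the nodes that are NOT marked hotpluggable into "out",
 * i.e. the ones a consumer would treat as boot memory.
 * Returns the number of nodes copied (at most "cap").
 */
size_t collect_boot_memory(const struct dt_mem_node *nodes, size_t n,
                           struct dt_mem_node *out, size_t cap)
{
    size_t count = 0;

    assert(nodes != NULL && out != NULL);

    for (size_t i = 0; i < n && count < cap; i++) {
        if (!nodes[i].hotpluggable) {
            out[count++] = nodes[i];
        }
    }
    return count;
}
```

The point of the sketch is that QEMU would always describe the memory and
the consumer would decide what to skip, so QEMU does not have to guess how
the guest will boot.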

 
> Cheers,
> Shameer


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-03 13:19                       ` Laszlo Ersek
  (?)
@ 2019-04-08  8:13                         ` Igor Mammedov
  -1 siblings, 0 replies; 95+ messages in thread
From: Igor Mammedov @ 2019-04-08  8:13 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Shameerali Kolothum Thodi, Linuxarm,
	Auger Eric, shannon.zhaosl@gmail.com, qemu-arm@nongnu.org,
	xuwei (O), sebastien.boeuf@intel.com, Leif Lindholm

On Wed, 3 Apr 2019 15:19:52 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

> On 04/03/19 11:49, Igor Mammedov wrote:
> > On Tue, 2 Apr 2019 17:38:26 +0200
> > Laszlo Ersek <lersek@redhat.com> wrote:
> >   
> >> On 04/02/19 17:29, Auger Eric wrote:  
> >>> Hi Laszlo,
> >>>
> >>> On 4/1/19 3:07 PM, Laszlo Ersek wrote:    
> >>>> On 03/29/19 14:56, Auger Eric wrote:    
> >>>>> Hi Ard,
> >>>>>
> >>>>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:    
> >>>>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:    
> >>>>>>>
> >>>>>>> Hi Shameer,
> >>>>>>>
> >>>>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:    
> >>>>>>>>
> >>>>>>>>    
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
> >>>>>>>>> Sent: 29 March 2019 09:32
> >>>>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> >>>>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
> >>>>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> >>>>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
> >>>>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
> >>>>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
> >>>>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
> >>>>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
> >>>>>>>>>
> >>>>>>>>> Hi Shameer,
> >>>>>>>>>
> >>>>>>>>> [ + Laszlo, Ard, Leif ]
> >>>>>>>>>
> >>>>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:    
> >>>>>>>>>> This is to disable/enable populating DT nodes in case of
> >>>>>>>>>> any conflict with ACPI tables. The default is "off".
> >>>>>>>>> The name of the option sounds misleading to me. Also we don't really
> >>>>>>>>> know the scope of the disablement. At the moment this just aims to
> >>>>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
> >>>>>>>>>    
> >>>>>>>>>>
> >>>>>>>>>> This will be used in subsequent patch where cold plug
> >>>>>>>>>> device-memory support is added for DT boot.    
> >>>>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
> >>>>>>>>> any PCDIMM nodes.    
> >>>>>>>>>>
> >>>>>>>>>> If DT memory node support is added for cold-plugged device
> >>>>>>>>>> memory, that memory will be visible to the guest kernel via
> >>>>>>>>>> UEFI GetMemoryMap() and gets treated as early boot memory.
> >>>>>>>>> Don't we have an issue in UEFI then? Normally the SRAT indicates whether
> >>>>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
> >>>>>>>>> info?
> >>>>>>>>
> >>>>>>>> Sorry I missed this part. Yes, that will be a cleaner solution.
> >>>>>>>>
> >>>>>>>> Also, to be more clear on what happens,
> >>>>>>>>
> >>>>>>>> Guest ACPI boot with "fdt=on" ,
> >>>>>>>>
> >>>>>>>> From kernel log,
> >>>>>>>>
> >>>>>>>> [    0.000000] Early memory node ranges
> >>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
> >>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
> >>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Guest ACPI boot with "fdt=off" ,
> >>>>>>>>
> >>>>>>>> [    0.000000] Movable zone start for each node
> >>>>>>>> [    0.000000] Early memory node ranges
> >>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
> >>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
> >>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
> >>>>>>>>
> >>>>>>>> The hotpluggable memory node is absent from early memory nodes here.    
> >>>>>>>
> >>>>>>> OK thank you for the example illustrating the concern.    
> >>>>>>>>
> >>>>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.    
> >>>>>>>
> >>>>>>> Let's wait for EDK2 experts on this.
> >>>>>>>    
> >>>>>>
> >>>>>> Happy to chime in, but I need a bit more context here.
> >>>>>>
> >>>>>> What is the problem, how does this patch try to solve it, and why is
> >>>>>> that a bad idea?
> >>>>>>    
> >>>>> Sure, sorry.
> >>>>>
> >>>>> This series:
> >>>>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
> >>>>> https://patchwork.kernel.org/cover/10863301/
> >>>>>
> >>>>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
> >>>>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
> >>>>>
> >>>>> We noticed that if we build the hotpluggable memory dt nodes on top of
> >>>>> the above ACPI tables, the DIMM slots are interpreted as not
> >>>>> hotpluggable memory slots (at least we think so).
> >>>>>
> >>>>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
> >>>>> fact that those slots are exposed as hotpluggable in the SRAT for example.
> >>>>>
> >>>>> So in this series, we are forced to not generate the hotpluggable memory
> >>>>> dt nodes if we want the DIMM slots to be effectively recognized as
> >>>>> hotpluggable.
> >>>>>
> >>>>> Could you confirm we have a correct understanding of the EDK2 behaviour
> >>>>> and if so, would there be any solution for EDK2 to absorb both the DT
> >>>>> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
> >>>>>
> >>>>> At qemu level, detecting we are booting in ACPI mode and purposely
> >>>>> removing the above mentioned DT nodes does not look straightforward.    
> >>>>
> >>>> The firmware is not enlightened about the ACPI content that comes from
> >>>> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
> >>>> as instructed through the ACPI linker/loader script, in order to install
> >>>> the ACPI content for the OS. No actual information is consumed by the
> >>>> firmware from the ACPI payload -- and that's a feature.
> >>>>
> >>>> The firmware does consume DT:
> >>>>
> >>>> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
> >>>> the firmware (for its own information needs), and passed on to the OS.
> >>>>
> >>>> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
> >>>> consumed only by the firmware (for its own information needs), and the
> >>>> DT is hidden from the OS. The OS gets only the ACPI content
> >>>> (processed/prepared as described above).
> >>>>
> >>>>
> >>>> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
> >>>> base/size pairs in all the memory nodes in the DT. For each such base
> >>>> address that is currently tracked as "nonexistent" in the GCD memory
> >>>> space map, the driver currently adds the base/size range as "system
> >>>> memory". This in turn is reflected by the UEFI memmap that the OS gets
> >>>> to see as "conventional memory".
> >>>>
> >>>> If you need some memory ranges to show up as "special" in the UEFI
> >>>> memmap, then you need to distinguish them somehow from the "regular"
> >>>> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
> >>>> firmware, so that it acts upon the discriminator that you set in the DT.
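The HighMemDxe behaviour described above can be modelled roughly as
follows. This is a toy model in plain C, not edk2 code: the GCD memory
space map is reduced to an array of ranges, and all names are invented for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One base/size pair, as found in a DT memory node. */
struct mem_range {
    uint64_t base;
    uint64_t size;
};

/* True if "base" falls inside a range the toy map already tracks. */
bool base_is_tracked(const struct mem_range *tracked, size_t n_tracked,
                     uint64_t base)
{
    for (size_t i = 0; i < n_tracked; i++) {
        if (base >= tracked[i].base &&
            base - tracked[i].base < tracked[i].size) {
            return true;
        }
    }
    return false;
}

/*
 * Walk the DT ranges and add every range whose base is still
 * "nonexistent" in the toy map as system memory.
 * Returns the new number of tracked ranges (never more than "cap").
 */
size_t add_untracked_ranges(struct mem_range *tracked, size_t n_tracked,
                            size_t cap,
                            const struct mem_range *dt, size_t n_dt)
{
    assert(tracked != NULL && dt != NULL);

    for (size_t i = 0; i < n_dt; i++) {
        if (n_tracked < cap &&
            !base_is_tracked(tracked, n_tracked, dt[i].base)) {
            tracked[n_tracked++] = dt[i];
        }
    }
    return n_tracked;
}
```

With 0x40000000-0xbfffffff already tracked, a DT that also lists
0xc0000000-0xffffffff makes that second range show up as system memory,
which matches the "fdt=on" kernel log quoted earlier in this mail.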
> >>>>
> >>>>
> >>>> Now... from a brief look at the Platform Init and UEFI specs, my
> >>>> impression is that the hotpluggable (but presently not plugged) DIMM
> >>>> ranges should simply be *absent* from the UEFI memmap; is that correct?
> >>>> (I didn't check the ACPI spec, maybe it specifies the expected behavior
> >>>> in full.) If my impression is correct, then two options (alternatives)
> >>>> exist:
> >>>>
> >>>> (1) Hide the affected memory nodes -- or at least the affected base/size
> >>>> pairs -- from the DT, in case you boot without "-no-acpi" but with an
> >>>> external firmware loaded. Then the firmware will not expose those ranges
> >>>> as "conventional memory" in the UEFI memmap. This approach requires no
> >>>> changes to edk2.
> >>>>
> >>>> This option is precisely what Eric described up-thread, at
> >>>> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
> >>>>    
> >>>>> in machvirt_init, there is firmware_loaded that tells you whether you
> >>>>> have a FW image. If this one is not set, you can induce dt. But if
> >>>>> there is a FW it can be either DT or ACPI booted. You also have the
> >>>>> acpi_enabled knob.    
> >>>>
> >>>> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
> >>>> "vl.c").
> >>>>
> >>>> So, the condition for hiding the hotpluggable memory nodes in question
> >>>> from the DT is:    
> >>>     
> >>>>
> >>>>   (aarch64 && firmware_loaded && acpi_enabled)    
> >>>
> >>> Thanks a lot for all those inputs!
> >>>
> >>> I don't get why we test aarch64 in above condition (this was useful for
> >>> high ECAM range as the aarch32 FW was not supporting it but here, is it
> >>> still meaningful?)    
> >>
> >> Sorry, I should have clarified that. Yes, it is meaningful:
> >>
> >> While UEFI has bindings for both 32-bit and 64-bit ARM, ACPI has a
> >> 64-bit-only binding for ARM. (And you can have UEFI without ACPI, but
> >> not the reverse, on ARM.) So if you run the 32-bit build of the
> >> ArmVirtQemu firmware, you get no ACPI at all; all you can rely on with
> >> the OS is the DT.
> >>
> >> This "bitness distinction" is implemented in the firmware already. If
> >> you hid the memory nodes from the DT under the condition
> >>
> >>   (!aarch64 && firmware_loaded && acpi_enabled)
> >>
> >> then the nodes would not be seen by the OS at all (because
> >> "acpi_enabled" is irrelevant for the 32-bit build of ArmVirtQemu, and
> >> all the OS can ever get is DT).  
> > 
> > It's getting tricky, and I don't like one bit that we are trying to cater to a
> > 64-bit-only UEFI build (or any other build) on the QEMU side. Also, Peter has
> > a valid point about guessing on the QEMU side (that's usually a source of
> > problems in the future).
> > 
> > Perhaps we should reconsider and think about marking hotpluggable RAM
> > in the DT and letting the firmware exclude it from the memory map.
> 
> I'm fine either way.
> 
> (I'm glad to continue discussing either option; that shouldn't be taken
> as a preference on my end.)
> 
> With option (2), please consider the new version dependency between QEMU
> and the firmware -- this may or may not affect migration. (Thinking
> about migration is difficult, so I'll leave that to you all :) )
I don't see any issues with migration so far. It will change the size of
the DT, but only when the new CLI options are used, so an existing machine
would not have them and would keep the old DT size.

> 
> Thanks
> Laszlo
> 
> >>>> (2) Invent and set an "ignore me, firmware" property for the
> >>>> hotpluggable memory nodes in the DT, and update the firmware to honor
> >>>> that property.
> >>>>
> >>>> Thanks
> >>>> Laszlo
> >>>>    
> >>  
> >   
> 


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
@ 2019-04-08  8:13                         ` Igor Mammedov
  0 siblings, 0 replies; 95+ messages in thread
From: Igor Mammedov @ 2019-04-08  8:13 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Auger Eric, Ard Biesheuvel, peter.maydell@linaro.org,
	sameo@linux.intel.com, qemu-devel@nongnu.org,
	Shameerali Kolothum Thodi, Linuxarm, shannon.zhaosl@gmail.com,
	qemu-arm@nongnu.org, xuwei (O), sebastien.boeuf@intel.com,
	Leif Lindholm

On Wed, 3 Apr 2019 15:19:52 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

> On 04/03/19 11:49, Igor Mammedov wrote:
> > On Tue, 2 Apr 2019 17:38:26 +0200
> > Laszlo Ersek <lersek@redhat.com> wrote:
> >   
> >> On 04/02/19 17:29, Auger Eric wrote:  
> >>> Hi Laszlo,
> >>>
> >>> On 4/1/19 3:07 PM, Laszlo Ersek wrote:    
> >>>> On 03/29/19 14:56, Auger Eric wrote:    
> >>>>> Hi Ard,
> >>>>>
> >>>>> On 3/29/19 2:14 PM, Ard Biesheuvel wrote:    
> >>>>>> On Fri, 29 Mar 2019 at 14:12, Auger Eric <eric.auger@redhat.com> wrote:    
> >>>>>>>
> >>>>>>> Hi Shameer,
> >>>>>>>
> >>>>>>> On 3/29/19 10:59 AM, Shameerali Kolothum Thodi wrote:    
> >>>>>>>>
> >>>>>>>>    
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
> >>>>>>>>> Sent: 29 March 2019 09:32
> >>>>>>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> >>>>>>>>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; imammedo@redhat.com;
> >>>>>>>>> peter.maydell@linaro.org; shannon.zhaosl@gmail.com;
> >>>>>>>>> sameo@linux.intel.com; sebastien.boeuf@intel.com
> >>>>>>>>> Cc: Linuxarm <linuxarm@huawei.com>; xuwei (O) <xuwei5@huawei.com>;
> >>>>>>>>> Laszlo Ersek <lersek@redhat.com>; Ard Biesheuvel
> >>>>>>>>> <ard.biesheuvel@linaro.org>; Leif Lindholm <Leif.Lindholm@arm.com>
> >>>>>>>>> Subject: Re: [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
> >>>>>>>>>
> >>>>>>>>> Hi Shameer,
> >>>>>>>>>
> >>>>>>>>> [ + Laszlo, Ard, Leif ]
> >>>>>>>>>
> >>>>>>>>> On 3/21/19 11:47 AM, Shameer Kolothum wrote:    
> >>>>>>>>>> This is to disable/enable populating DT nodes in case of
> >>>>>>>>>> any conflict with ACPI tables. The default is "off".
> >>>>>>>>> The name of the option sounds misleading to me. Also we don't really
> >>>>>>>>> know the scope of the disablement. At the moment this just aims to
> >>>>>>>>> prevent the hotpluggable dt nodes from being added if we boot in ACPI mode.
> >>>>>>>>>    
> >>>>>>>>>>
> >>>>>>>>>> This will be used in subsequent patch where cold plug
> >>>>>>>>>> device-memory support is added for DT boot.    
> >>>>>>>>> I am concerned about the fact that in dt mode, by default, you won't see
> >>>>>>>>> any PCDIMM nodes.    
> >>>>>>>>>>
> >>>>>>>>>> If DT memory node support is added for cold-plugged device
> >>>>>>>>>> memory, that memory will be visible to the guest kernel via
> >>>>>>>>>> UEFI GetMemoryMap() and gets treated as early boot memory.
> >>>>>>>>> Don't we have an issue in UEFI then? Normally the SRAT indicates whether
> >>>>>>>>> the slots are hotpluggable or not. Shouldn't the UEFI code look at this
> >>>>>>>>> info?
> >>>>>>>>
> >>>>>>>> Sorry I missed this part. Yes, that will be a cleaner solution.
> >>>>>>>>
> >>>>>>>> Also, to be more clear on what happens,
> >>>>>>>>
> >>>>>>>> Guest ACPI boot with "fdt=on" ,
> >>>>>>>>
> >>>>>>>> From kernel log,
> >>>>>>>>
> >>>>>>>> [    0.000000] Early memory node ranges
> >>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000ffffffff]  --> This is the hotpluggable memory node from DT.
> >>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
> >>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000ffffffff]
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Guest ACPI boot with "fdt=off" ,
> >>>>>>>>
> >>>>>>>> [    0.000000] Movable zone start for each node
> >>>>>>>> [    0.000000] Early memory node ranges
> >>>>>>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x00000000bbf5ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bbf60000-0x00000000bbffffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc000000-0x00000000bc02ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc030000-0x00000000bc36ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bc370000-0x00000000bf64ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf650000-0x00000000bf6dffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6e0000-0x00000000bf6effff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf6f0000-0x00000000bf80ffff]
> >>>>>>>> [    0.000000]   node   0: [mem 0x00000000bf810000-0x00000000bfffffff]
> >>>>>>>> [    0.000000] Zeroed struct page in unavailable ranges: 1040 pages
> >>>>>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
> >>>>>>>>
> >>>>>>>> The hotpluggable memory node is absent from early memory nodes here.    
> >>>>>>>
> >>>>>>> OK thank you for the example illustrating the concern.    
> >>>>>>>>
> >>>>>>>> As you said, it could be possible to detect this node using SRAT in UEFI.    
> >>>>>>>
> >>>>>>> Let's wait for EDK2 experts on this.
> >>>>>>>    
> >>>>>>
> >>>>>> Happy to chime in, but I need a bit more context here.
> >>>>>>
> >>>>>> What is the problem, how does this patch try to solve it, and why is
> >>>>>> that a bad idea?
> >>>>>>    
> >>>>> Sure, sorry.
> >>>>>
> >>>>> This series:
> >>>>> - [PATCH v3 00/10] ARM virt: ACPI memory hotplug support,
> >>>>> https://patchwork.kernel.org/cover/10863301/
> >>>>>
> >>>>> aims to introduce PCDIMM support in qemu. In ACPI mode, it builds the
> >>>>> SRAT and DSDT parts and relies on GED to trigger the hotplug.
> >>>>>
> >>>>> We noticed that if we build the hotpluggable memory dt nodes on top of
> >>>>> the above ACPI tables, the DIMM slots are interpreted as not
> >>>>> hotpluggable memory slots (at least we think so).
> >>>>>
> >>>>> We think the EDK2 GetMemoryMap() uses the dt node info and ignores the
> >>>>> fact that those slots are exposed as hotpluggable in the SRAT for example.
> >>>>>
> >>>>> So in this series, we are forced to not generate the hotpluggable memory
> >>>>> dt nodes if we want the DIMM slots to be effectively recognized as
> >>>>> hotpluggable.
> >>>>>
> >>>>> Could you confirm we have a correct understanding of the EDK2 behaviour
> >>>>> and if so, would there be any solution for EDK2 to absorb both the DT
> >>>>> nodes and the relevant SRAT/DSDT tables and make the slots hotpluggable.
> >>>>>
> >>>>> At qemu level, detecting we are booting in ACPI mode and purposely
> >>>>> removing the above mentioned DT nodes does not look straightforward.    
> >>>>
> >>>> The firmware is not enlightened about the ACPI content that comes from
> >>>> QEMU / fw_cfg. That ACPI content is *blindly* processed by the firmware,
> >>>> as instructed through the ACPI linker/loader script, in order to install
> >>>> the ACPI content for the OS. No actual information is consumed by the
> >>>> firmware from the ACPI payload -- and that's a feature.
> >>>>
> >>>> The firmware does consume DT:
> >>>>
> >>>> - If you start QEMU *with* "-no-acpi", then the DT is both consumed by
> >>>> the firmware (for its own information needs), and passed on to the OS.
> >>>>
> >>>> - If you start QEMU *without* "-no-acpi" (the default), then the DT is
> >>>> consumed only by the firmware (for its own information needs), and the
> >>>> DT is hidden from the OS. The OS gets only the ACPI content
> >>>> (processed/prepared as described above).
> >>>>
> >>>>
> >>>> In the firmware, the "ArmVirtPkg/HighMemDxe" driver iterates over the
> >>>> base/size pairs in all the memory nodes in the DT. For each such base
> >>>> address that is currently tracked as "nonexistent" in the GCD memory
> >>>> space map, the driver currently adds the base/size range as "system
> >>>> memory". This in turn is reflected by the UEFI memmap that the OS gets
> >>>> to see as "conventional memory".
> >>>>
> >>>> If you need some memory ranges to show up as "special" in the UEFI
> >>>> memmap, then you need to distinguish them somehow from the "regular"
> >>>> memory areas, in the DT. And then extend "ArmVirtPkg/HighMemDxe" in the
> >>>> firmware, so that it acts upon the discriminator that you set in the DT.
> >>>>
> >>>>
> >>>> Now... from a brief look at the Platform Init and UEFI specs, my
> >>>> impression is that the hotpluggable (but presently not plugged) DIMM
> >>>> ranges should simply be *absent* from the UEFI memmap; is that correct?
> >>>> (I didn't check the ACPI spec, maybe it specifies the expected behavior
> >>>> in full.) If my impression is correct, then two options (alternatives)
> >>>> exist:
> >>>>
> >>>> (1) Hide the affected memory nodes -- or at least the affected base/size
> >>>> pairs -- from the DT, in case you boot without "-no-acpi" but with an
> >>>> external firmware loaded. Then the firmware will not expose those ranges
> >>>> as "conventional memory" in the UEFI memmap. This approach requires no
> >>>> changes to edk2.
> >>>>
> >>>> This option is precisely what Eric described up-thread, at
> >>>> <http://mid.mail-archive.com/3f0a5793-dd35-a497-2248-8eb0cd3c3a16@redhat.com>:
> >>>>    
> >>>>> in machvirt_init, there is firmware_loaded that tells you whether you
> >>>>> have a FW image. If this one is not set, you can deduce dt. But if
> >>>>> there is a FW it can be either DT or ACPI booted. You also have the
> >>>>> acpi_enabled knob.    
> >>>>
> >>>> (The "-no-acpi" cmdline option clears the "acpi_enabled" variable in
> >>>> "vl.c").
> >>>>
> >>>> So, the condition for hiding the hotpluggable memory nodes in question
> >>>> from the DT is:    
> >>>     
> >>>>
> >>>>   (aarch64 && firmware_loaded && acpi_enabled)    
> >>>
> >>> Thanks a lot for all those inputs!
> >>>
> >>> I don't get why we test aarch64 in above condition (this was useful for
> >>> high ECAM range as the aarch32 FW was not supporting it but here, is it
> >>> still meaningful?)    
> >>
> >> Sorry, I should have clarified that. Yes, it is meaningful:
> >>
> >> While UEFI has bindings for both 32-bit and 64-bit ARM, ACPI has a
> >> 64-bit-only binding for ARM. (And you can have UEFI without ACPI, but
> >> not the reverse, on ARM.) So if you run the 32-bit build of the
> >> ArmVirtQemu firmware, you get no ACPI at all; all you can rely on with
> >> the OS is the DT.
> >>
> >> This "bitness distinction" is implemented in the firmware already. If
> >> you hid the memory nodes from the DT under the condition
> >>
> >>   (!aarch64 && firmware_loaded && acpi_enabled)
> >>
> >> then the nodes would not be seen by the OS at all (because
> >> "acpi_enabled" is irrelevant for the 32-bit build of ArmVirtQemu, and
> >> all the OS can ever get is DT).  
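
The condition quoted above can be written out as a small predicate. This is
only a sketch: hide_hotplug_dt_nodes() and its parameters are hypothetical
names used for illustration, not existing QEMU functions. The three inputs
stand for the aarch64, firmware_loaded and acpi_enabled values mentioned in
the thread.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not an existing QEMU function): decide whether the
 * hotpluggable memory nodes should be left out of the DT.
 *
 * The nodes are hidden only when all three hold:
 *   - the guest is 64-bit (ACPI has no 32-bit ARM binding),
 *   - an external firmware image is loaded,
 *   - ACPI has not been disabled with "-no-acpi".
 * In every other case the OS may boot from the DT, so the nodes stay.
 */
bool hide_hotplug_dt_nodes(bool aarch64, bool firmware_loaded,
                           bool acpi_enabled)
{
    return aarch64 && firmware_loaded && acpi_enabled;
}
```

With a 32-bit guest the predicate is false regardless of acpi_enabled, so
the DT keeps the nodes, which matches the explanation that a 32-bit
ArmVirtQemu build can only hand the DT to the OS.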
> > 
> > > It's getting tricky and I don't like a bit that we are trying to cater to
> > > a 64-bit-only UEFI build (or any other build) on the QEMU side. Also Peter
> > > has a valid point about guessing on the QEMU side (that's usually a source
> > > of problems in the future).
> > > 
> > > Perhaps we should reconsider and think about marking hotpluggable RAM
> > > in the DT and letting the firmware exclude it from the memory map.
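
The firmware-side half of that proposal could look roughly like the sketch
below. It is not edk2 code: struct mem_range, its hotpluggable flag and
select_system_memory() are hypothetical names, and the flag stands in for
whatever DT marker ends up being agreed on. The only point is that ranges
carrying the marker are skipped instead of being added as system memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one base/size pair from a DT memory node. */
struct mem_range {
    uint64_t base;
    uint64_t size;
    bool hotpluggable;   /* assumed marker set by QEMU in the DT */
};

/*
 * Copy into 'out' only the ranges that would be added as system memory,
 * i.e. the ones not marked hotpluggable. Returns the number of ranges kept.
 * 'out' must have room for 'n' entries.
 */
size_t select_system_memory(const struct mem_range *in, size_t n,
                            struct mem_range *out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (in[i].hotpluggable) {
            continue;   /* keep it out of the memory map */
        }
        out[kept++] = in[i];
    }
    return kept;
}
```

Given one regular range and one marked range, only the regular one would be
kept, so the marked range would not show up as conventional memory.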
> 
> I'm fine either way.
> 
> (I'm glad to continue discussing either option; that shouldn't be taken
> as a preference on my end.)
> 
> With option (2), please consider the new version dependency between QEMU
> and the firmware -- this may or may not affect migration. (Thinking
> about migration is difficult, so I'll leave that to you all :) )
I don't see any issues with migration so far. It will change the size of
the DT, but it's all new CLI, so an existing machine should not have the
new options and hence would keep the old DT size.

> 
> Thanks
> Laszlo
> 
> >>>> (2) Invent and set an "ignore me, firmware" property for the
> >>>> hotpluggable memory nodes in the DT, and update the firmware to honor
> >>>> that property.
> >>>>
> >>>> Thanks
> >>>> Laszlo
> >>>>    
> >>  
> >   
> 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [Qemu-arm] [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt"
  2019-04-08  8:11                             ` Igor Mammedov
  (?)
@ 2019-04-09 10:43                               ` Shameerali Kolothum Thodi
  -1 siblings, 0 replies; 95+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-04-09 10:43 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: peter.maydell@linaro.org, sameo@linux.intel.com, Ard Biesheuvel,
	qemu-devel@nongnu.org, Linuxarm, Auger Eric,
	shannon.zhaosl@gmail.com, qemu-arm@nongnu.org, xuwei (O),
	sebastien.boeuf@intel.com, Laszlo Ersek, Leif Lindholm



> -----Original Message-----
> From: Igor Mammedov [mailto:imammedo@redhat.com]
> Sent: 08 April 2019 09:12
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: Laszlo Ersek <lersek@redhat.com>; Auger Eric <eric.auger@redhat.com>;
> Ard Biesheuvel <ard.biesheuvel@linaro.org>; peter.maydell@linaro.org;
> sameo@linux.intel.com; qemu-devel@nongnu.org; Linuxarm
> <linuxarm@huawei.com>; shannon.zhaosl@gmail.com;
> qemu-arm@nongnu.org; xuwei (O) <xuwei5@huawei.com>;
> sebastien.boeuf@intel.com; Leif Lindholm <Leif.Lindholm@arm.com>
> Subject: Re: [Qemu-devel] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in
> feature "fdt"

[...]
 
> > > > If the above is correct (with the 32-bit variant of UEFI, the OS cannot
> > > > have ACPI boot), then do we really have the issue of memory becoming
> > > > non hot-unpluggable? Maybe I am missing something.
> > >
> > > I think Igor and Peter dislike adding complex logic to QEMU that
> > > reflects the behavior of a specific firmware. AIUI their objection isn't
> > > that it wouldn't work, but that it's not the right thing to do, from a
> > > design perspective.
> >
> > Understood. Hope we can converge on something soon.
> Let's try adding a parameter to memory descriptors in DT that would mark
> them as hotpluggable.

Just sent out v4 incorporating this. Please take a look and let me know.

Thanks,
Shameer
 

^ permalink raw reply	[flat|nested] 95+ messages in thread

end of thread, other threads:[~2019-04-09 10:44 UTC | newest]

Thread overview: 95+ messages
2019-03-21 10:47 [Qemu-arm] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support Shameer Kolothum
2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 01/10] hw/acpi: Make ACPI IO address space configurable Shameer Kolothum
2019-04-01 12:58   ` [Qemu-arm] " Igor Mammedov
2019-04-01 12:58     ` [Qemu-devel] " Igor Mammedov
2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 02/10] hw/acpi: Do not create memory hotplug method when handler is not defined Shameer Kolothum
2019-03-28 14:14   ` [Qemu-arm] " Auger Eric
2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 03/10] hw/arm/virt: Add virtual ACPI device Shameer Kolothum
2019-03-28 14:14   ` Auger Eric
2019-03-29 11:22     ` Shameerali Kolothum Thodi
2019-04-01 13:08       ` [Qemu-arm] [Qemu-devel] " Igor Mammedov
2019-04-01 13:08         ` Igor Mammedov
2019-04-01 14:21         ` Shameerali Kolothum Thodi
2019-04-01 14:21           ` Shameerali Kolothum Thodi
2019-04-02  6:31           ` [Qemu-arm] " Igor Mammedov
2019-04-02  6:31             ` Igor Mammedov
2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 04/10] hw/arm/virt: Add memory hotplug framework Shameer Kolothum
2019-03-28 15:37   ` Auger Eric
2019-03-29 12:03     ` Shameerali Kolothum Thodi
2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 05/10] hw/arm/virt: Add ACPI support for device memory cold-plug Shameer Kolothum
2019-03-29  9:31   ` [Qemu-arm] " Auger Eric
2019-03-29 10:54     ` Shameerali Kolothum Thodi
2019-04-01 13:43     ` Igor Mammedov
2019-04-01 13:43       ` [Qemu-devel] " Igor Mammedov
2019-04-01 14:51       ` [Qemu-arm] " Shameerali Kolothum Thodi
2019-04-01 14:51         ` [Qemu-devel] " Shameerali Kolothum Thodi
2019-04-02  7:19         ` [Qemu-arm] " Igor Mammedov
2019-04-02  7:19           ` Igor Mammedov
2019-04-01 14:59       ` [Qemu-arm] " Auger Eric
2019-04-01 14:59         ` Auger Eric
2019-04-01 13:34   ` [Qemu-arm] " Igor Mammedov
2019-04-01 13:34     ` [Qemu-devel] " Igor Mammedov
2019-04-01 16:24     ` [Qemu-arm] " Shameerali Kolothum Thodi
2019-04-01 16:24       ` [Qemu-devel] " Shameerali Kolothum Thodi
2019-04-02  7:22       ` Igor Mammedov
2019-04-02  7:22         ` Igor Mammedov
2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 06/10] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT Shameer Kolothum
2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 07/10] hw/arm/virt: Introduce opt-in feature "fdt" Shameer Kolothum
2019-03-29  9:31   ` Auger Eric
2019-03-29  9:41     ` Shameerali Kolothum Thodi
2019-03-29 13:41       ` Auger Eric
2019-03-29  9:59     ` Shameerali Kolothum Thodi
2019-03-29 13:12       ` Auger Eric
2019-03-29 13:14         ` Ard Biesheuvel
2019-03-29 13:56           ` [Qemu-arm] [Qemu-devel] " Auger Eric
2019-03-29 14:08             ` Shameerali Kolothum Thodi
2019-04-01 13:07             ` Laszlo Ersek
2019-04-01 13:07               ` Laszlo Ersek
2019-04-02  7:42               ` Igor Mammedov
2019-04-02  7:42                 ` Igor Mammedov
2019-04-02 10:33                 ` [Qemu-arm] " Laszlo Ersek
2019-04-02 10:33                   ` Laszlo Ersek
2019-04-02 15:42                   ` [Qemu-arm] " Auger Eric
2019-04-02 15:42                     ` Auger Eric
2019-04-02 15:52                     ` Laszlo Ersek
2019-04-02 15:52                       ` Laszlo Ersek
2019-04-02 15:56                       ` [Qemu-arm] " Laszlo Ersek
2019-04-02 15:56                         ` Laszlo Ersek
2019-04-02 16:07                       ` [Qemu-arm] " Auger Eric
2019-04-02 16:07                         ` Auger Eric
2019-04-02 14:26               ` [Qemu-arm] " Shameerali Kolothum Thodi
2019-04-02 14:26                 ` Shameerali Kolothum Thodi
2019-04-02 15:29               ` [Qemu-arm] " Auger Eric
2019-04-02 15:29                 ` Auger Eric
2019-04-02 15:38                 ` Laszlo Ersek
2019-04-02 15:38                   ` Laszlo Ersek
2019-04-02 15:50                   ` Auger Eric
2019-04-02 15:50                     ` Auger Eric
2019-04-03  9:49                   ` [Qemu-arm] " Igor Mammedov
2019-04-03  9:49                     ` Igor Mammedov
2019-04-03 12:10                     ` [Qemu-arm] " Shameerali Kolothum Thodi
2019-04-03 12:10                       ` Shameerali Kolothum Thodi
2019-04-03 13:29                       ` [Qemu-arm] " Laszlo Ersek
2019-04-03 13:29                         ` Laszlo Ersek
2019-04-03 16:25                         ` [Qemu-arm] " Shameerali Kolothum Thodi
2019-04-03 16:25                           ` Shameerali Kolothum Thodi
2019-04-08  8:11                           ` [Qemu-arm] " Igor Mammedov
2019-04-08  8:11                             ` Igor Mammedov
2019-04-08  8:11                             ` Igor Mammedov
2019-04-09 10:43                             ` [Qemu-arm] " Shameerali Kolothum Thodi
2019-04-09 10:43                               ` Shameerali Kolothum Thodi
2019-04-09 10:43                               ` Shameerali Kolothum Thodi
2019-04-03 13:19                     ` [Qemu-arm] " Laszlo Ersek
2019-04-03 13:19                       ` Laszlo Ersek
2019-04-08  8:13                       ` [Qemu-arm] " Igor Mammedov
2019-04-08  8:13                         ` Igor Mammedov
2019-04-08  8:13                         ` Igor Mammedov
2019-04-02  8:39             ` [Qemu-arm] " Peter Maydell
2019-04-02  8:39               ` Peter Maydell
2019-03-21 10:47 ` [Qemu-arm] [PATCH v3 08/10] hw/arm/boot: Expose the PC-DIMM nodes in the DT Shameer Kolothum
2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 09/10] hw/acpi: Add ACPI Generic Event Device Support Shameer Kolothum
2019-03-29 13:09   ` [Qemu-arm] " Auger Eric
2019-03-29 13:44     ` Shameerali Kolothum Thodi
2019-03-21 10:47 ` [Qemu-devel] [PATCH v3 10/10] hw/arm/virt: Init GED device and enable memory hotplug Shameer Kolothum
2019-03-29 14:16   ` [Qemu-arm] " Auger Eric
2019-03-21 11:06 ` [Qemu-arm] [Qemu-devel] [PATCH v3 00/10] ARM virt: ACPI memory hotplug support no-reply
