All of lore.kernel.org
 help / color / mirror / Atom feed
From: Shameerali Kolothum Thodi via <qemu-arm@nongnu.org>
To: Jonathan Cameron <jonathan.cameron@huawei.com>,
	Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Peter Maydell <peter.maydell@linaro.org>,
	"qemu-arm@nongnu.org" <qemu-arm@nongnu.org>,
	Gustavo Romero <gustavo.romero@linaro.org>,
	"imammedo@redhat.com" <imammedo@redhat.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>
Subject: RE: [PATCH] hw/arm: add static NVDIMMs in device tree
Date: Thu, 31 Jul 2025 11:14:40 +0000	[thread overview]
Message-ID: <e8203af151ea4f9696b809dd5de6b155@huawei.com> (raw)
In-Reply-To: <20250731110036.00003a0a@huawei.com>



> -----Original Message-----
> From: Jonathan Cameron <jonathan.cameron@huawei.com>
> Sent: Thursday, July 31, 2025 11:01 AM
> To: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
> Cc: qemu-devel@nongnu.org; Peter Maydell <peter.maydell@linaro.org>;
> qemu-arm@nongnu.org; Gustavo Romero <gustavo.romero@linaro.org>;
> Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Subject: Re: [PATCH] hw/arm: add static NVDIMMs in device tree
> 
> On Wed, 30 Jul 2025 15:21:41 +0300
> Manos Pitsidianakis <manos.pitsidianakis@linaro.org> wrote:
> 
> > NVDIMM is used for fast rootfs with EROFS, for example by kata
> > containers. To allow booting with static NVDIMM memory, add them to
> > the device tree in arm virt machine.
> >
> > This allows users to boot directly with nvdimm memory devices without
> > having to rely on ACPI and hotplug.
> >
> > Verified to work with command invocation:
> >
> > ./qemu-system-aarch64 \
> >   -M virt,nvdimm=on \
> >   -cpu cortex-a57 \
> >   -m 4G,slots=2,maxmem=8G \
> >   -object memory-backend-file,id=mem1,share=on,mem-
> path=/tmp/nvdimm,size=4G,readonly=off \
> >   -device nvdimm,id=nvdimm1,memdev=mem1,unarmed=off \
> >   -drive file=./debian-12-nocloud-arm64-commited.qcow2,format=qcow2 \
> >   -kernel ./vmlinuz-6.1.0-13-arm64 \
> >   -append "root=/dev/vda1 console=ttyAMA0,115200 acpi=off"
> >   -initrd ./initrd.img-6.1.0-13-arm64 \
> >   -nographic \
> >   -serial mon:stdio
> >
> > Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
> 
> +CC shameer who might be able to remember how the nvdimm stuff works
> in
> +ACPI better
> than I can.  I think this is fine but more eyes would be good.

The cold plug DT support was part of the initial NVDIMM series,
https://lore.kernel.org/qemu-devel/20191004155302.4632-5-shameerali.kolothum.thodi@huawei.com/

But I can't remember the reason for dropping it, other than the comment from
Igor, that why we should do it for NVDIMM but not PC-DIMM.
https://lore.kernel.org/qemu-devel/20191111154627.63fc061b@redhat.com/

So, I guess there was not a strong use case for that at that time.

The PC-DIMM DT cold plug was dropped due to the issues/obstacles mentioned here,
https://lore.kernel.org/qemu-devel/5FC3163CFD30C246ABAA99954A238FA83F1B6A66@lhreml524-mbs.china.huawei.com/

+CC: Igor and Eric.

Thanks,
Shameer

> > ---
> >  hw/arm/boot.c | 39 +++++++++++++++++++++++++++++++++++++++
> >  hw/arm/virt.c |  8 +++++---
> >  2 files changed, 44 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/arm/boot.c b/hw/arm/boot.c index
> >
> d391cd01bb1b92ff213e69b84e5a69413b36c4f8..a0c1bcdf946ca98bb5da63f1
> a518
> > 018eb578dd81 100644
> > --- a/hw/arm/boot.c
> > +++ b/hw/arm/boot.c
> > @@ -25,6 +25,7 @@
> >  #include "hw/boards.h"
> >  #include "system/reset.h"
> >  #include "hw/loader.h"
> > +#include "hw/mem/memory-device.h"
> >  #include "elf.h"
> >  #include "system/device_tree.h"
> >  #include "qemu/config-file.h"
> > @@ -515,6 +516,26 @@ static void fdt_add_psci_node(void *fdt,
> ARMCPU *armcpu)
> >      qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);  }
> >
> > +static int fdt_add_pmem_node(void *fdt, uint32_t acells, uint32_t scells,
> > +                             int64_t mem_base, int64_t size, int64_t
> > +node) {
> > +    int ret;
> > +
> > +    g_autofree char *nodename = g_strdup_printf("/pmem@%" PRIx64,
> > + mem_base);
> > +
> > +    qemu_fdt_add_subnode(fdt, nodename);
> > +    qemu_fdt_setprop_string(fdt, nodename, "compatible", "pmem-
> region");
> > +    ret = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", acells,
> > +                                       mem_base, scells, size);
> 
> I'd burn some lines to avoid a comment covering unrelated ret handling
> 
> 	if (ret)
> 		return ret;
> 
> 	if (node >= 0) {
> 		return qem_fdt_setprop_cell()
> 	}
> 
> 	return 0;
> 
> > +    /* only set the NUMA ID if it is specified */
> > +    if (!ret && node >= 0) {
> > +        ret = qemu_fdt_setprop_cell(fdt, nodename, "numa-node-id",
> > +                                    node);
> > +    }
> > +
> > +    return ret;
> > +}
> > +
> >  int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
> >                   hwaddr addr_limit, AddressSpace *as, MachineState *ms,
> >                   ARMCPU *cpu)
> > @@ -525,6 +546,7 @@ int arm_load_dtb(hwaddr addr, const struct
> arm_boot_info *binfo,
> >      unsigned int i;
> >      hwaddr mem_base, mem_len;
> >      char **node_path;
> > +    g_autofree MemoryDeviceInfoList *md_list = NULL;
> >      Error *err = NULL;
> >
> >      if (binfo->dtb_filename) {
> > @@ -628,6 +650,23 @@ int arm_load_dtb(hwaddr addr, const struct
> arm_boot_info *binfo,
> >          }
> >      }
> >
> > +    md_list = qmp_memory_device_list();
> > +    for (MemoryDeviceInfoList *m = md_list; m != NULL; m = m->next) {
> > +        MemoryDeviceInfo *mi = m->value;
> > +
> > +        if (mi->type == MEMORY_DEVICE_INFO_KIND_NVDIMM) {
> > +            PCDIMMDeviceInfo *di = mi->u.nvdimm.data;
> > +
> > +            rc = fdt_add_pmem_node(fdt, acells, scells,
> > +                                   di->addr, di->size, di->node);
> > +            if (rc < 0) {
> > +                fprintf(stderr, "couldn't add NVDIMM /pmem@%"PRIx64"
> node\n",
> > +                        di->addr);
> > +                goto fail;
> > +            }
> > +        }
> > +    }
> > +
> >      rc = fdt_path_offset(fdt, "/chosen");
> >      if (rc < 0) {
> >          qemu_fdt_add_subnode(fdt, "/chosen"); diff --git
> > a/hw/arm/virt.c b/hw/arm/virt.c index
> >
> ef6be3660f5fb38da84235c32dc2d13a5c61889c..910f5bb5f66ee217a9140f912
> 880
> > 4a5b9f69b5b6 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -2917,7 +2917,7 @@ static void
> virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >      const MachineState *ms = MACHINE(hotplug_dev);
> >      const bool is_nvdimm = object_dynamic_cast(OBJECT(dev),
> > TYPE_NVDIMM);
> >
> > -    if (!vms->acpi_dev) {
> > +    if (!vms->acpi_dev && !(is_nvdimm && !dev->hotplugged)) {
> >          error_setg(errp,
> >                     "memory hotplug is not enabled: missing acpi-ged device");
> >          return;
> > @@ -2949,8 +2949,10 @@ static void virt_memory_plug(HotplugHandler
> *hotplug_dev,
> >          nvdimm_plug(ms->nvdimms_state);
> >      }
> >
> > -    hotplug_handler_plug(HOTPLUG_HANDLER(vms->acpi_dev),
> > -                         dev, &error_abort);
> > +    if (vms->acpi_dev) {
> > +        hotplug_handler_plug(HOTPLUG_HANDLER(vms->acpi_dev),
> > +                             dev, &error_abort);
> > +    }
> >  }
> >
> >  static void virt_machine_device_pre_plug_cb(HotplugHandler
> > *hotplug_dev,
> >
> > ---
> > base-commit: 9b80226ece693197af8a981b424391b68b5bc38e
> > change-id: 20250730-nvdimm_arm64_virt-931a764bbe0c
> >
> > --
> > γαῖα πυρί μιχθήτω
> >
> >


WARNING: multiple messages have this Message-ID (diff)
From: Shameerali Kolothum Thodi via <qemu-devel@nongnu.org>
To: Jonathan Cameron <jonathan.cameron@huawei.com>,
	Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Peter Maydell <peter.maydell@linaro.org>,
	"qemu-arm@nongnu.org" <qemu-arm@nongnu.org>,
	Gustavo Romero <gustavo.romero@linaro.org>,
	"imammedo@redhat.com" <imammedo@redhat.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>
Subject: RE: [PATCH] hw/arm: add static NVDIMMs in device tree
Date: Thu, 31 Jul 2025 11:14:40 +0000	[thread overview]
Message-ID: <e8203af151ea4f9696b809dd5de6b155@huawei.com> (raw)
In-Reply-To: <20250731110036.00003a0a@huawei.com>



> -----Original Message-----
> From: Jonathan Cameron <jonathan.cameron@huawei.com>
> Sent: Thursday, July 31, 2025 11:01 AM
> To: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
> Cc: qemu-devel@nongnu.org; Peter Maydell <peter.maydell@linaro.org>;
> qemu-arm@nongnu.org; Gustavo Romero <gustavo.romero@linaro.org>;
> Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Subject: Re: [PATCH] hw/arm: add static NVDIMMs in device tree
> 
> On Wed, 30 Jul 2025 15:21:41 +0300
> Manos Pitsidianakis <manos.pitsidianakis@linaro.org> wrote:
> 
> > NVDIMM is used for fast rootfs with EROFS, for example by kata
> > containers. To allow booting with static NVDIMM memory, add them to
> > the device tree in arm virt machine.
> >
> > This allows users to boot directly with nvdimm memory devices without
> > having to rely on ACPI and hotplug.
> >
> > Verified to work with command invocation:
> >
> > ./qemu-system-aarch64 \
> >   -M virt,nvdimm=on \
> >   -cpu cortex-a57 \
> >   -m 4G,slots=2,maxmem=8G \
> >   -object memory-backend-file,id=mem1,share=on,mem-
> path=/tmp/nvdimm,size=4G,readonly=off \
> >   -device nvdimm,id=nvdimm1,memdev=mem1,unarmed=off \
> >   -drive file=./debian-12-nocloud-arm64-commited.qcow2,format=qcow2 \
> >   -kernel ./vmlinuz-6.1.0-13-arm64 \
> >   -append "root=/dev/vda1 console=ttyAMA0,115200 acpi=off"
> >   -initrd ./initrd.img-6.1.0-13-arm64 \
> >   -nographic \
> >   -serial mon:stdio
> >
> > Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
> 
> +CC shameer who might be able to remember how the nvdimm stuff works
> in
> +ACPI better
> than I can.  I think this is fine but more eyes would be good.

The cold plug DT support was part of the initial NVDIMM series,
https://lore.kernel.org/qemu-devel/20191004155302.4632-5-shameerali.kolothum.thodi@huawei.com/

But I can't remember the reason for dropping it, other than the comment from
Igor, that why we should do it for NVDIMM but not PC-DIMM.
https://lore.kernel.org/qemu-devel/20191111154627.63fc061b@redhat.com/

So, I guess there was not a strong use case for that at that time.

The PC-DIMM DT cold plug was dropped due to the issues/obstacles mentioned here,
https://lore.kernel.org/qemu-devel/5FC3163CFD30C246ABAA99954A238FA83F1B6A66@lhreml524-mbs.china.huawei.com/

+CC: Igor and Eric.

Thanks,
Shameer

> > ---
> >  hw/arm/boot.c | 39 +++++++++++++++++++++++++++++++++++++++
> >  hw/arm/virt.c |  8 +++++---
> >  2 files changed, 44 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/arm/boot.c b/hw/arm/boot.c index
> >
> d391cd01bb1b92ff213e69b84e5a69413b36c4f8..a0c1bcdf946ca98bb5da63f1
> a518
> > 018eb578dd81 100644
> > --- a/hw/arm/boot.c
> > +++ b/hw/arm/boot.c
> > @@ -25,6 +25,7 @@
> >  #include "hw/boards.h"
> >  #include "system/reset.h"
> >  #include "hw/loader.h"
> > +#include "hw/mem/memory-device.h"
> >  #include "elf.h"
> >  #include "system/device_tree.h"
> >  #include "qemu/config-file.h"
> > @@ -515,6 +516,26 @@ static void fdt_add_psci_node(void *fdt,
> ARMCPU *armcpu)
> >      qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);  }
> >
> > +static int fdt_add_pmem_node(void *fdt, uint32_t acells, uint32_t scells,
> > +                             int64_t mem_base, int64_t size, int64_t
> > +node) {
> > +    int ret;
> > +
> > +    g_autofree char *nodename = g_strdup_printf("/pmem@%" PRIx64,
> > + mem_base);
> > +
> > +    qemu_fdt_add_subnode(fdt, nodename);
> > +    qemu_fdt_setprop_string(fdt, nodename, "compatible", "pmem-
> region");
> > +    ret = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", acells,
> > +                                       mem_base, scells, size);
> 
> I'd burn some lines to avoid a comment covering unrelated ret handling
> 
> 	if (ret)
> 		return ret;
> 
> 	if (node >= 0) {
> 		return qem_fdt_setprop_cell()
> 	}
> 
> 	return 0;
> 
> > +    /* only set the NUMA ID if it is specified */
> > +    if (!ret && node >= 0) {
> > +        ret = qemu_fdt_setprop_cell(fdt, nodename, "numa-node-id",
> > +                                    node);
> > +    }
> > +
> > +    return ret;
> > +}
> > +
> >  int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
> >                   hwaddr addr_limit, AddressSpace *as, MachineState *ms,
> >                   ARMCPU *cpu)
> > @@ -525,6 +546,7 @@ int arm_load_dtb(hwaddr addr, const struct
> arm_boot_info *binfo,
> >      unsigned int i;
> >      hwaddr mem_base, mem_len;
> >      char **node_path;
> > +    g_autofree MemoryDeviceInfoList *md_list = NULL;
> >      Error *err = NULL;
> >
> >      if (binfo->dtb_filename) {
> > @@ -628,6 +650,23 @@ int arm_load_dtb(hwaddr addr, const struct
> arm_boot_info *binfo,
> >          }
> >      }
> >
> > +    md_list = qmp_memory_device_list();
> > +    for (MemoryDeviceInfoList *m = md_list; m != NULL; m = m->next) {
> > +        MemoryDeviceInfo *mi = m->value;
> > +
> > +        if (mi->type == MEMORY_DEVICE_INFO_KIND_NVDIMM) {
> > +            PCDIMMDeviceInfo *di = mi->u.nvdimm.data;
> > +
> > +            rc = fdt_add_pmem_node(fdt, acells, scells,
> > +                                   di->addr, di->size, di->node);
> > +            if (rc < 0) {
> > +                fprintf(stderr, "couldn't add NVDIMM /pmem@%"PRIx64"
> node\n",
> > +                        di->addr);
> > +                goto fail;
> > +            }
> > +        }
> > +    }
> > +
> >      rc = fdt_path_offset(fdt, "/chosen");
> >      if (rc < 0) {
> >          qemu_fdt_add_subnode(fdt, "/chosen"); diff --git
> > a/hw/arm/virt.c b/hw/arm/virt.c index
> >
> ef6be3660f5fb38da84235c32dc2d13a5c61889c..910f5bb5f66ee217a9140f912
> 880
> > 4a5b9f69b5b6 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -2917,7 +2917,7 @@ static void
> virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >      const MachineState *ms = MACHINE(hotplug_dev);
> >      const bool is_nvdimm = object_dynamic_cast(OBJECT(dev),
> > TYPE_NVDIMM);
> >
> > -    if (!vms->acpi_dev) {
> > +    if (!vms->acpi_dev && !(is_nvdimm && !dev->hotplugged)) {
> >          error_setg(errp,
> >                     "memory hotplug is not enabled: missing acpi-ged device");
> >          return;
> > @@ -2949,8 +2949,10 @@ static void virt_memory_plug(HotplugHandler
> *hotplug_dev,
> >          nvdimm_plug(ms->nvdimms_state);
> >      }
> >
> > -    hotplug_handler_plug(HOTPLUG_HANDLER(vms->acpi_dev),
> > -                         dev, &error_abort);
> > +    if (vms->acpi_dev) {
> > +        hotplug_handler_plug(HOTPLUG_HANDLER(vms->acpi_dev),
> > +                             dev, &error_abort);
> > +    }
> >  }
> >
> >  static void virt_machine_device_pre_plug_cb(HotplugHandler
> > *hotplug_dev,
> >
> > ---
> > base-commit: 9b80226ece693197af8a981b424391b68b5bc38e
> > change-id: 20250730-nvdimm_arm64_virt-931a764bbe0c
> >
> > --
> > γαῖα πυρί μιχθήτω
> >
> >


  reply	other threads:[~2025-07-31 11:16 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-30 12:21 [PATCH] hw/arm: add static NVDIMMs in device tree Manos Pitsidianakis
2025-07-31  8:38 ` Philippe Mathieu-Daudé
2025-07-31  9:42   ` Manos Pitsidianakis
2025-07-31 10:00 ` Jonathan Cameron via
2025-07-31 10:00   ` Jonathan Cameron via
2025-07-31 11:14   ` Shameerali Kolothum Thodi via [this message]
2025-07-31 11:14     ` Shameerali Kolothum Thodi via
2025-07-31 12:20     ` Manos Pitsidianakis
2025-08-01 11:03       ` Igor Mammedov
2025-08-01 11:35         ` Peter Maydell
2025-08-01  6:58     ` Gao Xiang
2025-08-20  4:10       ` Gao Xiang
2025-08-20  9:29         ` David Hildenbrand
2025-08-20 10:11           ` Gao Xiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e8203af151ea4f9696b809dd5de6b155@huawei.com \
    --to=qemu-arm@nongnu.org \
    --cc=eric.auger@redhat.com \
    --cc=gustavo.romero@linaro.org \
    --cc=imammedo@redhat.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=manos.pitsidianakis@linaro.org \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=shameerali.kolothum.thodi@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.