From: Laszlo Ersek <lersek@redhat.com>
To: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>,
	qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	eric.auger@redhat.com, imammedo@redhat.com
Cc: peter.maydell@linaro.org, shannon.zhaosl@gmail.com,
	sameo@linux.intel.com, sebastien.boeuf@intel.com,
	xuwei5@hisilicon.com, ard.biesheuvel@linaro.org,
	linuxarm@huawei.com
Subject: Re: [Qemu-devel] [PATCH v4 8/8] hw/arm/boot: Expose the PC-DIMM nodes in the DT
Date: Tue, 9 Apr 2019 17:08:57 +0200
Message-ID: <4f3df83f-8d45-09d0-ec9e-0ddf843fd3a4@redhat.com>
In-Reply-To: <20190409102935.28292-9-shameerali.kolothum.thodi@huawei.com>

On 04/09/19 12:29, Shameer Kolothum wrote:
> This patch adds memory nodes corresponding to PC-DIMM regions.
> This will enable support for cold plugged device memory for Guests
> with DT boot.
> 
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  hw/arm/boot.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 42 insertions(+)
> 
> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> index 8c840ba..150e1ed 100644
> --- a/hw/arm/boot.c
> +++ b/hw/arm/boot.c
> @@ -19,6 +19,7 @@
>  #include "sysemu/numa.h"
>  #include "hw/boards.h"
>  #include "hw/loader.h"
> +#include "hw/mem/memory-device.h"
>  #include "elf.h"
>  #include "sysemu/device_tree.h"
>  #include "qemu/config-file.h"
> @@ -538,6 +539,41 @@ static void fdt_add_psci_node(void *fdt)
>      qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);
>  }
>  
> +static int fdt_add_hotpluggable_memory_nodes(void *fdt,
> +                                             uint32_t acells, uint32_t scells) {
> +    MemoryDeviceInfoList *info, *info_list = qmp_memory_device_list();
> +    MemoryDeviceInfo *mi;
> +    int ret = 0;
> +
> +    for (info = info_list; info != NULL; info = info->next) {
> +        mi = info->value;
> +        switch (mi->type) {
> +        case MEMORY_DEVICE_INFO_KIND_DIMM:
> +        {
> +            PCDIMMDeviceInfo *di = mi->u.dimm.data;
> +
> +            ret = fdt_add_memory_node(fdt, acells, di->addr, scells,
> +                                      di->size, di->node, true);
> +            if (ret) {
> +                fprintf(stderr,
> +                        "couldn't add PCDIMM /memory@%"PRIx64" node\n",
> +                        di->addr);
> +                goto out;
> +            }
> +            break;
> +        }
> +        default:
> +            fprintf(stderr, "%s memory nodes are not yet supported\n",
> +                    MemoryDeviceInfoKind_str(mi->type));
> +            ret = -ENOENT;
> +            goto out;
> +        }
> +    }
> +out:
> +    qapi_free_MemoryDeviceInfoList(info_list);
> +    return ret;
> +}
> +
>  int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
>                   hwaddr addr_limit, AddressSpace *as)
>  {
> @@ -637,6 +673,12 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
>          }
>      }
>  
> +    rc = fdt_add_hotpluggable_memory_nodes(fdt, acells, scells);
> +    if (rc < 0) {
> +        fprintf(stderr, "couldn't add hotpluggable memory nodes\n");
> +        goto fail;
> +    }
> +
>      rc = fdt_path_offset(fdt, "/chosen");
>      if (rc < 0) {
>          qemu_fdt_add_subnode(fdt, "/chosen");
> 


Given patches #7 and #8, as I understand them, the firmware cannot distinguish "hotpluggable & present" from "hotpluggable & absent"; it can only skip both hotpluggable cases. That's fine in that the firmware will hog neither type -- but is that OK for the OS as well, for both ACPI boot and DT boot?

Consider in particular the "hotpluggable & present, ACPI boot" case. Assuming we modify the firmware to skip "hotpluggable" altogether, the UEFI memmap will not include the range despite it being present at boot. Presumably, ACPI will refer to the range somehow, however. Will that not confuse the OS?

When Igor raised this earlier, I suggested that hotpluggable-and-present should be added by the firmware, but also allocated immediately, as EfiBootServicesData type memory. This will prevent other drivers in the firmware from allocating AcpiNVS or Reserved chunks from the same memory range, the UEFI memmap will contain the range as EfiBootServicesData, and then the OS can release that allocation in one go early during boot.
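
To make the suggestion above concrete, here is a minimal sketch in plain C. It is not edk2 or QEMU code: the memory map is a toy array, and every name in it (`ToyMemMap`, `toy_fw_add_range`, `toy_fw_can_allocate_from`, `toy_os_release_boot_services`) is hypothetical and invented for illustration. The only thing it assumes from UEFI is that boot-services data can be reclaimed by the OS once boot services have exited.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for two UEFI memory types (illustrative only). */
typedef enum {
    TOY_CONVENTIONAL,       /* free; firmware drivers may allocate from it */
    TOY_BOOT_SERVICES_DATA  /* already allocated; reclaimable by the OS    */
} ToyMemType;

typedef struct {
    uint64_t base;
    uint64_t size;
    bool hotpluggable;
    ToyMemType type;
} ToyRange;

#define TOY_MAX_RANGES 8

typedef struct {
    ToyRange ranges[TOY_MAX_RANGES];
    int count;
} ToyMemMap;

/*
 * Firmware side: add a range that is present at boot.
 * A hotpluggable range is added AND marked as allocated right away, so
 * no other firmware driver can carve AcpiNVS or Reserved chunks from it.
 * Returns 0 on success, -1 if the toy map is full.
 */
int toy_fw_add_range(ToyMemMap *map, uint64_t base, uint64_t size,
                     bool hotpluggable)
{
    assert(map != NULL);
    if (map->count >= TOY_MAX_RANGES) {
        return -1;
    }
    ToyRange *r = &map->ranges[map->count++];
    r->base = base;
    r->size = size;
    r->hotpluggable = hotpluggable;
    r->type = hotpluggable ? TOY_BOOT_SERVICES_DATA : TOY_CONVENTIONAL;
    return 0;
}

/* Firmware allocator check: only conventional memory may be handed out. */
bool toy_fw_can_allocate_from(const ToyRange *r)
{
    return r->type == TOY_CONVENTIONAL;
}

/*
 * OS side, early during boot: release every boot-services allocation in
 * one go. Returns the number of ranges that were released.
 */
int toy_os_release_boot_services(ToyMemMap *map)
{
    int released = 0;
    for (int i = 0; i < map->count; i++) {
        if (map->ranges[i].type == TOY_BOOT_SERVICES_DATA) {
            map->ranges[i].type = TOY_CONVENTIONAL;
            released++;
        }
    }
    return released;
}
```

In this model a hotpluggable-and-present range still appears in the map, is off limits to firmware allocations, and becomes ordinary memory after the OS-side release. Whether Linux actually expects this behavior is exactly the open question.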

But this really has to be checked against the Linux kernel's expectations. Please formalize all of the following cases:

OS boot (DT/ACPI)  hotpluggable & ...  GetMemoryMap() should report as  DT/ACPI should report as
-----------------  ------------------  -------------------------------  ------------------------
DT                 present             ?                                ?
DT                 absent              ?                                ?
ACPI               present             ?                                ?
ACPI               absent              ?                                ?

Again, this table is dictated by Linux.

Thanks
Laszlo
