From: Auger Eric <eric.auger@redhat.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: peter.maydell@linaro.org, drjones@redhat.com, david@redhat.com,
qemu-devel@nongnu.org, shameerali.kolothum.thodi@huawei.com,
dgilbert@redhat.com, qemu-arm@nongnu.org,
david@gibson.dropbear.id.au, eric.auger.pro@gmail.com
Subject: Re: [Qemu-devel] [PATCH v7 00/17] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support
Date: Fri, 22 Feb 2019 18:35:26 +0100
Message-ID: <c016024e-9f4b-8953-f492-8802675f7e22@redhat.com>
In-Reply-To: <20190222172742.18c3835a@redhat.com>
Hi Igor,
On 2/22/19 5:27 PM, Igor Mammedov wrote:
> On Wed, 20 Feb 2019 23:39:46 +0100
> Eric Auger <eric.auger@redhat.com> wrote:
>
>> This series aims to bump the 255GB RAM limit in machvirt and to
>> support device memory in general, and especially PCDIMM/NVDIMM.
>>
>> In machvirt versions < 4.0, the initial RAM starts at 1GB and can
>> grow up to 255GB. From 256GB onwards we find IO regions such as the
>> additional GICv3 RDIST region, high PCIe ECAM region and high PCIe
>> MMIO region. The address map was 1TB large. This corresponded to
>> the max IPA capacity KVM was able to manage.
>>
>> Since Linux 4.20, the host kernel is able to support a larger and dynamic
>> IPA range. So the guest physical address can go beyond 1TB. The
>> max GPA size depends on the host kernel configuration and physical CPUs.
>>
>> In this series we use this feature and allow the RAM to grow without
>> any other limit than the one put by the host kernel.
>>
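As a rough illustration of the limit discussed above: a 1TB address map needs 40 IPA bits, so anything placed above 1TB requires the host to offer more. The sketch below only does that arithmetic. `required_ipa_bits` and `fits_host` are hypothetical helper names, not QEMU or KVM functions, and the host limit is passed in as a plain number rather than queried from the kernel.

```python
TiB = 1 << 40

def required_ipa_bits(top_address):
    """Return the number of address bits needed to cover [0, top_address)."""
    if top_address <= 0:
        raise ValueError("top_address must be positive")
    # The highest addressable byte is top_address - 1.
    return (top_address - 1).bit_length()

def fits_host(top_address, host_ipa_bits):
    """True if the guest map fits in the IPA size the host supports.

    host_ipa_bits is an assumed input here; obtaining it from the host
    kernel is outside the scope of this sketch.
    """
    return required_ipa_bits(top_address) <= host_ipa_bits
```

For example, `required_ipa_bits(TiB)` is 40, while a map ending at 2TB needs 41 bits and would not fit a host limited to 40.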
>> The RAM still starts at 1GB. First comes the initial ram (-m) of size
>> ram_size and then comes the device memory (,maxmem) of size
>> maxram_size - ram_size. The device memory is potentially hotpluggable
>> depending on the instantiated memory objects.
>>
>> IO regions previously located between 256GB and 1TB are moved after
>> the RAM. Their offset is dynamically computed, depends on ram_size
>> and maxram_size. Size alignment is enforced.
>>
>> If the maxmem value is below 255GB, the legacy memory map
>> is still used. The change of memory map becomes effective from 4.0
>> onwards.
>>
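The layout described above (initial RAM at 1GB, device memory right after it, then the IO regions, each aligned on its own size) can be sketched as follows. This is only an illustration of the description in the cover letter, not the code in hw/arm/virt.c: the function names, the region names and the region sizes in the usage note are assumptions, and any extra alignment the real implementation applies to the device memory is ignored.

```python
GiB = 1 << 30
MiB = 1 << 20

def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def compute_layout(ram_size, maxram_size, io_regions, ram_base=1 * GiB):
    """Place device memory and IO regions after the initial RAM.

    io_regions is a list of (name, size) pairs (hypothetical names).
    Returns (device_memory_base, device_memory_size, placed), where
    placed maps each region name to its (base, size).
    """
    if maxram_size < ram_size:
        raise ValueError("maxram_size must be >= ram_size")
    device_memory_base = ram_base + ram_size
    device_memory_size = maxram_size - ram_size
    cursor = device_memory_base + device_memory_size
    placed = {}
    for name, size in io_regions:
        # Size alignment: each region starts on a multiple of its size.
        base = align_up(cursor, size)
        placed[name] = (base, size)
        cursor = base + size
    return device_memory_base, device_memory_size, placed
```

With `-m 4G,maxmem=8G` and three illustrative regions of 64MiB, 256MiB and 512GiB, `compute_layout(4 * GiB, 8 * GiB, [("redist2", 64 * MiB), ("ecam", 256 * MiB), ("mmio", 512 * GiB)])` puts device memory at 5GiB and the first IO region at 9GiB.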
>> As we keep the initial RAM at 1GB base address, we do not need to do
>> invasive changes in the EDK2 FW. It seems nobody is eager to do
>> that job at the moment.
>>
>> Device memory being put just after the initial RAM, it is possible
>> to get access to this feature while keeping a 1TB address map.
>>
>> This series reuses/rebases patches initially submitted by Shameer
>> in [0] and Kwangwoo in [1] for the PC-DIMM and NV-DIMM parts.
>>
>> Functionally, the series is split into 3 parts:
>> 1) bump of the initial RAM limit [1 - 9] and change in
>> the memory map
>
>> 2) Support of PC-DIMM [10 - 13]
> Is this part complete ACPI-wise (for coldplug)? I haven't noticed
> DSDT AML here, nor E820 changes, so ACPI-wise pc-dimm shouldn't be
> visible to the guest. It might be that DT is masking the problem,
> but that won't work on ACPI-only guests.
In the guest, /proc/meminfo and "lshw -class memory" reflect the amount of
memory added with the DIMM slots, so it looks fine to me. Isn't E820 a purely
x86 matter? What else would you expect in the DSDT? I understand hotplug
would require extra modifications, but I don't see anything else missing
for coldplug.
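For reference, the /proc/meminfo check mentioned above can be scripted inside the guest. The snippet below is a minimal sketch: `mem_total_kib` is a hypothetical helper, and it only reads the MemTotal line. Note that MemTotal is normally somewhat lower than the configured RAM because the kernel reserves part of it.

```python
def mem_total_kib(meminfo_text):
    """Return the MemTotal value (in kB) from /proc/meminfo contents."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Expected line format: "MemTotal:  <number> kB"
            return int(line.split()[1])
    raise ValueError("MemTotal not found")

def read_guest_mem_total_kib(path="/proc/meminfo"):
    """Read MemTotal from the running (Linux) guest."""
    with open(path) as f:
        return mem_total_kib(f.read())
```

Comparing the value returned before and after adding a DIMM on the command line gives a quick coldplug sanity check.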
> Even though I've tried to make the mem hotplug ACPI parts not x86 specific,
> I'm afraid it might be tightly coupled with hotplug support.
> So here are 2 options: make the DSDT part work without hotplug, or
> implement hotplug here. I think the former is just a waste of time
> and we should just add hotplug. It should take relatively minor effort
> since you already implemented most of the boilerplate here.
Shameer sent an RFC series for supporting hotplug.
[RFC PATCH 0/4] ARM virt: ACPI memory hotplug support
https://patchwork.kernel.org/cover/10783589/
I tested PCDIMM hotplug (with ACPI) this afternoon and it seemed to be
OK, even after system_reset.
Note that kernel support for hotplug on ARM is very recent. I would prefer to
decouple the two efforts if we want a chance of getting coldplug into
4.0. Also, we have an issue with NVDIMM: on reboot the guest does not
boot properly.
>
> As for how to implement the ACPI HW part, I suggest borrowing the GED
> device that the NEMU guys are trying to use, instead of the GPIO route
> we use now for ACPI_POWER_BUTTON_DEVICE to deliver events.
> That way it would be easier to share this with their virt-x86
> machine eventually.
That sounds like a different approach from the one initiated by Shameer?
Thanks
Eric
>
>
>> 3) Support of NV-DIMM [14 - 17]
> The same might be true for NUMA, but I haven't dug that deep into
> that part.
>
>>
>> 1) can be upstreamed before 2 and 2 can be upstreamed before 3.
>>
>> Work is ongoing to transform the whole memory as device memory.
>> However this move is not trivial and, to me, is independent of
>> the improvements brought by this series:
>> - if we were to use DIMM for initial RAM, those DIMMs would
>> use slots. Although they would not be part of the ones provided
>> using the ",slots" options, they are ACPI limited resources.
>> - DT and ACPI descriptions need to be reworked
>> - NUMA integration needs special care
>> - a special device memory object may be required to avoid consuming
>> slots and to ease the FW description.
>>
>> So I preferred to separate the concerns. This new implementation
>> based on device memory could be a candidate for another virt
>> version.
>>
>> Best Regards
>>
>> Eric
>>
>> References:
>>
>> [0] [RFC v2 0/6] hw/arm: Add support for non-contiguous iova regions
>> http://patchwork.ozlabs.org/cover/914694/
>>
>> [1] [RFC PATCH 0/3] add nvdimm support on AArch64 virt platform
>> https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg04599.html
>>
>> This series can be found at:
>> https://github.com/eauger/qemu/tree/v3.1.0-dimm-v7
>>
>> History:
>>
>> v6 -> v7:
>> - Addressed Peter and Igor comments (exceptions sent by email)
>> - Fixed TCG case. Now device memory works also for TCG and vcpu
>> pamax is checked
>> - See individual logs for more details
>>
>> v5 -> v6:
>> - mingw compilation issue fix
>> - kvm_arm_get_max_vm_phys_shift always returns the number of supported
>> IPA bits
>> - new patch "hw/arm/virt: Rename highmem IO regions" that eases the review
>> of "hw/arm/virt: Split the memory map description"
>> - "hw/arm/virt: Move memory map initialization into machvirt_init"
>> squashed into the previous patch
>> - change alignment of IO regions beyond the RAM so that it matches their
>> size
>>
>> v4 -> v5:
>> - change in the memory map
>> - see individual logs
>>
>> v3 -> v4:
>> - rebase on David's "pc-dimm: next bunch of cleanups" and
>> "pc-dimm: pre_plug "slot" and "addr" assignment"
>> - kvm-type option not used anymore. We directly use
>> maxram_size and ram_size machine fields to compute the
>> MAX IPA range. Migration is naturally handled as CLI
>> options are kept between source and destination. This was
>> suggested by David.
>> - device_memory_start and device_memory_size not stored
>> anymore in vms->bootinfo
>> - I did not take into account two of Igor's comments: the one
>> related to the refactoring of arm_load_dtb and the one
>> related to the generation of the dtb after system_reset
>> which would contain nodes of hotplugged devices (we do
>> not support hotplug at this stage)
>> - check the end-user does not attempt to hotplug a device
>> - addition of "vl: Set machine ram_size, maxram_size and
>> ram_slots earlier"
>>
>> v2 -> v3:
>> - fix pc_q35 and pc_piix compilation error
>> - Kwangwoo's email address is no longer valid, so remove it
>>
>> v1 -> v2:
>> - kvm_get_max_vm_phys_shift moved in arch specific file
>> - addition of NVDIMM part
>> - single series
>> - rebase on David's refactoring
>>
>> v1:
>> - was "[RFC 0/6] KVM/ARM: Dynamic and larger GPA size"
>> - was "[RFC 0/5] ARM virt: Support PC-DIMM at 2TB"
>>
>> Best Regards
>>
>> Eric
>>
>>
>> Eric Auger (12):
>> hw/arm/virt: Rename highmem IO regions
>> hw/arm/virt: Split the memory map description
>> hw/boards: Add a MachineState parameter to kvm_type callback
>> kvm: add kvm_arm_get_max_vm_ipa_size
>> vl: Set machine ram_size, maxram_size and ram_slots earlier
>> hw/arm/virt: Dynamic memory map depending on RAM requirements
>> hw/arm/virt: Implement kvm_type function for 4.0 machine
>> hw/arm/virt: Bump the 255GB initial RAM limit
>> hw/arm/virt: Add memory hotplug framework
>> hw/arm/virt: Allocate device_memory
>> hw/arm/boot: Expose the pmem nodes in the DT
>> hw/arm/virt: Add nvdimm and nvdimm-persistence options
>>
>> Kwangwoo Lee (2):
>> nvdimm: use configurable ACPI IO base and size
>> hw/arm/virt: Add nvdimm hot-plug infrastructure
>>
>> Shameer Kolothum (3):
>> hw/arm/boot: introduce fdt_add_memory_node helper
>> hw/arm/boot: Expose the PC-DIMM nodes in the DT
>> hw/arm/virt-acpi-build: Add PC-DIMM in SRAT
>>
>> accel/kvm/kvm-all.c | 2 +-
>> default-configs/arm-softmmu.mak | 4 +
>> hw/acpi/nvdimm.c | 31 ++-
>> hw/arm/boot.c | 136 ++++++++++--
>> hw/arm/virt-acpi-build.c | 23 +-
>> hw/arm/virt.c | 364 ++++++++++++++++++++++++++++----
>> hw/i386/pc_piix.c | 6 +-
>> hw/i386/pc_q35.c | 6 +-
>> hw/ppc/mac_newworld.c | 3 +-
>> hw/ppc/mac_oldworld.c | 2 +-
>> hw/ppc/spapr.c | 2 +-
>> include/hw/arm/virt.h | 24 ++-
>> include/hw/boards.h | 5 +-
>> include/hw/mem/nvdimm.h | 4 +
>> target/arm/kvm.c | 10 +
>> target/arm/kvm_arm.h | 13 ++
>> vl.c | 6 +-
>> 17 files changed, 556 insertions(+), 85 deletions(-)
>>
>
>