From: Igor Mammedov <imammedo@redhat.com>
To: Auger Eric <eric.auger@redhat.com>
Cc: peter.maydell@linaro.org, drjones@redhat.com, david@redhat.com,
dgilbert@redhat.com, shameerali.kolothum.thodi@huawei.com,
qemu-devel@nongnu.org, qemu-arm@nongnu.org,
eric.auger.pro@gmail.com, david@gibson.dropbear.id.au
Subject: Re: [Qemu-devel] [PATCH v7 00/17] ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support
Date: Wed, 27 Feb 2019 11:10:25 +0100
Message-ID: <20190227111025.4bb39cc7@redhat.com>
In-Reply-To: <c68ba5a2-0d24-c1ad-c2af-c957768abf9f@redhat.com>
On Tue, 26 Feb 2019 18:53:24 +0100
Auger Eric <eric.auger@redhat.com> wrote:
> Hi Igor,
>
> On 2/26/19 5:56 PM, Igor Mammedov wrote:
> > On Tue, 26 Feb 2019 14:11:58 +0100
> > Auger Eric <eric.auger@redhat.com> wrote:
> >
> >> Hi Igor,
> >>
> >> On 2/26/19 9:40 AM, Auger Eric wrote:
> >>> Hi Igor,
> >>>
> >>> On 2/25/19 10:42 AM, Igor Mammedov wrote:
> >>>> On Fri, 22 Feb 2019 18:35:26 +0100
> >>>> Auger Eric <eric.auger@redhat.com> wrote:
> >>>>
> >>>>> Hi Igor,
> >>>>>
> >>>>> On 2/22/19 5:27 PM, Igor Mammedov wrote:
> >>>>>> On Wed, 20 Feb 2019 23:39:46 +0100
> >>>>>> Eric Auger <eric.auger@redhat.com> wrote:
> >>>>>>
> >>>>>>> This series aims to bump the 255GB RAM limit in machvirt and to
> >>>>>>> support device memory in general, and especially PCDIMM/NVDIMM.
> >>>>>>>
> >>>>>>> In machvirt versions < 4.0, the initial RAM starts at 1GB and can
> >>>>>>> grow up to 255GB. From 256GB onwards we find IO regions such as the
> >>>>>>> additional GICv3 RDIST region, high PCIe ECAM region and high PCIe
> >>>>>>> MMIO region. The address map was 1TB large. This corresponded to
> >>>>>>> the max IPA capacity KVM was able to manage.
> >>>>>>>
> >>>>>>> Since 4.20, the host kernel is able to support a larger and dynamic
> >>>>>>> IPA range. So the guest physical address can go beyond the 1TB. The
> >>>>>>> max GPA size depends on the host kernel configuration and physical CPUs.
> >>>>>>>
> >>>>>>> In this series we use this feature and allow the RAM to grow without
> >>>>>>> any other limit than the one put by the host kernel.
> >>>>>>>
> >>>>>>> The RAM still starts at 1GB. First comes the initial ram (-m) of size
> >>>>>>> ram_size and then comes the device memory (,maxmem) of size
> >>>>>>> maxram_size - ram_size. The device memory is potentially hotpluggable
> >>>>>>> depending on the instantiated memory objects.
> >>>>>>>
> >>>>>>> IO regions previously located between 256GB and 1TB are moved after
> >>>>>>> the RAM. Their offset is dynamically computed, depends on ram_size
> >>>>>>> and maxram_size. Size alignment is enforced.
> >>>>>>>
> >>>>>>> In case maxmem value is inferior to 255GB, the legacy memory map
> >>>>>>> still is used. The change of memory map becomes effective from 4.0
> >>>>>>> onwards.
> >>>>>>>
> >>>>>>> As we keep the initial RAM at 1GB base address, we do not need to do
> >>>>>>> invasive changes in the EDK2 FW. It seems nobody is eager to do
> >>>>>>> that job at the moment.
> >>>>>>>
> >>>>>>> Device memory being put just after the initial RAM, it is possible
> >>>>>>> to get access to this feature while keeping a 1TB address map.
> >>>>>>>
> >>>>>>> This series reuses/rebases patches initially submitted by Shameer
> >>>>>>> in [1] and Kwangwoo in [2] for the PC-DIMM and NV-DIMM parts.
> >>>>>>>
> >>>>>>> Functionally, the series is split into 3 parts:
> >>>>>>> 1) bump of the initial RAM limit [1 - 9] and change in
> >>>>>>> the memory map
> >>>>>>
> >>>>>>> 2) Support of PC-DIMM [10 - 13]
> >>>>>> Is this part complete ACPI wise (for coldplug)? I haven't noticed
> >>>>>> DSDT AML here nor E820 changes, so ACPI wise pc-dimm shouldn't be
> >>>>>> visible to the guest. It might be that DT is masking the problem,
> >>>>>> but that won't work on ACPI only guests.
> >>>>>
> >>>>> guest /proc/meminfo or "lshw -class memory" reflects the amount of mem
> >>>>> added with the DIMM slots.
> >>>> Question is how does it get there? Does it come from DT or from firmware
> >>>> via UEFI interfaces?
> >>>>
> >>>>> So it looks fine to me. Isn't E820 a pure x86 matter?
> >>>> Sorry for being misleading, I meant UEFI GetMemoryMap().
> >>>> On x86, I'm wary of adding PC-DIMMs to E820 which then gets exposed
> >>>> via UEFI GetMemoryMap() as guest kernel might start using it as normal
> >>>> memory early at boot and later put that memory into zone normal and hence
> >>>> make it non-hot-un-pluggable. The same concerns apply to DT based means
> >>>> of discovery.
> >>>> (That's guest issue but it's easy to workaround it not putting hotpluggable
> >>>> memory into UEFI GetMemoryMap() or DT and let DSDT describe it properly)
> >>>> That way memory doesn't get (ab)used by firmware or early boot kernel stages
> >>>> and doesn't get locked up.
> >>>>
> >>>>> What else would you expect in the dsdt?
> >>>> Memory device descriptions, look for code that adds PNP0C80 with _CRS
> >>>> describing memory ranges
> >>>
> >>> OK thank you for the explanations. I will work on PNP0C80 addition then.
> >>> Does it mean that in ACPI mode we must not output DT hotplug memory
> >>> nodes or assuming that PNP0C80 is properly described, it will "override"
> >>> DT description?
> >>
> >> After further investigations, I think the pieces you pointed out are
> >> added by Shameer's series, ie. through the build_memory_hotplug_aml()
> >> call. So I suggest we separate the concerns: this series brings support
> >> for DIMM coldplug. hotplug, including all the relevant ACPI structures
> >> will be added later on by Shameer.
> >
> > Maybe we should not put pc-dimms in DT for this series until it gets clear
> > if it doesn't conflict with ACPI in some way.
>
> I guess you mean removing the DT hotpluggable memory nodes only in ACPI
> mode? Otherwise you simply remove the DIMM feature, right?
Something like that, so DT won't conflict with ACPI.
Only we don't have a switch for it, something like -machine fdt=on (with default off).
> I double checked and if you remove the hotpluggable memory DT nodes in
> ACPI mode:
> - you do not see the PCDIMM slots in guest /proc/meminfo anymore. So I
> guess you're right, if the DT nodes are available, that memory is
> considered as not unpluggable by the guest.
> - You can see the NVDIMM slots using ndctl list -u. You can mount a DAX
> system.
>
> Hotplug/unplug is clearly not supported by this series and any attempt
> results in "memory hotplug is not supported". Is it really an issue if
> the guest does not consider DIMM slots as not hot-unpluggable memory? I
> am not even sure the guest kernel would support to unplug that memory.
>
> In case we want all ACPI tables to be ready for making this memory seen
> as hot-unpluggable we need some Shameer's patches on top of this series.
Maybe we should push for this way (into 4.0), it's just several patches
after all, or even merge them into your series (I'd guess they would need to be
rebased on top of your latest work).
> Also don't DIMM slots already make sense in DT mode. Usually we accept
> to add one feature in DT and then in ACPI. For instance we can benefit
Usually they don't conflict with each other (at least I'm not aware of it),
but I see a problem in this case.
> from nvdimm in dt mode right? So, considering an incremental approach I
> would be in favour of keeping the DT nodes.
I'd guess it is the same as for DIMMs; ACPI support for NVDIMMs is much
more versatile.
I consider the target application of arm/virt to be a board that, in most
use cases, runs generic ACPI-capable guests in production, with various
DT-only guests as secondary ones. It's hard to make both use cases happy
with the defaults (that's probably one of the reasons why the 'sbsa' board
is being added).
So I'd give priority to ACPI-based arm/virt over DT when defaults are
considered.
> Thanks
>
> Eric
> >
> >
> >
> >
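The layout rules quoted from the cover letter above (RAM at 1GB, device
memory right after the initial RAM, IO regions moved after the RAM with
size alignment, legacy map kept for small maxmem) can be sketched as
follows. This is only an illustration under stated assumptions, not the
code from the series: the function names, the region names and the region
sizes are hypothetical, and the real machvirt code may align or order
things differently.

```python
GiB = 1 << 30
MiB = 1 << 20

RAM_BASE = 1 * GiB                # initial RAM base, per the cover letter
LEGACY_MAXMEM = 255 * GiB         # below this, the legacy map is kept
LEGACY_HIGH_IO_BASE = 256 * GiB   # legacy start of the high IO regions


def align_up(value, alignment):
    """Round value up to a multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def layout(ram_size, maxram_size, io_regions):
    """Return (device_memory_base, device_memory_size, io_bases).

    io_regions is an ordered list of (name, size) pairs; the sizes are
    assumed to be powers of two so each region can be aligned on its size.
    """
    if maxram_size < ram_size:
        raise ValueError("maxram_size must be >= ram_size")

    # Device memory (,maxmem) comes right after the initial RAM (-m).
    device_memory_base = RAM_BASE + ram_size
    device_memory_size = maxram_size - ram_size

    # Small configurations keep the legacy map; larger ones push the
    # IO regions after the end of RAM + device memory.
    if maxram_size < LEGACY_MAXMEM:
        base = LEGACY_HIGH_IO_BASE
    else:
        base = RAM_BASE + maxram_size

    io_bases = {}
    for name, size in io_regions:
        base = align_up(base, size)   # size alignment is enforced
        io_bases[name] = base
        base += size
    return device_memory_base, device_memory_size, io_bases


# Illustrative region sizes only (assumptions, not taken from this thread).
HIGH_IO = [("redist2", 64 * MiB), ("pcie_ecam", 256 * MiB),
           ("pcie_mmio", 512 * GiB)]

if __name__ == "__main__":
    for ram, maxram in ((4 * GiB, 8 * GiB), (300 * GiB, 400 * GiB)):
        dm_base, dm_size, io = layout(ram, maxram, HIGH_IO)
        print(hex(dm_base), hex(dm_size),
              {name: hex(addr) for name, addr in io.items()})
```

With -m 4G,maxmem=8G the sketch keeps the IO regions at 256GB; with
-m 300G,maxmem=400G they start right after the end of device memory.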