From: Igor Mammedov <imammedo@redhat.com>
To: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Cc: xiaoguangrong.eric@gmail.com, mst@redhat.com,
qemu-devel@nongnu.org,
Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>,
Bharata B Rao <bharata.rao@gmail.com>,
qemu-ppc@nongnu.org, David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH v3 2/3] spapr: Add NVDIMM device support
Date: Wed, 11 Dec 2019 09:05:09 +0100
Message-ID: <20191211090509.1845b913@redhat.com>
In-Reply-To: <1c24857f-64d4-a14d-1b66-cae2113d53a2@linux.ibm.com>
On Wed, 11 Dec 2019 09:44:11 +0530
Shivaprasad G Bhat <sbhat@linux.ibm.com> wrote:
> On 12/06/2019 07:22 AM, David Gibson wrote:
> > On Wed, Nov 27, 2019 at 09:50:54AM +0530, Bharata B Rao wrote:
> >> On Fri, Nov 22, 2019 at 10:42 AM David Gibson
> >> <david@gibson.dropbear.id.au> wrote:
> >>> Ok. A number of queries about this.
> >>>
> >>> 1) The PAPR spec for ibm,dynamic-memory-v2 says that the first word in
> >>> each entry is the number of LMBs, but for NVDIMMs you use the
> >>> not-necessarily-equal scm_block_size instead. Does the NVDIMM
> >>> amendment for PAPR really specify to use different block sizes for
> >>> these cases? (In which case that's a really stupid spec decision, but
> >>> that wouldn't surprise me at this point).
> >> SCM block sizes can be different from LMB sizes, but here we enforce
> >> that the SCM device size (excluding metadata) is a multiple of the LMB
> >> size so that we don't end up with a memory range that is not aligned
> >> to the LMB size.
> > Right, but it still doesn't make sense to use scm_block_size when you
> > create the dynamic-memory-v2 property.
>
> Right, I should use the LMB size here, as I will be creating holes to
> disallow DIMMs from claiming those LMBs, marking them INVALID as Bharata
> suggested before.
>
> > As far as the thing
> > interpreting that goes, it *must* be LMB size, not SCM block size. If
> > those are required to be the same at this point, you should use an
> > assert().
>
> The SCM block size should be a multiple of the LMB size; it need not be
> equal. I'll add an assert for that, checking that they are equal. I see
> no benefit as of now in having a larger SCM block size, since the
> bind/unbind are already done before the bind hcall.
>
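For concreteness, a minimal sketch of what that check could look like
(my wording, not the patch; the function name and the scm_block_size
parameter are hypothetical, SPAPR_MEMORY_BLOCK_SIZE is the existing LMB
size constant in hw/ppc/spapr.h):

    #include "qemu/osdep.h"
    #include "qapi/error.h"
    #include "hw/ppc/spapr.h"

    /*
     * Sketch only: reject SCM block sizes that are not a whole multiple
     * of the LMB size, and assert equality until a larger block size is
     * actually needed.
     */
    static void spapr_nvdimm_check_block_size(uint64_t scm_block_size,
                                              Error **errp)
    {
        if (scm_block_size % SPAPR_MEMORY_BLOCK_SIZE) {
            error_setg(errp, "SCM block size 0x%" PRIx64
                       " must be a multiple of the LMB size 0x%" PRIx64,
                       scm_block_size, (uint64_t)SPAPR_MEMORY_BLOCK_SIZE);
            return;
        }
        /* For now only block size == LMB size is supported. */
        g_assert(scm_block_size == SPAPR_MEMORY_BLOCK_SIZE);
    }
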
> >>> 2) Similarly, the ibm,dynamic-memory-v2 description says that the
> >>> memory block described by the entry has a whole batch of contiguous
> >>> DRCs starting at the DRC index given and continuing for #LMBs DRCs.
> >>> For NVDIMMs it appears that you just have one DRC for the whole
> >>> NVDIMM. Is that right?
> >> One NVDIMM has one DRC. In our case, we need to mark the LMBs
> >> corresponding to that address range as reserved and invalid in
> >> ibm,dynamic-memory-v2.
> > Ok, that fits very weirdly with the DRC allocation for the rest of
> > pluggable memory, but I suppose that's PAPR for you.
> >
> > Having these in together is very inscrutable though, and relies on a
> > heap of non-obvious constraints about placement of DIMMs and NVDIMMs
> > relative to each other. I really wonder if it would be better to have
> > a completely different address range for the NVDIMMs.
>
> The backend objects for both DIMM and NVDIMM are memory-backend-*,
> and they use addresses from the same space. Separating them would mean
> using/introducing a different backend object. I don't think we have a
> choice here.
What address space(s) are we talking about here exactly?
From my point of view, memory-backend-* provides a RAM block at
some HVA, which shouldn't have anything to do with how NVDIMM
partitions it and maps it to GPA.
> >>> 3) You're not setting *any* extra flags on the entry. How is the
> >>> guest supposed to know which are NVDIMM entries and which are regular
> >>> DIMM entries? AFAICT in this version the NVDIMM slots are
> >>> indistinguishable from the unassigned hotplug memory (which makes the
> >>> difference in LMB and DRC numbering even more troubling).
> >> For the NVDIMM case, this patch should populate the LMB set in
> >> ibm,dynamic-memory-v2 with something like below:
> >>
> >>     elem = spapr_get_drconf_cell(size / lmb_size, addr, 0, -1,
> >>                                  SPAPR_LMB_FLAGS_RESERVED |
> >>                                  SPAPR_LMB_FLAGS_DRC_INVALID);
> >>
> >> This will ensure that the NVDIMM range will never be considered a
> >> valid memory range for memory hotplug.
> > Hrm. Ok so we already have code that does that for any gaps between
> > DIMMs. I don't think there's actually anything that that code will do
> > differently than the code you have for NVDIMMs, so you could just skip
> > over the NVDIMMs here and it should do the right thing.
> >
> > The *interpretation* of those entries will become different: for space
> > into which a regular DIMM is later inserted, we'll assume the DRC
> > index given is a base and there are more DRCs following it, but for
> > NVDIMMs we'll assume the same DRC throughout. This is nuts, but IIUC
> > that's what PAPR says and we can't do much about it.
>
> My current patch is buggy, as Bharata pointed out. The NVDIMM DRCs are
> not to be populated here; instead, the LMB DRCs should be marked RESERVED
> and INVALID so that malicious attempts to online the LMBs in those NVDIMM
> address ranges are rejected.
>
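Something along those lines, inside the existing device loop of
spapr_dt_dynamic_memory_v2() (a sketch reusing addr/size/lmb_size,
drconf_queue and nr_entries from the surrounding code; the
NVDIMM-kind check is an assumption about how the device list is
filtered):

    /* Sketch: NVDIMM-backed ranges get a RESERVED|INVALID cell so the
     * guest never onlines those LMBs as hot-pluggable RAM; the NVDIMM's
     * own PMEM DRC is exposed separately, not through this property. */
    if (info->value->type == MEMORY_DEVICE_INFO_KIND_NVDIMM) {
        elem = spapr_get_drconf_cell(size / lmb_size, addr, 0, -1,
                                     SPAPR_LMB_FLAGS_RESERVED |
                                     SPAPR_LMB_FLAGS_DRC_INVALID);
        QSIMPLEQ_INSERT_TAIL(&drconf_queue, elem, entry);
        nr_entries++;
        cur_addr = addr + size;
        continue;
    }
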
> >
> >>> 4) AFAICT these are _present_ NVDIMMs, so why is
> >>> SPAPR_LMB_FLAGS_ASSIGNED not set for them? (and why is the node
> >>> forced to -1, regardless of di->node).
> >>>
> >>>> QSIMPLEQ_INSERT_TAIL(&drconf_queue, elem, entry);
> >>>> nr_entries++;
> >>>> cur_addr = addr + size;
> >>>> @@ -1261,6 +1273,85 @@ static void spapr_dt_hypervisor(SpaprMachineState *spapr, void *fdt)
> >>>> }
> >>>> }
> >>>>
> >>>> +static void spapr_create_nvdimm_dr_connectors(SpaprMachineState *spapr)
> >>>> +{
> >>>> + MachineState *machine = MACHINE(spapr);
> >>>> + int i;
> >>>> +
> >>>> + for (i = 0; i < machine->ram_slots; i++) {
> >>>> + spapr_dr_connector_new(OBJECT(spapr), TYPE_SPAPR_DRC_PMEM, i);
> >>> What happens if you try to plug an NVDIMM to one of these slots, but a
> >>> regular DIMM has already taken it?
> >> NVDIMM hotplug won't get that occupied slot.
> > Ok.
> >
>
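For what it's worth, the slot conflict above is caught by the generic
pc-dimm pre-plug path before the spapr DRC is ever touched. Roughly (an
illustrative sketch only, not the actual QEMU code; the used_slots
bitmap and the helper name stand in for the real bookkeeping):

    #include "qemu/osdep.h"
    #include "qemu/bitops.h"
    #include "qapi/error.h"

    /* Illustrative only: a requested slot that a DIMM already occupies
     * is rejected, and an unassigned slot gets the first free index. */
    static int find_nvdimm_slot(int requested, unsigned long *used_slots,
                                int ram_slots, Error **errp)
    {
        if (requested >= 0) {
            if (requested >= ram_slots || test_bit(requested, used_slots)) {
                error_setg(errp, "slot %d is out of range or already in use",
                           requested);
                return -1;
            }
            return requested;
        }
        requested = find_first_zero_bit(used_slots, ram_slots);
        if (requested == ram_slots) {
            error_setg(errp, "no free memory slots available");
            return -1;
        }
        return requested;
    }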