qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Igor Mammedov <imammedo@redhat.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Haozhong Zhang <haozhong.zhang@intel.com>,
	Qemu Developers <qemu-devel@nongnu.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Richard Henderson <rth@twiddle.net>,
	Eduardo Habkost <ehabkost@redhat.com>,
	Marcel Apfelbaum <marcel@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] hw/acpi-build: build SRAT memory affinity structures for NVDIMM
Date: Wed, 21 Feb 2018 14:55:02 +0100	[thread overview]
Message-ID: <20180221145502.434b3c0b@redhat.com> (raw)
In-Reply-To: <CAPcyv4g_zoQf-LBOeT8=HO1X3YiNc+vUdZN713=a7uJC_A32Fg@mail.gmail.com>

On Tue, 20 Feb 2018 17:17:58 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> On Tue, Feb 20, 2018 at 6:10 AM, Igor Mammedov <imammedo@redhat.com> wrote:
> > On Sat, 17 Feb 2018 14:31:35 +0800
> > Haozhong Zhang <haozhong.zhang@intel.com> wrote:
> >  
> >> ACPI 6.2A Table 5-129 "SPA Range Structure" requires the proximity
> >> domain of a NVDIMM SPA range must match with corresponding entry in
> >> SRAT table.
> >>
> >> The address ranges of vNVDIMM in QEMU are allocated from the
> >> hot-pluggable address space, which is entirely covered by one SRAT
> >> memory affinity structure. However, users can set the vNVDIMM
> >> proximity domain in NFIT SPA range structure by the 'node' property of
> >> '-device nvdimm' to a value different than the one in the above SRAT
> >> memory affinity structure.
> >>
> >> In order to solve such proximity domain mismatch, this patch build one
> >> SRAT memory affinity structure for each NVDIMM device with the
> >> proximity domain used in NFIT. The remaining hot-pluggable address
> >> space is covered by one or multiple SRAT memory affinity structures
> >> with the proximity domain of the last node as before.
> >>
> >> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>  
> > If we consider hotpluggable system, correctly implemented OS should
> > be able pull proximity from Device::_PXM and override any value from SRAT.
> > Do we really have a problem here (anything that breaks if we would use _PXM)?
> > Maybe we should add _PXM object to nvdimm device nodes instead of massaging SRAT?  
> 
> Unfortunately _PXM is an awkward fit. Currently the proximity domain
> is attached to the SPA range structure. The SPA range may be
> associated with multiple DIMM devices and those individual NVDIMMs may
> have conflicting _PXM properties.
There shouldn't be any conflict here as  NVDIMM device's _PXM method,
should override in runtime any proximity specified by parent scope.
(as parent scope I'd also count boot time NFIT/SRAT tables).

To make it more clear we could clear valid proximity domain flag in SPA
like this:

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 59d6e42..131bca5 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -260,9 +260,7 @@ nvdimm_build_structure_spa(GArray *structures, DeviceState *dev)
      */
     nfit_spa->flags = cpu_to_le16(1 /* Control region is strictly for
                                        management during hot add/online
-                                       operation */ |
-                                  2 /* Data in Proximity Domain field is
-                                       valid*/);
+                                       operation */);
 
     /* NUMA node. */
     nfit_spa->proximity_domain = cpu_to_le32(node);

> Even if that was unified across
> DIMMs it is ambiguous whether a DIMM-device _PXM would relate to the
> device's control interface, or the assembled persistent memory SPA
> range.
I'm not sure what you mean under 'device's control interface',
could you clarify where the ambiguity comes from?

I read spec as: _PXM applies to address range covered by NVDIMM
device it belongs to.

As for assembled SPA, I'd assume that it applies to interleaved set
and all NVDIMMs with it should be on the same node. It's somewhat
irrelevant question though as QEMU so far implements only
  1:1:1/SPA:Region Mapping:NVDIMM Device/
mapping.

My main concern with using static configuration tables for proximity
mapping, we'd miss on hotplug side of equation. However if we start
from dynamic side first, we could later complement it with static
tables if there really were need for it.

  reply	other threads:[~2018-02-21 13:55 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-17  6:31 [Qemu-devel] [PATCH] hw/acpi-build: build SRAT memory affinity structures for NVDIMM Haozhong Zhang
2018-02-20 14:10 ` Igor Mammedov
2018-02-21  1:17   ` Dan Williams
2018-02-21 13:55     ` Igor Mammedov [this message]
2018-02-21 14:51       ` Dan Williams
2018-02-26 14:08         ` Igor Mammedov
2018-02-22  1:40       ` Haozhong Zhang
2018-02-26 13:59         ` Igor Mammedov
2018-02-26 14:58           ` Haozhong Zhang
2018-02-24  1:17 ` no-reply
2018-02-24  1:42   ` Haozhong Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180221145502.434b3c0b@redhat.com \
    --to=imammedo@redhat.com \
    --cc=dan.j.williams@intel.com \
    --cc=ehabkost@redhat.com \
    --cc=haozhong.zhang@intel.com \
    --cc=marcel@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    --cc=stefanha@redhat.com \
    --cc=xiaoguangrong.eric@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).