qemu-devel.nongnu.org archive mirror
From: Igor Mammedov <imammedo@redhat.com>
To: Haozhong Zhang <hzzhan9@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Zhang Yi <yi.z.zhang@linux.intel.com>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Qemu Developers <qemu-devel@nongnu.org>,
	yu.c.zhang@linux.intel.com, Stefan Hajnoczi <stefanha@redhat.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	Eduardo Habkost <ehabkost@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH 1/1] nvdimm: let qemu requiring section alignment of pmem resource.
Date: Wed, 13 Jun 2018 18:30:42 +0200
Message-ID: <20180613183042.2ce134a6@redhat.com>
In-Reply-To: <20180612150425.hjydg5cbtlllzh67@HZ>

On Tue, 12 Jun 2018 23:04:25 +0800
Haozhong Zhang <hzzhan9@gmail.com> wrote:

> On 06/11/18 19:55, Dan Williams wrote:
> > On Mon, Jun 11, 2018 at 9:26 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:  
> > > On Mon, Jun 11, 2018 at 06:54:25PM +0800, Zhang Yi wrote:  
> > >> Nvdimm driver use Memory hot-plug APIs to map it's pmem resource,
> > >> which at a section granularity.
> > >>
> > >> When QEMU emulated the vNVDIMM device, decrease the label-storage,
> > >> QEMU will put the vNVDIMMs directly next to one another in physical
> > >> address space, which means that the boundary between them won't
> > >> align to the 128 MB memory section size.  
> > >
> > > I'm having a hard time parsing this.
> > >
> > > Where does the "128 MB memory section size" come from?  ACPI?
> > > A chipset-specific value?
> > >  
> > 
> > The devm_memremap_pages() implementation uses the memory hotplug core
> > to allocate the 'struct page' array/map for persistent memory. Memory
> > hotplug can only be performed in terms of sections, 128MB on x86_64.  
> 
> IIUC, it also affects the normal RAM hotplug to a Linux VM on QEMU. If
> that is the case, it will be helpful to lift this option to pc-dimm.

The default alignment is on a page size boundary because QEMU has no
idea about the guest OS's alignment requirements, and these requirements
might vary greatly depending on which guest OS is running.
With some guests it works just fine even with 2M alignments/DIMM sizes.

So it's up to the upper layers, which know what guest OS is running, to
pick the sizes of plugged DIMMs. If a particular Linux version's minimum
block size is 128M, then mgmt needs to plug DIMMs whose size is a
multiple of that. That should satisfy whatever alignment requirement the
guest OS has.
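
A minimal sketch of the check a management layer could do before
plugging a DIMM. The 128M value is the x86_64 Linux section size quoted
in this thread and is guest-specific; the helper names is_aligned() and
round_up() are hypothetical and do not correspond to any QEMU or
libvirt API.

```python
MiB = 1024 * 1024

# Assumption: 128 MiB is the guest's minimum hotplug block size, as quoted
# in this thread for x86_64 Linux. Other guests may need another value.
SECTION_SIZE = 128 * MiB


def is_aligned(value, alignment=SECTION_SIZE):
    """Return True if value is a whole multiple of alignment."""
    return value % alignment == 0


def round_up(value, alignment=SECTION_SIZE):
    """Round value up to the next multiple of alignment."""
    return -(-value // alignment) * alignment


# A 200 MiB DIMM is not a multiple of the block size, so the layer that
# knows the guest would pick the next multiple instead.
requested = 200 * MiB
dimm_size = requested if is_aligned(requested) else round_up(requested)
```

Here the decision stays outside QEMU: only the caller knows which
alignment the guest expects, so only the caller picks SECTION_SIZE.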

In the case of nvdimm, we need to fix address allocation in QEMU to
account for the label size, which broke the above rule and led to an
"overlap" over the label area of the nvdimm, which isn't mapped into
guest address space, but that's probably it.
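
To illustrate the effect with a toy model (this is not QEMU's allocator;
the function place(), the 4G base address and the 1G/128K device sizes
are assumptions chosen only for the example): if devices are packed
using only the guest-mapped part, i.e. size minus label, the second
device no longer starts on a 128M boundary, while reserving the full
size keeps it aligned.

```python
KiB = 1024
MiB = 1024 * KiB

# Assumption: 128 MiB section size, as quoted in this thread for x86_64 Linux.
SECTION_SIZE = 128 * MiB


def place(base, devices, reserve_label):
    """Lay devices out back to back and return each start address.

    devices is a list of (total_size, label_size) tuples. With
    reserve_label=False only the guest-mapped part (total - label)
    consumes address space; with reserve_label=True the full size does.
    """
    starts = []
    addr = base
    for total_size, label_size in devices:
        starts.append(addr)
        mapped_size = total_size - label_size
        addr += total_size if reserve_label else mapped_size
    return starts


# Two hypothetical 1 GiB vNVDIMMs, each with a 128 KiB label area.
devices = [(1024 * MiB, 128 * KiB), (1024 * MiB, 128 * KiB)]
base = 4096 * MiB

packed = place(base, devices, reserve_label=False)
reserved = place(base, devices, reserve_label=True)

packed_ok = packed[1] % SECTION_SIZE == 0      # boundary falls 128 KiB short
reserved_ok = reserved[1] % SECTION_SIZE == 0  # boundary stays on 128 MiB
```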

PS:
A question not related to the patch.
Intel guys contributed most of the nvdimm code and continue to actively
develop it. Can we have a designated maintainer for the nvdimm part from
Intel, in addition to authors who just code/merge a feature and become
unreachable shortly after that?


> Thanks,
> Haozhong
> 
> > There is some limited support for allowing devm_memremap_pages() to
> > overlap 'System RAM' within a given section, but it does not currently
> > support multiple devm_memremap_pages() calls overlapping within the
> > same section. There is currently a kernel bug where we do not handle
> > this unsupported configuration gracefully. The fix will cause
> > configurations that try to overlap 2 persistent memory
> > ranges in the same section to fail.
> > 
> > The proposed fix is trying to make sure that QEMU does not run afoul
> > of this constraint.
> > 
> > There is currently no line of sight to reduce the minimum memory
> > hotplug alignment size to less than 128M. Also, as other architectures
> > outside of x86_64 add devm_memremap_pages() support, the minimum
> > section alignment constraint might change and is a property of a guest
> > OS. My understanding is that some guest OSes might expect an even
> > larger persistent memory minimum alignment.
> >   
