All of lore.kernel.org
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Jan Beulich <JBeulich@suse.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Ian Campbell <ian.campbell@citrix.com>,
	Wei Liu <wei.liu2@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Jun Nakajima <jun.nakajima@intel.com>,
	Kevin Tian <kevin.tian@intel.com>,
	xen-devel@lists.xen.org, Keir Fraser <keir@xen.org>,
	Xiao Guangrong <guangrong.xiao@linux.intel.com>
Subject: Re: [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
Date: Wed, 20 Jan 2016 10:07:20 -0500	[thread overview]
Message-ID: <20160120150720.GG24581@char.us.oracle.com> (raw)
In-Reply-To: <20160120110449.GD4939@hz-desktop.sh.intel.com>

On Wed, Jan 20, 2016 at 07:04:49PM +0800, Haozhong Zhang wrote:
> On 01/20/16 01:46, Jan Beulich wrote:
> > >>> On 20.01.16 at 06:31, <haozhong.zhang@intel.com> wrote:
> > > The primary reason of current solution is to reuse existing NVDIMM
> > > driver in Linux kernel.
> >
> 
> CC'ing QEMU vNVDIMM maintainer: Xiao Guangrong
> 
> > Re-using code in the Dom0 kernel has benefits and drawbacks, and
> > in any event needs to depend on proper layering to remain in place.
> > A benefit is less code duplication between Xen and Linux; along the
> > same lines a drawback is code duplication between various Dom0
> > OS variants.
> >
> 
> Not clear about other Dom0 OS. But for Linux, it already has a NVDIMM
> driver since 4.2.
> 
> > > One responsibility of this driver is to discover NVDIMM devices and
> > > their parameters (e.g. which portion of an NVDIMM device can be mapped
> > > into the system address space and which address it is mapped to) by
> > > parsing ACPI NFIT tables. Looking at the NFIT spec in Sec 5.2.25 of
> > > ACPI Specification v6 and the actual code in Linux kernel
> > > (drivers/acpi/nfit.*), it's not a trivial task.
> > 
> > To answer one of Kevin's questions: The NFIT table doesn't appear
> > to require the ACPI interpreter. They seem more like SRAT and SLIT.
> 
> Sorry, I made a mistake in another reply. NFIT does not contain
> anything requiring ACPI interpreter. But there are some _DSM methods
> for NVDIMM in SSDT, which needs ACPI interpreter.

Right, but those are for health checks and such. Not needed for boot-time
discovery of the ranges in memory of the NVDIMM.
> 
> > Also you failed to answer Kevin's question regarding E820 entries: I
> > think NVDIMM (or at least parts thereof) get represented in E820 (or
> > the EFI memory map), and if that's the case this would be a very
> > strong hint towards management needing to be in the hypervisor.
> >
> 
> Legacy NVDIMM devices may use E820 entries or other ad-hoc ways to
> announce their locations, but newer ones that follow ACPI v6 spec do
> not need E820 any more and only need ACPI NFIT (i.e. firmware may not
> build E820 entries for them).

I am missing something here.

Linux pvops uses an hypercall to construct its E820 (XENMEM_machine_memory_map)
see arch/x86/xen/setup.c:xen_memory_setup.

That hypercall gets an filtered E820 from the hypervisor. And the
hypervisor gets the E820 from multiboot2 - which gets it from grub2.

With the 'legacy NVDIMM' using E820_NVDIMM (type 12? 13) - they don't
show up in multiboot2 - which means Xen will ignore them (not sure
if changes them to E820_RSRV or just leaves them alone).

Anyhow for the /dev/pmem0 driver in Linux to construct an block
device on the E820_NVDIMM - it MUST have the E820 entry - but we don't
construct that.

I would think that one of the patches would be for the hypervisor
to recognize the E820_NVDIMM and associate that area with p2m_mmio
(so that the xc_memory_mapping hypercall would work on the MFNs)?

But you also mention ACPI v6 defining them an using ACPI NFIT - 
so that would be treating said system address extracted from the
ACPI NFIT just as an MMIO (except it being WB instead of UC).

Either way - Xen hypervisor should also parse the ACPI NFIT so
that it can mark that range as p2m_mmio (or does it do that by
default for any non-E820 ranges?). Does it actually need to
do that? Or is that optional?

I hope the design document will explain a bit of this.

> 
> The current linux kernel can handle both legacy and new NVDIMM devices
> and provide the same block device interface for them.

OK, so Xen would need to do that as well - so that the Linux kernel
can utilize it.
> 
> > > Secondly, the driver implements a convenient block device interface to
> > > let software access areas where NVDIMM devices are mapped. The
> > > existing vNVDIMM implementation in QEMU uses this interface.
> > > 
> > > As Linux NVDIMM driver has already done above, why do we bother to
> > > reimplement them in Xen?
> > 
> > See above; a possibility is that we may need a split model (block
> > layer parts on Dom0, "normal memory" parts in the hypervisor.
> > Iirc the split is being determined by firmware, and hence set in
> > stone by the time OS (or hypervisor) boot starts.
> >
> 
> For the "normal memory" parts, do you mean parts that map the host
> NVDIMM device's address space range to the guest? I'm going to
> implement that part in hypervisor and expose it as a hypercall so that
> it can be used by QEMU.
> 
> Haozhong
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

  parent reply	other threads:[~2016-01-20 15:07 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-29 11:31 [PATCH 0/4] add support for vNVDIMM Haozhong Zhang
2015-12-29 11:31 ` [PATCH 1/4] x86/hvm: allow guest to use clflushopt and clwb Haozhong Zhang
2015-12-29 15:46   ` Andrew Cooper
2015-12-30  1:35     ` Haozhong Zhang
2015-12-30  2:16       ` Haozhong Zhang
2015-12-30 10:33         ` Andrew Cooper
2015-12-29 11:31 ` [PATCH 2/4] x86/hvm: add support for pcommit instruction Haozhong Zhang
2015-12-29 11:31 ` [PATCH 3/4] tools/xl: add a new xl configuration 'nvdimm' Haozhong Zhang
2016-01-04 11:16   ` Wei Liu
2016-01-06 12:40   ` Jan Beulich
2016-01-06 15:28     ` Haozhong Zhang
2015-12-29 11:31 ` [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu Haozhong Zhang
2016-01-15 17:10   ` Jan Beulich
2016-01-18  0:52     ` Haozhong Zhang
2016-01-18  8:46       ` Jan Beulich
2016-01-19 11:37         ` Wei Liu
2016-01-19 11:46           ` Jan Beulich
2016-01-20  5:14             ` Tian, Kevin
2016-01-20  5:58               ` Zhang, Haozhong
2016-01-20  5:31         ` Haozhong Zhang
2016-01-20  8:46           ` Jan Beulich
2016-01-20  8:58             ` Andrew Cooper
2016-01-20 10:15               ` Haozhong Zhang
2016-01-20 10:36                 ` Xiao Guangrong
2016-01-20 13:16                   ` Andrew Cooper
2016-01-20 14:29                     ` Stefano Stabellini
2016-01-20 14:42                       ` Haozhong Zhang
2016-01-20 14:45                       ` Andrew Cooper
2016-01-20 14:53                         ` Haozhong Zhang
2016-01-20 15:13                           ` Konrad Rzeszutek Wilk
2016-01-20 15:29                             ` Haozhong Zhang
2016-01-20 15:41                               ` Konrad Rzeszutek Wilk
2016-01-20 15:54                                 ` Haozhong Zhang
2016-01-21  3:35                                 ` Bob Liu
2016-01-20 15:05                         ` Stefano Stabellini
2016-01-20 18:14                           ` Andrew Cooper
2016-01-20 14:38                     ` Haozhong Zhang
2016-01-20 11:04             ` Haozhong Zhang
2016-01-20 11:20               ` Jan Beulich
2016-01-20 15:29                 ` Xiao Guangrong
2016-01-20 15:47                   ` Konrad Rzeszutek Wilk
2016-01-20 16:25                     ` Xiao Guangrong
2016-01-20 16:47                       ` Konrad Rzeszutek Wilk
2016-01-20 16:55                         ` Xiao Guangrong
2016-01-20 17:18                           ` Konrad Rzeszutek Wilk
2016-01-20 17:23                             ` Xiao Guangrong
2016-01-20 17:48                               ` Konrad Rzeszutek Wilk
2016-01-21  3:12                             ` Haozhong Zhang
2016-01-20 17:07                   ` Jan Beulich
2016-01-20 17:17                     ` Xiao Guangrong
2016-01-21  8:18                       ` Jan Beulich
2016-01-21  8:25                         ` Xiao Guangrong
2016-01-21  8:53                           ` Jan Beulich
2016-01-21  9:10                             ` Xiao Guangrong
2016-01-21  9:29                               ` Andrew Cooper
2016-01-21 10:26                                 ` Jan Beulich
2016-01-21 10:25                               ` Jan Beulich
2016-01-21 14:01                                 ` Haozhong Zhang
2016-01-21 14:52                                   ` Jan Beulich
2016-01-22  2:43                                     ` Haozhong Zhang
2016-01-26 11:44                                     ` George Dunlap
2016-01-26 12:44                                       ` Jan Beulich
2016-01-26 12:54                                         ` Juergen Gross
2016-01-26 14:44                                           ` Konrad Rzeszutek Wilk
2016-01-26 15:37                                             ` Jan Beulich
2016-01-26 15:57                                               ` Haozhong Zhang
2016-01-26 16:34                                                 ` Jan Beulich
2016-01-26 19:32                                                   ` Konrad Rzeszutek Wilk
2016-01-27  7:22                                                     ` Haozhong Zhang
2016-01-27 10:16                                                     ` Jan Beulich
2016-01-27 14:50                                                       ` Konrad Rzeszutek Wilk
2016-01-27 10:55                                                   ` George Dunlap
2016-01-26 13:58                                         ` George Dunlap
2016-01-26 14:46                                           ` Konrad Rzeszutek Wilk
2016-01-26 15:30                                         ` Haozhong Zhang
2016-01-26 15:33                                           ` Haozhong Zhang
2016-01-26 15:57                                           ` Jan Beulich
2016-01-27  2:23                                             ` Haozhong Zhang
2016-01-20 15:07               ` Konrad Rzeszutek Wilk [this message]
2016-01-06 15:37 ` [PATCH 0/4] add support for vNVDIMM Ian Campbell
2016-01-06 15:47   ` Haozhong Zhang
2016-01-20  3:28 ` Tian, Kevin
2016-01-20 12:43   ` Stefano Stabellini
2016-01-20 14:26     ` Zhang, Haozhong
2016-01-20 14:35       ` Stefano Stabellini
2016-01-20 14:47         ` Zhang, Haozhong
2016-01-20 14:54           ` Andrew Cooper
2016-01-20 15:59             ` Haozhong Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160120150720.GG24581@char.us.oracle.com \
    --to=konrad.wilk@oracle.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=guangrong.xiao@linux.intel.com \
    --cc=ian.campbell@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=jun.nakajima@intel.com \
    --cc=keir@xen.org \
    --cc=kevin.tian@intel.com \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.