From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Haozhong Zhang <haozhong.zhang@intel.com>,
	Kevin Tian <kevin.tian@intel.com>, Wei Liu <wei.liu2@citrix.com>,
	Ian Campbell <ian.campbell@citrix.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Jun Nakajima <jun.nakajima@intel.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	xen-devel@lists.xen.org, Jan Beulich <JBeulich@suse.com>,
	Keir Fraser <keir@xen.org>
Subject: Re: [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
Date: Wed, 20 Jan 2016 10:47:49 -0500
Message-ID: <20160120154749.GD1742@char.us.oracle.com>
In-Reply-To: <569FA7F3.8080506@linux.intel.com>

On Wed, Jan 20, 2016 at 11:29:55PM +0800, Xiao Guangrong wrote:
> 
> 
> On 01/20/2016 07:20 PM, Jan Beulich wrote:
> >>>>On 20.01.16 at 12:04, <haozhong.zhang@intel.com> wrote:
> >>On 01/20/16 01:46, Jan Beulich wrote:
> >>>>>>On 20.01.16 at 06:31, <haozhong.zhang@intel.com> wrote:
> >>>>Secondly, the driver implements a convenient block device interface to
> >>>>let software access areas where NVDIMM devices are mapped. The
> >>>>existing vNVDIMM implementation in QEMU uses this interface.
> >>>>
> >>>>As Linux NVDIMM driver has already done above, why do we bother to
> >>>>reimplement them in Xen?
> >>>
> >>>See above; a possibility is that we may need a split model (block
> >>>layer parts on Dom0, "normal memory" parts in the hypervisor.
> >>>Iirc the split is being determined by firmware, and hence set in
> >>>stone by the time OS (or hypervisor) boot starts.
> >>
> >>For the "normal memory" parts, do you mean parts that map the host
> >>NVDIMM device's address space range to the guest? I'm going to
> >>implement that part in hypervisor and expose it as a hypercall so that
> >>it can be used by QEMU.
> >
> >To answer this I need to have my understanding of the partitioning
> >being done by firmware confirmed: If that's the case, then "normal"
> >means the part that doesn't get exposed as a block device (SSD).
> >In any event there's no correlation to guest exposure here.
> 
> Firmware does not manage NVDIMM. All the operations of nvdimm are handled
> by OS.
> 
> Actually, there are lots of things we should take into account if we move
> the NVDIMM management to hypervisor:

If you remove the block device part and just deal with the pmem part, then
this gets smaller.

Also the _DSM operations - I can't see them being in the hypervisor - but only
in dom0 - which would have the right software to tickle the correct
ioctl on /dev/pmem to do the "management" (carve the NVDIMM, perform
a SMART operation, etc).

> a) ACPI NFIT interpretation
>    A new ACPI table introduced in ACPI 6.0 is named NFIT which exports the
>    base information of NVDIMM devices which includes PMEM info, PBLK
>    info, nvdimm device interleave, vendor info, etc. Let me explain it one
>    by one.

And it is a static table. As in part of the MADT.
> 
>    PMEM and PBLK are two modes to access NVDIMM devices:
>    1) PMEM can be treated as NV-RAM which is directly mapped to CPU's address
>       space so that CPU can r/w it directly.
>    2) as NVDIMM has huge capability and CPU's address space is limited, NVDIMM
>       only offers two windows which are mapped to CPU's address space, the data
>       window and access window, so that CPU can use these two windows to access
>       the whole NVDIMM device.
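
To make the windowed (PBLK-style) access described in 2) concrete, here is a
minimal toy model. It is only a sketch under stated assumptions: the class
WindowedDevice, its select() "control" step, and the helper functions are
hypothetical names invented for illustration and do not correspond to the
NFIT layout or to any real driver interface. The device capacity is assumed
to be a multiple of the window size.

```python
class WindowedDevice:
    """Toy model: a large medium reachable only through one small window."""

    def __init__(self, capacity, window_size):
        self.media = bytearray(capacity)   # stands in for the whole NVDIMM
        self.window_size = window_size     # size of the mapped data window
        self.window_base = 0               # device offset the window shows

    def select(self, base):
        # Hypothetical control step: point the data window at 'base'.
        if base % self.window_size or not 0 <= base < len(self.media):
            raise ValueError("bad window base")
        self.window_base = base

    def window_read(self, off, length):
        if off < 0 or off + length > self.window_size:
            raise ValueError("access outside the data window")
        start = self.window_base + off
        return bytes(self.media[start:start + length])

    def window_write(self, off, data):
        if off < 0 or off + len(data) > self.window_size:
            raise ValueError("access outside the data window")
        start = self.window_base + off
        self.media[start:start + len(data)] = data


def _chunks(dev, addr, length):
    # Split [addr, addr + length) into pieces that each fit in one window.
    while length:
        base = addr - addr % dev.window_size
        off = addr - base
        n = min(length, dev.window_size - off)
        yield base, off, n
        addr += n
        length -= n


def read(dev, addr, length):
    out = bytearray()
    for base, off, n in _chunks(dev, addr, length):
        dev.select(base)                   # move the window, then read
        out += dev.window_read(off, n)
    return bytes(out)


def write(dev, addr, data):
    pos = 0
    for base, off, n in _chunks(dev, addr, len(data)):
        dev.select(base)                   # move the window, then write
        dev.window_write(off, data[pos:pos + n])
        pos += n
```

With PMEM none of this is needed, since the CPU addresses the range directly;
the sketch only shows why PBLK access needs a "move the window" step before
each transfer.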
> 
>    NVDIMM device is interleaved whose info is also exported so that we can
>    calculate the address to access the specified NVDIMM device.

Right, along with the serial numbers.
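
As an illustration of the address calculation mentioned above, here is a
minimal sketch assuming a plain round-robin interleave with a fixed line
size. That layout is an assumption made for the example; the real NFIT
interleave description is richer than this, and the function name locate()
and its parameters are hypothetical.

```python
def locate(region_offset, ways, line_size):
    """Map an offset inside an interleaved region to (dimm_index, dimm_offset).

    Assumes lines of 'line_size' bytes are dealt out to 'ways' DIMMs in
    round-robin order, starting with DIMM 0.
    """
    if region_offset < 0 or ways <= 0 or line_size <= 0:
        raise ValueError("invalid interleave parameters")
    line, within = divmod(region_offset, line_size)  # which line, byte in it
    stripe, dimm = divmod(line, ways)                # which pass, which DIMM
    return dimm, stripe * line_size + within
```

For example, with two DIMMs and 256-byte lines, region offset 256 falls at
the start of DIMM 1, and offset 517 falls at offset 261 on DIMM 0.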
> 
>    NVDIMM devices from different vendor can have different function so that the
>    vendor info is exported by NFIT to make vendor's driver work.

via _DSM right?
> 
> b) ACPI SSDT interpretation
>    SSDT offers _DSM method which controls NVDIMM device, such as label operation,
>    health check etc and hotplug support.

Sounds like the control domain (dom0) would be in charge of that.
> 
> c) Resource management
>    NVDIMM resource management challenged as:
>    1) PMEM is huge and it is little slower access than RAM so it is not suitable
>       to manage it as page struct (i think it is not a big problem in Xen
>       hypervisor?)
>    2) need to partition it to it be used in multiple VMs.
>    3) need to support PBLK and partition it in the future.

That all sounds to me like control domain (dom0) decisions, not Xen hypervisor ones.
> 
> d) management tools support
>    S.M.A.R.T? error detection and recovering?
> 
> c) hotplug support

How does that work? Ah, the _DSM will point to the new ACPI NFIT for the OS
to scan. That would require the hypervisor to also read this in order to
update its data structures.
> 
> d) third parts drivers
>    Vendor drivers need to be ported to xen hypervisor and let it be supported in
>    the management tool.

Ewww.

I presume 'third party drivers' means more interesting _DSM features, right?
At the base level the firmware with this type of NVDIMM would still have
the basics: ACPI NFIT + E820_NVDIMM (optional).
> 
> e) ...
