From: Andrew Cooper
Subject: Re: [RFC Design Doc] Add vNVDIMM support for Xen
Date: Tue, 2 Feb 2016 11:09:33 +0000
Message-ID: <56B08E6D.9030007@citrix.com>
References: <20160201054414.GA25211@hz-desktop.sh.intel.com> <56AFA319.8070801@citrix.com> <20160202034435.GH6293@hz-desktop.sh.intel.com>
In-Reply-To: <20160202034435.GH6293@hz-desktop.sh.intel.com>
To: xen-devel@lists.xen.org, Jan Beulich, Stefano Stabellini, Konrad Rzeszutek Wilk, George Dunlap, Ian Jackson, Ian Campbell, Juergen Gross, Wei Liu, Kevin Tian, Xiao Guangrong, Keir Fraser, Jun Nakajima
List-Id: xen-devel@lists.xenproject.org

On 02/02/16 03:44, Haozhong Zhang wrote:
> On 02/01/16 18:25, Andrew Cooper wrote:
>> On 01/02/16 05:44, Haozhong Zhang wrote:
>>> Hi,
>>>
>>> The following document describes the design of adding vNVDIMM support
>>> for Xen. Any comments are welcome.
>>>
>>> Thanks,
>>> Haozhong
>> Thank you for doing this. It is a very comprehensive document, and a
>> fantastic example for future similar situations.
>>
>>
>> To start with however, I would like to clear up my confusion over the
>> use cases of pmem vs pblk.
>>
>> pblk, using indirect access, is less efficient than pmem. NVDIMMs
>> themselves are slower (and presumably more expensive) than equivalent
>> RAM, and presumably still have a finite number of write cycles, so I
>> don't buy an argument suggesting that they are a plausible replacement
>> for real RAM.
>>
>> I presume therefore that a system would only choose to use pblk mode in
>> situations where the host physical address space is a limiting factor.
>> Are there other situations which I have overlooked?
>>
> Limited physical address space is one concern. Another concern is that
> pblk can be used by drivers to provide better RAS, like better error
> detection and power-fail write atomicity. See Section "NVDIMM Driver"
> in Chapter 1 of [3] for more details.

Ah ok. So even with no limiting factors to consider, it would be a
plausible design choice to use it in pblk mode.

>
>> Secondly, I presume that pmem vs pblk will be a firmware decision and
>> fixed from the point of view of the Operating System?
>>
> Specifications on my hands [1-4] do not mention which one is in charge
> of partitioning NVDIMM into pmem and pblk. However, as NFIT uses
> separated SPA range structures for pmem and pblk regions, I also
> presume that firmware (BIOS/EFI, or firmware on NVDIMM devices)
> determines the partition.
>
> In addition, some NVDIMM vendors may provide specific _DSM commands to
> allow software (OS/drivers) to reconfigure the pmem/pblk partition,
> but those changes only take effect after reboot. If OS/drivers or
> system administrators decide to do so, IMO they should make sure no
> users are currently using those NVDIMMs and data on NVDIMMs is already
> properly handled.

Ok. Either way, it is going to be an administrator decision, and the
layout is not going to change under the feet of a running operating system.

~Andrew