From: Juergen Gross
Subject: Re: [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
Date: Tue, 26 Jan 2016 13:54:12 +0100
Message-ID: <56A76C74.5010506@suse.com>
In-Reply-To: <56A7785802000078000CB0CD@prv-mh.provo.novell.com>
References: <20160120110449.GD4939@hz-desktop.sh.intel.com>
 <569F7B8302000078000C8FF8@prv-mh.provo.novell.com>
 <569FA7F3.8080506@linux.intel.com>
 <569FCCED02000078000C94BA@prv-mh.provo.novell.com>
 <569FC112.9060309@linux.intel.com>
 <56A0A25002000078000C971B@prv-mh.provo.novell.com>
 <56A095E3.5060507@linux.intel.com>
 <56A0AA8A02000078000C977D@prv-mh.provo.novell.com>
 <56A0A09A.2050101@linux.intel.com>
 <56A0C02A02000078000C9823@prv-mh.provo.novell.com>
 <20160121140103.GB6362@hz-desktop.sh.intel.com>
 <56A0FEA102000078000C9A44@prv-mh.provo.novell.com>
 <56A7785802000078000CB0CD@prv-mh.provo.novell.com>
To: Jan Beulich, George Dunlap
Cc: Haozhong Zhang, Kevin Tian, Wei Liu, Ian Campbell,
 Stefano Stabellini, Andrew Cooper, Ian Jackson,
 "xen-devel@lists.xen.org", Jun Nakajima, Xiao Guangrong, Keir Fraser
List-Id: xen-devel@lists.xenproject.org

On 26/01/16 13:44, Jan Beulich wrote:
>>>> On 26.01.16 at 12:44, wrote:
>> On Thu, Jan 21, 2016 at 2:52 PM, Jan Beulich wrote:
>>>>>> On 21.01.16 at 15:01, wrote:
>>>> On 01/21/16 03:25, Jan Beulich wrote:
>>>>>>>> On 21.01.16 at 10:10, wrote:
>>>>>> c) hypervisor should manage PMEM resource pool and partition it
>>>>>> to multiple VMs.
>>>>>
>>>>> Yes.
>>>>>
>>>>
>>>> But I still do not quite understand this part: why must PMEM
>>>> resource management and partitioning be done in the hypervisor?
>>>
>>> Because that's where memory management belongs. And PMEM, unlike
>>> PBLK, is just another form of RAM.
>>
>> I haven't looked more deeply into the details of this, but this
>> argument doesn't seem right to me.
>>
>> Normal RAM in Xen is what might be called "fungible" -- at boot, all
>> RAM is zeroed, and it basically doesn't matter at all what RAM is
>> given to what guest. (There are restrictions of course: lowmem for
>> DMA, contiguous superpages, &c; but within those groups, it doesn't
>> matter *which* bit of lowmem you get, as long as you get enough to
>> do your job.) If you reboot your guest or hand RAM back to the
>> hypervisor, you assume that everything in it will disappear. When
>> you ask for RAM, you can request some parameters that it will have
>> (lowmem, on a specific node, &c), but you can't request a specific
>> page that you had before.
>>
>> This is not the case for PMEM. The whole point of PMEM (correct me
>> if I'm wrong) is to be used for long-term storage that survives
>> reboots. It matters very much that a guest be given the same PRAM
>> after a host reboot that it was given before. It doesn't make any
>> sense to manage it the way Xen currently manages RAM (i.e., that
>> you request a page and get whatever Xen happens to give you).
>
> Interesting. This isn't the usage model I have been thinking about
> so far. Having just gone back to the original 0/4 mail, I'm afraid
> we're really left guessing, and you guessed differently than I did.
> My understanding of the intentions of PMEM so far was that it is a
> high-capacity alternative to normal RAM -- slower than DRAM, but
> much faster than e.g. swapping to disk. I.e. the persistent aspect
> of it wouldn't matter at all in this case (other than for PBLK,
> obviously).
>
> However, thinking through your usage model I have problems seeing
> it work in a reasonable way even with virtualization left aside: To
> my knowledge there's no established protocol on how multiple
> parties (different versions of the same OS, or even completely
> different OSes) would arbitrate the use of such memory ranges. And
> even for a single OS it is, unlike for disks (and hence PBLK), not
> immediately clear how it would communicate from one boot to another
> what information got stored where, or how it would react to some or
> all of this storage having disappeared (just like a disk which got
> removed, which - unless it held the boot partition - would normally
> have very little effect on the OS coming back up).

Last year at Linux Plumbers Conference I attended a session dedicated
to NVDIMM support. I asked the very same question, and the Intel guy
there told me there is indeed something like a partition table meant
to describe the layout of the memory areas and their contents.

It would be nice to have a pointer to such information. Without
anything like this it might be rather difficult to find the best way
to implement NVDIMM support in Xen or any other product.
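
For illustration, I'd expect something along the lines of the sketch
below: a per-DIMM label area holding one record per namespace, similar
in spirit to a partition table. This is purely illustrative -- the
struct and field names are made up and do not reflect the actual
on-media layout, which as far as I know is defined in Intel's NVDIMM
Namespace Specification:

    #include <stdint.h>

    /*
     * Purely illustrative sketch -- NOT the real on-media format.
     * Think of the label storage area on each NVDIMM as holding an
     * array of records like this, one per namespace ("partition"):
     */
    struct nvdimm_namespace_label {
        uint8_t  uuid[16];   /* stable identity, survives reboots */
        char     name[64];   /* human-readable namespace name */
        uint32_t flags;      /* e.g. read-only, local vs. interleaved */
        uint64_t dpa;        /* start offset within this DIMM */
        uint64_t rawsize;    /* length of the range in bytes */
        uint32_t slot;       /* position of this label in the area */
        uint64_t checksum;   /* integrity check over the record */
    };

Whoever ends up owning that label area -- Xen, dom0 or firmware --
could then use the stable UUIDs to hand a guest the same PMEM ranges
across host reboots, which would address George's concern above.
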
Juergen