From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dan Williams
Date: Tue, 31 Oct 2017 21:25:22 -0700
Subject: Re: [Qemu-devel] KVM "fake DAX" flushing interface - discussion
To: Xiao Guangrong
Cc: Rik van Riel, Pankaj Gupta, Jan Kara, Stefan Hajnoczi, kvm-devel,
 Qemu Developers, "linux-nvdimm@lists.01.org", Ross Zwisler,
 Paolo Bonzini, Kevin Wolf, Nitesh Narayan Lal, Haozhong Zhang
In-Reply-To: <378b10f3-b32f-84f5-2bbc-50c2ec5bcdd4@gmail.com>
References: <1455443283.33337333.1500618150787.JavaMail.zimbra@redhat.com>
 <20170724102330.GE652@quack2.suse.cz>
 <1157879323.33809400.1500897967669.JavaMail.zimbra@redhat.com>
 <20170724123752.GN652@quack2.suse.cz>
 <1888117852.34216619.1500992835767.JavaMail.zimbra@redhat.com>
 <1501016375.26846.21.camel@redhat.com>
 <1063764405.34607875.1501076841865.JavaMail.zimbra@redhat.com>
 <1501104453.26846.45.camel@redhat.com>
 <1501112787.4073.49.camel@redhat.com>
 <0a26793f-86f7-29e7-f61b-dc4c1ef08c8e@gmail.com>
 <378b10f3-b32f-84f5-2bbc-50c2ec5bcdd4@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

On Tue, Oct 31, 2017 at 8:43 PM, Xiao Guangrong wrote:
>
> On 10/31/2017 10:20 PM, Dan Williams wrote:
>>
>> On Tue, Oct 31, 2017 at 12:13 AM, Xiao Guangrong wrote:
>>>
>>> On 07/27/2017 08:54 AM, Dan Williams wrote:
>>>
>>>>> At that point, would it make sense to expose these special
>>>>> virtio-pmem areas to the guest in a slightly different way,
>>>>> so the regions that need virtio flushing are not bound by
>>>>> the regular driver, and the regular driver can continue to
>>>>> work for memory regions that are backed by actual pmem in
>>>>> the host?
>>>>
>>>> Hmm, yes, that could be feasible, especially if it uses the ACPI
>>>> NFIT mechanism. It would basically involve defining a new SPA
>>>> (System Physical Address) range GUID type and then teaching
>>>> libnvdimm to treat that as a new pmem device type.
>>>
>>> I would prefer a new flush mechanism to a new memory type introduced
>>> to the NFIT; e.g., in that mechanism we could define request queues,
>>> completion queues, and any other features that make it
>>> virtualization friendly. That would be much simpler.
>>
>> No, that's more confusing, because then we are overloading the
>> definition of persistent memory. I want this memory type identified
>> from the top of the stack so it can appear differently in /proc/iomem
>> and also implement this alternate flush communication.
>
> As far as the memory's characteristics go, I see no reason the VM
> should know about this difference. It can be completely transparent
> to the VM; that is, the VM does not need to know where this virtual
> PMEM comes from (a real NVDIMM backend or ordinary storage). The only
> discrepancy is the flush interface.

It's not persistent memory if it requires a hypercall to make it
persistent. Unless memory writes can be made durable purely with CPU
instructions, it is dangerous to treat it as a PMEM range. Consider a
guest that tries to map it with device-dax, which has no facility to
route requests to a special flushing interface.
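To make that concrete: the whole point of a PMEM range is that
userspace can make a store durable with nothing but CPU instructions,
along these lines (a minimal sketch, not shipping code; pmem_persist
is a hypothetical helper, assumes a CLWB-capable CPU and a DAX
mapping, compile with -mclwb):

/*
 * Persisting a write to real pmem takes only cpu instructions;
 * there is no trap in this sequence where a hypervisor-mediated
 * flush could be injected for a device-dax mapping.
 */
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64UL

static void pmem_persist(const void *addr, size_t len)
{
	uintptr_t p = (uintptr_t)addr & ~(CACHELINE - 1);
	uintptr_t end = (uintptr_t)addr + len;

	for (; p < end; p += CACHELINE)
		_mm_clwb((void *)p);	/* write the cache line back to media */
	_mm_sfence();			/* order the write-backs before later stores */
}

If the backing device actually needs a host-side fsync to be durable,
nothing in that sequence ever notifies the host.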
>
>> In what way is this "more complicated"? It was trivial to add
>> support for the "volatile" NFIT range; this will not be any more
>> complicated than that.
>
> Introducing a memory type is easy indeed; however, a new flush
> interface definition is inevitable, i.e., we need a standard way to
> discover the MMIOs used to communicate with the host.

Right, the proposed way to do that on x86 platforms is a new SPA Range
GUID type in the NFIT.
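For a rough idea of what the guest would key off (a sketch only: the
byte-addressable-PM GUID below is the real one from the ACPI
specification, while the virtio-flush GUID and the
spa_needs_paravirt_flush() helper are made up for illustration,
pending an actual spec assignment):

/*
 * NFIT SPA ranges carry an Address Range Type GUID, and the guest
 * driver branches on it to pick a device type. A "pmem that needs a
 * paravirt flush" range would get a new GUID; the nil GUID below is
 * only a placeholder.
 */
#include <linux/types.h>
#include <linux/uuid.h>
#include <acpi/actbl1.h>	/* struct acpi_nfit_system_address */

/* ACPI spec: byte-addressable persistent memory (real GUID) */
static const guid_t nfit_pmem_guid =
	GUID_INIT(0x66f0d379, 0xb4f3, 0x4074,
		  0xac, 0x43, 0x0d, 0x33, 0x18, 0xb7, 0x8c, 0xdb);

/* Hypothetical: pmem that requires a paravirt flush (placeholder) */
static const guid_t nfit_virtio_pmem_guid =
	GUID_INIT(0x00000000, 0x0000, 0x0000,
		  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00);

static bool spa_needs_paravirt_flush(struct acpi_nfit_system_address *spa)
{
	return guid_equal((guid_t *)spa->range_guid, &nfit_virtio_pmem_guid);
}

libnvdimm could then register such ranges as a distinct pmem device
type, which is also what would make them show up differently in
/proc/iomem.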