From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xiao Guangrong
Subject: Re: [Qemu-devel] [PATCH v2 06/11] nvdimm acpi: initialize the resource used by NVDIMM ACPI
Date: Mon, 22 Feb 2016 18:30:03 +0800
Message-ID: <56CAE32B.9080401@linux.intel.com>
References: <20160215133722-mutt-send-email-mst@redhat.com>
 <20160215143234.29320a5f@nial.brq.redhat.com>
 <56C1F469.2040602@linux.intel.com>
 <20160215182404.0878474f@nial.brq.redhat.com>
 <56C21A7D.5040902@linux.intel.com>
 <20160216120047.5a50eccf@nial.brq.redhat.com>
 <56C3D522.6090401@linux.intel.com>
 <20160217192356-mutt-send-email-mst@redhat.com>
 <56C54298.3000904@linux.intel.com>
 <20160218110523.058a4716@nial.brq.redhat.com>
 <20160219100211-mutt-send-email-mst@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Igor Mammedov, ehabkost@redhat.com, KVM list, Gleb Natapov,
 mtosatti@redhat.com, qemu-devel@nongnu.org, stefanha@redhat.com,
 Paolo Bonzini, rth@twiddle.net
To: Dan Williams, "Michael S. Tsirkin"
Return-path:
Received: from mga04.intel.com ([192.55.52.120]:28529 "EHLO mga04.intel.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1751208AbcBVKi0 (ORCPT ); Mon, 22 Feb 2016 05:38:26 -0500
In-Reply-To:
Sender: kvm-owner@vger.kernel.org
List-ID:

On 02/19/2016 04:43 PM, Dan Williams wrote:
> On Fri, Feb 19, 2016 at 12:08 AM, Michael S. Tsirkin wrote:
>> On Thu, Feb 18, 2016 at 11:05:23AM +0100, Igor Mammedov wrote:
>>> On Thu, 18 Feb 2016 12:03:36 +0800
>>> Xiao Guangrong wrote:
>>>
>>>> On 02/18/2016 01:26 AM, Michael S. Tsirkin wrote:
>>>>> On Wed, Feb 17, 2016 at 10:04:18AM +0800, Xiao Guangrong wrote:
>>>>>>>>> As for the rest could that commands go via MMIO that we usually
>>>>>>>>> use for control path?
>>>>>>>>
>>>>>>>> So both input data and output data go through single MMIO, we need to
>>>>>>>> introduce a protocol to pass these data, that is complex?
>>>>>>>>
>>>>>>>> And is any MMIO we can reuse (more complexer?) or we should allocate this
>>>>>>>> MMIO page (the old question - where to allocated?)?
>>>>>>> Maybe you could reuse/extend memhotplug IO interface,
>>>>>>> or alternatively as Michael suggested add a vendor specific PCI_Config,
>>>>>>> I'd suggest PM device for that (hw/acpi/[piix4.c|ihc9.c])
>>>>>>> which I like even better since you won't need to care about which ports
>>>>>>> to allocate at all.
>>>>>>
>>>>>> Well, if Michael does not object, i will do it in the next version. :)
>>>>>
>>>>> Sorry, the thread's so long by now that I'm no longer sure what does "it" refer to.
>>>>
>>>> Never mind i saw you were busy on other loops.
>>>>
>>>> "It" means the suggestion of Igor that "map each label area right after each
>>>> NVDIMM's data memory"
>>> Michael pointed out that putting label right after each NVDIMM
>>> might burn up to 256GB of address space due to DIMM's alignment for 256 NVDIMMs.
>>> However if address for each label is picked with pc_dimm_get_free_addr()
>>> and label's MemoryRegion alignment is default 2MB then all labels
>>> would be allocated close to each other within a single 1GB range.
>>>
>>> That would burn only 1GB for 500 labels which is more than possible 256 NVDIMMs.
>>
>> I thought about it, once we support hotplug, this means that one will
>> have to pre-declare how much is needed so QEMU can mark the correct
>> memory reserved, that would be nasty. Maybe we always pre-reserve 1Gbyte.
>> Okay but next time we need something, do we steal another Gigabyte?
>> It seems too much, I'll think it over on the weekend.
>>
>> Really, most other devices manage to get by with 4K chunks just fine, I
>> don't see why do we are so special and need to steal gigabytes of
>> physically contigious phy ranges.
>
> What's the driving use case for labels in the guest? For example,
> NVDIMM-N devices are supported by the kernel without labels.

Yes, I see that the Linux driver supports label-less vNVDIMMs, which is
exactly what QEMU currently does.

However, label-less operation is a Linux-specific implementation (it
completely bypasses the namespace spec). Other OS vendors (e.g. Microsoft)
will use label storage to address their own requirements, or they do not
follow the namespace spec at all. Another reason is that labels are
essential for PBLK support.

BTW, label support can be configured dynamically, and it will be disabled
by default.

>
> I certainly would not want to sacrifice 1GB alignment for a label area.
>

Yup, me too.
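
For reference, the address-space figures quoted above (up to 256GB for 256
labels placed right after each NVDIMM, versus about 1GB for 500 labels packed
at 2MB alignment) can be checked with a rough sketch. It is only a
back-of-the-envelope model, not QEMU code: the 128 KiB label size and the
1 GiB DIMM alignment are assumptions made for illustration, the helper names
are hypothetical, and it does not call pc_dimm_get_free_addr().

```python
KiB = 1 << 10
MiB = 1 << 20
GiB = 1 << 30


def align_up(size, alignment):
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


def address_space_used(count, label_size, alignment):
    """Address space consumed by `count` label areas when each one
    occupies a slot rounded up to `alignment` (simplified model)."""
    return count * align_up(label_size, alignment)


# Assumed label size; the thread does not state one.
LABEL_SIZE = 128 * KiB

# Label mapped right after each NVDIMM: every label costs one
# DIMM-aligned slot (1 GiB alignment assumed here).
after_each_dimm = address_space_used(256, LABEL_SIZE, GiB)

# Labels packed next to each other at the default 2 MiB alignment.
packed = address_space_used(500, LABEL_SIZE, 2 * MiB)

print(after_each_dimm // GiB, "GiB")  # 256 GiB
print(packed // MiB, "MiB")           # 1000 MiB, i.e. within one 1 GiB range
```

Under these assumptions, the packed layout stays inside a single 1 GiB range
for any label size up to 2 MiB.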