From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xiao Guangrong <guangrong.xiao@linux.intel.com>
To: Igor Mammedov
Cc: ehabkost@redhat.com, kvm@vger.kernel.org, mst@redhat.com,
	gleb@kernel.org, mtosatti@redhat.com, qemu-devel@nongnu.org,
	stefanha@redhat.com, pbonzini@redhat.com, rth@twiddle.net
Subject: Re: [Qemu-devel] [PATCH v2 08/18] nvdimm: init backend memory mapping and config data area
Date: Tue, 8 Sep 2015 21:38:17 +0800
Message-ID: <55EEE4C9.4090400@linux.intel.com>
In-Reply-To: <20150907161155.154a02f8@nial.brq.redhat.com>
References: <1439563931-12352-1-git-send-email-guangrong.xiao@linux.intel.com>
	<1439563931-12352-9-git-send-email-guangrong.xiao@linux.intel.com>
	<20150907161155.154a02f8@nial.brq.redhat.com>

On 09/07/2015 10:11 PM, Igor Mammedov wrote:
> On Fri, 14 Aug 2015 22:52:01 +0800
> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>
>> The parameter @file is used as backing memory for NVDIMM, which is
>> divided into two parts if @dataconfig is true:
>> - the first part is (0, size - 128K], which is used as PMEM (Persistent
>>   Memory)
>> - the 128K at the end of the file, which is used as the Config Data Area
>>   that stores label namespace data
>>
>> @file supports both regular files and block devices; either kind can be
>> assigned for test and emulation. However, in the real world, for
>> performance reasons, we usually use one of these as the NVDIMM backing
>> file:
>> - a regular file in a DAX-enabled filesystem created on an NVDIMM device
>>   on the host
>> - the raw PMEM device on the host, e.g. /dev/pmem0
>
> A lot of code in this series could reuse what QEMU already
> uses for implementing pc-dimm devices.
>
> Here are common concepts that could be reused:
> - On a physical system both DIMM and NVDIMM devices use the same slots.
>   We could share QEMU's '-m slots' option between both devices. An
>   alternative to sharing would be to introduce a '-machine nvdimm_slots'
>   option.
>   And yes, we need to know the number of NVDIMMs to describe them all in
>   the ACPI table (taking into account possible future hotplug of NVDIMM
>   devices).
>   I'd go the same way as on real hardware and make them share the same
>   slots.

I'd prefer sharing slots between pc-dimm and nvdimm; it makes it easier to
reuse the logic of slot assignment and plug/unplug. A hypothetical command
line is sketched below.

> - They share the same physical address space and limits on how much
>   memory the system can handle. So I'd suggest sharing the existing
>   '-m maxmem' option and reusing the hotplug_memory address space.

Sounds good to me.

> Essentially what I'm suggesting is to inherit NVDIMM's implementation
> from pc-dimm, reusing all of its code/backends, and just override the
> parts that do memory mapping into the guest's address space to
> accommodate NVDIMM's requirements.

Good idea!
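
To make the slot/maxmem sharing above concrete, here is a hypothetical
command line. Note that the 'slot' property on pc-nvdimm and its accounting
against '-m slots'/'maxmem' are only assumptions used for illustration;
this series does not implement them:

    qemu-system-x86_64 -m 2G,slots=4,maxmem=32G \
        -object memory-backend-ram,id=m0,size=1G \
        -device pc-dimm,memdev=m0,slot=0 \
        -device pc-nvdimm,file=/dev/pmem0,slot=1

The point is simply that both device kinds would draw from the same
slots/maxmem pool instead of nvdimm keeping its own accounting.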
We have to differentiate pc-dimm and nvdimm in the common code, and nvdimm
differs from pc-dimm in some points (for example, it has a reserved region
and needs to support live migration of label data).

How about renaming 'pc-nvdimm' to 'memory-device' and making it a common
device type, then building pc-dimm and nvdimm on top of it? Something like:

/* common parent for both kinds of memory devices */
static TypeInfo memory_device_info = {
    .name   = TYPE_MEM_DEV,
    .parent = TYPE_DEVICE,
};

static TypeInfo pc_dimm_info = {
    .name   = TYPE_PC_DIMM,
    .parent = TYPE_MEM_DEV,
};

static TypeInfo nvdimm_info = {
    .name   = TYPE_NVDIMM,
    .parent = TYPE_MEM_DEV,
};

It also makes CONFIG_NVDIMM and CONFIG_HOT_PLUG independent of each other.

>
>>
>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> ---
>>   hw/mem/nvdimm/pc-nvdimm.c  | 109 ++++++++++++++++++++++++++++++++++++++++++++-
>>   include/hw/mem/pc-nvdimm.h |   7 +++
>>   2 files changed, 115 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/mem/nvdimm/pc-nvdimm.c b/hw/mem/nvdimm/pc-nvdimm.c
>> index 7a270a8..97710d1 100644
>> --- a/hw/mem/nvdimm/pc-nvdimm.c
>> +++ b/hw/mem/nvdimm/pc-nvdimm.c
>> @@ -22,12 +22,20 @@
>>    * License along with this library; if not, see <http://www.gnu.org/licenses/>
>>    */
>>
>> +#include
>> +#include
>> +#include
>> +
>> +#include "exec/address-spaces.h"
>>   #include "hw/mem/pc-nvdimm.h"
>>
>> -#define PAGE_SIZE      (1UL << 12)
>> +#define PAGE_SIZE               (1UL << 12)
>> +
>> +#define MIN_CONFIG_DATA_SIZE    (128 << 10)
>>
>>   static struct nvdimms_info {
>>       ram_addr_t current_addr;
>> +    int device_index;
>>   } nvdimms_info;
>>
>>   /* the address range [offset, ~0ULL) is reserved for NVDIMM. */
>> @@ -37,6 +45,26 @@ void pc_nvdimm_reserve_range(ram_addr_t offset)
>>       nvdimms_info.current_addr = offset;
>>   }
>>
>> +static ram_addr_t reserved_range_push(uint64_t size)
>> +{
>> +    uint64_t current;
>> +
>> +    current = ROUND_UP(nvdimms_info.current_addr, PAGE_SIZE);
>> +
>> +    /* do not have enough space? */
>> +    if (current + size < current) {
>> +        return 0;
>> +    }
>> +
>> +    nvdimms_info.current_addr = current + size;
>> +    return current;
>> +}
>
> You can't use all memory above the hotplug_memory area, since we have to
> tell the guest where the 64-bit PCI window starts, and currently it
> should start at reserved-memory-end (but it doesn't, due to a bug; I've
> just posted a fix to qemu-devel:
> "[PATCH 0/2] pc: fix 64-bit PCI window clashing with memory hotplug
> region").

Ah, got it, thanks for pointing it out.
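
Incidentally, here is a minimal sketch of how reserved_range_push() could
respect such a limit, building on the nvdimms_info/PAGE_SIZE definitions
quoted above; 'limit' is a hypothetical value the board code would set to
the address where the 64-bit PCI window has to start:

static ram_addr_t reserved_range_push_bounded(uint64_t size,
                                              ram_addr_t limit)
{
    uint64_t current = ROUND_UP(nvdimms_info.current_addr, PAGE_SIZE);

    /* reject wrap-around as well as ranges running past the limit,
     * i.e. into the area reserved for the 64-bit PCI window */
    if (current + size < current || current + size > limit) {
        return 0;
    }

    nvdimms_info.current_addr = current + size;
    return current;
}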
>
>> +
>> +static uint32_t new_device_index(void)
>> +{
>> +    return nvdimms_info.device_index++;
>> +}
>> +
>>   static char *get_file(Object *obj, Error **errp)
>>   {
>>       PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
>> @@ -48,6 +76,11 @@ static void set_file(Object *obj, const char *str, Error **errp)
>>   {
>>       PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
>>
>> +    if (memory_region_size(&nvdimm->mr)) {
>> +        error_setg(errp, "cannot change property value");
>> +        return;
>> +    }
>> +
>>       if (nvdimm->file) {
>>           g_free(nvdimm->file);
>>       }
>> @@ -76,13 +109,87 @@ static void pc_nvdimm_init(Object *obj)
>>                            set_configdata, NULL);
>>   }
>>
>> +static uint64_t get_file_size(int fd)
>> +{
>> +    struct stat stat_buf;
>> +    uint64_t size;
>> +
>> +    if (fstat(fd, &stat_buf) < 0) {
>> +        return 0;
>> +    }
>> +
>> +    if (S_ISREG(stat_buf.st_mode)) {
>> +        return stat_buf.st_size;
>> +    }
>> +
>> +    if (S_ISBLK(stat_buf.st_mode) && !ioctl(fd, BLKGETSIZE64, &size)) {
>> +        return size;
>> +    }
>> +
>> +    return 0;
>> +}
>
> All this file stuff I'd leave to the already existing backends like
> memory-backend-file or even memory-backend-ram, which already do the
> above and more, allowing persistent and volatile NVDIMMs to be
> configured without changing the NVDIMM frontend code.
>

The current memory backends map the whole backend size into the guest's
address space. However, nvdimm needs a reserved region for its label data,
which is accessed only by QEMU.

How about introducing two parameters, "reserved_size" and "reserved_addr",
to TYPE_MEMORY_BACKEND, so that only the region [0, size - reserved_size)
is mapped into the guest and the remaining part is pointed to by
"reserved_addr"? A rough sketch of the idea follows.
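
The function below and its parameters are made up for illustration (only
the memory_region_*() helpers are existing QEMU API), and whether the split
is driven by a new "reserved_size" backend property is exactly the open
question:

#include "qemu/osdep.h"
#include "exec/memory.h"

/*
 * Map only the PMEM part of the backend into the guest; keep the trailing
 * label/config area private to QEMU.
 */
static void nvdimm_map_backend(MemoryRegion *backend_mr, /* whole backend */
                               MemoryRegion *guest_mr,   /* alias for guest */
                               uint64_t label_size,      /* e.g. 128K */
                               void **label_data)        /* QEMU-only pointer */
{
    uint64_t backend_size = memory_region_size(backend_mr);
    uint64_t pmem_size = backend_size - label_size;

    /* guest-visible alias covering [0, size - reserved_size) */
    memory_region_init_alias(guest_mr, NULL, "nvdimm-pmem",
                             backend_mr, 0, pmem_size);

    /*
     * The tail is never mapped into the guest; QEMU reads/writes the label
     * namespace data through this host pointer.
     */
    *label_data = (uint8_t *)memory_region_get_ram_ptr(backend_mr) + pmem_size;
}

The caller would then add 'guest_mr' into the hotplug_memory region with
memory_region_add_subregion(), much as pc-dimm does today.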