From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44830) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZUYDk-0005n0-47 for qemu-devel@nongnu.org; Wed, 26 Aug 2015 06:46:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZUYDg-0001HX-SD for qemu-devel@nongnu.org; Wed, 26 Aug 2015 06:46:20 -0400 Received: from mga02.intel.com ([134.134.136.20]:3535) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZUYDg-0001HF-87 for qemu-devel@nongnu.org; Wed, 26 Aug 2015 06:46:16 -0400 References: <1439563931-12352-1-git-send-email-guangrong.xiao@linux.intel.com> <1439563931-12352-9-git-send-email-guangrong.xiao@linux.intel.com> <20150825160353.GD8344@stefanha-thinkpad.redhat.com> From: Xiao Guangrong Message-ID: <55DD979A.70804@linux.intel.com> Date: Wed, 26 Aug 2015 18:40:26 +0800 MIME-Version: 1.0 In-Reply-To: <20150825160353.GD8344@stefanha-thinkpad.redhat.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v2 08/18] nvdimm: init backend memory mapping and config data area List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Stefan Hajnoczi Cc: ehabkost@redhat.com, kvm@vger.kernel.org, mst@redhat.com, gleb@kernel.org, mtosatti@redhat.com, qemu-devel@nongnu.org, stefanha@redhat.com, imammedo@redhat.com, pbonzini@redhat.com, rth@twiddle.net On 08/26/2015 12:03 AM, Stefan Hajnoczi wrote: > On Fri, Aug 14, 2015 at 10:52:01PM +0800, Xiao Guangrong wrote: >> The parameter @file is used as backed memory for NVDIMM which is >> divided into two parts if @dataconfig is true: > > s/dataconfig/configdata/ Stupid typo, sorry. > >> @@ -76,13 +109,87 @@ static void pc_nvdimm_init(Object *obj) >> set_configdata, NULL); >> } >> >> +static uint64_t get_file_size(int fd) >> +{ >> + struct stat stat_buf; >> + uint64_t size; >> + >> + if (fstat(fd, &stat_buf) < 0) { >> + return 0; >> + } >> + >> + if (S_ISREG(stat_buf.st_mode)) { >> + return stat_buf.st_size; >> + } >> + >> + if (S_ISBLK(stat_buf.st_mode) && !ioctl(fd, BLKGETSIZE64, &size)) { >> + return size; >> + } > > #ifdef __linux__ for ioctl(fd, BLKGETSIZE64, &size)? > > There is nothing Linux-specific about emulating NVDIMMs so this code > should compile on all platforms. Right. I have no idea for how block devices work on other platforms so I will only allow linux to directly use bock device file in the next version. > >> + >> + return 0; >> +} >> + >> static void pc_nvdimm_realize(DeviceState *dev, Error **errp) >> { >> PCNVDIMMDevice *nvdimm = PC_NVDIMM(dev); >> + char name[512]; >> + void *buf; >> + ram_addr_t addr; >> + uint64_t size, nvdimm_size, config_size = MIN_CONFIG_DATA_SIZE; >> + int fd; >> >> if (!nvdimm->file) { >> error_setg(errp, "file property is not set"); >> } > > Missing return here. Will fix. > >> + >> + fd = open(nvdimm->file, O_RDWR); > > Does it make sense to support read-only NVDIMMs? > > It could be handy for sharing a read-only file between unprivileged > guests. The permissions on the file would only allow read, not write. Make sense. Currently these patchset just implements "shared" mode so that write permission is required, however, please see below: > >> + if (fd < 0) { >> + error_setg(errp, "can not open %s", nvdimm->file); > > s/can not/cannot/ > >> + return; >> + } >> + >> + size = get_file_size(fd); >> + buf = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); > > I guess the user will want to choose between MAP_SHARED and MAP_PRIVATE. > This can be added in the future. Good idea, it will allow guest to write data but discards its content after it exits. Will implement O_RDONLY + MAP_PRIVATE in the near future. > >> + if (buf == MAP_FAILED) { >> + error_setg(errp, "can not do mmap on %s", nvdimm->file); >> + goto do_close; >> + } >> + >> + nvdimm->config_data_size = config_size; >> + if (nvdimm->configdata) { >> + /* reserve MIN_CONFIGDATA_AREA_SIZE for configue data. */ >> + nvdimm_size = size - config_size; >> + nvdimm->config_data_addr = buf + nvdimm_size; >> + } else { >> + nvdimm_size = size; >> + nvdimm->config_data_addr = NULL; >> + } >> + >> + if ((int64_t)nvdimm_size <= 0) { > > The error cases can be detected before mmap(2). That avoids the int64_t > cast and also avoids nvdimm_size underflow and the bogus > nvdimm->config_data_addr calculation above. Okay. > > size = get_file_size(fd); > if (size == 0) { > error_setg(errp, "empty file or unable to get file size"); > goto do_close; > } else if (nvdimm->configdata && size < config_size) {{ > error_setg(errp, "file size is too small to store NVDIMM" > " configure data"); > goto do_close; > } > >> + error_setg(errp, "file size is too small to store NVDIMM" >> + " configure data"); >> + goto do_unmap; >> + } >> + >> + addr = reserved_range_push(nvdimm_size); >> + if (!addr) { >> + error_setg(errp, "do not have enough space for size %#lx.\n", size); > > error_setg() messages must not have a newline at the end. > > Please use "%#" PRIx64 instead of "%#lx" so compilation works on 32-bit > hosts where sizeof(long) == 4. Good catch. > >> + goto do_unmap; >> + } >> + >> + nvdimm->device_index = new_device_index(); >> + sprintf(name, "NVDIMM-%d", nvdimm->device_index); >> + memory_region_init_ram_ptr(&nvdimm->mr, OBJECT(dev), name, nvdimm_size, >> + buf); > > How is the autogenerated name used? > > Why not just use "pc-nvdimm.memory"? Ah. Just for debug proposal :) and i am not sure if a name used for multiple MRs (MemoryRegion) is a good idea. > >> + vmstate_register_ram(&nvdimm->mr, DEVICE(dev)); >> + memory_region_add_subregion(get_system_memory(), addr, &nvdimm->mr); >> + >> + return; > > fd is leaked. Will fix.