From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:48553) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bxWB5-00014E-R6 for qemu-devel@nongnu.org; Fri, 21 Oct 2016 05:31:57 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bxWB1-000675-SK for qemu-devel@nongnu.org; Fri, 21 Oct 2016 05:31:51 -0400 Received: from mx1.redhat.com ([209.132.183.28]:42650) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1bxWB1-00066c-KE for qemu-devel@nongnu.org; Fri, 21 Oct 2016 05:31:47 -0400 Date: Fri, 21 Oct 2016 11:31:41 +0200 From: Igor Mammedov Message-ID: <20161021113141.27bbc479@nial.brq.redhat.com> In-Reply-To: <20161020165658.GE5057@thinpad.lan.raisama.net> References: <20161020061301.31372-1-haozhong.zhang@intel.com> <20161020143412.5ea6b564@nial.brq.redhat.com> <20161020131138.4gzxk5ekltkiqtq2@hz-desktop> <20161020154215.4b765284@nial.brq.redhat.com> <20161020135610.GY5057@thinpad.lan.raisama.net> <20161020161521.6e20ce2a@nial.brq.redhat.com> <20161020144734.GA5057@thinpad.lan.raisama.net> <20161020173538.1ee9a556@nial.brq.redhat.com> <20161020165658.GE5057@thinpad.lan.raisama.net> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH] hostmem-file: add a property 'notrunc' to avoid data corruption List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Eduardo Habkost Cc: Haozhong Zhang , kwolf@redhat.com, Xiao Guangrong , Peter Crosthwaite , qemu-devel@nongnu.org, mreitz@redhat.com, Paolo Bonzini , Richard Henderson On Thu, 20 Oct 2016 14:56:58 -0200 Eduardo Habkost wrote: > On Thu, Oct 20, 2016 at 05:35:38PM +0200, Igor Mammedov wrote: > > On Thu, 20 Oct 2016 12:47:34 -0200 > > Eduardo Habkost wrote: > > > > > On Thu, Oct 20, 2016 at 04:15:21PM +0200, Igor Mammedov wrote: > > > > On Thu, 20 Oct 2016 11:56:10 -0200 > > > > Eduardo Habkost wrote: > > > > > > > > > On Thu, Oct 20, 2016 at 03:42:15PM +0200, Igor Mammedov wrote: > > > > > > On Thu, 20 Oct 2016 21:11:38 +0800 > > > > > > Haozhong Zhang wrote: > > > > > > > > > > > > > On 10/20/16 14:34 +0200, Igor Mammedov wrote: > > > > > > > >On Thu, 20 Oct 2016 14:13:01 +0800 > > > > > > > >Haozhong Zhang wrote: > > > > > > > > > > > > > > > >> If a file is used as the backend of memory-backend-file and its size is > > > > > > > >> not identical to the property 'size', the file will be truncated. For a > > > > > > > >> file used as the backend of vNVDIMM, its data is expected to be > > > > > > > >> persistent and the truncation may corrupt the existing data. > > > > > > > >I wonder if it's possible just skip 'size' property in your case instead > > > > > > > >'notrunc' property. That way if size is not present one'd get actual size > > > > > > > >using get_file_size() and set 'size' to it? > > > > > > > >And if 'size' is provided and 'size' != file_size then error out. > > > > > > > > > > > > > > > > > > > > > > I don't know how this can be implemented in QEMU. Specially, how does > > > > > > > the memory-backend-file know it's used for vNVDIMM, so that it can > > > > > > > skip the 'size' property? > > > > > > Does memory-backend-file needs to know that it's used by NVDIMM? > > > > > > Looking at nvdimm_realize it doesn't as it's assumes > > > > > > hostemem_size == pmem_size + label_size > > > > > > > > > > > > I'd make hostmem_file.size optional and take size from file > > > > > > and if 'size' is specified explictly require it to mach file size. > > > > > > It's generic and has nothing to do with nvdimm. > > > > > > > > > > We can take size from file, or take size from the > > > > > host_memory_backend_get_memory() callers. > > > > > > > > > > Enumerating all sizes that QEMU can use as input: > > > > > > > > > > A) Backend file size > > > > > B) memory backend "size" option > > > > > C) frontend-provided size (-numa size, -m, or pc-dimm "size" > > > > > property) > > > > -numa size affect only anon memory not backend backed one, for > > > > backend baked memory we use memdev where size comes from backend > > > > > > > > pc-dimm.size is readonly and isn't supposed to influence backend.size > > > > > > > > I'd drop C option > > > > > > If C is not present, it should be, as it affects the guest ABI > > > (and the ABI must never depend on the host you are running or > > > backend configuration, only on the frontend configuration). > > I've meant that C should not affect behavior of backend. > > > > > If we are dropping -numa size in favor of the > > > memory-backend-provided size, that's a bug. > > -numa size is not applicable here as it's not using backends, > > when backends are used it's -numa memdev instead in which case > > numa_info[nodenr].node_mem = object_property_get_int(o, "size", NULL); > > OK. My suggestion is to change this to not ignore the size > option, but to validate it and/or do whatever necessary to get a > MemoryRegion of the right size (your suggestion below to split > the memory region would work too). > > In other words, this should work: > > $ qemu -object memory-backend-file,id=mem0,mem-path=/tmp/mempath,size=2G \ > -numa node,size=2G,memdev=mem0 -m 2G > > this must error out: > > $ qemu -object memory-backend-file,id=mem0,mem-path=/tmp/mempath,size=2G \ > -numa node,size=4G,memdev=mem0 -m 2G > > and this should either error out, or result in a 1GB NUMA node > (but never in a 2G NUMA node): > > $ qemu -object memory-backend-file,id=mem0,mem-path=/tmp/mempath,size=2G \ > -numa node,size=1G,memdev=mem0 -m 1G Looks like a bit of over-engineering and mixing together backend with frontend, I don't see a compelling reason to support -numa size=X,memdev=foo as just memdev=foo is sufficient so I'd rather error out if user adds 'size' to -numa memdev=foo [...] > Probably there's only one case where behavior would be different > than what I was thinking. I would like to be possible to specify > only C (frontend size), and omit both A and B. For example: > > $ mkdir /tmp/mempath > $ qemu -object memory-backend-file,id=mem0,mem-path=/tmp/mempath \ > -numa node,size=1G,memdev=mem0 -m 1G > > I would like this command to create a 1G file inside /tmp/mempath > automatically, without the need for an explicit size=1G argument > to memory-backend-file. This is still C where frontend manages allocation of backend (i.e. changes backend's 'size' property) vs being just consumer of whatever backend provides. Purpose of backend is to create complete/initialized hostside object regardless of which frontend uses it. I see all properties of backend (including sizes) as belonging to hostside which shouldn't be influenced by whatever frontend it's used. And these properties are specified/managed only by mgmt side explicitly when creating a backend (either via CLI or monitor/QMP). In your example I'd error out with: "memory-backend-file: requires 'size' to create file in /tmp/mempath"