From: Max Reitz
Date: Tue, 13 Dec 2016 09:02:34 +0100
Subject: Re: [Qemu-devel] [PATCH RFC 0/1] Allow storing the qcow2 L2 cache in disk
To: Alberto Garcia, qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, Kevin Wolf

On 2016-12-12 at 15:13, Alberto Garcia wrote:
> On Fri 09 Dec 2016 03:21:08 PM CET, Max Reitz wrote:
>
>>> In some scenarios, however, there's a different alternative: if the
>>> qcow2 image is stored in a slow backend (e.g. HDD), we could save
>>> memory by putting the L2 cache in a faster one (SSD) instead of in
>>> RAM.
>
>> Well, from a full design standpoint, it doesn't make a lot of sense
>> to me:
>>
>> We have a two-level on-disk structure for cluster mapping so as to
>> not waste memory for unused areas and so that we don't need to keep
>> one large contiguous chunk of metadata. Accessing the disk is slow,
>> so we also have an in-memory cache, which is just a single-level,
>> fully associative cache replicating the same data (but just a part
>> of it).
>>
>> Now you want to replicate all of it and store it on disk. My mind
>> tells me that is duplicate data: We already have all of the metadata
>> elsewhere on disk, namely in the qcow2 file, and even better, it is
>> not stored in a fully associative structure there but directly
>> mapped, making finding the correct entry much quicker.
>
> Yes, but the use case is that the qcow2 image is stored on a slow
> disk, so things will be faster if we avoid having to read it too
> often.
>
> But the data is there and it needs to be read, so we have three
> options:
>
> 1) Read it every time we need it. It slows things down.
> 2) Keep (part of) it in memory. It can use a lot of memory.
> 3) Keep it on a faster disk.
>
> We're talking about 3) here, and this is not about creating new
> structures, but about changing the storage backend of the existing
> L2 cache (disk rather than RAM).

I'm arguing that we already have an on-disk L2 structure: it is simply
the L1-L2 structure in the qcow2 file itself. The cache only makes
sense because it is in RAM.

>> However, the thing is that the existing structures also only exist
>> in the original qcow2 file and cannot just be placed anywhere else,
>> as opposed to our cache. In order to solve this, we would need to
>> (incompatibly) modify the qcow2 format to allow storing data
>> independently from metadata. I think this would certainly be doable,
>> but the question is whether it is worth the effort.
>
> You mean split the qcow2 file in two: data and metadata? I don't
> think it's worth the effort.

That's the thing: I don't know. I definitely like how simple your
approach is, but from a design standpoint it is not exactly optimal,
because O(n) for a cluster lookup is simply worse than O(1).
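
Roughly, the difference I mean is this (a simplified sketch, not the
actual qcow2 driver code; all names and fields are made up):

    #include <stdint.h>
    #include <stddef.h>

    /* Sketch only; made-up names, not the actual QEMU structures. */
    struct l2_cache_entry {
        uint64_t l2_offset; /* where the cached L2 table is in the file */
        uint64_t *table;    /* the cached table contents, or NULL */
    };

    struct l2_cache {
        struct l2_cache_entry *entries;
        size_t nb_entries;
    };

    /* Fully associative cache: finding a table means scanning every
     * entry, i.e. O(n) in the number of cache entries. */
    uint64_t *cache_lookup(struct l2_cache *c, uint64_t l2_offset)
    {
        for (size_t i = 0; i < c->nb_entries; i++) {
            if (c->entries[i].table &&
                c->entries[i].l2_offset == l2_offset) {
                return c->entries[i].table;
            }
        }
        return NULL; /* miss: read the table from the image */
    }

    /* Directly mapped on-disk structure: the guest offset itself tells
     * us which L1 entry and which L2 entry to use, i.e. O(1). */
    void offset_to_indices(uint64_t guest_offset, uint64_t cluster_size,
                           uint64_t l2_entries, uint64_t *l1_index,
                           uint64_t *l2_index)
    {
        uint64_t cluster = guest_offset / cluster_size;
        *l1_index = cluster / l2_entries;
        *l2_index = cluster % l2_entries;
    }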
>> Maybe we can at least make the cache directly mapped if it is
>> supposed to cover the whole image? That is, we would basically just
>> load all of the L2 tables into memory and bypass the existing cache.
>
> I don't see how this addresses the original use case that I described.

It just fixes the issue that the cache is fully associative; then the
only issue I would have with your approach is that we are keeping
duplicate data.

> But leaving that aside, would that improve anything? I don't think
> the cache itself adds any significant overhead here, IIRC even in
> your presentation at KVM Forum 2015 qcow2 was comparable to raw as
> long as all L2 tables were cached in memory.

I haven't compared CPU usage, though. That may have gone up quite a
bit, I don't know. For large enough images, it may even become a
bottleneck.
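
For illustration, the directly-mapped "load everything" variant I mean
above would be something like this (again just a sketch with made-up
names, not a real implementation):

    #include <stdint.h>
    #include <stdlib.h>

    /* Sketch only: load all L2 entries into one flat array at open
     * time and bypass the cache entirely.  Made-up names; actually
     * reading the tables from the image is stubbed out. */
    struct full_l2_map {
        uint64_t *entries;    /* one entry per guest cluster */
        uint64_t nb_clusters;
    };

    struct full_l2_map *load_all_l2(uint64_t nb_clusters)
    {
        struct full_l2_map *m = malloc(sizeof(*m));
        if (!m) {
            return NULL;
        }
        m->nb_clusters = nb_clusters;
        m->entries = calloc(nb_clusters, sizeof(m->entries[0]));
        if (!m->entries) {
            free(m);
            return NULL;
        }
        /* ... read every L1/L2 table from the image into m->entries ... */
        return m;
    }

    /* Every lookup is then a single O(1) array access, no scan. */
    uint64_t full_l2_lookup(struct full_l2_map *m, uint64_t cluster_index)
    {
        return m->entries[cluster_index];
    }

Of course, with 8-byte L2 entries and the default 64 kB clusters that
is 128 MB of RAM per TB of virtual disk, which is exactly the memory
cost you were trying to avoid in the first place.

Max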