Date: Fri, 24 Apr 2015 13:37:21 +0100
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: do lazy allocation of the L2 cache
To: Alberto Garcia
Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, qemu block

On Fri, Apr 24, 2015 at 12:10 PM, Alberto Garcia wrote:
> On Fri 24 Apr 2015 11:52:14 AM CEST, Kevin Wolf wrote:
>
>>> The posix_memalign() call wastes memory. I compared:
>>>
>>> posix_memalign(&memptr, 65536, 2560 * 65536);
>>> memset(memptr, 0, 2560 * 65536);
>>>
>>> with:
>>>
>>> for (i = 0; i < 2560; i++) {
>>>     posix_memalign(&memptr, 65536, 65536);
>>>     memset(memptr, 0, 65536);
>>> }
>>
>> 64k alignment is too much; in practice you need 512b or 4k, which
>> probably wastes a lot less memory.
>
> My tests were with 512b and 4k, and the overhead was around 4k per
> allocation (hence 2560 * 4k = 10240k, the 10MB I was talking about).
>
>> But I just looked at the qcow2 cache code and you're right anyway.
>> Allocating one big block instead of many small allocations in a loop
>> looks like a good idea either way.
>
> The problem is that I had the idea of making the cache dynamic.
>
> Consider the scenario [A] <- [B], with a virtual size of 1TB and [B] a
> newly created snapshot. The L2 cache size is 128MB for each image. You
> read a lot of data from the disk and the cache for [A] starts to fill
> up (at this point [B] is mostly empty, so you get all the data from
> [A]). Then you start to write data into [B], and now its L2 cache
> starts to fill up as well.
>
> After a while you're going to have lots of cache entries in [A] that
> are no longer needed, because the data for those clusters is now in
> [B].
>
> I think it would be nice to have a way to free unused cache entries
> after a while. Do you think mmap plus a periodic timer would work?

I'm hesitant about changes like this because they make QEMU more
complex, slow down the guest, and make the memory footprint volatile.
But if a simple solution addresses the problem, then I'd be happy.

Stefan
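
For reference, a standalone sketch of the comparison discussed in the
thread above. The figures (2560 entries of 64 KiB) come from the
thread, and the 512-byte alignment follows Kevin's remark; reading RSS
from /proc/self/statm is an assumption about how one might measure the
overhead, not how Alberto actually measured it.

#define _DEFAULT_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define ENTRIES    2560
#define ENTRY_SIZE 65536

/* Resident set size in bytes, read from /proc/self/statm (Linux). */
static long rss_bytes(void)
{
    long pages = -1;
    FILE *f = fopen("/proc/self/statm", "r");
    if (f) {
        if (fscanf(f, "%*ld %ld", &pages) != 1) {
            pages = -1;
        }
        fclose(f);
    }
    return pages * sysconf(_SC_PAGESIZE);
}

int main(int argc, char **argv)
{
    long before = rss_bytes();
    void *memptr;

    if (argc > 1 && strcmp(argv[1], "big") == 0) {
        /* One big allocation for the whole cache. */
        if (posix_memalign(&memptr, 512, (size_t)ENTRIES * ENTRY_SIZE)) {
            return 1;
        }
        memset(memptr, 0, (size_t)ENTRIES * ENTRY_SIZE);
    } else {
        /* One allocation per cache entry; each call can carry a few
         * KiB of allocator padding, which adds up over 2560 entries.
         * The allocations are deliberately leaked so they stay
         * resident for the RSS readout below. */
        for (int i = 0; i < ENTRIES; i++) {
            if (posix_memalign(&memptr, 512, ENTRY_SIZE)) {
                return 1;
            }
            memset(memptr, 0, ENTRY_SIZE);
        }
    }

    printf("RSS grew by %ld KiB\n", (rss_bytes() - before) / 1024);
    return 0;
}

Running it once with "big" and once without makes the per-allocation
overhead visible as the difference in RSS growth.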
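
As for what Alberto's "mmap plus a periodic timer" question could look
like in practice, here is a minimal sketch under stated assumptions:
this is not QEMU code, and cache_entry, entry_init and
reclaim_idle_entries are hypothetical names. Each cache entry is backed
by an anonymous mmap, and a periodic timer callback uses
madvise(MADV_DONTNEED) to return the pages of long-idle entries to the
kernel while keeping the mapping itself intact.

#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

#define ENTRY_SIZE   65536
#define IDLE_SECONDS 60

typedef struct {
    void *data;        /* ENTRY_SIZE bytes backed by mmap() */
    time_t last_used;  /* updated on every cache hit */
    int cached;        /* entry currently holds a valid L2 table */
} cache_entry;

/* Allocate one entry; anonymous mmap memory is page-aligned and
 * starts out as shared zero pages, so untouched entries cost
 * almost nothing. */
static int entry_init(cache_entry *e)
{
    e->data = mmap(NULL, ENTRY_SIZE, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (e->data == MAP_FAILED) {
        return -1;
    }
    e->last_used = time(NULL);
    e->cached = 0;
    return 0;
}

/* Periodic timer callback: give the backing pages of long-idle
 * entries back to the kernel. The address range stays mapped, so a
 * later access just faults in fresh zero pages; the table contents
 * must then be re-read from disk. */
static void reclaim_idle_entries(cache_entry *entries, size_t n)
{
    time_t now = time(NULL);
    size_t i;

    for (i = 0; i < n; i++) {
        cache_entry *e = &entries[i];
        if (e->cached && now - e->last_used > IDLE_SECONDS) {
            madvise(e->data, ENTRY_SIZE, MADV_DONTNEED);
            e->cached = 0;
        }
    }
}

Note that this carries exactly the trade-off Stefan raises: the next
miss on a reclaimed entry has to re-read the L2 table from disk, so it
trades a smaller memory footprint for extra guest latency.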