From: Marcus
To: Peter Lieven
Cc: Kevin Wolf, Stefan Hajnoczi, "qemu-devel@nongnu.org", Stefan Hajnoczi, Paolo Bonzini
Subject: Re: [Qemu-devel] qemu-img convert cache mode for source
Date: Wed, 5 Mar 2014 08:20:48 -0700
In-Reply-To: <53173841.1020905@kamp.de>
References: <530DBE6C.5030502@kamp.de> <20140226154154.GB20820@stefanha-thinkpad.muc.redhat.com> <530E0FF0.20501@kamp.de> <20140227085711.GC21749@stefanha-thinkpad.redhat.com> <53109E99.3020102@kamp.de> <20140303120349.GA21055@stefanha-thinkpad.redhat.com> <53147385.2090906@kamp.de> <20140304092456.GF25676@stefanha-thinkpad.redhat.com> <53173841.1020905@kamp.de>

I think this is a more generic sysadmin problem. I've seen the same thing
in the past when simply snapshotting a logical volume or a zfs zvol and
copying it off somewhere: the page cache bloats and the system starts
swapping. To avoid it, we wrote a small C program that calls FADV_DONTNEED
on a file, and our backup scripts fork off a process that calls it on the
source file every X seconds.

It's a little strange to me to have qemu-img do this, just as it would be
strange if 'cp' did it, but I can see it as a very useful shortcut if it's
an optional flag. To me qemu-img is just an admin tool, and the admin
should decide whether they want their tool's reads cached.

Some additional things that come to mind:

* If you are running qemu-img on a running VM's source file, FADV_DONTNEED
  may ruin the cache you wanted if the VM is not running with cache=none.

* I think O_DIRECT will cause unexpected problems; for example, the
  zfsonlinux guys (and tmpfs, as mentioned) don't support it yet. If it is
  used, there has to be a fallback or a way to turn it off.
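For what it's worth, such a helper boils down to a single posix_fadvise()
call. The sketch below is only an illustration of the idea, not our actual
program (the usage and error handling here are made up):

#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* offset 0, length 0 means "the whole file": ask the kernel to drop
     * this file's cached pages. */
    int err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    if (err != 0) {
        /* posix_fadvise() returns the error number instead of setting errno. */
        fprintf(stderr, "posix_fadvise: %s\n", strerror(err));
        close(fd);
        return 1;
    }

    close(fd);
    return 0;
}

Note that POSIX_FADV_DONTNEED only drops pages that have already been
written back, so for a file that is being written you would typically
fdatasync() it first; for a read-only snapshot source that is not an issue.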
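And on the O_DIRECT point: the fallback could be as simple as retrying the
open without the flag when the filesystem rejects it. A hypothetical sketch
(the function name and message are mine, not anything from qemu-img):

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

static int open_ro_maybe_direct(const char *path)
{
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0 && errno == EINVAL) {
        /* The filesystem (e.g. tmpfs) does not support O_DIRECT: fall back
         * to normal buffered I/O instead of failing the whole convert. */
        fprintf(stderr, "%s: O_DIRECT not supported, using the page cache\n",
                path);
        fd = open(path, O_RDONLY);
    }
    return fd;
}

(With O_DIRECT the caller also has to use suitably aligned buffers and
request sizes, which is another reason to keep it optional.)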
On Wed, Mar 5, 2014 at 7:44 AM, Peter Lieven wrote:
> On 04.03.2014 10:24, Stefan Hajnoczi wrote:
>> On Mon, Mar 03, 2014 at 01:20:21PM +0100, Peter Lieven wrote:
>>> On 03.03.2014 13:03, Stefan Hajnoczi wrote:
>>>> So what is the actual performance problem you are trying to solve, and
>>>> what benchmark output do you get when you compare runs with
>>>> FADV_DONTNEED against runs without it?
>>> I found the performance to be identical. For the problem, please see below.
>>>> I think there's a danger that the discussion will go around in circles.
>>>> Please post the performance results that kicked off this whole effort
>>>> and let's focus on the data. That way it's much easier to evaluate which
>>>> changes to QEMU are a win and which are not necessary.
>>> I found that under memory pressure the growing buffers lead to vserver
>>> memory being swapped out. This caused trouble especially in overcommit
>>> scenarios (where all memory is backed by swap).
>> I think the general idea is that qemu-img should not impact running
>> guests, even on a heavily loaded machine. But again, this needs to be
>> discussed using concrete benchmarks, with configurations and results
>> posted to the list.
> Sure, this is why I started to look at this. I found that under high
> memory pressure a backup (local storage -> NFS) causes swapping. I started
> to use libnfs as the destination to avoid the influence of the kernel NFS
> client, but I saw that the buffers still grow while a backup is running.
> With the patch I proposed recently,
>
> [PATCH] block: introduce BDRV_O_SEQUENTIAL
>
> I no longer see this behaviour, and I have not yet observed a performance
> penalty.
>
> Peter
>>
>> Stefan
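For what it's worth, the general shape of that idea (independent of what
the QEMU patch actually does internally) is to tell the kernel up front
that the source will be read once, sequentially, and to drop the pages
behind the read position as the copy progresses. A rough standalone
illustration, with an arbitrary 1 MiB chunk size:

#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <unistd.h>

#define CHUNK (1024 * 1024)

static int copy_dropping_source_cache(int src_fd, int dst_fd)
{
    static char buf[CHUNK];
    off_t done = 0;
    ssize_t n;

    /* One pass, front to back: let readahead be aggressive... */
    posix_fadvise(src_fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    while ((n = read(src_fd, buf, CHUNK)) > 0) {
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(dst_fd, buf + off, n - off);
            if (w < 0) {
                return -1;
            }
            off += w;
        }
        done += n;
        /* ...but throw away the source pages we have already copied. */
        posix_fadvise(src_fd, 0, done, POSIX_FADV_DONTNEED);
    }
    return n < 0 ? -1 : 0;
}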