Date: Tue, 28 Jun 2016 13:56:21 +0100
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Lieven <pl@kamp.de>
Cc: Paolo Bonzini, qemu-devel@nongnu.org, kwolf@redhat.com, Peter Maydell, mst@redhat.com, mreitz@redhat.com, kraxel@redhat.com
Subject: Re: [Qemu-devel] [PATCH 00/15] optimize Qemu RSS usage
Message-ID: <20160628125620.GI2243@work-vm>
References: <1467104499-27517-1-git-send-email-pl@kamp.de> <57726A20.4000808@kamp.de> <1564831478.2624143.1467116962342.JavaMail.zimbra@redhat.com> <57726E7E.3060709@kamp.de>
In-Reply-To: <57726E7E.3060709@kamp.de>

* Peter Lieven (pl@kamp.de) wrote:
> On 28.06.2016 at 14:29, Paolo Bonzini wrote:
> > > On 28.06.2016 at 13:37, Paolo Bonzini wrote:
> > > > On 28/06/2016 11:01, Peter Lieven wrote:
> > > > > I recently found that Qemu is using several hundred megabytes more
> > > > > RSS memory than older versions such as Qemu 2.2.0. So I started
> > > > > tracing memory allocation and found two major reasons for this.
> > > > >
> > > > > 1) We changed the qemu coroutine pool to have a per-thread and a
> > > > >    global release pool. The chosen pool size and the changed
> > > > >    algorithm could lead to up to 192 free coroutines with just a
> > > > >    single iothread, each of them holding 1MB of stack memory.
> > > >
> > > > But the fix, as you correctly note, is to reduce the stack size. It
> > > > would be nice to compile block-obj-y with -Wstack-usage=2048 too.
> > >
> > > To reveal if there are any big stack allocations in the block layer?
> >
> > Yes. Most should be fixed by now, but a handful are probably still
> > there (definitely one in vvfat.c).
> >
> > > It seems that reducing the stack to 64kB breaks live migration in
> > > some (non-reproducible) cases.
> >
> > Does it hit the guard page?
>
> What would that look like? I get segfaults like this:
>
> segfault at 7f91aa642b78 ip 0000555ab714ef7d sp 00007f91aa642b50 error 6 in qemu-system-x86_64[555ab6f2c000+794000]
>
> Most of the time error 6, sometimes error 7. The segfault is near the sp.

A backtrace would be good.

Dave
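For context on the guard-page question: coroutine stacks are typically
mmap'ed with an inaccessible page at the low end, so that running off the
end of the stack faults immediately instead of silently corrupting adjacent
memory. A SIGSEGV with a fault address close to the saved sp, as in the
report above, is consistent with that. A minimal sketch of the technique,
assuming POSIX mmap/mprotect; names are illustrative, not QEMU's actual
stack allocator:

    /* Allocate a downward-growing stack with a PROT_NONE guard page
     * at the low end.  Error handling trimmed for brevity. */
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static void *alloc_stack_with_guard(size_t stack_size)
    {
        size_t pagesz = sysconf(_SC_PAGESIZE);
        /* One extra page below the usable stack for the guard. */
        void *base = mmap(NULL, stack_size + pagesz,
                          PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED) {
            abort();
        }
        /* Stacks grow down: make the lowest page inaccessible, so an
         * overflow traps here rather than scribbling over the heap. */
        if (mprotect(base, pagesz, PROT_NONE) != 0) {
            abort();
        }
        return (char *)base + pagesz;   /* usable stack starts here */
    }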
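On the -Wstack-usage idea: GCC's -Wstack-usage=N warns for every function
whose stack frame exceeds N bytes, which is how oversized on-stack buffers
(like the one mentioned in vvfat.c) can be spotted before shrinking the
coroutine stacks. A hypothetical example of what it catches:

    /* demo.c -- hypothetical function, for illustration only.
     * Compile with: gcc -c -Wstack-usage=2048 demo.c
     * GCC reports the frame size of any function over the limit. */
    #include <string.h>

    int big_frame(void)
    {
        char buf[4096];   /* 4kB on the stack, over the 2048-byte limit */
        memset(buf, 0, sizeof(buf));
        return buf[0];
    }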
> > > > > 2) Between Qemu 2.2.0 and 2.3.0, RCU was introduced, which led to
> > > > >    delayed freeing of memory. This led to higher heap usage which
> > > > >    could not effectively be returned to the kernel (most likely
> > > > >    due to fragmentation).
> > > >
> > > > I agree that some of the exec.c allocations need some care, but I
> > > > would prefer to use a custom free list or lazy allocation instead
> > > > of mmap.
> > >
> > > This would only help if the elements on the free list were allocated
> > > using mmap? The issue is that RCU delays the freeing, so that the
> > > number of concurrent allocations is high and then a bunch is freed at
> > > once. If the memory had been malloced it would still have caused
> > > trouble.
> >
> > The free list should improve reuse and fragmentation. I'll take a look
> > at lazy allocation of subpages, too.
>
> Ok, that would be good. And for the PhysPageMap, should we use mmap and
> try to avoid the realloc?
>
> Peter

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
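A note on the deferred freeing Peter describes above: under RCU, an object
that is logically dead cannot be freed until all readers have left their
critical sections, so frees are queued and then released in bursts after
each grace period; that burstiness is what drives the heap fragmentation.
A schematic of the pattern (a generic, single-threaded sketch with locking
omitted, not QEMU's actual call_rcu machinery):

    /* Instead of free(p), the object is queued and reclaimed only
     * after a grace period, so many objects can be dead-but-allocated
     * at once. */
    struct rcu_head {
        struct rcu_head *next;
        void (*func)(struct rcu_head *h);   /* reclaim callback */
    };

    static struct rcu_head *pending;    /* objects awaiting a grace period */

    static void call_rcu_sketch(struct rcu_head *h,
                                void (*func)(struct rcu_head *))
    {
        h->func = func;
        h->next = pending;              /* defer: nothing is freed yet */
        pending = h;
    }

    /* Run once no reader can still hold a reference: everything queued
     * up is freed in one burst, which is what stresses the allocator
     * and fragments the heap. */
    static void drain_after_grace_period(void)
    {
        while (pending) {
            struct rcu_head *h = pending;
            pending = h->next;
            h->func(h);
        }
    }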
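And on the PhysPageMap suggestion: the idea would be to reserve a large
virtual range once with mmap and grow the array inside it, so there is no
realloc copy-and-fragment cycle, untouched pages never consume RSS, and
populated pages can later be handed back with madvise(MADV_DONTNEED). A
sketch under those assumptions (illustrative only; the bound and node
layout are made up, not the actual exec.c data structure):

    #include <stdint.h>
    #include <stdlib.h>
    #include <sys/mman.h>

    #define MAX_NODES (1 << 20)         /* upper bound, reserved up front */

    typedef struct { uint32_t children[16]; } Node;

    static Node *map;                   /* grows inside the reservation */
    static size_t map_len;

    static void map_init(void)
    {
        /* MAP_NORESERVE: costs only virtual address space; physical
         * pages are faulted in on first touch. */
        map = mmap(NULL, MAX_NODES * sizeof(Node),
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (map == MAP_FAILED) {
            abort();
        }
    }

    static Node *map_append(void)
    {
        if (map_len == MAX_NODES) {
            abort();                    /* reservation exhausted */
        }
        return &map[map_len++];         /* no realloc, no copying */
    }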