References: <1467104499-27517-1-git-send-email-pl@kamp.de>
 <57726A20.4000808@kamp.de>
 <1564831478.2624143.1467116962342.JavaMail.zimbra@redhat.com>
From: Peter Lieven
Message-ID: <57726E7E.3060709@kamp.de>
Date: Tue, 28 Jun 2016 14:33:02 +0200
MIME-Version: 1.0
In-Reply-To: <1564831478.2624143.1467116962342.JavaMail.zimbra@redhat.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 00/15] optimize Qemu RSS usage
To: Paolo Bonzini
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, peter maydell, mst@redhat.com,
 dgilbert@redhat.com, mreitz@redhat.com, kraxel@redhat.com

On 28.06.2016 at 14:29, Paolo Bonzini wrote:
>> On 28.06.2016 at 13:37, Paolo Bonzini wrote:
>>> On 28/06/2016 11:01, Peter Lieven wrote:
>>>> I recently found that Qemu is using several hundred megabytes of RSS
>>>> memory more than older versions such as Qemu 2.2.0. So I started
>>>> tracing memory allocation and found 2 major reasons for this.
>>>>
>>>> 1) We changed the qemu coroutine pool to have a per thread and a global
>>>>    release pool. The chosen poolsize and the changed algorithm could
>>>>    lead to up to 192 free coroutines with just a single iothread.
>>>>    Each of the coroutines in the pool has 1MB of stack memory.
>>> But the fix, as you correctly note, is to reduce the stack size. It
>>> would be nice to compile block-obj-y with -Wstack-usage=2048 too.
>> To reveal if there are any big stack allocations in the block layer?
> Yes. Most should be fixed by now, but a handful are probably still there
> (definitely one in vvfat.c).
>
>> It seems that reducing to 64kB breaks live migration in some
>> (non-reproducible) cases.
> Does it hit the guard page?

What would that look like? I get segfaults like this:

segfault at 7f91aa642b78 ip 0000555ab714ef7d sp 00007f91aa642b50 error 6 in qemu-system-x86_64[555ab6f2c000+794000]

Most of the time it is error 6, sometimes error 7. The faulting address is
near the sp.

>
>>>> 2) Between Qemu 2.2.0 and 2.3.0 RCU was introduced, which led to delayed
>>>>    freeing of memory. This led to higher heap allocations which could
>>>>    not effectively be returned to the kernel (most likely due to
>>>>    fragmentation).
>>> I agree that some of the exec.c allocations need some care, but I would
>>> prefer to use a custom free list or lazy allocation instead of mmap.
>> This would only help if the elements from the free list were allocated
>> using mmap? The issue is that RCU delays the freeing so that the number of
>> concurrent allocations is high and then a bunch is freed at once. If the
>> memory was malloced it would still have caused trouble.
> The free list should improve reuse and reduce fragmentation. I'll take a
> look at lazy allocation of subpages, too.

Ok, that would be good. And for the PhysPageMap we use mmap and try to avoid
the realloc?

Peter
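
For reference, the kernel segfault line quoted above can be decoded mechanically. The following is only a minimal sketch, assuming the usual x86 page-fault error-code bits (bit 0: protection violation on a present page, bit 1: write access, bit 2: user mode) and that the kernel prints the error code in hex. The helper name `decode_segfault` and the dictionary keys are made up for illustration and are not part of QEMU or the kernel.

```python
import re

# x86 page-fault error-code bits as they appear in the "segfault at" log line.
PF_PROT = 0x1   # 0: page not present, 1: protection violation on a present page
PF_WRITE = 0x2  # 0: read access, 1: write access
PF_USER = 0x4   # 0: fault in kernel mode, 1: fault in user mode

# All numeric fields in the log line are treated as hexadecimal.
LINE_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Hypothetical helper: split a kernel segfault line into its parts."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    addr = int(m.group("addr"), 16)
    sp = int(m.group("sp"), 16)
    err = int(m.group("err"), 16)
    return {
        "addr_minus_sp": addr - sp,           # distance of the fault from sp
        "page_present": bool(err & PF_PROT),  # False means "page not present"
        "write": bool(err & PF_WRITE),
        "user_mode": bool(err & PF_USER),
    }


line = (
    "segfault at 7f91aa642b78 ip 0000555ab714ef7d "
    "sp 00007f91aa642b50 error 6 in qemu-system-x86_64[555ab6f2c000+794000]"
)
info = decode_segfault(line)
print(info)
```

For the line above this reports a user-mode write to a non-present page, 0x28 (40) bytes above the sp. Error 7 would be the same kind of write, but to a page that is present and not writable. On its own this does not show whether the guard page was hit; it only narrows down what kind of access faulted.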