From: Peter Lieven <pl@kamp.de>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	qemu-devel@nongnu.org, kwolf@redhat.com,
	peter maydell <peter.maydell@linaro.org>,
	mst@redhat.com, mreitz@redhat.com, kraxel@redhat.com
Subject: Re: [Qemu-devel] [PATCH 00/15] optimize Qemu RSS usage
Date: Tue, 28 Jun 2016 16:52:01 +0200
Message-ID: <57728F11.3080802@kamp.de>
In-Reply-To: <57728D26.7080201@kamp.de>

On 28.06.2016 at 16:43, Peter Lieven wrote:
> On 28.06.2016 at 14:56, Dr. David Alan Gilbert wrote:
>> * Peter Lieven (pl@kamp.de) wrote:
>>> On 28.06.2016 at 14:29, Paolo Bonzini wrote:
>>>>> On 28.06.2016 at 13:37, Paolo Bonzini wrote:
>>>>>> On 28/06/2016 11:01, Peter Lieven wrote:
>>>>>>> I recently found that Qemu is using several hundred megabytes more RSS
>>>>>>> memory than older versions such as Qemu 2.2.0. So I started tracing
>>>>>>> memory allocation and found 2 major reasons for this.
>>>>>>>
>>>>>>> 1) We changed the qemu coroutine pool to have a per thread and a global
>>>>>>>    release pool. The chosen pool size and the changed algorithm could
>>>>>>>    lead to up to 192 free coroutines with just a single iothread, each
>>>>>>>    coroutine in the pool having 1MB of stack memory.
>>>>>> But the fix, as you correctly note, is to reduce the stack size.  It
>>>>>> would be nice to compile block-obj-y with -Wstack-usage=2048 too.
>>>>> To reveal if there are any big stack allocations in the block layer?
>>>> Yes.  Most should be fixed by now, but a handful are probably still there.
>>>> (definitely one in vvfat.c).
>>>>
>>>>> As it seems reducing to 64kB breaks live migration in some (non reproducible) cases.
>>>> Does it hit the guard page?
>>> What would that look like? I get segfaults like this:
>>>
>>> segfault at 7f91aa642b78 ip 0000555ab714ef7d sp 00007f91aa642b50 error 6 in qemu-system-x86_64[555ab6f2c000+794000]
>>>
>>> Most of the time it is error 6, sometimes error 7. The segfault is near the sp.
>> A backtrace would be good.
>
> Here we go. My old friend nc_sendv_compat ;-)

This has already been fixed in master. My test systems use an older Qemu ;-)

Peter

>
> Again the question: Would you go for reducing the stack size and eliminating all stack eaters?
>
> The static netbuf in nc_sendv_compat is no problem.
>
> And: I would go for adding the guard page without MAP_GROWSDOWN and mmapping the rest of the
> stack with this flag if available. So we are safe on non-Linux systems, Linux before 3.9, or merged memory regions.
>
> Peter
>
> ---
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x0000555555a2ee35 in nc_sendv_compat (nc=0x0, iov=0x0, iovcnt=0, flags=0)
>     at net/net.c:701
> (gdb) bt full
> #0  0x0000555555a2ee35 in nc_sendv_compat (nc=0x0, iov=0x0, iovcnt=0, flags=0)
>     at net/net.c:701
>         buf = '\000' <repeats 65890 times>...
>         buffer = 0x0
>         offset = 0
> #1  0x0000555555a2f058 in qemu_deliver_packet_iov (sender=0x5555565a46b0,
>     flags=0, iov=0x7ffff7e98d20, iovcnt=1, opaque=0x555557802370)
>     at net/net.c:745
>         nc = 0x555557802370
>         ret = 21845
> #2  0x0000555555a3132d in qemu_net_queue_deliver (queue=0x555557802590,
>     sender=0x5555565a46b0, flags=0, data=0x55555659e2a8 "", size=74)
>     at net/queue.c:163
>         ret = -1
>         iov = {iov_base = 0x55555659e2a8, iov_len = 74}
> #3  0x0000555555a3178b in qemu_net_queue_flush (queue=0x555557802590)
>     at net/queue.c:260
>         packet = 0x55555659e280
>         ret = 21845
> #4  0x0000555555a2eb7a in qemu_flush_or_purge_queued_packets (
>     nc=0x555557802370, purge=false) at net/net.c:629
> No locals.
> #5  0x0000555555a2ebe4 in qemu_flush_queued_packets (nc=0x555557802370)
>     at net/net.c:642
> No locals.
> #6  0x00005555557747b7 in virtio_net_set_status (vdev=0x555556fb32a8,
>     status=7 '\a') at /usr/src/qemu-2.5.0/hw/net/virtio-net.c:178
>         ncs = 0x555557802370
>         queue_started = true
>         n = 0x555556fb32a8
>         __func__ = "virtio_net_set_status"
>         q = 0x555557308b50
>         i = 0
>         queue_status = 7 '\a'
> #7  0x0000555555795501 in virtio_set_status (vdev=0x555556fb32a8, val=7 '\a')
>     at /usr/src/qemu-2.5.0/hw/virtio/virtio.c:618
>         k = 0x55555657eb40
>         __func__ = "virtio_set_status"
> #8  0x00005555557985e6 in virtio_vmstate_change (opaque=0x555556fb32a8,
>     running=1, state=RUN_STATE_RUNNING)
>     at /usr/src/qemu-2.5.0/hw/virtio/virtio.c:1539
>         vdev = 0x555556fb32a8
>         qbus = 0x555556fb3240
>         __func__ = "virtio_vmstate_change"
>         k = 0x555556570420
>         backend_run = true
> #9  0x00005555558592ae in vm_state_notify (running=1, state=RUN_STATE_RUNNING)
>     at vl.c:1601
>         e = 0x555557320cf0
>         next = 0x555557af4c40
> #10 0x000055555585737d in vm_start () at vl.c:756
>         requested = RUN_STATE_MAX
> #11 0x0000555555a209ec in process_incoming_migration_co (opaque=0x5555566a1600)
>     at migration/migration.c:392
>         f = 0x5555566a1600
>         local_err = 0x0
>         mis = 0x5555575ab0e0
>         ps = POSTCOPY_INCOMING_NONE
>         ret = 0
> #12 0x0000555555b61efd in coroutine_trampoline (i0=1465036928, i1=21845)
>     at util/coroutine-ucontext.c:80
>         arg = {p = 0x55555752b080, i = {1465036928, 21845}}
>         self = 0x55555752b080
>         co = 0x55555752b080
> #13 0x00007ffff5cb7800 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> No symbol table info available.
> #14 0x00007fffffffcb40 in ?? ()
> No symbol table info available.
> #15 0x0000000000000000 in ?? ()
> No symbol table info available.
>
>
>>
>> Dave
>>
>>>
>>>>>>> 2) Between Qemu 2.2.0 and 2.3.0 RCU was introduced, which led to delayed
>>>>>>>    freeing of memory. This led to higher heap allocations which could
>>>>>>>    not effectively be returned to the kernel (most likely due to
>>>>>>>    fragmentation).
>>>>>> I agree that some of the exec.c allocations need some care, but I would
>>>>>> prefer to use a custom free list or lazy allocation instead of mmap.
>>>>> This would only help if the elements from the free list would be allocated
>>>>> using mmap? The issue is that RCU delays the freeing so that the number of
>>>>> concurrent allocations is high and then a bunch is freed at once. If the memory
>>>>> was malloced it would still have caused trouble.
>>>> The free list should improve reuse and fragmentation.  I'll take a look at
>>>> lazy allocation of subpages, too.
>>> Ok, that would be good. And for the PhysPageMap we use mmap and try to avoid
>>> the realloc?
>>>
>>> Peter
>>>
>> -- 
>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>


-- 

Kind regards

Peter Lieven

...........................................................

   KAMP Netzwerkdienste GmbH
   Vestische Str. 89-91 | 46117 Oberhausen
   Tel: +49 (0) 208.89 402-50 | Fax: +49 (0) 208.89 402-40
   pl@kamp.de | http://www.kamp.de

   Managing directors: Heiner Lante | Michael Lante
   Amtsgericht Duisburg | HRB No. 12154
   VAT ID No.: DE 120607556

...........................................................
