qemu-devel.nongnu.org archive mirror
From: Peter Lieven <pl@kamp.de>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	qemu-devel@nongnu.org, kwolf@redhat.com,
	peter maydell <peter.maydell@linaro.org>,
	mst@redhat.com, mreitz@redhat.com, kraxel@redhat.com
Subject: Re: [Qemu-devel] [PATCH 00/15] optimize Qemu RSS usage
Date: Tue, 28 Jun 2016 16:43:50 +0200
Message-ID: <57728D26.7080201@kamp.de>
In-Reply-To: <20160628125620.GI2243@work-vm>

On 28.06.2016 at 14:56, Dr. David Alan Gilbert wrote:
> * Peter Lieven (pl@kamp.de) wrote:
>> On 28.06.2016 at 14:29, Paolo Bonzini wrote:
>>>> On 28.06.2016 at 13:37, Paolo Bonzini wrote:
>>>>> On 28/06/2016 11:01, Peter Lieven wrote:
>>>>>> I recently found that Qemu is using several hundred megabytes more RSS
>>>>>> memory than older versions such as Qemu 2.2.0. So I started tracing
>>>>>> memory allocation and found 2 major reasons for this.
>>>>>>
>>>>>> 1) We changed the qemu coroutine pool to have a per-thread and a global
>>>>>>       release pool. The chosen pool size and the changed algorithm could
>>>>>>       lead to up to 192 free coroutines with just a single iothread, each
>>>>>>       of the coroutines in the pool having 1MB of stack memory.
>>>>> But the fix, as you correctly note, is to reduce the stack size.  It
>>>>> would be nice to compile block-obj-y with -Wstack-usage=2048 too.
>>>> To reveal if there are any big stack allocations in the block layer?
>>> Yes.  Most should be fixed by now, but a handful are probably still there.
>>> (definitely one in vvfat.c).
>>>
>>>> As it seems reducing to 64kB breaks live migration in some (non reproducible) cases.
>>> Does it hit the guard page?
>> What would that look like? I get segfaults like this:
>>
>> segfault at 7f91aa642b78 ip 0000555ab714ef7d sp 00007f91aa642b50 error 6 in qemu-system-x86_64[555ab6f2c000+794000]
>>
>> Most of the time error 6, sometimes error 7. The faulting address is near the sp.
> A backtrace would be good.

Here we go. My old friend nc_sendv_compat ;-)

Again the question: Would you go for reducing the stack size and eliminating all stack eaters?

The static netbuf in nc_sendv_compat is no problem.

And: I would add the guard page without MAP_GROWSDOWN and mmap the rest of the
stack with this flag if available. That way we are safe on non-Linux systems, on Linux before 3.9,
and with merged memory regions.
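
A minimal sketch of the guard-page part of this idea, assuming a POSIX system
with mmap and mprotect. The names stack_alloc_guarded and stack_free_guarded
are hypothetical and not QEMU functions, and the optional MAP_GROWSDOWN mapping
of the rest of the stack is deliberately left out:

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not QEMU API): map a stack with at least `size`
 * usable bytes plus one inaccessible guard page at the low end.
 * Returns a pointer to the usable region, or NULL on failure.
 * The total mapping length is stored in *total_out for the later unmap.
 */
void *stack_alloc_guarded(size_t size, size_t *total_out)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t usable = (size + page - 1) & ~(page - 1);  /* round up to pages */
    size_t total = usable + page;

    /* Plain private anonymous mapping, no MAP_GROWSDOWN. */
    char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return NULL;
    }

    /* Stacks grow downwards on x86, so the guard page is the lowest page. */
    if (mprotect(base, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }

    *total_out = total;
    return base + page;
}

/* Release a stack obtained from stack_alloc_guarded(). */
void stack_free_guarded(void *stack, size_t total)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    munmap((char *)stack - page, total);
}
```

With such a layout, running off the end of the stack by less than one page
faults on the guard page instead of silently corrupting a neighbouring mapping.
A single frame larger than a page can still jump over it.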

Peter

---

Program received signal SIGSEGV, Segmentation fault.
0x0000555555a2ee35 in nc_sendv_compat (nc=0x0, iov=0x0, iovcnt=0, flags=0)
     at net/net.c:701
(gdb) bt full
#0  0x0000555555a2ee35 in nc_sendv_compat (nc=0x0, iov=0x0, iovcnt=0, flags=0)
     at net/net.c:701
         buf = '\000' <repeats 65890 times>...
         buffer = 0x0
         offset = 0
#1  0x0000555555a2f058 in qemu_deliver_packet_iov (sender=0x5555565a46b0,
     flags=0, iov=0x7ffff7e98d20, iovcnt=1, opaque=0x555557802370)
     at net/net.c:745
         nc = 0x555557802370
         ret = 21845
#2  0x0000555555a3132d in qemu_net_queue_deliver (queue=0x555557802590,
     sender=0x5555565a46b0, flags=0, data=0x55555659e2a8 "", size=74)
     at net/queue.c:163
         ret = -1
         iov = {iov_base = 0x55555659e2a8, iov_len = 74}
#3  0x0000555555a3178b in qemu_net_queue_flush (queue=0x555557802590)
     at net/queue.c:260
         packet = 0x55555659e280
         ret = 21845
#4  0x0000555555a2eb7a in qemu_flush_or_purge_queued_packets (
     nc=0x555557802370, purge=false) at net/net.c:629
No locals.
#5  0x0000555555a2ebe4 in qemu_flush_queued_packets (nc=0x555557802370)
     at net/net.c:642
No locals.
#6  0x00005555557747b7 in virtio_net_set_status (vdev=0x555556fb32a8,
     status=7 '\a') at /usr/src/qemu-2.5.0/hw/net/virtio-net.c:178
         ncs = 0x555557802370
         queue_started = true
         n = 0x555556fb32a8
         __func__ = "virtio_net_set_status"
         q = 0x555557308b50
         i = 0
         queue_status = 7 '\a'
#7  0x0000555555795501 in virtio_set_status (vdev=0x555556fb32a8, val=7 '\a')
     at /usr/src/qemu-2.5.0/hw/virtio/virtio.c:618
         k = 0x55555657eb40
         __func__ = "virtio_set_status"
#8  0x00005555557985e6 in virtio_vmstate_change (opaque=0x555556fb32a8,
     running=1, state=RUN_STATE_RUNNING)
     at /usr/src/qemu-2.5.0/hw/virtio/virtio.c:1539
         vdev = 0x555556fb32a8
         qbus = 0x555556fb3240
         __func__ = "virtio_vmstate_change"
         k = 0x555556570420
         backend_run = true
#9  0x00005555558592ae in vm_state_notify (running=1, state=RUN_STATE_RUNNING)
     at vl.c:1601
         e = 0x555557320cf0
         next = 0x555557af4c40
#10 0x000055555585737d in vm_start () at vl.c:756
         requested = RUN_STATE_MAX
#11 0x0000555555a209ec in process_incoming_migration_co (opaque=0x5555566a1600)
     at migration/migration.c:392
         f = 0x5555566a1600
         local_err = 0x0
         mis = 0x5555575ab0e0
         ps = POSTCOPY_INCOMING_NONE
         ret = 0
#12 0x0000555555b61efd in coroutine_trampoline (i0=1465036928, i1=21845)
     at util/coroutine-ucontext.c:80
         arg = {p = 0x55555752b080, i = {1465036928, 21845}}
         self = 0x55555752b080
         co = 0x55555752b080
#13 0x00007ffff5cb7800 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#14 0x00007fffffffcb40 in ?? ()
No symbol table info available.
#15 0x0000000000000000 in ?? ()
No symbol table info available.
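
Frame #0 above shows a local buf of at least 65890 bytes in nc_sendv_compat,
which cannot fit into a 64kB coroutine stack. One way to get rid of such a
stack eater is to move the buffer to the heap. The following is only a sketch
under that assumption. iov_linearize is a hypothetical helper, not the actual
QEMU code or the fix that was applied:

```c
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (not QEMU code): copy all elements of an iovec
 * array into one heap-allocated buffer instead of a large on-stack array.
 * Returns the buffer (caller frees it) and stores its length in *len_out,
 * or returns NULL if the allocation fails.
 */
void *iov_linearize(const struct iovec *iov, int iovcnt, size_t *len_out)
{
    size_t total = 0;
    for (int i = 0; i < iovcnt; i++) {
        total += iov[i].iov_len;
    }

    /* Allocate at least one byte so an empty iovec still yields a buffer. */
    char *buf = malloc(total ? total : 1);
    if (buf == NULL) {
        return NULL;
    }

    size_t off = 0;
    for (int i = 0; i < iovcnt; i++) {
        if (iov[i].iov_len > 0) {
            memcpy(buf + off, iov[i].iov_base, iov[i].iov_len);
            off += iov[i].iov_len;
        }
    }

    *len_out = total;
    return buf;
}
```

The trade-off is one malloc/free per packet on this compatibility path in
exchange for a stack frame of a few dozen bytes.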


>
> Dave
>
>>
>>>>>> 2) Between Qemu 2.2.0 and 2.3.0 RCU was introduced, which led to delayed
>>>>>>       freeing of memory. This led to higher heap allocations which could
>>>>>>       not effectively be returned to the kernel (most likely due to
>>>>>>       fragmentation).
>>>>> I agree that some of the exec.c allocations need some care, but I would
>>>>> prefer to use a custom free list or lazy allocation instead of mmap.
>>>> This would only help if the elements from the free list would be allocated
>>>> using mmap? The issue is that RCU delays the freeing so that the number of
>>>> concurrent allocations is high and then a bunch is freed at once. If the memory
>>>> was malloced it would still have caused trouble.
>>> The free list should improve reuse and fragmentation.  I'll take a look at
>>> lazy allocation of subpages, too.
>> Ok, that would be good. And for the PhysPageMap we use mmap and try to avoid
>> the realloc?
>>
>> Peter
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


-- 

Kind regards

Peter Lieven

...........................................................

   KAMP Netzwerkdienste GmbH
   Vestische Str. 89-91 | 46117 Oberhausen
   Tel: +49 (0) 208.89 402-50 | Fax: +49 (0) 208.89 402-40
   pl@kamp.de | http://www.kamp.de

   Geschäftsführer: Heiner Lante | Michael Lante
   Amtsgericht Duisburg | HRB Nr. 12154
   USt-Id-Nr.: DE 120607556

...........................................................
