From: Zhong Yang <yang.zhong@intel.com>
To: Shannon Zhao <zhaoshenglong@huawei.com>
Cc: qemu-devel@nongnu.org, pbonzini@redhat.com,
weidong.huang@huawei.com, arei.gonglei@huawei.com,
liujunjie23@huawei.com, wangxinxin.wang@huawei.com,
stone.xulei@huawei.com, zhang.zhanghailiang@huawei.com,
stefanha@redhat.com, berrange@redhat.com, yang.zhong@intel.com
Subject: Re: [Qemu-devel] [PATCH v3] rcu: reduce more than 7MB heap memory by malloc_trim()
Date: Mon, 27 Nov 2017 11:06:35 +0800
Message-ID: <20171127030635.GA29806@yangzhon-Virtual>
In-Reply-To: <5A1A5C6E.9060409@huawei.com>
On Sun, Nov 26, 2017 at 02:17:18PM +0800, Shannon Zhao wrote:
> Hi,
>
> On 2017/11/24 14:30, Yang Zhong wrote:
> > Because of how the glibc memory alloc/free mechanism handles
> > small chunks, if QEMU frequently allocates and frees small
> > chunks, glibc does not reuse chunks from its own free list
> > and keeps allocating from the OS instead, which makes the
> > heap grow bigger and bigger.
> >
> > This patch introduces malloc_trim(), which releases free heap memory.
> >
> > Below are test results from smaps file.
> > (1) without patch
> > 55f0783e1000-55f07992a000 rw-p 00000000 00:00 0 [heap]
> > Size: 21796 kB
> > Rss: 14260 kB
> > Pss: 14260 kB
> >
> > (2) with patch
> > 55cc5fadf000-55cc61008000 rw-p 00000000 00:00 0 [heap]
> > Size: 21668 kB
> > Rss: 6940 kB
> > Pss: 6940 kB
> >
> > Signed-off-by: Yang Zhong <yang.zhong@intel.com>
> > ---
> > configure | 29 +++++++++++++++++++++++++++++
> > util/rcu.c | 6 ++++++
> > 2 files changed, 35 insertions(+)
> >
> > diff --git a/configure b/configure
> > index 0c6e757..6292ab0 100755
> > --- a/configure
> > +++ b/configure
> > @@ -426,6 +426,7 @@ vxhs=""
> > supported_cpu="no"
> > supported_os="no"
> > bogus_os="no"
> > +malloc_trim="yes"
> >
> > # parse CC options first
> > for opt do
> > @@ -3857,6 +3858,30 @@ if test "$tcmalloc" = "yes" && test "$jemalloc" = "yes" ; then
> > exit 1
> > fi
> >
> > +# Even if malloc_trim() is available, these non-libc memory allocators
> > +# do not support it.
> > +if test "$tcmalloc" = "yes" || test "$jemalloc" = "yes" ; then
> > + if test "$malloc_trim" = "yes" ; then
> > + echo "Disabling malloc_trim with non-libc memory allocator"
> > + fi
> > + malloc_trim="no"
> > +fi
> > +
> > +#######################################
> > +# malloc_trim
> > +
> > +if test "$malloc_trim" != "no" ; then
> > + cat > $TMPC << EOF
> > +#include <malloc.h>
> > +int main(void) { malloc_trim(0); return 0; }
> > +EOF
> > + if compile_prog "" "" ; then
> > + malloc_trim="yes"
> > + else
> > + malloc_trim="no"
> > + fi
> > +fi
> > +
> > ##########################################
> > # tcmalloc probe
> >
> > @@ -6012,6 +6037,10 @@ if test "$opengl" = "yes" ; then
> > fi
> > fi
> >
> > +if test "$malloc_trim" = "yes" ; then
> > + echo "CONFIG_MALLOC_TRIM=y" >> $config_host_mak
> > +fi
> > +
> > if test "$avx2_opt" = "yes" ; then
> > echo "CONFIG_AVX2_OPT=y" >> $config_host_mak
> > fi
> > diff --git a/util/rcu.c b/util/rcu.c
> > index ca5a63e..f403b77 100644
> > --- a/util/rcu.c
> > +++ b/util/rcu.c
> > @@ -32,6 +32,9 @@
> > #include "qemu/atomic.h"
> > #include "qemu/thread.h"
> > #include "qemu/main-loop.h"
> > +#if defined(CONFIG_MALLOC_TRIM)
> > +#include <malloc.h>
> > +#endif
> >
> > /*
> > * Global grace period counter. Bit 0 is always one in rcu_gp_ctr.
> > @@ -272,6 +275,9 @@ static void *call_rcu_thread(void *opaque)
> > node->func(node);
> > }
> > qemu_mutex_unlock_iothread();
> > +#if defined(CONFIG_MALLOC_TRIM)
> > + malloc_trim(4 * 1024 * 1024);
> > +#endif
> > }
> > abort();
> > }
> >
>
> Looks like this patch introduces a performance regression. With this
> patch the time of booting a VM with 60 scsi disks on ARM64 is increased
> by 200+ seconds.
>
Hello Shannon,
Thanks for your reply!
Regarding your concern, I ran comparative VM boot-time tests; the results are below:
# test command
./qemu-system-x86_64 -enable-kvm -cpu host -m 2G -smp cpus=4,cores=4,\
threads=1,sockets=1 -drive format=raw,\
file=test.img,index=0,media=disk -nographic
#without patch
root@intel-internal-corei7-64:~# systemd-analyze
Startup finished in 4.979s (kernel) + 1.214s (userspace) = 6.193s
root@intel-internal-corei7-64:~# systemd-analyze
Startup finished in 4.922s (kernel) + 1.175s (userspace) = 6.097s
root@intel-internal-corei7-64:~# systemd-analyze
Startup finished in 4.990s (kernel) + 1.301s (userspace) = 6.291s
root@intel-internal-corei7-64:~# systemd-analyze
Startup finished in 5.063s (kernel) + 1.336s (userspace) = 6.400s
root@intel-internal-corei7-64:~# systemd-analyze
Startup finished in 4.820s (kernel) + 1.237s (userspace) = 6.057s
avg: kernel 4.9548s, userspace 1.2526s
#with this patch
root@intel-internal-corei7-64:~# systemd-analyze
Startup finished in 5.099s (kernel) + 1.579s (userspace) = 6.679s
root@intel-internal-corei7-64:~# systemd-analyze
Startup finished in 5.003s (kernel) + 1.343s (userspace) = 6.347s
root@intel-internal-corei7-64:~# systemd-analyze
Startup finished in 4.853s (kernel) + 1.220s (userspace) = 6.074s
root@intel-internal-corei7-64:~# systemd-analyze
Startup finished in 4.836s (kernel) + 1.111s (userspace) = 5.948s
root@intel-internal-corei7-64:~# systemd-analyze
Startup finished in 4.917s (kernel) + 1.166s (userspace) = 6.083s
avg: kernel 4.9416s, userspace 1.2838s
From the test results above, there is almost no performance regression
on the x86 platform. Sorry, I do not have any ARM-based platform at hand,
so I cannot provide the corresponding data. Thanks!
Regards,
Yang
> Thanks,
> --
> Shannon