From: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
To: Alexandre DERUMIER <aderumier@odiso.com>,
	Milosz Tanski <milosz@adfin.com>
Cc: cbt <cbt@ceph.com>, ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: [Cbt] client fio-rbd benchmark : debian wheezy vs ubuntu vivid : big difference
Date: Tue, 12 May 2015 08:12:08 +0200
Message-ID: <555199B8.60003@profihost.ag>
In-Reply-To: <2095455190.206287571.1431390861421.JavaMail.zimbra@oxygem.tv>


On 12.05.2015 at 02:34, Alexandre DERUMIER wrote:
>>> You can try it and see if it'll make a difference. Set LD_PRELOAD to 
>>> include the so of jemalloc / tcmalloc before starting FIO. Like this: 
>>>
>>> $ export LD_PRELOAD=${JEMALLOC_PATH}/lib/libjemalloc.so.1 
>>> $ ./run_test.sh 
> 
> Thanks, it's working.
> 
> It seems that jemalloc with fio-rbd gives a 17% IOPS improvement and reduces latency and CPU usage!
> 
> results with 1 numjob:
> 
> glibc :       iops=36668  usr=62.23%, sys=12.13%      
> libtcmalloc : iops=36105  usr=63.54%, sys=8.45%
> jemalloc:     iops=43181  usr=60.91%, sys=10.51%      
> 
> 
> (with numjobs=10, I get around 240k IOPS with jemalloc vs 220k IOPS with glibc/tcmalloc)
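
As a rough check of the numjobs=1 figures quoted above, the relative change of each allocator against the glibc baseline can be recomputed. This is only an illustrative sketch; the dictionary and function names are made up for the example, and the IOPS values are copied from the summary.

```python
# IOPS figures copied from the numjobs=1 summary above.
results = {"glibc": 36668, "libtcmalloc": 36105, "jemalloc": 43181}


def relative_change(value, baseline):
    """Return the change of value relative to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


baseline = results["glibc"]
changes = {name: relative_change(iops, baseline) for name, iops in results.items()}

for name, pct in changes.items():
    print(f"{name:12s} {results[name]:6d} iops  {pct:+6.2f}% vs glibc")
```

With these inputs jemalloc comes out about 17.8% above glibc and libtcmalloc about 1.5% below it, which is in line with the improvement reported in the message.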
> 
> 
> I just found a patch in qemu git to enable tcmalloc:
> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=2847b46958ab0bd604e1b3fcafba0f5ba4375833
> I'll try to test it to see if it helps.

Sounds good. Any reason for not switching to tcmalloc by default in PVE?

Stefan

> 
> fio results
> ------------
> 
> glibc
> -----
> Jobs: 1 (f=1): [r(1)] [100.0% done] [123.9MB/0KB/0KB /s] [31.8K/0/0 iops] [eta 00m:00s]
> rbd_iodepth32-test: (groupid=0, jobs=1): err= 0: pid=7239: Tue May 12 02:05:46 2015
>   read : io=30000MB, bw=146675KB/s, iops=36668, runt=209443msec
>     slat (usec): min=8, max=1245, avg=26.07, stdev=13.99
>     clat (usec): min=107, max=4752, avg=525.40, stdev=207.46
>      lat (usec): min=126, max=4767, avg=551.47, stdev=208.27
>     clat percentiles (usec):
>      |  1.00th=[  171],  5.00th=[  215], 10.00th=[  253], 20.00th=[  322],
>      | 30.00th=[  386], 40.00th=[  450], 50.00th=[  516], 60.00th=[  588],
>      | 70.00th=[  652], 80.00th=[  716], 90.00th=[  796], 95.00th=[  868],
>      | 99.00th=[  996], 99.50th=[ 1048], 99.90th=[ 1192], 99.95th=[ 1240],
>      | 99.99th=[ 1368]
>     bw (KB  /s): min=112328, max=176848, per=100.00%, avg=146768.86, stdev=12974.09
>     lat (usec) : 250=9.61%, 500=37.58%, 750=37.25%, 1000=14.60%
>     lat (msec) : 2=0.96%, 4=0.01%, 10=0.01%
>   cpu          : usr=62.23%, sys=12.13%, ctx=10008821, majf=0, minf=1348
>   IO depths    : 1=0.1%, 2=0.1%, 4=3.0%, 8=28.8%, 16=64.2%, 32=4.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=96.1%, 8=0.1%, 16=0.1%, 32=3.9%, 64=0.0%, >=64=0.0%
>      issued    : total=r=7680000/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
>      latency   : target=0, window=0, percentile=100.00%, depth=32
> 
> Run status group 0 (all jobs):
>    READ: io=30000MB, aggrb=146674KB/s, minb=146674KB/s, maxb=146674KB/s, mint=209443msec, maxt=209443msec
> 
> Disk stats (read/write):
>   sdb: ios=0/22, merge=0/13, ticks=0/0, in_queue=0, util=0.00%
> 
> 
> jemalloc
> --------
> Jobs: 1 (f=1): [r(1)] [100.0% done] [165.4MB/0KB/0KB /s] [42.3K/0/0 iops] [eta 00m:00s]
> rbd_iodepth32-test: (groupid=0, jobs=1): err= 0: pid=7137: Tue May 12 02:01:25 2015
>   read : io=30000MB, bw=172726KB/s, iops=43181, runt=177854msec
>     slat (usec): min=6, max=563, avg=22.28, stdev=14.68
>     clat (usec): min=95, max=3559, avg=456.29, stdev=168.37
>      lat (usec): min=110, max=3579, avg=478.56, stdev=169.06
>     clat percentiles (usec):
>      |  1.00th=[  161],  5.00th=[  201], 10.00th=[  233], 20.00th=[  290],
>      | 30.00th=[  346], 40.00th=[  402], 50.00th=[  454], 60.00th=[  506],
>      | 70.00th=[  556], 80.00th=[  612], 90.00th=[  676], 95.00th=[  732],
>      | 99.00th=[  844], 99.50th=[  900], 99.90th=[ 1020], 99.95th=[ 1064],
>      | 99.99th=[ 1192]
>     bw (KB  /s): min=129936, max=199712, per=100.00%, avg=172822.83, stdev=11812.99
>     lat (usec) : 100=0.01%, 250=12.77%, 500=45.87%, 750=37.60%, 1000=3.62%
>     lat (msec) : 2=0.13%, 4=0.01%
>   cpu          : usr=60.91%, sys=10.51%, ctx=9329053, majf=0, minf=1687
>   IO depths    : 1=0.1%, 2=0.1%, 4=1.8%, 8=26.4%, 16=67.5%, 32=4.2%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=95.9%, 8=0.1%, 16=0.1%, 32=4.0%, 64=0.0%, >=64=0.0%
>      issued    : total=r=7680000/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
>      latency   : target=0, window=0, percentile=100.00%, depth=32
> 
> Run status group 0 (all jobs):
>    READ: io=30000MB, aggrb=172725KB/s, minb=172725KB/s, maxb=172725KB/s, mint=177854msec, maxt=177854msec
> 
> Disk stats (read/write):
>   sdb: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
> 
> 
> libtcmalloc
> ------------
> rbd engine: RBD version: 0.1.10
> Jobs: 1 (f=1): [r(1)] [100.0% done] [140.1MB/0KB/0KB /s] [35.9K/0/0 iops] [eta 00m:00s]
> rbd_iodepth32-test: (groupid=0, jobs=1): err= 0: pid=7039: Tue May 12 01:57:41 2015
>   read : io=30000MB, bw=144423KB/s, iops=36105, runt=212708msec
>     slat (usec): min=10, max=803, avg=26.65, stdev=17.68
>     clat (usec): min=54, max=5052, avg=530.82, stdev=216.05
>      lat (usec): min=114, max=5531, avg=557.46, stdev=217.22
>     clat percentiles (usec):
>      |  1.00th=[  169],  5.00th=[  213], 10.00th=[  251], 20.00th=[  322],
>      | 30.00th=[  386], 40.00th=[  454], 50.00th=[  524], 60.00th=[  596],
>      | 70.00th=[  660], 80.00th=[  724], 90.00th=[  804], 95.00th=[  876],
>      | 99.00th=[ 1048], 99.50th=[ 1128], 99.90th=[ 1336], 99.95th=[ 1464],
>      | 99.99th=[ 2256]
>     bw (KB  /s): min=60416, max=161496, per=100.00%, avg=144529.50, stdev=10827.54
>     lat (usec) : 100=0.01%, 250=9.88%, 500=36.69%, 750=36.97%, 1000=14.88%
>     lat (msec) : 2=1.57%, 4=0.01%, 10=0.01%
>   cpu          : usr=63.54%, sys=8.45%, ctx=9209514, majf=0, minf=2120
>   IO depths    : 1=0.1%, 2=0.1%, 4=3.0%, 8=28.9%, 16=64.0%, 32=4.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=96.1%, 8=0.1%, 16=0.1%, 32=3.8%, 64=0.0%, >=64=0.0%
>      issued    : total=r=7680000/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
>      latency   : target=0, window=0, percentile=100.00%, depth=32
> 
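
The three fio summaries above can be cross-checked against each other: every run issued 7680000 reads for 30000MB, so the request size and the IOPS follow from the runtime. The sketch below parses the `read :` lines as they appear in this output and recomputes both. The regular expression only targets the line format shown here, and all names are chosen for illustration.

```python
import re

# Summary lines copied from the three fio runs above.
SUMMARY_LINES = {
    "glibc": "read : io=30000MB, bw=146675KB/s, iops=36668, runt=209443msec",
    "jemalloc": "read : io=30000MB, bw=172726KB/s, iops=43181, runt=177854msec",
    "libtcmalloc": "read : io=30000MB, bw=144423KB/s, iops=36105, runt=212708msec",
}
TOTAL_READS = 7680000  # "issued : total=r=7680000" in every run

PATTERN = re.compile(r"io=(\d+)MB, bw=(\d+)KB/s, iops=(\d+), runt=\s*(\d+)msec")


def parse_summary(line):
    """Extract io (MB), bw (KB/s), iops and runtime (ms) from a summary line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognised fio summary line: {line!r}")
    io_mb, bw_kbs, iops, runt_ms = (int(group) for group in match.groups())
    return {"io_mb": io_mb, "bw_kbs": bw_kbs, "iops": iops, "runt_ms": runt_ms}


def recomputed_iops(total_ios, runt_ms):
    """IOPS implied by the number of issued I/Os and the runtime (truncated)."""
    return total_ios * 1000 // runt_ms


parsed = {name: parse_summary(line) for name, line in SUMMARY_LINES.items()}
block_kb = parsed["glibc"]["io_mb"] * 1024 / TOTAL_READS  # average request size in KB

for name, fields in parsed.items():
    implied = recomputed_iops(TOTAL_READS, fields["runt_ms"])
    print(f"{name:12s} reported={fields['iops']} implied={implied} block={block_kb:.1f}KB")
```

The recomputed values agree with the reported IOPS for all three runs, and the average request size works out to 4KB.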
> 
> 
> 
> 
> ----- Original message -----
> From: "Milosz Tanski" <milosz@adfin.com>
> To: "aderumier" <aderumier@odiso.com>
> Cc: "Stefan Priebe" <s.priebe@profihost.ag>, "cbt" <cbt@ceph.com>, "ceph-devel" <ceph-devel@vger.kernel.org>
> Sent: Monday, 11 May 2015 23:38:51
> Subject: Re: [Cbt] client fio-rbd benchmark : debian wheezy vs ubuntu vivid : big difference
> 
> On Mon, May 11, 2015 at 10:20 AM, Alexandre DERUMIER 
> <aderumier@odiso.com> wrote: 
>>>> That's pretty interesting. I wasn't aware that there were performance 
>>>> optimisations in glibc. 
>>>>
>>>> As you have a test setup, is it possible to install jessie libc on wheezy? 
>>
>> mmm, I can try that. Not sure it'll work. 
>>
>>
>> BTW, librbd CPU usage is always 3x-4x higher than with KRBD. 
>> A lot of CPU time is spent in malloc/free. It would be great to optimise that. 
>>
>> I don't know whether jemalloc or tcmalloc could be used, as they are for the OSD daemons? 
> 
> You can try it and see if it'll make a difference. Set LD_PRELOAD to 
> include the so of jemalloc / tcmalloc before starting FIO. Like this: 
> 
> $ export LD_PRELOAD=${JEMALLOC_PATH}/lib/libjemalloc.so.1 
> $ ./run_test.sh 
> 
> As a matter of policy, libraries shouldn't force a particular malloc 
> implementation on the users of a particular library. It might go 
> against the user's wishes, not to mention what conflicts would happen 
> if one library wanted / needed jemalloc while another one wanted / 
> needed tcmalloc. 
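
The LD_PRELOAD approach described above keeps the allocator choice with whoever launches the process rather than with the library. A minimal sketch of wrapping a benchmark run this way follows. The helper names and the example path are hypothetical, and the library location depends on the local install.

```python
import os
import subprocess


def preload_env(library_path, base_env=None):
    """Return a copy of the environment with library_path prepended to LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    env["LD_PRELOAD"] = f"{library_path}:{existing}" if existing else library_path
    return env


def run_with_allocator(command, library_path):
    """Run command with the given allocator preloaded, only for that child process."""
    if not os.path.exists(library_path):
        raise FileNotFoundError(library_path)
    return subprocess.run(command, env=preload_env(library_path), check=True)


# Example (placeholder path, adjust to the local install):
# run_with_allocator(["./run_test.sh"], "/path/to/lib/libjemalloc.so.1")
```

Because the variable is set only in the child's environment, other processes on the host keep their default allocator, which makes it easy to compare runs back to back.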
> 
>>
>>
>> Reducing CPU usage could improve qemu performance a lot, as qemu uses only one thread per disk. 
>>
>>
>>
>> ----- Original message ----- 
>> From: "Stefan Priebe" <s.priebe@profihost.ag> 
>> To: "aderumier" <aderumier@odiso.com>, "cbt" <cbt@ceph.com>, "ceph-devel" <ceph-devel@vger.kernel.org> 
>> Sent: Monday, 11 May 2015 12:30:03 
>> Subject: Re: [Cbt] client fio-rbd benchmark : debian wheezy vs ubuntu vivid : big difference 
>>
>> On 11.05.2015 at 07:53, Alexandre DERUMIER wrote: 
>>> It seems to be OK on Debian jessie too (with an extra boost with rbd_cache=true) 
>>>
>>> Maybe it is related to the old glibc on Debian wheezy? 
>>
>> That's pretty interesting. I wasn't aware that there were performance 
>> optimisations in glibc. 
>>
>> As you have a test setup, is it possible to install jessie libc on wheezy? 
>>
>> Stefan 
>>
>>
>>>
>>> debian jessie : rbd_cache=false : iops=202985 : %Cpu(s): 21.9 us,  9.5 sy, 0.0 ni, 66.1 id, 0.0 wa, 0.0 hi, 2.6 si, 0.0 st 
>>> debian jessie : rbd_cache=true  : iops=215290 : %Cpu(s): 27.9 us, 10.8 sy, 0.0 ni, 58.8 id, 0.0 wa, 0.0 hi, 2.6 si, 0.0 st 
>>> ubuntu vivid  : rbd_cache=false : iops=201089 : %Cpu(s): 21.3 us, 12.8 sy, 0.0 ni, 61.8 id, 0.0 wa, 0.0 hi, 4.1 si, 0.0 st 
>>> ubuntu vivid  : rbd_cache=true  : iops=197549 : %Cpu(s): 27.2 us, 15.3 sy, 0.0 ni, 53.2 id, 0.0 wa, 0.0 hi, 4.2 si, 0.0 st 
>>> debian wheezy : rbd_cache=false : iops=161272 : %Cpu(s): 28.4 us, 15.4 sy, 0.0 ni, 52.8 id, 0.0 wa, 0.0 hi, 3.4 si, 0.0 st 
>>> debian wheezy : rbd_cache=true  : iops=135893 : %Cpu(s): 30.0 us, 15.5 sy, 0.0 ni, 51.5 id, 0.0 wa, 0.0 hi, 3.0 si, 0.0 st 
>>>
>>>
>>>
>>> jessie perf report 
>>> ------------------ 
>>> + 9,18% 3,75% fio libc-2.19.so [.] malloc 
>>> + 6,76% 5,70% fio libc-2.19.so [.] _int_malloc 
>>> + 5,83% 5,64% fio libc-2.19.so [.] _int_free 
>>> + 5,11% 0,15% fio libpthread-2.19.so [.] __libc_recv 
>>> + 4,81% 4,81% swapper [kernel.kallsyms] [k] intel_idle 
>>> + 3,72% 0,37% fio libpthread-2.19.so [.] pthread_cond_broadcast@@GLIBC_2.3.2 
>>> + 3,41% 0,04% fio libpthread-2.19.so [.] 0x000000000000efad 
>>> + 3,31% 0,54% fio libpthread-2.19.so [.] pthread_cond_wait@@GLIBC_2.3.2 
>>> + 3,19% 0,09% fio libpthread-2.19.so [.] __lll_unlock_wake 
>>> + 2,52% 0,00% fio librados.so.2.0.0 [.] ceph::buffer::create_aligned(unsigned int, unsigned int) 
>>> + 2,09% 0,08% fio libc-2.19.so [.] __posix_memalign 
>>> + 2,04% 0,26% fio libpthread-2.19.so [.] __lll_lock_wait 
>>> + 2,02% 0,13% fio libc-2.19.so [.] _mid_memalign 
>>> + 1,95% 1,91% fio libc-2.19.so [.] __memcpy_sse2_unaligned 
>>> + 1,88% 0,08% fio libc-2.19.so [.] _int_memalign 
>>> + 1,88% 0,00% fio libc-2.19.so [.] __clone 
>>> + 1,88% 0,00% fio libpthread-2.19.so [.] start_thread 
>>> + 1,88% 0,12% fio fio [.] thread_main 
>>> + 1,37% 1,37% swapper [kernel.kallsyms] [k] native_write_msr_safe 
>>> + 1,29% 0,05% fio libc-2.19.so [.] __lll_unlock_wake_private 
>>> + 1,24% 1,24% fio libpthread-2.19.so [.] pthread_mutex_trylock 
>>> + 1,24% 0,29% fio libc-2.19.so [.] __lll_lock_wait_private 
>>> + 1,19% 0,21% fio librbd.so.1.0.0 [.] std::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::_M_clear() 
>>> + 1,19% 1,19% fio libc-2.19.so [.] free 
>>> + 1,18% 1,18% fio libc-2.19.so [.] malloc_consolidate 
>>> + 1,14% 1,14% fio [kernel.kallsyms] [k] get_futex_key_refs.isra.13 
>>> + 1,10% 1,10% fio [kernel.kallsyms] [k] __schedule 
>>> + 1,00% 0,28% fio librados.so.2.0.0 [.] ceph::buffer::list::append(char const*, unsigned int) 
>>> + 0,96% 0,00% fio librbd.so.1.0.0 [.] 0x000000000005b2e7 
>>> + 0,96% 0,96% fio [kernel.kallsyms] [k] _raw_spin_lock 
>>> + 0,92% 0,21% fio librados.so.2.0.0 [.] ceph::buffer::list::append(ceph::buffer::ptr const&, unsigned int, unsigned int) 
>>> + 0,91% 0,00% fio librados.so.2.0.0 [.] 0x000000000006e6c0 
>>> + 0,90% 0,90% swapper [kernel.kallsyms] [k] __switch_to 
>>> + 0,89% 0,01% fio librbd.so.1.0.0 [.] 0x00000000000ce1f1 
>>> + 0,89% 0,89% swapper [kernel.kallsyms] [k] cpu_startup_entry 
>>> + 0,87% 0,01% fio librados.so.2.0.0 [.] 0x00000000002e3ff1 
>>> + 0,86% 0,00% fio libc-2.19.so [.] 0x00000000000dd50d 
>>> + 0,85% 0,85% fio [kernel.kallsyms] [k] try_to_wake_up 
>>> + 0,83% 0,83% swapper [kernel.kallsyms] [k] __schedule 
>>> + 0,82% 0,82% fio [kernel.kallsyms] [k] copy_user_enhanced_fast_string 
>>> + 0,81% 0,00% fio librados.so.2.0.0 [.] 0x0000000000137abc 
>>> + 0,80% 0,80% swapper [kernel.kallsyms] [k] menu_select 
>>> + 0,75% 0,75% fio [kernel.kallsyms] [k] _raw_spin_lock_bh 
>>> + 0,75% 0,75% fio [kernel.kallsyms] [k] futex_wake 
>>> + 0,75% 0,75% fio libpthread-2.19.so [.] __pthread_mutex_unlock_usercnt 
>>> + 0,73% 0,73% fio [kernel.kallsyms] [k] __switch_to 
>>> + 0,70% 0,70% fio libstdc++.so.6.0.20 [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) 
>>> + 0,70% 0,36% fio librados.so.2.0.0 [.] ceph::buffer::list::iterator::copy(unsigned int, char*) 
>>> + 0,70% 0,23% fio fio [.] get_io_u 
>>> + 0,67% 0,67% fio [kernel.kallsyms] [k] finish_task_switch 
>>> + 0,67% 0,32% fio libpthread-2.19.so [.] pthread_rwlock_unlock 
>>> + 0,67% 0,00% fio librados.so.2.0.0 [.] 0x00000000000cea98 
>>> + 0,64% 0,00% fio librados.so.2.0.0 [.] 0x00000000002e3f87 
>>> + 0,63% 0,63% fio [kernel.kallsyms] [k] futex_wait_setup 
>>> + 0,62% 0,62% swapper [kernel.kallsyms] [k] enqueue_task_fair 
>>>
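
To put a number on how much of the jessie profile above is spent in the allocator, the perf report lines can be parsed and the self overhead of the glibc allocation symbols summed. The sketch below assumes the first percentage column is the children overhead and the second is the self overhead, as in perf's usual children-enabled layout. Which symbols count as allocator work is also an assumption made for this example.

```python
import re

# A few lines copied from the jessie perf report above (comma decimal separators).
PERF_LINES = """\
+ 9,18% 3,75% fio libc-2.19.so [.] malloc
+ 6,76% 5,70% fio libc-2.19.so [.] _int_malloc
+ 5,83% 5,64% fio libc-2.19.so [.] _int_free
+ 5,11% 0,15% fio libpthread-2.19.so [.] __libc_recv
+ 2,09% 0,08% fio libc-2.19.so [.] __posix_memalign
+ 2,02% 0,13% fio libc-2.19.so [.] _mid_memalign
+ 1,88% 0,08% fio libc-2.19.so [.] _int_memalign
+ 1,19% 1,19% fio libc-2.19.so [.] free
+ 1,18% 1,18% fio libc-2.19.so [.] malloc_consolidate
"""

# Symbols treated as allocator work; this grouping is an assumption.
ALLOCATOR_SYMBOLS = {
    "malloc", "_int_malloc", "_int_free", "free", "malloc_consolidate",
    "__posix_memalign", "_mid_memalign", "_int_memalign",
}

LINE = re.compile(r"^\+\s+([\d,]+)%\s+([\d,]+)%\s+(\S+)\s+(\S+)\s+\[(.)\]\s+(.+)$")


def parse_line(line):
    """Split one report line into children %, self %, command, object and symbol."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised perf line: {line!r}")
    children, self_pct, command, dso, kind, symbol = match.groups()
    return {
        "children": float(children.replace(",", ".")),
        "self": float(self_pct.replace(",", ".")),
        "command": command,
        "dso": dso,
        "kind": kind,
        "symbol": symbol.strip(),
    }


rows = [parse_line(line) for line in PERF_LINES.splitlines() if line.strip()]
allocator_self = sum(row["self"] for row in rows if row["symbol"] in ALLOCATOR_SYMBOLS)
print(f"allocator self overhead: {allocator_self:.2f}%")
```

For the lines listed, the allocator symbols add up to roughly 17.75% of all samples. Since the report also contains idle (swapper) samples, the share of fio's own CPU time is presumably higher.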
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in 
>> the body of a message to majordomo@vger.kernel.org 
>> More majordomo info at http://vger.kernel.org/majordomo-info.html 
> 
> 
> 

