From: Jianguo Wu <wujianguo@huawei.com>
To: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: linux-mm@kvack.org, Andrea Arcangeli <aarcange@redhat.com>,
qiuxishi <qiuxishi@huawei.com>,
Wanpeng Li <liwanp@linux.vnet.ibm.com>,
Hush Bensen <hush.bensen@gmail.com>,
mitake.hitoshi@gmail.com
Subject: Re: Transparent Hugepage impact on memcpy
Date: Fri, 7 Jun 2013 09:26:58 +0800
Message-ID: <51B136E2.4010606@huawei.com>
In-Reply-To: <CAMO-S2ixv55bGEFGR6Eh=UZgVBz=nv81EckuzWoVi0t4KdB+VA@mail.gmail.com>
Hi Hitoshi,
Thanks for your reply! Please see below.
On 2013/6/6 21:54, Hitoshi Mitake wrote:
> Hi Jianguo,
>
> On Wed, Jun 5, 2013 at 12:26 PM, Jianguo Wu <wujianguo@huawei.com> wrote:
>> Hi,
>> One more question: I wrote a memcpy test program that is mostly the same as perf bench memcpy,
>> but its results are not consistent with perf bench when THP is off.
>>
>>          my program                        perf bench
>> THP:     3.628368 GB/Sec (with prefault)   3.672879 GB/Sec (with prefault)
>> NO-THP:  3.612743 GB/Sec (with prefault)   6.190187 GB/Sec (with prefault)
>>
>> Below is my code:
>>         src = calloc(1, len);
>>         dst = calloc(1, len);
>>
>>         if (prefault)
>>                 memcpy(dst, src, len);
>>         gettimeofday(&tv_start, NULL);
>>         memcpy(dst, src, len);
>>         gettimeofday(&tv_end, NULL);
>>
>>         timersub(&tv_end, &tv_start, &tv_diff);
>>         free(src);
>>         free(dst);
>>
>>         speed = (double)((double)len / timeval2double(&tv_diff));
>>         print_bps(speed);
>>
>> This is weird. Is it possible that perf bench is built with some optimization?
>>
>> Thanks,
>> Jianguo Wu.
>
> perf bench mem memcpy is built with -O6. This is the compile command
> line (you can get this with make V=1):
> gcc -o bench/mem-memcpy-x86-64-asm.o -c -fno-omit-frame-pointer -ggdb3
> -funwind-tables -Wall -Wextra -std=gnu99 -Werror -O6 .... # omitted
>
> Can I see the compile options for your test program and the actual
> command line you used to run perf bench mem memcpy?
>
I compiled my test program with just gcc -o memcpy-test memcpy-test.c.
I also tried the same compile options as perf bench mem memcpy, and
the test results showed no difference.
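For reference, a self-contained sketch of the fragment quoted above might
look like the following. The names timeval_to_seconds and measure_memcpy,
the error handling, and the zero-elapsed check are assumptions made for
illustration; they stand in for the timeval2double and print_bps helpers
that the fragment calls but does not show, and this is not necessarily the
exact program that produced the numbers above.

```c
#define _DEFAULT_SOURCE /* for timersub() on glibc */
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical helper: convert a struct timeval to seconds. */
double timeval_to_seconds(const struct timeval *tv)
{
        return (double)tv->tv_sec + (double)tv->tv_usec / 1000000.0;
}

/*
 * Copy len bytes once and return the throughput in bytes per second.
 * When prefault is non-zero, an untimed copy runs first so that page
 * faults are taken before the timed copy.
 * Returns -1.0 if allocation fails or the elapsed time is not measurable.
 */
double measure_memcpy(size_t len, int prefault)
{
        struct timeval tv_start, tv_end, tv_diff;
        double elapsed;
        char *src = calloc(1, len);
        char *dst = calloc(1, len);

        if (!src || !dst) {
                free(src);
                free(dst);
                return -1.0;
        }

        if (prefault)
                memcpy(dst, src, len);

        gettimeofday(&tv_start, NULL);
        memcpy(dst, src, len);
        gettimeofday(&tv_end, NULL);
        timersub(&tv_end, &tv_start, &tv_diff);

        free(src);
        free(dst);

        elapsed = timeval_to_seconds(&tv_diff);
        if (elapsed <= 0.0)
                return -1.0;
        return (double)len / elapsed;
}
```

Calling measure_memcpy((size_t)1 << 30, 1) from main() and printing the
result corresponds to a 1 GB copy with prefault. The divisor used to turn
bytes per second into GB/Sec is left to the caller.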
This is the command line I used to run perf bench mem memcpy:
#./perf bench mem memcpy -l 1gb -o
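The THP mode in effect for each run is the bracketed entry in
/sys/kernel/mm/transparent_hugepage/enabled, as in the cat output quoted
further down. A test program could record that mode next to its result. The
following is only a sketch: the function names parse_active_mode and
read_thp_mode are hypothetical, and it assumes the file holds a single line
such as "[always] madvise never".

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the bracketed word from a line such as "[always] madvise never".
 * Returns 0 on success, or -1 if there is no complete "[...]" pair or the
 * word does not fit in out (including the terminating NUL).
 */
int parse_active_mode(const char *line, char *out, size_t out_len)
{
        const char *start = strchr(line, '[');
        const char *end = start ? strchr(start, ']') : NULL;
        size_t n;

        if (!start || !end)
                return -1;
        n = (size_t)(end - start - 1);
        if (n + 1 > out_len)
                return -1;
        memcpy(out, start + 1, n);
        out[n] = '\0';
        return 0;
}

/* Read the sysfs file and store the active mode in out; -1 on any error. */
int read_thp_mode(char *out, size_t out_len)
{
        char line[128];
        FILE *fp = fopen("/sys/kernel/mm/transparent_hugepage/enabled", "r");
        int ret = -1;

        if (!fp)
                return -1;
        if (fgets(line, sizeof(line), fp))
                ret = parse_active_mode(line, out, out_len);
        fclose(fp);
        return ret;
}
```

Keeping the parsing separate from the file access means the bracket
handling can be checked on plain strings, without depending on the sysfs
state of the machine.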
Thanks,
Jianguo Wu
> Thanks,
> Hitoshi
>
>>
>> On 2013/6/4 16:57, Jianguo Wu wrote:
>>
>>> Hi all,
>>>
>>> I tested memcpy with perf bench and found that, in the prefault case, memcpy performs worse
>>> when Transparent Hugepage is on.
>>>
>>> With THP on, the result is 3.672879 GB/Sec (with prefault), while with THP off it is 6.190187 GB/Sec (with prefault).
>>>
>>> I expected THP to improve performance, but the test results clearly show otherwise.
>>> Andrea mentioned that THP makes "clear_page/copy_page less cache friendly" in
>>> http://events.linuxfoundation.org/slides/2011/lfcs/lfcs2011_hpc_arcangeli.pdf.
>>>
>>> I do not quite understand this. Could you please give me some comments? Thanks!
>>>
>>> I tested on Linux-3.4-stable, and my machine info is:
>>> Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
>>>
>>> available: 2 nodes (0-1)
>>> node 0 cpus: 0 1 2 3 8 9 10 11
>>> node 0 size: 24567 MB
>>> node 0 free: 23550 MB
>>> node 1 cpus: 4 5 6 7 12 13 14 15
>>> node 1 size: 24576 MB
>>> node 1 free: 23767 MB
>>> node distances:
>>> node   0   1
>>>   0:  10  20
>>>   1:  20  10
>>>
>>> Below is test result:
>>> ---with THP---
>>> #cat /sys/kernel/mm/transparent_hugepage/enabled
>>> [always] madvise never
>>> #./perf bench mem memcpy -l 1gb -o
>>> # Running mem/memcpy benchmark...
>>> # Copying 1gb Bytes ...
>>>
>>> 3.672879 GB/Sec (with prefault)
>>>
>>> #./perf stat ...
>>> Performance counter stats for './perf bench mem memcpy -l 1gb -o':
>>>
>>>     35455940 cache-misses       # 53.504 % of all cache refs      [49.45%]
>>>     66267785 cache-references                                     [49.78%]
>>>         2409 page-faults
>>>    450768651 dTLB-loads                                           [50.78%]
>>>        24580 dTLB-misses        # 0.01% of all dTLB cache hits    [51.01%]
>>>   1338974202 dTLB-stores                                          [50.63%]
>>>        77943 dTLB-misses                                          [50.24%]
>>>    697404997 iTLB-loads                                           [49.77%]
>>>          274 iTLB-misses        # 0.00% of all iTLB cache hits    [49.30%]
>>>
>>> 0.855041819 seconds time elapsed
>>>
>>> ---no THP---
>>> #cat /sys/kernel/mm/transparent_hugepage/enabled
>>> always madvise [never]
>>>
>>> #./perf bench mem memcpy -l 1gb -o
>>> # Running mem/memcpy benchmark...
>>> # Copying 1gb Bytes ...
>>>
>>> 6.190187 GB/Sec (with prefault)
>>>
>>> #./perf stat ...
>>> Performance counter stats for './perf bench mem memcpy -l 1gb -o':
>>>
>>>     16920763 cache-misses       # 98.377 % of all cache refs      [50.01%]
>>>     17200000 cache-references                                     [50.04%]
>>>       524652 page-faults
>>>    734365659 dTLB-loads                                           [50.04%]
>>>      4986387 dTLB-misses        # 0.68% of all dTLB cache hits    [50.04%]
>>>   1013408298 dTLB-stores                                          [50.04%]
>>>      8180817 dTLB-misses                                          [49.97%]
>>>   1526642351 iTLB-loads                                           [50.41%]
>>>           56 iTLB-misses        # 0.00% of all iTLB cache hits    [50.21%]
>>>
>>> 1.025425847 seconds time elapsed
>>>
>>> Thanks,
>>> Jianguo Wu.