From: Jianguo Wu <wujianguo@huawei.com>
To: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: linux-mm@kvack.org, Andrea Arcangeli <aarcange@redhat.com>,
qiuxishi <qiuxishi@huawei.com>
Subject: Re: Transparent Hugepage impact on memcpy
Date: Tue, 4 Jun 2013 20:55:19 +0800
Message-ID: <51ADE3B7.1070303@huawei.com>
In-Reply-To: <51adde12.e6b2320a.610d.ffff96f3SMTPIN_ADDED_BROKEN@mx.google.com>
On 2013/6/4 20:30, Wanpeng Li wrote:
> On Tue, Jun 04, 2013 at 04:57:57PM +0800, Jianguo Wu wrote:
>> Hi all,
>>
>> I tested memcpy with perf bench and found that, in the prefault case, memcpy performs worse
>> when Transparent Hugepage (THP) is on.
>>
>> With THP on, throughput is 3.672879 GB/Sec (with prefault); with THP off, it is 6.190187 GB/Sec (with prefault).
>>
>
> I get similar results against 3.10-rc4 (see the attachment). This
> is due to the fact that THP takes a single page fault for each
> 2MB virtual region touched by userland.
>
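As a rough sanity check on the one-fault-per-2MB point, the page-fault counts in the perf stat output quoted further down are close to what simple arithmetic predicts. This is a minimal sketch, assuming the benchmark touches one 1 GiB source buffer and one 1 GiB destination buffer; the buffer layout and the names used here are assumptions for illustration, not taken from perf's source.

```python
# Back-of-the-envelope page-fault estimate for a 1 GiB memcpy.
# Assumption: a 1 GiB source and a 1 GiB destination are each touched once.
GIB = 1 << 30
TOUCHED_BYTES = 2 * GIB

def expected_faults(page_size):
    """One fault per page of the given size across all touched memory."""
    return TOUCHED_BYTES // page_size

faults_4k = expected_faults(4 * 1024)         # base pages
faults_2m = expected_faults(2 * 1024 * 1024)  # transparent huge pages

# Counts reported by perf stat in this mail (whole process, not just the copy).
observed_no_thp = 524652
observed_thp = 2409

print(f"4 KiB pages: expected {faults_4k}, observed {observed_no_thp}")
print(f"2 MiB pages: expected {faults_2m}, observed {observed_thp}")
```

The no-THP count is within a few hundred faults of the estimate. The THP count is higher than the estimate, presumably because of process startup and any regions that were not backed by huge pages.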
Hi Wanpeng,
Thanks for your reply :).
This test runs with prefault, so page-fault time is not counted in the result. Besides, I would
expect fewer page faults to improve memcpy performance, not hurt it, right?
The perf stat results show significantly fewer cache-references and cache-misses
when THP is off. Do you have any idea why?
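To put numbers on the cache difference, here is a small sketch that recomputes the miss ratios and the on/off ratios from the counters quoted below. The dictionary and function names are made up for illustration, and the counters were multiplexed (the bracketed percentages in the perf output), so these are scaled estimates rather than exact counts.

```python
# Cache counters copied from the perf stat output quoted in this mail.
thp_on = {"cache_misses": 35455940, "cache_references": 66267785}
thp_off = {"cache_misses": 16920763, "cache_references": 17200000}

def miss_ratio(counters):
    """Cache misses as a percentage of cache references."""
    return 100.0 * counters["cache_misses"] / counters["cache_references"]

# How many times more events were counted with THP on than with THP off.
refs_factor = thp_on["cache_references"] / thp_off["cache_references"]
miss_factor = thp_on["cache_misses"] / thp_off["cache_misses"]

print(f"THP on : {miss_ratio(thp_on):.3f} % of cache refs missed")
print(f"THP off: {miss_ratio(thp_off):.3f} % of cache refs missed")
print(f"references: {refs_factor:.2f}x more with THP on")
print(f"misses    : {miss_factor:.2f}x more with THP on")
```

So with THP on there are roughly 3.9 times as many cache references and about 2.1 times as many cache misses, even though the miss percentage itself is lower.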
Thanks,
Jianguo Wu.
>> I expected THP to improve performance, but the test results show that is obviously not the case.
>> Andrea mentioned that THP makes "clear_page/copy_page less cache friendly" in
>> http://events.linuxfoundation.org/slides/2011/lfcs/lfcs2011_hpc_arcangeli.pdf.
>>
>> I do not quite understand this; could you please give me some comments? Thanks!
>>
>> I tested on Linux-3.4-stable, and my machine info is:
>> Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
>>
>> available: 2 nodes (0-1)
>> node 0 cpus: 0 1 2 3 8 9 10 11
>> node 0 size: 24567 MB
>> node 0 free: 23550 MB
>> node 1 cpus: 4 5 6 7 12 13 14 15
>> node 1 size: 24576 MB
>> node 1 free: 23767 MB
>> node distances:
>> node   0   1
>>   0:  10  20
>>   1:  20  10
>>
>> Below are the test results:
>> ---with THP---
>> #cat /sys/kernel/mm/transparent_hugepage/enabled
>> [always] madvise never
>> #./perf bench mem memcpy -l 1gb -o
>> # Running mem/memcpy benchmark...
>> # Copying 1gb Bytes ...
>>
>> 3.672879 GB/Sec (with prefault)
>>
>> #./perf stat ...
>> Performance counter stats for './perf bench mem memcpy -l 1gb -o':
>>
>>   35455940 cache-misses     # 53.504 % of all cache refs     [49.45%]
>>   66267785 cache-references                                  [49.78%]
>>       2409 page-faults
>>  450768651 dTLB-loads                                        [50.78%]
>>      24580 dTLB-misses      # 0.01% of all dTLB cache hits   [51.01%]
>> 1338974202 dTLB-stores                                       [50.63%]
>>      77943 dTLB-misses                                       [50.24%]
>>  697404997 iTLB-loads                                        [49.77%]
>>        274 iTLB-misses      # 0.00% of all iTLB cache hits   [49.30%]
>>
>> 0.855041819 seconds time elapsed
>>
>> ---no THP---
>> #cat /sys/kernel/mm/transparent_hugepage/enabled
>> always madvise [never]
>>
>> #./perf bench mem memcpy -l 1gb -o
>> # Running mem/memcpy benchmark...
>> # Copying 1gb Bytes ...
>>
>> 6.190187 GB/Sec (with prefault)
>>
>> #./perf stat ...
>> Performance counter stats for './perf bench mem memcpy -l 1gb -o':
>>
>>   16920763 cache-misses     # 98.377 % of all cache refs     [50.01%]
>>   17200000 cache-references                                  [50.04%]
>>     524652 page-faults
>>  734365659 dTLB-loads                                        [50.04%]
>>    4986387 dTLB-misses      # 0.68% of all dTLB cache hits   [50.04%]
>> 1013408298 dTLB-stores                                       [50.04%]
>>    8180817 dTLB-misses                                       [49.97%]
>> 1526642351 iTLB-loads                                        [50.41%]
>>         56 iTLB-misses      # 0.00% of all iTLB cache hits   [50.21%]
>>
>> 1.025425847 seconds time elapsed
>>
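For comparison, the dTLB load miss rates and the throughput gap can be recomputed from the figures quoted above. This is only a sketch; the variable names are illustrative, and the counters are multiplexed estimates.

```python
# dTLB load counters and throughput copied from the two runs quoted above.
thp_on = {"dtlb_loads": 450768651, "dtlb_load_misses": 24580, "gb_per_sec": 3.672879}
thp_off = {"dtlb_loads": 734365659, "dtlb_load_misses": 4986387, "gb_per_sec": 6.190187}

def dtlb_miss_pct(run):
    """dTLB load misses as a percentage of dTLB loads."""
    return 100.0 * run["dtlb_load_misses"] / run["dtlb_loads"]

# Throughput without THP relative to throughput with THP.
speed_ratio = thp_off["gb_per_sec"] / thp_on["gb_per_sec"]

print(f"THP on : {dtlb_miss_pct(thp_on):.4f} % dTLB load misses")
print(f"THP off: {dtlb_miss_pct(thp_off):.4f} % dTLB load misses")
print(f"no-THP throughput is {speed_ratio:.2f}x the THP throughput")
```

The dTLB miss rate drops by about two orders of magnitude with THP on, as one would expect, yet throughput is still about 1.7 times higher with THP off.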
>> Thanks,
>> Jianguo Wu.
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org. For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>