From: Andrea Arcangeli <aarcange@redhat.com>
To: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Jianguo Wu <wujianguo@huawei.com>,
	linux-mm@kvack.org, qiuxishi <qiuxishi@huawei.com>
Subject: Re: Transparent Hugepage impact on memcpy
Date: Tue, 4 Jun 2013 22:20:17 +0200
Message-ID: <20130604202017.GJ3463@redhat.com>
In-Reply-To: <20130604123050.GA32707@hacker.(null)>

Hello everyone,

On Tue, Jun 04, 2013 at 08:30:51PM +0800, Wanpeng Li wrote:
> On Tue, Jun 04, 2013 at 04:57:57PM +0800, Jianguo Wu wrote:
> >Hi all,
> >
> >I tested memcpy with perf bench, and found that in prefault case, When Transparent Hugepage is on,
> >memcpy has worse performance.
> >
> >When THP on is 3.672879 GB/Sec (with prefault), while THP off is 6.190187 GB/Sec (with prefault).
> >
> 
> I get similar result as you against 3.10-rc4 in the attachment. This
> dues to the characteristic of thp takes a single page fault for each 
> 2MB virtual region touched by userland.

I had a look at what prefault does and page faults should not be
involved in the measurement of GB/sec. The "stats" also include the
page faults but the page fault is not part of the printed GB/sec, if
"-o" is used.

If the perf test is correct, it looks more like a hardware issue with
memcpy and large TLBs than a software one. memset doesn't exhibit it;
if this were something fundamental, memset should exhibit it too. It
should in fact be possible to reproduce this with hugetlbfs... if you
want to be 100% sure it's not software, you should try that.
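
A minimal sketch of what such a hugetlbfs cross-check could look like,
outside of perf. This is not the perf bench code: map_buffer() and
prefaulted_memcpy_gbps() are hypothetical names, and the sketch assumes
Linux, a length that is a multiple of the huge page size, and huge
pages already reserved through /proc/sys/vm/nr_hugepages when
MAP_HUGETLB is requested. It reports GB/sec as 1e9 bytes per second,
which may not match the unit perf prints.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS and MAP_HUGETLB */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

/* Map len bytes of anonymous memory. With use_hugetlb set, ask for
 * hugetlbfs-backed pages instead of normal (or THP) pages.
 * Returns NULL if the mapping cannot be created. */
void *map_buffer(size_t len, int use_hugetlb)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p;

    if (use_hugetlb)
        flags |= MAP_HUGETLB;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Time one memcpy of len bytes after an untimed prefault pass, so page
 * faults stay out of the measurement. Returns GB/sec (1e9 bytes/sec),
 * or -1.0 if the buffers could not be mapped or the copy went wrong. */
double prefaulted_memcpy_gbps(size_t len, int use_hugetlb)
{
    void *src, *dst;
    double gbps = -1.0;

    assert(len > 0);
    src = map_buffer(len, use_hugetlb);
    dst = map_buffer(len, use_hugetlb);
    if (src && dst) {
        struct timespec t0, t1;
        double secs;

        memset(src, 1, len);    /* fault in the source */
        memcpy(dst, src, len);  /* prefault pass: fault in the destination */

        clock_gettime(CLOCK_MONOTONIC, &t0);
        memcpy(dst, src, len);  /* the measured copy */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        secs = (double)(t1.tv_sec - t0.tv_sec) +
               (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
        /* Read the result back so the copy cannot be optimized away. */
        if (secs > 0.0 && ((volatile unsigned char *)dst)[len - 1] == 1)
            gbps = (double)len / secs / 1e9;
    }
    if (src)
        munmap(src, len);
    if (dst)
        munmap(dst, len);
    return gbps;
}
```

Calling prefaulted_memcpy_gbps() once with use_hugetlb set to 0 and
once set to 1, on the same length, gives the two numbers to compare. A
single timed copy is noisy, so repeating the call a few times would be
needed before drawing conclusions.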

Chances are there's enough pre-fetching going on in the CPU to
optimize for those 4k TLB loads in streaming copies, and the
pagetables are also cached very nicely with streaming copies. Maybe
large TLBs somewhere are less optimized for streaming copies. Only
something smarter happening in the CPU, optimized for 4k and not yet
for 2M TLBs, can explain this: if the CPU were equally intelligent with
both, it should definitely be faster with THP on even with "-o".

Overall I doubt there's anything in software to fix here.

Also note, this is not related to the additional cache usage during
page faults that I mentioned in the pdf. Page faults, and cache effects
in the page faults, are completely removed from the equation because of
"-o". The prefault pass eliminates the page faults and thrashes the
whole cache (regardless of whether the page fault uses non-temporal
stores or not) before the "measured" memcpy load starts.

I don't think this is a major concern. As a rule of thumb, you just
need to prefix the "perf" command with "time" to see it: the THP
version still completes much faster, even though the prefaulted part
of it is slightly slower with THP on.

THP pays off the most during computations that access memory randomly
rather than sequentially.
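
To put rough numbers on the random-access case, here is a small sketch
that counts how many distinct page translations are needed to cover a
buffer. translations_needed() is a hypothetical helper and the 1 GiB
buffer in the note below is only an assumed example; it says nothing
about how many entries any particular TLB actually holds.

```c
#include <assert.h>
#include <stddef.h>

/* Number of pages, and so distinct virtual-to-physical translations,
 * needed to cover len bytes with pages of page_size bytes (rounded up). */
size_t translations_needed(size_t len, size_t page_size)
{
    assert(page_size > 0);
    return len / page_size + (len % page_size != 0);
}
```

For a 1 GiB buffer this gives 262144 translations with 4k pages and 512
with 2M pages. A random access pattern can hit any of them at any time,
while a sequential copy only moves to a new translation once per page,
which hardware prefetching has a chance to hide.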

Thanks,
Andrea
