From: Johannes Weiner <jweiner@redhat.com>
To: aarcange@redhat.com
Cc: linux-mm@kvack.org, Mel Gorman <mel@csn.ul.ie>,
    Rik van Riel <riel@redhat.com>, Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH 3 of 3] thp: mremap support and TLB optimization
Date: Mon, 8 Aug 2011 10:25:41 +0200
Message-ID: <20110808082541.GC27011@redhat.com>
In-Reply-To: <10a29e95223e52e49a61.1312649885@localhost>

On Sat, Aug 06, 2011 at 06:58:05PM +0200, aarcange@redhat.com wrote:
> From: Andrea Arcangeli <aarcange@redhat.com>
>
> This adds THP support to mremap (decreases the number of
> split_huge_page() calls).
>
> Here are also some benchmarks with a proggy like this:
>
> ===
> #define _GNU_SOURCE
> #include <sys/mman.h>
> #include <stdlib.h>
> #include <stdio.h>
> #include <string.h>
> #include <sys/time.h>
>
> #define SIZE (5UL*1024*1024*1024)
>
> int main(void)
> {
> 	static struct timeval oldstamp, newstamp;
> 	long diffsec;
> 	char *p, *p2, *p3, *p4;
>
> 	/* 2M-aligned buffers: source, reference copy, destination */
> 	if (posix_memalign((void **)&p, 2*1024*1024, SIZE))
> 		perror("memalign"), exit(1);
> 	if (posix_memalign((void **)&p2, 2*1024*1024, SIZE))
> 		perror("memalign"), exit(1);
> 	if (posix_memalign((void **)&p3, 2*1024*1024, 4096))
> 		perror("memalign"), exit(1);
>
> 	/* fault everything in before timing */
> 	memset(p, 0xff, SIZE);
> 	memset(p2, 0xff, SIZE);
> 	memset(p3, 0x77, 4096);
>
> 	/* time only the mremap call */
> 	gettimeofday(&oldstamp, NULL);
> 	p4 = mremap(p, SIZE, SIZE, MREMAP_FIXED|MREMAP_MAYMOVE, p3);
> 	gettimeofday(&newstamp, NULL);
> 	diffsec = newstamp.tv_sec - oldstamp.tv_sec;
> 	diffsec = newstamp.tv_usec - oldstamp.tv_usec + 1000000 * diffsec;
> 	printf("usec %ld\n", diffsec);
>
> 	/* mremap reports failure through its return value */
> 	if (p4 == MAP_FAILED || p4 != p3)
> 		perror("mremap"), exit(1);
>
> 	/* the moved data must still match the reference copy */
> 	if (memcmp(p4, p2, SIZE))
> 		printf("mremap bug\n"), exit(1);
> 	printf("ok\n");
>
> 	return 0;
> }
> ===
>
> THP on
>
>  Performance counter stats for './largepage13' (3 runs):
>
>      69195836  dTLB-loads         ( +-  3.546% )  (scaled from 50.30%)
>         60708  dTLB-load-misses   ( +- 11.776% )  (scaled from 52.62%)
>     676266476  dTLB-stores        ( +-  5.654% )  (scaled from 69.54%)
>         29856  dTLB-store-misses  ( +-  4.081% )  (scaled from 89.22%)
>    1055848782  iTLB-loads         ( +-  4.526% )  (scaled from 80.18%)
>          8689  iTLB-load-misses   ( +-  2.987% )  (scaled from 58.20%)
>
>    7.314454164  seconds time elapsed  ( +-  0.023% )
>
> THP off
>
>  Performance counter stats for './largepage13' (3 runs):
>
>    1967379311  dTLB-loads         ( +-  0.506% )  (scaled from 60.59%)
>       9238687  dTLB-load-misses   ( +- 22.547% )  (scaled from 61.87%)
>    2014239444  dTLB-stores        ( +-  0.692% )  (scaled from 60.40%)
>       3312335  dTLB-store-misses  ( +-  7.304% )  (scaled from 67.60%)
>    6764372065  iTLB-loads         ( +-  0.925% )  (scaled from 79.00%)
>          8202  iTLB-load-misses   ( +-  0.475% )  (scaled from 70.55%)
>
>    9.693655243  seconds time elapsed  ( +-  0.069% )
>
> grep thp /proc/vmstat
> thp_fault_alloc 35849
> thp_fault_fallback 0
> thp_collapse_alloc 3
> thp_collapse_alloc_failed 0
> thp_split 0
>
> thp_split 0 confirms no thp split despite plenty of hugepages allocated.
>
> The measurement of only the mremap time (so excluding the 3 memsets
> and the final long memcmp, which accesses 10GB of memory):
>
> THP on
>
> usec 14824
> usec 14862
> usec 14859
>
> THP off
>
> usec 256416
> usec 255981
> usec 255847
>
> With an older kernel without the mremap optimizations (the patch
> below optimizes the non-THP version too):
>
> THP on
>
> usec 392107
> usec 390237
> usec 404124
>
> THP off
>
> usec 444294
> usec 445237
> usec 445820
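
To put the four sets of numbers side by side, here is a rough sketch that
averages the three samples of each run and takes the ratios. It is only an
illustration: mean_usec() and speedup() are made-up helper names, and the
samples are copied verbatim from the "usec" lines quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n timing samples, in microseconds. */
static double mean_usec(const long *samples, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += (double)samples[i];
	return sum / (double)n;
}

/* Ratio of the mean "before" time to the mean "after" time. */
static double speedup(const long *before, const long *after, size_t n)
{
	return mean_usec(before, n) / mean_usec(after, n);
}

/* mremap-only timings quoted above, in usec. */
static const long new_thp_on[]  = { 14824, 14862, 14859 };
static const long new_thp_off[] = { 256416, 255981, 255847 };
static const long old_thp_on[]  = { 392107, 390237, 404124 };
static const long old_thp_off[] = { 444294, 445237, 445820 };
```

With these samples, speedup(old_thp_on, new_thp_on, 3) comes out at roughly
26x and speedup(old_thp_off, new_thp_off, 3) at roughly 1.7x, while on the
patched kernel THP on is about 17x faster than THP off.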
>
> I guess with a threaded program that sends more IPIs on a large SMP
> system, the difference would be even larger.
>
> All debug options are off except DEBUG_VM to avoid skewing the
> results.
>
> The only limitation of a native 2M mremap like the one above is that
> both the source and destination addresses must be 2M aligned, or the
> hugepmd can't be moved without a split; but that is a hardware
> limitation.
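
The alignment condition can be checked from userspace before calling
mremap. A minimal sketch, assuming 2M hugepages as in the text above;
HPAGE_2M and both helper names are hypothetical and are not taken from the
patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 2M hugepage size, as used throughout the quoted text. */
#define HPAGE_2M (2UL * 1024 * 1024)

/* True if addr sits on a 2M boundary. */
static bool is_2m_aligned(const void *addr)
{
	return ((uintptr_t)addr & (HPAGE_2M - 1)) == 0;
}

/*
 * True if both ends of the move satisfy the alignment condition
 * described above. This tests only the addresses; it does not tell
 * you whether the range is actually backed by hugepages.
 */
static bool both_2m_aligned(const void *old_addr, const void *new_addr)
{
	return is_2m_aligned(old_addr) && is_2m_aligned(new_addr);
}
```

In the test program both buffers come from posix_memalign() with a 2M
alignment, so a check like this would pass for p and p3.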
>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>

Acked-by: Johannes Weiner <jweiner@redhat.com>