From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Alex Thorlton <athorlton@sgi.com>
Cc: linux-mm@kvack.org, Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Rik van Riel <riel@redhat.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Oleg Nesterov <oleg@redhat.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Kees Cook <keescook@chromium.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] mm: thp: Add per-mm_struct flag to control THP
Date: Fri, 10 Jan 2014 22:23:10 +0200
Message-ID: <20140110202310.GB1421@node.dhcp.inet.fi>
In-Reply-To: <1389383718-46031-1-git-send-email-athorlton@sgi.com>

On Fri, Jan 10, 2014 at 01:55:18PM -0600, Alex Thorlton wrote:
> This patch adds an mm flag (MMF_THP_DISABLE) to disable transparent
> hugepages using prctl.  It is based on my original patch to add a
> per-task_struct flag to disable THP:
> 
> v1 - https://lkml.org/lkml/2013/8/2/671
> v2 - https://lkml.org/lkml/2013/8/2/703
> 
> After looking at alternate methods of modifying how THPs are handed out,
> it sounds like people might be more in favor of this type of approach,
> so I'm re-introducing the patch.
> 
> It seemed that everyone was in favor of moving this control over to the
> mm_struct, if it is to be implemented.  That's the only major change
> here, aside from the added ability to both set and clear the flag from
> prctl.
> 
> The main motivation behind this patch is to provide a way to disable THP
> for jobs where the code cannot be modified and using a malloc hook with
> madvise is not an option (i.e. statically allocated data).  This patch
> allows us to do just that, without affecting other jobs running on the
> system.
> 
> Here are some results showing the improvement that my test case gets
> when the MMF_THP_DISABLE flag is clear vs. set:
> 
> MMF_THP_DISABLE clear:
> 
> # perf stat -a -r 3 ./prctl_wrapper_mm 0 ./thp_pthread -C 0 -m 0 -c 512 -b 256g
> 
>  Performance counter stats for './prctl_wrapper_mm 0 ./thp_pthread -C 0 -m 0 -c 512 -b 256g' (3 runs):
> 
>   267694862.049279 task-clock                #  641.100 CPUs utilized            ( +-  0.23% ) [100.00%]
>            908,846 context-switches          #    0.000 M/sec                    ( +-  0.23% ) [100.00%]
>                874 CPU-migrations            #    0.000 M/sec                    ( +-  4.01% ) [100.00%]
>            131,966 page-faults               #    0.000 M/sec                    ( +-  2.75% )
> 351,127,909,744,906 cycles                    #    1.312 GHz                      ( +-  0.27% ) [100.00%]
> 523,537,415,562,692 stalled-cycles-frontend   #  149.10% frontend cycles idle     ( +-  0.26% ) [100.00%]
> 392,400,753,609,156 stalled-cycles-backend    #  111.75% backend  cycles idle     ( +-  0.29% ) [100.00%]
> 147,467,956,557,895 instructions              #    0.42  insns per cycle
>                                              #    3.55  stalled cycles per insn  ( +-  0.09% ) [100.00%]
> 26,922,737,309,021 branches                  #  100.572 M/sec                    ( +-  0.24% ) [100.00%]
>      1,308,714,545 branch-misses             #    0.00% of all branches          ( +-  0.18% )
> 
>      417.555688399 seconds time elapsed                                          ( +-  0.23% )
> 
> 
> MMF_THP_DISABLE set:
> 
> # perf stat -a -r 3 ./prctl_wrapper_mm 1 ./thp_pthread -C 0 -m 0 -c 512 -b 256g
> 
>  Performance counter stats for './prctl_wrapper_mm 1 ./thp_pthread -C 0 -m 0 -c 512 -b 256g' (3 runs):
> 
>   141674994.160138 task-clock                #  642.107 CPUs utilized            ( +-  0.23% ) [100.00%]
>          1,190,415 context-switches          #    0.000 M/sec                    ( +- 42.87% ) [100.00%]
>                688 CPU-migrations            #    0.000 M/sec                    ( +-  2.47% ) [100.00%]
>         62,394,646 page-faults               #    0.000 M/sec                    ( +-  0.00% )
> 156,748,225,096,919 cycles                    #    1.106 GHz                      ( +-  0.20% ) [100.00%]
> 211,440,354,290,433 stalled-cycles-frontend   #  134.89% frontend cycles idle     ( +-  0.40% ) [100.00%]
> 114,304,536,881,102 stalled-cycles-backend    #   72.92% backend  cycles idle     ( +-  0.88% ) [100.00%]
> 179,939,084,230,732 instructions              #    1.15  insns per cycle
>                                              #    1.18  stalled cycles per insn  ( +-  0.26% ) [100.00%]
> 26,659,099,949,509 branches                  #  188.171 M/sec                    ( +-  0.72% ) [100.00%]
>        762,772,361 branch-misses             #    0.00% of all branches          ( +-  0.97% )
> 
>      220.640905073 seconds time elapsed                                          ( +-  0.23% )
> 
> As you can see, this particular test gets about a 2x performance boost
> when THP is turned off. 

Do you know what causes the difference? I would prefer to fix THP rather
than add a new knob to disable it.
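
As a rough cross-check of the figures quoted above, the sketch below
recomputes the headline ratios from the two perf stat runs. It is only
arithmetic on the posted counters and does not explain the cause; the
dictionary keys and the helper names (ipc, summarize) are made up for
illustration and are not part of the patch or of perf.

```python
# Counter values copied verbatim from the two quoted perf stat runs.
thp_enabled = {   # MMF_THP_DISABLE clear
    "elapsed_s": 417.555688399,
    "page_faults": 131_966,
    "cycles": 351_127_909_744_906,
    "instructions": 147_467_956_557_895,
}
thp_disabled = {  # MMF_THP_DISABLE set
    "elapsed_s": 220.640905073,
    "page_faults": 62_394_646,
    "cycles": 156_748_225_096_919,
    "instructions": 179_939_084_230_732,
}


def ipc(run):
    """Instructions retired per cycle for one run."""
    return run["instructions"] / run["cycles"]


def summarize(enabled, disabled):
    """Return the ratios that compare the THP-enabled and THP-disabled runs."""
    return {
        # Wall-clock speedup from disabling THP.
        "speedup": enabled["elapsed_s"] / disabled["elapsed_s"],
        # How many more page faults the THP-disabled run takes.
        "page_fault_ratio": disabled["page_faults"] / enabled["page_faults"],
        "ipc_enabled": ipc(enabled),
        "ipc_disabled": ipc(disabled),
    }


if __name__ == "__main__":
    for name, value in summarize(thp_enabled, thp_disabled).items():
        print(f"{name}: {value:.2f}")
```

The elapsed-time ratio comes out at roughly 1.89x, in line with the
"about a 2x" claim. The THP-disabled run takes several hundred times
more page faults yet retires more instructions per cycle (about 1.15
versus 0.42), so the extra time with THP enabled is not spent in fault
handling counts alone.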

-- 
 Kirill A. Shutemov


