From: Dave Hansen <dave@sr71.net>
To: Mel Gorman <mgorman@suse.de>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
akpm@linux-foundation.org, kirill.shutemov@linux.intel.com,
ak@linux.intel.com, riel@redhat.com, alex.shi@linaro.org,
dave.hansen@linux.intel.com
Subject: Re: [PATCH 2/6] x86: mm: rip out complicated, out-of-date, buggy TLB flushing
Date: Thu, 24 Apr 2014 09:58:11 -0700 [thread overview]
Message-ID: <535942A3.3020800@sr71.net> (raw)
In-Reply-To: <20140424084552.GQ23991@suse.de>
On 04/24/2014 01:45 AM, Mel Gorman wrote:
>> +/*
>> + * See Documentation/x86/tlb.txt for details. We choose 33
>> + * because it is large enough to cover the vast majority (at
>> + * least 95%) of allocations, and is small enough that we are
>> + * confident it will not cause too much overhead. Each single
>> + * flush is about 100 cycles, so this caps the maximum overhead
>> + * at _about_ 3,000 cycles.
>> + */
>> +/* in units of pages */
>> +unsigned long tlb_single_page_flush_ceiling = 1;
>> +
>
> This comment is premature. The documentation file does not exist yet and
> 33 means nothing yet. Out of curiousity though, how confident are you
> that a TLB flush is generally 100 cycles across different generations
> and manufacturers of CPUs? I'm not suggesting you change it or auto-tune
> it, am just curious.
Yeah, the comment belongs in the later patch where I set it to 33.
I looked at this on the last few generations of Intel CPUs. "100
cycles" was a very general statement, and not precise at all. My laptop
averages out to 113 cycles overall, but the flushes of 25 pages averaged
96 cycles/page while the flushes of 2 averaged 219/page.
Those cycles include some costs of from the instrumentation as well.
I did not test on other CPU manufacturers, but this should be pretty
easy to reproduce. I'm happy to help folks re-run it on other hardware.
I also believe with the modalias stuff we've got in sysfs for the CPU
objects we can do this in the future with udev rules instead of
hard-coding it in the kernel.
>> - /* In modern CPU, last level tlb used for both data/ins */
>> - if (vmflag & VM_EXEC)
>> - tlb_entries = tlb_lli_4k[ENTRIES];
>> - else
>> - tlb_entries = tlb_lld_4k[ENTRIES];
>> -
>> - /* Assume all of TLB entries was occupied by this task */
>> - act_entries = tlb_entries >> tlb_flushall_shift;
>> - act_entries = mm->total_vm > act_entries ? act_entries : mm->total_vm;
>> - nr_base_pages = (end - start) >> PAGE_SHIFT;
>> -
>> - /* tlb_flushall_shift is on balance point, details in commit log */
>> - if (nr_base_pages > act_entries) {
>> + if ((end - start) > tlb_single_page_flush_ceiling * PAGE_SIZE) {
>> count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
>> local_flush_tlb();
>> } else {
>
> We lose the different tuning based on whether the flush is for instructions
> or data. However, I cannot think of a good reason for keeping it as I
> expect that flushes of instructions is relatively rare. The benefit, if
> any, will be marginal. Still, if you do another revision it would be
> nice to call this out in the changelog.
Will do.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Dave Hansen <dave@sr71.net>
To: Mel Gorman <mgorman@suse.de>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
akpm@linux-foundation.org, kirill.shutemov@linux.intel.com,
ak@linux.intel.com, riel@redhat.com, alex.shi@linaro.org,
dave.hansen@linux.intel.com
Subject: Re: [PATCH 2/6] x86: mm: rip out complicated, out-of-date, buggy TLB flushing
Date: Thu, 24 Apr 2014 09:58:11 -0700 [thread overview]
Message-ID: <535942A3.3020800@sr71.net> (raw)
In-Reply-To: <20140424084552.GQ23991@suse.de>
On 04/24/2014 01:45 AM, Mel Gorman wrote:
>> +/*
>> + * See Documentation/x86/tlb.txt for details. We choose 33
>> + * because it is large enough to cover the vast majority (at
>> + * least 95%) of allocations, and is small enough that we are
>> + * confident it will not cause too much overhead. Each single
>> + * flush is about 100 cycles, so this caps the maximum overhead
>> + * at _about_ 3,000 cycles.
>> + */
>> +/* in units of pages */
>> +unsigned long tlb_single_page_flush_ceiling = 1;
>> +
>
> This comment is premature. The documentation file does not exist yet and
> 33 means nothing yet. Out of curiousity though, how confident are you
> that a TLB flush is generally 100 cycles across different generations
> and manufacturers of CPUs? I'm not suggesting you change it or auto-tune
> it, am just curious.
Yeah, the comment belongs in the later patch where I set it to 33.
I looked at this on the last few generations of Intel CPUs. "100
cycles" was a very general statement, and not precise at all. My laptop
averages out to 113 cycles overall, but the flushes of 25 pages averaged
96 cycles/page while the flushes of 2 averaged 219/page.
Those cycles include some costs of from the instrumentation as well.
I did not test on other CPU manufacturers, but this should be pretty
easy to reproduce. I'm happy to help folks re-run it on other hardware.
I also believe with the modalias stuff we've got in sysfs for the CPU
objects we can do this in the future with udev rules instead of
hard-coding it in the kernel.
>> - /* In modern CPU, last level tlb used for both data/ins */
>> - if (vmflag & VM_EXEC)
>> - tlb_entries = tlb_lli_4k[ENTRIES];
>> - else
>> - tlb_entries = tlb_lld_4k[ENTRIES];
>> -
>> - /* Assume all of TLB entries was occupied by this task */
>> - act_entries = tlb_entries >> tlb_flushall_shift;
>> - act_entries = mm->total_vm > act_entries ? act_entries : mm->total_vm;
>> - nr_base_pages = (end - start) >> PAGE_SHIFT;
>> -
>> - /* tlb_flushall_shift is on balance point, details in commit log */
>> - if (nr_base_pages > act_entries) {
>> + if ((end - start) > tlb_single_page_flush_ceiling * PAGE_SIZE) {
>> count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
>> local_flush_tlb();
>> } else {
>
> We lose the different tuning based on whether the flush is for instructions
> or data. However, I cannot think of a good reason for keeping it as I
> expect that flushes of instructions is relatively rare. The benefit, if
> any, will be marginal. Still, if you do another revision it would be
> nice to call this out in the changelog.
Will do.
next prev parent reply other threads:[~2014-04-24 16:58 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-21 18:24 [PATCH 0/6] x86: rework tlb range flushing code Dave Hansen
2014-04-21 18:24 ` Dave Hansen
2014-04-21 18:24 ` [PATCH 1/6] x86: mm: clean up tlb " Dave Hansen
2014-04-21 18:24 ` Dave Hansen
2014-04-22 16:53 ` Rik van Riel
2014-04-22 16:53 ` Rik van Riel
2014-04-24 8:33 ` Mel Gorman
2014-04-24 8:33 ` Mel Gorman
2014-04-21 18:24 ` [PATCH 2/6] x86: mm: rip out complicated, out-of-date, buggy TLB flushing Dave Hansen
2014-04-21 18:24 ` Dave Hansen
2014-04-22 16:54 ` Rik van Riel
2014-04-22 16:54 ` Rik van Riel
2014-04-24 8:45 ` Mel Gorman
2014-04-24 8:45 ` Mel Gorman
2014-04-24 16:58 ` Dave Hansen [this message]
2014-04-24 16:58 ` Dave Hansen
2014-04-24 18:00 ` Mel Gorman
2014-04-24 18:00 ` Mel Gorman
2014-04-25 21:39 ` Dave Hansen
2014-04-25 21:39 ` Dave Hansen
2014-04-21 18:24 ` [PATCH 3/6] x86: mm: fix missed global TLB flush stat Dave Hansen
2014-04-21 18:24 ` Dave Hansen
2014-04-22 17:15 ` Rik van Riel
2014-04-22 17:15 ` Rik van Riel
2014-04-24 8:49 ` Mel Gorman
2014-04-24 8:49 ` Mel Gorman
2014-04-21 18:24 ` [PATCH 4/6] x86: mm: trace tlb flushes Dave Hansen
2014-04-21 18:24 ` Dave Hansen
2014-04-22 21:19 ` Rik van Riel
2014-04-22 21:19 ` Rik van Riel
2014-04-24 10:14 ` Mel Gorman
2014-04-24 10:14 ` Mel Gorman
2014-04-24 20:42 ` Dave Hansen
2014-04-24 20:42 ` Dave Hansen
2014-04-21 18:24 ` [PATCH 5/6] x86: mm: new tunable for single vs full TLB flush Dave Hansen
2014-04-21 18:24 ` Dave Hansen
2014-04-22 21:31 ` Rik van Riel
2014-04-22 21:31 ` Rik van Riel
2014-04-24 10:37 ` Mel Gorman
2014-04-24 10:37 ` Mel Gorman
2014-04-24 17:25 ` Dave Hansen
2014-04-24 17:25 ` Dave Hansen
2014-04-24 17:53 ` Rik van Riel
2014-04-24 17:53 ` Rik van Riel
2014-04-24 22:03 ` Dave Hansen
2014-04-24 22:03 ` Dave Hansen
2014-07-07 17:43 ` Dave Hansen
2014-07-07 17:43 ` Dave Hansen
2014-07-08 0:43 ` Alex Shi
2014-07-08 0:43 ` Alex Shi
2014-04-21 18:24 ` [PATCH 6/6] x86: mm: set TLB flush tunable to sane value (33) Dave Hansen
2014-04-21 18:24 ` Dave Hansen
2014-04-22 21:33 ` Rik van Riel
2014-04-22 21:33 ` Rik van Riel
2014-04-24 10:46 ` Mel Gorman
2014-04-24 10:46 ` Mel Gorman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=535942A3.3020800@sr71.net \
--to=dave@sr71.net \
--cc=ak@linux.intel.com \
--cc=akpm@linux-foundation.org \
--cc=alex.shi@linaro.org \
--cc=dave.hansen@linux.intel.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=riel@redhat.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.