From: Mel Gorman <mgorman@suse.de>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>, Alex Shi <alex.shi@linaro.org>,
Ingo Molnar <mingo@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Andrew Morton <akpm@linux-foundation.org>,
Fengguang Wu <fengguang.wu@intel.com>, Linux-X86 <x86@kernel.org>,
Linux-MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/4] Fix ebizzy performance regression due to X86 TLB range flush v2
Date: Tue, 17 Dec 2013 09:55:23 +0000 [thread overview]
Message-ID: <20131217095523.GX11295@suse.de> (raw)
In-Reply-To: <CA+55aFzgH0hG3-zOOzADvBOMXJCGo4=JXafiRiXqL8NRec5J4A@mail.gmail.com>
On Mon, Dec 16, 2013 at 09:17:35AM -0800, Linus Torvalds wrote:
> On Mon, Dec 16, 2013 at 2:39 AM, Mel Gorman <mgorman@suse.de> wrote:
> >
> > First was Alex's microbenchmark from https://lkml.org/lkml/2012/5/17/59
> > and ran it for a range of thread numbers, 320 iterations per thread with
> > random number of entires to flush. Results are from two machines
>
> There's something wrong with that benchmark, it sometimes gets stuck,
It's not a thread-safe benchmark. The parent unmapping thread can finish
before the children start and it infinite loops.
> and the profile numbers are just random (and mostly in user space).
>
Yep, it's why when I used it I ran a large number of iterations with
semi-randomised number of entries trying to knock some sense out of it.
I was hoping that the Intel folk might come back with more details on
what their testing methodology was.
> I think you mentioned fixing a bug in it, mind pointing at the fixed benchmark?
>
Ugh, I'm embarassed by this. I did not properly fix the benchmark, just
bodged around the part that can lockup. Patch is below. Actual testing was
run using mmtests with the configs/config-global-dhp__tlbflush-performance
configuration file using something like this
# build boot kernel 1
./run-mmtests.sh --run-monitor --config configs/config-global-dhp__tlbflush-performance test-kernel-1
# build boot kernel 2
./run-mmtests.sh --run-monitor --config configs/config-global-dhp__tlbflush-performance test-kernel-2
cd work/log
../../compare-kernels.sh
> Looking at the kernel footprint, it seems to depend on what parameters
> you ran that benchmark with. Under certain loads, it seems to spend
> most of the time in clearing pages and in the page allocation ("-t 8
> -n 320"). And in other loads, it hits smp_call_function_many() and the
> TLB flushers ("-t 8 -n 8"). So exactly what parameters did you use?
>
A range of parameters. The test effectively does this
TLBFLUSH_MAX_ENTRIES=256
for_each_thread_count
for iteration in `seq 1 320`
# Select a range of entries to randomly select from. This is to ensure
# an evenish spread of entries to be tested
NR_SECTION=$((ITERATION%8))
RANGE=$((TLBFLUSH_MAX_ENTRIES/8))
THIS_MIN_ENTRIES=$((RANGE*NR_SECTION+1))
THIS_MAX_ENTRIES=$((THIS_MIN_ENTRIES+RANGE))
NR_ENTRIES=$((THIS_MIN_ENTRIES+(RANDOM%RANGE)))
if [ $NR_ENTRIES -gt $THIS_MAX_ENTRIES ]; then
NR_ENTRIES=$THIS_MAX_ENTRIES
fi
RESULT=`tlbflush -n $NR_ENTRIES -t $NR_THREADS 2>&1`
done
done
It splits the values for nr_entries (-n switch) into 8 segments and randomly
selects values within them. This results in noise but ensures the test hits
the best, average and worst cases for TLB range flushing. Writing this,
I realise I should have made MAX_ENTRIES 512 to hit the original shift
values. The original mail indicated that this test was run once for a very
limited number of threads and entries and I really hope this is not what
actually happened to tune that shift value.
> Because we've had things that change those two things (and they are
> totally independent).
>
Indeed and tuning on specifics would be a bad idea -- hence why my
testing took a randomised selection of ranges to test with and a large
number of iterations.
> And does anything stand out in the profiles of ebizzy? For example, in
> between 3.4.x and 3.11, we've converted the anon_vma locking from a
> mutex to a rwsem, and we know that caused several issues, possibly
> causing unfairness. There are other potential sources of unfairness.
> It would be good to perhaps bisect things at least *somewhat*, because
> *so* much has changed in 3.4 to 3.11 that it's impossible to guess.
>
I'll check. Right now, the machines are still occupied running bisections
which is still finding bugs. When that has found the obvious stuff, I'll use
profiles to identify what's left. FWIW, I would be surprised if ebizzy was
affected by the anon_vma locking. I do not think the threads are operating
within the same VMAs in a manner that would contend on those locks. If there
is a lock being contended, it's going to be on mmap_sem for creating mappings
just slightly larger than MMAP_THRESHOLD. Guessing though, not proven.
This is bodge that stops Alex's benchmark locking up. It's the wrong way to
fix a problem like this. I was not even convinced this benchmark was useful
to begin with and was unmotivated to spending time on fixing it up properly.
--- tlbflush.c.orig 2013-12-15 11:05:08.813821030 +0000
+++ tlbflush.c 2013-12-15 11:04:46.504926426 +0000
@@ -67,13 +67,17 @@
char x;
int i, k;
int randn[PAGE_SIZE];
+ int count = 0;
for (i=0;i<PAGE_SIZE; i++)
randn[i] = rand();
actimes = malloc(sizeof(long));
- while (*threadstart == 0 )
+ while (*threadstart == 0) {
+ if (++count > 1000000)
+ break;
usleep(1);
+ }
if (d->rw == 0)
@@ -180,6 +181,7 @@
threadstart = malloc(sizeof(int));
*threadstart = 0;
data.readp = &p; data.startaddr = startaddr; data.rw = rw; data.loop = l;
+ sleep(1);
for (i=0; i< t; i++)
if(pthread_create(&pid[i], NULL, accessmm, &data))
perror("pthread create");
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2013-12-17 9:55 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-13 20:01 [PATCH 0/4] Fix ebizzy performance regression due to X86 TLB range flush v2 Mel Gorman
2013-12-13 20:01 ` [PATCH 1/4] x86: mm: Clean up inconsistencies when flushing TLB ranges Mel Gorman
2013-12-13 20:01 ` [PATCH 2/4] x86: mm: Account for TLB flushes only when debugging Mel Gorman
2013-12-13 20:01 ` [PATCH 3/4] x86: mm: Change tlb_flushall_shift for IvyBridge Mel Gorman
2013-12-13 20:01 ` [PATCH 4/4] x86: mm: Eliminate redundant page table walk during TLB range flushing Mel Gorman
2013-12-13 21:16 ` [PATCH 0/4] Fix ebizzy performance regression due to X86 TLB range flush v2 Linus Torvalds
2013-12-13 22:38 ` H. Peter Anvin
2013-12-16 10:39 ` Mel Gorman
2013-12-16 17:17 ` Linus Torvalds
2013-12-17 9:55 ` Mel Gorman [this message]
2013-12-15 15:55 ` Mel Gorman
2013-12-15 16:17 ` Mel Gorman
2013-12-15 18:34 ` Linus Torvalds
2013-12-16 11:16 ` Mel Gorman
2013-12-16 10:24 ` Ingo Molnar
2013-12-16 12:59 ` Mel Gorman
2013-12-16 13:44 ` Ingo Molnar
2013-12-17 9:21 ` Mel Gorman
2013-12-17 9:26 ` Peter Zijlstra
2013-12-17 11:00 ` Ingo Molnar
2013-12-17 14:32 ` Mel Gorman
2013-12-17 14:42 ` Ingo Molnar
2013-12-17 17:54 ` Mel Gorman
2013-12-18 10:24 ` Ingo Molnar
2013-12-19 14:24 ` Mel Gorman
2013-12-19 16:49 ` Ingo Molnar
2013-12-20 11:13 ` Mel Gorman
2013-12-20 11:18 ` Ingo Molnar
2013-12-20 12:00 ` Mel Gorman
2013-12-20 12:20 ` Ingo Molnar
2013-12-20 13:55 ` Mel Gorman
2013-12-18 7:28 ` Fengguang Wu
2013-12-19 14:34 ` Mel Gorman
2013-12-20 15:51 ` Fengguang Wu
2013-12-20 16:44 ` Mel Gorman
2013-12-21 15:49 ` Fengguang Wu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131217095523.GX11295@suse.de \
--to=mgorman@suse.de \
--cc=akpm@linux-foundation.org \
--cc=alex.shi@linaro.org \
--cc=fengguang.wu@intel.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mingo@kernel.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).