Re: [PATCH 0/4] Fix ebizzy performance regression due to X86 TLB range flush v2

From: Mel Gorman <mgorman@suse.de>
To: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Alex Shi <alex.shi@linaro.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Fengguang Wu <fengguang.wu@intel.com>,
	H Peter Anvin <hpa@zytor.com>, Linux-X86 <x86@kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [PATCH 0/4] Fix ebizzy performance regression due to X86 TLB range flush v2
Date: Tue, 17 Dec 2013 17:54:41 +0000
Message-ID: <20131217175441.GI11295@suse.de>
In-Reply-To: <20131217144214.GA12370@gmail.com>

On Tue, Dec 17, 2013 at 03:42:14PM +0100, Ingo Molnar wrote:
> 
> * Mel Gorman <mgorman@suse.de> wrote:
> 
> > [...]
> >
> > At that point it'll be time to look at profiles and see where we are 
> > actually spending time because the possibilities of finding things 
> > to fix through bisection will be exhausted.
> 
> Yeah.
> 
> One (heavy handed but effective) trick that can be used in such a 
> situation is to just revert everything that is causing problems, and 
> continue reverting until we get back to a v3.4 baseline performance.
> 

Very tempted but the potential timeframe here is very large and the number
of patches could be considerable. Some patches cause a lot of noise. For
example, one patch enabled ACPI cpufreq driver loading which looks like
a regression during that window but it's a side-effect that gets fixed
later. It'll take time to identify all the patches that potentially cause
problems.

> Once such a 'clean' tree (or queue of patches) is achieved, that can be 
> used as a measurement base and the individual features can be 
> re-applied again, one by one, with measurement and analysis becoming a 
> lot easier.
> 

Ordinarily I would agree with you but would prefer a shorter window for
that type of strategy.

> > > Also it appears the Ebizzy numbers ought to be stable enough now 
> > > to make the range-TLB-flush measurements more precise?
> > 
> > Right now, the tlbflush microbenchmark figures look awful on the 
> > 8-core machine when the tlbflush shift patch and the schedule domain 
> > fix are both applied.
> 
> I think that further strengthens the case for the 'clean base' approach 
> I outlined above - but it's your call obviously ...
> 

I'll keep it as plan b if it cannot be fixed with a direct approach.

> Thanks again for going through all this. Tracking multi-commit 
> performance regressions across 1.5 years worth of commits is generally 
> very hard. Does your testing effort come from enterprise Linux QA 
> testing, or did you run into this problem accidentally?
> 

It does not come from enterprise Linux QA testing but it's motivated by
it. I want to catch as many "obvious" performance bugs before they do as
it saves time and stress in the long run. To assist that, I set up continual
performance regression testing and ebizzy was included in the first report
I opened.  It makes me worry about what the rest of the reports contain.

-- 
Mel Gorman
SUSE Labs
