Re: [PATCH 0/5] Improve sequential read throughput v4r8

From: Mel Gorman <mgorman@suse.de>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	Linux-FSDevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 0/5] Improve sequential read throughput v4r8
Date: Wed, 2 Jul 2014 16:53:47 +0100
Message-ID: <20140702155347.GA10819@suse.de>
In-Reply-To: <20140702154439.GE1369@cmpxchg.org>

On Wed, Jul 02, 2014 at 11:44:39AM -0400, Johannes Weiner wrote:
> On Tue, Jul 01, 2014 at 05:25:38PM -0400, Johannes Weiner wrote:
> > These explanations make no sense.  If pages of a streaming writer have
> > enough time in memory to not thrash with a single zone, the fair
> > policy should make even MORE time in memory available to them and not
> > thrash them.  The fair policy is a necessity for multi-zone aging to
> > make any sense and having predictable reclaim and activation behavior.
> > That's why it's obviously not meant to benefit streaming workloads,
> > but it shouldn't harm them, either.  Certainly not 20%.  If streaming
> > pages thrash, something is up, the solution isn't to just disable the
> > second zone or otherwise work around the issue.
> 
> Hey, funny story.
> 
> I tried reproducing this with an isolated tester just to be sure,
> stealing tiobench's do_read_test(), but I wouldn't get any results.
> 
> I compared the original fair policy commit with its parent, I compared
> a current vanilla kernel to a crude #ifdef'd policy disabling, and I
> compared vanilla to your patch series - every kernel yields 132MB/s.
> 
> Then I realized, 132MB/s is the disk limit anyway - how the hell did I
> get 150MB/s peak speeds for sequential cold cache IO with seqreadv4?
> 
> So I looked at the tiobench source code and it turns out, it's not
> cold cache at all: it first does the write test, then the read test on
> the same file!
> 
> The file is bigger than memory, so you would expect the last X percent
> of the file to be cached after the seq write and the subsequent seq
> read to push the tail out before getting to it - standard working set
> bigger than memory behavior.
> 
> But without fairness, a chunk from the beginning of the file gets
> stuck in the DMA32 zone and never pushed out while writing, so when
> the reader comes along, it gets random parts from cache!
> 
> All patches that showed "major improvements" ruined fairness and led
> to non-linear caching of the test file during the write, and the read
> speedups came from the file being partially served from cache.
> 

Well, shit. Yes, artificially preserving part of the file in cache would
give an apparent boost but would be completely the wrong thing to do.
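
To make the effect Johannes describes concrete, here is a minimal toy
simulation. It is a sketch under stated assumptions, not a model of the
real allocator or of tiobench: page counts are made up, memory is a
single LRU in the "fair" case, and in the "unfair" case a fixed-size
region (standing in for the DMA32 zone) is filled once during the write
and never reclaimed afterwards. The function name simulate() and all of
its parameters are hypothetical.

```python
from collections import OrderedDict


def simulate(file_pages, normal_pages, stuck_pages, fair):
    """Write a file sequentially, then read it back; return read cache hits.

    Toy assumptions (not kernel behaviour):
      - fair=True:  all memory (normal_pages + stuck_pages) is one LRU list.
      - fair=False: only normal_pages are LRU-reclaimed; stuck_pages are
        filled once when the LRU part is first full and never evicted.
    """
    lru = OrderedDict()  # pages subject to LRU reclaim
    stuck = set()        # pages that are never reclaimed (unfair case only)
    lru_capacity = normal_pages + (stuck_pages if fair else 0)

    def touch(page):
        """Access one page; return True on a cache hit."""
        if page in stuck:
            return True
        if page in lru:
            lru.move_to_end(page)
            return True
        # Cache miss: find room for the new page.
        if len(lru) < lru_capacity:
            lru[page] = True
        elif not fair and len(stuck) < stuck_pages:
            stuck.add(page)          # lands in the never-reclaimed region
        else:
            lru.popitem(last=False)  # evict the least recently used page
            lru[page] = True
        return False

    for page in range(file_pages):   # sequential write pass
        touch(page)
    # Sequential read pass over the same file; count the hits.
    return sum(touch(page) for page in range(file_pages))


if __name__ == "__main__":
    for fair in (True, False):
        hits = simulate(file_pages=1000, normal_pages=450,
                        stuck_pages=150, fair=fair)
        print(f"fair={fair}: {hits} of 1000 reads served from cache")
```

With these made-up sizes the fair case gets 0 hits, because the reader
pushes the cached tail of the file out before reaching it. The unfair
case gets 150 hits, exactly the pages parked in the never-reclaimed
region. That matches the shape of the argument above: the "speedup"
is reads served from cache, not faster cold-cache IO.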

Andrew, please drop the series from mmotm. I'll pick it apart and
resubmit whatever still makes sense, if necessary.

Thanks

-- 
Mel Gorman
SUSE Labs

Thread overview: 17+ messages
2014-06-30 16:47 [PATCH 0/5] Improve sequential read throughput v4r8 Mel Gorman
2014-06-30 16:48 ` [PATCH 1/4] mm: pagemap: Avoid unnecessary overhead when tracepoints are deactivated Mel Gorman
2014-06-30 16:48 ` [PATCH 2/4] mm: Rearrange zone fields into read-only, page alloc, statistics and page reclaim lines Mel Gorman
2014-06-30 16:48 ` [PATCH 3/4] mm: vmscan: Do not reclaim from lower zones if they are balanced Mel Gorman
2014-06-30 16:48 ` [PATCH 4/4] mm: page_alloc: Reduce cost of the fair zone allocation policy Mel Gorman
2014-06-30 21:14   ` Andrew Morton
2014-06-30 21:51     ` Mel Gorman
2014-06-30 22:09       ` Andrew Morton
2014-07-01  8:02         ` Mel Gorman
2014-07-01 17:16 ` [PATCH 0/5] Improve sequential read throughput v4r8 Johannes Weiner
2014-07-01 18:39   ` Mel Gorman
2014-07-01 20:58     ` Mel Gorman
2014-07-01 21:25     ` Johannes Weiner
2014-07-02 15:44       ` Johannes Weiner
2014-07-02 15:53         ` Mel Gorman [this message]
2014-07-01 22:38     ` Dave Chinner
2014-07-01 23:09       ` Mel Gorman
