From: Jens Axboe <jens.axboe@oracle.com>
To: Chris Mason <chris.mason@oracle.com>
Cc: Shaohua Li <shaohua.li@intel.com>, linux-btrfs@vger.kernel.org
Subject: Re: [patch]btrfs: finish read pages in the order they are submitted
Date: Mon, 8 Feb 2010 11:59:01 +0100
Message-ID: <20100208105901.GA1025@kernel.dk>
In-Reply-To: <20100203181845.GE22119@think>
On Wed, Feb 03 2010, Chris Mason wrote:
> On Wed, Feb 03, 2010 at 03:45:11PM +0800, Shaohua Li wrote:
> > the endio is done at reverse order of bio vectors. That means for a sequential
> > read, the page first submitted will finish last in a bio. Considering we will
> > do checksum (making cache hot) for every page, this does introduce delay (and
> > chance to squeeze cache used soon) for pages submitted at the beginning. I
> > don't observe obvious performance difference with below patch at my simple test,
> > but seems more natural to finish read in the order they are submitted.
>
> Interesting, I wonder if we'd be able to see this on a higher throughput
> system. Jens, care to give it a shot (patch below)?
Sure, I gave it a spin. Baseline is current -git (-rc7'ish), and the
workload is just streaming reads of eight 16GB files. I used large streaming
reads as the bigger ios would hopefully help show the effect of doing
the reverse completions. The run takes ~1 minute, and the results are
averaged over 3 runs.
Throughput:

Kernel          Slowest         Fastest         Average
-------------------------------------------------------
baseline        2041MB/sec      2229MB/sec      2155MB/sec
patched         2052MB/sec      2071MB/sec      2062MB/sec
Completion latency average (msecs):

Kernel          Best            Worst           Average
-------------------------------------------------------
baseline        1.72            1.89            1.79
patched         1.83            1.89            1.85
Probably would need a LOT more runs to get a statistically significant
number here. It would be nice if O_DIRECT worked (hint, hint!), which
usually makes these things easier to test. If I look at the throughput
of the runs, the baseline usually starts a little slower (1.8GB/sec or
so) and gets faster, while the patched run starts much higher (close to
3.0GB/sec) and drops to 2.0GB/sec after that for the rest of the run.
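
For anyone who wants to redo the arithmetic on the two tables above, here is a
minimal sketch in Python. It only uses the figures already quoted in this mail;
the helper names (relative_change, spread) are made up for illustration and are
not part of any benchmark tool.

```python
# Figures copied from the throughput and latency tables (3 runs per kernel).
throughput_mb_s = {
    "baseline": {"slowest": 2041, "fastest": 2229, "average": 2155},
    "patched": {"slowest": 2052, "fastest": 2071, "average": 2062},
}
latency_ms_avg = {"baseline": 1.79, "patched": 1.85}


def relative_change(old, new):
    """Fractional change going from old to new (negative means new is lower)."""
    return (new - old) / old


def spread(row):
    """Gap between the fastest and slowest run, in MB/sec."""
    return row["fastest"] - row["slowest"]


for kernel, row in throughput_mb_s.items():
    print(f"{kernel}: run-to-run spread {spread(row)} MB/sec")

tput = relative_change(throughput_mb_s["baseline"]["average"],
                       throughput_mb_s["patched"]["average"])
lat = relative_change(latency_ms_avg["baseline"], latency_ms_avg["patched"])
print(f"average throughput change: {tput:+.1%}")
print(f"average latency change:    {lat:+.1%}")
```

With only three runs per kernel, the spread (188 MB/sec for baseline against
19 MB/sec for patched) is large next to the roughly 4% gap in the averages, so
the averages alone do not settle much.
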
So I did some perf stat checks too, to see if we see an improvement for
cache utilization. Results below.
Cache stats (millions)

Kernel          References      Misses
----------------------------------------------
baseline        3547            2387
patched         3822            2351
These numbers are very stable, the above were also averaged over 3 runs,
but variability was very low.
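
The cache counters can also be read as a miss ratio, since the patched run
issues more references while missing slightly less often. A small Python
sketch, again using only the numbers from the table above (the function name
miss_ratio is hypothetical, not a perf feature):

```python
# (references, misses) in millions, copied from the cache stats table.
cache_stats = {
    "baseline": (3547, 2387),
    "patched": (3822, 2351),
}


def miss_ratio(references, misses):
    """Fraction of cache references that missed."""
    return misses / references


for kernel, (refs, misses) in cache_stats.items():
    print(f"{kernel}: {miss_ratio(refs, misses):.1%} of references missed")

base_misses = cache_stats["baseline"][1]
patched_misses = cache_stats["patched"][1]
change = (patched_misses - base_misses) / base_misses
print(f"absolute miss count change: {change:+.1%}")
```

This is only arithmetic on the averaged counters and says nothing about which
cache level or which code path the misses come from.
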
My feeling is that the patch should be included. Cache misses are
provably down and the patch makes a lot of sense just logically. The
patched runs seemed more stable, and my gut tells me that the unpatched
runs may have been a bit flukey (one fast run, should probably be
excluded).
Let me know if you want more tests.
--
Jens Axboe