From: Chris Mason <chris.mason@oracle.com>
To: Steven Pratt <steve@dangyankee.net>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Updated performance results
Date: Thu, 23 Jul 2009 17:00:51 -0400
Message-ID: <20090723210051.GB1040@think>
In-Reply-To: <4A68AD69.4030803@dangyankee.net>
On Thu, Jul 23, 2009 at 01:35:21PM -0500, Steven Pratt wrote:
> I have re-run the raid tests with re-creating the fileset between each
> of the random write workloads and performance does now match the
> previous newformat results. The bad news is that the huge gain that I
> had attributed to the newformat release does not really exist. All of
> the previous results (except for the newformat run) were not re-creating
> the fileset, so the gain in performance was due only to having a fresh
> set of files, not any code changes.

Thanks for doing all of these runs. This is still a little different
from what I see here: my initial runs are very fast, and after 10 or so
they level out at relatively low random-write performance. With
nodatacow, performance stays even.

>
> So, I have done 2 new sets of runs to look into this further. One is a 3
> hour run of single threaded random write to the RAID system. I have
> compared this to ext3. Performance results are here:
> http://btrfs.boxacle.net/repository/raid/longwrite/longwrite/Longrandomwrite.html
>
> and graphing of all the iostat data can be found here:
>
> http://btrfs.boxacle.net/repository/raid/longwrite/summary.html
>
> The iostat graphs for btrfs are interesting for a number of reasons.
> First, it takes about 3000 seconds (or 50 minutes) for btrfs to reach
> steady state. Second, if you look at write throughput from the device
> view vs. the btrfs/application view, we see that for an application
> throughput of 21.5MB/sec it requires 63MB/sec of actual disk writes.
> That is an overhead of 3 to 1 vs an overhead of ~0 for ext3. Also,
> looking at the change in iops vs MB/sec, we see that while btrfs starts
> out with reasonable size IOs, it quickly deteriorates to an average IO
> size of only 13kb. Remember, the starting file set is only 100GB on a
> 2.1TB filesystem, and all data is overwrite, and this is single
> threaded, so there is no reason this should fragment. It seems like the
> allocator is having a problem doing sequential allocations.

There are two things happening. First, the default allocation scheme
isn't very well suited to this workload; mount -o ssd will perform
better. Second, over the long term, random overwrites to the file cause
a lot of writes to the extent allocation tree. That's really what
-o nodatacow is saving us. There are optimizations we can do, but we're
holding off on them in favor of enospc and other pressing things.
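
For reference, the 3-to-1 overhead quoted above follows directly from
the two throughput figures, and the 13kb average IO size implies a rough
device write rate. The sketch below is only an illustration: the
function names are made up for this example, and it assumes "kb" means
1024 bytes and 1 MB = 1024 KB, which the thread does not state.

```python
# Figures quoted in this thread (steady state, single-threaded random write).
APP_MB_PER_SEC = 21.5    # throughput seen by the application
DISK_MB_PER_SEC = 63.0   # throughput seen at the device (from iostat)
AVG_IO_KB = 13.0         # average IO size once btrfs reaches steady state


def write_amplification(disk_mb_s: float, app_mb_s: float) -> float:
    """Device bytes written per application byte written."""
    if app_mb_s <= 0:
        raise ValueError("application throughput must be positive")
    return disk_mb_s / app_mb_s


def implied_iops(disk_mb_s: float, avg_io_kb: float, kb_per_mb: int = 1024) -> float:
    """Write requests per second implied by throughput and average IO size.

    Assumption: 1 MB = 1024 KB. This is a derived estimate, not a number
    measured in the original runs.
    """
    if avg_io_kb <= 0:
        raise ValueError("average IO size must be positive")
    return disk_mb_s * kb_per_mb / avg_io_kb


if __name__ == "__main__":
    amp = write_amplification(DISK_MB_PER_SEC, APP_MB_PER_SEC)
    iops = implied_iops(DISK_MB_PER_SEC, AVG_IO_KB)
    print(f"write amplification: {amp:.2f}x")
    print(f"implied device write IOPS: {iops:.0f}")
```

With the numbers from this thread the ratio comes out at about 2.93,
which is the "3 to 1" above. The gap between the two throughputs is the
extra writing beyond the application's own data.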

But, with all of that said, Josef has some really important allocator
improvements. I've put them out, along with our pending patches, in the
experimental branch of the btrfs-unstable tree. Could you please give
this branch a try, both with and without the ssd mount option?

-chris