From: Chris Mason <chris.mason@oracle.com>
To: Steven Pratt <steve@dangyankee.net>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Updated performance results
Date: Fri, 24 Jul 2009 09:24:07 -0400
Message-ID: <20090724132407.GC16192@think>
In-Reply-To: <4A68DE81.3020505@dangyankee.net>
On Thu, Jul 23, 2009 at 05:04:49PM -0500, Steven Pratt wrote:
> Chris Mason wrote:
>> On Thu, Jul 23, 2009 at 01:35:21PM -0500, Steven Pratt wrote:
>>
>>> I have re-run the raid tests with re-creating the fileset between
>>> each of the random write workloads and performance does now match
>>> the previous newformat results. The bad news is that the huge gain
>>> that I had attributed to the newformat release does not really
>>> exist. All of the previous results (except for the newformat run)
>>> were not re-creating the fileset, so the gain in performance was due
>>> only to having a fresh set of files, not any code changes.
>>>
>>
>> Thanks for doing all of these runs. This is still a little different
>> than what I have here; my initial runs are very, very fast and after 10
>> or so level out to a relatively low performance on random writes. With
>> nodatacow, it stays even.
>>
>>
> Right, I do not see this problem with nodatacow.
>
>>> So, I have done 2 new sets of runs to look into this further. One is
>>> a 3 hour run of single threaded random write to the RAID system. I
>>> have compared this to ext3. Performance results are here:
>>> http://btrfs.boxacle.net/repository/raid/longwrite/longwrite/Longrandomwrite.html
>>>
>>> and graphing of all the iostat data can be found here:
>>>
>>> http://btrfs.boxacle.net/repository/raid/longwrite/summary.html
>>>
>>> The iostat graphs for btrfs are interesting for a number of reasons.
>>> First, it takes about 3000 seconds (or 50 minutes) for btrfs to
>>> reach steady state. Second, if you look at write throughput from
>>> the device view vs. the btrfs/application view, we see that for an
>>> application throughput of 21.5MB/sec it requires 63MB/sec of actual
>>> disk writes. That is an overhead of roughly 3 to 1 vs an overhead of
>>> ~0 for ext3. Also, looking at the change in iops vs MB/sec, we see
>>> that while btrfs starts out with reasonable size IOs, it quickly
>>> deteriorates to an average IO size of only 13KB. Remember, the
>>> starting file set is only 100GB on a 2.1TB filesystem, and all data
>>> is overwrite, and this is single threaded, so there is no reason
>>> this should fragment. It seems like the allocator is having a
>>> problem doing sequential allocations.
>>>
>>
>> There are two things happening. First, the default allocation scheme
>> isn't very well suited to this; mount -o ssd will perform better. But
>> over the long term, random overwrites to the file cause a lot of writes
>> to the extent allocation tree. That's really what -o nodatacow is
>> saving us. There are optimizations we can do, but we're holding off on
>> that in favor of enospc and other pressing things.
>>
> Well I have -o ssd data that I can upload, but it was worse than
> without. I do understand about timing and priorities.
>
>> But, with all of that said, Josef has some really important allocator
>> improvements. I've put them out along with our pending patches into the
>> experimental branch of the btrfs-unstable tree. Could you please give
>> this branch a try both with and without the ssd mount option?
>>
>>
> Sure, will try to get to it tomorrow.
Sorry, I missed a fix in the experimental branch. I'll push out a
rebased version in a few minutes.
-chris
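
For reference, the 3 to 1 write overhead quoted above can be checked
with a small calculation. This is only a sketch: the function name
write_amplification is made up for illustration, and the two inputs are
the 21.5MB/sec application throughput and 63MB/sec device throughput
reported in the quoted message, not new measurements.

```python
def write_amplification(app_mb_per_sec: float, device_mb_per_sec: float) -> float:
    """Return device write throughput divided by application write throughput.

    A value near 1.0 means almost no extra disk writes; larger values
    mean the filesystem writes that many times more data to disk than
    the application asked for.
    """
    if app_mb_per_sec <= 0:
        raise ValueError("application throughput must be positive")
    return device_mb_per_sec / app_mb_per_sec


if __name__ == "__main__":
    # Figures taken from the quoted btrfs long random write run.
    ratio = write_amplification(21.5, 63.0)
    print(f"{ratio:.2f}")  # prints 2.93
```

The same ratio computed from the ext3 iostat data would be the number
to compare against.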