From: "G. Richard Bellamy" <rbellamy@pteradigm.com>
To: Duncan <1i5t5.duncan@cox.net>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: btrfs and iostat - how do I measure the live performance of my btrsf filesystems?
Date: Fri, 22 Aug 2014 21:13:58 -0700
Message-ID: <CADw2B2P8bPwzcQeO0r5VnLXSJFO+MG_D5egUQ9pJG5M60J0UuA@mail.gmail.com>
In-Reply-To: <pan$fd73$abec75f5$f92eda58$95927b3e@cox.net>
Um. Derp. Yeah, it's actually sd[defh].
Thanks for the continuing education.
On Fri, Aug 22, 2014 at 8:24 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> G. Richard Bellamy posted on Fri, 22 Aug 2014 14:36:22 -0700 as excerpted:
>
>> An interesting exercise saw me reading data from my RAID10 to a USB
>> device, which produced the following representative iostat:
>>
>> Linux 3.14.17-1-lts (eanna) 08/22/2014 _x86_64_ (24 CPU)
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            3.53    0.00    0.50    2.83    0.00   93.14
>>
>> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
>> sda               1.89         0.01         0.01        839        998
>> sdc               0.00         0.00         0.00          1          0
>> sdb               1.23         0.02         0.01       1254        998
>
>> sdi             175.40         0.00        20.26         39    1454881
>
>> sdd               0.26         0.01         0.00        827         58
>> sde              28.86        12.29         0.00     882447         61
>> sdf               0.00         0.00         0.00          1          0
>> sdh              25.25        12.29         0.00     882448         57
>> sdg               0.25         0.01         0.00        826         60
>>
>> /dev/sdi is the USB drive, and /dev/sd[defg] are the four devices in the
>> raid10 volume. I'm reading a large (1.1T) file from the raid10 volume
>> and writing it to the USB drive.
>>
>> You can see that there are approximately two drives from the raid10
>> which are being read from - I assume this corresponds to the two spans
>> (the 'no lower than the (n/spans)x' speed I mentioned in my original
>> post) - and that they aggregate to 24.58MB/s reads. This corresponds to
>> the 20.26MB/s writes to the USB drive.
>>
>> The raid10 volume is only being used for this file operation, nothing
>> else is touching it but the kernel and btrfs.
>>
>> I'm curious how others would read this?
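
As a quick check of the arithmetic quoted above, here is a minimal Python
sketch that picks out the devices doing real read work and sums their
rates. The MB/s values are copied from the iostat output above; the
1.0 MB/s "busy" threshold and the helper names are assumptions made for
illustration only.

```python
# MB_read/s per device, copied from the iostat output quoted above.
read_mb_s = {"sdd": 0.01, "sde": 12.29, "sdf": 0.00, "sdh": 12.29, "sdg": 0.01}
usb_write_mb_s = 20.26  # MB_wrtn/s for sdi, the USB drive


def active_readers(rates, threshold=1.0):
    """Return the devices reading at or above `threshold` MB/s (threshold is arbitrary)."""
    return sorted(dev for dev, rate in rates.items() if rate >= threshold)


def aggregate(rates, devices):
    """Sum the read rates of the given devices, rounded to iostat's two decimals."""
    return round(sum(rates[dev] for dev in devices), 2)


busy = active_readers(read_mb_s)
total = aggregate(read_mb_s, busy)
print(busy, total, usb_write_mb_s)
```

With these figures only sde and sdh clear the threshold, and their combined
read rate is the 24.58MB/s mentioned above, against 20.26MB/s written to sdi.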
>
> Something's not adding up. You say sd[defg] are the btrfs raid10, but
> it's sde and sdh that are getting the read traffic. Are you sure sdh
> isn't part of the raid10 and one of sd[dfg] (perhaps f, seeing d and g
> appear to balance out leaving f the odd one out?) is?
>
> Assuming sdh is indeed part of the raid10, it makes sense, and the fact
> that only two of the four devices are being actively read matches what's
> known about btrfs raid1/10 at this point -- it has a relatively dumb read
> allocation algorithm that was good enough for a first implementation but
> obviously isn't optimal: reads are allocated based on the last bit of the
> PID (or TID, IDR which), so even/odd. Since this is a single transfer
> process, all the activity is on one or the other, so it's reading from
> the two-device-wide stripe, but always from the same one of the two
> mirrors supporting each strip.
>
> If you had a second read process going on and it was the same even/odd
> pid, you'd be doubling up on the same two devices. Only with a
> relatively even mix of even/odd pid reads will you see things even out
> across all four. See what I mean about a "relatively dumb" not well
> optimized first implementation?
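
The even/odd behaviour described above can be illustrated with a small
Python simulation. This is only a sketch of the idea as explained in this
thread, not the kernel's code: the two-pair layout, the device names
dev0..dev3, the example PIDs, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical raid10 layout: two stripe members, each backed by two mirror copies.
LAYOUT = [("dev0", "dev1"), ("dev2", "dev3")]


def pick_devices(pid):
    """Choose one mirror per stripe member from the low bit of the PID."""
    mirror = pid % 2  # even PIDs use the first copy, odd PIDs the second
    return [pair[mirror] for pair in LAYOUT]


def read_load(pids):
    """Count how many reader processes land on each device."""
    load = Counter()
    for pid in pids:
        load.update(pick_devices(pid))
    return load


single = read_load([4242])             # one reader: only two devices are used
same_parity = read_load([4242, 5000])  # two even PIDs: the same two devices, doubled up
mixed = read_load([4242, 4243])        # one even, one odd: all four devices are used
print(single, same_parity, mixed)
```

Under this model a single copy process touches exactly one device per
mirror pair, which is consistent with the two busy drives in the iostat
output, and the load only spreads to all four once readers with both even
and odd PIDs are active.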
>
> As they say btrfs is stabilizing now, presumably one of these kernel
> cycles we'll see something better in terms of read mirror allocation
> algorithm, perhaps as part of N-way-mirroring, when that gets implemented
> (roadmapped for after raid5/6 is completed, it's two-way-mirroring only
> now, regardless of the number of devices).
>
> --
> Duncan - List replies preferred. No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master." Richard Stallman
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html