From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: Recover btrfs volume which can only be mounted in read-only mode
Date: Tue, 27 Oct 2015 05:58:22 +0000 (UTC) [thread overview]
Message-ID: <pan$e5e2d$4d523be4$45db755c$a34029b2@cox.net> (raw)
In-Reply-To: <20151026092457.GA11331@carfax.org.uk>
Hugo Mills posted on Mon, 26 Oct 2015 09:24:57 +0000 as excerpted:
> On Mon, Oct 26, 2015 at 09:14:00AM +0000, Duncan wrote:
>> Dmitry Katsubo posted on Sun, 18 Oct 2015 11:44:08 +0200 as excerpted:
>>
>>> I think PID-based solution is not the best one. Why not simply take a
>>> random device? Then at least all drives in the volume are equally
>>> loaded (in average).
>>
>> Nobody argues that the even/odd-PID-based read-scheduling solution is
>> /optimal/, in a production sense at least. But [it's near ideal for
>> testing, and "good enough" for the most general case].
>
> For what it's worth, David tried implementing round-robin (IIRC)
> some time ago, and found that it performed *worse* than the pid-based
> system. (It may have been random, but memory says it was round-robin).
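(Just to make the two policies being compared concrete, here's a toy userspace sketch I'm inventing for illustration; it is not the actual btrfs code and the function names are made up. The pid-based pick is stateless and costs essentially nothing per read, while round-robin needs a shared counter that every reader has to bump, which is one plausible reason it could come out slower under contention:)

/*
 * Toy sketch only, not btrfs code: two ways of picking which of
 * N mirrors services a read.
 */
#include <sys/types.h>
#include <unistd.h>
#include <stdatomic.h>

/* PID-based: stateless; concurrent readers naturally spread out. */
static int pick_mirror_pid(int num_mirrors)
{
        return (int)(getpid() % num_mirrors);
}

/* Round-robin: a shared counter, so every read pays an atomic update. */
static int pick_mirror_rr(int num_mirrors)
{
        static atomic_uint next;

        return (int)(atomic_fetch_add(&next, 1) % num_mirrors);
}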
What I'd like to know is what mdraid1 uses, and whether btrfs can get that.

Some upgrades ago, after trying mdraid6 for the main system and mdraid0 for some parts (with mdraid1 for boot, since grub1 could deal with that but not the others), I eventually settled on 4-way mdraid1 for everything, using the same disks I had used for the raid6 and raid0.

And I was rather blown away by the mdraid1 speed in comparison, especially against the raid0, which I had thought would be faster than raid1. My use-case is apparently multi-thread read-heavy enough that, with whatever scheduling mdraid1 uses, I was getting up to four separate reads going at once, one per spindle. Writes still happened at single-spindle speed, but because this was SATA (as opposed to the older IDE; SATA was still new at the time), each spindle had its own channel and all four could write in parallel, the bottleneck being the slowest of the four to complete its write. So writes were single-spindle-speed, still far faster than the raid6 read-modify-write cycle, while reads really did appear to multitask, one per spindle.
Also, mdraid1 may actually have taken spindle head location into account as well, scheduling each read to the spindle whose head was already positioned closest to the target, tho I'm not sure on that.
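(If it helps picture it, here's a simplified sketch of that sort of nearest-head heuristic. This is my own toy version, not the actual md raid1 read_balance() code, and the struct and field names are invented:)

/*
 * Toy sketch of a "nearest head" read balancer; not the md code.
 */
#include <stdint.h>

struct mirror {
        uint64_t head_position;   /* sector this spindle's head last serviced */
        int      in_sync;         /* mirror holds valid data */
};

/* Pick the in-sync mirror whose head is closest to the target sector. */
static int pick_read_mirror(struct mirror *m, int nmirrors, uint64_t target)
{
        uint64_t best_dist = UINT64_MAX;
        int best = -1;

        for (int i = 0; i < nmirrors; i++) {
                uint64_t dist;

                if (!m[i].in_sync)
                        continue;
                dist = m[i].head_position > target ?
                       m[i].head_position - target : target - m[i].head_position;
                if (dist == 0)          /* sequential read: stay on this spindle */
                        return i;
                if (dist < best_dist) {
                        best_dist = dist;
                        best = i;
                }
        }
        if (best >= 0)
                m[best].head_position = target;  /* head ends up near here */
        return best;
}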
But whatever mdraid1's read-scheduling does, I was totally astonished at how efficient it was, and it really did turn my thinking on the most efficient raid choices upside down. So if btrfs could simply take that scheduler and modify it as necessary for btrfs specifics, provided the modifications weren't /too/ heavy, I really do think that would be the ideal. (The fact that btrfs does read-time checksum verification could very well mean that even the most direct adaptation of the algorithm wouldn't reach anything like the same efficiency; see the sketch below.) And of course it's freedomware code in the same kernel, so reusing the mdraid read-scheduler shouldn't be the problem it might be in other circumstances, tho the caveat of btrfs-specific implementation issues does remain.
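(To make that checksum caveat a bit more concrete: whatever mirror the scheduler prefers, a csum mismatch forces a retry from another copy anyway, so the choice is never final. A toy sketch of that shape, again my own invention rather than btrfs code, with the two helpers being trivial stand-ins for the real I/O and csum paths:)

/*
 * Toy sketch, not btrfs code: read-time checksum verification means
 * the scheduler's mirror choice can be overridden by a retry.
 */
#include <stddef.h>
#include <stdint.h>

struct extent_read {
        uint64_t logical;          /* logical address of the block */
        uint32_t expected_csum;    /* checksum recorded at write time */
};

/* Stand-in: real code would issue the bio to the chosen device. */
static int read_from_mirror(const struct extent_read *er, int mirror,
                            void *buf, size_t len)
{
        (void)er; (void)mirror; (void)buf; (void)len;
        return -1;                 /* pretend the read failed */
}

/* Stand-in: real code would use crc32c. */
static uint32_t csum_block(const void *buf, size_t len)
{
        const uint8_t *p = buf;
        uint32_t sum = 0;

        for (size_t i = 0; i < len; i++)
                sum = sum * 31 + p[i];
        return sum;
}

static int checked_read(const struct extent_read *er, void *buf, size_t len,
                        int preferred, int num_mirrors)
{
        /* Try the scheduler's preferred mirror first, then the others. */
        for (int i = 0; i < num_mirrors; i++) {
                int mirror = (preferred + i) % num_mirrors;

                if (read_from_mirror(er, mirror, buf, len) != 0)
                        continue;               /* I/O error: try next copy */
                if (csum_block(buf, len) == er->expected_csum)
                        return mirror;          /* good copy found */
                /* csum mismatch: the "best" mirror wasn't usable after all */
        }
        return -1;                              /* no good copy found */
}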
And of course someone would have to take the time to adapt it to work with btrfs, which gets us back to the practical side of things: the "opportunity rich, developer-time poor" situation that is btrfs coding reality, the risk of premature optimization, whether to do it at the same time as N-way-mirroring, etc.
But anyway, mdraid's raid1 read-scheduler really does seem to be
impressively efficient, the benchmark to try to match, if possible. If
that can be done by reusing some of the same code, so much the better.
=:^)
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman