From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: BTRFS as image store for KVM?
Date: Mon, 5 Oct 2015 08:19:50 +0000 (UTC)
Message-ID: <pan$5cc6d$7932715b$df64b07$b3b8edce@cox.net>
In-Reply-To: CAGfcS_=9uf359-uBHfEXhUC3HTpkjzE7vMwG1iOR2Ld0u3nFTg@mail.gmail.com

Rich Freeman posted on Sun, 04 Oct 2015 08:21:53 -0400 as excerpted:

> On Sun, Oct 4, 2015 at 8:03 AM, Lionel Bouton
> <lionel-subscription@bouton.name> wrote:
>>
>> This focus on single reader RAID1 performance surprises me.
>>
>> 1/ AFAIK the kernel md RAID1 code behaves the same (last time I checked
>> you need 2 processes to read from 2 devices at once) and I've never
>> seen anyone arguing that the current md code is unstable.

I'm not a coder and could be wrong, but AFAIK, md/raid1 either works per 
thread (thus should multiplex I/O across raid1 devices in single-process-
multi-thread), or handles multiple AIO requests in parallel, if not both.

(If I'm laboring under a severe misconception, and I could be, please do 
correct me -- I'd rather be publicly corrected and have just that, my 
world-view corrected to align with reality, than be wrong, publicly or 
privately, and never know it, thus never correcting it!  =:^)

IOW, the primary case where I believe md/raid1 does single-device serial 
access, is where the single process is doing just that, serialized-single-
request-sleep-until-the-data's-ready.  Otherwise read requests are spread 
among the available spindles.  =:^)

But...

> Perhaps, but with btrfs it wouldn't be hard to get 1000 processes
> reading from a raid1 in btrfs and have every single request directed to
> the same disk with the other disk remaining completely idle.  I believe
> the algorithm is just whether the pid is even or odd, and doesn't take
> into account disk activity at all, let alone disk performance or
> anything more sophisticated than that.
> 
> I'm sure md does a better job than that.

Exactly.  Even/odd PID scheduling is great for testing, since it's simple 
enough to load either side exclusively or both sides exactly evenly, but 
it's absolutely horrible for multi-task, since worst-case single-device-
bottleneck is all too easy to achieve by accident, and even pure-random 
distribution is going to favor one side or the other to some extent, most 
of the time.
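
As a rough illustration of the point above, here is a minimal Python 
sketch of PID-parity mirror selection.  It is only a model of the policy 
as described in this thread, not the kernel code; the names pick_mirror 
and device_load are made up for the example.

```python
from collections import Counter

def pick_mirror(pid: int, num_copies: int = 2) -> int:
    """Hypothetical model of the policy described above: the copy to
    read from depends only on the PID, never on device load."""
    return pid % num_copies

def device_load(pids, num_copies: int = 2) -> Counter:
    """Count how many readers land on each copy."""
    load = Counter({copy: 0 for copy in range(num_copies)})
    for pid in pids:
        load[pick_mirror(pid, num_copies)] += 1
    return load

# 1000 readers with consecutive PIDs split evenly across both copies.
balanced = device_load(range(1000, 2000))

# 1000 readers that all happen to have even PIDs hit copy 0 only.
worst = device_load(range(1000, 3000, 2))

print("consecutive PIDs:", dict(balanced))
print("all-even PIDs:   ", dict(worst))
```

Nothing in pick_mirror looks at queue depth or device speed, which is 
why the all-even case leaves the second copy completely idle.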

Even worse, btrfs raid1 is pair-mirroring only, no matter the number of 
devices, and the chunk allocator picks the devices with the most 
remaining free space.  So try to use 3+ devices of differing sizes, and 
until the space available on the largest pair drops to that of the 
others, that largest pair will get all the allocations.  Consider a bunch 
of quarter-TiB devices in raid1, with a pair of 2 TiB devices as well.  
The quarter-TiB devices will remain idle until the pair of 2 TiB devices 
reach 1.75 TiB full, thus equalizing the space available on each compared 
to the other devices on the filesystem.  Of course, that means reads, 
too, are going to be tied to only those two devices for anything in that 
first 1.75 TiB of data, and if all those reads are from even or all from 
odd PIDs, it's only going to be ONE of... perhaps 10 devices!  Possibly 
hundreds of read threads bottlenecking on a single device of ten, while 
the other 9/10 of the filesystem-array remains entirely idle! =:^(
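
The arithmetic can be checked with a small simulation.  This is a sketch 
under stated assumptions, not the real allocator: chunks are 1 GiB, each 
raid1 chunk goes to the two devices with the most free space, and ties 
are broken by device order (an arbitrary choice for the example).  The 
function name chunks_before_small_devices is hypothetical.

```python
def chunks_before_small_devices(free_gib, big=(0, 1), chunk_gib=1):
    """Count mirrored chunks allocated before any device outside `big`
    is used, under a most-free-space-first policy (assumed model)."""
    free = list(free_gib)
    allocated = 0
    while True:
        # Devices ordered by free space, largest first.  The sort is
        # stable, so ties go to the lower device index (assumption).
        order = sorted(range(len(free)), key=lambda d: free[d], reverse=True)
        pair = order[:2]
        if free[pair[1]] < chunk_gib:
            return allocated  # no room left for another mirrored chunk
        if any(d not in big for d in pair):
            return allocated  # a small device is about to be used
        for d in pair:
            free[d] -= chunk_gib
        allocated += 1

# Two 2 TiB devices plus eight quarter-TiB devices, sizes in GiB.
devices = [2048, 2048] + [256] * 8
n = chunks_before_small_devices(devices)
print(f"{n} GiB ({n / 1024:.2f} TiB) written before a small device is used")
```

With these numbers the large pair absorbs roughly the first 1.75 TiB on 
its own; the exact chunk at which a small device first gets picked 
depends on how ties are broken once free space is equal.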

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
