From: NeilBrown <neilb@suse.de>
To: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Cc: "Linux RAID" <linux-raid@vger.kernel.org>,
"Florian-Ewald Müller" <florian-ewald.mueller@profitbricks.com>
Subject: Re: [RFC] Process requests instead of bios to use a scheduler
Date: Mon, 2 Jun 2014 09:32:58 +1000
Message-ID: <20140602093258.22aa2c05@notabene.brown>
In-Reply-To: <5385DECE.5060507@profitbricks.com>
On Wed, 28 May 2014 15:04:14 +0200 Sebastian Parschauer
<sebastian.riemer@profitbricks.com> wrote:
> Hi Neil,
>
> at ProfitBricks we use the raid0 driver stacked on top of raid1 to form
> a RAID-10. Above there is LVM and SCST/ib_srpt.
Any particular reason you don't use the raid10 driver?
>
> We've extended the md driver for our 3.4 based kernels to do full bio
> accounting (by adding ticks and in-flights). Then, we've extended it to
> run in a request-by-request mode using blk_init_queue() and an
> md_request_function(), selectable by a module parameter, and extended
> mdadm accordingly. This way the block layer provides the accounting
> and the ability to select a scheduler.
> With the ticks we maintain a latency statistic, so we can compare
> both modes.
>
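For reference, a minimal sketch of what tick-based accounting with
power-of-two latency buckets (as in the tables below) could look like.
All names here (md_latency_stats, md_account_start/_done) are invented
for illustration and are not the actual ProfitBricks code:

#include <linux/jiffies.h>
#include <linux/log2.h>
#include <linux/kernel.h>
#include <linux/atomic.h>

#define MD_LAT_BUCKETS 15	/* <8ms, <16ms, ..., <65536ms, >=65536ms */

struct md_latency_stats {
	atomic64_t bucket[2][MD_LAT_BUCKETS];	/* [0] = read, [1] = write */
	atomic_t   in_flight;
};

/* called at submission: bump in-flights, remember the start "ticks" */
static inline unsigned long md_account_start(struct md_latency_stats *s)
{
	atomic_inc(&s->in_flight);
	return jiffies;
}

/* called at completion: bucket 0 is <8ms, every further bucket doubles */
static inline void md_account_done(struct md_latency_stats *s,
				   int rw, unsigned long start)
{
	unsigned int ms = jiffies_to_msecs(jiffies - start);
	int idx = ms < 8 ? 0 : min(ilog2(ms) - 2, MD_LAT_BUCKETS - 1);

	atomic64_inc(&s->bucket[rw][idx]);
	atomic_dec(&s->in_flight);
}
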
> My colleague Florian is in CC as he has been the main developer for this.
>
> We did some fio 2.1.7 tests with iodepth 64 and posixaio: 10 LVs doing
> sequential I/O in 1M chunks and 10 LVs doing sequential as well as
> random I/O in 4K chunks - one fio call per device. After 60s all fio
> processes are killed.
> Test systems have four 1 TB Seagate Constellation HDDs in RAID-10. LVs
> are 20G in size each.
>
> The biggest issue in our cloud is unfairness leading to high latency,
> SRP timeouts and reconnects. That is why we would need a scheduler for
> our raid0 device.
Having a scheduler for RAID0 doesn't make any sense to me.
RAID0 simply passes each request down to the appropriate underlying device.
That device then does its own scheduling.
Adding a scheduler may well make sense for RAID1 (the current "scheduler"
only does some read balancing and is rather simplistic) and for RAID4/5/6/10.
But not for RAID0 .... was that a typo?
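To make the point concrete, the single-zone raid0 mapping is essentially
just the arithmetic below. This is a simplified sketch of the idea, not
the actual md/raid0.c code (which also handles zones of differently
sized devices):

#include <linux/kernel.h>
#include <linux/types.h>

/*
 * Map an array sector to (component device, sector on that device).
 * Every bio is remapped like this and forwarded, so any real queueing
 * and scheduling happens on the component device's own request queue.
 */
static void raid0_map_sector(sector_t sector, unsigned int chunk_sects,
			     unsigned int nr_devs,
			     unsigned int *dev_idx, sector_t *dev_sector)
{
	sector_t chunk = sector;
	u32 offset     = sector_div(chunk, chunk_sects); /* chunk = sector / chunk_sects */

	*dev_idx    = sector_div(chunk, nr_devs);	 /* which component device */
	*dev_sector = chunk * chunk_sects + offset;	 /* position on that device */
}
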
> The difference is tremendous when comparing the results of 4K random
> writes fighting against 1M sequential writes. With a scheduler the
> maximum write latency dropped from 10s to 1.6s. The other values in the
> statistics are the number of bios for scheduler 'none' and the number
> of requests for the other schedulers; the first column is reads, the
> second writes.
>
> Scheduler: none
> < 8 ms: 0 2139
> < 16 ms: 0 9451
> < 32 ms: 0 10277
> < 64 ms: 0 3586
> < 128 ms: 0 5169
> < 256 ms: 2 31688
> < 512 ms: 3 115360
> < 1024 ms: 2 283681
> < 2048 ms: 0 420918
> < 4096 ms: 0 10625
> < 8192 ms: 0 220
> < 16384 ms: 0 4
> < 32768 ms: 0 0
> < 65536 ms: 0 0
> >= 65536 ms: 0 0
> maximum ms: 660 9920
>
> Scheduler: deadline
> < 8 ms: 2 435
> < 16 ms: 1 997
> < 32 ms: 0 1560
> < 64 ms: 0 4345
> < 128 ms: 1 11933
> < 256 ms: 2 46366
> < 512 ms: 0 182166
> < 1024 ms: 1 75903
> < 2048 ms: 0 146
> < 4096 ms: 0 0
> < 8192 ms: 0 0
> < 16384 ms: 0 0
> < 32768 ms: 0 0
> < 65536 ms: 0 0
> >= 65536 ms: 0 0
> maximum ms: 640 1640
Could you do a graph? I like graphs :-)
I can certainly see something has changed here...
>
> We clone the bios from the request and put them into a bio list. The
> request is marked as in-flight and afterwards the bios are processed
> one-by-one the same way as with the other mode.
>
> Is it safe to do it like this with a scheduler?
I see nothing inherently wrong with the theory. The details of the code are
much more important.
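For what it's worth, here is a rough sketch of what such a request_fn
based entry point could look like against the 3.x block API.
blk_init_queue() and md_request_function() are the names mentioned
above; struct md_rq_ctx, md_clone_endio() and md_submit_bio() are
placeholders for illustration, not the actual code (error handling
omitted):

#include <linux/blkdev.h>
#include <linux/bio.h>
#include <linux/slab.h>

struct md_rq_ctx {
	struct request	*req;
	atomic_t	pending;	/* clones still in flight */
	int		error;
};

static void md_clone_endio(struct bio *clone, int error)
{
	struct md_rq_ctx *ctx = clone->bi_private;

	if (error)
		ctx->error = error;
	bio_put(clone);

	/* complete the original request once every clone has finished */
	if (atomic_dec_and_test(&ctx->pending)) {
		blk_end_request_all(ctx->req, ctx->error);
		kfree(ctx);
	}
}

static void md_request_function(struct request_queue *q)
{
	struct mddev *mddev = q->queuedata;
	struct request *req;

	/* request_fn runs with q->queue_lock held; drop it around submission */
	while ((req = blk_fetch_request(q)) != NULL) {
		struct md_rq_ctx *ctx;
		struct bio *bio;

		spin_unlock_irq(q->queue_lock);

		ctx = kzalloc(sizeof(*ctx), GFP_NOIO);
		ctx->req = req;
		atomic_set(&ctx->pending, 1);	/* bias until all clones are queued */

		__rq_for_each_bio(bio, req) {
			struct bio *clone = bio_clone(bio, GFP_NOIO);

			clone->bi_end_io  = md_clone_endio;
			clone->bi_private = ctx;
			atomic_inc(&ctx->pending);
			md_submit_bio(mddev, clone); /* same path as make_request mode */
		}

		/* drop the bias; only completes here if the clones are already done */
		if (atomic_dec_and_test(&ctx->pending)) {
			blk_end_request_all(req, ctx->error);
			kfree(ctx);
		}

		spin_lock_irq(q->queue_lock);
	}
}

/* queue setup, roughly replacing the make_request based setup:
 *	mddev->queue = blk_init_queue(md_request_function, &some_lock);
 */
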
>
> Any concerns regarding the write-intent bitmap?
Only that it has to keep working.
>
> Do you have any other concerns?
>
> We can provide you with the full test results, the test scripts and also
> some code parts if you wish.
I'm not against improving the scheduling in various md raid levels, though
not RAID0 as I mentioned above.
Show me the code and I might be able to provide a more detailed opinion.
NeilBrown