From: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
To: NeilBrown <neilb@suse.de>
Cc: "Linux RAID" <linux-raid@vger.kernel.org>,
	"Florian-Ewald Müller" <florian-ewald.mueller@profitbricks.com>
Subject: [RFC] Process requests instead of bios to use a scheduler
Date: Wed, 28 May 2014 15:04:14 +0200
Message-ID: <5385DECE.5060507@profitbricks.com>

Hi Neil,

At ProfitBricks we stack the raid0 driver on top of raid1 to form
a RAID-10. LVM and SCST/ib_srpt sit on top of that.

We've extended the md driver in our 3.4-based kernels to do full bio
accounting (by adding ticks and in-flight counters). We then added a
request-by-request mode, built on blk_init_queue() and an
md_request_function() and selectable by a module parameter, and extended
mdadm accordingly. In this mode the block layer provides the accounting
and makes it possible to select a scheduler.
We use the ticks to maintain latency statistics, which lets us compare
both modes.
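
The latency statistic can be pictured as a histogram with power-of-two
bucket bounds, matching the rows of the tables further down (< 8 ms up to
< 65536 ms, an overflow bucket, and a running maximum). The following is
only a user-space sketch of that bookkeeping, not the driver code: the
class and method names are hypothetical, and it assumes latencies are
already available in milliseconds.

```python
from bisect import bisect_right

# Upper bucket bounds in ms, doubling from 8 to 65536 (as in the tables).
BOUNDS_MS = [8 << i for i in range(14)]


class LatencyHistogram:
    """Hypothetical per-direction latency histogram (illustration only)."""

    def __init__(self):
        # One counter per "< bound" bucket plus a final ">= 65536 ms" bucket.
        self.counts = [0] * (len(BOUNDS_MS) + 1)
        self.maximum_ms = 0

    def record(self, latency_ms):
        # bisect_right counts the bounds that are <= latency_ms, so a value
        # equal to a bound lands in the next higher bucket ("< bound" is strict).
        idx = bisect_right(BOUNDS_MS, latency_ms)
        self.counts[idx] += 1
        self.maximum_ms = max(self.maximum_ms, latency_ms)


hist = LatencyHistogram()
for sample_ms in (3, 8, 700, 9920):
    hist.record(sample_ms)
```

Keeping one such histogram for reads and one for writes would give the
two columns shown in the tables.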

My colleague Florian is in CC as he has been the main developer for this.

We ran tests with fio 2.1.7 (iodepth 64, posixaio): 10 LVs doing
sequential I/O in 1M chunks and 10 LVs doing sequential as well as
random I/O in 4K chunks, with one fio call per device. All fio processes
are killed after 60s.
The test systems have four 1 TB Seagate Constellation HDDs in RAID-10.
Each LV is 20G in size.

The biggest issue in our cloud is unfairness, which leads to high
latency, SRP timeouts and reconnects. This is why we need a scheduler
for our raid0 device.
The difference is tremendous when 4K random writes compete against 1M
sequential writes: with a scheduler, the maximum write latency dropped
from 10s to 1.6s. The tables below count bios for scheduler "none" and
requests for the other schedulers. Each row lists the read count first,
then the write count.

Scheduler: none
<      8 ms: 0 2139
<     16 ms: 0 9451
<     32 ms: 0 10277
<     64 ms: 0 3586
<    128 ms: 0 5169
<    256 ms: 2 31688
<    512 ms: 3 115360
<   1024 ms: 2 283681
<   2048 ms: 0 420918
<   4096 ms: 0 10625
<   8192 ms: 0 220
<  16384 ms: 0 4
<  32768 ms: 0 0
<  65536 ms: 0 0
>= 65536 ms: 0 0
 maximum ms: 660 9920

Scheduler: deadline
<      8 ms: 2 435
<     16 ms: 1 997
<     32 ms: 0 1560
<     64 ms: 0 4345
<    128 ms: 1 11933
<    256 ms: 2 46366
<    512 ms: 0 182166
<   1024 ms: 1 75903
<   2048 ms: 0 146
<   4096 ms: 0 0
<   8192 ms: 0 0
<  16384 ms: 0 0
<  32768 ms: 0 0
<  65536 ms: 0 0
>= 65536 ms: 0 0
 maximum ms: 640 1640
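
To make the two tables easier to compare, the write columns can be
reduced to the share of completions that finished in under 1024 ms. The
sketch below only re-uses the write counts from the tables above; the
helper name is made up for illustration. Keep in mind that "none" counts
bios while "deadline" counts requests, so only the shares are roughly
comparable, not the absolute numbers.

```python
# Write counts per bucket, copied from the two tables above
# (buckets: < 8 ms, < 16 ms, ..., < 65536 ms, >= 65536 ms).
WRITES_NONE = [2139, 9451, 10277, 3586, 5169, 31688, 115360, 283681,
               420918, 10625, 220, 4, 0, 0, 0]
WRITES_DEADLINE = [435, 997, 1560, 4345, 11933, 46366, 182166, 75903,
                   146, 0, 0, 0, 0, 0, 0]

BUCKETS_BELOW_1024_MS = 8  # < 8 ms up to and including < 1024 ms


def share_below(counts, n_buckets):
    """Fraction of all counted I/Os that fall into the first n_buckets."""
    total = sum(counts)
    return sum(counts[:n_buckets]) / total if total else 0.0


for name, counts in (("none", WRITES_NONE), ("deadline", WRITES_DEADLINE)):
    share = share_below(counts, BUCKETS_BELOW_1024_MS)
    print(f"{name}: {share:.2%} of writes below 1024 ms")
```

With these numbers, roughly 52% of the write bios complete in under
1024 ms without a scheduler, versus more than 99.9% of the write requests
with deadline.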

We clone the bios from the request and put them into a bio list. The
request is marked as in-flight, and afterwards the bios are processed
one by one, the same way as in the bio-based mode.
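
As a rough picture of that flow, here is a toy user-space model. It is
not kernel code and does not use any block-layer API: the Bio and Request
classes, handle_request() and the submit_bio callback are all
hypothetical names, and completing the request synchronously once the
last clone has been submitted is a simplifying assumption.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Bio:
    sector: int
    size: int


@dataclass
class Request:
    bios: list
    in_flight: bool = False
    pending: int = 0   # clones that have not completed yet
    done: bool = False


def handle_request(req, submit_bio):
    """Clone the request's bios into a list and process them one by one."""
    clones = deque(Bio(b.sector, b.size) for b in req.bios)
    req.pending = len(clones)
    req.in_flight = True           # request is accounted as in-flight
    while clones:
        bio = clones.popleft()
        submit_bio(bio)            # same per-bio path as in the bio-based mode
        req.pending -= 1           # toy model: completion is synchronous
    req.in_flight = False
    req.done = True


submitted = []
request = Request([Bio(0, 8), Bio(8, 8)])
handle_request(request, submitted.append)
```

In a real driver the clones complete asynchronously, so the request could
only be ended once the last clone has signalled completion.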

Is it safe to do it like this with a scheduler?

Any concerns regarding the write-intent bitmap?

Do you have any other concerns?

We can provide you with the full test results, the test scripts and also
some code parts if you wish.

Cheers,
Sebastian
