From: Stan Hoeppner <stan@hardwarefreak.com>
To: tomas.hodek@volny.cz
Cc: linux-raid@vger.kernel.org
Subject: Re: raid1 read balance with write mostly
Date: Sat, 02 Feb 2013 15:52:12 -0600
Message-ID: <510D8A8C.5090505@hardwarefreak.com>
In-Reply-To: <36edd03089332985af9a97eeaf719cd3@mail1.volny.cz>
On 2/2/2013 11:38 AM, tomas.hodek@volny.cz wrote:
> Hi
>
> I have started testing md raid1 with one SSD and one HDD on a 3.7.1 kernel (which has trim/discard support on raid1). The array has the write-behind option enabled, and the HDD has the write-mostly option enabled.
> The original idea of the write-mostly option was "Read requests will only be sent if there is no other option."
>
> My first simple test workload was building the latest stable kernel (3.7.1) using 16 threads.
> But I saw some reads from the HDD irrespective of the write workload. I also saw read await of more than 1000 ms there, while the SSD had an await of about 1 ms. (I only used iostat -x.)
>
> I wanted to know why, so I searched the source code and found the read_balance function in raid1.c.
>
> If I have read and understood this code correctly, it does the following:
>
> If a device has the "write mostly" option, and we have not yet selected a device for reading (and the is_badblock check allows it), the code selects this device directly. This direct selection may be a mistake, because it can be overridden only in special cases: if another candidate device (without the write-mostly option) is idle, or if the request is part of a sequential read. Normally, read_balance searches for the nearest and/or least busy device, but such a device is used only if no device has been selected directly (including via the write-mostly code path).
>
>
> I think the whole code sequence
>
> best_disk = disk;
> continue;
>
> in the main for loop is not the best approach, and that setting
>
> best_pending_disk = disk;
> best_dist_disk = disk;
>
> is better because it gives a chance to find a better alternative. In other words, it changes the direct selection into the worst possible alternative.
> But I am not sure about all cases.
Did you test with a RAID1 of two mechanical drives? I can envision a
scenario of, say, a 300GB WD Raptor 10K mirrored to a 300GB partition on a
3TB 5K 'green' drive. The contents being the root filesystem, mirrored
strictly for safety. This is probably the same scenario you have in
mind. But in this case the IOPS performance difference is only 2:1,
whereas with the SSD it's more than 50:1. So under heavy read load, in
this case we'd probably want the slow 3TB drive to contribute to the
workload. With your patch, will it still do so?
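
To make the question concrete, here is a rough userspace model of the two
selection policies as you describe them. It is only a sketch, not the
raid1.c code: struct mirror, its fields, and pick_disk() are hypothetical
names, bad-block handling is left out, and it tracks only the pending count
(no head distance), which I am assuming is enough to show the difference.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-mirror state; not the kernel structure. */
struct mirror {
    bool write_mostly;      /* device is flagged write-mostly */
    unsigned int pending;   /* requests currently in flight */
    unsigned long dist;     /* distance from the last head position */
};

/*
 * direct == true:  a write-mostly disk is assigned to best_disk directly
 *                  (the behaviour described in the quoted analysis).
 * direct == false: a write-mostly disk is only recorded as the worst
 *                  possible fallback (the proposed change).
 * Returns the chosen disk index, or -1 if there are no disks.
 */
int pick_disk(const struct mirror *m, int n, bool direct)
{
    int best_disk = -1;
    int best_pending_disk = -1;
    unsigned int min_pending = UINT_MAX;

    for (int d = 0; d < n; d++) {
        if (m[d].write_mostly) {
            if (direct) {
                if (best_disk < 0)
                    best_disk = d;
            } else if (best_pending_disk < 0) {
                /* min_pending stays at UINT_MAX, so any normal
                 * disk seen later replaces this candidate. */
                best_pending_disk = d;
            }
            continue;
        }
        /* Sequential read or idle device: take it immediately. */
        if (m[d].dist == 0 || m[d].pending == 0) {
            best_disk = d;
            break;
        }
        if (m[d].pending < min_pending) {
            min_pending = m[d].pending;
            best_pending_disk = d;
        }
    }

    /* The balanced choice is consulted only if nothing was picked directly. */
    if (best_disk < 0)
        best_disk = best_pending_disk;
    return best_disk;
}
```

In this model, with the write-mostly disk at index 0 and a busy,
non-sequential fast disk at index 1, the direct variant returns 0 and the
fallback variant returns 1. The fallback variant also never returns the
write-mostly disk while any other disk is present, however busy that other
disk is, which is the case I am asking about.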
--
Stan
>
> I made two versions of a small patch that changes the direct selection so that the write-mostly device is only recorded as the most distant and most pending candidate. The "safe" version is safe and robust against future changes; the "now" version is minimal for the current code (up to 3.7.5).
>
> This patch works well for me. I can mark the SSD as failed, remove it from the array, and add it back under load without any trouble or additional kernel log messages.
>
> I am attaching my patches to this email.
>
> Best regards
> Tomas Hodek
>