From: NeilBrown <nfbrown@novell.com>
To: doug@easyco.com, linux-raid <linux-raid@vger.kernel.org>
Subject: Re: Create Lock to Eliminate RMW in RAID/456 when writing perfect stripes
Date: Fri, 25 Dec 2015 18:58:50 +1100
Message-ID: <87h9j7yn5x.fsf@notabene.neil.brown.name>
In-Reply-To: <CAFx4rwR1gtLZUF6MUx71QTvw3hT5GKhsEMU2xcBpAz5wZWBUCw@mail.gmail.com>
On Thu, Dec 24 2015, Doug Dumitru wrote:
> The issue:
>
> The background thread in RAID-5 can wake up in the middle of a process
> populating stripe cache entries with a long write. If the long write
> contains a complete stripe, the background thread "should" be able to
> process the request without doing any reads.
>
> Sometimes the background thread is too quick at starting up a write
> and schedules a RMW (Read Modify Write) even though the needed blocks
> will soon be available.
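
To make the cost concrete, here is a simplified model of the read counts
involved. It assumes RAID-5 (one parity chunk per stripe) and that nothing
is cached except the chunks the write has already populated. The function
names are made up for illustration and this is not the md code itself.

```python
def reads_needed(n_data, dirty):
    """Return (rmw_reads, rcw_reads) for one RAID-5 stripe.

    n_data: data chunks per stripe (number of disks minus one parity disk).
    dirty:  data chunks the write has populated so far.
    Assumption: no other chunk of the stripe is already in the cache.
    """
    if not 0 < dirty <= n_data:
        raise ValueError("dirty must be between 1 and n_data")
    rmw_reads = dirty + 1        # old copy of each dirty chunk, plus old parity
    rcw_reads = n_data - dirty   # every data chunk that is not being written
    return rmw_reads, rcw_reads


def min_reads(n_data, dirty):
    """Reads issued if the cheaper of the two strategies is chosen."""
    return min(reads_needed(n_data, dirty))


# 24 disks as RAID-5 gives 23 data chunks per stripe.
early = min_reads(23, 1)    # thread runs after only one chunk has arrived
full = min_reads(23, 23)    # thread runs after the whole stripe has arrived
```

In this model a thread that handles the stripe too early pays for reads
that a fully populated stripe would not need at all.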
>
> Seeing this happen:
>
> You can see this happen by creating an MD set with a small stripe size
> and then doing DIRECT_IO writes that are exactly aligned on a stripe.
> For example, with 4 disks and 64K stripes, write 192K blocks aligned
> on 192K boundaries. You can do this from C or with 'dd' or 'fio'.
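
The 192K figure follows from the geometry: 4 disks with one parity chunk
leave 3 data chunks of 64K each. A small sketch of that arithmetic, and of
the alignment check for a write, assuming the "64K stripes" above are what
mdadm calls the chunk size (helper names are hypothetical):

```python
def full_stripe_bytes(n_disks, chunk_bytes, n_parity=1):
    """Bytes of data in one full stripe (n_parity=1 for RAID-5, 2 for RAID-6)."""
    if n_disks <= n_parity:
        raise ValueError("need more disks than parity chunks")
    return (n_disks - n_parity) * chunk_bytes


def covers_full_stripes(offset, length, stripe_bytes):
    """True if the write starts on a stripe boundary and spans whole stripes."""
    return length > 0 and offset % stripe_bytes == 0 and length % stripe_bytes == 0


KIB = 1024
stripe = full_stripe_bytes(4, 64 * KIB)                      # 3 data chunks
aligned = covers_full_stripes(192 * KIB, 192 * KIB, stripe)  # second stripe
shifted = covers_full_stripes(64 * KIB, 192 * KIB, stripe)   # off by a chunk
```

Only writes for which the check holds can, in principle, be completed
without any reads.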
>
> If you have this running, you can then run iostat. Ideally you should
> see absolutely no read activity on the disks; any reads that do show
> up are the RMW described above.
>
> The probability of this happening goes up when there are more disks.
> It may also go up the faster the disks are. My use case is 24 SSDs.
>
> The problem with this:
>
> There are really three issues.
>
> 1) The code does not need to work this way. It is not "broken" but
> just seems wrong.
> 2) There is a performance penalty here.
> 3) There is a Flash wear penalty here.
>
> It is 3) that most interests me.
>
> The fix:
>
> Create a waitq or semaphore based lock so that if a write includes a
> complete stripe, the background thread will wait for the write to
> completely populate the stripe cache.
>
> I would do this with a small array of locks. When a write includes a
> complete stripe, it sets a lock (stripe_number % sizeof_lock_array).
> This lock is released as soon as the write finishes populating the
> stripe cache. The background thread checks this lock before it starts
> a write. If the lock is set, it waits until the stripe cache is
> completely populated which should eliminate the RMW.
>
> If no writes are full stripes, then the lock never gets set, so most
> code runs without any real overhead.
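
A user-space sketch of the scheme described above, using Python threads in
place of kernel primitives. The class and method names are hypothetical,
and the per-slot counter is an assumption of this sketch (a plain flag
would let one write release a slot that another write still needs).

```python
import threading


class StripeLockArray:
    """Small fixed array of gates, indexed by stripe_number % size."""

    def __init__(self, size=16):
        self._size = size
        self._cond = threading.Condition()
        self._pending = [0] * size  # full-stripe writes still populating

    def _slot(self, stripe_number):
        return stripe_number % self._size

    def begin_full_stripe_write(self, stripe_number):
        """Writer: called before it starts populating a complete stripe."""
        with self._cond:
            self._pending[self._slot(stripe_number)] += 1

    def end_full_stripe_write(self, stripe_number):
        """Writer: called once the stripe cache is fully populated."""
        with self._cond:
            self._pending[self._slot(stripe_number)] -= 1
            self._cond.notify_all()

    def wait_until_populated(self, stripe_number, timeout=None):
        """Background thread: block while the slot is held.

        Returns False if the timeout expired first.
        """
        slot = self._slot(stripe_number)
        with self._cond:
            return self._cond.wait_for(lambda: self._pending[slot] == 0, timeout)


locks = StripeLockArray(size=4)
events = []


def background_thread():
    locks.wait_until_populated(5)
    events.append("handle stripe")


locks.begin_full_stripe_write(5)   # writer announces a full-stripe write
worker = threading.Thread(target=background_thread)
worker.start()
events.append("populated")         # writer finishes filling the stripe cache
locks.end_full_stripe_write(5)     # gate opens, background thread proceeds
worker.join()
```

Because slots are shared via the modulo, an unrelated stripe that hashes to
the same slot also waits; that trades a little false waiting for a small,
fixed-size array.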
>
> Implementing this:
>
> I am happy to implement this. I have quite a bit of experience with
> lock structures like this. I can also test on x86 and x86_64, but
> will need help with other arch's.
>
> Then again, if this is too much of an "edge case", I will just keep my
> patches in-house.
Hi,
this is certainly something that needs fixing. I can't really say if
your approach would work or not without seeing it and testing it on a
variety of work loads.
Certainly if you do implement something, please post it for others to
test and review. If it makes measurable improvements without causing
significant regressions, it will likely be included upstream.
Thanks,
NeilBrown