From: NeilBrown <neilb@suse.com>
To: Shaohua Li <shli@kernel.org>,
Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: Low RAID10 performance during resync
Date: Fri, 10 Jun 2016 17:08:12 +1000
Message-ID: <87k2hxmsgz.fsf@notabene.neil.brown.name>
In-Reply-To: <20160609173118.GA17207@kernel.org>
On Fri, Jun 10 2016, Shaohua Li wrote:
> On Thu, Jun 09, 2016 at 03:45:55PM +0200, Tomasz Majchrzak wrote:
>> Low mkfs performance has been observed on a RAID10 array during resync. It
>> is not so significant for NVMe drives, but for my setup of RAID10 consisting
>> of 4 SATA drives the format time has increased by 200%.
>>
>> I have looked into the problem and I have found out it is caused by this
>> changeset:
>>
>> commit 09314799e4f0589e52bafcd0ca3556c60468bc0e md: remove 'go_faster' option
>> from ->sync_request()
>>
>> It seemed the code was redundant and could be safely removed thanks to the
>> barrier mechanism, but it proved otherwise. The barriers don't throttle
>> resync IOs enough. They only ensure that non-resync IOs and resync IOs are
>> not executed at the same time. As a result, resync IOs take around 25% of
>> CPU time, mostly because there are many of them but only one at a time, so a
>> lot of CPU time is simply wasted waiting for a single IO to complete.
>>
>> The removed sleep call in resync IO had allowed a lot of non-resync activity
>> to be scheduled (nobody waiting for a barrier). Once sleep call had ended,
>> resync IO had to wait longer to raise a barrier as all non-resync activity
>> had to be completed first. It had nicely throttled a number of resync IOs in
>> favour of non-resync activity. Since we lack it now, the performance has
>> dropped badly.
>>
>> I would like to revert the changeset. We don't have to put a resync IO to
>> sleep for a second though. I have done some testing and it seems even a delay
>> of 100ms is sufficient. It slows down resync IOs to the same extent as sleep
>> for a second - the sleep call ends sooner but the barrier cannot be raised
>> until non-resync IOs complete.
>
> Add Neil.
>
> I'd like to make sure I understand the situation. With the change reverted, we
> dispatch a lot of normal IO and then do a resync IO. Without it reverted, we
> dispatch few normal IOs and then do a resync IO. In other words, we don't batch
> normal IO currently. Is this what you mean?
>
> Agreed, the barrier doesn't throttle resync IOs; it only ensures normal IO and
> resync IO run at different times.
I think the barrier mechanism will mostly let large batches of IO
through as a batch. If there is a pending request, a new request will
always be let straight through. Resync needs to wait for all pending
regular IO to complete before it gets a turn.
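
To make that rule concrete, here is a rough userspace sketch of it. It is a simplified model of the behaviour described above, not the actual raid10.c code: the struct and both function names are made up for illustration, and only the two counters that matter here are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the barrier state (not struct r10conf). */
struct barrier_model {
    int nr_pending;  /* regular IOs currently in flight */
    int barrier;     /* non-zero while resync holds or is raising the barrier */
};

/*
 * A regular IO goes straight through if no barrier is up, or if other
 * regular IO is still pending (so it joins the current batch).
 */
static bool regular_io_may_pass(const struct barrier_model *m)
{
    assert(m->nr_pending >= 0 && m->barrier >= 0);
    return m->barrier == 0 || m->nr_pending > 0;
}

/* Resync only gets its turn once all pending regular IO has completed. */
static bool resync_may_run(const struct barrier_model *m)
{
    assert(m->nr_pending >= 0 && m->barrier >= 0);
    return m->nr_pending == 0;
}
```

With two regular IOs pending, a third one passes and resync has to wait. With none pending and the barrier raised, it is the other way around. A strictly synchronous writer keeps dropping nr_pending to zero between writes, which is exactly where resync can slip in.
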
So I would only expect that patch to cause problems when IO is very
synchronous: write, wait, write, wait, etc.
I really didn't like the "go_faster" mechanism, but it might be OK to
have something like

    if (conf->nr_waiting)
        schedule_timeout_uninterruptible(1);

so it will wait one jiffy if there is normal IO. This would batch things
a lot more.
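
As a back-of-the-envelope illustration of what that check would cost resync, here is a small userspace model. It assumes, purely for the sake of the sketch, that issuing one resync request takes one tick and that nr_waiting is sampled once before each request; conf_model and both function names are hypothetical and do not exist in the md code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the single field of the conf used by the check. */
struct conf_model {
    int nr_waiting;  /* regular IOs currently blocked behind the barrier */
};

/* Ticks resync would sleep before its next request: 1 if normal IO waits. */
static unsigned int resync_backoff_ticks(const struct conf_model *conf)
{
    assert(conf != NULL);
    return conf->nr_waiting ? 1u : 0u;
}

/*
 * Total ticks to issue n resync requests, where waiting[i] is the value of
 * nr_waiting seen just before request i. Assumes one tick per request.
 */
static unsigned int resync_total_ticks(const int *waiting, size_t n)
{
    unsigned int ticks = 0;

    assert(waiting != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        struct conf_model conf = { .nr_waiting = waiting[i] };

        ticks += resync_backoff_ticks(&conf) + 1u;
    }
    return ticks;
}
```

In this model resync on an idle array runs at full speed, and each request issued while normal IO is waiting costs one extra tick. It says nothing about how much regular IO actually gets batched during that tick, which is the part that would need measuring.
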
It is very hard to know the exact consequences of this sort of change on
all different configurations, as the other commit you mentioned shows.
I keep thinking there must be a better way, but I haven't found it yet
:-(
NeilBrown
>
> On the other hand, the change makes resync faster. Did you try to revert this one:
> ac8fa4196d205ac8fff3f8932bddbad4f16e4110
> If resync is fast, reverting this one will throttle resync.
>
> Thanks,
> Shaohua