From: NeilBrown <neilb@suse.de>
To: Alexander Lyakas <alex.bolshoy@gmail.com>
Cc: linux-raid <linux-raid@vger.kernel.org>, yair@zadarastorage.com
Subject: Re: Can raid1-resync cause a corruption?
Date: Thu, 9 Jan 2014 14:10:43 +1100
Message-ID: <20140109141043.28266c06@notabene.brown>
In-Reply-To: <CAGRgLy61Xx9US=nMuTdAKSnFXKGi=bH49gMZsNOD3Myv7NLQ3g@mail.gmail.com>
On Tue, 7 Jan 2014 17:01:23 +0200 Alexander Lyakas <alex.bolshoy@gmail.com>
wrote:
> Hi Neil,
> Thank you for your comments. Yes, apparently in this case md/raid1 was
> not the cause. I studied the code more by adding prints and following
> the resync flow, but cannot see any obvious problem.
>
> I did find another case, in which raid1-resync can read phantom data,
> although this is not what happened to us:
> # raid1 has 3 disks A,B,C and is resyncing after an unclean shutdown.
> sync_request always selects disk=A as read_disk.
> # application reads from far sector (beyond next_resync), so
> read_balance() selects disk=A to read from (it is the first one)
> # disk A fails
> # resync aborts and restarts, now sync_request reads from B and syncs into C
> # application reads again from the same far sector, now read_balance()
> selects disk B to read from
>
> So potentially we could get different data from these two reads. In
> our case, though, there were no disk failures.
>
> FWIW, the raid1 code I was once responsible for, treated this
> situation as follows:
> # READ comes from application
> # raid1 sees that it is resyncing, so it locks the relevant area of
> the raid1 and syncs it. Then it unlocks and proceeds to serve the READ
> normally
> # resync thread comes to appropriate area, locks it and sees that it
> has already been synced (bits are off in the bitmap), so it proceeds
> further
>
> However in md/raid1, there is no mechanism currently that can lock a
> part of the raid. We only have raise_barrier/wait_barrier that
> effectively locks the whole capacity.
>
> Is it, for example, reasonable to READ the data as you normally do,
> then to trigger a WRITE with the same data and only then to complete
> the original READ? There are a lot of inefficiencies here, I know,
> like re-writing the same data again on read_disk, and syncing this
> data again later. (I know, patches are welcome...)
Hmmm.. yes that could conceivably cause a problem. It would apply to RAID6
too.
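The quoted scenario can be illustrated with a toy model. This is only a sketch, not md/raid1 code: `struct toy_mirrors`, `toy_first_working()` and `toy_read()` are hypothetical names, and the "first working mirror" rule is a simplification of how read_balance() is described above for sectors beyond next_resync.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NMIRRORS 3   /* disks A, B, C */
#define NSECTORS 8

/* Hypothetical toy model of a 3-way mirror that is not yet in sync. */
struct toy_mirrors {
    int  data[NMIRRORS][NSECTORS]; /* per-mirror sector contents */
    bool failed[NMIRRORS];         /* true once a mirror has failed */
};

/* Assumption: reads of not-yet-synced sectors go to the first working
 * mirror, as in the quoted description. Returns -1 if none is left. */
static int toy_first_working(const struct toy_mirrors *m)
{
    for (int d = 0; d < NMIRRORS; d++)
        if (!m->failed[d])
            return d;
    return -1;
}

/* Serve a read from one mirror without checking the others.
 * Returns -1 when no working mirror remains. */
static int toy_read(const struct toy_mirrors *m, size_t sector)
{
    assert(sector < NSECTORS);
    int d = toy_first_working(m);
    return d < 0 ? -1 : m->data[d][sector];
}
```

If A and B hold different contents for a sector that resync has not reached, a `toy_read()` before A fails and another one after it fails return different values for the same sector, even though nothing was written in between.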
To "fix" it we would have to either read-and-check the replicas or parity
whenever we read from a block that is not "in-sync", and/or write them out.
This could be rather expensive for fairly little gain.
If we were doing a bitmap-based resync, then we could maybe expedite the
resync of any region before reading from it.
i.e. before reading from a block which is not known to be in-sync, we wait
for it to be in-sync, but also signal the resync process to do this 'bit'
worth next. That could be rather messy... but might not be too bad.
Rather than using the resync thread to handle the extra bits, maybe we could
have a work-queue which just handled specifically requested regions...
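A rough userspace sketch of the "expedite this region" idea, under loud assumptions: it is single-threaded, so the waiting and locking a real implementation would need are left out; mirror 0 is arbitrarily taken as the resync source; and every name here (`struct toy_array`, `toy_sync_region()`, `toy_read()`, `toy_resync_all()`) is hypothetical and is not an md interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NMIRRORS    2
#define NSECTORS    16
#define REGION_SECT 4                        /* sectors covered by one bitmap bit */
#define NREGIONS    (NSECTORS / REGION_SECT)

/* Hypothetical toy array with a per-region dirty bitmap. */
struct toy_array {
    int      data[NMIRRORS][NSECTORS];
    bool     dirty[NREGIONS];    /* true: region not known to be in-sync */
    unsigned synced_on_demand;   /* regions synced because a read asked */
};

/* Copy one region from mirror 0 (assumed source) to the other mirrors
 * and clear its bitmap bit. */
static void toy_sync_region(struct toy_array *a, size_t region)
{
    assert(region < NREGIONS);
    size_t start = region * REGION_SECT;
    for (size_t s = start; s < start + REGION_SECT; s++)
        for (int d = 1; d < NMIRRORS; d++)
            a->data[d][s] = a->data[0][s];
    a->dirty[region] = false;
}

/* Read path: if the region is not known to be in-sync, sync just that
 * one bit's worth first, then serve the read from the chosen mirror. */
static int toy_read(struct toy_array *a, int mirror, size_t sector)
{
    assert(mirror >= 0 && mirror < NMIRRORS && sector < NSECTORS);
    size_t region = sector / REGION_SECT;
    if (a->dirty[region]) {
        toy_sync_region(a, region);
        a->synced_on_demand++;
    }
    return a->data[mirror][sector];
}

/* Background resync: skips regions a read already cleaned.
 * Returns the number of regions it had to sync itself. */
static unsigned toy_resync_all(struct toy_array *a)
{
    unsigned n = 0;
    for (size_t r = 0; r < NREGIONS; r++) {
        if (!a->dirty[r])
            continue;
        toy_sync_region(a, r);
        n++;
    }
    return n;
}
```

In this model a read never returns data from a region that is still marked dirty, and the background pass does less work for every region that a read already forced in-sync. Whether that per-region work is done by the resync thread or by a separate work-queue is not something the sketch tries to decide.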
Patches certainly welcome :-)
NeilBrown