From: Bill Davidsen <davidsen@tmr.com>
To: Neil Brown <neilb@suse.de>
Cc: Piergiorgio Sartor <piergiorgio.sartor@nexgo.de>,
Steven Haigh <netwiz@crc.id.au>,
Bryan Mesich <bryan.mesich@ndsu.edu>,
Jon@eHardcastle.com, linux-raid@vger.kernel.org
Subject: Re: Why does one get mismatches?
Date: Wed, 24 Feb 2010 09:54:17 -0500 [thread overview]
Message-ID: <4B853D99.1040902@tmr.com> (raw)
In-Reply-To: <20100220090208.06c1130f@notabene.brown>
Neil Brown wrote:
> md is not in a position to lock the page - there is simply no way it can stop
> the filesystem from changing it.
> The only thing it could do would be to make a copy, then write the copy out.
> This would incur a performance cost.
>
>
Two thoughts on that - one is that for critical data, give me the option
at array start time, make the copy, slow the performance and make it
more consistent. My second thought is that a checksum of the page before
initiating write and after all writes are complete might be less of a
performance hit, and still could detect that the buffer had changed.
>> It seems to me, maybe I'm wrong, not a so safe design.
>>
>
> I think you are wrong.
>
>
> This is correct. However it would be equally correct if you were talking
> about s normal disk drive rather than a RAID1 pair.
> If the filesystem changes the page (or allows it to change) while a write is
> pending, then it cannot know what actual data was written. So it must write
> the block out again before it ever reads it in.
> RAID1 is no different to any other device in this respect.
>
>
>
>> In other words, would it be better, for the md layer,
>> to be robust against these kind of threats?
>>
>>
>
> Possibly, but at what cost?
> There are two ways that I can imagine to 'solve' this issue.
>
> 1/ always copy the page before writing. This would incur a significant
> overhead, both in the complexity of pre-allocation memory and in the
> delay taken to perform the copy. And it would very rarely be actually
> needed.
> 2/ Have the filesystem protect the page from changes while it is being
> written. This is quite possible for the filesystem to do (while it
> is impossible for md to do). There could be some performance
> cost with memory-mapped pages as they would need to be unmapped,
> but there would be no significant cost for reads, writes, and filesystem
> metadata operations.
>
Your next section somewhat mirrors my thought on md checking the data
after write to be sure it didn't change.
> Further, any filesystem that wants to make use of the integrity checks
> that newer drives provide (where the filesystem provides a 'checksum' for
> the block which gets passed all the way down and written to storage, and
> returned on a read) will need to do this anyway. So it is likely the in
> the near future all significant filesystems will provide all the
> guarantees md needs or order to simply do nothing different.
>
> So my feeling is that md is doing the best thing already.
>
> I believe 'swap' will always be an issue as unmapping swap pages during write
> could be a serious performance cost. It might be that the best thing to do
> with swap is to somehow mark the area of an array used for swap as "don't
> care" so md never bothers to resync it, and never reports inconsistencies
> there, as they really are not an issue.
>
> NeilBrown
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
--
Bill Davidsen <davidsen@tmr.com>
"We can't solve today's problems by using the same thinking we
used in creating them." - Einstein
next prev parent reply other threads:[~2010-02-24 14:54 UTC|newest]
Thread overview: 104+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-01-20 11:52 Fw: Why does one get mismatches? Jon Hardcastle
2010-01-22 18:13 ` Goswin von Brederlow
2010-01-24 17:40 ` Jon Hardcastle
2010-01-24 21:52 ` Roger Heflin
2010-01-24 23:13 ` Goswin von Brederlow
2010-01-25 10:07 ` Jon Hardcastle
2010-01-25 10:37 ` Goswin von Brederlow
2010-01-25 10:52 ` Jon Hardcastle
2010-01-25 17:32 ` Goswin von Brederlow
2010-01-25 19:32 ` Iustin Pop
2010-02-01 21:18 ` Bill Davidsen
2010-02-01 22:37 ` Neil Brown
2010-02-02 15:11 ` Bill Davidsen
2010-02-03 11:17 ` Goswin von Brederlow
2010-02-11 5:14 ` Neil Brown
2010-02-11 17:51 ` Bryan Mesich
2010-02-16 21:25 ` Bill Davidsen
2010-02-16 21:38 ` Steven Haigh
2010-02-17 3:19 ` Bryan Mesich
2010-02-17 23:05 ` Neil Brown
2010-02-19 15:18 ` Piergiorgio Sartor
2010-02-19 22:02 ` Neil Brown
2010-02-19 22:37 ` Piergiorgio Sartor
2010-02-19 23:34 ` Asdo
2010-02-20 4:27 ` Goswin von Brederlow
2010-02-20 11:12 ` Asdo
2010-02-21 11:13 ` Goswin von Brederlow
[not found] ` <8754A21825504719B463AD9809E54349@m5>
[not found] ` <20100221194400.GA2570@lazy.lzy>
2010-02-22 13:01 ` Asdo
2010-02-22 13:30 ` Piergiorgio Sartor
2010-02-22 13:44 ` Piergiorgio Sartor
2010-02-24 19:42 ` Bill Davidsen
2010-02-20 4:23 ` Goswin von Brederlow
2010-02-24 14:54 ` Bill Davidsen [this message]
2010-02-24 21:37 ` Neil Brown
2010-02-26 20:48 ` Bill Davidsen
2010-02-26 21:09 ` Neil Brown
2010-02-26 22:01 ` Piergiorgio Sartor
2010-02-26 22:15 ` Bill Davidsen
2010-02-26 22:21 ` Piergiorgio Sartor
2010-02-26 22:20 ` Asdo
2010-02-27 6:01 ` Michael Evans
2010-02-28 0:01 ` Bill Davidsen
2010-02-24 14:46 ` Bill Davidsen
2010-02-24 16:12 ` Martin K. Petersen
2010-02-24 18:51 ` Piergiorgio Sartor
2010-02-24 22:21 ` Neil Brown
2010-02-25 8:41 ` Piergiorgio Sartor
2010-03-02 4:57 ` Neil Brown
2010-03-02 18:49 ` Piergiorgio Sartor
2010-02-24 21:39 ` Neil Brown
[not found] ` <4B8640A2.4060307@shiftmail.org>
2010-02-25 10:41 ` Neil Brown
2010-02-28 8:09 ` Luca Berra
2010-03-02 5:01 ` Neil Brown
2010-03-02 7:36 ` Luca Berra
2010-03-02 10:04 ` Michael Evans
2010-03-02 11:02 ` Luca Berra
2010-03-02 12:13 ` Michael Evans
2010-03-02 18:14 ` Asdo
2010-03-02 18:52 ` Piergiorgio Sartor
2010-03-02 23:27 ` Asdo
2010-03-03 9:13 ` Piergiorgio Sartor
2010-03-03 11:42 ` Asdo
2010-03-03 12:03 ` Piergiorgio Sartor
2010-03-02 20:17 ` Neil Brown
2010-02-24 21:32 ` Neil Brown
2010-02-25 7:22 ` Goswin von Brederlow
2010-02-25 7:39 ` Neil Brown
2010-02-25 8:47 ` John Robinson
2010-02-25 9:07 ` Neil Brown
2010-02-11 18:12 ` Piergiorgio Sartor
-- strict thread matches above, loose matches on Subject: below --
2010-02-01 23:14 Jon Hardcastle
2010-01-25 20:43 greg
2010-01-25 22:49 ` Steven Haigh
2010-01-27 21:54 ` Tirumala Reddy Marri
2010-01-28 9:16 ` Jon Hardcastle
2010-01-28 10:29 ` Asdo
2010-01-28 17:20 ` Tirumala Reddy Marri
2010-01-28 18:23 ` Goswin von Brederlow
2010-01-28 19:03 ` Tirumala Reddy Marri
2010-01-28 20:24 ` Goswin von Brederlow
2010-01-29 15:37 ` Jon Hardcastle
2010-01-29 23:52 ` Goswin von Brederlow
2010-01-30 10:39 ` Jon Hardcastle
2010-02-01 21:10 ` Bill Davidsen
2010-01-20 15:03 Jon Hardcastle
2010-01-20 15:34 ` Brett Russ
2010-01-20 20:44 ` Majed B.
2010-01-20 22:25 ` Brett Russ
2010-01-20 22:30 ` Majed B.
2010-01-20 22:43 ` Brett Russ
2010-01-20 23:01 ` Christopher Chen
2010-01-21 4:17 ` Steven Haigh
2010-01-21 8:08 ` Asdo
2010-01-21 10:52 ` Steven Haigh
2010-01-21 11:48 ` Farkas Levente
2010-01-21 12:15 ` Jon Hardcastle
2010-01-19 10:04 Jon Hardcastle
2010-01-20 14:19 ` Brett Russ
2010-01-20 14:34 ` Jon Hardcastle
2010-01-20 14:46 ` Brett Russ
2010-02-01 20:48 ` Bill Davidsen
2010-01-22 16:22 ` Jon Hardcastle
2010-01-22 16:34 ` Asdo
2010-01-22 17:41 ` Brett Russ
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4B853D99.1040902@tmr.com \
--to=davidsen@tmr.com \
--cc=Jon@eHardcastle.com \
--cc=bryan.mesich@ndsu.edu \
--cc=linux-raid@vger.kernel.org \
--cc=neilb@suse.de \
--cc=netwiz@crc.id.au \
--cc=piergiorgio.sartor@nexgo.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).