From: Sebastian Sobolewski <linux@thirdmartini.com>
To: linux-raid@vger.kernel.org
Subject: Re: Write and verify correct data to read-failed sectors before degrading array?
Date: Thu, 16 Sep 2004 20:13:05 -0600
Message-ID: <414A4831.1040407@thirdmartini.com>
In-Reply-To: <16714.17729.576621.347771@cse.unsw.edu.au>



Neil Brown wrote:

>On Thursday September 16, linux@thirdmartini.com wrote:
>  
>
>>    I have some experimental code that does the read-recovery piece for 
>>raid1 devices against kernel 2.4.26.  If an error is encountered on a 
>>read, the failure is delayed until the read is retried to the other 
>>mirror.  If the retried read succeeds it then writes the recovered block 
>>back over the previously failed block. 
>>    If the write fails then the drive is marked faulty; otherwise we 
>>continue without setting the drive faulty.  ( The idea here is that 
>>modern disk drives have spare sectors, and will automatically 
>>reallocate a bad sector to one of the spares on the next write ). 
>>    The caveat is that if the drive is generating lots of bad/failed 
>>reads it's most likely going south.. but that's what SMART log 
>>monitoring is for.  If anyone is interested I can post the patch.
>>    
>>
>
>Certainly interested.
>
>Do you have any interlocking to ensure that if a real WRITE is
>submitted immediately after (or even during !!!) the READ, it does not
>get destroyed by the over-write.
>e.g.
>
>application     drive0          drive1
>READ request
>                READ from drive 0
>                fails
>                                READ from drive 1
>                                success. Schedule over-write on drive0
>READ completes
>WRITE block
>                WRITE to drive0 WRITE to drive1
>
>                overwrite happens.
>
>
>It is conceivable that the WRITE could be sent even *before* the READ
>completes though I'm not sure if it is possible in practice.
>
>NeilBrown
>
>  
>
    No, there is no interlocking at this time. I solve the above problem 
by not replying to the read until the recovery write attempt has either 
failed or completed.  This works well when the layer above us ( such as 
a FS ) uses the buffer cache or otherwise guarantees no R-W conflicts.  
( I believe this is currently the case for buffered block devices. )  
Using /dev/raw with an application that can cause R-W conflicts WILL 
result in corruption.  This is why the patch is experimental. :)
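
    To make the ordering concrete, here is a rough user-space sketch of 
the idea in Python.  It is not the patch and not kernel code: the Mirror 
class, raid1_read() and raid1_write() are hypothetical names made up for 
illustration, and it assumes that a successful write to a bad sector 
makes that sector readable again.

```python
class Mirror:
    """Toy in-memory stand-in for one half of a RAID1 pair (hypothetical)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.bad_reads = set()    # blocks whose reads currently fail
        self.bad_writes = set()   # blocks whose writes fail (no spare left)
        self.faulty = False

    def read(self, n):
        if n in self.bad_reads:
            raise IOError("read error on block %d" % n)
        return self.blocks[n]

    def write(self, n, data):
        if n in self.bad_writes:
            raise IOError("write error on block %d" % n)
        self.blocks[n] = data
        # Assumption: the drive remaps the sector on a successful write.
        self.bad_reads.discard(n)


def raid1_read(mirrors, n):
    """Read block n, repairing any mirror that failed before returning."""
    failed = []
    for m in mirrors:
        if m.faulty:
            continue
        try:
            data = m.read(n)
        except IOError:
            failed.append(m)      # delay the failure, try the next mirror
            continue
        # The recovery write finishes (or fails) before the caller gets
        # the data, so a later WRITE from the caller cannot be overtaken.
        for bad in failed:
            try:
                bad.write(n, data)
            except IOError:
                bad.faulty = True  # the write failed too: give up on it
        return data
    # Every mirror failed; what to do then is outside this sketch.
    raise IOError("block %d unreadable on all mirrors" % n)


def raid1_write(mirrors, n, data):
    """Ordinary mirrored write; a failed write marks that mirror faulty."""
    for m in mirrors:
        if m.faulty:
            continue
        try:
            m.write(n, data)
        except IOError:
            m.faulty = True
```

    Because raid1_read() only returns after the write-back, a caller 
that waits for its read before issuing a write to the same block always 
lands its write last.  A caller that issues the write while the read is 
still outstanding ( the /dev/raw case ) gets no such protection.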

    I've tested the code with a fault injector and have not been able 
to cause any corruption using ext3 or xfs. 
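
    The kind of check I mean can be modelled in a few lines of Python.  
This is only a toy sweep under stated assumptions, not the injector 
itself: read_with_recovery() and sweep() are hypothetical names, each 
run injects a single read fault, and a successful write-back is assumed 
to clear it.

```python
def read_with_recovery(disks, bad, n, first=0):
    """Read block n, trying disk `first` first and repairing failed disks.

    disks: list of per-disk block lists
    bad:   set of (disk, block) pairs whose reads currently fail
    """
    order = list(range(first, len(disks))) + list(range(first))
    failed = []
    for d in order:
        if (d, n) in bad:
            failed.append(d)          # delay the failure, try the next disk
            continue
        for f in failed:
            disks[f][n] = disks[d][n]  # write recovered data back
            bad.discard((f, n))        # assumed: the write remaps the sector
        return disks[d][n]
    raise IOError("block %d unreadable on all disks" % n)


def sweep(nblocks=8, ndisks=2):
    """Inject one read fault at every (disk, block); count corrupted runs."""
    corruptions = 0
    for d in range(ndisks):
        for n in range(nblocks):
            golden = ["data%d" % i for i in range(nblocks)]
            disks = [list(golden) for _ in range(ndisks)]
            disks[d][n] = "garbage"    # contents of the unreadable sector
            bad = {(d, n)}
            # Read every block, preferring the disk that holds the fault.
            got = [read_with_recovery(disks, bad, i, first=d)
                   for i in range(nblocks)]
            if got != golden or bad or any(x != golden for x in disks):
                corruptions += 1
    return corruptions
```

    A run counts as corrupted if the data returned differs from what 
was written, if a fault is left unrepaired, or if the mirrors disagree 
afterwards.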

-Sebastian


Thread overview: 9+ messages
2004-09-10 20:22 [BUG / PATCH] raid1: set BIO_UPTODATE after read error Paul Clements
2004-09-13  5:32 ` Neil Brown
2004-09-15 17:34   ` Paul Clements
2004-09-16 10:50     ` Write and verify correct data to read-failed sectors before degrading array? Tim Small
2004-09-17  0:39       ` Neil Brown
2004-09-17  1:41         ` Sebastian Sobolewski
2004-09-17  2:00           ` Neil Brown
2004-09-17  2:13             ` Sebastian Sobolewski [this message]
2004-09-22  0:06               ` [PATCH] " Sebastian Sobolewski
