* Multi-layer raid status
From: David Brown @ 2018-01-30 15:30 UTC
To: linux-raid

Does anyone know the current state of multi-layer raid (in the Linux md
layer) for recovery?

I am thinking of a setup like this (hypothetical example - it is not a
real setup):

md0 = sda + sdb, raid1
md1 = sdc + sdd, raid1
md2 = sde + sdf, raid1
md3 = sdg + sdh, raid1

md4 = md0 + md1 + md2 + md3, raid5

If you have an error reading a sector on sda, the raid1 pair finds the
mirror copy on sdb, re-writes the data to sda (which re-locates the bad
sector) and passes the good data on to the raid5 layer.  Everyone is
happy, and the error is corrected quickly.

Rebuilds are fast, as they are single-disk copies.

However, if you have an error reading a sector on sda /and/ when reading
the mirror copy on sdb, then the raid1 pair has no data to give to the
raid5 layer.  The raid5 layer will then read the rest of the stripe and
calculate the missing data.  I presume it will then re-write the
calculated data to md0, which will in turn write it to sda and sdb, and
all will be well again.

But what about rebuilds?  A rebuild or recovery of the raid1 layer is
not triggered by a read from the raid5 level - it will be handled at the
raid1 level.  If sda is replaced, then the raid1 level will build it by
copying from sdb.  If a read error is encountered while copying, is
there any way for the recovery code to know that it can get the missing
data by asking the raid5 level?  Is it possible to mark the matching sda
sector as bad, so that a future raid5 read (such as from a scrub) will
see that md0 stripe as bad, and re-write it?
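For concreteness, the layout above could be assembled with something like the following - an illustrative sketch only, not commands from the thread; the device names are hypothetical, and a real system would add metadata, chunk-size and filesystem choices of its own:

```shell
# Four raid1 pairs (hypothetical device names), each becoming one leg
# of a raid5 built on top.  Requires root and real (or loop) devices.
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc /dev/sdd
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sde /dev/sdf
mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sdg /dev/sdh

# The raid5 layer striped across the four mirrors:
mdadm --create /dev/md4 --level=5 --raid-devices=4 \
      /dev/md0 /dev/md1 /dev/md2 /dev/md3
```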
* Re: Multi-layer raid status
From: NeilBrown @ 2018-02-02 6:03 UTC
To: David Brown, linux-raid

On Tue, Jan 30 2018, David Brown wrote:
> [...]
> However, if you have an error reading a sector in sda /and/ when reading
> the mirror copy in sdb, then the raid1 pair has no data to give to the
> raid5 layer.  The raid5 layer will then read the rest of the stripe and
> calculate the missing data.  I presume it will then re-write the
> calculated data to md0, which will in turn write it to sda and sdb, and
> all will be well again.

If sda and sdb have bad-block-logs configured, this should work.  Not
everyone trusts them though.

> But what about rebuilds?  A rebuild or recovery of the raid1 layer is
> not triggered by a read from the raid5 level - it will be handled at the
> raid1 level.  If sda is replaced, then the raid1 level will build it by
> copying from sdb.  If a read error is encountered while copying, is
> there any way for the recovery code to know that it can get the missing
> data by asking the raid5 level?  Is it possible to mark the matching sda
> sector as bad, so that a future raid5 read (such as from a scrub) will
> see that md0 stripe as bad, and re-write it?

"Is it possible to mark the matching sda sector as bad"

This is exactly what the bad-block-list functionality is meant to do.

NeilBrown
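On kernels and mdadm versions with bad-block-log support, the per-device lists can be inspected from userspace; a hedged sketch, assuming an array md0 with member sda:

```shell
# Each md member exposes its recorded bad blocks through sysfs (when a
# bad block log is present in the metadata).  Entries are "sector length"
# pairs; a trailing "+" marks an entry not yet committed to the metadata.
cat /sys/block/md0/md/dev-sda/bad_blocks

# mdadm can show the same list straight from the member's superblock:
mdadm --examine-badblocks /dev/sda
```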
* Re: Multi-layer raid status
From: David Brown @ 2018-02-02 10:41 UTC
To: NeilBrown, linux-raid

On 02/02/18 07:03, NeilBrown wrote:
> On Tue, Jan 30 2018, David Brown wrote:
>> [...]
>> Is it possible to mark the matching sda
>> sector as bad, so that a future raid5 read (such as from a scrub) will
>> see that md0 stripe as bad, and re-write it?
>
> "Is it possible to mark the matching sda sector as bad"
>
> This is exactly what the bad-block-list functionality is meant to do.

Marvellous - thank you for the information.

Using bad block lists and then doing a higher-level scrub should
certainly work, and it is a good general solution, as it means you don't
need direct interaction between the layers (just the normal top-down
processing of layered block devices).  The disadvantage is that there
may be quite a delay between the raid1 rebuild and the next full re-read
of the entire raid5 array - all you really need is a single read at the
higher level to trigger the fixup.

Is there any way to map from block numbers at the lower raid level to
block numbers at a higher level?  I suppose in general the lower level
does not know what is above it.  I guess a user-mode tool could look at
/proc/mdstat and work through it to figure out the layers, then look
through the bad block lists and calculate the required high-level reads.
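A user-mode tool like that would mostly be offset arithmetic. As a rough sketch (the helper functions are hypothetical, and the mapping is deliberately rotation-agnostic - it only identifies the parent stripe to read, which is all that is needed to trigger reconstruction; real values for the data offset and chunk size would come from `mdadm --examine`):

```shell
#!/bin/sh
# Map a bad sector on a raid1 member to the raid5 stripe above it, so a
# single targeted read of the parent can trigger a rewrite.  All values
# are in 512-byte sectors.

# parent_stripe MEMBER_SECTOR DATA_OFFSET CHUNK_SECTORS
# -> chunk-sized stripe number within the raid5 component
parent_stripe() {
    member_sector=$1; data_offset=$2; chunk_sectors=$3
    echo $(( (member_sector - data_offset) / chunk_sectors ))
}

# parent_read_range STRIPE CHUNK_SECTORS NDATA_DISKS
# -> "start length" of the full parent stripe, so that one read covers
#    every data chunk regardless of how the parity rotates
parent_read_range() {
    stripe=$1; chunk_sectors=$2; ndata=$3
    echo "$(( stripe * chunk_sectors * ndata )) $(( chunk_sectors * ndata ))"
}
```

For example, a bad block at member sector 10240, with a 2048-sector data offset and 1024-sector (512 KiB) chunks, falls in parent stripe 8; with three data disks, `dd if=/dev/md4 bs=512 skip=24576 count=3072 of=/dev/null` would then read the whole stripe, hit the faked read error from the bad block list, and let the raid5 layer reconstruct and rewrite the chunk.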
* Re: Multi-layer raid status
From: Wols Lists @ 2018-02-02 11:17 UTC
To: David Brown, NeilBrown, linux-raid

On 02/02/18 10:41, David Brown wrote:
> Using bad block lists and then doing a higher-level scrub should
> certainly work [...]  The disadvantage is that there
> may be quite a delay between the raid1 rebuild and the next full re-read
> of the entire raid5 array - all you really need is a single read at the
> higher level to trigger the fixup.

This would be a perfect use case for my "full parity reads" mode.  At
the moment, raid just reads enough disks to return the requested data,
but I proposed a mode where it reads the full stripe, does the parity
checks, and either returns a read error (2-disk raid-1, raid-5) or
corrects the stripe (raid-6) if things don't add up.

Okay, it would knacker performance a bit, but where you've got a nested
raid like this, you switch it on, run a read over the whole filesystem
(a "tar / --no-follow > /dev/null" sort of thing), and it would sort
out integrity all the way down the stack.

Cheers,
Wol
* Re: Multi-layer raid status
From: David Brown @ 2018-02-02 11:32 UTC
To: Wols Lists, NeilBrown, linux-raid

On 02/02/18 12:17, Wols Lists wrote:
> This would be a perfect use case for my "full parity reads" mode.
> [...]

You already do that during a scrub.  You don't want to do it during
normal operation - unless you have a usage pattern with mostly big
reads, you will cripple performance.  A small performance drop is
acceptable if it can be shown to significantly improve reliability - but
making every read a full-stripe read will give you random read
performance closer to that of a single disk than of a raid array.

> Okay, it would knacker performance a bit, but where you've got a nested
> raid like this, you switch it on, run a read over the whole filesystem
> [...] and it would sort out integrity all the way down the stack.

That's a scrub.  You do it as a very low priority task on a regular
basis.
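For reference, such a scrub is driven through sysfs; a minimal sketch (the array name is assumed):

```shell
# Start a read-and-verify pass over the whole array:
echo check > /sys/block/md4/md/sync_action

# Watch progress, then see how many inconsistent stripes were found:
cat /proc/mdstat
cat /sys/block/md4/md/mismatch_cnt

# "repair" additionally rewrites stripes whose parity does not match:
echo repair > /sys/block/md4/md/sync_action
```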
* Re: Multi-layer raid status
From: Reindl Harald @ 2018-02-02 12:12 UTC
To: David Brown, Wols Lists, NeilBrown, linux-raid

On 02.02.2018 12:32, David Brown wrote:
> That's a scrub.  You do it as a very low priority task on a regular
> basis.

If only that "low priority" actually worked these days...

dev.raid.speed_limit_min = 25000
dev.raid.speed_limit_max = 1000000

In the good old days "dev.raid.speed_limit_max" behaved the way its
name suggests: when you used your machine, the scrub simply took
longer.  These days (for many months, if not years now) you are better
off lowering "dev.raid.speed_limit_max", typing "sysctl -p", and only
*after* that continuing to use your machine.
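The knobs in question are the md sync throttles; a sketch of the workaround being described, with illustrative values:

```shell
# Temporarily cap resync/scrub bandwidth (KiB/s, per device) so the
# machine stays responsive while the scrub runs:
sysctl -w dev.raid.speed_limit_max=50000

# ... let the scrub or rebuild proceed ...

# Restore the stock ceiling afterwards (200000 is the kernel default):
sysctl -w dev.raid.speed_limit_max=200000
```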
* Re: Multi-layer raid status
From: Wols Lists @ 2018-02-02 14:24 UTC
To: David Brown, NeilBrown, linux-raid

On 02/02/18 11:32, David Brown wrote:
> You already do that during a scrub.  You don't want to do it during
> normal operation [...] but
> making every read a full-stripe read will give you random read
> performance closer to that of a single disk than of a raid array.

Unless integrity is more important than speed?  Unless (as in your own
example) you know there's a problem and you want to find it?

Yup, I know it will knacker performance - I said so.  But there are
plenty of use cases where it would actually be very useful, and probably
the lesser of two evils.

(Actually, re-reading your original email, it sounds like the right
thing to do would be to call hdparm to mark the sector bad on sda,
rather than use the bad-block list, so it will be rewritten and clear
itself.  And this is also a perfect example of where my technique would
be useful - it's probably not the raid-5 parity block that got
corrupted, so the data itself has been corrupted, and my utility would
find the damaged file for you so you could recover it from backup.  A
scrub at the raid-5 level would just "fix" the parity and leave you with
a corrupted file waiting to blow up on you.)

Cheers,
Wol
* Re: Multi-layer raid status
From: David Brown @ 2018-02-02 14:50 UTC
To: Wols Lists, NeilBrown, linux-raid

On 02/02/18 15:24, Wols Lists wrote:
> Unless integrity is more important than speed?

There are scenarios where it is realistic to expect integrity problems
- but sudden decay of a disk sector is not a likely event.  There is
/no/ good reason for saying that when you read sector 1000 from disk A,
you should also read sector 1000 from disk B just in case it happened
to go bad.  Reading a whole stripe when you need to read one sector
gives you /nothing/.  Reading the whole stripe and checking the parity
gives you /almost/ nothing - if there is an error on the sector you are
reading, the disk tells you.  Undetected read errors are the pink
unicorns of the computing world - there are people who swear they have
seen them, but real evidence is very hard to come by.  And even then,
there are much better ways to deal with them (btrfs checksums, for
example).

And, yet again, you have regular scrubs.  These have low bandwidth cost
(because you run them slowly, and because they do not flood your block
and stripe caches), and will detect any such errors.

Integrity is important, but it is not so important that nothing else
matters.  Do you make sure all your servers are six stories underground
in concrete bunkers?  Don't tell me you are unwilling to pay that cost
- surely you don't want to risk losing data to a meteorite strike?  Do
you drive a tank to work?  After all, surely your personal safety is
more important than speed or fuel costs.

> Unless (as in your own example) you know there's a problem and you
> want to find it?

First, it is not an unknown problem - it is a known event.  Second,
reading full stripes for every disk read will not help in any way,
because your chances of reading the sector in question are tiny for
most normal usage patterns.  Third, normal regular scrubs will catch it
just the same, merely with a bit of delay.  If you want to catch it
faster and don't mind low performance, increase the scrub bandwidth.
All I am asking is whether it is possible to have a targeted scrub on
just the relevant blocks, to minimise the low-redundancy period.

> Yup, I know it will knacker performance - I said so.  But there are
> plenty of use cases where it would actually be very useful, and
> probably the lesser of two evils.

What are these cases?  We have already eliminated the rebuild situation
I described.  And in particular, which use-cases are you thinking of
where you would not be better off with alternative integrity
improvements (like higher redundancy levels) that do not kill
performance?

> (Actually, re-reading your original email, it sounds like the right
> thing to do would be to call hdparm to mark the sector bad on sda,
> rather than use the bad-block list [...])

That does not make sense.  The bad block list described by Neil will do
the job correctly.  hdparm bad block marking could also work, but it
does so at a lower level, and the sector is /not/ corrected
automatically, AFAIK.  It also would not help if the raid1 were not
directly on a hard disk (think of a disk partition, another raid, an
LVM partition, an iSCSI disk, a remote block device, an encrypted block
device, etc.).
* Re: Multi-layer raid status
From: Wols Lists @ 2018-02-02 15:03 UTC
To: David Brown, NeilBrown, linux-raid

On 02/02/18 14:50, David Brown wrote:
> What are these cases?  We have already eliminated the rebuild situation
> I described.  And in particular, which use-cases are you thinking of
> where you would not be better off with alternative integrity
> improvements (like higher redundancy levels) that do not kill
> performance?

In particular, when you KNOW you've got a damaged raid, and you want to
know which files are affected.  The whole point of my technique is that
it either uses the raid to recover (if it can) or it propagates a read
error back to the application.  It does NOT "fix" the data and leave a
corrupted file behind.

> That does not make sense.  The bad block list described by Neil will do
> the job correctly.  hdparm bad block marking could also work, but it
> does so at a lower level, and the sector is /not/ corrected
> automatically, AFAIK.  [...]

Nor does the bad block list correct the error automatically, if that's
true.  The bad block list fakes a read error; hdparm causes a real read
error.  When the raid-5 scrub hits, either version triggers a rewrite.

The thing about the bad-block list is that the disk block is NOT
rewritten.  It's moved, and that disk space is LOST.  With hdparm, the
block gets rewritten, and if the rewrite succeeds the space is
recovered.

Cheers,
Wol
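For the record, hdparm can both exercise and force-rewrite an individual sector, though both operations destroy that sector's contents; a hedged sketch (the sector number is hypothetical, and hdparm insists on its safety flag for the destructive variants):

```shell
# Read a suspect sector directly; this fails with an I/O error if the
# drive considers the sector bad:
hdparm --read-sector 123456 /dev/sda

# Force-write the sector (destroys its contents).  On a modern drive a
# successful low-level write lets the firmware either reuse the sector
# or remap it transparently from its spare pool:
hdparm --yes-i-know-what-i-am-doing --write-sector 123456 /dev/sda
```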
* Re: Multi-layer raid status
From: David Brown @ 2018-02-02 15:40 UTC
To: Wols Lists, NeilBrown, linux-raid

On 02/02/18 16:03, Wols Lists wrote:
> In particular, when you KNOW you've got a damaged raid, and you want to
> know which files are affected.  The whole point of my technique is that
> it either uses the raid to recover (if it can) or it propagates a read
> error back to the application.  It does NOT "fix" the data and leave a
> corrupted file behind.

If you read a block and the read fails, the raid system will already
read the whole stripe to re-create the missing data.  If it can
re-create it, it writes the new data back to the disk and returns it to
the application.  If it cannot, it gives the read error back to the
application.

I cannot imagine a situation where you would have a disk that you know
has incorrect data as part of your array and in normal use for a file
system.

For the situation I originally described, if there were no support for
bad block lists, then you would need a more complex procedure for the
rebuild.  (I believe it would be something like this.  Enable the
write-intent bitmap for the raid5 level, take the raid1 pair with the
missing drive out of the raid5, rebuild the raid1 pair, and if the
build is successful then put it back in the raid5 and let the
write-intent logic bring it up to speed.  If the build had errors,
you'd have to unmount the filesystem, let the write-intent logic finish
writing, then scrub the raid5.)  But since there is the bad block list
to handle my concerns, there is no problem there.

> Nor does the bad block list correct the error automatically, if that's
> true.  [...]
>
> The thing about the bad-block list is that the disk block is NOT
> rewritten.  It's moved, and that disk space is LOST.  With hdparm, the
> block gets rewritten, and if the rewrite succeeds the space is
> recovered.

I don't know the details of when blocks are removed from bad lists
(either the md raid bad block list, or the hdparm list) and re-tried.
But it does not matter - the fraction of wasted space is negligible.
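That fallback rebuild procedure would look roughly like this - illustrative commands only, with hypothetical device names and no failure handling:

```shell
# Give the raid5 a write-intent bitmap, so the raid1 leg can be removed
# and later re-added without forcing a full resync:
mdadm --grow /dev/md4 --bitmap=internal

# Take the degraded raid1 out of the raid5 and rebuild it in isolation:
mdadm /dev/md4 --fail /dev/md0 --remove /dev/md0
mdadm /dev/md0 --add /dev/sda     # replacement disk, rebuilt from sdb

# Once the raid1 rebuild finishes cleanly, put it back; the bitmap
# limits the raid5 catch-up to regions written in the meantime:
mdadm /dev/md4 --re-add /dev/md0
```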
* Re: Multi-layer raid status
From: Wols Lists @ 2018-02-02 16:49 UTC
To: David Brown, NeilBrown, linux-raid

On 02/02/18 15:40, David Brown wrote:
> If you read a block and the read fails, the raid system will already
> read the whole stripe to re-create the missing data.  [...]
>
> I cannot imagine a situation where you would have a disk that you know
> has incorrect data as part of your array and in normal use for a file
> system.

Can't you?  When I was discussing this originally I had a bunch of
examples given to me.  Let's take just one, which as far as I can tell
is real, and is probably far more common than system developers would
like to admit.

A drive glitches, and writes a load of data - intended for, let's say,
track 1398 - to track 1938 by mistake.  Okay, that particular example
is a decimal blunder, and a drive would more likely make a bit-flip
mistake instead, but writing data to the wrong place is apparently a
well-recognised intermittent failure mode.  (And it's not even always
the hardware to blame - it could just be an unfortunate cosmic ray
incident.)

Or - and it was reported on this list - a drive suffers a power glitch
and dumps the entire contents of its write buffer.

Either way, we now have a raid array which APPEARS to be functioning
normally, while a bunch of stripes are corrupt.  If you're lucky (and
yes, this does seem to be the normal state of affairs), it's just the
parity which has been corrupted, which a scrub will fix.  But if it's
not the parity, then with raid-1 and raid-5 you can kiss your data
bye-bye, and with raid-6 a scrub will send your data to data heaven.
And saying "it's never happened to me" doesn't mean it's never happened
to anyone else.

Let's go back a few years, to the development of the ext file system
from version 2 to version 4.  I can't remember the exact saying, but
it's something along the lines of "premature optimisation is the root
of all evil".  When an ext2 system crashed, you could easily spend
hours running fsck before the system was usable.  So the developers
developed ext3, with a journal.  By chance, this always wrote the data
blocks before the journal, so when the system crashed, the journal
fixed the file system, and the users were very happy they didn't need a
fsck.  Then the developers decided to optimise further into ext4, and
broke the link between data and journal!  So now an ext4 system might
boot faster after a crash, shaving seconds off journal replay time -
but the system took MUCH LONGER to be available to users, because the
filesystem now corrupted user data, and instead of running the
system-level fsck, users had to replace it with an application data
integrity tool.

So yes, my "integrity checking raid" might be slow.  That is why it
would be disabled by default, and require flipping a runtime switch to
enable it.  But it's a hell of a lot faster than an "mkfs and reload
from backup", which is the alternative if your disk is corrupt (as
opposed to crashed and dead).  And my way gives you a list of corrupted
files that need restoring, as opposed to "scrub, fix, and cross your
fingers".

And one last question - if my idea is stupid, why did somebody think it
worthwhile to write raid6check?  Why is it that so many kernel-level
guys seem to treat user data integrity with contempt?

Cheers,
Wol
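raid6check, shipped in the mdadm source tree, is the tool alluded to here: because raid6 carries two syndromes, it can report which component of a stripe is inconsistent rather than merely that the stripe mismatches. The invocation is roughly as follows (arguments from memory, so treat this as a sketch on a hypothetical raid6 array):

```shell
# raid6check <md-device> <start-stripe> <number-of-stripes>
# A length of 0 means "check through to the end of the array":
raid6check /dev/md5 0 0
```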