* Regression- XFS won't mount on partitioned md array
From: David Greaves @ 2008-05-16 17:11 UTC (permalink / raw)
To: David Chinner; +Cc: LinuxRaid, xfs, 'linux-kernel@vger.kernel.org'
I just attempted a kernel upgrade from 2.6.20.7 to 2.6.25.3 and it no longer
mounts my xfs filesystem.
I bisected it to around
a67d7c5f5d25d0b13a4dfb182697135b014fa478
[XFS] Move platform specific mount option parse out of core XFS code
I have a RAID5 array with partitions:
Partition Table for /dev/md_d0
First Last
# Type Sector Sector Offset Length Filesystem Type (ID) Flag
-- ------- ----------- ----------- ------ ----------- -------------------- ----
1 Primary 0 2500288279 4 2500288280 Linux (83) None
2 Primary 2500288280 2500483583 0 195304 Non-FS data (DA) None
when I attempt to mount /media:
/dev/md_d0p1 /media xfs rw,nobarrier,noatime,logdev=/dev/md_d0p2,allocsize=512m 0 0
I get:
md_d0: p1 p2
XFS mounting filesystem md_d0p1
attempt to access beyond end of device
md_d0p2: rw=0, want=195311, limit=195304
I/O error in filesystem ("md_d0p1") meta-data dev md_d0p2 block 0x2fae7
("xlog_bread") error 5 buf count 512
XFS: empty log check failed
XFS: log mount/recovery failed: error 5
XFS: log mount failed
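The numbers in the error line can be cross-checked with a few lines of arithmetic (a sketch; the page-sized-read explanation is the one Eric Sandeen later confirms): the log device is 195304 sectors long, and a 4 KiB (page-sized, 8-sector) read that starts at the last sector of the log runs past the end of the device.

```python
log_sectors = 195304             # size of /dev/md_d0p2 in 512-byte sectors
page_sectors = 4096 // 512       # a page-sized read spans 8 sectors

last_sector = log_sectors - 1    # last valid sector of the log device
want = last_sector + page_sectors

assert want == 195311            # matches "want=195311, limit=195304"
assert want > log_sectors        # the read goes beyond the device
assert 0x2fae7 == last_sector    # "block 0x2fae7" is exactly the last sector
```

Note that the failing meta-data block 0x2fae7 decodes to sector 195303, i.e. the very last sector of the external log partition.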
A repair:
xfs_repair /dev/md_d0p1 -l /dev/md_d0p2
gives no errors.
Phase 1 - find and verify superblock...
Phase 2 - using external log on /dev/md_d0p2
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
...
David
* Re: Regression- XFS won't mount on partitioned md array
From: Justin Piszcz @ 2008-05-16 17:16 UTC (permalink / raw)
To: David Greaves
Cc: David Chinner, LinuxRaid, xfs, 'linux-kernel@vger.kernel.org'

On Fri, 16 May 2008, David Greaves wrote:
> I just attempted a kernel upgrade from 2.6.20.7 to 2.6.25.3 and it no longer
> mounts my xfs filesystem.
>
> I bisected it to around
> a67d7c5f5d25d0b13a4dfb182697135b014fa478
> [XFS] Move platform specific mount option parse out of core XFS code
> [...]
> A repair:
> xfs_repair /dev/md_d0p1 -l /dev/md_d0p2
> gives no errors.
> [...]
> David

Ouch, still on 2.6.25.1 here, didn't reboot yet, but I do not use mdraid'ed
partitions, just regular mdraid. If you boot back to 2.6.20.7, does it work
again?
* Re: Regression- XFS won't mount on partitioned md array
From: David Greaves @ 2008-05-16 18:05 UTC (permalink / raw)
To: Justin Piszcz
Cc: David Chinner, LinuxRaid, xfs, 'linux-kernel@vger.kernel.org'

> Ouch, still on 2.6.25.1 here, didn't reboot yet, but I do not use
> mdraid'ed partitions, just regular mdraid, if you boot back to 2.6.20.7
> does it work again?

Yes, no probs. It came in prior to 2.6.25-rc1.

The machine has a root xfs filesystem with an internal log on a sata disk and
a data filesystem on a partitioned array with an external log (also on the
partitioned array). Only the partitioned array/external-log filesystem is
affected.

David
* Re: Regression- XFS won't mount on partitioned md array
From: Oliver Pinter @ 2008-05-16 18:35 UTC (permalink / raw)
To: David Greaves
Cc: Justin Piszcz, David Chinner, LinuxRaid, xfs, linux-kernel@vger.kernel.org

Does this[1] patch fix it? This patch is queued for the 2.6.25.5 kernel.

1: http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=blob;f=queue-2.6.25/block-do_mounts-accept-root-non-existant-partition.patch;h=097cda9928b434994dbb157065b2ca38e7cec3a1;hb=8cc4c3b370d59deb16c2e92165a466c82e914020

On 5/16/08, David Greaves <david@dgreaves.com> wrote:
>> Ouch, still on 2.6.25.1 here, didn't reboot yet, but I do not use
>> mdraid'ed partitions, just regular mdraid, if you boot back to 2.6.20.7
>> does it work again?
> Yes, no probs.
>
> It came in prior to 2.6.25-rc1
> [...]
> David

--
Thanks,
Oliver
* Re: Regression- XFS won't mount on partitioned md array
From: David Greaves @ 2008-05-17 14:48 UTC (permalink / raw)
To: Oliver Pinter
Cc: Justin Piszcz, David Chinner, LinuxRaid, xfs, linux-kernel@vger.kernel.org, Eric Sandeen

Oliver Pinter wrote:
> this[1] patch fixed?
>
> 1: http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=blob;f=queue-2.6.25/block-do_mounts-accept-root-non-existant-partition.patch;h=097cda9928b434994dbb157065b2ca38e7cec3a1;hb=8cc4c3b370d59deb16c2e92165a466c82e914020

Looks like a possible candidate - thanks.

I think this patch is for mounting root on an md device when the partitions
aren't yet initialised. However:

* When I run cfdisk I can read the partition table.
* Subsequent attempts to mount the xfs filesystem, when the partitions are
  clearly present in /proc/partitions, still fail.

> this patch is for 2.6.25.5 kernel

? There isn't a 2.6.25.5.

It doesn't apply to 2.6.25.4. I'll see if I can make it apply...

David
* Re: Regression- XFS won't mount on partitioned md array
From: David Greaves @ 2008-05-17 15:20 UTC (permalink / raw)
To: Oliver Pinter
Cc: Justin Piszcz, David Chinner, LinuxRaid, xfs, linux-kernel@vger.kernel.org, Eric Sandeen

David Greaves wrote:
> Oliver Pinter wrote:
>> this patch is for 2.6.25.5 kernel
> ? there isn't a 2.6.25.5

Sorry, I understand now; it's queued for 2.6.25.5.

> It doesn't apply to 2.6.25.4
>
> I'll see if I can make it apply...

Yep - it was download corruption. Applied, but it didn't help.

David
* Re: Regression- XFS won't mount on partitioned md array
From: Eric Sandeen @ 2008-05-16 18:59 UTC (permalink / raw)
To: David Greaves
Cc: David Chinner, LinuxRaid, xfs, 'linux-kernel@vger.kernel.org'

David Greaves wrote:
> I just attempted a kernel upgrade from 2.6.20.7 to 2.6.25.3 and it no longer
> mounts my xfs filesystem.
>
> I bisected it to around
> a67d7c5f5d25d0b13a4dfb182697135b014fa478
> [XFS] Move platform specific mount option parse out of core XFS code

around that... not exactly? That commit should have been largely a code
move, which is not to say that it can't contain a bug... :)

> when I attempt to mount /media:
> /dev/md_d0p1 /media xfs rw,nobarrier,noatime,logdev=/dev/md_d0p2,allocsize=512m 0 0

mythbox? :)

Hm, so it's the external log size that it doesn't much like...

> I get:
> md_d0: p1 p2
> XFS mounting filesystem md_d0p1
> attempt to access beyond end of device
> md_d0p2: rw=0, want=195311, limit=195304

What does /proc/partitions say about md_d0p1 and p2? Is it different
between the older & newer kernel?

What does xfs_info /mount/point say about the filesystem when you mount
it under the older kernel? Or, if you can't mount it:

xfs_db -r -c "sb 0" -c p /dev/md_d0p1

-Eric
* Re: Regression- XFS won't mount on partitioned md array
From: David Greaves @ 2008-05-17 14:46 UTC (permalink / raw)
To: Eric Sandeen
Cc: David Chinner, LinuxRaid, xfs, 'linux-kernel@vger.kernel.org'

Eric Sandeen wrote:
> around that... not exactly? That commit should have been largely a code
> move, which is not to say that it can't contain a bug... :)

I got to within 4 commits on the bisect, and then my xfs partition containing
the kernel src and the bisect history blew up: telling me that files were
directories and then exploding in a heap of lost+found/ fragments. Quite,
erm, "interesting" really. At that point I decided I was close enough to ask
for advice, looked at the commits and took this one as the most likely to
cause the bug :)

But, thinking about it, I can decode the kernel extraversion tags in /boot.
From that I think my bisect bounds were:

  40ebd81d1a7635cf92a59c387a599fce4863206b
  [XFS] Use kernel-supplied "roundup_pow_of_two" for simplicity
and:
  3ed6526441053d79b85d206b14d75125e6f51cc2
  [XFS] Implement fallocate.

so those bound:

  [XFS] Remove the BPCSHIFT and NB* based macros from XFS.
  [XFS] Remove bogus assert
  [XFS] optimize XFS_IS_REALTIME_INODE w/o realtime config
  [XFS] Move platform specific mount option parse out of core XFS code

and just glancing through the patches I didn't see any changes that looked
likely in the others...

> mythbox? :)

Hey - we test some interesting corner cases... :)
My *wife* just told *me* to buy, and I quote, "no more than 10" 1Tb Samsung
drives... I decided 5 would be plenty.

> Hm, so it's the external log size that it doesn't much like...

Yep - I noticed that - and ISTR that Neil has been fiddling in the md
partitioning code over the last 6 months or so. I wondered where it got the
larger figure from and whether md was somehow changing the partition size...

> what does /proc/partitions say about md_d0p1 and p2? Is it different
> between the older & newer kernel?

2.6.20.7 (good)
 254     0 1250241792 md_d0
 254     1 1250144138 md_d0p1
 254     2      97652 md_d0p2

2.6.25.3 (bad)
 254     0 1250241792 md_d0
 254     1 1250144138 md_d0p1
 254     2      97652 md_d0p2

2.6.25.4 (bad)
 254     0 1250241792 md_d0
 254     1 1250144138 md_d0p1
 254     2      97652 md_d0p2

So nothing obvious there then...

> What does xfs_info /mount/point say about the filesystem when you mount
> it under the older kernel? Or, if you can't mount it,

teak:~# xfs_info /media/
meta-data=/dev/md_d0p1     isize=256    agcount=32, agsize=9766751 blks
         =                 sectsz=512   attr=0
data     =                 bsize=4096   blocks=312536032, imaxpct=25
         =                 sunit=0      swidth=0 blks
naming   =version 2        bsize=4096
log      =external         bsize=4096   blocks=24413, version=2
         =                 sectsz=512   sunit=0 blks, lazy-count=0
realtime =none             extsz=65536  blocks=0, rtextents=0
* Re: Regression- XFS won't mount on partitioned md array
From: Eric Sandeen @ 2008-05-17 15:15 UTC (permalink / raw)
To: David Greaves
Cc: David Chinner, LinuxRaid, xfs, 'linux-kernel@vger.kernel.org'

David Greaves wrote:
> Eric Sandeen wrote:
>> what does /proc/partitions say about md_d0p1 and p2? Is it different
>> between the older & newer kernel?
...
> 2.6.25.4 (bad)
>  254     0 1250241792 md_d0
>  254     1 1250144138 md_d0p1
>  254     2      97652 md_d0p2
>
> So nothing obvious there then...
>
> teak:~# xfs_info /media/
> [...]
> log      =external         bsize=4096   blocks=24413, version=2

ok, and with:

> Partition Table for /dev/md_d0
>
>                  First       Last
>  # Type      Sector      Sector   Offset    Length   Filesystem Type (ID)   Flag
> -- ------- ----------- ----------- ------ ----------- -------------------- ----
>  1 Primary           0  2500288279      4  2500288280 Linux (83)           None
>  2 Primary  2500288280  2500483583      0      195304 Non-FS data (DA)     None

So, xfs thinks the external log is 24413 4k blocks (from the sb geometry
printed by xfs_info). This is 97652 1k units (matching your /proc/partitions
output) and 195304 512-byte sectors (matching the partition table output).
So that all looks consistent.

So if xfs is doing:

>>> md_d0p2: rw=0, want=195311, limit=195304
>>> XFS: empty log check failed

it surely does seem to be trying to read past the end of what even it thinks
is the end of its log. And, with your geometry, I can reproduce this w/o md,
partitioned or not. So it looks like xfs itself is busted:

loop5: rw=0, want=195311, limit=195304

I'll see if I have a little time today to track down the problem.

Thanks,
-Eric
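Eric's unit conversion can be checked mechanically; this is just the arithmetic from the message above, converting the XFS log size (4 KiB filesystem blocks) into the 1 KiB units reported by /proc/partitions and the 512-byte sectors reported by the partition table:

```python
log_blocks = 24413              # from xfs_info: external log, bsize=4096
block_size = 4096

log_bytes = log_blocks * block_size
kib_units = log_bytes // 1024   # /proc/partitions reports 1 KiB units
sectors = log_bytes // 512      # the partition table reports 512-byte sectors

assert kib_units == 97652       # matches /proc/partitions for md_d0p2
assert sectors == 195304        # matches the partition table length
```

All three views of the log device agree, which is why the blame lands on the reader (XFS) rather than on the md partition geometry.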
* Mechanism to safely force repair of single md stripe w/o hurting data integrity of file system
From: David Lethe @ 2008-05-17 19:10 UTC (permalink / raw)
To: LinuxRaid, linux-kernel

I'm trying to figure out a mechanism to safely repair a stripe of data when I
know a particular disk has an unrecoverable read error at a certain physical
block (for 2.6 kernels).

My original plan was to figure out the range of blocks in the md device that
uses the known bad block, force a raw read on the physical device covering
the entire chunk, and let the md driver do all of the work.

Well, this didn't pan out. Problems include: if the bad block maps to the
parity block in a stripe, then md won't necessarily read/verify parity; and
in cases where you are running RAID1, load balancing might result in the
kernel reading the block from the good disk instead.

So the degree of difficulty is much higher than I expected. I prefer not to
patch kernels, due to maintenance issues, the desire for the technique to
work across numerous kernels and patch revisions, and, frankly, the odds
that I would screw it up. An application-level program that can be invoked
as necessary would be ideal.

As such, anybody up to the challenge of writing the code? I want it enough
to paypal somebody $500 who can write it, and will gladly open source the
solution.

(And to clarify why: I know physical block x on disk y is bad before the O/S
reads the block, and just want to rebuild the stripe, not the entire md
device, when this happens. I must not compromise any file system data,
cached or non-cached, that is built on the md device. I have a system with
>100TB, and if I did a rebuild every time I discovered a bad block
somewhere, a full parity repair would never complete before another physical
bad block was discovered.)

Contact me offline for the financial details, but I would certainly
appreciate some thread discussion on an appropriate architecture. At least
it is my opinion that such capability should eventually be native Linux, but
as long as there is a program that can be run on demand that doesn't require
rebuilding or patching kernels, that is all I need.

David @ santools.com
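The first step of the plan described above (mapping a bad chunk on one member disk back to a range of md-device sectors) can be sketched as arithmetic. This is illustrative only, under a deliberately simplified assumption: a RAID5 layout with no data offset, where each stripe holds (ndisks - 1) data chunks. Real md layouts rotate parity across disks, which is exactly why the plan "didn't pan out" when the bad block lands on a parity chunk; the function name and parameters here are hypothetical, not part of md or the thread:

```python
def md_range_for_bad_chunk(disk_sector, chunk_sectors, ndisks):
    """Map a bad sector on one RAID5 member disk to the range of
    md-device sectors whose stripe contains that chunk (simplified:
    no data offset, parity rotation ignored).  Reading back this
    whole range would force md to touch every chunk of the stripe."""
    stripe = disk_sector // chunk_sectors         # stripe index on the member
    data_per_stripe = (ndisks - 1) * chunk_sectors
    start = stripe * data_per_stripe              # first md sector of the stripe
    return start, start + data_per_stripe

# Example: bad sector 1_000_000 on a member, 64 KiB chunks (128 sectors), 5 disks
start, end = md_range_for_bad_chunk(1_000_000, 128, 5)
assert (start, end) == (3_999_744, 4_000_256)
```

As the message notes, this mapping alone is not safe: it cannot guarantee md reads (rather than reconstructs around) the bad chunk, which is what motivates the kernel-side solution later in the thread.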
* Re: Mechanism to safely force repair of single md stripe w/o hurting data integrity of file system
From: Peter Rabbitson @ 2008-05-17 19:29 UTC (permalink / raw)
To: David Lethe; +Cc: LinuxRaid, linux-kernel

David Lethe wrote:
> I'm trying to figure out a mechanism to safely repair a stripe of data
> when I know a particular disk has an unrecoverable read error at a
> certain physical block (for 2.6 kernels)
>
> <snip>
>
> As such, anybody up to the challenge of writing the code? I want it
> enough to paypal somebody $500 who can write it, and will gladly open
> source the solution.

Damn, here goes $500 :) Unfortunately the only thing I can bring to the
table is a thread[1] about a mechanism that would fit your request nicely.
Hopefully someone will pick this stuff up and make it a reality.

Peter

[1] http://marc.info/?l=linux-raid&m=120605458309825
* RE: Mechanism to safely force repair of single md stripe w/o hurting data integrity of file system
From: Guy Watkins @ 2008-05-17 20:26 UTC (permalink / raw)
To: 'David Lethe', 'LinuxRaid', linux-kernel

} From: linux-raid-owner@vger.kernel.org On Behalf Of David Lethe
} Sent: Saturday, May 17, 2008 3:10 PM
}
} I'm trying to figure out a mechanism to safely repair a stripe of data
} when I know a particular disk has a unrecoverable read error at a
} certain physical block (for 2.6 kernels)
} [...]
} Contact me offline for the financial details, but I would certainly
} appreciate some thread discussion on an appropriate architecture.
}
} David @ santools.com

I thought this would cause md to read all blocks in an array:

echo repair > /sys/block/md0/md/sync_action

And rewrite any blocks that can't be read.

In the old days, md would kick out a disk on a read error. When you added
it back, md would rewrite everything on that disk, which corrected read
errors.

Guy
* RE: Mechanism to safely force repair of single md stripe w/o hurting data integrity of file system
From: Jan Engelhardt @ 2008-05-26 11:17 UTC (permalink / raw)
To: Guy Watkins; +Cc: 'David Lethe', 'LinuxRaid', linux-kernel

On Saturday 2008-05-17 22:26, Guy Watkins wrote:
>
> I thought this would cause md to read all blocks in an array:
> echo repair > /sys/block/md0/md/sync_action
>
> And rewrite any blocks that can't be read.
>
> In the old days, md would kick out a disk on a read error. When you added
> it back, md would rewrite everything on that disk, which corrected read
> errors.

With a write-intent bitmap (`mdadm -G /dev/mdX -b internal`, or during -C),
it should resync less after an unwarranted kick.
* Re: Mechanism to safely force repair of single md stripe w/o hurting data integrity of file system
From: Neil Brown @ 2008-05-19 2:54 UTC (permalink / raw)
To: David Lethe; +Cc: LinuxRaid, linux-kernel

On Saturday May 17, david@santools.com wrote:
> I'm trying to figure out a mechanism to safely repair a stripe of data
> when I know a particular disk has a unrecoverable read error at a
> certain physical block (for 2.6 kernels)
> [...]
> So the degree of difficulty is much higher than I expected. I prefer
> not to patch kernels due to maintenance issues as well as desire for the
> technique to work across numerous kernels and patch revisions, and
> frankly, the odds are I would screw it up. An application-level program
> that can be invoked as necessary would be ideal.

This shouldn't be a problem. You write a patch, submit it for review, it
gets reviewed and eventually submitted to mainline. Then it will work on
all new kernels, and any screw ups that you make will be caught by someone
else (me possibly).

> As such, anybody up to the challenge of writing the code? I want it
> enough to paypal somebody $500 who can write it, and will gladly open
> source the solution.

It is largely done. If you write a number to /sys/block/mdXX/md/sync_max,
then recovery will stop when it gets there. If you write 'check' to
/sys/block/mdXX/md/sync_action, then it will read all blocks and
auto-correct any unrecoverable read errors. You just need some way to set
the start point of the resync. Probably just create a sync_min attribute -
see lightly tested patch below.

If this fits your needs, I'm sure www.compassion.com would be happy with
your $500.

To use this:
 1/ Write the end address (sectors) to sync_max
 2/ Write the start address (sectors) to sync_min
 3/ Write 'check' to sync_action
 4/ Monitor sync_completed until it reaches sync_max
 5/ Write 'idle' to sync_action

NeilBrown

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/md.c           |   46 +++++++++++++++++++++++++++++++++++++++++---
 ./include/linux/raid/md_k.h |    2 +
 2 files changed, 45 insertions(+), 3 deletions(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c	2008-05-19 11:04:11.000000000 +1000
+++ ./drivers/md/md.c	2008-05-19 12:43:29.000000000 +1000
@@ -277,6 +277,7 @@ static mddev_t * mddev_find(dev_t unit)
 	spin_lock_init(&new->write_lock);
 	init_waitqueue_head(&new->sb_wait);
 	new->reshape_position = MaxSector;
+	new->resync_min = 0;
 	new->resync_max = MaxSector;
 	new->level = LEVEL_NONE;
@@ -3074,6 +3075,37 @@ sync_completed_show(mddev_t *mddev, char
 static struct md_sysfs_entry md_sync_completed = __ATTR_RO(sync_completed);

 static ssize_t
+min_sync_show(mddev_t *mddev, char *page)
+{
+	return sprintf(page, "%llu\n",
+		       (unsigned long long)mddev->resync_min);
+}
+static ssize_t
+min_sync_store(mddev_t *mddev, const char *buf, size_t len)
+{
+	char *ep;
+	unsigned long long min = simple_strtoull(buf, &ep, 10);
+	if (ep == buf || (*ep != 0 && *ep != '\n'))
+		return -EINVAL;
+	if (min > mddev->resync_max)
+		return -EINVAL;
+	if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
+		return -EBUSY;
+
+	/* Must be a multiple of chunk_size */
+	if (mddev->chunk_size) {
+		if (min & (sector_t)((mddev->chunk_size>>9)-1))
+			return -EINVAL;
+	}
+	mddev->resync_min = min;
+
+	return len;
+}
+
+static struct md_sysfs_entry md_min_sync =
+__ATTR(sync_min, S_IRUGO|S_IWUSR, min_sync_show, min_sync_store);
+
+static ssize_t
 max_sync_show(mddev_t *mddev, char *page)
 {
 	if (mddev->resync_max == MaxSector)
@@ -3092,6 +3124,9 @@ max_sync_store(mddev_t *mddev, const cha
 		unsigned long long max = simple_strtoull(buf, &ep, 10);
 		if (ep == buf || (*ep != 0 && *ep != '\n'))
 			return -EINVAL;
+		if (max < mddev->resync_min)
+			return -EINVAL;
+
 		if (max < mddev->resync_max &&
 		    test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
 			return -EBUSY;
@@ -3103,7 +3138,8 @@ max_sync_store(mddev_t *mddev, const cha
 		}
 		mddev->resync_max = max;
 	}
-	wake_up(&mddev->recovery_wait);
+	if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
+		wake_up(&mddev->recovery_wait);
 	return len;
 }
@@ -3221,6 +3257,7 @@ static struct attribute *md_redundancy_a
 	&md_sync_speed.attr,
 	&md_sync_force_parallel.attr,
 	&md_sync_completed.attr,
+	&md_min_sync.attr,
 	&md_max_sync.attr,
 	&md_suspend_lo.attr,
 	&md_suspend_hi.attr,
@@ -3776,6 +3813,7 @@ static int do_md_stop(mddev_t * mddev, i
 		mddev->size = 0;
 		mddev->raid_disks = 0;
 		mddev->recovery_cp = 0;
+		mddev->resync_min = 0;
 		mddev->resync_max = MaxSector;
 		mddev->reshape_position = MaxSector;
 		mddev->external = 0;
@@ -5622,9 +5660,11 @@ void md_do_sync(mddev_t *mddev)
 			max_sectors = mddev->resync_max_sectors;
 		mddev->resync_mismatches = 0;
 		/* we don't use the checkpoint if there's a bitmap */
-		if (!mddev->bitmap &&
-		    !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery))
+		if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery))
+			j = mddev->resync_min;
+		else if (!mddev->bitmap)
 			j = mddev->recovery_cp;
+
 	} else if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery))
 		max_sectors = mddev->size << 1;
 	else {

diff .prev/include/linux/raid/md_k.h ./include/linux/raid/md_k.h
--- .prev/include/linux/raid/md_k.h	2008-05-19 11:04:11.000000000 +1000
+++ ./include/linux/raid/md_k.h	2008-05-19 12:35:52.000000000 +1000
@@ -227,6 +227,8 @@ struct mddev_s
 	atomic_t		recovery_active; /* blocks scheduled, but not written */
 	wait_queue_head_t	recovery_wait;
 	sector_t		recovery_cp;
+	sector_t		resync_min;	/* user request sync starts
+						 * here */
 	sector_t		resync_max;	/* resync should pause
 						 * when it gets here */
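The patch above rejects a sync_min that is not a multiple of the array's chunk size (`min & ((chunk_size>>9)-1)` must be zero), so a caller has to round a known-bad sector down to a chunk boundary before writing the five sysfs values in Neil's usage steps. A small sketch of that rounding (the helper name is hypothetical, not part of md):

```python
def chunk_aligned_window(bad_sector, chunk_size_bytes, window_chunks=1):
    """Return (sync_min, sync_max) in 512-byte sectors, both aligned to
    the md chunk size, covering the chunk that contains bad_sector.
    These values would be written to /sys/block/mdXX/md/sync_min and
    .../sync_max before writing 'check' to sync_action."""
    chunk_sectors = chunk_size_bytes // 512
    start = (bad_sector // chunk_sectors) * chunk_sectors  # round down
    end = start + window_chunks * chunk_sectors
    return start, end

# Example: 64 KiB chunks, bad sector 195310 on the array
lo, hi = chunk_aligned_window(195310, 64 * 1024)
assert (lo, hi) == (195200, 195328)
assert lo % (64 * 1024 // 512) == 0   # passes the patch's alignment check
```

With those values in hand, the procedure is `echo $hi > sync_max`, `echo $lo > sync_min`, `echo check > sync_action`, then poll sync_completed, exactly as the numbered steps describe.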
* Re: Regression- XFS won't mount on partitioned md array
From: Eric Sandeen @ 2008-05-17 23:18 UTC (permalink / raw)
To: David Greaves
Cc: David Chinner, LinuxRaid, xfs, 'linux-kernel@vger.kernel.org'

Eric Sandeen wrote:
> I'll see if I have a little time today to track down the problem.

Does this patch fix it for you? It does for me, though I can't yet explain
why ;)

http://www.linux.sgi.com/archives/xfs/2008-05/msg00190.html

-Eric
* Re: Regression- XFS won't mount on partitioned md array
From: David Greaves @ 2008-05-18 8:48 UTC (permalink / raw)
To: Eric Sandeen
Cc: David Chinner, xfs, 'linux-kernel@vger.kernel.org', Christoph Hellwig, LinuxRaid

Eric Sandeen wrote:
>> Does this patch fix it for you? Does for me though I can't yet explain
>> why ;)
>>
>> http://www.linux.sgi.com/archives/xfs/2008-05/msg00190.html
>>
>> -Eric

Yes, this fixes it for me - thanks :)

> So what's happening is that xfs is trying to read a page-sized IO from
> the last sector of the log... which goes off the end of the device.
> This looks like another regression introduced by
> a9759f2de38a3443d5107bddde03b4f3f550060e, but fixed by Christoph's patch
> in the URL above, which should be headed towards -stable.

Damn, I guess I misread my bisect readings when things crashed then.
Still, I said 'around' :)

> (aside: it seems that this breaks any external log setup where the log
> consists of the entire device... but I'd have expected the xfsqa suite
> to catch this...?)
>
> The patch avoids the problem by looking for some extra locking but it
> seems to me that the root cause is that the buffer being read at this
> point doesn't have its b_offset, the offset in its page, set. Might
> be another little buglet but harmless it seems.

mmmm...
'little buglets' in the filesystem holding a few Tb of data...
mmmm...
Anything I can do to help find that? I suspect not, if you can reproduce it.

Anyhow - thanks again.

David

PS I'll be back soon; back in 2.6.23 I was hitting a hibernate/xfs bug which
I've been avoiding by powering down. Well, it's still there in 2.6.25.3...
* Re: Regression- XFS won't mount on partitioned md array
2008-05-18 8:48 ` David Greaves
@ 2008-05-18 15:38 ` Eric Sandeen
2008-05-24 13:33 ` RFI for 2.6.25.5 : " David Greaves
1 sibling, 0 replies; 22+ messages in thread
From: Eric Sandeen @ 2008-05-18 15:38 UTC (permalink / raw)
To: David Greaves
Cc: David Chinner, xfs, 'linux-kernel@vger.kernel.org', Christoph Hellwig, LinuxRaid

David Greaves wrote:
> Eric Sandeen wrote:
> mmmm
> 'little buglets' in the filesystem holding a few Tb of data...
> mmmm
> Anything I can do to help find that? I suspect not if you can reproduce it.

Nah. I'll ask the sgi guys about it; it just seems a little inconsistent,
but maybe by design...

-Eric

> Anyhow - thanks again.
>
> David
> PS I'll be back soon, back in 2.6.23 I was hitting a hibernate/xfs bug which
> I've been avoiding by powering down. Well, it's still there in 2.6.25.3...
* RFI for 2.6.25.5 : Re: Regression- XFS won't mount on partitioned md array
2008-05-18 8:48 ` David Greaves
2008-05-18 15:38 ` Eric Sandeen
@ 2008-05-24 13:33 ` David Greaves
2008-05-24 13:52 ` Willy Tarreau
1 sibling, 1 reply; 22+ messages in thread
From: David Greaves @ 2008-05-24 13:33 UTC (permalink / raw)
To: Greg KH
Cc: Eric Sandeen, David Chinner, xfs, 'linux-kernel@vger.kernel.org', Christoph Hellwig, LinuxRaid

Hi Greg

Perusing:
http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git
doesn't show the patch referenced below as in the queue for 2.6.25.5

David

David Greaves wrote:
> Eric Sandeen wrote:
>> Eric Sandeen wrote:
>>> Eric Sandeen wrote:
>>>
>>>> I'll see if I have a little time today to track down the problem.
>>> Does this patch fix it for you? Does for me though I can't yet explain
>>> why ;)
>>>
>>> http://www.linux.sgi.com/archives/xfs/2008-05/msg00190.html
>>>
>>> -Eric
> Yes, this fixes it for me - thanks :)
>
>> So what's happening is that xfs is trying to read a page-sized IO from
>> the last sector of the log... which goes off the end of the device.
>> This looks like another regression introduced by
>> a9759f2de38a3443d5107bddde03b4f3f550060e, but fixed by Christoph's patch
>> in the URL above, which should be headed towards -stable.
> Damn, I guess I misread my bisect readings when things crashed then.
> Still, I said 'around' :)
>
>> (aside: it seems that this breaks any external log setup where the log
>> consists of the entire device... but I'd have expected the xfsqa suite
>> to catch this...?)
>>
>> The patch avoids the problem by looking for some extra locking but it
>> seems to me that the root cause is that the buffer being read at this
>> point doesn't have its b_offset, the offset in its page, set. Might
>> be another little buglet but harmless it seems.
* Re: RFI for 2.6.25.5 : Re: Regression- XFS won't mount on partitioned md array
2008-05-24 13:33 ` RFI for 2.6.25.5 : " David Greaves
@ 2008-05-24 13:52 ` Willy Tarreau
2008-05-24 15:39 ` Eric Sandeen
0 siblings, 1 reply; 22+ messages in thread
From: Willy Tarreau @ 2008-05-24 13:52 UTC (permalink / raw)
To: David Greaves
Cc: Greg KH, Eric Sandeen, David Chinner, xfs, 'linux-kernel@vger.kernel.org', Christoph Hellwig, LinuxRaid, stable

Hi David,

On Sat, May 24, 2008 at 02:33:35PM +0100, David Greaves wrote:
> Hi Greg
> Perusing:
> http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git
> doesn't show the patch referenced below as in the queue for 2.6.25.5

First, please avoid top-posting.

> David Greaves wrote:
> > Eric Sandeen wrote:
> >> Eric Sandeen wrote:
> >>> Eric Sandeen wrote:
> >>>
> >>>> I'll see if I have a little time today to track down the problem.
> >>> Does this patch fix it for you? Does for me though I can't yet explain
> >>> why ;)
> >>>
> >>> http://www.linux.sgi.com/archives/xfs/2008-05/msg00190.html
> >>>
> >>> -Eric
> > Yes, this fixes it for me - thanks :)
> >
> >> So what's happening is that xfs is trying to read a page-sized IO from
> >> the last sector of the log... which goes off the end of the device.
> >> This looks like another regression introduced by
> >> a9759f2de38a3443d5107bddde03b4f3f550060e, but fixed by Christoph's patch
> >> in the URL above, which should be headed towards -stable.
> > Damn, I guess I misread my bisect readings when things crashed then.
> > Still, I said 'around' :)
> >
> >> (aside: it seems that this breaks any external log setup where the log
> >> consists of the entire device... but I'd have expected the xfsqa suite
> >> to catch this...?)
> >>
> >> The patch avoids the problem by looking for some extra locking but it
> >> seems to me that the root cause is that the buffer being read at this
> >> point doesn't have its b_offset, the offset in its page, set. Might
> >> be another little buglet but harmless it seems.

It would have helped to CC stable (fixed) and to give the mainline commit
ID, since the stable branch only holds already merged patches. Greg, the
commit is 6ab455ee...

Willy
* Re: RFI for 2.6.25.5 : Re: Regression- XFS won't mount on partitioned md array
2008-05-24 13:52 ` Willy Tarreau
@ 2008-05-24 15:39 ` Eric Sandeen
0 siblings, 0 replies; 22+ messages in thread
From: Eric Sandeen @ 2008-05-24 15:39 UTC (permalink / raw)
To: Willy Tarreau
Cc: David Greaves, Greg KH, David Chinner, xfs, 'linux-kernel@vger.kernel.org', Christoph Hellwig, LinuxRaid, stable

Willy Tarreau wrote:
> It would have helped to CC stable (fixed) and to give the mainline commit
> ID since the stable branch only holds already merged patches. Greg, the
> commit is 6ab455ee...
>
> Willy

Yup, I'll agree that this should probably go to -stable unless hch or
dchinner disagree. FWIW I've already put it in the Fedora kernels.

-Eric
* RE: Mechanism to safely force repair of single md stripe w/o hurting data integrity of file system
@ 2008-05-17 21:30 David Lethe
2008-05-17 23:16 ` Roger Heflin
0 siblings, 1 reply; 22+ messages in thread
From: David Lethe @ 2008-05-17 21:30 UTC (permalink / raw)
To: Guy Watkins, 'LinuxRaid', linux-kernel

It will. But that defeats the purpose. I want to limit repair to only the
raid stripe that utilizes a specific disk with a block that I know has an
unrecoverable read error.

-----Original Message-----

From: "Guy Watkins" <linux-raid@watkins-home.com>
Subj: RE: Mechanism to safely force repair of single md stripe w/o hurting data integrity of file system
Date: Sat May 17, 2008 3:28 pm
Size: 2K
To: "'David Lethe'" <david@santools.com>; "'LinuxRaid'" <linux-raid@vger.kernel.org>; "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>

} -----Original Message-----
} From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
} owner@vger.kernel.org] On Behalf Of David Lethe
} Sent: Saturday, May 17, 2008 3:10 PM
} To: LinuxRaid; linux-kernel@vger.kernel.org
} Subject: Mechanism to safely force repair of single md stripe w/o hurting
} data integrity of file system
}
} I'm trying to figure out a mechanism to safely repair a stripe of data
} when I know a particular disk has an unrecoverable read error at a
} certain physical block (for 2.6 kernels).
}
} My original plan was to figure out the range of blocks in the md device
} that utilizes the known bad block and force a raw read on the physical
} device that covers the entire chunk, and let the md driver do all of the
} work.
}
} Well, this didn't pan out. Problems include issues where, if the bad block
} maps to the parity block in a stripe, then md won't necessarily
} read/verify parity; and in cases where you are running RAID1, load
} balancing might result in the kernel reading the bad block from the good
} disk.
}
} So the degree of difficulty is much higher than I expected. I prefer
} not to patch kernels, due to maintenance issues as well as the desire for
} the technique to work across numerous kernels and patch revisions, and,
} frankly, the odds are I would screw it up. An application-level program
} that can be invoked as necessary would be ideal.
}
} As such, anybody up to the challenge of writing the code? I want it
} enough to paypal somebody $500 who can write it, and will gladly open
} source the solution.
}
} (And to clarify why: I know physical block x on disk y is bad before the
} O/S reads the block, and just want to rebuild the stripe, not the entire
} md device, when this happens. I must not compromise any file system data,
} cached or non-cached, that is built on the md device. I have a system with
} >100TB, and if I did a rebuild every time I discovered a bad block
} somewhere, then a full parity repair would never complete before another
} physical bad block is discovered.)
}
} Contact me offline for the financial details, but I would certainly
} appreciate some thread discussion on an appropriate architecture. At
} least it is my opinion that such capability should eventually be native
} Linux, but as long as there is a program that can be run on demand that
} doesn't require rebuilding or patching kernels, then that is all I need.
}
} David @ santools.com

I thought this would cause md to read all blocks in an array:
echo repair > /sys/block/md0/md/sync_action

And rewrite any blocks that can't be read.

In the old days, md would kick out a disk on a read error. When you added
it back, md would rewrite everything on that disk, which corrected read
errors.

Guy
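David's first step - figuring out which md stripe a known-bad member sector belongs to, and which range of array sectors that stripe carries - is plain arithmetic, and it is the same regardless of RAID5 parity rotation, since parity placement doesn't change which array range a stripe covers. A rough sketch (the helper name and defaults are hypothetical; it assumes one chunk per member per stripe and no superblock data offset, true for old 0.90-metadata arrays, while newer metadata requires subtracting the member's data offset first):

```python
def stripe_for_bad_sector(bad_sector, chunk_kib=64, ndisks=4,
                          sector_size=512):
    """Map a bad physical sector on one RAID5 member disk to the md
    stripe that uses it, and to the array-sector range that stripe
    carries (the range one would want to force a repair over)."""
    chunk_sectors = chunk_kib * 1024 // sector_size
    stripe = bad_sector // chunk_sectors           # stripe index on the member
    data_chunks = ndisks - 1                       # RAID5: one chunk is parity
    start = stripe * data_chunks * chunk_sectors   # first array sector covered
    end = start + data_chunks * chunk_sectors      # one past the last
    return stripe, start, end

# e.g. bad sector 1_000_000 on a member of a 4-disk array, 64 KiB chunks
stripe, lo, hi = stripe_for_bad_sector(1_000_000)
print(stripe, lo, hi)
```

The returned range is the part of the array a targeted repair would need to re-read and re-parity - which, as Guy's reply notes, the `repair` sync_action of this era can only do for the whole array.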
* Re: Mechanism to safely force repair of single md stripe w/o hurting data integrity of file system
2008-05-17 21:30 Mechanism to safely force repair of single md stripe w/o hurting data integrity of file system David Lethe
@ 2008-05-17 23:16 ` Roger Heflin
0 siblings, 0 replies; 22+ messages in thread
From: Roger Heflin @ 2008-05-17 23:16 UTC (permalink / raw)
To: David; +Cc: Guy Watkins, 'LinuxRaid', linux-kernel

David Lethe wrote:
> It will. But that defeats the purpose. I want to limit repair to only the
> raid stripe that utilizes a specific disk with a block that I know has an
> unrecoverable read error.
>
> -----Original Message-----
>
> From: "Guy Watkins" <linux-raid@watkins-home.com>
> Subj: RE: Mechanism to safely force repair of single md stripe w/o hurting data integrity of file system
> Date: Sat May 17, 2008 3:28 pm
> Size: 2K
> To: "'David Lethe'" <david@santools.com>; "'LinuxRaid'" <linux-raid@vger.kernel.org>; "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
>
> } -----Original Message-----
> } From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> } owner@vger.kernel.org] On Behalf Of David Lethe
> } Sent: Saturday, May 17, 2008 3:10 PM
> } To: LinuxRaid; linux-kernel@vger.kernel.org
> } Subject: Mechanism to safely force repair of single md stripe w/o hurting
> } data integrity of file system
> }
> } I'm trying to figure out a mechanism to safely repair a stripe of data
> } when I know a particular disk has an unrecoverable read error at a
> } certain physical block (for 2.6 kernels).
> }
> } My original plan was to figure out the range of blocks in the md device
> } that utilizes the known bad block and force a raw read on the physical
> } device that covers the entire chunk, and let the md driver do all of the
> } work.
> }
> } Well, this didn't pan out. Problems include issues where, if the bad block
> } maps to the parity block in a stripe, then md won't necessarily
> } read/verify parity; and in cases where you are running RAID1, load
> } balancing might result in the kernel reading the bad block from the good
> } disk.
> }
> } So the degree of difficulty is much higher than I expected. I prefer
> } not to patch kernels, due to maintenance issues as well as the desire for
> } the technique to work across numerous kernels and patch revisions, and,
> } frankly, the odds are I would screw it up. An application-level program
> } that can be invoked as necessary would be ideal.
> }
> } As such, anybody up to the challenge of writing the code? I want it
> } enough to paypal somebody $500 who can write it, and will gladly open
> } source the solution.
> }
> } (And to clarify why: I know physical block x on disk y is bad before the
> } O/S reads the block, and just want to rebuild the stripe, not the entire
> } md device, when this happens. I must not compromise any file system data,
> } cached or non-cached, that is built on the md device. I have a system with
> } >100TB, and if I did a rebuild every time I discovered a bad block
> } somewhere, then a full parity repair would never complete before another
> } physical bad block is discovered.)
> }
> } Contact me offline for the financial details, but I would certainly
> } appreciate some thread discussion on an appropriate architecture. At
> } least it is my opinion that such capability should eventually be native
> } Linux, but as long as there is a program that can be run on demand that
> } doesn't require rebuilding or patching kernels, then that is all I need.
> }
> } David @ santools.com
>
> I thought this would cause md to read all blocks in an array:
> echo repair > /sys/block/md0/md/sync_action
>
> And rewrite any blocks that can't be read.
>
> In the old days, md would kick out a disk on a read error. When you added
> it back, md would rewrite everything on that disk, which corrected read
> errors.
>
> Guy

I bet $500 is well below minimum wage in the US for the number of hours it
would take someone to do this.

And I would say that if you have >100TB in a single raid5/6, that would
mean you had to have at least 100 disks in that array, and most people get
nervous at >8-16 disks in either raid5 or raid6 arrays. With the statistics
of disks going bad, the chance of a rebuild succeeding before another
disk/block goes bad gets smaller and smaller as the number of disks
increases; as you have noted, you are at the point where it becomes
unlikely that the rebuild will ever complete even with good disks in the
array. Most people build a number of smaller raid5/raid6 arrays and then
LVM them together to get around this issue. And on top of that, the larger
the number of disks, the greater the IO required to do a rebuild, so the
slower the rebuild potentially is. And that is assuming that you don't have
a bad batch of disks with an abnormally high failure rate.

I know of hardware disk arrays that handle the bad block issue by
allocating (on initial array construction) a set of spare blocks on each
disk. On finding a bad block on a disk, they relocate and rebuild just the
bad block from the stripe/parity and somehow note that the block on the bad
disk has been relocated. After some number of bad blocks on a given disk,
they note that the given disk has too many bad blocks and that you should
"clone" and then fail the original disk over to the cloned disk once the
clone is finished. This sort of thing would seem to be rather non-trivial,
though if someone would set up a clone of the bad disk and rebuild the bad
sector, this would probably cut down the amount of time/IO required to
complete a rebuild - though it would still take several hours, and things
would get more complicated if you had another failure during that process.

Roger
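Roger's point about rebuild odds shrinking with array size can be made concrete with a back-of-the-envelope calculation. This is only an illustration, assuming the commonly quoted 10^-14 unrecoverable-read-error rate per bit and independent errors, which real disks only approximate:

```python
def rebuild_success_probability(ndisks, disk_tb, ber=1e-14):
    """Probability of reading all surviving (ndisks-1) members of a
    degraded RAID5 without hitting a single unrecoverable read error."""
    bits_read = (ndisks - 1) * disk_tb * 1e12 * 8   # decimal TB -> bits
    return (1.0 - ber) ** bits_read

# A modest 4x1TB array vs. something in the >100TB class discussed above
print(rebuild_success_probability(4, 1.0))
print(rebuild_success_probability(101, 1.0))
```

Under these assumptions even a small 4x1TB array has only around a 79% chance of a clean degraded-mode rebuild, and at ~100 members the probability is effectively nil - which is the argument for splitting storage into several smaller arrays joined by LVM.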
end of thread, other threads:[~2008-05-26 11:17 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-05-16 17:11 Regression- XFS won't mount on partitioned md array David Greaves
2008-05-16 17:16 ` Justin Piszcz
2008-05-16 18:05 ` David Greaves
2008-05-16 18:35 ` Oliver Pinter
2008-05-17 14:48 ` David Greaves
2008-05-17 15:20 ` David Greaves
2008-05-16 18:59 ` Eric Sandeen
2008-05-17 14:46 ` David Greaves
2008-05-17 15:15 ` Eric Sandeen
2008-05-17 19:10 ` Mechanism to safely force repair of single md stripe w/o hurting data integrity of file system David Lethe
2008-05-17 19:29 ` Peter Rabbitson
2008-05-17 20:26 ` Guy Watkins
2008-05-26 11:17 ` Jan Engelhardt
2008-05-19 2:54 ` Neil Brown
2008-05-17 23:18 ` Regression- XFS won't mount on partitioned md array Eric Sandeen
[not found] ` <482FBD4C.20608@sandeen.net>
2008-05-18 8:48 ` David Greaves
2008-05-18 15:38 ` Eric Sandeen
2008-05-24 13:33 ` RFI for 2.6.25.5 : " David Greaves
2008-05-24 13:52 ` Willy Tarreau
2008-05-24 15:39 ` Eric Sandeen
-- strict thread matches above, loose matches on Subject: below --
2008-05-17 21:30 Mechanism to safely force repair of single md stripe w/o hurting data integrity of file system David Lethe
2008-05-17 23:16 ` Roger Heflin