* I/O errors without erros from underlying device
@ 2015-12-07 16:05 Arkadiusz Miskiewicz
2015-12-07 16:37 ` John Stoffel
0 siblings, 1 reply; 8+ messages in thread
From: Arkadiusz Miskiewicz @ 2015-12-07 16:05 UTC (permalink / raw)
To: linux-raid
Hi.
4.3.0 kernel, raid6 array:
md7 : active raid6 sdg[10] sdad1[9] sdac1[8] sdag1[7] sdaf1[6] sdae1[5] sdaj1[4] sdai1[3] sdah1[2] sdn1[1]
31255089152 blocks super 1.2 level 6, 512k chunk, algorithm 2 [10/10] [UUUUUUUUUU]
bitmap: 1/30 pages [4KB], 65536KB chunk
The array had a weird failure where many disks went into the failed state, but
removing and re-adding these disks "fixed" it (it turns out that did not really fix it).
Unfortunately, some reads now fail:
pread(4, 0x1483a00, 4096, 16003680464896) = -1 EIO (Input/output error)
To reproduce, I used xfs_io:
xfs_io -d -c "pread 16003680464896 4096" /dev/md7
pread64: Input/output error
which does the pread exactly as shown above.
Writes also fail for that area:
xfs_io -d -c "pwrite 16003680464896 4096" /dev/md7
pwrite64: Input/output error
Note that nothing is written in dmesg when that happens.
I've tried various pread offsets and sizes, and at some point this was logged:
[ 848.988518] Buffer I/O error on dev md7, logical block 3907148544, async page read
but there was no error from the underlying devices.
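As a rough cross-check of the numbers above, the following Python sketch maps the failing byte offset onto 512-byte sectors, 4096-byte blocks and RAID6 stripes. It is only a sketch under assumptions: that the "logical block" in the dmesg line is counted in 4096-byte units, and that the 10-device RAID6 array carries 8 data chunks per stripe. The function name `locate` is made up for illustration, and the chunk index is a logical position within the stripe's data, not a physical member disk (the parity rotation of "algorithm 2" is not modelled).

```python
# Geometry taken from the mdstat output above; the 4096-byte block size
# for the dmesg "logical block" number is an assumption.
SECTOR_BYTES = 512
BLOCK_BYTES = 4096            # assumed unit of "logical block" in dmesg
CHUNK_BYTES = 512 * 1024      # "512k chunk"
DATA_DISKS = 10 - 2           # RAID6: two chunks per stripe hold P and Q


def locate(offset):
    """Map a byte offset on the array to sector, block and stripe numbers."""
    stripe_bytes = CHUNK_BYTES * DATA_DISKS
    stripe, within_stripe = divmod(offset, stripe_bytes)
    data_chunk, offset_in_chunk = divmod(within_stripe, CHUNK_BYTES)
    return {
        "sector": offset // SECTOR_BYTES,
        "block": offset // BLOCK_BYTES,
        "stripe": stripe,
        "data_chunk": data_chunk,          # logical index, 0..DATA_DISKS-1
        "offset_in_chunk": offset_in_chunk,
    }


if __name__ == "__main__":
    for label, off in (("failing pread", 16003680464896),
                       ("logged block", 3907148544 * BLOCK_BYTES)):
        print(label, locate(off))
```

Under these assumptions the failing pread lands at block 3907148551, seven 4 KiB blocks past the logged block 3907148544, and the logged block falls exactly at the start of the same 512k chunk.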
List of bad blocks:
http://sprunge.us/XSWI
What can I do now?
(Losing data from those few sectors is acceptable if the rest remains readable.)
Thanks,
--
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: I/O errors without erros from underlying device
2015-12-07 16:05 I/O errors without erros from underlying device Arkadiusz Miskiewicz
@ 2015-12-07 16:37 ` John Stoffel
2015-12-07 17:06 ` Arkadiusz Miskiewicz
[not found] ` <201512071803.26434.arekm@maven.pl>
0 siblings, 2 replies; 8+ messages in thread
From: John Stoffel @ 2015-12-07 16:37 UTC (permalink / raw)
To: arekm; +Cc: linux-raid
Arkadiusz> 4.3.0 kernel, raid6 array:
I think there's a bug in the 4.3.x and 4.4-rc3 and lower with block
merges. I ran into these over the weekend, where v4.2.6 was stable,
but anything higher would lock up and crash on me.
So first step would be to make sure you get and test v4.4-rc4.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: I/O errors without erros from underlying device
2015-12-07 16:37 ` John Stoffel
@ 2015-12-07 17:06 ` Arkadiusz Miskiewicz
[not found] ` <201512071803.26434.arekm@maven.pl>
1 sibling, 0 replies; 8+ messages in thread
From: Arkadiusz Miskiewicz @ 2015-12-07 17:06 UTC (permalink / raw)
To: John Stoffel; +Cc: linux-raid
On Monday 07 of December 2015, John Stoffel wrote:
> Arkadiusz> 4.3.0 kernel, raid6 array:
>
> I think there's a bug in the 4.3.x and 4.4-rc3 and lower with block
> merges. I ran into these over the weekend, where v4.2.6 was stable,
> but anything higher would lock up and crash on me.
Well, no crashes here.
> So first step would be to make sure you get and test v4.4-rc4.
Do you know which commit that is?
^ permalink raw reply [flat|nested] 8+ messages in thread
[parent not found: <201512071803.26434.arekm@maven.pl>]
* Re: I/O errors without erros from underlying device
[not found] ` <201512071803.26434.arekm@maven.pl>
@ 2015-12-07 17:23 ` John Stoffel
2015-12-07 20:46 ` Arkadiusz Miskiewicz
0 siblings, 1 reply; 8+ messages in thread
From: John Stoffel @ 2015-12-07 17:23 UTC (permalink / raw)
To: Arkadiusz Miśkiewicz; +Cc: John Stoffel, linux-raid
>>>>> "Arkadiusz" == Arkadiusz Miśkiewicz <arekm@maven.pl> writes:
Arkadiusz> On Monday 07 of December 2015, John Stoffel wrote:
>> I think there's a bug in the 4.3.x and 4.4-rc3 and lower with block
>> merges. I ran into these over the weekend, where v4.2.6 was stable,
>> but anything higher would lock up and crash on me.
Arkadiusz> Well, no crashes here.
That's good. It was hard(er) to hit when I wasn't running KVM VMs at
the same time on the server, and I was running strictly RAID1 disks,
so it's hard to know.
>> So first step would be to make sure you get and test v4.4-rc4.
Arkadiusz> Do you know which commit that is?
Try this, from the master lkml git repository:
2873d32ff493ecbfb7d2c7f56812ab941dda42f4
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: I/O errors without erros from underlying device
2015-12-07 17:23 ` John Stoffel
@ 2015-12-07 20:46 ` Arkadiusz Miskiewicz
2015-12-08 4:02 ` John Stoffel
2015-12-08 11:05 ` Arkadiusz Miskiewicz
0 siblings, 2 replies; 8+ messages in thread
From: Arkadiusz Miskiewicz @ 2015-12-07 20:46 UTC (permalink / raw)
To: John Stoffel; +Cc: linux-raid
On Monday 07 of December 2015, John Stoffel wrote:
> Arkadiusz> Do you know which commit that is?
>
> Try this, from the master lkml git repository:
>
> 2873d32ff493ecbfb7d2c7f56812ab941dda42f4
It's a merge commit. I don't see any obvious patch in that merge that would
help my case.
Anyway, I would expect my problem to be related to the bad block lists, whose
numbers are close to the one in the dmesg error message:
[ 848.988518] Buffer I/O error on dev md7, logical block 3907148544, async page read
> >> http://sprunge.us/XSWI
But how can these be repaired if write() also fails, while
http://www.spinics.net/lists/raid/msg49325.html suggests that a write should
"fix" them (by using replacement blocks, I guess)?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: I/O errors without erros from underlying device
2015-12-07 20:46 ` Arkadiusz Miskiewicz
@ 2015-12-08 4:02 ` John Stoffel
2015-12-08 11:05 ` Arkadiusz Miskiewicz
1 sibling, 0 replies; 8+ messages in thread
From: John Stoffel @ 2015-12-08 4:02 UTC (permalink / raw)
To: arekm; +Cc: John Stoffel, linux-raid
>>>>> "Arkadiusz" == Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> writes:
>> Try this, from the master lkml git repository:
>>
>> 2873d32ff493ecbfb7d2c7f56812ab941dda42f4
Arkadiusz> It's a merge commit. I don't see any obvious patch in that merge that would
Arkadiusz> help my case.
It's the merge from Jens Axboe that talks about blk something or other. In my
case, it led to instant lockups. In your case... hard to know. Sorry.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: I/O errors without erros from underlying device
2015-12-07 20:46 ` Arkadiusz Miskiewicz
2015-12-08 4:02 ` John Stoffel
@ 2015-12-08 11:05 ` Arkadiusz Miskiewicz
2015-12-21 2:25 ` NeilBrown
1 sibling, 1 reply; 8+ messages in thread
From: Arkadiusz Miskiewicz @ 2015-12-08 11:05 UTC (permalink / raw)
To: linux-raid
On Monday 07 of December 2015, Arkadiusz Miskiewicz wrote:
> Anyway, I would expect my problem to be related to the bad block lists, whose
> numbers are close to the one in the dmesg error message:
> [ 848.988518] Buffer I/O error on dev md7, logical block 3907148544, async page read
>
> > >> http://sprunge.us/XSWI
>
> But how can these be repaired if write() also fails, while
> http://www.spinics.net/lists/raid/msg49325.html suggests that a write should
> "fix" them (by using replacement blocks, I guess)?
I tried to get rid of the bad block lists (well, corruption in that area is
better than no access at all):
mdadm --assemble /dev/md7 --force --update=no-bbl
mdadm: Cannot remove active bbl from /dev/sdae1
mdadm: Cannot remove active bbl from /dev/sdag1
mdadm: Cannot remove active bbl from /dev/sdai1
mdadm: Cannot remove active bbl from /dev/sdn1
mdadm: Cannot remove active bbl from /dev/sdg
mdadm: Cannot remove active bbl from /dev/sdad1
mdadm: /dev/md7 has been started with 10 drives.
Is there a way to achieve that anyway?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: I/O errors without erros from underlying device
2015-12-08 11:05 ` Arkadiusz Miskiewicz
@ 2015-12-21 2:25 ` NeilBrown
0 siblings, 0 replies; 8+ messages in thread
From: NeilBrown @ 2015-12-21 2:25 UTC (permalink / raw)
To: arekm, linux-raid
On Tue, Dec 08 2015, Arkadiusz Miskiewicz wrote:
> I tried to get rid of the bad block lists (well, corruption in that area is
> better than no access at all):
>
> mdadm --assemble /dev/md7 --force --update=no-bbl
> mdadm: Cannot remove active bbl from /dev/sdae1
> mdadm: Cannot remove active bbl from /dev/sdag1
> mdadm: Cannot remove active bbl from /dev/sdai1
> mdadm: Cannot remove active bbl from /dev/sdn1
> mdadm: Cannot remove active bbl from /dev/sdg
> mdadm: Cannot remove active bbl from /dev/sdad1
> mdadm: /dev/md7 has been started with 10 drives.
>
> Is there a way to achieve that anyway?
>
You probably have bad blocks in multiple disks in the one stripe (look
in /sys/block/md7/md/dev-*/badblocks or something like that to see).
To get rid of these you would need to write to every block in the
stripe. I guess I should try to find a way to make that easier.

If you like you could hack mdadm to allow you to remove the bbl even
though they aren't empty.
In super1.c look for:

	} else if (strcmp(update, "no-bbl") == 0) {
		if (sb->feature_map & __cpu_to_le32(MD_FEATURE_BAD_BLOCKS))
			pr_err("Cannot remove active bbl from %s\n",devname);
		else {
			sb->bblog_size = 0;
			sb->bblog_shift = 0;
			sb->bblog_offset = 0;
		}

and change it to be unconditional and also to clear
MD_FEATURE_BAD_BLOCKS.

No warranty expressed or implied.

NeilBrown
^ permalink raw reply [flat|nested] 8+ messages in thread
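The check NeilBrown suggests above (several members having bad blocks at the same place) could be sketched in Python roughly as follows. This is only an illustration under assumptions: that the per-device sysfs file is named `bad_blocks` and holds one "start length" pair per line in 512-byte sectors, that all members use the same data offset so per-device sector numbers are comparable, and that 8 sectors (4 KiB) is a sensible granularity for the comparison. The function names are invented for this sketch and are not part of mdadm or the kernel.

```python
import glob
import os
from collections import defaultdict


def parse_bad_blocks(text):
    """Parse 'start length' lines (512-byte sectors) into (start, length) tuples."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        ranges.append((int(fields[0]), int(fields[1])))
    return ranges


def read_sysfs_bad_blocks(md="md7"):
    """Read every member's list; the file name 'bad_blocks' is an assumption."""
    result = {}
    for path in glob.glob(f"/sys/block/{md}/md/dev-*/bad_blocks"):
        dev = os.path.basename(os.path.dirname(path))
        with open(path) as fh:
            result[dev] = parse_bad_blocks(fh.read())
    return result


def overloaded_granules(per_device, granule_sectors=8, max_failures=2):
    """Return {granule_index: [devices]} for granules that are bad on more
    devices than the array can tolerate (two for RAID6)."""
    hits = defaultdict(set)
    for dev, ranges in per_device.items():
        for start, length in ranges:
            if length <= 0:
                continue
            first = start // granule_sectors
            last = (start + length - 1) // granule_sectors
            for granule in range(first, last + 1):
                hits[granule].add(dev)
    return {g: sorted(devs) for g, devs in sorted(hits.items())
            if len(devs) > max_failures}


if __name__ == "__main__":
    for granule, devs in overloaded_granules(read_sysfs_bad_blocks()).items():
        print(f"device sector {granule * 8}: bad on {', '.join(devs)}")
```

Any granule reported here is bad on at least three members, which is more than RAID6 parity can reconstruct, and that would be consistent with reads failing without any error from the underlying devices.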
Thread overview: 8+ messages
2015-12-07 16:05 I/O errors without erros from underlying device Arkadiusz Miskiewicz
2015-12-07 16:37 ` John Stoffel
2015-12-07 17:06 ` Arkadiusz Miskiewicz
[not found] ` <201512071803.26434.arekm@maven.pl>
2015-12-07 17:23 ` John Stoffel
2015-12-07 20:46 ` Arkadiusz Miskiewicz
2015-12-08 4:02 ` John Stoffel
2015-12-08 11:05 ` Arkadiusz Miskiewicz
2015-12-21 2:25 ` NeilBrown