* Troubleshooting "Buffer I/O error" on reading md device @ 2018-01-02 2:46 RQM 2018-01-02 3:13 ` Reindl Harald 2018-01-02 4:28 ` NeilBrown 0 siblings, 2 replies; 12+ messages in thread From: RQM @ 2018-01-02 2:46 UTC (permalink / raw) To: linux-raid@vger.kernel.org Hello everyone, I hope this list is the right place to ask the following: I've got a 5-disk RAID-5 array that's been built by a QNAP NAS device, which has recently failed (I suspect a faulty SATA controller or backplane). I migrated the disks to a desktop computer that runs Debian stretch (kernel 4.9.65-3+deb9u1 amd64) and mdadm version 3.4. Although the array can be assembled, I encountered the following error in my dmesg output ([1], recorded directly after a recent reboot and fsck attempt) when running fsck: Buffer I/O error on dev md0, logical block 1598030208, async page read I can reliably reproduce that error by trying to read from the md0 device. It's always the same block, also across reboots. I have suspected that possibly, one of the drives involved is faulty. Although smart errors have been logged [2], the errors are not recent enough to correlate with the fsck run. Also, I had sha1sum complete without error on every one of the individual disk devices /dev/sd[b-f], so reading from the drives does not provoke an error. Finally, I tried scrubbing the array by writing repair to md/sync_action. The process completed without any output to dmesg or signs of trouble in /proc/mdstat. However, reading from the array still fails at the same block as above, 1598030208. Here's the output of mdadm --detail /dev/md0: [3] I assume the md driver would know what exactly the problem is, but I don't know where to look to find that information. How can I proceed troubleshooting this issue? FYI, I had posted this on serverfault [4] previously, but unfortunately didn't arrive at a conclusion. Thank you very much in advance! [1] https://paste.ubuntu.com/26303735/ [2] https://paste.ubuntu.com/26303737/ [3] https://paste.ubuntu.com/26303754/ [4] https://serverfault.com/questions/889687/troubleshooting-buffer-i-o-error-on-software-raid-md-device ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Troubleshooting "Buffer I/O error" on reading md device
  From: Reindl Harald
  Date: 2018-01-02  3:13 UTC
  To: RQM, linux-raid@vger.kernel.org

On 02.01.2018 at 03:46, RQM wrote:
> I hope this list is the right place to ask the following:
>
> I've got a 5-disk RAID-5 array that was built by a QNAP NAS device, which has recently failed (I suspect a faulty SATA controller or backplane).
> I migrated the disks to a desktop computer that runs Debian stretch (kernel 4.9.65-3+deb9u1 amd64) and mdadm version 3.4. Although the array can be assembled, I encountered the following error in my dmesg output ([1], recorded directly after a recent reboot and fsck attempt) when running fsck:
>
>     Buffer I/O error on dev md0, logical block 1598030208, async page read
>
> I can reliably reproduce that error by trying to read from the md0 device. It's always the same block, also across reboots.

I had the same message on my test-server VM (running under VMware Workstation) after upgrading to one of the first 4.14 kernels on Fedora, for /dev/sdb1 (the rootfs), and it went away as suddenly as it came. In the case of a virtual disk, faulty hardware is all but impossible, or at least I would expect such a message on the underlying RAID-10 rather than in a random guest.
* Re: Troubleshooting "Buffer I/O error" on reading md device
  From: NeilBrown
  Date: 2018-01-02  4:28 UTC
  To: RQM, linux-raid@vger.kernel.org

On Mon, Jan 01 2018, RQM wrote:

> I've got a 5-disk RAID-5 array that was built by a QNAP NAS device, which has recently failed (I suspect a faulty SATA controller or backplane).
> [...]
>     Buffer I/O error on dev md0, logical block 1598030208, async page read
>
> I can reliably reproduce that error by trying to read from the md0 device. It's always the same block, also across reboots.
> [...]
> I assume the md driver would know what exactly the problem is, but I don't know where to look to find that information. How can I proceed troubleshooting this issue?

This is truly weird. I'd even go so far as to say that it cannot possibly happen (but I've been wrong before).

Step one is to confirm that it is easy to reproduce. Does

    dd if=/dev/md0 bs=4K skip=1598030208 count=1 of=/dev/null

trigger the message reliably? To check that "4K" is the correct blocksize, run

    blockdev --getbsz /dev/md0

and use whatever number it gives as 'bs='. If you cannot reproduce like that, try a larger count, and then a smaller skip with a large count.

Once you can reproduce with minimal IO, do

    echo file:raid5.c +p > /sys/kernel/debug/dynamic_debug/control
    # repeat experiment
    echo file:raid5.c -p > /sys/kernel/debug/dynamic_debug/control

and report the messages that appear in 'dmesg'.

Also report "mdadm -E" of each member device, and the kernel version (though I see that is in the serverfault report: 4.9.30-2+deb9u5).

Then run

    blktrace /dev/md0 /dev/sd[acdef]

in one window while reproducing the error again in another window. Then interrupt the blktrace. This will produce several blktrace output files. Create a tar.gz of these and put them somewhere that I can get them - hopefully they won't be too big.

With all this information, I can poke around and will hopefully be able to explain in fine detail exactly why this cannot possibly happen (unless it turns out that I'm wrong again).

Thanks,
NeilBrown
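As a small aside on the capture step: blktrace writes one file per traced device and CPU, typically named <device>.blktrace.<cpu>, so a sketch of the whole capture (member names taken from the message above) might look like this:

    # window 1: start tracing the array and its members; Ctrl-C once the read has failed
    blktrace /dev/md0 /dev/sd[acdef]

    # window 2: reproduce the failing read
    dd if=/dev/md0 bs=4K skip=1598030208 count=1 of=/dev/null

    # back in window 1, after interrupting blktrace: package the per-CPU trace files
    tar czf blktrace-out.tar.gz *.blktrace.*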
* Re: Troubleshooting "Buffer I/O error" on reading md device
  From: RQM
  Date: 2018-01-02 10:40 UTC
  To: NeilBrown
  Cc: linux-raid@vger.kernel.org

Hello,

thanks for the quick and helpful responses! Answers inline:

> Step one is to confirm that it is easy to reproduce. Does
>
>     dd if=/dev/md0 bs=4K skip=1598030208 count=1 of=/dev/null
>
> trigger the message reliably? To check that "4K" is the correct blocksize, run
>
>     blockdev --getbsz /dev/md0
>
> and use whatever number it gives as 'bs='.

blockdev does indeed report a blocksize of 4096, and the dd line does reliably trigger

    dd: error reading '/dev/md0': Input/output error

and the same line in dmesg as before.

> Once you can reproduce with minimal IO, do
>
>     echo file:raid5.c +p > /sys/kernel/debug/dynamic_debug/control
>     # repeat experiment
>     echo file:raid5.c -p > /sys/kernel/debug/dynamic_debug/control
>
> and report the messages that appear in 'dmesg'.

I had to replace the colon with a space in those two lines (otherwise I would get "bash: echo: write error: Invalid argument"), but after that, this is what I got in dmesg: https://paste.ubuntu.com/26305369/

> Also report "mdadm -E" of each member device, and the kernel version (though I see that is in the serverfault report: 4.9.30-2+deb9u5).

mdadm -E says: https://paste.ubuntu.com/26305379/

The kernel has been updated between the serverfault post and my first mail to this list to 4.9.65-3+deb9u1. No changes since.

> Then run
>
>     blktrace /dev/md0 /dev/sd[acdef]
>
> in one window while reproducing the error again in another window. Then interrupt the blktrace. This will produce several blktrace output files. Create a tar.gz of these and put them somewhere that I can get them - hopefully they won't be too big.

I had to adjust the last blktrace argument to /dev/sd[b-f] since the names of the drives changed after the last reboot, but here's the output: https://filebin.ca/3mnjUz1OIXqm/blktrace-out.tar.gz
I also included the blktrace terminal output in there.

Thank you so much for the effort! Please let me know if you need anything.
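For completeness, the dynamic_debug control syntax that worked here uses a space between the keyword and the filename; a sketch of the corrected toggle around the failing read:

    echo 'file raid5.c +p' > /sys/kernel/debug/dynamic_debug/control
    dd if=/dev/md0 bs=4K skip=1598030208 count=1 of=/dev/null
    dmesg | tail -n 60      # collect the raid5.c debug lines
    echo 'file raid5.c -p' > /sys/kernel/debug/dynamic_debug/control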
* Re: Troubleshooting "Buffer I/O error" on reading md device
  From: NeilBrown
  Date: 2018-01-02 21:27 UTC
  To: RQM
  Cc: linux-raid@vger.kernel.org

On Tue, Jan 02 2018, RQM wrote:

>> Once you can reproduce with minimal IO, do
>> [...]
>> and report the messages that appear in 'dmesg'.
>
> I had to replace the colon with a space in those two lines (otherwise I would get "bash: echo: write error: Invalid argument"), but after that, this is what I got in dmesg:
> https://paste.ubuntu.com/26305369/

    [Tue Jan  2 11:14:47 2018] locked=0 uptodate=0 to_read=1 to_write=0 failed=2 failed_num=3,2

So for this stripe, two devices appear to be failed: 3 and 2. As the two devices clearly are thought to be working, there must be a bad block recorded.

>> Also report "mdadm -E" of each member device, and the kernel version (though I see that is in the serverfault report: 4.9.30-2+deb9u5).
>
> mdadm -E says: https://paste.ubuntu.com/26305379/

I needed "mdadm -E" of the components of the array, so the partitions rather than the whole devices, e.g. /dev/sdb1, not /dev/sdb. This will show a non-empty bad block list on at least two devices.

You can remove the bad block by over-writing it:

    dd if=/dev/zero of=/dev/md0 bs=4K seek=1598030208 count=1

though that might corrupt some file containing the block. (Note that "seek" seeks in the output file, while "skip" skips over the input file.)

How did the bad block get there? A possible scenario is:
- A device fails and is removed from the array.
- A read error occurs on another device. Rather than failing the whole device, md records that block as bad.
- The failed device is replaced (or found to be a cabling problem) and recovered. Due to the bad block, the stripe cannot be recovered, so a bad block is recorded on the new device as well.

If the read error was really a cabling problem, then the original data might still be there. If it is, you could recover it and write it back to the array rather than writing from /dev/zero.

Finding out which file the failed block is part of is probably possible, but not necessarily easy. If you want to try, the first step is reporting what filesystem is on md0. If it is ext4, then debugfs can help. If something else - I don't know.

NeilBrown
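In case someone wants to follow the debugfs route mentioned above, a sketch of mapping the failing block to a file on ext4, assuming the filesystem sits directly on /dev/md0 with 4K blocks (so the filesystem block number matches the "logical block" in the kernel message); the inode number 12345 below is only a placeholder for whatever icheck reports:

    # which inode (if any) owns filesystem block 1598030208?
    debugfs -R 'icheck 1598030208' /dev/md0
    # translate that inode number into a pathname
    debugfs -R 'ncheck 12345' /dev/md0

If icheck reports no owning inode, the block lies in free space or filesystem metadata rather than in a file.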
* Re: Troubleshooting "Buffer I/O error" on reading md device
  From: Roger Heflin
  Date: 2018-01-02 22:30 UTC
  To: NeilBrown
  Cc: RQM, linux-raid@vger.kernel.org

The brute-force way to find the file is to walk all files and cat each one to /dev/null, checking for a bad return code from the cat command. The last time I had to do this, that was easier, and unless the filesystem is really, really big it should finish in a day or two. debugfs was not easy to understand and/or work with, and overall the brute-force method took less of my time to implement.

If find/cat does not find it, that would indicate the error is in free space or in the filesystem metadata.

On Tue, Jan 2, 2018 at 3:27 PM, NeilBrown <neilb@suse.com> wrote:
> Finding out which file the failed block is part of is probably possible, but not necessarily easy.
> If you want to try, the first step is reporting what filesystem is on md0. If it is ext4, then debugfs can help. If something else - I don't know.
> [...]
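A minimal sketch of that brute-force scan, assuming the filesystem is mounted read-only at /mnt (the mount point is illustrative); any path it prints is a file overlapping an unreadable block:

    mount -o ro /dev/md0 /mnt
    find /mnt -type f -print0 |
      while IFS= read -r -d '' f; do
        cat -- "$f" > /dev/null || printf 'read error: %s\n' "$f"
      done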
* Re: Troubleshooting "Buffer I/O error" on reading md device
  From: RQM
  Date: 2018-01-04 14:45 UTC
  To: NeilBrown
  Cc: linux-raid@vger.kernel.org

Hello,

> I needed "mdadm -E" of the components of the array, so the partitions rather than the whole devices, e.g. /dev/sdb1, not /dev/sdb.

Sorry, that should have occurred to me. Here's the output: https://paste.ubuntu.com/26319689/

Indeed, bad blocks are present on two devices.

> You can remove the bad block by over-writing it:
>
>     dd if=/dev/zero of=/dev/md0 bs=4K seek=1598030208 count=1
>
> though that might corrupt some file containing the block.

I have tried that just now, but before running mdadm -E above. dd appears to succeed when writing to the bad block, but after that, reading that block with dd fails again:

    dd: error reading '/dev/md0': Input/output error

In dmesg, the following errors appear:

    [220444.068715] VFS: Dirty inode writeback failed for block device md0 (err=-5).
    [220445.850229] Buffer I/O error on dev md0, logical block 1598030208, async page read

I have repeated the dd write-then-read experiment, with identical results.

The filesystem is indeed ext4, but it's not of tremendous importance to me that all data is recovered, as the array contains backup data only. However, I would like to get the backup system back into operation, so I'd be very grateful for further hints on how to get the array into a usable state.

Thank you so much for your help so far!
* Re: Troubleshooting "Buffer I/O error" on reading md device
  From: NeilBrown
  Date: 2018-01-05  1:05 UTC
  To: RQM
  Cc: linux-raid@vger.kernel.org

On Thu, Jan 04 2018, RQM wrote:

> I have repeated the dd write-then-read experiment, with identical results.
>
> The filesystem is indeed ext4, but it's not of tremendous importance to me that all data is recovered, as the array contains backup data only. However, I would like to get the backup system back into operation, so I'd be very grateful for further hints on how to get the array into a usable state.

The easiest approach is to remove the bad block log. Stop the array, and then assemble with --update=no-bbl, e.g.:

    mdadm -S /dev/md0
    mdadm -A /dev/md0 --update=no-bbl /dev/sd[bcdef]3

Before you do that though, please take a dump of the metadata and send it to me, in case I get motivated to figure out why writing didn't work:

    mkdir /tmp/dump
    mdadm --dump /tmp/dump /dev/sd[bcdef]3
    tar czSf /tmp/dump.tgz /tmp/dump

The files in /tmp/dump are sparse images of the hard drives with only the metadata present. The 'S' flag to tar should cause it to notice this and create a tiny tgz file. Then send me /tmp/dump.tgz.

Thanks,
NeilBrown
* Re: Troubleshooting "Buffer I/O error" on reading md device
  From: RQM
  Date: 2018-01-05 12:55 UTC
  To: NeilBrown
  Cc: linux-raid@vger.kernel.org

Hi,

here's the metadata dump: https://filebin.ca/3n9OgaeSlV6x/dump.tgz

When I try assembling with no-bbl, this is what I get:

    # mdadm -A /dev/md0 --update=no-bbl /dev/sd[bcdef]3
    mdadm: Cannot remove active bbl from /dev/sdc3
    mdadm: Cannot remove active bbl from /dev/sde3
    mdadm: /dev/md0 has been started with 5 drives.

The array does start up, but the behaviour regarding dd reads and writes remains as it was before: reads fail with the corresponding error messages in dmesg and on stdout/stderr, and writes fail too, but that is only indicated in dmesg.

By the way, I ran SMART long tests a day or two ago, and they reportedly completed without errors on all involved disks.

Thank you again so much for your help!

-------- Original Message --------
From: NeilBrown <neilb@suse.com>
Date: January 5, 2018 1:05 AM UTC

> The easiest approach is to remove the bad block log.
> Stop the array, and then assemble with --update=no-bbl.
> [...]
> Before you do that though, please take a dump of the metadata and send it to me, in case I get motivated to figure out why writing didn't work.
> [...]
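One way to double-check that the in-kernel bad-block list is what is still tripping the reads is to compare the kernel's view with the on-disk superblocks (device names as used in this thread):

    # per-member list the kernel is currently enforcing (sector offset and length)
    grep . /sys/block/md0/md/rd*/bad_blocks
    # on-disk view in each member's superblock
    mdadm -E /dev/sd[bcdef]3 | grep -i 'bad block'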
* Re: Troubleshooting "Buffer I/O error" on reading md device
  From: RQM
  Date: 2018-01-13 12:18 UTC
  To: NeilBrown
  Cc: linux-raid@vger.kernel.org

Hello,

I have been made aware that the link I had supplied previously does not work anymore. Here's another attempt at uploading the `mdadm --dump /dev/sd[bcdef]3` output:

https://filebin.net/i0olmgzg52obnp0f/dump.tgz

Any help is greatly appreciated. Please do let me know whether you plan on working on this issue in the near future, because otherwise I will have to re-create a new array on these disks in order to put them into production again.

Thank you so much!
* Re: Troubleshooting "Buffer I/O error" on reading md device
  From: NeilBrown
  Date: 2018-02-02  1:55 UTC
  To: RQM
  Cc: linux-raid@vger.kernel.org

On Sat, Jan 13 2018, RQM wrote:

> Any help is greatly appreciated. Please do let me know whether you plan on working on this issue in the near future, because otherwise I will have to re-create a new array on these disks in order to put them into production again.

Sorry that it has taken me so long to get to this - January was a bit crazy.

Short answer is that if you assemble with --update=force-no-bbl, it will really, truly get rid of the bad block log. I really should add that to the man page.

Longer answer: if you assemble the array (without force-no-bbl) and run

    grep . /sys/block/md0/md/rd*/bad_blocks

you'll get

    /sys/block/md0/md/rd2/bad_blocks:3196060416 8
    /sys/block/md0/md/rd3/bad_blocks:3196060416 8

So that is a 4K block that is bad at the same location on 2 devices.

There is no data offset, and the chunk size is 64K (128 sectors), so using bc:

    % bc
    3196060416/(64*2)
    24969222
    3196060416%(64*2)
    0

The blocks are at the start of stripe 24969222. Each stripe is 4 data chunks, and a chunk is 64K, i.e. 16 4K blocks. So the block offset is close to

    24969222*4*16
    1598030208

which is exactly the "logical block" which was reported.

There are 5 devices, so the parity block rotates through the pattern

    D0 D1 D2 D3 P
    D1 D2 D3 P  D0
    D2 D3 P  D0 D1
    D3 P  D0 D1 D2
    P  D0 D1 D2 D3

and

    24969222%5
    2

so this should be row 2 (counting from 0):

    D2 D3 P  D0 D1

rd2 and rd3 are bad, so that is 'P' and 'D0'. So this confirms that it is just the first 4K block of that stripe which is bad.

Writing should fix it... but it doesn't. The write gets an IO error. Looking at the code I can see why. The fix isn't completely trivial; I'll have to think about it carefully. But for now --update=force-no-bbl should get you going.

NeilBrown
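Spelled out for this array (member names as used earlier in the thread), the forced removal would look roughly like this; note that clearing the log only forgets the bad-block records, it cannot restore whatever data those sectors should have held:

    mdadm -S /dev/md0
    mdadm -A /dev/md0 --update=force-no-bbl /dev/sd[bcdef]3
    # verify: both lists should now be empty
    grep . /sys/block/md0/md/rd*/bad_blocks
    # the previously failing read now goes to the member disks
    # (it may return stale data for that stripe)
    dd if=/dev/md0 bs=4K skip=1598030208 count=1 of=/dev/null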
* Re: Troubleshooting "Buffer I/O error" on reading md device
  From: Darshaka Pathirana
  Date: 2022-11-01 23:49 UTC
  To: linux-raid@vger.kernel.org

Hi,

I am reviving this thread because I stumbled over the same problem, except that I am running a RAID-1 setup. The server is (still) running Debian/stretch with mdadm 3.4-4+b1.

Basically this is what happens. Accessing the RAID fails:

    % sudo dd if=/dev/md0 of=/dev/null skip=3112437760 count=33554432
    dd: error reading '/dev/md0': Input/output error
    514936+0 records in
    514936+0 records out
    263647232 bytes (264 MB, 251 MiB) copied, 0.447983 s, 589 MB/s

dmesg output while trying to access the RAID:

    [Tue Nov  1 22:09:59 2022] Buffer I/O error on dev md0, logical block 389119087, async page read
    [Tue Nov  1 22:22:01 2022] Buffer I/O error on dev md0, logical block 389119087, async page read

Jumping to the 'logical block':

    % sudo blockdev --getbsz /dev/md0
    4096
    % sudo dd if=/dev/md0 of=/dev/null skip=389119087 bs=4096 count=33554432
    dd: error reading '/dev/md0': Input/output error
    0+0 records in
    0+0 records out
    0 bytes copied, 0.000129958 s, 0.0 kB/s

But the underlying disk seemed fine, which was strange:

    % sudo dd if=/dev/sdb1 skip=3112437760 count=33554432 of=/dev/null
    33554432+0 records in
    33554432+0 records out
    17179869184 bytes (17 GB, 16 GiB) copied, 112.802 s, 152 MB/s
    sudo dd if=/dev/sdb1 skip=3112437760 count=33554432 of=/dev/null  9.18s user 29.80s system 34% cpu 1:52.81 total

Note: through trial and error I found the offset of /dev/md0 relative to /dev/sdb1 to be 262144 blocks (with block size 512). That's why skip is not the same in both commands.

After a very long search I found this thread, and yes, there is a bad block log:

    % cat /sys/block/md0/md/rd*/bad_blocks
    3113214840 8
    % sudo mdadm -E /dev/sdb1 | grep Bad
    Bad Block Log : 512 entries available at offset 72 sectors - bad blocks present.

The other disk of that RAID has been removed, because it had SMART errors and is about to be replaced. Only then did I notice the input/output error.

I am not sure how to proceed from here. Do you have any advice?

On 2018-02-02 02:55, NeilBrown wrote:
>
> Short answer is that if you assemble with --update=force-no-bbl, it will really, truly get rid of the bad block log. I really should add that to the man page.

*friendly wave*

> Longer answer: if you assemble the array (without force-no-bbl) and run
> [...]
> So this confirms that it is just the first 4K block of that stripe which is bad.
> Writing should fix it... but it doesn't. The write gets an IO error.
>
> Looking at the code I can see why. The fix isn't completely trivial; I'll have to think about it carefully.

I am curious: did you come up with a solution?

Best & thx for your help,
- Darsha

P.S. I am not subscribed, please put me on CC.
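Given the advice quoted above, a sketch of what the same fix might look like for this degraded RAID-1 (device names taken from this message; this assumes the installed mdadm understands --update=force-no-bbl, and clearing the log only removes the record - it cannot recover data that was never readable from those sectors):

    mdadm -S /dev/md0
    mdadm -A /dev/md0 --update=force-no-bbl /dev/sdb1
    # verify the list is gone, then retry the read that used to fail
    grep . /sys/block/md0/md/rd*/bad_blocks
    dd if=/dev/md0 of=/dev/null bs=4096 skip=389119087 count=1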