* force remapping a pending sector in sw raid5 array
@ 2018-02-06 18:14 Marc MERLIN
2018-02-06 18:59 ` Reindl Harald
` (3 more replies)
0 siblings, 4 replies; 32+ messages in thread
From: Marc MERLIN @ 2018-02-06 18:14 UTC (permalink / raw)
To: linux-raid
So, I have 2 drives in a 5x6TB array that have 1 and 8 pending sectors
in SMART, respectively.
Currently, I have a check running, but it will take a while...
echo check > /sys/block/md7/md/sync_action
md7 : active raid5 sdf1[0] sdg1[5] sdd1[3] sdh1[2] sde1[1]
23441561600 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
[==>..................] check = 10.5% (615972996/5860390400) finish=4822.1min speed=18125K/sec
bitmap: 3/44 pages [12KB], 65536KB chunk
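In the meantime I'm watching it with the standard interfaces (nothing array-specific
assumed here):

watch -n 60 cat /proc/mdstat
cat /sys/block/md7/md/mismatch_cnt   # after the check: non-zero means inconsistent stripes were found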
My understanding is that the check will eventually hit the bad sectors that can't be read,
reconstruct the data from the remaining 4 drives, and rewrite them (letting the drive remap the blocks).
But that may take up to 3 days, simply because of how long the check takes on drives this size
(they are on a SATA port multiplier, so I don't get a lot of speed).
Now, I was trying to see if I could just manually remap the block if I can read it at
least once.
SMART shows:
# 3 Extended offline Completed: read failure 90% 289 1287409520
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 1
So, reading the block repeatedly until it reads OK and gets remapped would be great,
but that didn't work:
myth:/mnt/btrfs_bigbackup/DS2/backup# dd if=/dev/sdh skip=1287409520 count=1 of=- | more
dd: reading `/dev/sdh': Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 9.79192 s, 0.0 kB/s
myth:/mnt/btrfs_bigbackup/DS2/backup# dd if=/dev/sdh skip=1287409520 count=1 of=- | more
dd: reading `/dev/sdh': Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 4.54204 s, 0.0 kB/s
ata5.04: exception Emask 0x0 SAct 0x1c000 SErr 0x0 action 0x0
ata5.04: failed command: READ FPDMA QUEUED
ata5.04: cmd 60/08:80:70:4f:bc/00:00:4c:00:00/40 tag 16 ncq dma 4096 in
res 41/40:00:70:4f:bc/00:00:4c:00:00/00 Emask 0x409 (media error) <F>
ata5.04: status: { DRDY ERR }
ata5.04: error: { UNC }
ata5.04: configured for UDMA/133
sd 4:4:0:0: [sdh] tag#16 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 4:4:0:0: [sdh] tag#16 Sense Key : Medium Error [current]
sd 4:4:0:0: [sdh] tag#16 Add. Sense: Unrecovered read error - auto reallocate failed
sd 4:4:0:0: [sdh] tag#16 CDB: Read(16) 88 00 00 00 00 00 4c bc 4f 70 00 00 00 08 00 00
print_req_error: I/O error, dev sdh, sector 1287409520
Buffer I/O error on dev sdh, logical block 160926190, async page read
ata5: EH complete
That's not unexpected.
However, I can get
hdparm --read-sector 1287409520 /dev/sdh
to work sometimes.
I've gotten garbage, sometimes all zeroes, and sometimes what seems like good data (and gotten
the same data more than once).
hdparm --read-sector 1287409520 /dev/sdh
/dev/sdh:
reading sector 1287409520: SG_IO: bad/missing sense data, sb[]: 70 00 03 00 00 00 00 0a 40 51 e0 01 11 04 00 00 a0 70 00 00 00 00 00 00 00 00 00 00 00 00 00 00
succeeded
6843 7261 6361 6574 2072 6564 6976 6563
3a73 200a 3120 6d20 6d65 200a 3420 2f20
6564 2f76 6376 302f 200a 3420 7420 7974
200a 3420 7420 7974 0a53 2020 2035 642f
7665 742f 7974 200a 3520 2f20 6564 2f76
6f63 736e 6c6f 0a65 2020 2035 642f 7665
702f 6d74 0a78 2020 2036 706c 200a 3720
7620 7363 200a 3031 6d20 7369 0a63 3120
2033 6e69 7570 0a74 3120 2034 6f73 6e75
2f64 696d 6578 0a72 3120 2034 6f73 6e75
2f64 7364 0a70 3120 2034 6f73 6e75 2f64
7561 6964 0a6f 3120 2034 6f73 6e75 2f64
6461 7073 200a 3132 7320 0a67 3220 2039
6266 200a 3138 7620 6469 6f65 6c34 6e69
7875 310a 3631 6120 736c 0a61 3231 2038
7470 0a6d 3331 2036 7470 0a73 3831 2030
7375 0a62 3831 2039 7375 5f62 6564 6976
6563 320a 3632 6420 6d72 320a 3234 6d20
6465 6169 320a 3334 6820 6469 6172 0a77
3432 2034 6966 6572 6977 6572 320a 3534
6e20 6d76 0a65 3432 2036 656d 0a69 3432
2037 7561 0a78 3432 2038 7362 0a67 3432
2039 6177 6374 6468 676f 320a 3035 7220
6374 320a 3135 6420 7861 320a 3235 6420
6d69 636d 6c74 320a 3335 6e20 6364 6c74
320a 3435 6720 6970 636f 6968 0a70 420a
6f6c 6b63 6420 7665 6369 7365 0a3a 2020
2032 6466 200a 3820 7320 0a64 2020 2039
646d 200a 3131 7320 0a72 3620 2035 6473
200a 3636 7320 0a64 3620 2037 6473 200a
3836 7320 0a64 3620 2039 6473 200a 3037
7320 0a64 3720 2031 6473 310a 3832 7320
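Interestingly, that dump decodes to readable text if you byte-swap each 16-bit word (it
looks like a /proc/devices listing), so it's probably real data. Something like this should
turn a successful read into a 512-byte file; the grep filter and the byte swap are my guesses
about hdparm's output format, so worth verifying against a known-good sector first:

hdparm --read-sector 1287409520 /dev/sdh > sector.hex
grep -E '^[0-9a-f]{4}( [0-9a-f]{4})+' sector.hex \
    | sed 's/\([0-9a-f][0-9a-f]\)\([0-9a-f][0-9a-f]\)/\2\1/g' \
    | xxd -r -p > sector.bin
ls -l sector.bin   # should be exactly 512 bytes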
Should I stick this data into a 512-byte file and write it back with dd in the right place?
(sadly, hdparm --write-sector does not seem to take input data and just writes zeroes instead)
Does that sound like a good plan, or is there another better way to fix my issue?
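For concreteness, the kind of thing I had in mind, with sector.bin being a hypothetical
file holding the recovered 512 bytes and the array stopped first so md isn't touching the
disk (and if these are 512e drives with 4 KiB physical sectors, rewriting the whole aligned
4 KiB may be what it takes for the firmware to actually reallocate):

dd if=sector.bin of=/dev/sdh bs=512 seek=1287409520 count=1 oflag=direct conv=fsync
# or, with all 8 logical sectors recovered into a hypothetical phys_sector.bin:
# dd if=phys_sector.bin of=/dev/sdh bs=4096 seek=$((1287409520 / 8)) count=1 oflag=direct conv=fsync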
Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
^ permalink raw reply [flat|nested] 32+ messages in thread* Re: force remapping a pending sector in sw raid5 array 2018-02-06 18:14 force remapping a pending sector in sw raid5 array Marc MERLIN @ 2018-02-06 18:59 ` Reindl Harald 2018-02-06 19:36 ` Marc MERLIN 2018-02-06 20:03 ` Andreas Klauer ` (2 subsequent siblings) 3 siblings, 1 reply; 32+ messages in thread From: Reindl Harald @ 2018-02-06 18:59 UTC (permalink / raw) To: Marc MERLIN, linux-raid Am 06.02.2018 um 19:14 schrieb Marc MERLIN: > So, I have 2 drives on a 5x6TB array that have respectively 1 and 8 > pending sectors in smart. > > Currently, I have a check running, but it will take a while... > > echo check > /sys/block/md7/md/sync_action > md7 : active raid5 sdf1[0] sdg1[5] sdd1[3] sdh1[2] sde1[1] > 23441561600 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU] > [==>..................] check = 10.5% (615972996/5860390400) finish=4822.1min speed=18125K/sec > bitmap: 3/44 pages [12KB], 65536KB chunk > > My understanding is that eventually it will find the bad sectors that can't be read > and rewrite new ones (block remapping) after reading the remaining 4 drives. > > But that may take up to 3 days, just due to how long the check will take and size of the drives > (they are on a SATA port multiplier, so I don't get a lot of speed) but 18125K/sec is a joke given that you should run a scrub every week did you try to play around with sysctl.conf? adjusting teh vars below and run "sysctl -p" should amke a difference after a view seconds if the hardware is capable of more performance than that dev.raid.speed_limit_min = 25000 dev.raid.speed_limit_max = 1000000 ^ permalink raw reply [flat|nested] 32+ messages in thread
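For reference, the same limits can be applied immediately at runtime, without editing
sysctl.conf:

sysctl -w dev.raid.speed_limit_min=25000
sysctl -w dev.raid.speed_limit_max=1000000
# equivalently:
# echo 25000   > /proc/sys/dev/raid/speed_limit_min
# echo 1000000 > /proc/sys/dev/raid/speed_limit_max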
* Re: force remapping a pending sector in sw raid5 array 2018-02-06 18:59 ` Reindl Harald @ 2018-02-06 19:36 ` Marc MERLIN 0 siblings, 0 replies; 32+ messages in thread From: Marc MERLIN @ 2018-02-06 19:36 UTC (permalink / raw) To: Reindl Harald; +Cc: linux-raid On Tue, Feb 06, 2018 at 07:59:32PM +0100, Reindl Harald wrote: > but 18125K/sec is a joke given that you should run a scrub every week I know it's bad. Right now it's a bit slower than normal because I'm also copying data to the drives. > did you try to play around with sysctl.conf? Yes, I set it to 300,000, but obviously it won't make the hardware go faster than it can. I totally understand the performance is crap, but it's a backup array that I only bring up and power on once a week and scrub once a month, so it's ok enough for the use in question. For now, it's more about me learning how to manually force a block remap, not because I absolutely have to, but because it's always good to know and learn low level tools and how things work. I have used hdrecover in the past which reads all the blocks low level and re-reads a bad block many times to force a successful read and auto remap, but sadly it doesn't take a block offset, so it would only work if I let it run on the whole drive, which would be slow. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array 2018-02-06 18:14 force remapping a pending sector in sw raid5 array Marc MERLIN 2018-02-06 18:59 ` Reindl Harald @ 2018-02-06 20:03 ` Andreas Klauer 2018-02-06 21:51 ` Adam Goryachev 2018-02-07 9:42 ` Kay Diederichs 3 siblings, 0 replies; 32+ messages in thread From: Andreas Klauer @ 2018-02-06 20:03 UTC (permalink / raw) To: Marc MERLIN; +Cc: linux-raid On Tue, Feb 06, 2018 at 10:14:16AM -0800, Marc MERLIN wrote: > echo check > /sys/block/md7/md/sync_action > md7 : active raid5 sdf1[0] sdg1[5] sdd1[3] sdh1[2] sde1[1] > 23441561600 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU] > [==>..................] check = 10.5% (615972996/5860390400) finish=4822.1min speed=18125K/sec > bitmap: 3/44 pages [12KB], 65536KB chunk > > My understanding is that eventually it will find the bad sectors that can't be read > and rewrite new ones (block remapping) after reading the remaining 4 drives. You can do selective area checks, with /sys/block/mdX/md/sync_{min,max}. Documented here: https://www.kernel.org/doc/html/latest/admin-guide/md.html#md-devices-in-sysfs Also see if increasing the stripe cache size helps with speeds. > Should I stick this data into a 512 byte file and write it back with dd in the right place? Most hard drives have 4K sectors these days. Writing 512 byte into a bad physical 4K sector is probably not a good idea. So at minimum, write 4K (aligned). If in doubt, write more... However, leaving it to the md check should be the much safer option. Easy to make mistakes when messing with drives directly. > Does that sound like a good plan, or is there another better way to fix my issue? mdadm --replace the drive entirely with a new one is not an option? Then you could do a full badblocks write test on the removed/faulty drive and make a more informed decision as to whether it can be trusted at all. Regards Andreas Klauer ^ permalink raw reply [flat|nested] 32+ messages in thread
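A rough sketch of such a targeted check plus the stripe cache tweak; the sector numbers
below are placeholders, sync_min/sync_max take 512-byte sector offsets, and sync_max also
accepts "max":

echo idle > /sys/block/md7/md/sync_action        # stop the currently running full check
echo 2574000000 > /sys/block/md7/md/sync_min     # placeholder: just below the suspect area
echo 2576000000 > /sys/block/md7/md/sync_max     # placeholder: just above it
echo check > /sys/block/md7/md/sync_action
echo 8192 > /sys/block/md7/md/stripe_cache_size  # larger stripe cache, at the cost of RAM
# afterwards: echo 0 > .../sync_min and echo max > .../sync_max to restore the defaults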
* Re: force remapping a pending sector in sw raid5 array 2018-02-06 18:14 force remapping a pending sector in sw raid5 array Marc MERLIN 2018-02-06 18:59 ` Reindl Harald 2018-02-06 20:03 ` Andreas Klauer @ 2018-02-06 21:51 ` Adam Goryachev 2018-02-06 22:02 ` Marc MERLIN 2018-02-07 4:29 ` Marc MERLIN 2018-02-07 9:42 ` Kay Diederichs 3 siblings, 2 replies; 32+ messages in thread From: Adam Goryachev @ 2018-02-06 21:51 UTC (permalink / raw) To: Marc MERLIN, linux-raid On 07/02/18 05:14, Marc MERLIN wrote: > So, I have 2 drives on a 5x6TB array that have respectively 1 and 8 > pending sectors in smart. > > Currently, I have a check running, but it will take a while... > > echo check > /sys/block/md7/md/sync_action > md7 : active raid5 sdf1[0] sdg1[5] sdd1[3] sdh1[2] sde1[1] > 23441561600 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU] > [==>..................] check = 10.5% (615972996/5860390400) finish=4822.1min speed=18125K/sec > bitmap: 3/44 pages [12KB], 65536KB chunk > > My understanding is that eventually it will find the bad sectors that can't be read > and rewrite new ones (block remapping) after reading the remaining 4 drives. > > But that may take up to 3 days, just due to how long the check will take and size of the drives > (they are on a SATA port multiplier, so I don't get a lot of speed). > > Now, I was trying to see if I could just manually remap the block if I can read it at > least once. > Smart shows: > # 3 Extended offline Completed: read failure 90% 289 1287409520 > 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 1 > > So, trying to read the block until it reads ok and gets remapped, would be great > but that didn't work: > > Does that sound like a good plan, or is there another better way to fix my issue? I think instead of reading the sector from the drive and relying on the drive to determine the correct data (it's already telling you it can't). What you need to do is find out where on md7 drive x sector y maps to and read that sector from md7, which will get md to (possibly) notice the read error, and then read the data from the other drives, and then re-write the faulty sector with correct calculated data (or do the resync on that area of md7 only). You could probably take a rough guess as follows (note, my math is probably totally bogus as I don't really know the physical / logical mapping for raid5, but I'm guessing) You have 5 drives in raid5, and we know one drive (capacity) is used for checksum, so four drives of data. So sector 1287409520 of one drive would be approx 4 x sector 1287409520 of the md array. So try setting something like 1287000000 * 4 as the start of the resync up to 1288000000 * 4 and see if that finds and fixes it for you. If nothing else, it should finish fairly quickly. You might need to start earlier, but you could just keep reducing the "window" until you find the right spot. Or, someone who knows a lot more about this mapping might jump in and answer the question, though they might need to see the raid details to see the actual physical layout/order of drives/etc. Hope that helps anyway.... Regards, Adam -- Adam Goryachev Website Managers www.websitemanagers.com.au -- The information in this e-mail is confidential and may be legally privileged. It is intended solely for the addressee. Access to this e-mail by anyone else is unauthorised. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. 
If you have received this message in error, please notify us immediately. Please also destroy and delete the message from your computer. Viruses - Any loss/damage incurred by receiving this email is not the sender's responsibility. ^ permalink raw reply [flat|nested] 32+ messages in thread
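To firm up the arithmetic a bit: the mapping also has to account for where the md data
area starts on each member. A hedged back-of-the-envelope version, with placeholder
offsets that need to be read from fdisk -l and mdadm -E on the real devices:

PART_START=2048       # placeholder: first LBA of sdh1 on sdh (fdisk -l /dev/sdh)
DATA_OFFSET=262144    # placeholder: "Data Offset" from mdadm -E /dev/sdh1, in 512-byte sectors
BAD_LBA=1287409520    # from SMART, relative to the whole disk
MEMBER_OFF=$(( BAD_LBA - PART_START - DATA_OFFSET ))
ARRAY_OFF=$(( MEMBER_OFF * 4 ))   # 4 data disks in a 5-disk RAID5; only accurate to within a stripe
echo $MEMBER_OFF $ARRAY_OFF
# reading a few MB of /dev/md7 around ARRAY_OFF (dd skip=... iflag=direct) should make md
# hit the bad member sector and rewrite it from parity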
* Re: force remapping a pending sector in sw raid5 array 2018-02-06 21:51 ` Adam Goryachev @ 2018-02-06 22:02 ` Marc MERLIN 2018-02-06 22:31 ` Roger Heflin 2018-02-07 4:29 ` Marc MERLIN 1 sibling, 1 reply; 32+ messages in thread From: Marc MERLIN @ 2018-02-06 22:02 UTC (permalink / raw) To: Adam Goryachev; +Cc: linux-raid On Wed, Feb 07, 2018 at 08:51:15AM +1100, Adam Goryachev wrote: > I think instead of reading the sector from the drive and relying on the > drive to determine the correct data (it's already telling you it can't). Just on that point, it's not that simple. A drive will only try to read the data a few times before giving up and marking the sector as pending a re-write with new data (so that it can be re-mapped). You can however re-read it in different ways and sometimes get the data back, which _should_ then cause an immediate re-writing of the data on a new block and turn the pending into a reallocated block However, this does not seem to have happened on my drive, either because the bad data didn't really get read by hdparm --read-sector, or because the firmware isn't doing its remapping job, or something else I don't understand > What you need to do is find out where on md7 drive x sector y maps to > and read that sector from md7, which will get md to (possibly) notice > the read error, and then read the data from the other drives, and then > re-write the faulty sector with correct calculated data (or do the > resync on that area of md7 only). Yeah, I got that part. > So try setting something like 1287000000 * 4 as the start of the resync > up to 1288000000 * 4 and see if that finds and fixes it for you. > > If nothing else, it should finish fairly quickly. You might need to > start earlier, but you could just keep reducing the "window" until you > find the right spot. Or, someone who knows a lot more about this mapping > might jump in and answer the question, though they might need to see the > raid details to see the actual physical layout/order of drives/etc. I did however (indeed) miss that I can narrow the check range, so I'll try playing with that until I can narrow it down to the right bit. I'm still curious as to why the hdparm bit didn't work, but oh well at this point. Thanks, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array 2018-02-06 22:02 ` Marc MERLIN @ 2018-02-06 22:31 ` Roger Heflin 2018-02-06 22:46 ` Marc MERLIN 0 siblings, 1 reply; 32+ messages in thread From: Roger Heflin @ 2018-02-06 22:31 UTC (permalink / raw) To: Marc MERLIN; +Cc: Adam Goryachev, Linux RAID What kind of drive is it? I have had good luck getting seagates to remap, on my 3tb WD Red drive with bad sectors the drive does not seem to remap them as easily. So far I have a lot of repeat bad sectors, but only 1 has remapped, even thought I am given the drive a lot of chances to remap the sectors. On Tue, Feb 6, 2018 at 4:02 PM, Marc MERLIN <marc@merlins.org> wrote: > On Wed, Feb 07, 2018 at 08:51:15AM +1100, Adam Goryachev wrote: >> I think instead of reading the sector from the drive and relying on the >> drive to determine the correct data (it's already telling you it can't). > > Just on that point, it's not that simple. A drive will only try to read the > data a few times before giving up and marking the sector as pending a > re-write with new data (so that it can be re-mapped). > You can however re-read it in different ways and sometimes get the data > back, which _should_ then cause an immediate re-writing of the data on a new > block and turn the pending into a reallocated block > However, this does not seem to have happened on my drive, either because the > bad data didn't really get read by hdparm --read-sector, or because the > firmware isn't doing its remapping job, or something else I don't understand > >> What you need to do is find out where on md7 drive x sector y maps to >> and read that sector from md7, which will get md to (possibly) notice >> the read error, and then read the data from the other drives, and then >> re-write the faulty sector with correct calculated data (or do the >> resync on that area of md7 only). > > Yeah, I got that part. > >> So try setting something like 1287000000 * 4 as the start of the resync >> up to 1288000000 * 4 and see if that finds and fixes it for you. >> >> If nothing else, it should finish fairly quickly. You might need to >> start earlier, but you could just keep reducing the "window" until you >> find the right spot. Or, someone who knows a lot more about this mapping >> might jump in and answer the question, though they might need to see the >> raid details to see the actual physical layout/order of drives/etc. > > I did however (indeed) miss that I can narrow the check range, so I'll try > playing with that until I can narrow it down to the right bit. > > I'm still curious as to why the hdparm bit didn't work, but oh well at this > point. > > Thanks, > Marc > -- > "A mouse is a device used to point at the xterm you want to type in" - A.S.R. > Microsoft is to operating systems .... > .... what McDonalds is to gourmet cooking > Home page: http://marc.merlins.org/ > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array 2018-02-06 22:31 ` Roger Heflin @ 2018-02-06 22:46 ` Marc MERLIN 0 siblings, 0 replies; 32+ messages in thread From: Marc MERLIN @ 2018-02-06 22:46 UTC (permalink / raw) To: Roger Heflin; +Cc: Adam Goryachev, Linux RAID On Tue, Feb 06, 2018 at 04:31:58PM -0600, Roger Heflin wrote: > What kind of drive is it? I have had good luck getting seagates to > remap, on my 3tb WD Red drive with bad sectors the drive does not seem > to remap them as easily. Device Model: WL6000GSA6457 Serial Number: WOL240367065 LU WWN Device Id: 5 0014ee 05932b834 Firmware Version: 82.00A82 User Capacity: 6,001,175,126,016 bytes [6.00 TB] Sector Sizes: 512 bytes logical, 4096 bytes physical Device is: Not in smartctl database [for details use: -P showall] ATA Version is: 9 ATA Standard is: Not recognized. Minor revision code: 0x001f > So far I have a lot of repeat bad sectors, but only 1 has remapped, > even thought I am given the drive a lot of chances to remap the > sectors. Yeah, it seems that things don't work like they should. Glad to know that it's not just me, then :) I'll probably return these drives because that behaviour is not ok, but at the same time it's interesting to learn about failure cases on data that I could afford to lose (mostly it's the time lost to re-sync a very big backup, i.e 1 to 2 weeks) Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08 ^ permalink raw reply [flat|nested] 32+ messages in thread
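Given the 512-byte-logical / 4 KiB-physical geometry, the unreadable LBA sits inside a
4 KiB physical sector, which is the unit the firmware can actually reallocate (assuming the
usual alignment where physical sector 0 starts at LBA 0):

echo $(( 1287409520 / 8 ))       # 160926190: the 4 KiB block, the same number dmesg shows as "logical block"
echo $(( 1287409520 / 8 * 8 ))   # first LBA of that physical sector (this LBA happens to be aligned already)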
* Re: force remapping a pending sector in sw raid5 array 2018-02-06 21:51 ` Adam Goryachev 2018-02-06 22:02 ` Marc MERLIN @ 2018-02-07 4:29 ` Marc MERLIN 1 sibling, 0 replies; 32+ messages in thread From: Marc MERLIN @ 2018-02-07 4:29 UTC (permalink / raw) To: Adam Goryachev; +Cc: linux-raid On Wed, Feb 07, 2018 at 08:51:15AM +1100, Adam Goryachev wrote: > On 07/02/18 05:14, Marc MERLIN wrote: > > So, I have 2 drives on a 5x6TB array that have respectively 1 and 8 > > pending sectors in smart. > > > > Currently, I have a check running, but it will take a while... > > > > echo check > /sys/block/md7/md/sync_action > > md7 : active raid5 sdf1[0] sdg1[5] sdd1[3] sdh1[2] sde1[1] > > 23441561600 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU] > > [==>..................] check = 10.5% (615972996/5860390400) finish=4822.1min speed=18125K/sec > > bitmap: 3/44 pages [12KB], 65536KB chunk So, I'm a bit confused. First, I had [====>................] check = 22.5% (1321310068/5860390400) finish=3442.7min speed=21973K/sec and to recover from that mark, I have to echo 2642620136 > /sys/block/md7/md/sync_min In other words, 1321310068 is not the number you feed to sync_min, you have to double it. Then, you said I should take my LBA from # 2 Short offline Completed: read failure 90% 293 1287409520 and multiply it by 4. Does it really mean I should have used 8? I used 1287000000 * 4 5148000000 1288000000 * 4 5152000000 echo 5144000000 > /sys/block/md7/md/sync_min echo 5160000000 > /sys/block/md7/md/sync_max And the sync ran without tripping the bad block. Worse (kinda), the resync just hung once it reached 5160000000. I had to force idle to stop it. For what it's worth, the finish counter is also based on the last block of the drive, and not the value of sync_max. Minor bugs/problems? Ok, so I tried again by doubling the value: echo 10296000000 > /sys/block/md7/md/sync_min echo 10304000000 > /sys/block/md7/md/sync_max echo check > /sys/block/md7/md/sync_action This does not seem to have helped either. I'm now stuck on: Personalities : [linear] [raid0] [raid1] [raid10] [multipath] [raid6] [raid5] [raid4] md7 : active raid5 sdf1[0] sdg1[5] sdd1[3] sdh1[2] sde1[1] 23441561600 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU] [=================>...] check = 87.9% (5152000000/5860390400) finish=1977.2min speed=5970K/sec bitmap: 1/44 pages [4KB], 65536KB chunk Sync has reached max and is hung there, but without triggering the bad block. Mmmh, hitting this LBA reported in smart seems harder than it seemed. I've just reset it to running the whole disk and hope it'll hit the bad block eventually. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08 ^ permalink raw reply [flat|nested] 32+ messages in thread
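The factor of two and the missed window may both come down to units: /proc/mdstat reports
the check position in 1 KiB blocks of a single member (hence the 5860390400 total for a 6TB
member), while sync_min/sync_max take 512-byte sectors of that same per-member position,
which is why doubling the mdstat number worked. If that reading of the sysfs interface is
right, the window for a sector near LBA 1287409520 should be set around that value itself
(minus the partition start and the mdadm data offset), not LBA*4 or LBA*8, which would
explain why neither window tripped the bad sector:

echo $(( 1321310068 * 2 ))              # 2642620136: mdstat KiB position converted to sectors, as used above
echo $(( 1287409520 - 2048 - 262144 ))  # placeholder offsets: roughly where sync_min/sync_max should aim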
* Re: force remapping a pending sector in sw raid5 array 2018-02-06 18:14 force remapping a pending sector in sw raid5 array Marc MERLIN ` (2 preceding siblings ...) 2018-02-06 21:51 ` Adam Goryachev @ 2018-02-07 9:42 ` Kay Diederichs 2018-02-09 19:29 ` Marc MERLIN 3 siblings, 1 reply; 32+ messages in thread From: Kay Diederichs @ 2018-02-07 9:42 UTC (permalink / raw) To: linux-raid On 02/06/2018 07:14 PM, Marc MERLIN wrote: > So, I have 2 drives on a 5x6TB array that have respectively 1 and 8 > pending sectors in smart. > ... > # 3 Extended offline Completed: read failure 90% 289 1287409520 I have successfully used "badblocks" to overwrite bad sectors. The variety of badblocks that comes with RHEL (there may be others!) could be used with e.g. badblocks -svnb512 /dev/sdh 1287409599 1287409400 where -n Use non-destructive read-write mode. By default only a non-destructive read-only test is done. This option must not be combined with the -w option, as they are mutually exclusive. I've adjusted the last-block and first-block numbers in the command above so that they a) encompass the known bad blocks b) start and end on 4k-boundaries This command leaves those blocks intact that still can be read. After that, use a destructive-write badblocks e.g. badblocks -sfvwb512 /dev/sdh <x> <y> You'll have to adjust x and y to match just those blocks that cannot be read, based on the output of the first badblocks run. Afterwards, "smartctl -t short /dev/sdh" may clean up the SMART statistics. HTH, Kay ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array 2018-02-07 9:42 ` Kay Diederichs @ 2018-02-09 19:29 ` Marc MERLIN 2018-02-09 19:57 ` Kay Diederichs ` (2 more replies) 0 siblings, 3 replies; 32+ messages in thread From: Marc MERLIN @ 2018-02-09 19:29 UTC (permalink / raw) To: Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin; +Cc: linux-raid On Wed, Feb 07, 2018 at 10:42:39AM +0100, Kay Diederichs wrote: > I've adjusted the last-block and first-block numbers in the command > above so that they > a) encompass the known bad blocks > b) start and end on 4k-boundaries > > This command leaves those blocks intact that still can be read. > > After that, use a destructive-write badblocks e.g. > > badblocks -sfvwb512 /dev/sdh <x> <y> > You'll have to adjust x and y to match just those blocks that cannot be > read, based on the output of the first badblocks run. I will try this next, thanks (still, for learning purposes). But, I'm confused by what happened. The md check ran to completion. It found things and supposedly fixed them: [240351.053406] md/raid:md7: read error corrected (8 sectors at 9159374528 on sdf1) Strangely, it did nothing with this: [287271.959779] sd 4:4:0:0: [sdh] tag#6 Add. Sense: Unrecovered read error - auto reallocate failed The full resync/check is here: [89601.694910] md: data-check of RAID array md7 [240342.514062] ata5.02: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0 [240342.514073] ata5.02: failed command: READ FPDMA QUEUED [240342.514081] ata5.02: cmd 60/60:30:70:fc:f0/02:00:21:02:00/40 tag 6 ncq dma 311296 in [240342.514086] ata5.02: status: { DRDY ERR } [240342.514089] ata5.02: error: { UNC } [240342.515351] ata5.02: configured for UDMA/133 [240342.515470] ata5.02: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0 t4 [240342.515578] sd 4:2:0:0: [sdf] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [240342.515585] sd 4:2:0:0: [sdf] tag#6 Sense Key : Medium Error [current] [240342.515590] sd 4:2:0:0: [sdf] tag#6 Add. Sense: Unrecovered read error - auto reallocate failed [240342.515596] sd 4:2:0:0: [sdf] tag#6 CDB: Read(16) 88 00 00 00 00 02 21 f0 fc 70 00 00 02 60 00 00 [240342.515600] print_req_error: I/O error, dev sdf, sector 9159375984 [240342.515726] ata5: EH complete [240350.486141] ata5.02: exception Emask 0x0 SAct 0x30 SErr 0x0 action 0x0 [240350.486153] ata5.02: failed command: READ FPDMA QUEUED [240350.486160] ata5.02: cmd 60/08:20:c0:fe:f0/00:00:21:02:00/40 tag 4 ncq dma 4096 in [240350.486166] ata5.02: status: { DRDY ERR } [240350.486169] ata5.02: error: { UNC } [240350.487403] ata5.02: configured for UDMA/133 [240350.487450] sd 4:2:0:0: [sdf] tag#4 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [240350.487454] sd 4:2:0:0: [sdf] tag#4 Sense Key : Medium Error [current] [240350.487458] sd 4:2:0:0: [sdf] tag#4 Add. 
Sense: Unrecovered read error - auto reallocate failed [240350.487462] sd 4:2:0:0: [sdf] tag#4 CDB: Read(16) 88 00 00 00 00 02 21 f0 fe c0 00 00 00 08 00 00 [240350.487466] print_req_error: I/O error, dev sdf, sector 9159376576 [240350.487493] ata5: EH complete [240351.053406] md/raid:md7: read error corrected (8 sectors at 9159374528 on sdf1) [287271.958430] ata5.04: exception Emask 0x0 SAct 0xffc0 SErr 0x0 action 0x0 [287271.958442] ata5.04: failed command: READ FPDMA QUEUED [287271.958449] ata5.04: cmd 60/40:30:f0:d7:64/05:00:86:02:00/40 tag 6 ncq dma 688128 in [287271.958454] ata5.04: status: { DRDY ERR } [287271.958457] ata5.04: error: { UNC } [287271.959691] ata5.04: configured for UDMA/133 [287271.959770] sd 4:4:0:0: [sdh] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [287271.959775] sd 4:4:0:0: [sdh] tag#6 Sense Key : Medium Error [current] [287271.959779] sd 4:4:0:0: [sdh] tag#6 Add. Sense: Unrecovered read error - auto reallocate failed [287271.959783] sd 4:4:0:0: [sdh] tag#6 CDB: Read(16) 88 00 00 00 00 02 86 64 d7 f0 00 00 05 40 00 00 [287271.959785] print_req_error: I/O error, dev sdh, sector 10844690416 [287271.959889] ata5: EH complete [315132.651910] md: md7: data-check done. Now, the sync is comnplete, and my bad blocks are still there? myth:~# smartctl -A /dev/sdh 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 2 myth:~# smartctl -A /dev/sdf 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 7 The pending sectors should have been re-written and become Reallocated_Event_Count, no? Reading myth:~# hdparm --read-sector 287409520 /dev/sdh still gives me what looks like non garbage data (but it could be) and [315411.087451] ata5.04: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 [315411.087462] ata5.04: failed command: READ SECTOR(S) EXT [315411.087469] ata5.04: cmd 24/00:01:70:4f:bc/00:00:4c:00:00/e0 tag 0 pio 512 in [315411.087469] res 51/40:01:70:4f:bc/00:00:4c:00:00/e0 Emask 0x9 (media error) [315411.087474] ata5.04: status: { DRDY ERR } [315411.087478] ata5.04: error: { UNC } [315411.108028] ata5.04: configured for UDMA/133 [315411.108075] ata5: EH complete So, mdadm is happy allegedly, but my drives still have the same bad sectors they had (more or less). Yes, I know I should trash (return) those drives, but I still want to understand why I can't get basic block remapping working Any idea what went wrong? Thanks, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array 2018-02-09 19:29 ` Marc MERLIN @ 2018-02-09 19:57 ` Kay Diederichs 2018-02-09 20:02 ` Roger Heflin 2018-02-09 20:13 ` Phil Turmel 2 siblings, 0 replies; 32+ messages in thread From: Kay Diederichs @ 2018-02-09 19:57 UTC (permalink / raw) Cc: linux-raid On 02/09/2018 08:29 PM, Marc MERLIN wrote: > On Wed, Feb 07, 2018 at 10:42:39AM +0100, Kay Diederichs wrote: >> I've adjusted the last-block and first-block numbers in the command >> above so that they >> a) encompass the known bad blocks >> b) start and end on 4k-boundaries >> >> This command leaves those blocks intact that still can be read. >> >> After that, use a destructive-write badblocks e.g. >> >> badblocks -sfvwb512 /dev/sdh <x> <y> >> You'll have to adjust x and y to match just those blocks that cannot be >> read, based on the output of the first badblocks run. > > I will try this next, thanks (still, for learning purposes). > > But, I'm confused by what happened. The md check ran to completion. > It found things and supposedly fixed them: > [240351.053406] md/raid:md7: read error corrected (8 sectors at 9159374528 on sdf1) > > Strangely, it did nothing with this: > [287271.959779] sd 4:4:0:0: [sdh] tag#6 Add. Sense: Unrecovered read error - auto reallocate failed > > The full resync/check is here: > [89601.694910] md: data-check of RAID array md7 > [240342.514062] ata5.02: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0 > [240342.514073] ata5.02: failed command: READ FPDMA QUEUED > [240342.514081] ata5.02: cmd 60/60:30:70:fc:f0/02:00:21:02:00/40 tag 6 ncq dma 311296 in > [240342.514086] ata5.02: status: { DRDY ERR } > [240342.514089] ata5.02: error: { UNC } > [240342.515351] ata5.02: configured for UDMA/133 > [240342.515470] ata5.02: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0 t4 > [240342.515578] sd 4:2:0:0: [sdf] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > [240342.515585] sd 4:2:0:0: [sdf] tag#6 Sense Key : Medium Error [current] > [240342.515590] sd 4:2:0:0: [sdf] tag#6 Add. Sense: Unrecovered read error - auto reallocate failed > [240342.515596] sd 4:2:0:0: [sdf] tag#6 CDB: Read(16) 88 00 00 00 00 02 21 f0 fc 70 00 00 02 60 00 00 > [240342.515600] print_req_error: I/O error, dev sdf, sector 9159375984 > [240342.515726] ata5: EH complete > [240350.486141] ata5.02: exception Emask 0x0 SAct 0x30 SErr 0x0 action 0x0 > [240350.486153] ata5.02: failed command: READ FPDMA QUEUED > [240350.486160] ata5.02: cmd 60/08:20:c0:fe:f0/00:00:21:02:00/40 tag 4 ncq dma 4096 in > [240350.486166] ata5.02: status: { DRDY ERR } > [240350.486169] ata5.02: error: { UNC } > [240350.487403] ata5.02: configured for UDMA/133 > [240350.487450] sd 4:2:0:0: [sdf] tag#4 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > [240350.487454] sd 4:2:0:0: [sdf] tag#4 Sense Key : Medium Error [current] > [240350.487458] sd 4:2:0:0: [sdf] tag#4 Add. 
Sense: Unrecovered read error - auto reallocate failed > [240350.487462] sd 4:2:0:0: [sdf] tag#4 CDB: Read(16) 88 00 00 00 00 02 21 f0 fe c0 00 00 00 08 00 00 > [240350.487466] print_req_error: I/O error, dev sdf, sector 9159376576 > [240350.487493] ata5: EH complete > [240351.053406] md/raid:md7: read error corrected (8 sectors at 9159374528 on sdf1) > [287271.958430] ata5.04: exception Emask 0x0 SAct 0xffc0 SErr 0x0 action 0x0 > [287271.958442] ata5.04: failed command: READ FPDMA QUEUED > [287271.958449] ata5.04: cmd 60/40:30:f0:d7:64/05:00:86:02:00/40 tag 6 ncq dma 688128 in > [287271.958454] ata5.04: status: { DRDY ERR } > [287271.958457] ata5.04: error: { UNC } > [287271.959691] ata5.04: configured for UDMA/133 > [287271.959770] sd 4:4:0:0: [sdh] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > [287271.959775] sd 4:4:0:0: [sdh] tag#6 Sense Key : Medium Error [current] > [287271.959779] sd 4:4:0:0: [sdh] tag#6 Add. Sense: Unrecovered read error - auto reallocate failed > [287271.959783] sd 4:4:0:0: [sdh] tag#6 CDB: Read(16) 88 00 00 00 00 02 86 64 d7 f0 00 00 05 40 00 00 > [287271.959785] print_req_error: I/O error, dev sdh, sector 10844690416 > [287271.959889] ata5: EH complete > [315132.651910] md: md7: data-check done. > > Now, the sync is comnplete, and my bad blocks are still there? > myth:~# smartctl -A /dev/sdh > 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 > 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 2 > > myth:~# smartctl -A /dev/sdf > 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 > 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 7 > > The pending sectors should have been re-written and become Reallocated_Event_Count, no? > > Reading > myth:~# hdparm --read-sector 287409520 /dev/sdh > still gives me what looks like non garbage data (but it could be) and > [315411.087451] ata5.04: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 > [315411.087462] ata5.04: failed command: READ SECTOR(S) EXT > [315411.087469] ata5.04: cmd 24/00:01:70:4f:bc/00:00:4c:00:00/e0 tag 0 pio 512 in > [315411.087469] res 51/40:01:70:4f:bc/00:00:4c:00:00/e0 Emask 0x9 (media error) > [315411.087474] ata5.04: status: { DRDY ERR } > [315411.087478] ata5.04: error: { UNC } > [315411.108028] ata5.04: configured for UDMA/133 > [315411.108075] ata5: EH complete > > So, mdadm is happy allegedly, but my drives still have the same bad sectors they had > (more or less). > > Yes, I know I should trash (return) those drives, but I still want to > understand why I can't get basic block remapping working > Any idea what went wrong? > > Thanks, > Marc > In my experience, drives do not behave the same. You could try two things: a) run smartctl -t short (or even -t long) on the drives - that might update their internal bad block lists b) run the "badblocks" commands above - possibly more than once Pls report what you find. HTH Kay ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array 2018-02-09 19:29 ` Marc MERLIN 2018-02-09 19:57 ` Kay Diederichs @ 2018-02-09 20:02 ` Roger Heflin 2018-02-09 20:13 ` Phil Turmel 2 siblings, 0 replies; 32+ messages in thread From: Roger Heflin @ 2018-02-09 20:02 UTC (permalink / raw) To: Marc MERLIN; +Cc: Kay Diederichs, Andreas Klauer, Adam Goryachev, Linux RAID I would not count on it with the WD's. I have several only one has bad blocks, but some of the blocks have been re-written many times and the disk firmware still won't relocate. Some of mine I can read and get a failure, and force a rewrite, and then it will fail on the next read pass a few hours later, and again get re-written to the same block that will again go bad shortly. Whatever the firmware is doing it has too high of a threshhold or is too stupid to reliably relocate sectors even when they are obviously bad. On Fri, Feb 9, 2018 at 1:29 PM, Marc MERLIN <marc@merlins.org> wrote: > On Wed, Feb 07, 2018 at 10:42:39AM +0100, Kay Diederichs wrote: >> I've adjusted the last-block and first-block numbers in the command >> above so that they >> a) encompass the known bad blocks >> b) start and end on 4k-boundaries >> >> This command leaves those blocks intact that still can be read. >> >> After that, use a destructive-write badblocks e.g. >> >> badblocks -sfvwb512 /dev/sdh <x> <y> >> You'll have to adjust x and y to match just those blocks that cannot be >> read, based on the output of the first badblocks run. > > I will try this next, thanks (still, for learning purposes). > > But, I'm confused by what happened. The md check ran to completion. > It found things and supposedly fixed them: > [240351.053406] md/raid:md7: read error corrected (8 sectors at 9159374528 on sdf1) > > Strangely, it did nothing with this: > [287271.959779] sd 4:4:0:0: [sdh] tag#6 Add. Sense: Unrecovered read error - auto reallocate failed > > The full resync/check is here: > [89601.694910] md: data-check of RAID array md7 > [240342.514062] ata5.02: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0 > [240342.514073] ata5.02: failed command: READ FPDMA QUEUED > [240342.514081] ata5.02: cmd 60/60:30:70:fc:f0/02:00:21:02:00/40 tag 6 ncq dma 311296 in > [240342.514086] ata5.02: status: { DRDY ERR } > [240342.514089] ata5.02: error: { UNC } > [240342.515351] ata5.02: configured for UDMA/133 > [240342.515470] ata5.02: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0 t4 > [240342.515578] sd 4:2:0:0: [sdf] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > [240342.515585] sd 4:2:0:0: [sdf] tag#6 Sense Key : Medium Error [current] > [240342.515590] sd 4:2:0:0: [sdf] tag#6 Add. Sense: Unrecovered read error - auto reallocate failed > [240342.515596] sd 4:2:0:0: [sdf] tag#6 CDB: Read(16) 88 00 00 00 00 02 21 f0 fc 70 00 00 02 60 00 00 > [240342.515600] print_req_error: I/O error, dev sdf, sector 9159375984 > [240342.515726] ata5: EH complete > [240350.486141] ata5.02: exception Emask 0x0 SAct 0x30 SErr 0x0 action 0x0 > [240350.486153] ata5.02: failed command: READ FPDMA QUEUED > [240350.486160] ata5.02: cmd 60/08:20:c0:fe:f0/00:00:21:02:00/40 tag 4 ncq dma 4096 in > [240350.486166] ata5.02: status: { DRDY ERR } > [240350.486169] ata5.02: error: { UNC } > [240350.487403] ata5.02: configured for UDMA/133 > [240350.487450] sd 4:2:0:0: [sdf] tag#4 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > [240350.487454] sd 4:2:0:0: [sdf] tag#4 Sense Key : Medium Error [current] > [240350.487458] sd 4:2:0:0: [sdf] tag#4 Add. 
Sense: Unrecovered read error - auto reallocate failed > [240350.487462] sd 4:2:0:0: [sdf] tag#4 CDB: Read(16) 88 00 00 00 00 02 21 f0 fe c0 00 00 00 08 00 00 > [240350.487466] print_req_error: I/O error, dev sdf, sector 9159376576 > [240350.487493] ata5: EH complete > [240351.053406] md/raid:md7: read error corrected (8 sectors at 9159374528 on sdf1) > [287271.958430] ata5.04: exception Emask 0x0 SAct 0xffc0 SErr 0x0 action 0x0 > [287271.958442] ata5.04: failed command: READ FPDMA QUEUED > [287271.958449] ata5.04: cmd 60/40:30:f0:d7:64/05:00:86:02:00/40 tag 6 ncq dma 688128 in > [287271.958454] ata5.04: status: { DRDY ERR } > [287271.958457] ata5.04: error: { UNC } > [287271.959691] ata5.04: configured for UDMA/133 > [287271.959770] sd 4:4:0:0: [sdh] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > [287271.959775] sd 4:4:0:0: [sdh] tag#6 Sense Key : Medium Error [current] > [287271.959779] sd 4:4:0:0: [sdh] tag#6 Add. Sense: Unrecovered read error - auto reallocate failed > [287271.959783] sd 4:4:0:0: [sdh] tag#6 CDB: Read(16) 88 00 00 00 00 02 86 64 d7 f0 00 00 05 40 00 00 > [287271.959785] print_req_error: I/O error, dev sdh, sector 10844690416 > [287271.959889] ata5: EH complete > [315132.651910] md: md7: data-check done. > > Now, the sync is comnplete, and my bad blocks are still there? > myth:~# smartctl -A /dev/sdh > 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 > 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 2 > > myth:~# smartctl -A /dev/sdf > 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 > 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 7 > > The pending sectors should have been re-written and become Reallocated_Event_Count, no? > > Reading > myth:~# hdparm --read-sector 287409520 /dev/sdh > still gives me what looks like non garbage data (but it could be) and > [315411.087451] ata5.04: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 > [315411.087462] ata5.04: failed command: READ SECTOR(S) EXT > [315411.087469] ata5.04: cmd 24/00:01:70:4f:bc/00:00:4c:00:00/e0 tag 0 pio 512 in > [315411.087469] res 51/40:01:70:4f:bc/00:00:4c:00:00/e0 Emask 0x9 (media error) > [315411.087474] ata5.04: status: { DRDY ERR } > [315411.087478] ata5.04: error: { UNC } > [315411.108028] ata5.04: configured for UDMA/133 > [315411.108075] ata5: EH complete > > So, mdadm is happy allegedly, but my drives still have the same bad sectors they had > (more or less). > > Yes, I know I should trash (return) those drives, but I still want to > understand why I can't get basic block remapping working > Any idea what went wrong? > > Thanks, > Marc > -- > "A mouse is a device used to point at the xterm you want to type in" - A.S.R. > Microsoft is to operating systems .... > .... what McDonalds is to gourmet cooking > Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array 2018-02-09 19:29 ` Marc MERLIN 2018-02-09 19:57 ` Kay Diederichs 2018-02-09 20:02 ` Roger Heflin @ 2018-02-09 20:13 ` Phil Turmel 2018-02-09 20:29 ` Marc MERLIN 2018-02-10 21:43 ` Mateusz Korniak 2 siblings, 2 replies; 32+ messages in thread From: Phil Turmel @ 2018-02-09 20:13 UTC (permalink / raw) To: Marc MERLIN, Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin Cc: linux-raid Hi Marc, On 02/09/2018 02:29 PM, Marc MERLIN wrote: > But, I'm confused by what happened. The md check ran to completion. > It found things and supposedly fixed them: > [240351.053406] md/raid:md7: read error corrected (8 sectors at > 9159374528 on sdf1) > Strangely, it did nothing with this: > [287271.959779] sd 4:4:0:0: [sdh] tag#6 Add. Sense: Unrecovered read > error - auto reallocate failed > Now, the sync is comnplete, and my bad blocks are still there? > myth:~# smartctl -A /dev/sdh > 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age > Always - 0 > 197 Current_Pending_Sector 0x0032 200 200 000 Old_age > Always - 2 > > myth:~# smartctl -A /dev/sdf > 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age > Always - 0 > 197 Current_Pending_Sector 0x0032 200 200 000 Old_age > Always - 7 > > The pending sectors should have been re-written and become > Reallocated_Event_Count, no? Yes, and not necessarily. Pending sectors can be non-permanent errors -- the drive firmware will test a pending sector immediately after write to see if the write is readable. If not, it will re-allocate while it still has the write data in its buffers. Otherwise, it'll clear the pending sector. > So, mdadm is happy allegedly, but my drives still have the same bad > sectors they had (more or less). If you have bad block lists enabled in your array, MD will *never* try to fix the underlying sectors. Please show your mdadm -E reports for these devices. If necessary, stop the array and re-assemble with the options to disable bad block lists. { How this misfeature got into the kernel and enabled by default baffles me. } Also, pending sectors that are in dead zones between metadata and array data will not be accessed by a check scrub, and will therefore persist. > Yes, I know I should trash (return) those drives, Well, non-permanent read errors are not considered warranty failures. They are in the drive specs. When pending is zero and actual re-allocations are climbing (my threshold is double digits), *then* it's time to replace. > but I still want to understand why I can't get basic block remapping > working Any idea what went wrong? Invalid expectations, perhaps. Phil ^ permalink raw reply [flat|nested] 32+ messages in thread
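A sketch of inspecting and, if desired, dropping md's bad-block log; --examine-badblocks
and --update=force-no-bbl need a reasonably recent mdadm, and force-no-bbl discards a
non-empty list, so it only makes sense when the next check/repair is going to rewrite
those sectors from parity anyway:

mdadm --examine-badblocks /dev/sdh1      # list what md has recorded for this member
mdadm --stop /dev/md7
mdadm --assemble /dev/md7 --update=force-no-bbl /dev/sd[defgh]1
echo check > /sys/block/md7/md/sync_action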
* Re: force remapping a pending sector in sw raid5 array 2018-02-09 20:13 ` Phil Turmel @ 2018-02-09 20:29 ` Marc MERLIN 2018-02-09 20:44 ` Phil Turmel ` (2 more replies) 2018-02-10 21:43 ` Mateusz Korniak 1 sibling, 3 replies; 32+ messages in thread From: Marc MERLIN @ 2018-02-09 20:29 UTC (permalink / raw) To: Phil Turmel Cc: Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid On Fri, Feb 09, 2018 at 03:13:26PM -0500, Phil Turmel wrote: > > The pending sectors should have been re-written and become > > Reallocated_Event_Count, no? > > Yes, and not necessarily. Pending sectors can be non-permanent errors > -- the drive firmware will test a pending sector immediately after write > to see if the write is readable. If not, it will re-allocate while it > still has the write data in its buffers. Otherwise, it'll clear the > pending sector. This shows the sector is still bad though, right? myth:~# hdparm --read-sector 1287409520 /dev/sdh /dev/sdh: reading sector 1287409520: SG_IO: bad/missing sense data, sb[]: 70 00 03 00 00 00 00 0a 40 51 e0 01 11 04 00 00 a0 70 00 00 00 00 00 00 00 00 00 00 00 00 00 00 succeeded 7000 0b54 92c4 ffff 0000 0000 01fe 0000 (...) [ 2572.139404] ata5.04: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 [ 2572.139419] ata5.04: failed command: READ SECTOR(S) EXT [ 2572.139427] ata5.04: cmd 24/00:01:70:4f:bc/00:00:4c:00:00/e0 tag 28 pio 512 in [ 2572.139427] res 51/40:01:70:4f:bc/00:00:4c:00:00/e0 Emask 0x9 (media error) [ 2572.139431] ata5.04: status: { DRDY ERR } [ 2572.139435] ata5.04: error: { UNC } [ 2572.162369] ata5.04: configured for UDMA/133 [ 2572.162414] ata5: EH complete mdadm also said it found 6 bad sectors and rewrote them (or something like that) and it's happy. So alledgely it did something, but smart does not agree (yet?) I'm now running a long smart test on all drives, will see if numbers change. Mmmh, and I just ran myth:~# badblocks -fsvnb512 /dev/sdh 1287409599 1287409400 below, and I don't quite understand what's going on. > > So, mdadm is happy allegedly, but my drives still have the same bad > > sectors they had (more or less). > > If you have bad block lists enabled in your array, MD will *never* try > to fix the underlying sectors. Please show your mdadm -E reports for > these devices. If necessary, stop the array and re-assemble with the > options to disable bad block lists. { How this misfeature got into the > kernel and enabled by default baffles me. } This means I dont have bad block lists? myth:~# mdadm -E /dev/sdd e f g h all return /dev/sdd: MBR Magic : aa55 Partition[0] : 4294967295 sectors at 1 (type ee) > Also, pending sectors that are in dead zones between metadata and array > data will not be accessed by a check scrub, and will therefore persist. That's a good point, but then I would never have discovered those blocks while initializing the array. > > Yes, I know I should trash (return) those drives, > > Well, non-permanent read errors are not considered warranty failures. > They are in the drive specs. When pending is zero and actual > re-allocations are climbing (my threshold is double digits), *then* it's > time to replace. I think it's worse here. Read errors are not being cleared by block rewrites? Those are brand "new" (but really remanufactured) drives. So far I'm not liking what I'm seeing and I'm very close to just returning them all and getting some less dodgy ones. Sad because the last set of 5 I got from a similar source, have worked beautifully. Let's see what a full smart scan does. 
I may also use hdparm --write-sector to just fill those bad blocks with 0's now that it seems that mdadm isn't caring about/using them anymore? Now, badblocks perplexes me even more. Shouldn't -n re-write blocks? myth:~# badblocks -fsvnb512 /dev/sdh 1287409599 1287409400 /dev/sdh is apparently in use by the system; badblocks forced anyway. Checking for bad blocks in non-destructive read-write mode From block 1287409400 to 1287409599 Checking for bad blocks (non-destructive read-write test) Testing with random pattern: 1287409520ne, 0:14 elapsed. (0/0/0 errors) 1287409521ne, 0:18 elapsed. (1/0/0 errors) 1287409522ne, 0:23 elapsed. (2/0/0 errors) 1287409523ne, 0:27 elapsed. (3/0/0 errors) 1287409524ne, 0:31 elapsed. (4/0/0 errors) 1287409525ne, 0:36 elapsed. (5/0/0 errors) 1287409526ne, 0:40 elapsed. (6/0/0 errors) 1287409527ne, 0:44 elapsed. (7/0/0 errors) done Pass completed, 8 bad blocks found. (8/0/0 errors) Badblocks found 8 bad blocks, but didn't rewrite them, or failed to, or succeeded but that did nothing anyway? Do I understand that 1) badblocks got read errors 2) it's supposed to rewrite the blocks with new data (or not?) 3) auto reallocate failed [ 3171.717001] ata5.04: exception Emask 0x0 SAct 0x40 SErr 0x0 action 0x0 [ 3171.717012] ata5.04: failed command: READ FPDMA QUEUED [ 3171.717019] ata5.04: cmd 60/08:30:70:4f:bc/00:00:4c:00:00/40 tag 6 ncq dma 4096 in [ 3171.717019] res 41/40:00:70:4f:bc/00:00:4c:00:00/00 Emask 0x409 (media error) <F> [ 3171.717031] ata5.04: status: { DRDY ERR } [ 3171.717034] ata5.04: error: { UNC } [ 3171.718293] ata5.04: configured for UDMA/133 [ 3171.718342] sd 4:4:0:0: [sdh] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [ 3171.718349] sd 4:4:0:0: [sdh] tag#6 Sense Key : Medium Error [current] [ 3171.718354] sd 4:4:0:0: [sdh] tag#6 Add. Sense: Unrecovered read error - auto reallocate failed [ 3171.718360] sd 4:4:0:0: [sdh] tag#6 CDB: Read(16) 88 00 00 00 00 00 4c bc 4f 70 00 00 00 08 00 00 [ 3171.718364] print_req_error: I/O error, dev sdh, sector 1287409520 [ 3171.718369] Buffer I/O error on dev sdh, logical block 160926190, async page read [ 3171.718393] ata5: EH complete [ 3176.092946] ata5.04: exception Emask 0x0 SAct 0x400000 SErr 0x0 action 0x0 [ 3176.092958] ata5.04: failed command: READ FPDMA QUEUED [ 3176.092973] ata5.04: cmd 60/08:b0:70:4f:bc/00:00:4c:00:00/40 tag 22 ncq dma 4096 in [ 3176.092973] res 41/40:00:70:4f:bc/00:00:4c:00:00/00 Emask 0x409 (media error) <F> [ 3176.092978] ata5.04: status: { DRDY ERR } [ 3176.092981] ata5.04: error: { UNC } [ 3176.094237] ata5.04: configured for UDMA/133 [ 3176.094285] sd 4:4:0:0: [sdh] tag#22 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [ 3176.094291] sd 4:4:0:0: [sdh] tag#22 Sense Key : Medium Error [current] [ 3176.094296] sd 4:4:0:0: [sdh] tag#22 Add. 
Sense: Unrecovered read error - auto reallocate failed [ 3176.094302] sd 4:4:0:0: [sdh] tag#22 CDB: Read(16) 88 00 00 00 00 00 4c bc 4f 70 00 00 00 08 00 00 [ 3176.094306] print_req_error: I/O error, dev sdh, sector 1287409520 [ 3176.094310] Buffer I/O error on dev sdh, logical block 160926190, async page read [ 3176.094324] ata5: EH complete [ 3180.488899] ata5.04: exception Emask 0x0 SAct 0x100 SErr 0x0 action 0x0 [ 3180.488909] ata5.04: failed command: READ FPDMA QUEUED [ 3180.488916] ata5.04: cmd 60/08:40:70:4f:bc/00:00:4c:00:00/40 tag 8 ncq dma 4096 in [ 3180.488916] res 41/40:00:70:4f:bc/00:00:4c:00:00/00 Emask 0x409 (media error) <F> [ 3180.488928] ata5.04: status: { DRDY ERR } [ 3180.488931] ata5.04: error: { UNC } [ 3180.490193] ata5.04: configured for UDMA/133 [ 3180.490243] sd 4:4:0:0: [sdh] tag#8 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [ 3180.490249] sd 4:4:0:0: [sdh] tag#8 Sense Key : Medium Error [current] [ 3180.490254] sd 4:4:0:0: [sdh] tag#8 Add. Sense: Unrecovered read error - auto reallocate failed [ 3180.490259] sd 4:4:0:0: [sdh] tag#8 CDB: Read(16) 88 00 00 00 00 00 4c bc 4f 70 00 00 00 08 00 00 [ 3180.490263] print_req_error: I/O error, dev sdh, sector 1287409520 [ 3180.490268] Buffer I/O error on dev sdh, logical block 160926190, async page read [ 3180.490290] ata5: EH complete [ 3184.873146] ata5.04: exception Emask 0x0 SAct 0x1000000 SErr 0x0 action 0x0 [ 3184.873161] ata5.04: failed command: READ FPDMA QUEUED [ 3184.873175] ata5.04: cmd 60/08:c0:70:4f:bc/00:00:4c:00:00/40 tag 24 ncq dma 4096 in [ 3184.873175] res 41/40:00:70:4f:bc/00:00:4c:00:00/00 Emask 0x409 (media error) <F> [ 3184.873181] ata5.04: status: { DRDY ERR } [ 3184.873184] ata5.04: error: { UNC } [ 3184.874437] ata5.04: configured for UDMA/133 [ 3184.874488] sd 4:4:0:0: [sdh] tag#24 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [ 3184.874495] sd 4:4:0:0: [sdh] tag#24 Sense Key : Medium Error [current] [ 3184.874500] sd 4:4:0:0: [sdh] tag#24 Add. Sense: Unrecovered read error - auto reallocate failed [ 3184.874506] sd 4:4:0:0: [sdh] tag#24 CDB: Read(16) 88 00 00 00 00 00 4c bc 4f 70 00 00 00 08 00 00 [ 3184.874510] print_req_error: I/O error, dev sdh, sector 1287409520 [ 3184.874515] Buffer I/O error on dev sdh, logical block 160926190, async page read [ 3184.874555] ata5: EH complete -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08 ^ permalink raw reply [flat|nested] 32+ messages in thread
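If the goal is just to make the firmware take a write over the bad spot, one hedged
alternative to --write-sector is to rewrite the whole aligned 4 KiB physical sector in one
go. This destroys whatever was in those 8 LBAs, so only with the array stopped, and only
if you're prepared to rebuild that member's contents afterwards (md has no way to know the
sector was zeroed):

mdadm --stop /dev/md7
dd if=/dev/zero of=/dev/sdh bs=4096 seek=160926190 count=1 oflag=direct conv=fsync
smartctl -A /dev/sdh | egrep 'Reallocated|Pending'   # did the counters move?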
* Re: force remapping a pending sector in sw raid5 array 2018-02-09 20:29 ` Marc MERLIN @ 2018-02-09 20:44 ` Phil Turmel 2018-02-09 21:22 ` Marc MERLIN 2018-02-09 20:52 ` Kay Diederichs 2018-02-09 21:17 ` Kay Diederichs 2 siblings, 1 reply; 32+ messages in thread From: Phil Turmel @ 2018-02-09 20:44 UTC (permalink / raw) To: Marc MERLIN Cc: Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid On 02/09/2018 03:29 PM, Marc MERLIN wrote: > On Fri, Feb 09, 2018 at 03:13:26PM -0500, Phil Turmel wrote: >>> The pending sectors should have been re-written and become >>> Reallocated_Event_Count, no? >> >> Yes, and not necessarily. Pending sectors can be non-permanent errors >> -- the drive firmware will test a pending sector immediately after write >> to see if the write is readable. If not, it will re-allocate while it >> still has the write data in its buffers. Otherwise, it'll clear the >> pending sector. > > This shows the sector is still bad though, right? > > myth:~# hdparm --read-sector 1287409520 /dev/sdh > /dev/sdh: > reading sector 1287409520: SG_IO: bad/missing sense data, sb[]: 70 00 03 00 00 00 00 0a 40 51 e0 01 11 04 00 00 a0 70 00 00 00 00 00 00 00 00 00 00 00 00 00 00 succeeded > 7000 0b54 92c4 ffff 0000 0000 01fe 0000 > (...) > > [ 2572.139404] ata5.04: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 > [ 2572.139419] ata5.04: failed command: READ SECTOR(S) EXT > [ 2572.139427] ata5.04: cmd 24/00:01:70:4f:bc/00:00:4c:00:00/e0 tag 28 pio 512 in > [ 2572.139427] res 51/40:01:70:4f:bc/00:00:4c:00:00/e0 Emask 0x9 (media error) > [ 2572.139431] ata5.04: status: { DRDY ERR } > [ 2572.139435] ata5.04: error: { UNC } > [ 2572.162369] ata5.04: configured for UDMA/133 > [ 2572.162414] ata5: EH complete Yes. Those sectors are still pending. > mdadm also said it found 6 bad sectors and rewrote them (or something like that) > and it's happy. So alledgely it did something, but smart does not agree (yet?) Like I said, mdadm "check" won't fix sectors that it has recorded as bad, and doesn't even look at sectors outside its data area. > I'm now running a long smart test on all drives, will see if numbers change. Self tests in the drives don't fix pending sectors, as they don't have the correct data to write. That's why they can only be fixed by an upper layer providing the data (during write). > Mmmh, and I just ran > myth:~# badblocks -fsvnb512 /dev/sdh 1287409599 1287409400 > below, and I don't quite understand what's going on. I'm not talking about the badblocks command. I'm talking about the bad block logging feature of MD. > This means I dont have bad block lists? > myth:~# mdadm -E /dev/sdd e f g h all return > /dev/sdd: > MBR Magic : aa55 > Partition[0] : 4294967295 sectors at 1 (type ee) This means nothing. Please run mdadm -E on the *member devices*. That means include the partition number if you are using partitions. See the output of mdadm -D /dev/mdX for an array's list of *members*. >> Well, non-permanent read errors are not considered warranty failures. >> They are in the drive specs. When pending is zero and actual >> re-allocations are climbing (my threshold is double digits), *then* it's >> time to replace. > > I think it's worse here. Read errors are not being cleared by block rewrites? > Those are brand "new" (but really remanufactured) drives. > So far I'm not liking what I'm seeing and I'm very close to just > returning them all and getting some less dodgy ones. How do you know that these sectors have been re-written? 
Let me repeat: MD will *not* write to blocks that it has recorded as bad in *its* bad block list, and doesn't even read non-data-area blocks during a check. > Sad because the last set of 5 I got from a similar source, have worked > beautifully. I'm not convinced these drives aren't working beautifully. > Let's see what a full smart scan does. > I may also use hdparm --write-sector to just fill those bad blocks with 0's > now that it seems that mdadm isn't caring about/using them anymore? > > Now, badblocks perplexes me even more. Shouldn't -n re-write blocks? > > myth:~# badblocks -fsvnb512 /dev/sdh 1287409599 1287409400 > /dev/sdh is apparently in use by the system; badblocks forced anyway. This should have been a hint that you shouldn't be using the badblocks utility on a running array's devices. > Badblocks found 8 bad blocks, but didn't rewrite them, or failed to, or > succeeded but that did nothing anyway? > > Do I understand that > 1) badblocks got read errors Yes. > 2) it's supposed to rewrite the blocks with new data (or not?) No. > 3) auto reallocate failed Don't know. You haven't provided the information needed to say. Phil ^ permalink raw reply [flat|nested] 32+ messages in thread
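For readers following along, here is a minimal sketch of the inspection Phil is asking for, using the array and member names from earlier in this thread (/dev/md7 with members such as /dev/sdh1); adjust the names to your own setup:

  # list the array's members
  mdadm -D /dev/md7
  # examine each member's superblock and look for the "Bad Block Log" line
  mdadm -E /dev/sdh1 | grep -i 'bad block'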
* Re: force remapping a pending sector in sw raid5 array 2018-02-09 20:44 ` Phil Turmel @ 2018-02-09 21:22 ` Marc MERLIN 2018-02-09 22:07 ` Wol's lists 0 siblings, 1 reply; 32+ messages in thread From: Marc MERLIN @ 2018-02-09 21:22 UTC (permalink / raw) To: Phil Turmel, Kay Diederichs Cc: Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid On Fri, Feb 09, 2018 at 03:44:56PM -0500, Phil Turmel wrote: > > myth:~# mdadm -E /dev/sdd e f g h all return > > /dev/sdd: > > MBR Magic : aa55 > > Partition[0] : 4294967295 sectors at 1 (type ee) > > This means nothing. Please run mdadm -E on the *member devices*. That > means include the partition number if you are using partitions. See the > output of mdadm -D /dev/mdX for an array's list of *members*. Ooops, I knew better, sorry about that (I use --examine usually) As you guessed, there it is: Bad Block Log : 512 entries available at offset 72 sectors - bad blocks present. So it knows about the bad blocks, skips over them during check/rewrite and that's why they never got rewritten. I can see why this could be helpful in some way, but yeah, that confused me until now. Thanks for pointing that out to me. > > I think it's worse here. Read errors are not being cleared by block rewrites? > > Those are brand "new" (but really remanufactured) drives. > > So far I'm not liking what I'm seeing and I'm very close to just > > returning them all and getting some less dodgy ones. > > How do you know that these sectors have been re-written? Let me repeat: > MD will *not* write to blocks that it has recorded as bad in *its* bad > block list, and doesn't even read non-data-area blocks during a check. Right, got it. > > Sad because the last set of 5 I got from a similar source, have worked > > beautifully. > > I'm not convinced these drives aren't working beautifully. Would you say it's acceptable for a drive nowadays to come with pending sectors as soon as you use it? Yes, I understand I can get them re-allocated and once too many get reallocated, things get incrementally bad, but my bar so far as been that by the time a drive is starting to re-allocate sectors, I should start watching it closely. If it does this out of the box, then it shouldn't have passed QA and been shipped to me to start with. Maybe it's the problem of how many dead pixels are acceptable on a 4K LCD? > > myth:~# badblocks -fsvnb512 /dev/sdh 1287409599 1287409400 > > /dev/sdh is apparently in use by the system; badblocks forced anyway. > > This should have been a hint that you shouldn't be using the badblocks > utility on a running array's devices. I knew I was doing that, we already established that those blocks are not being used by the array itself because they're in the md bad block skip list, no? But ok, point taken, bad practise, I'll stop the array first next time. On Fri, Feb 09, 2018 at 09:52:38PM +0100, Kay Diederichs wrote: > > From block 1287409400 to 1287409599 > > Checking for bad blocks (non-destructive read-write test) > > Testing with random pattern: 1287409520ne, 0:14 elapsed. (0/0/0 errors) > > 1287409521ne, 0:18 elapsed. (1/0/0 errors) > > 1287409522ne, 0:23 elapsed. (2/0/0 errors) > > 1287409523ne, 0:27 elapsed. (3/0/0 errors) > > 1287409524ne, 0:31 elapsed. (4/0/0 errors) > > 1287409525ne, 0:36 elapsed. (5/0/0 errors) > > 1287409526ne, 0:40 elapsed. (6/0/0 errors) > > 1287409527ne, 0:44 elapsed. (7/0/0 errors) > > done > > Pass completed, 8 bad blocks found. 
(8/0/0 errors) > > What you write about the result of > badblocks -fsvnb512 /dev/sdh 1287409599 1287409400 > is the expected behavior. -n means that it will _not_ write sectors that > it cannot read (because that would remove the possibility that data from > these sectors could be recovered by more tries). > > As I wrote, you have to use the -w option instead of -n, and use x and y > of 1287409527 1287409520 Right. Just had a very short night, so I'm not doing my best thinking right now :) myth:~# badblocks -fsvwb512 /dev/sdh 1287409527 1287409520 /dev/sdh is apparently in use by the system; badblocks forced anyway. Checking for bad blocks in read-write mode From block 1287409520 to 1287409527 Testing with pattern 0xaa: done Reading and comparing: done Testing with pattern 0x55: done Reading and comparing: done Testing with pattern 0xff: done Reading and comparing: done Testing with pattern 0x00: done Reading and comparing: done Pass completed, 0 bad blocks found. (0/0/0 errors) I'm a bit confused as to why bad blocks needs to work in reverse sector order, but it worked. Before: 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 2 After: 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 1 So, that fixed one sector, and somehow the drive decided it didn't need to be re-allocated. Interesting. I figured once a sector went pending once, it would not actually be re-used and be remapped on the next write. Seems like it didn't happen here. Either way, thanks all for you help, let me poke at it a bit more. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 32+ messages in thread
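A quick note on the "reverse sector order" puzzlement above: badblocks takes its range as "last_block first_block" on the command line, so the larger number coming first is just the expected argument order, not a backwards scan. An alternative, more surgical rewrite of the same eight sectors (equally destructive to their contents, and only sane with the member out of the array) would be something like this hypothetical loop:

  # sectors 1287409520..1287409527, as reported by SMART/dmesg above
  for s in $(seq 1287409520 1287409527); do
      hdparm --yes-i-know-what-i-am-doing --write-sector $s /dev/sdh
  done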
* Re: force remapping a pending sector in sw raid5 array 2018-02-09 21:22 ` Marc MERLIN @ 2018-02-09 22:07 ` Wol's lists 2018-02-09 22:36 ` Marc MERLIN 0 siblings, 1 reply; 32+ messages in thread From: Wol's lists @ 2018-02-09 22:07 UTC (permalink / raw) To: Marc MERLIN, Phil Turmel, Kay Diederichs Cc: Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid On 09/02/18 21:22, Marc MERLIN wrote: > Interesting. I figured once a sector went pending once, it would not actually be re-used and > be remapped on the next write. Seems like it didn't happen here. Because there's all sorts of reasons a sector can go pending. My favourite example is to compare it to DRAM. DRAM needs refreshing every couple of seconds, otherwise it loses its contents and cannot be read, but it's perfectly okay to rewrite and re-use it. Likewise, the magnetism in a drive can decay such that the data is unreadable, but there's nothing actually wrong with the drive. (If the data next door is repeatedly rewritten, the rewrite can "leak" and trash nearby data ...) The decay time for that should be years. The problem of course is when the problem has a decay time measured in minutes or hours. The rewrite succeeds, so the sector doesn't get remapped, but when you next read it it has died :-( Cheers, Wol ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array 2018-02-09 22:07 ` Wol's lists @ 2018-02-09 22:36 ` Marc MERLIN 0 siblings, 0 replies; 32+ messages in thread From: Marc MERLIN @ 2018-02-09 22:36 UTC (permalink / raw) To: Wol's lists Cc: Phil Turmel, Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid On Fri, Feb 09, 2018 at 10:07:57PM +0000, Wol's lists wrote: > On 09/02/18 21:22, Marc MERLIN wrote: > >Interesting. I figured once a sector went pending once, it would not > >actually be re-used and > >be remapped on the next write. Seems like it didn't happen here. > > Because there's all sorts of reasons a sector can go pending. > > My favourite example is to compare it to DRAM. DRAM needs refreshing > every couple of seconds, otherwise it loses its contents and cannot be > read, but it's perfectly okay to rewrite and re-use it. You're correct. The density of drives is so high now that writing a block affects the ones around it. > Likewise, the magnetism in a drive can decay such that the data is > unreadable, but there's nothing actually wrong with the drive. (If the > data next door is repeatedly rewritten, the rewrite can "leak" and trash > nearby data ...) The decay time for that should be years. Right. That's why I'm unhappy that it happened within a week of unpacking the drives and 2 out of 5 had problems already. > The problem of course is when the problem has a decay time measured in > minutes or hours. The rewrite succeeds, so the sector doesn't get > remapped, but when you next read it it has died :-( Speaking of this, I still haven't gotten the drive to actually remap anything yet. On that 2nd drive, I'm seeing 7 pending sectors, and can't trigger any error or remapping on them: 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 7 # 1 Short offline Completed: read failure 90% 519 569442000 # 2 Short offline Completed: read failure 90% 519 569442000 # 3 Extended offline Completed: read failure 90% 518 569442000 # 4 Short offline Completed without error 00% 508 - # 5 Short offline Completed without error 00% 484 - # 6 Short offline Completed without error 00% 460 - # 7 Short offline Completed without error 00% 436 - # 8 Short offline Completed: read failure 90% 413 569441985 # 9 Extended offline Completed: read failure 90% 409 569441990 #10 Extended offline Completed: read failure 90% 409 569441985 #11 Extended offline Completed: read failure 90% 409 569441991 #12 Extended offline Completed: read failure 90% 409 569441985 So, running badblocks over that range should help, right? But no, I get nothing: myth:~# badblocks -fsvn -b512 /dev/sdf 569942000 569001000 /dev/sdf is apparently in use by the system; badblocks forced anyway. Checking for bad blocks in non-destructive read-write mode From block 569001000 to 569942000 Checking for bad blocks (non-destructive read-write test) Testing with random pattern: done Pass completed, 0 bad blocks found. (0/0/0 errors) In some way, unless I'm reading the wrong blocks, that would mean the blocks are good again? But smart still shows 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 7 and a short offline test immediately shows # 1 Short offline Completed: read failure 90% 519 569442000 Clearly, I still have some things to learn. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... 
what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 32+ messages in thread
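One way to cross-check a failing self-test like the one above, assuming the LBA that SMART reports is an absolute 512-byte sector number (which is what both commands below expect on a drive with 512-byte logical sectors):

  # probe the exact sector the short test complains about
  hdparm --read-sector 569442000 /dev/sdf
  # or force an uncached read through the block layer
  dd if=/dev/sdf bs=512 skip=569442000 count=1 iflag=direct of=/dev/null

If these reads succeed while the drive's own self-test keeps failing at the same LBA, the inconsistency is inside the drive, not something md can fix.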
* Re: force remapping a pending sector in sw raid5 array 2018-02-09 20:29 ` Marc MERLIN 2018-02-09 20:44 ` Phil Turmel @ 2018-02-09 20:52 ` Kay Diederichs 2018-02-11 20:52 ` Roger Heflin 2018-02-09 21:17 ` Kay Diederichs 2 siblings, 1 reply; 32+ messages in thread From: Kay Diederichs @ 2018-02-09 20:52 UTC (permalink / raw) To: Marc MERLIN, Phil Turmel Cc: Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid [-- Attachment #1: Type: text/plain, Size: 9361 bytes --] Am 09/02/18 um 21:29 schrieb Marc MERLIN: > On Fri, Feb 09, 2018 at 03:13:26PM -0500, Phil Turmel wrote: >>> The pending sectors should have been re-written and become >>> Reallocated_Event_Count, no? >> >> Yes, and not necessarily. Pending sectors can be non-permanent errors >> -- the drive firmware will test a pending sector immediately after write >> to see if the write is readable. If not, it will re-allocate while it >> still has the write data in its buffers. Otherwise, it'll clear the >> pending sector. > > This shows the sector is still bad though, right? > > myth:~# hdparm --read-sector 1287409520 /dev/sdh > /dev/sdh: > reading sector 1287409520: SG_IO: bad/missing sense data, sb[]: 70 00 03 00 00 00 00 0a 40 51 e0 01 11 04 00 00 a0 70 00 00 00 00 00 00 00 00 00 00 00 00 00 00 succeeded > 7000 0b54 92c4 ffff 0000 0000 01fe 0000 > (...) > > [ 2572.139404] ata5.04: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 > [ 2572.139419] ata5.04: failed command: READ SECTOR(S) EXT > [ 2572.139427] ata5.04: cmd 24/00:01:70:4f:bc/00:00:4c:00:00/e0 tag 28 pio 512 in > [ 2572.139427] res 51/40:01:70:4f:bc/00:00:4c:00:00/e0 Emask 0x9 (media error) > [ 2572.139431] ata5.04: status: { DRDY ERR } > [ 2572.139435] ata5.04: error: { UNC } > [ 2572.162369] ata5.04: configured for UDMA/133 > [ 2572.162414] ata5: EH complete > > mdadm also said it found 6 bad sectors and rewrote them (or something like that) > and it's happy. So alledgely it did something, but smart does not agree (yet?) > > I'm now running a long smart test on all drives, will see if numbers change. > > Mmmh, and I just ran > myth:~# badblocks -fsvnb512 /dev/sdh 1287409599 1287409400 > below, and I don't quite understand what's going on. > >>> So, mdadm is happy allegedly, but my drives still have the same bad >>> sectors they had (more or less). >> >> If you have bad block lists enabled in your array, MD will *never* try >> to fix the underlying sectors. Please show your mdadm -E reports for >> these devices. If necessary, stop the array and re-assemble with the >> options to disable bad block lists. { How this misfeature got into the >> kernel and enabled by default baffles me. } > > This means I dont have bad block lists? > myth:~# mdadm -E /dev/sdd e f g h all return > /dev/sdd: > MBR Magic : aa55 > Partition[0] : 4294967295 sectors at 1 (type ee) > >> Also, pending sectors that are in dead zones between metadata and array >> data will not be accessed by a check scrub, and will therefore persist. > > That's a good point, but then I would never have discovered those blocks > while initializing the array. > >>> Yes, I know I should trash (return) those drives, >> >> Well, non-permanent read errors are not considered warranty failures. >> They are in the drive specs. When pending is zero and actual >> re-allocations are climbing (my threshold is double digits), *then* it's >> time to replace. > > I think it's worse here. Read errors are not being cleared by block rewrites? > Those are brand "new" (but really remanufactured) drives. 
> So far I'm not liking what I'm seeing and I'm very close to just > returning them all and getting some less dodgy ones. > > Sad because the last set of 5 I got from a similar source, have worked > beautifully. > > Let's see what a full smart scan does. > I may also use hdparm --write-sector to just fill those bad blocks with 0's > now that it seems that mdadm isn't caring about/using them anymore? > > Now, badblocks perplexes me even more. Shouldn't -n re-write blocks? > > myth:~# badblocks -fsvnb512 /dev/sdh 1287409599 1287409400 > /dev/sdh is apparently in use by the system; badblocks forced anyway. > Checking for bad blocks in non-destructive read-write mode > From block 1287409400 to 1287409599 > Checking for bad blocks (non-destructive read-write test) > Testing with random pattern: 1287409520ne, 0:14 elapsed. (0/0/0 errors) > 1287409521ne, 0:18 elapsed. (1/0/0 errors) > 1287409522ne, 0:23 elapsed. (2/0/0 errors) > 1287409523ne, 0:27 elapsed. (3/0/0 errors) > 1287409524ne, 0:31 elapsed. (4/0/0 errors) > 1287409525ne, 0:36 elapsed. (5/0/0 errors) > 1287409526ne, 0:40 elapsed. (6/0/0 errors) > 1287409527ne, 0:44 elapsed. (7/0/0 errors) > done > Pass completed, 8 bad blocks found. (8/0/0 errors) > > Badblocks found 8 bad blocks, but didn't rewrite them, or failed to, or > succeeded but that did nothing anyway? > > Do I understand that > 1) badblocks got read errors > 2) it's supposed to rewrite the blocks with new data (or not?) > 3) auto reallocate failed > > > [ 3171.717001] ata5.04: exception Emask 0x0 SAct 0x40 SErr 0x0 action 0x0 > [ 3171.717012] ata5.04: failed command: READ FPDMA QUEUED > [ 3171.717019] ata5.04: cmd 60/08:30:70:4f:bc/00:00:4c:00:00/40 tag 6 ncq dma 4096 in > [ 3171.717019] res 41/40:00:70:4f:bc/00:00:4c:00:00/00 Emask 0x409 (media error) <F> > [ 3171.717031] ata5.04: status: { DRDY ERR } > [ 3171.717034] ata5.04: error: { UNC } > [ 3171.718293] ata5.04: configured for UDMA/133 > [ 3171.718342] sd 4:4:0:0: [sdh] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > [ 3171.718349] sd 4:4:0:0: [sdh] tag#6 Sense Key : Medium Error [current] > [ 3171.718354] sd 4:4:0:0: [sdh] tag#6 Add. Sense: Unrecovered read error - auto reallocate failed > [ 3171.718360] sd 4:4:0:0: [sdh] tag#6 CDB: Read(16) 88 00 00 00 00 00 4c bc 4f 70 00 00 00 08 00 00 > [ 3171.718364] print_req_error: I/O error, dev sdh, sector 1287409520 > [ 3171.718369] Buffer I/O error on dev sdh, logical block 160926190, async page read > [ 3171.718393] ata5: EH complete > [ 3176.092946] ata5.04: exception Emask 0x0 SAct 0x400000 SErr 0x0 action 0x0 > [ 3176.092958] ata5.04: failed command: READ FPDMA QUEUED > [ 3176.092973] ata5.04: cmd 60/08:b0:70:4f:bc/00:00:4c:00:00/40 tag 22 ncq dma 4096 in > [ 3176.092973] res 41/40:00:70:4f:bc/00:00:4c:00:00/00 Emask 0x409 (media error) <F> > [ 3176.092978] ata5.04: status: { DRDY ERR } > [ 3176.092981] ata5.04: error: { UNC } > [ 3176.094237] ata5.04: configured for UDMA/133 > [ 3176.094285] sd 4:4:0:0: [sdh] tag#22 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > [ 3176.094291] sd 4:4:0:0: [sdh] tag#22 Sense Key : Medium Error [current] > [ 3176.094296] sd 4:4:0:0: [sdh] tag#22 Add. 
Sense: Unrecovered read error - auto reallocate failed > [ 3176.094302] sd 4:4:0:0: [sdh] tag#22 CDB: Read(16) 88 00 00 00 00 00 4c bc 4f 70 00 00 00 08 00 00 > [ 3176.094306] print_req_error: I/O error, dev sdh, sector 1287409520 > [ 3176.094310] Buffer I/O error on dev sdh, logical block 160926190, async page read > [ 3176.094324] ata5: EH complete > [ 3180.488899] ata5.04: exception Emask 0x0 SAct 0x100 SErr 0x0 action 0x0 > [ 3180.488909] ata5.04: failed command: READ FPDMA QUEUED > [ 3180.488916] ata5.04: cmd 60/08:40:70:4f:bc/00:00:4c:00:00/40 tag 8 ncq dma 4096 in > [ 3180.488916] res 41/40:00:70:4f:bc/00:00:4c:00:00/00 Emask 0x409 (media error) <F> > [ 3180.488928] ata5.04: status: { DRDY ERR } > [ 3180.488931] ata5.04: error: { UNC } > [ 3180.490193] ata5.04: configured for UDMA/133 > [ 3180.490243] sd 4:4:0:0: [sdh] tag#8 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > [ 3180.490249] sd 4:4:0:0: [sdh] tag#8 Sense Key : Medium Error [current] > [ 3180.490254] sd 4:4:0:0: [sdh] tag#8 Add. Sense: Unrecovered read error - auto reallocate failed > [ 3180.490259] sd 4:4:0:0: [sdh] tag#8 CDB: Read(16) 88 00 00 00 00 00 4c bc 4f 70 00 00 00 08 00 00 > [ 3180.490263] print_req_error: I/O error, dev sdh, sector 1287409520 > [ 3180.490268] Buffer I/O error on dev sdh, logical block 160926190, async page read > [ 3180.490290] ata5: EH complete > [ 3184.873146] ata5.04: exception Emask 0x0 SAct 0x1000000 SErr 0x0 action 0x0 > [ 3184.873161] ata5.04: failed command: READ FPDMA QUEUED > [ 3184.873175] ata5.04: cmd 60/08:c0:70:4f:bc/00:00:4c:00:00/40 tag 24 ncq dma 4096 in > [ 3184.873175] res 41/40:00:70:4f:bc/00:00:4c:00:00/00 Emask 0x409 (media error) <F> > [ 3184.873181] ata5.04: status: { DRDY ERR } > [ 3184.873184] ata5.04: error: { UNC } > [ 3184.874437] ata5.04: configured for UDMA/133 > [ 3184.874488] sd 4:4:0:0: [sdh] tag#24 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > [ 3184.874495] sd 4:4:0:0: [sdh] tag#24 Sense Key : Medium Error [current] > [ 3184.874500] sd 4:4:0:0: [sdh] tag#24 Add. Sense: Unrecovered read error - auto reallocate failed > [ 3184.874506] sd 4:4:0:0: [sdh] tag#24 CDB: Read(16) 88 00 00 00 00 00 4c bc 4f 70 00 00 00 08 00 00 > [ 3184.874510] print_req_error: I/O error, dev sdh, sector 1287409520 > [ 3184.874515] Buffer I/O error on dev sdh, logical block 160926190, async page read > [ 3184.874555] ata5: EH complete > What you write about the result of badblocks -fsvnb512 /dev/sdh 1287409599 1287409400 is the expected behavior. -n means that it will _not_ write sectors that it cannot read (because that would remove the possibility that data from these sectors could be recovered by more tries). As I wrote, you have to use the -w option instead of -n, and use x and y of 1287409527 1287409520 HTH Kay [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 5049 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array 2018-02-09 20:52 ` Kay Diederichs @ 2018-02-11 20:52 ` Roger Heflin 0 siblings, 0 replies; 32+ messages in thread From: Roger Heflin @ 2018-02-11 20:52 UTC (permalink / raw) To: Kay Diederichs Cc: Marc MERLIN, Phil Turmel, Andreas Klauer, Adam Goryachev, Linux RAID On my wd I have some code that does a hdparm --read-sector and if that fails it does a hdparm --write-sector back once I know bad sectors and where they are. I started this after I get the drive removed from mdraid, and I have had the write-sector successfully write and verify the data and then minutes or hours later have the --read-sector on that block fail again. So from what I can tell my drive is useless trash because of the firmware not being able to decide reasonably that the sector is not recoverable, lucky this drive is still under warranty. I have several other 3tb WD reds several of which are out of warranty and have no issues as far as I can tell. That is at least better luck then I had with my 1.5 (about 80% of the drives replaced bad sectors and ran out of spares, or where otherwise useless because of randomly successfully reading bad blocks but pausing everything for less than the 7second timeout). On Fri, Feb 9, 2018 at 2:52 PM, Kay Diederichs <kay.diederichs@uni-konstanz.de> wrote: > > > Am 09/02/18 um 21:29 schrieb Marc MERLIN: >> On Fri, Feb 09, 2018 at 03:13:26PM -0500, Phil Turmel wrote: >>>> The pending sectors should have been re-written and become >>>> Reallocated_Event_Count, no? >>> >>> Yes, and not necessarily. Pending sectors can be non-permanent errors >>> -- the drive firmware will test a pending sector immediately after write >>> to see if the write is readable. If not, it will re-allocate while it >>> still has the write data in its buffers. Otherwise, it'll clear the >>> pending sector. >> >> This shows the sector is still bad though, right? >> >> myth:~# hdparm --read-sector 1287409520 /dev/sdh >> /dev/sdh: >> reading sector 1287409520: SG_IO: bad/missing sense data, sb[]: 70 00 03 00 00 00 00 0a 40 51 e0 01 11 04 00 00 a0 70 00 00 00 00 00 00 00 00 00 00 00 00 00 00 succeeded >> 7000 0b54 92c4 ffff 0000 0000 01fe 0000 >> (...) >> >> [ 2572.139404] ata5.04: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 >> [ 2572.139419] ata5.04: failed command: READ SECTOR(S) EXT >> [ 2572.139427] ata5.04: cmd 24/00:01:70:4f:bc/00:00:4c:00:00/e0 tag 28 pio 512 in >> [ 2572.139427] res 51/40:01:70:4f:bc/00:00:4c:00:00/e0 Emask 0x9 (media error) >> [ 2572.139431] ata5.04: status: { DRDY ERR } >> [ 2572.139435] ata5.04: error: { UNC } >> [ 2572.162369] ata5.04: configured for UDMA/133 >> [ 2572.162414] ata5: EH complete >> >> mdadm also said it found 6 bad sectors and rewrote them (or something like that) >> and it's happy. So alledgely it did something, but smart does not agree (yet?) >> >> I'm now running a long smart test on all drives, will see if numbers change. >> >> Mmmh, and I just ran >> myth:~# badblocks -fsvnb512 /dev/sdh 1287409599 1287409400 >> below, and I don't quite understand what's going on. >> >>>> So, mdadm is happy allegedly, but my drives still have the same bad >>>> sectors they had (more or less). >>> >>> If you have bad block lists enabled in your array, MD will *never* try >>> to fix the underlying sectors. Please show your mdadm -E reports for >>> these devices. If necessary, stop the array and re-assemble with the >>> options to disable bad block lists. 
{ How this misfeature got into the >>> kernel and enabled by default baffles me. } >> >> This means I dont have bad block lists? >> myth:~# mdadm -E /dev/sdd e f g h all return >> /dev/sdd: >> MBR Magic : aa55 >> Partition[0] : 4294967295 sectors at 1 (type ee) >> >>> Also, pending sectors that are in dead zones between metadata and array >>> data will not be accessed by a check scrub, and will therefore persist. >> >> That's a good point, but then I would never have discovered those blocks >> while initializing the array. >> >>>> Yes, I know I should trash (return) those drives, >>> >>> Well, non-permanent read errors are not considered warranty failures. >>> They are in the drive specs. When pending is zero and actual >>> re-allocations are climbing (my threshold is double digits), *then* it's >>> time to replace. >> >> I think it's worse here. Read errors are not being cleared by block rewrites? >> Those are brand "new" (but really remanufactured) drives. >> So far I'm not liking what I'm seeing and I'm very close to just >> returning them all and getting some less dodgy ones. >> >> Sad because the last set of 5 I got from a similar source, have worked >> beautifully. >> >> Let's see what a full smart scan does. >> I may also use hdparm --write-sector to just fill those bad blocks with 0's >> now that it seems that mdadm isn't caring about/using them anymore? >> >> Now, badblocks perplexes me even more. Shouldn't -n re-write blocks? >> >> myth:~# badblocks -fsvnb512 /dev/sdh 1287409599 1287409400 >> /dev/sdh is apparently in use by the system; badblocks forced anyway. >> Checking for bad blocks in non-destructive read-write mode >> From block 1287409400 to 1287409599 >> Checking for bad blocks (non-destructive read-write test) >> Testing with random pattern: 1287409520ne, 0:14 elapsed. (0/0/0 errors) >> 1287409521ne, 0:18 elapsed. (1/0/0 errors) >> 1287409522ne, 0:23 elapsed. (2/0/0 errors) >> 1287409523ne, 0:27 elapsed. (3/0/0 errors) >> 1287409524ne, 0:31 elapsed. (4/0/0 errors) >> 1287409525ne, 0:36 elapsed. (5/0/0 errors) >> 1287409526ne, 0:40 elapsed. (6/0/0 errors) >> 1287409527ne, 0:44 elapsed. (7/0/0 errors) >> done >> Pass completed, 8 bad blocks found. (8/0/0 errors) >> >> Badblocks found 8 bad blocks, but didn't rewrite them, or failed to, or >> succeeded but that did nothing anyway? >> >> Do I understand that >> 1) badblocks got read errors >> 2) it's supposed to rewrite the blocks with new data (or not?) >> 3) auto reallocate failed >> >> >> [ 3171.717001] ata5.04: exception Emask 0x0 SAct 0x40 SErr 0x0 action 0x0 >> [ 3171.717012] ata5.04: failed command: READ FPDMA QUEUED >> [ 3171.717019] ata5.04: cmd 60/08:30:70:4f:bc/00:00:4c:00:00/40 tag 6 ncq dma 4096 in >> [ 3171.717019] res 41/40:00:70:4f:bc/00:00:4c:00:00/00 Emask 0x409 (media error) <F> >> [ 3171.717031] ata5.04: status: { DRDY ERR } >> [ 3171.717034] ata5.04: error: { UNC } >> [ 3171.718293] ata5.04: configured for UDMA/133 >> [ 3171.718342] sd 4:4:0:0: [sdh] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE >> [ 3171.718349] sd 4:4:0:0: [sdh] tag#6 Sense Key : Medium Error [current] >> [ 3171.718354] sd 4:4:0:0: [sdh] tag#6 Add. 
Sense: Unrecovered read error - auto reallocate failed >> [ 3171.718360] sd 4:4:0:0: [sdh] tag#6 CDB: Read(16) 88 00 00 00 00 00 4c bc 4f 70 00 00 00 08 00 00 >> [ 3171.718364] print_req_error: I/O error, dev sdh, sector 1287409520 >> [ 3171.718369] Buffer I/O error on dev sdh, logical block 160926190, async page read >> [ 3171.718393] ata5: EH complete >> [ 3176.092946] ata5.04: exception Emask 0x0 SAct 0x400000 SErr 0x0 action 0x0 >> [ 3176.092958] ata5.04: failed command: READ FPDMA QUEUED >> [ 3176.092973] ata5.04: cmd 60/08:b0:70:4f:bc/00:00:4c:00:00/40 tag 22 ncq dma 4096 in >> [ 3176.092973] res 41/40:00:70:4f:bc/00:00:4c:00:00/00 Emask 0x409 (media error) <F> >> [ 3176.092978] ata5.04: status: { DRDY ERR } >> [ 3176.092981] ata5.04: error: { UNC } >> [ 3176.094237] ata5.04: configured for UDMA/133 >> [ 3176.094285] sd 4:4:0:0: [sdh] tag#22 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE >> [ 3176.094291] sd 4:4:0:0: [sdh] tag#22 Sense Key : Medium Error [current] >> [ 3176.094296] sd 4:4:0:0: [sdh] tag#22 Add. Sense: Unrecovered read error - auto reallocate failed >> [ 3176.094302] sd 4:4:0:0: [sdh] tag#22 CDB: Read(16) 88 00 00 00 00 00 4c bc 4f 70 00 00 00 08 00 00 >> [ 3176.094306] print_req_error: I/O error, dev sdh, sector 1287409520 >> [ 3176.094310] Buffer I/O error on dev sdh, logical block 160926190, async page read >> [ 3176.094324] ata5: EH complete >> [ 3180.488899] ata5.04: exception Emask 0x0 SAct 0x100 SErr 0x0 action 0x0 >> [ 3180.488909] ata5.04: failed command: READ FPDMA QUEUED >> [ 3180.488916] ata5.04: cmd 60/08:40:70:4f:bc/00:00:4c:00:00/40 tag 8 ncq dma 4096 in >> [ 3180.488916] res 41/40:00:70:4f:bc/00:00:4c:00:00/00 Emask 0x409 (media error) <F> >> [ 3180.488928] ata5.04: status: { DRDY ERR } >> [ 3180.488931] ata5.04: error: { UNC } >> [ 3180.490193] ata5.04: configured for UDMA/133 >> [ 3180.490243] sd 4:4:0:0: [sdh] tag#8 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE >> [ 3180.490249] sd 4:4:0:0: [sdh] tag#8 Sense Key : Medium Error [current] >> [ 3180.490254] sd 4:4:0:0: [sdh] tag#8 Add. Sense: Unrecovered read error - auto reallocate failed >> [ 3180.490259] sd 4:4:0:0: [sdh] tag#8 CDB: Read(16) 88 00 00 00 00 00 4c bc 4f 70 00 00 00 08 00 00 >> [ 3180.490263] print_req_error: I/O error, dev sdh, sector 1287409520 >> [ 3180.490268] Buffer I/O error on dev sdh, logical block 160926190, async page read >> [ 3180.490290] ata5: EH complete >> [ 3184.873146] ata5.04: exception Emask 0x0 SAct 0x1000000 SErr 0x0 action 0x0 >> [ 3184.873161] ata5.04: failed command: READ FPDMA QUEUED >> [ 3184.873175] ata5.04: cmd 60/08:c0:70:4f:bc/00:00:4c:00:00/40 tag 24 ncq dma 4096 in >> [ 3184.873175] res 41/40:00:70:4f:bc/00:00:4c:00:00/00 Emask 0x409 (media error) <F> >> [ 3184.873181] ata5.04: status: { DRDY ERR } >> [ 3184.873184] ata5.04: error: { UNC } >> [ 3184.874437] ata5.04: configured for UDMA/133 >> [ 3184.874488] sd 4:4:0:0: [sdh] tag#24 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE >> [ 3184.874495] sd 4:4:0:0: [sdh] tag#24 Sense Key : Medium Error [current] >> [ 3184.874500] sd 4:4:0:0: [sdh] tag#24 Add. 
Sense: Unrecovered read error - auto reallocate failed >> [ 3184.874506] sd 4:4:0:0: [sdh] tag#24 CDB: Read(16) 88 00 00 00 00 00 4c bc 4f 70 00 00 00 08 00 00 >> [ 3184.874510] print_req_error: I/O error, dev sdh, sector 1287409520 >> [ 3184.874515] Buffer I/O error on dev sdh, logical block 160926190, async page read >> [ 3184.874555] ata5: EH complete >> > > What you write about the result of > badblocks -fsvnb512 /dev/sdh 1287409599 1287409400 > is the expected behavior. -n means that it will _not_ write sectors that > it cannot read (because that would remove the possibility that data from > these sectors could be recovered by more tries). > > As I wrote, you have to use the -w option instead of -n, and use x and y > of 1287409527 1287409520 > > HTH > Kay > > ^ permalink raw reply [flat|nested] 32+ messages in thread
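Roger's read-then-rewrite approach above can be sketched roughly as follows. This is a hypothetical outline rather than his actual code: the device name and sector list are placeholders, the drive must already be failed out of (or removed from) the array, and dd with O_DIRECT is used for the read probe because hdparm --read-sector's "succeeded" output was ambiguous earlier in this thread:

  #!/bin/sh
  DEV=/dev/sdX                                  # placeholder; NOT an active array member
  for s in 569441985 569441990 569442000; do    # suspect sectors from SMART
      if ! dd if=$DEV bs=512 skip=$s count=1 iflag=direct of=/dev/null 2>/dev/null; then
          # read failed: rewrite the sector so the drive gets a chance to
          # reallocate it, then re-read to see whether the fix sticks
          hdparm --yes-i-know-what-i-am-doing --write-sector $s $DEV
          dd if=$DEV bs=512 skip=$s count=1 iflag=direct of=/dev/null 2>/dev/null \
              || echo "sector $s still unreadable after rewrite"
      fi
  done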
* Re: force remapping a pending sector in sw raid5 array
2018-02-09 20:29 ` Marc MERLIN
2018-02-09 20:44 ` Phil Turmel
2018-02-09 20:52 ` Kay Diederichs
@ 2018-02-09 21:17 ` Kay Diederichs
2 siblings, 0 replies; 32+ messages in thread
From: Kay Diederichs @ 2018-02-09 21:17 UTC (permalink / raw)
To: Marc MERLIN, Phil Turmel
Cc: Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid

On 02/09/2018 09:29 PM, Marc MERLIN wrote:
...
>
> myth:~# badblocks -fsvnb512 /dev/sdh 1287409599 1287409400
> /dev/sdh is apparently in use by the system; badblocks forced anyway.
> Checking for bad blocks in non-destructive read-write mode

badblocks gives a warning that /dev/sdh is in use. You should not use it
that way (needing the -f option), because essentially you are messing with
the drive behind md's back. Remove /dev/sdh from the array before you use
badblocks, or hdparm, or dd or the like on a member device.

Kay

^ permalink raw reply [flat|nested] 32+ messages in thread
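A minimal sketch of what Kay suggests, using the array and member names from this thread; the write-intent bitmap shown at the start of the thread should make the later re-add cheap. If any data-area sectors are overwritten while the member is out, following up with a repair pass is the safer option:

  mdadm /dev/md7 --fail /dev/sdh1
  mdadm /dev/md7 --remove /dev/sdh1
  # ... run badblocks / hdparm / dd against /dev/sdh here ...
  mdadm /dev/md7 --re-add /dev/sdh1
  echo repair > /sys/block/md7/md/sync_action    # optional, see note above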
* Re: force remapping a pending sector in sw raid5 array
2018-02-09 20:13 ` Phil Turmel
2018-02-09 20:29 ` Marc MERLIN
@ 2018-02-10 21:43 ` Mateusz Korniak
2018-02-11 15:41 ` Marc MERLIN
2018-02-11 17:13 ` Phil Turmel
1 sibling, 2 replies; 32+ messages in thread
From: Mateusz Korniak @ 2018-02-10 21:43 UTC (permalink / raw)
To: Phil Turmel
Cc: Marc MERLIN, Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid

On Friday 09 of February 2018 15:13:26 Phil Turmel wrote:
> If you have bad block lists enabled in your array, MD will *never* try
> to fix the underlying sectors

As far as I was able to find, a failed write marks the sector in the BBL.
Is data saved under a different location when such a write fails, for
later reads?
Does a failed read mark the sector in the BBL too?

I am surprised to notice that I have plenty of sectors in the BBL in some
arrays which SMART reports to be quite healthy, and all members passing
short/long SMART tests ...

--
Mateusz Korniak
"(...) I have a brother - serious, a homebody, a penny-pincher, a
hypocrite, a sanctimonious type; in short - a pillar of society."
Nikos Kazantzakis - "Zorba the Greek"

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array
2018-02-10 21:43 ` Mateusz Korniak
@ 2018-02-11 15:41 ` Marc MERLIN
2018-02-11 16:41 ` Marc MERLIN
0 siblings, 1 reply; 32+ messages in thread
From: Marc MERLIN @ 2018-02-11 15:41 UTC (permalink / raw)
To: Mateusz Korniak
Cc: Phil Turmel, Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid

As a last update on those drives, sadly they seem to have real problems
with SMART, which is why I was confused when using them.

myth:~# badblocks -fsvn -b512 /dev/sdf
/dev/sdf is apparently in use by the system; badblocks forced anyway.
Checking for bad blocks in non-destructive read-write mode
From block 0 to 3131110575
Checking for bad blocks (non-destructive read-write test)
Testing with random pattern: done
Pass completed, 0 bad blocks found. (0/0/0 errors)

That means a full read/write scan ran ok.
Yet:
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       7

7 sectors still marked as pending. This makes no sense...

As far as I can tell, SMART is just broken on those drives, and they're
going back to where I got them from.

Thanks all for the replies and helping me confirm this.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  |  PGP 7F55D5F27AAF9D08

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array
2018-02-11 15:41 ` Marc MERLIN
@ 2018-02-11 16:41 ` Marc MERLIN
0 siblings, 0 replies; 32+ messages in thread
From: Marc MERLIN @ 2018-02-11 16:41 UTC (permalink / raw)
To: Mateusz Korniak
Cc: Phil Turmel, Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid

On Sun, Feb 11, 2018 at 07:41:58AM -0800, Marc MERLIN wrote:
> As a last update on those drives, sadly they seem to have real problems
> with SMART, which is why I was confused when using them.
>
> myth:~# badblocks -fsvn -b512 /dev/sdf
> /dev/sdf is apparently in use by the system; badblocks forced anyway.
> Checking for bad blocks in non-destructive read-write mode
> From block 0 to 3131110575
> Checking for bad blocks (non-destructive read-write test)
> Testing with random pattern: done
> Pass completed, 0 bad blocks found. (0/0/0 errors)
>
> That means a full read/write scan ran ok.
> Yet:
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       7
>
> 7 sectors still marked as pending. This makes no sense...

And it gets "better", just re-ran a long self test, and still got:
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%       561         569442000

So, the disk sees bad blocks, SMART says there are bad blocks, and yet a
badblocks run over the entire drive in read/write mode finds nothing anymore.

Anyway, those drives are going back in the box and the mail tomorrow, but
that sure is/was weird...

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  |  PGP 7F55D5F27AAF9D08

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array
2018-02-10 21:43 ` Mateusz Korniak
2018-02-11 15:41 ` Marc MERLIN
@ 2018-02-11 17:13 ` Phil Turmel
2018-02-11 18:02 ` Wols Lists
2018-02-12 10:43 ` Mateusz Korniak
1 sibling, 2 replies; 32+ messages in thread
From: Phil Turmel @ 2018-02-11 17:13 UTC (permalink / raw)
To: Mateusz Korniak
Cc: Marc MERLIN, Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid

On 02/10/2018 04:43 PM, Mateusz Korniak wrote:
> On Friday 09 of February 2018 15:13:26 Phil Turmel wrote:
>> If you have bad block lists enabled in your array, MD will *never* try
>> to fix the underlying sectors
>
> As far as I was able to find, a failed write marks the sector in the BBL.
> Is data saved under a different location when such a write fails, for
> later reads?

No. That is why this is a misfeature that should never have been turned
on by default.

Phil

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array
2018-02-11 17:13 ` Phil Turmel
@ 2018-02-11 18:02 ` Wols Lists
0 siblings, 0 replies; 32+ messages in thread
From: Wols Lists @ 2018-02-11 18:02 UTC (permalink / raw)
To: Phil Turmel, Mateusz Korniak
Cc: Marc MERLIN, Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid

On 11/02/18 17:13, Phil Turmel wrote:
> On 02/10/2018 04:43 PM, Mateusz Korniak wrote:
>> On Friday 09 of February 2018 15:13:26 Phil Turmel wrote:
>>> If you have bad block lists enabled in your array, MD will *never* try
>>> to fix the underlying sectors

I've just been reading the man pages. This is exactly what IS supposed to
happen (that is, MD is *supposed* to fix the underlying sectors).

>>
>> As far as I was able to find, a failed write marks the sector in the BBL.
>> Is data saved under a different location when such a write fails, for
>> later reads?
>
> No. That is why this is a misfeature that should never have been turned
> on by default.
>
I'm not going to argue about whether the feature should or should not have
been turned on - I think the reality is that the feature is confused, and
almost certainly buggy as a result, but imho it is a feature that *should*
be enabled - by default - if only it worked :-(

For a normal, properly functioning array, bad-blocks should be both
enabled, AND EMPTY. That it has entries you can't get rid of implies it's
buggy, as far as I can tell.

Cheers,
Wol

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array
2018-02-11 17:13 ` Phil Turmel
2018-02-11 18:02 ` Wols Lists
@ 2018-02-12 10:43 ` Mateusz Korniak
2018-02-12 15:29 ` Phil Turmel
1 sibling, 1 reply; 32+ messages in thread
From: Mateusz Korniak @ 2018-02-12 10:43 UTC (permalink / raw)
To: Phil Turmel
Cc: Marc MERLIN, Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid

On Sunday 11 of February 2018 12:13:45 Phil Turmel wrote:
> On 02/10/2018 04:43 PM, Mateusz Korniak wrote:
>
> > Is data saved under a different location when such a write fails, for
> > later reads?
>
> No. (...)

So having arrays with non-empty BBL members means that the array is in
fact degraded (for a tiny part, but still), right?

Is there any option for mdadm --monitor to send warning e-mail when bbl entry
is added? (I can't see anything regarding bbl in mdadm --monitor section) ?

--
Mateusz Korniak
"(...) I have a brother - serious, a homebody, a penny-pincher, a
hypocrite, a sanctimonious type; in short - a pillar of society."
Nikos Kazantzakis - "Zorba the Greek"

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array
2018-02-12 10:43 ` Mateusz Korniak
@ 2018-02-12 15:29 ` Phil Turmel
2018-02-12 16:49 ` Marc MERLIN
0 siblings, 1 reply; 32+ messages in thread
From: Phil Turmel @ 2018-02-12 15:29 UTC (permalink / raw)
To: Mateusz Korniak
Cc: Marc MERLIN, Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid

On 02/12/2018 05:43 AM, Mateusz Korniak wrote:
> On Sunday 11 of February 2018 12:13:45 Phil Turmel wrote:
>> On 02/10/2018 04:43 PM, Mateusz Korniak wrote:
>>
>>> Is data saved under a different location when such a write fails, for
>>> later reads?
>>
>> No. (...)
>
> So having arrays with non-empty BBL members means that the array is in
> fact degraded (for a tiny part, but still), right?

Yes, it's degraded wherever there's a BBL entry. To my knowledge, *no*
upper layer, whether device mapper or any filesystem, uses the information
to avoid allocations in the degraded area or to rescue the data
precariously living there.

Last I looked, mdadm --detail did not report whether the array has
degraded regions. You must inspect the output of mdadm --examine for
every member.

> Is there any option for mdadm --monitor to send warning e-mail when bbl entry
> is added? (I can't see anything regarding bbl in mdadm --monitor section) ?

No. The feature is incomplete. The only mitigation is to turn it off.

Phil

^ permalink raw reply [flat|nested] 32+ messages in thread
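A quick way to do the per-member inspection Phil describes, assuming member names as in this thread and a reasonably recent mdadm (which has an --examine-badblocks mode that prints the recorded entries):

  for d in /dev/sd[defgh]1; do
      echo "== $d"
      mdadm --examine-badblocks $d
  done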
* Re: force remapping a pending sector in sw raid5 array
2018-02-12 15:29 ` Phil Turmel
@ 2018-02-12 16:49 ` Marc MERLIN
2018-02-12 17:16 ` Phil Turmel
0 siblings, 1 reply; 32+ messages in thread
From: Marc MERLIN @ 2018-02-12 16:49 UTC (permalink / raw)
To: Phil Turmel
Cc: Mateusz Korniak, Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid

On Mon, Feb 12, 2018 at 10:29:20AM -0500, Phil Turmel wrote:
> > Is there any option for mdadm --monitor to send warning e-mail when bbl entry
> > is added? (I can't see anything regarding bbl in mdadm --monitor section) ?
>
> No. The feature is incomplete. The only mitigation is to turn it off.

I had a quick look but didn't really find how to turn it off after the fact
(not at array creation time, but after it's already been created).

Can you suggest how?

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: force remapping a pending sector in sw raid5 array
2018-02-12 16:49 ` Marc MERLIN
@ 2018-02-12 17:16 ` Phil Turmel
2018-02-12 17:30 ` Marc MERLIN
0 siblings, 1 reply; 32+ messages in thread
From: Phil Turmel @ 2018-02-12 17:16 UTC (permalink / raw)
To: Marc MERLIN
Cc: Mateusz Korniak, Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid

On 02/12/2018 11:49 AM, Marc MERLIN wrote:
> On Mon, Feb 12, 2018 at 10:29:20AM -0500, Phil Turmel wrote:
>>> Is there any option for mdadm --monitor to send warning e-mail when bbl entry
>>> is added? (I can't see anything regarding bbl in mdadm --monitor section) ?
>>
>> No. The feature is incomplete. The only mitigation is to turn it off.
>
> I had a quick look but didn't really find how to turn it off after the fact
> (not at array creation time, but after it's already been created).
>
> Can you suggest how?

mdadm --assemble --update=no-bbl

There's another (undocumented?) option required when there are entries
in the list -- you'll have to dig that out for your situation.

^ permalink raw reply [flat|nested] 32+ messages in thread
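For completeness, a sketch of how that update would be applied to the array in this thread; it only takes effect at assembly time, so the array has to be stopped first, and as Phil notes this form alone is not enough while the recorded list still has entries:

  mdadm --stop /dev/md7
  mdadm --assemble /dev/md7 --update=no-bbl /dev/sd[defgh]1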
* Re: force remapping a pending sector in sw raid5 array
2018-02-12 17:16 ` Phil Turmel
@ 2018-02-12 17:30 ` Marc MERLIN
0 siblings, 0 replies; 32+ messages in thread
From: Marc MERLIN @ 2018-02-12 17:30 UTC (permalink / raw)
To: Phil Turmel
Cc: Mateusz Korniak, Kay Diederichs, Andreas Klauer, Adam Goryachev, Roger Heflin, linux-raid

On Mon, Feb 12, 2018 at 12:16:15PM -0500, Phil Turmel wrote:
> On 02/12/2018 11:49 AM, Marc MERLIN wrote:
> > On Mon, Feb 12, 2018 at 10:29:20AM -0500, Phil Turmel wrote:
> >>> Is there any option for mdadm --monitor to send warning e-mail when bbl entry
> >>> is added? (I can't see anything regarding bbl in mdadm --monitor section) ?
> >>
> >> No. The feature is incomplete. The only mitigation is to turn it off.
> >
> > I had a quick look but didn't really find how to turn it off after the fact
> > (not at array creation time, but after it's already been created).
> >
> > Can you suggest how?
>
> mdadm --assemble --update=no-bbl

Thanks.

> There's another (undocumented?) option required when there are entries
> in the list -- you'll have to dig that out for your situation.

That situation is gone, I was not able to clear the pending sectors even
by re-writing every block of the drive with badblocks, so I returned the
drives and got some better ones.

Bad blocks on a "new" drive is bad enough, but then having the drive
refuse to remap them, or apparently in my case fail to update the smart
counters once the blocks were overwritten with good known data, is not ok.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/

^ permalink raw reply [flat|nested] 32+ messages in thread