* mapping disk sectors to files
@ 2014-03-14  6:23 Thorsten von Eicken
  2014-03-14  8:07 ` Eyal Lebedinsky
  2014-03-14 12:13 ` Phil Turmel
  0 siblings, 2 replies; 3+ messages in thread
From: Thorsten von Eicken @ 2014-03-14  6:23 UTC (permalink / raw)
  To: linux-raid

I've just had a disk in a raid1 mirror set die, and that exposed some bad
blocks on the remaining drive -- ooops! I'd now like to map the bad
blocks to files so I can restore the affected files from backups, but I
can't figure out the mapping. What I've done:
- While I had only one drive, I ran badblocks on the md device, which
gave me a list of bad blocks. I verified with dd that they were indeed
bad, used debugfs to map them to files, verified that the files gave a
read error, and wrote zeroes to the blocks to get the drive to
reallocate them. Done!
- I rebuilt the mirror set with a fresh drive, but that exposed two more
bad blocks, ouch!
- Now I can't easily run badblocks anymore because, as far as I
understand, the second drive will "cover" for the first one.
- I've tried converting sectors to blocks, subtracting the partition
offset and the data offset, but it's just not working, i.e. I can't get
dd on the md device to hit the error when I try.
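
The debugfs step in the first item can be sketched as below. This is a
minimal sketch, assuming the ext4 filesystem sits directly on /dev/md0;
the block number 12345 is a made-up placeholder, and the variable names
are illustrative. It only prints the two commands instead of running
them, since debugfs needs root and a real device. icheck and ncheck are
the debugfs requests for block-to-inode and inode-to-path lookups.

```shell
#!/bin/sh
# Sketch: build the two debugfs lookups that take a filesystem block to a path.
# Assumptions: ext4 directly on /dev/md0; 12345 is a placeholder block number
# in filesystem-block units (4096 bytes here), not a 512-byte sector.
dev=/dev/md0
block=12345

# Step 1: icheck reports which inode owns the block.
icheck_cmd="debugfs -R 'icheck $block' $dev"
# Step 2: ncheck reports the path name(s) for that inode; <inode> is filled in
# by hand from the output of step 1.
ncheck_cmd="debugfs -R 'ncheck <inode>' $dev"

echo "$icheck_cmd"
echo "$ncheck_cmd"
```

The block number passed to icheck has to be relative to the start of
the filesystem, which is why the sector-to-block conversion matters.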

This is the set of bad blocks I'm trying to deal with:
Mar 13 01:12:24 h kernel: [522839.210723] end_request: I/O error, dev sda, sector 1147023664
Mar 13 01:12:24 h kernel: [522839.211618] ata1: EH complete
Mar 13 01:12:24 h kernel: [522839.211635] md/raid1:md0: sda: unrecoverable I/O read error for block 1146085632

Mar 13 01:11:49 h kernel: [522804.763467] end_request: I/O error, dev sda, sector 1147020360
Mar 13 01:11:49 h kernel: [522804.765146] ata1: EH complete
Mar 13 01:11:49 h kernel: [522804.765180] md/raid1:md0: sda: unrecoverable I/O read error for block 1146082304

Mar 13 01:11:30 h kernel: [522785.944248] end_request: I/O error, dev sda, sector 1147017056
Mar 13 01:11:30 h kernel: [522785.945926] ata1: EH complete
Mar 13 01:11:30 h kernel: [522785.945961] md/raid1:md0: sda: unrecoverable I/O read error for block 1146078976

Mar 13 01:09:23 h kernel: [522658.983066] end_request: I/O error, dev sda, sector 1129050144
Mar 13 01:09:23 h kernel: [522658.984750] ata1: EH complete
Mar 13 01:09:23 h kernel: [522658.984781] md/raid1:md0: sda: unrecoverable I/O read error for block 1128112128

Mar 13 01:06:44 h kernel: [522499.760134] end_request: I/O error, dev sda, sector 1098724456
Mar 13 01:06:44 h kernel: [522499.761829] ata1: EH complete
Mar 13 01:06:44 h kernel: [522499.761869] md/raid1:md0: sda: unrecoverable I/O read error for block 1097786368

The GPT is:
Disk /dev/sda: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): FD188278-C58A-41C7-9943-AD5E94EDF1F8
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 2048-sector boundaries
Total free space is 4061 sectors (2.0 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048          409600   199.0 MiB   EF00  EFI System
   2          411648          935935   256.0 MiB   0700  Microsoft basic data
   3          935936      3907029134   1.8 TiB     FD00  Linux RAID

The md device says:
# mdadm --examine /dev/sda3
/dev/sda3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 1f3118f3:ee5a644b:d8d3df40:8decaa30
           Name : h2:0
  Creation Time : Sun Mar 18 00:37:55 2012
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 3906091151 (1862.57 GiB 1999.92 GB)
     Array Size : 1953045575 (1862.57 GiB 1999.92 GB)
  Used Dev Size : 3906091150 (1862.57 GiB 1999.92 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 7a87b471:48f6c853:0a5dd5ad:15fa79a4

    Update Time : Thu Mar 13 19:43:14 2014
       Checksum : 1522576c - correct
         Events : 8076588


   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing)

- I'm using an ext4 filesystem and dumpe2fs says:
First block:              0
Block size:               4096

If I take the bad sector number from the first error message
(1147023664), subtract the partition start (935936), and then divide by
8 (4096-byte blocks, 512-byte sectors), I can get dd on the partition
to trigger the bad block:
# dd if=/dev/sda3 of=/dev/null bs=4096 count=10 iflag=direct skip=143260966
dd: reading `/dev/sda3': Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 15.7246 s, 0.0 kB/s
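
For reference, the arithmetic behind that skip= value can be sketched
in shell. The variable names are only illustrative; the numbers come
from the kernel log and the GPT listing above.

```shell
#!/bin/sh
# Sketch: convert a whole-disk sector from the kernel log into a 4096-byte
# block index within the partition (/dev/sda3), as used for dd's skip=.
# Variable names are illustrative, not taken from any tool.
bad_sector=1147023664     # "end_request: I/O error, dev sda, sector ..."
part_start=935936         # start of partition 3 in the GPT listing
sectors_per_block=8       # 4096-byte blocks / 512-byte logical sectors

part_sector=$((bad_sector - part_start))
part_block=$((part_sector / sectors_per_block))

echo "sector within sda3:      $part_sector"
echo "4 KiB block within sda3: $part_block"
```

This gives 143260966, the value passed to skip= above. Note that it is
an offset into the partition, not into the md data area.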

But I need the block number in the md device so I can map it back to a
file. I've tried various calculations and used dd on the md device but I
see no error in syslog indicating hitting a bad block. Does someone know
how to do the mapping and/or how to fix the situation?

Thanks!




* Re: mapping disk sectors to files
  2014-03-14  6:23 mapping disk sectors to files Thorsten von Eicken
@ 2014-03-14  8:07 ` Eyal Lebedinsky
  2014-03-14 12:13 ` Phil Turmel
  1 sibling, 0 replies; 3+ messages in thread
From: Eyal Lebedinsky @ 2014-03-14  8:07 UTC (permalink / raw)
  To: linux-raid

raid1 means that the two disks hold the same data at the same
offset, so dd from the member device should confirm the address
and trigger an error. Your calculation looks correct to me (IANA
Expert).

I do not understand why you "can't run badblocks" on the individual
member device. The second drive only "covers" for the first when you
run it on the md device.

BTW, I would expect the periodic smartd long test to report the bad
blocks earlier, and the periodic md 'check' to fix such problems.
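
Both can also be started by hand. The following is a minimal sketch,
assuming smartmontools is installed and using the device names from
this thread; the run helper and the DRY_RUN variable are made up for
the example. By default it only prints the commands; executing them
needs root.

```shell
#!/bin/sh
# Sketch: start a SMART long self-test and an md 'check' pass by hand.
# Prints the commands by default; set DRY_RUN=0 (as root) to execute them.
# Device names are the ones from this thread; adjust for other systems.
DRY_RUN=${DRY_RUN:-1}
disk=/dev/sda
md=md0

# Hypothetical helper: print the command line, or run it through sh.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        sh -c "$*"
    fi
}

run "smartctl -t long $disk"                      # SMART extended self-test
run "echo check > /sys/block/$md/md/sync_action"  # md reads every member
run "cat /proc/mdstat"                            # shows check progress
```

During a check pass md reads every block on every member, so a latent
read error should show up without waiting for the other mirror to fail.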

HTH,
     Eyal


-- 
Eyal Lebedinsky (eyal@eyal.emu.id.au)



* Re: mapping disk sectors to files
  2014-03-14  6:23 mapping disk sectors to files Thorsten von Eicken
  2014-03-14  8:07 ` Eyal Lebedinsky
@ 2014-03-14 12:13 ` Phil Turmel
  1 sibling, 0 replies; 3+ messages in thread
From: Phil Turmel @ 2014-03-14 12:13 UTC (permalink / raw)
  To: Thorsten von Eicken, linux-raid

Good morning Thorsten,

On 03/14/2014 02:23 AM, Thorsten von Eicken wrote:
> I've just had a disk in a raid1 mirror set die and that exposed some bad
> block on the remaining drive -- ooops! I'd now like to map the bad
> blocks to files so I can restore the affected files from backups, but I
> can't figure out the mapping. What I've done:

[trim /]

> The GPT is:
> Disk /dev/sda: 3907029168 sectors, 1.8 TiB

>    3          935936      3907029134   1.8 TiB     FD00  Linux RAID

> The md device says:
> # mdadm --examine /dev/sda3
> /dev/sda3:
>           Magic : a92b4efc
>         Version : 1.2

>      Raid Level : raid1
> 
>     Data Offset : 2048 sectors

These key pieces of info tell you that the blocks in /dev/md0 start at
sector 2048 in /dev/sda3, which itself starts at sector 935936 of the
disk as a whole.  As a raid1, the blocks are 1:1 linear from there.

So all you've missed in your calculations is the 2048 sector offset of
the md data area.  Add that to your partition start value and you'll be
golden.
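
As a worked example of that correction, here is a minimal shell sketch
applied to the five sectors from the kernel log. It assumes 512-byte
sectors and 4096-byte ext4 blocks, as reported in the original message;
the helper name to_fs_block is made up for illustration.

```shell
#!/bin/sh
# Sketch: map whole-disk sectors from the kernel log to 4096-byte block
# numbers on /dev/md0. All offsets are in 512-byte sectors; the helper
# name to_fs_block is made up for this example.
part_start=935936        # partition 3 start (GPT listing)
data_offset=2048         # "Data Offset" from mdadm --examine
sectors_per_block=8      # 4096-byte ext4 blocks / 512-byte sectors

to_fs_block() {
    md_sector=$(( $1 - part_start - data_offset ))
    echo $(( md_sector / sectors_per_block ))
}

for s in 1147023664 1147020360 1147017056 1129050144 1098724456; do
    echo "disk sector $s -> md0 block $(to_fs_block "$s")"
done
```

For the first logged sector this gives block 143260710, which is 256
blocks (2048 sectors) below the 143260966 that worked on /dev/sda3.
That number should be usable as skip= with bs=4096 on /dev/md0 and as
the block argument to debugfs icheck, though this has not been tried
against this array.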

HTH,

Phil
