linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Import/recover RAID6 created with SAS2108?
@ 2016-03-06 21:37 Dan Russell
  2016-03-07 21:50 ` Andreas Klauer
  0 siblings, 1 reply; 6+ messages in thread
From: Dan Russell @ 2016-03-06 21:37 UTC (permalink / raw)
  To: linux-raid

I have a RAID6 array which was created on a Supermicro card built around the LSI 2108.  Two drives failed; I replaced one and started a rebuild, and then two more failed, taking the array offline.

It looks like the failure was due to a cabling or backplane problem, and I suspect the data on the drives is good.  However, the Supermicro card is unable to read the foreign configs (it thinks there are 3 foreign configs, none of which are importable into a working array).

I would like to attempt recovery per the Wiki’s instructions (overlay files, force assembly, etc.); can md read a RAID6 created by the 2108, or would I be wasting my time?
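
Roughly what I have in mind from the wiki, as a sketch only (the overlay devices below would come from the wiki’s overlay script, and this assumes mdadm can make sense of the controller’s on-disk metadata at all):

# mdadm --assemble --force /dev/md1 /dev/mapper/overlay-*
# file -s /dev/md1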

I have moved the drives from the hardware RAID card to a non-RAID SAS card and can see all drives.  However, mdadm -E on these drives seems to show the RAID controller’s view of the world (how many arrays are defined, the total number of physical disks in the system) rather than any insight into each drive’s role as a member of an array.  I can paste or gist the output of mdadm -E (304 lines) if it would help.

Thanks in advance-
 Dan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Import/recover RAID6 created with SAS2108?
  2016-03-06 21:37 Import/recover RAID6 created with SAS2108? Dan Russell
@ 2016-03-07 21:50 ` Andreas Klauer
  2016-03-09  3:20   ` Dan Russell
  0 siblings, 1 reply; 6+ messages in thread
From: Andreas Klauer @ 2016-03-07 21:50 UTC (permalink / raw)
  To: Dan Russell; +Cc: linux-raid

On Sun, Mar 06, 2016 at 04:37:42PM -0500, Dan Russell wrote:
> can md read a RAID6 created by the 2108, or would I be wasting my time?

It might be possible.

I don't use HW-RAID anywhere, but if I were forced to, this would be 
one of the first things to determine: how to read it with software 
in case the card fails.

> I can paste or gist the output of mdadm -E (304 lines) if it would help.

I don't know if it would help, but paste everything you have anyhow.

Is the data encrypted? If not - search for known file type headers 
(like a large jpeg or tar.gz, something larger than disks * chunksize), 
then look at the data on the other disks at the same offsets 
(grab a few megs of each disk), then try to deduce the layout 
and structure from that.

Basically use known data to reverse-engineer the layout if you 
currently know nothing about disk orders, chunk sizes, etc.
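
For example, something along these lines (just a sketch; the offset and 
device names are placeholders, pick any offset well inside the data area):

# for d in /dev/sd?; do dd if=$d of=/tmp/sample-${d##*/} bs=1M skip=1024 count=4; done
# grep -al $'\xff\xd8\xff' /tmp/sample-*     (JPEG SOI marker, as one example)
# xxd /tmp/sample-sdb | less                 (then eyeball the stripe boundaries by hand)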

Regards
Andreas Klauer

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Import/recover RAID6 created with SAS2108?
  2016-03-07 21:50 ` Andreas Klauer
@ 2016-03-09  3:20   ` Dan Russell
  2016-03-09 19:27     ` Andreas Klauer
  2016-03-10  7:02     ` NeilBrown
  0 siblings, 2 replies; 6+ messages in thread
From: Dan Russell @ 2016-03-09  3:20 UTC (permalink / raw)
  To: linux-raid

Thank you.  I’ve made some progress.  My hardware controller uses a DDF container, which I’m able to start and inspect with mdadm.  Based on the mdadm release notes, I’ve updated from v3.2.5 to v3.4.

The data is not encrypted, and I believe I have the HDD order and chunk sizes correct (in part because the DDF container matches my independently-gathered notes).

The HDD order is sdi, sdj, sdag, sdl - sdz, sdaa - sdaf  (when the array initially failed, I replaced the drive in slot 2 with a new drive and started a rebuild.  The partially-rebuilt drive is sdk, the original “failed” drive is sdag).

When I did an incremental assembly (mdadm --incremental) on the container, I ended up with 3 inactive arrays (one 2-disk, one 20-disk, one 2-disk), which lines up with what the hardware RAID controller told me about foreign configurations.  So I tried to create a new 24-disk array with the same parameters as the old one.

I am able to create an array and see the LVM label on md1.  However fdisk and mdadm are reporting the array is 17.6TB in size, whereas it should be 66TB (24 3TB HDDs RAID6).  This is the same whether I specify or leave off the --size option when creating the array.

The first mdadm --create to make the ddf container shows that the ctime for sdag and sdp is Jan 17; this is the last time I booted this server prior to the breakage.  I’m wondering if there’s some way I can use the container metadata from either of those drives and ignore the rest?

Note that md0 is my root volume and is perfectly fine; it is only listed because I didn’t want to edit any command output.

# uname -r
3.13.0-49-generic

# mdadm -V
mdadm - v3.4 - 28th January 2016

# cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid1 sdh1[1] sdg1[0]
      124967808 blocks super 1.2 [2/2] [UU]
      
# mdadm --create /dev/md127 -e ddf -l container -n 24 /dev/mapper/sdi /dev/mapper/sdj /dev/mapper/sdag /dev/mapper/sdl /dev/mapper/sdm /dev/mapper/sdn /dev/mapper/sdo /dev/mapper/sdp /dev/mapper/sdq /dev/mapper/sdr /dev/mapper/sds /dev/mapper/sdt /dev/mapper/sdu /dev/mapper/sdv /dev/mapper/sdw /dev/mapper/sdx /dev/mapper/sdy /dev/mapper/sdz /dev/mapper/sdaa /dev/mapper/sdab /dev/mapper/sdac /dev/mapper/sdad /dev/mapper/sdae /dev/mapper/sdaf
mdadm: /dev/mapper/sdj appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdag appears to be part of a raid array:
       level=container devices=212 ctime=Sun Jan 17 17:03:35 2016
mdadm: /dev/mapper/sdl appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdm appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdn appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdo appears to be part of a raid array:
       level=container devices=199 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdp appears to be part of a raid array:
       level=container devices=212 ctime=Sun Jan 17 17:03:35 2016
mdadm: /dev/mapper/sdq appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdr appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sds appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdt appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdu appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdv appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdw appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdx appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdy appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdz appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdaa appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdab appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdac appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdad appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdae appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
mdadm: /dev/mapper/sdaf appears to be part of a raid array:
       level=container devices=198 ctime=Fri Feb 26 13:38:07 2016
Continue creating array? y
mdadm: container /dev/md127 prepared.

# cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md127 : inactive dm-24[23](S) dm-23[22](S) dm-22[21](S) dm-21[20](S) dm-20[19](S) dm-19[18](S) dm-18[17](S) dm-17[16](S) dm-16[15](S) dm-15[14](S) dm-14[13](S) dm-13[12](S) dm-12[11](S) dm-11[10](S) dm-10[9](S) dm-9[8](S) dm-8[7](S) dm-7[6](S) dm-6[5](S) dm-5[4](S) dm-4[3](S) dm-25[2](S) dm-2[1](S) dm-1[0](S)
      786432 blocks super external:ddf
       
md0 : active raid1 sdh1[1] sdg1[0]
      124967808 blocks super 1.2 [2/2] [UU]

# mdadm --create /dev/md1 --assume-clean --level=6 --raid-devices=24 --chunk=64 --size=2929686528 /dev/mapper/sdi /dev/mapper/sdj /dev/mapper/sdag /dev/mapper/sdl /dev/mapper/sdm /dev/mapper/sdn /dev/mapper/sdo /dev/mapper/sdp /dev/mapper/sdq /dev/mapper/sdr /dev/mapper/sds /dev/mapper/sdt /dev/mapper/sdu /dev/mapper/sdv /dev/mapper/sdw /dev/mapper/sdx /dev/mapper/sdy /dev/mapper/sdz /dev/mapper/sdaa /dev/mapper/sdab /dev/mapper/sdac /dev/mapper/sdad /dev/mapper/sdae /dev/mapper/sdaf
mdadm: /dev/mapper/sdi appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdj appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdag appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdl appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdm appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdn appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdo appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdp appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdq appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdr appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sds appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdt appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdu appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdv appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdw appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdx appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdy appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdz appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdaa appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdab appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdac appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdad appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdae appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
mdadm: /dev/mapper/sdaf appears to be part of a raid array:
       level=container devices=24 ctime=Tue Mar  8 21:18:40 2016
Continue creating array? y
mdadm: Creating array inside ddf container md127
mdadm: array /dev/md1 started.

# cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md1 : active (auto-read-only) raid6 dm-24[23] dm-23[22] dm-22[21] dm-21[20] dm-20[19] dm-19[18] dm-18[17] dm-17[16] dm-16[15] dm-15[14] dm-14[13] dm-13[12] dm-12[11] dm-11[10] dm-10[9] dm-9[8] dm-8[7] dm-7[6] dm-6[5] dm-5[4] dm-4[3] dm-25[2] dm-2[1] dm-1[0]
      17208463360 blocks super external:/md127/0 level 6, 64k chunk, algorithm 10 [24/24] [UUUUUUUUUUUUUUUUUUUUUUUU]
      
md127 : inactive dm-24[23](S) dm-23[22](S) dm-22[21](S) dm-21[20](S) dm-20[19](S) dm-19[18](S) dm-18[17](S) dm-17[16](S) dm-16[15](S) dm-15[14](S) dm-14[13](S) dm-13[12](S) dm-12[11](S) dm-11[10](S) dm-10[9](S) dm-9[8](S) dm-8[7](S) dm-7[6](S) dm-6[5](S) dm-5[4](S) dm-4[3](S) dm-25[2](S) dm-2[1](S) dm-1[0](S)
      786432 blocks super external:ddf
       
md0 : active raid1 sdh1[1] sdg1[0]
      124967808 blocks super 1.2 [2/2] [UU]
      
unused devices: <none>

# mdadm -E /dev/md127
/dev/md127:
          Magic : de11de11
        Version : 01.02.00
Controller GUID : 4C696E75:782D4D44:2064616D:2D73746F:72616765:2D6E3031
                  (Linux-MD dam-storage-n01)
 Container GUID : 4C696E75:782D4D44:DEADBEEF:00000000:4410E200:C7DA314C
                  (Linux-MD 03/08/16 21:18:40)
            Seq : 00000003
  Redundant hdr : yes
  Virtual Disks : 1

      VD GUID[0] : 4C696E75:782D4D44:DEADBEEF:00000000:4410E257:2EF31542
                  (Linux-MD 03/08/16 21:20:07)
         unit[0] : 1
        state[0] : Optimal, Consistent
   init state[0] : Fully Initialised
       access[0] : Read/Write
         Name[0] : 1
 Raid Devices[0] : 24 (0@0K 1@0K 2@0K 3@0K 4@0K 5@0K 6@0K 7@0K 8@0K 9@0K 10@0K 11@0K 12@0K 13@0K 14@0K 15@0K 16@0K 17@0K 18@0K 19@0K 20@0K 21@0K 22@0K 23@0K)
   Chunk Size[0] : 128 sectors
   Raid Level[0] : RAID6
  Device Size[0] : 782202880
   Array Size[0] : 17208463360

 Physical Disks : 1023
      Number    RefNo      Size       Device      Type/State
         0    1bfc5b4f  2930233816K /dev/dm-1       active/Online
         1    44f3f4b1  2930233816K /dev/dm-2       active/Online
         2    22f1c2e4  2930233816K /dev/dm-25      active/Online
         3    ef32ccff  2930233816K /dev/dm-4       active/Online
         4    c7e6f1fe  2930233816K /dev/dm-5       active/Online
         5    00272b4d  2930233816K /dev/dm-6       active/Online
         6    f961cba2  2930233816K /dev/dm-7       active/Online
         7    40f01419  2930233816K /dev/dm-8       active/Online
         8    75858e24  2930233816K /dev/dm-9       active/Online
         9    ac398181  2930233816K /dev/dm-10      active/Online
        10    b39d9cbb  2930233816K /dev/dm-11      active/Online
        11    a71a4095  2930233816K /dev/dm-12      active/Online
        12    f1bf38e9  2930233816K /dev/dm-13      active/Online
        13    1a8973b2  2930233816K /dev/dm-14      active/Online
        14    c107b1b5  2930233816K /dev/dm-15      active/Online
        15    26b44a36  2930233816K /dev/dm-16      active/Online
        16    7f376a5f  2930233816K /dev/dm-17      active/Online
        17    22944f44  2930233816K /dev/dm-18      active/Online
        18    8e356094  2930233816K /dev/dm-19      active/Online
        19    0b454914  2930233816K /dev/dm-20      active/Online
        20    71df5ccc  2930233816K /dev/dm-21      active/Online
        21    763d65a1  2930233816K /dev/dm-22      active/Online
        22    aacda00d  2930233816K /dev/dm-23      active/Online
        23    9837ac03  2930233816K /dev/dm-24      active/Online


(picking a device at random)
# mdadm -E /dev/mapper/sdp
/dev/mapper/sdp:
          Magic : de11de11
        Version : 01.02.00
Controller GUID : 4C696E75:782D4D44:2064616D:2D73746F:72616765:2D6E3031
                  (Linux-MD dam-storage-n01)
 Container GUID : 4C696E75:782D4D44:DEADBEEF:00000000:4410E200:C7DA314C
                  (Linux-MD 03/08/16 21:18:40)
            Seq : 00000003
  Redundant hdr : yes
  Virtual Disks : 1

      VD GUID[0] : 4C696E75:782D4D44:DEADBEEF:00000000:4410E257:2EF31542
                  (Linux-MD 03/08/16 21:20:07)
         unit[0] : 1
        state[0] : Optimal, Consistent
   init state[0] : Fully Initialised
       access[0] : Read/Write
         Name[0] : 1
 Raid Devices[0] : 24 (0@0K 1@0K 2@0K 3@0K 4@0K 5@0K 6@0K 7@0K 8@0K 9@0K 10@0K 11@0K 12@0K 13@0K 14@0K 15@0K 16@0K 17@0K 18@0K 19@0K 20@0K 21@0K 22@0K 23@0K)
   Chunk Size[0] : 128 sectors
   Raid Level[0] : RAID6
  Device Size[0] : 782202880
   Array Size[0] : 17208463360

 Physical Disks : 1023
      Number    RefNo      Size       Device      Type/State
         0    1bfc5b4f  2930233816K                 active/Online
         1    44f3f4b1  2930233816K                 active/Online
         2    22f1c2e4  2930233816K                 active/Online
         3    ef32ccff  2930233816K                 active/Online
         4    c7e6f1fe  2930233816K                 active/Online
         5    00272b4d  2930233816K                 active/Online
         6    f961cba2  2930233816K                 active/Online
         7    40f01419  2930233816K /dev/dm-8       active/Online
         8    75858e24  2930233816K                 active/Online
         9    ac398181  2930233816K                 active/Online
        10    b39d9cbb  2930233816K                 active/Online
        11    a71a4095  2930233816K                 active/Online
        12    f1bf38e9  2930233816K                 active/Online
        13    1a8973b2  2930233816K                 active/Online
        14    c107b1b5  2930233816K                 active/Online
        15    26b44a36  2930233816K                 active/Online
        16    7f376a5f  2930233816K                 active/Online
        17    22944f44  2930233816K                 active/Online
        18    8e356094  2930233816K                 active/Online
        19    0b454914  2930233816K                 active/Online
        20    71df5ccc  2930233816K                 active/Online
        21    763d65a1  2930233816K                 active/Online
        22    aacda00d  2930233816K                 active/Online
        23    9837ac03  2930233816K                 active/Online

# fdisk -l /dev/md1

Disk /dev/md1: 17621.5 GB, 17621466480640 bytes
2 heads, 4 sectors/track, -1 cylinders, total 34416926720 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 65536 bytes / 1441792 bytes
Disk identifier: 0x00000000

Disk /dev/md1 doesn't contain a valid partition table


# file -s /dev/md1
/dev/md1: LVM2 PV (Linux Logical Volume Manager), UUID: 1lz3cZ-j4Sj-ZTcH-GiEm-eXGi-qMSr-lIzwyS, size: 65999978102784

For reference, here is an array of the same size attached to the HW RAID controller:
# fdisk -l /dev/sda

Disk /dev/sda: 66000.0 GB, 65999989637120 bytes
255 heads, 63 sectors/track, 8024041 cylinders, total 128906229760 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

Disk /dev/sda doesn't contain a valid partition table


> On Mar 7, 2016, at 4:50 PM, Andreas Klauer <Andreas.Klauer@metamorpher.de> wrote:
> 
> On Sun, Mar 06, 2016 at 04:37:42PM -0500, Dan Russell wrote:
>> can md read a RAID6 created by the 2108, or would I be wasting my time?
> 
> It might be possible.
> 
> I don't use HW-RAID anywhere, but if I were forced to, this would be 
> one of the first things to determine: how to read it with software 
> in case the card fails.
> 
>> I can paste or gist the output of mdadm -E (304 lines) if it would help.
> 
> I don't know if it would help, but paste everything you have anyhow.
> 
> Is the data encrypted? If not - search for known file type headers 
> (like a large jpeg or tar.gz, something larger than disks * chunksize), 
> then look at the data on the other disks at the same offsets 
> (grab a few megs of each disk), then try to deduce the layout 
> and structure from that.
> 
> Basically use known data to reverse-engineer the layout if you 
> currently know nothing about disk orders, chunk sizes, etc.
> 
> Regards
> Andreas Klauer

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Import/recover RAID6 created with SAS2108?
  2016-03-09  3:20   ` Dan Russell
@ 2016-03-09 19:27     ` Andreas Klauer
  2016-03-09 20:53       ` Dan Russell
  2016-03-10  7:02     ` NeilBrown
  1 sibling, 1 reply; 6+ messages in thread
From: Andreas Klauer @ 2016-03-09 19:27 UTC (permalink / raw)
  To: Dan Russell; +Cc: linux-raid

On Tue, Mar 08, 2016 at 10:20:17PM -0500, Dan Russell wrote:
> The partially-rebuilt drive is sdk, the original “failed” drive is sdag

Best to leave both out if one has outdated content and the other only 
half of it...
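
That is, if you end up re-creating, use the "missing" keyword in that slot 
rather than either of those two disks - something like this (untested 
sketch, device list abbreviated):

# mdadm --create /dev/md1 --assume-clean --level=6 --raid-devices=24 --chunk=64 /dev/mapper/sdi /dev/mapper/sdj missing /dev/mapper/sdl [... rest in your order ...] /dev/mapper/sdaf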

> However fdisk and mdadm are reporting the array is 17.6TB in size, whereas it should be 66TB (24 3TB HDDs RAID6).

I reproduced your commands using tmpfs based loop devices and it gives me 
the same problem. The RAID size is only 16 TiB. It seems to be hitting a 
limit somewhere.
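
For reference, the test rig was along these lines (a rough sketch; paths 
and sizes are just an example):

# mkdir -p /mnt/test && mount -t tmpfs tmpfs /mnt/test
# for i in $(seq 0 23); do truncate -s 3T /mnt/test/d$i; losetup -f /mnt/test/d$i; done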

Your /dev/mapper/sdx are snapshot/overlays, I hope?

DDF metadata seems to be located at the end of the device, so you could try 
your luck with mdadm 1.0 metadata instead; that gives me a RAID of a size 
closer to the mark.

# mdadm --create /dev/md42 --assume-clean --metadata=1.0 --level=6 --raid-devices=24 --chunk=64 /dev/loop{0..23}
# fdisk -l /dev/md42
Disk /dev/md42: 60 TiB, 65999996846080 bytes, 128906243840 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 1441792 bytes

It wastes some sectors at the end, though; I'm not sure whether that's 
more or less than what DDF uses for metadata. You might have to add 
some empty space to your device mappings to get a full view.
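
Something like this per disk might do it, with the overlay then stacked 
on top of the padded device (a sketch only; the pad size is a guess):

# truncate -s 4M /tmp/pad-sdi; PAD=$(losetup -f --show /tmp/pad-sdi)
# SIZE=$(blockdev --getsz /dev/sdi)
# printf '0 %s linear /dev/sdi 0\n%s 8192 linear %s 0\n' $SIZE $SIZE $PAD | dmsetup create sdi-padded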

Regards
Andreas Klauer

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Import/recover RAID6 created with SAS2108?
  2016-03-09 19:27     ` Andreas Klauer
@ 2016-03-09 20:53       ` Dan Russell
  0 siblings, 0 replies; 6+ messages in thread
From: Dan Russell @ 2016-03-09 20:53 UTC (permalink / raw)
  To: linux-raid

On Mar 9, 2016, at 2:27 PM, Andreas Klauer <Andreas.Klauer@metamorpher.de> wrote:
> 
> On Tue, Mar 08, 2016 at 10:20:17PM -0500, Dan Russell wrote:
>> The partially-rebuilt drive is sdk, the original “failed” drive is sdag
> 
> Best to leave both out if one has outdated content and the other only 
> half of it...

I generally agree, but in my case the filesystem wasn’t mounted (it was commented out of fstab; the movers dropped the system, I booted it, and the RAID failed before I ever mounted the filesystem), so I’m OK with the risk.

> 
>> However fdisk and mdadm are reporting the array is 17.6TB in size, whereas it should be 66TB (24 3TB HDDs RAID6).
> 
> I reproduced your commands using tmpfs based loop devices and it gives me 
> the same problem. The RAID size is only 16 TiB. It seems to be hitting a 
> limit somewhere.
> 
> Your /dev/mapper/sdx are snapshot/overlays, I hope?

Yes.  I can’t recommend the overlay_setup approach on the Wiki highly enough.
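
For anyone finding this thread later, the core of it is just a sparse file plus a dm snapshot per disk, roughly (simplified from the wiki script; names and the COW size are illustrative):

# truncate -s 10G /tmp/overlay-sdi
# LOOP=$(losetup -f --show /tmp/overlay-sdi)
# SIZE=$(blockdev --getsz /dev/sdi)
# dmsetup create sdi --table "0 $SIZE snapshot /dev/sdi $LOOP P 8"

All the mdadm experiments then run against /dev/mapper/sdi, and any writes land in the sparse file instead of on the real disk.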

> DDF metadata seems to be located at the end of the device, so you could try 
> your luck with mdadm 1.0 metadata instead; that gives me a RAID of a size 
> closer to the mark.

This got me closer, but the LVM2 label was missing.  When I’d previously assembled the RAID in the container, /proc/mdstat showed algorithm 10, whereas with this approach it was 2.  I switched it to 10 and my array is back.  fsck (xfs_repair -n, really) says the FS is clean, and random poking at files seems to back that up.
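
For the archives, the working command ended up looking roughly like this (paraphrased; I believe md’s "algorithm 10" is what mdadm calls the ddf-N-continue layout, and the VG/LV names are placeholders):

# mdadm --create /dev/md1 --assume-clean --metadata=1.0 --level=6 --layout=ddf-N-continue --raid-devices=24 --chunk=64 /dev/mapper/sdi /dev/mapper/sdj /dev/mapper/sdag /dev/mapper/sdl [... same order as before ...] /dev/mapper/sdaf
# vgchange -ay && xfs_repair -n /dev/<vg>/<lv>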

I have a backup, of course, but doing a disk-to-disk verify/recovery is going to be so much quicker.

Thank you so much for your help, Andreas and all the contributors to the “RAID_Recovery” and “Recovering_a_failed_software_RAID” Wiki pages.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Import/recover RAID6 created with SAS2108?
  2016-03-09  3:20   ` Dan Russell
  2016-03-09 19:27     ` Andreas Klauer
@ 2016-03-10  7:02     ` NeilBrown
  1 sibling, 0 replies; 6+ messages in thread
From: NeilBrown @ 2016-03-10  7:02 UTC (permalink / raw)
  To: Dan Russell, linux-raid

[-- Attachment #1: Type: text/plain, Size: 1785 bytes --]

On Wed, Mar 09 2016, Dan Russell wrote:

> # mdadm --create /dev/md1 --assume-clean --level=6 --raid-devices=24 --chunk=64 --size=2929686528

22 * 2929686528 == 64453103616 == 0xF01B45000   (22 data disks in a 24-disk RAID6)
> # cat /proc/mdstat 
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
> md1 : active (auto-read-only) raid6 dm-24[23] dm-23[22].....
>       17208463360 blocks super external:/md127/0 level 6, 64k chunk,....

17208463360 == 0x401B45000

So the size hasn't just been truncated to some number of bits; the
5th byte has changed from 0xF to 0x4.

That is very odd....

Ahhh.. I tried it myself, and used "mdadm -D" to look at the RAID6
array.

  Used Dev Size : 782202880 (745.97 GiB 800.98 GB)

The requested per-device size was
   2929686528 = 0xAE9F7800
the size given was
   782202880  = 0x2E9F7800

so we lost the msbit there...  Ahhhh.

diff --git a/super-ddf.c b/super-ddf.c
index faaf0a7ca9e0..0e00d17dd169 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -2688,10 +2688,10 @@ static int init_super_ddf_bvd(struct supertype *st,
 		free(vcl);
 		return 0;
 	}
-	vc->blocks = cpu_to_be64(info->size * 2);
+	vc->blocks = cpu_to_be64(size * 2);
 	vc->array_blocks = cpu_to_be64(
 		calc_array_size(info->level, info->raid_disks, info->layout,
-				info->chunk_size, info->size*2));
+				info->chunk_size, size*2));
 	memset(vc->pad1, 0xff, 8);
 	vc->spare_refs[0] = cpu_to_be32(0xffffffff);
 	vc->spare_refs[1] = cpu_to_be32(0xffffffff);


That was careless.  "info" is a legacy structure which has a 32-bit size
field, so a 64-bit size is passed as a separate arg, but this function
used the wrong one :-(

I'll send off a patch.

Thanks for the report - and glad you could get at your data :-)

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 818 bytes --]

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-03-10  7:02 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-06 21:37 Import/recover RAID6 created with SAS2108? Dan Russell
2016-03-07 21:50 ` Andreas Klauer
2016-03-09  3:20   ` Dan Russell
2016-03-09 19:27     ` Andreas Klauer
2016-03-09 20:53       ` Dan Russell
2016-03-10  7:02     ` NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).