Linux RAID subsystem development


* RAID5 grow interrupted.
From: Axel Spallek @ 2016-09-22 14:11 UTC
  To: linux-raid

Hello there.

I did something wrong.

I tried to remove one disk from a RAID5 with 8 HDDs (4 TB each) to free
it up as a hot spare, so that I could change to RAID6 afterwards.

The RAID was clean and not rebuilding before I started.

The partition was only 11 TB in size. It had not yet been resized, because
it was not created with 64-bit support, which I wanted to do afterwards.

Therefore I issued the following commands:

mdadm --grow -n7 /dev/md1  <-- just to get the size for the next command.

mdadm --grow /dev/md1 --array-size 23441292288

mdadm --grow -n7 /dev/md1 --backup-file /var/backups/mdadm.backup
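
The value passed to --array-size above can be cross-checked against the
--examine output later in this message. A minimal sketch, assuming that
Used Dev Size is reported in 512-byte sectors, that --array-size takes KiB,
and that RAID5 keeps data on all but one member; the function name
raid5_array_size_kib is made up for illustration:

```shell
# Hypothetical helper: usable RAID5 capacity in KiB.
#   $1 = number of raid devices
#   $2 = used size per device, in 512-byte sectors
raid5_array_size_kib() {
    # One device's worth of space holds parity; 2 sectors = 1 KiB.
    echo $(( ($1 - 1) * $2 / 2 ))
}

# Figures from the --examine output: 7 devices after the shrink,
# Used Dev Size 7813764096 sectors.
raid5_array_size_kib 7 7813764096
```

With these inputs the result is 23441292288, the same figure that --examine
reports as Array Size.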


The RAID /dev/md1 is mounted on /srv, so the backup file in /var/backups
is not on the array and is safe.

After some time, someone told me that seafile was not working. Since I
was in a hurry, I just rebooted the server and forgot about the running
RAID reshape.

The server came up again, but without /dev/md1.

I have a backup that is 2 days old. That is not so bad, because I have the
seafile data on several computers, but getting the RAID working again
would be better.

How do I restart the rebuild process with the backup-file?

This is what I get in the console:

root@s10:~# cat /proc/mdstat
Personalities :
md1 : inactive sdh1[0](S) sda1[5](S) sdb1[6](S) sdc1[7](S) sdd1[8](S) sdf1[4](S) sde1[2](S) sdg1[1](S)
       31255059140 blocks super 1.2

unused devices: <none>
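
The block count shown for the inactive array can be reproduced from the
per-device figures in the --examine output further down. A minimal sketch,
assuming /proc/mdstat counts 1 KiB blocks and that an inactive array simply
sums its members' Avail Dev Size (an inference from the numbers, not
something stated in this thread); total_blocks_kib is a hypothetical name:

```shell
# Hypothetical helper: summed member size in 1 KiB blocks.
#   $1 = number of member devices
#   $2 = size per member, in 512-byte sectors
total_blocks_kib() {
    echo $(( $1 * $2 / 2 ))
}

# 8 members, Avail Dev Size 7813764785 sectors each.
total_blocks_kib 8 7813764785
```

This gives 31255059140, matching the "blocks" figure in the mdstat output.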


root@s10:~# mdadm -A --scan --verbose
mdadm: looking for devices for further assembly
mdadm: /dev/sdg1 is busy - skipping
mdadm: Cannot assemble mbr metadata on /dev/sdg
mdadm: /dev/sdd1 is busy - skipping
mdadm: Cannot assemble mbr metadata on /dev/sdd
mdadm: /dev/sdc1 is busy - skipping
mdadm: Cannot assemble mbr metadata on /dev/sdc
mdadm: /dev/sda1 is busy - skipping
mdadm: Cannot assemble mbr metadata on /dev/sda
mdadm: /dev/sdb1 is busy - skipping
mdadm: Cannot assemble mbr metadata on /dev/sdb
mdadm: /dev/sdf1 is busy - skipping
mdadm: Cannot assemble mbr metadata on /dev/sdf
mdadm: /dev/sdh1 is busy - skipping
mdadm: Cannot assemble mbr metadata on /dev/sdh
mdadm: /dev/sde1 is busy - skipping
mdadm: Cannot assemble mbr metadata on /dev/sde
mdadm: no recogniseable superblock on /dev/sdj5
mdadm: Cannot assemble mbr metadata on /dev/sdj2
mdadm: no recogniseable superblock on /dev/sdj1
mdadm: Cannot assemble mbr metadata on /dev/sdj
mdadm: no recogniseable superblock on /dev/sdi1
mdadm: Cannot assemble mbr metadata on /dev/sdi
mdadm: No arrays found in config file or automatically


root@s10:~# mdadm --examine --scan
ARRAY /dev/md/1  metadata=1.2 UUID=48f60e15:900f47cc:6c5f42b1:82f01530 name=s10:1


root@s10:~# mdadm --examine /dev/sd*
/dev/sda:
    MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sda1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x4
      Array UUID : 48f60e15:900f47cc:6c5f42b1:82f01530
            Name : s10:1  (local to host s10)
   Creation Time : Fri Sep 16 06:59:32 2016
      Raid Level : raid5
    Raid Devices : 7

  Avail Dev Size : 7813764785 (3725.89 GiB 4000.65 GB)
      Array Size : 23441292288 (22355.36 GiB 24003.88 GB)
   Used Dev Size : 7813764096 (3725.89 GiB 4000.65 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
    Unused Space : before=262064 sectors, after=689 sectors
           State : clean
     Device UUID : 9fe5980b:be2beb6b:59537ad1:90091564

   Reshape pos'n : 22485531648 (21443.87 GiB 23025.18 GB)
   Delta Devices : -1 (8->7)

     Update Time : Thu Sep 22 14:23:54 2016
        Checksum : 1ea679f3 - correct
          Events : 180344

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 7
    Array State : AAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb:
    MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sdb1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x4
      Array UUID : 48f60e15:900f47cc:6c5f42b1:82f01530
            Name : s10:1  (local to host s10)
   Creation Time : Fri Sep 16 06:59:32 2016
      Raid Level : raid5
    Raid Devices : 7

  Avail Dev Size : 7813764785 (3725.89 GiB 4000.65 GB)
      Array Size : 23441292288 (22355.36 GiB 24003.88 GB)
   Used Dev Size : 7813764096 (3725.89 GiB 4000.65 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
    Unused Space : before=262064 sectors, after=689 sectors
           State : clean
     Device UUID : 370c5540:2d3bdd3e:40a36449:b82309a8

   Reshape pos'n : 22485531648 (21443.87 GiB 23025.18 GB)
   Delta Devices : -1 (8->7)

     Update Time : Thu Sep 22 14:23:54 2016
        Checksum : e2331a09 - correct
          Events : 180344

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 6
    Array State : AAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc:
    MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sdc1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x4
      Array UUID : 48f60e15:900f47cc:6c5f42b1:82f01530
            Name : s10:1  (local to host s10)
   Creation Time : Fri Sep 16 06:59:32 2016
      Raid Level : raid5
    Raid Devices : 7

  Avail Dev Size : 7813764785 (3725.89 GiB 4000.65 GB)
      Array Size : 23441292288 (22355.36 GiB 24003.88 GB)
   Used Dev Size : 7813764096 (3725.89 GiB 4000.65 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
    Unused Space : before=262064 sectors, after=689 sectors
           State : clean
     Device UUID : a9e8caf9:4c70d937:50e55bdf:b736cb97

   Reshape pos'n : 22485531648 (21443.87 GiB 23025.18 GB)
   Delta Devices : -1 (8->7)

     Update Time : Thu Sep 22 14:23:54 2016
        Checksum : 1a5e80ac - correct
          Events : 180344

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 5
    Array State : AAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd:
    MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sdd1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x4
      Array UUID : 48f60e15:900f47cc:6c5f42b1:82f01530
            Name : s10:1  (local to host s10)
   Creation Time : Fri Sep 16 06:59:32 2016
      Raid Level : raid5
    Raid Devices : 7

  Avail Dev Size : 7813764785 (3725.89 GiB 4000.65 GB)
      Array Size : 23441292288 (22355.36 GiB 24003.88 GB)
   Used Dev Size : 7813764096 (3725.89 GiB 4000.65 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
    Unused Space : before=262064 sectors, after=689 sectors
           State : clean
     Device UUID : 50a6466d:057b9171:22865989:cabb59ce

   Reshape pos'n : 22485531648 (21443.87 GiB 23025.18 GB)
   Delta Devices : -1 (8->7)

     Update Time : Thu Sep 22 14:23:54 2016
        Checksum : a81e6ef1 - correct
          Events : 180344

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 4
    Array State : AAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sde:
    MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sde1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x4
      Array UUID : 48f60e15:900f47cc:6c5f42b1:82f01530
            Name : s10:1  (local to host s10)
   Creation Time : Fri Sep 16 06:59:32 2016
      Raid Level : raid5
    Raid Devices : 7

  Avail Dev Size : 7813764785 (3725.89 GiB 4000.65 GB)
      Array Size : 23441292288 (22355.36 GiB 24003.88 GB)
   Used Dev Size : 7813764096 (3725.89 GiB 4000.65 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
    Unused Space : before=262064 sectors, after=689 sectors
           State : clean
     Device UUID : 1fd7fdde:3220d053:1cad772f:508ed8a7

   Reshape pos'n : 22485531648 (21443.87 GiB 23025.18 GB)
   Delta Devices : -1 (8->7)

     Update Time : Thu Sep 22 14:23:54 2016
        Checksum : 7bb13e67 - correct
          Events : 180344

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 2
    Array State : AAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdf:
    MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sdf1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x4
      Array UUID : 48f60e15:900f47cc:6c5f42b1:82f01530
            Name : s10:1  (local to host s10)
   Creation Time : Fri Sep 16 06:59:32 2016
      Raid Level : raid5
    Raid Devices : 7

  Avail Dev Size : 7813764785 (3725.89 GiB 4000.65 GB)
      Array Size : 23441292288 (22355.36 GiB 24003.88 GB)
   Used Dev Size : 7813764096 (3725.89 GiB 4000.65 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
    Unused Space : before=262064 sectors, after=689 sectors
           State : clean
     Device UUID : b2788315:a75fec1f:1d2681ee:2bba1be7

   Reshape pos'n : 22485531648 (21443.87 GiB 23025.18 GB)
   Delta Devices : -1 (8->7)

     Update Time : Thu Sep 22 14:23:54 2016
        Checksum : 7c9fc44d - correct
          Events : 180344

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 3
    Array State : AAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdg:
    MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sdg1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x4
      Array UUID : 48f60e15:900f47cc:6c5f42b1:82f01530
            Name : s10:1  (local to host s10)
   Creation Time : Fri Sep 16 06:59:32 2016
      Raid Level : raid5
    Raid Devices : 7

  Avail Dev Size : 7813764785 (3725.89 GiB 4000.65 GB)
      Array Size : 23441292288 (22355.36 GiB 24003.88 GB)
   Used Dev Size : 7813764096 (3725.89 GiB 4000.65 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
    Unused Space : before=262064 sectors, after=689 sectors
           State : clean
     Device UUID : 4e7223d6:4416983d:7812788b:a2114dec

   Reshape pos'n : 22485531648 (21443.87 GiB 23025.18 GB)
   Delta Devices : -1 (8->7)

     Update Time : Thu Sep 22 14:23:54 2016
        Checksum : fd13b855 - correct
          Events : 180344

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 1
    Array State : AAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdh:
    MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sdh1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x4
      Array UUID : 48f60e15:900f47cc:6c5f42b1:82f01530
            Name : s10:1  (local to host s10)
   Creation Time : Fri Sep 16 06:59:32 2016
      Raid Level : raid5
    Raid Devices : 7

  Avail Dev Size : 7813764785 (3725.89 GiB 4000.65 GB)
      Array Size : 23441292288 (22355.36 GiB 24003.88 GB)
   Used Dev Size : 7813764096 (3725.89 GiB 4000.65 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
    Unused Space : before=262064 sectors, after=689 sectors
           State : clean
     Device UUID : c1226e5c:4145cbcf:0bbe6160:4e0da07b

   Reshape pos'n : 22485531648 (21443.87 GiB 23025.18 GB)
   Delta Devices : -1 (8->7)

     Update Time : Thu Sep 22 14:23:54 2016
        Checksum : 79ce3f03 - correct
          Events : 180344

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 0
    Array State : AAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdi:
    MBR Magic : aa55
Partition[0] :    500109312 sectors at         2048 (type 83)
mdadm: No md superblock detected on /dev/sdi1.
/dev/sdj:
    MBR Magic : aa55
Partition[0] :     56143872 sectors at         2048 (type 83)
Partition[1] :      2478082 sectors at     56147966 (type 05)
mdadm: No md superblock detected on /dev/sdj1.
/dev/sdj2:
    MBR Magic : aa55
Partition[0] :      2478080 sectors at            2 (type 82)
mdadm: No md superblock detected on /dev/sdj5.



root@s10:~# mdadm --detail /dev/md1
/dev/md1:
         Version : 1.2
      Raid Level : raid0
   Total Devices : 8
     Persistence : Superblock is persistent

           State : inactive

   Delta Devices : -1, (1->0)
       New Level : raid5
      New Layout : left-symmetric
   New Chunksize : 512K

            Name : s10:1  (local to host s10)
            UUID : 48f60e15:900f47cc:6c5f42b1:82f01530
          Events : 180344

     Number   Major   Minor   RaidDevice

        -       8        1        -        /dev/sda1
        -       8       17        -        /dev/sdb1
        -       8       33        -        /dev/sdc1
        -       8       49        -        /dev/sdd1
        -       8       65        -        /dev/sde1
        -       8       81        -        /dev/sdf1
        -       8       97        -        /dev/sdg1
        -       8      113        -        /dev/sdh1


-- 

Kind regards,

Axel Spallek
Dipl.-Ing. FH IE




* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Andreas Klauer @ 2016-09-22  9:56 UTC
  To: Simon Becks; +Cc: Linux-RAID
In-Reply-To: <CAKA=zHr-dwudU5bENHF_3QSG9vYi4Tib8Sg2Jq_rfqOjuF7vew@mail.gmail.com>

On Wed, Sep 21, 2016 at 11:30:02PM +0200, Simon Becks wrote:
> I tried all possible orders but xfs_repair found in no round instantly
> a superblock or did the repair with flying colors.

You could also transfer the first 512K chunk of your removed disk over,
or use the removed disk as first disk, just to see if that changes anything.

You could also use only two disks (third disk missing) in case one of 
the disks had more data overwritten than the others.

But with photorec not finding anything, there are two possibilities:
either the RAID is assembled incorrectly, or too much was overwritten.

You could run an mdadm check and look at the number in mismatch_cnt.
There should be some mismatches since parts were overwritten, but if the
data were mostly intact there should not be too many.

But if it was zeroed there won't be mismatches either so you should 
probably check raw data itself too...
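
One way to run such a check is through the md sysfs files. A minimal
sketch, assuming the array is /dev/md42 as elsewhere in this thread and
that sync_action and mismatch_cnt are present under /sys/block/md42/md;
the helper names and the MD_SYSFS override are illustrative only:

```shell
# Directory holding the md sysfs attributes (assumed path; overridable).
MD_SYSFS=${MD_SYSFS:-/sys/block/md42/md}

start_check() {
    # Ask the md driver to start a consistency check (needs root).
    echo check > "$MD_SYSFS/sync_action"
}

read_mismatch_cnt() {
    # Mismatch counter as reported by md for the last check.
    cat "$MD_SYSFS/mismatch_cnt"
}
```

After start_check, wait until sync_action reads "idle" again before
relying on the value returned by read_mismatch_cnt.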

Regards
Andreas Klauer


* [PATCH v2] Fix RAID metadata check
From: Mariusz Dabrowski @ 2016-09-22  7:02 UTC
  To: linux-raid
  Cc: Jes.Sorensen, tomasz.majchrzak, aleksey.obitotskiy,
	pawel.baldysiak, artur.paszkiewicz, maksymilian.kunt,
	Mariusz Dabrowski

mdadm recognizes devices with a partition table as part of a RAID array
and displays an invalid warning message. After this fix, proper warning
messages are displayed for MBR/GPT disks and for devices with RAID
metadata.

Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
---
 util.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/util.c b/util.c
index 8b52242..a238a21 100644
--- a/util.c
+++ b/util.c
@@ -710,17 +710,22 @@ int check_raid(int fd, char *name)
 
 	if (!st)
 		return 0;
-	st->ss->load_super(st, fd, name);
-	/* Looks like a raid array .. */
-	pr_err("%s appears to be part of a raid array:\n",
-		name);
-	st->ss->getinfo_super(st, &info, NULL);
-	st->ss->free_super(st);
-	crtime = info.array.ctime;
-	level = map_num(pers, info.array.level);
-	if (!level) level = "-unknown-";
-	cont_err("level=%s devices=%d ctime=%s",
-		 level, info.array.raid_disks, ctime(&crtime));
+	if (st->ss->add_to_super != NULL) {
+		st->ss->load_super(st, fd, name);
+		/* Looks like a raid array .. */
+		pr_err("%s appears to be part of a raid array:\n", name);
+		st->ss->getinfo_super(st, &info, NULL);
+		st->ss->free_super(st);
+		crtime = info.array.ctime;
+		level = map_num(pers, info.array.level);
+		if (!level)
+			level = "-unknown-";
+		cont_err("level=%s devices=%d ctime=%s",
+			level, info.array.raid_disks, ctime(&crtime));
+	} else {
+		/* Looks like GPT or MBR */
+		pr_err("partition table exists on %s\n", name);
+	}
 	return 1;
 }
 
-- 
1.8.3.1



* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Simon Becks @ 2016-09-22  5:46 UTC
  To: Simon Becks; +Cc: Linux-RAID
In-Reply-To: <CAKA=zHr-dwudU5bENHF_3QSG9vYi4Tib8Sg2Jq_rfqOjuF7vew@mail.gmail.com>

I have now measured the time it takes xfs_repair to find the superblock
in all combinations of disk order.

The fastest was 5 minutes with the order sda,sdb,sdc, but it reported:
error reading superblock 22 -- seek to offset 2031216754688 failed

Superblock 22 is the superblock that was found in 3 of the 6 orders.

So I assumed the fastest hit might be the right one and started photorec on it:

Photorec only found

txt: 38 recovered
gif: 1 recovered

The GIF was several gigabytes in size and is not a real picture. The text
files are all smaller than 4K and only contain ps aux output from the NAS.

It seems I still do not have the right order of the disks? But it
looks identical to me:

/dev/mapper/sdb6:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : da61174d:9567c4df:fcea79f1:38024893
           Name : grml:42  (local to host grml)
  Creation Time : Thu Sep 22 05:14:11 2016
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1923497952 (917.20 GiB 984.83 GB)
     Array Size : 1923496960 (1834.39 GiB 1969.66 GB)
  Used Dev Size : 1923496960 (917.19 GiB 984.83 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1960 sectors, after=992 sectors
          State : clean
    Device UUID : d0c61415:186b446b:ca34a8c6:69ed5b18

Internal Bitmap : 8 sectors from superblock
    Update Time : Thu Sep 22 05:14:11 2016
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : bba25a31 - correct
         Events : 1

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0

/dev/sde6:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 342ec726:3804270d:5917dd5f:c24883a9
           Name : TS-XLB6C:2
  Creation Time : Fri Dec 23 17:58:59 2011
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1923497952 (917.20 GiB 984.83 GB)
     Array Size : 1923496960 (1834.39 GiB 1969.66 GB)
  Used Dev Size : 1923496960 (917.19 GiB 984.83 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=992 sectors
          State : active
    Device UUID : d27a69d0:456f3704:8e17ac75:78939886

Internal Bitmap : 8 sectors from superblock
    Update Time : Wed Jul 27 19:08:08 2016
       Checksum : de9dbd10 - correct
         Events : 11543

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
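
The Data Offset of 2048 sectors in both superblocks is where the losetup
offset of $((2048*512)) in the quoted advice below comes from. A minimal
sketch of the conversion, assuming 512-byte sectors; sectors_to_bytes is a
hypothetical helper:

```shell
# Hypothetical helper: convert an offset in 512-byte sectors to bytes.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# Data Offset reported by mdadm --examine.
sectors_to_bytes 2048
```

For 2048 sectors this gives 1048576 bytes, i.e. the 1M offset mentioned in
the quoted advice.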

Now testing photorec with the other orders.

2016-09-21 23:30 GMT+02:00 Simon Becks <beckssimon5@gmail.com>:
> I tried all possible orders but xfs_repair found in no round instantly
> a superblock or did the repair with flying colors.
>
> sda,sdb,sdc
> sda,sdc,sdb
> sdc,sda,sdb
> sdb,sda,sdc
> sdc,sdb,sda
> sdb,sdc,sda
>
>
> will give photorec a try and go to bed for now :/
>
> 2016-09-21 23:07 GMT+02:00 Chris Murphy <lists@colorremedies.com>:
>> On Wed, Sep 21, 2016 at 2:41 PM, Simon Becks <beckssimon5@gmail.com> wrote:
>>> So the old disk i removed 2 month ago reports
>>>
>>> /dev/loop1: SGI XFS filesystem data (blksz 4096, inosz 256, v2 dirs)
>>>
>>> So the filesystem on the raid is/was XFS.  gave xfs_repair a shot but
>>> it segfaults:
>>>
>>> i guess thats good, that it found at least the superblock?
>>
>> There's more than one and they're spread across the array. So it's
>> possible you got the first device order correct, so it finds a
>> superblock there, but then when it goes to the next position the drive
>> is out of order so it gets confused.
>>
>> To me this sounds like one drive is in the correct position but the
>> two others are reversed. But I'm not an XFS expert you'd have to ask
>> on their list.
>>
>>
>>
>>>
>>> root@grml ~ # xfs_repair /dev/md42
>>> Phase 1 - find and verify superblock...
>>> bad primary superblock - bad magic number !!!
>>>
>>> attempting to find secondary superblock...
>>> ...........................................
>>> found candidate secondary superblock...
>>> unable to verify superblock, continuing...
>>> found candidate secondary superblock...
>>> error reading superblock 22 -- seek to offset 2031216754688 failed
>>> unable to verify superblock, continuing...
>>> found candidate secondary superblock...
>>> unable to verify superblock, continuing...
>>> ..found candidate secondary superblock...
>>> verified secondary superblock...
>>> writing modified primary superblock
>>>         - reporting progress in intervals of 15 minutes
>>> sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with
>>> calculated value 2048
>>> resetting superblock root inode pointer to 2048
>>> sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent
>>
>> Those big ones strike me as imaginary numbers.
>>
>>> with calculated value 2049
>>> resetting superblock realtime bitmap ino pointer to 2049
>>> sb realtime summary inode 18446744073709551615 (NULLFSINO)
>>> inconsistent with calculated value 2050
>>> resetting superblock realtime summary ino pointer to 2050
>>> Phase 2 - using internal log
>>>         - zero log...
>>> totally zeroed log
>>>         - scan filesystem freespace and inode maps...
>>> bad magic number
>>> bad magic number
>>> bad magic number
>>> Metadata corruption detected at block 0x8/0x1000
>>> bad magic number
>>> Metadata corruption detected at block 0x23d3f408/0x1000
>>> bad magic numberbad magic number
>>>
>>> Metadata corruption detected at block 0x2afe5808/0x1000
>>> bad magic number
>>> bad magic number
>>> bad magic number
>>> bad magic number
>>> bad magic number
>>> bad magic number
>>> bad magic number
>>> bad magic number
>>> bad magic number
>>> bad magic number
>>> bad magic number
>>> bad magic number
>>> bad magic number
>>> Metadata corruption detected at block 0x10/0x1000
>>> Metadata corruption detected at block 0xe54c808/0x1000
>>> bad magic # 0x494e81f6 for agf 0
>>> bad version # 16908289 for agf 0
>>> bad sequence # 99 for agf 0
>>> bad length 99 for agf 0, should be 15027328
>>> flfirst 1301384768 in agf 0 too large (max = 1024)
>>> bad magic # 0x494e81f6 for agi 0
>>> bad version # 16908289 for agi 0
>>> bad sequence # 99 for agi 0
>>> bad length # 99 for agi 0, should be 15027328
>>> reset bad agf for ag 0
>>> reset bad agi for ag 0
>>> Metadata corruption detected at block 0xd6f7b808/0x1000
>>> Metadata corruption detected at block 0x2afe5810/0x1000
>>> bad on-disk superblock 6 - bad magic number
>>> primary/secondary superblock 6 conflict - AG superblock geometry info
>>> conflicts with filesystem geometry
>>> zeroing unused portion of secondary superblock (AG #6)
>>> [1]    23110 segmentation fault  xfs_repair /dev/md42
>>> xfs_repair /dev/md42
>>>
>>>
>>>
>>> 2016-09-21 21:50 GMT+02:00 Simon Becks <beckssimon5@gmail.com>:
>>>> Thank you. I already learned a lot. Your command only shows data for
>>>> all of the 3 disks.
>>>>
>>>> Out of curiosity i used strings /dev/loop42 | grep mp3 and many of my
>>>> songs showed up - is that a good sign?
>>>>
>>>> Just tried the 5 orders like a,b,c a,c,b and so on and receive the
>>>> same output about mount: wrong fs type, bad option, bad superblock on
>>>> /dev/md42 and fsck.ext2: Superblock invalid, trying backup blocks....
>>>>
>>>> Then used photorec in all 5 combinations of disks for several minutes
>>>> without a single file found.
>>>>
>>>> Is it possible that i have to keep something else in mind, while
>>>> assembling the raid? I expected at least some files with photorec when
>>>> the raid was assembled in the right order.
>>>>
>>>>
>>>> 2016-09-21 21:00 GMT+02:00 Andreas Klauer <Andreas.Klauer@metamorpher.de>:
>>>>> On Wed, Sep 21, 2016 at 08:31:23PM +0200, Simon Becks wrote:
>>>>>> Maybe i just assembled it in the wrong order?
>>>>>
>>>>> Yes, or maybe the superblock was overwritten by XFS after all.
>>>>>
>>>>> You could check what's at offset 1M for each disk.
>>>>>
>>>>> losetup --find --show --read-only --offset=$((2048*512)) /the/disk
>>>>> file -s /dev/loop42
>>>>>
>>>>> If the superblock was still intact it should say ext4 or whatever
>>>>> your filesystem was for at least one of them.
>>>>>
>>>>> You can also try this for the disk you removed 2 month ago.
>>>>>
>>>>> If that is not the case and fsck with backup superblock also
>>>>> is not successful then you'll have to see if you find anything
>>>>> valid in the raw data.
>>>>>
>>>>> Regards
>>>>> Andreas Klauer
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>>
>> --
>> Chris Murphy


* Re: 95a05b3 broke mdadm --add on my superblock 1.0 array
From: Guoqing Jiang @ 2016-09-22  2:40 UTC
  To: Anthony DeRobertis, linux-raid, 837964
In-Reply-To: <57E22C76.6040600@suse.com>



On 09/21/2016 02:45 AM, Guoqing Jiang wrote:
>
>
> On 09/20/2016 02:31 PM, Anthony DeRobertis wrote:
>> Sorry for the amount of emails I'm sending, but I noticed something 
>> that's probably important. I'm also appending some gdb log from 
>> tracing through the function (trying to answer why it's doing cluster 
>> mode stuff at all).
>>
>> While tracing through, I noticed that *before* the write-bitmap loop, 
>> mdadm -E considers the superblock valid. That agrees with what I saw 
>> from strace, I suppose. To my first glance, it figures out how much 
>> to write by calling this function:
>>
>> static unsigned int calc_bitmap_size(bitmap_super_t *bms,
>>                                      unsigned int boundary)
>> {
>>     unsigned long long bits, bytes;
>>
>>     bits = __le64_to_cpu(bms->sync_size) /
>>            (__le32_to_cpu(bms->chunksize) >> 9);
>>     bytes = (bits+7) >> 3;
>>     bytes += sizeof(bitmap_super_t);
>>     bytes = ROUND_UP(bytes, boundary);
>>
>>     return bytes;
>> }
>>
>> That code looked familiar, and I figured out where—it's also in 
>> 95a05b37e8eb2bc0803b1a0298fce6adc60eff16, the commit that I found 
>> originally broke it. But that commit is making a change to it: it 
>> changed the ROUND_UP line from 512 to 4096 (and from the gdb trace, 
>> boundary==4096).
>>
>> I tested changing that line to "bytes = ROUND_UP(bytes, 512);", and 
>> it works. Adds the new disk to the array and produces no warnings or 
>> errors.
>
> I think it is a coincidence that the above change works; commit 4a3d29e
> made the change but it didn't change the logic at all.

Hmm, it seems the bitmap was aligned to 512 bytes in previous mdadm
versions, but with commit 95a05b3 we aligned it to 4k, so the latest mdadm
can't work with arrays created by previous versions.
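
The effect of the alignment change can be illustrated with the arithmetic
of calc_bitmap_size quoted above. A minimal sketch, assuming
sizeof(bitmap_super_t) is 256 bytes and using made-up values for sync_size
(in sectors) and chunksize (in bytes); neither figure comes from the
reporter's array:

```shell
# Shell rendering of the calc_bitmap_size arithmetic.
#   $1 = sync_size in sectors
#   $2 = bitmap chunksize in bytes
#   $3 = alignment boundary in bytes
calc_bitmap_size() {
    bits=$(( $1 / ($2 >> 9) ))      # one bit per bitmap chunk
    bytes=$(( (bits + 7) >> 3 ))    # bits rounded up to whole bytes
    bytes=$(( bytes + 256 ))        # assumed sizeof(bitmap_super_t)
    echo $(( (bytes + $3 - 1) / $3 * $3 ))   # ROUND_UP(bytes, boundary)
}

# Hypothetical array: 1000000 sectors, 64 KiB bitmap chunks.
calc_bitmap_size 1000000 65536 512
calc_bitmap_size 1000000 65536 4096
```

With these inputs the function yields 1536 for the 512-byte boundary and
4096 for the 4096-byte boundary, so the larger boundary means more bytes
are written.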

Does the below change work? Thanks.

diff --git a/super1.c b/super1.c
index 9f62d23..6a0b075 100644
--- a/super1.c
+++ b/super1.c
@@ -2433,7 +2433,10 @@ static int write_bitmap1(struct supertype *st, int fd, enum bitmap_update update
                         memset(buf, 0xff, 4096);
                 memcpy(buf, (char *)bms, sizeof(bitmap_super_t));

-               towrite = calc_bitmap_size(bms, 4096);
+               if (__le32_to_cpu(bms->nodes) == 0)
+                       towrite = calc_bitmap_size(bms, 512);
+               else
+                       towrite = calc_bitmap_size(bms, 4096);
                 while (towrite > 0) {

Regards,
Guoqing


* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Simon Becks @ 2016-09-21 21:30 UTC
  Cc: Linux-RAID
In-Reply-To: <CAJCQCtSE1m4k_zpQxa59fZ8CK6biFSX0AUPkKCTUkNLT5y+MSA@mail.gmail.com>

I tried all possible orders, but in no round did xfs_repair find a
superblock instantly or complete the repair with flying colors.

sda,sdb,sdc
sda,sdc,sdb
sdc,sda,sdb
sdb,sda,sdc
sdc,sdb,sda
sdb,sdc,sda
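
For three disks there are 3! = 6 possible orders, which matches the list
above. A minimal sketch that enumerates them, assuming the device names
from this thread; list_orders is a hypothetical helper that only prints
the orders and does not touch any device:

```shell
# Print every ordering of three of the given device names, one per line.
list_orders() {
    for a in "$@"; do
        for b in "$@"; do
            [ "$b" = "$a" ] && continue
            for c in "$@"; do
                [ "$c" = "$a" ] && continue
                [ "$c" = "$b" ] && continue
                echo "$a,$b,$c"
            done
        done
    done
}

list_orders sda sdb sdc
```

Each printed line is one candidate device order to try.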


I will give photorec a try and go to bed for now :/

2016-09-21 23:07 GMT+02:00 Chris Murphy <lists@colorremedies.com>:
> On Wed, Sep 21, 2016 at 2:41 PM, Simon Becks <beckssimon5@gmail.com> wrote:
>> So the old disk i removed 2 month ago reports
>>
>> /dev/loop1: SGI XFS filesystem data (blksz 4096, inosz 256, v2 dirs)
>>
>> So the filesystem on the raid is/was XFS.  gave xfs_repair a shot but
>> it segfaults:
>>
>> i guess thats good, that it found at least the superblock?
>
> There's more than one and they're spread across the array. So it's
> possible you got the first device order correct, so it finds a
> superblock there, but then when it goes to the next position the drive
> is out of order so it gets confused.
>
> To me this sounds like one drive is in the correct position but the
> two others are reversed. But I'm not an XFS expert you'd have to ask
> on their list.
>
>
>
>>
>> root@grml ~ # xfs_repair /dev/md42
>> Phase 1 - find and verify superblock...
>> bad primary superblock - bad magic number !!!
>>
>> attempting to find secondary superblock...
>> ...........................................
>> found candidate secondary superblock...
>> unable to verify superblock, continuing...
>> found candidate secondary superblock...
>> error reading superblock 22 -- seek to offset 2031216754688 failed
>> unable to verify superblock, continuing...
>> found candidate secondary superblock...
>> unable to verify superblock, continuing...
>> ..found candidate secondary superblock...
>> verified secondary superblock...
>> writing modified primary superblock
>>         - reporting progress in intervals of 15 minutes
>> sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with
>> calculated value 2048
>> resetting superblock root inode pointer to 2048
>> sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent
>
> Those big ones strike me as imaginary numbers.
>
>> with calculated value 2049
>> resetting superblock realtime bitmap ino pointer to 2049
>> sb realtime summary inode 18446744073709551615 (NULLFSINO)
>> inconsistent with calculated value 2050
>> resetting superblock realtime summary ino pointer to 2050
>> Phase 2 - using internal log
>>         - zero log...
>> totally zeroed log
>>         - scan filesystem freespace and inode maps...
>> bad magic number
>> bad magic number
>> bad magic number
>> Metadata corruption detected at block 0x8/0x1000
>> bad magic number
>> Metadata corruption detected at block 0x23d3f408/0x1000
>> bad magic numberbad magic number
>>
>> Metadata corruption detected at block 0x2afe5808/0x1000
>> bad magic number
>> bad magic number
>> bad magic number
>> bad magic number
>> bad magic number
>> bad magic number
>> bad magic number
>> bad magic number
>> bad magic number
>> bad magic number
>> bad magic number
>> bad magic number
>> bad magic number
>> Metadata corruption detected at block 0x10/0x1000
>> Metadata corruption detected at block 0xe54c808/0x1000
>> bad magic # 0x494e81f6 for agf 0
>> bad version # 16908289 for agf 0
>> bad sequence # 99 for agf 0
>> bad length 99 for agf 0, should be 15027328
>> flfirst 1301384768 in agf 0 too large (max = 1024)
>> bad magic # 0x494e81f6 for agi 0
>> bad version # 16908289 for agi 0
>> bad sequence # 99 for agi 0
>> bad length # 99 for agi 0, should be 15027328
>> reset bad agf for ag 0
>> reset bad agi for ag 0
>> Metadata corruption detected at block 0xd6f7b808/0x1000
>> Metadata corruption detected at block 0x2afe5810/0x1000
>> bad on-disk superblock 6 - bad magic number
>> primary/secondary superblock 6 conflict - AG superblock geometry info
>> conflicts with filesystem geometry
>> zeroing unused portion of secondary superblock (AG #6)
>> [1]    23110 segmentation fault  xfs_repair /dev/md42
>> xfs_repair /dev/md42
>>
>>
>>
>> 2016-09-21 21:50 GMT+02:00 Simon Becks <beckssimon5@gmail.com>:
>>> Thank you. I already learned a lot. Your command only shows data for
>>> all of the 3 disks.
>>>
>>> Out of curiosity i used strings /dev/loop42 | grep mp3 and many of my
>>> songs showed up - is that a good sign?
>>>
>>> Just tried the 5 orders like a,b,c a,c,b and so on and receive the
>>> same output about mount: wrong fs type, bad option, bad superblock on
>>> /dev/md42 and fsck.ext2: Superblock invalid, trying backup blocks....
>>>
>>> Then used photorec in all 5 combinations of disks for several minutes
>>> without a single file found.
>>>
>>> Is it possible that i have to keep something else in mind, while
>>> assembling the raid? I expected at least some files with photorec when
>>> the raid was assembled in the right order.
>>>
>>>
>>> 2016-09-21 21:00 GMT+02:00 Andreas Klauer <Andreas.Klauer@metamorpher.de>:
>>>> On Wed, Sep 21, 2016 at 08:31:23PM +0200, Simon Becks wrote:
>>>>> Maybe i just assembled it in the wrong order?
>>>>
>>>> Yes, or maybe the superblock was overwritten by XFS after all.
>>>>
>>>> You could check what's at offset 1M for each disk.
>>>>
>>>> losetup --find --show --read-only --offset=$((2048*512)) /the/disk
>>>> file -s /dev/loop42
>>>>
>>>> If the superblock was still intact it should say ext4 or whatever
>>>> your filesystem was for at least one of them.
>>>>
>>>> You can also try this for the disk you removed 2 month ago.
>>>>
>>>> If that is not the case and fsck with backup superblock also
>>>> is not successful then you'll have to see if you find anything
>>>> valid in the raw data.
>>>>
>>>> Regards
>>>> Andreas Klauer
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
>
> --
> Chris Murphy


* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Chris Murphy @ 2016-09-21 21:07 UTC (permalink / raw)
  To: Simon Becks; +Cc: Linux-RAID
In-Reply-To: <CAKA=zHoKE69Q01oEVOCpKWzp6gQ7zO8rvY=NzuD0YoqLJdgTTQ@mail.gmail.com>

On Wed, Sep 21, 2016 at 2:41 PM, Simon Becks <beckssimon5@gmail.com> wrote:
> So the old disk i removed 2 month ago reports
>
> /dev/loop1: SGI XFS filesystem data (blksz 4096, inosz 256, v2 dirs)
>
> So the filesystem on the raid is/was XFS.  gave xfs_repair a shot but
> it segfaults:
>
> i guess thats good, that it found at least the superblock?

There's more than one and they're spread across the array. So it's
possible you got the first device order correct, so it finds a
superblock there, but then when it goes to the next position the drive
is out of order so it gets confused.

To me this sounds like one drive is in the correct position but the
other two are reversed. But I'm not an XFS expert; you'd have to ask
on their list.
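
To make the "spread across the array" point concrete, here is a rough
sketch of where each allocation group (and so each secondary superblock)
would start. It assumes the per-AG size is the 15027328 blocks that
xfs_repair says AG 0 should have, and the 4096-byte block size reported
for the old disk; the function name is made up for illustration.

```python
BLOCK_SIZE = 4096      # "blksz 4096" from the file -s output on the old disk
AG_BLOCKS = 15027328   # "should be 15027328" from the xfs_repair log (assumed per-AG size)


def ag_start_offset(agno, ag_blocks=AG_BLOCKS, block_size=BLOCK_SIZE):
    """Byte offset on the array where allocation group `agno` begins."""
    return agno * ag_blocks * block_size


# Each AG begins with its own copy of the superblock.
for agno in (0, 1, 6):
    print(agno, ag_start_offset(agno))
```

With a 512K chunk, these offsets land on different member disks, so a
wrong device order can leave some copies readable and others not.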



>
> root@grml ~ # xfs_repair /dev/md42
> Phase 1 - find and verify superblock...
> bad primary superblock - bad magic number !!!
>
> attempting to find secondary superblock...
> ...........................................
> found candidate secondary superblock...
> unable to verify superblock, continuing...
> found candidate secondary superblock...
> error reading superblock 22 -- seek to offset 2031216754688 failed
> unable to verify superblock, continuing...
> found candidate secondary superblock...
> unable to verify superblock, continuing...
> ..found candidate secondary superblock...
> verified secondary superblock...
> writing modified primary superblock
>         - reporting progress in intervals of 15 minutes
> sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with
> calculated value 2048
> resetting superblock root inode pointer to 2048
> sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent

Those big ones strike me as imaginary numbers.
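
For what it's worth, that value is not random: it is the largest
unsigned 64-bit integer, which is what an "unset" (NULLFSINO) inode
field looks like. A quick check, with a helper name invented for this
example:

```python
NULLFSINO = 18446744073709551615  # value printed by xfs_repair above


def is_all_ones_u64(value):
    """True if `value` is the largest unsigned 64-bit integer (all 64 bits set)."""
    return value == (1 << 64) - 1


print(is_all_ones_u64(NULLFSINO))  # True
print(hex(NULLFSINO))              # 0xffffffffffffffff
```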

> with calculated value 2049
> resetting superblock realtime bitmap ino pointer to 2049
> sb realtime summary inode 18446744073709551615 (NULLFSINO)
> inconsistent with calculated value 2050
> resetting superblock realtime summary ino pointer to 2050
> Phase 2 - using internal log
>         - zero log...
> totally zeroed log
>         - scan filesystem freespace and inode maps...
> bad magic number
> bad magic number
> bad magic number
> Metadata corruption detected at block 0x8/0x1000
> bad magic number
> Metadata corruption detected at block 0x23d3f408/0x1000
> bad magic numberbad magic number
>
> Metadata corruption detected at block 0x2afe5808/0x1000
> bad magic number
> bad magic number
> bad magic number
> bad magic number
> bad magic number
> bad magic number
> bad magic number
> bad magic number
> bad magic number
> bad magic number
> bad magic number
> bad magic number
> bad magic number
> Metadata corruption detected at block 0x10/0x1000
> Metadata corruption detected at block 0xe54c808/0x1000
> bad magic # 0x494e81f6 for agf 0
> bad version # 16908289 for agf 0
> bad sequence # 99 for agf 0
> bad length 99 for agf 0, should be 15027328
> flfirst 1301384768 in agf 0 too large (max = 1024)
> bad magic # 0x494e81f6 for agi 0
> bad version # 16908289 for agi 0
> bad sequence # 99 for agi 0
> bad length # 99 for agi 0, should be 15027328
> reset bad agf for ag 0
> reset bad agi for ag 0
> Metadata corruption detected at block 0xd6f7b808/0x1000
> Metadata corruption detected at block 0x2afe5810/0x1000
> bad on-disk superblock 6 - bad magic number
> primary/secondary superblock 6 conflict - AG superblock geometry info
> conflicts with filesystem geometry
> zeroing unused portion of secondary superblock (AG #6)
> [1]    23110 segmentation fault  xfs_repair /dev/md42
> xfs_repair /dev/md42
>
>
>
> 2016-09-21 21:50 GMT+02:00 Simon Becks <beckssimon5@gmail.com>:
>> Thank you. I already learned a lot. Your command only shows data for
>> all of the 3 disks.
>>
>> Out of curiosity i used strings /dev/loop42 | grep mp3 and many of my
>> songs showed up - is that a good sign?
>>
>> Just tried the 5 orders like a,b,c a,c,b and so on and receive the
>> same output about mount: wrong fs type, bad option, bad superblock on
>> /dev/md42 and fsck.ext2: Superblock invalid, trying backup blocks....
>>
>> Then used photorec in all 5 combinations of disks for several minutes
>> without a single file found.
>>
>> Is it possible that i have to keep something else in mind, while
>> assembling the raid? I expected at least some files with photorec when
>> the raid was assembled in the right order.
>>
>>
>> 2016-09-21 21:00 GMT+02:00 Andreas Klauer <Andreas.Klauer@metamorpher.de>:
>>> On Wed, Sep 21, 2016 at 08:31:23PM +0200, Simon Becks wrote:
>>>> Maybe i just assembled it in the wrong order?
>>>
>>> Yes, or maybe the superblock was overwritten by XFS after all.
>>>
>>> You could check what's at offset 1M for each disk.
>>>
>>> losetup --find --show --read-only --offset=$((2048*512)) /the/disk
>>> file -s /dev/loop42
>>>
>>> If the superblock was still intact it should say ext4 or whatever
>>> your filesystem was for at least one of them.
>>>
>>> You can also try this for the disk you removed 2 month ago.
>>>
>>> If that is not the case and fsck with backup superblock also
>>> is not successful then you'll have to see if you find anything
>>> valid in the raw data.
>>>
>>> Regards
>>> Andreas Klauer



-- 
Chris Murphy


* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Andreas Klauer @ 2016-09-21 20:56 UTC (permalink / raw)
  To: Simon Becks; +Cc: Linux-RAID
In-Reply-To: <CAKA=zHoKE69Q01oEVOCpKWzp6gQ7zO8rvY=NzuD0YoqLJdgTTQ@mail.gmail.com>

On Wed, Sep 21, 2016 at 10:41:32PM +0200, Simon Becks wrote:
> So the filesystem on the raid is/was XFS.  gave xfs_repair a shot but
> it segfaults:
> 
> i guess thats good, that it found at least the superblock?

Although I use XFS myself, I'm not too familiar with its repair facilities.
If you're *sure* the RAID is set up correctly (in which case photorec
should be able to find something valid that is larger than 512K*3)
and you still have XFS repair issues, maybe the XFS list can help there.

Regards
Andreas Klauer


* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Andreas Klauer @ 2016-09-21 20:53 UTC (permalink / raw)
  To: Simon Becks; +Cc: Linux-RAID
In-Reply-To: <CAKA=zHryc+hqie_VZDsU4HGQP9Cw7Oo_J3S4mjJHyZbGX1K1PA@mail.gmail.com>

On Wed, Sep 21, 2016 at 09:50:45PM +0200, Simon Becks wrote:
> Thank you. I already learned a lot. Your command only shows data for
> all of the 3 disks.

And for the removed disk?

photorec is slow, and I'm not sure how long you let it run;
also, we don't know the extent of the damage.

You could search for known filetype headers on your own.
Megapixel JPEGs, or multi-megabyte MP3 tracks, might work.

If you find such a file header at $offset of one disk,
you could look at the same $offset of the other disks,
and see if you can deduce the disk order, raid layout,
et cetera from this. So for example if you were using
the wrong chunk size you should be able to tell from that...

But this is tedious work; usually it's easier to use trial and error
until you come up with the right setting that just works.
Or that would work if it wasn't overwritten... :-/
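
If you want to search for headers yourself, a minimal sketch could look
like the one below. It only knows two signatures (JPEG and ID3-tagged
MP3), checks sector-aligned offsets only, and works on a buffer you have
already read from the disk; the names are made up for this example.

```python
# Well-known file signatures; extend as needed.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"ID3": "mp3 (ID3 tag)",
}


def find_signatures(data, step=512):
    """Return (offset, kind) for each known header found at a step-aligned offset."""
    hits = []
    for offset in range(0, len(data), step):
        for magic, kind in SIGNATURES.items():
            if data.startswith(magic, offset):
                hits.append((offset, kind))
    return hits


# Synthetic buffer standing in for a piece of a disk.
sample = bytearray(2048)
sample[512:515] = b"\xff\xd8\xff"
sample[1536:1539] = b"ID3"
print(find_signatures(bytes(sample)))
```

Once you have a hit on one disk, dump the same offset on the other two
and compare what follows the header.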

Regards
Andreas Klauer


* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Simon Becks @ 2016-09-21 20:41 UTC (permalink / raw)
  To: Simon Becks; +Cc: Linux-RAID
In-Reply-To: <CAKA=zHryc+hqie_VZDsU4HGQP9Cw7Oo_J3S4mjJHyZbGX1K1PA@mail.gmail.com>

So the old disk I removed 2 months ago reports

/dev/loop1: SGI XFS filesystem data (blksz 4096, inosz 256, v2 dirs)

So the filesystem on the raid is/was XFS. I gave xfs_repair a shot but
it segfaults:

I guess it's good that it at least found the superblock?

root@grml ~ # xfs_repair /dev/md42
Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!

attempting to find secondary superblock...
...........................................
found candidate secondary superblock...
unable to verify superblock, continuing...
found candidate secondary superblock...
error reading superblock 22 -- seek to offset 2031216754688 failed
unable to verify superblock, continuing...
found candidate secondary superblock...
unable to verify superblock, continuing...
..found candidate secondary superblock...
verified secondary superblock...
writing modified primary superblock
        - reporting progress in intervals of 15 minutes
sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with
calculated value 2048
resetting superblock root inode pointer to 2048
sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent
with calculated value 2049
resetting superblock realtime bitmap ino pointer to 2049
sb realtime summary inode 18446744073709551615 (NULLFSINO)
inconsistent with calculated value 2050
resetting superblock realtime summary ino pointer to 2050
Phase 2 - using internal log
        - zero log...
totally zeroed log
        - scan filesystem freespace and inode maps...
bad magic number
bad magic number
bad magic number
Metadata corruption detected at block 0x8/0x1000
bad magic number
Metadata corruption detected at block 0x23d3f408/0x1000
bad magic numberbad magic number

Metadata corruption detected at block 0x2afe5808/0x1000
bad magic number
bad magic number
bad magic number
bad magic number
bad magic number
bad magic number
bad magic number
bad magic number
bad magic number
bad magic number
bad magic number
bad magic number
bad magic number
Metadata corruption detected at block 0x10/0x1000
Metadata corruption detected at block 0xe54c808/0x1000
bad magic # 0x494e81f6 for agf 0
bad version # 16908289 for agf 0
bad sequence # 99 for agf 0
bad length 99 for agf 0, should be 15027328
flfirst 1301384768 in agf 0 too large (max = 1024)
bad magic # 0x494e81f6 for agi 0
bad version # 16908289 for agi 0
bad sequence # 99 for agi 0
bad length # 99 for agi 0, should be 15027328
reset bad agf for ag 0
reset bad agi for ag 0
Metadata corruption detected at block 0xd6f7b808/0x1000
Metadata corruption detected at block 0x2afe5810/0x1000
bad on-disk superblock 6 - bad magic number
primary/secondary superblock 6 conflict - AG superblock geometry info
conflicts with filesystem geometry
zeroing unused portion of secondary superblock (AG #6)
[1]    23110 segmentation fault  xfs_repair /dev/md42
xfs_repair /dev/md42



2016-09-21 21:50 GMT+02:00 Simon Becks <beckssimon5@gmail.com>:
> Thank you. I already learned a lot. Your command only shows data for
> all of the 3 disks.
>
> Out of curiosity i used strings /dev/loop42 | grep mp3 and many of my
> songs showed up - is that a good sign?
>
> Just tried the 5 orders like a,b,c a,c,b and so on and receive the
> same output about mount: wrong fs type, bad option, bad superblock on
> /dev/md42 and fsck.ext2: Superblock invalid, trying backup blocks....
>
> Then used photorec in all 5 combinations of disks for several minutes
> without a single file found.
>
> Is it possible that i have to keep something else in mind, while
> assembling the raid? I expected at least some files with photorec when
> the raid was assembled in the right order.
>
>
> 2016-09-21 21:00 GMT+02:00 Andreas Klauer <Andreas.Klauer@metamorpher.de>:
>> On Wed, Sep 21, 2016 at 08:31:23PM +0200, Simon Becks wrote:
>>> Maybe i just assembled it in the wrong order?
>>
>> Yes, or maybe the superblock was overwritten by XFS after all.
>>
>> You could check what's at offset 1M for each disk.
>>
>> losetup --find --show --read-only --offset=$((2048*512)) /the/disk
>> file -s /dev/loop42
>>
>> If the superblock was still intact it should say ext4 or whatever
>> your filesystem was for at least one of them.
>>
>> You can also try this for the disk you removed 2 month ago.
>>
>> If that is not the case and fsck with backup superblock also
>> is not successful then you'll have to see if you find anything
>> valid in the raw data.
>>
>> Regards
>> Andreas Klauer


* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Simon Becks @ 2016-09-21 19:50 UTC (permalink / raw)
  Cc: Linux-RAID
In-Reply-To: <20160921190035.GA6949@metamorpher.de>

Thank you. I have already learned a lot. Your command only shows "data"
for all 3 of the disks.

Out of curiosity I ran strings /dev/loop42 | grep mp3 and many of my
songs showed up - is that a good sign?

I just tried the 5 orders like a,b,c a,c,b and so on and got the
same output: mount reports wrong fs type, bad option, bad superblock on
/dev/md42, and fsck.ext2 reports Superblock invalid, trying backup blocks....

Then I ran photorec on all 5 combinations of disks for several minutes
without a single file being found.

Is it possible that I have to keep something else in mind while
assembling the raid? I expected photorec to find at least some files when
the raid was assembled in the right order.
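
To be sure no order gets skipped, the candidate orders can be listed
mechanically. This is only a sketch; it uses the overlay device names
from my mdadm --create attempt, and the function name is made up. Three
devices give six orders in total, i.e. five besides the first one.

```python
from itertools import permutations


def candidate_orders(devices):
    """Return every possible ordering of the member devices."""
    return [list(order) for order in permutations(devices)]


orders = candidate_orders(
    ["/dev/mapper/sda6", "/dev/mapper/sdb6", "/dev/mapper/sdc6"]
)
for order in orders:
    print(" ".join(order))
print(len(orders))  # 6
```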

2016-09-21 21:00 GMT+02:00 Andreas Klauer <Andreas.Klauer@metamorpher.de>:
> On Wed, Sep 21, 2016 at 08:31:23PM +0200, Simon Becks wrote:
>> Maybe i just assembled it in the wrong order?
>
> Yes, or maybe the superblock was overwritten by XFS after all.
>
> You could check what's at offset 1M for each disk.
>
> losetup --find --show --read-only --offset=$((2048*512)) /the/disk
> file -s /dev/loop42
>
> If the superblock was still intact it should say ext4 or whatever
> your filesystem was for at least one of them.
>
> You can also try this for the disk you removed 2 month ago.
>
> If that is not the case and fsck with backup superblock also
> is not successful then you'll have to see if you find anything
> valid in the raw data.
>
> Regards
> Andreas Klauer


* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Andreas Klauer @ 2016-09-21 19:00 UTC (permalink / raw)
  To: Simon Becks; +Cc: Linux-RAID
In-Reply-To: <CAKA=zHoJu2vTXQa4yvbJDA2nmQwXLHyw2DfUJ+cVgTDr_gu2XQ@mail.gmail.com>

On Wed, Sep 21, 2016 at 08:31:23PM +0200, Simon Becks wrote:
> Maybe i just assembled it in the wrong order?

Yes, or maybe the superblock was overwritten by XFS after all.

You could check what's at offset 1M for each disk.

losetup --find --show --read-only --offset=$((2048*512)) /the/disk
file -s /dev/loop42

If the superblock was still intact it should say ext4 or whatever 
your filesystem was for at least one of them.

You can also try this for the disk you removed 2 months ago.

If that is not the case and fsck with backup superblock also 
is not successful then you'll have to see if you find anything 
valid in the raw data.
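
If file -s is not at hand, the same check can be sketched by hand. The
snippet below only knows two magic numbers (XFS "XFSB" at byte 0, ext2/3/4
0xEF53 stored little-endian at byte 1080) and is no replacement for
file; the function names and the path you pass in are placeholders.

```python
DATA_OFFSET = 2048 * 512  # 1 MiB, same as the losetup --offset above


def identify_fs(buf):
    """Guess the filesystem from the first 2 KiB found at the data offset."""
    if buf[0:4] == b"XFSB":
        return "xfs"
    if buf[1080:1082] == b"\x53\xef":
        return "ext2/3/4"
    return "unknown"


def probe(path):
    """Read 2 KiB at the 1 MiB data offset of `path` (opened read-only) and classify it."""
    with open(path, "rb") as disk:
        disk.seek(DATA_OFFSET)
        return identify_fs(disk.read(2048))


print(DATA_OFFSET)  # 1048576
```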

Regards
Andreas Klauer


* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Simon Becks @ 2016-09-21 18:31 UTC (permalink / raw)
  To: Andreas Klauer; +Cc: Linux-RAID
In-Reply-To: <20160921180318.GA6633@metamorpher.de>

Yes.

root@grml ~ # mdadm --create /dev/md42 --metadata=1.2 --data-offset=1M
--chunk=512 --level=5 --assume-clean --raid-devices 3 /dev/mapper/sda6
/dev/mapper/sdb6 /dev/mapper/sdc6
mdadm: array /dev/md42 started.

root@grml ~ # cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md42 : active raid5 dm-2[2] dm-1[1] dm-0[0]
      1923496960 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 8/8 pages [32KB], 65536KB chunk

[ 1686.015891] md: bind<dm-0>
[ 1686.015937] md: bind<dm-1>
[ 1686.015977] md: bind<dm-2>
[ 1686.099569] raid6: sse2x1    6749 MB/s
[ 1686.167598] raid6: sse2x2    3460 MB/s
[ 1686.235644] raid6: sse2x4    3827 MB/s
[ 1686.235647] raid6: using algorithm sse2x1 (6749 MB/s)
[ 1686.235649] raid6: using ssse3x2 recovery algorithm
[ 1686.240790] async_tx: api initialized (async)
[ 1686.245577] xor: measuring software checksum speed
[ 1686.283657]    prefetch64-sse: 13934.000 MB/sec
[ 1686.323682]    generic_sse: 12289.000 MB/sec
[ 1686.323683] xor: using function: prefetch64-sse (13934.000 MB/sec)
[ 1686.331416] md: raid6 personality registered for level 6
[ 1686.331419] md: raid5 personality registered for level 5
[ 1686.331420] md: raid4 personality registered for level 4
[ 1686.331676] md/raid:md42: device dm-2 operational as raid disk 2
[ 1686.331678] md/raid:md42: device dm-1 operational as raid disk 1
[ 1686.331679] md/raid:md42: device dm-0 operational as raid disk 0
[ 1686.331903] md/raid:md42: allocated 0kB
[ 1686.331926] md/raid:md42: raid level 5 active with 3 out of 3 devices, algorithm 2
[ 1686.331927] RAID conf printout:
[ 1686.331927]  --- level:5 rd:3 wd:3
[ 1686.331928]  disk 0, o:1, dev:dm-0
[ 1686.331929]  disk 1, o:1, dev:dm-1
[ 1686.331929]  disk 2, o:1, dev:dm-2
[ 1686.331966] created bitmap (8 pages) for device md42
[ 1686.332394] md42: bitmap initialized from disk: read 1 pages, set 14676 of 14676 bits
[ 1686.332435] md42: detected capacity change from 0 to 1969660887040
[ 1686.332457] md: md42 switched to read-write mode.
[ 1686.334058]  md42: unknown partition table



root@grml ~ # mount /dev/md42 /te
mount: /dev/md42 is write-protected, mounting read-only
mount: wrong fs type, bad option, bad superblock on /dev/md42,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

root@grml ~ # fsck -n /dev/md42
fsck from util-linux 2.25.2
e2fsck 1.42.12 (29-Aug-2014)
ext2fs_open2: Bad magic number in super-block
fsck.ext2: Superblock invalid, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /dev/md42

The superblock could not be read or does not describe a valid ext2/ext3/ext4
filesystem.  If the device is valid and it really contains an ext2/ext3/ext4
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>
 or
    e2fsck -b 32768 <device>


Maybe I just assembled it in the wrong order?

2016-09-21 20:03 GMT+02:00 Andreas Klauer <Andreas.Klauer@metamorpher.de>:
> On Wed, Sep 21, 2016 at 07:23:42PM +0200, Simon Becks wrote:
>> But this disk was not in the raid for almost 2 month.
>
> ?
>
> I'm not suggesting to use this disk. Well, not yet anyway.
> It might be an option if everything else fails...
>
> You posted this output assuming that the other disks were set up the same way, yes?
> In that case overlay + mdadm --create (with the settings you showed) is what you do.
>
> Regards
> Andreas Klauer


* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Andreas Klauer @ 2016-09-21 18:03 UTC (permalink / raw)
  To: Simon Becks; +Cc: Linux-RAID
In-Reply-To: <CAKA=zHq7OgZ+oVAqn7zVmBM0gSQedNk2kZ3dkuBG7V89G-G3Rg@mail.gmail.com>

On Wed, Sep 21, 2016 at 07:23:42PM +0200, Simon Becks wrote:
> But this disk was not in the raid for almost 2 month.

?

I'm not suggesting you use this disk. Well, not yet anyway.
It might be an option if everything else fails...

You posted this output assuming that the other disks were set up the same way, yes?
In that case overlay + mdadm --create (with the settings you showed) is what you do.

Regards
Andreas Klauer


* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Simon Becks @ 2016-09-21 17:23 UTC (permalink / raw)
  To: Andreas Klauer; +Cc: Linux-RAID
In-Reply-To: <20160921171511.GA6381@metamorpher.de>

Thank you Andreas. But this disk has not been in the raid for almost 2
months. I replaced it with one of the 3 disks I have in the raid now.
This old disk is too far out of date compared to the raid I was using
heavily until yesterday, when I started to do stupid things :/

2016-09-21 19:15 GMT+02:00 Andreas Klauer <Andreas.Klauer@metamorpher.de>:
> On Wed, Sep 21, 2016 at 06:56:43PM +0200, Simon Becks wrote:
>>     Data Offset : 2048 sectors
>
> Your RAID disks have a 1 MiB data offset there.
> If nothing was ever written on the XFS filesystem,
> it might still be intact.
>
> If you have an overlay for these disks you can try your luck
> with mdadm --create /dev/md42 --metadata=1.2 --data-offset=1M --chunk=512 --level=5 ...
> --assume-clean /dev/overlay/disk{1,2,3}
>
> Check --examine if it says the same things for device size.
> You already verified partitions should be the same but still...
>
> Then see if there is data on it or no (even if there is no filesystem,
> try a carver like photorec, if it gets an intact file larger than 3*512K
> the raid settings might be correct).
>
> photorec won't work if the data was encrypted.
>
> If there is no data, try a different disk order if you're not sure you got it right.
>
> Good luck
> Andreas Klauer


* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Andreas Klauer @ 2016-09-21 17:15 UTC (permalink / raw)
  To: Simon Becks; +Cc: Linux-RAID
In-Reply-To: <CAKA=zHpgVoNA5Y6MBmB4fnG85F-z=hj4FXEiUOggsWUn3tfWXg@mail.gmail.com>

On Wed, Sep 21, 2016 at 06:56:43PM +0200, Simon Becks wrote:
>     Data Offset : 2048 sectors

Your RAID disks have a 1 MiB data offset there. 
If nothing was ever written on the XFS filesystem, 
it might still be intact.

If you have an overlay for these disks you can try your luck 
with mdadm --create /dev/md42 --metadata=1.2 --data-offset=1M --chunk=512 --level=5 ...
--assume-clean /dev/overlay/disk{1,2,3}

Check with --examine that it reports the same device size.
You already verified the partitions should be the same, but still...

Then see if there is data on it or not. Even if there is no filesystem,
try a carver like photorec; if it gets an intact file larger than 3*512K,
the raid settings might be correct.

photorec won't work if the data was encrypted.

If there is no data, try a different disk order if you're not sure you got it right.
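
On the 3*512K rule of thumb: with the left-symmetric layout from your
--examine output, consecutive 512K data chunks rotate across all three
disks, so an intact file that long has pieces from every disk. A rough
sketch of the mapping as I understand the md left-symmetric algorithm
(the function name is made up, treat it as an illustration only):

```python
def locate_chunk(chunk_no, raid_disks=3):
    """Map a logical data chunk to (stripe, data disk, parity disk), left-symmetric."""
    data_disks = raid_disks - 1
    stripe, index_in_stripe = divmod(chunk_no, data_disks)
    # Parity starts on the last disk and moves one disk to the left per stripe.
    parity_disk = data_disks - (stripe % raid_disks)
    # Data chunks follow the parity disk, wrapping around.
    data_disk = (parity_disk + 1 + index_in_stripe) % raid_disks
    return stripe, data_disk, parity_disk


for chunk in range(6):
    print(chunk, locate_chunk(chunk))
```

Three consecutive chunks end up on disks 0, 1 and 2, which is why a
wrong order or chunk size breaks any file of that size.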

Good luck
Andreas Klauer


* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Simon Becks @ 2016-09-21 16:56 UTC (permalink / raw)
  Cc: Linux-RAID
In-Reply-To: <CAJCQCtR4NoCebcDLD-ovT+hZY=NKRtj1cP693wHQTbmiERDAXw@mail.gmail.com>

Thank you. Two months ago I replaced one disk of the 3-disk raid 5, and I
just collected some information from it:

/dev/sde6:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 342ec726:3804270d:5917dd5f:c24883a9
           Name : TS-XLB6C:2
  Creation Time : Fri Dec 23 17:58:59 2011
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1923497952 (917.20 GiB 984.83 GB)
     Array Size : 1923496960 (1834.39 GiB 1969.66 GB)
  Used Dev Size : 1923496960 (917.19 GiB 984.83 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=992 sectors
          State : active
    Device UUID : d27a69d0:456f3704:8e17ac75:78939886

Internal Bitmap : 8 sectors from superblock
    Update Time : Wed Jul 27 19:08:08 2016
       Checksum : de9dbd10 - correct
         Events : 11543

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)


So it was a 3-disk raid5 with 512K chunk size. Anyway, this disk should
be of no help as the "offset" is too big, but it was of help to see what
the geometry is.
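
As a sanity check on those numbers, the sizes are consistent with each
other. A small sketch, assuming "Used Dev Size" is in 512-byte sectors
and "Array Size" is in KiB (the function name is made up):

```python
SECTOR = 512
USED_DEV_SECTORS = 1923496960  # "Used Dev Size" from --examine
RAID_DEVICES = 3               # "Raid Devices" from --examine


def raid5_capacity_bytes(used_dev_sectors, raid_devices, sector=SECTOR):
    """Usable RAID5 capacity: one member's worth of space goes to parity."""
    return used_dev_sectors * sector * (raid_devices - 1)


capacity = raid5_capacity_bytes(USED_DEV_SECTORS, RAID_DEVICES)
print(capacity)          # 1969660887040 bytes, i.e. 1969.66 GB
print(capacity // 1024)  # 1923496960 KiB, the "Array Size" line
```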


Using fdisk on that "old" disk also confirms that the disk layout is
identical to that of the 3 disks I had in place:

Disk /dev/sde: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: B84660EA-8691-4C7C-B914-DA2769BDD6D7

Device        Start        End    Sectors   Size Type
/dev/sde1      2048    2002943    2000896   977M Microsoft basic data
/dev/sde2   2002944   12003327   10000384   4.8G Microsoft basic data
/dev/sde3  12003328   12005375       2048     1M Microsoft basic data
/dev/sde4  12005376   12007423       2048     1M Microsoft basic data
/dev/sde5  12007424   14008319    2000896   977M Microsoft basic data
/dev/sde6  14008320 1937508319 1923500000 917.2G Microsoft basic data

So nothing has changed in the partition sizes.

So I will now only work with overlays, as I do not have enough space
available to copy all disks.

What would be the next steps? Just create a new raid 5 array with sd[a-c]6?

Thank you.

Simon

2016-09-21 17:38 GMT+02:00 Chris Murphy <lists@colorremedies.com>:
> On Wed, Sep 21, 2016 at 4:39 AM, Simon Becks <beckssimon5@gmail.com> wrote:
>> Dear Developers & Gurus & Gods,
>>
>> i had a 3 disk software raid 5 (mdadm) on a buffalo terrastation. By
>> accident I reset the raid and the NAS put on a xfs filesystem on each
>> of the 3 partitions.
>>
>> sda6 sdb6 and sdc6 have been the raid5 member partitions.
>>
>> Now sda6 sdb6 and sdc6 only contain a xfs filesystem with some empty
>> default folder structure - my NAS created during the "reset".
>
> OK that doesn't really make sense, if it's going to do a reset I'd
> expect it to format all the other partitions and even repartition
> those drives. Why does it format just the sixth partition on these
> three drives? The point is, you need to make sure this "new" sda6 is
> really exactly the same as the old one. As in, the same start and end
> sector values. If this is a different partition scheme on any of the
> drives, you have to fix that problem first.
>
>
>>
>> mdadm --examine /dev/sda6
>> mdadm: No md superblock detected on /dev/sda6
>> mdadm --examine /dev/sdb6
>> mdadm: No md superblock detected on /dev/sdb6
>> mdadm --examine /dev/sdc6
>> mdadm: No md superblock detected on /dev/sdc6
>
> mkfs.xfs doesn't write that much metadata but it does write a lot of
> zeros, about 60MB of writes per mkfs depending on how many AG's are
> created. So no matter what the resulting array is going to have about
> 200MB of data loss spread around. It's hard to say what that will have
> stepped on, if you're lucky it'll be only data. If you're unluckly it
> will have hit the file system in a way that'll make it difficult to
> extract your data.
>
> So no matter what, this is now a scraping operation. And that will
> take some iteration. Invariably it will take less time to just create
> a new array and restore from backup. If you don't have a backup for
> this data, then it's not important data. Either way, it's not worth
> your time.
>
> However...
>
> To make it possible to iterate mdadm metadata version, and chunk size,
> and device order without more damage, you need to work on files of
> these partitions, or use an overlay.
> https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID#Making_the_harddisks_read-only_using_an_overlay_file
>
> The overlay file option is better because you can iterate and throw
> them away and quickly start again. If you dd the partitions to files,
> and you directly change those files, to start over you have to dd
> those partitions yet again. So you're probably going to want the
> overlay option no matter what.
>
> Each iteration will produce an assembled array, but only one
> combination will produce an array that's your array. And even that
> array might not be mountable due to mkfs damage. So you'll need some
> tests to find out whether you have a file system on that array or if
> it's just garbage.  fsck -n *might* recognize the filesystem even if
> it's badly damaged, and tell you how badly damaged it is, without
> trying to fix it. You're almost certainly best off not fixing it for
> starters, and just mounting it read only and getting off as much data
> as you can elsewhere. i.e. making the backup you should already have.
>
>
>
>
>
>>
>> Disk /dev/sdc: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
>> Units: sectors of 1 * 512 = 512 bytes
>> Sector size (logical/physical): 512 bytes / 4096 bytes
>> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
>> Disklabel type: gpt
>> Disk identifier: FAB30D96-11C4-477E-ADAA-9448A087E124
>>
>> Device        Start        End    Sectors   Size Type
>> /dev/sdc1      2048    2002943    2000896   977M Microsoft basic data
>> /dev/sdc2   2002944   12003327   10000384   4.8G Microsoft basic data
>> /dev/sdc3  12003328   12005375       2048     1M Microsoft basic data
>> /dev/sdc4  12005376   12007423       2048     1M Microsoft basic data
>> /dev/sdc5  12007424   14008319    2000896   977M Microsoft basic data
>> /dev/sdc6  14008320 1937508319 1923500000 917.2G Microsoft basic data
>>
>> XFS-Log attached for reference.
>>
>> Am I screwed or is there a chance to recreate the raid with the 3
>> disks end up with the raid and the filesystem i had before?
>
> It's pretty unlikely you'll totally avoid data loss, it's just matter
> of what damage has happened and that's not knowable in advance. You'll
> just have to try it out.
>
> If the file system can't be mounted ro; if the fsck can't make it
> "good enough" to mount ro; then you will want to take a look at
> testdisk which can scrape the array (not the individual drives) for
> file signatures. So long as the files are in contiguous blocks, it can
> enable you to scrape off things like photos and documents. Smaller
> files tend to recover better than big files.
>
> If that also fails, well yeah you can try test disk pointed at
> individual partitions or the whole drives and see what it finds. If
> the chunk size is 512KiB that sorta improves the chances you'll get
> some files but they'll definitely only be small files that'll be
> recognized. Any file broken up by raid striping will be on some other
> drive. So it's a huge jigsaw puzzle, which is why raid is not a
> backup, etc.
>
>
> --
> Chris Murphy

^ permalink raw reply

* Re: [PATCH] raid5: fix to detect failure of register_shrinker
From: Shaohua Li @ 2016-09-21 16:01 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-raid, linux-kernel, chao
In-Reply-To: <20160920023357.872-1-yuchao0@huawei.com>

On Tue, Sep 20, 2016 at 10:33:57AM +0800, Chao Yu wrote:
> register_shrinker can fail after commit 1d3d4437eae1 ("vmscan: per-node
> deferred work"), we should detect the failure of it, otherwise we may
> fail to register shrinker after raid5 configuration was setup successfully.
> 
> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> ---
>  drivers/md/raid5.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 766c3b7..b819a9a 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -6632,7 +6632,12 @@ static struct r5conf *setup_conf(struct mddev *mddev)
>  	conf->shrinker.count_objects = raid5_cache_count;
>  	conf->shrinker.batch = 128;
>  	conf->shrinker.flags = 0;
> -	register_shrinker(&conf->shrinker);
> +	if (register_shrinker(&conf->shrinker)) {
> +		printk(KERN_ERR
> +		       "md/raid:%s: couldn't register shrinker.\n",
> +		       mdname(mddev));
> +		goto abort;
> +	}
>  
>  	sprintf(pers_name, "raid%d", mddev->new_level);
>  	conf->thread = md_register_thread(raid5d, mddev, pers_name);

A shrinker failure isn't fatal for raid5. Ideally we could ignore the error and
just stop growing the stripe cache automatically, but it's probably not worth
the effort. Applied the patch.

Also, free_conf needs a better way to check whether the shrinker was registered
successfully. I'll add another patch to fix that.

Thanks,
Shaohua

^ permalink raw reply

* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Chris Murphy @ 2016-09-21 15:38 UTC (permalink / raw)
  To: Simon Becks; +Cc: Linux-RAID
In-Reply-To: <CAKA=zHqbFv_RQvyR4o4HPS2UrP4xrRWLiy2nR+gudDguk+JLog@mail.gmail.com>

On Wed, Sep 21, 2016 at 4:39 AM, Simon Becks <beckssimon5@gmail.com> wrote:
> Dear Developers & Gurus & Gods,
>
> i had a 3 disk software raid 5 (mdadm) on a buffalo terrastation. By
> accident I reset the raid and the NAS put on a xfs filesystem on each
> of the 3 partitions.
>
> sda6 sdb6 and sdc6 have been the raid5 member partitions.
>
> Now sda6 sdb6 and sdc6 only contain a xfs filesystem with some empty
> default folder structure - my NAS created during the "reset".

OK that doesn't really make sense, if it's going to do a reset I'd
expect it to format all the other partitions and even repartition
those drives. Why does it format just the sixth partition on these
three drives? The point is, you need to make sure this "new" sda6 is
really exactly the same as the old one. As in, the same start and end
sector values. If this is a different partition scheme on any of the
drives, you have to fix that problem first.
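
A minimal sketch of that check in Python. The numbers are the /dev/sdc6
row from the fdisk output posted in this thread; the helper names are
made up for illustration, and the "old" layout would have to come from
notes or a saved partition table, which the thread does not provide.

```python
# Sanity-check one fdisk row: End must equal Start + Sectors - 1.
# Values are the /dev/sdc6 row quoted in this thread; helper names are
# illustrative only.
SECTOR_BYTES = 512  # "Units: sectors of 1 * 512 = 512 bytes"


def row_is_consistent(start, end, sectors):
    """True if the three fdisk columns describe the same extent."""
    return end == start + sectors - 1


def same_geometry(old, new):
    """Compare (start, end) of a remembered layout against the current one."""
    return (old["start"], old["end"]) == (new["start"], new["end"])


sdc6 = {"start": 14008320, "end": 1937508319, "sectors": 1923500000}

consistent = row_is_consistent(sdc6["start"], sdc6["end"], sdc6["sectors"])
size_gib = sdc6["sectors"] * SECTOR_BYTES / 2**30

print("row consistent:", consistent)
print("size: %.1f GiB" % size_gib)
```

same_geometry() only helps if the pre-reset start and end sectors are
known for each drive; the row check above catches nothing more than an
internally inconsistent table.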

>
> mdadm --examine /dev/sda6
> mdadm: No md superblock detected on /dev/sda6
> mdadm --examine /dev/sdb6
> mdadm: No md superblock detected on /dev/sdb6
> mdadm --examine /dev/sdc6
> mdadm: No md superblock detected on /dev/sdc6

mkfs.xfs doesn't write that much metadata but it does write a lot of
zeros, about 60MB of writes per mkfs depending on how many AG's are
created. So no matter what the resulting array is going to have about
200MB of data loss spread around. It's hard to say what that will have
stepped on; if you're lucky it'll be only data. If you're unlucky it
will have hit the file system in a way that'll make it difficult to
extract your data.

So no matter what, this is now a scraping operation. And that will
take some iteration. Invariably it will take less time to just create
a new array and restore from backup. If you don't have a backup for
this data, then it's not important data. Either way, it's not worth
your time.

However...

To make it possible to iterate mdadm metadata version, and chunk size,
and device order without more damage, you need to work on files of
these partitions, or use an overlay.
https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID#Making_the_harddisks_read-only_using_an_overlay_file

The overlay file option is better because you can iterate and throw
them away and quickly start again. If you dd the partitions to files,
and you directly change those files, to start over you have to dd
those partitions yet again. So you're probably going to want the
overlay option no matter what.

Each iteration will produce an assembled array, but only one
combination will produce an array that's your array. And even that
array might not be mountable due to mkfs damage. So you'll need some
tests to find out whether you have a file system on that array or if
it's just garbage.  fsck -n *might* recognize the filesystem even if
it's badly damaged, and tell you how badly damaged it is, without
trying to fix it. You're almost certainly best off not fixing it for
starters, and just mounting it read only and getting off as much data
as you can elsewhere. i.e. making the backup you should already have.
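
The iteration described above could be sketched like this in Python.
It only prints candidate command lines and runs nothing. The overlay
device names, /dev/md100 and the list of chunk sizes are assumptions
for illustration; the mdadm options used (--create, --assume-clean,
--level, --raid-devices, --metadata, --chunk) are standard, but the
data offset is a further variable this sketch leaves out.

```python
import itertools
import shlex

# Hypothetical overlay device names; the real ones depend on how the
# overlays were set up. Never point this at the raw partitions.
MEMBERS = ["/dev/mapper/sda6", "/dev/mapper/sdb6", "/dev/mapper/sdc6"]
METADATA = ["0.90", "1.0", "1.1", "1.2"]   # mdadm metadata versions
CHUNKS_KIB = [64, 128, 256, 512]           # guessed candidate chunk sizes


def candidate_commands(members, metadata, chunks_kib, md_dev="/dev/md100"):
    """Yield one 'mdadm --create' command line per combination to try."""
    for order in itertools.permutations(members):
        for meta in metadata:
            for chunk in chunks_kib:
                argv = [
                    "mdadm", "--create", md_dev,
                    "--assume-clean",          # do not start a resync
                    "--level=5",
                    "--raid-devices=%d" % len(order),
                    "--metadata=%s" % meta,
                    "--chunk=%d" % chunk,
                    *order,
                ]
                yield " ".join(shlex.quote(a) for a in argv)


commands = list(candidate_commands(MEMBERS, METADATA, CHUNKS_KIB))
print(len(commands), "candidates, for example:")
print(commands[0])
```

Each candidate would be followed by a read-only check of the assembled
array and then mdadm --stop before trying the next one, with the
overlays reset in between.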

>
> Disk /dev/sdc: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
> Units: sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 4096 bytes
> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> Disklabel type: gpt
> Disk identifier: FAB30D96-11C4-477E-ADAA-9448A087E124
>
> Device        Start        End    Sectors   Size Type
> /dev/sdc1      2048    2002943    2000896   977M Microsoft basic data
> /dev/sdc2   2002944   12003327   10000384   4.8G Microsoft basic data
> /dev/sdc3  12003328   12005375       2048     1M Microsoft basic data
> /dev/sdc4  12005376   12007423       2048     1M Microsoft basic data
> /dev/sdc5  12007424   14008319    2000896   977M Microsoft basic data
> /dev/sdc6  14008320 1937508319 1923500000 917.2G Microsoft basic data
>
> XFS-Log attached for reference.
>
> Am I screwed or is there a chance to recreate the raid with the 3
> disks end up with the raid and the filesystem i had before?

It's pretty unlikely you'll totally avoid data loss, it's just a matter
of what damage has happened and that's not knowable in advance. You'll
just have to try it out.

If the file system can't be mounted ro; if the fsck can't make it
"good enough" to mount ro; then you will want to take a look at
testdisk which can scrape the array (not the individual drives) for
file signatures. So long as the files are in contiguous blocks, it can
enable you to scrape off things like photos and documents. Smaller
files tend to recover better than big files.

If that also fails, well yeah you can try testdisk pointed at
individual partitions or the whole drives and see what it finds. If
the chunk size is 512KiB that sorta improves the chances you'll get
some files but they'll definitely only be small files that'll be
recognized. Any file broken up by raid striping will be on some other
drive. So it's a huge jigsaw puzzle, which is why raid is not a
backup, etc.
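
To illustrate the jigsaw, here is a rough Python sketch of where
consecutive chunks of the array land on a 3-device RAID5. It assumes
md's left-symmetric layout (the mdadm default, which may not be what
this NAS used), a 512KiB chunk, and it ignores the per-device data
offset; locate() is a made-up helper name.

```python
CHUNK = 512 * 1024   # 512 KiB chunk size, as in the example above
N_DEVICES = 3        # three-member RAID5


def locate(array_offset, n=N_DEVICES, chunk=CHUNK):
    """Map a byte offset in the array to (device index, byte offset on it).

    Assumes the left-symmetric layout and ignores the per-device data offset.
    """
    chunk_no = array_offset // chunk
    stripe = chunk_no // (n - 1)          # n - 1 data chunks per stripe
    data_idx = chunk_no % (n - 1)
    parity_dev = (n - 1) - (stripe % n)   # parity rotates backwards
    device = (parity_dev + 1 + data_idx) % n
    return device, stripe * chunk + array_offset % chunk


# A 2 MiB file starting at array offset 0 spans four chunks.
for i in range(4):
    dev, off = locate(i * CHUNK)
    print("chunk %d -> device %d, device offset %d KiB" % (i, dev, off // 1024))
```

Under these assumptions the four chunks land on devices 0, 1, 2, 0, so
a scan of a single member sees at most one chunk of any file in the
right order.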

-- 
Chris Murphy

^ permalink raw reply

* Re: [PATCH][RESEND] Fix RAID metadata check
From: Jes Sorensen @ 2016-09-21 14:42 UTC (permalink / raw)
  To: Mariusz Dabrowski
  Cc: linux-raid, tomasz.majchrzak, aleksey.obitotskiy, pawel.baldysiak,
	artur.paszkiewicz, maksymilian.kunt
In-Reply-To: <1474442220-28895-1-git-send-email-mariusz.dabrowski@intel.com>

Mariusz Dabrowski <mariusz.dabrowski@intel.com> writes:
> mdadm recognizes devices with partition table as part of an RAID array
> and invalid warning message is displayed. After this fix proper warning
> messages are being displayed for MBR/GPT disks and devices with RAID
> metadata.

A couple of issues here - in general please respect proper coding
style. This patch is a mess :(

> Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
> ---
>  util.c | 28 +++++++++++++++++-----------
>  1 file changed, 17 insertions(+), 11 deletions(-)
>
> diff --git a/util.c b/util.c
> index c38ede7..5c845a0 100644
> --- a/util.c
> +++ b/util.c
> @@ -710,17 +710,23 @@ int check_raid(int fd, char *name)
>  
>  	if (!st)
>  		return 0;
> -	st->ss->load_super(st, fd, name);
> -	/* Looks like a raid array .. */
> -	pr_err("%s appears to be part of a raid array:\n",
> -		name);
> -	st->ss->getinfo_super(st, &info, NULL);
> -	st->ss->free_super(st);
> -	crtime = info.array.ctime;
> -	level = map_num(pers, info.array.level);
> -	if (!level) level = "-unknown-";
> -	cont_err("level=%s devices=%d ctime=%s",
> -		 level, info.array.raid_disks, ctime(&crtime));
> +	if (st->ss->add_to_super != NULL) {
> +		st->ss->load_super(st, fd, name);
> +		/* Looks like a raid array .. */
> +		pr_err("%s appears to be part of a raid array:\n",
> +			name);

Code lines are 80 characters - again when moving code around like this,
please do it properly.

> +		st->ss->getinfo_super(st, &info, NULL);
> +		st->ss->free_super(st);
> +		crtime = info.array.ctime;
> +		level = map_num(pers, info.array.level);
> +		if (!level) level = "-unknown-";

if () and action should never be on the same line. Yes I know it was in
the old code, but then please fix it up when you move it around.

> +		cont_err("level=%s devices=%d ctime=%s",
> +		level, info.array.raid_disks, ctime(&crtime));

The indentation here is not acceptable, please do it properly like you
would in the kernel too.

> +	}
> +	else {

Ehm?

> +		/* Looks like GPT or MBR */
> +		pr_err("partition table exists on %s\n", name);
> +	}
>  	return 1;
>  }

Cheers,
Jes

^ permalink raw reply

* Re: restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Benjamin ESTRABAUD @ 2016-09-21 13:34 UTC (permalink / raw)
  To: Simon Becks, linux-raid
In-Reply-To: <CAKA=zHqbFv_RQvyR4o4HPS2UrP4xrRWLiy2nR+gudDguk+JLog@mail.gmail.com>

On 21/09/16 11:39, Simon Becks wrote:
> Dear Developers & Gurus & Gods,
>
> i had a 3 disk software raid 5 (mdadm) on a buffalo terrastation. By
> accident I reset the raid and the NAS put on a xfs filesystem on each
> of the 3 partitions.
>
> sda6 sdb6 and sdc6 have been the raid5 member partitions.
>
> Now sda6 sdb6 and sdc6 only contain a xfs filesystem with some empty
> default folder structure - my NAS created during the "reset".
>

This is clearly not ideal, but hopefully you can recreate the MD
superblocks and re-assemble the RAID as it was before. That requires
knowing exactly what order the devices were in within your array and
what RAID level and chunk size you used. There are procedures online to
attempt this somewhat safely (by telling MD to assemble read-only), but
really the first step in this case is usually to back up the raw
partitions to files or new devices in case you mess something up during
recovery.
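
As a rough illustration of that backup step, the Python sketch below
copies a partition to an image file block by block and returns a
SHA-256 of what it read, so the image can be verified later. The
function name and the paths in the comment are hypothetical. It does
not retry or skip unreadable sectors, so a tool such as dd or ddrescue
is the more usual choice on real hardware.

```python
import hashlib


def image_partition(src_path, dst_path, block_size=4 * 1024 * 1024):
    """Copy src_path to dst_path block by block; return (bytes, sha256 hex)."""
    digest = hashlib.sha256()
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            digest.update(block)
            copied += len(block)
    return copied, digest.hexdigest()


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 3:
        # e.g. python3 image.py /dev/sda6 /mnt/spare/sda6.img
        # (both paths are placeholders)
        print(image_partition(sys.argv[1], sys.argv[2]))
```

The destination needs at least as much free space as the partition,
about 917 GiB per member going by the fdisk output in this thread.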

> Am I screwed or is there a chance to recreate the raid with the 3
> disks end up with the raid and the filesystem i had before?
>
Even if you manage to recreate the RAID superblocks and re-assemble the
array as it used to be, part of the on-RAID data will have been
overwritten by the format of the individual partitions. Maybe enough that
your filesystem will be corrupted beyond repair (at which point you can
use utilities like the excellent "testdisk").

> any help is greatly appreciated.
>
Good luck!

> Simon
>
Regards,
Ben.

^ permalink raw reply

* restore 3disk raid5 after raidpartitions have been setup with xfs filesystem by accident
From: Simon Becks @ 2016-09-21 10:39 UTC (permalink / raw)
  To: linux-raid

[-- Attachment #1: Type: text/plain, Size: 1563 bytes --]

Dear Developers & Gurus & Gods,

i had a 3 disk software raid 5 (mdadm) on a buffalo terrastation. By
accident I reset the raid and the NAS put on a xfs filesystem on each
of the 3 partitions.

sda6 sdb6 and sdc6 have been the raid5 member partitions.

Now sda6 sdb6 and sdc6 only contain a xfs filesystem with some empty
default folder structure - my NAS created during the "reset".

mdadm --examine /dev/sda6
mdadm: No md superblock detected on /dev/sda6
mdadm --examine /dev/sdb6
mdadm: No md superblock detected on /dev/sdb6
mdadm --examine /dev/sdc6
mdadm: No md superblock detected on /dev/sdc6

Disk /dev/sdc: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: FAB30D96-11C4-477E-ADAA-9448A087E124

Device        Start        End    Sectors   Size Type
/dev/sdc1      2048    2002943    2000896   977M Microsoft basic data
/dev/sdc2   2002944   12003327   10000384   4.8G Microsoft basic data
/dev/sdc3  12003328   12005375       2048     1M Microsoft basic data
/dev/sdc4  12005376   12007423       2048     1M Microsoft basic data
/dev/sdc5  12007424   14008319    2000896   977M Microsoft basic data
/dev/sdc6  14008320 1937508319 1923500000 917.2G Microsoft basic data

XFS-Log attached for reference.

Am I screwed or is there a chance to recreate the raid with the 3
disks end up with the raid and the filesystem i had before?

any help is greatly appreciated.

Simon

[-- Attachment #2: xfs-log --]
[-- Type: application/octet-stream, Size: 30120 bytes --]

xfs_logprint:
    data device: 0x876
    log device: 0x876 daddr: 961750032 length: 939208

cycle: 1	version: 2		lsn: 1,0	tail_lsn: 1,0
length of Log Record: 20	prev offset: -1		num ops: 1
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: b0c0d0d0  len: 8  clientid: LOG  flags: UNMOUNT 
Unmount filesystem

============================================================================
cycle: 1	version: 2		lsn: 1,2	tail_lsn: 1,2
length of Log Record: 11776	prev offset: 0		num ops: 50
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 8ac06ac8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (1): tid: 8ac06ac8  len: 16  clientid: TRANS  flags: none
TRAN:    type: QM_QINOCREATE       tid: 0       num_items: 4
----------------------------------------------------------------------------
Oper (2): tid: 8ac06ac8  len: 24  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 2 (0x2)  len: 1  bmap size: 1  flags: 0x0
Oper (3): tid: 8ac06ac8  len: 128  clientid: TRANS  flags: none
AGI Buffer: XAGI  
ver: 1  seq#: 0  len: 60109375  cnt: 64  root: 3
level: 1  free#: 0x3c  newino: 0x80
bucket[0 - 3]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff 
bucket[4 - 7]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff 
bucket[8 - 11]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff 
bucket[12 - 15]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff 
bucket[16 - 19]: 0xffffffff 
----------------------------------------------------------------------------
Oper (4): tid: 8ac06ac8  len: 28  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 24 (0x18)  len: 8  bmap size: 2  flags: 0x0
Oper (5): tid: 8ac06ac8  len: 128  clientid: TRANS  flags: none
BUF DATA
----------------------------------------------------------------------------
Oper (6): tid: 8ac06ac8  len: 56  clientid: TRANS  flags: none
INODE: #regs: 2   ino: 0x83  flags: 0x1   dsize: 0
        blkno: 64  len: 16  boff: 768
Oper (7): tid: 8ac06ac8  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 0100000 version 2 format 2
nlink 1 uid 0 gid 0
atime 0x57cc6e35 mtime 0x57cc6e35 ctime 0x57cc6e35
size 0x0 nblocks 0x0 extsize 0x0 nextents 0x0
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
----------------------------------------------------------------------------
Oper (8): tid: 8ac06ac8  len: 24  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 0 (0x0)  len: 1  bmap size: 1  flags: 0x0
Oper (9): tid: 8ac06ac8  len: 256  clientid: TRANS  flags: none
SUPER BLOCK Buffer: 
icount: 6360863066640355328  ifree: 240437500  fdblks: 0  frext: 0
----------------------------------------------------------------------------
Oper (10): tid: 8ac06ac8  len: 0  clientid: TRANS  flags: COMMIT 
----------------------------------------------------------------------------
Oper (11): tid: 8ac06ac8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (12): tid: 8ac06ac8  len: 16  clientid: TRANS  flags: none
TRAN:    type: QM_QINOCREATE       tid: 0       num_items: 4
----------------------------------------------------------------------------
Oper (13): tid: 8ac06ac8  len: 24  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 2 (0x2)  len: 1  bmap size: 1  flags: 0x0
Oper (14): tid: 8ac06ac8  len: 128  clientid: TRANS  flags: none
AGI Buffer: XAGI  
ver: 1  seq#: 0  len: 60109375  cnt: 64  root: 3
level: 1  free#: 0x3b  newino: 0x80
bucket[0 - 3]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff 
bucket[4 - 7]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff 
bucket[8 - 11]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff 
bucket[12 - 15]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff 
bucket[16 - 19]: 0xffffffff 
----------------------------------------------------------------------------
Oper (15): tid: 8ac06ac8  len: 28  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 24 (0x18)  len: 8  bmap size: 2  flags: 0x0
Oper (16): tid: 8ac06ac8  len: 128  clientid: TRANS  flags: none
BUF DATA
----------------------------------------------------------------------------
Oper (17): tid: 8ac06ac8  len: 56  clientid: TRANS  flags: none
INODE: #regs: 2   ino: 0x84  flags: 0x1   dsize: 0
        blkno: 64  len: 16  boff: 1024
Oper (18): tid: 8ac06ac8  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 0100000 version 2 format 2
nlink 1 uid 0 gid 0
atime 0x57cc6e35 mtime 0x57cc6e35 ctime 0x57cc6e35
size 0x0 nblocks 0x0 extsize 0x0 nextents 0x0
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
----------------------------------------------------------------------------
Oper (19): tid: 8ac06ac8  len: 24  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 0 (0x0)  len: 1  bmap size: 1  flags: 0x0
Oper (20): tid: 8ac06ac8  len: 256  clientid: TRANS  flags: none
SUPER BLOCK Buffer: 
icount: 6360863066640355328  ifree: 240437500  fdblks: 0  frext: 0
----------------------------------------------------------------------------
Oper (21): tid: 8ac06ac8  len: 0  clientid: TRANS  flags: COMMIT 
----------------------------------------------------------------------------
Oper (22): tid: 8ac06ac8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (23): tid: 8ac06ac8  len: 16  clientid: TRANS  flags: none
TRAN:    type: QM_DQALLOC       tid: 0       num_items: 5
----------------------------------------------------------------------------
Oper (24): tid: 8ac06ac8  len: 56  clientid: TRANS  flags: none
INODE: #regs: 3   ino: 0x83  flags: 0x5   dsize: 16
        blkno: 64  len: 16  boff: 768
Oper (25): tid: 8ac06ac8  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 0100000 version 2 format 2
nlink 1 uid 0 gid 0
atime 0x57cc6e35 mtime 0x57cc6e35 ctime 0x57cc6e35
size 0x0 nblocks 0x1 extsize 0x0 nextents 0x1
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
Oper (26): tid: 8ac06ac8  len: 16  clientid: TRANS  flags: none
EXTENTS inode data
----------------------------------------------------------------------------
Oper (27): tid: 8ac06ac8  len: 24  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 1 (0x1)  len: 1  bmap size: 1  flags: 0x0
Oper (28): tid: 8ac06ac8  len: 128  clientid: TRANS  flags: none
AGF Buffer: XAGF  Out of space
----------------------------------------------------------------------------
Oper (29): tid: 8ac06ac8  len: 28  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 16 (0x10)  len: 8  bmap size: 2  flags: 0x0
Oper (30): tid: 8ac06ac8  len: 128  clientid: TRANS  flags: none
BUF DATA
----------------------------------------------------------------------------
Oper (31): tid: 8ac06ac8  len: 28  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 8 (0x8)  len: 8  bmap size: 2  flags: 0x0
Oper (32): tid: 8ac06ac8  len: 128  clientid: TRANS  flags: none
BUF DATA
----------------------------------------------------------------------------
Oper (33): tid: 8ac06ac8  len: 28  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 96 (0x60)  len: 8  bmap size: 2  flags: 0x4
Oper (34): tid: 8ac06ac8  len: 4096  clientid: TRANS  flags: none
BUF DATA
----------------------------------------------------------------------------
Oper (35): tid: 8ac06ac8  len: 0  clientid: TRANS  flags: COMMIT 
----------------------------------------------------------------------------
Oper (36): tid: 8ac06ac8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (37): tid: 8ac06ac8  len: 16  clientid: TRANS  flags: none
TRAN:    type: QM_DQALLOC       tid: 0       num_items: 5
----------------------------------------------------------------------------
Oper (38): tid: 8ac06ac8  len: 56  clientid: TRANS  flags: none
INODE: #regs: 3   ino: 0x84  flags: 0x5   dsize: 16
        blkno: 64  len: 16  boff: 1024
Oper (39): tid: 8ac06ac8  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 0100000 version 2 format 2
nlink 1 uid 0 gid 0
atime 0x57cc6e35 mtime 0x57cc6e35 ctime 0x57cc6e35
size 0x0 nblocks 0x1 extsize 0x0 nextents 0x1
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
Oper (40): tid: 8ac06ac8  len: 16  clientid: TRANS  flags: none
EXTENTS inode data
----------------------------------------------------------------------------
Oper (41): tid: 8ac06ac8  len: 24  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 1 (0x1)  len: 1  bmap size: 1  flags: 0x0
Oper (42): tid: 8ac06ac8  len: 128  clientid: TRANS  flags: none
AGF Buffer: XAGF  Out of space
----------------------------------------------------------------------------
Oper (43): tid: 8ac06ac8  len: 28  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 16 (0x10)  len: 8  bmap size: 2  flags: 0x0
Oper (44): tid: 8ac06ac8  len: 128  clientid: TRANS  flags: none
BUF DATA
----------------------------------------------------------------------------
Oper (45): tid: 8ac06ac8  len: 28  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 8 (0x8)  len: 8  bmap size: 2  flags: 0x0
Oper (46): tid: 8ac06ac8  len: 128  clientid: TRANS  flags: none
BUF DATA
----------------------------------------------------------------------------
Oper (47): tid: 8ac06ac8  len: 28  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 104 (0x68)  len: 8  bmap size: 2  flags: 0x10
Oper (48): tid: 8ac06ac8  len: 4096  clientid: TRANS  flags: none
BUF DATA
----------------------------------------------------------------------------
Oper (49): tid: 8ac06ac8  len: 0  clientid: TRANS  flags: COMMIT 

============================================================================
cycle: 1	version: 2		lsn: 1,26	tail_lsn: 1,2
length of Log Record: 1024	prev offset: 2		num ops: 15
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 8ac06ac8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (1): tid: 8ac06ac8  len: 16  clientid: TRANS  flags: none
TRAN:    type: QM_SBCHANGE       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (2): tid: 8ac06ac8  len: 24  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 0 (0x0)  len: 1  bmap size: 1  flags: 0x0
Oper (3): tid: 8ac06ac8  len: 256  clientid: TRANS  flags: none
SUPER BLOCK Buffer: 
icount: 6360863066640355328  ifree: 240437500  fdblks: 0  frext: 0
----------------------------------------------------------------------------
Oper (4): tid: 8ac06ac8  len: 0  clientid: TRANS  flags: COMMIT 
----------------------------------------------------------------------------
Oper (5): tid: 8ac06ac8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (6): tid: 8ac06ac8  len: 16  clientid: TRANS  flags: none
TRAN:    type: SETATTR       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (7): tid: 8ac06ac8  len: 56  clientid: TRANS  flags: none
INODE: #regs: 2   ino: 0x80  flags: 0x1   dsize: 0
        blkno: 64  len: 16  boff: 0
Oper (8): tid: 8ac06ac8  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 040755 version 2 format 1
nlink 2 uid 0 gid 0
atime 0x0 mtime 0x57cc6e33 ctime 0x57cc6e35
size 0x6 nblocks 0x0 extsize 0x0 nextents 0x0
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
----------------------------------------------------------------------------
Oper (9): tid: 8ac06ac8  len: 0  clientid: TRANS  flags: COMMIT 
----------------------------------------------------------------------------
Oper (10): tid: 8ac06ac8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (11): tid: 8ac06ac8  len: 16  clientid: TRANS  flags: none
TRAN:    type: SETATTR       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (12): tid: 8ac06ac8  len: 56  clientid: TRANS  flags: none
INODE: #regs: 2   ino: 0x80  flags: 0x1   dsize: 0
        blkno: 64  len: 16  boff: 0
Oper (13): tid: 8ac06ac8  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 040755 version 2 format 1
nlink 2 uid 0 gid 0
atime 0x0 mtime 0x57cc6e33 ctime 0x57cc6e35
size 0x6 nblocks 0x0 extsize 0x0 nextents 0x0
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
----------------------------------------------------------------------------
Oper (14): tid: 8ac06ac8  len: 0  clientid: TRANS  flags: COMMIT 

============================================================================
cycle: 1	version: 2		lsn: 1,29	tail_lsn: 1,26
length of Log Record: 512	prev offset: 26		num ops: 5
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 99973ae8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (1): tid: 99973ae8  len: 16  clientid: TRANS  flags: none
TRAN:    type: DUMMY1       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (2): tid: 99973ae8  len: 56  clientid: TRANS  flags: none
INODE: #regs: 2   ino: 0x80  flags: 0x1   dsize: 0
        blkno: 64  len: 16  boff: 0
Oper (3): tid: 99973ae8  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 040755 version 2 format 1
nlink 2 uid 0 gid 0
atime 0x0 mtime 0x57cc6e33 ctime 0x57cc6e35
size 0x6 nblocks 0x0 extsize 0x0 nextents 0x0
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
----------------------------------------------------------------------------
Oper (4): tid: 99973ae8  len: 0  clientid: TRANS  flags: COMMIT 

============================================================================
cycle: 1	version: 2		lsn: 1,31	tail_lsn: 1,29
length of Log Record: 512	prev offset: 29		num ops: 5
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 9fc26b08  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (1): tid: 9fc26b08  len: 16  clientid: TRANS  flags: none
TRAN:    type: DUMMY1       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (2): tid: 9fc26b08  len: 56  clientid: TRANS  flags: none
INODE: #regs: 2   ino: 0x80  flags: 0x1   dsize: 0
        blkno: 64  len: 16  boff: 0
Oper (3): tid: 9fc26b08  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 040755 version 2 format 1
nlink 2 uid 0 gid 0
atime 0x0 mtime 0x57cc6e33 ctime 0x57cc6e35
size 0x6 nblocks 0x0 extsize 0x0 nextents 0x0
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
----------------------------------------------------------------------------
Oper (4): tid: 9fc26b08  len: 0  clientid: TRANS  flags: COMMIT 

============================================================================
cycle: 1	version: 2		lsn: 1,33	tail_lsn: 1,31
length of Log Record: 512	prev offset: 31		num ops: 5
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 9738ab08  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (1): tid: 9738ab08  len: 16  clientid: TRANS  flags: none
TRAN:    type: SB_COUNT       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (2): tid: 9738ab08  len: 24  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 0 (0x0)  len: 1  bmap size: 1  flags: 0x0
Oper (3): tid: 9738ab08  len: 128  clientid: TRANS  flags: none
SUPER BLOCK Buffer: 
icount: 64  ifree: 59  fdblks: 240320077  frext: 0
----------------------------------------------------------------------------
Oper (4): tid: 9738ab08  len: 0  clientid: TRANS  flags: COMMIT 

============================================================================
cycle: 1	version: 2		lsn: 1,35	tail_lsn: 1,33
length of Log Record: 512	prev offset: 33		num ops: 1
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 9738ab08  len: 8  clientid: LOG  flags: UNMOUNT 
Unmount filesystem

============================================================================
cycle: 1	version: 2		lsn: 1,37	tail_lsn: 1,37
length of Log Record: 512	prev offset: 35		num ops: 10
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 99f62ac8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (1): tid: 99f62ac8  len: 16  clientid: TRANS  flags: none
TRAN:    type: SETATTR       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (2): tid: 99f62ac8  len: 56  clientid: TRANS  flags: none
INODE: #regs: 2   ino: 0x80  flags: 0x1   dsize: 0
        blkno: 64  len: 16  boff: 0
Oper (3): tid: 99f62ac8  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 040755 version 2 format 1
nlink 2 uid 0 gid 0
atime 0x0 mtime 0x57cc6e33 ctime 0x57cc7a74
size 0x6 nblocks 0x0 extsize 0x0 nextents 0x0
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
----------------------------------------------------------------------------
Oper (4): tid: 99f62ac8  len: 0  clientid: TRANS  flags: COMMIT 
----------------------------------------------------------------------------
Oper (5): tid: 99f62ac8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (6): tid: 99f62ac8  len: 16  clientid: TRANS  flags: none
TRAN:    type: SETATTR       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (7): tid: 99f62ac8  len: 56  clientid: TRANS  flags: none
INODE: #regs: 2   ino: 0x80  flags: 0x1   dsize: 0
        blkno: 64  len: 16  boff: 0
Oper (8): tid: 99f62ac8  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 040755 version 2 format 1
nlink 2 uid 0 gid 0
atime 0x0 mtime 0x57cc6e33 ctime 0x57cc7a74
size 0x6 nblocks 0x0 extsize 0x0 nextents 0x0
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
----------------------------------------------------------------------------
Oper (9): tid: 99f62ac8  len: 0  clientid: TRANS  flags: COMMIT 

============================================================================
cycle: 1	version: 2		lsn: 1,39	tail_lsn: 1,37
length of Log Record: 512	prev offset: 37		num ops: 5
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 89897ae8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (1): tid: 89897ae8  len: 16  clientid: TRANS  flags: none
TRAN:    type: DUMMY1       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (2): tid: 89897ae8  len: 56  clientid: TRANS  flags: none
INODE: #regs: 2   ino: 0x80  flags: 0x1   dsize: 0
        blkno: 64  len: 16  boff: 0
Oper (3): tid: 89897ae8  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 040755 version 2 format 1
nlink 2 uid 0 gid 0
atime 0x0 mtime 0x57cc6e33 ctime 0x57cc7a74
size 0x6 nblocks 0x0 extsize 0x0 nextents 0x0
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
----------------------------------------------------------------------------
Oper (4): tid: 89897ae8  len: 0  clientid: TRANS  flags: COMMIT 

============================================================================
cycle: 1	version: 2		lsn: 1,41	tail_lsn: 1,39
length of Log Record: 512	prev offset: 39		num ops: 5
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 8943bb08  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (1): tid: 8943bb08  len: 16  clientid: TRANS  flags: none
TRAN:    type: DUMMY1       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (2): tid: 8943bb08  len: 56  clientid: TRANS  flags: none
INODE: #regs: 2   ino: 0x80  flags: 0x1   dsize: 0
        blkno: 64  len: 16  boff: 0
Oper (3): tid: 8943bb08  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 040755 version 2 format 1
nlink 2 uid 0 gid 0
atime 0x0 mtime 0x57cc6e33 ctime 0x57cc7a74
size 0x6 nblocks 0x0 extsize 0x0 nextents 0x0
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
----------------------------------------------------------------------------
Oper (4): tid: 8943bb08  len: 0  clientid: TRANS  flags: COMMIT 

============================================================================
cycle: 1	version: 2		lsn: 1,43	tail_lsn: 1,41
length of Log Record: 512	prev offset: 41		num ops: 5
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 9e41cae8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (1): tid: 9e41cae8  len: 16  clientid: TRANS  flags: none
TRAN:    type: SB_COUNT       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (2): tid: 9e41cae8  len: 24  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 0 (0x0)  len: 1  bmap size: 1  flags: 0x0
Oper (3): tid: 9e41cae8  len: 128  clientid: TRANS  flags: none
SUPER BLOCK Buffer: 
icount: 64  ifree: 59  fdblks: 240320077  frext: 0
----------------------------------------------------------------------------
Oper (4): tid: 9e41cae8  len: 0  clientid: TRANS  flags: COMMIT 

============================================================================
cycle: 1	version: 2		lsn: 1,45	tail_lsn: 1,43
length of Log Record: 512	prev offset: 43		num ops: 1
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 9e41cae8  len: 8  clientid: LOG  flags: UNMOUNT 
Unmount filesystem

============================================================================
cycle: 1	version: 2		lsn: 1,47	tail_lsn: 1,47
length of Log Record: 512	prev offset: 45		num ops: 10
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 8f288ac8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (1): tid: 8f288ac8  len: 16  clientid: TRANS  flags: none
TRAN:    type: SETATTR       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (2): tid: 8f288ac8  len: 56  clientid: TRANS  flags: none
INODE: #regs: 2   ino: 0x80  flags: 0x1   dsize: 0
        blkno: 64  len: 16  boff: 0
Oper (3): tid: 8f288ac8  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 040755 version 2 format 1
nlink 2 uid 0 gid 0
atime 0x0 mtime 0x57cc6e33 ctime 0x57cc949b
size 0x6 nblocks 0x0 extsize 0x0 nextents 0x0
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
----------------------------------------------------------------------------
Oper (4): tid: 8f288ac8  len: 0  clientid: TRANS  flags: COMMIT 
----------------------------------------------------------------------------
Oper (5): tid: 8f288ac8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (6): tid: 8f288ac8  len: 16  clientid: TRANS  flags: none
TRAN:    type: SETATTR       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (7): tid: 8f288ac8  len: 56  clientid: TRANS  flags: none
INODE: #regs: 2   ino: 0x80  flags: 0x1   dsize: 0
        blkno: 64  len: 16  boff: 0
Oper (8): tid: 8f288ac8  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 040755 version 2 format 1
nlink 2 uid 0 gid 0
atime 0x0 mtime 0x57cc6e33 ctime 0x57cc949b
size 0x6 nblocks 0x0 extsize 0x0 nextents 0x0
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
----------------------------------------------------------------------------
Oper (9): tid: 8f288ac8  len: 0  clientid: TRANS  flags: COMMIT 

============================================================================
cycle: 1	version: 2		lsn: 1,49	tail_lsn: 1,47
length of Log Record: 512	prev offset: 47		num ops: 5
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 9cd9aae8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (1): tid: 9cd9aae8  len: 16  clientid: TRANS  flags: none
TRAN:    type: DUMMY1       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (2): tid: 9cd9aae8  len: 56  clientid: TRANS  flags: none
INODE: #regs: 2   ino: 0x80  flags: 0x1   dsize: 0
        blkno: 64  len: 16  boff: 0
Oper (3): tid: 9cd9aae8  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 040755 version 2 format 1
nlink 2 uid 0 gid 0
atime 0x0 mtime 0x57cc6e33 ctime 0x57cc949b
size 0x6 nblocks 0x0 extsize 0x0 nextents 0x0
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
----------------------------------------------------------------------------
Oper (4): tid: 9cd9aae8  len: 0  clientid: TRANS  flags: COMMIT 

============================================================================
cycle: 1	version: 2		lsn: 1,51	tail_lsn: 1,49
length of Log Record: 512	prev offset: 49		num ops: 5
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 8328bac8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (1): tid: 8328bac8  len: 16  clientid: TRANS  flags: none
TRAN:    type: DUMMY1       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (2): tid: 8328bac8  len: 56  clientid: TRANS  flags: none
INODE: #regs: 2   ino: 0x80  flags: 0x1   dsize: 0
        blkno: 64  len: 16  boff: 0
Oper (3): tid: 8328bac8  len: 96  clientid: TRANS  flags: none
INODE CORE
magic 0x494e mode 040755 version 2 format 1
nlink 2 uid 0 gid 0
atime 0x0 mtime 0x57cc6e33 ctime 0x57cc949b
size 0x6 nblocks 0x0 extsize 0x0 nextents 0x0
naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
flags 0x0 gen 0x0
----------------------------------------------------------------------------
Oper (4): tid: 8328bac8  len: 0  clientid: TRANS  flags: COMMIT 

============================================================================
cycle: 1	version: 2		lsn: 1,53	tail_lsn: 1,51
length of Log Record: 512	prev offset: 51		num ops: 5
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 8ef0cac8  len: 0  clientid: TRANS  flags: START 
----------------------------------------------------------------------------
Oper (1): tid: 8ef0cac8  len: 16  clientid: TRANS  flags: none
TRAN:    type: SB_COUNT       tid: 0       num_items: 1
----------------------------------------------------------------------------
Oper (2): tid: 8ef0cac8  len: 24  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 0 (0x0)  len: 1  bmap size: 1  flags: 0x0
Oper (3): tid: 8ef0cac8  len: 128  clientid: TRANS  flags: none
SUPER BLOCK Buffer: 
icount: 64  ifree: 59  fdblks: 240320077  frext: 0
----------------------------------------------------------------------------
Oper (4): tid: 8ef0cac8  len: 0  clientid: TRANS  flags: COMMIT 

============================================================================
cycle: 1	version: 2		lsn: 1,55	tail_lsn: 1,53
length of Log Record: 512	prev offset: 53		num ops: 1
uuid: d898c9bd-0dae-4ad6-bdf7-46535137b7be   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 8ef0cac8  len: 8  clientid: LOG  flags: UNMOUNT 
Unmount filesystem

============================================================================
xfs_logprint: skipped 4086 cleared blocks in range: 57 - 4142
xfs_logprint: skipped 935065 zeroed blocks in range: 4143 - 939207
xfs_logprint: physical end of log
============================================================================
xfs_logprint: logical end of log
============================================================================


* [PATCH][RESEND] Fix RAID metadata check
From: Mariusz Dabrowski @ 2016-09-21  7:17 UTC (permalink / raw)
  To: linux-raid
  Cc: Jes.Sorensen, tomasz.majchrzak, aleksey.obitotskiy,
	pawel.baldysiak, artur.paszkiewicz, maksymilian.kunt,
	Mariusz Dabrowski
In-Reply-To: <1473849811-29110-1-git-send-email-mariusz.dabrowski@intel.com>

mdadm treats devices that carry a partition table as part of a RAID array
and displays an invalid warning message. After this fix, proper warning
messages are displayed for MBR/GPT disks and for devices with RAID
metadata.

Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
---
 util.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/util.c b/util.c
index c38ede7..5c845a0 100644
--- a/util.c
+++ b/util.c
@@ -710,17 +710,23 @@ int check_raid(int fd, char *name)
 
 	if (!st)
 		return 0;
-	st->ss->load_super(st, fd, name);
-	/* Looks like a raid array .. */
-	pr_err("%s appears to be part of a raid array:\n",
-		name);
-	st->ss->getinfo_super(st, &info, NULL);
-	st->ss->free_super(st);
-	crtime = info.array.ctime;
-	level = map_num(pers, info.array.level);
-	if (!level) level = "-unknown-";
-	cont_err("level=%s devices=%d ctime=%s",
-		 level, info.array.raid_disks, ctime(&crtime));
+	if (st->ss->add_to_super != NULL) {
+		st->ss->load_super(st, fd, name);
+		/* Looks like a raid array .. */
+		pr_err("%s appears to be part of a raid array:\n",
+			name);
+		st->ss->getinfo_super(st, &info, NULL);
+		st->ss->free_super(st);
+		crtime = info.array.ctime;
+		level = map_num(pers, info.array.level);
+		if (!level) level = "-unknown-";
+		cont_err("level=%s devices=%d ctime=%s",
+		level, info.array.raid_disks, ctime(&crtime));
+	}
+	else {
+		/* Looks like GPT or MBR */
+		pr_err("partition table exists on %s\n", name);
+	}
 	return 1;
 }
 
-- 
1.8.3.1
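
The branch added by the patch can be illustrated outside mdadm with a minimal sketch. Everything here is a hypothetical stand-in: `mock_handler`, `classify` and the enum are not mdadm names, and the only idea carried over from the patch is that a metadata handler with no `add_to_super` callback is treated as a partition table (GPT or MBR) rather than RAID metadata.

```c
#include <stddef.h>

/* Hypothetical, pared-down stand-in for a metadata handler. Only the
 * one callback the patch tests is modelled here. */
struct mock_handler {
	const char *name;
	int (*add_to_super)(void);	/* NULL: handler cannot add devices */
};

enum warn_kind { WARN_NONE, WARN_RAID, WARN_PARTITION_TABLE };

/* Dummy callback so a RAID-like handler has a non-NULL pointer. */
static int raid_add_to_super(void)
{
	return 0;
}

/* Mirrors the decision in the patched check: no handler means no
 * warning, a handler that can add devices means RAID metadata, and
 * anything else is reported as a partition table. */
static enum warn_kind classify(const struct mock_handler *h)
{
	if (h == NULL)
		return WARN_NONE;
	if (h->add_to_super != NULL)
		return WARN_RAID;
	return WARN_PARTITION_TABLE;
}
```

With these mock types, a handler initialised as `{ "raid", raid_add_to_super }` classifies as `WARN_RAID`, while `{ "mbr", NULL }` classifies as `WARN_PARTITION_TABLE`.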



* Re: 95a05b3 broke mdadm --add on my superblock 1.0 array
From: Guoqing Jiang @ 2016-09-21  6:45 UTC (permalink / raw)
  To: Anthony DeRobertis, linux-raid, 837964
In-Reply-To: <1931152f-f5bc-bc1f-76a8-91921ffc1bed@derobert.net>



On 09/20/2016 02:31 PM, Anthony DeRobertis wrote:
> Sorry for the amount of emails I'm sending, but I noticed something 
> that's probably important. I'm also appending some gdb log from 
> tracing through the function (trying to answer why it's doing cluster 
> mode stuff at all).
>
> While tracing through, I noticed that *before* the write-bitmap loop, 
> mdadm -E considers the superblock valid. That agrees with what I saw 
> from strace, I suppose. To my first glance, it figures out how much to 
> write by calling this function:
>
> static unsigned int calc_bitmap_size(bitmap_super_t *bms,
>                                      unsigned int boundary)
> {
>     unsigned long long bits, bytes;
>
>     bits = __le64_to_cpu(bms->sync_size) /
>            (__le32_to_cpu(bms->chunksize) >> 9);
>     bytes = (bits+7) >> 3;
>     bytes += sizeof(bitmap_super_t);
>     bytes = ROUND_UP(bytes, boundary);
>
>     return bytes;
> }
>
> That code looked familiar, and I figured out where—it's also in 
> 95a05b37e8eb2bc0803b1a0298fce6adc60eff16, the commit that I found 
> originally broke it. But that commit is making a change to it: it 
> changed the ROUND_UP line from 512 to 4096 (and from the gdb trace, 
> boundary==4096).
>
> I tested changing that line to "bytes = ROUND_UP(bytes, 512);", and it 
> works. Adds the new disk to the array and produces no warnings or errors.

I think it is a coincidence that the above change works; commit 4a3d29e made
the change, but it didn't change the logic at all. Also, the problem does not
seem to be related to the md-cluster code, as your gdb trace shows it runs
into the part below because the version is 4.

/* no need to change bms->nodes for other bitmap types */
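
For reference, the arithmetic in the quoted calc_bitmap_size() can be reproduced in a standalone sketch to see how the boundary affects the number of bytes written. The names `bitmap_bytes`, `round_up` and `BITMAP_SB_BYTES` are illustrative, not mdadm's; the 256-byte superblock size is an assumption standing in for sizeof(bitmap_super_t), and the endianness conversions are dropped because plain host integers are passed in.

```c
#include <assert.h>

/* Assumed size of the on-disk bitmap superblock; a stand-in for
 * sizeof(bitmap_super_t) in the quoted function. */
#define BITMAP_SB_BYTES 256ULL

/* Round val up to the next multiple of base. */
static unsigned long long round_up(unsigned long long val,
				   unsigned long long base)
{
	return ((val + base - 1) / base) * base;
}

/* Same steps as the quoted function: one bit per bitmap chunk, packed
 * into bytes, plus the superblock, rounded up to the boundary.
 * sync_size is in 512-byte sectors, chunksize is in bytes. */
static unsigned long long bitmap_bytes(unsigned long long sync_size,
				       unsigned int chunksize,
				       unsigned int boundary)
{
	unsigned long long bits, bytes;

	assert(boundary != 0);
	assert((chunksize >> 9) != 0);

	bits = sync_size / (chunksize >> 9);
	bytes = (bits + 7) >> 3;
	bytes += BITMAP_SB_BYTES;
	return round_up(bytes, boundary);
}
```

Under these assumptions, 32768 bitmap chunks (a sync_size of 4294967296 sectors with a 64 MiB chunk) give 4096 + 256 = 4352 bytes before rounding, which becomes 4608 with a 512-byte boundary and 8192 with a 4096-byte boundary. The larger boundary can therefore make the write extend further past the bitmap superblock.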

Thanks,
Guoqing


* Re: 95a05b3 broke mdadm --add on my superblock 1.0 array
From: Guoqing Jiang @ 2016-09-21  6:40 UTC (permalink / raw)
  To: Anthony DeRobertis, linux-raid, 837964
In-Reply-To: <20160920171223.n7t3wa673qopky4c@derobert.net>



On 09/20/2016 01:12 PM, Anthony DeRobertis wrote:
>
>> Which kernel version did you use to create the array, in case the kernel
>> was updated?
> I've had the array for a while (the superblocks with -E show a creation
> time of Wed Jun 16 14:25:08 2010). If I had to take a guess, I'd guess
> it was created with the Debian squeeze alpha1 installer... So probably
> 2.6.30 or 2.6.32.
>

Hmm, lots of things have changed since 2.6.30, so it is possible that the
latest mdadm can't work well with an array that was created with the old
kernel.

Thanks,
Guoqing

