* Raid-6 cannot reshape
From: Alexander Shenkin @ 2020-04-01 10:16 UTC
To: Linux-RAID
[-- Attachment #1: Type: text/plain, Size: 1525 bytes --]
Hi all,

I had a problem that caused my Ubuntu Server 14 to go down, and it now
will not boot. I added a drive to my raid1 (/dev/md0 -> /boot) + raid6
(/dev/md2 -> /) setup. I was previously running with /dev/md0 (raid1)
assembled from /dev/sd[a-f]1, and /dev/md2 (raid6) assembled from
/dev/sd[a-f]3. Both partitions from the new drive were successfully
added to the two md arrays, and the raid1 (the smaller of the two)
seemed to resync successfully (I think that resync is the correct word
to use, meaning that mdadm is spreading out the data amongst the drives
once an array has been grown). When resyncing the larger raid6,
however, the sync speed was quite slow (KB/sec), and got slower and
slower (7 KB/sec... 5 KB/sec... 3 KB/sec... 1 KB/sec...) until the system
halted entirely and I eventually turned the power off. It now will not
boot.

Thanks to Roger Heflin's help, I've booted into a livecd environment
(Ubuntu Server 18) with all the necessary raid personalities available.

When trying to --assemble md127 (raid6), I get the following error:
"Failed to restore critical section for reshape, sorry. Possibly you
needed to specify the --backup-file". Needless to say, I didn't save a
backup file when adding the drive.

I have attached some diagnostic output here. It seems that my raid6
array is not being recognized as such. I'm not sure what the next steps
are - do I need to figure out how to get the resync up and running
again? Any help would be greatly appreciated.

Many thanks,
Allie

[-- Attachment #2: mdstat.txt --]
[-- Type: text/plain, Size: 395 bytes --]
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md127 : inactive sdd3[5](S) sdf3[8](S) sdc3[2](S) sdg3[9](S) sde3[7](S) sdb3[4](S) sda3[6](S)
      20441322496 blocks super 1.2

md126 : active (auto-read-only) raid1 sdf1[8] sde1[7] sdg1[9] sda1[6] sdd1[5] sdc1[2] sdb1[4]
      1950656 blocks super 1.2 [7/7] [UUUUUUU]

unused devices: <none>
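
The mdstat output above lists md127 as inactive, with all seven members flagged (S). A minimal shell sketch for spotting such arrays is given here. The function name `inactive_arrays` is made up for illustration, and it only reads the file it is given (defaulting to /proc/mdstat), so it changes nothing on disk.

```shell
#!/bin/sh
# Hypothetical helper: print "<array> <member count>" for every md array
# that the given mdstat file reports as inactive.
# $1 is the file to read; it defaults to /proc/mdstat.
inactive_arrays() {
    # An array line looks like: "md127 : inactive sdd3[5](S) sdf3[8](S) ..."
    # so fields 1-3 are name, ":", state, and the rest are members.
    awk '$2 == ":" && $3 == "inactive" { print $1, NF - 3 }' "${1:-/proc/mdstat}"
}
```

Called as `inactive_arrays mdstat.txt` on a saved copy of the attachment above, it should report md127 with 7 members.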
[-- Attachment #3: fdisk.txt --]
[-- Type: text/plain, Size: 6695 bytes --]
Disk /dev/loop0: 265.3 MiB, 278147072 bytes, 543256 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop1: 70.1 MiB, 73531392 bytes, 143616 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop2: 49.7 MiB, 52060160 bytes, 101680 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop3: 36 MiB, 37707776 bytes, 73648 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop4: 27.4 MiB, 28717056 bytes, 56088 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop5: 89.1 MiB, 93417472 bytes, 182456 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop6: 51.9 MiB, 54358016 bytes, 106168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop7: 51.9 MiB, 54407168 bytes, 106264 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/sda: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D607CD2D-05F4-42C6-914C-B99CE56E934F
Device         Start        End    Sectors  Size Type
/dev/sda1       2048    3905535    3903488  1.9G Linux RAID
/dev/sda2    3905536    3907583       2048    1M BIOS boot
/dev/sda3    3907584 5844547583 5840640000  2.7T Linux RAID
/dev/sda4 5844547584 5860532223   15984640  7.6G Linux filesystem
Disk /dev/sdb: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D96D7513-3D74-435B-8B2F-26C2A32B0586
Device         Start        End    Sectors  Size Type
/dev/sdb1       2048    3905535    3903488  1.9G Linux RAID
/dev/sdb2    3905536    3907583       2048    1M BIOS boot
/dev/sdb3    3907584 5844547583 5840640000  2.7T Linux RAID
/dev/sdb4 5844547584 5860532223   15984640  7.6G Linux filesystem
Disk /dev/sdc: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 4B356AFA-8F48-4227-86F0-329565146D7A
Device         Start        End    Sectors  Size Type
/dev/sdc1       2048    3905535    3903488  1.9G Linux RAID
/dev/sdc2    3905536    3907583       2048    1M BIOS boot
/dev/sdc3    3907584 5844547583 5840640000  2.7T Linux RAID
/dev/sdc4 5844547584 5860532223   15984640  7.6G Linux filesystem
Disk /dev/sdd: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D4D1E2EB-8520-4B3E-8263-AB5B5BD2D7E2
Device         Start        End    Sectors  Size Type
/dev/sdd1       2048    3905535    3903488  1.9G Linux RAID
/dev/sdd2    3905536    3907583       2048    1M BIOS boot
/dev/sdd3    3907584 5844547583 5840640000  2.7T Linux RAID
/dev/sdd4 5844547584 5860532223   15984640  7.6G Linux filesystem
Disk /dev/sde: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 52F044AB-9D50-4363-BF60-651A87159A17
Device         Start        End    Sectors  Size Type
/dev/sde1       2048    3905535    3903488  1.9G Linux RAID
/dev/sde2    3905536    3907583       2048    1M BIOS boot
/dev/sde3    3907584 5844547583 5840640000  2.7T Linux RAID
/dev/sde4 5844547584 5860532223   15984640  7.6G Linux filesystem
Disk /dev/sdg: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: BE2DDABE-9106-4436-8ABC-6A919EBFB2E6
Device         Start        End    Sectors  Size Type
/dev/sdg1       2048    3905535    3903488  1.9G Linux RAID
/dev/sdg2    3905536    3907583       2048    1M BIOS boot
/dev/sdg3    3907584 5844547583 5840640000  2.7T Linux RAID
/dev/sdg4 5844547584 5860532223   15984640  7.6G Linux filesystem
Disk /dev/sdf: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 36615F7D-974F-4A0B-B79B-165D872EF418
Device         Start        End    Sectors  Size Type
/dev/sdf1       2048    3905535    3903488  1.9G Linux RAID
/dev/sdf2    3905536    3907583       2048    1M BIOS boot
/dev/sdf3    3907584 5844547583 5840640000  2.7T Linux RAID
/dev/sdf4 5844547584 5860532223   15984640  7.6G Linux filesystem
Disk /dev/md126: 1.9 GiB, 1997471744 bytes, 3901312 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/sdh: 29.2 GiB, 31376707072 bytes, 61282631 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00f674a6
Device    Boot    Start      End  Sectors  Size Id Type
/dev/sdh1 *        2048 61282630 61280583 29.2G  c W95 FAT32 (LBA)
Disk /dev/sdi: 15 GiB, 16106127360 bytes, 31457280 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00e19bdc
Device    Boot    Start      End  Sectors  Size Id Type
/dev/sdi1 *        2048 31457279 31455232   15G  c W95 FAT32 (LBA)

* Re: Raid-6 cannot reshape
From: Alexander Shenkin @ 2020-04-06 15:27 UTC
To: Linux-RAID

On 4/4/2020 9:19 AM, Alexander Shenkin wrote:
> On 4/1/2020 11:16 AM, Alexander Shenkin wrote:
>> [...]
>
> Hello again,
>
> Reading through https://raid.wiki.kernel.org/index.php/Assemble_Run and
> https://raid.wiki.kernel.org/index.php/RAID_Recovery, the advice seems
> to be to force assemble the array given that my event counts are all the
> same. However, I don't want to do that until some experts have chimed
> in. I've seen some other threads about using overlays to avoid data
> loss... not sure if that is still a recommended method.
>
> I'm including the dmesg and mdadm --examine output here... any advice
> much appreciated!
>
> Thanks,
> Allie
>
> (note below - arrays are on /dev/sd[a-g])
>
> root@ubuntu-server:/home/ubuntu-server# mdadm -A --scan --verbose
> mdadm: looking for devices for further assembly
> mdadm: Cannot assemble mbr metadata on /dev/sdh1
> mdadm: Cannot assemble mbr metadata on /dev/sdh
> mdadm: no recogniseable superblock on /dev/md/bobbiflekman:0
> mdadm: no recogniseable superblock on /dev/sdf4
> mdadm: /dev/sdf3 is busy - skipping
> mdadm: no recogniseable superblock on /dev/sdf2
> mdadm: /dev/sdf1 is busy - skipping
> mdadm: Cannot assemble mbr metadata on /dev/sdf
> mdadm: no recogniseable superblock on /dev/sdg4
> mdadm: /dev/sdg3 is busy - skipping
> mdadm: no recogniseable superblock on /dev/sdg2
> mdadm: /dev/sdg1 is busy - skipping
> mdadm: Cannot assemble mbr metadata on /dev/sdg
> mdadm: no recogniseable superblock on /dev/sde4
> mdadm: /dev/sde3 is busy - skipping
> mdadm: no recogniseable superblock on /dev/sde2
> mdadm: /dev/sde1 is busy - skipping
> mdadm: Cannot assemble mbr metadata on /dev/sde
> mdadm: cannot open device /dev/sr1: No medium found
> mdadm: cannot open device /dev/sr0: No medium found
> mdadm: no recogniseable superblock on /dev/sdd4
> mdadm: /dev/sdd3 is busy - skipping
> mdadm: no recogniseable superblock on /dev/sdd2
> mdadm: /dev/sdd1 is busy - skipping
> mdadm: Cannot assemble mbr metadata on /dev/sdd
> mdadm: no recogniseable superblock on /dev/sdc4
> mdadm: /dev/sdc3 is busy - skipping
> mdadm: no recogniseable superblock on /dev/sdc2
> mdadm: /dev/sdc1 is busy - skipping
> mdadm: Cannot assemble mbr metadata on /dev/sdc
> mdadm: no recogniseable superblock on /dev/sdb4
> mdadm: /dev/sdb3 is busy - skipping
> mdadm: no recogniseable superblock on /dev/sdb2
> mdadm: /dev/sdb1 is busy - skipping
> mdadm: Cannot assemble mbr metadata on /dev/sdb
> mdadm: no recogniseable superblock on /dev/sda4
> mdadm: /dev/sda3 is busy - skipping
> mdadm: no recogniseable superblock on /dev/sda2
> mdadm: /dev/sda1 is busy - skipping
> mdadm: Cannot assemble mbr metadata on /dev/sda
> mdadm: no recogniseable superblock on /dev/loop7
> mdadm: no recogniseable superblock on /dev/loop6
> mdadm: no recogniseable superblock on /dev/loop5
> mdadm: no recogniseable superblock on /dev/loop4
> mdadm: no recogniseable superblock on /dev/loop3
> mdadm: no recogniseable superblock on /dev/loop2
> mdadm: no recogniseable superblock on /dev/loop1
> mdadm: no recogniseable superblock on /dev/loop0
> mdadm: No arrays found in config file or automatically
>

Hi again all,

Apologies for all the self-followed-up emails. I realized my previous
--examine was done without stopping the array in question. When stopped
and reexamined, the critical bit seems to be this:

mdadm: /dev/sdf3 is identified as a member of /dev/md/ubuntu:2, slot 4.
mdadm: /dev/sdg3 is identified as a member of /dev/md/ubuntu:2, slot 6.
mdadm: /dev/sde3 is identified as a member of /dev/md/ubuntu:2, slot 5.
mdadm: /dev/sdd3 is identified as a member of /dev/md/ubuntu:2, slot 3.
mdadm: /dev/sdc3 is identified as a member of /dev/md/ubuntu:2, slot 2.
mdadm: /dev/sdb3 is identified as a member of /dev/md/ubuntu:2, slot 1.
mdadm: /dev/sda3 is identified as a member of /dev/md/ubuntu:2, slot 0.
mdadm: /dev/md/ubuntu:2 has an active reshape - checking if critical section needs to be restored
mdadm: No backup metadata on device-6
mdadm: Failed to find backup of critical section
mdadm: Failed to restore critical section for reshape, sorry.
       Possibly you needed to specify the --backup-file

That is, it thinks there is an active reshape.

Thanks,
Allie

root@ubuntu-server:/home/ubuntu-server# mdadm -A --scan --verbose
mdadm: looking for devices for further assembly
mdadm: Cannot assemble mbr metadata on /dev/sdh1
mdadm: Cannot assemble mbr metadata on /dev/sdh
mdadm: no recogniseable superblock on /dev/md/bobbiflekman:0
mdadm: no recogniseable superblock on /dev/sdf4
mdadm: No super block found on /dev/sdf2 (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdf2
mdadm: /dev/sdf1 is busy - skipping
mdadm: No super block found on /dev/sdf (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdf
mdadm: No super block found on /dev/sdg4 (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdg4
mdadm: No super block found on /dev/sdg2 (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdg2
mdadm: /dev/sdg1 is busy - skipping
mdadm: No super block found on /dev/sdg (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdg
mdadm: No super block found on /dev/sde4 (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sde4
mdadm: No super block found on /dev/sde2 (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sde2
mdadm: /dev/sde1 is busy - skipping
mdadm: No super block found on /dev/sde (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sde
mdadm: cannot open device /dev/sr1: No medium found
mdadm: cannot open device /dev/sr0: No medium found
mdadm: No super block found on /dev/sdd4 (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdd4
mdadm: No super block found on /dev/sdd2 (Expected magic a92b4efc, got 9c6196f1)
mdadm: no RAID superblock on /dev/sdd2
mdadm: /dev/sdd1 is busy - skipping
mdadm: No super block found on /dev/sdd (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdd
mdadm: No super block found on /dev/sdc4 (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdc4
mdadm: No super block found on /dev/sdc2 (Expected magic a92b4efc, got 9c6196f1)
mdadm: no RAID superblock on /dev/sdc2
mdadm: /dev/sdc1 is busy - skipping
mdadm: No super block found on /dev/sdc (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdc
mdadm: No super block found on /dev/sdb4 (Expected magic a92b4efc, got 53425553)
mdadm: no RAID superblock on /dev/sdb4
mdadm: No super block found on /dev/sdb2 (Expected magic a92b4efc, got 9c6196f1)
mdadm: no RAID superblock on /dev/sdb2
mdadm: /dev/sdb1 is busy - skipping
mdadm: No super block found on /dev/sdb (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdb
mdadm: No super block found on /dev/sda4 (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sda4
mdadm: No super block found on /dev/sda2 (Expected magic a92b4efc, got 9c6196f1)
mdadm: no RAID superblock on /dev/sda2
mdadm: /dev/sda1 is busy - skipping
mdadm: No super block found on /dev/sda (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sda
mdadm: No super block found on /dev/loop7 (Expected magic a92b4efc, got 72756769)
mdadm: no RAID superblock on /dev/loop7
mdadm: No super block found on /dev/loop6 (Expected magic a92b4efc, got 69622f21)
mdadm: no RAID superblock on /dev/loop6
mdadm: No super block found on /dev/loop5 (Expected magic a92b4efc, got 14ea0a05)
mdadm: no RAID superblock on /dev/loop5
mdadm: No super block found on /dev/loop4 (Expected magic a92b4efc, got 1824ef5d)
mdadm: no RAID superblock on /dev/loop4
mdadm: No super block found on /dev/loop3 (Expected magic a92b4efc, got 1e993ae9)
mdadm: no RAID superblock on /dev/loop3
mdadm: No super block found on /dev/loop2 (Expected magic a92b4efc, got cb4d8c8e)
mdadm: no RAID superblock on /dev/loop2
mdadm: No super block found on /dev/loop1 (Expected magic a92b4efc, got d2964063)
mdadm: no RAID superblock on /dev/loop1
mdadm: No super block found on /dev/loop0 (Expected magic a92b4efc, got e7e108a6)
mdadm: no RAID superblock on /dev/loop0
mdadm: /dev/sdf3 is identified as a member of /dev/md/ubuntu:2, slot 4.
mdadm: /dev/sdg3 is identified as a member of /dev/md/ubuntu:2, slot 6.
mdadm: /dev/sde3 is identified as a member of /dev/md/ubuntu:2, slot 5.
mdadm: /dev/sdd3 is identified as a member of /dev/md/ubuntu:2, slot 3.
mdadm: /dev/sdc3 is identified as a member of /dev/md/ubuntu:2, slot 2.
mdadm: /dev/sdb3 is identified as a member of /dev/md/ubuntu:2, slot 1.
mdadm: /dev/sda3 is identified as a member of /dev/md/ubuntu:2, slot 0.
mdadm: /dev/md/ubuntu:2 has an active reshape - checking if critical section needs to be restored
mdadm: No backup metadata on device-6
mdadm: Failed to find backup of critical section
mdadm: Failed to restore critical section for reshape, sorry.
       Possibly you needed to specify the --backup-file
mdadm: looking for devices for further assembly
mdadm: /dev/sdf1 is busy - skipping
mdadm: /dev/sdg1 is busy - skipping
mdadm: /dev/sde1 is busy - skipping
mdadm: /dev/sdd1 is busy - skipping
mdadm: /dev/sdc1 is busy - skipping
mdadm: /dev/sdb1 is busy - skipping
mdadm: /dev/sda1 is busy - skipping
mdadm: No arrays found in config file or automatically
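
The wiki advice mentioned in this message hinges on whether the event counts of all members agree. A hedged sketch of that check follows. It assumes `mdadm --examine` prints one `Events : N` line per member, which is my understanding of its output for 1.x metadata. The helper name `distinct_event_counts` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `mdadm --examine` output on stdin and print
# each distinct event count once, in ascending numeric order.
# Assumption: every member contributes a line of the form "Events : <number>".
distinct_event_counts() {
    awk '$1 == "Events" && $2 == ":" { print $3 }' | sort -un
}
```

A possible use is `mdadm --examine /dev/sd[a-g]3 | distinct_event_counts`. `--examine` only reads the superblocks. A single line of output would mean all members report the same event count; more than one line would mean they disagree.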

* Re: Raid-6 cannot reshape
From: Roger Heflin @ 2020-04-06 16:12 UTC
To: Alexander Shenkin; +Cc: Linux-RAID

When I looked at your detailed files you sent a few days ago, all of
the reshapes (on all disks) indicated that they were at position 0, so
it kind of appears that the reshape never actually started at all and
hung immediately which is probably why it cannot find the critical
section, it hung prior to that getting done. Not entirely sure how
to undo a reshape that failed like this.

On Mon, Apr 6, 2020 at 10:29 AM Alexander Shenkin <al@shenkin.org> wrote:
> [...]
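
Roger's observation that every member reports reshape position 0 can be re-checked from `mdadm --examine` output. The sketch below assumes that output has a `/dev/...:` header line per device and a `Reshape pos'n : N` line for members that record a reshape. That matches my understanding of 1.x metadata output but is not shown in this thread. `reshape_positions` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: read `mdadm --examine` output on stdin and print
# "<device> <reshape position>" for each member that records a reshape.
reshape_positions() {
    awk '
        # Remember the device a block of output belongs to ("/dev/sda3:").
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }
        # The "." matches the apostrophe in the label, avoiding shell quoting issues.
        /Reshape pos.n :/ { print dev, $4 }
    '
}
```

For example, `mdadm --examine /dev/sd[a-g]3 | reshape_positions` would list one line per member. If every position is 0, that is consistent with the reshape never having moved any data.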
* Re: Raid-6 cannot reshape
2020-04-06 16:12 ` Roger Heflin
@ 2020-04-06 16:27 ` Wols Lists
2020-04-06 20:34 ` Phil Turmel
0 siblings, 1 reply; 13+ messages in thread
From: Wols Lists @ 2020-04-06 16:27 UTC (permalink / raw)
To: Roger Heflin, Alexander Shenkin; +Cc: Linux-RAID

On 06/04/20 17:12, Roger Heflin wrote:
> When I looked at your detailed files you sent a few days ago, all of
> the reshapes (on all disks) indicated that they were at position 0, so
> it kind of appears that the reshape never actually started at all and
> hung immediately which is probably why it cannot find the critical
> section, it hung prior to that getting done. Not entirely sure how
> to undo a reshape that failed like this.

This seems quite common. Search the archives - it's probably something
like --assemble --revert-reshape.

Cheers,
Wol

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Raid-6 cannot reshape
2020-04-06 16:27 ` Wols Lists
@ 2020-04-06 20:34 ` Phil Turmel
2020-04-07 10:25 ` Alexander Shenkin
0 siblings, 1 reply; 13+ messages in thread
From: Phil Turmel @ 2020-04-06 20:34 UTC (permalink / raw)
To: Wols Lists, Roger Heflin, Alexander Shenkin; +Cc: Linux-RAID

On 4/6/20 12:27 PM, Wols Lists wrote:
> On 06/04/20 17:12, Roger Heflin wrote:
>> When I looked at your detailed files you sent a few days ago, all of
>> the reshapes (on all disks) indicated that they were at position 0, so
>> it kind of appears that the reshape never actually started at all and
>> hung immediately which is probably why it cannot find the critical
>> section, it hung prior to that getting done. Not entirely sure how
>> to undo a reshape that failed like this.
>
> This seems quite common. Search the archives - it's probably something
> like --assemble --revert-reshape.

Ah, yes. I recall cases where mdmon wouldn't start or wouldn't open the
array to start moving the stripes, so the kernel wouldn't advance.
SystemD was one of the culprits, I believe, back then.

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Raid-6 cannot reshape
2020-04-06 20:34 ` Phil Turmel
@ 2020-04-07 10:25 ` Alexander Shenkin
2020-04-07 11:28 ` Phil Turmel
0 siblings, 1 reply; 13+ messages in thread
From: Alexander Shenkin @ 2020-04-07 10:25 UTC (permalink / raw)
To: Phil Turmel, Wols Lists, Roger Heflin; +Cc: Linux-RAID

On 4/6/2020 9:34 PM, Phil Turmel wrote:
> On 4/6/20 12:27 PM, Wols Lists wrote:
>> On 06/04/20 17:12, Roger Heflin wrote:
>>> When I looked at your detailed files you sent a few days ago, all of
>>> the reshapes (on all disks) indicated that they were at position 0, so
>>> it kind of appears that the reshape never actually started at all and
>>> hung immediately which is probably why it cannot find the critical
>>> section, it hung prior to that getting done. Not entirely sure how
>>> to undo a reshape that failed like this.
>>
>> This seems quite common. Search the archives - it's probably something
>> like --assemble --revert-reshape.
>
> Ah, yes. I recall cases where mdmon wouldn't start or wouldn't open the
> array to start moving the stripes, so the kernel wouldn't advance.
> SystemD was one of the culprits, I believe, back then.

Thanks all.

So, is the following safe to run, and a good idea to try?

mdadm --assemble --update=revert-reshape /dev/md127 /dev/sd[a-g]3

And if that doesn't work, add a force?

mdadm --assemble --force --update=revert-reshape /dev/md127 /dev/sd[a-g]3

And adding --invalid-backup if it complains about backup files?

Thanks,
Allie

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Raid-6 cannot reshape
2020-04-07 10:25 ` Alexander Shenkin
@ 2020-04-07 11:28 ` Phil Turmel
2020-04-07 11:46 ` Alexander Shenkin
0 siblings, 1 reply; 13+ messages in thread
From: Phil Turmel @ 2020-04-07 11:28 UTC (permalink / raw)
To: Alexander Shenkin, Wols Lists, Roger Heflin; +Cc: Linux-RAID

Hi Allie,

On 4/7/20 6:25 AM, Alexander Shenkin wrote:
>
>
> On 4/6/2020 9:34 PM, Phil Turmel wrote:
>> On 4/6/20 12:27 PM, Wols Lists wrote:
>>> On 06/04/20 17:12, Roger Heflin wrote:
>>>> When I looked at your detailed files you sent a few days ago, all of
>>>> the reshapes (on all disks) indicated that they were at position 0, so
>>>> it kind of appears that the reshape never actually started at all and
>>>> hung immediately which is probably why it cannot find the critical
>>>> section, it hung prior to that getting done. Not entirely sure how
>>>> to undo a reshape that failed like this.
>>>
>>> This seems quite common. Search the archives - it's probably something
>>> like --assemble --revert-reshape.
>>
>> Ah, yes. I recall cases where mdmon wouldn't start or wouldn't open the
>> array to start moving the stripes, so the kernel wouldn't advance.
>> SystemD was one of the culprits, I believe, back then.
>
> Thanks all.
>
> So, is the following safe to run, and a good idea to try?
>
> mdadm --assemble --update=revert-reshape /dev/md127 /dev/sd[a-g]3

Yes.

> And if that doesn't work, add a force?
>
> mdadm --assemble --force --update=revert-reshape /dev/md127 /dev/sd[a-g]3

Yes.

> And adding --invalid-backup if it complains about backup files?

Yes.

> Thanks,
> Allie

Phil

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Raid-6 cannot reshape
2020-04-07 11:28 ` Phil Turmel
@ 2020-04-07 11:46 ` Alexander Shenkin
2020-04-07 12:28 ` Phil Turmel
0 siblings, 1 reply; 13+ messages in thread
From: Alexander Shenkin @ 2020-04-07 11:46 UTC (permalink / raw)
To: Phil Turmel, Wols Lists, Roger Heflin; +Cc: Linux-RAID

On 4/7/2020 12:28 PM, Phil Turmel wrote:
> Hi Allie,
>
> On 4/7/20 6:25 AM, Alexander Shenkin wrote:
>>
>>
>> On 4/6/2020 9:34 PM, Phil Turmel wrote:
>>> On 4/6/20 12:27 PM, Wols Lists wrote:
>>>> On 06/04/20 17:12, Roger Heflin wrote:
>>>>> When I looked at your detailed files you sent a few days ago, all of
>>>>> the reshapes (on all disks) indicated that they were at position 0, so
>>>>> it kind of appears that the reshape never actually started at all and
>>>>> hung immediately which is probably why it cannot find the critical
>>>>> section, it hung prior to that getting done. Not entirely sure how
>>>>> to undo a reshape that failed like this.
>>>>
>>>> This seems quite common. Search the archives - it's probably something
>>>> like --assemble --revert-reshape.
>>>
>>> Ah, yes. I recall cases where mdmon wouldn't start or wouldn't open the
>>> array to start moving the stripes, so the kernel wouldn't advance.
>>> SystemD was one of the culprits, I believe, back then.
>>
>> Thanks all.
>>
>> So, is the following safe to run, and a good idea to try?
>>
>> mdadm --assemble --update=revert-reshape /dev/md127 /dev/sd[a-g]3
>
> Yes.
>
>> And if that doesn't work, add a force?
>>
>> mdadm --assemble --force --update=revert-reshape /dev/md127 /dev/sd[a-g]3
>
> Yes.
>
>> And adding --invalid-backup if it complains about backup files?
>
> Yes.
>
>> Thanks,
>> Allie
>
> Phil
>

Thanks Phil,

The --invalid-backup parameter was necessary to get this up and running.
It's now up with the 7th disk as a spare. Shall I run fsck now, or can
I just try to grow again?

proposed grow operation:
> mdadm --grow --raid-devices=7 --backup-file=/dev/usb/grow_md127.bak /dev/md127
> mdadm --stop /dev/md127
> umount /dev/md127 # not sure if this is necessary
> resize2fs /dev/md127

Thanks,
Allie

assemble operation results:

root@ubuntu-server:/home/ubuntu-server# mdadm --assemble --invalid-backup --update=revert-reshape /dev/md127 /dev/sd[a-g]3
mdadm: device 12 in /dev/md127 has wrong state in superblock, but /dev/sdg3 seems ok
mdadm: /dev/md127 has been started with 6 drives and 1 spare.

root@ubuntu-server:/home/ubuntu-server# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md127 : active raid6 sda3[6] sdg3[9](S) sde3[7] sdf3[8] sdd3[5] sdc3[2] sdb3[4]
      11680755712 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/6] [UUUUUU]
      bitmap: 0/22 pages [0KB], 65536KB chunk

md126 : active (auto-read-only) raid1 sdf1[8] sde1[7] sdg1[9] sda1[6] sdd1[5] sdc1[2] sdb1[4]
      1950656 blocks super 1.2 [7/7] [UUUUUUU]

unused devices: <none>

^ permalink raw reply [flat|nested] 13+ messages in thread
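The /proc/mdstat output in the message above can also be checked mechanically before deciding on a next step. What follows is only a minimal sketch, assuming the usual mdstat layout (a "mdN : state level members" line followed by a line carrying the "[expected/working]" counts). The function name parse_mdstat and the dictionary keys are made up for illustration and are not part of mdadm.

```python
import re

# Sample text copied from the /proc/mdstat output posted in this thread.
MDSTAT_SAMPLE = """\
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md127 : active raid6 sda3[6] sdg3[9](S) sde3[7] sdf3[8] sdd3[5] sdc3[2] sdb3[4]
      11680755712 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/6] [UUUUUU]
      bitmap: 0/22 pages [0KB], 65536KB chunk

md126 : active (auto-read-only) raid1 sdf1[8] sde1[7] sdg1[9] sda1[6] sdd1[5] sdc1[2] sdb1[4]
      1950656 blocks super 1.2 [7/7] [UUUUUUU]

unused devices: <none>
"""


def parse_mdstat(text):
    """Return {array: {state, members, spares, expected, working}}.

    Hypothetical helper: it only understands the layout shown above.
    """
    arrays = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^(md\d+) : (\S+) (.*)$", line)
        if header:
            name, state, rest = header.groups()
            members, spares = [], []
            # Each member looks like "sda3[6]"; spares carry an "(S)" flag.
            for dev, flags in re.findall(r"(\w+)\[\d+\]((?:\([A-Z]\))*)", rest):
                (spares if "(S)" in flags else members).append(dev)
            current = {
                "state": state,
                "members": sorted(members),
                "spares": sorted(spares),
                "expected": None,
                "working": None,
            }
            arrays[name] = current
            continue
        # "[6/6]" means 6 devices expected, 6 currently working.
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if current is not None and counts:
            current["expected"] = int(counts.group(1))
            current["working"] = int(counts.group(2))
    return arrays


if __name__ == "__main__":
    for name, info in parse_mdstat(MDSTAT_SAMPLE).items():
        print(name, info)
```

Run against the sample, this reports md127 as active with six working members and sdg3 as the only spare, which matches what mdadm printed after the revert.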
* Re: Raid-6 cannot reshape
2020-04-07 11:46 ` Alexander Shenkin
@ 2020-04-07 12:28 ` Phil Turmel
2020-04-07 12:31 ` Phil Turmel
0 siblings, 1 reply; 13+ messages in thread
From: Phil Turmel @ 2020-04-07 12:28 UTC (permalink / raw)
To: Alexander Shenkin, Wols Lists, Roger Heflin; +Cc: Linux-RAID

>
> Thanks Phil,
>
> The --invalid-backup parameter was necessary to get this up and running.
> It's now up with the 7th disk as a spare. Shall I run fsck now, or can
> I just try to grow again?
>
> proposed grow operation:
>> mdadm --grow --raid-devices=7 --backup-file=/dev/usb/grow_md127.bak /dev/md127
>> mdadm --stop /dev/md127
>> umount /dev/md127 # not sure if this is necessary
>> resize2fs /dev/md127

An fsck could help, if any blocks did get moved.

I would not attempt a grow again until you find out why the previous
attempt didn't make progress. Check if mdmon is running, and/or compile
a fresh copy of mdadm from source. If you don't figure it out, you'll
just end up in the same spot again.

Phil

^ permalink raw reply [flat|nested] 13+ messages in thread
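One way to notice early that a reshape is not making progress is to sample the array's sync_completed file in sysfs and see whether the position moves. The following is a rough sketch only. It assumes sync_completed contains either "none" or "done / total" in sectors, as described in the kernel md documentation. The helper names and the use of md127 as the default are illustrative, not something anyone in this thread ran.

```python
import time
from pathlib import Path


def parse_sync_completed(text):
    """Parse a sync_completed value into (done, total), or None when idle."""
    text = text.strip()
    if text == "none":
        return None
    done, total = (int(part) for part in text.split("/"))
    return done, total


def is_stalled(samples):
    """True when at least two samples exist and the position never moved.

    `samples` is a list of raw sync_completed strings taken at intervals.
    """
    parsed = [parse_sync_completed(sample) for sample in samples]
    positions = [value[0] for value in parsed if value is not None]
    return len(positions) >= 2 and len(set(positions)) == 1


def sample_sync_completed(md_name="md127", count=3, interval=10.0):
    """Read sync_completed `count` times, `interval` seconds apart.

    Assumption: the array is visible at /sys/block/<md_name>/md/.
    """
    path = Path("/sys/block") / md_name / "md" / "sync_completed"
    samples = []
    for index in range(count):
        samples.append(path.read_text())
        if index < count - 1:
            time.sleep(interval)
    return samples


if __name__ == "__main__":
    # A reshape stuck at position 0, as described earlier in the thread.
    print(is_stalled(["0 / 5840377856", "0 / 5840377856", "0 / 5840377856"]))
```

A position that stays at 0 across several samples would be consistent with the "reshape never actually started" diagnosis given earlier. It says nothing about the cause, which still has to be tracked down as Phil suggests.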
* Re: Raid-6 cannot reshape
2020-04-07 12:28 ` Phil Turmel
@ 2020-04-07 12:31 ` Phil Turmel
2020-04-07 13:19 ` Alexander Shenkin
0 siblings, 1 reply; 13+ messages in thread
From: Phil Turmel @ 2020-04-07 12:31 UTC (permalink / raw)
To: Alexander Shenkin, Wols Lists, Roger Heflin; +Cc: Linux-RAID

On 4/7/20 8:28 AM, Phil Turmel wrote:
>>
>> Thanks Phil,
>>
>> The --invalid-backup parameter was necessary to get this up and running.
>> It's now up with the 7th disk as a spare. Shall I run fsck now, or can
>> I just try to grow again?
>>
>> proposed grow operation:
>>> mdadm --grow --raid-devices=7 --backup-file=/dev/usb/grow_md127.bak /dev/md127
>>> mdadm --stop /dev/md127
>>> umount /dev/md127 # not sure if this is necessary
>>> resize2fs /dev/md127
>
> An fsck could help, if any blocks did get moved.
>
> I would not attempt a grow again until you find out why the previous
> attempt didn't make progress. Check if mdmon is running, and/or compile
> a fresh copy of mdadm from source. If you don't figure it out, you'll
> just end up in the same spot again.

Oh, one more point: Don't use a backup file. Let mdadm shift the data
offsets to get the temporary space needed. (It'll run faster, too.)

(I don't see any mdadm --examine reports in the list thread. Did you do
them and keep the complete output?)

Phil

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Raid-6 cannot reshape
2020-04-07 12:31 ` Phil Turmel
@ 2020-04-07 13:19 ` Alexander Shenkin
2020-04-07 15:08 ` antlists
0 siblings, 1 reply; 13+ messages in thread
From: Alexander Shenkin @ 2020-04-07 13:19 UTC (permalink / raw)
To: Phil Turmel, Wols Lists, Roger Heflin; +Cc: Linux-RAID
[-- Attachment #1: Type: text/plain, Size: 1650 bytes --]

On 4/7/2020 1:31 PM, Phil Turmel wrote:
> On 4/7/20 8:28 AM, Phil Turmel wrote:
>>>
>>> Thanks Phil,
>>>
>>> The --invalid-backup parameter was necessary to get this up and running.
>>> It's now up with the 7th disk as a spare. Shall I run fsck now, or can
>>> I just try to grow again?
>>>
>>> proposed grow operation:
>>>> mdadm --grow --raid-devices=7 --backup-file=/dev/usb/grow_md127.bak /dev/md127
>>>> mdadm --stop /dev/md127
>>>> umount /dev/md127 # not sure if this is necessary
>>>> resize2fs /dev/md127
>>
>> An fsck could help, if any blocks did get moved.
>>
>> I would not attempt a grow again until you find out why the previous
>> attempt didn't make progress. Check if mdmon is running, and/or
>> compile a fresh copy of mdadm from source. If you don't figure it
>> out, you'll just end up in the same spot again.
>
>
> Oh, one more point: Don't use a backup file. Let mdadm shift the data
> offsets to get the temporary space needed. (It'll run faster, too.)
>
> (I don't see any mdadm --examine reports in the list thread. Did you do
> them and keep the complete output?)
>
> Phil
>

Thanks Phil,

fsck is finding lots and lots of problems. Figure I'll just run fsck -p
and see what happens... not sure what choice i have...

examine output attached here...

Re re-growing, I was hoping that running on a newer mdadm (4.1) might
fix the problem, and if i still encountered it, perhaps running the
following might unstick it:

echo frozen > /sys/block/md0/md/sync_action
echo reshape > /sys/block/md0/md/sync_action

But, I personally have no idea what happened (really), nor why... :-(

thanks,
allie

[-- Attachment #2: examine.txt --]
[-- Type: text/plain, Size: 6802 bytes --]

root@ubuntu-server:/home/ubuntu-server# mdadm --examine /dev/sd[a-g]3
/dev/sda3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : c7303f62:d848d424:269581c8:83a045ec
           Name : ubuntu:2
  Creation Time : Sun Feb 5 23:39:58 2017
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
     Array Size : 11680755712 (11139.64 GiB 11961.09 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 10bdbed5:cb70c8a9:566c384d:ec4c926e
Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Apr 7 13:14:11 2020
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 307cff15 - correct
         Events : 316498
         Layout : left-symmetric
     Chunk Size : 512K
   Device Role : Active device 0
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : c7303f62:d848d424:269581c8:83a045ec
           Name : ubuntu:2
  Creation Time : Sun Feb 5 23:39:58 2017
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
     Array Size : 11680755712 (11139.64 GiB 11961.09 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : cf70dad5:0c9ff5f6:ede689f2:ccee2eb0
Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Apr 7 13:14:11 2020
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 64b3fd8b - correct
         Events : 316498
         Layout : left-symmetric
     Chunk Size : 512K
   Device Role : Active device 1
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : c7303f62:d848d424:269581c8:83a045ec
           Name : ubuntu:2
  Creation Time : Sun Feb 5 23:39:58 2017
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
     Array Size : 11680755712 (11139.64 GiB 11961.09 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : f8839952:eaba2e9c:c2c401d4:3e0592a5
Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Apr 7 13:14:11 2020
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 5d8720d6 - correct
         Events : 316498
         Layout : left-symmetric
     Chunk Size : 512K
   Device Role : Active device 2
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : c7303f62:d848d424:269581c8:83a045ec
           Name : ubuntu:2
  Creation Time : Sun Feb 5 23:39:58 2017
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
     Array Size : 11680755712 (11139.64 GiB 11961.09 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 875a0dbd:965a9986:1b78eb3d:e15fee50
Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Apr 7 13:14:11 2020
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : c7aba50f - correct
         Events : 316498
         Layout : left-symmetric
     Chunk Size : 512K
   Device Role : Active device 3
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sde3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : c7303f62:d848d424:269581c8:83a045ec
           Name : ubuntu:2
  Creation Time : Sun Feb 5 23:39:58 2017
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
     Array Size : 11680755712 (11139.64 GiB 11961.09 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : dc0bda8c:2457fb4c:f87a4bec:8d5b58ed
Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Apr 7 13:14:11 2020
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : a8a4517e - correct
         Events : 316498
         Layout : left-symmetric
     Chunk Size : 512K
   Device Role : Active device 5
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdf3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : c7303f62:d848d424:269581c8:83a045ec
           Name : ubuntu:2
  Creation Time : Sun Feb 5 23:39:58 2017
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
     Array Size : 11680755712 (11139.64 GiB 11961.09 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : dc842dc3:09c910c7:c351c307:e2383d13
Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Apr 7 13:14:11 2020
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 9a69f083 - correct
         Events : 316498
         Layout : left-symmetric
     Chunk Size : 512K
   Device Role : Active device 4
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdg3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : c7303f62:d848d424:269581c8:83a045ec
           Name : ubuntu:2
  Creation Time : Sun Feb 5 23:39:58 2017
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
     Array Size : 11680755712 (11139.64 GiB 11961.09 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 635ef71b:e4add925:30ae4f0a:f6b46611
Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Apr 7 11:37:49 2020
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 52b270d0 - correct
         Events : 316498
         Layout : left-symmetric
     Chunk Size : 512K
   Device Role : spare
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)

^ permalink raw reply [flat|nested] 13+ messages in thread
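The --examine report above is long, and the fields that matter most for a revert or regrow decision are the event counts, the device roles, and whether any member still carries reshape state. Here is a small sketch that pulls those out. It assumes the "key : value" layout shown in the attachment, and it assumes that a member in mid-reshape would print a field whose name starts with "Reshape" (none appear in this report). The names parse_examine and summarize are invented for the example.

```python
import re

# Abbreviated sample taken from the examine.txt attachment in this thread.
EXAMINE_SAMPLE = """\
/dev/sda3:
          Magic : a92b4efc
     Array UUID : c7303f62:d848d424:269581c8:83a045ec
   Raid Devices : 6
    Update Time : Tue Apr 7 13:14:11 2020
         Events : 316498
   Device Role : Active device 0
/dev/sdg3:
          Magic : a92b4efc
     Array UUID : c7303f62:d848d424:269581c8:83a045ec
   Raid Devices : 6
    Update Time : Tue Apr 7 11:37:49 2020
         Events : 316498
   Device Role : spare
"""


def parse_examine(text):
    """Return {device: {field: value}} from `mdadm --examine` style text."""
    devices = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if re.fullmatch(r"/dev/\S+:", stripped):
            current = devices.setdefault(stripped[:-1], {})
        elif current is not None and " : " in stripped:
            # Split on the first " : " only, so UUIDs and times stay intact.
            key, value = stripped.split(" : ", 1)
            current[key.strip()] = value.strip()
    return devices


def summarize(devices):
    """Collect the fields most relevant before retrying a grow."""
    events = {dev: int(f["Events"]) for dev, f in devices.items() if "Events" in f}
    return {
        "events_consistent": len(set(events.values())) <= 1,
        "spares": sorted(
            dev for dev, f in devices.items() if f.get("Device Role") == "spare"
        ),
        # Assumption: reshape state shows up as fields starting with "Reshape".
        "reshaping": sorted(
            dev for dev, f in devices.items()
            if any(key.startswith("Reshape") for key in f)
        ),
    }


if __name__ == "__main__":
    print(summarize(parse_examine(EXAMINE_SAMPLE)))
```

On the sample, the event counts agree, /dev/sdg3 is the only spare, and no member reports reshape state, which is what one would hope to see after a successful revert-reshape.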
* Re: Raid-6 cannot reshape
2020-04-07 13:19 ` Alexander Shenkin
@ 2020-04-07 15:08 ` antlists
2020-04-07 17:04 ` Alexander Shenkin
0 siblings, 1 reply; 13+ messages in thread
From: antlists @ 2020-04-07 15:08 UTC (permalink / raw)
To: Alexander Shenkin, Phil Turmel, Roger Heflin; +Cc: Linux-RAID

On 07/04/2020 14:19, Alexander Shenkin wrote:
> Re re-growing, I was hoping that running on a newer mdadm (4.1) might
> fix the problem, and if i still encountered it, perhaps running the
> following might unstick it:
>
> echo frozen > /sys/block/md0/md/sync_action
> echo reshape > /sys/block/md0/md/sync_action
>
> But, I personally have no idea what happened (really), nor why...:-(

iirc pretty much all these reports come from oldish Ubuntu systems ...

What happened *could* be that you have an updated franken-kernel, plus
an old mdadm, and the mess needs an Igor to stitch it all together...

If you ARE going to try the grow again, I'd use an up-to-date recovery
system to run the grow, and then reboot back in to the old Ubuntu once
your system is back.

And seriously think about upgrading your distro to the latest LTS.

Cheers,
Wol

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Raid-6 cannot reshape
2020-04-07 15:08 ` antlists
@ 2020-04-07 17:04 ` Alexander Shenkin
0 siblings, 0 replies; 13+ messages in thread
From: Alexander Shenkin @ 2020-04-07 17:04 UTC (permalink / raw)
To: antlists, Phil Turmel, Roger Heflin; +Cc: Linux-RAID

Thanks all,

My file system was a total mess, with everything dropped into lost+found.
I've put everything back into what I think are the right directory
structures, and it's nice to not have all my data lost (i think)... but i
may just do a clean install of a new os once i have it up, running, and
grown again... many thanks to all...

(and btw, it's growing nicely with ubuntu 18 at the moment...)

allie

On 4/7/2020 4:08 PM, antlists wrote:
> On 07/04/2020 14:19, Alexander Shenkin wrote:
>> Re re-growing, I was hoping that running on a newer mdadm (4.1) might
>> fix the problem, and if i still encountered it, perhaps running the
>> following might unstick it:
>>
>> echo frozen > /sys/block/md0/md/sync_action
>> echo reshape > /sys/block/md0/md/sync_action
>>
>> But, I personally have no idea what happened (really), nor why...:-(
>
> iirc pretty much all these reports come from oldish Ubuntu systems ...
>
> What happened *could* be that you have an updated franken-kernel, plus
> an old mdadm, and the mess needs an Igor to stitch it all together...
>
> If you ARE going to try the grow again, I'd use an up-to-date recovery
> system to run the grow, and then reboot back in to the old Ubuntu once
> your system is back.
>
> And seriously think about upgrading your distro to the latest LTS.
>
> Cheers,
> Wol

^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2020-04-07 17:04 UTC | newest]
Thread overview: 13+ messages
2020-04-01 10:16 Raid-6 cannot reshape Alexander Shenkin
[not found] ` <6b9b6d37-6325-6515-f693-0ff3b641a67a@shenkin.org>
2020-04-06 15:27 ` Alexander Shenkin
2020-04-06 16:12 ` Roger Heflin
2020-04-06 16:27 ` Wols Lists
2020-04-06 20:34 ` Phil Turmel
2020-04-07 10:25 ` Alexander Shenkin
2020-04-07 11:28 ` Phil Turmel
2020-04-07 11:46 ` Alexander Shenkin
2020-04-07 12:28 ` Phil Turmel
2020-04-07 12:31 ` Phil Turmel
2020-04-07 13:19 ` Alexander Shenkin
2020-04-07 15:08 ` antlists
2020-04-07 17:04 ` Alexander Shenkin