From: Yu Kuai <yukuai1@huaweicloud.com>
To: Jason Moss <phate408@gmail.com>, Yu Kuai <yukuai1@huaweicloud.com>
Cc: linux-raid@vger.kernel.org,
"yangerkun@huawei.com" <yangerkun@huawei.com>,
"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: Reshape Failure
Date: Sun, 10 Sep 2023 10:45:05 +0800
Message-ID: <ee4d0dfe-a42c-1a84-73b1-2f5a8a78b428@huaweicloud.com>
In-Reply-To: <CA+w1tCf0RriSXMGGKCK0J9wYhbwctEkDAAMVYtRGQ6fmJpUbXA@mail.gmail.com>

Hi,
On 2023/09/07 14:19, Jason Moss wrote:
> Hi,
>
> On Wed, Sep 6, 2023 at 11:13 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> Hi,
>>
>> On 2023/09/07 13:44, Jason Moss wrote:
>>> Hi,
>>>
>>> On Wed, Sep 6, 2023 at 6:38 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 2023/09/06 22:05, Jason Moss wrote:
>>>>> Hi Kuai,
>>>>>
>>>>> I ended up using gdb rather than addr2line, as that output didn't give
>>>>> me the global offset. Maybe there's a better way, but this seems to be
>>>>> similar to what I expected.
>>>>
>>>> It's ok.
>>>>>
>>>>> (gdb) list *(reshape_request+0x416)
>>>>> 0x11566 is in reshape_request (drivers/md/raid5.c:6396).
>>>>> 6391 if ((mddev->reshape_backwards
>>>>> 6392 ? (safepos > writepos && readpos < writepos)
>>>>> 6393 : (safepos < writepos && readpos > writepos)) ||
>>>>> 6394 time_after(jiffies, conf->reshape_checkpoint + 10*HZ)) {
>>>>> 6395 /* Cannot proceed until we've updated the
>>>>> superblock... */
>>>>> 6396 wait_event(conf->wait_for_overlap,
>>>>> 6397 atomic_read(&conf->reshape_stripes)==0
>>>>> 6398 || test_bit(MD_RECOVERY_INTR,
>>>>
>>>> If reshape is stuck here, that means either:
>>>>
>>>> 1) reshape IO is stuck somewhere and never completes; or
>>>> 2) the reshape_stripes counter is broken.
>>>>
>>>> Can you read the following debugfs files to verify whether IO is stuck
>>>> in the underlying disks?
>>>>
>>>> /sys/kernel/debug/block/[disk]/hctx*/{sched_tags,tags,busy,dispatch}
>>>>
>>>
>>> I'll attach this below.
>>>
>>>> Furthermore, echoing "frozen" should break the above wait_event() because
>>>> 'MD_RECOVERY_INTR' will be set; however, based on your description, the
>>>> problem still exists. Can you collect the stack and addr2line result of
>>>> the stuck thread after echoing "frozen"?
>>>>
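For reference, a rough way to grab all of that in one go (just a sketch: run
it as root, point gdb at the unstripped raid456.ko as you did before, and the
pid list and offsets will of course differ on your system):

    # dump the kernel stack of every D-state task
    for pid in $(ps -eo pid,stat | awk '$2 ~ /^D/ {print $1}'); do
        echo "== pid $pid =="
        cat /proc/$pid/stack
    done

    # then map a "function+offset" entry to a source line, as before:
    gdb raid456.ko
    (gdb) list *(reshape_request+0x416)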
>>>
>>> I echoed "frozen" to /sys/block/md0/md/sync_action; however, the echo
>>> call has been sitting for about 30 minutes, maybe longer, and has not
>>> returned. Here's the current state:
>>>
>>> root 454 0.0 0.0 0 0 ? I< Sep05 0:00 [raid5wq]
>>> root 455 0.0 0.0 34680 5988 ? D Sep05 0:00 (udev-worker)
>>
>> Can you also show the stack of the udev-worker, and of any other thread in
>> the 'D' state? I think the above "echo frozen" is probably also stuck in
>> the D state.
>>
>
> As requested:
>
> ps aux | grep D
> USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
> root 455 0.0 0.0 34680 5988 ? D Sep05 0:00 (udev-worker)
> root 457 0.0 0.0 0 0 ? D Sep05 0:00 [md0_reshape]
> root 45507 0.0 0.0 8272 4736 pts/1 Ds+ Sep05 0:00 -bash
> jason 279169 0.0 0.0 6976 2560 pts/0 S+ 23:16 0:00
> grep --color=auto D
>
> [jason@arch md]$ sudo cat /proc/455/stack
> [<0>] wait_woken+0x54/0x60
> [<0>] raid5_make_request+0x5fe/0x12f0 [raid456]
> [<0>] md_handle_request+0x135/0x220 [md_mod]
> [<0>] __submit_bio+0xb3/0x170
> [<0>] submit_bio_noacct_nocheck+0x159/0x370
> [<0>] block_read_full_folio+0x21c/0x340
> [<0>] filemap_read_folio+0x40/0xd0
> [<0>] filemap_get_pages+0x475/0x630
> [<0>] filemap_read+0xd9/0x350
> [<0>] blkdev_read_iter+0x6b/0x1b0
> [<0>] vfs_read+0x201/0x350
> [<0>] ksys_read+0x6f/0xf0
> [<0>] do_syscall_64+0x60/0x90
> [<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
>
>
> [jason@arch md]$ sudo cat /proc/45507/stack
> [<0>] kthread_stop+0x6a/0x180
> [<0>] md_unregister_thread+0x29/0x60 [md_mod]
> [<0>] action_store+0x168/0x320 [md_mod]
> [<0>] md_attr_store+0x86/0xf0 [md_mod]
> [<0>] kernfs_fop_write_iter+0x136/0x1d0
> [<0>] vfs_write+0x23e/0x420
> [<0>] ksys_write+0x6f/0xf0
> [<0>] do_syscall_64+0x60/0x90
> [<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
>
> Please let me know if you'd like me to identify the lines for any of those.
>
That's enough.
> Thanks,
> Jason
>
>
>>> root 456 99.9 0.0 0 0 ? R Sep05 1543:40 [md0_raid6]
>>> root 457 0.0 0.0 0 0 ? D Sep05 0:00 [md0_reshape]
>>>
>>> [jason@arch md]$ sudo cat /proc/457/stack
>>> [<0>] md_do_sync+0xef2/0x11d0 [md_mod]
>>> [<0>] md_thread+0xae/0x190 [md_mod]
>>> [<0>] kthread+0xe8/0x120
>>> [<0>] ret_from_fork+0x34/0x50
>>> [<0>] ret_from_fork_asm+0x1b/0x30
>>>
>>> Reading symbols from md-mod.ko...
>>> (gdb) list *(md_do_sync+0xef2)
>>> 0xb3a2 is in md_do_sync (drivers/md/md.c:9035).
>>> 9030 ? "interrupted" : "done");
>>> 9031 /*
>>> 9032 * this also signals 'finished resyncing' to md_stop
>>> 9033 */
>>> 9034 blk_finish_plug(&plug);
>>> 9035 wait_event(mddev->recovery_wait,
>>> !atomic_read(&mddev->recovery_active));
>>
>> That is also waiting for reshape IO to be done, in the common layer.
>>
>>> 9036
>>> 9037 if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
>>> 9038 !test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
>>> 9039 mddev->curr_resync >= MD_RESYNC_ACTIVE) {
>>>
>>>
>>> The debugfs info:
>>>
>>> [root@arch ~]# cat
>>> /sys/kernel/debug/block/sda/hctx*/{sched_tags,tags,busy,dispatch}
>>
>> Only sched_tags was read; sorry, I didn't mean for you to use that exact
>> command literally.
>>
>> Perhaps you can use the following command instead:
>>
>> find /sys/kernel/debug/block/sda/ -type f | xargs grep .
>>
>>> nr_tags=64
>>> nr_reserved_tags=0
>>> active_queues=0
>>>
>>> bitmap_tags:
>>> depth=64
>>> busy=1
>>
>> This means there is one IO in sda; however, I need more information to
>> determine where this IO is. Please also make sure no other thread that can
>> read from or write to sda is running. You can use "iostat -dmx 1" and
>> observe for a while to confirm that there is no new IO.
Can you help with this? Please confirm there is no new IO, and then collect
the debugfs info.
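Something like the following should do it (just a sketch; run it as root, and
adjust the disk list to whatever members the array currently has):

    # observe for a while to confirm no new IO reaches the member disks
    iostat -dmx 1

    # then dump every blk-mq debugfs file for each member disk
    for d in sda sdb sdd sdf sdh sdi sdj; do
        echo "===== $d ====="
        find /sys/kernel/debug/block/$d/ -type f | xargs grep . 2>/dev/null
    done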
Thanks,
Kuai
>>
>> Thanks,
>> Kuai
>>
>>> cleared=55
>>> bits_per_word=16
>>> map_nr=4
>>> alloc_hint={40, 20, 46, 0}
>>> wake_batch=8
>>> wake_index=0
>>> ws_active=0
>>> ws={
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> }
>>> round_robin=1
>>> min_shallow_depth=48
>>> nr_tags=32
>>> nr_reserved_tags=0
>>> active_queues=0
>>>
>>> bitmap_tags:
>>> depth=32
>>> busy=0
>>> cleared=27
>>> bits_per_word=8
>>> map_nr=4
>>> alloc_hint={19, 26, 5, 21}
>>> wake_batch=4
>>> wake_index=0
>>> ws_active=0
>>> ws={
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> }
>>> round_robin=1
>>> min_shallow_depth=4294967295
>>
>>
>>>
>>>
>>> [root@arch ~]# cat
>>> /sys/kernel/debug/block/sdb/hctx*/{sched_tags,tags,busy,dispatch}
>>> nr_tags=64
>>> nr_reserved_tags=0
>>> active_queues=0
>>>
>>> bitmap_tags:
>>> depth=64
>>> busy=1
>>> cleared=56
>>> bits_per_word=16
>>> map_nr=4
>>> alloc_hint={57, 43, 14, 19}
>>> wake_batch=8
>>> wake_index=0
>>> ws_active=0
>>> ws={
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> }
>>> round_robin=1
>>> min_shallow_depth=48
>>> nr_tags=32
>>> nr_reserved_tags=0
>>> active_queues=0
>>>
>>> bitmap_tags:
>>> depth=32
>>> busy=0
>>> cleared=24
>>> bits_per_word=8
>>> map_nr=4
>>> alloc_hint={17, 13, 23, 17}
>>> wake_batch=4
>>> wake_index=0
>>> ws_active=0
>>> ws={
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> }
>>> round_robin=1
>>> min_shallow_depth=4294967295
>>>
>>>
>>> [root@arch ~]# cat
>>> /sys/kernel/debug/block/sdd/hctx*/{sched_tags,tags,busy,dispatch}
>>> nr_tags=64
>>> nr_reserved_tags=0
>>> active_queues=0
>>>
>>> bitmap_tags:
>>> depth=64
>>> busy=1
>>> cleared=51
>>> bits_per_word=16
>>> map_nr=4
>>> alloc_hint={36, 43, 15, 7}
>>> wake_batch=8
>>> wake_index=0
>>> ws_active=0
>>> ws={
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> }
>>> round_robin=1
>>> min_shallow_depth=48
>>> nr_tags=32
>>> nr_reserved_tags=0
>>> active_queues=0
>>>
>>> bitmap_tags:
>>> depth=32
>>> busy=0
>>> cleared=31
>>> bits_per_word=8
>>> map_nr=4
>>> alloc_hint={0, 15, 1, 22}
>>> wake_batch=4
>>> wake_index=0
>>> ws_active=0
>>> ws={
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> }
>>> round_robin=1
>>> min_shallow_depth=4294967295
>>>
>>>
>>> [root@arch ~]# cat
>>> /sys/kernel/debug/block/sdf/hctx*/{sched_tags,tags,busy,dispatch}
>>> nr_tags=256
>>> nr_reserved_tags=0
>>> active_queues=0
>>>
>>> bitmap_tags:
>>> depth=256
>>> busy=1
>>> cleared=131
>>> bits_per_word=64
>>> map_nr=4
>>> alloc_hint={125, 46, 83, 205}
>>> wake_batch=8
>>> wake_index=0
>>> ws_active=0
>>> ws={
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> }
>>> round_robin=0
>>> min_shallow_depth=192
>>> nr_tags=10104
>>> nr_reserved_tags=0
>>> active_queues=0
>>>
>>> bitmap_tags:
>>> depth=10104
>>> busy=0
>>> cleared=235
>>> bits_per_word=64
>>> map_nr=158
>>> alloc_hint={503, 2913, 9827, 9851}
>>> wake_batch=8
>>> wake_index=0
>>> ws_active=0
>>> ws={
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> }
>>> round_robin=0
>>> min_shallow_depth=4294967295
>>>
>>>
>>> [root@arch ~]# cat
>>> /sys/kernel/debug/block/sdh/hctx*/{sched_tags,tags,busy,dispatch}
>>> nr_tags=256
>>> nr_reserved_tags=0
>>> active_queues=0
>>>
>>> bitmap_tags:
>>> depth=256
>>> busy=1
>>> cleared=97
>>> bits_per_word=64
>>> map_nr=4
>>> alloc_hint={144, 144, 127, 254}
>>> wake_batch=8
>>> wake_index=0
>>> ws_active=0
>>> ws={
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> }
>>> round_robin=0
>>> min_shallow_depth=192
>>> nr_tags=10104
>>> nr_reserved_tags=0
>>> active_queues=0
>>>
>>> bitmap_tags:
>>> depth=10104
>>> busy=0
>>> cleared=235
>>> bits_per_word=64
>>> map_nr=158
>>> alloc_hint={503, 2913, 9827, 9851}
>>> wake_batch=8
>>> wake_index=0
>>> ws_active=0
>>> ws={
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> }
>>> round_robin=0
>>> min_shallow_depth=4294967295
>>>
>>>
>>> [root@arch ~]# cat
>>> /sys/kernel/debug/block/sdi/hctx*/{sched_tags,tags,busy,dispatch}
>>> nr_tags=256
>>> nr_reserved_tags=0
>>> active_queues=0
>>>
>>> bitmap_tags:
>>> depth=256
>>> busy=1
>>> cleared=34
>>> bits_per_word=64
>>> map_nr=4
>>> alloc_hint={197, 20, 1, 230}
>>> wake_batch=8
>>> wake_index=0
>>> ws_active=0
>>> ws={
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> }
>>> round_robin=0
>>> min_shallow_depth=192
>>> nr_tags=10104
>>> nr_reserved_tags=0
>>> active_queues=0
>>>
>>> bitmap_tags:
>>> depth=10104
>>> busy=0
>>> cleared=235
>>> bits_per_word=64
>>> map_nr=158
>>> alloc_hint={503, 2913, 9827, 9851}
>>> wake_batch=8
>>> wake_index=0
>>> ws_active=0
>>> ws={
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> }
>>> round_robin=0
>>> min_shallow_depth=4294967295
>>>
>>>
>>> [root@arch ~]# cat
>>> /sys/kernel/debug/block/sdj/hctx*/{sched_tags,tags,busy,dispatch}
>>> nr_tags=256
>>> nr_reserved_tags=0
>>> active_queues=0
>>>
>>> bitmap_tags:
>>> depth=256
>>> busy=1
>>> cleared=27
>>> bits_per_word=64
>>> map_nr=4
>>> alloc_hint={132, 74, 129, 76}
>>> wake_batch=8
>>> wake_index=0
>>> ws_active=0
>>> ws={
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> }
>>> round_robin=0
>>> min_shallow_depth=192
>>> nr_tags=10104
>>> nr_reserved_tags=0
>>> active_queues=0
>>>
>>> bitmap_tags:
>>> depth=10104
>>> busy=0
>>> cleared=235
>>> bits_per_word=64
>>> map_nr=158
>>> alloc_hint={503, 2913, 9827, 9851}
>>> wake_batch=8
>>> wake_index=0
>>> ws_active=0
>>> ws={
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> {.wait=inactive},
>>> }
>>> round_robin=0
>>> min_shallow_depth=4294967295
>>>
>>>
>>> Thanks for your continued assistance with this!
>>> Jason
>>>
>>>
>>>> Thanks,
>>>> Kuai
>>>>
>>>>> &mddev->recovery));
>>>>> 6399 if (atomic_read(&conf->reshape_stripes) != 0)
>>>>> 6400 return 0;
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Mon, Sep 4, 2023 at 6:08 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On 2023/09/05 0:38, Jason Moss wrote:
>>>>>>> Hi Kuai,
>>>>>>>
>>>>>>> Thank you for the suggestion; I was previously on 5.15.0. I've built
>>>>>>> an environment with 6.5.0.1 now and assembled the array there, but the
>>>>>>> same problem happens. It reshaped for 20-30 seconds, then completely
>>>>>>> stopped.
>>>>>>>
>>>>>>> Processes and /proc/<PID>/stack output:
>>>>>>> root 24593 0.0 0.0 0 0 ? I< 09:22 0:00 [raid5wq]
>>>>>>> root 24594 96.5 0.0 0 0 ? R 09:22 2:29 [md0_raid6]
>>>>>>> root 24595 0.3 0.0 0 0 ? D 09:22 0:00 [md0_reshape]
>>>>>>>
>>>>>>> [root@arch ~]# cat /proc/24593/stack
>>>>>>> [<0>] rescuer_thread+0x2b0/0x3b0
>>>>>>> [<0>] kthread+0xe8/0x120
>>>>>>> [<0>] ret_from_fork+0x34/0x50
>>>>>>> [<0>] ret_from_fork_asm+0x1b/0x30
>>>>>>>
>>>>>>> [root@arch ~]# cat /proc/24594/stack
>>>>>>>
>>>>>>> [root@arch ~]# cat /proc/24595/stack
>>>>>>> [<0>] reshape_request+0x416/0x9f0 [raid456]
>>>>>> Can you provide the addr2line result? Let's see where reshape_request()
>>>>>> is stuck first.
>>>>>>
>>>>>> Thanks,
>>>>>> Kuai
>>>>>>
>>>>>>> [<0>] raid5_sync_request+0x2fc/0x3d0 [raid456]
>>>>>>> [<0>] md_do_sync+0x7d6/0x11d0 [md_mod]
>>>>>>> [<0>] md_thread+0xae/0x190 [md_mod]
>>>>>>> [<0>] kthread+0xe8/0x120
>>>>>>> [<0>] ret_from_fork+0x34/0x50
>>>>>>> [<0>] ret_from_fork_asm+0x1b/0x30
>>>>>>>
>>>>>>> Please let me know if there's a better way to provide the stack info.
>>>>>>>
>>>>>>> Thank you
>>>>>>>
>>>>>>> On Sun, Sep 3, 2023 at 6:41 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 2023/09/04 5:39, Jason Moss wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I recently attempted to add a new drive to my 8-drive RAID 6 array,
>>>>>>>>> growing it to 9 drives. I've done similar before with the same array,
>>>>>>>>> having previously grown it from 6 drives to 7 and then from 7 to 8
>>>>>>>>> with no issues. Drives are WD Reds, most older than 2019, some
>>>>>>>>> (including the newest) newer, but all confirmed CMR and not SMR.
>>>>>>>>>
>>>>>>>>> Process used to expand the array:
>>>>>>>>> mdadm --add /dev/md0 /dev/sdb1
>>>>>>>>> mdadm --grow --raid-devices=9 --backup-file=/root/grow_md0.bak /dev/md0
>>>>>>>>>
>>>>>>>>> The reshape started off fine, the process was underway, and the volume
>>>>>>>>> was still usable as expected. However, 15-30 minutes into the reshape,
>>>>>>>>> I lost access to the contents of the drive. Checking /proc/mdstat, the
>>>>>>>>> reshape was stopped at 0.6% with the counter not incrementing at all.
>>>>>>>>> Any process accessing the array would just hang until killed. I waited
>>>>>>>>
>>>>>>>> What kernel version are you using? It will be very helpful if you can
>>>>>>>> collect the stacks of all stuck threads. There is a known deadlock for
>>>>>>>> raid5 related to reshape, and it's fixed in v6.5:
>>>>>>>>
>>>>>>>> https://lore.kernel.org/r/20230512015610.821290-6-yukuai1@huaweicloud.com
>>>>>>>>
>>>>>>>>> a half hour and there was still no further change to the counter. At
>>>>>>>>> this point, I restarted the server and found that when it came back up
>>>>>>>>> it would begin reshaping again, but only very briefly (under 30
>>>>>>>>> seconds), though the counter did increase during that time.
>>>>>>>>>
>>>>>>>>> I searched furiously for ideas and tried stopping and reassembling the
>>>>>>>>> array, assembling with an invalid-backup flag, echoing "frozen" then
>>>>>>>>> "reshape" to the sync_action file, and echoing "max" to the sync_max
>>>>>>>>> file. Nothing ever seemed to make a difference.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Don't do this before v6.5; echoing "reshape" while a reshape is still
>>>>>>>> in progress will corrupt your data:
>>>>>>>>
>>>>>>>> https://lore.kernel.org/r/20230512015610.821290-3-yukuai1@huaweicloud.com
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Kuai
>>>>>>>>
>>>>>>>>> Here is where I slightly panicked, worried that I'd borked my array.
>>>>>>>>> I powered off the server again and disconnected the new drive that
>>>>>>>>> had just been added, assuming that since it was the change, it might
>>>>>>>>> be the problem despite having burn-in tested it, and figuring that I'd
>>>>>>>>> rush order a new drive so long as the reshape continued, and could
>>>>>>>>> then just rebuild onto the new drive once the reshape finished.
>>>>>>>>> However, this made no difference and the array still did not rebuild.
>>>>>>>>>
>>>>>>>>> Much searching later, I'd found nothing substantially different from
>>>>>>>>> what I'd already tried, and one of the common threads in other people's
>>>>>>>>> issues was bad drives, so I ran a self-test against each of the
>>>>>>>>> existing drives and found one drive that failed the read test.
>>>>>>>>> Thinking I had the culprit now, I dropped that drive out of the array
>>>>>>>>> and assembled the array again, but the same behavior persists. The
>>>>>>>>> array reshapes very briefly, then completely stops.
>>>>>>>>>
>>>>>>>>> Down to 0 drives of redundancy (in the reshaped section at least), not
>>>>>>>>> finding any new ideas on any of the forums, mailing list, wiki, etc,
>>>>>>>>> and very frustrated, I took a break, bought all new drives to build a
>>>>>>>>> new array in another server and restored from a backup. However, there
>>>>>>>>> is still some data not captured by the most recent backup that I would
>>>>>>>>> like to recover, and I'd also like to solve the problem purely to
>>>>>>>>> understand what happened and how to recover in the future.
>>>>>>>>>
>>>>>>>>> Is there anything else I should try to recover this array, or is this
>>>>>>>>> a lost cause?
>>>>>>>>>
>>>>>>>>> Details requested by the wiki to follow and I'm happy to collect any
>>>>>>>>> further data that would assist. /dev/sdb is the new drive that was
>>>>>>>>> added, then disconnected. /dev/sdh is the drive that failed a
>>>>>>>>> self-test and was removed from the array.
>>>>>>>>>
>>>>>>>>> Thank you in advance for any help provided!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> $ uname -a
>>>>>>>>> Linux Blyth 5.15.0-76-generic #83-Ubuntu SMP Thu Jun 15 19:16:32 UTC
>>>>>>>>> 2023 x86_64 x86_64 x86_64 GNU/Linux
>>>>>>>>>
>>>>>>>>> $ mdadm --version
>>>>>>>>> mdadm - v4.2 - 2021-12-30
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> $ sudo smartctl -H -i -l scterc /dev/sda
>>>>>>>>> smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-76-generic] (local build)
>>>>>>>>> Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
>>>>>>>>>
>>>>>>>>> === START OF INFORMATION SECTION ===
>>>>>>>>> Model Family: Western Digital Red
>>>>>>>>> Device Model: WDC WD30EFRX-68EUZN0
>>>>>>>>> Serial Number: WD-WCC4N7AT7R7X
>>>>>>>>> LU WWN Device Id: 5 0014ee 268545f93
>>>>>>>>> Firmware Version: 82.00A82
>>>>>>>>> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
>>>>>>>>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>>>>>>>>> Rotation Rate: 5400 rpm
>>>>>>>>> Device is: In smartctl database [for details use: -P show]
>>>>>>>>> ATA Version is: ACS-2 (minor revision not indicated)
>>>>>>>>> SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>>>>>>>>> Local Time is: Sun Sep 3 13:27:55 2023 PDT
>>>>>>>>> SMART support is: Available - device has SMART capability.
>>>>>>>>> SMART support is: Enabled
>>>>>>>>>
>>>>>>>>> === START OF READ SMART DATA SECTION ===
>>>>>>>>> SMART overall-health self-assessment test result: PASSED
>>>>>>>>>
>>>>>>>>> SCT Error Recovery Control:
>>>>>>>>> Read: 70 (7.0 seconds)
>>>>>>>>> Write: 70 (7.0 seconds)
>>>>>>>>>
>>>>>>>>> $ sudo smartctl -H -i -l scterc /dev/sda
>>>>>>>>> smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-76-generic] (local build)
>>>>>>>>> Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
>>>>>>>>>
>>>>>>>>> === START OF INFORMATION SECTION ===
>>>>>>>>> Model Family: Western Digital Red
>>>>>>>>> Device Model: WDC WD30EFRX-68EUZN0
>>>>>>>>> Serial Number: WD-WCC4N7AT7R7X
>>>>>>>>> LU WWN Device Id: 5 0014ee 268545f93
>>>>>>>>> Firmware Version: 82.00A82
>>>>>>>>> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
>>>>>>>>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>>>>>>>>> Rotation Rate: 5400 rpm
>>>>>>>>> Device is: In smartctl database [for details use: -P show]
>>>>>>>>> ATA Version is: ACS-2 (minor revision not indicated)
>>>>>>>>> SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>>>>>>>>> Local Time is: Sun Sep 3 13:28:16 2023 PDT
>>>>>>>>> SMART support is: Available - device has SMART capability.
>>>>>>>>> SMART support is: Enabled
>>>>>>>>>
>>>>>>>>> === START OF READ SMART DATA SECTION ===
>>>>>>>>> SMART overall-health self-assessment test result: PASSED
>>>>>>>>>
>>>>>>>>> SCT Error Recovery Control:
>>>>>>>>> Read: 70 (7.0 seconds)
>>>>>>>>> Write: 70 (7.0 seconds)
>>>>>>>>>
>>>>>>>>> $ sudo smartctl -H -i -l scterc /dev/sdb
>>>>>>>>> smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-76-generic] (local build)
>>>>>>>>> Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
>>>>>>>>>
>>>>>>>>> === START OF INFORMATION SECTION ===
>>>>>>>>> Model Family: Western Digital Red
>>>>>>>>> Device Model: WDC WD30EFRX-68EUZN0
>>>>>>>>> Serial Number: WD-WXG1A8UGLS42
>>>>>>>>> LU WWN Device Id: 5 0014ee 2b75ef53b
>>>>>>>>> Firmware Version: 80.00A80
>>>>>>>>> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
>>>>>>>>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>>>>>>>>> Rotation Rate: 5400 rpm
>>>>>>>>> Device is: In smartctl database [for details use: -P show]
>>>>>>>>> ATA Version is: ACS-2 (minor revision not indicated)
>>>>>>>>> SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>>>>>>>>> Local Time is: Sun Sep 3 13:28:19 2023 PDT
>>>>>>>>> SMART support is: Available - device has SMART capability.
>>>>>>>>> SMART support is: Enabled
>>>>>>>>>
>>>>>>>>> === START OF READ SMART DATA SECTION ===
>>>>>>>>> SMART overall-health self-assessment test result: PASSED
>>>>>>>>>
>>>>>>>>> SCT Error Recovery Control:
>>>>>>>>> Read: 70 (7.0 seconds)
>>>>>>>>> Write: 70 (7.0 seconds)
>>>>>>>>>
>>>>>>>>> $ sudo smartctl -H -i -l scterc /dev/sdc
>>>>>>>>> smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-76-generic] (local build)
>>>>>>>>> Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
>>>>>>>>>
>>>>>>>>> === START OF INFORMATION SECTION ===
>>>>>>>>> Model Family: Western Digital Red
>>>>>>>>> Device Model: WDC WD30EFRX-68EUZN0
>>>>>>>>> Serial Number: WD-WCC4N4HYL32Y
>>>>>>>>> LU WWN Device Id: 5 0014ee 2630752f8
>>>>>>>>> Firmware Version: 82.00A82
>>>>>>>>> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
>>>>>>>>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>>>>>>>>> Rotation Rate: 5400 rpm
>>>>>>>>> Device is: In smartctl database [for details use: -P show]
>>>>>>>>> ATA Version is: ACS-2 (minor revision not indicated)
>>>>>>>>> SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>>>>>>>>> Local Time is: Sun Sep 3 13:28:20 2023 PDT
>>>>>>>>> SMART support is: Available - device has SMART capability.
>>>>>>>>> SMART support is: Enabled
>>>>>>>>>
>>>>>>>>> === START OF READ SMART DATA SECTION ===
>>>>>>>>> SMART overall-health self-assessment test result: PASSED
>>>>>>>>>
>>>>>>>>> SCT Error Recovery Control:
>>>>>>>>> Read: 70 (7.0 seconds)
>>>>>>>>> Write: 70 (7.0 seconds)
>>>>>>>>>
>>>>>>>>> $ sudo smartctl -H -i -l scterc /dev/sdd
>>>>>>>>> smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-76-generic] (local build)
>>>>>>>>> Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
>>>>>>>>>
>>>>>>>>> === START OF INFORMATION SECTION ===
>>>>>>>>> Model Family: Western Digital Red
>>>>>>>>> Device Model: WDC WD30EFRX-68N32N0
>>>>>>>>> Serial Number: WD-WCC7K1FF6DYK
>>>>>>>>> LU WWN Device Id: 5 0014ee 2ba952a30
>>>>>>>>> Firmware Version: 82.00A82
>>>>>>>>> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
>>>>>>>>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>>>>>>>>> Rotation Rate: 5400 rpm
>>>>>>>>> Form Factor: 3.5 inches
>>>>>>>>> Device is: In smartctl database [for details use: -P show]
>>>>>>>>> ATA Version is: ACS-3 T13/2161-D revision 5
>>>>>>>>> SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
>>>>>>>>> Local Time is: Sun Sep 3 13:28:21 2023 PDT
>>>>>>>>> SMART support is: Available - device has SMART capability.
>>>>>>>>> SMART support is: Enabled
>>>>>>>>>
>>>>>>>>> === START OF READ SMART DATA SECTION ===
>>>>>>>>> SMART overall-health self-assessment test result: PASSED
>>>>>>>>>
>>>>>>>>> SCT Error Recovery Control:
>>>>>>>>> Read: 70 (7.0 seconds)
>>>>>>>>> Write: 70 (7.0 seconds)
>>>>>>>>>
>>>>>>>>> $ sudo smartctl -H -i -l scterc /dev/sde
>>>>>>>>> smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-76-generic] (local build)
>>>>>>>>> Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
>>>>>>>>>
>>>>>>>>> === START OF INFORMATION SECTION ===
>>>>>>>>> Model Family: Western Digital Red
>>>>>>>>> Device Model: WDC WD30EFRX-68EUZN0
>>>>>>>>> Serial Number: WD-WCC4N5ZHTRJF
>>>>>>>>> LU WWN Device Id: 5 0014ee 2b88b83bb
>>>>>>>>> Firmware Version: 82.00A82
>>>>>>>>> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
>>>>>>>>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>>>>>>>>> Rotation Rate: 5400 rpm
>>>>>>>>> Device is: In smartctl database [for details use: -P show]
>>>>>>>>> ATA Version is: ACS-2 (minor revision not indicated)
>>>>>>>>> SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>>>>>>>>> Local Time is: Sun Sep 3 13:28:22 2023 PDT
>>>>>>>>> SMART support is: Available - device has SMART capability.
>>>>>>>>> SMART support is: Enabled
>>>>>>>>>
>>>>>>>>> === START OF READ SMART DATA SECTION ===
>>>>>>>>> SMART overall-health self-assessment test result: PASSED
>>>>>>>>>
>>>>>>>>> SCT Error Recovery Control:
>>>>>>>>> Read: 70 (7.0 seconds)
>>>>>>>>> Write: 70 (7.0 seconds)
>>>>>>>>>
>>>>>>>>> $ sudo smartctl -H -i -l scterc /dev/sdf
>>>>>>>>> smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-76-generic] (local build)
>>>>>>>>> Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
>>>>>>>>>
>>>>>>>>> === START OF INFORMATION SECTION ===
>>>>>>>>> Model Family: Western Digital Red
>>>>>>>>> Device Model: WDC WD30EFRX-68AX9N0
>>>>>>>>> Serial Number: WD-WMC1T3804790
>>>>>>>>> LU WWN Device Id: 5 0014ee 6036b6826
>>>>>>>>> Firmware Version: 80.00A80
>>>>>>>>> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
>>>>>>>>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>>>>>>>>> Device is: In smartctl database [for details use: -P show]
>>>>>>>>> ATA Version is: ACS-2 (minor revision not indicated)
>>>>>>>>> SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>>>>>>>>> Local Time is: Sun Sep 3 13:28:23 2023 PDT
>>>>>>>>> SMART support is: Available - device has SMART capability.
>>>>>>>>> SMART support is: Enabled
>>>>>>>>>
>>>>>>>>> === START OF READ SMART DATA SECTION ===
>>>>>>>>> SMART overall-health self-assessment test result: PASSED
>>>>>>>>>
>>>>>>>>> SCT Error Recovery Control:
>>>>>>>>> Read: 70 (7.0 seconds)
>>>>>>>>> Write: 70 (7.0 seconds)
>>>>>>>>>
>>>>>>>>> $ sudo smartctl -H -i -l scterc /dev/sdg
>>>>>>>>> smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-76-generic] (local build)
>>>>>>>>> Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
>>>>>>>>>
>>>>>>>>> === START OF INFORMATION SECTION ===
>>>>>>>>> Model Family: Western Digital Red
>>>>>>>>> Device Model: WDC WD30EFRX-68EUZN0
>>>>>>>>> Serial Number: WD-WMC4N0H692Z9
>>>>>>>>> LU WWN Device Id: 5 0014ee 65af39740
>>>>>>>>> Firmware Version: 82.00A82
>>>>>>>>> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
>>>>>>>>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>>>>>>>>> Rotation Rate: 5400 rpm
>>>>>>>>> Device is: In smartctl database [for details use: -P show]
>>>>>>>>> ATA Version is: ACS-2 (minor revision not indicated)
>>>>>>>>> SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>>>>>>>>> Local Time is: Sun Sep 3 13:28:24 2023 PDT
>>>>>>>>> SMART support is: Available - device has SMART capability.
>>>>>>>>> SMART support is: Enabled
>>>>>>>>>
>>>>>>>>> === START OF READ SMART DATA SECTION ===
>>>>>>>>> SMART overall-health self-assessment test result: PASSED
>>>>>>>>>
>>>>>>>>> SCT Error Recovery Control:
>>>>>>>>> Read: 70 (7.0 seconds)
>>>>>>>>> Write: 70 (7.0 seconds)
>>>>>>>>>
>>>>>>>>> $ sudo smartctl -H -i -l scterc /dev/sdh
>>>>>>>>> smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-76-generic] (local build)
>>>>>>>>> Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
>>>>>>>>>
>>>>>>>>> === START OF INFORMATION SECTION ===
>>>>>>>>> Model Family: Western Digital Red
>>>>>>>>> Device Model: WDC WD30EFRX-68EUZN0
>>>>>>>>> Serial Number: WD-WMC4N0K5S750
>>>>>>>>> LU WWN Device Id: 5 0014ee 6b048d9ca
>>>>>>>>> Firmware Version: 82.00A82
>>>>>>>>> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
>>>>>>>>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>>>>>>>>> Rotation Rate: 5400 rpm
>>>>>>>>> Device is: In smartctl database [for details use: -P show]
>>>>>>>>> ATA Version is: ACS-2 (minor revision not indicated)
>>>>>>>>> SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>>>>>>>>> Local Time is: Sun Sep 3 13:28:24 2023 PDT
>>>>>>>>> SMART support is: Available - device has SMART capability.
>>>>>>>>> SMART support is: Enabled
>>>>>>>>>
>>>>>>>>> === START OF READ SMART DATA SECTION ===
>>>>>>>>> SMART overall-health self-assessment test result: PASSED
>>>>>>>>>
>>>>>>>>> SCT Error Recovery Control:
>>>>>>>>> Read: 70 (7.0 seconds)
>>>>>>>>> Write: 70 (7.0 seconds)
>>>>>>>>>
>>>>>>>>> $ sudo smartctl -H -i -l scterc /dev/sdi
>>>>>>>>> smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-76-generic] (local build)
>>>>>>>>> Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
>>>>>>>>>
>>>>>>>>> === START OF INFORMATION SECTION ===
>>>>>>>>> Model Family: Western Digital Red
>>>>>>>>> Device Model: WDC WD30EFRX-68AX9N0
>>>>>>>>> Serial Number: WD-WMC1T1502475
>>>>>>>>> LU WWN Device Id: 5 0014ee 058d2e5cb
>>>>>>>>> Firmware Version: 80.00A80
>>>>>>>>> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
>>>>>>>>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>>>>>>>>> Device is: In smartctl database [for details use: -P show]
>>>>>>>>> ATA Version is: ACS-2 (minor revision not indicated)
>>>>>>>>> SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>>>>>>>>> Local Time is: Sun Sep 3 13:28:27 2023 PDT
>>>>>>>>> SMART support is: Available - device has SMART capability.
>>>>>>>>> SMART support is: Enabled
>>>>>>>>>
>>>>>>>>> === START OF READ SMART DATA SECTION ===
>>>>>>>>> SMART overall-health self-assessment test result: PASSED
>>>>>>>>>
>>>>>>>>> SCT Error Recovery Control:
>>>>>>>>> Read: 70 (7.0 seconds)
>>>>>>>>> Write: 70 (7.0 seconds)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> $ sudo mdadm --examine /dev/sda
>>>>>>>>> /dev/sda:
>>>>>>>>> MBR Magic : aa55
>>>>>>>>> Partition[0] : 4294967295 sectors at 1 (type ee)
>>>>>>>>> $ sudo mdadm --examine /dev/sda1
>>>>>>>>> /dev/sda1:
>>>>>>>>> Magic : a92b4efc
>>>>>>>>> Version : 1.2
>>>>>>>>> Feature Map : 0xd
>>>>>>>>> Array UUID : 440dc11e:079308b1:131eda79:9a74c670
>>>>>>>>> Name : Blyth:0 (local to host Blyth)
>>>>>>>>> Creation Time : Tue Aug 4 23:47:57 2015
>>>>>>>>> Raid Level : raid6
>>>>>>>>> Raid Devices : 9
>>>>>>>>>
>>>>>>>>> Avail Dev Size : 5856376832 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Array Size : 20497268736 KiB (19.09 TiB 20.99 TB)
>>>>>>>>> Used Dev Size : 5856362496 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Data Offset : 247808 sectors
>>>>>>>>> Super Offset : 8 sectors
>>>>>>>>> Unused Space : before=247728 sectors, after=14336 sectors
>>>>>>>>> State : clean
>>>>>>>>> Device UUID : 8ca60ad5:60d19333:11b24820:91453532
>>>>>>>>>
>>>>>>>>> Internal Bitmap : 8 sectors from superblock
>>>>>>>>> Reshape pos'n : 124311040 (118.55 GiB 127.29 GB)
>>>>>>>>> Delta Devices : 1 (8->9)
>>>>>>>>>
>>>>>>>>> Update Time : Tue Jul 11 23:12:08 2023
>>>>>>>>> Bad Block Log : 512 entries available at offset 24 sectors - bad
>>>>>>>>> blocks present.
>>>>>>>>> Checksum : b6d8f4d1 - correct
>>>>>>>>> Events : 181105
>>>>>>>>>
>>>>>>>>> Layout : left-symmetric
>>>>>>>>> Chunk Size : 512K
>>>>>>>>>
>>>>>>>>> Device Role : Active device 7
>>>>>>>>> Array State : AA.AAAAA. ('A' == active, '.' == missing, 'R' == replacing)
>>>>>>>>>
>>>>>>>>> $ sudo mdadm --examine /dev/sdb
>>>>>>>>> /dev/sdb:
>>>>>>>>> MBR Magic : aa55
>>>>>>>>> Partition[0] : 4294967295 sectors at 1 (type ee)
>>>>>>>>> $ sudo mdadm --examine /dev/sdb1
>>>>>>>>> /dev/sdb1:
>>>>>>>>> Magic : a92b4efc
>>>>>>>>> Version : 1.2
>>>>>>>>> Feature Map : 0x5
>>>>>>>>> Array UUID : 440dc11e:079308b1:131eda79:9a74c670
>>>>>>>>> Name : Blyth:0 (local to host Blyth)
>>>>>>>>> Creation Time : Tue Aug 4 23:47:57 2015
>>>>>>>>> Raid Level : raid6
>>>>>>>>> Raid Devices : 9
>>>>>>>>>
>>>>>>>>> Avail Dev Size : 5856376832 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Array Size : 20497268736 KiB (19.09 TiB 20.99 TB)
>>>>>>>>> Used Dev Size : 5856362496 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Data Offset : 247808 sectors
>>>>>>>>> Super Offset : 8 sectors
>>>>>>>>> Unused Space : before=247728 sectors, after=14336 sectors
>>>>>>>>> State : clean
>>>>>>>>> Device UUID : 386d3001:16447e43:4d2a5459:85618d11
>>>>>>>>>
>>>>>>>>> Internal Bitmap : 8 sectors from superblock
>>>>>>>>> Reshape pos'n : 124207104 (118.45 GiB 127.19 GB)
>>>>>>>>> Delta Devices : 1 (8->9)
>>>>>>>>>
>>>>>>>>> Update Time : Tue Jul 11 00:02:59 2023
>>>>>>>>> Bad Block Log : 512 entries available at offset 24 sectors
>>>>>>>>> Checksum : b544a39 - correct
>>>>>>>>> Events : 181077
>>>>>>>>>
>>>>>>>>> Layout : left-symmetric
>>>>>>>>> Chunk Size : 512K
>>>>>>>>>
>>>>>>>>> Device Role : Active device 8
>>>>>>>>> Array State : AAAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
>>>>>>>>>
>>>>>>>>> $ sudo mdadm --examine /dev/sdc
>>>>>>>>> /dev/sdc:
>>>>>>>>> MBR Magic : aa55
>>>>>>>>> Partition[0] : 4294967295 sectors at 1 (type ee)
>>>>>>>>> $ sudo mdadm --examine /dev/sdc1
>>>>>>>>> /dev/sdc1:
>>>>>>>>> Magic : a92b4efc
>>>>>>>>> Version : 1.2
>>>>>>>>> Feature Map : 0xd
>>>>>>>>> Array UUID : 440dc11e:079308b1:131eda79:9a74c670
>>>>>>>>> Name : Blyth:0 (local to host Blyth)
>>>>>>>>> Creation Time : Tue Aug 4 23:47:57 2015
>>>>>>>>> Raid Level : raid6
>>>>>>>>> Raid Devices : 9
>>>>>>>>>
>>>>>>>>> Avail Dev Size : 5856376832 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Array Size : 20497268736 KiB (19.09 TiB 20.99 TB)
>>>>>>>>> Used Dev Size : 5856362496 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Data Offset : 247808 sectors
>>>>>>>>> Super Offset : 8 sectors
>>>>>>>>> Unused Space : before=247720 sectors, after=14336 sectors
>>>>>>>>> State : clean
>>>>>>>>> Device UUID : 1798ec4f:72c56905:4e74ea61:2468db75
>>>>>>>>>
>>>>>>>>> Internal Bitmap : 8 sectors from superblock
>>>>>>>>> Reshape pos'n : 124311040 (118.55 GiB 127.29 GB)
>>>>>>>>> Delta Devices : 1 (8->9)
>>>>>>>>>
>>>>>>>>> Update Time : Tue Jul 11 23:12:08 2023
>>>>>>>>> Bad Block Log : 512 entries available at offset 72 sectors - bad
>>>>>>>>> blocks present.
>>>>>>>>> Checksum : 88d8b8fc - correct
>>>>>>>>> Events : 181105
>>>>>>>>>
>>>>>>>>> Layout : left-symmetric
>>>>>>>>> Chunk Size : 512K
>>>>>>>>>
>>>>>>>>> Device Role : Active device 4
>>>>>>>>> Array State : AA.AAAAA. ('A' == active, '.' == missing, 'R' == replacing)
>>>>>>>>>
>>>>>>>>> $ sudo mdadm --examine /dev/sdd
>>>>>>>>> /dev/sdd:
>>>>>>>>> MBR Magic : aa55
>>>>>>>>> Partition[0] : 4294967295 sectors at 1 (type ee)
>>>>>>>>> $ sudo mdadm --examine /dev/sdd1
>>>>>>>>> /dev/sdd1:
>>>>>>>>> Magic : a92b4efc
>>>>>>>>> Version : 1.2
>>>>>>>>> Feature Map : 0x5
>>>>>>>>> Array UUID : 440dc11e:079308b1:131eda79:9a74c670
>>>>>>>>> Name : Blyth:0 (local to host Blyth)
>>>>>>>>> Creation Time : Tue Aug 4 23:47:57 2015
>>>>>>>>> Raid Level : raid6
>>>>>>>>> Raid Devices : 9
>>>>>>>>>
>>>>>>>>> Avail Dev Size : 5856376832 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Array Size : 20497268736 KiB (19.09 TiB 20.99 TB)
>>>>>>>>> Used Dev Size : 5856362496 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Data Offset : 247808 sectors
>>>>>>>>> Super Offset : 8 sectors
>>>>>>>>> Unused Space : before=247728 sectors, after=14336 sectors
>>>>>>>>> State : clean
>>>>>>>>> Device UUID : a198095b:f54d26a9:deb3be8f:d6de9be1
>>>>>>>>>
>>>>>>>>> Internal Bitmap : 8 sectors from superblock
>>>>>>>>> Reshape pos'n : 124311040 (118.55 GiB 127.29 GB)
>>>>>>>>> Delta Devices : 1 (8->9)
>>>>>>>>>
>>>>>>>>> Update Time : Tue Jul 11 23:12:08 2023
>>>>>>>>> Bad Block Log : 512 entries available at offset 24 sectors
>>>>>>>>> Checksum : d1471d9d - correct
>>>>>>>>> Events : 181105
>>>>>>>>>
>>>>>>>>> Layout : left-symmetric
>>>>>>>>> Chunk Size : 512K
>>>>>>>>>
>>>>>>>>> Device Role : Active device 6
>>>>>>>>> Array State : AA.AAAAA. ('A' == active, '.' == missing, 'R' == replacing)
>>>>>>>>>
>>>>>>>>> $ sudo mdadm --examine /dev/sde
>>>>>>>>> /dev/sde:
>>>>>>>>> MBR Magic : aa55
>>>>>>>>> Partition[0] : 4294967295 sectors at 1 (type ee)
>>>>>>>>> $ sudo mdadm --examine /dev/sde1
>>>>>>>>> /dev/sde1:
>>>>>>>>> Magic : a92b4efc
>>>>>>>>> Version : 1.2
>>>>>>>>> Feature Map : 0x5
>>>>>>>>> Array UUID : 440dc11e:079308b1:131eda79:9a74c670
>>>>>>>>> Name : Blyth:0 (local to host Blyth)
>>>>>>>>> Creation Time : Tue Aug 4 23:47:57 2015
>>>>>>>>> Raid Level : raid6
>>>>>>>>> Raid Devices : 9
>>>>>>>>>
>>>>>>>>> Avail Dev Size : 5856376832 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Array Size : 20497268736 KiB (19.09 TiB 20.99 TB)
>>>>>>>>> Used Dev Size : 5856362496 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Data Offset : 247808 sectors
>>>>>>>>> Super Offset : 8 sectors
>>>>>>>>> Unused Space : before=247720 sectors, after=14336 sectors
>>>>>>>>> State : clean
>>>>>>>>> Device UUID : acf7ba2e:35d2fa91:6b12b0ce:33a73af5
>>>>>>>>>
>>>>>>>>> Internal Bitmap : 8 sectors from superblock
>>>>>>>>> Reshape pos'n : 124311040 (118.55 GiB 127.29 GB)
>>>>>>>>> Delta Devices : 1 (8->9)
>>>>>>>>>
>>>>>>>>> Update Time : Tue Jul 11 23:12:08 2023
>>>>>>>>> Bad Block Log : 512 entries available at offset 72 sectors
>>>>>>>>> Checksum : e05d0278 - correct
>>>>>>>>> Events : 181105
>>>>>>>>>
>>>>>>>>> Layout : left-symmetric
>>>>>>>>> Chunk Size : 512K
>>>>>>>>>
>>>>>>>>> Device Role : Active device 5
>>>>>>>>> Array State : AA.AAAAA. ('A' == active, '.' == missing, 'R' == replacing)
>>>>>>>>>
>>>>>>>>> $ sudo mdadm --examine /dev/sdf
>>>>>>>>> /dev/sdf:
>>>>>>>>> MBR Magic : aa55
>>>>>>>>> Partition[0] : 4294967295 sectors at 1 (type ee)
>>>>>>>>> $ sudo mdadm --examine /dev/sdf1
>>>>>>>>> /dev/sdf1:
>>>>>>>>> Magic : a92b4efc
>>>>>>>>> Version : 1.2
>>>>>>>>> Feature Map : 0x5
>>>>>>>>> Array UUID : 440dc11e:079308b1:131eda79:9a74c670
>>>>>>>>> Name : Blyth:0 (local to host Blyth)
>>>>>>>>> Creation Time : Tue Aug 4 23:47:57 2015
>>>>>>>>> Raid Level : raid6
>>>>>>>>> Raid Devices : 9
>>>>>>>>>
>>>>>>>>> Avail Dev Size : 5856373760 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Array Size : 20497268736 KiB (19.09 TiB 20.99 TB)
>>>>>>>>> Used Dev Size : 5856362496 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Data Offset : 247808 sectors
>>>>>>>>> Super Offset : 8 sectors
>>>>>>>>> Unused Space : before=247720 sectors, after=14336 sectors
>>>>>>>>> State : clean
>>>>>>>>> Device UUID : 31e7b86d:c274ff45:aa6dab50:2ff058c6
>>>>>>>>>
>>>>>>>>> Internal Bitmap : 8 sectors from superblock
>>>>>>>>> Reshape pos'n : 124311040 (118.55 GiB 127.29 GB)
>>>>>>>>> Delta Devices : 1 (8->9)
>>>>>>>>>
>>>>>>>>> Update Time : Tue Jul 11 23:12:08 2023
>>>>>>>>> Bad Block Log : 512 entries available at offset 72 sectors
>>>>>>>>> Checksum : 26792cc0 - correct
>>>>>>>>> Events : 181105
>>>>>>>>>
>>>>>>>>> Layout : left-symmetric
>>>>>>>>> Chunk Size : 512K
>>>>>>>>>
>>>>>>>>> Device Role : Active device 0
>>>>>>>>> Array State : AA.AAAAA. ('A' == active, '.' == missing, 'R' == replacing)
>>>>>>>>>
>>>>>>>>> $ sudo mdadm --examine /dev/sdg
>>>>>>>>> /dev/sdg:
>>>>>>>>> MBR Magic : aa55
>>>>>>>>> Partition[0] : 4294967295 sectors at 1 (type ee)
>>>>>>>>> $ sudo mdadm --examine /dev/sdg1
>>>>>>>>> /dev/sdg1:
>>>>>>>>> Magic : a92b4efc
>>>>>>>>> Version : 1.2
>>>>>>>>> Feature Map : 0x5
>>>>>>>>> Array UUID : 440dc11e:079308b1:131eda79:9a74c670
>>>>>>>>> Name : Blyth:0 (local to host Blyth)
>>>>>>>>> Creation Time : Tue Aug 4 23:47:57 2015
>>>>>>>>> Raid Level : raid6
>>>>>>>>> Raid Devices : 9
>>>>>>>>>
>>>>>>>>> Avail Dev Size : 5856373760 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Array Size : 20497268736 KiB (19.09 TiB 20.99 TB)
>>>>>>>>> Used Dev Size : 5856362496 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Data Offset : 247808 sectors
>>>>>>>>> Super Offset : 8 sectors
>>>>>>>>> Unused Space : before=247720 sectors, after=14336 sectors
>>>>>>>>> State : clean
>>>>>>>>> Device UUID : 74476ce7:4edc23f6:08120711:ba281425
>>>>>>>>>
>>>>>>>>> Internal Bitmap : 8 sectors from superblock
>>>>>>>>> Reshape pos'n : 124311040 (118.55 GiB 127.29 GB)
>>>>>>>>> Delta Devices : 1 (8->9)
>>>>>>>>>
>>>>>>>>> Update Time : Tue Jul 11 23:12:08 2023
>>>>>>>>> Bad Block Log : 512 entries available at offset 72 sectors
>>>>>>>>> Checksum : 6f67d179 - correct
>>>>>>>>> Events : 181105
>>>>>>>>>
>>>>>>>>> Layout : left-symmetric
>>>>>>>>> Chunk Size : 512K
>>>>>>>>>
>>>>>>>>> Device Role : Active device 1
>>>>>>>>> Array State : AA.AAAAA. ('A' == active, '.' == missing, 'R' == replacing)
>>>>>>>>>
>>>>>>>>> $ sudo mdadm --examine /dev/sdh
>>>>>>>>> /dev/sdh:
>>>>>>>>> MBR Magic : aa55
>>>>>>>>> Partition[0] : 4294967295 sectors at 1 (type ee)
>>>>>>>>> $ sudo mdadm --examine /dev/sdh1
>>>>>>>>> /dev/sdh1:
>>>>>>>>> Magic : a92b4efc
>>>>>>>>> Version : 1.2
>>>>>>>>> Feature Map : 0xd
>>>>>>>>> Array UUID : 440dc11e:079308b1:131eda79:9a74c670
>>>>>>>>> Name : Blyth:0 (local to host Blyth)
>>>>>>>>> Creation Time : Tue Aug 4 23:47:57 2015
>>>>>>>>> Raid Level : raid6
>>>>>>>>> Raid Devices : 9
>>>>>>>>>
>>>>>>>>> Avail Dev Size : 5856373760 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Array Size : 20497268736 KiB (19.09 TiB 20.99 TB)
>>>>>>>>> Used Dev Size : 5856362496 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Data Offset : 247808 sectors
>>>>>>>>> Super Offset : 8 sectors
>>>>>>>>> Unused Space : before=247720 sectors, after=14336 sectors
>>>>>>>>> State : clean
>>>>>>>>> Device UUID : 31c08263:b135f0f5:763bc86b:f81d7296
>>>>>>>>>
>>>>>>>>> Internal Bitmap : 8 sectors from superblock
>>>>>>>>> Reshape pos'n : 124207104 (118.45 GiB 127.19 GB)
>>>>>>>>> Delta Devices : 1 (8->9)
>>>>>>>>>
>>>>>>>>> Update Time : Tue Jul 11 20:09:14 2023
>>>>>>>>> Bad Block Log : 512 entries available at offset 72 sectors - bad
>>>>>>>>> blocks present.
>>>>>>>>> Checksum : b7696b68 - correct
>>>>>>>>> Events : 181089
>>>>>>>>>
>>>>>>>>> Layout : left-symmetric
>>>>>>>>> Chunk Size : 512K
>>>>>>>>>
>>>>>>>>> Device Role : Active device 2
>>>>>>>>> Array State : AAAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
>>>>>>>>>
>>>>>>>>> $ sudo mdadm --examine /dev/sdi
>>>>>>>>> /dev/sdi:
>>>>>>>>> MBR Magic : aa55
>>>>>>>>> Partition[0] : 4294967295 sectors at 1 (type ee)
>>>>>>>>> $ sudo mdadm --examine /dev/sdi1
>>>>>>>>> /dev/sdi1:
>>>>>>>>> Magic : a92b4efc
>>>>>>>>> Version : 1.2
>>>>>>>>> Feature Map : 0x5
>>>>>>>>> Array UUID : 440dc11e:079308b1:131eda79:9a74c670
>>>>>>>>> Name : Blyth:0 (local to host Blyth)
>>>>>>>>> Creation Time : Tue Aug 4 23:47:57 2015
>>>>>>>>> Raid Level : raid6
>>>>>>>>> Raid Devices : 9
>>>>>>>>>
>>>>>>>>> Avail Dev Size : 5856373760 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Array Size : 20497268736 KiB (19.09 TiB 20.99 TB)
>>>>>>>>> Used Dev Size : 5856362496 sectors (2.73 TiB 3.00 TB)
>>>>>>>>> Data Offset : 247808 sectors
>>>>>>>>> Super Offset : 8 sectors
>>>>>>>>> Unused Space : before=247720 sectors, after=14336 sectors
>>>>>>>>> State : clean
>>>>>>>>> Device UUID : ac1063fc:d9d66e6d:f3de33da:b396f483
>>>>>>>>>
>>>>>>>>> Internal Bitmap : 8 sectors from superblock
>>>>>>>>> Reshape pos'n : 124311040 (118.55 GiB 127.29 GB)
>>>>>>>>> Delta Devices : 1 (8->9)
>>>>>>>>>
>>>>>>>>> Update Time : Tue Jul 11 23:12:08 2023
>>>>>>>>> Bad Block Log : 512 entries available at offset 72 sectors
>>>>>>>>> Checksum : 23b6d024 - correct
>>>>>>>>> Events : 181105
>>>>>>>>>
>>>>>>>>> Layout : left-symmetric
>>>>>>>>> Chunk Size : 512K
>>>>>>>>>
>>>>>>>>> Device Role : Active device 3
>>>>>>>>> Array State : AA.AAAAA. ('A' == active, '.' == missing, 'R' == replacing)
>>>>>>>>>
>>>>>>>>> $ sudo mdadm --detail /dev/md0
>>>>>>>>> /dev/md0:
>>>>>>>>> Version : 1.2
>>>>>>>>> Raid Level : raid6
>>>>>>>>> Total Devices : 9
>>>>>>>>> Persistence : Superblock is persistent
>>>>>>>>>
>>>>>>>>> State : inactive
>>>>>>>>> Working Devices : 9
>>>>>>>>>
>>>>>>>>> Delta Devices : 1, (-1->0)
>>>>>>>>> New Level : raid6
>>>>>>>>> New Layout : left-symmetric
>>>>>>>>> New Chunksize : 512K
>>>>>>>>>
>>>>>>>>> Name : Blyth:0 (local to host Blyth)
>>>>>>>>> UUID : 440dc11e:079308b1:131eda79:9a74c670
>>>>>>>>> Events : 181105
>>>>>>>>>
>>>>>>>>> Number Major Minor RaidDevice
>>>>>>>>>
>>>>>>>>> - 8 1 - /dev/sda1
>>>>>>>>> - 8 129 - /dev/sdi1
>>>>>>>>> - 8 113 - /dev/sdh1
>>>>>>>>> - 8 97 - /dev/sdg1
>>>>>>>>> - 8 81 - /dev/sdf1
>>>>>>>>> - 8 65 - /dev/sde1
>>>>>>>>> - 8 49 - /dev/sdd1
>>>>>>>>> - 8 33 - /dev/sdc1
>>>>>>>>> - 8 17 - /dev/sdb1
>>>>>>>>>
>>>>>>>>> $ cat /proc/mdstat
>>>>>>>>> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
>>>>>>>>> [raid4] [raid10]
>>>>>>>>> md0 : inactive sdb1[9](S) sdi1[4](S) sdf1[0](S) sdg1[1](S) sdh1[3](S)
>>>>>>>>> sda1[8](S) sdd1[7](S) sdc1[6](S) sde1[5](S)
>>>>>>>>> 26353689600 blocks super 1.2
>>>>>>>>>
>>>>>>>>> unused devices: <none>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>