* mdadm hangs during raid5 grow
@ 2017-01-04 6:42 Sebastian Reichel
From: Sebastian Reichel @ 2017-01-04 6:42 UTC (permalink / raw)
To: linux-raid
Hi,
I have a problem with a hanging mdadm reshape task at 100% CPU load
(kernel thread "md2_raid5"). Any operation on the raid (e.g. mdadm
-S) also hangs. Rebooting worked, but after re-triggering the
reshape (mdadm --readwrite /dev/md2) I get the same behaviour.
dmesg has this stacktrace:
[ 1813.500745] INFO: task md2_resync:3377 blocked for more than 120 seconds.
[ 1813.500778] Not tainted 4.8.0-2-amd64 #1
[ 1813.500795] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1813.500822] md2_resync D ffff93207bc98180 0 3377 2 0x00000000
[ 1813.500827] ffff93206f46d000 ffff93207642a1c0 0000000000000246 ffff932059607bf0
[ 1813.500829] ffff932059608000 ffff93206effc400 ffff93206effc688 ffff932059607d24
[ 1813.500830] ffff932059607bf0 ffff93206e0a3000 ffffffffbc7eb6d1 ffff93206e0a3000
[ 1813.500832] Call Trace:
[ 1813.500841] [<ffffffffbc7eb6d1>] ? schedule+0x31/0x80
[ 1813.500847] [<ffffffffc0356924>] ? reshape_request+0x7b4/0x910 [raid456]
[ 1813.500851] [<ffffffffbc2bce80>] ? wake_atomic_t_function+0x60/0x60
[ 1813.500854] [<ffffffffc0356da3>] ? raid5_sync_request+0x323/0x3a0 [raid456]
[ 1813.500862] [<ffffffffc0271b50>] ? is_mddev_idle+0x98/0xf3 [md_mod]
[ 1813.500866] [<ffffffffc02649a9>] ? md_do_sync+0x959/0xed0 [md_mod]
[ 1813.500868] [<ffffffffbc2bce80>] ? wake_atomic_t_function+0x60/0x60
[ 1813.500872] [<ffffffffc0261363>] ? md_thread+0x133/0x140 [md_mod]
[ 1813.500873] [<ffffffffbc7eb1c9>] ? __schedule+0x289/0x760
[ 1813.500877] [<ffffffffc0261230>] ? find_pers+0x70/0x70 [md_mod]
[ 1813.500879] [<ffffffffbc29aecd>] ? kthread+0xcd/0xf0
[ 1813.500881] [<ffffffffbc7efcaf>] ? ret_from_fork+0x1f/0x40
[ 1813.500883] [<ffffffffbc29ae00>] ? kthread_create_on_node+0x190/0x190
Is this a known bug, and is a patch available?
[0] http://serverfault.com/questions/773244/mdadm-stuck-reshape-operation
[1] http://serverfault.com/questions/697193/raid-5-reshape-freeze
-- Sebastian
FWIW, here is some info about the raid:
mars# uname -a
Linux mars 4.8.0-2-amd64 #1 SMP Debian 4.8.11-1 (2016-12-02) x86_64 GNU/Linux
sre@mars ~ % cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10]
md2 : active raid5 sdl[4] sdk[6] sdm[3] sdj[5]
7813774720 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
[>....................] reshape = 0.2% (10197888/3906887360) finish=157305.2min speed=412K/sec
...
mars# for disk in /dev/sd[jmlk]; smartctl -i $disk | grep "Device Model"
Device Model: WDC WD40EFRX-68WT0N0
Device Model: WDC WD40EFRX-68WT0N0
Device Model: WDC WD40EFRX-68WT0N0
Device Model: WDC WD40EFRX-68WT0N0
mars# for disk in /dev/sd[jmlk]; if smartctl -l scterc,70,70 $disk > /dev/null ; then echo "$disk is good"; fi
/dev/sdj is good
/dev/sdk is good
/dev/sdl is good
/dev/sdm is good
mars# mdadm --detail /dev/md2
/dev/md2:
Version : 1.2
Creation Time : Wed Jan 15 12:59:19 2014
Raid Level : raid5
Array Size : 7813774720 (7451.80 GiB 8001.31 GB)
Used Dev Size : 3906887360 (3725.90 GiB 4000.65 GB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Update Time : Wed Jan 4 06:09:53 2017
State : clean, reshaping
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
Reshape Status : 0% complete
Delta Devices : 1, (3->4)
Name : mars:2 (local to host mars)
UUID : a9946f57:e2c50b35:d192467d:fa495817
Events : 39050
Number Major Minor RaidDevice State
4 8 176 0 active sync /dev/sdl
5 8 144 1 active sync /dev/sdj
3 8 192 2 active sync /dev/sdm
6 8 160 3 active sync /dev/sdk
mars# mdadm --examine /dev/sd[jmlk]
/dev/sdj:
Magic : a92b4efc
Version : 1.2
Feature Map : 0xc
Array UUID : a9946f57:e2c50b35:d192467d:fa495817
Name : mars:2 (local to host mars)
Creation Time : Wed Jan 15 12:59:19 2014
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 7813775280 (3725.90 GiB 4000.65 GB)
Array Size : 11720662080 (11177.69 GiB 12001.96 GB)
Used Dev Size : 7813774720 (3725.90 GiB 4000.65 GB)
Data Offset : 261888 sectors
Super Offset : 8 sectors
Unused Space : before=261800 sectors, after=560 sectors
State : active
Device UUID : 94bb69dc:955c3040:5cc4ecbb:28130785
Reshape pos'n : 29073984 (27.73 GiB 29.77 GB)
Delta Devices : 1 (3->4)
Update Time : Wed Jan 4 06:09:53 2017
Bad Block Log : 512 entries available at offset 72 sectors - bad blocks present.
Checksum : ed4b675b - correct
Events : 39050
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 1
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdk:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : a9946f57:e2c50b35:d192467d:fa495817
Name : mars:2 (local to host mars)
Creation Time : Wed Jan 15 12:59:19 2014
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 7813775280 (3725.90 GiB 4000.65 GB)
Array Size : 11720662080 (11177.69 GiB 12001.96 GB)
Used Dev Size : 7813774720 (3725.90 GiB 4000.65 GB)
Data Offset : 261888 sectors
Super Offset : 8 sectors
Unused Space : before=261800 sectors, after=560 sectors
State : active
Device UUID : 213a97fc:46865f42:0e8b06be:7309eac1
Reshape pos'n : 29073984 (27.73 GiB 29.77 GB)
Delta Devices : 1 (3->4)
Update Time : Wed Jan 4 06:09:53 2017
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 4ea4cc90 - correct
Events : 39050
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 3
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdl:
Magic : a92b4efc
Version : 1.2
Feature Map : 0xc
Array UUID : a9946f57:e2c50b35:d192467d:fa495817
Name : mars:2 (local to host mars)
Creation Time : Wed Jan 15 12:59:19 2014
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 7813775280 (3725.90 GiB 4000.65 GB)
Array Size : 11720662080 (11177.69 GiB 12001.96 GB)
Used Dev Size : 7813774720 (3725.90 GiB 4000.65 GB)
Data Offset : 261888 sectors
Super Offset : 8 sectors
Unused Space : before=261800 sectors, after=560 sectors
State : active
Device UUID : a25cccf5:d23f2274:30dc0ffa:d8a79f14
Reshape pos'n : 29073984 (27.73 GiB 29.77 GB)
Delta Devices : 1 (3->4)
Update Time : Wed Jan 4 06:09:53 2017
Bad Block Log : 512 entries available at offset 72 sectors - bad blocks present.
Checksum : 85b982a - correct
Events : 39050
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 0
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdm:
Magic : a92b4efc
Version : 1.2
Feature Map : 0xc
Array UUID : a9946f57:e2c50b35:d192467d:fa495817
Name : mars:2 (local to host mars)
Creation Time : Wed Jan 15 12:59:19 2014
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 7813775280 (3725.90 GiB 4000.65 GB)
Array Size : 11720662080 (11177.69 GiB 12001.96 GB)
Used Dev Size : 7813774720 (3725.90 GiB 4000.65 GB)
Data Offset : 261888 sectors
Super Offset : 8 sectors
Unused Space : before=261800 sectors, after=560 sectors
State : active
Device UUID : 21517788:cdc63605:5bdc0bee:d23cc234
Reshape pos'n : 29073984 (27.73 GiB 29.77 GB)
Delta Devices : 1 (3->4)
Update Time : Wed Jan 4 06:09:53 2017
Bad Block Log : 512 entries available at offset 72 sectors - bad blocks present.
Checksum : 4039a8c7 - correct
Events : 39050
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 2
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
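As a sanity check on the superblock numbers above, the KiB-based
"Reshape pos'n" field converts exactly to the GiB/GB figures that
mdadm prints (a trivial sketch, nothing array-specific assumed):

```python
# Convert mdadm's KiB-based "Reshape pos'n" to the GiB/GB values it prints.
pos_kib = 29073984            # "Reshape pos'n" from --examine, in KiB
pos_bytes = pos_kib * 1024
gib = pos_bytes / 2**30       # binary gigabytes (GiB)
gb = pos_bytes / 10**9        # decimal gigabytes (GB)
print(f"{gib:.2f} GiB, {gb:.2f} GB")  # -> 27.73 GiB, 29.77 GB
```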
* Re: mdadm hangs during raid5 grow
From: Sebastian Reichel @ 2017-01-08 20:50 UTC (permalink / raw)
To: linux-raid
Hi,
On Wed, Jan 04, 2017 at 07:42:11AM +0100, Sebastian Reichel wrote:
> I have a problem with a hanging mdadm reshape task at 100% CPU load
> (kernel thread "md2_raid5"). Any operation on the raid (e.g. mdadm
> -S) also hangs. Rebooting worked, but after re-triggering the
> reshape (mdadm --readwrite /dev/md2) I get the same behaviour.
>
> dmesg has this stacktrace:
>
> [ 1813.500745] INFO: task md2_resync:3377 blocked for more than 120 seconds.
> [ 1813.500778] Not tainted 4.8.0-2-amd64 #1
> [ 1813.500795] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1813.500822] md2_resync D ffff93207bc98180 0 3377 2 0x00000000
> [ 1813.500827] ffff93206f46d000 ffff93207642a1c0 0000000000000246 ffff932059607bf0
> [ 1813.500829] ffff932059608000 ffff93206effc400 ffff93206effc688 ffff932059607d24
> [ 1813.500830] ffff932059607bf0 ffff93206e0a3000 ffffffffbc7eb6d1 ffff93206e0a3000
> [ 1813.500832] Call Trace:
> [ 1813.500841] [<ffffffffbc7eb6d1>] ? schedule+0x31/0x80
> [ 1813.500847] [<ffffffffc0356924>] ? reshape_request+0x7b4/0x910 [raid456]
> [ 1813.500851] [<ffffffffbc2bce80>] ? wake_atomic_t_function+0x60/0x60
> [ 1813.500854] [<ffffffffc0356da3>] ? raid5_sync_request+0x323/0x3a0 [raid456]
> [ 1813.500862] [<ffffffffc0271b50>] ? is_mddev_idle+0x98/0xf3 [md_mod]
> [ 1813.500866] [<ffffffffc02649a9>] ? md_do_sync+0x959/0xed0 [md_mod]
> [ 1813.500868] [<ffffffffbc2bce80>] ? wake_atomic_t_function+0x60/0x60
> [ 1813.500872] [<ffffffffc0261363>] ? md_thread+0x133/0x140 [md_mod]
> [ 1813.500873] [<ffffffffbc7eb1c9>] ? __schedule+0x289/0x760
> [ 1813.500877] [<ffffffffc0261230>] ? find_pers+0x70/0x70 [md_mod]
> [ 1813.500879] [<ffffffffbc29aecd>] ? kthread+0xcd/0xf0
> [ 1813.500881] [<ffffffffbc7efcaf>] ? ret_from_fork+0x1f/0x40
> [ 1813.500883] [<ffffffffbc29ae00>] ? kthread_create_on_node+0x190/0x190
>
> Is this a known bug, and is a patch available?
I also tried 4.9.0-rc8, which was easily available without having to
find a working kernel config for the system; the error persists.
I have some more information:
mars# gdb /usr/lib/debug/lib/modules/4.9.0-rc8-amd64/kernel/drivers/md/raid456.ko
(gdb) info line *reshape_request+0x6dd
Line 5481 of "./drivers/md/raid5.c" starts at address 0xe8e4 <reshape_request+1748> and ends at 0xe8fd <reshape_request+1773>.
=> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/md/raid5.c?h=v4.9-rc8#n5481
=> wait_event(conf->wait_for_overlap, atomic_read(&conf->reshape_stripes) == 0 || test_bit(MD_RECOVERY_INTR, &mddev->recovery));
Also, if I enable debug messages in raid5.c I get this from
[md2_raid5], which is running at 100% CPU on one of the cores (note
regarding the failed state: I'm pretty sure the disks are OK, and I
can use the raid if I mount it read-only).
# echo "file raid5.c +p" > /sys/kernel/debug/dynamic_debug/control
# dmesg | grep-random-sample
[41658.396897] locked=1 uptodate=4 to_read=0 to_write=0 failed=2 failed_num=2,1
[41658.396904] check 3: state 0x13 read (null) write (null) written (null)
[41658.397113] check 0: state 0x811 read (null) write (null) written (null)
[41658.397139] check 0: state 0x811 read (null) write (null) written (null)
[41658.397163] check 1: state 0xa01 read (null) write (null) written (null)
[41658.397192] handling stripe 20206728, state=0x1401 cnt=1, pd_idx=2, qd_idx=-1
, check:0, reconstruct:6
[41658.397222] schedule_reconstruction: stripe 20206728 locked: 2 ops_request: 10
[41658.397251] check 1: state 0xa01 read (null) write (null) written (null)
[41658.397277] ops_complete_reconstruct: stripe 20206736
[41658.397296] check 1: state 0xa01 read (null) write (null) written (null)
[41658.397328] check 0: state 0x811 read (null) write (null) written (null)
[41658.397351] check 3: state 0x13 read (null) write (null) written (null)
[41658.397368] locked=1 uptodate=4 to_read=0 to_write=0 failed=2 failed_num=2,1
[41658.397387] ops_complete_reconstruct: stripe 20206112
[41658.397395] __get_priority_stripe: handle: busy hold: empty full_writes: 0 bypass_count: 0
[41658.397417] check 1: state 0xa01 read (null) write (null) written (null)
[41658.397476] check 1: state 0xa01 read (null) write (null) written (null)
[41658.397560] check 2: state 0xa01 read (null) write (null) written (null)
[41658.397592] check 3: state 0x13 read (null) write (null) written (null)
[41658.397606] __get_priority_stripe: handle: busy hold: empty full_writes: 0 bypass_count: 0
[41658.397614] handling stripe 20206792, state=0x1401 cnt=1, pd_idx=2, qd_idx=-1
, check:0, reconstruct:6
[41658.397621] schedule_reconstruction: stripe 20206792 locked: 2 ops_request: 10
[41658.397630] check 1: state 0xa01 read (null) write (null) written (null)
[41658.397638] ops_complete_reconstruct: stripe 20206800
[41658.397840] locked=1 uptodate=4 to_read=0 to_write=0 failed=2 failed_num=2,1
[41658.397863] check 1: state 0xa01 read (null) write (null) written (null)
[41658.397869] ops_complete_reconstruct: stripe 20206976
[41658.397874] check 1: state 0xa01 read (null) write (null) written (null)
[41658.397883] ops_complete_reconstruct: stripe 20206984
[41658.397890] check 0: state 0x13 read (null) write (null) written (null)
[41658.397899] check 3: state 0x811 read (null) write (null) written (null)
[41658.397907] schedule_reconstruction: stripe 20207000 locked: 2 ops_request: 10
[41658.397952] check 1: state 0xa01 read (null) write (null) written (null)
[41658.398030] check 2: state 0xa01 read (null) write (null) written (null)
[41658.398079] check 1: state 0x203 read (null) write (null) written (null)
[41658.398105] check 2: state 0xa01 read (null) write (null) written (null)
[41658.398118] check 3: state 0x811 read (null) write (null) written (null)
[41658.398124] locked=1 uptodate=4 to_read=0 to_write=0 failed=2 failed_num=2,1
[41658.398130] check 3: state 0x811 read (null) write (null) written (null)
[41658.398136] schedule_reconstruction: stripe 20206904 locked: 2 ops_request: 10
[41658.398145] check 3: state 0x811 read (null) write (null) written (null)
[41658.398162] __get_priority_stripe: handle: busy hold: empty full_writes: 0 bypass_count: 0
[41658.398226] check 2: state 0xa01 read (null) write (null) written (null)
[41658.398294] ops_run_reconstruct5: stripe 20206936
[41658.398308] check 0: state 0x811 read (null) write (null) written (null)
[41658.398318] check 1: state 0x203 read (null) write (null) written (null)
[41658.398342] check 2: state 0xa01 read (null) write (null) written (null)
[41658.398367] __get_priority_stripe: handle: busy hold: empty full_writes: 0 bypass_count: 0
[41658.398389] ops_complete_reconstruct: stripe 20206184
[41658.398415] check 1: state 0xa01 read (null) write (null) written (null)
[41658.398469] locked=1 uptodate=4 to_read=0 to_write=0 failed=2 failed_num=2,1
[41658.398494] check 3: state 0x811 read (null) write (null) written (null)
[41658.398519] schedule_reconstruction: stripe 20206216 locked: 2 ops_request: 10
[41658.398542] check 2: state 0x203 read (null) write (null) written (null)
[41658.398554] ops_run_reconstruct5: stripe 20206224
[41658.398560] check 2: state 0x203 read (null) write (null) written (null)
[41658.398582] ops_run_reconstruct5: stripe 20206240
[41658.398600] check 2: state 0x203 read (null) write (null) written (null)
[41658.398627] check 3: state 0x811 read (null) write (null) written (null)
[41658.398659] check 3: state 0x811 read (null) write (null) written (null)
[41658.398733] ops_run_reconstruct5: stripe 20206288
[41658.398754] check 2: state 0x203 read (null) write (null) written (null)
[41658.398785] ops_complete_reconstruct: stripe 20206296
[41658.398792] check 1: state 0xa01 read (null) write (null) written (null)
[41658.398799] ops_complete_reconstruct: stripe 20206304
[41658.398807] __get_priority_stripe: handle: busy hold: empty full_writes: 0 bypass_count: 0
[41658.398813] handling stripe 20206312, state=0x1401 cnt=1, pd_idx=2, qd_idx=-1
, check:0, reconstruct:6
[41658.398845] check 0: state 0x13 read (null) write (null) written (null)
[41658.398870] check 1: state 0xa01 read (null) write (null) written (null)
[41658.398901] check 3: state 0x811 read (null) write (null) written (null)
[41658.398990] __get_priority_stripe: handle: busy hold: empty full_writes: 0 bypass_count: 0
[41658.399008] __get_priority_stripe: handle: busy hold: empty full_writes: 0 bypass_count: 0
[41658.399035] check 2: state 0xa01 read (null) write (null) written (null)
[41658.399039] schedule_reconstruction: stripe 20206040 locked: 2 ops_request: 10
[41658.399048] check 1: state 0x203 read (null) write (null) written (null)
[41658.399057] ops_complete_reconstruct: stripe 20206336
[41658.399063] check 1: state 0x203 read (null) write (null) written (null)
[41658.399090] check 1: state 0x203 read (null) write (null) written (null)
[41658.399117] check 1: state 0xa01 read (null) write (null) written (null)
[41658.399142] __get_priority_stripe: handle: busy hold: empty full_writes: 0 bypass_count: 0
[41658.399220] check 3: state 0x13 read (null) write (null) written (null)
Also, I have an idea about the problem: usually I added zeroed disks
to mdadm. This time I only zeroed out the first few MB, so that mdadm
would not complain about the header, and assumed it would take care of
the rest. Is it safe to remove the new disk, zero it out, and re-add
it?
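Purely as a hypothetical sketch of what I have in mind (/dev/sdX is a
placeholder for the newly added disk -- I would not run this against
the live array without confirmation from the list, since failing a
device mid-reshape may well be unsafe):

```shell
# HYPOTHETICAL sketch only -- sdX stands in for the new disk.
# Do not run against a live array without confirmation.
mdadm /dev/md2 --fail /dev/sdX --remove /dev/sdX    # drop the new disk
dd if=/dev/zero of=/dev/sdX bs=1M status=progress   # full zero pass this time
mdadm /dev/md2 --add /dev/sdX                       # re-add the blank disk
```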
-- Sebastian