linux-raid.vger.kernel.org archive mirror
From: Andre Tomt <andre@tomt.net>
To: linux-raid <linux-raid@vger.kernel.org>
Subject: raid10 resync hangs in 4.2.6, 4.3
Date: Fri, 20 Nov 2015 19:14:28 +0100	[thread overview]
Message-ID: <564F6304.3040502@tomt.net> (raw)

[-- Attachment #1: Type: text/plain, Size: 5582 bytes --]

[resend with compressed attachments, first may have gotten eaten by a grue]

Hi

I'm seeing hangs with RAID10 resyncs on my system. RAID5/6 recovery on 
the same drive set works without any problems, and BTRFS RAID6 is 
problem-free on a different set of (very busy) drives on the same 
controllers as well.

It happens shortly after array creation, anywhere from seconds to a 
couple of minutes in.

The wchan for the md0_resync kernel thread shows it sitting in 
raise_barrier() forever, while md0_raid10 keeps a CPU core 100% busy 
(but shows no wchan), and no resyncing or I/O to the array gets done 
anymore.
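For reference, this is roughly how I collected the wchan/stack output 
quoted further down (the PIDs differ per boot, so I look the threads up 
by name; the pgrep-by-name step is my own shorthand here):

```shell
# Look up the stuck resync thread by name and dump its wait channel
# and kernel stack. When hung, wchan reads raise_barrier.
pid=$(pgrep -x md0_resync)
cat /proc/$pid/wchan; echo
cat /proc/$pid/stack
```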

After a short while the kernel starts spitting out "rcu_sched 
self-detected stall on CPU" warnings, and other RCU use starts getting 
iffy (I think).

I/O directly to the RAID member disks (below the md layer, e.g. 
/dev/sdX directly) continues to work after the hang, and there are no 
I/O errors in the kernel log.
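To check that, I read straight from a member device; a minimal sketch 
(the device name and sizes here are just examples from this array):

```shell
# Direct read from one array member, bypassing the md layer.
# O_DIRECT skips the page cache so we actually hit the disk.
dd if=/dev/sda1 of=/dev/null bs=1M count=64 iflag=direct
```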

The array is an 8-drive array spread over 3 HBAs, created with:

mdadm --create /dev/md0 --level=10 --chunk=128 --bitmap=none \
  --raid-devices=8 /dev/sda1 /dev/sdg1 /dev/sdl1 /dev/sdm1 \
  /dev/sdi1 /dev/sdj1 /dev/sdp1 /dev/sds1
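The hang shows up as the resync counter in /proc/mdstat freezing while 
md0_raid10 spins on a core; roughly what I ran to watch it (the ps 
column widths are just my preference):

```shell
# Resync progress stops advancing while md0_raid10 burns 100% CPU
# with an empty wchan.
cat /proc/mdstat
ps -o pid,comm,wchan:32,pcpu -C md0_raid10,md0_resync
```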

The HBAs are LSI SAS2008 in IT mode (mpt2sas driver), oldish 2TB SATA 
drives. Dual socket Xeon E5 v3 system with both sockets populated.

This happens on at least 4.2.6 and 4.3. I'm going to test some earlier 
kernels.

Attached is some more info.

md0_resync stack:
> root@mental:~# cat /proc/1663/stack
> [<ffffffffc042dd02>] raise_barrier+0x11b/0x14d [raid10]
> [<ffffffffc0432830>] sync_request+0x193/0x14fc [raid10]
> [<ffffffffc0559ac6>] md_do_sync+0x7d2/0xd78 [md_mod]
> [<ffffffffc0556df9>] md_thread+0x12f/0x145 [md_mod]
> [<ffffffff9d061db2>] kthread+0xcd/0xd5
> [<ffffffff9d3b7a8f>] ret_from_fork+0x3f/0x70
> [<ffffffffffffffff>] 0xffffffffffffffff

md0_raid10 stack:
> root@mental:~# cat /proc/1662/stack
> [<ffffffffffffffff>] 0xffffffffffffffff

Stack of a cat process trying to read /dev/md0 after the hang:
> root@mental:~# cat /proc/1737/stack
> [<ffffffffc042de2b>] wait_barrier+0xd8/0x118 [raid10]
> [<ffffffffc042f83c>] __make_request+0x3e/0xb17 [raid10]
> [<ffffffffc0430399>] make_request+0x84/0xdc [raid10]
> [<ffffffffc0557aab>] md_make_request+0xf6/0x1cc [md_mod]
> [<ffffffff9d1a369a>] generic_make_request+0x97/0xd6
> [<ffffffff9d1a37d1>] submit_bio+0xf8/0x140
> [<ffffffff9d140e71>] mpage_bio_submit+0x25/0x2c
> [<ffffffff9d141499>] mpage_readpages+0x10e/0x11f
> [<ffffffff9d13c76c>] blkdev_readpages+0x18/0x1a
> [<ffffffff9d0d09e4>] __do_page_cache_readahead+0x13c/0x1e0
> [<ffffffff9d0d0c67>] ondemand_readahead+0x1df/0x1f2
> [<ffffffff9d0d0da0>] page_cache_sync_readahead+0x38/0x3a
> [<ffffffff9d0c70b3>] generic_file_read_iter+0x184/0x50b
> [<ffffffff9d13c8f7>] blkdev_read_iter+0x33/0x38
> [<ffffffff9d1132fd>] __vfs_read+0x8d/0xb1
> [<ffffffff9d113820>] vfs_read+0x95/0x120
> [<ffffffff9d11402d>] SyS_read+0x49/0x84
> [<ffffffff9d3b772e>] entry_SYSCALL_64_fastpath+0x12/0x71
> [<ffffffffffffffff>] 0xffffffffffffffff


First RCU stall warning (more in dmesg.txt):
> [  150.183473] md0: detected capacity change from 0 to 8001054310400
> [  150.183647] md: resync of RAID array md0
> [  150.183652] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
> [  150.183654] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
> [  150.183678] md: using 128k window, over a total of 7813529600k.
> [  233.068271] INFO: rcu_sched self-detected stall on CPU
> [  233.068308] 	5: (17999 ticks this GP) idle=695/140000000000001/0 softirq=1235/1235 fqs=8999
> [  233.068335] 	 (t=18000 jiffies g=935 c=934 q=37)
> [  233.068354] Task dump for CPU 5:
> [  233.068356] md0_raid10      R  running task        0  1662      2 0x00000008
> [  233.068358]  0000000000000000 ffff88103fca3de0 ffffffff9d069f07 0000000000000005
> [  233.068360]  ffffffff9d63d0c0 ffff88103fca3df8 ffffffff9d06beb9 ffffffff9d63d0c0
> [  233.068361]  ffff88103fca3e28 ffffffff9d08c836 ffffffff9d63d0c0 ffff88103fcb4e80
> [  233.068363] Call Trace:
> [  233.068364]  <IRQ>  [<ffffffff9d069f07>] sched_show_task+0xb9/0xbe
> [  233.068372]  [<ffffffff9d06beb9>] dump_cpu_task+0x32/0x35
> [  233.068375]  [<ffffffff9d08c836>] rcu_dump_cpu_stacks+0x71/0x8c
> [  233.068378]  [<ffffffff9d08f32c>] rcu_check_callbacks+0x20f/0x5a3
> [  233.068382]  [<ffffffff9d0b72da>] ? acct_account_cputime+0x17/0x19
> [  233.068384]  [<ffffffff9d0911a2>] update_process_times+0x2a/0x4f
> [  233.068387]  [<ffffffff9d09cd55>] tick_sched_handle.isra.5+0x31/0x33
> [  233.068388]  [<ffffffff9d09cd8f>] tick_sched_timer+0x38/0x60
> [  233.068390]  [<ffffffff9d0917e1>] __hrtimer_run_queues+0xa1/0x10c
> [  233.068392]  [<ffffffff9d091c52>] hrtimer_interrupt+0xa0/0x172
> [  233.068395]  [<ffffffff9d0367a4>] smp_trace_apic_timer_interrupt+0x76/0x88
> [  233.068397]  [<ffffffff9d0367bf>] smp_apic_timer_interrupt+0x9/0xb
> [  233.068400]  [<ffffffff9d3b8402>] apic_timer_interrupt+0x82/0x90
> [  233.068401]  <EOI>  [<ffffffff9d19e9aa>] ? bio_copy_data+0xce/0x2af
> [  233.068410]  [<ffffffffc04320e5>] raid10d+0x974/0xf2c [raid10]
> [  233.068417]  [<ffffffffc0556df9>] md_thread+0x12f/0x145 [md_mod]
> [  233.068421]  [<ffffffffc0556df9>] ? md_thread+0x12f/0x145 [md_mod]
> [  233.068424]  [<ffffffff9d07ad2e>] ? wait_woken+0x6d/0x6d
> [  233.068428]  [<ffffffffc0556cca>] ? md_wait_for_blocked_rdev+0x102/0x102 [md_mod]
> [  233.068431]  [<ffffffff9d061db2>] kthread+0xcd/0xd5
> [  233.068434]  [<ffffffff9d061ce5>] ? kthread_worker_fn+0x13f/0x13f
> [  233.068436]  [<ffffffff9d3b7a8f>] ret_from_fork+0x3f/0x70
> [  233.068438]  [<ffffffff9d061ce5>] ? kthread_worker_fn+0x13f/0x13f


[-- Attachment #2: config.txt.gz --]
[-- Type: application/gzip, Size: 31864 bytes --]

[-- Attachment #3: dmesg.txt.gz --]
[-- Type: application/gzip, Size: 31060 bytes --]

[-- Attachment #4: wchan.txt.gz --]
[-- Type: application/gzip, Size: 3051 bytes --]

Thread overview: 7+ messages
2015-11-20 18:14 Andre Tomt [this message]
2015-11-20 20:30 ` raid10 resync hangs in 4.2.6, 4.3 John Stoffel
2015-11-21  1:27   ` Andre Tomt
2015-11-22 19:35     ` John Stoffel
2015-11-23  9:20 ` Artur Paszkiewicz
2015-11-23  9:52   ` Andre Tomt
2015-11-23 19:48     ` Andre Tomt
