public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* XFS and RAID10 with o2 layout
@ 2018-12-12 12:29 Sinisa
  2018-12-12 14:30 ` Brian Foster
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Sinisa @ 2018-12-12 12:29 UTC (permalink / raw)
  To: linux-xfs

[-- Attachment #1: Type: text/plain, Size: 2209 bytes --]

Hello group,

I have noticed something strange going on lately, but recently I have come to 
conclusion that there is some unwanted interaction between XFS and Linux RAID10 
with "offset" layout.

So here is the problem: I create a Linux RAID10 mirror with 2 disks (HDD or 
SSD) and "o2" layout (best choice for read and write speed):
# mdadm -C -n2 -l10 -po2 /dev/mdX /dev/sdaX /dev/sdbX
# mkfs.xfs /dev/mdX
# mount /dev/mdX /mnt
# rsync -avxDPHS / /mnt

So we have RAID10 initializing:

# cat /proc/mdstat
Personalities : [raid1] [raid10]
md2 : active raid10 sdb3[1] sda3[0]
       314433536 blocks super 1.2 4096K chunks 2 offset-copies [2/2] [UU]
       [==>..................]  resync = 11.7% (36917568/314433536) 
finish=8678.2min speed=532K/sec
       bitmap: 3/3 pages [12KB], 65536KB chunk

but after a few minutes everything stops like you can see above. Rsync (or any 
other process writing to that md device) also freezes. If I try to read already 
copied files - freeze, usually with less that 2GB copied.

Sometimes in dmesg I get some kernel messages about "task kworker/2:1:55 
blocked for more than 480 seconds." (please see attached dmesg.txt and my 
reports here: https://bugzilla.opensuse.org/show_bug.cgi?id=1111073), sometimes 
nothing at all. When this happens, I can only reboot with SysRq-b or 
"physically" with reset/power button.

Same thing can happen with "far" layout, but it seems to me that it does not 
happen every time (or that often). I might be wrong, because I never use "far" 
layout in real life, only for testing.
I was unable to reproduce the failure with "near" layout.

Also with EXT4 or BTRFS and any layout everything works just as it should, that 
is sync goes on until finished, and rsync, cp, or any other write work just 
fine at the same time.

Let me just add that I first saw this behavior in openSUSE LEAP 15.0 (kernel 
4.12). In previous versions (up to kernel 4.4) I never had this problem. In the 
meantime I have tested with kernels up to 4.20rc and it is the same. 
Unfortunately I cannot go back to test kernels 4.5 - 4.11 to pinpoint the 
moment the problem first appeared.



-- 
Best regards,
Siniša Bandin
(excuse my English)


[-- Attachment #2: dmesg.txt --]
[-- Type: text/plain, Size: 13412 bytes --]

[  180.981499] SGI XFS with ACLs, security attributes, no debug enabled
[  181.005019] XFS (md1): Mounting V5 Filesystem
[  181.132076] XFS (md1): Starting recovery (logdev: internal)
[  181.295606] XFS (md1): Ending recovery (logdev: internal)
[  181.804011] XFS (md1): Unmounting Filesystem
[  182.201794] XFS (md127): Mounting V4 Filesystem
[  182.736958] md: recovery of RAID array md127
[  182.915479] XFS (md127): Ending clean mount
[  183.819702] XFS (md127): Unmounting Filesystem
[  184.943831] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
[  529.784557] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
[  601.789958] md1: detected capacity change from 33284947968 to 0
[  601.789973] md: md1 stopped.
[  602.314112] md0: detected capacity change from 550436864 to 0
[  602.314128] md: md0 stopped.
[  602.745030] md: md127: recovery interrupted.
[  603.131684] md127: detected capacity change from 966229229568 to 0
[  603.132237] md: md127 stopped.
[  603.435808]  sda: sda1 sda2
[  603.594074] udevd[5011]: inotify_add_watch(11, /dev/sda2, 10) failed: No such file or directory
[  603.643959]  sda:
[  603.844724]  sdb: sdb1 sdb2
[  604.255407]  sdb: sdb1
[  604.490214] udevd[5050]: inotify_add_watch(11, /dev/sdb1, 10) failed: No such file or directory
[  605.140952]  sdb: sdb1
[  605.628686]  sdb: sdb1 sdb2
[  606.271192]  sdb: sdb1 sdb2 sdb3
[  607.079626]  sdb: sdb1 sdb2 sdb3
[  607.611092]  sda:
[  608.273201]  sda: sda1
[  608.611952]  sda: sda1 sda2
[  609.031326]  sda: sda1 sda2 sda3
[  609.753140] md/raid10:md1: not clean -- starting background reconstruction
[  609.753145] md/raid10:md1: active with 2 out of 2 devices
[  609.768804] md1: detected capacity change from 0 to 32210157568
[  609.772677] md: resync of RAID array md1
[  614.590107] XFS (md1): Mounting V5 Filesystem
[  615.449035] XFS (md1): Ending clean mount
[  617.678462] md/raid1:md0: not clean -- starting background reconstruction
[  617.678469] md/raid1:md0: active with 2 out of 2 mirrors
[  617.740729] md0: detected capacity change from 0 to 524222464
[  617.747107] md: delaying resync of md0 until md1 has finished (they share one or more physical units)
[  620.037818] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
[ 1463.754785] INFO: task kworker/0:3:227 blocked for more than 480 seconds.
[ 1463.754793]       Not tainted 4.19.5-1-default #1
[ 1463.754795] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1463.754799] kworker/0:3     D    0   227      2 0x80000000
[ 1463.755000] Workqueue: xfs-eofblocks/md1 xfs_eofblocks_worker [xfs]
[ 1463.755005] Call Trace:
[ 1463.755025]  ? __schedule+0x29a/0x880
[ 1463.755032]  ? rwsem_down_write_failed+0x197/0x350
[ 1463.755038]  schedule+0x78/0x110
[ 1463.755044]  rwsem_down_write_failed+0x197/0x350
[ 1463.755055]  call_rwsem_down_write_failed+0x13/0x20
[ 1463.755061]  down_write+0x20/0x30
[ 1463.755196]  xfs_free_eofblocks+0x114/0x1a0 [xfs]
[ 1463.755330]  xfs_inode_free_eofblocks+0xd3/0x1e0 [xfs]
[ 1463.755459]  ? xfs_inode_ag_walk_grab+0x5b/0x90 [xfs]
[ 1463.755586]  xfs_inode_ag_walk.isra.15+0x1aa/0x420 [xfs]
[ 1463.755714]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
[ 1463.755727]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 1463.755734]  ? __switch_to_asm+0x40/0x70
[ 1463.755738]  ? __switch_to_asm+0x34/0x70
[ 1463.755743]  ? __switch_to_asm+0x40/0x70
[ 1463.755748]  ? __switch_to_asm+0x34/0x70
[ 1463.755752]  ? __switch_to_asm+0x40/0x70
[ 1463.755757]  ? __switch_to_asm+0x34/0x70
[ 1463.755762]  ? __switch_to_asm+0x40/0x70
[ 1463.755893]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
[ 1463.755900]  ? radix_tree_gang_lookup_tag+0xc2/0x140
[ 1463.756032]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
[ 1463.756158]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
[ 1463.756288]  xfs_eofblocks_worker+0x29/0x40 [xfs]
[ 1463.756298]  process_one_work+0x1fd/0x420
[ 1463.756305]  worker_thread+0x2d/0x3d0
[ 1463.756311]  ? rescuer_thread+0x340/0x340
[ 1463.756316]  kthread+0x112/0x130
[ 1463.756322]  ? kthread_create_worker_on_cpu+0x40/0x40
[ 1463.756329]  ret_from_fork+0x3a/0x50
[ 1463.756375] INFO: task kworker/u4:0:4615 blocked for more than 480 seconds.
[ 1463.756379]       Not tainted 4.19.5-1-default #1
[ 1463.756380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1463.756383] kworker/u4:0    D    0  4615      2 0x80000000
[ 1463.756395] Workqueue: writeback wb_workfn (flush-9:1)
[ 1463.756400] Call Trace:
[ 1463.756409]  ? __schedule+0x29a/0x880
[ 1463.756420]  ? wait_barrier+0xdd/0x170 [raid10]
[ 1463.756426]  schedule+0x78/0x110
[ 1463.756433]  wait_barrier+0xdd/0x170 [raid10]
[ 1463.756440]  ? wait_woken+0x80/0x80
[ 1463.756448]  raid10_write_request+0xf2/0x900 [raid10]
[ 1463.756454]  ? wait_woken+0x80/0x80
[ 1463.756459]  ? mempool_alloc+0x55/0x160
[ 1463.756483]  ? md_write_start+0xa9/0x270 [md_mod]
[ 1463.756492]  raid10_make_request+0xc1/0x120 [raid10]
[ 1463.756498]  ? wait_woken+0x80/0x80
[ 1463.756514]  md_handle_request+0x121/0x190 [md_mod]
[ 1463.756535]  md_make_request+0x78/0x190 [md_mod]
[ 1463.756544]  generic_make_request+0x1c6/0x470
[ 1463.756553]  submit_bio+0x45/0x140
[ 1463.756714]  xfs_submit_ioend+0x9c/0x1e0 [xfs]
[ 1463.756844]  xfs_vm_writepages+0x68/0x80 [xfs]
[ 1463.756856]  do_writepages+0x31/0xb0
[ 1463.756865]  ? read_hpet+0x126/0x130
[ 1463.756873]  ? ktime_get+0x36/0xa0
[ 1463.756881]  __writeback_single_inode+0x3d/0x3e0
[ 1463.756889]  writeback_sb_inodes+0x1c4/0x430
[ 1463.756902]  __writeback_inodes_wb+0x5d/0xb0
[ 1463.756910]  wb_writeback+0x26b/0x310
[ 1463.756920]  wb_workfn+0x33a/0x410
[ 1463.756932]  process_one_work+0x1fd/0x420
[ 1463.756940]  worker_thread+0x2d/0x3d0
[ 1463.756946]  ? rescuer_thread+0x340/0x340
[ 1463.756951]  kthread+0x112/0x130
[ 1463.756957]  ? kthread_create_worker_on_cpu+0x40/0x40
[ 1463.756965]  ret_from_fork+0x3a/0x50
[ 1463.756979] INFO: task kworker/0:2:4994 blocked for more than 480 seconds.
[ 1463.756982]       Not tainted 4.19.5-1-default #1
[ 1463.756984] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1463.756987] kworker/0:2     D    0  4994      2 0x80000000
[ 1463.757013] Workqueue: md submit_flushes [md_mod]
[ 1463.757016] Call Trace:
[ 1463.757024]  ? __schedule+0x29a/0x880
[ 1463.757034]  ? wait_barrier+0xdd/0x170 [raid10]
[ 1463.757039]  schedule+0x78/0x110
[ 1463.757047]  wait_barrier+0xdd/0x170 [raid10]
[ 1463.757054]  ? wait_woken+0x80/0x80
[ 1463.757062]  raid10_write_request+0xf2/0x900 [raid10]
[ 1463.757067]  ? wait_woken+0x80/0x80
[ 1463.757072]  ? mempool_alloc+0x55/0x160
[ 1463.757088]  ? md_write_start+0xa9/0x270 [md_mod]
[ 1463.757095]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 1463.757104]  raid10_make_request+0xc1/0x120 [raid10]
[ 1463.757110]  ? wait_woken+0x80/0x80
[ 1463.757126]  md_handle_request+0x121/0x190 [md_mod]
[ 1463.757132]  ? _raw_spin_unlock_irq+0x22/0x40
[ 1463.757137]  ? finish_task_switch+0x74/0x260
[ 1463.757156]  submit_flushes+0x21/0x40 [md_mod]
[ 1463.757163]  process_one_work+0x1fd/0x420
[ 1463.757170]  worker_thread+0x2d/0x3d0
[ 1463.757177]  ? rescuer_thread+0x340/0x340
[ 1463.757181]  kthread+0x112/0x130
[ 1463.757186]  ? kthread_create_worker_on_cpu+0x40/0x40
[ 1463.757193]  ret_from_fork+0x3a/0x50
[ 1463.757205] INFO: task md1_resync:5215 blocked for more than 480 seconds.
[ 1463.757207]       Not tainted 4.19.5-1-default #1
[ 1463.757209] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1463.757212] md1_resync      D    0  5215      2 0x80000000
[ 1463.757216] Call Trace:
[ 1463.757223]  ? __schedule+0x29a/0x880
[ 1463.757231]  ? raise_barrier+0x8d/0x140 [raid10]
[ 1463.757236]  schedule+0x78/0x110
[ 1463.757243]  raise_barrier+0x8d/0x140 [raid10]
[ 1463.757248]  ? wait_woken+0x80/0x80
[ 1463.757257]  raid10_sync_request+0x1f6/0x1e30 [raid10]
[ 1463.757265]  ? _raw_spin_unlock_irq+0x22/0x40
[ 1463.757284]  ? is_mddev_idle+0x125/0x137 [md_mod]
[ 1463.757302]  md_do_sync.cold.78+0x404/0x969 [md_mod]
[ 1463.757311]  ? wait_woken+0x80/0x80
[ 1463.757336]  ? md_rdev_init+0xb0/0xb0 [md_mod]
[ 1463.757351]  md_thread+0xe9/0x140 [md_mod]
[ 1463.757358]  ? _raw_spin_unlock_irqrestore+0x2e/0x60
[ 1463.757364]  ? __kthread_parkme+0x4c/0x70
[ 1463.757369]  kthread+0x112/0x130
[ 1463.757374]  ? kthread_create_worker_on_cpu+0x40/0x40
[ 1463.757380]  ret_from_fork+0x3a/0x50
[ 1463.757395] INFO: task xfsaild/md1:5233 blocked for more than 480 seconds.
[ 1463.757398]       Not tainted 4.19.5-1-default #1
[ 1463.757400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1463.757402] xfsaild/md1     D    0  5233      2 0x80000000
[ 1463.757406] Call Trace:
[ 1463.757413]  ? __schedule+0x29a/0x880
[ 1463.757421]  ? wait_barrier+0xdd/0x170 [raid10]
[ 1463.757426]  schedule+0x78/0x110
[ 1463.757433]  wait_barrier+0xdd/0x170 [raid10]
[ 1463.757438]  ? wait_woken+0x80/0x80
[ 1463.757446]  raid10_write_request+0xf2/0x900 [raid10]
[ 1463.757451]  ? wait_woken+0x80/0x80
[ 1463.757455]  ? mempool_alloc+0x55/0x160
[ 1463.757471]  ? md_write_start+0xa9/0x270 [md_mod]
[ 1463.757477]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 1463.757485]  raid10_make_request+0xc1/0x120 [raid10]
[ 1463.757491]  ? wait_woken+0x80/0x80
[ 1463.757507]  md_handle_request+0x121/0x190 [md_mod]
[ 1463.757527]  md_make_request+0x78/0x190 [md_mod]
[ 1463.757536]  generic_make_request+0x1c6/0x470
[ 1463.757544]  submit_bio+0x45/0x140
[ 1463.757552]  ? bio_add_page+0x48/0x60
[ 1463.757716]  _xfs_buf_ioapply+0x2c1/0x450 [xfs]
[ 1463.757849]  ? xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
[ 1463.757974]  __xfs_buf_submit+0x67/0x270 [xfs]
[ 1463.758102]  xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
[ 1463.758232]  ? xfsaild+0x294/0x7e0 [xfs]
[ 1463.758364]  xfsaild+0x294/0x7e0 [xfs]
[ 1463.758377]  ? _raw_spin_unlock_irqrestore+0x2e/0x60
[ 1463.758508]  ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
[ 1463.758514]  kthread+0x112/0x130
[ 1463.758520]  ? kthread_create_worker_on_cpu+0x40/0x40
[ 1463.758527]  ret_from_fork+0x3a/0x50
[ 1463.758543] INFO: task rpm:5364 blocked for more than 480 seconds.
[ 1463.758546]       Not tainted 4.19.5-1-default #1
[ 1463.758547] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1463.758550] rpm             D    0  5364   3757 0x00000000
[ 1463.758554] Call Trace:
[ 1463.758563]  ? __schedule+0x29a/0x880
[ 1463.758701]  ? xlog_wait+0x5c/0x70 [xfs]
[ 1463.759821]  schedule+0x78/0x110
[ 1463.760022]  xlog_wait+0x5c/0x70 [xfs]
[ 1463.760036]  ? wake_up_q+0x70/0x70
[ 1463.760167]  __xfs_log_force_lsn+0x223/0x230 [xfs]
[ 1463.760297]  ? xfs_file_fsync+0x196/0x1d0 [xfs]
[ 1463.760424]  xfs_log_force_lsn+0x93/0x140 [xfs]
[ 1463.760552]  xfs_file_fsync+0x196/0x1d0 [xfs]
[ 1463.760562]  ? __sb_end_write+0x36/0x60
[ 1463.760571]  do_fsync+0x38/0x70
[ 1463.760578]  __x64_sys_fdatasync+0x13/0x20
[ 1463.760585]  do_syscall_64+0x60/0x110
[ 1463.760594]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1463.760603] RIP: 0033:0x7f9757fae8a4
[ 1463.760616] Code: Bad RIP value.
[ 1463.760619] RSP: 002b:00007fff74fdb428 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
[ 1463.760654] RAX: ffffffffffffffda RBX: 0000000000000064 RCX: 00007f9757fae8a4
[ 1463.760657] RDX: 00000000012c4c60 RSI: 00000000012cc130 RDI: 0000000000000004
[ 1463.760660] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f9758708c00
[ 1463.760662] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000012cc130
[ 1463.760665] R13: 000000000123a3a0 R14: 0000000000010830 R15: 0000000000000062
[ 1463.760679] INFO: task kworker/0:8:5367 blocked for more than 480 seconds.
[ 1463.760683]       Not tainted 4.19.5-1-default #1
[ 1463.760684] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1463.760687] kworker/0:8     D    0  5367      2 0x80000000
[ 1463.760718] Workqueue: md submit_flushes [md_mod]
[ 1463.760721] Call Trace:
[ 1463.760731]  ? __schedule+0x29a/0x880
[ 1463.760741]  ? wait_barrier+0xdd/0x170 [raid10]
[ 1463.760746]  schedule+0x78/0x110
[ 1463.760753]  wait_barrier+0xdd/0x170 [raid10]
[ 1463.760761]  ? wait_woken+0x80/0x80
[ 1463.760768]  raid10_write_request+0xf2/0x900 [raid10]
[ 1463.760774]  ? wait_woken+0x80/0x80
[ 1463.760778]  ? mempool_alloc+0x55/0x160
[ 1463.760795]  ? md_write_start+0xa9/0x270 [md_mod]
[ 1463.760801]  ? try_to_wake_up+0x44/0x470
[ 1463.760810]  raid10_make_request+0xc1/0x120 [raid10]
[ 1463.760816]  ? wait_woken+0x80/0x80
[ 1463.760831]  md_handle_request+0x121/0x190 [md_mod]
[ 1463.760851]  md_make_request+0x78/0x190 [md_mod]
[ 1463.760860]  generic_make_request+0x1c6/0x470
[ 1463.760870]  raid10_write_request+0x77a/0x900 [raid10]
[ 1463.760875]  ? wait_woken+0x80/0x80
[ 1463.760879]  ? mempool_alloc+0x55/0x160
[ 1463.760895]  ? md_write_start+0xa9/0x270 [md_mod]
[ 1463.760904]  raid10_make_request+0xc1/0x120 [raid10]
[ 1463.760910]  ? wait_woken+0x80/0x80
[ 1463.760926]  md_handle_request+0x121/0x190 [md_mod]
[ 1463.760931]  ? _raw_spin_unlock_irq+0x22/0x40
[ 1463.760936]  ? finish_task_switch+0x74/0x260
[ 1463.760954]  submit_flushes+0x21/0x40 [md_mod]
[ 1463.760962]  process_one_work+0x1fd/0x420
[ 1463.760970]  worker_thread+0x2d/0x3d0
[ 1463.760976]  ? rescuer_thread+0x340/0x340
[ 1463.760981]  kthread+0x112/0x130
[ 1463.760986]  ? kthread_create_worker_on_cpu+0x40/0x40
[ 1463.760992]  ret_from_fork+0x3a/0x50

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-12-18 15:01 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-12-12 12:29 XFS and RAID10 with o2 layout Sinisa
2018-12-12 14:30 ` Brian Foster
2018-12-13  8:21   ` Sinisa
2018-12-13 12:28     ` Brian Foster
2018-12-13 13:02       ` Sinisa
2018-12-13 17:30         ` keld
2018-12-14  6:59           ` Sinisa
     [not found]   ` <0a33a20d-5f49-7b34-3662-5b818c67621a@suse.com>
     [not found]     ` <48ba331d-a896-f532-2c75-cf94ddf87b60@4net.rs>
2018-12-17 15:04       ` Sinisa
2018-12-18 15:01     ` Sinisa
2018-12-13 22:05 ` Dave Chinner
2018-12-14  7:03   ` Sinisa
2018-12-14  8:26     ` Wols Lists
2018-12-14 20:44       ` John Stoffel
2018-12-15 15:36         ` Siniša Bandin
2018-12-14 21:20     ` Dave Chinner
2018-12-14 11:39 ` Sinisa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox