XFS and RAID10 with o2 layout

public inbox for linux-xfs@vger.kernel.org

* XFS and RAID10 with o2 layout
@ 2018-12-12 12:29 Sinisa
  2018-12-12 14:30 ` Brian Foster
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Sinisa @ 2018-12-12 12:29 UTC
  To: linux-xfs

Hello group,

I have noticed something strange going on lately, and I have now come to the 
conclusion that there is some unwanted interaction between XFS and Linux RAID10 
with "offset" layout.

So here is the problem: I create a Linux RAID10 mirror with 2 disks (HDD or 
SSD) and "o2" layout (best choice for read and write speed):
# mdadm -C -n2 -l10 -po2 /dev/mdX /dev/sdaX /dev/sdbX
# mkfs.xfs /dev/mdX
# mount /dev/mdX /mnt
# rsync -avxDPHS / /mnt

So we have RAID10 initializing:

# cat /proc/mdstat
Personalities : [raid1] [raid10]
md2 : active raid10 sdb3[1] sda3[0]
       314433536 blocks super 1.2 4096K chunks 2 offset-copies [2/2] [UU]
       [==>..................]  resync = 11.7% (36917568/314433536) finish=8678.2min speed=532K/sec
       bitmap: 3/3 pages [12KB], 65536KB chunk

but after a few minutes everything stops, as you can see above. Rsync (or any 
other process writing to that md device) also freezes. If I try to read the 
already copied files, that freezes too, usually with less than 2GB copied.
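
A stalled resync like the one above can be confirmed by sampling the progress line twice and comparing. The following is only a minimal sketch: the function name mdstat_resync_progress is made up for illustration, and it assumes the progress line keeps the "resync = NN.N% ... speed=..." shape shown above.

```shell
# Hypothetical helper: print "<device> <percent> <speed>" for every array
# that reports a resync in /proc/mdstat-style text read from stdin.
mdstat_resync_progress() {
    awk '
        /^md[0-9]+ :/ { dev = $1 }          # remember the current array name
        /resync *=/ {
            pct = ""; speed = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/)      pct = $i
                if ($i ~ /^speed=/) speed = substr($i, 7)
            }
            sub(/^=/, "", pct)              # "=100.0%" has no space after "="
            print dev, pct, speed
        }
    '
}
```

Running "mdstat_resync_progress < /proc/mdstat" a minute apart and getting the same percentage both times would match the freeze described here.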

Sometimes in dmesg I get some kernel messages about "task kworker/2:1:55 
blocked for more than 480 seconds." (please see attached dmesg.txt and my 
reports here: https://bugzilla.opensuse.org/show_bug.cgi?id=1111073), sometimes 
nothing at all. When this happens, I can only reboot with SysRq-b or 
"physically" with reset/power button.

The same thing can happen with the "far" layout, but it seems to me that it does 
not happen every time (or as often). I might be wrong, because I never use the 
"far" layout in real life, only for testing.
I was unable to reproduce the failure with the "near" layout.

Also, with EXT4 or BTRFS and any layout everything works just as it should; that 
is, the resync goes on until finished, and rsync, cp, or any other write works 
just fine at the same time.

Let me just add that I first saw this behavior in openSUSE LEAP 15.0 (kernel 
4.12). In previous versions (up to kernel 4.4) I never had this problem. In the 
meantime I have tested with kernels up to 4.20rc and it is the same. 
Unfortunately I cannot go back to test kernels 4.5 - 4.11 to pinpoint the 
moment the problem first appeared.



-- 
Best regards,
Siniša Bandin
(excuse my English)


[-- Attachment #2: dmesg.txt --]
[-- Type: text/plain, Size: 13412 bytes --]

[  180.981499] SGI XFS with ACLs, security attributes, no debug enabled
[  181.005019] XFS (md1): Mounting V5 Filesystem
[  181.132076] XFS (md1): Starting recovery (logdev: internal)
[  181.295606] XFS (md1): Ending recovery (logdev: internal)
[  181.804011] XFS (md1): Unmounting Filesystem
[  182.201794] XFS (md127): Mounting V4 Filesystem
[  182.736958] md: recovery of RAID array md127
[  182.915479] XFS (md127): Ending clean mount
[  183.819702] XFS (md127): Unmounting Filesystem
[  184.943831] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
[  529.784557] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
[  601.789958] md1: detected capacity change from 33284947968 to 0
[  601.789973] md: md1 stopped.
[  602.314112] md0: detected capacity change from 550436864 to 0
[  602.314128] md: md0 stopped.
[  602.745030] md: md127: recovery interrupted.
[  603.131684] md127: detected capacity change from 966229229568 to 0
[  603.132237] md: md127 stopped.
[  603.435808]  sda: sda1 sda2
[  603.594074] udevd[5011]: inotify_add_watch(11, /dev/sda2, 10) failed: No such file or directory
[  603.643959]  sda:
[  603.844724]  sdb: sdb1 sdb2
[  604.255407]  sdb: sdb1
[  604.490214] udevd[5050]: inotify_add_watch(11, /dev/sdb1, 10) failed: No such file or directory
[  605.140952]  sdb: sdb1
[  605.628686]  sdb: sdb1 sdb2
[  606.271192]  sdb: sdb1 sdb2 sdb3
[  607.079626]  sdb: sdb1 sdb2 sdb3
[  607.611092]  sda:
[  608.273201]  sda: sda1
[  608.611952]  sda: sda1 sda2
[  609.031326]  sda: sda1 sda2 sda3
[  609.753140] md/raid10:md1: not clean -- starting background reconstruction
[  609.753145] md/raid10:md1: active with 2 out of 2 devices
[  609.768804] md1: detected capacity change from 0 to 32210157568
[  609.772677] md: resync of RAID array md1
[  614.590107] XFS (md1): Mounting V5 Filesystem
[  615.449035] XFS (md1): Ending clean mount
[  617.678462] md/raid1:md0: not clean -- starting background reconstruction
[  617.678469] md/raid1:md0: active with 2 out of 2 mirrors
[  617.740729] md0: detected capacity change from 0 to 524222464
[  617.747107] md: delaying resync of md0 until md1 has finished (they share one or more physical units)
[  620.037818] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
[ 1463.754785] INFO: task kworker/0:3:227 blocked for more than 480 seconds.
[ 1463.754793]       Not tainted 4.19.5-1-default #1
[ 1463.754795] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1463.754799] kworker/0:3     D    0   227      2 0x80000000
[ 1463.755000] Workqueue: xfs-eofblocks/md1 xfs_eofblocks_worker [xfs]
[ 1463.755005] Call Trace:
[ 1463.755025]  ? __schedule+0x29a/0x880
[ 1463.755032]  ? rwsem_down_write_failed+0x197/0x350
[ 1463.755038]  schedule+0x78/0x110
[ 1463.755044]  rwsem_down_write_failed+0x197/0x350
[ 1463.755055]  call_rwsem_down_write_failed+0x13/0x20
[ 1463.755061]  down_write+0x20/0x30
[ 1463.755196]  xfs_free_eofblocks+0x114/0x1a0 [xfs]
[ 1463.755330]  xfs_inode_free_eofblocks+0xd3/0x1e0 [xfs]
[ 1463.755459]  ? xfs_inode_ag_walk_grab+0x5b/0x90 [xfs]
[ 1463.755586]  xfs_inode_ag_walk.isra.15+0x1aa/0x420 [xfs]
[ 1463.755714]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
[ 1463.755727]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 1463.755734]  ? __switch_to_asm+0x40/0x70
[ 1463.755738]  ? __switch_to_asm+0x34/0x70
[ 1463.755743]  ? __switch_to_asm+0x40/0x70
[ 1463.755748]  ? __switch_to_asm+0x34/0x70
[ 1463.755752]  ? __switch_to_asm+0x40/0x70
[ 1463.755757]  ? __switch_to_asm+0x34/0x70
[ 1463.755762]  ? __switch_to_asm+0x40/0x70
[ 1463.755893]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
[ 1463.755900]  ? radix_tree_gang_lookup_tag+0xc2/0x140
[ 1463.756032]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
[ 1463.756158]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
[ 1463.756288]  xfs_eofblocks_worker+0x29/0x40 [xfs]
[ 1463.756298]  process_one_work+0x1fd/0x420
[ 1463.756305]  worker_thread+0x2d/0x3d0
[ 1463.756311]  ? rescuer_thread+0x340/0x340
[ 1463.756316]  kthread+0x112/0x130
[ 1463.756322]  ? kthread_create_worker_on_cpu+0x40/0x40
[ 1463.756329]  ret_from_fork+0x3a/0x50
[ 1463.756375] INFO: task kworker/u4:0:4615 blocked for more than 480 seconds.
[ 1463.756379]       Not tainted 4.19.5-1-default #1
[ 1463.756380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1463.756383] kworker/u4:0    D    0  4615      2 0x80000000
[ 1463.756395] Workqueue: writeback wb_workfn (flush-9:1)
[ 1463.756400] Call Trace:
[ 1463.756409]  ? __schedule+0x29a/0x880
[ 1463.756420]  ? wait_barrier+0xdd/0x170 [raid10]
[ 1463.756426]  schedule+0x78/0x110
[ 1463.756433]  wait_barrier+0xdd/0x170 [raid10]
[ 1463.756440]  ? wait_woken+0x80/0x80
[ 1463.756448]  raid10_write_request+0xf2/0x900 [raid10]
[ 1463.756454]  ? wait_woken+0x80/0x80
[ 1463.756459]  ? mempool_alloc+0x55/0x160
[ 1463.756483]  ? md_write_start+0xa9/0x270 [md_mod]
[ 1463.756492]  raid10_make_request+0xc1/0x120 [raid10]
[ 1463.756498]  ? wait_woken+0x80/0x80
[ 1463.756514]  md_handle_request+0x121/0x190 [md_mod]
[ 1463.756535]  md_make_request+0x78/0x190 [md_mod]
[ 1463.756544]  generic_make_request+0x1c6/0x470
[ 1463.756553]  submit_bio+0x45/0x140
[ 1463.756714]  xfs_submit_ioend+0x9c/0x1e0 [xfs]
[ 1463.756844]  xfs_vm_writepages+0x68/0x80 [xfs]
[ 1463.756856]  do_writepages+0x31/0xb0
[ 1463.756865]  ? read_hpet+0x126/0x130
[ 1463.756873]  ? ktime_get+0x36/0xa0
[ 1463.756881]  __writeback_single_inode+0x3d/0x3e0
[ 1463.756889]  writeback_sb_inodes+0x1c4/0x430
[ 1463.756902]  __writeback_inodes_wb+0x5d/0xb0
[ 1463.756910]  wb_writeback+0x26b/0x310
[ 1463.756920]  wb_workfn+0x33a/0x410
[ 1463.756932]  process_one_work+0x1fd/0x420
[ 1463.756940]  worker_thread+0x2d/0x3d0
[ 1463.756946]  ? rescuer_thread+0x340/0x340
[ 1463.756951]  kthread+0x112/0x130
[ 1463.756957]  ? kthread_create_worker_on_cpu+0x40/0x40
[ 1463.756965]  ret_from_fork+0x3a/0x50
[ 1463.756979] INFO: task kworker/0:2:4994 blocked for more than 480 seconds.
[ 1463.756982]       Not tainted 4.19.5-1-default #1
[ 1463.756984] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1463.756987] kworker/0:2     D    0  4994      2 0x80000000
[ 1463.757013] Workqueue: md submit_flushes [md_mod]
[ 1463.757016] Call Trace:
[ 1463.757024]  ? __schedule+0x29a/0x880
[ 1463.757034]  ? wait_barrier+0xdd/0x170 [raid10]
[ 1463.757039]  schedule+0x78/0x110
[ 1463.757047]  wait_barrier+0xdd/0x170 [raid10]
[ 1463.757054]  ? wait_woken+0x80/0x80
[ 1463.757062]  raid10_write_request+0xf2/0x900 [raid10]
[ 1463.757067]  ? wait_woken+0x80/0x80
[ 1463.757072]  ? mempool_alloc+0x55/0x160
[ 1463.757088]  ? md_write_start+0xa9/0x270 [md_mod]
[ 1463.757095]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 1463.757104]  raid10_make_request+0xc1/0x120 [raid10]
[ 1463.757110]  ? wait_woken+0x80/0x80
[ 1463.757126]  md_handle_request+0x121/0x190 [md_mod]
[ 1463.757132]  ? _raw_spin_unlock_irq+0x22/0x40
[ 1463.757137]  ? finish_task_switch+0x74/0x260
[ 1463.757156]  submit_flushes+0x21/0x40 [md_mod]
[ 1463.757163]  process_one_work+0x1fd/0x420
[ 1463.757170]  worker_thread+0x2d/0x3d0
[ 1463.757177]  ? rescuer_thread+0x340/0x340
[ 1463.757181]  kthread+0x112/0x130
[ 1463.757186]  ? kthread_create_worker_on_cpu+0x40/0x40
[ 1463.757193]  ret_from_fork+0x3a/0x50
[ 1463.757205] INFO: task md1_resync:5215 blocked for more than 480 seconds.
[ 1463.757207]       Not tainted 4.19.5-1-default #1
[ 1463.757209] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1463.757212] md1_resync      D    0  5215      2 0x80000000
[ 1463.757216] Call Trace:
[ 1463.757223]  ? __schedule+0x29a/0x880
[ 1463.757231]  ? raise_barrier+0x8d/0x140 [raid10]
[ 1463.757236]  schedule+0x78/0x110
[ 1463.757243]  raise_barrier+0x8d/0x140 [raid10]
[ 1463.757248]  ? wait_woken+0x80/0x80
[ 1463.757257]  raid10_sync_request+0x1f6/0x1e30 [raid10]
[ 1463.757265]  ? _raw_spin_unlock_irq+0x22/0x40
[ 1463.757284]  ? is_mddev_idle+0x125/0x137 [md_mod]
[ 1463.757302]  md_do_sync.cold.78+0x404/0x969 [md_mod]
[ 1463.757311]  ? wait_woken+0x80/0x80
[ 1463.757336]  ? md_rdev_init+0xb0/0xb0 [md_mod]
[ 1463.757351]  md_thread+0xe9/0x140 [md_mod]
[ 1463.757358]  ? _raw_spin_unlock_irqrestore+0x2e/0x60
[ 1463.757364]  ? __kthread_parkme+0x4c/0x70
[ 1463.757369]  kthread+0x112/0x130
[ 1463.757374]  ? kthread_create_worker_on_cpu+0x40/0x40
[ 1463.757380]  ret_from_fork+0x3a/0x50
[ 1463.757395] INFO: task xfsaild/md1:5233 blocked for more than 480 seconds.
[ 1463.757398]       Not tainted 4.19.5-1-default #1
[ 1463.757400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1463.757402] xfsaild/md1     D    0  5233      2 0x80000000
[ 1463.757406] Call Trace:
[ 1463.757413]  ? __schedule+0x29a/0x880
[ 1463.757421]  ? wait_barrier+0xdd/0x170 [raid10]
[ 1463.757426]  schedule+0x78/0x110
[ 1463.757433]  wait_barrier+0xdd/0x170 [raid10]
[ 1463.757438]  ? wait_woken+0x80/0x80
[ 1463.757446]  raid10_write_request+0xf2/0x900 [raid10]
[ 1463.757451]  ? wait_woken+0x80/0x80
[ 1463.757455]  ? mempool_alloc+0x55/0x160
[ 1463.757471]  ? md_write_start+0xa9/0x270 [md_mod]
[ 1463.757477]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 1463.757485]  raid10_make_request+0xc1/0x120 [raid10]
[ 1463.757491]  ? wait_woken+0x80/0x80
[ 1463.757507]  md_handle_request+0x121/0x190 [md_mod]
[ 1463.757527]  md_make_request+0x78/0x190 [md_mod]
[ 1463.757536]  generic_make_request+0x1c6/0x470
[ 1463.757544]  submit_bio+0x45/0x140
[ 1463.757552]  ? bio_add_page+0x48/0x60
[ 1463.757716]  _xfs_buf_ioapply+0x2c1/0x450 [xfs]
[ 1463.757849]  ? xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
[ 1463.757974]  __xfs_buf_submit+0x67/0x270 [xfs]
[ 1463.758102]  xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
[ 1463.758232]  ? xfsaild+0x294/0x7e0 [xfs]
[ 1463.758364]  xfsaild+0x294/0x7e0 [xfs]
[ 1463.758377]  ? _raw_spin_unlock_irqrestore+0x2e/0x60
[ 1463.758508]  ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
[ 1463.758514]  kthread+0x112/0x130
[ 1463.758520]  ? kthread_create_worker_on_cpu+0x40/0x40
[ 1463.758527]  ret_from_fork+0x3a/0x50
[ 1463.758543] INFO: task rpm:5364 blocked for more than 480 seconds.
[ 1463.758546]       Not tainted 4.19.5-1-default #1
[ 1463.758547] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1463.758550] rpm             D    0  5364   3757 0x00000000
[ 1463.758554] Call Trace:
[ 1463.758563]  ? __schedule+0x29a/0x880
[ 1463.758701]  ? xlog_wait+0x5c/0x70 [xfs]
[ 1463.759821]  schedule+0x78/0x110
[ 1463.760022]  xlog_wait+0x5c/0x70 [xfs]
[ 1463.760036]  ? wake_up_q+0x70/0x70
[ 1463.760167]  __xfs_log_force_lsn+0x223/0x230 [xfs]
[ 1463.760297]  ? xfs_file_fsync+0x196/0x1d0 [xfs]
[ 1463.760424]  xfs_log_force_lsn+0x93/0x140 [xfs]
[ 1463.760552]  xfs_file_fsync+0x196/0x1d0 [xfs]
[ 1463.760562]  ? __sb_end_write+0x36/0x60
[ 1463.760571]  do_fsync+0x38/0x70
[ 1463.760578]  __x64_sys_fdatasync+0x13/0x20
[ 1463.760585]  do_syscall_64+0x60/0x110
[ 1463.760594]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1463.760603] RIP: 0033:0x7f9757fae8a4
[ 1463.760616] Code: Bad RIP value.
[ 1463.760619] RSP: 002b:00007fff74fdb428 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
[ 1463.760654] RAX: ffffffffffffffda RBX: 0000000000000064 RCX: 00007f9757fae8a4
[ 1463.760657] RDX: 00000000012c4c60 RSI: 00000000012cc130 RDI: 0000000000000004
[ 1463.760660] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f9758708c00
[ 1463.760662] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000012cc130
[ 1463.760665] R13: 000000000123a3a0 R14: 0000000000010830 R15: 0000000000000062
[ 1463.760679] INFO: task kworker/0:8:5367 blocked for more than 480 seconds.
[ 1463.760683]       Not tainted 4.19.5-1-default #1
[ 1463.760684] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1463.760687] kworker/0:8     D    0  5367      2 0x80000000
[ 1463.760718] Workqueue: md submit_flushes [md_mod]
[ 1463.760721] Call Trace:
[ 1463.760731]  ? __schedule+0x29a/0x880
[ 1463.760741]  ? wait_barrier+0xdd/0x170 [raid10]
[ 1463.760746]  schedule+0x78/0x110
[ 1463.760753]  wait_barrier+0xdd/0x170 [raid10]
[ 1463.760761]  ? wait_woken+0x80/0x80
[ 1463.760768]  raid10_write_request+0xf2/0x900 [raid10]
[ 1463.760774]  ? wait_woken+0x80/0x80
[ 1463.760778]  ? mempool_alloc+0x55/0x160
[ 1463.760795]  ? md_write_start+0xa9/0x270 [md_mod]
[ 1463.760801]  ? try_to_wake_up+0x44/0x470
[ 1463.760810]  raid10_make_request+0xc1/0x120 [raid10]
[ 1463.760816]  ? wait_woken+0x80/0x80
[ 1463.760831]  md_handle_request+0x121/0x190 [md_mod]
[ 1463.760851]  md_make_request+0x78/0x190 [md_mod]
[ 1463.760860]  generic_make_request+0x1c6/0x470
[ 1463.760870]  raid10_write_request+0x77a/0x900 [raid10]
[ 1463.760875]  ? wait_woken+0x80/0x80
[ 1463.760879]  ? mempool_alloc+0x55/0x160
[ 1463.760895]  ? md_write_start+0xa9/0x270 [md_mod]
[ 1463.760904]  raid10_make_request+0xc1/0x120 [raid10]
[ 1463.760910]  ? wait_woken+0x80/0x80
[ 1463.760926]  md_handle_request+0x121/0x190 [md_mod]
[ 1463.760931]  ? _raw_spin_unlock_irq+0x22/0x40
[ 1463.760936]  ? finish_task_switch+0x74/0x260
[ 1463.760954]  submit_flushes+0x21/0x40 [md_mod]
[ 1463.760962]  process_one_work+0x1fd/0x420
[ 1463.760970]  worker_thread+0x2d/0x3d0
[ 1463.760976]  ? rescuer_thread+0x340/0x340
[ 1463.760981]  kthread+0x112/0x130
[ 1463.760986]  ? kthread_create_worker_on_cpu+0x40/0x40
[ 1463.760992]  ret_from_fork+0x3a/0x50
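
For a log of this length it can help to list just the tasks that the hung-task detector reported. This is a minimal sketch that assumes the "INFO: task NAME:PID blocked for more than" wording seen in this log; the function name hung_tasks is made up for illustration.

```shell
# Hypothetical helper: print "<task name> <pid>" for each hung-task report
# found in dmesg-style text read from stdin.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than .*/\1 \2/p'
}
```

Typical use would be "dmesg | hung_tasks" or "hung_tasks < dmesg.txt".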


* Re: XFS and RAID10 with o2 layout
  2018-12-12 12:29 XFS and RAID10 with o2 layout Sinisa
@ 2018-12-12 14:30 ` Brian Foster
  2018-12-13  8:21   ` Sinisa
       [not found]   ` <0a33a20d-5f49-7b34-3662-5b818c67621a@suse.com>
  2018-12-13 22:05 ` Dave Chinner
  2018-12-14 11:39 ` Sinisa
  2 siblings, 2 replies; 16+ messages in thread
From: Brian Foster @ 2018-12-12 14:30 UTC
  To: Sinisa; +Cc: linux-xfs, linux-raid

cc linux-raid

On Wed, Dec 12, 2018 at 01:29:49PM +0100, Sinisa wrote:
> Hello group,
> 
> I have noticed something strange going on lately, but recently I have come
> to conclusion that there is some unwanted interaction between XFS and Linux
> RAID10 with "offset" layout.
> 
> So here is the problem: I create a Linux RAID10 mirror with 2 disks (HDD or
> SSD) and "o2" layout (best choice for read and write speed):
> # mdadm -C -n2 -l10 -po2 /dev/mdX /dev/sdaX /dev/sdbX
> # mkfs.xfs /dev/mdX
> # mount /dev/mdX /mnt
> # rsync -avxDPHS / /mnt
> 
> So we have RAID10 initializing:
> 
> # cat /proc/mdstat
> Personalities : [raid1] [raid10]
> md2 : active raid10 sdb3[1] sda3[0]
>       314433536 blocks super 1.2 4096K chunks 2 offset-copies [2/2] [UU]
>       [==>..................]  resync = 11.7% (36917568/314433536)
> finish=8678.2min speed=532K/sec
>       bitmap: 3/3 pages [12KB], 65536KB chunk
> 
> but after a few minutes everything stops like you can see above. Rsync (or
> any other process writing to that md device) also freezes. If I try to read
> already copied files - freeze, usually with less that 2GB copied.
> 

Does the same thing happen without the RAID initialization? E.g., if you
wait for it to complete or (IIRC) if you create with --assume-clean? I
assume the init-in-progress state is common with your tests on other
filesystems?
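
The two variants asked about here could be tried roughly as follows. This is a dry-run sketch only: it prints the commands instead of running them, the function name repro_commands is made up, and the device names are the same placeholders as in the original report. --assume-clean and --wait are existing mdadm options; everything else is taken from the commands quoted above.

```shell
# Hypothetical dry-run helper: print (do not run) the commands for one test.
#   $1: "assume-clean" (skip the initial resync) or "wait" (let it finish)
#   $2: md device; $3 and $4: member partitions (all placeholders)
repro_commands() {
    variant=$1 md=$2 a=$3 b=$4
    case $variant in
        assume-clean)
            echo "mdadm -C -n2 -l10 -po2 --assume-clean $md $a $b"
            ;;
        wait)
            echo "mdadm -C -n2 -l10 -po2 $md $a $b"
            echo "mdadm --wait $md"      # returns once the resync has finished
            ;;
        *)
            echo "unknown variant: $variant" >&2
            return 1
            ;;
    esac
    echo "mkfs.xfs $md"
    echo "mount $md /mnt"
    echo "rsync -avxDPHS / /mnt"
}
```

For example, "repro_commands assume-clean /dev/mdX /dev/sdaX /dev/sdbX" prints the sequence for the case where no resync runs during the copy.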

A few more notes inline in the log below.

> Sometimes in dmesg I get some kernel messages about "task kworker/2:1:55
> blocked for more than 480 seconds." (please see attached dmesg.txt and my
> reports here: https://bugzilla.opensuse.org/show_bug.cgi?id=1111073),
> sometimes nothing at all. When this happens, I can only reboot with SysRq-b
> or "physically" with reset/power button.
> 
> Same thing can happen with "far" layout, but it seems to me that it does not
> happen every time (or that often). I might be wrong, because I never use
> "far" layout in real life, only for testing.
> I was unable to reproduce the failure with "near" layout.
> 
> Also with EXT4 or BTRFS and any layout everything works just as it should,
> that is sync goes on until finished, and rsync, cp, or any other write work
> just fine at the same time.
> 
> Let me just add that I first saw this behavior in openSUSE LEAP 15.0 (kernel
> 4.12). In previous versions (up to kernel 4.4) I never had this problem. In
> the meantime I have tested with kernels up to 4.20rc and it is the same.
> Unfortunately I cannot go back to test kernels 4.5 - 4.11 to pinpoint the
> moment the problem first appeared.
> 
> 
> 
> -- 
> Best regards,
> Siniša Bandin
> (excuse my English)
> 

> [  180.981499] SGI XFS with ACLs, security attributes, no debug enabled
> [  181.005019] XFS (md1): Mounting V5 Filesystem
> [  181.132076] XFS (md1): Starting recovery (logdev: internal)
> [  181.295606] XFS (md1): Ending recovery (logdev: internal)
> [  181.804011] XFS (md1): Unmounting Filesystem
> [  182.201794] XFS (md127): Mounting V4 Filesystem
> [  182.736958] md: recovery of RAID array md127
> [  182.915479] XFS (md127): Ending clean mount
> [  183.819702] XFS (md127): Unmounting Filesystem
> [  184.943831] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
> [  529.784557] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
> [  601.789958] md1: detected capacity change from 33284947968 to 0
> [  601.789973] md: md1 stopped.
> [  602.314112] md0: detected capacity change from 550436864 to 0
> [  602.314128] md: md0 stopped.
> [  602.745030] md: md127: recovery interrupted.
> [  603.131684] md127: detected capacity change from 966229229568 to 0
> [  603.132237] md: md127 stopped.
> [  603.435808]  sda: sda1 sda2
> [  603.594074] udevd[5011]: inotify_add_watch(11, /dev/sda2, 10) failed: No such file or directory
> [  603.643959]  sda:
> [  603.844724]  sdb: sdb1 sdb2
> [  604.255407]  sdb: sdb1
> [  604.490214] udevd[5050]: inotify_add_watch(11, /dev/sdb1, 10) failed: No such file or directory
> [  605.140952]  sdb: sdb1
> [  605.628686]  sdb: sdb1 sdb2
> [  606.271192]  sdb: sdb1 sdb2 sdb3
> [  607.079626]  sdb: sdb1 sdb2 sdb3
> [  607.611092]  sda:
> [  608.273201]  sda: sda1
> [  608.611952]  sda: sda1 sda2
> [  609.031326]  sda: sda1 sda2 sda3
> [  609.753140] md/raid10:md1: not clean -- starting background reconstruction
> [  609.753145] md/raid10:md1: active with 2 out of 2 devices
> [  609.768804] md1: detected capacity change from 0 to 32210157568
> [  609.772677] md: resync of RAID array md1
> [  614.590107] XFS (md1): Mounting V5 Filesystem
> [  615.449035] XFS (md1): Ending clean mount
> [  617.678462] md/raid1:md0: not clean -- starting background reconstruction
> [  617.678469] md/raid1:md0: active with 2 out of 2 mirrors
> [  617.740729] md0: detected capacity change from 0 to 524222464
> [  617.747107] md: delaying resync of md0 until md1 has finished (they share one or more physical units)

What are md0 and md1? Note that I don't see md2 anywhere in this log.

> [  620.037818] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
> [ 1463.754785] INFO: task kworker/0:3:227 blocked for more than 480 seconds.
> [ 1463.754793]       Not tainted 4.19.5-1-default #1
> [ 1463.754795] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1463.754799] kworker/0:3     D    0   227      2 0x80000000
> [ 1463.755000] Workqueue: xfs-eofblocks/md1 xfs_eofblocks_worker [xfs]
> [ 1463.755005] Call Trace:
> [ 1463.755025]  ? __schedule+0x29a/0x880
> [ 1463.755032]  ? rwsem_down_write_failed+0x197/0x350
> [ 1463.755038]  schedule+0x78/0x110
> [ 1463.755044]  rwsem_down_write_failed+0x197/0x350
> [ 1463.755055]  call_rwsem_down_write_failed+0x13/0x20
> [ 1463.755061]  down_write+0x20/0x30

So we have a background task blocked on an inode lock.

> [ 1463.755196]  xfs_free_eofblocks+0x114/0x1a0 [xfs]
> [ 1463.755330]  xfs_inode_free_eofblocks+0xd3/0x1e0 [xfs]
> [ 1463.755459]  ? xfs_inode_ag_walk_grab+0x5b/0x90 [xfs]
> [ 1463.755586]  xfs_inode_ag_walk.isra.15+0x1aa/0x420 [xfs]
> [ 1463.755714]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> [ 1463.755727]  ? trace_hardirqs_on_thunk+0x1a/0x1c
> [ 1463.755734]  ? __switch_to_asm+0x40/0x70
> [ 1463.755738]  ? __switch_to_asm+0x34/0x70
> [ 1463.755743]  ? __switch_to_asm+0x40/0x70
> [ 1463.755748]  ? __switch_to_asm+0x34/0x70
> [ 1463.755752]  ? __switch_to_asm+0x40/0x70
> [ 1463.755757]  ? __switch_to_asm+0x34/0x70
> [ 1463.755762]  ? __switch_to_asm+0x40/0x70
> [ 1463.755893]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> [ 1463.755900]  ? radix_tree_gang_lookup_tag+0xc2/0x140
> [ 1463.756032]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> [ 1463.756158]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> [ 1463.756288]  xfs_eofblocks_worker+0x29/0x40 [xfs]
> [ 1463.756298]  process_one_work+0x1fd/0x420
> [ 1463.756305]  worker_thread+0x2d/0x3d0
> [ 1463.756311]  ? rescuer_thread+0x340/0x340
> [ 1463.756316]  kthread+0x112/0x130
> [ 1463.756322]  ? kthread_create_worker_on_cpu+0x40/0x40
> [ 1463.756329]  ret_from_fork+0x3a/0x50
> [ 1463.756375] INFO: task kworker/u4:0:4615 blocked for more than 480 seconds.
> [ 1463.756379]       Not tainted 4.19.5-1-default #1
> [ 1463.756380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1463.756383] kworker/u4:0    D    0  4615      2 0x80000000
> [ 1463.756395] Workqueue: writeback wb_workfn (flush-9:1)
> [ 1463.756400] Call Trace:
> [ 1463.756409]  ? __schedule+0x29a/0x880
> [ 1463.756420]  ? wait_barrier+0xdd/0x170 [raid10]
> [ 1463.756426]  schedule+0x78/0x110
> [ 1463.756433]  wait_barrier+0xdd/0x170 [raid10]
> [ 1463.756440]  ? wait_woken+0x80/0x80
> [ 1463.756448]  raid10_write_request+0xf2/0x900 [raid10]
> [ 1463.756454]  ? wait_woken+0x80/0x80
> [ 1463.756459]  ? mempool_alloc+0x55/0x160
> [ 1463.756483]  ? md_write_start+0xa9/0x270 [md_mod]
> [ 1463.756492]  raid10_make_request+0xc1/0x120 [raid10]
> [ 1463.756498]  ? wait_woken+0x80/0x80
> [ 1463.756514]  md_handle_request+0x121/0x190 [md_mod]
> [ 1463.756535]  md_make_request+0x78/0x190 [md_mod]
> [ 1463.756544]  generic_make_request+0x1c6/0x470
> [ 1463.756553]  submit_bio+0x45/0x140

Writeback is blocked submitting I/O down in the MD driver.

> [ 1463.756714]  xfs_submit_ioend+0x9c/0x1e0 [xfs]
> [ 1463.756844]  xfs_vm_writepages+0x68/0x80 [xfs]
> [ 1463.756856]  do_writepages+0x31/0xb0
> [ 1463.756865]  ? read_hpet+0x126/0x130
> [ 1463.756873]  ? ktime_get+0x36/0xa0
> [ 1463.756881]  __writeback_single_inode+0x3d/0x3e0
> [ 1463.756889]  writeback_sb_inodes+0x1c4/0x430
> [ 1463.756902]  __writeback_inodes_wb+0x5d/0xb0
> [ 1463.756910]  wb_writeback+0x26b/0x310
> [ 1463.756920]  wb_workfn+0x33a/0x410
> [ 1463.756932]  process_one_work+0x1fd/0x420
> [ 1463.756940]  worker_thread+0x2d/0x3d0
> [ 1463.756946]  ? rescuer_thread+0x340/0x340
> [ 1463.756951]  kthread+0x112/0x130
> [ 1463.756957]  ? kthread_create_worker_on_cpu+0x40/0x40
> [ 1463.756965]  ret_from_fork+0x3a/0x50
> [ 1463.756979] INFO: task kworker/0:2:4994 blocked for more than 480 seconds.
> [ 1463.756982]       Not tainted 4.19.5-1-default #1
> [ 1463.756984] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1463.756987] kworker/0:2     D    0  4994      2 0x80000000
> [ 1463.757013] Workqueue: md submit_flushes [md_mod]
> [ 1463.757016] Call Trace:
> [ 1463.757024]  ? __schedule+0x29a/0x880
> [ 1463.757034]  ? wait_barrier+0xdd/0x170 [raid10]
> [ 1463.757039]  schedule+0x78/0x110
> [ 1463.757047]  wait_barrier+0xdd/0x170 [raid10]
> [ 1463.757054]  ? wait_woken+0x80/0x80
> [ 1463.757062]  raid10_write_request+0xf2/0x900 [raid10]
> [ 1463.757067]  ? wait_woken+0x80/0x80
> [ 1463.757072]  ? mempool_alloc+0x55/0x160
> [ 1463.757088]  ? md_write_start+0xa9/0x270 [md_mod]
> [ 1463.757095]  ? trace_hardirqs_off_thunk+0x1a/0x1c
> [ 1463.757104]  raid10_make_request+0xc1/0x120 [raid10]
> [ 1463.757110]  ? wait_woken+0x80/0x80
> [ 1463.757126]  md_handle_request+0x121/0x190 [md_mod]
> [ 1463.757132]  ? _raw_spin_unlock_irq+0x22/0x40
> [ 1463.757137]  ? finish_task_switch+0x74/0x260
> [ 1463.757156]  submit_flushes+0x21/0x40 [md_mod]

Some other MD task (?) is also blocked submitting a request.

> [ 1463.757163]  process_one_work+0x1fd/0x420
> [ 1463.757170]  worker_thread+0x2d/0x3d0
> [ 1463.757177]  ? rescuer_thread+0x340/0x340
> [ 1463.757181]  kthread+0x112/0x130
> [ 1463.757186]  ? kthread_create_worker_on_cpu+0x40/0x40
> [ 1463.757193]  ret_from_fork+0x3a/0x50
> [ 1463.757205] INFO: task md1_resync:5215 blocked for more than 480 seconds.
> [ 1463.757207]       Not tainted 4.19.5-1-default #1
> [ 1463.757209] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1463.757212] md1_resync      D    0  5215      2 0x80000000
> [ 1463.757216] Call Trace:
> [ 1463.757223]  ? __schedule+0x29a/0x880
> [ 1463.757231]  ? raise_barrier+0x8d/0x140 [raid10]
> [ 1463.757236]  schedule+0x78/0x110
> [ 1463.757243]  raise_barrier+0x8d/0x140 [raid10]
> [ 1463.757248]  ? wait_woken+0x80/0x80
> [ 1463.757257]  raid10_sync_request+0x1f6/0x1e30 [raid10]
> [ 1463.757265]  ? _raw_spin_unlock_irq+0x22/0x40
> [ 1463.757284]  ? is_mddev_idle+0x125/0x137 [md_mod]
> [ 1463.757302]  md_do_sync.cold.78+0x404/0x969 [md_mod]

The md1 sync task is blocked as well (in raise_barrier()), but I'm not sure on
what.

> [ 1463.757311]  ? wait_woken+0x80/0x80
> [ 1463.757336]  ? md_rdev_init+0xb0/0xb0 [md_mod]
> [ 1463.757351]  md_thread+0xe9/0x140 [md_mod]
> [ 1463.757358]  ? _raw_spin_unlock_irqrestore+0x2e/0x60
> [ 1463.757364]  ? __kthread_parkme+0x4c/0x70
> [ 1463.757369]  kthread+0x112/0x130
> [ 1463.757374]  ? kthread_create_worker_on_cpu+0x40/0x40
> [ 1463.757380]  ret_from_fork+0x3a/0x50
> [ 1463.757395] INFO: task xfsaild/md1:5233 blocked for more than 480 seconds.
> [ 1463.757398]       Not tainted 4.19.5-1-default #1
> [ 1463.757400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1463.757402] xfsaild/md1     D    0  5233      2 0x80000000
> [ 1463.757406] Call Trace:
> [ 1463.757413]  ? __schedule+0x29a/0x880
> [ 1463.757421]  ? wait_barrier+0xdd/0x170 [raid10]
> [ 1463.757426]  schedule+0x78/0x110
> [ 1463.757433]  wait_barrier+0xdd/0x170 [raid10]
> [ 1463.757438]  ? wait_woken+0x80/0x80
> [ 1463.757446]  raid10_write_request+0xf2/0x900 [raid10]
> [ 1463.757451]  ? wait_woken+0x80/0x80
> [ 1463.757455]  ? mempool_alloc+0x55/0x160
> [ 1463.757471]  ? md_write_start+0xa9/0x270 [md_mod]
> [ 1463.757477]  ? trace_hardirqs_on_thunk+0x1a/0x1c
> [ 1463.757485]  raid10_make_request+0xc1/0x120 [raid10]
> [ 1463.757491]  ? wait_woken+0x80/0x80
> [ 1463.757507]  md_handle_request+0x121/0x190 [md_mod]
> [ 1463.757527]  md_make_request+0x78/0x190 [md_mod]
> [ 1463.757536]  generic_make_request+0x1c6/0x470
> [ 1463.757544]  submit_bio+0x45/0x140

xfsaild (metadata writeback) is also blocked submitting I/O down in the
MD driver.

> [ 1463.757552]  ? bio_add_page+0x48/0x60
> [ 1463.757716]  _xfs_buf_ioapply+0x2c1/0x450 [xfs]
> [ 1463.757849]  ? xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
> [ 1463.757974]  __xfs_buf_submit+0x67/0x270 [xfs]
> [ 1463.758102]  xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
> [ 1463.758232]  ? xfsaild+0x294/0x7e0 [xfs]
> [ 1463.758364]  xfsaild+0x294/0x7e0 [xfs]
> [ 1463.758377]  ? _raw_spin_unlock_irqrestore+0x2e/0x60
> [ 1463.758508]  ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
> [ 1463.758514]  kthread+0x112/0x130
> [ 1463.758520]  ? kthread_create_worker_on_cpu+0x40/0x40
> [ 1463.758527]  ret_from_fork+0x3a/0x50
> [ 1463.758543] INFO: task rpm:5364 blocked for more than 480 seconds.
> [ 1463.758546]       Not tainted 4.19.5-1-default #1
> [ 1463.758547] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1463.758550] rpm             D    0  5364   3757 0x00000000
> [ 1463.758554] Call Trace:
> [ 1463.758563]  ? __schedule+0x29a/0x880
> [ 1463.758701]  ? xlog_wait+0x5c/0x70 [xfs]
> [ 1463.759821]  schedule+0x78/0x110
> [ 1463.760022]  xlog_wait+0x5c/0x70 [xfs]
> [ 1463.760036]  ? wake_up_q+0x70/0x70
> [ 1463.760167]  __xfs_log_force_lsn+0x223/0x230 [xfs]
> [ 1463.760297]  ? xfs_file_fsync+0x196/0x1d0 [xfs]
> [ 1463.760424]  xfs_log_force_lsn+0x93/0x140 [xfs]
> [ 1463.760552]  xfs_file_fsync+0x196/0x1d0 [xfs]

An fsync is blocked, presumably on XFS log I/O completion.

> [ 1463.760562]  ? __sb_end_write+0x36/0x60
> [ 1463.760571]  do_fsync+0x38/0x70
> [ 1463.760578]  __x64_sys_fdatasync+0x13/0x20
> [ 1463.760585]  do_syscall_64+0x60/0x110
> [ 1463.760594]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 1463.760603] RIP: 0033:0x7f9757fae8a4
> [ 1463.760616] Code: Bad RIP value.
> [ 1463.760619] RSP: 002b:00007fff74fdb428 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
> [ 1463.760654] RAX: ffffffffffffffda RBX: 0000000000000064 RCX: 00007f9757fae8a4
> [ 1463.760657] RDX: 00000000012c4c60 RSI: 00000000012cc130 RDI: 0000000000000004
> [ 1463.760660] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f9758708c00
> [ 1463.760662] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000012cc130
> [ 1463.760665] R13: 000000000123a3a0 R14: 0000000000010830 R15: 0000000000000062
> [ 1463.760679] INFO: task kworker/0:8:5367 blocked for more than 480 seconds.
> [ 1463.760683]       Not tainted 4.19.5-1-default #1
> [ 1463.760684] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1463.760687] kworker/0:8     D    0  5367      2 0x80000000
> [ 1463.760718] Workqueue: md submit_flushes [md_mod]

And that MD submit_flushes thing again.

Not to say there isn't some issue between XFS and MD going on here, but
I think we might want an MD person to take a look at this and possibly
provide some insight. From an XFS perspective, this all just looks like
we're blocked on I/O (via writeback, AIL and log) to a slow device.

Brian
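
For anyone who wants to check the "everything is blocked in MD" reading against the full dmesg, here is a minimal sketch that lists, for each hung-task report, the first reliable stack frame (one without the "?" marker) that is not the scheduler itself. The function name summarize_hung_tasks and the SAMPLE text are illustrative assumptions, not part of any existing tool; the sample lines are copied from the trace above.

```python
import re

# Hung-task header, e.g. "INFO: task rpm:5364 blocked for more than 480 seconds."
TASK_RE = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) blocked for more than \d+ seconds"
)
# Stack frame after the timestamp, e.g. "]  ? wait_barrier+0xdd/0x170 [raid10]"
FRAME_RE = re.compile(
    r"\]\s+(?P<guess>\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<mod>\w+)\])?"
)
# Frames that only say "the task is sleeping", not why.
SCHEDULER_FRAMES = {"schedule", "__schedule"}


def summarize_hung_tasks(dmesg_text):
    """Return (task, pid, function, module) for each hung-task report."""
    results = []
    current = None  # (name, pid) of the report currently being parsed
    for line in dmesg_text.splitlines():
        task = TASK_RE.search(line)
        if task:
            current = (task.group("name"), int(task.group("pid")))
            continue
        frame = FRAME_RE.search(line)
        if current is None or frame is None or frame.group("guess"):
            continue  # not inside a report, not a frame, or an unreliable frame
        if frame.group("func") in SCHEDULER_FRAMES:
            continue
        results.append(current + (frame.group("func"), frame.group("mod")))
        current = None  # only the first meaningful frame is recorded
    return results


# Sample lines copied from the trace in this thread.
SAMPLE = """\
[ 1463.758543] INFO: task rpm:5364 blocked for more than 480 seconds.
[ 1463.758554] Call Trace:
[ 1463.758563]  ? __schedule+0x29a/0x880
[ 1463.758701]  ? xlog_wait+0x5c/0x70 [xfs]
[ 1463.759821]  schedule+0x78/0x110
[ 1463.760022]  xlog_wait+0x5c/0x70 [xfs]
[ 1463.760679] INFO: task kworker/0:8:5367 blocked for more than 480 seconds.
[ 1463.760721] Call Trace:
[ 1463.760731]  ? __schedule+0x29a/0x880
[ 1463.760746]  schedule+0x78/0x110
[ 1463.760753]  wait_barrier+0xdd/0x170 [raid10]
"""

if __name__ == "__main__":
    for name, pid, func, mod in summarize_hung_tasks(SAMPLE):
        print(f"{name} (pid {pid}) sleeping in {func} [{mod}]")
```

Run against the whole log, this should make it easier to see how many of the blocked tasks sit in wait_barrier or raise_barrier in raid10 versus inside XFS itself. It only reports where each task sleeps; it cannot say which task is holding things up.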

> [ 1463.760721] Call Trace:
> [ 1463.760731]  ? __schedule+0x29a/0x880
> [ 1463.760741]  ? wait_barrier+0xdd/0x170 [raid10]
> [ 1463.760746]  schedule+0x78/0x110
> [ 1463.760753]  wait_barrier+0xdd/0x170 [raid10]
> [ 1463.760761]  ? wait_woken+0x80/0x80
> [ 1463.760768]  raid10_write_request+0xf2/0x900 [raid10]
> [ 1463.760774]  ? wait_woken+0x80/0x80
> [ 1463.760778]  ? mempool_alloc+0x55/0x160
> [ 1463.760795]  ? md_write_start+0xa9/0x270 [md_mod]
> [ 1463.760801]  ? try_to_wake_up+0x44/0x470
> [ 1463.760810]  raid10_make_request+0xc1/0x120 [raid10]
> [ 1463.760816]  ? wait_woken+0x80/0x80
> [ 1463.760831]  md_handle_request+0x121/0x190 [md_mod]
> [ 1463.760851]  md_make_request+0x78/0x190 [md_mod]
> [ 1463.760860]  generic_make_request+0x1c6/0x470
> [ 1463.760870]  raid10_write_request+0x77a/0x900 [raid10]
> [ 1463.760875]  ? wait_woken+0x80/0x80
> [ 1463.760879]  ? mempool_alloc+0x55/0x160
> [ 1463.760895]  ? md_write_start+0xa9/0x270 [md_mod]
> [ 1463.760904]  raid10_make_request+0xc1/0x120 [raid10]
> [ 1463.760910]  ? wait_woken+0x80/0x80
> [ 1463.760926]  md_handle_request+0x121/0x190 [md_mod]
> [ 1463.760931]  ? _raw_spin_unlock_irq+0x22/0x40
> [ 1463.760936]  ? finish_task_switch+0x74/0x260
> [ 1463.760954]  submit_flushes+0x21/0x40 [md_mod]
> [ 1463.760962]  process_one_work+0x1fd/0x420
> [ 1463.760970]  worker_thread+0x2d/0x3d0
> [ 1463.760976]  ? rescuer_thread+0x340/0x340
> [ 1463.760981]  kthread+0x112/0x130
> [ 1463.760986]  ? kthread_create_worker_on_cpu+0x40/0x40
> [ 1463.760992]  ret_from_fork+0x3a/0x50


* Re: XFS and RAID10 with o2 layout
  2018-12-12 14:30 ` Brian Foster
@ 2018-12-13  8:21   ` Sinisa
  2018-12-13 12:28     ` Brian Foster
       [not found]   ` <0a33a20d-5f49-7b34-3662-5b818c67621a@suse.com>
  1 sibling, 1 reply; 16+ messages in thread
From: Sinisa @ 2018-12-13  8:21 UTC (permalink / raw)
  To: linux-xfs

Thanks for a quick reply. Replies are inline...

On 12.12.2018 15:30, Brian Foster wrote:
> cc linux-raid
>
> On Wed, Dec 12, 2018 at 01:29:49PM +0100, Sinisa wrote:
>> Hello group,
>>
>> I have noticed something strange going on lately, but recently I have come
>> to conclusion that there is some unwanted interaction between XFS and Linux
>> RAID10 with "offset" layout.
>>
>> So here is the problem: I create a Linux RAID10 mirror with 2 disks (HDD or
>> SSD) and "o2" layout (best choice for read and write speed):
>> # mdadm -C -n2 -l10 -po2 /dev/mdX /dev/sdaX /dev/sdbX
>> # mkfs.xfs /dev/mdX
>> # mount /dev/mdX /mnt
>> # rsync -avxDPHS / /mnt
>>
>> So we have RAID10 initializing:
>>
>> # cat /proc/mdstat
>> Personalities : [raid1] [raid10]
>> md2 : active raid10 sdb3[1] sda3[0]
>>       314433536 blocks super 1.2 4096K chunks 2 offset-copies [2/2] [UU]
>>       [==>..................]  resync = 11.7% (36917568/314433536)
>> finish=8678.2min speed=532K/sec
>>       bitmap: 3/3 pages [12KB], 65536KB chunk
>>
>> but after a few minutes everything stops like you can see above. Rsync (or
>> any other process writing to that md device) also freezes. If I try to read
>> already copied files - freeze, usually with less that 2GB copied.
>>
>
> Does the same thing happen without the RAID initialization? E.g., if you
> wait for it to complete or (IIRC) if you create with --assume-clean? I
> assume the init-in-progress state is common with your tests on other
> filesystems?
>

No, if I wait for RAID to finish initializing, or create it with 
--assume-clean, everything works just fine.

Actually, ever since the openSUSE LEAP 15.0 release I have been doing just that: 
pausing the installation process until initialization is done, then letting it go on.

But recently I had to replace one of the disks in a "live" system (small file 
server), and was unable to do that on multiple tries during work hours because 
of this problem. When I waited until the afternoon, when nobody was 
working/writing, the resync was able to finish...


> A few more notes below inline to the log..
>
>> Sometimes in dmesg I get some kernel messages about "task kworker/2:1:55
>> blocked for more than 480 seconds." (please see attached dmesg.txt and my
>> reports here: https://bugzilla.opensuse.org/show_bug.cgi?id=1111073),
>> sometimes nothing at all. When this happens, I can only reboot with SysRq-b
>> or "physically" with reset/power button.
>>
>> Same thing can happen with "far" layout, but it seems to me that it does not
>> happen every time (or that often). I might be wrong, because I never use
>> "far" layout in real life, only for testing.
>> I was unable to reproduce the failure with "near" layout.
>>
>> Also with EXT4 or BTRFS and any layout everything works just as it should,
>> that is sync goes on until finished, and rsync, cp, or any other write work
>> just fine at the same time.
>>
>> Let me just add that I first saw this behavior in openSUSE LEAP 15.0 (kernel
>> 4.12). In previous versions (up to kernel 4.4) I never had this problem. In
>> the meantime I have tested with kernels up to 4.20rc and it is the same.
>> Unfortunately I cannot go back to test kernels 4.5 - 4.11 to pinpoint the
>> moment the problem first appeared.
>>
>>
>>
>> --
>> Best regards,
>> Siniša Bandin
>> (excuse my English)
>>
>
>> [ 180.981499] SGI XFS with ACLs, security attributes, no debug enabled
>> [ 181.005019] XFS (md1): Mounting V5 Filesystem
>> [ 181.132076] XFS (md1): Starting recovery (logdev: internal)
>> [ 181.295606] XFS (md1): Ending recovery (logdev: internal)
>> [ 181.804011] XFS (md1): Unmounting Filesystem
>> [ 182.201794] XFS (md127): Mounting V4 Filesystem
>> [ 182.736958] md: recovery of RAID array md127
>> [ 182.915479] XFS (md127): Ending clean mount
>> [ 183.819702] XFS (md127): Unmounting Filesystem
>> [ 184.943831] EXT4-fs (md0): mounted filesystem with ordered data mode. 
>> Opts: (null)
>> [ 529.784557] EXT4-fs (md0): mounted filesystem with ordered data mode. 
>> Opts: (null)
>> [ 601.789958] md1: detected capacity change from 33284947968 to 0
>> [ 601.789973] md: md1 stopped.
>> [ 602.314112] md0: detected capacity change from 550436864 to 0
>> [ 602.314128] md: md0 stopped.
>> [ 602.745030] md: md127: recovery interrupted.
>> [ 603.131684] md127: detected capacity change from 966229229568 to 0
>> [ 603.132237] md: md127 stopped.
>> [ 603.435808] sda: sda1 sda2
>> [ 603.594074] udevd[5011]: inotify_add_watch(11, /dev/sda2, 10) failed: No 
>> such file or directory
>> [ 603.643959] sda:
>> [ 603.844724] sdb: sdb1 sdb2
>> [ 604.255407] sdb: sdb1
>> [ 604.490214] udevd[5050]: inotify_add_watch(11, /dev/sdb1, 10) failed: No 
>> such file or directory
>> [ 605.140952] sdb: sdb1
>> [ 605.628686] sdb: sdb1 sdb2
>> [ 606.271192] sdb: sdb1 sdb2 sdb3
>> [ 607.079626] sdb: sdb1 sdb2 sdb3
>> [ 607.611092] sda:
>> [ 608.273201] sda: sda1
>> [ 608.611952] sda: sda1 sda2
>> [ 609.031326] sda: sda1 sda2 sda3
>> [ 609.753140] md/raid10:md1: not clean -- starting background reconstruction
>> [ 609.753145] md/raid10:md1: active with 2 out of 2 devices
>> [ 609.768804] md1: detected capacity change from 0 to 32210157568
>> [ 609.772677] md: resync of RAID array md1
>> [ 614.590107] XFS (md1): Mounting V5 Filesystem
>> [ 615.449035] XFS (md1): Ending clean mount
>> [ 617.678462] md/raid1:md0: not clean -- starting background reconstruction
>> [ 617.678469] md/raid1:md0: active with 2 out of 2 mirrors
>> [ 617.740729] md0: detected capacity change from 0 to 524222464
>> [ 617.747107] md: delaying resync of md0 until md1 has finished (they share 
>> one or more physical units)
>
> What are md0 and md1? Note that I don't see md2 anywhere in this log.
>

Sorry that I did not clarify that immediately. This log was taken earlier, 
during installation, when I got to see it in dmesg.
md0 was /boot (with EXT4), md1 was / with XFS.

The example of cat /proc/mdstat was taken later, when I brought up the system (by 
changing md1 to "near" layout at install time). So wherever you see md1 or md2, 
you can assume they are the same thing: a new RAID10/o2 being written to during 
initialization. But the second time there was nothing in dmesg, so I could not 
attach that.


>> [ 620.037818] EXT4-fs (md0): mounted filesystem with ordered data mode. 
>> Opts: (null)
>> [ 1463.754785] INFO: task kworker/0:3:227 blocked for more than 480 seconds.
>> [ 1463.754793] Not tainted 4.19.5-1-default #1
>> [ 1463.754795] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
>> this message.
>> [ 1463.754799] kworker/0:3 D 0 227 2 0x80000000
>> [ 1463.755000] Workqueue: xfs-eofblocks/md1 xfs_eofblocks_worker [xfs]
>> [ 1463.755005] Call Trace:
>> [ 1463.755025] ? __schedule+0x29a/0x880
>> [ 1463.755032] ? rwsem_down_write_failed+0x197/0x350
>> [ 1463.755038] schedule+0x78/0x110
>> [ 1463.755044] rwsem_down_write_failed+0x197/0x350
>> [ 1463.755055] call_rwsem_down_write_failed+0x13/0x20
>> [ 1463.755061] down_write+0x20/0x30
>
> So we have a background task blocked on an inode lock.
>
>> [ 1463.755196] xfs_free_eofblocks+0x114/0x1a0 [xfs]
>> [ 1463.755330] xfs_inode_free_eofblocks+0xd3/0x1e0 [xfs]
>> [ 1463.755459] ? xfs_inode_ag_walk_grab+0x5b/0x90 [xfs]
>> [ 1463.755586] xfs_inode_ag_walk.isra.15+0x1aa/0x420 [xfs]
>> [ 1463.755714] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
>> [ 1463.755727] ? trace_hardirqs_on_thunk+0x1a/0x1c
>> [ 1463.755734] ? __switch_to_asm+0x40/0x70
>> [ 1463.755738] ? __switch_to_asm+0x34/0x70
>> [ 1463.755743] ? __switch_to_asm+0x40/0x70
>> [ 1463.755748] ? __switch_to_asm+0x34/0x70
>> [ 1463.755752] ? __switch_to_asm+0x40/0x70
>> [ 1463.755757] ? __switch_to_asm+0x34/0x70
>> [ 1463.755762] ? __switch_to_asm+0x40/0x70
>> [ 1463.755893] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
>> [ 1463.755900] ? radix_tree_gang_lookup_tag+0xc2/0x140
>> [ 1463.756032] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
>> [ 1463.756158] xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
>> [ 1463.756288] xfs_eofblocks_worker+0x29/0x40 [xfs]
>> [ 1463.756298] process_one_work+0x1fd/0x420
>> [ 1463.756305] worker_thread+0x2d/0x3d0
>> [ 1463.756311] ? rescuer_thread+0x340/0x340
>> [ 1463.756316] kthread+0x112/0x130
>> [ 1463.756322] ? kthread_create_worker_on_cpu+0x40/0x40
>> [ 1463.756329] ret_from_fork+0x3a/0x50
>> [ 1463.756375] INFO: task kworker/u4:0:4615 blocked for more than 480 seconds.
>> [ 1463.756379] Not tainted 4.19.5-1-default #1
>> [ 1463.756380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
>> this message.
>> [ 1463.756383] kworker/u4:0 D 0 4615 2 0x80000000
>> [ 1463.756395] Workqueue: writeback wb_workfn (flush-9:1)
>> [ 1463.756400] Call Trace:
>> [ 1463.756409] ? __schedule+0x29a/0x880
>> [ 1463.756420] ? wait_barrier+0xdd/0x170 [raid10]
>> [ 1463.756426] schedule+0x78/0x110
>> [ 1463.756433] wait_barrier+0xdd/0x170 [raid10]
>> [ 1463.756440] ? wait_woken+0x80/0x80
>> [ 1463.756448] raid10_write_request+0xf2/0x900 [raid10]
>> [ 1463.756454] ? wait_woken+0x80/0x80
>> [ 1463.756459] ? mempool_alloc+0x55/0x160
>> [ 1463.756483] ? md_write_start+0xa9/0x270 [md_mod]
>> [ 1463.756492] raid10_make_request+0xc1/0x120 [raid10]
>> [ 1463.756498] ? wait_woken+0x80/0x80
>> [ 1463.756514] md_handle_request+0x121/0x190 [md_mod]
>> [ 1463.756535] md_make_request+0x78/0x190 [md_mod]
>> [ 1463.756544] generic_make_request+0x1c6/0x470
>> [ 1463.756553] submit_bio+0x45/0x140
>
> Writeback is blocked submitting I/O down in the MD driver.
>
>> [ 1463.756714] xfs_submit_ioend+0x9c/0x1e0 [xfs]
>> [ 1463.756844] xfs_vm_writepages+0x68/0x80 [xfs]
>> [ 1463.756856] do_writepages+0x31/0xb0
>> [ 1463.756865] ? read_hpet+0x126/0x130
>> [ 1463.756873] ? ktime_get+0x36/0xa0
>> [ 1463.756881] __writeback_single_inode+0x3d/0x3e0
>> [ 1463.756889] writeback_sb_inodes+0x1c4/0x430
>> [ 1463.756902] __writeback_inodes_wb+0x5d/0xb0
>> [ 1463.756910] wb_writeback+0x26b/0x310
>> [ 1463.756920] wb_workfn+0x33a/0x410
>> [ 1463.756932] process_one_work+0x1fd/0x420
>> [ 1463.756940] worker_thread+0x2d/0x3d0
>> [ 1463.756946] ? rescuer_thread+0x340/0x340
>> [ 1463.756951] kthread+0x112/0x130
>> [ 1463.756957] ? kthread_create_worker_on_cpu+0x40/0x40
>> [ 1463.756965] ret_from_fork+0x3a/0x50
>> [ 1463.756979] INFO: task kworker/0:2:4994 blocked for more than 480 seconds.
>> [ 1463.756982] Not tainted 4.19.5-1-default #1
>> [ 1463.756984] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
>> this message.
>> [ 1463.756987] kworker/0:2 D 0 4994 2 0x80000000
>> [ 1463.757013] Workqueue: md submit_flushes [md_mod]
>> [ 1463.757016] Call Trace:
>> [ 1463.757024] ? __schedule+0x29a/0x880
>> [ 1463.757034] ? wait_barrier+0xdd/0x170 [raid10]
>> [ 1463.757039] schedule+0x78/0x110
>> [ 1463.757047] wait_barrier+0xdd/0x170 [raid10]
>> [ 1463.757054] ? wait_woken+0x80/0x80
>> [ 1463.757062] raid10_write_request+0xf2/0x900 [raid10]
>> [ 1463.757067] ? wait_woken+0x80/0x80
>> [ 1463.757072] ? mempool_alloc+0x55/0x160
>> [ 1463.757088] ? md_write_start+0xa9/0x270 [md_mod]
>> [ 1463.757095] ? trace_hardirqs_off_thunk+0x1a/0x1c
>> [ 1463.757104] raid10_make_request+0xc1/0x120 [raid10]
>> [ 1463.757110] ? wait_woken+0x80/0x80
>> [ 1463.757126] md_handle_request+0x121/0x190 [md_mod]
>> [ 1463.757132] ? _raw_spin_unlock_irq+0x22/0x40
>> [ 1463.757137] ? finish_task_switch+0x74/0x260
>> [ 1463.757156] submit_flushes+0x21/0x40 [md_mod]
>
> Some other MD task (?) also blocked submitting a request.
>
>> [ 1463.757163] process_one_work+0x1fd/0x420
>> [ 1463.757170] worker_thread+0x2d/0x3d0
>> [ 1463.757177] ? rescuer_thread+0x340/0x340
>> [ 1463.757181] kthread+0x112/0x130
>> [ 1463.757186] ? kthread_create_worker_on_cpu+0x40/0x40
>> [ 1463.757193] ret_from_fork+0x3a/0x50
>> [ 1463.757205] INFO: task md1_resync:5215 blocked for more than 480 seconds.
>> [ 1463.757207] Not tainted 4.19.5-1-default #1
>> [ 1463.757209] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
>> this message.
>> [ 1463.757212] md1_resync D 0 5215 2 0x80000000
>> [ 1463.757216] Call Trace:
>> [ 1463.757223] ? __schedule+0x29a/0x880
>> [ 1463.757231] ? raise_barrier+0x8d/0x140 [raid10]
>> [ 1463.757236] schedule+0x78/0x110
>> [ 1463.757243] raise_barrier+0x8d/0x140 [raid10]

>> [ 1463.757248] ? wait_woken+0x80/0x80
>> [ 1463.757257] raid10_sync_request+0x1f6/0x1e30 [raid10]
>> [ 1463.757265] ? _raw_spin_unlock_irq+0x22/0x40
>> [ 1463.757284] ? is_mddev_idle+0x125/0x137 [md_mod]
>> [ 1463.757302] md_do_sync.cold.78+0x404/0x969 [md_mod]
>
> The md1 sync task is blocked, I'm not sure on what.
>
>> [ 1463.757311] ? wait_woken+0x80/0x80
>> [ 1463.757336] ? md_rdev_init+0xb0/0xb0 [md_mod]
>> [ 1463.757351] md_thread+0xe9/0x140 [md_mod]
>> [ 1463.757358] ? _raw_spin_unlock_irqrestore+0x2e/0x60
>> [ 1463.757364] ? __kthread_parkme+0x4c/0x70
>> [ 1463.757369] kthread+0x112/0x130
>> [ 1463.757374] ? kthread_create_worker_on_cpu+0x40/0x40
>> [ 1463.757380] ret_from_fork+0x3a/0x50
>> [ 1463.757395] INFO: task xfsaild/md1:5233 blocked for more than 480 seconds.
>> [ 1463.757398] Not tainted 4.19.5-1-default #1
>> [ 1463.757400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
>> this message.
>> [ 1463.757402] xfsaild/md1 D 0 5233 2 0x80000000
>> [ 1463.757406] Call Trace:
>> [ 1463.757413] ? __schedule+0x29a/0x880
>> [ 1463.757421] ? wait_barrier+0xdd/0x170 [raid10]
>> [ 1463.757426] schedule+0x78/0x110
>> [ 1463.757433] wait_barrier+0xdd/0x170 [raid10]
>> [ 1463.757438] ? wait_woken+0x80/0x80
>> [ 1463.757446] raid10_write_request+0xf2/0x900 [raid10]
>> [ 1463.757451] ? wait_woken+0x80/0x80
>> [ 1463.757455] ? mempool_alloc+0x55/0x160
>> [ 1463.757471] ? md_write_start+0xa9/0x270 [md_mod]
>> [ 1463.757477] ? trace_hardirqs_on_thunk+0x1a/0x1c
>> [ 1463.757485] raid10_make_request+0xc1/0x120 [raid10]
>> [ 1463.757491] ? wait_woken+0x80/0x80
>> [ 1463.757507] md_handle_request+0x121/0x190 [md_mod]
>> [ 1463.757527] md_make_request+0x78/0x190 [md_mod]
>> [ 1463.757536] generic_make_request+0x1c6/0x470
>> [ 1463.757544] submit_bio+0x45/0x140
>
> xfsaild (metadata writeback) is also blocked submitting I/O down in the
> MD driver.
>
>> [ 1463.757552] ? bio_add_page+0x48/0x60
>> [ 1463.757716] _xfs_buf_ioapply+0x2c1/0x450 [xfs]
>> [ 1463.757849] ? xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
>> [ 1463.757974] __xfs_buf_submit+0x67/0x270 [xfs]
>> [ 1463.758102] xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
>> [ 1463.758232] ? xfsaild+0x294/0x7e0 [xfs]
>> [ 1463.758364] xfsaild+0x294/0x7e0 [xfs]
>> [ 1463.758377] ? _raw_spin_unlock_irqrestore+0x2e/0x60
>> [ 1463.758508] ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
>> [ 1463.758514] kthread+0x112/0x130
>> [ 1463.758520] ? kthread_create_worker_on_cpu+0x40/0x40
>> [ 1463.758527] ret_from_fork+0x3a/0x50
>> [ 1463.758543] INFO: task rpm:5364 blocked for more than 480 seconds.
>> [ 1463.758546] Not tainted 4.19.5-1-default #1
>> [ 1463.758547] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
>> this message.
>> [ 1463.758550] rpm D 0 5364 3757 0x00000000
>> [ 1463.758554] Call Trace:
>> [ 1463.758563] ? __schedule+0x29a/0x880
>> [ 1463.758701] ? xlog_wait+0x5c/0x70 [xfs]
>> [ 1463.759821] schedule+0x78/0x110
>> [ 1463.760022] xlog_wait+0x5c/0x70 [xfs]
>> [ 1463.760036] ? wake_up_q+0x70/0x70
>> [ 1463.760167] __xfs_log_force_lsn+0x223/0x230 [xfs]
>> [ 1463.760297] ? xfs_file_fsync+0x196/0x1d0 [xfs]
>> [ 1463.760424] xfs_log_force_lsn+0x93/0x140 [xfs]
>> [ 1463.760552] xfs_file_fsync+0x196/0x1d0 [xfs]
>
> An fsync is blocked, presumably on XFS log I/O completion.
>
>> [ 1463.760562] ? __sb_end_write+0x36/0x60
>> [ 1463.760571] do_fsync+0x38/0x70
>> [ 1463.760578] __x64_sys_fdatasync+0x13/0x20
>> [ 1463.760585] do_syscall_64+0x60/0x110
>> [ 1463.760594] entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> [ 1463.760603] RIP: 0033:0x7f9757fae8a4
>> [ 1463.760616] Code: Bad RIP value.
>> [ 1463.760619] RSP: 002b:00007fff74fdb428 EFLAGS: 00000246 ORIG_RAX: 
>> 000000000000004b
>> [ 1463.760654] RAX: ffffffffffffffda RBX: 0000000000000064 RCX: 00007f9757fae8a4
>> [ 1463.760657] RDX: 00000000012c4c60 RSI: 00000000012cc130 RDI: 0000000000000004
>> [ 1463.760660] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f9758708c00
>> [ 1463.760662] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000012cc130
>> [ 1463.760665] R13: 000000000123a3a0 R14: 0000000000010830 R15: 0000000000000062
>> [ 1463.760679] INFO: task kworker/0:8:5367 blocked for more than 480 seconds.
>> [ 1463.760683] Not tainted 4.19.5-1-default #1
>> [ 1463.760684] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
>> this message.
>> [ 1463.760687] kworker/0:8 D 0 5367 2 0x80000000
>> [ 1463.760718] Workqueue: md submit_flushes [md_mod]
>
> And that MD submit_flushes thing again.
>
> Not to say there isn't some issue between XFS and MD going on here, but
> I think we might want an MD person to take a look at this and possibly
> provide some insight. From an XFS perspective, this all just looks like
> we're blocked on I/O (via writeback, AIL and log) to a slow device.
>
> Brian
>
>> [ 1463.760721] Call Trace:
>> [ 1463.760731] ? __schedule+0x29a/0x880
>> [ 1463.760741] ? wait_barrier+0xdd/0x170 [raid10]
>> [ 1463.760746] schedule+0x78/0x110
>> [ 1463.760753] wait_barrier+0xdd/0x170 [raid10]
>> [ 1463.760761] ? wait_woken+0x80/0x80
>> [ 1463.760768] raid10_write_request+0xf2/0x900 [raid10]
>> [ 1463.760774] ? wait_woken+0x80/0x80
>> [ 1463.760778] ? mempool_alloc+0x55/0x160
>> [ 1463.760795] ? md_write_start+0xa9/0x270 [md_mod]
>> [ 1463.760801] ? try_to_wake_up+0x44/0x470
>> [ 1463.760810] raid10_make_request+0xc1/0x120 [raid10]
>> [ 1463.760816] ? wait_woken+0x80/0x80
>> [ 1463.760831] md_handle_request+0x121/0x190 [md_mod]
>> [ 1463.760851] md_make_request+0x78/0x190 [md_mod]
>> [ 1463.760860] generic_make_request+0x1c6/0x470
>> [ 1463.760870] raid10_write_request+0x77a/0x900 [raid10]
>> [ 1463.760875] ? wait_woken+0x80/0x80
>> [ 1463.760879] ? mempool_alloc+0x55/0x160
>> [ 1463.760895] ? md_write_start+0xa9/0x270 [md_mod]
>> [ 1463.760904] raid10_make_request+0xc1/0x120 [raid10]
>> [ 1463.760910] ? wait_woken+0x80/0x80
>> [ 1463.760926] md_handle_request+0x121/0x190 [md_mod]
>> [ 1463.760931] ? _raw_spin_unlock_irq+0x22/0x40
>> [ 1463.760936] ? finish_task_switch+0x74/0x260
>> [ 1463.760954] submit_flushes+0x21/0x40 [md_mod]
>> [ 1463.760962] process_one_work+0x1fd/0x420
>> [ 1463.760970] worker_thread+0x2d/0x3d0
>> [ 1463.760976] ? rescuer_thread+0x340/0x340
>> [ 1463.760981] kthread+0x112/0x130
>> [ 1463.760986] ? kthread_create_worker_on_cpu+0x40/0x40
>> [ 1463.760992] ret_from_fork+0x3a/0x50

-- 
Srdačan pozdrav/Best regards,
Siniša Bandin


* Re: XFS and RAID10 with o2 layout
  2018-12-13  8:21   ` Sinisa
@ 2018-12-13 12:28     ` Brian Foster
  2018-12-13 13:02       ` Sinisa
  0 siblings, 1 reply; 16+ messages in thread
From: Brian Foster @ 2018-12-13 12:28 UTC (permalink / raw)
  To: Sinisa; +Cc: linux-xfs

On Thu, Dec 13, 2018 at 09:21:18AM +0100, Sinisa wrote:
> Thanks for a quick reply. Replies are inline...
> 
> On 12.12.2018 15:30, Brian Foster wrote:
> > cc linux-raid
> > 
> > On Wed, Dec 12, 2018 at 01:29:49PM +0100, Sinisa wrote:
> > > Hello group,
> > > 
> > > I have noticed something strange going on lately, but recently I have come
> > > to conclusion that there is some unwanted interaction between XFS and Linux
> > > RAID10 with "offset" layout.
> > > 
> > > So here is the problem: I create a Linux RAID10 mirror with 2 disks (HDD or
> > > SSD) and "o2" layout (best choice for read and write speed):
> > > # mdadm -C -n2 -l10 -po2 /dev/mdX /dev/sdaX /dev/sdbX
> > > # mkfs.xfs /dev/mdX
> > > # mount /dev/mdX /mnt
> > > # rsync -avxDPHS / /mnt
> > > 
> > > So we have RAID10 initializing:
> > > 
> > > # cat /proc/mdstat
> > > Personalities : [raid1] [raid10]
> > > md2 : active raid10 sdb3[1] sda3[0]
> > >       314433536 blocks super 1.2 4096K chunks 2 offset-copies [2/2] [UU]
> > >       [==>..................]  resync = 11.7% (36917568/314433536)
> > > finish=8678.2min speed=532K/sec
> > >       bitmap: 3/3 pages [12KB], 65536KB chunk
> > > 
> > > but after a few minutes everything stops like you can see above. Rsync (or
> > > any other process writing to that md device) also freezes. If I try to read
> > > already copied files - freeze, usually with less that 2GB copied.
> > > 
> > 
> > Does the same thing happen without the RAID initialization? E.g., if you
> > wait for it to complete or (IIRC) if you create with --assume-clean? I
> > assume the init-in-progress state is common with your tests on other
> > filesystems?
> > 
> 
> No, if I wait for RAID to finish initializing, or create it with
> --assume-clean, everything works just fine.
> 
> Actualy, ever since openSUSE LEAP 15.0 release I have been doing just that:
> pause installation process until initialization is done, then let it go on.
> 
> But recently it has happened so that I had to replace one of the disks in a
> "live" system (small file server), and was unable to do that on multiple
> tries during work hours because of this problem. When I waited until
> afternoon, when nobody was working/writing, resync was able to finish...
> 

So apparently there is some kind of poor interaction here with the
internal MD resync code. It's not clear to me whether it's a lockup or
extreme slowdown, but unless anybody else has ideas I'd suggest to
solicit feedback from the MD devs (note that you dropped the linux-raid
cc) as to why this set of I/O might be blocked in the raid device and go
from there.

Brian
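
One way to tell a lockup from an extreme slowdown is to take two snapshots of /proc/mdstat some minutes apart and compare the resync position. The following is a minimal sketch under that assumption; parse_progress, classify and the "stalled" label are illustrative names, and the sample text is the mdstat output quoted in this thread. It only looks at the first array that reports progress.

```python
import re

# Progress line printed by /proc/mdstat during a sync, e.g.
# "resync = 11.7% (36917568/314433536) finish=8678.2min speed=532K/sec"
PROGRESS_RE = re.compile(
    r"(?P<action>resync|recovery|reshape|check)\s*=\s*(?P<pct>[\d.]+)%\s*"
    r"\((?P<done>\d+)/(?P<total>\d+)\)\s*"
    r"finish=(?P<finish>[\d.]+)min\s+speed=(?P<speed>\d+)K/sec"
)


def parse_progress(mdstat_text):
    """Return the first sync progress entry as a dict, or None if there is none."""
    m = PROGRESS_RE.search(mdstat_text)
    if m is None:
        return None
    return {
        "action": m.group("action"),
        "percent": float(m.group("pct")),
        "done": int(m.group("done")),      # blocks already synced
        "total": int(m.group("total")),    # blocks to sync in total
        "speed_kps": int(m.group("speed")),
    }


def classify(before, after):
    """Compare two snapshots taken some time apart."""
    if before is None or after is None:
        return "no sync in progress"
    if after["done"] == before["done"]:
        return "stalled"       # position did not move at all between snapshots
    return "progressing"       # possibly very slowly, but not locked up


# The mdstat output quoted earlier in this thread.
SAMPLE = """\
md2 : active raid10 sdb3[1] sda3[0]
      314433536 blocks super 1.2 4096K chunks 2 offset-copies [2/2] [UU]
      [==>..................]  resync = 11.7% (36917568/314433536)
finish=8678.2min speed=532K/sec
      bitmap: 3/3 pages [12KB], 65536KB chunk
"""

if __name__ == "__main__":
    snapshot = parse_progress(SAMPLE)
    print(snapshot)
    # With real data, read /proc/mdstat twice with a delay in between.
    print(classify(snapshot, snapshot))
```

If the "done" counter stays identical across several snapshots while writers are blocked, that points toward a lockup rather than a slow device. A counter that creeps forward points toward a slowdown instead.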

> 
> > A few more notes below inline to the log..
> > 
> > > Sometimes in dmesg I get some kernel messages about "task kworker/2:1:55
> > > blocked for more than 480 seconds." (please see attached dmesg.txt and my
> > > reports here: https://bugzilla.opensuse.org/show_bug.cgi?id=1111073),
> > > sometimes nothing at all. When this happens, I can only reboot with SysRq-b
> > > or "physically" with reset/power button.
> > > 
> > > Same thing can happen with "far" layout, but it seems to me that it does not
> > > happen every time (or that often). I might be wrong, because I never use
> > > "far" layout in real life, only for testing.
> > > I was unable to reproduce the failure with "near" layout.
> > > 
> > > Also with EXT4 or BTRFS and any layout everything works just as it should,
> > > that is sync goes on until finished, and rsync, cp, or any other write work
> > > just fine at the same time.
> > > 
> > > Let me just add that I first saw this behavior in openSUSE LEAP 15.0 (kernel
> > > 4.12). In previous versions (up to kernel 4.4) I never had this problem. In
> > > the meantime I have tested with kernels up to 4.20rc and it is the same.
> > > Unfortunately I cannot go back to test kernels 4.5 - 4.11 to pinpoint the
> > > moment the problem first appeared.
> > > 
> > > 
> > > 
> > > --
> > > Best regards,
> > > Siniša Bandin
> > > (excuse my English)
> > > 
> > 
> > > [ 180.981499] SGI XFS with ACLs, security attributes, no debug enabled
> > > [ 181.005019] XFS (md1): Mounting V5 Filesystem
> > > [ 181.132076] XFS (md1): Starting recovery (logdev: internal)
> > > [ 181.295606] XFS (md1): Ending recovery (logdev: internal)
> > > [ 181.804011] XFS (md1): Unmounting Filesystem
> > > [ 182.201794] XFS (md127): Mounting V4 Filesystem
> > > [ 182.736958] md: recovery of RAID array md127
> > > [ 182.915479] XFS (md127): Ending clean mount
> > > [ 183.819702] XFS (md127): Unmounting Filesystem
> > > [ 184.943831] EXT4-fs (md0): mounted filesystem with ordered data
> > > mode. Opts: (null)
> > > [ 529.784557] EXT4-fs (md0): mounted filesystem with ordered data
> > > mode. Opts: (null)
> > > [ 601.789958] md1: detected capacity change from 33284947968 to 0
> > > [ 601.789973] md: md1 stopped.
> > > [ 602.314112] md0: detected capacity change from 550436864 to 0
> > > [ 602.314128] md: md0 stopped.
> > > [ 602.745030] md: md127: recovery interrupted.
> > > [ 603.131684] md127: detected capacity change from 966229229568 to 0
> > > [ 603.132237] md: md127 stopped.
> > > [ 603.435808] sda: sda1 sda2
> > > [ 603.594074] udevd[5011]: inotify_add_watch(11, /dev/sda2, 10)
> > > failed: No such file or directory
> > > [ 603.643959] sda:
> > > [ 603.844724] sdb: sdb1 sdb2
> > > [ 604.255407] sdb: sdb1
> > > [ 604.490214] udevd[5050]: inotify_add_watch(11, /dev/sdb1, 10)
> > > failed: No such file or directory
> > > [ 605.140952] sdb: sdb1
> > > [ 605.628686] sdb: sdb1 sdb2
> > > [ 606.271192] sdb: sdb1 sdb2 sdb3
> > > [ 607.079626] sdb: sdb1 sdb2 sdb3
> > > [ 607.611092] sda:
> > > [ 608.273201] sda: sda1
> > > [ 608.611952] sda: sda1 sda2
> > > [ 609.031326] sda: sda1 sda2 sda3
> > > [ 609.753140] md/raid10:md1: not clean -- starting background reconstruction
> > > [ 609.753145] md/raid10:md1: active with 2 out of 2 devices
> > > [ 609.768804] md1: detected capacity change from 0 to 32210157568
> > > [ 609.772677] md: resync of RAID array md1
> > > [ 614.590107] XFS (md1): Mounting V5 Filesystem
> > > [ 615.449035] XFS (md1): Ending clean mount
> > > [ 617.678462] md/raid1:md0: not clean -- starting background reconstruction
> > > [ 617.678469] md/raid1:md0: active with 2 out of 2 mirrors
> > > [ 617.740729] md0: detected capacity change from 0 to 524222464
> > > [ 617.747107] md: delaying resync of md0 until md1 has finished
> > > (they share one or more physical units)
> > 
> > What are md0 and md1? Note that I don't see md2 anywhere in this log.
> > 
> 
> Sorry that I did not clarify that immediately, this log was taken earlier,
> during installation, when I got to see it in dmesg.
> md0 was /boot (with EXT4), md1 was / with XFS.
> 
> Example of cat /proc/mdstat was taken later, when I brought up the system
> (by changing md1 to "near" layout at install time). So wherever you see md1
> or md2, you can assume they are the same thing: new RAID10/o2 being written
> to during initialization. But second time there was nothing in dmesg, so I
> could not attach that.
> 
> 
> > > [ 620.037818] EXT4-fs (md0): mounted filesystem with ordered data
> > > mode. Opts: (null)
> > > [ 1463.754785] INFO: task kworker/0:3:227 blocked for more than 480 seconds.
> > > [ 1463.754793] Not tainted 4.19.5-1-default #1
> > > [ 1463.754795] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [ 1463.754799] kworker/0:3 D 0 227 2 0x80000000
> > > [ 1463.755000] Workqueue: xfs-eofblocks/md1 xfs_eofblocks_worker [xfs]
> > > [ 1463.755005] Call Trace:
> > > [ 1463.755025] ? __schedule+0x29a/0x880
> > > [ 1463.755032] ? rwsem_down_write_failed+0x197/0x350
> > > [ 1463.755038] schedule+0x78/0x110
> > > [ 1463.755044] rwsem_down_write_failed+0x197/0x350
> > > [ 1463.755055] call_rwsem_down_write_failed+0x13/0x20
> > > [ 1463.755061] down_write+0x20/0x30
> > 
> > So we have a background task blocked on an inode lock.
> > 
> > > [ 1463.755196] xfs_free_eofblocks+0x114/0x1a0 [xfs]
> > > [ 1463.755330] xfs_inode_free_eofblocks+0xd3/0x1e0 [xfs]
> > > [ 1463.755459] ? xfs_inode_ag_walk_grab+0x5b/0x90 [xfs]
> > > [ 1463.755586] xfs_inode_ag_walk.isra.15+0x1aa/0x420 [xfs]
> > > [ 1463.755714] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > [ 1463.755727] ? trace_hardirqs_on_thunk+0x1a/0x1c
> > > [ 1463.755734] ? __switch_to_asm+0x40/0x70
> > > [ 1463.755738] ? __switch_to_asm+0x34/0x70
> > > [ 1463.755743] ? __switch_to_asm+0x40/0x70
> > > [ 1463.755748] ? __switch_to_asm+0x34/0x70
> > > [ 1463.755752] ? __switch_to_asm+0x40/0x70
> > > [ 1463.755757] ? __switch_to_asm+0x34/0x70
> > > [ 1463.755762] ? __switch_to_asm+0x40/0x70
> > > [ 1463.755893] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > [ 1463.755900] ? radix_tree_gang_lookup_tag+0xc2/0x140
> > > [ 1463.756032] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > [ 1463.756158] xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> > > [ 1463.756288] xfs_eofblocks_worker+0x29/0x40 [xfs]
> > > [ 1463.756298] process_one_work+0x1fd/0x420
> > > [ 1463.756305] worker_thread+0x2d/0x3d0
> > > [ 1463.756311] ? rescuer_thread+0x340/0x340
> > > [ 1463.756316] kthread+0x112/0x130
> > > [ 1463.756322] ? kthread_create_worker_on_cpu+0x40/0x40
> > > [ 1463.756329] ret_from_fork+0x3a/0x50
> > > [ 1463.756375] INFO: task kworker/u4:0:4615 blocked for more than 480 seconds.
> > > [ 1463.756379] Not tainted 4.19.5-1-default #1
> > > [ 1463.756380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [ 1463.756383] kworker/u4:0 D 0 4615 2 0x80000000
> > > [ 1463.756395] Workqueue: writeback wb_workfn (flush-9:1)
> > > [ 1463.756400] Call Trace:
> > > [ 1463.756409] ? __schedule+0x29a/0x880
> > > [ 1463.756420] ? wait_barrier+0xdd/0x170 [raid10]
> > > [ 1463.756426] schedule+0x78/0x110
> > > [ 1463.756433] wait_barrier+0xdd/0x170 [raid10]
> > > [ 1463.756440] ? wait_woken+0x80/0x80
> > > [ 1463.756448] raid10_write_request+0xf2/0x900 [raid10]
> > > [ 1463.756454] ? wait_woken+0x80/0x80
> > > [ 1463.756459] ? mempool_alloc+0x55/0x160
> > > [ 1463.756483] ? md_write_start+0xa9/0x270 [md_mod]
> > > [ 1463.756492] raid10_make_request+0xc1/0x120 [raid10]
> > > [ 1463.756498] ? wait_woken+0x80/0x80
> > > [ 1463.756514] md_handle_request+0x121/0x190 [md_mod]
> > > [ 1463.756535] md_make_request+0x78/0x190 [md_mod]
> > > [ 1463.756544] generic_make_request+0x1c6/0x470
> > > [ 1463.756553] submit_bio+0x45/0x140
> > 
> > Writeback is blocked submitting I/O down in the MD driver.
> > 
> > > [ 1463.756714] xfs_submit_ioend+0x9c/0x1e0 [xfs]
> > > [ 1463.756844] xfs_vm_writepages+0x68/0x80 [xfs]
> > > [ 1463.756856] do_writepages+0x31/0xb0
> > > [ 1463.756865] ? read_hpet+0x126/0x130
> > > [ 1463.756873] ? ktime_get+0x36/0xa0
> > > [ 1463.756881] __writeback_single_inode+0x3d/0x3e0
> > > [ 1463.756889] writeback_sb_inodes+0x1c4/0x430
> > > [ 1463.756902] __writeback_inodes_wb+0x5d/0xb0
> > > [ 1463.756910] wb_writeback+0x26b/0x310
> > > [ 1463.756920] wb_workfn+0x33a/0x410
> > > [ 1463.756932] process_one_work+0x1fd/0x420
> > > [ 1463.756940] worker_thread+0x2d/0x3d0
> > > [ 1463.756946] ? rescuer_thread+0x340/0x340
> > > [ 1463.756951] kthread+0x112/0x130
> > > [ 1463.756957] ? kthread_create_worker_on_cpu+0x40/0x40
> > > [ 1463.756965] ret_from_fork+0x3a/0x50
> > > [ 1463.756979] INFO: task kworker/0:2:4994 blocked for more than 480 seconds.
> > > [ 1463.756982] Not tainted 4.19.5-1-default #1
> > > [ 1463.756984] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [ 1463.756987] kworker/0:2 D 0 4994 2 0x80000000
> > > [ 1463.757013] Workqueue: md submit_flushes [md_mod]
> > > [ 1463.757016] Call Trace:
> > > [ 1463.757024] ? __schedule+0x29a/0x880
> > > [ 1463.757034] ? wait_barrier+0xdd/0x170 [raid10]
> > > [ 1463.757039] schedule+0x78/0x110
> > > [ 1463.757047] wait_barrier+0xdd/0x170 [raid10]
> > > [ 1463.757054] ? wait_woken+0x80/0x80
> > > [ 1463.757062] raid10_write_request+0xf2/0x900 [raid10]
> > > [ 1463.757067] ? wait_woken+0x80/0x80
> > > [ 1463.757072] ? mempool_alloc+0x55/0x160
> > > [ 1463.757088] ? md_write_start+0xa9/0x270 [md_mod]
> > > [ 1463.757095] ? trace_hardirqs_off_thunk+0x1a/0x1c
> > > [ 1463.757104] raid10_make_request+0xc1/0x120 [raid10]
> > > [ 1463.757110] ? wait_woken+0x80/0x80
> > > [ 1463.757126] md_handle_request+0x121/0x190 [md_mod]
> > > [ 1463.757132] ? _raw_spin_unlock_irq+0x22/0x40
> > > [ 1463.757137] ? finish_task_switch+0x74/0x260
> > > [ 1463.757156] submit_flushes+0x21/0x40 [md_mod]
> > 
> > Some other MD task (?) also blocked submitting a request.
> > 
> > > [ 1463.757163] process_one_work+0x1fd/0x420
> > > [ 1463.757170] worker_thread+0x2d/0x3d0
> > > [ 1463.757177] ? rescuer_thread+0x340/0x340
> > > [ 1463.757181] kthread+0x112/0x130
> > > [ 1463.757186] ? kthread_create_worker_on_cpu+0x40/0x40
> > > [ 1463.757193] ret_from_fork+0x3a/0x50
> > > [ 1463.757205] INFO: task md1_resync:5215 blocked for more than 480 seconds.
> > > [ 1463.757207] Not tainted 4.19.5-1-default #1
> > > [ 1463.757209] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [ 1463.757212] md1_resync D 0 5215 2 0x80000000
> > > [ 1463.757216] Call Trace:
> > > [ 1463.757223] ? __schedule+0x29a/0x880
> > > [ 1463.757231] ? raise_barrier+0x8d/0x140 [raid10]
> > > [ 1463.757236] schedule+0x78/0x110
> > > [ 1463.757243] raise_barrier+0x8d/0x140 [raid10]
> 
> > > [ 1463.757248] ? wait_woken+0x80/0x80
> > > [ 1463.757257] raid10_sync_request+0x1f6/0x1e30 [raid10]
> > > [ 1463.757265] ? _raw_spin_unlock_irq+0x22/0x40
> > > [ 1463.757284] ? is_mddev_idle+0x125/0x137 [md_mod]
> > > [ 1463.757302] md_do_sync.cold.78+0x404/0x969 [md_mod]
> > 
> > The md1 sync task is blocked, I'm not sure on what.
> > 
> > > [ 1463.757311] ? wait_woken+0x80/0x80
> > > [ 1463.757336] ? md_rdev_init+0xb0/0xb0 [md_mod]
> > > [ 1463.757351] md_thread+0xe9/0x140 [md_mod]
> > > [ 1463.757358] ? _raw_spin_unlock_irqrestore+0x2e/0x60
> > > [ 1463.757364] ? __kthread_parkme+0x4c/0x70
> > > [ 1463.757369] kthread+0x112/0x130
> > > [ 1463.757374] ? kthread_create_worker_on_cpu+0x40/0x40
> > > [ 1463.757380] ret_from_fork+0x3a/0x50
> > > [ 1463.757395] INFO: task xfsaild/md1:5233 blocked for more than 480 seconds.
> > > [ 1463.757398] Not tainted 4.19.5-1-default #1
> > > [ 1463.757400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [ 1463.757402] xfsaild/md1 D 0 5233 2 0x80000000
> > > [ 1463.757406] Call Trace:
> > > [ 1463.757413] ? __schedule+0x29a/0x880
> > > [ 1463.757421] ? wait_barrier+0xdd/0x170 [raid10]
> > > [ 1463.757426] schedule+0x78/0x110
> > > [ 1463.757433] wait_barrier+0xdd/0x170 [raid10]
> > > [ 1463.757438] ? wait_woken+0x80/0x80
> > > [ 1463.757446] raid10_write_request+0xf2/0x900 [raid10]
> > > [ 1463.757451] ? wait_woken+0x80/0x80
> > > [ 1463.757455] ? mempool_alloc+0x55/0x160
> > > [ 1463.757471] ? md_write_start+0xa9/0x270 [md_mod]
> > > [ 1463.757477] ? trace_hardirqs_on_thunk+0x1a/0x1c
> > > [ 1463.757485] raid10_make_request+0xc1/0x120 [raid10]
> > > [ 1463.757491] ? wait_woken+0x80/0x80
> > > [ 1463.757507] md_handle_request+0x121/0x190 [md_mod]
> > > [ 1463.757527] md_make_request+0x78/0x190 [md_mod]
> > > [ 1463.757536] generic_make_request+0x1c6/0x470
> > > [ 1463.757544] submit_bio+0x45/0x140
> > 
> > xfsaild (metadata writeback) is also blocked submitting I/O down in the
> > MD driver.
> > 
> > > [ 1463.757552] ? bio_add_page+0x48/0x60
> > > [ 1463.757716] _xfs_buf_ioapply+0x2c1/0x450 [xfs]
> > > [ 1463.757849] ? xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
> > > [ 1463.757974] __xfs_buf_submit+0x67/0x270 [xfs]
> > > [ 1463.758102] xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
> > > [ 1463.758232] ? xfsaild+0x294/0x7e0 [xfs]
> > > [ 1463.758364] xfsaild+0x294/0x7e0 [xfs]
> > > [ 1463.758377] ? _raw_spin_unlock_irqrestore+0x2e/0x60
> > > [ 1463.758508] ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
> > > [ 1463.758514] kthread+0x112/0x130
> > > [ 1463.758520] ? kthread_create_worker_on_cpu+0x40/0x40
> > > [ 1463.758527] ret_from_fork+0x3a/0x50
> > > [ 1463.758543] INFO: task rpm:5364 blocked for more than 480 seconds.
> > > [ 1463.758546] Not tainted 4.19.5-1-default #1
> > > [ 1463.758547] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [ 1463.758550] rpm D 0 5364 3757 0x00000000
> > > [ 1463.758554] Call Trace:
> > > [ 1463.758563] ? __schedule+0x29a/0x880
> > > [ 1463.758701] ? xlog_wait+0x5c/0x70 [xfs]
> > > [ 1463.759821] schedule+0x78/0x110
> > > [ 1463.760022] xlog_wait+0x5c/0x70 [xfs]
> > > [ 1463.760036] ? wake_up_q+0x70/0x70
> > > [ 1463.760167] __xfs_log_force_lsn+0x223/0x230 [xfs]
> > > [ 1463.760297] ? xfs_file_fsync+0x196/0x1d0 [xfs]
> > > [ 1463.760424] xfs_log_force_lsn+0x93/0x140 [xfs]
> > > [ 1463.760552] xfs_file_fsync+0x196/0x1d0 [xfs]
> > 
> > An fsync is blocked, presumably on XFS log I/O completion.
> > 
> > > [ 1463.760562] ? __sb_end_write+0x36/0x60
> > > [ 1463.760571] do_fsync+0x38/0x70
> > > [ 1463.760578] __x64_sys_fdatasync+0x13/0x20
> > > [ 1463.760585] do_syscall_64+0x60/0x110
> > > [ 1463.760594] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > > [ 1463.760603] RIP: 0033:0x7f9757fae8a4
> > > [ 1463.760616] Code: Bad RIP value.
> > > [ 1463.760619] RSP: 002b:00007fff74fdb428 EFLAGS: 00000246 ORIG_RAX:
> > > 000000000000004b
> > > [ 1463.760654] RAX: ffffffffffffffda RBX: 0000000000000064 RCX: 00007f9757fae8a4
> > > [ 1463.760657] RDX: 00000000012c4c60 RSI: 00000000012cc130 RDI: 0000000000000004
> > > [ 1463.760660] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f9758708c00
> > > [ 1463.760662] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000012cc130
> > > [ 1463.760665] R13: 000000000123a3a0 R14: 0000000000010830 R15: 0000000000000062
> > > [ 1463.760679] INFO: task kworker/0:8:5367 blocked for more than 480 seconds.
> > > [ 1463.760683] Not tainted 4.19.5-1-default #1
> > > [ 1463.760684] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [ 1463.760687] kworker/0:8 D 0 5367 2 0x80000000
> > > [ 1463.760718] Workqueue: md submit_flushes [md_mod]
> > 
> > And that MD submit_flushes thing again.
> > 
> > Not to say there isn't some issue between XFS and MD going on here, but
> > I think we might want an MD person to take a look at this and possibly
> > provide some insight. From an XFS perspective, this all just looks like
> > we're blocked on I/O (via writeback, AIL and log) to a slow device.
> > 
> > Brian
> > 
> > > [ 1463.760721] Call Trace:
> > > [ 1463.760731] ? __schedule+0x29a/0x880
> > > [ 1463.760741] ? wait_barrier+0xdd/0x170 [raid10]
> > > [ 1463.760746] schedule+0x78/0x110
> > > [ 1463.760753] wait_barrier+0xdd/0x170 [raid10]
> > > [ 1463.760761] ? wait_woken+0x80/0x80
> > > [ 1463.760768] raid10_write_request+0xf2/0x900 [raid10]
> > > [ 1463.760774] ? wait_woken+0x80/0x80
> > > [ 1463.760778] ? mempool_alloc+0x55/0x160
> > > [ 1463.760795] ? md_write_start+0xa9/0x270 [md_mod]
> > > [ 1463.760801] ? try_to_wake_up+0x44/0x470
> > > [ 1463.760810] raid10_make_request+0xc1/0x120 [raid10]
> > > [ 1463.760816] ? wait_woken+0x80/0x80
> > > [ 1463.760831] md_handle_request+0x121/0x190 [md_mod]
> > > [ 1463.760851] md_make_request+0x78/0x190 [md_mod]
> > > [ 1463.760860] generic_make_request+0x1c6/0x470
> > > [ 1463.760870] raid10_write_request+0x77a/0x900 [raid10]
> > > [ 1463.760875] ? wait_woken+0x80/0x80
> > > [ 1463.760879] ? mempool_alloc+0x55/0x160
> > > [ 1463.760895] ? md_write_start+0xa9/0x270 [md_mod]
> > > [ 1463.760904] raid10_make_request+0xc1/0x120 [raid10]
> > > [ 1463.760910] ? wait_woken+0x80/0x80
> > > [ 1463.760926] md_handle_request+0x121/0x190 [md_mod]
> > > [ 1463.760931] ? _raw_spin_unlock_irq+0x22/0x40
> > > [ 1463.760936] ? finish_task_switch+0x74/0x260
> > > [ 1463.760954] submit_flushes+0x21/0x40 [md_mod]
> > > [ 1463.760962] process_one_work+0x1fd/0x420
> > > [ 1463.760970] worker_thread+0x2d/0x3d0
> > > [ 1463.760976] ? rescuer_thread+0x340/0x340
> > > [ 1463.760981] kthread+0x112/0x130
> > > [ 1463.760986] ? kthread_create_worker_on_cpu+0x40/0x40
> > > [ 1463.760992] ret_from_fork+0x3a/0x50
> 
> -- 
> Srdačan pozdrav/Best regards,
> Siniša Bandin
> 


* Re: XFS and RAID10 with o2 layout
  2018-12-13 12:28     ` Brian Foster
@ 2018-12-13 13:02       ` Sinisa
  2018-12-13 17:30         ` keld
  0 siblings, 1 reply; 16+ messages in thread
From: Sinisa @ 2018-12-13 13:02 UTC (permalink / raw)
  To: linux-xfs; +Cc: linux-raid

On 12/13/18 1:28 PM, Brian Foster wrote:
> On Thu, Dec 13, 2018 at 09:21:18AM +0100, Sinisa wrote:
>> Thanks for a quick reply. Replies are inline...
>>
>> On 12.12.2018 15:30, Brian Foster wrote:
>>> cc linux-raid
>>>
>>> On Wed, Dec 12, 2018 at 01:29:49PM +0100, Sinisa wrote:
>>>> Hello group,
>>>>
>>>> I have noticed something strange going on lately, but recently I have come
>>>> to conclusion that there is some unwanted interaction between XFS and Linux
>>>> RAID10 with "offset" layout.
>>>>
>>>> So here is the problem: I create a Linux RAID10 mirror with 2 disks (HDD or
>>>> SSD) and "o2" layout (best choice for read and write speed):
>>>> # mdadm -C -n2 -l10 -po2 /dev/mdX /dev/sdaX /dev/sdbX
>>>> # mkfs.xfs /dev/mdX
>>>> # mount /dev/mdX /mnt
>>>> # rsync -avxDPHS / /mnt
>>>>
>>>> So we have RAID10 initializing:
>>>>
>>>> # cat /proc/mdstat
>>>> Personalities : [raid1] [raid10]
>>>> md2 : active raid10 sdb3[1] sda3[0]
>>>>        314433536 blocks super 1.2 4096K chunks 2 offset-copies [2/2] [UU]
>>>>        [==>..................]  resync = 11.7% (36917568/314433536)
>>>> finish=8678.2min speed=532K/sec
>>>>        bitmap: 3/3 pages [12KB], 65536KB chunk
>>>>
>>>> but after a few minutes everything stops like you can see above. Rsync (or
>>>> any other process writing to that md device) also freezes. If I try to read
>>>> already copied files - freeze, usually with less than 2GB copied.
>>>>
>>> Does the same thing happen without the RAID initialization? E.g., if you
>>> wait for it to complete or (IIRC) if you create with --assume-clean? I
>>> assume the init-in-progress state is common with your tests on other
>>> filesystems?
>>>
>> No, if I wait for RAID to finish initializing, or create it with
>> --assume-clean, everything works just fine.
>>
>> Actually, ever since openSUSE LEAP 15.0 release I have been doing just that:
>> pause installation process until initialization is done, then let it go on.
>>
>> But recently it has happened so that I had to replace one of the disks in a
>> "live" system (small file server), and was unable to do that on multiple
>> tries during work hours because of this problem. When I waited until
>> afternoon, when nobody was working/writing, resync was able to finish...
>>
> So apparently there is some kind of poor interaction here with the
> internal MD resync code. It's not clear to me whether it's a lockup or
> extreme slowdown, but unless anybody else has ideas I'd suggest to
> solicit feedback from the MD devs (note that you dropped the linux-raid
> cc) as to why this set of I/O might be blocked in the raid device and go
> from there.
>
> Brian

(Sorry, I'm not very used to mailing lists; I have added linux-raid back to cc.)

Today I tried lowering the RAID10 chunk size to 512K (the default), and the only 
difference was that the freeze appeared much sooner.
I also tried the newest RC kernel, 4.20.0-rc6-2.g91eea17-default (from openSUSE 
kernel/HEAD), with the same results.

And it is definitely a lockup: I have left it running overnight, and once 
rsync/copy/write stops it never resumes, and once the RAID sync stops it never 
resumes either...
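
One way to tell a lockup from an extreme slowdown is to sample the resync
position from /proc/mdstat twice and compare the two values. A minimal sketch
follows; the function name resync_position is hypothetical, and the parsing
assumes the /proc/mdstat line format shown earlier in this thread.

```shell
#!/bin/sh
# Print the resync/recovery position (in blocks) of one md device,
# reading /proc/mdstat-formatted text from stdin.
# Prints nothing if that device has no resync or recovery in progress.
resync_position() {
    awk -v dev="$1" '
        $1 == dev && $2 == ":"             { in_dev = 1; next }  # our stanza starts
        in_dev && (NF == 0 || $2 == ":")   { exit }              # our stanza ends
        in_dev && match($0, /(resync|recovery) *= *[0-9.]+% *\([0-9]+\//) {
            s = substr($0, RSTART, RLENGTH)
            sub(/^.*\(/, "", s)   # drop everything up to and including "("
            sub(/\/$/, "", s)     # drop the trailing "/"
            print s
            exit
        }
    '
}

# Usage sketch (not run here): sample twice, one minute apart.
#   a=$(resync_position md2 < /proc/mdstat); sleep 60
#   b=$(resync_position md2 < /proc/mdstat)
#   [ -n "$a" ] && [ "$a" = "$b" ] && echo "md2: no resync progress in 60s"
```

If the position is identical across several samples taken minutes apart, the
resync is stalled rather than merely slow.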

I tried many times to get that dmesg report again, but without success 
(I waited up to 30 minutes). Any help is welcome...
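
When the hung-task report does not show up on its own, the kernel can be asked
to log the stacks of all blocked (uninterruptible) tasks through SysRq "w".
A minimal sketch, assuming root access and a kernel built with magic SysRq
support; dump_blocked_tasks and the SYSRQ_TRIGGER variable are hypothetical
names introduced here for illustration.

```shell
#!/bin/sh
# Ask the kernel to log stack traces of all tasks in uninterruptible (D) state.
# SYSRQ_TRIGGER is overridable only so the function can be tried safely against
# a scratch file; on a real system the target is /proc/sysrq-trigger.
dump_blocked_tasks() {
    trigger="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"
    if [ ! -w "$trigger" ]; then
        echo "cannot write $trigger (root and magic SysRq support needed)" >&2
        return 1
    fi
    # "w" is the SysRq command that dumps blocked tasks to the kernel log.
    printf 'w\n' > "$trigger"
}
```

After a successful call on a frozen system, the traces should appear at the end
of the dmesg output and can be saved from there.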


What I have not tried is another distribution, only openSUSE LEAP and Tumbleweed.


>
>>> A few more notes below inline to the log..
>>>
>>>> Sometimes in dmesg I get some kernel messages about "task kworker/2:1:55
>>>> blocked for more than 480 seconds." (please see attached dmesg.txt and my
>>>> reports here: https://bugzilla.opensuse.org/show_bug.cgi?id=1111073),
>>>> sometimes nothing at all. When this happens, I can only reboot with SysRq-b
>>>> or "physically" with reset/power button.
>>>>
>>>> Same thing can happen with "far" layout, but it seems to me that it does not
>>>> happen every time (or that often). I might be wrong, because I never use
>>>> "far" layout in real life, only for testing.
>>>> I was unable to reproduce the failure with "near" layout.
>>>>
>>>> Also with EXT4 or BTRFS and any layout everything works just as it should,
>>>> that is sync goes on until finished, and rsync, cp, or any other write work
>>>> just fine at the same time.
>>>>
>>>> Let me just add that I first saw this behavior in openSUSE LEAP 15.0 (kernel
>>>> 4.12). In previous versions (up to kernel 4.4) I never had this problem. In
>>>> the meantime I have tested with kernels up to 4.20rc and it is the same.
>>>> Unfortunately I cannot go back to test kernels 4.5 - 4.11 to pinpoint the
>>>> moment the problem first appeared.
>>>>
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Siniša Bandin
>>>> (excuse my English)
>>>>
>>>> [ 180.981499] SGI XFS with ACLs, security attributes, no debug enabled
>>>> [ 181.005019] XFS (md1): Mounting V5 Filesystem
>>>> [ 181.132076] XFS (md1): Starting recovery (logdev: internal)
>>>> [ 181.295606] XFS (md1): Ending recovery (logdev: internal)
>>>> [ 181.804011] XFS (md1): Unmounting Filesystem
>>>> [ 182.201794] XFS (md127): Mounting V4 Filesystem
>>>> [ 182.736958] md: recovery of RAID array md127
>>>> [ 182.915479] XFS (md127): Ending clean mount
>>>> [ 183.819702] XFS (md127): Unmounting Filesystem
>>>> [ 184.943831] EXT4-fs (md0): mounted filesystem with ordered data
>>>> mode. Opts: (null)
>>>> [ 529.784557] EXT4-fs (md0): mounted filesystem with ordered data
>>>> mode. Opts: (null)
>>>> [ 601.789958] md1: detected capacity change from 33284947968 to 0
>>>> [ 601.789973] md: md1 stopped.
>>>> [ 602.314112] md0: detected capacity change from 550436864 to 0
>>>> [ 602.314128] md: md0 stopped.
>>>> [ 602.745030] md: md127: recovery interrupted.
>>>> [ 603.131684] md127: detected capacity change from 966229229568 to 0
>>>> [ 603.132237] md: md127 stopped.
>>>> [ 603.435808] sda: sda1 sda2
>>>> [ 603.594074] udevd[5011]: inotify_add_watch(11, /dev/sda2, 10)
>>>> failed: No such file or directory
>>>> [ 603.643959] sda:
>>>> [ 603.844724] sdb: sdb1 sdb2
>>>> [ 604.255407] sdb: sdb1
>>>> [ 604.490214] udevd[5050]: inotify_add_watch(11, /dev/sdb1, 10)
>>>> failed: No such file or directory
>>>> [ 605.140952] sdb: sdb1
>>>> [ 605.628686] sdb: sdb1 sdb2
>>>> [ 606.271192] sdb: sdb1 sdb2 sdb3
>>>> [ 607.079626] sdb: sdb1 sdb2 sdb3
>>>> [ 607.611092] sda:
>>>> [ 608.273201] sda: sda1
>>>> [ 608.611952] sda: sda1 sda2
>>>> [ 609.031326] sda: sda1 sda2 sda3
>>>> [ 609.753140] md/raid10:md1: not clean -- starting background reconstruction
>>>> [ 609.753145] md/raid10:md1: active with 2 out of 2 devices
>>>> [ 609.768804] md1: detected capacity change from 0 to 32210157568
>>>> [ 609.772677] md: resync of RAID array md1
>>>> [ 614.590107] XFS (md1): Mounting V5 Filesystem
>>>> [ 615.449035] XFS (md1): Ending clean mount
>>>> [ 617.678462] md/raid1:md0: not clean -- starting background reconstruction
>>>> [ 617.678469] md/raid1:md0: active with 2 out of 2 mirrors
>>>> [ 617.740729] md0: detected capacity change from 0 to 524222464
>>>> [ 617.747107] md: delaying resync of md0 until md1 has finished
>>>> (they share one or more physical units)
>>> What are md0 and md1? Note that I don't see md2 anywhere in this log.
>>>
>> Sorry that I did not clarify that immediately, this log was taken earlier,
>> during installation, when I got to see it in dmesg.
>> md0 was /boot (with EXT4), md1 was / with XFS.
>>
>> Example of cat /proc/mdstat was taken later, when I brought up the system
>> (by changing md1 to "near" layout at install time). So wherever you see md1
>> or md2, you can assume they are the same thing: new RAID10/o2 being written
>> to during initialization. But second time there was nothing in dmesg, so I
>> could not attach that.
>>
>>
>>>> [ 620.037818] EXT4-fs (md0): mounted filesystem with ordered data
>>>> mode. Opts: (null)
>>>> [ 1463.754785] INFO: task kworker/0:3:227 blocked for more than 480 seconds.
>>>> [ 1463.754793] Not tainted 4.19.5-1-default #1
>>>> [ 1463.754795] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>> disables this message.
>>>> [ 1463.754799] kworker/0:3 D 0 227 2 0x80000000
>>>> [ 1463.755000] Workqueue: xfs-eofblocks/md1 xfs_eofblocks_worker [xfs]
>>>> [ 1463.755005] Call Trace:
>>>> [ 1463.755025] ? __schedule+0x29a/0x880
>>>> [ 1463.755032] ? rwsem_down_write_failed+0x197/0x350
>>>> [ 1463.755038] schedule+0x78/0x110
>>>> [ 1463.755044] rwsem_down_write_failed+0x197/0x350
>>>> [ 1463.755055] call_rwsem_down_write_failed+0x13/0x20
>>>> [ 1463.755061] down_write+0x20/0x30
>>> So we have a background task blocked on an inode lock.
>>>
>>>> [ 1463.755196] xfs_free_eofblocks+0x114/0x1a0 [xfs]
>>>> [ 1463.755330] xfs_inode_free_eofblocks+0xd3/0x1e0 [xfs]
>>>> [ 1463.755459] ? xfs_inode_ag_walk_grab+0x5b/0x90 [xfs]
>>>> [ 1463.755586] xfs_inode_ag_walk.isra.15+0x1aa/0x420 [xfs]
>>>> [ 1463.755714] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
>>>> [ 1463.755727] ? trace_hardirqs_on_thunk+0x1a/0x1c
>>>> [ 1463.755734] ? __switch_to_asm+0x40/0x70
>>>> [ 1463.755738] ? __switch_to_asm+0x34/0x70
>>>> [ 1463.755743] ? __switch_to_asm+0x40/0x70
>>>> [ 1463.755748] ? __switch_to_asm+0x34/0x70
>>>> [ 1463.755752] ? __switch_to_asm+0x40/0x70
>>>> [ 1463.755757] ? __switch_to_asm+0x34/0x70
>>>> [ 1463.755762] ? __switch_to_asm+0x40/0x70
>>>> [ 1463.755893] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
>>>> [ 1463.755900] ? radix_tree_gang_lookup_tag+0xc2/0x140
>>>> [ 1463.756032] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
>>>> [ 1463.756158] xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
>>>> [ 1463.756288] xfs_eofblocks_worker+0x29/0x40 [xfs]
>>>> [ 1463.756298] process_one_work+0x1fd/0x420
>>>> [ 1463.756305] worker_thread+0x2d/0x3d0
>>>> [ 1463.756311] ? rescuer_thread+0x340/0x340
>>>> [ 1463.756316] kthread+0x112/0x130
>>>> [ 1463.756322] ? kthread_create_worker_on_cpu+0x40/0x40
>>>> [ 1463.756329] ret_from_fork+0x3a/0x50
>>>> [ 1463.756375] INFO: task kworker/u4:0:4615 blocked for more than 480 seconds.
>>>> [ 1463.756379] Not tainted 4.19.5-1-default #1
>>>> [ 1463.756380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>> disables this message.
>>>> [ 1463.756383] kworker/u4:0 D 0 4615 2 0x80000000
>>>> [ 1463.756395] Workqueue: writeback wb_workfn (flush-9:1)
>>>> [ 1463.756400] Call Trace:
>>>> [ 1463.756409] ? __schedule+0x29a/0x880
>>>> [ 1463.756420] ? wait_barrier+0xdd/0x170 [raid10]
>>>> [ 1463.756426] schedule+0x78/0x110
>>>> [ 1463.756433] wait_barrier+0xdd/0x170 [raid10]
>>>> [ 1463.756440] ? wait_woken+0x80/0x80
>>>> [ 1463.756448] raid10_write_request+0xf2/0x900 [raid10]
>>>> [ 1463.756454] ? wait_woken+0x80/0x80
>>>> [ 1463.756459] ? mempool_alloc+0x55/0x160
>>>> [ 1463.756483] ? md_write_start+0xa9/0x270 [md_mod]
>>>> [ 1463.756492] raid10_make_request+0xc1/0x120 [raid10]
>>>> [ 1463.756498] ? wait_woken+0x80/0x80
>>>> [ 1463.756514] md_handle_request+0x121/0x190 [md_mod]
>>>> [ 1463.756535] md_make_request+0x78/0x190 [md_mod]
>>>> [ 1463.756544] generic_make_request+0x1c6/0x470
>>>> [ 1463.756553] submit_bio+0x45/0x140
>>> Writeback is blocked submitting I/O down in the MD driver.
>>>
>>>> [ 1463.756714] xfs_submit_ioend+0x9c/0x1e0 [xfs]
>>>> [ 1463.756844] xfs_vm_writepages+0x68/0x80 [xfs]
>>>> [ 1463.756856] do_writepages+0x31/0xb0
>>>> [ 1463.756865] ? read_hpet+0x126/0x130
>>>> [ 1463.756873] ? ktime_get+0x36/0xa0
>>>> [ 1463.756881] __writeback_single_inode+0x3d/0x3e0
>>>> [ 1463.756889] writeback_sb_inodes+0x1c4/0x430
>>>> [ 1463.756902] __writeback_inodes_wb+0x5d/0xb0
>>>> [ 1463.756910] wb_writeback+0x26b/0x310
>>>> [ 1463.756920] wb_workfn+0x33a/0x410
>>>> [ 1463.756932] process_one_work+0x1fd/0x420
>>>> [ 1463.756940] worker_thread+0x2d/0x3d0
>>>> [ 1463.756946] ? rescuer_thread+0x340/0x340
>>>> [ 1463.756951] kthread+0x112/0x130
>>>> [ 1463.756957] ? kthread_create_worker_on_cpu+0x40/0x40
>>>> [ 1463.756965] ret_from_fork+0x3a/0x50
>>>> [ 1463.756979] INFO: task kworker/0:2:4994 blocked for more than 480 seconds.
>>>> [ 1463.756982] Not tainted 4.19.5-1-default #1
>>>> [ 1463.756984] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>> disables this message.
>>>> [ 1463.756987] kworker/0:2 D 0 4994 2 0x80000000
>>>> [ 1463.757013] Workqueue: md submit_flushes [md_mod]
>>>> [ 1463.757016] Call Trace:
>>>> [ 1463.757024] ? __schedule+0x29a/0x880
>>>> [ 1463.757034] ? wait_barrier+0xdd/0x170 [raid10]
>>>> [ 1463.757039] schedule+0x78/0x110
>>>> [ 1463.757047] wait_barrier+0xdd/0x170 [raid10]
>>>> [ 1463.757054] ? wait_woken+0x80/0x80
>>>> [ 1463.757062] raid10_write_request+0xf2/0x900 [raid10]
>>>> [ 1463.757067] ? wait_woken+0x80/0x80
>>>> [ 1463.757072] ? mempool_alloc+0x55/0x160
>>>> [ 1463.757088] ? md_write_start+0xa9/0x270 [md_mod]
>>>> [ 1463.757095] ? trace_hardirqs_off_thunk+0x1a/0x1c
>>>> [ 1463.757104] raid10_make_request+0xc1/0x120 [raid10]
>>>> [ 1463.757110] ? wait_woken+0x80/0x80
>>>> [ 1463.757126] md_handle_request+0x121/0x190 [md_mod]
>>>> [ 1463.757132] ? _raw_spin_unlock_irq+0x22/0x40
>>>> [ 1463.757137] ? finish_task_switch+0x74/0x260
>>>> [ 1463.757156] submit_flushes+0x21/0x40 [md_mod]
>>> Some other MD task (?) also blocked submitting a request.
>>>
>>>> [ 1463.757163] process_one_work+0x1fd/0x420
>>>> [ 1463.757170] worker_thread+0x2d/0x3d0
>>>> [ 1463.757177] ? rescuer_thread+0x340/0x340
>>>> [ 1463.757181] kthread+0x112/0x130
>>>> [ 1463.757186] ? kthread_create_worker_on_cpu+0x40/0x40
>>>> [ 1463.757193] ret_from_fork+0x3a/0x50
>>>> [ 1463.757205] INFO: task md1_resync:5215 blocked for more than 480 seconds.
>>>> [ 1463.757207] Not tainted 4.19.5-1-default #1
>>>> [ 1463.757209] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>> disables this message.
>>>> [ 1463.757212] md1_resync D 0 5215 2 0x80000000
>>>> [ 1463.757216] Call Trace:
>>>> [ 1463.757223] ? __schedule+0x29a/0x880
>>>> [ 1463.757231] ? raise_barrier+0x8d/0x140 [raid10]
>>>> [ 1463.757236] schedule+0x78/0x110
>>>> [ 1463.757243] raise_barrier+0x8d/0x140 [raid10]
>>>> [ 1463.757248] ? wait_woken+0x80/0x80
>>>> [ 1463.757257] raid10_sync_request+0x1f6/0x1e30 [raid10]
>>>> [ 1463.757265] ? _raw_spin_unlock_irq+0x22/0x40
>>>> [ 1463.757284] ? is_mddev_idle+0x125/0x137 [md_mod]
>>>> [ 1463.757302] md_do_sync.cold.78+0x404/0x969 [md_mod]
>>> The md1 sync task is blocked, I'm not sure on what.
>>>
>>>> [ 1463.757311] ? wait_woken+0x80/0x80
>>>> [ 1463.757336] ? md_rdev_init+0xb0/0xb0 [md_mod]
>>>> [ 1463.757351] md_thread+0xe9/0x140 [md_mod]
>>>> [ 1463.757358] ? _raw_spin_unlock_irqrestore+0x2e/0x60
>>>> [ 1463.757364] ? __kthread_parkme+0x4c/0x70
>>>> [ 1463.757369] kthread+0x112/0x130
>>>> [ 1463.757374] ? kthread_create_worker_on_cpu+0x40/0x40
>>>> [ 1463.757380] ret_from_fork+0x3a/0x50
>>>> [ 1463.757395] INFO: task xfsaild/md1:5233 blocked for more than 480 seconds.
>>>> [ 1463.757398] Not tainted 4.19.5-1-default #1
>>>> [ 1463.757400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>> disables this message.
>>>> [ 1463.757402] xfsaild/md1 D 0 5233 2 0x80000000
>>>> [ 1463.757406] Call Trace:
>>>> [ 1463.757413] ? __schedule+0x29a/0x880
>>>> [ 1463.757421] ? wait_barrier+0xdd/0x170 [raid10]
>>>> [ 1463.757426] schedule+0x78/0x110
>>>> [ 1463.757433] wait_barrier+0xdd/0x170 [raid10]
>>>> [ 1463.757438] ? wait_woken+0x80/0x80
>>>> [ 1463.757446] raid10_write_request+0xf2/0x900 [raid10]
>>>> [ 1463.757451] ? wait_woken+0x80/0x80
>>>> [ 1463.757455] ? mempool_alloc+0x55/0x160
>>>> [ 1463.757471] ? md_write_start+0xa9/0x270 [md_mod]
>>>> [ 1463.757477] ? trace_hardirqs_on_thunk+0x1a/0x1c
>>>> [ 1463.757485] raid10_make_request+0xc1/0x120 [raid10]
>>>> [ 1463.757491] ? wait_woken+0x80/0x80
>>>> [ 1463.757507] md_handle_request+0x121/0x190 [md_mod]
>>>> [ 1463.757527] md_make_request+0x78/0x190 [md_mod]
>>>> [ 1463.757536] generic_make_request+0x1c6/0x470
>>>> [ 1463.757544] submit_bio+0x45/0x140
>>> xfsaild (metadata writeback) is also blocked submitting I/O down in the
>>> MD driver.
>>>
>>>> [ 1463.757552] ? bio_add_page+0x48/0x60
>>>> [ 1463.757716] _xfs_buf_ioapply+0x2c1/0x450 [xfs]
>>>> [ 1463.757849] ? xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
>>>> [ 1463.757974] __xfs_buf_submit+0x67/0x270 [xfs]
>>>> [ 1463.758102] xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
>>>> [ 1463.758232] ? xfsaild+0x294/0x7e0 [xfs]
>>>> [ 1463.758364] xfsaild+0x294/0x7e0 [xfs]
>>>> [ 1463.758377] ? _raw_spin_unlock_irqrestore+0x2e/0x60
>>>> [ 1463.758508] ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
>>>> [ 1463.758514] kthread+0x112/0x130
>>>> [ 1463.758520] ? kthread_create_worker_on_cpu+0x40/0x40
>>>> [ 1463.758527] ret_from_fork+0x3a/0x50
>>>> [ 1463.758543] INFO: task rpm:5364 blocked for more than 480 seconds.
>>>> [ 1463.758546] Not tainted 4.19.5-1-default #1
>>>> [ 1463.758547] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>> disables this message.
>>>> [ 1463.758550] rpm D 0 5364 3757 0x00000000
>>>> [ 1463.758554] Call Trace:
>>>> [ 1463.758563] ? __schedule+0x29a/0x880
>>>> [ 1463.758701] ? xlog_wait+0x5c/0x70 [xfs]
>>>> [ 1463.759821] schedule+0x78/0x110
>>>> [ 1463.760022] xlog_wait+0x5c/0x70 [xfs]
>>>> [ 1463.760036] ? wake_up_q+0x70/0x70
>>>> [ 1463.760167] __xfs_log_force_lsn+0x223/0x230 [xfs]
>>>> [ 1463.760297] ? xfs_file_fsync+0x196/0x1d0 [xfs]
>>>> [ 1463.760424] xfs_log_force_lsn+0x93/0x140 [xfs]
>>>> [ 1463.760552] xfs_file_fsync+0x196/0x1d0 [xfs]
>>> An fsync is blocked, presumably on XFS log I/O completion.
>>>
>>>> [ 1463.760562] ? __sb_end_write+0x36/0x60
>>>> [ 1463.760571] do_fsync+0x38/0x70
>>>> [ 1463.760578] __x64_sys_fdatasync+0x13/0x20
>>>> [ 1463.760585] do_syscall_64+0x60/0x110
>>>> [ 1463.760594] entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>> [ 1463.760603] RIP: 0033:0x7f9757fae8a4
>>>> [ 1463.760616] Code: Bad RIP value.
>>>> [ 1463.760619] RSP: 002b:00007fff74fdb428 EFLAGS: 00000246 ORIG_RAX:
>>>> 000000000000004b
>>>> [ 1463.760654] RAX: ffffffffffffffda RBX: 0000000000000064 RCX: 00007f9757fae8a4
>>>> [ 1463.760657] RDX: 00000000012c4c60 RSI: 00000000012cc130 RDI: 0000000000000004
>>>> [ 1463.760660] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f9758708c00
>>>> [ 1463.760662] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000012cc130
>>>> [ 1463.760665] R13: 000000000123a3a0 R14: 0000000000010830 R15: 0000000000000062
>>>> [ 1463.760679] INFO: task kworker/0:8:5367 blocked for more than 480 seconds.
>>>> [ 1463.760683] Not tainted 4.19.5-1-default #1
>>>> [ 1463.760684] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>> disables this message.
>>>> [ 1463.760687] kworker/0:8 D 0 5367 2 0x80000000
>>>> [ 1463.760718] Workqueue: md submit_flushes [md_mod]
>>> And that MD submit_flushes thing again.
>>>
>>> Not to say there isn't some issue between XFS and MD going on here, but
>>> I think we might want an MD person to take a look at this and possibly
>>> provide some insight. From an XFS perspective, this all just looks like
>>> we're blocked on I/O (via writeback, AIL and log) to a slow device.
>>>
>>> Brian
>>>
>>>> [ 1463.760721] Call Trace:
>>>> [ 1463.760731] ? __schedule+0x29a/0x880
>>>> [ 1463.760741] ? wait_barrier+0xdd/0x170 [raid10]
>>>> [ 1463.760746] schedule+0x78/0x110
>>>> [ 1463.760753] wait_barrier+0xdd/0x170 [raid10]
>>>> [ 1463.760761] ? wait_woken+0x80/0x80
>>>> [ 1463.760768] raid10_write_request+0xf2/0x900 [raid10]
>>>> [ 1463.760774] ? wait_woken+0x80/0x80
>>>> [ 1463.760778] ? mempool_alloc+0x55/0x160
>>>> [ 1463.760795] ? md_write_start+0xa9/0x270 [md_mod]
>>>> [ 1463.760801] ? try_to_wake_up+0x44/0x470
>>>> [ 1463.760810] raid10_make_request+0xc1/0x120 [raid10]
>>>> [ 1463.760816] ? wait_woken+0x80/0x80
>>>> [ 1463.760831] md_handle_request+0x121/0x190 [md_mod]
>>>> [ 1463.760851] md_make_request+0x78/0x190 [md_mod]
>>>> [ 1463.760860] generic_make_request+0x1c6/0x470
>>>> [ 1463.760870] raid10_write_request+0x77a/0x900 [raid10]
>>>> [ 1463.760875] ? wait_woken+0x80/0x80
>>>> [ 1463.760879] ? mempool_alloc+0x55/0x160
>>>> [ 1463.760895] ? md_write_start+0xa9/0x270 [md_mod]
>>>> [ 1463.760904] raid10_make_request+0xc1/0x120 [raid10]
>>>> [ 1463.760910] ? wait_woken+0x80/0x80
>>>> [ 1463.760926] md_handle_request+0x121/0x190 [md_mod]
>>>> [ 1463.760931] ? _raw_spin_unlock_irq+0x22/0x40
>>>> [ 1463.760936] ? finish_task_switch+0x74/0x260
>>>> [ 1463.760954] submit_flushes+0x21/0x40 [md_mod]
>>>> [ 1463.760962] process_one_work+0x1fd/0x420
>>>> [ 1463.760970] worker_thread+0x2d/0x3d0
>>>> [ 1463.760976] ? rescuer_thread+0x340/0x340
>>>> [ 1463.760981] kthread+0x112/0x130
>>>> [ 1463.760986] ? kthread_create_worker_on_cpu+0x40/0x40
>>>> [ 1463.760992] ret_from_fork+0x3a/0x50
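
To get a quick overview of a long hung-task report like the one quoted above,
the first reliable frame after schedule() can be extracted for each blocked
task. A rough sketch, assuming plain dmesg text on stdin (quote markers already
removed); blocked_in is a hypothetical helper name, and frames prefixed with "?"
are skipped because the kernel marks them as unreliable.

```shell
#!/bin/sh
# For every "INFO: task NAME:PID blocked ..." entry in a dmesg dump on stdin,
# print "NAME:PID FUNCTION", where FUNCTION is the first non-"?" frame that
# follows the schedule+... frame, i.e. the function the task is sleeping in.
blocked_in() {
    awk '
        { sub(/^\[ *[0-9.]+\] */, "") }   # strip the "[ 1463.756375] " timestamp
        /^INFO: task .* blocked for more than/ {
            task = $3; armed = 0; next    # remember the task, wait for schedule()
        }
        task != "" && /^schedule\+/ { armed = 1; next }
        task != "" && armed && $1 != "?" {
            fn = $1
            sub(/\+.*/, "", fn)           # drop the "+0xdd/0x170" offset part
            print task, fn
            task = ""; armed = 0
        }
    '
}
```

Fed the report above, this should list md1_resync sleeping in raise_barrier and
xfsaild/md1 sleeping in wait_barrier, which makes it easier to see at a glance
which side each task is waiting on.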

-- 
Srdačan pozdrav/Best regards,
Siniša Bandin


* Re: XFS and RAID10 with o2 layout
  2018-12-13 13:02       ` Sinisa
@ 2018-12-13 17:30         ` keld
  2018-12-14  6:59           ` Sinisa
  0 siblings, 1 reply; 16+ messages in thread
From: keld @ 2018-12-13 17:30 UTC (permalink / raw)
  To: Sinisa; +Cc: linux-xfs, linux-raid

o2 is not the fastest layout, but when the block size is big enough
the loss from head movement is small. f2 is better, including for
smaller block sizes.




On Thu, Dec 13, 2018 at 02:02:13PM +0100, Sinisa wrote:
> On 12/13/18 1:28 PM, Brian Foster wrote:
> >On Thu, Dec 13, 2018 at 09:21:18AM +0100, Sinisa wrote:
> >>Thanks for a quick reply. Replies are inline...
> >>
> >>On 12.12.2018 15:30, Brian Foster wrote:
> >>>cc linux-raid
> >>>
> >>>On Wed, Dec 12, 2018 at 01:29:49PM +0100, Sinisa wrote:
> >>>>Hello group,
> >>>>
> >>>>I have noticed something strange going on lately, but recently I have come
> >>>>to conclusion that there is some unwanted interaction between XFS and Linux
> >>>>RAID10 with "offset" layout.
> >>>>
> >>>>So here is the problem: I create a Linux RAID10 mirror with 2 disks (HDD or
> >>>>SSD) and "o2" layout (best choice for read and write speed):
> >>>># mdadm -C -n2 -l10 -po2 /dev/mdX /dev/sdaX /dev/sdbX
> >>>># mkfs.xfs /dev/mdX
> >>>># mount /dev/mdX /mnt
> >>>># rsync -avxDPHS / /mnt
> >>>>
> >>>>So we have RAID10 initializing:
> >>>>
> >>>># cat /proc/mdstat
> >>>>Personalities : [raid1] [raid10]
> >>>>md2 : active raid10 sdb3[1] sda3[0]
> >>>>       314433536 blocks super 1.2 4096K chunks 2 offset-copies [2/2] [UU]
> >>>>       [==>..................]  resync = 11.7% (36917568/314433536)
> >>>>finish=8678.2min speed=532K/sec
> >>>>       bitmap: 3/3 pages [12KB], 65536KB chunk
> >>>>
> >>>>but after a few minutes everything stops like you can see above. Rsync 
> >>>>(or
> >>>>any other process writing to that md device) also freezes. If I try to 
> >>>>read
> >>>>already copied files - freeze, usually with less than 2GB copied.
> >>>>
> >>>Does the same thing happen without the RAID initialization? E.g., if you
> >>>wait for it to complete or (IIRC) if you create with --assume-clean? I
> >>>assume the init-in-progress state is common with your tests on other
> >>>filesystems?
> >>>
> >>No, if I wait for RAID to finish initializing, or create it with
> >>--assume-clean, everything works just fine.
> >>
> >>Actually, ever since openSUSE LEAP 15.0 release I have been doing just 
> >>that:
> >>pause installation process until initialization is done, then let it go 
> >>on.
> >>
> >>But recently it has happened so that I had to replace one of the disks in 
> >>a
> >>"live" system (small file server), and was unable to do that on multiple
> >>tries during work hours because of this problem. When I waited until
> >>afternoon, when nobody was working/writing, resync was able to finish...
> >>
> >So apparently there is some kind of poor interaction here with the
> >internal MD resync code. It's not clear to me whether it's a lockup or
> >extreme slowdown, but unless anybody else has ideas I'd suggest to
> >solicit feedback from the MD devs (note that you dropped the linux-raid
> >cc) as to why this set of I/O might be blocked in the raid device and go
> >from there.
> >
> >Brian
> 
> (Sorry, I'm not very much into mailing lists, I have added linux-raid back 
> to cc)
> 
> Today I tried lowering the RAID10 chunk size to 512K (the default), and the 
> only difference was that the freeze appeared much faster.
> Also tried with newest RC kernel 4.20.0-rc6-2.g91eea17-default (from 
> openSUSE kernel/HEAD), with same results.
> 
> And it is definitely a lockup, because I have tried to leave it overnight, 
> but once rsync/copy/write stops, it never moves on, and also once RAID sync 
> stops it never moves on...
> 
> I was trying many times to get that dmesg report again, but without success 
> (waited up to 30 minutes). Any help is welcome...
> 
> 
> What I did not try is some other distribution, only openSUSE LEAP and 
> Tumbleweed
> 
> 
> >
> >>>A few more notes below inline to the log..
> >>>
> >>>>Sometimes in dmesg I get some kernel messages about "task kworker/2:1:55
> >>>>blocked for more than 480 seconds." (please see attached dmesg.txt and 
> >>>>my
> >>>>reports here: https://bugzilla.opensuse.org/show_bug.cgi?id=1111073),
> >>>>sometimes nothing at all. When this happens, I can only reboot with 
> >>>>SysRq-b
> >>>>or "physically" with reset/power button.
> >>>>
> >>>>Same thing can happen with "far" layout, but it seems to me that it 
> >>>>does not
> >>>>happen every time (or that often). I might be wrong, because I never use
> >>>>"far" layout in real life, only for testing.
> >>>>I was unable to reproduce the failure with "near" layout.
> >>>>
> >>>>Also with EXT4 or BTRFS and any layout everything works just as it 
> >>>>should,
> >>>>that is sync goes on until finished, and rsync, cp, or any other write 
> >>>>work
> >>>>just fine at the same time.
> >>>>
> >>>>Let me just add that I first saw this behavior in openSUSE LEAP 15.0 
> >>>>(kernel
> >>>>4.12). In previous versions (up to kernel 4.4) I never had this 
> >>>>problem. In
> >>>>the meantime I have tested with kernels up to 4.20rc and it is the same.
> >>>>Unfortunately I cannot go back to test kernels 4.5 - 4.11 to pinpoint 
> >>>>the
> >>>>moment the problem first appeared.
> >>>>
> >>>>
> >>>>
> >>>>--
> >>>>Best regards,
> >>>>Siniša Bandin
> >>>>(excuse my English)
> >>>>
> >>>>[ 180.981499] SGI XFS with ACLs, security attributes, no debug enabled
> >>>>[ 181.005019] XFS (md1): Mounting V5 Filesystem
> >>>>[ 181.132076] XFS (md1): Starting recovery (logdev: internal)
> >>>>[ 181.295606] XFS (md1): Ending recovery (logdev: internal)
> >>>>[ 181.804011] XFS (md1): Unmounting Filesystem
> >>>>[ 182.201794] XFS (md127): Mounting V4 Filesystem
> >>>>[ 182.736958] md: recovery of RAID array md127
> >>>>[ 182.915479] XFS (md127): Ending clean mount
> >>>>[ 183.819702] XFS (md127): Unmounting Filesystem
> >>>>[ 184.943831] EXT4-fs (md0): mounted filesystem with ordered data
> >>>>mode. Opts: (null)
> >>>>[ 529.784557] EXT4-fs (md0): mounted filesystem with ordered data
> >>>>mode. Opts: (null)
> >>>>[ 601.789958] md1: detected capacity change from 33284947968 to 0
> >>>>[ 601.789973] md: md1 stopped.
> >>>>[ 602.314112] md0: detected capacity change from 550436864 to 0
> >>>>[ 602.314128] md: md0 stopped.
> >>>>[ 602.745030] md: md127: recovery interrupted.
> >>>>[ 603.131684] md127: detected capacity change from 966229229568 to 0
> >>>>[ 603.132237] md: md127 stopped.
> >>>>[ 603.435808] sda: sda1 sda2
> >>>>[ 603.594074] udevd[5011]: inotify_add_watch(11, /dev/sda2, 10)
> >>>>failed: No such file or directory
> >>>>[ 603.643959] sda:
> >>>>[ 603.844724] sdb: sdb1 sdb2
> >>>>[ 604.255407] sdb: sdb1
> >>>>[ 604.490214] udevd[5050]: inotify_add_watch(11, /dev/sdb1, 10)
> >>>>failed: No such file or directory
> >>>>[ 605.140952] sdb: sdb1
> >>>>[ 605.628686] sdb: sdb1 sdb2
> >>>>[ 606.271192] sdb: sdb1 sdb2 sdb3
> >>>>[ 607.079626] sdb: sdb1 sdb2 sdb3
> >>>>[ 607.611092] sda:
> >>>>[ 608.273201] sda: sda1
> >>>>[ 608.611952] sda: sda1 sda2
> >>>>[ 609.031326] sda: sda1 sda2 sda3
> >>>>[ 609.753140] md/raid10:md1: not clean -- starting background 
> >>>>reconstruction
> >>>>[ 609.753145] md/raid10:md1: active with 2 out of 2 devices
> >>>>[ 609.768804] md1: detected capacity change from 0 to 32210157568
> >>>>[ 609.772677] md: resync of RAID array md1
> >>>>[ 614.590107] XFS (md1): Mounting V5 Filesystem
> >>>>[ 615.449035] XFS (md1): Ending clean mount
> >>>>[ 617.678462] md/raid1:md0: not clean -- starting background 
> >>>>reconstruction
> >>>>[ 617.678469] md/raid1:md0: active with 2 out of 2 mirrors
> >>>>[ 617.740729] md0: detected capacity change from 0 to 524222464
> >>>>[ 617.747107] md: delaying resync of md0 until md1 has finished
> >>>>(they share one or more physical units)
> >>>What are md0 and md1? Note that I don't see md2 anywhere in this log.
> >>>
> >>Sorry that I did not clarify that immediately, this log was taken earlier,
> >>during installation, when I got to see it in dmesg.
> >>md0 was /boot (with EXT4), md1 was / with XFS.
> >>
> >>Example of cat /proc/mdstat was taken later, when I brought up the system
> >>(by changing md1 to "near" layout at install time). So wherever you see 
> >>md1
> >>or md2, you can assume they are the same thing: new RAID10/o2 being 
> >>written
> >>to during initialization. But second time there was nothing in dmesg, so I
> >>could not attach that.
> >>
> >>
> >>>>[ 620.037818] EXT4-fs (md0): mounted filesystem with ordered data
> >>>>mode. Opts: (null)
> >>>>[ 1463.754785] INFO: task kworker/0:3:227 blocked for more than 480 
> >>>>seconds.
> >>>>[ 1463.754793] Not tainted 4.19.5-1-default #1
> >>>>[ 1463.754795] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>>>disables this message.
> >>>>[ 1463.754799] kworker/0:3 D 0 227 2 0x80000000
> >>>>[ 1463.755000] Workqueue: xfs-eofblocks/md1 xfs_eofblocks_worker [xfs]
> >>>>[ 1463.755005] Call Trace:
> >>>>[ 1463.755025] ? __schedule+0x29a/0x880
> >>>>[ 1463.755032] ? rwsem_down_write_failed+0x197/0x350
> >>>>[ 1463.755038] schedule+0x78/0x110
> >>>>[ 1463.755044] rwsem_down_write_failed+0x197/0x350
> >>>>[ 1463.755055] call_rwsem_down_write_failed+0x13/0x20
> >>>>[ 1463.755061] down_write+0x20/0x30
> >>>So we have a background task blocked on an inode lock.
> >>>
> >>>>[ 1463.755196] xfs_free_eofblocks+0x114/0x1a0 [xfs]
> >>>>[ 1463.755330] xfs_inode_free_eofblocks+0xd3/0x1e0 [xfs]
> >>>>[ 1463.755459] ? xfs_inode_ag_walk_grab+0x5b/0x90 [xfs]
> >>>>[ 1463.755586] xfs_inode_ag_walk.isra.15+0x1aa/0x420 [xfs]
> >>>>[ 1463.755714] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> >>>>[ 1463.755727] ? trace_hardirqs_on_thunk+0x1a/0x1c
> >>>>[ 1463.755734] ? __switch_to_asm+0x40/0x70
> >>>>[ 1463.755738] ? __switch_to_asm+0x34/0x70
> >>>>[ 1463.755743] ? __switch_to_asm+0x40/0x70
> >>>>[ 1463.755748] ? __switch_to_asm+0x34/0x70
> >>>>[ 1463.755752] ? __switch_to_asm+0x40/0x70
> >>>>[ 1463.755757] ? __switch_to_asm+0x34/0x70
> >>>>[ 1463.755762] ? __switch_to_asm+0x40/0x70
> >>>>[ 1463.755893] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> >>>>[ 1463.755900] ? radix_tree_gang_lookup_tag+0xc2/0x140
> >>>>[ 1463.756032] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> >>>>[ 1463.756158] xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> >>>>[ 1463.756288] xfs_eofblocks_worker+0x29/0x40 [xfs]
> >>>>[ 1463.756298] process_one_work+0x1fd/0x420
> >>>>[ 1463.756305] worker_thread+0x2d/0x3d0
> >>>>[ 1463.756311] ? rescuer_thread+0x340/0x340
> >>>>[ 1463.756316] kthread+0x112/0x130
> >>>>[ 1463.756322] ? kthread_create_worker_on_cpu+0x40/0x40
> >>>>[ 1463.756329] ret_from_fork+0x3a/0x50
> >>>>[ 1463.756375] INFO: task kworker/u4:0:4615 blocked for more than 480 
> >>>>seconds.
> >>>>[ 1463.756379] Not tainted 4.19.5-1-default #1
> >>>>[ 1463.756380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>>>disables this message.
> >>>>[ 1463.756383] kworker/u4:0 D 0 4615 2 0x80000000
> >>>>[ 1463.756395] Workqueue: writeback wb_workfn (flush-9:1)
> >>>>[ 1463.756400] Call Trace:
> >>>>[ 1463.756409] ? __schedule+0x29a/0x880
> >>>>[ 1463.756420] ? wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.756426] schedule+0x78/0x110
> >>>>[ 1463.756433] wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.756440] ? wait_woken+0x80/0x80
> >>>>[ 1463.756448] raid10_write_request+0xf2/0x900 [raid10]
> >>>>[ 1463.756454] ? wait_woken+0x80/0x80
> >>>>[ 1463.756459] ? mempool_alloc+0x55/0x160
> >>>>[ 1463.756483] ? md_write_start+0xa9/0x270 [md_mod]
> >>>>[ 1463.756492] raid10_make_request+0xc1/0x120 [raid10]
> >>>>[ 1463.756498] ? wait_woken+0x80/0x80
> >>>>[ 1463.756514] md_handle_request+0x121/0x190 [md_mod]
> >>>>[ 1463.756535] md_make_request+0x78/0x190 [md_mod]
> >>>>[ 1463.756544] generic_make_request+0x1c6/0x470
> >>>>[ 1463.756553] submit_bio+0x45/0x140
> >>>Writeback is blocked submitting I/O down in the MD driver.
> >>>
> >>>>[ 1463.756714] xfs_submit_ioend+0x9c/0x1e0 [xfs]
> >>>>[ 1463.756844] xfs_vm_writepages+0x68/0x80 [xfs]
> >>>>[ 1463.756856] do_writepages+0x31/0xb0
> >>>>[ 1463.756865] ? read_hpet+0x126/0x130
> >>>>[ 1463.756873] ? ktime_get+0x36/0xa0
> >>>>[ 1463.756881] __writeback_single_inode+0x3d/0x3e0
> >>>>[ 1463.756889] writeback_sb_inodes+0x1c4/0x430
> >>>>[ 1463.756902] __writeback_inodes_wb+0x5d/0xb0
> >>>>[ 1463.756910] wb_writeback+0x26b/0x310
> >>>>[ 1463.756920] wb_workfn+0x33a/0x410
> >>>>[ 1463.756932] process_one_work+0x1fd/0x420
> >>>>[ 1463.756940] worker_thread+0x2d/0x3d0
> >>>>[ 1463.756946] ? rescuer_thread+0x340/0x340
> >>>>[ 1463.756951] kthread+0x112/0x130
> >>>>[ 1463.756957] ? kthread_create_worker_on_cpu+0x40/0x40
> >>>>[ 1463.756965] ret_from_fork+0x3a/0x50
> >>>>[ 1463.756979] INFO: task kworker/0:2:4994 blocked for more than 480 
> >>>>seconds.
> >>>>[ 1463.756982] Not tainted 4.19.5-1-default #1
> >>>>[ 1463.756984] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>>>disables this message.
> >>>>[ 1463.756987] kworker/0:2 D 0 4994 2 0x80000000
> >>>>[ 1463.757013] Workqueue: md submit_flushes [md_mod]
> >>>>[ 1463.757016] Call Trace:
> >>>>[ 1463.757024] ? __schedule+0x29a/0x880
> >>>>[ 1463.757034] ? wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.757039] schedule+0x78/0x110
> >>>>[ 1463.757047] wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.757054] ? wait_woken+0x80/0x80
> >>>>[ 1463.757062] raid10_write_request+0xf2/0x900 [raid10]
> >>>>[ 1463.757067] ? wait_woken+0x80/0x80
> >>>>[ 1463.757072] ? mempool_alloc+0x55/0x160
> >>>>[ 1463.757088] ? md_write_start+0xa9/0x270 [md_mod]
> >>>>[ 1463.757095] ? trace_hardirqs_off_thunk+0x1a/0x1c
> >>>>[ 1463.757104] raid10_make_request+0xc1/0x120 [raid10]
> >>>>[ 1463.757110] ? wait_woken+0x80/0x80
> >>>>[ 1463.757126] md_handle_request+0x121/0x190 [md_mod]
> >>>>[ 1463.757132] ? _raw_spin_unlock_irq+0x22/0x40
> >>>>[ 1463.757137] ? finish_task_switch+0x74/0x260
> >>>>[ 1463.757156] submit_flushes+0x21/0x40 [md_mod]
> >>>Some other MD task (?) also blocked submitting a request.
> >>>
> >>>>[ 1463.757163] process_one_work+0x1fd/0x420
> >>>>[ 1463.757170] worker_thread+0x2d/0x3d0
> >>>>[ 1463.757177] ? rescuer_thread+0x340/0x340
> >>>>[ 1463.757181] kthread+0x112/0x130
> >>>>[ 1463.757186] ? kthread_create_worker_on_cpu+0x40/0x40
> >>>>[ 1463.757193] ret_from_fork+0x3a/0x50
> >>>>[ 1463.757205] INFO: task md1_resync:5215 blocked for more than 480 
> >>>>seconds.
> >>>>[ 1463.757207] Not tainted 4.19.5-1-default #1
> >>>>[ 1463.757209] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>>>disables this message.
> >>>>[ 1463.757212] md1_resync D 0 5215 2 0x80000000
> >>>>[ 1463.757216] Call Trace:
> >>>>[ 1463.757223] ? __schedule+0x29a/0x880
> >>>>[ 1463.757231] ? raise_barrier+0x8d/0x140 [raid10]
> >>>>[ 1463.757236] schedule+0x78/0x110
> >>>>[ 1463.757243] raise_barrier+0x8d/0x140 [raid10]
> >>>>[ 1463.757248] ? wait_woken+0x80/0x80
> >>>>[ 1463.757257] raid10_sync_request+0x1f6/0x1e30 [raid10]
> >>>>[ 1463.757265] ? _raw_spin_unlock_irq+0x22/0x40
> >>>>[ 1463.757284] ? is_mddev_idle+0x125/0x137 [md_mod]
> >>>>[ 1463.757302] md_do_sync.cold.78+0x404/0x969 [md_mod]
> >>>The md1 sync task is blocked, I'm not sure on what.
> >>>
> >>>>[ 1463.757311] ? wait_woken+0x80/0x80
> >>>>[ 1463.757336] ? md_rdev_init+0xb0/0xb0 [md_mod]
> >>>>[ 1463.757351] md_thread+0xe9/0x140 [md_mod]
> >>>>[ 1463.757358] ? _raw_spin_unlock_irqrestore+0x2e/0x60
> >>>>[ 1463.757364] ? __kthread_parkme+0x4c/0x70
> >>>>[ 1463.757369] kthread+0x112/0x130
> >>>>[ 1463.757374] ? kthread_create_worker_on_cpu+0x40/0x40
> >>>>[ 1463.757380] ret_from_fork+0x3a/0x50
> >>>>[ 1463.757395] INFO: task xfsaild/md1:5233 blocked for more than 480 
> >>>>seconds.
> >>>>[ 1463.757398] Not tainted 4.19.5-1-default #1
> >>>>[ 1463.757400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>>>disables this message.
> >>>>[ 1463.757402] xfsaild/md1 D 0 5233 2 0x80000000
> >>>>[ 1463.757406] Call Trace:
> >>>>[ 1463.757413] ? __schedule+0x29a/0x880
> >>>>[ 1463.757421] ? wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.757426] schedule+0x78/0x110
> >>>>[ 1463.757433] wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.757438] ? wait_woken+0x80/0x80
> >>>>[ 1463.757446] raid10_write_request+0xf2/0x900 [raid10]
> >>>>[ 1463.757451] ? wait_woken+0x80/0x80
> >>>>[ 1463.757455] ? mempool_alloc+0x55/0x160
> >>>>[ 1463.757471] ? md_write_start+0xa9/0x270 [md_mod]
> >>>>[ 1463.757477] ? trace_hardirqs_on_thunk+0x1a/0x1c
> >>>>[ 1463.757485] raid10_make_request+0xc1/0x120 [raid10]
> >>>>[ 1463.757491] ? wait_woken+0x80/0x80
> >>>>[ 1463.757507] md_handle_request+0x121/0x190 [md_mod]
> >>>>[ 1463.757527] md_make_request+0x78/0x190 [md_mod]
> >>>>[ 1463.757536] generic_make_request+0x1c6/0x470
> >>>>[ 1463.757544] submit_bio+0x45/0x140
> >>>xfsaild (metadata writeback) is also blocked submitting I/O down in the
> >>>MD driver.
> >>>
> >>>>[ 1463.757552] ? bio_add_page+0x48/0x60
> >>>>[ 1463.757716] _xfs_buf_ioapply+0x2c1/0x450 [xfs]
> >>>>[ 1463.757849] ? xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
> >>>>[ 1463.757974] __xfs_buf_submit+0x67/0x270 [xfs]
> >>>>[ 1463.758102] xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
> >>>>[ 1463.758232] ? xfsaild+0x294/0x7e0 [xfs]
> >>>>[ 1463.758364] xfsaild+0x294/0x7e0 [xfs]
> >>>>[ 1463.758377] ? _raw_spin_unlock_irqrestore+0x2e/0x60
> >>>>[ 1463.758508] ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
> >>>>[ 1463.758514] kthread+0x112/0x130
> >>>>[ 1463.758520] ? kthread_create_worker_on_cpu+0x40/0x40
> >>>>[ 1463.758527] ret_from_fork+0x3a/0x50
> >>>>[ 1463.758543] INFO: task rpm:5364 blocked for more than 480 seconds.
> >>>>[ 1463.758546] Not tainted 4.19.5-1-default #1
> >>>>[ 1463.758547] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>>>disables this message.
> >>>>[ 1463.758550] rpm D 0 5364 3757 0x00000000
> >>>>[ 1463.758554] Call Trace:
> >>>>[ 1463.758563] ? __schedule+0x29a/0x880
> >>>>[ 1463.758701] ? xlog_wait+0x5c/0x70 [xfs]
> >>>>[ 1463.759821] schedule+0x78/0x110
> >>>>[ 1463.760022] xlog_wait+0x5c/0x70 [xfs]
> >>>>[ 1463.760036] ? wake_up_q+0x70/0x70
> >>>>[ 1463.760167] __xfs_log_force_lsn+0x223/0x230 [xfs]
> >>>>[ 1463.760297] ? xfs_file_fsync+0x196/0x1d0 [xfs]
> >>>>[ 1463.760424] xfs_log_force_lsn+0x93/0x140 [xfs]
> >>>>[ 1463.760552] xfs_file_fsync+0x196/0x1d0 [xfs]
> >>>An fsync is blocked, presumably on XFS log I/O completion.
> >>>
> >>>>[ 1463.760562] ? __sb_end_write+0x36/0x60
> >>>>[ 1463.760571] do_fsync+0x38/0x70
> >>>>[ 1463.760578] __x64_sys_fdatasync+0x13/0x20
> >>>>[ 1463.760585] do_syscall_64+0x60/0x110
> >>>>[ 1463.760594] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >>>>[ 1463.760603] RIP: 0033:0x7f9757fae8a4
> >>>>[ 1463.760616] Code: Bad RIP value.
> >>>>[ 1463.760619] RSP: 002b:00007fff74fdb428 EFLAGS: 00000246 ORIG_RAX:
> >>>>000000000000004b
> >>>>[ 1463.760654] RAX: ffffffffffffffda RBX: 0000000000000064 RCX: 
> >>>>00007f9757fae8a4
> >>>>[ 1463.760657] RDX: 00000000012c4c60 RSI: 00000000012cc130 RDI: 
> >>>>0000000000000004
> >>>>[ 1463.760660] RBP: 0000000000000000 R08: 0000000000000000 R09: 
> >>>>00007f9758708c00
> >>>>[ 1463.760662] R10: 0000000000000000 R11: 0000000000000246 R12: 
> >>>>00000000012cc130
> >>>>[ 1463.760665] R13: 000000000123a3a0 R14: 0000000000010830 R15: 
> >>>>0000000000000062
> >>>>[ 1463.760679] INFO: task kworker/0:8:5367 blocked for more than 480 
> >>>>seconds.
> >>>>[ 1463.760683] Not tainted 4.19.5-1-default #1
> >>>>[ 1463.760684] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>>>disables this message.
> >>>>[ 1463.760687] kworker/0:8 D 0 5367 2 0x80000000
> >>>>[ 1463.760718] Workqueue: md submit_flushes [md_mod]
> >>>And that MD submit_flushes thing again.
> >>>
> >>>Not to say there isn't some issue between XFS and MD going on here, but
> >>>I think we might want an MD person to take a look at this and possibly
> >>>provide some insight. From an XFS perspective, this all just looks like
> >>>we're blocked on I/O (via writeback, AIL and log) to a slow device.
> >>>
> >>>Brian
> >>>
> >>>>[ 1463.760721] Call Trace:
> >>>>[ 1463.760731] ? __schedule+0x29a/0x880
> >>>>[ 1463.760741] ? wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.760746] schedule+0x78/0x110
> >>>>[ 1463.760753] wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.760761] ? wait_woken+0x80/0x80
> >>>>[ 1463.760768] raid10_write_request+0xf2/0x900 [raid10]
> >>>>[ 1463.760774] ? wait_woken+0x80/0x80
> >>>>[ 1463.760778] ? mempool_alloc+0x55/0x160
> >>>>[ 1463.760795] ? md_write_start+0xa9/0x270 [md_mod]
> >>>>[ 1463.760801] ? try_to_wake_up+0x44/0x470
> >>>>[ 1463.760810] raid10_make_request+0xc1/0x120 [raid10]
> >>>>[ 1463.760816] ? wait_woken+0x80/0x80
> >>>>[ 1463.760831] md_handle_request+0x121/0x190 [md_mod]
> >>>>[ 1463.760851] md_make_request+0x78/0x190 [md_mod]
> >>>>[ 1463.760860] generic_make_request+0x1c6/0x470
> >>>>[ 1463.760870] raid10_write_request+0x77a/0x900 [raid10]
> >>>>[ 1463.760875] ? wait_woken+0x80/0x80
> >>>>[ 1463.760879] ? mempool_alloc+0x55/0x160
> >>>>[ 1463.760895] ? md_write_start+0xa9/0x270 [md_mod]
> >>>>[ 1463.760904] raid10_make_request+0xc1/0x120 [raid10]
> >>>>[ 1463.760910] ? wait_woken+0x80/0x80
> >>>>[ 1463.760926] md_handle_request+0x121/0x190 [md_mod]
> >>>>[ 1463.760931] ? _raw_spin_unlock_irq+0x22/0x40
> >>>>[ 1463.760936] ? finish_task_switch+0x74/0x260
> >>>>[ 1463.760954] submit_flushes+0x21/0x40 [md_mod]
> >>>>[ 1463.760962] process_one_work+0x1fd/0x420
> >>>>[ 1463.760970] worker_thread+0x2d/0x3d0
> >>>>[ 1463.760976] ? rescuer_thread+0x340/0x340
> >>>>[ 1463.760981] kthread+0x112/0x130
> >>>>[ 1463.760986] ? kthread_create_worker_on_cpu+0x40/0x40
> >>>>[ 1463.760992] ret_from_fork+0x3a/0x50
> 
> -- 
> Srdačan pozdrav/Best regards,
> Siniša Bandin
> 

* Re: XFS and RAID10 with o2 layout
  2018-12-13 17:30         ` keld
@ 2018-12-14  6:59           ` Sinisa
  0 siblings, 0 replies; 16+ messages in thread
From: Sinisa @ 2018-12-14  6:59 UTC (permalink / raw)
  Cc: linux-xfs, linux-raid

Don't get me started on this; every layout has its pros and cons, but I have 
settled on "offset" with a chunk size of 1024 or 4096, depending on the type of 
data I intend to put there.

Nevertheless, the problem also appears with the "far" layout.
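
For anyone trying to reproduce this with "offset" or "far", a real stall can
be told apart from a slow resync by sampling /proc/mdstat twice and comparing
the block counters. Below is a minimal Python sketch of that idea. The
function names are hypothetical, and the sample text is the mdstat output from
the first message with the mail line wrapping undone. Only the
"resync = N% (done/total)" format is taken from that output.

```python
import re

# Matches e.g. "resync = 11.7% (36917568/314433536)"
PROGRESS = re.compile(r"(resync|recovery)\s*=\s*([\d.]+)%\s*\((\d+)/(\d+)\)")

SAMPLE = """Personalities : [raid1] [raid10]
md2 : active raid10 sdb3[1] sda3[0]
      314433536 blocks super 1.2 4096K chunks 2 offset-copies [2/2] [UU]
      [==>..................]  resync = 11.7% (36917568/314433536) finish=8678.2min speed=532K/sec
      bitmap: 3/3 pages [12KB], 65536KB chunk
"""


def sync_progress(mdstat_text, array):
    """Return (done, total) blocks for `array`, or None if it is not syncing."""
    in_array = False
    for line in mdstat_text.splitlines():
        if line.startswith(array + " :"):
            in_array = True
        elif in_array and line and not line[0].isspace():
            # An unindented line means another array's section has started.
            in_array = False
        if in_array:
            m = PROGRESS.search(line)
            if m:
                return int(m.group(3)), int(m.group(4))
    return None


def is_stalled(before, after):
    """True if both samples show a sync in progress and no blocks were added."""
    return before is not None and after is not None and after[0] <= before[0]


print(sync_progress(SAMPLE, "md2"))
```

On a live system one would read /proc/mdstat twice, some seconds apart, call
sync_progress() on each reading and pass both results to is_stalled().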


Srdačan pozdrav / Best regards,
Siniša Bandin

On 12/13/18 6:30 PM, keld@keldix.com wrote:
> o2 is not the fastest layout,
> but when the blk size is big enough
> the loss on head movement is small. f2 is better also for smaller
> block sizes
>
>
>
>
> On Thu, Dec 13, 2018 at 02:02:13PM +0100, Sinisa wrote:
>> On 12/13/18 1:28 PM, Brian Foster wrote:
>>> On Thu, Dec 13, 2018 at 09:21:18AM +0100, Sinisa wrote:
>>>> Thanks for a quick reply. Replies are inline...
>>>>
>>>> On 12.12.2018 15:30, Brian Foster wrote:
>>>>> cc linux-raid
>>>>>
>>>>> On Wed, Dec 12, 2018 at 01:29:49PM +0100, Sinisa wrote:
>>>>>> Hello group,
>>>>>>
>>>>>> I have noticed something strange going on lately, but recently I have
>>>>>> come
>>>>>> to conclusion that there is some unwanted interaction between XFS and
>>>>>> Linux
>>>>>> RAID10 with "offset" layout.
>>>>>>
>>>>>> So here is the problem: I create a Linux RAID10 mirror with 2 disks
>>>>>> (HDD or
>>>>>> SSD) and "o2" layout (best choice for read and write speed):
>>>>>> # mdadm -C -n2 -l10 -po2 /dev/mdX /dev/sdaX /dev/sdbX
>>>>>> # mkfs.xfs /dev/mdX
>>>>>> # mount /dev/mdX /mnt
>>>>>> # rsync -avxDPHS / /mnt
>>>>>>
>>>>>> So we have RAID10 initializing:
>>>>>>
>>>>>> # cat /proc/mdstat
>>>>>> Personalities : [raid1] [raid10]
>>>>>> md2 : active raid10 sdb3[1] sda3[0]
>>>>>>        314433536 blocks super 1.2 4096K chunks 2 offset-copies [2/2]
>>>>>> [UU]
>>>>>>        [==>..................]  resync = 11.7% (36917568/314433536)
>>>>>> finish=8678.2min speed=532K/sec
>>>>>>        bitmap: 3/3 pages [12KB], 65536KB chunk
>>>>>>
>>>>>> but after a few minutes everything stops like you can see above. Rsync
>>>>>> (or
>>>>>> any other process writing to that md device) also freezes. If I try to
>>>>>> read
>>>>>> already copied files - freeze, usually with less than 2GB copied.
>>>>>>
>>>>> Does the same thing happen without the RAID initialization? E.g., if you
>>>>> wait for it to complete or (IIRC) if you create with --assume-clean? I
>>>>> assume the init-in-progress state is common with your tests on other
>>>>> filesystems?
>>>>>
>>>> No, if I wait for RAID to finish initializing, or create it with
>>>> --assume-clean, everything works just fine.
>>>>
>>>> Actually, ever since openSUSE LEAP 15.0 release I have been doing just
>>>> that:
>>>> pause installation process until initialization is done, then let it go
>>>> on.
>>>>
>>>> But recently it has happened so that I had to replace one of the disks in
>>>> a
>>>> "live" system (small file server), and was unable to do that on multiple
>>>> tries during work hours because of this problem. When I waited until
>>>> afternoon, when nobody was working/writing, resync was able to finish...
>>>>
>>> So apparently there is some kind of poor interaction here with the
>>> internal MD resync code. It's not clear to me whether it's a lockup or
>>> extreme slowdown, but unless anybody else has ideas I'd suggest to
>>> solicit feedback from the MD devs (note that you dropped the linux-raid
>>> cc) as to why this set of I/O might be blocked in the raid device and go
>>> from there.
>>> Brian
>> (Sorry, I'm not very much into mailing lists, I have added linux-raid back
>> to cc)
>>
>> Today I tried lowering the RAID10 chunk size to 512K (the default), and the
>> only difference was that the freeze appeared much faster.
>> Also tried with newest RC kernel 4.20.0-rc6-2.g91eea17-default (from
>> openSUSE kernel/HEAD), with same results.
>>
>> And it is definitely a lockup, because I have tried to leave it overnight,
>> but once rsync/copy/write stops, it never moves on, and also once RAID sync
>> stops it never moves on...
>>
>> I was trying many times to get that dmesg report again, but without success
>> (waited up to 30 minutes). Any help is welcome...
>>
>>
>> What I did not try is some other distribution, only openSUSE LEAP and
>> Tumbleweed
>>
>>
>>>>> A few more notes below inline to the log..
>>>>>
>>>>>> Sometimes in dmesg I get some kernel messages about "task kworker/2:1:55
>>>>>> blocked for more than 480 seconds." (please see attached dmesg.txt and
>>>>>> my
>>>>>> reports here: https://bugzilla.opensuse.org/show_bug.cgi?id=1111073),
>>>>>> sometimes nothing at all. When this happens, I can only reboot with
>>>>>> SysRq-b
>>>>>> or "physically" with reset/power button.
>>>>>>
>>>>>> Same thing can happen with "far" layout, but it seems to me that it
>>>>>> does not
>>>>>> happen every time (or that often). I might be wrong, because I never use
>>>>>> "far" layout in real life, only for testing.
>>>>>> I was unable to reproduce the failure with "near" layout.
>>>>>>
>>>>>> Also with EXT4 or BTRFS and any layout everything works just as it
>>>>>> should,
>>>>>> that is sync goes on until finished, and rsync, cp, or any other write
>>>>>> work
>>>>>> just fine at the same time.
>>>>>>
>>>>>> Let me just add that I first saw this behavior in openSUSE LEAP 15.0
>>>>>> (kernel
>>>>>> 4.12). In previous versions (up to kernel 4.4) I never had this
>>>>>> problem. In
>>>>>> the meantime I have tested with kernels up to 4.20rc and it is the same.
>>>>>> Unfortunately I cannot go back to test kernels 4.5 - 4.11 to pinpoint
>>>>>> the
>>>>>> moment the problem first appeared.
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>> Siniša Bandin
>>>>>> (excuse my English)
>>>>>>
>>>>>> [ 180.981499] SGI XFS with ACLs, security attributes, no debug enabled
>>>>>> [ 181.005019] XFS (md1): Mounting V5 Filesystem
>>>>>> [ 181.132076] XFS (md1): Starting recovery (logdev: internal)
>>>>>> [ 181.295606] XFS (md1): Ending recovery (logdev: internal)
>>>>>> [ 181.804011] XFS (md1): Unmounting Filesystem
>>>>>> [ 182.201794] XFS (md127): Mounting V4 Filesystem
>>>>>> [ 182.736958] md: recovery of RAID array md127
>>>>>> [ 182.915479] XFS (md127): Ending clean mount
>>>>>> [ 183.819702] XFS (md127): Unmounting Filesystem
>>>>>> [ 184.943831] EXT4-fs (md0): mounted filesystem with ordered data
>>>>>> mode. Opts: (null)
>>>>>> [ 529.784557] EXT4-fs (md0): mounted filesystem with ordered data
>>>>>> mode. Opts: (null)
>>>>>> [ 601.789958] md1: detected capacity change from 33284947968 to 0
>>>>>> [ 601.789973] md: md1 stopped.
>>>>>> [ 602.314112] md0: detected capacity change from 550436864 to 0
>>>>>> [ 602.314128] md: md0 stopped.
>>>>>> [ 602.745030] md: md127: recovery interrupted.
>>>>>> [ 603.131684] md127: detected capacity change from 966229229568 to 0
>>>>>> [ 603.132237] md: md127 stopped.
>>>>>> [ 603.435808] sda: sda1 sda2
>>>>>> [ 603.594074] udevd[5011]: inotify_add_watch(11, /dev/sda2, 10)
>>>>>> failed: No such file or directory
>>>>>> [ 603.643959] sda:
>>>>>> [ 603.844724] sdb: sdb1 sdb2
>>>>>> [ 604.255407] sdb: sdb1
>>>>>> [ 604.490214] udevd[5050]: inotify_add_watch(11, /dev/sdb1, 10)
>>>>>> failed: No such file or directory
>>>>>> [ 605.140952] sdb: sdb1
>>>>>> [ 605.628686] sdb: sdb1 sdb2
>>>>>> [ 606.271192] sdb: sdb1 sdb2 sdb3
>>>>>> [ 607.079626] sdb: sdb1 sdb2 sdb3
>>>>>> [ 607.611092] sda:
>>>>>> [ 608.273201] sda: sda1
>>>>>> [ 608.611952] sda: sda1 sda2
>>>>>> [ 609.031326] sda: sda1 sda2 sda3
>>>>>> [ 609.753140] md/raid10:md1: not clean -- starting background
>>>>>> reconstruction
>>>>>> [ 609.753145] md/raid10:md1: active with 2 out of 2 devices
>>>>>> [ 609.768804] md1: detected capacity change from 0 to 32210157568
>>>>>> [ 609.772677] md: resync of RAID array md1
>>>>>> [ 614.590107] XFS (md1): Mounting V5 Filesystem
>>>>>> [ 615.449035] XFS (md1): Ending clean mount
>>>>>> [ 617.678462] md/raid1:md0: not clean -- starting background
>>>>>> reconstruction
>>>>>> [ 617.678469] md/raid1:md0: active with 2 out of 2 mirrors
>>>>>> [ 617.740729] md0: detected capacity change from 0 to 524222464
>>>>>> [ 617.747107] md: delaying resync of md0 until md1 has finished
>>>>>> (they share one or more physical units)
>>>>> What are md0 and md1? Note that I don't see md2 anywhere in this log.
>>>>>
>>>> Sorry that I did not clarify that immediately, this log was taken earlier,
>>>> during installation, when I got to see it in dmesg.
>>>> md0 was /boot (with EXT4), md1 was / with XFS.
>>>>
>>>> Example of cat /proc/mdstat was taken later, when I brought up the system
>>>> (by changing md1 to "near" layout at install time). So wherever you see
>>>> md1
>>>> or md2, you can assume they are the same thing: new RAID10/o2 being
>>>> written
>>>> to during initialization. But second time there was nothing in dmesg, so I
>>>> could not attach that.
>>>>
>>>>
>>>>>> [ 620.037818] EXT4-fs (md0): mounted filesystem with ordered data
>>>>>> mode. Opts: (null)
>>>>>> [ 1463.754785] INFO: task kworker/0:3:227 blocked for more than 480
>>>>>> seconds.
>>>>>> [ 1463.754793] Not tainted 4.19.5-1-default #1
>>>>>> [ 1463.754795] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>>>> disables this message.
>>>>>> [ 1463.754799] kworker/0:3 D 0 227 2 0x80000000
>>>>>> [ 1463.755000] Workqueue: xfs-eofblocks/md1 xfs_eofblocks_worker [xfs]
>>>>>> [ 1463.755005] Call Trace:
>>>>>> [ 1463.755025] ? __schedule+0x29a/0x880
>>>>>> [ 1463.755032] ? rwsem_down_write_failed+0x197/0x350
>>>>>> [ 1463.755038] schedule+0x78/0x110
>>>>>> [ 1463.755044] rwsem_down_write_failed+0x197/0x350
>>>>>> [ 1463.755055] call_rwsem_down_write_failed+0x13/0x20
>>>>>> [ 1463.755061] down_write+0x20/0x30
>>>>> So we have a background task blocked on an inode lock.
>>>>>
>>>>>> [ 1463.755196] xfs_free_eofblocks+0x114/0x1a0 [xfs]
>>>>>> [ 1463.755330] xfs_inode_free_eofblocks+0xd3/0x1e0 [xfs]
>>>>>> [ 1463.755459] ? xfs_inode_ag_walk_grab+0x5b/0x90 [xfs]
>>>>>> [ 1463.755586] xfs_inode_ag_walk.isra.15+0x1aa/0x420 [xfs]
>>>>>> [ 1463.755714] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
>>>>>> [ 1463.755727] ? trace_hardirqs_on_thunk+0x1a/0x1c
>>>>>> [ 1463.755734] ? __switch_to_asm+0x40/0x70
>>>>>> [ 1463.755738] ? __switch_to_asm+0x34/0x70
>>>>>> [ 1463.755743] ? __switch_to_asm+0x40/0x70
>>>>>> [ 1463.755748] ? __switch_to_asm+0x34/0x70
>>>>>> [ 1463.755752] ? __switch_to_asm+0x40/0x70
>>>>>> [ 1463.755757] ? __switch_to_asm+0x34/0x70
>>>>>> [ 1463.755762] ? __switch_to_asm+0x40/0x70
>>>>>> [ 1463.755893] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
>>>>>> [ 1463.755900] ? radix_tree_gang_lookup_tag+0xc2/0x140
>>>>>> [ 1463.756032] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
>>>>>> [ 1463.756158] xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
>>>>>> [ 1463.756288] xfs_eofblocks_worker+0x29/0x40 [xfs]
>>>>>> [ 1463.756298] process_one_work+0x1fd/0x420
>>>>>> [ 1463.756305] worker_thread+0x2d/0x3d0
>>>>>> [ 1463.756311] ? rescuer_thread+0x340/0x340
>>>>>> [ 1463.756316] kthread+0x112/0x130
>>>>>> [ 1463.756322] ? kthread_create_worker_on_cpu+0x40/0x40
>>>>>> [ 1463.756329] ret_from_fork+0x3a/0x50
>>>>>> [ 1463.756375] INFO: task kworker/u4:0:4615 blocked for more than 480
>>>>>> seconds.
>>>>>> [ 1463.756379] Not tainted 4.19.5-1-default #1
>>>>>> [ 1463.756380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>>>> disables this message.
>>>>>> [ 1463.756383] kworker/u4:0 D 0 4615 2 0x80000000
>>>>>> [ 1463.756395] Workqueue: writeback wb_workfn (flush-9:1)
>>>>>> [ 1463.756400] Call Trace:
>>>>>> [ 1463.756409] ? __schedule+0x29a/0x880
>>>>>> [ 1463.756420] ? wait_barrier+0xdd/0x170 [raid10]
>>>>>> [ 1463.756426] schedule+0x78/0x110
>>>>>> [ 1463.756433] wait_barrier+0xdd/0x170 [raid10]
>>>>>> [ 1463.756440] ? wait_woken+0x80/0x80
>>>>>> [ 1463.756448] raid10_write_request+0xf2/0x900 [raid10]
>>>>>> [ 1463.756454] ? wait_woken+0x80/0x80
>>>>>> [ 1463.756459] ? mempool_alloc+0x55/0x160
>>>>>> [ 1463.756483] ? md_write_start+0xa9/0x270 [md_mod]
>>>>>> [ 1463.756492] raid10_make_request+0xc1/0x120 [raid10]
>>>>>> [ 1463.756498] ? wait_woken+0x80/0x80
>>>>>> [ 1463.756514] md_handle_request+0x121/0x190 [md_mod]
>>>>>> [ 1463.756535] md_make_request+0x78/0x190 [md_mod]
>>>>>> [ 1463.756544] generic_make_request+0x1c6/0x470
>>>>>> [ 1463.756553] submit_bio+0x45/0x140
>>>>> Writeback is blocked submitting I/O down in the MD driver.
>>>>>
>>>>>> [ 1463.756714] xfs_submit_ioend+0x9c/0x1e0 [xfs]
>>>>>> [ 1463.756844] xfs_vm_writepages+0x68/0x80 [xfs]
>>>>>> [ 1463.756856] do_writepages+0x31/0xb0
>>>>>> [ 1463.756865] ? read_hpet+0x126/0x130
>>>>>> [ 1463.756873] ? ktime_get+0x36/0xa0
>>>>>> [ 1463.756881] __writeback_single_inode+0x3d/0x3e0
>>>>>> [ 1463.756889] writeback_sb_inodes+0x1c4/0x430
>>>>>> [ 1463.756902] __writeback_inodes_wb+0x5d/0xb0
>>>>>> [ 1463.756910] wb_writeback+0x26b/0x310
>>>>>> [ 1463.756920] wb_workfn+0x33a/0x410
>>>>>> [ 1463.756932] process_one_work+0x1fd/0x420
>>>>>> [ 1463.756940] worker_thread+0x2d/0x3d0
>>>>>> [ 1463.756946] ? rescuer_thread+0x340/0x340
>>>>>> [ 1463.756951] kthread+0x112/0x130
>>>>>> [ 1463.756957] ? kthread_create_worker_on_cpu+0x40/0x40
>>>>>> [ 1463.756965] ret_from_fork+0x3a/0x50
>>>>>> [ 1463.756979] INFO: task kworker/0:2:4994 blocked for more than 480
>>>>>> seconds.
>>>>>> [ 1463.756982] Not tainted 4.19.5-1-default #1
>>>>>> [ 1463.756984] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>>>> disables this message.
>>>>>> [ 1463.756987] kworker/0:2 D 0 4994 2 0x80000000
>>>>>> [ 1463.757013] Workqueue: md submit_flushes [md_mod]
>>>>>> [ 1463.757016] Call Trace:
>>>>>> [ 1463.757024] ? __schedule+0x29a/0x880
>>>>>> [ 1463.757034] ? wait_barrier+0xdd/0x170 [raid10]
>>>>>> [ 1463.757039] schedule+0x78/0x110
>>>>>> [ 1463.757047] wait_barrier+0xdd/0x170 [raid10]
>>>>>> [ 1463.757054] ? wait_woken+0x80/0x80
>>>>>> [ 1463.757062] raid10_write_request+0xf2/0x900 [raid10]
>>>>>> [ 1463.757067] ? wait_woken+0x80/0x80
>>>>>> [ 1463.757072] ? mempool_alloc+0x55/0x160
>>>>>> [ 1463.757088] ? md_write_start+0xa9/0x270 [md_mod]
>>>>>> [ 1463.757095] ? trace_hardirqs_off_thunk+0x1a/0x1c
>>>>>> [ 1463.757104] raid10_make_request+0xc1/0x120 [raid10]
>>>>>> [ 1463.757110] ? wait_woken+0x80/0x80
>>>>>> [ 1463.757126] md_handle_request+0x121/0x190 [md_mod]
>>>>>> [ 1463.757132] ? _raw_spin_unlock_irq+0x22/0x40
>>>>>> [ 1463.757137] ? finish_task_switch+0x74/0x260
>>>>>> [ 1463.757156] submit_flushes+0x21/0x40 [md_mod]
>>>>> Some other MD task (?) also blocked submitting a request.
>>>>>
>>>>>> [ 1463.757163] process_one_work+0x1fd/0x420
>>>>>> [ 1463.757170] worker_thread+0x2d/0x3d0
>>>>>> [ 1463.757177] ? rescuer_thread+0x340/0x340
>>>>>> [ 1463.757181] kthread+0x112/0x130
>>>>>> [ 1463.757186] ? kthread_create_worker_on_cpu+0x40/0x40
>>>>>> [ 1463.757193] ret_from_fork+0x3a/0x50
>>>>>> [ 1463.757205] INFO: task md1_resync:5215 blocked for more than 480
>>>>>> seconds.
>>>>>> [ 1463.757207] Not tainted 4.19.5-1-default #1
>>>>>> [ 1463.757209] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>>>> disables this message.
>>>>>> [ 1463.757212] md1_resync D 0 5215 2 0x80000000
>>>>>> [ 1463.757216] Call Trace:
>>>>>> [ 1463.757223] ? __schedule+0x29a/0x880
>>>>>> [ 1463.757231] ? raise_barrier+0x8d/0x140 [raid10]
>>>>>> [ 1463.757236] schedule+0x78/0x110
>>>>>> [ 1463.757243] raise_barrier+0x8d/0x140 [raid10]
>>>>>> [ 1463.757248] ? wait_woken+0x80/0x80
>>>>>> [ 1463.757257] raid10_sync_request+0x1f6/0x1e30 [raid10]
>>>>>> [ 1463.757265] ? _raw_spin_unlock_irq+0x22/0x40
>>>>>> [ 1463.757284] ? is_mddev_idle+0x125/0x137 [md_mod]
>>>>>> [ 1463.757302] md_do_sync.cold.78+0x404/0x969 [md_mod]
>>>>> The md1 sync task is blocked, I'm not sure on what.
>>>>>
>>>>>> [ 1463.757311] ? wait_woken+0x80/0x80
>>>>>> [ 1463.757336] ? md_rdev_init+0xb0/0xb0 [md_mod]
>>>>>> [ 1463.757351] md_thread+0xe9/0x140 [md_mod]
>>>>>> [ 1463.757358] ? _raw_spin_unlock_irqrestore+0x2e/0x60
>>>>>> [ 1463.757364] ? __kthread_parkme+0x4c/0x70
>>>>>> [ 1463.757369] kthread+0x112/0x130
>>>>>> [ 1463.757374] ? kthread_create_worker_on_cpu+0x40/0x40
>>>>>> [ 1463.757380] ret_from_fork+0x3a/0x50
>>>>>> [ 1463.757395] INFO: task xfsaild/md1:5233 blocked for more than 480
>>>>>> seconds.
>>>>>> [ 1463.757398] Not tainted 4.19.5-1-default #1
>>>>>> [ 1463.757400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>>>> disables this message.
>>>>>> [ 1463.757402] xfsaild/md1 D 0 5233 2 0x80000000
>>>>>> [ 1463.757406] Call Trace:
>>>>>> [ 1463.757413] ? __schedule+0x29a/0x880
>>>>>> [ 1463.757421] ? wait_barrier+0xdd/0x170 [raid10]
>>>>>> [ 1463.757426] schedule+0x78/0x110
>>>>>> [ 1463.757433] wait_barrier+0xdd/0x170 [raid10]
>>>>>> [ 1463.757438] ? wait_woken+0x80/0x80
>>>>>> [ 1463.757446] raid10_write_request+0xf2/0x900 [raid10]
>>>>>> [ 1463.757451] ? wait_woken+0x80/0x80
>>>>>> [ 1463.757455] ? mempool_alloc+0x55/0x160
>>>>>> [ 1463.757471] ? md_write_start+0xa9/0x270 [md_mod]
>>>>>> [ 1463.757477] ? trace_hardirqs_on_thunk+0x1a/0x1c
>>>>>> [ 1463.757485] raid10_make_request+0xc1/0x120 [raid10]
>>>>>> [ 1463.757491] ? wait_woken+0x80/0x80
>>>>>> [ 1463.757507] md_handle_request+0x121/0x190 [md_mod]
>>>>>> [ 1463.757527] md_make_request+0x78/0x190 [md_mod]
>>>>>> [ 1463.757536] generic_make_request+0x1c6/0x470
>>>>>> [ 1463.757544] submit_bio+0x45/0x140
>>>>> xfsaild (metadata writeback) is also blocked submitting I/O down in the
>>>>> MD driver.
>>>>>
>>>>>> [ 1463.757552] ? bio_add_page+0x48/0x60
>>>>>> [ 1463.757716] _xfs_buf_ioapply+0x2c1/0x450 [xfs]
>>>>>> [ 1463.757849] ? xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
>>>>>> [ 1463.757974] __xfs_buf_submit+0x67/0x270 [xfs]
>>>>>> [ 1463.758102] xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
>>>>>> [ 1463.758232] ? xfsaild+0x294/0x7e0 [xfs]
>>>>>> [ 1463.758364] xfsaild+0x294/0x7e0 [xfs]
>>>>>> [ 1463.758377] ? _raw_spin_unlock_irqrestore+0x2e/0x60
>>>>>> [ 1463.758508] ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
>>>>>> [ 1463.758514] kthread+0x112/0x130
>>>>>> [ 1463.758520] ? kthread_create_worker_on_cpu+0x40/0x40
>>>>>> [ 1463.758527] ret_from_fork+0x3a/0x50
>>>>>> [ 1463.758543] INFO: task rpm:5364 blocked for more than 480 seconds.
>>>>>> [ 1463.758546] Not tainted 4.19.5-1-default #1
>>>>>> [ 1463.758547] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>>>> disables this message.
>>>>>> [ 1463.758550] rpm D 0 5364 3757 0x00000000
>>>>>> [ 1463.758554] Call Trace:
>>>>>> [ 1463.758563] ? __schedule+0x29a/0x880
>>>>>> [ 1463.758701] ? xlog_wait+0x5c/0x70 [xfs]
>>>>>> [ 1463.759821] schedule+0x78/0x110
>>>>>> [ 1463.760022] xlog_wait+0x5c/0x70 [xfs]
>>>>>> [ 1463.760036] ? wake_up_q+0x70/0x70
>>>>>> [ 1463.760167] __xfs_log_force_lsn+0x223/0x230 [xfs]
>>>>>> [ 1463.760297] ? xfs_file_fsync+0x196/0x1d0 [xfs]
>>>>>> [ 1463.760424] xfs_log_force_lsn+0x93/0x140 [xfs]
>>>>>> [ 1463.760552] xfs_file_fsync+0x196/0x1d0 [xfs]
>>>>> An fsync is blocked, presumably on XFS log I/O completion.
>>>>>
>>>>>> [ 1463.760562] ? __sb_end_write+0x36/0x60
>>>>>> [ 1463.760571] do_fsync+0x38/0x70
>>>>>> [ 1463.760578] __x64_sys_fdatasync+0x13/0x20
>>>>>> [ 1463.760585] do_syscall_64+0x60/0x110
>>>>>> [ 1463.760594] entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>>>> [ 1463.760603] RIP: 0033:0x7f9757fae8a4
>>>>>> [ 1463.760616] Code: Bad RIP value.
>>>>>> [ 1463.760619] RSP: 002b:00007fff74fdb428 EFLAGS: 00000246 ORIG_RAX:
>>>>>> 000000000000004b
>>>>>> [ 1463.760654] RAX: ffffffffffffffda RBX: 0000000000000064 RCX:
>>>>>> 00007f9757fae8a4
>>>>>> [ 1463.760657] RDX: 00000000012c4c60 RSI: 00000000012cc130 RDI:
>>>>>> 0000000000000004
>>>>>> [ 1463.760660] RBP: 0000000000000000 R08: 0000000000000000 R09:
>>>>>> 00007f9758708c00
>>>>>> [ 1463.760662] R10: 0000000000000000 R11: 0000000000000246 R12:
>>>>>> 00000000012cc130
>>>>>> [ 1463.760665] R13: 000000000123a3a0 R14: 0000000000010830 R15:
>>>>>> 0000000000000062
>>>>>> [ 1463.760679] INFO: task kworker/0:8:5367 blocked for more than 480
>>>>>> seconds.
>>>>>> [ 1463.760683] Not tainted 4.19.5-1-default #1
>>>>>> [ 1463.760684] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>>>> disables this message.
>>>>>> [ 1463.760687] kworker/0:8 D 0 5367 2 0x80000000
>>>>>> [ 1463.760718] Workqueue: md submit_flushes [md_mod]
>>>>> And that MD submit_flushes thing again.
>>>>>
>>>>> Not to say there isn't some issue between XFS and MD going on here, but
>>>>> I think we might want an MD person to take a look at this and possibly
>>>>> provide some insight. From an XFS perspective, this all just looks like
>>>>> we're blocked on I/O (via writeback, AIL and log) to a slow device.
>>>>>
>>>>> Brian
>>>>>
>>>>>> [ 1463.760721] Call Trace:
>>>>>> [ 1463.760731] ? __schedule+0x29a/0x880
>>>>>> [ 1463.760741] ? wait_barrier+0xdd/0x170 [raid10]
>>>>>> [ 1463.760746] schedule+0x78/0x110
>>>>>> [ 1463.760753] wait_barrier+0xdd/0x170 [raid10]
>>>>>> [ 1463.760761] ? wait_woken+0x80/0x80
>>>>>> [ 1463.760768] raid10_write_request+0xf2/0x900 [raid10]
>>>>>> [ 1463.760774] ? wait_woken+0x80/0x80
>>>>>> [ 1463.760778] ? mempool_alloc+0x55/0x160
>>>>>> [ 1463.760795] ? md_write_start+0xa9/0x270 [md_mod]
>>>>>> [ 1463.760801] ? try_to_wake_up+0x44/0x470
>>>>>> [ 1463.760810] raid10_make_request+0xc1/0x120 [raid10]
>>>>>> [ 1463.760816] ? wait_woken+0x80/0x80
>>>>>> [ 1463.760831] md_handle_request+0x121/0x190 [md_mod]
>>>>>> [ 1463.760851] md_make_request+0x78/0x190 [md_mod]
>>>>>> [ 1463.760860] generic_make_request+0x1c6/0x470
>>>>>> [ 1463.760870] raid10_write_request+0x77a/0x900 [raid10]
>>>>>> [ 1463.760875] ? wait_woken+0x80/0x80
>>>>>> [ 1463.760879] ? mempool_alloc+0x55/0x160
>>>>>> [ 1463.760895] ? md_write_start+0xa9/0x270 [md_mod]
>>>>>> [ 1463.760904] raid10_make_request+0xc1/0x120 [raid10]
>>>>>> [ 1463.760910] ? wait_woken+0x80/0x80
>>>>>> [ 1463.760926] md_handle_request+0x121/0x190 [md_mod]
>>>>>> [ 1463.760931] ? _raw_spin_unlock_irq+0x22/0x40
>>>>>> [ 1463.760936] ? finish_task_switch+0x74/0x260
>>>>>> [ 1463.760954] submit_flushes+0x21/0x40 [md_mod]
>>>>>> [ 1463.760962] process_one_work+0x1fd/0x420
>>>>>> [ 1463.760970] worker_thread+0x2d/0x3d0
>>>>>> [ 1463.760976] ? rescuer_thread+0x340/0x340
>>>>>> [ 1463.760981] kthread+0x112/0x130
>>>>>> [ 1463.760986] ? kthread_create_worker_on_cpu+0x40/0x40
>>>>>> [ 1463.760992] ret_from_fork+0x3a/0x50
>> -- 
>> Srdačan pozdrav/Best regards,
>> Siniša Bandin
>>


[parent not found: <0a33a20d-5f49-7b34-3662-5b818c67621a@suse.com>]

[parent not found: <48ba331d-a896-f532-2c75-cf94ddf87b60@4net.rs>]

* Re: XFS and RAID10 with o2 layout
       [not found]     ` <48ba331d-a896-f532-2c75-cf94ddf87b60@4net.rs>
@ 2018-12-17 15:04       ` Sinisa
  0 siblings, 0 replies; 16+ messages in thread
From: Sinisa @ 2018-12-17 15:04 UTC (permalink / raw)
  To: linux-xfs; +Cc: linux-raid

5 hours of compiling later (old Athlon64 dual core), I was able to copy 9+GB 
without a glitch (under the same conditions that were guaranteed to freeze in under 
2GB).

I'll run some more tests tomorrow, but this seems to have helped, thanks a lot!


Srdačan pozdrav / Best regards,
Siniša Bandin

On 12/17/18 10:31 AM, Sinisa wrote:
 > Thanks, I'll try to compile that as soon as I get the test machine back to
 > openSUSE, during the day...
 >
 > Srdačan pozdrav / Best regards,
 > Siniša Bandin
 >
 > On 12/17/18 2:49 AM, Guoqing Jiang wrote:
 >> Hi,
 >>
 >> On 12/12/18 10:30 PM, Brian Foster wrote:
 >>>> [ 1463.760721] Call Trace:
 >>>> [ 1463.760731]  ? __schedule+0x29a/0x880
 >>>> [ 1463.760741]  ? wait_barrier+0xdd/0x170 [raid10]
 >>>> [ 1463.760746]  schedule+0x78/0x110
 >>>> [ 1463.760753]  wait_barrier+0xdd/0x170 [raid10]
 >>>> [ 1463.760761]  ? wait_woken+0x80/0x80
 >>>> [ 1463.760768]  raid10_write_request+0xf2/0x900 [raid10]
 >>>> [ 1463.760774]  ? wait_woken+0x80/0x80
 >>>> [ 1463.760778]  ? mempool_alloc+0x55/0x160
 >>>> [ 1463.760795]  ? md_write_start+0xa9/0x270 [md_mod]
 >>>> [ 1463.760801]  ? try_to_wake_up+0x44/0x470
 >>>> [ 1463.760810]  raid10_make_request+0xc1/0x120 [raid10]
 >>>> [ 1463.760816]  ? wait_woken+0x80/0x80
 >>>> [ 1463.760831]  md_handle_request+0x121/0x190 [md_mod]
 >>>> [ 1463.760851]  md_make_request+0x78/0x190 [md_mod]
 >>>> [ 1463.760860]  generic_make_request+0x1c6/0x470
 >>>> [ 1463.760870]  raid10_write_request+0x77a/0x900 [raid10]
 >>
 >> It seems the bio is being split; can you try the change below? It seems I
 >> didn't send it to the mailing list successfully.
 >>
 >> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
 >> index b98e746e7fc4..12cf8a04e839 100644
 >> --- a/drivers/md/raid10.c
 >> +++ b/drivers/md/raid10.c
 >> @@ -1209,7 +1209,9 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
 >>                 struct bio *split = bio_split(bio, max_sectors,
 >>                                               gfp, &conf->bio_split);
 >>                 bio_chain(split, bio);
 >> +               allow_barrier(conf);
 >>                 generic_make_request(bio);
 >> +               wait_barrier(conf);
 >>                 bio = split;
 >>                 r10_bio->master_bio = bio;
 >>                 r10_bio->sectors = max_sectors;
 >> @@ -1514,7 +1516,9 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
 >>                 struct bio *split = bio_split(bio, r10_bio->sectors,
 >>                                               GFP_NOIO, &conf->bio_split);
 >>                 bio_chain(split, bio);
 >> +               allow_barrier(conf);
 >>                 generic_make_request(bio);
 >> +               wait_barrier(conf);
 >>                 bio = split;
 >>                 r10_bio->master_bio = bio;
 >>         }
 >>
 >> And I updated opensuse bugzilla as well.
 >>
 >> Thanks,
 >> Guoqing
 >>
 >>>> [ 1463.760875]  ? wait_woken+0x80/0x80
 >>>> [ 1463.760879]  ? mempool_alloc+0x55/0x160
 >>>> [ 1463.760895]  ? md_write_start+0xa9/0x270 [md_mod]
 >>>> [ 1463.760904]  raid10_make_request+0xc1/0x120 [raid10]
 >>>> [ 1463.760910]  ? wait_woken+0x80/0x80
 >>>> [ 1463.760926]  md_handle_request+0x121/0x190 [md_mod]
 >>>> [ 1463.760931]  ? _raw_spin_unlock_irq+0x22/0x40
 >>>> [ 1463.760936]  ? finish_task_switch+0x74/0x260
 >>>> [ 1463.760954]  submit_flushes+0x21/0x40 [md_mod]
 >>>> [ 1463.760962]  process_one_work+0x1fd/0x420
 >>>> [ 1463.760970]  worker_thread+0x2d/0x3d0
 >>>> [ 1463.760976]  ? rescuer_thread+0x340/0x340
 >>>> [ 1463.760981]  kthread+0x112/0x130
 >>>> [ 1463.760986]  ? kthread_create_worker_on_cpu+0x40/0x40
 >>>> [ 1463.760992]  ret_from_fork+0x3a/0x50
 >>
 >


* Re: XFS and RAID10 with o2 layout
       [not found]   ` <0a33a20d-5f49-7b34-3662-5b818c67621a@suse.com>
       [not found]     ` <48ba331d-a896-f532-2c75-cf94ddf87b60@4net.rs>
@ 2018-12-18 15:01     ` Sinisa
  1 sibling, 0 replies; 16+ messages in thread
From: Sinisa @ 2018-12-18 15:01 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: linux-xfs, linux-raid

After a full day of testing (on 3 machines that were affected by the bug 
earlier) I would like to confirm that there wasn't a single failure when this 
patch was applied, with 4.20-rc6 and 4.12.14 kernels (from openSUSE).


Srdačan pozdrav / Best regards,
Siniša Bandin

On 12/17/18 2:49 AM, Guoqing Jiang wrote:
 > Hi,
 >
 > On 12/12/18 10:30 PM, Brian Foster wrote:
 >>> [ 1463.760721] Call Trace:
 >>> [ 1463.760731]  ? __schedule+0x29a/0x880
 >>> [ 1463.760741]  ? wait_barrier+0xdd/0x170 [raid10]
 >>> [ 1463.760746]  schedule+0x78/0x110
 >>> [ 1463.760753]  wait_barrier+0xdd/0x170 [raid10]
 >>> [ 1463.760761]  ? wait_woken+0x80/0x80
 >>> [ 1463.760768]  raid10_write_request+0xf2/0x900 [raid10]
 >>> [ 1463.760774]  ? wait_woken+0x80/0x80
 >>> [ 1463.760778]  ? mempool_alloc+0x55/0x160
 >>> [ 1463.760795]  ? md_write_start+0xa9/0x270 [md_mod]
 >>> [ 1463.760801]  ? try_to_wake_up+0x44/0x470
 >>> [ 1463.760810]  raid10_make_request+0xc1/0x120 [raid10]
 >>> [ 1463.760816]  ? wait_woken+0x80/0x80
 >>> [ 1463.760831]  md_handle_request+0x121/0x190 [md_mod]
 >>> [ 1463.760851]  md_make_request+0x78/0x190 [md_mod]
 >>> [ 1463.760860]  generic_make_request+0x1c6/0x470
 >>> [ 1463.760870]  raid10_write_request+0x77a/0x900 [raid10]
 >
 > It seems the bio is being split; can you try the change below? It seems I
 > didn't send it to the mailing list successfully.
 >
 > diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
 > index b98e746e7fc4..12cf8a04e839 100644
 > --- a/drivers/md/raid10.c
 > +++ b/drivers/md/raid10.c
 > @@ -1209,7 +1209,9 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
 >                 struct bio *split = bio_split(bio, max_sectors,
 >                                               gfp, &conf->bio_split);
 >                 bio_chain(split, bio);
 > +               allow_barrier(conf);
 >                 generic_make_request(bio);
 > +               wait_barrier(conf);
 >                 bio = split;
 >                 r10_bio->master_bio = bio;
 >                 r10_bio->sectors = max_sectors;
 > @@ -1514,7 +1516,9 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
 >                 struct bio *split = bio_split(bio, r10_bio->sectors,
 >                                               GFP_NOIO, &conf->bio_split);
 >                 bio_chain(split, bio);
 > +               allow_barrier(conf);
 >                 generic_make_request(bio);
 > +               wait_barrier(conf);
 >                 bio = split;
 >                 r10_bio->master_bio = bio;
 >         }
 >
 > And I updated opensuse bugzilla as well.
 >
 > Thanks,
 > Guoqing
 >
 >>> [ 1463.760875]  ? wait_woken+0x80/0x80
 >>> [ 1463.760879]  ? mempool_alloc+0x55/0x160
 >>> [ 1463.760895]  ? md_write_start+0xa9/0x270 [md_mod]
 >>> [ 1463.760904]  raid10_make_request+0xc1/0x120 [raid10]
 >>> [ 1463.760910]  ? wait_woken+0x80/0x80
 >>> [ 1463.760926]  md_handle_request+0x121/0x190 [md_mod]
 >>> [ 1463.760931]  ? _raw_spin_unlock_irq+0x22/0x40
 >>> [ 1463.760936]  ? finish_task_switch+0x74/0x260
 >>> [ 1463.760954]  submit_flushes+0x21/0x40 [md_mod]
 >>> [ 1463.760962]  process_one_work+0x1fd/0x420
 >>> [ 1463.760970]  worker_thread+0x2d/0x3d0
 >>> [ 1463.760976]  ? rescuer_thread+0x340/0x340
 >>> [ 1463.760981]  kthread+0x112/0x130
 >>> [ 1463.760986]  ? kthread_create_worker_on_cpu+0x40/0x40
 >>> [ 1463.760992]  ret_from_fork+0x3a/0x50
 >


* Re: XFS and RAID10 with o2 layout
  2018-12-12 12:29 XFS and RAID10 with o2 layout Sinisa
  2018-12-12 14:30 ` Brian Foster
@ 2018-12-13 22:05 ` Dave Chinner
  2018-12-14  7:03   ` Sinisa
  2018-12-14 11:39 ` Sinisa
  2 siblings, 1 reply; 16+ messages in thread
From: Dave Chinner @ 2018-12-13 22:05 UTC (permalink / raw)
  To: Sinisa; +Cc: linux-xfs

On Wed, Dec 12, 2018 at 01:29:49PM +0100, Sinisa wrote:
> Hello group,
> 
> I have noticed something strange going on lately, but recently I
> have come to conclusion that there is some unwanted interaction
> between XFS and Linux RAID10 with "offset" layout.
> 
> So here is the problem: I create a Linux RAID10 mirror with 2 disks
> (HDD or SSD) and "o2" layout (best choice for read and write speed):
> # mdadm -C -n2 -l10 -po2 /dev/mdX /dev/sdaX /dev/sdbX
> # mkfs.xfs /dev/mdX
> # mount /dev/mdX /mnt
> # rsync -avxDPHS / /mnt
> 
> So we have RAID10 initializing:
> 
> # cat /proc/mdstat
> Personalities : [raid1] [raid10]
> md2 : active raid10 sdb3[1] sda3[0]
>       314433536 blocks super 1.2 4096K chunks 2 offset-copies [2/2] [UU]
>       [==>..................]  resync = 11.7% (36917568/314433536)
> finish=8678.2min speed=532K/sec
>       bitmap: 3/3 pages [12KB], 65536KB chunk
> 
> but after a few minutes everything stops like you can see above.
> Rsync (or any other process writing to that md device) also freezes.
> If I try to read already copied files - freeze, usually with less
> than 2GB copied.

Just a quick note:

> [ 1463.756426]  schedule+0x78/0x110
> [ 1463.756433]  wait_barrier+0xdd/0x170 [raid10]
> [ 1463.756448]  raid10_write_request+0xf2/0x900 [raid10]
> [ 1463.756492]  raid10_make_request+0xc1/0x120 [raid10]
> [ 1463.756514]  md_handle_request+0x121/0x190 [md_mod]
> [ 1463.756535]  md_make_request+0x78/0x190 [md_mod]
> [ 1463.756544]  generic_make_request+0x1c6/0x470

This is XFS IO submission waiting on a MD sync barrier.

> [ 1463.757013] Workqueue: md submit_flushes [md_mod]
> [ 1463.757016] Call Trace:
> [ 1463.757039]  schedule+0x78/0x110
> [ 1463.757047]  wait_barrier+0xdd/0x170 [raid10]
> [ 1463.757062]  raid10_write_request+0xf2/0x900 [raid10]
> [ 1463.757104]  raid10_make_request+0xc1/0x120 [raid10]
> [ 1463.757126]  md_handle_request+0x121/0x190 [md_mod]
> [ 1463.757156]  submit_flushes+0x21/0x40 [md_mod]
> [ 1463.757163]  process_one_work+0x1fd/0x420
> [ 1463.757170]  worker_thread+0x2d/0x3d0
> [ 1463.757177]  ? rescuer_thread+0x340/0x340
> [ 1463.757181]  kthread+0x112/0x130

This is an MD flush thread waiting on a MD sync barrier.

> [ 1463.757212] md1_resync      D    0  5215      2 0x80000000
> [ 1463.757216] Call Trace:
> [ 1463.757236]  schedule+0x78/0x110
> [ 1463.757243]  raise_barrier+0x8d/0x140 [raid10]
> [ 1463.757257]  raid10_sync_request+0x1f6/0x1e30 [raid10]
> [ 1463.757302]  md_do_sync.cold.78+0x404/0x969 [md_mod]
> [ 1463.757351]  md_thread+0xe9/0x140 [md_mod]

This is the MD resync thread raising the sync barrier and waiting
for all waiters to drain and pending IO to drain away.

> [ 1463.757426]  schedule+0x78/0x110
> [ 1463.757433]  wait_barrier+0xdd/0x170 [raid10]
> [ 1463.757446]  raid10_write_request+0xf2/0x900 [raid10]
> [ 1463.757485]  raid10_make_request+0xc1/0x120 [raid10]
> [ 1463.757507]  md_handle_request+0x121/0x190 [md_mod]
> [ 1463.757527]  md_make_request+0x78/0x190 [md_mod]
> [ 1463.757536]  generic_make_request+0x1c6/0x470
> [ 1463.757544]  submit_bio+0x45/0x140

XFS waiting on MD sync barrier.

> [ 1463.760718] Workqueue: md submit_flushes [md_mod]
> [ 1463.760721] Call Trace:
> [ 1463.760746]  schedule+0x78/0x110
> [ 1463.760753]  wait_barrier+0xdd/0x170 [raid10]
> [ 1463.760768]  raid10_write_request+0xf2/0x900 [raid10]
> [ 1463.760810]  raid10_make_request+0xc1/0x120 [raid10]
> [ 1463.760831]  md_handle_request+0x121/0x190 [md_mod]
> [ 1463.760851]  md_make_request+0x78/0x190 [md_mod]
> [ 1463.760860]  generic_make_request+0x1c6/0x470
> [ 1463.760870]  raid10_write_request+0x77a/0x900 [raid10]
> [ 1463.760904]  raid10_make_request+0xc1/0x120 [raid10]
> [ 1463.760926]  md_handle_request+0x121/0x190 [md_mod]
> [ 1463.760954]  submit_flushes+0x21/0x40 [md_mod]

And another MD flush thread waiting on a MD sync barrier.

Basically, this looks and smells like a MD sync barrier race
condition, not an XFS problem.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
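
The barrier race described above can be replayed with a toy model. The
Python sketch below is not kernel code: it keeps only two of the r10conf
counters (barrier and nr_pending), ignores nr_waiting, RESYNC_DEPTH and the
bio_list exception in wait_barrier(), and replays one fixed interleaving.
The names Conf, writer_may_enter, resync_may_proceed and
split_write_deadlocks are made up for illustration. It assumes the hang
comes from a split write re-entering wait_barrier() while still counted in
nr_pending, which is what the nested raid10_write_request frames in the
submit_flushes trace and the allow_barrier()/wait_barrier() patch in this
thread suggest.

```python
from dataclasses import dataclass


@dataclass
class Conf:
    """Toy stand-in for the two r10conf counters that matter here."""
    barrier: int = 0      # raised by resync to hold off new regular I/O
    nr_pending: int = 0   # regular requests currently admitted


def writer_may_enter(conf: Conf) -> bool:
    # wait_barrier(): a regular request sleeps while the barrier is raised.
    return conf.barrier == 0


def resync_may_proceed(conf: Conf) -> bool:
    # raise_barrier(): after bumping the barrier, resync sleeps until all
    # admitted regular requests have completed.
    return conf.nr_pending == 0


def split_write_deadlocks(release_before_resubmit: bool) -> bool:
    """Replay one interleaving; return True if both sides end up asleep."""
    conf = Conf()

    # 1. Writer enters raid10_write_request(): wait_barrier() admits it.
    assert writer_may_enter(conf)
    conf.nr_pending += 1

    # 2. Resync thread calls raise_barrier() and bumps the barrier.
    conf.barrier += 1

    # 3. Writer splits the bio and resubmits the remainder.
    if release_before_resubmit:
        conf.nr_pending -= 1          # allow_barrier(), as in the patch

    # 4. Resync runs if nothing is pending, then lowers the barrier again.
    if resync_may_proceed(conf):
        conf.barrier -= 1             # lower_barrier()

    # 5. The resubmitted remainder (and, with the patch, the writer's own
    #    wait_barrier()) can only get in once the barrier is down.
    writer_blocked = not writer_may_enter(conf)
    resync_blocked = conf.barrier > 0 and not resync_may_proceed(conf)
    return writer_blocked and resync_blocked


if __name__ == "__main__":
    print("unpatched:", split_write_deadlocks(False))
    print("patched:  ", split_write_deadlocks(True))
```

In this model the unpatched ordering leaves the writer waiting for the
barrier to drop while resync waits for nr_pending to reach zero; dropping
the pending count before resubmitting the remainder lets resync finish
first. Whether this is the exact interleaving hit on the reporter's
machines is an assumption.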


* Re: XFS and RAID10 with o2 layout
  2018-12-13 22:05 ` Dave Chinner
@ 2018-12-14  7:03   ` Sinisa
  2018-12-14  8:26     ` Wols Lists
  2018-12-14 21:20     ` Dave Chinner
  0 siblings, 2 replies; 16+ messages in thread
From: Sinisa @ 2018-12-14  7:03 UTC (permalink / raw)
  To: linux-raid; +Cc: linux-xfs


On 12/13/18 11:05 PM, Dave Chinner wrote:
> On Wed, Dec 12, 2018 at 01:29:49PM +0100, Sinisa wrote:
>> Hello group,
>>
>> I have noticed something strange going on lately, but recently I
>> have come to conclusion that there is some unwanted interaction
>> between XFS and Linux RAID10 with "offset" layout.
>>
>> So here is the problem: I create a Linux RAID10 mirror with 2 disks
>> (HDD or SSD) and "o2" layout (best choice for read and write speed):
>> # mdadm -C -n2 -l10 -po2 /dev/mdX /dev/sdaX /dev/sdbX
>> # mkfs.xfs /dev/mdX
>> # mount /dev/mdX /mnt
>> # rsync -avxDPHS / /mnt
>>
>> So we have RAID10 initializing:
>>
>> # cat /proc/mdstat
>> Personalities : [raid1] [raid10]
>> md2 : active raid10 sdb3[1] sda3[0]
>>        314433536 blocks super 1.2 4096K chunks 2 offset-copies [2/2] [UU]
>>        [==>..................]  resync = 11.7% (36917568/314433536)
>> finish=8678.2min speed=532K/sec
>>        bitmap: 3/3 pages [12KB], 65536KB chunk
>>
>> but after a few minutes everything stops like you can see above.
>> Rsync (or any other process writing to that md device) also freezes.
>> If I try to read already copied files - freeze, usually with less
>> than 2GB copied.
> Just a quick note:
>
>> [ 1463.756426]  schedule+0x78/0x110
>> [ 1463.756433]  wait_barrier+0xdd/0x170 [raid10]
>> [ 1463.756448]  raid10_write_request+0xf2/0x900 [raid10]
>> [ 1463.756492]  raid10_make_request+0xc1/0x120 [raid10]
>> [ 1463.756514]  md_handle_request+0x121/0x190 [md_mod]
>> [ 1463.756535]  md_make_request+0x78/0x190 [md_mod]
>> [ 1463.756544]  generic_make_request+0x1c6/0x470
> This is XFS IO submission waiting on a MD sync barrier.
>
>> [ 1463.757013] Workqueue: md submit_flushes [md_mod]
>> [ 1463.757016] Call Trace:
>> [ 1463.757039]  schedule+0x78/0x110
>> [ 1463.757047]  wait_barrier+0xdd/0x170 [raid10]
>> [ 1463.757062]  raid10_write_request+0xf2/0x900 [raid10]
>> [ 1463.757104]  raid10_make_request+0xc1/0x120 [raid10]
>> [ 1463.757126]  md_handle_request+0x121/0x190 [md_mod]
>> [ 1463.757156]  submit_flushes+0x21/0x40 [md_mod]
>> [ 1463.757163]  process_one_work+0x1fd/0x420
>> [ 1463.757170]  worker_thread+0x2d/0x3d0
>> [ 1463.757177]  ? rescuer_thread+0x340/0x340
>> [ 1463.757181]  kthread+0x112/0x130
> This is an MD flush thread waiting on a MD sync barrier.
>
>> [ 1463.757212] md1_resync      D    0  5215      2 0x80000000
>> [ 1463.757216] Call Trace:
>> [ 1463.757236]  schedule+0x78/0x110
>> [ 1463.757243]  raise_barrier+0x8d/0x140 [raid10]
>> [ 1463.757257]  raid10_sync_request+0x1f6/0x1e30 [raid10]
>> [ 1463.757302]  md_do_sync.cold.78+0x404/0x969 [md_mod]
>> [ 1463.757351]  md_thread+0xe9/0x140 [md_mod]
> This is the MD resync thread raising the sync barrier and waiting
> for all waiters to drain and pending IO to drain away.
>
>> [ 1463.757426]  schedule+0x78/0x110
>> [ 1463.757433]  wait_barrier+0xdd/0x170 [raid10]
>> [ 1463.757446]  raid10_write_request+0xf2/0x900 [raid10]
>> [ 1463.757485]  raid10_make_request+0xc1/0x120 [raid10]
>> [ 1463.757507]  md_handle_request+0x121/0x190 [md_mod]
>> [ 1463.757527]  md_make_request+0x78/0x190 [md_mod]
>> [ 1463.757536]  generic_make_request+0x1c6/0x470
>> [ 1463.757544]  submit_bio+0x45/0x140
> XFS waiting on MD sync barrier.
>
>> [ 1463.760718] Workqueue: md submit_flushes [md_mod]
>> [ 1463.760721] Call Trace:
>> [ 1463.760746]  schedule+0x78/0x110
>> [ 1463.760753]  wait_barrier+0xdd/0x170 [raid10]
>> [ 1463.760768]  raid10_write_request+0xf2/0x900 [raid10]
>> [ 1463.760810]  raid10_make_request+0xc1/0x120 [raid10]
>> [ 1463.760831]  md_handle_request+0x121/0x190 [md_mod]
>> [ 1463.760851]  md_make_request+0x78/0x190 [md_mod]
>> [ 1463.760860]  generic_make_request+0x1c6/0x470
>> [ 1463.760870]  raid10_write_request+0x77a/0x900 [raid10]
>> [ 1463.760904]  raid10_make_request+0xc1/0x120 [raid10]
>> [ 1463.760926]  md_handle_request+0x121/0x190 [md_mod]
>> [ 1463.760954]  submit_flushes+0x21/0x40 [md_mod]
> And another MD flush thread waiting on a MD sync barrier.
>
> Basically, this looks and smells like a MD sync barrier race
> condition, not an XFS problem.
>
> Cheers,
>
> Dave.

But why don't we see the same issue with other filesystems?


Srdačan pozdrav / Best regards,
Siniša Bandin


* Re: XFS and RAID10 with o2 layout
  2018-12-14  7:03   ` Sinisa
@ 2018-12-14  8:26     ` Wols Lists
  2018-12-14 20:44       ` John Stoffel
  2018-12-14 21:20     ` Dave Chinner
  1 sibling, 1 reply; 16+ messages in thread
From: Wols Lists @ 2018-12-14  8:26 UTC (permalink / raw)
  To: Sinisa, linux-raid; +Cc: linux-xfs

On 14/12/18 07:03, Sinisa wrote:
>> And another MD flush thread waiting on a MD sync barrier.
>>
>> Basically, this looks and smells like a MD sync barrier race
>> condition, not an XFS problem.
>>
>> Cheers,
>>
>> Dave.
> 
> But why don't we see the same issue with other filesystems?

Possibly because, iirc, xfs is aware of the underlying raid?

I don't know, but I seem to remember that from earlier discussions about
"xfs over mdraid".

Cheers,
Wol


* Re: XFS and RAID10 with o2 layout
  2018-12-14  8:26     ` Wols Lists
@ 2018-12-14 20:44       ` John Stoffel
  2018-12-15 15:36         ` Siniša Bandin
  0 siblings, 1 reply; 16+ messages in thread
From: John Stoffel @ 2018-12-14 20:44 UTC (permalink / raw)
  To: Wols Lists; +Cc: Sinisa, linux-raid, linux-xfs

>>>>> "Wols" == Wols Lists <antlists@youngman.org.uk> writes:

Wols> On 14/12/18 07:03, Sinisa wrote:
>>> And another MD flush thread waiting on a MD sync barrier.
>>> 
>>> Basically, this looks and smells like a MD sync barrier race
>>> condition, not an XFS problem.
>>> 
>>> Cheers,
>>> 
>>> Dave.
>> 
>> But why don't we see the same issue with other filesystems?

Wols> Possibly because, iirc, xfs is aware of the underlying raid?

I don't think it's that in touch with the raid.  It can/does query for
settings so it can optimize stuff, but I don't think it actually uses
it other than as a block device.

Wols> I don't know, but I seem to remember that from earlier discussions about
Wols> "xfs over mdraid".

Have you tried doing this with the absolute latest kernel?  


* Re: XFS and RAID10 with o2 layout
  2018-12-14 20:44       ` John Stoffel
@ 2018-12-15 15:36         ` Siniša Bandin
  0 siblings, 0 replies; 16+ messages in thread
From: Siniša Bandin @ 2018-12-15 15:36 UTC (permalink / raw)
  To: linux-raid, linux-xfs, linux-raid-owner



On 14.12.2018 21:44, John Stoffel wrote:
>>>>>> "Wols" == Wols Lists <antlists@youngman.org.uk> writes:
> 
> Wols> On 14/12/18 07:03, Sinisa wrote:
>>>> And another MD flush thread waiting on a MD sync barrier.
>>>> 
>>>> Basically, this looks and smells like a MD sync barrier race
>>>> condition, not an XFS problem.
>>>> 
>>>> Cheers,
>>>> 
>>>> Dave.
>>> 
>>> But why don't we see the same issue with other filesystems?
> 
> Wols> Possibly because, iirc, xfs is aware of the underlying raid?
> 
> I don't think it's that in touch with the raid.  It can/does query for
> settings so it can optimize stuff, but I don't think it actually uses
> it other than as a block device.
> 
> Wols> I don't know, but I seem to remember that from earlier 
> discussions about
> Wols> "xfs over mdraid".
> 
> Have you tried doing this with the absolute latest kernel?

Well, not "absolute" but relative :)

Absolute latest being 4.20-rc6, I have tried with 4.20-rc2 or -rc3


-- 
Srdačan pozdrav/Best regards,
Siniša Bandin


* Re: XFS and RAID10 with o2 layout
  2018-12-14  7:03   ` Sinisa
  2018-12-14  8:26     ` Wols Lists
@ 2018-12-14 21:20     ` Dave Chinner
  1 sibling, 0 replies; 16+ messages in thread
From: Dave Chinner @ 2018-12-14 21:20 UTC (permalink / raw)
  To: Sinisa; +Cc: linux-raid, linux-xfs

On Fri, Dec 14, 2018 at 08:03:36AM +0100, Sinisa wrote:
> On 12/13/18 11:05 PM, Dave Chinner wrote:
> >On Wed, Dec 12, 2018 at 01:29:49PM +0100, Sinisa wrote:
> >Basically, this looks and smells like a MD sync barrier race
> >condition, not an XFS problem.
> 
> But why don't we see the same issue with other filesystems?

XFS has a lot more parallelism at the storage layer than other
filesystems and has a different integrity synchronisation model (via
IO completion processing rather than submission serialisation), so
it stresses the underlying storage very differently to other
filesystems.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: XFS and RAID10 with o2 layout
  2018-12-12 12:29 XFS and RAID10 with o2 layout Sinisa
  2018-12-12 14:30 ` Brian Foster
  2018-12-13 22:05 ` Dave Chinner
@ 2018-12-14 11:39 ` Sinisa
  2 siblings, 0 replies; 16+ messages in thread
From: Sinisa @ 2018-12-14 11:39 UTC (permalink / raw)
  To: linux-xfs; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 3936 bytes --]

Today I installed Debian 9.

With the default kernel 4.9 I could not reproduce the problem in about 2 hours
of testing.

Then I rebooted into kernel 4.19.5 (from "backports") and got a lockup in
about 5 minutes:

In kernel 4.9 I did this (for the 3rd time today):
# mdadm -C -n2 -l10 -po2 -c512 /dev/md2 /dev/sda3 /dev/sdb3
    <- forced 512K chunk, to cause the lockup faster than with larger chunk sizes
# mkfs.xfs -f /dev/md2
# mount /dev/md2 /mnt/
# rsync -avxDPHS / /mnt
(working OK)


Rebooted before the sync had finished and started kernel 4.19:
# mount /dev/md127 /mnt/
    <- I did not add md2 to /etc/mdadm.conf, so it was auto-detected as md127
       and stayed in state PENDING until mounted
# rsync -avxDPHS / /mnt
... (rsync stopped after a few minutes)


root@debian:~# cat /proc/mdstat
Personalities : [raid1] [raid10] [linear] [multipath] [raid0] [raid6] [raid5] [raid4]
md127 : active raid10 sdb3[1] sda3[0]
       314441728 blocks super 1.2 512K chunks 2 offset-copies [2/2] [UU]
       [=====>...............]  resync = 28.4% (89309696/314441728) finish=828.2min speed=4528K/sec
       bitmap: 3/3 pages [12KB], 65536KB chunk
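
To tell a stalled resync from one that is merely slow, one option is to sample
the completed-block counter from /proc/mdstat twice and compare the values.
The following is only a minimal sketch, assuming the mdstat line format shown
above; the function name resync_done_blocks is hypothetical and the 60-second
interval in the usage comment is an arbitrary choice.

```shell
#!/bin/sh
# Print the number of blocks already resynced, taken from the
# "(done/total)" field of an mdstat resync line read on stdin.
# Prints nothing if the input contains no resync line.
resync_done_blocks() {
    sed -n 's|.*resync *= *[0-9.]*% *(\([0-9]*\)/[0-9]*).*|\1|p'
}

# Usage sketch (not executed here): sample twice and compare.
#   a=$(resync_done_blocks < /proc/mdstat); sleep 60
#   b=$(resync_done_blocks < /proc/mdstat)
#   [ "$a" = "$b" ] && echo "resync made no progress in 60s"
```

If the counter is unchanged between two samples while the array still reports
an active resync, that is consistent with the hang described here, though it
does not by itself say which layer is blocked.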


The dmesg output is attached; to me it looks much the same as before.



Rebooted (hard) into kernel 4.10-rc6 (the only other option available from the
same backports).
Could not make the problem appear in 3 attempts.
With this, I think we can narrow the search to something between kernels 4.10
and 4.12.


Next I will try to compile kernels 4.10 (final) and 4.11 and test with them, 
but that might take a bit more time...
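
If a kernel git tree is at hand, the range could also be narrowed with git
bisect instead of testing release kernels one at a time. This is a rough
sketch only. It assumes v4.10 behaves like the 4.10-rc6 tested here (good) and
that v4.12 is bad, as first seen on openSUSE. Restricting the bisection to
drivers/md is a further assumption based on the sync barrier diagnosis, and it
would miss a culprit that lives elsewhere. The function name start_md_bisect
is hypothetical.

```shell
#!/bin/sh
# Start a bisection limited to commits touching the MD driver directory.
# Must be run from inside a Linux kernel git checkout.
# $1 = known-bad tag (default v4.12), $2 = known-good tag (default v4.10).
start_md_bisect() {
    bad=${1:-v4.12}
    good=${2:-v4.10}
    git bisect start "$bad" "$good" -- drivers/md
}

# After building and booting each candidate, run the reproducer and mark it:
#   git bisect good    # resync and rsync both keep running
#   git bisect bad     # I/O to the array hangs
# When finished, "git bisect reset" returns to the original checkout.
```

Since the hang does not always appear immediately, a kernel should probably
only be marked good after several clean attempts.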

Srdačan pozdrav / Best regards,
Siniša Bandin

On 12/12/18 1:29 PM, Sinisa wrote:
> Hello group,
>
> I have noticed something strange going on lately, but recently I have come to 
> conclusion that there is some unwanted interaction between XFS and Linux 
> RAID10 with "offset" layout.
>
> So here is the problem: I create a Linux RAID10 mirror with 2 disks (HDD or 
> SSD) and "o2" layout (best choice for read and write speed):
> # mdadm -C -n2 -l10 -po2 /dev/mdX /dev/sdaX /dev/sdbX
> # mkfs.xfs /dev/mdX
> # mount /dev/mdX /mnt
> # rsync -avxDPHS / /mnt
>
> So we have RAID10 initializing:
>
> # cat /proc/mdstat
> Personalities : [raid1] [raid10]
> md2 : active raid10 sdb3[1] sda3[0]
>       314433536 blocks super 1.2 4096K chunks 2 offset-copies [2/2] [UU]
>       [==>..................]  resync = 11.7% (36917568/314433536) 
> finish=8678.2min speed=532K/sec
>       bitmap: 3/3 pages [12KB], 65536KB chunk
>
> but after a few minutes everything stops like you can see above. Rsync (or 
> any other process writing to that md device) also freezes. If I try to read 
> already copied files - freeze, usually with less that 2GB copied.
>
> Sometimes in dmesg I get some kernel messages about "task kworker/2:1:55 
> blocked for more than 480 seconds." (please see attached dmesg.txt and my 
> reports here: https://bugzilla.opensuse.org/show_bug.cgi?id=1111073), 
> sometimes nothing at all. When this happens, I can only reboot with SysRq-b 
> or "physically" with reset/power button.
>
> Same thing can happen with "far" layout, but it seems to me that it does not 
> happen every time (or that often). I might be wrong, because I never use 
> "far" layout in real life, only for testing.
> I was unable to reproduce the failure with "near" layout.
>
> Also with EXT4 or BTRFS and any layout everything works just as it should, 
> that is sync goes on until finished, and rsync, cp, or any other write work 
> just fine at the same time.
>
> Let me just add that I first saw this behavior in openSUSE LEAP 15.0 (kernel 
> 4.12). In previous versions (up to kernel 4.4) I never had this problem. In 
> the meantime I have tested with kernels up to 4.20rc and it is the same. 
> Unfortunately I cannot go back to test kernels 4.5 - 4.11 to pinpoint the 
> moment the problem first appeared.
>
>
>


[-- Attachment #2: dmesg-debian.txt --]
[-- Type: text/plain, Size: 71164 bytes --]

[    0.000000] Linux version 4.19.0-trunk-amd64 (debian-kernel@lists.debian.org) (gcc version 8.2.0 (Debian 8.2.0-10)) #1 SMP Debian 4.19.5-1~exp1 (2018-11-27)
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.19.0-trunk-amd64 root=UUID=b2a83481-6604-4b8c-911f-08a273c95409 ro quiet
[    0.000000] x86/fpu: x87 FPU will use FXSAVE
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bdfcffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000bdfd0000-0x00000000bdfddfff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000bdfde000-0x00000000bdffffff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000d0000000-0x00000000dfffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000feefffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ff780000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x0000000101ffffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] SMBIOS 2.5 present.
[    0.000000] DMI: MSI MS-7250/MS-7250, BIOS V3.10 04/07/2008
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 2000.031 MHz processor
[    0.008092] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[    0.008096] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.012799] AGP: No AGP bridge found
[    0.012846] last_pfn = 0x102000 max_arch_pfn = 0x400000000
[    0.012853] MTRR default type: uncachable
[    0.012854] MTRR fixed ranges enabled:
[    0.012856]   00000-9FFFF write-back
[    0.012858]   A0000-EFFFF uncachable
[    0.012860]   F0000-FFFFF write-protect
[    0.012861] MTRR variable ranges enabled:
[    0.012863]   0 base 0000000000 mask FF80000000 write-back
[    0.012865]   1 base 0080000000 mask FFE0000000 write-back
[    0.012867]   2 base 00A0000000 mask FFF0000000 write-back
[    0.012869]   3 base 00B0000000 mask FFF8000000 write-back
[    0.012871]   4 base 00B8000000 mask FFFC000000 write-back
[    0.012872]   5 base 00BC000000 mask FFFE000000 write-back
[    0.012873]   6 disabled
[    0.012874]   7 disabled
[    0.012876] TOM2: 0000000102000000 aka 4128M
[    0.013042] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT  
[    0.013103] e820: update [mem 0xbe000000-0xffffffff] usable ==> reserved
[    0.013113] last_pfn = 0xbdfd0 max_arch_pfn = 0x400000000
[    0.018939] found SMP MP-table at [mem 0x000ff780-0x000ff78f] mapped at [(____ptrval____)]
[    0.024350] Base memory trampoline at [(____ptrval____)] 99000 size 24576
[    0.024356] BRK [0x3ca01000, 0x3ca01fff] PGTABLE
[    0.024359] BRK [0x3ca02000, 0x3ca02fff] PGTABLE
[    0.024360] BRK [0x3ca03000, 0x3ca03fff] PGTABLE
[    0.024402] BRK [0x3ca04000, 0x3ca04fff] PGTABLE
[    0.024406] BRK [0x3ca05000, 0x3ca05fff] PGTABLE
[    0.024559] BRK [0x3ca06000, 0x3ca06fff] PGTABLE
[    0.024586] BRK [0x3ca07000, 0x3ca07fff] PGTABLE
[    0.024741] RAMDISK: [mem 0x35451000-0x36a1ffff]
[    0.024748] ACPI: Early table checksum verification disabled
[    0.025101] ACPI: RSDP 0x00000000000F98B0 000014 (v00 ACPIAM)
[    0.025107] ACPI: RSDT 0x00000000BDFD0000 00003C (v01 MSIISM OEMRSDT  04000807 MSFT 00000097)
[    0.025115] ACPI: FACP 0x00000000BDFD0200 000084 (v02 MSIISM OEMFACP  04000807 MSFT 00000097)
[    0.025124] ACPI: DSDT 0x00000000BDFD0440 005045 (v01 1ADGH  1ADGH013 00000013 INTL 20051117)
[    0.025129] ACPI: FACS 0x00000000BDFDE000 000040
[    0.025133] ACPI: APIC 0x00000000BDFD0390 000070 (v01 MSIISM OEMAPIC  04000807 MSFT 00000097)
[    0.025138] ACPI: MCFG 0x00000000BDFD0400 00003C (v01 MSIISM OEMMCFG  04000807 MSFT 00000097)
[    0.025143] ACPI: OEMB 0x00000000BDFDE040 000061 (v01 MSIISM AMI_OEM  04000807 MSFT 00000097)
[    0.025148] ACPI: HPET 0x00000000BDFD5490 000038 (v01 MSIISM OEMHPET0 04000807 MSFT 00000097)
[    0.025154] ACPI: SSDT 0x00000000BDFD54D0 0001C4 (v01 A M I  POWERNOW 00000001 AMD  00000001)
[    0.025165] ACPI: Local APIC address 0xfee00000
[    0.025289] Scanning NUMA topology in Northbridge 24
[    0.025340] No NUMA configuration found
[    0.025342] Faking a node at [mem 0x0000000000000000-0x0000000101ffffff]
[    0.025348] NODE_DATA(0) allocated [mem 0x101ffa000-0x101ffefff]
[    0.025418] Zone ranges:
[    0.025419]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.025421]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
[    0.025423]   Normal   [mem 0x0000000100000000-0x0000000101ffffff]
[    0.025425]   Device   empty
[    0.025427] Movable zone start for each node
[    0.025428] Early memory node ranges
[    0.025429]   node   0: [mem 0x0000000000001000-0x000000000009efff]
[    0.025431]   node   0: [mem 0x0000000000100000-0x00000000bdfcffff]
[    0.025432]   node   0: [mem 0x0000000100000000-0x0000000101ffffff]
[    0.025840] Reserved but unavailable: 8338 pages
[    0.025844] Initmem setup node 0 [mem 0x0000000000001000-0x0000000101ffffff]
[    0.025847] On node 0 totalpages: 786286
[    0.025849]   DMA zone: 64 pages used for memmap
[    0.025850]   DMA zone: 21 pages reserved
[    0.025852]   DMA zone: 3998 pages, LIFO batch:0
[    0.026095]   DMA32 zone: 12096 pages used for memmap
[    0.026096]   DMA32 zone: 774096 pages, LIFO batch:63
[    0.075377]   Normal zone: 128 pages used for memmap
[    0.075379]   Normal zone: 8192 pages, LIFO batch:0
[    0.076023] Detected use of extended apic ids on hypertransport bus
[    0.076034] ACPI: PM-Timer IO Port: 0x2008
[    0.076036] ACPI: Local APIC address 0xfee00000
[    0.076056] IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
[    0.076059] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.076062] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.076063] ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
[    0.076065] ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
[    0.076066] ACPI: IRQ0 used by override.
[    0.076068] ACPI: IRQ9 used by override.
[    0.076069] ACPI: IRQ14 used by override.
[    0.076070] ACPI: IRQ15 used by override.
[    0.076072] Using ACPI (MADT) for SMP configuration information
[    0.076074] ACPI: HPET id: 0x10de8201 base: 0xfed00000
[    0.076080] smpboot: Allowing 2 CPUs, 0 hotplug CPUs
[    0.076101] PM: Registered nosave memory: [mem 0x00000000-0x00000fff]
[    0.076104] PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff]
[    0.076105] PM: Registered nosave memory: [mem 0x000a0000-0x000dffff]
[    0.076106] PM: Registered nosave memory: [mem 0x000e0000-0x000fffff]
[    0.076108] PM: Registered nosave memory: [mem 0xbdfd0000-0xbdfddfff]
[    0.076109] PM: Registered nosave memory: [mem 0xbdfde000-0xbdffffff]
[    0.076110] PM: Registered nosave memory: [mem 0xbe000000-0xcfffffff]
[    0.076111] PM: Registered nosave memory: [mem 0xd0000000-0xdfffffff]
[    0.076113] PM: Registered nosave memory: [mem 0xe0000000-0xfebfffff]
[    0.076114] PM: Registered nosave memory: [mem 0xfec00000-0xfec00fff]
[    0.076115] PM: Registered nosave memory: [mem 0xfec01000-0xfedfffff]
[    0.076116] PM: Registered nosave memory: [mem 0xfee00000-0xfeefffff]
[    0.076117] PM: Registered nosave memory: [mem 0xfef00000-0xff77ffff]
[    0.076118] PM: Registered nosave memory: [mem 0xff780000-0xffffffff]
[    0.076121] [mem 0xe0000000-0xfebfffff] available for PCI devices
[    0.076122] Booting paravirtualized kernel on bare hardware
[    0.076127] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[    0.266510] random: get_random_bytes called from start_kernel+0x93/0x531 with crng_init=0
[    0.266521] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:2 nr_node_ids:1
[    0.268656] percpu: Embedded 44 pages/cpu @(____ptrval____) s143256 r8192 d28776 u1048576
[    0.268665] pcpu-alloc: s143256 r8192 d28776 u1048576 alloc=1*2097152
[    0.268667] pcpu-alloc: [0] 0 1 
[    0.268701] Built 1 zonelists, mobility grouping on.  Total pages: 773977
[    0.268702] Policy zone: Normal
[    0.268704] Kernel command line: BOOT_IMAGE=/vmlinuz-4.19.0-trunk-amd64 root=UUID=b2a83481-6604-4b8c-911f-08a273c95409 ro quiet
[    0.331385] AGP: Checking aperture...
[    0.336083] AGP: No AGP bridge found
[    0.336087] AGP: Node 0: aperture [bus addr 0xb0000000-0xb1ffffff] (32MB)
[    0.336089] Aperture pointing to e820 RAM. Ignoring.
[    0.336090] AGP: Your BIOS doesn't leave an aperture memory hole
[    0.336090] AGP: Please enable the IOMMU option in the BIOS setup
[    0.336092] AGP: This costs you 64MB of RAM
[    0.336096] AGP: Mapping aperture over RAM [mem 0xb0000000-0xb3ffffff] (65536KB)
[    0.336099] PM: Registered nosave memory: [mem 0xb0000000-0xb3ffffff]
[    0.384417] Memory: 2914780K/3145144K available (10252K kernel code, 1235K rwdata, 3192K rodata, 1572K init, 2332K bss, 230364K reserved, 0K cma-reserved)
[    0.384624] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.384637] ftrace: allocating 31599 entries in 124 pages
[    0.409674] rcu: Hierarchical RCU implementation.
[    0.409677] rcu: 	RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=2.
[    0.409679] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2
[    0.416108] NR_IRQS: 33024, nr_irqs: 440, preallocated irqs: 16
[    0.416468] spurious 8259A interrupt: IRQ7.
[    0.420349] Console: colour VGA+ 80x25
[    0.420358] console [tty0] enabled
[    0.420394] ACPI: Core revision 20180810
[    0.420631] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 76450417870 ns
[    0.420651] hpet clockevent registered
[    0.420658] APIC: Switch to symmetric I/O mode setup
[    0.421265] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.421275] do_IRQ: 0.55 No irq handler for vector
[    0.440655] tsc: Marking TSC unstable due to TSCs unsynchronized
[    0.440676] Calibrating delay loop (skipped), value calculated using timer frequency.. 4000.06 BogoMIPS (lpj=8000124)
[    0.440680] pid_max: default: 32768 minimum: 301
[    0.440742] Security Framework initialized
[    0.440744] Yama: disabled by default; enable with sysctl kernel.yama.*
[    0.440779] AppArmor: AppArmor initialized
[    0.443813] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
[    0.445376] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
[    0.445460] Mount-cache hash table entries: 8192 (order: 4, 65536 bytes)
[    0.445511] Mountpoint-cache hash table entries: 8192 (order: 4, 65536 bytes)
[    0.445874] mce: CPU supports 5 MCE banks
[    0.445883] LVT offset 0 assigned for vector 0xf9
[    0.445886] process: using AMD E400 aware idle routine
[    0.445888] Last level iTLB entries: 4KB 512, 2MB 8, 4MB 4
[    0.445890] Last level dTLB entries: 4KB 512, 2MB 8, 4MB 4, 1GB 0
[    0.445892] Spectre V2 : Spectre mitigation: LFENCE not serializing, switching to generic retpoline
[    0.445929] Spectre V2 : Mitigation: Full generic retpoline
[    0.445930] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[    0.446103] Freeing SMP alternatives memory: 24K
[    0.448656] smpboot: CPU0: AMD Athlon(tm) 64 X2 Dual Core Processor 3600+ (family: 0xf, model: 0x4b, stepping: 0x2)
[    0.448656] Performance Events: AMD PMU driver.
[    0.448656] ... version:                0
[    0.448656] ... bit width:              48
[    0.448656] ... generic registers:      4
[    0.448656] ... value mask:             0000ffffffffffff
[    0.448656] ... max period:             00007fffffffffff
[    0.448656] ... fixed-purpose events:   0
[    0.448656] ... event mask:             000000000000000f
[    0.448656] rcu: Hierarchical SRCU implementation.
[    0.448656] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
[    0.448656] smp: Bringing up secondary CPUs ...
[    0.448656] x86: Booting SMP configuration:
[    0.448656] .... node  #0, CPUs:      #1
[    0.524762] smp: Brought up 1 node, 2 CPUs
[    0.524762] smpboot: Max logical packages: 1
[    0.524762] smpboot: Total of 2 processors activated (8000.02 BogoMIPS)
[    0.525239] devtmpfs: initialized
[    0.525239] x86/mm: Memory block size: 128MB
[    0.525494] PM: Registering ACPI NVS region [mem 0xbdfde000-0xbdffffff] (139264 bytes)
[    0.525494] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.525494] futex hash table entries: 512 (order: 3, 32768 bytes)
[    0.525494] pinctrl core: initialized pinctrl subsystem
[    0.525494] NET: Registered protocol family 16
[    0.525494] audit: initializing netlink subsys (disabled)
[    0.525494] audit: type=2000 audit(1544782602.104:1): state=initialized audit_enabled=0 res=1
[    0.525494] cpuidle: using governor ladder
[    0.525494] cpuidle: using governor menu
[    0.525494] node 0 link 0: io port [1000, ffffff]
[    0.525494] node 0 link 0: io port [2000, 2fff]
[    0.525494] TOM: 00000000be000000 aka 3040M
[    0.525494] node 0 link 0: mmio [e0000000, efffffff]
[    0.525494] node 0 link 0: mmio [a0000, bffff]
[    0.525494] node 0 link 0: mmio [be000000, fe0bffff]
[    0.525494] TOM2: 0000000102000000 aka 4128M
[    0.525494] bus: [bus 00-ff] on node 0 link 0
[    0.525494] bus: 00 [io  0x0000-0xffff]
[    0.525494] bus: 00 [mem 0xbe000000-0xffffffff]
[    0.525494] bus: 00 [mem 0x000a0000-0x000bffff]
[    0.525494] bus: 00 [mem 0x102000000-0xfcffffffff]
[    0.525494] ACPI: bus type PCI registered
[    0.525494] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[    0.525494] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0xe0000000-0xefffffff] (base 0xe0000000)
[    0.525494] PCI: not using MMCONFIG
[    0.525494] PCI: Using configuration type 1 for base access
[    0.525494] mtrr: your CPUs had inconsistent variable MTRR settings
[    0.525494] mtrr: probably your BIOS does not setup all CPUs.
[    0.525494] mtrr: corrected configuration.
[    0.529051] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[    0.529051] ACPI: Added _OSI(Module Device)
[    0.529051] ACPI: Added _OSI(Processor Device)
[    0.529051] ACPI: Added _OSI(3.0 _SCP Extensions)
[    0.529051] ACPI: Added _OSI(Processor Aggregator Device)
[    0.529051] ACPI: Added _OSI(Linux-Dell-Video)
[    0.529051] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[    0.533641] ACPI: 2 ACPI AML tables successfully acquired and loaded
[    0.535890] ACPI: Interpreter enabled
[    0.535916] ACPI: (supports S0 S1 S4 S5)
[    0.535918] ACPI: Using IOAPIC for interrupt routing
[    0.535962] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0xe0000000-0xefffffff] (base 0xe0000000)
[    0.537246] PCI: MMCONFIG at [mem 0xe0000000-0xefffffff] reserved in ACPI motherboard resources
[    0.537275] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[    0.537685] ACPI: Enabled 10 GPEs in block 00 to 1F
[    0.547876] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[    0.547885] acpi PNP0A03:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
[    0.547892] acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
[    0.548181] PCI host bridge to bus 0000:00
[    0.548185] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
[    0.548187] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
[    0.548189] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[    0.548191] pci_bus 0000:00: root bus resource [mem 0x000d0000-0x000dffff window]
[    0.548193] pci_bus 0000:00: root bus resource [mem 0xbe000000-0xfebfffff window]
[    0.548195] pci_bus 0000:00: root bus resource [bus 00-ff]
[    0.548216] pci 0000:00:00.0: [10de:0369] type 00 class 0x050000
[    0.548468] pci 0000:00:01.0: [10de:0362] type 00 class 0x060100
[    0.548477] pci 0000:00:01.0: reg 0x10: [io  0x2f00-0x2f7f]
[    0.548593] pci 0000:00:01.1: [10de:0368] type 00 class 0x0c0500
[    0.548606] pci 0000:00:01.1: reg 0x10: [io  0x2900-0x293f]
[    0.548622] pci 0000:00:01.1: reg 0x20: [io  0x2d00-0x2d3f]
[    0.548628] pci 0000:00:01.1: reg 0x24: [io  0x2e00-0x2e3f]
[    0.548678] pci 0000:00:01.1: PME# supported from D3hot D3cold
[    0.548789] pci 0000:00:02.0: [10de:036c] type 00 class 0x0c0310
[    0.548799] pci 0000:00:02.0: reg 0x10: [mem 0xfcffb000-0xfcffbfff]
[    0.548832] pci 0000:00:02.0: supports D1 D2
[    0.548834] pci 0000:00:02.0: PME# supported from D0 D1 D2 D3hot D3cold
[    0.548932] pci 0000:00:02.1: [10de:036d] type 00 class 0x0c0320
[    0.548941] pci 0000:00:02.1: reg 0x10: [mem 0xfcffac00-0xfcffacff]
[    0.548972] pci 0000:00:02.1: supports D1 D2
[    0.548974] pci 0000:00:02.1: PME# supported from D0 D1 D2 D3hot D3cold
[    0.549071] pci 0000:00:04.0: [10de:036e] type 00 class 0x01018a
[    0.549089] pci 0000:00:04.0: reg 0x20: [io  0xffa0-0xffaf]
[    0.549098] pci 0000:00:04.0: legacy IDE quirk: reg 0x10: [io  0x01f0-0x01f7]
[    0.549099] pci 0000:00:04.0: legacy IDE quirk: reg 0x14: [io  0x03f6]
[    0.549101] pci 0000:00:04.0: legacy IDE quirk: reg 0x18: [io  0x0170-0x0177]
[    0.549103] pci 0000:00:04.0: legacy IDE quirk: reg 0x1c: [io  0x0376]
[    0.549201] pci 0000:00:05.0: [10de:037f] type 00 class 0x010185
[    0.549210] pci 0000:00:05.0: reg 0x10: [io  0xd480-0xd487]
[    0.549215] pci 0000:00:05.0: reg 0x14: [io  0xd400-0xd403]
[    0.549219] pci 0000:00:05.0: reg 0x18: [io  0xd080-0xd087]
[    0.549223] pci 0000:00:05.0: reg 0x1c: [io  0xd000-0xd003]
[    0.549227] pci 0000:00:05.0: reg 0x20: [io  0xcc00-0xcc0f]
[    0.549232] pci 0000:00:05.0: reg 0x24: [mem 0xfcff9000-0xfcff9fff]
[    0.549342] pci 0000:00:05.1: [10de:037f] type 00 class 0x010185
[    0.549351] pci 0000:00:05.1: reg 0x10: [io  0xc880-0xc887]
[    0.549355] pci 0000:00:05.1: reg 0x14: [io  0xc800-0xc803]
[    0.549360] pci 0000:00:05.1: reg 0x18: [io  0xc480-0xc487]
[    0.549364] pci 0000:00:05.1: reg 0x1c: [io  0xc400-0xc403]
[    0.549368] pci 0000:00:05.1: reg 0x20: [io  0xc080-0xc08f]
[    0.549373] pci 0000:00:05.1: reg 0x24: [mem 0xfcff8000-0xfcff8fff]
[    0.549478] pci 0000:00:05.2: [10de:037f] type 00 class 0x010185
[    0.549487] pci 0000:00:05.2: reg 0x10: [io  0xc000-0xc007]
[    0.549491] pci 0000:00:05.2: reg 0x14: [io  0xbc00-0xbc03]
[    0.549495] pci 0000:00:05.2: reg 0x18: [io  0xb880-0xb887]
[    0.549500] pci 0000:00:05.2: reg 0x1c: [io  0xb800-0xb803]
[    0.549504] pci 0000:00:05.2: reg 0x20: [io  0xb480-0xb48f]
[    0.549508] pci 0000:00:05.2: reg 0x24: [mem 0xfcff7000-0xfcff7fff]
[    0.549615] pci 0000:00:06.0: [10de:0370] type 01 class 0x060401
[    0.549733] pci 0000:00:06.1: [10de:0371] type 00 class 0x040300
[    0.549744] pci 0000:00:06.1: reg 0x10: [mem 0xfcff0000-0xfcff3fff]
[    0.549782] pci 0000:00:06.1: PME# supported from D3hot D3cold
[    0.549883] pci 0000:00:08.0: [10de:0373] type 00 class 0x068000
[    0.549894] pci 0000:00:08.0: reg 0x10: [mem 0xfcff6000-0xfcff6fff]
[    0.549899] pci 0000:00:08.0: reg 0x14: [io  0xb400-0xb407]
[    0.549903] pci 0000:00:08.0: reg 0x18: [mem 0xfcffa800-0xfcffa8ff]
[    0.549908] pci 0000:00:08.0: reg 0x1c: [mem 0xfcffa400-0xfcffa40f]
[    0.549939] pci 0000:00:08.0: supports D1 D2
[    0.549941] pci 0000:00:08.0: PME# supported from D0 D1 D2 D3hot D3cold
[    0.550042] pci 0000:00:09.0: [10de:0373] type 00 class 0x068000
[    0.550053] pci 0000:00:09.0: reg 0x10: [mem 0xfcff5000-0xfcff5fff]
[    0.550057] pci 0000:00:09.0: reg 0x14: [io  0xb080-0xb087]
[    0.550062] pci 0000:00:09.0: reg 0x18: [mem 0xfcffa000-0xfcffa0ff]
[    0.550066] pci 0000:00:09.0: reg 0x1c: [mem 0xfcff4c00-0xfcff4c0f]
[    0.550098] pci 0000:00:09.0: supports D1 D2
[    0.550099] pci 0000:00:09.0: PME# supported from D0 D1 D2 D3hot D3cold
[    0.550202] pci 0000:00:0b.0: [10de:0374] type 01 class 0x060400
[    0.550237] pci 0000:00:0b.0: PME# supported from D0 D1 D2 D3hot D3cold
[    0.550338] pci 0000:00:0c.0: [10de:0374] type 01 class 0x060400
[    0.550367] pci 0000:00:0c.0: PME# supported from D0 D1 D2 D3hot D3cold
[    0.550471] pci 0000:00:0d.0: [10de:0378] type 01 class 0x060400
[    0.550502] pci 0000:00:0d.0: PME# supported from D0 D1 D2 D3hot D3cold
[    0.550605] pci 0000:00:0e.0: [10de:0375] type 01 class 0x060400
[    0.550635] pci 0000:00:0e.0: PME# supported from D0 D1 D2 D3hot D3cold
[    0.550746] pci 0000:00:0f.0: [10de:0377] type 01 class 0x060400
[    0.550779] pci 0000:00:0f.0: PME# supported from D0 D1 D2 D3hot D3cold
[    0.550886] pci 0000:00:18.0: [1022:1100] type 00 class 0x060000
[    0.550983] pci 0000:00:18.1: [1022:1101] type 00 class 0x060000
[    0.551080] pci 0000:00:18.2: [1022:1102] type 00 class 0x060000
[    0.551171] pci 0000:00:18.3: [1022:1103] type 00 class 0x060000
[    0.551283] pci_bus 0000:01: extended config space not accessible
[    0.551335] pci 0000:00:06.0: PCI bridge to [bus 01] (subtractive decode)
[    0.551340] pci 0000:00:06.0:   bridge window [io  0x0000-0x0cf7 window] (subtractive decode)
[    0.551342] pci 0000:00:06.0:   bridge window [io  0x0d00-0xffff window] (subtractive decode)
[    0.551344] pci 0000:00:06.0:   bridge window [mem 0x000a0000-0x000bffff window] (subtractive decode)
[    0.551346] pci 0000:00:06.0:   bridge window [mem 0x000d0000-0x000dffff window] (subtractive decode)
[    0.551349] pci 0000:00:06.0:   bridge window [mem 0xbe000000-0xfebfffff window] (subtractive decode)
[    0.551381] pci 0000:00:0b.0: PCI bridge to [bus 02]
[    0.551418] pci 0000:00:0c.0: PCI bridge to [bus 03]
[    0.551455] pci 0000:00:0d.0: PCI bridge to [bus 04]
[    0.551489] pci 0000:00:0e.0: PCI bridge to [bus 05]
[    0.551533] pci 0000:06:00.0: [10de:0a65] type 00 class 0x030000
[    0.551548] pci 0000:06:00.0: reg 0x10: [mem 0xfd000000-0xfdffffff]
[    0.551557] pci 0000:06:00.0: reg 0x14: [mem 0xc0000000-0xcfffffff 64bit pref]
[    0.551565] pci 0000:06:00.0: reg 0x1c: [mem 0xbe000000-0xbfffffff 64bit pref]
[    0.551572] pci 0000:06:00.0: reg 0x24: [io  0xec00-0xec7f]
[    0.551578] pci 0000:06:00.0: reg 0x30: [mem 0xfeb80000-0xfebfffff pref]
[    0.551584] pci 0000:06:00.0: enabling Extended Tags
[    0.551636] pci 0000:06:00.0: 32.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s x16 link at 0000:00:0f.0 (capable of 126.016 Gb/s with 8 GT/s x16 link)
[    0.551689] pci 0000:06:00.1: [10de:0be3] type 00 class 0x040300
[    0.551701] pci 0000:06:00.1: reg 0x10: [mem 0xfeb7c000-0xfeb7ffff]
[    0.551731] pci 0000:06:00.1: enabling Extended Tags
[    0.560685] pci 0000:00:0f.0: PCI bridge to [bus 06]
[    0.560688] pci 0000:00:0f.0:   bridge window [io  0xe000-0xefff]
[    0.560691] pci 0000:00:0f.0:   bridge window [mem 0xfd000000-0xfebfffff]
[    0.560694] pci 0000:00:0f.0:   bridge window [mem 0xbe000000-0xcfffffff 64bit pref]
[    0.561991] ACPI: PCI Interrupt Link [LNKA] (IRQs 16 17 18 19) *0, disabled.
[    0.562104] ACPI: PCI Interrupt Link [LNKB] (IRQs 16 17 18 19) *0, disabled.
[    0.562213] ACPI: PCI Interrupt Link [LNKC] (IRQs 16 17 18 19) *0, disabled.
[    0.562326] ACPI: PCI Interrupt Link [LNKD] (IRQs 16 17 18 19) *0, disabled.
[    0.562432] ACPI: PCI Interrupt Link [LNEA] (IRQs 16 17 18 19) *0, disabled.
[    0.562539] ACPI: PCI Interrupt Link [LNEB] (IRQs 16 17 18 19) *10
[    0.562645] ACPI: PCI Interrupt Link [LNEC] (IRQs 16 17 18 19) *10
[    0.562751] ACPI: PCI Interrupt Link [LNED] (IRQs 16 17 18 19) *0, disabled.
[    0.562858] ACPI: PCI Interrupt Link [LUB0] (IRQs 21 22 23) *10
[    0.562964] ACPI: PCI Interrupt Link [LMAD] (IRQs 20) *10
[    0.563070] ACPI: PCI Interrupt Link [LUB2] (IRQs 21 22 23) *11
[    0.563176] ACPI: PCI Interrupt Link [LMAC] (IRQs 20) *5
[    0.563281] ACPI: PCI Interrupt Link [LAZA] (IRQs 21 22 23) *11
[    0.563387] ACPI: PCI Interrupt Link [LSMB] (IRQs 21 22 23) *11
[    0.563492] ACPI: PCI Interrupt Link [LPMU] (IRQs 21 22 23) *5
[    0.563598] ACPI: PCI Interrupt Link [LSA0] (IRQs 21 22 23) *5
[    0.563703] ACPI: PCI Interrupt Link [LSA1] (IRQs 21 22 23) *10
[    0.563822] ACPI: PCI Interrupt Link [LATA] (IRQs 21 22 23) *0, disabled.
[    0.563929] ACPI: PCI Interrupt Link [LSA2] (IRQs 21 22 23) *10
[    0.564154] pci 0000:06:00.0: vgaarb: setting as boot VGA device
[    0.564154] pci 0000:06:00.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[    0.564154] pci 0000:06:00.0: vgaarb: bridge control possible
[    0.564154] vgaarb: loaded
[    0.564154] pps_core: LinuxPPS API ver. 1 registered
[    0.564154] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[    0.564154] PTP clock support registered
[    0.564154] EDAC MC: Ver: 3.0.0
[    0.564154] PCI: Using ACPI for IRQ routing
[    0.569710] PCI: pci_cache_line_size set to 64 bytes
[    0.569761] e820: reserve RAM buffer [mem 0x0009fc00-0x0009ffff]
[    0.569763] e820: reserve RAM buffer [mem 0xbdfd0000-0xbfffffff]
[    0.569765] e820: reserve RAM buffer [mem 0x102000000-0x103ffffff]
[    0.569954] HPET: 3 timers in total, 0 timers will be used for per-cpu timer
[    0.569954] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 31
[    0.569954] hpet0: 3 comparators, 32-bit 25.000000 MHz counter
[    0.570775] clocksource: Switched to clocksource hpet
[    0.590470] VFS: Disk quotas dquot_6.6.0
[    0.590499] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.590689] AppArmor: AppArmor Filesystem Enabled
[    0.590726] pnp: PnP ACPI init
[    0.590844] pnp 00:00: Plug and Play ACPI device, IDs PNP0b00 (active)
[    0.591131] pnp 00:01: [dma 0 disabled]
[    0.591201] pnp 00:01: Plug and Play ACPI device, IDs PNP0501 (active)
[    0.591757] pnp 00:02: [dma 3]
[    0.591889] pnp 00:02: Plug and Play ACPI device, IDs PNP0401 (active)
[    0.592604] system 00:03: [io  0x04d0-0x04d1] has been reserved
[    0.592606] system 00:03: [io  0x0800-0x080f] has been reserved
[    0.592609] system 00:03: [io  0x2000-0x207f] has been reserved
[    0.592611] system 00:03: [io  0x2080-0x20ff] has been reserved
[    0.592613] system 00:03: [io  0x2400-0x247f] has been reserved
[    0.592615] system 00:03: [io  0x2480-0x24ff] has been reserved
[    0.592618] system 00:03: [io  0x2800-0x287f] has been reserved
[    0.592620] system 00:03: [io  0x2880-0x28ff] has been reserved
[    0.592622] system 00:03: [io  0x2c00-0x2c7f] has been reserved
[    0.592624] system 00:03: [io  0x2c80-0x2cff] has been reserved
[    0.592628] system 00:03: [mem 0x000d0000-0x000d3fff window] has been reserved
[    0.592630] system 00:03: [mem 0x000d4000-0x000d7fff window] has been reserved
[    0.592633] system 00:03: [mem 0x000de000-0x000dffff window] has been reserved
[    0.592635] system 00:03: [mem 0xfcf80000-0xfcfbffff] has been reserved
[    0.592638] system 00:03: [mem 0xfee01000-0xfeefffff] has been reserved
[    0.592645] system 00:03: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.592822] system 00:04: [mem 0xfec00000-0xfec00fff] could not be reserved
[    0.592825] system 00:04: [mem 0xfee00000-0xfee00fff] has been reserved
[    0.592831] system 00:04: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.592903] pnp 00:05: Plug and Play ACPI device, IDs PNP0303 PNP030b (active)
[    0.592966] pnp 00:06: Plug and Play ACPI device, IDs PNP0f03 PNP0f13 (active)
[    0.593130] system 00:07: [io  0x0a00-0x0a0f] has been reserved
[    0.593132] system 00:07: [io  0x0a10-0x0a1f] has been reserved
[    0.593138] system 00:07: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.593254] system 00:08: [mem 0xe0000000-0xefffffff] has been reserved
[    0.593259] system 00:08: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.593411] system 00:09: [mem 0xd0000000-0xdfffffff] has been reserved
[    0.593417] system 00:09: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.593598] system 00:0a: [mem 0x00000000-0x0009ffff] could not be reserved
[    0.593601] system 00:0a: [mem 0x000c0000-0x000cffff] could not be reserved
[    0.593603] system 00:0a: [mem 0x000e0000-0x000fffff] could not be reserved
[    0.593606] system 00:0a: [mem 0x00100000-0xbdffffff] could not be reserved
[    0.593608] system 00:0a: [mem 0xfec00000-0xffffffff] could not be reserved
[    0.593614] system 00:0a: Plug and Play ACPI device, IDs PNP0c01 (active)
[    0.593968] pnp: PnP ACPI: found 11 devices
[    0.601537] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[    0.601585] pci 0000:00:06.0: PCI bridge to [bus 01]
[    0.601592] pci 0000:00:0b.0: PCI bridge to [bus 02]
[    0.601597] pci 0000:00:0c.0: PCI bridge to [bus 03]
[    0.601601] pci 0000:00:0d.0: PCI bridge to [bus 04]
[    0.601606] pci 0000:00:0e.0: PCI bridge to [bus 05]
[    0.601611] pci 0000:00:0f.0: PCI bridge to [bus 06]
[    0.601614] pci 0000:00:0f.0:   bridge window [io  0xe000-0xefff]
[    0.601617] pci 0000:00:0f.0:   bridge window [mem 0xfd000000-0xfebfffff]
[    0.601619] pci 0000:00:0f.0:   bridge window [mem 0xbe000000-0xcfffffff 64bit pref]
[    0.601626] pci_bus 0000:00: resource 4 [io  0x0000-0x0cf7 window]
[    0.601628] pci_bus 0000:00: resource 5 [io  0x0d00-0xffff window]
[    0.601631] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window]
[    0.601633] pci_bus 0000:00: resource 7 [mem 0x000d0000-0x000dffff window]
[    0.601635] pci_bus 0000:00: resource 8 [mem 0xbe000000-0xfebfffff window]
[    0.601638] pci_bus 0000:01: resource 4 [io  0x0000-0x0cf7 window]
[    0.601640] pci_bus 0000:01: resource 5 [io  0x0d00-0xffff window]
[    0.601642] pci_bus 0000:01: resource 6 [mem 0x000a0000-0x000bffff window]
[    0.601644] pci_bus 0000:01: resource 7 [mem 0x000d0000-0x000dffff window]
[    0.601646] pci_bus 0000:01: resource 8 [mem 0xbe000000-0xfebfffff window]
[    0.601649] pci_bus 0000:06: resource 0 [io  0xe000-0xefff]
[    0.601651] pci_bus 0000:06: resource 1 [mem 0xfd000000-0xfebfffff]
[    0.601653] pci_bus 0000:06: resource 2 [mem 0xbe000000-0xcfffffff 64bit pref]
[    0.601800] NET: Registered protocol family 2
[    0.602064] tcp_listen_portaddr_hash hash table entries: 2048 (order: 3, 32768 bytes)
[    0.602113] TCP established hash table entries: 32768 (order: 6, 262144 bytes)
[    0.602325] TCP bind hash table entries: 32768 (order: 7, 524288 bytes)
[    0.602707] TCP: Hash tables configured (established 32768 bind 32768)
[    0.602847] UDP hash table entries: 2048 (order: 4, 65536 bytes)
[    0.602906] UDP-Lite hash table entries: 2048 (order: 4, 65536 bytes)
[    0.603061] NET: Registered protocol family 1
[    0.603452] PCI Interrupt Link [LUB0] enabled at IRQ 23
[    0.680807] pci 0000:00:02.0: quirk_usb_early_handoff+0x0/0x6c3 took 75854 usecs
[    0.681061] PCI Interrupt Link [LUB2] enabled at IRQ 22
[    0.681237] pci 0000:00:00.0: Found enabled HT MSI Mapping
[    0.681275] pci 0000:00:00.0: Found enabled HT MSI Mapping
[    0.681312] pci 0000:00:00.0: Found enabled HT MSI Mapping
[    0.681351] pci 0000:00:00.0: Found enabled HT MSI Mapping
[    0.681391] pci 0000:00:00.0: Found enabled HT MSI Mapping
[    0.681439] pci 0000:00:00.0: Found enabled HT MSI Mapping
[    0.681487] pci 0000:00:00.0: Found enabled HT MSI Mapping
[    0.681536] pci 0000:00:00.0: Found enabled HT MSI Mapping
[    0.681588] pci 0000:00:00.0: Found enabled HT MSI Mapping
[    0.681643] pci 0000:00:00.0: Found enabled HT MSI Mapping
[    0.681700] pci 0000:00:00.0: Found enabled HT MSI Mapping
[    0.681761] pci 0000:00:00.0: Found enabled HT MSI Mapping
[    0.681799] pci 0000:06:00.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
[    0.681818] pci 0000:06:00.1: Linked as a consumer to 0000:06:00.0
[    0.681830] PCI: CLS 64 bytes, default 64
[    0.681938] Unpacking initramfs...
[    1.323118] Freeing initrd memory: 22332K
[    1.323907] PCI-DMA: Disabling AGP.
[    1.324010] PCI-DMA: aperture base @ b0000000 size 65536 KB
[    1.324010] PCI-DMA: using GART IOMMU.
[    1.324014] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
[    1.329561] Initialise system trusted keyrings
[    1.329702] workingset: timestamp_bits=40 max_order=20 bucket_order=0
[    1.332147] zbud: loaded
[    1.332506] pstore: using deflate compression
[    1.621716] Key type asymmetric registered
[    1.621720] Asymmetric key parser 'x509' registered
[    1.621746] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 247)
[    1.621811] io scheduler noop registered
[    1.621812] io scheduler deadline registered
[    1.621899] io scheduler cfq registered (default)
[    1.621900] io scheduler mq-deadline registered
[    1.622399] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[    1.622752] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    1.643201] 00:01: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[    1.643986] Linux agpgart interface v0.103
[    1.644038] AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
[    1.644039] AMD IOMMUv2 functionality not available on this system
[    1.644370] i8042: PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
[    1.646913] serio: i8042 KBD port at 0x60,0x64 irq 1
[    1.646969] serio: i8042 AUX port at 0x60,0x64 irq 12
[    1.647119] mousedev: PS/2 mouse device common for all mice
[    1.647183] rtc_cmos 00:00: RTC can wake from S4
[    1.647367] rtc_cmos 00:00: registered as rtc0
[    1.647390] rtc_cmos 00:00: alarms up to one year, y3k, 114 bytes nvram, hpet irqs
[    1.647434] ledtrig-cpu: registered to indicate activity on CPUs
[    1.648320] NET: Registered protocol family 10
[    1.654838] Segment Routing with IPv6
[    1.654884] mip6: Mobile IPv6
[    1.654889] NET: Registered protocol family 17
[    1.654896] mpls_gso: MPLS GSO support
[    1.655373] registered taskstats version 1
[    1.655374] Loading compiled-in X.509 certificates
[    1.722700] Loaded X.509 cert 'secure-boot-test-key-lfaraone: 97c1b25cddf9873ca78a58f3d73bf727d2cf78ff'
[    1.722749] zswap: loaded using pool lzo/zbud
[    1.722853] AppArmor: AppArmor sha1 policy hashing enabled
[    1.723472] rtc_cmos 00:00: setting system clock to 2018-12-14 10:16:44 UTC (1544782604)
[    1.723529] Unstable clock detected, switching default tracing clock to "global"
               If you want to keep using the local clock, then add:
                 "trace_clock=local"
               on the kernel command line
[    1.727406] Freeing unused kernel image memory: 1572K
[    1.744679] Write protecting the kernel read-only data: 16384k
[    1.746522] Freeing unused kernel image memory: 2028K
[    1.747243] Freeing unused kernel image memory: 904K
[    1.761767] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[    1.761783] Run /init as init process
[    1.791170] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[    1.791338] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[    1.791354] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[    1.840595] i2c i2c-0: nForce2 SMBus adapter at 0x2d00
[    1.840628] i2c i2c-1: nForce2 SMBus adapter at 0x2e00
[    1.863834] ACPI: bus type USB registered
[    1.863869] usbcore: registered new interface driver usbfs
[    1.863885] usbcore: registered new interface driver hub
[    1.863937] usbcore: registered new device driver usb
[    1.868151] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    1.869694] ehci-pci: EHCI PCI platform driver
[    1.869969] ehci-pci 0000:00:02.1: EHCI Host Controller
[    1.869979] ehci-pci 0000:00:02.1: new USB bus registered, assigned bus number 1
[    1.869990] ehci-pci 0000:00:02.1: debug port 1
[    1.870022] ehci-pci 0000:00:02.1: cache line size of 64 is not supported
[    1.870058] ehci-pci 0000:00:02.1: irq 22, io mem 0xfcffac00
[    1.871594] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[    1.884735] ehci-pci 0000:00:02.1: USB 2.0 started, EHCI 1.00
[    1.884842] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.19
[    1.884845] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    1.884847] usb usb1: Product: EHCI Host Controller
[    1.884849] usb usb1: Manufacturer: Linux 4.19.0-trunk-amd64 ehci_hcd
[    1.884851] usb usb1: SerialNumber: 0000:00:02.1
[    1.885079] hub 1-0:1.0: USB hub found
[    1.885091] hub 1-0:1.0: 10 ports detected
[    1.886812] forcedeth: Reverse Engineered nForce ethernet driver. Version 0.64.
[    1.887144] PCI Interrupt Link [LMAC] enabled at IRQ 20
[    1.900598] SCSI subsystem initialized
[    1.901333] ohci-pci: OHCI PCI platform driver
[    1.901729] ohci-pci 0000:00:02.0: OHCI PCI host controller
[    1.901757] ohci-pci 0000:00:02.0: new USB bus registered, assigned bus number 2
[    1.901849] ohci-pci 0000:00:02.0: irq 23, io mem 0xfcffb000
[    1.924828] libata version 3.00 loaded.
[    1.933366] sata_nv 0000:00:05.0: version 3.5
[    1.933686] PCI Interrupt Link [LSA0] enabled at IRQ 21
[    1.933723] sata_nv 0000:00:05.0: Using SWNCQ mode
[    1.934863] pata_amd 0000:00:04.0: version 0.4.1
[    1.941102] scsi host1: pata_amd
[    1.941409] scsi host0: sata_nv
[    1.941799] scsi host3: sata_nv
[    1.942119] ata1: SATA max UDMA/133 cmd 0xd480 ctl 0xd400 bmdma 0xcc00 irq 21
[    1.942123] ata2: SATA max UDMA/133 cmd 0xd080 ctl 0xd000 bmdma 0xcc08 irq 21
[    1.942897] PCI Interrupt Link [LSA1] enabled at IRQ 23
[    1.942909] sata_nv 0000:00:05.1: Using SWNCQ mode
[    1.944724] scsi host2: pata_amd
[    1.944848] ata3: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14
[    1.944850] ata4: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15
[    1.944883] scsi host4: sata_nv
[    1.946479] scsi host5: sata_nv
[    1.946675] ata5: SATA max UDMA/133 cmd 0xc880 ctl 0xc800 bmdma 0xc080 irq 23
[    1.946678] ata6: SATA max UDMA/133 cmd 0xc480 ctl 0xc400 bmdma 0xc088 irq 23
[    1.947457] PCI Interrupt Link [LSA2] enabled at IRQ 22
[    1.947469] sata_nv 0000:00:05.2: Using SWNCQ mode
[    1.948302] scsi host6: sata_nv
[    1.948635] scsi host7: sata_nv
[    1.948889] ata7: SATA max UDMA/133 cmd 0xc000 ctl 0xbc00 bmdma 0xb480 irq 22
[    1.948892] ata8: SATA max UDMA/133 cmd 0xb880 ctl 0xb800 bmdma 0xb488 irq 22
[    1.962873] usb usb2: New USB device found, idVendor=1d6b, idProduct=0001, bcdDevice= 4.19
[    1.962877] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    1.962879] usb usb2: Product: OHCI PCI host controller
[    1.962880] usb usb2: Manufacturer: Linux 4.19.0-trunk-amd64 ohci_hcd
[    1.962882] usb usb2: SerialNumber: 0000:00:02.0
[    1.963093] hub 2-0:1.0: USB hub found
[    1.963105] hub 2-0:1.0: 10 ports detected
[    2.258973] ata5: SATA link down (SStatus 0 SControl 300)
[    2.266956] ata7: SATA link down (SStatus 0 SControl 300)
[    2.412685] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    2.426237] forcedeth 0000:00:08.0: ifname eth0, PHY OUI 0x1c1 @ 0, addr 00:16:17:b9:aa:45
[    2.426241] forcedeth 0000:00:08.0: highdma csum vlan pwrctl mgmt gbit lnktim msi desc-v3
[    2.426694] PCI Interrupt Link [LMAD] enabled at IRQ 20
[    2.484681] usb 2-4: new low-speed USB device number 2 using ohci-pci
[    2.499436] ata1.00: ATA-9: ST2000DX001-1NS164, CC41, max UDMA/133
[    2.499439] ata1.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 31/32)
[    2.587438] ata1.00: configured for UDMA/133
[    2.587812] scsi 0:0:0:0: Direct-Access     ATA      ST2000DX001-1NS1 CC41 PQ: 0 ANSI: 5
[    2.592007] sd 0:0:0:0: [sda] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[    2.592012] sd 0:0:0:0: [sda] 4096-byte physical blocks
[    2.592023] sd 0:0:0:0: [sda] Write Protect is off
[    2.592026] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    2.592043] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    2.592945]  sda: sda1 sda2 sda3 sda4
[    2.593674] sd 0:0:0:0: [sda] Attached SCSI disk
[    2.720696] usb 2-4: New USB device found, idVendor=0b39, idProduct=cd02, bcdDevice= 1.11
[    2.720699] usb 2-4: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[    2.720700] usb 2-4: Product:  CKL   USB  KVM 2/4 V:1.21
[    2.720702] usb 2-4: Manufacturer:  CKL   USB  KVM 2/4 V:1.21
[    2.734036] hidraw: raw HID events driver (C) Jiri Kosina
[    2.753295] usbcore: registered new interface driver usbhid
[    2.753297] usbhid: USB HID core driver
[    2.755699] input:  CKL   USB  KVM 2/4 V:1.21  CKL   USB  KVM 2/4 V:1.21 as /devices/pci0000:00/0000:00:02.0/usb2/2-4/2-4:1.0/0003:0B39:CD02.0001/input/input3
[    2.812860] hid-generic 0003:0B39:CD02.0001: input,hidraw0: USB HID v1.10 Keyboard [ CKL   USB  KVM 2/4 V:1.21  CKL   USB  KVM 2/4 V:1.21] on usb-0000:00:02.0-4/input0
[    2.813614] input:  CKL   USB  KVM 2/4 V:1.21  CKL   USB  KVM 2/4 V:1.21 Mouse as /devices/pci0000:00/0000:00:02.0/usb2/2-4/2-4:1.1/0003:0B39:CD02.0002/input/input4
[    2.813723] input:  CKL   USB  KVM 2/4 V:1.21  CKL   USB  KVM 2/4 V:1.21 System Control as /devices/pci0000:00/0000:00:02.0/usb2/2-4/2-4:1.1/0003:0B39:CD02.0002/input/input5
[    2.872748] input:  CKL   USB  KVM 2/4 V:1.21  CKL   USB  KVM 2/4 V:1.21 Consumer Control as /devices/pci0000:00/0000:00:02.0/usb2/2-4/2-4:1.1/0003:0B39:CD02.0002/input/input6
[    2.872790] input:  CKL   USB  KVM 2/4 V:1.21  CKL   USB  KVM 2/4 V:1.21 as /devices/pci0000:00/0000:00:02.0/usb2/2-4/2-4:1.1/0003:0B39:CD02.0002/input/input7
[    2.872850] hid-generic 0003:0B39:CD02.0002: input,hidraw1: USB HID v1.10 Mouse [ CKL   USB  KVM 2/4 V:1.21  CKL   USB  KVM 2/4 V:1.21] on usb-0000:00:02.0-4/input1
[    2.899743] ata2: SATA link down (SStatus 0 SControl 300)
[    2.899859] ata4: port disabled--ignoring
[    2.966242] forcedeth 0000:00:09.0: ifname eth1, PHY OUI 0x1c1 @ 1, addr 00:16:17:b9:a0:ed
[    2.966246] forcedeth 0000:00:09.0: highdma csum vlan pwrctl mgmt gbit lnktim msi desc-v3
[    2.967595] forcedeth 0000:00:09.0 enp0s9: renamed from eth1
[    2.984902] forcedeth 0000:00:08.0 enp0s8: renamed from eth0
[    3.210980] ata6: SATA link down (SStatus 0 SControl 300)
[    3.680684] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    3.683145] ata8.00: ATA-8: TOSHIBA DT01ACA100, MS2OA810, max UDMA/133
[    3.683149] ata8.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 31/32)
[    3.687794] ata8.00: configured for UDMA/133
[    3.688155] scsi 7:0:0:0: Direct-Access     ATA      TOSHIBA DT01ACA1 A810 PQ: 0 ANSI: 5
[    3.688556] sd 7:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
[    3.688560] sd 7:0:0:0: [sdb] 4096-byte physical blocks
[    3.688572] sd 7:0:0:0: [sdb] Write Protect is off
[    3.688575] sd 7:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[    3.688592] sd 7:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    3.713313]  sdb: sdb1 sdb2 sdb3 sdb4
[    3.713932] sd 7:0:0:0: [sdb] Attached SCSI disk
[    3.841566] random: fast init done
[    4.086532] md/raid1:md0: active with 2 out of 2 mirrors
[    4.111339] md/raid10:md1: active with 2 out of 2 devices
[    4.143533] md0: detected capacity change from 0 to 524222464
[    4.148797] md/raid10:md127: not clean -- starting background reconstruction
[    4.148800] md/raid10:md127: active with 2 out of 2 devices
[    4.173642] md1: detected capacity change from 0 to 322118352896
[    4.190453] md127: detected capacity change from 0 to 321988329472
[    4.484700] raid6: sse2x1   gen()  2523 MB/s
[    4.552674] raid6: sse2x1   xor()  2284 MB/s
[    4.620665] raid6: sse2x2   gen()  3177 MB/s
[    4.688670] raid6: sse2x2   xor()  2575 MB/s
[    4.756667] raid6: sse2x4   gen()  3483 MB/s
[    4.824680] raid6: sse2x4   xor()  1832 MB/s
[    4.824682] raid6: using algorithm sse2x4 gen() 3483 MB/s
[    4.824683] raid6: .... xor() 1832 MB/s, rmw enabled
[    4.824684] raid6: using intx1 recovery algorithm
[    4.825886] xor: measuring software checksum speed
[    4.864666]    prefetch64-sse:  5590.000 MB/sec
[    4.904664]    generic_sse:  5582.000 MB/sec
[    4.904665] xor: using function: prefetch64-sse (5590.000 MB/sec)
[    4.905647] async_tx: api initialized (async)
[    5.054517] SGI XFS with ACLs, security attributes, realtime, no debug enabled
[    5.067476] XFS (md1): Mounting V5 Filesystem
[    5.349644] XFS (md1): Ending clean mount
[    5.570491] random: crng init done
[    5.570495] random: 7 urandom warning(s) missed due to ratelimiting
[    5.755502] systemd[1]: systemd 232 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
[    5.755727] systemd[1]: Detected architecture x86-64.
[    5.756955] systemd[1]: Set hostname to <debian>.
[    6.346361] systemd[1]: Started Forward Password Requests to Wall Directory Watch.
[    6.346516] systemd[1]: Listening on Journal Socket (/dev/log).
[    6.346572] systemd[1]: Listening on fsck to fsckd communication Socket.
[    6.346659] systemd[1]: Started Dispatch Password Requests to Console Directory Watch.
[    6.346685] systemd[1]: Reached target Encrypted Volumes.
[    6.346711] systemd[1]: Reached target Remote File Systems.
[    6.347075] systemd[1]: Set up automount Arbitrary Executable File Formats File System Automount Point.
[    6.448344] lp: driver loaded but no devices found
[    6.474221] ppdev: user-space parallel port driver
[    6.497093] parport_pc 00:02: reported by Plug and Play ACPI
[    6.497188] parport0: PC-style at 0x378 (0x778), irq 7 [PCSPP,TRISTATE,EPP]
[    6.592844] lp0: using parport0 (interrupt-driven).
[    6.772173] systemd-journald[248]: Received request to flush runtime journal from PID 1
[    7.222719] input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input8
[    7.222737] ACPI: Power Button [PWRB]
[    7.222821] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input9
[    7.228120] ACPI: Power Button [PWRF]
[    7.251785] nv_tco: NV TCO WatchDog Timer Driver v0.01
[    7.251893] nv_tco: Watchdog reboot not detected
[    7.251981] nv_tco: initialized (0x2440). heartbeat=30 sec (nowayout=0)
[    7.276162] sd 0:0:0:0: Attached scsi generic sg0 type 0
[    7.278789] sd 7:0:0:0: Attached scsi generic sg1 type 0
[    7.409140] k8temp 0000:00:18.3: Temperature readouts might be wrong - check erratum #141
[    7.409164] k8temp 0000:00:18.3: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
[    7.470595] input: PC Speaker as /devices/platform/pcspkr/input/input10
[    7.681856] PCI Interrupt Link [LAZA] enabled at IRQ 21
[    7.681872] snd_hda_intel 0000:00:06.1: Disabling MSI
[    7.682592] kvm: Nested Virtualization enabled
[    7.682876] PCI Interrupt Link [LNEC] enabled at IRQ 19
[    7.682925] snd_hda_intel 0000:06:00.1: Disabling MSI
[    7.682944] snd_hda_intel 0000:06:00.1: Handle vga_switcheroo audio client
[    7.707167] MCE: In-kernel MCE decoding enabled.
[    7.718384] EDAC amd64: Node 0: DRAM ECC disabled.
[    7.718387] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
                Either enable ECC checking or force module loading by setting 'ecc_enable_override'.
                (Note that use of the override may cause unknown side effects.)
[    7.746104] EDAC amd64: Node 0: DRAM ECC disabled.
[    7.746108] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
                Either enable ECC checking or force module loading by setting 'ecc_enable_override'.
                (Note that use of the override may cause unknown side effects.)
[    7.756576] powernow_k8: fid 0xc (2000 MHz), vid 0xc
[    7.756579] powernow_k8: fid 0xa (1800 MHz), vid 0xe
[    7.756581] powernow_k8: fid 0x2 (1000 MHz), vid 0x12
[    7.756622] powernow_k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 3600+ (2 cpu cores) (version 2.20.00)
[    7.878050] PCI Interrupt Link [LNEB] enabled at IRQ 18
[    7.878265] nouveau 0000:06:00.0: NVIDIA GT218 (0a8280b1)
[    8.002742] nouveau 0000:06:00.0: bios: version 70.18.8a.00.06
[    8.005652] nouveau 0000:06:00.0: fb: 1024 MiB DDR3
[    8.543526] cryptd: max_cpu_qlen set to 1000
[    8.952032] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
[    9.124722] snd_hda_codec_realtek hdaudioC0D0: autoconfig for ALC883: line_outs=3 (0x14/0x15/0x16/0x0/0x0) type:line
[    9.124727] snd_hda_codec_realtek hdaudioC0D0:    speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
[    9.124730] snd_hda_codec_realtek hdaudioC0D0:    hp_outs=1 (0x1b/0x0/0x0/0x0/0x0)
[    9.124732] snd_hda_codec_realtek hdaudioC0D0:    mono: mono_out=0x0
[    9.124734] snd_hda_codec_realtek hdaudioC0D0:    dig-out=0x1e/0x0
[    9.124735] snd_hda_codec_realtek hdaudioC0D0:    inputs:
[    9.124738] snd_hda_codec_realtek hdaudioC0D0:      Front Mic=0x19
[    9.124741] snd_hda_codec_realtek hdaudioC0D0:      Rear Mic=0x18
[    9.124743] snd_hda_codec_realtek hdaudioC0D0:      Line=0x1a
[    9.324939] [TTM] Zone  kernel: Available graphics memory: 1503796 kiB
[    9.324942] [TTM] Initializing pool allocator
[    9.324948] [TTM] Initializing DMA pool allocator
[    9.324976] nouveau 0000:06:00.0: DRM: VRAM: 1024 MiB
[    9.324978] nouveau 0000:06:00.0: DRM: GART: 1048576 MiB
[    9.324984] nouveau 0000:06:00.0: DRM: TMDS table version 2.0
[    9.324986] nouveau 0000:06:00.0: DRM: DCB version 4.0
[    9.324989] nouveau 0000:06:00.0: DRM: DCB outp 00: 02000300 00000000
[    9.324992] nouveau 0000:06:00.0: DRM: DCB outp 01: 01000302 00020030
[    9.324995] nouveau 0000:06:00.0: DRM: DCB outp 02: 02021362 00020010
[    9.324997] nouveau 0000:06:00.0: DRM: DCB outp 04: 01032310 00000000
[    9.325000] nouveau 0000:06:00.0: DRM: DCB conn 00: 00001030
[    9.325002] nouveau 0000:06:00.0: DRM: DCB conn 01: 00002161
[    9.325004] nouveau 0000:06:00.0: DRM: DCB conn 02: 00000200
[    9.331280] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    9.331282] [drm] Driver supports precise vblank timestamp query.
[    9.333629] nouveau 0000:06:00.0: DRM: MM: using COPY for buffer copies
[    9.375857] nouveau 0000:06:00.0: DRM: allocated 1920x1080 fb: 0x70000, bo 00000000eda5bc36
[    9.375993] fbcon: nouveaufb (fb0) is primary device
[    9.403594] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:0f.0/0000:06:00.1/sound/card1/input11
[    9.408562] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:0f.0/0000:06:00.1/sound/card1/input12
[    9.408709] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:0f.0/0000:06:00.1/sound/card1/input13
[    9.409152] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:0f.0/0000:06:00.1/sound/card1/input14
[    9.416202] Console: switching to colour frame buffer device 240x67
[    9.419261] nouveau 0000:06:00.0: fb0: nouveaufb frame buffer device
[    9.436754] [drm] Initialized nouveau 1.3.1 20120801 for 0000:06:00.0 on minor 0
[    9.912422] audit: type=1400 audit(1544782612.684:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/cups-browsed" pid=464 comm="apparmor_parser"
[   10.605886] audit: type=1400 audit(1544782613.380:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/cups/backend/cups-pdf" pid=466 comm="apparmor_parser"
[   10.606968] audit: type=1400 audit(1544782613.380:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/cupsd" pid=466 comm="apparmor_parser"
[   10.607547] audit: type=1400 audit(1544782613.380:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/cupsd//third_party" pid=466 comm="apparmor_parser"
[   12.041690] input: HDA NVidia Front Mic as /devices/pci0000:00/0000:00:06.1/sound/card0/input15
[   12.041805] input: HDA NVidia Rear Mic as /devices/pci0000:00/0000:00:06.1/sound/card0/input16
[   12.041891] input: HDA NVidia Line as /devices/pci0000:00/0000:00:06.1/sound/card0/input17
[   12.041973] input: HDA NVidia Line Out Front as /devices/pci0000:00/0000:00:06.1/sound/card0/input18
[   12.042055] input: HDA NVidia Line Out Surround as /devices/pci0000:00/0000:00:06.1/sound/card0/input19
[   12.042138] input: HDA NVidia Line Out CLFE as /devices/pci0000:00/0000:00:06.1/sound/card0/input20
[   12.042217] input: HDA NVidia Front Headphone as /devices/pci0000:00/0000:00:06.1/sound/card0/input21
[   17.677549] audit: type=1400 audit(1544782620.452:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/evince" pid=430 comm="apparmor_parser"
[   17.678331] audit: type=1400 audit(1544782620.452:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/evince//sanitized_helper" pid=430 comm="apparmor_parser"
[   17.680448] audit: type=1400 audit(1544782620.452:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/evince-previewer" pid=430 comm="apparmor_parser"
[   17.681110] audit: type=1400 audit(1544782620.456:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/evince-previewer//sanitized_helper" pid=430 comm="apparmor_parser"
[   17.682823] audit: type=1400 audit(1544782620.456:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/evince-thumbnailer" pid=430 comm="apparmor_parser"
[   17.683371] audit: type=1400 audit(1544782620.456:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/evince-thumbnailer//sanitized_helper" pid=430 comm="apparmor_parser"
[   19.332454] IPv6: ADDRCONF(NETDEV_UP): enp0s8: link is not ready
[   19.336101] forcedeth 0000:00:08.0 enp0s8: MSI enabled
[   19.337701] IPv6: ADDRCONF(NETDEV_UP): enp0s8: link is not ready
[   19.337721] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s8: link becomes ready
[   19.353268] IPv6: ADDRCONF(NETDEV_UP): enp0s9: link is not ready
[   19.354204] forcedeth 0000:00:09.0 enp0s9: MSI enabled
[   19.354467] forcedeth 0000:00:09.0 enp0s9: no link during initialization
[   19.354834] IPv6: ADDRCONF(NETDEV_UP): enp0s9: link is not ready
[   21.812258] fuse init (API version 7.27)
[  286.996782] XFS (md127): Mounting V5 Filesystem
[  287.213884] md: resync of RAID array md127
[  287.526466] XFS (md127): Ending clean mount
[  726.404734] INFO: task kworker/1:4:255 blocked for more than 120 seconds.
[  726.404747]       Not tainted 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1
[  726.404752] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  726.404757] kworker/1:4     D    0   255      2 0x80000000
[  726.404796] Workqueue: md submit_flushes [md_mod]
[  726.404799] Call Trace:
[  726.404816]  ? __schedule+0x2a2/0x870
[  726.404823]  ? __switch_to_asm+0x34/0x70
[  726.404828]  schedule+0x28/0x80
[  726.404836]  wait_barrier+0xf6/0x1b0 [raid10]
[  726.404846]  ? finish_wait+0x80/0x80
[  726.404852]  raid10_write_request+0x104/0x960 [raid10]
[  726.404858]  ? finish_wait+0x80/0x80
[  726.404863]  ? mempool_alloc+0x67/0x190
[  726.404867]  ? _cond_resched+0x15/0x30
[  726.404882]  ? md_write_start+0xd0/0x220 [md_mod]
[  726.404886]  ? __schedule+0x2aa/0x870
[  726.404893]  raid10_make_request+0xc1/0x120 [raid10]
[  726.404899]  ? finish_wait+0x80/0x80
[  726.404914]  md_handle_request+0x119/0x190 [md_mod]
[  726.404931]  md_make_request+0x78/0x160 [md_mod]
[  726.404938]  generic_make_request+0x1a4/0x410
[  726.404945]  ? bio_clone_fast+0x2c/0x60
[  726.404951]  raid10_write_request+0x64a/0x960 [raid10]
[  726.404957]  ? finish_wait+0x80/0x80
[  726.404960]  ? mempool_alloc+0x67/0x190
[  726.404964]  ? _cond_resched+0x15/0x30
[  726.404980]  ? md_write_start+0xd0/0x220 [md_mod]
[  726.404984]  ? __switch_to_asm+0x34/0x70
[  726.404989]  ? __switch_to_asm+0x40/0x70
[  726.404993]  ? __switch_to_asm+0x34/0x70
[  726.405000]  raid10_make_request+0xc1/0x120 [raid10]
[  726.405006]  ? finish_wait+0x80/0x80
[  726.405020]  md_handle_request+0x119/0x190 [md_mod]
[  726.405026]  ? __switch_to_asm+0x34/0x70
[  726.405030]  ? __switch_to_asm+0x40/0x70
[  726.405045]  submit_flushes+0x21/0x40 [md_mod]
[  726.405052]  process_one_work+0x1a7/0x3a0
[  726.405057]  worker_thread+0x30/0x390
[  726.405063]  ? pwq_unbound_release_workfn+0xd0/0xd0
[  726.405066]  kthread+0x112/0x130
[  726.405070]  ? kthread_bind+0x30/0x30
[  726.405075]  ret_from_fork+0x35/0x40
[  726.405140] INFO: task xfsaild/md127:1168 blocked for more than 120 seconds.
[  726.405145]       Not tainted 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1
[  726.405149] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  726.405153] xfsaild/md127   D    0  1168      2 0x80000000
[  726.405157] Call Trace:
[  726.405164]  ? __schedule+0x2a2/0x870
[  726.405169]  schedule+0x28/0x80
[  726.405176]  wait_barrier+0xf6/0x1b0 [raid10]
[  726.405183]  ? finish_wait+0x80/0x80
[  726.405189]  raid10_write_request+0x104/0x960 [raid10]
[  726.405194]  ? finish_wait+0x80/0x80
[  726.405199]  ? mempool_alloc+0x67/0x190
[  726.405202]  ? _cond_resched+0x15/0x30
[  726.405219]  ? md_write_start+0xd0/0x220 [md_mod]
[  726.405226]  raid10_make_request+0xc1/0x120 [raid10]
[  726.405232]  ? finish_wait+0x80/0x80
[  726.405247]  md_handle_request+0x119/0x190 [md_mod]
[  726.405263]  md_make_request+0x78/0x160 [md_mod]
[  726.405268]  generic_make_request+0x1a4/0x410
[  726.405274]  submit_bio+0x45/0x140
[  726.405279]  ? bio_add_page+0x48/0x60
[  726.405484]  _xfs_buf_ioapply+0x2e2/0x480 [xfs]
[  726.405617]  ? xfs_buf_delwri_submit_buffers+0x11b/0x2b0 [xfs]
[  726.405742]  __xfs_buf_submit+0x67/0x240 [xfs]
[  726.405873]  xfs_buf_delwri_submit_buffers+0x11b/0x2b0 [xfs]
[  726.406005]  ? xfsaild+0x2c1/0x7e0 [xfs]
[  726.406135]  ? xfs_inode_item_push+0xc6/0x180 [xfs]
[  726.406264]  xfsaild+0x2c1/0x7e0 [xfs]
[  726.406400]  ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
[  726.406406]  kthread+0x112/0x130
[  726.406410]  ? kthread_bind+0x30/0x30
[  726.406417]  ret_from_fork+0x35/0x40
[  726.406425] INFO: task md127_resync:1169 blocked for more than 120 seconds.
[  726.406431]       Not tainted 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1
[  726.406434] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  726.406439] md127_resync    D    0  1169      2 0x80000000
[  726.406443] Call Trace:
[  726.406449]  ? __schedule+0x2a2/0x870
[  726.406454]  schedule+0x28/0x80
[  726.406462]  raise_barrier+0xc3/0x190 [raid10]
[  726.406470]  ? finish_wait+0x80/0x80
[  726.406476]  raid10_sync_request+0x201/0x1dd0 [raid10]
[  726.406484]  ? next_arg+0x100/0x100
[  726.406488]  ? cpumask_next+0x16/0x20
[  726.406509]  ? is_mddev_idle+0xcc/0x12a [md_mod]
[  726.406524]  md_do_sync.cold.84+0x3e5/0x8ec [md_mod]
[  726.406532]  ? finish_wait+0x80/0x80
[  726.406538]  ? __switch_to_asm+0x40/0x70
[  726.406554]  ? md_rdev_init+0xb0/0xb0 [md_mod]
[  726.406568]  md_thread+0x94/0x150 [md_mod]
[  726.406574]  kthread+0x112/0x130
[  726.406578]  ? kthread_bind+0x30/0x30
[  726.406583]  ret_from_fork+0x35/0x40
[  726.406590] INFO: task kworker/1:6:1182 blocked for more than 120 seconds.
[  726.406594]       Not tainted 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1
[  726.406598] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  726.406602] kworker/1:6     D    0  1182      2 0x80000000
[  726.406622] Workqueue: md submit_flushes [md_mod]
[  726.406625] Call Trace:
[  726.406630]  ? __schedule+0x2a2/0x870
[  726.406635]  schedule+0x28/0x80
[  726.406641]  wait_barrier+0xf6/0x1b0 [raid10]
[  726.406647]  ? finish_wait+0x80/0x80
[  726.406654]  raid10_write_request+0x104/0x960 [raid10]
[  726.406659]  ? finish_wait+0x80/0x80
[  726.406664]  ? mempool_alloc+0x67/0x190
[  726.406668]  ? _cond_resched+0x15/0x30
[  726.406683]  ? md_write_start+0xd0/0x220 [md_mod]
[  726.406688]  ? __switch_to_asm+0x34/0x70
[  726.406692]  ? __switch_to_asm+0x40/0x70
[  726.406697]  ? __switch_to_asm+0x34/0x70
[  726.406703]  raid10_make_request+0xc1/0x120 [raid10]
[  726.406709]  ? finish_wait+0x80/0x80
[  726.406724]  md_handle_request+0x119/0x190 [md_mod]
[  726.406730]  ? __switch_to_asm+0x34/0x70
[  726.406735]  ? __switch_to_asm+0x40/0x70
[  726.406750]  submit_flushes+0x21/0x40 [md_mod]
[  726.406757]  process_one_work+0x1a7/0x3a0
[  726.406762]  worker_thread+0x30/0x390
[  726.406767]  ? pwq_unbound_release_workfn+0xd0/0xd0
[  726.406771]  kthread+0x112/0x130
[  726.406774]  ? kthread_bind+0x30/0x30
[  726.406779]  ret_from_fork+0x35/0x40
[  726.406789] INFO: task kworker/u4:0:1197 blocked for more than 120 seconds.
[  726.406793]       Not tainted 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1
[  726.406797] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  726.406801] kworker/u4:0    D    0  1197      2 0x80000000
[  726.406812] Workqueue: writeback wb_workfn (flush-9:127)
[  726.406815] Call Trace:
[  726.406821]  ? __schedule+0x2a2/0x870
[  726.406825]  ? mempool_alloc+0x67/0x190
[  726.406829]  schedule+0x28/0x80
[  726.406835]  wait_barrier+0xf6/0x1b0 [raid10]
[  726.406841]  ? finish_wait+0x80/0x80
[  726.406848]  raid10_write_request+0x104/0x960 [raid10]
[  726.406853]  ? finish_wait+0x80/0x80
[  726.406857]  ? mempool_alloc+0x67/0x190
[  726.406861]  ? _cond_resched+0x15/0x30
[  726.406876]  ? md_write_start+0xd0/0x220 [md_mod]
[  726.406883]  raid10_make_request+0xc1/0x120 [raid10]
[  726.406889]  ? finish_wait+0x80/0x80
[  726.406904]  md_handle_request+0x119/0x190 [md_mod]
[  726.406920]  md_make_request+0x78/0x160 [md_mod]
[  726.406927]  generic_make_request+0x1a4/0x410
[  726.406933]  submit_bio+0x45/0x140
[  726.407094]  ? xfs_setfilesize_trans_alloc.isra.14+0x3d/0x90 [xfs]
[  726.407220]  xfs_submit_ioend+0x9c/0x1e0 [xfs]
[  726.407347]  xfs_vm_writepages+0x78/0xa0 [xfs]
[  726.407356]  do_writepages+0x41/0xd0
[  726.407363]  ? enqueue_entity+0xf6/0x630
[  726.407367]  ? check_preempt_wakeup+0x113/0x230
[  726.407374]  __writeback_single_inode+0x3d/0x360
[  726.407380]  writeback_sb_inodes+0x1e3/0x450
[  726.407389]  __writeback_inodes_wb+0x5d/0xb0
[  726.407394]  wb_writeback+0x25f/0x2f0
[  726.407402]  ? get_nr_inodes+0x35/0x50
[  726.407407]  ? cpumask_next+0x16/0x20
[  726.407412]  wb_workfn+0x343/0x400
[  726.407419]  process_one_work+0x1a7/0x3a0
[  726.407425]  worker_thread+0x30/0x390
[  726.407430]  ? pwq_unbound_release_workfn+0xd0/0xd0
[  726.407434]  kthread+0x112/0x130
[  726.407438]  ? kthread_bind+0x30/0x30
[  726.407445]  ret_from_fork+0x35/0x40
[  726.407455] INFO: task rsync:1250 blocked for more than 120 seconds.
[  726.407461]       Not tainted 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1
[  726.407464] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  726.407468] rsync           D    0  1250   1249 0x00000000
[  726.407472] Call Trace:
[  726.407478]  ? __schedule+0x2a2/0x870
[  726.407483]  schedule+0x28/0x80
[  726.407488]  schedule_timeout+0x26d/0x390
[  726.407633]  ? __xfs_buf_submit+0x9f/0x240 [xfs]
[  726.407762]  ? xlog_bdstrat+0x30/0x60 [xfs]
[  726.407769]  __down+0x9b/0xf0
[  726.407897]  ? xfs_buf_find.isra.26+0x3d4/0x600 [xfs]
[  726.407904]  down+0x3b/0x50
[  726.408034]  xfs_buf_lock+0x33/0x100 [xfs]
[  726.408163]  xfs_buf_find.isra.26+0x3d4/0x600 [xfs]
[  726.408293]  xfs_buf_get_map+0x40/0x2a0 [xfs]
[  726.408426]  xfs_trans_get_buf_map+0xc1/0x160 [xfs]
[  726.408553]  xfs_da_get_buf+0xc0/0xf0 [xfs]
[  726.408663]  xfs_dir3_data_init+0x66/0x210 [xfs]
[  726.408898]  ? xfs_dir2_grow_inode+0xdb/0x130 [xfs]
[  726.409027]  xfs_dir2_sf_to_block+0x12e/0x6d0 [xfs]
[  726.409043]  ? make_kgid+0x13/0x20
[  726.409177]  ? xfs_setup_inode+0x83/0x110 [xfs]
[  726.409308]  ? xfs_ialloc+0x327/0x5a0 [xfs]
[  726.409435]  xfs_dir2_sf_addname+0xc5/0x6c0 [xfs]
[  726.409570]  ? kmem_alloc+0x61/0xe0 [xfs]
[  726.409693]  xfs_dir_createname+0x18c/0x1d0 [xfs]
[  726.409830]  xfs_create+0x455/0x5d0 [xfs]
[  726.409964]  xfs_generic_create+0x22c/0x2d0 [xfs]
[  726.409975]  ? d_splice_alias+0x134/0x3c0
[  726.409986]  path_openat+0x12f9/0x16e0
[  726.409999]  do_filp_open+0x93/0x100
[  726.410009]  ? __dentry_kill+0x121/0x170
[  726.410020]  ? __check_object_size+0xa3/0x181
[  726.410032]  do_sys_open+0x186/0x210
[  726.410043]  do_syscall_64+0x53/0x100
[  726.410056]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  726.410067] RIP: 0033:0x7fd8e42b44b0
[  726.410083] Code: Bad RIP value.
[  726.410091] RSP: 002b:00007fffe5049628 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
[  726.410105] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd8e42b44b0
[  726.410112] RDX: 0000000000000180 RSI: 00000000000000c2 RDI: 00007fffe504b890
[  726.410119] RBP: 000000000003a2f8 R08: 000000000000002b R09: 0000000000000000
[  726.410127] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffe504b8a4
[  726.410135] R13: 8421084210842109 R14: 00000000000000c2 R15: 00007fd8e4342320
[  847.236739] INFO: task kworker/1:4:255 blocked for more than 120 seconds.
[  847.236752]       Not tainted 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1
[  847.236756] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  847.236762] kworker/1:4     D    0   255      2 0x80000000
[  847.236799] Workqueue: md submit_flushes [md_mod]
[  847.236803] Call Trace:
[  847.236820]  ? __schedule+0x2a2/0x870
[  847.236826]  ? __switch_to_asm+0x34/0x70
[  847.236832]  schedule+0x28/0x80
[  847.236841]  wait_barrier+0xf6/0x1b0 [raid10]
[  847.236850]  ? finish_wait+0x80/0x80
[  847.236856]  raid10_write_request+0x104/0x960 [raid10]
[  847.236862]  ? finish_wait+0x80/0x80
[  847.236867]  ? mempool_alloc+0x67/0x190
[  847.236871]  ? _cond_resched+0x15/0x30
[  847.236887]  ? md_write_start+0xd0/0x220 [md_mod]
[  847.236890]  ? __schedule+0x2aa/0x870
[  847.236897]  raid10_make_request+0xc1/0x120 [raid10]
[  847.236904]  ? finish_wait+0x80/0x80
[  847.236918]  md_handle_request+0x119/0x190 [md_mod]
[  847.236936]  md_make_request+0x78/0x160 [md_mod]
[  847.236943]  generic_make_request+0x1a4/0x410
[  847.236949]  ? bio_clone_fast+0x2c/0x60
[  847.236956]  raid10_write_request+0x64a/0x960 [raid10]
[  847.236962]  ? finish_wait+0x80/0x80
[  847.236965]  ? mempool_alloc+0x67/0x190
[  847.236969]  ? _cond_resched+0x15/0x30
[  847.236985]  ? md_write_start+0xd0/0x220 [md_mod]
[  847.236989]  ? __switch_to_asm+0x34/0x70
[  847.236994]  ? __switch_to_asm+0x40/0x70
[  847.236998]  ? __switch_to_asm+0x34/0x70
[  847.237005]  raid10_make_request+0xc1/0x120 [raid10]
[  847.237011]  ? finish_wait+0x80/0x80
[  847.237026]  md_handle_request+0x119/0x190 [md_mod]
[  847.237031]  ? __switch_to_asm+0x34/0x70
[  847.237035]  ? __switch_to_asm+0x40/0x70
[  847.237051]  submit_flushes+0x21/0x40 [md_mod]
[  847.237057]  process_one_work+0x1a7/0x3a0
[  847.237062]  worker_thread+0x30/0x390
[  847.237068]  ? pwq_unbound_release_workfn+0xd0/0xd0
[  847.237072]  kthread+0x112/0x130
[  847.237075]  ? kthread_bind+0x30/0x30
[  847.237080]  ret_from_fork+0x35/0x40
[  847.237147] INFO: task xfsaild/md127:1168 blocked for more than 120 seconds.
[  847.237152]       Not tainted 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1
[  847.237156] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  847.237160] xfsaild/md127   D    0  1168      2 0x80000000
[  847.237164] Call Trace:
[  847.237171]  ? __schedule+0x2a2/0x870
[  847.237176]  schedule+0x28/0x80
[  847.237183]  wait_barrier+0xf6/0x1b0 [raid10]
[  847.237190]  ? finish_wait+0x80/0x80
[  847.237196]  raid10_write_request+0x104/0x960 [raid10]
[  847.237201]  ? finish_wait+0x80/0x80
[  847.237205]  ? mempool_alloc+0x67/0x190
[  847.237209]  ? _cond_resched+0x15/0x30
[  847.237226]  ? md_write_start+0xd0/0x220 [md_mod]
[  847.237233]  raid10_make_request+0xc1/0x120 [raid10]
[  847.237239]  ? finish_wait+0x80/0x80
[  847.237253]  md_handle_request+0x119/0x190 [md_mod]
[  847.237270]  md_make_request+0x78/0x160 [md_mod]
[  847.237275]  generic_make_request+0x1a4/0x410
[  847.237280]  submit_bio+0x45/0x140
[  847.237286]  ? bio_add_page+0x48/0x60
[  847.237489]  _xfs_buf_ioapply+0x2e2/0x480 [xfs]
[  847.237623]  ? xfs_buf_delwri_submit_buffers+0x11b/0x2b0 [xfs]
[  847.237748]  __xfs_buf_submit+0x67/0x240 [xfs]
[  847.237878]  xfs_buf_delwri_submit_buffers+0x11b/0x2b0 [xfs]
[  847.238011]  ? xfsaild+0x2c1/0x7e0 [xfs]
[  847.238140]  ? xfs_inode_item_push+0xc6/0x180 [xfs]
[  847.238268]  xfsaild+0x2c1/0x7e0 [xfs]
[  847.238404]  ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
[  847.238409]  kthread+0x112/0x130
[  847.238413]  ? kthread_bind+0x30/0x30
[  847.238421]  ret_from_fork+0x35/0x40
[  847.238429] INFO: task md127_resync:1169 blocked for more than 120 seconds.
[  847.238435]       Not tainted 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1
[  847.238438] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  847.238443] md127_resync    D    0  1169      2 0x80000000
[  847.238447] Call Trace:
[  847.238453]  ? __schedule+0x2a2/0x870
[  847.238458]  schedule+0x28/0x80
[  847.238466]  raise_barrier+0xc3/0x190 [raid10]
[  847.238474]  ? finish_wait+0x80/0x80
[  847.238480]  raid10_sync_request+0x201/0x1dd0 [raid10]
[  847.238488]  ? next_arg+0x100/0x100
[  847.238492]  ? cpumask_next+0x16/0x20
[  847.238513]  ? is_mddev_idle+0xcc/0x12a [md_mod]
[  847.238529]  md_do_sync.cold.84+0x3e5/0x8ec [md_mod]
[  847.238537]  ? finish_wait+0x80/0x80
[  847.238542]  ? __switch_to_asm+0x40/0x70
[  847.238559]  ? md_rdev_init+0xb0/0xb0 [md_mod]
[  847.238573]  md_thread+0x94/0x150 [md_mod]
[  847.238578]  kthread+0x112/0x130
[  847.238582]  ? kthread_bind+0x30/0x30
[  847.238587]  ret_from_fork+0x35/0x40
[  847.238594] INFO: task kworker/1:6:1182 blocked for more than 120 seconds.
[  847.238598]       Not tainted 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1
[  847.238602] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  847.238606] kworker/1:6     D    0  1182      2 0x80000000
[  847.238626] Workqueue: md submit_flushes [md_mod]
[  847.238629] Call Trace:
[  847.238634]  ? __schedule+0x2a2/0x870
[  847.238639]  schedule+0x28/0x80
[  847.238646]  wait_barrier+0xf6/0x1b0 [raid10]
[  847.238652]  ? finish_wait+0x80/0x80
[  847.238658]  raid10_write_request+0x104/0x960 [raid10]
[  847.238663]  ? finish_wait+0x80/0x80
[  847.238668]  ? mempool_alloc+0x67/0x190
[  847.238672]  ? _cond_resched+0x15/0x30
[  847.238687]  ? md_write_start+0xd0/0x220 [md_mod]
[  847.238692]  ? __switch_to_asm+0x34/0x70
[  847.238696]  ? __switch_to_asm+0x40/0x70
[  847.238701]  ? __switch_to_asm+0x34/0x70
[  847.238707]  raid10_make_request+0xc1/0x120 [raid10]
[  847.238713]  ? finish_wait+0x80/0x80
[  847.238728]  md_handle_request+0x119/0x190 [md_mod]
[  847.238734]  ? __switch_to_asm+0x34/0x70
[  847.238738]  ? __switch_to_asm+0x40/0x70
[  847.238754]  submit_flushes+0x21/0x40 [md_mod]
[  847.238760]  process_one_work+0x1a7/0x3a0
[  847.238765]  worker_thread+0x30/0x390
[  847.238771]  ? pwq_unbound_release_workfn+0xd0/0xd0
[  847.238774]  kthread+0x112/0x130
[  847.238778]  ? kthread_bind+0x30/0x30
[  847.238783]  ret_from_fork+0x35/0x40

end of thread (newest message: 2018-12-18 15:01 UTC)

Thread overview: 16+ messages
2018-12-12 12:29 XFS and RAID10 with o2 layout Sinisa
2018-12-12 14:30 ` Brian Foster
2018-12-13  8:21   ` Sinisa
2018-12-13 12:28     ` Brian Foster
2018-12-13 13:02       ` Sinisa
2018-12-13 17:30         ` keld
2018-12-14  6:59           ` Sinisa
     [not found]   ` <0a33a20d-5f49-7b34-3662-5b818c67621a@suse.com>
     [not found]     ` <48ba331d-a896-f532-2c75-cf94ddf87b60@4net.rs>
2018-12-17 15:04       ` Sinisa
2018-12-18 15:01     ` Sinisa
2018-12-13 22:05 ` Dave Chinner
2018-12-14  7:03   ` Sinisa
2018-12-14  8:26     ` Wols Lists
2018-12-14 20:44       ` John Stoffel
2018-12-15 15:36         ` Siniša Bandin
2018-12-14 21:20     ` Dave Chinner
2018-12-14 11:39 ` Sinisa
