From: Josef Bacik <jbacik@fb.com>
To: Marc MERLIN <marc@merlins.org>, <linux-btrfs@vger.kernel.org>
Subject: Re: 3.14.0-rc3: btrfs send/receive blocks btrfs IO on other devices (near deadlocks)
Date: Fri, 14 Mar 2014 10:42:47 -0400 [thread overview]
Message-ID: <53231567.9040008@fb.com> (raw)
In-Reply-To: <20140312151808.GE16865@merlins.org>
On 03/12/2014 11:18 AM, Marc MERLIN wrote:
> I have a file server with 4 cpu cores and 5 btrfs devices: Label:
> btrfs_boot uuid: e4c1daa8-9c39-4a59-b0a9-86297d397f3b Total
> devices 1 FS bytes used 48.92GiB devid 1 size 79.93GiB used
> 73.04GiB path /dev/mapper/cryptroot
>
> Label: varlocalspace uuid: 9f46dbe2-1344-44c3-b0fb-af2888c34f18
> Total devices 1 FS bytes used 1.10TiB devid 1 size 1.63TiB used
> 1.50TiB path /dev/mapper/cryptraid0
>
> Label: btrfs_pool1 uuid: 6358304a-2234-4243-b02d-4944c9af47d7
> Total devices 1 FS bytes used 7.16TiB devid 1 size 14.55TiB used
> 7.50TiB path /dev/mapper/dshelf1
>
> Label: btrfs_pool2 uuid: cb9df6d3-a528-4afc-9a45-4fed5ec358d6
> Total devices 1 FS bytes used 3.34TiB devid 1 size 7.28TiB used
> 3.42TiB path /dev/mapper/dshelf2
>
> Label: bigbackup uuid: 024ba4d0-dacb-438d-9f1b-eeb34083fe49 Total
> devices 5 FS bytes used 6.02TiB devid 1 size 1.82TiB used
> 1.43TiB path /dev/dm-9 devid 2 size 1.82TiB used 1.43TiB path
> /dev/dm-6 devid 3 size 1.82TiB used 1.43TiB path /dev/dm-5 devid
> 4 size 1.82TiB used 1.43TiB path /dev/dm-7 devid 5 size 1.82TiB
> used 1.43TiB path /dev/dm-8
>
>
> I have a very long running btrfs send/receive from btrfs_pool1 to
> bigbackup (long running meaning that it's been slowly copying over
> 5 days)
>
> The problem is that this is blocking IO to btrfs_pool2 which is
> using totally different drives. By blocking IO I mean that IO to
> pool2 kind of works sometimes, and hangs for very long times at
> other times.
>
> It looks as if one rsync to btrfs_pool2 or one piece of IO hangs on
> a shared lock and once that happens, all IO to btrfs_pool2 stops
> for a long time. It does recover eventually without reboot, but the
> wait times are ridiculous (it could be 1H or more).
>
> As I write this, I have a killall -9 rsync that waited for over
> 10mn before these processes would finally die: 23555 07:36
> wait_current_trans.isra.15 rsync -av -SH --delete (...) 23556
> 07:36 exit [rsync] <defunct> 25387
> 2-04:41:22 wait_current_trans.isra.15 rsync --password-file
> (...) 27481 31:26 wait_current_trans.isra.15 rsync
> --password-file (...) 29268 04:41:34 wait_current_trans.isra.15
> rsync --password-file (...) 29343 04:41:31 exit
> [rsync] <defunct> 29492 04:41:27 wait_current_trans.isra.15
> rsync --password-file (...)
>
> 14559 07:14:49 wait_current_trans.isra.15 cp -i -al current
> 20140312-feisty
>
> This is all stuck in btrfs kernel code. If someeone wants sysrq-w,
> there it is.
> https://urldefense.proofpoint.com/v1/url?u=http://marc.merlins.org/tmp/btrfs_full.txt&k=ZVNjlDMF0FElm4dQtryO4A%3D%3D%0A&r=cKCbChRKsMpTX8ybrSkonQ%3D%3D%0A&m=NfFB494sWgA3qCQbFaAQO2FapIJ6kuZcyS6PlP%2FXkCg%3D%0A&s=573f0b2deecc8980550a7645c9627b918659e0ab067590577c8ead4a59498bc1
>
> A quick summary: SysRq : Show Blocked State task
> PC stack pid father btrfs-cleaner D ffff8802126b0840 0
> 3332 2 0x00000000 ffff8800c5dc9d00 0000000000000046
> ffff8800c5dc9fd8 ffff8800c69f6310 00000000000141c0 ffff8800c69f6310
> ffff88017574c170 ffff880211e671e8 0000000000000000 ffff880211e67000
> ffff8801e5936e20 ffff8800c5dc9d10 Call Trace: [<ffffffff8160b0d9>]
> schedule+0x73/0x75 [<ffffffff8122a3c7>]
> wait_current_trans.isra.15+0x98/0xf4 [<ffffffff81085062>] ?
> finish_wait+0x65/0x65 [<ffffffff8122b86c>]
> start_transaction+0x48e/0x4f2 [<ffffffff8122bc4f>] ?
> __btrfs_end_transaction+0x2a1/0x2c6 [<ffffffff8122b8eb>]
> btrfs_start_transaction+0x1b/0x1d [<ffffffff8121c5cd>]
> btrfs_drop_snapshot+0x443/0x610 [<ffffffff8160d7b3>] ?
> _raw_spin_unlock+0x17/0x2a [<ffffffff81074efb>] ?
> finish_task_switch+0x51/0xdb [<ffffffff8160afbf>] ?
> __schedule+0x537/0x5de [<ffffffff8122c08d>]
> btrfs_clean_one_deleted_snapshot+0x103/0x10f [<ffffffff81224859>]
> cleaner_kthread+0x103/0x136 [<ffffffff81224756>] ?
> btrfs_alloc_root+0x26/0x26 [<ffffffff8106bc1b>] kthread+0xae/0xb6
> [<ffffffff8106bb6d>] ? __kthread_parkme+0x61/0x61
> [<ffffffff816141bc>] ret_from_fork+0x7c/0xb0 [<ffffffff8106bb6d>] ?
> __kthread_parkme+0x61/0x61 btrfs-transacti D ffff88021387eb00 0
> 3333 2 0x00000000 ffff8800c5dcb890 0000000000000046
> ffff8800c5dcbfd8 ffff88021387e5d0 00000000000141c0 ffff88021387e5d0
> ffff88021f2141c0 ffff88021387e5d0 ffff8800c5dcb930 ffffffff810fe574
> 0000000000000002 ffff8800c5dcb8a0 Call Trace: [<ffffffff810fe574>]
> ? wait_on_page_read+0x3c/0x3c [<ffffffff8160b0d9>]
> schedule+0x73/0x75 [<ffffffff8160b27e>] io_schedule+0x60/0x7a
> [<ffffffff810fe582>] sleep_on_page+0xe/0x12 [<ffffffff8160b510>]
> __wait_on_bit+0x48/0x7a [<ffffffff810fe522>]
> wait_on_page_bit+0x7a/0x7c [<ffffffff81085096>] ?
> autoremove_wake_function+0x34/0x34 [<ffffffff81245c70>]
> read_extent_buffer_pages+0x1bf/0x204 [<ffffffff81223710>] ?
> free_root_pointers+0x5b/0x5b [<ffffffff81224412>]
> btree_read_extent_buffer_pages.constprop.45+0x66/0x100
> [<ffffffff81225367>] read_tree_block+0x2f/0x47 [<ffffffff8120e4b6>]
> read_block_for_search.isra.26+0x24a/0x287 [<ffffffff8120fcf7>]
> btrfs_search_slot+0x4f4/0x6bb [<ffffffff81214c3d>]
> lookup_inline_extent_backref+0xda/0x3fb [<ffffffff812167e1>]
> __btrfs_free_extent+0xf4/0x712 [<ffffffff8121ba57>]
> __btrfs_run_delayed_refs+0x939/0xbdf [<ffffffff8121d896>]
> btrfs_run_delayed_refs+0x81/0x18f [<ffffffff8122af3e>]
> btrfs_commit_transaction+0x3a9/0x849 [<ffffffff81085062>] ?
> finish_wait+0x65/0x65 [<ffffffff81227598>]
> transaction_kthread+0xf8/0x1ab [<ffffffff812274a0>] ?
> btrfs_cleanup_transaction+0x43f/0x43f [<ffffffff8106bc1b>]
> kthread+0xae/0xb6 [<ffffffff8106bb6d>] ?
> __kthread_parkme+0x61/0x61 [<ffffffff816141bc>]
> ret_from_fork+0x7c/0xb0 [<ffffffff8106bb6d>] ?
> __kthread_parkme+0x61/0x61
>
>
> Worse, taking that dump gave me: gargamel:/etc/udev/rules.d# echo w
> > /proc/sysrq-trigger Message from syslogd@gargamel at Mar 12
> 07:13:16 ... kernel:[1234536.531251] BUG: soft lockup - CPU#1 stuck
> for 22s! [mysqld:12552] Message from syslogd@gargamel at Mar 12
> 07:13:16 ... kernel:[1234536.559276] BUG: soft lockup - CPU#2 stuck
> for 22s! [mysqld:4955] Message from syslogd@gargamel at Mar 12
> 07:13:16 ... kernel:[1234540.538636] BUG: soft lockup - CPU#0 stuck
> for 22s! [kswapd0:48] (there are more in the logs attached on what
> those lockups are)
>
> Thankfully my btrfs send/receive is still working and should
> finish today, but the amount of time it's been taking has been
> painful, and the effect it's been having on the rest of my system,
> making it hang or rendering its devices unusable for long periods
> of time, has been punishing.
>
> Can someone figure out from the kernel logs what is causing those
> near deadlocks?
>
> Actually this was so bad apparently that sysrq w didn't even all
> make it to syslog/disk (also on btrfs) but thankfully I got it on
> serial console.
>
>
> I also found this suring sysrq. Shouldit be reported to someone
> else? INFO: rcu_preempt detected stalls on CPUs/tasks: 3: (1 GPs
> behind) idle=395/140000000000000/0 softirq=284540927/284540928
> last_accelerate: ed62/2821, nonlazy_posted: 1, .. (detected by 0,
> t=15002 jiffies, g=100566635, c=100566634, q=87438) sending NMI to
> all CPUs: NMI backtrace for cpu 3 CPU: 3 PID: 21730 Comm: bash Not
> tainted 3.14.0-rc3-amd64-i915-preempt-20140216 #2 Hardware name:
> System manufacturer System Product Name/P8H67-M PRO, BIOS 3806
> 08/20/2012 task: ffff88001cf3a710 ti: ffff880037f78000 task.ti:
> ffff880037f78000 RIP: 0010:[<ffffffff81309a80>]
> [<ffffffff81309a80>] paravirt_read_tsc+0x0/0xd RSP:
> 0018:ffff880037f79ac0 EFLAGS: 00000046 RAX: 0000000000000003 RBX:
> 0000000094b043ff RCX: 0000000000000000 RDX: 0000000000000004 RSI:
> 00000000000003fd RDI: 0000000000000001 RBP: ffff880037f79ae8 R08:
> ffffffff81cf24d0 R09: 00000000fffffffe R10: 0000000000001a18 R11:
> 0000000000000000 R12: 00000000000009fb R13: 0000000000000003 R14:
> 0000000094b047d7 R15: 0000000000000036 FS: 0000000000000000(0000)
> GS:ffff88021f380000(0063) knlGS:00000000f754b6c0 CS: 0010 DS: 002b
> ES: 002b CR0: 0000000080050033 CR2: 00000000ffc13412 CR3:
> 0000000145622000 CR4: 00000000000407e0 Stack: ffffffff81309b59
> ffffffff81f27560 00000000000026f0 0000000000000020 ffffffff813c9e1b
> ffff880037f79af8 ffffffff81309ac9 ffff880037f79b08 ffffffff81309aef
> ffff880037f79b30 ffffffff813c9cc6 ffffffff81f27560 Call Trace:
> [<ffffffff81309b59>] ? delay_tsc+0x3d/0xa4 [<ffffffff813c9e1b>] ?
> serial8250_console_write+0x10d/0x10d [<ffffffff81309ac9>]
> __delay+0xf/0x11 [<ffffffff81309aef>] __const_udelay+0x24/0x26
> [<ffffffff813c9cc6>] wait_for_xmitr+0x49/0x91 [<ffffffff813c9e37>]
> serial8250_console_putchar+0x1c/0x2e [<ffffffff813c5d97>]
> uart_console_write+0x3f/0x54 [<ffffffff813c9dc8>]
> serial8250_console_write+0xba/0x10d [<ffffffff8109363b>]
> call_console_drivers.constprop.6+0xbc/0xf0 [<ffffffff81093bf7>]
> console_unlock+0x269/0x302 [<ffffffff8109405e>]
> vprintk_emit+0x3ce/0x3ff [<ffffffff81604702>] printk+0x54/0x56
> [<ffffffff81089799>] ? arch_local_irq_save+0x15/0x1b
> [<ffffffff8108752e>] print_cfs_rq+0x4fc/0xd71 [<ffffffff81080fff>]
> print_cfs_stats+0x5a/0x9e [<ffffffff81086c65>]
> print_cpu+0x538/0x8e3 [<ffffffff81087f7e>]
> sysrq_sched_debug_show+0x1f/0x42 [<ffffffff81078874>]
> show_state_filter+0x92/0x9f [<ffffffff813b7c7a>]
> sysrq_handle_showstate_blocked+0x13/0x15 [<ffffffff813b82c3>]
> __handle_sysrq+0xa0/0x138 [<ffffffff813b8630>]
> write_sysrq_trigger+0x28/0x37 [<ffffffff811a565a>]
> proc_reg_write+0x5a/0x7c [<ffffffff81155417>] vfs_write+0xab/0x107
> [<ffffffff81155b19>] SyS_write+0x46/0x79 [<ffffffff81615f6c>]
> sysenter_dispatch+0x7/0x21 Code: 89 e5 e8 a2 fe ff ff 89 c2 66 31
> c0 c1 e2 10 01 d0 15 ff ff 00 00 f7 d0 c1 e8 10 5d c3 90 90 90 90
> 90 90 90 90 90 90 90 90 90 90 <55> 48 89 e5 e8 9c c4 d0 ff 66 90 5d
> c3 66 66 66 66 90 55 48 89
>
>
I'm working on a deadlock with send/receive and then I'll turn my
attention to this. Thanks,
Josef
prev parent reply other threads:[~2014-03-14 14:43 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-03-12 15:18 3.14.0-rc3: btrfs send/receive blocks btrfs IO on other devices (near deadlocks) Marc MERLIN
2014-03-10 10:39 ` [PATCH] Btrfs-progs: fsck: disable --init-extent-tree option when using snapshots Wang Shilong
2014-03-10 12:12 ` Shilong Wang
2014-03-10 15:50 ` Josef Bacik
2014-03-11 1:23 ` Wang Shilong
2014-03-08 21:53 ` send/receive locking Hugo Mills
2014-03-08 21:55 ` Josef Bacik
2014-03-08 22:00 ` Hugo Mills
2014-03-08 22:02 ` Josef Bacik
2014-03-08 22:16 ` Hugo Mills
2014-03-09 16:43 ` Hugo Mills
2014-03-10 22:28 ` Hugo Mills
2014-03-14 2:19 ` Marc MERLIN
2014-03-14 13:36 ` [PATCH] Btrfs-progs: fsck: disable --init-extent-tree option when using snapshots Wang Shilong
2014-03-14 14:36 ` Josef Bacik
2014-03-17 12:21 ` Shilong Wang
2014-03-14 1:48 ` 3.14.0-rc3: btrfs send/receive blocks btrfs IO on other devices (near deadlocks) Marc MERLIN
2014-03-14 4:54 ` Duncan
2014-03-14 14:42 ` Josef Bacik [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53231567.9040008@fb.com \
--to=jbacik@fb.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=marc@merlins.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.