* New: seeing 100% CPU / unkillable tasks
@ 2015-02-22 17:58 Holger Hoffstätte
From: Holger Hoffstätte @ 2015-02-22 17:58 UTC
To: linux-btrfs
kernel: 3.18.7 + all patches since 3.19 + the daily Filipe ;)
For the last few days I've been getting an awful lot of stuck tasks
after mundane operations like a simple rsync, a fallocate or just
doing a manual "sync".
Symptom is always 100% CPU use and the task (user-space fallocate, sync
or the [btrfs-transaction] kthread on eventual tx commit) hanging.
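One way to confirm that a stuck task is really burning CPU inside the kernel, rather than sleeping, is to sample /proc/<pid>/stat twice and compare the state and the system-time counter. The following is only a minimal sketch: the helper names and the one-second interval are assumptions for illustration, not something used in this report.

```python
import sys
import time


def parse_stat(line):
    """Return (pid, comm, state, utime_ticks, stime_ticks) from a /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    rest = line[rpar + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return pid, comm, rest[0], int(rest[11]), int(rest[12])


def sample(pid):
    """Read and parse the stat line of one process."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_stat(f.read())


if __name__ == "__main__" and len(sys.argv) == 2:
    before = sample(int(sys.argv[1]))
    time.sleep(1.0)  # assumed sampling interval
    after = sample(int(sys.argv[1]))
    print(f"{after[1]}: state={after[2]} "
          f"utime +{after[3] - before[3]} ticks, stime +{after[4] - before[4]} ticks")
```

A task spinning in kernel code would typically show state R with stime climbing while utime stays flat; a task blocked on I/O would instead show state D with neither counter moving.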
This happens even without stress (idle single-disk fs/system, no mem pressure)
and very irregularly. Today I got particularly unlucky and could trigger it
repeatedly, simply by doing a bunch of small fallocates on a fresh subvolume:
the first few would work and then - boom.
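The "bunch of small fallocates" pattern could be approximated with a script along these lines. This is a sketch under stated assumptions: the file count, the 64 KiB size and the file names are made up for illustration, since the report does not give the exact values, and there is no guarantee this triggers the hang.

```python
import os
import sys
import time


def small_fallocates(directory, count=64, size=64 * 1024):
    """Create `count` files in `directory`, preallocate `size` bytes for each,
    and return how long each preallocation call took, in seconds."""
    durations = []
    for i in range(count):
        path = os.path.join(directory, f"falloc-{i:04d}")  # hypothetical naming
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
        try:
            start = time.monotonic()
            os.posix_fallocate(fd, 0, size)
            durations.append(time.monotonic() - start)
        finally:
            os.close(fd)
    return durations


if __name__ == "__main__" and len(sys.argv) == 2:
    # Argument: a directory on the filesystem under test, e.g. a fresh subvolume.
    for n, seconds in enumerate(small_fallocates(sys.argv[1])):
        print(f"fallocate {n}: {seconds * 1000:.2f} ms")
```

If the problem reproduces, the script would simply stop printing at the call that never returns, which gives a rough count of how many calls succeed first.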
A full collection of several SysRq traces is at:
https://gist.github.com/hhoffstaette/c54ca2813cd47439c4c1
I've inserted spaces between different runs and SysRq segments to make
it a bit easier to read.
Common theme is almost always:
Feb 22 12:44:03 tux kernel: [<ffffffff812baa36>] ? __percpu_counter_add+0x56/0x80
Feb 22 12:44:03 tux kernel: [<ffffffffa07b566c>] ? find_first_extent_bit_state+0x2c/0x80 [btrfs]
Feb 22 12:44:03 tux kernel: [<ffffffff8108bb1b>] ? lock_timer_base.isra.36+0x2b/0x50
Feb 22 12:44:03 tux kernel: [<ffffffff81075023>] ? prepare_to_wait_event+0x83/0x100
Feb 22 12:44:03 tux kernel: [<ffffffffa07980ff>] wait_current_trans.isra.17+0x9f/0x100 [btrfs]
Feb 22 12:44:03 tux kernel: [<ffffffff81075130>] ? __wake_up_sync+0x20/0x20
Feb 22 12:44:03 tux kernel: [<ffffffffa0799ad8>] start_transaction+0x318/0x5a0 [btrfs]
Feb 22 12:44:03 tux kernel: [<ffffffffa0799e17>] btrfs_attach_transaction+0x17/0x20 [btrfs]
Feb 22 12:44:03 tux kernel: [<ffffffffa079486b>] transaction_kthread+0x8b/0x260 [btrfs]
Feb 22 12:44:03 tux kernel: [<ffffffffa07947e0>] ? btrfs_cleanup_transaction+0x520/0x520 [btrfs]
Feb 22 12:44:03 tux kernel: [<ffffffff810685eb>] kthread+0xdb/0x100
Feb 22 12:44:03 tux kernel: [<ffffffff81068510>] ? kthread_create_on_node+0x180/0x180
Feb 22 12:44:03 tux kernel: [<ffffffff8153f1ec>] ret_from_fork+0x7c/0xb0
Feb 22 12:44:03 tux kernel: [<ffffffff81068510>] ? kthread_create_on_node+0x180/0x180
or this:
Feb 22 14:08:45 tux kernel: [<ffffffffa056a809>] btrfs_set_path_blocking+0x49/0x90 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffffa056a8a5>] btrfs_clear_path_blocking+0x55/0xe0 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffffa056f657>] btrfs_search_slot+0x1f7/0xa60 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffffa0585955>] btrfs_update_root+0x55/0x270 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffffa060b4fd>] commit_cowonly_roots+0x1e5/0x285 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffffa0594135>] btrfs_commit_transaction+0x525/0xbb0 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffffa05d671d>] ? btrfs_log_dentry_safe+0x6d/0x80 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffffa05a9f5c>] btrfs_sync_file+0x1fc/0x330 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffff81191531>] do_fsync+0x51/0x80
Feb 22 14:08:45 tux kernel: [<ffffffff811602e7>] ? SyS_fallocate+0x47/0x80
Feb 22 14:08:45 tux kernel: [<ffffffff811917d0>] SyS_fsync+0x10/0x20
Clearly something is spinning in a busy loop and not terminating as it
should.
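To look for a common theme across many SysRq dumps, the frames can be tallied mechanically. The sketch below assumes the line format shown in the traces above; the regular expression and function names are illustrative only. Frames prefixed with "?" are addresses found on the stack that the unwinder could not confirm as part of the call chain, so they are skipped.

```python
import re
from collections import Counter

# Matches e.g. "[<ffffffffa0799ad8>] start_transaction+0x318/0x5a0 [btrfs]"
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def reliable_frames(lines):
    """Yield (symbol, module) for every frame not marked '?'; module is None
    for symbols in the core kernel."""
    for line in lines:
        m = FRAME_RE.search(line)
        if m and not m.group("unreliable"):
            yield m.group("symbol"), m.group("module")


def tally(lines, module="btrfs"):
    """Count how often each reliable symbol from `module` appears."""
    return Counter(sym for sym, mod in reliable_frames(lines) if mod == module)
```

Calling `tally()` on an open log file and printing `most_common(10)` would list the btrfs functions that show up most often across all captured traces.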
I realize this is vague, but I wanted to check whether
- anyone has been seeing this or something similar recently
- anyone has a suspect
I've already backtracked a bit and can rule out Filipe's recent inode handling/fsync
stuff. The problem must have snuck in recently (last 2-3 weeks).
Grateful for any suggestions!
-h