* New: seeing 100% CPU / unkillable tasks
From: Holger Hoffstätte @ 2015-02-22 17:58 UTC (permalink / raw)
  To: linux-btrfs


kernel: 3.18.7 + all patches since 3.19 + the daily Filipe ;)

For the last few days I've been getting an awful lot of stuck tasks
after mundane operations like simple rsync'ing, a fallocate or just
doing a manual "sync".

The symptom is always 100% CPU use, with the task (user-space fallocate, sync
or the [btrfs-transaction] kthread on the eventual transaction commit) hanging.

This happens even without stress (idle single-disk fs/system, no mem pressure)
and very irregularly. Today I got particularly unlucky and could trigger it
repeatedly, simply by doing a bunch of small fallocates on a fresh subvolume:
the first few would work and then - boom.
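
For illustration only, here is a minimal sketch of what "a bunch of small
fallocates" could look like as a script. It is not the exact sequence used in
this report: the file names, count and size are made up, and the fsync after
each fallocate is an assumption. The demo runs in a temporary directory; to
mirror the report, the target would be a directory on a fresh subvolume.

```python
import os
import tempfile


def small_fallocates(directory, count=32, size=64 * 1024):
    """Create `count` files in `directory` and preallocate `size` bytes in each.

    Hypothetical reproducer sketch: names, count and size are illustrative,
    and the fsync after each fallocate is an assumption, not part of the report.
    """
    created = []
    for i in range(count):
        path = os.path.join(directory, f"falloc-{i:04d}")
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
        try:
            # Preallocate the byte range [0, size) for this file.
            os.posix_fallocate(fd, 0, size)
            # Flush the file to disk before moving on to the next one.
            os.fsync(fd)
        finally:
            os.close(fd)
        created.append(path)
    return created


if __name__ == "__main__":
    # Safe default: a throwaway temporary directory, not a real subvolume.
    with tempfile.TemporaryDirectory() as target:
        paths = small_fallocates(target, count=8, size=4096)
        print(len(paths), "files preallocated in", target)
```

On an affected kernel a script like this would be expected to hang in the same
way as the manual commands, so it is best run with a SysRq-capable console at
hand.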

A full collection of several SysRq traces is at:
https://gist.github.com/hhoffstaette/c54ca2813cd47439c4c1

I've inserted spaces between different runs and SysRq segments to make
it a bit easier to read.

Common theme is almost always:

Feb 22 12:44:03 tux kernel: [<ffffffff812baa36>] ? __percpu_counter_add+0x56/0x80
Feb 22 12:44:03 tux kernel: [<ffffffffa07b566c>] ? find_first_extent_bit_state+0x2c/0x80 [btrfs]
Feb 22 12:44:03 tux kernel: [<ffffffff8108bb1b>] ? lock_timer_base.isra.36+0x2b/0x50
Feb 22 12:44:03 tux kernel: [<ffffffff81075023>] ? prepare_to_wait_event+0x83/0x100
Feb 22 12:44:03 tux kernel: [<ffffffffa07980ff>] wait_current_trans.isra.17+0x9f/0x100 [btrfs]
Feb 22 12:44:03 tux kernel: [<ffffffff81075130>] ? __wake_up_sync+0x20/0x20
Feb 22 12:44:03 tux kernel: [<ffffffffa0799ad8>] start_transaction+0x318/0x5a0 [btrfs]
Feb 22 12:44:03 tux kernel: [<ffffffffa0799e17>] btrfs_attach_transaction+0x17/0x20 [btrfs]
Feb 22 12:44:03 tux kernel: [<ffffffffa079486b>] transaction_kthread+0x8b/0x260 [btrfs]
Feb 22 12:44:03 tux kernel: [<ffffffffa07947e0>] ? btrfs_cleanup_transaction+0x520/0x520 [btrfs]
Feb 22 12:44:03 tux kernel: [<ffffffff810685eb>] kthread+0xdb/0x100
Feb 22 12:44:03 tux kernel: [<ffffffff81068510>] ? kthread_create_on_node+0x180/0x180
Feb 22 12:44:03 tux kernel: [<ffffffff8153f1ec>] ret_from_fork+0x7c/0xb0
Feb 22 12:44:03 tux kernel: [<ffffffff81068510>] ? kthread_create_on_node+0x180/0x180

or this:

Feb 22 14:08:45 tux kernel: [<ffffffffa056a809>] btrfs_set_path_blocking+0x49/0x90 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffffa056a8a5>] btrfs_clear_path_blocking+0x55/0xe0 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffffa056f657>] btrfs_search_slot+0x1f7/0xa60 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffffa0585955>] btrfs_update_root+0x55/0x270 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffffa060b4fd>] commit_cowonly_roots+0x1e5/0x285 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffffa0594135>] btrfs_commit_transaction+0x525/0xbb0 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffffa05d671d>] ? btrfs_log_dentry_safe+0x6d/0x80 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffffa05a9f5c>] btrfs_sync_file+0x1fc/0x330 [btrfs]
Feb 22 14:08:45 tux kernel: [<ffffffff81191531>] do_fsync+0x51/0x80
Feb 22 14:08:45 tux kernel: [<ffffffff811602e7>] ? SyS_fallocate+0x47/0x80
Feb 22 14:08:45 tux kernel: [<ffffffff811917d0>] SyS_fsync+0x10/0x20 
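
To look for a common theme across many such traces, the frames can be tallied
mechanically. The following is a rough sketch, assuming the syslog line format
shown above; the regular expression and the function names are hypothetical
helpers, not part of any existing tool. Frames marked with "?" are ones the
unwinder is not sure about, so they are counted separately from reliable ones.

```python
import re
from collections import Counter

# Matches frames such as:
#   [<ffffffffa0799ad8>] start_transaction+0x318/0x5a0 [btrfs]
#   [<ffffffff81075023>] ? prepare_to_wait_event+0x83/0x100
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<sym>[A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>[^\]]+)\])?"
)


def parse_frames(log_text):
    """Return a list of (symbol, module_or_None, is_reliable) tuples."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            reliable = m.group("unreliable") is None
            frames.append((m.group("sym"), m.group("module"), reliable))
    return frames


def reliable_symbol_counts(log_text, module=None):
    """Count reliable frames per symbol, optionally restricted to one module."""
    counts = Counter()
    for sym, mod, reliable in parse_frames(log_text):
        if reliable and (module is None or mod == module):
            counts[sym] += 1
    return counts


if __name__ == "__main__":
    sample = """\
Feb 22 12:44:03 tux kernel: [<ffffffff81075023>] ? prepare_to_wait_event+0x83/0x100
Feb 22 12:44:03 tux kernel: [<ffffffffa07980ff>] wait_current_trans.isra.17+0x9f/0x100 [btrfs]
Feb 22 12:44:03 tux kernel: [<ffffffffa0799ad8>] start_transaction+0x318/0x5a0 [btrfs]
"""
    for sym, n in reliable_symbol_counts(sample, module="btrfs").most_common():
        print(n, sym)
```

Fed the full SysRq collection, the most frequent reliable btrfs symbols would
give a first hint at where the tasks are spinning.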

Clearly something is spinning in an endless busy loop instead of terminating
as it should.

I realize this is vague, but I wanted to check whether
- anyone has been seeing this or something similar recently
- anyone might have a suspect

I've already backtracked a bit and can rule out Filipe's recent inode handling/fsync
stuff. The problem must have snuck in recently (last 2-3 weeks).

Grateful for any suggestions!

-h

