From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andy Isaacson Subject: atime updates stuck under btrfs_dirty_inode Date: Sun, 16 Jan 2011 23:34:40 -0800 Message-ID: <20110117073440.GB7508@hexapodia.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii To: linux-btrfs@vger.kernel.org Return-path: List-ID: Hi all, I'm seeing processes block (for minutes or longer) in directory access on btrfs. I'm running commit ccda756d13f13a9276edc8d53d05aaecbf3c4baa Merge: 0c21e3a 65e5341 Author: Andy Isaacson Date: Sun Jan 9 12:37:23 2011 -0800 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable on a quad core Core i7. There seem to be a few failure cases. My most recent ls has been stuck like this for several hours, on a directory that should have somewhere between 6 and 50 entries: % ps u 15028 USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND adi 15028 0.0 0.0 27032 1204 pts/8 S+ 22:18 0:00 ls -l /btr/foo % cat /proc/15028/stack [] shrink_delalloc+0x112/0x147 [btrfs] [] reserve_metadata_bytes+0xf2/0x1a8 [btrfs] [] btrfs_block_rsv_add+0x28/0x49 [btrfs] [] btrfs_trans_reserve_metadata+0x4d/0x75 [btrfs] [] start_transaction+0x13e/0x1d8 [btrfs] [] btrfs_start_transaction+0x13/0x15 [btrfs] [] btrfs_dirty_inode+0x6c/0xfd [btrfs] [] __mark_inode_dirty+0x2b/0x1ab [] touch_atime+0x10e/0x131 [] vfs_readdir+0x79/0xa0 [] sys_getdents+0x81/0xd1 [] tracesys+0xd9/0xde [] 0xffffffffffffffff Sometimes, if I poll fast enough I see this stack instead: % cat /proc/15028/stack [] wait_for_commit+0x99/0xe2 [btrfs] [] btrfs_commit_transaction+0xfe/0x612 [btrfs] [] start_transaction+0x151/0x1d8 [btrfs] [] btrfs_start_transaction+0x13/0x15 [btrfs] [] btrfs_dirty_inode+0x6c/0xfd [btrfs] [] __mark_inode_dirty+0x2b/0x1ab [] touch_atime+0x10e/0x131 [] vfs_readdir+0x79/0xa0 [] sys_getdents+0x81/0xd1 [] tracesys+0xd9/0xde [] 0xffffffffffffffff Another ls has been accumulating runtime for several days: 7406 pts/3 R+ 2973:56 | \_ ls -F the directory it's reading has around 200 files in it, I think. When I catch it in kernel mode its stack looks like this: % cat /proc/7406/stack [] writeback_inodes_sb_nr+0x76/0x7d [] writeback_inodes_sb_nr_if_idle+0x41/0x55 [] shrink_delalloc+0xaf/0x147 [btrfs] [] reserve_metadata_bytes+0xf2/0x1a8 [btrfs] [] btrfs_block_rsv_add+0x28/0x49 [btrfs] [] btrfs_trans_reserve_metadata+0x4d/0x75 [btrfs] [] start_transaction+0x13e/0x1d8 [btrfs] [] btrfs_start_transaction+0x13/0x15 [btrfs] [] btrfs_dirty_inode+0x6c/0xfd [btrfs] [] __mark_inode_dirty+0x2b/0x1ab [] touch_atime+0x10e/0x131 [] vfs_readdir+0x79/0xa0 [] sys_getdents+0x81/0xd1 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff vmstat is rather odd: % vmstat 1 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 2 0 0 124348 177388 11276228 0 0 157 24 3 0 2 10 87 1 1 0 0 122108 177388 11278540 0 0 536 112 2870 813309 2 13 83 2 2 0 0 119628 177388 11280852 0 0 704 0 2938 887219 2 11 87 0 2 0 0 117768 177388 11282684 0 0 800 0 2936 883427 2 15 82 1 2 0 0 114872 177388 11285464 0 0 1104 0 2819 883498 1 8 90 0 1 0 0 112704 177388 11287388 0 0 612 0 2961 880957 2 14 84 0 ^C One of the three block devices on this btrfs is reading a few hundred blocks per second, but is nowhere near 100% busy; the CS count is extremely high; and we have less than a full CPU spent in system time (even though there are two ls processes "blocked"). If anyone can point me to a good place to start tracking down the root cause, I'd appreciate it. Thanks, -andy