* spinning in isolate_migratepages_range on busy nfs server
@ 2012-10-25 16:47 J. Bruce Fields
2012-10-26 18:48 ` J. Bruce Fields
2012-10-30 13:56 ` Mel Gorman
0 siblings, 2 replies; 3+ messages in thread
From: J. Bruce Fields @ 2012-10-25 16:47 UTC
To: Mel Gorman; +Cc: linux-mm, linux-fsdevel, bmarson
We're seeing an nfs server on a 3.6-ish kernel lock up after running
specfs for a while.
Looking at the logs, there are some hung task warnings showing nfsd
threads stuck on directory i_mutexes trying to do lookups.
A sysrq-t dump showed there were also lots of threads holding those
i_mutexes while trying to allocate xfs inodes:
nfsd R running task 0 6517 2 0x00000080
ffff880f925074c0 0000000000000046 ffff880fe4718000 ffff880f92507fd8
ffff880f92507fd8 ffff880f92507fd8 ffff880fd7920000 ffff880fe4718000
0000000000000000 ffff880f92506000 ffff88102ffd96c0 ffff88102ffd9b40
Call Trace:
[<ffffffff81091aaa>] __cond_resched+0x2a/0x40
[<ffffffff815d3750>] _cond_resched+0x30/0x40
[<ffffffff81150e92>] isolate_migratepages_range+0xb2/0x550
[<ffffffff811507c0>] ? compact_checklock_irqsave.isra.17+0xe0/0xe0
[<ffffffff81151536>] compact_zone+0x146/0x3f0
[<ffffffff81151a92>] compact_zone_order+0x82/0xc0
[<ffffffff81151bb1>] try_to_compact_pages+0xe1/0x110
[<ffffffff815c99e2>] __alloc_pages_direct_compact+0xaa/0x190
[<ffffffff81138317>] __alloc_pages_nodemask+0x517/0x980
[<ffffffff81088a00>] ? __synchronize_srcu+0xf0/0x110
[<ffffffff81171e30>] alloc_pages_current+0xb0/0x120
[<ffffffff8117b015>] new_slab+0x265/0x310
[<ffffffff815caefc>] __slab_alloc+0x358/0x525
[<ffffffffa05625a7>] ? kmem_zone_alloc+0x67/0xf0 [xfs]
[<ffffffff81088c72>] ? up+0x32/0x50
[<ffffffffa05625a7>] ? kmem_zone_alloc+0x67/0xf0 [xfs]
[<ffffffff8117b4ef>] kmem_cache_alloc+0xff/0x130
[<ffffffffa05625a7>] kmem_zone_alloc+0x67/0xf0 [xfs]
[<ffffffffa0552f49>] xfs_inode_alloc+0x29/0x270 [xfs]
[<ffffffffa0553801>] xfs_iget+0x231/0x6c0 [xfs]
[<ffffffffa0560687>] xfs_lookup+0xe7/0x110 [xfs]
[<ffffffffa05583e1>] xfs_vn_lookup+0x51/0x90 [xfs]
[<ffffffff81193e9d>] lookup_real+0x1d/0x60
[<ffffffff811940b8>] __lookup_hash+0x38/0x50
[<ffffffff81197e26>] lookup_one_len+0xd6/0x110
[<ffffffffa034667b>] nfsd_lookup_dentry+0x12b/0x4a0 [nfsd]
[<ffffffffa0346a69>] nfsd_lookup+0x79/0x140 [nfsd]
[<ffffffffa034fb5f>] nfsd3_proc_lookup+0xef/0x1c0 [nfsd]
[<ffffffffa0341bbb>] nfsd_dispatch+0xeb/0x230 [nfsd]
[<ffffffffa02ee3a8>] svc_process_common+0x328/0x6d0 [sunrpc]
[<ffffffffa02eeaa2>] svc_process+0x102/0x150 [sunrpc]
[<ffffffffa0341115>] nfsd+0xb5/0x1a0 [nfsd]
[<ffffffffa0341060>] ? nfsd_get_default_max_blksize+0x60/0x60 [nfsd]
[<ffffffff81082613>] kthread+0x93/0xa0
[<ffffffff815ddc34>] kernel_thread_helper+0x4/0x10
[<ffffffff81082580>] ? kthread_freezable_should_stop+0x70/0x70
[<ffffffff815ddc30>] ? gs_change+0x13/0x13
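For reference, the hung-task warnings and the sysrq-t dump above come from
the standard interfaces; the exact invocations used on this box aren't
recorded, so the lines below are the usual incantations rather than what was
actually typed:

  # dump every task's stack to the kernel log (requires sysrq to be enabled)
  echo t > /proc/sysrq-trigger

  # show khungtaskd's warning threshold in seconds (0 disables the warnings)
  sysctl kernel.hung_task_timeout_secs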
And perf --call-graph also shows we're spending all our time in the same
place, spinning on a lock (zone->lru_lock, I assume):
- 92.65% nfsd [kernel.kallsyms] [k] _raw_spin_lock_irqsave
- _raw_spin_lock_irqsave
- 99.86% isolate_migratepages_range
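A profile like the one above can be captured along these lines; the options
actually used here aren't recorded, so this is only a sketch of the usual
workflow (the sampling duration is arbitrary), not the literal commands:

  # sample system-wide with call graphs for 30 seconds
  perf record -a -g -- sleep 30
  # browse the samples with callers expanded
  perf report -g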
Just grepping through logs, I ran across 2a1402aa04 "mm: compaction:
acquire the zone->lru_lock as late as possible", in v3.7-rc1, which
looks relevant:
Richard Davies and Shaohua Li have both reported lock contention
problems in compaction on the zone and LRU locks as well as
significant amounts of time being spent in compaction. This
series aims to reduce lock contention and scanning rates to
reduce that CPU usage. Richard reported at
https://lkml.org/lkml/2012/9/21/91 that this series made a big
difference to a problem he reported in August:
http://marc.info/?l=kvm&m=134511507015614&w=2
So we're trying that. Is there anything else we should try?
--b.
* Re: spinning in isolate_migratepages_range on busy nfs server
2012-10-25 16:47 spinning in isolate_migratepages_range on busy nfs server J. Bruce Fields
@ 2012-10-26 18:48 ` J. Bruce Fields
2012-10-30 13:56 ` Mel Gorman
1 sibling, 0 replies; 3+ messages in thread
From: J. Bruce Fields @ 2012-10-26 18:48 UTC
To: Mel Gorman; +Cc: linux-mm, linux-fsdevel, bmarson
On Thu, Oct 25, 2012 at 12:47:22PM -0400, bfields wrote:
> We're seeing an nfs server on a 3.6-ish kernel lock up after running
> specfs for a while.
>
> Looking at the logs, there are some hung task warnings showing nfsd
> threads stuck on directory i_mutexes trying to do lookups.
>
> A sysrq-t dump showed there were also lots of threads holding those
> i_mutexes while trying to allocate xfs inodes:
>
> nfsd R running task 0 6517 2 0x00000080
> ffff880f925074c0 0000000000000046 ffff880fe4718000 ffff880f92507fd8
> ffff880f92507fd8 ffff880f92507fd8 ffff880fd7920000 ffff880fe4718000
> 0000000000000000 ffff880f92506000 ffff88102ffd96c0 ffff88102ffd9b40
> Call Trace:
> [<ffffffff81091aaa>] __cond_resched+0x2a/0x40
> [<ffffffff815d3750>] _cond_resched+0x30/0x40
> [<ffffffff81150e92>] isolate_migratepages_range+0xb2/0x550
> [<ffffffff811507c0>] ? compact_checklock_irqsave.isra.17+0xe0/0xe0
> [<ffffffff81151536>] compact_zone+0x146/0x3f0
> [<ffffffff81151a92>] compact_zone_order+0x82/0xc0
> [<ffffffff81151bb1>] try_to_compact_pages+0xe1/0x110
> [<ffffffff815c99e2>] __alloc_pages_direct_compact+0xaa/0x190
> [<ffffffff81138317>] __alloc_pages_nodemask+0x517/0x980
> [<ffffffff81088a00>] ? __synchronize_srcu+0xf0/0x110
> [<ffffffff81171e30>] alloc_pages_current+0xb0/0x120
> [<ffffffff8117b015>] new_slab+0x265/0x310
> [<ffffffff815caefc>] __slab_alloc+0x358/0x525
> [<ffffffffa05625a7>] ? kmem_zone_alloc+0x67/0xf0 [xfs]
> [<ffffffff81088c72>] ? up+0x32/0x50
> [<ffffffffa05625a7>] ? kmem_zone_alloc+0x67/0xf0 [xfs]
> [<ffffffff8117b4ef>] kmem_cache_alloc+0xff/0x130
> [<ffffffffa05625a7>] kmem_zone_alloc+0x67/0xf0 [xfs]
> [<ffffffffa0552f49>] xfs_inode_alloc+0x29/0x270 [xfs]
> [<ffffffffa0553801>] xfs_iget+0x231/0x6c0 [xfs]
> [<ffffffffa0560687>] xfs_lookup+0xe7/0x110 [xfs]
> [<ffffffffa05583e1>] xfs_vn_lookup+0x51/0x90 [xfs]
> [<ffffffff81193e9d>] lookup_real+0x1d/0x60
> [<ffffffff811940b8>] __lookup_hash+0x38/0x50
> [<ffffffff81197e26>] lookup_one_len+0xd6/0x110
> [<ffffffffa034667b>] nfsd_lookup_dentry+0x12b/0x4a0 [nfsd]
> [<ffffffffa0346a69>] nfsd_lookup+0x79/0x140 [nfsd]
> [<ffffffffa034fb5f>] nfsd3_proc_lookup+0xef/0x1c0 [nfsd]
> [<ffffffffa0341bbb>] nfsd_dispatch+0xeb/0x230 [nfsd]
> [<ffffffffa02ee3a8>] svc_process_common+0x328/0x6d0 [sunrpc]
> [<ffffffffa02eeaa2>] svc_process+0x102/0x150 [sunrpc]
> [<ffffffffa0341115>] nfsd+0xb5/0x1a0 [nfsd]
> [<ffffffffa0341060>] ? nfsd_get_default_max_blksize+0x60/0x60 [nfsd]
> [<ffffffff81082613>] kthread+0x93/0xa0
> [<ffffffff815ddc34>] kernel_thread_helper+0x4/0x10
> [<ffffffff81082580>] ? kthread_freezable_should_stop+0x70/0x70
> [<ffffffff815ddc30>] ? gs_change+0x13/0x13
>
> And perf --call-graph also shows we're spending all our time in the same
> place, spinning on a lock (zone->lru_lock, I assume):
>
> - 92.65% nfsd [kernel.kallsyms] [k] _raw_spin_lock_irqsave
> - _raw_spin_lock_irqsave
> - 99.86% isolate_migratepages_range
>
> Just grepping through logs, I ran across 2a1402aa04 "mm: compaction:
> acquire the zone->lru_lock as late as possible", in v3.7-rc1, which
> looks relevant:
>
> Richard Davies and Shaohua Li have both reported lock contention
> problems in compaction on the zone and LRU locks as well as
> significant amounts of time being spent in compaction. This
> series aims to reduce lock contention and scanning rates to
> reduce that CPU usage. Richard reported at
> https://lkml.org/lkml/2012/9/21/91 that this series made a big
> difference to a problem he reported in August:
>
> http://marc.info/?l=kvm&m=134511507015614&w=2
>
> So we're trying that. Is there anything else we should try?
Confirmed, applying that to 3.6 seems to fix the problem.
--b.
* Re: spinning in isolate_migratepages_range on busy nfs server
2012-10-25 16:47 spinning in isolate_migratepages_range on busy nfs server J. Bruce Fields
2012-10-26 18:48 ` J. Bruce Fields
@ 2012-10-30 13:56 ` Mel Gorman
1 sibling, 0 replies; 3+ messages in thread
From: Mel Gorman @ 2012-10-30 13:56 UTC
To: J. Bruce Fields; +Cc: linux-mm, linux-fsdevel, bmarson
On Thu, Oct 25, 2012 at 12:47:22PM -0400, J. Bruce Fields wrote:
> We're seeing an nfs server on a 3.6-ish kernel lock up after running
> specfs for a while.
>
> Looking at the logs, there are some hung task warnings showing nfsd
> threads stuck on directory i_mutexes trying to do lookups.
>
> A sysrq-t dump showed there were also lots of threads holding those
> i_mutexes while trying to allocate xfs inodes:
>
> nfsd R running task 0 6517 2 0x00000080
> ffff880f925074c0 0000000000000046 ffff880fe4718000 ffff880f92507fd8
> ffff880f92507fd8 ffff880f92507fd8 ffff880fd7920000 ffff880fe4718000
> 0000000000000000 ffff880f92506000 ffff88102ffd96c0 ffff88102ffd9b40
> Call Trace:
> [<ffffffff81091aaa>] __cond_resched+0x2a/0x40
> [<ffffffff815d3750>] _cond_resched+0x30/0x40
> [<ffffffff81150e92>] isolate_migratepages_range+0xb2/0x550
> <SNIP>
>
> And perf --call-graph also shows we're spending all our time in the same
> place, spinning on a lock (zone->lru_lock, I assume):
>
> - 92.65% nfsd [kernel.kallsyms] [k] _raw_spin_lock_irqsave
> - _raw_spin_lock_irqsave
> - 99.86% isolate_migratepages_range
>
> Just grepping through logs, I ran across 2a1402aa04 "mm: compaction:
> acquire the zone->lru_lock as late as possible", in v3.7-rc1, which
> looks relevant:
>
> Richard Davies and Shaohua Li have both reported lock contention
> problems in compaction on the zone and LRU locks as well as
> significant amounts of time being spent in compaction. This
> series aims to reduce lock contention and scanning rates to
> reduce that CPU usage. Richard reported at
> https://lkml.org/lkml/2012/9/21/91 that this series made a big
> difference to a problem he reported in August:
>
> http://marc.info/?l=kvm&m=134511507015614&w=2
>
> So we're trying that. Is there anything else we should try?
>
Sorry for the long delay in getting back, I was travelling. All the
related commits would ideally be tested. They are:
e64c5237cf6ff474cb2f3f832f48f2b441dd9979 mm: compaction: abort compaction loop if lock is contended or run too long
3cc668f4e30fbd97b3c0574d8cac7a83903c9bc7 mm: compaction: move fatal signal check out of compact_checklock_irqsave
661c4cb9b829110cb68c18ea05a56be39f75a4d2 mm: compaction: Update try_to_compact_pages()kerneldoc comment
2a1402aa044b55c2d30ab0ed9405693ef06fb07c mm: compaction: acquire the zone->lru_lock as late as possible
f40d1e42bb988d2a26e8e111ea4c4c7bac819b7e mm: compaction: acquire the zone->lock as late as possible
753341a4b85ff337487b9959c71c529f522004f4 revert "mm: have order > 0 compaction start off where it left"
bb13ffeb9f6bfeb301443994dfbf29f91117dfb3 mm: compaction: cache if a pageblock was scanned and no pages were isolated
c89511ab2f8fe2b47585e60da8af7fd213ec877e mm: compaction: Restart compaction from near where it left off
62997027ca5b3d4618198ed8b1aba40b61b1137b mm: compaction: clear PG_migrate_skip based on compaction and reclaim activity
0db63d7e25f96e2c6da925c002badf6f144ddf30 mm: compaction: correct the nr_strict va isolated check for CMA
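For anyone reproducing the test, one way to stack those onto a 3.6 tree is a
straight cherry-pick from a mainline clone. This is only a sketch: the branch
name is made up, v3.7 is used as a convenient endpoint that contains all ten
commits, and any conflicts have to be resolved by hand:

  git checkout -b compaction-backport v3.6
  # confirm the mainline order of the ten commits listed above
  git log --oneline --reverse v3.6..v3.7 -- mm/compaction.c
  # then apply them oldest first, e.g.
  git cherry-pick <hash>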
Thanks very much.
--
Mel Gorman
SUSE Labs