Re: [2.6.35-rc1, bug] mm: minute-long livelocks in memory reclaim

From: Wu Fengguang <fengguang.wu@intel.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [2.6.35-rc1, bug] mm: minute-long livelocks in memory reclaim
Date: Mon, 23 Aug 2010 14:58:22 +0800
Message-ID: <20100823065822.GA22707@localhost>
In-Reply-To: <20100822234811.GF31488@dastard>

On Mon, Aug 23, 2010 at 09:48:11AM +1000, Dave Chinner wrote:
> Folks,
> 
> I've been testing parallel create workloads over the weekend, and
> I've seen this a couple of times now under 8 thread parallel creates
> with XFS. I'm running on an 8p VM with 4GB RAM and a fast disk
> subsystem. Basically I am seeing the create rate drop to zero
> with all 8 CPUs stuck spinning for up to 2 minutes. 'echo t >
> /proc/sysrq-trigger' while this is occurring gives the following
> trace for all the fs-mark processes:
> 
> [49506.624018] fs_mark       R  running task        0  8376   7917 0x00000008
> [49506.624018]  0000000000000000 ffffffff81b94590 00000000000008fc 0000000000000002
> [49506.624018]  0000000000000000 0000000000000286 0000000000000297 ffffffffffffff10
> [49506.624018]  ffffffff810b3d02 0000000000000010 0000000000000202 ffff88011df777a8
> [49506.624018] Call Trace:
> [49506.624018]  [<ffffffff810b3d02>] ? smp_call_function_many+0x1a2/0x210
> [49506.624018]  [<ffffffff810b3ce5>] ? smp_call_function_many+0x185/0x210
> [49506.624018]  [<ffffffff81109170>] ? drain_local_pages+0x0/0x20
> [49506.624018]  [<ffffffff810b3d92>] ? smp_call_function+0x22/0x30
> [49506.624018]  [<ffffffff810849a4>] ? on_each_cpu+0x24/0x50
> [49506.624018]  [<ffffffff81107bec>] ? drain_all_pages+0x1c/0x20
> [49506.624018]  [<ffffffff8110825a>] ? __alloc_pages_nodemask+0x57a/0x730
> [49506.624018]  [<ffffffff8113c6d2>] ? kmem_getpages+0x62/0x160
> [49506.624018]  [<ffffffff8113d2b2>] ? fallback_alloc+0x192/0x240
> [49506.624018]  [<ffffffff8113cce1>] ? cache_grow+0x2d1/0x300
> [49506.624018]  [<ffffffff8113d04a>] ? ____cache_alloc_node+0x9a/0x170
> [49506.624018]  [<ffffffff8113cf6c>] ? cache_alloc_refill+0x25c/0x2a0
> [49506.624018]  [<ffffffff8113ddb3>] ? __kmalloc+0x193/0x230
> [49506.624018]  [<ffffffff812f59af>] ? kmem_alloc+0x8f/0xe0
> [49506.624018]  [<ffffffff812f59af>] ? kmem_alloc+0x8f/0xe0
> [49506.624018]  [<ffffffff812f5a9e>] ? kmem_zalloc+0x1e/0x50
> [49506.624018]  [<ffffffff812e2f4d>] ? xfs_log_commit_cil+0x9d/0x440
> [49506.624018]  [<ffffffff812eeec6>] ? _xfs_trans_commit+0x1e6/0x2b0
> [49506.624018]  [<ffffffff812f2b6f>] ? xfs_create+0x51f/0x690
> [49506.624018]  [<ffffffff812ffdb7>] ? xfs_vn_mknod+0xa7/0x1c0
> [49506.624018]  [<ffffffff812fff00>] ? xfs_vn_create+0x10/0x20
> [49506.624018]  [<ffffffff811510b8>] ? vfs_create+0xb8/0xf0
> [49506.624018]  [<ffffffff81151d2c>] ? do_last+0x4dc/0x5d0
> [49506.624018]  [<ffffffff81153bd7>] ? do_filp_open+0x207/0x5e0
> [49506.624018]  [<ffffffff8105fc58>] ? pvclock_clocksource_read+0x58/0xd0
> [49506.624018]  [<ffffffff8115eaca>] ? alloc_fd+0x10a/0x150
> [49506.624018]  [<ffffffff81144005>] ? do_sys_open+0x65/0x130
> [49506.624018]  [<ffffffff81144110>] ? sys_open+0x20/0x30
> [49506.624018]  [<ffffffff81036072>] ? system_call_fastpath+0x16/0x1b
> 
> Eventually the problem goes away, and the system goes back to
> performing at the normal rate. Any ideas on how to avoid this
> problem? I'm using CONFIG_SLAB=y if that is relevant....

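zone->lock contention? All the fs_mark tasks in the trace are in
drain_all_pages(), which __alloc_pages_direct_reclaim() calls after
reclaim for every order > 0 allocation. Try ripping out the following
two lines. The change might be a bit aggressive though :)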

Thanks,
Fengguang

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1bb327a..c08b8d3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1864,9 +1864,6 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
 
 	cond_resched();
 
-	if (order != 0)
-		drain_all_pages();
-
 	if (likely(*did_some_progress))
 		page = get_page_from_freelist(gfp_mask, nodemask, order,
 					zonelist, high_zoneidx,
