From: Mel Gorman <mel@csn.ul.ie>
To: Kyle McMartin <kyle@mcmartin.ca>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Shaohua Li <shaohua.li@intel.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Christoph Lameter <cl@linux.com>,
	David Rientjes <rientjes@google.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH 1/2] mm: page allocator: Adjust the per-cpu counter threshold when memory is low
Date: Mon, 29 Nov 2010 15:58:01 +0000
Message-ID: <20101129155801.GG13268@csn.ul.ie>
In-Reply-To: <20101129152230.GH15818@bombadil.infradead.org>

On Mon, Nov 29, 2010 at 10:22:30AM -0500, Kyle McMartin wrote:
> On Mon, Nov 29, 2010 at 03:08:24PM +0000, Mel Gorman wrote:
> > Ouch! I have been unable to create an exact copy of your kernel source as
> > I'm not running Fedora. From a partial conversion of a source RPM, I saw no
> > changes related to mm/vmscan.c. Is this accurate? I'm trying to establish
> > if this is a mainline bug as well.
> > 
> 
> Sorry, if you extract the source rpm you should get the patched
> sources... Aside from a few patches to mm/mmap for execshield, mm/* is
> otherwise untouched from the latest stable 2.6.35 kernels.
> 

Perfect, that correlates with what I saw so this is probably a
mainline issue.

> If you git clone git://pkgs.fedoraproject.org/kernel and check out the
> origin/f14/master branch, it has all the patches we apply (based on the
> 'ApplyPatch' lines in kernel.spec).
> 
> > Second, I see all the stack traces are marked with "?" making them
> > unreliable. Is that anything to be concerned about?
> > 
> 
> Hrm, I don't think it is, I think the ones with '?' are just artifacts
> because we don't have a proper unwinder. Oh! Thanks! I just found a bug
> in our configs... We don't have CONFIG_FRAME_POINTER set because
> CONFIG_DEBUG_KERNEL got unset in the 'production' configs... I'll fix
> that up.
> 

Ordinarily I'd expect it to be from the lack of an unwinder but if FRAME_POINTER
is there (which you say in a follow-up mail that it is), it can be a bit of
a concern. There is some real weirdness as it is. Take one of Luke's
examples where it appears to be locked up in

[ 5015.448127] Pid: 185, comm: kswapd1 Tainted: P 2.6.35.6-48.fc14.x86_64 #1 X8DA3/X8DA3
[ 5015.448127] RIP: 0010:[<ffffffff81469130>]  [<ffffffff81469130>] _raw_spin_unlock_irqrestore+0x18/0x19

I am at a loss to explain under what circumstances that can even happen!
Is there any possibility RIP is being translated to the wrong symbol, possibly
via a userspace decoder of the oops or similar? Is there any possibility
the stack is being corrupted if the swap subsystem is on a complicated
software stack?

> > I see that one user has reported that the patches fixed the problem for him
> > but I fear that this might be a coincidence or that the patches close a
> > race of some description. Specifically, I'm trying to identify if there is
> > a situation where kswapd() constantly loops checking watermarks and never
> > calls cond_resched(). This could conceivably happen if kswapd() is always
> > checking sleeping_prematurely() at a higher order whereas balance_pgdat()
> > always checks the watermarks at the lower order. I'm not seeing how this
> > could happen in 2.6.35.6 though. If Fedora doesn't have special changes,
> > it might mean that these patches do need to go into -stable as the
> > cost of zone_page_state_snapshot() is far higher on larger machines than
> > previously reported.
> > 
> 
> Yeah, I am a bit surprised as well. Luke seems to have quite a large
> machine... I haven't seen any kswapd lockups there on my 18G machine
> using the same kernel. :< (Possibly it's just not stressed enough
> though.)
> 

He reports his machine as 24-way but with his model of CPU it could
still be 2-socket, which I ordinarily would not have expected to suffer
so badly from zone_page_state_snapshot(). I would have predicted the
patches being thrown about in the thread "Free memory never fully
used, swapping" to be more relevant to kswapd failing to go to sleep :/

Andrew, this patch was a performance fix. Is a report saying that it fixes
a functional regression in Fedora enough to push the patch towards stable,
even though an explanation as to *why* it fixes the problem is missing?

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


