From: Simon Kirby <sim@hostway.ca>
To: Peter Schüller <scode@spotify.com>
Cc: Pekka Enberg <penberg@kernel.org>,
Dave Hansen <dave@linux.vnet.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org,
Mattias de Zalenski <zalenski@spotify.com>,
linux-mm@kvack.org
Subject: Re: Sudden and massive page cache eviction
Date: Wed, 24 Nov 2010 17:18:48 -0800
Message-ID: <20101125011848.GB29511@hostway.ca>
In-Reply-To: <AANLkTi=yV02oY5AmNAYr+ZF0RUgVv8gkeP+D9_CcOfLi@mail.gmail.com>

On Wed, Nov 24, 2010 at 04:32:39PM +0100, Peter Schüller wrote:
> >> I forgot to address the second part of this question: How would I best
> >> inspect whether the kernel is doing that?
> >
> > You can, for example, record
> >
> >   cat /proc/meminfo | grep Huge
> >
> > for large page allocations.
>
> Those show zero a per my other post. However I got the impression Dave
> was asking about regular but larger-than-one-page allocations internal
> to the kernel, while the Huge* lines in /proc/meminfo refers to
> allocations specifically done by userland applications doing huge page
> allocation on a system with huge pages enabled - or am I confused?

Your page cache dents don't seem quite as big, so it may be something
else, but if it's the same problem we're seeing here, it seems to be
triggered when an order=3 new_slab allocation comes in to grow the kmalloc
slab cache for an __alloc_skb (network packet). This is normal even
without jumbo frames now. When no zone passes zone_watermark_ok() for
order=3, kswapd is woken, and it frees things all over the place to try
to get zone_watermark_ok(order=3) to be happy.

We're seeing this throw out a huge number of pages, and we're seeing it
happen even with lots of memory free in the zone. CONFIG_COMPACTION also
currently does not help because try_to_compact_pages() returns early with
COMPACT_SKIPPED if order <= PAGE_ALLOC_COSTLY_ORDER, and, you guessed it,
PAGE_ALLOC_COSTLY_ORDER is set to 3.

I reimplemented zone_watermark_ok(order=3) in userspace, and I can see it
happen:

Code here: http://0x.ca/sim/ref/2.6.36/buddyinfo_scroll
Zone   order:0      1     2     3    4  5  6  7  8  9  A nr_free  state
DMA32    19026  33652  4897    13    5  1  2  0  0  0  0  106262    337 <= 256
Normal     450      0     0     0    0  0  0  0  0  0  0     450     -7 <= 238
DMA32    19301  33869  4665    12    5  1  2  0  0  0  0  106035    329 <= 256
Normal     450      0     0     0    0  0  0  0  0  0  0     450     -7 <= 238
DMA32    19332  33931  4603     9    5  1  2  0  0  0  0  105918    305 <= 256
Normal     450      0     0     0    0  0  0  0  0  0  0     450     -7 <= 238
DMA32    19467  34057  4468     6    5  1  2  0  0  0  0  105741    281 <= 256
Normal     450      0     0     0    0  0  0  0  0  0  0     450     -7 <= 238
DMA32    19591  34181  4344     5    5  1  2  0  0  0  0  105609    273 <= 256
Normal     450      0     0     0    0  0  0  0  0  0  0     450     -7 <= 238
DMA32    19856  34348  4109     2    5  1  2  0  0  0  0  105244    249 <= 256 !!!
Normal     450      0     0     0    0  0  0  0  0  0  0     450     -7 <= 238
DMA32    24088  36476  5437   144    5  1  2  0  0  0  0  120180   1385 <= 256
Normal    1024      1     0     0    0  0  0  0  0  0  0    1026     -5 <= 238
DMA32    26453  37440  6676   623   53  1  2  0  0  0  0  134029   5985 <= 256
Normal    8700    100     0     0    0  0  0  0  0  0  0    8900    193 <= 238
DMA32    48881  38161  7142   966   81  1  2  0  0  0  0  162955   9177 <= 256
Normal    8936    102     0     1    0  0  0  0  0  0  0    9148    205 <= 238
DMA32    66046  40051  7871  1409  135  2  2  0  0  0  0  191256  13617 <= 256
Normal    9019     18     0     0    0  0  0  0  0  0  0    9055     29 <= 238
DMA32    67133  48671  8231  1578  143  2  2  0  0  0  0  212503  15097 <= 256
So, kswapd was woken up at the line that ends in "!!!" there, because
free_pages(249) <= min(256), and so zone_watermark_ok() returned 0, when
an order=3 allocation came in.
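
The arithmetic behind the "state" column can be sketched in a few lines of Python. This is only an illustration of the zone_watermark_ok() logic as it appears in 2.6.36-era mm/page_alloc.c, not the buddyinfo_scroll script itself. The function names are made up for this example, the ALLOC_HIGH/ALLOC_HARDER adjustments are left out, lowmem_reserve defaults to 0 as an assumption, and mark=2048 in the usage note is an assumed value chosen only because three halvings of it give the 256 shown in the table.

```python
def nr_free(counts):
    """Total free pages: counts[o] free blocks of 2**o pages each."""
    return sum(n << o for o, n in enumerate(counts))


def high_order_free(counts, order):
    """Free pages left once all blocks smaller than `order` are discounted.

    Mirrors the kernel's starting value nr_free - (1 << order) + 1, which is
    why an empty zone shows -7 for order=3.
    """
    free_pages = nr_free(counts) - (1 << order) + 1
    for o in range(order):
        free_pages -= counts[o] << o
    return free_pages


def watermark_ok(counts, order, mark, lowmem_reserve=0):
    """Simplified zone_watermark_ok(): True if the allocation may proceed.

    `mark` and `lowmem_reserve` are inputs the caller must supply; the values
    used in any example here are assumptions, not read from a real system.
    """
    min_pages = mark
    free_pages = nr_free(counts) - (1 << order) + 1
    if free_pages <= min_pages + lowmem_reserve:
        return False
    for o in range(order):
        # Blocks of this order cannot satisfy a higher-order request.
        free_pages -= counts[o] << o
        # Require fewer free pages at each higher order.
        min_pages >>= 1
        if free_pages <= min_pages:
            return False
    return True
```

Feeding it the DMA32 row marked "!!!" (19856 34348 4109 2 5 1 2 0 0 0 0) with order=3 leaves 249 pages, which is not above 256, so the check fails; the row just before it leaves 273 and still passes.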
Maybe try out that script and see if you see something similar.
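
If fetching the script is inconvenient, the per-order counts it displays come from /proc/buddyinfo and are easy to read directly. The following is a rough sketch, assuming the usual "Node N, zone NAME count count ..." line layout; parse_buddyinfo and read_buddyinfo are names invented for this example and are not part of the linked script.

```python
def parse_buddyinfo(text):
    """Parse /proc/buddyinfo text into {(node, zone): [count per order]}.

    Assumes lines of the form:
        Node 0, zone   Normal    450      0      0 ...
    Lines that do not match that shape are skipped.
    """
    zones = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        node = int(fields[1].rstrip(","))
        zones[(node, fields[3])] = [int(n) for n in fields[4:]]
    return zones


def read_buddyinfo(path="/proc/buddyinfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_buddyinfo(f.read())
```

Calling read_buddyinfo() in a loop with a short sleep and printing the order=3 and higher columns should show whether the larger blocks run out shortly before a page cache drop.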
Simon-