linux-mm.kvack.org archive mirror
From: Simon Kirby <sim@hostway.ca>
To: Peter Schüller <scode@spotify.com>
Cc: Pekka Enberg <penberg@kernel.org>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org,
	Mattias de Zalenski <zalenski@spotify.com>,
	linux-mm@kvack.org
Subject: Re: Sudden and massive page cache eviction
Date: Wed, 24 Nov 2010 17:18:48 -0800
Message-ID: <20101125011848.GB29511@hostway.ca>
In-Reply-To: <AANLkTi=yV02oY5AmNAYr+ZF0RUgVv8gkeP+D9_CcOfLi@mail.gmail.com>

On Wed, Nov 24, 2010 at 04:32:39PM +0100, Peter Schüller wrote:

> >> I forgot to address the second part of this question: How would I best
> >> inspect whether the kernel is doing that?
> >
> > You can, for example, record
> >
> >   cat /proc/meminfo | grep Huge
> >
> > for large page allocations.
> 
> Those show zero, as per my other post. However, I got the impression Dave
> was asking about regular but larger-than-one-page allocations internal
> to the kernel, while the Huge* lines in /proc/meminfo refer to
> allocations specifically done by userland applications doing huge page
> allocation on a system with huge pages enabled - or am I confused?

Your page cache dents don't seem quite as big, so it may be something
else, but if it's the same problem we're seeing here, it seems to have to
do with when an order=3 new_slab allocation comes in to grow the kmalloc
slab cache for an __alloc_skb (network packet).  This is normal even
without jumbo frames now.  When there are no zones with order=3
zone_watermark_ok(), kswapd is woken, which frees things all over the
place to try to get zone_watermark_ok(order=3) to be happy.

We're seeing this throw out a huge number of pages, and we're seeing it
happen even with lots of memory free in the zone.  CONFIG_COMPACTION also
currently does not help because try_to_compact_pages() returns early with
COMPACT_SKIPPED if order <= PAGE_ALLOC_COSTLY_ORDER, and, you guessed it,
PAGE_ALLOC_COSTLY_ORDER is set to 3.

I reimplemented zone_watermark_ok(order=3) in userspace, and I can see it
happen:

Code here: http://0x.ca/sim/ref/2.6.36/buddyinfo_scroll

  Zone order:0      1     2     3    4 5 6 7 8 9 A nr_free state

 DMA32   19026  33652  4897    13    5 1 2 0 0 0 0  106262 337 <= 256
Normal     450      0     0     0    0 0 0 0 0 0 0     450 -7 <= 238
 DMA32   19301  33869  4665    12    5 1 2 0 0 0 0  106035 329 <= 256
Normal     450      0     0     0    0 0 0 0 0 0 0     450 -7 <= 238
 DMA32   19332  33931  4603     9    5 1 2 0 0 0 0  105918 305 <= 256
Normal     450      0     0     0    0 0 0 0 0 0 0     450 -7 <= 238
 DMA32   19467  34057  4468     6    5 1 2 0 0 0 0  105741 281 <= 256
Normal     450      0     0     0    0 0 0 0 0 0 0     450 -7 <= 238
 DMA32   19591  34181  4344     5    5 1 2 0 0 0 0  105609 273 <= 256
Normal     450      0     0     0    0 0 0 0 0 0 0     450 -7 <= 238
 DMA32   19856  34348  4109     2    5 1 2 0 0 0 0  105244 249 <= 256 !!!
Normal     450      0     0     0    0 0 0 0 0 0 0     450 -7 <= 238
 DMA32   24088  36476  5437   144    5 1 2 0 0 0 0  120180 1385 <= 256
Normal    1024      1     0     0    0 0 0 0 0 0 0    1026 -5 <= 238
 DMA32   26453  37440  6676   623   53 1 2 0 0 0 0  134029 5985 <= 256
Normal    8700    100     0     0    0 0 0 0 0 0 0    8900 193 <= 238
 DMA32   48881  38161  7142   966   81 1 2 0 0 0 0  162955 9177 <= 256
Normal    8936    102     0     1    0 0 0 0 0 0 0    9148 205 <= 238
 DMA32   66046  40051  7871  1409  135 2 2 0 0 0 0  191256 13617 <= 256
Normal    9019     18     0     0    0 0 0 0 0 0 0    9055 29 <= 238
 DMA32   67133  48671  8231  1578  143 2 2 0 0 0 0  212503 15097 <= 256

So, kswapd was woken up at the line that ends in "!!!" there, because
free_pages(249) <= min(256), and so zone_watermark_ok() returned 0, when
an order=3 allocation came in.
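
For anyone following along without the script, here is a minimal sketch of
that check. It is not the buddyinfo_scroll script itself, only an
approximation modeled on the 2.6.36-era zone_watermark_ok() loop. The
function name watermark_ok is hypothetical, lowmem_reserve and the
ALLOC_HIGH/ALLOC_HARDER adjustments are ignored, and mark=2048 is an
assumption inferred from the "<= 256" column (256 << 3), not a value read
from the machine.

```python
def watermark_ok(nr_free_per_order, order, mark):
    """Approximate the per-order watermark check for one zone.

    nr_free_per_order: free block counts for order 0, 1, 2, ...
    Returns (ok, free_pages, min_pages) as of the last comparison made.
    Simplification: lowmem_reserve and alloc_flags are not modeled.
    """
    # Total free pages in the zone: a block of order o holds 2**o pages.
    total = sum(n << o for o, n in enumerate(nr_free_per_order))
    free_pages = total - (1 << order) + 1
    min_pages = mark
    if free_pages <= min_pages:
        return False, free_pages, min_pages
    for o in range(order):
        # Blocks smaller than the request cannot satisfy it.
        free_pages -= nr_free_per_order[o] << o
        # Require fewer free pages at each higher order.
        min_pages >>= 1
        if free_pages <= min_pages:
            return False, free_pages, min_pages
    return True, free_pages, min_pages


# DMA32 rows from the table above: the first row, and the "!!!" row.
first_row = [19026, 33652, 4897, 13, 5, 1, 2, 0, 0, 0, 0]
bang_row = [19856, 34348, 4109, 2, 5, 1, 2, 0, 0, 0, 0]
print(watermark_ok(first_row, order=3, mark=2048))
print(watermark_ok(bang_row, order=3, mark=2048))
```

Under those assumptions the first row passes with 337 pages against a
threshold of 256, and the "!!!" row fails with 249, matching the last
column of the table.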

Maybe try out that script and see if you see something similar.
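
If fetching the script is not convenient, the raw per-order counts can also
be read straight from /proc/buddyinfo. The following is a rough sketch of a
parser, assuming the usual "Node N, zone NAME count count ..." line layout.
The function name parse_buddyinfo is hypothetical, and the sample text
reuses two rows from the table above with node 0 assumed.

```python
def parse_buddyinfo(text):
    """Map (node, zone) to the list of free block counts per order."""
    zones = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected shape: Node 0, zone Normal 450 0 0 ...
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        node = int(fields[1].rstrip(","))
        zones[(node, fields[3])] = [int(n) for n in fields[4:]]
    return zones


# Sample input built from the table above (node number is an assumption).
sample = """\
Node 0, zone    DMA32  19856  34348   4109      2      5      1      2      0      0      0      0
Node 0, zone   Normal    450      0      0      0      0      0      0      0      0      0      0
"""

for (node, zone), counts in parse_buddyinfo(sample).items():
    # A block of order o holds 2**o pages.
    nr_free = sum(n << order for order, n in enumerate(counts))
    print(node, zone, nr_free)
```

On a live system, pass open("/proc/buddyinfo").read() instead of the
sample string, and poll it in a loop to watch the order=3 column drain.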

Simon-


