From: Dave Chinner <david@fromorbit.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Claudio Martins <ctpm@ist.utl.pt>, linux-kernel@vger.kernel.org
Subject: Re: Order 0 page allocation failure under heavy I/O load
Date: Mon, 27 Oct 2008 21:17:01 +1100
Message-ID: <20081027101701.GD4985@disturbed>
In-Reply-To: <1225094696.16159.8.camel@twins>

On Mon, Oct 27, 2008 at 09:04:56AM +0100, Peter Zijlstra wrote:
> On Mon, 2008-10-27 at 17:22 +1100, Dave Chinner wrote:
> > On Mon, Oct 27, 2008 at 06:47:31AM +0100, Claudio Martins wrote:
> > > On Sunday 26 October 2008, Dave Chinner wrote:
> > >
> > > > The host will hang for tens of seconds at a time with both CPU cores
> > > > pegged at 100%, and eventually I get this in dmesg:
> > > >
> > > > [1304740.261506] linux: page allocation failure. order:0, mode:0x10000
> > > > [1304740.261516] Pid: 10705, comm: linux Tainted: P 2.6.26-1-amd64
>
> > No, because I've found the XFS bug the workload was triggering so
> > I don't need to run it anymore.
> >
> > I reported the problem because it appears that we've reported an
> > allocation failure without very much reclaim scanning (64 pages in
> > DMA zone, 0 pages in DMA32 zone), and there is apparently pages
> > available for allocation in the DMA zone:
> >
> > [1304740.262136] Node 0 DMA: 160*4kB 82*8kB 32*16kB 11*32kB 8*64kB 4*128kB 3*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 8048kB
> >
> > So it appears that memory reclaim has not found the free pages it
> > apparently has available....
> >
> > Fundamentally, I/O from a single CPU to a single disk on a machine
> > with 2GB RAM should not be able to cause allocation failures at all,
> > especially when the I/O is pure data I/O to a single file. Something
> > in the default config is busted if I can do that, and that's why
> > I reported the bug.
>
> The allocation is 'mode:0x10000', which is __GFP_NOMEMALLOC. That means
> the allocation doesn't have __GFP_WAIT, so it cannot do reclaim, it
> doesn't have __GFP_HIGH so it can't access some emergency reserves.

How did we get a gfp_mask with only __GFP_NOMEMALLOC? A mempool
allocation sets that and many more flags (like __GFP_NORETRY) but
they aren't present in that mask....

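
To make the mask question concrete, here is a rough user-space sketch that
decodes a gfp mask and shows what a mempool-style first attempt would be
expected to look like. The flag values are an assumption: they are what I
recall from include/linux/gfp.h around 2.6.26 and they differ between kernel
versions. mempool_first_try() is a hypothetical helper that paraphrases my
reading of mempool_alloc(), and the GFP_NOIO caller is only an example input,
not something taken from the trace.

```python
# Flag values as recalled from a 2.6.26-era include/linux/gfp.h.
# ASSUMPTION: verify against your own tree, the bits move between versions.
GFP_FLAGS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x100: "__GFP_COLD",
    0x200: "__GFP_NOWARN",
    0x400: "__GFP_REPEAT",
    0x800: "__GFP_NOFAIL",
    0x1000: "__GFP_NORETRY",
    0x4000: "__GFP_COMP",
    0x8000: "__GFP_ZERO",
    0x10000: "__GFP_NOMEMALLOC",
    0x20000: "__GFP_HARDWALL",
}
BIT = {name: bit for bit, name in GFP_FLAGS.items()}


def decode_gfp(mask):
    """Return the names of the known flags set in mask, lowest bit first."""
    names = [name for bit, name in sorted(GFP_FLAGS.items()) if mask & bit]
    known = sum(bit for bit in GFP_FLAGS if mask & bit)
    if mask & ~known:
        # Keep any bits this table does not know about visible.
        names.append(hex(mask & ~known))
    return names


def mempool_first_try(gfp_mask):
    """Hypothetical helper: the mask a mempool-style allocator would use
    for its first, non-blocking attempt (paraphrased, not copied)."""
    gfp_mask |= BIT["__GFP_NOMEMALLOC"] | BIT["__GFP_NORETRY"] | BIT["__GFP_NOWARN"]
    return gfp_mask & ~(BIT["__GFP_WAIT"] | BIT["__GFP_IO"])


if __name__ == "__main__":
    print(hex(0x10000), decode_gfp(0x10000))
    gfp_noio = BIT["__GFP_WAIT"]  # example caller mask, an assumption
    first = mempool_first_try(gfp_noio)
    print(hex(first), decode_gfp(first))
```

Under those assumptions the reported mode decodes to __GFP_NOMEMALLOC alone,
while the mempool-style mask would also carry __GFP_NORETRY and __GFP_NOWARN,
which is the discrepancy being asked about.
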
> The DMA stuff is special, and part of it is guarded for anything but
> __GFP_DMA allocations.

So if it wasn't a __GFP_DMA allocation - then what ran out of
memory? There appeared to be memory available in the DMA32 zone....

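
For reference, the DMA zone free-list line quoted earlier can be checked
mechanically. This is a minimal sketch; parse_buddy_line() is a hypothetical
helper and the input is the line from the report with the timestamp dropped.

```python
import re


def parse_buddy_line(line):
    """Return {block_size_kb: free_block_count} from 'count*sizekB' entries."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


LINE = ("Node 0 DMA: 160*4kB 82*8kB 32*16kB 11*32kB 8*64kB 4*128kB "
        "3*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 8048kB")

blocks = parse_buddy_line(LINE)
# Sum of block size times block count across all orders.
total_kb = sum(size * count for size, count in blocks.items())

if __name__ == "__main__":
    print("free order-0 (4kB) blocks:", blocks[4])
    print("total free in zone: %d kB" % total_kb)
```

The per-order counts add up to the 8048kB the kernel printed, and 160 of the
free blocks are order-0. That only shows the pages exist. Whether a given
allocation may take them still depends on the zone watermarks and lowmem
reserves, which is the protection being described for the DMA zone.
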
> You just ran the system very low on memory, and then tried an allocation
> that can't do anything about it.. I don't find it very surprising it
> fails.

I didn't run the system low on memory - the *kernel* did. The
page cache is holding most of memory, and most of that is clean:

Active:254755 inactive:180546 dirty:13547 writeback:20016 unstable:0
free:3059 slab:39487 mapped:141190 pagetables:16401 bounce:0

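
The counters above are in pages, so a quick conversion helps when reading
them. This is a rough sketch that assumes 4kB pages (amd64) and treats
Active + inactive as "pages on the LRU lists"; that total includes anonymous
memory as well as page cache, so it is an upper bound on the page cache.
parse_counters() and mib() are hypothetical helpers.

```python
import re

PAGE_KB = 4  # ASSUMPTION: 4kB pages, as on amd64

STATS = ("Active:254755 inactive:180546 dirty:13547 writeback:20016 unstable:0 "
         "free:3059 slab:39487 mapped:141190 pagetables:16401 bounce:0")


def parse_counters(text):
    """Return {name: pages} for every 'name:count' pair, names lowercased."""
    return {name.lower(): int(value)
            for name, value in re.findall(r"(\w+):(\d+)", text)}


def mib(pages):
    """Convert a page count to MiB using the assumed page size."""
    return pages * PAGE_KB / 1024


counters = parse_counters(STATS)
lru_pages = counters["active"] + counters["inactive"]
not_clean = counters["dirty"] + counters["writeback"]

if __name__ == "__main__":
    print("LRU pages: %d (%.0f MiB)" % (lru_pages, mib(lru_pages)))
    print("dirty + writeback: %d (%.1f%% of LRU)"
          % (not_clean, 100.0 * not_clean / lru_pages))
    print("free: %d pages (%.1f MiB)" % (counters["free"], mib(counters["free"])))
```

With those assumptions there is roughly 1700 MiB on the LRU lists, under 8%
of it dirty or under writeback, against about 12 MiB free.
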
> The 'bug' if any, is having such a poor allocation within your IO path.
> Not something to blame on the VM.

The I/O path started with a page fault and a call to
balance_dirty_pages_ratelimited_nr(). i.e. all the I/O is being done
by the VM and the allocation failure appears to be caused by
the VM holding all the clean free memory in the page cache where
the I/O layers can't access it. That really does seem like a VM
balance problem to me, not an I/O layer problem....

Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com