From mboxrd@z Thu Jan  1 00:00:00 1970
From: Badari Pulavarty <pbadari@us.ibm.com>
Subject: Re: Bufferheads & page-cache reference
Date: 15 Feb 2005 08:03:17 -0800
Message-ID: <1108483396.20053.1379.camel@dyn318077bld.beaverton.ibm.com>
References: <1108409415.20053.1278.camel@dyn318077bld.beaverton.ibm.com>
	 <20050214134058.1402cfed.akpm@osdl.org>
	 <20050214221044.GX13009@holomorphy.com>
	 <20050214143142.6c12fdb3.akpm@osdl.org>
	 <20050214225033.GY13009@holomorphy.com>
	 <1108426969.20053.1346.camel@dyn318077bld.beaverton.ibm.com>
	 <20050214185733.7ef3a613.akpm@osdl.org>
Mime-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Cc: wli@holomorphy.com, linux-fsdevel@vger.kernel.org,
	ext2-devel <ext2-devel@lists.sourceforge.net>
Received: from e34.co.us.ibm.com ([32.97.110.132]:51166 "EHLO
	e34.co.us.ibm.com") by vger.kernel.org with ESMTP id S261765AbVBOQCa
	(ORCPT <rfc822;linux-fsdevel@vger.kernel.org>);
	Tue, 15 Feb 2005 11:02:30 -0500
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11])
	by e34.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id j1FG2TMN083136
	for <linux-fsdevel@vger.kernel.org>; Tue, 15 Feb 2005 11:02:29 -0500
Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168])
	by westrelay02.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id j1FG2Tp8438194
	for <linux-fsdevel@vger.kernel.org>; Tue, 15 Feb 2005 09:02:29 -0700
Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1])
	by d03av02.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id j1FG2SuM026997
	for <linux-fsdevel@vger.kernel.org>; Tue, 15 Feb 2005 09:02:28 -0700
To: Andrew Morton <akpm@osdl.org>
In-Reply-To: <20050214185733.7ef3a613.akpm@osdl.org>
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Mon, 2005-02-14 at 18:57, Andrew Morton wrote:
> Badari Pulavarty <pbadari@us.ibm.com> wrote:
> >
> >  Most DB2 customers use a filesystem for their database. Under load,
> >  they complain that all the memory in the system is used by filesystem
> >  pagecache, free memory is very low, and the system starts swapping like
> >  crazy OR sees lots of memory allocation failures and the OOM killer
> >  kills db2. slabinfo shows lots of bufferheads and VM folks claim that
> >  bufferheads are holding a ref on the pages, so they can't use them.
> >  So, I want to find the truth in the story and find out what exactly is
> >  happening here and which one to blame (VM or FS or IO problems)?
> > 
> >  BTW, all of this is on 2.4 kernels and I don't have a reproducible
> >  testcase :(
> > 
> >  Feb 7 05:35:17 nmcopsu41 kernel: ENOMEM in do_get_write_access,
> >  retrying.
> 
> Do these machines have a large amount of highmem?
> 
> If so, yes, you can oom because lots of highmem pages have buffer_heads
> attached and you've run out of lowmem.  The 2.4 VM will go off looking for
> lowmem pages to reclaim and will ignore the highmem pages because there's
> no highmem shortage.  Consequently those buffer_heads don't get freed up
> and we're unable to reclaim any lowmem -> oom.
> 
> Andrea did a patch a long time ago (it'll be in suse 2.4 kernels) which,
> under these circumstances, strips the buffers from those highmem pages when
> they're encountered on the LRU.  From a quick read it seems that that patch
> is not in current 2.4 kernels.
> 
> It's harder to do that in 2.6 because we have a separate LRU per zone.

Our DB2 folks *claim* to have seen this problem on both ia32 and AMD64
customer systems.  So I am not sure it's really only highmem related. The
only workaround seems to be configuring DB2 not to use more than 1.5GB on
an 8GB RAM system :(

I have nothing much to go on, other than looking at data from a sick
machine. What should I be looking at to narrow down the problem
some more?

BTW, none of these BIG customers will take a patch to figure out
what's happening (since it's on their production systems) :(
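
Since a kernel patch is out of the question, the only data available is
whatever user space can read on a sick machine. Below is a rough sketch
of one such check, not a tested diagnostic: it estimates how much memory
the buffer_head slab cache holds and compares it with LowTotal, which is
the situation Andrew describes (buffer_heads pinning lowmem while highmem
is still free). The function names are made up for illustration. It
assumes /proc/slabinfo lines start with "name active_objs total_objs
objsize", and that /proc/meminfo reports LowTotal/LowFree/HighFree in kB;
on a box without those fields the percentage is simply left out. Slab
overhead is ignored, so the number is a lower bound.

```python
def parse_meminfo(text):
    """Return {field: value} for 'Name:   123 kB' style lines."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip header lines and anything without a leading number.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def slab_footprint_kb(text, cache="buffer_head"):
    """Approximate kB held by one slab cache: total objects * object size.

    Assumes the columns are: name, active_objs, total_objs, objsize, ...
    Returns None if the cache is not listed.
    """
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == cache:
            total_objs, objsize = int(parts[2]), int(parts[3])
            return total_objs * objsize // 1024
    return None


def lowmem_report(meminfo_text, slabinfo_text):
    """Summarize buffer_head slab usage against lowmem, where reported."""
    mem = parse_meminfo(meminfo_text)
    bh_kb = slab_footprint_kb(slabinfo_text)
    low_total = mem.get("LowTotal")
    report = {
        "buffer_head_kb": bh_kb,
        "low_total_kb": low_total,
        "low_free_kb": mem.get("LowFree"),
        "high_free_kb": mem.get("HighFree"),
    }
    # Only meaningful when the kernel reports a lowmem/highmem split.
    if bh_kb is not None and low_total:
        report["buffer_head_pct_of_lowmem"] = round(100.0 * bh_kb / low_total, 1)
    return report
```

To use it, pass in the contents of /proc/meminfo and /proc/slabinfo
captured while the machine is misbehaving (reading slabinfo may need
root). A large buffer_head share of lowmem together with plenty of
HighFree would point toward the highmem explanation; a small share would
suggest looking elsewhere.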


Thanks,
Badari