From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Morton
Subject: Re: Bufferheads & page-cache reference
Date: Tue, 15 Feb 2005 09:26:13 -0800
Message-ID: <20050215092613.16b3252c.akpm@osdl.org>
References: <1108409415.20053.1278.camel@dyn318077bld.beaverton.ibm.com>
	<20050214134058.1402cfed.akpm@osdl.org>
	<20050214221044.GX13009@holomorphy.com>
	<20050214143142.6c12fdb3.akpm@osdl.org>
	<20050214225033.GY13009@holomorphy.com>
	<1108426969.20053.1346.camel@dyn318077bld.beaverton.ibm.com>
	<20050214185733.7ef3a613.akpm@osdl.org>
	<1108483396.20053.1379.camel@dyn318077bld.beaverton.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: wli@holomorphy.com, linux-fsdevel@vger.kernel.org, ext2-devel@lists.sourceforge.net
To: Badari Pulavarty
In-Reply-To: <1108483396.20053.1379.camel@dyn318077bld.beaverton.ibm.com>
Sender: ext2-devel-admin@lists.sourceforge.net
Errors-To: ext2-devel-admin@lists.sourceforge.net
List-Id: linux-fsdevel.vger.kernel.org

Badari Pulavarty wrote:
>
> On Mon, 2005-02-14 at 18:57, Andrew Morton wrote:
> > Badari Pulavarty wrote:
> > >
> > > Most DB2 customers use a filesystem for their database. Under load,
> > > they complain that the entire memory in the system is used by filesystem
> > > pagecache, free memory is very low and the system starts swapping like
> > > crazy OR they see lots of memory allocation failures and the OOM killer
> > > kills db2. slabinfo shows lots of bufferheads and the VM folks claim that
> > > bufferheads are holding a ref. on the pages, so they can't use them. So, I
> > > want to find the truth in the story and find out what exactly is happening
> > > here and which one to blame (VM or FS or IO problems) ?
> > >
> > > BTW, all these are on 2.4 kernels and I don't have a reproducible testcase
> > > :(
> > >
> > > Feb 7 05:35:17 nmcopsu41 kernel: ENOMEM in do_get_write_access,
> > > retrying.
> > >
> > Do these machines have a large amount of highmem?
> >
> > If so, yes, you can oom because lots of highmem pages have buffer_heads
> > attached and you've run out of lowmem. The 2.4 VM will go off looking for
> > lowmem pages to reclaim and will ignore the highmem pages because there's
> > no highmem shortage. Consequently those buffer_heads don't get freed up
> > and we're unable to reclaim any lowmem -> oom.
> >
> > Andrea did a patch a long time ago (it'll be in suse 2.4 kernels) which,
> > under these circumstances, strips the buffers from those highmem pages when
> > they're encountered on the LRU. From a quick read it seems that that patch
> > is not in current 2.4 kernels.
> >
> > It's harder to do that in 2.6 because we have a separate LRU per zone.
>
> Our DB2 folks *claim* to have seen this problem with both ia32 and AMD64
> customers. So, I am not sure if it's really only highmem related. The only
> workaround seems to be to configure DB2 not to use more than 1.5GB on an
> 8GB RAM system :(

It shouldn't happen on amd64.

> I have nothing much to go on, other than looking at data from a sick
> machine. What should I be looking at, to narrow down the problem
> some more ?

/proc/meminfo and /proc/slabinfo (especially the buffer_head line)

> BTW, none of these BIG customers will take a patch to figure out
> what's happening (since it's on their production system) :(
>

Yup. What kernel(s) are they running? I _think_ only suse have fixed that
problem.
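
For the /proc/meminfo and /proc/slabinfo check mentioned above, here is a rough sketch of how the relevant numbers could be pulled out of a capture taken on a sick machine. It is not from the thread. The function names are made up for illustration, and it assumes only that meminfo has "Name: value kB" rows (LowTotal, LowFree and HighFree exist on highmem-enabled ia32 kernels) and that the first three numeric columns of a slabinfo row are active objects, total objects and object size. The slab estimate ignores per-slab overhead, so treat it as a lower bound.

```python
def parse_meminfo(text):
    """Return {name: first numeric value} for 'Name:  1234 kB' style rows.

    Rows whose first field after the colon is not a number (such as the
    2.4 'total: used: free:' header row) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def slab_usage_kb(text, cache="buffer_head"):
    """Estimate the kB held by one slab cache as num_objs * objsize.

    Assumes columns: name, active_objs, num_objs, objsize, ...
    Returns None if the cache is not listed.
    """
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == cache:
            num_objs, objsize = int(parts[2]), int(parts[3])
            return num_objs * objsize // 1024
    return None


def summarize(meminfo_text, slabinfo_text):
    """Collect the lowmem figures next to the buffer_head footprint."""
    mem = parse_meminfo(meminfo_text)
    return {
        "low_total_kb": mem.get("LowTotal"),   # None without highmem
        "low_free_kb": mem.get("LowFree"),
        "high_free_kb": mem.get("HighFree"),
        "buffer_head_kb": slab_usage_kb(slabinfo_text),
    }
```

To use it, save the two files from the affected box (reading /proc/slabinfo may need root) and pass their contents to summarize(). A large buffer_head_kb alongside a tiny low_free_kb and plenty of high_free_kb would be consistent with the lowmem exhaustion described above, though it does not prove it.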