From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: with ECARTIS (v1.0.0; list xfs); Mon, 17 Dec 2007 15:38:06 -0800 (PST)
Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130])
	by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id lBHNbwKR027030
	for ; Mon, 17 Dec 2007 15:38:01 -0800
Date: Tue, 18 Dec 2007 10:37:59 +1100
From: David Chinner
Subject: Re: Issue with 2.6.23 and drbd 8.0.7
Message-ID: <20071217233759.GB4396912@sgi.com>
References: <20071217143655.chiehahh@trusted.lncsa.com> <20071217220354.GU4396912@sgi.com> <4766F58C.8040000@lncsa.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4766F58C.8040000@lncsa.com>
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
List-Id: xfs
To: Laurent CARON
Cc: David Chinner , xfs@oss.sgi.com

On Mon, Dec 17, 2007 at 11:17:48PM +0100, Laurent CARON wrote:
> David Chinner wrote:
> > The symptoms you see are the machine running out of memory and the OOM
> > killer being invoked. There's nothing XFS here - you'd do better to post
> > to lkml about this.
>
> So, I was wrong .... :$
>
> > Hmmm - you appear to have a highmem based box and have run out of
> > low memory for the kernel. So while having ~9.5GB of free high
> > memory (that the kernel can't directly use), you're out of low
> > memory that the kernel can use and hence it is going OOM. The
> > output of /proc/slabinfo or watching slabtop will tell you where
> > most of this memory is going.
>
> Please find attached the output from /proc/slabinfo from both servers,
> as well as output from slabtop from server 1.
>
> >
> > FWIW, I suggest upgrading to a 64 bit machine ;)
>
> I'm currently migrating those 2 servers to 2 64 Bit setups ;)
>
> Thanks for your advice.
>
> Laurent

> slabinfo - version: 2.1 (statistics)
> # name : tunables : slabdata : globalstat : cpustat
> xfs_inode        227129  245574  408   9  1 : tunables 32 16 8 : slabdata 27286 27286
> xfs_vnode        227106  243130  392  10  1 : tunables 32 16 8 : slabdata 24313 24313
> radix_tree_node   88310   88356  312  12  1 : tunables 32 16 8 : slabdata  7363  7363
> dentry           170738  215280  160  24  1 : tunables 32 16 8 : slabdata  8970  8970
> buffer_head      150095  460752   80  48  1 : tunables 32 16 8 : slabdata  9599  9599

> slabinfo - version: 2.1 (statistics)
> xfs_inode        386493  386505  408   9  1 : tunables 32 16 8 : slabdata 42945 42945
> xfs_vnode        386491  386510  392  10  1 : tunables 32 16 8 : slabdata 38651 38651
> radix_tree_node   56266   56292  312  12  1 : tunables 32 16 8 : slabdata  4691  4691
> dentry           425976  425976  160  24  1 : tunables 32 16 8 : slabdata 17749 17749
> buffer_head      794845  794976   80  48  1 : tunables 32 16 8 : slabdata 16562 16562

>  Active / Total Objects (% used)    : 1031308 / 1501486 (68.7%)
>  Active / Total Slabs (% used)      : 87577 / 87659 (99.9%)
>  Active / Total Caches (% used)     : 116 / 179 (64.8%)
>  Active / Total Size (% used)       : 275759.16K / 331390.36K (83.2%)
>  Minimum / Average / Maximum Object : 0.04K / 0.22K / 4096.00K
>
>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> 460752 150236  32%    0.08K   9599       48     38396K buffer_head
> 244413 225674  92%    0.40K  27157        9    108628K xfs_inode
> 242010 225657  93%    0.38K  24201       10     96804K xfs_vnode
> 215280 171465  79%    0.16K   8970       24     35880K dentry
>  88368  88272  99%    0.30K   7364       12     29456K radix_tree_node

Hmmm - no real surprises there, but the numbers are well lower than
the ~960MB low memory limit. I suspect that there's something at
around 2.55am that does a filesystem traversal and that blows out
the memory usage of these slab caches and you run out of lowmem...

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
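
For anyone wanting to cross-check the figures quoted above: the memory held by a slab cache can be estimated as num_slabs * pages_per_slab * page size. The sketch below is only an illustration, not part of the original exchange. It assumes 4 KiB pages (usual on 32-bit x86, but not stated in the thread) and the trimmed row layout of the attached slabinfo output; the names `parse_slabinfo_row`, `cache_sizes_kib` and `SERVER1_ROWS` are made up for the example.

```python
PAGE_SIZE_KIB = 4  # assumption: 4 KiB pages; the thread does not state the page size

# Rows copied from the first slabinfo listing quoted in the message.
SERVER1_ROWS = [
    "xfs_inode 227129 245574 408 9 1 : tunables 32 16 8 : slabdata 27286 27286",
    "xfs_vnode 227106 243130 392 10 1 : tunables 32 16 8 : slabdata 24313 24313",
    "radix_tree_node 88310 88356 312 12 1 : tunables 32 16 8 : slabdata 7363 7363",
    "dentry 170738 215280 160 24 1 : tunables 32 16 8 : slabdata 8970 8970",
    "buffer_head 150095 460752 80 48 1 : tunables 32 16 8 : slabdata 9599 9599",
]


def parse_slabinfo_row(line):
    """Return (cache name, estimated size in KiB) for one trimmed slabinfo row.

    Assumed layout (as quoted in the message):
      name active_objs num_objs objsize objperslab pagesperslab
        : tunables ... : slabdata active_slabs num_slabs
    """
    head, _tunables, slabdata = [part.split() for part in line.split(":")]
    name = head[0]
    pages_per_slab = int(head[5])
    num_slabs = int(slabdata[2])
    return name, num_slabs * pages_per_slab * PAGE_SIZE_KIB


def cache_sizes_kib(rows):
    """Map each cache name to its estimated memory footprint in KiB."""
    return dict(parse_slabinfo_row(row) for row in rows)


if __name__ == "__main__":
    sizes = cache_sizes_kib(SERVER1_ROWS)
    for name, kib in sorted(sizes.items(), key=lambda item: -item[1]):
        print(f"{name:16s} {kib:8d} KiB")
    print(f"{'total':16s} {sum(sizes.values()):8d} KiB")
```

Under those assumptions the five listed caches come out well under the ~960MB lowmem figure, and the buffer_head estimate matches the 38396K that slabtop reports for the same number of slabs.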