From mboxrd@z Thu Jan 1 00:00:00 1970
From: Paul Furness
Subject: Something's eating my memory...
Date: Wed, 27 Aug 2003 11:28:56 +0100
Sender: linux-admin-owner@vger.kernel.org
Message-ID: <1061980136.13066.48.camel@Zebra.vil.ite.mee.com>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Return-path:
List-Id:
Content-Type: text/plain; charset="us-ascii"
To: linux-admin@vger.kernel.org

Hi.

Can someone help me? I'm running out of memory on my production server, and I can't figure out exactly why.

It's a pretty new machine (about 4 months old), and the spec is: dual 2.8GHz Xeon CPUs, 1G memory, SCSI hard disks, and an external SCSI RAID controller. It is based on a build of Red Hat 7.3 plus Red Hat's released patches, and I have added the 2.4.21 kernel, patched to support LVM and XFS.

The primary (only?!) job of the server is to be a file server, offering NFS and Samba shares to servers and workstations; this includes the users' home directories.

When I first built it, it worked like a dream, but of course it wasn't under a big load. Over time (about a month) I ramped up the load by adding the various shares to the machine and making them available to users.

Over the last week, there have been a number of occasions when it ground almost to a complete halt; the rest of the time it performed just fine. It looks like it's having trouble when it gets hammered by everyone logging out at the end of the day (we have roaming profiles on Windows workstations, using a Samba domain controller. As an aside: it works really well; I can't imagine why anyone would ever want an actual Windows server... ;). Anyhow, this heavy loading is to be expected.

When I run top or free, it tells me that almost all of the memory is used, but it doesn't seem to be actually used by anything; the total memory used by the processes that top is showing is about 80M. Buffers is showing up as anything between about 450M and 700M.
Clearly, the performance issue is happening when the memory fills up and it starts swapping. Here's an example output from free:

             total       used       free     shared    buffers     cached
Mem:       1032104    1019200      12904          0       2100     342172
-/+ buffers/cache:     674928     357176
Swap:      2096472       3508    2092964

I don't mind putting more memory into the server if this is the solution, but I need to be sure that it will actually help - if I put in another G and it fills up just the same, I'm back where I started but a little bit poorer!

My problems are:

1. I don't really understand how the buffers are allocated, or why, and whether changing this would help performance.
2. There seems to be at least 150-200M of memory that I can't account for.

Can anyone point me to where I can find out about the buffers, what they are and how they work?

Can anyone suggest some accurate performance monitoring software that I can use to find out what exactly is happening when the server grinds to a halt? I guess I really need to know where the memory is going and possibly the disk activity. Gkrellm is sort of useful, but I really need something a bit more determined :)

As always, any and all suggestions much appreciated.

Paul.
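
For reference, the "-/+ buffers/cache" row in the free output quoted above can be reproduced from the "Mem:" row by simple arithmetic. The following is only a minimal Python sketch of that calculation, assuming the column layout shown in this message and values in KB; the function names parse_mem_row and buffers_cache_adjusted are made up for illustration and are not part of any tool.

```python
# Sample copied from the free output quoted in this message (values in KB).
FREE_OUTPUT = """\
             total       used       free     shared    buffers     cached
Mem:       1032104    1019200      12904          0       2100     342172
-/+ buffers/cache:     674928     357176
Swap:      2096472       3508    2092964
"""


def parse_mem_row(text):
    """Return the 'Mem:' row as a dict keyed by the header column names."""
    lines = text.splitlines()
    header = lines[0].split()
    for line in lines[1:]:
        if line.startswith("Mem:"):
            values = [int(v) for v in line.split()[1:]]
            return dict(zip(header, values))
    raise ValueError("no 'Mem:' row found")


def buffers_cache_adjusted(mem):
    """Recompute the two figures on the '-/+ buffers/cache' row."""
    # Memory in use once buffers and page cache are subtracted.
    used_excl_cache = mem["used"] - mem["buffers"] - mem["cached"]
    # Memory that is free or held only as buffers/cache.
    free_incl_cache = mem["free"] + mem["buffers"] + mem["cached"]
    return used_excl_cache, free_incl_cache


mem = parse_mem_row(FREE_OUTPUT)
used_excl_cache, free_incl_cache = buffers_cache_adjusted(mem)
print(used_excl_cache, free_incl_cache)  # 674928 357176
```

The two numbers match the second row of the quoted output, so roughly 659M is in use after buffers and cache are subtracted. The sketch only restates what free already reports; it does not say what is holding that memory.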