linux-admin.vger.kernel.org archive mirror
* Something's eating my memory...
@ 2003-08-27 10:28 Paul Furness
  2003-08-27 11:12 ` Benjamin Walkenhorst
  2003-08-27 11:15 ` Bruce Ferrell
  0 siblings, 2 replies; 3+ messages in thread
From: Paul Furness @ 2003-08-27 10:28 UTC
  To: linux-admin

Hi.

Can someone help me? I'm running out of memory on my production server,
and I can't figure out exactly why.

It's a pretty new machine (about 4 months old), and the spec is: dual
2.8GHz Xeon CPU, 1G memory, SCSI hard disks, and an external SCSI RAID
controller. It is based on a build of RedHat 7.3 + redhat released
patches, and I have added the 2.4.21 kernel patched to support LVM and
XFS. 

The primary (only?!) job of the server is to be a file server, offering
nfs and samba shares to servers and workstations; this includes the
users' home directories.

When I first built it, it worked like a dream, but of course it wasn't
under a big load. Over time (about a month) I ramped up the load by
adding the various shares to the machine and making them available to
users.

Over the last week, there have been a number of occasions when it ground
almost to a complete halt; the rest of the time it performed just fine.
It looks like it's having trouble when it gets hammered by everyone
logging out at the end of the day (We have roaming profiles on windows
workstations, using a samba domain controller. As an aside: it works
really well; I can't imagine why anyone would ever want an actual
windows server... ;). Anyhow, this heavy loading is to be expected.

When I run top or free, it tells me that almost all of the memory is
used, but it doesn't seem to be actually used by anything; the total
memory used by the processes that top is showing is about 80M. Buffers
is showing up as anything between about 450M and 700M.  Clearly, the
performance issue is happening when the memory fills up and it starts
swapping. Here's an example output from free:

                 total       used       free     shared    buffers     cached
    Mem:       1032104    1019200      12904          0       2100     342172
    -/+ buffers/cache:     674928     357176
    Swap:      2096472       3508    2092964
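
As a cross-check on the figures above, the "-/+ buffers/cache" row can be
reproduced from the "Mem:" row. The following is only a minimal sketch in
Python, assuming the older free layout shown here, in which "used" still
includes buffers and page cache; the function name adjust_for_cache is
made up for illustration.

```python
def adjust_for_cache(used, free, buffers, cached):
    """Recompute the '-/+ buffers/cache' row from the 'Mem:' row (all in kB).

    Buffers and page cache are treated as available memory rather than
    as memory in use.
    """
    used_excluding_cache = used - buffers - cached
    free_including_cache = free + buffers + cached
    return used_excluding_cache, free_including_cache


# Figures copied from the "Mem:" row of the free output above.
total, used, free, buffers, cached = 1032104, 1019200, 12904, 2100, 342172

adjusted_used, adjusted_free = adjust_for_cache(used, free, buffers, cached)
print(adjusted_used)  # 674928, the "used" figure on the -/+ row
print(adjusted_free)  # 357176, the "free" figure on the -/+ row

# The two adjusted figures still add up to the total.
assert adjusted_used + adjusted_free == total
```

By free's own accounting, then, about 659M (674928 kB) is in use outside
buffers and page cache in this particular sample.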

I don't mind putting more memory into the server if this is the
solution, but I need to be sure that it will actually help - if I put in
another G and it fills up just the same, I'm back where I started but a
little bit poorer!

My problems are: 
1. I don't really understand how the buffers are allocated, or why, and
whether changing this would help performance.
2. There seems to be at least 150-200M of memory that I can't account
for. 

Can anyone point me to where I can find out about the buffers, what they
are and how they work? 

Can anyone suggest some accurate performance monitoring software that I
can use to find out what exactly is happening when the server grinds to
a halt? I guess I really need to know where the memory is going and
possibly the disk activity. Gkrellm is sort of useful, but I really need
something a bit more determined :)

As always, any and all suggestions much appreciated.

Paul.



* Re: Something's eating my memory...
  2003-08-27 10:28 Something's eating my memory Paul Furness
@ 2003-08-27 11:12 ` Benjamin Walkenhorst
  2003-08-27 11:15 ` Bruce Ferrell
  1 sibling, 0 replies; 3+ messages in thread
From: Benjamin Walkenhorst @ 2003-08-27 11:12 UTC
  To: Paul Furness, linux-admin

Hello Paul,

Unfortunately, I am not able to give you definite answers, but maybe some 
hints... 

On Wednesday, 27 August 2003 12:28 Paul Furness wrote:

> It's a pretty new machine (about 4 months old), and the spec is: dual
> 2.8GHz Xeon CPU, 1G memory, SCSI hard disks, and an external SCSI RAID
> controller. It is based on a build of RedHat 7.3 + redhat released
> patches, and I have added the 2.4.21 kernel patched to support LVM and
> XFS.

a) What motherboard / NICs / controller(s) does the machine have? I'm not 
sure how much this applies to SMP boards, but a bad motherboard can sometimes 
badly screw up system performance, and I guess the same goes for other 
components when the system hits a sudden peak in load. 
b) Correct me if I'm wrong, but doesn't 2.4.21 include native support for LVM 
and XFS? Or do you need a patch to use XFS on LVM volumes? 

> The primary (only?!) job of the server is to be a file server, offering
> nfs and samba shares to servers and workstations; this includes the
> users' home directories.

A dual Xeon 2.8 seems like a lot for a file server; how many clients does it 
have to serve? 
Do you have all the latest patches installed? RH7.3 is quite old, I think, 
and there have been some patches (especially to Samba, I think) which I 
understand vastly improve Samba performance, especially under heavy load. 

> When I first built it, it worked like a dream, but of course it wasn't
> under a big load. Over time (about a month) I ramped up the load by
> adding the various shares to the machine and making them available to
> users.
>
> Over the last week, there have been a number of occasions when it ground
> almost to a complete halt; the rest of the time it performed just fine.
> It looks like it's having trouble when it gets hammered by everyone
> logging out at the end of the day (We have roaming profiles on windows
> workstations, using a samba domain controller. As an aside: it works
> really well; I can't imagine why anyone would ever want an actual
> windows server... ;). Anyhow, this heavy loading is to be expected.

I think your memory issue is not directly causing these performance problems. 
I do not know exactly how Samba and NFS work, but maybe the client sends all 
its remaining write data to the server, or maybe the server flushes its file 
buffers at logout?

>
> When I run top or free, it tells me that almost all of the memory is
> used, but it doesn't seem to be actually used by anything; the total
> memory used by the processes that top is showing is about 80M. Buffers
> is showing up as anything between about 450M and 700M.  Clearly, the
> performance issue is happening when the memory fills up and it starts
> swapping. Here's an example output from free:
>
>                  total       used       free     shared    buffers     cached
>     Mem:       1032104    1019200      12904          0       2100     342172
>     -/+ buffers/cache:     674928     357176
>     Swap:      2096472       3508    2092964
>
> I don't mind putting more memory into the server if this is the
> solution, but I need to be sure that it will actually help - if I put in
> another G and it fills up just the same, I'm back where I started but a
> little bit poorer!

You should take a look at the output from top or ps to see which processes 
use how much memory. 
Having lots of memory used for cache and buffers seems fine to me, because it 
reduces load on the hard drives.
I think that the system may be rather lazy about flushing its buffers, piling 
up lots of changes to the file system. When lots of clients then disconnect 
simultaneously, that forces, or at least prompts, the system to flush its 
write buffers, which may in fact slow down the system quite a bit. 
Or maybe the users all save their work just before logging out. This means 
that all day you get a write request every now and then, but a huge number of 
requests all at once at the end of the day. 
I suggest you give the Samba and NFS servers an upgrade and maybe install 
flushd (as an alternative, you can place an entry in crontab that runs "sync" 
at regular intervals). 
From what I read, I understand that Linux 2.6 will also come with much 
improved memory management, but who knows when? 
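
The periodic-sync idea above can also be sketched as a small loop. This is
a rough illustration in Python rather than the crontab entry itself; the
function name sync_periodically and the interval values are assumptions,
and whether regular flushing actually eases the end-of-day stall is
untested here. os.sync() needs Python 3.3 or later on a Unix-like system.

```python
import os
import time


def sync_periodically(interval_seconds, rounds, sync=os.sync, sleep=time.sleep):
    """Flush dirty file system buffers `rounds` times.

    Sleeps `interval_seconds` between consecutive flushes. The `sync` and
    `sleep` callables can be swapped out, which keeps the loop easy to
    dry-run without touching the disks.
    """
    for i in range(rounds):
        sync()
        if i < rounds - 1:
            sleep(interval_seconds)


# Dry run with stand-in callables: records the order of events only.
events = []
sync_periodically(
    interval_seconds=300,
    rounds=3,
    sync=lambda: events.append("sync"),
    sleep=lambda seconds: events.append(seconds),
)
print(events)  # ['sync', 300, 'sync', 300, 'sync']
```

Calling sync_periodically(300, 12) with the defaults would flush every five
minutes, twelve times in all. A crontab entry that runs "sync" does the
same job without a long-running process.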

You could also run process accounting, but I don't know much about that. 

As an alternative you could try FreeBSD; it is a) very good in overall 
performance, including under heavy load, and b) very good at memory 
management. It also has a software RAID driver, NFS and Samba are available 
too (surprise...), and it also supports SMP. So I think it might do the job 
just as well. 
(On the other hand, it completely lacks GUI-based administration tools, but 
system configuration is much tidier than with Linux.)

> As always, any and all suggestions much appreciated.
>
> Paul.

-- 
Benjamin Walkenhorst
eMail: krylon@gmx.net
homepage: http://www.krylon.de


* Re: Something's eating my memory...
  2003-08-27 10:28 Something's eating my memory Paul Furness
  2003-08-27 11:12 ` Benjamin Walkenhorst
@ 2003-08-27 11:15 ` Bruce Ferrell
  1 sibling, 0 replies; 3+ messages in thread
From: Bruce Ferrell @ 2003-08-27 11:15 UTC
  To: Paul Furness; +Cc: linux-admin

When I run into this type of problem, I turn to sysstat.  I use it in 
conjunction with bigbrother/larrd/rrdtool and a bigbrother plugin called 
bb-sar from www.deadcat.net.

I almost missed the NFS aspect of this.  Is the system load suddenly 
skyrocketing?  I would start looking for something NFS has exported being 
"taken" away.  I saw something similar on a mail server with 
autofs-mounted home directories.  We had an admin who liked to rearrange 
home directories.  It caused the system to get very confused and very 
unhappy, as NFS, even with soft mounts and the interruptible flag set, would 
keep pounding on the old, missing location.

This doesn't look like a memory allocation problem at all.  If kernel 
logging isn't enabled, enable it.  You'll get some good clues there.
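
Where sysstat is not yet installed, a memory snapshot can be taken straight
from /proc/meminfo. The following is a minimal stand-in sketch in Python,
not sysstat or bb-sar; the function names parse_meminfo and snapshot are
made up for illustration, and it assumes lines of the form "Name: value kB".
Anything else, such as the byte-count header rows that older kernels print,
is skipped.

```python
def parse_meminfo(text):
    """Return {field: kB} for lines shaped like 'MemFree:   12904 kB'."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name: <integer> kB" lines; skip everything else.
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def snapshot(path="/proc/meminfo"):
    """Read and parse the kernel's current memory counters (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


# Sample text built from the figures Paul posted, for illustration only.
sample = (
    "MemTotal:      1032104 kB\n"
    "MemFree:         12904 kB\n"
    "Buffers:          2100 kB\n"
    "Cached:         342172 kB\n"
)
info = parse_meminfo(sample)
print(info["MemFree"] + info["Buffers"] + info["Cached"])  # 357176
```

Running snapshot() from cron once a minute and appending the result to a log
would leave a record to look at after the server has ground to a halt.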



