public inbox for linux-kernel@vger.kernel.org
* Re: SMP performance problem in 2.4 (was: Athlon spinlock performance)
@ 2003-08-06 17:21 Manfred Spraul
  0 siblings, 0 replies; 7+ messages in thread
From: Manfred Spraul @ 2003-08-06 17:21 UTC (permalink / raw)
  To: Scott L. Burson; +Cc: linux-kernel

Scott wrote:

>The problem is in `try_to_free_pages' and its associated routines,
>`shrink_caches' and `shrink_cache', in `mm/vmscan.c'.  After I made some
>changes to greatly reduce lock contention in the slab allocator and
>`shrink_cache',
>
How did you change the slab locking?

> and then instrumented `shrink_cache' to see what it was
>doing, the problem showed up very clearly.
>
>In one approximately 60-second period with the problematic workload running, 
>`try_to_free_pages' was called 511 times.  It made 2597 calls to
>`shrink_caches', which made 2592 calls to `shrink_cache' (i.e. it was very
>rare for `kmem_cache_reap' to release enough pages itself).
>
2.6 contains a simple fix: I've removed kmem_cache_reap. Instead the 
code checks for empty pages in the slab caches every other second.

--
    Manfred


* SMP performance problem in 2.4 (was: Athlon spinlock performance)
@ 2003-08-02 20:03 Scott L. Burson
  2003-08-02 21:44 ` Andrew Morton
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Scott L. Burson @ 2003-08-02 20:03 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Mathieu.Malaterre, Kanoj

Hi all,

Well, I have found the smoking gun as to what is causing the performance
problems that I have been seeing on my dual Athlon box (Tyan S2466, dual
Athlon MP 2800+, 2.5GB memory).

As it turns out, it doesn't have anything to do with the Athlon.  The reason I
thought it did was that a dual Pentium box that I have access to (2.4GHz,
2GB memory) has no trace of the problem.  That machine is running Red Hat
7.2, which is 2.4.7-based.  It didn't occur to me that there might have been
a severe performance problem introduced into the kernel sometime between
2.4.7 and 2.4.18, but that, it turns out, is exactly what happened.

The problem is in `try_to_free_pages' and its associated routines,
`shrink_caches' and `shrink_cache', in `mm/vmscan.c'.  After I made some
changes to greatly reduce lock contention in the slab allocator and
`shrink_cache', and then instrumented `shrink_cache' to see what it was
doing, the problem showed up very clearly.

In one approximately 60-second period with the problematic workload running, 
`try_to_free_pages' was called 511 times.  It made 2597 calls to
`shrink_caches', which made 2592 calls to `shrink_cache' (i.e. it was very
rare for `kmem_cache_reap' to release enough pages itself).  The main loop
of `shrink_cache' was executed -- brace yourselves -- 189 million times!
During that time it called `page_cache_release' on only 31265 pages.

`shrink_cache' didn't even exist in 2.4.7.  Whatever mechanism 2.4.7 had for
releasing pages was evidently much more time-efficient, at least in the
particular situation I'm looking at.

Clearly the kernel group has been aware of the problems with `shrink_cache',
as I see that it has received quite a bit of attention in the course of 2.5
development.  I am hopeful that the problem will be substantially
ameliorated in 2.6.0.  (The comment at the top of `try_to_free_pages' --
"This is a fairly lame algorithm - it can result in excessive CPU burning"
-- suggests it won't be cured entirely.)

However, it seems the kernel group may not have been aware of just how bad
the problem can be in recent 2.4 kernels on dual-processor machines with
lots of memory.  It's bad enough that running two `find' jobs at the same
time on large filesystems can bring the machine pretty much to its knees.

There are many things about this code I don't understand, but the most
puzzling is this.  When `try_to_free_pages' is called, it sets out to free
32 pages (the value of `SWAP_CLUSTER_MAX').  It's prepared to do a very
large amount of work to accomplish this goal, and if it fails, it will call
`out_of_memory'.  Given that, what's odd is that it's being called when
memory isn't even close to being full (there's a good 800MB free, according
to `top').  It seems crazy that `out_of_memory' might be called when there
are hundreds of MB of free pages, just because `shrink_caches' couldn't find 
32 pages to free.  It suggests to me that `try_to_free_pages' is being
called in two contexts: one when a page allocation fails, and the other just 
for general cleaning, and that it's the latter context that's causing the
problem.

I will do some more instrumentation to try to verify this.  Comments
solicited.

-- Scott



end of thread, other threads:[~2003-08-06 17:22 UTC | newest]

Thread overview: 7+ messages
2003-08-06 17:21 SMP performance problem in 2.4 (was: Athlon spinlock performance) Manfred Spraul
  -- strict thread matches above, loose matches on Subject: below --
2003-08-02 20:03 Scott L. Burson
2003-08-02 21:44 ` Andrew Morton
2003-08-03 10:00   ` Scott L. Burson
2003-08-06  2:37     ` Rik van Riel
2003-08-03  2:40 ` Rik van Riel
2003-08-03  5:18 ` Con Kolivas
