public inbox for linux-kernel@vger.kernel.org
* 2.6.10: kswapd spins like crazy
From: Terje Fåberg @ 2005-02-03 10:29 UTC
  To: linux-kernel


I recently upgraded my desktop from 2.4.28 to
2.6.10. Even under moderate memory pressure, kswapd
regularly eats almost all available CPU time
whenever I/O throughput rises a little, for example
when copying large files. The system is extremely
sluggish while this happens, and the load average
climbs to 7.5 or more.
 
This is a Pentium3-866 with 768MB RAM, 2x1GB
swap partitions, vanilla 2.6.10. The strange
behaviour starts at about 200 MB of swap in use.
2.4.28 handles the same workload without any
problems.

vmstat:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 6  1 428012   4868  33236 347184   10    7   147   120  108   111 19 10 68  3

Is there anything I can do to track this down?

Regards, 
Terje
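
One way to start tracking this down is to log the vmstat columns over time and compare the swap and system-time columns before and during a kswapd spin. The following is a minimal sketch, not a finished tool: the function name parse_vmstat is hypothetical, and the sample is the header and data row from the vmstat output quoted above, joined onto one line each.

```python
def parse_vmstat(header: str, row: str) -> dict:
    """Pair each vmstat column name with its integer value."""
    names = header.split()
    values = [int(v) for v in row.split()]
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return dict(zip(names, values))


HEADER = " r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa"
ROW = " 6  1 428012   4868  33236 347184   10    7   147   120  108   111 19 10 68  3"

sample = parse_vmstat(HEADER, ROW)
# vmstat reports the memory columns in KiB by default.
swap_used_mib = sample["swpd"] / 1024
print(f"swap in use: {swap_used_mib:.0f} MiB, sys: {sample['sy']}%, idle: {sample['id']}%")
```

Feeding it one row per interval (for example from "vmstat 1") would give a time series that can be lined up against the moments the machine turns sluggish.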

* RE: 2.6.10: kswapd spins like crazy
From: Weathers, Norman R. @ 2005-02-04 16:16 UTC
  To: linux-kernel



We have had a similar problem with every kernel since 2.6.8.1.  It got
so bad that we had to drop back to 2.6.7 with some extra patches to get
our systems working.  Our situation is a little different.

We are using SMP Opteron boxes as NFS servers.  Under almost any load at
all, kswapd goes nuts, taking up 99% of the CPU cycles for long periods
of time.  With 2.6.7 it is not nearly as bad: just periods of about 3-5
seconds at 10-35% utilization, then off for a few seconds, then back
again.  Sometimes kswapd lingers longer as the most aggressive process
in top, but with 2.6.7 the nfsd threads are the most prevalent.

Also, we have noticed something else.  Our servers have dual Broadcom
gigabit NICs (Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet
(rev 03)).  We have bonded both NICs back to our core switch, both
running at gigabit speed.  Under different loads, we start to get call
traces in dmesg and the syslog.  An excerpt follows:


<Jan/06 03:50 pm>Call Trace:<IRQ> <ffffffff80158fa0>{__alloc_pages+816}
<ffffffff8013ffd3>{del_timer+115}
<Jan/06 03:50 pm>       <ffffffff80158fe0>{__get_free_pages+16}
<ffffffff8015c886>{kmem_getpages+38}
<Jan/06 03:50 pm>       <ffffffff8015d8be>{cache_grow+190}
<ffffffff8015db16>{cache_alloc_refill+422}
<Jan/06 03:50 pm>       <ffffffff8015de06>{kmem_cache_alloc+54}
<ffffffff802d5eaf>{dst_alloc+47}
<Jan/06 03:50 pm>       <ffffffff802e3d17>{ip_route_input_slow+1639}
<ffffffff803085bb>{udp_rcv+267}
<Jan/06 03:50 pm>       <ffffffff802e612e>{ip_rcv+526}
<ffffffff802d297d>{netif_receive_skb+477}
<Jan/06 03:50 pm>
<ffffffffa0120fe8>{:bcm5700:MM_IndicateRxPackets+920}
<Jan/06 03:50 pm>       <ffffffffa011c9fe>{:bcm5700:bcm5700_poll+158}
<ffffffff802d2b94>{net_rx_action+132}
<Jan/06 03:50 pm>       <ffffffff8013c4b1>{__do_softirq+113}
<ffffffff8013c565>{do_softirq+53}
<Jan/06 03:50 pm>       <ffffffff80113baf>{do_IRQ+335}
<ffffffff80111001>{ret_from_intr+0}
<Jan/06 03:50 pm>        <EOI> <ffffffff8031e419>{thread_return+41}
<ffffffff8010eb20>{default_idle+0}
<Jan/06 03:50 pm>       <ffffffff8010eb44>{default_idle+36}
<ffffffff8010ebdc>{cpu_idle+44}
<Jan/06 03:50 pm>       <ffffffff80517885>{start_kernel+453}
<Jan/06 03:50 pm>swapper: page allocation failure. order:0, mode:0x20

<Jan/06 03:50 pm>Call Trace:<IRQ> <ffffffff80158fa0>{__alloc_pages+816}
<ffffffff801158b4>{end_8259A_irq+100}
<Jan/06 03:50 pm>       <ffffffff80158fe0>{__get_free_pages+16}
<ffffffff8015c886>{kmem_getpages+38}
<Jan/06 03:50 pm>       <ffffffff8015d8be>{cache_grow+190}
<ffffffff8015db16>{cache_alloc_refill+422}
<Jan/06 03:50 pm>       <ffffffff8015de06>{kmem_cache_alloc+54}
<ffffffff802d5eaf>{dst_alloc+47}
<Jan/06 03:50 pm>       <ffffffff802e3d17>{ip_route_input_slow+1639}
<ffffffff802e612e>{ip_rcv+526}
<Jan/06 03:50 pm>       <ffffffff80131b2b>{try_to_wake_up+523}
<ffffffff802d297d>{netif_receive_skb+477}
<Jan/06 03:50 pm>
<ffffffffa0120fe8>{:bcm5700:MM_IndicateRxPackets+920}
<Jan/06 03:50 pm>       <ffffffffa011c9fe>{:bcm5700:bcm5700_poll+158}
<ffffffff802d2b94>{net_rx_action+132}
<Jan/06 03:50 pm>       <ffffffff8013c4b1>{__do_softirq+113}
<ffffffff8013c565>{do_softirq+53}
<Jan/06 03:50 pm>       <ffffffff80113baf>{do_IRQ+335}
<ffffffff80111001>{ret_from_intr+0}
<Jan/06 03:50 pm>        <EOI> <ffffffff8031e419>{thread_return+41}
<ffffffff8010eb20>{default_idle+0}
<Jan/06 03:50 pm>       <ffffffff8010eb44>{default_idle+36}
<ffffffff8010ebdc>{cpu_idle+44}
<Jan/06 03:50 pm>       <ffffffff80517885>{start_kernel+453}
<Jan/06 03:50 pm>swapper: page allocation failure. order:0, mode:0x20

<Jan/06 03:50 pm>Call Trace:<IRQ> <ffffffff80158fa0>{__alloc_pages+816}
<ffffffff801158b4>{end_8259A_irq+100}
<Jan/06 03:50 pm>       <ffffffff80158fe0>{__get_free_pages+16}
<ffffffff8015c886>{kmem_getpages+38}
<Jan/06 03:50 pm>       <ffffffff8015d8be>{cache_grow+190}
<ffffffff8015db16>{cache_alloc_refill+422}
<Jan/06 03:50 pm>       <ffffffff8015de06>{kmem_cache_alloc+54}
<ffffffff802d5eaf>{dst_alloc+47}
<Jan/06 03:50 pm>       <ffffffff802e3d17>{ip_route_input_slow+1639}
<ffffffff803085bb>{udp_rcv+267}
<Jan/06 03:50 pm>       <ffffffff802e612e>{ip_rcv+526}
<ffffffff802d297d>{netif_receive_skb+477}
<Jan/06 03:50 pm>
<ffffffffa0120fe8>{:bcm5700:MM_IndicateRxPackets+920}
<Jan/06 03:50 pm>       <ffffffffa011c9fe>{:bcm5700:bcm5700_poll+158}
<ffffffff802d2b94>{net_rx_action+132}
<Jan/06 03:50 pm>       <ffffffff8013c4b1>{__do_softirq+113}
<ffffffff8013c565>{do_softirq+53}
<Jan/06 03:50 pm>       <ffffffff80113baf>{do_IRQ+335}
<ffffffff80111001>{ret_from_intr+0}
<Jan/06 03:50 pm>        <EOI> <ffffffff8031e419>{thread_return+41}
<ffffffff8010eb20>{default_idle+0}
<Jan/06 03:50 pm>       <ffffffff8010eb44>{default_idle+36}
<ffffffff8010ebdc>{cpu_idle+44}
<Jan/06 03:50 pm>       <ffffffff80517885>{start_kernel+453}
<Jan/06 03:50 pm>swapper: page allocation failure. order:0, mode:0x20

<Jan/06 03:50 pm>Call Trace:<IRQ> <ffffffff80158fa0>{__alloc_pages+816}
<ffffffff801158b4>{end_8259A_irq+100}
<Jan/06 03:50 pm>       <ffffffff80158fe0>{__get_free_pages+16}
<ffffffff8015c886>{kmem_getpages+38}
<Jan/06 03:50 pm>       <ffffffff8015d8be>{cache_grow+190}
<ffffffff8015db16>{cache_alloc_refill+422}
<Jan/06 03:50 pm>       <ffffffff8015de06>{kmem_cache_alloc+54}
<ffffffff802d5eaf>{dst_alloc+47}
<Jan/06 03:50 pm>       <ffffffff802e3d17>{ip_route_input_slow+1639}
<ffffffff802e612e>{ip_rcv+526}
<Jan/06 03:50 pm>       <ffffffff802d297d>{netif_receive_skb+477}
<ffffffffa0120fe8>{:bcm5700:MM_IndicateRxPackets+920}
<Jan/06 03:50 pm>       <ffffffffa011c9fe>{:bcm5700:bcm5700_poll+158}
<ffffffff802d2b94>{net_rx_action+132}
<Jan/06 03:50 pm>       <ffffffff8013c4b1>{__do_softirq+113}
<ffffffff8013c565>{do_softirq+53}
<Jan/06 03:50 pm>       <ffffffff80113baf>{do_IRQ+335}
<ffffffff80111001>{ret_from_intr+0}
<Jan/06 03:50 pm>        <EOI> <ffffffff8031e419>{thread_return+41}
<ffffffff8010eb20>{default_idle+0}
<Jan/06 03:50 pm>       <ffffffff8010eb44>{default_idle+36}
<ffffffff8010ebdc>{cpu_idle+44}
<Jan/06 03:50 pm>       <ffffffff80517885>{start_kernel+453}
<Jan/06 03:50 pm>swapper: page allocation failure. order:0, mode:0x20

<Jan/06 03:50 pm>Call Trace:<IRQ> <ffffffff80158fa0>{__alloc_pages+816}
<ffffffff80158fe0>{__get_free_pages+16}
<Jan/06 03:50 pm>       <ffffffff8015c886>{kmem_getpages+38}
<ffffffff8015d8be>{cache_grow+190}
<Jan/06 03:50 pm>       <ffffffff8015db16>{cache_alloc_refill+422}
<ffffffff8015de06>{kmem_cache_alloc+54}
<Jan/06 03:50 pm>       <ffffffff802d5eaf>{dst_alloc+47}
<ffffffff802e3d17>{ip_route_input_slow+1639}
<Jan/06 03:50 pm>       <ffffffff802e612e>{ip_rcv+526}
<ffffffff802d297d>{netif_receive_skb+477}
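
Because the mail client wrapped the log, the frames are easier to compare once they are extracted into a plain list. The sketch below is one hedged way to do that; the regular expression and the name parse_frames are illustrative assumptions based only on the entry format visible in the excerpt, and the sample text is a few lines copied from it.

```python
import re

# Matches entries such as "<ffffffff80158fa0>{__alloc_pages+816}" and module
# entries such as "<ffffffffa0120fe8>{:bcm5700:MM_IndicateRxPackets+920}".
FRAME_RE = re.compile(r"<([0-9a-f]{16})>\{(?::(\w+):)?(\w+)\+(\d+)\}")


def parse_frames(text: str) -> list:
    """Return (module, symbol, offset) tuples in the order they appear."""
    frames = []
    for _addr, module, symbol, offset in FRAME_RE.findall(text):
        frames.append((module or "kernel", symbol, int(offset)))
    return frames


excerpt = """\
<Jan/06 03:50 pm>Call Trace:<IRQ> <ffffffff80158fa0>{__alloc_pages+816}
<ffffffff8013ffd3>{del_timer+115}
<Jan/06 03:50 pm>       <ffffffff802d5eaf>{dst_alloc+47}
<ffffffffa0120fe8>{:bcm5700:MM_IndicateRxPackets+920}
"""

for module, symbol, offset in parse_frames(excerpt):
    print(f"{module:8s} {symbol}+{offset}")
```

Read from the bottom up, the traces show the bcm5700 receive poll running in softirq context, passing the packet to ip_rcv, and failing inside dst_alloc when the slab allocator asks __alloc_pages for a new page. Entries that do not fit that chain (del_timer, end_8259A_irq, try_to_wake_up) may simply be stale addresses left on the stack, though that is a guess.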

This was just a partial listing from one of our servers.  I had read on
several lists that this is not considered fatal.  The problem is that
with our setup it has turned fatal, to the point of locking us out of
the system remotely, so that only a reset at the machine itself worked
(it didn't even honor the sysrq-b combo at the console).
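
For readers unfamiliar with the "order:0, mode:0x20" part of the failure line, here is a small hedged decoder. The flag values in the table are an assumption recalled from the 2.6-era include/linux/gfp.h and should be checked against the kernel tree actually in use; the function name decode_failure is hypothetical.

```python
import re

# ASSUMPTION: bit values as in 2.6-era include/linux/gfp.h; verify locally.
GFP_FLAGS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}

FAILURE_RE = re.compile(
    r"(\w+): page allocation failure\. order:(\d+), mode:0x([0-9a-f]+)"
)


def decode_failure(line: str) -> dict:
    """Split a failure line into task name, page count and flag names."""
    m = FAILURE_RE.search(line)
    if m is None:
        raise ValueError("not a page allocation failure line")
    mode = int(m.group(3), 16)
    known_mask = 0
    for bit in GFP_FLAGS:
        known_mask |= bit
    return {
        "task": m.group(1),
        "pages": 1 << int(m.group(2)),  # an order-N request is 2**N pages
        "flags": [name for bit, name in sorted(GFP_FLAGS.items()) if mode & bit],
        "unknown_bits": mode & ~known_mask,
    }


print(decode_failure("<Jan/06 03:50 pm>swapper: page allocation failure. order:0, mode:0x20"))
```

Under those assumed values, mode 0x20 is __GFP_HIGH alone, which is what GFP_ATOMIC expanded to in kernels of that era: a single-page request that may not sleep and so cannot wait for reclaim. That fits the <IRQ> marker in the traces, and "swapper" is just the idle task that happened to be current when the interrupt arrived.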

Has anyone else run into this?  I can get this kind of error using about
20 clients (100 MB connected) hitting one server (dual gigabit bonded).
With 2.6.8.1 and newer, the errors are reproducible, but I can't tell
exactly when they happen (during writes or reads).  I think I have seen
them happen during both.  The kswapd problems also happened during both
writes and reads.

I can also get the kswapd going crazy with a local set of disk I/O
tests.
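
To put numbers on "kswapd going crazy" during such a test, one option is to sample the kswapd line from /proc/<pid>/stat twice and compare the CPU tick counters. The sketch below only does the parsing and arithmetic; the function names are hypothetical, the two sample lines are made up for illustration (not taken from the servers described here), and the field positions follow the proc(5) layout, where utime and stime are fields 14 and 15.

```python
def cpu_ticks(stat_line: str) -> tuple:
    """Return (comm, utime, stime) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may contain spaces, so split
    # around the last ')' instead of splitting on whitespace alone.
    left, right = stat_line.rsplit(")", 1)
    comm = left.split("(", 1)[1]
    fields = right.split()
    # 'fields' starts at field 3 (state), so fields 14 and 15 are
    # at indexes 11 and 12.
    return comm, int(fields[11]), int(fields[12])


def busy_fraction(before: str, after: str, interval_s: float, ticks_per_s: int) -> float:
    """Fraction of one CPU used between two samples taken interval_s apart."""
    _, u0, s0 = cpu_ticks(before)
    _, u1, s1 = cpu_ticks(after)
    return ((u1 + s1) - (u0 + s0)) / (ticks_per_s * interval_s)


# Hypothetical samples, one second apart, truncated after the fields used here.
t0 = "37 (kswapd0) S 1 1 1 0 -1 2129984 0 0 0 0 0 4213 0 0"
t1 = "37 (kswapd0) R 1 1 1 0 -1 2129984 0 0 0 0 0 4312 0 0"

print(f"kswapd0 busy: {busy_fraction(t0, t1, 1.0, 100):.0%}")
```

On a live system the two lines would come from reading /proc/<pid>/stat for the kswapd PID, and ticks_per_s from os.sysconf("SC_CLK_TCK") rather than the hard-coded 100 used in this example.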

Any information needed, please ask.  Any help would be appreciated.

Thanks,
Norman Weathers




-----Original Message-----
From: linux-kernel-owner@vger.kernel.org
[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Nick Piggin
Sent: Thursday, February 03, 2005 7:20 PM
To: Andrew Morton
Cc: terje_fb@yahoo.no; linux-kernel@vger.kernel.org; torvalds@osdl.org
Subject: Re: 2.6.10: kswapd spins like crazy



Andrew Morton wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> 
>>Oh, attached should be a minimal fix if you would like to try it out.
>>
>>
>>...
>>--- linux-2.6/mm/vmscan.c~vmscan-minfix	2005-02-04 11:52:37.000000000 +1100
>>+++ linux-2.6-npiggin/mm/vmscan.c	2005-02-04 11:53:32.000000000 +1100
>>@@ -575,6 +575,7 @@ static void shrink_cache(struct zone *zo
>> 			nr_taken++;
>> 		}
>> 		zone->nr_inactive -= nr_taken;
>>+		zone->pages_scanned += nr_scan;
>> 		spin_unlock_irq(&zone->lru_lock);
>> 
>> 		if (nr_taken == 0)
>>
> 
> 
> Any theories as to why these pages aren't being activated and aren't
> being reclaimed?
> 
> 

No, none yet, which is what we should get to the bottom of. I must be
overlooking something, but the only ways I can see should be due to
transient conditions like the page being locked or under writeback.
laptop_mode?
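
As background on why a one-line counter update could stop kswapd from spinning, here is a toy model. It rests on an assumption that is not stated in this thread: that kswapd of this era gives up on a zone (marks it all_unreclaimable) once zone->pages_scanned reaches roughly four times the number of pages on the zone's LRU lists, and that the counter is reset whenever pages are freed. All names and numbers below are illustrative, and nothing here is kernel code.

```python
def passes_until_unreclaimable(lru_pages, scan_per_pass, count_inactive_scans,
                               max_passes=1000):
    """Return the pass on which the zone is flagged, or None if it never is."""
    pages_scanned = 0
    for n in range(1, max_passes + 1):
        # Nothing is reclaimed in this model, so the counter is never reset.
        if count_inactive_scans:
            pages_scanned += scan_per_pass  # what the quoted patch adds
        # ASSUMED give-up threshold: four times the pages on the LRU lists.
        if pages_scanned >= lru_pages * 4:
            return n
    return None


print("with the fix:   ", passes_until_unreclaimable(1000, 100, True))
print("without the fix:", passes_until_unreclaimable(1000, 100, False))
```

If the assumption holds, scans of the inactive list that reclaim nothing would never push the counter toward the threshold without the patch, so the scanner would keep retrying the same zone instead of backing off.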

Terje, what is /proc/sys/vm/laptop_mode set to?
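
The value can be read with "cat /proc/sys/vm/laptop_mode"; for scripted collection alongside other VM tunables, a minimal sketch follows. The helper name read_vm_tunable and its root parameter are hypothetical, the latter added only so the function can be pointed at a test directory.

```python
from pathlib import Path


def read_vm_tunable(name: str, root: str = "/proc/sys/vm") -> int:
    """Read an integer tunable such as laptop_mode from a sysctl directory."""
    return int(Path(root, name).read_text().strip())


# Only attempt the live read where the file exists (Linux with laptop_mode support).
if Path("/proc/sys/vm/laptop_mode").exists():
    print("laptop_mode =", read_vm_tunable("laptop_mode"))
```

A value of 0 means laptop mode is off; any non-zero value means it is enabled.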


Thread overview: 13+ messages
2005-02-03 10:29 2.6.10: kswapd spins like crazy Terje Fåberg
2005-02-03 10:47 ` Nick Piggin
2005-02-03 11:54   ` Terje Fåberg
2005-02-03 19:50     ` Terje Fåberg
2005-02-04  0:12       ` Nick Piggin
2005-02-04  0:56         ` Nick Piggin
2005-02-04  1:16           ` Andrew Morton
2005-02-04  1:19             ` Nick Piggin
2005-02-04 10:26               ` Terje Fåberg
2005-02-04 17:26                 ` Terje Fåberg
2005-02-04 22:18                   ` Nick Piggin
2005-02-05  7:12                     ` Terje Fåberg
  -- strict thread matches above, loose matches on Subject: below --
2005-02-04 16:16 Weathers, Norman R.
