VM lockup with 2.4.8 / 2.4.8pre8

public inbox for linux-kernel@vger.kernel.org

* VM lockup with 2.4.8 / 2.4.8pre8
@ 2001-08-13 17:47 Roy C. Bixler
  2001-08-13 17:55 ` WANTED: " Rik van Riel
  2001-08-14 19:13 ` Roy C. Bixler
  0 siblings, 2 replies; 12+ messages in thread
From: Roy C. Bixler @ 2001-08-13 17:47 UTC
  To: linux-kernel

I have just inadvertently encountered a VM lockup with Linux 2.4.8.  The
KDE kspread application couldn't handle one spreadsheet I gave it and it
ran away consuming all memory in the system.  When I first ran into the
trouble, my machine had 384 Meg. RAM and 184 Meg. of swap.  I tried
2.4.8pre8 and the lockup still occurs.  I have increased my swap to 768
Meg. and 2.4.8 still locks up.  I tried 2.4.7 and it doesn't lock up - it
correctly OOM kills the runaway process.

The system feels responsive up until it locks up.  Running 'top' while it
happens shows that the lockup occurs at about the point where swap runs
out.  Other system details: it is running the latest Debian snapshot.

Linux frobozz 2.4.8 #1 Sat Aug 11 19:26:35 CDT 2001 i686 unknown

Gnu C                  2.95.4
Gnu make               3.79.1
binutils               2.11.90.0.25
util-linux             2.11h
mount                  2.11h
modutils               2.4.6
e2fsprogs              1.22
Linux C Library        2.2.4
Dynamic linker (ldd)   2.2.4
Procps                 2.0.7
Net-tools              1.60
Console-tools          0.2.3
Sh-utils               2.0.11
Modules Loaded         cs4232 ad1848 uart401 sound soundcore parport_pc lp
parport ipx usb-uhci usbcore

-- 
Roy Bixler
The University of Chicago Press
rcb@press-gopher.uchicago.edu


* WANTED: Re: VM lockup with 2.4.8 / 2.4.8pre8
  2001-08-13 17:47 VM lockup with 2.4.8 / 2.4.8pre8 Roy C. Bixler
@ 2001-08-13 17:55 ` Rik van Riel
  2001-08-14 19:04   ` Jon 'tex' Boone
                     ` (2 more replies)
  2001-08-14 19:13 ` Roy C. Bixler
  1 sibling, 3 replies; 12+ messages in thread
From: Rik van Riel @ 2001-08-13 17:55 UTC
  To: Roy C. Bixler; +Cc: linux-kernel, kernelnewbies

CALL FOR VOLUNTEERS
---------------------
On Mon, 13 Aug 2001, Roy C. Bixler wrote:

> I have just inadvertently encountered a VM lockup with Linux 2.4.8.  The
> KDE kspread application couldn't handle one spreadsheet I gave it and it
> ran away consuming all memory in the system.  When I first ran into the
> trouble, my machine had 384 Meg. RAM and 184 Meg. of swap.  I tried
> 2.4.8pre8 and the lockup still occurs.  I have increased my swap to 768
> Meg. and 2.4.8 still locks up.  I tried 2.4.7 and it doesn't lock up - it
> correctly OOM kills the runaway process.

Ouch, I only did a quick test with the OOM killer and the
swap space reclaim patch and it worked in my quick test.

This means the OOM killer should be tuned, or more precisely,
the code deciding when the OOM killer kicks in should be tuned.

The code involved is very easy, so I'll explain it a bit and
ask for volunteers to tweak the code and fix the OOM behaviour.

The functions/places you may want to tweak are:

mm/vmscan.c::kswapd()
	else if (out_of_memory()) {
		oom_kill()

mm/oom_kill.c::out_of_memory()


regards,

Rik
--
IA64: a worthy successor to i860.

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to aardvark@nl.linux.org (spam digging piggy)



* Re: WANTED: Re: VM lockup with 2.4.8 / 2.4.8pre8
  2001-08-13 17:55 ` WANTED: " Rik van Riel
@ 2001-08-14 19:04   ` Jon 'tex' Boone
  2001-08-14 19:32     ` Rik van Riel
  2001-08-14 20:05   ` Petr Baudis
       [not found]   ` <9lc0ek$l5k$1@ns1.clouddancer.com>
  2 siblings, 1 reply; 12+ messages in thread
From: Jon 'tex' Boone @ 2001-08-14 19:04 UTC
  To: Rik van Riel; +Cc: linux-kernel, kernelnewbies

Rik van Riel <riel@conectiva.com.br> writes:

> CALL FOR VOLUNTEERS
> ---------------------
> This means the OOM killer should be tuned, or more precisely,
> the code deciding when the OOM killer kicks in should be tuned.
> 
> The code involved is very easy, so I'll explain it a bit and
> ask for volunteers to tweak the code and fix the OOM behaviour.
> 
> The functions/places you may want to tweak are:
> 
> mm/vmscan.c::kswapd()
> 	else if (out_of_memory()) {
> 		oom_kill()
> 
> mm/oom_kill.c::out_of_memory()

Rik,

    Should said volunteer(s) work with stock 2.4.8?

-tex
-- 
------------------
Jon Allen Boone
tex@delamancha.org


* Re: VM lockup with 2.4.8 / 2.4.8pre8
  2001-08-13 17:47 VM lockup with 2.4.8 / 2.4.8pre8 Roy C. Bixler
  2001-08-13 17:55 ` WANTED: " Rik van Riel
@ 2001-08-14 19:13 ` Roy C. Bixler
  1 sibling, 0 replies; 12+ messages in thread
From: Roy C. Bixler @ 2001-08-14 19:13 UTC
  To: linux-kernel

On Mon, 13 Aug 2001, I wrote:
> I have just inadvertently encountered a VM lockup with Linux 2.4.8.  The
> KDE kspread application couldn't handle one spreadsheet I gave it and it
> ran away consuming all memory in the system.  When I first ran into the
> trouble, my machine had 384 Meg. RAM and 184 Meg. of swap.  I tried
> 2.4.8pre8 and the lockup still occurs.  I have increased my swap to 768
> Meg. and 2.4.8 still locks up.  I tried 2.4.7 and it doesn't lock up - it
> correctly OOM kills the runaway process.
> 
> The system feels responsive up until it locks up.  Running 'top' while it
> happens shows that the lockup occurs at about the point where swap runs
> out.  Other system details: it is running the latest Debian snapshot.

I've managed to do a little more tracing on this.  I've tried this test on
2.4.8-pre1 and it eventually kills the culprit process.  I tried again
under 2.4.8 and, since the Sys-Rq key combinations worked while the system
was otherwise completely quiet (no disk activity) and unresponsive, I
tried hitting Sys-Rq-P a few times and the stack traces always looked like
this:

do_try_to_free_pages
kswapd
kernel_thread

or

swap_out_vma
swap_out_mm
swap_out
refill_active_zone
refill_inactive
do_try_to_free_pages
kswapd
kernel_thread

I also just tried 2.4.9-pre3 and the system locked before swap filled up
with a screen full of '__alloc_pages: order 0 allocation failed' type
messages.

-- 
Roy Bixler
The University of Chicago Press
rcb@press-gopher.uchicago.edu




* Re: WANTED: Re: VM lockup with 2.4.8 / 2.4.8pre8
  2001-08-14 19:04   ` Jon 'tex' Boone
@ 2001-08-14 19:32     ` Rik van Riel
  0 siblings, 0 replies; 12+ messages in thread
From: Rik van Riel @ 2001-08-14 19:32 UTC
  To: Jon 'tex' Boone; +Cc: linux-kernel, kernelnewbies

On Tue, 14 Aug 2001, Jon 'tex' Boone wrote:

> > mm/vmscan.c::kswapd()
> > 	else if (out_of_memory()) {
> > 		oom_kill()
> >
> > mm/oom_kill.c::out_of_memory()
>
>     Should said volunteer(s) work with stock 2.4.8?

No real need, the OOM functions are basically unchanged.

regards,

Rik
--
IA64: a worthy successor to i860.

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to aardvark@nl.linux.org (spam digging piggy)



* Re: WANTED: Re: VM lockup with 2.4.8 / 2.4.8pre8
  2001-08-13 17:55 ` WANTED: " Rik van Riel
  2001-08-14 19:04   ` Jon 'tex' Boone
@ 2001-08-14 20:05   ` Petr Baudis
  2001-08-14 20:27     ` Rik van Riel
       [not found]   ` <9lc0ek$l5k$1@ns1.clouddancer.com>
  2 siblings, 1 reply; 12+ messages in thread
From: Petr Baudis @ 2001-08-14 20:05 UTC
  To: linux-kernel

Why are we giving so much importance to root processes? Yes, they are
important, but they are even more likely to flood our memory, because
limits don't apply to them. I propose to just divide their badness
by 2, not by 4.

I also propose to halve the badness of processes with pid < 1000 - those
processes are usually also important, because they are started at
boot time and they usually handle important system affairs. And
because most of them are run by root, the previous behaviour will
be restored, but with less badness given to some important non-root
processes, and more badness to later-run root processes, which are
often less important.

-- 

				Petr "Pasky" Baudis
.                                                                       .
#define BITCOUNT(x)     (((BX_(x)+(BX_(x)>>4)) & 0x0F0F0F0F) % 255)
#define  BX_(x)         ((x) - (((x)>>1)&0x77777777)                    \
                             - (((x)>>2)&0x33333333)                    \
                             - (((x)>>3)&0x11111111))
             -- really weird C code to count the number of bits in a word
.                                                                       .
My public PGP key is on: http://pasky.ji.cz/~pasky/pubkey.txt
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS d- s++:++ a--- C+++ UL++++$ P+ L+++ E--- W+ N !o K- w-- !O M-
!V PS+ !PE Y+ PGP+>++ t+ 5 X(+) R++ tv- b+ DI(+) D+ G e-> h! r% y?
------END GEEK CODE BLOCK------


* Re: WANTED: Re: VM lockup with 2.4.8 / 2.4.8pre8
  2001-08-14 20:05   ` Petr Baudis
@ 2001-08-14 20:27     ` Rik van Riel
  0 siblings, 0 replies; 12+ messages in thread
From: Rik van Riel @ 2001-08-14 20:27 UTC
  To: Petr Baudis; +Cc: linux-kernel

On Tue, 14 Aug 2001, Petr Baudis wrote:

> I also propose to halve the badness of

Selecting which process to kill is not the problem
we are currently facing.

The problem is WHEN to kill something. Once we have
that fixed we can always work on refining the selection
algorithm ;))

Rik
--
IA64: a worthy successor to i860.

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to aardvark@nl.linux.org (spam digging piggy)


* Re: WANTED: Re: VM lockup with 2.4.8 / 2.4.8pre8
       [not found]   ` <9lc0ek$l5k$1@ns1.clouddancer.com>
@ 2001-08-15 19:35     ` Colonel
  2001-08-15 20:14       ` Admin Mailing Lists
                         ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Colonel @ 2001-08-15 19:35 UTC
  To: linux-kernel

In clouddancer.list.kernel, you wrote:
>
>Why are we giving so much importance to root processes? Yes, they are
>important, but they are even more likely to flood our memory, because
>limits don't apply to them. I propose to just divide their badness
>by 2, not by 4.

Gee, let's punish everybody in case of one bad app...


>I also propose to halve the badness of processes with pid < 1000 - those
>processes are usually also important, because they are started at
>boot time and they usually handle important system affairs.


The belief that boot started processes remain under a pid < 1000 is
flawed.  Simple example: the postfix mail server.


-- 
Windows 2001: "I'm sorry Dave ...  I'm afraid I can't do that."



* Re: WANTED: Re: VM lockup with 2.4.8 / 2.4.8pre8
  2001-08-15 19:35     ` Colonel
@ 2001-08-15 20:14       ` Admin Mailing Lists
  2001-08-15 20:51       ` David Ford
       [not found]       ` <9lelsk$bri$1@ns1.clouddancer.com>
  2 siblings, 0 replies; 12+ messages in thread
From: Admin Mailing Lists @ 2001-08-15 20:14 UTC
  To: Colonel; +Cc: linux-kernel

> 
> >I also propose to halve the badness of processes with pid < 1000 - those
> >processes are usually also important, because they are started at
> >boot time and they usually handle important system affairs.
> 
> 
> The belief that boot started processes remain under a pid < 1000 is
> flawed.  Simple example: the postfix mail server.
> 

agreed, but FWIW my postfix master daemon is pid 434

isn't this what Priority and Nice values are for, though?

-Tony
.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.
Anthony J. Biacco                       Network Administrator/Engineer
thelittleprince@asteroid-b612.org       Intergrafix Internet Services

    "Dream as if you'll live forever, live as if you'll die today"
http://www.asteroid-b612.org                http://www.intergrafix.net
.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.



* Re: WANTED: Re: VM lockup with 2.4.8 / 2.4.8pre8
  2001-08-15 19:35     ` Colonel
  2001-08-15 20:14       ` Admin Mailing Lists
@ 2001-08-15 20:51       ` David Ford
       [not found]       ` <9lelsk$bri$1@ns1.clouddancer.com>
  2 siblings, 0 replies; 12+ messages in thread
From: David Ford @ 2001-08-15 20:51 UTC
  Cc: linux-kernel

Also consider that many places use randomized pids. You can only assume 
a few things about pids and that has to be done by evaluating kernel 
threads and the init pid.

David

Colonel wrote:

>>I also propose to halve the badness of processes with pid < 1000 - those
>>processes are usually also important, because they are started at
>>boot time and they usually handle important system affairs.
>>
>
>The belief that boot started processes remain under a pid < 1000 is
>flawed.  Simple example: the postfix mail server.
>




* Re: WANTED: Re: VM lockup with 2.4.8 / 2.4.8pre8
       [not found]       ` <9lelsk$bri$1@ns1.clouddancer.com>
@ 2001-08-16  4:27         ` Colonel
  2001-08-16  9:27           ` Marco Colombo
  0 siblings, 1 reply; 12+ messages in thread
From: Colonel @ 2001-08-16  4:27 UTC
  To: linux-kernel

In clouddancer.list.kernel, you wrote:
>
>> 
>> >I also propose to halve the badness of processes with pid < 1000 - those
>> >processes are usually also important, because they are started at
>> >boot time and they usually handle important system affairs.
>> 
>> 
>> The belief that boot started processes remain under a pid < 1000 is
>> flawed.  Simple example: the postfix mail server.
>> 
>
>agreed, but FWIW my postfix master daemon is pid 434


Ah, yes that reminds me that when you take down a service and then
start it again, you lose that nice low pid.  FWIW, my master is 23034
now.  As D Ford stated, paying attention to pid value is not useful.


-- 
Windows 2001: "I'm sorry Dave ...  I'm afraid I can't do that."



* Re: WANTED: Re: VM lockup with 2.4.8 / 2.4.8pre8
  2001-08-16  4:27         ` Colonel
@ 2001-08-16  9:27           ` Marco Colombo
  0 siblings, 0 replies; 12+ messages in thread
From: Marco Colombo @ 2001-08-16  9:27 UTC
  To: Colonel; +Cc: linux-kernel

On Wed, 15 Aug 2001, Colonel wrote:

> In clouddancer.list.kernel, you wrote:
> >
> >>
> >> >I also propose to halve the badness of processes with pid < 1000 - those
> >> >processes are usually also important, because they are started at
> >> >boot time and they usually handle important system affairs.
> >>
> >>
> >> The belief that boot started processes remain under a pid < 1000 is
> >> flawed.  Simple example: the postfix mail server.
> >>
> >
> >agreed, but FWIW my postfix master daemon is pid 434
>
>
> Ah, yes that reminds me that when you take down a service and then
> start it again, you lose that nice low pid.  FWIW, my master is 23034
> now.  As D Ford stated, paying attention to pid value is not useful.

On a server which is anything but idle, 16-bit pids cycle every 1 or 2
days. And many daemons do get restarted (syslog on RH systems), and they
do fork to perform their task (all from [x]inetd, sendmail, and the like).
A PID means nothing, especially when uptime > 100 days: it's pretty much a
random number (a lot of chances you already restarted almost everything,
unless your configuration is *very* stable).

.TM.
-- 
      ____/  ____/   /
     /      /       /			Marco Colombo
    ___/  ___  /   /		      Technical Manager
   /          /   /			 ESI s.r.l.
 _____/ _____/  _/		       Colombo@ESI.it


