* Kernel VM bug?
@ 2004-06-28 2:30 Brian
2004-06-28 2:58 ` William Lee Irwin III
0 siblings, 1 reply; 7+ messages in thread
From: Brian @ 2004-06-28 2:30 UTC (permalink / raw)
To: linux-kernel
Hello list,
While doing massive memory allocation (I'm using GRASS to project NASA's BlueMarble maps) the
kernel apparently tries to kill grass but fails. When I try to access /proc/<grass_pid>/stat the
process hangs.
For example, an 'strace' of 'ps' ends like this:
open("/proc/1783/stat", O_RDONLY) = 6
read(6, <PS and strace hang here>
I am able to project a few files, but once the filesystem cache fills up, GRASS hangs or gives a
panic in vm_stat:381. The strange thing is, very little swap space is in use, and the filesystem
cache continues to use most of the RAM.
Is this a kernel bug, or do I need to use kernel 2.6.x (I am using kernel 2.4.26) and
/proc/sys/vm/overcommit_memory or similar hack?
Since I am a kernel-newbie, these links might help explain the problem better ;)
http://seclists.org/linux-kernel/2001/Dec/1604.html
http://www.mail-archive.com/debian-glibc@lists.debian.org/msg10070.html
Brian G
* Re: Kernel VM bug?
2004-06-28 2:30 Kernel VM bug? Brian
@ 2004-06-28 2:58 ` William Lee Irwin III
2004-06-28 13:01 ` Hugh Dickins
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: William Lee Irwin III @ 2004-06-28 2:58 UTC (permalink / raw)
To: Brian; +Cc: linux-kernel
On Sun, Jun 27, 2004 at 07:30:39PM -0700, Brian wrote:
> While doing massive memory allocation (I'm using GRASS to project
> NASA's BlueMarble maps) the kernel apparently tries to kill grass but
> fails. When I try to access /proc/<grass_pid>/stat the process hangs.
> For example, an 'strace' of 'ps' ends like this:
> open("/proc/1783/stat", O_RDONLY) = 6
> read(6, <PS and strace hang here>
> I am able to project a few files, but once the filesystem cache fills
> up, GRASS hangs or gives a panic in vm_stat:381. The strange thing
> is, very little swap space is in use, and the filesystem cache
> continues to use most of the RAM.
> Is this a kernel bug, or do I need to use kernel 2.6.x (I am using
> kernel 2.4.26) and /proc/sys/vm/overcommit_memory or similar hack?
> Since I am a kernel-newbie, these links might help explain the
> problem better ;)
> http://seclists.org/linux-kernel/2001/Dec/1604.html
> http://www.mail-archive.com/debian-glibc@lists.debian.org/msg10070.html

2.6 has had many of these kinds of failure modes addressed, but it's
not clear what the precise relationship of your issue is to the various
work that's gone on there. This description isn't really enough to go
on, but by "guesswork" I suspect it is a semaphore deadlock related to
an interaction between coredumping and process resource usage reporting
in /proc/. Excerpts of system logs related to the killing of the
process may also be useful to diagnose the problem. In general, what
needs to go on is for us to be able to reproduce your failure.

However, there are categories of problems, design flaws, that require
work too invasive to be addressed in 2.4, and because of the ones that
have been addressed in 2.6 related to this specific area, I'd suspect
there is a good chance that using 2.6 will resolve your problem.

Strict non-overcommit is also good to have in order for orderly
application shutdown or otherwise application self-regulation of
resource demands to occur at the time of hardware resource exhaustion.
This is by necessity enabled by default and has to be disabled at
runtime. You shouldn't have to do anything to enable it, but to
doublecheck that strict non-overcommit hasn't been disabled by e.g.
initscripts, please check that /proc/sys/vm/overcommit_memory stays 0.

To investigate what may have happened in 2.4, it may be helpful for us
to be able to run GRASS on a similar data set (IIRC it is open source
and freely available for download) and to arrange for testing on a
similar machine, which by and large we can arrange for ourselves given
a sufficiently detailed description.

Thanks.

-- wli
* Re: Kernel VM bug?
2004-06-28 2:58 ` William Lee Irwin III
@ 2004-06-28 13:01 ` Hugh Dickins
2004-06-28 20:18 ` William Lee Irwin III
2004-06-29 20:05 ` Brian
2004-06-30 2:04 ` Kernel VM bug? (more info) Brian Gunlogson
2 siblings, 1 reply; 7+ messages in thread
From: Hugh Dickins @ 2004-06-28 13:01 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: Brian, linux-kernel
On Sun, 27 Jun 2004, William Lee Irwin III wrote:
>
> Strict non-overcommit is also good to have in order for orderly
> application shutdown or otherwise application self-regulation of
> resource demands to occur at the time of hardware resource exhaustion.
> This is by necessity enabled by default and has to be disabled at
> runtime. You shouldn't have to do anything to enable it, but to
> doublecheck that strict non-overcommit hasn't been disabled by e.g.
> initscripts, please check that /proc/sys/vm/overcommit_memory stays 0.

I'm not sure if I'm niggling over terminology, or pointing out a
significant misunderstanding: but /proc/sys/vm/overcommit_memory 0
(indeed the default) is not what I call strict non-overcommit: that's 2.

All settings (0, 1, 2) maintain the Committed_AS count shown in
/proc/meminfo; but only /proc/sys/vm/overcommit_memory 2 totals and
limits reservations using it. 1 imposes no limit. 0 checks that the
particular "reservation" could plausibly be made available now, but
without considering the total: so allows any number of concurrent
maximum reservations - traditional relaxed Linux behaviour, not strict.
(2 came along much later, yes the naming and numbering are both horrid.)

Hugh
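The three settings described in the message above can be inspected on a running system. The following is a minimal Python sketch, not anything posted in the thread: the function names `parse_meminfo` and `describe_overcommit` are hypothetical, the mode descriptions paraphrase Hugh's wording, and the `CommitLimit` field is an assumption (the thread only names `Committed_AS`; `CommitLimit` appears in /proc/meminfo on 2.6-era and later kernels and is simply skipped if absent).

```python
# Mode meanings paraphrased from the message above (0 = default heuristic,
# 1 = no limit, 2 = strict accounting against a total).
OVERCOMMIT_MODES = {
    0: "heuristic: each request is checked for plausibility, the total is not limited",
    1: "always allow: no limit imposed",
    2: "strict: total reservations are counted and limited",
}


def parse_meminfo(text):
    """Return {field: first integer value} for 'Name: value ...' lines."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip header rows and anything whose first token is not a number.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def describe_overcommit(mode_text, meminfo_text):
    """Summarise the overcommit mode and the commit accounting, if present."""
    mode = int(mode_text.strip())
    info = parse_meminfo(meminfo_text)
    lines = [f"overcommit_memory = {mode} ({OVERCOMMIT_MODES.get(mode, 'unknown')})"]
    for key in ("Committed_AS", "CommitLimit"):
        if key in info:
            lines.append(f"{key}: {info[key]} kB")
    # Headroom is only meaningful when the kernel enforces the limit (mode 2).
    if mode == 2 and "Committed_AS" in info and "CommitLimit" in info:
        lines.append(f"headroom: {info['CommitLimit'] - info['Committed_AS']} kB")
    return "\n".join(lines)
```

To use it on a live machine, pass in the contents of /proc/sys/vm/overcommit_memory and /proc/meminfo as strings and print the result. Keeping the parsing separate from the file reads means the same functions can be run against text captured from another machine, such as the one under test.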
* Re: Kernel VM bug?
2004-06-28 13:01 ` Hugh Dickins
@ 2004-06-28 20:18 ` William Lee Irwin III
0 siblings, 0 replies; 7+ messages in thread
From: William Lee Irwin III @ 2004-06-28 20:18 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Brian, linux-kernel
On Mon, Jun 28, 2004 at 02:01:29PM +0100, Hugh Dickins wrote:
> I'm not sure if I'm niggling over terminology, or pointing out a
> significant misunderstanding: but /proc/sys/vm/overcommit_memory 0
> (indeed the default) is not what I call strict non-overcommit: that's 2.
> All settings (0, 1, 2) maintain the Committed_AS count shown in
> /proc/meminfo; but only /proc/sys/vm/overcommit_memory 2 totals and
> limits reservations using it. 1 imposes no limit. 0 checks that the
> particular "reservation" could plausibly be made available now, but
> without considering the total: so allows any number of concurrent
> maximum reservations - traditional relaxed Linux behaviour, not strict.
> (2 came along much later, yes the naming and numbering are both horrid.)

I'm not sure if the numbers changed or something else went wrong. Not
encouraging to hear this behaved differently from my expectations
without my noticing.

-- wli
* Re: Kernel VM bug?
2004-06-28 2:58 ` William Lee Irwin III
2004-06-28 13:01 ` Hugh Dickins
@ 2004-06-29 20:05 ` Brian
2004-06-29 20:09 ` William Lee Irwin III
2004-06-30 2:04 ` Kernel VM bug? (more info) Brian Gunlogson
2 siblings, 1 reply; 7+ messages in thread
From: Brian @ 2004-06-29 20:05 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2695 bytes --]

GRASS also has problems on the 2.6.7 kernel.

My system is an Athlon-XP with 512MB RAM running Slackware 10.0.0
(kernel 2.4.26) full installation in X windows with minimal window
manager and minimal other processes.

To reproduce:

Download the NASA blue marble from
(ftp://mitch.gsfc.nasa.gov/pub/stockli/bluemarble/MOD09A1.W.interpol.cyl.retouched.topo.bathymetry.3x21600x21600.gz)
and use netpbm to convert the RAW RGB to a PPM.
'cat bluemarble.gz | gzip -dc | rawtoppm 21600 21600 > bluemarble.W.ppm'

Compile and install grass CVS as of June 28 2004 20:20:00 UTC or use
attached bash shell script.

Create a new grass location, let's say with a location name of 'tiger'
and a mapset name of 'brian' (use space to delete if compiled without
readline support). Make sure the database directory is set and it
already exists. Answer 'yes' until you get asked to select the
coordinate system. Select coordinate system B, longitude latitude.
Keep answering until you get asked to select a geodetic datum, use
'wgs84' as the datum. Type '1' when asked for the datum transformation
parameters. For north edge type 50N, south edge type 23N, west edge
125W, east edge 70W, both east west and north south resolution of
0.00222222.

Create a new location, let's say with a location name of 'blue.w_loc'
and the mapset MUST be 'PERMANENT'. The rest is the same until the
default region. Set those to north 90N, south 90S, west 180W, east 0W,
and both grid resolutions to 0.00833333.

Restart GRASS and choose the 'blue.w_loc'/'PERMANENT' location/mapset.
Import the bluemarble PPM using r.in.ppm and use the create separate
red/green/blue maps command line option "-b"
'r.in.ppm -bv input=<path to blue marble ppm> output=bluemarble.w'.

Restart GRASS and use the 'tiger'/'brian' location/mapset.

Project the bluemarble maps until the filesystem cache fills up and
something bad happens. Might take a few tries.
'r.proj input=bluemarble.w.r location=blue.w_loc mapset=PERMANENT method=cubic;r.proj input=bluemarble.w.g location=blue.w_loc mapset=PERMANENT method=cubic;r.proj input=bluemarble.w.b location=blue.w_loc mapset=PERMANENT method=cubic'

Brian

--- William Lee Irwin III <wli@holomorphy.com> wrote:
> To investigate what may have happened in 2.4, it may be helpful for us
> to be able to run GRASS on a similar data set (IIRC it is open source
> and freely available for download) and to arrange for testing on a
> similar machine, which by and large we can arrange for ourselves given
> a sufficiently detailed description.
>
> Thanks.
>
> -- wli

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: build_grass.sh --]
[-- Type: text/x-sh; name="build_grass.sh", Size: 1439 bytes --]

#!/bin/bash
mkdir $HOME/grassroot || exit
cd $HOME/grassroot

#Checkout grass
echo "Password is grass"
cvs -d:pserver:grass-guest@intevation.de:/home/grass/grassrepository login
cvs -d:pserver:grass-guest@intevation.de:/home/grass/grassrepository -z3 checkout -D "28 Jun 2004 20:20" grass

#Get grass dependencies
cd $HOME/grassroot
echo "Getting fftw-2.1.5..."
wget http://www.fftw.org/fftw-2.1.5.tar.gz
tar -xzf fftw-2.1.5.tar.gz
echo "Getting proj4..."
wget --passive-ftp ftp://ftp.remotesensing.org/pub/proj/proj-4.4.8.tar.gz
tar -xzf proj-4.4.8.tar.gz
echo "Getting gdal-1.2.1..."
wget --passive-ftp ftp://ftp.remotesensing.org/gdal/gdal-1.2.1.tar.gz
tar -xzf gdal-1.2.1.tar.gz

#Compile dependencies
cd $HOME/grassroot
cd fftw-2.1.5
./configure --prefix=/opt/grass && make || exit
echo "Need root password"
su -c "make install"
cd $HOME/grassroot
cd proj-4.4.8
./configure --prefix=/opt/grass && make || exit
echo "Need root password"
su -c "make install"
cd $HOME/grassroot
cd gdal-1.2.1
./configure --prefix=/opt/grass && make || exit
echo "Need root password"
su -c "make install"

#Compile grass
cd $HOME/grassroot
cd grass
PATH="/opt/grass/bin":$PATH ./configure --prefix=/opt/grass --with-proj-includes=/opt/grass/include/ --with-proj-libs=/opt/grass/lib/ --with-fftw-libs=/opt/grass/lib --with-fftw-includes=/opt/grass/include/ --without-postgres --without-odbc && make || exit
echo "Need root password"
su -c "make install"
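For a rough sense of scale, the sizes implied by the reproduction steps above can be worked out directly. This is a minimal Python sketch with hypothetical helper names (`raw_bytes`, `cells_along`); it assumes one byte per sample in the raw RGB file, which the thread does not state, and it counts grid cells only. It says nothing about how GRASS stores cells or how much memory r.proj actually allocates.

```python
def raw_bytes(bands, rows, cols, bytes_per_sample=1):
    """Size of an uncompressed raster; 1 byte per sample is an assumption."""
    return bands * rows * cols * bytes_per_sample


def cells_along(span_degrees, resolution_degrees):
    """Number of grid cells covering a span at a given resolution."""
    return round(span_degrees / resolution_degrees)


# Source image: dimensions taken from the file name (3x21600x21600).
source_bytes = raw_bytes(3, 21600, 21600)

# Source region 'blue.w_loc': 90N..90S and 180W..0W at 0.00833333 degrees.
source_grid = (cells_along(180, 0.00833333), cells_along(180, 0.00833333))

# Target region 'tiger': 50N..23N and 125W..70W at 0.00222222 degrees.
target_grid = (cells_along(50 - 23, 0.00222222), cells_along(125 - 70, 0.00222222))

ram_bytes = 512 * 1024 * 1024  # "512MB RAM" from the message above

print(f"raw source image: {source_bytes} bytes, {source_bytes / ram_bytes:.1f}x RAM")
print(f"source grid: {source_grid[0]} x {source_grid[1]} cells per band")
print(f"target grid: {target_grid[0]} x {target_grid[1]} cells per band")
```

Under that assumption the raw image alone is 1,399,680,000 bytes, well over twice the machine's RAM, and the source region works out to the same 21600 x 21600 grid as the file name. Each projected band covers 12150 x 24750 cells, so the page cache would be expected to fill during the run, as described.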
* Re: Kernel VM bug?
2004-06-29 20:05 ` Brian
@ 2004-06-29 20:09 ` William Lee Irwin III
0 siblings, 0 replies; 7+ messages in thread
From: William Lee Irwin III @ 2004-06-29 20:09 UTC (permalink / raw)
To: Brian; +Cc: linux-kernel
On Tue, Jun 29, 2004 at 01:05:25PM -0700, Brian wrote:
> GRASS also has problems on the 2.6.7 kernel.
> My system is an Athlon-XP with 512MB RAM running Slackware 10.0.0
> (kernel 2.4.26) full installation in X windows with minimal window
> manager and minimal other processes.
> To reproduce:
> Download the NASA blue marble from

Okay, I'll get onto fixing this. My IRQ stack overfloweth.

-- wli
* Re: Kernel VM bug? (more info)
2004-06-28 2:58 ` William Lee Irwin III
2004-06-28 13:01 ` Hugh Dickins
2004-06-29 20:05 ` Brian
@ 2004-06-30 2:04 ` Brian Gunlogson
2 siblings, 0 replies; 7+ messages in thread
From: Brian Gunlogson @ 2004-06-30 2:04 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: linux-kernel

I got more info, hope this helps.

Sometimes the kernel panics at vmscan:388. That is in kernel 2.4.26,
function shrink_cache() on a source code line that says
BUG_ON(!PageLRU(page));

Brian