* Kernel VM bug?
@ 2004-06-28 2:30 Brian
2004-06-28 2:58 ` William Lee Irwin III
0 siblings, 1 reply; 7+ messages in thread
From: Brian @ 2004-06-28 2:30 UTC (permalink / raw)
To: linux-kernel
Hello list,
While doing massive memory allocation (I'm using GRASS to project NASA's BlueMarble maps) the
kernel apparently tries to kill grass but fails. When I try to access /proc/<grass_pid>/stat the
process hangs.
For example, an 'strace' of 'ps' ends like this:
open("/proc/1783/stat", O_RDONLY) = 6
read(6, <PS and strace hang here>
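A read like this can be probed without wedging the calling shell by doing it from a background process and polling for completion. The following is only a sketch: `probe_stat`, the marker file, and the default 2-second timeout are illustrative choices, not part of the original report.

```shell
#!/bin/bash
# Probe /proc/<pid>/stat from a background subshell so the caller stays
# usable even if the read blocks inside the kernel.
# probe_stat and its default timeout are hypothetical, for illustration only.
probe_stat() {
    local pid=$1 timeout=${2:-2} waited=0 marker status
    marker=$(mktemp) || return 2
    # The subshell records cat's exit status once the read returns.
    ( cat "/proc/$pid/stat" >/dev/null 2>&1; echo "$?" >"$marker" ) >/dev/null 2>&1 &
    # Poll once per second until the marker has content or the timeout expires.
    while [ ! -s "$marker" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "hung"          # the reader is still blocked; marker file is left behind
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    status=$(cat "$marker")
    rm -f "$marker"
    if [ "$status" -eq 0 ]; then
        echo "ok"
    else
        echo "unreadable"
        return 1
    fi
}
# Example: probe_stat 1783 5
```

A stuck reader cannot be killed while it sleeps uninterruptibly, so the sketch only reports the hang and leaves the reader behind.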
I am able to project a few files, but once the filesystem cache fills up, GRASS hangs or gives a
panic in vm_stat:381. The strange thing is, very little swap space is in use, and the filesystem
cache continues to use most of the RAM.
Is this a kernel bug, or do I need to use kernel 2.6.x (I am using kernel 2.4.26) and
/proc/sys/vm/overcommit_memory or similar hack?
Since I am a kernel-newbie, these links might help explain the problem better ;)
http://seclists.org/linux-kernel/2001/Dec/1604.html
http://www.mail-archive.com/debian-glibc@lists.debian.org/msg10070.html
Brian G
__________________________________
Do you Yahoo!?
Yahoo! Mail - 50x more storage than other providers!
http://promotions.yahoo.com/new_mail
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Kernel VM bug?
2004-06-28 2:30 Kernel VM bug? Brian
@ 2004-06-28 2:58 ` William Lee Irwin III
2004-06-28 13:01 ` Hugh Dickins
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: William Lee Irwin III @ 2004-06-28 2:58 UTC (permalink / raw)
To: Brian; +Cc: linux-kernel
On Sun, Jun 27, 2004 at 07:30:39PM -0700, Brian wrote:
> While doing massive memory allocation (I'm using GRASS to project
> NASA's BlueMarble maps) the kernel apparently tries to kill grass but
> fails. When I try to access /proc/<grass_pid>/stat the process hangs.
> For example, an 'strace' of 'ps' ends like this:
> open("/proc/1783/stat", O_RDONLY) = 6
> read(6, <PS and strace hang here>
> I am able to project a few files, but once the filesystem cache fills
> up, GRASS hangs or gives a panic in vm_stat:381. The strange thing
> is, very little swap space is in use, and the filesystem cache
> continues to use most of the RAM.
> Is this a kernel bug, or do I need to use kernel 2.6.x (I am using
> kernel 2.4.26) and /proc/sys/vm/overcommit_memory or similar hack?
> Since I am a kernel-newbie, these links might help explain the
> problem better ;)
> http://seclists.org/linux-kernel/2001/Dec/1604.html
> http://www.mail-archive.com/debian-glibc@lists.debian.org/msg10070.html
2.6 has had many of these kinds of failure modes addressed, but it's
not clear what the precise relationship of your issue is to the various
work that's gone on there. This description isn't really enough to go
on, but by "guesswork" I suspect it is a semaphore deadlock related to an
interaction between coredumping and process resource usage reporting in /proc/.
Excerpts of system logs related to the killing of the process may also
be useful to diagnose the problem.
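One possible way to pull such excerpts is to filter the kernel log for lines that usually accompany an OOM kill or a BUG(). This is a rough sketch: the helper name `vm_log_excerpt` is made up, the patterns are guesses at typical message wording rather than an exhaustive list, and the log paths in the comments are assumptions about the local syslog setup.

```shell
#!/bin/bash
# Keep only log lines that look like OOM-killer or kernel BUG reports.
# Reads the files given as arguments, or standard input if none are given.
# The patterns are assumptions about typical message wording.
vm_log_excerpt() {
    grep -E -i 'out of memory|oom[-_ ]kill|kernel BUG at' "$@"
}
# Examples (paths are assumptions):
#   vm_log_excerpt /var/log/messages /var/log/syslog
#   dmesg | vm_log_excerpt
```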
In general, what needs to go on is for us to be able to reproduce your
failure. However, there are categories of problems, design flaws, that
require work too invasive to be addressed in 2.4, and because of the
ones that have been addressed in 2.6 related to this specific area, I'd
suspect there is a good chance that using 2.6 will resolve your problem.
Strict non-overcommit is also good to have in order for orderly
application shutdown or otherwise application self-regulation of
resource demands to occur at the time of hardware resource exhaustion.
This is by necessity enabled by default and has to be disabled at
runtime. You shouldn't have to do anything to enable it, but to
doublecheck that strict non-overcommit hasn't been disabled by e.g.
initscripts, please check that /proc/sys/vm/overcommit_memory stays 0.
To investigate what may have happened in 2.4, it may be helpful for us
to be able to run GRASS on a similar data set (IIRC it is open source
and freely available for download) and to arrange for testing on a
similar machine, which by and large we can arrange for ourselves given
a sufficiently detailed description.
Thanks.
-- wli
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Kernel VM bug?
2004-06-28 2:58 ` William Lee Irwin III
@ 2004-06-28 13:01 ` Hugh Dickins
2004-06-28 20:18 ` William Lee Irwin III
2004-06-29 20:05 ` Brian
2004-06-30 2:04 ` Kernel VM bug? (more info) Brian Gunlogson
2 siblings, 1 reply; 7+ messages in thread
From: Hugh Dickins @ 2004-06-28 13:01 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: Brian, linux-kernel
On Sun, 27 Jun 2004, William Lee Irwin III wrote:
>
> Strict non-overcommit is also good to have in order for orderly
> application shutdown or otherwise application self-regulation of
> resource demands to occur at the time of hardware resource exhaustion.
> This is by necessity enabled by default and has to be disabled at
> runtime. You shouldn't have to do anything to enable it, but to
> doublecheck that strict non-overcommit hasn't been disabled by e.g.
> initscripts, please check that /proc/sys/vm/overcommit_memory stays 0.
I'm not sure if I'm niggling over terminology, or pointing out a
significant misunderstanding: but /proc/sys/vm/overcommit_memory 0
(indeed the default) is not what I call strict non-overcommit: that's 2.
All settings (0, 1, 2) maintain the Committed_AS count shown in
/proc/meminfo; but only /proc/sys/vm/overcommit_memory 2 totals and
limits reservations using it. 1 imposes no limit. 0 checks that the
particular "reservation" could plausibly be made available now, but
without considering the total: so allows any number of concurrent
maximum reservations - traditional relaxed Linux behaviour, not strict.
(2 came along much later, yes the naming and numbering are both horrid.)
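The three settings can be summarised in a small shell sketch. `describe_overcommit` is an illustrative helper, and its wording only paraphrases the description above; the /proc reads are guarded so the snippet does nothing on systems without those files.

```shell
#!/bin/bash
# Map an overcommit_memory value to the behaviour described above.
# describe_overcommit is a hypothetical helper name.
describe_overcommit() {
    case "$1" in
        0) echo "relaxed: each reservation checked for plausibility, total not limited" ;;
        1) echo "no limit" ;;
        2) echo "strict: reservations totalled and limited using Committed_AS" ;;
        *) echo "unknown mode: $1" ;;
    esac
}

# Report the current mode and the Committed_AS counter when /proc provides them.
if [ -r /proc/sys/vm/overcommit_memory ]; then
    describe_overcommit "$(cat /proc/sys/vm/overcommit_memory)"
    grep '^Committed_AS' /proc/meminfo || echo "Committed_AS not reported"
fi
```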
Hugh
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Kernel VM bug?
2004-06-28 13:01 ` Hugh Dickins
@ 2004-06-28 20:18 ` William Lee Irwin III
0 siblings, 0 replies; 7+ messages in thread
From: William Lee Irwin III @ 2004-06-28 20:18 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Brian, linux-kernel
On Mon, Jun 28, 2004 at 02:01:29PM +0100, Hugh Dickins wrote:
> I'm not sure if I'm niggling over terminology, or pointing out a
> significant misunderstanding: but /proc/sys/vm/overcommit_memory 0
> (indeed the default) is not what I call strict non-overcommit: that's 2.
> All settings (0, 1, 2) maintain the Committed_AS count shown in
> /proc/meminfo; but only /proc/sys/vm/overcommit_memory 2 totals and
> limits reservations using it. 1 imposes no limit. 0 checks that the
> particular "reservation" could plausibly be made available now, but
> without considering the total: so allows any number of concurrent
> maximum reservations - traditional relaxed Linux behaviour, not strict.
> (2 came along much later, yes the naming and numbering are both horrid.)
I'm not sure if the numbers changed or something else went wrong. Not
encouraging to hear this behaved differently from my expectations
without my noticing.
-- wli
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Kernel VM bug?
2004-06-28 2:58 ` William Lee Irwin III
2004-06-28 13:01 ` Hugh Dickins
@ 2004-06-29 20:05 ` Brian
2004-06-29 20:09 ` William Lee Irwin III
2004-06-30 2:04 ` Kernel VM bug? (more info) Brian Gunlogson
2 siblings, 1 reply; 7+ messages in thread
From: Brian @ 2004-06-29 20:05 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2695 bytes --]
GRASS also has problems on the 2.6.7 kernel.
My system is an Athlon-XP with 512MB RAM running Slackware 10.0.0 (kernel 2.4.26) full
installation in X windows with minimal window manager and minimal other processes.
To reproduce:
Download the NASA blue marble from
(ftp://mitch.gsfc.nasa.gov/pub/stockli/bluemarble/MOD09A1.W.interpol.cyl.retouched.topo.bathymetry.3x21600x21600.gz)
and use netpbm to convert the RAW RGB to a PPM.
'cat bluemarble.gz | gzip -dc | rawtoppm 21600 21600 > bluemarble.W.ppm'
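For scale, the size of the converted image can be estimated with shell arithmetic. This is a back-of-the-envelope sketch that assumes one byte per colour channel and ignores the few bytes of PPM header; the 512MB figure is the RAM size given above.

```shell
#!/bin/bash
# Approximate uncompressed size of a 21600 x 21600 image at 3 bytes (R, G, B) per pixel.
# One byte per channel is an assumption about the raw input.
width=21600
height=21600
bytes_per_pixel=3
raw_bytes=$((width * height * bytes_per_pixel))
ram_bytes=$((512 * 1024 * 1024))    # 512MB of RAM
echo "raw image: $raw_bytes bytes"                                # prints 1399680000
echo "RAM:       $ram_bytes bytes"                                # prints 536870912
echo "image is $((raw_bytes * 100 / ram_bytes))% of RAM"          # prints 260
```

So the PPM alone is roughly 2.6 times the size of physical memory on this machine.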
Compile and install grass CVS as of June 28 2004 20:20:00 UTC or use attached bash shell script.
Create a new grass location, let's say with a location name of 'tiger' and a mapset name of
'brian' (use space to delete if compiled without readline support). Make sure the database
directory is set and it already exists. Answer 'yes' until you get asked to select the coordinate
system. Select coordinate system B, longitude latitude. Keep answering until you get asked to
select a geodetic datum, use 'wgs84' as the datum. Type '1' when asked for the datum
transformation parameters. For north edge type 50N, south edge type 23N, west edge 125W, east edge
70W, both east west and north south resolution of 0.00222222.
Create a new location, let's say with a location name of 'blue.w_loc' and the mapset MUST be
'PERMANENT'. The rest is the same until the default region. Set those to north 90N, south 90S,
west 180W, east 0W, and both grid resolutions to 0.00833333.
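The grid sizes implied by the two regions can be worked out as follows. This sketch assumes the resolutions 0.00222222 and 0.00833333 stand for 1/450 and 1/120 of a degree (the decimals are truncated, so GRASS may round slightly differently); `region_cells` is an illustrative helper only.

```shell
#!/bin/bash
# Rows, columns and total cells of a lat/long region.
# Arguments: north south west east cells_per_degree
# Latitudes are degrees north (negative for south); west/east are degrees west.
# region_cells is a hypothetical helper name.
region_cells() {
    local rows=$(( ($1 - $2) * $5 ))
    local cols=$(( ($3 - $4) * $5 ))
    echo "$rows $cols $((rows * cols))"
}
region_cells 50 23 125 70 450     # 'tiger':      prints 12150 24750 300712500
region_cells 90 -90 180 0 120     # 'blue.w_loc': prints 21600 21600 466560000
```

The second result matches the 21600 x 21600 dimensions of the source image.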
Restart GRASS and choose the 'blue.w_loc'/'PERMANENT' location/mapset. Import the bluemarble PPM
with r.in.ppm, using the "-b" command line option to create separate red/green/blue maps:
'r.in.ppm -bv input=<path to blue marble ppm> output=bluemarble.w'.
Restart GRASS and use the 'tiger'/'brian' location/mapset.
Project the bluemarble maps until the filesystem cache fills up and something bad happens. Might
take a few tries.
'r.proj input=bluemarble.w.r location=blue.w_loc mapset=PERMANENT method=cubic;r.proj
input=bluemarble.w.g location=blue.w_loc mapset=PERMANENT method=cubic;r.proj input=bluemarble.w.b
location=blue.w_loc mapset=PERMANENT method=cubic'
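The same three projections can be written as a loop that also shows memory usage between runs. This is a sketch only: `project_bands` and the `RUN` switch are made-up names, the r.proj arguments are copied from the command above, and unlike the ';'-separated form it stops at the first failing band. Without RUN=1 it just prints the commands.

```shell
#!/bin/bash
# Print (default) or run (RUN=1) one r.proj invocation per colour band.
# project_bands and RUN are hypothetical names for illustration.
project_bands() {
    local band cmd
    for band in r g b; do
        cmd="r.proj input=bluemarble.w.$band location=blue.w_loc mapset=PERMANENT method=cubic"
        if [ "${RUN:-0}" = "1" ]; then
            # Show free memory, page cache and free swap before each projection.
            grep -E '^(MemFree|Cached|SwapFree):' /proc/meminfo
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}
project_bands
```

Running it inside the GRASS session as 'RUN=1 project_bands' executes the projections instead of printing them.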
Brian
--- William Lee Irwin III <wli@holomorphy.com> wrote:
> To investigate what may have happened in 2.4, it may be helpful for us
> to be able to run GRASS on a similar data set (IIRC it is open source
> and freely available for download) and to arrange for testing on a
> similar machine, which by and large we can arrange for ourselves given
> a sufficiently detailed description.
>
> Thanks.
>
> -- wli
[-- Attachment #2: build_grass.sh --]
[-- Type: text/x-sh; name="build_grass.sh", Size: 1439 bytes --]
#!/bin/bash
# Build a GRASS CVS snapshot and its dependencies (fftw, proj, gdal) into /opt/grass.
mkdir "$HOME/grassroot" || exit 1
cd "$HOME/grassroot" || exit 1
# Check out GRASS
echo "Password is grass"
cvs -d:pserver:grass-guest@intevation.de:/home/grass/grassrepository login
cvs -d:pserver:grass-guest@intevation.de:/home/grass/grassrepository -z3 checkout -D "28 Jun 2004 20:20" grass
# Get GRASS dependencies
cd "$HOME/grassroot" || exit 1
echo "Getting fftw-2.1.5..."
wget http://www.fftw.org/fftw-2.1.5.tar.gz
tar -xzf fftw-2.1.5.tar.gz
echo "Getting proj-4.4.8..."
wget --passive-ftp ftp://ftp.remotesensing.org/pub/proj/proj-4.4.8.tar.gz
tar -xzf proj-4.4.8.tar.gz
echo "Getting gdal-1.2.1..."
wget --passive-ftp ftp://ftp.remotesensing.org/gdal/gdal-1.2.1.tar.gz
tar -xzf gdal-1.2.1.tar.gz
# Compile dependencies; stop at the first failure
cd "$HOME/grassroot/fftw-2.1.5" || exit 1
./configure --prefix=/opt/grass &&
make || exit 1
echo "Need root password"
su -c "make install" || exit 1
cd "$HOME/grassroot/proj-4.4.8" || exit 1
./configure --prefix=/opt/grass &&
make || exit 1
echo "Need root password"
su -c "make install" || exit 1
cd "$HOME/grassroot/gdal-1.2.1" || exit 1
./configure --prefix=/opt/grass &&
make || exit 1
echo "Need root password"
su -c "make install" || exit 1
# Compile GRASS against the libraries installed above
cd "$HOME/grassroot/grass" || exit 1
PATH="/opt/grass/bin:$PATH" ./configure --prefix=/opt/grass \
    --with-proj-includes=/opt/grass/include/ --with-proj-libs=/opt/grass/lib/ \
    --with-fftw-libs=/opt/grass/lib --with-fftw-includes=/opt/grass/include/ \
    --without-postgres --without-odbc &&
make || exit 1
echo "Need root password"
su -c "make install" || exit 1
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Kernel VM bug?
2004-06-29 20:05 ` Brian
@ 2004-06-29 20:09 ` William Lee Irwin III
0 siblings, 0 replies; 7+ messages in thread
From: William Lee Irwin III @ 2004-06-29 20:09 UTC (permalink / raw)
To: Brian; +Cc: linux-kernel
On Tue, Jun 29, 2004 at 01:05:25PM -0700, Brian wrote:
> GRASS also has problems on the 2.6.7 kernel.
> My system is an Athlon-XP with 512MB RAM running Slackware 10.0.0
> (kernel 2.4.26) full installation in X windows with minimal window
> manager and minimal other processes.
> To reproduce:
> Download the NASA blue marble from
Okay, I'll get onto fixing this. My IRQ stack overfloweth.
-- wli
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Kernel VM bug? (more info)
2004-06-28 2:58 ` William Lee Irwin III
2004-06-28 13:01 ` Hugh Dickins
2004-06-29 20:05 ` Brian
@ 2004-06-30 2:04 ` Brian Gunlogson
2 siblings, 0 replies; 7+ messages in thread
From: Brian Gunlogson @ 2004-06-30 2:04 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: linux-kernel
I got more info, hope this helps. Sometimes the kernel panics at vmscan:388. That is in kernel
2.4.26, function shrink_cache() on a source code line that says BUG_ON(!PageLRU(page));
Brian
^ permalink raw reply [flat|nested] 7+ messages in thread