public inbox for linux-kernel@vger.kernel.org
* Only 10 MB/sec with via 82c686b chipset?
@ 2001-03-21  3:48 SodaPop
  2001-03-21 13:48 ` egger
  2001-03-21 14:18 ` Only 10 MB/sec with via 82c686b chipset? Jonathan Morton
  0 siblings, 2 replies; 162+ messages in thread
From: SodaPop @ 2001-03-21  3:48 UTC (permalink / raw)
  To: linux-kernel

Only 10 MB/sec with via 82c686b chipset?

I have an IWill KK-266R motherboard with an athlon-c 1200
processor in it, and for the life of me I can't get more than
10 MB/sec through the on-board ide controller.  Yes, all the
appropriate support is turned on in the kernel to enable dma
and specific chipset support, and yes, I think I have all
relevant patches and a reasonable kernel.

I started out with stock kernel 2.4.2.  I later added Hedrick's
ide.2.4.3-p4.all.03132001.patch, which did not change the
behaviour other than to include messages in the dmesg output. I
have even tried removing the via-specific chipset support from
the kernel, only to find that enabling dma with generic support
leaves me at the same 10 MB/sec as the specific support does.

I noted a number of other interesting things: one, that -X33,
-X34, and -X64 through -X69 all have the same 10 MB/sec transfer
rate, and two, that the 10 MB/sec transfer rate can be linearly
increased to 12 MB/sec by raising the system bus from 100 MHz to
120 MHz (all components are safely rated at 133, no overclocking
involved).

It is also quite strange that I have been able to run 'hdparm
-t /dev/hda' and 'hdparm -t /dev/hdb' concurrently and can still
get the full 10 MB/sec on both, for a sum total of 20 MB/sec. I
would have expected that the drives would clobber each other
given that they are on the same ide bus.

From the /proc data and other information below, it seems to me
that this is some kind of screwball tuning issue between linux and
the chipset, not the chipset and the drives.  As near as I can
tell, the chipset is talking to the drives at a much higher data
rate than 10 MB/sec, but for some reason linux isn't able to
process the data any faster than that.  Running hdparm -t in
parallel and observing a speed increase from raising the cpu
clock leads me in that direction.  (Also note that hdparm -t only
uses a few percent of cpu.  It's not like the machine doesn't
have enough processing power.)

I'm really baffled at this point.  I can't rule out that I have
done something dumb, but for the life of me I can't think of
anything else.  I've been to a number of web pages, but the
general consensus seems to be that this chipset should just work,
and work beautifully without any trouble.  There aren't any
fixes because I seem to be the only one having this problem.

Does anyone have any other ideas?

-dennis T





Misc hardware information:
----------------------------------------------------
The board has raid hardware on it, but it's currently disabled with
jumpers.  Cables are high quality 80 wire/40 pin cables.  Both
drives are the same, but currently hdb has the 32 gig clip jumper
attached.  Putting the drives on separate ide busses does not
change the 10 MB/sec throughput.  Removing hdb from the chain
does not raise the throughput of hda.  Both drives are rated at
37 MB/sec continuous DTR.  Hdd is a cheap 8x cdrom.

Hdb, when installed in a nearby machine, has a 17 MB/sec data
rate.  The machine is an AMD K6-II 500, 100 MHz bus, with via
82c586 chipset, running kernel 2.4.1 (via chipset support
enabled.)




Various miscellaneous data, and dmesg output:
----------------------------------------------------

root@gurney:~# cat /proc/ide/via
----------VIA BusMastering IDE Configuration----------------
Driver Version:                     3.20
South Bridge:                       VIA vt82c686b
Revision:                           ISA 0x40 IDE 0x6
BM-DMA base:                        0xd000
PCI clock:                          33MHz
Master Read  Cycle IRDY:            0ws
Master Write Cycle IRDY:            0ws
BM IDE Status Register Read Retry:  yes
Max DRDY Pulse Width:               No limit
-----------------------Primary IDE-------Secondary IDE------
Read DMA FIFO flush:          yes                 yes
End Sector FIFO flush:         no                  no
Prefetch Buffer:              yes                 yes
Post Write Buffer:            yes                  no
Enabled:                      yes                 yes
Simplex only:                  no                  no
Cable Type:                   80w                 40w
-------------------drive0----drive1----drive2----drive3-----
Transfer Mode:       UDMA      UDMA       PIO       PIO
Address Setup:       30ns      30ns     120ns      60ns
Cmd Active:          90ns      90ns      90ns      90ns
Cmd Recovery:        30ns      30ns      90ns      90ns
Data Active:         90ns      90ns     330ns     240ns
Data Recovery:       30ns      30ns     270ns     240ns
Cycle Time:          20ns      20ns      50ns      90ns
Transfer Rate:  100.0MB/s 100.0MB/s  40.0MB/s  22.2MB/s
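
The Transfer Rate row above is consistent with a 16-bit ATA data bus
moving 2 bytes per cycle: 2 bytes / 20 ns = 100 MB/s. A minimal sketch
of that arithmetic follows. The helper names and the 2-bytes-per-cycle
constant are assumptions for illustration, not taken from the VIA
driver source.

```c
#include <stdio.h>

/* Assumption: a 16-bit ATA data bus transfers 2 bytes per cycle. */
#define ATA_BYTES_PER_CYCLE 2.0

/* Hypothetical helper: burst rate in MB/s (10^6 bytes/s) for a given
 * cycle time in nanoseconds. */
static double ata_burst_rate_mb(double cycle_ns)
{
    return ATA_BYTES_PER_CYCLE * 1000.0 / cycle_ns;
}

/* Print the burst rate for the three distinct cycle times in the table. */
static void print_burst_rates(void)
{
    const double cycles_ns[] = { 20.0, 50.0, 90.0 };

    for (size_t i = 0; i < sizeof cycles_ns / sizeof cycles_ns[0]; i++)
        printf("%.0fns -> %.1fMB/s\n", cycles_ns[i],
               ata_burst_rate_mb(cycles_ns[i]));
}
```

Under that assumption the interface is programmed for roughly ten
times the 10 MB/sec that hdparm -t measures.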

----------------------------------------------------
root@gurney:~# hdparm /dev/hda

/dev/hda:
 multcount    =  0 (off)
 I/O support  =  1 (32-bit)
 unmaskirq    =  1 (on)
 using_dma    =  1 (on)
 keepsettings =  0 (off)
 nowerr       =  0 (off)
 readonly     =  0 (off)
 readahead    =  8 (on)
 geometry     = 5606/255/63, sectors = 90069840, start = 0

----------------------------------------------------
root@gurney:~# hdparm -i /dev/hda

/dev/hda:

 Model=IBM-DTLA-307045, FwRev=TX6OA60A, SerialNo=YMEYMNF7564
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=40
 BuffType=DualPortCache, BuffSize=1916kB, MaxMultSect=16, MultSect=off
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=90069840
 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5

----------------------------------------------------
root@gurney:~# hdparm -T /dev/hda

/dev/hda:
 Timing buffer-cache reads:   128 MB in  0.88 seconds =145.45 MB/sec

----------------------------------------------------
root@gurney:~# hdparm -t /dev/hda

/dev/hda:
 Timing buffered disk reads:  64 MB in  6.11 seconds = 10.47 MB/sec

----------------------------------------------------
root@gurney:~# dmesg
Linux version 2.4.2 (root@gurney) (gcc version 2.95.2 19991024 (release)) #4 Tue Mar 20 18:02:16 CST 2001
BIOS-provided physical RAM map:
 BIOS-e820: 000000000009fc00 @ 0000000000000000 (usable)
 BIOS-e820: 0000000000000400 @ 000000000009fc00 (usable)
 BIOS-e820: 0000000000010000 @ 00000000000f0000 (reserved)
 BIOS-e820: 0000000000010000 @ 00000000ffff0000 (reserved)
 BIOS-e820: 000000000fef0000 @ 0000000000100000 (usable)
 BIOS-e820: 000000000000d000 @ 000000000fff3000 (ACPI data)
 BIOS-e820: 0000000000003000 @ 000000000fff0000 (ACPI NVS)
On node 0 totalpages: 65520
zone(0): 4096 pages.
zone(1): 61424 pages.
zone(2): 0 pages.
Kernel command line: BOOT_IMAGE=2.4.2 ro root=304
Initializing CPU#0
Detected 898.848 MHz processor.
Console: colour VGA+ 80x34
Calibrating delay loop... 1789.13 BogoMIPS
Memory: 255656k/262080k available (903k kernel code, 6036k reserved, 337k data, 192k init, 0k highmem)
Dentry-cache hash table entries: 32768 (order: 6, 262144 bytes)
Buffer-cache hash table entries: 16384 (order: 4, 65536 bytes)
Page-cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 16384 (order: 5, 131072 bytes)
CPU: Before vendor init, caps: 0183f9ff c1c7f9ff 00000000, vendor = 2
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
CPU: After vendor init, caps: 0183f9ff c1c7f9ff 00000000 00000000
CPU: After generic, caps: 0183f9ff c1c7f9ff 00000000 00000000
CPU: Common caps: 0183f9ff c1c7f9ff 00000000 00000000
CPU: AMD Athlon(tm) Processor stepping 02
Enabling fast FPU save and restore... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.37 (20001109) Richard Gooch (rgooch@atnf.csiro.au)
mtrr: detected mtrr type: Intel
PCI: PCI BIOS revision 2.10 entry at 0xfb240, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Bus master read caching disabled
Unknown bridge resource 0: assuming transparent
PCI: Using IRQ router VIA [1106/0686] at 00:07.0
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Starting kswapd v1.8
Detected PS/2 Mouse Port.
pty: 256 Unix98 ptys configured
block: queued sectors max/low 169880kB/56626kB, 512 slots per queue
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller on PCI bus 00 dev 39
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci00:07.1
    ide0: BM-DMA at 0xd000-0xd007, BIOS settings: hda:DMA, hdb:DMA
    ide1: BM-DMA at 0xd008-0xd00f, BIOS settings: hdc:pio, hdd:DMA
hda: IBM-DTLA-307045, ATA DISK drive
hdb: IBM-DTLA-307045, ATA DISK drive
hdd: HITACHI CDR-7930, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: 90069840 sectors (46116 MB) w/1916KiB Cache, CHS=5606/255/63, UDMA(100)
hdb: 66055248 sectors (33820 MB) w/1916KiB Cache, CHS=65531/16/63, UDMA(100)
hdd: ATAPI 8X CD-ROM drive, 128kB Cache, DMA
Uniform CD-ROM driver Revision: 3.12
Partition check:
 hda: hda1 hda2 hda3 hda4
 hdb: hdb1 hdb2
Serial driver version 5.02 (2000-08-09) with MANY_PORTS SHARE_IRQ SERIAL_PCI enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
Real Time Clock Driver v1.10d
8139too Fast Ethernet driver 0.9.13 loaded
PCI: Found IRQ 10 for device 00:0b.0
PCI: The same IRQ used for device 00:0f.0
eth0: RealTek RTL8139 Fast Ethernet at 0xd0800000, 00:e0:7d:95:af:a6, IRQ 10
eth0:  Identified 8139 chip type 'RTL-8139C'
Linux agpgart interface v0.99 (c) Jeff Hartmann
agpgart: Maximum main memory to use for agp memory: 203M
agpgart: Detected Via Apollo Pro KT133 chipset
agpgart: AGP aperture is 64M @ 0xd8000000
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 2048 buckets, 16Kbytes
TCP: Hash tables configured (established 16384 bind 16384)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
reiserfs: checking transaction log (device 03:04) ...
Using r5 hash to sort names
ReiserFS version 3.6.25
VFS: Mounted root (reiserfs filesystem) readonly.
Freeing unused kernel memory: 192k freed
Adding Swap: 530136k swap-space (priority -1)
hda: Write Cache SUCCESSED Flushing!<6>hda: Write Cache SUCCESSED Flushing!<6>usb.c: registered new driver usbdevfs
usb.c: registered new driver hub




^ permalink raw reply	[flat|nested] 162+ messages in thread
* [PATCH] Prevent OOM from killing init
@ 2001-03-21 22:54 Patrick O'Rourke
  2001-03-21 23:11 ` Eli Carter
  2001-03-21 23:48 ` Rik van Riel
  0 siblings, 2 replies; 162+ messages in thread
From: Patrick O'Rourke @ 2001-03-21 22:54 UTC (permalink / raw)
  To: linux-mm, linux-kernel

Since the system will panic if the init process is chosen by
the OOM killer, the following patch prevents select_bad_process()
from picking init.

Pat

--- xxx/linux-2.4.3-pre6/mm/oom_kill.c  Tue Nov 14 13:56:46 2000
+++ linux-2.4.3-pre6/mm/oom_kill.c      Wed Mar 21 15:25:03 2001
@@ -123,7 +123,7 @@

         read_lock(&tasklist_lock);
         for_each_task(p) {
-               if (p->pid) {
+               if (p->pid && p->pid != 1) {
                         int points = badness(p);
                         if (points > maxpoints) {
                                 chosen = p;
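
For readers without the kernel tree at hand, the effect of the changed
condition can be sketched in userspace. This is only an illustration:
struct task, its precomputed badness field, and select_bad_task() are
hypothetical stand-ins for the kernel's task list, badness(p), and
select_bad_process().

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's task_struct. */
struct task {
    int pid;
    int badness;    /* precomputed score; the kernel calls badness(p) */
};

/* Return the eligible task with the highest score, skipping pid 0 and
 * pid 1 (init). Returns NULL when no task is eligible. */
static const struct task *select_bad_task(const struct task *tasks, size_t n)
{
    const struct task *chosen = NULL;
    int maxpoints = 0;

    for (size_t i = 0; i < n; i++) {
        const struct task *p = &tasks[i];

        if (p->pid && p->pid != 1) {    /* the condition from the patch */
            if (p->badness > maxpoints) {
                chosen = p;
                maxpoints = p->badness;
            }
        }
    }
    return chosen;
}
```

With this check, init is never returned even when it has the highest
score of all tasks.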

-- 
Patrick O'Rourke
978.606.0236
orourke@missioncriticallinux.com


* RE: [PATCH] Prevent OOM from killing init
@ 2001-03-21 23:41 Leif Sawyer
  2001-03-22  0:32 ` Kevin Buhr
  0 siblings, 1 reply; 162+ messages in thread
From: Leif Sawyer @ 2001-03-21 23:41 UTC (permalink / raw)
  To: Eli Carter, Patrick O'Rourke; +Cc: linux-kernel

Patrick O'Rourke wrote:
> Since the system will panic if the init process is chosen by
> the OOM killer, the following patch prevents select_bad_process()
> from picking init.
> 

(Patch deleted)

What happens when init is not pid 1, as is often the case
during installs, when booting off a CD-ROM, etc.?


* RE: [PATCH] Prevent OOM from killing init
@ 2001-03-22 11:08 Heusden, Folkert van
  0 siblings, 0 replies; 162+ messages in thread
From: Heusden, Folkert van @ 2001-03-22 11:08 UTC (permalink / raw)
  To: Patrick O'Rourke, linux-mm, linux-kernel

> Since the system will panic if the init process is chosen by
> the OOM killer, the following patch prevents select_bad_process()
> from picking init.

Hmmm, wouldn't it be nice to make this all configurable? For example,
have some list of PIDs that must not be killed?
I would hate it if the daemon that checks my UPS got killed...
(that daemon brings the machine down safely when the UPS's
batteries run empty).
It would be something like:

int *dont_kill_pid, ndont_kill_pid;
/* initialize with at least pid 1 and ndont_kill_pid = 1 */

         for_each_task(p) {
                 int loop;

                 /* scan the exemption list; loop ends at -1 if p is not in it */
                 for (loop = ndont_kill_pid - 1; loop >= 0; loop--) {
                         if (dont_kill_pid[loop] == p->pid)
                                 break;
                 }
                 if (p->pid && loop < 0) {
                         int points = badness(p);
                         if (points > maxpoints) {
                                 chosen = p;


(untested (not even compiled or anything) code)

[parent not found: <4605B269DB001E4299157DD1569079D2809930@EXCHANGE03.plaza.ds.adp.com>]
* Re: [PATCH] Prevent OOM from killing init
@ 2001-03-22 23:35 Mikael Pettersson
  2001-03-22 23:43 ` Alan Cox
  0 siblings, 1 reply; 162+ messages in thread
From: Mikael Pettersson @ 2001-03-22 23:35 UTC (permalink / raw)
  To: alan; +Cc: linux-kernel

On Thu, 22 Mar 2001 21:23:54 +0000 (GMT), Alan Cox wrote:

>> Really the whole oom_kill process seems bass-ackwards to me.  I can't in my mind
>> logically justify annihilating large-VM processes that have been running for 
>> days or weeks instead of just returning ENOMEM to a process that just started 
>> up.
>
>How do you return an out of memory error to a C program that is out of memory
>due to a stack growth fault. There is actually not a language construct for it

SIGSEGV.
Stack overflow for a language like C using standard implementation techniques
is the same as a page fault while accessing a page for which there is no backing
store. SIGSEGV is the logical choice, and the one I'd expect on other Unices.

oom_kill should simply fail the current allocation which cannot be satisfied,
either by having {s,}brk/mmap return error or by posting a SIGSEGV. This would
actually also be the correct answer, if Linux didn't overcommit memory ...

Remove the overcommit crap and oom_kill can go away; this entails ensuring
that mmap() honors MAP_RESERVE/MAP_NORESERVE.

/Mikael

* Re: [PATCH] Prevent OOM from killing init
@ 2001-03-23  0:09 Mikael Pettersson
  2001-03-23  0:27 ` Andrew Morton
  2001-03-23 16:24 ` Horst von Brand
  0 siblings, 2 replies; 162+ messages in thread
From: Mikael Pettersson @ 2001-03-23  0:09 UTC (permalink / raw)
  To: alan; +Cc: linux-kernel

On Thu, 22 Mar 2001 23:43:57 +0000 (GMT), Alan Cox wrote:

> > >How do you return an out of memory error to a C program that is out of memory
> > >due to a stack growth fault. There is actually not a language construct for it
> > SIGSEGV.
> > Stack overflow for a language like C using standard implementation techniques
> > is the same as a page fault while accessing a page for which there is no backing
> > store. SIGSEGV is the logical choice, and the one I'd expect on other Unices.
> 
> Guess again. You are expanding the stack because you have no room left on it.
> You take a fault. You want to report a SIGSEGV. Now where are you
> going to put the stack frame ?
> 
> SIGSEGV in combination with a preallocated alternate stack maybe

Oh I know 99% of the processes getting this will die. The behaviour I'd
expect from vanilla code in this particular case (stack overflow) is:
- page fault in stack "segment"
- no backing store available
- post SIGSEGV to current
  * push sighandler frame on current stack (or altstack, if registered) [+]
  * no room? SIG_DFL, i.e. kill

My point is that with overcommit removed, there's no question as to
which process is actually out of memory. No need for the kernel to guess;
since it doesn't guess, it cannot guess wrong.

Concerning the stack: sure, oom makes it problematic to report the
error in a useful way. So use sigaltstack() and SA_ONSTACK. [+]
Processes that don't do this get killed, but not because oom_kill
did some fancy guesswork.
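
As a rough sketch of the sigaltstack() plus SA_ONSTACK arrangement
described above: the function names install_segv_altstack() and
on_segv() are hypothetical, and the handler simply reports and exits
rather than attempting any recovery.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for sigaltstack() and sigaction() */
#endif
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical handler. It runs on the alternate stack, so it can still
 * execute when the normal stack is exhausted. Only async-signal-safe
 * calls are used. */
static void on_segv(int signo)
{
    static const char msg[] = "SIGSEGV caught on alternate stack\n";
    ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);

    (void)n;
    (void)signo;
    _exit(EXIT_FAILURE);
}

/* Register an alternate signal stack and a SIGSEGV handler that uses it.
 * Returns 0 on success, -1 on failure. */
static int install_segv_altstack(void)
{
    stack_t ss;
    struct sigaction sa;

    ss.ss_sp = malloc(SIGSTKSZ);
    if (ss.ss_sp == NULL)
        return -1;
    ss.ss_size = SIGSTKSZ;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) == -1)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sa.sa_flags = SA_ONSTACK;   /* deliver the signal on the alternate stack */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

A process that calls install_segv_altstack() early gets a chance to
report the fault; one that does not falls back to the default action.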

[+] Speaking as a hacker on a runtime system for a concurrent
programming language (Erlang), I consider the current Unix/POSIX/Linux
default of having the kernel throw up[*] at the user's current stack
pointer to be unbelievably broken. sigaltstack() and SA_ONSTACK should
not be options but required behaviour.

[*] Signal & trap frames used to be called "stack puke" in old 68k days.

/Mikael

* RE: [PATCH] Prevent OOM from killing init
@ 2001-03-23  9:28 Heusden, Folkert van
  0 siblings, 0 replies; 162+ messages in thread
From: Heusden, Folkert van @ 2001-03-23  9:28 UTC (permalink / raw)
  To: Rik van Riel, Tom Kondilis; +Cc: linux-mm, linux-kernel

> That's not the OOM killer however, but init dying because it
> couldn't get the memory it needed to satisfy a page fault or
> somesuch...

Ehrm, I would like to re-state that it still would be nice if
some mechanism got introduced which enables one to set certain
processes to "cannot be killed".
For example: I would hate it if the UPS monitoring daemon got
killed for obvious reasons :o)

* Re: [PATCH] Prevent OOM from killing init
@ 2001-03-23 18:29 Andries.Brouwer
  2001-03-23 18:38 ` Alan Cox
                   ` (2 more replies)
  0 siblings, 3 replies; 162+ messages in thread
From: Andries.Brouwer @ 2001-03-23 18:29 UTC (permalink / raw)
  To: alan, linux-kernel

On Fri, Mar 23, 2001 at 05:04:07PM +0000, Alan Cox wrote:
> > This is just an escape route in case everything else has failed.
> >
> > Linux is unreliable.
> > That is bad.
>
> Since your definition of reliability is a mathematical abstraction requiring
> infinite storage why don't you start by inventing infinitely large SDRAM
> chips, then get back to us ?

Ah, Alan,
I can see that you dislike seeing me say bad things about Linux.
I dislike having to say them.

On the other hand, my definition of reliability does not require
infinite storage. After all, earlier Unix flavours did not need
an OOM killer either, and my editor was not killed under Unix V6
on 64k when I started some other process.

Linux is unreliable because a program can be killed at random,
without warning, because of bugs in some other program.
The old Unix guarantee that a program only crashes because of
its own behaviour is lost. That is very sad.

What can one do? I need not tell you - you know better than I do.
The main point is letting malloc fail when the memory cannot be
guaranteed. There are various solutions for stack space, none of
them very elegant, but all have in common that when we run out of
stack space the program doing that gets SIGSEGV, and not some
random other program. (And a well-written program could catch this
SIGSEGV and do cleanup, preserving the integrity of its data base.
Clearly one would want to guarantee a certain minimum stack space
at fork time.)

Will this setup be very inefficient? I don't know. Perhaps.
If my programs actually use 10 MB but have a guarantee for
200 MB then the rest of that memory is not wasted. But it can
only be used for things that can be freed when needed, like
inode and buffer cache.

But inefficient or not, I much prefer a system with guarantees,
something that is reliable by default, above something that
works well if you are lucky and fails at unpredictable moments.

Andries

* RE: [PATCH] Prevent OOM from killing init
@ 2001-03-23 19:33 Stephen Satchell
  0 siblings, 0 replies; 162+ messages in thread
From: Stephen Satchell @ 2001-03-23 19:33 UTC (permalink / raw)
  To: linux-kernel

At 10:28 AM 3/23/01 +0100, you wrote:
>Ehrm, I would like to re-state that it still would be nice if
>some mechanism got introduced which enables one to set certain
>processes to "cannot be killed".
>For example: I would hate it if the UPS monitoring daemon got
>killed for obvious reasons :o)

Hey, my new flame-proof suit arrived today, so let me give it a try-out...

1)  If you have a daemon that absolutely positively has to be there, why 
not put the damn thing in "inittab" with the RESPAWN attribute?  OOM kills 
it, init notices it, init respawns it, you have your UPS monitoring daemon 
back.

2)  Why is task #1 (init) considered at all by the OOM task-killer 
code?  Sounds like a possible off-by-one bug to me.

3)  If random task-killing is such a problem, one solution is to add yet 
another word to the process table entry, something on the order of 
"oom_importance".  Off the top of my head, this 16-bit value would be 
0x4000 for "normal" processes, and would be the value at start-up.  A value 
of 0xFFFF would be the "never-kill" value, while the value of 0x0000 would 
be the equivalent of the guy who ALWAYS gives up his airplane seat.  The 
process could set this value between 0x0000 and 0xBFFF for processes 
running without root privs, the full range for root processes.  The big 
advantage here is that a daemon or major system can set the value to zero 
during start-up (to ensure being killed if there aren't enough system 
resources) and then boost the immunity once it is going strong.  I can see 
this being of particular value in windows desktops where an attempt to 
start a widget causes an out-of-memory condition and THAT WIDGET is the one 
that then dies.  That would be the expected behavior.

From a debug perspective, it means that the programmer can avoid killing 
something on his development system "by accident" by attracting all the 
task-killing lightning during initial debug.  This would be a sure-fire 
improvement over accidentally killing your debugger, for example.

I call it "nice for memory".
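
A minimal userspace sketch of the "oom_importance" idea, under stated
assumptions: struct proc_entry, set_oom_importance(), and pick_victim()
are hypothetical names, and "kill the lowest importance first" is one
reading of the proposal, which does not spell out the selection rule.

```c
#include <stddef.h>
#include <stdint.h>

#define OOM_IMPORTANCE_DEFAULT  0x4000u  /* value at start-up */
#define OOM_IMPORTANCE_NEVER    0xFFFFu  /* "never-kill" value */
#define OOM_IMPORTANCE_USER_MAX 0xBFFFu  /* ceiling without root privs */

/* Hypothetical process table entry carrying the extra 16-bit word. */
struct proc_entry {
    int pid;
    int is_root;
    uint16_t oom_importance;
};

/* Set the importance of a process. Non-root processes may only use
 * 0x0000..0xBFFF. Returns 0 on success, -1 if the value is refused. */
static int set_oom_importance(struct proc_entry *p, uint16_t value)
{
    if (!p->is_root && value > OOM_IMPORTANCE_USER_MAX)
        return -1;
    p->oom_importance = value;
    return 0;
}

/* Assumed selection rule: pick the entry with the lowest importance,
 * never one marked 0xFFFF. Returns NULL if nothing may be killed. */
static const struct proc_entry *pick_victim(const struct proc_entry *procs,
                                            size_t n)
{
    const struct proc_entry *victim = NULL;

    for (size_t i = 0; i < n; i++) {
        if (procs[i].oom_importance == OOM_IMPORTANCE_NEVER)
            continue;
        if (victim == NULL ||
            procs[i].oom_importance < victim->oom_importance)
            victim = &procs[i];
    }
    return victim;
}
```

A daemon could start at 0x0000, so that it is the first to go if the
machine cannot hold it, and raise the value once it is running.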

Satch


* Re: [PATCH] Prevent OOM from killing init
@ 2001-03-23 23:15 Andries.Brouwer
  2001-03-23 23:17 ` Martin Dalecki
                   ` (2 more replies)
  0 siblings, 3 replies; 162+ messages in thread
From: Andries.Brouwer @ 2001-03-23 23:15 UTC (permalink / raw)
  To: Andries.Brouwer, alan; +Cc: linux-kernel

[to various people]

No, ulimit does not work. (But it helps a little.)
No, /proc/sys/vm/overcommit_memory does not work.

[to Alan]

> Nobody feels it's very important because nobody has implemented it.

Yes, that is the right response.
What can one say? One can only do.

Andries


* Re: [PATCH] Prevent OOM from killing init
@ 2001-03-24  1:11 Andries.Brouwer
  0 siblings, 0 replies; 162+ messages in thread
From: Andries.Brouwer @ 2001-03-24  1:11 UTC (permalink / raw)
  To: timw; +Cc: alan, linux-kernel


> It was actually worse than that. Grab your copy of "Lions", and check lines
> 4375-4377 in function xswap(). A failure to allocate space in the swapmap
> caused a panic. Same problem in xalloc().

[no Lions nearby; somewhere I still have the printout but am
too lazy to search; I also have the tape but nothing to read it with]

Yes, you may well be right if you say that my picture
of the distant past is too rosy. Maybe I forgot all
this trouble.
Still, yesterday I lost three edit sessions.
I do not recall any such occurrence in the 25 years before.


* Re: [PATCH] Prevent OOM from killing init
@ 2001-03-24  1:38 Jonathan Morton
  0 siblings, 0 replies; 162+ messages in thread
From: Jonathan Morton @ 2001-03-24  1:38 UTC (permalink / raw)
  To: linux-kernel

>Hmm...  "if ( freemem < (size_of_mallocing_process / 20) ) fail_to_allocate;"
>
>Seems like a reasonable soft limit - processes which have already got lots
>of RAM can probably stand not to have that little bit more and can be
>curbed more quickly.  Processes with less probably don't deserve to die and
>furthermore are less likely to be engineered to handle malloc() failure, so
>failure only occurs closer to the mark.  In this scenario OOM killing
>(which is, after all, a last resort) should trigger rarely and simple
>malloc() failure (which userspace apps can cope with more easily) is an
>early-warning and prevention system.

Following up my own post with some action, I hacked 2.4.1's
mm/mmap.c::vm_enough_memory() to include something similar to the above
algorithm.  In fact, it triggers malloc() failure when 1/16th of
current->mm->total_vm would be greater than the sum of the free space and
the potentially-allocated area.

My very quick tests show that my test program (the rogue allocator) now in
fact does encounter a failed malloc() at approx. 475M, instead of being
killed by the OOM handler at approx. 490M.  This is pretty much the desired
behaviour.
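
For illustration only, the quoted one-liner can be written as a small
predicate. This is not the actual kernel hack: allocation_allowed() and
its parameters are hypothetical, and the divisor is left as an argument
(20 in the quoted suggestion, 16 in the version described above).

```c
#include <stdbool.h>

/* Hypothetical soft limit: refuse an allocation once free memory drops
 * below 1/divisor of the requesting process's current size. All sizes
 * are in pages; divisor must be non-zero. */
static bool allocation_allowed(unsigned long free_pages,
                               unsigned long process_pages,
                               unsigned long divisor)
{
    return free_pages >= process_pages / divisor;
}
```

A large process hits the limit while a fair amount of memory is still
free, whereas a small process can keep allocating much closer to the
point of exhaustion.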

If someone would like me to post a patch and have it tested, I'd be happy
to do so.

--------------------------------------------------------------
from:     Jonathan "Chromatix" Morton
mail:     chromi@cyberspace.org  (not for attachments)
big-mail: chromatix@penguinpowered.com
uni-mail: j.d.morton@lancaster.ac.uk

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-----BEGIN GEEK CODE BLOCK-----
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-----END GEEK CODE BLOCK-----



* Re: [PATCH] Prevent OOM from killing init
@ 2001-03-24  2:30 Andreas Franck
  0 siblings, 0 replies; 162+ messages in thread
From: Andreas Franck @ 2001-03-24  2:30 UTC (permalink / raw)
  To: linux-kernel

Hi all,

seems like a hot discussion going on, but I couldn't resist and would like to
throw in my $0.02.

Besides misunderstandings and general displeasure, some very interesting
facts have shown up in the discussion (oh, yeah), which I'd like to know more
about, and just extend them with a bit of my latest experience regarding
memory usage.

First one is about buffer/inode cache. What I expect as a medium-skilled
system hacker would be: Before giving up with an OOM-whatever,

a) all non-dirty buffers should be freed, possibly giving tons of memory
b) all dirty buffers should be flushed and freed, alas

I'm not sure if both are tried ATM, but I think enough experts are here to
answer my questions :)

What I saw lately was some general system sluggishness after copying very big
files (ripping a CD image to disk) - it seems the system has paged out most
of its processes (including the calling bash shell) in favor of the copying
task, just for buffers! Up to which degree is this reasonable? It seems to
slow down the system when using swap, so for this task I would have been
better off deactivating it. Not what one "intuitively" expects.

So, what is the second important point? The current system cannot properly
distinguish between memory an application "really" needs and memory an
application "eventually" needs (as internal caches, ...).

A possible solution could be the implementation of something like SIGDANGER,
which would be sent to an application in case of memory overload, so
it should try to free a bit of memory if it can. Surely applications would have
to be modified to use that information. How about the C library, does it
maintain any big buffers, for I/O or so? I don't know, but changes there could
surely be passed on transparently. Ok, ok, it's the MacOS way of thinking, so
on to the other possibility. These problems are intimately related to memory
overcommitting, or not doing so, so what might be fatal in overcommitting?

One problem arises if an application gets a huge part of overcommitted memory
and then tries to use it, which spontaneously fails - just because the memory
was committed somewhere else, to the 999 other apps which are already
running.

The flaw there is that at some time, you can guarantee that the overcommit
would fail, if the memory was really used. At this point, the application
could be halted (so that it does not get the chance to make use of the
overcommit promise), until some more memory is available again - either by
paging, or by waiting for other jobs to terminate. This could lead to
starvation, but it potentially could let the system survive.

A further idea would be to use overcommitted memory only for buffers and
caches, this was already mentioned before. In any situation "near" an OOM,
further memory pressure should be avoided - for example, by letting malloc()
fail. This might also hurt existing processes, so some heuristics could
decide - a malloc() from a freshly started process should fail regardless
of its size, while older processes might get some more tolerance, because the
system might trust their behaviour a bit more.

So far from me, this was just a collection of some more or less unrelated
thoughts, which I'd like to know a bit more about, or hear from experts why
all of this is b*llshit (or: already done(TM)!)

Greetings,
Andreas

* Re: [PATCH] Prevent OOM from killing init
@ 2001-03-24 10:18 Andries.Brouwer
  0 siblings, 0 replies; 162+ messages in thread
From: Andries.Brouwer @ 2001-03-24 10:18 UTC (permalink / raw)
  To: Andries.Brouwer, paul; +Cc: linux-kernel

    From paul@jakma.org Sat Mar 24 03:00:17 2001

    > No, ulimit does not work. (But it helps a little.)

    no, not perfect, i very much agree. but in daily usage it reduces
    chance of OOM to close to 0.

No. How would you use it? Compute individual limits for
each process? One typically has a few very large processes
that may easily take most of memory, and lots of small processes.
With a low ulimit these large processes do not run.
With a large ulimit it does not help against OOM.
The job of accounting what is available belongs to the system,
not the user.

Note that ulimit does not limit the sum of your processes,
it limits each individual process.
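
To make the per-process nature concrete, here is a small sketch using
the standard getrlimit()/setrlimit() calls with RLIMIT_AS. The wrapper
name cap_address_space() is hypothetical.

```c
#include <sys/resource.h>

/* Lower the calling process's soft address-space limit to `bytes`,
 * clamped to the existing hard limit. Returns 0 on success, -1 on
 * failure. The limit applies to this process alone; children inherit
 * their own copy, and the kernel does not add them together. */
static int cap_address_space(rlim_t bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_AS, &rl) == -1)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && bytes > rl.rlim_max)
        bytes = rl.rlim_max;
    rl.rlim_cur = bytes;
    return setrlimit(RLIMIT_AS, &rl);
}
```

Ten processes that each call cap_address_space() with a 1 GB limit can
still ask for 10 GB between them, which is the gap described above.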

Andries

* Re: [PATCH] Prevent OOM from killing init
@ 2001-03-24 23:41 Benoit Garnier
  2001-03-25  5:45 ` Stephen Satchell
  2001-03-25 14:32 ` Martin Dalecki
  0 siblings, 2 replies; 162+ messages in thread
From: Benoit Garnier @ 2001-03-24 23:41 UTC (permalink / raw)
  To: linux-kernel

Szabolcs Szakacsits wrote:

> But if you start
> to think you get the conclusion that process killing can't be avoided if
> you want the system keep running.

What's the point in keeping the OS running if the applications are silently
killed?

If your box is running for example a mail server, and it appears that
another process is just eating the free memory, do you really want to kill
the mail server, just because it's the main process and consuming more
memory and CPU than others?

Well, fine, your OS is up, but your application is not here anymore.

I just think there's no general solution, users must have the chance to
choose processes not to be killed, or malloc() returning errors.

----
Benoît GARNIER
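
The request for letting users choose processes that must not be killed can be sketched as a victim-selection routine with an exemption list. This is a toy illustration only: `pick_victim`, the "largest process loses" rule, and the process table are assumptions made for the example, not the heuristic the kernel actually uses.

```python
def pick_victim(procs, protected):
    """Pick the process to kill under memory pressure (toy heuristic).

    procs     - dict mapping process name to memory use in MB
    protected - set of names the administrator has marked as never killable
    Returns the largest unprotected process name, or None if every
    process is protected.
    """
    candidates = {name: mb for name, mb in procs.items() if name not in protected}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Hypothetical process table for the mail-server scenario.
procs = {"init": 1, "mailserver": 300, "runaway": 250, "shell": 5}

# Size-only selection kills the main application.
naive = pick_victim(procs, protected=set())

# With an exemption list, the runaway process is chosen instead.
guarded = pick_victim(procs, protected={"init", "mailserver"})

print(naive, guarded)
```

When every process is protected the function returns `None`, which corresponds to the other option mentioned in the message: nothing can be killed, so the allocation itself has to fail.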



[parent not found: <Pine.LNX.4.30.0103251549100.13864-100000@fs131-224.f-secure.com>]

end of thread (latest message ~2001-03-27 15:06 UTC)

Thread overview: 162+ messages
2001-03-21  3:48 Only 10 MB/sec with via 82c686b chipset? SodaPop
2001-03-21 13:48 ` egger
2001-03-22  2:14   ` TimO
2001-03-23  2:38   ` Only 10 MB/sec with via 82c686b - FIXED SodaPop
2001-03-23  9:48     ` Alan Cox
2001-03-23 16:21       ` SodaPop
2001-03-23 17:00       ` [PATCH] Prevent OOM from killing init SodaPop
2001-03-23 18:42         ` Martin Dalecki
2001-03-23 20:25           ` SodaPop
2001-03-23 20:33             ` Martin Dalecki
2001-03-23 19:19         ` Jonathan Morton
2001-03-23  2:59   ` Only 10 MB/sec with via 82c686b - FIXED SodaPop
2001-03-21 14:18 ` Only 10 MB/sec with via 82c686b chipset? Jonathan Morton
2001-03-21 17:34   ` egger
  -- strict thread matches above, loose matches on Subject: below --
2001-03-21 22:54 [PATCH] Prevent OOM from killing init Patrick O'Rourke
2001-03-21 23:11 ` Eli Carter
2001-03-21 23:40   ` Patrick O'Rourke
2001-03-21 23:48 ` Rik van Riel
2001-03-22  8:14   ` Eric W. Biederman
2001-03-22  9:24     ` Rik van Riel
2001-03-22 19:29     ` Philipp Rumpf
2001-03-22 11:47   ` Guest section DW
2001-03-22 15:01     ` Rik van Riel
2001-03-22 19:04       ` Guest section DW
2001-03-22 16:41     ` Eric W. Biederman
2001-03-22 20:28     ` Stephen Clouse
2001-03-22 21:01       ` Ingo Oeser
2001-03-22 21:23       ` Alan Cox
2001-03-22 22:00         ` Guest section DW
2001-03-22 22:12           ` Ed Tomlinson
2001-03-22 22:52           ` Alan Cox
2001-03-22 23:27             ` Guest section DW
2001-03-22 23:37               ` Rik van Riel
2001-03-26 19:04                 ` James Antill
2001-03-26 20:05                   ` Rik van Riel
2001-03-22 23:40               ` Alan Cox
2001-03-23 20:09                 ` Szabolcs Szakacsits
2001-03-23 22:21                   ` Alan Cox
2001-03-23 22:37                     ` Szabolcs Szakacsits
2001-03-23 19:57           ` Szabolcs Szakacsits
2001-03-22 22:10         ` Doug Ledford
2001-03-22 22:53           ` Alan Cox
2001-03-22 23:30             ` Doug Ledford
2001-03-22 23:40               ` Alan Cox
2001-03-22 23:43         ` Stephen Clouse
2001-03-23 19:26         ` Szabolcs Szakacsits
2001-03-23 20:41           ` Paul Jakma
2001-03-23 21:58             ` george anzinger
2001-03-24  5:55               ` Rik van Riel
2001-03-23 22:18             ` Szabolcs Szakacsits
2001-03-24  2:08               ` Paul Jakma
2001-03-23  1:31       ` Michael Peddemors
2001-03-23  7:04         ` Rik van Riel
2001-03-23 11:28           ` Guest section DW
2001-03-23 14:50             ` Eric W. Biederman
2001-03-23 17:21               ` Guest section DW
2001-03-23 20:18                 ` Paul Jakma
2001-03-24 20:19                   ` Jesse Pollard
2001-03-23 23:48                 ` Eric W. Biederman
2001-03-23 21:11             ` José Luis Domingo López
2001-03-27 15:05       ` Anthony de Boer - USEnet
2002-03-23  0:33       ` Martin Dalecki
2001-03-22 23:53         ` Rik van Riel
2002-03-23  1:21           ` Martin Dalecki
2001-03-23  0:20         ` Stephen Clouse
2002-03-23  1:30           ` Martin Dalecki
2001-03-23  1:37             ` Rik van Riel
2001-03-23 10:48               ` Martin Dalecki
2001-03-23 14:56                 ` Rik van Riel
2001-03-23 16:43                   ` Guest section DW
2001-03-24  5:57                     ` Rik van Riel
2001-03-25 16:35                       ` Guest section DW
2001-03-23 20:20                   ` Tom Diehl
2001-03-23 23:56                     ` Tim Wright
2001-03-24  0:21                       ` Tom Diehl
2001-03-23 17:26     ` James A. Sutherland
2001-03-23 17:32       ` Alan Cox
2001-03-23 18:58         ` Martin Dalecki
2001-03-23 19:45         ` Jonathan Morton
2001-03-23 23:26           ` Eric W. Biederman
2001-03-25 15:30         ` Martin Dalecki
2001-03-25 20:47         ` Stephen Satchell
2001-03-24  0:03       ` Guest section DW
2001-03-24  7:52       ` Doug Ledford
2001-03-25  0:32       ` Kurt Garloff
2001-03-25 15:02         ` Sandy Harris
2001-03-25 18:07         ` Guest section DW
2001-03-22 14:53   ` Patrick O'Rourke
2001-03-22 19:24   ` Philipp Rumpf
2001-03-22 22:20   ` James A. Sutherland
2001-03-23 17:31   ` Szabolcs Szakacsits
2001-03-24  5:54     ` Rik van Riel
2001-03-24  6:55       ` Juha Saarinen
2001-03-27  8:31       ` Roger Gammans
2001-03-21 23:41 Leif Sawyer
2001-03-22  0:32 ` Kevin Buhr
2001-03-22 11:08 Heusden, Folkert van
     [not found] <4605B269DB001E4299157DD1569079D2809930@EXCHANGE03.plaza.ds.adp.com>
2001-03-22 16:29 ` Rik van Riel
2001-03-22 18:32   ` Christian Bodmer
2001-03-23 15:08     ` Horst von Brand
2001-03-24  7:48       ` Doug Ledford
2001-03-24 10:21         ` Mike Galbraith
2001-03-24 18:19           ` Doug Ledford
2001-03-24 22:47             ` Mike Galbraith
2001-03-24 23:35             ` Jonathan Morton
2001-03-25 18:35               ` Jonathan Morton
2001-03-26  4:40                 ` Horst von Brand
2001-03-26  8:36                 ` Mike Galbraith
2001-03-26 10:01                 ` Jonathan Morton
2001-03-26 14:48                   ` Rik van Riel
2001-03-25 19:07               ` Mike Galbraith
2001-03-24 20:04           ` Jonathan Morton
2001-03-24 20:59           ` Jonathan Morton
2001-03-24 22:11             ` Rik van Riel
2001-03-24 23:36             ` Jonathan Morton
2001-03-25 14:30             ` Martin Dalecki
2001-03-25 14:13           ` Martin Dalecki
2001-03-24 12:42         ` Jonathan Morton
2001-03-24 15:06           ` Mike Galbraith
2001-03-25 14:10         ` Martin Dalecki
2001-03-22 23:35 Mikael Pettersson
2001-03-22 23:43 ` Alan Cox
2001-03-27  7:58   ` Helge Hafting
2001-03-23  0:09 Mikael Pettersson
2001-03-23  0:27 ` Andrew Morton
2001-03-23 12:29   ` Mikael Pettersson
2001-03-23 16:24 ` Horst von Brand
2001-03-23 16:49   ` Guest section DW
2001-03-23 17:04     ` Alan Cox
2001-03-23  9:28 Heusden, Folkert van
2001-03-23 18:29 Andries.Brouwer
2001-03-23 18:38 ` Alan Cox
2001-03-24  0:46   ` Tim Wright
2001-03-24 16:48   ` Jesse Pollard
2001-03-25 16:12     ` Szabolcs Szakacsits
2001-03-25 16:39     ` Jonathan Morton
2001-03-23 18:43 ` nick
2001-03-23 19:01   ` Martin Dalecki
2001-03-23 19:23     ` nick
2001-03-23 22:12     ` Alan Cox
2001-03-23 23:23       ` Stephen E. Clark
2001-03-24 10:40         ` Gérard Roudier
2001-03-23 21:14 ` Jonathan Morton
2001-03-25 14:56   ` Marco Colombo
2001-03-23 19:33 Stephen Satchell
2001-03-23 23:15 Andries.Brouwer
2001-03-23 23:17 ` Martin Dalecki
2001-03-24  0:13 ` Jonathan Morton
2001-03-24  6:58   ` Rik van Riel
2001-03-24 12:38   ` Jonathan Morton
2001-03-24 13:12   ` Jonathan Morton
2001-03-24  1:59 ` Paul Jakma
2001-03-24  1:11 Andries.Brouwer
2001-03-24  1:38 Jonathan Morton
2001-03-24  2:30 Andreas Franck
2001-03-24 10:18 Andries.Brouwer
2001-03-24 23:41 Benoit Garnier
2001-03-25  5:45 ` Stephen Satchell
2001-03-25  6:58   ` Stephen Clouse
2001-03-25 14:37   ` Martin Dalecki
2001-03-25 14:32 ` Martin Dalecki
     [not found] <Pine.LNX.4.30.0103251549100.13864-100000@fs131-224.f-secure.com>
     [not found] ` <l03130315b6e242006a4b@[192.168.239.101]>
2001-03-25 15:47   ` Jonathan Morton
