* experiences beyond 4 GB RAM with 2.4.22
@ 2003-09-09 9:01 Stephan von Krawczynski
2003-09-09 12:25 ` Andrea Arcangeli
2003-09-12 2:46 ` Neil Brown
0 siblings, 2 replies; 48+ messages in thread
From: Stephan von Krawczynski @ 2003-09-09 9:01 UTC (permalink / raw)
To: linux-kernel; +Cc: Neil Brown, Andrea Arcangeli
Hello,
lately I upgraded my testbox from 2 to 6 GB ram and found out some oddities I
would like to hear your opinions.
The box ran flawlessly and performant with 2 GB - was in fact a real joy.
After upgrading the ram and recompiling kernel 2.4.22 with support for 64 GB I
noticed:
1) nfs clients see timeouts again, like
Sep 9 03:37:35 clienta kernel: nfs: server 192.168.1.1 not responding, still
trying
Sep 9 03:37:35 clienta kernel: nfs: server 192.168.1.1 OK
Sep 9 03:37:35 clienta kernel: nfs: server 192.168.1.1 not responding, still
trying
Sep 9 03:37:35 clienta kernel: nfs: server 192.168.1.1 OK
Sep 9 03:41:13 clienta kernel: nfs: server 192.168.1.1 not responding, still
trying
Sep 9 03:41:13 clienta kernel: nfs: server 192.168.1.1 OK
Both are 2.4.22. 192.168.1.1 is the testbox. I saw those with 2GB, but could
fix it through more nfs-daemons and
echo 2097152 >/proc/sys/net/core/rmem_max
echo 2097152 >/proc/sys/net/core/wmem_max
Are these values too small for 6 GB?
2) Box is very slow, kswapd looks very active during tar of a local harddisk.
Interactivity is really bad. Seems vm has a high time looking for free or
usable pages. Compared to 2 GB the behaviour is unbelievably bad.
3) Network performance has a remarkable dropdown during above tar. In fact
doing simple pings every few minutes shows that quite a lot of them are simply
dropped, never make it over the ethernet.
I am really astonished about this. Can some kind soul give me hints or maybe
patches to try?
Regards,
Stephan
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Andrea Arcangeli @ 2003-09-09 12:25 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: linux-kernel, Neil Brown

On Tue, Sep 09, 2003 at 11:01:12AM +0200, Stephan von Krawczynski wrote:
> Hello,
>
> lately I upgraded my testbox from 2 to 6 GB ram and found out some oddities I
> would like to hear your opinions.
> The box ran flawlessly and performant with 2 GB - was in fact a real joy.
> After upgrading the ram and recompiling kernel 2.4.22 with support for 64 GB I
> noticed:
>
> 1) nfs clients see timeouts again, like
>
> Sep 9 03:37:35 clienta kernel: nfs: server 192.168.1.1 not responding, still
> trying
> Sep 9 03:37:35 clienta kernel: nfs: server 192.168.1.1 OK
> Sep 9 03:37:35 clienta kernel: nfs: server 192.168.1.1 not responding, still
> trying
> Sep 9 03:37:35 clienta kernel: nfs: server 192.168.1.1 OK
> Sep 9 03:41:13 clienta kernel: nfs: server 192.168.1.1 not responding, still
> trying
> Sep 9 03:41:13 clienta kernel: nfs: server 192.168.1.1 OK
>
> Both are 2.4.22. 192.168.1.1 is the testbox. I saw those with 2GB, but could
> fix it through more nfs-daemons and
>
> echo 2097152 >/proc/sys/net/core/rmem_max
> echo 2097152 >/proc/sys/net/core/wmem_max
>
> Are these values too small for 6 GB?
>
> 2) Box is very slow, kswapd looks very active during tar of a local harddisk.
> Interactivity is really bad. Seems vm has a high time looking for free or
> usable pages. Compared to 2 GB the behaviour is unbelievably bad.
>
> 3) Network performance has a remarkable dropdown during above tar. In fact
> doing simple pings every few minutes shows that quite a lot of them are simply
> dropped, never make it over the ethernet.
>
> I am really astonished about this. Can some kind soul give me hints or maybe
> patches to try?

for the vm issues my suggestion is to try again with 2.4.22aa1.

Andrea

/*
 * If you refuse to depend on closed software for a critical
 * part of your business, these links may be useful:
 *
 * rsync.kernel.org::pub/scm/linux/kernel/bkcvs/linux-2.5/
 * rsync.kernel.org::pub/scm/linux/kernel/bkcvs/linux-2.4/
 * http://www.cobite.com/cvsps/
 *
 * svn://svn.kernel.org/linux-2.6/trunk
 * svn://svn.kernel.org/linux-2.4/trunk
 */
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Neil Brown @ 2003-09-12 2:46 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: linux-kernel

On Tuesday September 9, skraw@ithnet.com wrote:
> Hello,
>
> lately I upgraded my testbox from 2 to 6 GB ram and found out some oddities I
> would like to hear your opinions.
> The box ran flawlessly and performant with 2 GB - was in fact a real joy.
> After upgrading the ram and recompiling kernel 2.4.22 with support for 64 GB I
> noticed:
>
> 1) nfs clients see timeouts again, like
>
> Sep 9 03:37:35 clienta kernel: nfs: server 192.168.1.1 not responding, still
> trying
> Sep 9 03:37:35 clienta kernel: nfs: server 192.168.1.1 OK
> Sep 9 03:37:35 clienta kernel: nfs: server 192.168.1.1 not responding, still
> trying
> Sep 9 03:37:35 clienta kernel: nfs: server 192.168.1.1 OK
> Sep 9 03:41:13 clienta kernel: nfs: server 192.168.1.1 not responding, still
> trying
> Sep 9 03:41:13 clienta kernel: nfs: server 192.168.1.1 OK
>
> Both are 2.4.22. 192.168.1.1 is the testbox. I saw those with 2GB, but could
> fix it through more nfs-daemons and
>
> echo 2097152 >/proc/sys/net/core/rmem_max
> echo 2097152 >/proc/sys/net/core/wmem_max
>
> Are these values too small for 6 GB?

No. The values are proportional to the number of server threads, not
the amount of RAM... and they should be un-necessary after 2.4.20
anyway as nfsd in the kernel makes the appropriate settings.

>
> 2) Box is very slow, kswapd looks very active during tar of a local harddisk.
> Interactivity is really bad. Seems vm has a high time looking for free or
> usable pages. Compared to 2 GB the behaviour is unbelievably bad.
>
> 3) Network performance has a remarkable dropdown during above tar. In fact
> doing simple pings every few minutes shows that quite a lot of them are simply
> dropped, never make it over the ethernet.

My only guess is that it is doing a lot of copying into low memory
because your devices can only DMA into/outof low memory.

Have you tried 2.6 ??

How about CONFIG_HIGHMEM4G ?
It won't use all the RAM, but it would be interesting if it were
faster.

NeilBrown
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Stephan von Krawczynski @ 2003-09-12 6:54 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-kernel

On Fri, 12 Sep 2003 12:46:46 +1000
Neil Brown <neilb@cse.unsw.edu.au> wrote:

> > Both are 2.4.22. 192.168.1.1 is the testbox. I saw those with 2GB, but
> > could fix it through more nfs-daemons and
> >
> > echo 2097152 >/proc/sys/net/core/rmem_max
> > echo 2097152 >/proc/sys/net/core/wmem_max
> >
> > Are these values too small for 6 GB?
>
> No. The values are proportional to the number of server threads, not
> the amount of RAM... and they should be un-necessary after 2.4.20
> anyway as nfsd in the kernel makes the appropriate settings.

Oh. That's interesting. Then everything should be the same if I deleted
those...

> > 2) Box is very slow, kswapd looks very active during tar of a local
> > harddisk. Interactivity is really bad. Seems vm has a high time looking for
> > free or usable pages. Compared to 2 GB the behaviour is unbelievably bad.
> >
> > 3) Network performance has a remarkable dropdown during above tar. In fact
> > doing simple pings every few minutes shows that quite a lot of them are
> > simply dropped, never make it over the ethernet.
>
> My only guess is that it is doing a lot of copying into low memory
> because your devices can only DMA into/outof low memory.

I forgot to mention: Both network card and controller are 64 bit cards.
Network card is (vendor 3com):
Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet
(rev 15) (tg3-driver)
Controller is:
RAID bus controller: 3ware Inc 3ware 7000-series ATA-RAID (rev 01)
I have "CONFIG_HIGHIO=y"

> Have you tried 2.6 ??

No, not yet. I have not dared :-)

> How about CONFIG_HIGHMEM4G ?
> It won't use all the RAM, but it would be interesting if it were
> faster.

I already thought about that and tried. In fact it is as fast and fine as 2 GB
setup. It runs really smooth.
The really simple test for the problem is running "updatedb" (find over the
whole filesystem). The box comes to a crawl while this is running, network is
absolutely bad, interactivity is rather dead, very often not even a ssh login
works.

Regards,
Stephan
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Jens Axboe @ 2003-09-12 7:11 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: Neil Brown, linux-kernel

On Fri, Sep 12 2003, Stephan von Krawczynski wrote:
> > My only guess is that it is doing a lot of copying into low memory
> > because your devices can only DMA into/outof low memory.
>
> I forgot to mention: Both network card and controller are 64 bit cards.
> Network card is (vendor 3com):
> Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet
> (rev 15) (tg3-driver)
> Controller is:
> RAID bus controller: 3ware Inc 3ware 7000-series ATA-RAID (rev 01)
> I have "CONFIG_HIGHIO=y"

That is not enough, the 3ware driver only sets a 32-bit IO capability
mask. So you will still be bouncing to/from the upper 2G, Neil's
diagnosis is absolutely right. If you want to verify this fact, boot
with profile=2 and run

readprofile -r; updatedb; readprofile | sort -nr

and you should see the bounce copy functions near the top.

As a paying customer, you should ask 3ware about their hardware. They
might be able to support > 32-bit dma just no one has asked about this
feature yet. A quick peek at their driver shows they define their sg
address element as a 32-bit quantity, so maybe not...

--
Jens Axboe
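[Editor's sketch] The "upper 2G" arithmetic in Jens's reply can be checked with
a small toy model. This is not kernel code: the function names, the 4096-byte
page size and the assumption that RAM is one contiguous range starting at
physical address 0 are illustrative assumptions only. It shows which pages a
device with a 32-bit DMA mask cannot reach directly, and what share of RAM that
is for a 2, 4 and 6 GB box.

```python
# Toy model, not kernel code: a device whose driver advertises a 32-bit DMA
# mask cannot address pages above 4 GB, so I/O to those pages has to be
# copied ("bounced") through low memory. All names here are hypothetical.

GIB = 1 << 30
PAGE_SIZE = 4096             # assumed i386 page size
MASK_32BIT = (1 << 32) - 1   # highest address a 32-bit DMA mask can express


def needs_bounce(phys_addr: int, dma_mask: int) -> bool:
    """True if the last byte of the page at phys_addr lies beyond dma_mask."""
    return phys_addr + PAGE_SIZE - 1 > dma_mask


def bounced_fraction(ram_bytes: int, dma_mask: int) -> float:
    """Fraction of RAM the device cannot reach directly.

    Assumes RAM is one contiguous range starting at physical address 0,
    which real chipsets only approximate.
    """
    reachable = min(ram_bytes, dma_mask + 1)
    return (ram_bytes - reachable) / ram_bytes


if __name__ == "__main__":
    for ram_gib in (2, 4, 6):
        frac = bounced_fraction(ram_gib * GIB, MASK_32BIT)
        print(f"{ram_gib} GB RAM: {frac:.0%} of pages need a bounce copy")
```

Under these assumptions nothing is bounced at 2 GB or 4 GB, while at 6 GB
about a third of all pages sit above the mask, which fits the observation in
this thread that a CONFIG_HIGHMEM4G kernel behaves like the 2 GB setup.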
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Mike Fedyk @ 2003-09-12 7:53 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-kernel

> On Fri, 12 Sep 2003 12:46:46 +1000
> Neil Brown <neilb@cse.unsw.edu.au> wrote:
>
> > > Both are 2.4.22. 192.168.1.1 is the testbox. I saw those with 2GB, but
> > > could fix it through more nfs-daemons and
> > >
> > > echo 2097152 >/proc/sys/net/core/rmem_max
> > > echo 2097152 >/proc/sys/net/core/wmem_max
> > >
> > > Are these values too small for 6 GB?
> >
> > No. The values are proportional to the number of server threads, not
> > the amount of RAM... and they should be un-necessary after 2.4.20
> > anyway as nfsd in the kernel makes the appropriate settings.

So then what do I need to do to get those error messages off of my nfs
clients?

I have seen this for a long time through 2.4 and 2.5 (didn't use nfs
with 2.2...).
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Marcelo Tosatti @ 2003-09-15 22:01 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: Neil Brown, linux-kernel

On Fri, 12 Sep 2003, Stephan von Krawczynski wrote:

> On Fri, 12 Sep 2003 12:46:46 +1000
> Neil Brown <neilb@cse.unsw.edu.au> wrote:
>
> > > Both are 2.4.22. 192.168.1.1 is the testbox. I saw those with 2GB, but
> > > could fix it through more nfs-daemons and
> > >
> > > echo 2097152 >/proc/sys/net/core/rmem_max
> > > echo 2097152 >/proc/sys/net/core/wmem_max
> > >
> > > Are these values too small for 6 GB?
> >
> > No. The values are proportional to the number of server threads, not
> > the amount of RAM... and they should be un-necessary after 2.4.20
> > anyway as nfsd in the kernel makes the appropriate settings.
>
> Oh. That's interesting. Then everything should be the same if I deleted
> those...
>
> > > 2) Box is very slow, kswapd looks very active during tar of a local
> > > harddisk. Interactivity is really bad. Seems vm has a high time looking for
> > > free or usable pages. Compared to 2 GB the behaviour is unbelievably bad.
> > >
> > > 3) Network performance has a remarkable dropdown during above tar. In fact
> > > doing simple pings every few minutes shows that quite a lot of them are
> > > simply dropped, never make it over the ethernet.
> >
> > My only guess is that it is doing a lot of copying into low memory
> > because your devices can only DMA into/outof low memory.
>
> I forgot to mention: Both network card and controller are 64 bit cards.
> Network card is (vendor 3com):
> Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet
> (rev 15) (tg3-driver)
> Controller is:
> RAID bus controller: 3ware Inc 3ware 7000-series ATA-RAID (rev 01)
> I have "CONFIG_HIGHIO=y"
>
> > Have you tried 2.6 ??
>
> No, not yet. I have not dared :-)
>
> > How about CONFIG_HIGHMEM4G ?
> > It won't use all the RAM, but it would be interesting if it were
> > faster.
>
> I already thought about that and tried. In fact it is as fast and fine as 2 GB
> setup. It runs really smooth.
> The really simple test for the problem is running "updatedb" (find over the
> whole filesystem). The box comes to a crawl while this is running, network is
> absolutely bad, interactivity is rather dead, very often not even a ssh login
> works.

Does -pre4 (with the VM changes from Andrea) show any difference? There
are significant changes in the per-zone decisions which might help.

Have you tried 2.4.22-aa?

Thanks
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Stephan von Krawczynski @ 2003-09-16 8:21 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: neilb, linux-kernel

On Mon, 15 Sep 2003 19:01:42 -0300 (BRT)
Marcelo Tosatti <marcelo.tosatti@cyclades.com.br> wrote:

> > I already thought about that and tried. In fact it is as fast and fine as 2
> > GB setup. It runs really smooth.
> > The really simple test for the problem is running "updatedb" (find over the
> > whole filesystem). The box comes to a crawl while this is running, network
> > is absolutely bad, interactivity is rather dead, very often not even a ssh
> > login works.
>
> Does -pre4 (with the VM changes from Andrea) show any difference? There
> are significant changes in the per-zone decisions which might help.

Hello Marcelo,

it looks like -pre4 performs not well even in 4 GB environment. After few
days of running I find hanging 2.4.22 nfs-clients on a 2.4.23-pre4 server.

On the client I get a bunch of those:

Sep 16 03:02:00 brenda kernel: nfs: server 192.168.1.1 OK
Sep 16 03:02:32 brenda kernel: nfs: server 192.168.1.1 not responding, still trying
Sep 16 03:02:32 brenda kernel: nfs: server 192.168.1.1 OK
Sep 16 03:02:37 brenda kernel: nfs: server 192.168.1.1 not responding, still trying
Sep 16 03:02:37 brenda last message repeated 3 times
Sep 16 03:02:37 brenda kernel: nfs: server 192.168.1.1 OK
Sep 16 03:02:37 brenda last message repeated 3 times
Sep 16 03:02:38 brenda kernel: nfs: server 192.168.1.1 not responding, still trying
Sep 16 03:02:38 brenda last message repeated 6 times
Sep 16 03:02:38 brenda kernel: nfs: server 192.168.1.1 OK
Sep 16 03:02:38 brenda last message repeated 6 times
Sep 16 03:02:41 brenda kernel: nfs: server 192.168.1.1 not responding, still trying
Sep 16 03:02:41 brenda kernel: nfs: server 192.168.1.1 OK
Sep 16 03:02:41 brenda kernel: nfs: server 192.168.1.1 not responding, still trying
Sep 16 03:02:41 brenda kernel: nfs: server 192.168.1.1 OK
Sep 16 03:02:42 brenda kernel: nfs: server 192.168.1.1 not responding, still trying
Sep 16 03:02:42 brenda last message repeated 2 times
Sep 16 03:02:42 brenda kernel: nfs: server 192.168.1.1 OK
Sep 16 03:02:42 brenda last message repeated 2 times
Sep 16 03:02:43 brenda kernel: nfs: server 192.168.1.1 not responding, still trying
Sep 16 03:02:43 brenda last message repeated 8 times
Sep 16 03:02:43 brenda kernel: nfs: server 192.168.1.1 OK
Sep 16 03:03:08 brenda last message repeated 7 times
Sep 16 03:03:09 brenda kernel: nfs: server 192.168.1.1 not responding, still trying
Sep 16 03:03:09 brenda last message repeated 2 times
Sep 16 03:03:09 brenda kernel: nfs: server 192.168.1.1 OK
Sep 16 03:03:10 brenda last message repeated 2 times

And then the nfs-action is dead. Process hangs. This has not happened with
2.4.22 as a server. It showed up after a day of creating 4,7 GB dvd iso images.
Creating these isos was ok, no error or so during the action.
Is it possible that pre4 does not recover all that well from former memory
pressure?

> Have you tried 2.4.22-aa?

Sorry, not yet.
I will go back to 2.4.22 and stress it to see if these effects show up.

Regards,
Stephan
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Stephan von Krawczynski @ 2003-09-16 12:05 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: marcelo.tosatti, neilb, linux-kernel

On Tue, 16 Sep 2003 10:21:13 +0200
Stephan von Krawczynski <skraw@ithnet.com> wrote:

> On Mon, 15 Sep 2003 19:01:42 -0300 (BRT)
> Marcelo Tosatti <marcelo.tosatti@cyclades.com.br> wrote:
>
> > > I already thought about that and tried. In fact it is as fast and fine as
> > > 2 GB setup. It runs really smooth.
> > > The really simple test for the problem is running "updatedb" (find over
> > > the whole filesystem). The box comes to a crawl while this is running,
> > > network is absolutely bad, interactivity is rather dead, very often not
> > > even a ssh login works.
> >
> > Does -pre4 (with the VM changes from Andrea) show any difference? There
> > are significant changes in the per-zone decisions which might help.
>
> Hello Marcelo,
>
> it looks like -pre4 performs not well even in 4 GB environment. After few
> days of running I find hanging 2.4.22 nfs-clients on a 2.4.23-pre4 server.
>
> On the client I get a bunch of those:
>
> Sep 16 03:02:00 brenda kernel: nfs: server 192.168.1.1 OK
> [...]

Hello again,

you will love to hear that you can drop the above statement completely. After
digging deeper into the case I found out that it was caused by a dead switch.
So this is no -pre4 problem, but a hardware issue. The switch corrupted about
every 20th packet.
So I can tell nothing negative so far about -pre4 with 4 GB. I'll try with 6 GB
now.

Regards,
Stephan
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Marcelo Tosatti @ 2003-09-16 13:11 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: Marcelo Tosatti, neilb, linux-kernel

On Tue, 16 Sep 2003, Stephan von Krawczynski wrote:

> Sep 16 03:03:09 brenda last message repeated 2 times
> Sep 16 03:03:09 brenda kernel: nfs: server 192.168.1.1 OK
> Sep 16 03:03:10 brenda last message repeated 2 times
>
> And then the nfs-action is dead. Process hangs. This has not happened with
> 2.4.22 as a server. It showed up after a day of creating 4,7 GB dvd iso images.
> Creating these isos was ok, no error or so during the action.
> Is it possible that pre4 does not recover all that well from former memory
> pressure?
>
> > Have you tried 2.4.22-aa?
>
> Sorry, not yet.
> I will go back to 2.4.22 and stress it to see if these effects show up.

Oh... Jens just pointed bounce buffering is needed for the upper 2Gs.

Maybe you have a SCSI card+disks to test ? 8)
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Stephan von Krawczynski @ 2003-09-16 13:36 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: neilb, linux-kernel

On Tue, 16 Sep 2003 10:11:49 -0300 (BRT)
Marcelo Tosatti <marcelo.tosatti@cyclades.com.br> wrote:

> Oh... Jens just pointed bounce buffering is needed for the upper 2Gs.
>
> Maybe you have a SCSI card+disks to test ? 8)

Well, I do understand the bounce buffer problem, but honestly the current way
of handling the situation seems questionable at least. If you ever tried such a
system you notice it is a lot worse than just dumping the additional ram above
4GB. You can really watch your network connections go bogus which is just
unacceptable. Is there any thinkable way to omit the bounce buffers and still
do something useful with the beyond-4GB ram parts?
We should not leave the current bad situation as is...

Regards,
Stephan
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Richard B. Johnson @ 2003-09-16 13:55 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: Marcelo Tosatti, neilb, linux-kernel

On Tue, 16 Sep 2003, Stephan von Krawczynski wrote:

> On Tue, 16 Sep 2003 10:11:49 -0300 (BRT)
> Marcelo Tosatti <marcelo.tosatti@cyclades.com.br> wrote:
>
> > Oh... Jens just pointed bounce buffering is needed for the upper 2Gs.
> >
> > Maybe you have a SCSI card+disks to test ? 8)
>
> Well, I do understand the bounce buffer problem, but honestly the current way
> of handling the situation seems questionable at least. If you ever tried such a
> system you notice it is a lot worse than just dumping the additional ram above
> 4GB. You can really watch your network connections go bogus which is just
> unacceptable. Is there any thinkable way to omit the bounce buffers and still
> do something useful with the beyond-4GB ram parts?
> We should not leave the current bad situation as is...
>
> Regards,
> Stephan

Can you explain what you mean by "network connections go bogus".

Whether or not you have more than 4GB of RAM and therefore have
to page it into lower virtual addresses has nothing at all to
do with networking unless you have a network device driver that
did not allocate memory properly. If so, that should be checked out.

Since there is only 32 bits of address space in the Intel machines,
the virtual memory seen by a process can't exceed that, including
the kernel itself. However, the physical memory can come from
anywhere and that's what the extended-memory specification
attempts to provide. If something is hurting the network,
it shouldn't be so.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.22 on an i686 machine (794.73 BogoMips).
Note 96.31% of all statistics are fiction.
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Stephan von Krawczynski @ 2003-09-16 14:13 UTC (permalink / raw)
To: root; +Cc: marcelo.tosatti, neilb, linux-kernel

On Tue, 16 Sep 2003 09:55:36 -0400 (EDT)
"Richard B. Johnson" <root@chaos.analogic.com> wrote:

> Can you explain what you mean by "network connections go bogus".

Sure. Do this:
Use a controller (like 3ware) that cannot DMA beyond a 4 GB range. Now put some
GBs of data onto that and start to tar the data around on the same disk. While
doing this you can watch network go crazy and drop packets at will. You can of
course force the whole setup further by (trying) additional nfs-action, but
this is really not needed. Just about the same thing can be experienced if you
do a simple find all over the disk. CPU load explodes and network is _dead_.
To try this you need some GBs of data, a box with more than 4 GB ram, 3ware
controller and a usual SuSE 8.2. Wait until after midnight for "updatedb" to
run and try to login during that time ;-)
"Works" with out-of-the-box equipment and distro :-)

Regards,
Stephan
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Marcelo Tosatti @ 2003-09-16 14:33 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: Marcelo Tosatti, neilb, linux-kernel

On Tue, 16 Sep 2003, Stephan von Krawczynski wrote:

> On Tue, 16 Sep 2003 10:11:49 -0300 (BRT)
> Marcelo Tosatti <marcelo.tosatti@cyclades.com.br> wrote:
>
> > Oh... Jens just pointed bounce buffering is needed for the upper 2Gs.
> >
> > Maybe you have a SCSI card+disks to test ? 8)
>
> Well, I do understand the bounce buffer problem, but honestly the current way
> of handling the situation seems questionable at least. If you ever tried such a
> system you notice it is a lot worse than just dumping the additional ram above
> 4GB. You can really watch your network connections go bogus which is just
> unacceptable.

All is fine with 4GB?

> Is there any thinkable way to omit the bounce buffers and still
> do something useful with the beyond-4GB ram parts?
> We should not leave the current bad situation as is...

No way to omit the bounce buffers.
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Stephan von Krawczynski @ 2003-09-16 14:36 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: neilb, linux-kernel

On Tue, 16 Sep 2003 11:33:27 -0300 (BRT)
Marcelo Tosatti <marcelo.tosatti@cyclades.com.br> wrote:

> > Well, I do understand the bounce buffer problem, but honestly the current way
> > of handling the situation seems questionable at least. If you ever tried such a
> > system you notice it is a lot worse than just dumping the additional ram above
> > 4GB. You can really watch your network connections go bogus which is just
> > unacceptable.
>
> All is fine with 4GB?

Absolutely perfect.

Regards,
Stephan
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Alan Cox @ 2003-09-16 14:36 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: Marcelo Tosatti, neilb, Linux Kernel Mailing List

On Maw, 2003-09-16 at 14:36, Stephan von Krawczynski wrote:
> Well, I do understand the bounce buffer problem, but honestly the current way
> of handling the situation seems questionable at least. If you ever tried such a
> system you notice it is a lot worse than just dumping the additional ram above
> 4GB. You can really watch your network connections go bogus which is just
> unacceptable. Is there any thinkable way to omit the bounce buffers and still
> do something useful with the beyond-4GB ram parts?

The 2.6 tree is somewhat better about this but at the end of the day if
your I/O subsystem can't do the job your box will not perform ideally.

For some workloads it's a huge win to have the extra RAM, for others the
I/O is a real pain. Also in some cases it might be interesting to try
using the extra RAM above the 4G boundary as a giant ram disk and using
it as first swap device. I don't know anyone who explored that however
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Stephan von Krawczynski @ 2003-09-16 15:20 UTC (permalink / raw)
To: Alan Cox; +Cc: marcelo.tosatti, neilb, linux-kernel

On Tue, 16 Sep 2003 15:36:14 +0100
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> On Maw, 2003-09-16 at 14:36, Stephan von Krawczynski wrote:
> > Well, I do understand the bounce buffer problem, but honestly the current
> > way of handling the situation seems questionable at least. If you ever
> > tried such a system you notice it is a lot worse than just dumping the
> > additional ram above 4GB. You can really watch your network connections go
> > bogus which is just unacceptable. Is there any thinkable way to omit the
> > bounce buffers and still do something useful with the beyond-4GB ram parts?
>
> The 2.6 tree is somewhat better about this but at the end of the day if
> your I/O subsystem can't do the job your box will not perform ideally.

Hm, "not ideally" is a real friendly word for describing the mess ;-)
Isn't there a possibility to flag this part of the memory as nonDMA-able, kind
of "do whatever you want with it, but don't expect any dma-driven i/o"... I
know this gets a problem when swap jumps in, though. But really it is far
better for the box to flag it more or less unusable compared to a DoS done by
user-space "find" ...
I know this is a real corner case of life. It looks more like taking a
different decision than current to improve the situation and not so much a real
development topic.
Probably a note in kernel docs reading "DON'T DO THIS" is either sufficient...

Regards,
Stephan
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Alan Cox @ 2003-09-16 15:29 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: Marcelo Tosatti, neilb, Linux Kernel Mailing List

The kernel has no idea what you will do with given ram. It does try to
make some guesses but you are basically trying to paper over hardware
limits.
* Re: experiences beyond 4 GB RAM with 2.4.22
From: Timothy Miller @ 2003-09-16 15:49 UTC (permalink / raw)
To: Alan Cox
Cc: Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List

Alan Cox wrote:
> The kernel has no idea what you will do with given ram. It does try to
> make some guesses but you are basically trying to paper over hardware
> limits.

Maybe not what you WILL DO, but what you HAVE DONE. Those tasks which
have done the most DMA-requiring I/O could get preference for low
memory. No?

Yeah, I know.. show you the code. :)
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-16 15:49 ` Timothy Miller
@ 2003-09-16 16:17 ` Stephan von Krawczynski
0 siblings, 0 replies; 48+ messages in thread
From: Stephan von Krawczynski @ 2003-09-16 16:17 UTC (permalink / raw)
To: Timothy Miller; +Cc: alan, marcelo.tosatti, neilb, linux-kernel
On Tue, 16 Sep 2003 11:49:10 -0400
Timothy Miller <miller@techsource.com> wrote:
> Alan Cox wrote:
> > The kernel has no idea what you will do with given ram. It does try to
> > make some guesses but you are basically trying to paper over hardware
> > limits.
>
>
> Maybe not what you WILL DO, but what you HAVE DONE. Those tasks which
> have done the most DMA-requiring I/O could get preference for low
> memory. No?
>
> Yeah, I know.. show you the code. :)
That really sounds complex and therefore I would not try that in the first
place. And it does not sound like a solution to my special problem of few tasks
operating on a lot of data (a lot more than fits into physical mem).
For this type of situation it would be really intelligent not to give away
non-dma-able memory. How can the kernel do that? At least it knows that there
are I/O devices that cannot cope with the mem it is presenting. This sounds in
fact simple. The problem is: what _can_ be done with this type of mem at all?
It does not even help a lot if there are other devices that can handle it,
because you still must face the fact of a simple file-copy from such a capable
device to one that is not.
No wonder that the gurus did not have any good idea yet ;-)
Regards,
Stephan
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-16 15:29 ` Alan Cox
2003-09-16 15:49 ` Timothy Miller
@ 2003-09-16 19:58 ` Olivier Galibert
2003-09-17 15:10 ` Alan Cox
1 sibling, 1 reply; 48+ messages in thread
From: Olivier Galibert @ 2003-09-16 19:58 UTC (permalink / raw)
To: Alan Cox
Cc: Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List
On Tue, Sep 16, 2003 at 04:29:02PM +0100, Alan Cox wrote:
> The kernel has no idea what you will do with given ram. It does try to
> make some guesses but you are basically trying to paper over hardware
> limits.
Is there a way to specifically turn that ram into a tmpfs though?
OG.
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-16 19:58 ` Olivier Galibert
@ 2003-09-17 15:10 ` Alan Cox
2003-09-17 19:19 ` Jens Axboe
0 siblings, 1 reply; 48+ messages in thread
From: Alan Cox @ 2003-09-17 15:10 UTC (permalink / raw)
To: Olivier Galibert
Cc: Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List
On Maw, 2003-09-16 at 20:58, Olivier Galibert wrote:
> On Tue, Sep 16, 2003 at 04:29:02PM +0100, Alan Cox wrote:
> > The kernel has no idea what you will do with given ram. It does try to
> > make some guesses but you are basically trying to paper over hardware
> > limits.
>
> Is there a way to specifically turn that ram into a tmpfs though?
Something like z2ram copied and hacked a little to kmap the blocks it
wants would give you a block device you could use for swap or for /tmp.
I'm not sure tmpfs would work here.
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-17 15:10 ` Alan Cox
@ 2003-09-17 19:19 ` Jens Axboe
2003-09-17 19:30 ` Marcelo Tosatti
0 siblings, 1 reply; 48+ messages in thread
From: Jens Axboe @ 2003-09-17 19:19 UTC (permalink / raw)
To: Alan Cox
Cc: Olivier Galibert, Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List
On Wed, Sep 17 2003, Alan Cox wrote:
> On Maw, 2003-09-16 at 20:58, Olivier Galibert wrote:
> > On Tue, Sep 16, 2003 at 04:29:02PM +0100, Alan Cox wrote:
> > > The kernel has no idea what you will do with given ram. It does try to
> > > make some guesses but you are basically trying to paper over hardware
> > > limits.
> >
> > Is there a way to specifically turn that ram into a tmpfs though?
>
>
> Something like z2ram copied and hacked a little to kmap the blocks it
> wants would give you a block device you could use for swap or for /tmp.
> I'm not sure tmpfs would work here.
Additionally, you need GFP_DMA32 or similar. Would also alleviate the
nasty pressure on ZONE_NORMAL which is often quite stressed.
-- 
Jens Axboe
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-17 19:19 ` Jens Axboe
@ 2003-09-17 19:30 ` Marcelo Tosatti
2003-09-17 22:18 ` Stephan von Krawczynski
2003-09-18 7:08 ` Jens Axboe
0 siblings, 2 replies; 48+ messages in thread
From: Marcelo Tosatti @ 2003-09-17 19:30 UTC (permalink / raw)
To: Jens Axboe
Cc: Alan Cox, Olivier Galibert, Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List
On Wed, 17 Sep 2003, Jens Axboe wrote:
> On Wed, Sep 17 2003, Alan Cox wrote:
> > On Maw, 2003-09-16 at 20:58, Olivier Galibert wrote:
> > > On Tue, Sep 16, 2003 at 04:29:02PM +0100, Alan Cox wrote:
> > > > The kernel has no idea what you will do with given ram. It does try to
> > > > make some guesses but you are basically trying to paper over hardware
> > > > limits.
> > >
> > > Is there a way to specifically turn that ram into a tmpfs though?
> >
> >
> > Something like z2ram copied and hacked a little to kmap the blocks it
> > wants would give you a block device you could use for swap or for /tmp.
> > I'm not sure tmpfs would work here.
>
> Additionally, you need GFP_DMA32 or similar. Would also alleviate the
> nasty pressure on ZONE_NORMAL which is often quite stressed.
IMO such a GFP_DMA32 flag is a bit intrusive for 2.4, isn't it?
What has been done in 2.6 with respect to the excessive normal zone
pressure and bounce buffering problems?
Thanks
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-17 19:30 ` Marcelo Tosatti
@ 2003-09-17 22:18 ` Stephan von Krawczynski
2003-09-18 7:08 ` Jens Axboe
1 sibling, 0 replies; 48+ messages in thread
From: Stephan von Krawczynski @ 2003-09-17 22:18 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: axboe, alan, galibert, marcelo.tosatti, neilb, linux-kernel
On Wed, 17 Sep 2003 16:30:45 -0300 (BRT)
Marcelo Tosatti <marcelo.tosatti@cyclades.com.br> wrote:
>
>
> On Wed, 17 Sep 2003, Jens Axboe wrote:
>
> > On Wed, Sep 17 2003, Alan Cox wrote:
> > > On Maw, 2003-09-16 at 20:58, Olivier Galibert wrote:
> > > > On Tue, Sep 16, 2003 at 04:29:02PM +0100, Alan Cox wrote:
> > > > > The kernel has no idea what you will do with given ram. It does try
> > > > > to make some guesses but you are basically trying to paper over
> > > > > hardware limits.
> > > >
> > > > Is there a way to specifically turn that ram into a tmpfs though?
> > >
> > >
> > > Something like z2ram copied and hacked a little to kmap the blocks it
> > > wants would give you a block device you could use for swap or for /tmp.
> > > I'm not sure tmpfs would work here.
> >
> > Additionally, you need GFP_DMA32 or similar. Would also alleviate the
> > nasty pressure on ZONE_NORMAL which is often quite stressed.
>
> IMO such a GFP_DMA32 flag is a bit intrusive for 2.4, isn't it?
>
> What has been done in 2.6 with respect to the excessive normal zone
> pressure and bounce buffering problems?
>
> Thanks
Before running too far in this direction I would suggest solving Olivier's
problem with the aic driver. I really would like to know if he sees the same
positive effects in pre4 as I do. It seems Andrea's vm patches have a very
positive influence on the issue. At least I cannot see the "crawling effect" up
to now with 6GB and pre4 compared to 2.4.22. It would surely be of interest
whether Olivier's 8 GB variant improves, too.
It may well be that the bouncing is not that bad compared to other corner
effects of the vm in this special situation. Interactivity during load, and
especially network, seems far better in pre4. For sure it is not as speedy as a
4GB setup, but it works pretty well (up to now).
Regards,
Stephan
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-17 19:30 ` Marcelo Tosatti
2003-09-17 22:18 ` Stephan von Krawczynski
@ 2003-09-18 7:08 ` Jens Axboe
2003-09-18 7:12 ` Jens Axboe
2003-09-18 15:05 ` William Lee Irwin III
1 sibling, 2 replies; 48+ messages in thread
From: Jens Axboe @ 2003-09-18 7:08 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Alan Cox, Olivier Galibert, Stephan von Krawczynski, neilb, Linux Kernel Mailing List
On Wed, Sep 17 2003, Marcelo Tosatti wrote:
>
>
> On Wed, 17 Sep 2003, Jens Axboe wrote:
>
> > On Wed, Sep 17 2003, Alan Cox wrote:
> > > On Maw, 2003-09-16 at 20:58, Olivier Galibert wrote:
> > > > On Tue, Sep 16, 2003 at 04:29:02PM +0100, Alan Cox wrote:
> > > > > The kernel has no idea what you will do with given ram. It does try to
> > > > > make some guesses but you are basically trying to paper over hardware
> > > > > limits.
> > > >
> > > > Is there a way to specifically turn that ram into a tmpfs though?
> > >
> > >
> > > Something like z2ram copied and hacked a little to kmap the blocks it
> > > wants would give you a block device you could use for swap or for /tmp.
> > > I'm not sure tmpfs would work here.
> >
> > Additionally, you need GFP_DMA32 or similar. Would also alleviate the
> > nasty pressure on ZONE_NORMAL which is often quite stressed.
>
> IMO such a GFP_DMA32 flag is a bit intrusive for 2.4, isn't it?
Not really, it's just an extra zone. Maybe I can dig such a patch up, I
had one for 2.4.2-pre something...
> What has been done in 2.6 with respect to the excessive normal zone
> pressure and bounce buffering problems?
Nothing, afaic. 2.6 isn't even completely deadlock free when it comes to
bounce buffering.
-- 
Jens Axboe
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-18 7:08 ` Jens Axboe
@ 2003-09-18 7:12 ` Jens Axboe
2003-09-18 11:22 ` Stephan von Krawczynski
2003-09-18 15:05 ` William Lee Irwin III
1 sibling, 1 reply; 48+ messages in thread
From: Jens Axboe @ 2003-09-18 7:12 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Alan Cox, Olivier Galibert, Stephan von Krawczynski, neilb, Linux Kernel Mailing List
On Thu, Sep 18 2003, Jens Axboe wrote:
> On Wed, Sep 17 2003, Marcelo Tosatti wrote:
> >
> >
> > On Wed, 17 Sep 2003, Jens Axboe wrote:
> >
> > > On Wed, Sep 17 2003, Alan Cox wrote:
> > > > On Maw, 2003-09-16 at 20:58, Olivier Galibert wrote:
> > > > > On Tue, Sep 16, 2003 at 04:29:02PM +0100, Alan Cox wrote:
> > > > > > The kernel has no idea what you will do with given ram. It does try to
> > > > > > make some guesses but you are basically trying to paper over hardware
> > > > > > limits.
> > > > >
> > > > > Is there a way to specifically turn that ram into a tmpfs though?
> > > >
> > > >
> > > > Something like z2ram copied and hacked a little to kmap the blocks it
> > > > wants would give you a block device you could use for swap or for /tmp.
> > > > I'm not sure tmpfs would work here.
> > >
> > > Additionally, you need GFP_DMA32 or similar. Would also alleviate the
> > > nasty pressure on ZONE_NORMAL which is often quite stressed.
> >
> > IMO such a GFP_DMA32 flag is a bit intrusive for 2.4, isn't it?
>
> Not really, it's just an extra zone. Maybe I can dig such a patch up, I
> had one for 2.4.2-pre something...
This is the latest I had, for 2.4.5. Pretty simple and nonintrusive at
that time.
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.5/arch/i386/mm/init.c linux/arch/i386/mm/init.c
--- /opt/kernel/linux-2.4.5/arch/i386/mm/init.c	Sat Apr 21 01:15:20 2001
+++ linux/arch/i386/mm/init.c	Sun May 27 17:50:26 2001
@@ -25,6 +25,7 @@
 #include <linux/highmem.h>
 #include <linux/pagemap.h>
 #include <linux/bootmem.h>
+#include <linux/pci.h>
 
 #include <asm/processor.h>
 #include <asm/system.h>
@@ -348,12 +349,15 @@
 	kmap_init();
 #endif
 	{
-		unsigned long zones_size[MAX_NR_ZONES] = {0, 0, 0};
-		unsigned int max_dma, high, low;
+		unsigned long zones_size[MAX_NR_ZONES] = {0, 0, 0, 0};
+		unsigned int max_dma, max_dma32, high, low, high32;
 
 		max_dma = virt_to_phys((char *)MAX_DMA_ADDRESS) >> PAGE_SHIFT;
+		max_dma32 = PCI_MAX_DMA32 >> PAGE_SHIFT;
 		low = max_low_pfn;
-		high = highend_pfn;
+		high32 = high = highend_pfn;
+		if (high32 > max_dma32)
+			high32 = max_dma32 + 1; /* first map in HIGHMEM */
 
 		if (low < max_dma)
 			zones_size[ZONE_DMA] = low;
@@ -361,12 +365,12 @@
 			zones_size[ZONE_DMA] = max_dma;
 			zones_size[ZONE_NORMAL] = low - max_dma;
 #ifdef CONFIG_HIGHMEM
-			zones_size[ZONE_HIGHMEM] = high - low;
+			zones_size[ZONE_DMA32] = high32 - low;
+			zones_size[ZONE_HIGHMEM] = high - high32;
 #endif
 		}
 		free_area_init(zones_size);
 	}
-	return;
 }
 
 /*
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.5/include/linux/mm.h linux/include/linux/mm.h
--- /opt/kernel/linux-2.4.5/include/linux/mm.h	Sat May 26 13:30:50 2001
+++ linux/include/linux/mm.h	Tue May 29 15:46:02 2001
@@ -476,8 +476,10 @@
 #define __GFP_IO	0x04
 #define __GFP_DMA	0x08
 #ifdef CONFIG_HIGHMEM
-#define __GFP_HIGHMEM	0x10
+#define __GFP_DMA32	0x10
+#define __GFP_HIGHMEM	0x20
 #else
+#define __GFP_DMA32	0x0 /* noop */
 #define __GFP_HIGHMEM	0x0 /* noop */
 #endif
 
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.5/include/linux/mmzone.h linux/include/linux/mmzone.h
--- /opt/kernel/linux-2.4.5/include/linux/mmzone.h	Sat May 26 13:30:50 2001
+++ linux/include/linux/mmzone.h	Sun May 27 18:26:59 2001
@@ -27,7 +27,8 @@
  *
  * ZONE_DMA	  < 16 MB	ISA DMA capable memory
  * ZONE_NORMAL	16-896 MB	direct mapped by the kernel
- * ZONE_HIGHMEM	 > 896 MB	only page cache and user processes
+ * ZONE_DMA32	 > 896MB < 4GB	For 32-bit DMA
+ * ZONE_HIGHMEM	 > 4GB	only page cache and user processes
  */
 typedef struct zone_struct {
 	/*
@@ -62,8 +63,9 @@
 
 #define ZONE_DMA		0
 #define ZONE_NORMAL		1
-#define ZONE_HIGHMEM		2
-#define MAX_NR_ZONES		3
+#define ZONE_DMA32		2
+#define ZONE_HIGHMEM		3
+#define MAX_NR_ZONES		4
 
 /*
  * One allocation request operates on a zonelist. A zonelist
@@ -81,7 +83,7 @@
 	int gfp_mask;
 } zonelist_t;
 
-#define NR_GFPINDEX		0x20
+#define NR_GFPINDEX		0x40
 
 /*
  * The pg_data_t structure is used in machines with CONFIG_DISCONTIGMEM
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.5/mm/page_alloc.c linux/mm/page_alloc.c
--- /opt/kernel/linux-2.4.5/mm/page_alloc.c	Sat May 26 13:30:50 2001
+++ linux/mm/page_alloc.c	Sun May 27 23:47:22 2001
@@ -598,6 +598,7 @@
 
 	while (pgdat) {
 		pages += pgdat->node_zones[ZONE_HIGHMEM].free_pages;
+		pages += pgdat->node_zones[ZONE_DMA32].free_pages;
 		pgdat = pgdat->node_next;
 	}
 	return pages;
@@ -683,6 +684,8 @@
 			k = ZONE_NORMAL;
 			if (i & __GFP_HIGHMEM)
 				k = ZONE_HIGHMEM;
+			if (i & __GFP_DMA32)
+				k = ZONE_DMA32;
 			if (i & __GFP_DMA)
 				k = ZONE_DMA;
 
@@ -700,6 +703,14 @@
 #endif
 					zonelist->zones[j++] = zone;
 				}
+			case ZONE_DMA32:
+				zone = pgdat->node_zones + ZONE_DMA32;
+				if (zone->size) {
+#ifndef CONFIG_HIGHMEM
+					BUG();
+#endif
+					zonelist->zones[j++] = zone;
+				}
 			case ZONE_NORMAL:
 				zone = pgdat->node_zones + ZONE_NORMAL;
 				if (zone->size)
@@ -833,8 +844,11 @@
 		for (i = 0; i < size; i++) {
 			struct page *page = mem_map + offset + i;
 			page->zone = zone;
-			if (j != ZONE_HIGHMEM)
+			if (j != ZONE_HIGHMEM && j != ZONE_DMA32) {
 				page->virtual = __va(zone_start_paddr);
+			} else
+				page->virtual = NULL;
+
 			zone_start_paddr += PAGE_SIZE;
 		}
-- 
Jens Axboe
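The page_alloc.c hunks in the patch above decide which zones an allocation may fall back to. As a rough user-space sketch of that logic (not kernel code), the following reuses the zone numbers and flag values from the patch; the function names first_zone() and build_fallback() and the zone_pages[] array are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Zone indices as renumbered by the patch (ZONE_DMA32 sits below HIGHMEM). */
enum { ZONE_DMA, ZONE_NORMAL, ZONE_DMA32, ZONE_HIGHMEM, MAX_NR_ZONES };

/* Zone modifier bits as the patch defines them when CONFIG_HIGHMEM is set. */
#define GFP_DMA     0x08u
#define GFP_DMA32   0x10u
#define GFP_HIGHMEM 0x20u

/* Hypothetical helper: pick the first zone a request may use. The tests
 * run in the same order as in the patch, so a later match overrides an
 * earlier one and GFP_DMA always wins. */
static int first_zone(unsigned int gfp_mask)
{
	int k = ZONE_NORMAL;

	if (gfp_mask & GFP_HIGHMEM)
		k = ZONE_HIGHMEM;
	if (gfp_mask & GFP_DMA32)
		k = ZONE_DMA32;
	if (gfp_mask & GFP_DMA)
		k = ZONE_DMA;
	return k;
}

/* Hypothetical helper: write the fallback order into out[] and return how
 * many zones were listed. Zones with 0 pages are skipped. */
static size_t build_fallback(unsigned int gfp_mask,
			     const unsigned long zone_pages[MAX_NR_ZONES],
			     int out[MAX_NR_ZONES])
{
	size_t n = 0;

	switch (first_zone(gfp_mask)) {
	case ZONE_HIGHMEM:
		if (zone_pages[ZONE_HIGHMEM])
			out[n++] = ZONE_HIGHMEM;
		/* fall through */
	case ZONE_DMA32:
		if (zone_pages[ZONE_DMA32])
			out[n++] = ZONE_DMA32;
		/* fall through */
	case ZONE_NORMAL:
		if (zone_pages[ZONE_NORMAL])
			out[n++] = ZONE_NORMAL;
		/* fall through */
	case ZONE_DMA:
		if (zone_pages[ZONE_DMA])
			out[n++] = ZONE_DMA;
		break;
	default:
		assert(!"first_zone() returned an unknown zone");
	}
	return n;
}
```

In this model a GFP_DMA32 request is tried in DMA32, then NORMAL, then DMA, and never lands above 4 GB, while a GFP_HIGHMEM request may use any zone. That is the property that would let I/O buffers for 32-bit devices avoid both bouncing and ZONE_NORMAL.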
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-18 7:12 ` Jens Axboe
@ 2003-09-18 11:22 ` Stephan von Krawczynski
0 siblings, 0 replies; 48+ messages in thread
From: Stephan von Krawczynski @ 2003-09-18 11:22 UTC (permalink / raw)
To: Jens Axboe; +Cc: marcelo.tosatti, alan, galibert, neilb, linux-kernel, andrea
On Thu, 18 Sep 2003 09:12:49 +0200
Jens Axboe <axboe@suse.de> wrote:
> On Thu, Sep 18 2003, Jens Axboe wrote:
> > On Wed, Sep 17 2003, Marcelo Tosatti wrote:
> > >
> > >
> > > On Wed, 17 Sep 2003, Jens Axboe wrote:
> > >
> > > > On Wed, Sep 17 2003, Alan Cox wrote:
> > > > > On Maw, 2003-09-16 at 20:58, Olivier Galibert wrote:
> > > > > > On Tue, Sep 16, 2003 at 04:29:02PM +0100, Alan Cox wrote:
> > > > > > > The kernel has no idea what you will do with given ram. It does
> > > > > > > try to make some guesses but you are basically trying to paper
> > > > > > > over hardware limits.
> > > > > >
> > > > > > Is there a way to specifically turn that ram into a tmpfs though?
> > > > >
> > > > >
> > > > > Something like z2ram copied and hacked a little to kmap the blocks it
> > > > > wants would give you a block device you could use for swap or for
> > > > > /tmp. I'm not sure tmpfs would work here.
> > > >
> > > > Additionally, you need GFP_DMA32 or similar. Would also alleviate the
> > > > nasty pressure on ZONE_NORMAL which is often quite stressed.
> > >
> > > IMO such a GFP_DMA32 flag is a bit intrusive for 2.4, isn't it?
> >
> > Not really, it's just an extra zone. Maybe I can dig such a patch up, I
> > had one for 2.4.2-pre something...
>
> This is the latest I had, for 2.4.5. Pretty simple and nonintrusive at
> that time.
From a design point of view I would pretty much agree with your idea. In fact
there is a ram attribute (dma32) which currently is not reflected in the data
structures in a selectable way. I can see no good reason for this lack.
Regards,
Stephan
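To make the "dma32 attribute" concrete, here is a small user-space sketch of the zone sizing arithmetic from the init.c hunk of the patch earlier in this thread, applied to an idealised 6 GB box. Several things are assumptions rather than facts from the thread: PCI_MAX_DMA32 is taken to be 0xffffffff (the patch uses the name but does not show its definition), pages are 4 KiB, lowmem ends at 896 MB, and the PCI hole and any memory remapping are ignored. The function name split_zones() is hypothetical.

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages, as on i386 */

enum { ZONE_DMA, ZONE_NORMAL, ZONE_DMA32, ZONE_HIGHMEM, MAX_NR_ZONES };

/* Assumption: highest address reachable by a 32-bit PCI device. */
#define PCI_MAX_DMA32 0xffffffffUL

/* Hypothetical helper mirroring the patched sizing logic. All arguments
 * are page frame numbers; results are zone sizes in pages. */
static void split_zones(unsigned long max_dma_pfn,
			unsigned long max_low_pfn,
			unsigned long highend_pfn,
			unsigned long zones_size[MAX_NR_ZONES])
{
	unsigned long max_dma32 = PCI_MAX_DMA32 >> PAGE_SHIFT;
	unsigned long high = highend_pfn;
	unsigned long high32 = highend_pfn;
	int z;

	assert(max_low_pfn <= highend_pfn);
	for (z = 0; z < MAX_NR_ZONES; z++)
		zones_size[z] = 0;

	/* Clamp the DMA32 zone at the first page frame above 4 GB. */
	if (high32 > max_dma32)
		high32 = max_dma32 + 1;

	if (max_low_pfn < max_dma_pfn) {
		zones_size[ZONE_DMA] = max_low_pfn;
	} else {
		zones_size[ZONE_DMA] = max_dma_pfn;
		zones_size[ZONE_NORMAL] = max_low_pfn - max_dma_pfn;
		zones_size[ZONE_DMA32] = high32 - max_low_pfn;
		zones_size[ZONE_HIGHMEM] = high - high32;
	}
}
```

With max_dma_pfn = 4096 (16 MB), max_low_pfn = 229376 (896 MB) and highend_pfn = 1572864 (6 GB), this yields 4096, 225280, 819200 and 524288 pages. In other words, under these assumptions only the top 2 GB of such a box would remain in a zone that 32-bit devices cannot reach directly.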
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-18 7:08 ` Jens Axboe
2003-09-18 7:12 ` Jens Axboe
@ 2003-09-18 15:05 ` William Lee Irwin III
1 sibling, 0 replies; 48+ messages in thread
From: William Lee Irwin III @ 2003-09-18 15:05 UTC (permalink / raw)
To: Jens Axboe
Cc: Marcelo Tosatti, Alan Cox, Olivier Galibert, Stephan von Krawczynski, neilb, Linux Kernel Mailing List
On Wed, Sep 17 2003, Marcelo Tosatti wrote:
>> IMO such a GFP_DMA32 flag is a bit intrusive for 2.4, isn't it?
On Thu, Sep 18, 2003 at 09:08:45AM +0200, Jens Axboe wrote:
> Not really, it's just an extra zone. Maybe I can dig such a patch up, I
> had one for 2.4.2-pre something...
On Wed, Sep 17 2003, Marcelo Tosatti wrote:
>> What has been done in 2.6 with respect to the excessive normal zone
>> pressure and bounce buffering problems?
On Thu, Sep 18, 2003 at 09:08:45AM +0200, Jens Axboe wrote:
> Nothing, afaic. 2.6 isn't even completely deadlock free when it comes to
> bounce buffering.
It'd be great to have ZONE_DMA32 around for 2.6.
-- wli
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-16 14:36 ` Alan Cox
2003-09-16 15:20 ` Stephan von Krawczynski
@ 2003-09-16 17:10 ` Pavel Machek
2003-09-16 19:53 ` Olivier Galibert
2003-09-17 6:41 ` Rogier Wolff
2 siblings, 1 reply; 48+ messages in thread
From: Pavel Machek @ 2003-09-16 17:10 UTC (permalink / raw)
To: Alan Cox
Cc: Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List
Hi!
> > Well, I do understand the bounce buffer problem, but honestly the current way
> > of handling the situation seems questionable at least. If you ever tried such a
> > system you notice it is a lot worse than just dumping the additional ram above
> > 4GB. You can really watch your network connections go bogus which is just
> > unacceptable. Is there any thinkable way to omit the bounce buffers and still
> > do something useful with the beyond-4GB ram parts?
>
> The 2.6 tree is somewhat better about this but at the end of the day if
> your I/O subsystem can't do the job your box will not perform ideally.
> For some workloads it's a huge win to have the extra RAM, for others the
> I/O is a real pain.
If he has trouble logging in, then there's a bug somewhere.
Bounce buffers should not slow the machine down more than
2x, and from his description it looks like a way worse slowdown.
Pavel
-- 
Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...
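One way to read the "not more than 2x" estimate: a bounce buffer costs one extra memory copy per transfer, so in the worst case every byte is moved twice. The following user-space sketch models that for the write direction only. It is an illustration, not the kernel's implementation; struct io_buf, map_for_dma() and the DMA_LIMIT value are assumptions made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the device can only address the first 4 GB of bus space. */
#define DMA_LIMIT 0xffffffffULL

/* Hypothetical description of an I/O buffer: its (pretend) physical
 * address, a pointer through which the CPU can access it, and its length. */
struct io_buf {
	uint64_t phys;
	unsigned char *data;
	size_t len;
};

/* Return the buffer the device should be pointed at for a write.
 * If src lies entirely below DMA_LIMIT it is used as is. Otherwise the
 * payload is copied into the low bounce buffer first, and *copies is
 * incremented to record the extra work. */
static struct io_buf map_for_dma(struct io_buf src, struct io_buf bounce,
				 unsigned int *copies)
{
	assert(src.len > 0);
	if (src.phys + src.len - 1 <= DMA_LIMIT)
		return src;			/* device can reach it directly */

	assert(bounce.len >= src.len);
	assert(bounce.phys + bounce.len - 1 <= DMA_LIMIT);
	memcpy(bounce.data, src.data, src.len);	/* the one extra copy */
	bounce.len = src.len;
	++*copies;
	return bounce;
}
```

A buffer at a pretend address of 5 GB is copied once before the device sees it; a buffer at 16 MB is passed through untouched. A box that stops responding altogether under I/O load is doing something much worse than one extra copy per transfer, which is the point made above.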
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-16 17:10 ` Pavel Machek
@ 2003-09-16 19:53 ` Olivier Galibert
2003-09-16 20:04 ` Pavel Machek
2003-09-16 21:16 ` Marcelo Tosatti
0 siblings, 2 replies; 48+ messages in thread
From: Olivier Galibert @ 2003-09-16 19:53 UTC (permalink / raw)
To: Pavel Machek
Cc: Alan Cox, Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List
[-- Attachment #1: Type: text/plain, Size: 1273 bytes --]
On Tue, Sep 16, 2003 at 07:10:57PM +0200, Pavel Machek wrote:
> Hi!
>
> > > Well, I do understand the bounce buffer problem, but honestly the current way
> > > of handling the situation seems questionable at least. If you ever tried such a
> > > system you notice it is a lot worse than just dumping the additional ram above
> > > 4GB. You can really watch your network connections go bogus which is just
> > > unacceptable. Is there any thinkable way to omit the bounce buffers and still
> > > do something useful with the beyond-4GB ram parts?
> >
> > The 2.6 tree is somewhat better about this but at the end of the day if
> > your I/O subsystem can't do the job your box will not perform ideally.
> > For some workloads it's a huge win to have the extra RAM, for others the
> > I/O is a real pain.
>
> If he has trouble logging in, then there's a bug somewhere.
> Bounce buffers should not slow the machine down more than
> 2x, and from his description it looks like a way worse slowdown.
The box does not just slow down, the box crawls on the floor whimpering.
Nothing works except ping until the i/os are finished (and they seem
to crawl too), then everything works perfectly again.
We're quite eager to fix the problem too, if you want us to test some
things.
OG.
[-- Attachment #2: info.txt.gz --]
[-- Type: application/x-gunzip, Size: 6007 bytes --]
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-16 19:53 ` Olivier Galibert
@ 2003-09-16 20:04 ` Pavel Machek
2003-09-16 21:16 ` Marcelo Tosatti
1 sibling, 0 replies; 48+ messages in thread
From: Pavel Machek @ 2003-09-16 20:04 UTC (permalink / raw)
To: Olivier Galibert, Alan Cox, Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List
Hi!
> > > > Well, I do understand the bounce buffer problem, but honestly the current way
> > > > of handling the situation seems questionable at least. If you ever tried such a
> > > > system you notice it is a lot worse than just dumping the additional ram above
> > > > 4GB. You can really watch your network connections go bogus which is just
> > > > unacceptable. Is there any thinkable way to omit the bounce buffers and still
> > > > do something useful with the beyond-4GB ram parts?
> > >
> > > The 2.6 tree is somewhat better about this but at the end of the day if
> > > your I/O subsystem can't do the job your box will not perform ideally.
> > > For some workloads it's a huge win to have the extra RAM, for others the
> > > I/O is a real pain.
> >
> > If he has trouble logging in, then there's a bug somewhere.
> > Bounce buffers should not slow the machine down more than
> > 2x, and from his description it looks like a way worse slowdown.
>
> The box does not just slow down, the box crawls on the floor whimpering.
> Nothing works except ping until the i/os are finished (and they seem
> to crawl too), then everything works perfectly again.
That seems like a bug ;-). Can you do some kind of memstat to see if it
is not something like a shortage of atomic pages? Also try to run a
vanilla kernel. And try running it UP.
> We're quite eager to fix the problem too, if you want us to test some
> things.
I'm afraid I do not have a big-enough box close enough to fix that.
Does it happen with another disk driver, too? What about interrupts,
aren't they disabled for too long? Can you enable PREEMPT to see
'scheduling in atomic' warnings?
Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-16 19:53 ` Olivier Galibert
2003-09-16 20:04 ` Pavel Machek
@ 2003-09-16 21:16 ` Marcelo Tosatti
2003-09-16 21:23 ` Olivier Galibert
1 sibling, 1 reply; 48+ messages in thread
From: Marcelo Tosatti @ 2003-09-16 21:16 UTC (permalink / raw)
To: Olivier Galibert
Cc: Pavel Machek, Alan Cox, Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List
On Tue, 16 Sep 2003, Olivier Galibert wrote:
> On Tue, Sep 16, 2003 at 07:10:57PM +0200, Pavel Machek wrote:
> > Hi!
> >
> > > > Well, I do understand the bounce buffer problem, but honestly the current way
> > > > of handling the situation seems questionable at least. If you ever tried such a
> > > > system you notice it is a lot worse than just dumping the additional ram above
> > > > 4GB. You can really watch your network connections go bogus which is just
> > > > unacceptable. Is there any thinkable way to omit the bounce buffers and still
> > > > do something useful with the beyond-4GB ram parts?
> > >
> > > The 2.6 tree is somewhat better about this but at the end of the day if
> > > your I/O subsystem can't do the job your box will not perform ideally.
> > > For some workloads it's a huge win to have the extra RAM, for others the
> > > I/O is a real pain.
> >
> > If he has trouble logging in, then there's a bug somewhere.
> > Bounce buffers should not slow the machine down more than
> > 2x, and from his description it looks like a way worse slowdown.
>
> The box does not just slow down, the box crawls on the floor whimpering.
> Nothing works except ping until the i/os are finished (and they seem
> to crawl too), then everything works perfectly again.
>
> We're quite eager to fix the problem too, if you want us to test some
> things.
Which card and driver are you using for IO? 3ware?
How much RAM do you have?
I remember I tested heavy IO loads (heavy swapping and dbench) on an 8GB
machine and all worked fine (interactive terminal, etc) but that was a
looong time ago back in 2.4.
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-16 21:16 ` Marcelo Tosatti
@ 2003-09-16 21:23 ` Olivier Galibert
2003-09-17 11:14 ` Stephan von Krawczynski
0 siblings, 1 reply; 48+ messages in thread
From: Olivier Galibert @ 2003-09-16 21:23 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Pavel Machek, Alan Cox, Stephan von Krawczynski, neilb, Linux Kernel Mailing List
On Tue, Sep 16, 2003 at 06:16:58PM -0300, Marcelo Tosatti wrote:
> Which card and driver are you using for IO? 3ware?
Bus 6, device 2, function 0:
SCSI storage controller: Adaptec AIC-7902 U320 (rev 3).
Bus 6, device 2, function 1:
SCSI storage controller: Adaptec AIC-7902 U320 (#2) (rev 3).
0:
Adaptec AIC79xx driver version: 1.3.10
Adaptec AIC7902 Ultra320 SCSI adapter
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs
Allocated SCBs: 32, SG List Length: 85
Target 0 Negotiation Settings
	User: 320.000MB/s transfers (160.000MHz DT|IU|QAS, 16bit)
	Goal: 320.000MB/s transfers (160.000MHz DT|IU|QAS, 16bit)
	Curr: 320.000MB/s transfers (160.000MHz DT|IU|QAS, 16bit)
	Transmission Errors 0
	Channel A Target 0 Lun 0 Settings
		Commands Queued 6712736
		Commands Active 0
		Command Openings 32
		Max Tagged Openings 32
		Device Queue Frozen Count 0
1:
Adaptec AIC79xx driver version: 1.3.10
Adaptec AIC7902 Ultra320 SCSI adapter
aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs
Allocated SCBs: 64, SG List Length: 85
Serial EEPROM:
0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8
0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8
0x09f4 0x0146 0x2807 0x0010 0xffff 0xffff 0xffff 0xffff
0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0x0410 0xb3d7
Target 0 Negotiation Settings
	User: 320.000MB/s transfers (160.000MHz DT|IU|QAS, 16bit)
	Goal: 160.000MB/s transfers (80.000MHz DT, 16bit)
	Curr: 160.000MB/s transfers (80.000MHz DT, 16bit)
	Transmission Errors 0
	Channel A Target 0 Lun 0 Settings
		Commands Queued 5523328
		Commands Active 0
		Command Openings 32
		Max Tagged Openings 32
		Device Queue Frozen Count 0
	Channel A Target 0 Lun 1 Settings
		Commands Queued 14646734
		Commands Active 0
		Command Openings 32
		Max Tagged Openings 32
		Device Queue Frozen Count 0
Target 1 Negotiation Settings
	User: 320.000MB/s transfers (160.000MHz DT|IU|QAS, 16bit)
	Goal: 160.000MB/s transfers (80.000MHz DT, 16bit)
	Curr: 160.000MB/s transfers (80.000MHz DT, 16bit)
	Transmission Errors 0
	Channel A Target 1 Lun 0 Settings
		Commands Queued 4
		Commands Active 0
		Command Openings 32
		Max Tagged Openings 32
		Device Queue Frozen Count 0
	Channel A Target 1 Lun 1 Settings
		Commands Queued 4
		Commands Active 0
		Command Openings 32
		Max Tagged Openings 32
		Device Queue Frozen Count 0
dmesg:
SCSI subsystem driver Revision: 1.00
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.10
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.10
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs
blk: queue f7965018, I/O limit 524287Mb (mask 0x7fffffffff)
(scsi0:A:0): 320.000MB/s transfers (160.000MHz DT|IU|QAS, 16bit)
scsi1:A:0:0: DV failed to configure device. Please file a bug report against this driver.
scsi1:A:1:0: DV failed to configure device. Please file a bug report against this driver.
Vendor: SEAGATE Model: ST373307LC Rev: 0004
Type: Direct-Access ANSI SCSI revision: 03
blk: queue f7913e18, I/O limit 524287Mb (mask 0x7fffffffff)
Vendor: SUPER Model: GEM318 Rev: 0
Type: Processor ANSI SCSI revision: 02
blk: queue f7913018, I/O limit 524287Mb (mask 0x7fffffffff)
scsi0:A:0:0: Tagged Queuing enabled. Depth 32
(scsi1:A:0): 160.000MB/s transfers (80.000MHz DT, 16bit)
Vendor: transtec Model: Rev: 0001
Type: Direct-Access ANSI SCSI revision: 03
blk: queue f78cbc18, I/O limit 524287Mb (mask 0x7fffffffff)
Vendor: transtec Model: Rev: 0001
Type: Direct-Access ANSI SCSI revision: 03
blk: queue f78cba18, I/O limit 524287Mb (mask 0x7fffffffff)
(scsi1:A:1): 160.000MB/s transfers (80.000MHz DT, 16bit)
Vendor: transtec Model: Rev: 0001
Type: Direct-Access ANSI SCSI revision: 03
blk: queue f78cb618, I/O limit 524287Mb (mask 0x7fffffffff)
Vendor: transtec Model: Rev: 0001
Type: Direct-Access ANSI SCSI revision: 03
blk: queue f78cb418, I/O limit 524287Mb (mask 0x7fffffffff)
scsi1:A:0:0: Tagged Queuing enabled. Depth 32
scsi1:A:0:1: Tagged Queuing enabled. Depth 32
scsi1:A:1:0: Tagged Queuing enabled. Depth 32
scsi1:A:1:1: Tagged Queuing enabled. Depth 32
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0
Attached scsi disk sdc at scsi1, channel 0, id 0, lun 1
Attached scsi disk sdd at scsi1, channel 0, id 1, lun 0
Attached scsi disk sde at scsi1, channel 0, id 1, lun 1
SCSI device sda: 143374744 512-byte hdwr sectors (73408 MB)
Partition check:
 sda: sda1 sda2 sda3 sda4
SCSI device sdb: 2788016128 512-byte hdwr sectors (1427464 MB)
 sdb: sdb1
SCSI device sdc: 2788016128 512-byte hdwr sectors (1427464 MB)
 sdc: sdc1
SCSI device sdd: 4101521408 512-byte hdwr sectors (2099979 MB)
 sdd: sdd1
SCSI device sde: 4101521408 512-byte hdwr sectors (2099979 MB)
 sde: sde1
Attached scsi generic sg1 at scsi0, channel 0, id 6, lun 0, type 3
> How much RAM do you have?
8 GB, currently down to 4 GB using mem=4G so that things actually work.
OG.
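The "blk: queue ..., I/O limit 524287Mb (mask 0x7fffffffff)" lines in the dmesg output above can be cross-checked with a little arithmetic: the mask has 39 bits set, and shifting it right by 20 gives the printed limit in megabytes. The sketch below only verifies that arithmetic; io_limit_mb() and needs_bounce() are hypothetical names, and treating "last address beyond the mask" as the bounce condition is an assumption about what the mask means, not something stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a DMA address mask (highest reachable
 * byte address) into a limit in whole megabytes. */
static uint64_t io_limit_mb(uint64_t dma_mask)
{
	return dma_mask >> 20;
}

/* Hypothetical helper: a buffer whose last byte sits at phys_last would
 * need a bounce copy only if that address lies beyond the mask. */
static int needs_bounce(uint64_t phys_last, uint64_t dma_mask)
{
	assert(dma_mask != 0);
	return phys_last > dma_mask;
}
```

io_limit_mb(0x7fffffffff) is 524287, matching the log, while a plain 32-bit mask of 0xffffffff would give 4095. If RAM on this box tops out somewhere around 8 GB, every page is below the 39-bit mask, so by this reading these block queues should not need to bounce at all. That says nothing about other subsystems, and it does not by itself explain the stalls reported above.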
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-16 21:23 ` Olivier Galibert
@ 2003-09-17 11:14 ` Stephan von Krawczynski
2003-09-17 13:08 ` Olivier Galibert
0 siblings, 1 reply; 48+ messages in thread
From: Stephan von Krawczynski @ 2003-09-17 11:14 UTC (permalink / raw)
To: Olivier Galibert; +Cc: marcelo.tosatti, pavel, alan, neilb, linux-kernel
On Tue, 16 Sep 2003 23:23:02 +0200
Olivier Galibert <olivier.galibert@limsi.fr> wrote:
> On Tue, Sep 16, 2003 at 06:16:58PM -0300, Marcelo Tosatti wrote:
> > Which card and driver are you using for IO? 3ware?
>
> Bus 6, device 2, function 0:
> SCSI storage controller: Adaptec AIC-7902 U320 (rev 3).
> Bus 6, device 2, function 1:
> SCSI storage controller: Adaptec AIC-7902 U320 (#2) (rev 3).
>
Hello Olivier,
Pretty interesting.
Can you please give 2.4.23-pre4 a short test. I think I can see a remarkable
difference to 2.4.22 and would like to find confirmation ...
Regards,
Stephan
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-17 11:14 ` Stephan von Krawczynski
@ 2003-09-17 13:08 ` Olivier Galibert
2003-09-18 9:58 ` Olivier Galibert
0 siblings, 1 reply; 48+ messages in thread
From: Olivier Galibert @ 2003-09-17 13:08 UTC
To: Stephan von Krawczynski; +Cc: marcelo.tosatti, pavel, alan, neilb, linux-kernel

On Wed, Sep 17, 2003 at 01:14:07PM +0200, Stephan von Krawczynski wrote:
> Can you please give 2.4.23-pre4 a short test. I think I can see a remarkable
> difference to 2.4.22 and would like to find confirmation ...

Well, I tried but the aic7xxx does not work for me, see other mail.
I'll try again once the LUN enumeration is fixed.

  OG.
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-17 13:08 ` Olivier Galibert
@ 2003-09-18 9:58 ` Olivier Galibert
2003-09-18 10:13 ` Stephan von Krawczynski
0 siblings, 1 reply; 48+ messages in thread
From: Olivier Galibert @ 2003-09-18 9:58 UTC
To: Stephan von Krawczynski; +Cc: marcelo.tosatti, pavel, alan, neilb, linux-kernel

On Wed, Sep 17, 2003 at 03:08:18PM +0200, Olivier Galibert wrote:
> On Wed, Sep 17, 2003 at 01:14:07PM +0200, Stephan von Krawczynski wrote:
> > Can you please give 2.4.23-pre4 a short test. I think I can see a remarkable
> > difference to 2.4.22 and would like to find confirmation ...
>
> Well, I tried but the aic7xxx does not work for me, see other mail.
> I'll try again once the LUN enumeration is fixed.

Actually I had booted the wrong kernel (2.6.0t4) by mistake. SCSI
works, and there is indeed a remarkable difference. The system holds
perfectly under filled-ram, high i/o usage now. Excellent.

Now if only the CPU enumeration worked and both CPUs were detected...

  OG.
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-18 9:58 ` Olivier Galibert
@ 2003-09-18 10:13 ` Stephan von Krawczynski
2003-09-18 11:22 ` Olivier Galibert
0 siblings, 1 reply; 48+ messages in thread
From: Stephan von Krawczynski @ 2003-09-18 10:13 UTC
To: Olivier Galibert; +Cc: marcelo.tosatti, pavel, alan, neilb, linux-kernel

On Thu, 18 Sep 2003 11:58:45 +0200
Olivier Galibert <galibert@pobox.com> wrote:

> On Wed, Sep 17, 2003 at 03:08:18PM +0200, Olivier Galibert wrote:
> > On Wed, Sep 17, 2003 at 01:14:07PM +0200, Stephan von Krawczynski wrote:
> > > Can you please give 2.4.23-pre4 a short test. I think I can see a
> > > remarkable difference to 2.4.22 and would like to find confirmation ...
> >
> > Well, I tried but the aic7xxx does not work for me, see other mail.
> > I'll try again once the LUN enumeration is fixed.
>
> Actually I had booted the wrong kernel (2.6.0t4) by mistake. SCSI
> works, and there is indeed a remarkable difference. The system holds
> perfectly under filled-ram, high i/o usage now. Excellent.

Fine. So we seem to agree 2.4.23 will be another big hit in the 2.4 line :-)

> Now if only the CPU enumeration worked and both CPUs were detected...

Hm, I have not yet seen any configuration where multiple CPUs are not detected.
Are you sure you have compiled in SMP support? What does dmesg look like?

Regards,
Stephan
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-18 10:13 ` Stephan von Krawczynski
@ 2003-09-18 11:22 ` Olivier Galibert
0 siblings, 0 replies; 48+ messages in thread
From: Olivier Galibert @ 2003-09-18 11:22 UTC
To: Stephan von Krawczynski; +Cc: marcelo.tosatti, pavel, alan, neilb, linux-kernel

On Thu, Sep 18, 2003 at 12:13:27PM +0200, Stephan von Krawczynski wrote:
> Fine. So we seem to agree 2.4.23 will be another big hit in the 2.4 line :-)

Indeed, it's working beautifully.

> > Now if only the CPU enumeration worked and both CPUs were detected...
>
> Hm, I have not yet seen any configuration where multiple CPUs are not detected.
> Are you sure you have compiled in SMP support? What does dmesg look like?

I found the problem. The meaning of the option "number of supported
CPUs" is not what is expected. It does not fix the maximum number of
CPUs, but the number of the last CPU checked for. Specifically, in
our system the CPUs are numbered 0 and 6. Setting the MNCPU to 2
prevents the second CPU from being taken into account.

  OG.
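The distinction Olivier describes can be illustrated with a small userspace model. This is only a sketch of the two possible readings of such a limit, not the kernel's actual enumeration code; the function names are hypothetical and the CPU numbers 0 and 6 are the ones reported in the message above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model, not kernel code.
 * Reading 1: the limit bounds the CPU *number*. A CPU whose number is
 * greater than or equal to the limit is never considered.
 */
size_t cpus_seen_with_id_limit(const int *cpu_ids, size_t n, int limit)
{
    size_t seen = 0;

    assert(limit >= 0);
    for (size_t i = 0; i < n; i++)
        if (cpu_ids[i] < limit)
            seen++;
    return seen;
}

/*
 * Reading 2: the limit bounds how many CPUs are accepted,
 * whatever their numbers are.
 */
size_t cpus_seen_with_count_limit(const int *cpu_ids, size_t n, int limit)
{
    (void)cpu_ids;              /* the numbers do not matter under this reading */
    assert(limit >= 0);
    return n < (size_t)limit ? n : (size_t)limit;
}
```

With CPU numbers {0, 6} and a limit of 2, the first function reports one CPU and the second reports two. Under the first reading the limit would have to be at least 7 for both CPUs to be seen.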
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-16 14:36 ` Alan Cox
2003-09-16 15:20 ` Stephan von Krawczynski
2003-09-16 17:10 ` Pavel Machek
@ 2003-09-17 6:41 ` Rogier Wolff
2003-09-17 10:26 ` Jens Axboe
2 siblings, 1 reply; 48+ messages in thread
From: Rogier Wolff @ 2003-09-17 6:41 UTC
To: Alan Cox
Cc: Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List

On Tue, Sep 16, 2003 at 03:36:14PM +0100, Alan Cox wrote:
> I/O is a real pain. Also in some cases it might be interesting to try
> using the extra RAM above the 4G boundary as a giant ram disk and using
> it as first swap device.

4G? Above 4G? The limit should be configurable a lot earlier.

I'd want to configure that on the machines I'm installing tomorrow.
4G RAM, but I'd rather not use the highmem stuff. I think the workload
that this machine is likely to get will work very well with this setup.

Why does this have the opportunity to work better than just using the
2 or 4G of RAM? Because after you've used the bottom 1G, that might
just remain there, requiring lots of IO to go through bounce buffers
and memory remappings. By considering the top part of RAM as swap,
you'll force the important stuff into the more easily accessible
RAM (Compare to fastram as it was called on the Amiga!).

Roger.

--
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
**** "Linux is like a wigwam - no windows, no gates, apache inside!" ****
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-17 6:41 ` Rogier Wolff
@ 2003-09-17 10:26 ` Jens Axboe
2003-09-17 10:42 ` Rogier Wolff
2003-09-17 19:19 ` Pavel Machek
0 siblings, 2 replies; 48+ messages in thread
From: Jens Axboe @ 2003-09-17 10:26 UTC
To: Rogier Wolff
Cc: Alan Cox, Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List

On Wed, Sep 17 2003, Rogier Wolff wrote:
> On Tue, Sep 16, 2003 at 03:36:14PM +0100, Alan Cox wrote:
> > I/O is a real pain. Also in some cases it might be interesting to try
> > using the extra RAM above the 4G boundary as a giant ram disk and using
> > it as first swap device.
>
> 4G? Above 4G? The limit should be configurable a lot earlier.
>
> I'd want to configure that on the machines I'm installing tomorrow.
> 4G RAM, but I'd rather not use the highmem stuff. I think the workload
> that this machine is likely to get will work very well with this setup.
>
> Why does this have the opportunity to work better than just using the
> 2 or 4G of RAM? Because after you've used the bottom 1G, that might
> just remain there, requiring lots of IO to go through bounce buffers
> and memory remappings. By considering the top part of RAM as swap,
> you'll force the important stuff into the more easily accessible
> RAM (Compare to fastram as it was called on the Amiga!).

You are misunderstanding the problem. You don't use bounce buffers just
because the page happens to reside in high memory, it is only used if
the hardware cannot DMA to it. And that is exactly the problem here with
the 3ware adapter, it cannot dma to > 4GB. So in a 6GB setup (with
potentially 5G of highmem), only the last 2G requires bouncing.

To answer one of the other questions regarding slowdown - it can be
nastier than 2x, remember that for reads the copy back happens inside
the interrupt handler... It would also be interesting to note (with
vmstat 1) whether it's all system time, or if you see something like
kswapd going crazy too. If the attached patch makes a difference, then
it could be a vm issue as well. Still doesn't change the fact that if
you build a machine with 6GB of RAM and expect it to perform, then you
don't add io controllers that cannot DMA to all of your RAM.

===== mm/highmem.c 1.15 vs edited =====
--- 1.15/mm/highmem.c	Thu Feb 20 21:45:27 2003
+++ edited/mm/highmem.c	Wed Sep 17 12:25:06 2003
@@ -335,7 +335,7 @@
 	struct list_head *tmp;
 	struct page *page;
 
-	page = alloc_page(GFP_NOHIGHIO);
+	page = alloc_page(GFP_ATOMIC);
 	if (page)
 		return page;
 	/*

--
Jens Axboe
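The rule Jens states (bouncing depends on what the device can address, not on whether a page is highmem) can be written down as a small userspace model. This is a sketch under simplifying assumptions, not the kernel's block-layer code: the function names are hypothetical, and physical RAM is assumed to be one contiguous range starting at address 0, which ignores any hole below 4 GB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model, not kernel code.
 * A page needs a bounce buffer only if its physical address lies
 * beyond the highest address the device can DMA to.
 */
bool needs_bounce(uint64_t phys_addr, uint64_t dma_mask)
{
    return phys_addr > dma_mask;
}

/*
 * Bytes of RAM that the device cannot reach directly, assuming RAM
 * occupies the contiguous physical range [0, ram_bytes).
 */
uint64_t bounced_bytes(uint64_t ram_bytes, uint64_t dma_mask)
{
    uint64_t reachable = dma_mask + 1;  /* wraps to 0 for a full 64-bit mask */

    if (reachable == 0 || ram_bytes <= reachable)
        return 0;
    return ram_bytes - reachable;
}
```

For 6 GB of RAM and a 32-bit mask (0xffffffff) this gives 2 GB that must be bounced, matching the figure in the message above. With the 39-bit mask 0x7fffffffff that appears in the aic79xx log earlier in the thread, the same 6 GB needs no bouncing at all.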
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-17 10:26 ` Jens Axboe
@ 2003-09-17 10:42 ` Rogier Wolff
2003-09-17 10:53 ` Jens Axboe
2003-09-17 19:19 ` Pavel Machek
1 sibling, 1 reply; 48+ messages in thread
From: Rogier Wolff @ 2003-09-17 10:42 UTC
To: Jens Axboe
Cc: Rogier Wolff, Alan Cox, Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List

On Wed, Sep 17, 2003 at 12:26:29PM +0200, Jens Axboe wrote:
> On Wed, Sep 17 2003, Rogier Wolff wrote:
> > On Tue, Sep 16, 2003 at 03:36:14PM +0100, Alan Cox wrote:
> > > I/O is a real pain. Also in some cases it might be interesting to try
> > > using the extra RAM above the 4G boundary as a giant ram disk and using
> > > it as first swap device.
> >
> > 4G? Above 4G? The limit should be configurable a lot earlier.
> >
> > I'd want to configure that on the machines I'm installing tomorrow.
> > 4G RAM, but I'd rather not use the highmem stuff. I think the workload
> > that this machine is likely to get will work very well with this setup.
> >
> > Why does this have the opportunity to work better than just using the
> > 2 or 4G of RAM? Because after you've used the bottom 1G, that might
> > just remain there, requiring lots of IO to go through bounce buffers
> > and memory remappings. By considering the top part of RAM as swap,
        ^^^^^^^^^^^^^^^^^
> > you'll force the important stuff into the more easily accessible
> > RAM (Compare to fastram as it was called on the Amiga!).
>
> You are misunderstanding the problem. You don't use bounce buffers just
> because the page happens to reside in high memory, it is only used if
> the hardware cannot DMA to it. And that is exactly the problem here with
> the 3ware adapter, it cannot dma to > 4GB. So in a 6GB setup (with
> potentially 5G of highmem), only the last 2G requires bouncing.

As I understand things (But this is from following discussions on
linux-kernel from afar, not from personal poking at the code!) there
is also a performance penalty for the kernel not having direct
physically mapped access to RAM. We map up to 3G of virtual memory of
userspace, and up to 1Gb of physical RAM into the kernel memory map
for performance reasons. So if I have 2G RAM and want to keep 3G
userspace, I have to use some "highmem" stuff right?

This will not directly require the use of bounce buffers, but it will
require the kernel to remap regions when it needs to access them.

If this doesn't have a performance impact, why do I have the option of
directly mapping 1G, 2G, or 3G?

Roger.

--
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
**** "Linux is like a wigwam - no windows, no gates, apache inside!" ****
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-17 10:42 ` Rogier Wolff
@ 2003-09-17 10:53 ` Jens Axboe
0 siblings, 0 replies; 48+ messages in thread
From: Jens Axboe @ 2003-09-17 10:53 UTC
To: Rogier Wolff
Cc: Alan Cox, Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List

On Wed, Sep 17 2003, Rogier Wolff wrote:
> On Wed, Sep 17, 2003 at 12:26:29PM +0200, Jens Axboe wrote:
> > On Wed, Sep 17 2003, Rogier Wolff wrote:
> > > On Tue, Sep 16, 2003 at 03:36:14PM +0100, Alan Cox wrote:
> > > > I/O is a real pain. Also in some cases it might be interesting to try
> > > > using the extra RAM above the 4G boundary as a giant ram disk and using
> > > > it as first swap device.
> > >
> > > 4G? Above 4G? The limit should be configurable a lot earlier.
> > >
> > > I'd want to configure that on the machines I'm installing tomorrow.
> > > 4G RAM, but I'd rather not use the highmem stuff. I think the workload
> > > that this machine is likely to get will work very well with this setup.
> > >
> > > Why does this have the opportunity to work better than just using the
> > > 2 or 4G of RAM? Because after you've used the bottom 1G, that might
> > > just remain there, requiring lots of IO to go through bounce buffers
> > > and memory remappings. By considering the top part of RAM as swap,
>         ^^^^^^^^^^^^^^^^^
> > > you'll force the important stuff into the more easily accessible
> > > RAM (Compare to fastram as it was called on the Amiga!).
> >
> > You are misunderstanding the problem. You don't use bounce buffers just
> > because the page happens to reside in high memory, it is only used if
> > the hardware cannot DMA to it. And that is exactly the problem here with
> > the 3ware adapter, it cannot dma to > 4GB. So in a 6GB setup (with
> > potentially 5G of highmem), only the last 2G requires bouncing.
>
> As I understand things (But this is from following discussions on
> linux-kernel from afar, not from personal poking at the code!) there
> is also a performance penalty for the kernel not having direct
> physically mapped access to RAM. We map up to 3G of virtual memory of
> userspace, and up to 1Gb of physical RAM into the kernel memory map
> for performance reasons. So if I have 2G RAM and want to keep 3G
> userspace, I have to use some "highmem" stuff right?
>
> This will not directly require the use of bounce buffers, but it will
> require the kernel to remap regions when it needs to access them.

That is completely correct. Your original post just didn't make this
distinction, and there is an order of magnitude performance difference
between kmap() and bouncing! :)

> If this doesn't have a performance impact, why do I have the option of
> directly mapping 1G, 2G, or 3G?

It does cost something of course, but not nearly as expensive as bounce
buffering. You cannot compare the two.

--
Jens Axboe
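The two separate costs discussed in this exchange, remapping for highmem pages and copying for pages the device cannot reach, can be made explicit with a three-way classification. This is an illustrative sketch, not kernel code. The type and function names are hypothetical, the 1 GB direct-mapped figure used in the example is the round number quoted in the thread (the real boundary depends on the kernel configuration), and the direct-mapped region is assumed to lie entirely within the device's DMA range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cost classes for a physical page, cheapest first. */
enum page_class {
    PAGE_LOWMEM,          /* permanently mapped into the kernel */
    PAGE_HIGHMEM,         /* needs a temporary mapping (kmap) for kernel access */
    PAGE_HIGHMEM_BOUNCE   /* highmem and also beyond the device's DMA reach */
};

/*
 * Classify a page by physical address.
 * lowmem_bytes: size of the directly mapped region starting at 0.
 * dma_mask:     highest physical address the I/O device can address.
 */
enum page_class classify_page(uint64_t phys_addr, uint64_t lowmem_bytes,
                              uint64_t dma_mask)
{
    /* Assumption of this sketch: all lowmem is DMA-reachable. */
    assert(lowmem_bytes == 0 || lowmem_bytes - 1 <= dma_mask);

    if (phys_addr < lowmem_bytes)
        return PAGE_LOWMEM;
    if (phys_addr <= dma_mask)
        return PAGE_HIGHMEM;
    return PAGE_HIGHMEM_BOUNCE;
}
```

With 1 GB directly mapped and a 32-bit DMA mask, a page at 512 MB is lowmem, a page at 3 GB is highmem that only needs remapping, and a page at 5 GB is highmem that also has to be bounced for I/O. Only the last class pays the copy cost that Jens calls an order of magnitude more expensive.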
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-17 10:26 ` Jens Axboe
2003-09-17 10:42 ` Rogier Wolff
@ 2003-09-17 19:19 ` Pavel Machek
2003-09-18 11:39 ` Rogier Wolff
1 sibling, 1 reply; 48+ messages in thread
From: Pavel Machek @ 2003-09-17 19:19 UTC
To: Jens Axboe
Cc: Rogier Wolff, Alan Cox, Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List

Hi!

> > and memory remappings. By considering the top part of RAM as swap,
> > you'll force the important stuff into the more easily accessible
> > RAM (Compare to fastram as it was called on the Amiga!).
>
> You are misunderstanding the problem. You don't use bounce buffers just
> because the page happens to reside in high memory, it is only used if
> the hardware cannot DMA to it. And that is exactly the problem here with
> the 3ware adapter, it cannot dma to > 4GB. So in a 6GB setup (with
> potentially 5G of highmem), only the last 2G requires bouncing.
>
> To answer one of the other questions regarding slowdown - it can be
> nastier than 2x, remember that for reads the copy back happens inside
> the interrupt handler... It would also be interesting to note (with

Ouch, I guess I see. If big part of time is spent in interrupt copying
data, network is going to lose packets and perform awfully.

Could he run some interrupt latency tester? Perhaps we are copying way
too much in one chunk, therefore starving network?

Heh, old ide disk in PIO mode should allow that 6GB machine
to perform better...
At least you don't lose packets during PIO reads...

				Pavel
--
				Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-17 19:19 ` Pavel Machek
@ 2003-09-18 11:39 ` Rogier Wolff
2003-09-18 12:13 ` Rogier Wolff
0 siblings, 1 reply; 48+ messages in thread
From: Rogier Wolff @ 2003-09-18 11:39 UTC
To: Pavel Machek
Cc: Jens Axboe, Rogier Wolff, Alan Cox, Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List

On Wed, Sep 17, 2003 at 09:19:29PM +0200, Pavel Machek wrote:
> Heh, old ide disk in PIO mode should allow that 6GB machine
> to perform better...
> At least you don't lose packets during PIO reads...

As long as you tune it

	hdparm -i1 /dev/hdX

Roger.

>
--
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
**** "Linux is like a wigwam - no windows, no gates, apache inside!" ****
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-18 11:39 ` Rogier Wolff
@ 2003-09-18 12:13 ` Rogier Wolff
0 siblings, 0 replies; 48+ messages in thread
From: Rogier Wolff @ 2003-09-18 12:13 UTC
To: Rogier Wolff
Cc: Pavel Machek, Jens Axboe, Alan Cox, Stephan von Krawczynski, Marcelo Tosatti, neilb, Linux Kernel Mailing List

On Thu, Sep 18, 2003 at 01:39:03PM +0200, Rogier Wolff wrote:
> As long as you tune it
> 	hdparm -i1 /dev/hdX

ehmm make that

	hdparm -u1 /dev/hdX

--
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
**** "Linux is like a wigwam - no windows, no gates, apache inside!" ****
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-16 13:36 ` Stephan von Krawczynski
` (2 preceding siblings ...)
2003-09-16 14:36 ` Alan Cox
@ 2003-09-16 15:22 ` Timothy Miller
2003-09-16 15:29 ` Martin J. Bligh
4 siblings, 0 replies; 48+ messages in thread
From: Timothy Miller @ 2003-09-16 15:22 UTC
To: Stephan von Krawczynski; +Cc: Marcelo Tosatti, neilb, linux-kernel

Stephan von Krawczynski wrote:
> On Tue, 16 Sep 2003 10:11:49 -0300 (BRT)
> Marcelo Tosatti <marcelo.tosatti@cyclades.com.br> wrote:
>
>
>>Oh... Jens just pointed out bounce buffering is needed for the upper 2Gs.
>>
>>Maybe you have a SCSI card+disks to test ? 8)
>
>
> Well, I do understand the bounce buffer problem, but honestly the current way
> of handling the situation seems questionable at least. If you ever tried such a
> system you notice it is a lot worse than just dumping the additional ram above
> 4GB. You can really watch your network connections go bogus which is just
> unacceptable. Is there any thinkable way to omit the bounce buffers and still
> do something useful with the beyond-4GB ram parts?
> We should not leave the current bad situation as is...

If there were some kind of tracking to determine which processes are
doing I/O which requires the process to be in low memory, then
processes could be migrated around in physical memory so as to optimize
for that.

Or is that already being done?
* Re: experiences beyond 4 GB RAM with 2.4.22
2003-09-16 13:36 ` Stephan von Krawczynski
` (3 preceding siblings ...)
2003-09-16 15:22 ` Timothy Miller
@ 2003-09-16 15:29 ` Martin J. Bligh
4 siblings, 0 replies; 48+ messages in thread
From: Martin J. Bligh @ 2003-09-16 15:29 UTC
To: Stephan von Krawczynski, Marcelo Tosatti; +Cc: neilb, linux-kernel

>> Oh... Jens just pointed out bounce buffering is needed for the upper 2Gs.
>>
>> Maybe you have a SCSI card+disks to test ? 8)
>
> Well, I do understand the bounce buffer problem, but honestly the current way
> of handling the situation seems questionable at least. If you ever tried such a
> system you notice it is a lot worse than just dumping the additional ram above
> 4GB. You can really watch your network connections go bogus which is just
> unacceptable. Is there any thinkable way to omit the bounce buffers and still
> do something useful with the beyond-4GB ram parts?
> We should not leave the current bad situation as is...

It won't need to bounce buffer if you have a decent driver & hardware.

M.
end of thread, other threads:[~2003-09-18 15:04 UTC | newest]

Thread overview: 48+ messages
2003-09-09 9:01 experiences beyond 4 GB RAM with 2.4.22 Stephan von Krawczynski
2003-09-09 12:25 ` Andrea Arcangeli
2003-09-12 2:46 ` Neil Brown
2003-09-12 6:54 ` Stephan von Krawczynski
2003-09-12 7:11 ` Jens Axboe
2003-09-12 7:53 ` Mike Fedyk
2003-09-15 22:01 ` Marcelo Tosatti
2003-09-16 8:21 ` Stephan von Krawczynski
2003-09-16 12:05 ` Stephan von Krawczynski
2003-09-16 13:11 ` Marcelo Tosatti
2003-09-16 13:36 ` Stephan von Krawczynski
2003-09-16 13:55 ` Richard B. Johnson
2003-09-16 14:13 ` Stephan von Krawczynski
2003-09-16 14:33 ` Marcelo Tosatti
2003-09-16 14:36 ` Stephan von Krawczynski
2003-09-16 14:36 ` Alan Cox
2003-09-16 15:20 ` Stephan von Krawczynski
2003-09-16 15:29 ` Alan Cox
2003-09-16 15:49 ` Timothy Miller
2003-09-16 16:17 ` Stephan von Krawczynski
2003-09-16 19:58 ` Olivier Galibert
2003-09-17 15:10 ` Alan Cox
2003-09-17 19:19 ` Jens Axboe
2003-09-17 19:30 ` Marcelo Tosatti
2003-09-17 22:18 ` Stephan von Krawczynski
2003-09-18 7:08 ` Jens Axboe
2003-09-18 7:12 ` Jens Axboe
2003-09-18 11:22 ` Stephan von Krawczynski
2003-09-18 15:05 ` William Lee Irwin III
2003-09-16 17:10 ` Pavel Machek
2003-09-16 19:53 ` Olivier Galibert
2003-09-16 20:04 ` Pavel Machek
2003-09-16 21:16 ` Marcelo Tosatti
2003-09-16 21:23 ` Olivier Galibert
2003-09-17 11:14 ` Stephan von Krawczynski
2003-09-17 13:08 ` Olivier Galibert
2003-09-18 9:58 ` Olivier Galibert
2003-09-18 10:13 ` Stephan von Krawczynski
2003-09-18 11:22 ` Olivier Galibert
2003-09-17 6:41 ` Rogier Wolff
2003-09-17 10:26 ` Jens Axboe
2003-09-17 10:42 ` Rogier Wolff
2003-09-17 10:53 ` Jens Axboe
2003-09-17 19:19 ` Pavel Machek
2003-09-18 11:39 ` Rogier Wolff
2003-09-18 12:13 ` Rogier Wolff
2003-09-16 15:22 ` Timothy Miller
2003-09-16 15:29 ` Martin J. Bligh