* Poor performance during disk writes
@ 2001-12-18  0:53 jlm
  2001-12-18  1:09 ` Reid Hekman
  2001-12-18 18:46 ` Andre Hedrick
  0 siblings, 2 replies; 20+ messages in thread
From: jlm @ 2001-12-18  0:53 UTC (permalink / raw)
  To: linux-kernel

I have been witnessing what I believe to be poor performance from my
computer ever since I have moved into the 2.4.x kernel versions. Combing
through the unofficial archives of this mailing list reveals some others
having similar problems, but I haven't seen any real resolution, or
maybe their problems are completely different than mine.

The problem simply is that whenever the computer does a big disk write,
everything else is put on hold. Maybe this isn't a problem, but just the
way it was written. I have tested this on a 2.2.x kernel and it also
does it, but to a much lesser extent, to the point that I noticed the
performance loss in the upgrade to a 2.4.x kernel and decided to
investigate further.

But also, I do not have any performance problems with disk reads.
Programs can be loading up all they want and I am able to use my
computer for other things during that time. It just seems to me (maybe
I'm wrong) that the computer should be able to send small bits of data
to the disk for writing during the off cycles and not affect the rest of
the system (which is what I imagine it is doing for reads).

To test, I've got a 142Meg file. I copy it around, making sure to copy
from one disk to another. Of course the copy goes fine, because it does
a cache (as I've been reading here), but eventually it needs to write
out to disk (or when I do a sync) and here is where the computer hangs
for a bit. If an mp3 is playing, it halts for 5 seconds at a time, mouse
movement on the screen is VERY jerky, Gkrellm will stop updating for
seconds and even just in console I can't type in stuff for a bit.

I've been using hdparm to try and tweak hard disk access, but I'm not so
sure this is the problem, and it's making me more confused about the
entire situation. hdparm doesn't allow me to set using_dma, which it
seems ought to be a necessity for getting a decent speed out of your
hard drives (not that speed is the problem here), but despite that I
still get a 51MB/s cache read speed in testing. Confusing: is the hard
drive using (u)dma or not? Also, unmaskirq makes things a bit slower.

So, the questions: Is there a way for me to stop this, some configure
option? Is it a bug/performance issue that needs to be addressed in the
kernel? Should I just go back to the 2.2.x kernel series and shut up
already?

I'm running 3 hard drives (30G Maxtor, 20G Seagate, and 2.1G Quantum
Fireball) on an AMD k6-2 3dnow with a Gigabyte GA-5AX MOBO and the ALI
Aladin V chipset.

Thanks for your time and let me know if you need any more info/ output
from dmesg or something.

--
MACINTOSH = Machine Always Crashes If Not The Operating System Hangs
"Life would be so much easier if we could just look at the source code."
	- Dave Olson

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Poor performance during disk writes
  2001-12-18  0:53 Poor performance during disk writes jlm
@ 2001-12-18  1:09 ` Reid Hekman
  2001-12-18  1:36   ` jlm
  2001-12-18 18:46 ` Andre Hedrick
  1 sibling, 1 reply; 20+ messages in thread
From: Reid Hekman @ 2001-12-18  1:09 UTC (permalink / raw)
  To: jlm; +Cc: linux-kernel

> So, the questions: Is there a way for me to stop this, some configure
> option? Is it a bug/performance issue that needs to be addressed in the
> kernel? Should I just go back to the 2.2.x kernel series and shut up
> already?
>
> I'm running 3 hard drives (30G Maxtor, 20G Seagate, and 2.1G Quantum
> Fireball) on an AMD k6-2 3dnow with a Gigabyte GA-5AX MOBO and the ALI
> Aladin V chipset.
>
> Thanks for your time and let me know if you need any more info/ output
> from dmesg or something.

Specific kernel version, df, & hdparm output would all be helpful.

> - Dave Olson

Regards,
Reid

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Poor performance during disk writes
  2001-12-18  1:09 ` Reid Hekman
@ 2001-12-18  1:36   ` jlm
  2001-12-18  2:01     ` Reid Hekman
  0 siblings, 1 reply; 20+ messages in thread
From: jlm @ 2001-12-18  1:36 UTC (permalink / raw)
  To: linux-kernel

On Mon, 2001-12-17 at 20:09, Reid Hekman wrote:
> Specific kernel version, df, & hdparm output would all be helpful.

/usr 24> uname -a
Linux PC2 2.4.16 #1 Sun Dec 2 15:26:09 EST 2001 i586 unknown
/usr 25> df .
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hdb1              6047724   3472788   2267728  61% /usr
/usr 26> hdparm -I /dev/hdb

/dev/hdb:

non-removable ATA device, with non-removable media
    Model Number:       ST320413A
    Serial Number:      6ED2305M
    Firmware Revision:  3.39
Standards:
    Supported: 1 2 3 4 5
    Likely used: 5
Configuration:
    Logical         max     current
    cylinders       16383   16383
    heads           16      16
    sectors/track   63      63
    bytes/track:    0       (obsolete)
    bytes/sector:   0       (obsolete)
    current sector capacity: 16514064
    LBA user addressable sectors = 39102336
Capabilities:
    LBA, IORDY(can be disabled)
    Buffer size: 512.0kB    Queue depth: 1
    Standby timer values: spec'd by standard
    r/w multiple sector transfer: Max = 16  Current = 16
    DMA: mdma0 mdma1 *mdma2 udma0 udma1 udma2 udma3 udma4 udma5
         Cycle time: min=120ns recommended=120ns
    PIO: pio0 pio1 pio2 pio3 pio4
         Cycle time: no flow control=240ns  IORDY flow control=120ns
Commands/features:
    Enabled Supported:
       *    READ BUFFER cmd
       *    WRITE BUFFER cmd
       *    Host Protected Area feature set
       *    look-ahead
       *    write cache
       *    Power Management feature set
            Security Mode feature set
            SMART feature set
            SET MAX security extension
       *    DOWNLOAD MICROCODE cmd
Security:
    Master password revision code = 65534
            supported
    not     enabled
    not     locked
    not     frozen
    not     expired: security count
    not     supported: enhanced erase
HW reset results:
    CBLID- above Vih
    Device num = 1
Checksum: correct

My hdb hard drive is where I found the problem originally. Also, I'm
running the ext2 filesystem.
-- MACINTOSH = Machine Always Crashes If Not The Operating System Hangs "Life would be so much easier if we could just look at the source code." - Dave Olson ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Poor performance during disk writes
  2001-12-18  1:36   ` jlm
@ 2001-12-18  2:01     ` Reid Hekman
  0 siblings, 0 replies; 20+ messages in thread
From: Reid Hekman @ 2001-12-18  2:01 UTC (permalink / raw)
  To: jlm; +Cc: linux-kernel

On Mon, 2001-12-17 at 19:36, jlm wrote:
> On Mon, 2001-12-17 at 20:09, Reid Hekman wrote:
>
> > Specific kernel version, df, & hdparm output would all be helpful.
> /usr 24> uname -a
> Linux PC2 2.4.16 #1 Sun Dec 2 15:26:09 EST 2001 i586 unknown

Is PCI IDE support for your chipset compiled in? PCI DMA by default?

> non-removable ATA device, with non-removable media
>     Model Number:       ST320413A
>     Serial Number:      6ED2305M
>     Firmware Revision:  3.39
[...]
> Capabilities:
>     LBA, IORDY(can be disabled)
>     Buffer size: 512.0kB    Queue depth: 1
>     Standby timer values: spec'd by standard
>     r/w multiple sector transfer: Max = 16  Current = 16
>     DMA: mdma0 mdma1 *mdma2 udma0 udma1 udma2 udma3 udma4 udma5
>          Cycle time: min=120ns recommended=120ns

Can you set udma on the drive instead?

> - Dave Olson

Regards,
Reid

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Poor performance during disk writes
  2001-12-18  0:53 Poor performance during disk writes jlm
  2001-12-18  1:09 ` Reid Hekman
@ 2001-12-18 18:46 ` Andre Hedrick
  2001-12-18 17:42   ` Gérard Roudier
  1 sibling, 1 reply; 20+ messages in thread
From: Andre Hedrick @ 2001-12-18 18:46 UTC (permalink / raw)
  To: jlm; +Cc: linux-kernel

File './Bonnie.2276', size: 1073741824, volumes: 1
Writing with putc()...  done:  72692 kB/s  83.7 %CPU
Rewriting...            done:  25355 kB/s  12.0 %CPU
Writing intelligently...done: 103022 kB/s  40.5 %CPU
Reading with getc()...  done:  37188 kB/s  67.5 %CPU
Reading intelligently...done:  40809 kB/s  11.4 %CPU
Seeker 2...Seeker 1...Seeker 3...start 'em...done...done...done...
              ---Sequential Output (nosync)--- ---Sequential Input-- --Rnd Seek-
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --04k (03)-
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU   /sec %CPU
       1*1024 72692 83.7 103022 40.5 25355 12.0 37188 67.5 40809 11.4 382.1  2.4

Maybe this is the kind of performance you want out of your ATA subsystem.
Maybe if I could get a patch into the kernels we could all have stable
and fast IO.

Regards,

Andre Hedrick
CEO/President, LAD Storage Consulting Group
Linux ATA Development
Linux Disk Certification Project

On 17 Dec 2001, jlm wrote:

> I have been witnessing what I believe to be poor performance from my
> computer ever since I have moved into the 2.4.x kernel versions. Combing
> through the unofficial archives of this mailing list reveals some others
> having similar problems, but I haven't seen any real resolution, or
> maybe their problems are completely different than mine.
>
> The problem simply is that whenever the computer does a big disk write,
> everything else is put on hold. Maybe this isn't a problem, but just the
> way it was written.
> I have tested this on a 2.2.x kernel and it also
> does it, but to a much lesser extent, to the point that I noticed the
> performance loss in the upgrade to a 2.4.x kernel and decided to
> investigate further.
>
> But also, I do not have any performance problems with disk reads.
> Programs can be loading up all they want and I am able to use my
> computer for other things during that time. It just seems to me (maybe
> I'm wrong) that the computer should be able to send small bits of data
> to the disk for writing during the off cycles and not affect the rest of
> the system (which is what I imagine it is doing for reads).
>
> To test, I've got a 142Meg file. I copy it around, making sure to copy
> from one disk to another. Of course the copy goes fine, because it does
> a cache (as I've been reading here), but eventually it needs to write
> out to disk (or when I do a sync) and here is where the computer hangs
> for a bit. If an mp3 is playing, it halts for 5 seconds at a time, mouse
> movement on the screen is VERY jerky, Gkrellm will stop updating for
> seconds and even just in console I can't type in stuff for a bit.
>
> I've been using hdparm to try and tweak hard disk access, but I'm not so
> sure this is the problem, and it's making me more confused about the
> entire situation. hdparm doesn't allow me to set using_dma, which it
> seems ought to be a necessity for getting a decent speed out of your
> hard drives (not that speed is the problem here), but despite that I
> still get a 51MB/s cache read speed in testing. Confusing: is the hard
> drive using (u)dma or not? Also, unmaskirq makes things a bit slower.
>
> So, the questions: Is there a way for me to stop this, some configure
> option? Is it a bug/performance issue that needs to be addressed in the
> kernel? Should I just go back to the 2.2.x kernel series and shut up
> already?
>
> I'm running 3 hard drives (30G Maxtor, 20G Seagate, and 2.1G Quantum
> Fireball) on an AMD k6-2 3dnow with a Gigabyte GA-5AX MOBO and the ALI
> Aladin V chipset.
>
> Thanks for your time and let me know if you need any more info/ output
> from dmesg or something.
>
> --
> MACINTOSH = Machine Always Crashes If Not The Operating System Hangs
> "Life would be so much easier if we could just look at the source code."
> - Dave Olson

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Poor performance during disk writes
  2001-12-18 18:46 ` Andre Hedrick
@ 2001-12-18 17:42   ` Gérard Roudier
  2001-12-18 20:34     ` Andre Hedrick
  2001-12-21 16:29     ` Poor performance during disk writes Troy Benjegerdes
  0 siblings, 2 replies; 20+ messages in thread
From: Gérard Roudier @ 2001-12-18 17:42 UTC (permalink / raw)
  To: Andre Hedrick; +Cc: jlm, linux-kernel

On Tue, 18 Dec 2001, Andre Hedrick wrote:

> File './Bonnie.2276', size: 1073741824, volumes: 1
> Writing with putc()...  done:  72692 kB/s  83.7 %CPU
> Rewriting...            done:  25355 kB/s  12.0 %CPU
> Writing intelligently...done: 103022 kB/s  40.5 %CPU
> Reading with getc()...  done:  37188 kB/s  67.5 %CPU
> Reading intelligently...done:  40809 kB/s  11.4 %CPU
> Seeker 2...Seeker 1...Seeker 3...start 'em...done...done...done...
>               ---Sequential Output (nosync)--- ---Sequential Input-- --Rnd Seek-
>               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --04k (03)-
> Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU   /sec %CPU
>        1*1024 72692 83.7 103022 40.5 25355 12.0 37188 67.5 40809 11.4 382.1  2.4
>
> Maybe this is the kind of performance you want out of your ATA subsystem.
> Maybe if I could get a patch into the kernels we could all have stable
> and fast IO.

I see lots of waste rather than performance here. Bonnie says
that your subsystem can sustain 103 MB/s write but only 41 MB/s read. This
looks like about 60% of throughput wasted for read.

Note that if you intend to use it only for write-only applications,
performance is not that bad, even if just dropping the data on the floor
would give you infinite throughput without any difference in
functionality. :-)

  Gérard Roudier
  Not CEO, not President of anything.

> Regards,
>
>
> Andre Hedrick
> CEO/President, LAD Storage Consulting Group
> Linux ATA Development
> Linux Disk Certification Project

^ permalink raw reply [flat|nested] 20+ messages in thread
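
The "about 60%" figure in the message above can be reproduced from the two block-mode numbers in the Bonnie table. The following is only a sketch of that arithmetic; the function name `read_shortfall` is made up for illustration and is not part of Bonnie.

```python
# Block-mode throughput figures copied from the Bonnie run quoted above (kB/s).
BLOCK_WRITE_KBPS = 103022
BLOCK_READ_KBPS = 40809


def read_shortfall(write_kbps: int, read_kbps: int) -> float:
    """Return the fraction by which read throughput falls short of write throughput."""
    return (write_kbps - read_kbps) / write_kbps


shortfall = read_shortfall(BLOCK_WRITE_KBPS, BLOCK_READ_KBPS)
print(f"block read is {shortfall:.1%} below block write")
```

The comparison treats the write figure as the ceiling the hardware path can sustain, which is the assumption behind the "wasted" wording; the last message in this thread questions whether that write figure is real.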
* Re: Poor performance during disk writes
  2001-12-18 17:42   ` Gérard Roudier
@ 2001-12-18 20:34     ` Andre Hedrick
  2001-12-18 19:09       ` Gérard Roudier
  2001-12-21 16:29     ` Poor performance during disk writes Troy Benjegerdes
  1 sibling, 1 reply; 20+ messages in thread
From: Andre Hedrick @ 2001-12-18 20:34 UTC (permalink / raw)
  To: Gérard Roudier; +Cc: jlm, linux-kernel

On Tue, 18 Dec 2001, Gérard Roudier wrote:

>
> On Tue, 18 Dec 2001, Andre Hedrick wrote:
>
> > File './Bonnie.2276', size: 1073741824, volumes: 1
> > Writing with putc()...  done:  72692 kB/s  83.7 %CPU
> > Rewriting...            done:  25355 kB/s  12.0 %CPU
> > Writing intelligently...done: 103022 kB/s  40.5 %CPU
> > Reading with getc()...  done:  37188 kB/s  67.5 %CPU
> > Reading intelligently...done:  40809 kB/s  11.4 %CPU
> > Seeker 2...Seeker 1...Seeker 3...start 'em...done...done...done...
> >               ---Sequential Output (nosync)--- ---Sequential Input-- --Rnd Seek-
> >               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --04k (03)-
> > Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU   /sec %CPU
> >        1*1024 72692 83.7 103022 40.5 25355 12.0 37188 67.5 40809 11.4 382.1  2.4
> >
> > Maybe this is the kind of performance you want out of your ATA subsystem.
> > Maybe if I could get a patch into the kernels we could all have stable
> > and fast IO.
>
> I see lots of waste rather than performance here. Bonnie says
> that your subsystem can sustain 103 MB/s write but only 41 MB/s read. This
> looks like about 60% of throughput wasted for read.
>
> Note that if you intend to use it only for write-only applications,
> performance is not that bad, even if just dropping the data on the floor
> would give you infinite throughput without any difference in
> functionality. :-)

Well, since somebody paid/is paying me to make write performance go through
the roof -- that is what I did. Now if you look closely you could see that in
writing we are doing a boat load more work than reading.
If somebody wants me to throttle the reads more, then they know how to
get it done.

Regards,

Andre Hedrick
Linux Disk Certification Project                Linux ATA Development

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Poor performance during disk writes
  2001-12-18 20:34     ` Andre Hedrick
@ 2001-12-18 19:09       ` Gérard Roudier
  2001-12-19 23:26         ` jlm
  0 siblings, 1 reply; 20+ messages in thread
From: Gérard Roudier @ 2001-12-18 19:09 UTC (permalink / raw)
  To: Andre Hedrick; +Cc: jlm, linux-kernel

On Tue, 18 Dec 2001, Andre Hedrick wrote:

> On Tue, 18 Dec 2001, Gérard Roudier wrote:
>
> >
> > On Tue, 18 Dec 2001, Andre Hedrick wrote:
> >
> > > File './Bonnie.2276', size: 1073741824, volumes: 1
> > > Writing with putc()...  done:  72692 kB/s  83.7 %CPU
> > > Rewriting...            done:  25355 kB/s  12.0 %CPU
> > > Writing intelligently...done: 103022 kB/s  40.5 %CPU
> > > Reading with getc()...  done:  37188 kB/s  67.5 %CPU
> > > Reading intelligently...done:  40809 kB/s  11.4 %CPU
> > > Seeker 2...Seeker 1...Seeker 3...start 'em...done...done...done...
> > >               ---Sequential Output (nosync)--- ---Sequential Input-- --Rnd Seek-
> > >               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --04k (03)-
> > > Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU   /sec %CPU
> > >        1*1024 72692 83.7 103022 40.5 25355 12.0 37188 67.5 40809 11.4 382.1  2.4
> > >
> > > Maybe this is the kind of performance you want out of your ATA subsystem.
> > > Maybe if I could get a patch into the kernels we could all have stable
> > > and fast IO.
> >
> > I see lots of waste rather than performance here. Bonnie says
> > that your subsystem can sustain 103 MB/s write but only 41 MB/s read. This
> > looks like about 60% of throughput wasted for read.
> >
> > Note that if you intend to use it only for write-only applications,
> > performance is not that bad, even if just dropping the data on the floor
> > would give you infinite throughput without any difference in
> > functionality. :-)
>
> Well, since somebody paid/is paying me to make write performance go through
> the roof -- that is what I did. Now if you look closely you could see that in
> writing we are doing a boat load more work than reading.
> If somebody wants
> me to throttle the reads more, then they know how to get it done.

I am not the one who will pay you for that, as you can guess. :-)

I was just curious about the technical reasons, if any, for so large a
difference. The CPU and the memory subsystem are certainly not the
issue. But I do not want to prevent you from earning from this kind of
improvement. Hence, let me go back to free scsi.

  Gérard.

> Regards,
>
> Andre Hedrick
> Linux Disk Certification Project                Linux ATA Development

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Poor performance during disk writes
  2001-12-18 19:09       ` Gérard Roudier
@ 2001-12-19 23:26         ` jlm
  2001-12-20 10:49           ` Helge Hafting
  0 siblings, 1 reply; 20+ messages in thread
From: jlm @ 2001-12-19 23:26 UTC (permalink / raw)
  To: linux-kernel

On Tue, 2001-12-18 at 14:09, Gérard Roudier wrote:
>
> On Tue, 18 Dec 2001, Andre Hedrick wrote:
>
> > On Tue, 18 Dec 2001, Gérard Roudier wrote:
> >
> > >
> > > On Tue, 18 Dec 2001, Andre Hedrick wrote:
> > >
> > > > File './Bonnie.2276', size: 1073741824, volumes: 1
> > > > Writing with putc()...  done:  72692 kB/s  83.7 %CPU
> > > > Rewriting...            done:  25355 kB/s  12.0 %CPU
> > > > Writing intelligently...done: 103022 kB/s  40.5 %CPU
> > > > Reading with getc()...  done:  37188 kB/s  67.5 %CPU
> > > > Reading intelligently...done:  40809 kB/s  11.4 %CPU
> > > > Seeker 2...Seeker 1...Seeker 3...start 'em...done...done...done...
> > > >               ---Sequential Output (nosync)--- ---Sequential Input-- --Rnd Seek-
> > > >               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --04k (03)-
> > > > Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU   /sec %CPU
> > > >        1*1024 72692 83.7 103022 40.5 25355 12.0 37188 67.5 40809 11.4 382.1  2.4
> > > >
> > > > Maybe this is the kind of performance you want out of your ATA subsystem.
> > > > Maybe if I could get a patch into the kernels we could all have stable
> > > > and fast IO.

I think people might be missing the issue that I'm having, here. Let me
see if I can clarify. I'm not too concerned about write speed. I don't
care too much if the hard drive can only write one byte per second. The
problem is that when the kernel decides to write out to the disk, it is
pre-empting everything else. All output to the user in X, the sound
card, and also text typing in the console is put "on the back burner"
while the disk is written to.

It seems to me that smaller chunks of data can be written to the disk
without disrupting my use of the computer (which is the case with
untarring a small file, for instance), so if the kernel has got a lot to
write to disk, just do that as a bunch of smaller writes and we should
be fine.

So I guess I don't really care what mode the hard drive is operating in
(udma, mdma, dma or plain ide), I just don't want to have to go get a
cup of coffee while the hard drive saves some data. Is there a "don't
pre-empt the rest of the system" switch for the eide drives? Is there
something fundamental/unique going on here that I'm missing?

Thanks for listening.

> > >
> > > I see lots of waste rather than performance here. Bonnie says
> > > that your subsystem can sustain 103 MB/s write but only 41 MB/s read. This
> > > looks like about 60% of throughput wasted for read.
> > >
> > > Note that if you intend to use it only for write-only applications,
> > > performance is not that bad, even if just dropping the data on the floor
> > > would give you infinite throughput without any difference in
> > > functionality. :-)
> >
> > Well, since somebody paid/is paying me to make write performance go through
> > the roof -- that is what I did. Now if you look closely you could see that in
> > writing we are doing a boat load more work than reading. If somebody wants
> > me to throttle the reads more, then they know how to get it done.
>
> I am not the one who will pay you for that, as you can guess. :-)
>
> I was just curious about the technical reasons, if any, for so large a
> difference. The CPU and the memory subsystem are certainly not the
> issue. But I do not want to prevent you from earning from this kind of
> improvement. Hence, let me go back to free scsi.

--
MACINTOSH = Machine Always Crashes If Not The Operating System Hangs
"Life would be so much easier if we could just look at the source code."
	- Dave Olson

^ permalink raw reply [flat|nested] 20+ messages in thread
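
The idea in the message above, flushing a large write as a series of smaller ones, can be approximated from user space. This is a minimal sketch under stated assumptions, not something the 2.4 kernel does and not a fix proposed in the thread: the function name `copy_in_bounded_bursts` and the `chunk` and `sync_every` parameters are hypothetical, and whether this would have reduced the stalls on the hardware described is untested.

```python
import os
import tempfile


def copy_in_bounded_bursts(src: str, dst: str,
                           chunk: int = 1 << 20,
                           sync_every: int = 8 << 20) -> int:
    """Copy src to dst, forcing data to disk every sync_every bytes.

    Syncing periodically keeps the amount of unwritten (dirty) data small,
    so no single flush has to push the whole file out at once.
    Returns the number of bytes copied.
    """
    copied = 0
    unsynced = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            block = fin.read(chunk)
            if not block:
                break
            fout.write(block)
            copied += len(block)
            unsynced += len(block)
            if unsynced >= sync_every:
                fout.flush()              # drain Python's own buffer
                os.fsync(fout.fileno())   # ask the kernel to write it out now
                unsynced = 0
        fout.flush()
        os.fsync(fout.fileno())           # final sync for the tail
    return copied


if __name__ == "__main__":
    # Small self-cleaning demonstration with a 4 MiB scratch file.
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "source.bin")
        target = os.path.join(tmp, "target.bin")
        with open(source, "wb") as f:
            f.write(os.urandom(4 << 20))
        print(copy_in_bounded_bursts(source, target, sync_every=1 << 20), "bytes copied")
```

The trade-off is lower total throughput in exchange for shorter individual flushes, which matches the stated priority of responsiveness over write speed.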
* Re: Poor performance during disk writes
  2001-12-19 23:26         ` jlm
@ 2001-12-20 10:49           ` Helge Hafting
  2001-12-20 11:16             ` Oops in 2.4.14-pre6 and 2.4.14-pre9aa1 Andre Margis
  0 siblings, 1 reply; 20+ messages in thread
From: Helge Hafting @ 2001-12-20 10:49 UTC (permalink / raw)
  To: jlm, linux-kernel

jlm wrote:

> I think people might be missing the issue that I'm having, here. Let me
> see if I can clarify. I'm not too concerned about write speed. I don't
> care too much if the hard drive can only write one byte per second. The
> problem is that when the kernel decides to write out to the disk, it is
> pre-empting everything else. All output to the user in X, the sound
> card, and also text typing in the console is put "on the back burner"
> while the disk is written to.

There may be a problem here, and maybe not:
All of the actions above _may_ require disk access. The shell you
type into could be swapped out, for example. A slow disk will be a
problem in that case: swap-in won't happen until the disk head
seeks to the relevant position, and that may be delayed by the write,
even if the cpu is capable of doing other work while IO is going on.

> It seems to me that smaller chunks of data can be written to the disk
> without disrupting my use of the computer (which is the case with
> untarring a small file, for instance), so if the kernel has got a lot to
> write to disk, just do that as a bunch of smaller writes and we should
> be fine.
>
> So I guess I don't really care what mode the hard drive is operating in
> (udma, mdma, dma or plain ide), I just don't want to have to go get a
> cup of coffee while the hard drive saves some data. Is there a "don't

Devices generally get the cpu before anything else. A good disk
system doesn't need much cpu. Running IDE in PIO mode requires a lot
of cpu though. Using any of the DMA modes avoids that.

> pre-empt the rest of the system" switch for the eide drives? Is there
> something fundamental/unique going on here that I'm missing?

dma, udma, etc. is that switch. It lets the cpu do other work
(such as redrawing X) while the disk is busy. Plain ide is
what you don't want.

The problem of waiting for other files or swapping while a really big
write is going on is different. Get more drives, so the big writes go
to one drive while you get stuff swapped in (or other file access) on
other drive(s). The kernel is capable of getting fast response from one
drive while another is completely bogged down with enormous writes.

Helge Hafting

^ permalink raw reply [flat|nested] 20+ messages in thread
* Oops in 2.4.14-pre6 and 2.4.14-pre9aa1
  2001-12-20 10:49           ` Helge Hafting
@ 2001-12-20 11:16             ` Andre Margis
  0 siblings, 0 replies; 20+ messages in thread
From: Andre Margis @ 2001-12-20 11:16 UTC (permalink / raw)
  To: linux-kernel

I'm running an application server on a DELL POWEREDGE 8450 with 4xP-III
700 MHz, 4GB memory.

After 40 days running kernel 2.4.14-pre6, my system reported the following
error in /var/adm/messages:

Dec 13 13:52:42 front01 kernel: invalid operand: 0000
Dec 13 13:52:42 front01 kernel: CPU: 3
Dec 13 13:52:42 front01 kernel: EIP: 0010:[<c012e764>] Not tainted
Dec 13 13:52:42 front01 kernel: EFLAGS: 00010282
Dec 13 13:52:42 front01 kernel: eax: 00000880 ebx: c855e280 ecx: c855e280 edx: 00000000
Dec 13 13:52:42 front01 kernel: esi: fe000ff4 edi: 00000000 ebp: 00000ff1 esp: ce439eb0
Dec 13 13:52:42 front01 kernel: ds: 0018 es: 0018 ss: 0018
Dec 13 13:52:42 front01 kernel: Process ps.bin (pid: 29154, stackpage=ce439000)
Dec 13 13:52:42 front01 kernel: Stack: c855e280 fe000ff4 e58f3003 00000ff1 f40252a3 c01f4bd8 c021bb70 c01504aa
Dec 13 13:52:42 front01 kernel: c855e280 c012eebc c011daf8 00000003 00000003 bffffff1 eef09920 c011dbb6
Dec 13 13:52:42 front01 kernel: ca07b400 eef09920 bffffff1 e58f3000 00000003 00000000 ca07b400 ca07b41c
Dec 13 13:52:42 front01 kernel: Call Trace: [<c01504aa>] [<c012eebc>] [<c011daf8>] [<c011dbb6>] [<c011dc3a>]
Dec 13 13:52:42 front01 kernel: [<c014f66a>] [<c014f8db>] [<c01343d7>] [<c0106f73>]
Dec 13 13:52:42 front01 kernel:
Dec 13 13:52:42 front01 kernel: Code: 0f 0b ba 00 e0 ff ff 80 63 18 eb 21 e2 f6 42 05 20 0f 85 55

I changed the kernel to 2.4.15-pre9aa1, and today, one week later, the same
error occurred with this message:

Dec 20 01:21:58 front01 kernel: invalid operand: 0000
Dec 20 01:21:58 front01 kernel: CPU: 3
Dec 20 01:21:58 front01 kernel: EIP: 0010:[<c013153b>] Not tainted
Dec 20 01:21:58 front01 kernel: EFLAGS: 00010202
Dec 20 01:21:58 front01 kernel: eax: 00000840 ebx: c5ba3ec0 ecx: c5ba3ec0 edx: 00000000
Dec 20 01:21:58 front01 kernel: esi: fe000f81 edi: 00000000 ebp: 00000f7b esp: c2cebeb0
Dec 20 01:21:58 front01 kernel: ds: 0018 es: 0018 ss: 0018
Dec 20 01:21:58 front01 kernel: Process ps.bin (pid: 19556, stackpage=c2ceb000)
Dec 20 01:21:58 front01 kernel: Stack: c5ba3ec0 fe000f81 cd7b1006 00000f7b c0238fb0 c0154d8a d51db340 d51db340
Dec 20 01:21:58 front01 kernel: c5ba3ec0 c0131d98 c011ffb8 00000006 00000006 bfffff7b c8b8e740 c0120076
Dec 20 01:21:58 front01 kernel: c294b820 c8b8e740 bfffff7b cd7b1000 00000006 00000000 c294b83c c294b820
Dec 20 01:21:58 front01 kernel: Call Trace: [<c0154d8a>] [<c0131d98>] [<c011ffb8>] [<c0120076>] [<c0120119>]
Dec 20 01:21:58 front01 kernel: [<c0153f3a>] [<c01541ab>] [<c0138117>] [<c0106f73>]
Dec 20 01:21:58 front01 kernel:
Dec 20 01:21:58 front01 kernel: Code: 0f 0b 8b 43 18 a8 80 74 02 0f 0b b9 00 e0 ff ff 80 63 18 eb

ksymoops:

ksymoops 2.4.1 on i686 2.4.15-pre9. Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.15-pre9/ (default)
     -m /usr/src/linux/System.map (default)

Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.

Error (expand_objects): cannot stat(/lib/reiserfs.o) for reiserfs
Error (expand_objects): cannot stat(/lib/sym53c8xx.o) for sym53c8xx
Error (expand_objects): cannot stat(/lib/qla2x00.o) for qla2x00
Error (expand_objects): cannot stat(/lib/megaraid.o) for megaraid
Error (expand_objects): cannot stat(/lib/sd_mod.o) for sd_mod
Error (expand_objects): cannot stat(/lib/scsi_mod.o) for scsi_mod
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/net/ipv4/netfilter/netfilter.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/net/ipv6/netfilter/netfilter.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/drivers/net/fc/fc.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/drivers/net/wan/wan.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/drivers/net/appletalk/appletalk.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/drivers/net/tokenring/tr.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/drivers/net/pcmcia/pcmcia_net.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/drivers/net/wireless/wireless_net.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/drivers/misc/misc.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/drivers/cdrom/driver.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/drivers/media/radio/radio.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/drivers/media/video/video.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/drivers/media/media.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/drivers/sound/sounddrivers.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/drivers/parport/driver.o
Warning (read_object): no symbols in /lib/modules/2.4.15-pre9/build/drivers/hotplug/vmlinux-obj.o
Warning (compare_maps): mismatch on symbol partition_name , ksyms_base says c01a3160, System.map says c0158b20. Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol vg , lvm-mod says c89d8b20, /lib/modules/2.4.15-pre9/kernel/drivers/md/lvm-mod.o says c89d8780. Ignoring /lib/modules/2.4.15-pre9/kernel/drivers/md/lvm-mod.o entry
Warning (map_ksym_to_module): cannot match loaded module reiserfs to a unique module object. Trace may not be reliable.
Warning (map_ksym_to_module): cannot match loaded module sym53c8xx to a unique module object. Trace may not be reliable.
Warning (map_ksym_to_module): cannot match loaded module qla2x00 to a unique module object. Trace may not be reliable.
Warning (map_ksym_to_module): cannot match loaded module megaraid to a unique module object. Trace may not be reliable.
Warning (map_ksym_to_module): cannot match loaded module scsi_mod to a unique module object. Trace may not be reliable.
Dec 20 01:21:58 front01 kernel: invalid operand: 0000
Dec 20 01:21:58 front01 kernel: CPU: 3
Dec 20 01:21:58 front01 kernel: EIP: 0010:[<c013153b>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Dec 20 01:21:58 front01 kernel: EFLAGS: 00010202
Dec 20 01:21:58 front01 kernel: eax: 00000840 ebx: c5ba3ec0 ecx: c5ba3ec0 edx: 00000000
Dec 20 01:21:58 front01 kernel: esi: fe000f81 edi: 00000000 ebp: 00000f7b esp: c2cebeb0
Dec 20 01:21:58 front01 kernel: ds: 0018 es: 0018 ss: 0018
Dec 20 01:21:58 front01 kernel: Process ps.bin (pid: 19556, stackpage=c2ceb000)
Dec 20 01:21:58 front01 kernel: Stack: c5ba3ec0 fe000f81 cd7b1006 00000f7b c0238fb0 c0154d8a d51db340 d51db340
Dec 20 01:21:58 front01 kernel: c5ba3ec0 c0131d98 c011ffb8 00000006 00000006 bfffff7b c8b8e740 c0120076
Dec 20 01:21:58 front01 kernel: c294b820 c8b8e740 bfffff7b cd7b1000 00000006 00000000 c294b83c c294b820
Dec 20 01:21:58 front01 kernel: Call Trace: [<c0154d8a>] [<c0131d98>] [<c011ffb8>] [<c0120076>] [<c0120119>]
Warning (Oops_read): Code line not seen, dumping what data is available

>>EIP; c013153b <__free_pages_ok+4b/238>   <=====
Trace; c0154d8a <proc_base_lookup+22a/23c>
Trace; c0131d98 <__free_pages+1c/20>
Trace; c011ffb8 <access_one_page+244/2a0>
Trace; c0120076 <access_mm+62/7c>
Trace; c0120119 <access_process_vm+89/c4>
Trace; c0153f3a <proc_pid_cmdline+62/e8>
Trace; c01541ab <proc_info_read+53/110>
Trace; c0138117 <sys_read+8f/c4>
Trace; c0106f73 <system_call+33/38>

26 warnings and 6 errors issued. Results may not be reliable.

After each oops the system stays up, but if you run a ps command that
process freezes. It's possible to reboot the machine using reboot -f.

Any help?

Thanks in advance,

Andre Margis

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Poor performance during disk writes
2001-12-18 17:42 ` Gérard Roudier
2001-12-18 20:34 ` Andre Hedrick
@ 2001-12-21 16:29 ` Troy Benjegerdes
1 sibling, 0 replies; 20+ messages in thread
From: Troy Benjegerdes @ 2001-12-21 16:29 UTC (permalink / raw)
To: Gérard Roudier; +Cc: Andre Hedrick, jlm, linux-kernel
On Tue, Dec 18, 2001 at 06:42:49PM +0100, Gérard Roudier wrote:
>
>
> On Tue, 18 Dec 2001, Andre Hedrick wrote:
>
> > File './Bonnie.2276', size: 1073741824, volumes: 1
> > Writing with putc()...    done:  72692 kB/s  83.7 %CPU
> > Rewriting...              done:  25355 kB/s  12.0 %CPU
> > Writing intelligently...  done: 103022 kB/s  40.5 %CPU
> > Reading with getc()...    done:  37188 kB/s  67.5 %CPU
> > Reading intelligently...  done:  40809 kB/s  11.4 %CPU
> > Seeker 2...Seeker 1...Seeker 3...start 'em...done...done...done...
> >          ---Sequential Output (nosync)---  ---Sequential Input--  --Rnd Seek-
> >          -Per Char-  --Block---  -Rewrite--  -Per Char-  --Block---  --04k (03)-
> > Machine MB  K/sec %CPU   K/sec %CPU  K/sec %CPU  K/sec %CPU  K/sec %CPU   /sec %CPU
> >     1*1024  72692 83.7  103022 40.5  25355 12.0  37188 67.5  40809 11.4  382.1  2.4
> >
> > Maybe this is the kind of performance you want out your ATA subsystem.
> > Maybe if I could get a patch in to the kernels we could all have stable
> > and fast IO.
>
> I see a lot of waste rather than performance here. Bonnie says
> that your subsystem can sustain 103 MB/s write but only 41 MB/s read. This
> looks like about 60% of throughput wasted for read.

Uh, well, um, what drive is he writing to?? He could very well have 2 gig
of memory in this box and half the writes were cached. 41MB/s seems
reasonable for most common IDE disks. Of course I know Andre has some
rather 'uncommon' IDE drives :P

Does bonnie actually do any sort of 'sync' operation to ensure data
written is on the disk? Is that 100 MB/sec write real, or just because of
block layer caching?

>
> Note that if you intend to use it only for write-only applications,
> performance is not that bad, even if just dropping the data on the floor
> would give you infinite throughput without any difference in
> functionality. :-)
>
>
> Gérard Roudier
> Not CEO, not President of anything.
>
> > Regards,
> >
> >
> > Andre Hedrick
> > CEO/President, LAD Storage Consulting Group
> > Linux ATA Development
> > Linux Disk Certification Project
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
--
Troy Benjegerdes | master of mispeeling | 'da hozer' | hozer@drgw.net
-----"If this message isn't misspelled, I didn't write it" -- Me -----
"Why do musicians compose symphonies and poets write poems? They do it
because life wouldn't have any meaning for them if they didn't. That's why
I draw cartoons. It's my life." -- Charles Schulz
^ permalink raw reply [flat|nested] 20+ messages in thread
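Troy's question (is the write figure real, or is it the cache?) can be probed with a small C sketch. It times a plain buffered write and, optionally, includes a trailing fsync() in the timed region. Everything here is an assumption for illustration: the function name timed_write, the 64 KiB buffer, and the choice to delete the file afterwards. Nothing in the thread establishes what Bonnie itself does; the quoted header only labels the output columns "(nosync)".

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from Bonnie.
 * Writes `total` bytes to `path` and returns the elapsed wall-clock
 * seconds, or -1.0 on error. When do_fsync is nonzero, fsync() is
 * called before the clock is stopped, so the time includes flushing
 * the file's dirty data to the device. The file is removed afterwards.
 */
double timed_write(const char *path, size_t total, int do_fsync)
{
    static char buf[64 * 1024];
    struct timespec t0, t1;
    size_t done = 0;
    int ok = 1;
    int fd;

    memset(buf, 0xA5, sizeof buf);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    while (ok && done < total) {
        size_t want = total - done < sizeof buf ? total - done : sizeof buf;
        ssize_t n = write(fd, buf, want);

        if (n <= 0)
            ok = 0;             /* treat short-circuit and errors alike */
        else
            done += (size_t)n;
    }
    if (ok && do_fsync && fsync(fd) != 0)
        ok = 0;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    unlink(path);
    if (!ok)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling timed_write(path, size, 0) and timed_write(path, size, 1) with a size well below free RAM and comparing the two times gives a rough idea of how much of a reported write rate is page-cache absorption. It is only a rough probe: it does not control for what is already dirty in the cache.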
* Re: Poor performance during disk writes
@ 2001-12-20 13:27 Dieter Nützel
2001-12-20 14:51 ` safemode
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Dieter Nützel @ 2001-12-20 13:27 UTC (permalink / raw)
To: Helge Hafting; +Cc: jlm, Andre Hedrick, Linux Kernel List
On Thursday, 20.12.2001, 10:49 Helge Hafting wrote:
> jlm wrote:
[-]
> > So I guess I don't really care what mode the hard drive is operating in
> > (udma, mdma, dma or plain ide), I just don't want to have to go get a
> > cup of coffee while the hard drive saves some data. Is there a "don't
>
> Devices generally get the cpu before anything else. A good disk system
> doesn't need much cpu. Running IDE in PIO mode requires a lot
> of cpu though. Using any of the DMA modes avoids that.
Amen..
Sorry, Helge, I'm sure you are right in theory, but try dbench 32 (maybe
bonnie/bonnie++) while playing an MP3/Ogg-Vorbis in parallel...
That's my first test on any "new" kernel version.
Even with a 1 GHz Athlon II, 640 MB, U160 DDYS 18 GB, 10k IBM disk (on an
AHA-2940UW) it stutters like mad. I am running all my kernels _with_ Robert
Love's preempt + lock-break patches and it doesn't solve the problem.
CPU load is (very) low but it does not work like it should.
> > pre-empt the rest of the system" switch for the eide drives? Is there
> > something fundamental/unique going on here that I'm missing?
> dma, udma, etc. is that switch. It lets the cpu do other work (such as
> redrawing X) while the disk is busy. Plain ide is what you don't want.
See above; the whole system shows some bad hiccups.
> The problem of waiting for other files or swapping while a really big
> write is going on is different. Get more drives, so the big writes go
> to one drive while you get stuff swapped in (or other file access)
> on other drive(s). The kernel is capable of getting fast response
> from one drive while another is completely bogged down with
> enormous writes.
Tried this already. Whether I put my test files (MP3/Ogg-Vorbis) in /dev/shm
or on another disk, it does not change anything.
There must be something in the VFS?
-Dieter
--
Dieter Nützel
Graduate Student, Computer Science
University of Hamburg
@home: Dieter.Nuetzel@hamburg.de
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Poor performance during disk writes
2001-12-20 13:27 Dieter Nützel
@ 2001-12-20 14:51 ` safemode
2001-12-20 17:40 ` William Lee Irwin III
[not found] ` <0112201629230E.01835@manta>
2 siblings, 0 replies; 20+ messages in thread
From: safemode @ 2001-12-20 14:51 UTC (permalink / raw)
To: Dieter Nützel; +Cc: Helge Hafting, jlm, Andre Hedrick, Linux Kernel List
On Thu, 2001-12-20 at 08:27, Dieter Nützel wrote:
> On Thursday, 20.12.2001, 10:49 Helge Hafting wrote:
> > jlm wrote:
> [-]
> > > So I guess I don't really care what mode the hard drive is operating in
> > > (udma, mdma, dma or plain ide), I just don't want to have to go get a
> > > cup of coffee while the hard drive saves some data. Is there a "don't
> >
> > Devices generally get the cpu before anything else. A good disk system
> > doesn't need much cpu. Running IDE in PIO mode requires a lot
> > of cpu though. Using any of the DMA modes avoids that.
>
> Amen..
> Sorry, Helge, I'm sure you are right in theory, but try dbench 32 (maybe
> bonnie/bonnie++) while playing an MP3/Ogg-Vorbis in parallel...
> That's my first test on any "new" kernel version.
>
> Even with a 1 GHz Athlon II, 640 MB, U160 DDYS 18 GB, 10k IBM disk (on an
> AHA-2940UW) it stutters like mad. I am running all my kernels _with_ Robert
> Love's preempt + lock-break patches and it doesn't solve the problem.
> CPU load is (very) low but it does not work like it should.

Try it with vanilla 2.4.17-rc1. I just did and I'm getting no stuttering
at all. nice -n 20 dbench 32. Worked quite nicely. Of course I'm using an
ext3 fs, which is more important than your cpu speed or ram. This kind of
discussion has been talked about and argued over many times in the past
here already. Too many factors go into this and in the end, dbench is
_meant_ to preempt everything else. If you want a real test, find a real
program you really use and use it.

> > > pre-empt the rest of the system" switch for the eide drives? Is there
> > > something fundamental/unique going on here that I'm missing?
> > dma, udma, etc. is that switch. It lets the cpu do other work (such as
> > redrawing X) while the disk is busy. Plain ide is what you don't want.
>
> See above; the whole system shows some bad hiccups.
>
> > The problem of waiting for other files or swapping while a really big
> > write is going on is different. Get more drives, so the big writes go
> > to one drive while you get stuff swapped in (or other file access)
> > on other drive(s). The kernel is capable of getting fast response
> > from one drive while another is completely bogged down with
> > enormous writes.
>
> Tried this already. Whether I put my test files (MP3/Ogg-Vorbis) in /dev/shm
> or on another disk, it does not change anything.
>
> There must be something in the VFS?
>
> -Dieter
>
> --
> Dieter Nützel
> Graduate Student, Computer Science
> University of Hamburg
> @home: Dieter.Nuetzel@hamburg.de
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Poor performance during disk writes
2001-12-20 13:27 Dieter Nützel
2001-12-20 14:51 ` safemode
@ 2001-12-20 17:40 ` William Lee Irwin III
2001-12-20 18:19 ` Andrew Morton
[not found] ` <0112201629230E.01835@manta>
2 siblings, 1 reply; 20+ messages in thread
From: William Lee Irwin III @ 2001-12-20 17:40 UTC (permalink / raw)
To: Linux Kernel List
On Thu, Dec 20, 2001 at 02:27:17PM +0100, Dieter Nützel wrote:
> Amen..
> Sorry, Helge sure you are right in theory but try dbench 32 (maybe
> bonnie/bonnie++) and playing an MP3/Ogg-Vorbis in parallel...
> That's my first test on any "new" kernel version.
>
> Even with an 1 GHz Athlon II, 640 MB, U160 DDYS 18 GB, 10k IBM disk (on an
> AHA-2940UW) it stutters like mad. I am running all my kernel _with_ Robert
> Love's preempt + lock-break patches and it doesn't solve the problem.
> CPU load is (very) low but it do not work like it should.

I tried this on my 600MHz Athlon with 768MB of RAM and a U160 DDYS 36GB
10Krpm IBM disk on an Adaptec 39160, and I managed to get it not to stutter
at all. I was also using preempt + lockbreak and a few others.

The crucial patch appeared to be from Andrew Morton and it involved
tuning the elevator to avoid read starvation.

A significantly helpful hardware suggestion regarding the sound card and
drivers came from Linus himself, though. Linus and others pointed out
that applications are able to cause some drivers to generate a large
number of interrupts by using small buffers and unfriendly ioctl's,
especially esd. My workaround was to change out sound hardware and
disable esd. If this is happening to you, /proc/profile should show
handle_IRQ_event() and schedule() very high up. On the other hand, this
shows up as a steady drain on system resources and excessive system
time, not stuttering or skipping.

Andrew, I don't have the URL for that still floating around. Can you
point Dieter to it?

Cheers,
Bill
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Poor performance during disk writes
2001-12-20 17:40 ` William Lee Irwin III
@ 2001-12-20 18:19 ` Andrew Morton
2001-12-20 18:29 ` Dave Jones
2001-12-21 16:50 ` Jens Axboe
0 siblings, 2 replies; 20+ messages in thread
From: Andrew Morton @ 2001-12-20 18:19 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: Linux Kernel List
William Lee Irwin III wrote:
>
> Andrew, I don't have the URL for that still floating around. Can you
> point Dieter to it?

It's here. You need to run

	elvtune -b N /dev/hdXX

where N=0 is "disable", N=1 is minimum read latency, N=6 is
a reasonable setting.

--- linux-2.4.17-pre6/drivers/block/elevator.c	Thu Jul 19 20:59:41 2001
+++ linux-akpm/drivers/block/elevator.c	Sat Dec 8 11:10:36 2001
@@ -74,11 +74,10 @@ inline int bh_rq_in_between(struct buffe
 	return 0;
 }
 
-
 int elevator_linus_merge(request_queue_t *q, struct request **req,
 			 struct list_head * head,
 			 struct buffer_head *bh, int rw,
-			 int max_sectors)
+			 int max_sectors, int max_bomb_segments)
 {
 	struct list_head *entry = &q->queue_head;
 	unsigned int count = bh->b_size >> 9, ret = ELEVATOR_NO_MERGE;
@@ -116,6 +115,56 @@ int elevator_linus_merge(request_queue_t
 		}
 	}
 
+	/*
+	 * If we failed to merge a read anywhere in the request
+	 * queue, we really don't want to place it at the end
+	 * of the list, behind lots of writes. So place it near
+	 * the front.
+	 *
+	 * We don't want to place it in front of _all_ writes: that
+	 * would create lots of seeking, and isn't tunable.
+	 * We try to avoid promoting this read in front of existing
+	 * reads.
+	 *
+	 * max_bomb_sectors becomes the maximum number of write
+	 * requests which we allow to remain in place in front of
+	 * a newly introduced read. We weight things a little bit,
+	 * so large writes are more expensive than small ones, but it's
+	 * requests which count, not sectors.
+	 */
+	if (max_bomb_segments && rw == READ && ret == ELEVATOR_NO_MERGE) {
+		int cur_latency = 0;
+		struct request * const cur_request = *req;
+
+		entry = head->next;
+		while (entry != &q->queue_head) {
+			struct request *__rq;
+
+			if (entry == &q->queue_head)
+				BUG();
+			if (entry == q->queue_head.next &&
+					q->head_active && !q->plugged)
+				BUG();
+			__rq = blkdev_entry_to_request(entry);
+
+			if (__rq == cur_request) {
+				/*
+				 * This is where the old algorithm placed it.
+				 * There's no point pushing it further back,
+				 * so leave it here, in sorted order.
+				 */
+				break;
+			}
+			if (__rq->cmd == WRITE) {
+				cur_latency += 1 + __rq->nr_sectors / 64;
+				if (cur_latency >= max_bomb_segments) {
+					*req = __rq;
+					break;
+				}
+			}
+			entry = entry->next;
+		}
+	}
 	return ret;
 }
 
@@ -144,7 +193,7 @@ void elevator_linus_merge_req(struct req
 int elevator_noop_merge(request_queue_t *q, struct request **req,
 			struct list_head * head,
 			struct buffer_head *bh, int rw,
-			int max_sectors)
+			int max_sectors, int max_bomb_segments)
 {
 	struct list_head *entry;
 	unsigned int count = bh->b_size >> 9;
@@ -188,7 +237,7 @@ int blkelvget_ioctl(elevator_t * elevato
 	output.queue_ID = elevator->queue_ID;
 	output.read_latency = elevator->read_latency;
 	output.write_latency = elevator->write_latency;
-	output.max_bomb_segments = 0;
+	output.max_bomb_segments = elevator->max_bomb_segments;
 
 	if (copy_to_user(arg, &output, sizeof(blkelv_ioctl_arg_t)))
 		return -EFAULT;
@@ -207,9 +256,12 @@ int blkelvset_ioctl(elevator_t * elevato
 		return -EINVAL;
 	if (input.write_latency < 0)
 		return -EINVAL;
+	if (input.max_bomb_segments < 0)
+		return -EINVAL;
 
 	elevator->read_latency = input.read_latency;
 	elevator->write_latency = input.write_latency;
+	elevator->max_bomb_segments = input.max_bomb_segments;
 	return 0;
 }
 
--- linux-2.4.17-pre6/drivers/block/ll_rw_blk.c	Mon Nov 5 21:01:11 2001
+++ linux-akpm/drivers/block/ll_rw_blk.c	Sat Dec 8 11:10:36 2001
@@ -690,7 +690,8 @@ again:
 	} else if (q->head_active && !q->plugged)
 		head = head->next;
 
-	el_ret = elevator->elevator_merge_fn(q, &req, head, bh, rw,max_sectors);
+	el_ret = elevator->elevator_merge_fn(q, &req, head, bh,
+				rw, max_sectors, elevator->max_bomb_segments);
 	switch (el_ret) {
 
 		case ELEVATOR_BACK_MERGE:
--- linux-2.4.17-pre6/include/linux/elevator.h	Thu Feb 15 16:58:34 2001
+++ linux-akpm/include/linux/elevator.h	Sat Dec 8 11:10:36 2001
@@ -5,8 +5,9 @@ typedef void (elevator_fn) (struct reque
 			    struct list_head *,
 			    struct list_head *, int);
 
-typedef int (elevator_merge_fn) (request_queue_t *, struct request **, struct list_head *,
-				 struct buffer_head *, int, int);
+typedef int (elevator_merge_fn)(request_queue_t *, struct request **,
+				struct list_head *, struct buffer_head *bh,
+				int rw, int max_sectors, int max_bomb_segments);
 
 typedef void (elevator_merge_cleanup_fn) (request_queue_t *, struct request *, int);
 
@@ -16,6 +17,7 @@ struct elevator_s
 {
 	int read_latency;
 	int write_latency;
+	int max_bomb_segments;
 
 	elevator_merge_fn *elevator_merge_fn;
 	elevator_merge_cleanup_fn *elevator_merge_cleanup_fn;
@@ -24,13 +26,13 @@ struct elevator_s
 	unsigned int queue_ID;
 };
 
-int elevator_noop_merge(request_queue_t *, struct request **, struct list_head *, struct buffer_head *, int, int);
-void elevator_noop_merge_cleanup(request_queue_t *, struct request *, int);
-void elevator_noop_merge_req(struct request *, struct request *);
-
-int elevator_linus_merge(request_queue_t *, struct request **, struct list_head *, struct buffer_head *, int, int);
-void elevator_linus_merge_cleanup(request_queue_t *, struct request *, int);
-void elevator_linus_merge_req(struct request *, struct request *);
+elevator_merge_fn elevator_noop_merge;
+elevator_merge_cleanup_fn elevator_noop_merge_cleanup;
+elevator_merge_req_fn elevator_noop_merge_req;
+
+elevator_merge_fn elevator_linus_merge;
+elevator_merge_cleanup_fn elevator_linus_merge_cleanup;
+elevator_merge_req_fn elevator_linus_merge_req;
 
 typedef struct blkelv_ioctl_arg_s {
 	int queue_ID;
@@ -54,22 +56,6 @@ extern void elevator_init(elevator_t *,
 #define ELEVATOR_FRONT_MERGE	1
 #define ELEVATOR_BACK_MERGE	2
 
-/*
- * This is used in the elevator algorithm. We don't prioritise reads
- * over writes any more --- although reads are more time-critical than
- * writes, by treating them equally we increase filesystem throughput.
- * This turns out to give better overall performance. -- sct
- */
-#define IN_ORDER(s1,s2)				\
-	((((s1)->rq_dev == (s2)->rq_dev &&	\
-	   (s1)->sector < (s2)->sector)) ||	\
-	 (s1)->rq_dev < (s2)->rq_dev)
-
-#define BHRQ_IN_ORDER(bh, rq)			\
-	((((bh)->b_rdev == (rq)->rq_dev &&	\
-	   (bh)->b_rsector < (rq)->sector)) ||	\
-	 (bh)->b_rdev < (rq)->rq_dev)
-
 static inline int elevator_request_latency(elevator_t * elevator, int rw)
 {
 	int latency;
@@ -85,7 +71,7 @@ static inline int elevator_request_laten
 ((elevator_t) {							\
 	0,				/* read_latency */	\
 	0,				/* write_latency */	\
-								\
+	0,				/* max_bomb_segments */	\
 	elevator_noop_merge,		/* elevator_merge_fn */	\
 	elevator_noop_merge_cleanup,	/* elevator_merge_cleanup_fn */	\
 	elevator_noop_merge_req,	/* elevator_merge_req_fn */	\
@@ -95,7 +81,7 @@ static inline int elevator_request_laten
 ((elevator_t) {							\
 	8192,				/* read passovers */	\
 	16384,				/* write passovers */	\
-								\
+	0,				/* max_bomb_segments */	\
 	elevator_linus_merge,		/* elevator_merge_fn */	\
 	elevator_linus_merge_cleanup,	/* elevator_merge_cleanup_fn */	\
 	elevator_linus_merge_req,	/* elevator_merge_req_fn */	\
^ permalink raw reply [flat|nested] 20+ messages in thread
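The read-promotion walk added to elevator_linus_merge() in the patch above can be tried outside the kernel with a user-space sketch. It models the queue as a plain array and keeps only the cost rule from the patch: each write in front of the new read is charged 1 + nr_sectors / 64, and the walk stops at the write that brings the total to max_bomb_segments. The names sim_request, rq_dir and promote_read are hypothetical, the array stands in for the kernel's linked list, and how the block layer then uses the returned position is not modelled.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct request. */
enum rq_dir { RQ_READ, RQ_WRITE };

struct sim_request {
    enum rq_dir dir;
    unsigned int nr_sectors;
};

/*
 * Walk the queue from the front. default_pos is the index at which the
 * old algorithm left the insertion hint (the patch's cur_request).
 * Returns the index of the write whose cost exhausts the budget, or
 * default_pos if the walk reaches default_pos first or if the feature
 * is disabled (max_bomb_segments == 0).
 */
size_t promote_read(const struct sim_request *queue, size_t len,
                    size_t default_pos, int max_bomb_segments)
{
    int cur_latency = 0;
    size_t i;

    if (max_bomb_segments <= 0)
        return default_pos;

    for (i = 0; i < len && i < default_pos; i++) {
        if (queue[i].dir != RQ_WRITE)
            continue;           /* existing reads cost nothing */
        /* Same weighting as the patch: requests count, size adds a little. */
        cur_latency += 1 + (int)(queue[i].nr_sectors / 64);
        if (cur_latency >= max_bomb_segments)
            return i;
    }
    return default_pos;
}
```

With a queue of a 128-sector write, a 64-sector write, a read and an 8-sector write, the costs are 3, 2, 0 and 1. A budget of 6 (the "reasonable setting" mentioned in the mail) is therefore reached at the fourth request, while a budget of 1 stops at the very first write.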
* Re: Poor performance during disk writes
2001-12-20 18:19 ` Andrew Morton
@ 2001-12-20 18:29 ` Dave Jones
2001-12-21 16:50 ` Jens Axboe
2001-12-21 16:50 ` Jens Axboe
1 sibling, 1 reply; 20+ messages in thread
From: Dave Jones @ 2001-12-20 18:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: William Lee Irwin III, Linux Kernel List
On Thu, 20 Dec 2001, Andrew Morton wrote:
> You need to run
> elvtune -b N /dev/hdXX
> where N=0 is "disable", N=1 is minimum read latency, N=6 is
> a reasonable setting.

I'm curious, why was max_bomb_segments dropped the last time
it was in the tree? I recall it happening, but the reason
escapes me.

Dave.
--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Poor performance during disk writes
2001-12-20 18:29 ` Dave Jones
@ 2001-12-21 16:50 ` Jens Axboe
0 siblings, 0 replies; 20+ messages in thread
From: Jens Axboe @ 2001-12-21 16:50 UTC (permalink / raw)
To: Dave Jones; +Cc: Andrew Morton, William Lee Irwin III, Linux Kernel List
On Thu, Dec 20 2001, Dave Jones wrote:
> On Thu, 20 Dec 2001, Andrew Morton wrote:
>
> > You need to run
> > elvtune -b N /dev/hdXX
> > where N=0 is "disable", N=1 is minimum read latency, N=6 is
> > a reasonable setting.
>
> I'm curious, why was max_bomb_segments dropped the last time
> it was in the tree ? I recall it happening, but the reason
> escapes me.

Fooled me too the first time, read Andrew's patch though. It isn't
related at all.

--
Jens Axboe
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Poor performance during disk writes
2001-12-20 18:19 ` Andrew Morton
2001-12-20 18:29 ` Dave Jones
@ 2001-12-21 16:50 ` Jens Axboe
1 sibling, 0 replies; 20+ messages in thread
From: Jens Axboe @ 2001-12-21 16:50 UTC (permalink / raw)
To: Andrew Morton; +Cc: William Lee Irwin III, Linux Kernel List
On Thu, Dec 20 2001, Andrew Morton wrote:
> --- linux-2.4.17-pre6/drivers/block/ll_rw_blk.c Mon Nov 5 21:01:11 2001
> +++ linux-akpm/drivers/block/ll_rw_blk.c Sat Dec 8 11:10:36 2001
> @@ -690,7 +690,8 @@ again:
> } else if (q->head_active && !q->plugged)
> head = head->next;
>
> - el_ret = elevator->elevator_merge_fn(q, &req, head, bh, rw,max_sectors);
> + el_ret = elevator->elevator_merge_fn(q, &req, head, bh,
> + rw, max_sectors, elevator->max_bomb_segments);

merge function can just grab max_bomb_segments ala

	int mbs = q->elevator.max_bomb_segments

so no need to modify the merge functions.

--
Jens Axboe
^ permalink raw reply [flat|nested] 20+ messages in thread
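Jens's point is that the tunable already lives in the elevator embedded in the queue, so the merge function can read it through its existing q argument instead of taking a seventh parameter. A minimal user-space sketch of that shape follows. The struct layouts are cut-down mock-ups holding only the fields needed here, and sketch_merge_budget, SKETCH_READ and SKETCH_WRITE are hypothetical names; the real 2.4 types and merge logic are not reproduced.

```c
#include <stddef.h>

/* Hypothetical mock-ups; the real 2.4 structures carry many more fields. */
typedef struct elevator_s {
    int read_latency;
    int write_latency;
    int max_bomb_segments;
} elevator_t;

typedef struct request_queue {
    elevator_t elevator;        /* embedded, hence q->elevator.<field> */
} request_queue_t;

enum { SKETCH_READ = 0, SKETCH_WRITE = 1 };

/*
 * Keeps the six-argument shape of the unpatched merge hook (the pointer
 * types are stubbed as void here) and fetches the budget from the queue.
 * Returns the write budget that would apply to this request: the
 * tunable for reads, 0 for writes, since only reads are promoted.
 */
int sketch_merge_budget(request_queue_t *q, void **req, void *head,
                        void *bh, int rw, int max_sectors)
{
    int mbs = q->elevator.max_bomb_segments;

    (void)req;
    (void)head;
    (void)bh;
    (void)max_sectors;

    return rw == SKETCH_READ ? mbs : 0;
}
```

The trade-off, under these assumptions, is that callers such as __make_request and the elevator_merge_fn typedef stay untouched, at the cost of the merge function depending on the queue's elevator layout.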
[parent not found: <0112201629230E.01835@manta>]
[parent not found: <200112201436.fBKEa2m26640@zero.tech9.net>]
* Re: Poor performance during disk writes
[not found] ` <200112201436.fBKEa2m26640@zero.tech9.net>
@ 2001-12-20 19:02 ` Robert Love
0 siblings, 0 replies; 20+ messages in thread
From: Robert Love @ 2001-12-20 19:02 UTC (permalink / raw)
To: Dieter Nützel; +Cc: vda, Helge Hafting, jlm, Andre Hedrick, Linux Kernel List
On Thu, 2001-12-20 at 09:35, Dieter Nützel wrote:
> > Robert maintains latency measurement patch, do you use it?
>
> Yes, I did the ReiserFS lock-break tests for him.
>
> > Does it show where are the problems?
>
> NO, we have no clue, yet :-(

Want to see if it's the VM? Rik van Riel has updated his 2.4-ac VM for
new kernels and added reverse page mapping (a neat feature). It is
available at:

	http://www.surriel.com/patches/2.4/2.4.16-rmap-6

Give it a whirl, you might be impressed. If not, maybe we can scratch
the VM as the problem and stare meanly at VFS ;-)

(Note my lock-break patch will fail on the new VM. Ignore it. The rest
is still fine. Perhaps I'll do a lock-break for this VM later).

Robert Love
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2001-12-21 16:53 UTC | newest]
Thread overview: 20+ messages
2001-12-18 0:53 Poor performance during disk writes jlm
2001-12-18 1:09 ` Reid Hekman
2001-12-18 1:36 ` jlm
2001-12-18 2:01 ` Reid Hekman
2001-12-18 18:46 ` Andre Hedrick
2001-12-18 17:42 ` Gérard Roudier
2001-12-18 20:34 ` Andre Hedrick
2001-12-18 19:09 ` Gérard Roudier
2001-12-19 23:26 ` jlm
2001-12-20 10:49 ` Helge Hafting
2001-12-20 11:16 ` Oops in 2.4.14-pre6 and 2.4.14-pre9aa1 Andre Margis
2001-12-21 16:29 ` Poor performance during disk writes Troy Benjegerdes
-- strict thread matches above, loose matches on Subject: below --
2001-12-20 13:27 Dieter Nützel
2001-12-20 14:51 ` safemode
2001-12-20 17:40 ` William Lee Irwin III
2001-12-20 18:19 ` Andrew Morton
2001-12-20 18:29 ` Dave Jones
2001-12-21 16:50 ` Jens Axboe
2001-12-21 16:50 ` Jens Axboe
[not found] ` <0112201629230E.01835@manta>
[not found] ` <200112201436.fBKEa2m26640@zero.tech9.net>
2001-12-20 19:02 ` Robert Love