* [PATCH] fix readahead breakage for sequential after random reads @ 2004-07-18 22:30 Miklos Szeredi 2004-07-26 23:29 ` Andrew Morton 0 siblings, 1 reply; 9+ messages in thread From: Miklos Szeredi @ 2004-07-18 22:30 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel Hi Andrew, Current readahead logic is broken when a random read pattern is followed by a long sequential read. The cause is that on a window miss ra->next_size is set to ra->average, but ra->average is only updated at the end of a sequence, so window size will remain 1 until the end of the sequential read. This patch fixes this by taking the current sequence length into account (code taken from towards end of page_cache_readahead()), and also setting ra->average to a decent value in handle_ra_miss() when sequential access is detected. Please apply! Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> ============================================================================== --- linux-2.6.8-rc2/mm/readahead.c.orig 2004-06-16 07:18:57.000000000 +0200 +++ linux-2.6.8-rc2/mm/readahead.c 2004-07-18 23:52:31.000000000 +0200 @@ -470,7 +470,11 @@ do_io: * pages shall be accessed in the next * current window. */ - ra->next_size = min(ra->average , (unsigned long)max); + average = ra->average; + if (ra->serial_cnt > average) + average = (ra->serial_cnt + ra->average + 1) / 2; + + ra->next_size = min(average , (unsigned long)max); } ra->start = offset; ra->size = ra->next_size; @@ -552,6 +556,7 @@ void handle_ra_miss(struct address_space ra->size = max; ra->ahead_start = 0; ra->ahead_size = 0; + ra->average = max / 2; } } ra->prev_page = offset; ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] fix readahead breakage for sequential after random reads 2004-07-18 22:30 [PATCH] fix readahead breakage for sequential after random reads Miklos Szeredi @ 2004-07-26 23:29 ` Andrew Morton 2004-07-26 23:56 ` Ram Pai 0 siblings, 1 reply; 9+ messages in thread From: Andrew Morton @ 2004-07-26 23:29 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-kernel, Ram Pai Miklos Szeredi <miklos@szeredi.hu> wrote: > > Current readahead logic is broken when a random read pattern is > followed by a long sequential read. The cause is that on a window > miss ra->next_size is set to ra->average, but ra->average is only > updated at the end of a sequence, so window size will remain 1 until > the end of the sequential read. > > This patch fixes this by taking the current sequence length into > account (code taken from towards end of page_cache_readahead()), and > also setting ra->average to a decent value in handle_ra_miss() when > sequential access is detected. Thanks. Do you have any performance testing results from this patch? ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] fix readahead breakage for sequential after random reads 2004-07-26 23:29 ` Andrew Morton @ 2004-07-26 23:56 ` Ram Pai 2004-07-27 0:08 ` Andrew Morton 0 siblings, 1 reply; 9+ messages in thread From: Ram Pai @ 2004-07-26 23:56 UTC (permalink / raw) To: Andrew Morton; +Cc: Miklos Szeredi, linux-kernel Andrew, Yes the patch fixes a valid bug. RP On Mon, 2004-07-26 at 16:29, Andrew Morton wrote: > Miklos Szeredi <miklos@szeredi.hu> wrote: > > > > Current readahead logic is broken when a random read pattern is > > followed by a long sequential read. The cause is that on a window > > miss ra->next_size is set to ra->average, but ra->average is only > > updated at the end of a sequence, so window size will remain 1 until > > the end of the sequential read. > > > > This patch fixes this by taking the current sequence length into > > account (code taken from towards end of page_cache_readahead()), and > > also setting ra->average to a decent value in handle_ra_miss() when > > sequential access is detected. > > Thanks. Do you have any performance testing results from this patch? > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] fix readahead breakage for sequential after random reads 2004-07-26 23:56 ` Ram Pai @ 2004-07-27 0:08 ` Andrew Morton 2004-07-27 4:18 ` Ram Pai 0 siblings, 1 reply; 9+ messages in thread From: Andrew Morton @ 2004-07-27 0:08 UTC (permalink / raw) To: Ram Pai; +Cc: miklos, linux-kernel Ram Pai <linuxram@us.ibm.com> wrote: > > Andrew, > Yes the patch fixes a valid bug. > Please don't top-post :( > RP > > On Mon, 2004-07-26 at 16:29, Andrew Morton wrote: > > Miklos Szeredi <miklos@szeredi.hu> wrote: > > > > > > Current readahead logic is broken when a random read pattern is > > > followed by a long sequential read. The cause is that on a window > > > miss ra->next_size is set to ra->average, but ra->average is only > > > updated at the end of a sequence, so window size will remain 1 until > > > the end of the sequential read. > > > > > > This patch fixes this by taking the current sequence length into > > > account (code taken from towards end of page_cache_readahead()), and > > > also setting ra->average to a decent value in handle_ra_miss() when > > > sequential access is detected. > > > > Thanks. Do you have any performance testing results from this patch? > > > Ram Pai <linuxram@us.ibm.com> wrote: > > Andrew, > Yes the patch fixes a valid bug. Fine, but the readahead code is performance-sensitive, and it takes quite some time for any regressions to be discovered. So I'm going to need to either sit on this patch for a very long time, or extensively test it myself, or await convincing test results from someone else. Can you help with that? ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] fix readahead breakage for sequential after random reads 2004-07-27 0:08 ` Andrew Morton @ 2004-07-27 4:18 ` Ram Pai 2004-07-27 7:40 ` Miklos Szeredi 2004-08-04 17:38 ` Ram Pai 0 siblings, 2 replies; 9+ messages in thread From: Ram Pai @ 2004-07-27 4:18 UTC (permalink / raw) To: Andrew Morton; +Cc: miklos, linux-kernel On Mon, 2004-07-26 at 17:08, Andrew Morton wrote: > Ram Pai <linuxram@us.ibm.com> wrote: > > > > Andrew, > > Yes the patch fixes a valid bug. > > > > Please don't top-post :( > > RP > > > > On Mon, 2004-07-26 at 16:29, Andrew Morton wrote: > > > Miklos Szeredi <miklos@szeredi.hu> wrote: > > > > > > > > Current readahead logic is broken when a random read pattern is > > > > followed by a long sequential read. The cause is that on a window > > > > miss ra->next_size is set to ra->average, but ra->average is only > > > > updated at the end of a sequence, so window size will remain 1 until > > > > the end of the sequential read. > > > > > > > > This patch fixes this by taking the current sequence length into > > > > account (code taken from towards end of page_cache_readahead()), and > > > > also setting ra->average to a decent value in handle_ra_miss() when > > > > sequential access is detected. > > > > > > Thanks. Do you have any performance testing results from this patch? > > > > > Ram Pai <linuxram@us.ibm.com> wrote: > > > > Andrew, > > Yes the patch fixes a valid bug. > > Fine, but the readahead code is performance-sensitive, and it takes quite > some time for any regressions to be discovered. So I'm going to need to > either sit on this patch for a very long time, or extensively test it > myself, or await convincing test results from someone else. > > Can you help with that? yes I will run all my standard testsuites before we take this patch. (DSS workload, iozone, sysbench). I will get back with some results sooon. Probably by the end of this week. Also I think the bug that Miklos, found is really hard to reproduce. Did he find this bug by code inspection? Its really really hard to get into a state where the current window is of size 1 page with zero pages in the readahead window, and then the sequential read pattern to just right then. RP > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] fix readahead breakage for sequential after random reads 2004-07-27 4:18 ` Ram Pai @ 2004-07-27 7:40 ` Miklos Szeredi 2004-07-27 15:34 ` Ram Pai 2004-08-04 17:38 ` Ram Pai 1 sibling, 1 reply; 9+ messages in thread From: Miklos Szeredi @ 2004-07-27 7:40 UTC (permalink / raw) To: linuxram; +Cc: akpm, linux-kernel Ram Pai <linuxram@us.ibm.com> wrote: > > Also I think the bug that Miklos, found is really hard to reproduce. Did > he find this bug by code inspection? Its really really hard to get into > a state where the current window is of size 1 page with zero pages in > the readahead window, and then the sequential read pattern to just right > then. I found it by accident. I did my testing with the following read sequence: page offset num pages 7 1 0 1 35 1 23 1 42 1 33 1 29 1 100 200 I did not measure actual performance, but only looked at the length of the page vector passed to my filesystem's readpages() method. Miklos ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] fix readahead breakage for sequential after random reads 2004-07-27 7:40 ` Miklos Szeredi @ 2004-07-27 15:34 ` Ram Pai 0 siblings, 0 replies; 9+ messages in thread From: Ram Pai @ 2004-07-27 15:34 UTC (permalink / raw) To: Miklos Szeredi; +Cc: akpm, linux-kernel On Tue, 2004-07-27 at 00:40, Miklos Szeredi wrote: > Ram Pai <linuxram@us.ibm.com> wrote: > > > > Also I think the bug that Miklos, found is really hard to reproduce. Did > > he find this bug by code inspection? Its really really hard to get into > > a state where the current window is of size 1 page with zero pages in > > the readahead window, and then the sequential read pattern to just right > > then. > > I found it by accident. I did my testing with the following read > sequence: > > page offset num pages > 7 1 > 0 1 > 35 1 > 23 1 > 42 1 > 33 1 > 29 1 > 100 200 > > I did not measure actual performance, but only looked at the length of > the page vector passed to my filesystem's readpages() method. right. this pattern is just about right to get into that bad state. Had you had some more random reads then readahead algorithm will go into the slow-read mode(readahead-off mode) and you will not bump into this bug. RP > > Miklos > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] fix readahead breakage for sequential after random reads 2004-07-27 4:18 ` Ram Pai 2004-07-27 7:40 ` Miklos Szeredi @ 2004-08-04 17:38 ` Ram Pai 2004-08-04 19:39 ` Shane Shrybman 1 sibling, 1 reply; 9+ messages in thread From: Ram Pai @ 2004-08-04 17:38 UTC (permalink / raw) To: Andrew Morton; +Cc: miklos, linux-kernel, shrybman [-- Attachment #1: Type: text/plain, Size: 2323 bytes --] On Mon, 2004-07-26 at 21:18, Ram Pai wrote: > On Mon, 2004-07-26 at 17:08, Andrew Morton wrote: > > Ram Pai <linuxram@us.ibm.com> wrote: > > > > > > Andrew, > > > Yes the patch fixes a valid bug. > > > > > > > Please don't top-post :( > > > RP > > > > > > On Mon, 2004-07-26 at 16:29, Andrew Morton wrote: > > > > Miklos Szeredi <miklos@szeredi.hu> wrote: > > > > > > > > > > Current readahead logic is broken when a random read pattern is > > > > > followed by a long sequential read. The cause is that on a window > > > > > miss ra->next_size is set to ra->average, but ra->average is only > > > > > updated at the end of a sequence, so window size will remain 1 until > > > > > the end of the sequential read. > > > > > > > > > > This patch fixes this by taking the current sequence length into > > > > > account (code taken from towards end of page_cache_readahead()), and > > > > > also setting ra->average to a decent value in handle_ra_miss() when > > > > > sequential access is detected. > > > > > > > > Thanks. Do you have any performance testing results from this patch? > > > > > > > Ram Pai <linuxram@us.ibm.com> wrote: > > > > > > Andrew, > > > Yes the patch fixes a valid bug. > > > > Fine, but the readahead code is performance-sensitive, and it takes quite > > some time for any regressions to be discovered. So I'm going to need to > > either sit on this patch for a very long time, or extensively test it > > myself, or await convincing test results from someone else. > > > > Can you help with that? > > yes I will run all my standard testsuites before we take this patch. > (DSS workload, iozone, sysbench). I will get back with some results > sooon. Probably by the end of this week. Ok I have enclosed the results. The summary is: there is no significant improvement or decrease in performance of (DSS workload, iozone, sysbench) The increase or decrease is in the margin of errors. I have also enclosed a patch that partially backs off Miklos's fix. Shane Shrybman correctly pointed out that the real fix is to set ra->average value to max/2 when we move from readahead-off mode to readahead-on mode. The other part of Miklos's fix becomes irrelevent. Sorry it took some time to get back on this. Its almost automated so turnaround time should be quick now-on-wards. RP [-- Attachment #2: miklos_partial_backoff.patch --] [-- Type: text/plain, Size: 531 bytes --] --- linux-2.6.8-rc3/mm/readahead.c 2004-08-03 14:26:46.000000000 -0700 +++ mypatchdir/mm/readahead.c 2004-08-04 17:08:47.665196424 -0700 @@ -468,11 +468,7 @@ do_io: * pages shall be accessed in the next * current window. */ - average = ra->average; - if (ra->serial_cnt > average) - average = (ra->serial_cnt + ra->average + 1) / 2; - - ra->next_size = min(average , (unsigned long)max); + ra->next_size = min(ra->average , (unsigned long)max); } ra->start = offset; ra->size = ra->next_size; [-- Attachment #3: report --] [-- Type: text/plain, Size: 4049 bytes --] DSS WORKLOAD ---------------- --------------------------------------------------------------------- | 2.6.8-rc2 | miklos_patch | miklos_mod_patch | | | | | ---------------------------------------------------------------------- | | | | | X | 0.9001% increase| 0.73649% increase | | | w.r.t X | w.r.t X | | | | | --------------------------------------------------------------------- iozone ------- iozone -c -t1 -s 4096m -r 128k -w -------------------------------------------------------------------- | | 2.6.8-rc2 | miklos_patch |miklos_mod_patch | --------------------------------------------------------------------- | | | | | |Sequential | 29773.95KB/sec | 29777.38KB/sec| 29968.74KB/sec | | | | | | |reverse read | 17905.65KB/sec | 17897.85KB/sec| 17726.55KB/sec | | | | | | |stride read | 19927.43KB/sec | 19778.90KB/sec| 19625.75KB/sec | | | | | | |random read | 18454.49KB/sec | 18384.28KB/sec| 18666.47KB/sec | | | | | | |random mix | 19434.58KB/sec | 19480.43KB/sec| 18755.05KB/sec | | | | | | |pread | 29763.56KB/sec | 29780.05KB/sec| 29961.16KB/sec | | | | | | -------------------------------------------------------------------- iozone NFS ------- iozone -c -t1 -s 4096m -r 128k -w --------------------------------------------------------------------------- | | 2.6.8-rc2 | miklos_patch | miklos_mod_patch| |---------------|---------------|-----------------------|-----------------| | | | | | |Sequential | 2272.77KB/sec | 2266.45KB/sec | 2269.98KB/sec | | | | | | |reverse read | 3078.14KB/sec | 3066.73KB/sec | 3079.87KB/sec | | | | | | |stride read | 3253.49KB/sec | 3220.70KB/sec | 3264.41KB/sec | | | | | | |random read | 5225.44KB/sec | 5231.68KB/sec | 5274.32KB/sec | | | | | | |random mix | 8389.64KB/sec | 8458.40KB/sec | 7722.72KB/sec | | | | | | |pread | 2264.64KB/sec | 2264.35KB/sec | 2256.15KB/sec | -------------------------------------------------------------------------- sysbench -------- sysbench --num-threads=256 --test=fileio --file-total-size=2800M --file-test-mode=rndrw run ------------------------------------------------------------------------- | | 2.6.8-rc2 | miklos_patch |miklos_mod_patch| ------------------------------------------------------------------------- | | | | | | Throughput | 2.745MB/sec | 2.721MB/sec | 2.607MB/sec | |---------------|-----------------------|---------------|----------------| | | | | | | | | | | |Requests/sec | 175.99 | 174.40MB/sec | 167.08MB/sec | |---------------|-----------------------|---------------|----------------| | | | | | | | | | | |Execution time | 56.8213s | 57.3400s | 59.8498s | -------------------------------------------------------------------------- ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] fix readahead breakage for sequential after random reads 2004-08-04 17:38 ` Ram Pai @ 2004-08-04 19:39 ` Shane Shrybman 0 siblings, 0 replies; 9+ messages in thread From: Shane Shrybman @ 2004-08-04 19:39 UTC (permalink / raw) To: Ram Pai; +Cc: Andrew Morton, miklos, linux-kernel On Wed, 2004-08-04 at 13:38, Ram Pai wrote: > On Mon, 2004-07-26 at 21:18, Ram Pai wrote: > > On Mon, 2004-07-26 at 17:08, Andrew Morton wrote: > > > Ram Pai <linuxram@us.ibm.com> wrote: > > > > > > > > Andrew, > > > > Yes the patch fixes a valid bug. > > > > > > > > > > Please don't top-post :( > > > > RP > > > > > > > > On Mon, 2004-07-26 at 16:29, Andrew Morton wrote: > > > > > Miklos Szeredi <miklos@szeredi.hu> wrote: > > > > > > > > > > > > Current readahead logic is broken when a random read pattern is > > > > > > followed by a long sequential read. The cause is that on a window > > > > > > miss ra->next_size is set to ra->average, but ra->average is only > > > > > > updated at the end of a sequence, so window size will remain 1 until > > > > > > the end of the sequential read. > > > > > > > > > > > > This patch fixes this by taking the current sequence length into > > > > > > account (code taken from towards end of page_cache_readahead()), and > > > > > > also setting ra->average to a decent value in handle_ra_miss() when > > > > > > sequential access is detected. > > > > > > > > > > Thanks. Do you have any performance testing results from this patch? > > > > > > > > > Ram Pai <linuxram@us.ibm.com> wrote: > > > > > > > > Andrew, > > > > Yes the patch fixes a valid bug. > > > > > > Fine, but the readahead code is performance-sensitive, and it takes quite > > > some time for any regressions to be discovered. So I'm going to need to > > > either sit on this patch for a very long time, or extensively test it > > > myself, or await convincing test results from someone else. > > > > > > Can you help with that? > > > > yes I will run all my standard testsuites before we take this patch. > > (DSS workload, iozone, sysbench). I will get back with some results > > sooon. Probably by the end of this week. > > Ok I have enclosed the results. The summary is: there is no significant > improvement or decrease in performance of (DSS workload, iozone, > sysbench) The increase or decrease is in the margin of errors. > > I have also enclosed a patch that partially backs off Miklos's fix. > Shane Shrybman correctly pointed out that the real fix is to set > ra->average value to max/2 when we move from readahead-off mode to > readahead-on mode. The other part of Miklos's fix becomes irrelevent. > The patch looks fine to me. > > Sorry it took some time to get back on this. Its almost automated so > turnaround time should be quick now-on-wards. > For these types of bugs the difference is not in the IO performance numbers but in the processing overhead. Does it make sense to track processor statistics (sys/user/io-wait...) during the benchmarks. So we could say that this patch doesn't change IO performance but uses 1% less system time? Maybe use /usr/bin/time -v sysbench ... > RP Regards, Shane ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2004-08-04 19:40 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2004-07-18 22:30 [PATCH] fix readahead breakage for sequential after random reads Miklos Szeredi 2004-07-26 23:29 ` Andrew Morton 2004-07-26 23:56 ` Ram Pai 2004-07-27 0:08 ` Andrew Morton 2004-07-27 4:18 ` Ram Pai 2004-07-27 7:40 ` Miklos Szeredi 2004-07-27 15:34 ` Ram Pai 2004-08-04 17:38 ` Ram Pai 2004-08-04 19:39 ` Shane Shrybman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox