From: Wu Fengguang <fengguang.wu@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linux Memory Management List <linux-mm@kvack.org>
Cc: <linux-fsdevel@vger.kernel.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>,
LKML <linux-kernel@vger.kernel.org>
Subject: [PATCH 00/11] [RFC] 512K readahead size with thrashing safe readahead
Date: Tue, 02 Feb 2010 23:28:35 +0800
Message-ID: <20100202152835.683907822@intel.com>

Andrew,
This series lifts the default readahead size to 512KB, which I believe yields
more I/O throughput without noticeably increasing I/O latency on today's HDDs.
For example, for an HDD with 100MB/s throughput and 8ms access time:
io_size KB  access_time ms  transfer_time ms  io_latency ms   util%  throughput KB/s    IOPS
         4               8              0.04           8.04   0.49%           497.57  124.39
         8               8              0.08           8.08   0.97%           990.33  123.79
        16               8              0.16           8.16   1.92%          1961.69  122.61
        32               8              0.31           8.31   3.76%          3849.62  120.30
        64               8              0.62           8.62   7.25%          7420.29  115.94
       128               8              1.25           9.25  13.51%         13837.84  108.11
       256               8              2.50          10.50  23.81%         24380.95   95.24
       512               8              5.00          13.00  38.46%         39384.62   76.92
      1024               8             10.00          18.00  55.56%         56888.89   55.56
      2048               8             20.00          28.00  71.43%         73142.86   35.71
      4096               8             40.00          48.00  83.33%         85333.33   20.83
Going from 128KB to 512KB readahead boosts I/O throughput from ~13MB/s to ~39MB/s,
while increasing I/O latency only from 9.25ms to 13.00ms.
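
The figures in the table follow from a simple per-request cost model. The sketch below reproduces them in Python, assuming each request pays the full 8ms access time once and then streams at 100MB/s (taken as 102400 KB/s, which is the conversion that matches the table). The names `io_cost` and `IoCost` are illustrative only and do not come from the patches.

```python
from dataclasses import dataclass

ACCESS_TIME_MS = 8.0          # per-request access time, from the example above
BANDWIDTH_KB_S = 100 * 1024   # 100MB/s; assumes 1MB = 1024KB, which matches the table


@dataclass
class IoCost:
    io_size_kb: int
    transfer_ms: float
    latency_ms: float
    util: float               # fraction of the request time spent transferring data
    throughput_kb_s: float
    iops: float


def io_cost(io_size_kb: int) -> IoCost:
    """Cost of one isolated request of io_size_kb under the simple model."""
    transfer_ms = io_size_kb / BANDWIDTH_KB_S * 1000.0
    latency_ms = ACCESS_TIME_MS + transfer_ms
    return IoCost(
        io_size_kb=io_size_kb,
        transfer_ms=transfer_ms,
        latency_ms=latency_ms,
        util=transfer_ms / latency_ms,
        throughput_kb_s=io_size_kb / latency_ms * 1000.0,
        iops=1000.0 / latency_ms,
    )


if __name__ == "__main__":
    # Print one row per I/O size, in the same column order as the table.
    for kb in (4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096):
        c = io_cost(kb)
        print(f"{c.io_size_kb:5d} {c.transfer_ms:6.2f} {c.latency_ms:6.2f} "
              f"{c.util:7.2%} {c.throughput_kb_s:9.2f} {c.iops:7.2f}")
```

The model ignores queueing and any on-disk caching, so it is only a rough guide to how the fixed access time is amortized over larger requests.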
As for SSD, I find that Intel X25-M SSD desires large readahead size
even for sequential reads (the first patch has benchmark details):
rasize   first run time/throughput   second run time/throughput
------------------------------------------------------------------
    4k   3.40038 s, 123 MB/s         3.42842 s, 122 MB/s
    8k   2.7362 s, 153 MB/s          2.74528 s, 153 MB/s
   16k   2.59808 s, 161 MB/s         2.58728 s, 162 MB/s
   32k   2.50488 s, 167 MB/s         2.49138 s, 168 MB/s
   64k   2.12861 s, 197 MB/s         2.13055 s, 197 MB/s
  128k   1.92905 s, 217 MB/s         1.93176 s, 217 MB/s
  256k   1.75896 s, 238 MB/s         1.78963 s, 234 MB/s
  512k   1.67357 s, 251 MB/s         1.69112 s, 248 MB/s
    1M   1.62115 s, 259 MB/s         1.63206 s, 257 MB/s
    2M   1.56204 s, 269 MB/s         1.58854 s, 264 MB/s
    4M   1.57949 s, 266 MB/s         1.57426 s, 266 MB/s
As suggested by Linus, the default readahead size is decreased for small devices at the same time.
[PATCH 01/11] readahead: limit readahead size for small devices
[PATCH 02/11] readahead: bump up the default readahead size
[PATCH 03/11] readahead: introduce {MAX|MIN}_READAHEAD_PAGES macros for ease of use
The two other impacts of an enlarged readahead size are

- memory footprint (caused by readahead miss)

  Sequential readahead hit ratio is pretty high regardless of max
  readahead size; the extra memory footprint is mainly caused by
  enlarged mmap read-around.
  I measured my desktop:
  - under Xwindow:
      128KB readahead cache hit ratio = 143MB/230MB = 62%
      512KB readahead cache hit ratio = 138MB/248MB = 55%
  - under console: (seems more stable than the Xwindow data)
      128KB readahead cache hit ratio = 30MB/56MB = 53%
      1MB readahead cache hit ratio = 30MB/59MB = 51%
  So the impact to memory footprint looks acceptable.

- readahead thrashing

  It will now cost 1MB readahead buffer per stream. Memory tight systems
  typically do not run multiple streams; but if they do so, it should
  help I/O performance as long as we can avoid thrashing, which can be
  achieved with the following patches.
[PATCH 04/11] readahead: replace ra->mmap_miss with ra->ra_flags
[PATCH 05/11] readahead: retain inactive lru pages to be accessed soon
[PATCH 06/11] readahead: thrashing safe context readahead
This is a major rewrite of the readahead algorithm, so I did careful tests with
the following tracing/stats patches:
[PATCH 07/11] readahead: record readahead patterns
[PATCH 08/11] readahead: add tracing event
[PATCH 09/11] readahead: add /debug/readahead/stats
I verified the new readahead behavior on various access patterns,
as well as stress tested the thrashing safety, by running 300 streams
with mem=128M.
Only 2031/61325 = 3.3% of readahead windows are thrashed (due to workload
variation):
# cat /debug/readahead/stats
pattern     readahead  eof_hit  cache_hit     io  sync_io  mmap_io  size  async_size  io_size
initial            20        9          4     20       20       12    73          37       35
subsequent          3        3          0      1        0        1     8           8        1
context         61325        1       5479  61325     6788        5    14           2       13
thrash           2031        0       1222   2031     2031        0     9           0        6
around            235       90        142    235      235      235    60           0       19
fadvise             0        0          0      0        0        0     0           0        0
random            223      133          0     91       91        1     1           0        1
all             63837      236       6847  63703     9165        0    14           2       13
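
The thrashing ratio quoted above can be recomputed from the stats output. The following is a minimal Python sketch that parses the whitespace-separated table and divides the `thrash` count by the `context` count. The helper names are hypothetical, and the `/debug/readahead/stats` file only exists on a kernel carrying patch 09 of this series, so the sample text is embedded rather than read from the file.

```python
def parse_readahead_stats(text):
    """Parse the stats table into {pattern: {column: int_value}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    columns = lines[0].split()[1:]      # drop the leading "pattern" label
    stats = {}
    for ln in lines[1:]:
        name, *values = ln.split()
        stats[name] = dict(zip(columns, map(int, values)))
    return stats


def thrash_ratio(stats):
    """Thrashed readahead windows relative to context readahead windows."""
    return stats["thrash"]["readahead"] / stats["context"]["readahead"]


# Two rows copied from the output above; the full table parses the same way.
SAMPLE = """\
pattern readahead eof_hit cache_hit io sync_io mmap_io size async_size io_size
context 61325 1 5479 61325 6788 5 14 2 13
thrash 2031 0 1222 2031 2031 0 9 0 6
"""

if __name__ == "__main__":
    ratio = thrash_ratio(parse_readahead_stats(SAMPLE))
    print(f"thrashed windows: {ratio:.1%}")
```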
And the readahead inside a single stream is working as expected:
# grep streams-3162 /debug/tracing/trace
streams-3162 [000] 8602.455953: readahead: readahead-context(dev=0:2, ino=0, req=287352+1, ra=287354+10-2, async=1) = 10
streams-3162 [000] 8602.907873: readahead: readahead-context(dev=0:2, ino=0, req=287362+1, ra=287364+20-3, async=1) = 20
streams-3162 [000] 8604.027879: readahead: readahead-context(dev=0:2, ino=0, req=287381+1, ra=287384+14-2, async=1) = 14
streams-3162 [000] 8604.754722: readahead: readahead-context(dev=0:2, ino=0, req=287396+1, ra=287398+10-2, async=1) = 10
streams-3162 [000] 8605.191228: readahead: readahead-context(dev=0:2, ino=0, req=287406+1, ra=287408+18-3, async=1) = 18
streams-3162 [000] 8606.831895: readahead: readahead-context(dev=0:2, ino=0, req=287423+1, ra=287426+12-2, async=1) = 12
streams-3162 [000] 8606.919614: readahead: readahead-thrash(dev=0:2, ino=0, req=287425+1, ra=287425+8-0, async=0) = 1
streams-3162 [000] 8607.545016: readahead: readahead-context(dev=0:2, ino=0, req=287436+1, ra=287438+9-2, async=1) = 9
streams-3162 [000] 8607.960039: readahead: readahead-context(dev=0:2, ino=0, req=287445+1, ra=287447+18-3, async=1) = 18
streams-3162 [000] 8608.790973: readahead: readahead-context(dev=0:2, ino=0, req=287462+1, ra=287465+21-3, async=1) = 21
streams-3162 [000] 8609.763138: readahead: readahead-context(dev=0:2, ino=0, req=287483+1, ra=287486+15-2, async=1) = 15
streams-3162 [000] 8611.467401: readahead: readahead-context(dev=0:2, ino=0, req=287499+1, ra=287501+11-2, async=1) = 11
streams-3162 [000] 8642.512413: readahead: readahead-context(dev=0:2, ino=0, req=288053+1, ra=288056+10-2, async=1) = 10
streams-3162 [000] 8643.246618: readahead: readahead-context(dev=0:2, ino=0, req=288064+1, ra=288066+22-3, async=1) = 22
streams-3162 [000] 8644.278613: readahead: readahead-context(dev=0:2, ino=0, req=288085+1, ra=288088+16-3, async=1) = 16
streams-3162 [000] 8644.395782: readahead: readahead-context(dev=0:2, ino=0, req=288087+1, ra=288087+21-3, async=0) = 5
streams-3162 [000] 8645.109918: readahead: readahead-context(dev=0:2, ino=0, req=288101+1, ra=288108+8-1, async=1) = 8
streams-3162 [000] 8645.285078: readahead: readahead-context(dev=0:2, ino=0, req=288105+1, ra=288116+8-1, async=1) = 8
streams-3162 [000] 8645.731794: readahead: readahead-context(dev=0:2, ino=0, req=288115+1, ra=288122+14-2, async=1) = 13
streams-3162 [000] 8646.114250: readahead: readahead-context(dev=0:2, ino=0, req=288123+1, ra=288136+8-1, async=1) = 8
streams-3162 [000] 8646.626320: readahead: readahead-context(dev=0:2, ino=0, req=288134+1, ra=288144+16-3, async=1) = 16
streams-3162 [000] 8647.035721: readahead: readahead-context(dev=0:2, ino=0, req=288143+1, ra=288160+10-2, async=1) = 10
streams-3162 [000] 8647.693082: readahead: readahead-context(dev=0:2, ino=0, req=288157+1, ra=288165+12-2, async=1) = 8
streams-3162 [000] 8648.221368: readahead: readahead-context(dev=0:2, ino=0, req=288168+1, ra=288177+15-2, async=1) = 15
streams-3162 [000] 8649.280800: readahead: readahead-context(dev=0:2, ino=0, req=288190+1, ra=288192+23-3, async=1) = 23
[...]
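
For anyone post-processing such a trace, the lines can be split into fields with a regular expression. This Python sketch is based only on the line layout shown above; it assumes that `ra=A+B-C` means window start, window size, and async size, and that the trailing `= N` is the number of pages actually read. Those interpretations, and all names in the code, are assumptions rather than something stated in this message.

```python
import re

# Matches the "readahead-<pattern>(...) = N" part of a trace line.
TRACE_RE = re.compile(
    r"readahead-(?P<pattern>\w+)\("
    r"dev=(?P<dev>\d+:\d+), ino=(?P<ino>\d+), "
    r"req=(?P<req_start>\d+)\+(?P<req_size>\d+), "
    r"ra=(?P<ra_start>\d+)\+(?P<ra_size>\d+)-(?P<async_size>\d+), "
    r"async=(?P<is_async>\d+)\) = (?P<actual>\d+)"
)

INT_FIELDS = ("ino", "req_start", "req_size", "ra_start",
              "ra_size", "async_size", "is_async", "actual")


def parse_trace_line(line):
    """Return a dict of fields for one trace line, or None if it does not match."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    event = m.groupdict()
    for key in INT_FIELDS:
        event[key] = int(event[key])
    return event


def count_patterns(lines):
    """Count events per readahead pattern (e.g. 'context', 'thrash')."""
    counts = {}
    for line in lines:
        event = parse_trace_line(line)
        if event is not None:
            counts[event["pattern"]] = counts.get(event["pattern"], 0) + 1
    return counts


# Two lines copied from the trace above.
SAMPLE_LINES = [
    "streams-3162 [000] 8606.831895: readahead: readahead-context(dev=0:2, ino=0, req=287423+1, ra=287426+12-2, async=1) = 12",
    "streams-3162 [000] 8606.919614: readahead: readahead-thrash(dev=0:2, ino=0, req=287425+1, ra=287425+8-0, async=0) = 1",
]

if __name__ == "__main__":
    print(count_patterns(SAMPLE_LINES))
```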
Linus also suggested disabling start-of-file readahead if lseek() has been called:
[PATCH 10/11] readahead: dont do start-of-file readahead after lseek()
Finally, the updated context readahead will do more radix tree scans, so
radix_tree_prev_hole() needs to be optimized:
[PATCH 11/11] radixtree: speed up next/prev hole search
On average, it will reduce 8*64 level-0 slot searches to 32 level-0 slot
searches plus 8 level-1 node searches.
Thanks,
Fengguang