Re: Slow file transfer speeds with CFQ IO scheduler in some cases

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Wu Fengguang <wfg@linux.intel.com>
To: Vladislav Bolkhovitin <vst@vlnb.net>
Cc: Jens Axboe <jens.axboe@oracle.com>,
	Jeff Moyer <jmoyer@redhat.com>,
	"Vitaly V. Bursov" <vitalyb@telenet.dn.ua>,
	linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org,
	lukasz.jurewicz@gmail.com
Subject: Re: Slow file transfer speeds with CFQ IO scheduler in some cases
Date: Mon, 23 Mar 2009 09:42:20 +0800	[thread overview]
Message-ID: <20090323014220.GA10448@localhost> (raw)
In-Reply-To: <49C2846D.5030500@vlnb.net>

On Thu, Mar 19, 2009 at 08:44:13PM +0300, Vladislav Bolkhovitin wrote:
> Wu Fengguang, on 02/19/2009 05:05 AM wrote:
>> On Tue, Feb 17, 2009 at 10:01:40PM +0300, Vladislav Bolkhovitin wrote:
>>> Wu Fengguang, on 02/13/2009 04:57 AM wrote:
>>>> On Thu, Feb 12, 2009 at 09:35:18PM +0300, Vladislav Bolkhovitin wrote:
>>>>> Sorry for such a huge delay. There were many other activities I 
>>>>> had to  do before + I had to be sure I didn't miss anything.
>>>>>
>>>>> We didn't use NFS, we used SCST (http://scst.sourceforge.net) 
>>>>> with   iSCSI-SCST target driver. It has similar to NFS 
>>>>> architecture, where N  threads (N=5 in this case) handle IO from 
>>>>> remote initiators (clients)  coming from wire using iSCSI 
>>>>> protocol. In addition, SCST has patch  called 
>>>>> export_alloc_io_context (see    
>>>>> http://lkml.org/lkml/2008/12/10/282), which allows for the IO 
>>>>> threads  queue IO using single IO context, so we can see if 
>>>>> context RA can   replace grouping IO threads in single IO 
>>>>> context.
>>>>>
>>>>> Unfortunately, the results are negative. We find neither any  
>>>>> advantages  of context RA over current RA implementation, nor  
>>>>> possibility for  context RA to replace grouping IO threads in 
>>>>> single IO context.
>>>>>
>>>>> Setup on the target (server) was the following. 2 SATA drives 
>>>>> grouped in  md RAID-0 with average local read throughput ~120MB/s 
>>>>> ("dd if=/dev/zero  of=/dev/md0 bs=1M count=20000" outputs 
>>>>> "20971520000 bytes (21 GB)  copied, 177,742 s, 118 MB/s"). The md 
>>>>> device was partitioned on 3  partitions. The first partition was 
>>>>> 10% of space in the beginning of the  device, the last partition 
>>>>> was 10% of space in the end of the device,  the middle one was 
>>>>> the rest in the middle of the space them. Then the  first and the 
>>>>> last partitions were exported to the initiator (client).  They 
>>>>> were /dev/sdb and /dev/sdc on it correspondingly.
>>>> Vladislav, Thank you for the benchmarks! I'm very interested in
>>>> optimizing your workload and figuring out what happens underneath.
>>>>
>>>> Are the client and server two standalone boxes connected by GBE?
>>>>
>>>> When you set readahead sizes in the benchmarks, you are setting them
>>>> in the server side? I.e. "linux-4dtq" is the SCST server? What's the
>>>> client side readahead size?
>>>>
>>>> It would help a lot to debug readahead if you can provide the
>>>> server side readahead stats and trace log for the worst case.
>>>> This will automatically answer the above questions as well as disclose
>>>> the micro-behavior of readahead:
>>>>
>>>>         mount -t debugfs none /sys/kernel/debug
>>>>
>>>>         echo > /sys/kernel/debug/readahead/stats # reset counters
>>>>         # do benchmark
>>>>         cat /sys/kernel/debug/readahead/stats
>>>>
>>>>         echo 1 > /sys/kernel/debug/readahead/trace_enable
>>>>         # do micro-benchmark, i.e. run the same benchmark for a short time
>>>>         echo 0 > /sys/kernel/debug/readahead/trace_enable
>>>>         dmesg
>>>>
>>>> The above readahead trace should help find out how the client side
>>>> sequential reads convert into server side random reads, and how we can
>>>> prevent that.
>>> See attached. Could you comment the logs, please, so I will also be 
>>> able  to read them in the future?
>>
>> Vladislav, thank you for the logs!
>>
>> The printk format for the following lines is:
>>
>> +       printk(KERN_DEBUG "readahead-%s(pid=%d(%s), dev=%02x:%02x(%s), "
>> +               "ino=%lu(%s), req=%lu+%lu, ra=%lu+%d-%d, async=%d) = %d\n",
>> +               ra_pattern_names[pattern],
>> +               current->pid, current->comm,
>> +               MAJOR(mapping->host->i_sb->s_dev),
>> +               MINOR(mapping->host->i_sb->s_dev),
>> +               mapping->host->i_sb->s_id,
>> +               mapping->host->i_ino,
>> +               filp->f_path.dentry->d_name.name,
>> +               offset, req_size,
>> +               ra->start, ra->size, ra->async_size,
>> +               async,
>> +               actual);
>>
>> readahead-marker(pid=3838(vdiskd3_3), dev=00:02(bdev), ino=0(raid-1st), req=10596+1, ra=10628+32-32, async=1) = 32
>> readahead-marker(pid=3837(vdiskd3_2), dev=00:02(bdev), ino=0(raid-1st), req=10628+1, ra=10660+32-32, async=1) = 32
>> readahead-marker(pid=3835(vdiskd3_0), dev=00:02(bdev), ino=0(raid-1st), req=10660+1, ra=10692+32-32, async=1) = 32
>> readahead-marker(pid=3839(vdiskd3_4), dev=00:02(bdev), ino=0(raid-1st), req=10692+1, ra=10724+32-32, async=1) = 32
>> readahead-marker(pid=3837(vdiskd3_2), dev=00:02(bdev), ino=0(raid-1st), req=10724+1, ra=10756+32-32, async=1) = 32
>> readahead-marker(pid=3838(vdiskd3_3), dev=00:02(bdev), ino=0(raid-1st), req=10756+1, ra=10788+32-32, async=1) = 32
>> readahead-marker(pid=3839(vdiskd3_4), dev=00:02(bdev), ino=0(raid-1st), req=10788+1, ra=10820+32-32, async=1) = 32
>> readahead-marker(pid=3835(vdiskd3_0), dev=00:02(bdev), ino=0(raid-1st), req=10820+1, ra=10852+32-32, async=1) = 32
>> readahead-marker(pid=3838(vdiskd3_3), dev=00:02(bdev), ino=0(raid-1st), req=10852+1, ra=10884+32-32, async=1) = 32
>> readahead-marker(pid=3837(vdiskd3_2), dev=00:02(bdev), ino=0(raid-1st), req=10884+1, ra=10916+32-32, async=1) = 32
>> readahead-marker(pid=3839(vdiskd3_4), dev=00:02(bdev), ino=0(raid-1st), req=10916+1, ra=10948+32-32, async=1) = 32
>> readahead-marker(pid=3836(vdiskd3_1), dev=00:02(bdev), ino=0(raid-1st), req=10948+1, ra=10980+32-32, async=1) = 32
>> readahead-marker(pid=3837(vdiskd3_2), dev=00:02(bdev), ino=0(raid-1st), req=10980+1, ra=11012+32-32, async=1) = 32
>> readahead-marker(pid=3838(vdiskd3_3), dev=00:02(bdev), ino=0(raid-1st), req=11012+1, ra=11044+32-32, async=1) = 32
>> readahead-marker(pid=3836(vdiskd3_1), dev=00:02(bdev), ino=0(raid-1st), req=11044+1, ra=11076+32-32, async=1) = 32
>> readahead-subsequent(pid=3836(vdiskd3_1), dev=00:02(bdev), ino=0(raid-1st), req=11076+1, ra=11108+32-32, async=1) = 32
>> readahead-marker(pid=3835(vdiskd3_0), dev=00:02(bdev), ino=0(raid-1st), req=11108+1, ra=11140+32-32, async=1) = 32
>> readahead-subsequent(pid=3835(vdiskd3_0), dev=00:02(bdev), ino=0(raid-1st), req=11140+1, ra=11172+32-32, async=1) = 32
>> readahead-marker(pid=3839(vdiskd3_4), dev=00:02(bdev), ino=0(raid-1st), req=11172+1, ra=11204+32-32, async=1) = 32
>> readahead-subsequent(pid=3839(vdiskd3_4), dev=00:02(bdev), ino=0(raid-1st), req=11204+1, ra=11236+32-32, async=1) = 32
>> readahead-marker(pid=3837(vdiskd3_2), dev=00:02(bdev), ino=0(raid-1st), req=11236+1, ra=11268+32-32, async=1) = 32
>> readahead-subsequent(pid=3837(vdiskd3_2), dev=00:02(bdev), ino=0(raid-1st), req=11268+1, ra=11300+32-32, async=1) = 32
>> readahead-marker(pid=3835(vdiskd3_0), dev=00:02(bdev), ino=0(raid-1st), req=11300+1, ra=11332+32-32, async=1) = 32
>> readahead-subsequent(pid=3835(vdiskd3_0), dev=00:02(bdev), ino=0(raid-1st), req=11332+1, ra=11364+32-32, async=1) = 32
>> readahead-marker(pid=3839(vdiskd3_4), dev=00:02(bdev), ino=0(raid-1st), req=11364+1, ra=11396+32-32, async=1) = 32
>> readahead-subsequent(pid=3839(vdiskd3_4), dev=00:02(bdev), ino=0(raid-1st), req=11396+1, ra=11428+32-32, async=1) = 32
>>
>> The above trace shows that the readahead logic is working pretty well,
>> however the max readahead size(32 pages) is way too small. This can
>> also be confirmed in the following table, where the average readahead
>> request size/async_size and actual readahead I/O size are all 30.
>>
>> linux-4dtq:/ # cat /sys/kernel/debug/readahead/stats
>> pattern         count sync_count  eof_count       size async_size     actual
>> none                0          0          0          0          0          0
>> initial0           71         71         41          4          3          2
>> initial            23         23          0          4          3          4
>> subsequent       3845          4         21         31         31         31
>> marker           4222          0          1         31         31         31
>> trail               0          0          0          0          0          0
>> oversize            0          0          0          0          0          0
>> reverse             0          0          0          0          0          0
>> stride              0          0          0          0          0          0
>> thrash              0          0          0          0          0          0
>> mmap              135        135         15         32          0         17
>> fadvise           180        180        180          0          0          1
>> random             23         23          2          1          0          1
>> all              8499        436        260         30         30         30
>>                                                     ^^^^^^^^^^^^^^^^^^^^^^^^
>>
>> I suspect that your main performance problem comes from the small read/readahead size.
>> If larger values are used, even the vanilla 2.6.27 kernel will perform well.
>
> Yes, it was misconfiguration on our side: readahead size was not set  
> correctly on all devices. In the correct configuration context based RA  
> shows constant advantage over the current vanilla algorithm, but not as  
> much as I would expect. It still performs considerably worse, than in  
> case when all the IO threads work in the same IO context. To remind, our  
> setup and tests described in http://lkml.org/lkml/2009/2/12/277.

Vladislav, thank you very much for the great efforts and details!

> Here are the conclusions from tests:
>
>  1. Making all IO threads work in the same IO context with CFQ (vanilla  
> RA and default RA size) brings near 100% link utilization on single  
> stream reads (100MB/s) and with deadline about 50% (50MB/s). I.e. there  
> is 100% improvement of CFQ over deadline. With 2 read streams CFQ has  
> ever more advantage: >400% (23MB/s vs 5MB/s).

The ideal 2-stream throughput should be >60MB/s, so I guess there are
still room of improvements for the CFQ's 23MB/s?

>  2. All IO threads work in different IO contexts. With vanilla RA and  
> default RA size CFQ performs 50% worse (48MB/s), even worse than 
> deadline.
>
>  3. All IO threads work in different IO contexts. With default RA size  
> context RA brings on single stream 40% improvement with deadline (71MB/s  
> vs 51MB/s), no improvement with cfq (48MB/s).
>
>  4. All IO threads work in different IO contexts. With higher RA sizes  
> there is stable 6% improvement with context RA over vanilla RA with CFQ  
> starting from 20%. Deadline performs similarly. In parallel reads  
> improvement is bigger: 30% on 4M RA size with deadline (39MB/s vs 27MB/s)

Your attached readahead trace shows that context RA is submitting
perfect 256-page async readahead IOs. (However the readahead-subsequent
cache hits are puzzling.)

The vanilla RA detects concurrent streams in a passive/opportunistic way.
The context RA works in an active/guaranteed way. It's also better at
serving the NFS style cooperative io processes. And your SCST workload
looks very like NFS.

The one fact I cannot understand is that SCST seems to breaking up the
client side 64K reads into server side 4K reads(above readahead layer).
But I remember you told me that SCST don't do NFS rsize style split-ups.
Is this a bug? The 4K read size is too small to be CPU/network friendly...
Where are the split-up and re-assemble done? On the client side or
internal to the server?

>  5. All IO threads work in different IO contexts. The best performance  
> achieved with RA maximum size 4M on both RA algorithms, but starting  
> from size 1M it starts growing very slowly.

True.

>  6. Unexpected result. In case, when ll IO threads work in the same IO  
> context with CFQ increasing RA size *decreases* throughput. I think this  
> is, because RA requests performed as single big READ requests, while  
> requests coming from remote clients are much smaller in size (up to  
> 256K), so, when the read by RA data transferred to the remote client on  
> 100MB/s speed, the backstorage media gets rotated a bit, so the next  
> read request must wait the rotation latency (~0.1ms on 7200RPM). This is  
> well conforms with (3) above, when context RA has 40% advantage over  
> vanilla RA with default RA, but much smaller with higher RA.

Maybe. But the readahead IOs (as showed by the trace) are _async_ ones...

> Bottom line IMHO conclusions:
>
> 1. Context RA should be considered after additional examination to  
> replace current RA algorithm in the kernel

That's my plan to push context RA to mainline. And thank you very much
for providing and testing out a real world application for it!

> 2. It would be better to increase default RA size to 1024K

That's a long wish to increase the default RA size. However I have a
vague feeling that it would be better to first make the lower layers
more smart on max_sectors_kb granularity request splitting and batching.

> *AND* one of the following:
>
> 3.1. All RA requests should be split in smaller requests with size up to  
> 256K, which should not be merged with any other request

Are you referring to max_sectors_kb?

What's your max_sectors_kb and nr_requests? Something like

        grep -r . /sys/block/sda/queue/

> OR
>
> 3.2. New RA requests should be sent before the previous one completed to  
> don't let the storage device rotate too far to need full rotation to  
> serve the next request.

Linus has a mmap readahead cleanup patch that can do this. It
basically replaces a {find_lock_page(); readahead();} sequence into
{find_get_page(); readahead(); lock_page();}.

I'll try to push that patch into mainline.

> I like suggestion 3.1 a lot more, since it should be simple to implement  
> and has the following 2 positive side effects:
>
> 1. It would allow to minimize negative effect of higher RA size on the  
> I/O delay latency by allowing CFQ to switch to too long waiting  
> requests, when necessary.
>
> 2. It would allow better requests pipelining, which is very important to  
> minimize uplink latency for synchronous requests (i.e. with only one IO  
> request at time, next request issued, when the previous one completed).  
> You can see in http://www.3ware.com/kb/article.aspx?id=11050 that 3ware  
> recommends for maximum performance set max_sectors_kb as low as *64K*  
> with 16MB RA. It allows to maximize serving commands pipelining. And  
> this suggestion really works allowing to improve throughput in 50-100%!
>
> Here are the raw numbers. I also attached context RA debug output for  
> 2MB RA size case for your viewing pleasure.

Thank you, it helped a lot!

(Can I wish a CONFIG_PRINTK_TIME=y next time? :-)

Thanks,
Fengguang

> --------------------------------------------------------------------
>
> Performance baseline: all IO threads work in the same IO context,  
> current vanilla RA, default RA size:
>
> CFQ scheduler:
>
> #dd if=/dev/sdb of=/dev/null bs=64K count=80000
>         a) 102 MB/s
>         b) 102 MB/s
>         c) 102 MB/s
>
> Run at the same time:
> #while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done
> #dd if=/dev/sdb of=/dev/null bs=64K count=80000
>         a) 21,6 MB/s
>         b) 22,8 MB/s
>         c) 24,1 MB/s
>         d) 23,1 MB/s
>
> Deadline scheduler:
>
> #dd if=/dev/sdb of=/dev/null bs=64K count=80000
>         a) 51,1 MB/s
>         b) 51,4 MB/s
>         c) 51,1 MB/s
>
> Run at the same time:
> #while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done
> #dd if=/dev/sdb of=/dev/null bs=64K count=80000
>         a) 4,7 MB/s
>         b) 4,6 MB/s
>         c) 4,8 MB/s
>
> --------------------------------------------------------------------
>
> RA performance baseline: all IO threads work in different IO contexts,  
> current vanilla RA, default RA size:
>
> CFQ:
>
> #dd if=/dev/sdb of=/dev/null bs=64K count=80000
>         a) 48,6 MB/s
>         b) 49,2 MB/s
>         c) 48,9 MB/s
>
> Run at the same time:
> #while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done
> #dd if=/dev/sdb of=/dev/null bs=64K count=80000
>         a) 4,2 MB/s
>         b) 3,9 MB/s
>         c) 4,1 MB/s
>
> Deadline:
>
> 1) dd if=/dev/sdb of=/dev/null bs=64K count=80000
>         a) 53,2 MB/s
>         b) 51,8 MB/s
>         c) 51,6 MB/s
>
> Run at the same time:
> #while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done
> #dd if=/dev/sdb of=/dev/null bs=64K count=80000
>         a) 5,1 MB/s
>         b) 4,6 MB/s
>         c) 4,8 MB/s
>
> --------------------------------------------------------------------
>
> Context RA, all IO threads work in different IO contexts, default RA size:
>
> CFQ:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
>         a) 47,9 MB/s
>         b) 48,2 MB/s
>         c) 48,1 MB/s
>
> Run at the same time:
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
>         a) 3,5 MB/s
>         b) 3,6 MB/s
>         c) 3,8 MB/s
>
> Deadline:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
>         a) 72,4 MB/s
>         b) 68,3 MB/s
>         c) 71,3 MB/s
>
> Run at the same time:
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
>         a) 4,3 MB/s
>         b) 5,0 MB/s
>         c) 4,8 MB/s
>
> --------------------------------------------------------------------
>
> Vanilla RA, all IO threads work in different IO contexts, various RA sizes:
>
> CFQ:
>
> RA 512K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 60,5 MB/s
> 	b) 59,3 MB/s
> 	c) 59,7 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 9,4 MB/s
> 	b) 9,4 MB/s
> 	c) 9,1 MB/s
>
> --- RA 1024K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 74,7 MB/s
> 	b) 73,2 MB/s
> 	c) 74,1 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 13,7 MB/s
> 	b) 13,6 MB/s
> 	c) 13,1 MB/s
>
> --- RA 2048K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 76,7 MB/s
> 	b) 76,8 MB/s
> 	c) 76,6 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 21,8 MB/s
> 	b) 22,1 MB/s
> 	c) 20,3 MB/s
>
> --- RA 4096K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 80,8 MB/s
> 	b) 80.8 MB/s
> 	c) 80,3 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 29,6 MB/s
> 	b) 29,4 MB/s
> 	c) 27,2 MB/s
>
> === Deadline:
>
> RA 512K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 68,4 MB/s
> 	b) 67,0 MB/s
> 	c) 67,6 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 8,8 MB/s
> 	b) 8,9 MB/s
> 	c) 8,7 MB/s
>
> --- RA 1024K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 81,0 MB/s
> 	b) 82,4 MB/s
> 	c) 81,7 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 13,5 MB/s
> 	b) 13,1 MB/s
> 	c) 12,9 MB/s
>
> --- RA 2048K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 81,1 MB/s
> 	b) 80,1 MB/s
> 	c) 81,8 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 21,9 MB/s
> 	b) 20,7 MB/s
> 	c) 21,3 MB/s
>
> --- RA 4096K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 83,1 MB/s
> 	b) 82,7 MB/s
> 	c) 82,9 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 27,9 MB/s
> 	b) 23,5 MB/s
> 	c) 27,6 MB/s
>
> --------------------------------------------------------------------
>
> Context RA, all IO threads work in different IO contexts, various RA sizes:
>
> CFQ:
>
> RA 512K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 63,7 MB/s
> 	b) 63,5 MB/s
> 	c) 62,8 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 7,1 MB/s
> 	b) 6,7 MB/s
> 	c) 7,0 MB/s	
> 	d) 6,9 MB/s
>
> --- RA 1024K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 81,1 MB/s
> 	b) 81,8 MB/s
> 	c)  MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 14,1 MB/s
> 	b) 14,0 MB/s
> 	c) 14,1 MB/s
>
> --- RA 2048K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 81,6 MB/s
> 	b) 81,4 MB/s
> 	c) 86,0 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 22,3 MB/s
> 	b) 21,5 MB/s
> 	c) 21,7 MB/s
>
> --- RA 4096K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 83,1 MB/s
> 	b) 83,5 MB/s
> 	c) 82,9 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 32,8 MB/s
> 	b) 32,7 MB/s
> 	c) 30,2 MB/s
>
> === Deadline:
>
> RA 512K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 68,8 MB/s
> 	b) 68,9 MB/s
> 	c) 69,0 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 8,7 MB/s
> 	b) 9,0 MB/s
> 	c) 8,9 MB/s
>
> --- RA 1024K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 83,5 MB/s
> 	b) 83,1 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 14,0 MB/s
> 	b) 13.9 MB/s
> 	c) 13,8 MB/s
>
> --- RA 2048K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 82,6 MB/s
> 	b) 82,4 MB/s
> 	c) 81,9 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 21,9 MB/s
> 	b) 23,1 MB/s
> 	c) 17,8 MB/s	
> 	d) 21,1 MB/s
>
> --- RA 4096K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 84,5 MB/s
> 	b) 83,7 MB/s
> 	c) 83,8 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 39,9 MB/s
> 	b) 39,5 MB/s
> 	c) 38,4 MB/s
>
> --------------------------------------------------------------------
>
> all IO threads work in the same IO context, context RA, various RA sizes:
>
> === CFQ:
>
> --- RA 512K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 86,4 MB/s
> 	b) 87,9 MB/s
> 	c) 86,7 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 17,8 MB/s
> 	b) 18,3 MB/s
> 	c) 17,7 MB/s
>
> --- RA 1024K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 83,3 MB/s
> 	b) 81,6 MB/s
> 	c) 81,9 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 22,1 MB/s
> 	b) 21,5 MB/s
> 	c) 21,2 MB/s
>
> --- RA 2048K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 81,1 MB/s
> 	b) 81,0 MB/s
> 	c) 81,6 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 22,2 MB/s
> 	b) 20,2 MB/s
> 	c) 20,9 MB/s
>
> --- RA 4096K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 83,4 MB/s
> 	b) 82,8 MB/s
> 	c) 83,3 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 22,6 MB/s
> 	b) 23,4 MB/s
> 	c) 21,8 MB/s
>
> === Deadline:
>
> --- RA 512K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 70,0 MB/s
> 	b) 70,7 MB/s
> 	c) 69,7 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 9,1 MB/s
> 	b) 8,3 MB/s
> 	c) 8,4 MB/s	
>
> --- RA 1024K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 84,3 MB/s
> 	b) 83,2 MB/s
> 	c) 83,3 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 13,9 MB/s
> 	b) 13,1 MB/s
> 	c) 13,4 MB/s
>
> --- RA 2048K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 82,6 MB/s
> 	b) 82,1 MB/s
> 	c) 82,3 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 21,6 MB/s
> 	b) 22,4 MB/s
> 	c) 21,3 MB/s	
>
> --- RA 4096K:
>
> dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 83,8 MB/s
> 	b) 83,8 MB/s
> 	c) 83,1 MB/s
>
> Run at the same time:		
> linux-4dtq:~ # while true; do dd if=/dev/sdc  of=/dev/null bs=64K; done	
> linux-4dtq:~ # dd if=/dev/sdb of=/dev/null bs=64K count=80000
> 	a) 39,5 MB/s
> 	b) 39,6 MB/s
> 	c) 37,0 MB/s
>
> Thanks,
> Vlad

next prev parent reply	other threads:[~2009-03-23  1:43 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-09 18:04 Slow file transfer speeds with CFQ IO scheduler in some cases Vitaly V. Bursov
2008-11-09 18:30 ` Alexey Dobriyan
2008-11-09 18:32   ` Vitaly V. Bursov
2008-11-10 10:44 ` Jens Axboe
2008-11-10 13:51   ` Jeff Moyer
2008-11-10 13:56     ` Jens Axboe
2008-11-10 17:16       ` Vitaly V. Bursov
2008-11-10 17:35         ` Jens Axboe
2008-11-10 18:27           ` Vitaly V. Bursov
2008-11-10 18:29             ` Jens Axboe
2008-11-10 18:39               ` Jeff Moyer
2008-11-10 18:42               ` Jens Axboe
2008-11-10 21:51             ` Jeff Moyer
2008-11-11  9:34               ` Jens Axboe
2008-11-11  9:35                 ` Jens Axboe
2008-11-11 11:52                   ` Jens Axboe
2008-11-11 16:48                     ` Jeff Moyer
2008-11-11 18:08                       ` Jens Axboe
2008-11-11 16:53                     ` Vitaly V. Bursov
2008-11-11 18:06                       ` Jens Axboe
2008-11-11 19:36                         ` Jeff Moyer
2008-11-11 21:41                           ` Jeff Layton
2008-11-11 21:59                             ` Jeff Layton
2008-11-12 12:20                               ` Jens Axboe
2008-11-12 12:45                                 ` Jeff Layton
2008-11-12 12:54                                   ` Christoph Hellwig
2008-11-11 19:42                         ` Vitaly V. Bursov
2008-11-12 18:32       ` Jeff Moyer
2008-11-12 19:02         ` Jens Axboe
2008-11-13  8:51           ` Wu Fengguang
2008-11-13  8:54             ` Jens Axboe
2008-11-14  1:36               ` Wu Fengguang
2008-11-25 11:02                 ` Vladislav Bolkhovitin
2008-11-25 11:25                   ` Wu Fengguang
2008-11-25 15:21                   ` Jeff Moyer
2008-11-25 16:17                     ` Vladislav Bolkhovitin
2008-11-13 18:46             ` Vitaly V. Bursov
2008-11-25 10:59             ` Vladislav Bolkhovitin
2008-11-25 11:30               ` Wu Fengguang
2008-11-25 11:41                 ` Vladislav Bolkhovitin
2008-11-25 11:49                   ` Wu Fengguang
2008-11-25 12:03                     ` Vladislav Bolkhovitin
2008-11-25 12:09                       ` Vladislav Bolkhovitin
2008-11-25 12:15                         ` Wu Fengguang
2008-11-27 17:46                           ` Vladislav Bolkhovitin
     [not found]                             ` <492EDCFB.7080302-d+Crzxg7Rs0@public.gmane.org>
2008-11-28  0:48                               ` Wu Fengguang
2008-11-28  0:48                                 ` Wu Fengguang
2009-02-12 18:35                                 ` Vladislav Bolkhovitin
2009-02-13  1:57                                   ` Wu Fengguang
2009-02-13 20:08                                     ` Vladislav Bolkhovitin
2009-02-13 20:08                                       ` Vladislav Bolkhovitin
     [not found]                                       ` <4995D339.5050502-d+Crzxg7Rs0@public.gmane.org>
2009-02-16  2:34                                         ` Wu Fengguang
2009-02-16  2:34                                           ` Wu Fengguang
2009-02-17 19:03                                           ` Vladislav Bolkhovitin
2009-02-17 19:03                                             ` Vladislav Bolkhovitin
2009-02-18 18:14                                             ` Vladislav Bolkhovitin
2009-02-19  1:35                                             ` Wu Fengguang
2009-02-17 19:01                                     ` Vladislav Bolkhovitin
2009-02-17 19:01                                       ` Vladislav Bolkhovitin
2009-02-19  2:05                                       ` Wu Fengguang
2009-03-19 17:44                                         ` Vladislav Bolkhovitin
2009-03-20  8:53                                           ` Vladislav Bolkhovitin
2009-03-23  1:42                                           ` Wu Fengguang [this message]
2009-04-21 18:18                                             ` Vladislav Bolkhovitin
2009-04-24  8:43                                               ` Wu Fengguang
2009-05-12 18:13                                                 ` Vladislav Bolkhovitin
     [not found]                                   ` <49946BE6.1040005-d+Crzxg7Rs0@public.gmane.org>
2009-02-17 19:01                                     ` Vladislav Bolkhovitin
2009-02-17 19:01                                       ` Vladislav Bolkhovitin
     [not found]                                       ` <499B0979.8050006-d+Crzxg7Rs0@public.gmane.org>
2009-02-19  1:38                                         ` Wu Fengguang
2009-02-19  1:38                                           ` Wu Fengguang
2008-11-24 15:33           ` Jeff Moyer
2008-11-24 18:13             ` Jens Axboe
2008-11-24 18:50               ` Jeff Moyer
2008-11-24 18:51                 ` Jens Axboe
2008-11-13  6:54         ` Vitaly V. Bursov
2008-11-13 14:32           ` Jeff Moyer
2008-11-13 18:33             ` Vitaly V. Bursov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090323014220.GA10448@localhost \
    --to=wfg@linux.intel.com \
    --cc=jens.axboe@oracle.com \
    --cc=jmoyer@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=lukasz.jurewicz@gmail.com \
    --cc=vitalyb@telenet.dn.ua \
    --cc=vst@vlnb.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.