* [PATCH 0/2] Improve odirect-write performance for block-device.
From: majianpeng @ 2012-07-16  1:29 UTC
  To: Neil Brown, viro; +Cc: linux-raid, linux-fsdevel

Create a raid5 using four disks with a chunk size of 512K.
Test command is: dd if=/dev/zero of=/dev/md0 bs=1536K count=90000 oflag=direct

In RHEL6 (kernel 2.6.32): speed about 240MB/s
In 3.5.0-rc5: speed about 77MB/s
With the two patches applied to 3.5.0-rc5: speed about 200MB/s

So O_DIRECT write performance for block devices has clearly regressed.
PATCH 1/2: Add blk_plug around O_DIRECT writes to block devices.
PATCH 2/2: Remove REQ_SYNC for O_DIRECT writes in raid456.

PATCH 2/2 may not be correct, because it also affects O_DIRECT writes to regular files.
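
In outline, PATCH 1/2 just brackets the block-device write in a plug, along
these lines (a sketch, not the exact diff; the error-handling tail follows
3.5-rc5 fs/block_dev.c as closely as I can reproduce it and may differ in
detail):

static ssize_t blkdev_aio_write(struct kiocb *iocb, const struct iovec *iov,
				unsigned long nr_segs, loff_t pos)
{
	struct file *file = iocb->ki_filp;
	struct blk_plug plug;
	ssize_t ret;

	BUG_ON(iocb->ki_pos != pos);

	/* Hold a plug across the whole O_DIRECT write so the bios for all
	 * chunks are batched before md sees them, instead of reaching the
	 * array one by one. */
	blk_start_plug(&plug);
	ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
	if (ret > 0 || ret == -EIOCBQUEUED) {
		ssize_t err;

		err = generic_write_sync(file, pos, ret);
		if (err < 0 && ret > 0)
			ret = err;
	}
	blk_finish_plug(&plug);
	return ret;
}
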
Jianpeng Ma (2):
  fs/block-dev.c:fix performance regression in O_DIRECT writes to    
    md block devices.
  raid5: For write performance, remove REQ_SYNC when write was odirect.

 drivers/md/raid5.c |    3 +++
 fs/block_dev.c     |    7 ++++++-
 2 files changed, 9 insertions(+), 1 deletions(-)

-- 
1.7.5.4


* Re: [PATCH 0/2] Improve odirect-write performance for block-device.
From: Shaohua Li @ 2012-07-16  3:29 UTC
  To: majianpeng; +Cc: Neil Brown, viro, linux-raid, linux-fsdevel

2012/7/15 majianpeng <majianpeng@gmail.com>:
> Create a raid5 using four disks with a chunk size of 512K.
> Test command is: dd if=/dev/zero of=/dev/md0 bs=1536K count=90000 oflag=direct
>
> In RHEL6 (kernel 2.6.32): speed about 240MB/s
> In 3.5.0-rc5: speed about 77MB/s
> With the two patches applied to 3.5.0-rc5: speed about 200MB/s
>
> So O_DIRECT write performance for block devices has clearly regressed.
> PATCH 1/2: Add blk_plug around O_DIRECT writes to block devices.
> PATCH 2/2: Remove REQ_SYNC for O_DIRECT writes in raid456.
>
> PATCH 2/2 may not be correct, because it also affects O_DIRECT writes to regular files.
> Jianpeng Ma (2):
>   fs/block-dev.c:fix performance regression in O_DIRECT writes to
>     md block devices.

In raid5, all requests are submitted by the raid5d thread, which already
has a plug. Why doesn't that work?

>   raid5: For write performance, remove REQ_SYNC when write was odirect.

REQ_SYNC only impacts CFQ, so this doesn't sound reasonable. The disks
must be using the CFQ I/O scheduler. Can you check whether you see the
same issue with deadline?

Let me guess: without REQ_SYNC, reads get higher priority than writes
in CFQ, so writes get delayed and perhaps see better request merging.
With REQ_SYNC, reads and writes have the same priority, so there is
less merging.

Thanks,
Shaohua


* Re: [PATCH 0/2] Improve odirect-write performance for block-device.
From: majianpeng @ 2012-07-16  5:43 UTC
  To: shli; +Cc: Neil Brown, viro, linux-raid, linux-fsdevel

On 2012-07-16 11:29 Shaohua Li <shli@kernel.org> Wrote:
>2012/7/15 majianpeng <majianpeng@gmail.com>:
>> Create a raid5 using four disks with a chunk size of 512K.
>> Test command is: dd if=/dev/zero of=/dev/md0 bs=1536K count=90000 oflag=direct
>>
>> In RHEL6 (kernel 2.6.32): speed about 240MB/s
>> In 3.5.0-rc5: speed about 77MB/s
>> With the two patches applied to 3.5.0-rc5: speed about 200MB/s
>>
>> So O_DIRECT write performance for block devices has clearly regressed.
>> PATCH 1/2: Add blk_plug around O_DIRECT writes to block devices.
>> PATCH 2/2: Remove REQ_SYNC for O_DIRECT writes in raid456.
>>
>> PATCH 2/2 may not be correct, because it also affects O_DIRECT writes to regular files.
>> Jianpeng Ma (2):
>>   fs/block-dev.c:fix performance regression in O_DIRECT writes to
>>     md block devices.
>
>In raid5, all requests are submitted by the raid5d thread, which already
>has a plug. Why doesn't that work?
No. The purpose of the two patches is to reduce the reads triggered when a write is not a full-stripe write.
I tested on RHEL6: the read traffic was zero. But on 3.5.0-rc5, the reads can equal the writes.
And I used bs=1536K, i.e. 3 * 512K (the chunk size); on a 4-disk raid5 a stripe is 3 data chunks plus parity, so each 1536K write covers exactly one full stripe and should need no reads.
>
>>   raid5: For write performance, remove REQ_SYNC when write was odirect.
>
>REQ_SYNC only impacts CFQ, so this doesn't sound reasonable. The disks
>must be using the CFQ I/O scheduler. Can you check whether you see the
>same issue with deadline?
I tested, and the result is the same as with CFQ.
But note that on RHEL6 the I/O scheduler is also CFQ.
>
>Let me guess: without REQ_SYNC, reads get higher priority than writes
>in CFQ, so writes get delayed and perhaps see better request merging.
>With REQ_SYNC, reads and writes have the same priority, so there is
>less merging.
>
>Thanks,
>Shaohua
For hard disks, the reads caused by non-full-stripe writes sharply reduce performance.
So the first goal is to make writes full-stripe writes as far as possible.


* Re: [PATCH 0/2] Improve odirect-write performance for block-device.
From: Shaohua Li @ 2012-07-16 13:21 UTC
  To: majianpeng; +Cc: Neil Brown, viro, linux-raid, linux-fsdevel

2012/7/15 majianpeng <majianpeng@gmail.com>:
> On 2012-07-16 11:29 Shaohua Li <shli@kernel.org> Wrote:
>>2012/7/15 majianpeng <majianpeng@gmail.com>:
>>> Create a raid5 using four disks with a chunk size of 512K.
>>> Test command is: dd if=/dev/zero of=/dev/md0 bs=1536K count=90000 oflag=direct
>>>
>>> In RHEL6 (kernel 2.6.32): speed about 240MB/s
>>> In 3.5.0-rc5: speed about 77MB/s
>>> With the two patches applied to 3.5.0-rc5: speed about 200MB/s
>>>
>>> So O_DIRECT write performance for block devices has clearly regressed.
>>> PATCH 1/2: Add blk_plug around O_DIRECT writes to block devices.
>>> PATCH 2/2: Remove REQ_SYNC for O_DIRECT writes in raid456.
>>>
>>> PATCH 2/2 may not be correct, because it also affects O_DIRECT writes to regular files.
>>> Jianpeng Ma (2):
>>>   fs/block-dev.c:fix performance regression in O_DIRECT writes to
>>>     md block devices.
>>
>>In raid5, all requests are submitted by the raid5d thread, which already
>>has a plug. Why doesn't that work?
> No. The purpose of the two patches is to reduce the reads triggered when a write is not a full-stripe write.
> I tested on RHEL6: the read traffic was zero. But on 3.5.0-rc5, the reads can equal the writes.
> And I used bs=1536K, i.e. 3 * 512K (the chunk size); on a 4-disk raid5 a stripe is 3 data chunks plus parity, so each 1536K write covers exactly one full stripe and should need no reads.

Yes, I know. But I want to understand why we need the plug in your
test. The IO is dispatched from raid5d, which already has a plug.
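
(For reference, raid5d in this era is already plugged, roughly like this;
a much-simplified sketch of the 3.5 main loop with most of the bookkeeping
elided:)

static void raid5d(struct mddev *mddev)
{
	struct r5conf *conf = mddev->private;
	struct blk_plug plug;

	md_check_recovery(mddev);

	/* Every bio raid5d submits below is batched under this plug. */
	blk_start_plug(&plug);
	spin_lock_irq(&conf->device_lock);
	for (;;) {
		struct stripe_head *sh = __get_priority_stripe(conf);

		if (!sh)
			break;
		spin_unlock_irq(&conf->device_lock);
		handle_stripe(sh);
		release_stripe(sh);
		spin_lock_irq(&conf->device_lock);
	}
	spin_unlock_irq(&conf->device_lock);
	blk_finish_plug(&plug);
}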

Fengguang once posted a patch to move the plug from generic_file_aio_write
to do_blockdev_direct_IO, which sounds better.

>>>   raid5: For write performance, remove REQ_SYNC when write was odirect.
>>
>>REQ_SYNC only impacts CFQ, so this doesn't sound reasonable. The disks
>>must be using the CFQ I/O scheduler. Can you check whether you see the
>>same issue with deadline?
> I tested, and the result is the same as with CFQ.
> But note that on RHEL6 the I/O scheduler is also CFQ.
>>
>>Let me guess: without REQ_SYNC, reads get higher priority than writes
>>in CFQ, so writes get delayed and perhaps see better request merging.
>>With REQ_SYNC, reads and writes have the same priority, so there is
>>less merging.
>>
>>Thanks,
>>Shaohua
> For hard disks, the reads caused by non-full-stripe writes sharply reduce performance.
> So the first goal is to make writes full-stripe writes as far as possible.

Yes, this is the symptom, but I'd like to understand why REQ_SYNC makes
the difference.

Thanks,
Shaohua


* Re: [PATCH 0/2] Improve odirect-write performance for block-device.
From: majianpeng @ 2012-07-17  1:13 UTC
  To: shli; +Cc: Neil Brown, viro, linux-raid, linux-fsdevel

On 2012-07-16 21:21 Shaohua Li <shli@kernel.org> Wrote:
>2012/7/15 majianpeng <majianpeng@gmail.com>:
>> On 2012-07-16 11:29 Shaohua Li <shli@kernel.org> Wrote:
>>>2012/7/15 majianpeng <majianpeng@gmail.com>:
>>>> Create a raid5 using four disks with a chunk size of 512K.
>>>> Test command is: dd if=/dev/zero of=/dev/md0 bs=1536K count=90000 oflag=direct
>>>>
>>>> In RHEL6 (kernel 2.6.32): speed about 240MB/s
>>>> In 3.5.0-rc5: speed about 77MB/s
>>>> With the two patches applied to 3.5.0-rc5: speed about 200MB/s
>>>>
>>>> So O_DIRECT write performance for block devices has clearly regressed.
>>>> PATCH 1/2: Add blk_plug around O_DIRECT writes to block devices.
>>>> PATCH 2/2: Remove REQ_SYNC for O_DIRECT writes in raid456.
>>>>
>>>> PATCH 2/2 may not be correct, because it also affects O_DIRECT writes to regular files.
>>>> Jianpeng Ma (2):
>>>>   fs/block-dev.c:fix performance regression in O_DIRECT writes to
>>>>     md block devices.
>>>
>>>In raid5, all requests are submitted by the raid5d thread, which already
>>>has a plug. Why doesn't that work?
>> No. The purpose of the two patches is to reduce the reads triggered when a write is not a full-stripe write.
>> I tested on RHEL6: the read traffic was zero. But on 3.5.0-rc5, the reads can equal the writes.
>> And I used bs=1536K, i.e. 3 * 512K (the chunk size); on a 4-disk raid5 a stripe is 3 data chunks plus parity, so each 1536K write covers exactly one full stripe and should need no reads.
>
>Yes, I know. But I want to understand why we need the plug in your
>test. The IO is dispatched from raid5d, which already has a plug.
The plug in raid5d only affects blk_queue_bio() on the member disks.
The plug taken in the direct-IO write path additionally affects
mddev_check_plugged(), which gates this code in raid5d:
>if (atomic_read(&mddev->plug_cnt) == 0)
>			raid5_activate_delayed(conf);
>
While the writer's plug keeps plug_cnt non-zero, delayed stripes stay
delayed, so partial writes can accumulate into full stripes before any
preread starts.
So the two plugs do two different jobs.
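
(Roughly how mddev_check_plugged works in 3.5; this is simplified from my
reading of drivers/md/md.c, and the md_plug_cb/plugger_unplug details may
not be letter-exact:)

int mddev_check_plugged(struct mddev *mddev)
{
	struct blk_plug *plug = current->plug;
	struct md_plug_cb *mdcb;

	if (!plug)
		return 0;		/* the writer holds no plug */

	list_for_each_entry(mdcb, &plug->cb_list, cb.list)
		if (mdcb->cb.callback == plugger_unplug &&
		    mdcb->mddev == mddev)
			return 1;	/* already registered on this plug */

	/* Register an unplug callback and raise plug_cnt; at unplug time
	 * the callback drops plug_cnt and wakes the md thread, which can
	 * then activate the delayed stripes. */
	mdcb = kmalloc(sizeof(*mdcb), GFP_ATOMIC);
	if (!mdcb)
		return 0;

	mdcb->mddev = mddev;
	mdcb->cb.callback = plugger_unplug;
	atomic_inc(&mddev->plug_cnt);
	list_add(&mdcb->cb.list, &plug->cb_list);
	return 1;
}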

>Fengguang once posted a patch to move the plug from generic_file_aio_write
>to do_blockdev_direct_IO, which sounds better.
>
I could not find this patch in the kernel.
The write syscall path is:
sys_write ---> vfs_write ---> f_op->write, or do_sync_write ---> f_op->aio_write
For a regular file, aio_write is generic_file_aio_write(), which takes a
blk_plug; so for an O_DIRECT write to a regular file the plug is used, even
though it is not in do_blockdev_direct_IO itself.
For a block device file, aio_write is blkdev_aio_write(), which calls
__generic_file_aio_write ---> generic_file_direct_write ---> a_ops->direct_IO,
which is blkdev_direct_IO.
So for O_DIRECT writes to a block device there is no plug anywhere on the
path; compare the regular-file fragment below.
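
(For comparison, the regular-file path in 3.5, generic_file_aio_write(),
plugs around the write itself; a lightly abridged fragment from mm/filemap.c:)

	mutex_lock(&inode->i_mutex);
	/* plug held across the whole write, including the direct-IO case */
	blk_start_plug(&plug);
	ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
	mutex_unlock(&inode->i_mutex);

	if (ret > 0 || ret == -EIOCBQUEUED) {
		ssize_t err;

		err = generic_write_sync(file, pos, ret);
		if (err < 0 && ret > 0)
			ret = err;
	}
	blk_finish_plug(&plug);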

Can you send the patch or the commit? I want to test which one performs better.
>>>>   raid5: For write performance, remove REQ_SYNC when write was odirect.
>>>
>>>REQ_SYNC only impacts CFQ, so this doesn't sound reasonable. The disks
>>>must be using the CFQ I/O scheduler. Can you check whether you see the
>>>same issue with deadline?
>> I tested, and the result is the same as with CFQ.
>> But note that on RHEL6 the I/O scheduler is also CFQ.
>>>
>>>Let me guess: without REQ_SYNC, reads get higher priority than writes
>>>in CFQ, so writes get delayed and perhaps see better request merging.
>>>With REQ_SYNC, reads and writes have the same priority, so there is
>>>less merging.
>>>
>>>Thanks,
>>>Shaohua
>> For hard disks, the reads caused by non-full-stripe writes sharply reduce performance.
>> So the first goal is to make writes full-stripe writes as far as possible.
>
>Yes, this is the symptom, but I'd like to understand why REQ_SYNC makes
>the difference.
>
Because of REQ_SYNC, the stripe is marked STRIPE_PREREAD_ACTIVE, so its
handling is not delayed and raid5 immediately reads in the blocks the write
did not cover.
Those reads sharply reduce performance on hard disks.
I do not have an SSD, so I don't know the effect on SSDs.
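
(The relevant hunk in raid5's make_request looks roughly like this in
3.5-rc5; and if I read fs/direct-io.c right, O_DIRECT writes are submitted
as WRITE_ODIRECT, i.e. WRITE | REQ_SYNC, so every direct write takes this
branch:)

		/* A REQ_SYNC write marks the stripe "preread active":
		 * raid5 stops waiting for more writes to fill the stripe
		 * and starts the read-modify-write cycle at once, issuing
		 * reads for the blocks the write did not cover. */
		if ((bi->bi_rw & REQ_SYNC) &&
		    !test_and_set_bit(STRIPE_PREREAD_ACTIVE, &sh->state))
			atomic_inc(&conf->preread_active_stripes);
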
>Thanks,
>Shaohua

