From: juncheng bai <baijuncheng@unitedstack.com>
To: Ilya Dryomov <idryomov@gmail.com>
Cc: idryomov@redhat.com, Alex Elder <elder@linaro.org>,
Josh Durgin <josh.durgin@inktank.com>,
lucienchao@gmail.com, jeff@garzik.org, yehuda@hq.newdream.net,
Sage Weil <sage@newdream.net>,
elder@inktank.com,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Ceph Development <ceph-devel@vger.kernel.org>
Subject: Re: [PATCH RFC] storage:rbd: make the size of request is equal to the size of the object
Date: Tue, 16 Jun 2015 11:28:11 +0800
Message-ID: <557F97CB.6070608@unitedstack.com>
In-Reply-To: <CAOi1vP_VLVVhoFrJ+ETRa7+o+sAjyHquZR_g2kYdO9n-8jxdoQ@mail.gmail.com>
On 2015/6/15 22:27, Ilya Dryomov wrote:
> On Mon, Jun 15, 2015 at 4:23 PM, juncheng bai
> <baijuncheng@unitedstack.com> wrote:
>>
>>
>> On 2015/6/15 21:03, Ilya Dryomov wrote:
>>>
>>> On Mon, Jun 15, 2015 at 2:18 PM, juncheng bai
>>> <baijuncheng@unitedstack.com> wrote:
>>>>
>>>> From 6213215bd19926d1063d4e01a248107dab8a899b Mon Sep 17 00:00:00 2001
>>>> From: juncheng bai <baijuncheng@unitedstack.com>
>>>> Date: Mon, 15 Jun 2015 18:34:00 +0800
>>>> Subject: [PATCH] storage:rbd: make the size of request is equal to the
>>>> size of the object
>>>>
>>>> ensures that the merged size of request can achieve the size of
>>>> the object.
>>>> when merge a bio to request or merge a request to request, the
>>>> sum of the segment number of the current request and the segment
>>>> number of the bio is not greater than the max segments of the request,
>>>> so the max size of request is 512k if the max segments of request is
>>>> BLK_MAX_SEGMENTS.
>>>>
>>>> Signed-off-by: juncheng bai <baijuncheng@unitedstack.com>
>>>> ---
>>>> drivers/block/rbd.c | 2 ++
>>>> 1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
>>>> index 0a54c58..dec6045 100644
>>>> --- a/drivers/block/rbd.c
>>>> +++ b/drivers/block/rbd.c
>>>> @@ -3757,6 +3757,8 @@ static int rbd_init_disk(struct rbd_device
>>>> *rbd_dev)
>>>> segment_size = rbd_obj_bytes(&rbd_dev->header);
>>>> blk_queue_max_hw_sectors(q, segment_size / SECTOR_SIZE);
>>>> blk_queue_max_segment_size(q, segment_size);
>>>> + if (segment_size > BLK_MAX_SEGMENTS * PAGE_SIZE)
>>>> + blk_queue_max_segments(q, segment_size / PAGE_SIZE);
>>>> blk_queue_io_min(q, segment_size);
>>>> blk_queue_io_opt(q, segment_size);
>>>
>>>
>>> I made a similar patch on Friday, investigating blk-mq plugging issue
>>> reported by Nick. My patch sets it to BIO_MAX_PAGES unconditionally -
>>> AFAIU there is no point in setting to anything bigger since the bios
>>> will be clipped to that number of vecs. Given that BIO_MAX_PAGES is
>>> 256, this gives us 1M direct I/Os.
>>
>> Hi. For a single bio, the max number of bio_vecs is BIO_MAX_PAGES, but a
>> request can be merged from multiple bios. See, for example,
>> ll_back_merge_fn() and ll_front_merge_fn().
>> I tested this patch on kernel 3.18 and ran:
>> echo 4096 > /sys/block/rbd0/queue/max_sectors_kb
>> We used systemtap to trace the request size; it is up to 4M.
>
> Kernel 3.18 is pre rbd blk-mq transition, which happened in 4.0. You
> should test whatever patches you have with at least 4.0.
>
> Putting that aside, I must be missing something. You'll get 4M
> requests on 3.18 both with your patch and without it, the only
> difference would be the size of bios being merged - 512k vs 1M. Can
> you describe your test workload and provide before and after traces?
>
Hi. I updated the kernel to 4.0.5. The test information is shown below.
Base information:
03:28:13-root@server-186:~$uname -r
4.0.5
My simple systemtap script:
probe module("rbd").function("rbd_img_request_create")
{
	printf("offset:%lu length:%lu\n", ulong_arg(2), ulong_arg(3));
}
I used dd to run the test case:
dd if=/dev/zero of=/dev/rbd0 bs=4M count=1 oflag=direct
Case one: Without patch
03:30:23-root@server-186:~$cat /sys/block/rbd0/queue/max_sectors_kb
4096
03:30:35-root@server-186:~$cat /sys/block/rbd0/queue/max_segments
128
The systemtap output for normal data:
offset:0 length:524288
offset:524288 length:524288
offset:1048576 length:524288
offset:1572864 length:524288
offset:2097152 length:524288
offset:2621440 length:524288
offset:3145728 length:524288
offset:3670016 length:524288
Case two: With patch
cat /sys/block/rbd0/queue/max_sectors_kb
4096
03:49:14-root@server-186:linux-4.0.5$cat /sys/block/rbd0/queue/max_segments
1024
The systemtap output for normal data:
offset:0 length:1048576
offset:1048576 length:1048576
offset:2097152 length:1048576
offset:3145728 length:1048576
According to the test, you are right.
This is because blk-mq doesn't use any scheduling policy:
03:52:13-root@server-186:linux-4.0.5$cat /sys/block/rbd0/queue/scheduler
none
In kernel versions before 4.0, rbd used the default scheduler, cfq.
So I think blk-mq needs to do more?
> Thanks,
>
> Ilya
>