From: Max Reitz <mreitz@redhat.com>
To: Peter Lieven <pl@kamp.de>, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, famz@redhat.com, armbru@redhat.com,
stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge
Date: Mon, 20 Oct 2014 13:51:12 +0200 [thread overview]
Message-ID: <5444F730.4000003@redhat.com> (raw)
In-Reply-To: <5444DDF4.40800@kamp.de>
On 2014-10-20 at 12:03, Peter Lieven wrote:
> On 20.10.2014 11:27, Max Reitz wrote:
>> On 2014-10-20 at 11:14, Peter Lieven wrote:
>>> On 20.10.2014 10:59, Max Reitz wrote:
>>>> On 2014-10-20 at 08:14, Peter Lieven wrote:
>>>>> the block layer silently merges write requests since
>>>>
>>>> s/^t/T/
>>>>
>>>>> commit 40b4f539. This patch adds a knob to disable
>>>>> this feature as there has been some discussion lately
>>>>> if multiwrite is a good idea at all and as it falsifies
>>>>> benchmarks.
>>>>>
>>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>>> ---
>>>>> block.c | 4 ++++
>>>>> block/qapi.c | 1 +
>>>>> blockdev.c | 7 +++++++
>>>>> hmp.c | 4 ++++
>>>>> include/block/block_int.h | 1 +
>>>>> qapi/block-core.json | 10 +++++++++-
>>>>> qemu-options.hx | 1 +
>>>>> qmp-commands.hx | 2 ++
>>>>> 8 files changed, 29 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/block.c b/block.c
>>>>> index 27533f3..1658a72 100644
>>>>> --- a/block.c
>>>>> +++ b/block.c
>>>>> @@ -4531,6 +4531,10 @@ static int
>>>>> multiwrite_merge(BlockDriverState *bs, BlockRequest *reqs,
>>>>> {
>>>>> int i, outidx;
>>>>> + if (!bs->write_merging) {
>>>>> + return num_reqs;
>>>>> + }
>>>>> +
>>>>> // Sort requests by start sector
>>>>> qsort(reqs, num_reqs, sizeof(*reqs), &multiwrite_req_compare);
>>>>> diff --git a/block/qapi.c b/block/qapi.c
>>>>> index 9733ebd..02251dd 100644
>>>>> --- a/block/qapi.c
>>>>> +++ b/block/qapi.c
>>>>> @@ -58,6 +58,7 @@ BlockDeviceInfo
>>>>> *bdrv_block_device_info(BlockDriverState *bs)
>>>>> info->backing_file_depth = bdrv_get_backing_file_depth(bs);
>>>>> info->detect_zeroes = bs->detect_zeroes;
>>>>> + info->write_merging = bs->write_merging;
>>>>> if (bs->io_limits_enabled) {
>>>>> ThrottleConfig cfg;
>>>>> diff --git a/blockdev.c b/blockdev.c
>>>>> index e595910..13e47b8 100644
>>>>> --- a/blockdev.c
>>>>> +++ b/blockdev.c
>>>>> @@ -378,6 +378,7 @@ static DriveInfo *blockdev_init(const char
>>>>> *file, QDict *bs_opts,
>>>>> const char *id;
>>>>> bool has_driver_specific_opts;
>>>>> BlockdevDetectZeroesOptions detect_zeroes;
>>>>> + bool write_merging;
>>>>> BlockDriver *drv = NULL;
>>>>> /* Check common options by copying from bs_opts to opts,
>>>>> all other options
>>>>> @@ -405,6 +406,7 @@ static DriveInfo *blockdev_init(const char
>>>>> *file, QDict *bs_opts,
>>>>> snapshot = qemu_opt_get_bool(opts, "snapshot", 0);
>>>>> ro = qemu_opt_get_bool(opts, "read-only", 0);
>>>>> copy_on_read = qemu_opt_get_bool(opts, "copy-on-read", false);
>>>>> + write_merging = qemu_opt_get_bool(opts, "write-merging", true);
>>>>
>>>> Using this option in blockdev_init() means that you can only enable
>>>> or disable merging for the top layer (the root BDS). Furthermore,
>>>> since you don't set bs->write_merging in bdrv_new() (or at least
>>>> bdrv_open()), it actually defaults to false and only for the top
>>>> layer it defaults to true.
>>>>
>>>> Therefore, if after this patch a format block driver issues a
>>>> multiwrite to its file, the write will not be merged and the user
>>>> can do nothing about it. I don't suppose this is intentional...?
>>>
>>> I am not sure if a block driver actually can do this at all? The
>>> only way to enter multiwrite is from virtio_blk_handle_request in
>>> virtio-blk.c.
>>
>> Well, there's also qemu-io -c multiwrite (which only accesses the
>> root BDS as well). But other than that, yes, you're right. So, in
>> practice it shouldn't matter.
>>
>>>
>>>>
>>>> I propose evaluating the option in bdrv_open() and setting
>>>> bs->write_merging there.
>>>
>>> I wasn't aware actually. I remember that someone asked me to
>>> implement discard_zeroes in blockdev_init. I think it was something
>>> related to QMP. So we still might
>>> need to check parameters at 2 positions? It is quite confusing which
>>> paramter has to be parsed where.
>>
>> As for me, I don't know why some options are parsed in
>> blockdev_init() at all. I guess all the options currently parsed in
>> blockdev_init() should later be moved to the BlockBackend, at least
>> that would be the idea. In practice, we cannot do that: Things like
>> caching will stay in the BlockDriverState.
>>
>> I think it's just broken. IMHO, everything related to the BB should
>> be in blockdev_init() and everything related to the BDS should be in
>> bdrv_open(). So the question is now whether you want write_merging to
>> be in the BDS or in the BB. Considering BB is in Kevin's block branch
>> as of last Friday, you might actually want to work on that branch and
>> move the field into the BB if you decide that that's the place it
>> should be in.
>
> Actually I there a pros and cons for both BDS and BB. As of now my
> intention was to be able to turn it off. As there are People who would
> like to see it completely disappear I would not spent too much effort
> in that switch today.
> Looking at BB it is a BDS thing and thus belongs to bdrv_open. But
> this is true for discard_zeroes (and others) as well. Kevin, Stefan,
> ultimatively where should it be parsed?
Yes, and for cache, too. That's what I meant with "it's just broken".
Max
> I have on my todo list the following to give you a figure what might
> happen. All this is for 2.3+ except for the accounting maybe:
>
> - add accounting for merged requests (to have a metric for
> modifications)
> - evaluate if sorting the requests really helps
> - simplify the merge conditions and do not keep a fixed list of
> requests just merge if the Nth and N-1 requests are mergable (without
> sorting).
> - think about handling the complete merging in virtio-blk with the
> above simplifications. currently it is a virtio-blk only feature anyway.
> - add read merging
>
> Peter
next prev parent reply other threads:[~2014-10-20 11:51 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-10-20 6:14 [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge Peter Lieven
2014-10-20 8:59 ` Max Reitz
2014-10-20 9:14 ` Peter Lieven
2014-10-20 9:27 ` Max Reitz
2014-10-20 10:03 ` Peter Lieven
2014-10-20 11:51 ` Max Reitz [this message]
2014-10-20 11:53 ` Peter Lieven
2014-10-20 12:56 ` Kevin Wolf
2014-10-20 12:16 ` Peter Lieven
2014-10-20 12:19 ` Max Reitz
2014-10-20 12:48 ` Peter Lieven
2014-10-20 13:15 ` Max Reitz
2014-10-20 13:19 ` Peter Lieven
2014-10-20 13:22 ` Max Reitz
2014-10-20 13:29 ` Peter Lieven
2014-10-20 13:31 ` Kevin Wolf
2014-10-20 13:47 ` Peter Lieven
2014-10-20 13:55 ` Kevin Wolf
2014-10-20 13:59 ` Peter Lieven
2014-10-20 13:59 ` Max Reitz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5444F730.4000003@redhat.com \
--to=mreitz@redhat.com \
--cc=armbru@redhat.com \
--cc=famz@redhat.com \
--cc=kwolf@redhat.com \
--cc=pl@kamp.de \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).