qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Peter Lieven <pl@kamp.de>
To: Max Reitz <mreitz@redhat.com>, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, famz@redhat.com, armbru@redhat.com,
	stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge
Date: Mon, 20 Oct 2014 13:53:41 +0200	[thread overview]
Message-ID: <5444F7C5.7050206@kamp.de> (raw)
In-Reply-To: <5444F730.4000003@redhat.com>

On 20.10.2014 13:51, Max Reitz wrote:
> On 2014-10-20 at 12:03, Peter Lieven wrote:
>> On 20.10.2014 11:27, Max Reitz wrote:
>>> On 2014-10-20 at 11:14, Peter Lieven wrote:
>>>> On 20.10.2014 10:59, Max Reitz wrote:
>>>>> On 2014-10-20 at 08:14, Peter Lieven wrote:
>>>>>> the block layer silently merges write requests since
>>>>>
>>>>> s/^t/T/
>>>>>
>>>>>> commit 40b4f539. This patch adds a knob to disable
>>>>>> this feature as there has been some discussion lately
>>>>>> if multiwrite is a good idea at all and as it falsifies
>>>>>> benchmarks.
>>>>>>
>>>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>>>> ---
>>>>>>   block.c                   |    4 ++++
>>>>>>   block/qapi.c              |    1 +
>>>>>>   blockdev.c                |    7 +++++++
>>>>>>   hmp.c                     |    4 ++++
>>>>>>   include/block/block_int.h |    1 +
>>>>>>   qapi/block-core.json      |   10 +++++++++-
>>>>>>   qemu-options.hx           |    1 +
>>>>>>   qmp-commands.hx           |    2 ++
>>>>>>   8 files changed, 29 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/block.c b/block.c
>>>>>> index 27533f3..1658a72 100644
>>>>>> --- a/block.c
>>>>>> +++ b/block.c
>>>>>> @@ -4531,6 +4531,10 @@ static int multiwrite_merge(BlockDriverState *bs, BlockRequest *reqs,
>>>>>>   {
>>>>>>       int i, outidx;
>>>>>>   +    if (!bs->write_merging) {
>>>>>> +        return num_reqs;
>>>>>> +    }
>>>>>> +
>>>>>>       // Sort requests by start sector
>>>>>>       qsort(reqs, num_reqs, sizeof(*reqs), &multiwrite_req_compare);
>>>>>>   diff --git a/block/qapi.c b/block/qapi.c
>>>>>> index 9733ebd..02251dd 100644
>>>>>> --- a/block/qapi.c
>>>>>> +++ b/block/qapi.c
>>>>>> @@ -58,6 +58,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs)
>>>>>>         info->backing_file_depth = bdrv_get_backing_file_depth(bs);
>>>>>>       info->detect_zeroes = bs->detect_zeroes;
>>>>>> +    info->write_merging = bs->write_merging;
>>>>>>         if (bs->io_limits_enabled) {
>>>>>>           ThrottleConfig cfg;
>>>>>> diff --git a/blockdev.c b/blockdev.c
>>>>>> index e595910..13e47b8 100644
>>>>>> --- a/blockdev.c
>>>>>> +++ b/blockdev.c
>>>>>> @@ -378,6 +378,7 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
>>>>>>       const char *id;
>>>>>>       bool has_driver_specific_opts;
>>>>>>       BlockdevDetectZeroesOptions detect_zeroes;
>>>>>> +    bool write_merging;
>>>>>>       BlockDriver *drv = NULL;
>>>>>>         /* Check common options by copying from bs_opts to opts, all other options
>>>>>> @@ -405,6 +406,7 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
>>>>>>       snapshot = qemu_opt_get_bool(opts, "snapshot", 0);
>>>>>>       ro = qemu_opt_get_bool(opts, "read-only", 0);
>>>>>>       copy_on_read = qemu_opt_get_bool(opts, "copy-on-read", false);
>>>>>> +    write_merging = qemu_opt_get_bool(opts, "write-merging", true);
>>>>>
>>>>> Using this option in blockdev_init() means that you can only enable or disable merging for the top layer (the root BDS). Furthermore, since you don't set bs->write_merging in bdrv_new() (or at least bdrv_open()), it actually defaults to false and 
>>>>> only for the top layer it defaults to true.
>>>>>
>>>>> Therefore, if after this patch a format block driver issues a multiwrite to its file, the write will not be merged and the user can do nothing about it. I don't suppose this is intentional...?
>>>>
>>>> I am not sure if a block driver actually can do this at all? The only way to enter multiwrite is from virtio_blk_handle_request in virtio-blk.c.
>>>
>>> Well, there's also qemu-io -c multiwrite (which only accesses the root BDS as well). But other than that, yes, you're right. So, in practice it shouldn't matter.
>>>
>>>>
>>>>>
>>>>> I propose evaluating the option in bdrv_open() and setting bs->write_merging there.
>>>>
>>>> I wasn't aware actually. I remember that someone asked me to implement discard_zeroes in blockdev_init. I think it was something related to QMP. So we still might
>>>> need to check parameters at 2 positions? It is quite confusing which paramter has to be parsed where.
>>>
>>> As for me, I don't know why some options are parsed in blockdev_init() at all. I guess all the options currently parsed in blockdev_init() should later be moved to the BlockBackend, at least that would be the idea. In practice, we cannot do that: 
>>> Things like caching will stay in the BlockDriverState.
>>>
>>> I think it's just broken. IMHO, everything related to the BB should be in blockdev_init() and everything related to the BDS should be in bdrv_open(). So the question is now whether you want write_merging to be in the BDS or in the BB. Considering BB 
>>> is in Kevin's block branch as of last Friday, you might actually want to work on that branch and move the field into the BB if you decide that that's the place it should be in.
>>
>> Actually I there a pros and cons for both BDS and BB. As of now my intention was to be able to turn it off. As there are People who would like to see it completely disappear I would not spent too much effort in that switch today.
>> Looking at BB it is a BDS thing and thus belongs to bdrv_open. But this is true for discard_zeroes (and others) as well. Kevin, Stefan, ultimatively where should it be parsed?
>
> Yes, and for cache, too. That's what I meant with "it's just broken".

Looking at the old discussion about discard zeroes it was recommended to put it into bdrv_open_common. If thats still the recommendation I will put it in bdrv_open_common and send
a fix for discard_zeroes. I can't remember why I finally put it in blockdev_init...

Peter

  reply	other threads:[~2014-10-20 11:53 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-20  6:14 [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge Peter Lieven
2014-10-20  8:59 ` Max Reitz
2014-10-20  9:14   ` Peter Lieven
2014-10-20  9:27     ` Max Reitz
2014-10-20 10:03       ` Peter Lieven
2014-10-20 11:51         ` Max Reitz
2014-10-20 11:53           ` Peter Lieven [this message]
2014-10-20 12:56             ` Kevin Wolf
2014-10-20 12:16           ` Peter Lieven
2014-10-20 12:19             ` Max Reitz
2014-10-20 12:48               ` Peter Lieven
2014-10-20 13:15                 ` Max Reitz
2014-10-20 13:19                   ` Peter Lieven
2014-10-20 13:22                     ` Max Reitz
2014-10-20 13:29                       ` Peter Lieven
2014-10-20 13:31                       ` Kevin Wolf
2014-10-20 13:47                         ` Peter Lieven
2014-10-20 13:55                           ` Kevin Wolf
2014-10-20 13:59                             ` Peter Lieven
2014-10-20 13:59                               ` Max Reitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5444F7C5.7050206@kamp.de \
    --to=pl@kamp.de \
    --cc=armbru@redhat.com \
    --cc=famz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).