From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57081) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XgBUk-0004DE-PF for qemu-devel@nongnu.org; Mon, 20 Oct 2014 07:51:31 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XgBUf-0005IP-Pj for qemu-devel@nongnu.org; Mon, 20 Oct 2014 07:51:26 -0400 Received: from mx1.redhat.com ([209.132.183.28]:63611) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XgBUf-0005ID-I2 for qemu-devel@nongnu.org; Mon, 20 Oct 2014 07:51:21 -0400 Message-ID: <5444F730.4000003@redhat.com> Date: Mon, 20 Oct 2014 13:51:12 +0200 From: Max Reitz MIME-Version: 1.0 References: <1413785643-20336-1-git-send-email-pl@kamp.de> <5444CED9.8070402@redhat.com> <5444D28F.4010005@kamp.de> <5444D57C.6080409@redhat.com> <5444DDF4.40800@kamp.de> In-Reply-To: <5444DDF4.40800@kamp.de> Content-Type: text/plain; charset=iso-8859-15; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Peter Lieven , qemu-devel@nongnu.org Cc: kwolf@redhat.com, famz@redhat.com, armbru@redhat.com, stefanha@redhat.com On 2014-10-20 at 12:03, Peter Lieven wrote: > On 20.10.2014 11:27, Max Reitz wrote: >> On 2014-10-20 at 11:14, Peter Lieven wrote: >>> On 20.10.2014 10:59, Max Reitz wrote: >>>> On 2014-10-20 at 08:14, Peter Lieven wrote: >>>>> the block layer silently merges write requests since >>>> >>>> s/^t/T/ >>>> >>>>> commit 40b4f539. This patch adds a knob to disable >>>>> this feature as there has been some discussion lately >>>>> if multiwrite is a good idea at all and as it falsifies >>>>> benchmarks. >>>>> >>>>> Signed-off-by: Peter Lieven >>>>> --- >>>>> block.c | 4 ++++ >>>>> block/qapi.c | 1 + >>>>> blockdev.c | 7 +++++++ >>>>> hmp.c | 4 ++++ >>>>> include/block/block_int.h | 1 + >>>>> qapi/block-core.json | 10 +++++++++- >>>>> qemu-options.hx | 1 + >>>>> qmp-commands.hx | 2 ++ >>>>> 8 files changed, 29 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/block.c b/block.c >>>>> index 27533f3..1658a72 100644 >>>>> --- a/block.c >>>>> +++ b/block.c >>>>> @@ -4531,6 +4531,10 @@ static int >>>>> multiwrite_merge(BlockDriverState *bs, BlockRequest *reqs, >>>>> { >>>>> int i, outidx; >>>>> + if (!bs->write_merging) { >>>>> + return num_reqs; >>>>> + } >>>>> + >>>>> // Sort requests by start sector >>>>> qsort(reqs, num_reqs, sizeof(*reqs), &multiwrite_req_compare); >>>>> diff --git a/block/qapi.c b/block/qapi.c >>>>> index 9733ebd..02251dd 100644 >>>>> --- a/block/qapi.c >>>>> +++ b/block/qapi.c >>>>> @@ -58,6 +58,7 @@ BlockDeviceInfo >>>>> *bdrv_block_device_info(BlockDriverState *bs) >>>>> info->backing_file_depth = bdrv_get_backing_file_depth(bs); >>>>> info->detect_zeroes = bs->detect_zeroes; >>>>> + info->write_merging = bs->write_merging; >>>>> if (bs->io_limits_enabled) { >>>>> ThrottleConfig cfg; >>>>> diff --git a/blockdev.c b/blockdev.c >>>>> index e595910..13e47b8 100644 >>>>> --- a/blockdev.c >>>>> +++ b/blockdev.c >>>>> @@ -378,6 +378,7 @@ static DriveInfo *blockdev_init(const char >>>>> *file, QDict *bs_opts, >>>>> const char *id; >>>>> bool has_driver_specific_opts; >>>>> BlockdevDetectZeroesOptions detect_zeroes; >>>>> + bool write_merging; >>>>> BlockDriver *drv = NULL; >>>>> /* Check common options by copying from bs_opts to opts, >>>>> all other options >>>>> @@ -405,6 +406,7 @@ static DriveInfo *blockdev_init(const char >>>>> *file, QDict *bs_opts, >>>>> snapshot = qemu_opt_get_bool(opts, "snapshot", 0); >>>>> ro = qemu_opt_get_bool(opts, "read-only", 0); >>>>> copy_on_read = qemu_opt_get_bool(opts, "copy-on-read", false); >>>>> + write_merging = qemu_opt_get_bool(opts, "write-merging", true); >>>> >>>> Using this option in blockdev_init() means that you can only enable >>>> or disable merging for the top layer (the root BDS). Furthermore, >>>> since you don't set bs->write_merging in bdrv_new() (or at least >>>> bdrv_open()), it actually defaults to false and only for the top >>>> layer it defaults to true. >>>> >>>> Therefore, if after this patch a format block driver issues a >>>> multiwrite to its file, the write will not be merged and the user >>>> can do nothing about it. I don't suppose this is intentional...? >>> >>> I am not sure if a block driver actually can do this at all? The >>> only way to enter multiwrite is from virtio_blk_handle_request in >>> virtio-blk.c. >> >> Well, there's also qemu-io -c multiwrite (which only accesses the >> root BDS as well). But other than that, yes, you're right. So, in >> practice it shouldn't matter. >> >>> >>>> >>>> I propose evaluating the option in bdrv_open() and setting >>>> bs->write_merging there. >>> >>> I wasn't aware actually. I remember that someone asked me to >>> implement discard_zeroes in blockdev_init. I think it was something >>> related to QMP. So we still might >>> need to check parameters at 2 positions? It is quite confusing which >>> paramter has to be parsed where. >> >> As for me, I don't know why some options are parsed in >> blockdev_init() at all. I guess all the options currently parsed in >> blockdev_init() should later be moved to the BlockBackend, at least >> that would be the idea. In practice, we cannot do that: Things like >> caching will stay in the BlockDriverState. >> >> I think it's just broken. IMHO, everything related to the BB should >> be in blockdev_init() and everything related to the BDS should be in >> bdrv_open(). So the question is now whether you want write_merging to >> be in the BDS or in the BB. Considering BB is in Kevin's block branch >> as of last Friday, you might actually want to work on that branch and >> move the field into the BB if you decide that that's the place it >> should be in. > > Actually I there a pros and cons for both BDS and BB. As of now my > intention was to be able to turn it off. As there are People who would > like to see it completely disappear I would not spent too much effort > in that switch today. > Looking at BB it is a BDS thing and thus belongs to bdrv_open. But > this is true for discard_zeroes (and others) as well. Kevin, Stefan, > ultimatively where should it be parsed? Yes, and for cache, too. That's what I meant with "it's just broken". Max > I have on my todo list the following to give you a figure what might > happen. All this is for 2.3+ except for the accounting maybe: > > - add accounting for merged requests (to have a metric for > modifications) > - evaluate if sorting the requests really helps > - simplify the merge conditions and do not keep a fixed list of > requests just merge if the Nth and N-1 requests are mergable (without > sorting). > - think about handling the complete merging in virtio-blk with the > above simplifications. currently it is a virtio-blk only feature anyway. > - add read merging > > Peter