From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:34328) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XgBvt-0008FP-9K for qemu-devel@nongnu.org; Mon, 20 Oct 2014 08:19:34 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XgBvk-00006K-8y for qemu-devel@nongnu.org; Mon, 20 Oct 2014 08:19:29 -0400 Received: from mx1.redhat.com ([209.132.183.28]:13182) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XgBvk-00006G-1e for qemu-devel@nongnu.org; Mon, 20 Oct 2014 08:19:20 -0400 Message-ID: <5444FDBE.9060501@redhat.com> Date: Mon, 20 Oct 2014 14:19:10 +0200 From: Max Reitz MIME-Version: 1.0 References: <1413785643-20336-1-git-send-email-pl@kamp.de> <5444CED9.8070402@redhat.com> <5444D28F.4010005@kamp.de> <5444D57C.6080409@redhat.com> <5444DDF4.40800@kamp.de> <5444F730.4000003@redhat.com> <5444FD07.6030702@kamp.de> In-Reply-To: <5444FD07.6030702@kamp.de> Content-Type: text/plain; charset=iso-8859-15; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Peter Lieven , qemu-devel@nongnu.org Cc: kwolf@redhat.com, famz@redhat.com, armbru@redhat.com, stefanha@redhat.com On 2014-10-20 at 14:16, Peter Lieven wrote: > On 20.10.2014 13:51, Max Reitz wrote: >> On 2014-10-20 at 12:03, Peter Lieven wrote: >>> On 20.10.2014 11:27, Max Reitz wrote: >>>> On 2014-10-20 at 11:14, Peter Lieven wrote: >>>>> On 20.10.2014 10:59, Max Reitz wrote: >>>>>> On 2014-10-20 at 08:14, Peter Lieven wrote: >>>>>>> the block layer silently merges write requests since >>>>>> >>>>>> s/^t/T/ >>>>>> >>>>>>> commit 40b4f539. This patch adds a knob to disable >>>>>>> this feature as there has been some discussion lately >>>>>>> if multiwrite is a good idea at all and as it falsifies >>>>>>> benchmarks. >>>>>>> >>>>>>> Signed-off-by: Peter Lieven >>>>>>> --- >>>>>>> block.c | 4 ++++ >>>>>>> block/qapi.c | 1 + >>>>>>> blockdev.c | 7 +++++++ >>>>>>> hmp.c | 4 ++++ >>>>>>> include/block/block_int.h | 1 + >>>>>>> qapi/block-core.json | 10 +++++++++- >>>>>>> qemu-options.hx | 1 + >>>>>>> qmp-commands.hx | 2 ++ >>>>>>> 8 files changed, 29 insertions(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/block.c b/block.c >>>>>>> index 27533f3..1658a72 100644 >>>>>>> --- a/block.c >>>>>>> +++ b/block.c >>>>>>> @@ -4531,6 +4531,10 @@ static int >>>>>>> multiwrite_merge(BlockDriverState *bs, BlockRequest *reqs, >>>>>>> { >>>>>>> int i, outidx; >>>>>>> + if (!bs->write_merging) { >>>>>>> + return num_reqs; >>>>>>> + } >>>>>>> + >>>>>>> // Sort requests by start sector >>>>>>> qsort(reqs, num_reqs, sizeof(*reqs), >>>>>>> &multiwrite_req_compare); >>>>>>> diff --git a/block/qapi.c b/block/qapi.c >>>>>>> index 9733ebd..02251dd 100644 >>>>>>> --- a/block/qapi.c >>>>>>> +++ b/block/qapi.c >>>>>>> @@ -58,6 +58,7 @@ BlockDeviceInfo >>>>>>> *bdrv_block_device_info(BlockDriverState *bs) >>>>>>> info->backing_file_depth = bdrv_get_backing_file_depth(bs); >>>>>>> info->detect_zeroes = bs->detect_zeroes; >>>>>>> + info->write_merging = bs->write_merging; >>>>>>> if (bs->io_limits_enabled) { >>>>>>> ThrottleConfig cfg; >>>>>>> diff --git a/blockdev.c b/blockdev.c >>>>>>> index e595910..13e47b8 100644 >>>>>>> --- a/blockdev.c >>>>>>> +++ b/blockdev.c >>>>>>> @@ -378,6 +378,7 @@ static DriveInfo *blockdev_init(const char >>>>>>> *file, QDict *bs_opts, >>>>>>> const char *id; >>>>>>> bool has_driver_specific_opts; >>>>>>> BlockdevDetectZeroesOptions detect_zeroes; >>>>>>> + bool write_merging; >>>>>>> BlockDriver *drv = NULL; >>>>>>> /* Check common options by copying from bs_opts to opts, >>>>>>> all other options >>>>>>> @@ -405,6 +406,7 @@ static DriveInfo *blockdev_init(const char >>>>>>> *file, QDict *bs_opts, >>>>>>> snapshot = qemu_opt_get_bool(opts, "snapshot", 0); >>>>>>> ro = qemu_opt_get_bool(opts, "read-only", 0); >>>>>>> copy_on_read = qemu_opt_get_bool(opts, "copy-on-read", >>>>>>> false); >>>>>>> + write_merging = qemu_opt_get_bool(opts, "write-merging", >>>>>>> true); >>>>>> >>>>>> Using this option in blockdev_init() means that you can only >>>>>> enable or disable merging for the top layer (the root BDS). >>>>>> Furthermore, since you don't set bs->write_merging in bdrv_new() >>>>>> (or at least bdrv_open()), it actually defaults to false and only >>>>>> for the top layer it defaults to true. >>>>>> >>>>>> Therefore, if after this patch a format block driver issues a >>>>>> multiwrite to its file, the write will not be merged and the user >>>>>> can do nothing about it. I don't suppose this is intentional...? >>>>> >>>>> I am not sure if a block driver actually can do this at all? The >>>>> only way to enter multiwrite is from virtio_blk_handle_request in >>>>> virtio-blk.c. >>>> >>>> Well, there's also qemu-io -c multiwrite (which only accesses the >>>> root BDS as well). But other than that, yes, you're right. So, in >>>> practice it shouldn't matter. >>>> >>>>> >>>>>> >>>>>> I propose evaluating the option in bdrv_open() and setting >>>>>> bs->write_merging there. >>>>> >>>>> I wasn't aware actually. I remember that someone asked me to >>>>> implement discard_zeroes in blockdev_init. I think it was >>>>> something related to QMP. So we still might >>>>> need to check parameters at 2 positions? It is quite confusing >>>>> which paramter has to be parsed where. >>>> >>>> As for me, I don't know why some options are parsed in >>>> blockdev_init() at all. I guess all the options currently parsed in >>>> blockdev_init() should later be moved to the BlockBackend, at least >>>> that would be the idea. In practice, we cannot do that: Things like >>>> caching will stay in the BlockDriverState. >>>> >>>> I think it's just broken. IMHO, everything related to the BB should >>>> be in blockdev_init() and everything related to the BDS should be >>>> in bdrv_open(). So the question is now whether you want >>>> write_merging to be in the BDS or in the BB. Considering BB is in >>>> Kevin's block branch as of last Friday, you might actually want to >>>> work on that branch and move the field into the BB if you decide >>>> that that's the place it should be in. >>> >>> Actually I there a pros and cons for both BDS and BB. As of now my >>> intention was to be able to turn it off. As there are People who >>> would like to see it completely disappear I would not spent too much >>> effort in that switch today. >>> Looking at BB it is a BDS thing and thus belongs to bdrv_open. But >>> this is true for discard_zeroes (and others) as well. Kevin, Stefan, >>> ultimatively where should it be parsed? >> >> Yes, and for cache, too. That's what I meant with "it's just broken". > > Can you further help here. I think my problem was that I don't have > access to the commandline options in bdrv_open?! You do. It's the "options" QDict. :-) Max