From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Eric Blake <eblake@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>,
qemu-block@nongnu.org, Kevin Wolf <kwolf@redhat.com>,
qemu-devel@nongnu.org
Subject: Re: [PATCH 5/6] block/nbd: Do not force-cap *pnum
Date: Tue, 22 Jun 2021 12:07:53 +0300
Message-ID: <488e47db-e7af-5b18-2cee-8dd6abb81481@virtuozzo.com>
In-Reply-To: <20210621185336.zslqpqusqng4ub2u@redhat.com>
21.06.2021 21:53, Eric Blake wrote:
> On Sat, Jun 19, 2021 at 01:53:24PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>>> +++ b/block/nbd.c
>>> @@ -1702,7 +1702,7 @@ static int coroutine_fn nbd_client_co_block_status(
>>> .type = NBD_CMD_BLOCK_STATUS,
>>> .from = offset,
>>> .len = MIN(QEMU_ALIGN_DOWN(INT_MAX, bs->bl.request_alignment),
>>> - MIN(bytes, s->info.size - offset)),
>>> + s->info.size - offset),
>>> .flags = NBD_CMD_FLAG_REQ_ONE,
>>> };
>>>
>>
>> Hmm..
>>
>> I don't think that this change is correct. In contrast with file-posix, you don't get extra information for free; you just make a larger request. This means that the server will have to do more work.
>
> Not necessarily. The fact that we have passed NBD_CMD_FLAG_REQ_ONE
> means that the server is still only allowed to give us one extent in
> its answer, and that it may not give us information beyond the length
> we requested. You are right that if we lose the REQ_ONE flag we may
> result in the server doing more work to provide us additional extents
> that we will then be ignoring because we aren't yet set up for
> avoiding REQ_ONE. Fixing that is a longer-term goal. But in the
> short term, I see no harm in giving a larger length to the server with
> REQ_ONE.
>
>>
>> (look at blockstatus_to_extents, it calls bdrv_block_status_above in a loop).
>>
>> For example, assume that the NBD export is a qcow2 image with all clusters allocated. With this change, the NBD server will loop through the whole qcow2 image and load all L2 tables to return one big allocated extent.
>
> No, the server is allowed to reply with less length than our request,
> and that is particularly true if the server does NOT have free access
> to the full length of our request. In the case of qcow2, since
> bdrv_block_status is (by current design) clamped at cluster
> boundaries, requesting a 4G length will NOT increase the amount of the
> server response any further than the first cluster boundary (that is,
> the point where the server no longer has free access to status without
> loading another cluster of L2 entries).
No. It doesn't matter where bdrv_block_status_above is clamped. If the whole disk is allocated, blockstatus_to_extents() in nbd/server.c will loop through the whole requested range and merge all the information into one extent. This doesn't violate NBD_CMD_FLAG_REQ_ONE: we have one extent on output and don't go beyond the requested length. It's valid for the server to try to satisfy as much of the request as possible, and blockstatus_to_extents() currently works this way.
Remember that nbd_extent_array_add() can merge a new extent into the previous one if it has the same type.
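
To illustrate the point, here is a simplified model of such a loop. It is only a sketch and not the actual nbd/server.c code: toy_block_status() and toy_req_one_reply() are made-up names, and the image is assumed to be fully allocated, with each status lookup clamped at a cluster boundary. The reply is always a single extent, but the number of lookups grows with the requested length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reply extent: a length plus a single "allocated" flag. */
typedef struct {
    uint64_t length;
    bool allocated;
} Extent;

/*
 * Toy stand-in for a block-status lookup on a fully allocated image
 * (an assumption of this sketch). Like a cluster-based format, it never
 * reports past the next cluster boundary.
 */
static uint64_t toy_block_status(uint64_t offset, uint64_t bytes,
                                 uint64_t cluster_size, bool *allocated)
{
    uint64_t to_boundary = cluster_size - offset % cluster_size;

    *allocated = true;
    return bytes < to_boundary ? bytes : to_boundary;
}

/*
 * Build a one-extent reply for [offset, offset + bytes). Lookups with the
 * same type are merged into the single output extent; the loop stops only
 * when the type changes or the range is exhausted.
 * Returns the number of status lookups performed.
 */
static unsigned toy_req_one_reply(uint64_t offset, uint64_t bytes,
                                  uint64_t cluster_size, Extent *out)
{
    unsigned lookups = 0;

    assert(cluster_size > 0);
    out->length = 0;
    out->allocated = false;

    while (bytes > 0) {
        bool allocated;
        uint64_t n = toy_block_status(offset, bytes, cluster_size,
                                      &allocated);

        lookups++;
        if (out->length > 0 && allocated != out->allocated) {
            break; /* a second extent would be needed: stop here */
        }
        out->allocated = allocated;
        out->length += n; /* same type: merge into the one extent */
        offset += n;
        bytes -= n;
    }
    return lookups;
}
```

With 64 KiB clusters, a 64 KiB request costs one lookup in this model, while a 1 MiB request costs sixteen, and both replies contain exactly one extent.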
>
>>
>> So only the server can decide whether or not it can return some extra information for free. But unfortunately NBD_CMD_FLAG_REQ_ONE doesn't allow it.
>
> What the flag prohibits is the server giving us more information than
> the length we requested. But this patch is increasing our request
> length for the case where the server CAN give us more information than
> we need locally, on the hopes that even though the server can only
> reply with one extent, we aren't wasting as many network
> back-and-forth trips when a larger request would have worked.
>
--
Best regards,
Vladimir