From: Kevin Wolf <kwolf@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: fam@euphon.net, berto@igalia.com, stefanha@redhat.com,
qemu-block@nongnu.org, dillaman@redhat.com,
pavel.dovgaluk@ispras.ru, sw@weilnetz.de, pl@kamp.de,
qemu-devel@nongnu.org, mreitz@redhat.com, jsnow@redhat.com,
ronniesahlberg@gmail.com, den@openvz.org, pbonzini@redhat.com,
ari@tuxera.com
Subject: Re: [RFC 0/3] 64bit block-layer part I
Date: Thu, 23 Apr 2020 17:43:04 +0200 [thread overview]
Message-ID: <20200423154304.GD23654@linux.fritz.box> (raw)
In-Reply-To: <20200330141818.31294-1-vsementsov@virtuozzo.com>
Am 30.03.2020 um 16:18 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Hi all!
>
> There is an idea to make NBD protocol extension to support 64bit
> write-zero/discard/block-status commands (commands, which doesn't
> transfer user data). It's needed to increase performance of zeroing
> large ranges (up to the whole image). Zeroing of the whole image is used
> as first step of mirror job, qemu-img convert, it should be also used at
> start of backup actually..
>
> We need to support it in block-layer, so we want 64bit write_zeros.
> Currently driver handler now have int bytes parameter.
>
> write_zeros path goes through normal pwritev, so we need 64bit write,
> and then we need 64bit read for symmetry, and better, let's make all io
> path work with 64bit bytes parameter.
>
> Actually most of block-layer already have 64bit parameters: offset is
> sometimes int64_t and sometimes uint64_t. bytes parameter is one of
> int64_t, uint64_t, int, unsigned int...
>
> I think we need one type for all of this, and this one type is int64_t.
> Signed int64_t is a bit better than uint64_t: you can use same variable
> to get some result (including error < 0) and than reuse it as an
> argument without any type conversion.
>
> So, I propose, as a first step, convert all uint64_t parameters to
> int64_t.
>
> Still, I don't have good idea of how to split this into more than 3
> patches, so, this is an RFC.
I think the split in three patches isn't too bad because it's not a
whole lot of code. But of course, it is little code that has lots of
implications which does make it hard to review. If we think that we
might bisect a bug in the series later, maybe it would be better to
split it into more patches.
write/write_zeroes has to be a single thing, I'm afraid. But I guess
read could be a separate patch, as could be copy_range. Not sure about
discard.
> What's next?
>
> Converting write_zero and discard is not as simple: we can't just
> s/int/uint64_t/, as some functions may use some int variables for
> calculations and this will be broken by something larger than int.
>
> So, I think the simplest way is to add .bdrv_co_pwritev_zeros64 and
> .bdrv_co_pdiscard64 and update drivers one-by-one. If at some point all
> drivers updated - drop unused 32bit functions, and then drop "64" suffix
> from API. If not - we'll live with both APIs.
We already have too many unfinished conversions in QEMU, let's not add
one more.
Fortunately, we already have a tool that could help us here: Things like
bs->bl.max_pwrite_zeroes. We could make BDRV_REQUEST_MAX_BYTES the
default value and only drivers that override it can get bigger requests.
> Another thing to do is updating default limiting of request (currently
> they are limited to INT_MAX).
As above, I wouldn't update the default, but rather enable drivers to
overload the default with a larger value. This will involve changing
some places where we use MIN() between INT_MAX and the driver's value.
> Then we may move some drivers to 64bit discard/write_zero: I think about
> qcow2, file-posix and nbd (as a proof-of-concept for already proposed
> NBD extension).
Makes sense to me.
Kevin
prev parent reply other threads:[~2020-04-23 16:58 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-30 14:18 [RFC 0/3] 64bit block-layer part I Vladimir Sementsov-Ogievskiy
2020-03-30 14:18 ` [PATCH 1/3] block: use int64_t as bytes type in tracked requests Vladimir Sementsov-Ogievskiy
2020-04-22 15:37 ` Stefan Hajnoczi
2020-04-23 15:25 ` Kevin Wolf
2020-03-30 14:18 ` [PATCH 2/3] block/io: convert generic io path to use int64_t parameters Vladimir Sementsov-Ogievskiy
2020-04-22 15:50 ` Stefan Hajnoczi
2020-04-22 17:45 ` Vladimir Sementsov-Ogievskiy
2020-03-30 14:18 ` [PATCH 3/3] block: use int64_t instead of uint64_t in driver handlers Vladimir Sementsov-Ogievskiy
2020-03-30 17:43 ` [RFC 0/3] 64bit block-layer part I no-reply
2020-03-30 17:48 ` no-reply
2020-03-30 17:50 ` no-reply
2020-04-22 14:29 ` Vladimir Sementsov-Ogievskiy
2020-04-22 14:52 ` Eric Blake
2020-04-22 15:53 ` Stefan Hajnoczi
2020-04-22 18:24 ` Vladimir Sementsov-Ogievskiy
2020-04-22 19:32 ` Eric Blake
2020-04-23 15:43 ` Kevin Wolf [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200423154304.GD23654@linux.fritz.box \
--to=kwolf@redhat.com \
--cc=ari@tuxera.com \
--cc=berto@igalia.com \
--cc=den@openvz.org \
--cc=dillaman@redhat.com \
--cc=fam@euphon.net \
--cc=jsnow@redhat.com \
--cc=mreitz@redhat.com \
--cc=pavel.dovgaluk@ispras.ru \
--cc=pbonzini@redhat.com \
--cc=pl@kamp.de \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=ronniesahlberg@gmail.com \
--cc=stefanha@redhat.com \
--cc=sw@weilnetz.de \
--cc=vsementsov@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).