From: Kevin Wolf <kwolf@redhat.com>
To: Michael Tokarev <mjt@tls.msk.ru>
Cc: qemu-devel@nongnu.org, Anthony Liguori <anthony@codemonkey.ws>,
Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [PATCH] increase BlockConf.min_io_size type from uint16_t to uint32_t
Date: Wed, 02 May 2012 16:35:25 +0200 [thread overview]
Message-ID: <4FA1462D.7020108@redhat.com> (raw)
In-Reply-To: <4FA1079C.80105@msgid.tls.msk.ru>
Am 02.05.2012 12:08, schrieb Michael Tokarev:
> 02.05.2012 13:57, Kevin Wolf пишет:
>> Am 30.04.2012 17:52, schrieb Michael Tokarev:
>>> This value is used currently for virtio-blk only. It was defined
>>> as uint16_t before, which is the same as in kernel<=>user interface
>>> (in virtio_blk.h, struct virtio_blk_config). But the problem is
>>> that in kernel<=>user interface the units are sectors (which is
>>> usually 512 bytes or more), while in qemu it is in bytes. However,
>>> for, say, md raid5 arrays, it is typical to have actual min_io_size
>>> of a host device to be large. For example, for raid5 device of
>>> 3 drives with 64Kb chunk size, that value will be 128Kb, which does
>>> not fit in a uint16_t anymore.
>>>
>>> Increase the value size from 16bits to 32bits for now.
>>>
>>> But apparently, the kernel<=>user interface needs to be fixed too
>>> (while it is much more difficult due to compatibility issues),
>>> because even with 512byte units, the 16bit value there will overflow
>>> too with quite normal MD RAID configuration.
>>>
>>> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
>>> ---
>>> block.h | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/block.h b/block.h
>>> index f163e54..cd5ae79 100644
>>> --- a/block.h
>>> +++ b/block.h
>>> @@ -425,7 +425,7 @@ typedef struct BlockConf {
>>> BlockDriverState *bs;
>>> uint16_t physical_block_size;
>>> uint16_t logical_block_size;
>>> - uint16_t min_io_size;
>>> + uint32_t min_io_size;
>>> uint32_t opt_io_size;
>>> int32_t bootindex;
>>> uint32_t discard_granularity;
>>> @@ -450,7 +450,7 @@ static inline unsigned int get_physical_block_exp(BlockConf *conf)
>>> _conf.logical_block_size, 512), \
>>> DEFINE_PROP_BLOCKSIZE("physical_block_size", _state, \
>>> _conf.physical_block_size, 512), \
>>> - DEFINE_PROP_UINT16("min_io_size", _state, _conf.min_io_size, 0), \
>>> + DEFINE_PROP_UINT32("min_io_size", _state, _conf.min_io_size, 0), \
>>> DEFINE_PROP_UINT32("opt_io_size", _state, _conf.opt_io_size, 0), \
>>> DEFINE_PROP_INT32("bootindex", _state, _conf.bootindex, -1), \
>>> DEFINE_PROP_UINT32("discard_granularity", _state, \
>>
>> Don't you need an additional check in virtio-blk now so that you don't
>> overflow the 16 bit field in the virtio protocol? And I guess the same
>> for SCSI, where INQUIRY reports only a 16 bits sector count as well.
>
> That probably should be added too, but I'm not sure I agree these should be
> added into virtio-blk.
>
> As I already mentioned, the virtio protocol has the same defect (but there
> it is less serious due to usage of larger units). And that's where the
> additional overflow needs to be ELIMINATED, not just checked. Ie, the
> protocol should be changed somehow - the only issue is that I don't know
> how to do that now, once it has been in use for quite some time.
Even if you create a new version of the protocol (introduce a new
feature flag or whatever), newer qemus will still have to deal with
older guests and vice versa.
> Note that now, we don't have ANY checks of this theme whatsoever, at all:
> I tried using 128K as min_io_size, and the guest sees it as 0 currently, --
> the limits (at least the fact that the value fits in the defined number
> of bits) should be checked in the common function which implements these
> DEFINE_PROP_UINT* defines.
Looks like a bug in the qdev property parser to me. 128k is obviously an
invalid value for a 16 bit property.
But now that you're extending the property value to 32 bits, but only 25
bits can be really used, the property type doesn't even theoretically
suffice as a check.
> I'm not sure I can fix all of this in one go, so I went on and fixed the
> most specific bug first, without any additional bad side effects.
>
> Besides, as I also already mentioned, this whole structure is used in
> virtio-blk *only*, so only virtio driver passes this information into
> guest, scsi does not use it. And again, in case of scsi, the units are
> *sectors*, not bytes.
Yes, same as for virtio-blk.
But I can't see which structure is only used by virtio-blk, though. The
min_io_size property is contained in DEFINE_BLOCK_PROPERTIES(), which is
used by every qdevified block device. Most of them ignore min_io_size,
but virtio-blk and scsi-disk don't.
> So maybe the solution is to keep this as 16bits but switch to sectors.
> But for this, DEFINE_PROP_UINT can't be used anymore, unless we agree
> to change user interface TOO and require the user to specify these
> values in sectors to start with.
That wouldn't be a good interface. Let's just take a 32 bit number and
add the checks in the devices.
> And yes, if SCSI uses 16bit value here, it will have the same issue
> as virtio currently have -- pretty normal MD RAID array can't be
> expressed using 16bit number of sectors already... Oh well, but we
> at least have a (small) chance to fix virtio.
Just curious... What values do you want to use? The 32 MB minimum I/O
size that are possible with 16 bits and 512 byte sectors already sounds
insanely large.
Kevin
next prev parent reply other threads:[~2012-05-02 14:35 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <1335801160-321-1-git-send-email-mjt@msgid.tls.msk.ru>
2012-05-01 8:27 ` [Qemu-devel] [PATCH] increase BlockConf.min_io_size type from uint16_t to uint32_t Stefan Hajnoczi
2012-05-01 8:43 ` Michael Tokarev
2012-05-02 9:57 ` Kevin Wolf
2012-05-02 10:08 ` Michael Tokarev
2012-05-02 14:35 ` Kevin Wolf [this message]
2012-05-02 15:37 ` Michael Tokarev
2012-05-02 15:53 ` Kevin Wolf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4FA1462D.7020108@redhat.com \
--to=kwolf@redhat.com \
--cc=anthony@codemonkey.ws \
--cc=mjt@tls.msk.ru \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).