Re: [Qemu-devel] [PATCH 4/4] block: avoid creating oversized writes in multiwrite_merge

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Peter Lieven <pl@kamp.de>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	ronniesahlberg@gmail.com, qemu-devel@nongnu.org,
	stefanha@redhat.com, mreitz@redhat.com
Subject: Re: [Qemu-devel] [PATCH 4/4] block: avoid creating oversized writes in multiwrite_merge
Date: Tue, 30 Sep 2014 09:26:43 +0200	[thread overview]
Message-ID: <542A5B33.20707@kamp.de> (raw)
In-Reply-To: <20140923100517.GE3871@noname.str.redhat.com>

On 23.09.2014 12:05, Kevin Wolf wrote:
> Am 23.09.2014 um 11:52 hat Peter Lieven geschrieben:
>> On 23.09.2014 11:47, Kevin Wolf wrote:
>>> Am 23.09.2014 um 11:32 hat Peter Lieven geschrieben:
>>>> On 23.09.2014 10:59, Kevin Wolf wrote:
>>>>> Am 23.09.2014 um 08:15 hat Peter Lieven geschrieben:
>>>>>> On 22.09.2014 21:06, Paolo Bonzini wrote:
>>>>>>> Il 22/09/2014 11:43, Peter Lieven ha scritto:
>>>>>>>> This series aims not at touching default behaviour. The default for max_transfer_length
>>>>>>>> is 0 (no limit). max_transfer_length is a limit that MUST be satisfied otherwise the request
>>>>>>>> will fail. And Patch 2 aims at catching this fail earlier in the stack.
>>>>>>> Understood.  But the right fix is to avoid that backend limits transpire
>>>>>>> into guest ABI, not to catch the limits earlier.  So the right fix would
>>>>>>> be to implement request splitting.
>>>>>> Since you proposed to add traces for this would you leave those in?
>>>>>> And since iSCSI is the only user of this at the moment would you
>>>>>> go for implementing this check in the iSCSI block layer?
>>>>>>
>>>>>> As for the split logic would you think it is enough to build new qiov's
>>>>>> out of the too big iov without copying the contents? This would work
>>>>>> as long as a single iov inside the qiov is not bigger the max_transfer_length.
>>>>>> Otherwise I would need to allocate temporary buffers and copy around.
>>>>> You can split single iovs, too. There are functions that make this very
>>>>> easy, they copy a sub-qiov with a byte granularity offset and length
>>>>> (qemu_iovec_concat and friends). qcow2 uses the same to split requests
>>>>> at (fragmented) cluster boundaries.
>>>> Might it be as easy as this?
>>> This is completely untested, right? :-)
>> Yes :-)
>> I was just unsure if the general approach is right.
> Looks alright to me.
>
>>> But ignoring bugs, the principle looks right.
>>>
>>>> static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs,
>>>>      int64_t sector_num, int nb_sectors, QEMUIOVector *qiov,
>>>>      BdrvRequestFlags flags)
>>>> {
>>>>      if (nb_sectors < 0 || nb_sectors > (UINT_MAX >> BDRV_SECTOR_BITS)) {
>>>>          return -EINVAL;
>>>>      }
>>>>
>>>>      if (bs->bl.max_transfer_length &&
>>>>          nb_sectors > bs->bl.max_transfer_length) {
>>>>          int ret = 0;
>>>>          QEMUIOVector *qiov2 = NULL;
>>> Make it "QEMUIOVector qiov2;" on the stack.
>>>
>>>>          size_t soffset = 0;
>>>>
>>>>          trace_bdrv_co_do_readv_toobig(bs, sector_num, nb_sectors,
>>>> bs->bl.max_transfer_length);
>>>>
>>>>          qemu_iovec_init(qiov2, qiov->niov);
>>> And &qiov2 here, then this doesn't crash with a NULL dereference.
>>>
>>>>          while (nb_sectors > bs->bl.max_transfer_length && !ret) {
>>>>              qemu_iovec_reset(qiov2);
>>>>              qemu_iovec_concat(qiov2, qiov, soffset,
>>>>                                bs->bl.max_transfer_length << BDRV_SECTOR_BITS);
>>>>              ret = bdrv_co_do_preadv(bs, sector_num << BDRV_SECTOR_BITS,
>>>>                                      bs->bl.max_transfer_length << BDRV_SECTOR_BITS,
>>>>                                      qiov2, flags);
>>>>              soffset += bs->bl.max_transfer_length << BDRV_SECTOR_BITS;
>>>>              sector_num += bs->bl.max_transfer_length;
>>>>              nb_sectors -= bs->bl.max_transfer_length;
>>>>          }
>>>>          qemu_iovec_destroy(qiov2);
>>>>          if (ret) {
>>>>              return ret;
>>>>          }
>>> The error check needs to be immediately after the assignment of ret,
>>> otherwise the next loop iteration can overwrite an error with a success
>>> (and if it didn't, it would still do useless I/O because the request as
>>> a whole would fail anyway).
>> There is a && !ret in the loop condition. I wanted to avoid copying the destroy part.
> Ah, yes, clever. I missed that. Maybe too clever then. ;-)
>
>> BTW, is it !ret or ret < 0 ?
> It only ever returns 0 or negative, so both are equivalent. I
> prefer checks for ret < 0, but that's a matter of style rather than
> correctness.
>
> Another problem I just noticed is that this is not the only caller of
> bdrv_co_do_preadv(), so you're not splitting all requests. The
> synchronous bdrv_read/write/pread/pwrite/pwritev functions all don't get
> the functionality this way.
>
> Perhaps you should be doing it inside bdrv_co_do_preadv(), before the
> call to bdrv_aligned_preadv(). Might even be more correct if it can
> happen that the alignment adjustment increases a request too much to fit
> in bl.max_transfer_length.

If I do it this way can I use the same req Object for all splitted
requests?

Peter

next prev parent reply	other threads:[~2014-09-30  7:27 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-05 16:51 [Qemu-devel] [PATCH 0/4] introduce max_transfer_length Peter Lieven
2014-09-05 16:51 ` [Qemu-devel] [PATCH 1/4] BlockLimits: " Peter Lieven
2014-09-05 16:51 ` [Qemu-devel] [PATCH 2/4] block: immediately cancel oversized read/write requests Peter Lieven
2014-09-08 13:44   ` Benoît Canet
2014-09-08 13:49     ` Paolo Bonzini
2014-09-08 13:56       ` Peter Lieven
2014-09-08 13:58         ` Paolo Bonzini
2014-09-08 14:35           ` Peter Lieven
2014-09-08 14:42             ` Paolo Bonzini
2014-09-08 14:54               ` Peter Lieven
2014-09-23  8:47                 ` Kevin Wolf
2014-09-23  8:55                   ` Peter Lieven
2014-09-23  9:09                     ` Kevin Wolf
2014-09-08 15:13               ` ronnie sahlberg
2014-09-08 15:15                 ` Paolo Bonzini
2014-09-08 15:18                   ` Peter Lieven
2014-09-08 15:27                     ` Paolo Bonzini
2014-09-08 16:18                       ` Peter Lieven
2014-09-08 16:29                         ` Paolo Bonzini
2014-09-12 11:43                           ` Peter Lieven
2014-09-18 14:12                             ` Paolo Bonzini
2014-09-18 14:16                               ` Peter Lieven
2014-09-18 14:17                                 ` Paolo Bonzini
2014-09-18 22:57                                   ` Peter Lieven
2014-09-08 15:16                 ` Peter Lieven
2014-09-05 16:51 ` [Qemu-devel] [PATCH 3/4] block/iscsi: set max_transfer_length Peter Lieven
2014-09-05 16:51 ` [Qemu-devel] [PATCH 4/4] block: avoid creating oversized writes in multiwrite_merge Peter Lieven
2014-09-18 14:13   ` Paolo Bonzini
2014-09-18 22:56     ` Peter Lieven
2014-09-19 13:33       ` Paolo Bonzini
2014-09-22  9:43         ` Peter Lieven
2014-09-22 19:06           ` Paolo Bonzini
2014-09-23  6:15             ` Peter Lieven
2014-09-23  8:59               ` Kevin Wolf
2014-09-23  9:04                 ` Peter Lieven
2014-09-23  9:32                 ` Peter Lieven
2014-09-23  9:47                   ` Kevin Wolf
2014-09-23  9:52                     ` Peter Lieven
2014-09-23 10:05                       ` Kevin Wolf
2014-09-30  7:26                         ` Peter Lieven [this message]
2014-09-30  8:03                           ` Kevin Wolf
2014-09-05 17:05 ` [Qemu-devel] [PATCH 0/4] introduce max_transfer_length ronnie sahlberg
2014-09-05 19:52   ` Peter Lieven
2014-09-05 21:22     ` ronnie sahlberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=542A5B33.20707@kamp.de \
    --to=pl@kamp.de \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=ronniesahlberg@gmail.com \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).