qemu-devel.nongnu.org archive mirror
From: Kevin Wolf <kwolf@redhat.com>
To: "Denis V. Lunev" <den@openvz.org>
Cc: qemu-devel@nongnu.org, Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH for 2.7 1/1] qcow2: improve qcow2_co_write_zeroes()
Date: Tue, 26 Apr 2016 10:23:21 +0200
Message-ID: <20160426082321.GA4213@noname.str.redhat.com>
In-Reply-To: <571DEF7D.1010304@openvz.org>

On 25.04.2016 at 12:20, Denis V. Lunev wrote:
> On 04/25/2016 12:05 PM, Kevin Wolf wrote:
> >On 23.04.2016 at 14:05, Denis V. Lunev wrote:
> >>Unfortunately, the Linux kernel can send non-aligned requests to qemu-nbd
> >>if the caller is using O_DIRECT and does not align in-memory data to a
> >>page boundary. Thus qemu-nbd will call the block layer with non-aligned
> >>requests.
> >>
> >>qcow2_co_write_zeroes forcibly asks the caller to supply block-aligned
> >>data. Otherwise it rejects the request with ENOTSUP, which is properly
> >>handled at the upper level. The problem is that this grows the image.
> >>
> >>This could be optimized a bit:
> >>- a particular request could be split into a block-aligned part and a
> >>   head/tail, which could be handled separately
> >In fact, this is what bdrv_co_do_write_zeroes() is already supposed to
> >do. qcow2 exposes its cluster size as bs->bl.write_zeroes_alignment, so
> >block/io.c should split the request in three.
> >
> >If you see something different happening, we may have a bug there.
> >
> Please look at the commit
> 
> commit 459b4e66129d091a11e9886ecc15a8bf9f7f3d92
> Author: Denis V. Lunev <den@openvz.org>
> Date:   Tue May 12 17:30:56 2015 +0300
> 
>     block: align bounce buffers to page
> 
> The situation is exactly like the one described there. The user
> of /dev/nbd0 writes with O_DIRECT and has buffers that are not
> aligned to a page. Thus the real operations on the qemu-nbd
> layer become unaligned to the block size.

I don't understand the connection to this patch. Unaligned buffers on
the NBD client shouldn't even be visible in the server, unless they
already result in the client requesting different things. If so, what is
the difference in the NBD requests? And can we reproduce the same
locally with qemu-io and no NBD involved?

> Thus bdrv_co_do_write_zeroes is helpless here unfortunately.

How can qcow2 fix something that bdrv_co_do_write_zeroes() can't
possibly fix? In particular, why does splitting the request in head,
tail and aligned part help when done by qcow2, but the same thing
doesn't help when done by bdrv_co_do_write_zeroes()?

I'd actually be interested in both parts of the answer, because I'm not
sure how _memory_ alignment on the client can possibly be fixed in
qcow2; but if it's about _disk_ alignment, I don't understand why it
can't be fixed in bdrv_co_do_write_zeroes().
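
For reference, the three-way split under discussion can be sketched as
follows. This is only an illustration of disk-offset alignment, not the
code in block/io.c or qcow2; the ZeroSplit type and split_zero_request()
are hypothetical names, and the alignment is assumed to be the cluster
size that qcow2 reports in bs->bl.write_zeroes_alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not a QEMU structure. */
typedef struct ZeroSplit {
    uint64_t head_offset, head_bytes; /* unaligned start: ordinary write */
    uint64_t mid_offset, mid_bytes;   /* whole clusters: can be flagged as zero */
    uint64_t tail_offset, tail_bytes; /* unaligned end: ordinary write */
} ZeroSplit;

/* Split [offset, offset + bytes) at multiples of align. */
static ZeroSplit split_zero_request(uint64_t offset, uint64_t bytes,
                                    uint64_t align)
{
    ZeroSplit s = {0};
    uint64_t end = offset + bytes;
    uint64_t mid_start, mid_end;

    assert(align > 0);

    mid_start = (offset + align - 1) / align * align; /* round up */
    mid_end = end / align * align;                    /* round down */

    if (mid_start >= mid_end) {
        /* No complete cluster is covered: the whole request is "head". */
        s.head_offset = offset;
        s.head_bytes = bytes;
        s.mid_offset = s.tail_offset = end;
        return s;
    }

    s.head_offset = offset;
    s.head_bytes = mid_start - offset;
    s.mid_offset = mid_start;
    s.mid_bytes = mid_end - mid_start;
    s.tail_offset = mid_end;
    s.tail_bytes = end - mid_end;
    return s;
}
```

With 64k clusters, a request at offset 1000 with a length of 200000
bytes yields a head of 64536 bytes, an aligned middle of 131072 bytes
and a tail of 4392 bytes. Nothing in this calculation depends on where
the client's buffer lives in memory, only on the disk offset and length.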

> >>- writes could be omitted when we do know that the image already contains
> >>   zeroes at the offsets being written
> >I don't think this is a valid shortcut. The semantics of a write_zeroes
> >operation is that the zeros (literal or as flags) are stored in this
> >layer and that the backing file isn't involved any more for the given
> >sectors. For example, a streaming operation or qemu-img rebase may
> >involve write_zeroes operations, and relying on the backing file would
> >cause corruption there (because the whole point of the operation is that
> >the backing file can be removed).
> This is not a problem. The block will be absent and thus it will be
> read as zeroes.

Removing a backing file doesn't mean that there won't still be another
backing file. You may have only removed one node in the backing file
chain, or in the case of rebase, you switch to another backing file.
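
A toy model of the problem, assuming a single cluster per layer and
using made-up names (Layer, ClusterState, read_cluster) that do not
correspond to QEMU code: a read falls through unallocated clusters to
the next backing file, so a cluster that was left unallocated changes
its contents once the backing file is swapped, while a cluster that
was explicitly marked as zero does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-cluster model of an image layer; not QEMU code. */
typedef enum { CL_UNALLOCATED, CL_ZERO, CL_DATA } ClusterState;

typedef struct Layer {
    ClusterState state;
    unsigned char data;          /* payload, only meaningful for CL_DATA */
    const struct Layer *backing; /* next image in the chain, or NULL */
} Layer;

/* Resolve a read by walking down the backing chain. */
static unsigned char read_cluster(const Layer *l)
{
    for (; l != NULL; l = l->backing) {
        if (l->state == CL_DATA) {
            return l->data;
        }
        if (l->state == CL_ZERO) {
            return 0;
        }
        /* CL_UNALLOCATED: defer to the backing file */
    }
    return 0; /* no layer in the chain allocates this cluster */
}
```

If the top layer is left CL_UNALLOCATED because the old backing file
happened to read as zeroes, pointing its backing link at a file that
holds data makes read_cluster() return that data. A top layer in state
CL_ZERO keeps returning 0 regardless of what it is rebased onto.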

Kevin
