From: Kevin Wolf <kwolf@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org, mreitz@redhat.com
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH 1/2] block: Convert .bdrv_truncate callback to coroutine_fn
Date: Mon, 25 Jun 2018 11:51:05 +0200 [thread overview]
Message-ID: <20180625095105.GD5024@localhost.localdomain> (raw)
In-Reply-To: <20180625090213.GB29002@stefanha-x1.localdomain>
[-- Attachment #1: Type: text/plain, Size: 1960 bytes --]
Am 25.06.2018 um 11:02 hat Stefan Hajnoczi geschrieben:
> On Thu, Jun 21, 2018 at 07:06:56PM +0200, Kevin Wolf wrote:
> > bdrv_truncate() is an operation that can block (even for a quite long
> > time, depending on the PreallocMode) in I/O paths that shouldn't block.
> > Convert it to a coroutine_fn so that we have the infrastructure for
> > drivers to make their .bdrv_co_truncate implementation asynchronous.
>
> block/commit.c:commit_run() invokes blk_truncate() outside of a drained
> region. I haven't looked for other instances, but this patch opens the
> door for races with other I/O requests. Are you sure it's safe to make
> this asynchronous without request serialization?
After trying to explain why it's correct, I start to think that you're
right at least in one case. The new thing after this patch is that the
truncate operation isn't atomic any more. What this means depends on the
block driver:
* file-posix/win32: I think this one is okay. The truncate
implementation doesn't depend in any way on the content of the image.
Preallocation could be critical (in that it could overwrite
concurrently issued write requests), but the BDS size is only adjusted
after the driver callback has returned, so there can't be a concurrent
write request.
* copy-on-read, crypto, raw-format: Essentially just filter drivers that
pass the request to a child node, no problem.
* gluster, iscsi, nfs, rbd, ssh: Won't yield even after this series, so
trivially okay.
* qcow2: This one is where you're right, it needs to hold s->lock so
that the metadata modifications become safe.
* qed: Does a single header update, should be fine without locking.
* sheepdog: Doesn't yield until it starts preallocation. For
preallocation, the same reasoning as for file-posix applies.
So, if I got this right, the only thing to change is holding s->lock in
qcow2_co_truncate().
Do you agree?
Kevin
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
next prev parent reply other threads:[~2018-06-25 9:51 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-21 17:06 [Qemu-devel] [PATCH 0/2] file-posix: Make truncate/preallocate asynchronous Kevin Wolf
2018-06-21 17:06 ` [Qemu-devel] [PATCH 1/2] block: Convert .bdrv_truncate callback to coroutine_fn Kevin Wolf
2018-06-25 9:02 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2018-06-25 9:51 ` Kevin Wolf [this message]
2018-06-25 14:15 ` Max Reitz
2018-06-21 17:06 ` [Qemu-devel] [PATCH 2/2] file-posix: Make .bdrv_co_truncate asynchronous Kevin Wolf
2018-06-25 8:56 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2018-06-25 9:18 ` Kevin Wolf
2018-06-25 14:24 ` [Qemu-devel] " Max Reitz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180625095105.GD5024@localhost.localdomain \
--to=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).