From: Anthony Liguori <anthony@codemonkey.ws>
To: Paul Brook <paul@codesourcery.com>
Cc: Blue Swirl <blauwirbel@gmail.com>,
Laurent Vivier <Laurent.Vivier@bull.net>,
qemu-devel@nongnu.org, Kevin Wolf <kwolf@suse.de>
Subject: Re: [Qemu-devel] Re: [PATCH][v2] Align file accesses with cache=off (O_DIRECT)
Date: Tue, 20 May 2008 20:14:52 -0500
Message-ID: <4833778C.4030209@codemonkey.ws>
In-Reply-To: <200805210205.37432.paul@codesourcery.com>
Paul Brook wrote:
> On Wednesday 21 May 2008, Anthony Liguori wrote:
>
>> Paul Brook wrote:
>>
>>>> When sector-aligned guest offsets are converted to sector-unaligned
>>>> writes (e.g. due to qcow2 etc.), that property is no longer satisfied,
>>>> and power failure of the host disk can cause more damage than the
>>>> guest is designed to be resistant to.
>>>>
>>> Seems like the easiest solution would be to have qcow always align its
>>> writes. We don't do on the fly compression, so it should be fairly easy
>>> to make this happen with minimal overhead.
>>>
>> That's not sufficient. O_DIRECT imposes not only offset alignment
>> requirements but also requirements on the buffer being read to. Most of
>> the code in QEMU does not properly align the read/write buffers.
>>
>
> In that case you need both. For correct operation the qcow layer needs to
> ensure that all file offsets are block aligned (amongst other things, I
> wouldn't be surprised if there are more subtle problems with metadata
> updates).
>
> The memory buffer alignment can occur wherever is most convenient, that's
> trivially atomic w.r.t. unexpected interruptions.
>
Yes, I don't think qcow is very safe at all wrt unexpected power events.

If we're going to support O_DIRECT, then it's important that the
underlying block device emulate accesses that don't meet the
requirements of O_DIRECT. IMHO, this should all happen within
block-raw-posix.c.

Keep in mind, the requirement for O_DIRECT since Linux 2.5 is alignment
to the hard sector size (which is usually 512 bytes, but not always) for
offset, buffer, and size. Pre-2.5, the requirement is the soft sector
size, which on a filesystem is usually 4k.
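
As a rough illustration of that three-way requirement, here is a minimal
sketch of an alignment check. The name direct_io_aligned() and its
parameters are made up for this example and are not existing QEMU code;
the alignment is passed in rather than hard-coded, since the right value
depends on the kernel and the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper (not QEMU code): returns true when the buffer
 * address, the file offset and the transfer size are all multiples of
 * `align`. `align` is assumed to be a power of two, e.g. 512. */
static bool direct_io_aligned(const void *buf, off_t offset,
                              size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    assert(offset >= 0);

    size_t mask = align - 1;

    return ((uintptr_t)buf & mask) == 0 &&
           ((uint64_t)offset & mask) == 0 &&
           (size & mask) == 0;
}
```

A request that fails this check is one that would have to be emulated
rather than passed straight through to the file.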

I don't think it's that important to try and guess the right alignment
size; 512 is probably sufficient in most cases. But spreading an
alignment requirement of 512 throughout the QEMU code is a bad idea,
because this is something that's very hardware/OS specific.
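
To make the "emulate it in one place" idea concrete, here is a minimal
sketch of an unaligned read serviced through an aligned bounce buffer.
bounce_pread() and its `align` parameter are hypothetical names for
illustration, not the actual block-raw-posix.c code. It assumes `align`
is a power of two no smaller than sizeof(void *), issues a single
pread(), and does not retry short reads.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not QEMU code): read up to `count` bytes at
 * `offset`, where neither the caller's buffer nor the offset nor the
 * count needs to be aligned. The real I/O uses a buffer, offset and
 * length that are all multiples of `align`. Returns the number of
 * bytes copied to `buf`, 0 at end of file, or -1 with errno set. */
static ssize_t bounce_pread(int fd, void *buf, size_t count,
                            off_t offset, size_t align)
{
    assert(align >= sizeof(void *) && (align & (align - 1)) == 0);
    assert(offset >= 0);

    if (count == 0) {
        return 0;
    }

    /* Round the offset down and the covered span up to `align`. */
    off_t start = offset & ~((off_t)align - 1);
    size_t head = (size_t)(offset - start);
    size_t span = (head + count + align - 1) & ~(align - 1);

    void *bounce;
    int rc = posix_memalign(&bounce, align, span);
    if (rc != 0) {
        errno = rc;
        return -1;
    }

    ssize_t ret;
    ssize_t got = pread(fd, bounce, span, start);
    if (got < 0) {
        ret = -1;
    } else if ((size_t)got <= head) {
        ret = 0;                      /* nothing at or past `offset` */
    } else {
        size_t avail = (size_t)got - head;
        size_t n = avail < count ? avail : count;
        memcpy(buf, (char *)bounce + head, n);
        ret = (ssize_t)n;
    }

    int saved_errno = errno;          /* free() must not clobber errno */
    free(bounce);
    errno = saved_errno;
    return ret;
}
```

With something along these lines, callers keep issuing whatever offsets
and buffers they like, and the alignment value lives in a single place
where it can be chosen per host. A write path would additionally need a
read-modify-write of the partial head and tail sectors, which is where
the atomicity concerns discussed above come back in.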

For people that care about data integrity, we should be using O_SYNC,
not O_DIRECT anyway.

Regards,
Anthony Liguori
> Paul
>