From: Jamie Lokier <jamie@shareable.org>
To: qemu-devel@nongnu.org
Cc: Blue Swirl <blauwirbel@gmail.com>,
Laurent Vivier <Laurent.Vivier@bull.net>,
Kevin Wolf <kwolf@suse.de>
Subject: Re: [Qemu-devel] Re: [PATCH][v2] Align file accesses with cache=off (O_DIRECT)
Date: Wed, 21 May 2008 02:19:15 +0100
Message-ID: <20080521011915.GC595@shareable.org>
In-Reply-To: <48337444.2070203@codemonkey.ws>

Anthony Liguori wrote:
> >One property of disks is that if you overwrite a sector and there's
> >power loss, when read later that sector might be corrupt. Even if the
> >new data is the same as the old data with only some bytes changed,
> >some of the _unchanged_ bytes may be corrupted by this.
>
> I don't think this is true. What evidence do you have to support such
> claims?

What do you imagine happens when you pull the power in the middle of
writing a sector to a floppy disk (to pick a more easily imagined
example)?

There is not enough residual power to write the rest of the sector.
That sector's checksum will therefore be wrong, and reading it will
(hopefully) give a CRC error. It can be written over again, wiping the
CRC error.

No sector which wasn't being written will be corrupt: the write head
isn't activated over those. The drive waits until it senses the start
of sector N, then activates the write head to write data bits.

The CRC error by itself may cause the whole sector to be reported as
corrupt with no data. However, if you do manage to get back the bits
from the media, some bits of the sector being written whose values
were not intended to change may be different than expected. This is
because the way data is recorded does not encode each bit separately,
but multiplexes them together for modulation, and also because bit
timing is not exact.
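
To make the CRC part concrete, here is a rough Python sketch of a truncated sector write. It is only a toy model under my own assumptions: the 512-byte payload, the CRC32 trailer and the names make_sector, crc_ok and torn_write are all invented for illustration, not taken from any real drive format. It works on whole bytes, so it shows only the checksum mismatch, not the modulation effect on neighbouring bits.

```python
import zlib

SECTOR_SIZE = 512  # assumed payload size in bytes (illustrative only)


def make_sector(payload):
    """Return the payload followed by its 4-byte CRC32, as a toy on-media image."""
    assert len(payload) == SECTOR_SIZE
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def crc_ok(image):
    """Compare the stored CRC with one recomputed from the payload."""
    payload, stored = image[:SECTOR_SIZE], image[SECTOR_SIZE:]
    return zlib.crc32(payload).to_bytes(4, "big") == stored


def torn_write(old_image, new_image, bytes_written):
    """Model power loss: only the first bytes_written bytes of the new image land."""
    return new_image[:bytes_written] + old_image[bytes_written:]


old_payload = bytes(SECTOR_SIZE)
new_payload = bytearray(old_payload)
new_payload[98:102] = b"\xff" * 4  # the update changes only four bytes

old_image = make_sector(old_payload)
new_image = make_sector(bytes(new_payload))

# Power fails after 100 bytes: two of the four changed bytes are on the
# media, the other two and the CRC trailer are still the old ones.
torn_image = torn_write(old_image, new_image, 100)

print(crc_ok(old_image), crc_ok(new_image), crc_ok(torn_image))
```

The torn image matches neither the old nor the new sector, and its stored CRC no longer agrees with its payload, so a read would report an error until the sector is rewritten in full.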

A modern hard disk uses much more complex data encoding, which makes
it even more likely that a truncated write corrupts data bits which
were not intended to change, in the vicinity of those being changed.

But it should aim to provide the same basic guarantee that writing a
sector cannot corrupt neighbouring sectors on power failure, only the
one(s) being written. This is because the robustness of journalling
filesystems and databases does rather depend on this property, and
simple old-fashioned disks do provide it.
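
As a sketch of why that guarantee matters, the following Python models a journal with one checksummed record per sector. This is an assumption-laden illustration, not how any particular filesystem or database lays out its journal: pack_record, record_ok and replay are hypothetical names, and a single damaged byte stands in for a torn write.

```python
import zlib

SECTOR_SIZE = 512  # assumed sector size in bytes (illustrative only)


def pack_record(data):
    """Pad a short record to one sector and append a 4-byte CRC32 of the body."""
    body = data.ljust(SECTOR_SIZE - 4, b"\0")
    return body + zlib.crc32(body).to_bytes(4, "big")


def record_ok(sector):
    """Check the trailing CRC32 against the sector body."""
    return zlib.crc32(sector[:-4]).to_bytes(4, "big") == sector[-4:]


def replay(journal):
    """Return records up to, but not including, the first sector that fails its check."""
    committed = []
    for sector in journal:
        if not record_ok(sector):
            break
        committed.append(sector[:-4].rstrip(b"\0"))
    return committed


journal = [pack_record(b"txn 1"), pack_record(b"txn 2"), pack_record(b"txn 3")]

# Stand-in for power loss while the third sector was being written:
# damage one byte of that sector and leave its neighbours untouched.
damaged = bytearray(journal[2])
damaged[10] ^= 0xFF
journal[2] = bytes(damaged)

recovered = replay(journal)
print(recovered)
```

Recovery here works only because the damage stays inside the sector that was being written. If a torn write could also damage the first or second sector, replay would throw away records that had already been committed.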

I am just speculating; I don't know whether modern hard disks provide
this property, or under what circumstances they fail. But it seems
they could provide it, because they still have physically independent
sectors.

(Interestingly, the journal block size used by Oracle on different
OSes is different, suggesting the "basic unit of corruption"
varies between OSes and is not always a single sector.)

Although it's just speculation, do you think modern hard disks behave
differently from this?

-- Jamie