qemu-devel.nongnu.org archive mirror
From: Anthony Liguori <anthony@codemonkey.ws>
To: qemu-devel@nongnu.org
Cc: Blue Swirl <blauwirbel@gmail.com>,
	Laurent Vivier <Laurent.Vivier@bull.net>,
	Kevin Wolf <kwolf@suse.de>
Subject: Re: [Qemu-devel] Re: [PATCH][v2] Align file accesses with cache=off (O_DIRECT)
Date: Tue, 20 May 2008 21:12:50 -0500
Message-ID: <48338522.7030306@codemonkey.ws>
In-Reply-To: <20080521011915.GC595@shareable.org>

Jamie Lokier wrote:
> Anthony Liguori wrote:
>   
>>> One property of disks is that if you overwrite a sector and there's
>>> power loss, when read later that sector might be corrupt.  Even if the
>>> new data is the same as the old data with only some bytes changed,
>>> some of the _unchanged_ bytes may be corrupted by this.
>>>       
>> I don't think this is true.  What evidence do you have to support such 
>> claims?
>>     
>
> What do you imagine happens when you pull the power in the middle of
> writing a sector to a floppy disk (to pick a more easily imagined
> example)?
>
> There is not enough residual power to write the rest of the sector.
> That sector's checksum will therefore be corrupt, and (hopefully) have
> a CRC read error.  It can be written over again, wiping the CRC error.
>   

Why would the sector's checksum be corrupt?  The checksum wouldn't 
change after the data write.

> No sector which wasn't being written will be corrupt: the write head
> isn't activated over those.  The drive waits until it senses the start
> of sector N, then activates the write head to write data bits.
>
> The CRC error by itself may cause the whole sector to be reported as
> corrupt with no data.  However, if you do manage to get back the bits
> from the media, some bits of the sector being written whose values
> were not intended to change may be different than expected.  This is
> because the way data is recorded does not encode each bit separately,
> but multiplexes them together for modulation, and also because bit
> timing is not exact.
>
> A modern hard disk uses much more complex data encoding, which further
> adds to the effect of a truncated write corrupting even data bits not
> intended to be changed, in the vicinity of those being changed.
>
> But it should aim to provide the same basic guarantee that writing a
> sector cannot corrupt neighbouring sectors on power failure, only the
> one(s) being written.  This is because robustness of journalling
> filesystems and databases does rather depend on this property, and
> simple old-fashioned disks do provide it.
>
> I am just speculating; I don't know whether modern hard disks provide
> this property, or under what circumstances they fail.  But it seems
> they could provide it, because they still have physically independent
> sectors.
>
> (Interestingly, the journal block size used by Oracle on different
> OSes is different, suggesting the "basic unit of corruption"
> varies between OSes and is not always a single sector).
>
> Although it's just speculation, do you think modern hard disks behave
> differently from this?
>   

Modern *enterprise* hard disks have battery-backed caches, so a read/write 
operation either completes or fails as a whole.  Low-end disks don't tend 
to have battery-backed caches but, AFAIK, rewriting the same data will not 
result in any sort of disk corruption.
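
To make the two cases concrete, here is a minimal sketch in Python.  It is 
a toy model, not how any real drive works: it assumes a sector is stored as 
512 data bytes followed by a CRC32 of those bytes, and that a power loss 
simply stops the write after some number of bytes.  It deliberately ignores 
the modulation and bit-timing effects Jamie describes.  The names 
make_sector, torn_write and crc_ok are made up for illustration.

```python
import zlib

SECTOR_SIZE = 512  # assumed data payload size; real drives differ


def make_sector(data: bytes) -> bytes:
    """Return the on-media image: data followed by a 4-byte CRC32."""
    assert len(data) == SECTOR_SIZE
    return data + zlib.crc32(data).to_bytes(4, "big")


def torn_write(old_sector: bytes, new_data: bytes, bytes_written: int) -> bytes:
    """Model a write cut short by power loss.

    Only the first bytes_written bytes of the new image reach the media;
    the rest of the old image (including the old CRC) stays in place.
    """
    new_sector = make_sector(new_data)
    return new_sector[:bytes_written] + old_sector[bytes_written:]


def crc_ok(sector: bytes) -> bool:
    """Check the stored CRC against one recomputed from the data bytes."""
    data, stored = sector[:SECTOR_SIZE], sector[SECTOR_SIZE:]
    return zlib.crc32(data).to_bytes(4, "big") == stored


old = make_sector(bytes(SECTOR_SIZE))            # all-zero sector on disk
changed = b"\x01" + bytes(SECTOR_SIZE - 1)       # one bit differs from old
same = bytes(SECTOR_SIZE)                        # identical to old data

# Interrupted write of changed data: new bytes land, but the CRC is stale.
torn_changed = torn_write(old, changed, 100)

# Interrupted rewrite of identical data: the image on media is unchanged.
torn_same = torn_write(old, same, 100)

print(crc_ok(torn_changed), crc_ok(torn_same))
```

In this model an interrupted write of changed data leaves a stale checksum 
behind, while an interrupted rewrite of identical data leaves the sector 
byte-for-byte as it was.  Whether real media behave like the second case is 
exactly the open question above.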

Regards,

Anthony Liguori


> -- Jamie
>
>
>   
