From: Chris Mason <mason@suse.com>
To: Sam Vilain <sam@vilain.net>
Cc: Oleg Drokin <green@namesys.com>, reiserfs-list@namesys.com
Subject: Re: Data corrupted after crash
Date: 26 Jul 2002 11:17:48 -0400
Message-ID: <1027696668.8530.255.camel@tiny>
In-Reply-To: <20020726134405.D4DB145B@hofmann>

On Fri, 2002-07-26 at 09:44, Sam Vilain wrote:
> Oleg Drokin <green@namesys.com> wrote:
> 
> > No. Reiserfs provides only metadata journaling. So the data itself still
> > may be damaged (data still may be damaged even in case of full data
> > journaling just because there is no API for applications to control
> > transactions currently)
> 
> Still, data journalling would be nice.  I must say that when reiserfs gets
> the power taken out from under it, you don't half end up with your
> recently worked on files containing pieces of each other.
> 
> With data journalling, this would not be a problem; although of course
> files could end up in a half-updated state, which to a given application
> may as well be corrupted.  It would be a bit slower, unless your journal
> is on a separate device, but that can't be helped.

Actually, data journaling doesn't give you much more protection than
ordered write mode.  This is because userspace has no way to influence
which data gets included in a given transaction.  So, if you write
16k, it might become 1 atomic unit or 4; you have no way of knowing.

Both ordered data mode and journaled data mode make sure that new blocks
added to a file are flushed before the transaction commits.  This makes
sure you don't get garbage in the file.

Journaled data mode also makes sure that a single block is written as an
atomic unit.  If your 4k block spans eight 512-byte sectors on the drive,
you know after a crash all 8 will either be updated or not changed at
all.
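
To put rough numbers on the block and sector arithmetic above, here is a
small sketch.  It assumes the 4k block and 512-byte sector sizes from the
example; the helper names are made up for illustration and are not part
of any filesystem interface.

```python
SECTOR_SIZE = 512    # bytes per drive sector (from the example above)
BLOCK_SIZE = 4096    # bytes per filesystem block (assumed 4k)


def sectors_per_block(block_size=BLOCK_SIZE, sector_size=SECTOR_SIZE):
    """Number of drive sectors that make up one filesystem block."""
    return block_size // sector_size


def blocks_touched(offset, length, block_size=BLOCK_SIZE):
    """Number of filesystem blocks a write of `length` bytes at `offset` covers."""
    if length <= 0:
        return 0
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return last - first + 1


if __name__ == "__main__":
    print(sectors_per_block())        # sectors in one 4k block
    print(blocks_touched(0, 16384))   # blocks covered by an aligned 16k write
```

An aligned 16k write covers 4 blocks.  Each block is atomic on its own in
journaled data mode, but whether those 4 blocks land in one transaction
or several is still outside the application's control.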

Journaled data mode also gives you complete ordering of data writes with
respect to the metadata.  So, if a single process does this:

create(file1)
write(file1)
rename(file1, file2)
write(file2)

You know that each step along the chain will either include the previous
step or not be done at all.  In other words, you know the rename won't
happen without the data in file1 being updated.
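
For anyone who wants to reproduce that sequence from userspace, a minimal
sketch follows.  The function name, file names and payloads are made up
for illustration.  The code only issues the four operations in order; it
does not test or enforce any crash guarantee, which depends entirely on
the data mode the filesystem is running in.

```python
import os
import tempfile


def run_sequence(directory):
    """Issue create, write, rename, write in order; return the final path."""
    file1 = os.path.join(directory, "file1")
    file2 = os.path.join(directory, "file2")

    # create(file1)
    fd = os.open(file1, os.O_CREAT | os.O_WRONLY | os.O_EXCL, 0o644)
    try:
        # write(file1)
        os.write(fd, b"first write\n")
    finally:
        os.close(fd)

    # rename(file1, file2)
    os.rename(file1, file2)

    # write(file2), appended so the first write stays visible
    fd = os.open(file2, os.O_WRONLY | os.O_APPEND)
    try:
        os.write(fd, b"second write\n")
    finally:
        os.close(fd)
    return file2


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = run_sequence(d)
        with open(path, "rb") as f:
            print(f.read())
```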

Most people don't really need either feature, and data=ordered is faster
most of the time.  Journaled data mode can be much faster for
synchronous data writes though.
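
As a rough illustration of a synchronous data write, the sketch below
writes a buffer and then waits in fsync() until the kernel reports the
data as flushed.  Taking "synchronous" to mean write plus fsync() (as
opposed to, say, opening with O_SYNC) is an assumption here, and
sync_write is a made-up helper name.

```python
import os
import tempfile


def sync_write(path, data):
    """Write `data` to `path` and wait for it to be flushed to the device."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
    try:
        written = 0
        # os.write may write fewer bytes than asked, so loop until done.
        while written < len(data):
            written += os.write(fd, data[written:])
        # Block until the kernel reports the file's data as flushed.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        n = sync_write(os.path.join(d, "sync-test"), b"x" * 16384)
        print(n)
```

Timing a loop of such calls on the same machine under each data mode is
one way to see the difference for a given workload.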

You can try the current data logging patches at:

ftp.suse.com/pub/people/mason/patches/data-logging

I've just uploaded data-logging-20.diff, which should fix a bug where
some writes were not properly ordered if you crashed right after a
truncate or tail conversion (it might take the mirror a few minutes to
update).  

-chris