From: Hans Reiser <reiser@namesys.com>
To: Andi Kleen <ak@muc.de>
Cc: reiserfs-list@namesys.com
Subject: Re: reiser4 data journalling?
Date: Wed, 03 Sep 2003 02:21:20 +0400
Message-ID: <3F5517E0.6000403@namesys.com>
In-Reply-To: <m3oey7m2ed.fsf@averell.firstfloor.org>

Andi Kleen wrote:

>Hans Reiser <reiser@namesys.com> writes:
>  
>
>>Because atomicity in ext3 basically consists of guaranteeing that the filesystem is consistent, and due to the disk drive implementation 4k blocks are written atomically.  So, if you span more than one block, it is not guaranteed to be atomic.
>>    
>>
>
>I was told most modern disks run with 32-64K blocks internally.
>
This is very interesting; it would explain something that has puzzled me
for a long time about 64k being the optimal transfer size.

>4K is very unlikely.
>
>Of course they don't always guarantee that a write of such a block
>is atomic, but then they also don't guarantee it for 4K.
>
What they do in the event of power loss is the big unknown for hard
drives.  Maybe I should ask for government funding to test what hard
drives actually do, and how they handle power loss. ;-)  Somebody
needs to test these folks.....

Most filesystems are designed around the assumption that ECC on
the drive will detect failed writes with 4k granularity.  If I remember
right, some folks on this list have suggested that for some IBM drives
a power failure can cause a chunk larger than 4k to go bad, and that
this can be a problem for filesystem consistency.

It would probably be nice for the OS and the drive to use the same block 
size..... or at least interesting to experiment with.

>The only sure way is to wait for the disk telling you it is finished 
>(= use write barriers)  and/or turn the write buffer off.
>Even then you have to hope that the disk firmware doesn't lie to you.
>
>-Andi
>
>
>  
>
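
As a rough illustration of the two points above (a write that spans more
than one block is not guaranteed atomic, and the only sure way is to wait
until the disk says it is finished), here is a minimal userspace sketch in
Python.  The names blocks_spanned and durable_write are hypothetical, and
the 4096-byte block size is only the figure assumed in this thread, not
something read from the drive.  fsync() is the nearest userspace analogue
of waiting for completion; whether it really reaches the platter still
depends on the kernel, the filesystem, and the drive firmware.

```python
import os


def blocks_spanned(offset: int, length: int, block_size: int = 4096) -> int:
    """Return how many block_size-sized blocks a write at offset touches.

    block_size=4096 is an assumption taken from the discussion, not a
    value queried from any real device.
    """
    if length <= 0:
        return 0
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return last - first + 1


def durable_write(path: str, data: bytes) -> int:
    """Write data to path and return only after fsync() has returned.

    This asks the kernel to flush the file to the device.  It cannot
    prove that the drive's own write cache was flushed.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        while written < len(data):
            # os.write may write fewer bytes than requested; loop until done.
            written += os.write(fd, data[written:])
        os.fsync(fd)
    finally:
        os.close(fd)
    return written
```

A write for which blocks_spanned() returns more than 1 is one that, under
the 4k assumption, could be torn by a power failure even if each
individual block is written atomically.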


-- 
Hans


