From: Matthias Andree <matthias.andree@gmx.de>
To: Christoffer Hall-Frederiksen <hall@diku.dk>
Cc: Matthias Andree <matthias.andree@gmx.de>,
    Jens Axboe <axboe@suse.de>,
    Heikki Tuuri <Heikki.Tuuri@innodb.com>,
    linux-kernel@vger.kernel.org
Subject: Re: True fsync() in Linux (on IDE)
Date: Mon, 22 Mar 2004 21:28:14 +0100
Message-ID: <20040322202814.GA14746@merlin.emma.line.org>
In-Reply-To: <405F3A9C.3050307@diku.dk>
On Mon, 22 Mar 2004, Christoffer Hall-Frederiksen wrote:
> >If there is no such atomicity (except maybe in ext3fs data=journal or
> >the upcoming reiserfs4 - isn't there?), then nobody should claim so. If
> >the kernel cannot 100.00000000% guarantee the write is atomic, claiming
> >otherwise is plain fraud and nothing else.
> >
> >Some people bet their whole business/company and hence a fair deal of
> >their belongings on a single data base, and making them believe facts
> >that simply aren't reality is dangerous. These people will have very
> >little understanding for sloppiness here. Linux has no obligation to be
> >fast or reliable, but it MUST PROPERLY AND TRUTHFULLY state what it can
> >guarantee and what it cannot guarantee.
>
> Some databases (eg. oracle) can write a checksum for each database page
> to overcome this problem, as this is not just "a linux problem".
I am aware that some databases support checksumming (Berkeley DB does
too, since v4.1 (*), and probably many more, so that they know where log
recovery starts). But does that make it sensible to claim that timing
(some stochastic factor) would usually give "guarantees" about the
atomicity of an individual page write when the hardware doesn't
guarantee anything beyond 512 bytes at a time? I think it does not.
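To illustrate what a per-page checksum does and does not buy, here is a
minimal sketch. The 4096-byte page size, the CRC32 trailer layout and all
function names are assumptions made for illustration only; they are not
how Oracle or Berkeley DB lay out their pages. The point is that the
checksum only detects a torn page after the fact. It does not make the
write atomic.

```python
import zlib

SECTOR = 512        # the only unit the hardware is assumed to write atomically
PAGE = 4096         # hypothetical database page size
PAYLOAD = PAGE - 4  # the last 4 bytes of a page hold a CRC32 of the payload


def seal(payload: bytes) -> bytes:
    """Return a full page: the payload followed by its CRC32."""
    assert len(payload) == PAYLOAD
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def is_intact(page: bytes) -> bool:
    """True if the stored CRC32 matches the payload."""
    stored = int.from_bytes(page[PAYLOAD:], "big")
    return zlib.crc32(page[:PAYLOAD]) == stored


def torn_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Simulate a crash after only the first sectors of `new` hit the disk."""
    cut = sectors_written * SECTOR
    return new[:cut] + old[cut:]


old_page = seal(b"A" * PAYLOAD)
new_page = seal(b"B" * PAYLOAD)

# Three of eight sectors made it to disk before the (simulated) power loss.
torn = torn_write(old_page, new_page, 3)
```

Here `is_intact(torn)` is false, so recovery knows the page is damaged and
must fall back on the log. Nothing in this scheme promises that the page
on disk is either wholly old or wholly new.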
I don't mean to beat up anyone; I'd just like to have the guarantees
documented without thin-ice promises of the "usually you'll get more"
kind. It's good to get more than what was asked for, but the application
designer cannot rely on that because he gets no guarantees. So why
bother wasting space and time on writing and reading such lines? Or
even discussing them?
Maybe an interface that lets an application query the maximum size of an
atomic write for any given file system (perhaps a stat[v]fs extension)
would be useful though, so that applications can be optimized for
data-journaling file systems should these prove capable of providing
"large atomic write" guarantees.
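A rough sketch of how an application might use such a query follows. The
parameter `atomic_max` stands in for the value a hypothetical stat[v]fs
extension would report; no such field exists today, and the function name
and the returned dictionary are likewise made up for illustration. Without
any promise from the file system, the sketch falls back to the 512-byte
sector.

```python
def plan_page_write(page_size: int, atomic_max: int = 512) -> dict:
    """Decide how a page write must be protected.

    atomic_max is the largest write the file system promises to be atomic
    (hypothetical value; defaults to one 512-byte sector).
    """
    if page_size <= atomic_max:
        # The whole page fits in one atomic write: no torn pages possible.
        return {"needs_checksum": False, "atomic_units": 1}
    # The page spans several atomic units and can be torn between them,
    # so the application has to detect that itself, e.g. by checksumming.
    units = -(-page_size // atomic_max)  # ceiling division
    return {"needs_checksum": True, "atomic_units": units}
```

With only the sector guarantee, `plan_page_write(4096)` reports eight
atomic units and asks for a checksum. With a (hypothetical) file system
promising 64 kB atomic writes, `plan_page_write(4096, 65536)` reports a
single unit and the checksum could be dropped.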
(*) http://cvs.sourceforge.net/viewcvs.py/bogofilter/bogofilter/src/datastore_db.c?only_with_tag=branch-db-txn#rev1.93.2.5
--
Matthias Andree
Encrypt your mail: my GnuPG key ID is 0x052E7D95