From: Jens Axboe <axboe@suse.de>
To: Linux Kernel <linux-kernel@vger.kernel.org>
Cc: matthias.andree@gmx.de
Subject: Re: True fsync() in Linux (on IDE)
Date: Thu, 18 Mar 2004 12:55:45 +0100
Message-ID: <20040318115544.GN22234@suse.de>
In-Reply-To: <20040318113453.GB6864@merlin.emma.line.org>


(btw - maybe you don't like to be cc'ed on kernel posts, but I do. it's
lkml etiquette to do so, and it makes sure that I see your mail.
otherwise I might not, especially true for bigger threads. so please, cc
people. thanks)

On Thu, Mar 18 2004, Matthias Andree wrote:
> On Thu, 18 Mar 2004, Jens Axboe wrote:
> 
> > Chris and I have working real fsync() with the barrier patches. I'll
> > clean it up and post a patch for vanilla 2.6.5-rc today.
> 
> This is good news.
> 
> The barrier stuff is long overdue^UI'm looking forward to this.
> 
> I'm using the term "TCQ" liberally although it may be inexact for older
> (parallel) ATA generations:
> 
> All these ATA fsync() vs. write cache issues have been open for much too
> long - no reproaches, but it's a pity we haven't been able to have data
> consistency for data bases and fast bulk writes (that need the write
> cache without TCQ) in the same drive for so long. I have seen Linux
> introduce TCQ for PATA early in 2.5, then drop it again. Similarly,
> FreeBSD ventured into TCQ for ATA but appears to have dropped it again
> as well.

That's because PATA TCQ sucks :-)

> May I ask that the information whether a particular driver (file system,
> hardware) supports write barriers be exposed in a standard way, for
> instance in the Kconfig help lines?

Since reiser is the first implementation of it, it gets to choose how
this works. Currently that's done by mounting with -o barrier=flush
(=ordered used to exist as well and will probably return; right now we
have only played with IDE).
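
For illustration only (this is not part of the barrier patches): from
the application side, the path that barrier=flush is meant to make
durable is simply write() followed by fsync(). A minimal user-space
sketch, assuming a POSIX system; durable_write() is a hypothetical
helper name, not an existing API:

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): write len bytes from buf to
 * path, then ask the kernel to push them to stable storage.
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
int durable_write(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			close(fd);
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}

	/*
	 * fsync() returns once the kernel has written the file out.
	 * Whether the drive's write cache is flushed as well depends on
	 * the barrier/flush support discussed in this thread.
	 */
	if (fsync(fd) < 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}
```

A caller would treat a non-zero return as "data may not be on disk",
e.g. durable_write("/tmp/example.dat", "hello\n", 6).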

> If I recall correctly from earlier patches, the barrier stuff is 1.
> command model (ATA vs.  SCSI) specific and 2. driver and hardware
> specific and 3. requires that the file system knows how to use this
> properly.

Yes.

> Given that file systems have certain write ordering requirements if they
> are to be recoverable after a crash, I suspect Linux has _not_ been able
> to guarantee on-disk consistency for any time for years, which means
> that a crash in the wrong moment can kill the file system itself if the
> drive has reordered writes - only ext3 without write cache seems to
> behave better in this respect (data=ordered).
> 
> I would like to have a document that shows which file system, which
> chipset driver for PATA, which chipset driver for ATA, which low-level
> SCSI host adaptor driver, which file system support write barrier. We
> will probably also need to check if intermediate layers such as md and
> dm-mod propagate such information.

Only PATA core needs to support it, not the chipset drivers. md and dm
aren't difficult to implement now that unplug/congestion already
iterates the device list and I added a blkdev_issue_flush() command.

> Given the necessary information, I can hack together a HTML document to
> provide this information; this offer has however not seen any response
> in the past. I am however not acquainted with the drivers and need
> information from the kernel hackers. Without such support, such a
> documentation effort is doomed.

Usual approach - just start writing, it's a lot easier to get
corrections (people seem to be several times more willing to point out
your errors than give you recommendations for something you haven't
started yet).

> BTW, I should very much like to be able to trace the low-level write
> information that goes out to the device, possibly including the payload
> - something like tcpdump for the ATA or SCSI commands that are sent to
> the driver. Is such a facility available?

No.

-- 
Jens Axboe

