From: Jens Axboe <axboe@suse.de>
To: Linux Kernel <linux-kernel@vger.kernel.org>
Cc: matthias.andree@gmx.de
Subject: Re: True  fsync() in Linux (on IDE)
Date: Thu, 18 Mar 2004 12:55:45 +0100
Message-ID: <20040318115544.GN22234@suse.de>
In-Reply-To: <20040318113453.GB6864@merlin.emma.line.org>


(btw - maybe you don't like to be cc'ed on kernel posts, but I do. it's
lkml etiquette to do so, and it makes sure that I see your mail.
otherwise I might not, especially true for bigger threads. so please, cc
people. thanks)

On Thu, Mar 18 2004, Matthias Andree wrote:
> On Thu, 18 Mar 2004, Jens Axboe wrote:
> 
> > Chris and I have working real fsync() with the barrier patches. I'll
> > clean it up and post a patch for vanilla 2.6.5-rc today.
> 
> This is good news.
> 
> The barrier stuff is long overdue^UI'm looking forward to this.
> 
> I'm using the term "TCQ" liberally although it may be inexact for older
> (parallel) ATA generations:
> 
> All these ATA fsync() vs. write cache issues have been open for much too
> long - no reproaches, but it's a pity we haven't been able to have data
> consistency for data bases and fast bulk writes (that need the write
> cache without TCQ) in the same drive for so long. I have seen Linux
> introduce TCQ for PATA early in 2.5, then drop it again. Similarly,
> FreeBSD ventured into TCQ for ATA but appears to have dropped it again
> as well.

That's because PATA TCQ sucks :-)

> May I ask that the information whether a particular driver (file system,
> hardware) supports write barriers be exposed in a standard way, for
> instance in the Kconfig help lines?

Since reiser is the first implementation of it, it gets to choose how
this works. Currently that's done by giving -o barrier=flush (=ordered
used to exist as well, it will probably return - right now we just
played with IDE).

> If I recall correctly from earlier patches, the barrier stuff is 1.
> command model (ATA vs.  SCSI) specific and 2. driver and hardware
> specific and 3. requires that the file system knows how to use this
> properly.

Yes.

> Given that file systems have certain write ordering requirements if they
> are to be recoverable after a crash, I suspect Linux has _not_ been able
> to guarantee on-disk consistency for any time for years, which means
> that a crash in the wrong moment can kill the file system itself if the
> drive has reordered writes - only ext3 without write cache seems to
> behave better in this respect (data=ordered).
> 
> I would like to have a document that shows which file system, which
> chipset driver for PATA, which chipset driver for ATA, which low-level
> SCSI host adaptor driver, which file system support write barrier. We
> will probably also need to check if intermediate layers such as md and
> dm-mod propagate such information.

Only PATA core needs to support it, not the chipset drivers. md and dm
aren't difficult to implement now that unplug/congestion already
iterates the device list and I added a blkdev_issue_flush() command.
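
(Editorial sketch, not part of the barrier patches: the userspace side of
what a "real" fsync() is meant to guarantee. The helper name
durable_write() is hypothetical. Whether the data has actually left the
drive's write cache when fsync() returns depends on the kernel, file
system and drive issuing and honouring a cache flush, which is the topic
of this thread; the code can only check fsync()'s return value.)

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to path and report success only
 * after fsync() has returned 0. Returns 0 on success, -1 on any error.
 */
int durable_write(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;

    /* write() may be short or interrupted, so loop until all is written. */
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Ask the kernel to push the file's data and metadata to the device. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }

    return close(fd) == 0 ? 0 : -1;
}
```

A database would call something like this at commit time and treat a
non-zero return as "not committed".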

> Given the necessary information, I can hack together a HTML document to
> provide this information; this offer has however not seen any response
> in the past. I am however not acquainted with the drivers and need
> information from the kernel hackers. Without such support, such a
> documentation effort is doomed.

Usual approach - just start writing, it's a lot easier to get
corrections (people seem to be several times more willing to point out
your errors than give you recommendations for something you haven't
started yet).

> BTW, I should very much like to be able to trace the low-level write
> information that goes out to the device, possibly including the payload
> - something like tcpdump for the ATA or SCSI commands that are sent to
> the driver. Is such a facility available?

No.

-- 
Jens Axboe


