public inbox for linux-kernel@vger.kernel.org
From: Chris Mason <mason@suse.com>
To: Daniel Phillips <phillips@bonn-fries.net>,
	James Bottomley <James.Bottomley@SteelEye.com>,
	"Stephen C. Tweedie" <sct@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: [PATCH] 2.4.x write barriers (updated for ext3)
Date: Sun, 03 Mar 2002 22:34:07 -0500
Message-ID: <757370000.1015212846@tiny>
In-Reply-To: <E16heCm-0000Q5-00@starship.berlin>
In-Reply-To: <200202281536.g1SFaqF02079@localhost.localdomain> <E16heCm-0000Q5-00@starship.berlin>



On Sunday, March 03, 2002 11:11:44 PM +0100 Daniel Phillips <phillips@bonn-fries.net> wrote:

> I have a standing offer from at least one engineer to make firmware changes 
> to the drives if it makes Linux work better.  So a reasonable plan is: first 
> know what's ideal, second ask for it.  Coupled with that, we'd need a way of 
> identifying drives that don't work in the ideal way, and require a fallback.
> 
> In my opinion, the only correct behavior is a write barrier that completes
> when data is on the platter, and that does this even when write-back is
> enabled.  

With a battery backup, we want the RAID controller (or whatever) to 
pretend the barrier is done right away.  That should be just as safe, 
and it lets the target merge the writes.

> Surely this is not rocket science at the disk firmware level.  Is
> this or is this not the way ordered tags were supposed to work?

There are many issues at play in this thread.  Here's an attempt at
a summary (please correct any mistakes).

1) The drivers would need to be changed to properly keep tag ordering 
in place across resets and error conditions.

2) Ordered tags force ordering of all writes the drive is processing.
For some workloads, the drive will be forced to order writes the
journal code doesn't care about at all, perhaps leading to lower
performance than the simple wait_on_buffer() we're using now.

2a) Are the filesystems asking for something impossible?  Can drives
really write blocks N and N+1, making sure to commit N to media before
N+1 (including an abort on N+1 if N fails), while still keeping up a
nice seek-free stream of writes?

3) Some drives may not be very smart about ordered tags.  We need
to figure out which is faster: using the ordered tag or using a
simple cache flush (when writeback is on).  The good news about
the cache flush is that it doesn't require major surgery in the
SCSI error handlers.

4) If some SCSI drives come with writeback on by default, do they also
turn it on under high load like IDE drives do?
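
To make the cost described in (2) concrete, here is a toy userspace
model, not kernel or driver code.  The names (tag_type, request,
completion_order_ok) are made up for illustration, and the rule it
encodes is an assumption about how ordered tags are meant to behave:
an ordered request must reach the media after everything submitted
before it and before everything submitted after it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tag types, loosely modeled on simple/ordered queue tags. */
enum tag_type { TAG_SIMPLE, TAG_ORDERED };

struct request {
    enum tag_type tag;   /* how the request was tagged at submission */
};

/*
 * reqs[i] is the i-th request submitted.  done_pos[i] is the position at
 * which request i reached the media (0 = first).  Returns true if that
 * completion order respects every ordered tag in the queue.
 */
bool completion_order_ok(const struct request *reqs,
                         const size_t *done_pos, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (reqs[i].tag != TAG_ORDERED)
            continue;
        for (size_t j = 0; j < n; j++) {
            if (j < i && done_pos[j] > done_pos[i])
                return false;   /* earlier request landed after the barrier */
            if (j > i && done_pos[j] < done_pos[i])
                return false;   /* later request landed before the barrier */
        }
    }
    return true;
}
```

Note that the check applies to every request in the queue, including
ones the journal code never cared about.  Simple-tagged requests on the
same side of the barrier can still be reordered among themselves.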

> 
>> Clearly, there would also have to be a mechanism to flush the cache on 
>> unmount, so if this were done by ioctl, would you prefer that the filesystem 
>> be in charge of flushing the cache on barrier writes, or would you like the sd 
>> device to do it transparently?
> 
> The filesystem should just say 'this request is a write barrier' and the 
> lower layers, whether that's scsi or bio, should do what's necessary to make
> it come true.

That's the goal.  The current 2.4 patch differentiates between ordered
barriers and flush barriers just so I can make the flush the default
on IDE, and enable the ordered stuff when I want to experiment on SCSI.
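
Roughly, the split could be sketched like this.  It is a standalone
illustration, not code from the patch: bus_type, barrier_mode and
pick_barrier_mode are hypothetical names, and falling back to a flush
on SCSI when the experimental switch is off is an assumption.

```c
#include <stdbool.h>

/* Hypothetical names; the actual 2.4 patch may use different ones. */
enum bus_type { BUS_IDE, BUS_SCSI };
enum barrier_mode { BARRIER_FLUSH, BARRIER_ORDERED };

/*
 * Pick how a barrier write is carried out for a device.
 * IDE always gets a cache flush.  SCSI gets ordered tags only when the
 * experimental switch is on; otherwise (assumption) it also flushes.
 */
enum barrier_mode pick_barrier_mode(enum bus_type bus,
                                    bool scsi_ordered_enabled)
{
    if (bus == BUS_SCSI && scsi_ordered_enabled)
        return BARRIER_ORDERED;
    return BARRIER_FLUSH;
}
```

The filesystem never sees this choice; it only marks the request as a
barrier and the lower layer decides between the two modes.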

-chris

