From: Jeff Garzik <jgarzik@pobox.com>
To: Greg Stark <gsstark@mit.edu>
Cc: Mike Fedyk <mfedyk@matchmail.com>,
Erik Steffl <steffl@bigfoot.com>,
linux-kernel@vger.kernel.org
Subject: Re: libata in 2.4.24?
Date: Tue, 2 Dec 2003 15:16:49 -0500
Message-ID: <20031202201649.GB17779@gtf.org>
In-Reply-To: <877k1f9e1g.fsf@stark.dyndns.tv>

On Tue, Dec 02, 2003 at 03:10:19PM -0500, Greg Stark wrote:
> Jeff Garzik <jgarzik@pobox.com> writes:
>
> > So, today, no acknowledgement occurs until the data _really_ is in the
> > drive's buffers.
>
> The drive's buffers isn't good enough. If power is lost the write will be lost
> and the database corrupt. It needs to be on the platters.

Certainly agreed.

> > > This doesn't happen with SCSI disks where multiple requests can be pending so
> > > there's no urgency to reporting a false success. The request doesn't complete
> > > until the write hits disk. As a result SCSI disks are reliable for database
> > > operation and IDE disks aren't unless write caching is disabled.
> >
> > This is not really true.
> >
> > Regardless of TCQ, if the OS driver has not issued a FLUSH CACHE (IDE)
> > or SYNCHRONIZE CACHE (SCSI), then the data is not guaranteed to be on
> > the disk media. Plain and simple.
>
> That doesn't agree with people's experience. People seem to find that SCSI
> drives never cache writes. This sort of makes sense since there's just not
> much reason to report a write success before the write can be performed.
> There's no performance advantage as long as more requests can be queued up.

Some IDE _and/or_ SCSI drives do not cache writes. For these drives,
your data still gets to the platter even in the _absence_ of an OS
flush-cache command.

The core problem, it sounds like, is that no flush-cache command is
being issued. The drive technology (wcache, or no) is largely irrelevant.

> > If fsync(2) returns without a flush-cache, then your data is not
> > guaranteed to be on the disk. And as you noted, flush-cache destroys
> > performance.
>
> It's my understanding that it doesn't. There was some discussion in the past

Eh? Flush-cache very definitely hurts performance, on both IDE and
SCSI, for drives that support write caching.
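
To make the application side of this concrete, here is a minimal sketch
of a write followed by fsync(2). The helper name durable_write() is
hypothetical and exists only for illustration. The point is that a
successful return tells the caller only what the kernel and driver
guarantee: if the driver does not issue FLUSH CACHE or SYNCHRONIZE CACHE
to a drive with write caching enabled, the data may still be sitting in
the drive's buffers when fsync() returns.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): write len bytes to path,
 * then ask the kernel to push them to stable storage.
 * Returns 0 on success, -1 on error with errno set.
 */
int durable_write(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, retry */
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        p += n;                         /* handle short writes */
        len -= (size_t)n;
    }

    /*
     * fsync() pushes the file's dirty pages down to the device.
     * Whether the data is on the media afterwards depends on the
     * driver also issuing a flush-cache command, which is exactly
     * the issue under discussion here.
     */
    if (fsync(fd) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    return close(fd);
}
```

A database would call something like this for its commit record and
treat a zero return as "committed", which is only safe if the flush
actually reaches the media.
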
> > There are three levels:
> >
> > a) Data is successfully transferred to the controller/drive queue (TCQ).
> > b) Data is successfully transferred to the drive's internal buffers.
> > c) The drive successfully transfers data to the media.
>
> Only the third is of interest to Postgres or other databases. In fact, I
Certainly.
> suspect only the third is of interest to other systems that are supposed to be
> reliable like MTAs etc. I think Wietse and others would be shocked if they
> were told fsync wasn't guaranteed to have waited until the writes had actually
> hit the media.

As well he should be :)

	Jeff