From: Jeff Garzik <jgarzik@pobox.com>
To: Greg Stark <gsstark@mit.edu>
Cc: Mike Fedyk <mfedyk@matchmail.com>,
Erik Steffl <steffl@bigfoot.com>,
linux-kernel@vger.kernel.org
Subject: Re: libata in 2.4.24?
Date: Tue, 2 Dec 2003 15:16:49 -0500
Message-ID: <20031202201649.GB17779@gtf.org>
In-Reply-To: <877k1f9e1g.fsf@stark.dyndns.tv>
On Tue, Dec 02, 2003 at 03:10:19PM -0500, Greg Stark wrote:
> Jeff Garzik <jgarzik@pobox.com> writes:
>
> > So, today, no acknowledgement occurs until the data _really_ is in the
> > drive's buffers.
>
> The drive's buffers isn't good enough. If power is lost the write will be lost
> and the database corrupt. It needs to be on the platters.
Certainly agreed.
> > > This doesn't happen with SCSI disks where multiple requests can be pending so
> > > there's no urgency to reporting a false success. The request doesn't complete
> > > until the write hits disk. As a result SCSI disks are reliable for database
> > > operation and IDE disks aren't unless write caching is disabled.
> >
> > This is not really true.
> >
> > Regardless of TCQ, if the OS driver has not issued a FLUSH CACHE (IDE)
> > or SYNCHRONIZE CACHE (SCSI), then the data is not guaranteed to be on
> > the disk media. Plain and simple.
>
> That doesn't agree with people's experience. People seem to find that SCSI
> drives never cache writes. This sort of makes sense since there's just not
> much reason to report a write success before the write can be performed.
> There's no performance advantage as long as more requests can be queued up.
Some IDE _and/or_ SCSI drives do not cache writes. For these drives,
your data still gets to the platter even in the _absence_ of an OS
flush-cache command.
The core problem, it sounds like, is that no flush-cache command is being issued.
The drive technology (wcache, or no) is largely irrelevant.
> > If fsync(2) returns without a flush-cache, then your data is not
> > guaranteed to be on the disk. And as you noted, flush-cache destroys
> > performance.
>
> It's my understanding that it doesn't. There was some discussion in the past
Eh? Flush-cache very definitely hurts performance, on both IDE and
SCSI, for drives that support write caching.
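
One way to see the cost for yourself is to time a run of small writes with and without fsync(2) after each one. The following is only a rough sketch, not a benchmark: time_writes() and its parameters are hypothetical names chosen for illustration, and whether fsync(2) actually results in a flush-cache command reaching the drive depends on the kernel, filesystem and driver in use, which is exactly the point under discussion.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `count` 4 KiB blocks to `path`, calling
 * fsync(2) after every block when `sync_each` is non-zero.
 * Returns the elapsed wall-clock time in seconds, or -1.0 on error.
 */
double time_writes(const char *path, int count, int sync_each)
{
    char block[4096];
    struct timespec t0, t1;
    int fd;

    assert(path != NULL && count > 0);
    memset(block, 'x', sizeof block);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < count; i++) {
        /* A short write is treated as an error to keep the sketch small. */
        if (write(fd, block, sizeof block) != (ssize_t)sizeof block ||
            (sync_each && fsync(fd) != 0)) {
            close(fd);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling time_writes(path, 1000, 0) and time_writes(path, 1000, 1) on the same disk and comparing the two results gives a first impression. If the synced run is barely slower on a drive with write caching enabled, that is a hint that nothing is being flushed to the media.
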
> > There are three levels:
> >
> > a) Data is successfully transferred to the controller/drive queue (TCQ).
> > b) Data is successfully transferred to the drive's internal buffers.
> > c) The drive successfully transfers data to the media.
>
> Only the third is of interest to Postgres or other databases. In fact, I
Certainly.
> suspect only the third is of interest to other systems that are supposed to be
> reliable like MTAs etc. I think Wietse and others would be shocked if they
> were told fsync wasn't guaranteed to have waited until the writes had actually
> hit the media.
As well he should be :)
Jeff
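
For reference, this is roughly what an application such as a database or MTA does when it relies on fsync(2). It is a minimal sketch under stated assumptions: durable_append() is a hypothetical helper, not code from any of the programs mentioned in this thread. The application can only check the return value of fsync(2); whether success corresponds to level (c) above, data on the media, is decided by the kernel and the drive, not by this code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: append `len` bytes from `buf` to `path` and do
 * not report success until fsync(2) has returned without error.
 * Returns 0 on success, -1 on failure with errno set.
 */
int durable_append(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int fd, saved;

    assert(path != NULL && (buf != NULL || len == 0));

    fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0)
        return -1;

    /* write(2) may transfer fewer bytes than requested; loop until done. */
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += n;
        len -= (size_t)n;
    }

    /* The only durability signal the application ever gets. */
    if (fsync(fd) != 0)
        goto fail;

    return close(fd) == 0 ? 0 : -1;

fail:
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}
```
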