From: Jamie Lokier <jamie@shareable.org>
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Ensuring data is written to disk
Date: Tue, 1 Aug 2006 11:17:44 +0100
Message-ID: <20060801101743.GA31760@mail.shareable.org>
In-Reply-To: <A69CFE5B2F49D91186C1000BCD9DBD03E2AB38@otausminexs.au.otis.com>

Armistead, Jason wrote:
> I've been following the thread about disk data consistency with some
> interest.  Given that many IDE disk drives may choose to hold data in their
> write buffers before actually writing it to disk, and given that the
> ordering of the writes may not be the same as the OS or application expects,
> the only obvious way I can see to overcome this, and ensure the data is
> truly written to the physical platters without disabling write caching is to
> overwhelm the disk drive with more data than can fit in its internal write
> buffer.
> 
> So, if you have an IDE disk with an 8Mb cache, guess what, send it an 8Mb
> chunk of random data to write out when you do an fsync().  Better still,
> locate this 8Mb as close to the middle of the travel of its heads, so that
> performance is not affected any more than necessary.  If the drive firmware
> uses a LILO or LRU policy to determine when to do its disk writes,
> overwhelming its buffers should ensure that the actual data you sent to it
> gets written out 

It doesn't work.

I thought that too, for a while, as a way to avoid sending cache flush
(ATA FLUSH CACHE) commands for fs journal ordering when there is a lot
of data being written anyway.

But there is no guarantee that the drive uses a LILO (that is, FIFO)
or LRU policy, and if the firmware is optimised for cache performance
over a range of benchmarks, it won't use those - there are better
strategies.

You could write 8MB to the drive, but it could easily pass through the
cache without evicting some of the other data you want written.
_Especially_ if the 8MB is written to an area in the middle of the
head sweep.
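
To make that concrete, here is a toy model, not a description of any
real firmware. It assumes a hypothetical drive that, when its cache is
full, writes back whichever cached block is closest to the current head
position. The function name, the block numbers and the capacity of 8
are all made up for illustration.

```python
def simulate(policy, capacity, writes, head=500):
    """Toy write-back cache. Returns the LBAs still cached (not on disk)."""
    cache = []  # dirty LBAs, oldest first
    for lba in writes:
        if len(cache) == capacity:
            if policy == "fifo":
                victim = cache[0]  # oldest block goes to disk first
            else:
                # "nearest": hypothetical shortest-seek-first write-back
                victim = min(cache, key=lambda b: abs(b - head))
            cache.remove(victim)
            head = victim  # the head ends up where it just wrote
        cache.append(lba)
    return set(cache)


IMPORTANT = 0                   # the block we actually care about
filler = list(range(500, 516))  # twice the cache size, all near mid-stroke
writes = [IMPORTANT] + filler

print(IMPORTANT in simulate("fifo", 8, writes))     # False
print(IMPORTANT in simulate("nearest", 8, writes))  # True
```

With FIFO the important block is pushed out as soon as the cache
fills. With the seek-based policy the filler keeps being the cheapest
thing to write, so the important block is still sitting in the cache
after twice the cache size has gone past it.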

> Of course, guessing the disk drive write buffer size and trying not to kill
> system I/O performance with all these writes is another question entirely
> ... sigh !!!

If you just want to evict all data from the drive's cache, and don't
actually have other data to write, there is a cache flush command
(FLUSH CACHE in the ATA command set) you can send to the drive, which
will be more dependable than writing as much data as the cache size.
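
From an application you don't normally send that command yourself; you
ask the OS with fsync(). A minimal sketch follows, with durable_write
being a name made up here. Note the assumption: whether fsync() also
makes the kernel issue a flush to the drive depends on the OS,
filesystem and mount options, so this only shows the application side.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data, then ask the OS to push it to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # Python's buffer -> kernel page cache
        os.fsync(f.fileno())  # kernel page cache -> device
        # Assumption: the kernel may or may not follow this with a
        # drive cache flush; that is outside the application's control.


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "journal.bin")
    durable_write(path, b"commit record")
    print(os.path.getsize(path))
```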

-- Jamie
