From: Ric Wheeler <rwheeler@redhat.com>
To: "Darrick J. Wong" <djwong@us.ibm.com>
Cc: Jens Axboe <axboe@kernel.dk>, "Theodore Ts'o" <tytso@mit.edu>,
Neil Brown <neilb@suse.de>,
Andreas Dilger <adilger.kernel@dilger.ca>,
Alasdair G Kergon <agk@redhat.com>, Jan Kara <jack@suse.cz>,
Mike Snitzer <snitzer@redhat.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-raid@vger.kernel.org, Keith Mannthey <kmannth@us.ibm.com>,
dm-devel@redhat.com, Mingming Cao <cmm@us.ibm.com>,
Tejun Heo <tj@kernel.org>,
linux-ext4@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
Josef Bacik <josef@redhat.com>
Subject: Re: [PATCH v6 0/4] ext4: Coordinate data-only flush requests sent by fsync
Date: Mon, 29 Nov 2010 18:48:25 -0500
Message-ID: <4CF43BC9.8040603@redhat.com>
In-Reply-To: <20101129220536.12401.16581.stgit@elm3b57.beaverton.ibm.com>
On 11/29/2010 05:05 PM, Darrick J. Wong wrote:
> On certain types of hardware, issuing a write cache flush takes a considerable
> amount of time. Typically, these are simple storage systems with write cache
> enabled and no battery to save that cache after a power failure. When we
> encounter a system with many I/O threads that write data and then call fsync
> after more transactions accumulate, ext4_sync_file performs a data-only flush,
> the performance of which is suboptimal because each of those threads issues its
> own flush command to the drive instead of trying to coordinate the flush,
> thereby wasting execution time.
>
> Instead of each fsync call initiating its own flush, there's now a flag to
> indicate if (0) no flushes are ongoing, (1) we're delaying a short time to
> collect other fsync threads, or (2) we're actually in-progress on a flush.
>
> So, if someone calls ext4_sync_file and no flushes are in progress, the flag
> shifts from 0->1 and the thread delays for a short time to see if there are any
> other threads that are close behind in ext4_sync_file. After that wait, the
> state transitions to 2 and the flush is issued. Once that's done, the state
> goes back to 0 and a completion is signalled.
>
> Those close-behind threads see the flag is already 1, and go to sleep until the
> completion is signalled. Instead of issuing a flush themselves, they simply
> wait for that first thread to do it for them. If they see that the flag is 2,
> they wait for the current flush to finish, and start over.
>
> However, there are a couple of exceptions to this rule. First, there exist
> high-end storage arrays with battery-backed write caches for which flush
> commands take very little time (< 2ms); on these systems, performing the
> coordination actually lowers performance. Given the earlier patch to the block
> layer to report low-level device flush times, we can detect this situation and
> have all threads issue flushes without coordinating, as we did before. The
> second case is when there's a single thread issuing flushes, in which case it
> can skip the coordination.
>
> The author of this patch is aware that jbd2 has a similar flush coordination
> scheme for journal commits. An earlier version of this patch simply created a
> new empty journal transaction and committed it, but that approach was shown to
> increase the amount of write traffic heading towards the disk, which in turn
> lowered performance considerably, especially in the case where directio was in
> use. Therefore, this patch adds the coordination code directly to ext4.
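
The three-state scheme described in the quoted text can be modelled in user
space. The following is only a minimal sketch under stated assumptions, not the
patch's code: FlushCoordinator, run_demo, issue_flush, and delay_s are
hypothetical names, a condition variable plus a generation counter stands in
for the kernel completion, and the fast-device and single-thread exceptions are
left out.

```python
import threading
import time

# States of the shared flag, numbered as in the patch description.
IDLE, COLLECTING, FLUSHING = 0, 1, 2


class FlushCoordinator:
    """User-space model of the coordination; all names here are hypothetical."""

    def __init__(self, issue_flush, delay_s=0.001):
        self._issue_flush = issue_flush   # callable standing in for the device flush
        self._delay_s = delay_s           # assumed short collection delay
        self._cond = threading.Condition()
        self._state = IDLE
        self._generation = 0              # bumped each time a flush completes

    def _wait_for_completion(self):
        # Caller holds self._cond. Sleep until the current flush finishes.
        gen = self._generation
        while self._generation == gen:
            self._cond.wait()

    def sync(self):
        with self._cond:
            while True:
                if self._state == IDLE:
                    self._state = COLLECTING      # 0 -> 1: this thread leads
                    break
                if self._state == COLLECTING:
                    # The leader has not flushed yet, so its flush covers us.
                    self._wait_for_completion()
                    return
                # FLUSHING: the flush in progress may predate our writes,
                # so wait for it to finish and start over.
                self._wait_for_completion()
        time.sleep(self._delay_s)                 # let close-behind threads join
        with self._cond:
            self._state = FLUSHING                # 1 -> 2
        self._issue_flush()
        with self._cond:
            self._state = IDLE                    # 2 -> 0
            self._generation += 1
            self._cond.notify_all()               # signal the completion


def run_demo(n_threads=8, delay_s=0.2):
    """Start n_threads concurrent sync() calls; return how many flushes ran."""
    flushes = []
    coord = FlushCoordinator(lambda: flushes.append(time.monotonic()), delay_s)
    threads = [threading.Thread(target=coord.sync) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(flushes)
```

Calling run_demo() starts several threads that all call sync() at about the
same time; the return value is the number of flushes actually issued, which
should be well below the number of callers when they arrive within the
collection delay.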

Hi Darrick,

Just curious why we would need to have batching in both places? Doesn't your
patch set make the jbd2 transaction batching redundant?

I noticed that the patches have a default delay and a mount option to override
that default. The jbd2 code today tries to measure the average time needed in a
transaction and automatically tune itself. Can't we do something similar with
your patch set? (I hate to see yet another mount option added!)

Regards,

Ric
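
As an illustration of the self-tuning idea raised above, here is a minimal
sketch that keeps a weighted running average of observed flush times and uses
it both to decide whether to coordinate and to pick the collection delay. It is
not taken from jbd2 or from the patch set: FlushTimeEstimator and its
parameters are hypothetical, the 3:1 weighting and the delay cap are assumptions
chosen for illustration, and only the 2 ms fast-device cutoff and the
single-thread exception come from the quoted description.

```python
class FlushTimeEstimator:
    """Hypothetical helper: tracks average flush time and derives a delay."""

    def __init__(self, fast_threshold_ns=2_000_000, old_weight=3,
                 max_delay_ns=15_000_000):
        self.fast_threshold_ns = fast_threshold_ns  # 2 ms cutoff from the description
        self.old_weight = old_weight                # assumed weight of the old average
        self.max_delay_ns = max_delay_ns            # assumed cap on the delay
        self.avg_ns = 0

    def record(self, sample_ns):
        # The first sample seeds the average; later samples are blended in.
        if self.avg_ns == 0:
            self.avg_ns = sample_ns
        else:
            self.avg_ns = (sample_ns + self.avg_ns * self.old_weight) // (
                self.old_weight + 1)

    def should_coordinate(self, active_syncers):
        # Skip coordination for a lone thread or a device with fast flushes.
        return active_syncers > 1 and self.avg_ns >= self.fast_threshold_ns

    def batch_delay_ns(self):
        # Assumption: wait about one average flush time, up to the cap.
        return min(self.avg_ns, self.max_delay_ns)
```

With something along these lines the delay would follow the measured device
behaviour, so a mount option would only be needed as an override, if at all.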