From: Jan Kara <jack@suse.cz>
To: "Darrick J. Wong" <djwong@us.ibm.com>
Cc: Jan Kara <jack@suse.cz>,
	tytso@mit.edu, Ric Wheeler <rwheeler@redhat.com>,
	Christoph Hellwig <hch@infradead.org>,
	Mingming Cao <cmm@us.ibm.com>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Keith Mannthey <kmannth@us.ibm.com>,
	Mingming Cao <mcao@us.ibm.com>
Subject: Re: [RFC] ext4: Don't send extra barrier during fsync if there are no dirty pages.
Date: Tue, 3 Aug 2010 15:21:54 +0200
Message-ID: <20100803132154.GG3322@quack.suse.cz>
In-Reply-To: <20100803000939.GA2109@tux1.beaverton.ibm.com>

On Mon 02-08-10 17:09:39, Darrick J. Wong wrote:
> On Wed, Jul 21, 2010 at 07:16:09PM +0200, Jan Kara wrote:
> >   Hi,
> > 
> > > On Wed, Jun 30, 2010 at 09:21:04AM -0400, Ric Wheeler wrote:
> > > > 
> > > > The problem with not issuing a cache flush when you have dirty meta
> > > > data or data is that it does not have any tie to the state of the
> > > > volatile write cache of the target storage device.
> > > 
> > > We already track whether there are any metadata updates associated
> > > with the inode; if there are, we force a journal commit, and this
> > > implies a barrier operation.
> > > 
> > > The case we're talking about here is one where either (a) there is no
> > > journal, or (b) there have been no metadata updates (I'm simplifying a
> > > little here; in fact we track whether there have been fdatasync()- vs
> > > fsync()- worthy metadata updates), and so there hasn't been a journal
> > > commit to do the cache flush.
> > > 
> > > In this case, we want to track when an fsync() was last issued
> > > versus when data blocks for a particular inode were last pushed
> > > out to disk.
> > > 
> > > To use an example I used as motivation for why we might want an
> > > fsync2(int fd[], int flags[], int num) syscall, consider the situation
> > > of:
> > > 
> > > 	fsync(control_fd);
> > > 	fdatasync(data_fd);
> > > 
> > > The first fsync() will have executed a cache flush operation.  So when
> > > we do the fdatasync() (assuming that no metadata needs to be flushed
> > > out to disk), there is no need for the cache flush operation.
> > > 
> > > If we had an enhanced fsync command, we would also be able to
> > > eliminate a second journal commit in the case where data_fd also had
> > > some metadata that needed to be flushed out to disk.
> >   The current implementation already avoids a journal commit because of
> > fdatasync(data_fd). We remember the ID of the transaction in which the
> > inode's metadata was last updated and do not force a transaction commit if
> > that transaction is already committed. Thus the first fsync might force a
> > transaction commit, but the second fdatasync likely won't.
> >   We could actually improve the scheme to work for data as well. I wrote
> > proof-of-concept patches (attached), and they nicely avoid a second barrier
> > when doing:
> > echo "aaa" >file1; echo "aaa" >file2; fsync file2; fsync file1
> > 
> >   Ted, would you be interested in something like this?
> 
> Well... on my fsync-happy workloads, this seems to cut the barrier count down
> by about 20%, and speeds the workloads up by about 20%.
  Nice, thanks for the measurement.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
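
The transaction-ID scheme described in the quoted text above can be
illustrated with a small userspace model. This is only a sketch under
stated assumptions: the struct and function names (journal_model,
inode_model, touch_metadata, model_fsync) are hypothetical and are not
ext4 or jbd2 identifiers, transaction-ID wraparound is ignored, and a
"commit" here is just a counter standing in for a journal commit and the
barrier it implies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a journal; not the kernel's data structures. */
struct journal_model {
	unsigned committed_tid;	/* newest transaction known to be committed */
	unsigned running_tid;	/* transaction currently accepting updates */
	unsigned commits;	/* forced commits so far (each implies a barrier) */
};

/* Hypothetical per-inode state: the transaction that last touched its metadata. */
struct inode_model {
	unsigned last_meta_tid;
};

/* Record that the inode's metadata was modified in the running transaction. */
static void touch_metadata(const struct journal_model *j, struct inode_model *ino)
{
	assert(j && ino);
	ino->last_meta_tid = j->running_tid;
}

/*
 * Model of fsync()/fdatasync(): force a commit only if the transaction
 * that last updated this inode's metadata has not been committed yet.
 * Returns true if a commit was forced.
 * Simplification: plain comparison, no handling of tid wraparound.
 */
static bool model_fsync(struct journal_model *j, const struct inode_model *ino)
{
	if (ino->last_meta_tid <= j->committed_tid)
		return false;		/* already on disk, nothing to force */

	j->committed_tid = j->running_tid;	/* commit the running transaction */
	j->running_tid++;			/* a new transaction starts */
	j->commits++;
	return true;
}
```

In this model, if two inodes have their metadata touched in the same
running transaction, the first model_fsync() call forces a commit and the
second returns false because that transaction is already committed. This
mirrors the fsync(control_fd); fdatasync(data_fd); case discussed above.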
