linux-ext4.vger.kernel.org archive mirror
From: tytso@mit.edu
To: Ric Wheeler <rwheeler@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	Mingming Cao <cmm@us.ibm.com>,
	djwong@us.ibm.com, linux-ext4 <linux-ext4@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Keith Mannthey <kmannth@us.ibm.com>,
	Mingming Cao <mcao@us.ibm.com>
Subject: Re: [RFC] ext4: Don't send extra barrier during fsync if there are no dirty pages.
Date: Wed, 30 Jun 2010 09:44:29 -0400
Message-ID: <20100630134429.GE1333@thunk.org>
In-Reply-To: <4C2B44C0.3090002@redhat.com>

On Wed, Jun 30, 2010 at 09:21:04AM -0400, Ric Wheeler wrote:
> 
> The problem with not issuing a cache flush when you have dirty meta
> data or data is that it does not have any tie to the state of the
> volatile write cache of the target storage device.

We already track whether there are any metadata updates associated
with the inode; if there are, we force a journal commit, and this
implies a barrier operation.

The case we're talking about here is one where either (a) there is no
journal, or (b) there have been no metadata updates (I'm simplifying a
little here; in fact we track whether there have been fdatasync()- vs
fsync()- worthy metadata updates), and so there hasn't been a journal
commit to do the cache flush.

In this case, we want to track when the last fsync() was issued,
versus when data blocks for a particular inode were last pushed out
to disk.

To use an example I used as motivation for why we might want an
fsync2(int fd[], int flags[], int num) syscall, consider the situation
of:

	fsync(control_fd);
	fdatasync(data_fd);

The first fsync() will have executed a cache flush operation.  So when
we do the fdatasync() (assuming that no metadata needs to be flushed
out to disk), there is no need for the cache flush operation.
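
As a rough illustration of that bookkeeping, here is a minimal user-space
model. It is only a sketch and not ext4 code: every name in it
(dev_model, inode_model, model_data_written, model_sync) is hypothetical,
it assumes both inodes live on the same device, and it ignores the
ordering of a flush against writes that are still in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; none of these names come from ext4. */
struct dev_model {
	uint64_t flush_seq;      /* cache flushes completed so far */
};

struct inode_model {
	bool     wrote_data;     /* data blocks were pushed to the device */
	uint64_t written_at_seq; /* dev->flush_seq at the last such push */
};

/* Record that this inode's data blocks were just written to the device. */
static void model_data_written(const struct dev_model *dev,
			       struct inode_model *ino)
{
	assert(dev && ino);
	ino->wrote_data = true;
	ino->written_at_seq = dev->flush_seq;
}

/*
 * Model an fsync()/fdatasync() with no metadata to commit.
 * Returns true if a cache flush had to be issued, false if an
 * earlier flush already covered this inode's last data write.
 */
static bool model_sync(struct dev_model *dev, const struct inode_model *ino)
{
	assert(dev && ino);
	bool covered = !ino->wrote_data ||
		       dev->flush_seq > ino->written_at_seq;

	if (covered)
		return false;
	dev->flush_seq++;        /* stands in for the real cache flush */
	return true;
}
```

In this model, if both files were written before the first call, the sync
of control_fd issues the flush and the sync of data_fd finds that its last
write is already covered, so it issues none.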

If we had an enhanced fsync command, we would also be able to
eliminate a second journal commit in the case where data_fd also had
some metadata that needed to be flushed out to disk.

> It would definitely be *very* useful to have an array of fd's that
> all need fsync()'ed at home time....

Yes, but it would require applications to change their code.

One thing that I would like about a new fsync2() system call is that,
with a flags field, we could add some new, more expressive flags:

#define FSYNC_DATA      0x0001 /* Only flush metadata if needed to access data */
#define FSYNC_NOWAIT    0x0002 /* Initiate the flush operations but don't wait
				  for them to complete */
#define FSYNC_NOBARRIER 0x0004 /* FS may skip the barrier if not needed for fs
				  consistency */

etc.
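
To make the calling convention concrete, here is a user-space sketch of
what such a call might look like to an application. No fsync2() syscall
exists; fsync2_emulated() is a hypothetical name, and the loop below gets
none of the barrier or journal-commit savings a kernel implementation
could provide. Of the proposed flags, only FSYNC_DATA can be approximated
from user space (by calling fdatasync()), so this sketch rejects any other
flag bit with EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Value taken from the proposal above; it is in no kernel header. */
#define FSYNC_DATA 0x0001

/*
 * Hypothetical emulation of the proposed fsync2(fd[], flags[], num).
 * Returns 0 on success, or -1 with errno set on the first failure.
 */
static int fsync2_emulated(const int fd[], const int flags[], int num)
{
	if (num < 0) {
		errno = EINVAL;
		return -1;
	}
	assert(num == 0 || (fd && flags));

	/* Validate all flags first so nothing is synced on bad input. */
	for (int i = 0; i < num; i++) {
		if (flags[i] & ~FSYNC_DATA) {
			errno = EINVAL;
			return -1;
		}
	}

	for (int i = 0; i < num; i++) {
		int rc = (flags[i] & FSYNC_DATA) ? fdatasync(fd[i])
						 : fsync(fd[i]);
		if (rc != 0)
			return -1;	/* errno set by fsync()/fdatasync() */
	}
	return 0;
}
```

With this, the earlier two-call example would become a single call with
fd[] = { control_fd, data_fd } and flags[] = { 0, FSYNC_DATA }.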

					- Ted

