[Cluster-devel] Re: [GFS2] Don't flush everything on fdatasync [70/70]

cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed

From: Wendy Cheng <wcheng@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] Re: [GFS2] Don't flush everything on fdatasync [70/70]
Date: Thu, 07 Dec 2006 14:05:00 -0500	[thread overview]
Message-ID: <457865DC.3020608@redhat.com> (raw)
In-Reply-To: <1165482686.3752.816.camel@quoit.chygwyn.com>

Steven Whitehouse wrote:
> Hi,
>
> On Fri, 2006-12-01 at 11:09 -0800, Andrew Morton wrote:
>   
>>> I was taking my cue here from ext3 which does something similar. The
>>> filemap_fdatawrite() is done by the VFS before this is called with a
>>> filemap_fdatawait() afterwards. This was intended to flush the metadata
>>> via (eventually) ->write_inode() although I guess I should be calling
>>> write_inode_now() instead?
>>>       
>> oh I see, you're jsut trying to write the inode itself, not the pages.
>>
>> write_inode_now() will write the pages, which you seem to not want to do.
>> Whatever.  The APIs here are a bit awkward.
>>     
>
> I've added a comment to explain whats going on here, and also the
> following patch. I know it could be better, but its still an improvement
> on what was there before,
>
>
>   
Steve,

I'm in the middle of something else and don't have upstream kernel 
source handy at this moment. But I read akpm's comment as 
"write_inode_now" would do writepage and that is *not* what you want (?) 
(since vfs has done that before this call is invoked). I vaguely 
recalled I did try write_inode_now() on GFS1 once but had to replace it 
with "sync_inode" on RHEL4 (for the reason that I can't remember at this 
moment). I suggest you keep "sync_inode" (at least for a while until we 
can prove other call can do better). This "sync_inode" has been well 
tested out (with GFS1's fsync call).

There is another issue. It is a gray area. Note that you don't grab any 
glock here ... so if someone *has* written something in other nodes, 
this sync could miss it (?). This depends on how people expects a 
fsync/fdatasync should behave in a cluster filesystem. GFS1 asks for a 
shared lock here so it will force other node to flush the data (I 
personally think this is a more correct behavior). Your call though.

-- Wendy

> --------------------------------------------------------------------------------------------
>
> >From 34126f9f41901ca9d7d0031c2b11fc0d6c07b72d Mon Sep 17 00:00:00 2001
> From: Steven Whitehouse <swhiteho@redhat.com>
> Date: Thu, 7 Dec 2006 09:13:14 -0500
> Subject: [PATCH] [GFS2] Change gfs2_fsync() to use write_inode_now()
>
> This is a bit better than the previous version of gfs2_fsync()
> although it would be better still if we were able to call a
> function which only wrote the inode & metadata. Its no big deal
> though that this will potentially write the data as well since
> the VFS has already done that before calling gfs2_fsync(). I've
> also added a comment to explain whats going on here.
>
> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
> Cc: Andrew Morton <akpm@osdl.org>
> ---
>  fs/gfs2/ops_file.c |   11 ++++++-----
>  1 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/fs/gfs2/ops_file.c b/fs/gfs2/ops_file.c
> index 7bd971b..b3f1e03 100644
> --- a/fs/gfs2/ops_file.c
> +++ b/fs/gfs2/ops_file.c
> @@ -510,6 +510,11 @@ static int gfs2_close(struct inode *inod
>   * is set. For stuffed inodes we must flush the log in order to
>   * ensure that all data is on disk.
>   *
> + * The call to write_inode_now() is there to write back metadata and
> + * the inode itself. It does also try and write the data, but thats
> + * (hopefully) a no-op due to the VFS having already called filemap_fdatawrite()
> + * for us.
> + *
>   * Returns: errno
>   */
>  
> @@ -518,10 +523,6 @@ static int gfs2_fsync(struct file *file,
>  	struct inode *inode = dentry->d_inode;
>  	int sync_state = inode->i_state & (I_DIRTY_SYNC|I_DIRTY_DATASYNC);
>  	int ret = 0;
> -	struct writeback_control wbc = {
> -		.sync_mode = WB_SYNC_ALL,
> -		.nr_to_write = 0,
> -	};
>  
>  	if (gfs2_is_jdata(GFS2_I(inode))) {
>  		gfs2_log_flush(GFS2_SB(inode), GFS2_I(inode)->i_gl);
> @@ -530,7 +531,7 @@ static int gfs2_fsync(struct file *file,
>  
>  	if (sync_state != 0) {
>  		if (!datasync)
> -			ret = sync_inode(inode, &wbc);
> +			ret = write_inode_now(inode, 0);
>  
>  		if (gfs2_is_stuffed(GFS2_I(inode)))
>  			gfs2_log_flush(GFS2_SB(inode), GFS2_I(inode)->i_gl);
>

next prev parent reply	other threads:[~2006-12-07 19:05 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-11-30 12:24 [Cluster-devel] [GFS2] Don't flush everything on fdatasync [70/70] Steven Whitehouse
2006-12-01  7:01 ` [Cluster-devel] " Andrew Morton
2006-12-01 10:58   ` Steven Whitehouse
2006-12-01 19:09     ` Andrew Morton
2006-12-05 14:36       ` Steven Whitehouse
2006-12-07  9:11       ` Steven Whitehouse
2006-12-07 19:05         ` Wendy Cheng [this message]
2006-12-08 10:29           ` Steven Whitehouse
2006-12-08 10:29             ` Steven Whitehouse
2006-12-07 12:17       ` [Cluster-devel] [GFS2 & DLM] Pull request Steven Whitehouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=457865DC.3020608@redhat.com \
    --to=wcheng@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).