From: Wendy Cheng <wcheng@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] Re: [GFS2] Don't flush everything on fdatasync [70/70]
Date: Thu, 07 Dec 2006 14:05:00 -0500 [thread overview]
Message-ID: <457865DC.3020608@redhat.com> (raw)
In-Reply-To: <1165482686.3752.816.camel@quoit.chygwyn.com>
Steven Whitehouse wrote:
> Hi,
>
> On Fri, 2006-12-01 at 11:09 -0800, Andrew Morton wrote:
>
>>> I was taking my cue here from ext3 which does something similar. The
>>> filemap_fdatawrite() is done by the VFS before this is called with a
>>> filemap_fdatawait() afterwards. This was intended to flush the metadata
>>> via (eventually) ->write_inode() although I guess I should be calling
>>> write_inode_now() instead?
>>>
>> oh I see, you're jsut trying to write the inode itself, not the pages.
>>
>> write_inode_now() will write the pages, which you seem to not want to do.
>> Whatever. The APIs here are a bit awkward.
>>
>
> I've added a comment to explain whats going on here, and also the
> following patch. I know it could be better, but its still an improvement
> on what was there before,
>
>
>
Steve,
I'm in the middle of something else and don't have upstream kernel
source handy at this moment. But I read akpm's comment as
"write_inode_now" would do writepage and that is *not* what you want (?)
(since vfs has done that before this call is invoked). I vaguely
recalled I did try write_inode_now() on GFS1 once but had to replace it
with "sync_inode" on RHEL4 (for the reason that I can't remember at this
moment). I suggest you keep "sync_inode" (at least for a while until we
can prove other call can do better). This "sync_inode" has been well
tested out (with GFS1's fsync call).
There is another issue. It is a gray area. Note that you don't grab any
glock here ... so if someone *has* written something in other nodes,
this sync could miss it (?). This depends on how people expects a
fsync/fdatasync should behave in a cluster filesystem. GFS1 asks for a
shared lock here so it will force other node to flush the data (I
personally think this is a more correct behavior). Your call though.
-- Wendy
> --------------------------------------------------------------------------------------------
>
> >From 34126f9f41901ca9d7d0031c2b11fc0d6c07b72d Mon Sep 17 00:00:00 2001
> From: Steven Whitehouse <swhiteho@redhat.com>
> Date: Thu, 7 Dec 2006 09:13:14 -0500
> Subject: [PATCH] [GFS2] Change gfs2_fsync() to use write_inode_now()
>
> This is a bit better than the previous version of gfs2_fsync()
> although it would be better still if we were able to call a
> function which only wrote the inode & metadata. Its no big deal
> though that this will potentially write the data as well since
> the VFS has already done that before calling gfs2_fsync(). I've
> also added a comment to explain whats going on here.
>
> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
> Cc: Andrew Morton <akpm@osdl.org>
> ---
> fs/gfs2/ops_file.c | 11 ++++++-----
> 1 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/fs/gfs2/ops_file.c b/fs/gfs2/ops_file.c
> index 7bd971b..b3f1e03 100644
> --- a/fs/gfs2/ops_file.c
> +++ b/fs/gfs2/ops_file.c
> @@ -510,6 +510,11 @@ static int gfs2_close(struct inode *inod
> * is set. For stuffed inodes we must flush the log in order to
> * ensure that all data is on disk.
> *
> + * The call to write_inode_now() is there to write back metadata and
> + * the inode itself. It does also try and write the data, but thats
> + * (hopefully) a no-op due to the VFS having already called filemap_fdatawrite()
> + * for us.
> + *
> * Returns: errno
> */
>
> @@ -518,10 +523,6 @@ static int gfs2_fsync(struct file *file,
> struct inode *inode = dentry->d_inode;
> int sync_state = inode->i_state & (I_DIRTY_SYNC|I_DIRTY_DATASYNC);
> int ret = 0;
> - struct writeback_control wbc = {
> - .sync_mode = WB_SYNC_ALL,
> - .nr_to_write = 0,
> - };
>
> if (gfs2_is_jdata(GFS2_I(inode))) {
> gfs2_log_flush(GFS2_SB(inode), GFS2_I(inode)->i_gl);
> @@ -530,7 +531,7 @@ static int gfs2_fsync(struct file *file,
>
> if (sync_state != 0) {
> if (!datasync)
> - ret = sync_inode(inode, &wbc);
> + ret = write_inode_now(inode, 0);
>
> if (gfs2_is_stuffed(GFS2_I(inode)))
> gfs2_log_flush(GFS2_SB(inode), GFS2_I(inode)->i_gl);
>
WARNING: multiple messages have this Message-ID (diff)
From: Wendy Cheng <wcheng@redhat.com>
To: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Morton <akpm@osdl.org>,
cluster-devel@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [Cluster-devel] Re: [GFS2] Don't flush everything on fdatasync [70/70]
Date: Thu, 07 Dec 2006 14:05:00 -0500 [thread overview]
Message-ID: <457865DC.3020608@redhat.com> (raw)
In-Reply-To: <1165482686.3752.816.camel@quoit.chygwyn.com>
Steven Whitehouse wrote:
> Hi,
>
> On Fri, 2006-12-01 at 11:09 -0800, Andrew Morton wrote:
>
>>> I was taking my cue here from ext3 which does something similar. The
>>> filemap_fdatawrite() is done by the VFS before this is called with a
>>> filemap_fdatawait() afterwards. This was intended to flush the metadata
>>> via (eventually) ->write_inode() although I guess I should be calling
>>> write_inode_now() instead?
>>>
>> oh I see, you're jsut trying to write the inode itself, not the pages.
>>
>> write_inode_now() will write the pages, which you seem to not want to do.
>> Whatever. The APIs here are a bit awkward.
>>
>
> I've added a comment to explain whats going on here, and also the
> following patch. I know it could be better, but its still an improvement
> on what was there before,
>
>
>
Steve,
I'm in the middle of something else and don't have upstream kernel
source handy at this moment. But I read akpm's comment as
"write_inode_now" would do writepage and that is *not* what you want (?)
(since vfs has done that before this call is invoked). I vaguely
recalled I did try write_inode_now() on GFS1 once but had to replace it
with "sync_inode" on RHEL4 (for the reason that I can't remember at this
moment). I suggest you keep "sync_inode" (at least for a while until we
can prove other call can do better). This "sync_inode" has been well
tested out (with GFS1's fsync call).
There is another issue. It is a gray area. Note that you don't grab any
glock here ... so if someone *has* written something in other nodes,
this sync could miss it (?). This depends on how people expects a
fsync/fdatasync should behave in a cluster filesystem. GFS1 asks for a
shared lock here so it will force other node to flush the data (I
personally think this is a more correct behavior). Your call though.
-- Wendy
> --------------------------------------------------------------------------------------------
>
> >From 34126f9f41901ca9d7d0031c2b11fc0d6c07b72d Mon Sep 17 00:00:00 2001
> From: Steven Whitehouse <swhiteho@redhat.com>
> Date: Thu, 7 Dec 2006 09:13:14 -0500
> Subject: [PATCH] [GFS2] Change gfs2_fsync() to use write_inode_now()
>
> This is a bit better than the previous version of gfs2_fsync()
> although it would be better still if we were able to call a
> function which only wrote the inode & metadata. Its no big deal
> though that this will potentially write the data as well since
> the VFS has already done that before calling gfs2_fsync(). I've
> also added a comment to explain whats going on here.
>
> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
> Cc: Andrew Morton <akpm@osdl.org>
> ---
> fs/gfs2/ops_file.c | 11 ++++++-----
> 1 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/fs/gfs2/ops_file.c b/fs/gfs2/ops_file.c
> index 7bd971b..b3f1e03 100644
> --- a/fs/gfs2/ops_file.c
> +++ b/fs/gfs2/ops_file.c
> @@ -510,6 +510,11 @@ static int gfs2_close(struct inode *inod
> * is set. For stuffed inodes we must flush the log in order to
> * ensure that all data is on disk.
> *
> + * The call to write_inode_now() is there to write back metadata and
> + * the inode itself. It does also try and write the data, but thats
> + * (hopefully) a no-op due to the VFS having already called filemap_fdatawrite()
> + * for us.
> + *
> * Returns: errno
> */
>
> @@ -518,10 +523,6 @@ static int gfs2_fsync(struct file *file,
> struct inode *inode = dentry->d_inode;
> int sync_state = inode->i_state & (I_DIRTY_SYNC|I_DIRTY_DATASYNC);
> int ret = 0;
> - struct writeback_control wbc = {
> - .sync_mode = WB_SYNC_ALL,
> - .nr_to_write = 0,
> - };
>
> if (gfs2_is_jdata(GFS2_I(inode))) {
> gfs2_log_flush(GFS2_SB(inode), GFS2_I(inode)->i_gl);
> @@ -530,7 +531,7 @@ static int gfs2_fsync(struct file *file,
>
> if (sync_state != 0) {
> if (!datasync)
> - ret = sync_inode(inode, &wbc);
> + ret = write_inode_now(inode, 0);
>
> if (gfs2_is_stuffed(GFS2_I(inode)))
> gfs2_log_flush(GFS2_SB(inode), GFS2_I(inode)->i_gl);
>
next prev parent reply other threads:[~2006-12-07 19:05 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-11-30 12:24 [Cluster-devel] [GFS2] Don't flush everything on fdatasync [70/70] Steven Whitehouse
2006-11-30 12:24 ` Steven Whitehouse
2006-12-01 7:01 ` [Cluster-devel] " Andrew Morton
2006-12-01 7:01 ` Andrew Morton
2006-12-01 10:58 ` [Cluster-devel] " Steven Whitehouse
2006-12-01 10:58 ` Steven Whitehouse
2006-12-01 19:09 ` [Cluster-devel] " Andrew Morton
2006-12-01 19:09 ` Andrew Morton
2006-12-05 14:36 ` [Cluster-devel] " Steven Whitehouse
2006-12-05 14:36 ` Steven Whitehouse
2006-12-07 9:11 ` [Cluster-devel] " Steven Whitehouse
2006-12-07 9:11 ` Steven Whitehouse
2006-12-07 19:05 ` Wendy Cheng [this message]
2006-12-07 19:05 ` [Cluster-devel] " Wendy Cheng
2006-12-08 10:29 ` Steven Whitehouse
2006-12-08 10:29 ` Steven Whitehouse
2006-12-08 10:29 ` Steven Whitehouse
2006-12-07 12:17 ` [Cluster-devel] [GFS2 & DLM] Pull request Steven Whitehouse
2006-12-07 12:17 ` Steven Whitehouse
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=457865DC.3020608@redhat.com \
--to=wcheng@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.