* [PATCH] pnfs: call layoutcommit after flushing inode's data to disk.
@ 2010-05-20 3:28 Tao Guo
2010-05-20 4:42 ` Tao Guo
0 siblings, 1 reply; 6+ messages in thread
From: Tao Guo @ 2010-05-20 3:28 UTC (permalink / raw)
To: linux-nfs; +Cc: Benny Halevy, J. Bruce Fields
Signed-off-by: Tao Guo <guotao-U4AKAne5IzAR5TUyvShJeg@public.gmane.org>
---
fs/nfs/write.c | 8 +++++++-
1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index d3e1645..b4f48b2 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1558,6 +1558,7 @@ int nfs_write_inode(struct inode *inode, struct writeback_control *wbc)
*/
int nfs_wb_all(struct inode *inode)
{
+ int ret;
struct writeback_control wbc = {
.sync_mode = WB_SYNC_ALL,
.nr_to_write = LONG_MAX,
@@ -1565,7 +1566,12 @@ int nfs_wb_all(struct inode *inode)
.range_end = LLONG_MAX,
};
- return sync_inode(inode, &wbc);
+ ret = sync_inode(inode, &wbc);
+#ifdef CONFIG_NFS_V4_1
+ if (!ret && NFS_I(inode)->layoutcommit_ctx)
+ ret = pnfs_layoutcommit_inode(inode, 1);
+#endif
+ return ret;
}
int nfs_wb_page_cancel(struct inode *inode, struct page *page)
--
1.6.3.3
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] pnfs: call layoutcommit after flushing inode's data to disk.
2010-05-20 3:28 Tao Guo
@ 2010-05-20 4:42 ` Tao Guo
[not found] ` <AANLkTikc6JXz1ubKJHxs0UJq-x80Mafsz4SKgkBPZgnf-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Tao Guo @ 2010-05-20 4:42 UTC (permalink / raw)
To: linux-nfs; +Cc: Benny Halevy, J. Bruce Fields
This is for a bug introduced to 2.6.34. In 2.6.32 and 2.6.33 we call
layoutcommit in
nfs_sync_mapping_wait(), but in 2.6.34 we use sync_inode() to sync
inode's data, so
the layoutcommit code is gone.
BTW: In current code, layoutcommit_ctx will increase refcount of
nfs_inode's ctx, so if
layoutcommit_ctx is not NULL, we could not reach nfs4_close_context
=2E.. --> __nfs_close().
So pnfs_layoutcommit_inode() in __nfs_close() will not be called in
whatever situation.
Why we have to use nfs_inode's ctx as layoutcommit_ctx, since we only
need its rpc_creds
actually?
On Thu, May 20, 2010 at 11:28 AM, Tao Guo <guotao-U4AKAne5IzAR5TUyvShJeg@public.gmane.org> wrote:
> Signed-off-by: Tao Guo <guotao-U4AKAne5IzAR5TUyvShJeg@public.gmane.org>
> ---
> =C2=A0fs/nfs/write.c | =C2=A0 =C2=A08 +++++++-
> =C2=A01 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index d3e1645..b4f48b2 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -1558,6 +1558,7 @@ int nfs_write_inode(struct inode *inode, struct=
writeback_control *wbc)
> =C2=A0*/
> =C2=A0int nfs_wb_all(struct inode *inode)
> =C2=A0{
> + =C2=A0 =C2=A0 =C2=A0 int ret;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0struct writeback_control wbc =3D {
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0.sync_mode =3D=
WB_SYNC_ALL,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0.nr_to_write =3D=
LONG_MAX,
> @@ -1565,7 +1566,12 @@ int nfs_wb_all(struct inode *inode)
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0.range_end =3D=
LLONG_MAX,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0};
>
> - =C2=A0 =C2=A0 =C2=A0 return sync_inode(inode, &wbc);
> + =C2=A0 =C2=A0 =C2=A0 ret =3D sync_inode(inode, &wbc);
> +#ifdef CONFIG_NFS_V4_1
> + =C2=A0 =C2=A0 =C2=A0 if (!ret && NFS_I(inode)->layoutcommit_ctx)
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ret =3D pnfs_layou=
tcommit_inode(inode, 1);
> +#endif
> + =C2=A0 =C2=A0 =C2=A0 return ret;
> =C2=A0}
>
> =C2=A0int nfs_wb_page_cancel(struct inode *inode, struct page *page)
> --
> 1.6.3.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" =
in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at =C2=A0http://vger.kernel.org/majordomo-info.ht=
ml
>
--=20
tao.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] pnfs: call layoutcommit after flushing inode's data to disk.
[not found] ` <AANLkTikc6JXz1ubKJHxs0UJq-x80Mafsz4SKgkBPZgnf-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-05-20 14:17 ` Andy Adamson
0 siblings, 0 replies; 6+ messages in thread
From: Andy Adamson @ 2010-05-20 14:17 UTC (permalink / raw)
To: Tao Guo; +Cc: linux-nfs, Benny Halevy, J. Bruce Fields
On May 20, 2010, at 12:42 AM, Tao Guo wrote:
> This is for a bug introduced to 2.6.34. In 2.6.32 and 2.6.33 we call
> layoutcommit in
> nfs_sync_mapping_wait(), but in 2.6.34 we use sync_inode() to sync
> inode's data, so
> the layoutcommit code is gone.
>
> BTW: In current code, layoutcommit_ctx will increase refcount of
> nfs_inode's ctx, so if
> layoutcommit_ctx is not NULL, we could not reach nfs4_close_context
> ... --> __nfs_close().
> So pnfs_layoutcommit_inode() in __nfs_close() will not be called in
> whatever situation.
> Why we have to use nfs_inode's ctx as layoutcommit_ctx, since we only
> need its rpc_creds
> actually?
I have a patch to be submitted today that replaces the
nfs_open_context with an rpc_cred.
-->Andy
>
> On Thu, May 20, 2010 at 11:28 AM, Tao Guo <guotao-U4AKAne5IzAR5TUyvShJeg@public.gmane.org> wrote:
>> Signed-off-by: Tao Guo <guotao-U4AKAne5IzAR5TUyvShJeg@public.gmane.org>
>> ---
>> fs/nfs/write.c | 8 +++++++-
>> 1 files changed, 7 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
>> index d3e1645..b4f48b2 100644
>> --- a/fs/nfs/write.c
>> +++ b/fs/nfs/write.c
>> @@ -1558,6 +1558,7 @@ int nfs_write_inode(struct inode *inode,
>> struct writeback_control *wbc)
>> */
>> int nfs_wb_all(struct inode *inode)
>> {
>> + int ret;
>> struct writeback_control wbc = {
>> .sync_mode = WB_SYNC_ALL,
>> .nr_to_write = LONG_MAX,
>> @@ -1565,7 +1566,12 @@ int nfs_wb_all(struct inode *inode)
>> .range_end = LLONG_MAX,
>> };
>>
>> - return sync_inode(inode, &wbc);
>> + ret = sync_inode(inode, &wbc);
>> +#ifdef CONFIG_NFS_V4_1
>> + if (!ret && NFS_I(inode)->layoutcommit_ctx)
>> + ret = pnfs_layoutcommit_inode(inode, 1);
>> +#endif
>> + return ret;
>> }
>>
>> int nfs_wb_page_cancel(struct inode *inode, struct page *page)
>> --
>> 1.6.3.3
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-
>> nfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
>
>
> --
> tao.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs"
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] pnfs: call layoutcommit after flushing inode's data to disk.
@ 2010-05-24 19:04 Boaz Harrosh
2010-05-24 19:06 ` Boaz Harrosh
0 siblings, 1 reply; 6+ messages in thread
From: Boaz Harrosh @ 2010-05-24 19:04 UTC (permalink / raw)
To: Tao Guo, Andy Adamson, Benny Halevy, NFS list
From: Tao Guo <guotao@nrchpc.ac.cn>
This is for a bug introduced to 2.6.34. In 2.6.32 and 2.6.33 we call
layoutcommit in nfs_sync_mapping_wait(), but in 2.6.34 we use
sync_inode() to sync inode's data, so the layoutcommit code is gone.
BTW: In current code, layoutcommit_ctx will increase refcount of
nfs_inode's ctx, so if layoutcommit_ctx is not NULL, we could not
reach nfs4_close_context ... --> __nfs_close().
So pnfs_layoutcommit_inode() in __nfs_close() will not be called in
whatever situation.
Why we have to use nfs_inode's ctx as layoutcommit_ctx, since we
only need its rpc_creds actually?
Boaz:
Without this patch none-files layouts which return NFS_FILE_SYNC
from writes, will eventually hang on uncommitted inodes.
This patch reveals a bad design pattern from the pnfs-client. On
none-files layouts nfs_post_op_update_inode_force_wcc or friends
are never called and i_size_write() is never preformed on client.
What happens is that the layoutcommit eventually updates the server.
In Linux the sever would update i_size_attr and mtime. It would then
be detected by the client and i_size is fetched from server. This is
not nice client behaviour, and is not done so for files and none-pnfs
IO. Sigh!
I think also files-lo might suffer from this in some cases, which means
if commit is not preformed (It's just an hint) then writeout will have
i_size corruption.
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
---
fs/nfs/write.c | 8 +++++++-
1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index dad8da3..9424203 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1558,6 +1558,7 @@ int nfs_write_inode(struct inode *inode, struct writeback_control *wbc)
*/
int nfs_wb_all(struct inode *inode)
{
+ int ret;
struct writeback_control wbc = {
.sync_mode = WB_SYNC_ALL,
.nr_to_write = LONG_MAX,
@@ -1565,7 +1566,12 @@ int nfs_wb_all(struct inode *inode)
.range_end = LLONG_MAX,
};
- return sync_inode(inode, &wbc);
+ ret = sync_inode(inode, &wbc);
+#ifdef CONFIG_NFS_V4_1
+ if (!ret && do_layoutcommit(NFS_I(inode)))
+ ret = pnfs_layoutcommit_inode(inode, 1);
+#endif
+ return ret;
}
int nfs_wb_page_cancel(struct inode *inode, struct page *page)
--
1.6.6.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] pnfs: call layoutcommit after flushing inode's data to disk.
2010-05-24 19:04 [PATCH] pnfs: call layoutcommit after flushing inode's data to disk Boaz Harrosh
@ 2010-05-24 19:06 ` Boaz Harrosh
2010-05-25 6:55 ` Benny Halevy
0 siblings, 1 reply; 6+ messages in thread
From: Boaz Harrosh @ 2010-05-24 19:06 UTC (permalink / raw)
To: Tao Guo, Andy Adamson, Benny Halevy, NFS list
On 05/24/2010 10:04 PM, Boaz Harrosh wrote:
> From: Tao Guo <guotao-U4AKAne5IzAR5TUyvShJeg@public.gmane.org>
>
> This is for a bug introduced to 2.6.34. In 2.6.32 and 2.6.33 we call
> layoutcommit in nfs_sync_mapping_wait(), but in 2.6.34 we use
> sync_inode() to sync inode's data, so the layoutcommit code is gone.
>
> BTW: In current code, layoutcommit_ctx will increase refcount of
> nfs_inode's ctx, so if layoutcommit_ctx is not NULL, we could not
> reach nfs4_close_context ... --> __nfs_close().
> So pnfs_layoutcommit_inode() in __nfs_close() will not be called in
> whatever situation.
> Why we have to use nfs_inode's ctx as layoutcommit_ctx, since we
> only need its rpc_creds actually?
>
> Boaz:
> Without this patch none-files layouts which return NFS_FILE_SYNC
> from writes, will eventually hang on uncommitted inodes.
>
> This patch reveals a bad design pattern from the pnfs-client. On
> none-files layouts nfs_post_op_update_inode_force_wcc or friends
> are never called and i_size_write() is never preformed on client.
> What happens is that the layoutcommit eventually updates the server.
> In Linux the sever would update i_size_attr and mtime. It would then
> be detected by the client and i_size is fetched from server. This is
> not nice client behaviour, and is not done so for files and none-pnfs
> IO. Sigh!
> I think also files-lo might suffer from this in some cases, which means
> if commit is not preformed (It's just an hint) then writeout will have
commit => layout_commit, oops
> i_size corruption.
>
> Signed-off-by: Tao Guo <guotao-U4AKAne5IzAR5TUyvShJeg@public.gmane.org>
> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
> ---
> fs/nfs/write.c | 8 +++++++-
> 1 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index dad8da3..9424203 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -1558,6 +1558,7 @@ int nfs_write_inode(struct inode *inode, struct writeback_control *wbc)
> */
> int nfs_wb_all(struct inode *inode)
> {
> + int ret;
> struct writeback_control wbc = {
> .sync_mode = WB_SYNC_ALL,
> .nr_to_write = LONG_MAX,
> @@ -1565,7 +1566,12 @@ int nfs_wb_all(struct inode *inode)
> .range_end = LLONG_MAX,
> };
>
> - return sync_inode(inode, &wbc);
> + ret = sync_inode(inode, &wbc);
> +#ifdef CONFIG_NFS_V4_1
> + if (!ret && do_layoutcommit(NFS_I(inode)))
> + ret = pnfs_layoutcommit_inode(inode, 1);
> +#endif
> + return ret;
> }
>
> int nfs_wb_page_cancel(struct inode *inode, struct page *page)
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] pnfs: call layoutcommit after flushing inode's data to disk.
2010-05-24 19:06 ` Boaz Harrosh
@ 2010-05-25 6:55 ` Benny Halevy
0 siblings, 0 replies; 6+ messages in thread
From: Benny Halevy @ 2010-05-25 6:55 UTC (permalink / raw)
To: Boaz Harrosh, Tao Guo; +Cc: Andy Adamson, NFS list
On May. 24, 2010, 22:06 +0300, Boaz Harrosh <bharrosh@panasas.com> wrote:
> On 05/24/2010 10:04 PM, Boaz Harrosh wrote:
>> From: Tao Guo <guotao-U4AKAne5IzAR5TUyvShJeg@public.gmane.org>
>>
>> This is for a bug introduced to 2.6.34. In 2.6.32 and 2.6.33 we call
>> layoutcommit in nfs_sync_mapping_wait(), but in 2.6.34 we use
>> sync_inode() to sync inode's data, so the layoutcommit code is gone.
>>
>> BTW: In current code, layoutcommit_ctx will increase refcount of
>> nfs_inode's ctx, so if layoutcommit_ctx is not NULL, we could not
>> reach nfs4_close_context ... --> __nfs_close().
>> So pnfs_layoutcommit_inode() in __nfs_close() will not be called in
>> whatever situation.
>> Why we have to use nfs_inode's ctx as layoutcommit_ctx, since we
>> only need its rpc_creds actually?
>>
>> Boaz:
>> Without this patch none-files layouts which return NFS_FILE_SYNC
>> from writes, will eventually hang on uncommitted inodes.
>>
>> This patch reveals a bad design pattern from the pnfs-client. On
>> none-files layouts nfs_post_op_update_inode_force_wcc or friends
>> are never called and i_size_write() is never preformed on client.
>> What happens is that the layoutcommit eventually updates the server.
>> In Linux the sever would update i_size_attr and mtime. It would then
>> be detected by the client and i_size is fetched from server. This is
>> not nice client behaviour, and is not done so for files and none-pnfs
>> IO. Sigh!
>> I think also files-lo might suffer from this in some cases, which means
>> if commit is not preformed (It's just an hint) then writeout will have
>
> commit => layout_commit, oops
>
>> i_size corruption.
>>
>> Signed-off-by: Tao Guo <guotao-U4AKAne5IzAR5TUyvShJeg@public.gmane.org>
>> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Committed at pnfs-all-2.6.34-2010-05-25
Thanks!
Benny
>> ---
>> fs/nfs/write.c | 8 +++++++-
>> 1 files changed, 7 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
>> index dad8da3..9424203 100644
>> --- a/fs/nfs/write.c
>> +++ b/fs/nfs/write.c
>> @@ -1558,6 +1558,7 @@ int nfs_write_inode(struct inode *inode, struct writeback_control *wbc)
>> */
>> int nfs_wb_all(struct inode *inode)
>> {
>> + int ret;
>> struct writeback_control wbc = {
>> .sync_mode = WB_SYNC_ALL,
>> .nr_to_write = LONG_MAX,
>> @@ -1565,7 +1566,12 @@ int nfs_wb_all(struct inode *inode)
>> .range_end = LLONG_MAX,
>> };
>>
>> - return sync_inode(inode, &wbc);
>> + ret = sync_inode(inode, &wbc);
>> +#ifdef CONFIG_NFS_V4_1
>> + if (!ret && do_layoutcommit(NFS_I(inode)))
>> + ret = pnfs_layoutcommit_inode(inode, 1);
>> +#endif
>> + return ret;
>> }
>>
>> int nfs_wb_page_cancel(struct inode *inode, struct page *page)
>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2010-05-25 6:55 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-05-24 19:04 [PATCH] pnfs: call layoutcommit after flushing inode's data to disk Boaz Harrosh
2010-05-24 19:06 ` Boaz Harrosh
2010-05-25 6:55 ` Benny Halevy
-- strict thread matches above, loose matches on Subject: below --
2010-05-20 3:28 Tao Guo
2010-05-20 4:42 ` Tao Guo
[not found] ` <AANLkTikc6JXz1ubKJHxs0UJq-x80Mafsz4SKgkBPZgnf-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-05-20 14:17 ` Andy Adamson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).