All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] NFS - Fix for Infinite loop during syncing
@ 2004-12-13 20:23 Steve Dickson
  2004-12-13 20:45 ` Trond Myklebust
  2005-01-31 17:44 ` Steve Dickson
  0 siblings, 2 replies; 5+ messages in thread
From: Steve Dickson @ 2004-12-13 20:23 UTC (permalink / raw)
  To: nfs

[-- Attachment #1: Type: text/plain, Size: 752 bytes --]


It was brought to my attention that following series of events
would cause an infinite loop in the 2.4 nfs kernels.

1) Mount the fileystem with acregmin=1,acregmax=1 from two clients.
2) On client 1, create a process that continuously writes to a file.
3) On client 2, remove that file that is being written
4) On client 1, interrupted out of the writing process (which is failing
     with ESTALEs) and type sync

The sync process loops in wait_on_locked(), when called from
sync_inodes_sb(), since the "broken" inode can not be cleared
from the locked inode list.

This patch sets the NFS_INO_STALE bit in write path (via
nfs_writeback_done) which breaks the inode is early enough to
stop it from being added to the that list.

Comments?

steved.


[-- Attachment #2: linux-2.4.28-nfs-syncloop.patch --]
[-- Type: text/plain, Size: 472 bytes --]

--- linux-2.4.28/fs/nfs/write.c.orig	2004-04-14 09:05:40.000000000 -0400
+++ linux-2.4.28/fs/nfs/write.c	2004-12-13 14:23:55.000000000 -0500
@@ -1073,6 +1073,9 @@ nfs_writeback_done(struct rpc_task *task
 			SetPageError(page);
 			if (req->wb_file)
 				req->wb_file->f_error = task->tk_status;
+			if (task->tk_status == -ESTALE)
+				NFS_FLAGS(inode) |= NFS_INO_STALE;
+
 			nfs_inode_remove_request(req);
 			dprintk(", error = %d\n", task->tk_status);
 			goto next;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] NFS - Fix for Infinite loop during syncing
  2004-12-13 20:23 [PATCH] NFS - Fix for Infinite loop during syncing Steve Dickson
@ 2004-12-13 20:45 ` Trond Myklebust
  2004-12-13 21:20   ` Steve Dickson
  2005-01-31 17:44 ` Steve Dickson
  1 sibling, 1 reply; 5+ messages in thread
From: Trond Myklebust @ 2004-12-13 20:45 UTC (permalink / raw)
  To: Steve Dickson; +Cc: nfs

m=E5 den 13.12.2004 Klokka 15:23 (-0500) skreiv Steve Dickson:
> It was brought to my attention that following series of events
> would cause an infinite loop in the 2.4 nfs kernels.
>=20
> 1) Mount the fileystem with acregmin=3D1,acregmax=3D1 from two clients.
> 2) On client 1, create a process that continuously writes to a file.
> 3) On client 2, remove that file that is being written
> 4) On client 1, interrupted out of the writing process (which is failing
>      with ESTALEs) and type sync
>=20
> The sync process loops in wait_on_locked(), when called from
> sync_inodes_sb(), since the "broken" inode can not be cleared
> from the locked inode list.

Isn't this pretty much a generic problem of mmapped files? AFAICS, sync
can loop here no matter what filesystem one is using.

If you use ordinary writes, then all is well, since the
nfs_revalidate_inode() in nfs_file_write() should set the NFS_INO_STALE
flag for you.

Cheers
  Trond

--=20
Trond Myklebust <trond.myklebust@fys.uio.no>



-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now. 
http://productguide.itmanagersjournal.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] NFS - Fix for Infinite loop during syncing
  2004-12-13 20:45 ` Trond Myklebust
@ 2004-12-13 21:20   ` Steve Dickson
  2004-12-14  0:14     ` Steve Dickson
  0 siblings, 1 reply; 5+ messages in thread
From: Steve Dickson @ 2004-12-13 21:20 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: nfs

Trond Myklebust wrote:

>m=E5 den 13.12.2004 Klokka 15:23 (-0500) skreiv Steve Dickson:
> =20
>
>>It was brought to my attention that following series of events
>>would cause an infinite loop in the 2.4 nfs kernels.
>>
>>1) Mount the fileystem with acregmin=3D1,acregmax=3D1 from two clients.
>>2) On client 1, create a process that continuously writes to a file.
>>3) On client 2, remove that file that is being written
>>4) On client 1, interrupted out of the writing process (which is failin=
g
>>     with ESTALEs) and type sync
>>
>>The sync process loops in wait_on_locked(), when called from
>>sync_inodes_sb(), since the "broken" inode can not be cleared
>>from the locked inode list.
>>   =20
>>
>
>Isn't this pretty much a generic problem of mmapped files? AFAICS, sync
>can loop here no matter what filesystem one is using.
> =20
>
maybe, but these were not mmapped files... here is perl script I used

open $file, "> filetje.$$" or die "Open failed: $!";
while ( 1 ) {
   print $file time(), " The quick brown fox jumps over the lazy dog,=20
1234567890\n"
}

>If you use ordinary writes, then all is well, since the
>nfs_revalidate_inode() in nfs_file_write() should set the NFS_INO_STALE
>flag for you.
> =20
>
Right... nfs_revalidate_inode() does "break" the inode only after a (in=20
this case)
seconded ESTALE  failure .... By setting NFS_INO_STALE in nfs_write_done(=
)
it breaks the inode earlier, which means the nfs_revalidate_inode() is=20
not even
tried.

Now that I think about it... probably makes sense to set NFS_INO_STALE in
nfs_commit_done() as well... From my debugging, it just seems the=20
earlier we
break the inode... the better....

steved.



-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now. 
http://productguide.itmanagersjournal.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] NFS - Fix for Infinite loop during syncing
  2004-12-13 21:20   ` Steve Dickson
@ 2004-12-14  0:14     ` Steve Dickson
  0 siblings, 0 replies; 5+ messages in thread
From: Steve Dickson @ 2004-12-14  0:14 UTC (permalink / raw)
  To: nfs; +Cc: Trond Myklebust

[-- Attachment #1: Type: text/plain, Size: 361 bytes --]

Steve Dickson wrote:

> Now that I think about it... probably makes sense to set NFS_INO_STALE in
> nfs_commit_done() as well... From my debugging, it just seems the 
> earlier we
> break the inode... the better....

I liked this idea some much that here is the patch for it... I really do 
think
the sooner we know an inode is broken... the better...

steved.

[-- Attachment #2: linux-2.4.28-nfs-syncloop.patch --]
[-- Type: text/plain, Size: 807 bytes --]

--- linux-2.4.28/fs/nfs/write.c.orig	2004-04-14 09:05:40.000000000 -0400
+++ linux-2.4.28/fs/nfs/write.c	2004-12-13 19:04:23.000000000 -0500
@@ -1073,6 +1073,9 @@ nfs_writeback_done(struct rpc_task *task
 			SetPageError(page);
 			if (req->wb_file)
 				req->wb_file->f_error = task->tk_status;
+			if (task->tk_status == -ESTALE)
+				NFS_FLAGS(inode) |= NFS_INO_STALE;
+
 			nfs_inode_remove_request(req);
 			dprintk(", error = %d\n", task->tk_status);
 			goto next;
@@ -1223,6 +1226,9 @@ nfs_commit_done(struct rpc_task *task)
 		if (task->tk_status < 0) {
 			if (req->wb_file)
 				req->wb_file->f_error = task->tk_status;
+			if (task->tk_status == -ESTALE)
+				NFS_FLAGS(inode) |= NFS_INO_STALE;
+
 			nfs_inode_remove_request(req);
 			dprintk(", error = %d\n", task->tk_status);
 			goto next;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] NFS - Fix for Infinite loop during syncing
  2004-12-13 20:23 [PATCH] NFS - Fix for Infinite loop during syncing Steve Dickson
  2004-12-13 20:45 ` Trond Myklebust
@ 2005-01-31 17:44 ` Steve Dickson
  1 sibling, 0 replies; 5+ messages in thread
From: Steve Dickson @ 2005-01-31 17:44 UTC (permalink / raw)
  To: nfs

[-- Attachment #1: Type: text/plain, Size: 2474 bytes --]

Steve Dickson wrote:

>
> It was brought to my attention that following series of events
> would cause an infinite loop in the 2.4 nfs kernels.
>
> 1) Mount the fileystem with acregmin=1,acregmax=1 from two clients.
> 2) On client 1, create a process that continuously writes to a file.
> 3) On client 2, remove that file that is being written
> 4) On client 1, interrupted out of the writing process (which is failing
>     with ESTALEs) and type sync
>
Here is an update patch to this problem. My original patch does
avoid the infinite loop but didn't address the actual cause of the loop.
The attached patch does... and here is what is happening:

A process is continuity writing to a broken (i.e ESTALE) fd which
is queuing up pages to be sent out.

A getattr happens (due a cache time out) which fails with ESTALE
so _nfs_revalidate_inode() removes the inode from the hash list:

    if (status == -ESTALE) {
        NFS_FLAGS(inode) |= NFS_INO_STALE;
        if (inode != inode->i_sb->s_root->d_inode)
            remove_inode_hash(inode);
    }

Now when __sync_one() comes along and see the dirty pages, the inode
is added to the locked inode list,  data is sync-ed out and then
__refile_inode() is called:
<>
    list_add(&inode->i_list, &inode->i_sb->s_locked_inodes);
<>    inode->i_state |= I_LOCK;
    /* write out data */
    inode->i_state &= ~I_LOCK;
    if (!(inode->i_state & I_FREEING))
        __refile_inode(inode);

Now here is the problem! Since the inode is has already been removed
from the i_hash list, the inode is never refiled 
__refile_inode(inode):

    if (inode->i_state & I_FREEING)
        return;
    if (list_empty(&inode->i_hash))
        return;

which causes the infinite loop because the node is never removed from 
the locked
inode list. Now my original patch avoid this loop because 
__nfs_revalidate_inode()
saw the inode was stale before it removed the inode from the hash list. 
The attached
patch still "breaks" the inode earlier (since is stop a bunch of 
unnecessary i/o)  but
it also it removes the call to remove_inode_hash() in 
_nfs_revalidate_inode() which
is the real cause of the problem....

So code in question is:
    if (inode != inode->i_sb->s_root->d_inode)
        remove_inode_hash(inode);

and I hoping someone can shed some light on as to why the
inode is being removed from the i_hash list with an ESTALE failure.
Does it make sense to remove an inode from the i_hash when there
are dirty pages?

steved.



[-- Attachment #2: linux-2.4.29-nfs-syncloop.patch --]
[-- Type: text/x-patch, Size: 1366 bytes --]

--- linux-2.4.29/fs/nfs/write.c.orig	2004-04-14 09:05:40.000000000 -0400
+++ linux-2.4.29/fs/nfs/write.c	2005-01-31 10:51:45.979056000 -0500
@@ -1073,6 +1073,9 @@ nfs_writeback_done(struct rpc_task *task
 			SetPageError(page);
 			if (req->wb_file)
 				req->wb_file->f_error = task->tk_status;
+			if (task->tk_status == -ESTALE)
+				NFS_FLAGS(inode) |= NFS_INO_STALE;
+
 			nfs_inode_remove_request(req);
 			dprintk(", error = %d\n", task->tk_status);
 			goto next;
@@ -1223,6 +1226,9 @@ nfs_commit_done(struct rpc_task *task)
 		if (task->tk_status < 0) {
 			if (req->wb_file)
 				req->wb_file->f_error = task->tk_status;
+			if (task->tk_status == -ESTALE)
+				NFS_FLAGS(inode) |= NFS_INO_STALE;
+
 			nfs_inode_remove_request(req);
 			dprintk(", error = %d\n", task->tk_status);
 			goto next;
--- linux-2.4.29/fs/nfs/inode.c.orig	2004-04-14 09:05:40.000000000 -0400
+++ linux-2.4.29/fs/nfs/inode.c	2005-01-31 11:02:13.492190000 -0500
@@ -907,11 +907,9 @@ __nfs_revalidate_inode(struct nfs_server
 	if (status) {
 		dfprintk(PAGECACHE, "nfs_revalidate_inode: (%x/%Ld) getattr failed, error=%d\n",
 			 inode->i_dev, (long long)NFS_FILEID(inode), status);
-		if (status == -ESTALE) {
+		if (status == -ESTALE) 
 			NFS_FLAGS(inode) |= NFS_INO_STALE;
-			if (inode != inode->i_sb->s_root->d_inode)
-				remove_inode_hash(inode);
-		}
+
 		goto out;
 	}
 

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2005-01-31 17:42 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-12-13 20:23 [PATCH] NFS - Fix for Infinite loop during syncing Steve Dickson
2004-12-13 20:45 ` Trond Myklebust
2004-12-13 21:20   ` Steve Dickson
2004-12-14  0:14     ` Steve Dickson
2005-01-31 17:44 ` Steve Dickson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.