* [PATCH] NFS - Fix for Infinite loop during syncing
@ 2004-12-13 20:23 Steve Dickson
2004-12-13 20:45 ` Trond Myklebust
2005-01-31 17:44 ` Steve Dickson
0 siblings, 2 replies; 5+ messages in thread
From: Steve Dickson @ 2004-12-13 20:23 UTC (permalink / raw)
To: nfs
[-- Attachment #1: Type: text/plain, Size: 752 bytes --]
It was brought to my attention that following series of events
would cause an infinite loop in the 2.4 nfs kernels.
1) Mount the fileystem with acregmin=1,acregmax=1 from two clients.
2) On client 1, create a process that continuously writes to a file.
3) On client 2, remove that file that is being written
4) On client 1, interrupted out of the writing process (which is failing
with ESTALEs) and type sync
The sync process loops in wait_on_locked(), when called from
sync_inodes_sb(), since the "broken" inode can not be cleared
from the locked inode list.
This patch sets the NFS_INO_STALE bit in write path (via
nfs_writeback_done) which breaks the inode is early enough to
stop it from being added to the that list.
Comments?
steved.
[-- Attachment #2: linux-2.4.28-nfs-syncloop.patch --]
[-- Type: text/plain, Size: 472 bytes --]
--- linux-2.4.28/fs/nfs/write.c.orig 2004-04-14 09:05:40.000000000 -0400
+++ linux-2.4.28/fs/nfs/write.c 2004-12-13 14:23:55.000000000 -0500
@@ -1073,6 +1073,9 @@ nfs_writeback_done(struct rpc_task *task
SetPageError(page);
if (req->wb_file)
req->wb_file->f_error = task->tk_status;
+ if (task->tk_status == -ESTALE)
+ NFS_FLAGS(inode) |= NFS_INO_STALE;
+
nfs_inode_remove_request(req);
dprintk(", error = %d\n", task->tk_status);
goto next;
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH] NFS - Fix for Infinite loop during syncing 2004-12-13 20:23 [PATCH] NFS - Fix for Infinite loop during syncing Steve Dickson @ 2004-12-13 20:45 ` Trond Myklebust 2004-12-13 21:20 ` Steve Dickson 2005-01-31 17:44 ` Steve Dickson 1 sibling, 1 reply; 5+ messages in thread From: Trond Myklebust @ 2004-12-13 20:45 UTC (permalink / raw) To: Steve Dickson; +Cc: nfs m=E5 den 13.12.2004 Klokka 15:23 (-0500) skreiv Steve Dickson: > It was brought to my attention that following series of events > would cause an infinite loop in the 2.4 nfs kernels. >=20 > 1) Mount the fileystem with acregmin=3D1,acregmax=3D1 from two clients. > 2) On client 1, create a process that continuously writes to a file. > 3) On client 2, remove that file that is being written > 4) On client 1, interrupted out of the writing process (which is failing > with ESTALEs) and type sync >=20 > The sync process loops in wait_on_locked(), when called from > sync_inodes_sb(), since the "broken" inode can not be cleared > from the locked inode list. Isn't this pretty much a generic problem of mmapped files? AFAICS, sync can loop here no matter what filesystem one is using. If you use ordinary writes, then all is well, since the nfs_revalidate_inode() in nfs_file_write() should set the NFS_INO_STALE flag for you. Cheers Trond --=20 Trond Myklebust <trond.myklebust@fys.uio.no> ------------------------------------------------------- SF email is sponsored by - The IT Product Guide Read honest & candid reviews on hundreds of IT Products from real users. Discover which products truly live up to the hype. Start reading now. http://productguide.itmanagersjournal.com/ _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] NFS - Fix for Infinite loop during syncing 2004-12-13 20:45 ` Trond Myklebust @ 2004-12-13 21:20 ` Steve Dickson 2004-12-14 0:14 ` Steve Dickson 0 siblings, 1 reply; 5+ messages in thread From: Steve Dickson @ 2004-12-13 21:20 UTC (permalink / raw) To: Trond Myklebust; +Cc: nfs Trond Myklebust wrote: >m=E5 den 13.12.2004 Klokka 15:23 (-0500) skreiv Steve Dickson: > =20 > >>It was brought to my attention that following series of events >>would cause an infinite loop in the 2.4 nfs kernels. >> >>1) Mount the fileystem with acregmin=3D1,acregmax=3D1 from two clients. >>2) On client 1, create a process that continuously writes to a file. >>3) On client 2, remove that file that is being written >>4) On client 1, interrupted out of the writing process (which is failin= g >> with ESTALEs) and type sync >> >>The sync process loops in wait_on_locked(), when called from >>sync_inodes_sb(), since the "broken" inode can not be cleared >>from the locked inode list. >> =20 >> > >Isn't this pretty much a generic problem of mmapped files? AFAICS, sync >can loop here no matter what filesystem one is using. > =20 > maybe, but these were not mmapped files... here is perl script I used open $file, "> filetje.$$" or die "Open failed: $!"; while ( 1 ) { print $file time(), " The quick brown fox jumps over the lazy dog,=20 1234567890\n" } >If you use ordinary writes, then all is well, since the >nfs_revalidate_inode() in nfs_file_write() should set the NFS_INO_STALE >flag for you. > =20 > Right... nfs_revalidate_inode() does "break" the inode only after a (in=20 this case) seconded ESTALE failure .... By setting NFS_INO_STALE in nfs_write_done(= ) it breaks the inode earlier, which means the nfs_revalidate_inode() is=20 not even tried. Now that I think about it... probably makes sense to set NFS_INO_STALE in nfs_commit_done() as well... From my debugging, it just seems the=20 earlier we break the inode... the better.... steved. ------------------------------------------------------- SF email is sponsored by - The IT Product Guide Read honest & candid reviews on hundreds of IT Products from real users. Discover which products truly live up to the hype. Start reading now. http://productguide.itmanagersjournal.com/ _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] NFS - Fix for Infinite loop during syncing 2004-12-13 21:20 ` Steve Dickson @ 2004-12-14 0:14 ` Steve Dickson 0 siblings, 0 replies; 5+ messages in thread From: Steve Dickson @ 2004-12-14 0:14 UTC (permalink / raw) To: nfs; +Cc: Trond Myklebust [-- Attachment #1: Type: text/plain, Size: 361 bytes --] Steve Dickson wrote: > Now that I think about it... probably makes sense to set NFS_INO_STALE in > nfs_commit_done() as well... From my debugging, it just seems the > earlier we > break the inode... the better.... I liked this idea some much that here is the patch for it... I really do think the sooner we know an inode is broken... the better... steved. [-- Attachment #2: linux-2.4.28-nfs-syncloop.patch --] [-- Type: text/plain, Size: 807 bytes --] --- linux-2.4.28/fs/nfs/write.c.orig 2004-04-14 09:05:40.000000000 -0400 +++ linux-2.4.28/fs/nfs/write.c 2004-12-13 19:04:23.000000000 -0500 @@ -1073,6 +1073,9 @@ nfs_writeback_done(struct rpc_task *task SetPageError(page); if (req->wb_file) req->wb_file->f_error = task->tk_status; + if (task->tk_status == -ESTALE) + NFS_FLAGS(inode) |= NFS_INO_STALE; + nfs_inode_remove_request(req); dprintk(", error = %d\n", task->tk_status); goto next; @@ -1223,6 +1226,9 @@ nfs_commit_done(struct rpc_task *task) if (task->tk_status < 0) { if (req->wb_file) req->wb_file->f_error = task->tk_status; + if (task->tk_status == -ESTALE) + NFS_FLAGS(inode) |= NFS_INO_STALE; + nfs_inode_remove_request(req); dprintk(", error = %d\n", task->tk_status); goto next; ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] NFS - Fix for Infinite loop during syncing 2004-12-13 20:23 [PATCH] NFS - Fix for Infinite loop during syncing Steve Dickson 2004-12-13 20:45 ` Trond Myklebust @ 2005-01-31 17:44 ` Steve Dickson 1 sibling, 0 replies; 5+ messages in thread From: Steve Dickson @ 2005-01-31 17:44 UTC (permalink / raw) To: nfs [-- Attachment #1: Type: text/plain, Size: 2474 bytes --] Steve Dickson wrote: > > It was brought to my attention that following series of events > would cause an infinite loop in the 2.4 nfs kernels. > > 1) Mount the fileystem with acregmin=1,acregmax=1 from two clients. > 2) On client 1, create a process that continuously writes to a file. > 3) On client 2, remove that file that is being written > 4) On client 1, interrupted out of the writing process (which is failing > with ESTALEs) and type sync > Here is an update patch to this problem. My original patch does avoid the infinite loop but didn't address the actual cause of the loop. The attached patch does... and here is what is happening: A process is continuity writing to a broken (i.e ESTALE) fd which is queuing up pages to be sent out. A getattr happens (due a cache time out) which fails with ESTALE so _nfs_revalidate_inode() removes the inode from the hash list: if (status == -ESTALE) { NFS_FLAGS(inode) |= NFS_INO_STALE; if (inode != inode->i_sb->s_root->d_inode) remove_inode_hash(inode); } Now when __sync_one() comes along and see the dirty pages, the inode is added to the locked inode list, data is sync-ed out and then __refile_inode() is called: <> list_add(&inode->i_list, &inode->i_sb->s_locked_inodes); <> inode->i_state |= I_LOCK; /* write out data */ inode->i_state &= ~I_LOCK; if (!(inode->i_state & I_FREEING)) __refile_inode(inode); Now here is the problem! Since the inode is has already been removed from the i_hash list, the inode is never refiled __refile_inode(inode): if (inode->i_state & I_FREEING) return; if (list_empty(&inode->i_hash)) return; which causes the infinite loop because the node is never removed from the locked inode list. Now my original patch avoid this loop because __nfs_revalidate_inode() saw the inode was stale before it removed the inode from the hash list. The attached patch still "breaks" the inode earlier (since is stop a bunch of unnecessary i/o) but it also it removes the call to remove_inode_hash() in _nfs_revalidate_inode() which is the real cause of the problem.... So code in question is: if (inode != inode->i_sb->s_root->d_inode) remove_inode_hash(inode); and I hoping someone can shed some light on as to why the inode is being removed from the i_hash list with an ESTALE failure. Does it make sense to remove an inode from the i_hash when there are dirty pages? steved. [-- Attachment #2: linux-2.4.29-nfs-syncloop.patch --] [-- Type: text/x-patch, Size: 1366 bytes --] --- linux-2.4.29/fs/nfs/write.c.orig 2004-04-14 09:05:40.000000000 -0400 +++ linux-2.4.29/fs/nfs/write.c 2005-01-31 10:51:45.979056000 -0500 @@ -1073,6 +1073,9 @@ nfs_writeback_done(struct rpc_task *task SetPageError(page); if (req->wb_file) req->wb_file->f_error = task->tk_status; + if (task->tk_status == -ESTALE) + NFS_FLAGS(inode) |= NFS_INO_STALE; + nfs_inode_remove_request(req); dprintk(", error = %d\n", task->tk_status); goto next; @@ -1223,6 +1226,9 @@ nfs_commit_done(struct rpc_task *task) if (task->tk_status < 0) { if (req->wb_file) req->wb_file->f_error = task->tk_status; + if (task->tk_status == -ESTALE) + NFS_FLAGS(inode) |= NFS_INO_STALE; + nfs_inode_remove_request(req); dprintk(", error = %d\n", task->tk_status); goto next; --- linux-2.4.29/fs/nfs/inode.c.orig 2004-04-14 09:05:40.000000000 -0400 +++ linux-2.4.29/fs/nfs/inode.c 2005-01-31 11:02:13.492190000 -0500 @@ -907,11 +907,9 @@ __nfs_revalidate_inode(struct nfs_server if (status) { dfprintk(PAGECACHE, "nfs_revalidate_inode: (%x/%Ld) getattr failed, error=%d\n", inode->i_dev, (long long)NFS_FILEID(inode), status); - if (status == -ESTALE) { + if (status == -ESTALE) NFS_FLAGS(inode) |= NFS_INO_STALE; - if (inode != inode->i_sb->s_root->d_inode) - remove_inode_hash(inode); - } + goto out; } ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2005-01-31 17:42 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2004-12-13 20:23 [PATCH] NFS - Fix for Infinite loop during syncing Steve Dickson 2004-12-13 20:45 ` Trond Myklebust 2004-12-13 21:20 ` Steve Dickson 2004-12-14 0:14 ` Steve Dickson 2005-01-31 17:44 ` Steve Dickson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox