* The return of the hanging "ls"...
@ 2013-11-25 4:59 NeilBrown
2013-11-25 14:59 ` Chuck Lever
0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2013-11-25 4:59 UTC (permalink / raw)
To: Myklebust, Trond, Chuck Lever; +Cc: NFS
[-- Attachment #1: Type: text/plain, Size: 2485 bytes --]
Hi Trond,
I just noticed commit acdc53b2146c7ee67feb1f02f7bc3020126514b8 from 2010
reverts the effect commit 28c494c5c8d425e15b7b82571e4df6d6bc34594d from Chunk
in 2007.
Specifically it removes mutex lock/unlock in nfs_getattr.
Chuck added them:
- /* Flush out writes to the server in order to update c/mtime */
- if (S_ISREG(inode->i_mode))
+ /*
+ * Flush out writes to the server in order to update c/mtime.
+ *
+ * Hold the i_mutex to suspend application writes temporarily;
+ * this prevents long-running writing applications from blocking
+ * nfs_wb_nocommit.
+ */
+ if (S_ISREG(inode->i_mode)) {
+ mutex_lock(&inode->i_mutex);
nfs_wb_nocommit(inode);
+ mutex_unlock(&inode->i_mutex);
+ }
You removed them.
- /*
- * Flush out writes to the server in order to update c/mtime.
- *
- * Hold the i_mutex to suspend application writes temporarily;
- * this prevents long-running writing applications from blocking
- * nfs_wb_nocommit.
- */
+ /* Flush out writes to the server in order to update c/mtime. */
if (S_ISREG(inode->i_mode)) {
- mutex_lock(&inode->i_mutex);
- nfs_wb_nocommit(inode);
- mutex_unlock(&inode->i_mutex);
+ err = filemap_write_and_wait(inode->i_mapping);
+ if (err)
+ goto out;
}
/*
Do you recall why?
I noticed because a customer reported exactly the same symptoms the were
fixed by Chucks patch some years ago.
The comment on your patch says (in part):
Also replace nfs_wb_nocommit() with a
call to filemap_write_and_wait(), which doesn't need to hold the
inode->i_mutex.
It is certainly true that filemap_write_and_wait doesn't need to hold the
mutex, but neither did nfs_wb_nocommit. The mutex is held to stop "suspend
application writes temporarily" so no more pages get dirtied until all
the current dirty pages have been written out. i.e. to stop
generic_file_aio_write() from proceeding.
The particular test that shows the problem is a large write like
dd if=/dev/zero of=/mnt/nfs/somefile count=2000000
then in another window
ls -l /mnt/nfs
the "ls -l" will hang until the "dd" completes.
Can we put the mutex lock/unlock back please?
Thanks,
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: The return of the hanging "ls"...
2013-11-25 4:59 The return of the hanging "ls" NeilBrown
@ 2013-11-25 14:59 ` Chuck Lever
2013-11-25 23:23 ` NeilBrown
0 siblings, 1 reply; 5+ messages in thread
From: Chuck Lever @ 2013-11-25 14:59 UTC (permalink / raw)
To: NeilBrown; +Cc: Myklebust, Trond, NFS
Hi Neil-
On Nov 24, 2013, at 11:59 PM, NeilBrown <neilb@suse.de> wrote:
>
> Hi Trond,
> I just noticed commit acdc53b2146c7ee67feb1f02f7bc3020126514b8 from 2010
> reverts the effect commit 28c494c5c8d425e15b7b82571e4df6d6bc34594d from Chunk
> in 2007.
I'm wondering if a subsequent commit changed filemap_write_and_wait().
There was some recent clean up that (very likely unintentionally) changed the serialization of some of these VFS utility functions.
> Specifically it removes mutex lock/unlock in nfs_getattr.
> Chuck added them:
>
> - /* Flush out writes to the server in order to update c/mtime */
> - if (S_ISREG(inode->i_mode))
> + /*
> + * Flush out writes to the server in order to update c/mtime.
> + *
> + * Hold the i_mutex to suspend application writes temporarily;
> + * this prevents long-running writing applications from blocking
> + * nfs_wb_nocommit.
> + */
> + if (S_ISREG(inode->i_mode)) {
> + mutex_lock(&inode->i_mutex);
> nfs_wb_nocommit(inode);
> + mutex_unlock(&inode->i_mutex);
> + }
>
>
> You removed them.
>
> - /*
> - * Flush out writes to the server in order to update c/mtime.
> - *
> - * Hold the i_mutex to suspend application writes temporarily;
> - * this prevents long-running writing applications from blocking
> - * nfs_wb_nocommit.
> - */
> + /* Flush out writes to the server in order to update c/mtime. */
> if (S_ISREG(inode->i_mode)) {
> - mutex_lock(&inode->i_mutex);
> - nfs_wb_nocommit(inode);
> - mutex_unlock(&inode->i_mutex);
> + err = filemap_write_and_wait(inode->i_mapping);
> + if (err)
> + goto out;
> }
>
> /*
>
>
> Do you recall why?
>
> I noticed because a customer reported exactly the same symptoms the were
> fixed by Chucks patch some years ago.
>
>
> The comment on your patch says (in part):
>
> Also replace nfs_wb_nocommit() with a
> call to filemap_write_and_wait(), which doesn't need to hold the
> inode->i_mutex.
>
> It is certainly true that filemap_write_and_wait doesn't need to hold the
> mutex, but neither did nfs_wb_nocommit. The mutex is held to stop "suspend
> application writes temporarily" so no more pages get dirtied until all
> the current dirty pages have been written out. i.e. to stop
> generic_file_aio_write() from proceeding.
>
> The particular test that shows the problem is a large write like
> dd if=/dev/zero of=/mnt/nfs/somefile count=2000000
> then in another window
> ls -l /mnt/nfs
>
> the "ls -l" will hang until the "dd" completes.
>
> Can we put the mutex lock/unlock back please?
>
> Thanks,
> NeilBrown
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: The return of the hanging "ls"...
2013-11-25 14:59 ` Chuck Lever
@ 2013-11-25 23:23 ` NeilBrown
2013-11-25 23:41 ` Myklebust, Trond
0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2013-11-25 23:23 UTC (permalink / raw)
To: Chuck Lever; +Cc: Myklebust, Trond, NFS
[-- Attachment #1: Type: text/plain, Size: 786 bytes --]
On Mon, 25 Nov 2013 09:59:39 -0500 Chuck Lever <chuck.lever@oracle.com> wrote:
> Hi Neil-
>
> On Nov 24, 2013, at 11:59 PM, NeilBrown <neilb@suse.de> wrote:
>
> >
> > Hi Trond,
> > I just noticed commit acdc53b2146c7ee67feb1f02f7bc3020126514b8 from 2010
> > reverts the effect commit 28c494c5c8d425e15b7b82571e4df6d6bc34594d from Chunk
> > in 2007.
>
> I'm wondering if a subsequent commit changed filemap_write_and_wait().
I hadn't thought of that possibility. I've just had a look at the
differences between acdc53b2146c7ee6 and now and cannot find anything that
could be related.
Thanks,
NeilBrown
>
> There was some recent clean up that (very likely unintentionally) changed the serialization of some of these VFS utility functions.
>
>
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: The return of the hanging "ls"...
2013-11-25 23:23 ` NeilBrown
@ 2013-11-25 23:41 ` Myklebust, Trond
2013-11-26 2:11 ` NeilBrown
0 siblings, 1 reply; 5+ messages in thread
From: Myklebust, Trond @ 2013-11-25 23:41 UTC (permalink / raw)
To: NeilBrown; +Cc: Chuck Lever, NFS
On Tue, 2013-11-26 at 10:23 +1100, NeilBrown wrote:
> On Mon, 25 Nov 2013 09:59:39 -0500 Chuck Lever <chuck.lever@oracle.com> wrote:
>
> > Hi Neil-
> >
> > On Nov 24, 2013, at 11:59 PM, NeilBrown <neilb@suse.de> wrote:
> >
> > >
> > > Hi Trond,
> > > I just noticed commit acdc53b2146c7ee67feb1f02f7bc3020126514b8 from 2010
> > > reverts the effect commit 28c494c5c8d425e15b7b82571e4df6d6bc34594d from Chunk
> > > in 2007.
> >
> > I'm wondering if a subsequent commit changed filemap_write_and_wait().
>
> I hadn't thought of that possibility. I've just had a look at the
> differences between acdc53b2146c7ee6 and now and cannot find anything that
> could be related.
To clarify a little: my understanding is that the current 2-pass code in
write_cache_pages() is supposed to prevent livelock. Instead of chasing
PAGECACHE_TAG_DIRTY tags (which are constantly being set if an
application is actively writing), we call tag_pages_for_writeback() once
in order to convert the current set of PAGECACHE_TAG_DIRTY tags into
PAGECACHE_TAG_TOWRITE tags, and then we have a second pass write those
pages back (and wait for completion).
IOW: the inode->i_mutex should be unnecessary here...
Now that said, we recently added in the call to nfs_inode_dio_wait(). If
applications are using O_DIRECT, then _that_ could livelock. There is
nothing currently preventing the applications from continuing to bump
the inode->i_dio_count while we're waiting. Christoph has proposed some
locking changes that should fix that problem. I'm still evaluating his
patchset...
Cheers
Trond
Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer
NetApp
Trond.Myklebust@netapp.com
www.netapp.com
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: The return of the hanging "ls"...
2013-11-25 23:41 ` Myklebust, Trond
@ 2013-11-26 2:11 ` NeilBrown
0 siblings, 0 replies; 5+ messages in thread
From: NeilBrown @ 2013-11-26 2:11 UTC (permalink / raw)
To: Myklebust, Trond; +Cc: Chuck Lever, NFS
[-- Attachment #1: Type: text/plain, Size: 2262 bytes --]
On Mon, 25 Nov 2013 23:41:02 +0000 "Myklebust, Trond"
<Trond.Myklebust@netapp.com> wrote:
> On Tue, 2013-11-26 at 10:23 +1100, NeilBrown wrote:
> > On Mon, 25 Nov 2013 09:59:39 -0500 Chuck Lever <chuck.lever@oracle.com> wrote:
> >
> > > Hi Neil-
> > >
> > > On Nov 24, 2013, at 11:59 PM, NeilBrown <neilb@suse.de> wrote:
> > >
> > > >
> > > > Hi Trond,
> > > > I just noticed commit acdc53b2146c7ee67feb1f02f7bc3020126514b8 from 2010
> > > > reverts the effect commit 28c494c5c8d425e15b7b82571e4df6d6bc34594d from Chunk
> > > > in 2007.
> > >
> > > I'm wondering if a subsequent commit changed filemap_write_and_wait().
> >
> > I hadn't thought of that possibility. I've just had a look at the
> > differences between acdc53b2146c7ee6 and now and cannot find anything that
> > could be related.
>
> To clarify a little: my understanding is that the current 2-pass code in
> write_cache_pages() is supposed to prevent livelock. Instead of chasing
> PAGECACHE_TAG_DIRTY tags (which are constantly being set if an
> application is actively writing), we call tag_pages_for_writeback() once
> in order to convert the current set of PAGECACHE_TAG_DIRTY tags into
> PAGECACHE_TAG_TOWRITE tags, and then we have a second pass write those
> pages back (and wait for completion).
> IOW: the inode->i_mutex should be unnecessary here...
Thanks for that pointer. You are right, write_cache_pages shouldn't loop
indefinitely any more.
And I also just noticed commit 72cb77f4a5ace37b12dcb47a0e8637a2c28ad881 and
NFS_INO_FLUSHING which is an extra reason that the i_mutex isn't needed.
Don't know how I missed that last night. Less haste, more speed I guess.
So it looks like I jumped the gun and there must be some other explanation.
Sorry 'bout that.
Thanks,
NeilBrown
>
>
> Now that said, we recently added in the call to nfs_inode_dio_wait(). If
> applications are using O_DIRECT, then _that_ could livelock. There is
> nothing currently preventing the applications from continuing to bump
> the inode->i_dio_count while we're waiting. Christoph has proposed some
> locking changes that should fix that problem. I'm still evaluating his
> patchset...
>
> Cheers
> Trond
>
> Cheers
> Trond
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2013-11-26 2:11 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-11-25 4:59 The return of the hanging "ls" NeilBrown
2013-11-25 14:59 ` Chuck Lever
2013-11-25 23:23 ` NeilBrown
2013-11-25 23:41 ` Myklebust, Trond
2013-11-26 2:11 ` NeilBrown
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).