linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* The return of the hanging "ls"...
@ 2013-11-25  4:59 NeilBrown
  2013-11-25 14:59 ` Chuck Lever
  0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2013-11-25  4:59 UTC (permalink / raw)
  To: Myklebust, Trond, Chuck Lever; +Cc: NFS

[-- Attachment #1: Type: text/plain, Size: 2485 bytes --]


Hi Trond,
 I just noticed commit acdc53b2146c7ee67feb1f02f7bc3020126514b8 from 2010
reverts the effect commit 28c494c5c8d425e15b7b82571e4df6d6bc34594d from Chunk
in 2007.

Specifically it removes mutex lock/unlock in nfs_getattr.
Chuck added them:

-       /* Flush out writes to the server in order to update c/mtime */
-       if (S_ISREG(inode->i_mode))
+       /*
+        * Flush out writes to the server in order to update c/mtime.
+        *
+        * Hold the i_mutex to suspend application writes temporarily;
+        * this prevents long-running writing applications from blocking
+        * nfs_wb_nocommit.
+        */
+       if (S_ISREG(inode->i_mode)) {
+               mutex_lock(&inode->i_mutex);
                nfs_wb_nocommit(inode);
+               mutex_unlock(&inode->i_mutex);
+       }


You removed them.

-       /*
-        * Flush out writes to the server in order to update c/mtime.
-        *
-        * Hold the i_mutex to suspend application writes temporarily;
-        * this prevents long-running writing applications from blocking
-        * nfs_wb_nocommit.
-        */
+       /* Flush out writes to the server in order to update c/mtime.  */
        if (S_ISREG(inode->i_mode)) {
-               mutex_lock(&inode->i_mutex);
-               nfs_wb_nocommit(inode);
-               mutex_unlock(&inode->i_mutex);
+               err = filemap_write_and_wait(inode->i_mapping);
+               if (err)
+                       goto out;
        }
 
        /*


Do you recall why?

I noticed because a customer reported exactly the same symptoms the were
fixed by Chucks patch some years ago.


The comment on your patch says (in part):

    Also replace nfs_wb_nocommit() with a
    call to filemap_write_and_wait(), which doesn't need to hold the
    inode->i_mutex.

It is certainly true that filemap_write_and_wait doesn't need to hold the
mutex, but neither did nfs_wb_nocommit.  The mutex is held to stop "suspend
application writes temporarily" so no more pages get dirtied until all
the current dirty pages have been written out.  i.e. to stop
generic_file_aio_write() from proceeding.

The particular test that shows the problem is a large write like
  dd if=/dev/zero of=/mnt/nfs/somefile count=2000000
then in another window
  ls -l /mnt/nfs

the "ls -l" will hang until the "dd" completes.

Can we put the mutex lock/unlock back please?

Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: The return of the hanging "ls"...
  2013-11-25  4:59 The return of the hanging "ls" NeilBrown
@ 2013-11-25 14:59 ` Chuck Lever
  2013-11-25 23:23   ` NeilBrown
  0 siblings, 1 reply; 5+ messages in thread
From: Chuck Lever @ 2013-11-25 14:59 UTC (permalink / raw)
  To: NeilBrown; +Cc: Myklebust, Trond, NFS

Hi Neil-

On Nov 24, 2013, at 11:59 PM, NeilBrown <neilb@suse.de> wrote:

> 
> Hi Trond,
> I just noticed commit acdc53b2146c7ee67feb1f02f7bc3020126514b8 from 2010
> reverts the effect commit 28c494c5c8d425e15b7b82571e4df6d6bc34594d from Chunk
> in 2007.

I'm wondering if a subsequent commit changed filemap_write_and_wait().

There was some recent clean up that (very likely unintentionally) changed the serialization of some of these VFS utility functions.



> Specifically it removes mutex lock/unlock in nfs_getattr.
> Chuck added them:
> 
> -       /* Flush out writes to the server in order to update c/mtime */
> -       if (S_ISREG(inode->i_mode))
> +       /*
> +        * Flush out writes to the server in order to update c/mtime.
> +        *
> +        * Hold the i_mutex to suspend application writes temporarily;
> +        * this prevents long-running writing applications from blocking
> +        * nfs_wb_nocommit.
> +        */
> +       if (S_ISREG(inode->i_mode)) {
> +               mutex_lock(&inode->i_mutex);
>                nfs_wb_nocommit(inode);
> +               mutex_unlock(&inode->i_mutex);
> +       }
> 
> 
> You removed them.
> 
> -       /*
> -        * Flush out writes to the server in order to update c/mtime.
> -        *
> -        * Hold the i_mutex to suspend application writes temporarily;
> -        * this prevents long-running writing applications from blocking
> -        * nfs_wb_nocommit.
> -        */
> +       /* Flush out writes to the server in order to update c/mtime.  */
>        if (S_ISREG(inode->i_mode)) {
> -               mutex_lock(&inode->i_mutex);
> -               nfs_wb_nocommit(inode);
> -               mutex_unlock(&inode->i_mutex);
> +               err = filemap_write_and_wait(inode->i_mapping);
> +               if (err)
> +                       goto out;
>        }
> 
>        /*
> 
> 
> Do you recall why?
> 
> I noticed because a customer reported exactly the same symptoms the were
> fixed by Chucks patch some years ago.
> 
> 
> The comment on your patch says (in part):
> 
>    Also replace nfs_wb_nocommit() with a
>    call to filemap_write_and_wait(), which doesn't need to hold the
>    inode->i_mutex.
> 
> It is certainly true that filemap_write_and_wait doesn't need to hold the
> mutex, but neither did nfs_wb_nocommit.  The mutex is held to stop "suspend
> application writes temporarily" so no more pages get dirtied until all
> the current dirty pages have been written out.  i.e. to stop
> generic_file_aio_write() from proceeding.
> 
> The particular test that shows the problem is a large write like
>  dd if=/dev/zero of=/mnt/nfs/somefile count=2000000
> then in another window
>  ls -l /mnt/nfs
> 
> the "ls -l" will hang until the "dd" completes.
> 
> Can we put the mutex lock/unlock back please?
> 
> Thanks,
> NeilBrown

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: The return of the hanging "ls"...
  2013-11-25 14:59 ` Chuck Lever
@ 2013-11-25 23:23   ` NeilBrown
  2013-11-25 23:41     ` Myklebust, Trond
  0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2013-11-25 23:23 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Myklebust, Trond, NFS

[-- Attachment #1: Type: text/plain, Size: 786 bytes --]

On Mon, 25 Nov 2013 09:59:39 -0500 Chuck Lever <chuck.lever@oracle.com> wrote:

> Hi Neil-
> 
> On Nov 24, 2013, at 11:59 PM, NeilBrown <neilb@suse.de> wrote:
> 
> > 
> > Hi Trond,
> > I just noticed commit acdc53b2146c7ee67feb1f02f7bc3020126514b8 from 2010
> > reverts the effect commit 28c494c5c8d425e15b7b82571e4df6d6bc34594d from Chunk
> > in 2007.
> 
> I'm wondering if a subsequent commit changed filemap_write_and_wait().

I hadn't thought of that possibility.  I've just had a look at the
differences between  acdc53b2146c7ee6  and now and cannot find anything that
could be related.

Thanks,
NeilBrown

> 
> There was some recent clean up that (very likely unintentionally) changed the serialization of some of these VFS utility functions.
> 
> 
> 


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: The return of the hanging "ls"...
  2013-11-25 23:23   ` NeilBrown
@ 2013-11-25 23:41     ` Myklebust, Trond
  2013-11-26  2:11       ` NeilBrown
  0 siblings, 1 reply; 5+ messages in thread
From: Myklebust, Trond @ 2013-11-25 23:41 UTC (permalink / raw)
  To: NeilBrown; +Cc: Chuck Lever, NFS

On Tue, 2013-11-26 at 10:23 +1100, NeilBrown wrote:
> On Mon, 25 Nov 2013 09:59:39 -0500 Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> > Hi Neil-
> > 
> > On Nov 24, 2013, at 11:59 PM, NeilBrown <neilb@suse.de> wrote:
> > 
> > > 
> > > Hi Trond,
> > > I just noticed commit acdc53b2146c7ee67feb1f02f7bc3020126514b8 from 2010
> > > reverts the effect commit 28c494c5c8d425e15b7b82571e4df6d6bc34594d from Chunk
> > > in 2007.
> > 
> > I'm wondering if a subsequent commit changed filemap_write_and_wait().
> 
> I hadn't thought of that possibility.  I've just had a look at the
> differences between  acdc53b2146c7ee6  and now and cannot find anything that
> could be related.

To clarify a little: my understanding is that the current 2-pass code in
write_cache_pages() is supposed to prevent livelock. Instead of chasing
PAGECACHE_TAG_DIRTY tags (which are constantly being set if an
application is actively writing), we call tag_pages_for_writeback() once
in order to convert the current set of PAGECACHE_TAG_DIRTY tags into
PAGECACHE_TAG_TOWRITE tags, and then we have a second pass write those
pages back (and wait for completion).
IOW: the inode->i_mutex should be unnecessary here...


Now that said, we recently added in the call to nfs_inode_dio_wait(). If
applications are using O_DIRECT, then _that_ could livelock. There is
nothing currently preventing the applications from continuing to bump
the inode->i_dio_count while we're waiting. Christoph has proposed some
locking changes that should fix that problem. I'm still evaluating his
patchset...

Cheers
  Trond

Cheers
  Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: The return of the hanging "ls"...
  2013-11-25 23:41     ` Myklebust, Trond
@ 2013-11-26  2:11       ` NeilBrown
  0 siblings, 0 replies; 5+ messages in thread
From: NeilBrown @ 2013-11-26  2:11 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: Chuck Lever, NFS

[-- Attachment #1: Type: text/plain, Size: 2262 bytes --]

On Mon, 25 Nov 2013 23:41:02 +0000 "Myklebust, Trond"
<Trond.Myklebust@netapp.com> wrote:

> On Tue, 2013-11-26 at 10:23 +1100, NeilBrown wrote:
> > On Mon, 25 Nov 2013 09:59:39 -0500 Chuck Lever <chuck.lever@oracle.com> wrote:
> > 
> > > Hi Neil-
> > > 
> > > On Nov 24, 2013, at 11:59 PM, NeilBrown <neilb@suse.de> wrote:
> > > 
> > > > 
> > > > Hi Trond,
> > > > I just noticed commit acdc53b2146c7ee67feb1f02f7bc3020126514b8 from 2010
> > > > reverts the effect commit 28c494c5c8d425e15b7b82571e4df6d6bc34594d from Chunk
> > > > in 2007.
> > > 
> > > I'm wondering if a subsequent commit changed filemap_write_and_wait().
> > 
> > I hadn't thought of that possibility.  I've just had a look at the
> > differences between  acdc53b2146c7ee6  and now and cannot find anything that
> > could be related.
> 
> To clarify a little: my understanding is that the current 2-pass code in
> write_cache_pages() is supposed to prevent livelock. Instead of chasing
> PAGECACHE_TAG_DIRTY tags (which are constantly being set if an
> application is actively writing), we call tag_pages_for_writeback() once
> in order to convert the current set of PAGECACHE_TAG_DIRTY tags into
> PAGECACHE_TAG_TOWRITE tags, and then we have a second pass write those
> pages back (and wait for completion).
> IOW: the inode->i_mutex should be unnecessary here...

Thanks for that pointer.  You are right, write_cache_pages shouldn't loop
indefinitely any more.
And I also just noticed commit 72cb77f4a5ace37b12dcb47a0e8637a2c28ad881 and
NFS_INO_FLUSHING which is an extra reason that the i_mutex isn't needed.
Don't know how I missed that last night.  Less haste, more speed I guess.

So it looks like I jumped the gun and there must be some other explanation.
Sorry 'bout that.

Thanks,
NeilBrown

> 
> 
> Now that said, we recently added in the call to nfs_inode_dio_wait(). If
> applications are using O_DIRECT, then _that_ could livelock. There is
> nothing currently preventing the applications from continuing to bump
> the inode->i_dio_count while we're waiting. Christoph has proposed some
> locking changes that should fix that problem. I'm still evaluating his
> patchset...
> 
> Cheers
>   Trond
> 
> Cheers
>   Trond


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2013-11-26  2:11 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-11-25  4:59 The return of the hanging "ls" NeilBrown
2013-11-25 14:59 ` Chuck Lever
2013-11-25 23:23   ` NeilBrown
2013-11-25 23:41     ` Myklebust, Trond
2013-11-26  2:11       ` NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).