From: Al Viro <viro@ZenIV.linux.org.uk>
To: Jeff Layton <jlayton@poochiereds.net>
Cc: linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Trond Myklebust <trond.myklebust@primarydata.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Anna Schumaker <Anna.Schumaker@netapp.com>
Subject: Re: parallel lookups on NFS
Date: Sat, 30 Apr 2016 19:58:36 +0100
Message-ID: <20160430185836.GC25498@ZenIV.linux.org.uk>
In-Reply-To: <1462027414.10011.31.camel@poochiereds.net>

On Sat, Apr 30, 2016 at 10:43:34AM -0400, Jeff Layton wrote:

> Not exactly, but the test seems to have deadlocked without the last
> patch in play. Here's the ls command:
> 
> [jlayton@rawhide ~]$ cat /proc/1425/stack
> [<ffffffffa03d6eec>] nfs_block_sillyrename+0x5c/0xa0 [nfs]
> [<ffffffffa03c8ef8>] nfs_readdir+0xf8/0x620 [nfs]
> [<ffffffff812232bb>] iterate_dir+0x16b/0x1a0
> [<ffffffff812236e8>] SyS_getdents+0x88/0x100
> [<ffffffff81003cb2>] do_syscall_64+0x62/0x110
> [<ffffffff8174ae21>] return_from_SYSCALL_64+0x0/0x6a
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> ...and here is the nfsidem command:
> 
> [jlayton@rawhide ~]$ cat /proc/1295/stack
> [<ffffffff813953b7>] call_rwsem_down_write_failed+0x17/0x30
> [<ffffffff8121f65b>] filename_create+0x6b/0x150
> [<ffffffff812204e4>] SyS_mkdir+0x44/0xe0
> [<ffffffff81003cb2>] do_syscall_64+0x62/0x110
> [<ffffffff8174ae21>] return_from_SYSCALL_64+0x0/0x6a
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> 
> I'll have to take off here in a bit so I won't be able to help much
> until later, but all I was doing was running the cthon special tests
> like so:
> 
>     $ ./server -p /export -s -N 100 tlielax
> 
> That makes a directory called "rawhide.test" (since the client's
> hostname is "rawhide") and runs its tests in there. Then I ran this in
> a different shell:
> 
>     $ while true; do ls -l /mnt/tlielax/rawhide.test ; done
> 
> Probably I should run this on a stock kernel just to see if there are
> preexisting problems...

FWIW, I could reproduce that (and I really wonder WTF is going on - looks
like nfs_async_unlink_release() getting lost somehow), but not the memory
corruption with the last commit...  What .config are you using?
