From: Ram Pai <linuxram@us.ibm.com>
To: Ian Kent <raven@themaw.net>
Cc: autofs@linux.kernel.org, linux-fsdevel@vger.kernel.org,
William H Taber <wtaber@us.ibm.com>
Subject: Re: [autofs] [RFC PATCH]autofs4: hang and proposed fix
Date: Wed, 16 Nov 2005 17:52:42 -0800 [thread overview]
Message-ID: <1132192362.5720.163.camel@localhost> (raw)
In-Reply-To: <Pine.LNX.4.63.0511170627150.3049@donald.themaw.net>
On Wed, 2005-11-16 at 14:57, Ian Kent wrote:
> On Wed, 16 Nov 2005, Ram Pai wrote:
>
> > On Wed, 2005-11-16 at 04:41, Ian Kent wrote:
> > > On Wed, 16 Nov 2005, Ram Pai wrote:
> > >
> > > Thanks for you effort Ram.
> > >
> > > > Autofs4 assumes that its ->revalidate() function gets called with the
> > > > parent_dentry's_inode_semaphore released. This is true mostly
> > > > but not in one particular case.
> > >
> > > Yep. Certainly does.
> > >
> > > Isn't my mistake not noticing that the inode semaphore is taken in
> > > vfs_readdir?
> > >
> > > It's been like that all along and I can't understand how I didn't notice
> > > it before.
> > >
> > > Help me out a bit here please Ram.
> > >
> > > Aren't there other paths that enter revalidate without holding the
> > > semaphore? chdir?
> >
> > Looking at the code and it seemed to me that ->revalidate() function is
> > always with the semaphore held. Atleast VFS seem to have that
> > assumption.
>
> My reading gives me the opposite impression.
>
> The chdir example above calls the path walking routine which calls the
> lookup method with the semaphore held and revalidate without it held.
> Clearly, there are a number of other examples.
>
> Certainly, I could be wrong and I'll be checking that.
I see your point. Looking more through the code it looks like
the convention about how ->revalidate() gets called, seems to be
inconsistent in VFS.
in do_lookup() which calls ->revalidate(), the semaphore is not-held.
Where as lookup_one_len() is expected to be called with the semaphore
held. This function calls lookup_hash() which calls cached_lookup()
which later calls ->revalidate(), and here ->revalidate() is called with
the semaphore held. Is this the source of the bug?
> >
> > And looking at the autofs4 code, I get the impression, that it
> > assumes that the semaphore is released when it gets called. Which seems
> > to be inconsistent and wrong.
>
> It does but I thought that was VFS design and I'm willing to be corrected
> if I'm wrong.
>
> Thinking about it and looking at the stack trace I'm having a bit of
> trouble working out why there is a mount wait being triggered here at all.
>
> I think this is a getdents call so then there should have been an open
> which would have done the mount wait, followed by the getdents call itself
> and finally a close. I see this sequence all the time when I'm using debug
> to log activity and it's also evident from the corresponding functions in
> fs/libfs.c.
In this case there is no explicit open on a autofs4 directory. The
readdir is taking place on a directory belonging to the stubfs
filesystem. Internally stubfs filesystem is trying to open the
automounter's dentry through the
lookup_one_len() call. And this triggers the automouter into action.
>
> What I can't work out is how getdents appears to be called without having
> called open. Is there anything more that you can tell me about how you
> have been able to demonstrate this error.
>
Maybe you are missing the stubfs part. The stubfs is kind of
in-the-middle filesystem which sits between the application and
the autofs4.
Will Taber: Am I saying this right?
Take a look at the test patch for stubfs posted at
http://www.sudhaa.com/~ram/readahead/stubfs.patch
For clarity here is the scenario:
P1 executes 'ls' on a directory belonging to stubfs.
stubfs's ->lookup() gets
called and it internally redirects that lookup to autofs4
by calling lookup_one_len() on /net/ram
note: /net belongs to autofs4 and lookup_one_len() is
called holding the inode-semaphore of /net .
lookup_one_len() calls lookup_hash() which finds that there
is no cached dentry for 'ram', and hence allocates a dentry
and calls ->lookup() of autofs4.
autofs4 adds the dentry to the dcache and calls its
->revalidate() after releasing the semaphore.
->revalidate tries to wake up the automounter daemon, and
goes to sleep on a waitq.
P2 executes 'ls' on another directory belonging to stubfs.
stubfs's ->lookup()
gets called and it internally redirects that lookup to autofs4
by calling lookup_one_len() on /net/ram. lookup_one_len() is
called holding the inode-semaphore of /net.
lookup_one_len() calls lookup_hash() which calls
cached_lookup(). cached_lookup() finds the dentry
corresponding to 'ram' in the dcache. So it calls
->revalidate() on it. NOTE: this time autofs4's
->revalidate() is called holding the semaphore.
->revalidate() goes to sleep on the same waitq
waiting on the automounter to wake him up.
automouter: the automounter now comes in and tries to hold
the semaphore on /net and deadlocks.
The question is: Who is the culprit? stubfs? VFS? or
autofs4?
RP
> >
> > >
> > > Does uping an open semaphore allow other undesirable side affects?
> > >
> > > Do you think it would perhaps be better to release the semaphore in
> > > autofs4_readdir ... hang on the stack trace doesn't look like a readdir
> > > ... I'll have to check 2.6.15-rc1 ... ?
> > >
> >
> > One automounter process is in sys_mkdir() system call and the other if I
> > recollect correctly was in sys_getdent64() system call.
> >
> > Yes this problem can be demonstrated on all versions of the 2.6 kernel.
> > Infact I reproduced it on 2.6.15-rc1 kernel.
> >
> > > Apart from the above, looking at the patch and assuming that the semaphore
> > > is always held it would probaby be better to move semaphore open/close
> > > into try_to_fill_dentry as any control process using autofs, such as automount,
> > > must never cause a mount wait to be called (oz_mode = 1).
> >
> >
> >
> > >
> > > Ideas?
> >
> > One thing is sure, VFS assumes that the semaphore is held when
> > ->revalidate() is called, and that convention is followed religiously by
> > all filesytems. autofs4 has this special case of releasing the
> > semaphore if it is waiting to be woken up by the automounter daemon. So
> > as you said, may be the semaphore must be released just before sleeping
> > on the waitq?
> >
> > RP
> >
> > >
> > > >
> > > > Process P1 calls autofs4's ->lookup(). The lookup finds that the dentry
> > > > does not exist. It creates a dentry and adds to the cache. Releases
> > > > the parent's inode's semaphore and than calls ->revalidate().
> > > >
> > > > Process P2 meanwhile comes in and cached_lookup() gets called. It finds
> > > > the dentry in the cache and finds ->revalidate() function exists. So
> > > > it calls ->revalidate() holding the parent's inode's semaphore.
> > > >
> > > > Now the automounter daemon comes in and tries to hold the same semaphore
> > > > in order to mount. But since the semaphore is held by P2 it
> > > > goes to sleep.
> > > >
> > > > Process P1 and P2 continue waiting for the mount to complete and it never
> > > > happens. Deadlock.
> > > >
> > > > The stack of the deadlock is as follows:
> > > >
> > > > ls S 00000000 0 13049 11954 (NOTLB)
> > > > f5221df0 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > > > 00000000 f5d44a70 c721b520 00000000 d4f33800 003d0990 c721b9d8 f5d44030
> > > > f5d44164 f5220000 f5221e3c f3dd6880 f5221e68 c0215207 f3b95580 80000000
> > > > Call Trace:
> > > > [<c0215207>] autofs4_wait+0x307/0x3d0
> > > > [<c02141d3>] try_to_fill_dentry+0xf3/0x150
> > > > [<c0214389>] autofs4_revalidate+0x159/0x170
> > > > [<c02144e0>] autofs4_lookup+0x110/0x150
> > > > [<c016f3f5>] __lookup_hash+0x85/0xb0
> > > > [<c016f42a>] lookup_hash+0xa/0x10
> > > > [<c016f483>] lookup_one_len+0x53/0x70
> > > > [<f8851293>] stubfs_readdir+0x113/0x170 [stubfs]
> > > > [<c0172fcb>] vfs_readdir+0x8b/0xa0
> > > > [<c01733b3>] sys_getdents64+0x63/0xb5
> > > > [<c010464d>] syscall_call+0x7/0xb
> > > >
> > > > ls S C011B1AF 0 13050 11898 (NOTLB)
> > > > f1337df0 00000082 f1337e04 c011b1af 06ce3f60 00000027 00000027 00000080
> > > > 06d03f60 00000000 c721b520 00000000 d4f33800 003d0990 f1337df0 f5d44a70
> > > > f5d44ba4 f1336000 f1337e3c f3dd6880 f1337e68 c0215207 f3b95580 80000000
> > > > Call Trace:
> > > > [<c0215207>] autofs4_wait+0x307/0x3d0
> > > > [<c02141d3>] try_to_fill_dentry+0xf3/0x150
> > > > [<c0214389>] autofs4_revalidate+0x159/0x170
> > > > [<c016dc77>] cached_lookup+0x47/0x80
> > > > [<c016f3ca>] __lookup_hash+0x5a/0xb0
> > > > [<c016f42a>] lookup_hash+0xa/0x10
> > > > [<c016f483>] lookup_one_len+0x53/0x70
> > > > [<f88512e3>] stubfs_readdir+0x163/0x170 [stubfs]
> > > > [<c0172fcb>] vfs_readdir+0x8b/0xa0
> > > > [<c01733b3>] sys_getdents64+0x63/0xb5
> > > > [<c010464d>] syscall_call+0x7/0xb
> > > >
> > > > automount D 00000010 0 13052 13016 (NOTLB)
> > > > f3321f00 fff80000 00000007 00000010 f3321f68 c7b1cd20 00000000 f3321f34
> > > > f3321ee8 f5e92a70 c7233520 00000000 d5304100 003d0990 c7233560 f1e31a70
> > > > f1e31ba4 f5f59914 f5f5991c 00000296 f3321f38 c03b4cd3 f1e31a70 00000001
> > > > Call Trace:
> > > > [<c03b4cd3>] __down+0x83/0xe0
> > > > [<c03b3632>] __down_failed+0xa/0x10
> > > > [<c0171e6d>] .text.lock.namei+0xeb/0x1de
> > > > [<c0170482>] sys_mkdir+0x52/0xd0
> > > > [<c010464d>] syscall_call+0x7/0xb
> > > > BUG: soft lockup detected on CPU#0!
> > > >
> > > >
> > > > I have coded up a tentative fix. The patch releases the semaphore in
> > > > ->revalidate() function, instead of the caller of that function. Not
> > > > sure if this is the right fix. Tested it and verified that the deadlock
> > > > is fixed. But I am not sure if it opens up other bugs. Please validate.
> > > >
> > > >
> > > > fs/autofs4/root.c | 26 +++++++++++++++-----------
> > > > 1 files changed, 15 insertions(+), 11 deletions(-)
> > > >
> > > > Index: 2.6.15-rc1/fs/autofs4/root.c
> > > > ===================================================================
> > > > --- 2.6.15-rc1.orig/fs/autofs4/root.c
> > > > +++ 2.6.15-rc1/fs/autofs4/root.c
> > > > @@ -386,40 +386,47 @@ static int autofs4_revalidate(struct den
> > > > struct autofs_sb_info *sbi = autofs4_sbi(dir->i_sb);
> > > > int oz_mode = autofs4_oz_mode(sbi);
> > > > int flags = nd ? nd->flags : 0;
> > > > int status = 1;
> > > >
> > > > + up(&dir->i_sem);
> > > > /* Pending dentry */
> > > > if (autofs4_ispending(dentry)) {
> > > > if (!oz_mode)
> > > > - status = try_to_fill_dentry(dentry, dir->i_sb, sbi, flags);
> > > > - return status;
> > > > + status = try_to_fill_dentry(dentry, dir->i_sb,
> > > > + sbi, flags);
> > > > + goto out;
> > > > }
> > > >
> > > > /* Negative dentry.. invalidate if "old" */
> > > > - if (dentry->d_inode == NULL)
> > > > - return (dentry->d_time - jiffies <= AUTOFS_NEGATIVE_TIMEOUT);
> > > > + if (dentry->d_inode == NULL) {
> > > > + status = (dentry->d_time - jiffies <= AUTOFS_NEGATIVE_TIMEOUT);
> > > > + goto out;
> > > > + }
> > > >
> > > > /* Check for a non-mountpoint directory with no contents */
> > > > spin_lock(&dcache_lock);
> > > > if (S_ISDIR(dentry->d_inode->i_mode) &&
> > > > !d_mountpoint(dentry) &&
> > > > list_empty(&dentry->d_subdirs)) {
> > > > DPRINTK("dentry=%p %.*s, emptydir",
> > > > dentry, dentry->d_name.len, dentry->d_name.name);
> > > > spin_unlock(&dcache_lock);
> > > > if (!oz_mode)
> > > > - status = try_to_fill_dentry(dentry, dir->i_sb, sbi, flags);
> > > > - return status;
> > > > + status = try_to_fill_dentry(dentry, dir->i_sb, sbi,
> > > > + flags);
> > > > + goto out;
> > > > }
> > > > spin_unlock(&dcache_lock);
> > > >
> > > > /* Update the usage list */
> > > > if (!oz_mode)
> > > > autofs4_update_usage(dentry);
> > > >
> > > > - return 1;
> > > > +out:
> > > > + down(&dir->i_sem);
> > > > + return status;
> > > > }
> > > >
> > > > static void autofs4_dentry_release(struct dentry *de)
> > > > {
> > > > struct autofs_info *inf;
> > > > @@ -485,15 +492,12 @@ static struct dentry *autofs4_lookup(str
> > > > spin_unlock(&dentry->d_lock);
> > > > }
> > > > dentry->d_fsdata = NULL;
> > > > d_add(dentry, NULL);
> > > >
> > > > - if (dentry->d_op && dentry->d_op->d_revalidate) {
> > > > - up(&dir->i_sem);
> > > > + if (dentry->d_op && dentry->d_op->d_revalidate)
> > > > (dentry->d_op->d_revalidate)(dentry, nd);
> > > > - down(&dir->i_sem);
> > > > - }
> > > >
> > > > /*
> > > > * If we are still pending, check if we had to handle
> > > > * a signal. If so we can force a restart..
> > > > */
> > > >
> > > > _______________________________________________
> > > > autofs mailing list
> > > > autofs@linux.kernel.org
> > > > http://linux.kernel.org/mailman/listinfo/autofs
> > > >
> > >
> > > -
> > > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
>
next prev parent reply other threads:[~2005-11-17 1:52 UTC|newest]
Thread overview: 92+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-11-16 10:17 [RFC PATCH]autofs4: hang and proposed fix Ram Pai
2005-11-16 12:41 ` [autofs] " Ian Kent
2005-11-16 16:50 ` Ram Pai
2005-11-16 22:57 ` Ian Kent
2005-11-17 1:52 ` Ram Pai [this message]
2005-11-17 18:50 ` [autofs] " Ian Kent
2005-11-17 19:19 ` William H. Taber
2005-11-17 20:39 ` Ram Pai
2005-11-17 22:31 ` William H. Taber
2005-11-18 14:57 ` Ian Kent
2005-11-18 14:54 ` Ian Kent
2005-11-18 14:44 ` Ian Kent
2005-11-18 15:20 ` William H. Taber
2005-11-18 16:30 ` Ian Kent
2005-11-18 17:12 ` William H. Taber
2005-11-18 18:57 ` Ram Pai
2005-11-18 20:08 ` William H. Taber
2005-11-19 2:52 ` Ian Kent
2005-11-21 16:40 ` William H. Taber
2005-11-22 13:13 ` Ian Kent
2005-11-22 17:48 ` [autofs] " William H. Taber
2005-11-23 14:11 ` Ian Kent
2005-11-23 16:42 ` William H. Taber
2005-11-23 17:52 ` Ian Kent
2005-11-23 18:47 ` William H. Taber
2005-11-19 1:40 ` Ian Kent
2005-11-16 15:22 ` Jeff Moyer
2005-11-16 17:00 ` [autofs] " Ram Pai
2005-11-16 18:25 ` Jeff Moyer
2005-11-16 19:24 ` William H. Taber
2005-11-16 19:51 ` Ram Pai
2005-11-27 10:47 ` Ian Kent
2005-11-28 17:19 ` William H. Taber
2005-11-28 23:12 ` Badari Pulavarty
2005-11-29 14:19 ` Ian Kent
2005-11-29 16:34 ` William H. Taber
2005-11-30 14:02 ` Ian Kent
2005-11-30 16:49 ` Badari Pulavarty
2005-11-30 17:04 ` Trond Myklebust
2005-11-30 21:10 ` William H. Taber
2005-11-29 14:20 ` Ian Kent
2005-11-30 1:16 ` [autofs] " Jeff Moyer
2005-11-30 1:56 ` Trond Myklebust
2005-11-30 4:15 ` Jeff Moyer
2005-11-30 6:14 ` Trond Myklebust
2005-11-30 15:44 ` Ian Kent
2005-11-30 15:53 ` [autofs] " Trond Myklebust
2005-11-30 16:12 ` Ian Kent
2005-11-30 16:27 ` Ian Kent
2005-11-30 16:45 ` [autofs] " Trond Myklebust
2005-11-30 20:32 ` William H. Taber
2005-11-30 20:53 ` Trond Myklebust
2005-11-30 21:30 ` William H. Taber
2005-11-30 22:32 ` Trond Myklebust
2005-12-01 16:27 ` William H. Taber
2005-12-01 12:09 ` Ian Kent
2005-12-01 16:30 ` William H. Taber
2005-12-02 13:49 ` Ian Kent
2005-12-02 14:07 ` Jeff Moyer
2005-12-02 15:21 ` Ian Kent
2005-12-02 16:35 ` [autofs] " Will Taber
2005-12-02 17:11 ` Ian Kent
2005-12-02 15:34 ` Will Taber
2005-12-02 17:29 ` Ian Kent
2005-12-02 18:12 ` Trond Myklebust
2005-12-04 12:56 ` Christoph Hellwig
2005-12-04 12:57 ` Christoph Hellwig
2005-12-04 14:58 ` Ian Kent
2005-12-04 17:17 ` [autofs] " Christoph Hellwig
2005-12-05 14:02 ` Ian Kent
2005-12-06 21:20 ` Jeff Moyer
2005-12-06 21:40 ` Christoph Hellwig
2005-12-06 22:37 ` Jeff Moyer
2005-12-07 14:52 ` Will Taber
2005-12-07 15:18 ` Christoph Hellwig
2005-12-07 15:22 ` Brian Long
2005-12-07 15:25 ` Christoph Hellwig
2005-12-07 17:46 ` Will Taber
2005-12-08 14:16 ` Ian Kent
2005-12-09 12:12 ` Christoph Hellwig
2005-12-09 13:33 ` John T. Kohl
2005-12-13 18:39 ` Christoph Hellwig
2005-12-04 14:56 ` Ian Kent
2005-12-02 19:04 ` [autofs] " Will Taber
2005-12-04 9:39 ` Ian Kent
2005-12-02 16:04 ` [autofs] " Jeff Moyer
2005-12-02 17:36 ` Ian Kent
2005-12-02 18:33 ` [autofs] " Will Taber
2005-12-04 9:52 ` Ian Kent
2005-12-04 14:54 ` Ian Kent
2005-12-05 15:40 ` Ian Kent
2005-11-30 14:48 ` [autofs] " Ian Kent
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1132192362.5720.163.camel@localhost \
--to=linuxram@us.ibm.com \
--cc=autofs@linux.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=raven@themaw.net \
--cc=wtaber@us.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).