From: Willy Tarreau <w@1wt.eu>
To: Grant Coady <gcoady.lk@gmail.com>
Cc: Marcelo Tosatti <marcelo@kvack.org>,
linux-kernel@vger.kernel.org,
Trond Myklebust <trond.myklebust@fys.uio.no>
Subject: Re: Linux 2.4.33-rc2
Date: Wed, 5 Jul 2006 22:51:37 +0200 [thread overview]
Message-ID: <20060705205137.GA25913@1wt.eu> (raw)
In-Reply-To: <0e6ma2961ro2evtrnacgmla7j52j738q76@4ax.com>
Hi,
On Wed, Jul 05, 2006 at 11:51:35AM +1000, Grant Coady wrote:
> On Tue, 4 Jul 2006 00:07:36 +0200, Willy Tarreau <w@1wt.eu> wrote:
>
> >On Wed, Jun 21, 2006 at 04:27:56PM -0300, Marcelo Tosatti wrote:
> >
> >> Willy Tarreau:
> >> Fix vfs_unlink/NFS NULL pointer dereference
> >
> >Marcelo, I'm not sure this one is perfect yet. Today, while packaging
> >a lot of files for our distro at work, I came up with a problem where
> >deleting a file on NFS, and later simply accessing (read/write/create)
> >a file on the NFS file system did block. However, I could kill all the
> >offending processes. This was after a full day of mkdir/create/open/
> >unlink... (tens of thoudands of those), so it is not much reproduceable.
> >
> >I could not unmount the NFS anymore, while other users had no problem.
> >Rebooting the client solved the problem. I caught an RPC trace (attached),
> >not sure if it can help. I must say that I'm also running Trond's NFS
> >patches which I suspected first, but with which I never encountered a
> >single problem for years.
> >
> >The fact that the problem appeared during an rm -rf made me think about
> >the vfs_unlink() patch. I went to read it again an I'm wondering if we
> >have not inserted a new problem (please forgive my ignorance here) :
> >
> >in 2.4.32, we had the following sequence :
> > down(&dir->i_zombie);
> > if (may_delete(dir, dentry, 0) != 0) return;
> > lock_kernel();
> > error = dir->i_op->unlink(dir, dentry);
> > unlock_kernel();
> > if (!error)
> > d_delete(dentry);
> > up(&dir->i_zombie);
> > if (!error)
> > inode_dir_notify(dir, DN_DELETE);
> >
> >
> >int 2.4.33-rc2, we have :
> > if (may_delete(dir, dentry, 0) != 0) return;
> > inode = dentry->d_inode;
> >
> > atomic_inc(&inode->i_count);
> > double_down(&dir->i_zombie, &inode->i_zombie);
> >
> > lock_kernel();
> > error = dir->i_op->unlink(dir, dentry);
> > unlock_kernel();
> >
> > double_up(&dir->i_zombie, &inode->i_zombie);
> > iput(inode);
> >
> > if (!error) {
> > d_delete(dentry);
> > inode_dir_notify(dir, DN_DELETE);
> > }
> >
> >What I notice is that in 2.4.32, d_delete(dentry) was performed
> >between down(&dir->i_zombie) and up(&dir->i_zombie), while now
> >it's completely outside. I wonder if this can cause race conditions
> >or not, but at least, I'm sure that we have changed the locking
> >sequence, which might have some impact.
> >
> >Do you think I'm searching in the wrong direction ? I worry a
> >bit, because getting a deadlock after only one day, it's a bit
> >early :-/
> >
> Assuming you mean something like the patch below? Doesn't cause any
> problems (yet, still testing) like eat files or segfault here as
> reported for -rc1 +/- various patches ;)
>
> Cheers,
> Grant.
> --- linux-2.4.33-rc2/fs/namei.c 2006-06-22 07:27:47.000000000 +1000
> +++ linux-2.4.33-rc2b/fs/namei.c 2006-07-05 11:43:19.000000000 +1000
> @@ -1497,13 +1497,14 @@
> lock_kernel();
> error = dir->i_op->unlink(dir, dentry);
> unlock_kernel();
> + if (!error)
> + d_delete(dentry);
> }
> }
> double_up(&dir->i_zombie, &inode->i_zombie);
> iput(inode);
>
> if (!error) {
> - d_delete(dentry);
> inode_dir_notify(dir, DN_DELETE);
> }
> return error;
after a full day of stress-test of multiple parallel tar xUf, and ffsb at
full CPU load, I could not reproduce the problem on the exact same kernel
I first saw it on. So I think I had bad luck and the problem is not related
to the vfs_unlink() patch, so unless anyone else reports a problem or tells
us why it is right or wrong, it would seem reasonable to keep it as it is
in -rc2.
Regards,
Willy
next prev parent reply other threads:[~2006-07-05 20:51 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-06-21 19:27 Linux 2.4.33-rc2 Marcelo Tosatti
2006-06-21 23:35 ` Grant Coady
2006-07-03 22:07 ` Willy Tarreau
2006-07-05 1:51 ` Grant Coady
2006-07-05 5:18 ` Willy Tarreau
2006-07-05 20:51 ` Willy Tarreau [this message]
2006-07-06 7:42 ` Grant Coady
2006-07-06 8:25 ` Willy Tarreau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060705205137.GA25913@1wt.eu \
--to=w@1wt.eu \
--cc=gcoady.lk@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=marcelo@kvack.org \
--cc=trond.myklebust@fys.uio.no \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.