* nfsd stuckage
@ 2009-01-06 22:56 Andrew Morton
2009-01-06 23:02 ` J. Bruce Fields
2009-01-08 14:57 ` Peter Zijlstra
0 siblings, 2 replies; 12+ messages in thread
From: Andrew Morton @ 2009-01-06 22:56 UTC
To: linux-nfs; +Cc: linux-kernel, Neil Brown, J. Bruce Fields
I just built current mainline plus the just-sent 266 -mm patches.
The machine failed to power off when hit with `halt -pfn'. dmesg output:
[ 37.087037] calling rfcomm_init+0x0/0xb6 [rfcomm] @ 3946
[ 37.087294] initcall rfcomm_init+0x0/0xb6 [rfcomm] returned 0 after 72 usecs
[ 37.505855] calling hidp_init+0x0/0x5e [hidp] @ 4046
[ 37.506072] initcall hidp_init+0x0/0x5e [hidp] returned 0 after 28 usecs
[ 37.636638] calling init_autofs4_fs+0x0/0x23 [autofs4] @ 4081
[ 37.636990] initcall init_autofs4_fs+0x0/0x23 [autofs4] returned 0 after 54 usecs
[ 39.630075] calling init_nlm+0x0/0x22 [lockd] @ 4264
[ 39.630321] initcall init_nlm+0x0/0x22 [lockd] returned 0 after 59 usecs
[ 39.690077] calling init_rpcsec_gss+0x0/0x4a [auth_rpcgss] @ 4264
[ 39.690281] initcall init_rpcsec_gss+0x0/0x4a [auth_rpcgss] returned 0 after 12 usecs
[ 39.834034] calling init_nfsd+0x0/0xe2 [nfsd] @ 4302
[ 39.834471] initcall init_nfsd+0x0/0xe2 [nfsd] returned 0 after 236 usecs
[ 39.924213] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 672.162677] INFO: task nfsd4:4324 blocked for more than 480 seconds.
[ 672.162706] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 672.162725] ffff880251df1d60 0000000000000046 ffff88025e1c0580 ffff8802488013d8
[ 672.162753] ffff880251df1d20 ffff88024c49a7a0 ffff88025e088760 ffff88024c49ab18
[ 672.162834] 000000002807ee00 00000000ffff59ec ffff880251df1d50 0000000000000282
[ 672.162865] Call Trace:
[ 672.162880] [<ffffffff8052a1d8>] __mutex_lock_slowpath+0x6a/0xac
[ 672.162895] [<ffffffff8052a0c5>] mutex_lock+0x2c/0x30
[ 672.162908] [<ffffffff802c4747>] vfs_fsync+0x63/0xa9
[ 672.162933] [<ffffffffa048c74c>] nfsd_sync_dir+0x10/0x12 [nfsd]
[ 672.162960] [<ffffffffa04a5427>] nfsd4_sync_rec_dir+0x27/0x40 [nfsd]
[ 672.162984] [<ffffffffa04a592a>] nfsd4_recdir_purge_old+0x3d/0x6a [nfsd]
[ 672.163023] [<ffffffffa04a1745>] laundromat_main+0x62/0x225 [nfsd]
[ 672.163049] [<ffffffffa04a16e3>] ? laundromat_main+0x0/0x225 [nfsd]
[ 672.163064] [<ffffffff8024b4a7>] run_workqueue+0x8d/0x124
[ 672.163076] [<ffffffff8024b5e0>] ? worker_thread+0x0/0xe5
[ 672.163089] [<ffffffff8024b6b8>] worker_thread+0xd8/0xe5
[ 672.163102] [<ffffffff8024e8cc>] ? autoremove_wake_function+0x0/0x36
[ 672.163115] [<ffffffff8024b5e0>] ? worker_thread+0x0/0xe5
[ 672.163127] [<ffffffff8024e5e0>] kthread+0x44/0x6b
[ 672.163140] [<ffffffff8020cfba>] child_rip+0xa/0x20
[ 672.163151] [<ffffffff8024e59c>] ? kthread+0x0/0x6b
[ 672.163162] [<ffffffff8020cfb0>] ? child_rip+0x0/0x20
[ 1204.739381] INFO: task nfsd4:4324 blocked for more than 480 seconds.
[ 1204.739415] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1204.739436] ffff880251df1d60 0000000000000046 ffff88025e1c0580 ffff8802488013d8
[ 1204.739523] ffff880251df1d20 ffff88024c49a7a0 ffff88025e088760 ffff88024c49ab18
[ 1204.739551] 000000002807ee00 00000000ffff59ec ffff880251df1d50 0000000000000282
[ 1204.739579] Call Trace:
This didn't happen in linux-next a week or so ago.
* Re: nfsd stuckage
From: J. Bruce Fields @ 2009-01-06 23:02 UTC
To: Andrew Morton; +Cc: linux-nfs, linux-kernel, Neil Brown, Eric Sesterhenn, hch
On Tue, Jan 06, 2009 at 02:56:12PM -0800, Andrew Morton wrote:
>
> I just built current mainline plus the just-sent 266 -mm patches.
>
> The machine failed to power off when hit with `halt -pfn'. dmesg output:
Christoph, can you live with this for now?
--b.
commit 33e3950dc2eae7484e79685083c304d93013e3ec
Author: J. Bruce Fields <bfields@citi.umich.edu>
Date: Tue Jan 6 13:37:03 2009 -0500
nfsd: fix double-locks of directory mutex
A number of nfsd operations depend on the i_mutex to cover more code
than just the fsync, so the approach of 4c728ef583b3d8 "add a vfs_fsync
helper" doesn't work for nfsd. Revert the parts of those patches that
touch nfsd, and remove the logic from vfs_fsync that was needed only for
the special case of nfsd.
Reported-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 44aa92a..6e50aaa 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -744,16 +744,44 @@ nfsd_close(struct file *filp)
fput(filp);
}
+/*
+ * Sync a file
+ * As this calls fsync (not fdatasync) there is no need for a write_inode
+ * after it.
+ */
+static inline int nfsd_dosync(struct file *filp, struct dentry *dp,
+ const struct file_operations *fop)
+{
+ struct inode *inode = dp->d_inode;
+ int (*fsync) (struct file *, struct dentry *, int);
+ int err;
+
+ err = filemap_fdatawrite(inode->i_mapping);
+ if (err == 0 && fop && (fsync = fop->fsync))
+ err = fsync(filp, dp, 0);
+ if (err == 0)
+ err = filemap_fdatawait(inode->i_mapping);
+
+ return err;
+}
+
static int
nfsd_sync(struct file *filp)
{
- return vfs_fsync(filp, filp->f_path.dentry, 0);
+ int err;
+ struct inode *inode = filp->f_path.dentry->d_inode;
+ dprintk("nfsd: sync file %s\n", filp->f_path.dentry->d_name.name);
+ mutex_lock(&inode->i_mutex);
+ err=nfsd_dosync(filp, filp->f_path.dentry, filp->f_op);
+ mutex_unlock(&inode->i_mutex);
+
+ return err;
}
int
-nfsd_sync_dir(struct dentry *dentry)
+nfsd_sync_dir(struct dentry *dp)
{
- return vfs_fsync(NULL, dentry, 0);
+ return nfsd_dosync(NULL, dp, dp->d_inode->i_fop);
}
/*
diff --git a/fs/sync.c b/fs/sync.c
index 0921d6d..8e0a656 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -83,10 +83,6 @@ int file_fsync(struct file *filp, struct dentry *dentry, int datasync)
*
* Write back data and metadata for @file to disk. If @datasync is
* set only metadata needed to access modified file data is written.
- *
- * In case this function is called from nfsd @file may be %NULL and
- * only @dentry is set. This can only happen when the filesystem
- * implements the export_operations API.
*/
int vfs_fsync(struct file *file, struct dentry *dentry, int datasync)
{
@@ -94,18 +90,8 @@ int vfs_fsync(struct file *file, struct dentry *dentry, int datasync)
struct address_space *mapping;
int err, ret;
- /*
- * Get mapping and operations from the file in case we have
- * as file, or get the default values for them in case we
- * don't have a struct file available. Damn nfsd..
- */
- if (file) {
- mapping = file->f_mapping;
- fop = file->f_op;
- } else {
- mapping = dentry->d_inode->i_mapping;
- fop = dentry->d_inode->i_fop;
- }
+ mapping = file->f_mapping;
+ fop = file->f_op;
if (!fop || !fop->fsync) {
ret = -EINVAL;
* Re: nfsd stuckage
From: Christoph Hellwig @ 2009-01-06 23:03 UTC
To: J. Bruce Fields
Cc: Andrew Morton, linux-nfs, linux-kernel, Neil Brown,
Eric Sesterhenn, hch
On Tue, Jan 06, 2009 at 06:02:44PM -0500, J. Bruce Fields wrote:
> On Tue, Jan 06, 2009 at 02:56:12PM -0800, Andrew Morton wrote:
> >
> > I just built current mainline plus the just-sent 266 -mm patches.
> >
> > The machine failed to power off when hit with `halt -pfn'. dmesg output:
>
> Christoph, can you live with this for now?
The nfsd part is, well, livable. But the fs/sync.c hunk is buggy: stackable
filesystems can call vfs_fsync with a NULL file pointer because nfsd calls
them that way, so please drop that hunk.
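The NULL-file case described here can be sketched in user space. This is a reduced model, not kernel code: the mock_* structures, pick_sync_target() and mock_vfs_fsync() are hypothetical names invented for illustration, and the fsync method takes no arguments for brevity. Only the file-or-dentry fallback mirrors the logic that the fs/sync.c hunk would have removed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct mock_fops {
	int (*fsync)(void);
};

struct mock_inode {
	const struct mock_fops *i_fop;
	int i_mapping;		/* an id standing in for the address_space */
};

struct mock_dentry {
	struct mock_inode *d_inode;
};

struct mock_file {
	const struct mock_fops *f_op;
	int f_mapping;
};

struct sync_target {
	const struct mock_fops *fop;
	int mapping;
};

/* Trivial fsync method used to populate a mock_fops. */
static int mock_fsync_ok(void)
{
	return 0;
}

/* Take ops and mapping from the file if there is one,
 * otherwise fall back to the dentry's inode. */
static struct sync_target pick_sync_target(const struct mock_file *file,
					   const struct mock_dentry *dentry)
{
	struct sync_target t;

	assert(dentry != NULL && dentry->d_inode != NULL);
	if (file) {
		t.fop = file->f_op;
		t.mapping = file->f_mapping;
	} else {
		t.fop = dentry->d_inode->i_fop;
		t.mapping = dentry->d_inode->i_mapping;
	}
	return t;
}

/* Without the fallback above, a NULL file would be dereferenced here. */
static int mock_vfs_fsync(const struct mock_file *file,
			  const struct mock_dentry *dentry)
{
	struct sync_target t = pick_sync_target(file, dentry);

	if (!t.fop || !t.fop->fsync)
		return -EINVAL;
	return t.fop->fsync();
}
```

Calling mock_vfs_fsync(NULL, &dentry) succeeds only because of the else branch. Dropping that branch while any caller, direct or through a stacked filesystem, still passes NULL turns the call into a NULL dereference.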
* Re: nfsd stuckage
From: J. Bruce Fields @ 2009-01-06 23:05 UTC
To: Christoph Hellwig
Cc: Andrew Morton, linux-nfs, linux-kernel, Neil Brown,
Eric Sesterhenn
On Wed, Jan 07, 2009 at 12:03:56AM +0100, Christoph Hellwig wrote:
> On Tue, Jan 06, 2009 at 06:02:44PM -0500, J. Bruce Fields wrote:
> > On Tue, Jan 06, 2009 at 02:56:12PM -0800, Andrew Morton wrote:
> > >
> > > I just built current mainline plus the just-sent 266 -mm patches.
> > >
> > > The machine failed to power off when hit with `halt -pfn'. dmesg output:
> >
> > Christoph, can you live with this for now?
>
> The nfsd part is, well, livable. But the fs/sync.c hunk is buggy: stackable
> filesystems can call vfs_fsync with a NULL file pointer because nfsd calls
> them that way, so please drop that hunk.
Whoops, OK, I didn't understand that. I'll drop that hunk, retest, then
submit--should take a few minutes.
--b.
* Re: nfsd stuckage
From: J. Bruce Fields @ 2009-01-07 0:15 UTC
To: Christoph Hellwig
Cc: Andrew Morton, linux-nfs, linux-kernel, Neil Brown,
Eric Sesterhenn
On Tue, Jan 06, 2009 at 06:05:51PM -0500, bfields wrote:
> On Wed, Jan 07, 2009 at 12:03:56AM +0100, Christoph Hellwig wrote:
> > On Tue, Jan 06, 2009 at 06:02:44PM -0500, J. Bruce Fields wrote:
> > > On Tue, Jan 06, 2009 at 02:56:12PM -0800, Andrew Morton wrote:
> > > >
> > > > I just built current mainline plus the just-sent 266 -mm patches.
> > > >
> > > > The machine failed to power off when hit with `halt -pfn'. dmesg output:
> > >
> > > Christoph, can you live with this for now?
> >
> > The nfsd part is, well, livable. But the fs/sync.c hunk is buggy: stackable
> > filesystems can call vfs_fsync with a NULL file pointer because nfsd calls
> > them that way, so please drop that hunk.
>
> Whoops, OK, I didn't understand that. I'll drop that hunk, retest, then
> submit--should take a few minutes.
So this works for me--and I guess I may as well submit it in a pull
request. But: it sounds like we still have a regression for ecryptfs?
(Since on ecryptfs export nfsd will still try to get the mutex twice on
create, unlink, etc.) Maybe we should just revert
4c728ef583b3d82266584da5cb068294c09df31e entirely for now?
--b.
commit 3fbc5c762bd9f9ff52fe7b5b09398a3cff0e8415
Author: J. Bruce Fields <bfields@citi.umich.edu>
Date: Tue Jan 6 13:37:03 2009 -0500
nfsd: fix double-locks of directory mutex
A number of nfsd operations depend on the i_mutex to cover more code
than just the fsync, so the approach of 4c728ef583b3d8 "add a vfs_fsync
helper" doesn't work for nfsd. Revert the parts of those patches that
touch nfsd.
Note: we can't, however, remove the logic from vfs_fsync that was needed
only for the special case of nfsd, because a vfs_fsync(NULL,...) call
can still result indirectly from a stackable filesystem that was called
by nfsd.
Reported-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 44aa92a..6e50aaa 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -744,16 +744,44 @@ nfsd_close(struct file *filp)
fput(filp);
}
+/*
+ * Sync a file
+ * As this calls fsync (not fdatasync) there is no need for a write_inode
+ * after it.
+ */
+static inline int nfsd_dosync(struct file *filp, struct dentry *dp,
+ const struct file_operations *fop)
+{
+ struct inode *inode = dp->d_inode;
+ int (*fsync) (struct file *, struct dentry *, int);
+ int err;
+
+ err = filemap_fdatawrite(inode->i_mapping);
+ if (err == 0 && fop && (fsync = fop->fsync))
+ err = fsync(filp, dp, 0);
+ if (err == 0)
+ err = filemap_fdatawait(inode->i_mapping);
+
+ return err;
+}
+
static int
nfsd_sync(struct file *filp)
{
- return vfs_fsync(filp, filp->f_path.dentry, 0);
+ int err;
+ struct inode *inode = filp->f_path.dentry->d_inode;
+ dprintk("nfsd: sync file %s\n", filp->f_path.dentry->d_name.name);
+ mutex_lock(&inode->i_mutex);
+ err=nfsd_dosync(filp, filp->f_path.dentry, filp->f_op);
+ mutex_unlock(&inode->i_mutex);
+
+ return err;
}
int
-nfsd_sync_dir(struct dentry *dentry)
+nfsd_sync_dir(struct dentry *dp)
{
- return vfs_fsync(NULL, dentry, 0);
+ return nfsd_dosync(NULL, dp, dp->d_inode->i_fop);
}
/*
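The locking split that this patch restores can be modelled in user space. The sketch below is an illustration under stated assumptions, not the nfsd code: every toy_* name is hypothetical, the mutex is a single-threaded stand-in that reports a relock instead of hanging, and the sync work is reduced to a counter. It shows why a caller that already holds i_mutex has to use the unlocked worker rather than a helper that takes the lock itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive mutex: records whether it is held so that a second
 * acquisition is reported instead of blocking forever. */
struct toy_mutex {
	bool held;
};

/* Returns 0 on success, -1 where a real mutex would self-deadlock. */
static int toy_lock(struct toy_mutex *m)
{
	if (m->held)
		return -1;
	m->held = true;
	return 0;
}

static void toy_unlock(struct toy_mutex *m)
{
	assert(m->held);
	m->held = false;
}

/* Hypothetical stand-in for an inode. */
struct toy_inode {
	struct toy_mutex i_mutex;
	int syncs;		/* counts completed sync operations */
};

/* Unlocked worker, in the role of nfsd_dosync: caller holds i_mutex. */
static int toy_dosync(struct toy_inode *inode)
{
	assert(inode->i_mutex.held);
	inode->syncs++;
	return 0;
}

/* Locking wrapper, in the role of nfsd_sync: takes i_mutex itself. */
static int toy_sync(struct toy_inode *inode)
{
	int err = toy_lock(&inode->i_mutex);

	if (err)
		return err;
	err = toy_dosync(inode);
	toy_unlock(&inode->i_mutex);
	return err;
}

/* Caller that already holds i_mutex, in the role of nfsd4_sync_rec_dir.
 * use_locking_helper selects the relocking path or the unlocked worker. */
static int toy_sync_rec_dir(struct toy_inode *inode, bool use_locking_helper)
{
	int err = toy_lock(&inode->i_mutex);

	if (err)
		return err;
	err = use_locking_helper ? toy_sync(inode) : toy_dosync(inode);
	toy_unlock(&inode->i_mutex);
	return err;
}
```

With use_locking_helper set, the inner toy_lock() fails, which corresponds to the point where the nfsd4 thread blocked in mutex_lock(). With it clear, the worker runs under the lock the caller already holds.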
* Re: nfsd stuckage
From: Andrew Morton @ 2009-01-07 0:23 UTC
To: J. Bruce Fields; +Cc: hch, linux-nfs, linux-kernel, neilb, snakebyte
On Tue, 6 Jan 2009 19:15:01 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:
> nfsd: fix double-locks of directory mutex
grumble.
>
> +/*
> + * Sync a file
> + * As this calls fsync (not fdatasync) there is no need for a write_inode
> + * after it.
> + */
> +static inline int nfsd_dosync(struct file *filp, struct dentry *dp,
> + const struct file_operations *fop)
> +{
> + struct inode *inode = dp->d_inode;
> + int (*fsync) (struct file *, struct dentry *, int);
> + int err;
> +
> + err = filemap_fdatawrite(inode->i_mapping);
> + if (err == 0 && fop && (fsync = fop->fsync))
> + err = fsync(filp, dp, 0);
> + if (err == 0)
> + err = filemap_fdatawait(inode->i_mapping);
> +
> + return err;
> +}
This function is HUGE! And hardly a fastpath.
> static int
> nfsd_sync(struct file *filp)
> {
> - return vfs_fsync(filp, filp->f_path.dentry, 0);
> + int err;
> + struct inode *inode = filp->f_path.dentry->d_inode;
> + dprintk("nfsd: sync file %s\n", filp->f_path.dentry->d_name.name);
> + mutex_lock(&inode->i_mutex);
> + err=nfsd_dosync(filp, filp->f_path.dentry, filp->f_op);
(checkpatch?)
> + mutex_unlock(&inode->i_mutex);
> +
> + return err;
> }
>
> int
> -nfsd_sync_dir(struct dentry *dentry)
> +nfsd_sync_dir(struct dentry *dp)
> {
> - return vfs_fsync(NULL, dentry, 0);
> + return nfsd_dosync(NULL, dp, dp->d_inode->i_fop);
> }
And we expand it twice.
* Re: nfsd stuckage
From: J. Bruce Fields @ 2009-01-07 0:28 UTC
To: Andrew Morton; +Cc: hch, linux-nfs, linux-kernel, neilb, snakebyte
On Tue, Jan 06, 2009 at 04:23:28PM -0800, Andrew Morton wrote:
> On Tue, 6 Jan 2009 19:15:01 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
>
> > nfsd: fix double-locks of directory mutex
>
> grumble.
This is literally just a revert of part of 4c728ef583b3d822; if you'd
like me to clean up this stuff while I'm there, I'm happy to.
--b.
> > +/*
> > + * Sync a file
> > + * As this calls fsync (not fdatasync) there is no need for a write_inode
> > + * after it.
> > + */
> > +static inline int nfsd_dosync(struct file *filp, struct dentry *dp,
> > + const struct file_operations *fop)
> > +{
> > + struct inode *inode = dp->d_inode;
> > + int (*fsync) (struct file *, struct dentry *, int);
> > + int err;
> > +
> > + err = filemap_fdatawrite(inode->i_mapping);
> > + if (err == 0 && fop && (fsync = fop->fsync))
> > + err = fsync(filp, dp, 0);
> > + if (err == 0)
> > + err = filemap_fdatawait(inode->i_mapping);
> > +
> > + return err;
> > +}
>
> This function is HUGE! And hardly a fastpath.
>
> > static int
> > nfsd_sync(struct file *filp)
> > {
> > - return vfs_fsync(filp, filp->f_path.dentry, 0);
> > + int err;
> > + struct inode *inode = filp->f_path.dentry->d_inode;
> > + dprintk("nfsd: sync file %s\n", filp->f_path.dentry->d_name.name);
> > + mutex_lock(&inode->i_mutex);
> > + err=nfsd_dosync(filp, filp->f_path.dentry, filp->f_op);
>
> (checkpatch?)
>
> > + mutex_unlock(&inode->i_mutex);
> > +
> > + return err;
> > }
> >
> > int
> > -nfsd_sync_dir(struct dentry *dentry)
> > +nfsd_sync_dir(struct dentry *dp)
> > {
> > - return vfs_fsync(NULL, dentry, 0);
> > + return nfsd_dosync(NULL, dp, dp->d_inode->i_fop);
> > }
>
> And we expand it twice.
* Re: nfsd stuckage
From: Christoph Hellwig @ 2009-01-07 7:42 UTC
To: J. Bruce Fields
Cc: Andrew Morton, hch, linux-nfs, linux-kernel, neilb, snakebyte
On Tue, Jan 06, 2009 at 07:28:16PM -0500, J. Bruce Fields wrote:
> On Tue, Jan 06, 2009 at 04:23:28PM -0800, Andrew Morton wrote:
> > On Tue, 6 Jan 2009 19:15:01 -0500
> > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> >
> > > nfsd: fix double-locks of directory mutex
> >
> > grumble.
>
> This is literally just a revert of part of 4c728ef583b3d822; if you'd
> like me to clean up this stuff while I'm there, I'm happy to.
Please leave it as the revert. NFSD really needs to use vfs_fsync
eventually so we can sort out our ->fsync usage. I suspect the best
way to get there is to do the i_mutex removal for fsync earlier than
planned, but I'll need to audit the filesystems first.
* Re: nfsd stuckage
From: J. Bruce Fields @ 2009-01-07 16:56 UTC
To: Christoph Hellwig
Cc: Andrew Morton, hch, linux-nfs, linux-kernel, neilb, snakebyte
On Wed, Jan 07, 2009 at 02:42:56AM -0500, Christoph Hellwig wrote:
> On Tue, Jan 06, 2009 at 07:28:16PM -0500, J. Bruce Fields wrote:
> > On Tue, Jan 06, 2009 at 04:23:28PM -0800, Andrew Morton wrote:
> > > On Tue, 6 Jan 2009 19:15:01 -0500
> > > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > >
> > > > nfsd: fix double-locks of directory mutex
> > >
> > > grumble.
> >
> > This is literally just a revert of part of 4c728ef583b3d822; if you'd
> > like me to clean up this stuff while I'm there, I'm happy to.
>
> Please leave it as the revert. NFSD really needs to use vfs_fsync
> eventually so we can sort out our ->fsync usage.
OK. Mind if we just revert the whole commit for now? With the
double-lock regression still there for ecryptfs exports, I'd
rather do a simple revert of the whole patch and not try to pick out
just the fs/nfsd/vfs.c part.
--b.
> I suspect the best way to get there is to do the i_mutex removal for
> fsync earlier than planned, but I'll need to audit the filesystems
> first.
* Re: nfsd stuckage
From: Christoph Hellwig @ 2009-01-07 17:22 UTC
To: J. Bruce Fields
Cc: Christoph Hellwig, Andrew Morton, hch, linux-nfs, linux-kernel,
neilb, snakebyte
On Wed, Jan 07, 2009 at 11:56:39AM -0500, J. Bruce Fields wrote:
> OK. Mind if we just revert the whole commit for now? With the
> double-lock regression still there for ecryptfs exports, I'd
> rather do a simple revert of the whole patch and not try to pick out
> just the fs/nfsd/vfs.c part.
Umm, exporting ecryptfs would previously take the lower i_mutex in
the ecryptfs fsync method and now does so in vfs_fsync, so there should
be no change in behaviour.
* Re: nfsd stuckage
From: Peter Zijlstra @ 2009-01-08 14:57 UTC
To: Andrew Morton
Cc: linux-nfs, linux-kernel, Neil Brown, J. Bruce Fields,
Christoph Hellwig
On Tue, 2009-01-06 at 14:56 -0800, Andrew Morton wrote:
> I just built current mainline plus the just-sent 266 -mm patches.
>
> The machine failed to power off when hit with `halt -pfn'. dmesg output:
> [ 672.162677] INFO: task nfsd4:4324 blocked for more than 480 seconds.
> [ 672.162706] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 672.162725] ffff880251df1d60 0000000000000046 ffff88025e1c0580 ffff8802488013d8
> [ 672.162753] ffff880251df1d20 ffff88024c49a7a0 ffff88025e088760 ffff88024c49ab18
> [ 672.162834] 000000002807ee00 00000000ffff59ec ffff880251df1d50 0000000000000282
> [ 672.162865] Call Trace:
> [ 672.162880] [<ffffffff8052a1d8>] __mutex_lock_slowpath+0x6a/0xac
> [ 672.162895] [<ffffffff8052a0c5>] mutex_lock+0x2c/0x30
> [ 672.162908] [<ffffffff802c4747>] vfs_fsync+0x63/0xa9
> [ 672.162933] [<ffffffffa048c74c>] nfsd_sync_dir+0x10/0x12 [nfsd]
> [ 672.162960] [<ffffffffa04a5427>] nfsd4_sync_rec_dir+0x27/0x40 [nfsd]
> [ 672.162984] [<ffffffffa04a592a>] nfsd4_recdir_purge_old+0x3d/0x6a [nfsd]
> [ 672.163023] [<ffffffffa04a1745>] laundromat_main+0x62/0x225 [nfsd]
> [ 672.163049] [<ffffffffa04a16e3>] ? laundromat_main+0x0/0x225 [nfsd]
> [ 672.163064] [<ffffffff8024b4a7>] run_workqueue+0x8d/0x124
> [ 672.163076] [<ffffffff8024b5e0>] ? worker_thread+0x0/0xe5
> [ 672.163089] [<ffffffff8024b6b8>] worker_thread+0xd8/0xe5
> [ 672.163102] [<ffffffff8024e8cc>] ? autoremove_wake_function+0x0/0x36
> [ 672.163115] [<ffffffff8024b5e0>] ? worker_thread+0x0/0xe5
> [ 672.163127] [<ffffffff8024e5e0>] kthread+0x44/0x6b
> [ 672.163140] [<ffffffff8020cfba>] child_rip+0xa/0x20
> [ 672.163151] [<ffffffff8024e59c>] ? kthread+0x0/0x6b
> [ 672.163162] [<ffffffff8020cfb0>] ? child_rip+0x0/0x20
FWIW lockdep seems to warn about this...
All I have to do to trigger this is boot the machine and let it sit for
a few minutes.
[ 113.552497] =============================================
[ 113.553289] [ INFO: possible recursive locking detected ]
[ 113.553289] 2.6.28-tip #592
[ 113.553289] ---------------------------------------------
[ 113.553289] nfsd4/1914 is trying to acquire lock:
[ 113.553289] (&type->i_mutex_dir_key#4){--..}, at: [<ffffffff802e7e5e>] vfs_fsync+0x6c/0xb1
[ 113.553289]
[ 113.553289] but task is already holding lock:
[ 113.553289] (&type->i_mutex_dir_key#4){--..}, at: [<ffffffffa0190727>] nfsd4_sync_rec_dir+0x22/0x47 [nfsd]
[ 113.553289]
[ 113.553289] other info that might help us debug this:
[ 113.553289] 4 locks held by nfsd4/1914:
[ 113.553289] #0: (nfsd4){--..}, at: [<ffffffff80252303>] run_workqueue+0xb6/0x21b
[ 113.553289] #1: ((laundromat_work).work){--..}, at: [<ffffffff80252303>] run_workqueue+0xb6/0x21b
[ 113.553289] #2: (client_mutex){--..}, at: [<ffffffffa018bd05>] laundromat_main+0x33/0x24e [nfsd]
[ 113.553289] #3: (&type->i_mutex_dir_key#4){--..}, at: [<ffffffffa0190727>] nfsd4_sync_rec_dir+0x22/0x47 [nfsd]
[ 113.553289]
[ 113.553289] stack backtrace:
[ 113.553289] Pid: 1914, comm: nfsd4 Not tainted 2.6.28-tip #592
[ 113.553289] Call Trace:
[ 113.553289] [<ffffffff80266987>] __lock_acquire+0xe42/0x161a
[ 113.553289] [<ffffffff80288857>] ? __call_rcu+0x7a/0x107
[ 113.553289] [<ffffffff802671b4>] lock_acquire+0x55/0x71
[ 113.553289] [<ffffffff802e7e5e>] ? vfs_fsync+0x6c/0xb1
[ 113.553289] [<ffffffff805568d0>] mutex_lock_nested+0x4e/0x320
[ 113.553289] [<ffffffff802e7e5e>] ? vfs_fsync+0x6c/0xb1
[ 113.553289] [<ffffffff8029bde0>] ? __filemap_fdatawrite_range+0x57/0x5f
[ 113.553289] [<ffffffff802e7e5e>] vfs_fsync+0x6c/0xb1
[ 113.553289] [<ffffffffa0176f8f>] nfsd_sync_dir+0x15/0x17 [nfsd]
[ 113.553289] [<ffffffffa0190733>] nfsd4_sync_rec_dir+0x2e/0x47 [nfsd]
[ 113.553289] [<ffffffffa0190791>] nfsd4_recdir_purge_old+0x45/0x73 [nfsd]
[ 113.553289] [<ffffffffa018bd44>] laundromat_main+0x72/0x24e [nfsd]
[ 113.553289] [<ffffffff80252355>] run_workqueue+0x108/0x21b
[ 113.553289] [<ffffffff80252303>] ? run_workqueue+0xb6/0x21b
[ 113.553289] [<ffffffffa018bcd2>] ? laundromat_main+0x0/0x24e [nfsd]
[ 113.553289] [<ffffffff8025254d>] worker_thread+0xe5/0xf6
[ 113.553289] [<ffffffff80256615>] ? autoremove_wake_function+0x0/0x3d
[ 113.553289] [<ffffffff80252468>] ? worker_thread+0x0/0xf6
[ 113.553289] [<ffffffff80256200>] kthread+0x4e/0x7b
[ 113.553289] [<ffffffff8020d51a>] child_rip+0xa/0x20
[ 113.553289] [<ffffffff8020cec0>] ? restore_args+0x0/0x30
[ 113.553289] [<ffffffff802561b2>] ? kthread+0x0/0x7b
[ 113.553289] [<ffffffff8020d510>] ? child_rip+0x0/0x20
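The condition behind "possible recursive locking detected" can be approximated with a very small model: keep a per-task list of held lock classes and refuse an acquisition whose class is already on the list. The sketch below is a simplification under that assumption and not how lockdep is implemented. It ignores subclasses (as used with mutex_lock_nested) and dependency chains, and all names in it (held_locks, would_recurse, track_acquire, track_release) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HELD 8

/* Hypothetical per-task record of held lock classes (small ints here). */
struct held_locks {
	int class_id[MAX_HELD];
	size_t depth;
};

/* Returns 1 if acquiring class_id would nest a lock class inside itself. */
static int would_recurse(const struct held_locks *h, int class_id)
{
	for (size_t i = 0; i < h->depth; i++)
		if (h->class_id[i] == class_id)
			return 1;
	return 0;
}

/* Records an acquisition; returns -1 and records nothing on recursion. */
static int track_acquire(struct held_locks *h, int class_id)
{
	if (would_recurse(h, class_id))
		return -1;
	assert(h->depth < MAX_HELD);
	h->class_id[h->depth++] = class_id;
	return 0;
}

/* Drops the most recently acquired lock, which must match class_id. */
static void track_release(struct held_locks *h, int class_id)
{
	assert(h->depth > 0 && h->class_id[h->depth - 1] == class_id);
	h->depth--;
}
```

Replaying the report with this model: four classes are acquired in order (the nfsd4 workqueue, the laundromat work item, client_mutex, the directory i_mutex), and a fifth track_acquire() for the directory i_mutex class returns -1, matching the second acquisition in vfs_fsync.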
* Re: nfsd stuckage
From: J. Bruce Fields @ 2009-01-08 16:05 UTC
To: Peter Zijlstra
Cc: Andrew Morton, linux-nfs, linux-kernel, Neil Brown,
Christoph Hellwig
On Thu, Jan 08, 2009 at 03:57:30PM +0100, Peter Zijlstra wrote:
> FWIW lockdep seems to warn about this...
>
> All I have to do to trigger this is boot the machine and let it sit for
> a few minutes.
Linus merged a fix (9a8d248e2d2 "nfsd: fix double-locks of directory
mutex") last night. If you still see warnings after that, let us know.
--b.
>
> [ 113.552497] =============================================
> [ 113.553289] [ INFO: possible recursive locking detected ]
> [ 113.553289] 2.6.28-tip #592
> [ 113.553289] ---------------------------------------------
> [ 113.553289] nfsd4/1914 is trying to acquire lock:
> [ 113.553289] (&type->i_mutex_dir_key#4){--..}, at: [<ffffffff802e7e5e>] vfs_fsync+0x6c/0xb1
> [ 113.553289]
> [ 113.553289] but task is already holding lock:
> [ 113.553289] (&type->i_mutex_dir_key#4){--..}, at: [<ffffffffa0190727>] nfsd4_sync_rec_dir+0x22/0x47 [nfsd]
> [ 113.553289]
> [ 113.553289] other info that might help us debug this:
> [ 113.553289] 4 locks held by nfsd4/1914:
> [ 113.553289] #0: (nfsd4){--..}, at: [<ffffffff80252303>] run_workqueue+0xb6/0x21b
> [ 113.553289] #1: ((laundromat_work).work){--..}, at: [<ffffffff80252303>] run_workqueue+0xb6/0x21b
> [ 113.553289] #2: (client_mutex){--..}, at: [<ffffffffa018bd05>] laundromat_main+0x33/0x24e [nfsd]
> [ 113.553289] #3: (&type->i_mutex_dir_key#4){--..}, at: [<ffffffffa0190727>] nfsd4_sync_rec_dir+0x22/0x47 [nfsd]
> [ 113.553289]
> [ 113.553289] stack backtrace:
> [ 113.553289] Pid: 1914, comm: nfsd4 Not tainted 2.6.28-tip #592
> [ 113.553289] Call Trace:
> [ 113.553289] [<ffffffff80266987>] __lock_acquire+0xe42/0x161a
> [ 113.553289] [<ffffffff80288857>] ? __call_rcu+0x7a/0x107
> [ 113.553289] [<ffffffff802671b4>] lock_acquire+0x55/0x71
> [ 113.553289] [<ffffffff802e7e5e>] ? vfs_fsync+0x6c/0xb1
> [ 113.553289] [<ffffffff805568d0>] mutex_lock_nested+0x4e/0x320
> [ 113.553289] [<ffffffff802e7e5e>] ? vfs_fsync+0x6c/0xb1
> [ 113.553289] [<ffffffff8029bde0>] ? __filemap_fdatawrite_range+0x57/0x5f
> [ 113.553289] [<ffffffff802e7e5e>] vfs_fsync+0x6c/0xb1
> [ 113.553289] [<ffffffffa0176f8f>] nfsd_sync_dir+0x15/0x17 [nfsd]
> [ 113.553289] [<ffffffffa0190733>] nfsd4_sync_rec_dir+0x2e/0x47 [nfsd]
> [ 113.553289] [<ffffffffa0190791>] nfsd4_recdir_purge_old+0x45/0x73 [nfsd]
> [ 113.553289] [<ffffffffa018bd44>] laundromat_main+0x72/0x24e [nfsd]
> [ 113.553289] [<ffffffff80252355>] run_workqueue+0x108/0x21b
> [ 113.553289] [<ffffffff80252303>] ? run_workqueue+0xb6/0x21b
> [ 113.553289] [<ffffffffa018bcd2>] ? laundromat_main+0x0/0x24e [nfsd]
> [ 113.553289] [<ffffffff8025254d>] worker_thread+0xe5/0xf6
> [ 113.553289] [<ffffffff80256615>] ? autoremove_wake_function+0x0/0x3d
> [ 113.553289] [<ffffffff80252468>] ? worker_thread+0x0/0xf6
> [ 113.553289] [<ffffffff80256200>] kthread+0x4e/0x7b
> [ 113.553289] [<ffffffff8020d51a>] child_rip+0xa/0x20
> [ 113.553289] [<ffffffff8020cec0>] ? restore_args+0x0/0x30
> [ 113.553289] [<ffffffff802561b2>] ? kthread+0x0/0x7b
> [ 113.553289] [<ffffffff8020d510>] ? child_rip+0x0/0x20
>
>