* nfsd stuckage @ 2009-01-06 22:56 Andrew Morton 2009-01-06 23:02 ` J. Bruce Fields 2009-01-08 14:57 ` Peter Zijlstra 0 siblings, 2 replies; 12+ messages in thread From: Andrew Morton @ 2009-01-06 22:56 UTC (permalink / raw) To: linux-nfs; +Cc: linux-kernel, Neil Brown, J. Bruce Fields I just built current mainline plus the just-sent 266 -mm patches. The machine failed to power off when hit with `halt -pfn'. dmesg output: [ 37.087037] calling rfcomm_init+0x0/0xb6 [rfcomm] @ 3946 [ 37.087294] initcall rfcomm_init+0x0/0xb6 [rfcomm] returned 0 after 72 usecs [ 37.505855] calling hidp_init+0x0/0x5e [hidp] @ 4046 [ 37.506072] initcall hidp_init+0x0/0x5e [hidp] returned 0 after 28 usecs [ 37.636638] calling init_autofs4_fs+0x0/0x23 [autofs4] @ 4081 [ 37.636990] initcall init_autofs4_fs+0x0/0x23 [autofs4] returned 0 after 54 usecs [ 39.630075] calling init_nlm+0x0/0x22 [lockd] @ 4264 [ 39.630321] initcall init_nlm+0x0/0x22 [lockd] returned 0 after 59 usecs [ 39.690077] calling init_rpcsec_gss+0x0/0x4a [auth_rpcgss] @ 4264 [ 39.690281] initcall init_rpcsec_gss+0x0/0x4a [auth_rpcgss] returned 0 after 12 usecs [ 39.834034] calling init_nfsd+0x0/0xe2 [nfsd] @ 4302 [ 39.834471] initcall init_nfsd+0x0/0xe2 [nfsd] returned 0 after 236 usecs [ 39.924213] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory [ 672.162677] INFO: task nfsd4:4324 blocked for more than 480 seconds. [ 672.162706] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 672.162725] ffff880251df1d60 0000000000000046 ffff88025e1c0580 ffff8802488013d8 [ 672.162753] ffff880251df1d20 ffff88024c49a7a0 ffff88025e088760 ffff88024c49ab18 [ 672.162834] 000000002807ee00 00000000ffff59ec ffff880251df1d50 0000000000000282 [ 672.162865] Call Trace: [ 672.162880] [<ffffffff8052a1d8>] __mutex_lock_slowpath+0x6a/0xac [ 672.162895] [<ffffffff8052a0c5>] mutex_lock+0x2c/0x30 [ 672.162908] [<ffffffff802c4747>] vfs_fsync+0x63/0xa9 [ 672.162933] [<ffffffffa048c74c>] nfsd_sync_dir+0x10/0x12 [nfsd] [ 672.162960] [<ffffffffa04a5427>] nfsd4_sync_rec_dir+0x27/0x40 [nfsd] [ 672.162984] [<ffffffffa04a592a>] nfsd4_recdir_purge_old+0x3d/0x6a [nfsd] [ 672.163023] [<ffffffffa04a1745>] laundromat_main+0x62/0x225 [nfsd] [ 672.163049] [<ffffffffa04a16e3>] ? laundromat_main+0x0/0x225 [nfsd] [ 672.163064] [<ffffffff8024b4a7>] run_workqueue+0x8d/0x124 [ 672.163076] [<ffffffff8024b5e0>] ? worker_thread+0x0/0xe5 [ 672.163089] [<ffffffff8024b6b8>] worker_thread+0xd8/0xe5 [ 672.163102] [<ffffffff8024e8cc>] ? autoremove_wake_function+0x0/0x36 [ 672.163115] [<ffffffff8024b5e0>] ? worker_thread+0x0/0xe5 [ 672.163127] [<ffffffff8024e5e0>] kthread+0x44/0x6b [ 672.163140] [<ffffffff8020cfba>] child_rip+0xa/0x20 [ 672.163151] [<ffffffff8024e59c>] ? kthread+0x0/0x6b [ 672.163162] [<ffffffff8020cfb0>] ? child_rip+0x0/0x20 [ 1204.739381] INFO: task nfsd4:4324 blocked for more than 480 seconds. [ 1204.739415] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1204.739436] ffff880251df1d60 0000000000000046 ffff88025e1c0580 ffff8802488013d8 [ 1204.739523] ffff880251df1d20 ffff88024c49a7a0 ffff88025e088760 ffff88024c49ab18 [ 1204.739551] 000000002807ee00 00000000ffff59ec ffff880251df1d50 0000000000000282 [ 1204.739579] Call Trace: This didn't happen in linux-next a week or so ago. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: nfsd stuckage 2009-01-06 22:56 nfsd stuckage Andrew Morton @ 2009-01-06 23:02 ` J. Bruce Fields 2009-01-06 23:03 ` Christoph Hellwig 2009-01-08 14:57 ` Peter Zijlstra 1 sibling, 1 reply; 12+ messages in thread From: J. Bruce Fields @ 2009-01-06 23:02 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-nfs, linux-kernel, Neil Brown, Eric Sesterhenn, hch On Tue, Jan 06, 2009 at 02:56:12PM -0800, Andrew Morton wrote: > > I just built current mainline plus the just-sent 266 -mm patches. > > The machine failed to power off when hit with `halt -pfn'. dmesg output: Christoph, can you live with this for now? --b. commit 33e3950dc2eae7484e79685083c304d93013e3ec Author: J. Bruce Fields <bfields@citi.umich.edu> Date: Tue Jan 6 13:37:03 2009 -0500 nfsd: fix double-locks of directory mutex A number of nfsd operations depend on the i_mutex to cover more code than just the fsync, so the approach of 4c728ef583b3d8 "add a vfs_fsync helper" doesn't work for nfsd. Revert the parts of those patches that touch nfsd, and remove the logic from vfs_nfsd that was needed only for the special case of nfsd. Reported-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 44aa92a..6e50aaa 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -744,16 +744,44 @@ nfsd_close(struct file *filp) fput(filp); } +/* + * Sync a file + * As this calls fsync (not fdatasync) there is no need for a write_inode + * after it. + */ +static inline int nfsd_dosync(struct file *filp, struct dentry *dp, + const struct file_operations *fop) +{ + struct inode *inode = dp->d_inode; + int (*fsync) (struct file *, struct dentry *, int); + int err; + + err = filemap_fdatawrite(inode->i_mapping); + if (err == 0 && fop && (fsync = fop->fsync)) + err = fsync(filp, dp, 0); + if (err == 0) + err = filemap_fdatawait(inode->i_mapping); + + return err; +} + static int nfsd_sync(struct file *filp) { - return vfs_fsync(filp, filp->f_path.dentry, 0); + int err; + struct inode *inode = filp->f_path.dentry->d_inode; + dprintk("nfsd: sync file %s\n", filp->f_path.dentry->d_name.name); + mutex_lock(&inode->i_mutex); + err=nfsd_dosync(filp, filp->f_path.dentry, filp->f_op); + mutex_unlock(&inode->i_mutex); + + return err; } int -nfsd_sync_dir(struct dentry *dentry) +nfsd_sync_dir(struct dentry *dp) { - return vfs_fsync(NULL, dentry, 0); + return nfsd_dosync(NULL, dp, dp->d_inode->i_fop); } /* diff --git a/fs/sync.c b/fs/sync.c index 0921d6d..8e0a656 100644 --- a/fs/sync.c +++ b/fs/sync.c @@ -83,10 +83,6 @@ int file_fsync(struct file *filp, struct dentry *dentry, int datasync) * * Write back data and metadata for @file to disk. If @datasync is * set only metadata needed to access modified file data is written. - * - * In case this function is called from nfsd @file may be %NULL and - * only @dentry is set. This can only happen when the filesystem - * implements the export_operations API. */ int vfs_fsync(struct file *file, struct dentry *dentry, int datasync) { @@ -94,18 +90,8 @@ int vfs_fsync(struct file *file, struct dentry *dentry, int datasync) struct address_space *mapping; int err, ret; - /* - * Get mapping and operations from the file in case we have - * as file, or get the default values for them in case we - * don't have a struct file available. Damn nfsd.. - */ - if (file) { - mapping = file->f_mapping; - fop = file->f_op; - } else { - mapping = dentry->d_inode->i_mapping; - fop = dentry->d_inode->i_fop; - } + mapping = file->f_mapping; + fop = file->f_op; if (!fop || !fop->fsync) { ret = -EINVAL; ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: nfsd stuckage 2009-01-06 23:02 ` J. Bruce Fields @ 2009-01-06 23:03 ` Christoph Hellwig 2009-01-06 23:05 ` J. Bruce Fields 0 siblings, 1 reply; 12+ messages in thread From: Christoph Hellwig @ 2009-01-06 23:03 UTC (permalink / raw) To: J. Bruce Fields Cc: Andrew Morton, linux-nfs, linux-kernel, Neil Brown, Eric Sesterhenn, hch On Tue, Jan 06, 2009 at 06:02:44PM -0500, J. Bruce Fields wrote: > On Tue, Jan 06, 2009 at 02:56:12PM -0800, Andrew Morton wrote: > > > > I just built current mainline plus the just-sent 266 -mm patches. > > > > The machine failed to power off when hit with `halt -pfn'. dmesg output: > > Christoph, can you live with this for now? nfsd part is well, livable. But the fs/sync.c is buggy as stackable filesystems can call vfs_fsync with a NULL pointer due to nfs calling it that way, so please drop that hunk. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: nfsd stuckage 2009-01-06 23:03 ` Christoph Hellwig @ 2009-01-06 23:05 ` J. Bruce Fields 2009-01-07 0:15 ` J. Bruce Fields 0 siblings, 1 reply; 12+ messages in thread From: J. Bruce Fields @ 2009-01-06 23:05 UTC (permalink / raw) To: Christoph Hellwig Cc: Andrew Morton, linux-nfs, linux-kernel, Neil Brown, Eric Sesterhenn On Wed, Jan 07, 2009 at 12:03:56AM +0100, Christoph Hellwig wrote: > On Tue, Jan 06, 2009 at 06:02:44PM -0500, J. Bruce Fields wrote: > > On Tue, Jan 06, 2009 at 02:56:12PM -0800, Andrew Morton wrote: > > > > > > I just built current mainline plus the just-sent 266 -mm patches. > > > > > > The machine failed to power off when hit with `halt -pfn'. dmesg output: > > > > Christoph, can you live with this for now? > > nfsd part is well, livable. But the fs/sync.c is buggy as stackable > filesystems can call vfs_fsync with a NULL pointer due to nfs calling it > that way, so please drop that hunk. Whoops, OK, I didn't understand that. I'll drop that hunk, retest, then submit--should take a few minutes. --b. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: nfsd stuckage 2009-01-06 23:05 ` J. Bruce Fields @ 2009-01-07 0:15 ` J. Bruce Fields 2009-01-07 0:23 ` Andrew Morton 0 siblings, 1 reply; 12+ messages in thread From: J. Bruce Fields @ 2009-01-07 0:15 UTC (permalink / raw) To: Christoph Hellwig Cc: Andrew Morton, linux-nfs, linux-kernel, Neil Brown, Eric Sesterhenn On Tue, Jan 06, 2009 at 06:05:51PM -0500, bfields wrote: > On Wed, Jan 07, 2009 at 12:03:56AM +0100, Christoph Hellwig wrote: > > On Tue, Jan 06, 2009 at 06:02:44PM -0500, J. Bruce Fields wrote: > > > On Tue, Jan 06, 2009 at 02:56:12PM -0800, Andrew Morton wrote: > > > > > > > > I just built current mainline plus the just-sent 266 -mm patches. > > > > > > > > The machine failed to power off when hit with `halt -pfn'. dmesg output: > > > > > > Christoph, can you live with this for now? > > > > nfsd part is well, livable. But the fs/sync.c is buggy as stackable > > filesystems can call vfs_fsync with a NULL pointer due to nfs calling it > > that way, so please drop that hunk. > > Whoops, OK, I didn't understand that. I'll drop that hunk, retest, then > submit--should take a few minutes. So this works for me--and I guess I may as well submit it in a pull request. But: it sounds like we still have a regression for ecryptfs? (Since on ecryptfs export nfsd will still try to get the mutex twice on create, unlink, etc.) Maybe we should just revert 4c728ef583b3d82266584da5cb068294c09df31e entirely for now? --b. commit 3fbc5c762bd9f9ff52fe7b5b09398a3cff0e8415 Author: J. Bruce Fields <bfields@citi.umich.edu> Date: Tue Jan 6 13:37:03 2009 -0500 nfsd: fix double-locks of directory mutex A number of nfsd operations depend on the i_mutex to cover more code than just the fsync, so the approach of 4c728ef583b3d8 "add a vfs_fsync helper" doesn't work for nfsd. Revert the parts of those patches that touch nfsd. Note: we can't, however, remove the logic from vfs_fsync that was needed only for the special case of nfsd, because a vfs_fsync(NULL,...) call can still result indirectly from a stackable filesystem that was called by nfsd. Reported-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 44aa92a..6e50aaa 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -744,16 +744,44 @@ nfsd_close(struct file *filp) fput(filp); } +/* + * Sync a file + * As this calls fsync (not fdatasync) there is no need for a write_inode + * after it. + */ +static inline int nfsd_dosync(struct file *filp, struct dentry *dp, + const struct file_operations *fop) +{ + struct inode *inode = dp->d_inode; + int (*fsync) (struct file *, struct dentry *, int); + int err; + + err = filemap_fdatawrite(inode->i_mapping); + if (err == 0 && fop && (fsync = fop->fsync)) + err = fsync(filp, dp, 0); + if (err == 0) + err = filemap_fdatawait(inode->i_mapping); + + return err; +} + static int nfsd_sync(struct file *filp) { - return vfs_fsync(filp, filp->f_path.dentry, 0); + int err; + struct inode *inode = filp->f_path.dentry->d_inode; + dprintk("nfsd: sync file %s\n", filp->f_path.dentry->d_name.name); + mutex_lock(&inode->i_mutex); + err=nfsd_dosync(filp, filp->f_path.dentry, filp->f_op); + mutex_unlock(&inode->i_mutex); + + return err; } int -nfsd_sync_dir(struct dentry *dentry) +nfsd_sync_dir(struct dentry *dp) { - return vfs_fsync(NULL, dentry, 0); + return nfsd_dosync(NULL, dp, dp->d_inode->i_fop); } /* ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: nfsd stuckage 2009-01-07 0:15 ` J. Bruce Fields @ 2009-01-07 0:23 ` Andrew Morton 2009-01-07 0:28 ` J. Bruce Fields 0 siblings, 1 reply; 12+ messages in thread From: Andrew Morton @ 2009-01-07 0:23 UTC (permalink / raw) To: J. Bruce Fields; +Cc: hch, linux-nfs, linux-kernel, neilb, snakebyte On Tue, 6 Jan 2009 19:15:01 -0500 "J. Bruce Fields" <bfields@fieldses.org> wrote: > nfsd: fix double-locks of directory mutex grumble. > > +/* > + * Sync a file > + * As this calls fsync (not fdatasync) there is no need for a write_inode > + * after it. > + */ > +static inline int nfsd_dosync(struct file *filp, struct dentry *dp, > + const struct file_operations *fop) > +{ > + struct inode *inode = dp->d_inode; > + int (*fsync) (struct file *, struct dentry *, int); > + int err; > + > + err = filemap_fdatawrite(inode->i_mapping); > + if (err == 0 && fop && (fsync = fop->fsync)) > + err = fsync(filp, dp, 0); > + if (err == 0) > + err = filemap_fdatawait(inode->i_mapping); > + > + return err; > +} This function is HUGE! And hardly a fastpath. > static int > nfsd_sync(struct file *filp) > { > - return vfs_fsync(filp, filp->f_path.dentry, 0); > + int err; > + struct inode *inode = filp->f_path.dentry->d_inode; > + dprintk("nfsd: sync file %s\n", filp->f_path.dentry->d_name.name); > + mutex_lock(&inode->i_mutex); > + err=nfsd_dosync(filp, filp->f_path.dentry, filp->f_op); (checkpatch?) > + mutex_unlock(&inode->i_mutex); > + > + return err; > } > > int > -nfsd_sync_dir(struct dentry *dentry) > +nfsd_sync_dir(struct dentry *dp) > { > - return vfs_fsync(NULL, dentry, 0); > + return nfsd_dosync(NULL, dp, dp->d_inode->i_fop); > } And we expand it twice. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: nfsd stuckage 2009-01-07 0:23 ` Andrew Morton @ 2009-01-07 0:28 ` J. Bruce Fields 2009-01-07 7:42 ` Christoph Hellwig 0 siblings, 1 reply; 12+ messages in thread From: J. Bruce Fields @ 2009-01-07 0:28 UTC (permalink / raw) To: Andrew Morton; +Cc: hch, linux-nfs, linux-kernel, neilb, snakebyte On Tue, Jan 06, 2009 at 04:23:28PM -0800, Andrew Morton wrote: > On Tue, 6 Jan 2009 19:15:01 -0500 > "J. Bruce Fields" <bfields@fieldses.org> wrote: > > > nfsd: fix double-locks of directory mutex > > grumble. This is literally just a revert of part of 4c728ef583b3d822; if you'd like me to clean up this stuff while I'm there, I'm happy to. --b. > > +/* > > + * Sync a file > > + * As this calls fsync (not fdatasync) there is no need for a write_inode > > + * after it. > > + */ > > +static inline int nfsd_dosync(struct file *filp, struct dentry *dp, > > + const struct file_operations *fop) > > +{ > > + struct inode *inode = dp->d_inode; > > + int (*fsync) (struct file *, struct dentry *, int); > > + int err; > > + > > + err = filemap_fdatawrite(inode->i_mapping); > > + if (err == 0 && fop && (fsync = fop->fsync)) > > + err = fsync(filp, dp, 0); > > + if (err == 0) > > + err = filemap_fdatawait(inode->i_mapping); > > + > > + return err; > > +} > > This function is HUGE! And hardly a fastpath. > > > static int > > nfsd_sync(struct file *filp) > > { > > - return vfs_fsync(filp, filp->f_path.dentry, 0); > > + int err; > > + struct inode *inode = filp->f_path.dentry->d_inode; > > + dprintk("nfsd: sync file %s\n", filp->f_path.dentry->d_name.name); > > + mutex_lock(&inode->i_mutex); > > + err=nfsd_dosync(filp, filp->f_path.dentry, filp->f_op); > > (checkpatch?) > > > + mutex_unlock(&inode->i_mutex); > > + > > + return err; > > } > > > > int > > -nfsd_sync_dir(struct dentry *dentry) > > +nfsd_sync_dir(struct dentry *dp) > > { > > - return vfs_fsync(NULL, dentry, 0); > > + return nfsd_dosync(NULL, dp, dp->d_inode->i_fop); > > } > > And we expand it twice. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: nfsd stuckage 2009-01-07 0:28 ` J. Bruce Fields @ 2009-01-07 7:42 ` Christoph Hellwig 2009-01-07 16:56 ` J. Bruce Fields 0 siblings, 1 reply; 12+ messages in thread From: Christoph Hellwig @ 2009-01-07 7:42 UTC (permalink / raw) To: J. Bruce Fields Cc: Andrew Morton, hch, linux-nfs, linux-kernel, neilb, snakebyte On Tue, Jan 06, 2009 at 07:28:16PM -0500, J. Bruce Fields wrote: > On Tue, Jan 06, 2009 at 04:23:28PM -0800, Andrew Morton wrote: > > On Tue, 6 Jan 2009 19:15:01 -0500 > > "J. Bruce Fields" <bfields@fieldses.org> wrote: > > > > > nfsd: fix double-locks of directory mutex > > > > grumble. > > This is literally just a revert of part of 4c728ef583b3d822; if you'd > like me to clean up this stuff while I'm there, I'm happy to. Please leave it as the revert. NFSD really needs to use vfs_fsync eventually so we can sort out our ->fsync usage. I suspect the best way to get there is to to the i_mutex removal for fsync earlier than planned, but I'll need to audit the filesystems first. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: nfsd stuckage 2009-01-07 7:42 ` Christoph Hellwig @ 2009-01-07 16:56 ` J. Bruce Fields 2009-01-07 17:22 ` Christoph Hellwig 0 siblings, 1 reply; 12+ messages in thread From: J. Bruce Fields @ 2009-01-07 16:56 UTC (permalink / raw) To: Christoph Hellwig Cc: Andrew Morton, hch, linux-nfs, linux-kernel, neilb, snakebyte On Wed, Jan 07, 2009 at 02:42:56AM -0500, Christoph Hellwig wrote: > On Tue, Jan 06, 2009 at 07:28:16PM -0500, J. Bruce Fields wrote: > > On Tue, Jan 06, 2009 at 04:23:28PM -0800, Andrew Morton wrote: > > > On Tue, 6 Jan 2009 19:15:01 -0500 > > > "J. Bruce Fields" <bfields@fieldses.org> wrote: > > > > > > > nfsd: fix double-locks of directory mutex > > > > > > grumble. > > > > This is literally just a revert of part of 4c728ef583b3d822; if you'd > > like me to clean up this stuff while I'm there, I'm happy to. > > Please leave it as the revert. NFSD really needs to use vfs_fsync > eventually so we can sort out our ->fsync usage. OK. Mind if we just revert the whole commit for now? With the double-lock regression is still there for ecryptfs exports, then I'd rather do a simple revert of the whole patch and not try to pick out just the fs/nfsd/vfs.c part. --b. > I suspect the best way to get there is to to the i_mutex removal for > fsync earlier than planned, but I'll need to audit the filesystems > first. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: nfsd stuckage 2009-01-07 16:56 ` J. Bruce Fields @ 2009-01-07 17:22 ` Christoph Hellwig 0 siblings, 0 replies; 12+ messages in thread From: Christoph Hellwig @ 2009-01-07 17:22 UTC (permalink / raw) To: J. Bruce Fields Cc: Christoph Hellwig, Andrew Morton, hch, linux-nfs, linux-kernel, neilb, snakebyte On Wed, Jan 07, 2009 at 11:56:39AM -0500, J. Bruce Fields wrote: > OK. Mind if we just revert the whole commit for now? With the > double-lock regression is still there for ecryptfs exports, then I'd > rather do a simple revert of the whole patch and not try to pick out > just the fs/nfsd/vfs.c part. Umm, exporting ecryptfs would previously take the lower i_mutex in the ecryptfs fsync method and now does in vfs_fsync, there should be no changed in behaviour. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: nfsd stuckage 2009-01-06 22:56 nfsd stuckage Andrew Morton 2009-01-06 23:02 ` J. Bruce Fields @ 2009-01-08 14:57 ` Peter Zijlstra 2009-01-08 16:05 ` J. Bruce Fields 1 sibling, 1 reply; 12+ messages in thread From: Peter Zijlstra @ 2009-01-08 14:57 UTC (permalink / raw) To: Andrew Morton Cc: linux-nfs, linux-kernel, Neil Brown, J. Bruce Fields, Christoph Hellwig On Tue, 2009-01-06 at 14:56 -0800, Andrew Morton wrote: > I just built current mainline plus the just-sent 266 -mm patches. > > The machine failed to power off when hit with `halt -pfn'. dmesg output: > [ 672.162677] INFO: task nfsd4:4324 blocked for more than 480 seconds. > [ 672.162706] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 672.162725] ffff880251df1d60 0000000000000046 ffff88025e1c0580 ffff8802488013d8 > [ 672.162753] ffff880251df1d20 ffff88024c49a7a0 ffff88025e088760 ffff88024c49ab18 > [ 672.162834] 000000002807ee00 00000000ffff59ec ffff880251df1d50 0000000000000282 > [ 672.162865] Call Trace: > [ 672.162880] [<ffffffff8052a1d8>] __mutex_lock_slowpath+0x6a/0xac > [ 672.162895] [<ffffffff8052a0c5>] mutex_lock+0x2c/0x30 > [ 672.162908] [<ffffffff802c4747>] vfs_fsync+0x63/0xa9 > [ 672.162933] [<ffffffffa048c74c>] nfsd_sync_dir+0x10/0x12 [nfsd] > [ 672.162960] [<ffffffffa04a5427>] nfsd4_sync_rec_dir+0x27/0x40 [nfsd] > [ 672.162984] [<ffffffffa04a592a>] nfsd4_recdir_purge_old+0x3d/0x6a [nfsd] > [ 672.163023] [<ffffffffa04a1745>] laundromat_main+0x62/0x225 [nfsd] > [ 672.163049] [<ffffffffa04a16e3>] ? laundromat_main+0x0/0x225 [nfsd] > [ 672.163064] [<ffffffff8024b4a7>] run_workqueue+0x8d/0x124 > [ 672.163076] [<ffffffff8024b5e0>] ? worker_thread+0x0/0xe5 > [ 672.163089] [<ffffffff8024b6b8>] worker_thread+0xd8/0xe5 > [ 672.163102] [<ffffffff8024e8cc>] ? autoremove_wake_function+0x0/0x36 > [ 672.163115] [<ffffffff8024b5e0>] ? worker_thread+0x0/0xe5 > [ 672.163127] [<ffffffff8024e5e0>] kthread+0x44/0x6b > [ 672.163140] [<ffffffff8020cfba>] child_rip+0xa/0x20 > [ 672.163151] [<ffffffff8024e59c>] ? kthread+0x0/0x6b > [ 672.163162] [<ffffffff8020cfb0>] ? child_rip+0x0/0x20 FWIW lockdep seems to warn about this... All I have to do to trigger this is boot the machine and let it sit for a few minutes. [ 113.552497] ============================================= [ 113.553289] [ INFO: possible recursive locking detected ] [ 113.553289] 2.6.28-tip #592 [ 113.553289] --------------------------------------------- [ 113.553289] nfsd4/1914 is trying to acquire lock: [ 113.553289] (&type->i_mutex_dir_key#4){--..}, at: [<ffffffff802e7e5e>] vfs_fsync+0x6c/0xb1 [ 113.553289] [ 113.553289] but task is already holding lock: [ 113.553289] (&type->i_mutex_dir_key#4){--..}, at: [<ffffffffa0190727>] nfsd4_sync_rec_dir+0x22/0x47 [nfsd] [ 113.553289] [ 113.553289] other info that might help us debug this: [ 113.553289] 4 locks held by nfsd4/1914: [ 113.553289] #0: (nfsd4){--..}, at: [<ffffffff80252303>] run_workqueue+0xb6/0x21b [ 113.553289] #1: ((laundromat_work).work){--..}, at: [<ffffffff80252303>] run_workqueue+0xb6/0x21b [ 113.553289] #2: (client_mutex){--..}, at: [<ffffffffa018bd05>] laundromat_main+0x33/0x24e [nfsd] [ 113.553289] #3: (&type->i_mutex_dir_key#4){--..}, at: [<ffffffffa0190727>] nfsd4_sync_rec_dir+0x22/0x47 [nfsd] [ 113.553289] [ 113.553289] stack backtrace: [ 113.553289] Pid: 1914, comm: nfsd4 Not tainted 2.6.28-tip #592 [ 113.553289] Call Trace: [ 113.553289] [<ffffffff80266987>] __lock_acquire+0xe42/0x161a [ 113.553289] [<ffffffff80288857>] ? __call_rcu+0x7a/0x107 [ 113.553289] [<ffffffff802671b4>] lock_acquire+0x55/0x71 [ 113.553289] [<ffffffff802e7e5e>] ? vfs_fsync+0x6c/0xb1 [ 113.553289] [<ffffffff805568d0>] mutex_lock_nested+0x4e/0x320 [ 113.553289] [<ffffffff802e7e5e>] ? vfs_fsync+0x6c/0xb1 [ 113.553289] [<ffffffff8029bde0>] ? __filemap_fdatawrite_range+0x57/0x5f [ 113.553289] [<ffffffff802e7e5e>] vfs_fsync+0x6c/0xb1 [ 113.553289] [<ffffffffa0176f8f>] nfsd_sync_dir+0x15/0x17 [nfsd] [ 113.553289] [<ffffffffa0190733>] nfsd4_sync_rec_dir+0x2e/0x47 [nfsd] [ 113.553289] [<ffffffffa0190791>] nfsd4_recdir_purge_old+0x45/0x73 [nfsd] [ 113.553289] [<ffffffffa018bd44>] laundromat_main+0x72/0x24e [nfsd] [ 113.553289] [<ffffffff80252355>] run_workqueue+0x108/0x21b [ 113.553289] [<ffffffff80252303>] ? run_workqueue+0xb6/0x21b [ 113.553289] [<ffffffffa018bcd2>] ? laundromat_main+0x0/0x24e [nfsd] [ 113.553289] [<ffffffff8025254d>] worker_thread+0xe5/0xf6 [ 113.553289] [<ffffffff80256615>] ? autoremove_wake_function+0x0/0x3d [ 113.553289] [<ffffffff80252468>] ? worker_thread+0x0/0xf6 [ 113.553289] [<ffffffff80256200>] kthread+0x4e/0x7b [ 113.553289] [<ffffffff8020d51a>] child_rip+0xa/0x20 [ 113.553289] [<ffffffff8020cec0>] ? restore_args+0x0/0x30 [ 113.553289] [<ffffffff802561b2>] ? kthread+0x0/0x7b [ 113.553289] [<ffffffff8020d510>] ? child_rip+0x0/0x20 ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: nfsd stuckage 2009-01-08 14:57 ` Peter Zijlstra @ 2009-01-08 16:05 ` J. Bruce Fields 0 siblings, 0 replies; 12+ messages in thread From: J. Bruce Fields @ 2009-01-08 16:05 UTC (permalink / raw) To: Peter Zijlstra Cc: Andrew Morton, linux-nfs, linux-kernel, Neil Brown, Christoph Hellwig On Thu, Jan 08, 2009 at 03:57:30PM +0100, Peter Zijlstra wrote: > FWIW lockdep seems to warn about this... > > All I have to do to trigger this is boot the machine and let it sit for > a few minutes. Linus merged a fix (9a8d248e2d2 "nfsd: fix double-locks of directory mutex") last night. If you still see warnings after that, let us know. --b. > > [ 113.552497] ============================================= > [ 113.553289] [ INFO: possible recursive locking detected ] > [ 113.553289] 2.6.28-tip #592 > [ 113.553289] --------------------------------------------- > [ 113.553289] nfsd4/1914 is trying to acquire lock: > [ 113.553289] (&type->i_mutex_dir_key#4){--..}, at: [<ffffffff802e7e5e>] vfs_fsync+0x6c/0xb1 > [ 113.553289] > [ 113.553289] but task is already holding lock: > [ 113.553289] (&type->i_mutex_dir_key#4){--..}, at: [<ffffffffa0190727>] nfsd4_sync_rec_dir+0x22/0x47 [nfsd] > [ 113.553289] > [ 113.553289] other info that might help us debug this: > [ 113.553289] 4 locks held by nfsd4/1914: > [ 113.553289] #0: (nfsd4){--..}, at: [<ffffffff80252303>] run_workqueue+0xb6/0x21b > [ 113.553289] #1: ((laundromat_work).work){--..}, at: [<ffffffff80252303>] run_workqueue+0xb6/0x21b > [ 113.553289] #2: (client_mutex){--..}, at: [<ffffffffa018bd05>] laundromat_main+0x33/0x24e [nfsd] > [ 113.553289] #3: (&type->i_mutex_dir_key#4){--..}, at: [<ffffffffa0190727>] nfsd4_sync_rec_dir+0x22/0x47 [nfsd] > [ 113.553289] > [ 113.553289] stack backtrace: > [ 113.553289] Pid: 1914, comm: nfsd4 Not tainted 2.6.28-tip #592 > [ 113.553289] Call Trace: > [ 113.553289] [<ffffffff80266987>] __lock_acquire+0xe42/0x161a > [ 113.553289] [<ffffffff80288857>] ? __call_rcu+0x7a/0x107 > [ 113.553289] [<ffffffff802671b4>] lock_acquire+0x55/0x71 > [ 113.553289] [<ffffffff802e7e5e>] ? vfs_fsync+0x6c/0xb1 > [ 113.553289] [<ffffffff805568d0>] mutex_lock_nested+0x4e/0x320 > [ 113.553289] [<ffffffff802e7e5e>] ? vfs_fsync+0x6c/0xb1 > [ 113.553289] [<ffffffff8029bde0>] ? __filemap_fdatawrite_range+0x57/0x5f > [ 113.553289] [<ffffffff802e7e5e>] vfs_fsync+0x6c/0xb1 > [ 113.553289] [<ffffffffa0176f8f>] nfsd_sync_dir+0x15/0x17 [nfsd] > [ 113.553289] [<ffffffffa0190733>] nfsd4_sync_rec_dir+0x2e/0x47 [nfsd] > [ 113.553289] [<ffffffffa0190791>] nfsd4_recdir_purge_old+0x45/0x73 [nfsd] > [ 113.553289] [<ffffffffa018bd44>] laundromat_main+0x72/0x24e [nfsd] > [ 113.553289] [<ffffffff80252355>] run_workqueue+0x108/0x21b > [ 113.553289] [<ffffffff80252303>] ? run_workqueue+0xb6/0x21b > [ 113.553289] [<ffffffffa018bcd2>] ? laundromat_main+0x0/0x24e [nfsd] > [ 113.553289] [<ffffffff8025254d>] worker_thread+0xe5/0xf6 > [ 113.553289] [<ffffffff80256615>] ? autoremove_wake_function+0x0/0x3d > [ 113.553289] [<ffffffff80252468>] ? worker_thread+0x0/0xf6 > [ 113.553289] [<ffffffff80256200>] kthread+0x4e/0x7b > [ 113.553289] [<ffffffff8020d51a>] child_rip+0xa/0x20 > [ 113.553289] [<ffffffff8020cec0>] ? restore_args+0x0/0x30 > [ 113.553289] [<ffffffff802561b2>] ? kthread+0x0/0x7b > [ 113.553289] [<ffffffff8020d510>] ? child_rip+0x0/0x20 > > ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2009-01-08 16:05 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-01-06 22:56 nfsd stuckage Andrew Morton 2009-01-06 23:02 ` J. Bruce Fields 2009-01-06 23:03 ` Christoph Hellwig 2009-01-06 23:05 ` J. Bruce Fields 2009-01-07 0:15 ` J. Bruce Fields 2009-01-07 0:23 ` Andrew Morton 2009-01-07 0:28 ` J. Bruce Fields 2009-01-07 7:42 ` Christoph Hellwig 2009-01-07 16:56 ` J. Bruce Fields 2009-01-07 17:22 ` Christoph Hellwig 2009-01-08 14:57 ` Peter Zijlstra 2009-01-08 16:05 ` J. Bruce Fields
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox