From: Trond Myklebust <Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>
To: "J. R. Okajima" <hooanon05-/E1597aS9LR3+QwDJ9on6Q@public.gmane.org>
Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Wu Fengguang <fengguang.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>,
	Steve Rago <sar-a+KepyhlMvJWk0Htik3J/w@public.gmane.org>,
	Jens Axboe <jens.axboe-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	Peter Staubach <staubach-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Arjan van de Ven <arjan-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>,
	Al Viro <viro-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
Subject: Re: [PATCH 10/12] NFS: Simplify nfs_wb_page()
Date: Wed, 10 Mar 2010 15:18:20 -0500
Message-ID: <1268252300.3096.81.camel@localhost.localdomain>
In-Reply-To: <1268249482.3096.76.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
On Wed, 2010-03-10 at 14:31 -0500, Trond Myklebust wrote:
> From your trace it looks as if the problem is that nfs_wb_page() is
> triggering a dentry release, which deadlocks in
> truncate_inode_pages() because the _caller_ of nfs_release_page() holds
> a page lock.
>
> As far as I can see, your iput() call above can deadlock in exactly the
> same way.
>
> Note that shrink_page_list() is the only function that does this sort of
> thing without holding a reference to the inode.
OK. Does the following patch fix the deadlock for you?
Cheers
Trond
-----------------------------------------------------------------------------------------------------------
NFS: Avoid a deadlock in nfs_release_page
From: Trond Myklebust <Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>
J.R. Okajima reports the following deadlock:
INFO: task kswapd0:305 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kswapd0 D 0000000000000001 0 305 2 0x00000000
ffff88001f21d4f0 0000000000000046 ffff88001fdea680 ffff88001f21c000
ffff88001f21dfd8 ffff88001f21c000 ffff88001f21dfd8 ffff88001f21dfd8
ffff88001fdea040 0000000000014c00 0000000000000001 ffff88001fdea040
Call Trace:
[<ffffffff8146155d>] io_schedule+0x4d/0x70
[<ffffffff810d2be5>] sync_page+0x65/0xa0
[<ffffffff81461b12>] __wait_on_bit_lock+0x52/0xb0
[<ffffffff810d2b80>] ? sync_page+0x0/0xa0
[<ffffffff810d2b64>] __lock_page+0x64/0x70
[<ffffffff81070ce0>] ? wake_bit_function+0x0/0x40
[<ffffffff810df1d4>] truncate_inode_pages_range+0x344/0x4a0
[<ffffffff810df340>] truncate_inode_pages+0x10/0x20
[<ffffffff8112cbfe>] generic_delete_inode+0x15e/0x190
[<ffffffff8112cc8d>] generic_drop_inode+0x5d/0x80
[<ffffffff8112bb88>] iput+0x78/0x80
[<ffffffff811bc908>] nfs_dentry_iput+0x38/0x50
[<ffffffff811285f4>] dentry_iput+0x84/0x110
[<ffffffff811286ae>] d_kill+0x2e/0x60
[<ffffffff8112912a>] dput+0x7a/0x170
[<ffffffff8111e925>] path_put+0x15/0x40
[<ffffffff811c3a44>] __put_nfs_open_context+0xa4/0xb0
[<ffffffff811cb5d0>] ? nfs_free_request+0x0/0x50
[<ffffffff811c3b0b>] put_nfs_open_context+0xb/0x10
[<ffffffff811cb5f9>] nfs_free_request+0x29/0x50
[<ffffffff81234b7e>] kref_put+0x8e/0xe0
[<ffffffff811cb594>] nfs_release_request+0x14/0x20
[<ffffffff811cf769>] nfs_find_and_lock_request+0x89/0xa0
[<ffffffff811d1180>] nfs_wb_page+0x80/0x110
[<ffffffff811c0770>] nfs_release_page+0x70/0x90
[<ffffffff810d18ee>] try_to_release_page+0x5e/0x80
[<ffffffff810e1178>] shrink_page_list+0x638/0x860
[<ffffffff810e19de>] shrink_zone+0x63e/0xc40
We can fix this by making the call to put_nfs_open_context() happen when we
actually remove the write request from the inode (which is done by the
nfsiod thread in this case).
Signed-off-by: Trond Myklebust <Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>
---
fs/nfs/pagelist.c | 9 +++++++--
1 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index a12c45b..81fb4a5 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -148,10 +148,16 @@ void nfs_clear_page_tag_locked(struct nfs_page *req)
 void nfs_clear_request(struct nfs_page *req)
 {
 	struct page *page = req->wb_page;
+	struct nfs_open_context *ctx = req->wb_context;
+
 	if (page != NULL) {
 		page_cache_release(page);
 		req->wb_page = NULL;
 	}
+	if (ctx != NULL) {
+		put_nfs_open_context(ctx);
+		req->wb_context = NULL;
+	}
 }
@@ -165,9 +171,8 @@ static void nfs_free_request(struct kref *kref)
 {
 	struct nfs_page *req = container_of(kref, struct nfs_page, wb_kref);
-	/* Release struct file or cached credential */
+	/* Release struct file and open context */
 	nfs_clear_request(req);
-	put_nfs_open_context(req->wb_context);
 	nfs_page_free(req);
 }
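
The release ordering this patch aims for can be sketched outside the kernel. What follows is a minimal userspace model, not the kernel code: struct open_context, struct request, put_context(), clear_request() and free_request() are hypothetical stand-ins for nfs_open_context, nfs_page, put_nfs_open_context(), nfs_clear_request() and nfs_free_request(). It only illustrates that once the clear step has dropped the context reference and set the pointer to NULL, a later free step has nothing left to put, assuming the request was already cleared when it was removed from the inode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nfs_open_context. */
struct open_context {
	int refcount;	/* references currently held */
	int releases;	/* how many times the last reference was dropped */
};

/* Hypothetical stand-in for struct nfs_page. */
struct request {
	struct open_context *ctx;
};

/* Drop one reference; count a release when the last one goes away. */
static void put_context(struct open_context *ctx)
{
	assert(ctx->refcount > 0);
	if (--ctx->refcount == 0)
		ctx->releases++;
}

/*
 * Same shape as the patched nfs_clear_request(): drop the context
 * reference and clear the pointer, so that a repeated call is a no-op.
 */
static void clear_request(struct request *req)
{
	struct open_context *ctx = req->ctx;

	if (ctx != NULL) {
		put_context(ctx);
		req->ctx = NULL;
	}
}

/*
 * Same shape as the patched nfs_free_request(): it only calls the clear
 * step and no longer puts the context unconditionally.
 */
static void free_request(struct request *req)
{
	clear_request(req);
}
```

In this model, calling clear_request() first (the nfsiod side in the description above) and free_request() afterwards (the final kref_put() from the reclaim path) releases the context exactly once, and the second call never touches it.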