From: "J. R. Okajima" <hooanon05-/E1597aS9LR3+QwDJ9on6Q@public.gmane.org>
To: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: linux-nfs@vger.kernel.org, Wu Fengguang <fengguang.wu@intel.com>,
Peter Zijlstra <peterz@infradead.org>, Jan Kara <jack@suse.cz>,
Steve Rago <sar-a+KepyhlMvJWk0Htik3J/w@public.gmane.org>,
Jens Axboe <jens.axboe@oracle.com>,
Peter Staubach <staubach@redhat.com>,
Arjan van de Ven <arjan@infradead.org>,
Ingo Molnar <mingo@elte.hu>,
linux-fsdevel@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
Al Viro <viro@ZenIV.linux.org.uk>
Subject: Re: [PATCH 10/12] NFS: Simplify nfs_wb_page()
Date: Thu, 11 Mar 2010 03:51:49 +0900
Message-ID: <16839.1268247109@jrobl>
In-Reply-To: <20100125221545.16750.19154.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
Trond Myklebust:
> -static int nfs_wb_page_priority(struct inode *inode, struct page *page,
> -				int how)
> +/*
> + * Write back all requests on one page - we do this before reading it.
> + */
> +int nfs_wb_page(struct inode *inode, struct page *page)
>  {
:::
> -	do {
> +	while(PagePrivate(page)) {
>  		if (clear_page_dirty_for_io(page)) {
>  			ret = nfs_writepage_locked(page, &wbc);
>  			if (ret < 0)
>  				goto out_error;
> -		} else if (!PagePrivate(page))
> +		}
> +		req = nfs_find_and_lock_request(page);
> +		if (!req)
>  			break;
> -		ret = nfs_sync_mapping_wait(page->mapping, &wbc, how);
> -		if (ret < 0)
> +		if (IS_ERR(req)) {
> +			ret = PTR_ERR(req);
>  			goto out_error;
:::
Hello Trond,

I am not sure whether this nfs_find_and_lock_request() call is correct,
but it causes a problem for me.
The call trace in the "task kswapd0:305 blocked for more than 120 seconds"
message below shows that generic_delete_inode() blocked.
Is this put_nfs_open_context() -> iput() call chain in the
shrink_page_list() context intended?
If not, I'd suggest a patch like the one below.
INFO: task kswapd0:305 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kswapd0 D 0000000000000001 0 305 2 0x00000000
ffff88001f21d4f0 0000000000000046 ffff88001fdea680 ffff88001f21c000
ffff88001f21dfd8 ffff88001f21c000 ffff88001f21dfd8 ffff88001f21dfd8
ffff88001fdea040 0000000000014c00 0000000000000001 ffff88001fdea040
Call Trace:
[<ffffffff8146155d>] io_schedule+0x4d/0x70
[<ffffffff810d2be5>] sync_page+0x65/0xa0
[<ffffffff81461b12>] __wait_on_bit_lock+0x52/0xb0
[<ffffffff810d2b80>] ? sync_page+0x0/0xa0
[<ffffffff810d2b64>] __lock_page+0x64/0x70
[<ffffffff81070ce0>] ? wake_bit_function+0x0/0x40
[<ffffffff810df1d4>] truncate_inode_pages_range+0x344/0x4a0
[<ffffffff810df340>] truncate_inode_pages+0x10/0x20
[<ffffffff8112cbfe>] generic_delete_inode+0x15e/0x190
[<ffffffff8112cc8d>] generic_drop_inode+0x5d/0x80
[<ffffffff8112bb88>] iput+0x78/0x80
[<ffffffff811bc908>] nfs_dentry_iput+0x38/0x50
[<ffffffff811285f4>] dentry_iput+0x84/0x110
[<ffffffff811286ae>] d_kill+0x2e/0x60
[<ffffffff8112912a>] dput+0x7a/0x170
[<ffffffff8111e925>] path_put+0x15/0x40
[<ffffffff811c3a44>] __put_nfs_open_context+0xa4/0xb0
[<ffffffff811cb5d0>] ? nfs_free_request+0x0/0x50
[<ffffffff811c3b0b>] put_nfs_open_context+0xb/0x10
[<ffffffff811cb5f9>] nfs_free_request+0x29/0x50
[<ffffffff81234b7e>] kref_put+0x8e/0xe0
[<ffffffff811cb594>] nfs_release_request+0x14/0x20
[<ffffffff811cf769>] nfs_find_and_lock_request+0x89/0xa0
[<ffffffff811d1180>] nfs_wb_page+0x80/0x110
[<ffffffff811c0770>] nfs_release_page+0x70/0x90
[<ffffffff810d18ee>] try_to_release_page+0x5e/0x80
[<ffffffff810e1178>] shrink_page_list+0x638/0x860
[<ffffffff810e19de>] shrink_zone+0x63e/0xc40
[<ffffffff81464437>] ? _raw_spin_unlock+0x57/0x70
[<ffffffff8107641e>] ? up_read+0x1e/0x40
[<ffffffff810e26a9>] kswapd+0x6c9/0xa20
[<ffffffff810df700>] ? isolate_pages_global+0x0/0x280
[<ffffffff81070ca0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff810e1fe0>] ? kswapd+0x0/0xa20
[<ffffffff810706d6>] kthread+0x96/0xb0
[<ffffffff8100b5a4>] kernel_thread_helper+0x4/0x10
[<ffffffff81464f14>] ? restore_args+0x0/0x30
[<ffffffff81070640>] ? kthread+0x0/0xb0
[<ffffffff8100b5a0>] ? kernel_thread_helper+0x0/0x10
no locks held by kswapd0/305.
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index ae8d022..ffa5463 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -491,8 +491,13 @@ static int nfs_release_page(struct page *page, gfp_t gfp)
 {
 	dfprintk(PAGECACHE, "NFS: release_page(%p)\n", page);
 
-	if (gfp & __GFP_WAIT)
+	if (gfp & __GFP_WAIT) {
+		struct inode *inode;
+
+		inode = igrab(page->mapping->host);
 		nfs_wb_page(page->mapping->host, page);
+		iput(inode);
+	}
 	/* If PagePrivate() is set, then the page is not freeable */
 	if (PagePrivate(page))
 		return 0;
J. R. Okajima
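
The reference-counting idea behind the patch above can be sketched in a
small userspace model. This is only an illustration under stated
assumptions, not kernel code: every toy_* name is hypothetical, and
toy_wb_page() merely stands in for the nfs_wb_page() path that ends in
iput(). The model shows one thing: taking an extra reference before the
writeback call moves the final put out of that call and into the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an inode (hypothetical, not the kernel struct). */
struct toy_inode {
	int  refcount;
	bool in_writeback;       /* true while toy_wb_page() is running */
	bool freed;              /* the final reference has been dropped */
	bool freed_in_writeback; /* the final put happened inside toy_wb_page() */
};

/* Drop one reference; record where the last one was dropped. */
static void toy_iput(struct toy_inode *inode)
{
	if (--inode->refcount == 0) {
		inode->freed = true;
		inode->freed_in_writeback = inode->in_writeback;
	}
}

/* Take an extra reference, in the spirit of igrab(). */
static struct toy_inode *toy_igrab(struct toy_inode *inode)
{
	inode->refcount++;
	return inode;
}

/* Models a writeback call that releases a request holding a reference. */
static void toy_wb_page(struct toy_inode *inode)
{
	inode->in_writeback = true;
	toy_iput(inode);
	inode->in_writeback = false;
}

/* Models the release path, with or without the extra reference. */
static void toy_release_page(struct toy_inode *inode, bool pin)
{
	struct toy_inode *held = pin ? toy_igrab(inode) : NULL;

	toy_wb_page(inode);
	if (held)
		toy_iput(held);
}
```

Note that in the pinned case the final put still runs inside
toy_release_page(), that is, still in the caller's context. Whether that
is acceptable in the real reclaim path is the question asked above, and
the model does not answer it.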