From: Theodore Ts'o <tytso@mit.edu>
To: Ext4 Developers List <linux-ext4@vger.kernel.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Subject: [PATCH 06/12] ext4: grab page before starting transaction handle in write_begin()
Date: Sat, 9 Feb 2013 16:53:46 -0500 [thread overview]
Message-ID: <1360446832-12724-7-git-send-email-tytso@mit.edu> (raw)
In-Reply-To: <1360446832-12724-1-git-send-email-tytso@mit.edu>
The grab_cache_page_write_begin() function can potentially sleep for a
long time, since it may need to do memory allocation which can block
if the system is under significant memory pressure, and because it may
be blocked on page writeback. If it does take a long time to grab the
page, it's better that we not hold an active jbd2 handle.
So grab a handle on the page first, and _then_ start the transaction
handle.
This commit fixes the following long transaction handle hold time:
postmark-2917 [000] .... 196.435786: jbd2_handle_stats: dev 254,32
tid 570 type 2 line_no 2541 interval 311 sync 0 requested_blocks 1
dirtied_blocks 0
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
fs/ext4/inode.c | 111 ++++++++++++++++++++++++++++++++++----------------------
1 file changed, 68 insertions(+), 43 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c3c47e2..38164a8 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -906,32 +906,40 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
ret = ext4_try_to_write_inline_data(mapping, inode, pos, len,
flags, pagep);
if (ret < 0)
- goto out;
- if (ret == 1) {
- ret = 0;
- goto out;
- }
+ return ret;
+ if (ret == 1)
+ return 0;
}
-retry:
+ /*
+ * grab_cache_page_write_begin() can take a long time if the
+ * system is thrashing due to memory pressure, or if the page
+ * is being written back. So grab it first before we start
+ * the transaction handle. This also allows us to allocate
+ * the page (if needed) without using GFP_NOFS.
+ */
+retry_grab:
+ page = grab_cache_page_write_begin(mapping, index, flags);
+ if (!page)
+ return -ENOMEM;
+ unlock_page(page);
+
+retry_journal:
handle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, needed_blocks);
if (IS_ERR(handle)) {
- ret = PTR_ERR(handle);
- goto out;
+ page_cache_release(page);
+ return PTR_ERR(handle);
}
- /* We cannot recurse into the filesystem as the transaction is already
- * started */
- flags |= AOP_FLAG_NOFS;
-
- page = grab_cache_page_write_begin(mapping, index, flags);
- if (!page) {
+ lock_page(page);
+ if (page->mapping != mapping) {
+ /* The page got truncated from under us */
+ unlock_page(page);
+ page_cache_release(page);
ext4_journal_stop(handle);
- ret = -ENOMEM;
- goto out;
+ goto retry_grab;
}
-
- *pagep = page;
+ wait_on_page_writeback(page);
if (ext4_should_dioread_nolock(inode))
ret = __block_write_begin(page, pos, len, ext4_get_block_write);
@@ -946,7 +954,6 @@ retry:
if (ret) {
unlock_page(page);
- page_cache_release(page);
/*
* __block_write_begin may have instantiated a few blocks
* outside i_size. Trim these off again. Don't need
@@ -970,11 +977,14 @@ retry:
if (inode->i_nlink)
ext4_orphan_del(NULL, inode);
}
- }
- if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
- goto retry;
-out:
+ if (ret == -ENOSPC &&
+ ext4_should_retry_alloc(inode->i_sb, &retries))
+ goto retry_journal;
+ page_cache_release(page);
+ return ret;
+ }
+ *pagep = page;
return ret;
}
@@ -2524,42 +2534,52 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
pos, len, flags,
pagep, fsdata);
if (ret < 0)
- goto out;
- if (ret == 1) {
- ret = 0;
- goto out;
- }
+ return ret;
+ if (ret == 1)
+ return 0;
}
-retry:
+ /*
+ * grab_cache_page_write_begin() can take a long time if the
+ * system is thrashing due to memory pressure, or if the page
+ * is being written back. So grab it first before we start
+ * the transaction handle. This also allows us to allocate
+ * the page (if needed) without using GFP_NOFS.
+ */
+retry_grab:
+ page = grab_cache_page_write_begin(mapping, index, flags);
+ if (!page)
+ return -ENOMEM;
+ unlock_page(page);
+
/*
* With delayed allocation, we don't log the i_disksize update
* if there is delayed block allocation. But we still need
* to journalling the i_disksize update if writes to the end
* of file which has an already mapped buffer.
*/
+retry_journal:
handle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, 1);
if (IS_ERR(handle)) {
- ret = PTR_ERR(handle);
- goto out;
+ page_cache_release(page);
+ return PTR_ERR(handle);
}
- /* We cannot recurse into the filesystem as the transaction is already
- * started */
- flags |= AOP_FLAG_NOFS;
- page = grab_cache_page_write_begin(mapping, index, flags);
- if (!page) {
+ lock_page(page);
+ if (page->mapping != mapping) {
+ /* The page got truncated from under us */
+ unlock_page(page);
+ page_cache_release(page);
ext4_journal_stop(handle);
- ret = -ENOMEM;
- goto out;
+ goto retry_grab;
}
- *pagep = page;
+ /* In case writeback began while the page was unlocked */
+ wait_on_page_writeback(page);
ret = __block_write_begin(page, pos, len, ext4_da_get_block_prep);
if (ret < 0) {
unlock_page(page);
ext4_journal_stop(handle);
- page_cache_release(page);
/*
* block_write_begin may have instantiated a few blocks
* outside i_size. Trim these off again. Don't need
@@ -2567,11 +2587,16 @@ retry:
*/
if (pos + len > inode->i_size)
ext4_truncate_failed_write(inode);
+
+ if (ret == -ENOSPC &&
+ ext4_should_retry_alloc(inode->i_sb, &retries))
+ goto retry_journal;
+
+ page_cache_release(page);
+ return ret;
}
- if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
- goto retry;
-out:
+ *pagep = page;
return ret;
}
--
1.7.12.rc0.22.gcdd159b
next prev parent reply other threads:[~2013-02-09 21:53 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-02-09 21:53 [PATCH 00/12] jbd2 optimization and bug fixes Theodore Ts'o
2013-02-09 21:53 ` [PATCH 01/12] jbd2: track request delay statistics Theodore Ts'o
2013-02-11 15:57 ` Jan Kara
2013-02-09 21:53 ` [PATCH 02/12] jbd2: revert "jbd2: add COW fields to struct jbd2_journal_handle" Theodore Ts'o
2013-02-11 15:58 ` Jan Kara
2013-02-09 21:53 ` [PATCH 03/12] jbd2: add tracepoints which provide per-handle statistics Theodore Ts'o
2013-02-09 21:53 ` [PATCH 04/12] ext4: move the jbd2 wrapper functions out of super.c Theodore Ts'o
2013-02-09 21:53 ` [PATCH 05/12] ext4: pass context information to jbd2__journal_start() Theodore Ts'o
2013-02-11 16:16 ` Jan Kara
2013-02-11 18:13 ` Theodore Ts'o
2013-02-11 19:58 ` Jan Kara
2013-02-11 20:14 ` Theodore Ts'o
2013-02-09 21:53 ` Theodore Ts'o [this message]
2013-02-11 16:35 ` [PATCH 06/12] ext4: grab page before starting transaction handle in write_begin() Jan Kara
2013-02-09 21:53 ` [PATCH 07/12] ext4: start handle at the last possible moment in ext4_unlink() Theodore Ts'o
2013-02-11 16:21 ` Jan Kara
2013-02-09 21:53 ` [PATCH 08/12] ext4: start handle at the last possible moment in ext4_rmdir() Theodore Ts'o
2013-02-11 16:22 ` Jan Kara
2013-02-09 21:53 ` [PATCH 09/12] ext4: fix the number of credits needed for ext4_ext_migrate() Theodore Ts'o
2013-02-11 16:26 ` Jan Kara
2013-02-09 21:53 ` [PATCH 10/12] ext4: fix the number of credits needed for ext4_unlink() and ext4_rmdir() Theodore Ts'o
2013-02-11 16:28 ` Jan Kara
2013-02-11 18:30 ` Theodore Ts'o
2013-02-11 19:30 ` Jan Kara
2013-02-09 21:53 ` [PATCH 11/12] ext4: fix the number of credits needed for acl ops with inline data Theodore Ts'o
2013-02-10 13:42 ` Tao Ma
2013-02-10 18:15 ` Shentino
2013-02-10 19:43 ` Theodore Ts'o
2013-02-11 16:16 ` Andreas Dilger
2013-02-11 16:30 ` Jan Kara
2013-02-09 21:53 ` [PATCH 12/12] ext4: start handle at the last possible moment when creating inodes Theodore Ts'o
2013-02-11 1:47 ` Theodore Ts'o
2013-02-11 16:00 ` Andreas Dilger
2013-02-11 15:52 ` [PATCH 00/12] jbd2 optimization and bug fixes Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1360446832-12724-7-git-send-email-tytso@mit.edu \
--to=tytso@mit.edu \
--cc=linux-ext4@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).