From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: Justin Forbes <jmforbes@linuxtx.org>,
Zwane Mwaikambo <zwane@arm.linux.org.uk>,
"Theodore Ts'o" <tytso@mit.edu>,
Randy Dunlap <rdunlap@xenotime.net>,
Dave Jones <davej@redhat.com>,
Chuck Wolber <chuckw@quantumlinux.com>,
Chris Wedgwood <reviews@ml.cw.f00f.org>,
Michael Krufky <mkrufky@linuxtv.org>,
Chuck Ebbert <cebbert@redhat.com>,
Domenico Andreoli <cavokz@gmail.com>, Willy Tarreau <w@1wt.eu>,
Rodrigo Rubira Branco <rbranco@la.checkpoint.com>,
Jake Edge <jake@lwn.net>, Eugene Teo <eteo@redhat.com>,
torvalds@linux-foundation.org, akpm@linux-foundation.org,
alan@lxorguk.ukuu.org.uk, Mingming Cao <cmm@us.ibm.com>,
Jan Kara <jack@suse.cz>
Subject: [patch 13/25] jbd: fix race between free buffer and commit transaction
Date: Mon, 4 Aug 2008 14:30:25 -0700 [thread overview]
Message-ID: <20080804213025.GM8014@suse.de> (raw)
In-Reply-To: <20080804212725.GA7944@suse.de>
[-- Attachment #1: jbd-fix-race-between-free-buffer-and-commit-transaction.patch --]
[-- Type: text/plain, Size: 4891 bytes --]
2.6.26-stable review patch. If anyone has any objections, please let us
know.
------------------
From: Mingming Cao <cmm@us.ibm.com>
commit 3f31fddfa26b7594b44ff2b34f9a04ba409e0f91 upstream
journal_try_to_free_buffers() could race with jbd commit transaction when
the later is holding the buffer reference while waiting for the data
buffer to flush to disk. If the caller of journal_try_to_free_buffers()
request tries hard to release the buffers, it will treat the failure as
error and return back to the caller. We have seen the directo IO failed
due to this race. Some of the caller of releasepage() also expecting the
buffer to be dropped when passed with GFP_KERNEL mask to the
releasepage()->journal_try_to_free_buffers().
With this patch, if the caller is passing the __GFP_WAIT and __GFP_FS to
indicating this call could wait, in case of try_to_free_buffers() failed,
let's waiting for journal_commit_transaction() to finish commit the
current committing transaction, then try to free those buffers again.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Reviewed-by: Badari Pulavarty <pbadari@us.ibm.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
fs/jbd/transaction.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++++--
mm/filemap.c | 3 --
2 files changed, 56 insertions(+), 4 deletions(-)
--- a/fs/jbd/transaction.c
+++ b/fs/jbd/transaction.c
@@ -1648,12 +1648,42 @@ out:
return;
}
+/*
+ * journal_try_to_free_buffers() could race with journal_commit_transaction()
+ * The latter might still hold the a count on buffers when inspecting
+ * them on t_syncdata_list or t_locked_list.
+ *
+ * journal_try_to_free_buffers() will call this function to
+ * wait for the current transaction to finish syncing data buffers, before
+ * tryinf to free that buffer.
+ *
+ * Called with journal->j_state_lock held.
+ */
+static void journal_wait_for_transaction_sync_data(journal_t *journal)
+{
+ transaction_t *transaction = NULL;
+ tid_t tid;
+
+ spin_lock(&journal->j_state_lock);
+ transaction = journal->j_committing_transaction;
+
+ if (!transaction) {
+ spin_unlock(&journal->j_state_lock);
+ return;
+ }
+
+ tid = transaction->t_tid;
+ spin_unlock(&journal->j_state_lock);
+ log_wait_commit(journal, tid);
+}
/**
* int journal_try_to_free_buffers() - try to free page buffers.
* @journal: journal for operation
* @page: to try and free
- * @unused_gfp_mask: unused
+ * @gfp_mask: we use the mask to detect how hard should we try to release
+ * buffers. If __GFP_WAIT and __GFP_FS is set, we wait for commit code to
+ * release the buffers.
*
*
* For all the buffers on this page,
@@ -1682,9 +1712,11 @@ out:
* journal_try_to_free_buffer() is changing its state. But that
* cannot happen because we never reallocate freed data as metadata
* while the data is part of a transaction. Yes?
+ *
+ * Return 0 on failure, 1 on success
*/
int journal_try_to_free_buffers(journal_t *journal,
- struct page *page, gfp_t unused_gfp_mask)
+ struct page *page, gfp_t gfp_mask)
{
struct buffer_head *head;
struct buffer_head *bh;
@@ -1713,7 +1745,28 @@ int journal_try_to_free_buffers(journal_
if (buffer_jbd(bh))
goto busy;
} while ((bh = bh->b_this_page) != head);
+
ret = try_to_free_buffers(page);
+
+ /*
+ * There are a number of places where journal_try_to_free_buffers()
+ * could race with journal_commit_transaction(), the later still
+ * holds the reference to the buffers to free while processing them.
+ * try_to_free_buffers() failed to free those buffers. Some of the
+ * caller of releasepage() request page buffers to be dropped, otherwise
+ * treat the fail-to-free as errors (such as generic_file_direct_IO())
+ *
+ * So, if the caller of try_to_release_page() wants the synchronous
+ * behaviour(i.e make sure buffers are dropped upon return),
+ * let's wait for the current transaction to finish flush of
+ * dirty data buffers, then try to free those buffers again,
+ * with the journal locked.
+ */
+ if (ret == 0 && (gfp_mask & __GFP_WAIT) && (gfp_mask & __GFP_FS)) {
+ journal_wait_for_transaction_sync_data(journal);
+ ret = try_to_free_buffers(page);
+ }
+
busy:
return ret;
}
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2581,9 +2581,8 @@ out:
* Otherwise return zero.
*
* The @gfp_mask argument specifies whether I/O may be performed to release
- * this page (__GFP_IO), and whether the call may block (__GFP_WAIT).
+ * this page (__GFP_IO), and whether the call may block (__GFP_WAIT & __GFP_FS).
*
- * NOTE: @gfp_mask may go away, and this function may become non-blocking.
*/
int try_to_release_page(struct page *page, gfp_t gfp_mask)
{
--
next prev parent reply other threads:[~2008-08-04 21:48 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20080804203506.816201392@mini.kroah.org>
2008-08-04 21:27 ` [patch 00/25] 2.6.26-stable review Greg KH
2008-08-04 21:29 ` [patch 01/25] ftrace: remove unneeded documentation Greg KH
2008-08-04 21:42 ` Steven Rostedt
2008-08-04 21:46 ` Greg KH
2008-08-04 22:16 ` Steven Rostedt
2008-08-04 21:49 ` Randy Dunlap
2008-08-04 22:02 ` Steven Rostedt
2008-08-04 22:06 ` Randy Dunlap
2008-08-04 21:29 ` [patch 02/25] romfs_readpage: dont report errors for pages beyond i_size Greg KH
2008-08-04 21:29 ` [patch 03/25] netfilter: nf_nat_sip: c= is optional for session Greg KH
2008-08-04 21:29 ` [patch 04/25] SCSI: bsg: fix bsg_mutex hang with device removal Greg KH
2008-08-04 21:29 ` [patch 05/25] x86: idle process - add checking for NULL early param Greg KH
2008-08-04 21:29 ` [patch 06/25] x86: io delay " Greg KH
2008-08-04 21:29 ` [patch 07/25] Close race in md_probe Greg KH
2008-08-04 21:30 ` [patch 08/25] Kprobe smoke test lockdep warning Greg KH
2008-08-04 21:30 ` [patch 09/25] netfilter: xt_time: fix times time_mt()s use of do_div() Greg KH
2008-08-04 21:30 ` [patch 10/25] linear: correct disk numbering error check Greg KH
2008-08-04 21:30 ` [patch 11/25] SCSI: ch: fix ch_remove oops Greg KH
2008-08-04 21:30 ` [patch 12/25] NFS: Ensure we zap only the access and acl caches when setting new acls Greg KH
2008-08-04 21:30 ` Greg KH [this message]
2008-08-04 21:30 ` [patch 14/25] Input: i8042 - add Intel D845PESV to nopnp list Greg KH
2008-08-04 21:30 ` [patch 15/25] Input: i8042 - add Gericom Bellagio to nomux blacklist Greg KH
2008-08-04 21:30 ` [patch 16/25] Input: i8042 - add Acer Aspire 1360 " Greg KH
2008-08-04 21:30 ` [patch 17/25] Bluetooth: Signal user-space for HIDP and BNEP socket errors Greg KH
2008-08-04 21:30 ` [patch 18/25] Add compat handler for PTRACE_GETSIGINFO Greg KH
2008-08-04 21:30 ` [patch 19/25] ALSA: hda - Fix wrong volumes in AD1988 auto-probe mode Greg KH
2008-08-04 21:30 ` [patch 20/25] ALSA: hda - Fix DMA position inaccuracy Greg KH
2008-08-04 21:30 ` [patch 21/25] ALSA: hda - Add missing Thinkpad Z60m support Greg KH
2008-08-04 21:30 ` [patch 22/25] ALSA: emu10k1 - Fix inverted Analog/Digital mixer switch on Audigy2 Greg KH
2008-08-04 21:30 ` [patch 23/25] vfs: fix lookup on deleted directory Greg KH
2008-08-04 21:30 ` [patch 24/25] Ath5k: fix memory corruption Greg KH
2008-08-04 21:30 ` [patch 25/25] Ath5k: kill tasklets on shutdown Greg KH
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080804213025.GM8014@suse.de \
--to=gregkh@suse.de \
--cc=akpm@linux-foundation.org \
--cc=alan@lxorguk.ukuu.org.uk \
--cc=cavokz@gmail.com \
--cc=cebbert@redhat.com \
--cc=chuckw@quantumlinux.com \
--cc=cmm@us.ibm.com \
--cc=davej@redhat.com \
--cc=eteo@redhat.com \
--cc=jack@suse.cz \
--cc=jake@lwn.net \
--cc=jmforbes@linuxtx.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mkrufky@linuxtv.org \
--cc=rbranco@la.checkpoint.com \
--cc=rdunlap@xenotime.net \
--cc=reviews@ml.cw.f00f.org \
--cc=stable@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=tytso@mit.edu \
--cc=w@1wt.eu \
--cc=zwane@arm.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox