stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, "Theodore Tso" <tytso@mit.edu>,
	Ben Hutchings <ben@decadent.org.uk>,
	George Barnett <gbarnett@atlassian.com>,
	Rui Xiang <rui.xiang@huawei.com>
Subject: [PATCH 3.4 60/99] ext4/jbd2: dont wait (forever) for stale tid caused by wraparound
Date: Fri,  7 Mar 2014 17:07:56 -0800	[thread overview]
Message-ID: <20140308010613.516854000@linuxfoundation.org> (raw)
In-Reply-To: <20140308010611.468206150@linuxfoundation.org>

3.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit d76a3a77113db020d9bb1e894822869410450bd9 upstream.

In the case where an inode has a very stale transaction id (tid) in
i_datasync_tid or i_sync_tid, it's possible that after a very large
(2**31) number of transactions, that the tid number space might wrap,
causing tid_geq()'s calculations to fail.

Commit deeeaf13 "jbd2: fix fsync() tid wraparound bug", later modified
by commit e7b04ac0 "jbd2: don't wake kjournald unnecessarily",
attempted to fix this problem, but it only avoided kjournald spinning
forever by fixing the logic in jbd2_log_start_commit().

Unfortunately, in the codepaths in fs/ext4/fsync.c and fs/ext4/inode.c
that might call jbd2_log_start_commit() with a stale tid, those
functions will subsequently call jbd2_log_wait_commit() with the same
stale tid, and then wait for a very long time.  To fix this, we
replace the calls to jbd2_log_start_commit() and
jbd2_log_wait_commit() with a call to a new function,
jbd2_complete_transaction(), which will correctly handle stale tid's.

As a bonus, jbd2_complete_transaction() will avoid locking
j_state_lock for writing unless a commit needs to be started.  This
should have a small (but probably not measurable) improvement for
ext4's scalability.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Reported-by: George Barnett <gbarnett@atlassian.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Rui Xiang <rui.xiang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ext4/fsync.c      |    3 +--
 fs/ext4/inode.c      |    3 +--
 fs/jbd2/journal.c    |   31 +++++++++++++++++++++++++++++++
 include/linux/jbd2.h |    1 +
 4 files changed, 34 insertions(+), 4 deletions(-)

--- a/fs/ext4/fsync.c
+++ b/fs/ext4/fsync.c
@@ -260,8 +260,7 @@ int ext4_sync_file(struct file *file, lo
 	if (journal->j_flags & JBD2_BARRIER &&
 	    !jbd2_trans_will_send_data_barrier(journal, commit_tid))
 		needs_barrier = true;
-	jbd2_log_start_commit(journal, commit_tid);
-	ret = jbd2_log_wait_commit(journal, commit_tid);
+	ret = jbd2_complete_transaction(journal, commit_tid);
 	if (needs_barrier)
 		blkdev_issue_flush(inode->i_sb->s_bdev, GFP_KERNEL, NULL);
  out:
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -149,8 +149,7 @@ void ext4_evict_inode(struct inode *inod
 			journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
 			tid_t commit_tid = EXT4_I(inode)->i_datasync_tid;
 
-			jbd2_log_start_commit(journal, commit_tid);
-			jbd2_log_wait_commit(journal, commit_tid);
+			jbd2_complete_transaction(journal, commit_tid);
 			filemap_write_and_wait(&inode->i_data);
 		}
 		truncate_inode_pages(&inode->i_data, 0);
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -662,6 +662,37 @@ int jbd2_log_wait_commit(journal_t *jour
 }
 
 /*
+ * When this function returns the transaction corresponding to tid
+ * will be completed.  If the transaction has currently running, start
+ * committing that transaction before waiting for it to complete.  If
+ * the transaction id is stale, it is by definition already completed,
+ * so just return SUCCESS.
+ */
+int jbd2_complete_transaction(journal_t *journal, tid_t tid)
+{
+	int	need_to_wait = 1;
+
+	read_lock(&journal->j_state_lock);
+	if (journal->j_running_transaction &&
+	    journal->j_running_transaction->t_tid == tid) {
+		if (journal->j_commit_request != tid) {
+			/* transaction not yet started, so request it */
+			read_unlock(&journal->j_state_lock);
+			jbd2_log_start_commit(journal, tid);
+			goto wait_commit;
+		}
+	} else if (!(journal->j_committing_transaction &&
+		     journal->j_committing_transaction->t_tid == tid))
+		need_to_wait = 0;
+	read_unlock(&journal->j_state_lock);
+	if (!need_to_wait)
+		return 0;
+wait_commit:
+	return jbd2_log_wait_commit(journal, tid);
+}
+EXPORT_SYMBOL(jbd2_complete_transaction);
+
+/*
  * Log buffer allocation routines:
  */
 
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1178,6 +1178,7 @@ int __jbd2_log_start_commit(journal_t *j
 int jbd2_journal_start_commit(journal_t *journal, tid_t *tid);
 int jbd2_journal_force_commit_nested(journal_t *journal);
 int jbd2_log_wait_commit(journal_t *journal, tid_t tid);
+int jbd2_complete_transaction(journal_t *journal, tid_t tid);
 int jbd2_log_do_checkpoint(journal_t *journal);
 int jbd2_trans_will_send_data_barrier(journal_t *journal, tid_t tid);
 



  parent reply	other threads:[~2014-03-08  1:07 UTC|newest]

Thread overview: 123+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-08  1:06 [PATCH 3.4 00/99] 3.4.83-stable review Greg Kroah-Hartman
2014-03-08  1:06 ` [PATCH 3.4 01/99] ext4: dont try to modify s_flags if the the file system is read-only Greg Kroah-Hartman
2014-03-08  1:06 ` [PATCH 3.4 02/99] ext4: fix online resize with a non-standard blocks per group setting Greg Kroah-Hartman
2014-03-08  1:06 ` [PATCH 3.4 03/99] ext4: dont leave i_crtime.tv_sec uninitialized Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 04/99] ARM: 7953/1: mm: ensure TLB invalidation is complete before enabling MMU Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 05/99] ARM: 7957/1: add DSB after icache flush in __flush_icache_all() Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 06/99] avr32: fix missing module.h causing build failure in mimc200/fram.c Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 07/99] avr32: Makefile: add -D__linux__ flag for gcc-4.4.7 use Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 08/99] cifs: ensure that uncached writes handle unmapped areas correctly Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 09/99] rtl8187: fix regression on MIPS without coherent DMA Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 10/99] rtlwifi: Fix incorrect return from rtl_ps_enable_nic() Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 11/99] rtlwifi: rtl8192ce: Fix too long disable of IRQs Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 13/99] tg3: Fix deadlock in tg3_change_mtu() Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 14/99] bonding: 802.3ad: make aggregator_identifier bond-private Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 15/99] usbnet: remove generic hard_header_len check Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 16/99] net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 17/99] net: add and use skb_gso_transport_seglen() Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 18/99] net: ip, ipv6: handle gso skbs in forwarding path Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 19/99] ALSA: usb-audio: work around KEF X300A firmware bug Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 20/99] ASoC: wm8770: Fix wrong number of enum items Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 22/99] ASoC: sta32x: Fix array access overflow Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 23/99] ASoC: wm8958-dsp: Fix firmware block loading Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 24/99] SUNRPC: Fix races in xs_nospace() Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 25/99] powerpc/le: Ensure that the stop-self RTAS token is handled correctly Greg Kroah-Hartman
2014-03-10 10:40   ` Luís Henriques
2014-03-11 23:08     ` Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 26/99] powerpc/crashdump : Fix page frame number check in copy_oldmem_page Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 27/99] perf/x86: Fix event scheduling Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 28/99] ata: enable quirk from jmicron JMB350 for JMB394 Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 29/99] sata_sil: apply MOD15WRITE quirk to TOSHIBA MK2561GSYN Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 30/99] PCI: Enable INTx if BIOS left them disabled Greg Kroah-Hartman
2014-03-08 13:50   ` Bjorn Helgaas
2014-03-11 23:08     ` Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 31/99] i7core_edac: Fix PCI device reference count Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 32/99] ACPI / video: Filter the _BCL table for duplicate brightness values Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 33/99] ACPI / processor: Rework processor throttling with work_on_cpu() Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 34/99] USB: serial: option: blacklist interface 4 for Cinterion PHS8 and PXS8 Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 35/99] USB: ftdi_sio: add Cressi Leonardo PID Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 36/99] hwmon: (max1668) Fix writing the minimum temperature Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 37/99] workqueue: ensure @task is valid across kthread_stop() Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 38/99] perf: Fix hotplug splat Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 39/99] SELinux: bigendian problems with filename trans rules Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 40/99] quota: Fix race between dqput() and dquot_scan_active() Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 41/99] dma: ste_dma40: dont dereference free:d descriptor Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 42/99] dm mpath: fix stalls when handling invalid ioctls Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 43/99] mm: vmscan: fix endless loop in kswapd balancing Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 44/99] cgroup: cgroup_subsys->fork() should be called after the task is added to css_set Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 45/99] KVM: s390: move kvm_guest_enter,exit closer to sie Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 46/99] s390/kvm: dont announce RRBM support Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 47/99] KVM: PPC: Emulate dcbf Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 48/99] KVM: IOMMU: hva align mapping page size Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 49/99] proc connector: reject unprivileged listener bumps Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 50/99] cgroup: fix RCU accesses to task->cgroups Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 51/99] mm/hotplug: correctly add new zone to all other nodes zone lists Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 52/99] perf tools: Remove extraneous newline when parsing hardware cache events Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 53/99] perf tools: Fix cache event name generation Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 54/99] nilfs2: fix issue with race condition of competition between segments for dirty blocks Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 55/99] fuse: readdir: check for slash in names Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 56/99] fuse: hotfix truncate_pagecache() issue Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 57/99] libceph: unregister request in __map_request failed and nofail == false Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 58/99] cifs: dont instantiate new dentries in readdir for inodes that need to be revalidated immediately Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 59/99] ncpfs: fix rmdir returns Device or resource busy Greg Kroah-Hartman
2014-03-08  1:07 ` Greg Kroah-Hartman [this message]
2014-03-08  1:07 ` [PATCH 3.4 61/99] UBIFS: fix double free of ubifs_orphan objects Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 62/99] ext4: fix possible use-after-free with AIO Greg Kroah-Hartman
2014-03-08  1:07 ` [PATCH 3.4 63/99] cifs: adjust sequence number downward after signing NT_CANCEL request Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 64/99] nbd: correct disconnect behavior Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 65/99] block: Dont access request after it might be freed Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 66/99] ext4: return ENOMEM if sb_getblk() fails Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 67/99] [media] saa7134: Fix unlocked snd_pcm_stop() call Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 68/99] xen/boot: Disable BIOS SMP MP table search Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 69/99] xen/smp: Fix leakage of timer interrupt line for every CPU online/offline Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 70/99] xen/smp/spinlock: Fix leakage of the spinlock " Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 71/99] xen-netback: fix sparse warning Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 72/99] xen-netback: coalesce slots in TX path and fix regressions Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 73/99] xen-netback: dont disconnect frontend when seeing oversize packet Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 74/99] xen/io/ring.h: new macro to detect whether there are too many requests on the ring Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 75/99] xen/blkback: Check for insane amounts of request on the ring (v6) Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 76/99] xen/events: mask events when changing their VCPU binding Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 77/99] sunrpc: clarify comments on rpc_make_runnable Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 78/99] SUNRPC: Prevent an rpc_task wakeup race Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 79/99] ASoC: imx-ssi: Fix occasional AC97 reset failure Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 80/99] ASoC: sglt5000: Fix the default value of CHIP_SSS_CTRL Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 81/99] ALSA: atiixp: Fix unlocked snd_pcm_stop() call Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 82/99] ALSA: 6fire: " Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 83/99] ALSA: ua101: " Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 84/99] ALSA: usx2y: " Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 85/99] ALSA: pxa2xx: " Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 86/99] ASoC: s6000: " Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 87/99] staging: line6: " Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 88/99] ALSA: asihpi: " Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 89/99] iwlwifi: fix flow handler debug code Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 90/99] iwlwifi: protect SRAM debugfs Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 91/99] iwlwifi: dont handle masked interrupt Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 92/99] iwlwifi: handle DMA mapping failures Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 93/99] iwlwifi: always copy first 16 bytes of commands Greg Kroah-Hartman
2014-03-22 14:19   ` Andreas Sturmlechner
2014-03-22 16:25     ` Greg Kroah-Hartman
2014-03-22 16:28       ` Andreas Sturmlechner
2014-03-22 16:51         ` Greg Kroah-Hartman
2014-03-22 17:38           ` Ben Hutchings
2014-03-22 18:43             ` Grumbach, Emmanuel
2014-03-22 21:01             ` Andreas Sturmlechner
2014-03-25  2:55               ` Ben Hutchings
2014-03-25  9:29                 ` Andreas Sturmlechner
2014-03-25 12:05                   ` Jianguo Wu
2014-03-25 17:28                 ` [PATCH 3.4] iwlwifi: Complete backport of "iwlwifi: always copy first 16 bytes of commands" Ben Hutchings
2014-03-25 18:16                   ` Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 94/99] iwlwifi: dvm: dont send BT_CONFIG on devices w/o Bluetooth Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 95/99] iwlwifi: dvm: fix calling ieee80211_chswitch_done() with NULL Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 96/99] iwlwifi: pcie: add SKUs for 6000, 6005 and 6235 series Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 97/99] rtlwifi: Fix endian error in extracting packet type Greg Kroah-Hartman
2014-03-08  1:08 ` [PATCH 3.4 98/99] net: asix: handle packets crossing URB boundaries Greg Kroah-Hartman
2014-03-08  9:47 ` [PATCH 3.4 00/99] 3.4.83-stable review Satoru Takeuchi
2014-03-08 14:35   ` Guenter Roeck
2014-03-08 16:18     ` Greg Kroah-Hartman
2014-03-08 17:10       ` Guenter Roeck
2014-03-08 20:50         ` Satoru Takeuchi
2014-03-09  4:18           ` Shuah Khan
2014-03-12  0:05             ` Greg Kroah-Hartman
2014-03-12  0:05           ` Greg Kroah-Hartman
2014-03-12  0:04         ` Greg Kroah-Hartman
2014-03-12  2:34           ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140308010613.516854000@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=ben@decadent.org.uk \
    --cc=gbarnett@atlassian.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rui.xiang@huawei.com \
    --cc=stable@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).