From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, "Theodore Tso" <tytso@mit.edu>,
Ben Hutchings <ben@decadent.org.uk>,
George Barnett <gbarnett@atlassian.com>,
Rui Xiang <rui.xiang@huawei.com>
Subject: [PATCH 3.4 60/99] ext4/jbd2: dont wait (forever) for stale tid caused by wraparound
Date: Fri, 7 Mar 2014 17:07:56 -0800 [thread overview]
Message-ID: <20140308010613.516854000@linuxfoundation.org> (raw)
In-Reply-To: <20140308010611.468206150@linuxfoundation.org>
3.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Theodore Ts'o <tytso@mit.edu>
commit d76a3a77113db020d9bb1e894822869410450bd9 upstream.
In the case where an inode has a very stale transaction id (tid) in
i_datasync_tid or i_sync_tid, it's possible that after a very large
(2**31) number of transactions, that the tid number space might wrap,
causing tid_geq()'s calculations to fail.
Commit deeeaf13 "jbd2: fix fsync() tid wraparound bug", later modified
by commit e7b04ac0 "jbd2: don't wake kjournald unnecessarily",
attempted to fix this problem, but it only avoided kjournald spinning
forever by fixing the logic in jbd2_log_start_commit().
Unfortunately, in the codepaths in fs/ext4/fsync.c and fs/ext4/inode.c
that might call jbd2_log_start_commit() with a stale tid, those
functions will subsequently call jbd2_log_wait_commit() with the same
stale tid, and then wait for a very long time. To fix this, we
replace the calls to jbd2_log_start_commit() and
jbd2_log_wait_commit() with a call to a new function,
jbd2_complete_transaction(), which will correctly handle stale tid's.
As a bonus, jbd2_complete_transaction() will avoid locking
j_state_lock for writing unless a commit needs to be started. This
should have a small (but probably not measurable) improvement for
ext4's scalability.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Reported-by: George Barnett <gbarnett@atlassian.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Rui Xiang <rui.xiang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/ext4/fsync.c | 3 +--
fs/ext4/inode.c | 3 +--
fs/jbd2/journal.c | 31 +++++++++++++++++++++++++++++++
include/linux/jbd2.h | 1 +
4 files changed, 34 insertions(+), 4 deletions(-)
--- a/fs/ext4/fsync.c
+++ b/fs/ext4/fsync.c
@@ -260,8 +260,7 @@ int ext4_sync_file(struct file *file, lo
if (journal->j_flags & JBD2_BARRIER &&
!jbd2_trans_will_send_data_barrier(journal, commit_tid))
needs_barrier = true;
- jbd2_log_start_commit(journal, commit_tid);
- ret = jbd2_log_wait_commit(journal, commit_tid);
+ ret = jbd2_complete_transaction(journal, commit_tid);
if (needs_barrier)
blkdev_issue_flush(inode->i_sb->s_bdev, GFP_KERNEL, NULL);
out:
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -149,8 +149,7 @@ void ext4_evict_inode(struct inode *inod
journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
tid_t commit_tid = EXT4_I(inode)->i_datasync_tid;
- jbd2_log_start_commit(journal, commit_tid);
- jbd2_log_wait_commit(journal, commit_tid);
+ jbd2_complete_transaction(journal, commit_tid);
filemap_write_and_wait(&inode->i_data);
}
truncate_inode_pages(&inode->i_data, 0);
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -662,6 +662,37 @@ int jbd2_log_wait_commit(journal_t *jour
}
/*
+ * When this function returns the transaction corresponding to tid
+ * will be completed. If the transaction has currently running, start
+ * committing that transaction before waiting for it to complete. If
+ * the transaction id is stale, it is by definition already completed,
+ * so just return SUCCESS.
+ */
+int jbd2_complete_transaction(journal_t *journal, tid_t tid)
+{
+ int need_to_wait = 1;
+
+ read_lock(&journal->j_state_lock);
+ if (journal->j_running_transaction &&
+ journal->j_running_transaction->t_tid == tid) {
+ if (journal->j_commit_request != tid) {
+ /* transaction not yet started, so request it */
+ read_unlock(&journal->j_state_lock);
+ jbd2_log_start_commit(journal, tid);
+ goto wait_commit;
+ }
+ } else if (!(journal->j_committing_transaction &&
+ journal->j_committing_transaction->t_tid == tid))
+ need_to_wait = 0;
+ read_unlock(&journal->j_state_lock);
+ if (!need_to_wait)
+ return 0;
+wait_commit:
+ return jbd2_log_wait_commit(journal, tid);
+}
+EXPORT_SYMBOL(jbd2_complete_transaction);
+
+/*
* Log buffer allocation routines:
*/
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1178,6 +1178,7 @@ int __jbd2_log_start_commit(journal_t *j
int jbd2_journal_start_commit(journal_t *journal, tid_t *tid);
int jbd2_journal_force_commit_nested(journal_t *journal);
int jbd2_log_wait_commit(journal_t *journal, tid_t tid);
+int jbd2_complete_transaction(journal_t *journal, tid_t tid);
int jbd2_log_do_checkpoint(journal_t *journal);
int jbd2_trans_will_send_data_barrier(journal_t *journal, tid_t tid);
next prev parent reply other threads:[~2014-03-08 1:07 UTC|newest]
Thread overview: 123+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-03-08 1:06 [PATCH 3.4 00/99] 3.4.83-stable review Greg Kroah-Hartman
2014-03-08 1:06 ` [PATCH 3.4 01/99] ext4: dont try to modify s_flags if the the file system is read-only Greg Kroah-Hartman
2014-03-08 1:06 ` [PATCH 3.4 02/99] ext4: fix online resize with a non-standard blocks per group setting Greg Kroah-Hartman
2014-03-08 1:06 ` [PATCH 3.4 03/99] ext4: dont leave i_crtime.tv_sec uninitialized Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 04/99] ARM: 7953/1: mm: ensure TLB invalidation is complete before enabling MMU Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 05/99] ARM: 7957/1: add DSB after icache flush in __flush_icache_all() Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 06/99] avr32: fix missing module.h causing build failure in mimc200/fram.c Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 07/99] avr32: Makefile: add -D__linux__ flag for gcc-4.4.7 use Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 08/99] cifs: ensure that uncached writes handle unmapped areas correctly Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 09/99] rtl8187: fix regression on MIPS without coherent DMA Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 10/99] rtlwifi: Fix incorrect return from rtl_ps_enable_nic() Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 11/99] rtlwifi: rtl8192ce: Fix too long disable of IRQs Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 13/99] tg3: Fix deadlock in tg3_change_mtu() Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 14/99] bonding: 802.3ad: make aggregator_identifier bond-private Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 15/99] usbnet: remove generic hard_header_len check Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 16/99] net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 17/99] net: add and use skb_gso_transport_seglen() Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 18/99] net: ip, ipv6: handle gso skbs in forwarding path Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 19/99] ALSA: usb-audio: work around KEF X300A firmware bug Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 20/99] ASoC: wm8770: Fix wrong number of enum items Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 22/99] ASoC: sta32x: Fix array access overflow Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 23/99] ASoC: wm8958-dsp: Fix firmware block loading Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 24/99] SUNRPC: Fix races in xs_nospace() Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 25/99] powerpc/le: Ensure that the stop-self RTAS token is handled correctly Greg Kroah-Hartman
2014-03-10 10:40 ` Luís Henriques
2014-03-11 23:08 ` Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 26/99] powerpc/crashdump : Fix page frame number check in copy_oldmem_page Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 27/99] perf/x86: Fix event scheduling Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 28/99] ata: enable quirk from jmicron JMB350 for JMB394 Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 29/99] sata_sil: apply MOD15WRITE quirk to TOSHIBA MK2561GSYN Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 30/99] PCI: Enable INTx if BIOS left them disabled Greg Kroah-Hartman
2014-03-08 13:50 ` Bjorn Helgaas
2014-03-11 23:08 ` Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 31/99] i7core_edac: Fix PCI device reference count Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 32/99] ACPI / video: Filter the _BCL table for duplicate brightness values Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 33/99] ACPI / processor: Rework processor throttling with work_on_cpu() Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 34/99] USB: serial: option: blacklist interface 4 for Cinterion PHS8 and PXS8 Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 35/99] USB: ftdi_sio: add Cressi Leonardo PID Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 36/99] hwmon: (max1668) Fix writing the minimum temperature Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 37/99] workqueue: ensure @task is valid across kthread_stop() Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 38/99] perf: Fix hotplug splat Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 39/99] SELinux: bigendian problems with filename trans rules Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 40/99] quota: Fix race between dqput() and dquot_scan_active() Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 41/99] dma: ste_dma40: dont dereference free:d descriptor Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 42/99] dm mpath: fix stalls when handling invalid ioctls Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 43/99] mm: vmscan: fix endless loop in kswapd balancing Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 44/99] cgroup: cgroup_subsys->fork() should be called after the task is added to css_set Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 45/99] KVM: s390: move kvm_guest_enter,exit closer to sie Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 46/99] s390/kvm: dont announce RRBM support Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 47/99] KVM: PPC: Emulate dcbf Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 48/99] KVM: IOMMU: hva align mapping page size Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 49/99] proc connector: reject unprivileged listener bumps Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 50/99] cgroup: fix RCU accesses to task->cgroups Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 51/99] mm/hotplug: correctly add new zone to all other nodes zone lists Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 52/99] perf tools: Remove extraneous newline when parsing hardware cache events Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 53/99] perf tools: Fix cache event name generation Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 54/99] nilfs2: fix issue with race condition of competition between segments for dirty blocks Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 55/99] fuse: readdir: check for slash in names Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 56/99] fuse: hotfix truncate_pagecache() issue Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 57/99] libceph: unregister request in __map_request failed and nofail == false Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 58/99] cifs: dont instantiate new dentries in readdir for inodes that need to be revalidated immediately Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 59/99] ncpfs: fix rmdir returns Device or resource busy Greg Kroah-Hartman
2014-03-08 1:07 ` Greg Kroah-Hartman [this message]
2014-03-08 1:07 ` [PATCH 3.4 61/99] UBIFS: fix double free of ubifs_orphan objects Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 62/99] ext4: fix possible use-after-free with AIO Greg Kroah-Hartman
2014-03-08 1:07 ` [PATCH 3.4 63/99] cifs: adjust sequence number downward after signing NT_CANCEL request Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 64/99] nbd: correct disconnect behavior Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 65/99] block: Dont access request after it might be freed Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 66/99] ext4: return ENOMEM if sb_getblk() fails Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 67/99] [media] saa7134: Fix unlocked snd_pcm_stop() call Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 68/99] xen/boot: Disable BIOS SMP MP table search Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 69/99] xen/smp: Fix leakage of timer interrupt line for every CPU online/offline Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 70/99] xen/smp/spinlock: Fix leakage of the spinlock " Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 71/99] xen-netback: fix sparse warning Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 72/99] xen-netback: coalesce slots in TX path and fix regressions Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 73/99] xen-netback: dont disconnect frontend when seeing oversize packet Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 74/99] xen/io/ring.h: new macro to detect whether there are too many requests on the ring Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 75/99] xen/blkback: Check for insane amounts of request on the ring (v6) Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 76/99] xen/events: mask events when changing their VCPU binding Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 77/99] sunrpc: clarify comments on rpc_make_runnable Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 78/99] SUNRPC: Prevent an rpc_task wakeup race Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 79/99] ASoC: imx-ssi: Fix occasional AC97 reset failure Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 80/99] ASoC: sglt5000: Fix the default value of CHIP_SSS_CTRL Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 81/99] ALSA: atiixp: Fix unlocked snd_pcm_stop() call Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 82/99] ALSA: 6fire: " Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 83/99] ALSA: ua101: " Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 84/99] ALSA: usx2y: " Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 85/99] ALSA: pxa2xx: " Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 86/99] ASoC: s6000: " Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 87/99] staging: line6: " Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 88/99] ALSA: asihpi: " Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 89/99] iwlwifi: fix flow handler debug code Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 90/99] iwlwifi: protect SRAM debugfs Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 91/99] iwlwifi: dont handle masked interrupt Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 92/99] iwlwifi: handle DMA mapping failures Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 93/99] iwlwifi: always copy first 16 bytes of commands Greg Kroah-Hartman
2014-03-22 14:19 ` Andreas Sturmlechner
2014-03-22 16:25 ` Greg Kroah-Hartman
2014-03-22 16:28 ` Andreas Sturmlechner
2014-03-22 16:51 ` Greg Kroah-Hartman
2014-03-22 17:38 ` Ben Hutchings
2014-03-22 18:43 ` Grumbach, Emmanuel
2014-03-22 21:01 ` Andreas Sturmlechner
2014-03-25 2:55 ` Ben Hutchings
2014-03-25 9:29 ` Andreas Sturmlechner
2014-03-25 12:05 ` Jianguo Wu
2014-03-25 17:28 ` [PATCH 3.4] iwlwifi: Complete backport of "iwlwifi: always copy first 16 bytes of commands" Ben Hutchings
2014-03-25 18:16 ` Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 94/99] iwlwifi: dvm: dont send BT_CONFIG on devices w/o Bluetooth Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 95/99] iwlwifi: dvm: fix calling ieee80211_chswitch_done() with NULL Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 96/99] iwlwifi: pcie: add SKUs for 6000, 6005 and 6235 series Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 97/99] rtlwifi: Fix endian error in extracting packet type Greg Kroah-Hartman
2014-03-08 1:08 ` [PATCH 3.4 98/99] net: asix: handle packets crossing URB boundaries Greg Kroah-Hartman
2014-03-08 9:47 ` [PATCH 3.4 00/99] 3.4.83-stable review Satoru Takeuchi
2014-03-08 14:35 ` Guenter Roeck
2014-03-08 16:18 ` Greg Kroah-Hartman
2014-03-08 17:10 ` Guenter Roeck
2014-03-08 20:50 ` Satoru Takeuchi
2014-03-09 4:18 ` Shuah Khan
2014-03-12 0:05 ` Greg Kroah-Hartman
2014-03-12 0:05 ` Greg Kroah-Hartman
2014-03-12 0:04 ` Greg Kroah-Hartman
2014-03-12 2:34 ` Guenter Roeck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140308010613.516854000@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=ben@decadent.org.uk \
--cc=gbarnett@atlassian.com \
--cc=linux-kernel@vger.kernel.org \
--cc=rui.xiang@huawei.com \
--cc=stable@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).