stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Junxiao Bi <junxiao.bi@oracle.com>,
	Wengang Wang <wen.gang.wang@oracle.com>,
	Mark Fasheh <mfasheh@suse.de>, Joel Becker <jlbec@evilplan.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 3.10 01/44] ocfs2: fix journal commit deadlock
Date: Tue, 13 Jan 2015 23:23:22 -0800	[thread overview]
Message-ID: <20150114072227.497329446@linuxfoundation.org> (raw)
In-Reply-To: <20150114072227.419663002@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Junxiao Bi <junxiao.bi@oracle.com>

commit 136f49b9171074872f2a14ad0ab10486d1ba13ca upstream.

For buffer write, page lock will be got in write_begin and released in
write_end, in ocfs2_write_end_nolock(), before it unlock the page in
ocfs2_free_write_ctxt(), it calls ocfs2_run_deallocs(), this will ask
for the read lock of journal->j_trans_barrier.  Holding page lock and
ask for journal->j_trans_barrier breaks the locking order.

This will cause a deadlock with journal commit threads, ocfs2cmt will
get write lock of journal->j_trans_barrier first, then it wakes up
kjournald2 to do the commit work, at last it waits until done.  To
commit journal, kjournald2 needs flushing data first, it needs get the
cache page lock.

Since some ocfs2 cluster locks are holding by write process, this
deadlock may hung the whole cluster.

unlock pages before ocfs2_run_deallocs() can fix the locking order, also
put unlock before ocfs2_commit_trans() to make page lock is unlocked
before j_trans_barrier to preserve unlocking order.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ocfs2/aops.c |   16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -917,7 +917,7 @@ void ocfs2_unlock_and_free_pages(struct
 	}
 }
 
-static void ocfs2_free_write_ctxt(struct ocfs2_write_ctxt *wc)
+static void ocfs2_unlock_pages(struct ocfs2_write_ctxt *wc)
 {
 	int i;
 
@@ -938,7 +938,11 @@ static void ocfs2_free_write_ctxt(struct
 		page_cache_release(wc->w_target_page);
 	}
 	ocfs2_unlock_and_free_pages(wc->w_pages, wc->w_num_pages);
+}
 
+static void ocfs2_free_write_ctxt(struct ocfs2_write_ctxt *wc)
+{
+	ocfs2_unlock_pages(wc);
 	brelse(wc->w_di_bh);
 	kfree(wc);
 }
@@ -2060,11 +2064,19 @@ out_write_size:
 	di->i_mtime_nsec = di->i_ctime_nsec = cpu_to_le32(inode->i_mtime.tv_nsec);
 	ocfs2_journal_dirty(handle, wc->w_di_bh);
 
+	/* unlock pages before dealloc since it needs acquiring j_trans_barrier
+	 * lock, or it will cause a deadlock since journal commit threads holds
+	 * this lock and will ask for the page lock when flushing the data.
+	 * put it here to preserve the unlock order.
+	 */
+	ocfs2_unlock_pages(wc);
+
 	ocfs2_commit_trans(osb, handle);
 
 	ocfs2_run_deallocs(osb, &wc->w_dealloc);
 
-	ocfs2_free_write_ctxt(wc);
+	brelse(wc->w_di_bh);
+	kfree(wc);
 
 	return copied;
 }



  reply	other threads:[~2015-01-14  7:23 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-14  7:23 [PATCH 3.10 00/44] 3.10.65-stable review Greg Kroah-Hartman
2015-01-14  7:23 ` Greg Kroah-Hartman [this message]
2015-01-14  7:23 ` [PATCH 3.10 02/44] ath9k_hw: fix hardware queue allocation Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 03/44] ath9k: fix BE/BK queue order Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 04/44] can: peak_usb: fix cleanup sequence order in case of error during init Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 05/44] can: peak_usb: fix memset() usage Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 06/44] swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 07/44] ath5k: fix hardware queue index assignment Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 08/44] ASoC: sigmadsp: Refuse to load firmware files with a non-supported version Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 09/44] ASoC: max98090: Fix ill-defined sidetone route Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 10/44] ASoC: dwc: Ensure FIFOs are flushed to prevent channel swap Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 11/44] PCI: Restore detection of read-only BARs Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 12/44] pstore-ram: Fix hangs by using write-combine mappings Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 13/44] pstore-ram: Allow optional mapping with pgprot_noncached Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 14/44] UBI: Fix invalid vfree() Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 15/44] UBI: Fix double free after do_sync_erase() Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 16/44] iommu/vt-d: Fix an off-by-one bug in __domain_mapping() Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 17/44] HID: i2c-hid: fix race condition reading reports Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 18/44] HID: i2c-hid: prevent buffer overflow in early IRQ Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 19/44] HID: roccat: potential out of bounds in pyra_sysfs_write_settings() Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 20/44] HID: add battery quirk for USB_DEVICE_ID_APPLE_ALU_WIRELESS_2011_ISO keyboard Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 22/44] x86_64, vdso: Fix the vdso address randomization algorithm Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 23/44] x86, vdso: Use asm volatile in __getcpu Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 24/44] driver core: Fix unbalanced device reference in drivers_probe Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 25/44] ALSA: usb-audio: extend KEF X300A FU 10 tweak to Arcam rPAC Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 26/44] ALSA: hda - using uninitialized data Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 27/44] ALSA: hda - Fix wrong gpio_dir & gpio_mask hint setups for IDT/STAC codecs Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 28/44] USB: cdc-acm: check for valid interfaces Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 29/44] genhd: check for int overflow in disk_expand_part_tbl() Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 30/44] cdc-acm: memory leak in error case Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 31/44] writeback: fix a subtle race condition in I_DIRTY clearing Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 32/44] serial: samsung: wait for transfer completion before clock disable Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 33/44] fs: nfsd: Fix signedness bug in compare_blob Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 34/44] nfsd4: fix xdr4 inclusion of escaped char Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 35/44] nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 36/44] scripts/kernel-doc: dont eat struct members with __aligned Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 37/44] ARM: mvebu: disable I/O coherency on non-SMP situations on Armada 370/375/38x/XP Greg Kroah-Hartman
2015-01-14  7:23 ` [PATCH 3.10 38/44] Btrfs: dont delay inode ref updates during log replay Greg Kroah-Hartman
2015-01-14  7:24 ` [PATCH 3.10 39/44] perf/x86/intel/uncore: Make sure only uncore events are collected Greg Kroah-Hartman
2015-01-14  7:24 ` [PATCH 3.10 40/44] perf: Fix events installation during moving group Greg Kroah-Hartman
2015-01-14  7:24 ` [PATCH 3.10 41/44] perf session: Do not fail on processing out of order event Greg Kroah-Hartman
2015-01-14  7:24 ` [PATCH 3.10 42/44] mm, vmscan: prevent kswapd livelock due to pfmemalloc-throttled process being killed Greg Kroah-Hartman
2015-01-14  7:24 ` [PATCH 3.10 43/44] mm: propagate error from stack expansion even for guard page Greg Kroah-Hartman
2015-01-14  7:24 ` [PATCH 3.10 44/44] mm: Dont count the stack guard page towards RLIMIT_STACK Greg Kroah-Hartman
2015-01-14 22:49 ` [PATCH 3.10 00/44] 3.10.65-stable review Shuah Khan
2015-01-15  0:43 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150114072227.497329446@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=jlbec@evilplan.org \
    --cc=junxiao.bi@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mfasheh@suse.de \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=wen.gang.wang@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).