From: Luis Henriques <luis.henriques@canonical.com>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org,
kernel-team@lists.ubuntu.com
Cc: Jan Kara <jack@suse.cz>, "Theodore Ts'o" <tytso@mit.edu>,
Luis Henriques <luis.henriques@canonical.com>
Subject: [PATCH 3.11 06/70] ext4: fix jbd2 warning under heavy xattr load
Date: Wed, 7 May 2014 14:12:09 +0100
Message-ID: <1399468393-10140-7-git-send-email-luis.henriques@canonical.com>
In-Reply-To: <1399468393-10140-1-git-send-email-luis.henriques@canonical.com>
3.11.10.10 -stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara <jack@suse.cz>
commit ec4cb1aa2b7bae18dd8164f2e9c7c51abcf61280 upstream.
When heavily exercising the xattr code, the assertion that
jbd2_journal_dirty_metadata() should not return an error was triggered:
WARNING: at /srv/autobuild-ceph/gitbuilder.git/build/fs/jbd2/transaction.c:1237 jbd2_journal_dirty_metadata+0x1ba/0x260()
CPU: 0 PID: 8877 Comm: ceph-osd Tainted: G W 3.10.0-ceph-00049-g68d04c9 #1
Hardware name: Dell Inc. PowerEdge R410/01V648, BIOS 1.6.3 02/07/2011
ffffffff81a1d3c8 ffff880214469928 ffffffff816311b0 ffff880214469968
ffffffff8103fae0 ffff880214469958 ffff880170a9dc30 ffff8802240fbe80
0000000000000000 ffff88020b366000 ffff8802256e7510 ffff880214469978
Call Trace:
[<ffffffff816311b0>] dump_stack+0x19/0x1b
[<ffffffff8103fae0>] warn_slowpath_common+0x70/0xa0
[<ffffffff8103fb2a>] warn_slowpath_null+0x1a/0x20
[<ffffffff81267c2a>] jbd2_journal_dirty_metadata+0x1ba/0x260
[<ffffffff81245093>] __ext4_handle_dirty_metadata+0xa3/0x140
[<ffffffff812561f3>] ext4_xattr_release_block+0x103/0x1f0
[<ffffffff81256680>] ext4_xattr_block_set+0x1e0/0x910
[<ffffffff8125795b>] ext4_xattr_set_handle+0x38b/0x4a0
[<ffffffff810a319d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff81257b32>] ext4_xattr_set+0xc2/0x140
[<ffffffff81258547>] ext4_xattr_user_set+0x47/0x50
[<ffffffff811935ce>] generic_setxattr+0x6e/0x90
[<ffffffff81193ecb>] __vfs_setxattr_noperm+0x7b/0x1c0
[<ffffffff811940d4>] vfs_setxattr+0xc4/0xd0
[<ffffffff8119421e>] setxattr+0x13e/0x1e0
[<ffffffff811719c7>] ? __sb_start_write+0xe7/0x1b0
[<ffffffff8118f2e8>] ? mnt_want_write_file+0x28/0x60
[<ffffffff8118c65c>] ? fget_light+0x3c/0x130
[<ffffffff8118f2e8>] ? mnt_want_write_file+0x28/0x60
[<ffffffff8118f1f8>] ? __mnt_want_write+0x58/0x70
[<ffffffff811946be>] SyS_fsetxattr+0xbe/0x100
[<ffffffff816407c2>] system_call_fastpath+0x16/0x1b
The reason for the warning is that the buffer_head passed into
jbd2_journal_dirty_metadata() didn't have a journal_head attached. This
is caused by the following race between two ext4_xattr_release_block()
calls:
CPU1                                CPU2
ext4_xattr_release_block()          ext4_xattr_release_block()
lock_buffer(bh);
/* False */
if (BHDR(bh)->h_refcount == cpu_to_le32(1))
} else {
        le32_add_cpu(&BHDR(bh)->h_refcount, -1);
        unlock_buffer(bh);
                                    lock_buffer(bh);
                                    /* True */
                                    if (BHDR(bh)->h_refcount == cpu_to_le32(1))
                                            get_bh(bh);
                                            ext4_free_blocks()
                                              ...
                                              jbd2_journal_forget()
                                                jbd2_journal_unfile_buffer()
                                                -> JH is gone
        error = ext4_handle_dirty_xattr_block(handle, inode, bh);
        -> triggers the warning
We fix the problem by moving ext4_handle_dirty_xattr_block() under the
buffer lock. Sadly this cannot be done in nojournal mode as that
function can call sync_dirty_buffer() which would deadlock. Luckily in
nojournal mode the race is harmless (we only dirty an already freed
buffer) and thus for nojournal mode we leave the dirtying outside of the
buffer lock.
Reported-by: Sage Weil <sage@inktank.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
---
fs/ext4/xattr.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)
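
Not part of the patch: the interleaving described in the changelog can be
replayed deterministically in userspace. The sketch below is only a toy
model under stated assumptions. struct toy_bh, toy_lock(), toy_unlock(),
toy_dirty(), toy_free_last_ref() and toy_replay() are hypothetical names
invented for illustration, not kernel APIs. A plain flag stands in for the
buffer lock, the two CPUs are serialized by hand in the order the changelog
shows, and nojournal mode is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the xattr block's buffer_head; not the kernel struct. */
struct toy_bh {
	unsigned int refcount;	/* models BHDR(bh)->h_refcount */
	bool locked;		/* models the buffer lock */
	bool has_jh;		/* models an attached journal_head */
};

static void toy_lock(struct toy_bh *bh)
{
	assert(!bh->locked);	/* the replay never contends on the lock */
	bh->locked = true;
}

static void toy_unlock(struct toy_bh *bh)
{
	assert(bh->locked);
	bh->locked = false;
}

/* Models the dirtying call: it fails when no journal_head is attached. */
static int toy_dirty(const struct toy_bh *bh)
{
	return bh->has_jh ? 0 : -1;
}

/* Models the "h_refcount == 1" branch taken by the last user (CPU2). */
static void toy_free_last_ref(struct toy_bh *bh)
{
	toy_lock(bh);
	assert(bh->refcount == 1);
	toy_unlock(bh);
	bh->has_jh = false;	/* freeing the block drops the journal_head */
}

/*
 * Replays the changelog's interleaving step by step.
 * Returns the result of CPU1's dirtying call: 0 is fine, -1 is the
 * condition that triggered the warning.
 */
int toy_replay(bool dirty_under_lock)
{
	struct toy_bh bh = { .refcount = 2, .locked = false, .has_jh = true };
	int error = 0;

	/* CPU1: refcount is 2, so it takes the "else" branch. */
	toy_lock(&bh);
	bh.refcount--;
	if (dirty_under_lock)
		error = toy_dirty(&bh);	/* patched order: CPU2 is still locked out */
	toy_unlock(&bh);

	/* CPU2: now sees refcount == 1 and frees the block. */
	toy_free_last_ref(&bh);

	if (!dirty_under_lock)
		error = toy_dirty(&bh);	/* old order: journal_head is already gone */
	return error;
}
```

In this model toy_replay(false) returns -1 (the old ordering hits the
missing journal_head) and toy_replay(true) returns 0 (dirtying under the
lock completes before the other releaser can free the block). Because the
interleaving is fixed by hand, this illustrates the ordering argument only.
It says nothing about other interleavings.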
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 1423c48..298e9c8 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -517,8 +517,8 @@ static void ext4_xattr_update_super_block(handle_t *handle,
 }
 
 /*
- * Release the xattr block BH: If the reference count is > 1, decrement
- * it; otherwise free the block.
+ * Release the xattr block BH: If the reference count is > 1, decrement it;
+ * otherwise free the block.
  */
 static void
 ext4_xattr_release_block(handle_t *handle, struct inode *inode,
@@ -538,16 +538,31 @@ ext4_xattr_release_block(handle_t *handle, struct inode *inode,
 		if (ce)
 			mb_cache_entry_free(ce);
 		get_bh(bh);
+		unlock_buffer(bh);
 		ext4_free_blocks(handle, inode, bh, 0, 1,
 				 EXT4_FREE_BLOCKS_METADATA |
 				 EXT4_FREE_BLOCKS_FORGET);
-		unlock_buffer(bh);
 	} else {
 		le32_add_cpu(&BHDR(bh)->h_refcount, -1);
 		if (ce)
 			mb_cache_entry_release(ce);
+		/*
+		 * Beware of this ugliness: Releasing of xattr block references
+		 * from different inodes can race and so we have to protect
+		 * from a race where someone else frees the block (and releases
+		 * its journal_head) before we are done dirtying the buffer. In
+		 * nojournal mode this race is harmless and we actually cannot
+		 * call ext4_handle_dirty_xattr_block() with locked buffer as
+		 * that function can call sync_dirty_buffer() so for that case
+		 * we handle the dirtying after unlocking the buffer.
+		 */
+		if (ext4_handle_valid(handle))
+			error = ext4_handle_dirty_xattr_block(handle, inode,
+							      bh);
 		unlock_buffer(bh);
-		error = ext4_handle_dirty_xattr_block(handle, inode, bh);
+		if (!ext4_handle_valid(handle))
+			error = ext4_handle_dirty_xattr_block(handle, inode,
+							      bh);
 		if (IS_SYNC(inode))
 			ext4_handle_sync(handle);
 		dquot_free_block(inode, EXT4_C2B(EXT4_SB(inode->i_sb), 1));
--
1.9.1