From: Ben Hutchings <ben@decadent.org.uk>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: akpm@linux-foundation.org,
	"Andreas Rohner" <andreas.rohner@gmx.net>,
	"Ryusuke Konishi" <konishi.ryusuke@lab.ntt.co.jp>,
	"Linus Torvalds" <torvalds@linux-foundation.org>
Subject: [PATCH 3.2 53/79] nilfs2: fix race condition that causes file system corruption
Date: Sun, 11 Feb 2018 04:20:06 +0000
Message-ID: <lsq.1518322806.998583910@decadent.org.uk>
In-Reply-To: <lsq.1518322802.943358572@decadent.org.uk>

3.2.99-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Andreas Rohner <andreas.rohner@gmx.net>

commit 31ccb1f7ba3cfe29631587d451cf5bb8ab593550 upstream.

There is a race condition between nilfs_dirty_inode() and
nilfs_set_file_dirty().

When a file is opened, nilfs_dirty_inode() is called to update the
access timestamp in the inode.  It calls __nilfs_mark_inode_dirty() in a
separate transaction.  __nilfs_mark_inode_dirty() caches the ifile
buffer_head in the i_bh field of the inode info structure and marks it
as dirty.

After some data was written to the file in another transaction, the
function nilfs_set_file_dirty() is called, which adds the inode to the
ns_dirty_files list.

Then the segment construction calls nilfs_segctor_collect_dirty_files(),
which goes through the ns_dirty_files list and checks the i_bh field.
If there is a cached buffer_head in i_bh it is not marked as dirty
again.

Since nilfs_dirty_inode() and nilfs_set_file_dirty() use separate
transactions, it is possible that a segment construction that writes out
the ifile occurs in-between the two.  If this happens the inode is not
on the ns_dirty_files list, but its ifile block is still marked as dirty
and written out.

In the next segment construction, the data for the file is written out
and nilfs_bmap_propagate() updates the b-tree.  Eventually the bmap root
is written into the i_bh block, which is not dirty, because it was
written out in another segment construction.

As a result the bmap update can be lost, which leads to file system
corruption.  Either the virtual block address points to an unallocated
DAT block, or the DAT entry will be reused for something different.

The error can remain undetected for a long time.  A typical error
message would be one of the "bad btree" errors or a warning that a DAT
entry could not be found.

This bug can be reproduced reliably by a simple benchmark that creates
and overwrites millions of 4k files.

Link: http://lkml.kernel.org/r/1509367935-3086-2-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Andreas Rohner <andreas.rohner@gmx.net>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
fs/nilfs2/segment.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -1880,8 +1880,6 @@ static int nilfs_segctor_collect_dirty_f
 					      "failed to get inode block.\n");
 				return err;
 			}
-			mark_buffer_dirty(ibh);
-			nilfs_mdt_mark_dirty(ifile);
 			spin_lock(&nilfs->ns_inode_lock);
 			if (likely(!ii->i_bh))
 				ii->i_bh = ibh;
@@ -1890,6 +1888,10 @@ static int nilfs_segctor_collect_dirty_f
 			goto retry;
 		}
 
+		// Always redirty the buffer to avoid race condition
+		mark_buffer_dirty(ii->i_bh);
+		nilfs_mdt_mark_dirty(ifile);
+
 		clear_bit(NILFS_I_QUEUED, &ii->i_state);
 		set_bit(NILFS_I_BUSY, &ii->i_state);
 		list_move_tail(&ii->i_dirty, &sci->sc_dirty_files);
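The ordering described in the commit message can be walked through with
a small user-space model.  This is only a sketch: struct toy_buffer,
struct toy_inode and the toy_*() functions are hypothetical stand-ins
rather than kernel types, and a single integer stands for the bmap root
stored in the ifile block.  The 'fixed' flag selects between the old
placement of the dirty marking (only when i_bh is first cached) and the
new one (always).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the ifile buffer_head. */
struct toy_buffer {
	bool dirty;
	int bmap_root;		/* in-memory contents */
	int on_disk_root;	/* what the last segment write stored */
};

/* Hypothetical stand-in for struct nilfs_inode_info. */
struct toy_inode {
	struct toy_buffer *i_bh;
	bool on_dirty_list;
};

/* A segment construction writes the block only if it is dirty. */
static void toy_segment_write(struct toy_buffer *bh)
{
	if (bh->dirty) {
		bh->on_disk_root = bh->bmap_root;
		bh->dirty = false;
	}
}

/* Models nilfs_segctor_collect_dirty_files() for a single inode. */
static void toy_collect(struct toy_inode *ii, struct toy_buffer *ibh,
			bool fixed)
{
	if (!ii->i_bh) {
		if (!fixed)
			ibh->dirty = true;	/* old: only on first caching */
		ii->i_bh = ibh;
	}
	if (fixed)
		ii->i_bh->dirty = true;		/* new: always redirty */
}

/* Replays the racy ordering; returns the bmap root that reached disk. */
static int toy_run(bool fixed)
{
	struct toy_buffer bh = { .dirty = false, .bmap_root = 1,
				 .on_disk_root = 1 };
	struct toy_inode ii = { .i_bh = NULL, .on_dirty_list = false };

	/* 1. nilfs_dirty_inode(): cache the ifile buffer, mark it dirty */
	ii.i_bh = &bh;
	bh.dirty = true;

	/* 2. a segment construction runs in between and cleans the block */
	toy_segment_write(&bh);

	/* 3. nilfs_set_file_dirty(): inode joins ns_dirty_files */
	ii.on_dirty_list = true;

	/* 4. next segment construction: collect, update bmap root, write */
	toy_collect(&ii, &bh, fixed);
	bh.bmap_root = 2;
	toy_segment_write(&bh);

	return bh.on_disk_root;
}
```

In this model toy_run(false) returns the stale root 1, because the
block is clean when the new root is stored in it, while toy_run(true)
returns 2.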