stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Joe Thornber <ejt@redhat.com>,
	Mike Snitzer <snitzer@redhat.com>
Subject: [PATCH 3.10 43/48] dm transaction manager: fix corruption due to non-atomic transaction commit
Date: Sun, 11 May 2014 21:20:17 +0200	[thread overview]
Message-ID: <20140511191953.898796186@linuxfoundation.org> (raw)
In-Reply-To: <20140511191948.079900414@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Joe Thornber <ejt@redhat.com>

commit a9d45396f5956d0b615c7ae3b936afd888351a47 upstream.

The persistent-data library used by dm-thin, dm-cache, etc is
transactional.  If anything goes wrong, such as an io error when writing
new metadata or a power failure, then we roll back to the last
transaction.

Atomicity when committing a transaction is achieved by:

a) Never overwriting data from the previous transaction.
b) Writing the superblock last, after all other metadata has hit the
   disk.

This commit and the following commit ("dm: take care to copy the space
map roots before locking the superblock") fix a bug associated with (b).
When committing it was possible for the superblock to still be written
in spite of an io error occurring during the preceeding metadata flush.
With these commits we're careful not to take the write lock out on the
superblock until after the metadata flush has completed.

Change the transaction manager's semantics for dm_tm_commit() to assume
all data has been flushed _before_ the single superblock that is passed
in.

As a prerequisite, split the block manager's block unlocking and
flushing by simplifying dm_bm_flush_and_unlock() to dm_bm_flush().  Now
the unlocking must be done separately.

This issue was discovered by forcing io errors at the crucial time
using dm-flakey.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/md/dm-cache-metadata.c                      |    3 ++-
 drivers/md/persistent-data/dm-block-manager.c       |   15 ++-------------
 drivers/md/persistent-data/dm-block-manager.h       |    3 +--
 drivers/md/persistent-data/dm-transaction-manager.c |    5 +++--
 drivers/md/persistent-data/dm-transaction-manager.h |   17 ++++++++---------
 5 files changed, 16 insertions(+), 27 deletions(-)

--- a/drivers/md/dm-cache-metadata.c
+++ b/drivers/md/dm-cache-metadata.c
@@ -511,8 +511,9 @@ static int __begin_transaction_flags(str
 	disk_super = dm_block_data(sblock);
 	update_flags(disk_super, mutator);
 	read_superblock_fields(cmd, disk_super);
+	dm_bm_unlock(sblock);
 
-	return dm_bm_flush_and_unlock(cmd->bm, sblock);
+	return dm_bm_flush(cmd->bm);
 }
 
 static int __begin_transaction(struct dm_cache_metadata *cmd)
--- a/drivers/md/persistent-data/dm-block-manager.c
+++ b/drivers/md/persistent-data/dm-block-manager.c
@@ -595,25 +595,14 @@ int dm_bm_unlock(struct dm_block *b)
 }
 EXPORT_SYMBOL_GPL(dm_bm_unlock);
 
-int dm_bm_flush_and_unlock(struct dm_block_manager *bm,
-			   struct dm_block *superblock)
+int dm_bm_flush(struct dm_block_manager *bm)
 {
-	int r;
-
 	if (bm->read_only)
 		return -EPERM;
 
-	r = dm_bufio_write_dirty_buffers(bm->bufio);
-	if (unlikely(r)) {
-		dm_bm_unlock(superblock);
-		return r;
-	}
-
-	dm_bm_unlock(superblock);
-
 	return dm_bufio_write_dirty_buffers(bm->bufio);
 }
-EXPORT_SYMBOL_GPL(dm_bm_flush_and_unlock);
+EXPORT_SYMBOL_GPL(dm_bm_flush);
 
 void dm_bm_set_read_only(struct dm_block_manager *bm)
 {
--- a/drivers/md/persistent-data/dm-block-manager.h
+++ b/drivers/md/persistent-data/dm-block-manager.h
@@ -105,8 +105,7 @@ int dm_bm_unlock(struct dm_block *b);
  *
  * This method always blocks.
  */
-int dm_bm_flush_and_unlock(struct dm_block_manager *bm,
-			   struct dm_block *superblock);
+int dm_bm_flush(struct dm_block_manager *bm);
 
 /*
  * Switches the bm to a read only mode.  Once read-only mode
--- a/drivers/md/persistent-data/dm-transaction-manager.c
+++ b/drivers/md/persistent-data/dm-transaction-manager.c
@@ -154,7 +154,7 @@ int dm_tm_pre_commit(struct dm_transacti
 	if (r < 0)
 		return r;
 
-	return 0;
+	return dm_bm_flush(tm->bm);
 }
 EXPORT_SYMBOL_GPL(dm_tm_pre_commit);
 
@@ -164,8 +164,9 @@ int dm_tm_commit(struct dm_transaction_m
 		return -EWOULDBLOCK;
 
 	wipe_shadow_table(tm);
+	dm_bm_unlock(root);
 
-	return dm_bm_flush_and_unlock(tm->bm, root);
+	return dm_bm_flush(tm->bm);
 }
 EXPORT_SYMBOL_GPL(dm_tm_commit);
 
--- a/drivers/md/persistent-data/dm-transaction-manager.h
+++ b/drivers/md/persistent-data/dm-transaction-manager.h
@@ -38,18 +38,17 @@ struct dm_transaction_manager *dm_tm_cre
 /*
  * We use a 2-phase commit here.
  *
- * i) In the first phase the block manager is told to start flushing, and
- * the changes to the space map are written to disk.  You should interrogate
- * your particular space map to get detail of its root node etc. to be
- * included in your superblock.
+ * i) Make all changes for the transaction *except* for the superblock.
+ * Then call dm_tm_pre_commit() to flush them to disk.
  *
- * ii) @root will be committed last.  You shouldn't use more than the
- * first 512 bytes of @root if you wish the transaction to survive a power
- * failure.  You *must* have a write lock held on @root for both stage (i)
- * and (ii).  The commit will drop the write lock.
+ * ii) Lock your superblock.  Update.  Then call dm_tm_commit() which will
+ * unlock the superblock and flush it.  No other blocks should be updated
+ * during this period.  Care should be taken to never unlock a partially
+ * updated superblock; perform any operations that could fail *before* you
+ * take the superblock lock.
  */
 int dm_tm_pre_commit(struct dm_transaction_manager *tm);
-int dm_tm_commit(struct dm_transaction_manager *tm, struct dm_block *root);
+int dm_tm_commit(struct dm_transaction_manager *tm, struct dm_block *superblock);
 
 /*
  * These methods are the only way to get hold of a writeable block.



  parent reply	other threads:[~2014-05-11 19:20 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-11 19:19 [PATCH 3.10 00/48] 3.10.40-stable review Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 01/48] drivers/tty/hvc: dont free hvc_console_setup after init Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 02/48] tty: serial: 8250_core.c Bug fix for Exar chips Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 03/48] n_tty: Fix n_tty_write crash when echoing in raw mode Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 04/48] floppy: ignore kernel-only members in FDRAWCMD ioctl input Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 05/48] floppy: dont write kernel-only members to FDRAWCMD ioctl output Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 06/48] iser-target: Add missing se_cmd put for WRITE_PENDING in tx_comp_err Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 07/48] ARM: 7840/1: LPAE: dont reject mapping /dev/mem above 4GB Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 08/48] KVM: ioapic: fix assignment of ioapic->rtc_status.pending_eoi (CVE-2014-0155) Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 09/48] MIPS: KVM: Pass reserved instruction exceptions to guest Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 10/48] MIPS: Hibernate: Flush TLB entries in swsusp_arch_resume() Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 11/48] virtio_balloon: dont softlockup on huge balloon changes Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 12/48] [SCSI] virtio-scsi: Skip setting affinity on uninitialized vq Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 13/48] [SCSI] mpt2sas: Dont disable device twice at suspend Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 14/48] powerpc/compat: 32-bit little endian machine name is ppcle, not ppc Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 15/48] powerpc/tm: Disable IRQ in tm_recheckpoint Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 16/48] s390/chsc: fix SEI usage on old FW levels Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 17/48] s390/bpf,jit: initialize A register if 1st insn is BPF_S_LDX_B_MSH Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 18/48] ARC: Entry Handler tweaks: Simplify branch for in-kernel preemption Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 19/48] ARC: Entry Handler tweaks: Optimize away redundant IRQ_DISABLE_SAVE Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 20/48] framebuffer: fix cfb_copyarea Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 21/48] matroxfb: restore the registers M_ACCESS and M_PITCH Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 22/48] mach64: use unaligned access Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 23/48] mach64: fix cursor when character width is not a multiple of 8 pixels Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 24/48] b43: Fix machine check error due to improper access of B43_MMIO_PSM_PHY_HDR Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.10 25/48] libata/ahci: accommodate tag ordered controllers Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 26/48] iwlwifi: dvm: take mutex when sending SYNC BT config command Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 27/48] mac80211: fix WPA with VLAN on AP side with ps-sta again Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 28/48] mac80211: fix software remain-on-channel implementation Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 29/48] mac80211: exclude AP_VLAN interfaces from tx power calculation Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 30/48] locks: allow __break_lease to sleep even when break_time is 0 Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 31/48] rtlwifi: rtl8723ae: Fix too long disable of IRQs Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 32/48] rtlwifi: rtl8188ee: " Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 33/48] rtlwifi: rtl8192cu: " Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 34/48] rtlwifi: rtl8192se: " Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 35/48] rtlwifi: rtl8192se: Fix regression due to commit 1bf4bbb Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 36/48] rtlwifi: rtl8188ee: initialize packet_beacon Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 37/48] gpio: mxs: Allow for recursive enable_irq_wake() call Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 38/48] tgafb: fix data copying Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 39/48] mtd: atmel_nand: Disable subpage NAND write when using Atmel PMECC Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 40/48] mtd: nuc900_nand: NULL dereference in nuc900_nand_enable() Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 41/48] mtd: sm_ftl: heap corruption in sm_create_sysfs_attributes() Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 42/48] Skip intel_crt_init for Dell XPS 8700 Greg Kroah-Hartman
2014-05-11 19:20 ` Greg Kroah-Hartman [this message]
2014-05-11 19:20 ` [PATCH 3.10 44/48] dm thin: fix dangling bio in process_deferred_bios error path Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 45/48] lockd: ensure we tear down any live sockets when socket creation fails during lockd_up Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 46/48] Input: synaptics - add min/max quirk for ThinkPad T431s, L440, L540, S1 Yoga and X1 Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 47/48] Input: synaptics - add min/max quirk for ThinkPad Edge E431 Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.10 48/48] drm: cirrus: add power management support Greg Kroah-Hartman
2014-05-11 22:50 ` [PATCH 3.10 00/48] 3.10.40-stable review Guenter Roeck
2014-05-12 21:54 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140511191953.898796186@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=ejt@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=snitzer@redhat.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).