From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Shaohua Li <shli@kernel.org>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Mike Snitzer <snitzer@redhat.com>,
Heinz Mauelshagen <heinzm@redhat.com>, NeilBrown <neilb@suse.de>
Subject: [PATCH 3.14 05/37] md/raid5: disable DISCARD by default due to safety concerns.
Date: Tue, 7 Oct 2014 16:19:22 -0700
Message-ID: <20141007231827.230400159@linuxfoundation.org>
In-Reply-To: <20141007231827.043235686@linuxfoundation.org>
3.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: NeilBrown <neilb@suse.de>
commit 8e0e99ba64c7ba46133a7c8a3e3f7de01f23bd93 upstream.
It has come to my attention (thanks Martin) that 'discard_zeroes_data'
is only a hint. Some devices in some cases don't do what it
says on the label.
The use of DISCARD in RAID5 depends on reads from discarded regions
being predictably zero. If a write to a previously discarded region
performs a read-modify-write cycle, it assumes that the parity block
was consistent with the data blocks. If all were zero, this would
be the case. If some are and some aren't, this would not be the case.
This could lead to data corruption after a device failure when
data needs to be reconstructed from the parity.
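
The failure mode described above can be illustrated with a toy model.
This is only a sketch, not code from raid5.c: it assumes single-parity
XOR with one byte standing in for each block, and the names
(struct stripe, rmw_write, reconstruct, survives_rebuild) are
hypothetical. A read-modify-write updates parity from the old data,
the new data and the old parity alone, so it silently carries forward
any mismatch that already existed in the stripe.

```c
#include <stdint.h>

/* Hypothetical toy model: one byte per block, 3 data blocks + 1 parity. */
enum { NDATA = 3 };

struct stripe {
	uint8_t data[NDATA];	/* what each data device returns on read */
	uint8_t parity;		/* what the parity device returns on read */
};

/*
 * Read-modify-write: the new parity is derived only from the old parity,
 * the old contents of the target block and the new contents. The other
 * data blocks are never read, so their consistency is simply assumed.
 */
static void rmw_write(struct stripe *s, int idx, uint8_t new_val)
{
	s->parity ^= s->data[idx] ^ new_val;
	s->data[idx] = new_val;
}

/* Rebuild the block at index 'lost' from parity and the surviving blocks. */
static uint8_t reconstruct(const struct stripe *s, int lost)
{
	uint8_t v = s->parity;

	for (int i = 0; i < NDATA; i++)
		if (i != lost)
			v ^= s->data[i];
	return v;
}

/*
 * Write to block 0, then pretend the device holding block 1 failed.
 * Returns 1 if the rebuilt block 1 matches what it held before, else 0.
 */
static int survives_rebuild(struct stripe s)
{
	uint8_t expected = s.data[1];

	rmw_write(&s, 0, 0x5a);
	return reconstruct(&s, 1) == expected;
}
```

With a stripe that reads back as all zeroes after the discard,
survives_rebuild() returns 1. If one device instead returns stale
non-zero data for block 1 while parity reads as zero, it returns 0:
the rebuilt block no longer matches what the device held.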
As we cannot trust 'discard_zeroes_data', ignore it by default
and so disallow DISCARD on all raid4/5/6 arrays.
As many devices are trustworthy, and as there are benefits to using
DISCARD, add a module parameter to override this caution and cause
DISCARD to work if discard_zeroes_data is set.
If a site wants to enable DISCARD on some arrays but not on others, they
should select DISCARD support at the filesystem level, and set the
raid456 module parameter:
raid456.devices_handle_discard_safely=Y
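
Because the parameter is registered with mode 0644, its current value
should also be visible under /sys/module/raid456/parameters/ once the
module is loaded. A minimal sketch of a userspace check follows; the
helper names (parse_bool_param, discard_override_enabled) are
hypothetical, and it assumes the usual sysfs convention that a bool
parameter reads back as "Y" or "N".

```c
#include <stdio.h>

/*
 * Interpret the text a bool module parameter shows in sysfs.
 * Returns 1 for 'Y', 0 for 'N', and -1 for anything else.
 */
static int parse_bool_param(const char *text)
{
	if (text[0] == 'Y')
		return 1;
	if (text[0] == 'N')
		return 0;
	return -1;
}

/*
 * Read a parameter file such as
 *   /sys/module/raid456/parameters/devices_handle_discard_safely
 * Returns 1 or 0 as above, or -1 if the file cannot be read
 * (for example because the module is not loaded).
 */
static int discard_override_enabled(const char *path)
{
	char buf[8] = "";
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';
	fclose(f);
	return parse_bool_param(buf);
}
```

Note that in this patch the parameter is tested in run(), so a value
changed at runtime would presumably only affect arrays started
afterwards.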
As this is a data-safety issue, I believe this patch is suitable for
-stable.
DISCARD support for RAID456 was added in 3.7.
Cc: Shaohua Li <shli@kernel.org>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Heinz Mauelshagen <heinzm@redhat.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Fixes: 620125f2bf8ff0c4969b79653b54d7bcc9d40637
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/md/raid5.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -64,6 +64,10 @@
#define cpu_to_group(cpu) cpu_to_node(cpu)
#define ANY_GROUP NUMA_NO_NODE
+static bool devices_handle_discard_safely = false;
+module_param(devices_handle_discard_safely, bool, 0644);
+MODULE_PARM_DESC(devices_handle_discard_safely,
+ "Set to Y if all devices in each array reliably return zeroes on reads from discarded regions");
static struct workqueue_struct *raid5_wq;
/*
* Stripe cache
@@ -6117,7 +6121,7 @@ static int run(struct mddev *mddev)
mddev->queue->limits.discard_granularity = stripe;
/*
* unaligned part of discard request will be ignored, so can't
- * guarantee discard_zerors_data
+ * guarantee discard_zeroes_data
*/
mddev->queue->limits.discard_zeroes_data = 0;
@@ -6142,6 +6146,18 @@ static int run(struct mddev *mddev)
!bdev_get_queue(rdev->bdev)->
limits.discard_zeroes_data)
discard_supported = false;
+ /* Unfortunately, discard_zeroes_data is not currently
+ * a guarantee - just a hint. So we only allow DISCARD
+ * if the sysadmin has confirmed that only safe devices
+ * are in use by setting a module parameter.
+ */
+ if (!devices_handle_discard_safely) {
+ if (discard_supported) {
+ pr_info("md/raid456: discard support disabled due to uncertainty.\n");
+ pr_info("Set raid456.devices_handle_discard_safely=Y to override.\n");
+ }
+ discard_supported = false;
+ }
}
if (discard_supported &&