From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Mike Snitzer <snitzer@redhat.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Hannes Reinecke <hare@suse.de>
Subject: [ 40/52] dm mpath: disable WRITE SAME if it fails
Date: Wed, 2 Oct 2013 21:05:58 -0700
Message-ID: <20131003040524.980274351@linuxfoundation.org>
In-Reply-To: <20131003040522.190209641@linuxfoundation.org>
3.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mike Snitzer <snitzer@redhat.com>
commit f84cb8a46a771f36a04a02c61ea635c968ed5f6a upstream.
Work around the SCSI layer's problematic WRITE SAME heuristics by
disabling WRITE SAME in the DM multipath device's queue_limits if an
underlying device has disabled it.
The WRITE SAME heuristics, with both the original commit 5db44863b6eb
("[SCSI] sd: Implement support for WRITE SAME") and the updated commit
66c28f971 ("[SCSI] sd: Update WRITE SAME heuristics"), default to enabling
WRITE SAME(10) even without successfully determining it is supported.
After the first failed WRITE SAME the SCSI layer will disable WRITE SAME
for the device (by setting sdkp->device->no_write_same, which results in
'max_write_same_sectors' in the device's queue_limits being set to 0).
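The "enable optimistically, disable on first failure" behaviour described
above can be sketched as a small userspace model. This is only an
illustration: struct model_sdev and both functions are hypothetical
stand-ins, not the sd driver's code, and the fields merely mirror the
names mentioned in the text.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in; this is not a kernel structure. */
struct model_sdev {
	bool no_write_same;                  /* mirrors sdkp->device->no_write_same */
	unsigned int max_write_same_sectors; /* mirrors the queue_limits field */
};

/* Optimistic default: advertise WRITE SAME without proof of support. */
static void model_sdev_init(struct model_sdev *sdev, unsigned int max_sectors)
{
	assert(max_sectors > 0);
	sdev->no_write_same = false;
	sdev->max_write_same_sectors = max_sectors;
}

/* Called when a WRITE SAME completes; the first failure disables the op. */
static void model_write_same_done(struct model_sdev *sdev, bool failed)
{
	if (failed && !sdev->no_write_same) {
		sdev->no_write_same = true;
		sdev->max_write_same_sectors = 0;
	}
}
```

In this model a successful completion leaves the limit untouched, and
only a failure clears it, which is why the limit can change long after
any stacked device has read it.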
When a device is stacked on top of such a SCSI device, any changes to that
SCSI device's queue_limits do not automatically propagate up the stack.
As such, a DM multipath device will not have its WRITE SAME support
disabled. The block layer therefore continues to issue WRITE SAME
requests to the mpath device, which causes paths to fail and (if mpath IO
isn't configured to queue when no paths are available) results in
actual IO errors to the upper layers.
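The stale-limit problem can be illustrated with a userspace sketch in
which the stacked device's limit is computed once from its paths and is
not revisited when a path later changes. struct model_limits and
model_stack_limits() are hypothetical names for illustration, and taking
the minimum across paths is an assumption of this model rather than a
copy of the block layer's stacking code.

```c
#include <assert.h>

/* Hypothetical stand-in for the one queue_limits field of interest. */
struct model_limits {
	unsigned int max_write_same_sectors;
};

/*
 * Compute the stacked device's limit once from its n paths, taking the
 * smallest value. Nothing re-runs this when a path changes later.
 */
static void model_stack_limits(struct model_limits *top,
			       const struct model_limits *paths, int n)
{
	int i;

	assert(n > 0);
	top->max_write_same_sectors = paths[0].max_write_same_sectors;
	for (i = 1; i < n; i++) {
		if (paths[i].max_write_same_sectors < top->max_write_same_sectors)
			top->max_write_same_sectors = paths[i].max_write_same_sectors;
	}
}
```

If a path's limit drops to 0 after model_stack_limits() has run, the
stacked limit keeps its old value until something recomputes it, which
is the situation shown by the differing sysfs values in the "Before this
patch" log.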
This fix doesn't help configurations that have additional devices
stacked on top of the mpath device (e.g. LVM created linear DM devices
on top). A proper fix that restacks all the queue_limits from the bottom
of the device stack up will need to be explored if SCSI will continue to
use this model of optimistically allowing op codes and then disabling
them after they fail for the first time.
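As a rough idea of what restacking from the bottom up could look like,
the sketch below walks a chain of devices and clamps each limit to the
one beneath it. It is purely hypothetical: struct model_dev and
model_restack() do not exist in the kernel, each device is assumed to
have at most one lower device, and no such mechanism is part of this
patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device in a stack; 'lower' is NULL for the bottom device. */
struct model_dev {
	struct model_dev *lower;
	unsigned int max_write_same_sectors;
};

/*
 * Recompute limits starting at the bottom of the stack: each device's
 * limit is clamped to the (already recomputed) limit of the device below.
 * Returns the new limit of 'dev'.
 */
static unsigned int model_restack(struct model_dev *dev)
{
	assert(dev != NULL);
	if (dev->lower) {
		unsigned int below = model_restack(dev->lower);

		if (below < dev->max_write_same_sectors)
			dev->max_write_same_sectors = below;
	}
	return dev->max_write_same_sectors;
}
```

Calling model_restack() on the topmost device would clear WRITE SAME on
a linear device above mpath as well, which is the case the patch itself
does not cover.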
Before this patch:
EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121)
device-mapper: multipath: XXX snitm debugging: failing WRITE SAME IO with error=-121
end_request: critical target error, dev dm-6, sector 528
dm-6: WRITE SAME failed. Manually zeroing.
device-mapper: multipath: Failing path 8:112.
end_request: I/O error, dev dm-6, sector 4616
dm-6: WRITE SAME failed. Manually zeroing.
end_request: I/O error, dev dm-6, sector 4616
end_request: I/O error, dev dm-6, sector 5640
end_request: I/O error, dev dm-6, sector 6664
end_request: I/O error, dev dm-6, sector 7688
end_request: I/O error, dev dm-6, sector 524288
Buffer I/O error on device dm-6, logical block 65536
lost page write due to I/O error on dm-6
JBD2: Error -5 detected when updating journal superblock for dm-6-8.
end_request: I/O error, dev dm-6, sector 524296
Aborting journal on device dm-6-8.
end_request: I/O error, dev dm-6, sector 524288
Buffer I/O error on device dm-6, logical block 65536
lost page write due to I/O error on dm-6
JBD2: Error -5 detected when updating journal superblock for dm-6-8.
# cat /sys/block/sdh/queue/write_same_max_bytes
0
# cat /sys/block/dm-6/queue/write_same_max_bytes
33553920
After this patch:
EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121)
device-mapper: multipath: XXX snitm debugging: WRITE SAME I/O failed with error=-121
end_request: critical target error, dev dm-6, sector 528
dm-6: WRITE SAME failed. Manually zeroing.
# cat /sys/block/sdh/queue/write_same_max_bytes
0
# cat /sys/block/dm-6/queue/write_same_max_bytes
0
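The sysfs values above can be related to the queue limit with a little
arithmetic. The sketch assumes write_same_max_bytes is the
max_write_same_sectors limit expressed in bytes with 512-byte sectors;
under that assumption 33553920 bytes corresponds to 65535 sectors.
MODEL_SECTOR_SHIFT and model_write_same_max_bytes() are hypothetical
names, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_SECTOR_SHIFT 9	/* assumed 512-byte sectors */

/* Convert a limit in sectors to the byte value shown in sysfs. */
static uint64_t model_write_same_max_bytes(uint32_t max_write_same_sectors)
{
	return (uint64_t)max_write_same_sectors << MODEL_SECTOR_SHIFT;
}

/* 65535 sectors of 512 bytes give the value seen on dm-6 before the patch. */
static_assert((65535ULL << MODEL_SECTOR_SHIFT) == 33553920ULL,
	      "65535 sectors * 512 bytes");
```

A limit of 0 sectors maps to 0 bytes, matching what sdh reports in both
logs and what dm-6 reports once the patch is applied.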
It should be noted that WRITE SAME support wasn't enabled in DM
multipath until v3.10.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/md/dm-mpath.c | 11 ++++++++++-
drivers/md/dm.c | 11 +++++++++++
include/linux/device-mapper.h | 3 ++-
3 files changed, 23 insertions(+), 2 deletions(-)
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1284,8 +1284,17 @@ static int do_end_io(struct multipath *m
 	if (!error && !clone->errors)
 		return 0;	/* I/O complete */

-	if (error == -EOPNOTSUPP || error == -EREMOTEIO || error == -EILSEQ)
+	if (error == -EOPNOTSUPP || error == -EREMOTEIO || error == -EILSEQ) {
+		if ((clone->cmd_flags & REQ_WRITE_SAME) &&
+		    !clone->q->limits.max_write_same_sectors) {
+			struct queue_limits *limits;
+
+			/* device doesn't really support WRITE SAME, disable it */
+			limits = dm_get_queue_limits(dm_table_get_md(m->ti->table));
+			limits->max_write_same_sectors = 0;
+		}
 		return error;
+	}

 	if (mpio->pgpath)
 		fail_path(mpio->pgpath);
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2219,6 +2219,17 @@ struct target_type *dm_get_immutable_tar
 }

 /*
+ * The queue_limits are only valid as long as you have a reference
+ * count on 'md'.
+ */
+struct queue_limits *dm_get_queue_limits(struct mapped_device *md)
+{
+	BUG_ON(!atomic_read(&md->holders));
+	return &md->queue->limits;
+}
+EXPORT_SYMBOL_GPL(dm_get_queue_limits);
+
+/*
  * Fully initialize a request-based queue (->elevator, ->request_fn, etc).
  */
 static int dm_init_request_based_queue(struct mapped_device *md)
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -405,13 +405,14 @@ int dm_noflush_suspending(struct dm_targ
 union map_info *dm_get_mapinfo(struct bio *bio);
 union map_info *dm_get_rq_mapinfo(struct request *rq);

+struct queue_limits *dm_get_queue_limits(struct mapped_device *md);
+
 /*
  * Geometry functions.
  */
 int dm_get_geometry(struct mapped_device *md, struct hd_geometry *geo);
 int dm_set_geometry(struct mapped_device *md, struct hd_geometry *geo);

-
 /*-----------------------------------------------------------------
  * Functions for manipulating device-mapper tables.
  *---------------------------------------------------------------*/
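For readers who want to follow the logic of the do_end_io() change
outside the kernel, here is a minimal userspace model of it. All names
(struct model_limits, struct model_md, model_get_queue_limits(),
model_end_write_same()) are hypothetical stand-ins; the real code
operates on struct request and struct queue_limits as shown in the diff
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one queue_limits field of interest. */
struct model_limits {
	unsigned int max_write_same_sectors;
};

/* Hypothetical stand-in for the multipath mapped_device. */
struct model_md {
	int holders;			/* stand-in for md->holders */
	struct model_limits limits;	/* stand-in for md->queue->limits */
};

/* Like dm_get_queue_limits(): only valid while a reference is held. */
static struct model_limits *model_get_queue_limits(struct model_md *md)
{
	assert(md->holders > 0);
	return &md->limits;
}

/*
 * Model of the added check: when a WRITE SAME fails with a target error
 * and the path it was sent to already reports a zero limit, clear the
 * multipath device's limit as well. Returns true if it was cleared.
 */
static bool model_end_write_same(struct model_md *mpath,
				 const struct model_limits *path_limits,
				 bool is_write_same, bool target_error)
{
	if (!target_error || !is_write_same)
		return false;
	if (path_limits->max_write_same_sectors != 0)
		return false;
	model_get_queue_limits(mpath)->max_write_same_sectors = 0;
	return true;
}
```

The model only clears the limit once the path itself has given up on
WRITE SAME, so a target error on a path that still advertises support
leaves the multipath limit alone, as in the patch.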