From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Ric Wheeler <rwheeler@redhat.com>,
James Bottomley <JBottomley@Parallels.com>
Subject: [ 07/33] SCSI: sd: fix array cache flushing bug causing performance problems
Date: Fri, 17 May 2013 14:49:51 -0700
Message-ID: <20130517214917.641498291@linuxfoundation.org>
In-Reply-To: <20130517214916.821259930@linuxfoundation.org>
3.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Bottomley <JBottomley@Parallels.com>
commit 39c60a0948cc06139e2fbfe084f83cb7e7deae3b upstream.

Some arrays synchronize their full non-volatile cache when the sd driver sends
a SYNCHRONIZE CACHE command. Unfortunately, they can have terabytes of cache,
and we send a SYNCHRONIZE CACHE for every barrier if an array reports it has a
writeback cache. This leads to massive slowdowns on journalled filesystems.

The fix is to allow userspace to turn off the writeback cache setting as a
temporary measure (i.e. without doing the MODE SELECT to write it back to the
device). Even though the device reported that it has a writeback cache, a
user who knows that the cache is non-volatile, and who cares only about
filesystem correctness, can turn that bit off in the kernel and avoid the
performance-ruinous (and safety-irrelevant) SYNCHRONIZE CACHE commands.

To do this, add a 'temporary' prefix when performing the usual cache
setting operations, for example:

    echo temporary write through > /sys/class/scsi_disk/<disk>/cache_type

Reported-by: Ric Wheeler <rwheeler@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/scsi/sd.c | 20 ++++++++++++++++++++
drivers/scsi/sd.h | 1 +
2 files changed, 21 insertions(+)
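
For anyone who wants to try the interface from a program rather than a
shell, here is a minimal userspace sketch, assuming the sysfs attribute path
given in the changelog. The helper name set_temporary_cache_type() is
hypothetical and is not part of this patch; it only writes the same string
that the echo command produces, trailing newline included.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: write "temporary <mode>\n"
 * to a cache_type attribute, as the echo command in the changelog does.
 * Returns 0 on success, -1 on failure.
 */
int set_temporary_cache_type(const char *attr_path, const char *mode)
{
	char line[64];
	int len, fd;
	ssize_t written;

	assert(attr_path != NULL && mode != NULL);

	len = snprintf(line, sizeof(line), "temporary %s\n", mode);
	if (len < 0 || (size_t)len >= sizeof(line))
		return -1;

	/* The attribute already exists in sysfs, so no O_CREAT here. */
	fd = open(attr_path, O_WRONLY);
	if (fd < 0)
		return -1;

	/* Hand the whole value to the store routine in a single write(). */
	written = write(fd, line, (size_t)len);
	close(fd);

	return written == (ssize_t)len ? 0 : -1;
}
```

Called with the cache_type path of a real disk and the mode "write through",
this should have the same effect as the echo command in the changelog.
Writing the attribute typically requires root.
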
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -138,6 +138,7 @@ sd_store_cache_type(struct device *dev,
char *buffer_data;
struct scsi_mode_data data;
struct scsi_sense_hdr sshdr;
+ static const char temp[] = "temporary ";
int len;
if (sdp->type != TYPE_DISK)
@@ -146,6 +147,13 @@ sd_store_cache_type(struct device *dev,
* it's not worth the risk */
return -EINVAL;
+ if (strncmp(buf, temp, sizeof(temp) - 1) == 0) {
+ buf += sizeof(temp) - 1;
+ sdkp->cache_override = 1;
+ } else {
+ sdkp->cache_override = 0;
+ }
+
for (i = 0; i < ARRAY_SIZE(sd_cache_types); i++) {
len = strlen(sd_cache_types[i]);
if (strncmp(sd_cache_types[i], buf, len) == 0 &&
@@ -158,6 +166,13 @@ sd_store_cache_type(struct device *dev,
return -EINVAL;
rcd = ct & 0x01 ? 1 : 0;
wce = ct & 0x02 ? 1 : 0;
+
+ if (sdkp->cache_override) {
+ sdkp->WCE = wce;
+ sdkp->RCD = rcd;
+ return count;
+ }
+
if (scsi_mode_sense(sdp, 0x08, 8, buffer, sizeof(buffer), SD_TIMEOUT,
SD_MAX_RETRIES, &data, NULL))
return -EINVAL;
@@ -2036,6 +2051,10 @@ sd_read_cache_type(struct scsi_disk *sdk
int old_rcd = sdkp->RCD;
int old_dpofua = sdkp->DPOFUA;
+
+ if (sdkp->cache_override)
+ return;
+
first_len = 4;
if (sdp->skip_ms_page_8) {
if (sdp->type == TYPE_RBC)
@@ -2517,6 +2536,7 @@ static void sd_probe_async(void *data, a
sdkp->capacity = 0;
sdkp->media_present = 1;
sdkp->write_prot = 0;
+ sdkp->cache_override = 0;
sdkp->WCE = 0;
sdkp->RCD = 0;
sdkp->ATO = 0;
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -70,6 +70,7 @@ struct scsi_disk {
u8 protection_type;/* Data Integrity Field */
u8 provisioning_mode;
unsigned ATO : 1; /* state of disk ATO bit */
+ unsigned cache_override : 1; /* temp override of WCE,RCD */
unsigned WCE : 1; /* state of disk WCE bit */
unsigned RCD : 1; /* state of disk RCD bit, unused */
unsigned DPOFUA : 1; /* state of disk DPOFUA bit */
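
For readers following the sd_store_cache_type() hunk, the prefix handling
can be sketched in userspace as below. This is an illustration under stated
assumptions, not the driver code: parse_cache_type() and struct
cache_request are hypothetical names, and the string table is assumed to
match sd_cache_types[] in sd.c. The prefix is declared as an array so that
sizeof(temp) - 1 is the length of "temporary "; with a pointer declaration,
sizeof would give the size of the pointer instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed to match sd_cache_types[]: index bit 0 is RCD, bit 1 is WCE. */
static const char *const cache_types[] = {
	"write through", "none", "write back",
	"write back, no read (daft)"
};

/* Hypothetical result type standing in for the scsi_disk fields. */
struct cache_request {
	int override;	/* 1 if the "temporary " prefix was present */
	int wce;
	int rcd;
};

/* Returns 0 and fills *req on a match, -1 if buf names no cache type. */
int parse_cache_type(const char *buf, struct cache_request *req)
{
	/* Array, not pointer: sizeof(temp) is the string length plus NUL. */
	static const char temp[] = "temporary ";
	size_t i, len;

	assert(buf != NULL && req != NULL);

	req->override = 0;
	if (strncmp(buf, temp, sizeof(temp) - 1) == 0) {
		buf += sizeof(temp) - 1;
		req->override = 1;
	}

	for (i = 0; i < sizeof(cache_types) / sizeof(cache_types[0]); i++) {
		len = strlen(cache_types[i]);
		/* The value must be followed by the newline that echo adds. */
		if (strncmp(cache_types[i], buf, len) == 0 &&
		    buf[len] == '\n') {
			req->rcd = (i & 0x01) ? 1 : 0;
			req->wce = (i & 0x02) ? 1 : 0;
			return 0;
		}
	}
	return -1;
}
```

With this sketch, "temporary write through\n" yields override = 1 with WCE
and RCD both clear, which corresponds to the case where the patch stores
the bits in the kernel and returns before any MODE SELECT is issued.
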