From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Ric Wheeler <ricwheeler@gmail.com>
Cc: "Black, David" <david.black@emc.com>,
Jeremy Linton <jlinton@tributary.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Ric Wheeler <rwheeler@redhat.com>, Hannes Reinecke <hare@suse.de>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
"Martin K. Petersen" <mkp@mkp.net>,
Jeff Moyer <jmoyer@redhat.com>, Tejun Heo <tj@kernel.org>,
Mike Snitzer <snitzer@redhat.com>,
"Elliott, Robert (Server Storage)" <Elliott@hp.com>,
"Knight, Frederick" <Frederick.Knight@netapp.com>
Subject: Re: T10 WCE interpretation in Linux & device level access
Date: Wed, 24 Apr 2013 14:02:53 -0700 [thread overview]
Message-ID: <1366837373.1971.28.camel@dabdike> (raw)
In-Reply-To: <5178438F.3010103@gmail.com>
On Wed, 2013-04-24 at 16:41 -0400, Ric Wheeler wrote:
> On 04/24/2013 02:20 PM, Black, David wrote:
> > Jeremy,
> >
> > It looks like, you, Paolo and Ric have hit the nail on the head here - this is
> > a nice summary, IMHO:
> >
> >> On 4/24/2013 7:57 AM, Paolo Bonzini wrote:
> >>>> If the device can promise this, we don't care (and don't know) how it
> >>>> manages that promise. It can leave the data on battery backed DRAM, can
> >>> archive it to flash or any other scheme that works.
> >>>
> >>> That's exactly the point of SYNC_NV=1.
> >> Well its the point, but the specification is written such that the vendors can
> >> choose to implement it any way they wish, especially for split cache
> >> systems where there is both volatile and non volatile cache.
> > Independent of T10's best intentions at the time, the implementations aren't
> > doing what's needed or intended, and I'd guess that the SYNC_NV bit is not
> > being set to 1 by [other people's ;-) ] software that should be setting it
> > to 1 if it were paying attention to the standard.
> >
> > This is further complicated by it being completely legitimate wrt the SCSI
> > standard to put non-volatile cache in a system and not have the SCSI interface
> > admit that the non-volatile cache exists (WCE=0, SYNCHRONIZE CACHE is a no-op
> > independent of the value of SYNC_NV).
> >
> > I believe that Rob Elliot's 13-050 proposal to obsolete SYNC_NV and re-specify
> > SYNCHRONIZE CACHE to make all data non-volatile by whatever means the target
> > chooses is what T10 should do, and that matches Ric's summary:
> >
> >>>> If the device can promise this, we don't care (and don't know) how it
> >>>> manages that promise. It can leave the data on battery backed DRAM, can
> >>> archive it to flash or any other scheme that works.
> > Beyond that, attempting to manage drive removal from storage systems via the
> > SCSI interface with standard commands is a waste of time and effort, IMHO.
> > In a serious storage array (and even some fairly simple RAID controllers), some
> > vendor-specific "magic" is needed to get the array (or controller) to prepare
> > so that the drive can be removed cleanly. To oversimplify, it's not enough to
> > flush data to the drive; the array or controller is stateful, and hence has
> > to be told to "forget" the drive, where "forget" involves things that are
> > rather implementation-specific.
> >
> > Thanks,
> > --David
> >
>
> So I think that leaves us with some arrays that might benefit from Paolo's
> proposed patch, but almost certainly still will need to be able to "ignore
> flushes" for some block device accessing DB's, etc....
That just leaves us with random standards behaviour. Lets permit the
deterministic thing instead for the distros. It kills two birds with
one stone because we can set WCE for the stupid UAS devices that clear
it wrongly as well.
For those who don't read code well, you add a temporary prefix to the
cache set in
echo xxx > /sys/class/scsi_disk/<disk>/cache_type
and it will set the flags for the lifetime of the current kernel, but
won't try to do a mode select to make them permanent.
James
---
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 7992635..af16e88 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -142,6 +142,7 @@ sd_store_cache_type(struct device *dev, struct device_attribute *attr,
char *buffer_data;
struct scsi_mode_data data;
struct scsi_sense_hdr sshdr;
+ const char *temp = "temporary ";
int len;
if (sdp->type != TYPE_DISK)
@@ -150,6 +151,13 @@ sd_store_cache_type(struct device *dev, struct device_attribute *attr,
* it's not worth the risk */
return -EINVAL;
+ if (strncmp(buf, temp, sizeof(temp) - 1) == 0) {
+ buf += sizeof(temp) - 1;
+ sdkp->cache_override = 1;
+ } else {
+ sdkp->cache_override = 0;
+ }
+
for (i = 0; i < ARRAY_SIZE(sd_cache_types); i++) {
len = strlen(sd_cache_types[i]);
if (strncmp(sd_cache_types[i], buf, len) == 0 &&
@@ -162,6 +170,13 @@ sd_store_cache_type(struct device *dev, struct device_attribute *attr,
return -EINVAL;
rcd = ct & 0x01 ? 1 : 0;
wce = ct & 0x02 ? 1 : 0;
+
+ if (sdkp->cache_override) {
+ sdkp->WCE = wce;
+ sdkp->RCD = rcd;
+ return count;
+ }
+
if (scsi_mode_sense(sdp, 0x08, 8, buffer, sizeof(buffer), SD_TIMEOUT,
SD_MAX_RETRIES, &data, NULL))
return -EINVAL;
@@ -2319,6 +2334,10 @@ sd_read_cache_type(struct scsi_disk *sdkp, unsigned char *buffer)
int old_rcd = sdkp->RCD;
int old_dpofua = sdkp->DPOFUA;
+
+ if (sdkp->cache_override)
+ return;
+
first_len = 4;
if (sdp->skip_ms_page_8) {
if (sdp->type == TYPE_RBC)
@@ -2812,6 +2831,7 @@ static void sd_probe_async(void *data, async_cookie_t cookie)
sdkp->capacity = 0;
sdkp->media_present = 1;
sdkp->write_prot = 0;
+ sdkp->cache_override = 0;
sdkp->WCE = 0;
sdkp->RCD = 0;
sdkp->ATO = 0;
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 74a1e4c..2386aeb 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -73,6 +73,7 @@ struct scsi_disk {
u8 protection_type;/* Data Integrity Field */
u8 provisioning_mode;
unsigned ATO : 1; /* state of disk ATO bit */
+ unsigned cache_override : 1; /* temp override of WCE,RCD */
unsigned WCE : 1; /* state of disk WCE bit */
unsigned RCD : 1; /* state of disk RCD bit, unused */
unsigned DPOFUA : 1; /* state of disk DPOFUA bit */
next prev parent reply other threads:[~2013-04-24 21:02 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-23 19:41 T10 WCE interpretation in Linux & device level access Ric Wheeler
2013-04-23 20:07 ` James Bottomley
2013-04-23 22:39 ` Jeremy Linton
2013-04-24 5:44 ` Elliott, Robert (Server Storage)
2013-04-24 11:00 ` Ric Wheeler
2013-04-27 16:09 ` James Bottomley
2013-04-24 11:17 ` Paolo Bonzini
2013-04-24 12:07 ` Hannes Reinecke
2013-04-24 12:08 ` Paolo Bonzini
2013-04-24 12:12 ` Hannes Reinecke
2013-04-24 12:23 ` Paolo Bonzini
2013-04-24 12:27 ` Mike Snitzer
2013-04-24 12:27 ` Ric Wheeler
2013-04-24 12:57 ` Paolo Bonzini
2013-04-24 14:35 ` Jeremy Linton
2013-04-24 18:20 ` Black, David
2013-04-24 20:41 ` Ric Wheeler
2013-04-24 21:02 ` James Bottomley [this message]
2013-04-24 21:54 ` Paolo Bonzini
2013-04-24 22:09 ` James Bottomley
2013-04-24 22:36 ` Ric Wheeler
2013-04-24 22:46 ` James Bottomley
2013-04-25 11:35 ` Ric Wheeler
2013-04-25 14:12 ` James Bottomley
2013-04-25 1:32 ` Martin K. Petersen
2013-04-27 6:03 ` Paolo Bonzini
2013-04-24 11:30 ` Hannes Reinecke
2013-04-23 20:28 ` Douglas Gilbert
2013-04-24 15:40 ` Douglas Gilbert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1366837373.1971.28.camel@dabdike \
--to=james.bottomley@hansenpartnership.com \
--cc=Elliott@hp.com \
--cc=Frederick.Knight@netapp.com \
--cc=david.black@emc.com \
--cc=hare@suse.de \
--cc=jlinton@tributary.com \
--cc=jmoyer@redhat.com \
--cc=linux-scsi@vger.kernel.org \
--cc=mkp@mkp.net \
--cc=pbonzini@redhat.com \
--cc=ricwheeler@gmail.com \
--cc=rwheeler@redhat.com \
--cc=snitzer@redhat.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.