From mboxrd@z Thu Jan 1 00:00:00 1970
From: Douglas Gilbert
Subject: Re: T10 WCE interpretation in Linux & device level access
Date: Wed, 24 Apr 2013 11:40:12 -0400
Message-ID: <5177FCDC.6010304@interlog.com>
References: <5176E3E8.3000809@redhat.com>
Reply-To: dgilbert@interlog.com
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from smtp.infotech.no ([82.134.31.41]:43703 "EHLO
	smtp.infotech.no" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752450Ab3DXPlZ (ORCPT );
	Wed, 24 Apr 2013 11:41:25 -0400
In-Reply-To: <5176E3E8.3000809@redhat.com>
Sender: linux-scsi-owner@vger.kernel.org
List-Id: linux-scsi@vger.kernel.org
To: Ric Wheeler
Cc: "linux-scsi@vger.kernel.org" , "Martin K. Petersen" ,
	James Bottomley , Jeff Moyer , Tejun Heo , Mike Snitzer

On 13-04-23 03:41 PM, Ric Wheeler wrote:
>
> For many years, we have used WCE as an indication that a device has a volatile
> write cache (not just a write cache) and used this as a trigger to send down
> SYNCHRONIZE_CACHE commands as needed.
>
> Some arrays with non-volatile cache seem to have WCE set and simply ignore the
> command.
>
> Some arrays with non-volatile cache seem to not set WCE.
>
> Other arrays with non-volatile cache - our problem arrays - set WCE and do
> something horrible and slow when sent the SYNCHRONIZE_CACHE commands.
>
> Note that for file systems, you can override this behavior by mounting with our
> barriers disabled (mount -o nobarrier .....). There is currently no way to
> disable this for anything using the device directly, not through the file system.
>
> Some applications run against block devices - not through a file system - and
> want not to slow to a crawl when they have an array in my problem set.
>
> Giving them a hook to ignore WCE seems to be a hack, but one that would resolve
> issues with users who won't want to wait months (years?) for us to convince the
> array vendors.
>
> Is this a hook worth doing?
>
> Have we hashed this out in the T10 committee?

Naturally I'm biased, but I tend to think the user space is usually
smarter than the kernel. That assumes skilled users. So if the user
space issues a SYNCHRONIZE_CACHE with the IMMED bit set and for the
whole disk then the user should have a way of forcing that command
to be issued. The assumption here is that the skilled user is about
to power down that array or pull some disks or SSDs *.

The more questionable cases are when a file system or the block
layer is issuing a barrier or some such that translates to a
SYNCHRONIZE_CACHE. That should be ignored in some cases already
discussed in this thread.

While working with SoCs I have noticed an interesting technique.
Sub-system sized sections of the memory mapped IO space (e.g. a
bank of GPIOs) can be write protected by a simple ASCII sequence **.
Attempts to change configuration registers after write protect are
ignored and an error is noted (if anyone cares). The same ASCII
sequence can be used to un-write protect those sub-system
configuration registers. Typically on a SoC if the GPIOs are
randomly re-configured, it's game over.

Back to the SCSI world: a better solution might be if an LLD could
be informed of the reason a SCSI control command is being issued
(a sort of "come from" field). Failing that, or in addition to it,
a sysfs interface could be added to filter out "dangerous" SCSI
commands:

    echo "SC" > /sys/class/scsi_device/8:0:0:0/device/filter
    cat /sys/class/scsi_device/8:0:0:0/device/filter
    FU SC

If, for whatever reason, we did ignore a SYNCHRONIZE_CACHE command
we could use vendor specific sense data (vendor=Linux) to indicate
that a command had been ignored. That could be extended to all SCSI
commands that are filtered out ***; better that than EIO, EACCES etc.

Doug Gilbert


*   and if Linux doesn't permit this, then the user might be advised
    to run another, more obedient, host OS with Linux running as a VM.
    A "pass-by" rather than a "pass-through" ...

**  only the configuration registers are write protected, so data
    can still be written to the GPIOs

*** like me, many pass-through users cannot see why SCSI commands
    injected to the SCSI subsystem (e.g. via sg or bsg) are filtered
    out silently by the block layer.
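
As an illustration only, the filter idea above can be modelled in user
space. The Python sketch below is not an existing kernel interface: the
sysfs "filter" attribute is just the proposal from this message, reading
"FU" as FORMAT UNIT is a guess, and CommandFilter, build_sync_cache10
and the "forced" flag are hypothetical names made up for the example.
It builds a SYNCHRONIZE CACHE(10) CDB (opcode 0x35, IMMED in bit 1 of
byte 1, LBA 0 with a block count of 0 meaning the whole medium) and
checks it against the filter.

```python
"""Userspace model of the proposed per-device SCSI command filter.

Nothing here is a real kernel interface; it only mirrors the echo/cat
example from the message so the intended behaviour can be tried out.
"""

SYNCHRONIZE_CACHE_10 = 0x35  # SBC opcode
FORMAT_UNIT = 0x04           # SBC opcode

# Two-letter mnemonics as used in the proposed sysfs example.
# "SC" comes from the message; mapping "FU" to FORMAT UNIT is an assumption.
MNEMONIC_TO_OPCODE = {
    "SC": SYNCHRONIZE_CACHE_10,
    "FU": FORMAT_UNIT,
}


def build_sync_cache10(lba=0, num_blocks=0, immed=False):
    """Return a 10-byte SYNCHRONIZE CACHE(10) CDB.

    lba=0 with num_blocks=0 covers every block up to the end of the
    medium. immed=True asks the device to return status once the CDB
    has been validated, before the cache has actually been flushed.
    """
    if not 0 <= lba <= 0xFFFFFFFF:
        raise ValueError("lba does not fit in 32 bits")
    if not 0 <= num_blocks <= 0xFFFF:
        raise ValueError("num_blocks does not fit in 16 bits")
    cdb = bytearray(10)
    cdb[0] = SYNCHRONIZE_CACHE_10
    cdb[1] = 0x02 if immed else 0x00          # IMMED is bit 1 of byte 1
    cdb[2:6] = lba.to_bytes(4, "big")         # logical block address
    cdb[7:9] = num_blocks.to_bytes(2, "big")  # number of blocks
    return bytes(cdb)


class CommandFilter:
    """Hypothetical stand-in for the proposed sysfs 'filter' attribute."""

    def __init__(self):
        self._blocked = set()

    def write(self, text):
        """Model of `echo "SC" > .../filter`: add mnemonics to the filter."""
        for name in text.split():
            if name not in MNEMONIC_TO_OPCODE:
                raise ValueError(f"unknown mnemonic: {name!r}")
            self._blocked.add(name)

    def read(self):
        """Model of `cat .../filter`: list the active mnemonics."""
        return " ".join(sorted(self._blocked))

    def should_ignore(self, cdb, forced=False):
        """Return True if this CDB would be filtered out.

        `forced` is a hypothetical override for the skilled user who
        really wants the command to reach the device.
        """
        if forced:
            return False
        blocked_opcodes = {MNEMONIC_TO_OPCODE[n] for n in self._blocked}
        return cdb[0] in blocked_opcodes


if __name__ == "__main__":
    flt = CommandFilter()
    flt.write("FU")
    flt.write("SC")
    cdb = build_sync_cache10(immed=True)
    print(flt.read())
    print(cdb.hex(), "ignored:", flt.should_ignore(cdb))
    print(cdb.hex(), "ignored when forced:", flt.should_ignore(cdb, forced=True))
```

A real implementation would also have to decide what status to hand
back for an ignored command; the vendor specific sense data idea above
is not modelled here.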