public inbox for linux-scsi@vger.kernel.org
From: Hannes Reinecke <hare@suse.de>
To: dgilbert@interlog.com, michaelc@cs.wisc.edu,
	linux-scsi@vger.kernel.org, target-devel@vger.kernel.org,
	ceph-devel@vger.kernel.org, axboe@kernel.dk
Subject: Re: [PATCH 0/5] block/scsi/lio support for COMPARE_AND_WRITE
Date: Fri, 17 Oct 2014 08:02:02 +0200
Message-ID: <5440B0DA.4020902@suse.de>
In-Reply-To: <54402421.8060808@interlog.com>

On 10/16/2014 10:01 PM, Douglas Gilbert wrote:
> On 14-10-16 12:39 PM, Douglas Gilbert wrote:
>> On 14-10-16 07:37 AM, michaelc@cs.wisc.edu wrote:
>>> The following patches implement the SCSI command COMPARE_AND_WRITE as a
>>> new bio/request type REQ_CMP_AND_WRITE. COMPARE_AND_WRITE is defined in
>>> the SCSI SBC (SCSI Block Commands) specs as:
>>>
>>> The COMPARE AND WRITE command requests that the device server perform the
>>> following as an uninterrupted series of actions:
>>>
>>> 1) perform the following operations:
>>>          A) read the specified logical blocks; and
>>>          B) transfer the specified number of logical blocks from the
>>>          Data-Out Buffer (i.e., the verify instance of the data is
>>>          transferred from the Data-Out Buffer);
>>>
>>> 2) compare the data read from the specified logical blocks with the verify
>>> instance of the data; and
>>> 3) If the compared data matches, then perform the following operations:
>>>          1) transfer the specified number of logical blocks from the
>>>          Data-Out Buffer (i.e., the write instance of the data transferred
>>>          from the Data-Out Buffer); and
>>>          2) write those logical blocks.
>>>
>>> The most common use of this command today is in VMware ESX, where it is
>>> used for locking. See
>>> http://blogs.vmware.com/vsphere/2012/05/vmfs-locking-uncovered.html
>>> [in ESX it is called ATS (atomic test and set)] for more VMware info.
>>> Linux fits into this use because its SCSI target layer (LIO) is commonly
>>> used as storage for ESX VMs.
>>>
>>> Currently, to support this command in LIO we emulate it by taking a lock,
>>> doing a read, comparing it, then doing a write. The problem this patchset
>>> tries to solve is that in many cases it is more efficient to pass the one
>>> COMPARE_AND_WRITE request directly to the device, where it might have
>>> optimized locking and will also require fewer requests to/from the target
>>> and backing storage device.
>>>
>>> I am also bugging the ceph-devel list, because I am working on LIO + ceph
>>> support. I am interested in using ceph's rbd device for the backing
>>> storage for LIO, and I was thinking this request could be implemented
>>> similar to how REQ_DISCARD (unmap/trim) is going to be, and I wanted to
>>> get some early feedback. I know the scsi layer better, so I have only
>>> added support in sd in this patchset.
>>>
>>> The following patches were made over the target-pending for-next branch
>>> but also apply to Linus's tree.
>>
>> As I found when I implemented this command in sg3_utils,
>> my library's support for handling and reporting the
>> MISCOMPARE sense key needed to be strengthened. [A sense
>> buffer with a MISCOMPARE sense key is what results when
>> the compare in step 2) is unequal.]
>>
>> Since it was relatively rare prior to VMware's use of
>> the COMPARE AND WRITE command, MISCOMPARE is often forgotten
>> in sense key handling. Also it should not be considered
>> as an error and definitely should not lead to the command
>> being retried.
>>
>> The COMPARE AND WRITE command may fail for other reasons
>> such as a transport problem or a Unit Attention, so the
>> SCSI eh logic may need to know about it.
>
> Elaborating ...
>
> Hannes will enjoy this one: say a COMPARE AND WRITE (CAW) fails
> due to a transport error or timeout. What should the EH do *** ?
> Answer: read that LBA(s) to see whether the command succeeded
> (i.e. wrote the new data)! If it did, do nothing; if it didn't,
> repeat the CAW command. And naturally that second CAW may
> yield a MISCOMPARE.
>
Hmm. Surely we should be getting a sense code telling us up to
which block the CAW failed? Reading the LBA(s) seems like a daft
idea to me ...
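
To make the sense-data point concrete, here is a rough sketch of pulling
a miscompare offset out of fixed-format sense data. It assumes the target
actually returns sense (so it does not help with a pure transport error
or timeout), that the sense key is MISCOMPARE (0x0e), and that the VALID
bit is set so the INFORMATION field in bytes 3..6 carries the offset of
the first mismatch. The function name caw_miscompare_offset() is made up
for illustration and is not an existing kernel or sg3_utils helper.

```c
#include <stddef.h>
#include <stdint.h>

#define SENSE_KEY_MISCOMPARE 0x0e

/*
 * Hypothetical helper: extract the INFORMATION field from fixed-format
 * sense data (response code 0x70 or 0x71) when the sense key is
 * MISCOMPARE. Returns 0 and stores the offset on success, -1 otherwise.
 * Descriptor-format sense data (0x72/0x73) is not handled here.
 */
static int caw_miscompare_offset(const uint8_t *sense, size_t len,
                                 uint32_t *offset)
{
    uint8_t resp;

    if (sense == NULL || offset == NULL || len < 8)
        return -1;

    resp = sense[0] & 0x7f;
    if (resp != 0x70 && resp != 0x71)
        return -1;              /* not fixed format */

    if ((sense[2] & 0x0f) != SENSE_KEY_MISCOMPARE)
        return -1;              /* some other sense key */

    if (!(sense[0] & 0x80))
        return -1;              /* VALID bit clear: no usable offset */

    /* INFORMATION field is big-endian in bytes 3..6 */
    *offset = ((uint32_t)sense[3] << 24) | ((uint32_t)sense[4] << 16) |
              ((uint32_t)sense[5] << 8) | (uint32_t)sense[6];
    return 0;
}
```

A caller would hand this the sense buffer returned with CHECK CONDITION
and fall back to other recovery only when it returns -1.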

>
> Mike proposes using ECANCELED for the errno corresponding to
> MISCOMPARE. Not wild about that but can't see anything better,
> and it is definitely much better than EIO.
>
Yup. Please do.
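
Roughly what I would expect the mapping to look like, as a user-space
sketch only: MISCOMPARE becomes -ECANCELED instead of -EIO, and the
caller uses that to decide not to retry. Both function names below are
made up for illustration; this is not the code from the patchset, and
treating every other non-zero sense key as -EIO is a simplification.

```c
#include <errno.h>
#include <stdint.h>

#define SENSE_KEY_NO_SENSE   0x00
#define SENSE_KEY_MISCOMPARE 0x0e

/*
 * Hypothetical mapping from a SCSI sense key to a negative errno.
 * MISCOMPARE is reported as -ECANCELED so callers can tell "the compare
 * did not match" apart from a real I/O failure.
 */
static int sense_key_to_errno(uint8_t sense_key)
{
    switch (sense_key & 0x0f) {
    case SENSE_KEY_NO_SENSE:
        return 0;
    case SENSE_KEY_MISCOMPARE:
        return -ECANCELED;
    default:
        return -EIO;            /* simplification for this sketch */
    }
}

/*
 * Hypothetical retry policy: a miscompare is a valid result of the
 * command and must never be retried; other failures may be.
 */
static int caw_should_retry(int err)
{
    return err != 0 && err != -ECANCELED;
}
```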

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

