From mboxrd@z Thu Jan  1 00:00:00 1970
From: sagi grimberg <sagig@mellanox.com>
Subject: Re: linux rdma 3.14 merge plans
Date: Sun, 19 Jan 2014 13:20:17 +0200
Message-ID: <52DBB4F1.4020400@mellanox.com>
References: <CAJZOPZ+4yQ-sT=ks7+eiJjkxOjy5w=BmG16JVcUPiuVsof7qEA@mail.gmail.com>	 <CAG4TOxOMmvFWnkU3DBn33rscEKh2_YfbUCKY=iY8PCVN3+nEsA@mail.gmail.com>	 <52CD1C68.4050406@mellanox.com>	 <1389645171.5567.459.camel@haakon3.risingtidesystems.com>	 <1389820541.5567.543.camel@haakon3.risingtidesystems.com>	 <CAG4TOxNa32sLxifPx_f8sW04B_qSh01WWfWjRvam6fjvFLDXSQ@mail.gmail.com>	 <1389906852.5567.668.camel@haakon3.risingtidesystems.com>	 <CAG4TOxPeYQ=e5LdJft1Hkx8donUQjJaKDEAv3iRLGxPYJQ_b9w@mail.gmail.com> <1390102949.5567.749.camel@haakon3.risingtidesystems.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 7bit
Return-path: <target-devel-owner@vger.kernel.org>
In-Reply-To: <1390102949.5567.749.camel@haakon3.risingtidesystems.com>
Sender: target-devel-owner@vger.kernel.org
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>, Roland Dreier <roland@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>, Hefty Sean <sean.hefty@intel.com>, Or Gerlitz <ogerlitz@mellanox.com>, linux-rdma <linux-rdma@vger.kernel.org>, "Martin K. Petersen" <martin.petersen@oracle.com>, target-devel <target-devel@vger.kernel.org>
List-Id: linux-rdma@vger.kernel.org

On 1/19/2014 5:42 AM, Nicholas A. Bellinger wrote:
> On Sat, 2014-01-18 at 13:42 -0800, Roland Dreier wrote:
>> On Thu, Jan 16, 2014 at 1:14 PM, Nicholas A. Bellinger
>> <nab@linux-iscsi.org> wrote:
>>> I've reviewed the API from the perspective of what's required for
>>> implementing protection support in iser, and currently don't have any
>>> recommendations or objections beyond what has been proposed by Sagi & Co
>>> in PATCH-v4 code.
>> I guess I'm a little confused about why we need verbs support for this
>> to implement DIF/DIX in iser.  Isn't the whole point of protection to
>> have end-to-end checksums, rather than having checksums computed by
>> the transport after there's a chance for corruption?
>>
> So to my knowledge, there are three target side DIX HBA modes of
> operation:
>
>    - TARGET PASS: Fabric + Backend support PI
>    - TARGET INSERT: Fabric does not support PI, backend supports PI
>    - TARGET STRIP: Fabric supports PI, backend does not support PI
>
> The scenario you're thinking about above is the 'TARGET INSERT' case,
> where the initiator does not generate PI, but the backend device on the
> target side expects PI, so the target fabric ends up generating PI on
> incoming WRITEs, and verifying + stripping PI on outgoing READs.
>
> The scenario for 'TARGET STRIP' is when the initiator generates PI but
> the backend device does not support/process PI, so the target verifies +
> strips PI on incoming WRITEs, and inserts PI on outgoing READs.
>
> You're correct that both of these modes don't provide true end-to-end
> protection, and my understanding is that they are provided as a way to
> accommodate existing fabrics + backend devices where PI is not supported
> all the way through the stack.
>
> The 'TARGET PASS' is the scenario that provides true end-to-end
> guarantees, where for WRITEs PI is generated by the Host OS, verified +
> passed on the initiator side HBA, verified + passed on the target HBA,
> and verified + stored on the device backend.  For READs, PI is retrieved
> from the backend device, verified + passed on the target HBA, verified +
> passed on the initiator HBA, and finally verified on the Host OS.
>
> So in the proposed RDMA VERBs changes these three modes of target DIX
> operation are supported.  Also it's my understanding (Sagi & Co, please
> correct me), that the proposed changes are implemented to be independent
> of target/initiator mode DIX operation.

Correct. The verbs API allows all supported protection operations
without any peer dependencies.

>
> --nab
>

Thanks Nic, let me elaborate on this.

It is true that T10-PI aims for end-to-end data integrity. What the
verbs API offers is HW offload for protection information processing,
which is VERY expensive to do on the CPU (a CRC verify for each block).
T10-PI is intended to be offloaded, both on the backend devices that
support this feature and on fabric transports (such as the Qlogic/Emulex
FC drivers). The verbs API just adds RDMA to the game...

T10-PI specifies the protection information scheme such that, over the
wire, each protection interval is followed by 8 bytes of data-integrity
information (interleaved). In memory, on the other hand, the data and
the protection block-guards may lie in separate buffers (for example,
that is the preferred approach in the block and SCSI layers).
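
To make the two layouts concrete, here is a minimal software sketch of
the interleaved wire pattern versus the separated memory pattern. It
assumes 512-byte protection intervals and the usual 8-byte tuple (2-byte
guard, 2-byte app tag, 4-byte ref tag, big endian). The helper names
(make_pi, to_wire, from_wire) are made up for illustration and are not
part of the verbs API; the HCA does this in hardware.

```python
import struct

T10_DIF_POLY = 0x8BB7  # CRC-16 polynomial used for the T10-PI guard tag


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16/T10-DIF: init 0, no reflection, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10_DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def make_pi(block: bytes, app_tag: int, ref_tag: int) -> bytes:
    """Pack one 8-byte tuple: guard (2), app tag (2), ref tag (4)."""
    return struct.pack(">HHI", crc16_t10dif(block), app_tag, ref_tag)


def to_wire(data: bytes, prot: bytes, interval: int = 512) -> bytes:
    """Separated memory buffers -> interleaved wire stream."""
    out = bytearray()
    for i in range(len(data) // interval):
        out += data[i * interval:(i + 1) * interval]
        out += prot[i * 8:(i + 1) * 8]
    return bytes(out)


def from_wire(wire: bytes, interval: int = 512):
    """Interleaved wire stream -> (data buffer, protection buffer)."""
    data, prot = bytearray(), bytearray()
    step = interval + 8
    for off in range(0, len(wire), step):
        data += wire[off:off + interval]
        prot += wire[off + interval:off + step]
    return bytes(data), bytes(prot)
```

With two 512-byte blocks the wire stream is 1040 bytes long, and
from_wire() gives back the original data and protection buffers.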

The newly introduced REG_SIG_MR work request allows one to (fast)
register a "signature" memory key which incorporates the protection
method and pattern used:
1. Where is the data? (data sge)
2. Where are the protection block-guards? (protection sge)
3. What are the signature attributes? (T10-PI method, crc/reftag/apptag
   seeds, verify mask, memory pattern, wire pattern)
When doing the actual data transfer, the HCA will enforce the T10-PI
scheme (see my cover letter for a more detailed explanation).
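
As a rough model of what such a registration carries, the three
questions above can be written down as a small descriptor. This is only
a sketch: every type and field name below (Sge, SigDomain,
SigRegistration, Layout, Check) is hypothetical and chosen for
readability; the actual structures are the ones in the patch set.

```python
from dataclasses import dataclass
from enum import Enum, Flag, auto
from typing import Optional


class Layout(Enum):
    INTERLEAVED = auto()  # each data block followed by its 8-byte tuple
    SEPARATED = auto()    # data and block-guards in different buffers
    NONE = auto()         # no protection information in this domain


class Check(Flag):
    GUARD = auto()
    APPTAG = auto()
    REFTAG = auto()


@dataclass
class Sge:
    addr: int
    length: int


@dataclass
class SigDomain:
    layout: Layout
    pi_interval: int = 512
    crc_seed: int = 0
    app_tag: int = 0
    ref_tag: int = 0


@dataclass
class SigRegistration:
    data: Sge             # 1. where is the data?
    prot: Optional[Sge]   # 2. where are the protection block-guards?
    mem: SigDomain        # 3. signature attributes: memory pattern,
    wire: SigDomain       #    wire pattern,
    check: Check          #    and verify mask

    def wire_length(self) -> int:
        """Bytes on the wire: data plus 8 bytes per interval, if any."""
        blocks = self.data.length // self.mem.pi_interval
        extra = 0 if self.wire.layout is Layout.NONE else 8 * blocks
        return self.data.length + extra
```

For example, 4096 bytes of data with separated block-guards in memory
and an interleaved wire pattern gives wire_length() == 4160.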

If you take a look at the SCSI implementation, you will see that SCSI
signals the protection attributes to the transport in scsi_cmnd (prot
SG-list, protection type, guard type, protection operation). The verbs
API allows full offload of all T10-PI operations:
1. INSERT - the HCA computes/generates data-integrity block-guards and
     writes them according to the specified pattern (interleaved with
     the data or in a separate buffer).
2. STRIP (and VERIFY) - the HCA verifies incoming data-integrity
     block-guards and strips them from the data stream.
3. PASS (and VERIFY) - the HCA verifies incoming data-integrity
     block-guards and passes them forward according to the specified
     pattern (interleaved/separated).
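
A software model of the three operations may help show what is being
offloaded. This is a sketch only, assuming 512-byte intervals, an app
tag of 0, and a ref tag that increments by one per block; the function
names (insert, strip, pass_through) are illustrative and do not exist in
the verbs API.

```python
import struct


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16/T10-DIF (poly 0x8BB7, init 0, no reflection)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x8BB7 if crc & 0x8000 else crc << 1) & 0xFFFF
    return crc


def insert(data: bytes, interval: int = 512, ref_tag: int = 0) -> bytes:
    """INSERT: generate a block-guard for every interval, interleaved."""
    out = bytearray()
    for n, off in enumerate(range(0, len(data), interval)):
        block = data[off:off + interval]
        ref = (ref_tag + n) & 0xFFFFFFFF
        out += block + struct.pack(">HHI", crc16_t10dif(block), 0, ref)
    return bytes(out)


def verify(stream: bytes, interval: int = 512, ref_tag: int = 0):
    """Check guard and ref tag of each interval; yield (block, tuple)."""
    step = interval + 8
    for n, off in enumerate(range(0, len(stream), step)):
        block = stream[off:off + interval]
        pi = stream[off + interval:off + step]
        guard, _app, ref = struct.unpack(">HHI", pi)
        if guard != crc16_t10dif(block):
            raise ValueError(f"guard mismatch in block {n}")
        if ref != (ref_tag + n) & 0xFFFFFFFF:
            raise ValueError(f"ref tag mismatch in block {n}")
        yield block, pi


def strip(stream: bytes, interval: int = 512, ref_tag: int = 0) -> bytes:
    """STRIP (and VERIFY): check every tuple, return the data only."""
    return b"".join(block for block, _ in verify(stream, interval, ref_tag))


def pass_through(stream: bytes, interval: int = 512, ref_tag: int = 0) -> bytes:
    """PASS (and VERIFY): check every tuple, forward data plus tuples."""
    return b"".join(block + pi for block, pi in verify(stream, interval, ref_tag))
```

Here strip(insert(data)) returns the original data, and flipping any
single bit in a protected block makes strip() or pass_through() raise
ValueError.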

In addition, the verbs API can be easily extended to support other
data-integrity methods (XOR-32, CRC-32, etc...) so that an application
interested in data integrity has signature verbs in its tool-box. This
is why we use the "Signature" notation and refer to T10-PI as a specific
signature method.

Hope this helps,
Sagi.