From: Martin Svec <martin.svec@zoner.cz>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: target-devel@vger.kernel.org, linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: NAA breakage
Date: Sat, 10 Sep 2011 17:00:06 +0200
Message-ID: <4E6B7B76.4090805@zoner.cz>
In-Reply-To: <1315604283.7420.58.camel@haakon2.linux-iscsi.org>

Hi Nicholas,

thanks for your response, see my comments below.

On 9.9.2011 23:38, Nicholas A. Bellinger wrote:
> On Fri, 2011-09-09 at 14:27 +0200, Martin Svec wrote:
>> Hello folks,
>>
>> I'd like to reopen this discussion because there was no conclusion in
>> the last three weeks and I still believe that the present implementation
>> of NAA IDs is wrong, regardless of Andy Shevchenko's patch. Let me
>> explain why:
>>
> Hi Martin,
>
> Thanks for your follow-up here, Martin.  Getting this resolved for v3.1
> is still on my todo list, and the patch I would like to push will be
> ready for review in the next few days.  My thoughts on your comments are
> below.
>
>> (1) According to SCSI SPC-3 (7.6.10), 0x80 VPD unit serial number is a
>> vendor-assigned variable-length string of ASCII data with characters
>> 20h through 7Eh.
>>
> Correct
>
>> (2) target_emulate_evpd_83() wrongly assumes that the unit serial
>> number is a hex-encoded string with at least 25 characters and
>> generates NAA ID using hex2bin() from its first 25 chars.
>>
> Correct, the fix that I think makes the most sense here is to ensure
> that the unit serial number always contains only hex digits (by
> stripping out the non-hex characters) when set via configfs in
> vpd_unit_serial.

Martin: But then you place additional undocumented restrictions on the 
unit serial number format, don't you?

>> (3) SCSI SPC-3 (7.6.3.6.4) states that NAA IEEE Registered Extended
>> identifier is a 16-byte fixed-length binary sequence that is
>> _uniquely_ assigned by the organization associated with the IEEE
>> company_id (LIO uses OpenFabrics IEEE ID 00 14 05). That is, NAA ID
>> must be a guaranteed _stable_ worldwide-unique identifier and e.g.
>> VMware strongly relies on this.
>>
>>   From (1) and (2) it follows that LIO does not guarantee
>> uniqueness and in fact very easily produces duplicate NAA IDs. For
>> example, unit serial numbers with a common 25-character prefix will
>> necessarily lead to the same NAA ID.
> It's the job of userspace to generate a UUID for the unit_serial number,
> and to ensure (as much as possible) the UUID is unique.  Taking the
> first 25 characters of this value has not created a problem so far.  Can
> you give an example of how it's 'very easily' able to produce duplicate
> NAA IDs..?

Martin: Again, SPC-3 says nothing about a hex-character UUID as a unit
serial number. That's the key point I'm trying to emphasize -- you assume
that it has a particular format, but everybody who follows only the
SPC-3 specification and doesn't know these LIO-specific restrictions
risks duplicate NAA IDs. Below is an example of two unique, SPC-3
compliant serial numbers that result in the same NAA ID (tested with
mainline kernel 3.1.0-rc1+):

$ sg_inq -p 0x83 /dev/sdc
VPD INQUIRY: Device Identification page
   Designation descriptor number 1, descriptor length: 20
     designator_type: NAA,  code_set: Binary
     associated with the addressed logical unit
       NAA 6, IEEE Company_id: 0x140f
       Vendor Specific Identifier: 0xfefbfef9f
       Vendor Specific Identifier Extension: 0xefefcfbfefef9fef
       [0x600140ffefbfef9fefefcfbfefef9fef]
   Designation descriptor number 2, descriptor length: 78
     designator_type: T10 vendor identification,  code_set: ASCII
     associated with the addressed logical unit
       vendor id: LIO-ORG
       vendor specific: IBLOCK:OurCompanyProductionSAN.StorageServer12.Customer524.Drive1

$ sg_inq -p 0x83 /dev/sdd
VPD INQUIRY: Device Identification page
   Designation descriptor number 1, descriptor length: 20
     designator_type: NAA,  code_set: Binary
     associated with the addressed logical unit
       NAA 6, IEEE Company_id: 0x140f
       Vendor Specific Identifier: 0xfefbfef9f
       Vendor Specific Identifier Extension: 0xefefcfbfefef9fef
       [0x600140ffefbfef9fefefcfbfefef9fef]
   Designation descriptor number 2, descriptor length: 78
     designator_type: T10 vendor identification,  code_set: ASCII
     associated with the addressed logical unit
       vendor id: LIO-ORG
       vendor specific: IBLOCK:OurCompanyProductionSAN.StorageServer12.Customer524.Drive2

Clearly, the serial numbers differ only in the last character (the drive
number), far beyond the 25 characters used for the NAA ID. For somebody
who is starting with (i)SCSI and wants a unique and readable
identification of their LUNs, these serial numbers IMHO make perfect
sense and are SPC-3 compliant, but VMware vSphere will be totally
confused by the NAA IDs that LIO generates from them. And for equally
sized LUNs, I guess that there is even a chance of data corruption,
because VMware will probably assume that the two LUNs are two _paths_ to
the _same_ LUN (I'll try to test it).
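
To make the collision easier to follow, here is a rough user-space model
of the conversion in Python. It is not the kernel source. It assumes that
hex_to_bin() returns -1 for a non-hex character and that hex2bin()
computes (high << 4) + low in an 8-bit value without checking for -1; the
function name naa_from_serial() is made up for this illustration. Under
those assumptions it reproduces the NAA ID printed by sg_inq above:

```python
HEX_DIGITS = "0123456789abcdef"


def hex_to_bin(ch: str) -> int:
    """Return 0..15 for a hex digit and -1 for anything else (assumed behaviour)."""
    return HEX_DIGITS.find(ch.lower())


def naa_from_serial(serial: str) -> str:
    """Model of the NAA 6 designator built from the first 25 serial characters."""
    # Short serials are padded with NULs here; that padding is an assumption
    # of this sketch, not something taken from the kernel.
    s = serial.ljust(25, "\0")[:25]
    # NAA type 6 followed by the first 20 bits of the company ID 00 14 05.
    naa = bytearray([0x60, 0x01, 0x40])
    # The low nibble of the company ID shares a byte with the first character.
    naa.append((0x50 | hex_to_bin(s[0])) & 0xFF)
    # The remaining 24 characters are consumed as 12 unchecked hex pairs.
    for i in range(1, 25, 2):
        naa.append(((hex_to_bin(s[i]) << 4) + hex_to_bin(s[i + 1])) & 0xFF)
    return naa.hex()


prefix = "OurCompanyProductionSAN.StorageServer12.Customer524."
print(naa_from_serial(prefix + "Drive1"))
print(naa_from_serial(prefix + "Drive2"))
```

Both calls return the same value because nothing past the 25th character
is ever read, and every non-hex character collapses into the same bits.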

Note that the above example is a real-world example that I hit in
January when I started to play with iSCSI and LIO. After the first
multipathing issues, and without any knowledge of LIO/SPC-3/NAA, it was
easier for me to change my scripts to generate serial numbers based on
hashes than to read the standards and LIO sources to find out whether
it was my bug or not.

Other examples of colliding serial numbers are all strings that
contain no hex characters, or that contain a mixture of hex and non-hex
characters with identical hex characters occurring at the same offsets.

>> With Andy Shevchenko's patch, the
>> same also holds for serial numbers that contain only non-hex
>> characters in first 25 bytes, resulting in NAA IDs full of 0xff. And
>> there are other cases where hex2bin() conversion applied to serial
>> numbers leads to duplicates.
>>
>> So the way NAA ID is generated from the serial number seems to be
>> broken and does not guarantee NAA ID uniqueness even if the serial
>> numbers are unique and SPC-3 compliant.
>>
>> However, I think that the solution is easy:
>>
>> (a) Provide a ConfigFS entry for NAA ID to allow userspace to maintain
>> the uniqueness on its own.
>>
> I am against exposing the NAA ID as a configfs attribute.  I still think
> basing this upon the EVPD 0x80 unit serial still makes the most sense,
> and to make userspace ensure (as much as possible) that the UUID ->  unit
> serial is unique.
>
>> (b) If no ConfigFS NAA ID is specified, target_emulate_evpd_83()
>> should make the best effort to generate unique NAA ID from the unit
>> serial number. An obvious solution is to compute a hash (e.g. SHA1)
>> from the unit serial number and use its 13 most significant bytes to
>> fill vendor-specific NAA ID bytes.
>>
> Generating a hash based upon unit serial for the vendor-specific NAA ID
> bytes might be useful, but I am still not convinced there is a real
> problem of duplicate NAA IDs using UUID based unit serials for the
> vendor specific area..
>
>> Yes, the drawback is that such a change breaks NAA IDs of existing
>> setups. It's a question if it is better to maintain backward
>> compatibility, or fix it while LIO is in mainline for a short time yet.
>>
> I think the drawback is worth the extra pain here..  As mentioned, I am
> still leaning toward a simple fix to force hex characters for all
> vpd_unit_serial values set via configfs.
>
> --nab
>

Martin: Yes, that's a possible solution -- enforce vpd_unit_serial to
be a hex-character string at least 25 characters long, and document
that the first 25 characters must be unique within a given SAN. It's
more restrictive than what SPC-3 says, but at least it doesn't allow
setting vpd_unit_serial to something that leads to duplicate NAA IDs.
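
Just to illustrate the rule I mean, here is a small Python sketch of the
two checks. It is a user-space illustration only, not a proposed kernel
patch, and the names PREFIX_LEN, is_acceptable_serial() and
prefix_collisions() are invented for this example:

```python
import string

PREFIX_LEN = 25  # number of leading serial characters that feed the NAA ID


def is_acceptable_serial(serial: str) -> bool:
    """True if the serial is all hex digits and at least PREFIX_LEN long."""
    return len(serial) >= PREFIX_LEN and all(
        c in string.hexdigits for c in serial
    )


def prefix_collisions(serials):
    """Return pairs of serials that share the same first PREFIX_LEN characters."""
    seen = {}
    collisions = []
    for serial in serials:
        # Hex digits are compared case-insensitively, since 'A' and 'a'
        # decode to the same value.
        key = serial[:PREFIX_LEN].lower()
        if key in seen:
            collisions.append((seen[key], serial))
        else:
            seen[key] = serial
    return collisions
```

The first check rejects my "OurCompanyProductionSAN..." serials outright;
the second one is what userspace would still have to run across the whole
SAN.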

I still think that my proposal is better because it provides the same
guarantees without additional restrictions beyond the SPC-3 standard,
but I can live with it :-) All I want is to save future LIO users from
surprises caused by NAA ID generation based on undocumented
vpd_unit_serial assumptions.
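
For reference, the hash-based variant I proposed could look roughly like
this in Python. It is only a sketch of the idea: the exact choice of
digest bits is my assumption, and naa_from_serial_hash() is a hypothetical
name. It keeps the NAA type 6 and the company ID 00 14 05 and fills the
remaining 100 bits from the SHA1 of the serial:

```python
import hashlib


def naa_from_serial_hash(serial: str) -> str:
    """Build a 16-byte NAA 6 designator from the SHA1 of the unit serial."""
    digest = hashlib.sha1(serial.encode("ascii")).digest()
    # NAA type 6, company ID 00 14 05; the low nibble of the fourth byte is
    # the first 4 bits of the vendor-specific identifier.
    naa = bytearray([0x60, 0x01, 0x40, 0x50 | (digest[0] >> 4)])
    # The next 12 digest bytes fill the rest of the vendor-specific
    # identifier and the identifier extension (96 bits).
    naa += digest[1:13]
    return naa.hex()


prefix = "OurCompanyProductionSAN.StorageServer12.Customer524."
print(naa_from_serial_hash(prefix + "Drive1"))
print(naa_from_serial_hash(prefix + "Drive2"))
```

Any printable serial is accepted, the whole string contributes to the
result, and the same serial always maps to the same NAA ID.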

Finally, please remember that VMware strongly relies on _unique_ and 
_stable_ NAA IDs. So your decision should be definitive, unless you 
provide a configfs interface for NAA IDs. It would be really bad to 
generate different NAA IDs in different versions of the mainline kernel.

Martin

