From: Martin Svec <martin.svec@zoner.cz>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: target-devel@vger.kernel.org, linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: NAA breakage
Date: Sat, 10 Sep 2011 17:00:06 +0200 [thread overview]
Message-ID: <4E6B7B76.4090805@zoner.cz> (raw)
In-Reply-To: <1315604283.7420.58.camel@haakon2.linux-iscsi.org>
Hi Nicholas,
thanks for your response, see my comments below.
Dne 9.9.2011 23:38, Nicholas A. Bellinger napsal(a):
> On Fri, 2011-09-09 at 14:27 +0200, Martin Svec wrote:
>> Hello folks,
>>
>> I'd like to reopen this discussion because there was no conclusion in
>> last three weeks and I still believe that the present implementation
>> of NAA IDs is wrong, regardless of the Andy's Shevchenko patch. Let me
>> explain why:
>>
> Hi Martin,
>
> Thanks for your follow-up here Martin. Getting this resolved for v3.1
> is still on my todo list, and the patch I would like to push will be
> ready for review in the next days. My thoughts on your comments are
> below.
>
>> (1) According to SCSI SPC-3 (7.6.10), 0x80 VPD unit serial number is a
>> vendor-assigned variable-length string of ASCII data with characters
>> 20h through 7Eh.
>>
> Correct
>
>> (2) target_emulate_evpd_83() wrongly assumes that the unit serial
>> number is a hex-encoded string with at least 25 characters and
>> generates NAA ID using hex2bin() from its first 25 chars.
>>
> Correct, the fix that I think makes the most sense here is to ensure
> that the unit serial number always contains only hex digits (by
> stripping out the non hex charactes) when set via configfs in
> vpd_unit_serial.
Martin: But then you place additional undocumented restrictions on the
unit serial number format, don't you?
>> (3) SCSI SPC-3 (7.6.3.6.4) states that NAA IEEE Registered Extended
>> identifier is a 16-byte fixed-length binary sequence that is
>> _uniquely_ assigned by the organization associated with the IEEE
>> company_id (LIO uses OpenFabrics IEEE ID 00 14 05). That is, NAA ID
>> must be a guaranteed _stable_ worldwide-unique identifier and e.g.
>> VMware strongly relies on this.
>>
>> From (1) and (2) it follows me that LIO does not guarantee the
>> uniqueness and in fact it very easily produces duplicate NAA IDs. For
>> example, unit serial numbers with a common 25-character prefix will
>> necessarily lead to the same NAA ID.
> It's the job of userspace to generate a UUID for the unit_serial number,
> and to ensure (as much as possible) the UUID is unique. Taking the
> first 25 characters of this value has not created a problem so far. Can
> you give an example of how it's 'very easily' able to produce duplicate
> NAA IDs..?
Martin: Again, SPC-3 says nothing about hex-character UUID as a unit
serial number. That's the key point I try to emphasize -- you assume
that it has a particular format but everbody who follows only the
SPC-3 specification and doesn't know these LIO-specific restrictions
risks duplicate NAA IDs. Below is an example of two unique SPC-3
compliant serial numbers that result in the same NAA ID (tested with
mainline kernel 3.1.0-rc1+):
$ sg_inq -p 0x83 /dev/sdc
VPD INQUIRY: Device Identification page
Designation descriptor number 1, descriptor length: 20
designator_type: NAA, code_set: Binary
associated with the addressed logical unit
NAA 6, IEEE Company_id: 0x140f
Vendor Specific Identifier: 0xfefbfef9f
Vendor Specific Identifier Extension: 0xefefcfbfefef9fef
[0x600140ffefbfef9fefefcfbfefef9fef]
Designation descriptor number 2, descriptor length: 78
designator_type: T10 vendor identification, code_set: ASCII
associated with the addressed logical unit
vendor id: LIO-ORG
vendor specific:
IBLOCK:OurCompanyProductionSAN.StorageServer12.Customer524.Drive1
$ sg_inq -p 0x83 /dev/sdd
VPD INQUIRY: Device Identification page
Designation descriptor number 1, descriptor length: 20
designator_type: NAA, code_set: Binary
associated with the addressed logical unit
NAA 6, IEEE Company_id: 0x140f
Vendor Specific Identifier: 0xfefbfef9f
Vendor Specific Identifier Extension: 0xefefcfbfefef9fef
[0x600140ffefbfef9fefefcfbfefef9fef]
Designation descriptor number 2, descriptor length: 78
designator_type: T10 vendor identification, code_set: ASCII
associated with the addressed logical unit
vendor id: LIO-ORG
vendor specific:
IBLOCK:OurCompanyProductionSAN.StorageServer12.Customer524.Drive2
Clearly, the serial numbers differ only in the last character (drive
number), far beyond the 25 characters used for NAA. For somebody that
starts with (i)SCSI and wants a unique and readable identification of
its LUNs, these serial numbers IMHO perfectly make sense, are SPC-3
compliant, but VMware vSphere will be totally confused of their NAAs
generated by LIO. And for equally sized LUNs, I guess that there is
even a chance for data corruption because VMware will probably assume
that the two LUNs are two _paths_ to the _same_ LUN (I'll try to test it).
Note that the above example is a real-world example that I hit in
January when I started to play with iSCSI and LIO. After first
multipathing issues and without any knowledge of LIO/SPC3/NAA, it was
easier for me to change my scripts to generate serial numbers based on
hashes, rather than read standards and LIO sources to find out if it
was my bug or not.
Another examples of colliding serial numbers are all strings that
contain no hex characters or contain a mixture of hex and non-hex
characters with identical hex characters ocurring on the same offsets.
>> With Andy's Shevchenko patch, the
>> same also holds for serial numbers that contain only non-hex
>> characters in first 25 bytes, resulting in NAA IDs full of 0xff. And
>> there are other cases where hex2bin() conversion applied to serial
>> numbers leads to duplicates.
>>
>> So the way NAA ID is generated from the serial number seems to be
>> broken and does not guarantee NAA ID uniqueness even if the serial
>> numbers are unique and SPC-3 compliant.
>>
>> However, I think that the solution is easy:
>>
>> (a) Provide a ConfigFS entry for NAA ID to allow userspace to maintain
>> the uniqueness on its own.
>>
> I am against exposing the NAA ID as a configfs attribute. I still think
> basing this upon the EVPD 0x80 unit serial still makes the most sense,
> and to make userspace ensure (as much as possible) that the UUID -> unit
> serial is unique.
>
>> (b) If no ConfigFS NAA ID is specified, target_emulate_evpd_83()
>> should make the best effort to generate unique NAA ID from the unit
>> serial number. An obvious solution is to compute a hash (e.g. SHA1)
>> from the unit serial number and use its 13 most significant bytes to
>> fill vendor-specific NAA ID bytes.
>>
> Generating a hash based upon unit serial for the vendor-specific NAA ID
> bytes might be useful, but I am still not convinced there is a real
> problem of duplicate NAA IDs using UUID based unit seriales for the
> vendor specific area..
>
>> Yes, the drawback is that such a change breaks NAA IDs of existing
>> setups. It's a question if it is better to maintain backward
>> compatibility, or fix it while LIO is in mainline for a short time yet.
>>
> I think the drawback is worth the extra pain here.. As mentioned, I am
> still leaning toward a simple fix to force hex characters for all
> vpd_unit_serial values set via configfs.
>
> --nab
>
Martin: Yes, that's a possible solution -- enforce vpd_unit_serial to
be a hex-character string at least 25 characters long, and document
that the first 25 characters must be unique within a given SAN. It's
more restrictive than SPC-3 says but at least it doesn't allow to set
vpd_unit_serial to something that leads to duplicate NAA IDs.
I still think that my proposal is better because it provides the same
guarantees without additional restrictions to SPC-3 standard but I can
live with it :-) All that I want is to save future LIO users from
surprises caused by the NAA ID generation based on undocumented
vpd_unit_serial assumptions.
Finally, please remember that VMware strongly relies on _unique_ and
_stable_ NAA IDs. So your decision should be definitive, unless you
provide a configfs interface for NAA IDs. It would be really bad to
generate different NAA IDs in different versions of mainline kernel.
Martin
next prev parent reply other threads:[~2011-09-10 15:06 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <4E494F9F.4000909@zoner.cz>
[not found] ` <1313522082.2853.10.camel@haakon2.linux-iscsi.org>
[not found] ` <4E4BAB1D.6070104@zoner.cz>
[not found] ` <1313618333.9928.61.camel@haakon2.linux-iscsi.org>
[not found] ` <4E528F05.9090208@zoner.cz>
[not found] ` <4E6A0618.3040102@zoner.cz>
2011-09-09 21:38 ` NAA breakage Nicholas A. Bellinger
2011-09-10 15:00 ` Martin Svec [this message]
2011-09-10 20:37 ` Nicholas A. Bellinger
2011-09-11 11:52 ` Martin Svec
2011-09-11 14:00 ` Chris Boot
2011-09-12 7:35 ` Nicholas A. Bellinger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4E6B7B76.4090805@zoner.cz \
--to=martin.svec@zoner.cz \
--cc=linux-scsi@vger.kernel.org \
--cc=nab@linux-iscsi.org \
--cc=target-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox