public inbox for linux-scsi@vger.kernel.org
* Persistent reservation behaviour/compliance with redundant controllers
@ 2013-12-25 23:00 Matthias Eble
  2014-01-06 22:20 ` Lee Duncan
  0 siblings, 1 reply; 10+ messages in thread
From: Matthias Eble @ 2013-12-25 23:00 UTC (permalink / raw)
  To: linux-scsi

Hi all,

I'm seeing behaviour that, from my point of view, doesn't comply with the
SPC-3/4 standards. I have read the T10 drafts to understand SCSI-3 persistent
reservations (PR). I may simply have misread the standard, but maybe somebody
can shed some light on the situation.

My understanding of SPC-3/4 is that with PR, registrations should happen on any
I_T Nexus accessing a volume. To me, in a dm-multipath environment, this
translates to "register every single path".

But that doesn't work on our 3Par 7400.
Now the question is, who is wrong? Me (likely :-), or HP/3Par (unlikely).


Here's the dmmp map
360002ac0000000000000000a00006e6b dm-6 3PARdata,VV
size=2.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 3:0:1:4  sdg  8:96    active ready running
  |- 3:0:3:4  sdl  8:176   active ready running
  |- 5:0:3:4  sdbg 67:160  active ready running
  `- 5:0:1:4  sdce 69:32   active ready running
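
A minimal sketch, assuming the `multipath -ll` layout shown above: to act on
"every single path", the path device names first have to be pulled out of the
map. The helper name `mpath_devices` is made up for this example, and it
assumes the device name is the field right after the H:C:T:L field.

```shell
# Hypothetical helper: print the sdX name of every path found in
# `multipath -ll` output read from stdin. A path line is recognised by its
# H:C:T:L field; the device name is assumed to be the next field.
mpath_devices() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) {
                print $(i + 1)
                break
            }
        }
    }'
}
```

Fed with the map above (for example via `multipath -ll <wwid> | mpath_devices`),
this prints sdg, sdl, sdbg and sdce, one per line.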


Here are the commands:
1: starting with a clean state:
   # sg_persist --in --read-keys /dev/sdg
     3PARdata  VV                3122
     Peripheral device type: disk
     PR generation=0x3a, there are NO registered reservation keys

2: first registration (sdg) works fine:
   # sg_persist -d /dev/sdg --no-inquiry --out --register \
                --param-sark=0x420480a029000067

3: however registering sdl fails:
   # sg_persist -d /dev/sdl --no-inquiry --out --register \
                --param-sark=0x420480a02900006c
      persistent reserve out: scsi status: Reservation Conflict

When I --register-*ignore* the second device, the command succeeds.
But the first registration key for sdg gets substituted by the new one for sdl.
The same thing happens the other way around when sdg is register-ignore'd
again.

There can only be two registrations at a time: (sdg XOR sdl) and (sdbg XOR sdce)
Now my question is: Does this comply with the standard?

My core problem is that I'd like to ensure that no registration is missing
by accident.
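
A rough sketch of such a check, under the assumption that
`sg_persist --in --read-keys` prints every registered key on a line of its
own as a hex number and that one key is expected per path. The function names
`count_pr_keys` and `all_paths_registered` are made up for this example; an
array that reports keys differently would need a different check.

```shell
# Hypothetical helper: count the keys listed in `sg_persist --in --read-keys`
# output read from stdin. Assumes one "0x..." key per line.
count_pr_keys() {
    awk '/^[ \t]*0x[0-9a-fA-F]+[ \t]*$/ { n++ } END { print n + 0 }'
}

# Succeeds only if the number of listed keys equals the expected number of
# paths passed as $1. The read-keys output is read from stdin.
all_paths_registered() {
    [ "$(count_pr_keys)" -eq "$1" ]
}
```

For the four-path map above this could be called as
`sg_persist --in --read-keys /dev/sdg | all_paths_registered 4`.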

I hope that somebody on this list is kind enough to answer my question or
give me a hint. HP was not able to direct it to a capable person in the
last 9 months. *sigh*


Any help is appreciated!

Thanks in advance,
Matthias



3Par specific information:
3Par systems have a transparent controller (node) failover feature.
In the example above, scsi host3 has two paths to the same volume.
The paths are provided by two different controller nodes.
If one node fails, the other node can take over the path transparently.
To me it looks like this failover is too transparent when it comes to
SG3PR.


* Re: Persistent reservation behaviour/compliance with redundant controllers
  2013-12-25 23:00 Persistent reservation behaviour/compliance with redundant controllers Matthias Eble
@ 2014-01-06 22:20 ` Lee Duncan
  2014-01-06 22:53   ` Matthias Eble
  0 siblings, 1 reply; 10+ messages in thread
From: Lee Duncan @ 2014-01-06 22:20 UTC (permalink / raw)
  To: Matthias Eble, linux-scsi

On 12/25/2013 03:00 PM, Matthias Eble wrote:
> Hi all,
> 
> I'm experiencing a behaviour that doesn't comply to the SPC3/4 standards from
> my point of view. I have read the t10 drafts to understand scsi3 persistent
> reservations (PR). Probably I simply got the standard wrong, but maybe somebody
> can bring light into the situation.
> 
> My understanding of SPC-3/4 is that with PR, registrations should happen on any
> I_T Nexus accessing a volume. To me, in a dm-multipath environment, this
> translates to "register every single path".
> 
> But that doesn't work on our 3Par 7400.
> Now the question is, who is wrong? Me (likely :-), or HP/3Par (unlikely).
> 
> 
> Here's the dmmp map
> 360002ac0000000000000000a00006e6b dm-6 3PARdata,VV
> size=2.0T features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 3:0:1:4  sdg  8:96    active ready running
>   |- 3:0:3:4  sdl  8:176   active ready running
>   |- 5:0:3:4  sdbg 67:160  active ready running
>   `- 5:0:1:4  sdce 69:32   active ready running
> 
> 
> Here are the commands:
> 1: starting with a clean state:
>    # sg_persist --in --read-keys /dev/sdg
>      3PARdata  VV                3122
>      Peripheral device type: disk
>      PR generation=0x3a, there are NO registered reservation keys
> 
> 2: first registration (sdg) works fine:
>    # sg_persist -d /dev/sdg --no-inquiry --out --register \
>                 --param-sark=0x420480a029000067
> 
> 3: however registering sdl fails:
>    # sg_persist -d /dev/sdl --no-inquiry --out --register \
>                 --param-sark=0x420480a02900006c
>       persistent reserve out: scsi status: Reservation Conflict
> 
> When I --register-*ignore* the second device, the command succeeds.
> But the first registration key for sdg gets substituted by the new one for sdl.
> The same thing happens the other way around when sdg is register-ignore'd
> again.
> 
> There can only be two registrations at a time: (sdg XOR sdl) and (sdbg XOR sdce)
> Now my question is: Does this comply to the standard?
> 
> My core problem is that I'd like to ensure that no registration is missing
> by accident.

Matthias:

I _believe_ the problem is that you are re-registering the same
I_T_Nexus through /dev/sdl, your second attempt at registration, as you
did when you used /dev/sdg, your original registration.

What are you really trying to do? Are you testing that persistent
reservations "work" or trying to figure them out?

I have a "persistent reservations for dummies" document I wrote that I
can send you off list, if you like.

> 
> I hope that somebody on this list is kind enough to answer my question or
> give me a hint. HP was not able to direct it to a capable person in the
> last 9 months. *sigh*
> 
> 
> Any help is appreciated!
> 
> Thanks in advance,
> Matthias

-- 
Lee Duncan
SUSE Labs


* Re: Persistent reservation behaviour/compliance with redundant controllers
  2014-01-06 22:20 ` Lee Duncan
@ 2014-01-06 22:53   ` Matthias Eble
  2014-01-06 23:06     ` James Bottomley
  2014-01-07 20:18     ` Pasi Kärkkäinen
  0 siblings, 2 replies; 10+ messages in thread
From: Matthias Eble @ 2014-01-06 22:53 UTC (permalink / raw)
  To: Lee Duncan; +Cc: linux-scsi

2014/1/6 Lee Duncan <lduncan@suse.com>:
> On 12/25/2013 03:00 PM, Matthias Eble wrote:
>> Here's the dmmp map
>> 360002ac0000000000000000a00006e6b dm-6 3PARdata,VV
>> size=2.0T features='0' hwhandler='0' wp=rw
>> `-+- policy='round-robin 0' prio=1 status=active
>>   |- 3:0:1:4  sdg  8:96    active ready running
>>   |- 3:0:3:4  sdl  8:176   active ready running
>>   |- 5:0:3:4  sdbg 67:160  active ready running
>>   `- 5:0:1:4  sdce 69:32   active ready running
>>
>> There can only be two registrations at a time: (sdg XOR sdl) and (sdbg XOR sdce)
>> Now my question is: Does this comply to the standard?
>>
>
> I _believe_ the problem is that you are re-registering the same
> I_T_Nexus through /dev/sdl, your second attempt at registration, as you
> did when you used /dev/sdg, your original registration.


Can sdg and sdl be the same I_T nexus at the same time?
Right now, they are handled that way.
In my understanding, every SCSI disk device represents an I_T nexus.


# lsscsi -t | egrep '/dev/sd(g|l|bg|ce)'
    [3:0:1:4]    disk    fc:0x20120002ac006e6b,0x14ad40  /dev/sdg
    [3:0:3:4]    disk    fc:0x21120002ac006e6b,0x14ad80  /dev/sdl
    [5:0:1:4]    disk    fc:0x22110002ac006e6b,0x0aad40  /dev/sdce
    [5:0:3:4]    disk    fc:0x23110002ac006e6b,0x0aad80  /dev/sdbg


> What are you really trying to do? Are you testing that persistent
> reservations "work" or trying to figure them out?

I am testing PR on a specific storage system, which seems to behave differently
from the ones before.

> I have a "persistent reservations for dummies" document I wrote that I
> can send you off list, if you like.

I think I know how PRs work. Still, I'd be happy to read your document.


* Re: Persistent reservation behaviour/compliance with redundant controllers
  2014-01-06 22:53   ` Matthias Eble
@ 2014-01-06 23:06     ` James Bottomley
  2014-01-06 23:35       ` Matthias Eble
  2014-01-22 20:43       ` Matthias Eble
  2014-01-07 20:18     ` Pasi Kärkkäinen
  1 sibling, 2 replies; 10+ messages in thread
From: James Bottomley @ 2014-01-06 23:06 UTC (permalink / raw)
  To: Matthias Eble; +Cc: Lee Duncan, linux-scsi

On Mon, 2014-01-06 at 23:53 +0100, Matthias Eble wrote:
> 2014/1/6 Lee Duncan <lduncan@suse.com>:
> > On 12/25/2013 03:00 PM, Matthias Eble wrote:
> >> Here's the dmmp map
> >> 360002ac0000000000000000a00006e6b dm-6 3PARdata,VV
> >> size=2.0T features='0' hwhandler='0' wp=rw
> >> `-+- policy='round-robin 0' prio=1 status=active
> >>   |- 3:0:1:4  sdg  8:96    active ready running
> >>   |- 3:0:3:4  sdl  8:176   active ready running
> >>   |- 5:0:3:4  sdbg 67:160  active ready running
> >>   `- 5:0:1:4  sdce 69:32   active ready running
> >>
> >> There can only be two registrations at a time: (sdg XOR sdl) and (sdbg XOR sdce)
> >> Now my question is: Does this comply to the standard?
> >>
> >
> > I _believe_ the problem is that you are re-registering the same
> > I_T_Nexus through /dev/sdl, your second attempt at registration, as you
> > did when you used /dev/sdg, your original registration.
> 
> 
> Can sdg and sdl be the same I_T_Nexus at a time?
> Right now, they are handled like that.
> In my understanding, every scsi disk device represents an I_T_Nexus.

No, every SCSI disk is an I_T_L nexus.  There's no actual device object
in Linux for an I_T nexus.

James



* Re: Persistent reservation behaviour/compliance with redundant controllers
  2014-01-06 23:06     ` James Bottomley
@ 2014-01-06 23:35       ` Matthias Eble
  2014-01-07  2:09         ` Laurence Oberman
  2014-01-22 20:43       ` Matthias Eble
  1 sibling, 1 reply; 10+ messages in thread
From: Matthias Eble @ 2014-01-06 23:35 UTC (permalink / raw)
  To: James Bottomley; +Cc: Lee Duncan, linux-scsi

2014/1/7 James Bottomley <James.Bottomley@hansenpartnership.com>:
> On Mon, 2014-01-06 at 23:53 +0100, Matthias Eble wrote:
>>
>> Can sdg and sdl be the same I_T_Nexus at a time?
>> Right now, they are handled like that.
>> In my understanding, every scsi disk device represents an I_T_Nexus.
>
> No, every SCSI disk is an I_T_L nexus.  There's no actual device object
> in Linux for an I_T nexus.

So, PR registrations are made for an I_T nexus using an I_T_L nexus.
Probably my previous systems had a 1:1 relation between I_T and I_T_L.

Is there a way to identify which I_T_L nexuses belong to the same I_T nexus?


* Re: Persistent reservation behaviour/compliance with redundant controllers
  2014-01-06 23:35       ` Matthias Eble
@ 2014-01-07  2:09         ` Laurence Oberman
  0 siblings, 0 replies; 10+ messages in thread
From: Laurence Oberman @ 2014-01-07  2:09 UTC (permalink / raw)
  To: Matthias Eble; +Cc: James Bottomley, Lee Duncan, linux-scsi@vger.kernel.org

I reached out to a contact at HP and he shared this with me. Not sure if it's helpful.

3PAR responds to these commands differently depending on the host OS mode, or persona, that is set for the host OS type in use. The main aspects of this question come down to how an active/passive controller model would work; on 3PAR, however, all controllers (nodes) are equal, so all paths are active.

The 3PAR implementation of S2R and S3PGR is intended to comply with SPC-3. The scope of reservations is limited to a full logical unit; element scope is not supported. SCSI-3 reservations allow each host/array path to have a key registered against it. Typically a host registers the same key on all of the paths it sees to the array, and each host has its own unique key. Access to the volume can then be restricted to those hosts that have registered keys. Should a host be determined to have gone rogue, its key can be revoked by any of the still-active hosts, causing the rogue host to lose access to the volume.

They need to register the same key on all paths of the same LUN.
 
Once the host has taken appropriate action to become healthy again it can register a new key and regain access.
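
A minimal sketch of "same key on all paths", using the same `sg_persist`
options that appear elsewhere in this thread. The function name
`register_key_on_paths` and the `SG_PERSIST` override variable are made up
for this example; setting `SG_PERSIST=echo` only prints the argument lists,
so the loop can be tried without touching the array.

```shell
# Hypothetical helper: register one key on every path device given.
# Usage: register_key_on_paths KEY DEVICE...
# SG_PERSIST=echo turns this into a dry run that prints the arguments.
register_key_on_paths() {
    key=$1
    shift
    rc=0
    for dev in "$@"; do
        "${SG_PERSIST:-sg_persist}" --no-inquiry --out --register \
            --param-sark="$key" "$dev" || rc=1
    done
    # Non-zero if any single registration failed.
    return "$rc"
}
```

For example: `register_key_on_paths 0x420480a029000067 /dev/sdg /dev/sdl /dev/sdbg /dev/sdce`.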
 
For 3PAR use the showrsv command to view things from the 3PAR array:
 
showrsv - Show information about scsi reservations of virtual volumes (VVs).
 
SYNTAX
    showrsv [options <arg>] [<VV_name>]
 
DESCRIPTION
    The showrsv command displays SCSI reservation and registration information
    for VLUNs bound for a specified port.
 
AUTHORITY
    Any role in the system
 
OPTIONS
    -l <scsi3|scsi2>

> On Jan 6, 2014, at 6:35 PM, Matthias Eble <psychotrahe@gmail.com> wrote:
> 
> 2014/1/7 James Bottomley <James.Bottomley@hansenpartnership.com>:
>>> On Mon, 2014-01-06 at 23:53 +0100, Matthias Eble wrote:
>>> 
>>> Can sdg and sdl be the same I_T_Nexus at a time?
>>> Right now, they are handled like that.
>>> In my understanding, every scsi disk device represents an I_T_Nexus.
>> 
>> No, every SCSI disk is an I_T_L nexus.  There's no actual device object
>> in Linux for an I_T nexus.
> 
> So, PR registrations are made for an I_T nexus using an I_T_L nexus.
> Probably my previous systems had a 1:1 relation between I_T and I_T_L.
> 
> Is there a way to identify which I_T_L nexuses belong to the same I_T nexus?
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Persistent reservation behaviour/compliance with redundant controllers
  2014-01-06 22:53   ` Matthias Eble
  2014-01-06 23:06     ` James Bottomley
@ 2014-01-07 20:18     ` Pasi Kärkkäinen
  2014-01-23 18:31       ` Lee Duncan
  1 sibling, 1 reply; 10+ messages in thread
From: Pasi Kärkkäinen @ 2014-01-07 20:18 UTC (permalink / raw)
  To: Matthias Eble; +Cc: Lee Duncan, linux-scsi

On Mon, Jan 06, 2014 at 11:53:44PM +0100, Matthias Eble wrote:
> 
> > I have a "persistent reservations for dummies" document I wrote that I
> > can send you off list, if you like.
> 
> I think I know how PRs work. Yet I'd be happy about your document.
>

I think that document could be helpful for others as well, so please post it to the list :)

Thanks!

-- Pasi



* Re: Persistent reservation behaviour/compliance with redundant controllers
  2014-01-06 23:06     ` James Bottomley
  2014-01-06 23:35       ` Matthias Eble
@ 2014-01-22 20:43       ` Matthias Eble
  1 sibling, 0 replies; 10+ messages in thread
From: Matthias Eble @ 2014-01-22 20:43 UTC (permalink / raw)
  To: James Bottomley; +Cc: Lee Duncan, linux-scsi

2014/1/7 James Bottomley <James.Bottomley@hansenpartnership.com>
>
> On Mon, 2014-01-06 at 23:53 +0100, Matthias Eble wrote:
> > 2014/1/6 Lee Duncan <lduncan@suse.com>:
> > > On 12/25/2013 03:00 PM, Matthias Eble wrote:
> > >> Here's the dmmp map
> > >> 360002ac0000000000000000a00006e6b dm-6 3PARdata,VV
> > >> size=2.0T features='0' hwhandler='0' wp=rw
> > >> `-+- policy='round-robin 0' prio=1 status=active
> > >>   |- 3:0:1:4  sdg  8:96    active ready running
> > >>   |- 3:0:3:4  sdl  8:176   active ready running
> > >>   |- 5:0:3:4  sdbg 67:160  active ready running
> > >>   `- 5:0:1:4  sdce 69:32   active ready running
> > >>
> > >> There can only be two registrations at a time: (sdg XOR sdl) and (sdbg XOR sdce)
> > >> Now my question is: Does this comply to the standard?
> > >>
> > >
> > > I _believe_ the problem is that you are re-registering the same
> > > I_T_Nexus through /dev/sdl, your second attempt at registration, as you
> > > did when you used /dev/sdg, your original registration.
> >
> >
> > Can sdg and sdl be the same I_T_Nexus at a time?
> > Right now, they are handled like that.
> > In my understanding, every scsi disk device represents an I_T_Nexus.
>
> No, every SCSI disk is an I_T_L nexus.  There's no actual device object
> in Linux for an I_T nexus.


Hi All,

I'd like to document the progress and findings from lots of off-list emails with
HP's T10 members.
Maybe someone on the net will face the same problem.

First of all, the SPC wording isn't 100% precise. For most commands, the LUN
context is implicit. So where the standards say "I_T nexus", I_T_L nexuses are
meant, as the reservation commands are always LUN-specific.

That said, PR registrations need to be done for every
I_T_L nexus -> every single dmmp path (/dev/sdX)

So we started to test the behaviour of the 3Par system.
It seems that there are some quirks in the 3Par implementation.
The error that led to my initial question is that the target port
identifier isn't included in the target's reservation handling.
Thus all PR commands from one host port are treated as coming from the same
nexus, regardless of the target port over which they were received
(as seen in commands #5 and #6 below, issued after #2).
Note that the investigation isn't finished yet.


For those who are interested, here are the findings (verbose output stripped):


1.# sg_persist --in --read-keys /dev/sdl
  3PARdata  VV                3122
  Peripheral device type: disk
  PR generation=0x44, there are NO registered reservation keys

register via sdl:
2.# sg_persist -vvv -d /dev/sdl --no-inquiry --out --register \
                --param-sark=0x420480a02900006c
    PR out: command (Register) successful

test for spc3r23 table 33 compliance (same key on registered I_T Nexus
should succeed): False
3.# sg_persist -vvv -d /dev/sdl --no-inquiry --out --register \
                --param-sark=0x420480a02900006c
    persistent reserve out: scsi status: Reservation Conflict
    PR out: command failed

now with a *different key* (should conflict): True
4.# sg_persist -vvv -d /dev/sdl --no-inquiry --out --register \
                --param-sark=0x420480a02900006d
    persistent reserve out: scsi status: Reservation Conflict
    PR out: command failed

Same behaviour using another path/I_T_L Nexus (should succeed in both cases):
5.# sg_persist -vvv -d /dev/sdg --no-inquiry --out --register \
                --param-sark=0x420480a02900006c
    persistent reserve out: scsi status: Reservation Conflict
    PR out: command failed
6.# sg_persist -vvv -d /dev/sdg --no-inquiry --out --register \
                --param-sark=0x420480a02900006d
    persistent reserve out: scsi status: Reservation Conflict
    PR out: command failed

Unregister via sdg :-/
7.# sg_persist -vvv -d /dev/sdg --no-inquiry --out --register \
                --param-rk=0x420480a02900006c
    PR out: command (Register) successful
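
When repeating a test series like this, it can help to reduce each command's
output to a single word. The sketch below assumes the two status strings
quoted in the findings above ("Reservation Conflict" and "successful"); the
function name `pr_outcome` is made up for this example.

```shell
# Hypothetical helper: reduce sg_persist output (stdin) to one word,
# based only on the status strings quoted in the findings above.
pr_outcome() {
    awk '
        /Reservation Conflict/ { conflict = 1 }
        /successful/           { ok = 1 }
        END {
            if (conflict)  print "conflict"
            else if (ok)   print "ok"
            else           print "unknown"
        }'
}
```

For example, appending `2>&1 | pr_outcome` to one of the commands above gives
one line per test that can be compared against the expected result.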

Additionally, the READ FULL STATUS service action and ALL_TG_PT are not
supported right now.


That's it for now.

Thanks for your replies,
Matthias


* Re: Persistent reservation behaviour/compliance with redundant controllers
  2014-01-07 20:18     ` Pasi Kärkkäinen
@ 2014-01-23 18:31       ` Lee Duncan
  2014-01-24 11:41         ` Pasi Kärkkäinen
  0 siblings, 1 reply; 10+ messages in thread
From: Lee Duncan @ 2014-01-23 18:31 UTC (permalink / raw)
  To: Pasi Kärkkäinen, Matthias Eble; +Cc: linux-scsi

On 01/07/2014 12:18 PM, Pasi Kärkkäinen wrote:
> On Mon, Jan 06, 2014 at 11:53:44PM +0100, Matthias Eble wrote:
>>
>>> I have a "persistent reservations for dummies" document I wrote that I
>>> can send you off list, if you like.
>>
>> I think I know how PRs work. Yet I'd be happy about your document.
>>
> 
> I think that document could be helpful for others aswell, so please post it to the list :)
> 
> Thanks!
> 
> -- Pasi
> 


Apologies for taking so darn long to reply!

I have published my SCSI-3 Document here:

  http://www.gonzoleeman.net/documents/scsi-3-pgr-tutorial-v1.0

Feedback welcome.
-- 
Lee Duncan
SUSE Labs


* Re: Persistent reservation behaviour/compliance with redundant controllers
  2014-01-23 18:31       ` Lee Duncan
@ 2014-01-24 11:41         ` Pasi Kärkkäinen
  0 siblings, 0 replies; 10+ messages in thread
From: Pasi Kärkkäinen @ 2014-01-24 11:41 UTC (permalink / raw)
  To: Lee Duncan; +Cc: Matthias Eble, linux-scsi

On Thu, Jan 23, 2014 at 10:31:00AM -0800, Lee Duncan wrote:
> On 01/07/2014 12:18 PM, Pasi Kärkkäinen wrote:
> > On Mon, Jan 06, 2014 at 11:53:44PM +0100, Matthias Eble wrote:
> >>
> >>> I have a "persistent reservations for dummies" document I wrote that I
> >>> can send you off list, if you like.
> >>
> >> I think I know how PRs work. Yet I'd be happy about your document.
> >>
> > 
> > I think that document could be helpful for others aswell, so please post it to the list :)
> > 
> > Thanks!
> > 
> > -- Pasi
> > 
> 
> 
> Apologies for taking so darn long to reply!
> 
> I have published my SCSI-3 Document here:
> 
>   http://www.gonzoleeman.net/documents/scsi-3-pgr-tutorial-v1.0
>

Thank you! I'll check it out.
 
> Feedback welcome.
> -- 
> Lee Duncan
> SUSE Labs
>

-- Pasi



end of thread, other threads:[~2014-01-24 11:41 UTC | newest]
