linux-scsi.vger.kernel.org archive mirror
* understanding the sg_ses --raw output? so I can turn on the faulty light
From: Jon Bendtsen @ 2011-02-07 15:25 UTC (permalink / raw)
  To: linux-scsi

Hi

There have been two earlier threads on sg_ses (21 May 2007 and 10 June 2010),
but unfortunately neither seemed to include information about how to
interpret the --raw output from sg_ses.

When I run this command
	sg_ses --page=2 /dev/sg30 --raw

I get output that looks like:
        00 00 00 00 00 00 00 00  01 00 00 00 01 00 00 00
        01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
        01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
        01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
        01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
        01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
        01 00 00 00 01 00 00 00  00 00 01 00 01 00 01 00
        00 00 00 40 06 00 00 47  06 00 00 47 06 00 00 47
        00 00 00 00 01 00 2c 00  06 00 00 00 06 00 00 00
        06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
        06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
        06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
        06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
        06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
        06 00 00 00 06 00 00 00

There are 26 segments starting with 01 and 27 starting with 06. I think
I have to use the 01 segments, but the number 26 does not match my
number of disks unless they are counted like this: controller + backplane + 24
disks, or enclosure port A + port B + 24 disks.
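
The count above can be checked mechanically. The sketch below assumes the
dump is read as consecutive 4-byte groups (an assumption at this point in
the thread) and tallies the groups by their first byte. The names
RAW_DUMP and four_byte_groups are illustrative only.

```python
from collections import Counter

# The --raw output quoted above, copied verbatim.
RAW_DUMP = """
00 00 00 00 00 00 00 00  01 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00  00 00 01 00 01 00 01 00
00 00 00 40 06 00 00 47  06 00 00 47 06 00 00 47
00 00 00 00 01 00 2c 00  06 00 00 00 06 00 00 00
06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
06 00 00 00 06 00 00 00
"""


def four_byte_groups(dump_text):
    """Split a whitespace-separated hex dump into 4-byte groups."""
    data = bytes.fromhex("".join(dump_text.split()))
    return [data[i:i + 4] for i in range(0, len(data), 4)]


groups = four_byte_groups(RAW_DUMP)
first_byte_counts = Counter(group[0] for group in groups)

# Total number of groups, then how many begin with 01 and with 06.
print(len(groups), first_byte_counts[0x01], first_byte_counts[0x06])
```

Under that 4-byte assumption the tally agrees with the 26 and 27 counted by
hand.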

At first I figured I had to set 20 in the last field, but that did not
work. So I tried changing a 01 00 00 00 to 01 00 02 20, based on this
webpage:
http://storagesecrets.org/2008/12/scsi-enclosure-services-ses-ses-2-management/

I also tried changing 06 00 00 00 to 06 00 02 20; still nothing.

I set the data using
	sg_ses --control --page=2 -d - /dev/sg30 < raw.test


Here is the output of lsscsi:
[0:0:0:0]    disk    ATA      INTEL SSDSA2M040 2CV1  /dev/sda
[1:0:0:0]    disk    ATA      INTEL SSDSA2M040 2CV1  /dev/sdb
[2:0:0:0]    disk    ATA      Corsair CSSD-F24 2.0   /dev/sdc
[3:0:0:0]    disk    ATA      Corsair CSSD-F24 2.0   /dev/sdd
[5:0:0:0]    disk    ATA      Corsair CSSD-F24 2.0   /dev/sde
[5:0:1:0]    disk    ATA      Corsair CSSD-F24 2.0   /dev/sdf
[6:0:0:0]    disk    ATA      SAMSUNG HE103UJ  1113  /dev/sdg
[6:0:1:0]    disk    ATA      SAMSUNG HE103UJ  1113  /dev/sdh
[6:0:2:0]    disk    ATA      WDC WD1001FALS-0 0K05  /dev/sdi
[6:0:3:0]    disk    ATA      WDC WD1001FALS-0 0K05  /dev/sdj
[6:0:4:0]    disk    ATA      SAMSUNG HE103UJ  1113  /dev/sdk
[6:0:5:0]    disk    ATA      SAMSUNG HE103UJ  1113  /dev/sdl
[6:0:6:0]    disk    ATA      WDC WD10EADS-00L 1A01  /dev/sdm
[6:0:7:0]    disk    ATA      SAMSUNG HE103UJ  1113  /dev/sdn
[6:0:8:0]    disk    ATA      WDC WD1001FALS-0 0K05  /dev/sdo
[6:0:9:0]    disk    ATA      SAMSUNG HE103UJ  1113  /dev/sdp
[6:0:10:0]   disk    ATA      WDC WD1001FALS-0 0K05  /dev/sdq
[6:0:11:0]   disk    ATA      WDC WD1001FALS-0 0K05  /dev/sdr
[6:0:12:0]   disk    ATA      Hitachi HDS72101 A39C  /dev/sds
[6:0:13:0]   disk    ATA      SAMSUNG HD103UJ  1118  /dev/sdt
[6:0:14:0]   disk    ATA      SAMSUNG HD103UJ  1118  /dev/sdu
[6:0:15:0]   disk    ATA      Hitachi HDS72101 A39C  /dev/sdv
[6:0:16:0]   disk    ATA      Hitachi HUA72201 A3EA  /dev/sdw
[6:0:17:0]   disk    ATA      Hitachi HUA72201 A3EA  /dev/sdx
[6:0:18:0]   disk    ATA      WDC WD1002FBYS-0 0C06  /dev/sdy
[6:0:19:0]   disk    ATA      WDC WD1002FBYS-0 0C06  /dev/sdz
[6:0:20:0]   disk    ATA      WDC WD2002FYPS-0 1G01  /dev/sdaa
[6:0:21:0]   disk    ATA      Hitachi HUA72202 A3EA  /dev/sdab
[6:0:22:0]   disk    ATA      WDC WD2003FYYS-0 1D01  /dev/sdac
[6:0:23:0]   disk    ATA      WDC WD2001FASS-0 0101  /dev/sdad
[6:0:24:0]   enclosu LSILOGIC SASX36 A.0          9  -


The first 6 /dev/sd devices are attached via SATA to a 00:11.0 SATA
controller: ATI Technologies Inc SB700/SB800 SATA Controller [IDE mode],
whereas the last 24 disks are connected to a 06:00.0 SCSI storage
controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT
SAS (rev 08), found onboard on my Supermicro H8DI3+-F in a Supermicro
SC846E2 chassis. Only one SAS cable is attached between the controller and
the enclosure backplane, even though both support two cables.


The same enclosure and disks used to work with a 3ware 9690sa-8i, which
had a web interface to turn those bits on and off. But after a brand
new disk crashed and pulled all the other 23 disks offline from the
controller, I would rather not continue to use the 3ware 9690sa-8i
controller.


JonB



* Re: understanding the sg_ses --raw output? so I can turn on the faulty light
From: James Bottomley @ 2011-02-07 15:40 UTC (permalink / raw)
  To: Jon Bendtsen; +Cc: linux-scsi

On Mon, 2011-02-07 at 16:25 +0100, Jon Bendtsen wrote:
> Hi
> 
> There has earlier been 2 threads on sg_ses 21 may 2007 and 10 june 2010,
> but unfortunately neither seemed to include information about how to
> understand the --raw output from sg_ses.

It's a hex dump of the diagnostic mode page.

> When I run this command
> 	sg_ses --page=2 /dev/sg30 --raw
> 
> I get output that looks like:
>         00 00 00 00 00 00 00 00  01 00 00 00 01 00 00 00
>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>         01 00 00 00 01 00 00 00  00 00 01 00 01 00 01 00
>         00 00 00 40 06 00 00 47  06 00 00 47 06 00 00 47
>         00 00 00 00 01 00 2c 00  06 00 00 00 06 00 00 00
>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>         06 00 00 00 06 00 00 00
> 
> There are 26 segments starting with 01 00 00 00, and 27 with 06. I think
> i have to use the 01 segments, but the number 26 does not fit with my
> number of disks unless they count like this: controller, backplane + 24
> disks or enclosure port A + port B + 24 disks.
> 
> At first i figured i had to set 20 in the last field, but that did not
> work. So i tried modifying a 01 00 00 00 to 01 00 02 20 based on this
> webpage:
> http://storagesecrets.org/2008/12/scsi-enclosure-services-ses-ses-2-management/
> 
> I also tried with 06 00 00 00 to 06 00 02 20, still nothing.
> 
> I set the data using
> 	sg_ses --control --page=2 -d - /dev/sg30 < raw.test

So I don't really plan to parse a huge hex dump, but my instinct would
be you got something wrong.

I'd firstly validate that your lights can be flashed with the ses
driver ... if they can, then look for a mistake in the hex dump.

James




* Re: understanding the sg_ses --raw output? so I can turn on the faulty light
From: Jon Bendtsen @ 2011-02-08 12:55 UTC (permalink / raw)
  To: linux-scsi

On 07/02/11 16.40, James Bottomley wrote:
> On Mon, 2011-02-07 at 16:25 +0100, Jon Bendtsen wrote:
>> Hi
>>
>> There has earlier been 2 threads on sg_ses 21 may 2007 and 10 june 2010,
>> but unfortunately neither seemed to include information about how to
>> understand the --raw output from sg_ses.
> 
> It's a hex dump of the diagnostic mode page.

I know that. What I meant was: which segments in the hex dump correspond
to which segments in the text version of the diagnostic mode page?

How long are the segments? 8 bytes, as the formatting might
suggest? Or just 4 bytes, as another reading of the formatting suggests?


>> When I run this command
>> 	sg_ses --page=2 /dev/sg30 --raw
>>
>> I get output that looks like:
>>         00 00 00 00 00 00 00 00  01 00 00 00 01 00 00 00
>>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>>         01 00 00 00 01 00 00 00  00 00 01 00 01 00 01 00
>>         00 00 00 40 06 00 00 47  06 00 00 47 06 00 00 47
>>         00 00 00 00 01 00 2c 00  06 00 00 00 06 00 00 00
>>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>>         06 00 00 00 06 00 00 00
>>
>> There are 26 segments starting with 01 00 00 00, and 27 with 06. I think
>> i have to use the 01 segments, but the number 26 does not fit with my
>> number of disks unless they count like this: controller, backplane + 24
>> disks or enclosure port A + port B + 24 disks.
>>
>> At first i figured i had to set 20 in the last field, but that did not
>> work. So i tried modifying a 01 00 00 00 to 01 00 02 20 based on this
>> webpage:
>> http://storagesecrets.org/2008/12/scsi-enclosure-services-ses-ses-2-management/
>>
>> I also tried with 06 00 00 00 to 06 00 02 20, still nothing.
>>
>> I set the data using
>> 	sg_ses --control --page=2 -d - /dev/sg30 < raw.test
> 
> So I don't really plan to parse a huge hex dump, but my instinct would
> be you got something wrong.

I probably did, which is why I asked how to interpret the hex dump.


> I'd firstly validate that your lights can be flashed with the ses
> driver ... if they can, then look for a mistake in the hex dump.

How can I validate that my lights can be flashed with the SES driver?



JonB


* Re: understanding the sg_ses --raw output? so I can turn on the faulty light
From: Jon Bendtsen @ 2011-02-08 14:15 UTC (permalink / raw)
  To: linux-scsi

On 07/02/11 16.25, Jon Bendtsen wrote:
> Hi
> 
> There has earlier been 2 threads on sg_ses 21 may 2007 and 10 june 2010,
> but unfortunately neither seemed to include information about how to
> understand the --raw output from sg_ses.
> 
> When I run this command
> 	sg_ses --page=2 /dev/sg30 --raw
> 
> I get output that looks like:
>         00 00 00 00 00 00 00 00  01 00 00 00 01 00 00 00
>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>         01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
>         01 00 00 00 01 00 00 00  00 00 01 00 01 00 01 00
>         00 00 00 40 06 00 00 47  06 00 00 47 06 00 00 47
>         00 00 00 00 01 00 2c 00  06 00 00 00 06 00 00 00
>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>         06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
>         06 00 00 00 06 00 00 00

I cracked it using this PDF
http://www.snia.org/events/storage-developer2008/presentations/monday/RajendraDivecha_SCSI_SES.pdf

The top diagram on page 26 tells me they use 4-byte segments, i.e. 01 00 00 00
is one device slot.

The middle diagram on page 26 tells me that the 01 in the above segment must be
changed to 81 to select this device slot.

The diagram on page 27 tells me that the second byte is not important, and
that the third byte should be 02 to signal RQST IDENT. The result on my
system is that the device slot starts to blink.

The page 27 diagram also tells me that if I set the fourth byte to 20, then I
select RQST FAULT. The result on my system is that the same device slot
turns on but does not blink; it stays lit.

If I set both the third byte to 02 and the fourth byte to 20, then it blinks.
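
The byte values described above can be written down as a small helper. This
is only a sketch of what worked on this enclosure, not a full implementation
of the SES control element. The function and constant names are made up for
illustration.

```python
# Bit values taken from the description above.
SELECT = 0x80      # first byte: 01 becomes 81 to select the slot
RQST_IDENT = 0x02  # third byte: request the identify (blinking) LED
RQST_FAULT = 0x20  # fourth byte: request the fault (steady) LED


def control_element(first_byte=0x01, ident=False, fault=False):
    """Build the 4-byte control element for one device slot.

    first_byte is the first byte as read from the status dump (01 here).
    The second byte is left at 00, since it was found not to matter.
    """
    return bytes([
        first_byte | SELECT,
        0x00,
        RQST_IDENT if ident else 0x00,
        RQST_FAULT if fault else 0x00,
    ])


# Print the element that requests the fault LED only.
print(" ".join(f"{b:02x}" for b in control_element(fault=True)))
```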

Below you can see the raw data blob which selects slot 16 and sets RQST
FAULT.

        00 00 00 00 00 00 00 00  01 00 00 00 01 00 00 00
        01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
        01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
        01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
        01 00 00 00 81 00 00 20  01 00 00 00 01 00 00 00
        01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
        01 00 00 00 01 00 00 00  00 00 01 00 01 00 01 00
        00 00 00 40 06 00 00 47  06 00 00 47 06 00 00 47
        00 00 00 00 01 00 2c 00  06 00 00 00 06 00 00 00
        06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
        06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
        06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
        06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
        06 00 00 00 06 00 00 00  06 00 00 00 06 00 00 00
        06 00 00 00 06 00 00 00
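
Editing the blob by hand is error-prone, so here is a hedged sketch of doing
the same substitution in code. It assumes, as in the dump above, that 8 bytes
precede the first individual element and that elements are numbered from 1,
which puts element 16 at byte offset 68. patch_element and format_dump are
illustrative names, and other enclosures may lay out the page differently.

```python
def format_dump(data):
    """Format bytes as 16 per line, with a double space after the 8th byte."""
    lines = []
    for i in range(0, len(data), 16):
        row = data[i:i + 16]
        left = " ".join(f"{b:02x}" for b in row[:8])
        right = " ".join(f"{b:02x}" for b in row[8:])
        lines.append((left + "  " + right).rstrip())
    return "\n".join(lines)


def patch_element(dump_text, element_number, new_bytes, header_len=8):
    """Replace one 4-byte element in a hex dump and return the new dump.

    element_number counts from 1. header_len is the number of bytes before
    the first individual element (8 in the dump from this thread).
    """
    if len(new_bytes) != 4:
        raise ValueError("an element is exactly 4 bytes")
    data = bytearray.fromhex("".join(dump_text.split()))
    offset = header_len + (element_number - 1) * 4
    if element_number < 1 or offset + 4 > len(data):
        raise IndexError("element number outside the dump")
    data[offset:offset + 4] = new_bytes
    return format_dump(data)


# Small usage example on the first row of the dump only.
sample = "00 00 00 00 00 00 00 00  01 00 00 00 01 00 00 00"
print(patch_element(sample, 2, bytes.fromhex("81000020")))
```

The result would then be saved to a file and sent with the same
sg_ses --control command used earlier in the thread.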

And here is the sg_ses output without --raw:

    Individual element 16 status:
        Predicted failure=0, Disabled=0, Swap=0, status: OK
        OK=0, Reserved device=0, Hot spare=0, Cons check=0
        In crit array=0, In failed array=0, Rebuild/remap=0, R/R abort=0
        App client bypass A=0, Don't remove=0, Enc bypass A=0, Enc bypass B=0
        Ready to insert=0, RMV=0, Ident=0, Report=0
        App client bypass B=0, Fault sensed=0, Fault reqstd=1, Device off=0
        Bypassed A=0, Bypassed B=0, Dev bypassed A=0, Dev bypassed B=0
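
For reading a status element back, a rough decoder for the three LED-related
flags is sketched below. It assumes the flags sit at the same bit positions
as in the control element, with the bits in the order sg_ses prints them
above (most significant first). That assumption matches the Fault reqstd=1
result shown here. decode_slot_flags is an illustrative name.

```python
def decode_slot_flags(element):
    """Decode Ident, Fault sensed and Fault reqstd from a 4-byte element."""
    if len(element) != 4:
        raise ValueError("an element is exactly 4 bytes")
    third, fourth = element[2], element[3]
    return {
        "ident": bool(third & 0x02),          # third byte, value 02
        "fault_sensed": bool(fourth & 0x40),  # fourth byte, value 40 (assumed)
        "fault_reqstd": bool(fourth & 0x20),  # fourth byte, value 20
    }


# A status element with only the fault request set.
print(decode_slot_flags(bytes.fromhex("01000020")))
```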



JonB



* Re: understanding the sg_ses --raw output? so I can turn on the faulty light
From: James Bottomley @ 2011-02-08 14:18 UTC (permalink / raw)
  To: Jon Bendtsen; +Cc: linux-scsi

On Tue, 2011-02-08 at 13:55 +0100, Jon Bendtsen wrote:
> On 07/02/11 16.40, James Bottomley wrote:
> > On Mon, 2011-02-07 at 16:25 +0100, Jon Bendtsen wrote:
> >> Hi
> >>
> >> There has earlier been 2 threads on sg_ses 21 may 2007 and 10 june 2010,
> >> but unfortunately neither seemed to include information about how to
> >> understand the --raw output from sg_ses.
> > 
> > It's a hex dump of the diagnostic mode page.
> 
> I know that. What I meant was which segments in the hex dump correlate
> to which segments in the text version of the diagnostic mode page?

It's what the man page says:  the byte for byte output of the diagnostic
mode page minus the first four bytes.

> How long are the segments? 8 bytes? The way it is formatted something
> could hint that? Or is it just 4 bytes which another formatting suggests?

Well, the descriptor format is variable; it's documented in the SES
standard.

[...]
> > I'd firstly validate that your lights can be flashed with the ses
> > driver ... if they can, then look for a mistake in the hex dump.
> 
> How can I validate that my lights can be flashed with the SES driver?

Well, it depends on what the enclosure calls its slots, but it would be
something like

echo 1 > /sys/class/enclosure/<dev>/<slot>/fault

After making sure the ses driver is loaded and bound, of course.  Quite
a few fault lights are hard wired, and not amenable to software
interference (others aren't hard wired at all and only work with
software).
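
The same sysfs check can be scripted. This is a minimal sketch, assuming the
ses driver is bound and exposes a fault attribute per slot under
/sys/class/enclosure as in the echo command above. The function names are
illustrative, and writing to the attribute normally requires root.

```python
import glob
import os


def fault_files(root="/sys/class/enclosure"):
    """Return the fault attribute of every slot of every enclosure found."""
    return sorted(glob.glob(os.path.join(root, "*", "*", "fault")))


def set_fault(fault_path, on=True):
    """Write 1 or 0 to a fault attribute, like the echo command above."""
    with open(fault_path, "w") as attribute:
        attribute.write("1" if on else "0")


if __name__ == "__main__":
    # Only list what exists. An empty list suggests the ses driver is not
    # bound to the enclosure, or that it exposes no slots.
    for path in fault_files():
        print(path)
```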

James




* Re: understanding the sg_ses --raw output? so I can turn on the faulty light
From: Jon Bendtsen @ 2011-02-08 14:22 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi

On 08/02/11 15.18, James Bottomley wrote:
> On Tue, 2011-02-08 at 13:55 +0100, Jon Bendtsen wrote:
>> On 07/02/11 16.40, James Bottomley wrote:
>>> On Mon, 2011-02-07 at 16:25 +0100, Jon Bendtsen wrote:
>>>> Hi
>>>>
>>>> There has earlier been 2 threads on sg_ses 21 may 2007 and 10 june 2010,
>>>> but unfortunately neither seemed to include information about how to
>>>> understand the --raw output from sg_ses.
>>>
>>> It's a hex dump of the diagnostic mode page.
>>
>> I know that. What I meant was which segments in the hex dump correlate
>> to which segments in the text version of the diagnostic mode page?
> 
> It's what the man page says:  the byte for byte output of the diagnostic
> mode page minus the first four bytes.
> 
>> How long are the segments? 8 bytes? The way it is formatted something
>> could hint that? Or is it just 4 bytes which another formatting suggests?
> 
> Well the descriptor format is variable, it's documented in the SES
> standard.

Which I have not been able to find in a freely available form. The
organisation wants money for the standard.

I managed to find a PDF though, see my other post. Thank you for taking
the time to answer my questions.


> [...]
>>> I'd firstly validate that your lights can be flashed with the ses
>>> driver ... if they can, then look for a mistake in the hex dump.
>>
>> How can I validate that my lights can be flashed with the SES driver?
> 
> well, it depends what the enclosure calls its slots, but it would be
> something like
> 
> echo 1 > /sys/class/enclosure/<dev>/<slot>/fault

I do not have those files.


> After making sure the ses driver is loaded and bound, of course.  Quite
> a few fault lights are hard wired, and not amenable to software
> interference (others aren't hard wired at all and only work with
> software).

ok.
