* In-kernel QLE2462 driver LU enumeration bug?
@ 2006-01-26 12:49 Tore Anderson
2006-02-06 23:11 ` Andrew Vasquez
0 siblings, 1 reply; 5+ messages in thread
From: Tore Anderson @ 2006-01-26 12:49 UTC (permalink / raw)
To: linux-scsi
[-- Attachment #1: Type: text/plain, Size: 2579 bytes --]
Hi. I've had major problems with a setup where I'm using a QLE2462
card to connect to an EMC CLARiiON AX100. Port 1 of the HBA
is connected to Storage Processor A ("SPA") of the AX100, while port 2
is connected to SPB.
There are two RAIDs on the AX100. One contains five LUs of 230 GB
each, holding EXT3 file systems labeled "data{1,2,3,4,5}". SPB is
the designated controller for all these LUs. The other RAID contains
only one LU of 10 GB, holding an EXT3 file system labeled "postgres".
This LU is normally controlled by SPA.
Now, when I use kernel 2.6.12 with driver 8.01.01 from QLogic[1],
everything works just fine. I see a total of 12 LUs representing the
two paths to each LU, the postgres one being available through the first
SCSI host the module registers, with a "ghost" LU being visible on the
second SCSI host, and vice versa for the data LUs. All as expected.
I load the qla2xxx module using ql2xfailover=0, as the driver, to the
best of my knowledge, is unable to send the necessary EMC-specific
"trespass" command to the ghost LU before actually using it.
For proper failover I need dm-multipath, but this wasn't stable
for me on 2.6.12. I believe this situation has improved in 2.6.15, so
I wanted to try upgrading. Now, the 8.01.01 version of the driver
doesn't compile against 2.6.15 (complains about struct Scsi_Host not
having members named "eh_active" and "state"), so I thought I'd try the
in-kernel one. I downloaded the firmware from QLogic[2], and loaded
the driver. It went in fine, found the AX100, created 12 SCSI LUs...
But it seems all the different LUs created on the server correspond
to just /one/ of the LUs that are actually on the AX100. All of the
twelve LUs are listed in /proc/partitions as 230 GB, and all six
connected to SCSI host 1 are ghost paths, while all six connected to
host 2 are active. When dumping the EXT3 superblock of all those, I
see that all of them actually access just one of the LUs on the AX100
(data5).
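The superblock check described above can be sketched in C. This is only an illustration, not the script attached to this message: the helper name ext3_label_from_image() is hypothetical, and it assumes the caller has already read the first 2048 bytes of each SCSI disk into a buffer. The offsets are those of the on-disk ext2/ext3 superblock (superblock at byte 1024, 16-bit little-endian magic 0xEF53 at offset 56, 16-byte volume label at offset 120).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout constants of the ext2/ext3 superblock (on-disk, little-endian). */
#define EXT3_SB_OFFSET     1024  /* superblock starts 1024 bytes into the device */
#define EXT3_MAGIC_OFFSET  56    /* s_magic, 16-bit LE, relative to superblock   */
#define EXT3_LABEL_OFFSET  120   /* s_volume_name, 16 bytes, may lack a NUL      */
#define EXT3_LABEL_LEN     16
#define EXT3_MAGIC         0xEF53

/*
 * Hypothetical helper: copy the volume label out of the start of a device
 * image. Returns 0 on success, -1 if the buffer is too short or the magic
 * does not match. "label" must hold EXT3_LABEL_LEN + 1 bytes.
 */
static int ext3_label_from_image(const uint8_t *img, size_t len, char *label)
{
    assert(img != NULL && label != NULL);
    if (len < EXT3_SB_OFFSET + EXT3_LABEL_OFFSET + EXT3_LABEL_LEN)
        return -1;

    const uint8_t *sb = img + EXT3_SB_OFFSET;
    uint16_t magic = (uint16_t)(sb[EXT3_MAGIC_OFFSET] |
                                (sb[EXT3_MAGIC_OFFSET + 1] << 8));
    if (magic != EXT3_MAGIC)
        return -1;

    /* The on-disk label is not guaranteed to be NUL-terminated. */
    memcpy(label, sb + EXT3_LABEL_OFFSET, EXT3_LABEL_LEN);
    label[EXT3_LABEL_LEN] = '\0';
    return 0;
}
```

Running such a check against every disk and comparing the labels is one way to spot several block devices that resolve to the same LU.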
After hours of debugging without getting anywhere (including upgrading
to the QLogic driver from today's pull of Linus' git tree), I can't
think of another explanation than this being a bug in the QLogic
driver.
To aid debugging I've attached logs from both 2.6.12 and 2.6.15
(suitable for diffing). If there's anything more I can help with
just let me know.
[1] http://download.qlogic.com/drivers/32415/qla2xxx-v8.01.01-dist.tgz
[2] ftp://ftp.qlogic.com/outgoing/linux/firmware/ql2400_fw.bin
Regards
--
Tore Anderson
[-- Attachment #2: 2.6.12-log.gz --]
[-- Type: application/x-gzip, Size: 2171 bytes --]
[-- Attachment #3: 2.6.15-log.gz --]
[-- Type: application/x-gzip, Size: 1991 bytes --]
[-- Attachment #4: test.sh --]
[-- Type: application/x-shellscript, Size: 506 bytes --]
* Re: In-kernel QLE2462 driver LU enumeration bug?
2006-01-26 12:49 In-kernel QLE2462 driver LU enumeration bug? Tore Anderson
@ 2006-02-06 23:11 ` Andrew Vasquez
2006-02-07 8:16 ` Tore Anderson
2006-02-07 9:17 ` Christoph Hellwig
0 siblings, 2 replies; 5+ messages in thread
From: Andrew Vasquez @ 2006-02-06 23:11 UTC (permalink / raw)
To: Tore Anderson; +Cc: linux-scsi
On Thu, 26 Jan 2006, Tore Anderson wrote:
> Hi. I've had major problems with a setup where I'm using a QLE2462
> card to connect to an EMC CLARiiON AX100. Port 1 of the HBA
> is connected to Storage Processor A ("SPA") of the AX100, while port 2
> is connected to SPB.
Please try the attached patch -- not sure how this was missed as it's
been in my patch-queue for some time.
Thanks,
Andrew Vasquez
---
diff --git a/drivers/scsi/qla2xxx/qla_iocb.c b/drivers/scsi/qla2xxx/qla_iocb.c
index 7ec0b8d..6544b6d 100644
--- a/drivers/scsi/qla2xxx/qla_iocb.c
+++ b/drivers/scsi/qla2xxx/qla_iocb.c
@@ -814,6 +814,7 @@ qla24xx_start_scsi(srb_t *sp)
cmd_pkt->port_id[2] = sp->fcport->d_id.b.domain;
int_to_scsilun(sp->cmd->device->lun, &cmd_pkt->lun);
+ host_to_fcp_swap((uint8_t *)&cmd_pkt->lun, sizeof(cmd_pkt->lun));
/* Load SCSI command packet. */
memcpy(cmd_pkt->fcp_cdb, cmd->cmnd, cmd->cmd_len);
* Re: In-kernel QLE2462 driver LU enumeration bug?
2006-02-06 23:11 ` Andrew Vasquez
@ 2006-02-07 8:16 ` Tore Anderson
2006-02-07 9:17 ` Christoph Hellwig
1 sibling, 0 replies; 5+ messages in thread
From: Tore Anderson @ 2006-02-07 8:16 UTC (permalink / raw)
To: Andrew Vasquez; +Cc: linux-scsi
* Andrew Vasquez
> Please try the attached patch -- not sure how this was missed as it's
> been in my patch-queue for some time.
Thanks, Andrew! Your patch fixes the bug for me. I hope it's not too
late for 2.6.16 inclusion.
Kind regards
--
Tore Anderson
* Re: In-kernel QLE2462 driver LU enumeration bug?
2006-02-06 23:11 ` Andrew Vasquez
2006-02-07 8:16 ` Tore Anderson
@ 2006-02-07 9:17 ` Christoph Hellwig
2006-02-07 16:19 ` Andrew Vasquez
1 sibling, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2006-02-07 9:17 UTC (permalink / raw)
To: Andrew Vasquez; +Cc: Tore Anderson, linux-scsi
>
> int_to_scsilun(sp->cmd->device->lun, &cmd_pkt->lun);
> + host_to_fcp_swap((uint8_t *)&cmd_pkt->lun, sizeof(cmd_pkt->lun));
this looks rather odd to me. first, and minor: cmd_pkt->lun now gets
values in different endiannesses assigned, which doesn't help static
typechecking, aka getting the qla2xxx driver sparse clean. Second, the
host_to_fcp_swap function looks more than fishy to me. It's doing a loop
of unconditional byteswaps. that can't be right on BE hardware, can it?
* Re: In-kernel QLE2462 driver LU enumeration bug?
2006-02-07 9:17 ` Christoph Hellwig
@ 2006-02-07 16:19 ` Andrew Vasquez
0 siblings, 0 replies; 5+ messages in thread
From: Andrew Vasquez @ 2006-02-07 16:19 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Tore Anderson, linux-scsi
On Tue, 07 Feb 2006, Christoph Hellwig wrote:
> > int_to_scsilun(sp->cmd->device->lun, &cmd_pkt->lun);
> > + host_to_fcp_swap((uint8_t *)&cmd_pkt->lun, sizeof(cmd_pkt->lun));
>
> this looks rather odd to me. first, and minor: cmd_pkt->lun now gets
> values in different endiannesses assigned, which doesn't help static
> typechecking, aka getting the qla2xxx driver sparse clean.
Granted, to take advantage of the larger addressing space, the driver
went from:
	qla_fw.h:

	#define COMMAND_TYPE_7	0x18		/* Command Type 7 entry */
	struct cmd_type_7 {
		uint8_t entry_type;		/* Entry type. */
		...
		uint8_t lun[8];			/* FCP LUN (BE). */
		...
	}

	qla_iocb.c::qla24xx_start_scsi():

		...
		/* Set LUN number */
		cmd_pkt->lun[1] = LSB(fclun->lun);
		cmd_pkt->lun[2] = MSB(fclun->lun);
		host_to_fcp_swap(cmd_pkt->lun, sizeof(cmd_pkt->lun));
to:
	#define COMMAND_TYPE_7	0x18		/* Command Type 7 entry */
	struct cmd_type_7 {
		uint8_t entry_type;		/* Entry type. */
		...
		struct scsi_lun lun;		/* FCP LUN (BE). */
		...
	}

	qla_iocb.c::qla24xx_start_scsi():

		...
		int_to_scsilun(sp->cmd->device->lun, &cmd_pkt->lun);
		host_to_fcp_swap((uint8_t *)&cmd_pkt->lun, sizeof(cmd_pkt->lun));
(note: the host_to_fcp_swap() was originally missing).
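To compare the two sequences quoted above outside the kernel, here is a minimal userspace sketch. It rests on two assumptions, since neither function body appears in this thread: that host_to_fcp_swap() reverses the bytes within each 32-bit word of the buffer, and that int_to_scsilun() stores each 16-bit group of the LUN most significant byte first. The names swap_words(), encode_scsilun(), old_lun_bytes() and new_lun_bytes() are stand-ins invented for the illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for host_to_fcp_swap(): reverse the bytes of each 4-byte group. */
static void swap_words(uint8_t *buf, size_t len)
{
    assert(len % 4 == 0);
    for (size_t i = 0; i < len; i += 4) {
        uint8_t t0 = buf[i], t1 = buf[i + 1];
        buf[i]     = buf[i + 3];
        buf[i + 1] = buf[i + 2];
        buf[i + 2] = t1;
        buf[i + 3] = t0;
    }
}

/* Stand-in for int_to_scsilun(): 16-bit groups, most significant byte first. */
static void encode_scsilun(unsigned int lun, uint8_t out[8])
{
    memset(out, 0, 8);
    for (size_t i = 0; i < sizeof(lun); i += 2) {
        out[i]     = (lun >> 8) & 0xFF;
        out[i + 1] = lun & 0xFF;
        lun >>= 16;
    }
}

/* Old sequence: lun[1] = LSB, lun[2] = MSB, then swap. */
static void old_lun_bytes(uint16_t lun, uint8_t out[8])
{
    memset(out, 0, 8);
    out[1] = lun & 0xFF;
    out[2] = (lun >> 8) & 0xFF;
    swap_words(out, 8);
}

/* New sequence: encode into the scsi_lun layout, then swap. */
static void new_lun_bytes(unsigned int lun, uint8_t out[8])
{
    encode_scsilun(lun, out);
    swap_words(out, 8);
}
```

Under those assumptions, LUN 5 comes out as the same eight bytes from old_lun_bytes() and new_lun_bytes(), while the output of encode_scsilun() alone (the case with the swap missing) has the LUN byte in a different position.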
> Second the
> host_to_fcp_swap function looks more than fishy to me. It's doing a loop
> of unconditional byteswaps. that can't be right on BE hardware, can it?
Unlike our 2Gb (and earlier) products, specific data passed through an
IOCB (LU, CDB) is converted to wire format before submission to
the firmware. Note these values are not scalars, but instead arrays
of opaque 8-bit data.
You'll also notice that much of the FCP_DATA returned in a 4Gb
status IOCB is also converted from wire format (via
host_to_fcp_swap()).
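On the big-endian question, a small userspace sketch may help. It assumes host_to_fcp_swap() loads each 32-bit word, byte-swaps the value and stores it back; that shape is an assumption, as the function body is not shown in this thread, and bswap32(), swap_via_words() and swap_via_bytes() are names made up for the illustration. Reversing the bytes of a loaded value and storing it again reverses the four bytes in memory whatever the host byte order, so the word-based loop should match a purely byte-based one on both LE and BE machines.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Value-level 32-bit byte swap, in the style of a swab32() helper. */
static uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Swap through 32-bit loads and stores (assumed shape of the driver loop). */
static void swap_via_words(uint8_t *buf, size_t len)
{
    assert(len % 4 == 0);
    for (size_t i = 0; i < len; i += 4) {
        uint32_t w;
        memcpy(&w, buf + i, sizeof(w));   /* host-order load */
        w = bswap32(w);
        memcpy(buf + i, &w, sizeof(w));   /* host-order store */
    }
}

/* The same transformation on bytes only, with no integer loads at all. */
static void swap_via_bytes(uint8_t *buf, size_t len)
{
    assert(len % 4 == 0);
    for (size_t i = 0; i < len; i += 4) {
        uint8_t t;
        t = buf[i];     buf[i]     = buf[i + 3]; buf[i + 3] = t;
        t = buf[i + 1]; buf[i + 1] = buf[i + 2]; buf[i + 2] = t;
    }
}
```

Because the data is an array of opaque bytes rather than a scalar, the question of whether the swap should be conditional on host endianness does not arise in this sketch: both routines leave identical bytes in the buffer.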
--
AV