linux-scsi.vger.kernel.org archive mirror
* In-kernel QLE2462 driver LU enumeration bug?
@ 2006-01-26 12:49 Tore Anderson
  2006-02-06 23:11 ` Andrew Vasquez
  0 siblings, 1 reply; 5+ messages in thread
From: Tore Anderson @ 2006-01-26 12:49 UTC (permalink / raw)
  To: linux-scsi

[-- Attachment #1: Type: text/plain, Size: 2579 bytes --]


  Hi.  I've had major problems with a setup in which I'm using a QLE2462
 card to connect to an EMC CLARiiON AX100.  Port 1 of the HBA
 is connected to Storage Processor A ("SPA") of the AX100, while port 2
 is connected to SPB.

  There are two RAIDs on the AX100, one containing five LUs of 230 GB
 each, containing EXT3 file systems labeled "data{1,2,3,4,5}".  SPB is
 the designated controller for all these LUs.  The other RAID contains
 only one LU of 10GB, containing an EXT3 file system labeled "postgres".
 This LU is normally controlled by SPA.

  Now, when I use kernel 2.6.12 with driver 8.01.01 from QLogic[1],
 everything works just fine.  I see a total of 12 LUs representing the
 two paths to each LU, the postgres one being available through the first
 SCSI host the module registers, with a "ghost" LU being visible on the
 second SCSI host, and vice versa for the data LUs.  All as expected.
 I load the qla2xxx module using ql2xfailover=0, as the driver to the
 best of my knowledge is unable to send the necessary EMC-specific
 "trespass" command to the ghost LU before actually using it.

  For proper failover I need dm-multipath, but this didn't run stably
 for me on 2.6.12.  I believe this situation has improved in 2.6.15, so
 I wanted to try upgrading.  Now, the 8.01.01 version of the driver
 doesn't compile against 2.6.15 (complains about struct Scsi_Host not
 having members named "eh_active" and "state"), so I thought I'd try the
 in-kernel one.  I downloaded the firmware from QLogic[2], and loaded
 the driver.  It went in fine, found the AX100, created 12 SCSI LUs...
 But it seems all the different LUs created on the server corresponded
 to just /one/ of the LUs that are actually on the AX100.  All of the
 twelve LUs are listed in /proc/partitions as 230GB, and all six
 connected to SCSI host 1 are ghost paths, while all six connected to
 host 2 are active.  When dumping the EXT3 superblock of all those, I
 see that all of them actually access just one of the LUs on the AX100
 (data5).

  After hours of debugging without getting anywhere (including upgrading
 to the QLogic driver from today's pull of Linus' git tree), I can't
 think of another explanation than this being a bug in the QLogic
 driver.

  To aid debugging I've attached logs from both 2.6.12 and 2.6.15
 (suitable for diffing).  If there's anything more I can help with,
 just let me know.

  [1] http://download.qlogic.com/drivers/32415/qla2xxx-v8.01.01-dist.tgz
  [2] ftp://ftp.qlogic.com/outgoing/linux/firmware/ql2400_fw.bin

Regards
-- 
Tore Anderson

[-- Attachment #2: 2.6.12-log.gz --]
[-- Type: application/x-gzip, Size: 2171 bytes --]

[-- Attachment #3: 2.6.15-log.gz --]
[-- Type: application/x-gzip, Size: 1991 bytes --]

[-- Attachment #4: test.sh --]
[-- Type: application/x-shellscript, Size: 506 bytes --]


* Re: In-kernel QLE2462 driver LU enumeration bug?
  2006-01-26 12:49 In-kernel QLE2462 driver LU enumeration bug? Tore Anderson
@ 2006-02-06 23:11 ` Andrew Vasquez
  2006-02-07  8:16   ` Tore Anderson
  2006-02-07  9:17   ` Christoph Hellwig
  0 siblings, 2 replies; 5+ messages in thread
From: Andrew Vasquez @ 2006-02-06 23:11 UTC (permalink / raw)
  To: Tore Anderson; +Cc: linux-scsi

On Thu, 26 Jan 2006, Tore Anderson wrote:

>   Hi.  I've had major problems with a setup in which I'm using a QLE2462
>  card to connect to an EMC CLARiiON AX100.  Port 1 of the HBA
>  is connected to Storage Processor A ("SPA") of the AX100, while port 2
>  is connected to SPB.

Please try the attached patch -- not sure how this was missed as it's
been in my patch-queue for some time.

Thanks,
Andrew Vasquez

---

diff --git a/drivers/scsi/qla2xxx/qla_iocb.c b/drivers/scsi/qla2xxx/qla_iocb.c
index 7ec0b8d..6544b6d 100644
--- a/drivers/scsi/qla2xxx/qla_iocb.c
+++ b/drivers/scsi/qla2xxx/qla_iocb.c
@@ -814,6 +814,7 @@ qla24xx_start_scsi(srb_t *sp)
 	cmd_pkt->port_id[2] = sp->fcport->d_id.b.domain;
 
 	int_to_scsilun(sp->cmd->device->lun, &cmd_pkt->lun);
+	host_to_fcp_swap((uint8_t *)&cmd_pkt->lun, sizeof(cmd_pkt->lun));
 
 	/* Load SCSI command packet. */
 	memcpy(cmd_pkt->fcp_cdb, cmd->cmnd, cmd->cmd_len);


* Re: In-kernel QLE2462 driver LU enumeration bug?
  2006-02-06 23:11 ` Andrew Vasquez
@ 2006-02-07  8:16   ` Tore Anderson
  2006-02-07  9:17   ` Christoph Hellwig
  1 sibling, 0 replies; 5+ messages in thread
From: Tore Anderson @ 2006-02-07  8:16 UTC (permalink / raw)
  To: Andrew Vasquez; +Cc: linux-scsi

* Andrew Vasquez

> Please try the attached patch -- not sure how this was missed as it's
> been in my patch-queue for some time.

  Thanks, Andrew!  Your patch fixes the bug for me.  I hope it's not too
 late for 2.6.16 inclusion.

Kind regards
-- 
Tore Anderson



* Re: In-kernel QLE2462 driver LU enumeration bug?
  2006-02-06 23:11 ` Andrew Vasquez
  2006-02-07  8:16   ` Tore Anderson
@ 2006-02-07  9:17   ` Christoph Hellwig
  2006-02-07 16:19     ` Andrew Vasquez
  1 sibling, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2006-02-07  9:17 UTC (permalink / raw)
  To: Andrew Vasquez; +Cc: Tore Anderson, linux-scsi

>  
>  	int_to_scsilun(sp->cmd->device->lun, &cmd_pkt->lun);
> +	host_to_fcp_swap((uint8_t *)&cmd_pkt->lun, sizeof(cmd_pkt->lun));

This looks rather odd to me.  First, and minor, cmd_pkt->lun now gets
values in different endiannesses assigned, which doesn't help static
typechecking, aka getting the qla2xxx driver sparse clean.  Second, the
host_to_fcp_swap function looks more than fishy to me.  It's doing a loop
of unconditional byteswaps.  That can't be right on BE hardware, can it?




* Re: In-kernel QLE2462 driver LU enumeration bug?
  2006-02-07  9:17   ` Christoph Hellwig
@ 2006-02-07 16:19     ` Andrew Vasquez
  0 siblings, 0 replies; 5+ messages in thread
From: Andrew Vasquez @ 2006-02-07 16:19 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Tore Anderson, linux-scsi

On Tue, 07 Feb 2006, Christoph Hellwig wrote:

> >  	int_to_scsilun(sp->cmd->device->lun, &cmd_pkt->lun);
> > +	host_to_fcp_swap((uint8_t *)&cmd_pkt->lun, sizeof(cmd_pkt->lun));
> 
> This looks rather odd to me.  First, and minor, cmd_pkt->lun now gets
> values in different endiannesses assigned, which doesn't help static
> typechecking, aka getting the qla2xxx driver sparse clean.

Granted, to take advantage of the larger addressing space, the driver
went from:

  qla_fw.h:

	#define COMMAND_TYPE_7  0x18            /* Command Type 7 entry */
	struct cmd_type_7 {
		uint8_t entry_type;             /* Entry type. */
		...
		uint8_t lun[8];                 /* FCP LUN (BE). */
		...
	}

  qla_iocb.c::qla24xx_start_scsi():

	...
	/* Set LUN number*/
	cmd_pkt->lun[1] = LSB(fclun->lun);
	cmd_pkt->lun[2] = MSB(fclun->lun);
	host_to_fcp_swap(cmd_pkt->lun, sizeof(cmd_pkt->lun));

to:

	#define COMMAND_TYPE_7  0x18            /* Command Type 7 entry */
	struct cmd_type_7 {
		uint8_t entry_type;             /* Entry type. */
		...
		struct scsi_lun lun;            /* FCP LUN (BE). */
		...
	}

  qla_iocb.c::qla24xx_start_scsi():

	...
	int_to_scsilun(sp->cmd->device->lun, &cmd_pkt->lun);
	host_to_fcp_swap((uint8_t *)&cmd_pkt->lun, sizeof(cmd_pkt->lun));

(note: the host_to_fcp_swap() was originally missing).
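
To make the effect of the two calls concrete, here is a minimal userspace
sketch.  It assumes that int_to_scsilun() stores each 16-bit LUN level
most-significant byte first, and that host_to_fcp_swap() reverses the bytes
of every 32-bit word (the "loop of unconditional byteswaps" described in
this thread).  The sketch_* names are hypothetical stand-ins for
illustration, not the kernel or driver code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for int_to_scsilun(): each 16-bit level of the
 * LUN is stored most-significant byte first in an 8-byte array. */
static void sketch_int_to_scsilun(uint32_t lun, uint8_t out[8])
{
	size_t i;

	memset(out, 0, 8);
	for (i = 0; i < 4; i += 2) {
		out[i] = (lun >> 8) & 0xff;
		out[i + 1] = lun & 0xff;
		lun >>= 16;
	}
}

/* Hypothetical stand-in for host_to_fcp_swap(): reverse the bytes of
 * every complete 32-bit word in the buffer, in place. */
static void sketch_fcp_swap(uint8_t *buf, size_t len)
{
	size_t i;

	for (i = 0; i + 4 <= len; i += 4) {
		uint8_t t0 = buf[i], t1 = buf[i + 1];

		buf[i] = buf[i + 3];
		buf[i + 1] = buf[i + 2];
		buf[i + 2] = t1;
		buf[i + 3] = t0;
	}
}

/* Small self-check of the byte layouts before and after the swap. */
static void sketch_check(void)
{
	uint8_t a[8], b[8];

	sketch_int_to_scsilun(1, a);
	sketch_int_to_scsilun(5, b);
	/* LUN 1 -> 00 01 00 00 ..., LUN 5 -> 00 05 00 00 ... */
	assert(a[0] == 0 && a[1] == 1);
	assert(b[0] == 0 && b[1] == 5);

	sketch_fcp_swap(a, sizeof(a));
	/* After the swap the LUN byte sits at offset 2: 00 00 01 00 ... */
	assert(a[0] == 0 && a[1] == 0 && a[2] == 1 && a[3] == 0);
}
```

Under these assumptions, if whatever consumes the IOCB undoes the per-word
swap, then omitting the host-side call would leave zeroes in the first two
bytes for every small LUN number.  That would be consistent with all LUs
resolving to the same unit, but it is an inference from the sketch rather
than something stated in the thread.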

> Second, the
> host_to_fcp_swap function looks more than fishy to me.  It's doing a loop
> of unconditional byteswaps.  That can't be right on BE hardware, can it?

Unlike with our 2Gb (and earlier) products, specific data passed through an
IOCB (LU, CDB) is converted to wire format before submission to
the firmware.  Note that these values are not scalars, but rather arrays
of opaque 8-bit data.

You'll also notice that much of the FCP_DATA returned in a 4Gb
status IOCB is also converted from wire format (via
host_to_fcp_swap()).
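
The point about opaque byte arrays can be illustrated with a small
userspace sketch.  It assumes the swap reverses the bytes of each 32-bit
word; the demo_* names are hypothetical and not taken from the driver.
Because the operation is defined on bytes in memory rather than on a scalar
value, a byte-wise version and a load/swap/store version leave memory in
the same state on little-endian and big-endian hosts alike, and applying
the swap twice restores the original buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-wise form: reverse each complete group of four bytes in place. */
static void demo_swap_bytes(uint8_t *buf, size_t len)
{
	size_t i;

	for (i = 0; i + 4 <= len; i += 4) {
		uint8_t t0 = buf[i], t1 = buf[i + 1];

		buf[i] = buf[i + 3];
		buf[i + 1] = buf[i + 2];
		buf[i + 2] = t1;
		buf[i + 3] = t0;
	}
}

/* Word form: load 32 bits, reverse the byte order of the value, store.
 * memcpy() avoids alignment and aliasing problems. */
static void demo_swap_u32(uint8_t *buf, size_t len)
{
	size_t i;

	for (i = 0; i + 4 <= len; i += 4) {
		uint32_t v;

		memcpy(&v, buf + i, sizeof(v));
		v = ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
		    ((v & 0x00ff0000u) >> 8) | ((v & 0xff000000u) >> 24);
		memcpy(buf + i, &v, sizeof(v));
	}
}

/* Both forms agree on the resulting memory image, and the swap is its
 * own inverse. */
static void demo_check(void)
{
	uint8_t orig[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
	uint8_t x[8], y[8];

	memcpy(x, orig, sizeof(x));
	memcpy(y, orig, sizeof(y));
	demo_swap_bytes(x, sizeof(x));
	demo_swap_u32(y, sizeof(y));
	assert(memcmp(x, y, sizeof(x)) == 0);

	demo_swap_u32(y, sizeof(y));
	assert(memcmp(y, orig, sizeof(y)) == 0);
}
```

This only shows that such a swap is well defined on byte arrays regardless
of host endianness; whether it is the right conversion for a given
firmware interface is a separate question that the sketch does not answer.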

--
AV

