* Major qla2xxx regression on sparc64
@ 2007-04-16 8:02 David Miller
2007-04-16 16:37 ` Andrew Vasquez
0 siblings, 1 reply; 25+ messages in thread
From: David Miller @ 2007-04-16 8:02 UTC (permalink / raw)
To: linux-scsi; +Cc: linux-kernel, andrew.vasquez, James.Bottomley, ema
Sparc64 systems which have an on-board qla2xxx chip (such as
SunBlade-1000 and SunBlade-2000, there are probably some other systems
like this too) do not have any NVRAM information present, in fact the
NVRAM is basically all 0's from what I can tell.
This always worked just fine since the code would previously just use
a bunch of defaults when an inconsistent NVRAM was detected.
But the changeset below at the end of this email broke this and now
I'm seeing bug reports from sparc64 users and I was just able to
reproduce the problem myself just today as well. I verified that
reverting the patch below gets things working again.
Emanuele, you can feed the patch below to "patch -p1 -R" to get that
working again so we can move on to the other sparc64 bug we're looking
into :-)
The failure mode isn't nice, it actually ends up crashing with an OOPS
in qla2xxx_init_host_attr() because ha->node_name is NULL, it's
supposed to be initialized by functions like qla2x00_nvram_config()
Can we revert the patch below or do something similar to get things
working again on sparc64?
The most important thing which qla2x00_nvram_config() seems to want to
get is the WWN port_name and node_name. These are provided in the OFW
device tree so we could pluck them out of there with something like:
#ifdef CONFIG_SPARC
#include <asm/prom.h>
#include <asm/pbm.h>
#endif
...
#ifdef CONFIG_SPARC
struct pcidev_cookie *pcp = pdev->sysdata;
u8 *port_name, *node_name;
port_name = of_get_property(pcp->prom_node, "port-wwn", NULL);
node_name = of_get_property(pcp->prom_node, "node-wwn", NULL);
#endif
Those will hold a pointer to the property values or NULL if the
property does not exist. This is private data, so you should make
copies of them into your local data structure and not use references
to them.
I don't see any OFW properties present that could be used to fill in
the rest of the NVRAM parameters, so we'd need to use the defaults
that the code before the change was using.
But even if that fails, I think the fallback code should be put back,
since it obviously was used by at least one system and it's probable
that there are some other applications of using this qla2xxx chip that
will have an empty NVRAM too.
I can understand the apprehension in using a fixed port_name[] value,
since it could conflict with other FC controllers on the mesh, but if
that is so important just choose some random value that is a valid FC
ID or use some characteristic ID that can be used to compose part of
the port WWN in order to give it at least some uniqueness.
I'm happy to test out any patches you come up with, thanks.
commit 7aef45ac92f49e76d990b51b7ecd714b9a608be1
Author: Andrew Vasquez <andrew.vasquez@qlogic.com>
Date: Mon Jan 29 10:22:26 2007 -0800
[SCSI] qla2xxx: Fail initialization when inconsistent NVRAM detected.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index ee4b79e..b247bc2 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -88,7 +88,12 @@ qla2x00_initialize_adapter(scsi_qla_host_t *ha)
qla_printk(KERN_INFO, ha, "Configure NVRAM parameters...\n");
- ha->isp_ops.nvram_config(ha);
+ rval = ha->isp_ops.nvram_config(ha);
+ if (rval) {
+ DEBUG2(printk("scsi(%ld): Unable to verify NVRAM data.\n",
+ ha->host_no));
+ return rval;
+ }
if (ha->flags.disable_serdes) {
/* Mask HBA via NVRAM settings? */
@@ -1404,7 +1409,6 @@ qla2x00_set_model_info(scsi_qla_host_t *ha, uint8_t *model, size_t len, char *de
int
qla2x00_nvram_config(scsi_qla_host_t *ha)
{
- int rval;
uint8_t chksum = 0;
uint16_t cnt;
uint8_t *dptr1, *dptr2;
@@ -1413,8 +1417,6 @@ qla2x00_nvram_config(scsi_qla_host_t *ha)
uint8_t *ptr = (uint8_t *)ha->request_ring;
struct device_reg_2xxx __iomem *reg = &ha->iobase->isp;
- rval = QLA_SUCCESS;
-
/* Determine NVRAM starting address. */
ha->nvram_size = sizeof(nvram_t);
ha->nvram_base = 0;
@@ -1438,55 +1440,7 @@ qla2x00_nvram_config(scsi_qla_host_t *ha)
qla_printk(KERN_WARNING, ha, "Inconsistent NVRAM detected: "
"checksum=0x%x id=%c version=0x%x.\n", chksum, nv->id[0],
nv->nvram_version);
- qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet "
- "invalid -- WWPN) defaults.\n");
-
- /*
- * Set default initialization control block.
- */
- memset(nv, 0, ha->nvram_size);
- nv->parameter_block_version = ICB_VERSION;
-
- if (IS_QLA23XX(ha)) {
- nv->firmware_options[0] = BIT_2 | BIT_1;
- nv->firmware_options[1] = BIT_7 | BIT_5;
- nv->add_firmware_options[0] = BIT_5;
- nv->add_firmware_options[1] = BIT_5 | BIT_4;
- nv->frame_payload_size = __constant_cpu_to_le16(2048);
- nv->special_options[1] = BIT_7;
- } else if (IS_QLA2200(ha)) {
- nv->firmware_options[0] = BIT_2 | BIT_1;
- nv->firmware_options[1] = BIT_7 | BIT_5;
- nv->add_firmware_options[0] = BIT_5;
- nv->add_firmware_options[1] = BIT_5 | BIT_4;
- nv->frame_payload_size = __constant_cpu_to_le16(1024);
- } else if (IS_QLA2100(ha)) {
- nv->firmware_options[0] = BIT_3 | BIT_1;
- nv->firmware_options[1] = BIT_5;
- nv->frame_payload_size = __constant_cpu_to_le16(1024);
- }
-
- nv->max_iocb_allocation = __constant_cpu_to_le16(256);
- nv->execution_throttle = __constant_cpu_to_le16(16);
- nv->retry_count = 8;
- nv->retry_delay = 1;
-
- nv->port_name[0] = 33;
- nv->port_name[3] = 224;
- nv->port_name[4] = 139;
-
- nv->login_timeout = 4;
-
- /*
- * Set default host adapter parameters
- */
- nv->host_p[1] = BIT_2;
- nv->reset_delay = 5;
- nv->port_down_retry_count = 8;
- nv->max_luns_per_target = __constant_cpu_to_le16(8);
- nv->link_down_timeout = 60;
-
- rval = 1;
+ return QLA_FUNCTION_FAILED;
}
#if defined(CONFIG_IA64_GENERIC) || defined(CONFIG_IA64_SGI_SN2)
@@ -1699,11 +1653,7 @@ qla2x00_nvram_config(scsi_qla_host_t *ha)
}
}
- if (rval) {
- DEBUG2_3(printk(KERN_WARNING
- "scsi(%ld): NVRAM configuration failed!\n", ha->host_no));
- }
- return (rval);
+ return QLA_SUCCESS;
}
static void
@@ -3121,7 +3071,9 @@ qla2x00_abort_isp(scsi_qla_host_t *ha)
ha->isp_ops.get_flash_version(ha, ha->request_ring);
- ha->isp_ops.nvram_config(ha);
+ rval = ha->isp_ops.nvram_config(ha);
+ if (rval)
+ goto isp_abort_retry;
if (!qla2x00_restart_isp(ha)) {
clear_bit(RESET_MARKER_NEEDED, &ha->dpc_flags);
@@ -3151,6 +3103,7 @@ qla2x00_abort_isp(scsi_qla_host_t *ha)
}
}
} else { /* failed the ISP abort */
+isp_abort_retry:
ha->flags.online = 1;
if (test_bit(ISP_ABORT_RETRY, &ha->dpc_flags)) {
if (ha->isp_abort_cnt == 0) {
@@ -3340,7 +3293,6 @@ qla24xx_reset_adapter(scsi_qla_host_t *ha)
int
qla24xx_nvram_config(scsi_qla_host_t *ha)
{
- int rval;
struct init_cb_24xx *icb;
struct nvram_24xx *nv;
uint32_t *dptr;
@@ -3348,7 +3300,6 @@ qla24xx_nvram_config(scsi_qla_host_t *ha)
uint32_t chksum;
uint16_t cnt;
- rval = QLA_SUCCESS;
icb = (struct init_cb_24xx *)ha->init_cb;
nv = (struct nvram_24xx *)ha->request_ring;
@@ -3381,51 +3332,7 @@ qla24xx_nvram_config(scsi_qla_host_t *ha)
qla_printk(KERN_WARNING, ha, "Inconsistent NVRAM detected: "
"checksum=0x%x id=%c version=0x%x.\n", chksum, nv->id[0],
le16_to_cpu(nv->nvram_version));
- qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet "
- "invalid -- WWPN) defaults.\n");
-
- /*
- * Set default initialization control block.
- */
- memset(nv, 0, ha->nvram_size);
- nv->nvram_version = __constant_cpu_to_le16(ICB_VERSION);
- nv->version = __constant_cpu_to_le16(ICB_VERSION);
- nv->frame_payload_size = __constant_cpu_to_le16(2048);
- nv->execution_throttle = __constant_cpu_to_le16(0xFFFF);
- nv->exchange_count = __constant_cpu_to_le16(0);
- nv->hard_address = __constant_cpu_to_le16(124);
- nv->port_name[0] = 0x21;
- nv->port_name[1] = 0x00 + PCI_FUNC(ha->pdev->devfn);
- nv->port_name[2] = 0x00;
- nv->port_name[3] = 0xe0;
- nv->port_name[4] = 0x8b;
- nv->port_name[5] = 0x1c;
- nv->port_name[6] = 0x55;
- nv->port_name[7] = 0x86;
- nv->node_name[0] = 0x20;
- nv->node_name[1] = 0x00;
- nv->node_name[2] = 0x00;
- nv->node_name[3] = 0xe0;
- nv->node_name[4] = 0x8b;
- nv->node_name[5] = 0x1c;
- nv->node_name[6] = 0x55;
- nv->node_name[7] = 0x86;
- nv->login_retry_count = __constant_cpu_to_le16(8);
- nv->interrupt_delay_timer = __constant_cpu_to_le16(0);
- nv->login_timeout = __constant_cpu_to_le16(0);
- nv->firmware_options_1 =
- __constant_cpu_to_le32(BIT_14|BIT_13|BIT_2|BIT_1);
- nv->firmware_options_2 = __constant_cpu_to_le32(2 << 4);
- nv->firmware_options_2 |= __constant_cpu_to_le32(BIT_12);
- nv->firmware_options_3 = __constant_cpu_to_le32(2 << 13);
- nv->host_p = __constant_cpu_to_le32(BIT_11|BIT_10);
- nv->efi_parameters = __constant_cpu_to_le32(0);
- nv->reset_delay = 5;
- nv->max_luns_per_target = __constant_cpu_to_le16(128);
- nv->port_down_retry_count = __constant_cpu_to_le16(30);
- nv->link_down_timeout = __constant_cpu_to_le16(30);
-
- rval = 1;
+ return QLA_FUNCTION_FAILED;
}
/* Reset Initialization control block */
@@ -3570,11 +3477,7 @@ qla24xx_nvram_config(scsi_qla_host_t *ha)
ha->flags.process_response_queue = 1;
}
- if (rval) {
- DEBUG2_3(printk(KERN_WARNING
- "scsi(%ld): NVRAM configuration failed!\n", ha->host_no));
- }
- return (rval);
+ return QLA_SUCCESS;
}
static int
^ permalink raw reply related [flat|nested] 25+ messages in thread* Re: Major qla2xxx regression on sparc64 2007-04-16 8:02 Major qla2xxx regression on sparc64 David Miller @ 2007-04-16 16:37 ` Andrew Vasquez 2007-04-16 19:37 ` David Miller 0 siblings, 1 reply; 25+ messages in thread From: Andrew Vasquez @ 2007-04-16 16:37 UTC (permalink / raw) To: David Miller; +Cc: linux-scsi, linux-kernel, James.Bottomley, ema [-- Attachment #1: Type: text/plain, Size: 3194 bytes --] On Mon, 16 Apr 2007, David Miller wrote: > Sparc64 systems which have an on-board qla2xxx chip (such as > SunBlade-1000 and SunBlade-2000, there are probably some other systems > like this too) do not have any NVRAM information present, in fact the > NVRAM is basically all 0's from what I can tell. > > This always worked just fine since the code would previously just use > a bunch of defaults when an inconsistent NVRAM was detected. > > But the changeset below at the end of this email broke this and now > I'm seeing bug reports from sparc64 users and I was just able to > reproduce the problem myself just today as well. I verified that > reverting the patch below gets things working again. > > Emanuele, you can feed the patch below to "patch -p1 -R" to get that > working again so we can move on to the other sparc64 bug we're looking > into :-) I sent Emanuele the attached patch during the weekend... > The failure mode isn't nice, it actually ends up crashing with an OOPS > in qla2xxx_init_host_attr() because ha->node_name is NULL, it's > supposed to be initialized by functions like qla2x00_nvram_config() No, it's not very nice... > Can we revert the patch below or do something similar to get things > working again on sparc64? > > The most important thing which qla2x00_nvram_config() seems to want to > get is the WWN port_name and node_name. These are provided in the OFW > device tree so we could pluck them out of there with something like: > > #ifdef CONFIG_SPARC > #include <asm/prom.h> > #include <asm/pbm.h> > #endif > > ... > > #ifdef CONFIG_SPARC > struct pcidev_cookie *pcp = pdev->sysdata; > u8 *port_name, *node_name; > > port_name = of_get_property(pcp->prom_node, "port-wwn", NULL); > node_name = of_get_property(pcp->prom_node, "node-wwn", NULL); > #endif > Those will hold a pointer to the property values or NULL if the > property does not exist. This is private data, so you should make > copies of them into your local data structure and not use references > to them. > > I don't see any OFW properties present that could be used to fill in > the rest of the NVRAM parameters, so we'd need to use the defaults > that the code before the change was using. I'd be more inclined to do soemthing like the above, rather than: > But even if that fails, I think the fallback code should be put back, > since it obviously was used by at least one system and it's probable > that there are some other applications of using this qla2xxx chip that > will have an empty NVRAM too. Then they should really get their NVRAM corrected, if in fact their NVRAMs are cleared. > I can understand the apprehension in using a fixed port_name[] value, > since it could conflict with other FC controllers on the mesh, but if > that is so important just choose some random value that is a valid FC > ID or use some characteristic ID that can be used to compose part of > the port WWN in order to give it at least some uniqueness. Look, there's a fine balance here that we must strike -- the solution that you're proposing implies that there's some 'random' bit-space within the IEEE NAA with which one can safely encode without stomping on any valid OUI. [-- Attachment #2: 0001-qla2xxx-Error-out-during-probe-if-we-re-unable-to.patch --] [-- Type: text/plain, Size: 1069 bytes --] >From 9ee6de3bbaa03390b83226e7bb84c49566a583b3 Mon Sep 17 00:00:00 2001 From: Andrew Vasquez <andrew.vasquez@qlogic.com> Date: Wed, 11 Apr 2007 16:02:06 -0700 Subject: [PATCH] qla2xxx: Error-out during probe() if we're unable to complete HBA initialization. Remove a stale check against ha->device_flags (DFLG_NO_CABLE) as topology scanning is performed within the DPC-thread context. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> --- drivers/scsi/qla2xxx/qla_os.c | 4 +--- 1 files changed, 1 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index b78919a..0a36912 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -1577,9 +1577,7 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id) goto probe_failed; } - if (qla2x00_initialize_adapter(ha) && - !(ha->device_flags & DFLG_NO_CABLE)) { - + if (qla2x00_initialize_adapter(ha)) { qla_printk(KERN_WARNING, ha, "Failed to initialize adapter\n"); -- 1.5.1.1.107.g7a159 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-16 16:37 ` Andrew Vasquez @ 2007-04-16 19:37 ` David Miller 2007-04-16 19:59 ` David Miller 2007-04-16 20:08 ` Andrew Vasquez 0 siblings, 2 replies; 25+ messages in thread From: David Miller @ 2007-04-16 19:37 UTC (permalink / raw) To: andrew.vasquez; +Cc: linux-scsi, linux-kernel, James.Bottomley, ema From: Andrew Vasquez <andrew.vasquez@qlogic.com> Date: Mon, 16 Apr 2007 09:37:12 -0700 > On Mon, 16 Apr 2007, David Miller wrote: > > > But even if that fails, I think the fallback code should be put back, > > since it obviously was used by at least one system and it's probable > > that there are some other applications of using this qla2xxx chip that > > will have an empty NVRAM too. > > Then they should really get their NVRAM corrected, if in fact their > NVRAMs are cleared. > > > I can understand the apprehension in using a fixed port_name[] value, > > since it could conflict with other FC controllers on the mesh, but if > > that is so important just choose some random value that is a valid FC > > ID or use some characteristic ID that can be used to compose part of > > the port WWN in order to give it at least some uniqueness. > > Look, there's a fine balance here that we must strike -- the solution > that you're proposing implies that there's some 'random' bit-space > within the IEEE NAA with which one can safely encode without stomping > on any valid OUI. The fact is that your driver was significantly more robust previously, and now it's so less robust that it now fails for people. That's totally unacceptable. Just like the sparc64 systsems, others depending upon this fallback behavior the qla2xxx driver had are going to break and they are not going to be able to just go and fix their hardware and re-flash the NVRAM. Every user on the planet is going to be 1,000 times more happy with a big fat warning in their kernel log saying that things might not go right, but the driver is going to try anyways, rather than a complete non-attempt to make things work. You replaced a possible failure with a guarenteed one. %99.9999999 of people are never going to run into a FC ID collision. They have an onboard FC controller and a disk or two. They DON'T CARE, they want their systems to work and if you don't give them that you're not being a good driver maintainer. You BROKE things, therefore you must FIX it. Now I'm happy to code up the sparc OFW property bits but your attitude and perspective on this absolutely has to change and the old fallback code still has to go back in there, possible FC ID collisions or not. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-16 19:37 ` David Miller @ 2007-04-16 19:59 ` David Miller 2007-04-16 20:08 ` Andrew Vasquez 1 sibling, 0 replies; 25+ messages in thread From: David Miller @ 2007-04-16 19:59 UTC (permalink / raw) To: andrew.vasquez; +Cc: linux-scsi, linux-kernel, James.Bottomley, ema From: David Miller <davem@davemloft.net> Date: Mon, 16 Apr 2007 12:37:43 -0700 (PDT) > Now I'm happy to code up the sparc OFW property bits but your attitude > and perspective on this absolutely has to change and the old fallback > code still has to go back in there, possible FC ID collisions or not. Here is the patch I think we should definitely apply to the driver to fix this bug and restore the fallback functionality. Tested on SunBlade-1000: diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c index 98c01cd..3e296ab 100644 --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -11,6 +11,11 @@ #include "qla_devtbl.h" +#ifdef CONFIG_SPARC +#include <asm/prom.h> +#include <asm/pbm.h> +#endif + /* XXX(hch): this is ugly, but we don't want to pull in exioctl.h */ #ifndef EXT_IS_LUN_BIT_SET #define EXT_IS_LUN_BIT_SET(P,L) \ @@ -88,12 +93,7 @@ qla2x00_initialize_adapter(scsi_qla_host_t *ha) qla_printk(KERN_INFO, ha, "Configure NVRAM parameters...\n"); - rval = ha->isp_ops.nvram_config(ha); - if (rval) { - DEBUG2(printk("scsi(%ld): Unable to verify NVRAM data.\n", - ha->host_no)); - return rval; - } + ha->isp_ops.nvram_config(ha); if (ha->flags.disable_serdes) { /* Mask HBA via NVRAM settings? */ @@ -1393,6 +1393,28 @@ qla2x00_set_model_info(scsi_qla_host_t *ha, uint8_t *model, size_t len, char *de } } +/* On sparc systems, obtain port and node WWN from firmware + * properties. + */ +static void qla2xxx_nvram_wwn_from_ofw(scsi_qla_host_t *ha, nvram_t *nv) +{ +#ifdef CONFIG_SPARC + struct pci_dev *pdev = ha->pdev; + struct pcidev_cookie *pcp = pdev->sysdata; + struct device_node *dp = pcp->prom_node; + u8 *val; + int len; + + val = of_get_property(dp, "port-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->port_name, val, WWN_SIZE); + + val = of_get_property(dp, "node-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->node_name, val, WWN_SIZE); +#endif +} + /* * NVRAM configuration for ISP 2xxx * @@ -1409,6 +1431,7 @@ qla2x00_set_model_info(scsi_qla_host_t *ha, uint8_t *model, size_t len, char *de int qla2x00_nvram_config(scsi_qla_host_t *ha) { + int rval; uint8_t chksum = 0; uint16_t cnt; uint8_t *dptr1, *dptr2; @@ -1417,6 +1440,8 @@ qla2x00_nvram_config(scsi_qla_host_t *ha) uint8_t *ptr = (uint8_t *)ha->request_ring; struct device_reg_2xxx __iomem *reg = &ha->iobase->isp; + rval = QLA_SUCCESS; + /* Determine NVRAM starting address. */ ha->nvram_size = sizeof(nvram_t); ha->nvram_base = 0; @@ -1440,7 +1465,57 @@ qla2x00_nvram_config(scsi_qla_host_t *ha) qla_printk(KERN_WARNING, ha, "Inconsistent NVRAM detected: " "checksum=0x%x id=%c version=0x%x.\n", chksum, nv->id[0], nv->nvram_version); - return QLA_FUNCTION_FAILED; + qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet " + "invalid -- WWPN) defaults.\n"); + + /* + * Set default initialization control block. + */ + memset(nv, 0, ha->nvram_size); + nv->parameter_block_version = ICB_VERSION; + + if (IS_QLA23XX(ha)) { + nv->firmware_options[0] = BIT_2 | BIT_1; + nv->firmware_options[1] = BIT_7 | BIT_5; + nv->add_firmware_options[0] = BIT_5; + nv->add_firmware_options[1] = BIT_5 | BIT_4; + nv->frame_payload_size = __constant_cpu_to_le16(2048); + nv->special_options[1] = BIT_7; + } else if (IS_QLA2200(ha)) { + nv->firmware_options[0] = BIT_2 | BIT_1; + nv->firmware_options[1] = BIT_7 | BIT_5; + nv->add_firmware_options[0] = BIT_5; + nv->add_firmware_options[1] = BIT_5 | BIT_4; + nv->frame_payload_size = __constant_cpu_to_le16(1024); + } else if (IS_QLA2100(ha)) { + nv->firmware_options[0] = BIT_3 | BIT_1; + nv->firmware_options[1] = BIT_5; + nv->frame_payload_size = __constant_cpu_to_le16(1024); + } + + nv->max_iocb_allocation = __constant_cpu_to_le16(256); + nv->execution_throttle = __constant_cpu_to_le16(16); + nv->retry_count = 8; + nv->retry_delay = 1; + + nv->port_name[0] = 33; + nv->port_name[3] = 224; + nv->port_name[4] = 139; + + qla2xxx_nvram_wwn_from_ofw(ha, nv); + + nv->login_timeout = 4; + + /* + * Set default host adapter parameters + */ + nv->host_p[1] = BIT_2; + nv->reset_delay = 5; + nv->port_down_retry_count = 8; + nv->max_luns_per_target = __constant_cpu_to_le16(8); + nv->link_down_timeout = 60; + + rval = 1; } #if defined(CONFIG_IA64_GENERIC) || defined(CONFIG_IA64_SGI_SN2) @@ -1653,7 +1728,11 @@ qla2x00_nvram_config(scsi_qla_host_t *ha) } } - return QLA_SUCCESS; + if (rval) { + DEBUG2_3(printk(KERN_WARNING + "scsi(%ld): NVRAM configuration failed!\n", ha->host_no)); + } + return (rval); } static void @@ -3071,9 +3150,7 @@ qla2x00_abort_isp(scsi_qla_host_t *ha) ha->isp_ops.get_flash_version(ha, ha->request_ring); - rval = ha->isp_ops.nvram_config(ha); - if (rval) - goto isp_abort_retry; + ha->isp_ops.nvram_config(ha); if (!qla2x00_restart_isp(ha)) { clear_bit(RESET_MARKER_NEEDED, &ha->dpc_flags); @@ -3103,7 +3180,6 @@ qla2x00_abort_isp(scsi_qla_host_t *ha) } } } else { /* failed the ISP abort */ -isp_abort_retry: ha->flags.online = 1; if (test_bit(ISP_ABORT_RETRY, &ha->dpc_flags)) { if (ha->isp_abort_cnt == 0) { @@ -3290,9 +3366,32 @@ qla24xx_reset_adapter(scsi_qla_host_t *ha) spin_unlock_irqrestore(&ha->hardware_lock, flags); } +/* On sparc systems, obtain port and node WWN from firmware + * properties. + */ +static void qla24xx_nvram_wwn_from_ofw(scsi_qla_host_t *ha, struct nvram_24xx *nv) +{ +#ifdef CONFIG_SPARC + struct pci_dev *pdev = ha->pdev; + struct pcidev_cookie *pcp = pdev->sysdata; + struct device_node *dp = pcp->prom_node; + u8 *val; + int len; + + val = of_get_property(dp, "port-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->port_name, val, WWN_SIZE); + + val = of_get_property(dp, "node-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->node_name, val, WWN_SIZE); +#endif +} + int qla24xx_nvram_config(scsi_qla_host_t *ha) { + int rval; struct init_cb_24xx *icb; struct nvram_24xx *nv; uint32_t *dptr; @@ -3300,6 +3399,7 @@ qla24xx_nvram_config(scsi_qla_host_t *ha) uint32_t chksum; uint16_t cnt; + rval = QLA_SUCCESS; icb = (struct init_cb_24xx *)ha->init_cb; nv = (struct nvram_24xx *)ha->request_ring; @@ -3332,7 +3432,52 @@ qla24xx_nvram_config(scsi_qla_host_t *ha) qla_printk(KERN_WARNING, ha, "Inconsistent NVRAM detected: " "checksum=0x%x id=%c version=0x%x.\n", chksum, nv->id[0], le16_to_cpu(nv->nvram_version)); - return QLA_FUNCTION_FAILED; + qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet " + "invalid -- WWPN) defaults.\n"); + + /* + * Set default initialization control block. + */ + memset(nv, 0, ha->nvram_size); + nv->nvram_version = __constant_cpu_to_le16(ICB_VERSION); + nv->version = __constant_cpu_to_le16(ICB_VERSION); + nv->frame_payload_size = __constant_cpu_to_le16(2048); + nv->execution_throttle = __constant_cpu_to_le16(0xFFFF); + nv->exchange_count = __constant_cpu_to_le16(0); + nv->hard_address = __constant_cpu_to_le16(124); + nv->port_name[0] = 0x21; + nv->port_name[1] = 0x00 + PCI_FUNC(ha->pdev->devfn); + nv->port_name[2] = 0x00; + nv->port_name[3] = 0xe0; + nv->port_name[4] = 0x8b; + nv->port_name[5] = 0x1c; + nv->port_name[6] = 0x55; + nv->port_name[7] = 0x86; + nv->node_name[0] = 0x20; + nv->node_name[1] = 0x00; + nv->node_name[2] = 0x00; + nv->node_name[3] = 0xe0; + nv->node_name[4] = 0x8b; + nv->node_name[5] = 0x1c; + nv->node_name[6] = 0x55; + nv->node_name[7] = 0x86; + qla24xx_nvram_wwn_from_ofw(ha, nv); + nv->login_retry_count = __constant_cpu_to_le16(8); + nv->interrupt_delay_timer = __constant_cpu_to_le16(0); + nv->login_timeout = __constant_cpu_to_le16(0); + nv->firmware_options_1 = + __constant_cpu_to_le32(BIT_14|BIT_13|BIT_2|BIT_1); + nv->firmware_options_2 = __constant_cpu_to_le32(2 << 4); + nv->firmware_options_2 |= __constant_cpu_to_le32(BIT_12); + nv->firmware_options_3 = __constant_cpu_to_le32(2 << 13); + nv->host_p = __constant_cpu_to_le32(BIT_11|BIT_10); + nv->efi_parameters = __constant_cpu_to_le32(0); + nv->reset_delay = 5; + nv->max_luns_per_target = __constant_cpu_to_le16(128); + nv->port_down_retry_count = __constant_cpu_to_le16(30); + nv->link_down_timeout = __constant_cpu_to_le16(30); + + rval = 1; } /* Reset Initialization control block */ @@ -3479,7 +3624,11 @@ qla24xx_nvram_config(scsi_qla_host_t *ha) ha->flags.process_response_queue = 1; } - return QLA_SUCCESS; + if (rval) { + DEBUG2_3(printk(KERN_WARNING + "scsi(%ld): NVRAM configuration failed!\n", ha->host_no)); + } + return (rval); } static int ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-16 19:37 ` David Miller 2007-04-16 19:59 ` David Miller @ 2007-04-16 20:08 ` Andrew Vasquez 2007-04-16 21:10 ` Andrew Vasquez 2007-04-18 17:13 ` Christoph Hellwig 1 sibling, 2 replies; 25+ messages in thread From: Andrew Vasquez @ 2007-04-16 20:08 UTC (permalink / raw) To: David Miller; +Cc: linux-scsi, linux-kernel, James.Bottomley, ema On Mon, 16 Apr 2007, David Miller wrote: > From: Andrew Vasquez <andrew.vasquez@qlogic.com> > Date: Mon, 16 Apr 2007 09:37:12 -0700 > > > On Mon, 16 Apr 2007, David Miller wrote: > > > > > But even if that fails, I think the fallback code should be put back, > > > since it obviously was used by at least one system and it's probable > > > that there are some other applications of using this qla2xxx chip that > > > will have an empty NVRAM too. > > > > Then they should really get their NVRAM corrected, if in fact their > > NVRAMs are cleared. > > > > > I can understand the apprehension in using a fixed port_name[] value, > > > since it could conflict with other FC controllers on the mesh, but if > > > that is so important just choose some random value that is a valid FC > > > ID or use some characteristic ID that can be used to compose part of > > > the port WWN in order to give it at least some uniqueness. > > > > Look, there's a fine balance here that we must strike -- the solution > > that you're proposing implies that there's some 'random' bit-space > > within the IEEE NAA with which one can safely encode without stomping > > on any valid OUI. > > The fact is that your driver was significantly more robust > previously, and now it's so less robust that it now fails for > people. > > That's totally unacceptable. > > Just like the sparc64 systsems, others depending upon this fallback > behavior the qla2xxx driver had are going to break and they are not > going to be able to just go and fix their hardware and re-flash the > NVRAM. > > Every user on the planet is going to be 1,000 times more happy with a > big fat warning in their kernel log saying that things might not go > right, but the driver is going to try anyways, rather than a complete > non-attempt to make things work. > > You replaced a possible failure with a guarenteed one. > > %99.9999999 of people are never going to run into a FC ID collision. > They have an onboard FC controller and a disk or two. Sorry, but in a SATA/SCSI environment that may be true, but in the case of FC that expectation is unrealistic. There are thousands of FC installations where there are several thousand endpoints (including initiators and targets) all interconnected. Let's use your case -- just connect two sparc machines within the same fabric to your storage, with the old code, there's still a problem. > They DON'T > CARE, they want their systems to work and if you don't give them that > you're not being a good driver maintainer. Let's push aside attitudes and unrealistic statistics, could we perhaps agree to re-add the use of doctored NVRAM (and thus non-random WWPN/WWNN) when NVRAM is corrupted or non-present with a module-parameter (which defaults to 0) which indicates the user *really* knows what she is doing and recognizes WWPN collisions may occur -- non-zero the parameter value indicates doctored values will be used, zero value (the default) fails initialization. In both cases a big FAT warning is issued. > You BROKE things, therefore you must FIX it. > > Now I'm happy to code up the sparc OFW property bits but your attitude > and perspective on this absolutely has to change and the old fallback > code still has to go back in there, possible FC ID collisions or not. That would be great, I'd like to insure the balance is maintained for *all* our users. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-16 20:08 ` Andrew Vasquez @ 2007-04-16 21:10 ` Andrew Vasquez 2007-04-16 22:08 ` David Miller 2007-04-18 17:13 ` Christoph Hellwig 1 sibling, 1 reply; 25+ messages in thread From: Andrew Vasquez @ 2007-04-16 21:10 UTC (permalink / raw) To: David Miller; +Cc: linux-scsi, linux-kernel, James.Bottomley, ema On Mon, 16 Apr 2007, Andrew Vasquez wrote: > On Mon, 16 Apr 2007, David Miller wrote: > > > They DON'T > > CARE, they want their systems to work and if you don't give them that > > you're not being a good driver maintainer. > > Let's push aside attitudes and unrealistic statistics, could we > perhaps agree to re-add the use of doctored NVRAM (and thus > non-random WWPN/WWNN) when NVRAM is corrupted or non-present with a > module-parameter (which defaults to 0) which indicates the user > *really* knows what she is doing and recognizes WWPN collisions may > occur -- non-zero the parameter value indicates doctored values will > be used, zero value (the default) fails initialization. In both cases > a big FAT warning is issued. > > > You BROKE things, therefore you must FIX it. > > > > Now I'm happy to code up the sparc OFW property bits but your attitude > > and perspective on this absolutely has to change and the old fallback > > code still has to go back in there, possible FC ID collisions or not. > > That would be great, I'd like to insure the balance is maintained for > *all* our users. Ok, how about the following patch based on the one you posted which adds the codes to retrieve the WWPN/WWNN from firmware on SPARC, and also adds the module-parameter override I mentioned above. Perhaps the module-parameter should be set to non-zero in the case of SPARC, to take care of your system configurations? Regards, Andrew Vasquez --- diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h index 74544ae..b26090d 100644 --- a/drivers/scsi/qla2xxx/qla_gbl.h +++ b/drivers/scsi/qla2xxx/qla_gbl.h @@ -62,6 +62,7 @@ extern int ql2xfdmienable; extern int ql2xallocfwdump; extern int ql2xextended_error_logging; extern int ql2xqfullrampup; +extern int ql2xoverrideinvalidnvram; extern void qla2x00_sp_compl(scsi_qla_host_t *, srb_t *); diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c index 98c01cd..fa5df97 100644 --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -11,6 +11,11 @@ #include "qla_devtbl.h" +#ifdef CONFIG_SPARC +#include <asm/prom.h> +#include <asm/pbm.h> +#endif + /* XXX(hch): this is ugly, but we don't want to pull in exioctl.h */ #ifndef EXT_IS_LUN_BIT_SET #define EXT_IS_LUN_BIT_SET(P,L) \ @@ -1393,6 +1398,42 @@ qla2x00_set_model_info(scsi_qla_host_t *ha, uint8_t *model, size_t len, char *de } } +/* On sparc systems, obtain port and node WWN from firmware + * properties. + */ +static void qla2xxx_nvram_wwn_from_ofw(scsi_qla_host_t *ha, nvram_t *nv) +{ +#ifdef CONFIG_SPARC + struct pci_dev *pdev = ha->pdev; + struct pcidev_cookie *pcp = pdev->sysdata; + struct device_node *dp = pcp->prom_node; + u8 *val; + int len; + + val = of_get_property(dp, "port-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->port_name, val, WWN_SIZE); + + val = of_get_property(dp, "node-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->node_name, val, WWN_SIZE); +#endif +} + +static inline int +qla2x00_override_invalid_nvram(scsi_qla_host_t *ha) +{ + if (!ql2xoverrideinvalidnvram) { + qla_printk(KERN_WARNING, ha, + "Reload the driver with the ql2xoverrideinvalidnvram \n"); + qla_printk(KERN_WARNING, ha, + " module parameter set to a non-zero value to ignore \n"); + qla_printk(KERN_WARNING, ha, + " this warning.\n"); + } + return ql2xoverrideinvalidnvram; +} + /* * NVRAM configuration for ISP 2xxx * @@ -1440,7 +1481,57 @@ qla2x00_nvram_config(scsi_qla_host_t *ha) qla_printk(KERN_WARNING, ha, "Inconsistent NVRAM detected: " "checksum=0x%x id=%c version=0x%x.\n", chksum, nv->id[0], nv->nvram_version); - return QLA_FUNCTION_FAILED; + if (!qla2x00_override_invalid_nvram(ha)) + return QLA_FUNCTION_FAILED; + qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet " + "invalid -- WWPN) defaults.\n"); + + /* + * Set default initialization control block. + */ + memset(nv, 0, ha->nvram_size); + nv->parameter_block_version = ICB_VERSION; + + if (IS_QLA23XX(ha)) { + nv->firmware_options[0] = BIT_2 | BIT_1; + nv->firmware_options[1] = BIT_7 | BIT_5; + nv->add_firmware_options[0] = BIT_5; + nv->add_firmware_options[1] = BIT_5 | BIT_4; + nv->frame_payload_size = __constant_cpu_to_le16(2048); + nv->special_options[1] = BIT_7; + } else if (IS_QLA2200(ha)) { + nv->firmware_options[0] = BIT_2 | BIT_1; + nv->firmware_options[1] = BIT_7 | BIT_5; + nv->add_firmware_options[0] = BIT_5; + nv->add_firmware_options[1] = BIT_5 | BIT_4; + nv->frame_payload_size = __constant_cpu_to_le16(1024); + } else if (IS_QLA2100(ha)) { + nv->firmware_options[0] = BIT_3 | BIT_1; + nv->firmware_options[1] = BIT_5; + nv->frame_payload_size = __constant_cpu_to_le16(1024); + } + + nv->max_iocb_allocation = __constant_cpu_to_le16(256); + nv->execution_throttle = __constant_cpu_to_le16(16); + nv->retry_count = 8; + nv->retry_delay = 1; + + nv->port_name[0] = 33; + nv->port_name[3] = 224; + nv->port_name[4] = 139; + + qla2xxx_nvram_wwn_from_ofw(ha, nv); + + nv->login_timeout = 4; + + /* + * Set default host adapter parameters + */ + nv->host_p[1] = BIT_2; + nv->reset_delay = 5; + nv->port_down_retry_count = 8; + nv->max_luns_per_target = __constant_cpu_to_le16(8); + nv->link_down_timeout = 60; } #if defined(CONFIG_IA64_GENERIC) || defined(CONFIG_IA64_SGI_SN2) @@ -3290,6 +3381,28 @@ qla24xx_reset_adapter(scsi_qla_host_t *ha) spin_unlock_irqrestore(&ha->hardware_lock, flags); } +/* On sparc systems, obtain port and node WWN from firmware + * properties. + */ +static void qla24xx_nvram_wwn_from_ofw(scsi_qla_host_t *ha, struct nvram_24xx *nv) +{ +#ifdef CONFIG_SPARC + struct pci_dev *pdev = ha->pdev; + struct pcidev_cookie *pcp = pdev->sysdata; + struct device_node *dp = pcp->prom_node; + u8 *val; + int len; + + val = of_get_property(dp, "port-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->port_name, val, WWN_SIZE); + + val = of_get_property(dp, "node-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->node_name, val, WWN_SIZE); +#endif +} + int qla24xx_nvram_config(scsi_qla_host_t *ha) { @@ -3332,7 +3445,52 @@ qla24xx_nvram_config(scsi_qla_host_t *ha) qla_printk(KERN_WARNING, ha, "Inconsistent NVRAM detected: " "checksum=0x%x id=%c version=0x%x.\n", chksum, nv->id[0], le16_to_cpu(nv->nvram_version)); - return QLA_FUNCTION_FAILED; + if (!qla2x00_override_invalid_nvram(ha)) + return QLA_FUNCTION_FAILED; + qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet " + "invalid -- WWPN) defaults.\n"); + + /* + * Set default initialization control block. + */ + memset(nv, 0, ha->nvram_size); + nv->nvram_version = __constant_cpu_to_le16(ICB_VERSION); + nv->version = __constant_cpu_to_le16(ICB_VERSION); + nv->frame_payload_size = __constant_cpu_to_le16(2048); + nv->execution_throttle = __constant_cpu_to_le16(0xFFFF); + nv->exchange_count = __constant_cpu_to_le16(0); + nv->hard_address = __constant_cpu_to_le16(124); + nv->port_name[0] = 0x21; + nv->port_name[1] = 0x00 + PCI_FUNC(ha->pdev->devfn); + nv->port_name[2] = 0x00; + nv->port_name[3] = 0xe0; + nv->port_name[4] = 0x8b; + nv->port_name[5] = 0x1c; + nv->port_name[6] = 0x55; + nv->port_name[7] = 0x86; + nv->node_name[0] = 0x20; + nv->node_name[1] = 0x00; + nv->node_name[2] = 0x00; + nv->node_name[3] = 0xe0; + nv->node_name[4] = 0x8b; + nv->node_name[5] = 0x1c; + nv->node_name[6] = 0x55; + nv->node_name[7] = 0x86; + qla24xx_nvram_wwn_from_ofw(ha, nv); + nv->login_retry_count = __constant_cpu_to_le16(8); + nv->interrupt_delay_timer = __constant_cpu_to_le16(0); + nv->login_timeout = __constant_cpu_to_le16(0); + nv->firmware_options_1 = + __constant_cpu_to_le32(BIT_14|BIT_13|BIT_2|BIT_1); + nv->firmware_options_2 = __constant_cpu_to_le32(2 << 4); + nv->firmware_options_2 |= __constant_cpu_to_le32(BIT_12); + nv->firmware_options_3 = __constant_cpu_to_le32(2 << 13); + nv->host_p = __constant_cpu_to_le32(BIT_11|BIT_10); + nv->efi_parameters = __constant_cpu_to_le32(0); + nv->reset_delay = 5; + nv->max_luns_per_target = __constant_cpu_to_le16(128); + nv->port_down_retry_count = __constant_cpu_to_le16(30); + nv->link_down_timeout = __constant_cpu_to_le16(30); } /* Reset Initialization control block */ diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index 68f5d24..acbe2db 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -90,6 +90,13 @@ MODULE_PARM_DESC(ql2xqfullrampup, "depth for a device after a queue-full condition has been " "detected. Default is 120 seconds."); +int ql2xoverrideinvalidnvram; +module_param(ql2xoverrideinvalidnvram, int, S_IRUGO|S_IRUSR); +MODULE_PARM_DESC(ql2xoverrideinvalidnvram, + "When set and upon detecting an invalid or non-present " + "NVRAM state, use doctored settings (potentially invalid) " + "to initialize the HBA."); + /* * SCSI host template entry points */ @@ -1575,9 +1582,7 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id) goto probe_failed; } - if (qla2x00_initialize_adapter(ha) && - !(ha->device_flags & DFLG_NO_CABLE)) { - + if (qla2x00_initialize_adapter(ha)) { qla_printk(KERN_WARNING, ha, "Failed to initialize adapter\n"); @@ -1733,7 +1738,8 @@ qla2x00_free_device(scsi_qla_host_t *ha) ha->flags.online = 0; /* Stop currently executing firmware. */ - qla2x00_try_to_stop_firmware(ha); + if (ha->fw_major_version) + qla2x00_try_to_stop_firmware(ha); /* turn-off interrupts on the card */ if (ha->interrupts_on) ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-16 21:10 ` Andrew Vasquez @ 2007-04-16 22:08 ` David Miller 2007-04-16 22:25 ` Andrew Vasquez 0 siblings, 1 reply; 25+ messages in thread From: David Miller @ 2007-04-16 22:08 UTC (permalink / raw) To: andrew.vasquez; +Cc: linux-scsi, linux-kernel, James.Bottomley, ema From: Andrew Vasquez <andrew.vasquez@qlogic.com> Date: Mon, 16 Apr 2007 14:10:49 -0700 > Ok, how about the following patch based on the one you posted which > adds the codes to retrieve the WWPN/WWNN from firmware on SPARC, and > also adds the module-parameter override I mentioned above. > > Perhaps the module-parameter should be set to non-zero in the case of > SPARC, to take care of your system configurations? I think it should default to non-zero always, in fact the option is completely pointless. The guy who hits this had a system which worked previously, and you're explicitly breaking it. That's wrong. How can you not see that this quality of implementation decision you're making stinks? ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-16 22:08 ` David Miller @ 2007-04-16 22:25 ` Andrew Vasquez 2007-04-16 22:29 ` David Miller 0 siblings, 1 reply; 25+ messages in thread From: Andrew Vasquez @ 2007-04-16 22:25 UTC (permalink / raw) To: David Miller; +Cc: linux-scsi, linux-kernel, James.Bottomley, ema On Mon, 16 Apr 2007, David Miller wrote: > From: Andrew Vasquez <andrew.vasquez@qlogic.com> > Date: Mon, 16 Apr 2007 14:10:49 -0700 > > > Ok, how about the following patch based on the one you posted which > > adds the codes to retrieve the WWPN/WWNN from firmware on SPARC, and > > also adds the module-parameter override I mentioned above. > > > > Perhaps the module-parameter should be set to non-zero in the case of > > SPARC, to take care of your system configurations? > > I think it should default to non-zero always, in fact the option > is completely pointless. > > The guy who hits this had a system which worked previously, and you're > explicitly breaking it. That's wrong. Sorry, 'it' didn't work... 'It' *never* did. > How can you not see that this quality of implementation decision > you're making stinks? You're defending a position which itself left users with a false sense of security and comfort. This is a *real* problem from an enterprise perspective where FC reigns. Fine, I'll agree that wacking-users (and I'll wager the outliers) with a 2x4 was a bit extreme, but I'd much rather handle those users on a case-by-case basis, either by: * If dealing with a PCI card, directing a user to a support staff at QLogic to resolve the NVRAM issues. * If it's some on-board ISP with no NVRAM, as was your SPARC case, then add *proper* codes to retrieve the data from some secondary persistent store. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-16 22:25 ` Andrew Vasquez @ 2007-04-16 22:29 ` David Miller 2007-04-16 23:28 ` Andrew Vasquez 0 siblings, 1 reply; 25+ messages in thread From: David Miller @ 2007-04-16 22:29 UTC (permalink / raw) To: andrew.vasquez; +Cc: linux-scsi, linux-kernel, James.Bottomley, ema From: Andrew Vasquez <andrew.vasquez@qlogic.com> Date: Mon, 16 Apr 2007 15:25:17 -0700 > Fine, I'll agree that wacking-users (and > I'll wager the outliers) with a 2x4 was a bit extreme, And that, right there, is basically the end of the conversation. You don't do this to users, ever. Put a big loud kernel log message in there when this situation presents itself, use as many capital letters and scary language that you wish. Let them know that if things explode they get to keep the pieces. But at least try to give them something that works when you know that you can. You don't need to make someone's system unbootable in order to make them aware of a potential problem. It's very anti-social to approach things in this way. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-16 22:29 ` David Miller @ 2007-04-16 23:28 ` Andrew Vasquez 2007-04-16 23:41 ` David Miller 0 siblings, 1 reply; 25+ messages in thread From: Andrew Vasquez @ 2007-04-16 23:28 UTC (permalink / raw) To: David Miller; +Cc: linux-scsi, linux-kernel, James.Bottomley, ema On Mon, 16 Apr 2007, David Miller wrote: > From: Andrew Vasquez <andrew.vasquez@qlogic.com> > Date: Mon, 16 Apr 2007 15:25:17 -0700 > > > Fine, I'll agree that wacking-users (and > > I'll wager the outliers) with a 2x4 was a bit extreme, > > And that, right there, is basically the end of the conversation. > > You don't do this to users, ever. > Put a big loud kernel log message in there when this situation > presents itself, use as many capital letters and scary language that > you wish. Let them know that if things explode they get to keep the > pieces. > > But at least try to give them something that works when you know that > you can. > > You don't need to make someone's system unbootable in order to make > them aware of a potential problem. It's very anti-social to approach Sorry, but let's be realistic, this type of warning would have *NEVER* been addressed if we kept the status quo -- your modifications to read the wwpn/wwnn would have never been submitted, everybody would have kept going on blistfully ignorant of the issue. Changes such as these are a common Linux upstream idiom... So, meeting in the middle, with the NVRAM bits restored along with some ability for the user to *knowingly* recognize the problem, I take it, is not going to work for you? ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-16 23:28 ` Andrew Vasquez @ 2007-04-16 23:41 ` David Miller 2007-04-16 23:47 ` Andrew Vasquez 0 siblings, 1 reply; 25+ messages in thread From: David Miller @ 2007-04-16 23:41 UTC (permalink / raw) To: andrew.vasquez; +Cc: linux-scsi, linux-kernel, James.Bottomley, ema From: Andrew Vasquez <andrew.vasquez@qlogic.com> Date: Mon, 16 Apr 2007 16:28:51 -0700 > Sorry, but let's be realistic, this type of warning would have > *NEVER* been addressed if we kept the status quo Wrong. I watch the logs all the time and would have sent you a fix to use the Sparc firmware info as soon as I saw the kernel log message. Anyone who has worked with me over the last 15 years will let you know emphatically that this is true. AND IN THE MEAN TIME I COULD GET WORK DONE AND MY SYSTEM WOULD BOOT! ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-16 23:41 ` David Miller @ 2007-04-16 23:47 ` Andrew Vasquez 2007-04-17 0:05 ` David Miller 0 siblings, 1 reply; 25+ messages in thread From: Andrew Vasquez @ 2007-04-16 23:47 UTC (permalink / raw) To: David Miller; +Cc: linux-scsi, linux-kernel, James.Bottomley, ema On Mon, 16 Apr 2007, David Miller wrote: > From: Andrew Vasquez <andrew.vasquez@qlogic.com> > Date: Mon, 16 Apr 2007 16:28:51 -0700 > > > Sorry, but let's be realistic, this type of warning would have > > *NEVER* been addressed if we kept the status quo > > Wrong. I watch the logs all the time and would have sent you a fix to > use the Sparc firmware info as soon as I saw the kernel log message. Dave, according to your earlier emails, the qla2xxx driver worked 'fine' in driver versions before commit 7aef45ac92f49e76d990b51b7ecd714b9a608be1. If that were the case, then you would have seen the warning messages: ... qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet " "invalid -- WWPN) defaults.\n"); > Anyone who has worked with me over the last 15 years will let you know > emphatically that this is true. > > AND IN THE MEAN TIME I COULD GET WORK DONE AND MY SYSTEM WOULD BOOT! I understand that, and recognize your contribution, that was never in question. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-16 23:47 ` Andrew Vasquez @ 2007-04-17 0:05 ` David Miller 2007-04-17 2:41 ` Andrew Vasquez 0 siblings, 1 reply; 25+ messages in thread From: David Miller @ 2007-04-17 0:05 UTC (permalink / raw) To: andrew.vasquez; +Cc: linux-scsi, linux-kernel, James.Bottomley, ema From: Andrew Vasquez <andrew.vasquez@qlogic.com> Date: Mon, 16 Apr 2007 16:47:05 -0700 > Dave, according to your earlier emails, the qla2xxx driver worked > 'fine' in driver versions before commit > 7aef45ac92f49e76d990b51b7ecd714b9a608be1. If that were the case, then > you would have seen the warning messages: > > ... > qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet " > "invalid -- WWPN) defaults.\n"); I have in fact seen the message several times and that messages gives me no reason to believe something needs to be fixed. It should have said "PLEASE REPORT THIS to drivers@qlogic.com" or something similar to indicate the severity better. "An invalid WWPN, what's that?" said the user. :) How about "FC IDs may conflict and cause miscommunication! Please report to driver author so this can be fixed!" or similar? ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-17 0:05 ` David Miller @ 2007-04-17 2:41 ` Andrew Vasquez 2007-04-17 5:02 ` David Miller 0 siblings, 1 reply; 25+ messages in thread From: Andrew Vasquez @ 2007-04-17 2:41 UTC (permalink / raw) To: David Miller, Seokmann Ju; +Cc: linux-scsi, linux-kernel, James.Bottomley, ema On Mon, 16 Apr 2007, David Miller wrote: > From: Andrew Vasquez <andrew.vasquez@qlogic.com> > Date: Mon, 16 Apr 2007 16:47:05 -0700 > > > Dave, according to your earlier emails, the qla2xxx driver worked > > 'fine' in driver versions before commit > > 7aef45ac92f49e76d990b51b7ecd714b9a608be1. If that were the case, then > > you would have seen the warning messages: > > > > ... > > qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet " > > "invalid -- WWPN) defaults.\n"); > > I have in fact seen the message several times and that messages gives > me no reason to believe something needs to be fixed. > > It should have said "PLEASE REPORT THIS to drivers@qlogic.com" or > something similar to indicate the severity better. > > "An invalid WWPN, what's that?" said the user. :) > > How about "FC IDs may conflict and cause miscommunication! Please > report to driver author so this can be fixed!" or similar? That verbiage sounds fine -- so would you consider the previous patch I submitted (with module parameter) along with the wording above? I'm in transit for a redeye to NY so I won't be able to modify the patch, If you would be amenable to the above, Seokmann, could you rework the patch? ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-17 2:41 ` Andrew Vasquez @ 2007-04-17 5:02 ` David Miller 2007-04-17 18:28 ` Seokmann Ju 0 siblings, 1 reply; 25+ messages in thread From: David Miller @ 2007-04-17 5:02 UTC (permalink / raw) To: andrew.vasquez Cc: seokmann.ju, linux-scsi, linux-kernel, James.Bottomley, ema From: Andrew Vasquez <andrew.vasquez@qlogic.com> Date: Mon, 16 Apr 2007 19:41:07 -0700 > That verbiage sounds fine -- so would you consider the previous patch > I submitted (with module parameter) along with the wording above? Yes, that sounds fine. > I'm in transit for a redeye to NY so I won't be able to modify the > patch, If you would be amenable to the above, Seokmann, could you > rework the patch? Thanks guys. ^ permalink raw reply [flat|nested] 25+ messages in thread
* RE: Major qla2xxx regression on sparc64 2007-04-17 5:02 ` David Miller @ 2007-04-17 18:28 ` Seokmann Ju 2007-04-17 18:56 ` David Miller 2007-04-18 17:16 ` Christoph Hellwig 0 siblings, 2 replies; 25+ messages in thread From: Seokmann Ju @ 2007-04-17 18:28 UTC (permalink / raw) To: David Miller, Andrew Vasquez Cc: linux-scsi, linux-kernel, James.Bottomley, ema [-- Attachment #1: Type: text/plain, Size: 10729 bytes --] Hello David, On Mon 4/16/2007 10:02 PM, David Miller wrote: > > I'm in transit for a redeye to NY so I won't be able to modify the > > patch, If you would be amenable to the above, Seokmann, could you > > rework the patch? > > Thanks guys. Here, I've attached updated patch. Please take this. Sorry for late. Thank you, Seokmann --- diff -Naur linux-2.6.git/drivers/scsi/qla2xxx/qla_gbl.h linux-2.6.git-nvram/drivers/scsi/qla2xxx/qla_gbl.h --- linux-2.6.git/drivers/scsi/qla2xxx/qla_gbl.h 2007-04-17 11:06:39.000000000 -0700 +++ linux-2.6.git-nvram/drivers/scsi/qla2xxx/qla_gbl.h 2007-04-17 11:07:24.000000000 -0700 @@ -62,6 +62,7 @@ extern int ql2xallocfwdump; extern int ql2xextended_error_logging; extern int ql2xqfullrampup; +extern int ql2xoverrideinvalidnvram; extern void qla2x00_sp_compl(scsi_qla_host_t *, srb_t *); diff -Naur linux-2.6.git/drivers/scsi/qla2xxx/qla_init.c linux-2.6.git-nvram/drivers/scsi/qla2xxx/qla_init.c --- linux-2.6.git/drivers/scsi/qla2xxx/qla_init.c 2007-04-17 11:06:39.000000000 -0700 +++ linux-2.6.git-nvram/drivers/scsi/qla2xxx/qla_init.c 2007-04-17 11:07:24.000000000 -0700 @@ -11,6 +11,11 @@ #include "qla_devtbl.h" +#ifdef CONFIG_SPARC +#include <asm/prom.h> +#include <asm/pbm.h> +#endif + /* XXX(hch): this is ugly, but we don't want to pull in exioctl.h */ #ifndef EXT_IS_LUN_BIT_SET #define EXT_IS_LUN_BIT_SET(P,L) \ @@ -1393,6 +1398,42 @@ } } +/* On sparc systems, obtain port and node WWN from firmware + * properties. + */ +static void qla2xxx_nvram_wwn_from_ofw(scsi_qla_host_t *ha, nvram_t *nv) +{ +#ifdef CONFIG_SPARC + struct pci_dev *pdev = ha->pdev; + struct pcidev_cookie *pcp = pdev->sysdata; + struct device_node *dp = pcp->prom_node; + u8 *val; + int len; + + val = of_get_property(dp, "port-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->port_name, val, WWN_SIZE); + + val = of_get_property(dp, "node-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->node_name, val, WWN_SIZE); +#endif +} + +static inline int +qla2x00_override_invalid_nvram(scsi_qla_host_t *ha) +{ + if (!ql2xoverrideinvalidnvram) { + qla_printk(KERN_WARNING, ha, + "Reload the driver with the ql2xoverrideinvalidnvram \n"); + qla_printk(KERN_WARNING, ha, + " module parameter set to a non-zero value to ignore \n"); + qla_printk(KERN_WARNING, ha, + " this warning.\n"); + } + return ql2xoverrideinvalidnvram; +} + /* * NVRAM configuration for ISP 2xxx * @@ -1440,7 +1481,58 @@ qla_printk(KERN_WARNING, ha, "Inconsistent NVRAM detected: " "checksum=0x%x id=%c version=0x%x.\n", chksum, nv->id[0], nv->nvram_version); - return QLA_FUNCTION_FAILED; + if (!qla2x00_override_invalid_nvram(ha)) + return QLA_FUNCTION_FAILED; + qla_printk(KERN_WARNING, ha, "FC IDs may conflict and cause " + "miscommunication - please report this to QLogic " + "(linux-driver@qlogic.com).\n"); + + /* + * Set default initialization control block. + */ + memset(nv, 0, ha->nvram_size); + nv->parameter_block_version = ICB_VERSION; + + if (IS_QLA23XX(ha)) { + nv->firmware_options[0] = BIT_2 | BIT_1; + nv->firmware_options[1] = BIT_7 | BIT_5; + nv->add_firmware_options[0] = BIT_5; + nv->add_firmware_options[1] = BIT_5 | BIT_4; + nv->frame_payload_size = __constant_cpu_to_le16(2048); + nv->special_options[1] = BIT_7; + } else if (IS_QLA2200(ha)) { + nv->firmware_options[0] = BIT_2 | BIT_1; + nv->firmware_options[1] = BIT_7 | BIT_5; + nv->add_firmware_options[0] = BIT_5; + nv->add_firmware_options[1] = BIT_5 | BIT_4; + nv->frame_payload_size = __constant_cpu_to_le16(1024); + } else if (IS_QLA2100(ha)) { + nv->firmware_options[0] = BIT_3 | BIT_1; + nv->firmware_options[1] = BIT_5; + nv->frame_payload_size = __constant_cpu_to_le16(1024); + } + + nv->max_iocb_allocation = __constant_cpu_to_le16(256); + nv->execution_throttle = __constant_cpu_to_le16(16); + nv->retry_count = 8; + nv->retry_delay = 1; + + nv->port_name[0] = 33; + nv->port_name[3] = 224; + nv->port_name[4] = 139; + + qla2xxx_nvram_wwn_from_ofw(ha, nv); + + nv->login_timeout = 4; + + /* + * Set default host adapter parameters + */ + nv->host_p[1] = BIT_2; + nv->reset_delay = 5; + nv->port_down_retry_count = 8; + nv->max_luns_per_target = __constant_cpu_to_le16(8); + nv->link_down_timeout = 60; } #if defined(CONFIG_IA64_GENERIC) || defined(CONFIG_IA64_SGI_SN2) @@ -3290,6 +3382,28 @@ spin_unlock_irqrestore(&ha->hardware_lock, flags); } +/* On sparc systems, obtain port and node WWN from firmware + * properties. + */ +static void qla24xx_nvram_wwn_from_ofw(scsi_qla_host_t *ha, struct nvram_24xx *nv) +{ +#ifdef CONFIG_SPARC + struct pci_dev *pdev = ha->pdev; + struct pcidev_cookie *pcp = pdev->sysdata; + struct device_node *dp = pcp->prom_node; + u8 *val; + int len; + + val = of_get_property(dp, "port-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->port_name, val, WWN_SIZE); + + val = of_get_property(dp, "node-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->node_name, val, WWN_SIZE); +#endif +} + int qla24xx_nvram_config(scsi_qla_host_t *ha) { @@ -3332,7 +3446,53 @@ qla_printk(KERN_WARNING, ha, "Inconsistent NVRAM detected: " "checksum=0x%x id=%c version=0x%x.\n", chksum, nv->id[0], le16_to_cpu(nv->nvram_version)); - return QLA_FUNCTION_FAILED; + if (!qla2x00_override_invalid_nvram(ha)) + return QLA_FUNCTION_FAILED; + qla_printk(KERN_WARNING, ha, "FC IDs may conflict and cause " + "miscommunication - please report this to QLogic " + "(linux-driver@qlogic.com).\n"); + + /* + * Set default initialization control block. + */ + memset(nv, 0, ha->nvram_size); + nv->nvram_version = __constant_cpu_to_le16(ICB_VERSION); + nv->version = __constant_cpu_to_le16(ICB_VERSION); + nv->frame_payload_size = __constant_cpu_to_le16(2048); + nv->execution_throttle = __constant_cpu_to_le16(0xFFFF); + nv->exchange_count = __constant_cpu_to_le16(0); + nv->hard_address = __constant_cpu_to_le16(124); + nv->port_name[0] = 0x21; + nv->port_name[1] = 0x00 + PCI_FUNC(ha->pdev->devfn); + nv->port_name[2] = 0x00; + nv->port_name[3] = 0xe0; + nv->port_name[4] = 0x8b; + nv->port_name[5] = 0x1c; + nv->port_name[6] = 0x55; + nv->port_name[7] = 0x86; + nv->node_name[0] = 0x20; + nv->node_name[1] = 0x00; + nv->node_name[2] = 0x00; + nv->node_name[3] = 0xe0; + nv->node_name[4] = 0x8b; + nv->node_name[5] = 0x1c; + nv->node_name[6] = 0x55; + nv->node_name[7] = 0x86; + qla24xx_nvram_wwn_from_ofw(ha, nv); + nv->login_retry_count = __constant_cpu_to_le16(8); + nv->interrupt_delay_timer = __constant_cpu_to_le16(0); + nv->login_timeout = __constant_cpu_to_le16(0); + nv->firmware_options_1 = + __constant_cpu_to_le32(BIT_14|BIT_13|BIT_2|BIT_1); + nv->firmware_options_2 = __constant_cpu_to_le32(2 << 4); + nv->firmware_options_2 |= __constant_cpu_to_le32(BIT_12); + nv->firmware_options_3 = __constant_cpu_to_le32(2 << 13); + nv->host_p = __constant_cpu_to_le32(BIT_11|BIT_10); + nv->efi_parameters = __constant_cpu_to_le32(0); + nv->reset_delay = 5; + nv->max_luns_per_target = __constant_cpu_to_le16(128); + nv->port_down_retry_count = __constant_cpu_to_le16(30); + nv->link_down_timeout = __constant_cpu_to_le16(30); } /* Reset Initialization control block */ diff -Naur linux-2.6.git/drivers/scsi/qla2xxx/qla_os.c linux-2.6.git-nvram/drivers/scsi/qla2xxx/qla_os.c --- linux-2.6.git/drivers/scsi/qla2xxx/qla_os.c 2007-04-17 11:06:39.000000000 -0700 +++ linux-2.6.git-nvram/drivers/scsi/qla2xxx/qla_os.c 2007-04-17 11:07:24.000000000 -0700 @@ -90,6 +90,13 @@ "depth for a device after a queue-full condition has been " "detected. Default is 120 seconds."); +int ql2xoverrideinvalidnvram; +module_param(ql2xoverrideinvalidnvram, int, S_IRUGO|S_IRUSR); +MODULE_PARM_DESC(ql2xoverrideinvalidnvram, + "When set and upon detecting an invalid or non-present " + "NVRAM state, use doctored settings (potentially invalid) " + "to initialize the HBA."); + /* * SCSI host template entry points */ @@ -1575,9 +1582,7 @@ goto probe_failed; } - if (qla2x00_initialize_adapter(ha) && - !(ha->device_flags & DFLG_NO_CABLE)) { - + if (qla2x00_initialize_adapter(ha)) { qla_printk(KERN_WARNING, ha, "Failed to initialize adapter\n"); @@ -1733,7 +1738,8 @@ ha->flags.online = 0; /* Stop currently executing firmware. */ - qla2x00_try_to_stop_firmware(ha); + if (ha->fw_major_version) + qla2x00_try_to_stop_firmware(ha); /* turn-off interrupts on the card */ if (ha->interrupts_on) --- [-- Attachment #2: nvram_override.patch --] [-- Type: application/octet-stream, Size: 8229 bytes --] diff -Naur linux-2.6.git/drivers/scsi/qla2xxx/qla_gbl.h linux-2.6.git-nvram/drivers/scsi/qla2xxx/qla_gbl.h --- linux-2.6.git/drivers/scsi/qla2xxx/qla_gbl.h 2007-04-17 11:06:39.000000000 -0700 +++ linux-2.6.git-nvram/drivers/scsi/qla2xxx/qla_gbl.h 2007-04-17 11:07:24.000000000 -0700 @@ -62,6 +62,7 @@ extern int ql2xallocfwdump; extern int ql2xextended_error_logging; extern int ql2xqfullrampup; +extern int ql2xoverrideinvalidnvram; extern void qla2x00_sp_compl(scsi_qla_host_t *, srb_t *); diff -Naur linux-2.6.git/drivers/scsi/qla2xxx/qla_init.c linux-2.6.git-nvram/drivers/scsi/qla2xxx/qla_init.c --- linux-2.6.git/drivers/scsi/qla2xxx/qla_init.c 2007-04-17 11:06:39.000000000 -0700 +++ linux-2.6.git-nvram/drivers/scsi/qla2xxx/qla_init.c 2007-04-17 11:07:24.000000000 -0700 @@ -11,6 +11,11 @@ #include "qla_devtbl.h" +#ifdef CONFIG_SPARC +#include <asm/prom.h> +#include <asm/pbm.h> +#endif + /* XXX(hch): this is ugly, but we don't want to pull in exioctl.h */ #ifndef EXT_IS_LUN_BIT_SET #define EXT_IS_LUN_BIT_SET(P,L) \ @@ -1393,6 +1398,42 @@ } } +/* On sparc systems, obtain port and node WWN from firmware + * properties. + */ +static void qla2xxx_nvram_wwn_from_ofw(scsi_qla_host_t *ha, nvram_t *nv) +{ +#ifdef CONFIG_SPARC + struct pci_dev *pdev = ha->pdev; + struct pcidev_cookie *pcp = pdev->sysdata; + struct device_node *dp = pcp->prom_node; + u8 *val; + int len; + + val = of_get_property(dp, "port-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->port_name, val, WWN_SIZE); + + val = of_get_property(dp, "node-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->node_name, val, WWN_SIZE); +#endif +} + +static inline int +qla2x00_override_invalid_nvram(scsi_qla_host_t *ha) +{ + if (!ql2xoverrideinvalidnvram) { + qla_printk(KERN_WARNING, ha, + "Reload the driver with the ql2xoverrideinvalidnvram \n"); + qla_printk(KERN_WARNING, ha, + " module parameter set to a non-zero value to ignore \n"); + qla_printk(KERN_WARNING, ha, + " this warning.\n"); + } + return ql2xoverrideinvalidnvram; +} + /* * NVRAM configuration for ISP 2xxx * @@ -1440,7 +1481,58 @@ qla_printk(KERN_WARNING, ha, "Inconsistent NVRAM detected: " "checksum=0x%x id=%c version=0x%x.\n", chksum, nv->id[0], nv->nvram_version); - return QLA_FUNCTION_FAILED; + if (!qla2x00_override_invalid_nvram(ha)) + return QLA_FUNCTION_FAILED; + qla_printk(KERN_WARNING, ha, "FC IDs may conflict and cause " + "miscommunication - please report this to QLogic " + "(linux-driver@qlogic.com).\n"); + + /* + * Set default initialization control block. + */ + memset(nv, 0, ha->nvram_size); + nv->parameter_block_version = ICB_VERSION; + + if (IS_QLA23XX(ha)) { + nv->firmware_options[0] = BIT_2 | BIT_1; + nv->firmware_options[1] = BIT_7 | BIT_5; + nv->add_firmware_options[0] = BIT_5; + nv->add_firmware_options[1] = BIT_5 | BIT_4; + nv->frame_payload_size = __constant_cpu_to_le16(2048); + nv->special_options[1] = BIT_7; + } else if (IS_QLA2200(ha)) { + nv->firmware_options[0] = BIT_2 | BIT_1; + nv->firmware_options[1] = BIT_7 | BIT_5; + nv->add_firmware_options[0] = BIT_5; + nv->add_firmware_options[1] = BIT_5 | BIT_4; + nv->frame_payload_size = __constant_cpu_to_le16(1024); + } else if (IS_QLA2100(ha)) { + nv->firmware_options[0] = BIT_3 | BIT_1; + nv->firmware_options[1] = BIT_5; + nv->frame_payload_size = __constant_cpu_to_le16(1024); + } + + nv->max_iocb_allocation = __constant_cpu_to_le16(256); + nv->execution_throttle = __constant_cpu_to_le16(16); + nv->retry_count = 8; + nv->retry_delay = 1; + + nv->port_name[0] = 33; + nv->port_name[3] = 224; + nv->port_name[4] = 139; + + qla2xxx_nvram_wwn_from_ofw(ha, nv); + + nv->login_timeout = 4; + + /* + * Set default host adapter parameters + */ + nv->host_p[1] = BIT_2; + nv->reset_delay = 5; + nv->port_down_retry_count = 8; + nv->max_luns_per_target = __constant_cpu_to_le16(8); + nv->link_down_timeout = 60; } #if defined(CONFIG_IA64_GENERIC) || defined(CONFIG_IA64_SGI_SN2) @@ -3290,6 +3382,28 @@ spin_unlock_irqrestore(&ha->hardware_lock, flags); } +/* On sparc systems, obtain port and node WWN from firmware + * properties. + */ +static void qla24xx_nvram_wwn_from_ofw(scsi_qla_host_t *ha, struct nvram_24xx *nv) +{ +#ifdef CONFIG_SPARC + struct pci_dev *pdev = ha->pdev; + struct pcidev_cookie *pcp = pdev->sysdata; + struct device_node *dp = pcp->prom_node; + u8 *val; + int len; + + val = of_get_property(dp, "port-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->port_name, val, WWN_SIZE); + + val = of_get_property(dp, "node-wwn", &len); + if (val && len >= WWN_SIZE) + memcpy(nv->node_name, val, WWN_SIZE); +#endif +} + int qla24xx_nvram_config(scsi_qla_host_t *ha) { @@ -3332,7 +3446,53 @@ qla_printk(KERN_WARNING, ha, "Inconsistent NVRAM detected: " "checksum=0x%x id=%c version=0x%x.\n", chksum, nv->id[0], le16_to_cpu(nv->nvram_version)); - return QLA_FUNCTION_FAILED; + if (!qla2x00_override_invalid_nvram(ha)) + return QLA_FUNCTION_FAILED; + qla_printk(KERN_WARNING, ha, "FC IDs may conflict and cause " + "miscommunication - please report this to QLogic " + "(linux-driver@qlogic.com).\n"); + + /* + * Set default initialization control block. + */ + memset(nv, 0, ha->nvram_size); + nv->nvram_version = __constant_cpu_to_le16(ICB_VERSION); + nv->version = __constant_cpu_to_le16(ICB_VERSION); + nv->frame_payload_size = __constant_cpu_to_le16(2048); + nv->execution_throttle = __constant_cpu_to_le16(0xFFFF); + nv->exchange_count = __constant_cpu_to_le16(0); + nv->hard_address = __constant_cpu_to_le16(124); + nv->port_name[0] = 0x21; + nv->port_name[1] = 0x00 + PCI_FUNC(ha->pdev->devfn); + nv->port_name[2] = 0x00; + nv->port_name[3] = 0xe0; + nv->port_name[4] = 0x8b; + nv->port_name[5] = 0x1c; + nv->port_name[6] = 0x55; + nv->port_name[7] = 0x86; + nv->node_name[0] = 0x20; + nv->node_name[1] = 0x00; + nv->node_name[2] = 0x00; + nv->node_name[3] = 0xe0; + nv->node_name[4] = 0x8b; + nv->node_name[5] = 0x1c; + nv->node_name[6] = 0x55; + nv->node_name[7] = 0x86; + qla24xx_nvram_wwn_from_ofw(ha, nv); + nv->login_retry_count = __constant_cpu_to_le16(8); + nv->interrupt_delay_timer = __constant_cpu_to_le16(0); + nv->login_timeout = __constant_cpu_to_le16(0); + nv->firmware_options_1 = + __constant_cpu_to_le32(BIT_14|BIT_13|BIT_2|BIT_1); + nv->firmware_options_2 = __constant_cpu_to_le32(2 << 4); + nv->firmware_options_2 |= __constant_cpu_to_le32(BIT_12); + nv->firmware_options_3 = __constant_cpu_to_le32(2 << 13); + nv->host_p = __constant_cpu_to_le32(BIT_11|BIT_10); + nv->efi_parameters = __constant_cpu_to_le32(0); + nv->reset_delay = 5; + nv->max_luns_per_target = __constant_cpu_to_le16(128); + nv->port_down_retry_count = __constant_cpu_to_le16(30); + nv->link_down_timeout = __constant_cpu_to_le16(30); } /* Reset Initialization control block */ diff -Naur linux-2.6.git/drivers/scsi/qla2xxx/qla_os.c linux-2.6.git-nvram/drivers/scsi/qla2xxx/qla_os.c --- linux-2.6.git/drivers/scsi/qla2xxx/qla_os.c 2007-04-17 11:06:39.000000000 -0700 +++ linux-2.6.git-nvram/drivers/scsi/qla2xxx/qla_os.c 2007-04-17 11:07:24.000000000 -0700 @@ -90,6 +90,13 @@ "depth for a device after a queue-full condition has been " "detected. Default is 120 seconds."); +int ql2xoverrideinvalidnvram; +module_param(ql2xoverrideinvalidnvram, int, S_IRUGO|S_IRUSR); +MODULE_PARM_DESC(ql2xoverrideinvalidnvram, + "When set and upon detecting an invalid or non-present " + "NVRAM state, use doctored settings (potentially invalid) " + "to initialize the HBA."); + /* * SCSI host template entry points */ @@ -1575,9 +1582,7 @@ goto probe_failed; } - if (qla2x00_initialize_adapter(ha) && - !(ha->device_flags & DFLG_NO_CABLE)) { - + if (qla2x00_initialize_adapter(ha)) { qla_printk(KERN_WARNING, ha, "Failed to initialize adapter\n"); @@ -1733,7 +1738,8 @@ ha->flags.online = 0; /* Stop currently executing firmware. */ - qla2x00_try_to_stop_firmware(ha); + if (ha->fw_major_version) + qla2x00_try_to_stop_firmware(ha); /* turn-off interrupts on the card */ if (ha->interrupts_on) ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-17 18:28 ` Seokmann Ju @ 2007-04-17 18:56 ` David Miller 2007-04-18 17:16 ` Christoph Hellwig 1 sibling, 0 replies; 25+ messages in thread From: David Miller @ 2007-04-17 18:56 UTC (permalink / raw) To: seokmann.ju Cc: andrew.vasquez, linux-scsi, linux-kernel, James.Bottomley, ema From: "Seokmann Ju" <seokmann.ju@qlogic.com> Date: Tue, 17 Apr 2007 11:28:07 -0700 > Hello David, > On Mon 4/16/2007 10:02 PM, David Miller wrote: > > > I'm in transit for a redeye to NY so I won't be able to modify the > > > patch, If you would be amenable to the above, Seokmann, could you > > > rework the patch? > > > > Thanks guys. > Here, I've attached updated patch. Please take this. > Sorry for late. Actually, I misunderstood the patch. If SPARC provides the port and node WWN numbers in the OFW properties the driver should load properly. NAK, I reject this patch. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-17 18:28 ` Seokmann Ju 2007-04-17 18:56 ` David Miller @ 2007-04-18 17:16 ` Christoph Hellwig 2007-04-18 20:12 ` David Miller 1 sibling, 1 reply; 25+ messages in thread From: Christoph Hellwig @ 2007-04-18 17:16 UTC (permalink / raw) To: Seokmann Ju Cc: David Miller, Andrew Vasquez, linux-scsi, linux-kernel, James.Bottomley, ema On Tue, Apr 17, 2007 at 11:28:07AM -0700, Seokmann Ju wrote: > Hello David, > On Mon 4/16/2007 10:02 PM, David Miller wrote: > > > I'm in transit for a redeye to NY so I won't be able to modify the > > > patch, If you would be amenable to the above, Seokmann, could you > > > rework the patch? > > > > Thanks guys. > Here, I've attached updated patch. Please take this. > Sorry for late. I don't think there should be a warning in the sparc case where the wwn's are stored in the device tree. This is the way sun intended the hardware to work on sparc. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-18 17:16 ` Christoph Hellwig @ 2007-04-18 20:12 ` David Miller 0 siblings, 0 replies; 25+ messages in thread From: David Miller @ 2007-04-18 20:12 UTC (permalink / raw) To: hch Cc: seokmann.ju, andrew.vasquez, linux-scsi, linux-kernel, James.Bottomley, ema From: Christoph Hellwig <hch@infradead.org> Date: Wed, 18 Apr 2007 18:16:32 +0100 > On Tue, Apr 17, 2007 at 11:28:07AM -0700, Seokmann Ju wrote: > > Hello David, > > On Mon 4/16/2007 10:02 PM, David Miller wrote: > > > > I'm in transit for a redeye to NY so I won't be able to modify the > > > > patch, If you would be amenable to the above, Seokmann, could you > > > > rework the patch? > > > > > > Thanks guys. > > Here, I've attached updated patch. Please take this. > > Sorry for late. > > I don't think there should be a warning in the sparc case where the > wwn's are stored in the device tree. This is the way sun intended > the hardware to work on sparc. Yep, that's the main point. Sun provides device parameters via openprom properties in lieu of existing mechanisms chip vendors usually use for this on other platforms such as NVRAM and the BIOS. They do this for just about every scsi, networking, and video chip. It works this way on PowerPC as well. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-16 20:08 ` Andrew Vasquez 2007-04-16 21:10 ` Andrew Vasquez @ 2007-04-18 17:13 ` Christoph Hellwig 2007-04-18 17:28 ` Andrew Vasquez 2007-04-18 20:10 ` David Miller 1 sibling, 2 replies; 25+ messages in thread From: Christoph Hellwig @ 2007-04-18 17:13 UTC (permalink / raw) To: Andrew Vasquez Cc: David Miller, linux-scsi, linux-kernel, James.Bottomley, ema On Mon, Apr 16, 2007 at 01:08:57PM -0700, Andrew Vasquez wrote: > Sorry, but in a SATA/SCSI environment that may be true, but in the > case of FC that expectation is unrealistic. There are thousands of FC > installations where there are several thousand endpoints (including > initiators and targets) all interconnected. Let's use your case -- > just connect two sparc machines within the same fabric to your > storage, with the old code, there's still a problem. Multi-initiator setups are probably the majority in the FC world by now. But there has been a time around 2000 where various of companies thought FC is so l33t that they'd put it in as default. E.g. the early SGI Origin 3000 used to have FC-attached root disks. I think Sun has similar machines. > > They DON'T > > CARE, they want their systems to work and if you don't give them that > > you're not being a good driver maintainer. > > Let's push aside attitudes and unrealistic statistics, could we > perhaps agree to re-add the use of doctored NVRAM (and thus > non-random WWPN/WWNN) when NVRAM is corrupted or non-present with a > module-parameter (which defaults to 0) which indicates the user > *really* knows what she is doing and recognizes WWPN collisions may > occur -- non-zero the parameter value indicates doctored values will > be used, zero value (the default) fails initialization. In both cases > a big FAT warning is issued. I don't think a module option is a good idea at this point. The problem is you broke some so far perfectly working setups, which is not okay. The only first step can be printing a really big warning. After this has been in for a while (at lest half a year) we can make it a non-default option or turn if off completely in case the warning never triggered in practice. The only resonable thing for 2.6.21 is to put in David's patch, possible with an even more drastic warning when the rom is invalid and there's no prom-fallback available. Note that I expect Sun put in the invalid ROM intentionally, as we have similar cases with other cards that have totally messed up ROMs in Sun-branded versions. Personally I think that's an utterly bad decision from Sun's side, but we'll have to live with this. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-18 17:13 ` Christoph Hellwig @ 2007-04-18 17:28 ` Andrew Vasquez 2007-04-18 20:13 ` David Miller 2007-04-18 20:10 ` David Miller 1 sibling, 1 reply; 25+ messages in thread From: Andrew Vasquez @ 2007-04-18 17:28 UTC (permalink / raw) To: Christoph Hellwig, David Miller, linux-scsi, linux-kernel, James.Bottomley, ema On Wed, 18 Apr 2007, Christoph Hellwig wrote: > I don't think a module option is a good idea at this point. The problem > is you broke some so far perfectly working setups, which is not okay. > The only first step can be printing a really big warning. After this > has been in for a while (at lest half a year) we can make it a non-default > option or turn if off completely in case the warning never triggered in > practice. > > The only resonable thing for 2.6.21 is to put in David's patch, possible > with an even more drastic warning when the rom is invalid and there's > no prom-fallback available. > > Note that I expect Sun put in the invalid ROM intentionally, as we have > similar cases with other cards that have totally messed up ROMs in > Sun-branded versions. Personally I think that's an utterly bad decision > from Sun's side, but we'll have to live with this. Fine. I'll rework an alternate patch for the 2.6.22 timeframe... ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-18 17:28 ` Andrew Vasquez @ 2007-04-18 20:13 ` David Miller 2007-04-19 17:15 ` Andrew Vasquez 0 siblings, 1 reply; 25+ messages in thread From: David Miller @ 2007-04-18 20:13 UTC (permalink / raw) To: andrew.vasquez; +Cc: hch, linux-scsi, linux-kernel, James.Bottomley, ema From: Andrew Vasquez <andrew.vasquez@qlogic.com> Date: Wed, 18 Apr 2007 10:28:02 -0700 > On Wed, 18 Apr 2007, Christoph Hellwig wrote: > > > I don't think a module option is a good idea at this point. The problem > > is you broke some so far perfectly working setups, which is not okay. > > The only first step can be printing a really big warning. After this > > has been in for a while (at lest half a year) we can make it a non-default > > option or turn if off completely in case the warning never triggered in > > practice. > > > > The only resonable thing for 2.6.21 is to put in David's patch, possible > > with an even more drastic warning when the rom is invalid and there's > > no prom-fallback available. > > > > Note that I expect Sun put in the invalid ROM intentionally, as we have > > similar cases with other cards that have totally messed up ROMs in > > Sun-branded versions. Personally I think that's an utterly bad decision > > from Sun's side, but we'll have to live with this. > > Fine. I'll rework an alternate patch for the 2.6.22 timeframe... We need to fix things now for 2.6.21 and the 2.6.x -stable branches because users have unusable systems currently. If it's just a time issue I can work on and push the patch, especially since I have the means to test things here. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-18 20:13 ` David Miller @ 2007-04-19 17:15 ` Andrew Vasquez 0 siblings, 0 replies; 25+ messages in thread From: Andrew Vasquez @ 2007-04-19 17:15 UTC (permalink / raw) To: David Miller; +Cc: hch, linux-scsi, linux-kernel, James.Bottomley, ema On Wed, 18 Apr 2007, David Miller wrote: > > On Wed, 18 Apr 2007, Christoph Hellwig wrote: > > > > > I don't think a module option is a good idea at this point. The problem > > > is you broke some so far perfectly working setups, which is not okay. > > > The only first step can be printing a really big warning. After this > > > has been in for a while (at lest half a year) we can make it a non-default > > > option or turn if off completely in case the warning never triggered in > > > practice. > > > > > > The only resonable thing for 2.6.21 is to put in David's patch, possible > > > with an even more drastic warning when the rom is invalid and there's > > > no prom-fallback available. > > > > > > Note that I expect Sun put in the invalid ROM intentionally, as we have > > > similar cases with other cards that have totally messed up ROMs in > > > Sun-branded versions. Personally I think that's an utterly bad decision > > > from Sun's side, but we'll have to live with this. > > > > Fine. I'll rework an alternate patch for the 2.6.22 timeframe... > > We need to fix things now for 2.6.21 and the 2.6.x -stable branches > because users have unusable systems currently. Yes, and I'm fine with the original patch you provided which reverts the change and adds the firmware-upcalls to retrieve the wwpn/wwnn. > If it's just a time issue I can work on and push the patch, especially > since I have the means to test things here. I'll start with the final 2.6.21 -- add modify to add the *flashing* light warning and some additional bits based on other archs I can test with embedded ISPs. Thanks again for the SPARC tips. Regards, Andrew Vasquez ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-18 17:13 ` Christoph Hellwig 2007-04-18 17:28 ` Andrew Vasquez @ 2007-04-18 20:10 ` David Miller 2007-04-22 14:31 ` Christoph Hellwig 1 sibling, 1 reply; 25+ messages in thread From: David Miller @ 2007-04-18 20:10 UTC (permalink / raw) To: hch; +Cc: andrew.vasquez, linux-scsi, linux-kernel, James.Bottomley, ema From: Christoph Hellwig <hch@infradead.org> Date: Wed, 18 Apr 2007 18:13:46 +0100 > Note that I expect Sun put in the invalid ROM intentionally, as we have > similar cases with other cards that have totally messed up ROMs in > Sun-branded versions. Personally I think that's an utterly bad decision > from Sun's side, but we'll have to live with this. The way configuration paramters are provided on Sun systems is via openfirmware properties, this is how they do everything and that is why they never setup the NVRAMs on various chips. This happens on PowerPC too. One of many examples is determining the clock parameters of the ATI Radeon graphics card, it's provided as an openfirmware property not in some config area in the BIOS. We can either be anti-social and _demand_ that these systems get their NVRAMs fixed, which in reality will never happen and is totally unreasonable, or we can say ok this is how we get the information we need on these platforms we'll look for it there. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Major qla2xxx regression on sparc64 2007-04-18 20:10 ` David Miller @ 2007-04-22 14:31 ` Christoph Hellwig 0 siblings, 0 replies; 25+ messages in thread From: Christoph Hellwig @ 2007-04-22 14:31 UTC (permalink / raw) To: David Miller Cc: hch, andrew.vasquez, linux-scsi, linux-kernel, James.Bottomley, ema On Wed, Apr 18, 2007 at 01:10:54PM -0700, David Miller wrote: > From: Christoph Hellwig <hch@infradead.org> > Date: Wed, 18 Apr 2007 18:13:46 +0100 > > > Note that I expect Sun put in the invalid ROM intentionally, as we have > > similar cases with other cards that have totally messed up ROMs in > > Sun-branded versions. Personally I think that's an utterly bad decision > > from Sun's side, but we'll have to live with this. > > The way configuration paramters are provided on Sun systems is via > openfirmware properties, this is how they do everything and that is > why they never setup the NVRAMs on various chips. > > This happens on PowerPC too. One of many examples is determining the > clock parameters of the ATI Radeon graphics card, it's provided as an > openfirmware property not in some config area in the BIOS. > > We can either be anti-social and _demand_ that these systems get their > NVRAMs fixed, which in reality will never happen and is totally > unreasonable, or we can say ok this is how we get the information we > need on these platforms we'll look for it there. Personally I think it is reasonable to expect manufacturer-provided config space to always be in the same place, regardless what kind of executable rom is there. Then again most vendors seem to think it's unresonable, and even if it wasn't we'll have to make sure our drivers work with whatever it out there - so this discussion is rather moot. ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2007-04-22 14:31 UTC | newest] Thread overview: 25+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-04-16 8:02 Major qla2xxx regression on sparc64 David Miller 2007-04-16 16:37 ` Andrew Vasquez 2007-04-16 19:37 ` David Miller 2007-04-16 19:59 ` David Miller 2007-04-16 20:08 ` Andrew Vasquez 2007-04-16 21:10 ` Andrew Vasquez 2007-04-16 22:08 ` David Miller 2007-04-16 22:25 ` Andrew Vasquez 2007-04-16 22:29 ` David Miller 2007-04-16 23:28 ` Andrew Vasquez 2007-04-16 23:41 ` David Miller 2007-04-16 23:47 ` Andrew Vasquez 2007-04-17 0:05 ` David Miller 2007-04-17 2:41 ` Andrew Vasquez 2007-04-17 5:02 ` David Miller 2007-04-17 18:28 ` Seokmann Ju 2007-04-17 18:56 ` David Miller 2007-04-18 17:16 ` Christoph Hellwig 2007-04-18 20:12 ` David Miller 2007-04-18 17:13 ` Christoph Hellwig 2007-04-18 17:28 ` Andrew Vasquez 2007-04-18 20:13 ` David Miller 2007-04-19 17:15 ` Andrew Vasquez 2007-04-18 20:10 ` David Miller 2007-04-22 14:31 ` Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox