* [PATCH] mpt2sas: DIF Type 2 Protection Support
From: Eric Moore @ 2010-04-22 16:47 UTC
To: linux-scsi

Adding DIF Type 2 protection support, as well as turning on 32 byte cdb's,
and setting the cdb length for > 16 byte in the SCSI_IO->control parameter.

Signed-off-by: Martin Petersen <martin.petersen@oracle.com>
Signed-off-by: Eric Moore <eric.moore@lsi.com>

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index b4afe43..0f41fcd 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -69,11 +69,11 @@
 #define MPT2SAS_DRIVER_NAME		"mpt2sas"
 #define MPT2SAS_AUTHOR	"LSI Corporation <DL-MPTFusionLinux@lsi.com>"
 #define MPT2SAS_DESCRIPTION	"LSI MPT Fusion SAS 2.0 Device Driver"
-#define MPT2SAS_DRIVER_VERSION		"05.100.00.02"
+#define MPT2SAS_DRIVER_VERSION		"05.100.00.03"
 #define MPT2SAS_MAJOR_VERSION		05
 #define MPT2SAS_MINOR_VERSION		100
 #define MPT2SAS_BUILD_VERSION		00
-#define MPT2SAS_RELEASE_VERSION		02
+#define MPT2SAS_RELEASE_VERSION		03
 
 /*
  * Set MPT2SAS_SG_DEPTH value based on user input.
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index c5ff26a..456ea7c 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -2858,9 +2858,7 @@ _scsih_setup_eedp(struct scsi_cmnd *scmd, Mpi2SCSIIORequest_t *mpi_request)
 	unsigned char prot_op = scsi_get_prot_op(scmd);
 	unsigned char prot_type = scsi_get_prot_type(scmd);
 
-	if (prot_type == SCSI_PROT_DIF_TYPE0 ||
-	   prot_type == SCSI_PROT_DIF_TYPE2 ||
-	   prot_op == SCSI_PROT_NORMAL)
+	if (prot_type == SCSI_PROT_DIF_TYPE0 || prot_op == SCSI_PROT_NORMAL)
 		return;
 
 	if (prot_op ==  SCSI_PROT_READ_STRIP)
@@ -2882,7 +2880,13 @@ _scsih_setup_eedp(struct scsi_cmnd *scmd, Mpi2SCSIIORequest_t *mpi_request)
 		    MPI2_SCSIIO_EEDPFLAGS_CHECK_GUARD;
 		mpi_request->CDB.EEDP32.PrimaryReferenceTag =
 		    cpu_to_be32(scsi_get_lba(scmd));
+		break;
+
+	case SCSI_PROT_DIF_TYPE2:
 
+		eedp_flags |= MPI2_SCSIIO_EEDPFLAGS_INC_PRI_REFTAG |
+		    MPI2_SCSIIO_EEDPFLAGS_CHECK_REFTAG |
+		    MPI2_SCSIIO_EEDPFLAGS_CHECK_GUARD;
 		break;
 
 	case SCSI_PROT_DIF_TYPE3:
@@ -3013,7 +3017,7 @@ _scsih_qcmd(struct scsi_cmnd *scmd, void (*done)(struct scsi_cmnd *))
 		mpi_control |= MPI2_SCSIIO_CONTROL_SIMPLEQ;
 	/* Make sure Device is not raid volume */
 	if (!_scsih_is_raid(&scmd->device->sdev_gendev) &&
-	    sas_is_tlr_enabled(scmd->device))
+	    sas_is_tlr_enabled(scmd->device) && scmd->cmd_len != 32)
 		mpi_control |= MPI2_SCSIIO_CONTROL_TLR_ON;
 
 	smid = mpt2sas_base_get_smid_scsiio(ioc, ioc->scsi_io_cb_idx, scmd);
@@ -3025,6 +3029,8 @@ _scsih_qcmd(struct scsi_cmnd *scmd, void (*done)(struct scsi_cmnd *))
 	mpi_request = mpt2sas_base_get_msg_frame(ioc, smid);
 	memset(mpi_request, 0, sizeof(Mpi2SCSIIORequest_t));
 	_scsih_setup_eedp(scmd, mpi_request);
+	if (scmd->cmd_len == 32)
+		mpi_control |= 4 << MPI2_SCSIIO_CONTROL_ADDCDBLEN_SHIFT;
 	mpi_request->Function = MPI2_FUNCTION_SCSI_IO_REQUEST;
 	if (sas_device_priv_data->sas_target->flags &
 	    MPT_TARGET_FLAGS_RAID_COMPONENT)
@@ -6567,7 +6573,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	INIT_LIST_HEAD(&ioc->delayed_tr_list);
 
 	/* init shost parameters */
-	shost->max_cmd_len = 16;
+	shost->max_cmd_len = 32;
 	shost->max_lun = max_lun;
 	shost->transportt = mpt2sas_transport_template;
 	shost->unique_id = ioc->id;
@@ -6580,7 +6586,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	}
 
 	scsi_host_set_prot(shost, SHOST_DIF_TYPE1_PROTECTION
-	    | SHOST_DIF_TYPE3_PROTECTION);
+	    | SHOST_DIF_TYPE2_PROTECTION | SHOST_DIF_TYPE3_PROTECTION);
 	scsi_host_set_guard(shost, SHOST_DIX_GUARD_CRC);
 
 	/* event thread */
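The constant 4 that the patch shifts into mpi_control for a 32-byte CDB looks like the number of extra 4-byte words beyond the 16 bytes that fit in the base request, that is (32 - 16) / 4. The shell sketch below only illustrates that arithmetic. The helper names are hypothetical, and the shift of 26 is an assumption about the value of MPI2_SCSIIO_CONTROL_ADDCDBLEN_SHIFT that should be checked against the driver's mpi2_init.h.

```shell
# Number of additional 4-byte words needed for a CDB longer than 16 bytes.
add_cdb_dwords() {
    cdb_len=$1
    if [ "$cdb_len" -le 16 ]; then
        echo 0
    else
        echo $(( (cdb_len - 16) / 4 ))
    fi
}

# Bits that would be OR-ed into the control word.
# ASSUMPTION: the shift is 26; verify against mpi2_init.h.
control_bits() {
    printf '0x%08x\n' $(( $(add_cdb_dwords "$1") << 26 ))
}

add_cdb_dwords 32   # 4, the constant used in the patch
control_bits 32
```

For a 16-byte CDB the helper returns 0, which matches the patch leaving mpi_control untouched unless cmd_len is 32.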
* lpfc SAN/SCSI issue
From: brem belguebli @ 2010-04-22 19:24 UTC
Cc: linux-scsi

I have a server (RHEL 5.3) connected to 2 extended SAN fabrics (across 2
sites, distance 1 ms, links are ISLs with 100 km long-distance buffer
credits) via 2 lpfc HBAs (LPe1105-HP FC with the LPFC driver 8.2.0.33.3p
shipped with RHEL 5.3).

A SAN fabric reconfiguration (DWDM ring failover from worker to
protection) occurred yesterday after an intersite telco link switch
that lasted less than 0.3 ms.

Only one fabric was impacted, named FABRIC2.

Our server is connected to the fabrics through 2 edge switches, so it is
not directly connected to the core switches on which the link failure
occurred.

From then on, our server (which accesses the LUNs from our 2 sites through
the 2 fabrics) started to climb in load average (up to 250 for a
dual-processor quad-core machine!) with a high percentage of iowait (up to
50%).

We did some testing, bypassing DM-MP by issuing dd commands to the
physical /dev/sdX devices (more than 30 LUNs are presented to the
server, each seen through 4 paths, making more than 120 /dev/sd devices),
and half of our dd processes went to D state, as did some individual
scsi_id commands that we ran manually on the same physical devices.

Multipathd itself was also in D state.

The only way to restore everything was to reset the server HBA
connected to FABRIC2, after 2 hours of investigation.

No SCSI log message of any kind appeared during the outage (~2 hours),
despite the fact that the SCSI timeout set on the physical devices is
60 s and the HBA timeout is 14 s.

The /sys/block/sdX/device/state files were showing "running" even though
the devices (well, half of them) were actually inaccessible.

Which leads me to:

1) Assumption: it looks like the lpfc driver, following this SAN event,
goes into a black-hole mode, not returning any I/O error to the SCSI
upper layer.

2) Question: how come the SCSI timers don't trigger and declare the
device faulty? (The answer may be in the above assumption.)

Any idea or tip on what could cause this, such as some FC SCN message not
being handled well?

Regards
Brem
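The per-device state and command timeout mentioned in this report can be collected in one pass. The following is a minimal sketch, assuming each sd device exposes device/state and device/timeout under /sys/block. The function name list_sd_state and its optional root argument are illustrative only (the argument lets it run against a saved copy of the tree).

```shell
# Print the SCSI device state and command timeout for every sd device.
# $1: sysfs block directory (defaults to /sys/block).
list_sd_state() {
    root=${1:-/sys/block}
    for dev in "$root"/sd*; do
        # Skip entries without a readable state file (also covers an unmatched glob).
        [ -r "$dev/device/state" ] || continue
        printf '%s state=%s timeout=%ss\n' "${dev##*/}" \
            "$(cat "$dev/device/state")" \
            "$(cat "$dev/device/timeout" 2>/dev/null)"
    done
}

list_sd_state
```

As the report itself shows, a state of "running" here does not prove the path is usable; it only records what the SCSI layer currently believes.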
* Re: lpfc SAN/SCSI issue
From: James Smart @ 2010-04-23 13:28 UTC
To: brem belguebli; +Cc: linux-scsi@vger.kernel.org

Brem,

We're looking at the lpfc driver as to whether this matches anything we are
aware of.

Please send me the system console log during this time frame. No messages
whatsoever would be very odd. Sending us the output of the shost, rport, and
sdev sysfs parameters, as well as DM configuration values would also help.

It won't necessarily be i/o timers that would fire, but other timers should.

-- james s
[parent not found: <j2o29ae894c1004230922le8baf635y563e50e3edc53bc3@mail.gmail.com>]
[parent not found: <4BD226F4.6070908@emulex.com>]
[parent not found: <1272109999.2983.30.camel@localhost>]
[parent not found: <4BD5D258.8030309@emulex.com>]
* Re: lpfc SAN/SCSI issue
From: brem belguebli @ 2010-04-26 21:52 UTC
To: James Smart; +Cc: linux-scsi

Hi James,

On Mon, 2010-04-26 at 13:50 -0400, James Smart wrote:
> Brem,
>
> I'm not understanding you.
>
> brem belguebli wrote:
> > We have sg3_utils installed, and I think we ran sg_verify on one or 2
> > unresponsive /dev/sd and it didn't give the hand back.
>
> what do you mean "give the hand back" ? was the operation
> successful or not ?

When I say it didn't give the hand back, I mean the one or 2 processes
got stuck in D state, thus not returning success.

> > It was exactly
> > cd /sys/block
> > for DEV in `ls -1d dev*`; do
> >     echo ${DEV}
> >     dd if=/dev/${DEV} of=/dev/null bs=1024 count=1 &
> >     echo
> > done
> >
> > And yes it really works, never seen any kind of preemption of DM-MP over
> > direct sd access. I've cc'ed dm-devel may be some DM guru could give his
> > opinion on this.
> >
> > Next time, I'll use a sg_dd instead of dd, to bypass any cache effect
> > (by the way, does VFS cache anything when addressing /dev/X devices ?)
>
> ok - by "works" means "dd successfully read 1 block from the device" -
> right ?

Yes, the devices on which dd was successful were the ones from FABRIC1;
dd completed successfully by reading the first 1024 bytes and copying them
to /dev/null.

> > > The most interesting for the lpfc driver would be the lpfc module
> > > parameter "lpfc_log_verbose=4115"
> > > which turns on discovery log messages, els messages, link events, and
> > > FCP i/o error messages.
> >
> > As our DWDM ring switch is on the less optimal path, there will be a
> > switch back to nominal soon.
> >
> > I'll activate this log level on the HBA's and check the firmware
> > versions you gave me .
>
> ok. I believe that the shost for the adapters in question, have a
> sysfs variable for lpfc_log_verbose, that sets the log level on the
> individual adapter. This would not require you to unload/reload the
> driver to set the option.

I'll tell you tomorrow (was off today) if the parameter exists for these
HBA's.

> > Hopefully, we will be able to provide you something deeper to
> > investigate.
> >
> > Brem
>
> ok.
>
> -- james

Thanks
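The value 4115 (0x1013) quoted above is a bitmask, and the four message classes James lists correspond to four set bits. The sketch below decodes such a mask. The bit assignments (ELS 0x1, DISCOVERY 0x2, LINK_EVENT 0x10, FCP_ERROR 0x1000) are given from memory of the driver's lpfc_logmsg.h and should be treated as assumptions, and the function name is hypothetical.

```shell
# Decode an lpfc_log_verbose mask into the message classes named in the thread.
# ASSUMPTION: bit values as recalled from lpfc_logmsg.h; verify against your driver.
decode_lpfc_log_verbose() {
    mask=$(( $1 ))
    known=0
    names=""
    for entry in 0x1:ELS 0x2:DISCOVERY 0x10:LINK_EVENT 0x1000:FCP_ERROR; do
        bit=${entry%%:*}
        name=${entry#*:}
        known=$(( known | $bit ))
        if [ $(( mask & $bit )) -ne 0 ]; then
            names="$names $name"
        fi
    done
    # Report any bits outside the four classes above as a hex remainder.
    other=$(( mask & ~known ))
    if [ "$other" -ne 0 ]; then
        names="$names $(printf 'other=0x%x' "$other")"
    fi
    echo "${names# }"
}

decode_lpfc_log_verbose 4115   # prints: ELS DISCOVERY LINK_EVENT FCP_ERROR
```

On a running system the per-adapter setting James refers to is presumably the lpfc_log_verbose attribute under /sys/class/scsi_host/hostN/ (path assumed, with N being the adapter's host number).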
* Re: lpfc SAN/SCSI issue
From: brem belguebli @ 2010-04-27 17:37 UTC
To: James Smart; +Cc: linux-scsi

Hi James,

I could set lpfc_log_verbose on both HBA's to 4115, I hope it'll be high
enough to get interesting traces.
* Re: lpfc SAN/SCSI issue
From: brem belguebli @ 2010-05-03 16:39 UTC
To: James Smart; +Cc: linux-scsi

Hi James,

We haven't yet been able to ask our telco to switch the DWDM links back to
their original configuration.

However, since logging was activated on the server I'm getting a lot of
messages like this:

lpfc 0000:10:00.1: 1:(0):0730 FCP command x26 failed: x2 SNS x70000500 x20000000 Data: xa x200 x10 x0 x0

for which I couldn't find any explanation
(http://www-dl.emulex.com/support/linux/820482p/linux.pdf)

Do you have any information on this?

Also, there are other lpfc parameters that could be tweaked, if I
understand their meaning correctly:

lpfc_hba_queue_depth, currently set to 1024: does it represent the
number of [IOs/Exchanges] the HBA will queue until the remote port
acks them or until it is considered down?

lpfc_max_scsicmpl_time, set to 0: does 0 represent some infinite
value, meaning it won't time out any IO for which the driver did not
receive a completion ack?

Thanks
Brem
* Re: lpfc SAN/SCSI issue
From: James Smart @ 2010-05-05 14:01 UTC
To: brem belguebli; +Cc: linux-scsi@vger.kernel.org

brem belguebli wrote:
> Hi james,
>
> We haven't yet been able to ask our Telco to switch back the DWDM
> links to original situation.
>
> However, since logging was activated on the server I'm having a lot of
> messages :
>
> lpfc 0000:10:00.1: 1:(0):0730 FCP command x26 failed: x2 SNS x70000500
> x20000000 Data: xa x200 x10 x0 x0
>
> for which I couldn't find no explanation
> (http://www-dl.emulex.com/support/linux/820482p/linux.pdf)
>
> Do you have any information on this ?

This is saying that SCSI command opcode 0x26 (Vendor-specific opcode ??)
failed, with Status code x2 (Check Condition) followed by the SCSI sense data,
w/ Sense Key 5 (ILLEGAL REQUEST).

I don't know who would be issuing this command (opcode 0x26), most likely some
utility/daemon using sgio, but the target is rejecting the command (not valid
for the vendor). Very reasonable.

> Also, there are other lpfc parameters that could be tweaked if I
> understand well their meaning:
>
> lpfc_hba_queue_depth currently set to 1024 : Does it represent the
> number of [IOs/Exchanges] the HBA will queue untill the remote port
> acks them or untill it is considered down ?

This is the total number of i/o's outstanding on the wire, to all
targets/luns, at any point in time. This is typically the capacity of the
adapter, which is used in a FIFO basis as I/O is received from the midlayer.
The default value of the attribute takes the maximum from the adapter. On your
adapter, the value is 1024. On most newer adapters, it is 2x this or more.
The only time I've seen this value tweaked is when our adapter is connected to
a single target (array), and overruns or fully utilizes the capacity of the
target, causing the target to work harder, and actually accomplish less, than
it could at say an 80% utilization level (note: capacity level is
target-specific). (another reason per-target queue_depth handling was put in -
see next comment).

> lpfc_max_scsicmpl_time set to 0 : Does 0 represent some infinite
> value, meaning it won't timeout any IO for which the driver did not
> receive any completion ack ?

No, unrelated. This is relative to target queue depth mgmt. The midlayer
doesn't do queue depth management by target - only per sdev (lun). Our driver
does though. Target queue depth is the sum of all i/o to all luns on the same
target, with a threshold that may or may not be capped based on the array
type, and which scales/ramps down to the existing outstanding i/o count when
the target reports QUEUE_FULL/TASK_SET_FULL. This behavior is valid only on
targets that have a shared i/o queue for all luns. This value controls the
per-target ramp-up processing. If 0, we use a constant compiled-in interval
which ramps our target queue depth back up by x%. When non-zero, it specifies
a shost-specific time interval for the ramp up (it's actually a little
trickier than this as it's tailored on some arrays that really depended upon
not being overrun beyond their capacity levels).

-- james s
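The 0730 log line discussed in this reply can be decoded by hand. The shell sketch below assumes, as the reply implies, that the first value after SNS is the first four bytes of fixed-format sense data printed as one big-endian word, so byte 0 carries the response code and the low nibble of byte 2 carries the sense key. The function names are hypothetical.

```shell
# Map a numeric SCSI sense key to its standard name (subset only).
sense_key_name() {
    case $1 in
        0) echo "NO SENSE" ;;
        2) echo "NOT READY" ;;
        3) echo "MEDIUM ERROR" ;;
        4) echo "HARDWARE ERROR" ;;
        5) echo "ILLEGAL REQUEST" ;;
        6) echo "UNIT ATTENTION" ;;
        11) echo "ABORTED COMMAND" ;;
        *) echo "other" ;;
    esac
}

# Decode the first SNS word of an lpfc 0730 message, e.g. "x70000500".
# ASSUMPTION: the word holds sense bytes 0..3, most significant byte first.
decode_sns_word() {
    word=$(( 0x${1#x} ))
    resp=$(( (word >> 24) & 0x7f ))   # byte 0, bits 6:0: response code
    key=$(( (word >> 8) & 0x0f ))     # byte 2, low nibble: sense key
    printf 'response_code=0x%02x sense_key=%d (%s)\n' \
        "$resp" "$key" "$(sense_key_name "$key")"
}

decode_sns_word x70000500
```

For x70000500 this yields response code 0x70 and sense key 5, which agrees with the ILLEGAL REQUEST reading given in the reply.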
* Re: lpfc SAN/SCSI issue
From: brem belguebli @ 2010-05-06 11:06 UTC
To: James Smart; +Cc: linux-scsi@vger.kernel.org

Hi James,

2010/5/5 James Smart <james.smart@emulex.com>:
> brem belguebli wrote:
>>
>> Hi james,
>>
>> We haven't yet been able to ask our Telco to switch back the DWDM
>> links to original situation.
>>
>> However, since logging was activated on the server I'm having a lot of
>> messages :
>>
>> lpfc 0000:10:00.1: 1:(0):0730 FCP command x26 failed: x2 SNS x70000500
>> x20000000 Data: xa x200 x10 x0 x0
>>
>> for which I couldn't find no explanation
>> (http://www-dl.emulex.com/support/linux/820482p/linux.pdf)
>>
>> Do you have any information on this ?
>
> This is saying that SCSI command opcode 0x26 (Vendor-specific opcode ??)
> failed, with Status code x2 (Check Condition) followed by the SCSI sense
> data, w/ Sense Key 5 (ILLEGAL REQUEST).
>
> I don't know who would be issuing this command (opcode 0x26), most likely
> some utility/daemon using sgio, but the target is rejecting the command (not
> valid for the vendor). Very reasonable.

I finally found the explanation of the 0730 messages in your docs, and we
have tracked down the faulty program.

It is hpasm, shipped with the ProLiant Support Pack, which we invoke to
monitor the hardware RAID of the servers.

The same program runs on similar machines (same OS, HBAs, etc.) without
issuing opcode 0x26, but on 2 servers it does.

Further investigation showed that on these 2 servers we had installed extra
Emulex packages (elxocmlibhbaapi, elxocmlibhbaapi-32bit and elxocmcore),
which install various libraries (/usr/lib/libemsdm.so, /usr/lib/libdfc.so,
/usr/lib/libnl.so.1). These certainly contain symbols that, through
linux-gate.so, are resolved in these 3 libraries, making hpasm issue opcode
0x26 to all the storage controllers on the system.

>> Also, there are other lpfc parameters that could be tweaked if I
>> understand well their meaning:
>>
>> lpfc_hba_queue_depth currently set to 1024 : Does it represent the
>> number of [IOs/Exchanges] the HBA will queue untill the remote port
>> acks them or untill it is considered down ?
>
> This is the total number of i/o's outstanding on the wire, to all
> targets/luns, at any point in time. This is typically the capacity of the
> adapter, which is used in a FIFO basis as I/O is received from the midlayer.
> The default value of the attribute takes the maximum from the adapter. On
> your adapter, the value is 1024. On most newer adapters, it is 2x this or
> more. The only time I've seen this value tweaked is when our adapter is
> connected to a single target (array), and overruns or fully utilizes the
> capacity of the target, causing the target to work harder, and actually
> accomplish less, than it could at say an 80% utilization level (note:
> capacity level is target-specific). (another reason per-target queue_depth
> handling was put in - see next comment).
>
>> lpfc_max_scsicmpl_time set to 0 : Does 0 represent some infinite
>> value, meaning it won't timeout any IO for which the driver did not
>> receive any completion ack ?
>
> No, unrelated. This is relative to target queue depth mgmt. The midlayer
> doesn't do queue depth management by target - only per sdev (lun). Our
> driver does though. Target queue depth is the sum of all i/o to all luns on
> the same target, with a threshold that may or may not be capped based on
> the array type, and which scales/ramps down to the existing outstanding i/o
> count when the target reports QUEUE_FULL/TASK_SET_FULL. This behavior is
> valid only on targets that have a shared i/o queue for all luns. This value
> controls the per-target ramp-up processing. If 0, we use a constant
> compiled-in interval which ramps our target queue depth back up by x%. When
> non-zero, it specifies a shost-specific time interval for the ramp up (it's
> actually a little trickier than this as it's tailored on some arrays that
> really depended upon not being overrun beyond their capacity levels).

Thanks for the explanation.

We no longer have any x26 opcode error messages. However, since I wasn't
sure this was the root cause of the problem we had during the DWDM ring
failover, I increased the logging (0xffff) on the HBAs of the nodes (4 nodes
in total: the 2 that were reporting the x26 opcode error, say Group A, and
the 2 that never did, say Group B).

These 4 nodes form a cluster accessing the same LUNs through the same
controllers in exactly the same way, and I get errors related to INQUIRY on
Group A:

lpfc 0000:10:00.1: 1:(0):0730 FCP command x12 failed: x0 SNS x0 x0 Data: x8 x3c x0 x0 x0
lpfc 0000:10:00.1: 1:(0):0716 FCP Read Underrun, expected 96, residual 60 Data: x3c x12 x0
lpfc 0000:10:00.1: 1:0336 Rsp Ring 0 error: IOCB Data: xff000018 xe99fc48 x0 x0 x3c x0 x1d70c8e xa29b16
lpfc 0000:10:00.1: 1:0729 FCP cmd x12 failed <0/0> status: x1 result: x3c Data: x1d7 xc8e
lpfc 0000:10:00.0: 0:(0):0730 FCP command x12 failed: x0 SNS x0 x0 Data: x8 x3c x0 x0 x0
lpfc 0000:10:00.0: 0:(0):0716 FCP Read Underrun, expected 96, residual 60 Data: x3c x12 x0
lpfc 0000:10:00.1: 1:0336 Rsp Ring 0 error: IOCB Data: xff000018 xe9960c0 x0 x0 x3c x0 x3360c67 xa29b16
lpfc 0000:10:00.1: 1:0729 FCP cmd x12 failed <0/0> status: x1 result: x3c Data: x336 xc67

These appear on both HBAs and concern the 13 paths seen through target 0
(<0/0>, <0/1>...).

Group B doesn't show any error.

I'm going to have the HBAs replaced on one of the Group B nodes to make sure
it is not a hardware issue, and I'll keep you informed.

> -- james s

Regards
Brem
* Re: lpfc SAN/SCSI issue
From: James Smart @ 2010-05-06 13:39 UTC
To: brem belguebli; +Cc: linux-scsi@vger.kernel.org

brem belguebli wrote:
> However, we do not have anymore x26 opcode error messages, though I
> wasn't sure this was the root cause of the problem we had during the
> DWDM ring failover,

It most likely wasn't - although error handlers on some arrays, when
overloaded or going through failovers, sometimes react oddly.

> I increased the logging (0xffff) on the HBA's of
> the nodes (total 4 nodes, 2 that were reporting the x26 opcode error
> say Group A, and the 2 that never did, say Group B).

I did not recommend 0xFFFF as it turns on everything - whether error or not.
The value I gave should have filtered out non-errors.

> These 4 nodes form a cluster accessing the same LUNS thru the same
> controllers the very same way, and I get errors relative to INQUIRY on
> Group A:
>
> lpfc 0000:10:00.1: 1:(0):0730 FCP command x12 failed: x0 SNS x0 x0
> Data: x8 x3c x0 x0 x0
> lpfc 0000:10:00.1: 1:(0):0716 FCP Read Underrun, expected 96, residual
> 60 Data: x3c x12 x0
> lpfc 0000:10:00.1: 1:0336 Rsp Ring 0 error: IOCB Data: xff000018
> xe99fc48 x0 x0 x3c x0 x1d70c8e xa29b16

Yes - this is a normal response for SCSI commands where the command allows
variable length data from the target - INQUIRY is such a case. We report any
SCSI completion error - such as this underrun (target returned less data than
the buffer the host gave it). This is not an error.

> Group B doesn't show no error.

If you're not seeing the underrun error - there isn't i/o being performed. And
if INQUIRY isn't being seen, the midlayer isn't attempting to scan the device.
Most likely the hba isn't even seeing the target, which should be visible
from the lpfc log messages on FC discovery.

Please send me the log messages for the Group B hosts and I'll help interpret
- However! don't spam linux-scsi with this huge log (especially if 0xffff, the
older log value should have been good enough). Send it to me off-list.

-- james s
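The underrun figures in the quoted log are consistent with this explanation: the host offered a 96-byte buffer and 60 bytes (x3c) were left unused, so the target returned 36 bytes, which is the length of standard INQUIRY data. A small sketch of that arithmetic, using a hypothetical helper that parses the 0716 line format shown in the thread:

```shell
# Given an lpfc 0716 "FCP Read Underrun" log line, print how many bytes
# the target actually returned (expected minus residual).
underrun_transferred() {
    fields=$(printf '%s\n' "$1" |
        sed -n 's/.*expected \([0-9][0-9]*\), residual \([0-9][0-9]*\).*/\1 \2/p')
    # Return non-zero if the line does not look like a 0716 message.
    [ -n "$fields" ] || return 1
    expected=${fields% *}
    residual=${fields#* }
    echo $(( expected - residual ))
}

line='lpfc 0000:10:00.1: 1:(0):0716 FCP Read Underrun, expected 96, residual 60 Data: x3c x12 x0'
underrun_transferred "$line"   # prints: 36
```

A result equal to the device's normal INQUIRY length, as here, supports reading these messages as informational and not as a transport fault.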