* [PATCH v2 00/16] qla2xxx target mode improvements
@ 2025-09-29 14:28 Tony Battersby
From: Tony Battersby @ 2025-09-29 14:28 UTC (permalink / raw)
  To: Nilesh Javali, GR-QLogic-Storage-Upstream, James E.J. Bottomley,
	Martin K. Petersen
  Cc: linux-scsi, target-devel, scst-devel,
	linux-kernel@vger.kernel.org, Dmitry Bogdanov, Xose Vazquez Perez

v1 -> v2
- Add new patch "scsi: qla2xxx: clear cmds after chip reset" suggested
by Dmitry Bogdanov.
- Rename "scsi: qla2xxx: fix oops during cmd abort" to "scsi: qla2xxx:
fix races with aborting commands" and make SCST reset the ISP on a HW
timeout instead of unmapping DMA that might still be in use.
- Fix "scsi: qla2xxx: fix TMR failure handling" to free mcmds properly
for LIO.
- In "scsi: qla2xxx: add back SRR support", detect more buggy HBA fw
versions based on the fw release notes.
- Shorten code comment in "scsi: qla2xxx: improve safety of cmd lookup
by handle" and improve patch description.
- Rebase other patches as needed.

v1:
https://lore.kernel.org/r/f8977250-638c-4d7d-ac0c-65f742b8d535@cybernetics.com/

This patch series improves the qla2xxx FC driver in target mode.  I
developed these patches using the out-of-tree SCST target-mode subsystem
(https://scst.sourceforge.net/), although most of the improvements will
also apply to the other target-mode subsystems such as the in-tree LIO. 
Unfortunately qla2xxx+LIO does not pass all of my tests, but my patches
do not make it any worse (results below).  These patches have been
well-tested at my employer with qla2xxx+SCST in both initiator mode and
target mode and with a variety of FC HBAs and initiators.  Since SCST is
out-of-tree, some of the patches have parts that apply in-tree and other
parts that apply out-of-tree to SCST.  I am going to include the
out-of-tree SCST patches to provide additional context; feel free to
ignore them if you are not interested.

All patches apply to linux 6.17 and SCST 3.10 master branch.

Summary of patches:
- bugfixes
- cleanups
- improve handling of aborts and task management requests
- improve log messages
- add back SLER / SRR support (removed in 2017)

Some of these patches improve handling of aborts and task management
requests.  This is some of the testing that I did:

Test 1: Use /dev/sg to queue random disk I/O with short timeouts; make
sure cmds are aborted successfully.
Test 2: Queue lots of disk I/O, then use "sg_reset -N -d /dev/sg" on
initiator to reset logical unit.
Test 3: Queue lots of disk I/O, then use "sg_reset -N -t /dev/sg" on
initiator to reset target.
Test 4: Queue lots of disk I/O, then use "sg_reset -N -b /dev/sg" on
initiator to reset bus.
Test 5: Queue lots of disk I/O, then use "sg_reset -N -H /dev/sg" on
initiator to reset host.
Test 6: Use a Fibre Channel attenuator to trigger SRR during
write/read/compare test; check data integrity.
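
Test 1 relies on the per-command timeout of the SCSI generic (sg)
driver.  As a rough illustration only (this is not the test program used
for this series, which is not shown here), the C sketch below fills in
an sg_io_hdr_t for a READ(10) with a short timeout.  The helper names,
the LBA and the timeout value are invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <scsi/sg.h>	/* sg_io_hdr_t, SG_DXFER_FROM_DEV */

/* Hypothetical helper: build a READ(10) CDB (opcode 0x28). */
static void build_read10_cdb(uint8_t cdb[10], uint32_t lba, uint16_t blocks)
{
	assert(cdb != NULL);
	memset(cdb, 0, 10);
	cdb[0] = 0x28;			/* READ(10) */
	cdb[2] = (uint8_t)(lba >> 24);	/* LBA, big-endian */
	cdb[3] = (uint8_t)(lba >> 16);
	cdb[4] = (uint8_t)(lba >> 8);
	cdb[5] = (uint8_t)lba;
	cdb[7] = (uint8_t)(blocks >> 8);	/* transfer length, big-endian */
	cdb[8] = (uint8_t)blocks;
}

/* Hypothetical helper: prepare an sg v3 header with a short timeout. */
static void prepare_short_timeout_read(sg_io_hdr_t *hdr, uint8_t *cdb,
				       void *buf, unsigned int buf_len,
				       uint8_t *sense, unsigned char sense_len,
				       unsigned int timeout_ms)
{
	assert(hdr != NULL);
	memset(hdr, 0, sizeof(*hdr));
	hdr->interface_id = 'S';		/* required for the v3 interface */
	hdr->dxfer_direction = SG_DXFER_FROM_DEV;
	hdr->cmd_len = 10;
	hdr->cmdp = cdb;
	hdr->dxfer_len = buf_len;
	hdr->dxferp = buf;
	hdr->mx_sb_len = sense_len;
	hdr->sbp = sense;
	hdr->timeout = timeout_ms;		/* milliseconds */
}
```

A caller would then pass the header to ioctl(fd, SG_IO, &hdr) on an open
/dev/sg node.  That call is synchronous, so a tool that keeps many
commands in flight would need a different submission method.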

With my patches, SCST passes all of these tests.

Results with in-tree LIO target-mode subsystem:

Test 1: Seems to abort the same cmd multiple times (both
qlt_24xx_retry_term_exchange() and __qlt_send_term_exchange()).  But
cmds get aborted, so give it a pass?

Test 2: Seems to work; cmds are aborted.

Test 3: Target reset doesn't seem to abort cmds; instead, a few seconds
later this is logged:
qla2xxx [0000:04:00.0]-f058:9: qla_target(0): tag 1314312, op 2a: CTIO
with TIMEOUT status 0xb received (state 1, port 51:40:2e:c0:18:1d:9f:cc,
LUN 0)

Tests 4 and 5: The initiator is unable to log back in to the target; the
following messages are repeated over and over on the target:
qla2xxx [0000:04:00.0]-e01c:9: Sending TERM ELS CTIO (ha=00000000f8811390)
qla2xxx [0000:04:00.0]-f097:9: Linking sess 000000008df5aba8 [0] wwn
51:40:2e:c0:18:1d:9f:cc with PLOGI ACK to wwn 51:40:2e:c0:18:1d:9f:cc
s_id 00:00:01, ref=2 pla 00000000835a9271 link 0

Test 6: passes with my patches; SRR not supported previously.

So qla2xxx+LIO seems a bit flaky when handling exceptions, but my
patches do not make it any worse.  Perhaps someone who is more familiar
with LIO can look at the difference between LIO and SCST and figure out
how to improve it.

Tony Battersby
https://www.cybernetics.com/

Tony Battersby (16):
  Revert "scsi: qla2xxx: Perform lockless command completion in abort
    path"
  scsi: qla2xxx: fix initiator mode with qlini_mode=exclusive
  scsi: qla2xxx: fix lost interrupts with qlini_mode=disabled
  scsi: qla2xxx: use reinit_completion on mbx_intr_comp
  scsi: qla2xxx: remove code for unsupported hardware
  scsi: qla2xxx: improve debug output for term exchange
  scsi: qla2xxx: fix term exchange when cmd_sent_to_fw == 1
  scsi: qla2xxx: clear cmds after chip reset
  scsi: qla2xxx: fix races with aborting commands
  scsi: qla2xxx: improve checks in qlt_xmit_response / qlt_rdy_to_xfer
  scsi: qla2xxx: fix TMR failure handling
  scsi: qla2xxx: fix invalid memory access with big CDBs
  scsi: qla2xxx: add cmd->rsp_sent
  scsi: qla2xxx: improve cmd logging
  scsi: qla2xxx: add back SRR support
  scsi: qla2xxx: improve safety of cmd lookup by handle

 drivers/scsi/qla2xxx/qla_dbg.c     |    3 +-
 drivers/scsi/qla2xxx/qla_def.h     |    1 -
 drivers/scsi/qla2xxx/qla_gbl.h     |    2 +-
 drivers/scsi/qla2xxx/qla_init.c    |    1 +
 drivers/scsi/qla2xxx/qla_isr.c     |   32 +-
 drivers/scsi/qla2xxx/qla_mbx.c     |    2 +
 drivers/scsi/qla2xxx/qla_mid.c     |    4 +-
 drivers/scsi/qla2xxx/qla_os.c      |   35 +-
 drivers/scsi/qla2xxx/qla_target.c  | 1775 +++++++++++++++++++++++-----
 drivers/scsi/qla2xxx/qla_target.h  |  112 +-
 drivers/scsi/qla2xxx/tcm_qla2xxx.c |   17 +
 11 files changed, 1646 insertions(+), 338 deletions(-)


base-commit: e5f0a698b34ed76002dc5cff3804a61c80233a7a
-- 
2.43.0


* Re: [PATCH v2 11/16] scsi: qla2xxx: fix TMR failure handling
  2025-09-29 14:43 ` [PATCH v2 11/16] scsi: qla2xxx: fix TMR failure handling Tony Battersby
@ 2025-10-03  8:40 ` Dan Carpenter
  -1 siblings, 0 replies; 27+ messages in thread
From: kernel test robot @ 2025-10-03  5:04 UTC (permalink / raw)
  To: oe-kbuild; +Cc: lkp, Dan Carpenter

BCC: lkp@intel.com
CC: oe-kbuild-all@lists.linux.dev
In-Reply-To: <f52cda16-4952-4b28-bbf7-d44f4e054490@cybernetics.com>
References: <f52cda16-4952-4b28-bbf7-d44f4e054490@cybernetics.com>
TO: Tony Battersby <tonyb@cybernetics.com>
TO: Nilesh Javali <njavali@marvell.com>
TO: GR-QLogic-Storage-Upstream@marvell.com
TO: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
TO: "Martin K. Petersen" <martin.petersen@oracle.com>
CC: "linux-scsi" <linux-scsi@vger.kernel.org>
CC: target-devel@vger.kernel.org
CC: scst-devel@lists.sourceforge.net
CC: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
CC: Dmitry Bogdanov <d.bogdanov@yadro.com>
CC: Xose Vazquez Perez <xose.vazquez@gmail.com>

Hi Tony,

kernel test robot noticed the following build warnings:

[auto build test WARNING on e5f0a698b34ed76002dc5cff3804a61c80233a7a]

url:    https://github.com/intel-lab-lkp/linux/commits/Tony-Battersby/Revert-scsi-qla2xxx-Perform-lockless-command-completion-in-abort-path/20250930-024814
base:   e5f0a698b34ed76002dc5cff3804a61c80233a7a
patch link:    https://lore.kernel.org/r/f52cda16-4952-4b28-bbf7-d44f4e054490%40cybernetics.com
patch subject: [PATCH v2 11/16] scsi: qla2xxx: fix TMR failure handling
:::::: branch date: 3 days ago
:::::: commit date: 3 days ago
config: i386-randconfig-141-20251002 (https://download.01.org/0day-ci/archive/20251003/202510031227.18psESZQ-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <error27@gmail.com>
| Closes: https://lore.kernel.org/r/202510031227.18psESZQ-lkp@intel.com/

New smatch warnings:
drivers/scsi/qla2xxx/qla_target.c:5735 qlt_handle_abts_completion() error: we previously assumed 'mcmd' could be null (see line 5723)

Old smatch warnings:
drivers/scsi/qla2xxx/qla_target.c:671 qla24xx_delete_sess_fn() warn: can 'fcport' even be NULL?

vim +/mcmd +5735 drivers/scsi/qla2xxx/qla_target.c

0691094ff3f2cfa Quinn Tran      2018-09-04  5704  
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5705  
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5706  static void qlt_handle_abts_completion(struct scsi_qla_host *vha,
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5707  	struct rsp_que *rsp, response_t *pkt)
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5708  {
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5709  	struct abts_resp_from_24xx_fw *entry =
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5710  		(struct abts_resp_from_24xx_fw *)pkt;
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5711  	u32 h = pkt->handle & ~QLA_TGT_HANDLE_MASK;
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5712  	struct qla_tgt_mgmt_cmd *mcmd;
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5713  	struct qla_hw_data *ha = vha->hw;
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5714  
81bcf1c5cf0ee87 Bart Van Assche 2019-04-11  5715  	mcmd = qlt_ctio_to_cmd(vha, rsp, pkt->handle, pkt);
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5716  	if (mcmd == NULL && h != QLA_TGT_SKIP_HANDLE) {
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5717  		ql_dbg(ql_dbg_async, vha, 0xe064,
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5718  		    "qla_target(%d): ABTS Comp without mcmd\n",
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5719  		    vha->vp_idx);
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5720  		return;
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5721  	}
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5722  
6b0431d6fa20bd1 Quinn Tran      2018-09-04 @5723  	if (mcmd)
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5724  		vha  = mcmd->vha;
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5725  	vha->vha_tgt.qla_tgt->abts_resp_expected--;
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5726  
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5727  	ql_dbg(ql_dbg_tgt, vha, 0xe038,
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5728  	    "ABTS_RESP_24XX: compl_status %x\n",
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5729  	    entry->compl_status);
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5730  
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5731  	if (le16_to_cpu(entry->compl_status) != ABTS_RESP_COMPL_SUCCESS) {
7ffa5b939751b66 Bart Van Assche 2020-05-18  5732  		if (le32_to_cpu(entry->error_subcode1) == 0x1E &&
7ffa5b939751b66 Bart Van Assche 2020-05-18  5733  		    le32_to_cpu(entry->error_subcode2) == 0) {
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5734  			if (qlt_chk_unresolv_exchg(vha, rsp->qpair, entry)) {
74dabbbd8bb833e Tony Battersby  2025-09-29 @5735  				qlt_free_ul_mcmd(ha, mcmd);
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5736  				return;
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5737  			}
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5738  			qlt_24xx_retry_term_exchange(vha, rsp->qpair,
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5739  			    pkt, mcmd);
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5740  		} else {
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5741  			ql_dbg(ql_dbg_tgt, vha, 0xe063,
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5742  			    "qla_target(%d): ABTS_RESP_24XX failed %x (subcode %x:%x)",
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5743  			    vha->vp_idx, entry->compl_status,
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5744  			    entry->error_subcode1,
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5745  			    entry->error_subcode2);
74dabbbd8bb833e Tony Battersby  2025-09-29  5746  			qlt_free_ul_mcmd(ha, mcmd);
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5747  		}
e752a04e1bd14cc Bart Van Assche 2019-08-08  5748  	} else if (mcmd) {
74dabbbd8bb833e Tony Battersby  2025-09-29  5749  		qlt_free_ul_mcmd(ha, mcmd);
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5750  	}
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5751  }
6b0431d6fa20bd1 Quinn Tran      2018-09-04  5752  
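
The warning arises because mcmd is tested for NULL at line 5723 but is
passed unguarded to qlt_free_ul_mcmd() at line 5735.  The following
stand-alone C sketch shows the usual guard pattern for this kind of
report.  It is only an illustration: the mock_* types and functions are
hypothetical stand-ins, the callee is assumed to dereference mcmd, and
this is not necessarily how the driver was or should be changed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures (not the real ones). */
struct mock_mcmd { int freed; };
struct mock_ha { int free_calls; };

/*
 * Stand-in for qlt_free_ul_mcmd(); assumed here to dereference mcmd
 * unconditionally, which is what makes a NULL argument unsafe.
 */
static void mock_free_ul_mcmd(struct mock_ha *ha, struct mock_mcmd *mcmd)
{
	ha->free_calls++;
	mcmd->freed = 1;
}

/*
 * Guarded variant: skip the free when no mcmd was found, e.g. the
 * QLA_TGT_SKIP_HANDLE case that passes the first check in the listing.
 * Returns 1 if the mcmd was freed, 0 if there was nothing to free.
 */
static int free_mcmd_if_present(struct mock_ha *ha, struct mock_mcmd *mcmd)
{
	assert(ha != NULL);
	if (mcmd == NULL)
		return 0;
	mock_free_ul_mcmd(ha, mcmd);
	return 1;
}
```

With a guard like this at each call site, the path where the handle is
QLA_TGT_SKIP_HANDLE and mcmd is NULL no longer reaches the dereference.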

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

