public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Dmitry Bogdanov <d.bogdanov@yadro.com>
To: Tony Battersby <tonyb@cybernetics.com>
Cc: Nilesh Javali <njavali@marvell.com>,
	<GR-QLogic-Storage-Upstream@marvell.com>,
	"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	linux-scsi <linux-scsi@vger.kernel.org>,
	<target-devel@vger.kernel.org>,
	<scst-devel@lists.sourceforge.net>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [DMARC Error]Re: [PATCH 08/15] scsi: qla2xxx: fix oops during cmd abort
Date: Thu, 25 Sep 2025 11:42:52 +0300	[thread overview]
Message-ID: <20250925084252.GA821@yadro.com> (raw)
In-Reply-To: <e8cc07cf-9bd1-41a4-bd46-44e18179154b@cybernetics.com>

On Wed, Sep 24, 2025 at 03:41:32PM -0400, Tony Battersby wrote:
> 
> On 9/11/25 10:21, Dmitry Bogdanov wrote:
> > On Mon, Sep 08, 2025 at 02:58:06PM -0400, Tony Battersby wrote:
> >> (target mode)
> >>
> >> There is a race between the following:
> >>
> >> CPU 1:
> >> scst_hw_pending_work_fn() ->
> >> sqa_on_hw_pending_cmd_timeout() ->
> >> qlt_abort_cmd() ->
> >> qlt_unmap_sg()
> >>
> >> CPU 2:
> >> qla_do_work() ->
> >> qla24xx_process_response_queue() ->
> >> qlt_do_ctio_completion() ->
> >> qlt_unmap_sg()
> >>
> >> Two CPUs calling qlt_unmap_sg() on the same cmd at the same time
> >> results in an oops:
> >>
> >> dma_unmap_sg_attrs()
> >>         BUG_ON(!valid_dma_direction(dir));
> >>
> >> This race is more likely to happen because qlt_abort_cmd() may cause the
> >> hardware to send a CTIO.
> >>
> >> The solution is to lock cmd->qpair->qp_lock_ptr when aborting a command.
> >> This makes it possible to check the cmd state and make decisions about
> >> what to do without racing with the CTIO handler and other code.
> >>
> >> - Lock cmd->qpair->qp_lock_ptr when aborting a cmd.
> >> - Eliminate cmd->cmd_lock and change cmd->aborted back to a bitfield
> >>   since it is now protected by qp_lock_ptr just like all the other
> >>   flags.
> >> - Add another command state QLA_TGT_STATE_DONE to avoid any possible
> >>   races between qlt_abort_cmd() and tgt_ops->free_cmd().
> >> - Add the cmd->sent_term_exchg flag to indicate if
> >>   qlt_send_term_exchange() has already been called.
> >> - For SCST (scst_hw_pending_work_fn()), export qlt_send_term_exchange()
> >>   and qlt_unmap_sg() so that they can be called directly instead of
> >>   trying to make qlt_abort_cmd() work for both HW timeout and TMR abort.
> >> - Add TRC_CTIO_IGNORED for scst_hw_pending_work_fn().
> >>
> >> Fixes: 26f9ce53817a ("scsi: qla2xxx: Fix missed DMA unmap for aborted commands")
> > You are trying to fix that commit using its approach, but actually that
> > approach is the root cause itself. It is not ok to unmap dma while that
> > memory is owned by HW.
> >
> > We use this patch 4 years already instead of 26f9ce53817a and never
> > faced with such crashes.
> >
> >
> > From: Dmitry Bogdanov <d.bogdanov@yadro.com>
> > Date: Wed, 20 Oct 2021 15:57:31 +0300
> > Subject: [PATCH] scsi: qla2xxx: clear cmds after chip reset
> >
> > Commands sent to FW, after chip reset got stuck and never freed as FW is
> > not going to response to them anymore.
> >
> > This patch partially reverts aefed3e5548f at __qla2x00_abort_all_cmds.
> >
> > Fixes: aefed3e5548f ("scsi: qla2xxx: target: Fix offline port handling and host reset handling")
> > Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
> 
> Hey Dmitry, I want to pick up your patch and add it to my v2 patchset,
> but I have made a few changes to it.  Do I have your permission to add
> your "Co-developed-by" and "Signed-off-by" tags to the patch below?
> (Never did this before, I think I need to ask permission.)

Yes, of course, feel free to use it. Signed-off-by is enough I think.

 
> Compared to your patch, I changed "if (cmd->se_cmd.t_state ==
> TRANSPORT_WRITE_PENDING)" to "if (cmd->state ==
> QLA_TGT_STATE_NEED_DATA)" (which is the way it was originally) to work
> better with SCST and added the revert of 26f9ce53817a.

Yes, that is more correct.
 
> I will reply to this message with the two updated v2 patches that follow
> your suggestions and remove the dangerous code that you objected to.  If
> you approve of them, then I will submit the entire v2 patchset, since
> some of the other patches needed to be rebased.
>



  parent reply	other threads:[~2025-09-25  8:52 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-08 18:45 [PATCH 00/15] qla2xxx target mode improvements Tony Battersby
2025-09-08 18:47 ` [PATCH 01/15] Revert "scsi: qla2xxx: Perform lockless command completion in abort path" Tony Battersby
2025-09-08 18:48 ` [PATCH 02/15] scsi: qla2xxx: fix initiator mode with qlini_mode=exclusive Tony Battersby
2025-09-08 18:50 ` [PATCH 03/15] scsi: qla2xxx: fix lost interrupts with qlini_mode=disabled Tony Battersby
2025-09-08 18:51 ` [PATCH 04/15] scsi: qla2xxx: use reinit_completion on mbx_intr_comp Tony Battersby
2025-09-08 18:53 ` [PATCH 05/15] scsi: qla2xxx: remove code for unsupported hardware Tony Battersby
2025-09-08 18:54 ` [PATCH 06/15] scsi: qla2xxx: improve debug output for term exchange Tony Battersby
2025-09-08 18:56 ` [PATCH 07/15] scsi: qla2xxx: fix term exchange when cmd_sent_to_fw == 1 Tony Battersby
2025-09-08 18:58 ` [PATCH 08/15] scsi: qla2xxx: fix oops during cmd abort Tony Battersby
2025-09-08 18:59   ` [SCST PATCH " Tony Battersby
2025-09-11 14:21   ` [PATCH " Dmitry Bogdanov
2025-09-24 19:41     ` Tony Battersby
2025-09-24 19:43       ` [PATCH v2 09/16] scsi: qla2xxx: fix races with aborting commands Tony Battersby
2025-09-24 19:45         ` [SCST PATCH " Tony Battersby
2025-09-25  8:42       ` Dmitry Bogdanov [this message]
2025-09-08 19:01 ` [PATCH 09/15] scsi: qla2xxx: improve checks in qlt_xmit_response / qlt_rdy_to_xfer Tony Battersby
2025-09-08 19:02 ` [PATCH 10/15] scsi: qla2xxx: fix TMR failure handling Tony Battersby
2025-09-12 14:36   ` Dmitry Bogdanov
2025-09-16 16:04     ` Tony Battersby
2025-09-17 13:06       ` [DMARC Error]Re: " Dmitry Bogdanov
2025-09-17 20:38         ` Tony Battersby
2025-09-08 19:04 ` [PATCH 11/15] scsi: qla2xxx: fix invalid memory access with big CDBs Tony Battersby
2025-09-08 19:05   ` [SCST PATCH " Tony Battersby
2025-09-08 19:07 ` [PATCH 12/15] scsi: qla2xxx: add cmd->rsp_sent Tony Battersby
2025-09-08 19:08   ` [SCST PATCH " Tony Battersby
2025-09-15 13:47   ` [PATCH " Dmitry Bogdanov
2025-09-24 20:04     ` Tony Battersby
2025-09-08 19:09 ` [PATCH 13/15] scsi: qla2xxx: improve cmd logging Tony Battersby
2025-09-08 19:10 ` [PATCH 14/15] scsi: qla2xxx: add back SRR support Tony Battersby
2025-09-08 19:11   ` [SCST PATCH " Tony Battersby
2025-09-25 12:49   ` [PATCH " Xose Vazquez Perez
2025-09-25 15:30     ` Xose Vazquez Perez
2025-09-25 16:04       ` Tony Battersby
2025-09-25 17:00         ` Tony Battersby
2025-09-25 19:30           ` Tony Battersby
2025-09-08 19:13 ` [PATCH 15/15] scsi: qla2xxx: improve safety of cmd lookup by handle Tony Battersby
2025-09-08 19:14 ` [SCST PATCH] qla2x00t-32gbit: add on_abort_cmd callback Tony Battersby

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250925084252.GA821@yadro.com \
    --to=d.bogdanov@yadro.com \
    --cc=GR-QLogic-Storage-Upstream@marvell.com \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=njavali@marvell.com \
    --cc=scst-devel@lists.sourceforge.net \
    --cc=target-devel@vger.kernel.org \
    --cc=tonyb@cybernetics.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox