From: Dmitry Bogdanov <d.bogdanov@yadro.com>
To: Mike Christie <michael.christie@oracle.com>
Cc: <mlombard@redhat.com>, <martin.petersen@oracle.com>,
<mgurtovoy@nvidia.com>, <sagi@grimberg.me>,
<linux-scsi@vger.kernel.org>, <target-devel@vger.kernel.org>
Subject: Re: [PATCH v2 11/13] scsi: target: Treat CMD_T_FABRIC_STOP like CMD_T_STOP
Date: Tue, 17 Jan 2023 22:55:40 +0300 [thread overview]
Message-ID: <20230117195540.GD31614@yadro.com> (raw)
In-Reply-To: <5574fcc5-49ea-0cb4-bb95-cb011f21fc8c@oracle.com>
On Tue, Jan 17, 2023 at 12:45:56PM -0600, Mike Christie wrote:
>
> On 1/17/23 05:52, Dmitry Bogdanov wrote:
> > On Fri, Jan 13, 2023 at 05:15:12PM +0300, Dmitry Bogdanov wrote:
> >> On Wed, Jan 11, 2023 at 09:08:30PM -0600, Mike Christie wrote:
> >>>
> >>> iscsit will set CMD_T_FABRIC_STOP on running commands when its transport
> >>> connection is down and it can't send/recv IO (tx/rx threads are killed
> >>> or the cleanup thread is run from the one thats up). It will then loop
> >>> over running commands and wait for LIO core to complete them or clean
> >>> them up if they were on an internal queue waiting to be sent or ackd.
> >>>
> >>> Currently, CMD_T_FABRIC_STOP only stops TMRs from operating on the
> >>> command but for isert we need to prevent LIO core from calling into
> >>> iscsit callouts when the connection is being brought down. If LIO core
> >>> queues commands to iscsit and it ends up adding to an internal queue
> >>> instead of passing back to the driver then we can end up hanging waiting
> >>> on command completion that never occurs because it's stuck on the internal
> >>> list (the tx thread is stopped at this time, so it will never loop over
> >>> the response list and call into isert). We also want to sync up on a
> >>> point where we no longer call into isert so it can cleanup it's structs.
> >>>
> >>> This has LIO core treat CMD_T_FABRIC_STOP like CMD_T_STOP during
> >>> command execution and also fixes the locking around the
> >>> target_cmd_interrupted calls so we don't have a case where a command
> >>> is marked CMD_T_COMPLETE and CMD_T_STOP|CMD_T_FABRIC_STOP at the same
> >>> time.
> >>>
> >>> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> >>> ---
> >>> drivers/target/target_core_sbc.c | 2 +-
> >>> drivers/target/target_core_transport.c | 27 +++++++++++++++-----------
> >>> 2 files changed, 17 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
> >>> index 7536ca797606..56136613767f 100644
> >>> --- a/drivers/target/target_core_sbc.c
> >>> +++ b/drivers/target/target_core_sbc.c
> >>> @@ -459,7 +459,7 @@ static sense_reason_t compare_and_write_callback(struct se_cmd *cmd, bool succes
> >>> * we don't have to perform the write operation.
> >>> */
> >>> WARN_ON(!(cmd->transport_state &
> >>> - (CMD_T_ABORTED | CMD_T_STOP)));
> >>> + (CMD_T_ABORTED | CMD_T_STOP | CMD_T_FABRIC_STOP)));
> >>> goto out;
> >>> }
> >>> /*
> >>> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> >>> index cb3fdc81ba3b..02a9476945dc 100644
> >>> --- a/drivers/target/target_core_transport.c
> >>> +++ b/drivers/target/target_core_transport.c
> >>> @@ -737,8 +737,8 @@ static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd)
> >>> * Determine if frontend context caller is requesting the stopping of
> >>> * this command for frontend exceptions.
> >>> */
> >>> - if (cmd->transport_state & CMD_T_STOP) {
> >>> - pr_debug("%s:%d CMD_T_STOP for ITT: 0x%08llx\n",
> >>> + if (cmd->transport_state & (CMD_T_STOP | CMD_T_FABRIC_STOP)) {
> >>> + pr_debug("%s:%d CMD_T_STOP|CMD_T_FABRIC_STOP for ITT: 0x%08llx\n",
> >>> __func__, __LINE__, cmd->tag);
> >>>
> >>> spin_unlock_irqrestore(&cmd->t_state_lock, flags);
> >>> @@ -889,7 +889,7 @@ static bool target_cmd_interrupted(struct se_cmd *cmd)
> >>> INIT_WORK(&cmd->work, target_abort_work);
> >>> queue_work(target_completion_wq, &cmd->work);
> >>> return true;
> >>> - } else if (cmd->transport_state & CMD_T_STOP) {
> >>> + } else if (cmd->transport_state & (CMD_T_STOP | CMD_T_FABRIC_STOP)) {
> >>> if (cmd->transport_complete_callback)
> >>> cmd->transport_complete_callback(cmd, false, &post_ret);
> >>> complete_all(&cmd->t_transport_stop_comp);
> >>> @@ -907,13 +907,15 @@ void target_complete_cmd_with_sense(struct se_cmd *cmd, u8 scsi_status,
> >>> int success, cpu;
> >>> unsigned long flags;
> >>>
> >>> - if (target_cmd_interrupted(cmd))
> >>> + spin_lock_irqsave(&cmd->t_state_lock, flags);
> >>
> >> That leads to a hardlock because
> >> target_cmd_interrupted() => cmd->transport_complete_callback() also tooks
> >> cmd->t_state_lock.
>
> We shouldn't lock up because for the failure cases the transport_complete_callback
> functions don't take the lock. They just cleanup and return.
The second callback (compare_and_write_post) does take the lock always.
Although to be honest, that lock there has no sense.
> > But the taking the lock for read+write of cmd->t*_state is absolutelly right!
> > It would be great if you manage to move transport_complete_callback()
> > into other thread/job.
>
> I'm not sure why we want to do that since none of the transport_complete_callback
> sleep on failure. It seems to just add more complexity and we only have the 2
> transport_complete_callback uses now, so it seems overkill.
I have a personal interest on that :) Of course, it is just one of the
ways to solve it.
BR,
Dmitry
next prev parent reply other threads:[~2023-01-17 21:31 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-12 3:08 [PATCH v2 00/13] target task management fixes Mike Christie
2023-01-12 3:08 ` [PATCH v2 01/13] scsi: target: Move sess cmd counter to new struct Mike Christie
2023-01-12 3:08 ` [PATCH v2 02/13] scsi: target: Move cmd counter allocation Mike Christie
2023-01-12 3:08 ` [PATCH v2 03/13] scsi: target: Pass in cmd counter to use during cmd setup Mike Christie
2023-01-12 3:08 ` [PATCH v2 04/13] scsi: target: iscsit/isert: Alloc per conn cmd counter Mike Christie
2023-01-12 3:08 ` [PATCH v2 05/13] scsi: target: iscsit: stop/wait on cmds during conn close Mike Christie
2023-01-12 3:08 ` [PATCH v2 06/13] scsi: target: iscsit: Fix TAS handling during conn cleanup Mike Christie
2023-01-12 3:08 ` [PATCH v2 07/13] scsi: target: Fix multiple LUN_RESET handling Mike Christie
2023-01-12 3:08 ` [PATCH v2 08/13] scsi: target: drop tas arg from __transport_wait_for_tasks Mike Christie
2023-01-12 3:08 ` [PATCH v2 09/13] scsi: target: iscsit: Fix isert disconnect handling during login Mike Christie
2023-01-13 14:08 ` Dmitry Bogdanov
2023-01-18 2:14 ` Mike Christie
2023-01-23 9:53 ` Sagi Grimberg
2023-01-24 2:40 ` michael.christie
2023-01-12 3:08 ` [PATCH v2 10/13] scsi: target: iscsit: Add helper to check when cmd has failed Mike Christie
2023-01-12 3:08 ` [PATCH v2 11/13] scsi: target: Treat CMD_T_FABRIC_STOP like CMD_T_STOP Mike Christie
2023-01-13 14:15 ` Dmitry Bogdanov
2023-01-17 11:52 ` Dmitry Bogdanov
2023-01-17 18:45 ` Mike Christie
2023-01-17 19:55 ` Dmitry Bogdanov [this message]
2023-01-18 19:12 ` Mike Christie
2023-01-12 3:08 ` [PATCH v2 12/13] scsi: target: iscsit: Cleanup isert commands at conn closure Mike Christie
2023-01-12 3:08 ` [PATCH v2 13/13] IB/isert: Fix hang in target_wait_for_cmds Mike Christie
2023-01-23 9:50 ` Sagi Grimberg
2023-01-12 3:13 ` [PATCH v2 00/13] target task management fixes Mike Christie
2023-01-23 9:54 ` Sagi Grimberg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230117195540.GD31614@yadro.com \
--to=d.bogdanov@yadro.com \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=mgurtovoy@nvidia.com \
--cc=michael.christie@oracle.com \
--cc=mlombard@redhat.com \
--cc=sagi@grimberg.me \
--cc=target-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox