linux-kernel.vger.kernel.org archive mirror
From: John Garry <john.garry@huawei.com>
To: Hannes Reinecke <hare@suse.de>, <JBottomley@odin.com>,
	<martin.petersen@oracle.com>
Cc: <linuxarm@huawei.com>, <zhangfei.gao@linaro.org>,
	<xuwei5@hisilicon.com>, <john.garry2@mail.dcu.ie>,
	<linux-scsi@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/6] hisi_sas: use slot abort in v1 hw
Date: Tue, 16 Feb 2016 16:13:29 +0000
Message-ID: <56C34AA9.8080604@huawei.com>
In-Reply-To: <56C340E9.1030503@suse.de>

On 16/02/2016 15:31, Hannes Reinecke wrote:
> On 02/16/2016 01:22 PM, John Garry wrote:
>> When TRANS_TX_CREDIT_TIMEOUT_ERR or
>> TRANS_TX_CLOSE_NORMAL_ERR errors occur for a
>> command, the command should be re-attempted.
>>
>> Signed-off-by: John Garry <john.garry@huawei.com>
>> ---
>>   drivers/scsi/hisi_sas/hisi_sas_v1_hw.c | 22 ++++++++++++++++++----
>>   1 file changed, 18 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
>> index ce5f65d..34f71a1c 100644
>> --- a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
>> +++ b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
>> @@ -1118,9 +1118,8 @@ static int prep_ssp_v1_hw(struct hisi_hba *hisi_hba,
>>   }
>>
>>   /* by default, task resp is complete */
>> -static void slot_err_v1_hw(struct hisi_hba *hisi_hba,
>> -			   struct sas_task *task,
>> -			   struct hisi_sas_slot *slot)
>> +static void slot_err_v1_hw(struct hisi_hba *hisi_hba, struct sas_task *task,
>> +			   struct hisi_sas_slot *slot, int *abort_slot)
>>   {
>>   	struct task_status_struct *ts = &task->task_status;
>>   	struct hisi_sas_err_record_v1 *err_record = slot->status_buffer;
>> @@ -1212,6 +1211,14 @@ static void slot_err_v1_hw(struct hisi_hba *hisi_hba,
>>   			ts->stat = SAS_NAK_R_ERR;
>>   			break;
>>   		}
>> +		case TRANS_TX_CREDIT_TIMEOUT_ERR:
>> +		case TRANS_TX_CLOSE_NORMAL_ERR:
>> +		{
>> +			/* This will request a retry */
>> +			ts->stat = SAS_QUEUE_FULL;
>> +			++(*abort_slot);
>> +			break;
>> +		}
>>   		default:
>>   		{
>>   			ts->stat = SAM_STAT_CHECK_CONDITION;
>> @@ -1317,8 +1324,14 @@ static int slot_complete_v1_hw(struct hisi_hba *hisi_hba,
>>
>>   	if (cmplt_hdr_data & CMPLT_HDR_ERR_RCRD_XFRD_MSK &&
>>   		!(cmplt_hdr_data & CMPLT_HDR_RSPNS_XFRD_MSK)) {
>> +		int abort_slot = 0;
>>
>> -		slot_err_v1_hw(hisi_hba, task, slot);
>> +		slot_err_v1_hw(hisi_hba, task, slot,  &abort_slot);
>> +		if (unlikely(abort_slot)) {
>> +			queue_work(hisi_hba->wq, &slot->abort_slot);
>> +			sts = ts->stat;
>> +			goto out_1;
>> +		}
>>   		goto out;
>>   	}
>>
> What is the 'abort_slot' variable for?
> Currently it's just a counter, no?
> So why the weird pointer passing?
>
> And it does feel weird. Apparently the driver does get a message,
> but still has to abort the command. Why?
> Isn't the message an indicator that the command has been aborted?
>
> Cheers,
>
> Hannes
>

I'll paste some more code for convenience and to help clarify:

static int slot_complete_v1_hw(struct hisi_hba *hisi_hba,
                    struct hisi_sas_slot *slot, int abort)
{
...

     if (cmplt_hdr_data & CMPLT_HDR_ERR_RCRD_XFRD_MSK &&
         !(cmplt_hdr_data & CMPLT_HDR_RSPNS_XFRD_MSK)) {
         int abort_slot = 0;

         slot_err_v1_hw(hisi_hba, task, slot,  &abort_slot);
         if (unlikely(abort_slot)) { /* check if we need to abort the task */
             queue_work(hisi_hba->wq, &slot->abort_slot);
             sts = ts->stat;
             goto out_1;
         }
         goto out;
     }

  ...

out:
     if (sas_dev && sas_dev->running_req)
         sas_dev->running_req--;

     hisi_sas_slot_task_free(hisi_hba, task, slot);
     sts = ts->stat;

     if (task->task_done)
         task->task_done(task);
out_1:

     return sts;
}

The abort_slot variable is really a boolean flag, which slot_err_v1_hw()
may set. It is set when error TRANS_TX_CREDIT_TIMEOUT_ERR or
TRANS_TX_CLOSE_NORMAL_ERR occurs in the slot. In this case we don't
immediately complete the task (goto out, then call
hisi_sas_slot_task_free() and task->task_done()), but instead queue the
task to be aborted in the device before completing (call queue_work()
and then goto out_1).

When hisi_sas_slot_abort() [patch #2] runs in the workqueue for the
task, it first aborts the task in the device with a TMF, and then
completes the task. Finally the status (SAS_QUEUE_FULL) is passed back
to the SCSI framework, which will request a retry of the SCSI command.

This is the method our hw people recommended to handle these types of 
errors.
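
As a rough illustration of the flow described above, here is a small
userspace model in C. It is not driver code. Every name in it
(model_task, model_slot_err(), model_slot_complete(),
model_abort_work() and the two enums) is a hypothetical stand-in, the
workqueue is reduced to a flag, and the TMF abort is reduced to setting
a field. It only shows the ordering: a retryable error defers
completion until the abort work has run, while any other error
completes the task straight away. It also uses a bool for the flag
rather than an int counter, which is an assumption about how the flag
could be expressed and not what the patch does.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the hardware error codes and task statuses. */
enum slot_err { ERR_TX_CREDIT_TIMEOUT, ERR_TX_CLOSE_NORMAL, ERR_OTHER };
enum task_stat { STAT_GOOD, STAT_QUEUE_FULL, STAT_CHECK_CONDITION };

struct model_task {
	enum task_stat stat;
	bool done;           /* completion callback has run */
	bool abort_queued;   /* models queue_work() on the abort work item */
	bool aborted_in_dev; /* models the TMF abort sent to the device */
};

/* Models slot_err_v1_hw(): set the status and say whether an abort is needed. */
static void model_slot_err(struct model_task *t, enum slot_err err,
			   bool *abort_slot)
{
	switch (err) {
	case ERR_TX_CREDIT_TIMEOUT:
	case ERR_TX_CLOSE_NORMAL:
		t->stat = STAT_QUEUE_FULL; /* asks the upper layer to retry */
		*abort_slot = true;
		break;
	default:
		t->stat = STAT_CHECK_CONDITION;
		break;
	}
}

/* Models the "out:" path: free the slot and call task_done(). */
static void model_complete(struct model_task *t)
{
	t->done = true;
}

/* Models the error branch of slot_complete_v1_hw(). */
static enum task_stat model_slot_complete(struct model_task *t,
					  enum slot_err err)
{
	bool abort_slot = false;

	model_slot_err(t, err, &abort_slot);
	if (abort_slot) {
		/* "out_1:" path: defer completion to the abort work. */
		t->abort_queued = true;
		return t->stat;
	}
	model_complete(t);
	return t->stat;
}

/* Models the abort work item: abort in the device first, then complete. */
static void model_abort_work(struct model_task *t)
{
	if (!t->abort_queued)
		return;
	t->abort_queued = false;
	t->aborted_in_dev = true;
	model_complete(t);
}
```

Calling model_slot_complete() with ERR_TX_CREDIT_TIMEOUT leaves the
task not yet done and with abort_queued set. A later call to
model_abort_work() marks it aborted and then done, with the status
still STAT_QUEUE_FULL.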

Hope this explains,
Cheers,
John

