From: Junxian Huang <huangjunxian6@hisilicon.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: <jgg@ziepe.ca>, <linux-rdma@vger.kernel.org>,
	<linuxarm@huawei.com>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH for-rc 2/9] RDMA/hns: Fix a long wait for cmdq event during reset
Date: Mon, 8 Jul 2024 14:50:50 +0800
Message-ID: <7cae577b-e469-9357-8375-d14746a7787b@hisilicon.com>
In-Reply-To: <20240708053850.GA6788@unreal>



On 2024/7/8 13:38, Leon Romanovsky wrote:
> On Mon, Jul 08, 2024 at 10:29:54AM +0800, Junxian Huang wrote:
>>
>>
>> On 2024/7/7 16:30, Leon Romanovsky wrote:
>>> On Fri, Jul 05, 2024 at 04:59:30PM +0800, Junxian Huang wrote:
>>>> From: wenglianfa <wenglianfa@huawei.com>
>>>>
>>>> During reset, cmdq events won't be reported, leading to a long and
>>>> unnecessary wait. Notify all the cmdqs to stop waiting at the beginning
>>>> of reset.
>>>>
>>>> Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
>>>> Signed-off-by: wenglianfa <wenglianfa@huawei.com>
>>>> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
>>>> ---
>>>>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 18 ++++++++++++++++++
>>>>  1 file changed, 18 insertions(+)
>>>>
>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>>>> index a5d746a5cc68..ff135df1a761 100644
>>>> --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>>>> @@ -6977,6 +6977,21 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
>>>>  
>>>>  	handle->rinfo.instance_state = HNS_ROCE_STATE_NON_INIT;
>>>>  }
>>>> +
>>>> +static void hns_roce_v2_reset_notify_cmd(struct hns_roce_dev *hr_dev)
>>>> +{
>>>> +	struct hns_roce_cmdq *hr_cmd = &hr_dev->cmd;
>>>> +	int i;
>>>> +
>>>> +	if (!hr_dev->cmd_mod)
>>>
>>> What prevents cmd_mod from being changed?
>>>
>>
>> It's set when the device is being initialized, and won't be changed after that.
> 
> This is exactly the point, you are assuming that the device is already
> initialized or not initialized at all. What prevents hns_roce_v2_reset_notify_cmd()
> from being called in the middle of initialization?
> 
> Thanks
> 

This is ensured by the hns3 NIC driver.

Both initialization and reset of the hns RoCE device are driven by hns3. It checks the
state of the RoCE device (see line 3798) and notifies the RoCE device of a reset (which is
where hns_roce_v2_reset_notify_cmd() gets called) only if the RoCE device has already been
initialized:

 3791 static int hclge_notify_roce_client(struct hclge_dev *hdev,
 3792                                     enum hnae3_reset_notify_type type)
 3793 {
 3794         struct hnae3_handle *handle = &hdev->vport[0].roce;
 3795         struct hnae3_client *client = hdev->roce_client;
 3796         int ret;
 3797
 3798         if (!test_bit(HCLGE_STATE_ROCE_REGISTERED, &hdev->state) || !client)
 3799                 return 0;
 3800
 3801         if (!client->ops->reset_notify)
 3802                 return -EOPNOTSUPP;
 3803
 3804         ret = client->ops->reset_notify(handle, type);
 3805         if (ret)
 3806                 dev_err(&hdev->pdev->dev, "notify roce client failed %d(%d)",
 3807                         type, ret);
 3808
 3809         return ret;
 3810 }

The bit is set (line 11246) only after initialization has completed (line 11242):

11224 static int hclge_init_roce_client_instance(struct hnae3_ae_dev *ae_dev,
11225                                            struct hclge_vport *vport)
11226 {
11227         struct hclge_dev *hdev = ae_dev->priv;
11228         struct hnae3_client *client;
11229         int rst_cnt;
11230         int ret;
11231
11232         if (!hnae3_dev_roce_supported(hdev) || !hdev->roce_client ||
11233             !hdev->nic_client)
11234                 return 0;
11235
11236         client = hdev->roce_client;
11237         ret = hclge_init_roce_base_info(vport);
11238         if (ret)
11239                 return ret;
11240
11241         rst_cnt = hdev->rst_stats.reset_cnt;
11242         ret = client->ops->init_instance(&vport->roce);
11243         if (ret)
11244                 return ret;
11245
11246         set_bit(HCLGE_STATE_ROCE_REGISTERED, &hdev->state);
11247         if (test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state) ||
11248             rst_cnt != hdev->rst_stats.reset_cnt) {
11249                 ret = -EBUSY;
11250                 goto init_roce_err;
11251         }
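
To make the ordering concrete, here is a minimal userspace sketch of the same
gating pattern. It is an illustration only, not the hns3 code: every name
prefixed with sketch_ is hypothetical, a C11 atomic_bool stands in for the
HCLGE_STATE_ROCE_REGISTERED bit, and it models just the flag ordering (the flag
is set after init_instance() returns 0, and reset_notify() is skipped while the
flag is clear). It does not model locking or the later HCLGE_STATE_RST_HANDLING
check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the client ops; not the hnae3 structures. */
struct sketch_client {
	int (*init_instance)(void *handle);
	int (*reset_notify)(void *handle, int type);
};

/* Hypothetical stand-in for the NIC-side device state. */
struct sketch_dev {
	atomic_bool roce_registered; /* models HCLGE_STATE_ROCE_REGISTERED */
	const struct sketch_client *client;
	void *handle;
};

/* Counts how many reset notifications actually reached the client. */
static int sketch_notify_calls;

static int sketch_init_ok(void *handle)
{
	(void)handle;
	return 0;
}

static int sketch_reset_notify(void *handle, int type)
{
	(void)handle;
	(void)type;
	sketch_notify_calls++;
	return 0;
}

static const struct sketch_client sketch_ops = {
	.init_instance = sketch_init_ok,
	.reset_notify = sketch_reset_notify,
};

/* The flag is set only after the client's init_instance() has succeeded. */
static int sketch_init_roce(struct sketch_dev *dev)
{
	int ret = dev->client->init_instance(dev->handle);

	if (ret)
		return ret; /* flag stays clear on failure */

	atomic_store(&dev->roce_registered, true);
	return 0;
}

/* A reset notification is forwarded only when the flag is already set. */
static int sketch_notify_roce(struct sketch_dev *dev, int type)
{
	if (!atomic_load(&dev->roce_registered) || !dev->client)
		return 0;

	return dev->client->reset_notify(dev->handle, type);
}

/* Walks the sequence: a notify before init is skipped, one after is delivered. */
static void sketch_demo(void)
{
	struct sketch_dev dev = { .client = &sketch_ops };

	assert(sketch_notify_roce(&dev, 0) == 0 && sketch_notify_calls == 0);
	assert(sketch_init_roce(&dev) == 0);
	assert(sketch_notify_roce(&dev, 0) == 0 && sketch_notify_calls == 1);
}
```

In this sketch a notification that arrives before sketch_init_roce() has
finished returns 0 without reaching the client, which is the behaviour described
above for line 3798.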

Junxian

>>
>> Junxian
>>
>>>> +		return;
>>>> +
>>>> +	for (i = 0; i < hr_cmd->max_cmds; i++) {
>>>> +		hr_cmd->context[i].result = -EBUSY;
>>>> +		complete(&hr_cmd->context[i].done);
>>>> +	}
>>>> +}
>>>> +
>>>>  static int hns_roce_hw_v2_reset_notify_down(struct hnae3_handle *handle)
>>>>  {
>>>>  	struct hns_roce_dev *hr_dev;
>>>> @@ -6997,6 +7012,9 @@ static int hns_roce_hw_v2_reset_notify_down(struct hnae3_handle *handle)
>>>>  	hr_dev->dis_db = true;
>>>>  	hr_dev->state = HNS_ROCE_DEVICE_STATE_RST_DOWN;
>>>>  
>>>> +	/* Complete the CMDQ event in advance during the reset. */
>>>> +	hns_roce_v2_reset_notify_cmd(hr_dev);
>>>> +
>>>>  	return 0;
>>>>  }
>>>>  
>>>> -- 
>>>> 2.33.0
>>>>
>>

