From: Jason Gunthorpe <jgg@nvidia.com>
To: Wenpeng Liang <liangwenpeng@huawei.com>
Cc: leon@kernel.org, linux-rdma@vger.kernel.org, linuxarm@huawei.com
Subject: Re: [PATCH for-next] RDMA/hns: Add the detection for CMDQ status in the device initialization process
Date: Wed, 4 May 2022 22:00:23 -0300
Message-ID: <20220505010023.GB220614@nvidia.com>
In-Reply-To: <20220429093104.26687-1-liangwenpeng@huawei.com>
On Fri, Apr 29, 2022 at 05:31:04PM +0800, Wenpeng Liang wrote:
> From: Yangyang Li <liyangyang20@huawei.com>
>
> CMDQ may fail during HNS ROCEE initialization. The following is the log
> when the execution fails:
>
> [ 481.424373] hns3 0000:bd:00.2: In reset process RoCE client reinit.
> [ 482.120830] hns3 0000:bd:00.2: CMDQ move tail from 840 to 839
> [ 482.129220] hns3 0000:bd:00.2 hns_2: failed to set gid, ret = -11!
> [ 482.184702] hns3 0000:bd:00.2: CMDQ move tail from 840 to 839
> <...>
> [ 485.540909] hns3 0000:bd:00.2: CMDQ move tail from 840 to 839
> [ 485.579958] hns3 0000:bd:00.2: CMDQ move tail from 840 to 0
> [ 495.694616] hns3 0000:bd:00.2: [cmd]token 14e mailbox 20 timeout.
> [ 495.700689] hns3 0000:bd:00.2 hns_2: set HEM step 0 failed!
> [ 495.706242] hns3 0000:bd:00.2 hns_2: set HEM address to HW failed!
> [ 495.712412] hns3 0000:bd:00.2 hns_2: failed to alloc mtpt, ret = -16.
> [ 495.718836] infiniband hns_2: Couldn't create ib_mad PD
> [ 495.724046] infiniband hns_2: Couldn't open port 1
> [ 495.729375] hns3 0000:bd:00.2: Reset done, RoCE client reinit finished.
>
> However, even if ib_mad client registration failed, ib_register_device()
> still returns success to the driver.
>
> In the device initialization process, CMDQ execution fails because HW/FW
> is abnormal. Therefore, if CMDQ fails, the initialization function should
> set CMDQ to a fatal error state and return a failure to the caller.
>
> Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
> Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
> Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
> ---
> drivers/infiniband/hw/hns/hns_roce_device.h | 6 ++++++
> drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 21 +++++++++++++++++++++
> 2 files changed, 27 insertions(+)
Applied to for-next, thanks.

Jason
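
Illustration (not part of the original message or of the applied patch): the
quoted commit message says that when CMDQ fails during initialization, the
init path should mark CMDQ as fatally broken and return the failure to the
caller. A minimal userspace sketch of that pattern follows. Every name in it
(demo_dev, demo_cmdq_send, demo_init, DEMO_CMDQ_FATAL_ERR) is hypothetical,
and the hw_healthy flag only stands in for real HW/FW behaviour; the actual
driver changes are in hns_roce_device.h and hns_roce_hw_v2.c and may look
quite different.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the hns driver's code. */
enum demo_cmdq_state {
	DEMO_CMDQ_NORMAL,
	DEMO_CMDQ_FATAL_ERR,
};

struct demo_dev {
	enum demo_cmdq_state cmdq_state;
	bool hw_healthy;	/* assumption: stands in for HW/FW status */
	int cmds_sent;		/* commands actually handed to "hardware" */
};

/* Post one command; refuse immediately once the queue is marked fatal. */
static int demo_cmdq_send(struct demo_dev *dev)
{
	assert(dev != NULL);

	if (dev->cmdq_state == DEMO_CMDQ_FATAL_ERR)
		return -EIO;

	dev->cmds_sent++;
	return dev->hw_healthy ? 0 : -EAGAIN;
}

/*
 * Run nr_steps init commands. On the first failure, latch the fatal
 * state and propagate the error instead of continuing or reporting
 * success.
 */
static int demo_init(struct demo_dev *dev, int nr_steps)
{
	for (int i = 0; i < nr_steps; i++) {
		int ret = demo_cmdq_send(dev);

		if (ret) {
			dev->cmdq_state = DEMO_CMDQ_FATAL_ERR;
			return ret;
		}
	}
	return 0;
}
```

With hw_healthy set to false, demo_init() stops after the first command,
leaves the device in DEMO_CMDQ_FATAL_ERR, and returns a negative errno, so a
caller can abort registration rather than carry on with a half-initialized
device. Later calls to demo_cmdq_send() fail fast without touching the queue.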
Thread overview: 2+ messages
2022-04-29 9:31 [PATCH for-next] RDMA/hns: Add the detection for CMDQ status in the device initialization process Wenpeng Liang
2022-05-05 1:00 ` Jason Gunthorpe [this message]