public inbox for linux-rdma@vger.kernel.org
From: Kamal Heib <kheib@redhat.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: linux-rdma@vger.kernel.org,
	Siva Reddy Kallam <siva.kallam@broadcom.com>,
	Jason Gunthorpe <jgg@ziepe.ca>
Subject: Re: [PATCH for-rc] RDMA/bng_re: Fix silent failure in HWRM version query
Date: Wed, 4 Mar 2026 22:49:40 -0500
Message-ID: <aaj9VLGLHWESm0kw@lima-fedora>
In-Reply-To: <20260304153707.GG12611@unreal>

On Wed, Mar 04, 2026 at 05:37:07PM +0200, Leon Romanovsky wrote:
> On Mon, Mar 02, 2026 at 11:36:45PM -0500, Kamal Heib wrote:
> > If the firmware version query fails, the driver currently ignores the
> > error and continues initializing. This leaves the device in a bad state.
> 
> Can you please elaborate on what it will cause?
> 
> Thanks
>

If bng_re_query_hwrm_version() fails, it returns early and leaves
cctx->hwrm_cmd_max_timeout uninitialized. That value is subsequently
assigned to rcfw->max_timeout, which __wait_for_resp() uses. Later,
when the driver sends firmware commands and enters __wait_for_resp(),
it applies a zero timeout to those commands, which can lead to a
lockup.

Also, cctx->hwrm_intf_ver is left uninitialized. That field will likely
be used in the future to determine whether a specific feature is
supported (as is done in bnxt_re).
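
To illustrate the failure path and the fix described above, here is a
minimal userspace sketch. It assumes simplified stand-in types: every
demo_* name and the DEMO_FW_MAX_TIMEOUT value are hypothetical and are
not the real bng_re definitions. It only models how the timeout is
propagated, not what __wait_for_resp() does with it.

```c
#include <errno.h>

/* Hypothetical stand-ins; not the real bng_re structures. */
struct demo_chip_ctx {
	unsigned int hwrm_cmd_max_timeout; /* 0 means "never set" */
};

struct demo_dev {
	struct demo_chip_ctx cctx;
	unsigned int rcfw_max_timeout; /* consumed by the wait path */
	int chip_ctx_live;             /* 1 while setup state is held */
};

#define DEMO_FW_MAX_TIMEOUT 60 /* assumed default, illustration only */

/*
 * Returns 0 on success or a negative errno. fw_rc and fw_timeout
 * simulate the firmware reply to the version query.
 */
static int demo_query_hwrm_version(struct demo_dev *dev, int fw_rc,
				   unsigned int fw_timeout)
{
	if (fw_rc)
		return fw_rc; /* cctx fields stay untouched on this path */

	dev->cctx.hwrm_cmd_max_timeout = fw_timeout;
	if (!dev->cctx.hwrm_cmd_max_timeout)
		dev->cctx.hwrm_cmd_max_timeout = DEMO_FW_MAX_TIMEOUT;

	return 0;
}

static int demo_dev_init(struct demo_dev *dev, int fw_rc,
			 unsigned int fw_timeout)
{
	int rc;

	dev->cctx.hwrm_cmd_max_timeout = 0;
	dev->rcfw_max_timeout = 0;
	dev->chip_ctx_live = 1;

	/* Check the result instead of ignoring it. */
	rc = demo_query_hwrm_version(dev, fw_rc, fw_timeout);
	if (rc)
		goto query_fail;

	/* Only reached with a non-zero timeout in place. */
	dev->rcfw_max_timeout = dev->cctx.hwrm_cmd_max_timeout;
	return 0;

query_fail:
	dev->chip_ctx_live = 0; /* unwind what was set up before the query */
	return rc;
}
```

With the check in place, a failed query stops init before the zero
value can be copied into the timeout used by the wait path. Without
it, init would carry on with rcfw_max_timeout still at 0.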

Thanks,
Kamal
> > 
> > Fix this by making bng_re_query_hwrm_version() return the error code and
> > update the driver to check for this error and stop the setup process
> > safely if it happens.
> > 
> > Fixes: 745065770c2d ("RDMA/bng_re: Register and get the resources from bnge driver")
> > Signed-off-by: Kamal Heib <kheib@redhat.com>
> > ---
> >  drivers/infiniband/hw/bng_re/bng_dev.c | 11 ++++++++---
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/infiniband/hw/bng_re/bng_dev.c b/drivers/infiniband/hw/bng_re/bng_dev.c
> > index d34b5f88cd40..17147175a9b0 100644
> > --- a/drivers/infiniband/hw/bng_re/bng_dev.c
> > +++ b/drivers/infiniband/hw/bng_re/bng_dev.c
> > @@ -210,7 +210,7 @@ static int bng_re_stats_ctx_alloc(struct bng_re_dev *rdev)
> >  	return rc;
> >  }
> >  
> > -static void bng_re_query_hwrm_version(struct bng_re_dev *rdev)
> > +static int bng_re_query_hwrm_version(struct bng_re_dev *rdev)
> >  {
> >  	struct bnge_auxr_dev *aux_dev = rdev->aux_dev;
> >  	struct hwrm_ver_get_output ver_get_resp = {};
> > @@ -230,7 +230,7 @@ static void bng_re_query_hwrm_version(struct bng_re_dev *rdev)
> >  	if (rc) {
> >  		ibdev_err(&rdev->ibdev, "Failed to query HW version, rc = 0x%x",
> >  			  rc);
> > -		return;
> > +		return rc;
> >  	}
> >  
> >  	cctx = rdev->chip_ctx;
> > @@ -244,6 +244,8 @@ static void bng_re_query_hwrm_version(struct bng_re_dev *rdev)
> >  
> >  	if (!cctx->hwrm_cmd_max_timeout)
> >  		cctx->hwrm_cmd_max_timeout = BNG_ROCE_FW_MAX_TIMEOUT;
> > +
> > +	return 0;
> >  }
> >  
> >  static void bng_re_dev_uninit(struct bng_re_dev *rdev)
> > @@ -306,7 +308,9 @@ static int bng_re_dev_init(struct bng_re_dev *rdev)
> >  		goto msix_ctx_fail;
> >  	}
> >  
> > -	bng_re_query_hwrm_version(rdev);
> > +	rc = bng_re_query_hwrm_version(rdev);
> > +	if (rc)
> > +		goto query_hwrm_ver_fail;
> >  
> >  	rc = bng_re_alloc_fw_channel(&rdev->bng_res, &rdev->rcfw);
> >  	if (rc) {
> > @@ -392,6 +396,7 @@ static int bng_re_dev_init(struct bng_re_dev *rdev)
> >  nq_alloc_fail:
> >  	bng_re_free_rcfw_channel(&rdev->rcfw);
> >  alloc_fw_chl_fail:
> > +query_hwrm_ver_fail:
> >  	bng_re_destroy_chip_ctx(rdev);
> >  msix_ctx_fail:
> >  	bnge_unregister_dev(rdev->aux_dev);
> > -- 
> > 2.52.0
> > 
> 



Thread overview: 6+ messages
2026-03-03  4:36 [PATCH for-rc] RDMA/bng_re: Fix silent failure in HWRM version query Kamal Heib
2026-03-04  9:02 ` Siva Reddy Kallam
2026-03-04 15:37 ` Leon Romanovsky
2026-03-05  3:49   ` Kamal Heib [this message]
2026-03-05  9:32     ` Leon Romanovsky
2026-03-05  9:34 ` Leon Romanovsky
