Linux RDMA and InfiniBand development
From: Jason Gunthorpe <jgg@nvidia.com>
To: Junxian Huang <huangjunxian6@hisilicon.com>
Cc: leon@kernel.org, linux-rdma@vger.kernel.org, linuxarm@huawei.com,
	tangchengchang@huawei.com
Subject: Re: [PATCH for-rc 1/3] RDMA/hns: Fix memory leak of bonding resource
Date: Mon, 25 May 2026 11:38:37 -0300
Message-ID: <20260525143837.GA2457236@nvidia.com>
In-Reply-To: <20260520055759.2354037-2-huangjunxian6@hisilicon.com>

On Wed, May 20, 2026 at 01:57:57PM +0800, Junxian Huang wrote:
> In a corner case of concurrent driver removal and driver reset,
> bonding resource is first released in hns_roce_hw_v2_exit() during
> driver removal, and then is allocated again in hns_roce_register_device()
> during driver reset. This leads to memory leak because the release
> timing has already passed. This may also lead to a kernel panic
> as below because of the leaked notifier callback:
> 
>  Call trace:
>   0xffffa20fccc04978 (P)
>   raw_notifier_call_chain+0x20/0x38
>   call_netdevice_notifiers_info+0x60/0xb8
>   netdev_lower_state_changed+0x4c/0xb8
> 
> Bonding resource allocation and release should occur only during
> driver init and removal, so don't do the allocation during reset.
> 
> Fixes: b37ad2e290fc ("RDMA/hns: Initialize bonding resources")
> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
> ---
>  drivers/infiniband/hw/hns/hns_roce_main.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
> index c17ff5347a01..a7308a3c586e 100644
> --- a/drivers/infiniband/hw/hns/hns_roce_main.c
> +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
> @@ -795,6 +795,7 @@ static const struct ib_device_ops hns_roce_dev_restrack_ops = {
>  
>  static int hns_roce_register_device(struct hns_roce_dev *hr_dev)
>  {
> +	struct hns_roce_v2_priv *priv = hr_dev->priv;
>  	struct hns_roce_ib_iboe *iboe = NULL;
>  	struct device *dev = hr_dev->dev;
>  	struct ib_device *ib_dev = NULL;
> @@ -838,7 +839,8 @@ static int hns_roce_register_device(struct hns_roce_dev *hr_dev)
>  
>  	dma_set_max_seg_size(dev, SZ_2G);
>  
> -	if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_BOND) {
> +	if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_BOND &&
> +	    priv->handle->rinfo.reset_state != HNS_ROCE_STATE_RST_INIT) {
>  		ret = hns_roce_alloc_bond_grp(hr_dev);
>  		if (ret) {
>  			dev_err(dev, "failed to alloc bond_grp for bus %u, ret = %d\n",

The sashiko comments about inverted teardown seem pretty reasonable?

https://sashiko.dev/#/patchset/20260520055759.2354037-1-huangjunxian6%40hisilic

It would be better to fix it that way instead of sprinkling reset_state
checks like this around.

The other comments seem less interesting.

Jason
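
For readers following the thread, the lifetime argument in the commit message ("allocation and release should occur only during driver init and removal") can be shown with a toy model. This is only a sketch under stated assumptions: every name below (toy_dev, toy_init, toy_reset and so on) is hypothetical, none of it is hns code, and it does not reproduce the sashiko comments. It contrasts allocating the bonding state inside the registration path, which a late reset re-runs, with allocating it once at init and freeing it once at removal in reverse order.

```c
/* Toy model only: every name here is hypothetical, none of it is hns code. */
#include <assert.h>	/* for the checks shown in the usage note */
#include <stdbool.h>

struct toy_dev {
	int bond_allocs;	/* live bonding allocations, should end at 0 */
	bool registered;	/* device currently registered */
};

/* Per-registration state: set up and torn down on every reset. */
static void toy_register(struct toy_dev *d, bool alloc_bond_here)
{
	if (alloc_bond_here)
		d->bond_allocs++;	/* allocation tied to registration */
	d->registered = true;
}

static void toy_unregister(struct toy_dev *d)
{
	d->registered = false;
}

/* Driver-lifetime state: allocated once at init, freed once at removal. */
static void toy_init(struct toy_dev *d, bool alloc_in_register)
{
	if (!alloc_in_register)
		d->bond_allocs++;
	toy_register(d, alloc_in_register);
}

/* Reverse of toy_init(): unregister first, then free the bonding state. */
static void toy_remove(struct toy_dev *d)
{
	toy_unregister(d);
	if (d->bond_allocs > 0)
		d->bond_allocs--;
}

/* Reset re-runs only the registration half. */
static void toy_reset(struct toy_dev *d, bool alloc_in_register)
{
	toy_unregister(d);
	toy_register(d, alloc_in_register);
}

/* Bonding allocations still live after removal followed by a late reset. */
static int toy_leaked(bool alloc_in_register)
{
	struct toy_dev d = { 0 };

	toy_init(&d, alloc_in_register);
	toy_remove(&d);
	toy_reset(&d, alloc_in_register);	/* reset arriving after removal */
	return d.bond_allocs;
}
```

In this model, toy_leaked(true) reports one allocation left behind and toy_leaked(false) reports none, without any reset-state check in the registration path. Whether the real driver can be restructured this way depends on details not shown in this thread.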


Thread overview: 6+ messages
2026-05-20  5:57 [PATCH for-rc 0/3] RDMA/hns: Misc fixes Junxian Huang
2026-05-20  5:57 ` [PATCH for-rc 1/3] RDMA/hns: Fix memory leak of bonding resource Junxian Huang
2026-05-25 14:38   ` Jason Gunthorpe [this message]
2026-05-20  5:57 ` [PATCH for-rc 2/3] RDMA/hns: Fix warning in poll cq direct mode Junxian Huang
2026-05-20  5:57 ` [PATCH for-rc 3/3] RDMA/hns: Fix log flood after cmd_mbox failure Junxian Huang
2026-05-25 14:39 ` [PATCH for-rc 0/3] RDMA/hns: Misc fixes Jason Gunthorpe
