From: Jason Gunthorpe <jgg@nvidia.com>
To: Wenpeng Liang <liangwenpeng@huawei.com>
Cc: dledford@redhat.com, linux-rdma@vger.kernel.org, linuxarm@huawei.com
Subject: Re: [RESEND PATCH v2 for-next] RDMA/hns: Solve the problem that dma_pool is used during the reset
Date: Fri, 20 Aug 2021 16:29:17 -0300
Message-ID: <20210820192917.GA572552@nvidia.com>
In-Reply-To: <1629339474-43445-1-git-send-email-liangwenpeng@huawei.com>
On Thu, Aug 19, 2021 at 10:17:54AM +0800, Wenpeng Liang wrote:
> From: Lang Cheng <chenglang@huawei.com>
>
> During the reset, the driver calls dma_pool_destroy() to release the
> dma_pool resources. If the dma_pool_free interface is called during the
> modify_qp operation, an exception will occur.
>
> [15834.440744] Unable to handle kernel paging request at virtual address
> ffffa2cfc7725678
> ...
> [15834.660596] Call trace:
> [15834.663033] queued_spin_lock_slowpath+0x224/0x308
> [15834.667802] _raw_spin_lock_irqsave+0x78/0x88
> [15834.672140] dma_pool_free+0x34/0x118
> [15834.675799] hns_roce_free_cmd_mailbox+0x54/0x88 [hns_roce_hw_v2]
> [15834.681872] hns_roce_v2_qp_modify.isra.57+0xcc/0x120 [hns_roce_hw_v2]
> [15834.688376] hns_roce_v2_modify_qp+0x4d4/0x1ef8 [hns_roce_hw_v2]
> [15834.694362] hns_roce_modify_qp+0x214/0x5a8 [hns_roce_hw_v2]
> [15834.699996] _ib_modify_qp+0xf0/0x308
> [15834.703642] ib_modify_qp+0x38/0x48
> [15834.707118] rt_ktest_modify_qp+0x14c/0x998 [rdma_test]
> ...
> [15837.269216] Unable to handle kernel paging request at virtual address
> 000197c995a1d1b4
> ...
> [15837.480898] Call trace:
> [15837.483335] __free_pages+0x28/0x78
> [15837.486807] dma_direct_free_pages+0xa0/0xe8
> [15837.491058] dma_direct_free+0x48/0x60
> [15837.494790] dma_free_attrs+0xa4/0xe8
> [15837.498449] hns_roce_buf_free+0xb0/0x150 [hns_roce_hw_v2]
> [15837.503918] mtr_free_bufs.isra.1+0x88/0xc0 [hns_roce_hw_v2]
> [15837.509558] hns_roce_mtr_destroy+0x60/0x80 [hns_roce_hw_v2]
> [15837.515198] hns_roce_v2_cleanup_eq_table+0x1d0/0x2a0 [hns_roce_hw_v2]
> [15837.521701] hns_roce_exit+0x108/0x1e0 [hns_roce_hw_v2]
> [15837.526908] __hns_roce_hw_v2_uninit_instance.isra.75+0x70/0xb8 [hns_roce_hw_v2]
> [15837.534276] hns_roce_hw_v2_uninit_instance+0x64/0x80 [hns_roce_hw_v2]
> [15837.540786] hclge_uninit_client_instance+0xe8/0x1e8 [hclge]
> [15837.546419] hnae3_uninit_client_instance+0xc4/0x118 [hnae3]
> [15837.552052] hnae3_unregister_client+0x16c/0x1f0 [hnae3]
> [15837.557346] hns_roce_hw_v2_exit+0x34/0x50 [hns_roce_hw_v2]
> [15837.562895] __arm64_sys_delete_module+0x208/0x268
> [15837.567665] el0_svc_common.constprop.4+0x110/0x200
> [15837.572520] do_el0_svc+0x34/0x98
> [15837.575821] el0_svc+0x14/0x40
> [15837.578862] el0_sync_handler+0xb0/0x2d0
> [15837.582766] el0_sync+0x140/0x180
>
> It is caused by two concurrent processes:
> uninit_instance->dma_pool_destroy(cmdq)
> modify_qp->dma_pool_free(cmdq)

Something else has gone wrong in your system.

modify_qp is not allowed to be running after ib_unregister_device()
returns.

I see:

  [15834.707118] rt_ktest_modify_qp+0x14c/0x998 [rdma_test]

Which suggests to me your ULP is a test and that test is not properly
acting as an ib_client. When a client is unregistered it must close
all RDMA objects and stop all activity before the client unregister
callback returns.
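
To make that contract concrete, here is a rough userspace model of
it. This is not kernel code: fake_qp, test_client, client_add,
client_modify_qp and client_remove are all hypothetical names made up
for illustration, and the model is single-threaded, so the "wait for
in-flight calls" step collapses to a check. A real client would have
to block at that point until every caller has left.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an RDMA object such as a QP. */
struct fake_qp {
	int state;
};

/* Hypothetical per-device state of a test ULP. */
struct test_client {
	struct fake_qp *qp;	/* object owned by the client */
	bool stopping;		/* set once removal has begun */
	int inflight;		/* calls currently using the object */
};

/* Models the client's add callback: create the objects. */
static int client_add(struct test_client *c)
{
	c->qp = calloc(1, sizeof(*c->qp));
	if (!c->qp)
		return -1;
	c->stopping = false;
	c->inflight = 0;
	return 0;
}

/* Models a ULP entry point such as the test's modify_qp path. */
static int client_modify_qp(struct test_client *c, int new_state)
{
	/* Refuse new work once removal has begun or the QP is gone. */
	if (c->stopping || !c->qp)
		return -1;

	c->inflight++;
	c->qp->state = new_state;
	c->inflight--;
	return 0;
}

/* Models the client's remove callback. */
static void client_remove(struct test_client *c)
{
	/* 1. Stop accepting new activity. */
	c->stopping = true;

	/*
	 * 2. All in-flight calls must have drained. A real client would
	 *    block here; this single-threaded model can only check.
	 */
	assert(c->inflight == 0);

	/* 3. Close every object before returning. */
	free(c->qp);
	c->qp = NULL;
}
```

With this ordering, a client_modify_qp() call made after
client_remove() has returned fails with -1 instead of touching freed
state, which is the property the driver relies on when it tears down
its command queue resources.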
Jason