From: "lizhijian@fujitsu.com" <lizhijian@fujitsu.com>
To: Bob Pearson <rpearsonhpe@gmail.com>,
Yanjun Zhu <yanjun.zhu@linux.dev>, Jason Gunthorpe <jgg@ziepe.ca>,
"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH for-next] RDMA/rxe: check rxe_pd before rxe_put in rxe_mr_cleanup()
Date: Fri, 15 Jul 2022 03:37:57 +0000
Message-ID: <11dafa5f-c52d-16c1-fe37-2cd45ab20474@fujitsu.com>
In-Reply-To: <c82760f8-8774-a90e-7636-1c8954c007f3@gmail.com>
On 15/07/2022 00:13, Bob Pearson wrote:
> On 7/6/22 04:21, lizhijian@fujitsu.com wrote:
>> It's possible mr_pd(mr) returns NULL if rxe_mr_alloc() fails.
>>
>> it fixes below panic:
>> [ 114.163945] RPC: Registered rdma backchannel transport module.
>> [ 116.868003] eth0 speed is unknown, defaulting to 1000
>> [ 120.173114] rdma_rxe: rxe_mr_init_user: Unable to allocate memory for map
>> [ 120.173159] ==================================================================
>> [ 120.173161] BUG: KASAN: null-ptr-deref in __rxe_put+0x18/0x60 [rdma_rxe]
>> [ 120.173194] Write of size 4 at addr 0000000000000080 by task rdma_flush_serv/685
>> [ 120.173197]
>> [ 120.173199] CPU: 0 PID: 685 Comm: rdma_flush_serv Not tainted 5.19.0-rc1-roce-flush+ #90
>> [ 120.173203] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-27-g64f37cc530f1-prebuilt.qemu.org 04/01/2014
>> [ 120.173208] Call Trace:
>> [ 120.173216] <TASK>
>> [ 120.173217] dump_stack_lvl+0x34/0x44
>> [ 120.173250] kasan_report+0xab/0x120
>> [ 120.173261] ? __rxe_put+0x18/0x60 [rdma_rxe]
>> [ 120.173277] kasan_check_range+0xf9/0x1e0
>> [ 120.173282] __rxe_put+0x18/0x60 [rdma_rxe]
>> [ 120.173311] rxe_mr_cleanup+0x21/0x140 [rdma_rxe]
>> [ 120.173328] __rxe_cleanup+0xff/0x1d0 [rdma_rxe]
>> [ 120.173344] rxe_reg_user_mr+0xa7/0xc0 [rdma_rxe]
>> [ 120.173360] ib_uverbs_reg_mr+0x265/0x460 [ib_uverbs]
>> [ 120.173387] ? ib_uverbs_modify_qp+0x8b/0xd0 [ib_uverbs]
>> [ 120.173433] ? ib_uverbs_create_cq+0x100/0x100 [ib_uverbs]
>> [ 120.173461] ? uverbs_fill_udata+0x1d8/0x330 [ib_uverbs]
>> [ 120.173488] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x19d/0x250 [ib_uverbs]
>> [ 120.173517] ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x190/0x190 [ib_uverbs]
>> [ 120.173547] ? radix_tree_next_chunk+0x31e/0x410
>> [ 120.173559] ? uverbs_fill_udata+0x255/0x330 [ib_uverbs]
>> [ 120.173587] ib_uverbs_cmd_verbs+0x11c2/0x1450 [ib_uverbs]
>> [ 120.173616] ? ucma_put_ctx+0x16/0x50 [rdma_ucm]
>> [ 120.173623] ? __rcu_read_unlock+0x43/0x60
>> [ 120.173633] ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x190/0x190 [ib_uverbs]
>> [ 120.173661] ? uverbs_fill_udata+0x330/0x330 [ib_uverbs]
>> [ 120.173711] ? avc_ss_reset+0xb0/0xb0
>> [ 120.173722] ? vfs_fileattr_set+0x450/0x450
>> [ 120.173742] ? should_fail+0x78/0x2b0
>> [ 120.173745] ? __fsnotify_parent+0x38a/0x4e0
>> [ 120.173764] ? ioctl_has_perm.constprop.0.isra.0+0x198/0x210
>> [ 120.173784] ? should_fail+0x78/0x2b0
>> [ 120.173787] ? selinux_bprm_creds_for_exec+0x550/0x550
>> [ 120.173792] ib_uverbs_ioctl+0x114/0x1b0 [ib_uverbs]
>> [ 120.173820] ? ib_uverbs_cmd_verbs+0x1450/0x1450 [ib_uverbs]
>> [ 120.173861] __x64_sys_ioctl+0xb4/0xf0
>> [ 120.173867] do_syscall_64+0x3b/0x90
>> [ 120.173877] entry_SYSCALL_64_after_hwframe+0x46/0xb0
>> [ 120.173884] RIP: 0033:0x7f4b563c14eb
>> [ 120.173889] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 55 b9 0c 00 f7 d8 64 89 01 48
>> [ 120.173892] RSP: 002b:00007ffe0e4a6fe8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
>>
>> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
>> ---
>> drivers/infiniband/sw/rxe/rxe_mr.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
>> index 9a5c2af6a56f..cec5775a72f2 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
>> @@ -695,8 +695,10 @@ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
>> void rxe_mr_cleanup(struct rxe_pool_elem *elem)
>> {
>> struct rxe_mr *mr = container_of(elem, typeof(*mr), elem);
>> + struct rxe_pd *pd = mr_pd(mr);
>>
>> - rxe_put(mr_pd(mr));
>> + if (pd)
>> + rxe_put(pd);
>>
>> ib_umem_release(mr->umem);
>>
> Li,
>
> You seem to be fixing the problem in the wrong place.
> All MRs should have an associated PD.
Currently, in the rxe_reg_user_mr() path, a PD is associated with an MR only after the MR's map_set has been allocated successfully:
int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
                     int access, struct rxe_mr *mr)
{
...
        err = rxe_mr_alloc(mr, num_buf, 0);
        if (err) {
                pr_warn("%s: Unable to allocate memory for map\n",
                                __func__);
                goto err_release_umem;
        }
...
        mr->ibmr.pd = &pd->ibpd;   <<< associate the PD with the MR
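But if rxe_mr_alloc() fails, mr->ibmr.pd is never set, and the reference on this rxe_pd is dropped by rxe_mr_init_user()'s caller, rxe_reg_user_mr(), at err3. The rxe_cleanup(mr) call that follows then reaches rxe_mr_cleanup(), where mr_pd(mr) returns NULL: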
static struct ib_mr *rxe_reg_user_mr(struct ib_pd *ibpd,
                                     u64 start,
                                     u64 length,
                                     u64 iova,
                                     int access, struct ib_udata *udata)
{
        int err;
        struct rxe_dev *rxe = to_rdev(ibpd->device);
        struct rxe_pd *pd = to_rpd(ibpd);
        struct rxe_mr *mr;

        mr = rxe_alloc(&rxe->mr_pool);
        if (!mr) {
                err = -ENOMEM;
                goto err2;
        }


        rxe_get(pd);   <<< pairs with rxe_put() in rxe_mr_cleanup() if rxe_mr_init_user() succeeds, or with rxe_put() at err3 below

        err = rxe_mr_init_user(pd, start, length, iova, access, mr);
        if (err)
                goto err3;

        rxe_finalize(mr);

        return &mr->ibmr;

err3:
        rxe_put(pd);
        rxe_cleanup(mr);
err2:
        return ERR_PTR(err);
}
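
As a rough illustration only, the reference flow above can be modelled in user space as below. This is not rxe code: every toy_* name is made up for the sketch, the refcount is a plain int, and toy_put() merely stands in for rxe_put() (like __rxe_put(), it would dereference a NULL pointer). The sketch assumes the MR's PD pointer starts out NULL and includes the NULL check from the patch in the cleanup routine.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct rxe_pd and struct rxe_mr. */
struct toy_pd { int refcnt; };
struct toy_mr { struct toy_pd *pd; };	/* models mr->ibmr.pd */

static void toy_get(struct toy_pd *pd) { pd->refcnt++; }

/* Models rxe_put(): dereferences pd, so a NULL pd would crash. */
static void toy_put(struct toy_pd *pd)
{
	assert(pd->refcnt > 0);
	pd->refcnt--;
}

/* Models rxe_mr_init_user(): the PD is attached only after the
 * (simulated) map allocation has succeeded. */
static int toy_mr_init_user(struct toy_pd *pd, struct toy_mr *mr,
			    int alloc_fails)
{
	if (alloc_fails)
		return -ENOMEM;		/* mr->pd stays NULL */
	mr->pd = pd;
	return 0;
}

/* Models rxe_mr_cleanup() with the NULL check from the patch. */
static void toy_mr_cleanup(struct toy_mr *mr)
{
	if (mr->pd)
		toy_put(mr->pd);
}

/* Models rxe_reg_user_mr(): returns 0 or a negative errno. */
static int toy_reg_user_mr(struct toy_pd *pd, struct toy_mr *mr,
			   int alloc_fails)
{
	int err;

	mr->pd = NULL;			/* assumed initial state of a new MR */
	toy_get(pd);
	err = toy_mr_init_user(pd, mr, alloc_fails);
	if (err) {
		toy_put(pd);		/* err3: drop the reference taken above */
		toy_mr_cleanup(mr);	/* mr->pd is still NULL here */
		return err;
	}
	return 0;
}
```

With a PD whose refcnt starts at 1, toy_reg_user_mr(&pd, &mr, 1) returns -ENOMEM and leaves refcnt at 1. Without the check in toy_mr_cleanup(), the same call would pass NULL to toy_put().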
Thanks
> The PD is passed in as a struct ib_pd to one of the MR registration APIs from rdma-core.
> The PD is allocated in rdma-core and it should check that it has a valid PD before it calls
> the rxe driver. I am not sure how you triggered the above behavior.
>
> The address of the PD is saved in the MR struct when the MR is registered and just should never
> be NULL. Assuming there is a way to register an MR without a PD (I have never seen this) we should
> check it in the registration routine not the cleanup routine and fail the call there.
>
> [Jason, Is there such a thing as an MR without a valid PD?]
>
> Bob