Linux RDMA and InfiniBand development
 help / color / mirror / Atom feed
From: "lizhijian@fujitsu.com" <lizhijian@fujitsu.com>
To: Jason Gunthorpe <jgg@nvidia.com>, Bob Pearson <rpearsonhpe@gmail.com>
Cc: Zhu Yanjun <zyjzyj2000@gmail.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	Bob Pearson <rpearsonhpe@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Guoqing Jiang <guoqing.jiang@linux.dev>
Subject: Re: [PATCH] RDMA/rxe: Use kzalloc() to alloc map_set
Date: Tue, 24 May 2022 03:59:15 +0000	[thread overview]
Message-ID: <3e3373f5-7b12-a8e8-2d73-c2976b272290@fujitsu.com> (raw)
In-Reply-To: <d956bac8-36a6-0148-6f9c-fa43c8c272a7@fujitsu.com>

Hi Jason & Bob
CC Guoqing

@Guoqing, It may correlate with your previous bug report: https://lore.kernel.org/all/20220210073655.42281-1-guoqing.jiang@linux.dev/T/


It's observed that a same MR in rnbd server will trigger below code
path:
  -> rxe_mr_init_fast()
  |-> alloc map_set() # map_set is uninitialized
  |...-> rxe_map_mr_sg() # build the map_set
      |-> rxe_mr_set_page()
  |...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE that means
                           # we can access host memory(such rxe_mr_copy)
  |...-> rxe_invalidate_mr() # mr->state change to FREE from VALID
  |...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE,
                           # but map_set was not built again
  |...-> rxe_mr_copy() # kernel crash due to access wild addresses
                       # that lookup from the map_set

I draft a patch like below for it, but i wonder if it's rxe's responsibility to do such checking.
Any comments are very welcome.


 From e9d0bd821f07f5e049027f07b3ce9dc283624201 Mon Sep 17 00:00:00 2001
From: Li Zhijian <lizhijian@fujitsu.com>
Date: Tue, 24 May 2022 10:56:19 +0800
Subject: [PATCH] RDMA/rxe: check map_set valid when handle IB_WR_REG_MR
                                                                                                                                                                                                                                        
It's observed that a same MR in rnbd server will trigger below code
path:
  -> rxe_mr_init_fast()
  |-> alloc map_set() # map_set is uninitialized
  |...-> rxe_map_mr_sg() # build the map_set
      |-> rxe_mr_set_page()
  |...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE that means
                           # we can access host memory(such rxe_mr_copy)
  |...-> rxe_invalidate_mr() # mr->state change to FREE from VALID
  |...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE,
                           # but map_set was not built again
  |...-> rxe_mr_copy() # kernel crash due to access wild addresses
                       # that lookup from the map_set
                                                                                                                                                                                                                                        
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
  drivers/infiniband/sw/rxe/rxe_mr.c    | 9 +++++++++
  drivers/infiniband/sw/rxe/rxe_verbs.c | 1 +
  drivers/infiniband/sw/rxe/rxe_verbs.h | 1 +
  3 files changed, 11 insertions(+)
                                                                                                                                                                                                                                        
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 787c7dadc14f..09673d559c06 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -90,6 +90,7 @@ static int rxe_mr_alloc_map_set(int num_map, struct rxe_map_set **setp)
         if (!set->map)
                 goto err_free_set;
                                                                                                                                                                                                                                        
+       set->valid = false;
         for (i = 0; i < num_map; i++) {
                 set->map[i] = kmalloc(sizeof(struct rxe_map), GFP_KERNEL);
                 if (!set->map[i])
@@ -216,6 +217,7 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
         }
                                                                                                                                                                                                                                        
         set = mr->cur_map_set;
+       set->valid = true;
         set->page_shift = PAGE_SHIFT;
         set->page_mask = PAGE_SIZE - 1;
  
@@ -643,6 +645,7 @@ int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey)
         }
  
         mr->state = RXE_MR_STATE_FREE;
+       mr->cur_map_set->valid = mr->next_map_set->valid = false;
         ret = 0;
  
  err_drop_ref:
@@ -679,12 +682,18 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
                 return -EINVAL;
         }
  
+       if (!mr->next_map_set->valid) {
+               pr_warn("%s: map set is not valid\n", __func__);
+               return -EINVAL;
+       }
+
         mr->access = access;
         mr->lkey = (mr->lkey & ~0xff) | key;
         mr->rkey = (access & IB_ACCESS_REMOTE) ? mr->lkey : 0;
         mr->state = RXE_MR_STATE_VALID;
  
         set = mr->cur_map_set;
+       set->valid = false;
         mr->cur_map_set = mr->next_map_set;
         mr->cur_map_set->iova = wqe->wr.wr.reg.mr->iova;
         mr->next_map_set = set;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 58e4412b1d16..4b7ae2d1d921 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -992,6 +992,7 @@ static int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
         set->page_shift = ilog2(ibmr->page_size);
         set->page_mask = ibmr->page_size - 1;
         set->offset = set->iova & set->page_mask;
+       set->valid = true;
  
         return n;
  }
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index 86068d70cd95..2edf31aab7e1 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -289,6 +289,7 @@ struct rxe_map {
  
  struct rxe_map_set {
         struct rxe_map          **map;
+       bool                    valid;
         u64                     va;
         u64                     iova;
         size_t                  length;
-- 
2.31.1


On 23/05/2022 22:02, Li, Zhijian wrote:
>
> on 2022/5/20 22:45, Jason Gunthorpe wrote:
>> On Wed, May 18, 2022 at 12:37:25PM +0800, Li Zhijian wrote:
>>> Below call chains will alloc map_set without fully initializing map_set.
>>> rxe_mr_init_fast()
>>>   -> rxe_mr_alloc()
>>>      -> rxe_mr_alloc_map_set()
>>>
>>> Uninitialized values inside struct rxe_map_set are possible to cause
>>> kernel panic.
>> If the value is uninitialized then why is 0 an OK value?
>>
>> Would be happier to know the exact value that is not initialized
>
> Well, good question. After re-think of this issue, it seems this patch wasn't the root cause though it made the crash disappear in some extent.
>
> I'm still working on the root cause :)
>
> Thanks
>
> Zhijian
>
>
>>
>> Jason
>
>

  reply	other threads:[~2022-05-24  4:00 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-18  4:37 [PATCH] RDMA/rxe: Use kzalloc() to alloc map_set Li Zhijian
2022-05-20 14:45 ` Jason Gunthorpe
2022-05-23 14:02   ` Li, Zhijian
2022-05-24  3:59     ` lizhijian [this message]
2022-05-24 10:56       ` Haris Iqbal
2022-05-25  1:31         ` lizhijian
2022-05-25  4:11           ` Guoqing Jiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3e3373f5-7b12-a8e8-2d73-c2976b272290@fujitsu.com \
    --to=lizhijian@fujitsu.com \
    --cc=guoqing.jiang@linux.dev \
    --cc=jgg@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=rpearsonhpe@gmail.com \
    --cc=zyjzyj2000@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox