From: Zhu Yanjun <yanjun.zhu@linux.dev>
To: "Daisuke Matsuda (Fujitsu)" <matsuda-daisuke@fujitsu.com>,
'Zhu Yanjun' <yanjun.zhu@intel.com>,
"yi.zhang@redhat.com" <yi.zhang@redhat.com>
Cc: "jgg@ziepe.ca" <jgg@ziepe.ca>,
"leon@kernel.org" <leon@kernel.org>,
"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
"zyjzyj2000@gmail.com" <zyjzyj2000@gmail.com>
Subject: Re: [PATCH 1/1] RDMA/rxe: Fix blktests srp lead kernel panic with 64k page size
Date: Fri, 13 Oct 2023 20:28:39 +0800 [thread overview]
Message-ID: <6f155aa3-8e75-40c5-9686-cad9f9ac0d81@linux.dev> (raw)
In-Reply-To: <OS3PR01MB98651C7454C46841B8A78F11E5D2A@OS3PR01MB9865.jpnprd01.prod.outlook.com>
在 2023/10/13 20:01, Daisuke Matsuda (Fujitsu) 写道:
> On Fri, Oct 13, 2023 10:18 AM Zhu Yanjun wrote:
>> From: Zhu Yanjun <yanjun.zhu@linux.dev>
>>
>> The page_size of mr is set in infiniband core originally. In the commit
>> 325a7eb85199 ("RDMA/rxe: Cleanup page variables in rxe_mr.c"), the
>> page_size is also set. Sometime this will cause conflict.
>
> I appreciate your prompt action, but I do not think this commit deals with
> the root cause. I agree that the problem lies in rxe driver, but what is wrong
> with assigning actual page size to ibmr.page_size?
Please check the source code. ibmr.page_size is assigned in
infiniband/core. And then it is assigned in rxe.
When the 2 are different, the problem will occur.
Please add logs to infiniband/core and rxe to check them.
Zhu Yanjun
>
> IMO, the problem comes from the device attribute of rxe driver, which is used
> in ulp/srp layer to calculate the page_size.
> =====
> static int srp_add_one(struct ib_device *device)
> {
> struct srp_device *srp_dev;
> struct ib_device_attr *attr = &device->attrs;
> <...>
> /*
> * Use the smallest page size supported by the HCA, down to a
> * minimum of 4096 bytes. We're unlikely to build large sglists
> * out of smaller entries.
> */
> mr_page_shift = max(12, ffs(attr->page_size_cap) - 1);
> srp_dev->mr_page_size = 1 << mr_page_shift;
> =====
> On initialization of srp driver, mr_page_size is calculated here.
> Note that the device attribute is used to calculate the value of page shift
> when the device is trying to use a page size larger than 4096. Since Yi specified
> CONFIG_ARM64_64K_PAGES, the system naturally met the condition.
>
> =====
> static int srp_map_finish_fr(struct srp_map_state *state,
> struct srp_request *req,
> struct srp_rdma_ch *ch, int sg_nents,
> unsigned int *sg_offset_p)
> {
> struct srp_target_port *target = ch->target;
> struct srp_device *dev = target->srp_host->srp_dev;
> <...>
> n = ib_map_mr_sg(desc->mr, state->sg, sg_nents, sg_offset_p,
> dev->mr_page_size);
> =====
> After that, mr_page_size is presumably passed to ib_core layer.
>
> =====
> int ib_map_mr_sg(struct ib_mr *mr, struct scatterlist *sg, int sg_nents,
> unsigned int *sg_offset, unsigned int page_size)
> {
> if (unlikely(!mr->device->ops.map_mr_sg))
> return -EOPNOTSUPP;
>
> mr->page_size = page_size;
>
> return mr->device->ops.map_mr_sg(mr, sg, sg_nents, sg_offset);
> }
> EXPORT_SYMBOL(ib_map_mr_sg);
> =====
> Consequently, the page size calculated in srp driver is set to ibmr.page_size.
>
> Coming back to rxe, the device attribute is set here:
> =====
> rxe.c
> <...>
> /* initialize rxe device parameters */
> static void rxe_init_device_param(struct rxe_dev *rxe)
> {
> rxe->max_inline_data = RXE_MAX_INLINE_DATA;
>
> rxe->attr.vendor_id = RXE_VENDOR_ID;
> rxe->attr.max_mr_size = RXE_MAX_MR_SIZE;
> rxe->attr.page_size_cap = RXE_PAGE_SIZE_CAP;
> rxe->attr.max_qp = RXE_MAX_QP;
> ---
> rxe_param.h
> <...>
> /* default/initial rxe device parameter settings */
> enum rxe_device_param {
> RXE_MAX_MR_SIZE = -1ull,
> RXE_PAGE_SIZE_CAP = 0xfffff000,
> RXE_MAX_QP_WR = DEFAULT_MAX_VALUE,
> =====
> rxe_init_device_param() sets the attributes to rxe_dev->attr, and it is later copied
> to ib_device->attrs in setup_device()@core/device.c.
> See that the page size cap is hardcoded to 4096 bytes. I suspect this led to
> incorrect page_size being set to ibmr.page_size, resulting in the kernel crash.
>
> I think rxe driver is made to be able to handle arbitrary page sizes.
> Probably, we can just modify RXE_PAGE_SIZE_CAP to fix the issue.
> What do you guys think?
>
> Thanks,
> Daisuke Matsuda
>
next prev parent reply other threads:[~2023-10-13 12:28 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-13 1:18 [PATCH 1/1] RDMA/rxe: Fix blktests srp lead kernel panic with 64k page size Zhu Yanjun
2023-10-13 12:01 ` Daisuke Matsuda (Fujitsu)
2023-10-13 12:28 ` Zhu Yanjun [this message]
2023-10-13 13:01 ` Daisuke Matsuda (Fujitsu)
2023-10-13 13:44 ` Rain River
2023-10-16 6:07 ` Daisuke Matsuda (Fujitsu)
2023-10-18 8:34 ` Zhu Yanjun
2023-10-20 3:47 ` Zhijian Li (Fujitsu)
2023-10-20 6:54 ` Zhijian Li (Fujitsu)
2023-10-20 16:21 ` Bart Van Assche
2023-10-23 0:58 ` Zhijian Li (Fujitsu)
2023-10-20 14:01 ` Jason Gunthorpe
2023-10-23 3:52 ` Zhijian Li (Fujitsu)
2023-10-23 6:08 ` Zhu Yanjun
2023-10-23 10:45 ` Yi Zhang
2023-10-24 8:15 ` Zhijian Li (Fujitsu)
2023-10-24 9:13 ` Zhijian Li (Fujitsu)
2023-10-26 9:05 ` Zhijian Li (Fujitsu)
2023-10-26 11:42 ` Jason Gunthorpe
2023-10-26 12:59 ` Zhu Yanjun
2023-10-26 23:23 ` Jason Gunthorpe
2023-10-27 1:36 ` Zhu Yanjun
2023-10-27 4:01 ` Zhu Yanjun
2023-10-27 11:51 ` Jason Gunthorpe
2023-10-26 13:28 ` Bart Van Assche
2023-10-26 13:43 ` Jason Gunthorpe
2023-10-26 21:47 ` Bart Van Assche
2023-10-27 1:26 ` Daisuke Matsuda (Fujitsu)
2023-10-27 1:39 ` Zhu Yanjun
2023-10-27 5:43 ` Zhijian Li (Fujitsu)
2023-10-31 1:36 ` Zhu Yanjun
[not found] ` <CAEz=LcuLCe7bhUohh6BcHdJ1_ocJdZq=eu07vWb3Md5_ZOGDBg@mail.gmail.com>
[not found] ` <CAEz=LcuQ6fFpHqBPT1oTUgKABAHFJqYDC-AHidE-+n6OtzmCPQ@mail.gmail.com>
2023-10-31 8:14 ` Greg Sword
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6f155aa3-8e75-40c5-9686-cad9f9ac0d81@linux.dev \
--to=yanjun.zhu@linux.dev \
--cc=jgg@ziepe.ca \
--cc=leon@kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=matsuda-daisuke@fujitsu.com \
--cc=yanjun.zhu@intel.com \
--cc=yi.zhang@redhat.com \
--cc=zyjzyj2000@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox