From: "Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com>
To: "Daisuke Matsuda (Fujitsu)" <matsuda-daisuke@fujitsu.com>,
'Zhu Yanjun' <yanjun.zhu@intel.com>,
"yi.zhang@redhat.com" <yi.zhang@redhat.com>,
"bvanassche@acm.org" <bvanassche@acm.org>,
"haris.iqbal@ionos.com" <haris.iqbal@ionos.com>,
"jinpu.wang@ionos.com" <jinpu.wang@ionos.com>
Cc: Zhu Yanjun <yanjun.zhu@linux.dev>, "jgg@ziepe.ca" <jgg@ziepe.ca>,
"leon@kernel.org" <leon@kernel.org>,
"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
"zyjzyj2000@gmail.com" <zyjzyj2000@gmail.com>
Subject: Re: [PATCH 1/1] RDMA/rxe: Fix blktests srp lead kernel panic with 64k page size
Date: Fri, 20 Oct 2023 06:54:50 +0000
Message-ID: <e628d470-507d-41a2-92ee-c32b8a8d4791@fujitsu.com>
In-Reply-To: <a6e4efa6-0623-4afa-9b57-969aaf346081@fujitsu.com>
Adding more recipients. Hi Bart,
Yi reported a crash [1] when PAGE_SIZE is 64K.
[1] https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/
The root cause is still unknown, but I notice that SRP over RXE always uses an MR with a 4K page_size,
i.e. MAX(4K, min(device_support_page_size)), even though the device (*) supports a 64K page size.
(*) The RXE device supports page sizes from 4K to 2G.
	/*
	 * Use the smallest page size supported by the HCA, down to a
	 * minimum of 4096 bytes. We're unlikely to build large sglists
	 * out of smaller entries.
	 */
	mr_page_shift = max(12, ffs(attr->page_size_cap) - 1);
	srp_dev->mr_page_size = 1 << mr_page_shift;
	srp_dev->mr_page_mask = ~((u64) srp_dev->mr_page_size - 1);
	max_pages_per_mr = attr->max_mr_size;
I wonder whether we could use a PAGE_SIZE MR when the device supports it.
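To make the arithmetic concrete, here is a rough userspace sketch, not a patch. It assumes a capability mask with bits 12..31 set (which is what "4K ~ 2G" implies) and compares the shift that the existing max(12, ffs() - 1) expression yields with a "prefer PAGE_SIZE" variant. The helper names are hypothetical and exist only for this illustration; they are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Index of the lowest set bit, or -1 if the mask is empty.
 * For the masks discussed here this equals ffs(mask) - 1.
 */
static int lowest_set_bit(uint64_t mask)
{
	int bit;

	for (bit = 0; bit < 64; bit++)
		if (mask & (1ULL << bit))
			return bit;
	return -1;
}

/* Mirrors the existing logic: smallest supported page size, but at least 4K. */
static int current_mr_page_shift(uint64_t page_size_cap)
{
	int lowest = lowest_set_bit(page_size_cap);

	return lowest > 12 ? lowest : 12;
}

/*
 * Hypothetical alternative: use the system page size when the device
 * advertises it, otherwise fall back to the existing choice.
 */
static int proposed_mr_page_shift(uint64_t page_size_cap, int page_shift)
{
	assert(page_shift >= 12 && page_shift < 64);

	if (page_size_cap & (1ULL << page_shift))
		return page_shift;
	return current_mr_page_shift(page_size_cap);
}
```

With a mask of 0xfffff000 and a 64K system page (shift 16), current_mr_page_shift() returns 12, i.e. a 4K MR page, while proposed_mr_page_shift() returns 16. Whether SRP can actually work with the larger MR page size is a separate question that this sketch does not answer.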
BTW, rtrs seems to have the same code. @rtrs
Thanks
Zhijian
On 20/10/2023 11:47, Zhijian Li (Fujitsu) wrote:
> CC Bart
>
> On 13/10/2023 20:01, Daisuke Matsuda (Fujitsu) wrote:
>> On Fri, Oct 13, 2023 10:18 AM Zhu Yanjun wrote:
>>> From: Zhu Yanjun<yanjun.zhu@linux.dev>
>>>
>>> The page_size of mr is originally set in the infiniband core. Commit
>>> 325a7eb85199 ("RDMA/rxe: Cleanup page variables in rxe_mr.c") also sets
>>> the page_size. Sometimes this causes a conflict.
>> I appreciate your prompt action, but I do not think this commit deals with
>> the root cause. I agree that the problem lies in rxe driver, but what is wrong
>> with assigning actual page size to ibmr.page_size?
>>
>> IMO, the problem comes from the device attribute of rxe driver, which is used
>> in ulp/srp layer to calculate the page_size.
>> =====
>> static int srp_add_one(struct ib_device *device)
>> {
>> 	struct srp_device *srp_dev;
>> 	struct ib_device_attr *attr = &device->attrs;
>> <...>
>> 	/*
>> 	 * Use the smallest page size supported by the HCA, down to a
>> 	 * minimum of 4096 bytes. We're unlikely to build large sglists
>> 	 * out of smaller entries.
>> 	 */
>> 	mr_page_shift = max(12, ffs(attr->page_size_cap) - 1);
>
>
> That was enlightening.
> RXE provides attr.page_size_cap (RXE_PAGE_SIZE_CAP), which means it can support page sizes from 4K to 2G.
>
> So I think a more accurate logic would be:
>
> if (device supports PAGE_SIZE)
>     use PAGE_SIZE
> else if (device supports a 4096-byte page size) // fall back to 4096
>     use 4096 etc...
> else
>     ...
>
>
>
>
>> srp_dev->mr_page_size = 1 << mr_page_shift;
>> =====
>> On initialization of the srp driver, mr_page_size is calculated here.
>> Note that the device attribute is used to calculate the value of page shift
>> when the device is trying to use a page size larger than 4096. Since Yi specified
>> CONFIG_ARM64_64K_PAGES, the system naturally met the condition.