From: Leon Romanovsky <leon@kernel.org>
To: "Rehm, Kevan" <kevan.rehm@hpe.com>
Cc: "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
Yishai Hadas <yishaih@nvidia.com>
Subject: Re: Segfault in mlx5 driver on infiniband after application fork
Date: Thu, 8 Feb 2024 10:52:29 +0200
Message-ID: <20240208085229.GF56027@unreal>
In-Reply-To: <E25C1D96-0FBF-44AB-A5B5-71CDA49E73D1@hpe.com>
On Wed, Feb 07, 2024 at 07:17:01PM +0000, Rehm, Kevan wrote:
> Greetings,
>
> I don’t see a way to open a ticket at rdma-core; it was suggested that I send this email instead.
>
> I have been chasing a problem in rdma-core-47.1. Originally, I opened a ticket in libfabric, but it was pointed out that mlx5 is not part of libfabric. The full description of the problem and my debug notes are in the libfabric GitHub repository under issue 9792; please have a look there, as I would rather not repeat all of the background information in this email.
>
> An application started by pytorch does a fork, then the child process attempts to use libfabric to open a new DAOS infiniband endpoint. The original endpoint is owned and still in use by the parent process.
>
> When the parent process created the endpoint (fi_fabric, fi_domain, fi_endpoint calls), the mlx5 driver allocated memory pages for use in SRQ creation, and issued a madvise() call to mark the pages as DONTFORK. These pages are associated with the domain’s ibv_device, which is cached in the driver. After the fork, when the child process calls fi_domain for its new endpoint, it gets the ibv_device that was cached when the parent created it. The child process immediately segfaults when trying to create an SRQ, because the pages associated with that ibv_device are not in the child’s memory. There doesn’t appear to be any way for a child process to create a fresh endpoint because of the caching being done for ibv_devices.
>
> Is this the proper way to “open a ticket” against rdma-core?
It is the right place, but I wouldn't call it the "proper way".
For anyone who is interested in this issue, please follow the links below:
https://github.com/ofiwg/libfabric/issues/9792
https://daosio.atlassian.net/browse/DAOS-15117
Regarding the issue, I don't know whether mlx5 is actively used with
libfabric, but the mentioned call to ibv_dontfork_range() has existed
since the prehistoric era.
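
For anyone following along, the failure mode in the report above can be
reproduced without any RDMA hardware. What follows is only a minimal
sketch, not the mlx5 code path: it assumes Linux, and the function name
dontfork_child_faults() is made up for this illustration. It maps one
anonymous page, marks it MADV_DONTFORK (the advice the report says the
driver issues), forks, and has the child read the page through the
pointer it inherited from the parent.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS and MADV_DONTFORK */
#endif
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if a forked child is killed by SIGSEGV when it touches a page
 * the parent marked MADV_DONTFORK, 0 if the child survives the access,
 * and -1 if the setup itself fails.
 */
int dontfork_child_faults(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    memset(buf, 0xab, len);      /* make sure the page is populated */

    /* Ask the kernel not to carry this range over into children. */
    if (madvise(buf, len, MADV_DONTFORK) != 0) {
        munmap(buf, len);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: the pointer value was inherited, the mapping was not. */
        char c = *(volatile char *)buf;
        (void)c;
        _exit(0);                /* only reached if the read succeeded */
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(buf, len);            /* the parent still owns the mapping */
    if (waited < 0)
        return -1;

    return (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV) ? 1 : 0;
}
```

On a Linux kernel that honours MADV_DONTFORK I would expect this to
return 1. That matches the shape of the reported crash: a cached pointer
stays valid in the parent but refers to nothing in the child.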
Do you have any environment variables set related to rdma-core?
Thanks
>
> Regards, Kevan
>
>
>