From: Chuck Lever III <chuck.lever@oracle.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Leon Romanovsky <leon@kernel.org>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH v1] RDMA/core: Fix check_flush_dependency splat on addr_wq
Date: Mon, 29 Aug 2022 19:31:43 +0000
Message-ID: <EFCAC526-1C24-40B6-A535-E58A616875CD@oracle.com>
In-Reply-To: <Yw0E68fl/FcvUSnO@nvidia.com>
> On Aug 29, 2022, at 2:26 PM, Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Mon, Aug 29, 2022 at 06:15:28PM +0000, Chuck Lever III wrote:
>>
>>> On Aug 29, 2022, at 1:22 PM, Jason Gunthorpe <jgg@nvidia.com> wrote:
>
>>> Even a simple case like mlx5 may cause the NIC to trigger a host
>>> memory allocation, which is done in another thread and done as a
>>> normal GFP_KERNEL. This memory allocation must progress before a
>>> CQ/QP/MR/etc can be created. So now we are deadlocked again.
>>
>> That sounds to me like a bug in mlx5. The driver is supposed
>> to respect the caller's GFP settings. Again, if the request
>> is small, it's likely to succeed anyway, but larger requests
>> are not reliable and need to fail quickly so the system can
>> move onto other fishing spots.
>
> It is a design artifact, the FW is the one requesting the memory and
> it has no idea about kernel GFP flags. As above a FW thread could have
> already started requesting memory for some other purpose and we may
> already be inside the mlx5 FW page request thread under a GFP_KERNEL
> allocation doing reclaim. How can this ever be fixed?

I'm willing to admit I'm no expert here. But... IIUC the
deadlock problem is triggered by /waiting/ for memory to
become available to satisfy an allocation request.

So using GFP_NOWAIT, GFP_NOIO/memalloc_noio, and
GFP_NOFS/memalloc_nofs when drivers allocate memory should
be enough to prevent a deadlock and keep the allocations from
diving into reserved memory. I believe only GFP_ATOMIC goes
for reserved memory pools. These others are normal allocations:
GFP_NOWAIT simply does not wait if direct reclaim would be
required, and the NOIO/NOFS variants keep reclaim from recursing
into the I/O or filesystem paths.

The second-order issue is that the "failed to allocate"
recovery paths are not likely to be well tested, and these
other flags make that kind of failure more likely. Enable
memory allocation failure injection and begin fixing the shit
that comes up.
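
To illustrate the failure-injection point, here is a minimal userspace
sketch, not kernel code and not the kernel's fault-injection framework.
Everything in it (fail_nth, alloc_may_fail, struct demo_qp and its
helpers) is a hypothetical name made up for the example. It only shows
the idea: force the Nth allocation to fail and check that the unwind
path leaves nothing half-built.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical injection knob: fail the Nth allocation from now on
 * (1 = the very next one). 0 means never inject a failure. */
static unsigned int fail_nth;

/* Allocation wrapper that honors the injection knob (assumed helper). */
static void *alloc_may_fail(size_t size)
{
	if (fail_nth && --fail_nth == 0)
		return NULL;	/* injected failure */
	return malloc(size);
}

/* Stand-in for a resource that needs two buffers to be usable. */
struct demo_qp {
	void *sq;
	void *rq;
};

/* Returns 0 on success, -1 on failure. On failure nothing stays
 * allocated and both pointers are NULL. */
static int demo_qp_create(struct demo_qp *qp, size_t depth)
{
	qp->sq = alloc_may_fail(depth);
	if (!qp->sq)
		goto err;

	qp->rq = alloc_may_fail(depth);
	if (!qp->rq)
		goto err_free_sq;

	return 0;

err_free_sq:
	free(qp->sq);
	qp->sq = NULL;
err:
	qp->rq = NULL;
	return -1;
}

/* Release a successfully created demo_qp. */
static void demo_qp_destroy(struct demo_qp *qp)
{
	assert(qp != NULL);
	free(qp->sq);
	free(qp->rq);
	qp->sq = NULL;
	qp->rq = NULL;
}
```

Walking fail_nth through 1, 2, ... exercises each error label in turn,
which is roughly what a per-call-site injection run would do for a real
driver's unwind paths.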

If you've got "can't fail" scenarios, we'll have to look at
those closely.

--
Chuck Lever