From: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: List Linux RDMA Mailing
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Knut Omang <Knut.Omang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
Jack Morgenstein <jackm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Subject: Re: Crashes due to concurrent calls to ib_unmap_fmr()
Date: Tue, 18 Apr 2017 20:44:30 +0300 [thread overview]
Message-ID: <20170418174430.GD14088@mtr-leonro.local> (raw)
In-Reply-To: <A9FF7F7C-F936-4925-B3A6-31684613B69F-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
[-- Attachment #1: Type: text/plain, Size: 2587 bytes --]
On Mon, Apr 17, 2017 at 01:45:24PM -0400, Chuck Lever wrote:
>
> > On Apr 15, 2017, at 5:55 AM, Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> >
> > On Fri, Apr 14, 2017 at 11:51:39AM -0400, Chuck Lever wrote:
> >> Howdy-
> >>
> >> I recently found a way to crash my HCA (and the whole system) using a
> >> signal on an NFS/RDMA mount point that is using FMR. I've documented
> >> the issue:
> >>
> >> https://bugzilla.linux-nfs.org/show_bug.cgi?id=305
> >>
> >> And I have an NFS/RDMA fix I'm testing for v4.13. The fix is to prevent
> >> simultaneous calls to ib_unmap_fmr with the same FMR.
> >>
> >> While working on the fix, I've been looking for any documentation
> >> regarding serialization requirements for ib_unmap_fmr. Knut Omang pointed
> >> out to me that Documentation/infiniband/core-locking.txt makes this bold
> >> statement:
> >>
> >>> Reentrancy
> >>>
> >>> All of the methods in struct ib_device exported by a low-level
> >>> driver must be fully reentrant. The low-level driver is required to
> >>> perform all synchronization necessary to maintain consistency, even
> >>> if multiple function calls using the same object are run
> >>> simultaneously.
> >>>
> >>> The IB midlayer does not perform any serialization of function calls.
> >>>
> >>> Because low-level drivers are reentrant, upper level protocol
> >>> consumers are not required to perform any serialization.
> >>
> >> Does this re-entrancy guarantee apply only when ib_unmap_fmr is called
> >> concurrently with unique FMRs?
> >
> > According to description, it should apply to all operations on ib_device
> > without any exclusion.
> >
> >>
> >> I've been told it is not possible for ib_unmap_fmr to detect when it has
> >> been invoked in different threads with the same FMR.
> >
> > Right, FMR management is implemented as direct writes to MPT and MTT
> > tables. HW doesn't distinguish simultaneous calls to the TPT cache.
> >
> >> but apparently the > user space equivalent does not have the same
> >> vulnerability (I did not test this assertion).
> >>
> >> I'm wondering what is proper closure here (aside from merging the
> >> NFS/RDMA fix).
> >
> > Maybe serialize unmap_frm (workqueue) from the driver side?
>
> Either correcting the documentation or a driver change is OK with me.
>
> Claiming that "upper level protocol consumers are not required to
> perform any serialization" seems like a stretch.
Right,
I added Jack to this thread, and we will need a couple of days to think
internally about possible solutions.
Thanks
>
>
> --
> Chuck Lever
>
>
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2017-04-18 17:44 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-14 15:51 Crashes due to concurrent calls to ib_unmap_fmr() Chuck Lever
[not found] ` <5C9E097E-938D-4F41-9EA4-003F77A54DAD-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2017-04-15 9:55 ` Leon Romanovsky
[not found] ` <20170415095528.GK1343-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-04-17 17:45 ` Chuck Lever
[not found] ` <A9FF7F7C-F936-4925-B3A6-31684613B69F-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2017-04-18 17:44 ` Leon Romanovsky [this message]
[not found] ` <20170418174430.GD14088-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-04-19 8:02 ` jackm
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170418174430.GD14088@mtr-leonro.local \
--to=leon-dgejt+ai2ygdnm+yrofe0a@public.gmane.org \
--cc=Knut.Omang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
--cc=chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
--cc=jackm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
--cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox