* [PATCH rdma-next 0/8] Cleanup and fix the CMA state machine
From: Leon Romanovsky @ 2020-09-02 8:11 UTC (permalink / raw)
To: Doug Ledford, Jason Gunthorpe
Cc: Leon Romanovsky, Eli Cohen, linux-kernel, linux-rdma,
Roland Dreier
From: Leon Romanovsky <leonro@nvidia.com>
From Jason:
The RDMA CMA continues to attract syzkaller bugs due to its somewhat loose
operation of its FSM. Audit and scrub the whole thing to follow modern
expectations.
Overall the design elements are broadly:
- The ULP entry points MUST NOT run in parallel with each other. The ULP
is solely responsible for preventing this.
- If the ULP returns !0 from its event callback it MUST guarantee that no
other ULP threads are touching the cm_id or calling into any RDMA CM
entry point.
- ULP entry points can sometimes run concurrently with handler callbacks,
although it is tricky because there are many entry points that exist
in the flow before the handler is registered.
- Some ULP entry points are called from the ULP event handler callback,
under the handler_mutex. (however ucma never does this)
- The state field uses a weird double-locking scheme; in most cases one
should hold the handler_mutex. (It is somewhat unclear what exactly the
spinlock is for.)
- Reading the state without holding the spinlock should use READ_ONCE,
even if the handler_mutex is held.
- There are certain states which are 'stable' under the handler_mutex;
exit from such a state requires also holding the handler_mutex. This
explains why testing the state under only the handler_mutex makes sense.
Thanks
Jason Gunthorpe (8):
RDMA/cma: Fix locking for the RDMA_CM_CONNECT state
RDMA/cma: Make the locking for automatic state transition more clear
RDMA/cma: Fix locking for the RDMA_CM_LISTEN state
RDMA/cma: Remove cma_comp()
RDMA/cma: Combine cma_ndev_work with cma_work
RDMA/cma: Remove dead code for kernel rdmacm multicast
RDMA/cma: Consolidate the destruction of a cma_multicast in one place
RDMA/cma: Fix use after free race in roce multicast join
drivers/infiniband/core/cma.c | 466 ++++++++++++++++------------------
1 file changed, 218 insertions(+), 248 deletions(-)
--
2.26.2
* Re: [PATCH rdma-next 0/8] Cleanup and fix the CMA state machine
From: Jason Gunthorpe @ 2020-09-17 12:33 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Doug Ledford, Leon Romanovsky, Eli Cohen, linux-kernel,
linux-rdma, Roland Dreier
On Wed, Sep 02, 2020 at 11:11:14AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> [...]
Applied to for-next
Jason