The Linux Kernel Mailing List
* [PATCH rdma-next v1 0/2] RDMA: fix cross-NIC same-host IPv6 RDMA-CM connect
@ 2026-06-15 17:46 Alex Timofeyev
  2026-06-15 17:46 ` [PATCH rdma-next v1 1/2] RDMA/core: use destination netdev MAC for cross-NIC same-host local dst Alex Timofeyev
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Alex Timofeyev @ 2026-06-15 17:46 UTC (permalink / raw)
  To: Jason Gunthorpe, Leon Romanovsky, linux-rdma
  Cc: Parav Pandit, Edward Srouji, Vlad Dumitrescu, stable,
	linux-kernel

RDMA-CM cannot establish an IPv6 RoCEv2 connection between two NICs that
live on the same host. This shows up on hosts that pin one process per
NUMA-local NIC and let those processes talk to each other over each NIC's
global IPv6 GID (e.g. a storage daemon with one engine per NUMA node on
dual ConnectX-7). rdma_resolve_addr() and ib_send_cm_req() both return
success, but the destination NIC silently drops the frame and the peer
never sees the REQ; the connection times out.

The bug has two halves, one on each side of the connection:

1) Send side (patch 1, drivers/infiniband/core/addr.c)

   When the destination address is local, addr_resolve_neigh() copies the
   *source* device's MAC into the path record's destination MAC. That is
   right for true loopback (same netdev), but for a destination that lives
   on a different netdev of the same host the destination NIC will not
   accept a frame addressed to the source NIC's MAC and drops it in HW.
   The fix resolves the netdev that owns the destination address and uses
   its MAC.

2) Receive side (patch 2, drivers/infiniband/core/cma.c)

   Once the REQ does reach the peer, validate_ipv6_net_dev() rejects it:
   rt6_lookup() of a same-host destination collapses onto the loopback
   netdev, so the strict rt6i_idev->dev == net_dev check fails with
   -EHOSTUNREACH even though the REQ arrived on the right net_dev. The fix
   accepts an RTF_LOCAL route when net_dev itself owns the listener
   address. This half is only observable once patch 1 lets the REQ arrive.
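
For illustration only, the send-side half (item 1 above) can be modeled in
user space. This is a minimal sketch and not the patch itself: the
model_netdev struct, the model_* helpers, and the `fixed` switch are all
hypothetical names, and each modeled netdev carries exactly one IPv6
address. It shows only the choice between the source netdev's MAC and the
MAC of the netdev that owns the destination address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_ETH_ALEN 6

/* Hypothetical user-space stand-in for a netdev: one MAC, one IPv6 address. */
struct model_netdev {
	const char *name;
	uint8_t mac[MODEL_ETH_ALEN];
	uint8_t addr6[16];
};

/* Return the modeled netdev that owns addr6, or NULL if none does. */
static const struct model_netdev *
model_find_owner(const struct model_netdev *devs, size_t n,
		 const uint8_t addr6[16])
{
	for (size_t i = 0; i < n; i++)
		if (memcmp(devs[i].addr6, addr6, 16) == 0)
			return &devs[i];
	return NULL;
}

/*
 * Fill dst_mac for a destination that is local to this host.
 * fixed == false models the behaviour described above (always the
 * source netdev's MAC); fixed == true models using the MAC of the
 * netdev that owns the destination address.
 * Returns false when no modeled netdev owns dst6 (not a local dst).
 */
static bool model_resolve_local_dmac(const struct model_netdev *devs, size_t n,
				     const struct model_netdev *src,
				     const uint8_t dst6[16],
				     uint8_t dst_mac[MODEL_ETH_ALEN],
				     bool fixed)
{
	const struct model_netdev *owner = model_find_owner(devs, n, dst6);

	if (!owner)
		return false;

	memcpy(dst_mac, fixed ? owner->mac : src->mac, MODEL_ETH_ALEN);
	return true;
}
```

In the true loopback case the owner and the source are the same netdev, so
both variants of the model produce the same MAC; they differ only when the
destination address belongs to a different netdev on the same host.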

Both halves are needed for a working connection; patch 1 alone makes the
REQ reach the peer but it is then rejected by the unfixed receive side.
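
The receive-side half (item 2 above) can be modeled the same way. Again a
sketch under stated assumptions, not the kernel code: model_route,
MODEL_RTF_LOCAL (a stand-in bit, not the kernel's RTF_LOCAL value), and the
net_dev_owns_addr parameter are hypothetical. It contrasts the strict
"route device must equal net_dev" check with a relaxed check that also
accepts a local route when net_dev itself owns the listener address.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag bit for this model only; not the kernel's RTF_LOCAL value. */
#define MODEL_RTF_LOCAL 0x1u

/* Hypothetical result of a route lookup for the listener address. */
struct model_route {
	int dev_ifindex;	/* netdev the lookup resolved to */
	unsigned int flags;	/* MODEL_RTF_LOCAL if the route is local */
};

/* Strict check: the route must point at the netdev the REQ arrived on. */
static bool model_validate_strict(const struct model_route *rt,
				  int net_dev_ifindex)
{
	return rt != NULL && rt->dev_ifindex == net_dev_ifindex;
}

/*
 * Relaxed check: also accept a local route that resolved to another
 * device (for example loopback), provided the arrival netdev itself
 * owns the listener address.
 */
static bool model_validate_relaxed(const struct model_route *rt,
				   int net_dev_ifindex,
				   bool net_dev_owns_addr)
{
	if (rt == NULL)
		return false;
	if (rt->dev_ifindex == net_dev_ifindex)
		return true;
	return (rt->flags & MODEL_RTF_LOCAL) != 0 && net_dev_owns_addr;
}
```

With a route that resolved to the loopback device and carries the local
flag, the strict model rejects a REQ that arrived on a different netdev,
while the relaxed model accepts it only if that netdev owns the address.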

Verification
------------
Measured on two RoCEv2 ConnectX-7 ports on the same host, each with a
global IPv6 GID (port A "src", port B "dst"), driving a cross-NIC
RDMA-CM connect (rping, src GID on port A -> dst GID on port B) while
tracing the destination MAC resolved in addr_resolve():

  without the series:  resolved dst MAC = port A's MAC (the *source* NIC)
                        -> frame dropped, connect times out
  with the series:     resolved dst MAC = port B's MAC (the *dest* NIC)
                        -> connect completes

The kernel under test carried c31e4038c97f and its dst_rtable() prereq
(i.e. the same addr_resolve_neigh()/is_dst_local() shape as for-next);
the change applies unmodified to rdma.git for-next.

Note on stable: the Fixes: tags bound the backport to where each construct
exists in its current form. Trees predating c31e4038c97f have the
equivalent send-side gap in the older IFF_LOOPBACK form of
addr_resolve_neigh() and would need a separately shaped backport.

The two patches touch separate files and apply independently, but they
should be applied as a pair so the connection works end to end.

Alex Timofeyev (2):
  RDMA/core: use destination netdev MAC for cross-NIC same-host local
    dst
  RDMA/cma: accept cross-NIC same-host local dst in
    validate_ipv6_net_dev

 drivers/infiniband/core/addr.c | 22 +++++++++++++++++++---
 drivers/infiniband/core/cma.c  | 15 ++++++++++++++-
 2 files changed, 33 insertions(+), 4 deletions(-)


base-commit: 20ff9350862468af21b46cae2c22d17d6ec637f9
-- 
2.40.4




Thread overview: 4+ messages
2026-06-15 17:46 [PATCH rdma-next v1 0/2] RDMA: fix cross-NIC same-host IPv6 RDMA-CM connect Alex Timofeyev
2026-06-15 17:46 ` [PATCH rdma-next v1 1/2] RDMA/core: use destination netdev MAC for cross-NIC same-host local dst Alex Timofeyev
2026-06-15 17:46 ` [PATCH rdma-next v1 2/2] RDMA/cma: accept cross-NIC same-host local dst in validate_ipv6_net_dev Alex Timofeyev
2026-06-15 23:59 ` [PATCH rdma-next v1 0/2] RDMA: fix cross-NIC same-host IPv6 RDMA-CM connect Jason Gunthorpe
