* [PATCH rdma-next v1 2/2] RDMA/cma: accept cross-NIC same-host local dst in validate_ipv6_net_dev
2026-06-15 17:46 [PATCH rdma-next v1 0/2] RDMA: fix cross-NIC same-host IPv6 RDMA-CM connect Alex Timofeyev
@ 2026-06-15 17:46 ` Alex Timofeyev
2026-06-15 17:46 ` [PATCH rdma-next v1 1/2] RDMA/core: use destination netdev MAC for cross-NIC same-host local dst Alex Timofeyev
2026-06-15 23:59 ` [PATCH rdma-next v1 0/2] RDMA: fix cross-NIC same-host IPv6 RDMA-CM connect Jason Gunthorpe
2 siblings, 0 replies; 4+ messages in thread
From: Alex Timofeyev @ 2026-06-15 17:46 UTC (permalink / raw)
To: Jason Gunthorpe, Leon Romanovsky, linux-rdma
Cc: Parav Pandit, Edward Srouji, Vlad Dumitrescu, stable,
linux-kernel

validate_ipv6_net_dev() confirms an incoming CM REQ was delivered on the
correct net_dev with an rt6_lookup() that requires
rt->rt6i_idev->dev == net_dev. For an IPv6 destination that is local to a
different netdev of the same host, the FIB resolves the lookup onto the
loopback netdev, so rt6i_idev->dev is lo regardless of which physical
netdev owns the listener address. The strict comparison then rejects the
REQ with -EHOSTUNREACH even though it was correctly delivered on net_dev.

Accept the request when the resolved route is RTF_LOCAL and net_dev itself
owns the address the listener was bound to (src_addr). This is the
receive-side counterpart to the cross-NIC same-host send-side fix in
addr_resolve_neigh().

Fixes: f887f2ac87c2 ("IB/cma: Validate routing of incoming requests")
Cc: stable@vger.kernel.org
Cc: Parav Pandit <parav@nvidia.com>
Signed-off-by: Alex Timofeyev <sashka@ankey.net>
---
drivers/infiniband/core/cma.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 9480d1a51c11..872c57943362 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1635,7 +1635,20 @@ static bool validate_ipv6_net_dev(struct net_device *net_dev,
 	if (!rt)
 		return false;
 
-	ret = rt->rt6i_idev->dev == net_dev;
+	if (rt->rt6i_idev->dev == net_dev) {
+		ret = true;
+	} else if (rt->rt6i_flags & RTF_LOCAL) {
+		/* For a destination that is local to another netdev of the same
+		 * host, the FIB collapses the lookup onto the loopback netdev,
+		 * so rt6i_idev->dev is not net_dev even though the request was
+		 * correctly delivered on net_dev. Accept it when net_dev itself
+		 * owns the address we were listening on.
+		 */
+		ret = ipv6_chk_addr(dev_net(net_dev), &src_addr->sin6_addr,
+				    net_dev, 1);
+	} else {
+		ret = false;
+	}
 	ip6_rt_put(rt);
 
 	return ret;
--
2.40.4
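
The accept/reject decision described in this patch can be modelled in userspace. What follows is only a minimal sketch under stated assumptions: model_route, model_netdev and model_validate() are hypothetical stand-ins invented for illustration, not kernel types or APIs. The is_local field stands in for RTF_LOCAL being set on the resolved route, and owns_listener_addr stands in for the result of the ipv6_chk_addr() call in the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_route {
	int dev_ifindex;	/* netdev the route lookup resolved to */
	bool is_local;		/* models RTF_LOCAL on the resolved route */
};

struct model_netdev {
	int ifindex;
	bool owns_listener_addr; /* models ipv6_chk_addr(..., net_dev, 1) */
};

/* Mirrors the three branches of the patched check. */
static bool model_validate(const struct model_route *rt,
			   const struct model_netdev *net_dev)
{
	assert(net_dev != NULL);

	if (!rt)
		return false;		/* no route at all: reject */
	if (rt->dev_ifindex == net_dev->ifindex)
		return true;		/* strict match, as before the patch */
	if (rt->is_local)
		return net_dev->owns_listener_addr; /* new cross-NIC case */
	return false;
}
```

In this model a route that resolved to loopback (a different ifindex, is_local set) is accepted only when the receiving netdev owns the listener address; a non-local route on another device is still rejected, as it was before the change.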
* [PATCH rdma-next v1 1/2] RDMA/core: use destination netdev MAC for cross-NIC same-host local dst
From: Alex Timofeyev @ 2026-06-15 17:46 UTC (permalink / raw)
To: Jason Gunthorpe, Leon Romanovsky, linux-rdma
Cc: Parav Pandit, Edward Srouji, Vlad Dumitrescu, stable,
linux-kernel

addr_resolve_neigh() treats every is_dst_local() destination as loopback
and copies the source device's MAC into the path record's destination MAC
(dst_dev_addr <- src_dev_addr). That is correct for true loopback (source
and destination on the same netdev), but wrong when the local destination
address lives on a different netdev of the same host.

In that cross-NIC same-host case the destination NIC will not accept a
frame whose destination MAC is the source NIC's MAC, and drops it in
hardware before it reaches the peer. rdma_resolve_addr() and
ib_send_cm_req() both return success, but the CM REQ never arrives and the
connection times out.

Look up the netdev that owns the destination address and copy its MAC into
dst_dev_addr instead. Fall back to the source MAC when no netdev claims the
address (true loopback), preserving the existing behaviour.

This was observed with two RoCEv2 ConnectX-7 ports on the same host, each
holding a global IPv6 GID, with one process pinned to each NUMA-local NIC.
When one of those processes connected to the other over RDMA-CM, the
resolved destination MAC was the source port's MAC instead of the
destination port's, and the REQ was silently dropped. With the fix the
resolved MAC is the destination port's and the connection completes.

Fixes: c31e4038c97f ("RDMA/core: Use route entry flag to decide on loopback traffic")
Cc: stable@vger.kernel.org
Cc: Parav Pandit <parav@nvidia.com>
Signed-off-by: Alex Timofeyev <sashka@ankey.net>
---
drivers/infiniband/core/addr.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index 7e62b5b1ffaa..84aa43436bfe 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -451,10 +451,26 @@ static int addr_resolve_neigh(const struct dst_entry *dst,
 			      u32 seq)
 {
 	if (is_dst_local(dst)) {
-		/* When the destination is local entry, source and destination
-		 * are same. Skip the neighbour lookup.
+		struct net_device *dst_ndev;
+
+		/* When the destination is local, source and destination are on
+		 * the same host. For true loopback (same netdev) the source and
+		 * destination MACs are equal, but when the destination address
+		 * lives on a different netdev of the same host the destination
+		 * MAC must be that netdev's MAC -- otherwise the destination NIC
+		 * silently drops the frame. Look up the netdev that owns the
+		 * destination address and copy its MAC; fall back to the source
+		 * MAC if no netdev claims the address.
 		 */
-		memcpy(addr->dst_dev_addr, addr->src_dev_addr, MAX_ADDR_LEN);
+		rcu_read_lock();
+		dst_ndev = rdma_find_ndev_for_src_ip_rcu(dev_net(dst->dev), dst_in);
+		if (!IS_ERR(dst_ndev))
+			memcpy(addr->dst_dev_addr, dst_ndev->dev_addr,
+			       MAX_ADDR_LEN);
+		else
+			memcpy(addr->dst_dev_addr, addr->src_dev_addr,
+			       MAX_ADDR_LEN);
+		rcu_read_unlock();
 		return 0;
 	}
 
--
2.40.4
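
The MAC selection this patch proposes (use the MAC of the netdev that owns the destination address, otherwise fall back to the source MAC) can be illustrated with a small userspace model. This is a sketch under assumptions, not the kernel implementation: model_port, model_find_owner() and model_pick_dst_mac() are hypothetical names, addresses are compared as plain strings, and MODEL_MAC_LEN is a 6-byte Ethernet MAC rather than the kernel's MAX_ADDR_LEN.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAC_LEN 6	/* Ethernet MAC length; assumption for this model */

/* Hypothetical stand-in for a netdev with one address and one MAC. */
struct model_port {
	const char *ip;				/* address bound to the port */
	unsigned char mac[MODEL_MAC_LEN];
};

/* Return the port that owns dst_ip, or NULL when none does. */
static const struct model_port *
model_find_owner(const struct model_port *ports, size_t n, const char *dst_ip)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(ports[i].ip, dst_ip) == 0)
			return &ports[i];
	return NULL;
}

/* Pick the destination MAC for a same-host destination: the owner's MAC
 * when a port owns dst_ip, else the source port's MAC (the old behaviour).
 */
static void model_pick_dst_mac(const struct model_port *ports, size_t n,
			       const struct model_port *src,
			       const char *dst_ip,
			       unsigned char out[MODEL_MAC_LEN])
{
	const struct model_port *owner;

	assert(src != NULL && out != NULL);
	owner = model_find_owner(ports, n, dst_ip);
	memcpy(out, owner ? owner->mac : src->mac, MODEL_MAC_LEN);
}
```

With two ports in the table, a destination address owned by the second port yields the second port's MAC, while an address owned by the source port itself, or by no port, yields the source MAC, matching the fallback described in the commit message.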
* Re: [PATCH rdma-next v1 0/2] RDMA: fix cross-NIC same-host IPv6 RDMA-CM connect
From: Jason Gunthorpe @ 2026-06-15 23:59 UTC (permalink / raw)
To: Alex Timofeyev
Cc: Leon Romanovsky, linux-rdma, Parav Pandit, Edward Srouji,
Vlad Dumitrescu, linux-kernel
On Mon, Jun 15, 2026 at 05:46:19PM +0000, Alex Timofeyev wrote:
> RDMA-CM cannot establish an IPv6 RoCEv2 connection between two NICs that
> live on the same host. This shows up on hosts that pin one process per
> NUMA-local NIC and let those processes talk to each other over each NIC's
> global IPv6 GID (e.g. a storage daemon with one engine per NUMA node on
> dual ConnectX-7). rdma_resolve_addr() and ib_send_cm_req() both return
> success, but the destination NIC silently drops the frame and the peer
> never sees the REQ; the connection times out.
>
> The bug has two halves, one on each side of the connection:
>
> 1) Send side (patch 1, drivers/infiniband/core/addr.c)
>
> When the destination address is local, addr_resolve_neigh() copies the
> *source* device's MAC into the path record's destination MAC. That is
> right for true loopback (same netdev), but for a destination that lives
> on a different netdev of the same host the destination NIC will not
> accept a frame addressed to the source NIC's MAC and drops it in HW.
> The fix resolves the netdev that owns the destination address and uses
> its MAC.

I'm not sure about this. You need policy routing or a VRF set up so that
these local routes don't show up. Do you have that?

A local route result should only ever produce a local loopback AH; it
should never put a packet on the wire, and we shouldn't be trying to
mangle loopback routes at all.

> 2) Receive side (patch 2, drivers/infiniband/core/cma.c)
>
> Once the REQ does reach the peer, validate_ipv6_net_dev() rejects it:
> rt6_lookup() of a same-host destination collapses onto the loopback
> netdev, so the strict rt6i_idev->dev == net_dev check fails with
> -EHOSTUNREACH even though the REQ arrived on the right net_dev. The fix
> accepts an RTF_LOCAL route when net_dev itself owns the listener
> address. This half is only observable once patch 1 lets the REQ
> arrive.

Same answer here: with proper routing you won't match a loopback route
and you won't fail this check. Removing the check does not seem correct.

Jason