public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
From: Edward Srouji <edwards@nvidia.com>
To: <jgg@ziepe.ca>, <leon@kernel.org>
Cc: <linux-rdma@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<parav@nvidia.com>, <cratiu@nvidia.com>, <vdumitrescu@nvidia.com>,
	<edwards@nvidia.com>, <kuba@kernel.org>, <tariqt@nvidia.com>,
	<mbloch@nvidia.com>, <gal@nvidia.com>, <idosch@nvidia.com>
Subject: [PATCH v1 3/4] RDMA/core: Use route entry flag to decide on loopback traffic
Date: Tue, 16 Sep 2025 14:11:02 +0300	[thread overview]
Message-ID: <20250916111103.84069-4-edwards@nvidia.com> (raw)
In-Reply-To: <20250916111103.84069-1-edwards@nvidia.com>

From: Parav Pandit <parav@nvidia.com>

addr_resolve() considers a destination to be local if the next-hop
device of the resolved route for the destination is the loopback
netdevice.

This fails when the source and destination IP addresses belong to
a netdev enslaved to a VRF netdev. In this case the next-hop device
is the VRF itself:

 $ ip link add name myvrf up type vrf table 100
 $ ip link set ens2f0np0 master myvrf up
 $ ip addr add 192.168.1.1/24 dev ens2f0np0
 $ ip route get 192.168.1.1 oif myvrf
 local 192.168.1.1 dev myvrf table 100 src 192.168.1.1 uid 0
    cache <local>

This results in packets being generated with an incorrect destination
MAC of the VRF netdevice and ib_write_bw failing with timeout.

Solve this by determining if a destination is local or not based on
the resolved route's type rather than based on its next-hop netdevice
loopback flag.

This enables to resolve loopback traffic with and without VRF
configurations in a uniform way.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
---
 drivers/infiniband/core/addr.c | 34 +++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index ca86c482662f..61596cda2b65 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -446,31 +446,40 @@ static int addr6_resolve(struct sockaddr *src_sock,
 }
 #endif
 
+static bool is_dst_local(const struct dst_entry *dst)
+{
+	if (dst->ops->family == AF_INET)
+		return !!(dst_rtable(dst)->rt_type & RTN_LOCAL);
+	else if (dst->ops->family == AF_INET6)
+		return !!(dst_rt6_info(dst)->rt6i_flags & RTF_LOCAL);
+	else
+		return false;
+}
+
 static int addr_resolve_neigh(const struct dst_entry *dst,
 			      const struct sockaddr *dst_in,
 			      struct rdma_dev_addr *addr,
-			      unsigned int ndev_flags,
 			      u32 seq)
 {
-	int ret = 0;
-
-	if (ndev_flags & IFF_LOOPBACK)
+	if (is_dst_local(dst)) {
+		/* When the destination is local entry, source and destination
+		 * are same. Skip the neighbour lookup.
+		 */
 		memcpy(addr->dst_dev_addr, addr->src_dev_addr, MAX_ADDR_LEN);
-	else
-		ret = fetch_ha(dst, addr, dst_in, seq);
-	return ret;
+		return 0;
+	}
+
+	return fetch_ha(dst, addr, dst_in, seq);
 }
 
 static int rdma_set_src_addr_rcu(struct rdma_dev_addr *dev_addr,
-				 unsigned int *ndev_flags,
 				 const struct sockaddr *dst_in,
 				 const struct dst_entry *dst)
 {
 	struct net_device *ndev = READ_ONCE(dst->dev);
 
-	*ndev_flags = ndev->flags;
 	/* A physical device must be the RDMA device to use */
-	if (ndev->flags & IFF_LOOPBACK) {
+	if (is_dst_local(dst)) {
 		int ret;
 		/*
 		 * RDMA (IB/RoCE, iWarp) doesn't run on lo interface or
@@ -538,7 +547,6 @@ static int addr_resolve(struct sockaddr *src_in,
 			u32 seq)
 {
 	struct dst_entry *dst = NULL;
-	unsigned int ndev_flags = 0;
 	struct rtable *rt = NULL;
 	int ret;
 
@@ -575,7 +583,7 @@ static int addr_resolve(struct sockaddr *src_in,
 		rcu_read_unlock();
 		goto done;
 	}
-	ret = rdma_set_src_addr_rcu(addr, &ndev_flags, dst_in, dst);
+	ret = rdma_set_src_addr_rcu(addr, dst_in, dst);
 	rcu_read_unlock();
 
 	/*
@@ -583,7 +591,7 @@ static int addr_resolve(struct sockaddr *src_in,
 	 * only if src addr translation didn't fail.
 	 */
 	if (!ret && resolve_neigh)
-		ret = addr_resolve_neigh(dst, dst_in, addr, ndev_flags, seq);
+		ret = addr_resolve_neigh(dst, dst_in, addr, seq);
 
 	if (src_in->sa_family == AF_INET)
 		ip_rt_put(rt);
-- 
2.21.3


  parent reply	other threads:[~2025-09-16 11:11 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-07 16:08 [PATCH 0/4] Fix local destination address resolution with VRF Edward Srouji
2025-09-07 16:08 ` [PATCH 1/4] RDMA/core: Squash a single user static function Edward Srouji
2025-09-10  8:17   ` Leon Romanovsky
2025-09-10 10:51     ` Parav Pandit
2025-09-10 12:21       ` Leon Romanovsky
2025-09-07 16:08 ` [PATCH 2/4] RDMA/core: Resolve MAC of next-hop device without ARP support Edward Srouji
2025-09-10  8:32   ` Leon Romanovsky
2025-09-10 10:55     ` Parav Pandit
2025-09-15 16:30       ` Jason Gunthorpe
2025-09-15 17:16         ` Parav Pandit
2025-09-07 16:08 ` [PATCH 3/4] RDMA/core: Use route entry flag to decide on loopback traffic Edward Srouji
2025-09-07 16:08 ` [PATCH 4/4] IB/ipoib: Ignore L3 master device Edward Srouji
2025-09-16 11:10 ` [PATCH v1 0/4] Fix local destination address resolution with VRF Edward Srouji
2025-09-16 11:11   ` [PATCH v1 1/4] RDMA/core: Squash a single user static function Edward Srouji
2025-09-16 11:11   ` [PATCH v1 2/4] RDMA/core: Resolve MAC of next-hop device without ARP support Edward Srouji
2025-09-16 11:11   ` Edward Srouji [this message]
2025-09-16 11:11   ` [PATCH v1 4/4] IB/ipoib: Ignore L3 master device Edward Srouji
2025-09-18  9:24   ` [PATCH v1 0/4] Fix local destination address resolution with VRF Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250916111103.84069-4-edwards@nvidia.com \
    --to=edwards@nvidia.com \
    --cc=cratiu@nvidia.com \
    --cc=gal@nvidia.com \
    --cc=idosch@nvidia.com \
    --cc=jgg@ziepe.ca \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=mbloch@nvidia.com \
    --cc=parav@nvidia.com \
    --cc=tariqt@nvidia.com \
    --cc=vdumitrescu@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox