* [PATCH 1/1] RDMA/rxe: Fetch skb packets from ethernet layer
@ 2020-11-05 11:12 Zhu Yanjun
  2020-11-07 20:26 ` Jakub Kicinski
  0 siblings, 1 reply; 6+ messages in thread
From: Zhu Yanjun @ 2020-11-05 11:12 UTC (permalink / raw)
  To: yanjunz, dledford, jgg, linux-rdma, netdev

In the original design, on the rx path an skb passes through the
Ethernet layer and the IP layer before it eventually reaches the
UDP tunnel.

Now rxe fetches skbs from the Ethernet layer directly, bypassing
the IP and UDP layers. As such, the skbs are delivered to the upper
protocols directly from the Ethernet layer.

This increases bandwidth and decreases latency.

Signed-off-by: Zhu Yanjun <yanjunz@nvidia.com>
---
 drivers/infiniband/sw/rxe/rxe_net.c |   45 ++++++++++++++++++++++++++++++++++-
 1 files changed, 44 insertions(+), 1 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c
index 2e490e5..8ea68b6 100644
--- a/drivers/infiniband/sw/rxe/rxe_net.c
+++ b/drivers/infiniband/sw/rxe/rxe_net.c
@@ -18,6 +18,7 @@
 #include "rxe_loc.h"
 
 static struct rxe_recv_sockets recv_sockets;
+static struct net_device *g_ndev;
 
 struct device *rxe_dma_device(struct rxe_dev *rxe)
 {
@@ -113,7 +114,7 @@ static int rxe_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
 	}
 
 	tnl_cfg.encap_type = 1;
-	tnl_cfg.encap_rcv = rxe_udp_encap_recv;
+	tnl_cfg.encap_rcv = NULL;
 
 	/* Setup UDP tunnel */
 	setup_udp_tunnel_sock(net, sock, &tnl_cfg);
@@ -357,6 +358,38 @@ struct sk_buff *rxe_init_packet(struct rxe_dev *rxe, struct rxe_av *av,
 	return rxe->ndev->name;
 }
 
+static rx_handler_result_t rxe_handle_frame(struct sk_buff **pskb)
+{
+	struct sk_buff *skb = *pskb;
+	struct iphdr *iph;
+	struct udphdr *udph;
+
+	if (unlikely(skb->pkt_type == PACKET_LOOPBACK))
+		return RX_HANDLER_PASS;
+
+	if (!is_valid_ether_addr(eth_hdr(skb)->h_source)) {
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+
+	if (eth_hdr(skb)->h_proto != cpu_to_be16(ETH_P_IP))
+		return RX_HANDLER_PASS;
+
+	iph = ip_hdr(skb);
+
+	if (iph->protocol != IPPROTO_UDP)
+		return RX_HANDLER_PASS;
+
+	udph = udp_hdr(skb);
+
+	if (udph->dest != cpu_to_be16(ROCE_V2_UDP_DPORT))
+		return RX_HANDLER_PASS;
+
+	rxe_udp_encap_recv(NULL, skb);
+
+	return RX_HANDLER_CONSUMED;
+}
+
 int rxe_net_add(const char *ibdev_name, struct net_device *ndev)
 {
 	int err;
@@ -367,6 +400,7 @@ int rxe_net_add(const char *ibdev_name, struct net_device *ndev)
 		return -ENOMEM;
 
 	rxe->ndev = ndev;
+	g_ndev = ndev;
 
 	err = rxe_add(rxe, ndev->mtu, ibdev_name);
 	if (err) {
@@ -374,6 +408,12 @@ int rxe_net_add(const char *ibdev_name, struct net_device *ndev)
 		return err;
 	}
 
+	rtnl_lock();
+	err = netdev_rx_handler_register(ndev, rxe_handle_frame, rxe);
+	rtnl_unlock();
+	if (err)
+		return err;
+
 	return 0;
 }
 
@@ -498,6 +538,9 @@ static int rxe_net_ipv6_init(void)
 
 void rxe_net_exit(void)
 {
+	rtnl_lock();
+	netdev_rx_handler_unregister(g_ndev);
+	rtnl_unlock();
 	rxe_release_udp_tunnel(recv_sockets.sk6);
 	rxe_release_udp_tunnel(recv_sockets.sk4);
 	unregister_netdevice_notifier(&rxe_net_notifier);
-- 
1.7.1
* Re: [PATCH 1/1] RDMA/rxe: Fetch skb packets from ethernet layer
  2020-11-05 11:12 [PATCH 1/1] RDMA/rxe: Fetch skb packets from ethernet layer Zhu Yanjun
@ 2020-11-07 20:26 ` Jakub Kicinski
       [not found]   ` <222b9c1b-9d60-22f3-6097-8abd651cc192@gmail.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Jakub Kicinski @ 2020-11-07 20:26 UTC (permalink / raw)
  To: Zhu Yanjun; +Cc: dledford, jgg, linux-rdma, netdev
On Thu,  5 Nov 2020 19:12:01 +0800 Zhu Yanjun wrote:
> In the original design, in rx, skb packet would pass ethernet
> layer and IP layer, eventually reach udp tunnel.
> 
> Now rxe fetches the skb packets from the ethernet layer directly.
> So this bypasses the IP and UDP layer. As such, the skb packets
> are sent to the upper protocols directly from the ethernet layer.
> 
> This increases bandwidth and decreases latency.
> 
> Signed-off-by: Zhu Yanjun <yanjunz@nvidia.com>
Nope, no stealing UDP packets with some random rx handlers.
The tunnel socket is a correct approach.
* Re: [PATCH 1/1] RDMA/rxe: Fetch skb packets from ethernet layer
       [not found]   ` <222b9c1b-9d60-22f3-6097-8abd651cc192@gmail.com>
@ 2020-11-08  5:27     ` Zhu Yanjun
  2020-11-09 18:25       ` Jakub Kicinski
  0 siblings, 1 reply; 6+ messages in thread
From: Zhu Yanjun @ 2020-11-08  5:27 UTC (permalink / raw)
  To: zyjzyj2000@gmail.com, kuba, Doug Ledford, Jason Gunthorpe,
	linux-rdma, netdev
On Sun, Nov 8, 2020 at 1:24 PM Zhu Yanjun <zyjzyj2000@gmail.com> wrote:
>
>
>
>
> -------- Forwarded Message --------
> Subject: Re: [PATCH 1/1] RDMA/rxe: Fetch skb packets from ethernet layer
> Date: Sat, 7 Nov 2020 12:26:17 -0800
> From: Jakub Kicinski <kuba@kernel.org>
> To: Zhu Yanjun <yanjunz@nvidia.com>
> CC: dledford@redhat.com, jgg@ziepe.ca, linux-rdma@vger.kernel.org, netdev@vger.kernel.org
>
>
> On Thu, 5 Nov 2020 19:12:01 +0800 Zhu Yanjun wrote:
>
> In the original design, in rx, skb packet would pass ethernet
> layer and IP layer, eventually reach udp tunnel.
>
> Now rxe fetches the skb packets from the ethernet layer directly.
> So this bypasses the IP and UDP layer. As such, the skb packets
> are sent to the upper protocols directly from the ethernet layer.
>
> This increases bandwidth and decreases latency.
>
> Signed-off-by: Zhu Yanjun <yanjunz@nvidia.com>
>
>
> Nope, no stealing UDP packets with some random rx handlers.
Why? Are there any risks?
Zhu Yanjun
>
> The tunnel socket is a correct approach.
* Re: [PATCH 1/1] RDMA/rxe: Fetch skb packets from ethernet layer
  2020-11-08  5:27     ` Zhu Yanjun
@ 2020-11-09 18:25       ` Jakub Kicinski
  2020-11-10  1:58         ` Zhu Yanjun
  0 siblings, 1 reply; 6+ messages in thread
From: Jakub Kicinski @ 2020-11-09 18:25 UTC (permalink / raw)
  To: Zhu Yanjun; +Cc: Doug Ledford, Jason Gunthorpe, linux-rdma, netdev
On Sun, 8 Nov 2020 13:27:32 +0800 Zhu Yanjun wrote:
> On Sun, Nov 8, 2020 at 1:24 PM Zhu Yanjun <zyjzyj2000@gmail.com> wrote:
> > On Thu, 5 Nov 2020 19:12:01 +0800 Zhu Yanjun wrote:
> >
> > In the original design, in rx, skb packet would pass ethernet
> > layer and IP layer, eventually reach udp tunnel.
> >
> > Now rxe fetches the skb packets from the ethernet layer directly.
> > So this bypasses the IP and UDP layer. As such, the skb packets
> > are sent to the upper protocols directly from the ethernet layer.
> >
> > This increases bandwidth and decreases latency.
> >
> > Signed-off-by: Zhu Yanjun <yanjunz@nvidia.com>
> >
> >
> > Nope, no stealing UDP packets with some random rx handlers.  
> 
> Why? Are there any risks?
Are there risks in layering violations? Yes.
For example - you do absolutely no protocol parsing, checksum
validation, only support IPv4, etc.
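
To illustrate the parsing and checksum point, here is a minimal
userspace sketch (not kernel code, and not taken from rxe) of the kind
of IPv4/UDP checks that a handler hooked in at the Ethernet layer would
have to repeat once the IP and UDP layers are bypassed. The helper names
ipv4_hdr_csum() and looks_like_rocev2() are hypothetical. Only the value
of ROCE_V2_UDP_DPORT (4791) comes from the kernel. The sketch simply
rejects fragments, where the real stack would reassemble them, and it
does not look at the UDP checksum at all.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ROCE_V2_UDP_DPORT 4791	/* RoCEv2 UDP destination port */

/* RFC 1071 one's-complement sum; returns 0 for a header with a valid checksum. */
static uint16_t ipv4_hdr_csum(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)hdr[i] << 8 | hdr[i + 1];
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Hypothetical helper: buf points at the IPv4 header, len is the bytes available. */
static bool looks_like_rocev2(const uint8_t *buf, size_t len)
{
	size_t ihl, tot_len;
	uint16_t dport;

	if (len < 20 || (buf[0] >> 4) != 4)	/* truncated or not IPv4 */
		return false;
	ihl = (size_t)(buf[0] & 0x0f) * 4;	/* header length in bytes */
	if (ihl < 20 || ihl > len)
		return false;
	tot_len = (size_t)buf[2] << 8 | buf[3];
	if (tot_len < ihl + 8 || tot_len > len)	/* must hold a UDP header */
		return false;
	if (ipv4_hdr_csum(buf, ihl) != 0)	/* corrupted IP header */
		return false;
	if (((buf[6] & 0x3f) | buf[7]) != 0)	/* MF set or offset != 0 */
		return false;
	if (buf[9] != 17)			/* IPPROTO_UDP */
		return false;
	dport = (uint16_t)(buf[ihl + 2] << 8 | buf[ihl + 3]);
	return dport == ROCE_V2_UDP_DPORT;
}
```

The posted rxe_handle_frame() performs only the protocol and port
comparisons; everything else in the sketch is work that is skipped when
the handler runs ahead of the IP layer.
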
Besides it also makes the code far less maintainable, rx_handler is a
singleton, etc. etc.
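
The singleton remark can be pictured with a small userspace model,
offered as a sketch under stated assumptions rather than the kernel's
implementation. It assumes one handler slot per device and a
registration call that fails with -EBUSY when the slot is already taken,
which is how netdev_rx_handler_register() behaves. The toy_* names and
the two dummy handlers are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for struct net_device: exactly one rx handler slot. */
struct toy_netdev {
	int (*rx_handler)(void *pkt);
	void *rx_handler_data;
};

/* Refuse a second registration, as the real per-device slot does. */
static int toy_rx_handler_register(struct toy_netdev *dev,
				   int (*handler)(void *pkt), void *data)
{
	if (dev->rx_handler)
		return -EBUSY;
	dev->rx_handler = handler;
	dev->rx_handler_data = data;
	return 0;
}

static void toy_rx_handler_unregister(struct toy_netdev *dev)
{
	dev->rx_handler = NULL;
	dev->rx_handler_data = NULL;
}

/* Dummy handlers standing in for a bridge-style user and an rxe-style user. */
static int bridge_like_handler(void *pkt) { (void)pkt; return 0; }
static int rxe_like_handler(void *pkt) { (void)pkt; return 1; }
```

In this model, a device whose slot is already held by one user cannot
accept a second handler until the first one is unregistered.
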
> > The tunnel socket is a correct approach.  
* Re: [PATCH 1/1] RDMA/rxe: Fetch skb packets from ethernet layer
  2020-11-09 18:25       ` Jakub Kicinski
@ 2020-11-10  1:58         ` Zhu Yanjun
  2020-11-11 11:15           ` Zhu Yanjun
  0 siblings, 1 reply; 6+ messages in thread
From: Zhu Yanjun @ 2020-11-10  1:58 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Doug Ledford, Jason Gunthorpe, linux-rdma, netdev
On Tue, Nov 10, 2020 at 2:25 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sun, 8 Nov 2020 13:27:32 +0800 Zhu Yanjun wrote:
> > On Sun, Nov 8, 2020 at 1:24 PM Zhu Yanjun <zyjzyj2000@gmail.com> wrote:
> > > On Thu, 5 Nov 2020 19:12:01 +0800 Zhu Yanjun wrote:
> > >
> > > In the original design, in rx, skb packet would pass ethernet
> > > layer and IP layer, eventually reach udp tunnel.
> > >
> > > Now rxe fetches the skb packets from the ethernet layer directly.
> > > So this bypasses the IP and UDP layer. As such, the skb packets
> > > are sent to the upper protocols directly from the ethernet layer.
> > >
> > > This increases bandwidth and decreases latency.
> > >
> > > Signed-off-by: Zhu Yanjun <yanjunz@nvidia.com>
> > >
> > >
> > > Nope, no stealing UDP packets with some random rx handlers.
> >
> > Why? Are there any risks?
>
> Are there risks in layering violations? Yes.
>
> For example - you do absolutely no protocol parsing,
Protocol parsing is in rxe driver.
> checksum validation, only support IPv4, etc.
That is because only IPv4 is supported in rxe. If IPv6 is supported in
rxe, I will add IPv6.
>
> Besides it also makes the code far less maintainable, rx_handler is a
This rx_handler is also used in openvswitch and bridge.
Zhu Yanjun
> singleton, etc. etc.
>
> > > The tunnel socket is a correct approach.
* Re: [PATCH 1/1] RDMA/rxe: Fetch skb packets from ethernet layer
  2020-11-10  1:58         ` Zhu Yanjun
@ 2020-11-11 11:15           ` Zhu Yanjun
  0 siblings, 0 replies; 6+ messages in thread
From: Zhu Yanjun @ 2020-11-11 11:15 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Doug Ledford, Jason Gunthorpe, linux-rdma, netdev
On Tue, Nov 10, 2020 at 9:58 AM Zhu Yanjun <zyjzyj2000@gmail.com> wrote:
>
> On Tue, Nov 10, 2020 at 2:25 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Sun, 8 Nov 2020 13:27:32 +0800 Zhu Yanjun wrote:
> > > On Sun, Nov 8, 2020 at 1:24 PM Zhu Yanjun <zyjzyj2000@gmail.com> wrote:
> > > > On Thu, 5 Nov 2020 19:12:01 +0800 Zhu Yanjun wrote:
> > > >
> > > > In the original design, in rx, skb packet would pass ethernet
> > > > layer and IP layer, eventually reach udp tunnel.
> > > >
> > > > Now rxe fetches the skb packets from the ethernet layer directly.
> > > > So this bypasses the IP and UDP layer. As such, the skb packets
> > > > are sent to the upper protocols directly from the ethernet layer.
> > > >
> > > > This increases bandwidth and decreases latency.
> > > >
> > > > Signed-off-by: Zhu Yanjun <yanjunz@nvidia.com>
> > > >
> > > >
> > > > Nope, no stealing UDP packets with some random rx handlers.
> > >
> > > Why? Are there any risks?
> >
> > Are there risks in layering violations? Yes.
> >
> > For example - you do absolutely no protocol parsing,
>
> Protocol parsing is in rxe driver.
>
> > checksum validation, only support IPv4, etc.
>
> That is because only IPv4 is supported in rxe. If IPv6 is supported in
> rxe, I will add IPv6.
>
> >
> > Besides it also makes the code far less maintainable, rx_handler is a
>
> This rx_handler is also used in openvswitch and bridge.
I am on vacation. I will reply as soon as I come back.
Zhu Yanjun
>
> Zhu Yanjun
>
> > singleton, etc. etc.
> >
> > > > The tunnel socket is a correct approach.