netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Duyck <alexander.duyck@gmail.com>
To: Shrijeet Mukherjee <shm@cumulusnetworks.com>,
	hannes@stressinduktion.org, nicolas.dichtel@6wind.com,
	dsahern@gmail.com, ebiederm@xmission.com, hadi@mojatatu.com,
	davem@davemloft.net, stephen@networkplumber.org,
	netdev@vger.kernel.org
Cc: roopa@cumulusnetworks.com, gospo@cumulusnetworks.com,
	jtoppins@cumulusnetworks.com, nikolay@cumulusnetworks.com
Subject: Re: [RFC net-next 3/3] rcv path changes for vrf traffic
Date: Wed, 10 Jun 2015 11:31:57 -0700	[thread overview]
Message-ID: <5578829D.9030202@gmail.com> (raw)
In-Reply-To: <87eae7a7a03708bda5560a5ea43b0fcac705c80d.1433561681.git.shm@cumulusnetworks.com>

On 06/08/2015 11:35 AM, Shrijeet Mukherjee wrote:
> From: Shrijeet Mukherjee <shm@cumulusnetworks.com>
>
> Incoming frames for IP protocol stacks need the IIF to be changed
> from the actual interface to the VRF device. This allows the IIF
> rule to be used to select tables (or do regular PBR)
>
> This change selects the iif to be the VRF device if it exists and
> the incoming iif is enslaved to the VRF device.
>
> Since VRF aware sockets are always bound to the VRF device this
> system allows return traffic to find the socket of origin.
>
> changes are in the arp_rcv, icmp_rcv and ip_rcv paths
>
> Question : I did not wrap the rcv modifications, in CONFIG_NET_VRF
> as it would create code variations and the vrf_ptr check is there
> I can make that whole thing modular.
>
> Signed-off-by: Shrijeet Mukherjee <shm@cumulusnetworks.com>
> ---
>   net/ipv4/fib_frontend.c |   14 +++++++++++---
>   net/ipv4/fib_trie.c     |    7 +++++--
>   net/ipv4/icmp.c         |    6 ++++++
>   net/ipv4/route.c        |    3 ++-
>   4 files changed, 24 insertions(+), 6 deletions(-)
>
> diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
> index 9d4cef4..cf2d584 100644
> --- a/net/ipv4/fib_frontend.c
> +++ b/net/ipv4/fib_frontend.c
> @@ -45,6 +45,7 @@
>   #include <net/ip_fib.h>
>   #include <net/rtnetlink.h>
>   #include <net/xfrm.h>
> +#include <net/vrf.h>
>
>   #ifndef CONFIG_IP_MULTIPLE_TABLES
>
> @@ -218,15 +219,18 @@ static inline unsigned int __inet_dev_addr_type(struct net *net,
>   	struct fib_result	res;
>   	unsigned int ret = RTN_BROADCAST;
>   	struct fib_table *local_table;
> +	int rt_table = RT_TABLE_LOCAL;
>
>   	if (ipv4_is_zeronet(addr) || ipv4_is_lbcast(addr))
>   		return RTN_BROADCAST;
>   	if (ipv4_is_multicast(addr))
>   		return RTN_MULTICAST;
>
> -	rcu_read_lock();
> +	if (dev && dev->vrf_ptr)
> +		rt_table =  dev->vrf_ptr->tb_id;
>
> -	local_table = fib_get_table(net, RT_TABLE_LOCAL);
> +	rcu_read_lock();
> +	local_table = fib_get_table(net, rt_table);
>   	if (local_table) {
>   		ret = RTN_UNICAST;
>   		if (!fib_table_lookup(local_table, &fl4, &res, FIB_LOOKUP_NOREF)) {

I would prefer it if you left a blank line between rcu_read_lock and the 
local_table assignment line.  It would help to make this patch and the 
code in general a bit more readable.

> @@ -309,7 +313,11 @@ static int __fib_validate_source(struct sk_buff *skb, __be32 src, __be32 dst,
>   	bool dev_match;
>
>   	fl4.flowi4_oif = 0;
> -	fl4.flowi4_iif = oif ? : LOOPBACK_IFINDEX;
> +	if (dev->vrf_ptr)
> +		fl4.flowi4_iif = dev->vrf_ptr ?
> +			dev->vrf_ptr->ifindex : dev->ifindex;
> +	else
> +		fl4.flowi4_iif = oif ? : LOOPBACK_IFINDEX;
>   	fl4.daddr = src;
>   	fl4.saddr = dst;
>   	fl4.flowi4_tos = tos;

This code is kind of redundant.  What is the point of the dev->vfr_ptr 
ternary operator when you just checked it in the if statement.

> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
> index 97fa62d..515ff11 100644
> --- a/net/ipv4/fib_trie.c
> +++ b/net/ipv4/fib_trie.c
> @@ -1409,8 +1409,11 @@ found:
>
>   			if (nh->nh_flags & RTNH_F_DEAD)
>   				continue;
> -			if (flp->flowi4_oif && flp->flowi4_oif != nh->nh_oif)
> -				continue;
> +			if (!(flp->flowi4_flags & FLOWI_FLAG_VRFSRC)) {
> +				if (flp->flowi4_oif &&
> +				    flp->flowi4_oif != nh->nh_oif)
> +					continue;
> +			}
>
>   			if (!(fib_flags & FIB_LOOKUP_NOREF))
>   				atomic_inc(&fi->fib_clntref);

You might break this up so that both the flowi4_flags and flowi4_oif 
checks are in the first if statement, and the != is in the next.  I 
would argue that the flowi4_oif check might be done first as you can 
avoid having to test for any masks since you are testing for a 0 value.

> diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
> index f5203fb..61b7da3 100644
> --- a/net/ipv4/icmp.c
> +++ b/net/ipv4/icmp.c
> @@ -96,6 +96,7 @@
>   #include <net/xfrm.h>
>   #include <net/inet_common.h>
>   #include <net/ip_fib.h>
> +#include <net/vrf.h>
>
>   /*
>    *	Build xmit assembly blocks
> @@ -420,6 +421,8 @@ static void icmp_reply(struct icmp_bxm *icmp_param, struct sk_buff *skb)
>   			daddr = icmp_param->replyopts.opt.opt.faddr;
>   	}
>   	memset(&fl4, 0, sizeof(fl4));
> +	if (skb->dev && skb->dev->vrf_ptr)
> +		fl4.flowi4_oif = skb->dev->vrf_ptr->ifindex;
>   	fl4.daddr = daddr;
>   	fl4.saddr = saddr;
>   	fl4.flowi4_mark = mark;
> @@ -458,6 +461,9 @@ static struct rtable *icmp_route_lookup(struct net *net,
>   	fl4->flowi4_proto = IPPROTO_ICMP;
>   	fl4->fl4_icmp_type = type;
>   	fl4->fl4_icmp_code = code;
> +	if (skb_in->dev && skb_in->dev->vrf_ptr)
> +		fl4->flowi4_oif =  skb_in->dev->vrf_ptr->ifindex;
> +
>   	security_skb_classify_flow(skb_in, flowi4_to_flowi(fl4));
>   	rt = __ip_route_output_key(net, fl4);
>   	if (IS_ERR(rt))
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index f605598..8c37720 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -109,6 +109,7 @@
>   #include <linux/kmemleak.h>
>   #endif
>   #include <net/secure_seq.h>
> +#include <net/vrf.h>
>
>   #define RT_FL_TOS(oldflp4) \
>   	((oldflp4)->flowi4_tos & (IPTOS_RT_MASK | RTO_ONLINK))
> @@ -1710,7 +1711,7 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
>   	 *	Now we are ready to route packet.
>   	 */
>   	fl4.flowi4_oif = 0;
> -	fl4.flowi4_iif = dev->ifindex;
> +	fl4.flowi4_iif = dev->vrf_ptr ? dev->vrf_ptr->ifindex : dev->ifindex;
>   	fl4.flowi4_mark = skb->mark;
>   	fl4.flowi4_tos = tos;
>   	fl4.flowi4_scope = RT_SCOPE_UNIVERSE;
>

  parent reply	other threads:[~2015-06-10 18:32 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-08 18:35 [RFC net-next 0/3] Proposal for VRF-lite Shrijeet Mukherjee
2015-06-08 18:35 ` [RFC net-next 1/3] Symbol preparation for VRF driver Shrijeet Mukherjee
2015-06-10 16:24   ` Alexander Duyck
2015-06-08 18:35 ` [RFC net-next 2/3] VRF driver and needed infrastructure Shrijeet Mukherjee
2015-06-08 19:08   ` David Ahern
2015-06-08 20:17   ` Hannes Frederic Sowa
2015-06-09  9:19   ` Nicolas Dichtel
2015-06-09 12:35   ` Nikolay Aleksandrov
2015-06-10  2:11     ` Shrijeet Mukherjee
2015-06-10 18:20   ` Alexander Duyck
2015-06-08 18:35 ` [RFC net-next 3/3] rcv path changes for vrf traffic Shrijeet Mukherjee
2015-06-08 19:58   ` Hannes Frederic Sowa
2015-06-08 20:00     ` Hannes Frederic Sowa
2015-06-08 20:22     ` Shrijeet Mukherjee
2015-06-08 20:33       ` Hannes Frederic Sowa
2015-06-08 22:44         ` Shrijeet Mukherjee
2015-06-09  5:41           ` Hannes Frederic Sowa
2015-06-08 22:05     ` David Miller
2015-06-08 22:13       ` Hannes Frederic Sowa
2015-06-08 22:21         ` David Miller
2015-06-09  0:36     ` David Ahern
2015-06-09  1:03     ` David Ahern
2015-06-09  5:35       ` Hannes Frederic Sowa
2015-06-10 18:31   ` Alexander Duyck [this message]
2015-06-08 18:35 ` [RFC iproute2] Add the ability to create a VRF device and specify it's table binding Shrijeet Mukherjee
2015-06-08 19:13 ` [RFC net-next 0/3] Proposal for VRF-lite David Ahern
2015-06-08 19:51   ` Shrijeet Mukherjee
2015-06-08 20:41   ` Hannes Frederic Sowa
2015-06-09  8:58 ` Nicolas Dichtel
2015-06-09 14:21   ` David Ahern
2015-06-09 14:55     ` Nicolas Dichtel
2015-06-09 17:14       ` Shrijeet Mukherjee
2015-06-09 10:15 ` Thomas Graf
2015-06-09 12:30   ` Nicolas Dichtel
2015-06-09 12:43     ` Hannes Frederic Sowa
     [not found]   ` <CAJmoNQHRTJwdMjziQiPBX07sZKrYd3Z1ASNi1xQZdgJ1Vs6bGg@mail.gmail.com>
2015-06-12  9:46     ` Thomas Graf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5578829D.9030202@gmail.com \
    --to=alexander.duyck@gmail.com \
    --cc=davem@davemloft.net \
    --cc=dsahern@gmail.com \
    --cc=ebiederm@xmission.com \
    --cc=gospo@cumulusnetworks.com \
    --cc=hadi@mojatatu.com \
    --cc=hannes@stressinduktion.org \
    --cc=jtoppins@cumulusnetworks.com \
    --cc=netdev@vger.kernel.org \
    --cc=nicolas.dichtel@6wind.com \
    --cc=nikolay@cumulusnetworks.com \
    --cc=roopa@cumulusnetworks.com \
    --cc=shm@cumulusnetworks.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).