From: David Ahern <dsahern@gmail.com>
To: Hannes Frederic Sowa <hannes@stressinduktion.org>,
Shrijeet Mukherjee <shm@cumulusnetworks.com>
Cc: nicolas.dichtel@6wind.com, ebiederm@xmission.com,
hadi@mojatatu.com, davem@davemloft.net,
stephen@networkplumber.org, netdev@vger.kernel.org,
roopa@cumulusnetworks.com, gospo@cumulusnetworks.com,
jtoppins@cumulusnetworks.com, nikolay@cumulusnetworks.com
Subject: Re: [RFC net-next 3/3] rcv path changes for vrf traffic
Date: Mon, 08 Jun 2015 18:36:03 -0600
Message-ID: <557634F3.4070700@gmail.com>
In-Reply-To: <1433793517.4616.4.camel@stressinduktion.org>
On 6/8/15 1:58 PM, Hannes Frederic Sowa wrote:
> Hi Shrijeet,
>
> On Mo, 2015-06-08 at 11:35 -0700, Shrijeet Mukherjee wrote:
>> From: Shrijeet Mukherjee <shm@cumulusnetworks.com>
>>
>> Incoming frames for IP protocol stacks need the IIF to be changed
>> from the actual interface to the VRF device. This allows the IIF
>> rule to be used to select tables (or do regular PBR)
>>
>> This change selects the iif to be the VRF device if it exists and
>> the incoming iif is enslaved to the VRF device.
>>
>> Since VRF aware sockets are always bound to the VRF device this
>> system allows return traffic to find the socket of origin.
>>
>> changes are in the arp_rcv, icmp_rcv and ip_rcv paths
>>
>> Question : I did not wrap the rcv modifications, in CONFIG_NET_VRF
>> as it would create code variations and the vrf_ptr check is there
>> I can make that whole thing modular.
>
> From an architectural level I think the output path looks good. For the
> input path I would also to propose my (I think) more flexible solution:
>
Something is still not right on the output path. e.g., I see the wrong
source address showing up on ping -I vrf0:
# ping -I vrf0 1.1.1.254
ping: Warning: source address might be selected on device other than vrf0.
PING 1.1.1.254 (1.1.1.254) from 172.16.1.52 vrf0: 56(84) bytes of data.
64 bytes from 1.1.1.254: icmp_seq=1 ttl=64 time=0.215 ms
...
The reason is that the datagram connect function fails to look up the
outbound route in the VRF and falls back to the main table. (As an aside,
the fallback to other tables is something that should not be happening
for VRFs; you want to use the table specific to the VRF.)

The route lookup fails because it passes in oif = vrf device (this VRF
design relies on bind to device, which sets oif in the flow). That is
good for selecting the table to use for the lookups, but not good for
selecting the route within the table.

This is one way to fix the connect problem:
diff --git a/include/net/route.h b/include/net/route.h
index fe22d03afb6a..a18798caec25 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -245,11 +245,18 @@ static inline void ip_route_connect_init(struct flowi4 *fl4, __be32 dst, __be32
 					 __be16 sport, __be16 dport,
 					 struct sock *sk)
 {
+	struct net_device *dev = dev_get_by_index(sock_net(sk), oif);
 	__u8 flow_flags = 0;
 
 	if (inet_sk(sk)->transparent)
 		flow_flags |= FLOWI_FLAG_ANYSRC;
 
+	if (dev) {
+		if (netif_is_vrf(dev))
+			flow_flags |= FLOWI_FLAG_VRFSRC;
+		dev_put(dev);
+	}
+
 	flowi4_init_output(fl4, oif, sk->sk_mark, tos, RT_SCOPE_UNIVERSE,
 			   protocol, flow_flags, dst, src, dport, sport);
 }
which essentially tells fib_table_lookup to drop the OIF comparison
after selecting the table per this change made in the patch Shrijeet posted:
	if (!(flp->flowi4_flags & FLOWI_FLAG_VRFSRC)) {
		if (flp->flowi4_oif &&
		    flp->flowi4_oif != nh->nh_oif)
			continue;
	}