From: Stephen Suryaputra <ssuryaextr@gmail.com>
To: David Ahern <dsahern@gmail.com>
Cc: netdev@vger.kernel.org
Subject: Re: ip rule iif oif and vrf
Date: Wed, 30 Sep 2020 22:23:45 -0400
Message-ID: <20201001022345.GA3527@vmserver>
In-Reply-To: <97ce9942-2115-ed3a-75ea-8b58aa799ceb@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 3821 bytes --]
On Thu, Sep 24, 2020 at 08:41:54AM -0600, David Ahern wrote:
> > We have multiple options on the table right now. One that can be done
> > without writing any code is to use an nft prerouting rule to mark
> > the packet with iif equals the tunnel and use ip rule fwmark to lookup
> > the right table.
> >
> > ip netns exec r0 nft add table ip c2t
> > ip netns exec r0 nft add chain ip c2t prerouting '{ type filter hook prerouting priority 0; policy accept; }'
> > ip netns exec r0 nft rule ip c2t prerouting iif gre01 mark set 101 counter
> > ip netns exec r0 ip rule add fwmark 101 table 10 pref 999
> >
> > ip netns exec r1 nft add table ip c2t
> > ip netns exec r1 nft add chain ip c2t prerouting '{ type filter hook prerouting priority 0; policy accept; }'
> > ip netns exec r1 nft rule ip c2t prerouting iif gre10 mark set 101 counter
> > ip netns exec r1 ip rule add fwmark 101 table 10 pref 999
> >
> > But this doesn't seem to work on my Ubuntu VM with the namespaces
> > script, i.e. pinging from h0 to h1. The packet doesn't egress r1_v11. It
> > does work on our target, based on 4.14 kernel.
>
> add debugs to net/core/fib_rules.c, fib_rule_match() to see if
> flowi_mark is getting set properly. There could easily be places that
> are missed. Or if it works on one setup, but not another compare sysctl
> settings for net.core and net.ipv4
Sorry, I got side-tracked. Coming back to this: I made a mistake in the
pref of the ip rule fwmark command in the script. I have fixed it and the
script is attached (gre_setup_nft.sh). It contains the nft and ip rule
commands above. With the fix, the ping between h0 and h1 works.
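For anyone reproducing this, a minimal sketch of how the nft mark rule and the fwmark rule could be inspected in each router namespace. The namespace names (r0, r1), the nft table name (c2t) and routing table 10 come from the commands quoted above; the `run` and `check_ns` helpers and the DRY_RUN switch are hypothetical names introduced only for this illustration. By default the sketch only prints the commands, since running them needs root and the namespaces from the script.

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; set DRY_RUN=0 to execute
# them (requires root and the namespaces created by the setup script).
DRY_RUN=${DRY_RUN:-1}

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# check_ns NAMESPACE: list the nft table holding the mark rule (including
# its packet counter), the policy rules, and the routes in table 10.
check_ns() {
    run ip netns exec "$1" nft list table ip c2t
    run ip netns exec "$1" ip rule show
    run ip netns exec "$1" ip route show table 10
}

check_ns r0
check_ns r1
```

If the counter on the nft rule increases while the ping runs but the packet still takes the wrong route, the fwmark rule (its pref relative to the other rules) is the next thing to look at.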
> > We also notice, though, on the target platform that the ip rule fwmark
> > doesn't seem to change the skb->dev to the vrf of the lookup table.
>
> not following that statement. fwmark should be marking the skb, not
> changing the skb->dev.
Yes, and it causes the ping between h0 and r1's r1_v11 to fail, e.g.:
ip netns exec h0 ping -c 1 11.0.0.1
Similarly, ping between r0_v00 and r1_v11 also does not work:
ip netns exec r0 ip vrf exec vrf_r0t ping -c 1 -I 10.0.0.1 11.0.0.1
> > E.g., ping from 10.0.0.1 to 11.0.0.1. With net.ipv4.fwmark_reflect set,
> > the reply is generated but the originating ping application doesn't get
> > the packet. I suspect it is caused by the socket being bound to the tenant
> > vrf. I haven't been able to repro this because of the problem with the
> > nft approach above.
To illustrate my statements above, this is what I did:
ip netns exec r1 sysctl -w net.ipv4.fwmark_reflect=1
ip netns exec h0 ping -c 1 11.0.0.1
PING 11.0.0.1 (11.0.0.1) 56(84) bytes of data.
64 bytes from 11.0.0.1: icmp_seq=1 ttl=63 time=0.079 ms
The ping between h0 and r1's r1_v11 now works, but this one still fails:
ip netns exec r0 ip vrf exec vrf_r0t ping -c 1 -I 10.0.0.1 11.0.0.1
even though the reply is received on gre01:
ip netns exec r0 tcpdump -nexi gre01
22:10:57.173680 Out ethertype IPv4 (0x0800), length 100: 10.0.0.1 > 11.0.0.1: ICMP echo request, id 3803, seq 1, length 64
0x0000: 4500 0054 1d2a 4000 4001 087e 0a00 0001
0x0010: 0b00 0001 0800 a410 0edb 0001 b13a 755f
0x0020: 0000 0000 5da6 0200 0000 0000 1011 1213
0x0030: 1415 1617 1819 1a1b 1c1d 1e1f 2021 2223
0x0040: 2425 2627 2829 2a2b 2c2d 2e2f 3031 3233
0x0050: 3435 3637
22:10:57.173724 In ethertype IPv4 (0x0800), length 100: 11.0.0.1 > 10.0.0.1: ICMP echo reply, id 3803, seq 1, length 64
0x0000: 4500 0054 c709 0000 4001 9e9e 0b00 0001
0x0010: 0a00 0001 0000 ac10 0edb 0001 b13a 755f
0x0020: 0000 0000 5da6 0200 0000 0000 1011 1213
0x0030: 1415 1617 1819 1a1b 1c1d 1e1f 2021 2223
0x0040: 2425 2627 2829 2a2b 2c2d 2e2f 3031 3233
0x0050: 3435 3637
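As a cross-check of the capture, here is a small sketch that decodes a few fields from the first 28 bytes of the reply (the 20-byte IPv4 header plus the 8-byte ICMP header). The hex string is copied from the dump above with the spaces removed; the helper names `field` and `hex_to_ip` are made up for this example.

```shell
#!/bin/sh
# First 28 bytes of the echo reply seen on gre01: IPv4 header (20 bytes)
# followed by the ICMP header (8 bytes), hex digits only.
reply=45000054c709000040019e9e0b0000010a0000010000ac100edb0001

# field OFFSET LENGTH: print LENGTH bytes starting at byte OFFSET, as hex.
field() {
    printf '%s' "$reply" | cut -c"$(( $1 * 2 + 1 ))-$(( ($1 + $2) * 2 ))"
}

# hex_to_ip HEX8: convert 8 hex digits into dotted-quad notation.
hex_to_ip() {
    a=$(printf '%s' "$1" | cut -c1-2)
    b=$(printf '%s' "$1" | cut -c3-4)
    c=$(printf '%s' "$1" | cut -c5-6)
    d=$(printf '%s' "$1" | cut -c7-8)
    printf '%d.%d.%d.%d\n' "0x$a" "0x$b" "0x$c" "0x$d"
}

echo "ttl:       $(( 0x$(field 8 1) ))"        # IPv4 TTL, byte 8
echo "src:       $(hex_to_ip "$(field 12 4)")" # IPv4 source, bytes 12-15
echo "dst:       $(hex_to_ip "$(field 16 4)")" # IPv4 destination, bytes 16-19
echo "icmp type: $(( 0x$(field 20 1) ))"       # 0 means echo reply
echo "icmp id:   $(( 0x$(field 24 2) ))"       # identifier, bytes 24-25
```

The decoded source, destination and ICMP id should agree with the tcpdump summary line for the reply (11.0.0.1 > 10.0.0.1, id 3803), i.e. the reply matching the request does reach gre01 in r0.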
In summary, the question is: should an ip rule with action FR_ACT_TO_TBL
also change skb->dev to the right l3mdev?
Thank you,
Stephen.
[-- Attachment #2: gre_setup_nft.sh --]
[-- Type: application/x-sh, Size: 5707 bytes --]