netdev.vger.kernel.org archive mirror
* [RFC PATCH] xfrm: Add option to reset oif in xfrm lookup
@ 2016-07-26  0:34 Subash Abhinov Kasiviswanathan
  2016-07-26  2:34 ` David Ahern
  2016-07-28  5:08 ` Steffen Klassert
  0 siblings, 2 replies; 8+ messages in thread
From: Subash Abhinov Kasiviswanathan @ 2016-07-26  0:34 UTC (permalink / raw)
  To: dsa, steffen.klassert, netdev, herbert; +Cc: Subash Abhinov Kasiviswanathan

We are seeing incorrect routing when tunneling packets over an
interface and sending them over another interface. This scenario
worked on 3.18 (and earlier) and failed on the 4.4 kernel. The rules
/ routes / policies were the same across kernels.

Commit 42a7b32b73d6 ("xfrm: Add oif to dst lookups") allowed
preservation of the oif from a raw packet to a transformed packet.
This causes issues with forwarding scenarios where the
existing oif causes an incorrect route lookup.

Create a new sysctl which resets oif in xfrm policy. Default value
is 0 which means that oif is preserved on transform.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
---
 include/net/netns/xfrm.h | 1 +
 net/ipv4/xfrm4_policy.c  | 3 ++-
 net/ipv6/xfrm6_policy.c  | 3 ++-
 net/xfrm/xfrm_sysctl.c   | 8 ++++++++
 4 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/include/net/netns/xfrm.h b/include/net/netns/xfrm.h
index 24cd394..2e1beca 100644
--- a/include/net/netns/xfrm.h
+++ b/include/net/netns/xfrm.h
@@ -64,6 +64,7 @@ struct netns_xfrm {
 	u32			sysctl_aevent_rseqth;
 	int			sysctl_larval_drop;
 	u32			sysctl_acq_expires;
+	int			sysctl_reset_oif;
 #ifdef CONFIG_SYSCTL
 	struct ctl_table_header	*sysctl_hdr;
 #endif
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index 7b0edb3..4dc3733 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -29,7 +29,8 @@ static struct dst_entry *__xfrm4_dst_lookup(struct net *net, struct flowi4 *fl4,
 	memset(fl4, 0, sizeof(*fl4));
 	fl4->daddr = daddr->a4;
 	fl4->flowi4_tos = tos;
-	fl4->flowi4_oif = oif;
+	if (!net->xfrm.sysctl_reset_oif)
+		fl4->flowi4_oif = oif;
 	if (saddr)
 		fl4->saddr = saddr->a4;
 
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index c074771..13e72d7 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -36,7 +36,8 @@ static struct dst_entry *xfrm6_dst_lookup(struct net *net, int tos, int oif,
 	int err;
 
 	memset(&fl6, 0, sizeof(fl6));
-	fl6.flowi6_oif = oif;
+	if (!net->xfrm.sysctl_reset_oif)
+		fl6.flowi6_oif = oif;
 	fl6.flowi6_flags = FLOWI_FLAG_SKIP_NH_OIF;
 	memcpy(&fl6.daddr, daddr, sizeof(fl6.daddr));
 	if (saddr)
diff --git a/net/xfrm/xfrm_sysctl.c b/net/xfrm/xfrm_sysctl.c
index 05a6e3d..c9d374b 100644
--- a/net/xfrm/xfrm_sysctl.c
+++ b/net/xfrm/xfrm_sysctl.c
@@ -9,6 +9,7 @@ static void __net_init __xfrm_sysctl_init(struct net *net)
 	net->xfrm.sysctl_aevent_rseqth = XFRM_AE_SEQT_SIZE;
 	net->xfrm.sysctl_larval_drop = 1;
 	net->xfrm.sysctl_acq_expires = 30;
+	net->xfrm.sysctl_reset_oif = 0;
 }
 
 #ifdef CONFIG_SYSCTL
@@ -37,6 +38,12 @@ static struct ctl_table xfrm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec
 	},
+	{
+		.procname	= "xfrm_reset_oif",
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec
+	},
 	{}
 };
 
@@ -53,6 +60,7 @@ int __net_init xfrm_sysctl_init(struct net *net)
 	table[1].data = &net->xfrm.sysctl_aevent_rseqth;
 	table[2].data = &net->xfrm.sysctl_larval_drop;
 	table[3].data = &net->xfrm.sysctl_acq_expires;
+	table[4].data = &net->xfrm.sysctl_reset_oif;
 
 	/* Don't export sysctls to unprivileged users */
 	if (net->user_ns != &init_user_ns)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread
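
Editorial sketch, not part of the thread: the gate that the patch above adds
can be modelled in userspace C roughly as follows. `struct flow_key` and
`build_flow_key()` are hypothetical names invented for this illustration;
only the pattern of zeroing the key and then copying oif conditionally is
taken from the diff.

```c
#include <string.h>

/* Hypothetical stand-in for the few flowi4 fields the patch touches. */
struct flow_key {
	unsigned int daddr;	/* destination address, host byte order */
	int oif;		/* output interface index, 0 = unconstrained */
};

/*
 * Build the lookup key the way the patched __xfrm4_dst_lookup() does:
 * zero the whole key, then copy oif only when the reset knob is off.
 * With reset_oif set, oif keeps the 0 left behind by memset().
 */
static void build_flow_key(struct flow_key *fl, unsigned int daddr,
			   int oif, int reset_oif)
{
	memset(fl, 0, sizeof(*fl));
	fl->daddr = daddr;
	if (!reset_oif)
		fl->oif = oif;
}
```

With reset_oif == 0 the caller's oif ends up in the key (the 4.4 behaviour
described above); with reset_oif == 1 the route lookup sees oif 0, as it
did before commit 42a7b32b73d6.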

* Re: [RFC PATCH] xfrm: Add option to reset oif in xfrm lookup
  2016-07-26  0:34 [RFC PATCH] xfrm: Add option to reset oif in xfrm lookup Subash Abhinov Kasiviswanathan
@ 2016-07-26  2:34 ` David Ahern
  2016-07-29 18:21   ` subashab
  2016-07-28  5:08 ` Steffen Klassert
  1 sibling, 1 reply; 8+ messages in thread
From: David Ahern @ 2016-07-26  2:34 UTC (permalink / raw)
  To: Subash Abhinov Kasiviswanathan, steffen.klassert, netdev, herbert

On 7/25/16 6:34 PM, Subash Abhinov Kasiviswanathan wrote:
> We are seeing incorrect routing when tunneling packets over an
> interface and sending them over another interface. This scenario
> worked on 3.18 (and earlier) and failed on the 4.4 kernel. The rules
> / routes / policies were the same across kernels.

Can you give an example of your use case -- e.g., commands for others 
(me) to reproduce?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] xfrm: Add option to reset oif in xfrm lookup
  2016-07-26  0:34 [RFC PATCH] xfrm: Add option to reset oif in xfrm lookup Subash Abhinov Kasiviswanathan
  2016-07-26  2:34 ` David Ahern
@ 2016-07-28  5:08 ` Steffen Klassert
  1 sibling, 0 replies; 8+ messages in thread
From: Steffen Klassert @ 2016-07-28  5:08 UTC (permalink / raw)
  To: Subash Abhinov Kasiviswanathan; +Cc: dsa, netdev, herbert

On Mon, Jul 25, 2016 at 06:34:32PM -0600, Subash Abhinov Kasiviswanathan wrote:
> We are seeing incorrect routing when tunneling packets over an
> interface and sending them over another interface. This scenario
> worked on 3.18 (and earlier) and failed on the 4.4 kernel. The rules
> / routes / policies were the same across kernels.
> 
> Commit 42a7b32b73d6 ("xfrm: Add oif to dst lookups") allowed
> preservation of the oif from a raw packet to a transformed packet.
> This causes issues with forwarding scenarios where the
> existing oif causes an incorrect route lookup.
> 
> Create a new sysctl which resets oif in xfrm policy. Default value
> is 0 which means that oif is preserved on transform.

Please don't try to work around a bug with a sysctl.
If we have a bug here, we should fix it. Choosing
between bug A and bug B with a sysctl is not what
we are doing ;)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] xfrm: Add option to reset oif in xfrm lookup
  2016-07-26  2:34 ` David Ahern
@ 2016-07-29 18:21   ` subashab
  2016-08-03  4:06     ` David Ahern
  0 siblings, 1 reply; 8+ messages in thread
From: subashab @ 2016-07-29 18:21 UTC (permalink / raw)
  To: David Ahern; +Cc: steffen.klassert, netdev, herbert, netdev-owner

> Please don't try to work around a bug with a sysctl.
> If we have a bug here, we should fix it. Choosing
> between bug A and bug B with a sysctl is not what
> we are doing ;)

Sure, this was just a quick hack.

> Can you give an example of your use case -- e.g., commands for others
> (me) to reproduce?

Here is an equivalent set of rules. We see a difference in the oif
depending on whether we reset it or preserve it.
eth1 is the interface from which traffic is generated while eth0 is the 
tunnel.

--------------
#Commands
echo 1 > /proc/sys/net/ipv4/ip_forward
echo 1 > /proc/sys/net/ipv4/conf/all/accept_local
echo 1 > /proc/sys/net/ipv4/conf/eth0/accept_local
echo 1 > /proc/sys/net/ipv4/conf/eth1/accept_local

ip addr add 192.168.77.2/24 dev eth0
ip link set eth0 mtu 1400
ip link set eth0 up

ip addr add 192.168.33.2/24 dev eth1
ip link set eth1 mtu 1400
ip link set eth1 up

ip ru add to 192.168.33.1 lookup 8 prio 4000
ip ru add oif eth1 lookup 8 prio 4010
ip ru add to 192.168.77.1 lookup 9 prio 4030

ip route add default dev eth1 table 8
ip route add default dev eth0 table 9

iptables -t raw -A OUTPUT -j LOG --log-prefix "RAW-OUT >> "
iptables -t mangle -A POSTROUTING -j LOG --log-prefix "MAN-PST >> "

echo 0 > /proc/sys/net/ipv4/tcp_timestamps

# out direction
ip xfrm state add src 192.168.77.2 dst 192.168.77.1 proto esp spi 0x1234 \
    mode tunnel \
    enc 'cbc(aes)' 0xbb31df5b207dc1c7a8512eeda0b2d0691e27bc8059dbb82df616bb9955058cd5 \
    auth 'hmac(sha1)' 0x93b43b527d564efb9eac8cd04510b86e409f8ea7 \
    flag af-unspec encap espinudp 4500 4500 0.0.0.0

ip xfrm policy add dir out src 192.168.33.2 \
    tmpl src 192.168.77.2 dst 192.168.77.1 proto esp spi 0x1234 mode tunnel

# in direction
ip xfrm state add src 192.168.77.1 dst 192.168.77.2 proto esp spi 0x4321 \
    mode tunnel \
    enc 'cbc(aes)' 0x5d3ca96d1af2eaa9cf8f1c1cace88f550e2a5b7b82027023287e1fe2a42f7f54 \
    auth 'hmac(sha1)' 0xcd09f850d7c0dd6dc0ed342619c1165571452f9d \
    flag af-unspec encap espinudp 4500 4500 0.0.0.0

ip xfrm policy add dir in dst 192.168.33.2 \
    tmpl src 192.168.77.1 dst 192.168.77.2 proto esp spi 0x4321 mode tunnel
ip xfrm policy add dir fwd dst 192.168.33.2 \
    tmpl src 192.168.77.1 dst 192.168.77.2 proto esp spi 0x4321 mode tunnel
--------------

Output when resetting oif (3.18)

root@vm:~# ping -c 1 -I eth1 192.168.33.1
PING 192.168.33.1 (192.168.33.1) 56(84) bytes of data.
RAW-OUT >> IN= OUT=eth0 SRC=192.168.33.2 DST=192.168.33.1 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=801 DF PROTO=ICMP TYPE=8 CODE=0 ID=2040 SEQ=1
MAN-PST >> IN= OUT=eth0 SRC=192.168.33.2 DST=192.168.33.1 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=801 DF PROTO=ICMP TYPE=8 CODE=0 ID=2040 SEQ=1
RAW-OUT >> IN= OUT=eth0 SRC=192.168.77.2 DST=192.168.77.1 LEN=160 TOS=0x00 PREC=0x00 TTL=64 ID=41757 DF PROTO=UDP SPT=4500 DPT=4500 LEN=140
MAN-PST >> IN= OUT=eth0 SRC=192.168.77.2 DST=192.168.77.1 LEN=160 TOS=0x00 PREC=0x00 TTL=64 ID=41757 DF PROTO=UDP SPT=4500 DPT=4500 LEN=140

--------------

Output when preserving oif (4.4)

root@vm:~# ping -c 1 -I eth1 192.168.33.1
PING 192.168.33.1 (192.168.33.1) 56(84) bytes of data.
RAW-OUT >> IN= OUT=eth1 SRC=192.168.33.2 DST=192.168.33.1 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=20191 DF PROTO=ICMP TYPE=8 CODE=0 ID=2043 SEQ=1
MAN-PST >> IN= OUT=eth1 SRC=192.168.33.2 DST=192.168.33.1 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=20191 DF PROTO=ICMP TYPE=8 CODE=0 ID=2043 SEQ=1
RAW-OUT >> IN= OUT=eth1 SRC=192.168.77.2 DST=192.168.77.1 LEN=160 TOS=0x00 PREC=0x00 TTL=64 ID=49515 DF PROTO=UDP SPT=4500 DPT=4500 LEN=140
MAN-PST >> IN= OUT=eth1 SRC=192.168.77.2 DST=192.168.77.1 LEN=160 TOS=0x00 PREC=0x00 TTL=64 ID=49515 DF PROTO=UDP SPT=4500 DPT=4500 LEN=140

^ permalink raw reply	[flat|nested] 8+ messages in thread
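
Editorial sketch, not part of the thread: one possible reading of the two
outputs above is that the preserved oif changes which "ip ru" rule matches
the encapsulated packet. The model below walks only the three rules added
by the reproducer, in priority order. It deliberately ignores the local,
main and default rules and any real route matching, and the ifindex values
and helper names (`select_table()`, `table_to_dev()`) are made up for
illustration.

```c
#include <stddef.h>

enum { IF_NONE = 0, IF_ETH0 = 1, IF_ETH1 = 2 };	/* made-up ifindexes */

#define IP4(a, b, c, d) \
	(((unsigned int)(a) << 24) | ((b) << 16) | ((c) << 8) | (d))

struct rule {
	int prio;
	unsigned int to;	/* 0 = any destination */
	int oif;		/* IF_NONE = any output interface */
	int table;
};

/* The three "ip ru add" rules from the reproducer, in priority order. */
static const struct rule rules[] = {
	{ 4000, IP4(192, 168, 33, 1), IF_NONE, 8 },
	{ 4010, 0,                    IF_ETH1, 8 },
	{ 4030, IP4(192, 168, 77, 1), IF_NONE, 9 },
};

/* Return the table of the first matching rule, or -1 if none matches. */
static int select_table(unsigned int daddr, int oif)
{
	size_t i;

	for (i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
		const struct rule *r = &rules[i];

		if (r->to && r->to != daddr)
			continue;
		if (r->oif != IF_NONE && r->oif != oif)
			continue;
		return r->table;
	}
	return -1;
}

/* Table 8 holds "default dev eth1", table 9 holds "default dev eth0". */
static int table_to_dev(int table)
{
	switch (table) {
	case 8:
		return IF_ETH1;
	case 9:
		return IF_ETH0;
	default:
		return IF_NONE;
	}
}
```

Under these assumptions, a lookup for the outer destination 192.168.77.1
with oif still set to eth1 stops at the prio 4010 rule and picks table 8,
while the same lookup with oif cleared falls through to the prio 4030 rule
and picks table 9.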

* Re: [RFC PATCH] xfrm: Add option to reset oif in xfrm lookup
  2016-07-29 18:21   ` subashab
@ 2016-08-03  4:06     ` David Ahern
  2016-08-03 23:02       ` subashab
  0 siblings, 1 reply; 8+ messages in thread
From: David Ahern @ 2016-08-03  4:06 UTC (permalink / raw)
  To: subashab; +Cc: steffen.klassert, netdev, herbert, netdev-owner

On 7/29/16 12:21 PM, subashab@codeaurora.org wrote:
>> Please don't try to work around a bug with a sysctl.
>> If we have a bug here, we should fix it. Choosing
>> between bug A and bug B with a sysctl is not what
>> we are doing ;)
>
> Sure, this was just a quick hack.
>
>> Can you give an example of your use case -- e.g., commands for others
>> (me) to reproduce?
>
> Here is an equivalent set of rules. We see a difference in the oif
> depending on whether we reset it or preserve it.
> eth1 is the interface from which traffic is generated while eth0 is the
> tunnel.
>
> --------------
> #Commands
> echo 1 > /proc/sys/net/ipv4/ip_forward
> echo 1 > /proc/sys/net/ipv4/conf/all/accept_local
> echo 1 > /proc/sys/net/ipv4/conf/eth0/accept_local
> echo 1 > /proc/sys/net/ipv4/conf/eth1/accept_local
>
> ip addr add 192.168.77.2/24 dev eth0
> ip link set eth0 mtu 1400
> ip link set eth0 up
>
> ip addr add 192.168.33.2/24 dev eth1
> ip link set eth1 mtu 1400
> ip link set eth1 up
>
> ip ru add to 192.168.33.1 lookup 8 prio 4000
> ip ru add oif eth1 lookup 8 prio 4010
> ip ru add to 192.168.77.1 lookup 9 prio 4030
>
> ip route add default dev eth1 table 8
> ip route add default dev eth0 table 9
>
> iptables -t raw -A OUTPUT -j LOG --log-prefix "RAW-OUT >> "
> iptables -t mangle -A POSTROUTING -j LOG --log-prefix "MAN-PST >> "
>
> echo 0 > /proc/sys/net/ipv4/tcp_timestamps
>
> # out direction
> ip xfrm state add src 192.168.77.2 dst 192.168.77.1 proto esp spi 0x1234
> mode tunnel enc 'cbc(aes)'
> 0xbb31df5b207dc1c7a8512eeda0b2d0691e27bc8059dbb82df616bb9955058cd5 auth
> 'hmac(sha1)' 0x93b43b527d564efb9eac8cd04510b86e409f8ea7 flag af-unspec
> encap espinudp 4500 4500 0.0.0.0
>
> ip xfrm policy add dir out src 192.168.33.2 tmpl src 192.168.77.2 dst
> 192.168.77.1 proto esp spi 0x1234 mode tunnel
>
> # in direction
> ip xfrm state add src 192.168.77.1 dst 192.168.77.2 proto esp spi 0x4321
> mode tunnel enc 'cbc(aes)'
> 0x5d3ca96d1af2eaa9cf8f1c1cace88f550e2a5b7b82027023287e1fe2a42f7f54 auth
> 'hmac(sha1)' 0xcd09f850d7c0dd6dc0ed342619c1165571452f9d flag af-unspec
> encap espinudp 4500 4500 0.0.0.0
>
> ip xfrm policy add dir in dst 192.168.33.2 tmpl src 192.168.77.1 dst
> 192.168.77.2 proto esp spi 0x4321 mode tunnel
> ip xfrm policy add dir fwd dst 192.168.33.2 tmpl src 192.168.77.1 dst
> 192.168.77.2 proto esp spi 0x4321 mode tunnel
> --------------
>
> Output when resetting oif (3.18)
>
> root@vm:~# ping -c 1 -I eth1 192.168.33.1
> PING 192.168.33.1 (192.168.33.1) 56(84) bytes of data.
> RAW-OUT >> IN= OUT=eth0 SRC=192.168.33.2 DST=192.168.33.1 LEN=84
> TOS=0x00 PREC=0x00 TTL=64 ID=801 DF PROTO=ICMP TYPE=8 CODE=0 ID=2040 SEQ=1
> MAN-PST >> IN= OUT=eth0 SRC=192.168.33.2 DST=192.168.33.1 LEN=84
> TOS=0x00 PREC=0x00 TTL=64 ID=801 DF PROTO=ICMP TYPE=8 CODE=0 ID=2040 SEQ=1
> RAW-OUT >> IN= OUT=eth0 SRC=192.168.77.2 DST=192.168.77.1 LEN=160
> TOS=0x00 PREC=0x00 TTL=64 ID=41757 DF PROTO=UDP SPT=4500 DPT=4500 LEN=140
> MAN-PST >> IN= OUT=eth0 SRC=192.168.77.2 DST=192.168.77.1 LEN=160
> TOS=0x00 PREC=0x00 TTL=64 ID=41757 DF PROTO=UDP SPT=4500 DPT=4500 LEN=140
>
> --------------
>
> Output when preserving oif (4.4)
>
> root@vm:~# ping -c 1 -I eth1 192.168.33.1
> PING 192.168.33.1 (192.168.33.1) 56(84) bytes of data.
> RAW-OUT >> IN= OUT=eth1 SRC=192.168.33.2 DST=192.168.33.1 LEN=84
> TOS=0x00 PREC=0x00 TTL=64 ID=20191 DF PROTO=ICMP TYPE=8 CODE=0 ID=2043
> SEQ=1
> MAN-PST >> IN= OUT=eth1 SRC=192.168.33.2 DST=192.168.33.1 LEN=84
> TOS=0x00 PREC=0x00 TTL=64 ID=20191 DF PROTO=ICMP TYPE=8 CODE=0 ID=2043
> SEQ=1
> RAW-OUT >> IN= OUT=eth1 SRC=192.168.77.2 DST=192.168.77.1 LEN=160
> TOS=0x00 PREC=0x00 TTL=64 ID=49515 DF PROTO=UDP SPT=4500 DPT=4500 LEN=140
> MAN-PST >> IN= OUT=eth1 SRC=192.168.77.2 DST=192.168.77.1 LEN=160
> TOS=0x00 PREC=0x00 TTL=64 ID=49515 DF PROTO=UDP SPT=4500 DPT=4500 LEN=140
>

I can't explain the iptables output but from a FIB lookup perspective it 
is using table 8 per the FIB rules, the xfrm is hit and packets shift to 
192.168.77.1 and go out what you have as eth0.

Take a look at:
   perf record -e fib:* -a -g
   perf script

And then run tcpdump on both eth0 and eth1. For me on "eth0" (which is 
really eth11 for my VM setup) I see this on the ping:

20:50:11.389837 ARP, Request who-has 192.168.77.2 tell 192.168.77.1, length 28
20:50:11.390079 ARP, Reply 192.168.77.2 is-at 02:00:12:34:02:0a, length 28
20:50:11.390101 IP 192.168.77.1 > 192.168.77.2: ICMP 192.168.77.1 udp port 4500 unreachable, length 168

So the packets are going out "eth0" as expected.

That said, the commands you have given do not totally transfer to 
another setup. In my case I have 2 VMs with eth11 and eth12 directly 
connected (VM1 eth11 <--> VM2 eth11 and ditto for eth12). You have given 
one side of the commands and I have configured the other side with the 
.1 addresses but not bothered to translate the xfrm commands.

That said, this seems like a contrived example -- you pin ping to device 
eth1 (-I eth1), you are pinging a host on the network for eth1 but want 
packets to go out eth0 via the xfrm. Can you elaborate on the real use 
case and problem here?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] xfrm: Add option to reset oif in xfrm lookup
  2016-08-03  4:06     ` David Ahern
@ 2016-08-03 23:02       ` subashab
  2016-08-04  4:52         ` David Ahern
  0 siblings, 1 reply; 8+ messages in thread
From: subashab @ 2016-08-03 23:02 UTC (permalink / raw)
  To: David Ahern; +Cc: steffen.klassert, netdev, herbert, netdev-owner

> I can't explain the iptables output but from a FIB lookup perspective
> it is using table 8 per the FIB rules, the xfrm is hit and packets
> shift to 192.168.77.1 and go out what you have as eth0.
> 
> Take a look at:
>   perf record -e fib:* -a -g
>   perf script
> 
> And then run tcpdump on both eth0 and eth1. For me on "eth0" (which is
> really eth11 for my VM setup) I see this on the ping:
> 

You can try running these commands as is on UML.
We tried these out on 3.18 as well as on 4.4.

> 20:50:11.389837 ARP, Request who-has 192.168.77.2 tell 192.168.77.1, 
> length 28
> 20:50:11.390079 ARP, Reply 192.168.77.2 is-at 02:00:12:34:02:0a, length 
> 28
> 20:50:11.390101 IP 192.168.77.1 > 192.168.77.2: ICMP 192.168.77.1 udp
> port 4500 unreachable, length 168
> 
> So the packets are going out "eth0" as expected.
> 
> That said, the commands you have given do not totally transfer to
> another setup. In my case I have 2 VMs with eth11 and eth12 directly
> connected (VM1 eth11 <--> VM2 eth11 and ditto for eth12). You have
> given one side of the commands and I have configured the other side
> with the .1 addresses but not bothered to translate the xfrm commands.
> 
> That said, this seems like a contrived example -- you pin ping to
> device eth1 (-I eth1), you are pinging a host on the network for eth1
> but want packets to go out eth0 via the xfrm. Can you elaborate on the
> real use case and problem here?

Applications may be bound to a specific interface but would try to send 
data over multiple types of networks.
Our use case here is wifi calling. In this case, we try to force packets 
to go over wifi after encryption.
The rules which we were using worked on 3.18 but we ran into issues on 
4.4.
Debugging narrowed us down to this oif preservation through xfrm.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] xfrm: Add option to reset oif in xfrm lookup
  2016-08-03 23:02       ` subashab
@ 2016-08-04  4:52         ` David Ahern
  2016-08-05 21:57           ` subashab
  0 siblings, 1 reply; 8+ messages in thread
From: David Ahern @ 2016-08-04  4:52 UTC (permalink / raw)
  To: subashab; +Cc: steffen.klassert, netdev, herbert, netdev-owner

On 8/3/16 5:02 PM, subashab@codeaurora.org wrote:
>> I can't explain the iptables output but from a FIB lookup perspective
>> it is using table 8 per the FIB rules, the xfrm is hit and packets
>> shift to 192.168.77.1 and go out what you have as eth0.
>>
>> Take a look at:
>>   perf record -e fib:* -a -g
>>   perf script
>>
>> And then run tcpdump on both eth0 and eth1. For me on "eth0" (which is
>> really eth11 for my VM setup) I see this on the ping:
>>
> 
> You can try running these commands as is on UML.
> We tried these out on 3.18 as well as on 4.4.
> 
>> 20:50:11.389837 ARP, Request who-has 192.168.77.2 tell 192.168.77.1, length 28
>> 20:50:11.390079 ARP, Reply 192.168.77.2 is-at 02:00:12:34:02:0a, length 28
>> 20:50:11.390101 IP 192.168.77.1 > 192.168.77.2: ICMP 192.168.77.1 udp
>> port 4500 unreachable, length 168
>>
>> So the packets are going out "eth0" as expected.
>>
>> That said, the commands you have given do not totally transfer to
>> another setup. In my case I have 2 VMs with eth11 and eth12 directly
>> connected (VM1 eth11 <--> VM2 eth11 and ditto for eth12). You have
>> given one side of the commands and I have configured the other side
>> with the .1 addresses but not bothered to translate the xfrm commands.
>>
>> That said, this seems like a contrived example -- you pin ping to
>> device eth1 (-I eth1), you are pinging a host on the network for eth1
>> but want packets to go out eth0 via the xfrm. Can you elaborate on the
>> real use case and problem here?
> 
> Applications may be bound to a specific interface but would try to send data over multiple types of networks.
> Our use case here is wifi calling. In this case, we try to force packets to go over wifi after encryption.
> The rules which we were using worked on 3.18 but we ran into issues on 4.4.
> Debugging narrowed us down to this oif preservation through xfrm.

I need to do some additional testing next week (taking PTO the next 2 days), but this should fix your problem. Can you confirm? This is better than a sysctl to handle the known use cases, but it does not handle a combination of the 2 known use cases (e.g., throw your use case into a VRF).

diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index b644a23c3db0..41f5b504a782 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -29,7 +29,7 @@ static struct dst_entry *__xfrm4_dst_lookup(struct net *net, struct flowi4 *fl4,
        memset(fl4, 0, sizeof(*fl4));
        fl4->daddr = daddr->a4;
        fl4->flowi4_tos = tos;
-       fl4->flowi4_oif = oif;
+       fl4->flowi4_oif = l3mdev_master_ifindex_by_index(net, oif);
        if (saddr)
                fl4->saddr = saddr->a4;

^ permalink raw reply related	[flat|nested] 8+ messages in thread
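
Editorial sketch, not part of the thread: the one-line change above replaces
the raw oif with the ifindex of the device's L3 master, if it has one. The
userspace model below reflects the editor's understanding of that mapping
and is not a copy of kernel code. `struct dev_model`, `find_dev()` and
`master_ifindex_by_index()` are hypothetical names, and the device table is
passed in explicitly instead of being looked up in a struct net.

```c
#include <stddef.h>

/* Hypothetical, userspace-only model of a network device. */
struct dev_model {
	int ifindex;
	int is_l3_master;	/* 1 for a VRF-style L3 master device */
	int master_ifindex;	/* ifindex of the enslaving device, 0 if none */
};

/* Linear search for a device by ifindex; NULL if it does not exist. */
static const struct dev_model *find_dev(const struct dev_model *devs,
					size_t n, int ifindex)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (devs[i].ifindex == ifindex)
			return &devs[i];
	return NULL;
}

/*
 * Map an ifindex to the ifindex of its L3 master device:
 *  - the device itself if it is an L3 master,
 *  - its master if that master is an L3 master,
 *  - 0 otherwise (including ifindex 0 or an unknown device).
 */
static int master_ifindex_by_index(const struct dev_model *devs, size_t n,
				   int ifindex)
{
	const struct dev_model *dev, *master;

	if (ifindex == 0)
		return 0;
	dev = find_dev(devs, n, ifindex);
	if (!dev)
		return 0;
	if (dev->is_l3_master)
		return dev->ifindex;
	master = find_dev(devs, n, dev->master_ifindex);
	if (master && master->is_l3_master)
		return master->ifindex;
	return 0;
}
```

In this model a plain interface such as eth1 maps to 0, so the xfrm route
lookup is no longer pinned to it, while an interface enslaved to a VRF maps
to the VRF device and keeps the lookup inside that VRF. That matches the
remark above that the change covers each known use case separately but not
the two combined.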

* Re: [RFC PATCH] xfrm: Add option to reset oif in xfrm lookup
  2016-08-04  4:52         ` David Ahern
@ 2016-08-05 21:57           ` subashab
  0 siblings, 0 replies; 8+ messages in thread
From: subashab @ 2016-08-05 21:57 UTC (permalink / raw)
  To: David Ahern; +Cc: steffen.klassert, netdev, herbert, netdev-owner

> I need to do some additional testing next week (taking PTO the next 2
> days), but this should fix your problem. Can you confirm? This is
> better than a sysctl to handle the known use cases, but it does not
> handle a combination of the 2 known use cases (e.g., throw your use
> case into a VRF).
> 
> diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
> index b644a23c3db0..41f5b504a782 100644
> --- a/net/ipv4/xfrm4_policy.c
> +++ b/net/ipv4/xfrm4_policy.c
> @@ -29,7 +29,7 @@ static struct dst_entry *__xfrm4_dst_lookup(struct
> net *net, struct flowi4 *fl4,
>         memset(fl4, 0, sizeof(*fl4));
>         fl4->daddr = daddr->a4;
>         fl4->flowi4_tos = tos;
> -       fl4->flowi4_oif = oif;
> +       fl4->flowi4_oif = l3mdev_master_ifindex_by_index(net, oif);
>         if (saddr)
>                 fl4->saddr = saddr->a4;

Thanks David. This works for me on 4.4 (along with commit 
1a8524794fc7c70f44ac28e3a6e8fd637bc41f14 ('net: l3mdev: Add master 
device lookup by index')).
Let me know if you have some other approach in mind or if this needs to 
be sent as an official patch.

^ permalink raw reply	[flat|nested] 8+ messages in thread

