All of lore.kernel.org
 help / color / mirror / Atom feed
* apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265
@ 2022-04-29 10:08 Paolo Abeni
  2022-05-03 18:37 ` apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265 (checksums) Mat Martineau
  2022-05-03 20:41 ` apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265 Maxim Galaganov
  0 siblings, 2 replies; 9+ messages in thread
From: Paolo Abeni @ 2022-04-29 10:08 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang, Mat Martineau

Hello,

I'm investigating the mentioned issue. Not strictily related to the
observed failure, I think/fear we have a very serious problem WRT csum.

Currently, AFAICS, we encode the csum converting the value returned
from csum_fold(csum_partial(csum_unfold)) to be:

https://elixir.bootlin.com/linux/v5.18-rc4/source/net/mptcp/options.c#L1343

while the UDP/TCP csum store directly the result for the same
operation:

https://elixir.bootlin.com/linux/v5.18-rc4/source/net/core/dev.c#L3228

I guess we are doing it wrong but we don't obseve any specific problem
as tests runs on the same arch and we do swap on rx, too. 

I think should see sistematic csum failure when the involved peers have
different endianess (e.g. x86 vs arm). If anyone has easy access to
both systems, could please verify the above?

If the above is true the nasty part is that I don't see how to fix this
without breaking the interop with bugged versions :////

/P


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265 (checksums)
  2022-04-29 10:08 apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265 Paolo Abeni
@ 2022-05-03 18:37 ` Mat Martineau
  2022-05-04 16:41   ` Matthieu Baerts
  2022-05-03 20:41 ` apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265 Maxim Galaganov
  1 sibling, 1 reply; 9+ messages in thread
From: Mat Martineau @ 2022-05-03 18:37 UTC (permalink / raw)
  To: Paolo Abeni, Christoph Paasch; +Cc: mptcp, Geliang Tang

On Fri, 29 Apr 2022, Paolo Abeni wrote:

> Hello,
>
> I'm investigating the mentioned issue. Not strictily related to the
> observed failure, I think/fear we have a very serious problem WRT csum.
>
> Currently, AFAICS, we encode the csum converting the value returned
> from csum_fold(csum_partial(csum_unfold)) to be:
>
> https://elixir.bootlin.com/linux/v5.18-rc4/source/net/mptcp/options.c#L1343
>
> while the UDP/TCP csum store directly the result for the same
> operation:
>
> https://elixir.bootlin.com/linux/v5.18-rc4/source/net/core/dev.c#L3228
>

Another example where the checksum is usually assigned in the tcphdr shows 
similar direct assignment:

https://elixir.bootlin.com/linux/latest/source/net/ipv4/tcp_ipv4.c#L634

> I guess we are doing it wrong but we don't obseve any specific problem
> as tests runs on the same arch and we do swap on rx, too.

It looks that way to me as well... Back when we did interop with 
multipath-tcp.org (before moving to MPTCPv1, which was prior to 
upstreaming), we hadn't added checksum support yet.

> I think should see sistematic csum failure when the involved peers have
> different endianess (e.g. x86 vs arm). If anyone has easy access to
> both systems, could please verify the above?

Most ARM systems are also little-endian.

Aside from mixed-endian interop, there's also stack interop to consider. 
@Christoph, do you know if DSS checksums are interoperating with other 
MPTCPv1 stacks?


The emulated MIPS VM I tried to set up in qemu last year was not stable 
enough to even 'git clone' the kernel code, it would be worth trying that 
again with a cross-compiled kernel.

I noticed yesterday that the BPF CI is running big-endian tests on the z15 
architecture. I think that's through travis-ci. Do CI providers like 
Travis or Cirrus allow interop testing between multiple VMs?

Checking on other tools we could use to confirm checksum operation (and 
run on either big- or little-endian platforms), it looks like wireshark 
tries to track DSS checksums but does not verify them. Packetdrill seems 
to have some code (maybe incomplete?) for getting the checksum data from 
headers - would require some work to complete that code and add test 
cases.

>
> If the above is true the nasty part is that I don't see how to fix this
> without breaking the interop with bugged versions :////

Same :|

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265
  2022-04-29 10:08 apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265 Paolo Abeni
  2022-05-03 18:37 ` apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265 (checksums) Mat Martineau
@ 2022-05-03 20:41 ` Maxim Galaganov
  2022-05-03 23:47   ` Mat Martineau
  2022-05-04  9:52   ` Paolo Abeni
  1 sibling, 2 replies; 9+ messages in thread
From: Maxim Galaganov @ 2022-05-03 20:41 UTC (permalink / raw)
  To: Paolo Abeni, mptcp; +Cc: Geliang Tang, Mat Martineau

On 29.04.2022 13:08, Paolo Abeni wrote:
> I think should see sistematic csum failure when the involved peers have
> different endianess (e.g. x86 vs arm). If anyone has easy access to
> both systems, could please verify the above?
> 

I've built OpenWrt from current master (4509b79) for ath79/generic with 
CONFIG_TESTING_KERNEL=y
CONFIG_MPTCP=y
CONFIG_MPTCP_IPV6=y

and flashed it onto TP-Link Archer C7 v5.
This is what I can observe with a simple echo server and nc.

Client:
% uname -r
5.17.4-200.fc35.x86_64
% lscpu |fgrep ndian
Byte Order:                      Little Endian
% mptcpize run nc 192.168.2.1 12345
qwerty
Ncat: Connection reset by peer.

Server:
root@tpa3:~# uname -r
5.15.35
root@tpa3:~# lscpu |fgrep ndian
Byte Order:          Big Endian
root@tpa3:~# sysctl -w net.mptcp.checksum_enabled=1
net.mptcp.checksum_enabled = 1
root@tpa3:~# ./echo 12345
Server is listening on 12345
^C
root@tpa3:~# nstat 'Mptcp*'
#kernel
MPTcpExtMPCapableSYNRX          1                  0.0
MPTcpExtMPCapableACKRX          1                  0.0
MPTcpExtDataCsumErr             1                  0.0
MPTcpExtMPFailTx                1                  0.0

The issue goes away when I set net.mptcp.checksum_enabled=0 on both 
ends. Please tell me if you need more info.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265
  2022-05-03 20:41 ` apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265 Maxim Galaganov
@ 2022-05-03 23:47   ` Mat Martineau
  2022-05-04  9:52   ` Paolo Abeni
  1 sibling, 0 replies; 9+ messages in thread
From: Mat Martineau @ 2022-05-03 23:47 UTC (permalink / raw)
  To: Maxim Galaganov; +Cc: Paolo Abeni, mptcp, Geliang Tang

On Tue, 3 May 2022, Maxim Galaganov wrote:

> On 29.04.2022 13:08, Paolo Abeni wrote:
>> I think should see sistematic csum failure when the involved peers have
>> different endianess (e.g. x86 vs arm). If anyone has easy access to
>> both systems, could please verify the above?
>> 
>
> I've built OpenWrt from current master (4509b79) for ath79/generic with 
> CONFIG_TESTING_KERNEL=y
> CONFIG_MPTCP=y
> CONFIG_MPTCP_IPV6=y
>
> and flashed it onto TP-Link Archer C7 v5.
> This is what I can observe with a simple echo server and nc.
>
> Client:
> % uname -r
> 5.17.4-200.fc35.x86_64
> % lscpu |fgrep ndian
> Byte Order:                      Little Endian
> % mptcpize run nc 192.168.2.1 12345
> qwerty
> Ncat: Connection reset by peer.
>
> Server:
> root@tpa3:~# uname -r
> 5.15.35
> root@tpa3:~# lscpu |fgrep ndian
> Byte Order:          Big Endian
> root@tpa3:~# sysctl -w net.mptcp.checksum_enabled=1
> net.mptcp.checksum_enabled = 1
> root@tpa3:~# ./echo 12345
> Server is listening on 12345
> ^C
> root@tpa3:~# nstat 'Mptcp*'
> #kernel
> MPTcpExtMPCapableSYNRX          1                  0.0
> MPTcpExtMPCapableACKRX          1                  0.0
> MPTcpExtDataCsumErr             1                  0.0
> MPTcpExtMPFailTx                1                  0.0
>
> The issue goes away when I set net.mptcp.checksum_enabled=0 on both ends. 
> Please tell me if you need more info.
>

Hi Maxim -

Thanks for verifying the problem and that it does appear to interop when 
checksums are off.

I found some helpful notes from the BPF maintainers on setting up a 
big-endian s390x qemu guest:

https://www.kernel.org/doc/html/latest/bpf/s390.html

Hopefully that will allow us to set up some vms for ongoing testing. I'm 
working on setting one up to experiment with.

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265
  2022-05-03 20:41 ` apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265 Maxim Galaganov
  2022-05-03 23:47   ` Mat Martineau
@ 2022-05-04  9:52   ` Paolo Abeni
  2022-05-04 14:30     ` Maxim Galaganov
  1 sibling, 1 reply; 9+ messages in thread
From: Paolo Abeni @ 2022-05-04  9:52 UTC (permalink / raw)
  To: Maxim Galaganov, mptcp; +Cc: Geliang Tang, Mat Martineau

On Tue, 2022-05-03 at 23:41 +0300, Maxim Galaganov wrote:
> On 29.04.2022 13:08, Paolo Abeni wrote:
> > I think should see sistematic csum failure when the involved peers have
> > different endianess (e.g. x86 vs arm). If anyone has easy access to
> > both systems, could please verify the above?
> > 
> 
> I've built OpenWrt from current master (4509b79) for ath79/generic with 
> CONFIG_TESTING_KERNEL=y
> CONFIG_MPTCP=y
> CONFIG_MPTCP_IPV6=y
> 
> and flashed it onto TP-Link Archer C7 v5.
> This is what I can observe with a simple echo server and nc.
> 
> Client:
> % uname -r
> 5.17.4-200.fc35.x86_64
> % lscpu |fgrep ndian
> Byte Order:                      Little Endian
> % mptcpize run nc 192.168.2.1 12345
> qwerty
> Ncat: Connection reset by peer.
> 
> Server:
> root@tpa3:~# uname -r
> 5.15.35
> root@tpa3:~# lscpu |fgrep ndian
> Byte Order:          Big Endian
> root@tpa3:~# sysctl -w net.mptcp.checksum_enabled=1
> net.mptcp.checksum_enabled = 1
> root@tpa3:~# ./echo 12345
> Server is listening on 12345
> ^C
> root@tpa3:~# nstat 'Mptcp*'
> #kernel
> MPTcpExtMPCapableSYNRX          1                  0.0
> MPTcpExtMPCapableACKRX          1                  0.0
> MPTcpExtDataCsumErr             1                  0.0
> MPTcpExtMPFailTx                1                  0.0
> 
> The issue goes away when I set net.mptcp.checksum_enabled=0 on both 
> ends. Please tell me if you need more info.

Thank you very much for the testing! That has been very helpful!

Indeed we have serious problems :/

Could you please additionally test the following patch? Kernel for both
arches must be rebuilt.

This should cause old x86 build with csum enabled to fallback to TCP
while interoperating with new ones, which is ugly and bad, but - given
that csum is disable by default - possibly still acceptable?!? 

Thanks!

Paolo 
---
diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index ac3b7b8a02f6..5b5849d9fe60 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -107,7 +107,7 @@ static void mptcp_parse_option(const struct sk_buff *skb,
 			ptr += 2;
 		}
 		if (opsize == TCPOLEN_MPTCP_MPC_ACK_DATA_CSUM) {
-			mp_opt->csum = (__force __sum16)get_unaligned_be16(ptr);
+			mp_opt->csum = get_unaligned((__force __sum16 *)ptr);
 			mp_opt->suboptions |= OPTION_MPTCP_CSUMREQD;
 			ptr += 2;
 		}
@@ -221,7 +221,7 @@ static void mptcp_parse_option(const struct sk_buff *skb,
 
 			if (opsize == expected_opsize + TCPOLEN_MPTCP_DSS_CHECKSUM) {
 				mp_opt->suboptions |= OPTION_MPTCP_CSUMREQD;
-				mp_opt->csum = (__force __sum16)get_unaligned_be16(ptr);
+				mp_opt->csum = get_unaligned((__force __sum16 *)ptr);
 				ptr += 2;
 			}
 
@@ -1282,7 +1282,7 @@ static void mptcp_set_rwin(struct tcp_sock *tp, struct tcphdr *th)
 	}
 }
 
-u16 __mptcp_make_csum(u64 data_seq, u32 subflow_seq, u16 data_len, __wsum sum)
+__sum16 __mptcp_make_csum(u64 data_seq, u32 subflow_seq, u16 data_len, __wsum sum)
 {
 	struct csum_pseudo_header header;
 	__wsum csum;
@@ -1298,15 +1298,23 @@ u16 __mptcp_make_csum(u64 data_seq, u32 subflow_seq, u16 data_len, __wsum sum)
 	header.csum = 0;
 
 	csum = csum_partial(&header, sizeof(header), sum);
-	return (__force u16)csum_fold(csum);
+	return csum_fold(csum);
 }
 
-static u16 mptcp_make_csum(const struct mptcp_ext *mpext)
+static __sum16 mptcp_make_csum(const struct mptcp_ext *mpext)
 {
 	return __mptcp_make_csum(mpext->data_seq, mpext->subflow_seq, mpext->data_len,
 				 ~csum_unfold(mpext->csum));
 }
 
+static void put_len_csum(u16 len, __sum16 csum, __be16 *ptr)
+{
+	put_unaligned_be16(len, ptr);
+
+	ptr += 1;
+	put_unaligned(csum, ptr);
+}
+
 void mptcp_write_options(struct tcphdr *th, __be32 *ptr, struct tcp_sock *tp,
 			 struct mptcp_out_options *opts)
 {
@@ -1385,9 +1393,9 @@ void mptcp_write_options(struct tcphdr *th, __be32 *ptr, struct tcp_sock *tp,
 				/* data_len == 0 is reserved for the infinite mapping,
 				 * the checksum will also be set to 0.
 				 */
-				put_unaligned_be32(mpext->data_len << 16 |
-						   (mpext->data_len ? mptcp_make_csum(mpext) : 0),
-						   ptr);
+				put_len_csum(mpext->data_len,
+					     mpext->data_len ? mptcp_make_csum(mpext) : 0,
+					     (__force __be16 *)ptr);
 			} else {
 				put_unaligned_be32(mpext->data_len << 16 |
 						   TCPOPT_NOP << 8 | TCPOPT_NOP, ptr);
@@ -1438,11 +1446,12 @@ void mptcp_write_options(struct tcphdr *th, __be32 *ptr, struct tcp_sock *tp,
 			goto mp_capable_done;
 
 		if (opts->csum_reqd) {
-			put_unaligned_be32(opts->data_len << 16 |
-					   __mptcp_make_csum(opts->data_seq,
-							     opts->subflow_seq,
-							     opts->data_len,
-							     ~csum_unfold(opts->csum)), ptr);
+			put_len_csum(opts->data_len,
+				     __mptcp_make_csum(opts->data_seq,
+						       opts->subflow_seq,
+						       opts->data_len,
+						       ~csum_unfold(opts->csum)),
+				     (__force __be16 *)ptr);
 		} else {
 			put_unaligned_be32(opts->data_len << 16 |
 					   TCPOPT_NOP << 8 | TCPOPT_NOP, ptr);


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265
  2022-05-04  9:52   ` Paolo Abeni
@ 2022-05-04 14:30     ` Maxim Galaganov
  2022-05-06  1:11       ` Mat Martineau
  0 siblings, 1 reply; 9+ messages in thread
From: Maxim Galaganov @ 2022-05-04 14:30 UTC (permalink / raw)
  To: Paolo Abeni, mptcp; +Cc: Geliang Tang, Mat Martineau

On 04.05.2022 12:52, Paolo Abeni wrote:
> 
> Could you please additionally test the following patch? Kernel for both
> arches must be rebuilt.
> 
> This should cause old x86 build with csum enabled to fallback to TCP
> while interoperating with new ones, which is ugly and bad, but - given
> that csum is disable by default - possibly still acceptable?!?


The results for me are as follows:

old x86_64 (5.17.5) + old mips (5.15.35) => reset
new x86_64 (5.18-rc5+patch) vs old mips (5.15.35) => works
new x86_64 (5.18-rc5+patch) vs new mips (5.15.35+patch) => works
old x86_64 (5.17.2) + new mips (5.15.35+patch) => reset
old x86_64 (5.17.2) + new x86_64 (5.18-rc5+patch) => reset
new x86_64 (5.18-rc5+patch) talking to itself => works

I edited the patch so that it applies to my kernels, and this might be 
the reason I don't see fallback but see reset instead -- I don't have 
"mptcp: TCP fallback for established connections" patches in my tree as 
they are only in net-next yet. Or I may be doing something stupid.

Is there a way for an old box without infinite mapping support to 
fallback an established connection in case of csum failure?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265 (checksums)
  2022-05-03 18:37 ` apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265 (checksums) Mat Martineau
@ 2022-05-04 16:41   ` Matthieu Baerts
  0 siblings, 0 replies; 9+ messages in thread
From: Matthieu Baerts @ 2022-05-04 16:41 UTC (permalink / raw)
  To: Mat Martineau, Paolo Abeni, Christoph Paasch; +Cc: mptcp, Geliang Tang

Hi Mat,

On 03/05/2022 20:37, Mat Martineau wrote:
> I noticed yesterday that the BPF CI is running big-endian tests on the
> z15 architecture. I think that's through travis-ci. Do CI providers like
> Travis or Cirrus allow interop testing between multiple VMs?

I guess we can always launch multiple VMs with qemu and allow them to
discuss with each others.

It is not complex but a bit of work to do.

> Checking on other tools we could use to confirm checksum operation (and
> run on either big- or little-endian platforms), it looks like wireshark
> tries to track DSS checksums but does not verify them. Packetdrill seems
> to have some code (maybe incomplete?) for getting the checksum data from
> headers - would require some work to complete that code and add test cases.

Maybe Packetdrill can help? Not sure indeed.

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265
  2022-05-04 14:30     ` Maxim Galaganov
@ 2022-05-06  1:11       ` Mat Martineau
  2022-05-07  0:49         ` Mat Martineau
  0 siblings, 1 reply; 9+ messages in thread
From: Mat Martineau @ 2022-05-06  1:11 UTC (permalink / raw)
  To: Maxim Galaganov, Paolo Abeni; +Cc: mptcp, Geliang Tang

On Wed, 4 May 2022, Maxim Galaganov wrote:

> On 04.05.2022 12:52, Paolo Abeni wrote:
>> 
>> Could you please additionally test the following patch? Kernel for both
>> arches must be rebuilt.
>> 
>> This should cause old x86 build with csum enabled to fallback to TCP
>> while interoperating with new ones, which is ugly and bad, but - given
>> that csum is disable by default - possibly still acceptable?!?
>
>
> The results for me are as follows:
>
> old x86_64 (5.17.5) + old mips (5.15.35) => reset
> new x86_64 (5.18-rc5+patch) vs old mips (5.15.35) => works
> new x86_64 (5.18-rc5+patch) vs new mips (5.15.35+patch) => works
> old x86_64 (5.17.2) + new mips (5.15.35+patch) => reset
> old x86_64 (5.17.2) + new x86_64 (5.18-rc5+patch) => reset
> new x86_64 (5.18-rc5+patch) talking to itself => works
>
> I edited the patch so that it applies to my kernels, and this might be the 
> reason I don't see fallback but see reset instead -- I don't have "mptcp: TCP 
> fallback for established connections" patches in my tree as they are only in 
> net-next yet. Or I may be doing something stupid.
>
> Is there a way for an old box without infinite mapping support to fallback an 
> established connection in case of csum failure?
>

In some cases, I think we can do that!

Both peers do not consider MPTCP "fully established" until they receive a 
valid DATA_ACK, and the old box will fall back if the new box sends an 
infinite mapping in response to the first data. So, if as part of the fix 
we make sure the new box responds to a bad checksum in the initial DSS (or 
MPC+DATA) with an infinite mapping instead of a reset, fallback should 
work. And I think what's needed to do that is to only set subflow->mp_fail 
at the end of validate_data_csum() if subflow->fully_established is 
'true'.

However, if the new box is the first to send data with a checksum, the old 
box decides whether to reset or fallback. It looks like unpatched 5.17, 
5.16, and 5.15 would send a reset in this "patched host sends data first" 
scenario. 5.14 looks like it would fall back? Older MPTCP kernels don't 
support DSS checksums.

This means that if the use case is a client/server setup where the client 
sends data first, and the server is running a new/patched kernel, we get 
the fallback behavior. That's good!


This still leaves that "patched host sends data first" scenario that 
should be addressed. The RFC requires checksums to be enabled if one peer 
requests it in the SYN/SYNACK handshake. Is it too much of a kludge to use 
a sysctl that would force TCP fallback if a peer requests checksums? This 
would be just like the behavior we had in v5.13 and earlier. We could add 
another state to the existing MPTCP checksum sysctl ("force fallback if 
checksums requested").


--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265
  2022-05-06  1:11       ` Mat Martineau
@ 2022-05-07  0:49         ` Mat Martineau
  0 siblings, 0 replies; 9+ messages in thread
From: Mat Martineau @ 2022-05-07  0:49 UTC (permalink / raw)
  To: Maxim Galaganov, Paolo Abeni; +Cc: mptcp, Geliang Tang

On Thu, 5 May 2022, Mat Martineau wrote:

> On Wed, 4 May 2022, Maxim Galaganov wrote:
>
>> On 04.05.2022 12:52, Paolo Abeni wrote:
>>> 
>>> Could you please additionally test the following patch? Kernel for both
>>> arches must be rebuilt.
>>> 
>>> This should cause old x86 build with csum enabled to fallback to TCP
>>> while interoperating with new ones, which is ugly and bad, but - given
>>> that csum is disable by default - possibly still acceptable?!?
>> 
>> 
>> The results for me are as follows:
>> 
>> old x86_64 (5.17.5) + old mips (5.15.35) => reset
>> new x86_64 (5.18-rc5+patch) vs old mips (5.15.35) => works
>> new x86_64 (5.18-rc5+patch) vs new mips (5.15.35+patch) => works
>> old x86_64 (5.17.2) + new mips (5.15.35+patch) => reset
>> old x86_64 (5.17.2) + new x86_64 (5.18-rc5+patch) => reset
>> new x86_64 (5.18-rc5+patch) talking to itself => works
>> 
>> I edited the patch so that it applies to my kernels, and this might be the 
>> reason I don't see fallback but see reset instead -- I don't have "mptcp: 
>> TCP fallback for established connections" patches in my tree as they are 
>> only in net-next yet. Or I may be doing something stupid.
>> 
>> Is there a way for an old box without infinite mapping support to fallback 
>> an established connection in case of csum failure?
>> 
>
> In some cases, I think we can do that!
>
> Both peers do not consider MPTCP "fully established" until they receive a 
> valid DATA_ACK, and the old box will fall back if the new box sends an 
> infinite mapping in response to the first data. So, if as part of the fix we 
> make sure the new box responds to a bad checksum in the initial DSS (or 
> MPC+DATA) with an infinite mapping instead of a reset, fallback should work. 
> And I think what's needed to do that is to only set subflow->mp_fail at the 
> end of validate_data_csum() if subflow->fully_established is 'true'.

The subflow->fully_established flag in the kernel code is set when the DSS 
is received, and some modifications to set that flag later was not 
compatible with the self tests. So I'm trying something slightly different 
that doesn't depend on that specific flag, but that's still a 
work-in-progress.

>
> However, if the new box is the first to send data with a checksum, the old 
> box decides whether to reset or fallback. It looks like unpatched 5.17, 5.16, 
> and 5.15 would send a reset in this "patched host sends data first" scenario. 
> 5.14 looks like it would fall back? Older MPTCP kernels don't support DSS 
> checksums.
>
> This means that if the use case is a client/server setup where the client 
> sends data first, and the server is running a new/patched kernel, we get the 
> fallback behavior. That's good!
>
>
> This still leaves that "patched host sends data first" scenario that should 
> be addressed. The RFC requires checksums to be enabled if one peer requests 
> it in the SYN/SYNACK handshake. Is it too much of a kludge to use a sysctl 
> that would force TCP fallback if a peer requests checksums? This would be 
> just like the behavior we had in v5.13 and earlier. We could add another 
> state to the existing MPTCP checksum sysctl ("force fallback if checksums 
> requested").
>
>
> --
> Mat Martineau
> Intel
>
>

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-05-07  0:49 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-04-29 10:08 apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265 Paolo Abeni
2022-05-03 18:37 ` apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265 (checksums) Mat Martineau
2022-05-04 16:41   ` Matthieu Baerts
2022-05-03 20:41 ` apropos https://github.com/multipath-tcp/mptcp_net-next/issues/265 Maxim Galaganov
2022-05-03 23:47   ` Mat Martineau
2022-05-04  9:52   ` Paolo Abeni
2022-05-04 14:30     ` Maxim Galaganov
2022-05-06  1:11       ` Mat Martineau
2022-05-07  0:49         ` Mat Martineau

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.