* [PATCH v2 net] tcp: Enable header prediction for active open connections with MD5.
@ 2023-08-03 4:22 Kuniyuki Iwashima
2023-08-03 6:44 ` Eric Dumazet
0 siblings, 1 reply; 6+ messages in thread
From: Kuniyuki Iwashima @ 2023-08-03 4:22 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
David Ahern
Cc: YOSHIFUJI Hideaki, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
TCP socket saves the minimum required header length in tcp_header_len
of struct tcp_sock, and later the value is used in __tcp_fast_path_on()
to generate a part of TCP header in tcp_sock(sk)->pred_flags.
In tcp_rcv_established(), if the incoming packet has the same pattern
with pred_flags, we enter the fast path and skip full option parsing.
The MD5 option is parsed in tcp_v[46]_rcv(), so we need not parse it
again later in tcp_rcv_established() unless other options exist. Thus,
MD5 should add TCPOLEN_MD5SIG_ALIGNED to tcp_header_len and avoid the
slow path.
For passive open connections with MD5, we add TCPOLEN_MD5SIG_ALIGNED
to tcp_header_len in tcp_create_openreq_child() after 3WHS.
On the other hand, we do it in tcp_connect_init() for active open
connections. However, the value is overwritten while processing
SYN+ACK or crossed SYN in tcp_rcv_synsent_state_process().
1) SYN+ACK
tcp_rcv_synsent_state_process
tp->tcp_header_len = sizeof(struct tcphdr) or
sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED
tcp_finish_connect
__tcp_fast_path_on
tcp_send_ack
2) Crossed SYN and the following ACK
tcp_rcv_synsent_state_process
tp->tcp_header_len = sizeof(struct tcphdr) or
sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED
tcp_set_state(sk, TCP_SYN_RECV)
tcp_send_synack
-- ACK received --
tcp_v4_rcv
tcp_v4_do_rcv
tcp_rcv_state_process
tcp_fast_path_on
__tcp_fast_path_on
So these two cases will have the wrong value in pred_flags and never
go into the fast path.
Let's add TCPOLEN_MD5SIG_ALIGNED in tcp_rcv_synsent_state_process()
to enable header prediction for active open connections.
Fixes: cfb6eeb4c860 ("[TCP]: MD5 Signature Option (RFC2385) support.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
v2:
* Update function graph
v1: https://lore.kernel.org/all/20230803011658.17086-1-kuniyu@amazon.com/
---
net/ipv4/tcp_input.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 57c8af1859c1..4d274b511d97 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6291,6 +6291,11 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
tp->tcp_header_len = sizeof(struct tcphdr);
}
+#ifdef CONFIG_TCP_MD5SIG
+ if (tp->af_specific->md5_lookup(sk, sk))
+ tp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED;
+#endif
+
tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
tcp_initialize_rcv_mss(sk);
@@ -6368,6 +6373,11 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
tp->tcp_header_len = sizeof(struct tcphdr);
}
+#ifdef CONFIG_TCP_MD5SIG
+ if (tp->af_specific->md5_lookup(sk, sk))
+ tp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED;
+#endif
+
WRITE_ONCE(tp->rcv_nxt, TCP_SKB_CB(skb)->seq + 1);
WRITE_ONCE(tp->copied_seq, tp->rcv_nxt);
tp->rcv_wup = TCP_SKB_CB(skb)->seq + 1;
--
2.30.2
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH v2 net] tcp: Enable header prediction for active open connections with MD5.
2023-08-03 4:22 [PATCH v2 net] tcp: Enable header prediction for active open connections with MD5 Kuniyuki Iwashima
@ 2023-08-03 6:44 ` Eric Dumazet
2023-08-03 6:59 ` Paolo Abeni
2023-08-03 21:25 ` Kuniyuki Iwashima
0 siblings, 2 replies; 6+ messages in thread
From: Eric Dumazet @ 2023-08-03 6:44 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
YOSHIFUJI Hideaki, Kuniyuki Iwashima, netdev
On Thu, Aug 3, 2023 at 6:22 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> TCP socket saves the minimum required header length in tcp_header_len
> of struct tcp_sock, and later the value is used in __tcp_fast_path_on()
> to generate a part of TCP header in tcp_sock(sk)->pred_flags.
>
> In tcp_rcv_established(), if the incoming packet has the same pattern
> with pred_flags, we enter the fast path and skip full option parsing.
>
> The MD5 option is parsed in tcp_v[46]_rcv(), so we need not parse it
> again later in tcp_rcv_established() unless other options exist. Thus,
> MD5 should add TCPOLEN_MD5SIG_ALIGNED to tcp_header_len and avoid the
> slow path.
>
> For passive open connections with MD5, we add TCPOLEN_MD5SIG_ALIGNED
> to tcp_header_len in tcp_create_openreq_child() after 3WHS.
>
> On the other hand, we do it in tcp_connect_init() for active open
> connections. However, the value is overwritten while processing
> SYN+ACK or crossed SYN in tcp_rcv_synsent_state_process().
>
> 1) SYN+ACK
>
> tcp_rcv_synsent_state_process
> tp->tcp_header_len = sizeof(struct tcphdr) or
> sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED
> tcp_finish_connect
> __tcp_fast_path_on
> tcp_send_ack
>
> 2) Crossed SYN and the following ACK
>
> tcp_rcv_synsent_state_process
> tp->tcp_header_len = sizeof(struct tcphdr) or
> sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED
> tcp_set_state(sk, TCP_SYN_RECV)
> tcp_send_synack
>
> -- ACK received --
> tcp_v4_rcv
> tcp_v4_do_rcv
> tcp_rcv_state_process
> tcp_fast_path_on
> __tcp_fast_path_on
>
> So these two cases will have the wrong value in pred_flags and never
> go into the fast path.
>
> Let's add TCPOLEN_MD5SIG_ALIGNED in tcp_rcv_synsent_state_process()
> to enable header prediction for active open connections.
I do not think we want to slow down fast path (no MD5), for 'header
prediction' of MD5 flows,
considering how slow MD5 is anyway (no GSO/GRO), and add yet another
ugly #ifdef CONFIG_TCP_MD5SIG
in already convoluted code base.
The case of cross-syn is kind of hilarious, if you ask me.
>
> Fixes: cfb6eeb4c860 ("[TCP]: MD5 Signature Option (RFC2385) support.")
This would be net-next material anyway, unless you show a huge
improvement after this patch,
which I doubt very much.
Or maybe the real intent is not fully expressed in your changelog ?
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH v2 net] tcp: Enable header prediction for active open connections with MD5.
2023-08-03 6:44 ` Eric Dumazet
@ 2023-08-03 6:59 ` Paolo Abeni
2023-08-03 7:07 ` Eric Dumazet
2023-08-03 21:25 ` Kuniyuki Iwashima
1 sibling, 1 reply; 6+ messages in thread
From: Paolo Abeni @ 2023-08-03 6:59 UTC (permalink / raw)
To: Eric Dumazet, Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, David Ahern, YOSHIFUJI Hideaki,
Kuniyuki Iwashima, netdev
On Thu, 2023-08-03 at 08:44 +0200, Eric Dumazet wrote:
> On Thu, Aug 3, 2023 at 6:22 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > TCP socket saves the minimum required header length in tcp_header_len
> > of struct tcp_sock, and later the value is used in __tcp_fast_path_on()
> > to generate a part of TCP header in tcp_sock(sk)->pred_flags.
> >
> > In tcp_rcv_established(), if the incoming packet has the same pattern
> > with pred_flags, we enter the fast path and skip full option parsing.
> >
> > The MD5 option is parsed in tcp_v[46]_rcv(), so we need not parse it
> > again later in tcp_rcv_established() unless other options exist. Thus,
> > MD5 should add TCPOLEN_MD5SIG_ALIGNED to tcp_header_len and avoid the
> > slow path.
> >
> > For passive open connections with MD5, we add TCPOLEN_MD5SIG_ALIGNED
> > to tcp_header_len in tcp_create_openreq_child() after 3WHS.
> >
> > On the other hand, we do it in tcp_connect_init() for active open
> > connections. However, the value is overwritten while processing
> > SYN+ACK or crossed SYN in tcp_rcv_synsent_state_process().
> >
> > 1) SYN+ACK
> >
> > tcp_rcv_synsent_state_process
> > tp->tcp_header_len = sizeof(struct tcphdr) or
> > sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED
> > tcp_finish_connect
> > __tcp_fast_path_on
> > tcp_send_ack
> >
> > 2) Crossed SYN and the following ACK
> >
> > tcp_rcv_synsent_state_process
> > tp->tcp_header_len = sizeof(struct tcphdr) or
> > sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED
> > tcp_set_state(sk, TCP_SYN_RECV)
> > tcp_send_synack
> >
> > -- ACK received --
> > tcp_v4_rcv
> > tcp_v4_do_rcv
> > tcp_rcv_state_process
> > tcp_fast_path_on
> > __tcp_fast_path_on
> >
> > So these two cases will have the wrong value in pred_flags and never
> > go into the fast path.
> >
> > Let's add TCPOLEN_MD5SIG_ALIGNED in tcp_rcv_synsent_state_process()
> > to enable header prediction for active open connections.
>
> I do not think we want to slow down fast path (no MD5), for 'header
> prediction' of MD5 flows,
> considering how slow MD5 is anyway (no GSO/GRO), and add yet another
> ugly #ifdef CONFIG_TCP_MD5SIG
> in already convoluted code base.
Somewhat related, do you know how much header prediction makes a
difference for plain TCP? IIRC quite some time ago there was the idea
to remove header prediction completely to simplify the code overall -
then reverted because indeed caused regression in RR test-case. Do you
know if that is still true? would it make sense to re-evaluate that
thing (HP removal) again?
Thanks!
Paolo
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 net] tcp: Enable header prediction for active open connections with MD5.
2023-08-03 6:59 ` Paolo Abeni
@ 2023-08-03 7:07 ` Eric Dumazet
2023-08-03 15:02 ` Florian Westphal
0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2023-08-03 7:07 UTC (permalink / raw)
To: Paolo Abeni, Florian Westphal
Cc: Kuniyuki Iwashima, David S. Miller, Jakub Kicinski, David Ahern,
YOSHIFUJI Hideaki, Kuniyuki Iwashima, netdev
On Thu, Aug 3, 2023 at 8:59 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Thu, 2023-08-03 at 08:44 +0200, Eric Dumazet wrote:
> > On Thu, Aug 3, 2023 at 6:22 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > >
> > > TCP socket saves the minimum required header length in tcp_header_len
> > > of struct tcp_sock, and later the value is used in __tcp_fast_path_on()
> > > to generate a part of TCP header in tcp_sock(sk)->pred_flags.
> > >
> > > In tcp_rcv_established(), if the incoming packet has the same pattern
> > > with pred_flags, we enter the fast path and skip full option parsing.
> > >
> > > The MD5 option is parsed in tcp_v[46]_rcv(), so we need not parse it
> > > again later in tcp_rcv_established() unless other options exist. Thus,
> > > MD5 should add TCPOLEN_MD5SIG_ALIGNED to tcp_header_len and avoid the
> > > slow path.
> > >
> > > For passive open connections with MD5, we add TCPOLEN_MD5SIG_ALIGNED
> > > to tcp_header_len in tcp_create_openreq_child() after 3WHS.
> > >
> > > On the other hand, we do it in tcp_connect_init() for active open
> > > connections. However, the value is overwritten while processing
> > > SYN+ACK or crossed SYN in tcp_rcv_synsent_state_process().
> > >
> > > 1) SYN+ACK
> > >
> > > tcp_rcv_synsent_state_process
> > > tp->tcp_header_len = sizeof(struct tcphdr) or
> > > sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED
> > > tcp_finish_connect
> > > __tcp_fast_path_on
> > > tcp_send_ack
> > >
> > > 2) Crossed SYN and the following ACK
> > >
> > > tcp_rcv_synsent_state_process
> > > tp->tcp_header_len = sizeof(struct tcphdr) or
> > > sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED
> > > tcp_set_state(sk, TCP_SYN_RECV)
> > > tcp_send_synack
> > >
> > > -- ACK received --
> > > tcp_v4_rcv
> > > tcp_v4_do_rcv
> > > tcp_rcv_state_process
> > > tcp_fast_path_on
> > > __tcp_fast_path_on
> > >
> > > So these two cases will have the wrong value in pred_flags and never
> > > go into the fast path.
> > >
> > > Let's add TCPOLEN_MD5SIG_ALIGNED in tcp_rcv_synsent_state_process()
> > > to enable header prediction for active open connections.
> >
> > I do not think we want to slow down fast path (no MD5), for 'header
> > prediction' of MD5 flows,
> > considering how slow MD5 is anyway (no GSO/GRO), and add yet another
> > ugly #ifdef CONFIG_TCP_MD5SIG
> > in already convoluted code base.
>
> Somewhat related, do you know how much header prediction makes a
> difference for plain TCP? IIRC quite some time ago there was the idea
> to remove header prediction completely to simplify the code overall -
> then reverted because indeed caused regression in RR test-case. Do you
> know if that is still true? would it make sense to re-evaluate that
> thing (HP removal) again?
>
I think Florian did this, he might recall the details.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 net] tcp: Enable header prediction for active open connections with MD5.
2023-08-03 7:07 ` Eric Dumazet
@ 2023-08-03 15:02 ` Florian Westphal
0 siblings, 0 replies; 6+ messages in thread
From: Florian Westphal @ 2023-08-03 15:02 UTC (permalink / raw)
To: Eric Dumazet
Cc: Paolo Abeni, Florian Westphal, Kuniyuki Iwashima, David S. Miller,
Jakub Kicinski, David Ahern, YOSHIFUJI Hideaki, Kuniyuki Iwashima,
netdev
Eric Dumazet <edumazet@google.com> wrote:
> On Thu, Aug 3, 2023 at 8:59 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >
> > On Thu, 2023-08-03 at 08:44 +0200, Eric Dumazet wrote:
> > > I do not think we want to slow down fast path (no MD5), for 'header
> > > prediction' of MD5 flows,
> > > considering how slow MD5 is anyway (no GSO/GRO), and add yet another
> > > ugly #ifdef CONFIG_TCP_MD5SIG
> > > in already convoluted code base.
> >
> > Somewhat related, do you know how much header prediction makes a
> > difference for plain TCP? IIRC quite some time ago there was the idea
> > to remove header prediction completely to simplify the code overall -
> > then reverted because indeed caused regression in RR test-case. Do you
> > know if that is still true? would it make sense to re-evaluate that
> > thing (HP removal) again?
> >
>
> I think Florian did this, he might recall the details.
Memory is hazy here, 31770e34e43d ("tcp: Revert "tcp: remove header prediction"")
has some clues.
One would need to refactor tcp_ack so that all the extra accesses are
only done if the packet matches expected next sequence without advancing
the window. Not sure its worth doing, one would start to add tcp header
prediction v2, with little or no significant LOC savings.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 net] tcp: Enable header prediction for active open connections with MD5.
2023-08-03 6:44 ` Eric Dumazet
2023-08-03 6:59 ` Paolo Abeni
@ 2023-08-03 21:25 ` Kuniyuki Iwashima
1 sibling, 0 replies; 6+ messages in thread
From: Kuniyuki Iwashima @ 2023-08-03 21:25 UTC (permalink / raw)
To: edumazet; +Cc: davem, dsahern, kuba, kuni1840, kuniyu, netdev, pabeni, yoshfuji
From: Eric Dumazet <edumazet@google.com>
Date: Thu, 3 Aug 2023 08:44:14 +0200
> On Thu, Aug 3, 2023 at 6:22 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > TCP socket saves the minimum required header length in tcp_header_len
> > of struct tcp_sock, and later the value is used in __tcp_fast_path_on()
> > to generate a part of TCP header in tcp_sock(sk)->pred_flags.
> >
> > In tcp_rcv_established(), if the incoming packet has the same pattern
> > with pred_flags, we enter the fast path and skip full option parsing.
> >
> > The MD5 option is parsed in tcp_v[46]_rcv(), so we need not parse it
> > again later in tcp_rcv_established() unless other options exist. Thus,
> > MD5 should add TCPOLEN_MD5SIG_ALIGNED to tcp_header_len and avoid the
> > slow path.
> >
> > For passive open connections with MD5, we add TCPOLEN_MD5SIG_ALIGNED
> > to tcp_header_len in tcp_create_openreq_child() after 3WHS.
> >
> > On the other hand, we do it in tcp_connect_init() for active open
> > connections. However, the value is overwritten while processing
> > SYN+ACK or crossed SYN in tcp_rcv_synsent_state_process().
> >
> > 1) SYN+ACK
> >
> > tcp_rcv_synsent_state_process
> > tp->tcp_header_len = sizeof(struct tcphdr) or
> > sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED
> > tcp_finish_connect
> > __tcp_fast_path_on
> > tcp_send_ack
> >
> > 2) Crossed SYN and the following ACK
> >
> > tcp_rcv_synsent_state_process
> > tp->tcp_header_len = sizeof(struct tcphdr) or
> > sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED
> > tcp_set_state(sk, TCP_SYN_RECV)
> > tcp_send_synack
> >
> > -- ACK received --
> > tcp_v4_rcv
> > tcp_v4_do_rcv
> > tcp_rcv_state_process
> > tcp_fast_path_on
> > __tcp_fast_path_on
> >
> > So these two cases will have the wrong value in pred_flags and never
> > go into the fast path.
> >
> > Let's add TCPOLEN_MD5SIG_ALIGNED in tcp_rcv_synsent_state_process()
> > to enable header prediction for active open connections.
>
> I do not think we want to slow down fast path (no MD5), for 'header
> prediction' of MD5 flows,
> considering how slow MD5 is anyway (no GSO/GRO), and add yet another
> ugly #ifdef CONFIG_TCP_MD5SIG
> in already convoluted code base.
>
> The case of cross-syn is kind of hilarious, if you ask me.
>
> >
> > Fixes: cfb6eeb4c860 ("[TCP]: MD5 Signature Option (RFC2385) support.")
>
> This would be net-next material anyway, unless you show a huge
> improvement after this patch,
> which I doubt very much.
I just noticed the MD5 does not add TCPOLEN_MD5SIG_ALIGNED and thought
it would benefit from the fast path, but sorry, you were correct.
I have tested with slightly modified netperf that uses MD5 for each
flow and found that the patch does not improve MD5 perf at all. Rather,
without all TCPOLEN_MD5SIG_ALIGNED addition in tcp_connect_init() and
tcp_create_openreq_child(), the slow path became the faster path for MD5.
On c5.4xlarge EC2 instance (16 vCPU, 32 GiB mem)
$ for i in {1..10}; do
./super_netperf $(nproc) -H localhost -l 10 -- -m 256 -M 256 2>/dev/null;
done
- 36e68eadd303 : 10.376 Gbps
- all fast path : 10.374 Gbps
- all slow path : 10.394 Gbps
I'll post v3 that disable fast path for MD5 instead.
Thanks!
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2023-08-03 21:25 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-03 4:22 [PATCH v2 net] tcp: Enable header prediction for active open connections with MD5 Kuniyuki Iwashima
2023-08-03 6:44 ` Eric Dumazet
2023-08-03 6:59 ` Paolo Abeni
2023-08-03 7:07 ` Eric Dumazet
2023-08-03 15:02 ` Florian Westphal
2023-08-03 21:25 ` Kuniyuki Iwashima
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).