netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4
@ 2023-04-27 13:45 Antoine Tenart
  2023-04-27 13:45 ` [PATCH net-next 1/4] net: tcp: make the txhash available in TIME_WAIT sockets for IPv4 too Antoine Tenart
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Antoine Tenart @ 2023-04-27 13:45 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet; +Cc: Antoine Tenart, netdev

Hello,

Series is divided in two parts. First two commits make the txhash (used
for the skb hash in TCP) to be consistent for all IPv4/TCP packets (IPv6
doesn't have the same issue). Last two commits improve doc/comment
hash-related parts.

One example is when using OvS with dp_hash, which uses skb->hash, to
select a path. We'd like packets from the same flow to be consistent, as
well as the hash being stable over time when using net.core.txrehash=0.
Same applies for kernel ECMP which also can use skb->hash.

IMHO the series makes sense in net-next, but we could argue (some)
commits be seen as fixes and I can resend if necessary.

Thanks!
Antoine

Antoine Tenart (4):
  net: tcp: make the txhash available in TIME_WAIT sockets for IPv4 too
  net: ipv4: use consistent txhash in TIME_WAIT and SYN_RECV
  Documentation: net: net.core.txrehash is not specific to listening
    sockets
  net: skbuff: fix l4_hash comment

 Documentation/admin-guide/sysctl/net.rst |  4 ++--
 include/linux/skbuff.h                   |  4 ++--
 include/net/ip.h                         |  2 +-
 net/ipv4/ip_output.c                     |  4 +++-
 net/ipv4/tcp_ipv4.c                      | 14 +++++++++-----
 net/ipv4/tcp_minisocks.c                 |  2 +-
 6 files changed, 18 insertions(+), 12 deletions(-)

-- 
2.40.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH net-next 1/4] net: tcp: make the txhash available in TIME_WAIT sockets for IPv4 too
  2023-04-27 13:45 [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4 Antoine Tenart
@ 2023-04-27 13:45 ` Antoine Tenart
  2023-04-27 13:45 ` [PATCH net-next 2/4] net: ipv4: use consistent txhash in TIME_WAIT and SYN_RECV Antoine Tenart
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Antoine Tenart @ 2023-04-27 13:45 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet; +Cc: Antoine Tenart, netdev

Commit c67b85558ff2 ("ipv6: tcp: send consistent autoflowlabel in
TIME_WAIT state") made the socket txhash also available in TIME_WAIT
sockets but for IPv6 only. Make it available for IPv4 too as we'll use
it in later commits.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
---
 net/ipv4/tcp_minisocks.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index dac0d62120e6..04fc328727e6 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -303,6 +303,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
 		tcptw->tw_ts_offset	= tp->tsoffset;
 		tcptw->tw_last_oow_ack_time = 0;
 		tcptw->tw_tx_delay	= tp->tcp_tx_delay;
+		tw->tw_txhash		= sk->sk_txhash;
 #if IS_ENABLED(CONFIG_IPV6)
 		if (tw->tw_family == PF_INET6) {
 			struct ipv6_pinfo *np = inet6_sk(sk);
@@ -311,7 +312,6 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
 			tw->tw_v6_rcv_saddr = sk->sk_v6_rcv_saddr;
 			tw->tw_tclass = np->tclass;
 			tw->tw_flowlabel = be32_to_cpu(np->flow_label & IPV6_FLOWLABEL_MASK);
-			tw->tw_txhash = sk->sk_txhash;
 			tw->tw_ipv6only = sk->sk_ipv6only;
 		}
 #endif
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next 2/4] net: ipv4: use consistent txhash in TIME_WAIT and SYN_RECV
  2023-04-27 13:45 [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4 Antoine Tenart
  2023-04-27 13:45 ` [PATCH net-next 1/4] net: tcp: make the txhash available in TIME_WAIT sockets for IPv4 too Antoine Tenart
@ 2023-04-27 13:45 ` Antoine Tenart
  2023-04-27 13:45 ` [PATCH net-next 3/4] Documentation: net: net.core.txrehash is not specific to listening sockets Antoine Tenart
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Antoine Tenart @ 2023-04-27 13:45 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet; +Cc: Antoine Tenart, netdev

When using IPv4/TCP, skb->hash comes from sk->sk_txhash except in
TIME_WAIT and SYN_RECV where it's not set in the reply skb from
ip_send_unicast_reply. Those packets will have a mismatched hash with
others from the same flow as their hashes will be 0. IPv6 does not have
the same issue as the hash is set from the socket txhash in those cases.

This commits sets the hash in the reply skb from ip_send_unicast_reply,
which makes the IPv4 code behaving like IPv6.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
---
 include/net/ip.h     |  2 +-
 net/ipv4/ip_output.c |  4 +++-
 net/ipv4/tcp_ipv4.c  | 14 +++++++++-----
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index c3fffaa92d6e..749735171e2c 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -280,7 +280,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 			   const struct ip_options *sopt,
 			   __be32 daddr, __be32 saddr,
 			   const struct ip_reply_arg *arg,
-			   unsigned int len, u64 transmit_time);
+			   unsigned int len, u64 transmit_time, u32 txhash);
 
 #define IP_INC_STATS(net, field)	SNMP_INC_STATS64((net)->mib.ip_statistics, field)
 #define __IP_INC_STATS(net, field)	__SNMP_INC_STATS64((net)->mib.ip_statistics, field)
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 22a90a9392eb..268d8d6f4d8e 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1682,7 +1682,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 			   const struct ip_options *sopt,
 			   __be32 daddr, __be32 saddr,
 			   const struct ip_reply_arg *arg,
-			   unsigned int len, u64 transmit_time)
+			   unsigned int len, u64 transmit_time, u32 txhash)
 {
 	struct ip_options_data replyopts;
 	struct ipcm_cookie ipc;
@@ -1745,6 +1745,8 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 								arg->csum));
 		nskb->ip_summed = CHECKSUM_NONE;
 		nskb->mono_delivery_time = !!transmit_time;
+		if (txhash)
+			skb_set_hash(nskb, txhash, PKT_HASH_TYPE_L4);
 		ip_push_pending_frames(sk, &fl4);
 	}
 out:
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 39bda2b1066e..8fd4b548d448 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -692,6 +692,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
 	u64 transmit_time = 0;
 	struct sock *ctl_sk;
 	struct net *net;
+	u32 txhash = 0;
 
 	/* Never send a reset in response to a reset. */
 	if (th->rst)
@@ -829,12 +830,14 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
 				   inet_twsk(sk)->tw_priority : sk->sk_priority;
 		transmit_time = tcp_transmit_time(sk);
 		xfrm_sk_clone_policy(ctl_sk, sk);
+		txhash = (sk->sk_state == TCP_TIME_WAIT) ?
+			 inet_twsk(sk)->tw_txhash : sk->sk_txhash;
 	}
 	ip_send_unicast_reply(ctl_sk,
 			      skb, &TCP_SKB_CB(skb)->header.h4.opt,
 			      ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
 			      &arg, arg.iov[0].iov_len,
-			      transmit_time);
+			      transmit_time, txhash);
 
 	ctl_sk->sk_mark = 0;
 	xfrm_sk_free_policy(ctl_sk);
@@ -857,7 +860,7 @@ static void tcp_v4_send_ack(const struct sock *sk,
 			    struct sk_buff *skb, u32 seq, u32 ack,
 			    u32 win, u32 tsval, u32 tsecr, int oif,
 			    struct tcp_md5sig_key *key,
-			    int reply_flags, u8 tos)
+			    int reply_flags, u8 tos, u32 txhash)
 {
 	const struct tcphdr *th = tcp_hdr(skb);
 	struct {
@@ -933,7 +936,7 @@ static void tcp_v4_send_ack(const struct sock *sk,
 			      skb, &TCP_SKB_CB(skb)->header.h4.opt,
 			      ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
 			      &arg, arg.iov[0].iov_len,
-			      transmit_time);
+			      transmit_time, txhash);
 
 	ctl_sk->sk_mark = 0;
 	sock_net_set(ctl_sk, &init_net);
@@ -954,7 +957,8 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb)
 			tw->tw_bound_dev_if,
 			tcp_twsk_md5_key(tcptw),
 			tw->tw_transparent ? IP_REPLY_ARG_NOSRCCHECK : 0,
-			tw->tw_tos
+			tw->tw_tos,
+			tw->tw_txhash
 			);
 
 	inet_twsk_put(tw);
@@ -987,7 +991,7 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
 			0,
 			tcp_md5_do_lookup(sk, l3index, addr, AF_INET),
 			inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0,
-			ip_hdr(skb)->tos);
+			ip_hdr(skb)->tos, tcp_rsk(req)->txhash);
 }
 
 /*
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next 3/4] Documentation: net: net.core.txrehash is not specific to listening sockets
  2023-04-27 13:45 [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4 Antoine Tenart
  2023-04-27 13:45 ` [PATCH net-next 1/4] net: tcp: make the txhash available in TIME_WAIT sockets for IPv4 too Antoine Tenart
  2023-04-27 13:45 ` [PATCH net-next 2/4] net: ipv4: use consistent txhash in TIME_WAIT and SYN_RECV Antoine Tenart
@ 2023-04-27 13:45 ` Antoine Tenart
  2023-04-27 13:45 ` [PATCH net-next 4/4] net: skbuff: fix l4_hash comment Antoine Tenart
  2023-04-27 15:15 ` [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4 Eric Dumazet
  4 siblings, 0 replies; 11+ messages in thread
From: Antoine Tenart @ 2023-04-27 13:45 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet; +Cc: Antoine Tenart, netdev

The net.core.txrehash documentation mentions this knob is for listening
sockets only, while sk_rethink_txhash can be called on SYN and RTO
retransmits on all TCP sockets.

Remove the listening socket part.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
---
 Documentation/admin-guide/sysctl/net.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 466c560b0c30..4877563241f3 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -386,8 +386,8 @@ Default : 0  (for compatibility reasons)
 txrehash
 --------
 
-Controls default hash rethink behaviour on listening socket when SO_TXREHASH
-option is set to SOCK_TXREHASH_DEFAULT (i. e. not overridden by setsockopt).
+Controls default hash rethink behaviour on socket when SO_TXREHASH option is set
+to SOCK_TXREHASH_DEFAULT (i. e. not overridden by setsockopt).
 
 If set to 1 (default), hash rethink is performed on listening socket.
 If set to 0, hash rethink is not performed.
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next 4/4] net: skbuff: fix l4_hash comment
  2023-04-27 13:45 [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4 Antoine Tenart
                   ` (2 preceding siblings ...)
  2023-04-27 13:45 ` [PATCH net-next 3/4] Documentation: net: net.core.txrehash is not specific to listening sockets Antoine Tenart
@ 2023-04-27 13:45 ` Antoine Tenart
  2023-04-27 15:15 ` [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4 Eric Dumazet
  4 siblings, 0 replies; 11+ messages in thread
From: Antoine Tenart @ 2023-04-27 13:45 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet; +Cc: Antoine Tenart, netdev

Since commit 877d1f6291f8 ("net: Set sk_txhash from a random number")
sk->sk_txhash is not a canonical 4-tuple hash. sk->sk_txhash is
used in the TCP Tx path to populate skb->hash, with skb->l4_hash=1.
With this, skb->l4_hash does not always indicate the hash is a
"canonical 4-tuple hash over transport ports" but rather a hash from L4
layer to provide a uniform distribution over flows. Reword the comment
accordingly, to avoid misunderstandings.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
---
 include/linux/skbuff.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 738776ab8838..f54c84193b23 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -791,8 +791,8 @@ typedef unsigned char *sk_buff_data_t;
  *	@active_extensions: active extensions (skb_ext_id types)
  *	@ndisc_nodetype: router type (from link layer)
  *	@ooo_okay: allow the mapping of a socket to a queue to be changed
- *	@l4_hash: indicate hash is a canonical 4-tuple hash over transport
- *		ports.
+ *	@l4_hash: indicate hash is from layer 4 and provides a uniform
+ *		distribution over flows.
  *	@sw_hash: indicates hash was computed in software stack
  *	@wifi_acked_valid: wifi_acked was set
  *	@wifi_acked: whether frame was acked on wifi or not
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4
  2023-04-27 13:45 [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4 Antoine Tenart
                   ` (3 preceding siblings ...)
  2023-04-27 13:45 ` [PATCH net-next 4/4] net: skbuff: fix l4_hash comment Antoine Tenart
@ 2023-04-27 15:15 ` Eric Dumazet
  2023-04-27 15:44   ` Antoine Tenart
  4 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2023-04-27 15:15 UTC (permalink / raw)
  To: Antoine Tenart; +Cc: davem, kuba, pabeni, netdev

On Thu, Apr 27, 2023 at 3:45 PM Antoine Tenart <atenart@kernel.org> wrote:
>
> Hello,
>
> Series is divided in two parts. First two commits make the txhash (used
> for the skb hash in TCP) to be consistent for all IPv4/TCP packets (IPv6
> doesn't have the same issue). Last two commits improve doc/comment
> hash-related parts.
>
> One example is when using OvS with dp_hash, which uses skb->hash, to
> select a path. We'd like packets from the same flow to be consistent, as
> well as the hash being stable over time when using net.core.txrehash=0.
> Same applies for kernel ECMP which also can use skb->hash.

How do you plan to test these patches ?

>
> IMHO the series makes sense in net-next, but we could argue (some)
> commits be seen as fixes and I can resend if necessary.

net-next is closed...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4
  2023-04-27 15:15 ` [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4 Eric Dumazet
@ 2023-04-27 15:44   ` Antoine Tenart
  0 siblings, 0 replies; 11+ messages in thread
From: Antoine Tenart @ 2023-04-27 15:44 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, kuba, pabeni, netdev

Quoting Eric Dumazet (2023-04-27 17:15:27)
> On Thu, Apr 27, 2023 at 3:45 PM Antoine Tenart <atenart@kernel.org> wrote:
> >
> > Series is divided in two parts. First two commits make the txhash (used
> > for the skb hash in TCP) to be consistent for all IPv4/TCP packets (IPv6
> > doesn't have the same issue). Last two commits improve doc/comment
> > hash-related parts.
> >
> > One example is when using OvS with dp_hash, which uses skb->hash, to
> > select a path. We'd like packets from the same flow to be consistent, as
> > well as the hash being stable over time when using net.core.txrehash=0.
> > Same applies for kernel ECMP which also can use skb->hash.
> 
> How do you plan to test these patches ?

I did perform manual checks (with net.core.txrehash=0 to make sure the
hash was consistent) using a setup with OvS and dp_hash (~ECMP like) and
looking which path packets took. Not sure if there is a simpler test
that could be automated, we can't use autoflowlabel to make simple
scripts like for IPv6. Anything you'd like to see specifically?

> > IMHO the series makes sense in net-next, but we could argue (some)
> > commits be seen as fixes and I can resend if necessary.
> 
> net-next is closed...

I was convinced to have checked, but well, completely missed it. Sorry
about that... Will resend when appropriate.

Thanks,
Antoine

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4
@ 2023-05-11  9:34 Antoine Tenart
  2023-05-11 10:24 ` Eric Dumazet
  2023-05-11 11:59 ` Ilya Maximets
  0 siblings, 2 replies; 11+ messages in thread
From: Antoine Tenart @ 2023-05-11  9:34 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet; +Cc: Antoine Tenart, netdev

Hello,

Series is divided in two parts. First two commits make the txhash (used
for the skb hash in TCP) to be consistent for all IPv4/TCP packets (IPv6
doesn't have the same issue). Last two commits improve doc/comment
hash-related parts.

One example is when using OvS with dp_hash, which uses skb->hash, to
select a path. We'd like packets from the same flow to be consistent, as
well as the hash being stable over time when using net.core.txrehash=0.
Same applies for kernel ECMP which also can use skb->hash.

IMHO the series makes sense in net-next, but we could argue (some)
commits be seen as fixes and I can resend if necessary.

Thanks!
Antoine

Antoine Tenart (4):
  net: tcp: make the txhash available in TIME_WAIT sockets for IPv4 too
  net: ipv4: use consistent txhash in TIME_WAIT and SYN_RECV
  Documentation: net: net.core.txrehash is not specific to listening
    sockets
  net: skbuff: fix l4_hash comment

 Documentation/admin-guide/sysctl/net.rst |  4 ++--
 include/linux/skbuff.h                   |  4 ++--
 include/net/ip.h                         |  2 +-
 net/ipv4/ip_output.c                     |  4 +++-
 net/ipv4/tcp_ipv4.c                      | 14 +++++++++-----
 net/ipv4/tcp_minisocks.c                 |  2 +-
 6 files changed, 18 insertions(+), 12 deletions(-)

-- 
2.40.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4
  2023-05-11  9:34 Antoine Tenart
@ 2023-05-11 10:24 ` Eric Dumazet
  2023-05-11 11:55   ` Antoine Tenart
  2023-05-11 11:59 ` Ilya Maximets
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2023-05-11 10:24 UTC (permalink / raw)
  To: Antoine Tenart; +Cc: davem, kuba, pabeni, netdev

On Thu, May 11, 2023 at 11:35 AM Antoine Tenart <atenart@kernel.org> wrote:
>
> Hello,
>
> Series is divided in two parts. First two commits make the txhash (used
> for the skb hash in TCP) to be consistent for all IPv4/TCP packets (IPv6
> doesn't have the same issue). Last two commits improve doc/comment
> hash-related parts.
>
> One example is when using OvS with dp_hash, which uses skb->hash, to
> select a path. We'd like packets from the same flow to be consistent, as
> well as the hash being stable over time when using net.core.txrehash=0.
> Same applies for kernel ECMP which also can use skb->hash.
>

SGTM, thanks.

Reviewed-by: Eric Dumazet <edumazet@google.com>

FYI while reviewing your patches, I found that I have to send this fix:

I suggest we hold your patch series a bit before this reaches net-next tree,
to avoid merge conflicts.

Bug was added in commit f6c0f5d209fa ("tcp: honor SO_PRIORITY in
TIME_WAIT state")


diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 39bda2b1066e1d607a59fb79c6305d0ca30cb28d..06d2573685ca993a3a0a89807f09d7b5c153cc72
100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -829,6 +829,9 @@ static void tcp_v4_send_reset(const struct sock
*sk, struct sk_buff *skb)
                                   inet_twsk(sk)->tw_priority : sk->sk_priority;
                transmit_time = tcp_transmit_time(sk);
                xfrm_sk_clone_policy(ctl_sk, sk);
+       } else {
+               ctl_sk->sk_mark = 0;
+               ctl_sk->sk_priority = 0;
        }
        ip_send_unicast_reply(ctl_sk,
                              skb, &TCP_SKB_CB(skb)->header.h4.opt,
@@ -836,7 +839,6 @@ static void tcp_v4_send_reset(const struct sock
*sk, struct sk_buff *skb)
                              &arg, arg.iov[0].iov_len,
                              transmit_time);

-       ctl_sk->sk_mark = 0;
        xfrm_sk_free_policy(ctl_sk);
        sock_net_set(ctl_sk, &init_net);
        __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
@@ -935,7 +937,6 @@ static void tcp_v4_send_ack(const struct sock *sk,
                              &arg, arg.iov[0].iov_len,
                              transmit_time);

-       ctl_sk->sk_mark = 0;
        sock_net_set(ctl_sk, &init_net);
        __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
        local_bh_enable();

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4
  2023-05-11 10:24 ` Eric Dumazet
@ 2023-05-11 11:55   ` Antoine Tenart
  0 siblings, 0 replies; 11+ messages in thread
From: Antoine Tenart @ 2023-05-11 11:55 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, kuba, pabeni, netdev

Quoting Eric Dumazet (2023-05-11 12:24:15)
> On Thu, May 11, 2023 at 11:35 AM Antoine Tenart <atenart@kernel.org> wrote:
> >
> > Series is divided in two parts. First two commits make the txhash (used
> > for the skb hash in TCP) to be consistent for all IPv4/TCP packets (IPv6
> > doesn't have the same issue). Last two commits improve doc/comment
> > hash-related parts.
> >
> > One example is when using OvS with dp_hash, which uses skb->hash, to
> > select a path. We'd like packets from the same flow to be consistent, as
> > well as the hash being stable over time when using net.core.txrehash=0.
> > Same applies for kernel ECMP which also can use skb->hash.
> >
> 
> SGTM, thanks.
> 
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> 
> FYI while reviewing your patches, I found that I have to send this fix:
> 
> I suggest we hold your patch series a bit before this reaches net-next tree,
> to avoid merge conflicts.

Sure, no problem. Thanks for the review!

> Bug was added in commit f6c0f5d209fa ("tcp: honor SO_PRIORITY in
> TIME_WAIT state")
> 
> 
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 39bda2b1066e1d607a59fb79c6305d0ca30cb28d..06d2573685ca993a3a0a89807f09d7b5c153cc72
> 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -829,6 +829,9 @@ static void tcp_v4_send_reset(const struct sock
> *sk, struct sk_buff *skb)
>                                    inet_twsk(sk)->tw_priority : sk->sk_priority;
>                 transmit_time = tcp_transmit_time(sk);
>                 xfrm_sk_clone_policy(ctl_sk, sk);
> +       } else {
> +               ctl_sk->sk_mark = 0;
> +               ctl_sk->sk_priority = 0;
>         }
>         ip_send_unicast_reply(ctl_sk,
>                               skb, &TCP_SKB_CB(skb)->header.h4.opt,
> @@ -836,7 +839,6 @@ static void tcp_v4_send_reset(const struct sock
> *sk, struct sk_buff *skb)
>                               &arg, arg.iov[0].iov_len,
>                               transmit_time);
> 
> -       ctl_sk->sk_mark = 0;
>         xfrm_sk_free_policy(ctl_sk);
>         sock_net_set(ctl_sk, &init_net);
>         __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
> @@ -935,7 +937,6 @@ static void tcp_v4_send_ack(const struct sock *sk,
>                               &arg, arg.iov[0].iov_len,
>                               transmit_time);
> 
> -       ctl_sk->sk_mark = 0;
>         sock_net_set(ctl_sk, &init_net);
>         __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
>         local_bh_enable();
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4
  2023-05-11  9:34 Antoine Tenart
  2023-05-11 10:24 ` Eric Dumazet
@ 2023-05-11 11:59 ` Ilya Maximets
  1 sibling, 0 replies; 11+ messages in thread
From: Ilya Maximets @ 2023-05-11 11:59 UTC (permalink / raw)
  To: Antoine Tenart, davem, kuba, pabeni, edumazet; +Cc: netdev, i.maximets

On 5/11/23 11:34, Antoine Tenart wrote:
> Hello,
> 
> Series is divided in two parts. First two commits make the txhash (used
> for the skb hash in TCP) to be consistent for all IPv4/TCP packets (IPv6
> doesn't have the same issue). Last two commits improve doc/comment
> hash-related parts.
> 
> One example is when using OvS with dp_hash, which uses skb->hash, to
> select a path. We'd like packets from the same flow to be consistent, as
> well as the hash being stable over time when using net.core.txrehash=0.
> Same applies for kernel ECMP which also can use skb->hash.

FWIW, same also applies to seg6_flowlabel that is used for flowlable
based load balancing, because seg6_make_flowlabel() is using skb hash.

Best regards, Ilya Maximets.

> 
> IMHO the series makes sense in net-next, but we could argue (some)
> commits be seen as fixes and I can resend if necessary.
> 
> Thanks!
> Antoine
> 
> Antoine Tenart (4):
>   net: tcp: make the txhash available in TIME_WAIT sockets for IPv4 too
>   net: ipv4: use consistent txhash in TIME_WAIT and SYN_RECV
>   Documentation: net: net.core.txrehash is not specific to listening
>     sockets
>   net: skbuff: fix l4_hash comment
> 
>  Documentation/admin-guide/sysctl/net.rst |  4 ++--
>  include/linux/skbuff.h                   |  4 ++--
>  include/net/ip.h                         |  2 +-
>  net/ipv4/ip_output.c                     |  4 +++-
>  net/ipv4/tcp_ipv4.c                      | 14 +++++++++-----
>  net/ipv4/tcp_minisocks.c                 |  2 +-
>  6 files changed, 18 insertions(+), 12 deletions(-)
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-05-11 11:59 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-27 13:45 [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4 Antoine Tenart
2023-04-27 13:45 ` [PATCH net-next 1/4] net: tcp: make the txhash available in TIME_WAIT sockets for IPv4 too Antoine Tenart
2023-04-27 13:45 ` [PATCH net-next 2/4] net: ipv4: use consistent txhash in TIME_WAIT and SYN_RECV Antoine Tenart
2023-04-27 13:45 ` [PATCH net-next 3/4] Documentation: net: net.core.txrehash is not specific to listening sockets Antoine Tenart
2023-04-27 13:45 ` [PATCH net-next 4/4] net: skbuff: fix l4_hash comment Antoine Tenart
2023-04-27 15:15 ` [PATCH net-next 0/4] net: tcp: make txhash use consistent for IPv4 Eric Dumazet
2023-04-27 15:44   ` Antoine Tenart
  -- strict thread matches above, loose matches on Subject: below --
2023-05-11  9:34 Antoine Tenart
2023-05-11 10:24 ` Eric Dumazet
2023-05-11 11:55   ` Antoine Tenart
2023-05-11 11:59 ` Ilya Maximets

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).