netdev.vger.kernel.org archive mirror
* [PATCH net] tcp: warn on bogus MSS and try to amend it
From: Marcelo Ricardo Leitner @ 2016-11-30 13:14 UTC
  To: netdev
  Cc: Jon Maxwell, Alex Sidorenko, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, tlfalcon, Brian King,
	Eric Dumazet, davem

There have been some reports lately about TCP connection stalls caused
by NIC drivers that aren't setting gso_size on aggregated packets on the
rx path. This causes TCP to assume that the MSS is actually the size of
the aggregated packet, which is invalid.

Although the proper fix belongs in each driver, it's often hard and
cumbersome to debug the stall, trace it to this root cause and
report/fix it.

This patch amends the situation in two ways. First, it adds a warning
when this happens, which gives a hint to those trying to debug it.
Second, it limits the maximum probed MSS to the advertised MSS, as it
should never be any higher than that.

The result is that the connection may not get the best possible
performance, but it shouldn't stall, and the admin will have a hint on
what to look for.

Tested with virtio by forcing gso_size to 0.

Cc: Jonathan Maxwell <jmaxwell37@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
---
 net/ipv4/tcp_input.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index a27b9c0e27c08b4e4aeaff3d0bfdf3ae561ba4d8..ecc86105eb479de9b80db71af6a16a5af612a61c 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -144,7 +144,10 @@ static void tcp_measure_rcv_mss(struct sock *sk, const struct sk_buff *skb)
 	 */
 	len = skb_shinfo(skb)->gso_size ? : skb->len;
 	if (len >= icsk->icsk_ack.rcv_mss) {
-		icsk->icsk_ack.rcv_mss = len;
+		icsk->icsk_ack.rcv_mss = min_t(unsigned int, len,
+					       tcp_sk(sk)->advmss);
+		if (icsk->icsk_ack.rcv_mss != len)
+			pr_warn_once("Seems your NIC driver is doing bad RX acceleration. TCP performance may be compromised.\n");
 	} else {
 		/* Otherwise, we make more careful check taking into account,
 		 * that SACKs block is variable.
-- 
2.9.3
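
For readers who want to poke at the clamp outside the kernel, here is a
minimal user-space sketch of the patched branch. It is only a model:
struct rx_segment, struct mss_state, min_uint() and measure_rcv_mss()
are hypothetical stand-ins for skb_shinfo(skb)->gso_size / skb->len,
icsk->icsk_ack.rcv_mss, tcp_sk(sk)->advmss and the kernel helpers, and
the "warned" flag merely imitates the one-shot nature of pr_warn_once().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the two skb fields the hunk reads. */
struct rx_segment {
	unsigned int gso_size;	/* 0 when the driver did not set it */
	unsigned int len;	/* length of the (possibly aggregated) skb */
};

/* Hypothetical stand-in for the per-socket state the hunk touches. */
struct mss_state {
	unsigned int rcv_mss;	/* models icsk->icsk_ack.rcv_mss */
	unsigned int advmss;	/* models tcp_sk(sk)->advmss */
	bool warned;		/* imitates pr_warn_once() firing only once */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Models only the first branch of tcp_measure_rcv_mss() as patched. */
static void measure_rcv_mss(struct mss_state *st, const struct rx_segment *seg)
{
	unsigned int len;

	assert(st && seg);

	/* Same fallback as "gso_size ? : skb->len" in the kernel code. */
	len = seg->gso_size ? seg->gso_size : seg->len;

	if (len >= st->rcv_mss) {
		/* Never let the probed MSS exceed what we advertised. */
		st->rcv_mss = min_uint(len, st->advmss);
		if (st->rcv_mss != len && !st->warned) {
			st->warned = true;
			fprintf(stderr,
				"suspect RX aggregation: len %u > advmss %u\n",
				len, st->advmss);
		}
	}
}
```

Feeding it a segment with gso_size left at 0 and len larger than advmss
should leave rcv_mss pinned at advmss and set the flag, while a segment
with a sane gso_size updates rcv_mss without any warning.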


* Re: [PATCH net] tcp: warn on bogus MSS and try to amend it
From: David Miller @ 2016-12-01 20:29 UTC
  To: marcelo.leitner
  Cc: netdev, jmaxwell37, alexandre.sidorenko, kuznet, jmorris,
	yoshfuji, kaber, tlfalcon, brking, eric.dumazet

From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Date: Wed, 30 Nov 2016 11:14:32 -0200

I totally agree with this change, however I think the warning message can
be improved in two ways:

>  	len = skb_shinfo(skb)->gso_size ? : skb->len;
>  	if (len >= icsk->icsk_ack.rcv_mss) {
> -		icsk->icsk_ack.rcv_mss = len;
> +		icsk->icsk_ack.rcv_mss = min_t(unsigned int, len,
> +					       tcp_sk(sk)->advmss);
> +		if (icsk->icsk_ack.rcv_mss != len)
> +			pr_warn_once("Seems your NIC driver is doing bad RX acceleration. TCP performance may be compromised.\n");

We know it's a bad GRO implementation that causes this so let's be specific in the
message, perhaps something like:

	Driver has suspect GRO implementation, TCP performance may be compromised.

Also, we have skb->dev available here most likely, so prefixing the message with
skb->dev->name would make analyzing this situation even easier for someone hitting
this.

I'm not certain if an skb->dev==NULL check is necessary here or not, but it is
definitely something you need to consider.

Thanks!


* Re: [PATCH net] tcp: warn on bogus MSS and try to amend it
From: marcelo.leitner @ 2016-12-01 20:46 UTC
  To: David Miller
  Cc: netdev, jmaxwell37, alexandre.sidorenko, kuznet, jmorris,
	yoshfuji, kaber, tlfalcon, brking, eric.dumazet

On Thu, Dec 01, 2016 at 03:29:49PM -0500, David Miller wrote:
> From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> Date: Wed, 30 Nov 2016 11:14:32 -0200
> 
> I totally agree with this change, however I think the warning message can
> be improved in two ways:
> 
> >  	len = skb_shinfo(skb)->gso_size ? : skb->len;
> >  	if (len >= icsk->icsk_ack.rcv_mss) {
> > -		icsk->icsk_ack.rcv_mss = len;
> > +		icsk->icsk_ack.rcv_mss = min_t(unsigned int, len,
> > +					       tcp_sk(sk)->advmss);
> > +		if (icsk->icsk_ack.rcv_mss != len)
> > +			pr_warn_once("Seems your NIC driver is doing bad RX acceleration. TCP performance may be compromised.\n");
> 
> We know it's a bad GRO implementation that causes this so let's be specific in the
> message, perhaps something like:
> 
> 	Driver has suspect GRO implementation, TCP performance may be compromised.

Okay.

> 
> Also, we have skb->dev available here most likely, so prefixing the message with
> skb->dev->name would make analyzing this situation even easier for someone hitting
> this.

Nice, yes.
This skb is mostly non-forwardable, as it's bigger than the MTU. So if
someone is using net namespaces and the skb were routed through some
veth interfaces, the name would give a false hint, but that shouldn't
happen. It could still happen if the skb fits a (larger) veth MTU, but
even then, one would probably simplify the setup in order to debug this.

> 
> I'm not certain if an skb->dev==NULL check is necessary here or not, but it is
> definitely something you need to consider.
> 
> Thanks!
> 

Will check. Thanks!

  Marcelo


* Re: [PATCH net] tcp: warn on bogus MSS and try to amend it
From: marcelo.leitner @ 2016-12-02 10:07 UTC
  To: David Miller
  Cc: netdev, jmaxwell37, alexandre.sidorenko, kuznet, jmorris,
	yoshfuji, kaber, tlfalcon, brking, eric.dumazet

On Thu, Dec 01, 2016 at 03:29:49PM -0500, David Miller wrote:
> From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> Date: Wed, 30 Nov 2016 11:14:32 -0200
> 
> I totally agree with this change, however I think the warning message can
> be improved in two ways:
> 
> >  	len = skb_shinfo(skb)->gso_size ? : skb->len;
> >  	if (len >= icsk->icsk_ack.rcv_mss) {
> > -		icsk->icsk_ack.rcv_mss = len;
> > +		icsk->icsk_ack.rcv_mss = min_t(unsigned int, len,
> > +					       tcp_sk(sk)->advmss);
> > +		if (icsk->icsk_ack.rcv_mss != len)
> > +			pr_warn_once("Seems your NIC driver is doing bad RX acceleration. TCP performance may be compromised.\n");
> 
> We know it's a bad GRO implementation that causes this so let's be specific in the
> message, perhaps something like:
> 
> 	Driver has suspect GRO implementation, TCP performance may be compromised.
> 
> Also, we have skb->dev available here most likely, so prefixing the message with
> skb->dev->name would make analyzing this situation even easier for someone hitting
> this.

It's not available anymore. It's set to NULL before we get there:

tcp_v4_rcv()   (same for v6)
{
	...
	skb->dev = NULL;
	...
	if (!sock_owned_by_user(sk)) {
		if (!tcp_prequeue(sk, skb))
			ret = tcp_v4_do_rcv(sk, skb);
	} else if (tcp_add_backlog(sk, skb)) {
	...
}

I'll update the msg as above and post v2.

Thanks,
Marcelo
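
To illustrate the skb->dev == NULL concern discussed in this thread,
here is a small user-space sketch of a NULL-safe message prefix. It is
not kernel code and not what v2 does: struct fake_dev, struct fake_skb
and format_gro_warning() are hypothetical names, and the "(unknown)"
fallback is an assumption made purely for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for struct net_device and struct sk_buff. */
struct fake_dev {
	char name[16];
};

struct fake_skb {
	struct fake_dev *dev;	/* may be NULL, as in the TCP receive path */
};

/*
 * Formats the reworded warning with a device-name prefix, falling back
 * to a placeholder when no device is attached to the skb.
 * Returns the value of snprintf().
 */
static int format_gro_warning(char *buf, size_t size,
			      const struct fake_skb *skb)
{
	const char *name;

	assert(buf != NULL);

	/* Dereference dev->name only when both pointers are valid. */
	name = (skb && skb->dev) ? skb->dev->name : "(unknown)";

	return snprintf(buf, size,
			"%s: Driver has suspect GRO implementation, "
			"TCP performance may be compromised.", name);
}
```

With a device attached the message starts with its name; with dev set
to NULL, as happens after tcp_v4_rcv() clears it, the placeholder is
used instead of dereferencing a NULL pointer.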
