From: Andrew Vagin <avagin@parallels.com>
To: Yuchung Cheng <ycheng@google.com>
Cc: Andrey Vagin <avagin@openvz.org>, netdev <netdev@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Eric Dumazet <edumazet@google.com>,
Pavel Emelyanov <xemul@parallels.com>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH] tcp: don't use timestamp from repaired skb-s to calculate RTT
Date: Tue, 12 Aug 2014 22:29:56 +0400
Message-ID: <20140812182956.GA12993@paralelels.com>
In-Reply-To: <CAK6E8=c7VEJViP_KeqCBo_dePL-wh8J5+nu=YGrrPq2GKtmBBQ@mail.gmail.com>
On Tue, Aug 12, 2014 at 07:53:57AM -0700, Yuchung Cheng wrote:
> On Tue, Aug 12, 2014 at 2:45 AM, Andrey Vagin <avagin@openvz.org> wrote:
> > We don't know the right timestamp for repaired skb-s. Wrong RTT estimations
> > aren't good, because some congestion modules heavily depend on them.
> >
> > This patch adds the TCPCB_REPAIRED flag, which is included in
> > TCPCB_RETRANS.
> >
> > Thanks to Eric for the advice on how to fix this issue.
> >
> > This patch fixes the warning:
> > [ 879.562947] WARNING: CPU: 0 PID: 2825 at net/ipv4/tcp_input.c:3078 tcp_ack+0x11f5/0x1380()
> > [ 879.567253] CPU: 0 PID: 2825 Comm: socket-tcpbuf-l Not tainted 3.16.0-next-20140811 #1
> > [ 879.567829] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> > [ 879.568177] 0000000000000000 00000000c532680c ffff880039643d00 ffffffff817aa2d2
> > [ 879.568776] 0000000000000000 ffff880039643d38 ffffffff8109afbd ffff880039d6ba80
> > [ 879.569386] ffff88003a449800 000000002983d6bd 0000000000000000 000000002983d6bc
> > [ 879.569982] Call Trace:
> > [ 879.570264] [<ffffffff817aa2d2>] dump_stack+0x4d/0x66
> > [ 879.570599] [<ffffffff8109afbd>] warn_slowpath_common+0x7d/0xa0
> > [ 879.570935] [<ffffffff8109b0ea>] warn_slowpath_null+0x1a/0x20
> > [ 879.571292] [<ffffffff816d0a05>] tcp_ack+0x11f5/0x1380
> > [ 879.571614] [<ffffffff816d10bd>] tcp_rcv_established+0x1ed/0x710
> > [ 879.571958] [<ffffffff816dc9da>] tcp_v4_do_rcv+0x10a/0x370
> > [ 879.572315] [<ffffffff81657459>] release_sock+0x89/0x1d0
> > [ 879.572642] [<ffffffff816c81a0>] do_tcp_setsockopt.isra.36+0x120/0x860
> > [ 879.573000] [<ffffffff8110a52e>] ? rcu_read_lock_held+0x6e/0x80
> > [ 879.573352] [<ffffffff816c8912>] tcp_setsockopt+0x32/0x40
> > [ 879.573678] [<ffffffff81654ac4>] sock_common_setsockopt+0x14/0x20
> > [ 879.574031] [<ffffffff816537b0>] SyS_setsockopt+0x80/0xf0
> > [ 879.574393] [<ffffffff817b40a9>] system_call_fastpath+0x16/0x1b
> > [ 879.574730] ---[ end trace a17cbc38eb8c5c00 ]---
> >
> > Cc: Eric Dumazet <edumazet@google.com>
> > Cc: Pavel Emelyanov <xemul@parallels.com>
> > Cc: "David S. Miller" <davem@davemloft.net>
> > Signed-off-by: Andrey Vagin <avagin@openvz.org>
> > ---
> > include/net/tcp.h | 4 +++-
> > net/ipv4/tcp.c | 16 +++++++++-------
> > 2 files changed, 12 insertions(+), 8 deletions(-)
> >
> > diff --git a/include/net/tcp.h b/include/net/tcp.h
> > index dafa1cb..36f5525 100644
> > --- a/include/net/tcp.h
> > +++ b/include/net/tcp.h
> > @@ -705,8 +705,10 @@ struct tcp_skb_cb {
> > #define TCPCB_SACKED_RETRANS 0x02 /* SKB retransmitted */
> > #define TCPCB_LOST 0x04 /* SKB is lost */
> > #define TCPCB_TAGBITS 0x07 /* All tag bits */
> > +#define TCPCB_REPAIRED 0x10 /* SKB repaired (no skb_mstamp) */
> > #define TCPCB_EVER_RETRANS 0x80 /* Ever retransmitted frame */
> > -#define TCPCB_RETRANS (TCPCB_SACKED_RETRANS|TCPCB_EVER_RETRANS)
> > +#define TCPCB_RETRANS (TCPCB_SACKED_RETRANS|TCPCB_EVER_RETRANS| \
> > + TCPCB_REPAIRED)
> >
> > __u8 ip_dsfield; /* IPv4 tos or IPv6 dsfield */
> > /* 1 byte hole */
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index 181b70e..cb5f548 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -1188,13 +1188,6 @@ new_segment:
> > goto wait_for_memory;
> >
> > /*
> > - * All packets are restored as if they have
> > - * already been sent.
> > - */
> > - if (tp->repair)
> > - TCP_SKB_CB(skb)->when = tcp_time_stamp;
> > -
> > - /*
> > * Check whether we can use HW checksum.
> > */
> > if (sk->sk_route_caps & NETIF_F_ALL_CSUM)
> > @@ -1203,6 +1196,15 @@ new_segment:
> > skb_entail(sk, skb);
> > copy = size_goal;
> > max = size_goal;
> > +
> > + /* All packets are restored as if they have
> > + * already been sent. skb_mstamp isn't set to
> > + * avoid wrong rtt estimation.
> > + */
> > + if (tp->repair) {
> > + TCP_SKB_CB(skb)->sacked |= TCPCB_REPAIRED;
> > + TCP_SKB_CB(skb)->when = tcp_time_stamp;
> But this still allows RTT samples from TCP timestamp options even if
> the packet is marked retransmitted/repaired in tcp_ack_update_rtt()?
"when" isn't used there. With timestamp options the sample is computed as:

    rtt = tcp_time_stamp - tp->rx_opt.rcv_tsecr

If a TCP connection is moved from another host, we set tp->tsoffset so
that rcv_tsecr remains coherent with tcp_time_stamp on the target host.
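
To make the tsoffset point concrete, here is a rough user-space sketch of the arithmetic. It is only a model under stated assumptions, not kernel code: struct conn and all four helper names are hypothetical, and it assumes the offset is added to the local clock when TSval is sent and subtracted again from the echoed value on receive.

```c
#include <stdint.h>

/* Hypothetical model of one connection; not a kernel structure. */
struct conn {
	uint32_t tsoffset;	/* shift between the local clock and wire timestamps */
};

/* TSval put on the wire: the local clock shifted by tsoffset (assumed). */
static uint32_t tsval_out(const struct conn *c, uint32_t now)
{
	return now + c->tsoffset;
}

/* Echoed timestamp translated back into the local clock domain (assumed). */
static uint32_t tsecr_local(const struct conn *c, uint32_t tsecr_wire)
{
	return tsecr_wire - c->tsoffset;
}

/* Same shape as: rtt = tcp_time_stamp - tp->rx_opt.rcv_tsecr */
static uint32_t rtt_from_tsecr(const struct conn *c, uint32_t now,
			       uint32_t tsecr_wire)
{
	return now - tsecr_local(c, tsecr_wire);
}

/*
 * On restore, pick the offset so that wire timestamps continue from the
 * last value the source host used. Unsigned arithmetic wraps modulo 2^32,
 * so this also works when the target clock is ahead of the wire value.
 */
static uint32_t restore_tsoffset(uint32_t last_wire_tsval, uint32_t target_now)
{
	return last_wire_tsval - target_now;
}
```

For example, if the last TSval sent by the source host was 1000000 and the target host clock reads 500 at restore time, the offset becomes 999500. A segment sent at local time 510 carries TSval 1000010, and when its echo arrives at local time 530 the sample is 20 ticks, independent of the clock on the source host.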
>
> > + }
> > }
> >
> > /* Try to append data to the end of skb. */
> > --
> > 1.9.3
> >
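
As a small illustration of the flag change in the patch, the sketch below shows why folding TCPCB_REPAIRED into TCPCB_RETRANS is enough. The flag values are copied from the quoted include/net/tcp.h hunk. The helper may_sample_rtt() is hypothetical, and it assumes the ACK path skips send-time RTT samples for any skb that has one of the TCPCB_RETRANS bits set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the quoted include/net/tcp.h hunk. */
#define TCPCB_SACKED_RETRANS	0x02	/* SKB retransmitted */
#define TCPCB_REPAIRED		0x10	/* SKB repaired (no skb_mstamp) */
#define TCPCB_EVER_RETRANS	0x80	/* Ever retransmitted frame */
#define TCPCB_RETRANS		(TCPCB_SACKED_RETRANS | TCPCB_EVER_RETRANS | \
				 TCPCB_REPAIRED)

/*
 * Hypothetical helper, not a kernel function: returns true when the send
 * time of an skb may be used for an RTT sample. Assumption: the ACK path
 * tests (sacked & TCPCB_RETRANS) before taking such a sample.
 */
static bool may_sample_rtt(uint8_t sacked)
{
	return !(sacked & TCPCB_RETRANS);
}
```

With this mask a repaired skb is treated like a retransmitted one for RTT purposes, so no separate check for repair mode is needed at the sampling site.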