All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Octavian Purdila <opurdila@ixiacom.com>
Cc: Jan-Bernd Themann <themann@de.ibm.com>,
	netdev@vger.kernel.org, Christoph Raisch <raisch@de.ibm.com>
Subject: Re: TSecr != 0 check in inet_lro.c
Date: Tue, 25 Aug 2009 07:42:33 +0200	[thread overview]
Message-ID: <4A9379C9.6050108@gmail.com> (raw)
In-Reply-To: <200908250054.50664.opurdila@ixiacom.com>

Octavian Purdila a écrit :
> Hi,
> 
> We are seeing a performance issue with TSO/LRO which we tracked down to the 
> TSecr !=0 check in lro_tcp_ip_check.

ouch...

> 
> It happens when the LRO side's TSval wraps around and gets to 0. That triggers 
> the TSO side to send packets with TSecr set to 0, which means that such 
> packets won't be aggregated - and that will put a lot of burden on the stack 
> which will result in lots of drops.

Probability of such event is 1 / 2^32 or so ?

> 
> I'm failing to understand the purpose of this check. Any hints? :)
> 

rfc1323 badly interpreted ?

I remember tsecr=0 was forbidden by Linux, while apparently rfc is not
so clear.

rfc1323 : 3.2
         The Timestamp Echo Reply field (TSecr) is only valid if the ACK
         bit is set in the TCP header; if it is valid, it echos a times-
         tamp value that was sent by the remote TCP in the TSval field
         of a Timestamps option.  When TSecr is not valid, its value
         must be zero.  The TSecr value will generally be from the most
         recent Timestamp option that was received; however, there are
         exceptions that are explained below.

Note how this is not saying "a zero Tsecr value is not valid"

I could not find why : "When TSecr is not valid, its value
must be zero", and why we consider a zero value to be not meaningfull...

static inline void tcp_ack_update_rtt(struct sock *sk, const int flag,
                                      const s32 seq_rtt)
{
        const struct tcp_sock *tp = tcp_sk(sk);
        /* Note that peer MAY send zero echo. In this case it is ignored. (rfc1323) */
        if (tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr)
                tcp_ack_saw_tstamp(sk, flag);
        else if (seq_rtt >= 0)
                tcp_ack_no_tstamp(sk, seq_rtt, flag);
}

static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
                                         struct tcphdr *th, unsigned len)
{
        struct tcp_sock *tp = tcp_sk(sk);
        struct inet_connection_sock *icsk = inet_csk(sk);
        int saved_clamp = tp->rx_opt.mss_clamp;

        tcp_parse_options(skb, &tp->rx_opt, 0);

        if (th->ack) {
...
                if (tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr &&
                    !between(tp->rx_opt.rcv_tsecr, tp->retrans_stamp,
                             tcp_time_stamp)) {
                        NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_PAWSACTIVEREJECTED);
                        goto reset_and_undo;
                }
...

static inline void tcp_rcv_rtt_measure_ts(struct sock *sk,
                                          const struct sk_buff *skb)
{
        struct tcp_sock *tp = tcp_sk(sk);
        if (tp->rx_opt.rcv_tsecr &&
            (TCP_SKB_CB(skb)->end_seq -
             TCP_SKB_CB(skb)->seq >= inet_csk(sk)->icsk_ack.rcv_mss))
                tcp_rcv_rtt_update(tp, tcp_time_stamp - tp->rx_opt.rcv_tsecr, 0);
}
...
static inline int tcp_packet_delayed(struct tcp_sock *tp)
{
        return !tp->retrans_stamp ||
                (tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr &&
                 before(tp->rx_opt.rcv_tsecr, tp->retrans_stamp));
}
...

So we dont have a bit saying we received a tsecr, we use the 
'if saw_tstamp AND tsecr is not null' convention...


  reply	other threads:[~2009-08-25  5:42 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-08-24 21:54 TSecr != 0 check in inet_lro.c Octavian Purdila
2009-08-25  5:42 ` Eric Dumazet [this message]
2009-08-25 11:50   ` Octavian Purdila
2009-09-01 22:46   ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A9379C9.6050108@gmail.com \
    --to=eric.dumazet@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=opurdila@ixiacom.com \
    --cc=raisch@de.ibm.com \
    --cc=themann@de.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.