All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zhang Le <r0bertz@gentoo.org>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	"David S. Miller" <davem@davemloft.net>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	"Pekka Savola (ipv6)" <pekkas@netcore.fi>,
	James Morris <jmorris@namei.org>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	Patrick McHardy <kaber@trash.net>
Subject: Re: [PATCH] ipv4: mitigate an integer underflow when comparing tcp timestamps
Date: Sun, 14 Nov 2010 23:00:07 +0800	[thread overview]
Message-ID: <20101114150006.GA23973@adriano> (raw)
In-Reply-To: <1289724745.2743.61.camel@edumazet-laptop>

[-- Attachment #1: Type: text/plain, Size: 3712 bytes --]

On 09:52 Sun 14 Nov     , Eric Dumazet wrote:
> Le dimanche 14 novembre 2010 à 15:35 +0800, Zhang Le a écrit :
> > Behind a loadbalancer which does NAT, peer->tcp_ts could be much smaller than
> > req->ts_recent. In this case, theoretically the req should not be ignored.
> > 
> > But in fact, it could be ignored, if peer->tcp_ts is so small that the
> > difference between this two number is larger than 2 to the power of 31.
> > 
> > I understand that under this situation, timestamp does not make sense any more,
> > because it actually comes from difference machines. However, if anyone
> > ever need to do the same investigation which I have done, this will
> > save some time for him.
> > 
> > Signed-off-by: Zhang Le <r0bertz@gentoo.org>
> > ---
> >  net/ipv4/tcp_ipv4.c |    4 ++--
> >  1 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index 8f8527d..1eb4974 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -1352,8 +1352,8 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
> >  		    peer->v4daddr == saddr) {
> >  			inet_peer_refcheck(peer);
> >  			if ((u32)get_seconds() - peer->tcp_ts_stamp < TCP_PAWS_MSL &&
> > -			    (s32)(peer->tcp_ts - req->ts_recent) >
> > -							TCP_PAWS_WINDOW) {
> > +			    ((s32)(peer->tcp_ts - req->ts_recent) > TCP_PAWS_WINDOW &&
> > +			     peer->tcp_ts > req->ts_recent)) {
> >  				NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_PAWSPASSIVEREJECTED);
> >  				goto drop_and_release;
> >  			}
> 
> This seems very wrong to me.
> 
> Adding a : if (peer->tcp_ts > req->ts_recent) condition is _not_ going
> to help. And it might break some working setups, because of wrap around.

Yeah, you are right. And sorry for overlooking this.

I should have reviewed time_{before,after}'s implementation before posting this.

So it seems we can't do anything to improve this except to add some warning in
documentation. Maybe some comments in the code too.

> 
> Really, if you have multiple clients behind a common NAT, you cannot use
> this code at all, since NAT doesnt usually change TCP timestamps.
> 
> What about following patch instead ?
> 
> [PATCH] doc: extend tcp_tw_recycle documentation
> 
> tcp_tw_recycle should not be used on a server if there is a chance
> clients are behind a same NAT. Document this fact before too many users
> discover this too late.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
>  Documentation/networking/ip-sysctl.txt |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
> index c7165f4..406f0d5 100644
> --- a/Documentation/networking/ip-sysctl.txt
> +++ b/Documentation/networking/ip-sysctl.txt
> @@ -446,7 +446,12 @@ tcp_tso_win_divisor - INTEGER
>  tcp_tw_recycle - BOOLEAN
>  	Enable fast recycling TIME-WAIT sockets. Default value is 0.
>  	It should not be changed without advice/request of technical
> -	experts.
> +	experts. If you set it to 1, make sure you dont miss connections
> +	attempts (check LINUX_MIB_PAWSPASSIVEREJECTED netstat counter).
> +	In particular, this might break if several clients are behind
> +	a common NAT device, since their TCP timestamp wont be changed
> +	by the NAT. tcp_tw_recycle should be used with care, most
> +	probably in private networks.
>  
>  tcp_tw_reuse - BOOLEAN
>  	Allow to reuse TIME-WAIT sockets for new connections when it is
> 
> 

-- 
Zhang, Le
Gentoo/Loongson Developer
http://zhangle.is-a-geek.org
0260 C902 B8F8 6506 6586 2B90 BC51 C808 1E4E 2973

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

  reply	other threads:[~2010-11-14 14:58 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-14  7:35 [PATCH] ipv4: mitigate an integer underflow when comparing tcp timestamps Zhang Le
2010-11-14  8:52 ` Eric Dumazet
2010-11-14 15:00   ` Zhang Le [this message]
2010-11-14 19:55   ` David Miller
2010-11-25 16:55     ` ZHANG, Le

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101114150006.GA23973@adriano \
    --to=r0bertz@gentoo.org \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=jmorris@namei.org \
    --cc=kaber@trash.net \
    --cc=kuznet@ms2.inr.ac.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pekkas@netcore.fi \
    --cc=yoshfuji@linux-ipv6.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.