netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Hans Schillstrom <hans.schillstrom@ericsson.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Vijay Subramanian <subramanian.vijay@gmail.com>,
	Dave Taht <dave.taht@gmail.com>, netdev <netdev@vger.kernel.org>,
	Neal Cardwell <ncardwell@google.com>,
	Tom Herbert <therbert@google.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>
Subject: Re: [PATCH v2 net-next] tcp: avoid tx starvation by SYNACK packets
Date: Mon, 25 Jun 2012 08:24:23 +0200	[thread overview]
Message-ID: <201206250824.31386.hans.schillstrom@ericsson.com> (raw)
In-Reply-To: <1340440962.17495.39.camel@edumazet-glaptop>

Hi Eric
On Saturday 23 June 2012 10:42:42 Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> On Sat, 2012-06-23 at 00:34 -0700, Vijay Subramanian wrote:
> 
> > This  patch ([PATCH net-next] tcp: avoid tx starvation by SYNACK
> > packets)  is neither in net/net-next trees nor on patchwork. Maybe it
> > was missed since it was sent during the merge window. Is this not
> > needed anymore or is it being tested currently?
> 
> You're right, thanks for the reminder !

We have been runing this patch for a while now,
so I added a "Tested-by:"

> 
> [PATCH v2 net-next] tcp: avoid tx starvation by SYNACK packets
> 
> pfifo_fast being the default Qdisc, its pretty easy to fill it with
> SYNACK (small) packets while host is under synflood attack.
> 
> Packets of established TCP sessions are dropped at Qdisc layer and
> host appears almost dead.
> 
> Avoid this problem assigning TC_PRIO_FILLER priority to SYNACK
> generated in SYNCOOKIE mode, so that these packets are enqueued into
> pfifo_fast lowest priority (band 2).
> 
> Other packets, queued to band 0 or 1 are dequeued before any SYNACK
> packets waiting in band 2.
> 
> If not under synflood, SYNACK priority is as requested by listener
> sk_priority policy.
> 
> Reported-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Hans Schillstrom <hans.schillstrom@ericsson.com>


> Cc: Jesper Dangaard Brouer <brouer@redhat.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Cc: Tom Herbert <therbert@google.com>
> Cc: Vijay Subramanian <subramanian.vijay@gmail.com>
> ---
>  net/dccp/ipv4.c                  |    2 ++
>  net/ipv4/ip_output.c             |    2 +-
>  net/ipv4/tcp_ipv4.c              |    7 ++++++-
>  net/ipv6/inet6_connection_sock.c |    1 +
>  net/ipv6/ip6_output.c            |    2 +-
>  net/ipv6/tcp_ipv6.c              |   11 ++++++++---
>  6 files changed, 19 insertions(+), 6 deletions(-)
> 
> diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
> index 3eb76b5..045176f 100644
> --- a/net/dccp/ipv4.c
> +++ b/net/dccp/ipv4.c
> @@ -515,6 +515,7 @@ static int dccp_v4_send_response(struct sock *sk, struct request_sock *req,
>  
>  		dh->dccph_checksum = dccp_v4_csum_finish(skb, ireq->loc_addr,
>  							      ireq->rmt_addr);
> +		skb->priority = sk->sk_priority;
>  		err = ip_build_and_send_pkt(skb, sk, ireq->loc_addr,
>  					    ireq->rmt_addr,
>  					    ireq->opt);
> @@ -556,6 +557,7 @@ static void dccp_v4_ctl_send_reset(struct sock *sk, struct sk_buff *rxskb)
>  	skb_dst_set(skb, dst_clone(dst));
>  
>  	bh_lock_sock(ctl_sk);
> +	skb->priority = ctl_sk->sk_priority;
>  	err = ip_build_and_send_pkt(skb, ctl_sk,
>  				    rxiph->daddr, rxiph->saddr, NULL);
>  	bh_unlock_sock(ctl_sk);
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 0f3185a..71c6c20 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -155,7 +155,7 @@ int ip_build_and_send_pkt(struct sk_buff *skb, struct sock *sk,
>  		ip_options_build(skb, &opt->opt, daddr, rt, 0);
>  	}
>  
> -	skb->priority = sk->sk_priority;
> +	/* skb->priority is set by the caller */
>  	skb->mark = sk->sk_mark;
>  
>  	/* Send it out. */
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index b52934f..5ef4131 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -81,7 +81,7 @@
>  #include <linux/stddef.h>
>  #include <linux/proc_fs.h>
>  #include <linux/seq_file.h>
> -
> +#include <linux/pkt_sched.h>
>  #include <linux/crypto.h>
>  #include <linux/scatterlist.h>
>  
> @@ -821,6 +821,7 @@ static void tcp_v4_reqsk_send_ack(struct sock *sk, struct sk_buff *skb,
>   *	Send a SYN-ACK after having received a SYN.
>   *	This still operates on a request_sock only, not on a big
>   *	socket.
> + *	nocache is set for SYN-ACK sent in SYNCOOKIE mode
>   */
>  static int tcp_v4_send_synack(struct sock *sk, struct dst_entry *dst,
>  			      struct request_sock *req,
> @@ -843,6 +844,10 @@ static int tcp_v4_send_synack(struct sock *sk, struct dst_entry *dst,
>  		__tcp_v4_send_check(skb, ireq->loc_addr, ireq->rmt_addr);
>  
>  		skb_set_queue_mapping(skb, queue_mapping);
> +
> +		/* SYNACK sent in SYNCOOKIE mode have low priority */
> +		skb->priority = nocache ? TC_PRIO_FILLER : sk->sk_priority;
> +
>  		err = ip_build_and_send_pkt(skb, sk, ireq->loc_addr,
>  					    ireq->rmt_addr,
>  					    ireq->opt);
> diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c
> index e6cee52..5812a74 100644
> --- a/net/ipv6/inet6_connection_sock.c
> +++ b/net/ipv6/inet6_connection_sock.c
> @@ -248,6 +248,7 @@ int inet6_csk_xmit(struct sk_buff *skb, struct flowi *fl_unused)
>  	/* Restore final destination back after routing done */
>  	fl6.daddr = np->daddr;
>  
> +	skb->priority = sk->sk_priority;
>  	res = ip6_xmit(sk, skb, &fl6, np->opt, np->tclass);
>  	rcu_read_unlock();
>  	return res;
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index a233a7c..a93378a 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -228,7 +228,7 @@ int ip6_xmit(struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
>  	hdr->saddr = fl6->saddr;
>  	hdr->daddr = *first_hop;
>  
> -	skb->priority = sk->sk_priority;
> +	/* skb->priority is set by the caller */
>  	skb->mark = sk->sk_mark;
>  
>  	mtu = dst_mtu(dst);
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 26a8862..f664452 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -43,6 +43,7 @@
>  #include <linux/ipv6.h>
>  #include <linux/icmpv6.h>
>  #include <linux/random.h>
> +#include <linux/pkt_sched.h>
>  
>  #include <net/tcp.h>
>  #include <net/ndisc.h>
> @@ -479,7 +480,8 @@ out:
>  
>  static int tcp_v6_send_synack(struct sock *sk, struct request_sock *req,
>  			      struct request_values *rvp,
> -			      u16 queue_mapping)
> +			      u16 queue_mapping,
> +			      bool syncookie)
>  {
>  	struct inet6_request_sock *treq = inet6_rsk(req);
>  	struct ipv6_pinfo *np = inet6_sk(sk);
> @@ -515,6 +517,7 @@ static int tcp_v6_send_synack(struct sock *sk, struct request_sock *req,
>  	if (skb) {
>  		__tcp_v6_send_check(skb, &treq->loc_addr, &treq->rmt_addr);
>  
> +		skb->priority = syncookie ? TC_PRIO_FILLER : sk->sk_priority;
>  		fl6.daddr = treq->rmt_addr;
>  		skb_set_queue_mapping(skb, queue_mapping);
>  		err = ip6_xmit(sk, skb, &fl6, opt, np->tclass);
> @@ -531,7 +534,7 @@ static int tcp_v6_rtx_synack(struct sock *sk, struct request_sock *req,
>  			     struct request_values *rvp)
>  {
>  	TCP_INC_STATS_BH(sock_net(sk), TCP_MIB_RETRANSSEGS);
> -	return tcp_v6_send_synack(sk, req, rvp, 0);
> +	return tcp_v6_send_synack(sk, req, rvp, 0, false);
>  }
>  
>  static void tcp_v6_reqsk_destructor(struct request_sock *req)
> @@ -909,6 +912,7 @@ static void tcp_v6_send_response(struct sk_buff *skb, u32 seq, u32 ack, u32 win,
>  	dst = ip6_dst_lookup_flow(ctl_sk, &fl6, NULL, false);
>  	if (!IS_ERR(dst)) {
>  		skb_dst_set(buff, dst);
> +		skb->priority = ctl_sk->sk_priority;
>  		ip6_xmit(ctl_sk, buff, &fl6, NULL, tclass);
>  		TCP_INC_STATS_BH(net, TCP_MIB_OUTSEGS);
>  		if (rst)
> @@ -1217,7 +1221,8 @@ have_isn:
>  
>  	if (tcp_v6_send_synack(sk, req,
>  			       (struct request_values *)&tmp_ext,
> -			       skb_get_queue_mapping(skb)) ||
> +			       skb_get_queue_mapping(skb),
> +			       want_cookie) ||
>  	    want_cookie)
>  		goto drop_and_free;
>  
> 
> 
> 

-- 
Regards
Hans Schillstrom <hans.schillstrom@ericsson.com>

  reply	other threads:[~2012-06-25  6:24 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-31 21:56 [PATCH net-next] tcp: avoid tx starvation by SYNACK packets Eric Dumazet
2012-05-31 23:03 ` David Miller
2012-06-01  4:48   ` Eric Dumazet
2012-06-01 17:46     ` David Miller
2012-06-01  7:00 ` [PATCH] tcp: do not create inetpeer on SYNACK message Eric Dumazet
2012-06-01 18:24   ` David Miller
2012-06-01 21:34   ` Hans Schillström
2012-06-02  6:56     ` Eric Dumazet
2012-06-01  7:36 ` [PATCH net-next] tcp: avoid tx starvation by SYNACK packets Hans Schillstrom
2012-06-01  9:34   ` Eric Dumazet
2012-06-02  1:28 ` Dave Taht
2012-06-02  5:46   ` Eric Dumazet
2012-06-23  7:34     ` Vijay Subramanian
2012-06-23  8:42       ` [PATCH v2 " Eric Dumazet
2012-06-25  6:24         ` Hans Schillstrom [this message]
2012-06-25 22:43         ` David Miller
2012-06-26  4:51           ` Eric Dumazet
2012-06-26  4:55             ` David Miller
2012-06-26  5:34               ` Hans Schillstrom
2012-06-26  7:11                 ` David Miller
2012-06-26  7:27                   ` Hans Schillstrom
2012-06-26 17:02                 ` Eric Dumazet
2012-06-27  5:23                   ` Hans Schillstrom
2012-06-27  8:22                     ` David Miller
2012-06-27  8:25                       ` Jesper Dangaard Brouer
2012-06-27  8:30                       ` Hans Schillstrom
2012-06-27  8:40                       ` Eric Dumazet
2012-06-27  8:48                         ` David Miller
2012-06-27  6:32                   ` Jesper Dangaard Brouer
2012-06-27  6:54                     ` David Miller
2012-06-27  7:24                       ` Jesper Dangaard Brouer
2012-06-27  7:30                         ` Eric Dumazet
2012-06-27  7:54                           ` Jesper Dangaard Brouer
2012-06-27  8:02                             ` Eric Dumazet
2012-06-27  8:21                               ` Jesper Dangaard Brouer
2012-06-27  8:45                                 ` Eric Dumazet
2012-06-27  9:23                                   ` Jesper Dangaard Brouer
2012-06-27  8:13                           ` David Miller
2012-06-27 19:50                       ` Florian Westphal
2012-06-27 21:39                         ` Eric Dumazet
2012-06-27 22:23                           ` David Miller
2012-06-27 22:23                         ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201206250824.31386.hans.schillstrom@ericsson.com \
    --to=hans.schillstrom@ericsson.com \
    --cc=brouer@redhat.com \
    --cc=dave.taht@gmail.com \
    --cc=eric.dumazet@gmail.com \
    --cc=ncardwell@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=subramanian.vijay@gmail.com \
    --cc=therbert@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).