[RFC PATCH] tcp: Fast/early SYN handling to mitigate SYN floods


* [RFC PATCH] tcp: Fast/early SYN handling to mitigate SYN floods
@ 2012-05-24 13:01 Jesper Dangaard Brouer
  2012-05-24 13:20 ` Hans Schillstrom
  2012-05-24 13:26 ` Christoph Paasch
  0 siblings, 2 replies; 7+ messages in thread
From: Jesper Dangaard Brouer @ 2012-05-24 13:01 UTC (permalink / raw)
  To: Eric Dumazet, David Miller; +Cc: Martin Topholm, netdev

Hi Eric,

I have been doing some TCP performance measurements with SYN flooding,
and have found that we don't handle this case well.

I have made a patch for fast/early SYN handling in tcp_v4_rcv() in
net/ipv4/tcp_ipv4.c.  This increases SYN performance from 130 kpps to
750 kpps (max of the generator), with idle CPU cycles.

Current locking:
 During a SYN flood (against a single port) all CPUs are spinning on
the same spinlock, namely bh_lock_sock_nested(sk), in tcp_ipv4.c.  The
lock dates back to a commit by DaveM in May 1999, see historic
commit[1].  It seems that TCP runs fully locked, per sock.

I need some help with locking: the patch seems to work fine with
NO-PREEMPT, but with PREEMPT enabled I start to see warnings (in
reqsk_queue_destroy) and oopses (in inet_csk_reqsk_queue_prune).

What am I missing?

[1] Historic commit: http://git.kernel.org/?p=linux/kernel/git/davem/netdev-vger-cvs.git;a=commitdiff;h=5744fad55cefbd6f079410500a507443d92d63ff

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

[RFC PATCH] tcp: Fast/early SYN handling to mitigate SYN floods

TCP SYN handling is on the slow path via tcp_v4_rcv(), and is
performed while holding spinlock bh_lock_sock().

Real-life and testlab experiments show that the kernel chokes
when reaching 130Kpps SYN floods (powerful Nehalem 16 cores).
Measuring with perf reveals that it's caused by the
bh_lock_sock_nested() call in tcp_v4_rcv().

With this patch, the machine can handle 750Kpps (max of the SYN
flood generator) with cycles to spare.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---

 net/ipv4/tcp_ipv4.c |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 2e76ffb..7d7e8e0 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1718,6 +1718,22 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	if (!sk)
 		goto no_tcp_socket;

+	/* Fast/early SYN handling, to mitigate SYN attacks */
+	if (sk->sk_state == TCP_LISTEN && th->syn && !th->ack && !th->fin) {
+		//bh_lock_sock_nested(sk); /* Don't think lock is needed */
+		/* Handles syn cookie, normally called from
+		 * tcp_rcv_state_process() */
+		tcp_v4_conn_request(sk, skb);
+		//bh_unlock_sock(sk);
+
+		/* Questions, do we (really) need to create a new sk,
+		 * as in tcp_v4_hnd_req() ?
+		 */
+		sock_put(sk);
+		kfree_skb(skb);
+		return 0;
+	}
+
 process:
 	if (sk->sk_state == TCP_TIME_WAIT)
 		goto do_time_wait;


* Re: [RFC PATCH] tcp: Fast/early SYN handling to mitigate SYN floods
  2012-05-24 13:01 [RFC PATCH] tcp: Fast/early SYN handling to mitigate SYN floods Jesper Dangaard Brouer
@ 2012-05-24 13:20 ` Hans Schillstrom
  2012-05-24 17:32   ` Jesper Dangaard Brouer
  2012-05-24 13:26 ` Christoph Paasch
  1 sibling, 1 reply; 7+ messages in thread
From: Hans Schillstrom @ 2012-05-24 13:20 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: Eric Dumazet, David Miller, Martin Topholm, netdev

Hi Jesper
We are also working on this issue right now.

On Thursday 24 May 2012 15:01:07 Jesper Dangaard Brouer wrote:
> Hi Eric,
> 
> I have been doing some TCP performance measurements with SYN flooding,
> and have found that, we don't handle this case well.
> 
> I have made a patch for fast/early SYN handling in tcp_v4_rcv() in
> net/ipv4/tcp_ipv4.c.  This increases SYN performance from 130 kpps to
> 750 kpps (max of the generator), with idle CPU cycles.
> 
> Current locking:
>  During a SYN flood (against a single port) all CPUs are spinning on
> the same spinlock, namely bh_lock_sock_nested(sk), in tcp_ipv4.c.  The
> lock dates back to a commit by DaveM in May 1999, see historic
> commit[1].  It seem that TCP runs fully locked, per sock.
> 
> I need some help with locking, as the patch seems to work fine, with
> NO-PREEMPT, but with PREEMPT enabled I start to see warnings (in
> reqsk_queue_destroy) and oopses (in inet_csk_reqsk_queue_prune).
> 
> What am I missing?
> 
> [1] Historic commit: http://git.kernel.org/?p=linux/kernel/git/davem/netdev-vger-cvs.git;a=commitdiff;h=5744fad55cefbd6f079410500a507443d92d63ff
> 
> -- 
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Sr. Network Kernel Developer at Red Hat
>   Author of http://www.iptv-analyzer.org
>   LinkedIn: http://www.linkedin.com/in/brouer
> 
> 
> [RFC PATCH] tcp: Fast/early SYN handling to mitigate SYN floods
> 
> TCP SYN handling is on the slow path via tcp_v4_rcv(), and is
> performed while holding spinlock bh_lock_sock().
> 
> Real-life and testlab experiments show, that the kernel choks
> when reaching 130Kpps SYN floods (powerful Nehalem 16 cores).
> Measuring with perf reveals, that its caused by
> bh_lock_sock_nested() call in tcp_v4_rcv().

I can confirm this too, and it doesn't scale with more cores.

> 
> With this patch, the machine can handle 750Kpps (max of the SYN
> flood generator) with cycles to spare.
This looks great.

I'm also working on a solution that does not trash conntrack,
i.e. keeps conntrack working during a heavy SYN attack.

-- 
Regards
Hans Schillstrom 


* Re: [RFC PATCH] tcp: Fast/early SYN handling to mitigate SYN floods
  2012-05-24 13:01 [RFC PATCH] tcp: Fast/early SYN handling to mitigate SYN floods Jesper Dangaard Brouer
  2012-05-24 13:20 ` Hans Schillstrom
@ 2012-05-24 13:26 ` Christoph Paasch
  2012-05-24 14:51   ` Eric Dumazet
  1 sibling, 1 reply; 7+ messages in thread
From: Christoph Paasch @ 2012-05-24 13:26 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: Eric Dumazet, David Miller, Martin Topholm, netdev

Hello,

On 05/24/2012 03:01 PM, Jesper Dangaard Brouer wrote:
> I have been doing some TCP performance measurements with SYN flooding,
> and have found that, we don't handle this case well.
> 
> I have made a patch for fast/early SYN handling in tcp_v4_rcv() in
> net/ipv4/tcp_ipv4.c.  This increases SYN performance from 130 kpps to
> 750 kpps (max of the generator), with idle CPU cycles.
> 
> Current locking:
>  During a SYN flood (against a single port) all CPUs are spinning on
> the same spinlock, namely bh_lock_sock_nested(sk), in tcp_ipv4.c.  The
> lock dates back to a commit by DaveM in May 1999, see historic
> commit[1].  It seem that TCP runs fully locked, per sock.
> 
> I need some help with locking, as the patch seems to work fine, with
> NO-PREEMPT, but with PREEMPT enabled I start to see warnings (in
> reqsk_queue_destroy) and oopses (in inet_csk_reqsk_queue_prune).
> 
> What am I missing?

For each retransmission of a SYN you will add a request-sock to the
syn_table, because you do not pass by tcp_v4_hnd_req(), which checks
this by calling inet_csk_search_req().

And your warning in reqsk_queue_destroy is because access to the
request_sock_queue is no longer protected by a lock.


The request_sock_queue is a shared resource, which must be protected by a
lock. As you allow "parallel" SYN-processing, the queue will get corrupted.
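
To illustrate both points, here is a minimal userspace sketch (not kernel
code). It assumes a pthread mutex standing in for the per-socket lock, and a
simple linked list standing in for the request_sock_queue; every toy_* name
is hypothetical. The lookup before the insert plays the role that
inet_csk_search_req() plays via tcp_v4_hnd_req(): a retransmitted SYN finds
the existing entry instead of adding a second one, and the lock keeps
concurrent callers from corrupting the list.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a request sock: one half-open connection. */
struct toy_req {
	uint32_t saddr;		/* remote address */
	uint16_t sport;		/* remote port */
	struct toy_req *next;
};

/* Hypothetical stand-in for the listener's request_sock_queue. */
struct toy_queue {
	pthread_mutex_t lock;	/* plays the role of the per-socket lock */
	struct toy_req *head;
	unsigned int qlen;
};

static void toy_queue_init(struct toy_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->head = NULL;
	q->qlen = 0;
}

/* Caller must hold q->lock. Returns the matching entry or NULL. */
static struct toy_req *toy_search(struct toy_queue *q,
				  uint32_t saddr, uint16_t sport)
{
	for (struct toy_req *r = q->head; r; r = r->next)
		if (r->saddr == saddr && r->sport == sport)
			return r;
	return NULL;
}

/*
 * Handle one SYN. Returns true if a new entry was queued, false if the
 * SYN was a retransmission of a pending request or allocation failed.
 */
static bool toy_handle_syn(struct toy_queue *q,
			   uint32_t saddr, uint16_t sport)
{
	bool added = false;

	pthread_mutex_lock(&q->lock);
	if (!toy_search(q, saddr, sport)) {
		struct toy_req *r = malloc(sizeof(*r));

		if (r) {
			r->saddr = saddr;
			r->sport = sport;
			r->next = q->head;
			q->head = r;
			q->qlen++;
			added = true;
		}
	}
	pthread_mutex_unlock(&q->lock);
	return added;
}
```

Calling toy_handle_syn() twice with the same address and port queues only
one entry. Without the search, each retransmission would grow the queue;
without the lock, two threads could both update q->head at once.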


Cheers,
Christoph


-- 
Christoph Paasch
PhD Student

IP Networking Lab --- http://inl.info.ucl.ac.be
MultiPath TCP in the Linux Kernel --- http://mptcp.info.ucl.ac.be
Université Catholique de Louvain
-- 


* Re: [RFC PATCH] tcp: Fast/early SYN handling to mitigate SYN floods
  2012-05-24 13:26 ` Christoph Paasch
@ 2012-05-24 14:51   ` Eric Dumazet
  2012-05-24 17:21     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2012-05-24 14:51 UTC (permalink / raw)
  To: christoph.paasch, Jesper Dangaard Brouer
  Cc: David Miller, Martin Topholm, netdev, Tom Herbert

On Thu, 2012-05-24 at 15:26 +0200, Christoph Paasch wrote:
> Hello,
> 
> On 05/24/2012 03:01 PM, Jesper Dangaard Brouer wrote:
> > I have been doing some TCP performance measurements with SYN flooding,
> > and have found that, we don't handle this case well.
> > 
> > I have made a patch for fast/early SYN handling in tcp_v4_rcv() in
> > net/ipv4/tcp_ipv4.c.  This increases SYN performance from 130 kpps to
> > 750 kpps (max of the generator), with idle CPU cycles.
> > 
> > Current locking:
> >  During a SYN flood (against a single port) all CPUs are spinning on
> > the same spinlock, namely bh_lock_sock_nested(sk), in tcp_ipv4.c.  The
> > lock dates back to a commit by DaveM in May 1999, see historic
> > commit[1].  It seem that TCP runs fully locked, per sock.
> > 
> > I need some help with locking, as the patch seems to work fine, with
> > NO-PREEMPT, but with PREEMPT enabled I start to see warnings (in
> > reqsk_queue_destroy) and oopses (in inet_csk_reqsk_queue_prune).
> > 
> > What am I missing?
> 
> For each retransmission of a SYN you will add a request-sock to the
> syn_table, because you do not pass by tcp_v4_hnd_req(), which checks
> this by calling inet_csk_search_req().
> 
> And your warning in reqsk_queue_destroy is because the access to the the
> request_sock_queue is no more protected by a lock.
> 
> 
> The request_sock_queue is a shared resource, which must be protect by a
> lock. As you allow "parallel" SYN-processing, the queue will get corrupted.
> 

Hi guys, that's a very interesting subject.

I began work on fully converting this stuff to RCU some weeks ago but
got distracted by codel / fq_codel and other cool stuff (TCP coalescing
and skb->frag_head)

I don't know if you remember the SO_REUSEPORT patch(es) posted by Tom
Herbert in the past. The remaining issue was about adding a new listener
to, or removing one from, a pool of listeners on the same port: the hash
function changed, so we could lose some connections in SYN_RECV state at
this stage.

So I was working on having a shared table, no longer using a central
spinlock but an array of spinlocks, as done elsewhere
(ESTABLISHED/TIMEWAIT hash tables).
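
The general "array of locks" idea can be sketched in userspace roughly as
follows. This is only an illustration of per-bucket locking, not the planned
kernel change: pthread mutexes stand in for spinlocks, and the table size,
the hash constant, and all toy_* names are assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_BUCKETS 16	/* hypothetical size; must be a power of two */

struct toy_entry {
	uint32_t key;
	struct toy_entry *next;
};

struct toy_bucket {
	pthread_mutex_t lock;	/* one lock per bucket, not one per table */
	struct toy_entry *chain;
};

struct toy_table {
	struct toy_bucket b[TOY_BUCKETS];
};

static void toy_table_init(struct toy_table *t)
{
	for (int i = 0; i < TOY_BUCKETS; i++) {
		pthread_mutex_init(&t->b[i].lock, NULL);
		t->b[i].chain = NULL;
	}
}

/* Multiplicative hash; the multiplier is an arbitrary odd constant. */
static unsigned int toy_hash(uint32_t key)
{
	uint32_t h = (uint32_t)(key * UINT32_C(2654435761));

	return (h >> 16) & (TOY_BUCKETS - 1);
}

/* Returns 1 if inserted, 0 if already present, -1 if allocation failed. */
static int toy_insert(struct toy_table *t, uint32_t key)
{
	struct toy_bucket *b = &t->b[toy_hash(key)];
	struct toy_entry *e;
	int ret = 1;

	pthread_mutex_lock(&b->lock);	/* only this bucket is serialized */
	for (e = b->chain; e; e = e->next) {
		if (e->key == key) {
			ret = 0;
			goto out;
		}
	}
	e = malloc(sizeof(*e));
	if (!e) {
		ret = -1;
		goto out;
	}
	e->key = key;
	e->next = b->chain;
	b->chain = e;
out:
	pthread_mutex_unlock(&b->lock);
	return ret;
}

/* Returns 1 if key is present, 0 otherwise. */
static int toy_contains(struct toy_table *t, uint32_t key)
{
	struct toy_bucket *b = &t->b[toy_hash(key)];
	int found = 0;

	pthread_mutex_lock(&b->lock);
	for (struct toy_entry *e = b->chain; e; e = e->next) {
		if (e->key == key) {
			found = 1;
			break;
		}
	}
	pthread_mutex_unlock(&b->lock);
	return found;
}
```

Two threads only contend when their keys hash to the same bucket, so a flood
spread over many source addresses and ports is no longer serialized on a
single lock.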

My work is probably a ~500 LOC target, allowing concurrent processing by
all cpus of the host.

Jesper, my goals are probably different than yours, unless I
misunderstood your intention.

I feel you want an emergency mode that immediately sends a SYNCOOKIE
when the listener is overflowed?


* Re: [RFC PATCH] tcp: Fast/early SYN handling to mitigate SYN floods
  2012-05-24 14:51   ` Eric Dumazet
@ 2012-05-24 17:21     ` Jesper Dangaard Brouer
  2012-05-24 17:27       ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: Jesper Dangaard Brouer @ 2012-05-24 17:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: christoph.paasch, David Miller, Martin Topholm, netdev,
	Tom Herbert

On Thu, 2012-05-24 at 16:51 +0200, Eric Dumazet wrote:
> On Thu, 2012-05-24 at 15:26 +0200, Christoph Paasch wrote:
> > Hello,
> > 
> > On 05/24/2012 03:01 PM, Jesper Dangaard Brouer wrote:
> > > I have been doing some TCP performance measurements with SYN flooding,
> > > and have found that, we don't handle this case well.
> > > 
> > > I have made a patch for fast/early SYN handling in tcp_v4_rcv() in
> > > net/ipv4/tcp_ipv4.c.  This increases SYN performance from 130 kpps to
> > > 750 kpps (max of the generator), with idle CPU cycles.
> > > 
> > > Current locking:
> > >  During a SYN flood (against a single port) all CPUs are spinning on
> > > the same spinlock, namely bh_lock_sock_nested(sk), in tcp_ipv4.c.  The
> > > lock dates back to a commit by DaveM in May 1999, see historic
> > > commit[1].  It seem that TCP runs fully locked, per sock.
> > > 
> > > I need some help with locking, as the patch seems to work fine, with
> > > NO-PREEMPT, but with PREEMPT enabled I start to see warnings (in
> > > reqsk_queue_destroy) and oopses (in inet_csk_reqsk_queue_prune).
> > > 
> > > What am I missing?
> > 
> > For each retransmission of a SYN you will add a request-sock to the
> > syn_table, because you do not pass by tcp_v4_hnd_req(), which checks
> > this by calling inet_csk_search_req().

Thanks, that's a good hint.  I was suspecting that tcp_v4_hnd_req() was
somehow needed (as noted in the comment in the patch).

> > And your warning in reqsk_queue_destroy is because the access to the the
> > request_sock_queue is no more protected by a lock.

Yes, I was suspecting that.

> > The request_sock_queue is a shared resource, which must be protect by a
> > lock. As you allow "parallel" SYN-processing, the queue will get corrupted.
> > 
> 
> Hi guys, that's a very interesting subject.
> 
> I began work on fully converting this stuff to RCU some weeks ago but
> got distracted by codel / fq_codel and other cool stuff (TCP coalescing
> and skb->frag_head)
> 
> I dont know if you remember the SO_REUSEPORT patch(s) posted by Tom
> Herbert in the past. The remaining issue was about adding/removing a new
> listener to a pool of listeners to same port, and hash function was
> changed so we could lost some connexions in SYN_RECV state at this
> stage.

Sorry, don't remember.

> So I was working having a shared table, and not anymore using a central
> spinlock, but an array of spinlock, as done elsewhere
> (ESTABLISHED/TIMEWAIT hash tables)
> 
> My work is probably a ~500 LOC target, allowing concurrent processing by
> all cpus of the host.

Sounds really promising, especially coming from the network-ninja :-)


> Jesper, my goals are probably different than yours, unless I
> misunderstood your intention.
> 
> I feel you want to have an emergency mode, when listener is overflowed
> to immediately send a SYNCOOKIE ?

Yes, this is more an emergency mode.

I was thinking of only handling the SYN cookie case in parallel.
That should be easier locking-wise, right?
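
The reason the cookie case looks easier is that a cookie is computed from
the packet plus a secret, so nothing has to be added to a shared queue when
the SYN arrives. A toy userspace sketch of that property follows. It is
deliberately NOT the kernel's syncookie algorithm (which also encodes a time
counter and an MSS index) and the mixing function is not cryptographically
secure; the toy_* names, the mixing constant, and the secret handling are
all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical connection 4-tuple. */
struct toy_tuple {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/* Toy mixing step; NOT a secure hash, illustration only. */
static uint32_t toy_mix(uint32_t h, uint32_t v)
{
	h ^= v;
	h *= UINT32_C(0x9E3779B1);
	h ^= h >> 15;
	return h;
}

/*
 * Derive the server's initial sequence number from the tuple, a secret
 * and the client's ISN. Pure function: no shared state is read or
 * written, so any CPU can compute it without taking a lock.
 */
static uint32_t toy_cookie(const struct toy_tuple *t, uint32_t secret,
			   uint32_t client_isn)
{
	uint32_t h = secret;

	h = toy_mix(h, t->saddr);
	h = toy_mix(h, t->daddr);
	h = toy_mix(h, ((uint32_t)t->sport << 16) | t->dport);
	h = toy_mix(h, client_isn);
	return h;
}

/*
 * Validate the final ACK of the handshake. In that segment
 * seq == client_isn + 1 and ack_seq == server_isn + 1, so the cookie
 * can be recomputed from the packet alone and compared.
 */
static bool toy_cookie_check(const struct toy_tuple *t, uint32_t secret,
			     uint32_t seq, uint32_t ack_seq)
{
	return toy_cookie(t, secret, seq - 1) == ack_seq - 1;
}
```

State is only created once toy_cookie_check() succeeds on the ACK, which is
the point where a real implementation would need to touch shared structures
again.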

I'm also considering writing a netfilter/iptables syn-cookie module, as
this would allow people to use it in combination with IPset to, e.g.,
create a whitelist of known-good hosts (which have completed the
TCP handshake). But it would be nicer if the base kernel was just fast
enough to handle these SYN floods.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [RFC PATCH] tcp: Fast/early SYN handling to mitigate SYN floods
  2012-05-24 17:21     ` Jesper Dangaard Brouer
@ 2012-05-24 17:27       ` Eric Dumazet
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2012-05-24 17:27 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: christoph.paasch, David Miller, Martin Topholm, netdev,
	Tom Herbert

On Thu, 2012-05-24 at 19:21 +0200, Jesper Dangaard Brouer wrote:

> Sorry, don't remember.

http://kerneltrap.org/mailarchive/linux-netdev/2010/4/19/6274993

> Sounds really promising, especially coming from the network-ninja :-)

;)

> Yes, this is more an emergency mode.
> 
> I was thinking of only handling the SYN cookie case in parallel.
> That should be easier locking wise, right.
> 
> I'm also considering writing a netfilter/iptables syn-cookie module, as
> this would allow people to use it in combination with IPset, to e.g
> create a whitelist feature of known-good-hosts (which have completed the
> TCP handshake). But it would be nicer if the base kernel was just fast
> enough to handle these SYN floods.
> 

Indeed, I believe I can make this happen in the short term.


* Re: [RFC PATCH] tcp: Fast/early SYN handling to mitigate SYN floods
  2012-05-24 13:20 ` Hans Schillstrom
@ 2012-05-24 17:32   ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 7+ messages in thread
From: Jesper Dangaard Brouer @ 2012-05-24 17:32 UTC (permalink / raw)
  To: Hans Schillstrom; +Cc: Eric Dumazet, David Miller, Martin Topholm, netdev

On Thu, 2012-05-24 at 15:20 +0200, Hans Schillstrom wrote:
> Hi Jesper
> We are also working with this issue right now,
> 
[..] 
> > [RFC PATCH] tcp: Fast/early SYN handling to mitigate SYN floods
> > 
> > TCP SYN handling is on the slow path via tcp_v4_rcv(), and is
> > performed while holding spinlock bh_lock_sock().
> > 
> > Real-life and testlab experiments show, that the kernel choks
> > when reaching 130Kpps SYN floods (powerful Nehalem 16 cores).
> > Measuring with perf reveals, that its caused by
> > bh_lock_sock_nested() call in tcp_v4_rcv().
> 
> I can confirm this too, and it doesn't scale with more cores
> 
> > 
> > With this patch, the machine can handle 750Kpps (max of the SYN
> > flood generator) with cycles to spare.
>
> This looks great.

Yes, it definitely shows that there is a huge performance gain hidden
here! But we still have to handle locking (which will affect perf).

> I'm also working with a solution that not trash conntack
> i.e. have conntrack working during a heavy SYN attack

Sounds interesting, but that's a separate problem.  In this case I have
disabled conntracking (I even disabled flow-control and drop the syn-ack
responses on the generator).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

