From mboxrd@z Thu Jan 1 00:00:00 1970 From: Changli Gao Subject: Re: DDoS attack causing bad effect on conntrack searches Date: Tue, 1 Jun 2010 13:48:23 +0800 Message-ID: References: <1271941082.14501.189.camel@jdb-workstation> <4BD04C74.9020402@trash.net> <1271946961.7895.5665.camel@edumazet-laptop> <1271948029.7895.5707.camel@edumazet-laptop> <20100422155123.GA2524@linux.vnet.ibm.com> <1271952128.7895.5851.camel@edumazet-laptop> <1272056237.4599.7.camel@edumazet-laptop> <1272139861.20714.525.camel@edumazet-laptop> <1272292568.13192.43.camel@jdb-workstation> <1275340896.2478.26.camel@edumazet-laptop> <1275368732.2478.88.camel@edumazet-laptop> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: hawk@comx.dk, Jesper Dangaard Brouer , paulmck@linux.vnet.ibm.com, Patrick McHardy , Linux Kernel Network Hackers , Netfilter Developers To: Eric Dumazet Return-path: In-Reply-To: <1275368732.2478.88.camel@edumazet-laptop> Sender: netdev-owner@vger.kernel.org List-Id: netfilter-devel.vger.kernel.org On Tue, Jun 1, 2010 at 1:05 PM, Eric Dumazet w= rote: > Le mardi 01 juin 2010 =E0 08:28 +0800, Changli Gao a =E9crit : >> On Tue, Jun 1, 2010 at 5:21 AM, Eric Dumazet wrote: >> > >> > I had a look at current conntrack and found the 'unconfirmed' list= was >> > maybe a candidate for a potential blackhole. >> > >> > That is, if a reader happens to hit an entry that is moved from re= gular >> > hash table slot 'hash' to unconfirmed list, >> >> Sorry, but I can't find where we do this things. unconfirmed list is >> used to track the unconfirmed cts, whose corresponding skbs are stil= l >> in path from the first to the last netfilter hooks. As soon as the >> skbs end their travel in netfilter, the corresponding cts will be >> confirmed(moving ct from unconfirmed list to regular hash table). >> > > So netfilter is a monolithic thing. > > When a packet begins its travel into netfilter, you guarantee that no > other packet can also begin its travel and find an unconfirmed > conntrack ? > > I wonder why we use atomic ops then to track the confirmed bit :) seems no need. > > >> unconfirmed list should be small, as networking receiving is in BH. > > So according to you, netfilter/ct runs only in input path ? No. there are another entrances: local out and nf_reinject. If there isn't any packet queued, as netfilter is in atomic context, the nubmer of unconfirmed cts should be small( at most, 2 * nr_cpu?). > > So I assume a packet is handled by CPU X, creates a new conntrack > (possibly early droping an old entry that was previously in a standar= d > hash chain), inserted in unconfirmed list. > Oh, Thanks, I got it. > _You_ guarantee another CPU > Y, handling another packet, possibly sent by a hacker reading your > netdev mails, cannot find the conntrack that was early dropped ? > >> How about implementing unconfirmed list as a per cpu variable? > > I first implemented such a patch to reduce cache line contention, the= n I > asked to myself : What is exactly an unconfirmed conntrack ? Can thei= r > number be unbounded ? If yes, we have a problem, even on a two cpus > machine. Using two lists instead of one wont solve the fundamental > problem. > > The real question is, why do we need this unconfirmed 'list' in the > first place. Is it really a private per cpu thing ? No, it isn't really private. But most of time, it is accessed locally, if it is implemented as a per cpu var. > Can you prove this, > in respect of lockless lookups, and things like NFQUEUE ? > > Each conntrack object has two list anchors. One for IP_CT_DIR_ORIGINA= L, > one for IP_CT_DIR_REPLY. > > Unconfirmed list use the first anchor. This means another cpu can > definitely find an unconfirmed item in a regular hash chain, since we > dont respect an RCU grace period before re-using an object. > > If memory was not a problem, we probably would use a third anchor to > avoid this, or regular RCU instead of SLAB_DESTROY_BY_RCU variant. > thanks for your explaining. --=20 Regards, Changli Gao(xiaosuo@gmail.com)