Re: Questions about early_drop()

From: Patrick McHardy <kaber@trash.net>
To: Luca Pesce <pesce.luca@gmail.com>
Cc: netfilter-devel@vger.kernel.org
Subject: Re: Questions about early_drop()
Date: Wed, 04 Nov 2009 12:31:37 +0100
Message-ID: <4AF16619.2090807@trash.net>
In-Reply-To: <4AEC32A9.8090405@gmail.com>

Luca Pesce wrote:
> I just realized that point 4 is very lame and wrong, so please skip it,
> consider only the first three questions.
> Thanks.
> 
> Luca Pesce wrote:
>> Hi all,
>>     today I was looking at early_drop() code in nf_conntrack_core.c, and
>> I came up with some questions, due to the fact that I am not such a
>> netfilter expert...
>> I am not running the latest kernel: I am cutting&pasting early_drop() of
>> my kernel at the end of this mail, note that compared to 2.6.31.x this
>> is quite different.
>>
>> 1- why does early_drop() increase the ct_general.use count of the ct to
>> be dropped before calling death_by_timeout(), and then decreases it with
>> nf_ct_put(ct)? Is this a way to postpone ct death? What for?

Quoting the changelog:

    [NETFILTER]: conntrack: fix race condition in early_drop

    On SMP environments the maximum number of conntracks can be overpassed
    under heavy stress situations due to an existing race condition.

            CPU A                   CPU B
         atomic_read()               ...
         early_drop()                ...
            ...                  atomic_read()
       allocate conntrack      allocate conntrack
         atomic_inc()             atomic_inc()

    This patch moves the counter incrementation before the early drop stage.
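
The ordering problem in that changelog can be illustrated with a minimal
userspace sketch. It uses C11 atomics rather than the kernel's atomic_t,
and every name in it (ct_count, CT_MAX, alloc_check_then_inc,
alloc_inc_then_check) is a hypothetical stand-in, not a kernel identifier.
The eviction step is left out: the sketch only shows why reserving the slot
before testing the limit closes the window between the read and the
increment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define CT_MAX 2                /* hypothetical table limit */

static atomic_int ct_count;     /* hypothetical global entry counter */

/* Racy ordering: the limit is tested first and the counter is bumped
 * later. Two CPUs can both pass the test before either increments,
 * so the counter can end up above CT_MAX. */
bool alloc_check_then_inc(void)
{
    if (atomic_load(&ct_count) >= CT_MAX)
        return false;
    /* ... allocation would happen here ... */
    atomic_fetch_add(&ct_count, 1);
    return true;
}

/* Ordering after the fix: reserve the slot first, then test the value
 * this caller produced. A caller that pushed the counter past CT_MAX
 * gives its reservation back and fails. */
bool alloc_inc_then_check(void)
{
    int now = atomic_fetch_add(&ct_count, 1) + 1;

    if (now > CT_MAX) {
        atomic_fetch_sub(&ct_count, 1);
        return false;
    }
    return true;
}
```

In the second variant each caller tests the value produced by its own
increment, so concurrent callers cannot all see the same stale count.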

>> 2- I see that 2.6.31.5 version of early_drop() is more complex: it
>> crosses more than one bucket looking for not assured connections to be
>> killed. I like that approach, but I was wondering if this is not burning
>> too much CPU when the conntrack table is overly saturated persistently
>> (and so when this function is called very often)...any experience about
>> that?

No negative experience at least :) It does greatly improve
robustness under DoS since with jhash() and a properly sized
hash table there's likely only a single entry per bucket.
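
To make the point about bucket occupancy concrete, here is a simplified
userspace sketch of a bounded scan across neighbouring buckets. It is not
the kernel function: struct entry, NBUCKETS, EVICTION_RANGE and
find_victim() are hypothetical names, there is no locking or RCU, and the
victim selection is reduced to "first non-assured entry found". If each
bucket holds about one entry, looking at a single bucket yields at most one
candidate, which may well be assured; walking on to the next buckets, up to
a fixed number of examined entries, gives a better chance of finding
something to drop while keeping the cost bounded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBUCKETS       8   /* hypothetical hash table size */
#define EVICTION_RANGE 8   /* hypothetical cap on entries examined */

struct entry {
    struct entry *next;    /* next entry in the same bucket */
    bool assured;          /* assured entries must not be evicted */
    int id;
};

/* Start at bucket `hash` and walk forward, wrapping around, until a
 * non-assured entry is found or EVICTION_RANGE entries were examined. */
struct entry *find_victim(struct entry *table[NBUCKETS], unsigned int hash)
{
    unsigned int examined = 0;

    for (unsigned int i = 0; i < NBUCKETS; i++) {
        for (struct entry *e = table[hash]; e != NULL; e = e->next) {
            if (!e->assured)
                return e;
            examined++;
        }
        if (examined >= EVICTION_RANGE)
            break;
        hash = (hash + 1) % NBUCKETS;
    }
    return NULL;
}
```

The cap on examined entries is what keeps the work per call bounded even
when the table is saturated and the function is called often.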

>> Can I port the whole early_drop() of 2.6.31.5 on my kernel?

Probably not.

>> 3- on 2.6.31.5 version of early_drop(), there are two added checks
>> before killing the conntrack:
>>
>>     if (ct && unlikely(nf_ct_is_dying(ct) ||
>>                        !atomic_inc_not_zero(&ct->ct_general.use)))
>>             ct = NULL;
>>
>> I understand that these are to ensure that the ct is not already dying
>> for itself: should I add those to early_drop() which I am currently
>> using to avoid races?

This was fixing RCU races. Without knowing your version I can't
tell. Probably not though, the affected -stable versions should
already include this.
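
For readers unfamiliar with the second check in the quoted snippet, the
idea behind an "increment unless zero" reference grab can be sketched in
userspace with C11 atomics. This is an illustration of the semantics only,
not the kernel's implementation of atomic_inc_not_zero(); the function name
ref_inc_not_zero() is hypothetical. With a lockless lookup, the object
found may already have dropped its last reference, so a plain increment
would resurrect something that is being freed. The conditional increment
refuses in that case.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the count is still non-zero.
 * Returns true when the reference was taken, false when the object
 * had already reached zero and must be treated as gone. */
bool ref_inc_not_zero(atomic_int *ref)
{
    int old = atomic_load(ref);

    while (old != 0) {
        /* On failure, `old` is reloaded with the current value and
         * the loop re-tests it against zero. */
        if (atomic_compare_exchange_weak(ref, &old, old + 1))
            return true;
    }
    return false;
}
```

A caller that gets false simply behaves as if the lookup had found nothing,
which matches the `ct = NULL` assignment in the quoted code.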
