* Questions about early_drop()
From: Luca Pesce @ 2009-10-31 12:07 UTC
  To: netfilter-devel

Hi all,
    today I was looking at the early_drop() code in nf_conntrack_core.c, and I
came up with some questions, since I am not much of a netfilter expert...
I am not running the latest kernel: I am pasting the early_drop() of my kernel
at the end of this mail. Note that it is quite different from the 2.6.31.x one.

1- why does early_drop() increase the ct_general.use count of the ct to be
dropped before calling death_by_timeout(), and then decrease it with
nf_ct_put(ct)? Is this a way to postpone ct death? What for?

2- I see that the 2.6.31.5 version of early_drop() is more complex: it scans
more than one bucket looking for non-assured connections to kill. I like that
approach, but I was wondering whether it burns too much CPU when the conntrack
table is persistently saturated (and so this function is called very
often)... any experience with that?
Can I port the whole early_drop() of 2.6.31.5 to my kernel?

3- in the 2.6.31.5 version of early_drop(), there are two added checks before
killing the conntrack:

	if (ct && unlikely(nf_ct_is_dying(ct) ||
			   !atomic_inc_not_zero(&ct->ct_general.use)))
		ct = NULL;

I understand that these ensure that the ct is not already dying on its own:
should I add them to the early_drop() I am currently using, to avoid races?

4- I was thinking about temporarily modifying the early_drop() behaviour, only
to see the results of a test tool which tries to create NATed ftp download
sessions at a given connection rate (say X per second) - only to experiment a
bit! With early_drop() as is, after a while there are no non-ASSURED conntracks
in the hashtable, so no new connections are established. If I remove the check
on the ASSURED bit, I would kill a ct regardless of its status to make room for
the new one. Would that harm the conntrack code logic in some way?
Again, this would be experimental only - not to be used in real life.

Thanks for your time,
Luca

static int early_drop(struct list_head *chain)
{
	/* Traverse backwards: gives us oldest, which is roughly LRU */
	struct nf_conntrack_tuple_hash *h;
	struct nf_conn *ct = NULL, *tmp;
	int dropped = 0;

	read_lock_bh(&nf_conntrack_lock);
	list_for_each_entry_reverse(h, chain, list) {
		tmp = nf_ct_tuplehash_to_ctrack(h);
		if (!test_bit(IPS_ASSURED_BIT, &tmp->status)) {
			ct = tmp;
			atomic_inc(&ct->ct_general.use);
			break;
		}
	}
	read_unlock_bh(&nf_conntrack_lock);

	if (!ct)
		return dropped;

	if (del_timer(&ct->timeout)) {
		death_by_timeout((unsigned long)ct);
		dropped = 1;
		NF_CT_STAT_INC_ATOMIC(early_drop);
	}
	nf_ct_put(ct);
	return dropped;
}


* Re: Questions about early_drop()
From: Luca Pesce @ 2009-10-31 12:50 UTC
  To: netfilter-devel

I just realized that point 4 is very lame and wrong, so please skip it and
consider only the first three questions.
Thanks.


* Re: Questions about early_drop()
From: Patrick McHardy @ 2009-11-04 11:31 UTC
  To: Luca Pesce; +Cc: netfilter-devel

Luca Pesce wrote:
> I just realized that point 4 is very lame and wrong, so please skip it,
> consider only the first three questions.
> Thanks.
> 
> Luca Pesce wrote:
>> Hi all,
>>     today I was looking at early_drop() code in nf_conntrack_core.c, and I
>> came up with some questions, due to the fact that I am not such a netfilter
>> expert...
>> I am not running the latest kernel: I am cutting&pasting early_drop() of my
>> kernel at the end of this mail, note that compared to 2.6.31.x this is quite
>> different.
>>
>> 1- why does early_drop() increase the ct_general.use count of the ct to be
>> dropped before calling death_by_timeout(), and then decreases it with
>> nf_ct_put(ct)? Is this a way to postpone ct death? What for?

Quoting the changelog:

    [NETFILTER]: conntrack: fix race condition in early_drop

    On SMP environments the maximum number of conntracks can be overpassed
    under heavy stress situations due to an existing race condition.

            CPU A                   CPU B
         atomic_read()               ...
         early_drop()                ...
            ...                  atomic_read()
       allocate conntrack      allocate conntrack
         atomic_inc()             atomic_inc()

    This patch moves the counter incrementation before the early drop stage.
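
The ordering described in the changelog can be sketched in userspace C11 as
follows. This is only an illustration, not the kernel code: entry_count,
ENTRY_MAX, reserve_slot() and release_slot() are hypothetical names, and the
eviction attempt that the real code makes when the limit is exceeded is left
out. The point is that the counter is incremented first and the limit is
checked on the resulting value, so two CPUs cannot both pass a stale check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define ENTRY_MAX 2             /* hypothetical table limit */

static atomic_int entry_count;  /* hypothetical global entry counter */

/* Reserve a slot: increment first, then check the value we produced. */
static bool reserve_slot(void)
{
    int now = atomic_fetch_add(&entry_count, 1) + 1;

    if (now > ENTRY_MAX) {
        /* Over the limit: undo our own increment and fail.
         * (A real implementation would try to evict an entry here.) */
        atomic_fetch_sub(&entry_count, 1);
        return false;
    }
    return true;
}

/* Give a slot back when an entry is destroyed. */
static void release_slot(void)
{
    int old = atomic_fetch_sub(&entry_count, 1);

    assert(old > 0);
    (void)old;
}
```

With a check-then-increment ordering, both CPUs in the diagram above can read
the same count, both decide there is room, and both allocate.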

>> 2- I see that 2.6.31.5 version of early_drop() is more complex: it crosses
>> more than one bucket looking for not assured connections to be killed. I like
>> that approach, but I was wondering if this is not burning too much CPU when
>> the conntrack table is overly saturated persistently (and so when this
>> function is called very often)...any experience about that?

No negative experience at least :) It does greatly improve
robustness under DoS since with jhash() and a properly sized
hash table there's likely only a single entry per bucket.
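
A rough sketch of why scanning several buckets helps, under stated
assumptions: TABLE_SIZE, SCAN_LIMIT, struct entry and find_victim() are
hypothetical and this is not the 2.6.31.5 code. When most buckets hold a
single entry, looking at one bucket rarely finds a non-assured victim, so the
scan walks on to neighbouring buckets but stops once a bounded number of
entries has been examined, which caps the CPU cost per call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 8   /* hypothetical number of hash buckets */
#define SCAN_LIMIT 4   /* hypothetical cap on entries examined per call */

struct entry {
    bool assured;          /* stands in for the ASSURED status bit */
    struct entry *next;    /* next entry in the same bucket */
};

/* Return the first non-assured entry found, starting at bucket 'start'
 * and moving to following buckets, or NULL once the cap is reached. */
static struct entry *find_victim(struct entry *table[TABLE_SIZE],
                                 unsigned int start)
{
    unsigned int hash = start % TABLE_SIZE;
    unsigned int examined = 0;

    assert(table != NULL);

    for (unsigned int i = 0; i < TABLE_SIZE; i++) {
        for (struct entry *e = table[hash]; e != NULL; e = e->next) {
            if (!e->assured)
                return e;
            examined++;
        }
        if (examined >= SCAN_LIMIT)
            break;  /* bounded work even when the table is saturated */
        hash = (hash + 1) % TABLE_SIZE;
    }
    return NULL;
}
```

The cap is checked after each bucket, so a long chain is still walked to its
end; the bound is on how many further buckets get visited.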

>> Can I port the whole early_drop() of 2.6.31.5 on my kernel?

Probably not.

>> 3- on 2.6.31.5 version of early_drop(), there are two added checks before
>> killing the conntrack:
>>
>>     if (ct && unlikely(nf_ct_is_dying(ct) ||
>>                        !atomic_inc_not_zero(&ct->ct_general.use)))
>>         ct = NULL;
>>
>> I understand that these are to ensure that the ct is not already dying for
>> itself: should I add those to early_drop() which I am currently using to avoid
>> races?

This was fixing RCU races. Without knowing your version I can't
tell. Probably not though, the affected -stable versions should
already include this.
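
For readers unfamiliar with the pattern, here is a minimal userspace sketch
of an "increment unless zero" reference grab using C11 atomics. It is not the
kernel implementation: struct obj, ref_inc_not_zero(), obj_try_get() and
obj_put() are hypothetical names. The idea is that a lockless lookup may
still see an object whose last reference is already gone, so the reader must
refuse to take a reference when the count is zero instead of blindly
incrementing it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct obj {
    atomic_int use;     /* reference count */
    atomic_bool dying;  /* set once teardown has started */
};

/* Increment *ref only if it is currently non-zero. */
static bool ref_inc_not_zero(atomic_int *ref)
{
    int old = atomic_load(ref);

    while (old != 0) {
        /* On failure 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(ref, &old, old + 1))
            return true;
    }
    return false;
}

/* Take a reference, or return NULL if the object is on its way out. */
static struct obj *obj_try_get(struct obj *o)
{
    if (o == NULL || atomic_load(&o->dying) || !ref_inc_not_zero(&o->use))
        return NULL;
    return o;
}

/* Drop a reference taken with obj_try_get(). */
static void obj_put(struct obj *o)
{
    int old = atomic_fetch_sub(&o->use, 1);

    assert(old > 0);
    (void)old;
}
```

A plain increment, as in the older early_drop() quoted at the top of the
thread, is safe only if something else (there, the read lock) guarantees the
count cannot already have reached zero.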
