From: Patrick McHardy <kaber@trash.net>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: netfilter-devel@vger.kernel.org
Subject: Re: [PATCH 1/2] netfilter: conntrack: move event cache to conntrack extension infrastructure
Date: Fri, 05 Jun 2009 16:13:47 +0200
Message-ID: <4A29281B.6010607@trash.net>
In-Reply-To: <4A291863.4090604@netfilter.org>

Pablo Neira Ayuso wrote:
> Patrick McHardy wrote:
>>>
>>> -    /* New conntrack */
>>> -    IPCT_NEW_BIT = 0,
>>> -    IPCT_NEW = (1 << IPCT_NEW_BIT),
>>> -
>>> -...
>>> +    IPCT_NEW        = 0,    /* new conntrack */
>>
>> Why this change? Further down, you change the code to use
>> (1 << IPCT_*), which isn't really an improvement.
>
> Oh, this is not intended to be an improvement. I needed to change this 
> to use bitops, that's all. Or I could use IPCT_*_BIT and change every 
> reference to this in the whole conntrack code, but the patch would be 
> much bigger and noisier :).

The major part is in ctnetlink I think, and you do seem to touch every
one of them :) But OK, using _BIT for all the events doesn't seem too
appealing either.
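
For reference, the two conventions side by side (a sketch; enumerator
names are lightly adjusted here so both variants compile together, the
usage comments are mine):

    /* Old style: the enum value is a mask, OR'ed into a plain
     * bitfield. */
    enum { IPCT_NEW_BIT = 0, IPCT_NEW_MASK = (1 << IPCT_NEW_BIT) };
    /* usage: ecache->events |= IPCT_NEW_MASK; */

    /* New style: the enum value is the bit position itself, so it
     * can be passed straight to the atomic bitops, which take a bit
     * number rather than a mask. */
    enum { IPCT_NEW = 0 };
    /* usage: set_bit(IPCT_NEW, &e->cache); */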

>>> +
>>> +    e = nf_ct_ecache_find(ct);
>>> +    if (e == NULL)
>>> +        return;
>>> +
>>> +    set_bit(event, &e->cache);
>>
>> This looks quite expensive, given how often this operation is performed.
>> Did you benchmark this?
>
> I'll do that benchmark. I was initially using nf_conntrack_lock to 
> protect the per-conntrack event cache; I think that using bitops is a 
> bit better?

I actually meant the lookup done potentially multiple times per packet.
But I incorrectly thought it was more expensive; that seems fine.
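
For context, the per-packet cost here is the extension lookup; a sketch
of what nf_ct_ecache_find() presumably reduces to, assuming it wraps
the generic extension lookup the way the other extensions do:

    static inline struct nf_conntrack_ecache *
    nf_ct_ecache_find(const struct nf_conn *ct)
    {
            /* nf_ct_ext_find() resolves the per-id offset stored in
             * ct->ext and returns a pointer into the extension area,
             * or NULL if this extension was never added. */
            return nf_ct_ext_find(ct, NF_CT_EXT_ECACHE);
    }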

>>> @@ -8,12 +8,14 @@ enum nf_ct_ext_id
>>>      NF_CT_EXT_HELPER,
>>>      NF_CT_EXT_NAT,
>>>      NF_CT_EXT_ACCT,
>>> +    NF_CT_EXT_ECACHE,
>>>      NF_CT_EXT_NUM,
>>
>> Quoting nf_conntrack_extend.c:
>>
>> /* This assumes that extended areas in conntrack for the types
>>    whose NF_CT_EXT_F_PREALLOC bit set are allocated in order */
>>
>> Is that actually the case here? It might be beneficial to move
>> this before accounting if possible; I guess it's used more often.
>
> I think that accounting information is updated more often. Events are 
> only updated for very few packets, specifically the setup and 
> tear-down packets of a flow.

No, events are only seldom sent to userspace. But TCP conntrack, for
instance, generates at least one event per packet.

But what I actually meant was that it's used more often, I think.
Never mind; also forget about the PREALLOC question, I should
have read what I pasted :) Of course you could add the PREALLOC
flag; when events are enabled you add the extension for every
conntrack anyway.
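
A sketch of what registering the event cache extension with the
PREALLOC flag might look like, based on how the existing extensions
register (field values illustrative):

    static struct nf_ct_ext_type event_extend __read_mostly = {
            .len    = sizeof(struct nf_conntrack_ecache),
            .align  = __alignof__(struct nf_conntrack_ecache),
            .id     = NF_CT_EXT_ECACHE,
            /* Illustrative: PREALLOC reserves the space up front
             * when the conntrack is allocated, instead of growing
             * ct->ext with a krealloc() on first use. */
            .flags  = NF_CT_EXT_F_PREALLOC,
    };

    /* registration, e.g. from the subsystem init function:
     * err = nf_ct_extend_register(&event_extend); */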

>>> @@ -738,6 +739,9 @@ nf_conntrack_in(struct net *net, u_int8_t pf, unsigned int hooknum,
>>>  
>>>      NF_CT_ASSERT(skb->nfct);
>>>  
>>> +    /* We may have pending events, deliver them and clear the cache */
>>> +    nf_ct_deliver_cached_events(ct);
>>
>> How does this guarantee that an event will be delivered in time?
>> As far as I can see, it might never be delivered, or at least not
>> until a timeout occurs.
>
> Yes, that's the idea. In short, if we have cached events, we trigger a 
> delivery via ctnetlink. If the delivery fails, we keep the cached 
> events and the next packet will trigger a new delivery. We can lose 
> events, but in the worst case the destroy event will be delivered.

OK, so this is essentially replacing the delivery we did previously when
beginning to cache events for a different conntrack. That's fine and
necessary in case a packet triggering an event didn't make it past
POST_ROUTING.

>
> If we add a new conntrack extension to store the creation and 
> destruction time of the conntrack entries, we can have reliable 
> flow-accounting since, at least, the destroy event will be delivered. 
> In the case of state synchronization, we aim to ensure that 
> long-standing flows survive failures; thus, under event loss, the 
> backup nodes would get the state after some retries, especially for 
> long-standing flows, since every packet would trigger another delivery 
> in case of problems.

*conntrackd* aims at that, I think :) Anyway, on second (or third)
thought, I agree that this is fine; the previous guarantees weren't any
stronger.

>
> BTW, I have removed that line locally. So the next delivery try for 
> pending events is done only in nf_conntrack_confirm(), not twice, 
> once in nf_conntrack_in() and once in nf_conntrack_confirm() as it 
> happens in this patch.

That means we're only delivering events if a packet actually made
it through. I guess this is fine too; ideally we wouldn't even have
state transitions.
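
In pseudocode, the retry semantics being discussed would look roughly
like this (a sketch of the intent only; notify_listeners() is a
hypothetical stand-in for the ctnetlink delivery path):

    static void deliver_cached_events_sketch(struct nf_conn *ct)
    {
            struct nf_conntrack_ecache *e = nf_ct_ecache_find(ct);

            if (e == NULL || e->cache == 0)
                    return;

            /* Hypothetical helper: returns 0 if every listener
             * received the netlink message, a negative errno
             * otherwise. */
            if (notify_listeners(ct, e->cache) == 0)
                    e->cache = 0;   /* delivered: clear the cache
                                     * (simplified, the real code
                                     * would need an atomic clear) */
            /* On failure the bits stay set, so the next packet of
             * the flow retries; the destroy event is still
             * delivered at teardown in any case. */
    }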

>>> @@ -1123,6 +1123,8 @@ ctnetlink_change_conntrack(struct nf_conn *ct, struct nlattr *cda[])
>>>          err = ctnetlink_change_helper(ct, cda);
>>>          if (err < 0)
>>>              return err;
>>> +
>>> +        nf_conntrack_event_cache(IPCT_HELPER, ct);
>>
>> Why are we suddenly caching a lot more events manually?
>
> Currently, in user-space triggered events, we are including in the 
> event message some fields that may not have been updated. Now we can 
> provide more accurate events by notifying only the conntrack object 
> fields that have been updated.
>
The patch is already pretty large; please separate that part out if it
doesn't have to be in this patch to make it work.
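
For clarity, the pattern being added in ctnetlink is one cache call per
attribute actually changed, so the event carries exactly the updated
fields; schematically (context reconstructed around the quoted hunk,
not verbatim from the patch):

    if (cda[CTA_HELP]) {
            err = ctnetlink_change_helper(ct, cda);
            if (err < 0)
                    return err;

            /* record that the helper changed, so only this field
             * is flagged in the event message */
            nf_conntrack_event_cache(IPCT_HELPER, ct);
    }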


