netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Dumazet <dada1@cosmosbay.com>
To: Patrick McHardy <kaber@trash.net>
Cc: Netfilter Developers <netfilter-devel@vger.kernel.org>,
	Linux Netdev List <netdev@vger.kernel.org>
Subject: Re: [PATCH] conntrack: Reduce conntrack count in nf_conntrack_free()
Date: Wed, 25 Mar 2009 14:44:34 +0100	[thread overview]
Message-ID: <49CA3542.5020601@cosmosbay.com> (raw)
In-Reply-To: <49CA3404.3090609@trash.net>

Patrick McHardy a écrit :
> Eric Dumazet wrote:
>> I have a litle problem on __nf_conntrack_find() being exported.
>>
>> Problem is that with SLAB_DESTROY_BY_RCU we must take a reference on
>> object
>> to recheck it. So ideally  only nf_conntrack_find_get() should be used,
>> or callers of __nf_conntrack_find() should lock nf_conntrack_lock
>> (as properly done for example in net/netfilter/nf_conntrack_netlink.c,
>> line 1292)
>>
>> Here is preliminary patch for review (not tested at all, its 4h50 am
>> here :) )
>>
>> Could you help me, by checking __nf_conntrack_find() use in
>> net/netfilter/xt_connlimit.c ?
>> and line 1246 of net/netfilter/nf_conntrack_netlink.c
>>
>> This part is a litle bit gray for me. :)
> 
> In case of xt_connlimit, it seems fine to just take a reference.
> In case of ctnetlink, keeping the unreferenced lookup under the
> lock seems fine. We unfortunately have to export some internals
> like nf_conntrack lock for ctnetlink anyways, so I don't think
> it would be worth to change it to take references and unexport
> the lookup function.
> 
>> +/*
>> + * Warning :
>> + * - Caller must take a reference on returned object
>> + *   and recheck nf_ct_tuple_equal(tuple, &h->tuple)
>> + * OR
>> + * - Caller must lock nf_conntrack_lock before calling this function
>> + */
>>  struct nf_conntrack_tuple_hash *
>>  __nf_conntrack_find(struct net *net, const struct nf_conntrack_tuple
>> *tuple)
>>  {
>>      struct nf_conntrack_tuple_hash *h;
>> -    struct hlist_node *n;
>> +    struct hlist_nulls_node *n;
>>      unsigned int hash = hash_conntrack(tuple);
>>  
>>      /* Disable BHs the entire time since we normally need to disable
>> them
>>       * at least once for the stats anyway.
>>       */
>>      local_bh_disable();
>> -    hlist_for_each_entry_rcu(h, n, &net->ct.hash[hash], hnode) {
>> +begin:
>> +    hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[hash], hnnode) {
>>          if (nf_ct_tuple_equal(tuple, &h->tuple)) {
>>              NF_CT_STAT_INC(net, found);
>>              local_bh_enable();
>> @@ -261,6 +270,13 @@ __nf_conntrack_find(struct net *net, const struct
>> nf_conntrack_tuple *tuple)
>>          }
>>          NF_CT_STAT_INC(net, searched);
>>      }
>> +    /*
>> +     * if the nulls value we got at the end of this lookup is
>> +     * not the expected one, we must restart lookup.
>> +     * We probably met an item that was moved to another chain.
>> +     */
>> +    if (get_nulls_value(n) != hash)
>> +        goto begin;
> 
> Are you sure this is enough? An entry might have been reused and added
> to the same chain I think, so I think we need to recheck the tuple.

Yes, done in caller

> 
>>      local_bh_enable();
>>  
>>      return NULL;
>> @@ -275,11 +291,18 @@ nf_conntrack_find_get(struct net *net, const
>> struct nf_conntrack_tuple *tuple)
>>      struct nf_conn *ct;
>>  
>>      rcu_read_lock();
>> +begin:
>>      h = __nf_conntrack_find(net, tuple);
>>      if (h) {
>>          ct = nf_ct_tuplehash_to_ctrack(h);
>>          if (unlikely(!atomic_inc_not_zero(&ct->ct_general.use)))
>>              h = NULL;
>> +        else {
>> +            if (unlikely(!nf_ct_tuple_equal(tuple, &h->tuple))) {
>> +                nf_ct_put(ct);
>> +                goto begin;
> 
> Ah I see, the hash comparison above is only an optimization?
> 
>> +            }
>> +        }
>>      }
>>      rcu_read_unlock();
>>  
> 
> 

check net/ipv4/udp.c for an example (__udp4_lib_lookup())

In case of UDP, key check is not returning true/false, but a score.
So UDP case is a litle bit more complex than conntrack case.


        rcu_read_lock();
begin:
        result = NULL;
        badness = -1;
        sk_nulls_for_each_rcu(sk, node, &hslot->head) {
                score = compute_score(sk, net, saddr, hnum, sport,
                                      daddr, dport, dif);
                if (score > badness) {
                        result = sk;
                        badness = score;
                }
        }
        /*
         * if the nulls value we got at the end of this lookup is
         * not the expected one, we must restart lookup.
         * We probably met an item that was moved to another chain.
         */
        if (get_nulls_value(node) != hash)
                goto begin;

        if (result) {
                if (unlikely(!atomic_inc_not_zero(&result->sk_refcnt)))
                        result = NULL;
                else if (unlikely(compute_score(result, net, saddr, hnum, sport,
                                  daddr, dport, dif) < badness)) {
                        sock_put(result);
                        goto begin;
                }
        }
        rcu_read_unlock();
        return result;



  reply	other threads:[~2009-03-25 13:44 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-23 10:42 ucc_geth: nf_conntrack: table full, dropping packet Joakim Tjernlund
2009-03-23 12:15 ` Patrick McHardy
2009-03-23 12:25   ` Joakim Tjernlund
2009-03-23 12:29     ` Patrick McHardy
2009-03-23 12:59       ` Joakim Tjernlund
     [not found]       ` <OF387EC803.F810F72A-ONC1257582.00468C6E-C1257582.00475783@LocalDomain>
2009-03-23 13:09         ` Joakim Tjernlund
2009-03-23 17:42       ` Joakim Tjernlund
2009-03-23 17:49         ` Patrick McHardy
2009-03-24  8:22           ` Joakim Tjernlund
2009-03-24  9:12             ` Eric Dumazet
2009-03-24 10:55               ` Joakim Tjernlund
2009-03-24 12:07                 ` [PATCH] conntrack: Reduce conntrack count in nf_conntrack_free() Eric Dumazet
2009-03-24 12:25                   ` Eric Dumazet
2009-03-24 12:43                     ` Patrick McHardy
2009-03-24 13:32                       ` Eric Dumazet
2009-03-24 13:38                         ` Patrick McHardy
2009-03-24 13:47                           ` Eric Dumazet
     [not found]                             ` <49C8F871.9070600@cosmosbay.com>
     [not found]                               ` <49C8F8E0.9050502@trash.net>
2009-03-25  3:53                                 ` Eric Dumazet
2009-03-25 13:39                                   ` Patrick McHardy
2009-03-25 13:44                                     ` Eric Dumazet [this message]
2009-03-24 13:20                   ` Joakim Tjernlund
2009-03-24 13:28                     ` Patrick McHardy
2009-03-24 13:29                     ` Eric Dumazet
2009-03-24 13:41                       ` Joakim Tjernlund
2009-03-24 15:17                   ` Maxime Bizon
2009-03-24 15:21                     ` Patrick McHardy
2009-03-24 15:27                     ` Eric Dumazet
2009-03-24 19:54                       ` [PATCH] netfilter: Use hlist_add_head_rcu() in nf_conntrack_set_hashsize() Eric Dumazet
2009-03-25 16:26                         ` Patrick McHardy
2009-03-25 17:53                       ` [PATCH] conntrack: use SLAB_DESTROY_BY_RCU for nf_conn structs Eric Dumazet
2009-03-25 18:05                         ` Patrick McHardy
2009-03-25 18:06                           ` Patrick McHardy
2009-03-25 18:15                           ` Eric Dumazet
2009-03-25 18:24                             ` Patrick McHardy
2009-03-25 18:53                               ` Eric Dumazet
2009-03-25 19:00                                 ` Patrick McHardy
2009-03-25 19:17                                   ` Eric Dumazet
2009-03-25 19:41                                     ` Patrick McHardy
2009-03-25 19:58                                       ` Eric Dumazet
2009-03-25 20:10                                         ` Patrick McHardy
2009-03-24 18:29                     ` [PATCH] conntrack: Reduce conntrack count in nf_conntrack_free() Joakim Tjernlund
2009-03-23 17:49         ` ucc_geth: nf_conntrack: table full, dropping packet Eric Dumazet
2009-03-23 18:04           ` Joakim Tjernlund
2009-03-23 18:08             ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=49CA3542.5020601@cosmosbay.com \
    --to=dada1@cosmosbay.com \
    --cc=kaber@trash.net \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).