From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stephen Hemminger
Subject: Re: [PATCH] netfilter: finer grained nf_conn locking
Date: Mon, 30 Mar 2009 13:05:32 -0700
Message-ID: <20090330130532.1e433313@nehalam>
References: <20090218051906.174295181@vyatta.com> <20090218052747.679540125@vyatta.com> <499BDB5D.2050105@trash.net> <499C1894.7060400@cosmosbay.com> <49CE568A.9090104@cosmosbay.com> <20090328174835.0d0b63f8@nehalam> <49D1241B.6020504@cosmosbay.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Patrick McHardy , David Miller , Rick Jones , netdev@vger.kernel.org, netfilter-devel@vger.kernel.org
To: Eric Dumazet
Return-path:
Received: from mail.vyatta.com ([76.74.103.46]:44840 "EHLO mail.vyatta.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754574AbZC3UFk convert rfc822-to-8bit (ORCPT ); Mon, 30 Mar 2009 16:05:40 -0400
In-Reply-To: <49D1241B.6020504@cosmosbay.com>
Sender: netfilter-devel-owner@vger.kernel.org
List-ID:

On Mon, 30 Mar 2009 21:57:15 +0200
Eric Dumazet wrote:

> Stephen Hemminger wrote:
> > On Sat, 28 Mar 2009 17:55:38 +0100
> > Eric Dumazet wrote:
> >
> >> Eric Dumazet wrote:
> >>> Patrick McHardy wrote:
> >>>> Stephen Hemminger wrote:
> >>>>
> >>>>> @@ -50,6 +50,7 @@ struct ip_ct_tcp_state {
> >>>>>
> >>>>>  struct ip_ct_tcp
> >>>>>  {
> >>>>> +	spinlock_t lock;
> >>>>>  	struct ip_ct_tcp_state seen[2]; /* connection parameters per
> >>>>> direction */
> >>>>>  	u_int8_t state; /* state of the connection (enum
> >>>>> tcp_conntrack) */
> >>>>>  	/* For detecting stale connections */
> >>>> Eric already posted a patch to use an array of locks, which is
> >>>> a better approach IMO since it keeps the size of the conntrack
> >>>> entries down.
> >>> Yes, we probably can use an array for short lived lock sections.
> >
> > I am not a fan of the array of locks. Sizing it is awkward and
> > it is vulnerable to hash collisions.
> > Let's see if there is another
> > better way.
>
> On normal machines, (no debugging spinlocks), patch uses an embedded
> spinlock. We probably can use this even on 32bit kernels, considering
> previous patch removed the rcu_head (8 bytes on 32bit arches) from
> nf_conn :)
>
> if LOCKDEP is on, size of a spinlock is 64 bytes on x86_64.
> Adding a spinlock on each nf_conn would be too expensive. In this
> case, an array of spinlock is a good compromise, as done in
> IP route cache, tcp ehash, ...
>
> I agree sizing of this hash table is not pretty, and should be
> a generic kernel service (I wanted such service for futexes for example)
>

IMO having different locking based on lockdep and architecture is an
invitation to future obscure problems. Perhaps some other locking method
or shrinking ct entry would be better.
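
For readers following the thread, the "array of locks" idea under discussion can be sketched in userspace C roughly as below. This is an illustration only, not code from either patch: the names (`struct conn`, `ct_locks`, `ct_lock_index`, `CT_LOCK_COUNT`), the table size of 256, and the address-based hash are all assumptions, and a C11 `atomic_flag` spin loop stands in for the kernel's `spinlock_t`. It shows both sides of the argument: the entry carries no lock of its own, but two unrelated entries may hash to the same slot and contend.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table size; must be a power of two so masking works. */
#define CT_LOCK_COUNT 256u

/* Stand-in for a conntrack entry. Note that it embeds no lock. */
struct conn {
    uint8_t state;
};

/* Shared lock table, used by every entry. */
static atomic_flag ct_locks[CT_LOCK_COUNT];

/* Put every slot in the unlocked state; call once before use. */
static void ct_locks_init(void)
{
    for (size_t i = 0; i < CT_LOCK_COUNT; i++)
        atomic_flag_clear(&ct_locks[i]);
}

/*
 * Map an entry's address to a slot. The low bits are dropped because
 * alignment makes them carry little information. Unrelated entries can
 * still land on the same slot; that is the collision concern.
 */
static size_t ct_lock_index(const struct conn *ct)
{
    uintptr_t p = (uintptr_t)ct;
    return (size_t)((p >> 4) & (CT_LOCK_COUNT - 1u));
}

/* Spin until the slot for this entry is acquired. */
static void ct_lock(const struct conn *ct)
{
    atomic_flag *f = &ct_locks[ct_lock_index(ct)];
    while (atomic_flag_test_and_set_explicit(f, memory_order_acquire))
        ; /* busy-wait, as a spinlock would */
}

/* Release the slot for this entry. */
static void ct_unlock(const struct conn *ct)
{
    atomic_flag *f = &ct_locks[ct_lock_index(ct)];
    atomic_flag_clear_explicit(f, memory_order_release);
}

/* Example of a short critical section protected by the hashed lock. */
static void conn_set_state(struct conn *ct, uint8_t state)
{
    assert(ct != NULL);
    ct_lock(ct);
    ct->state = state;
    ct_unlock(ct);
}
```

A caller would run `ct_locks_init()` once and then use `conn_set_state(&c, new_state)`. The embedded-lock alternative would instead add a lock field to `struct conn`, trading per-entry size for the absence of collisions and of a table to size.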