From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Hans Schillstrom <hans.schillstrom@ericsson.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>,
"lvs-devel@vger.kernel.org" <lvs-devel@vger.kernel.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"netfilter-devel@vger.kernel.org"
<netfilter-devel@vger.kernel.org>,
"horms@verge.net.au" <horms@verge.net.au>,
"ja@ssi.bg" <ja@ssi.bg>,
"wensong@linux-vs.org" <wensong@linux-vs.org>
Subject: Re: [RFC PATCH 1/9] ipvs network name space aware
Date: Tue, 19 Oct 2010 11:44:36 -0700
Message-ID: <20101019184436.GG2362@linux.vnet.ibm.com>
In-Reply-To: <201010181523.49568.hans.schillstrom@ericsson.com>
On Mon, Oct 18, 2010 at 03:23:48PM +0200, Hans Schillstrom wrote:
> On Monday 18 October 2010 13:37:38 Daniel Lezcano wrote:
> > On 10/18/2010 11:54 AM, Hans Schillstrom wrote:
> > > On Monday 18 October 2010 10:59:25 Daniel Lezcano wrote:
> > >
> > >> On 10/08/2010 01:16 PM, Hans Schillstrom wrote:
> > >>
> > >>> This part contains the include files
> > >>> where include/net/netns/ip_vs.h is new and contains all moved vars.
> > >>>
> > >>> SUMMARY
> > >>>
> > >>> include/net/ip_vs.h | 136 ++++---
> > >>> include/net/net_namespace.h | 2 +
> > >>> include/net/netns/ip_vs.h | 112 +++++
> > >>>
> > >>> Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
> > >>> ---
> > >>>
> > >>>
> > >>>
> > >> [ ... ]
> > >>
> > >>
> > >>> #ifdef CONFIG_IP_VS_IPV6
> > >>> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
> > >>> index bd10a79..b59cdc5 100644
> > >>> --- a/include/net/net_namespace.h
> > >>> +++ b/include/net/net_namespace.h
> > >>> @@ -15,6 +15,7 @@
> > >>> #include <net/netns/ipv4.h>
> > >>> #include <net/netns/ipv6.h>
> > >>> #include <net/netns/dccp.h>
> > >>> +#include <net/netns/ip_vs.h>
> > >>> #include <net/netns/x_tables.h>
> > >>> #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
> > >>> #include <net/netns/conntrack.h>
> > >>> @@ -91,6 +92,7 @@ struct net {
> > >>> struct sk_buff_head wext_nlevents;
> > >>> #endif
> > >>> struct net_generic *gen;
> > >>> + struct netns_ipvs *ipvs;
> > >>> };
> > >>>
> > >>>
> > >> IMHO, it would be better to use the net_generic infrastructure instead
> > >> of adding a new field in the netns structure.
> > >>
> > >>
> > >>
> > > I realized that too, but the performance penalty is quite high with net_generic :-(
> > > On the other hand, if you are going to backport it (without recompiling the kernel),
> > > you are going to need it!
> > >
> >
> > Hmm, yes. We don't want init_net_ns performance to be impacted.
> >
> > Here you use a pointer which will be dereferenced like the net_generic,
> > so I don't think there will be a big difference between using
> > net_generic and using a pointer in the net namespace structure.
> >
> > The difference is the id usage, but this one is based on the idr which
> > is quite fast.
> >
>
> I'm not so sure about that, have a look at net_generic and rcu_read_lock
> and compare
> ipvs = net->ipvs;
> vs.
> ipvs = net_generic(net, id)
>
> static inline void *net_generic(struct net *net, int id)
> {
> struct net_generic *ng;
> void *ptr;
>
> rcu_read_lock();
> ng = rcu_dereference(net->gen);
> BUG_ON(id == 0 || id > ng->len);
> ptr = ng->ptr[id - 1];
> rcu_read_unlock();
>
> return ptr;
> }
> ...
> static inline void rcu_read_lock(void)
> {
> __rcu_read_lock();
> __acquire(RCU);
> rcu_read_acquire();
> }
>
> Another way of doing it is to pass the ipvs ptr instead of the net ptr,
> and add *net to the ipvs struct.
>
> > We should experiment a bit here to compare both solutions.
> Agreed
> >
> I single-stepped through rcu_read_lock() on an x86_64,
> and it takes quite a few "stepi" commands to get through it :-(
Was this by chance with lockdep enabled? If not, could you please send
your .config?
Thanx, Paul
> > IMHO, we can (1) create a non-pointer netns_ipvs field in the net
> > namespace structure or (2) use a pointer [with net_generic].
> >
> > (1) is faster, but with the drawback of a bigger memory
> > footprint even if the ipvs module is not loaded.
> > In this case we have to take care of what we store in the netns_ipvs
> > structure, that is, reduce the per-namespace tables and so on. At first
> > glance, I think we can reduce this to the sysctls and a very small amount
> > of data, for example by using global tables tagged with the namespace, and
> > we don't break the cacheline alignment optimization.
> >
> > (2) is slower, but the memory is allocated and freed when the module
> > is loaded/unloaded. What I don't like about this approach is that we add
> > some overhead even if netns is not compiled into the kernel.
> >
> or (3)
> Like (1) with data that needs to be cache aligned in "struct net"
> and the rest in an ipvs struct.
> Global hash tables or not ?
>
> > > My suggestion: take both, with a configuration switch like:
> > > (i.e. avoid the rcu locking)
> > >
> > > --- net/ip_vs.h ---
> > > ...
> > > extern int ip_vs_net_id; /* net id for ip_vs */
> > >
> > >
> > > static inline struct netns_ipvs * net_ipvs(struct net* net, int id) {
> > > #ifdef CONFIG_IP_VS_FAST_NETNS
> > > return net->ipvs;
> > > #else
> > > return (struct netns_ipvs *)net_generic(net, id);
> > > #endif
> > > }
> > > ...
> > >
> > > and where you need the netns_ipvs struct ptr,
> > > [snip]
> > > struct ip_vs_conn *ip_vs_conn_in_get(struct net *net, ....
> > > {
> > > struct netns_ipvs *ipvs = net_ipvs(net, ip_vs_net_id);
> > > ...
> > >
> >
> > It is a nice way to wrap both solutions, but at this point I don't think
> > it is worth adding a third option to compile ipvs.
> >
> >
>
> --
> Regards
> Hans Schillstrom <hans.schillstrom@ericsson.com>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
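
The two access paths compared in this thread, net->ipvs versus
net_generic(net, id), can be illustrated with a small user-space sketch.
This is not kernel code: the structures are cut-down stand-ins, the RCU
read-side calls are left out entirely (so it says nothing about their
cost), and the names net_ipvs_direct(), net_generic_sketch(),
IP_VS_FAST_NETNS, sysctl_example and demo_lookup_paths_agree() are
hypothetical, chosen only for illustration. It shows just the difference
in the number of dependent loads: one for the dedicated pointer, versus
loading the slot table, checking the id and indexing for the generic
lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the kernel structures. */
struct netns_ipvs {
	int sysctl_example;		/* placeholder for per-netns ipvs data */
};

struct net_generic {
	unsigned int len;		/* number of valid slots in ptr[] */
	void *ptr[4];			/* per-module data, indexed by id - 1 */
};

struct net {
	struct net_generic *gen;	/* generic slots, looked up by id */
	struct netns_ipvs *ipvs;	/* dedicated pointer, as in the RFC patch */
};

/* Direct access: a single load from struct net. */
static inline struct netns_ipvs *net_ipvs_direct(const struct net *net)
{
	return net->ipvs;
}

/*
 * Generic access: load the slot table, check the id, index into it.
 * The rcu_read_lock()/rcu_dereference() steps of the real helper are
 * deliberately omitted; assert() stands in for BUG_ON().
 */
static inline void *net_generic_sketch(const struct net *net, unsigned int id)
{
	const struct net_generic *ng = net->gen;

	assert(id != 0 && id <= ng->len);
	return ng->ptr[id - 1];
}

/* Compile-time switch between the two, in the spirit of the wrapper above. */
static inline struct netns_ipvs *net_ipvs(const struct net *net, unsigned int id)
{
#ifdef IP_VS_FAST_NETNS
	(void)id;
	return net_ipvs_direct(net);
#else
	return net_generic_sketch(net, id);
#endif
}

/* Returns 1 if both paths resolve to the same per-namespace object. */
static int demo_lookup_paths_agree(void)
{
	struct netns_ipvs ipvs = { .sysctl_example = 1 };
	struct net_generic gen = { .len = 1, .ptr = { &ipvs } };
	struct net net = { .gen = &gen, .ipvs = &ipvs };
	const unsigned int ipvs_net_id = 1;	/* assumed id for this sketch */

	return net_ipvs_direct(&net) == net_generic_sketch(&net, ipvs_net_id) &&
	       net_ipvs(&net, ipvs_net_id) == &ipvs;
}
```

Calling demo_lookup_paths_agree() from a main() confirms that both paths
return the same object; building with -DIP_VS_FAST_NETNS selects the
direct pointer in net_ipvs(). Any real comparison would still need
measurements on the actual kernel helpers with a known .config.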