From: ebiederm@xmission.com (Eric W. Biederman)
To: hadi@cyberus.ca
Cc: Daniel Lezcano <dlezcano@fr.ibm.com>,
devel@openvz.org, Linux Containers <containers@lists.osdl.org>,
Stephen Hemminger <shemminger@osdl.org>,
netdev@vger.kernel.org
Subject: Re: Network virtualization/isolation
Date: Mon, 04 Dec 2006 05:15:00 -0700 [thread overview]
Message-ID: <m1slfvuabf.fsf@ebiederm.dsl.xmission.com> (raw)
In-Reply-To: <1165155198.3517.86.camel@localhost> (hadi@cyberus.ca's message of "Sun, 03 Dec 2006 09:13:18 -0500")
jamal <hadi@cyberus.ca> writes:
> I have removed the Re: just to add some freshness to the discussion
>
> So i read quickly the rest of the discussions. I was almost suprised to
> find that i agree with Eric on a lot of opinions (we also agree that
> vindaloo is good for you i guess);->
> The two issues that stood out for me (in addition to what i already said
> below):
>
> 1) the solution must ease the migration of containers; i didnt see
> anything about migrating them to another host across a network, but i
> assume that this is a given.
It is mostly a given. It is a goal for some of us and not for others.
Containers are a necessary first step to getting migration and checkpoint/restart
assistance from the kernel.
> 2) the socket level bind/accept filtering with multiple IPs. From
> reading what Herbert has, it seems they have figured a clever way to
> optimize this path albeit some challenges (speacial casing for raw
> filters) etc.
>
> I am wondering if one was to use the two level muxing of the socket
> layer, how much more performance improvement the above scheme provides
> for #2?
I don't follow this question.
> Consider the case of L2 where by the time the packet hits the socket
> layer on incoming, the VE is already known; in such a case, the lookup
> would be very cheap. The advantage being you get rid of the speacial
> casing altogether. I dont see any issues with binds per multiple IPs etc
> using such a technique.
>
> For the case of #1 above, wouldnt it be also easier if the tables for
> netdevices, PIDs etc were per VE (using the 2 level mux)?
Generally yes. s/VE/namespace/. There is a case with hash tables where
it seems saner to add an additional entry because hash it is hard to dynamically
allocate a hash table, (because they need something large then a
single page allocation). But for everything else yes it makes things
much easier if you have a per namespace data structure.
A practical question is can we replace hash tables with some variant of
trie or radix-tree and not take a performance hit. Given the better scaling of
tress to different workload sizes if we can use them so much the
better. Especially because a per namespace split gives us a lot of
good properties.
> In any case, folks, i hope i am not treading on anyones toes; i know
> each one of you has implemented and has users and i am trying to be as
> neutral as i can (but clearly biased;->).
Well we rather expect to bash heads until we can come up with something
we all can agree on with the people who more regularly have to maintain
the code. The discussions so far have largely been warm ups, to actually
doing something.
Getting feedback from people who regularly work with the networking stack
is appreciated.
Eric
next prev parent reply other threads:[~2006-12-04 12:16 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-10-25 15:51 Network virtualization/isolation Daniel Lezcano
2006-10-23 20:01 ` Stephen Hemminger
2006-10-26 9:44 ` Daniel Lezcano
2006-10-26 15:56 ` Stephen Hemminger
2006-10-26 22:16 ` Daniel Lezcano
2006-10-27 7:34 ` Dmitry Mishin
2006-10-27 9:10 ` Daniel Lezcano
2006-11-01 14:35 ` jamal
2006-11-01 16:13 ` Daniel Lezcano
2006-11-14 15:17 ` Daniel Lezcano
2006-11-14 18:12 ` James Morris
2006-11-15 9:56 ` Daniel Lezcano
2006-11-22 12:00 ` Daniel Lezcano
2006-11-25 9:09 ` Eric W. Biederman
2006-11-28 14:15 ` Daniel Lezcano
2006-11-28 16:51 ` Eric W. Biederman
2006-11-28 17:37 ` Herbert Poetzl
2006-11-28 20:26 ` Daniel Lezcano
2006-11-28 21:50 ` Eric W. Biederman
2006-11-29 5:54 ` Herbert Poetzl
2006-11-29 20:21 ` Brian Haley
2006-11-29 22:10 ` [Devel] " Daniel Lezcano
2006-11-30 16:15 ` Vlad Yasevich
2006-11-30 16:38 ` Daniel Lezcano
2006-11-30 17:24 ` Herbert Poetzl
2006-12-03 12:26 ` jamal
2006-12-03 14:13 ` jamal
2006-12-03 16:00 ` Eric W. Biederman
2006-12-04 15:19 ` Dmitry Mishin
2006-12-04 15:45 ` Eric W. Biederman
2006-12-04 16:43 ` Herbert Poetzl
2006-12-04 16:58 ` Eric W. Biederman
2006-12-04 17:02 ` Dmitry Mishin
2006-12-04 17:19 ` Herbert Poetzl
2006-12-04 17:41 ` Daniel Lezcano
2006-12-04 12:15 ` Eric W. Biederman [this message]
2006-12-04 13:44 ` jamal
2006-12-04 15:35 ` Eric W. Biederman
2006-12-04 16:00 ` Dmitry Mishin
2006-12-04 16:52 ` Eric W. Biederman
2006-12-06 11:54 ` [Devel] " Kirill Korotaev
2006-12-06 18:30 ` Herbert Poetzl
2006-12-08 19:57 ` Eric W. Biederman
2006-12-09 3:50 ` Herbert Poetzl
2006-12-09 6:13 ` Andrew Morton
2006-12-09 6:35 ` Herbert Poetzl
2006-12-09 21:18 ` Dmitry Mishin
2006-12-09 22:34 ` Kir Kolyshkin
2006-12-10 2:21 ` Herbert Poetzl
2006-12-09 8:07 ` Eric W. Biederman
2006-12-09 11:27 ` Tomasz Torcz
2006-12-09 19:04 ` Herbert Poetzl
2006-12-03 16:37 ` Herbert Poetzl
2006-12-03 16:58 ` jamal
2006-12-04 10:18 ` Daniel Lezcano
2006-12-04 13:22 ` jamal
2006-12-02 11:29 ` Kari Hurtta
2006-12-02 11:49 ` Kari Hurtta
2006-11-29 5:58 ` Herbert Poetzl
2006-11-25 8:21 ` Eric W. Biederman
2006-11-26 18:34 ` Herbert Poetzl
2006-11-26 19:41 ` Ben Greear
2006-11-26 20:52 ` Eric W. Biederman
2006-11-25 8:27 ` Eric W. Biederman
-- strict thread matches above, loose matches on Subject: below --
2006-11-25 16:35 Leonid Grossman
2006-11-25 19:26 ` Eric W. Biederman
2006-11-25 22:17 Leonid Grossman
2006-11-25 23:16 ` Eric W. Biederman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=m1slfvuabf.fsf@ebiederm.dsl.xmission.com \
--to=ebiederm@xmission.com \
--cc=containers@lists.osdl.org \
--cc=devel@openvz.org \
--cc=dlezcano@fr.ibm.com \
--cc=hadi@cyberus.ca \
--cc=netdev@vger.kernel.org \
--cc=shemminger@osdl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).