From: Dmitry Mishin <dim@openvz.org>
To: "YOSHIFUJI Hideaki / 吉藤英明" <yoshfuji@linux-ipv6.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
netdev@vger.kernel.org, containers@lists.osdl.org, alexey@sw.ru,
saw@sw.ru
Subject: Re: [PATCH 0/12] L2 network namespace (v3)
Date: Fri, 19 Jan 2007 12:35:11 +0300 [thread overview]
Message-ID: <200701191235.11646.dim@openvz.org> (raw)
In-Reply-To: <m1ps9b5vdp.fsf@ebiederm.dsl.xmission.com>
On Friday 19 January 2007 10:27, Eric W. Biederman wrote:
> YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> writes:
>
> > In article <200701171851.14734.dim@openvz.org> (at Wed, 17 Jan 2007 18:51:14
> > +0300), Dmitry Mishin <dim@openvz.org> says:
> >
> >> ===================================
> >> L2 network namespaces
> >>
> >> The most straightforward concept of network virtualization is complete
> >> separation of namespaces, covering device list, routing tables, netfilter
> >> tables, socket hashes, and everything else.
> >>
> >> On input path, each packet is tagged with namespace right from the
> >> place where it appears from a device, and is processed by each layer
> >> in the context of this namespace.
> >> Non-root namespaces communicate with the outside world in two ways: by
> >> owning hardware devices, or receiving packets forwarded them by their parent
> >> namespace via pass-through device.
> >
> > Can you handle multicast / broadcast and IPv6, which are very important?
>
> The basic idea here is very simple.
>
> Each network namespace appears to user space as a separate network stack,
> with it's own set of routing tables etc.
>
> All sockets and all network devices (the sources of packets) belong
> to exactly one network namespace.
>
> >From the socket or the network device a packet enters the network stack
> you can infer the network namespace that it will be processed in.
> Each network namespace should get it own complement of the data structures
> necessary to process packets, and everything should work.
>
> Talking between namespaces is accomplished either through an external network,
> or through a special pseudo network device. The simplest to implement
> is two network devices where all packets transmitted on one are received
> on the other. Then by placing one network device in one namespace and
> the other in another interface it looks like two machines connected by
> a cross over cable.
>
> Once you have that in a one namespace you can connect other namespaces
> with the existing ethernet bridging or by configuring one of the
> namespaces as a router and routing traffic between them.
>
>
> Supporting IPv6 is roughly as difficult as supporting IPv4.
>
> What needs to happen to convert code is all variables either need
> a per network namespace instance or the data structures needs to be
> modified to have a network namespace tag. For hash tables which
> are hard to allocate dynamically tagging is the preferred conversion
> method, for anything that is small enough duplication is preferred
> as it allows the existing logic to be kept.
>
> In the fast path the impact of all of the conversions should be very light,
> to non-existent. In network stack initialization and cleanup there
> is work todo because you are initializing and cleanup variables more often
> then at module insertion and removal.
>
> So my expectation is that once we get a framework established and merged
> to allow network namespaces eventually the entire network stack will be
> converted. Not just ipv4 and ipv6 but decnet, ipx, iptables, fair scheduling,
> ethernet bridging and all of the other weird and twisty bits of the
> linux network stack.
Thanks Eric for such descriptive comment. I can only sign off on it :)
>
> The primary practical hurdle is there is a lot of networking code in
> the kernel.
>
> I think I know a path by which we can incrementally merge support for
> network namespaces without breaking anything. More to come on this
> when I finish up my demonstration patchset in a week or so that
> is complete enough to show what I am talking about.
>
> I hope this helps but the concept into perspective.
I'll be waiting it.
>
> As for Dmitry's patchset in particular it currently does not support
> IPv6 and I don't know where it is with respect to the broadcast and
> multicast but I don't see any immediate problems that would preclude
> those from working. But any incompleteness is exactly that
> incompleteness and an implementation problem not a fundamental design
> issue.
Broadcasts/multicasts are supported.
--
Thanks,
Dmitry.
prev parent reply other threads:[~2007-01-19 10:13 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-01-17 15:51 [PATCH 0/12] L2 network namespace (v3) Dmitry Mishin
2007-01-17 15:57 ` [PATCH 1/12] L2 network namespace (v3): current network namespace operations Dmitry Mishin
2007-01-17 20:16 ` Eric W. Biederman
2007-01-18 10:56 ` Dmitry Mishin
2007-01-18 13:37 ` Eric W. Biederman
2007-01-25 7:58 ` Eric W. Biederman
2007-01-17 15:58 ` [PATCH 0/12] L2 network namespace (v3) Cedric Le Goater
2007-01-17 15:59 ` [PATCH 2/12] L2 network namespace (v3): network devices virtualization Dmitry Mishin
2007-01-17 16:00 ` [PATCH 3/12] L2 network namespace (v3): loopback device virtualization Dmitry Mishin
2007-01-17 16:01 ` [PATCH 4/12] L2 network namespace (v3): devinet sysctl's checks Dmitry Mishin
2007-01-17 16:03 ` [PATCH 5/12] L2 network namespace (v3): IPv4 routing Dmitry Mishin
2007-01-17 16:05 ` [PATCH 6/12] L2 network namespace (v3): socket hashes Dmitry Mishin
2007-01-17 16:10 ` [PATCH 0/12] L2 network namespace (v3) Daniel Lezcano
2007-01-17 16:10 ` [PATCH 7/12] allow proc_dir_entries to have destructor Dmitry Mishin
2007-01-17 16:11 ` [PATCH 8/12] net_device seq_file Dmitry Mishin
2007-01-17 20:36 ` Stephen Hemminger
2007-01-18 17:07 ` Eric W. Biederman
2007-01-17 16:14 ` [PATCH 9/12] L2 network namespace (v3): device to pass packets between namespaces Dmitry Mishin
2007-01-17 16:15 ` [PATCH 10/12] L2 network namespace (v3): playing with pass-through device Dmitry Mishin
2007-01-17 16:16 ` [PATCH 11/12] L2 network namespace (v3): sockets proc view virtualization Dmitry Mishin
2007-01-17 16:18 ` [PATCH 12/12] L2 network namespace (v3): L3 network namespace intro Dmitry Mishin
2007-01-19 0:07 ` [PATCH 0/12] L2 network namespace (v3) YOSHIFUJI Hideaki / 吉藤英明
2007-01-19 7:27 ` Eric W. Biederman
2007-01-19 9:35 ` Dmitry Mishin [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200701191235.11646.dim@openvz.org \
--to=dim@openvz.org \
--cc=alexey@sw.ru \
--cc=containers@lists.osdl.org \
--cc=ebiederm@xmission.com \
--cc=netdev@vger.kernel.org \
--cc=saw@sw.ru \
--cc=yoshfuji@linux-ipv6.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).