From: Brian Haley <brian.haley-VXdhtT5mjnY@public.gmane.org>
To: Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Cc: containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org
Subject: Re: [PATCH 3/3] C/R: Basic support for network namespaces and devices
Date: Wed, 20 Jan 2010 17:21:19 -0500
Message-ID: <4B5781DF.6050106@hp.com>
In-Reply-To: <1263999673-11279-4-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Dan Smith wrote:
> When checkpointing a task tree with network namespaces, we hook into
> do_checkpoint_ns() along with the others. Any devices in a given namespace
> are checkpointed (including their peer, in the case of veth) sequentially.
> Each network device stores a list of protocol addresses, as well as other
> information, such as hardware address.
>
> This patch supports veth pairs, as well as the loopback adapter. The
> loopback support is there to make sure that any additional addresses and
> state (such as up/down) is copied to the loopback adapter that we are
> given in the new network namespace.
>
> On restart, we instantiate new network namespaces and veth pairs as
> necessary. Any device we encounter that isn't in a network namespace
> that was checkpointed as part of a task is left in the namespace of the
> restarting process. This will be the case for a veth half that exists
> in the init netns to provide network access to a container.
>
> Still to do are:
>
> 1. Routes
> 2. Netfilter rules
> 3. IPv6 addresses
> 4. Other virtual device types (e.g. bridges)

What about:

1. Multicast
2. Device config info (ipv4_devconf)

> +static int checkpoint_in_addrs(struct ckpt_ctx *ctx, struct in_device *indev)
> +{
> + struct ckpt_hdr_netdev_addr *h;
> + struct in_ifaddr *addr = indev->ifa_list;
> + int ret;
> + int count = 0;
> +
> + while (addr) {
> + h = ckpt_hdr_get_type(ctx, sizeof(*h), CKPT_HDR_NETDEV_ADDR);
> + if (!h)
> + return -ENOMEM;
> +
> + h->type = CKPT_NETDEV_ADDR_IPV4; /* Only IPv4 right now */
> +
> + h->inet4_local = addr->ifa_local;
> + h->inet4_address = addr->ifa_address;
> + h->inet4_mask = addr->ifa_mask;
> + h->inet4_broadcast = addr->ifa_broadcast;
What about addr->ifa_flags and the other fields, such as ifa_prefixlen, ifa_scope and ifa_label?
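
To make the question above concrete, here is a minimal userspace sketch of a record that also carries flags, prefix length, scope and label. The struct names, the inet4_* field names for the extra members, and save_in_addr() are hypothetical and are not taken from the patch. The stand-in for struct in_ifaddr is reduced to the fields under discussion.

```c
#include <stdint.h>
#include <string.h>

#define IFNAMSIZ 16 /* same value the kernel uses for interface name buffers */

/* Assumption: reduced stand-in for the kernel's struct in_ifaddr. */
struct in_ifaddr_sketch {
	uint32_t ifa_local;
	uint32_t ifa_address;
	uint32_t ifa_mask;
	uint32_t ifa_broadcast;
	uint8_t  ifa_scope;
	uint8_t  ifa_flags;
	uint8_t  ifa_prefixlen;
	char     ifa_label[IFNAMSIZ];
};

/* Hypothetical extended checkpoint record; not the layout used by the patch. */
struct ckpt_netdev_addr_sketch {
	uint32_t inet4_local;
	uint32_t inet4_address;
	uint32_t inet4_mask;
	uint32_t inet4_broadcast;
	uint8_t  inet4_scope;
	uint8_t  inet4_flags;
	uint8_t  inet4_prefixlen;
	char     inet4_label[IFNAMSIZ];
};

/* Copy every per-address field, not only the four address words. */
static void save_in_addr(struct ckpt_netdev_addr_sketch *h,
			 const struct in_ifaddr_sketch *addr)
{
	memset(h, 0, sizeof(*h));

	h->inet4_local     = addr->ifa_local;
	h->inet4_address   = addr->ifa_address;
	h->inet4_mask      = addr->ifa_mask;
	h->inet4_broadcast = addr->ifa_broadcast;

	h->inet4_scope     = addr->ifa_scope;
	h->inet4_flags     = addr->ifa_flags;
	h->inet4_prefixlen = addr->ifa_prefixlen;

	/* Fixed-size copy, then force termination in case the source was not. */
	memcpy(h->inet4_label, addr->ifa_label, IFNAMSIZ);
	h->inet4_label[IFNAMSIZ - 1] = '\0';
}
```

Restart would then need to set the same fields when it re-adds the address, otherwise secondary addresses and aliases such as "eth0:1" would presumably come back looking different.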
> +int checkpoint_netdev(struct ckpt_ctx *ctx, void *ptr)
> +{
> + struct ckpt_hdr_netdev *h;
> + struct net_device *dev = ptr;
> + struct net_device *peer = NULL;
> + struct net *net = dev->nd_net;
> + int ret = 0;
> + struct ifreq req;
> +
> + h = ckpt_hdr_get_type(ctx, sizeof(*h), CKPT_HDR_NETDEV);
> + if (!h)
> + return -ENOMEM;
> +
> + if (strcmp(dev->name, "lo") == 0)
> + h->type = CKPT_NETDEV_LO;
> + else {
> + h->type = CKPT_NETDEV_VETH;
> + peer = veth_get_peer(dev);
> + }
> +
> + memcpy(req.ifr_name, dev->name, IFNAMSIZ);
> + ret = __kern_dev_ioctl(net, SIOCGIFFLAGS, &req);
> + h->flags = req.ifr_flags;
> + if (ret < 0)
> + goto out;
> +
> + ret = __kern_dev_ioctl(net, SIOCGIFHWADDR, &req);
> + if (ret < 0)
> + goto out;
> + memcpy(h->hwaddr, req.ifr_hwaddr.sa_data, sizeof(h->hwaddr));
> +
> + h->netns_ref = ckpt_obj_lookup(ctx, net, CKPT_OBJ_NET_NS);
> + if (!h->netns_ref) {
> + ret = -EINVAL;
> + ckpt_err(ctx, ret, "Found netdev with no netns");
> + goto out;
> + }
> +
> + h->inet4_addrs = count_inet4_addrs(dev->ip_ptr);
> +
> + if (h->type == CKPT_NETDEV_VETH) {
> + ret = add_veth_refs(ctx, h, dev, peer);
> + if (ret < 0)
> + goto out;
> + }
> +
> + ret = ckpt_write_obj(ctx, (struct ckpt_hdr *) h);
> + if (ret < 0)
> + goto out;
> +
> + if (h->type == CKPT_NETDEV_VETH) {
> + ret = ckpt_write_buffer(ctx, dev->name, IFNAMSIZ);
> + if (ret < 0)
> + goto out;
> +
> + ret = ckpt_write_buffer(ctx, peer->name, IFNAMSIZ);
> + if (ret < 0)
> + goto out;
> + }
> +
> + ret = checkpoint_in_addrs(ctx, dev->ip_ptr);
> + if ((ret >= 0) && (ret != h->inet4_addrs)) {
> + ret = -EBUSY;
> + ckpt_err(ctx, ret,
> + "Addresses on interface %s changed\n", dev->name);
> + goto out;
> + }
This isn't guaranteed to catch every change to the address list; it only
checks that the number of addresses is the same. Is there no way to hold a
lock the whole time?
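
A small userspace sketch of the concern, with a pthread mutex standing in for whatever lock protects the address list. In the kernel I believe IPv4 address changes are serialized by the RTNL, so rtnl_lock() may be the candidate, but whether it can be held across the checkpoint writes is a separate question. struct addr_list, snapshot_addrs() and count_unchanged() are hypothetical names used only for illustration.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ADDRS 8

/* Assumption: simplified address list guarded by a single lock. */
struct addr_list {
	pthread_mutex_t lock;
	uint32_t addrs[MAX_ADDRS];
	size_t count;
};

/*
 * Copy the list while holding the lock, so the count and the contents
 * written to the checkpoint come from the same moment in time.
 * Returns the number of addresses copied.
 */
static size_t snapshot_addrs(struct addr_list *l, uint32_t *out, size_t max)
{
	size_t i, n;

	pthread_mutex_lock(&l->lock);
	n = l->count < max ? l->count : max;
	for (i = 0; i < n; i++)
		out[i] = l->addrs[i];
	pthread_mutex_unlock(&l->lock);

	return n;
}

/*
 * Count-only comparison, in the spirit of the quoted check. It reports
 * "unchanged" when one address was removed and another added in between.
 */
static int count_unchanged(size_t before, size_t after)
{
	return before == after;
}
```

If an address is replaced between the count and the write, count_unchanged() still returns true, while a snapshot taken under the lock stays internally consistent.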
-Brian