From mboxrd@z Thu Jan  1 00:00:00 1970
From: ebiederm@xmission.com (Eric W. Biederman)
Subject: Re: [PATCH 2/6] C/R: Basic support for network namespaces and devices (v5)
Date: Mon, 08 Mar 2010 09:53:20 -0800
Message-ID: <m1hboqu2cv.fsf@fess.ebiederm.org>
References: <1267130595-23637-1-git-send-email-danms@us.ibm.com>
	<1267130595-23637-3-git-send-email-danms@us.ibm.com>
	<4B91D1A3.9030404@cs.columbia.edu>
	<87bpf1idic.fsf@caffeine.danplanet.com>
	<4B92D574.1090006@cs.columbia.edu>
	<871vfuiul6.fsf@caffeine.danplanet.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path: <netdev-owner@vger.kernel.org>
In-Reply-To: <871vfuiul6.fsf@caffeine.danplanet.com> (Dan Smith's message of "Mon\, 08 Mar 2010 09\:36\:37 -0800")
Sender: netdev-owner@vger.kernel.org
To: Dan Smith <danms@us.ibm.com>
Cc: Oren Laadan <orenl@cs.columbia.edu>, containers@lists.osdl.org, den@openvz.org, netdev@vger.kernel.org, davem@davemloft.net, benjamin.thery@bull.net
List-Id: containers.vger.kernel.org

Dan Smith <danms@us.ibm.com> writes:

> OL> I'm confused: in checkpoint_ns() inside the for_each_netdev() loop
> OL> you first test for dev->netdev_ops->ndo_checkpoint and then call
> OL> checkpoint_obj(... CKPT_OBJ_NETDEV) - which in turn will call
> OL> checkpoint_netdev(), which will again test for
> dev-> netdev_ops->ndo_checkpoint ...  am I reading it wrongly ?
>
> In the case of veth, yes.  It goes something like this:
>
> checkpoint_netns() {
>   foreach netdev in netns {
>     checkpoint_netdev {
>       if netdev is veth {
>         checkpoint_peer(); // Will call checkpoint_netdev again
>       }
>     }
>   }
> }
>
> It shouldn't happen, but it seems like since we could potentially add
> another checkpoint_obj(mydev) somewhere other than in
> checkpoint_netdev(), it is reasonable to check that there is actually
> something to call before we call it.
>
> Would you prefer a BUG()?
>
> OL> How about this - to me it feels simpler:
>
> OL> 	dev = rtnl_newlink(veth_new_link_msg, &veth, this_name);
> OL> 	if (IS_ERR(dev))
> OL> 		return dev;
>
> OL> 	peer = dev_get_by_name(current->nsproxy->net_ns, peer_name);
> OL> 	if (!peer) {
> OL> 		ret = -EINVAL;
> OL> 		goto err_dev;
> OL> 	}
> OL> 	ret = ckpt_obj_insert(ctx, peer, h->veth.peer_ref,
> OL> 			      CKPT_OBJ_NETDEV);
> OL> 	if (ret < 0)
> OL> 		goto err_peer;
>
> OL> 	dev_put(peer);
>
> OL> 	dq.dev = dev;
> OL> 	dq.peer = peer;
> OL> 	ret = deferqueue_add(ctx->deferqueue, &dq, sizeof(dq),
> OL> 			     netdev_noop, netdev_cleanup);
> OL> 	if (ret)
> OL> 		goto err_peer;
>
> If you fail here you need to unregister_netdev() because the dev_put()
> that the objhash will not cause it to happen.  Unless we add something
> to allow you to remove your object from the hash, you can't prevent
> that final put, so you have to have it in the deferqueue for
> later.  You can't check the refcount in the objhash function because it
> will differ depending on the number of addresses and protocols the
> device has, and those don't get released until unregister_netdev()
> which will block if you call it before you've released all of your
> references.  If the objhash put function could examine ctx->errno,
> then it could drop its reference and then call unregister_netdev(),
> but that would involve changing all the drop functions.  What am I
> missing?

Can we take advantage of the fact that when you destroy a network
namespace the virtual devices in that network namespace are also destroyed?

Eric