From: "Paul E. McKenney" <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: Linux Containers
<containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>,
netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Subject: Re: [PATCH 03/16] net: Basic network namespace infrastructure.
Date: Sun, 9 Sep 2007 09:45:47 -0700 [thread overview]
Message-ID: <20070909164547.GD10417@linux.vnet.ibm.com> (raw)
In-Reply-To: <m1fy1otarm.fsf-T1Yj925okcoyDheHMi7gv2pdwda3JcWeAL8bYrjMMd8@public.gmane.org>
On Sun, Sep 09, 2007 at 04:04:45AM -0600, Eric W. Biederman wrote:
> "Paul E. McKenney" <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> writes:
>
> > On Sat, Sep 08, 2007 at 03:15:34PM -0600, Eric W. Biederman wrote:
> >>
> >> This is the basic infrastructure needed to support network
> >> namespaces. This infrastructure is:
> >> - Registration functions to support initializing per network
> >> namespace data when a network namespaces is created or destroyed.
> >>
> >> - struct net. The network namespace data structure.
> >> This structure will grow as variables are made per network
> >> namespace but this is the minimal starting point.
> >>
> >> - Functions to grab a reference to the network namespace.
> >> I provide both get/put functions that keep a network namespace
> >> from being freed. And hold/release functions serve as weak references
> >> and will warn if their count is not zero when the data structure
> >> is freed. Useful for dealing with more complicated data structures
> >> like the ipv4 route cache.
> >>
> >> - A list of all of the network namespaces so we can iterate over them.
> >>
> >> - A slab for the network namespace data structure allowing leaks
> >> to be spotted.
> >
> > If I understand this correctly, the only way to get to a namespace is
> > via get_net_ns_by_pid(), which contains the rcu_read_lock() that matches
> > the rcu_barrier() below.
>
> Not quite. That is the convoluted case for getting a namespace someone
> else is using. current->nsproxy->net_ns works and should require no
> locking to read (only the current process may modify it) and does hold
> a reference to the network namespace. Similarly for sock->sk_net.
Ah! Got it, thank you for the explanation.
> > So, is the get_net() in sock_copy() in this patch adding a reference to
> > an element that is guaranteed to already have at least one reference?
>
> Yes.
>
> > If not, how are we preventing sock_copy() from running concurrently with
> > cleanup_net()? Ah, I see -- in sock_copy() we are getting a reference
> > to the new struct sock that no one else can get a reference to, so OK.
> > Ditto for the get_net() in sk_alloc().
>
> > But I still don't understand what is protecting the get_net() in
> > dev_seq_open(). Is there an existing reference?
>
> Sort of. The directories under /proc/net are created when create
> a network namespace and they are destroyed when the network namespace
> is removed. And those directories remember which network namespace
> they are for and that is what dev_seq_open is referencing.
>
> So the tricky case what happens if we open a directory under /proc/net
> as we are cleaning up a network namespace.
Yep! ;-)
> > If so, how do we know
> > that it won't be removed just as we are trying to add our reference
> > (while at the same time cleanup_net() is running)? Ditto for the other
> > _open() operations in the same patch. And for netlink_seq_open().
> >
> > Enlightenment?
>
> Good spotting. It looks like you have found a legitimate race. Grr.
> I thought I had a reference to the network namespace there. I need to
> step back and think about this a bit, and see if I can come up with a
> legitimate idiom.
>
> I know the network namespace exists and I have not finished
> cleanup_net because I can still get to the /proc entries.
OK. Hmmm... I need to go review locking for /proc...
> I know I cannot use get_net for the reference in in /proc because
> otherwise I could not release the network namespace unless I was to
> unmount the filesystem, which is not a desirable property.
>
> I think I can change the idiom to:
>
> struct net *maybe_get_net(struct net *net)
> {
> if (!atomic_inc_not_zero(&net->count))
> net = NULL;
> return net;
> }
>
> Which would make dev_seq_open be:
>
> static int dev_seq_open(struct inode *inode, struct file *file)
> {
> struct seq_file *seq;
> int res;
> res = seq_open(file, &dev_seq_ops);
> if (!res) {
> seq = file->private_data;
> seq->private = maybe_get_net(PROC_NET(inode));
> if (!seq->private) {
> res = -ENOENT;
> seq_release(inode, file);
> }
> }
> return res;
> }
>
> I'm still asking myself if I need any kind of locking to ensure
> struct net does not go away in the mean time, if so rcu_read_lock()
> should be sufficient.
Agreed -- and it might be possible to leverage the existing locking
in the /proc code.
Thanx, Paul
> I will read through the generic proc code very carefully after
> I have slept and see if there is what I the code above is sufficient,
> and if so update the patchset.
>
> Eric
next prev parent reply other threads:[~2007-09-09 16:45 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-09-08 21:07 [PATCH 00/16] core network namespace support Eric W. Biederman
2007-09-08 21:09 ` [PATCH 01/16] appletalk: In notifier handlers convert the void pointer to a netdevice Eric W. Biederman
2007-09-08 21:13 ` [PATCH 02/16] net: Don't implement dev_ifname32 inline Eric W. Biederman
2007-09-08 21:15 ` [PATCH 03/16] net: Basic network namespace infrastructure Eric W. Biederman
2007-09-08 21:17 ` [PATCH 04/16] net: Add a network namespace parameter to tasks Eric W. Biederman
2007-09-08 21:18 ` [PATCH 05/16] net: Add a network namespace tag to struct net_device Eric W. Biederman
2007-09-08 21:20 ` [PATCH 07/16] net: Make /proc/net per network namespace Eric W. Biederman
2007-09-08 21:23 ` [PATCH 08/16] net: Make socket creation namespace safe Eric W. Biederman
2007-09-08 21:24 ` [PATCH 09/16] net: Initialize the network namespace of network devices Eric W. Biederman
2007-09-08 21:25 ` [PATCH 10/16] net: Make packet reception network namespace safe Eric W. Biederman
2007-09-08 21:27 ` [PATCH 11/16] net: Make device event notification " Eric W. Biederman
2007-09-08 21:28 ` [PATCH 12/16] net: Support multiple network namespaces with netlink Eric W. Biederman
2007-09-08 21:35 ` [PATCH 13/16] net: Make the device list and device lookups per namespace Eric W. Biederman
2007-09-08 21:36 ` [PATCH 14/16] net: Factor out __dev_alloc_name from dev_alloc_name Eric W. Biederman
2007-09-08 21:38 ` [PATCH 15/16] net: Implement network device movement between namespaces Eric W. Biederman
2007-09-08 21:43 ` [PATCH 16/16] net: netlink support for moving devices between network namespaces Eric W. Biederman
2007-09-08 21:47 ` [PATCH 17/16] net: Disable netfilter sockopts when not in the initial network namespace Eric W. Biederman
2007-09-10 13:50 ` Pavel Emelyanov
[not found] ` <46E54B96.8060105-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2007-09-10 15:27 ` Eric W. Biederman
2007-09-12 11:59 ` David Miller
2007-09-12 12:03 ` David Miller
2007-09-12 12:16 ` Eric W. Biederman
[not found] ` <m1tzq4u92n.fsf_-_-T1Yj925okcoyDheHMi7gv2pdwda3JcWeAL8bYrjMMd8@public.gmane.org>
2007-09-10 19:07 ` [PATCH 16/16] net: netlink support for moving devices between network namespaces Serge E. Hallyn
2007-09-10 19:30 ` Eric W. Biederman
2007-09-11 0:54 ` Serge E. Hallyn
2007-09-12 11:57 ` David Miller
2007-09-12 11:54 ` [PATCH 15/16] net: Implement network device movement between namespaces David Miller
2007-09-12 11:49 ` [PATCH 14/16] net: Factor out __dev_alloc_name from dev_alloc_name David Miller
2007-09-12 11:39 ` [PATCH 13/16] net: Make the device list and device lookups per namespace David Miller
[not found] ` <m1bqccvock.fsf_-_-T1Yj925okcoyDheHMi7gv2pdwda3JcWeAL8bYrjMMd8@public.gmane.org>
2007-09-10 13:46 ` [PATCH 12/16] net: Support multiple network namespaces with netlink Pavel Emelyanov
2007-09-10 15:24 ` Eric W. Biederman
2007-09-12 11:06 ` David Miller
2007-09-12 11:02 ` [PATCH 11/16] net: Make device event notification network namespace safe David Miller
2007-09-12 11:00 ` [PATCH 10/16] net: Make packet reception " David Miller
2007-09-12 10:58 ` [PATCH 09/16] net: Initialize the network namespace of network devices David Miller
2007-09-12 10:04 ` [PATCH 08/16] net: Make socket creation namespace safe David Miller
2007-09-12 10:02 ` [PATCH 07/16] net: Make /proc/net per network namespace David Miller
2007-09-12 12:12 ` Daniel Lezcano
2007-09-12 12:19 ` David Miller
2007-09-08 21:21 ` [PATCH 06/16] net: Add a network namespace parameter to struct sock Eric W. Biederman
2007-09-12 9:58 ` David Miller
2007-09-12 9:57 ` [PATCH 05/16] net: Add a network namespace tag to struct net_device David Miller
2007-09-12 9:55 ` [PATCH 04/16] net: Add a network namespace parameter to tasks David Miller
2007-09-09 8:44 ` [PATCH 03/16] net: Basic network namespace infrastructure Eric Dumazet
[not found] ` <46E3B281.4030105-fPLkHRcR87vqlBn2x/YWAg@public.gmane.org>
2007-09-09 10:18 ` Eric W. Biederman
2007-09-10 5:46 ` Krishna Kumar2
[not found] ` <OF55551EA4.A3E6920C-ON65257352.001D6A3E-65257352.001FBEA7-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org>
2007-09-10 6:40 ` Eric W. Biederman
[not found] ` <m1ejh8x3ih.fsf_-_-T1Yj925okcoyDheHMi7gv2pdwda3JcWeAL8bYrjMMd8@public.gmane.org>
2007-09-09 0:33 ` Paul E. McKenney
2007-09-09 10:04 ` Eric W. Biederman
[not found] ` <m1fy1otarm.fsf-T1Yj925okcoyDheHMi7gv2pdwda3JcWeAL8bYrjMMd8@public.gmane.org>
2007-09-09 16:45 ` Paul E. McKenney [this message]
2007-09-10 6:32 ` Eric W. Biederman
2007-09-10 13:16 ` Pavel Emelyanov
[not found] ` <46E543A0.7010104-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2007-09-10 15:53 ` Eric W. Biederman
2007-09-12 9:52 ` David Miller
2007-09-12 9:39 ` [PATCH 02/16] net: Don't implement dev_ifname32 inline David Miller
2007-09-12 9:27 ` [PATCH 01/16] appletalk: In notifier handlers convert the void pointer to a netdevice David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070909164547.GD10417@linux.vnet.ibm.com \
--to=paulmck-23vcf4htsmix0ybbhkvfkdbpr1lh4cv8@public.gmane.org \
--cc=containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org \
--cc=davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org \
--cc=ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org \
--cc=netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).