From: ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman)
To: Pavel Emelyanov <xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
Cc: Linux Containers
<containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>,
Cedric Le Goater <clg-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
Subject: Re: [PATCH 0/3] Sysctl shadow management
Date: Tue, 20 Nov 2007 06:05:58 -0700 [thread overview]
Message-ID: <m1y7ctrrrd.fsf@ebiederm.dsl.xmission.com> (raw)
In-Reply-To: <4742C73C.3010904-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> (Pavel Emelyanov's message of "Tue, 20 Nov 2007 14:38:36 +0300")
Pavel Emelyanov <xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> writes:
> Hi guys!
>
> You all know, that with multiple namespaces we have to take
> special care about sysctls. E.g. IPC sysctl handlers are
> equipped with kludges to alter the sysctl parameters of
> appropriate namespace. The same thing should be done for UTS
> namespace (but it is not - we have a BUG in mainstream) and
> (!) for network namespaces.
The bug in mainstream was introduced by commit:
7d69a1f4a72b18876c99c697692b78339d491568
Thee code should read:
static void *get_uts(ctl_table *table, int write)
{
char *which = table->data;
+ struct uts_namespace *uts_ns = current->nsproxy->uts_ns;
+ which = (which - (char *)&init_uts_ns) + (char *)uts_ns;
if (!write)
down_read(&uts_sem);
else
down_write(&uts_sem);
return which;
}
And for 2.6.24 we should just restore the two missing lines.
> Unlike all the other namespaces, network will have to not
> just address different variables via same sysctl names, but
> to have different tables with different sysctl names. E.g.
> /proc/sys/net/conf have entries for devices, which differ
> across namespaces.
>
> Eric currently have some work done in that directions, I
> like the approach in general very much, but it looks rather
> raw (Eric, take this in good part). You know, ifdefs in the
> middle of the code, explicit references to net namespace
> and so on and so forth.
Sure. I don't in principle have any problem with the set of
roots we go through be dynamic at run time instead of compile
time. That is probably a cleaner approach and likely to solve
my problem with sched_debug registering sysctls before the
sysctl subsystem is initialized.
One direction I have always intended to expand things when
there was a bit of time to modify /proc/sys so that it
is a symlink to /proc/self/sys. Then make /proc/<pid>/sys
have cachable dentries for the sysctls.
To achieve that we need to pass our namespaces into
your shadow function. Instead of always using current.
So I am thinking something like:
struct ctl_table_root {
struct list_head ctl_entry;
struct ctl_table_header *ctl_head;
struct ctl_table_header *lookup_ctl_head(struct nsproxy *namespaces);
};
To actually handle the set of network devices in a namespace we need
to have a list so making sysctl_head_next just loop over a list of
lists should be no extra work and make implementing the users
easier.
Beyond that through from my quick skim I have a preference for
the way I am handling it.
We really need to add the tables with some variant of
register_sysctl_table or else we will have module unload races.
Introducing register_sysctl_paths is a very useful cleanup in
it's own right and it helps quite a bit. In most cases it removes the
need for your create_sysctl_shadow function, and it always reduced the
amount of code for tables.
Further simply using kmemdup instead of a custom crafted function
is a little more straight forward and is the idiom already
established throughout the networking code.
Then we will just need a base sysctl function:
struct ctl_table_header *
register_sysctl_rooted_paths(struct ctl_table_root *root,
struct nsproxy *namespaces,
struct ctl_path *path,
struct ctl_table *table)
For the simple namespaces we can call it once per namespace
after registering our root (we can't use current because we initialize
things before we update nsproxy), and we still need to run
sysctl_check.
For the more complex namespaces (i.e. the network namespace). We can
write a simple wrapper around register_sysctl_rooted_paths.
And of course non-namespace specific sysctls can just use a wrapper
that assumes the default global list of sysctl_headers.
> So here's the RFC for a bit better sysctls shadow management.
>
> I will provide 3 patches:
> 1. the sysctl shadows themselves;
> 2. using shadows in UTS namespace;
> 3. using shadows in IPC namespace;
Your patches look fairly reasonable. Other then the pieces
I have mentioned already.
> Using them in net namespace is already checked (I created
> sysctl entries with different names), but I don't have any
> patches against any David's tree yet. If we're OK with this
> set I will start talking to Andrew and David about who to
> send these patches to and making shadows for net-related
> sysctl variables.
I think we need another round. My hunch is that it will be easiest
if David collects them up, and then Andrew updates his tree, but
we will see.
Eric
next prev parent reply other threads:[~2007-11-20 13:05 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-11-20 11:38 [PATCH 0/3] Sysctl shadow management Pavel Emelyanov
[not found] ` <4742C73C.3010904-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2007-11-20 11:43 ` [PATCH 1/3] The sysctl shadows Pavel Emelyanov
[not found] ` <4742C86E.6060705-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2007-11-20 17:16 ` Dave Hansen
2007-11-21 9:20 ` Pavel Emelyanov
2007-11-20 11:45 ` [PATCH 2/3] Switch UTS namespace to use shadows Pavel Emelyanov
2007-11-20 11:47 ` [PATCH 3/3] Switch IPC namespace to use sysctl shadows Pavel Emelyanov
[not found] ` <4742C95D.1040907-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2007-11-20 17:24 ` Dave Hansen
2007-11-21 9:21 ` Pavel Emelyanov
2007-11-20 13:05 ` Eric W. Biederman [this message]
[not found] ` <m1y7ctrrrd.fsf-T1Yj925okcoyDheHMi7gv2pdwda3JcWeAL8bYrjMMd8@public.gmane.org>
2007-11-20 13:21 ` [PATCH 0/3] Sysctl shadow management Pavel Emelyanov
[not found] ` <4742DF51.8060402-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2007-11-20 15:21 ` Eric W. Biederman
[not found] ` <m1tznhrli5.fsf-T1Yj925okcoyDheHMi7gv2pdwda3JcWeAL8bYrjMMd8@public.gmane.org>
2007-11-20 15:36 ` Pavel Emelyanov
[not found] ` <4742FEF6.6080609-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2007-11-20 19:47 ` Eric W. Biederman
[not found] ` <m1myt8snqp.fsf-T1Yj925okcoyDheHMi7gv2pdwda3JcWeAL8bYrjMMd8@public.gmane.org>
2007-11-21 9:52 ` Pavel Emelyanov
2007-11-29 17:40 ` [PATCH 0/4] Sysctl namespace support Eric W. Biederman
[not found] ` <m1odddc5mf.fsf-T1Yj925okcoyDheHMi7gv2pdwda3JcWeAL8bYrjMMd8@public.gmane.org>
2007-11-29 17:45 ` [PATCH 1/4] sysctl: Add register_sysctl_paths function Eric W. Biederman
2007-11-29 17:46 ` [PATCH 2/4] sysctl: Remember the ctl_table we passed to register_sysctl_paths Eric W. Biederman
2007-11-29 17:51 ` [PATCH 3/4] sysctl: Infrastructure for per namespace sysctls Eric W. Biederman
2007-11-29 17:53 ` [PATCH 4/4] net: Implement the per network namespace sysctl infrastructure Eric W. Biederman
2007-11-30 16:18 ` Serge E. Hallyn
2007-11-30 16:23 ` Pavel Emelyanov
2007-11-30 21:49 ` Eric W. Biederman
2007-12-01 0:01 ` Serge E. Hallyn
2007-11-30 12:56 ` [PATCH 0/4] Sysctl namespace support Herbert Xu
[not found] ` <20071130125627.GH26848-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
2007-11-30 13:25 ` Eric W. Biederman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=m1y7ctrrrd.fsf@ebiederm.dsl.xmission.com \
--to=ebiederm-as9lmozglivwk0htik3j/w@public.gmane.org \
--cc=clg-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org \
--cc=containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org \
--cc=xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox