From: "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
"Stéphane Graber"
<stephane.graber-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>,
"Daniel Lezcano" <dlezcano-GANU6spQydw@public.gmane.org>
Subject: Re: [PATCH RFC] syslog ns proof of concept
Date: Mon, 19 Nov 2012 14:18:15 +0000
Message-ID: <20121119141815.GB4321@mail.hallyn.com>
In-Reply-To: <87pq3c223i.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> writes:
>
> > Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> >> Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> writes:
> >>
> >> > Introduce a system log namespace. The syslog ns is tied to a user
> >> > namespace. You must create a new user namespace before you can create a
> >> > new syslog ns. The syslog ns is created through a new command (11) to
> >> > the __NR_syslog system call.
> >> >
> >> > Once a task enters a new syslog ns, its "dmesg", "dmesg -c" and
> >> > /dev/kmsg actions affect only itself, so that user-created syslog
> >> > messages no longer are confusingly combined in the host's syslog.
> >> > "printk" itself always goes to the initial syslog_ns, and consoles
> >> > belong only to the initial syslog_ns. However printks relating to a
> >> > specific network namespace, for instance, can now be targeted to the
> >> > syslog ns for the user ns which owns the network ns, aiding in debugging
> >> > in a container.
> >> >
> >> > This patch is on top of the user namespace enhanced kernel at
> >> > git://kernel.ubuntu.com/serge/quantal-userns. It is good enough to
> >> > compile with stock ubuntu kernel options, boot, launch other syslog
> >> > namespaces and exercise them. It will need help before it will compile
> >> > with funky options like CONFIG_PRINTK=n. This is only being sent out to
> >> > get feedback on the general idea.
> >> >
> >> > Comments greatly appreciated.
> >> >
> >> > (See https://wiki.ubuntu.com/LxcSyslogNs for background).
> >>
> >> Overall I would say the goal sounds well thought out.
> >>
> >> I am not a fan of how this ties into the user namespace. I would prefer
> >> closer or looser ties. The recursive reference count loop where a
> >> userns refers to a syslogns and that syslogns refers to the same userns
> >> is unpleasant.
> >
> > We could make the nsproxy point to the syslog_ns, but this seemed simpler.
> > Note that the syslog_ns does not need to pin the user_ns, since by design
> > the user_ns owning a syslog_ns can't go away if the syslog_ns is still
> > alive.
> >
> > But yes, the question of "what should point to the syslog_ns" is what has
> > kept a syslog_ns from being seriously proposed since February 2010 :)
> >
> > Hm, wait. A nagging feeling made me look back, and I see that I do in
> > fact pin the user_ns from the syslog_ns. I didn't mean to (and I don't
> > release it :) and we don't need to. When a syslog_ns is created, it
> > can only be inherited by child user_ns's, and its owner, the parent user_ns,
> > can never go away until the child user_ns's go away.
>
> There is an argument to be made that syslog messages are the kind of
> security identifiers like uid, gids, and keys that should be part of a
> user namespace. I'm not fully convinced, but there are some DOS attacks
> that this would naturally prevent.
I can't really think of a good case for not putting the syslogns straight
into the userns (i.e. not having a separate syslogns), so I'd say let's
go that route.

There is a big locking bug (besides syslog_ns pinning user_ns) in my
patch - something needs to be done with struct cont, which pins the
syslog_ns. So either when a user_ns is freed we need to flush struct
cont if it is pinning this user_ns, or the struct cont should
explicitly pin the user_ns.
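
To make the second option concrete, here is a minimal userspace sketch of
"struct cont explicitly pins the user_ns". It is only a model of the
reference counting, not kernel code: the refcount field, the
create/get/put helpers, cont_begin() and cont_flush() are all
hypothetical names, and the syslog state is assumed to live directly in
the user_ns as proposed above. A child user_ns pins its parent, and the
pending-line buffer pins whichever user_ns it is buffering for.

```c
/* Userspace model of the lifetime rules only; every name is hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct user_ns {
    int refcount;
    struct user_ns *parent;   /* a child pins its parent for its whole life */
};

static int live_user_ns;      /* allocated namespaces, so frees are observable */

static struct user_ns *create_user_ns(struct user_ns *parent)
{
    struct user_ns *ns = calloc(1, sizeof(*ns));

    if (!ns)
        return NULL;
    ns->refcount = 1;         /* reference held by the creator */
    ns->parent = parent;
    if (parent)
        parent->refcount++;   /* the child's pin on its parent */
    live_user_ns++;
    return ns;
}

static struct user_ns *get_user_ns(struct user_ns *ns)
{
    ns->refcount++;
    return ns;
}

/* Drop one reference; freeing a namespace also drops its pin on the parent. */
static void put_user_ns(struct user_ns *ns)
{
    while (ns && --ns->refcount == 0) {
        struct user_ns *parent = ns->parent;

        free(ns);
        live_user_ns--;
        ns = parent;
    }
}

/* Stand-in for the pending partial-line buffer ("struct cont"). */
struct cont {
    struct user_ns *owner;    /* explicitly pinned while a line is pending */
};

static void cont_begin(struct cont *c, struct user_ns *ns)
{
    c->owner = get_user_ns(ns);
}

static void cont_flush(struct cont *c)
{
    if (c->owner) {
        put_user_ns(c->owner);
        c->owner = NULL;
    }
}
```

With this shape the user_ns cannot be freed while a partial line is
pending, at the cost of delaying the free until the next flush. The
other option (flush on free) avoids that delay but needs the free path
to look at the cont buffer.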
> >> The important case as I understand it is to handle injection of messages
> >> into dmesg by userspace?
> >
> > 1. injection of messages into dmesg by userspace, 2. clearing of messages
> > by userspace, but also 3. allowing appropriate kernel printks to be
> > targeted to containers.
> >
> >> I would really like to see how messages from networking devices and
> >> netfilter would be handled. Right now one of the ugliest bits of
> >
> > It would simply replace a
> > printk(KERN_NOTICE "doing something\n");
> > with
> > nsprintk(net->user_ns->syslog_ns, KERN_NOTICE "doing something\n");
> >
> > I'm not yet clear on whether we'd want nsprintk to print to both the
> > init_syslog_ns (with a ns prefix) and the child ns.
>
> There are some specialized forms of printk like dev_printk and in
> particular netdev_printk that it would be very interesting if they
> did the work behind the scenes. So that you could code the obvious
> thing and it would do the right thing automatically.
Agreed.
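
For illustration, the printk -> nsprintk substitution quoted above can be
modelled in userspace roughly as follows. This is a sketch under
assumptions, not the patch: the buffer size, the struct layouts, the
"<5>" stand-in for KERN_NOTICE and syslog_ns_clear() are all made up
for the example. It only shows that a message aimed at
net->user_ns->syslog_ns lands in that namespace's buffer and that
clearing one buffer leaves the others alone.

```c
/* Userspace model only; layouts and helper names are hypothetical. */
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define LOG_BUF_LEN 256        /* deliberately tiny for the example */
#define KERN_NOTICE "<5>"      /* stand-in prefix, not the kernel's encoding */

struct syslog_ns {
    char buf[LOG_BUF_LEN];
    size_t len;                /* bytes currently stored, excluding the NUL */
};

struct user_ns { struct syslog_ns *syslog_ns; };
struct net     { struct user_ns *user_ns; };

static struct syslog_ns init_syslog_ns;   /* the host's log */

/* Append a formatted message to one namespace's buffer.
 * Returns the number of bytes stored, or -1 if the message does not fit. */
static int nsprintk(struct syslog_ns *ns, const char *fmt, ...)
{
    size_t room = sizeof(ns->buf) - ns->len;
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(ns->buf + ns->len, room, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= room) {
        ns->buf[ns->len] = '\0';   /* drop the truncated message */
        return -1;
    }
    ns->len += (size_t)n;
    return n;
}

/* Model of "dmesg -c": clears only the given namespace's buffer. */
static void syslog_ns_clear(struct syslog_ns *ns)
{
    ns->len = 0;
    ns->buf[0] = '\0';
}
```

A caller would then write
nsprintk(net->user_ns->syslog_ns, KERN_NOTICE "doing something\n");
and a netdev_printk-style wrapper could look up the namespace from the
device behind the scenes, so that callers do not have to name it.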
> >> lowering the permissions in the network namespace is what to do about the
> >> commands that set the message loglevel.
> >
> > Here I'm not sure what you mean.
>
> There is a possible DOS attack where, by turning on debug messages in a
> user namespace, you can overwhelm syslog.
Oh, I see.
> >> In general unless we can safely and sanely direct kernel messages into
> >> this new dmesg I don't actually see the point of having another ring
> >> buffer in the kernel. If the only success is userspace, having the
> >> syslog facility simply be unavailable seems more palatable.
> >
> > No, I didn't do any in this patch, but directing kernel messages into the
> > new dmesg was definitely a goal and should be trivial now.
>
> Getting the semantics of which kernel messages should be directed at the
> new ring buffer and what that means seems to me to be a key factor in
> seeing how practical this is. Otherwise this seems to call out for a
> change in userspace.
Ok, I was hoping that once there was a trivial-to-use nsprintk the
appropriate users would be converted by others :), but I can take a
look at converting compelling users before I resend.
> Certainly inside a user namespace now you can't destructively touch the
> kernel's syslog at all.
That should be true, yes.
thanks,
-serge