From: "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
To: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
"Eric W. Biederman"
<ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Subject: Re: Virtualizing /proc/sys/kernel/random/boot_id per container ?
Date: Tue, 4 Sep 2012 17:18:18 +0000
Message-ID: <20120904171818.GA5334@mail.hallyn.com>
In-Reply-To: <50461EBB.2050501-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
Quoting Glauber Costa (glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org):
> On 09/04/2012 07:25 PM, Serge Hallyn wrote:
> > Quoting Glauber Costa (glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org):
> >> On 09/04/2012 06:44 PM, Serge Hallyn wrote:
> >>> Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> >>>> Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> writes:
> >>>>
> >>>>> On 08/31/2012 04:13 AM, Eric W. Biederman wrote:
> >>>>>> "Daniel P. Berrange" <berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
> >>>>>>
> >>>>>>> On Thu, Aug 30, 2012 at 03:15:17PM -0700, Eric W. Biederman wrote:
> >>>>>>>> "Daniel P. Berrange" <berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
> >>>>>>>>
> >>>>>>>>> One of the features that SystemD folks have asked us to fix in LXC, is
> >>>>>>>>> to make sure that /proc/sys/kernel/random/boot_id changes each time a
> >>>>>>>>> container is started.
> >>>>>>>>
> >>>>>>>> There may be a good reason for this. Most of what I have seen of
> >>>>>>>> kernel requests from the direction of SystemD is that while there may
> >>>>>>>> be a real problem, their imagined solution is usually not a
> >>>>>>>> particularly good one. So a description of the problem is needed.
> >>>>>>>>
> >>>>>>>> Justifying something with just "SystemD wants this" is a good way to
> >>>>>>>> get a nack.
> >>>>>>>
> >>>>>>> SystemD records log messages for all system services in their journal.
> >>>>>>> They can show you all log messages for the current service execution,
> >>>>>>> all log messages for a service since system boot, or all log messages
> >>>>>>> ever. The boot_id value is used as a unique tag to allow grouping of
> >>>>>>> the log messages per system boot. When we run systemd inside a container
> >>>>>>> we want to get that grouping of log messages generated by services inside
> >>>>>>> the container, to take account of the container boot, not the host boot.
> >>>>>>> Hence the desire to have the boot_id value reflect when a container is
> >>>>>>> booted.
> >>>>>>
> >>>>>> Since SystemD post-dates containers and since the logging feature is not
> >>>>>> currently in wide use that use case is completely non-persuasive.
> >>>>>>
> >>>>>> So far this just sounds like a plain SystemD bug and something that can
> >>>>>> be easily changed at this point in time.
> >>>>>>
> >>>>>> It has been a long time but my fuzzy memory says that the original
> >>>>>> boot_id justification was based on use cases that could not be solved
> >>>>>> any other way.
> >>>>>>
> >>>>>> My memory says it was this thread https://lkml.org/lkml/1999/5/31/233
> >>>>>> that inspired the implementation of boot_id. However reading the
> >>>>>> current emacs source code it appears emacs gave up before boot_id
> >>>>>> was implemented and stats /var/run/random-seed (which we seem to
> >>>>>> have removed) or looks in wtmp or utmp for the latest boot record.
> >>>>>>
> >>>>>> I did a quick grep through the binaries on my system and I could not
> >>>>>> find anything using /proc/sys/kernel/random/boot_id.
> >>>>>>
> >>>>>> That suggests to me that the proper solution is to actually just remove
> >>>>>> boot_id.
> >>>>>>
> >>>>>> Hmm. And then there is another interesting detail: what should boot_id
> >>>>>> return after the processes have migrated from one system to another?
> >>>>>>
> >>>>>
> >>>>> Since this would be a per-boot id, this clearly has to be carried over
> >>>>> with migration, along with all the tons of data we already carry.
> >>>>
> >>>> The twist of course is what a boot means. If we are really after
> >>>> machine boots then the current behavior is correct.
> >>>>
> >>>> Looking back in the archives the desired behavior appears to be a value
> >>>> that can be used to see if a pid value must be stale.
> >>>>
> >>>> As a stale pid detector boot_id is pretty lousy. Pids can still be
> >>>> reused.
> >>>>
> >>>> Still a role as a stale pid detector makes it clear which namespace
> >>>> boot_id should be in and how we should treat boot_id upon migration.
> >>>>
> >>>> You can only serve as a stale pid detector if you are in the pid
> >>>> namespace.
> >>>>
> >>>> So at this point patches are welcome. Hopefully with a summary
> >>>> of the discussion.
> >>>
> >>> I don't understand why this should be provided by the kernel. Especially
> >>> given that we've proven that everyone really wants this to be per-container
> >>> as well.
> >>>
> >>> So why not just have init, on startup, create a /run/boot_id file, perhaps
> >>> by sha1summing the time at which it started, perhaps plus some nonce?
> >>>
> >> Why shouldn't it be provided by the kernel? That is the real question.
> >
> > Because it's not the right place. The origin of this thread proves that
> > people want a per-init, not per-kernel, value.
> >
>
> Not all files provided by the kernel are "per-kernel". /proc/self is
> full of per-namespace stuff.
>
> >> The way I see it, every file we need to setup from the outside is a
> >> hassle. Among many other things, it is just asking for duplication of
> >> efforts among multiple userspaces.
> >>
> >> netns does this for its proc files. The only reason we don't do it for
> >> cgroups-driven files is that the semantics are very ill-defined. For this
> >> file, that doesn't seem to be the case.
> >
> > But it is the case. How do you intend to have the kernel decide what
> > value to put in there for a process in a container, or in a chroot?
> >
>
> one value per pidns.
ok. (So should it be called /proc/pidns_uuid? Well, whatever. No
objection from me - thanks.)
-serge
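
For reference, the userspace alternative floated earlier in the thread (init
writing its own /run/boot_id from a SHA-1 of its start time plus a nonce)
could look roughly like the sketch below. This is only an illustration under
stated assumptions, not anything that was agreed on or implemented: the
function names make_boot_id and write_boot_id are hypothetical, the nonce is
assumed to be 16 random bytes, and the output is a plain 40-character hex
digest rather than the UUID format the kernel file uses.

```python
import hashlib
import os
import time


def make_boot_id(start_time=None, nonce=None):
    """Return a hex SHA-1 digest of init's start time plus a nonce.

    Both parameters are hypothetical knobs for this sketch; by default
    the current time and 16 random bytes are used.
    """
    if start_time is None:
        start_time = time.time()
    if nonce is None:
        nonce = os.urandom(16)
    data = repr(start_time).encode("ascii") + nonce
    return hashlib.sha1(data).hexdigest()


def write_boot_id(path, boot_id):
    """Write the id to a temporary file, then rename it into place so a
    reader never sees a partially written value."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(boot_id + "\n")
    os.replace(tmp, path)
```

An init running as PID 1 of a container would call
write_boot_id("/run/boot_id", make_boot_id()) once at startup. The nonce
matters because two containers started in the same clock tick would otherwise
get the same id.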