From: Andrew Morton <akpm@linux-foundation.org>
To: Daniel Colascione <dancol@google.com>
Cc: linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
timmurray@google.com, primiano@google.com, joelaf@google.com,
Jonathan Corbet <corbet@lwn.net>,
Mike Rapoport <rppt@linux.vnet.ibm.com>,
Vlastimil Babka <vbabka@suse.cz>, Roman Gushchin <guro@fb.com>,
Prashant Dhamdhere <pdhamdhe@redhat.com>,
"Dennis Zhou (Facebook)" <dennisszhou@gmail.com>,
"Eric W. Biederman" <ebiederm@xmission.com>,
"Steven Rostedt (VMware)" <rostedt@goodmis.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@kernel.org>,
Dominik Brodowski <linux@dominikbrodowski.net>,
Josh Poimboeuf <jpoimboe@redhat.com>,
Ard Biesheuvel <ard.biesheuvel@linaro.org>,
Michal Hocko <mhocko@suse.com>,
Stephen Rothwell <sfr@canb.auug.org.au>,
KJ Tsanaktsidis <ktsanaktsidis@zendesk.com>
Subject: Re: [PATCH v2] Add /proc/pid_gen
Date: Wed, 21 Nov 2018 14:12:20 -0800
Message-ID: <20181121141220.0e533c1dcb4792480efbf3ff@linux-foundation.org>
In-Reply-To: <20181121205428.165205-1-dancol@google.com>
On Wed, 21 Nov 2018 12:54:20 -0800 Daniel Colascione <dancol@google.com> wrote:
> Trace analysis code needs a coherent picture of the set of processes
> and threads running on a system. While it's possible to enumerate all
> tasks via /proc, this enumeration is not atomic. If PID numbering
> rolls over during snapshot collection, the resulting snapshot of the
> process and thread state of the system may be incoherent, confusing
> trace analysis tools. The fundamental problem is that if a PID is
> reused during a userspace scan of /proc, it's impossible to tell, in
> post-processing, whether a fact that the userspace /proc scanner
> reports regarding a given PID refers to the old or new task named by
> that PID, as the scan of that PID may or may not have occurred before
> the PID reuse, and there's no way to "stamp" a fact read from the
> kernel with a trace timestamp.
>
> This change adds a per-pid-namespace 64-bit generation number,
> incremented on PID rollover, and exposes it via a new proc file
> /proc/pid_gen. By examining this file before and after /proc
> enumeration, user code can detect the potential reuse of a PID and
> restart the task enumeration process, repeating until it gets a
> coherent snapshot.
>
> PID rollover ought to be rare, so in practice, scan repetitions will
> be rare.
In general, tracing is a rather specialized thing. Why is this very
occasional confusion a sufficiently serious problem to warrant addition
of this code?

Which userspace tools will be using pid_gen? Are the developers of
those tools signed up to use pid_gen?
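
For reference, the usage pattern the changelog describes (read the
generation, enumerate /proc, read the generation again, retry on
mismatch) might look roughly like the userspace sketch below. This is
only an illustration under assumptions: the file path is passed in
rather than hard-coded, the names read_generation(), coherent_scan(),
scan_fn and demo_scan() are hypothetical, and demo_scan() is a
placeholder for a real walk over /proc.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Read one decimal u64 from a single-line file; false on any error. */
static bool read_generation(const char *path, uint64_t *out)
{
	FILE *f;
	bool ok;

	assert(path != NULL && out != NULL);
	f = fopen(path, "r");
	if (!f)
		return false;
	ok = fscanf(f, "%" SCNu64, out) == 1;
	fclose(f);
	return ok;
}

/* Hypothetical callback type: performs one full task enumeration. */
typedef bool (*scan_fn)(void *ctx);

/* Placeholder scan: only counts how often it ran (ctx is an int *). */
static bool demo_scan(void *ctx)
{
	(*(int *)ctx)++;
	return true;
}

/*
 * Repeat the scan until the generation is unchanged across it, or
 * until max_tries attempts have been made. Returns true only if a
 * scan completed with the same generation before and after.
 */
static bool coherent_scan(const char *gen_path, scan_fn scan, void *ctx,
			  int max_tries)
{
	uint64_t before, after;
	int i;

	for (i = 0; i < max_tries; i++) {
		if (!read_generation(gen_path, &before))
			return false;
		if (!scan(ctx))
			return false;
		if (!read_generation(gen_path, &after))
			return false;
		if (before == after)
			return true;	/* no rollover during the scan */
	}
	return false;
}
```

A caller would pass the proposed /proc/pid_gen path as gen_path; the
retry bound is a design choice of this sketch, not something the patch
specifies.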
> --- a/include/linux/pid.h
> +++ b/include/linux/pid.h
> @@ -112,6 +112,7 @@ extern struct pid *find_ge_pid(int nr, struct pid_namespace *);
> int next_pidmap(struct pid_namespace *pid_ns, unsigned int last);
>
> extern struct pid *alloc_pid(struct pid_namespace *ns);
> +extern u64 read_pid_generation(struct pid_namespace *ns);
pid_generation_read() would be a better (and more idiomatic) name.
> extern void free_pid(struct pid *pid);
> extern void disable_pid_allocation(struct pid_namespace *ns);
>
> ...
>
> +u64 read_pid_generation(struct pid_namespace *ns)
> +{
> + u64 generation;
> +
> +
> + spin_lock_irq(&pidmap_lock);
> + generation = ns->generation;
> + spin_unlock_irq(&pidmap_lock);
> + return generation;
> +}
What is the spinlocking in here for? afaict the only purpose it serves
is to make the 64-bit read atomic, so it isn't needed on 64-bit?
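
To make the atomicity point concrete, here is a small userspace model
(not kernel code) of a 64-bit counter that is read without taking a
lock. It uses C11 atomics as a stand-in; in the kernel one would more
likely reach for something like atomic64_t, and whether that suits this
patch is an open question. The struct and function names are
hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-namespace state in the patch. */
struct pid_ns_model {
	_Atomic uint64_t generation;
};

/* Writer side: would be called when PID numbering rolls over. */
static void pid_generation_bump(struct pid_ns_model *ns)
{
	atomic_fetch_add_explicit(&ns->generation, 1, memory_order_relaxed);
}

/* Reader side: one untorn 64-bit load, no lock taken by the caller. */
static uint64_t pid_generation_read(struct pid_ns_model *ns)
{
	return atomic_load_explicit(&ns->generation, memory_order_relaxed);
}
```

On a 64-bit target the load is typically a single instruction; on a
32-bit target the compiler or runtime has to do extra work to avoid a
torn read, which is the case the spinlock in the patch covers.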
> void disable_pid_allocation(struct pid_namespace *ns)
> {
> spin_lock_irq(&pidmap_lock);
> @@ -449,6 +463,17 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns)
> return idr_get_next(&ns->idr, &nr);
> }
>
> +#ifdef CONFIG_PROC_FS
> +static int pid_generation_show(struct seq_file *m, void *v)
> +{
> + u64 generation =
> + read_pid_generation(proc_pid_ns(file_inode(m->file)));
	u64 generation;

	generation = read_pid_generation(proc_pid_ns(file_inode(m->file)));

is a nicer way of avoiding column wrap.
> + seq_printf(m, "%llu\n", generation);
> + return 0;
> +
> +};
> +#endif
> +
> void __init pid_idr_init(void)
> {
> /* Verify no one has done anything silly: */
> @@ -465,4 +490,13 @@ void __init pid_idr_init(void)
>
> init_pid_ns.pid_cachep = KMEM_CACHE(pid,
> SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT);
> +
> +}
> +
> +void __init pid_proc_init(void)
> +{
> + /* pid_idr_init is too early, so get a separate init function. */
s/get a/use a/
> +#ifdef CONFIG_PROC_FS
> + WARN_ON(!proc_create_single("pid_gen", 0, NULL, pid_generation_show));
> +#endif
> }
This whole function could vanish if !CONFIG_PROC_FS. Doesn't matter
much with __init code though.