linux-api.vger.kernel.org archive mirror
From: Andrew Morton <akpm@linux-foundation.org>
To: Daniel Colascione <dancol@google.com>
Cc: linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
	timmurray@google.com, primiano@google.com, joelaf@google.com,
	Jonathan Corbet <corbet@lwn.net>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	Vlastimil Babka <vbabka@suse.cz>, Roman Gushchin <guro@fb.com>,
	Prashant Dhamdhere <pdhamdhe@redhat.com>,
	"Dennis Zhou (Facebook)" <dennisszhou@gmail.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"Steven Rostedt (VMware)" <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>,
	Dominik Brodowski <linux@dominikbrodowski.net>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Michal Hocko <mhocko@suse.com>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	KJ Tsanaktsidis <ktsanaktsidis@zendesk.com>
Subject: Re: [PATCH v2] Add /proc/pid_gen
Date: Wed, 21 Nov 2018 14:12:20 -0800
Message-ID: <20181121141220.0e533c1dcb4792480efbf3ff@linux-foundation.org>
In-Reply-To: <20181121205428.165205-1-dancol@google.com>

On Wed, 21 Nov 2018 12:54:20 -0800 Daniel Colascione <dancol@google.com> wrote:

> Trace analysis code needs a coherent picture of the set of processes
> and threads running on a system. While it's possible to enumerate all
> tasks via /proc, this enumeration is not atomic. If PID numbering
> rolls over during snapshot collection, the resulting snapshot of the
> process and thread state of the system may be incoherent, confusing
> trace analysis tools. The fundamental problem is that if a PID is
> reused during a userspace scan of /proc, it's impossible to tell, in
> post-processing, whether a fact that the userspace /proc scanner
> reports regarding a given PID refers to the old or new task named by
> that PID, as the scan of that PID may or may not have occurred before
> the PID reuse, and there's no way to "stamp" a fact read from the
> kernel with a trace timestamp.
> 
> This change adds a per-pid-namespace 64-bit generation number,
> incremented on PID rollover, and exposes it via a new proc file
> /proc/pid_gen. By examining this file before and after /proc
> enumeration, user code can detect the potential reuse of a PID and
> restart the task enumeration process, repeating until it gets a
> coherent snapshot.
> 
> PID rollover ought to be rare, so in practice, scan repetitions will
> be rare.

In general, tracing is a rather specialized thing.  Why is this very
occasional confusion a sufficiently serious problem to warrant addition
of this code?

Which userspace tools will be using pid_gen?  Are the developers of
those tools signed up to use pid_gen?
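
For reference, the retry loop described in the quoted changelog could be
sketched in userspace roughly as below.  This is an illustration only, not
code from the patch or from any existing tool: read_generation(),
coherent_snapshot(), scan_fn and count_scan() are hypothetical names, and
the path is a parameter because /proc/pid_gen exists only with the
proposed change applied.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Read one decimal u64 from a file such as the proposed /proc/pid_gen.
 * Returns 0 on success, -1 on failure.
 */
static int read_generation(const char *path, uint64_t *gen)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%" SCNu64, gen) == 1;
	fclose(f);
	return ok ? 0 : -1;
}

/* Hypothetical callback type: one full walk of /proc. 0 means success. */
typedef int (*scan_fn)(void *ctx);

/* Placeholder scan that only counts how often it was invoked. */
static int count_scan(void *ctx)
{
	(*(int *)ctx)++;
	return 0;
}

/*
 * Repeat the scan until the generation is the same before and after it,
 * i.e. no PID rollover was observed during the scan.  Gives up after
 * max_tries attempts.  Returns 0 on a coherent snapshot, -1 otherwise.
 */
static int coherent_snapshot(const char *gen_path, scan_fn scan, void *ctx,
			     int max_tries)
{
	uint64_t before, after;

	for (int i = 0; i < max_tries; i++) {
		if (read_generation(gen_path, &before))
			return -1;
		if (scan(ctx))
			return -1;
		if (read_generation(gen_path, &after))
			return -1;
		if (before == after)
			return 0;
	}
	return -1;
}
```

A real scanner would pass its own /proc walker in place of count_scan()
and discard the partial results of any attempt that is retried.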

> --- a/include/linux/pid.h
> +++ b/include/linux/pid.h
> @@ -112,6 +112,7 @@ extern struct pid *find_ge_pid(int nr, struct pid_namespace *);
>  int next_pidmap(struct pid_namespace *pid_ns, unsigned int last);
>  
>  extern struct pid *alloc_pid(struct pid_namespace *ns);
> +extern u64 read_pid_generation(struct pid_namespace *ns);

pid_generation_read() would be a better (and more idiomatic) name.

>  extern void free_pid(struct pid *pid);
>  extern void disable_pid_allocation(struct pid_namespace *ns);
>
> ...
>
> +u64 read_pid_generation(struct pid_namespace *ns)
> +{
> +	u64 generation;
> +
> +
> +	spin_lock_irq(&pidmap_lock);
> +	generation = ns->generation;
> +	spin_unlock_irq(&pidmap_lock);
> +	return generation;
> +}

What is the spinlocking in here for?  AFAICT the only purpose it serves
is to make the 64-bit read atomic, so it isn't needed on 64-bit?
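
The atomicity question can be illustrated outside the kernel.  The
following is a userspace analogy using C11 atomics, not the patch's code:
struct pid_ns_model, generation_bump() and generation_read() are made-up
names.  In the kernel the comparable tools would be READ_ONCE() for a
naturally aligned word-sized load, or atomic64_t where a plain 64-bit
load could tear.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for the namespace structure that holds the counter. */
struct pid_ns_model {
	_Atomic uint64_t generation;
};

/* Writer side: bump the counter on (simulated) PID rollover. */
static void generation_bump(struct pid_ns_model *ns)
{
	atomic_fetch_add_explicit(&ns->generation, 1, memory_order_relaxed);
}

/*
 * Reader side: a single atomic load never returns a torn value, so no
 * lock is taken here just to read the counter.
 */
static uint64_t generation_read(struct pid_ns_model *ns)
{
	return atomic_load_explicit(&ns->generation, memory_order_relaxed);
}
```

Relaxed ordering is used because the sketch only cares about reading a
whole value, not about ordering against other memory accesses.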

>  void disable_pid_allocation(struct pid_namespace *ns)
>  {
>  	spin_lock_irq(&pidmap_lock);
> @@ -449,6 +463,17 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns)
>  	return idr_get_next(&ns->idr, &nr);
>  }
>  
> +#ifdef CONFIG_PROC_FS
> +static int pid_generation_show(struct seq_file *m, void *v)
> +{
> +	u64 generation =
> +		read_pid_generation(proc_pid_ns(file_inode(m->file)));

	u64 generation;

	generation = read_pid_generation(proc_pid_ns(file_inode(m->file)));

is a nicer way of avoiding column wrap.

> +	seq_printf(m, "%llu\n", generation);
> +	return 0;
> +
> +};
> +#endif
> +
>  void __init pid_idr_init(void)
>  {
>  	/* Verify no one has done anything silly: */
> @@ -465,4 +490,13 @@ void __init pid_idr_init(void)
>  
>  	init_pid_ns.pid_cachep = KMEM_CACHE(pid,
>  			SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT);
> +
> +}
> +
> +void __init pid_proc_init(void)
> +{
> +	/* pid_idr_init is too early, so get a separate init function. */

s/get a/use a/

> +#ifdef CONFIG_PROC_FS
> +	WARN_ON(!proc_create_single("pid_gen", 0, NULL, pid_generation_show));
> +#endif
>  }

This whole function could vanish if !CONFIG_PROC_FS.  Doesn't matter
much with __init code though.
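
One common shape for that is an empty inline stub in the !CONFIG_PROC_FS
case, so callers need no #ifdef of their own.  The sketch below models the
idea in plain C; registered_files is a hypothetical stand-in for the
procfs registration and is not part of the patch.

```c
/* Hypothetical counter standing in for "a proc file was registered". */
static int registered_files;

#ifdef CONFIG_PROC_FS
/* Real implementation: only built when procfs support is configured. */
static void pid_proc_init(void)
{
	registered_files++;
}
#else
/* Empty stub: the call site stays unconditional and compiles to nothing. */
static inline void pid_proc_init(void) { }
#endif
```

Building with -DCONFIG_PROC_FS selects the real function; without it the
stub is used and registered_files stays at zero.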


