All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Salman <sqazi@google.com>
Cc: mingo@elte.hu, linux-kernel@vger.kernel.org,
	peterz@infradead.org, tytso@google.com,
	torvalds@linux-foundation.org, walken@google.com,
	Chen Liqin <liqin.chen@sunplusct.com>,
	Lennox Wu <lennox.wu@gmail.com>
Subject: Re: [PATCH] Fix a race in pid generation that causes pids to be reused immediately.
Date: Mon, 14 Jun 2010 16:58:51 -0700	[thread overview]
Message-ID: <20100614165851.6bdfe485.akpm@linux-foundation.org> (raw)
In-Reply-To: <20100611224902.5039.60134.stgit@bumblebee1.mtv.corp.google.com>

On Fri, 11 Jun 2010 15:49:54 -0700
Salman <sqazi@google.com> wrote:

> A program that repeatedly forks and waits is susceptible to having the
> same pid repeated, especially when it competes with another instance of the
> same program.  This is really bad for bash implementation.  Furthermore,
> many shell scripts assume that pid numbers will not be used for some length
> of time.
> 
> Race Description:
>
> ...
>
> diff --git a/kernel/pid.c b/kernel/pid.c
> index e9fd8c1..fbbd5f6 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -122,6 +122,43 @@ static void free_pidmap(struct upid *upid)
>  	atomic_inc(&map->nr_free);
>  }
>  
> +/*
> + * If we started walking pids at 'base', is 'a' seen before 'b'?
> + */
> +static int pid_before(int base, int a, int b)
> +{
> +	/*
> +	 * This is the same as saying
> +	 *
> +	 * (a - base + MAXUINT) % MAXUINT < (b - base + MAXUINT) % MAXUINT
> +	 * and that mapping orders 'a' and 'b' with respect to 'base'.
> +	 */
> +	return (unsigned)(a - base) < (unsigned)(b - base);
> +}

pid.c uses an exotic mix of `int' and `pid_t' to represent pids.  `int'
seems to preponderate.

> +/*
> + * We might be racing with someone else trying to set pid_ns->last_pid.
> + * We want the winner to have the "later" value, because if the
> + * "earlier" value prevails, then a pid may get reused immediately.
> + *
> + * Since pids rollover, it is not sufficient to just pick the bigger
> + * value.  We have to consider where we started counting from.
> + *
> + * 'base' is the value of pid_ns->last_pid that we observed when
> + * we started looking for a pid.
> + *
> + * 'pid' is the pid that we eventually found.
> + */
> +static void set_last_pid(struct pid_namespace *pid_ns, int base, int pid)
> +{
> +	int prev;
> +	int last_write = base;
> +	do {
> +		prev = last_write;
> +		last_write = cmpxchg(&pid_ns->last_pid, prev, pid);
> +	} while ((prev != last_write) && (pid_before(base, last_write, pid)));
> +}

<gets distracted>

hm.  For a long time cmpxchg() wasn't available on all architectures. 
That _seems_ to have been fixed.

arch/score assumes that cmpxchg() operates on unsigned longs.

arch/powerpc plays the necessary games to make 4- and 8-byte scalars work.

ia64 handles 1, 2, 4 and 8-byte quantities.

arm handles 1, 2 and 4-byte scalars.

as does blackfin.

So from the few architectures I looked at, it seems that we do indeed
handle cmpxchg() on all architectures although not very consistently. 
arch/score will blow up if someone tries to use cmpxchg() on 1- or
2-byte scalars.

<looks at the consumers>

infiniband deos cmpxchg() on u64*'s, which will blow up on many
architectures.

Using

	grep -r '[ 	]cmpxchg[^_]' . | grep -v /arch/

I can't see any cmpxchg() callers in truly generic code.  lockdep and
kernel/trace/ring_buffer.c aren't used on the more remote
architectures, I think.

Traditionally, atomic_cmpxchg() was the safe and portable one to use.



  parent reply	other threads:[~2010-06-14 23:59 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-06-11 17:17 [PATCH] Fix a race in pid generation that causes pids to be reused immediately Salman
2010-06-11 17:44 ` Linus Torvalds
2010-06-11 22:49   ` Salman
2010-06-11 23:07     ` Linus Torvalds
2010-06-14 23:58     ` Andrew Morton [this message]
2010-06-15  0:56       ` tytso
2010-06-15  1:55         ` Andrew Morton
2010-06-15  3:26           ` Paul Mackerras
2010-06-15  4:21             ` Andrew Morton
2010-06-15  4:38               ` Eric Dumazet
2010-06-15  6:57               ` Benjamin Herrenschmidt
2010-06-15  7:25               ` Paul Mackerras
2010-06-15 12:56           ` tytso
2010-06-15 13:06             ` Kyle McMartin
2010-06-15 14:35           ` Peter Zijlstra
2010-06-15 19:37             ` Andrew Morton
2010-06-15 18:35     ` [PATCH 0/1] (Was: Fix a race in pid generation that causes pids to be reused immediately.) Oleg Nesterov
2010-06-15 18:35       ` [PATCH 1/1] pids: alloc_pidmap: remove the unnecessary boundary checks Oleg Nesterov
  -- strict thread matches above, loose matches on Subject: below --
2010-06-10 21:24 [PATCH] Fix a race in pid generation that causes pids to be reused immediately Salman
2010-06-10 20:09 Salman
2010-06-10 20:38 ` tytso
2010-06-10 21:04   ` Salman Qazi
2010-06-09 21:00 Salman
2010-06-09 21:21 ` Linus Torvalds
2010-06-09 21:33   ` Peter Zijlstra
2010-06-09 22:20     ` Linus Torvalds
2010-06-09 22:27       ` Linus Torvalds
2010-06-10  0:08         ` Salman Qazi
2010-06-10  0:20           ` Linus Torvalds
     [not found]             ` <AANLkTilXJ0X2qxD9cNTlLayKzySEZu1HEZUWu--Go8kw@mail.gmail.com>
2010-06-10  5:55               ` Salman Qazi
2010-06-10 16:39                 ` Linus Torvalds
2010-06-09  6:24 Salman
2010-06-09  6:53 ` Andi Kleen
2010-06-09  9:48 ` Ingo Molnar
2010-06-09 15:39   ` Linus Torvalds
2010-06-09 15:50     ` tytso
2010-06-09 16:06       ` Linus Torvalds
2010-06-09 17:10         ` tytso
2010-06-09 17:23           ` Linus Torvalds
2010-06-09 17:25             ` Linus Torvalds
2010-06-09 17:34               ` tytso
2010-06-09 17:43                 ` Linus Torvalds
2010-06-09 17:47                   ` tytso
2010-06-09 18:09                     ` Salman Qazi
2010-06-09 11:49 ` Michel Lespinasse
2010-06-09 12:37   ` tytso
2010-06-09 12:17 ` tytso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100614165851.6bdfe485.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=lennox.wu@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liqin.chen@sunplusct.com \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=sqazi@google.com \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@google.com \
    --cc=walken@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.