From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>,
ebiederm@xmission.com, Al Viro <viro@zeniv.linux.org.uk>,
Andrew Morton <akpm@linux-foundation.org>,
Ingo Molnar <mingo@kernel.org>,
Paul McKenney <paulmck@linux.vnet.ibm.com>,
mhocko@suse.cz, LKML <linux-kernel@vger.kernel.org>,
ktsan@googlegroups.com, Kostya Serebryany <kcc@google.com>,
Andrey Konovalov <andreyknvl@google.com>,
Alexander Potapenko <glider@google.com>,
Hans Boehm <hboehm@google.com>
Subject: Re: [PATCH] kernel: fix data race in put_pid
Date: Fri, 18 Sep 2015 10:51:56 +0200
Message-ID: <20150918085156.GS3816@twins.programming.kicks-ass.net>
In-Reply-To: <20150917180919.GA32116@redhat.com>

On Thu, Sep 17, 2015 at 08:09:19PM +0200, Oleg Nesterov wrote:
> On 09/17, Dmitry Vyukov wrote:
> >
> > I can update the patch description, but let me explain it here first.
>
> Yes thanks.
>
> > Here is the essence of what happens:
>
> Aha, so you really meant that 2 put_pid's can race with each other,
>
> > // thread 1
> > 1: pid->foo = 1; // foo is the first word of pid object
> > // then it does put_pid
> > 2: atomic_dec_and_test(&pid->count) // decrements count to 1 and
> > returns false so the function returns
> >
> > // thread 2
> > // executes put_pid
> > 3: atomic_load(&pid->count); // returns 1, so proceed to kmem_cache_free
> > // then kmem_cache_free does:
> > 4: *(void**)pid = head->freelist;
> > 5: head->freelist = (void*)pid;
> >
> > This can be executed as:
> >
> > 4: *(void**)pid = head->freelist;
> > 1: pid->foo = 1; // foo is the first word of pid object
> > 2: atomic_dec_and_test(&pid->count) // decrements count to 1 and
> > returns false so the function returns
> > 3: atomic_load(&pid->count); // returns 1, so proceed to kmem_cache_free
> > 5: head->freelist = (void*)pid;
>
> Unless I am totally confused, everything is simpler. We can forget
> about the hoisting, freelist, etc.
>
> Thread 2 can see the result of atomic_dec_and_test(), but not the
> result of "pid->foo = 1". In this case it can free the object, which
> can be re-allocated _before_ STORE(pid->foo) completes. Of course,
> this would be really bad.
>
> I need to recheck, but afaics this is not possible. This optimization
> is fine, but probably needs a comment.
For sure, this code doesn't make any sense to me.
> We rely on delayed_put_pid()
> called by RCU. And note that nobody can write to this pid after it
> is removed from the rcu-protected list.
>
> So I think this is false alarm, but I'll try to recheck tomorrow, it
> is too late for me today.
As an alternative patch, could we not do:
void put_pid(struct pid *pid)
{
	struct pid_namespace *ns;

	if (!pid)
		return;

	ns = pid->numbers[pid->level].ns;
	if ((atomic_read(&pid->count) == 1) ||
	     atomic_dec_and_test(&pid->count)) {
+		smp_read_barrier_depends(); /* ctrl-dep */
		kmem_cache_free(ns->pid_cachep, pid);
		put_pid_ns(ns);
	}
}
That would upgrade the atomic_read() path to a full READ_ONCE_CTRL(),
and thereby avoid any of the kmem_cache_free() stores from leaking out.
And it's free, except on Alpha. Whereas the atomic_read_acquire() will
generate a full memory barrier on a whole bunch of archs.