From: Simon Kirby <sim@hostway.ca>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Waiman Long <Waiman.Long@hp.com>,
Ian Applegate <ia@cloudflare.com>,
Al Viro <viro@zeniv.linux.org.uk>,
Christoph Lameter <cl@gentwo.org>,
Pekka Enberg <penberg@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Chris Mason <chris.mason@fusionio.com>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH] mutexes: Add CONFIG_DEBUG_MUTEX_FASTPATH=y debug variant to debug SMP races
Date: Wed, 4 Dec 2013 01:19:03 -0800 [thread overview]
Message-ID: <20131204091903.GA18675@hostway.ca> (raw)
In-Reply-To: <CA+55aFxhUYi6Lbrxfjgq7OqSDMF24wqvnGn_J_RAMY7y-AmRiQ@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 1729 bytes --]
On Tue, Dec 03, 2013 at 10:10:29AM -0800, Linus Torvalds wrote:
> On Tue, Dec 3, 2013 at 12:52 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > I'd expect such bugs to be more prominent with unlucky object
> > size/alignment: if mutex->count lies on a separate cache line from
> > mutex->wait_lock.
>
> I doubt that makes much of a difference. It's still just "CPU cycles"
> away, and the window will be tiny unless you have multi-socket
> machines and/or are just very unlucky.
>
> For stress-testing, it would be much better to use some hack like
>
> /* udelay a bit if the spinlock isn't contended */
> if (mutex->wait_lock.ticket.head+1 == mutex->wait_lock.ticket.tail)
> udelay(1);
>
> in __mutex_unlock_common() just before the spin_unlock(). Make the
> window really *big*.
I haven't had a chance yet to do much testing of the proposed race fix
and race enlarging patches, but I did write a tool to reproduce the race.
I started it running and left for dinner, and sure enough, it actually
seems to work on plain 3.12 on a Dell PowerEdge r410 w/dual E5520,
reproducing "Poison overwritten" at a rate of about once every 15 minutes
(running 6 in parallel, booted with "slub_debug").
I have no idea if actually relying on tsc alignment between cores and
sockets is a reasonable idea these days, but it seems to work. I first
used a read() on a pipe close()d by the other process to synchronize
them, but this didn't seem to work as well as busy-waiting until the
timestamp counters pass a previously-decided-upon start time.
Meanwhile, I still don't understand how moving the unlock _up_ to cover
less of the code can solve the race, but I will stare at your long
explanation more tomorrow.
Simon-
[-- Attachment #2: piperace.c --]
[-- Type: text/x-csrc, Size: 1652 bytes --]
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#define MAX_PIPES 450
#define FORK_CYCLES 2000000
#define CLOSE_CYCLES 100000
#define STAT_SHIFT 6
static inline unsigned long readtsc()
{
unsigned int low, high;
asm volatile("rdtsc" : "=a" (low), "=d" (high));
return low | ((unsigned long)(high) << 32);
}
static int pipefd[MAX_PIPES][2];
int main(int argc, char *argv[]) {
unsigned long loop, race_start, miss;
unsigned long misses = 0, races = 0;
int i;
pid_t pid;
struct rlimit rlim = {
.rlim_cur = MAX_PIPES * 2 + 96,
.rlim_max = MAX_PIPES * 2 + 96,
};
if (setrlimit(RLIMIT_NOFILE, &rlim) != 0)
perror("setrlimit(RLIMIT_NOFILE)");
for (loop = 1;;loop++) {
/*
* Make a bunch of pipes
*/
for (i = 0;i < MAX_PIPES;i++) {
if (pipe(pipefd[i]) == -1) {
perror("pipe()");
exit(EXIT_FAILURE);
}
}
race_start = readtsc() + FORK_CYCLES;
asm("":::"memory");
pid = fork();
if (pid == -1) {
perror("fork()");
exit(EXIT_FAILURE);
}
pid = !!pid;
/*
* Close one pipe descriptor per process
*/
for (i = 0;i < MAX_PIPES;i++)
close(pipefd[i][!pid]);
for (i = 0;i < MAX_PIPES;i++) {
/*
* Line up and try to close at the same time
*/
miss = 1;
while (readtsc() < race_start)
miss = 0;
close(pipefd[i][pid]);
misses+= miss;
race_start+= CLOSE_CYCLES;
}
races+= MAX_PIPES;
if (!(loop & (~(~0UL << STAT_SHIFT))))
fprintf(stderr, "%c %lu (%.2f%% false starts)\n",
"CP"[pid], readtsc(), misses * 100. / races);
if (pid)
wait(NULL); /* Parent */
else
exit(0); /* Child */
}
}
next prev parent reply other threads:[~2013-12-04 9:19 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-02 16:00 Found it! (was Re: [3.10] Oopses in kmem_cache_allocate() via prepare_creds()) Linus Torvalds
2013-12-02 16:27 ` Ingo Molnar
2013-12-02 16:46 ` Al Viro
2013-12-02 17:05 ` Ingo Molnar
2013-12-02 17:06 ` Al Viro
2013-12-03 2:58 ` Linus Torvalds
2013-12-03 4:28 ` Al Viro
2013-12-05 8:12 ` gfs2 deadlock (was Re: Found it) Al Viro
2013-12-05 10:19 ` Steven Whitehouse
2013-12-03 8:52 ` [PATCH] mutexes: Add CONFIG_DEBUG_MUTEX_FASTPATH=y debug variant to debug SMP races Ingo Molnar
2013-12-03 18:10 ` Linus Torvalds
2013-12-04 9:19 ` Simon Kirby [this message]
2013-12-04 21:14 ` Linus Torvalds
2013-12-05 8:06 ` Simon Kirby
2013-12-05 6:57 ` Simon Kirby
2013-12-11 15:03 ` Waiman Long
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131204091903.GA18675@hostway.ca \
--to=sim@hostway.ca \
--cc=Waiman.Long@hp.com \
--cc=chris.mason@fusionio.com \
--cc=cl@gentwo.org \
--cc=ia@cloudflare.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=penberg@kernel.org \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox