From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Manfred Spraul <manfred@colorfullife.com>
Cc: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
linux-kernel@vger.kernel.org,
Steven Rostedt <rostedt@goodmis.org>,
Darren Hart <dvhart@linux.intel.com>,
David Miller <davem@davemloft.net>,
Eric Dumazet <eric.dumazet@gmail.com>,
Mike Galbraith <efault@gmx.de>
Subject: Re: [RFC][PATCH 3/3] ipc/sem: Rework wakeup scheme
Date: Thu, 15 Sep 2011 21:32:15 +0200
Message-ID: <1316115135.4060.19.camel@twins>
In-Reply-To: <4E7235F6.1030303@colorfullife.com>
On Thu, 2011-09-15 at 19:29 +0200, Manfred Spraul wrote:
> Hi Peter,
> What is broken?
I'm not quite sure yet, but the result is that sembench doesn't
complete properly; http://oss.oracle.com/~mason/sembench.c
What seems to be happening is that we get spurious wakeups in the
ipc/sem code, resulting in semtimedop returning -EINTR even though
there's no pending signal.
(there really should be an if (!signal_pending(current)) goto again thing
in that semtimedop wait loop)
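For illustration only, the suggested check can be modelled in plain userspace C. This is a sketch, not the ipc/sem code: `struct wakeup_event` and `sem_wait_model()` are hypothetical names, and the `signal_pending` field merely stands in for the kernel's `signal_pending(current)`. A wakeup that carries neither a final status nor a pending signal is treated as spurious and the loop goes around again.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* One wakeup as seen by the sleeping task (hypothetical model type). */
struct wakeup_event {
    bool status_written;  /* did the waker store the final q->status? */
    bool signal_pending;  /* stand-in for signal_pending(current) */
    int  status;          /* value the waker stored, if any */
};

/*
 * Walk through the wakeups in order and return what the wait loop would
 * hand back to the caller of semtimedop().
 */
int sem_wait_model(const struct wakeup_event *ev, size_t n)
{
    assert(ev != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ev[i].status_written)
            return ev[i].status;  /* operation completed (or failed) */
        if (ev[i].signal_pending)
            return -EINTR;        /* genuine interruption */
        /* spurious wakeup: this is the "goto again" case */
    }
    return -EAGAIN;               /* no more wakeups: models a timeout */
}
```

With this check in place, -EINTR is only returned when a signal really is pending; a bare wakeup just puts the task back to sleep.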
Adding a loop in userspace like:
again:
	ret = semtimedop(semid_lookup[l->id], &sb, 1, tvp);
	if (ret) {
		if (errno == EINTR) {
			l->spurious++;
			kill_tracer();
			goto again;
		}
		perror("semtimedop");
	}
makes it complete again (although performance seems to suffer a lot
compared to a kernel without this patch).
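The same workaround can be written as a small standalone helper. This is a minimal sketch under stated assumptions: `blocking_op`, `retry_on_eintr()` and `flaky_op()` are hypothetical names that do not appear in sembench, and `flaky_op()` is only a stand-in for a semtimedop() call that fails with EINTR a few times before succeeding.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type: returns 0 on success, -1 with errno set. */
typedef int (*blocking_op)(void *arg);

/*
 * Re-issue op(arg) for as long as it fails with EINTR, counting each retry
 * in *retries (may be NULL). Any other failure is passed through as -1.
 */
int retry_on_eintr(blocking_op op, void *arg, unsigned long *retries)
{
    assert(op != NULL);
    for (;;) {
        if (op(arg) == 0)
            return 0;
        if (errno != EINTR)
            return -1;
        if (retries)
            (*retries)++;
    }
}

/* Stand-in for semtimedop(): fails with EINTR *arg times, then succeeds. */
int flaky_op(void *arg)
{
    int *remaining = arg;

    if (*remaining > 0) {
        (*remaining)--;
        errno = EINTR;
        return -1;
    }
    return 0;
}
```

In real use the callback would wrap the semtimedop() call itself. Note that a blind retry like this also swallows genuine signal interruptions, and when a timeout is passed each retry starts the full timeout over again.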
It seems related to patch 2/3 converting the futex code; without that
patch I can't seem to reproduce it. All this is strange though, because if
there were multiple wakeups on the same task, wake_lists ought to result
in fewer wakeups in total, not more.
I've been trying to trace the thing but so far no luck.. when I enable
too much tracing it goes away.. silly heisenbugger.
> > +static void wake_up_sem_queue_prepare(struct wake_list_head *wake_list,
> > struct sem_queue *q, int error)
> > {
> > + struct task_struct *p = ACCESS_ONCE(q->sleeper);
> >
> > + get_task_struct(p);
> > + q->status = error;
> > + /*
> > + * implies a full barrier
> > + */
> > + wake_list_add(wake_list, p);
> > + put_task_struct(p);
> > }
> I think the get_task_struct()/put_task_struct is not necessary:
> Just do the wake_list_add() before writing q->status:
> wake_list_add() is identical to list_add_tail(&q->simple_list, pt).
> [except that it contains additional locking, which doesn't matter here]
But the moment we write q->status, q can disappear right?
Suppose the task gets a wakeup (say from a signal) right after we write
q->status. Then p can disappear (do_exit) and we'd try to enqueue dead
memory -> BOOM!
> > +static void wake_up_sem_queue_do(struct wake_list_head *wake_list)
> > {
> > + wake_up_list(wake_list, TASK_ALL);
> > }
> >
> wake_up_list() calls wake_up_state() that calls try_to_wake_up().
> try_to_wake_up() seems to return immediately when the state is TASK_DEAD.
>
> That leaves: Is it safe to call wake_up_list() in parallel with do_exit()?
> The current implementation avoids that.
Ah, wake_list_add() does get_task_struct() and wake_up_list() will first
issue the wakeup and then drop the reference.
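The reference handoff described here can be sketched as a toy userspace model using C11 atomics. Everything below is an assumption made for illustration: `struct task`, `get_task()`, `put_task()`, `wake_list_add_model()` and `wake_up_list_model()` are hypothetical and much simpler than the patch (no locking, no duplicate detection, and a `freed` flag instead of actually releasing memory).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for task_struct: a reference count plus bookkeeping flags. */
struct task {
    atomic_int usage;
    bool woken;
    bool freed;             /* set instead of freeing, so it can be inspected */
    struct task *wake_next;
};

struct wake_list {
    struct task *first;
};

void get_task(struct task *p)
{
    atomic_fetch_add(&p->usage, 1);
}

void put_task(struct task *p)
{
    /* fetch_sub returns the old value: 1 means the last reference is gone */
    if (atomic_fetch_sub(&p->usage, 1) == 1)
        p->freed = true;
}

/* Queue p for a later wakeup; the list takes its own reference. */
void wake_list_add_model(struct wake_list *wl, struct task *p)
{
    assert(wl != NULL && p != NULL);
    get_task(p);
    p->wake_next = wl->first;
    wl->first = p;
}

/* Wake every queued task and only then drop the list's reference. */
void wake_up_list_model(struct wake_list *wl)
{
    struct task *p = wl->first;

    wl->first = NULL;
    while (p) {
        struct task *next = p->wake_next;  /* read before p may go away */

        p->woken = true;                   /* stands in for the real wakeup */
        put_task(p);
        p = next;
    }
}
```

Because the list holds a reference from the moment the task is queued until after the wakeup has been issued, the task cannot be released in between, even if it exits and drops its own reference in the meantime.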
Hrmm.. it looks like it's all these atomic ops {get,put}_task_struct()
that are causing the performance drop.. I removed the ones in
wake_up_sem_queue_prepare() just for kicks and I got about half my
performance gap back.