From: Howard Chu <hyc@symas.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Robert Hancock <hancockr@shaw.ca>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: sched_yield() makes OpenLDAP slow
Date: Sat, 20 Aug 2005 11:38:06 -0700
Message-ID: <4307788E.1040209@symas.com>
In-Reply-To: <4306AF26.3030106@yahoo.com.au>
Nick Piggin wrote:
> Robert Hancock wrote:
> > I fail to see how sched_yield is going to be very helpful in this
> > situation. Since that call can sleep from a range of time ranging
> > from zero to a long time, it's going to give unpredictable results.
> Well, not sleep technically, but yield the CPU for some undefined
> amount of time.
Since the slapd server was not written to run in realtime, nor is it
commonly run on realtime operating systems, I don't believe predictable
timing here is a criterion we care about. One could say the same of
sigsuspend() by the way - it can pause a process for a range of time
ranging from zero to a long time. Should we tell application writers not
to use this function either, regardless of whether the developer thinks
they have a good reason to use it?
> > It seems to me that this sort of thing is why we have POSIX pthread
> > synchronization primitives.. sched_yield is basically there for a
> > process to indicate that "what I'm doing doesn't matter much, let
> > other stuff run". Any other use of it generally constitutes some
> > kind of hack.
In terms of transaction recovery, we do an exponential backoff on the
retries, because our benchmarks showed that under heavy lock contention,
immediate retries only made things worse. In fact, having arbitrarily
long backoff delays here was shown to improve transaction throughput.
(We use select() with an increasing timeval in combination with the
yield() call. One way or another we get a longer delay as desired.)
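
The retry scheme described above might be sketched roughly as follows. This is only an illustration, not the slapd code: the doubling factor, the one-second cap (BACKOFF_MAX_USEC), the yield-then-select ordering, and the names backoff_usec(), backoff_wait(), and retry_with_backoff() are all assumptions made for the example.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical cap so the delay cannot grow without bound. */
#define BACKOFF_MAX_USEC 1000000L

/* Delay before retry number `attempt` (0-based): base_usec doubled once
 * per earlier attempt, capped at BACKOFF_MAX_USEC. Doubling is assumed. */
long backoff_usec(long base_usec, int attempt)
{
    assert(base_usec > 0 && attempt >= 0);
    long delay = base_usec;
    while (attempt-- > 0 && delay < BACKOFF_MAX_USEC)
        delay *= 2;
    return delay < BACKOFF_MAX_USEC ? delay : BACKOFF_MAX_USEC;
}

/* Give up the CPU, then sleep for the backoff delay by calling select()
 * with no file descriptors and only a timeout. */
void backoff_wait(long base_usec, int attempt)
{
    long usec = backoff_usec(base_usec, attempt);
    struct timeval tv;
    tv.tv_sec = usec / 1000000L;
    tv.tv_usec = usec % 1000000L;
    sched_yield();
    select(0, NULL, NULL, NULL, &tv);
}

/* Call try_txn(ctx) until it returns 0 or max_attempts is used up.
 * Returns the number of attempts made, or -1 if every attempt failed. */
int retry_with_backoff(int (*try_txn)(void *), void *ctx,
                       int max_attempts, long base_usec)
{
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (try_txn(ctx) == 0)
            return attempt + 1;
        backoff_wait(base_usec, attempt);
    }
    return -1;
}
```

Whether the yield or the select() timeout ends up supplying most of the delay depends on the scheduler; either way the retry is pushed back, which is the effect the paragraph above is after.
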
sched_yield is there for a *thread* to indicate "what I'm doing doesn't
matter much, let other stuff run."
I suppose it may be a hack. But then so is TCP congestion control. In
both cases, empirical evidence indicates the hack is worthwhile. If you
haven't done the analysis then you're in no position to deny the value
of the approach.
> In SCHED_OTHER mode, you're right, sched_yield is basically
> meaningless.
> In a realtime system, there is a very well defined and probably
> useful behaviour.
> Eg. If 2 SCHED_FIFO processes are running at the same priority, One
> can call sched_yield to deterministically give the CPU to the other
> guy.
Well yes, the point of a realtime system is to provide deterministic
response times to unpredictable input.
I'll note that we removed a number of the yield calls (that were in
OpenLDAP 2.2) for the 2.3 release, because I found that they were
redundant and causing unnecessary delays. My own test system is running
on a Linux 2.6.12.3 kernel (installed over a SuSE 9.2 x86_64 distro),
and OpenLDAP 2.3 runs perfectly well here, now that those redundant
calls have been removed. But I also found that I needed to add a new
yield(), to work around yet another unexpected issue on this system - we
have a number of threads waiting on a condition variable, and the thread
holding the mutex signals the var, unlocks the mutex, and then
immediately relocks it. The expectation here is that upon unlocking the
mutex, the calling thread would block while some waiting thread (that
just got signaled) would get to run. In fact what happened is that the
calling thread unlocked and relocked the mutex without allowing any of
the waiting threads to run. In this case the only solution was to insert
a yield() after the mutex_unlock(). So again, for those of you claiming
"oh, all you need to do is use a condition variable or any of the other
POSIX synchronization primitives" - yes, that's a nice theory, but
reality says otherwise.
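
A minimal sketch of the signal / unlock / yield / relock pattern just described, assuming a simple counter protected by the mutex. The struct work_queue type and the consumer() and produce_one() functions are hypothetical names for this example, not OpenLDAP code.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical shared state for the example. */
struct work_queue {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    int pending;    /* items waiting to be consumed */
    int consumed;   /* items a waiter has picked up */
};

/* Waiter: blocks on the condition variable until an item is pending. */
void *consumer(void *arg)
{
    struct work_queue *q = arg;
    pthread_mutex_lock(&q->lock);
    while (q->pending == 0)
        pthread_cond_wait(&q->ready, &q->lock);
    assert(q->pending > 0);   /* predicate holds while we own the mutex */
    q->pending--;
    q->consumed++;
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* Signaler: posts one item, wakes a waiter, drops the mutex, and yields
 * before taking the mutex again. */
void produce_one(struct work_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->pending++;
    pthread_cond_signal(&q->ready);
    pthread_mutex_unlock(&q->lock);

    /* Without this, the same thread may relock immediately and the
     * signaled waiter never gets a chance to run in between. */
    sched_yield();

    pthread_mutex_lock(&q->lock);
    /* ... further work that needs the mutex would go here ... */
    pthread_mutex_unlock(&q->lock);
}
```

Note that POSIX does not promise the yield hands the mutex to a waiter; the sketch only shows where the call sits. In this example the waiter still consumes the item eventually, because the signaler releases the mutex again at the end.
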
To say that sched_yield is basically meaningless is far overstating your
point.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc
OpenLDAP Core Team http://www.openldap.org/project/