From: Howard Chu <hyc@symas.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Robert Hancock <hancockr@shaw.ca>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: sched_yield() makes OpenLDAP slow
Date: Sat, 20 Aug 2005 11:38:06 -0700
Message-ID: <4307788E.1040209@symas.com>
In-Reply-To: <4306AF26.3030106@yahoo.com.au>

Nick Piggin wrote:
>  Robert Hancock wrote:
> > I fail to see how sched_yield is going to be very helpful in this
> > situation. Since that call can sleep from a range of time ranging
> > from zero to a long time, it's going to give unpredictable results.

>  Well, not sleep technically, but yield the CPU for some undefined
>  amount of time.

Since the slapd server was not written to run in realtime, nor is it
commonly run on realtime operating systems, I don't believe predictable
timing here is a criterion we care about. One could say the same of
sigsuspend(), by the way: it can pause a process for a range of time
ranging from zero to a long time. Should we tell application writers not
to use this function either, regardless of whether the developer thinks
they have a good reason to use it?

> > It seems to me that this sort of thing is why we have POSIX pthread
> > synchronization primitives.. sched_yield is basically there for a
> > process to indicate that "what I'm doing doesn't matter much, let
> > other stuff run". Any other use of it generally constitutes some
> > kind of hack.

In terms of transaction recovery, we do an exponential backoff on the 
retries, because our benchmarks showed that under heavy lock contention, 
immediate retries only made things worse. In fact, having arbitrarily 
long backoff delays here was shown to improve transaction throughput. 
(We use select() with an increasing timeval in combination with the 
yield() call. One way or another we get a longer delay as desired.)
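
A minimal sketch of the backoff described above, for illustration only. This is
not the slapd code: the helper names backoff_delay_us() and backoff_wait(), the
doubling factor, and the base/cap parameters are all assumptions. It only shows
how a yield can be combined with a select() call that is given a growing timeval.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: delay in microseconds for retry number `attempt`
 * (0-based). The delay doubles per attempt and is capped at max_us. */
long backoff_delay_us(int attempt, long base_us, long max_us)
{
    assert(attempt >= 0 && base_us > 0 && max_us >= base_us);
    long d = base_us;
    for (int i = 0; i < attempt; i++) {
        if (d > max_us / 2)      /* doubling would exceed the cap */
            return max_us;
        d *= 2;
    }
    return d;
}

/* Hypothetical helper: offer the CPU to other threads, then sleep for the
 * backoff delay. select() with no descriptors acts as a sub-second sleep. */
void backoff_wait(int attempt, long base_us, long max_us)
{
    long us = backoff_delay_us(attempt, base_us, max_us);
    struct timeval tv;
    tv.tv_sec = us / 1000000;
    tv.tv_usec = us % 1000000;

    sched_yield();
    select(0, NULL, NULL, NULL, &tv);
}
```

A retry loop would call backoff_wait(n, ...) after the n-th failed attempt, so
each retry waits at least as long as the previous one, up to the cap.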

sched_yield is there for a *thread* to indicate "what I'm doing doesn't 
matter much, let other stuff run."

I suppose it may be a hack. But then so is TCP congestion control. In 
both cases, empirical evidence indicates the hack is worthwhile. If you 
haven't done the analysis then you're in no position to deny the value 
of the approach.

>  In SCHED_OTHER mode, you're right, sched_yield is basically
>  meaningless.

>  In a realtime system, there is a very well defined and probably
>  useful behaviour.

>  Eg. If 2 SCHED_FIFO processes are running at the same priority, One
>  can call sched_yield to deterministically give the CPU to the other
>  guy.

Well yes, the point of a realtime system is to provide deterministic 
response times to unpredictable input.

I'll note that we removed a number of the yield calls (that were in
OpenLDAP 2.2) for the 2.3 release, because I found that they were
redundant and causing unnecessary delays. My own test system is running
a Linux 2.6.12.3 kernel (installed over a SuSE 9.2 x86_64 distro),
and OpenLDAP 2.3 runs perfectly well here, now that those redundant
calls have been removed.

But I also found that I needed to add a new yield() to work around yet
another unexpected issue on this system. We have a number of threads
waiting on a condition variable, and the thread holding the mutex signals
the condition variable, unlocks the mutex, and then immediately relocks
it. The expectation here is that upon unlocking the mutex, the calling
thread would block while some waiting thread (that just got signaled)
would get to run. In fact what happened is that the calling thread
unlocked and relocked the mutex without allowing any of the waiting
threads to run. In this case the only solution was to insert a yield()
after the mutex_unlock(). So again, for those of you claiming "oh, all
you need to do is use a condition variable or any of the other POSIX
synchronization primitives" - yes, that's a nice theory, but reality
says otherwise.
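
The signal/unlock/yield/relock sequence described above can be sketched as
follows. This is an illustration, not the slapd code: struct work_queue,
hand_off(), and take_one() are hypothetical names, and the pending counter is an
assumed predicate for the waiters. Note that under SCHED_OTHER the yield is only
a hint; nothing here guarantees that a waiter runs before the relock.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical shared state protected by mu. */
struct work_queue {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    int             pending;   /* number of items waiters may consume */
};

/* Producer side. Called with q->mu held; returns with q->mu held again. */
void hand_off(struct work_queue *q)
{
    q->pending++;
    pthread_cond_signal(&q->cv);
    pthread_mutex_unlock(&q->mu);
    sched_yield();               /* give a signaled waiter a chance to run */
    pthread_mutex_lock(&q->mu);
}

/* Waiter side. Blocks until an item is pending, then consumes it. */
void take_one(struct work_queue *q)
{
    pthread_mutex_lock(&q->mu);
    while (q->pending == 0)      /* loop guards against spurious wakeups */
        pthread_cond_wait(&q->cv, &q->mu);
    assert(q->pending > 0);
    q->pending--;
    pthread_mutex_unlock(&q->mu);
}
```

Without the sched_yield() call, the producer may reacquire the mutex before any
signaled waiter has been scheduled, which is the behaviour reported above.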

To say that sched_yield is basically meaningless is far overstating your 
point.
-- 
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc
  OpenLDAP Core Team            http://www.openldap.org/project/

