From: Daniel Phillips <phillips@innominate.de>
To: Tim Wright <timw@splhi.com>, linux-kernel@vger.kernel.org
Subject: Re: [RFC] Semaphores used for daemon wakeup
Date: Wed, 20 Dec 2000 02:34:56 +0100 [thread overview]
Message-ID: <3A400CC0.8F21D000@innominate.de> (raw)
In-Reply-To: <0012171922570J.00623@gimli> <20001218193405.A24041@scutter.internal.splhi.com> <3A3F5E74.3F1988AB@innominate.de> <20001219080759.A24812@scutter.internal.splhi.com>
Tim Wright wrote:
>
> Hi Daniel,
> On Tue, Dec 19, 2000 at 02:11:16PM +0100, Daniel Phillips wrote:
> [...]
> > I'm curious, is my method of avoiding the deadlock race the same as
> > yours? My solution is to keep a count of tasks that 'intend' to take
> > the down():
> >
> > atomic_inc(&bdflush_waiters);
> > up(&bdflush_request);
> > down(&bdflush_waiter);
> >
> > so that bdflush will issue the correct number of up's even if the waiter
> > has not yet gone to sleep. IOW, is your approach in DYNIX the same only
> > in spirit, or in detail?
> >
> > --
> > Daniel
>
> OK,
> this is not how we generally would achieve the goal, although the approach
> looks valid. We have a number of primitives available that are not currently
> used in Linux (unless I'm losing my eyesight :-)
> We use p_sema, and v_sema for down and up respectively (this was done many
> years ago, and the names are in deference to Edsger Dijkstra.
> For normal semaphores (as opposed to read/writer or other variants), we have
> sema_t sema;
> init_sema(&sema, 1); /* initialize semaphore & set initial count */
> p_sema(&sema, PZERO); /* "grab" semaphore and set process priority */
> /* priority < PZERO == sleep uninterruptibly */
> v_sema(&sema); /* release semaphore (i.e. increment count) */
> cp_sema(&sema); /* Attempt to grab semaphore iff free else EBUSY */
> vall_sema(&sema); /* Wake up all sleepers on this semaphore */
> blocked_sema(&sema); /* boolean: any sleepers ? */
> p_sema_v_lock(&sema, priority, &lock); /* atomically release the lock AND */
> /* go to sleep on the semaphore */
>
> Simple spinlock primitives are similar (e.g. p_lock ...), but the last
> primitive above is the key to avoiding many races. The classic coding style
> in DYNIX/ptx (this for buffer allocation) is then:
>
> dmabuf_init(...);
> {
> ...
> init_sema(&dmabuf_wait, 0);
> init_lock(&dmabuf_mutex);
> ...
> }
>
> dmabuf_alloc(...)
> {
> spl_t saved_spl;
> ...
> while (1) {
> saved_spl = p_lock(&dmabuf_mutex, SPLSWP);
> attempt to grab a free buffer;
> if (success){
> v_lock(&dmabuf_mutex, saved_spl);
> return;
> } else {
> p_sema_v_lock(&dmabuf_wait, PSWP+1, &dmabuf_mutex);
> }
> }
> }
>
> dmabuf_free(...)
> {
> spl_t saved_spl;
> ...
> saved_spl = p_lock(&dmabuf_mutex, SPLHI);
> free up buffer;
> if (blocked_sema(&dmabuf_wait)) {
> vall_sema(&dmabuf_wait);
> }
> v_lock(&dmabuf_mutex, s);
> }
>
> As you can see, the spinlocks ensure no races, and the key is the atomicity
> of p_sema_v_lock(). No-one can race in and sleep on dmabuf_wait, because
> they have to hold dmabuf_mutex to do so. Exactly the same mechanism would
> work for the bdflush problem.
Yes, I see. There are a lot of similarities to the situation I
described. The main difference between this situation and bdflush is
that dmabuf_free isn't really waiting on dmabuf_alloc to fullfill a
condition (other than to get out of its exclusion region) while bdflush
can have n waiters.
If I could have a new primitive for this job it would be up_down(sem1,
sem2), atomic with respect to a sleeper on sem1. And please give me an
up_all for good measure. Then for a task wanting to wait on bdflush I
could write:
up_down(&bdflush_request, &bdflush_waiter);
And in bdflush, just:
up_all(&bdflush_waiter);
down(&bdflush_request);
But I found I could do the job with existing primitives so I did.
Originally I wrote:
int waiters = xchg(&bdflush_waiters.count, 0);
while (waiters--)
up(&bdflush_waiter);
which uses one less atomic op but, as Philip Rumpf pointed out to me,
doesn't work on Sparc. Oh well. On Intel, the extra read is
practically free. I could have gone at it by making a new primitive:
int atomic_read_and_clear(atomic_t *p)
{
int n = atomic_read(p);
atomic_sub(p, n);
return n;
}
and on arch i86 it would become:
#define atomic_read_and_clear(p) (xchg(p, 0))
> One can argue the relative merits of the different approaches. I suspect that
> the above code is less bus-intensive relative to the atomic inc/dec/count ops,
> but I may be wrong.
I couldn't say, because your mechanism would need to be elaborated a
little to handle bdflush's multiple waiters, and I don't know exactly
what your up_and_wait would look like. Do spinlocks work for bdflush,
or would you have to go to semaphores? (If the latter you arrive at my
up_down primitive, which is interesting.) It's even hard to say whether
my approach is faster or slower than the existing approach. Ultimately,
up() calls wake_up() and down() calls both add_wait_queue() and
remove_wait_queue(), so I lose a little there. I win in the common case
of the non-blocking wakeup, which normally runs through Ben Lahaises's
lovingly handcrafted fast path in up(), whereas the existing code uses
the more involved wake_up_process(). What's clear is, they are all
plenty fast enough for this application, and what I'm really trying for
is readability.
--
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
next prev parent reply other threads:[~2000-12-20 2:07 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2000-12-17 12:06 [RFC] Semaphores used for daemon wakeup Daniel Phillips
2000-12-19 0:14 ` Nigel Gamble
2000-12-19 3:34 ` Tim Wright
2000-12-19 13:11 ` Daniel Phillips
2000-12-19 16:07 ` Tim Wright
2000-12-20 1:34 ` Daniel Phillips [this message]
2000-12-21 16:28 ` Tim Wright
2000-12-19 9:01 ` Daniel Phillips
[not found] <3A42380B.6E9291D1@sgi.com>
2000-12-21 19:30 ` Paul Cassella
2000-12-21 22:19 ` Tim Wright
2000-12-22 1:12 ` Daniel Phillips
2000-12-22 1:50 ` Daniel Phillips
2000-12-22 4:26 ` Paul Cassella
2000-12-22 11:46 ` Daniel Phillips
2000-12-22 15:33 ` Tim Wright
2000-12-22 17:25 ` Daniel Phillips
2000-12-22 17:32 ` Brian Pomerantz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3A400CC0.8F21D000@innominate.de \
--to=phillips@innominate.de \
--cc=linux-kernel@vger.kernel.org \
--cc=timw@splhi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox