Re: [RFC] Semaphores used for daemon wakeup

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Daniel Phillips <phillips@innominate.de>
To: Tim Wright <timw@splhi.com>, linux-kernel@vger.kernel.org
Subject: Re: [RFC] Semaphores used for daemon wakeup
Date: Wed, 20 Dec 2000 02:34:56 +0100	[thread overview]
Message-ID: <3A400CC0.8F21D000@innominate.de> (raw)
In-Reply-To: <0012171922570J.00623@gimli> <20001218193405.A24041@scutter.internal.splhi.com> <3A3F5E74.3F1988AB@innominate.de> <20001219080759.A24812@scutter.internal.splhi.com>

Tim Wright wrote:
> 
> Hi Daniel,
> On Tue, Dec 19, 2000 at 02:11:16PM +0100, Daniel Phillips wrote:
> [...]
> > I'm curious, is my method of avoiding the deadlock race the same as
> > yours?  My solution is to keep a count of tasks that 'intend' to take
> > the down():
> >
> >         atomic_inc(&bdflush_waiters);
> >         up(&bdflush_request);
> >         down(&bdflush_waiter);
> >
> > so that bdflush will issue the correct number of up's even if the waiter
> > has not yet gone to sleep.  IOW, is your approach in DYNIX the same only
> > in spirit, or in detail?
> >
> > --
> > Daniel
> 
> OK,
> this is not how we generally would achieve the goal, although the approach
> looks valid. We have a number of primitives available that are not currently
> used in Linux (unless I'm losing my eyesight :-)
> We use p_sema, and v_sema for down and up respectively (this was done many
> years ago, and the names are in deference to Edsger Dijkstra.
> For normal semaphores (as opposed to read/writer or other variants), we have
> sema_t sema;
> init_sema(&sema, 1);    /* initialize semaphore & set initial count */
> p_sema(&sema, PZERO);   /* "grab" semaphore and set process priority */
>                         /* priority < PZERO == sleep uninterruptibly */
> v_sema(&sema);          /* release semaphore (i.e. increment count) */
> cp_sema(&sema);         /* Attempt to grab semaphore iff free else EBUSY */
> vall_sema(&sema);       /* Wake up all sleepers on this semaphore */
> blocked_sema(&sema);    /* boolean: any sleepers ? */
> p_sema_v_lock(&sema, priority, &lock);  /* atomically release the lock AND */
>                         /* go to sleep on the semaphore */
> 
> Simple spinlock primitives are similar (e.g. p_lock ...), but the last
> primitive above is the key to avoiding many races. The classic coding style
> in DYNIX/ptx (this for buffer allocation) is then:
> 
> dmabuf_init(...);
> {
>         ...
>         init_sema(&dmabuf_wait, 0);
>         init_lock(&dmabuf_mutex);
>         ...
> }
> 
> dmabuf_alloc(...)
> {
>         spl_t saved_spl;
>         ...
>         while (1) {
>                 saved_spl = p_lock(&dmabuf_mutex, SPLSWP);
>                 attempt to grab a free buffer;
>                 if (success){
>                         v_lock(&dmabuf_mutex, saved_spl);
>                         return;
>                 } else {
>                         p_sema_v_lock(&dmabuf_wait, PSWP+1, &dmabuf_mutex);
>                 }
>         }
> }
> 
> dmabuf_free(...)
> {
>         spl_t saved_spl;
>         ...
>         saved_spl = p_lock(&dmabuf_mutex, SPLHI);
>         free up buffer;
>         if (blocked_sema(&dmabuf_wait)) {
>                 vall_sema(&dmabuf_wait);
>         }
>         v_lock(&dmabuf_mutex, s);
> }
> 
> As you can see, the spinlocks ensure no races, and the key is the atomicity
> of p_sema_v_lock(). No-one can race in and sleep on dmabuf_wait, because
> they have to hold dmabuf_mutex to do so. Exactly the same mechanism would
> work for the bdflush problem.

Yes, I see.  There are a lot of similarities to the situation I
described.  The main difference between this situation and bdflush is
that dmabuf_free isn't really waiting on dmabuf_alloc to fullfill a
condition (other than to get out of its exclusion region) while bdflush
can have n waiters.

If I could have a new primitive for this job it would be up_down(sem1,
sem2), atomic with respect to a sleeper on sem1.  And please give me an
up_all for good measure.  Then for a task wanting to wait on bdflush I
could write:

        up_down(&bdflush_request, &bdflush_waiter);

And in bdflush, just:

        up_all(&bdflush_waiter);
        down(&bdflush_request);

But I found I could do the job with existing primitives so I did.

Originally I wrote:

        int waiters = xchg(&bdflush_waiters.count, 0);
        while (waiters--)
                up(&bdflush_waiter);

which uses one less atomic op but, as Philip Rumpf pointed out to me,
doesn't work on Sparc.  Oh well.  On Intel, the extra read is
practically free.  I could have gone at it by making a new primitive:

int atomic_read_and_clear(atomic_t *p)
{
	int n = atomic_read(p);
	atomic_sub(p, n);
	return n;
}

and on arch i86 it would become:

	#define atomic_read_and_clear(p) (xchg(p, 0))

> One can argue the relative merits of the different approaches. I suspect that
> the above code is less bus-intensive relative to the atomic inc/dec/count ops,
> but I may be wrong.

I couldn't say, because your mechanism would need to be elaborated a
little to handle bdflush's multiple waiters, and I don't know exactly
what your up_and_wait would look like.  Do spinlocks work for bdflush,
or would you have to go to semaphores?  (If the latter you arrive at my
up_down primitive, which is interesting.)  It's even hard to say whether
my approach is faster or slower than the existing approach.  Ultimately,
up() calls wake_up() and down() calls both add_wait_queue() and
remove_wait_queue(), so I lose a little there.  I win in the common case
of the non-blocking wakeup, which normally runs through Ben Lahaises's
lovingly handcrafted fast path in up(), whereas the existing code uses
the more involved wake_up_process().  What's clear is, they are all
plenty fast enough for this application, and what I'm really trying for
is readability.

--
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

next prev parent reply	other threads:[~2000-12-20  2:07 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2000-12-17 12:06 [RFC] Semaphores used for daemon wakeup Daniel Phillips
2000-12-19  0:14 ` Nigel Gamble
2000-12-19  3:34 ` Tim Wright
2000-12-19 13:11   ` Daniel Phillips
2000-12-19 16:07     ` Tim Wright
2000-12-20  1:34       ` Daniel Phillips [this message]
2000-12-21 16:28         ` Tim Wright
2000-12-19  9:01 ` Daniel Phillips
     [not found] <3A42380B.6E9291D1@sgi.com>
2000-12-21 19:30 ` Paul Cassella
2000-12-21 22:19   ` Tim Wright
2000-12-22  1:12   ` Daniel Phillips
2000-12-22  1:50   ` Daniel Phillips
2000-12-22  4:26     ` Paul Cassella
2000-12-22 11:46       ` Daniel Phillips
2000-12-22 15:33         ` Tim Wright
2000-12-22 17:25           ` Daniel Phillips
2000-12-22 17:32   ` Brian Pomerantz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3A400CC0.8F21D000@innominate.de \
    --to=phillips@innominate.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=timw@splhi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox