* Re: [PATCH] Lightweight userspace semphores
@ 2002-02-27  8:43 Martin Wirth
  2002-02-27 15:24 ` Hubertus Franke
  0 siblings, 1 reply; 8+ messages in thread

From: Martin Wirth @ 2002-02-27  8:43 UTC (permalink / raw)
To: frankeh, linux-kernel

Hi Hubertus,

I just had a quick look at your semaphore code. As far as I can see you
do not re-check the userspace semaphore counter while going to sleep on
your kernel semaphore. But that way you may lose synchronization between
the kernel semaphore and the user-space semaphore counter if more than
two processes are involved. Or did I miss some tricky way in which you
avoided this problem?

Martin Wirth

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: [PATCH] Lightweight userspace semphores
  2002-02-27  8:43 [PATCH] Lightweight userspace semphores Martin Wirth
@ 2002-02-27 15:24 ` Hubertus Franke
  2002-02-27 17:17   ` Martin Wirth
  0 siblings, 1 reply; 8+ messages in thread

From: Hubertus Franke @ 2002-02-27 15:24 UTC (permalink / raw)
To: Martin Wirth; +Cc: linux-kernel

On Wed, Feb 27, 2002 at 09:43:45AM +0100, Martin Wirth wrote:
> Hi Hubertus,
>
> I just had a quick look at your semaphore code. As far as I can see
> you do not re-check the userspace semaphore counter while going to
> sleep on your kernel semaphore. But that way you may lose
> synchronization between the kernel semaphore and the user-space
> semaphore counter if more than two processes are involved. Or did I
> miss some tricky way in which you avoided this problem?
>
> Martin Wirth

Yes, you are missing something. I assume you are looking at the usema
(non-spinning) version. The trick is that the kernel semaphore stores a
token in case a race condition occurs, and it resolves the race that
way. I rely on the fact that the kernel provides only a wait queue and
nothing else; the entire state of how many are waiting is stored in
user space. Effectively you have a P operation in user space followed
by a P operation in kernel space, and the same for the V operation.
Assuming you have two or more processes 0, 1, 2, ... you can show that
race conditions are properly resolved. Again, the trick is not to sync
the state of the kernel and the user level. It comes naturally if you
properly separate the duties.

The spinning versions were much more interesting; you can really get
some tricky situations with 3 or more processes. The ulockflex program
really helped to identify problems.

In future messages please point out which of the user locks is in
question. Remember I provide several:

  (1) semaphores
  (2) semaphores with timed spinning
  (3) convoy avoidance locks
  (4) convoy avoidance locks with spinning
  (5) shared locks == multiple reader/single writer locks

This list is growing as we speak. I also modified the code (I will post
the update at the lse site). This also fixes a small bug in the
ulockflex program that I introduced in the last posting at lse. I took
Ben LaHaise's suggestion and provide multiple runqueues up to a limit,
rather than explicitly coding exclusive/shared versions.
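[Editor's note: the "P operation in user space followed by a P operation
in kernel space" shape described above can be sketched as follows. This
is a hedged reconstruction in C11, not the patch's actual code: the
ulock_t layout is illustrative, and ulock_wait/ulock_wakeup are
stand-ins for the kernel entry points, stubbed out here so that only
the atomic fast paths are real.]

```c
#include <stdatomic.h>

typedef struct { atomic_int count; } ulock_t;   /* layout illustrative */

static int kernel_calls;  /* counts slow-path entries into the "kernel" */

/* Stand-ins for the kernel entry points (wait queue / wakeup). */
static int ulock_wait(ulock_t *l, int q)
{
    (void)l; (void)q;
    kernel_calls++;
    return 0;
}

static int ulock_wakeup(ulock_t *l, int q)
{
    (void)l; (void)q;
    kernel_calls++;
    return 0;
}

/* P: user-space atomic decrement first; kernel P only on contention. */
static int usema_down(ulock_t *l)
{
    if (atomic_fetch_sub(&l->count, 1) > 0)
        return 0;                /* uncontended: no syscall at all   */
    return ulock_wait(l, 0);     /* second half of P, in the kernel  */
}

/* V: user-space atomic increment first; kernel V only if someone waits. */
static int usema_up(ulock_t *l)
{
    if (atomic_fetch_add(&l->count, 1) >= 0)
        return 0;                /* nobody waiting: no syscall       */
    return ulock_wakeup(l, 0);   /* second half of V, in the kernel  */
}
```

The point of the split shows up in the stubs: an uncontended down/up
pair never enters the kernel at all, which is what makes these locks
lightweight.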
* Re: [PATCH] Lightweight userspace semphores...
  2002-02-27 15:24 ` Hubertus Franke
@ 2002-02-27 17:17   ` Martin Wirth
  2002-02-27 19:04     ` Hubertus Franke
  0 siblings, 1 reply; 8+ messages in thread

From: Martin Wirth @ 2002-02-27 17:17 UTC (permalink / raw)
To: Hubertus Franke, linux-kernel

Hubertus Franke wrote:
> Again, the trick is not to sync the state of the kernel and the user
> level. It comes naturally if you properly separate the duties.

Your code for usema_down is simply:

    int usema_down(ulock_t *ulock)
    {
        if (!__ulock_down(ulock)) {
            return 0;
        }
        return ulock_wait(ulock, 0);
    }

This means you do not re-check whether the usema is really available
after you return from ulock_wait(), but you may have the following
situation:

    Process 1                Process 2                Process 3
    down
                             down
                             -> calls usema_wait
                                but gets preempted
                                shortly before it can
                                enter kernel mode
    up
    (kernel sema is free)
                                                      down -> got it
                             resumes execution,
                             thinks it also got it
                             because the kernel
                             sema is free!

Now you have two owners!!

Martin Wirth
* Re: [PATCH] Lightweight userspace semphores...
  2002-02-27 17:17   ` Martin Wirth
@ 2002-02-27 19:04     ` Hubertus Franke
       [not found]        ` <3C7FDF76.9040903@dlr.de>
  0 siblings, 1 reply; 8+ messages in thread

From: Hubertus Franke @ 2002-02-27 19:04 UTC (permalink / raw)
To: Martin Wirth; +Cc: linux-kernel

On Wed, Feb 27, 2002 at 06:17:41PM +0100, Martin Wirth wrote:
> Your code for usema_down is simply:
>
>     int usema_down(ulock_t *ulock)
>     {
>         if (!__ulock_down(ulock)) {
>             return 0;
>         }
>         return ulock_wait(ulock, 0);
>     }
>
> This means you do not re-check whether the usema is really available
> after you return from ulock_wait(), but you may have the following
> situation:

Not so. First remember the following: __ulock_down is basically an
atomic_dec(&lockword) and returns 0 (success) if the lockword after
the decrement is >= 0. __ulock_up is basically an atomic_inc(&lockword)
and returns 0 (nobody waiting) if the lockword after the increment
is > 0.

After each step I print the value of the lockword after the atomic op.
I assume the initial value was "1"; the initial value of the kernel
semaphore is "0". Let's write this as the state [ 1, 0 ] (user, kernel).

> Process 1                Process 2                Process 3
> down

[ 0, 0 ]

>                          down

[ -1, 0 ]

>                          -> calls usema_wait
>                             but gets preempted
>                             shortly before it can
>                             enter kernel mode
>
> up

[ 0, 0 ]   now signal the kernel, but nobody is waiting on it yet
[ 0, 1 ]   the kernel stores the signal token in its state

> (kernel sema is free)
>
>                                                   down -> got it

Not so........

[ -1, 1 ]  you don't have the lock, as __ulock_down returned != 0;
           you are going to call the kernel ulock_wait

Two scenarios now: either P3 gets to the kernel first, or P2 does.

P3 first:

[ -1, 1 ]   (start state from above)
P3's wait in the kernel goes through a down that will succeed, eat the
kernel sem token, and continue out of the kernel holding the lock.
[ -1, 0 ]
P2 is rescheduled; its down in the kernel will block the process.
[ -1, -1 ]

P2 first, assuming that P3 too got preempted before the kernel call
("Shit Happens", Forrest Gump):

[ -1, 1 ]   (start state from above)
P2's wait in the kernel goes through a down that will succeed, eat the
kernel sem token, and continue out of the kernel holding the lock.
[ -1, 0 ]
P3 is rescheduled; its down in the kernel will block the process.
[ -1, -1 ]

Either way, any up call now will ditch into the kernel and signal the
waiting process to continue, handing the lock over to it.

[ 0, 0 ]

>                          resumes execution,
>                          thinks it also got it
>                          because the kernel
>                          sema is free!

So, it works and is correct. Hope that clarifies it; if not, let me
know.

Interestingly enough, this scheme doesn't work for spinning locks. Go
to lib/ulocks.c [lines 91-133]; I have integrated this dead code there
to show how not to do it. A similar analysis to the above shows that
that approach wouldn't work. You need cmpxchg for those scenarios
(more or less).

> Now you have two owners!!

Not so, QED.

--
Hubertus Franke (frankeh@watson.ibm.com)
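[Editor's note: the state walk above can be replayed deterministically.
Below is a single-threaded C model, an editorial sketch rather than the
patch's code, in which the atomic ops stand in for
__ulock_down/__ulock_up and the kernel semaphore is reduced to a token
count plus a sleeper count. Running Martin's interleaving against it
shows exactly one owner emerging.]

```c
#include <stdatomic.h>

typedef struct {
    atomic_int lockword;   /* user state: starts at 1 (free)        */
    int ktokens;           /* tokens stored by the kernel semaphore */
    int ksleepers;         /* processes blocked on the wait queue   */
} model_t;

static void model_init(model_t *m)
{
    atomic_init(&m->lockword, 1);   /* state [ 1, 0 ] */
    m->ktokens = 0;
    m->ksleepers = 0;
}

/* __ulock_down: the lock is taken iff the pre-decrement value was > 0. */
static int user_down(model_t *m)
{
    return atomic_fetch_sub(&m->lockword, 1) > 0;
}

/* Kernel-side down: eat a stored token, or go to sleep (modelled only). */
static int kernel_wait(model_t *m)
{
    if (m->ktokens > 0) {
        m->ktokens--;      /* token eaten: we hold the lock */
        return 1;
    }
    m->ksleepers++;        /* would block on the wait queue */
    return 0;
}

/* __ulock_up plus the kernel-side up: wake a sleeper or store a token. */
static void usema_up(model_t *m)
{
    if (atomic_fetch_add(&m->lockword, 1) >= 0)
        return;            /* nobody waiting: kernel never entered   */
    if (m->ksleepers > 0)
        m->ksleepers--;    /* hand the lock directly to a sleeper    */
    else
        m->ktokens++;      /* store the token for the preempted racer */
}
```

The test replays the exact race from the previous message: P2
decrements and is preempted before the kernel call, P1 ups (storing a
token), P3 decrements, and whichever of P2/P3 reaches the kernel first
eats the token while the other blocks.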
[parent not found: <3C7FDF76.9040903@dlr.de>]
* Re: [PATCH] Lightweight userspace semaphores...
       [not found]        ` <3C7FDF76.9040903@dlr.de>
@ 2002-03-02 14:08          ` Hubertus Franke
  2002-03-03 22:13            ` [Lse-tech] " Paul Jackson
  0 siblings, 1 reply; 8+ messages in thread

From: Hubertus Franke @ 2002-03-02 14:08 UTC (permalink / raw)
To: Martin Wirth; +Cc: rusty, linux-kernel, lse-tech

On Fri, Mar 01, 2002 at 09:07:18PM +0100, Martin Wirth wrote:
> Hubertus Franke wrote:
>
> > So, it works and is correct. Hope that clarifies it; if not, let me
> > know. Interestingly enough, this scheme doesn't work for spinning
> > locks. Go to lib/ulocks.c [lines 91-133]; I have integrated this
> > dead code there to show how not to do it. A similar analysis to the
> > above shows that that approach wouldn't work. You need cmpxchg for
> > those scenarios (more or less).
>
> You are right, I falsely assumed the initial state to be [1,1].
>
> But as mentioned in your README, your approach is currently not able
> to handle signals correctly. You have to ignore all non-fatal signals
> by using ESYSRESTART, and a SIGKILL sent to one of the processes may
> corrupt your user-kernel synchronisation.
>
> I don't think a user space semaphore implementation is acceptable
> until it provides (signal-) interruptibility and timeouts. But maybe
> you have some idea how to manage this.
>
> Martin Wirth

As for the signal handling and ESYSRESTART: the user code on the slow
path can check the return code and has two choices: (a) re-enter the
kernel and wait, or (b) correct the status word, since it is still
counted as a waiting process, and return 0. I have chosen (a) and
automated it. I have tried sending signals (e.g. SIGIO) and it seems to
work fine; the signal is handled in user space and control returns to
the kernel section. (b) is feasible but requires a bit more exception
work in the routine.

As for the corruption case: there is flat-out (almost) nothing you can
do. For instance, if a process grabs the lock and exits, and then
another process tries to acquire it, you get a deadlock. At least on
the mailing list, most people feel that you are dead in the water
anyway.

I have been thinking about the problem of corruption. The owner of the
lock could register its pid in the lock area (2nd word). That still
leaves non-atomic windows open and is a half-assed solution. But more
conceptually: if your process dies while holding a lock that protects
some shared data, there is no guarantee you can make about the data
itself. In that kind of scenario, why try to continue as if nothing
happened? The general consensus on the list seems to be that your app
is toast at that point.

--
Hubertus
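[Editor's note: the pid-registration half-measure mentioned above can
be sketched to show exactly where the non-atomic windows sit. This is
purely illustrative; the ulock2_t layout and all names here are
invented for the sketch, not taken from the patch.]

```c
#include <stdatomic.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct {
    atomic_int lockword;   /* 1 = free, 0 = held       */
    pid_t      owner;      /* 0 = no registered owner  */
} ulock2_t;

/* Try to take the lock and register ourselves as its owner. */
static int ulock2_trydown(ulock2_t *l)
{
    int one = 1;
    if (!atomic_compare_exchange_strong(&l->lockword, &one, 0))
        return 0;          /* contended: not taken */
    /* window: if we die right here, the lock is held but owner == 0 */
    l->owner = getpid();
    return 1;
}

/* Release: clear the registration, then free the lock. */
static void ulock2_up(ulock2_t *l)
{
    l->owner = 0;
    /* window: if we die right here, owner == 0 but the lock is held */
    atomic_fetch_add(&l->lockword, 1);
}
```

A cleanup daemon scanning `owner` for dead pids would miss a holder
that died inside either commented window, which is the half-assedness
being conceded above.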
* Re: [Lse-tech] Re: [PATCH] Lightweight userspace semaphores...
  2002-03-02 14:08 ` [PATCH] Lightweight userspace semaphores Hubertus Franke
@ 2002-03-03 22:13   ` Paul Jackson
  2002-03-04  6:13     ` Rusty Russell
  2002-03-04 16:07     ` Hubertus Franke
  0 siblings, 2 replies; 8+ messages in thread

From: Paul Jackson @ 2002-03-03 22:13 UTC (permalink / raw)
To: Hubertus Franke; +Cc: Martin Wirth, rusty, linux-kernel, lse-tech

On Sat, 2 Mar 2002, Hubertus Franke wrote:
> But more conceptually: if your process dies while holding a lock ...
> your app is toast at that point.

Without application-specific knowledge, certainly.

Is there some way one could support a hook, to enable an application to
register a handler that could clean up, for those apps that found that
worthwhile?

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
* Re: [Lse-tech] Re: [PATCH] Lightweight userspace semaphores...
  2002-03-03 22:13 ` [Lse-tech] " Paul Jackson
@ 2002-03-04  6:13   ` Rusty Russell
  2002-03-04 16:07   ` Hubertus Franke
  1 sibling, 0 replies; 8+ messages in thread

From: Rusty Russell @ 2002-03-04  6:13 UTC (permalink / raw)
To: Paul Jackson; +Cc: frankeh, martin.wirth, rusty, linux-kernel, lse-tech

On Sun, 3 Mar 2002 14:13:45 -0800 Paul Jackson <pj@engr.sgi.com> wrote:
> On Sat, 2 Mar 2002, Hubertus Franke wrote:
> > But more conceptually: if your process dies while holding a lock ...
> > your app is toast at that point.
>
> Without application-specific knowledge, certainly.
>
> Is there some way one could support a hook, to enable an application
> to register a handler that could clean up, for those apps that found
> that worthwhile?

If you want this, use fcntl locks (see TDB). If you don't tell the
kernel what you are doing (which is the reason these locks are fast),
it cannot clean up for you.

One could conceive of a solution where a process told the kernel "here
is where I keep my lock states: if I die, clean up". Of course, there's
a race there too.

IMHO, given that the lock is protecting something which is left in an
unknown state, this is something which would require serious testing to
be proven worthwhile.

Hope that helps,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
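[Editor's note: Rusty's point can be demonstrated: because the kernel
tracks the owner of an fcntl record lock, it releases the lock
automatically when the owning process dies. A small self-contained
sketch follows; the file name and helper names are the editor's, not
from TDB or the patch.]

```c
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Non-blocking attempt at a whole-file write lock; -1 if held elsewhere. */
static int try_wrlock(int fd)
{
    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;        /* l_start = l_len = 0: whole file */
    return fcntl(fd, F_SETLK, &fl);
}

/* Returns 1 if the kernel released the dead child's lock, 0 otherwise. */
int demo_dead_owner_cleanup(void)
{
    char path[] = "/tmp/ulock_demo_XXXXXX";
    int fd = mkstemp(path);
    int ready[2], hold[2];
    if (fd < 0 || pipe(ready) || pipe(hold))
        return 0;

    pid_t pid = fork();
    if (pid == 0) {                /* child: take the lock, then hang */
        char c;
        int cfd = open(path, O_RDWR);
        try_wrlock(cfd);           /* child now owns the record lock  */
        write(ready[1], "x", 1);   /* tell the parent the lock is held */
        read(hold[0], &c, 1);      /* block until killed               */
        _exit(0);
    }

    char c;
    read(ready[0], &c, 1);         /* wait until the child holds it    */
    int while_alive = try_wrlock(fd);   /* conflicts: should fail      */

    kill(pid, SIGKILL);            /* owner dies while holding the lock */
    waitpid(pid, NULL, 0);

    int after_death = try_wrlock(fd);   /* lock was auto-released      */
    unlink(path);
    return while_alive == -1 && after_death == 0;
}
```

Note this is exactly the trade-off Rusty names: the kernel can clean up
only because every acquire and release is a syscall it sees, which is
what the lightweight userspace locks avoid on their fast path.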
* Re: [Lse-tech] Re: [PATCH] Lightweight userspace semaphores...
  2002-03-03 22:13 ` [Lse-tech] " Paul Jackson
  2002-03-04  6:13   ` Rusty Russell
@ 2002-03-04 16:07   ` Hubertus Franke
  1 sibling, 0 replies; 8+ messages in thread

From: Hubertus Franke @ 2002-03-04 16:07 UTC (permalink / raw)
To: Paul Jackson; +Cc: Martin Wirth, rusty, linux-kernel, lse-tech

On Sun, Mar 03, 2002 at 02:13:45PM -0800, Paul Jackson wrote:
> On Sat, 2 Mar 2002, Hubertus Franke wrote:
> > But more conceptually: if your process dies while holding a lock ...
> > your app is toast at that point.
>
> Without application-specific knowledge, certainly.
>
> Is there some way one could support a hook, to enable an application
> to register a handler that could clean up, for those apps that found
> that worthwhile?

I think that's a good idea. I'll think it through and see what Rusty
thinks.

> --
> I won't rest till it's the best ...
> Programmer, Linux Scalability
> Paul Jackson <pj@sgi.com> 1.650.933.1373
end of thread, other threads:[~2002-03-05 7:09 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-02-27 8:43 [PATCH] Lightweight userspace semphores Martin Wirth
2002-02-27 15:24 ` Hubertus Franke
2002-02-27 17:17 ` Martin Wirth
2002-02-27 19:04 ` Hubertus Franke
[not found] ` <3C7FDF76.9040903@dlr.de>
2002-03-02 14:08 ` [PATCH] Lightweight userspace semaphores Hubertus Franke
2002-03-03 22:13 ` [Lse-tech] " Paul Jackson
2002-03-04 6:13 ` Rusty Russell
2002-03-04 16:07 ` Hubertus Franke
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox