public inbox for linux-kernel@vger.kernel.org
* Re: generic rwsem [Re: Alpha "process table hang"]
@ 2001-04-17 19:18 D.W.Howells
  2001-04-17 20:49 ` Andrea Arcangeli
  0 siblings, 1 reply; 17+ messages in thread
From: D.W.Howells @ 2001-04-17 19:18 UTC (permalink / raw)
  To: andrea; +Cc: linux-kernel

Andrea,

> As said the design of the framework to plugin per-arch rwsem implementation 
> isn't flexible enough and the generic spinlocks are as well broken, try to 
> use them if you can (yes I tried that for the alpha, it was just a mess and 
> it was more productive to rewrite than to fix).

Having thought about the matter a bit, I know what the problem is:

As stated in the email with the latest patch, I haven't yet extended this to 
cover any architecture but i386. That patch was actually put up for comments, 
though it got included anyway.

Therefore, all the other archs use the old (and probably broken) implementations!

I'll quickly knock up a patch to fix the other archs. This should also fix 
the alpha problem.

As for making the stuff I had done less generic, and more specific, I only 
made it more generic because I got asked to by a number of people. It was 
suggested that I move the contention functions into lib/rwsem.c and make them 
common.

As far as using atomic_add_return() goes, the C compiler cannot make the 
fastpath anywhere near as efficient because, amongst other things, I can make 
use of the condition flags set in EFLAGS and the compiler can't.

> And it's also more readable and it's not bloated code, 65+110 lines
> compared to 156+148+174 lines. 

You do my code an injustice there... I've put comments in mine.

David

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: generic rwsem [Re: Alpha "process table hang"]
@ 2001-04-17 23:54 D.W.Howells
  2001-04-18 20:49 ` Andrea Arcangeli
  0 siblings, 1 reply; 17+ messages in thread
From: D.W.Howells @ 2001-04-17 23:54 UTC (permalink / raw)
  To: andrea; +Cc: dhowells, linux-kernel

> It is 36 bytes, and on 64bit archs the difference is going to be less. 

You're right - I can't add up (must be too late at night), and I was looking 
at wait_queue not wait_queue_head. I suppose that means my implementations 
are then 20 and 16 bytes respectively.

On 64-bit archs the difference will be less, depending on what a "long" is.

> The real waste is the lock of the waitqueue that I don't need, so I should 
> probably keep two list_head in the waitqueue instead of using the 
> wait_queue_head_t and wake_up_process by hand. 

Perhaps you should steal my wake_up_ctx() idea. That means you only need one 
wait queue, and you use bits in the wait_queue flags to note which type of 
waiter is at the front of the queue.

You can then say "wake up the first thing at the front of the queue if it is 
a writer"; and you can say "wake up the first consecutive bunch of things at 
the front of the queue, provided they're all readers" or "wake up all the 
readers in the queue".

> The fast path has to be as fast as yours, if not then the only variable
> that can make difference is the fact I'm not inlining the fast path because
> it's not that small, in such a case I should simply inline the fast path

My point exactly... It can't be as fast because it's _all_ out of line. 
Therefore you always have to go through the overhead of a function call, 
whatever that entails on the architecture of choice.

> I don't care about the scalability of the slow path and I think the slow
> path may even be faster than yours because I don't run additional
> unlock/lock and memory barriers and the other cpus will stop dirtifying my
> stuff after their first trylock until I unlock. 

Except in the optimised case, you may be correct on an SMP-configured kernel 
(on a UP kernel, spinlocks are no-ops).

However! mine runs for as little time as possible with spinlocks held in the 
generic case, and, perhaps more importantly, as little time as possible with 
interrupts disabled.

One other thing: should you be using spin_lock_irqsave() instead of 
spin_lock_irq() in your down functions? I'm not sure it's necessary, however, 
since you probably shouldn't be sleeping if you've got the interrupts 
disabled (though schedule() will cope).

> If you have time to benchmark I'd be interested to see some number. But
> anyways my implementation was mostly meant to be obviously right and
> possible to override with per-arch algorithms

I'll have a go tomorrow evening. It's time to go to bed now I think:-)

David

* Re: generic rwsem [Re: Alpha "process table hang"]
@ 2001-04-17 21:48 D.W.Howells
  2001-04-17 23:06 ` Andrea Arcangeli
  0 siblings, 1 reply; 17+ messages in thread
From: D.W.Howells @ 2001-04-17 21:48 UTC (permalink / raw)
  To: andrea; +Cc: linux-kernel, dhowells, torvalds


> I am sure ppc couldn't race (at least unless up_read/up_write were executed 
> from irq/softnet context and that never happens in 2.4.4pre3, see below ;). 

This is not actually using the rwsem code I wrote at the moment.

> And incidentally the above is what (I guess Richard) did on the alpha and
> that should really go into common code instead of having
> asm-i386/rwsem-xadd.h, asm-alpha/rwsem.h, etc. etc. ... just implement
> atomic_inc_return using xadd in asm-i386/atomic.h, that's much better
> design IMHO. 

I disagree... you want such primitives to be as efficient as possible. The 
whole point of having asm/xxxx.h files is that you can stuff them full of 
dirty tricks specific to certain architectures.

> That can obviously be done for example with C code like this: 
>        count = atomic_inc_return(&sem->count); 
>        if (__builtin_expect(count == 0, 0)) 
>                slow_path() 
>
> The above is the perfect C implementation IMHO

But not so efficient since it _has_ to take a jump unless the compiler can 
emit code in alternative text sections when it seems appropriate. Plus, you 
have to test count's value. XADD sets EFLAGS based on the result in memory, 
something that allows all but one fastpath to be two instructions in length 
(the one that isn't is three).

But, yes, there should probably be two generic cases: one implemented with a 
spinlock in the rwsem struct (as I supply) and one implemented using 
atomic_add_return(). Note, however! atomic_add_return() is not necessarily 
implemented efficiently: if it involves a cmpxchg loop (as many seem to), 
then that is really quite inefficient, and may be better done as a spinlock 
anyway.

I've had a look at your implementation... It seems to hold the spinlocks for 
an awfully long time... specifically around the local variable initialisation 
in the 'failed' functions. Don't forget that the compiler can't reorder these 
because they're inside the spinlock. I would, if I were you, fold the 
'failed' functions into the main ones to avoid this problem.

Your rw_semaphore structure is also rather large: 44 bytes without debugging 
stuff (16 bytes apiece for the two waitqueues and 12 bytes for the rest). 
Contrast that with mine: the generic one is 24 bytes and the i386-xadd 
optimised one is 20.

Admittedly, though, yours is extremely simple and easy to follow, but I don't 
think it's going to be very fast.

Of course, I still prefer mine... smaller, faster, more efficient:-)

David

* Re: Alpha "process table hang"
@ 2001-04-11 17:57 Bob McElrath
       [not found] ` <E14nOzo-0007Ew-00@the-village.bc.nu>
  0 siblings, 1 reply; 17+ messages in thread
From: Bob McElrath @ 2001-04-11 17:57 UTC (permalink / raw)
  To: Peter Rival; +Cc: linux-kernel


Peter Rival [frival@zk3.dec.com] wrote:
> Hmpf.  Haven't seen this at all on any of the Alphas that I'm running.  What
> exact system are you seeing this on, and what are you running when it happens?

This is a LX164 system, 533 MHz.

I have a hunch it's related to the X server because I've seen it many,
many times while sitting at the console (in X), but never when I'm
logged on remotely.  I've seen it with XFree86 3.3.6, 4.0.2, and 4.0.3,
on a Matrox Millennium II video card with 8MB.

I'm also experiencing regular X crashes, but the process-table-hang
doesn't occur at the same time as an X crash (or v/v).  I sent a patch
to xfree86@xfree86.org a few days ago that seemed to fix (one of) the X
crashes (in the mga driver, ask if you want details).

(But since the X server shouldn't have the ability to corrupt the
kernel's process list, there has to be a problem in the kernel
somewhere)

Note that this system was completely stable with 2.2 kernels.

Cheers,
-- Bob

Bob McElrath (rsmcelrath@students.wisc.edu) 
Univ. of Wisconsin at Madison, Department of Physics



end of thread, other threads:[~2001-04-23 23:40 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-04-17 19:18 generic rwsem [Re: Alpha "process table hang"] D.W.Howells
2001-04-17 20:49 ` Andrea Arcangeli
2001-04-17 21:29   ` Christoph Hellwig
2001-04-17 22:06     ` Andrea Arcangeli
  -- strict thread matches above, loose matches on Subject: below --
2001-04-17 23:54 D.W.Howells
2001-04-18 20:49 ` Andrea Arcangeli
2001-04-17 21:48 D.W.Howells
2001-04-17 23:06 ` Andrea Arcangeli
2001-04-11 17:57 Alpha "process table hang" Bob McElrath
     [not found] ` <E14nOzo-0007Ew-00@the-village.bc.nu>
2001-04-13 13:48   ` Bob McElrath
2001-04-17 15:07     ` generic rwsem [Re: Alpha "process table hang"] Andrea Arcangeli
2001-04-17 15:28       ` Bob McElrath
2001-04-19 16:21         ` Bob McElrath
2001-04-19 17:17           ` Andrea Arcangeli
2001-04-23 23:27             ` Bob McElrath
2001-04-23 23:40               ` Andrea Arcangeli
2001-04-17 15:45       ` Christoph Hellwig
2001-04-17 16:59       ` David Howells
2001-04-17 17:55         ` Andrea Arcangeli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox