* rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]] @ 2001-04-19 23:28 D.W.Howells 2001-04-20 1:42 ` Andrea Arcangeli 0 siblings, 1 reply; 13+ messages in thread From: D.W.Howells @ 2001-04-19 23:28 UTC (permalink / raw) To: andrea; +Cc: dhowells, linux-kernel, torvalds You asked for some benchmarks Andrea, so I've obtained some. The set of test modules can be found at: ftp://infradead.org/pub/people/dwh/rwsem-test.tar.bz2 (This also includes rwsem-stat.txt which has a copy of the benchmark results in as well) There are six test programs. They can be made for i386 by the following command: make And can also be made to invoke the scheduler after each pass through the loop: make SCHED=yes I ran each individual test twice, hence the two sets of results for permutation. My machine is a Dual 400MHz PII with an 440BX chipset. All the tests were run in runlevel 3 with no other applications running. I benchmarked four different environments: (1) 2.4.4-pre3 + Andrea's generic rwsem patch (2) 2.4.4-pre4 using XADD to implement the rwsems (3) same as (2) but with a tweak to make rwsem_wake() less fair (4) 2.4.4-pre3 using my generic spinlock code to implement the rwsems David TEST NUM READERS NUM WRITERS CONTENTION =============== =============== =============== ========== rwsem-rw 4 2 r-w & w-w rwsem-ro 4 0 no rwsem-wo 0 4 w-w only rwsem-r1 1 0 no rwsem-w1 0 1 no rwsem-r2 2 0 no ENVIRONMENT TEST SCHED READERS WRITERS =============================== ======= ======= =============== ======= Linux-2.4.4-pre3 + AA-rwsem rws-rw no 3330281 1009 3331972 994 yes 1767102 607091 1743420 642095 rws-ro no 2534630 n/a 3535202 n/a yes 2837218 n/a 3164814 n/a rws-wo no n/a 2507449 n/a 2399102 yes n/a 1568467 n/a 1412262 rws-r1 no 9232485 n/a 9217585 n/a yes 5483757 n/a 5487028 n/a rws-w1 no n/a 9900333 n/a 9918021 yes n/a 5745657 n/a 5747063 rws-r2 no 3499275 n/a 3518590 n/a yes 3184431 n/a 3180287 n/a ------------------------------- ------- ------- --------------- ------- Linux-2.4.4-pre4 [XADD] rws-rw no 563388 283005 558159 280288 yes 683670 197017 700714 194316 rws-ro no 6316985 n/a 6314241 n/a yes 4309406 n/a 4575043 n/a rws-wo no n/a 765699 n/a 763876 yes n/a 650512 n/a 652287 rws-r1 no 15171532 n/a 15158899 n/a yes 7222310 n/a 7229793 n/a rws-w1 no n/a 13942744 n/a 13991823 yes n/a 7362605 n/a 7356127 rws-r2 no 5517671 n/a 5516168 n/a yes 3452796 n/a 3331947 n/a ------------------------------- ------- ------- --------------- ------- Linux-2.4.4-pre4 [XADD] rws-rw no 531929 267129 + slightly-unfair-contention 531093 266822 tweak yes 839560 185670 903995 183958 rws-ro no 6318293 n/a 6320336 n/a yes 4257862 n/a 4315243 n/a rws-wo no n/a 766427 n/a 766471 yes n/a 644036 n/a 642236 ------------------------------- ------- ------- --------------- ------- Linux-2.4.4-pre4 [GENERIC-SPIN] rws-rw no 545138 274002 545378 273785 yes 755343 187874 745888 185562 rws-ro no 4500398 n/a 4506583 n/a yes 3137883 n/a 3129119 n/a rws-wo no n/a 763599 n/a 763059 yes n/a 641256 n/a 647319 rws-r1 no 13110083 n/a 13115436 n/a yes 6950687 n/a 6951901 n/a rws-w1 no n/a 13004627 n/a 13003754 yes n/a 6899764 n/a 6898953 rws-r2 no 4741615 n/a 4746860 n/a yes 3340292 n/a 2967268 n/a \b0ã\x12A8\x01 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]] 2001-04-19 23:28 rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]] D.W.Howells @ 2001-04-20 1:42 ` Andrea Arcangeli 2001-04-20 10:10 ` David Howells 2001-04-20 17:17 ` x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]]] Andrea Arcangeli 0 siblings, 2 replies; 13+ messages in thread From: Andrea Arcangeli @ 2001-04-20 1:42 UTC (permalink / raw) To: D . W . Howells; +Cc: dhowells, linux-kernel, torvalds On Fri, Apr 20, 2001 at 12:28:09AM +0100, D . W . Howells wrote: > I benchmarked four different environments: > > (1) 2.4.4-pre3 + Andrea's generic rwsem patch > (2) 2.4.4-pre4 using XADD to implement the rwsems > (3) same as (2) but with a tweak to make rwsem_wake() less fair > (4) 2.4.4-pre3 using my generic spinlock code to implement the rwsems > > David > > > TEST NUM READERS NUM WRITERS CONTENTION > =============== =============== =============== ========== > rwsem-rw 4 2 r-w & w-w > rwsem-ro 4 0 no > rwsem-wo 0 4 w-w only > rwsem-r1 1 0 no > rwsem-w1 0 1 no > rwsem-r2 2 0 no > > > ENVIRONMENT TEST SCHED READERS WRITERS > =============================== ======= ======= =============== ======= > Linux-2.4.4-pre3 + AA-rwsem rws-rw no 3330281 1009 > 3331972 994 [..] > ------------------------------- ------- ------- --------------- ------- > Linux-2.4.4-pre4 [GENERIC-SPIN] rws-rw no 545138 274002 > 545378 273785 > yes 755343 187874 > 745888 185562 Some explanation on the above extreme difference. In the misc rw benchmark the reason in the same amount of time I get a total number of down 3332966 and you get only 819163 is that I provide recursive down_read and that in turn can starve the down_write (my first patches weren't implementing fair semaphores). As you can see in my post of yesterday I made my semaphores fair in my last patches (from rwsem-generic-5). (you didn't said which patch you used exactly but obviously it was earlier than the -5 revision) I'm uncertain if I should drop the list_empty() check from the fast path and if I should still allow up_* to be called from irq/softirq, if I reduce the max number of sleepers to 2^16 and I will provide weaker wakeup semantics I won't be penalizied anymore and then we'll really compare apples to orange making the comparison more interesting (probably I will do because later on I can probably re-add that two features without too much pain). About the benchmark you wrote it looks good measure to me, thanks. Andrea ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]] 2001-04-20 1:42 ` Andrea Arcangeli @ 2001-04-20 10:10 ` David Howells 2001-04-20 17:17 ` x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]]] Andrea Arcangeli 1 sibling, 0 replies; 13+ messages in thread From: David Howells @ 2001-04-20 10:10 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: linux-kernel > About the benchmark you wrote it looks good measure to me, thanks. As with all benchmarks, take with one pinch of salt and two of Mindcraft:-) David ^ permalink raw reply [flat|nested] 13+ messages in thread
* x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]]] 2001-04-20 1:42 ` Andrea Arcangeli 2001-04-20 10:10 ` David Howells @ 2001-04-20 17:17 ` Andrea Arcangeli 2001-04-20 23:45 ` Linus Torvalds 1 sibling, 1 reply; 13+ messages in thread From: Andrea Arcangeli @ 2001-04-20 17:17 UTC (permalink / raw) To: D . W . Howells; +Cc: dhowells, linux-kernel, torvalds On Fri, Apr 20, 2001 at 03:42:15AM +0200, Andrea Arcangeli wrote: > I'm uncertain if I should drop the list_empty() check from the fast path and if While dropping the list_empty check to speed up the fast path I faced the same complexity of the 2.4.4pre4 lib/rwsem.c and so before reinventing the wheel I read how the problem was solved in 2.4.4pre4. I couldn't get convinced that all the subtle possible cases was handled correctly so I tried to stress them. To do that I changed the 2.4.4pre4 rwsem (compiled for PII SMP) as below patch shows in order to reproduce more easily those corner cases (note my changes cannot be the cause of the deadlock, as far I can tell they can _only_ change the timings, and the timings have to be flexible because they can be randomly influenced by interrupt handlers machine checks and whatever else in whatever hardware): --- rwsemref-1/lib/rwsem.c.~1~ Thu Apr 19 19:59:46 2001 +++ rwsemref-1/lib/rwsem.c Fri Apr 20 17:13:05 2001 @@ -6,6 +6,7 @@ #include <linux/rwsem.h> #include <linux/sched.h> #include <linux/module.h> +#include <linux/delay.h> /* * wait for the read lock to be granted @@ -18,6 +19,7 @@ DECLARE_WAITQUEUE(wait,tsk); signed long count; + mdelay(1); rwsemtrace(sem,"Entering rwsem_down_read_failed"); /* this waitqueue context flag will be cleared when we are granted the lock */ @@ -59,6 +61,7 @@ DECLARE_WAITQUEUE(wait,tsk); signed long count; + mdelay(1); rwsemtrace(sem,"Entering rwsem_down_write_failed"); /* this waitqueue context flag will be cleared when we are granted the lock */ @@ -115,6 +118,7 @@ goto out; } + mdelay(1); /* try to grant a single write lock if there's a writer at the front of the queue * - note we leave the 'active part' of the count incremented by 1 and the waiting part * incremented by 0x00010000 @@ -122,6 +126,7 @@ if (wake_up_ctx(&sem->wait,1,-RWSEM_WAITING_FOR_WRITE)==1) goto out; + mdelay(10); /* grant an infinite number of read locks to the readers at the front of the queue * - note we increment the 'active part' of the count by the number of readers just woken, * less one for the activity decrement we've already done @@ -129,6 +134,7 @@ woken = wake_up_ctx(&sem->wait,65535,-RWSEM_WAITING_FOR_READ); if (woken<=0) goto counter_correction; + mdelay(1); woken *= RWSEM_ACTIVE_BIAS-RWSEM_WAITING_BIAS; woken -= RWSEM_ACTIVE_BIAS; and I succeeded in deadlocking the 2.4.4pre[2345] rwsemaphores with kernel looping forever in 2.4.4pre4/lib/rwsem.c:rwsem_wake() but at least the breakage is 100% reproducible if you use the above patch. All you have to do to reproduce is to apply the above patch and then to run the rwsem proggy that I posted to the list two days ago (the one that shows my latest -5/6 revisions to be just a bit faster in the slow path) that does the mmap and page faults from different threads in loop, in another console you run `while :; do ps xa; done`, then while `ps` is near the end of its run you kill the parent thread of the rwsem testcase so that all thread-children gets killed as well, and at that point `ps` deadlocks hard in R state unkillable (it keeps looping forever in the try_again: counter_correction: paths). [..] 285 ? SW 0:00 [rpciod] 461 ? S 0:00 /usr/sbin/inetd 463 tty1 S 0:00 /sbin/mingetty --noclear tty1 464 tty2 S 0:00 /sbin/mingetty tty2 465 tty3 S 0:00 /sbin/mingetty tty3 466 tty4 S 0:00 /sbin/mingetty tty4 467 tty5 S 0:00 /sbin/mingetty tty5 468 tty6 S 0:00 /sbin/mingetty tty6 469 ? S 0:00 in.rlogind 470 pts/0 S 0:00 login -- andrea 471 pts/0 S 0:00 -bash 530 ? S 0:00 in.rlogind 531 pts/1 S 0:00 login -- andrea 532 pts/1 S 0:00 -bash 666 pts/1 R 0:00 ps xa 667 pts/0 S 0:00 ./rwsem 668 pts/0 R 0:00 ./rwsem 669 pts/0 D 0:00 ./rwsem 670 pts/0 D 0:00 ./rwsem 671 pts/0 D 0:00 ./rwsem 672 pts/0 D 0:00 ./rwsem 673 pts/0 D 0:00 ./rwsem 674 pts/0 D 0:00 ./rwsem 675 pts/0 D 0:00 ./rwsem 676 pts/0 D 0:00 ./rwsem 677 pts/0 R 0:00 ./rwsem 678 pts/0 D 0:00 ./rwsem 679 pts/0 D 0:00 ./rwsem 680 pts/0 D 0:00 ./rwsem 681 pts/0 D 0:00 ./rwsem 682 pts/0 D 0:00 ./rwsem 683 pts/0 D 0:00 ./rwsem 684 pts/0 D 0:00 ./rwsem 685 pts/0 D 0:00 ./rwsem 686 pts/0 D 0:00 ./rwsem 687 pts/0 D 0:00 ./rwsem 688 pts/0 R 0:00 ./rwsem 689 pts/0 D 0:00 ./rwsem 690 pts/0 D 0:00 ./rwsem 691 pts/0 D 0:00 ./rwsem 692 pts/0 D 0:00 ./rwsem 693 pts/0 D 0:00 ./rwsem 694 pts/0 D 0:00 ./rwsem 695 pts/0 D 0:00 ./rwsem 696 pts/0 D 0:00 ./rwsem 697 pts/0 D 0:00 ./rwsem 698 pts/0 D 0:00 ./rwsem 699 pts/0 D 0:00 ./rwsem 700 pts/0 D 0:00 ./rwsem 701 pts/0 D 0:00 ./rwsem 702 pts/0 D 0:00 ./rwsem 703 pts/0 D 0:00 ./rwsem [ hang here ] from another terminal: andrea@laser:~ > kill 666 andrea@laser:~ > kill 666 andrea@laser:~ > kill 666 andrea@laser:~ > kill 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > ps 666 PID TTY STAT TIME COMMAND 666 pts/1 R 0:31 ps xa andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > kill -9 666 andrea@laser:~ > ps 666 PID TTY STAT TIME COMMAND 666 pts/1 R 53:38 ps xa andrea@laser:~ > At first I had a short look to check if this bug could be unrelated to the rwsem and to be instead a race between the /proc filesystem and the do_exit path, but after a short review of the interesting places the task_lock seems to serialize correctly the accesses to the tsk->mm so the semaphore shouldn't go away under rwsem_wake and furthmore it would be quite strange if the mm goes away under the /proc filesystem that it only reproducibly fails in the same way in rwsem_wake and I never get other system malfunctions for example while reading the contents of the mm. As further confirm this is a bug in the rwsemaphores I changed your benchmark to be a little more aggressive: --- rwsem-rw.c.orig Thu Apr 19 22:37:27 2001 +++ rwsem-rw.c Fri Apr 20 18:56:06 2001 @@ -71,9 +71,8 @@ static inline void sched(void) { -#ifdef SCHED - schedule(); -#endif + if (current->need_resched) + schedule(); } int reader(void *arg) @@ -86,8 +85,8 @@ while (atomic_read(&do_stuff)) { dr(); - ur(); sched(); + ur(); } atomic_dec(&runners); @@ -105,8 +104,8 @@ while (atomic_read(&do_stuff)) { dw(); - uw(); sched(); + uw(); } atomic_dec(&runners); @@ -125,22 +124,64 @@ init_rwsem(&rwsem_sem); atomic_set(&do_stuff, 1); + wmb(); #ifdef DEBUG_RWSEM rwsem_to_monitor = &rwsem_sem; #endif init_timer(&timer); timer.function = stop_test; - timer.expires = jiffies + 5 * HZ; + timer.expires = jiffies + 50 * HZ; add_timer(&timer); - kernel_thread(writer, "WriteA", 0); - kernel_thread(writer, "WriteB", 0); - - kernel_thread(reader, "ReadA", 0); - kernel_thread(reader, "ReadB", 0); - kernel_thread(reader, "ReadC", 0); - kernel_thread(reader, "ReadD", 0); + kernel_thread(writer, "Write0", 0); + kernel_thread(writer, "Write1", 0); + kernel_thread(writer, "Write2", 0); + kernel_thread(writer, "Write3", 0); + kernel_thread(writer, "Write4", 0); + kernel_thread(writer, "Write5", 0); + kernel_thread(writer, "Write6", 0); + kernel_thread(writer, "Write7", 0); + kernel_thread(writer, "Write8", 0); + kernel_thread(writer, "Write9", 0); + kernel_thread(writer, "Write10", 0); + kernel_thread(writer, "Write11", 0); + kernel_thread(writer, "Write12", 0); + kernel_thread(writer, "Write13", 0); + kernel_thread(writer, "Write14", 0); + kernel_thread(writer, "Write15", 0); + + kernel_thread(reader, "Read0", 0); + kernel_thread(reader, "Read1", 0); + kernel_thread(reader, "Read2", 0); + kernel_thread(reader, "Read3", 0); + kernel_thread(reader, "Read4", 0); + kernel_thread(reader, "Read5", 0); + kernel_thread(reader, "Read6", 0); + kernel_thread(reader, "Read7", 0); + kernel_thread(reader, "Read8", 0); + kernel_thread(reader, "Read9", 0); + kernel_thread(reader, "Read10", 0); + kernel_thread(reader, "Read11", 0); + kernel_thread(reader, "Read12", 0); + kernel_thread(reader, "Read13", 0); + kernel_thread(reader, "Read14", 0); + kernel_thread(reader, "Read15", 0); + kernel_thread(reader, "Read16", 0); + kernel_thread(reader, "Read17", 0); + kernel_thread(reader, "Read18", 0); + kernel_thread(reader, "Read19", 0); + kernel_thread(reader, "Read20", 0); + kernel_thread(reader, "Read21", 0); + kernel_thread(reader, "Read22", 0); + kernel_thread(reader, "Read23", 0); + kernel_thread(reader, "Read24", 0); + kernel_thread(reader, "Read25", 0); + kernel_thread(reader, "Read26", 0); + kernel_thread(reader, "Read27", 0); + kernel_thread(reader, "Read28", 0); + kernel_thread(reader, "Read29", 0); + kernel_thread(reader, "Read30", 0); return 0; } @@ -155,6 +196,8 @@ schedule(); printk("reads taken: %d\n", atomic_read(&reads_taken)); printk("writes taken: %d\n", atomic_read(&writes_taken)); + printk("readers: %d\n", atomic_read(&readers)); + printk("writers: %d\n", atomic_read(&writers)); #ifdef DEBUG_RWSEM rwsem_to_monitor = 0; #endif and now also the benchmark deadlocked: andrea@laser:~ > ps 669 PID TTY STAT TIME COMMAND 669 pts/0 R 6:25 [Read19] andrea@laser:~ > su root@laser:/home/andrea > rmmod rwsem-rw [ hang here ] Until we fix that bug I will keep running my rwsem-generic-6 patch to be 100% sure to sit on rock solid rwsems. BTW, this is the fix against pre4 for the 386+SMP rwsemaphores bug that I found yesterday: --- rwsemref-1/include/asm-i386/rwsem-spin.h.~1~ Fri Apr 20 05:03:59 2001 +++ rwsemref-1/include/asm-i386/rwsem-spin.h Fri Apr 20 19:05:52 2001 @@ -93,9 +93,9 @@ __asm__ __volatile__( "# beginning down_read\n\t" #ifdef CONFIG_SMP + "1:\n\t" LOCK_PREFIX " decb "RWSEM_SPINLOCK_OFFSET_STR"(%%eax)\n\t" /* try to grab the spinlock */ " js 3f\n" /* jump if failed */ - "1:\n\t" #endif " incl (%%eax)\n\t" /* adds 0x00000001, returns the old value */ #ifdef CONFIG_SMP @@ -132,9 +132,9 @@ __asm__ __volatile__( "# beginning down_write\n\t" #ifdef CONFIG_SMP + "1:\n\t" LOCK_PREFIX " decb "RWSEM_SPINLOCK_OFFSET_STR"(%%eax)\n\t" /* try to grab the spinlock */ " js 3f\n" /* jump if failed */ - "1:\n\t" #endif " xchg %0,(%%eax)\n\t" /* retrieve the old value */ " add %0,(%%eax)\n\t" /* add 0xffff0001, result in memory */ @@ -173,9 +173,9 @@ __asm__ __volatile__( "# beginning __up_read\n\t" #ifdef CONFIG_SMP + "1:\n\t" LOCK_PREFIX " decb "RWSEM_SPINLOCK_OFFSET_STR"(%%eax)\n\t" /* try to grab the spinlock */ " js 3f\n" /* jump if failed */ - "1:\n\t" #endif " xchg %0,(%%eax)\n\t" /* retrieve the old value */ " addl %0,(%%eax)\n\t" /* subtract 1, result in memory */ @@ -213,9 +213,9 @@ __asm__ __volatile__( "# beginning __up_write\n\t" #ifdef CONFIG_SMP + "1:\n\t" LOCK_PREFIX " decb "RWSEM_SPINLOCK_OFFSET_STR"(%%eax)\n\t" /* try to grab the spinlock */ " js 3f\n" /* jump if failed */ - "1:\n\t" #endif " addl %3,(%%eax)\n\t" /* adds 0x00010001 */ #ifdef CONFIG_SMP @@ -251,9 +251,9 @@ __asm__ __volatile__( "# beginning rwsem_atomic_update\n\t" #ifdef CONFIG_SMP + "1:\n\t" LOCK_PREFIX " decb "RWSEM_SPINLOCK_OFFSET_STR"(%1)\n\t" /* try to grab the spinlock */ " js 3f\n" /* jump if failed */ - "1:\n\t" #endif " xchgl %0,(%1)\n\t" /* retrieve the old value */ " addl %0,(%1)\n\t" /* add 0xffff0001, result in memory */ @@ -287,9 +287,9 @@ __asm__ __volatile__( "# beginning rwsem_cmpxchgw\n\t" #ifdef CONFIG_SMP + "1:\n\t" LOCK_PREFIX " decb "RWSEM_SPINLOCK_OFFSET_STR"(%3)\n\t" /* try to grab the spinlock */ " js 3f\n" /* jump if failed */ - "1:\n\t" #endif " cmpw %w1,(%3)\n\t" " jne 4f\n\t" /* jump if old doesn't match sem->count LSW */ Andrea ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]]] 2001-04-20 17:17 ` x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]]] Andrea Arcangeli @ 2001-04-20 23:45 ` Linus Torvalds 2001-04-21 14:03 ` Andrea Arcangeli 0 siblings, 1 reply; 13+ messages in thread From: Linus Torvalds @ 2001-04-20 23:45 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: D . W . Howells, dhowells, linux-kernel On Fri, 20 Apr 2001, Andrea Arcangeli wrote: > > While dropping the list_empty check to speed up the fast path I faced the same > complexity of the 2.4.4pre4 lib/rwsem.c and so before reinventing the wheel I > read how the problem was solved in 2.4.4pre4. I would suggest the following: - the generic semaphores should use the lock that already exists in the wait-queue as the semaphore spinlock. - the generic semaphores should _not_ drop the lock. Right now it drops the semaphore lock when it goes into the slow path, only to re-aquire it. This is due to bad interfacing with the generic slow-path routines. I suspect that this lock-drop is why Andrea sees problems with the generic semaphores. The changes to "count" and "sleeper" aren't actually atomic, because we don't hold the lock over them all. And re-using the lock means that we don't need the two levels of spinlocking for adding ourselves to the wait queue. Easily done by just moving the locking _out_ of the wait-queue helper functions, no? - the generic semaphores are entirely out-of-line, and are just declared universally as regular FASTCALL() functions. The fast-path x86 code looks ok to me. The debugging stuff makes it less readable than it should be, I suspect, and is probably not worth it at this stage. The users of rw-semaphores are so well-defined (and so well debugged) that the debugging code only makes the code harder to follow right now. Comments? Andrea? Your patches have looked ok, but I absoutely refuse to see the non-inlined fast-path for reasonable x86 hardware.. Linus ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]]] 2001-04-20 23:45 ` Linus Torvalds @ 2001-04-21 14:03 ` Andrea Arcangeli 2001-04-21 14:17 ` Russell King 2001-04-21 14:37 ` Russell King 0 siblings, 2 replies; 13+ messages in thread From: Andrea Arcangeli @ 2001-04-21 14:03 UTC (permalink / raw) To: Linus Torvalds; +Cc: D . W . Howells, dhowells, linux-kernel On Fri, Apr 20, 2001 at 04:45:32PM -0700, Linus Torvalds wrote: > I would suggest the following: > > - the generic semaphores should use the lock that already exists in the > wait-queue as the semaphore spinlock. Ok, that is what my generic code does. > - the generic semaphores should _not_ drop the lock. Right now it drops > the semaphore lock when it goes into the slow path, only to re-aquire > it. This is due to bad interfacing with the generic slow-path routines. My generic code doesn't drop the lock. > I suspect that this lock-drop is why Andrea sees problems with the > generic semaphores. The changes to "count" and "sleeper" aren't > actually atomic, because we don't hold the lock over them all. And > re-using the lock means that we don't need the two levels of > spinlocking for adding ourselves to the wait queue. Easily done by just > moving the locking _out_ of the wait-queue helper functions, no? Basically yes, however for the wakeup I wrote a dedicated routine that knows how to do the wake-all-next-readers or wake-next-writer (it is not the same helper function of sched.c). > - the generic semaphores are entirely out-of-line, and are just declared > universally as regular FASTCALL() functions. This is what I implemented originally but then I moved the fast path inline for the fast-path benchmark reasons. I think in real life it doesn't matter much if the fast path is inline or not. > The fast-path x86 code looks ok to me. The debugging stuff makes it less > readable than it should be, I suspect, and is probably not worth it at > this stage. The users of rw-semaphores are so well-defined (and so well > debugged) that the debugging code only makes the code harder to follow > right now. yes I agree, infact I added the ->magic check only to catch uninitialized semaphores (and this one doesn't hurt readability that much). > Comments? Andrea? Your patches have looked ok, but I absoutely refuse to > see the non-inlined fast-path for reasonable x86 hardware.. In my last patch the fast path is inline as said above but it is not in asm yet because I couldn't get convinced it was right code. I plan to return looking into the rwsem soon. I also seen David fixed the bug and dropped the buggy rwsem-spin.h, so I suggest to merge his code for now, after a very short look it seems certainly better than pre5. Andrea ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]]] 2001-04-21 14:03 ` Andrea Arcangeli @ 2001-04-21 14:17 ` Russell King 2001-04-21 14:29 ` Andrea Arcangeli 2001-04-21 17:18 ` Linus Torvalds 2001-04-21 14:37 ` Russell King 1 sibling, 2 replies; 13+ messages in thread From: Russell King @ 2001-04-21 14:17 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: Linus Torvalds, D . W . Howells, dhowells, linux-kernel On Sat, Apr 21, 2001 at 04:03:27PM +0200, Andrea Arcangeli wrote: > On Fri, Apr 20, 2001 at 04:45:32PM -0700, Linus Torvalds wrote: > > I would suggest the following: > > > > - the generic semaphores should use the lock that already exists in the > > wait-queue as the semaphore spinlock. > > Ok, that is what my generic code does. Erm, spin_lock()? What if up_read or up_write gets called from interrupt context (is this allowed)? If these are now allowed, then maybe we should either consider getting the Stanford checker to check for this, or else we ought to do a debugging if (in_interupt()) BUG(); thing. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]]] 2001-04-21 14:17 ` Russell King @ 2001-04-21 14:29 ` Andrea Arcangeli 2001-04-21 14:37 ` rmk 2001-04-21 17:18 ` Linus Torvalds 1 sibling, 1 reply; 13+ messages in thread From: Andrea Arcangeli @ 2001-04-21 14:29 UTC (permalink / raw) To: Russell King; +Cc: Linus Torvalds, D . W . Howells, dhowells, linux-kernel On Sat, Apr 21, 2001 at 03:17:37PM +0100, Russell King wrote: > On Sat, Apr 21, 2001 at 04:03:27PM +0200, Andrea Arcangeli wrote: > > On Fri, Apr 20, 2001 at 04:45:32PM -0700, Linus Torvalds wrote: > > > I would suggest the following: > > > > > > - the generic semaphores should use the lock that already exists in the > > > wait-queue as the semaphore spinlock. > > > > Ok, that is what my generic code does. > > Erm, spin_lock()? What if up_read or up_write gets called from interrupt > context (is this allowed)? That it is allowed by my generic code that does spin_lock_irq in down_* and spin_lock_irqsave in up_* but it's disallowed by the weaker semantics of the generic and x86 semaphores 2.4.4pre[2345] (or + David's last patch). > If these are now allowed, then maybe we should either consider getting > the Stanford checker to check for this, or else we ought to do a debugging > if (in_interupt()) BUG(); thing. Caller bug is the last of my worries. (and I seriously doubt that if somebody doesn't know the semantics of the rwsem, he will care to enable the debugging checks during regression testing ;) Andrea ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]]] 2001-04-21 14:29 ` Andrea Arcangeli @ 2001-04-21 14:37 ` rmk 2001-04-21 15:04 ` Andrea Arcangeli 0 siblings, 1 reply; 13+ messages in thread From: rmk @ 2001-04-21 14:37 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: Linus Torvalds, D . W . Howells, dhowells, linux-kernel Andrea Arcangeli writes: > That it is allowed by my generic code that does spin_lock_irq in down_* and > spin_lock_irqsave in up_* but it's disallowed by the weaker semantics of the > generic and x86 semaphores 2.4.4pre[2345] (or + David's last patch). Hang on, who's code is in 2.4.4-pre5? It claims to be Davids, which does suffer from the problem I described. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]]] 2001-04-21 14:37 ` rmk @ 2001-04-21 15:04 ` Andrea Arcangeli 0 siblings, 0 replies; 13+ messages in thread From: Andrea Arcangeli @ 2001-04-21 15:04 UTC (permalink / raw) To: rmk; +Cc: Linus Torvalds, D . W . Howells, dhowells, linux-kernel On Sat, Apr 21, 2001 at 03:37:05PM +0100, rmk@arm.linux.org.uk wrote: > Andrea Arcangeli writes: > > That it is allowed by my generic code that does spin_lock_irq in down_* and > > spin_lock_irqsave in up_* but it's disallowed by the weaker semantics of the ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > generic and x86 semaphores 2.4.4pre[2345] (or + David's last patch). ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > Hang on, who's code is in 2.4.4-pre5? It claims to be Davids, which does > suffer from the problem I described. Yes. Andrea ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]]] 2001-04-21 14:17 ` Russell King 2001-04-21 14:29 ` Andrea Arcangeli @ 2001-04-21 17:18 ` Linus Torvalds 1 sibling, 0 replies; 13+ messages in thread From: Linus Torvalds @ 2001-04-21 17:18 UTC (permalink / raw) To: Russell King; +Cc: Andrea Arcangeli, D . W . Howells, dhowells, linux-kernel On Sat, 21 Apr 2001, Russell King wrote: > > Erm, spin_lock()? What if up_read or up_write gets called from interrupt > context (is this allowed)? Currently that is not allowed. We allow it for regular semaphores, but not for rw-semaphores. We may some day have to revisit that issue, but I suspect we won't have much reason to. Linus ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]]] 2001-04-21 14:03 ` Andrea Arcangeli 2001-04-21 14:17 ` Russell King @ 2001-04-21 14:37 ` Russell King 2001-04-21 15:07 ` Andrea Arcangeli 1 sibling, 1 reply; 13+ messages in thread From: Russell King @ 2001-04-21 14:37 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: Linus Torvalds, D . W . Howells, dhowells, linux-kernel On Sat, Apr 21, 2001 at 04:03:27PM +0200, Andrea Arcangeli wrote: > On Fri, Apr 20, 2001 at 04:45:32PM -0700, Linus Torvalds wrote: > > I would suggest the following: > > > > - the generic semaphores should use the lock that already exists in the > > wait-queue as the semaphore spinlock. > > Ok, that is what my generic code does. rwsem-spinlock.h requires linux/types.h to be included (you're using __u16 at towards the bottom): gcc -D__KERNEL__ -I/usr/src2/v2.4/linux-rpc/include -Wall -Wstrict-prototypes -O2 -fno-strict-aliasing -pipe -mapcs-32 -march=armv3m -mtune=strongarm110 -mshort-load-bytes -msoft-float -c -o ieee1284.o ieee1284.c In file included from /usr/src2/v2.4/linux-rpc/include/linux/rwsem.h:56, from /usr/src2/v2.4/linux-rpc/include/asm/semaphore.h:10, from /usr/src2/v2.4/linux-rpc/include/linux/parport.h:101, from ieee1284.c:19: /usr/src2/v2.4/linux-rpc/include/linux/rwsem-spinlock.h:154: parse error before `rwsem_cmpxchgw' /usr/src2/v2.4/linux-rpc/include/linux/rwsem-spinlock.h:154: parse error before `__u16' /usr/src2/v2.4/linux-rpc/include/linux/rwsem-spinlock.h:155: warning: return-type defaults to `int' /usr/src2/v2.4/linux-rpc/include/linux/rwsem-spinlock.h:155: warning: function declaration isn't a prototype Here is a patch that fixes this oversight against 2.4.4-pre5: --- orig/include/linux/rwsem-spinlock.h Sat Apr 21 15:32:57 2001 +++ linux/include/linux/rwsem-spinlock.h Sat Apr 21 15:28:45 2001 @@ -11,6 +11,7 @@ #endif #include <linux/spinlock.h> +#include <linux/types.h> #ifdef __KERNEL__ -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]]] 2001-04-21 14:37 ` Russell King @ 2001-04-21 15:07 ` Andrea Arcangeli 0 siblings, 0 replies; 13+ messages in thread From: Andrea Arcangeli @ 2001-04-21 15:07 UTC (permalink / raw) To: Russell King; +Cc: Linus Torvalds, D . W . Howells, dhowells, linux-kernel On Sat, Apr 21, 2001 at 03:37:42PM +0100, Russell King wrote: > On Sat, Apr 21, 2001 at 04:03:27PM +0200, Andrea Arcangeli wrote: > > On Fri, Apr 20, 2001 at 04:45:32PM -0700, Linus Torvalds wrote: > > > I would suggest the following: > > > > > > - the generic semaphores should use the lock that already exists in the > > > wait-queue as the semaphore spinlock. > > > > Ok, that is what my generic code does. > > rwsem-spinlock.h requires linux/types.h to be included (you're using > __u16 at towards the bottom): As you are quoting me on "that is what my generic code does", note that my code is only here: ftp://ftp.*.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.4pre4aa1/00_rwsem-generic-6 and no one single line of my rwsem patches is ever been included in any of Linus's or Alan's kernels. > gcc -D__KERNEL__ -I/usr/src2/v2.4/linux-rpc/include -Wall -Wstrict-prototypes > -O2 -fno-strict-aliasing -pipe -mapcs-32 -march=armv3m -mtune=strongarm110 > -mshort-load-bytes -msoft-float -c -o ieee1284.o ieee1284.c > In file included from /usr/src2/v2.4/linux-rpc/include/linux/rwsem.h:56, > from /usr/src2/v2.4/linux-rpc/include/asm/semaphore.h:10, > from /usr/src2/v2.4/linux-rpc/include/linux/parport.h:101, > from ieee1284.c:19: > /usr/src2/v2.4/linux-rpc/include/linux/rwsem-spinlock.h:154: parse error before `rwsem_cmpxchgw' > /usr/src2/v2.4/linux-rpc/include/linux/rwsem-spinlock.h:154: parse error before `__u16' > /usr/src2/v2.4/linux-rpc/include/linux/rwsem-spinlock.h:155: warning: return-type defaults to `int' > /usr/src2/v2.4/linux-rpc/include/linux/rwsem-spinlock.h:155: warning: function declaration isn't a prototype > > Here is a patch that fixes this oversight against 2.4.4-pre5: > > --- orig/include/linux/rwsem-spinlock.h Sat Apr 21 15:32:57 2001 > +++ linux/include/linux/rwsem-spinlock.h Sat Apr 21 15:28:45 2001 > @@ -11,6 +11,7 @@ > #endif > > #include <linux/spinlock.h> > +#include <linux/types.h> > > #ifdef __KERNEL__ > > > > -- > Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux > http://www.arm.linux.org.uk/personal/aboutme.html > Andrea ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2001-04-21 17:18 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2001-04-19 23:28 rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]] D.W.Howells 2001-04-20 1:42 ` Andrea Arcangeli 2001-04-20 10:10 ` David Howells 2001-04-20 17:17 ` x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem benchmarks [Re: generic rwsem [Re: Alpha "process table hang"]]] Andrea Arcangeli 2001-04-20 23:45 ` Linus Torvalds 2001-04-21 14:03 ` Andrea Arcangeli 2001-04-21 14:17 ` Russell King 2001-04-21 14:29 ` Andrea Arcangeli 2001-04-21 14:37 ` rmk 2001-04-21 15:04 ` Andrea Arcangeli 2001-04-21 17:18 ` Linus Torvalds 2001-04-21 14:37 ` Russell King 2001-04-21 15:07 ` Andrea Arcangeli
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox