From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ingo Molnar
Subject: Re: [PATCH RFC 1/2] qrwlock: A queue read/write lock implementation
Date: Mon, 22 Jul 2013 12:34:02 +0200
Message-ID: <20130722103402.GA1991@gmail.com>
References: <1373679249-27123-1-git-send-email-Waiman.Long@hp.com>
 <1373679249-27123-2-git-send-email-Waiman.Long@hp.com>
 <51E49FA3.4030202@hp.com>
 <20130718074204.GA22623@gmail.com>
 <51E7F03A.4090305@hp.com>
 <20130719084023.GB25784@gmail.com>
 <51E95B85.8090003@hp.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Received: from mail-ea0-f176.google.com ([209.85.215.176]:52196 "EHLO
 mail-ea0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1756799Ab3GVKeH (ORCPT ); Mon, 22 Jul 2013 06:34:07 -0400
Content-Disposition: inline
In-Reply-To: <51E95B85.8090003@hp.com>
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Waiman Long
Cc: Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , Arnd Bergmann ,
 linux-arch@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org,
 Peter Zijlstra , Steven Rostedt , Andrew Morton , Richard Weinberger ,
 Catalin Marinas , Greg Kroah-Hartman , Matt Fleming , Herbert Xu ,
 Akinobu Mita , Rusty Russell , Michel Lespinasse , Andi Kleen ,
 Rik van Riel , "Paul E. McKenney" , Linus Torvalds ,
 "Chandramouleeswaran, Aswin" , "Norton, Scott J" , George Spelvin

* Waiman Long wrote:

> I had run some performance tests using the fserver and new_fserver
> benchmarks (on ext4 filesystems) of the AIM7 test suite on an 80-core
> DL980 with HT on. The following kernels were used:
>
> 1. Modified 3.10.1 kernel with mb_cache_spinlock in fs/mbcache.c
>    replaced by a rwlock
> 2. Modified 3.10.1 kernel + modified __read_lock_failed code as suggested
>    by Ingo
> 3. Modified 3.10.1 kernel + queue read/write lock
> 4. Modified 3.10.1 kernel + queue read/write lock in classic read/write
>    lock behavior
>
> The last one is with the read lock stealing flag set in the qrwlock
> structure to give priority to readers and behave more like the classic
> read/write lock, with less fairness.
>
> The following table shows the averaged results in the 200-1000
> user ranges:
>
> +-----------------+--------+--------+--------+--------+
> | Kernel          |    1   |    2   |    3   |    4   |
> +-----------------+--------+--------+--------+--------+
> | fserver JPM     | 245598 | 274457 | 403348 | 411941 |
> | % change from 1 |   0%   | +11.8% | +64.2% | +67.7% |
> +-----------------+--------+--------+--------+--------+
> | new-fserver JPM | 231549 | 269807 | 399093 | 399418 |
> | % change from 1 |   0%   | +16.5% | +72.4% | +72.5% |
> +-----------------+--------+--------+--------+--------+

So it's not just herding that is a problem. I'm wondering, how sensitive
is this particular benchmark to fairness? I.e. do the 200-1000 simulated
users each perform the same number of ops, so that any smearing of
execution time via unfairness gets amplified? I.e. does steady-state
throughput go up by 60%+ too with your changes?
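[ For readers following along: the "read lock stealing" mode being
  compared can be sketched roughly as below. This is only an
  illustrative C11 model - the lock-word layout, the QRW_* masks and
  all function names are assumptions for the sketch, not the actual
  kernel qrwlock code. The idea is that a reader's fast path only
  checks for an *active* writer, so readers can keep "stealing" the
  lock ahead of queued writers, which is the classic rwlock behavior. ]

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock-word layout for the sketch:
 * low 8 bits  = writer byte (nonzero while a writer holds the lock)
 * upper bits  = reader count, incremented in units of QRW_READER */
#define QRW_WRITER 0x0ffU   /* writer byte mask */
#define QRW_READER 0x100U   /* one reader increment */

struct qrwlock_sketch {
	atomic_uint cnts;   /* reader count + writer byte */
};

/* Reader fast path: bump the reader count, then succeed as long as no
 * writer currently *holds* the lock. Because waiting writers are not
 * checked, later readers can enter ahead of them ("read stealing"),
 * trading fairness for reader throughput. */
static bool read_trylock_sketch(struct qrwlock_sketch *l)
{
	unsigned int cnts = atomic_fetch_add(&l->cnts, QRW_READER);

	if (!(cnts & QRW_WRITER))
		return true;                          /* no active writer */
	atomic_fetch_sub(&l->cnts, QRW_READER);       /* back off */
	return false;
}

static void read_unlock_sketch(struct qrwlock_sketch *l)
{
	atomic_fetch_sub(&l->cnts, QRW_READER);
}

/* Writer fast path: only succeeds when there are no readers and no
 * writer at all, i.e. the whole lock word is zero. */
static bool write_trylock_sketch(struct qrwlock_sketch *l)
{
	unsigned int expected = 0;

	return atomic_compare_exchange_strong(&l->cnts, &expected, 1);
}
```

[ In the fair queued mode, by contrast, a reader that sees any writer
  (active or queued) would join the wait queue instead of retrying the
  fast path, which is what bounds writer starvation. ]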
Thanks,

	Ingo