From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id ; Wed, 26 Sep 2001 11:16:50 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id ; Wed, 26 Sep 2001 11:16:41 -0400 Received: from d12lmsgate.de.ibm.com ([195.212.91.199]:55039 "EHLO d12lmsgate.de.ibm.com") by vger.kernel.org with ESMTP id ; Wed, 26 Sep 2001 11:16:32 -0400 Message-ID: <3BB1F0FB.2554ED93@de.ibm.com> Date: Wed, 26 Sep 2001 17:15:07 +0200 From: Juergen Doelle X-Mailer: Mozilla 4.72 [en] (Windows NT 5.0; U) X-Accept-Language: en MIME-Version: 1.0 To: Stephen Tweedie , Andrea Arcangeli , Andrew Morton , linux-kernel@vger.kernel.org, Mark Hemment , lse-tech@lists.sourceforge.net, "Steve Fox" Subject: Re: [PATCH] Align VM locks, new spinlock patch X-MIMETrack: Itemize by SMTP Server on D12ML008/12/M/IBM(Release 5.0.8 |June 18, 2001) at 26/09/2001 17:16:00, Serialize by Router on D12ML008/12/M/IBM(Release 5.0.8 |June 18, 2001) at 26/09/2001 17:16:28, Serialize complete at 26/09/2001 17:16:28 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Stephen C. Tweedie wrote > Do you have CPU utilisation differences for these cases, as well as > pure IO bandwidth differences? It would be interesting to see just > how much the IO code's internal latency impacts on the final dbench > numbers. I created vmstat traces for dbench 22 +-----------------------------------+------------------------------+ | 2.4.10 | 2.4.10 + spinlock patch | +----+----------+-------+-----------+----------+-------+-----------+ |time|procs | IO | cpu |procs | IO | cpu | + +----------+-------+-----------+----------+-------+-----------+ |[s] | r b w | bo/s | us sy id | r b w | bo/s | us sy id | +----+----------+-------+-----------+----------+-------+-----------+ | 1 | 23 0 0 | 0 | 0 14 85 | 22 0 0 | 0 | 1 47 52 | | 2 | 22 0 0 | 0 | 2 98 0 | 22 0 0 | 0 | 2 99 0 | | 3 | 22 0 0 | 0 | 2 98 0 | 22 0 0 | 0 | 4 97 0 | | 4 | 8 14 2 | 11052 | 12 89 0 | 5 17 1 | 20465 | 11 84 5 | | 5 | 3 19 1 | 1616 | 1 4 94 | 20 2 6 | 1788 | 16 83 0 | | 6 | 1 21 1 | 1760 | 2 4 94 | 22 0 4 | 0 | 18 82 0 | | 7 | 23 0 3 | 2852 | 12 46 42 | 22 0 1 | 0 | 19 80 0 | | 8 | 22 0 3 | 0 | 15 85 0 | 22 0 0 | 0 | 19 82 0 | | 9 | 22 0 2 | 0 | 17 83 0 | 23 0 0 | 0 | 18 82 0 | | 10 | 23 0 1 | 0 | 16 84 0 | 22 0 0 | 0 | 17 82 1 | | 11 | 22 0 0 | 0 | 14 86 0 | 22 0 0 | 0 | 18 83 | | 12 | 22 0 0 | 0 | 16 84 0 | 19 0 0 | 0 | 18 82 0 | | 13 | 22 0 0 | 0 | 19 81 1 | 9 0 0 | 0 | 7 94 | | 14 | 20 0 0 | 0 | 17 84 0 | 0 0 0 | 0 | 0 30 70 | | 15 | 17 0 0 | 0 | 15 85 0 |----------+-------+-----------+ | 16 | 13 0 0 | 0 | 4 97 0 | | 17 | 12 0 0 | 0 | 0 100 0 | | 18 | 7 0 0 | 0 | 0 99 0 | | 19 | 0 0 0 | 0 | 0 15 85 | +----+----------+-------+-----------+ *the empty idle columns originally containing 5315553, treat as '0' The patch o reduces significantly the idle times, the user process wait time is much shorter o reduces the I/O phase to the half of the time, by much higher rates o increases the user and decreases the system CPU utilization I prior posted lockmeter results on 2.4.5, where this patch reduced for 8 CPUs the average spin hold time by about 47% and the total CPU utilization spent for spinning by 45%. I think it does not influence the device layer directly. It speeds up the time spent in critical phases protected with spin locks, when the number of competitors increases. In case of high competitions caused by many parallel working processors, the buffer cache and pagecache handling gets faster and this increases the number of changed pages per second. Therefore the mechanism used here can also be used to improve competitive scenarios for other spin locks. Juergen ______________________________________________________________ Juergen Doelle IBM Linux Technology Center - kernel performance jdoelle@de.ibm.com