Date: Wed, 24 Feb 2010 16:53:36 +0100
From: Ralf Baechle
To: David Daney
Cc: linux-mips@linux-mips.org
Subject: Re: [PATCH] MIPS: Optimize spinlocks.
Message-ID: <20100224155336.GA5130@linux-mips.org>
References: <1265311909-1679-1-git-send-email-ddaney@caviumnetworks.com>
In-Reply-To: <1265311909-1679-1-git-send-email-ddaney@caviumnetworks.com>

On Thu, Feb 04, 2010 at 11:31:49AM -0800, David Daney wrote:

> The current locking mechanism uses a ll/sc sequence to release a
> spinlock.  This is slower than a wmb() followed by a store to unlock.
>
> The branching forward to .subsection 2 on sc failure slows down the
> contended case.  So we get rid of that part too.
>
> Since we are now working on naturally aligned u16 values, we can get
> rid of a masking operation as the LHU already does the right thing.
> The ANDI are reversed for better scheduling on multi-issue CPUs
>
> On a 12 CPU 750MHz Octeon cn5750 this patch improves ipv4 UDP packet
> forwarding rates from 3.58*10^6 PPS to 3.99*10^6 PPS, or about 11%.

And in your benchmarking patch you wrote:

>                  spin_single   spin_multi
> base                  106885       247941
> spinlock_patch         75194       219465

I did some benchmarking on an IP27 (180MHz, 2 CPU, needs LL/SC workaround):

                 spin_single   spin_multi
base                  229341      3505690
spinlock_patch        177847      3615326

So about 22% speedup for spin_single but 3% slowdown for spin_multi.
Disabling the R10k LL/SC workaround btw. gives another 23% speedup for
spin_single and marginal 0.3% for spin_multi; the latter may well be
statistical noise.

  Ralf
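
For anyone who wants to re-check the percentages, they can be re-derived from the raw figures in this thread. What follows is only a minimal sketch: the helper name `percent_change` is made up for illustration, and it assumes that lower benchmark numbers are better, which the thread implies but does not state outright (the units are not given either).

```python
def percent_change(base: float, patched: float) -> float:
    """Relative change from base to patched, in percent (negative = lower)."""
    return (patched - base) / base * 100.0


# Figures copied from the thread. Units are not stated there; the assumption
# here is that a lower number means a faster run.
results = {
    "quoted (Daney)": {
        "spin_single": (106885, 75194),
        "spin_multi": (247941, 219465),
    },
    "IP27 (Baechle)": {
        "spin_single": (229341, 177847),
        "spin_multi": (3505690, 3615326),
    },
}

for machine, tests in results.items():
    for test, (base, patched) in tests.items():
        change = percent_change(base, patched)
        print(f"{machine:16s} {test:12s} {change:+6.1f}%")

# The packet-forwarding claim uses a rate (higher is better), so the sign
# flips: a positive change is an improvement.
pps_gain = percent_change(3.58e6, 3.99e6)
print(f"UDP forwarding rate change: {pps_gain:+.1f}%")
```

Under that lower-is-better assumption, a negative change for spin_single corresponds to the reported speedup, and the positive change for spin_multi on the IP27 corresponds to the reported slowdown.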