From: Tejun Heo
Date: Fri, 26 Nov 2010 18:05:58 +0100
To: Christoph Lameter
CC: akpm@linux-foundation.org, Pekka Enberg, linux-kernel@vger.kernel.org, Eric Dumazet, Mathieu Desnoyers
Subject: Re: [thiscpuops upgrade 05/10] x86: Use this_cpu_inc_return for nmi counter
Message-ID: <4CEFE8F6.5050109@kernel.org>
References: <20101123235139.908255844@linux.com> <20101123235158.826005750@linux.com> <4CEFE1CB.4050404@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/26/2010 06:02 PM, Christoph Lameter wrote:
> On Fri, 26 Nov 2010, Tejun Heo wrote:
>
>>> -	__this_cpu_inc(alert_counter);
>>> -	if (__this_cpu_read(alert_counter) == 5 * nmi_hz)
>>> +	if (__this_cpu_inc_return(alert_counter) == 5 * nmi_hz)
>>
>> Hmmm... one worry I have is that xadd, being not a very popular
>> operation, might be slower than add and read.  Using it for atomicity
>> would probably be beneficial in most cases but have you checked this
>> actually is cheaper?
>
> XADD takes 3 uops. INC 1 and MOV 1 uop. So there is an additional uop.
>
> However, a memory fetch from L1 takes a minimum of 4 cycles. Doing that twice
> already ends up with at least 8 cycles.

Thanks for the explanation.  It might be beneficial to note the
performance characteristics on top of the x86 implementation?
Anyways, for this and the following simple conversion patches.

Reviewed-by: Tejun Heo

--
tejun
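
For readers following the thread, the conversion under review can be
sketched in userspace C.  This is only an illustration under stated
assumptions: struct cpu_state, inc_return(), alert_due_two_step() and
alert_due_inc_return() are hypothetical names, not kernel code, and a
plain C increment stands in for the per-CPU accessors without
reproducing the XADD instruction discussed in the thread.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU's private data; in the kernel,
 * alert_counter is a per-CPU variable reached via the this_cpu ops. */
struct cpu_state {
	unsigned int alert_counter;
};

/* Old form from the patch: increment, then read back in a second access. */
static bool alert_due_two_step(struct cpu_state *cpu, unsigned int nmi_hz)
{
	cpu->alert_counter += 1;			/* models __this_cpu_inc()  */
	return cpu->alert_counter == 5 * nmi_hz;	/* models __this_cpu_read() */
}

/* Hypothetical stand-in for __this_cpu_inc_return(): add one and hand
 * back the new value in a single operation. */
static unsigned int inc_return(unsigned int *counter)
{
	return ++*counter;
}

/* New form from the patch: the comparison uses the returned value directly. */
static bool alert_due_inc_return(struct cpu_state *cpu, unsigned int nmi_hz)
{
	return inc_return(&cpu->alert_counter) == 5 * nmi_hz;
}
```

Both helpers report true on the same call, namely the one that brings
the counter to 5 * nmi_hz, so the patch changes only how the value is
obtained and not what is tested.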