From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S261439AbVA1E2d (ORCPT ); Thu, 27 Jan 2005 23:28:33 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S261440AbVA1E2d (ORCPT ); Thu, 27 Jan 2005 23:28:33 -0500 Received: from mx1.elte.hu ([157.181.1.137]:14532 "EHLO mx1.elte.hu") by vger.kernel.org with ESMTP id S261439AbVA1E2a (ORCPT ); Thu, 27 Jan 2005 23:28:30 -0500 Date: Fri, 28 Jan 2005 05:28:15 +0100 From: Ingo Molnar To: Paul Jackson Cc: Linus Torvalds , pwil3058@bigpond.net.au, akpm@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [patch, 2.6.10-rc2] sched: fix ->nr_uninterruptible handling bugs Message-ID: <20050128042815.GA29751@elte.hu> References: <20041116113209.GA1890@elte.hu> <419A7D09.4080001@bigpond.net.au> <20041116232827.GA842@elte.hu> <20050127165330.6f388054.pj@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050127165330.6f388054.pj@sgi.com> User-Agent: Mutt/1.4.1i X-ELTE-SpamVersion: MailScanner 4.31.6-itk1 (ELTE 1.2) SpamAssassin 2.63 ClamAV 0.73 X-ELTE-VirusStatus: clean X-ELTE-SpamCheck: no X-ELTE-SpamCheck-Details: score=-4.9, required 5.9, autolearn=not spam, BAYES_00 -4.90 X-ELTE-SpamLevel: X-ELTE-SpamScore: -4 Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org * Paul Jackson wrote: > A long time ago, Linus wrote: > > An atomic op is pretty much as expensive as a spinlock/unlock pair on x86. > > Not _quite_, but it's pretty close. > > Are both read and modify atomic ops relatively expensive on some CPUs, > or is it just modify atomic ops? > > (Ignoring for this question the possibility that a mix of read and > modify ops could heat up a cache line on multiprocessor systems, and > focusing for the moment just on the CPU internals ...) if by 'some CPUs' you mean x86 then it's the LOCK prefixed ops that are expensive. I.e. all the LOCK-prefixed RMW variants of instructions: atomic.h: LOCK "addl %1,%0" atomic.h: LOCK "subl %1,%0" atomic.h: LOCK "subl %2,%0; sete %1" atomic.h: LOCK "incl %0" atomic.h: LOCK "decl %0" atomic.h: LOCK "decl %0; sete %1" atomic.h: LOCK "incl %0; sete %1" atomic.h: LOCK "addl %2,%0; sets %1" atomic.h: LOCK "xaddl %0, %1;" atomic.h:__asm__ __volatile__(LOCK "andl %0,%1" \ atomic.h:__asm__ __volatile__(LOCK "orl %0,%1" \ pure reads/writes are architecturally guaranteed to be atomic (so atomic.h uses them, not some fancy instruction) and they are (/better be) fast. interestingly, the x86 spinlock implementation uses a LOCK-ed instruction only on acquire - it uses a simple atomic write (and implicit barrier assumption) on the way out: #define spin_unlock_string \ "movb $1,%0" \ :"=m" (lock->slock) : : "memory" no LOCK prefix. Due to this spinlocks can sometimes be _cheaper_ than doing the same via atomic inc/dec. Ingo