From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753907Ab0ANAj6 (ORCPT ); Wed, 13 Jan 2010 19:39:58 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751501Ab0ANAj5 (ORCPT ); Wed, 13 Jan 2010 19:39:57 -0500 Received: from terminus.zytor.com ([198.137.202.10]:33911 "EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751013Ab0ANAj5 (ORCPT ); Wed, 13 Jan 2010 19:39:57 -0500 Message-ID: <4B4E67C8.501@zytor.com> Date: Wed, 13 Jan 2010 16:39:36 -0800 From: "H. Peter Anvin" User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.5) Gecko/20091209 Fedora/3.0-4.fc12 Thunderbird/3.0 MIME-Version: 1.0 To: George Spelvin CC: linux-kernel@vger.kernel.org, schwab@linux-m68k.org, torvalds@linux-foundation.org Subject: Re: x86-32: clean up rwsem inline asm statements References: <20100114002708.4154.qmail@science.horizon.com> In-Reply-To: <20100114002708.4154.qmail@science.horizon.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 01/13/2010 04:27 PM, George Spelvin wrote: >> There are a number of things that can be done better... for one thing, >> "+m" (sem->count) and "a" (sem) is just bloody wrong. The right thing >> would be "a" (&sem->count) for proper robustness. > > Actually, no. The "+m" (sem->count) is telling GCC that sem->count is > updated; "a" (&sem->count) does *not* tell it to invalidate cached > copies of sem->count that it may have lying around. > > However, we can't just use "+m" (sem->count) because GCC has a poor > grasp on the concept of atomic operations; as far as it is concerned, > it is exactly equivalent to copying the value into a stack slot, do the > operation there, and copy it back. You completely missed the point. The reason we need both "+m" (sem->count) and "a" (sem) is because we need the address of the semaphore in %eax in case we hit the slow path. They're actually orthogonal requirements, and we could implement them as: LOCK_PREFIX xadd %1,%0 jz 1f call call_rwsem_wake 1: : "+m" (sem->count), "=d" (tmp) : "a" (sem) ... just fine, and arguably that would be more correct in the sense that if sem->count isn't the first member of *sem, then we still pass the address of sem in %eax to call_rwsem_wake. In practice, with the zero-offset sem->count, gcc will typically generate (%eax) as the argument since it has it available anyway, but it wouldn't have to. The code as it currently is: LOCK_PREFIX xadd %1,(%2) jz 1f call call_rwsem_wake 1: : "+m" (sem->count), "=d" (tmp) : "a" (sem) ... implicitly assumes the offset of "count" is zero but doesn't enforce it in any way. My comment was that since the assumption is clearly that arguments 0 and 2 point to the same memory object, it should be (&sem->count) in the second case, but that's actually not really the "proper" fix because call_rwsem_wake presumably expects the value of sem. -hpa