From mboxrd@z Thu Jan  1 00:00:00 1970
From: Paul Brook <paul@codesourcery.com>
Subject: Re: [Qemu-devel] [PATCH 0/7] target-ppc/linux-user: NPTL support
Date: Sat, 6 Jun 2009 03:56:55 +0100
References: <1244141522-21802-1-git-send-email-froydnj@codesourcery.com>
	<Pine.LNX.4.64.0906060302040.27655@linmac.oyster.ru>
In-Reply-To: <Pine.LNX.4.64.0906060302040.27655@linmac.oyster.ru>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200906060356.56398.paul@codesourcery.com>
To: qemu-devel@nongnu.org
Cc: Nathan Froyd <froydnj@codesourcery.com>

On Saturday 06 June 2009, malc wrote:
> On Thu, 4 Jun 2009, Nathan Froyd wrote:
> > This patch series adds NPTL support in Linux user-mode emulation to
> > 32-bit PowerPC targets.
> >
> > The main complication comes from implementing atomic instructions
> > properly.  We chose to implement a simplistic model:
> >
> > - reserved loads record the value loaded;

An important point here is that the address/value pair is per thread/cpu.
A nice side effect is that these loads reduce to a simple atomic load and some
thread-local bookkeeping. Conditional stores require somewhat more exotic
atomic operations, but still don't need to go poking at system-global state or
other CPUs.
This sounds strange at first reading, but ll/sc semantics are deliberately 
designed to minimize contention between unrelated CPUs/resources in large 
systems.
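
A minimal sketch of this model in C11, not the code from the patch series.
The names emu_load_reserved, emu_store_conditional, resv_addr and resv_val
are hypothetical, and a 32-bit word and a compare-and-swap for the
conditional store are assumed:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread reservation: one address/value pair per thread. */
static _Thread_local _Atomic uint32_t *resv_addr = NULL;
static _Thread_local uint32_t resv_val;

/* Reserved load: a plain atomic load plus thread-local bookkeeping. */
static uint32_t emu_load_reserved(_Atomic uint32_t *addr)
{
    uint32_t val = atomic_load(addr);
    resv_addr = addr;
    resv_val = val;
    return val;
}

/*
 * Conditional store: succeeds only if this thread holds a reservation on
 * addr and memory still contains the value recorded by the reserved load.
 * No global state and no other CPU's state is touched.
 */
static bool emu_store_conditional(_Atomic uint32_t *addr, uint32_t newval)
{
    bool ok = false;

    if (resv_addr == addr) {
        uint32_t expected = resv_val;
        ok = atomic_compare_exchange_strong(addr, &expected, newval);
    }
    resv_addr = NULL; /* any conditional store consumes the reservation */
    return ok;
}
```

A conditional store with no preceding reserved load, or one issued after the
word has changed to a different value, fails and leaves memory untouched.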

> > - conditional stores check that the memory at the effective address
> >   contains the value loaded by the previous reserved load, in addition
> >   to all other checks.  if so, the store succeeds; otherwise, it fails.
>
> I think this will break code that relies on the fact that ll/sc is not
> affected by the ABA problem.

I'm not absolutely certain about PPC, but on other architectures (ARM, MIPS, 
Alpha) this implementation is sufficient.

The only questionable case is when a second thread overwrites and then
restores the original value between a locked load and a conditional store.
However, limited coherency and memory ordering between CPUs make it impossible
to know whether this modify+restore occurred before or after the initial load.

The worst that can happen here is that another thread gains and releases the
lock[1] while the current thread is in the process of acquiring the lock.
Even when this happens it is impossible for two threads to acquire the lock
simultaneously. The only difference from real hardware lies in the window
between ll and sc. During this period we don't know whether we have the lock
or not, so it's extremely unlikely that we will do anything that relies on no
other thread having the lock. In practice ll/sc are always used as matching
pairs with no intervening memory accesses, so this is never a problem.
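
The lock scenario above can be sketched as follows. This is an illustration
under assumptions, not qemu code: the value-comparing sc is modelled directly
as a compare-and-swap against the value seen by ll, and lock_ll, lock_sc and
lock_release are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { UNLOCKED = 0, LOCKED = 1 };

/* ll half of a lock acquire: read the current lock word. */
static int lock_ll(atomic_int *lock)
{
    return atomic_load(lock);
}

/*
 * sc half under the value-comparison model: take the lock only if it was
 * seen free and the word still holds that same value now.
 */
static bool lock_sc(atomic_int *lock, int seen)
{
    if (seen != UNLOCKED) {
        return false; /* lock was held at ll time, caller retries */
    }
    return atomic_compare_exchange_strong(lock, &seen, LOCKED);
}

static void lock_release(atomic_int *lock)
{
    atomic_store(lock, UNLOCKED);
}
```

If thread A calls lock_ll, thread B then acquires and releases the lock, and
A finally calls lock_sc, A's store succeeds even though the word changed in
between. Real hardware would make A retry, but in both cases exactly one
thread holds the lock afterwards.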

I could probably come up with synthetic testcases where qemu behavior is
observably different from real hardware. However, I'm pretty certain this
never occurs in real code, and it is questionable whether such behavior is
architecturally defined.

If you still believe this is a problem, you need to come up with an actual
testcase that demonstrates how this can introduce a race condition.

Paul

[1] Lock acquisition is the most obvious example, but the same applies to any 
atomic operation implemented on top of ll/sc.