From mboxrd@z Thu Jan 1 00:00:00 1970 From: catalin.marinas@arm.com (Catalin Marinas) Date: Wed, 19 Feb 2014 17:05:15 +0000 Subject: [PATCH 1/2] arm64: atomics: fix use of acquire + release for full barrier semantics In-Reply-To: <1391516953-14541-1-git-send-email-will.deacon@arm.com> References: <1391516953-14541-1-git-send-email-will.deacon@arm.com> Message-ID: <20140219170515.GE22252@arm.com> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org On Tue, Feb 04, 2014 at 12:29:12PM +0000, Will Deacon wrote: > Linux requires a number of atomic operations to provide full barrier > semantics, that is no memory accesses after the operation can be > observed before any accesses up to and including the operation in > program order. > > On arm64, these operations have been incorrectly implemented as follows: > > // A, B, C are independent memory locations > > > > // atomic_op (B) > 1: ldaxr x0, [B] // Exclusive load with acquire > > stlxr w1, x0, [B] // Exclusive store with release > cbnz w1, 1b > > > > The assumption here being that two half barriers are equivalent to a > full barrier, so the only permitted ordering would be A -> B -> C > (where B is the atomic operation involving both a load and a store). > > Unfortunately, this is not the case by the letter of the architecture > and, in fact, the accesses to A and C are permitted to pass their > nearest half barrier resulting in orderings such as Bl -> A -> C -> Bs > or Bl -> C -> A -> Bs (where Bl is the load-acquire on B and Bs is the > store-release on B). This is a clear violation of the full barrier > requirement. > > The simple way to fix this is to implement the same algorithm as ARMv7 > using explicit barriers: > > > > // atomic_op (B) > dmb ish // Full barrier > 1: ldxr x0, [B] // Exclusive load > > stxr w1, x0, [B] // Exclusive store > cbnz w1, 1b > dmb ish // Full barrier > > > > but this has the undesirable effect of introducing *two* full barrier > instructions. A better approach is actually the following, non-intuitive > sequence: > > > > // atomic_op (B) > 1: ldxr x0, [B] // Exclusive load > > stlxr w1, x0, [B] // Exclusive store with release > cbnz w1, 1b > dmb ish // Full barrier > > > > The simple observations here are: > > - The dmb ensures that no subsequent accesses (e.g. the access to C) > can enter or pass the atomic sequence. > > - The dmb also ensures that no prior accesses (e.g. the access to A) > can pass the atomic sequence. > > - Therefore, no prior access can pass a subsequent access, or > vice-versa (i.e. A is strictly ordered before C). > > - The stlxr ensures that no prior access can pass the store component > of the atomic operation. > > The only tricky part remaining is the ordering between the ldxr and the > access to A, since the absence of the first dmb means that we're now > permitting re-ordering between the ldxr and any prior accesses. > > From an (arbitrary) observer's point of view, there are two scenarios: > > 1. We have observed the ldxr. This means that if we perform a store to > [B], the ldxr will still return older data. If we can observe the > ldxr, then we can potentially observe the permitted re-ordering > with the access to A, which is clearly an issue when compared to > the dmb variant of the code. Thankfully, the exclusive monitor will > save us here since it will be cleared as a result of the store and > the ldxr will retry. Notice that any use of a later memory > observation to imply observation of the ldxr will also imply > observation of the access to A, since the stlxr/dmb ensure strict > ordering. > > 2. We have not observed the ldxr. This means we can perform a store > and influence the later ldxr. However, that doesn't actually tell > us anything about the access to [A], so we've not lost anything > here either when compared to the dmb variant. > > This patch implements this solution for our barriered atomic operations, > ensuring that we satisfy the full barrier requirements where they are > needed. > > Cc: > Cc: Catalin Marinas > Cc: Peter Zijlstra > Signed-off-by: Will Deacon Reviewed-by: Catalin Marinas