From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com ([66.187.233.31]:2966 "EHLO mx1.redhat.com") by vger.kernel.org with ESMTP id S1751491AbWCIOBf (ORCPT ); Thu, 9 Mar 2006 09:01:35 -0500 From: David Howells In-Reply-To: <21627.1141846631@warthog.cambridge.redhat.com> References: <21627.1141846631@warthog.cambridge.redhat.com> <29826.1141828678@warthog.cambridge.redhat.com> <31492.1141753245@warthog.cambridge.redhat.com> Subject: Re: [PATCH] Document Linux's memory barriers [try #3] Date: Thu, 09 Mar 2006 14:01:16 +0000 Message-ID: <27749.1141912876@warthog.cambridge.redhat.com> Sender: linux-arch-owner@vger.kernel.org To: David Howells Cc: torvalds@osdl.org, akpm@osdl.org, mingo@redhat.com, alan@redhat.com, linux-arch@vger.kernel.org, linuxppc64-dev@ozlabs.org, linux-kernel@vger.kernel.org List-ID: I'm thinking of adding the attached to the document. Any comments or objections? David diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 6eeb7e4..f9a9192 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -4,6 +4,8 @@ Contents: + (*) What do we consider memory? + (*) What are memory barriers? (*) Where are memory barriers needed? @@ -32,6 +34,82 @@ Contents: (*) References. +=========================== +WHAT DO WE CONSIDER MEMORY? +=========================== + +For the purpose of this specification, "memory", at least as far as cached CPU +vs CPU interactions go, has to include the CPU caches in the system. Although +any particular read or write may not actually appear outside of the CPU that +issued it because the CPU was able to satisfy it from its own cache, it's still +as if the memory access had taken place as far as the other CPUs are concerned +since the cache coherency and ejection mechanisms will propegate the effects +upon conflict. + +Consider the system logically as: + + <--- CPU ---> : <----------- Memory -----------> + : + +--------+ +--------+ : +--------+ +-----------+ + | | | | : | | | | +---------+ + | CPU | | Memory | : | CPU | | | | | + | Core |--->| Access |----->| Cache |<-->| | | | + | | | Queue | : | | | |--->| Memory | + | | | | : | | | | | | + +--------+ +--------+ : +--------+ | | | | + : | Cache | +---------+ + : | Coherency | + : | Mechanism | +---------+ + +--------+ +--------+ : +--------+ | | | | + | | | | : | | | | | | + | CPU | | Memory | : | CPU | | |--->| Device | + | Core |--->| Access |----->| Cache |<-->| | | | + | | | Queue | : | | | | | | + | | | | : | | | | +---------+ + +--------+ +--------+ : +--------+ +-----------+ + : + : + +The CPU core may execute instructions in any order it deems fit, provided the +expected program causality appears to be maintained. Some of the instructions +generate load and store operations which then go into the memory access queue +to be performed. The core may place these in the queue in any order it wishes, +and continue execution until it is forced to wait for an instruction to +complete. + +What memory barriers are concerned with is controlling the order in which +accesses cross from the CPU side of things to the memory side of things, and +the order in which the effects are perceived to happen by the other observers +in the system. + + +Note that the above model does not show uncached memory or I/O accesses. These +procede directly from the queue to the memory or the devices, bypassing any +cache coherency: + + <--- CPU ---> : + : +-----+ + +--------+ +--------+ : | | + | | | | : | | +---------+ + | CPU | | Memory | : | | | | + | Core |--->| Access |--------------->| | | | + | | | Queue | : | |------------->| Memory | + | | | | : | | | | + +--------+ +--------+ : | | | | + : | | +---------+ + : | Bus | + : | | +---------+ + +--------+ +--------+ : | | | | + | | | | : | | | | + | CPU | | Memory | : | |<------------>| Device | + | Core |--->| Access |--------------->| | | | + | | | Queue | : | | | | + | | | | : | | +---------+ + +--------+ +--------+ : | | + : +-----+ + : + + ========================= WHAT ARE MEMORY BARRIERS? ========================= @@ -448,8 +526,8 @@ In all cases there are variants on a LOC The LOCK accesses will be completed before the UNLOCK accesses. -And therefore an UNLOCK followed by a LOCK is equivalent to a full barrier, but -a LOCK followed by an UNLOCK isn't. + Therefore an UNLOCK followed by a LOCK is equivalent to a full barrier, + but a LOCK followed by an UNLOCK is not. Locks and semaphores may not provide any guarantee of ordering on UP compiled systems, and so can't be counted on in such a situation to actually do anything