From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jack Steiner
Date: Mon, 03 Nov 2003 19:28:56 +0000
Subject: Re: [RFC] Better MCA recovery on IPF
Message-Id:
List-Id:
References:
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: linux-ia64@vger.kernel.org

On Mon, Nov 03, 2003 at 10:42:48AM -0800, Alberto Munoz wrote:
> > > Hi,
> > >
> > > I just wondered if a speculative load hitting a cache or memory
> > > error does cause an exception on IA64?
> >
> > I don't think a speculative load should cause a problem - at least
> > until code tries to consume the data by transferring it to a
> > processor register.
>
> If you are doing a read (which is what a speculative load will be
> generating), the error will be generated by whatever part of the logic that
> detects it. You cannot possibly send poisoned data through a memory bus and a
> system bus (at least not the Intel system buses I am familiar with) without
> having some of the error checking logic (ECC or parity) complaining about it
> (this means generating an MCA).

As the poisoned data flows through the buses, errors may be reported, but
these errors are not reported to the OS as uncorrected/fatal MCA errors.
Depending on your chipset, errors are logged as platform errors.

There is a good paper by Tony Luck (Intel) that describes data poisoning
as used in IA64. You can find it on Google or at:

	archive.linuxsymposium.org/ols2003/Proceedings/All-Reprints/Reprint-Luck-OLS2003.pdf

See the section on "data poisoning".

> > As I understand the cpu architecture, an error that occurs reading
> > data will result in a poisoned cache line being delivered to the cpu
> > cache. The poisoned cache line can stay in the cache forever. No MCA
> > error is reported until the data is actually consumed by transferring
> > the data from cache to a cpu register.
>
> The problem is that the cache error checking logic has no way of knowing that
> the data it is about to supply to some register is going to be used for a
> speculative operation. The cache logic is pretty far away (in processor
> terms) from the decoding logic.
>
> Bert Munoz
>
> > This requires some support from the chipset. Some chipsets don't fully
> > support this error model.

--
Thanks

Jack Steiner (steiner@sgi.com)          651-683-5302
Principal Engineer                      SGI - Silicon Graphics, Inc.