From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marcelo Tosatti
Subject: Re: just a dump
Date: Mon, 25 May 2009 15:47:38 -0300
Message-ID: <20090525184738.GA7137@amt.cnet>
References: <4A09E620.3040300@xs4all.nl> <4A09F62A.8010203@xs4all.nl>
 <20090515144923.GA6304@amt.cnet> <4A0E7B81.6070203@xs4all.nl>
 <20090516131046.GB3153@amt.cnet> <4A152B7C.4080401@xs4all.nl>
 <4A152E9B.3060500@xs4all.nl> <20090523214753.GA17590@amt.cnet>
 <4A19344B.8070503@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Hans de Bruin, kvm@vger.kernel.org
To: Avi Kivity
Return-path:
Received: from mx2.redhat.com ([66.187.237.31]:34152 "EHLO mx2.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752126AbZEYSrp (ORCPT ); Mon, 25 May 2009 14:47:45 -0400
Content-Disposition: inline
In-Reply-To: <4A19344B.8070503@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Sun, May 24, 2009 at 02:49:31PM +0300, Avi Kivity wrote:
> Marcelo Tosatti wrote:
>>> another one. This time the vm where booted sequentialy:
>>>
>>> [ 253.268993] kvm: 2907: cpu0 unimplemented perfctr wrmsr:
>>> 0xc0010003 data 0x0
>>> [ 475.036542] rmap_remove: ffff8800cdb913b0 10 0->BUG
>>>
>>                                              ^^^
>>
>> So 0x10 is 1 bit different from 0x00, which would be the no present
>> entry for AMD.
>>
>> Usually an indication of hardware problems. Perhaps you want to try
>> the test Lucas mentioned.
>>
>
> It's actually the PCD bit, not that it affects your analysis. Strange
> that we see this pattern (1 bit differences on sptes) on both AMD and
> Intel. Very suspicious.

On Intel there was confirmation (through memtest86) that it was memory
failure. The pattern was:

0xff7ffffffffff001
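
For anyone who wants to check the "1 bit different" reasoning by hand, here
is a rough sketch in Python. It XORs the observed spte against the value
that was expected and lists the bit positions that differ. The helper names
are made up for illustration and are not KVM code. The AMD comparison
(0x10 against 0x00) comes straight from the thread. For the Intel pattern
the expected value is not given above, so 0xfffffffffffff001 below is my
assumption about the nonpresent pattern in use at the time.

```python
# Sketch only: helper names are hypothetical, not taken from KVM.

# Low flag bits of an x86 page-table entry (architectural layout).
PTE_FLAG_NAMES = {0: "P", 1: "R/W", 2: "U/S", 3: "PWT", 4: "PCD", 5: "A", 6: "D"}


def differing_bits(observed, expected):
    """Return the bit positions (0 = least significant) where the values differ."""
    diff = observed ^ expected
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


def describe(observed, expected):
    """Summarize the difference, naming the low flag bits where known."""
    names = [PTE_FLAG_NAMES.get(b, "bit %d" % b)
             for b in differing_bits(observed, expected)]
    return "%#x vs %#x: %d bit(s) differ (%s)" % (
        observed, expected, len(names), ", ".join(names))


# AMD case from the thread: spte 0x10 where the nonpresent entry 0x00 was expected.
print(describe(0x10, 0x00))

# Intel case: the expected value is an ASSUMPTION, it is not stated in the thread.
ASSUMED_INTEL_NONPRESENT = 0xfffffffffffff001
print(describe(0xff7ffffffffff001, ASSUMED_INTEL_NONPRESENT))
```

With the AMD values the only differing bit is bit 4, which is PCD, as Avi
says. If the assumed Intel value is right, that pattern is also a single-bit
difference.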