From mboxrd@z Thu Jan 1 00:00:00 1970
From: Brad Campbell
Subject: Re: XP machine freeze
Date: Mon, 13 Apr 2015 20:45:59 +0800
Message-ID: <552BBA87.50109@fnarfbargle.com>
References: <009701d05ffb$5e37a740$1aa6f5c0$@astim.si> <550EE047.3030605@fnarfbargle.com> <5519BBF4.7080600@redhat.com> <552B40F7.5080107@fnarfbargle.com> <552BB8D5.7060200@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
To: Paolo Bonzini , Saso Slavicic , kvm@vger.kernel.org, =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?=
Return-path:
Received: from ns3.fnarfbargle.com ([103.4.17.7]:57871 "EHLO ns3.fnarfbargle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932418AbbDMMqD (ORCPT ); Mon, 13 Apr 2015 08:46:03 -0400
In-Reply-To: <552BB8D5.7060200@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 13/04/15 20:38, Paolo Bonzini wrote:
>
> On 13/04/2015 06:07, Brad Campbell wrote:
>> On 31/03/15 05:11, Paolo Bonzini wrote:
>>> On 22/03/2015 16:31, Brad Campbell wrote:
>>>> No help I'm afraid, but at least I can conclusively say that 3.16 is
>>>> good, and 3.17 is bad.
>>> Can you try more specifically around the first KVM pull request? That
>>> would be between c9b88e958182 (presumed good) and 8533ce727188 (presumed
>>> bad)?
>>>
>>>
>>
>> G'day Paolo.
>>
>> I can confirm that the fault appears to lie between good and bad as
>> specified above.
>> Bad failed before 48 hours, good ran for 143 hours. I'm bisecting now.
> Thanks! Remember to bisect only with arch/x86/kvm.
>
> Also:
>
> 1) Brad, I see you are on AMD. Have you ever reproduced it on Intel?
> Saso, are you on AMD as well?
>
> If so, the most likely culprit is this:
>
> commit 6addfc42992be4b073c39137ecfdf4b2aa2d487f
> Author: Paolo Bonzini
> Date: Thu Mar 27 11:29:28 2014 +0100

G'day Paolo,

Yes, on AMD and I've tried hard to reproduce it on Intel and been unable to thus far.

Now that you mention it may be AMD specific, I have a spare motherboard and processor sitting in a drawer. I'll bolt it together tomorrow and see if I can reproduce it on another AMD machine. Two machines should let me test it twice as fast.

I got a fail this afternoon, so I'm due to reboot tonight. I'll just revert that one suspect commit from a known bad kernel and see if that cleans it up. If not, then I'll work through the remainder of the information in your mail.

I really appreciate the attention you've paid to this. It has been a frustrating bug for me because I'm in a position of not knowing what I don't know, and am obviously doing something wrong in very long bisection processes.

Regards,
Brad
-- 
Dolphins are so intelligent that within a few weeks they can train Americans to stand at the edge of the pool and throw them fish.