From mboxrd@z Thu Jan 1 00:00:00 1970
From: Brad Campbell
Subject: Re: XP machine freeze
Date: Sun, 19 Apr 2015 23:27:26 +0800
Message-ID: <5533C95E.5030707@fnarfbargle.com>
References: <009701d05ffb$5e37a740$1aa6f5c0$@astim.si> <550EE047.3030605@fnarfbargle.com> <5519BBF4.7080600@redhat.com> <552B40F7.5080107@fnarfbargle.com> <552BB8D5.7060200@redhat.com> <552BBA87.50109@fnarfbargle.com> <552BCC60.1000103@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
To: Paolo Bonzini , Saso Slavicic , kvm@vger.kernel.org, =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?=
Return-path:
Received: from ns3.fnarfbargle.com ([103.4.17.7]:51033 "EHLO ns3.fnarfbargle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750848AbbDSP1d (ORCPT ); Sun, 19 Apr 2015 11:27:33 -0400
In-Reply-To: <552BCC60.1000103@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 13/04/15 22:02, Paolo Bonzini wrote:
>
> On 13/04/2015 14:45, Brad Campbell wrote:
>> G'day Paolo,
>>
>> Yes, on AMD and I've tried hard to reproduce it on Intel and been unable
>> to thus far.
>>
>> Now you mention it may be AMD specific, I have a spare motherboard and
>> processor sitting in a drawer. I'll bolt it together tomorrow and see if
>> I can reproduce it on another AMD machine. Two machines should let me
>> test it twice as fast.
>>
>> I got a fail this afternoon, so I'm due to reboot tonight. I'll just
>> revert that one suspect commit from a known bad kernel and see if that
>> cleans it up. If not then I'll work through the remainder of the
>> information in your mail. I really appreciate the attention you've paid
>> to this, it has been a frustrating bug for me because I'm in a position
>> of not knowing what I don't know, and obviously doing something wrong in
>> very long bisection processes.
> Actually, if you have time to change your course of action, please
> revert the one that Nadav pointed out (f210f7572bed, KVM: x86:
> Fix lost interrupt on irr_pending race) or cherry-pick it on top of 3.17.
>
> Paolo
>

Ok, I think we have a winner. Patch manually plopped on top of vanilla 3.17.
It has never gone for anywhere near this long on a bad kernel.

brad@srv:~$ uptime
 23:24:48 up 6 days, 1:01, 3 users, load average: 1.48, 1.95, 2.48

So this patch went into the kernel during the 3.19 release cycle?
Affected kernels 3.16-3.18?

Regards,
Brad
-- 
Dolphins are so intelligent that within a few weeks they can train
Americans to stand at the edge of the pool and throw them fish.