From mboxrd@z Thu Jan  1 00:00:00 1970
From: Brad Campbell <lists2009@fnarfbargle.com>
Subject: Re: XP machine freeze
Date: Sun, 19 Apr 2015 23:27:26 +0800
Message-ID: <5533C95E.5030707@fnarfbargle.com>
References: <009701d05ffb$5e37a740$1aa6f5c0$@astim.si> <550EE047.3030605@fnarfbargle.com> <5519BBF4.7080600@redhat.com> <552B40F7.5080107@fnarfbargle.com> <552BB8D5.7060200@redhat.com> <552BBA87.50109@fnarfbargle.com> <552BCC60.1000103@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
To: Paolo Bonzini <pbonzini@redhat.com>,
	Saso Slavicic <saso.linux@astim.si>, kvm@vger.kernel.org,
	=?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= <rkrcmar@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from ns3.fnarfbargle.com ([103.4.17.7]:51033 "EHLO
	ns3.fnarfbargle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750848AbbDSP1d (ORCPT <rfc822;kvm@vger.kernel.org>);
	Sun, 19 Apr 2015 11:27:33 -0400
In-Reply-To: <552BCC60.1000103@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>


On 13/04/15 22:02, Paolo Bonzini wrote:
>
> On 13/04/2015 14:45, Brad Campbell wrote:
>> G'day Paolo,
>>
>> Yes, on AMD and I've tried hard to reproduce it on Intel and been unable
>> to thus far.
>>
>> Now you mention it may be AMD specific, I have a spare motherboard and
>> processor sitting in a drawer. I'll bolt it together tomorrow and see if
>> I can reproduce it on another AMD machine. Two machines should let me
>> test it twice as fast.
>>
>> I got a fail this afternoon, so I'm due to reboot tonight. I'll just
>> revert that one suspect commit from a known bad kernel and see if that
>> cleans it up. If not then I'll work through the remainder of the
>> information in your mail. I really appreciate the attention you've paid
>> to this, it has been a frustrating bug for me because I'm in a position
>> of not knowing what I don't know, and obviously doing something wrong in
>> very long bisection processes.
> Actually, if you have time to change your course of action, please
> revert the one that Nadav pointed out (f210f7572bed, KVM: x86:
> Fix lost interrupt on irr_pending race) or cherry-pick it on top of 3.17.
>
> Paolo
>
OK, I think we have a winner. I manually applied the patch on top of
vanilla 3.17, and it has never gone anywhere near this long without a
failure on a bad kernel.

brad@srv:~$ uptime
  23:24:48 up 6 days,  1:01,  3 users,  load average: 1.48, 1.95, 2.48

So this patch went into the kernel during the 3.19 release cycle?
Would that make the affected kernels 3.16 through 3.18?

Regards,
Brad

-- 
Dolphins are so intelligent that within a few weeks they can
train Americans to stand at the edge of the pool and throw them
fish.