From mboxrd@z Thu Jan 1 00:00:00 1970
From: Brad Campbell
Subject: Re: XP machine freeze
Date: Sat, 04 Apr 2015 18:55:22 +0800
Message-ID: <551FC31A.8050305@fnarfbargle.com>
References: <009701d05ffb$5e37a740$1aa6f5c0$@astim.si>
 <550EE047.3030605@fnarfbargle.com>
 <5519BBF4.7080600@redhat.com>
 <5519EA01.4010102@fnarfbargle.com>
 <004f01d06b7c$08b96970$1a2c3c50$@astim.si>
 <551A4A42.10309@fnarfbargle.com>
 <551A6121.5030400@redhat.com>
 <551A8213.7080100@fnarfbargle.com>
 <551A83BB.7040107@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
To: Paolo Bonzini, Saso Slavicic, kvm@vger.kernel.org
Return-path:
Received: from ns3.fnarfbargle.com ([103.4.17.7]:42705 "EHLO
 ns3.fnarfbargle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1751977AbbDDKza (ORCPT );
 Sat, 4 Apr 2015 06:55:30 -0400
In-Reply-To: <551A83BB.7040107@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 31/03/15 19:23, Paolo Bonzini wrote:
>
> On 31/03/2015 13:16, Brad Campbell wrote:
>> If you look at the bisect point I'm currently at it's a mix of i2c and
>> arm. The only vaguely relevant (as far as I can see) commit is the
>> addition of the getrandom() syscall, so my bisect is looking dodgy at
>> best. If I can come up with a better test case on a non-critical box
>> then I'll be in a better position to try and help get to the bottom of
>> the issue.
> Yes, the bisect went wrong somewhere. Still, the 'bad' commit leave
> both a net-next and a KVM merge, so it could be worse. :)
>

So the bisect went horribly wrong _again_ and I've been completely
unable to reproduce this problem on a test or staging machine (tried
both), so I'm down to trying the 4 commits you suggested (pre/post KVM &
pre/post net) to see if I can find bookends for another targeted bisect.

I'm 23 hours into the pre-kvm commit, so I probably need another week or
two to at least identify some new bisect points, as I really want to
leave it running for 4 or 5 days before calling a kernel good.

I tried running iperf in various automated incarnations to speed up the
determination of a bad kernel, but it made absolutely no difference at
all to the fault time. The other thing that occurred to me is of course
I'm sucking in about 1.5Mb/s through the network and immediately
streaming it out to disk, so it's entirely possible it may be disk
related too.

I'll keep plugging away. In the meantime, if anyone has any ideas I'm
all ears.

Regards,
Brad
-- 
Dolphins are so intelligent that within a few weeks they can
train Americans to stand at the edge of the pool and throw them
fish.