From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S932140AbbAEXxt (ORCPT ); Mon, 5 Jan 2015 18:53:49 -0500
Received: from mail-pd0-f177.google.com ([209.85.192.177]:56175 "EHLO
 mail-pd0-f177.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1753084AbbAEXxr (ORCPT ); Mon, 5 Jan 2015 18:53:47 -0500
Message-ID: <1420502015.2910.6.camel@cyril>
Subject: Re: [PATCH 0/2] Quieten softlockup detector on virtualised kernels
From: Cyril Bur
To: Don Zickus
Cc: linux-kernel@vger.kernel.org, mpe@ellerman.id.au, drjones@redhat.com,
 akpm@linux-foundation.org, mingo@kernel.org, uobergfe@redhat.com,
 chaiw.fnst@cn.fujitsu.com, cl@linu.com, fabf@skynet.be,
 atomlin@redhat.com, benzh@chromium.org, mtosatti@redhat.com
Date: Tue, 06 Jan 2015 10:53:35 +1100
In-Reply-To: <20150105165057.GU116159@redhat.com>
References: <1419224764-11384-1-git-send-email-cyrilbur@gmail.com>
 <20150105165057.GU116159@redhat.com>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.10.4-0ubuntu2
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 2015-01-05 at 11:50 -0500, Don Zickus wrote:
> cc'ing Marcelo
>
> On Mon, Dec 22, 2014 at 04:06:02PM +1100, Cyril Bur wrote:
> > When the hypervisor pauses a virtualised kernel the kernel will observe a jump
> > in timebase, this can cause spurious messages from the softlockup detector.
> >
> > Whilst these messages are harmless, they are accompanied with a stack trace
> > which causes undue concern and more problematically the stack trace in the
> > guest has nothing to do with the observed problem and can only be misleading.
> >
> > Furthermore, on POWER8 this is completely avoidable with the introduction of
> > the Virtual Time Base (VTB) register.
>
> Hi Cyril,
>
> Your solution seems simple and doesn't disturb the softlockup code as much
> as the x86 solution does. The only small issue I had was the use of
> sched_clock instead of local_clock. I keep forgetting the difference
> (unstable clock is the biggest reason I think).

My apologies, it appears I stuffed up there: local_clock was used initially
in the softlockup code. I'll send a v2.

> Other than that, I am not the biggest fan of putting multiple virtual
> guest solutions for the same problem into the watchdog code. I would
> prefer a common solution/framework to leverage.

Agreed.

> I have the x86 folks focusing on the steal_time stuff. It started with
> KVM and I believe VMWare is working on utilizing it too (and maybe Xen).

I'm not sure I've ever seen this. Could you please point me towards
something I can look at?

> Not sure if that is useful or could be incorporated into the power8 code.
> Though to be honest I am curious if the steal_time code could be ported to
> your solution as it seems the watchdog code could remove all the
> steal_time warts.

Happy to help suss out the situation here. Again, it would help if you
could pass on what the x86 guys are working on.

Thanks,

Cyril

> I have cc'd Marcelo into this discussion as he was the last person I
> remember talking with about this problem.
>
> Cheers,
> Don