From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755253AbbFCVLv (ORCPT ); Wed, 3 Jun 2015 17:11:51 -0400
Received: from www.sr71.net ([198.145.64.142]:50698 "EHLO blackbird.sr71.net"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753600AbbFCVLm (ORCPT ); Wed, 3 Jun 2015 17:11:42 -0400
Message-ID: <556F6D8C.3040204@sr71.net>
Date: Wed, 03 Jun 2015 14:11:40 -0700
From: Dave Hansen
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: John Stultz
CC: LKML, "Yu, Fenghua", the arch/x86 maintainers
Subject: Re: hangs in verify_pmtmr_rate()
References: <556F5BAF.8010303@sr71.net>
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/03/2015 01:46 PM, John Stultz wrote:
> On Wed, Jun 3, 2015 at 12:55 PM, Dave Hansen wrote:
>> I'm seeing boot hangs when trying to boot a 32-bit 4.1.0-rc5 kernel on
>> some 64-bit CPUs (I'm not sure if it is a regression). The NMI watchdog
>> shows init_acpi_pm_clocksource() as the last thing in the backtrace,
>> specifically verify_pmtmr_rate()'s I/O instructions. It appears to be
>> mach_countup()'s while loop that gets stuck.
>>
>> Booting with "pmtmr=0" works around this for me, as would unsetting
>> CONFIG_X86_PM_TIMER I'd imagine.
>>
>> The hardware I'm doing this on is a bit wonky and I think the hpet is
>> broken on it.
>>
>> Does this look like *really* broken hardware, or something that we
>> should be detecting and able to recover from?
>
> Hrm. Does this machine have a working PIT? It _should_. :)
> Does pit_calibrate_tsc() end up being used on this box to calibrate
> the TSC (its similar logic, so it should get stuck in the same way),
> or does it use a different method for tsc calibration?

I end up seeing:

> tsc: Fast TSC calibration failed
> tsc: Using PIT calibration value
> tsc: Detected 911.616 MHz processor
> Calibrating delay loop (skipped), value calculated using timer frequency.. 1823.23 BogoMIPS... (lpj=3646464)

Which makes it look like native_calibrate_tsc() managed to successfully
do a pit_calibrate_tsc() and got to the code below (otherwise we would
have hit the (tsc_pit_min == ULONG_MAX) case).

>	/* We don't have an alternative source, use the PIT calibration value */
>	if (!hpet && !ref1 && !ref2) {
>		pr_info("Using PIT calibration value\n");
>		return tsc_pit_min;
>	}

I later see:

> timekeeping watchdog: Marking clocksource 'tsc' as unstable, because skew is too large:
>	'refined-jiffies' wd_now: ffeef86 wd_last: fffeef09 mask: ffffffff