public inbox for linux-kernel@vger.kernel.org
* hangs in verify_pmtmr_rate()
@ 2015-06-03 19:55 Dave Hansen
  2015-06-03 20:46 ` John Stultz
  2015-06-03 21:48 ` Yu, Fenghua
  0 siblings, 2 replies; 4+ messages in thread
From: Dave Hansen @ 2015-06-03 19:55 UTC (permalink / raw)
  To: LKML, Yu, Fenghua, John Stultz, the arch/x86 maintainers

I'm seeing boot hangs when trying to boot a 32-bit 4.1.0-rc5 kernel on
some 64-bit CPUs (I'm not sure if it is a regression).  The NMI watchdog
shows init_acpi_pm_clocksource() as the last thing in the backtrace,
specifically verify_pmtmr_rate()'s I/O instructions.  It appears to be
mach_countup()'s while loop that gets stuck.

Booting with "pmtmr=0" works around this for me, as would unsetting
CONFIG_X86_PM_TIMER I'd imagine.

The hardware I'm doing this on is a bit wonky and I think the hpet is
broken on it.

Does this look like *really* broken hardware, or something that we
should be detecting and able to recover from?
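
For illustration only, "detecting and recovering" could mean capping the number of polls so that a timer which never signals completion cannot hang boot. The following is a minimal userspace sketch, not the kernel's code: bounded_countup(), status_reader_fn, DONE_BIT, MAX_POLLS and the two test timers are hypothetical names and values chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status reader; stands in for a port read such as inb(). */
typedef uint8_t (*status_reader_fn)(void *ctx);

#define DONE_BIT   0x20u      /* assumed "counter expired" flag */
#define MAX_POLLS  100000ul   /* arbitrary cap chosen for this sketch */

/*
 * Poll until DONE_BIT is set or MAX_POLLS reads have been made.
 * On success, store the number of reads in *count_p and return true.
 * Return false if the bit never appeared, so the caller can give up
 * on this timer instead of spinning forever.
 */
static bool bounded_countup(status_reader_fn read_status, void *ctx,
                            unsigned long *count_p)
{
    unsigned long count = 0;

    assert(read_status != NULL && count_p != NULL);

    while (count < MAX_POLLS) {
        count++;
        if (read_status(ctx) & DONE_BIT) {
            *count_p = count;
            return true;
        }
    }
    return false;
}

/* Test double: a timer whose done bit never gets set. */
static uint8_t stuck_timer(void *ctx)
{
    (void)ctx;
    return 0;
}

/* Test double: a timer whose done bit is set from the 5th read onwards. */
static uint8_t working_timer(void *ctx)
{
    unsigned int *reads = ctx;

    return (++*reads >= 5) ? DONE_BIT : 0;
}
```

With stuck_timer the function returns false after MAX_POLLS reads; with working_timer it returns true and reports a count of 5. A caller could treat the false case as "skip this clocksource" rather than as a fatal error.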


* Re: hangs in verify_pmtmr_rate()
  2015-06-03 19:55 hangs in verify_pmtmr_rate() Dave Hansen
@ 2015-06-03 20:46 ` John Stultz
  2015-06-03 21:11   ` Dave Hansen
  2015-06-03 21:48 ` Yu, Fenghua
  1 sibling, 1 reply; 4+ messages in thread
From: John Stultz @ 2015-06-03 20:46 UTC (permalink / raw)
  To: Dave Hansen; +Cc: LKML, Yu, Fenghua, the arch/x86 maintainers

On Wed, Jun 3, 2015 at 12:55 PM, Dave Hansen <dave@sr71.net> wrote:
> I'm seeing boot hangs when trying to boot a 32-bit 4.1.0-rc5 kernel on
> some 64-bit CPUs (I'm not sure if it is a regression).  The NMI watchdog
> shows init_acpi_pm_clocksource() as the last thing in the backtrace,
> specifically verify_pmtmr_rate()'s I/O instructions.  It appears to be
> mach_countup()'s while loop that gets stuck.
>
> Booting with "pmtmr=0" works around this for me, as would unsetting
> CONFIG_X86_PM_TIMER I'd imagine.
>
> The hardware I'm doing this on is a bit wonky and I think the hpet is
> broken on it.
>
> Does this look like *really* broken hardware, or something that we
> should be detecting and able to recover from?

Hrm. Does this machine have a working PIT?

Does pit_calibrate_tsc() end up being used on this box to calibrate
the TSC (it's similar logic, so it should get stuck in the same way),
or does it use a different method for tsc calibration?

thanks
-john


* Re: hangs in verify_pmtmr_rate()
  2015-06-03 20:46 ` John Stultz
@ 2015-06-03 21:11   ` Dave Hansen
  0 siblings, 0 replies; 4+ messages in thread
From: Dave Hansen @ 2015-06-03 21:11 UTC (permalink / raw)
  To: John Stultz; +Cc: LKML, Yu, Fenghua, the arch/x86 maintainers

On 06/03/2015 01:46 PM, John Stultz wrote:
> On Wed, Jun 3, 2015 at 12:55 PM, Dave Hansen <dave@sr71.net> wrote:
>> I'm seeing boot hangs when trying to boot a 32-bit 4.1.0-rc5 kernel on
>> some 64-bit CPUs (I'm not sure if it is a regression).  The NMI watchdog
>> shows init_acpi_pm_clocksource() as the last thing in the backtrace,
>> specifically verify_pmtmr_rate()'s I/O instructions.  It appears to be
>> mach_countup()'s while loop that gets stuck.
>>
>> Booting with "pmtmr=0" works around this for me, as would unsetting
>> CONFIG_X86_PM_TIMER I'd imagine.
>>
>> The hardware I'm doing this on is a bit wonky and I think the hpet is
>> broken on it.
>>
>> Does this look like *really* broken hardware, or something that we
>> should be detecting and able to recover from?
> 
> Hrm. Does this machine have a working PIT?

It _should_.  :)

> Does pit_calibrate_tsc() end up being used on this box to calibrate
> the TSC (it's similar logic, so it should get stuck in the same way),
> or does it use a different method for tsc calibration?

I end up seeing:

> 	tsc: Fast TSC calibration failed
> 	tsc: Using PIT calibration value
> 	tsc: Detected 911.616 MHz processor
> 	Calibrating delay loop (skipped) , value calculated using timer frequency.. 1823.23 BogoMIPS... (lpj=3646464)

This makes it look like native_calibrate_tsc() successfully ran
pit_calibrate_tsc() and reached the code below (otherwise we would
have hit the (tsc_pit_min == ULONG_MAX) case).

>         /* We don't have an alternative source, use the PIT calibration value */
>         if (!hpet && !ref1 && !ref2) {
>                 pr_info("Using PIT calibration value\n");
>                 return tsc_pit_min;
>         }

I later see:

> 	timekeeping watchdog: Marking clocksource 'tsc' as unstable, because skew is too large:
> 	 'refined-jiffies' wd_now: ffeef86 wd_last: fffeef09 mask: ffffffff
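
For reference, the gap between the two watchdog readings in that message can be worked out by hand. The sketch below assumes the watchdog interval is a plain masked subtraction, (now - last) & mask; masked_delta() is a made-up helper name for this example, not a kernel function.

```c
#include <stdint.h>

/*
 * Masked difference between two counter readings. The mask keeps the
 * result correct when the counter has wrapped between the readings.
 */
static uint64_t masked_delta(uint64_t now, uint64_t last, uint64_t mask)
{
    return (now - last) & mask;
}

/*
 * With the values from the log line:
 *   masked_delta(0x0ffeef86, 0xfffeef09, 0xffffffff) == 0x1000007d
 */
```

Under that assumption the interval between the two refined-jiffies readings comes out as 0x1000007d (268435581 in decimal).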




* RE: hangs in verify_pmtmr_rate()
  2015-06-03 19:55 hangs in verify_pmtmr_rate() Dave Hansen
  2015-06-03 20:46 ` John Stultz
@ 2015-06-03 21:48 ` Yu, Fenghua
  1 sibling, 0 replies; 4+ messages in thread
From: Yu, Fenghua @ 2015-06-03 21:48 UTC (permalink / raw)
  To: Dave Hansen, LKML, John Stultz, the arch/x86 maintainers


> From: Dave Hansen [mailto:dave@sr71.net]
> Sent: Wednesday, June 03, 2015 12:55 PM
> To: LKML; Yu, Fenghua; John Stultz; the arch/x86 maintainers
> Subject: hangs in verify_pmtmr_rate()
> 
> I'm seeing boot hangs when trying to boot a 32-bit 4.1.0-rc5 kernel on some
> 64-bit CPUs (I'm not sure if it is a regression).  The NMI watchdog shows
> init_acpi_pm_clocksource() as the last thing in the backtrace, specifically
> verify_pmtmr_rate()'s I/O instructions.  It appears to be mach_countup()'s
> while loop that gets stuck.
> 
> Booting with "pmtmr=0" works around this for me, as would unsetting
> CONFIG_X86_PM_TIMER I'd imagine.
> 
> The hardware I'm doing this on is a bit wonky and I think the hpet is broken
> on it.
> 
> Does this look like *really* broken hardware, or something that we should
> be detecting and able to recover from?

The latest 4.1.0-rc5 defconfig boots fine without "pmtmr=0" on my machine,
which is the same kind as Dave's.

I always pass "hpet=disable" on this machine to work around a known
hardware issue.

Thanks.

-Fenghua


