From: Alok Kataria <akataria@vmware.com>
To: "weijg.fnst@cn.fujitsu.com" <weijg.fnst@cn.fujitsu.com>
Cc: "tglx@linutronix.de" <tglx@linutronix.de>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"x86@kernel.org" <x86@kernel.org>
Subject: Re: RFC: Fix kdump failed with 'notsc'
Date: Fri, 24 Jun 2016 10:41:07 +0000
Message-ID: <1466765165.24676.22.camel@vmware.com>
In-Reply-To: <1465898098.16116.52.camel@localhost>

Hi Wei, 

On Tue, 2016-06-14 at 09:56 +0000, Wei, Jiangang wrote:
> Hi,
> 
> When I trigger a kernel crash and specify 'notsc' for the capture kernel,
> kdump hangs in calibrate_delay_converge().
> 
> /* wait for "start of" clock tick */
> ticks = jiffies;
> while (ticks == jiffies)
>         ; /* nothing */
> 
> The reason is that jiffies stays the same and never changes.
> 
> The serial console log is as follows:
> ............
> [    0.000000] Linux version 4.7.0-rc2+ (root@localhost.localdomain)
> (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #2 SMP Wed Jun
> 156
> [    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.7.0-rc2+ 
> root=/dev/mapper/centos-root ro rd.lvm.lv=centos/swap
> vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=256M
> vconsole.keymap=us console=tty0 console=ttyS0,115200n8 LANG=en_US.UTF-8
> irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off
> panic=10 rootflags=nofail acpi_no_memhotplug notsc 
> ............
> [    0.000000] tsc: Kernel compiled with CONFIG_X86_TSC, cannot disable
> TSC completely
> ............
> [    0.000000] clocksource: hpet: mask: 0xffffffff max_cycles:
> 0xffffffff, max_idle_ns: 133484882848 ns
> [    0.000000] tsc: Fast TSC calibration using PIT
> [    0.000000] tsc: Detected 3192.714 MHz processor
> [    0.000000] Calibrating delay loop... 
> 
> # The last log line is printed by calibrate_delay(), which calls
> calibrate_delay_converge() to compute the lpj value.
> 
> # So far, I don't know why jiffies stays the same.
> # But I found two methods that avoid this problem.
> 
> 1) Specify 'lpj=<n>' together with 'notsc'.
> 
> 2) Revert 70de9a9.
> 
>     commit 70de9a97049e0ba79dc040868564408d5ce697f9
>     Author: Alok Kataria <akataria@vmware.com>
>     Date:   Mon Nov 3 11:18:47 2008 -0800
> 
>     x86: don't use tsc_khz to calculate lpj if notsc is passed
>     
>     Impact: fix udelay when "notsc" boot parameter is passed
>     
>     With notsc passed on commandline, tsc may not be used for
>     udelays, make sure that we do not use tsc_khz to calculate
>     the lpj value in such cases.
> 
> IMO,
> the flow for getting tsc_khz is as follows:
> tsc_init()->x86_platform.calibrate_tsc()->native_calibrate_tsc()->quick_pit_calibrate().
> None of this code uses or calls 'rdtsc'.

The intent of that change was to skip calculating the lpj value from
tsc_khz when notsc is specified. Note that it has nothing to do with
using rdtsc for TSC frequency calibration; rather, the lpj value derived
from the TSC frequency (tsc_khz) is what udelay uses (see delay_tsc).
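
To illustrate the arithmetic only, here is a rough user-space sketch of
how an lpj value follows from tsc_khz. It is not the kernel code
verbatim: the helper name lpj_from_tsc_khz is hypothetical, and the tick
rate (HZ) is passed in as a parameter because the reported kernel's
CONFIG_HZ is not shown in the log.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): loops-per-jiffy derived
 * from the TSC frequency. tsc_khz is in kHz; hz is the tick rate, which
 * is an assumption here since the report does not state CONFIG_HZ.
 */
static unsigned long lpj_from_tsc_khz(unsigned int tsc_khz, unsigned int hz)
{
	/* kHz -> cycles per second, widened to avoid 32-bit overflow */
	uint64_t cycles_per_sec = (uint64_t)tsc_khz * 1000;

	/* cycles per second / ticks per second = cycles per tick */
	return (unsigned long)(cycles_per_sec / hz);
}
```

With the 3192.714 MHz value from the log (tsc_khz = 3192714) and an
assumed HZ of 1000, this would give an lpj of 3192714.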

If notsc is passed, we skip assigning a value to lpj_fine since the TSC
is no longer used for implementing delay. Instead, lpj is calibrated in
calibrate_delay, which calls calibrate_delay_converge. Looking at
calibrate_delay_converge, it expects jiffies to advance; otherwise you
will wait there endlessly:

static unsigned long calibrate_delay_converge(void)
{
...
   /* wait for "start of" clock tick */
   ticks = jiffies;
   while (ticks == jiffies)
      ; /* nothing */

You should really look at why jiffies is not incrementing.
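
One way to narrow this down while debugging might be to bound the wait
so that a stuck counter is reported instead of hanging. The following is
only a user-space sketch of that idea under stated assumptions:
fake_jiffies stands in for the kernel's jiffies, and wait_for_tick and
tick_at_100 are hypothetical names, not kernel functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's jiffies counter (assumption for this sketch). */
static volatile unsigned long fake_jiffies;

/*
 * Hypothetical bounded variant of the "start of tick" wait.
 * The optional tick() hook simulates a timer interrupt; pass NULL to
 * model a system where no timer interrupt ever arrives.
 * Returns true if the counter advanced within max_spins iterations,
 * false if it never moved.
 */
static bool wait_for_tick(unsigned long max_spins,
			  void (*tick)(unsigned long spin))
{
	unsigned long start = fake_jiffies;

	for (unsigned long spin = 0; spin < max_spins; spin++) {
		if (tick)
			tick(spin);
		if (fake_jiffies != start)
			return true;
	}
	return false;
}

/* Simulated timer interrupt that fires once, on the 100th spin. */
static void tick_at_100(unsigned long spin)
{
	if (spin == 100)
		fake_jiffies++;
}
```

Calling wait_for_tick(1000, NULL) models the reported hang and returns
false, while wait_for_tick(1000, tick_at_100) returns true once the
simulated tick arrives.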

> 
> Even if 'notsc' is passed, tsc_khz is credible,
> and we can derive lpj from it.
> 
> So I want to push a patch to revert 70de9a9.
> Any comments or suggestions are appreciated.

As mentioned above, reverting change 70de9a9 is wrong and would just be
papering over the actual issue.

Thanks,
Alok

> 
> Thanks,
> wei