From: Alok Kataria <akataria@vmware.com>
To: "weijg.fnst@cn.fujitsu.com" <weijg.fnst@cn.fujitsu.com>
Cc: "tglx@linutronix.de" <tglx@linutronix.de>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"x86@kernel.org" <x86@kernel.org>
Subject: Re: RFC: Fix kdump failed with 'notsc'
Date: Fri, 24 Jun 2016 10:41:07 +0000
Message-ID: <1466765165.24676.22.camel@vmware.com>
In-Reply-To: <1465898098.16116.52.camel@localhost>

Hi Wei, 

On Tue, 2016-06-14 at 09:56 +0000, Wei, Jiangang wrote:
> Hi,
> 
> When I trigger a kernel crash and specify 'notsc' for the capture kernel,
> the kdump process blocks in calibrate_delay_converge().
> 
> /* wait for "start of" clock tick */
> ticks = jiffies;
> while (ticks == jiffies)
>         ; /* nothing */
> 
> The reason is that jiffies stays the same and never changes.
> 
> The serial console log is as follows:
> ............
> [    0.000000] Linux version 4.7.0-rc2+ (root@localhost.localdomain)
> (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #2 SMP Wed Jun
> 156
> [    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.7.0-rc2+ 
> root=/dev/mapper/centos-root ro rd.lvm.lv=centos/swap
> vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=256M
> vconsole.keymap=us console=tty0 console=ttyS0,115200n8 LANG=en_US.UTF-8
> irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off
> panic=10 rootflags=nofail acpi_no_memhotplug notsc 
> ............
> [    0.000000] tsc: Kernel compiled with CONFIG_X86_TSC, cannot disable
> TSC completely
> ............
> [    0.000000] clocksource: hpet: mask: 0xffffffff max_cycles:
> 0xffffffff, max_idle_ns: 133484882848 ns
> [    0.000000] tsc: Fast TSC calibration using PIT
> [    0.000000] tsc: Detected 3192.714 MHz processor
> [    0.000000] Calibrating delay loop... 
> 
> # The last log line is printed by calibrate_delay(), which calls
> calibrate_delay_converge() to compute the lpj value.
> 
> # So far, I don't know why jiffies stays the same.
> # But I found two methods that avoid this problem:
> 
> 1) Specify 'lpj=<n>' together with 'notsc'.
> 
> 2) Revert 70de9a9.
> 
>     commit 70de9a97049e0ba79dc040868564408d5ce697f9
>     Author: Alok Kataria <akataria@vmware.com>
>     Date:   Mon Nov 3 11:18:47 2008 -0800
> 
>     x86: don't use tsc_khz to calculate lpj if notsc is passed
>     
>     Impact: fix udelay when "notsc" boot parameter is passed
>     
>     With notsc passed on commandline, tsc may not be used for
>     udelays, make sure that we do not use tsc_khz to calculate
>     the lpj value in such cases.
> 
> IMO, the flow for obtaining tsc_khz is as follows:
> tsc_init()->x86_platform.calibrate_tsc()->native_calibrate_tsc()->quick_pit_calibrate().
> None of this code uses or calls 'rdtsc'.

The intent of that change was to skip calculating the lpj value based on
the tsc_khz value if notsc is specified. Note that it has nothing to do
with using rdtsc for tsc frequency calibration; rather, the lpj value
derived from the tsc frequency (tsc_khz) is what we use for udelay (see
delay_tsc).
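
To make the relationship between tsc_khz and lpj concrete, here is a
minimal userspace sketch of the arithmetic, assuming the usual
cycles-per-jiffy conversion (tsc_khz * 1000 / HZ). The helper name
lpj_from_tsc_khz and the HZ values are illustrative assumptions, not
kernel code; the tsc_khz value is taken from the "Detected 3192.714 MHz
processor" line in the quoted log.

```c
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name): convert a TSC frequency in kHz
 * into loops-per-jiffy, assuming one delay loop iteration per TSC cycle.
 *   cycles per second = tsc_khz * 1000
 *   cycles per jiffy  = cycles per second / HZ
 */
static uint64_t lpj_from_tsc_khz(uint64_t tsc_khz, unsigned int hz)
{
	return (tsc_khz * 1000ULL) / hz;
}
```

With tsc_khz = 3192714, this gives 3192714 for HZ=1000 and 12770856 for
HZ=250. With notsc this shortcut is skipped, so the value has to come
from the calibration loop instead.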

If notsc is passed, we skip assigning a value to lpj_fine since tsc is
no longer used for implementing delay. Instead, lpj is now calibrated in
calibrate_delay, which calls calibrate_delay_converge. Looking at
calibrate_delay_converge, it expects jiffies to advance; otherwise
you will wait endlessly there:

static unsigned long calibrate_delay_converge(void)
{
...
   /* wait for "start of" clock tick */
   ticks = jiffies;
   while (ticks == jiffies)
      ; /* nothing */

You should really look at why jiffies is not incrementing.
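
As a debugging aid, the hang can be modelled in userspace. The sketch
below is not a kernel patch: it only illustrates that the wait loop
above terminates if and only if the tick source advances. The names
wait_for_tick_edge, frozen_ticks and advancing_ticks are hypothetical,
and the spin bound is an assumption added so that a stalled counter is
reported instead of spinning forever.

```c
/* A tick source: stands in for reading jiffies. */
typedef unsigned long (*tick_reader)(void);

/*
 * Spin until the tick value changes, like the "start of clock tick" wait,
 * but give up after max_spins reads.
 * Returns 0 if a tick edge was seen, -1 if the counter never moved.
 */
static int wait_for_tick_edge(tick_reader read_ticks, unsigned long max_spins)
{
	unsigned long start = read_ticks();
	unsigned long i;

	for (i = 0; i < max_spins; i++) {
		if (read_ticks() != start)
			return 0;
	}
	return -1;
}

/* Models the failing case: the timer interrupt never updates the counter. */
static unsigned long frozen_ticks(void)
{
	return 0;
}

/* Models the healthy case: the counter advances once every 10 reads. */
static unsigned long advancing_ticks(void)
{
	static unsigned long calls;

	return ++calls / 10;
}
```

Calling wait_for_tick_edge(frozen_ticks, 1000) returns -1, which
corresponds to the endless wait seen in the capture kernel, while
wait_for_tick_edge(advancing_ticks, 1000) returns 0.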

> 
> Even if 'notsc' is passed, tsc_khz is credible,
> and we can derive lpj from it.
> 
> So I want to push a patch to revert 70de9a9.
> Any comments or suggestions are appreciated.

As mentioned above, reverting change 70de9a9 is wrong and would just
paper over the actual issue.

Thanks,
Alok

> 
> Thanks,
> wei
