linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sasha Levin <sasha.levin@oracle.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Arjan van de Ven <arjan@linux.intel.com>,
	lenb@kernel.org, rjw@rjwysocki.net,
	Eliezer Tamir <eliezer.tamir@linux.intel.com>,
	rui.zhang@intel.com, jacob.jun.pan@linux.intel.com,
	Mike Galbraith <bitbucket@online.de>,
	Ingo Molnar <mingo@kernel.org>,
	hpa@zytor.com, paulmck@linux.vnet.ibm.com,
	Thomas Gleixner <tglx@linutronix.de>,
	John Stultz <john.stultz@linaro.org>,
	Andy Lutomirski <luto@amacapital.net>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [PATCH 13/15] sched: Use a static_key for sched_clock_stable
Date: Tue, 21 Jan 2014 17:28:37 -0500	[thread overview]
Message-ID: <52DEF495.2020304@oracle.com> (raw)
In-Reply-To: <20131212141655.362219382@infradead.org>

On 12/12/2013 09:08 AM, Peter Zijlstra wrote:
> In order to avoid the runtime condition and variable load turn
> sched_clock_stable into a static_key.
>
> Also provide a shorter implementation of local_clock() and
> cpu_clock(int) when sched_clock_stable==1.
>
>                          MAINLINE   PRE       POST
>
>      sched_clock_stable: 1          1         1
>      (cold) sched_clock: 329841     221876    215295
>      (cold) local_clock: 301773     234692    220773
>      (warm) sched_clock: 38375      25602     25659
>      (warm) local_clock: 100371     33265     27242
>      (warm) rdtsc:       27340      24214     24208
>      sched_clock_stable: 0          0         0
>      (cold) sched_clock: 382634     235941    237019
>      (cold) local_clock: 396890     297017    294819
>      (warm) sched_clock: 38194      25233     25609
>      (warm) local_clock: 143452     71234     71232
>      (warm) rdtsc:       27345      24245     24243
>
> Signed-off-by: Peter Zijlstra<peterz@infradead.org>

Hi Peter,

This patch seems to be causing an issue with booting a KVM guest. It seems that it
causes the time to go random during early boot process:

	[    0.000000] Initmem setup node 30 [mem 0x12ee000000-0x138dffffff]
	[    0.000000]   NODE_DATA [mem 0xcfa42000-0xcfa72fff]
	[    0.000000]     NODE_DATA(30) on node 1
	[    0.000000] Initmem setup node 31 [mem 0x138e000000-0x142fffffff]
	[    0.000000]   NODE_DATA [mem 0xcfa11000-0xcfa41fff]
	[    0.000000]     NODE_DATA(31) on node 1
	[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
	[    0.000000] kvm-clock: cpu 0, msr 0:cf991001, boot clock
	[133538.294040] Zone ranges:
	[133538.294338]   DMA      [mem 0x00001000-0x00ffffff]
	[133538.294804]   DMA32    [mem 0x01000000-0xffffffff]
	[133538.295223]   Normal   [mem 0x100000000-0x142fffffff]
	[133538.295670] Movable zone start for each node

Looking at the code, initially I though that the problem is with:

	+void set_sched_clock_stable(void)
	+{
	+       if (!sched_clock_stable())
	+               static_key_slow_dec(&__sched_clock_stable);
	+}
	+
	+void clear_sched_clock_stable(void)
	+{
	+       /* XXX worry about clock continuity */
	+       if (sched_clock_stable())
	+               static_key_slow_inc(&__sched_clock_stable);
	+}

I think the jump label inc/dec is reversed here. We would want to inc it when enabling
and dec when disabling, no?
However, trying to reverse the two didn't help. I was still seeing the same odd behaviour.

I tried doing a simple conversion to using a simple var like before, which looks like this:

diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
index 6bd6a67..a035932 100644
--- a/kernel/sched/clock.c
+++ b/kernel/sched/clock.c
@@ -76,26 +76,21 @@ EXPORT_SYMBOL_GPL(sched_clock);
  __read_mostly int sched_clock_running;

  #ifdef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
-static struct static_key __sched_clock_stable = STATIC_KEY_INIT;
+static int __sched_clock_stable;

  int sched_clock_stable(void)
  {
-       if (static_key_false(&__sched_clock_stable))
-               return false;
-       return true;
+       return __sched_clock_stable;
  }

  void set_sched_clock_stable(void)
  {
-       if (!sched_clock_stable())
-               static_key_slow_dec(&__sched_clock_stable);
+       __sched_clock_stable = 1;
  }

  static void __clear_sched_clock_stable(struct work_struct *work)
  {
-       /* XXX worry about clock continuity */
-       if (sched_clock_stable())
-               static_key_slow_inc(&__sched_clock_stable);
+       __sched_clock_stable = 0;
  }

  static DECLARE_WORK(sched_clock_work, __clear_sched_clock_stable);
@@ -340,7 +335,7 @@ EXPORT_SYMBOL_GPL(sched_clock_idle_wakeup_event);
   */
  u64 cpu_clock(int cpu)
  {
-       if (static_key_false(&__sched_clock_stable))
+       if (!sched_clock_stable())
                 return sched_clock_cpu(cpu);

         return sched_clock();
@@ -355,7 +350,7 @@ u64 cpu_clock(int cpu)
   */
  u64 local_clock(void)
  {
-       if (static_key_false(&__sched_clock_stable))
+       if (!sched_clock_stable())
                 return sched_clock_cpu(raw_smp_processor_id());

         return sched_clock();


This has corrected the issue:

	[    0.000000] Initmem setup node 31 [mem 0x138e000000-0x142fffffff]
	[    0.000000]   NODE_DATA [mem 0xcfa11000-0xcfa41fff]
	[    0.000000]     NODE_DATA(31) on node 1
	[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
	[    0.000000] kvm-clock: cpu 0, msr 0:cf991001, boot clock
	[    0.000000] Zone ranges:
	[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
	[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]
	[    0.000000]   Normal   [mem 0x100000000-0x142fffffff]
	[    0.000000] Movable zone start for each node
	[    0.000000] Early memory node ranges
		[ timing is correct for the rest of the boot]

At this point, I thought that there's something up with jump labels being used this early (?) and
tried compiling with CONFIG_JUMP_LABELS=n, this didn't solve the issue.

This makes me thing there's something different related to jumplabels we're missing, as the
no-jumplabel config should be very similar to the patch I did above, I just can't figure
out what it is.


Thanks,
Sasha

  reply	other threads:[~2014-01-21 22:29 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-12 14:08 [PATCH 00/15] cleanups and optimizations Peter Zijlstra
2013-12-12 14:08 ` [PATCH 01/15] x86, acpi, idle: Restructure the mwait idle routines Peter Zijlstra
2013-12-19 20:09   ` [tip:x86/idle] " tip-bot for Peter Zijlstra
2013-12-19 20:13     ` H. Peter Anvin
2013-12-19 20:16       ` Peter Zijlstra
2013-12-12 14:08 ` [PATCH 02/15] sched, preempt: Fixup missed PREEMPT_NEED_RESCHED folding Peter Zijlstra
2013-12-12 14:08 ` [PATCH 03/15] preempt, locking: Rework local_bh_{dis,en}able() Peter Zijlstra
2013-12-12 14:08 ` [PATCH 04/15] locking: Optimize lock_bh functions Peter Zijlstra
2013-12-12 14:08 ` [PATCH 05/15] sched, net: Clean up preempt_enable_no_resched() abuse Peter Zijlstra
2013-12-12 14:08 ` [PATCH 06/15] sched, net: Fixup busy_loop_us_clock() Peter Zijlstra
2013-12-12 14:08 ` [PATCH 07/15] sched, thermal: Clean up preempt_enable_no_resched() abuse Peter Zijlstra
2013-12-12 14:08 ` [PATCH 08/15] preempt: Take away preempt_enable_no_resched() from modules Peter Zijlstra
2013-12-12 14:08 ` [PATCH 09/15] x86: Use mul_u64_u32_shr() for native_sched_clock() Peter Zijlstra
2013-12-12 14:08 ` [PATCH 10/15] x86: Move some code around Peter Zijlstra
2013-12-12 14:08 ` [PATCH 11/15] x86: Rewrite cyc2ns to avoid the need to disable IRQs Peter Zijlstra
2014-06-16 17:13   ` Viresh Kumar
2014-06-17 12:15     ` Peter Zijlstra
2014-06-17 12:42       ` Viresh Kumar
2014-06-17 16:32         ` Mauro
2014-06-17 12:47       ` Mauro
2013-12-12 14:08 ` [PATCH 12/15] sched: Remove local_irq_disable() from the clocks Peter Zijlstra
2013-12-12 14:08 ` [PATCH 13/15] sched: Use a static_key for sched_clock_stable Peter Zijlstra
2014-01-21 22:28   ` Sasha Levin [this message]
2014-01-22 10:45     ` Peter Zijlstra
2014-01-22 11:59       ` Peter Zijlstra
2014-01-23  1:53         ` Dave Young
2014-01-23 16:46         ` [tip:sched/urgent] sched/clock: Fixup early initialization tip-bot for Peter Zijlstra
2014-01-22 12:00       ` [PATCH 13/15] sched: Use a static_key for sched_clock_stable Markus Trippelsdorf
2014-01-22 12:07         ` Peter Zijlstra
2014-01-22 12:16           ` Peter Zijlstra
2014-01-22 12:26           ` Markus Trippelsdorf
2014-01-22 12:30             ` Peter Zijlstra
2014-01-22 13:14               ` Markus Trippelsdorf
2014-01-22 14:26                 ` Sasha Levin
2014-01-22 18:35                   ` Markus Trippelsdorf
2014-01-22 18:42                     ` Peter Zijlstra
2014-01-22 18:45                       ` Markus Trippelsdorf
2014-01-22 19:17                         ` Josh Boyer
2014-01-22 19:09                       ` Markus Trippelsdorf
2014-01-22 19:12                         ` Markus Trippelsdorf
2014-01-22 20:16                           ` Peter Zijlstra
2014-01-22 21:08                             ` Peter Zijlstra
2014-01-22 21:17                               ` Markus Trippelsdorf
2014-01-23  9:48                                 ` Peter Zijlstra
2014-01-23 10:01                                   ` Markus Trippelsdorf
2014-01-23 10:04                                     ` Peter Zijlstra
2014-01-23 13:32                                   ` Josh Boyer
2014-01-23 16:46                                   ` [tip:sched/urgent] sched/x86/tsc: Initialize multiplier to 0 tip-bot for Peter Zijlstra
2014-01-22 23:53                               ` [PATCH 13/15] sched: Use a static_key for sched_clock_stable Josh Boyer
2014-01-23  1:53                               ` Dave Young
2014-01-23  2:10                                 ` Dave Young
2014-01-23 16:56                                   ` Peter Zijlstra
2014-01-24  3:15                                     ` Dave Young
2014-01-24  7:58                                       ` Ingo Molnar
2014-01-22 17:14     ` Peter Zijlstra
2014-01-22 22:31       ` Sasha Levin
2013-12-12 14:08 ` [PATCH 14/15] sched, clock: Fixup clear_sched_clock_stable() Peter Zijlstra
2013-12-12 14:08 ` [PATCH 15/15] x86: Avoid a runtime condition in native_sched_clock() Peter Zijlstra
2013-12-13  3:30 ` [PATCH 00/15] cleanups and optimizations Mike Galbraith
2013-12-13 10:49 ` Eliezer Tamir
2013-12-13 13:56   ` Peter Zijlstra
2013-12-16 17:48     ` Eliezer Tamir
2013-12-17 13:32       ` Peter Zijlstra
2013-12-17 14:03         ` Eliezer Tamir
2013-12-17 15:13           ` Peter Zijlstra
2013-12-17 18:19             ` Eliezer Tamir
2013-12-17 22:11               ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52DEF495.2020304@oracle.com \
    --to=sasha.levin@oracle.com \
    --cc=arjan@linux.intel.com \
    --cc=bitbucket@online.de \
    --cc=eliezer.tamir@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=john.stultz@linaro.org \
    --cc=lenb@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=mingo@kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=rjw@rjwysocki.net \
    --cc=rui.zhang@intel.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).