All of lore.kernel.org
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@alien8.de>
To: Peter Zijlstra <peterz@infradead.org>
Cc: x86-ml <x86@kernel.org>, Peter Zijlstra <peterz@infradead.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Steven Rostedt <rostedt@goodmis.org>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH] x86: Kill notsc
Date: Wed, 21 Oct 2015 19:58:03 +0200	[thread overview]
Message-ID: <20151021175803.GF3575@pd.tnic> (raw)
In-Reply-To: <20151018142007.GA11294@pd.tnic>

On Sun, Oct 18, 2015 at 04:20:07PM +0200, Borislav Petkov wrote:
> Ok,
> 
> let's try this and see where it takes us. Patch has been only lightly
> tested in kvm - I'll hammer on it for real once we agree about the
> general form.
> 
> Aanyway, this patch is something Peter and I have been talking about on
> IRC a couple of times already. I finally found some free time to poke at
> it, here's the result.
> 
> Thoughts?
> 
> ---
> 
> Kill "notsc" cmdline option and all the glue around it. The two boxes
> worldwide which don't have a TSC should disable X86_TSC. Thus, make
> native_sched_clock() use TSC unconditionally, even if the TSC is
> unstable because that's fine there. This gets rid of the static key
> too and makes the function even simpler and faster, which is a Good
> Thing(tm).
> 
> The jiffies-fallback is for the !X86_TSC case.
> 
> Signed-off-by: Borislav Petkov <bp@suse.de>
> ---
>  Documentation/kernel-parameters.txt       |  2 -
>  Documentation/x86/x86_64/boot-options.txt |  5 ---
>  arch/x86/include/asm/tsc.h                |  3 +-
>  arch/x86/kernel/apic/apic.c               |  2 +-
>  arch/x86/kernel/tsc.c                     | 74 ++++++++-----------------------
>  5 files changed, 21 insertions(+), 65 deletions(-)

...

> +#ifdef CONFIG_X86_TSC
> +	/* return the value in ns */
> +	return cycles_2_ns(rdtsc());
> +#else

Ok, we have a problem here:

See the splat below. On the init path

start_kernel
|-> sched_init
    |-> init_idle

we're calling sched_clock() which does the cycles_2_ns() thing.

However, that cycles_2_ns() thing gets called only in tsc_init() which
comes later in start_kernel().

Which means, data in this line

      ns += mul_u64_u32_shr(cyc, data->cyc2ns_mul, data->cyc2ns_shift);

is NULL.

Now, the question is, can I push only the cyc2ns_init() call up, before
sched_init()?

I mean, it dummy initializes only the per-cpu variable and tsc_init()
will correct it anyway, as the comment says:

        /*
         * Secondary CPUs do not run through tsc_init(), so set up
         * all the scale factors for all CPUs, assuming the same
         * speed as the bootup CPU. (cpufreq notifiers will fix this
         * up if their speed diverges)
         */
        for_each_possible_cpu(cpu) {
                cyc2ns_init(cpu);
                set_cyc2ns_scale(cpu_khz, cpu);
        }


Thoughts?


[    0.000000] BUG: unable to handle kernel NULL pointer dereference at           (null)
[    0.000000] IP: [<ffffffff8100d19d>] native_sched_clock+0x2d/0x90
[    0.000000] PGD 0 
[    0.000000] Oops: 0000 [#1] PREEMPT SMP 
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.3.0-rc6+ #4
[    0.000000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[    0.000000] task: ffffffff819f7500 ti: ffffffff819e8000 task.ti: ffffffff819e8000
[    0.000000] RIP: 0010:[<ffffffff8100d19d>]  [<ffffffff8100d19d>] native_sched_clock+0x2d/0x90
[    0.000000] RSP: 0000:ffffffff819ebee0  EFLAGS: 00010046
[    0.000000] RAX: 0000000129f79c5a RBX: ffff88007b7d5c00 RCX: 0000000000000000
[    0.000000] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff819f76e0
[    0.000000] RBP: ffffffff819ebee0 R08: 0000000000000001 R09: 0000000000000000
[    0.000000] R10: 0000000000000001 R11: ffffffff819f7500 R12: 0000000000000000
[    0.000000] R13: 0000000000000000 R14: ffffffff819f7b30 R15: ffffffff819f7500
[    0.000000] FS:  0000000000000000(0000) GS:ffff88007b600000(0000) knlGS:0000000000000000
[    0.000000] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    0.000000] CR2: 0000000000000000 CR3: 00000000019f2000 CR4: 00000000000406b0
[    0.000000] Stack:
[    0.000000]  ffffffff819ebf28 ffffffff8108622f 0000000000000000 0000000000000086
[    0.000000]  ffffffff819f7500 0000000000000008 0000000000000008 00000000001d5c00
[    0.000000]  ffffffff0000ffff ffffffff819ebf58 ffffffff81cace4f ffffffffffffffff
[    0.000000] Call Trace:
[    0.000000]  [<ffffffff8108622f>] init_idle+0x13f/0x2d0
[    0.000000]  [<ffffffff81cace4f>] sched_init+0x304/0x345
[    0.000000]  [<ffffffff81c97d6d>] start_kernel+0x25e/0x417
[    0.000000]  [<ffffffff81c972fe>] x86_64_start_reservations+0x2a/0x2c
[    0.000000]  [<ffffffff81c97468>] x86_64_start_kernel+0x168/0x176
[    0.000000] Code: 89 e5 48 83 e4 f0 0f 31 48 c1 e2 20 48 09 d0 65 ff 05 88 dd ff 7e 65 48 8b 35 e0 5b 1c 7f 65 48 8b 15 e0 5b 1c 7f 48 39 d6 75 2c <8b> 16 8b 4e 04 48 f7 e2 48 0f ad d0 48 d3 ea f6 c1 40 48 0f 45 
[    0.000000] RIP  [<ffffffff8100d19d>] native_sched_clock+0x2d/0x90
[    0.000000]  RSP <ffffffff819ebee0>
[    0.000000] CR2: 0000000000000000
[    0.000000] ---[ end trace aff58b1304cfecf1 ]---
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task!

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

  parent reply	other threads:[~2015-10-21 17:58 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-18 14:20 [RFC PATCH] x86: Kill notsc Borislav Petkov
2015-10-19  4:47 ` Andy Lutomirski
2015-10-19  8:16   ` Borislav Petkov
2015-10-21 17:58 ` Borislav Petkov [this message]
2015-10-21 19:01   ` Peter Zijlstra
2015-10-22 18:51     ` [PATCH -v2] " Borislav Petkov
2015-11-04 10:21       ` Thomas Gleixner
2015-11-04 10:29         ` Borislav Petkov
2015-11-16 18:45           ` [RFC PATCH -v2.1] " Borislav Petkov
2015-11-16 21:25             ` Thomas Gleixner
2015-11-17  5:02               ` H. Peter Anvin
2015-11-17  8:53                 ` Borislav Petkov
2015-11-17  9:11                   ` Thomas Gleixner
2015-11-17  9:33                     ` Borislav Petkov
2015-11-17 10:08                       ` Thomas Gleixner
2015-11-17 11:09                         ` Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151021175803.GF3575@pd.tnic \
    --to=bp@alien8.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.