public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Petr Mladek <pmladek@suse.com>
To: Thomas Gleixner <tglx@kernel.org>
Cc: Tim Bird <tim.bird@sony.com>,
	rostedt@goodmis.org, john.ogness@linutronix.de,
	senozhatsky@chromium.org, francesco@valla.it,
	geert@linux-m68k.org, shashankbalaji02@gmail.com,
	linux-embedded@vger.kernel.org, linux-kernel@vger.kernel.org,
	Brian Masney <bmasney@redhat.com>
Subject: Re: [PATCH v4 1/1] printk: fix zero-valued printk timestamps in early boot
Date: Fri, 17 Apr 2026 12:37:35 +0200	[thread overview]
Message-ID: <aeINb8-a83R3QDXd@pathway.suse.cz> (raw)
In-Reply-To: <87qzohdv6i.ffs@tglx>

Adding Brian into Cc.

On Wed 2026-04-15 00:38:29, Thomas Gleixner wrote:
> On Fri, Apr 10 2026 at 14:37, Tim Bird wrote:
> > +
> > +#include <linux/timekeeping.h>
> > +#ifdef CONFIG_ARM64
> > +#include <asm/sysreg.h>
> > +#endif
> > +
> > +#ifdef CONFIG_EARLY_CYCLES_KHZ
> > +static inline u64 early_unsafe_cycles(void)
> > +{
> > +#if defined(CONFIG_X86_64)
> > +	/*
> > +	 * This rdtsc may happen before secure TSC is initialized, and
> > +	 * it is unordered. So please don't use this value for cryptography
> > +	 * or after SMP is initialized.
> > +	 */
> > +	return rdtsc();
> > +#elif defined(CONFIG_ARM64)
> > +	return read_sysreg(cntvct_el0);
> > +#elif defined(CONFIG_RISCV_TIMER)
> > +	u64 val;
> > +
> > +	asm volatile("rdtime %0" : "=r"(val));
> > +	return val;
> > +#else
> > +	return 0;
> > +#endif
> > +}
> 
> No. Generic code and generic headers have no business to implement any
> architecture specific code and there is zero justification for
> architecture specific #ifdefs in generic code.

Yeah, this looks a bit wild.

> > +/* returns a nanosecond value based on early cycles */
> > +static inline u64 early_times_ns(void)
> > +{
> > +	if (CONFIG_EARLY_CYCLES_KHZ)
> > +		/*
> > +		 * Note: the multiply must precede the division to avoid
> > +		 * truncation and loss of resolution
> > +		 * Don't use fancier MULT/SHIFT math here.  Since this is
> > +		 * static, the compiler can optimize the math operations.
> > +		 */
> > +		return (early_unsafe_cycles() * NS_PER_KHZ) / CONFIG_EARLY_CYCLES_KHZ;
> 
> This code will result in a division by zero warning from any reasonable
> compiler because this is evaluated _before_ it is eliminated.
> 
> > @@ -2294,6 +2295,8 @@ int vprintk_store(int facility, int level,
> >  	 * timestamp with respect to the caller.
> >  	 */
> >  	ts_nsec = local_clock();
> > +	if (unlikely(!ts_nsec))
> > +		ts_nsec = early_times_ns();
> 
> I explained to you how this wants to be implemented to be useful and you
> declared that you are unwilling to put the effort in.
> 
> My NAK for this disgusting and tasteless hack still applies.
> 
> Either you are willing to work with the community and the relevant
> maintainers or you can keep your hackery maintained out of tree for
> those who care about it and are willing to ignore the fallout. 

The discussion went wild and is full of emotions.

Let me summarize my understanding:

There are people who try to optimize boot times. Tim is one
of them. He used an out-of-tree patch for many years. He decided
to share it to make the life easier for others.

Tim's original approach was trivial [Tim1]. IMHO, he used a cycle counter
with a stable frequency and hardcoded the computation to timestamps.
It opened discussion how to integrate it better:

   1. Avoid hard coded value in Kconfig by some calibration [Tim2][Tim3]
      One hardcoded value is back in [Tim4] for simplicity.

   2. Avoid jump in the timestamps when timekeeping is allowed.
      It was partly removed in [v2][v3] by already "calibrated"
      timestamps read by userspace (syslog, /dev/kmsg). Again,
      this approach was removed in v4 for simplicity.

  Pros of v4:

      + very simple
      + gives some reasonably looking timestamps
      + might be good enough for the purpose

  Cons of v4:

      + hacky, does not compile in some case, ...
      + hardcoded value in config
      + jump in timestamps when timekeeping is initialized


Now, we have alternative approach by Thomas [Thomas1] which allows
to initialize time keeping on x86 ASAP:

   Pros:

      + clean and well integrated with timekeeping
      + no hard coded values in Kconfig
      + no jump in timestamps

   Cons:

      + need non-trivial changes for each supported architecture
      + no timestamps for the very early code (30ms on the measured x86_64)


My view:

Thomas' approach is great because it is clean integration, ...
but:

   1. I am not sure if the complexity is worth it. There are only few
      people (Tim's tip is 50) who are interested into the early
      boot times and all are developers.

   2. It does not cover the very early boot. And Brian mentioned
      a real life problem found in this area, see
      https://lore.kernel.org/all/acxx9Bt0N3nhtLgN@redhat.com/

      Steven mentioned getting the very early timestamps from
      the firmware, see
      https://lore.kernel.org/all/20260401111244.5057a89c@gandalf.local.home/
      But I am not sure how complicated it is. And if it
      does not need any special HW or so.


Tim's approach is interesting because of the simplicity
and might be good enough for the few (50) users.

I think that there are basically two problems with Tim's approach:

   1. It needs a reasonable API to get a cycle counter with
      a stable frequency.

      My understanding is that get_cycles() might be good enough.
      The only problem is that it the stability is not guaranteed
      and it is not calibrated.

      Would it help to rename it to get_bogo_cycles() ?


   2. The early timestamps provided by the bogo cycles
      are not synchronized with timestamps from
      the proper time keeping.

      Would it help to print a disclaimer, similar to,
      for example, trace_printk() first use?
      Something like:

[    0.002912] **********************************************************
[    0.002917] ****   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE
[    0.002921] **
[    0.002935] ** Using BOGO early timestamps
[    0.002939] **
[    0.002943] ** They are not properly calibrated and might use a source
[    0.002949] ** with an unstable frequency.
[    0.002953] **
[    0.002957] ** They are not comparable with timestamps after
[    0.002961] ** the timekeeping is initialized.
[    0.002966] **
[    0.002968] ****   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE
[    0.002971] *******************************************************
[    0.002975] Booting Linux on physical CPU 0x0000000000 [0x410fd083]
[    0.002998] Linux version 7.0.0-rc6-v8+ (tbird@timdesk) (aarch64-linux-gnu-gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0, GNU ld (GNU Binutils for Ubuntu) 2.42) #20 SMP PREEMPT Fri Apr 10 11:57:48 MDT 2026
[    0.003002] KASLR enabled
[    0.003338] random: crng init done
[    0.003866] Machine model: Raspberry Pi 4 Model B Rev 1.5
[    0.004495] efi: UEFI not found.
...
[    0.183552] Root IRQ handler: gic_handle_irq
[    0.183561] GIC: Using split EOI/Deactivate mode
[    0.183699] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.183958] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xc743ce346, max_idle_ns: 440795203123 ns
[    0.183952] arch_timer: cp15 timer running at 54.00MHz (phys).
[    0.183957] **********************************************************
[    0.183962] ****   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE
[    0.183967] **
[    0.183971] ** End of BOGO early timestamps
[    0.183976] **
[    0.183982] ****   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE
[    0.183989] **********************************************************
[    0.000000] sched_clock: 56 bits at 54MHz, resolution 18ns, wraps every 4398046511102ns
[    0.000157] Console: colour dummy device 80x25
[    0.000165] printk: legacy console [tty1] enabled

My view is that it would be nice to make the life easier
for the 50 developers who do very useful work.

But we do not need to create and maintain any complicated
code for this. If the bogo cycles are good enough.
If they already have some users and have to stay anyway.
If we make it clear that the early timestamps are bogus...

IMHO, the main risk is that it won't be used just by the 50 developers
but it will get misused and open some can of worms. I think that
the risk might be acceptable but...

What do you think, please?
Am I too naive in this case?

[Tim1] https://lore.kernel.org/all/39b09edb-8998-4ebd-a564-7d594434a981@bird.org/
[Tim2] https://lore.kernel.org/all/20260124194027.713991-1-tim.bird@sony.com/
[Tim3] https://lore.kernel.org/all/20260210234741.3262320-1-tim.bird@sony.com/
[Tim4] https://lore.kernel.org/all/20260410203741.997410-1-tim.bird@sony.com/

[Thomas1] https://lore.kernel.org/all/87fr5ib6ks.ffs@tglx/

Best Regards,
Petr

  parent reply	other threads:[~2026-04-17 10:37 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-10 20:37 [PATCH v4 0/1] printk: fix zero-valued printk timestamps Tim Bird
2026-04-10 20:37 ` [PATCH v4 1/1] printk: fix zero-valued printk timestamps in early boot Tim Bird
2026-04-12 10:10   ` kernel test robot
2026-04-13 17:58     ` Bird, Tim
2026-04-13 20:17       ` David Laight
2026-04-14 22:48       ` Thomas Gleixner
2026-04-12 11:12   ` kernel test robot
2026-04-12 12:55   ` kernel test robot
2026-04-14  7:03   ` Geert Uytterhoeven
2026-04-14  9:51     ` Roberto A. Foglietta
2026-04-14 22:38   ` Thomas Gleixner
2026-04-15  0:19     ` Roberto A. Foglietta
2026-04-15  9:19       ` Roberto A. Foglietta
2026-04-15 13:06         ` Roberto A. Foglietta
2026-04-15 18:40         ` Geert Uytterhoeven
2026-04-15 19:23           ` Roberto A. Foglietta
2026-04-15 21:30       ` Thomas Gleixner
2026-04-15 21:57         ` Roberto A. Foglietta
2026-04-15 22:24           ` Roberto A. Foglietta
2026-04-15 23:12             ` Bird, Tim
2026-04-16  5:38               ` Roberto A. Foglietta
2026-04-16  8:25                 ` Roberto A. Foglietta
2026-04-16  8:43                   ` Geert Uytterhoeven
2026-04-16  9:38                     ` Roberto A. Foglietta
2026-04-16 10:52                       ` David Laight
2026-04-17  7:35                         ` Roberto A. Foglietta
2026-04-17 10:37     ` Petr Mladek [this message]
2026-04-17 11:46       ` Roberto A. Foglietta
2026-04-17 16:11         ` Steven Rostedt
2026-04-17 19:26           ` Roberto A. Foglietta
2026-04-17 19:24       ` Brian Masney
2026-04-17 19:52         ` Roberto A. Foglietta

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aeINb8-a83R3QDXd@pathway.suse.cz \
    --to=pmladek@suse.com \
    --cc=bmasney@redhat.com \
    --cc=francesco@valla.it \
    --cc=geert@linux-m68k.org \
    --cc=john.ogness@linutronix.de \
    --cc=linux-embedded@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=senozhatsky@chromium.org \
    --cc=shashankbalaji02@gmail.com \
    --cc=tglx@kernel.org \
    --cc=tim.bird@sony.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox