From: Petr Mladek <pmladek@suse.com>
To: Francesco Valla <francesco@valla.it>
Cc: Tim Bird <tim@bird.org>, Steve Rostedt <rostedt@goodmis.org>,
john.ogness@linutronix.de, senozhatsky@chromium.org,
Tim Bird <Tim.Bird@sony.com>,
Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
Anna-Maria Behnsen <anna-maria@linutronix.de>,
Frederic Weisbecker <frederic@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Linux Embedded <linux-embedded@vger.kernel.org>
Subject: Re: [PATCH] printk: add early_counter_ns routine for printk blind spot
Date: Wed, 26 Nov 2025 13:55:32 +0100 [thread overview]
Message-ID: <aSb4xHCB6i_Q6JpC@pathway.suse.cz> (raw)
In-Reply-To: <aSWqWD1BQ-m-Zy9C@bywater>
On Tue 2025-11-25 14:08:40, Francesco Valla wrote:
> Hi Tim,
>
> I tested this on my i.MX93 FRDM (arm64) board and after a bit of
> fiddling with the MULT/SHIFT values I got it working. It can be a very
> valuable addition.
>
> Some comments follow.
>
> On Mon, Nov 24, 2025 at 10:30:52PM -0700, Tim Bird wrote:
> > From: Tim Bird <tim.bird@sony.com>
> >
> > During early boot, printk timestamps are reported as zero,
> > which creates a blind spot in early boot timings. This blind
> > spot hinders timing and optimization efforts for code that
> > executes before time_init(), which is when local_clock() is
> > initialized sufficiently to start returning non-zero timestamps.
> > This period is about 400 milliseconds for many current desktop
> > and embedded machines running Linux.
> >
> > Add an early_counter_ns function that returns nanosecond
> > timestamps based on get_cycles(). get_cycles() is operational
> > on arm64 and x86_64 from kernel start. Add some calibration
> > printks to allow setting configuration variables that are used
> > to convert cycle counts to nanoseconds (which are then used
> > in early printks). Add CONFIG_EARLY_COUNTER_NS, as well as
> > some associated conversion variables, as new kernel config
> > variables.
> >
> > After proper configuration, this yields non-zero timestamps for
> > printks from the very start of kernel execution. The timestamps
> > are relative to the start of the architecture-specific counter
> > used in get_cycles (e.g. the TSC on x86_64 and cntvct_el0 on arm64).
> > This means that the time reported reflects time-from-power-on for
> > most embedded products. This is also a useful data point for
> > boot-time optimization work.
> >
> > Note that there is a discontinuity in the timestamp sequencing
> > when standard clocks are finally initialized in time_init().
> > The printk timestamps are thus not monotonically increasing
> > through the entire boot.
>
> This is... not going to work, IMO, and might lead to breakages in
> userspace tools (are printk timings a userspace API?).
Honestly, I am not sure if it would break anything. The fact is
that printk() always used monotonic timers. And it is possible
that some userspace depends on it.
I personally thing that non-monotonic time stamps might be confusing
but they should not cause any serious breakage. But I might be wrong.
People are creative...
> I actually have a counter-proposal: the time obtained through cycle
> evaluation is used as an offset to be added to the printk time after
> time_init() is called. A (working, but maybe sub-optimal) patch to
> obtain this is attached at the end.
I am not sure if this is a good idea. The offset would cause
that all post-timer-init printk timestamps differ from values
provided by the timer API. And it might cause confusion,
for example, when they are printed as part of the message,
or when analyzing a crash dump.
On the other hand, there are various clock sources in the kernel
which are not comparable anyway. So maybe I am too cautious.
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -770,6 +770,53 @@ config IKHEADERS
> > or similar programs. If you build the headers as a module, a module called
> > kheaders.ko is built which can be loaded on-demand to get access to headers.
> >
> > +config EARLY_COUNTER_NS
> > + bool "Use counter for early printk timestamps"
> > + default y
> > + depends on PRINTK
> > + help
> > + Use a cycle-counter to provide printk timestamps during early
> > + boot. This allows seeing timing information that would
> > + otherwise be displayed with 0-valued timestamps.
> > +
> > + In order for this to work, you need to specify values for
> > + EARLY_COUNTER_MULT and EARLY_COUNTER_SHIFT, used to convert
> > + from the cycle count to nanoseconds.
> > +
> > +config EARLY_COUNTER_MULT
> > + int "Multiplier for early cycle counter"
> > + depends on PRINTK && EARLY_COUNTER_NS
> > + default 1
> > + help
> > + This value specifies a multiplier to be used when converting
> > + cycle counts to nanoseconds. The formula used is:
> > + ns = (cycles * mult) >> shift
> > +
> > + Use a multiplier that will bring the value of (cycles * mult)
> > + to near a power of two, that is greater than 1000. The
> > + nanoseconds returned by this conversion are divided by 1000
> > + to be used as the printk timestamp counter (with resolution
> > + of microseconds).
> > +
> > + As an example, for a cycle-counter with a frequency of 200 Mhz,
> > + the multiplier would be: 10485760, and the shift would be 21.
> > +
>
> If I got this correclty:
>
> EARLY_COUNTER_MULT = (10^9 / freq) << EARLY_COUNTER_SHIFT
>
> where EARLY_COUNTER_SHIFT can be chosen at will, provided it is big
> enough to survice the ns->us conversion but small enough not to overflow
> the u64 container.
>
> > +config EARLY_COUNTER_SHIFT
> > + int "Shift value for early cycle counter"
> > + range 0 63
> > + depends on PRINTK && EARLY_COUNTER_NS
> > + default 0
> > + help
> > + This value specifies a shift value to be used when converting
> > + cycle counts to nanoseconds. The formula used is:
> > + ns = (cycles * mult) >> shift
> > +
> > + Use a shift that will bring the result to a value
> > + in nanoseconds.
> > +
> > + As an example, for a cycle-counter with a frequency of 200 Mhz,
> > + the multiplier would be: 10485760, and the shift would be 21.
> > +
> > config LOG_BUF_SHIFT
> > int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
> > range 12 25
So, it is usable only for a particular HW. It is not usable for a
generic kernel which is supposed to run on misc HW.
I guess that there is no way to detect the CPU frequency at runtime?
Best Regards,
Petr
next prev parent reply other threads:[~2025-11-26 12:55 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-25 5:30 [PATCH] printk: add early_counter_ns routine for printk blind spot Tim Bird
2025-11-25 7:52 ` kernel test robot
2025-11-25 13:08 ` Francesco Valla
2025-11-26 7:38 ` Geert Uytterhoeven
2025-11-27 0:16 ` Bird, Tim
2025-11-27 16:16 ` Petr Mladek
2025-11-26 12:55 ` Petr Mladek [this message]
2025-11-27 0:03 ` Bird, Tim
2025-11-26 11:13 ` Petr Mladek
2025-11-27 9:13 ` kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aSb4xHCB6i_Q6JpC@pathway.suse.cz \
--to=pmladek@suse.com \
--cc=Tim.Bird@sony.com \
--cc=akpm@linux-foundation.org \
--cc=anna-maria@linutronix.de \
--cc=francesco@valla.it \
--cc=frederic@kernel.org \
--cc=john.ogness@linutronix.de \
--cc=linux-embedded@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=senozhatsky@chromium.org \
--cc=tglx@linutronix.de \
--cc=tim@bird.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).