From: Thomas Gleixner <tglx@kernel.org>
To: "Bird, Tim" <Tim.Bird@sony.com>,
"pmladek@suse.com" <pmladek@suse.com>,
"rostedt@goodmis.org" <rostedt@goodmis.org>,
"senozhatsky@chromium.org" <senozhatsky@chromium.org>,
Shashank Balaji <shashankbalaji02@gmail.com>,
"john.ogness@linutronix.de" <john.ogness@linutronix.de>
Cc: "francesco@valla.it" <francesco@valla.it>,
"geert@linux-m68k.org" <geert@linux-m68k.org>,
"linux-embedded@vger.kernel.org" <linux-embedded@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH v3] printk: fix zero-valued printk timestamps in early boot
Date: Sat, 28 Mar 2026 22:59:57 +0100 [thread overview]
Message-ID: <87jyuvboo2.ffs@tglx> (raw)
In-Reply-To: <MW5PR13MB5632FC91EE0EF8754331400DFD57A@MW5PR13MB5632.namprd13.prod.outlook.com>
Tim!
On Fri, Mar 27 2026 at 18:48, Tim Bird wrote:
> Well, this is using get_cycles(), which already exists on most architectures.
The fact that get_cycles() exists does not make it a good choice. There
is a reason why anything which deals with even remotely reliable time
requirements stopped using it. It's still there as a low level
architecture specific interface and most other usage is purely
historical or wrong to begin with and should be removed completely.
A lot of people spent a significant amount of time to get rid of this
ill defined mechanism and it's just sad that they did not manage to
eliminate it completely.
> This patch just adds a funky way to use cycles (which are available
> from power on, rather than from the start of kernel timekeeping) to
> allow saving timing data for some early printks (usually about 20 to
> 60 printks).
I can see that, but I'm not accepting yet another ill defined glued on
mechanism which relies on a historical ill defined mistake.
> Also, my current plan is to back off of adjusting the offset of
> unrelated (non-pre-time_init()) printks, and limit the effect in the
> system to just those first early (pre-time_init()) printks. The
> complication to add an offset to all following printks was just to
> avoid a discontinuity in printk timestamps, once time_init() was
> called and "real" timestamps started producing non-zeros. Given how
> confusing this seems to have made things, I'm thinking of backing off
> of that approach.
This discontinuity results from the fact that you glued it into the
printk code and sched_clock() does not know about it.
>> printk()
>>
>> time_ns = local_clock();
> that's ts_nsec = local_clock()
That obviously changes the illustrative nature of my narrative
significantly. Thanks for pointing it out.
>> As this needs to be supported by the architecture/platform in any case
>> there is close to zero benefit from creating complicated generic
>> infrastructure for this.
>
> The problem with this is that tsc_early_uncalibrated() can't return
> nanoseconds until after calibration.
In theory it could for most modern x86 CPUs as they advertise the
nominal TSC frequency in CPUID. Other architectures have well known
clocksource frequencies, e.g. S390 has a known nominal frequency of 1GHz
(IIRC).
But that does not solve any of the other problems. See below.
> I don't think it's a good idea to returns cycles sometimes and nanoseconds
> at other times, from a deep-seated timing function like this.
> Also tsc_available() might itself depend on initialization that hasn't happened yet
> (in early boot).
Access to the TSC requires the X86_FEATURE_TSC bit being set, which
happens in early_cpu_init(). Before that get_cycles() returns firmly 0.
> My approach of saving cycles in ts_nsec for the early printks works
> because there's a limited number of places (only 2) inside the printk
> code where ts_nsec is accessed, meaning that the code to detect a
> cycles value instead of a nanoseconds value can be constrained to just
> those two places. Basically, I'm doing the conversion from cycles to
> nanoseconds at printk presentation time, rather than at the time of
> printk message submission.
I know, but that again requires to add more ill defined infrastructure.
We are not aiming to add more, we want to get rid of it completely to
the extent possible.
> The approach that I originally started with
> (see https://lore.kernel.org/linux-embedded/39b09edb-8998-4ebd-a564-7d594434a981@bird.org/
> was to use hardcoded multiplier and shift values for converting from cycles
> to nanoseconds. These multiplier and shift values would be set at kernel
> configuration time (ie, using CONFIG values).
Which makes it unusable for distro kernels and therefore a non-starter.
> There are other approaches, but none really work early enough in the
> kernel boot to not be a pain. The goal is to provide timing info
> before: timekeeping init, jiffies startup, and even CPU features
> determination,
As I pointed out before that's wishful thinking:
You _cannot_ access a resource before it has been determined to be
available.
Period.
It does not matter at all if _you_ know for sure that it is the case in
_your_ personal setup.
> and to keep the effect narrow -- limited only to printks, and the
> first few pre-time_init() printk messages, at that.
Either it is solved in a generic way or we have to agree that it's not
solvable at all. Your narrow effect argument is bogus and you know that
very well.
> I'm now researching a suggestion from Shashank Balaji to use the
> existing calibration data from tsc initialization, which might
> simplify the current patch even further. I'll make sure to CC you on
> the next version of the patch.
If you want to use the calibration data from tsc_early_init() then you
achieve exactly _nothing_ because tsc_early_init() also enables the
early sched clock on bare metal. On a VM with KVM clock available the
KVM clock setup enables the early sched clock even before that via
init_hypervisor_platform().
The early TSC init happens in setup_arch() via tsc_early_init() and it's
completely unclear whether you can always access the TSC safely before
that unconditionally due to SNP, which requires to enable the secure TSC
first. There is a reason why all of this is ordered the way it is.
While reading TSC way before that might work on bare metal and in most
VMs, it's not guaranteed to be safe unconditionally unless someone sits
down and provides proof to the contrary. As always I'm happy to be
proven wrong.
When tsc_early_init() was introduced for the very same reason you are
looking into that, quite some people spent a lot of time to come up with
a solution which was deemed safe enough to be used unconditionally.
Please consult the LKML archive for the full history. The commit links
will give you a proper starting point.
That said, I completely understand the itch you are trying to scratch
and I'm the least person to prevent an architecturally sound solution,
but I'm also the first person to NAK any attempt which is based on
uninformed claims and 'works for me' arguments.
The only clean way to solve this cleanly is moving the sched clock
initialization to the earliest point possible and accepting that due to
hardware, enumeration and virtualization constraints this point might be
suboptimal. Everything else is just an attempt to defy reality.
Thanks,
tglx
next prev parent reply other threads:[~2026-03-28 22:00 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-25 5:30 [PATCH] printk: add early_counter_ns routine for printk blind spot Tim Bird
2025-11-25 7:52 ` kernel test robot
2025-11-25 13:08 ` Francesco Valla
2025-11-26 7:38 ` Geert Uytterhoeven
2025-11-27 0:16 ` Bird, Tim
2025-11-27 16:16 ` Petr Mladek
2025-11-26 12:55 ` Petr Mladek
2025-11-27 0:03 ` Bird, Tim
2025-11-26 11:13 ` Petr Mladek
2025-11-27 9:13 ` kernel test robot
2026-01-24 19:40 ` [PATCH v2] printk: fix zero-valued printk timestamps in early boot Tim Bird
2026-01-25 14:41 ` Francesco Valla
2026-01-26 16:52 ` Bird, Tim
2026-02-02 16:23 ` Petr Mladek
2026-01-26 10:12 ` Geert Uytterhoeven
2026-01-26 17:11 ` Bird, Tim
2026-01-27 8:10 ` Geert Uytterhoeven
2026-02-10 23:47 ` [PATCH v3] " Tim Bird
2026-03-04 11:23 ` Petr Mladek
2026-03-09 17:27 ` Shashank Balaji
2026-03-10 10:43 ` Petr Mladek
2026-03-10 19:17 ` Bird, Tim
2026-03-09 19:25 ` Shashank Balaji
2026-03-10 11:39 ` Petr Mladek
2026-03-10 18:54 ` Bird, Tim
2026-03-11 15:45 ` Petr Mladek
2026-03-11 15:47 ` Michael Kelley
2026-03-13 4:52 ` Bird, Tim
2026-03-13 10:45 ` Petr Mladek
2026-03-14 14:16 ` Shashank Balaji
2026-03-24 20:07 ` Bird, Tim
2026-03-14 16:15 ` Michael Kelley
2026-03-24 19:47 ` Bird, Tim
2026-03-26 9:24 ` John Ogness
2026-03-27 18:04 ` Bird, Tim
2026-03-20 18:15 ` Bird, Tim
2026-03-28 15:56 ` Thomas Gleixner
2026-03-26 13:17 ` Thomas Gleixner
2026-03-27 18:48 ` Bird, Tim
2026-03-28 21:59 ` Thomas Gleixner [this message]
2026-03-29 22:42 ` Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87jyuvboo2.ffs@tglx \
--to=tglx@kernel.org \
--cc=Tim.Bird@sony.com \
--cc=francesco@valla.it \
--cc=geert@linux-m68k.org \
--cc=john.ogness@linutronix.de \
--cc=linux-embedded@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pmladek@suse.com \
--cc=rostedt@goodmis.org \
--cc=senozhatsky@chromium.org \
--cc=shashankbalaji02@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox