From: Prarit Bhargava <prarit@redhat.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org,
John Stultz <john.stultz@linaro.org>,
Xunlei Pang <pang.xunlei@linaro.org>,
Baolin Wang <baolin.wang@linaro.org>,
Andrew Morton <akpm@linux-foundation.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Petr Mladek <pmladek@suse.cz>, Tejun Heo <tj@kernel.org>,
Peter Hurley <peter@hurleysoftware.com>,
Vasily Averin <vvs@virtuozzo.com>, Joe Perches <joe@perches.com>
Subject: Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2]
Date: Wed, 13 Jan 2016 09:36:42 -0500
Message-ID: <569660FA.2020802@redhat.com>
In-Reply-To: <alpine.DEB.2.11.1601131443380.3575@nanos>
On 01/13/2016 08:45 AM, Thomas Gleixner wrote:
> On Wed, 13 Jan 2016, Prarit Bhargava wrote:
>> This patchset introduces additional NMI safe timekeeping functions and the
>> kernel parameter printk.clock=[local|boot|real|tai] allowing a
>> user to specify an adjusted clock to use with printk timestamps. The
>> hardware clock, or the existing functionality, is preserved by default.
>
> You still fail to explain WHY we need a gazillion of different clocks
> here.
I've had cases in the past where earlier warnings or failures have resulted in
much later panics, across several systems. Trying to synchronize all of these
events with wall clock time is all but impossible after the event has occurred.
I've seen cases where earlier MCAs led to panics, earlier I/O warnings led to
panics, panics/problems occurred at a specific time, etc. Attempting to figure
out what happened in the lab or cluster is not trivial without having a
timestamp that can actually be synchronized against a wall clock.
In the case that made me finally submit this, the disks were generating
seemingly random I/O timeout errors, which meant at that point I had no logging
to disk (and this assumes the systems are logging to disk, because I'm seeing
more and more systems that are not). I did manage to get dmesg from crash
dumps; however, the problem then became trying to figure out exactly what time
the system started having problems (Was there an external event that led to the
failures and panics? Did the early failures across systems occur at the same
time, or did they occur over several hours? Did the systems all panic at the
same time? Was the failure at a specific time after boot and due to a weird
timeout? etc.) Figuring out what is actually happening, and debugging it,
becomes much easier with the above timestamp patch because I can actually tell
what time something happened.
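For illustration only, here is a minimal sketch of the manual correlation
described above: turning a default since-boot dmesg timestamp into an estimated
wall clock time. The function name to_wall_clock, the boot time and the sample
log line are all hypothetical and not part of the patchset. The boot time is
assumed to have been recorded elsewhere, which is often not the case when all
you have is a crash dump. The result is only an estimate, since the default
printk clock may drift from the wall clock.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the default printk prefix, e.g. "[  864000.123456] message".
_STAMP = re.compile(r"^\[\s*(\d+)\.(\d{6})\]\s?(.*)$")


def to_wall_clock(line, boot_time):
    """Return (estimated UTC datetime, message) for one dmesg line.

    boot_time is an assumed, externally recorded UTC boot instant. The
    result is an estimate: the default printk clock is not guaranteed to
    track the wall clock.
    """
    m = _STAMP.match(line)
    if m is None:
        raise ValueError("no printk timestamp in line: %r" % line)
    offset = timedelta(seconds=int(m.group(1)), microseconds=int(m.group(2)))
    return boot_time + offset, m.group(3)


# Hypothetical boot time and log line, for illustration only.
boot = datetime(2016, 1, 3, 9, 0, 0, tzinfo=timezone.utc)
when, msg = to_wall_clock("[864000.500000] example I/O timeout", boot)
print(when.isoformat(), msg)
```

Doing this across several systems requires a trustworthy boot time for each
one, which is the step a wall clock based printk timestamp would make
unnecessary.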
Admittedly, I have not used TAI. I started by using REAL, and then the BOOT
clock to see if this was some sort of strange 10-day timeout on the system. I
only included the TAI option for completeness.
>
> What's the problem with using the fast monotonic clock instead of local_clock
> and be done with it? I really don't see the point why we would need
> boot/real/tai and all the extra churn in the fast clock.
AFAICT local_clock() (on x86 at least) is read without taking a lock and is
just a TSC read. I assumed that local_clock()'s fast and lockless access was
the "best" method for obtaining a time stamp. I would only suggest using the
other clocks on systems that are "known stable", or running kernels that are
considered to have stable timekeeping code.
P.
>