From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933036AbcAMOhC (ORCPT ); Wed, 13 Jan 2016 09:37:02 -0500
Received: from mx1.redhat.com ([209.132.183.28]:44559 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752964AbcAMOgo (ORCPT ); Wed, 13 Jan 2016 09:36:44 -0500
Message-ID: <569660FA.2020802@redhat.com>
Date: Wed, 13 Jan 2016 09:36:42 -0500
From: Prarit Bhargava
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Thomas Gleixner
CC: linux-kernel@vger.kernel.org, John Stultz, Xunlei Pang, Baolin Wang,
	Andrew Morton, Greg Kroah-Hartman, Petr Mladek, Tejun Heo,
	Peter Hurley, Vasily Averin, Joe Perches
Subject: Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2]
References: <1452688466-14877-1-git-send-email-prarit@redhat.com>
In-Reply-To:
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/13/2016 08:45 AM, Thomas Gleixner wrote:
> On Wed, 13 Jan 2016, Prarit Bhargava wrote:
>> This patchset introduces additional NMI safe timekeeping functions and the
>> kernel parameter printk.clock=[local|boot|real|tai] allowing a
>> user to specify an adjusted clock to use with printk timestamps. The
>> hardware clock, or the existing functionality, is preserved by default.
>
> You still fail to explain WHY we need a gazillion of different clocks
> here.

I've had cases in the past where earlier warnings/failures have resulted in
much later panics, across several systems. Trying to synchronize all of
these events with wall clock time is all but impossible after the event has
occurred. I've seen cases where earlier MCAs led to panics, earlier I/O
warnings led to panics, panics/problems occurred at a specific time, etc.

Attempting to figure out what happened in the lab or cluster is not trivial
without having a timestamp that can actually be synchronized against a wall
clock.

In the case that made me finally submit this, the disks were generating
seemingly random I/O timeout errors, which meant that at that point I had no
logging to disk (and this assumes the systems are logging to disk, because
I'm seeing more and more systems that are not). I did manage to get dmesg
from crash dumps; however, the problem then became trying to figure out
exactly what time the system started having problems (Was there an external
event that led to the failures and panics? Are the early failures across
systems at the same time, or did they occur over several hours? Did the
systems all panic at the same time? Was the failure at a specific time after
boot and due to a weird timeout? etc.)

Figuring out what is actually happening, and debugging it, becomes much
easier with the above timestamp patch because I can actually tell what time
something happened.

Admittedly, I have not used TAI. I started by using REAL, and then the BOOT
clock to see if this was some sort of strange 10-day timeout on the system.
I only included the TAI option for completeness.

>
> What's the problem with using the fast monotonic clock instead of local_clock
> and be done with it? I really don't see the point why we would need
> boot/real/tai and all the extra churn in the fast clock.

AFAICT local_clock() (on x86 at least) is accessed without taking a lock
and is just a TSC read. I assumed that local_clock()'s fast, lockless access
was the "best" method for obtaining a time stamp. I would only suggest using
the other clocks on systems that are "known stable", or running kernels that
are considered to have stable timekeeping code.

P.

>
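
For illustration only, here is roughly what the after-the-fact correlation described above looks like when all you have is a dmesg with the default timestamps. This is a minimal sketch, not part of the patchset: the helper name annotate_dmesg and the boot_time_utc argument are hypothetical, and it assumes the bracketed prefix is approximately seconds since boot and that the boot time is known from some outside source. It ignores suspend time and any clock adjustments, so the result is only an estimate.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the default printk prefix, e.g. "[   10.500000] message".
_PRINTK_TS = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def annotate_dmesg(lines, boot_time_utc):
    """Yield (estimated UTC datetime, message) for each timestamped line.

    boot_time_utc is an assumption supplied by the caller; the log itself
    does not contain it. Lines without a timestamp prefix are skipped.
    """
    for line in lines:
        match = _PRINTK_TS.match(line)
        if match is None:
            continue
        offset = timedelta(seconds=float(match.group(1)))
        yield boot_time_utc + offset, match.group(2)


if __name__ == "__main__":
    # Hypothetical boot time and log lines, for demonstration only.
    boot = datetime(2016, 1, 13, 14, 0, 0, tzinfo=timezone.utc)
    sample = [
        "[   10.500000] example I/O timeout message",
        "[864000.000000] example panic message",
    ]
    for when, message in annotate_dmesg(sample, boot):
        print(when.isoformat(), message)
```

The estimate is only as good as the boot time you feed it, which is the difficulty when comparing several machines after a crash.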