From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753738AbaJQSXU (ORCPT ); Fri, 17 Oct 2014 14:23:20 -0400
Received: from mx1.redhat.com ([209.132.183.28]:27293 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753292AbaJQSXT (ORCPT ); Fri, 17 Oct 2014 14:23:19 -0400
Message-ID: <54415E86.1090609@redhat.com>
Date: Fri, 17 Oct 2014 14:23:02 -0400
From: Prarit Bhargava
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20131028 Thunderbird/17.0.10
MIME-Version: 1.0
To: John Stultz
CC: lkml, Thomas Gleixner
Subject: Re: [PATCH] clocksource, Add warning to clocksource_delta() validation code
References: <1413554226-12652-1-git-send-email-prarit@redhat.com>
In-Reply-To:
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/17/2014 02:17 PM, John Stultz wrote:
> On Fri, Oct 17, 2014 at 6:57 AM, Prarit Bhargava wrote:
>> A bug report came in against an older kernel which output "backward time"
>> messages and the report noted that the upstream kernel worked. After some
>> investigation it turned out that one of the sockets was bad on the system
>> and the "backward time" messages were caused by a real, but intermittent,
>> hardware failure.
>>
>> Commit 09ec54429c6d10f87d1f084de53ae2c1c3a81108 ("clocksource: Move
>> cycle_last validation to core code") modifies the x86 clocksource such that
>> if a negative delta between two reads of time is calculated the
>> clocksource_delta() code will return 0. There is no warning when this
>> occurs and there really should be one in order to catch not only hardware
>> issues like the issue above, but potential coding issues as the code is
>> modified. This patch introduces a WARN() which will also dump a stack
>> trace to the console so the exact code path can be evaluated.
>>
>> I tested this by booting on the broken hardware and left the system idle
>> until a negative clocksource_delta() event occurred.
>>
>> Cc: John Stultz
>> Cc: Thomas Gleixner
>> Signed-off-by: Prarit Bhargava
>> ---
>>  kernel/time/timekeeping_internal.h | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/time/timekeeping_internal.h b/kernel/time/timekeeping_internal.h
>> index 4ea005a..abe6bc8 100644
>> --- a/kernel/time/timekeeping_internal.h
>> +++ b/kernel/time/timekeeping_internal.h
>> @@ -17,7 +17,12 @@ static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask)
>>  {
>>  	cycle_t ret = (now - last) & mask;
>>
>> -	return (s64) ret > 0 ? ret : 0;
>> +	if ((s64)ret > 0)
>> +		return ret;
>> +
>> +	WARN(1, "Clocksource calculated negative delta, %lld. last = %llu, now = %llu, mask = %llx\n",
>> +	     (s64)ret, last, now, mask);
>> +	return 0;
>
>
> I realize you followed up that this wasn't finished, but just as some
> feedback, there's a number of types of hardware where there may be a
> very slight skew between cpu TSC, and this will briefly trigger right
> after each timekeeping update if a system is reading the clock
> frequently (think of the case where the update happens on the cpu
> thats just a little bit ahead, while a timestamping loop is running on
> a cpu that is a little bit behind).

Ah, interesting. Okay ... drop this patch then. Thanks for the info John.

P.
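
For the archives, the skew case John describes can be reproduced in user
space. What follows is only a rough sketch: it copies the arithmetic from the
quoted hunk (without the WARN()), uses uint64_t as a stand-in for the kernel's
cycle_t, and the skew_example() name and all cycle values are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t cycle_t;	/* user-space stand-in for the kernel type */

/* Same arithmetic as the helper in the quoted hunk, minus the WARN(). */
static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask)
{
	cycle_t ret = (now - last) & mask;

	return (int64_t)ret > 0 ? ret : 0;
}

/* Hypothetical scenario; every number here is invented for illustration. */
void skew_example(void)
{
	const cycle_t mask = UINT64_MAX;	/* assume a full 64-bit counter */
	const cycle_t last = 1000000;		/* cycle_last stored by the CPU that is ahead */

	/* A later read on the same CPU yields an ordinary positive delta. */
	assert(clocksource_delta(last + 250, last, mask) == 250);

	/*
	 * A read on a CPU whose counter is 40 cycles behind: now < last, so
	 * the unsigned subtraction wraps, the signed cast is negative, and
	 * the result is clamped to 0. The proposed WARN() would fire here.
	 */
	assert(clocksource_delta(last - 40, last, mask) == 0);
}
```

With a small skew like this, the clamp path is hit on working hardware whenever
a reader on the lagging CPU samples the clock right after an update, which is
why an unconditional WARN() there would be noisy.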