From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <4FF03BDC.9070208@computer.org>
Date: Sun, 01 Jul 2012 14:00:28 +0200
From: Jan Ceuleers
MIME-Version: 1.0
To: John Stultz
CC: Linux Kernel Mailing List, stable@vger.kernel.org, Thomas Gleixner
Subject: Re: [PATCH] [RFC] Potential fix for leapsecond caused futex related load spikes
References: <1341135371-45034-1-git-send-email-johnstul@us.ibm.com>
In-Reply-To: <1341135371-45034-1-git-send-email-johnstul@us.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org

On 07/01/2012 11:36 AM, John Stultz wrote:
> I believe this issue is due to the leapsecond being added without
> calling clock_was_set() to notify the hrtimer subsystem of the
> change. (Although I've not yet chased all the way down to the
> hrtimer code to validate exactly what's going on there).

For the benefit of -stable: Am I right in thinking that, if the analysis
is confirmed, this was caused by the following commit:

commit 746976a301ac9c9aa10d7d42454f8d6cdad8ff2b
Author: Thomas Gleixner
Date:   Tue Jul 3 20:05:20 2007 +0200

    NTP: remove clock_was_set() call to prevent deadlock

    The clock_was_set() call in seconds_overflow() which happens only when
    leap seconds are inserted / deleted is wrong in two aspects:

    1. it results in a call to on_each_cpu() with interrupts disabled
    2. it is potential deadlock source vs. call_lock in smp_call_function()

    The only possible side effect of the removal might be, that an absolute
    CLOCK_REALTIME timer fires 1 second too late, in the rare case of leap
    second deletion and an absolute CLOCK_REALTIME timer which expires in the
    affected time frame. It will never fire too early.

    This was probably observed by the reporter of a June 30th -> July 1st hang:

    http://lkml.org/lkml/2007/7/3/103

    A similar problem was observed by Dave Jones, who provided a screen shot
    with a lockdep back trace, which allowed to analyse the problem.

    Signed-off-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Signed-off-by: Linus Torvalds