From: John Stultz <johnstul@us.ibm.com>
To: Ben Blum <bblum@andrew.cmu.edu>
Cc: Jan Engelhardt <jengelh@inai.de>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	simon@fire.lp0.eu, Thomas Gleixner <tglx@linutronix.de>
Subject: Re: Leap second insertion causes futex to repeatedly timeout
Date: Sun, 01 Jul 2012 02:07:42 -0700
Message-ID: <4FF0135E.8080008@us.ibm.com>
In-Reply-To: <20120701083605.GA2692@ghc17.ghc.andrew.cmu.edu>

On 07/01/2012 01:36 AM, Ben Blum wrote:
> On Sun, Jul 01, 2012 at 01:16:13AM -0700, john stultz wrote:
>> On Sat, Jun 30, 2012 at 5:57 PM, Jan Engelhardt <jengelh@inai.de> wrote:
>>> This year's leap second insertion has had the strange effect on at least
>>> Linux versions 3.4.4 (my end) and 3.5-rc4 (Simon's box, Cc) that certain
>>> processes use up all CPU power, because of futexes repeatedly timing
>>> out. This seems to only affect certain processes.
>>>
>>> Simon observes - http://s85.org/owXfmLvt - that
>>> Firefox/Thunderbird/Chrome/Java are affected.
>>>
>>> As for me, it affects VirtualBox, mysqld and ksoftirqd. The processes
>>> continue to run and respond. Most weird: I can stop-start mysqld and the
>>> issue persists. (I would have expected it to go away because the leap
>>> second event would then be in the past that mysqld does not know about
>>> anymore.)
>>>
>>>
>>> Is this a kernel issue? glibc?
>> Some reports say that the issue is resolved by calling:
>>         $ date -s "`date`"
>> which suggests that it might be due to clock_was_set() not being called
>> after the leap second was added, causing some hrtimer confusion.
>>
>> Thomas: does that sound about right?
>>
>> I've got an initial patch to add the clock_was_set() calls where
>> needed, but so far have not been able to reproduce the issue (tried
>> firefox and some simpler futex tests).  I'll keep trying and hopefully
>> have something to send out tomorrow.
>>
>> Again, my apologies for the trouble.
> I can't vouch for whether this is the problem or not, but be very
> careful with clock_was_set()! See this commit:
>
> http://www.mail-archive.com/git-commits-head@vger.kernel.org/msg15039.html
>
> In short, clock_was_set() calls on_each_cpu() which is not allowed to be
> called in atomic context. Watch out for xtime_lock.

Quite right.  The fix is a little awkward because clock_was_set() must be
called without holding xtime_lock/timekeeper.lock.
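
To illustrate the shape of that constraint, here is a minimal userspace
sketch, not the actual kernel patch. It assumes a pthread mutex standing in
for xtime_lock/timekeeper.lock, and every name in it (timekeeper_lock,
tick_locked, notify_clock_was_set, do_tick) is hypothetical. The idea shown
is only "record under the lock that the clock was stepped, then notify after
the lock is dropped":

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static pthread_mutex_t timekeeper_lock = PTHREAD_MUTEX_INITIALIZER;
static bool lock_held;     /* demo-only tracking, valid single-threaded */
static long wall_seconds;  /* simplified wall clock, whole seconds */
static int notify_count;   /* how many times the notifier ran */

/* Stand-in for clock_was_set(): must run with the lock dropped. */
static void notify_clock_was_set(void)
{
	assert(!lock_held);
	notify_count++;
}

/*
 * Advance the clock by one second. Caller holds timekeeper_lock.
 * Returns true if a leap second was inserted, i.e. the wall clock
 * was stepped back and repeats a second (a simplification).
 */
static bool tick_locked(bool leap_pending)
{
	wall_seconds += 1;
	if (leap_pending) {
		wall_seconds -= 1;
		return true;
	}
	return false;
}

void do_tick(bool leap_pending)
{
	bool clock_set;

	pthread_mutex_lock(&timekeeper_lock);
	lock_held = true;
	/* Only remember that a notification is needed; do not notify here. */
	clock_set = tick_locked(leap_pending);
	lock_held = false;
	pthread_mutex_unlock(&timekeeper_lock);

	/* Lock is dropped, so the notifier may now be called. */
	if (clock_set)
		notify_clock_was_set();
}
```

Calling do_tick(false) advances the clock without notifying; do_tick(true)
leaves wall_seconds unchanged for that tick and runs the notifier exactly
once, after the unlock.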

I've just reproduced the issue with Thunderbird, and my fix seems to avoid
it. Working up a patch now.

thanks
-john
