All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nicolai Stange <nicstange@gmail.com>
To: Luiz Capitulino <lcapitulino@redhat.com>
Cc: fweisbec@gmail.com, tglx@linutronix.de, mtosatti@redhat.com,
	williams@redhat.com, nicstange@gmail.com,
	linux-kernel@vger.kernel.org,
	John Stultz <john.stultz@linaro.org>
Subject: Re: [nohz_full/apic] multiple timer interrupts a second
Date: Sat, 04 Nov 2017 10:14:52 +0100	[thread overview]
Message-ID: <878tfm76cz.fsf@gmail.com> (raw)
In-Reply-To: <20171103170703.46d72a31@redhat.com> (Luiz Capitulino's message of "Fri, 3 Nov 2017 17:12:03 -0400")

Hi Luiz,

[John Stultz added to CC]

On Fri, Nov 03 2017, Luiz Capitulino wrote:

> [CC'ing lkml this time]
>
> I've observed that smp_apic_timer_interrupt() is sometimes called
> two or more times a second on a nohz_full core which has a single
> task taking 100% of the core. In one of the calls, hrtimer_interrupt()
> runs tick_sched_timer(), but in others it doesn't call any handler.
> Here's an example (Linus HEAD f34157878):
>
> <...>-1831  [008]  1060.115578: funcgraph_entry:                   |  smp_apic_timer_interrupt() {
> <...>-1831  [008]  1060.115578: funcgraph_entry:                   |    hrtimer_interrupt() {
> <...>-1831  [008]  1060.115578: hrtimer_expire_entry: hrtimer=0xffff8edfefd12d60 function=tick_sched_timer now=1060079001980
> <...>-1831  [008]  1060.115578: funcgraph_entry:        1.172 us   |      tick_sched_timer();
> <...>-1831  [008]  1060.115580: funcgraph_exit:         1.757 us   |    }
> <...>-1831  [008]  1060.115580: hrtimer_start:        hrtimer=0xffff8edfefd12d60 function=tick_sched_timer expires=1061079000000 softexpires=1061079000000
> <...>-1831  [008]  1060.115581: funcgraph_exit:         3.026 us   |  }
> <...>-1831  [008]  1061.115577: funcgraph_entry:                   |  smp_apic_timer_interrupt() {
> <...>-1831  [008]  1061.115577: funcgraph_entry:        0.261 us   |    hrtimer_interrupt(); <---------- NO handler called
> <...>-1831  [008]  1061.115578: funcgraph_exit:         1.349 us   |  }
> <...>-1831  [008]  1061.115579: funcgraph_entry:                   |  smp_apic_timer_interrupt() {
> <...>-1831  [008]  1061.115579: funcgraph_entry:                   |    hrtimer_interrupt() {
> <...>-1831  [008]  1061.115579: hrtimer_expire_entry: hrtimer=0xffff8edfefd12d60 function=tick_sched_timer now=1061079001473
> <...>-1831  [008]  1061.115580: funcgraph_entry:        1.413 us   |      tick_sched_timer();
> <...>-1831  [008]  1061.115581: funcgraph_exit:         2.124 us   |    }
> <...>-1831  [008]  1061.115582: hrtimer_start:        hrtimer=0xffff8edfefd12d60 function=tick_sched_timer expires=1062079000000 softexpires=1062079000000
> <...>-1831  [008]  1061.115582: funcgraph_exit:         3.255 us   |  }
>
> Is this expected for some reason?
>
> I guess what's happening is that the deadline timer is firing
> earlier than expected. From a few dozen to a few hundreds
> nanoseconds earlier. When this happens, hrtimer_interrupt()
> skips calling the hrtimer handler (since it's early) and the
> apic is programmed to fire in the next microsecond.

Exactly.


> On further research I saw that Nicolai tried to fix a very similar
> problem last year:
>
> commit 1a9e4c564ab174e53ed86def922804a5ddc63e7d
> Author: Nicolai Stange <nicstange@gmail.com>
> Date:   Thu Jul 14 17:22:54 2016 +0200
>
>     x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC clockevents frequency roundoff error
>     
>     I noticed the following bug/misbehavior on certain Intel systems: with a
>     single task running on a NOHZ CPU on an Intel Haswell, I recognized
>     that I did not only get the one expected local_timer APIC interrupt, but
>     two per second at minimum. (!)

Note that there's also 6731b0d611a1 ("x86/timers/apic: Inform TSC
deadline clockevent device about recalibration").


> Maybe this issue is still present?

Yes it is. The reason is that NTP frequency adjustments would make it
into the mono clocksource's frequency but not into the
clock_event_device's.

I sent a series back then [1] addressing this.

However, in the meanwhile, I changed jobs and thus, got somehow
distracted. My bad :/

The result is that this got merged only up to what corresponds to
[10/28] in [1].

I'll pick this up again now and try to get the rest accepted for
inclusion.


Thanks,

Nicolai


[1] last time I sent this series as a whole can be found at
    http://lkml.kernel.org/r/20161119160055.12491-2-nicstange@gmail.com

  reply	other threads:[~2017-11-04  9:14 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-03 21:12 [nohz_full/apic] multiple timer interrupts a second Luiz Capitulino
2017-11-04  9:14 ` Nicolai Stange [this message]
2017-11-06 14:00   ` Luiz Capitulino

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878tfm76cz.fsf@gmail.com \
    --to=nicstange@gmail.com \
    --cc=fweisbec@gmail.com \
    --cc=john.stultz@linaro.org \
    --cc=lcapitulino@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=williams@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.