From: Felipe Balbi <felipe.balbi@linux.intel.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Alessandro Zummo <a.zummo@towertech.it>,
Alexandre Belloni <alexandre.belloni@free-electrons.com>,
linux-kernel@vger.kernel.org, linux-rtc@vger.kernel.org
Subject: Re: HRTimer causing rtctest to fail
Date: Wed, 28 Feb 2018 11:26:55 +0200 [thread overview]
Message-ID: <87muzt5usg.fsf@linux.intel.com> (raw)
In-Reply-To: <alpine.DEB.2.21.1802271640010.1886@nanos.tec.linutronix.de>
[-- Attachment #1.1: Type: text/plain, Size: 4550 bytes --]
Hi,
Thomas Gleixner <tglx@linutronix.de> writes:
> On Tue, 27 Feb 2018, Felipe Balbi wrote:
>> I'm facing an odd problem with v4.16-rc2 (also reproducible on v4.15
>> final) where rtctest fails sometimes which PIE coming too late with
>> frequencies >= 1024 Hz.
>>
>> I've modified rtctest.c a bit so that it continues running even after
>> first failure just so I could get a glimpse of how many times it fails.
>>
>> Here are the results:
>>
>> Error 1024Hz i 17/20 diff 1089/976
>>
>> Error 2048Hz i 2/20 diff 538/488
>> Error 2048Hz i 9/20 diff 558/488
>> Error 2048Hz i 16/20 diff 560/488
>>
>> Error 4096Hz i 2/20 diff 305/244
>> Error 4096Hz i 7/20 diff 288/244
>> Error 4096Hz i 9/20 diff 285/244
>> Error 4096Hz i 11/20 diff 310/244
>> Error 4096Hz i 16/20 diff 284/244
>> Error 4096Hz i 20/20 diff 317/244
>>
>>
>> I added a few trace_printk() calls around rtc_dev_read() (and some other
>> functions) and captured the time for entry and exit of that function. I
>> noticed that HRTimer fires way too late and also way too early
>> sometimes. On the order of 100+ us.
>
> Power management, deep idle states?
>
>> For example with iterations 17 of 1024 Hz (seen above) and 18 (also of
>> 1024 Hz) I captured the following time stamps (all in uS):
>>
>> | Start | End | Expected End | Diff |
>> |----------+----------+--------------+------|
>> | 35900626 | 35901711 | 35901603 | -108 | (too late)
>> | 35901826 | 35902628 | 35902803 | 174 | (too early)
>
> Too early? hrtimers never fire too early. The measurement in that test is
> bogus:
>
> gettimeofday(&start, NULL);
> /* This blocks */
> retval = read(fd, &data, sizeof(unsigned long));
> if (retval == -1) {
> perror("read");
> exit(errno);
> }
> gettimeofday(&end, NULL);
>
> It's measuring relative time between going to sleep and wakeup, so if an
> event comes late, then the next one will claim to be early which is
> nonsense because periodic mode runs on an absolute time line.
>
> 1 2 3 4 5 6 7
> Expected: | | | | | | |
> Real: | | | | | | |
>
> ^ late event
> So 2->3 would give you (period + latency) and 3->4 (period - latency) ....
>
> And if the task gets scheduled out before the taking the start time stamp
> then this is also giving bogus results.
>
> What the test really should figure out is the deviation of the absolute
> time line. That's hard because the RTC code starts the hrtimer with:
>
> hrtimer_start(&rtc->pie_timer, period, HRTIMER_MODE_REL);
>
> which means relative to 'now'. So there is no good way for user space to
> determine the expected timeline.
>
>> I can't come up with a proper explanation as to why HRTimer is firing at
>> such wildly odd extremes. On most of the measurements, HRTimer executes
>> within 5 uS of scheduled time and I can't convince myself that
>> scheduling latency would get so high all of a sudden. Specially
>> considering I'm testing with a 4 core machine and there's nothing else
>> running on it. It's mostly idle.
>
> Enable the hrtimer and scheduling tracepoints. That should give you a hint
> what's going on.
Thanks, that does give me a lot more information. So here you go:
rtctest-1348 [003] d..2 313.766141: hrtimer_start: hrtimer=00000000667ce595 function=rtc_pie_update_irq expires=313573983010 softexpires=313573983010 mode=REL
<idle>-0 [003] d.h1 313.767189: hrtimer_expire_entry: hrtimer=00000000667ce595 function=rtc_pie_update_irq now=313574053764
We still have a 70754 nS deviation. After changing to absolute time,
the deviation remains:
<idle>-0 [000] dNh2 29.303251: hrtimer_start: hrtimer=000000006858b496 function=rtc_pie_update_irq expires=28765551360 softexpires=28765551360 mode=ABS
<idle>-0 [000] d.h1 29.303565: hrtimer_expire_entry: hrtimer=000000006858b496 function=rtc_pie_update_irq now=28765621916
I took a few measurements (76 to be exact) and plotted the deviation
from expected expiry time (now - expires from the tracepoint data) and
it's looking rather odd. PNG attached (all numbers in nS). Could this be
caused by FW stealing CPU behind Linux's back? I can take many more
measurements, if necessary. Just let me know.
cheers
[-- Attachment #1.2: chart.png --]
[-- Type: image/png, Size: 42201 bytes --]
[-- Attachment #1.3: Type: text/plain, Size: 14 bytes --]
--
balbi
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]
next prev parent reply other threads:[~2018-02-28 9:27 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-02-27 13:20 HRTimer causing rtctest to fail Felipe Balbi
2018-02-27 16:11 ` Thomas Gleixner
2018-02-28 9:26 ` Felipe Balbi [this message]
2018-02-28 9:55 ` Thomas Gleixner
2018-02-28 10:55 ` Felipe Balbi
2018-02-28 11:50 ` Thomas Gleixner
2018-02-28 12:02 ` Felipe Balbi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87muzt5usg.fsf@linux.intel.com \
--to=felipe.balbi@linux.intel.com \
--cc=a.zummo@towertech.it \
--cc=alexandre.belloni@free-electrons.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rtc@vger.kernel.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.