From: Rik van Riel <riel@redhat.com>
To: Luiz Capitulino <lcapitulino@redhat.com>, fweisbec@gmail.com
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org
Subject: Re: [BUG nohz]: wrong user and system time accounting
Date: Thu, 23 Mar 2017 20:56:02 -0400 [thread overview]
Message-ID: <1490316962.8850.59.camel@redhat.com> (raw)
In-Reply-To: <20170323165512.60945ac6@redhat.com>
On Thu, 2017-03-23 at 16:55 -0400, Luiz Capitulino wrote:
> When there are two or more tasks executing in user-space and
> taking 100% of a nohz_full CPU, top reports 70% system time
> and 30% user time utilization. Sometimes I'm even able to get
> 100% system time and 0% user time.
>
> This was reproduced with latest Linus tree (093b995), but I
> don't believe it's a regression (at least not a recent one)
> as I can reproduce it with older kernels. Also, I have
> CONFIG_IRQ_TIME_ACCOUNTING=y and haven't tried to reproduce
> without it yet.
>
> Below you'll find the steps to reproduce and some initial
> analysis.
>
> Steps to reproduce
> ------------------
>
> 1. Set up a CPU for nohz_full with isolcpus= nohz_full=
>
> 2. Pin two tasks that hog the CPU 100% of the time to that CPU
>
> 3. Run top -d1 and check system time
>
> NOTE: When there's only one task hogging a nohz_full CPU, top
> shows 100% user-time, as expected
>
> Initial analysis
> ----------------
>
> When tracing vtime accounting functions and the user-space/kernel
> transitions when the issue is taking place, I see several of the
> following:
>
> hog-10552 [015] 1132.711104:
> function: enter_from_user_mode <-- apic_timer_interrupt
> hog-10552 [015] 1132.711105:
> function: __context_tracking_exit <--
> enter_from_user_mode
> hog-10552 [015] 1132.711105:
> bprint: __context_tracking_exit.part.4: new state=1 cur
> state=1 active=1
> hog-10552 [015] 1132.711105:
> function: vtime_account_user <--
> __context_tracking_exit.part.4
> hog-10552 [015] 1132.711105:
> function: smp_apic_timer_interrupt <--
> apic_timer_interrupt
> hog-10552 [015] 1132.711106: function: irq_enter <--
> smp_apic_timer_interrupt
> hog-10552 [015] 1132.711106: function: tick_sched_timer
> <-- __hrtimer_run_queues
> hog-10552 [015] 1132.711108: function: irq_exit <--
> smp_apic_timer_interrupt
> hog-10552 [015] 1132.711108:
> function: __context_tracking_enter <--
> prepare_exit_to_usermode
> hog-10552 [015] 1132.711108:
> bprint: __context_tracking_enter.part.2: new state=1
> cur state=0 active=1
> hog-10552 [015] 1132.711109: function: vtime_user_enter
> <-- __context_tracking_enter.part.2
> hog-10552 [015] 1132.711109:
> function: __vtime_account_system <-- vtime_user_enter
> hog-10552 [015] 1132.711109:
> function: account_system_time <-- __vtime_account_system
>
> On entering the kernel due to a timer interrupt, vtime_account_user()
> skips user-time accounting. Then later on when returning to user-
> space,
> vtime_user_enter() is probably accounting the whole time (ie. user-
> space
> plus kernel-space) to system time.
>
> Now, when does vtime_account_user() skips accounting? Well, when the
> time delta is less then one jiffie. This would imply that
> vtime_account_user()
> is being called less than one jiffie since the last accounting, but I
> haven't
> confirmed any of this yet.
Jiffies should be advanced by the timer interrupt, on the
housekeeping CPU, which is not doing context tracking.
Why is the isolated/nohz_full CPU receiving timer interrupts
at all?
I thought it would not, but obviously I am wrong. What is
going on here?
next prev parent reply other threads:[~2017-03-24 0:56 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-03-23 20:55 [BUG nohz]: wrong user and system time accounting Luiz Capitulino
2017-03-24 0:56 ` Rik van Riel [this message]
2017-03-24 1:05 ` Luiz Capitulino
2017-03-24 1:08 ` Rik van Riel
2017-03-24 1:39 ` Luiz Capitulino
2017-03-27 5:33 ` lkml
2017-03-24 1:52 ` Wanpeng Li
2017-03-24 3:56 ` Luiz Capitulino
2017-03-27 1:56 ` Wanpeng Li
2017-03-27 17:35 ` Rik van Riel
2017-03-28 7:19 ` Wanpeng Li
[not found] ` <20170328132406.7d23579c@redhat.com>
[not found] ` <20170329131656.1d6cb743@redhat.com>
[not found] ` <20170329221700.GB23895@lerouge>
2017-03-29 22:46 ` Wanpeng Li
2017-03-30 2:14 ` Luiz Capitulino
2017-03-30 12:27 ` Wanpeng Li
2017-03-27 18:38 ` Luiz Capitulino
2017-03-28 5:28 ` Wanpeng Li
2017-03-28 13:44 ` Luiz Capitulino
2017-03-29 13:04 ` Frederic Weisbecker
2017-03-29 13:14 ` Rik van Riel
2017-03-29 13:23 ` Luiz Capitulino
2017-03-29 21:12 ` Frederic Weisbecker
2017-03-30 1:48 ` Luiz Capitulino
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1490316962.8850.59.camel@redhat.com \
--to=riel@redhat.com \
--cc=fweisbec@gmail.com \
--cc=lcapitulino@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rt-users@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).