All of lore.kernel.org
 help / color / mirror / Atom feed
From: Luiz Capitulino <lcapitulino@redhat.com>
To: Wanpeng Li <kernellwp@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	linux-rt-users@vger.kernel.org, Rik van Riel <riel@redhat.com>
Subject: Re: [BUG nohz]: wrong user and system time accounting
Date: Wed, 29 Mar 2017 22:14:00 -0400	[thread overview]
Message-ID: <20170329221400.2b1e8d77@redhat.com> (raw)
In-Reply-To: <CANRm+CzSCJODribK6vPhQVT=CGFX7ewT6K3mTMu4vfZv5r_8LA@mail.gmail.com>

On Thu, 30 Mar 2017 06:46:30 +0800
Wanpeng Li <kernellwp@gmail.com> wrote:

> > So! Now we need to find a proper fix :o)
> >
> > Hmm, how bad would it be to revert to sched_clock() instead of jiffies in vtime_delta()?
> > We could use nanosecond granularity to check deltas but only perform an actual cputime update
> > when that delta >= TICK_NSEC. That should keep the load ok.  
> 
> Yeah, I mentioned something similar before.
> https://lkml.org/lkml/2017/3/26/138 However, Rik's commit optimized
> syscalls by not utilize sched_clock(), so if we should distinguish
> between syscalls/exceptions and irqs?

Why not use ktime_get()?

Here's the solution I was thinking about, it's mostly untested. I'm
rate limiting below TICK_NSEC because I want to avoid syncing with
the tick.

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index f3778e2b..a8b1e85 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -676,18 +676,20 @@ void thread_group_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
 static u64 vtime_delta(struct task_struct *tsk)
 {
-	unsigned long now = READ_ONCE(jiffies);
+	return ktime_sub(ktime_get(), tsk->vtime_snap);
+}
 
-	if (time_before(now, (unsigned long)tsk->vtime_snap))
-		return 0;
+/* A little bit less than the tick period */
+#define VTIME_RATE_LIMIT (TICK_NSEC - 200000)
 
-	return jiffies_to_nsecs(now - tsk->vtime_snap);
+static bool vtime_should_account(struct task_struct *tsk)
+{
+	return vtime_delta(tsk) > VTIME_RATE_LIMIT;
 }
 
 static u64 get_vtime_delta(struct task_struct *tsk)
 {
-	unsigned long now = READ_ONCE(jiffies);
-	u64 delta, other;
+	u64 delta, other, now = ktime_get();
 
 	/*
 	 * Unlike tick based timing, vtime based timing never has lost
@@ -696,7 +698,7 @@ static u64 get_vtime_delta(struct task_struct *tsk)
 	 * elapsed time. Limit account_other_time to prevent rounding
 	 * errors from causing elapsed vtime to go negative.
 	 */
-	delta = jiffies_to_nsecs(now - tsk->vtime_snap);
+	delta = ktime_sub(now, tsk->vtime_snap);
 	other = account_other_time(delta);
 	WARN_ON_ONCE(tsk->vtime_snap_whence == VTIME_INACTIVE);
 	tsk->vtime_snap = now;
@@ -711,7 +713,7 @@ static void __vtime_account_system(struct task_struct *tsk)
 
 void vtime_account_system(struct task_struct *tsk)
 {
-	if (!vtime_delta(tsk))
+	if (!vtime_should_account(tsk))
 		return;
 
 	write_seqcount_begin(&tsk->vtime_seqcount);
@@ -723,7 +725,7 @@ void vtime_account_user(struct task_struct *tsk)
 {
 	write_seqcount_begin(&tsk->vtime_seqcount);
 	tsk->vtime_snap_whence = VTIME_SYS;
-	if (vtime_delta(tsk))
+	if (vtime_should_account(tsk))
 		account_user_time(tsk, get_vtime_delta(tsk));
 	write_seqcount_end(&tsk->vtime_seqcount);
 }
@@ -731,7 +733,7 @@ void vtime_account_user(struct task_struct *tsk)
 void vtime_user_enter(struct task_struct *tsk)
 {
 	write_seqcount_begin(&tsk->vtime_seqcount);
-	if (vtime_delta(tsk))
+	if (vtime_should_account(tsk))
 		__vtime_account_system(tsk);
 	tsk->vtime_snap_whence = VTIME_USER;
 	write_seqcount_end(&tsk->vtime_seqcount);
@@ -747,7 +749,7 @@ void vtime_guest_enter(struct task_struct *tsk)
 	 * that can thus safely catch up with a tickless delta.
 	 */
 	write_seqcount_begin(&tsk->vtime_seqcount);
-	if (vtime_delta(tsk))
+	if (vtime_should_account(tsk))
 		__vtime_account_system(tsk);
 	current->flags |= PF_VCPU;
 	write_seqcount_end(&tsk->vtime_seqcount);
@@ -776,7 +778,7 @@ void arch_vtime_task_switch(struct task_struct *prev)
 
 	write_seqcount_begin(&current->vtime_seqcount);
 	current->vtime_snap_whence = VTIME_SYS;
-	current->vtime_snap = jiffies;
+	current->vtime_snap = ktime_get();
 	write_seqcount_end(&current->vtime_seqcount);
 }
 

  reply	other threads:[~2017-03-30  2:14 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-23 20:55 [BUG nohz]: wrong user and system time accounting Luiz Capitulino
2017-03-24  0:56 ` Rik van Riel
2017-03-24  1:05   ` Luiz Capitulino
2017-03-24  1:08     ` Rik van Riel
2017-03-24  1:39       ` Luiz Capitulino
2017-03-27  5:33   ` lkml
2017-03-24  1:52 ` Wanpeng Li
2017-03-24  3:56   ` Luiz Capitulino
2017-03-27  1:56 ` Wanpeng Li
2017-03-27 17:35   ` Rik van Riel
2017-03-28  7:19     ` Wanpeng Li
     [not found]     ` <20170328132406.7d23579c@redhat.com>
     [not found]       ` <20170328161454.4a5d9e8b@redhat.com>
2017-03-28 21:01         ` Rik van Riel
2017-03-28 21:26           ` Luiz Capitulino
2017-03-29  9:56             ` Wanpeng Li
2017-03-29 12:56               ` Frederic Weisbecker
2017-03-28 21:24         ` Rik van Riel
2017-03-28 21:30           ` Luiz Capitulino
     [not found]       ` <20170329131656.1d6cb743@redhat.com>
2017-03-29 20:08         ` Rik van Riel
2017-03-29 22:54           ` Frederic Weisbecker
2017-03-30 12:57             ` Rik van Riel
2017-03-30  1:58           ` Wanpeng Li
2017-03-30 12:40             ` Frederic Weisbecker
2017-03-30 13:19               ` Mike Galbraith
2017-03-30  4:27           ` Mike Galbraith
2017-03-30  6:47             ` Wanpeng Li
2017-03-30 11:52               ` Wanpeng Li
2017-03-30 12:33                 ` Mike Galbraith
2017-03-30 13:38               ` Frederic Weisbecker
2017-03-30 13:59                 ` Wanpeng Li
2017-03-30 14:18                   ` Frederic Weisbecker
2017-03-30 21:25                     ` Luiz Capitulino
2017-03-31 20:09                       ` Luiz Capitulino
2017-03-31 23:24                         ` Frederic Weisbecker
2017-04-01  3:11                           ` Luiz Capitulino
2017-04-03 15:23                             ` Frederic Weisbecker
2017-04-03 19:06                               ` Luiz Capitulino
2017-04-04 17:36                                 ` Luiz Capitulino
2017-04-05 14:26                                   ` Rik van Riel
2017-04-11 11:03                 ` Wanpeng Li
2017-04-11 11:36                   ` Peter Zijlstra
2017-04-11 11:43                     ` Wanpeng Li
2017-04-11 14:22               ` Thomas Gleixner
2017-04-12 13:18                 ` Frederic Weisbecker
2017-04-12 14:57                   ` Thomas Gleixner
2017-04-12 15:14                     ` Frederic Weisbecker
2017-04-13  4:31                     ` Wanpeng Li
2017-04-13 13:32                       ` Frederic Weisbecker
2017-05-02 10:01                         ` Wanpeng Li
2017-05-15  8:17                           ` Wanpeng Li
2017-06-29 17:22                             ` Frederic Weisbecker
2017-03-30 12:51             ` Frederic Weisbecker
2017-03-30 13:02               ` Rik van Riel
2017-03-30 13:35                 ` Mike Galbraith
2017-04-03 14:40                   ` Frederic Weisbecker
2017-04-04  7:32                     ` Mike Galbraith
2017-03-30 13:44                 ` Frederic Weisbecker
     [not found]         ` <20170329221700.GB23895@lerouge>
2017-03-29 22:46           ` Wanpeng Li
2017-03-30  2:14             ` Luiz Capitulino [this message]
2017-03-30 12:27               ` Wanpeng Li
2017-03-27 18:38   ` Luiz Capitulino
2017-03-28  5:28     ` Wanpeng Li
2017-03-28 13:44       ` Luiz Capitulino
2017-03-29 13:04 ` Frederic Weisbecker
2017-03-29 13:14   ` Rik van Riel
2017-03-29 13:23     ` Luiz Capitulino
2017-03-29 21:12       ` Frederic Weisbecker
2017-03-30  1:48         ` Luiz Capitulino

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170329221400.2b1e8d77@redhat.com \
    --to=lcapitulino@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=kernellwp@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.