From: Frederic Weisbecker <frederic@kernel.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Wanpeng Li <wanpengli@tencent.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Yauheni Kaliuta <yauheni.kaliuta@redhat.com>,
	Ingo Molnar <mingo@kernel.org>, Rik van Riel <riel@redhat.com>
Subject: [PATCH 18/25] vtime: Track nice-ness on top of context switch
Date: Wed, 14 Nov 2018 03:46:02 +0100
Message-ID: <1542163569-20047-19-git-send-email-frederic@kernel.org>
In-Reply-To: <1542163569-20047-1-git-send-email-frederic@kernel.org>

We need to read the nice value of the task running on any CPU, possibly
remotely, in order to correctly support kcpustat on nohz_full.
Unfortunately we can't just read task_nice(tsk) when tsk runs on another
CPU because its nice value may be changed concurrently. A recently
modified nice value could then be assumed to have applied for longer
than it actually did.

For example, if a task starts running at T0 with nice = -10 and its nice
value is changed to 10 at T0 + 1 second, a reader at T0 + 1 second could
think that the task has had "nice == 10" since the beginning (T0) and
spuriously account 1 second of nice time on kcpustat instead of 1 second
of user time.
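
The example above can be sketched as a small userspace model. This is
only an illustration, not kernel code: struct stat_model, struct
task_model and the helpers below are hypothetical names. It assumes that
a nice value above 0 selects the nice bucket, and it compares a reader
that trusts the current nice value with one that uses the value recorded
together with the pending delta.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical userspace model of the example; none of this is kernel code. */
struct stat_model {
	uint64_t user;		/* ns accounted as user time */
	uint64_t nice;		/* ns accounted as nice time */
};

struct task_model {
	uint64_t starttime;	/* start of the not yet accounted delta */
	int nice;		/* nice value the pending delta ran with */
	struct stat_model acct;	/* time already accounted */
};

/* Add a delta to the bucket selected by the nice value (nice > 0: nice time). */
static void account(struct stat_model *s, uint64_t delta, int nice)
{
	if (nice > 0)
		s->nice += delta;
	else
		s->user += delta;
}

/* Naive reader: charge the whole pending delta to the nice value seen now. */
static struct stat_model read_naive(const struct task_model *t, uint64_t now,
				    int nice_now)
{
	struct stat_model s = t->acct;

	assert(now >= t->starttime);
	account(&s, now - t->starttime, nice_now);
	return s;
}

/* Tracked update: flush the pending delta under the old nice value first. */
static void set_nice_tracked(struct task_model *t, uint64_t now, int new_nice)
{
	assert(now >= t->starttime);
	account(&t->acct, now - t->starttime, t->nice);
	t->starttime = now;
	t->nice = new_nice;
}

/* Tracked reader: the pending delta uses the nice value recorded with it. */
static struct stat_model read_tracked(const struct task_model *t, uint64_t now)
{
	return read_naive(t, now, t->nice);
}
```

With T0 = 0, read_naive() called at 1 second with nice_now = 10 reports
the whole second as nice time, whereas calling set_nice_tracked() at
1 second and then read_tracked() reports it as user time.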

So we need to track nice value changes under the vtime seqcount. Start
with context switches and account the vtime nice-ness on top of them.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Ingo Molnar <mingo@kernel.org>
---
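
For readers less familiar with the seqcount pattern relied on above, here
is a minimal userspace sketch of how a remote reader could take a
consistent snapshot of the start time and the nice flag. It is not the
kernel's seqcount_t implementation: struct vtime_model and both helpers
are hypothetical, it uses C11 atomics, and it assumes a single writer
(the CPU the task runs on).

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of the fields a remote reader needs together. */
struct vtime_model {
	atomic_uint seq;		/* odd while an update is in flight */
	_Atomic uint64_t starttime;
	_Atomic int nice;		/* 1 if the task's nice value is > 0 */
};

/* Writer side, e.g. on context switch. Assumes one writer at a time. */
static void vtime_model_write(struct vtime_model *v, uint64_t starttime,
			      int task_nice)
{
	unsigned int s = atomic_load_explicit(&v->seq, memory_order_relaxed);

	/* Make the sequence odd before touching the data. */
	atomic_store_explicit(&v->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&v->starttime, starttime, memory_order_relaxed);
	atomic_store_explicit(&v->nice, task_nice > 0, memory_order_relaxed);

	/* Make the sequence even again and publish the update. */
	atomic_store_explicit(&v->seq, s + 2, memory_order_release);
}

/* Reader side: retry until both fields come from the same update. */
static void vtime_model_read(struct vtime_model *v, uint64_t *starttime,
			     int *nice)
{
	unsigned int s1, s2;

	do {
		s1 = atomic_load_explicit(&v->seq, memory_order_acquire);
		*starttime = atomic_load_explicit(&v->starttime,
						  memory_order_relaxed);
		*nice = atomic_load_explicit(&v->nice, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s2 = atomic_load_explicit(&v->seq, memory_order_relaxed);
	} while ((s1 & 1) || s1 != s2);
}
```

The point of the sketch is that the nice flag and the start time are
always read as a pair, so a reader never combines a new nice value with
a delta that started under the old one.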
 include/linux/sched.h  |  1 +
 kernel/sched/cputime.c | 44 +++++++++++++++++++++++++++++++++++---------
 2 files changed, 36 insertions(+), 9 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 27e0544..356326f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -280,6 +280,7 @@ enum vtime_state {
 struct vtime {
 	seqcount_t		seqcount;
 	unsigned long long	starttime;
+	int			nice;
 	enum vtime_state	state;
 	unsigned int		cpu;
 	u64			utime;
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 8f5dee2..07c2e7f 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -735,13 +735,42 @@ static void vtime_account_system(struct task_struct *tsk,
 static void vtime_account_guest(struct task_struct *tsk,
 				struct vtime *vtime)
 {
+	enum cpu_usage_stat index;
+
 	vtime->gtime += get_vtime_delta(vtime);
-	if (vtime->gtime >= TICK_NSEC) {
-		account_guest_time(tsk, vtime->gtime);
-		vtime->gtime = 0;
-	}
+
+	if (vtime->gtime < TICK_NSEC)
+		return;
+
+	if (vtime->nice)
+		index = CPUTIME_GUEST_NICE;
+	else
+		index = CPUTIME_GUEST;
+
+	account_guest_time_index(tsk, vtime->gtime, index);
+	vtime->gtime = 0;
 }
 
+static void vtime_account_user(struct task_struct *tsk,
+			       struct vtime *vtime)
+{
+	enum cpu_usage_stat index;
+
+	vtime->utime += get_vtime_delta(vtime);
+
+	if (vtime->utime < TICK_NSEC)
+		return;
+
+	if (vtime->nice)
+		index = CPUTIME_NICE;
+	else
+		index = CPUTIME_USER;
+
+	account_user_time_index(tsk, vtime->utime, index);
+	vtime->utime = 0;
+}
+
+
 static void __vtime_account_kernel(struct task_struct *tsk,
 				   struct vtime *vtime)
 {
@@ -779,11 +808,7 @@ void vtime_user_exit(struct task_struct *tsk)
 	struct vtime *vtime = &tsk->vtime;
 
 	write_seqcount_begin(&vtime->seqcount);
-	vtime->utime += get_vtime_delta(vtime);
-	if (vtime->utime >= TICK_NSEC) {
-		account_user_time(tsk, vtime->utime);
-		vtime->utime = 0;
-	}
+	vtime_account_user(tsk, vtime);
 	vtime->state = VTIME_SYS;
 	write_seqcount_end(&vtime->seqcount);
 }
@@ -864,6 +889,7 @@ void vtime_task_switch_generic(struct task_struct *prev)
 		vtime->state = VTIME_SYS;
 	vtime->starttime = sched_clock();
 	vtime->cpu = smp_processor_id();
+	vtime->nice = (task_nice(current) > 0) ? 1 : 0;
 	write_seqcount_end(&vtime->seqcount);
 
 	rcu_assign_pointer(kcpustat->curr, current);
-- 
2.7.4

