From: Peter Zijlstra <peterz@infradead.org>
To: Arjan van de Ven <arjan@linux.intel.com>
Cc: "Fernando Luis Vázquez Cao" <fernando_b1@lab.ntt.co.jp>,
"Frederic Weisbecker" <fweisbec@gmail.com>,
"Oleg Nesterov" <oleg@redhat.com>,
"Ingo Molnar" <mingo@kernel.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
LKML <linux-kernel@vger.kernel.org>,
"Tetsuo Handa" <penguin-kernel@I-love.SAKURA.ne.jp>,
"Andrew Morton" <akpm@linux-foundation.org>
Subject: Re: [PATCH 2/4] nohz: Synchronize sleep time stats with seqlock
Date: Tue, 20 Aug 2013 18:01:46 +0200
Message-ID: <20130820160146.GG3258@twins.programming.kicks-ass.net>
In-Reply-To: <52138BE9.5090005@linux.intel.com>
On Tue, Aug 20, 2013 at 08:31:53AM -0700, Arjan van de Ven wrote:
> On 8/20/2013 1:44 AM, Peter Zijlstra wrote:
> >Of course, if we can get away with completely removing all of that
> >(which I think Arjan suggested was a real possibility) then that would
> >be ever so much better still :-)
>
> I'm quite ok with removing that.
>
> however note that "top" also reports per cpu iowait...
> and that's a userspace expectation

Right, broken as that may be :/ OK, that looks like CPUTIME_IOWAIT, which
is tick based and not the ns based accounting.

Still, it needs the per-cpu nr_iowait accounting, which pretty much
requires the atomics, so no big gains there.

Which means that if Frederic can make the ns thing no more expensive than
the existing atomics, we might as well keep the ns thing too.

Hmm, would something like the below make sense? I suppose this can be
done even for the ns case; you'd have to duplicate all stats though.

At the very least the below reduces the number of atomics. Not entirely
sure it all matters much though; some benchmarking would be in order, I
suppose.

---
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2167,8 +2167,12 @@ unsigned long nr_iowait(void)
{
unsigned long i, sum = 0;
- for_each_possible_cpu(i)
- sum += atomic_read(&cpu_rq(i)->nr_iowait);
+ for_each_possible_cpu(i) {
+ struct rq *rq = cpu_rq(i);
+
+ sum += rq->nr_iowait_local;
+ sum += atomic_read(&rq->nr_iowait_remote);
+ }
return sum;
}
@@ -2176,7 +2180,7 @@ unsigned long nr_iowait(void)
unsigned long nr_iowait_cpu(int cpu)
{
struct rq *this = cpu_rq(cpu);
- return atomic_read(&this->nr_iowait);
+ return atomic_read(&this->nr_iowait_remote) + this->nr_iowait_local;
}
#ifdef CONFIG_SMP
@@ -4086,31 +4090,49 @@ EXPORT_SYMBOL_GPL(yield_to);
*/
void __sched io_schedule(void)
{
- struct rq *rq = raw_rq();
+ struct rq *rq;
delayacct_blkio_start();
- atomic_inc(&rq->nr_iowait);
blk_flush_plug(current);
+
+ preempt_disable();
+ rq = this_rq();
+ rq->nr_iowait_local++;
current->in_iowait = 1;
- schedule();
+ schedule_preempt_disabled();
current->in_iowait = 0;
- atomic_dec(&rq->nr_iowait);
+ if (likely(task_cpu(current) == cpu_of(rq)))
+ rq->nr_iowait_local--;
+ else
+ atomic_dec(&rq->nr_iowait_remote);
+ preempt_enable();
+
delayacct_blkio_end();
}
EXPORT_SYMBOL(io_schedule);
long __sched io_schedule_timeout(long timeout)
{
- struct rq *rq = raw_rq();
+ struct rq *rq;
long ret;
delayacct_blkio_start();
- atomic_inc(&rq->nr_iowait);
blk_flush_plug(current);
+
+ preempt_disable();
+ rq = this_rq();
+ rq->nr_iowait_local++;
current->in_iowait = 1;
+ preempt_enable_no_resched();
ret = schedule_timeout(timeout);
+ preempt_disable();
current->in_iowait = 0;
- atomic_dec(&rq->nr_iowait);
+ if (likely(task_cpu(current) == cpu_of(rq)))
+ rq->nr_iowait_local--;
+ else
+ atomic_dec(&rq->nr_iowait_remote);
+ preempt_enable();
+
delayacct_blkio_end();
return ret;
}
@@ -6650,7 +6672,8 @@ void __init sched_init(void)
#endif
#endif
init_rq_hrtick(rq);
- atomic_set(&rq->nr_iowait, 0);
+ rq->nr_iowait_local = 0;
+ atomic_set(&rq->nr_iowait_remote, 0);
}
set_load_weight(&init_task);
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -453,7 +453,8 @@ struct rq {
u64 clock;
u64 clock_task;
- atomic_t nr_iowait;
+ int nr_iowait_local;
+ atomic_t nr_iowait_remote;
#ifdef CONFIG_SMP
struct root_domain *rd;
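
To make the counting scheme in the patch above easier to follow, here is a minimal userspace sketch of the split counter, written with C11 atomics rather than the kernel's atomic_t. The names rq_model, iowait_begin(), iowait_end(), nr_iowait_cpu_model() and demo_migrated_wakeup() are hypothetical and exist only for illustration. The sketch models the arithmetic only: it does not model preemption, migration or concurrent readers, so it says nothing about whether the patch is race-free.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two fields the patch adds to struct rq. */
struct rq_model {
	int nr_iowait_local;		/* plain counter, owning CPU only */
	atomic_int nr_iowait_remote;	/* adjusted after a wakeup elsewhere */
};

/* A task on the owning CPU is about to block on I/O: plain increment. */
static void iowait_begin(struct rq_model *rq)
{
	rq->nr_iowait_local++;
}

/* The task woke up; same_cpu says whether it still runs on rq's CPU. */
static void iowait_end(struct rq_model *rq, bool same_cpu)
{
	if (same_cpu)
		rq->nr_iowait_local--;				/* cheap path */
	else
		atomic_fetch_sub(&rq->nr_iowait_remote, 1);	/* atomic path */
}

/* Reader side: the per-CPU count is the sum of both halves. */
static int nr_iowait_cpu_model(struct rq_model *rq)
{
	return rq->nr_iowait_local + atomic_load(&rq->nr_iowait_remote);
}

/* Two sleepers on one runqueue; one wakes locally, one wakes elsewhere. */
static int demo_migrated_wakeup(void)
{
	struct rq_model rq;

	rq.nr_iowait_local = 0;
	atomic_init(&rq.nr_iowait_remote, 0);

	iowait_begin(&rq);
	iowait_begin(&rq);
	assert(nr_iowait_cpu_model(&rq) == 2);

	iowait_end(&rq, true);	/* local half drops to 1 */
	iowait_end(&rq, false);	/* remote half drops to -1 */

	/* Each half on its own is meaningless; only the sum is the count. */
	return nr_iowait_cpu_model(&rq);
}
```

In this model the remote half goes negative whenever a sleeper wakes on a different CPU, which is why both nr_iowait() and nr_iowait_cpu() in the patch add the two fields together instead of reading either one alone.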