linux-arm-msm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joonwoo Park <joonwoop@codeaurora.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>,
	linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	ohaugan@codeaurora.org
Subject: Re: [PATCH] sched: fix incorrect wait time and wait count statistics
Date: Tue, 27 Oct 2015 19:40:54 -0700	[thread overview]
Message-ID: <20151028024054.GA8839@codeaurora.org> (raw)
In-Reply-To: <20151027125728.GA3201@worktop.amr.corp.intel.com>

On Tue, Oct 27, 2015 at 01:57:28PM +0100, Peter Zijlstra wrote:
> 
> (Excessive quoting for Olav)
> 
> On Mon, Oct 26, 2015 at 06:44:48PM -0700, Joonwoo Park wrote:
> > On 10/25/2015 03:26 AM, Peter Zijlstra wrote:
> 
> > > Also note that on both sites we also set TASK_ON_RQ_MIGRATING -- albeit
> > > late. Can't you simply set that earlier (and back to QUEUED later) and
> > > test for task_on_rq_migrating() instead of blowing up the fastpath like
> > > you did?
> > > 
> > 
> > Yes it's doable.  I also find it's much simpler.
> 
> > From 98d615d46211a90482a0f9b7204265c54bba8520 Mon Sep 17 00:00:00 2001
> > From: Joonwoo Park <joonwoop@codeaurora.org>
> > Date: Mon, 26 Oct 2015 16:37:47 -0700
> > Subject: [PATCH v2] sched: fix incorrect wait time and wait count statistics
> > 
> > At present scheduler resets task's wait start timestamp when the task
> > migrates to another rq.  This misleads scheduler itself into reporting
> > less wait time than actual by omitting time spent for waiting prior to
> > migration and also more wait count than actual by counting migration as
> > wait end event which can be seen by trace or /proc/<pid>/sched with
> > CONFIG_SCHEDSTATS=y.
> > 
> > Carry forward migrating task's wait time prior to migration and
> > don't count migration as a wait end event to fix such statistics error.
> > 
> > In order to determine whether task is migrating mark task->on_rq with
> > TASK_ON_RQ_MIGRATING while dequeuing and enqueuing due to migration.
> > 
> > To: Ingo Molnar <mingo@kernel.org>
> > To: Peter Zijlstra <peterz@infradead.org>
> > Cc: linux-kernel@vger.kernel.org
> > Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
> > ---
> 
> So now that you rely on TASK_ON_RQ_MIGRATING; I think you missed one
> place that can migrate sched_fair tasks and doesn't set it.
> 
> Olav recently did a patch adding TASK_ON_RQ_MIGRATING to _every_
> migration path, but that is (still) somewhat overkill. With your changes
> we need it for sched_fair though.
> 
> So I think you need to change __migrate_swap_task(), which is used by
> the NUMA scheduling to swap two running tasks.

Oh yes... __migrate_swap_task() can migrate fair class task so I should mark as TASK_ON_RQ_MIGRATING.
I will fix this in subsequent patch.

> 
> Also, it might be prudent to extend the CONFIG_SCHED_DEBUG ifdef in
> set_task_cpu() to test for this new requirement:
> 
> 	WARN_ON_ONCE(p->state == TASK_RUNNING &&
> 	             p->class == &fair_sched_class &&
> 		     p->on_rq != TASK_ON_RQ_MIGRATING);
> 

I conceived to argue that if we had Olav's change (with revised marking order) dummy like myself don't need to worry
about stale on_rq state and probably didn't make mistake like I did with __migrate_swap_task(). 
But I don't think I can argue for that reason anymore as my patch will set TASK_ON_RQ_MIGRATING for all migration paths for
the fair class tasks at least and moreover above macro you suggested would fortify the new requirement.

I will add that macro. 

I recall Olav had some other reason for his patch though.

> >  kernel/sched/core.c |  4 ++--
> >  kernel/sched/fair.c | 17 ++++++++++++++---
> >  2 files changed, 16 insertions(+), 5 deletions(-)
> > 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index bcd214e..d9e4ad5 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1069,8 +1069,8 @@ static struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int new
> >  {
> >  	lockdep_assert_held(&rq->lock);
> >  
> > -	dequeue_task(rq, p, 0);
> >  	p->on_rq = TASK_ON_RQ_MIGRATING;
> > +	dequeue_task(rq, p, 0);
> >  	set_task_cpu(p, new_cpu);
> >  	raw_spin_unlock(&rq->lock);
> >  
> > @@ -1078,8 +1078,8 @@ static struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int new
> >  
> >  	raw_spin_lock(&rq->lock);
> >  	BUG_ON(task_cpu(p) != new_cpu);
> > -	p->on_rq = TASK_ON_RQ_QUEUED;
> >  	enqueue_task(rq, p, 0);
> > +	p->on_rq = TASK_ON_RQ_QUEUED;
> >  	check_preempt_curr(rq, p, 0);
> >  
> >  	return rq;
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 9a5e60f..7609576 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -740,7 +740,11 @@ static void update_curr_fair(struct rq *rq)
> >  static inline void
> >  update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >  {
> > -	schedstat_set(se->statistics.wait_start, rq_clock(rq_of(cfs_rq)));
> > +	schedstat_set(se->statistics.wait_start,
> > +		task_on_rq_migrating(task_of(se)) &&
> > +		likely(rq_clock(rq_of(cfs_rq)) > se->statistics.wait_start) ?
> > +		rq_clock(rq_of(cfs_rq)) - se->statistics.wait_start :
> > +		rq_clock(rq_of(cfs_rq)));
> 
> So I get that you want to avoid emitting code for !SCHEDSTATS; but that
> is rather unreadable. Either use the GCC stmt-expr or wrap the lot in
> #ifdef.
> 

Will do.

> >  }
> >  
> >  /*
> > @@ -759,6 +763,13 @@ static void update_stats_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >  static void
> >  update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >  {
> > +	if (task_on_rq_migrating(task_of(se))) {
> 
> Maybe add a comment that we adjust the wait_start time-base and will
> rebase on the new rq_clock in update_stats_wait_start().
> 

Will do.

Thanks,
Joonwoo

> > +		schedstat_set(se->statistics.wait_start,
> > +			      rq_clock(rq_of(cfs_rq)) -
> > +			      se->statistics.wait_start);
> > +		return;
> > +	}
> > +
> >  	schedstat_set(se->statistics.wait_max, max(se->statistics.wait_max,
> >  			rq_clock(rq_of(cfs_rq)) - se->statistics.wait_start));
> >  	schedstat_set(se->statistics.wait_count, se->statistics.wait_count + 1);
> > @@ -5656,8 +5667,8 @@ static void detach_task(struct task_struct *p, struct lb_env *env)
> >  {
> >  	lockdep_assert_held(&env->src_rq->lock);
> >  
> > -	deactivate_task(env->src_rq, p, 0);
> >  	p->on_rq = TASK_ON_RQ_MIGRATING;
> > +	deactivate_task(env->src_rq, p, 0);
> >  	set_task_cpu(p, env->dst_cpu);
> >  }
> >  
> > @@ -5790,8 +5801,8 @@ static void attach_task(struct rq *rq, struct task_struct *p)
> >  	lockdep_assert_held(&rq->lock);
> >  
> >  	BUG_ON(task_rq(p) != rq);
> > -	p->on_rq = TASK_ON_RQ_QUEUED;
> >  	activate_task(rq, p, 0);
> > +	p->on_rq = TASK_ON_RQ_QUEUED;
> >  	check_preempt_curr(rq, p, 0);
> >  }

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

  reply	other threads:[~2015-10-28  2:41 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-25  5:23 [PATCH] sched: fix incorrect wait time and wait count statistics Joonwoo Park
2015-10-25 10:08 ` Peter Zijlstra
2015-10-25 10:26 ` Peter Zijlstra
2015-10-27  1:44   ` Joonwoo Park
2015-10-27 12:57     ` Peter Zijlstra
2015-10-28  2:40       ` Joonwoo Park [this message]
2015-10-28  4:46         ` [PATCH v4] " Joonwoo Park
2015-11-06 13:57           ` Peter Zijlstra
2015-11-07  2:41             ` Joonwoo Park
2015-11-09 10:32               ` Peter Zijlstra
2015-11-13  3:38                 ` [PATCH v5] " Joonwoo Park
2015-10-27 19:17 ` [PATCH v3] " Joonwoo Park

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151028024054.GA8839@codeaurora.org \
    --to=joonwoop@codeaurora.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=ohaugan@codeaurora.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).