All of lore.kernel.org
 help / color / mirror / Atom feed
From: Joonwoo Park <joonwoop@codeaurora.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>,
	linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	ohaugan@codeaurora.org
Subject: Re: [PATCH] sched: fix incorrect wait time and wait count statistics
Date: Tue, 27 Oct 2015 19:40:54 -0700	[thread overview]
Message-ID: <20151028024054.GA8839@codeaurora.org> (raw)
In-Reply-To: <20151027125728.GA3201@worktop.amr.corp.intel.com>

On Tue, Oct 27, 2015 at 01:57:28PM +0100, Peter Zijlstra wrote:
> 
> (Excessive quoting for Olav)
> 
> On Mon, Oct 26, 2015 at 06:44:48PM -0700, Joonwoo Park wrote:
> > On 10/25/2015 03:26 AM, Peter Zijlstra wrote:
> 
> > > Also note that on both sites we also set TASK_ON_RQ_MIGRATING -- albeit
> > > late. Can't you simply set that earlier (and back to QUEUED later) and
> > > test for task_on_rq_migrating() instead of blowing up the fastpath like
> > > you did?
> > > 
> > 
> > Yes it's doable.  I also find it's much simpler.
> 
> > From 98d615d46211a90482a0f9b7204265c54bba8520 Mon Sep 17 00:00:00 2001
> > From: Joonwoo Park <joonwoop@codeaurora.org>
> > Date: Mon, 26 Oct 2015 16:37:47 -0700
> > Subject: [PATCH v2] sched: fix incorrect wait time and wait count statistics
> > 
> > At present scheduler resets task's wait start timestamp when the task
> > migrates to another rq.  This misleads scheduler itself into reporting
> > less wait time than actual by omitting time spent for waiting prior to
> > migration and also more wait count than actual by counting migration as
> > wait end event which can be seen by trace or /proc/<pid>/sched with
> > CONFIG_SCHEDSTATS=y.
> > 
> > Carry forward migrating task's wait time prior to migration and
> > don't count migration as a wait end event to fix such statistics error.
> > 
> > In order to determine whether task is migrating mark task->on_rq with
> > TASK_ON_RQ_MIGRATING while dequeuing and enqueuing due to migration.
> > 
> > To: Ingo Molnar <mingo@kernel.org>
> > To: Peter Zijlstra <peterz@infradead.org>
> > Cc: linux-kernel@vger.kernel.org
> > Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
> > ---
> 
> So now that you rely on TASK_ON_RQ_MIGRATING; I think you missed one
> place that can migrate sched_fair tasks and doesn't set it.
> 
> Olav recently did a patch adding TASK_ON_RQ_MIGRATING to _every_
> migration path, but that is (still) somewhat overkill. With your changes
> we need it for sched_fair though.
> 
> So I think you need to change __migrate_swap_task(), which is used by
> the NUMA scheduling to swap two running tasks.

Oh yes... __migrate_swap_task() can migrate fair class task so I should mark as TASK_ON_RQ_MIGRATING.
I will fix this in subsequent patch.

> 
> Also, it might be prudent to extend the CONFIG_SCHED_DEBUG ifdef in
> set_task_cpu() to test for this new requirement:
> 
> 	WARN_ON_ONCE(p->state == TASK_RUNNING &&
> 	             p->class == &fair_sched_class &&
> 		     p->on_rq != TASK_ON_RQ_MIGRATING);
> 

I conceived to argue that if we had Olav's change (with revised marking order) dummy like myself don't need to worry
about stale on_rq state and probably didn't make mistake like I did with __migrate_swap_task(). 
But I don't think I can argue for that reason anymore as my patch will set TASK_ON_RQ_MIGRATING for all migration paths for
the fair class tasks at least and moreover above macro you suggested would fortify the new requirement.

I will add that macro. 

I recall Olav had some other reason for his patch though.

> >  kernel/sched/core.c |  4 ++--
> >  kernel/sched/fair.c | 17 ++++++++++++++---
> >  2 files changed, 16 insertions(+), 5 deletions(-)
> > 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index bcd214e..d9e4ad5 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1069,8 +1069,8 @@ static struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int new
> >  {
> >  	lockdep_assert_held(&rq->lock);
> >  
> > -	dequeue_task(rq, p, 0);
> >  	p->on_rq = TASK_ON_RQ_MIGRATING;
> > +	dequeue_task(rq, p, 0);
> >  	set_task_cpu(p, new_cpu);
> >  	raw_spin_unlock(&rq->lock);
> >  
> > @@ -1078,8 +1078,8 @@ static struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int new
> >  
> >  	raw_spin_lock(&rq->lock);
> >  	BUG_ON(task_cpu(p) != new_cpu);
> > -	p->on_rq = TASK_ON_RQ_QUEUED;
> >  	enqueue_task(rq, p, 0);
> > +	p->on_rq = TASK_ON_RQ_QUEUED;
> >  	check_preempt_curr(rq, p, 0);
> >  
> >  	return rq;
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 9a5e60f..7609576 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -740,7 +740,11 @@ static void update_curr_fair(struct rq *rq)
> >  static inline void
> >  update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >  {
> > -	schedstat_set(se->statistics.wait_start, rq_clock(rq_of(cfs_rq)));
> > +	schedstat_set(se->statistics.wait_start,
> > +		task_on_rq_migrating(task_of(se)) &&
> > +		likely(rq_clock(rq_of(cfs_rq)) > se->statistics.wait_start) ?
> > +		rq_clock(rq_of(cfs_rq)) - se->statistics.wait_start :
> > +		rq_clock(rq_of(cfs_rq)));
> 
> So I get that you want to avoid emitting code for !SCHEDSTATS; but that
> is rather unreadable. Either use the GCC stmt-expr or wrap the lot in
> #ifdef.
> 

Will do.

> >  }
> >  
> >  /*
> > @@ -759,6 +763,13 @@ static void update_stats_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >  static void
> >  update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >  {
> > +	if (task_on_rq_migrating(task_of(se))) {
> 
> Maybe add a comment that we adjust the wait_start time-base and will
> rebase on the new rq_clock in update_stats_wait_start().
> 

Will do.

Thanks,
Joonwoo

> > +		schedstat_set(se->statistics.wait_start,
> > +			      rq_clock(rq_of(cfs_rq)) -
> > +			      se->statistics.wait_start);
> > +		return;
> > +	}
> > +
> >  	schedstat_set(se->statistics.wait_max, max(se->statistics.wait_max,
> >  			rq_clock(rq_of(cfs_rq)) - se->statistics.wait_start));
> >  	schedstat_set(se->statistics.wait_count, se->statistics.wait_count + 1);
> > @@ -5656,8 +5667,8 @@ static void detach_task(struct task_struct *p, struct lb_env *env)
> >  {
> >  	lockdep_assert_held(&env->src_rq->lock);
> >  
> > -	deactivate_task(env->src_rq, p, 0);
> >  	p->on_rq = TASK_ON_RQ_MIGRATING;
> > +	deactivate_task(env->src_rq, p, 0);
> >  	set_task_cpu(p, env->dst_cpu);
> >  }
> >  
> > @@ -5790,8 +5801,8 @@ static void attach_task(struct rq *rq, struct task_struct *p)
> >  	lockdep_assert_held(&rq->lock);
> >  
> >  	BUG_ON(task_rq(p) != rq);
> > -	p->on_rq = TASK_ON_RQ_QUEUED;
> >  	activate_task(rq, p, 0);
> > +	p->on_rq = TASK_ON_RQ_QUEUED;
> >  	check_preempt_curr(rq, p, 0);
> >  }

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

  reply	other threads:[~2015-10-28  2:41 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-25  5:23 [PATCH] sched: fix incorrect wait time and wait count statistics Joonwoo Park
2015-10-25 10:08 ` Peter Zijlstra
2015-10-25 10:26 ` Peter Zijlstra
2015-10-27  1:44   ` Joonwoo Park
2015-10-27 12:57     ` Peter Zijlstra
2015-10-28  2:40       ` Joonwoo Park [this message]
2015-10-28  4:46         ` [PATCH v4] " Joonwoo Park
2015-11-06 13:57           ` Peter Zijlstra
2015-11-07  2:41             ` Joonwoo Park
2015-11-09 10:32               ` Peter Zijlstra
2015-11-13  3:38                 ` [PATCH v5] " Joonwoo Park
2015-11-23 16:20                   ` [tip:sched/core] sched/core: Fix " tip-bot for Joonwoo Park
2015-10-27 19:17 ` [PATCH v3] sched: fix " Joonwoo Park

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151028024054.GA8839@codeaurora.org \
    --to=joonwoop@codeaurora.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=ohaugan@codeaurora.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.