From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Rakib Mullick <rakib.mullick@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	mingo@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Add rq->nr_uninterruptible count to dest cpu's rq while CPU goes down.
Date: Tue, 28 Aug 2012 06:42:06 -0700
Message-ID: <20120828134206.GH2961@linux.vnet.ibm.com>
In-Reply-To: <CADZ9YHi-DGd71jxQpYyRfVUqdr-ks-znSeCtcBAccL7wRd3r5g@mail.gmail.com>

On Tue, Aug 28, 2012 at 12:57:09PM +0600, Rakib Mullick wrote:
> Hello Paul,
>
> On 8/28/12, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > On Mon, Aug 20, 2012 at 09:26:57AM -0700, Paul E. McKenney wrote:
> >> On Mon, Aug 20, 2012 at 11:26:57AM +0200, Peter Zijlstra wrote:
> >
> > How about the following updated patch?
> >
> Actually, I was waiting for Peter's update.

I was too, but chatted with Peter.

> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > sched: Fix load avg vs cpu-hotplug
> >
> > Rakib and Paul reported two different issues related to the same few
> > lines of code.
> >
> > Rakib's issue is that the nr_uninterruptible migration code is wrong in
> > that he sees artifacts due to this (Rakib please do expand in more
> > detail).
> >
> > Paul's issue is that this code as it stands relies on us using
> > stop_machine() for unplug; we would all like to remove this assumption
> > so that eventually we can remove this stop_machine() usage altogether.
> >
> > The only reason we'd have to migrate nr_uninterruptible is so that we
> > could use for_each_online_cpu() loops in favour of
> > for_each_possible_cpu() loops; however, since nr_uninterruptible() is the
> > only such loop and it's using possible, let's not bother at all.
> >
> > The problem Rakib sees is (probably) caused by the fact that by
> > migrating nr_uninterruptible we screw rq->calc_load_active for both rqs
> > involved.
> >
> > So don't bother with fancy migration schemes (meaning we now have to
> > keep using for_each_possible_cpu()) and instead fold any nr_active delta
> > after we migrate all tasks away to make sure we don't have any skewed
> > nr_active accounting.
> >
> > [ paulmck: Move call to calc_load_migration to CPU_DEAD to avoid
> > miscounting noted by Rakib. ]
> >
> > Reported-by: Rakib Mullick <rakib.mullick@gmail.com>
> > Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index e841dfc..a8807f2 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -5309,27 +5309,17 @@ void idle_task_exit(void)
> >  }
> >
> >  /*
> > - * While a dead CPU has no uninterruptible tasks queued at this point,
> > - * it might still have a nonzero ->nr_uninterruptible counter, because
> > - * for performance reasons the counter is not stricly tracking tasks to
> > - * their home CPUs. So we just add the counter to another CPU's counter,
> > - * to keep the global sum constant after CPU-down:
> > - */
> > -static void migrate_nr_uninterruptible(struct rq *rq_src)
> > -{
> > -	struct rq *rq_dest = cpu_rq(cpumask_any(cpu_active_mask));
> > -
> > -	rq_dest->nr_uninterruptible += rq_src->nr_uninterruptible;
> > -	rq_src->nr_uninterruptible = 0;
> > -}
> > -
> > -/*
> > - * remove the tasks which were accounted by rq from calc_load_tasks.
> > + * Since this CPU is going 'away' for a while, fold any nr_active delta
> > + * we might have. Assumes we're called after migrate_tasks() so that the
> > + * nr_active count is stable.
> > + *
> > + * Also see the comment "Global load-average calculations".
> >   */
> > -static void calc_global_load_remove(struct rq *rq)
> > +static void calc_load_migrate(struct rq *rq)
> >  {
> > -	atomic_long_sub(rq->calc_load_active, &calc_load_tasks);
> > -	rq->calc_load_active = 0;
> > +	long delta = calc_load_fold_active(rq);
> > +	if (delta)
> > +		atomic_long_add(delta, &calc_load_tasks);
> >  }
> >
> >  /*
> > @@ -5622,9 +5612,18 @@ migration_call(struct notifier_block *nfb, unsigned long action, void *hcpu)
> >  		migrate_tasks(cpu);
> >  		BUG_ON(rq->nr_running != 1); /* the migration thread */
> >  		raw_spin_unlock_irqrestore(&rq->lock, flags);
> > +		break;
> >
> > -		migrate_nr_uninterruptible(rq);
> > -		calc_global_load_remove(rq);
> > +	case CPU_DEAD:
> > +		{
> > +			struct rq *dest_rq;
> > +
> > +			local_irq_save(flags);
> > +			dest_rq = cpu_rq(smp_processor_id());
>
> Use of smp_processor_id() as the dest CPU isn't clear to me; this
> processor is about to go down, isn't it?

Nope.  The CPU_DEAD notifier runs after the outgoing CPU has been
fully offlined, so it must run on some other CPU.

> > +			raw_spin_lock(&dest_rq->lock);
> > +			calc_load_migrate(rq);
>
> Well, calc_load_migrate() has no impact because rq->nr_running == 1 at
> this point. This has already been pointed out.

Even after the outgoing CPU is fully gone?  I would hope that the value
would be zero.

Thanx, Paul
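
For readers following the thread, here is a minimal userspace sketch of the "fold any nr_active delta" idea from the changelog. It is a model only, not kernel code. The struct fields and the two function names mirror the patch, but the body of calc_load_fold_active() is an assumption about how the kernel helper behaves. The scenario in simulate_cpu_down() is hypothetical and its numbers are made up. It also assumes that nr_running on the outgoing runqueue has reached zero by the time of the fold, which is exactly the point still under discussion above.

```c
#include <assert.h>

/* Userspace model only: field names mirror the kernel's struct rq. */
struct rq {
	long nr_running;
	long nr_uninterruptible;
	long calc_load_active;	/* value last folded into calc_load_tasks */
};

static long calc_load_tasks;	/* an atomic_long_t in the kernel */

/* Return how far this rq's active count has drifted since the last fold. */
static long calc_load_fold_active(struct rq *rq)
{
	long nr_active = rq->nr_running + rq->nr_uninterruptible;
	long delta = nr_active - rq->calc_load_active;

	rq->calc_load_active = nr_active;
	return delta;
}

/* Same shape as the patch, with a plain add in place of atomic_long_add(). */
static void calc_load_migrate(struct rq *rq)
{
	long delta = calc_load_fold_active(rq);

	if (delta)
		calc_load_tasks += delta;
}

/* Hypothetical scenario: CPU 0 goes down, CPU 1 stays up. */
long simulate_cpu_down(void)
{
	/* CPU 0 after its tasks were migrated: stale calc_load_active. */
	struct rq dead = { .nr_running = 0, .nr_uninterruptible = -2,
			   .calc_load_active = 3 };
	struct rq live = { .nr_running = 4, .nr_uninterruptible = 2,
			   .calc_load_active = 3 };

	calc_load_tasks = dead.calc_load_active + live.calc_load_active;

	calc_load_migrate(&dead);	/* hotplug path */
	calc_load_migrate(&live);	/* stands in for the periodic fold */

	/* The global count now equals the sum over all possible CPUs. */
	assert(calc_load_tasks ==
	       dead.nr_running + dead.nr_uninterruptible +
	       live.nr_running + live.nr_uninterruptible);
	return calc_load_tasks;
}
```

Calling simulate_cpu_down() from a small main() exercises the check. In this model the dead runqueue keeps its possibly negative nr_uninterruptible, and only the difference from its last folded value is applied to the global count. This is why the sum must keep being taken over all possible CPUs rather than only the online ones.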