linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org,
	Nitesh Narayan Lal <nitesh@redhat.com>,
	Peter Xu <peterx@redhat.com>
Subject: Re: [patch 1/2] nohz: only wakeup a single target cpu when kicking a task
Date: Tue, 20 Oct 2020 15:52:45 -0300	[thread overview]
Message-ID: <20201020185245.GA3577@fuller.cnet> (raw)
In-Reply-To: <20201014234053.GA86158@lothringen>

On Thu, Oct 15, 2020 at 01:40:53AM +0200, Frederic Weisbecker wrote:
> On Wed, Oct 14, 2020 at 10:33:21AM +0200, Peter Zijlstra wrote:
> > On Tue, Oct 13, 2020 at 02:13:28PM -0300, Marcelo Tosatti wrote:
> > 
> > > > Yes but if the task isn't running, run_posix_cpu_timers() doesn't have
> > > > anything to elapse. So indeed we can spare the IPI if the task is not
> > > > running. Provided ordering makes sure that the task sees the new dependency
> > > > when it schedules in of course.
> > > 
> > > True.
> > > 
> > >  * p->on_cpu <- { 0, 1 }:
> > >  *
> > >  *   is set by prepare_task() and cleared by finish_task() such that it will be
> > >  *   set before p is scheduled-in and cleared after p is scheduled-out, both
> > >  *   under rq->lock. Non-zero indicates the task is running on its CPU.
> > > 
> > > 
> > > CPU-0 (tick_set_dep)            CPU-1 (task switch)
> > > 
> > > STORE p->tick_dep_mask
> > > smp_mb() (atomic_fetch_or())
> > > LOAD p->on_cpu
> > > 
> > > 
> > >                                 context_switch(prev, next)
> > >                                 STORE next->on_cpu = 1
> > >                                 ...                             [*]
> > > 
> > >                                 LOAD current->tick_dep_mask
> > > 
> > 
> > That load is in tick_nohz_task_switch() right? (which BTW is placed
> > completely wrong) You could easily do something like the below I
> > suppose.
> > 
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index 81632cd5e3b7..2a5fafe66bb0 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -410,6 +410,14 @@ void __tick_nohz_task_switch(void)
> >  	ts = this_cpu_ptr(&tick_cpu_sched);
> >  
> >  	if (ts->tick_stopped) {
> > +		/*
> > +		 * tick_set_dep()		(this)
> > +		 *
> > +		 * STORE p->tick_dep_mask	STORE p->on_cpu
> > +		 * smp_mb()			smp_mb()
> > +		 * LOAD p->on_cpu		LOAD p->tick_dep_mask
> > +		 */
> > +		smp_mb();
> >  		if (atomic_read(&current->tick_dep_mask) ||
> >  		    atomic_read(&current->signal->tick_dep_mask))
> >  			tick_nohz_full_kick();
> 
> It would then need to be unconditional (whatever value of ts->tick_stopped).
> Assuming the tick isn't stopped, we may well have an interrupt firing right
> after schedule() which doesn't see the new value of tick_dep_map.
> 
> Alternatively, we could rely on p->on_rq which is set to TASK_ON_RQ_QUEUED
> at wake up time, prior to the schedule() full barrier. Of course that doesn't
> mean that the task is actually the one running on the CPU but it's a good sign,
> considering that we are running in nohz_full mode and it's usually optimized
> for single task mode.

Unfortunately that would require exporting p->on_rq which is internal to
the scheduler, locklessly.

(can surely do that if you prefer!)

> 
> Also setting a remote task's tick dependency is only used by posix cpu timer
> in case the user has the bad taste to enqueue on a task running in nohz_full
> mode. It shouldn't deserve an unconditional full barrier in the schedule path.
> 
> If the target is current, as is used by RCU, I guess we can keep a special
> treatment.

To answer PeterZ's original question:

"So we need to kick the CPU unconditionally, or only when the task is
actually running? AFAICT we only care about current->tick_dep_mask."

If there is a task sharing signals, executing on a remote CPU, yes that remote CPU 
should be awakened.

Could skip the IPI if the remote process is not running, however for 
the purposes of low latency isolated processes, this optimization is
not necessary.

So the last posted patchset is good enough for isolated low latency processes.

Do you guys want me to do something or can you take the series as it is?

> > re tick_nohz_task_switch() being placed wrong, it should probably be
> > placed before finish_lock_switch(). Something like so.
> > 
> > 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index cf044580683c..5c92c959824f 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -4084,6 +4084,7 @@ static struct rq *finish_task_switch(struct task_struct *prev)
> >  	vtime_task_switch(prev);
> >  	perf_event_task_sched_in(prev, current);
> >  	finish_task(prev);
> > +	tick_nohz_task_switch();
> >  	finish_lock_switch(rq);
> >  	finish_arch_post_lock_switch();
> >  	kcov_finish_switch(current);
> > @@ -4121,7 +4122,6 @@ static struct rq *finish_task_switch(struct task_struct *prev)
> >  		put_task_struct_rcu_user(prev);
> >  	}
> >  
> > -	tick_nohz_task_switch();
> 
> IIRC, we wanted to keep it outside rq lock because it shouldn't it...


  parent reply	other threads:[~2020-10-20 18:56 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-07 18:01 [patch 0/2] nohz_full: only wakeup target CPUs when notifying new tick dependency (v2) Marcelo Tosatti
2020-10-07 18:01 ` [patch 1/2] nohz: only wakeup a single target cpu when kicking a task Marcelo Tosatti
2020-10-08 12:22   ` Peter Zijlstra
2020-10-08 17:54     ` Marcelo Tosatti
2020-10-08 19:54       ` Frederic Weisbecker
2020-10-13 17:13         ` Marcelo Tosatti
2020-10-14  8:33           ` Peter Zijlstra
2020-10-14 23:40             ` Frederic Weisbecker
2020-10-15 10:12               ` Peter Zijlstra
2020-10-26 14:42                 ` Frederic Weisbecker
2020-10-20 18:52               ` Marcelo Tosatti [this message]
2020-10-22 12:53                 ` Frederic Weisbecker
2020-10-08 14:59   ` Peter Xu
2020-10-08 15:28     ` Peter Zijlstra
2020-10-08 19:16       ` Peter Xu
2020-10-08 19:48       ` Frederic Weisbecker
2020-10-08 17:43     ` Marcelo Tosatti
2020-10-07 18:01 ` [patch 2/2] nohz: change signal tick dependency to wakeup CPUs of member tasks Marcelo Tosatti
2020-10-08 12:35   ` Peter Zijlstra
2020-10-08 18:04     ` Marcelo Tosatti
  -- strict thread matches above, loose matches on Subject: below --
2020-10-08 19:11 [patch 0/2] nohz_full: only wakeup target CPUs when notifying new tick dependency (v3) Marcelo Tosatti
2020-10-08 19:11 ` [patch 1/2] nohz: only wakeup a single target cpu when kicking a task Marcelo Tosatti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201020185245.GA3577@fuller.cnet \
    --to=mtosatti@redhat.com \
    --cc=frederic@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nitesh@redhat.com \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).