public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Yong Zhang <yong.zhang0@gmail.com>
To: Venki Pallipadi <venki@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Suresh Siddha <suresh.b.siddha@intel.com>,
	Aaron Durbin <adurbin@google.com>, Paul Turner <pjt@google.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC] Extend mwait idle to optimize away IPIs when possible
Date: Thu, 9 Feb 2012 10:18:55 +0800	[thread overview]
Message-ID: <20120209021855.GA26152@zhy> (raw)
In-Reply-To: <CABeCy1bAB6PSAUabk8zO8oh_sLVo3sOPVrr_kSovLdmOOtkfTg@mail.gmail.com>

On Wed, Feb 08, 2012 at 03:28:45PM -0800, Venki Pallipadi wrote:
> On Tue, Feb 7, 2012 at 10:51 PM, Yong Zhang <yong.zhang0@gmail.com> wrote:
> > On Mon, Feb 06, 2012 at 12:42:13PM -0800, Venkatesh Pallipadi wrote:
> >> smp_call_function_single and ttwu_queue_remote sends unconditional IPI
> >> to target CPU. However, if the target CPU is in mwait based idle, we can
> >> do IPI-less wakeups using the magical powers of monitor-mwait.
> >> Doing this has certain advantages:
> >
> > Actually I'm trying to do the similar thing on MIPS.
> >
> > The difference is that I want task_is_polling() to do something. The basic
> > idea is:
> >
> >> + ? ? ? ? ? ? ? ? ? ? if (ipi_pending()) {
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? clear_ipi_pending();
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? local_bh_disable();
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? local_irq_disable();
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? generic_smp_call_function_single_interrupt();
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? scheduler_wakeup_self_check();
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? local_irq_enable();
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? local_bh_enable();
> >
> > I let cpu_idle() check if there is anything to do as your above code.
> >
> > And task_is_polling() handle the others with below patch:
> > ---
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 5255c9d..09f633d 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -527,15 +527,16 @@ void resched_task(struct task_struct *p)
> > ? ? ? ? ? ? ? ?smp_send_reschedule(cpu);
> > ?}
> >
> > -void resched_cpu(int cpu)
> > +int resched_cpu(int cpu)
> > ?{
> > ? ? ? ?struct rq *rq = cpu_rq(cpu);
> > ? ? ? ?unsigned long flags;
> >
> > ? ? ? ?if (!raw_spin_trylock_irqsave(&rq->lock, flags))
> > - ? ? ? ? ? ? ? return;
> > + ? ? ? ? ? ? ? return 0;
> > ? ? ? ?resched_task(cpu_curr(cpu));
> > ? ? ? ?raw_spin_unlock_irqrestore(&rq->lock, flags);
> > + ? ? ? return 1;
> > ?}
> 

I assume we are talking about 'return from idle' but seems I don't
make it clear.

> Two points -
> rq->lock: I tried something similar first. One hurdle with checking
> task_is_polling() is that you need rq->lock to check it. And adding
> lock+unlock (without wait) in wakeup path ended up being no net gain
> compared to IPI. And when we actually end up spinning on that lock,
> thats going to add overhead in the common path. That is the reason I
> switched to atomic compare exchange and moving any wait onto the
> target CPU coming out of idle.

I see. But actually we will not spinning on that lock because we
use 'trylock' in resched_cpu(). And you are right there is indeed a
little overhead (resched_task()) if we hold the lock but it can be
tolerated IMHO.

BTW, mind showing you test case thus we can collect some common data?

> 
> resched_task: ttwu_queue_remote() does not imply that the remote CPU
> will do a resched. Today there is a IPI and IPI handler calls onto
> check_preempt_wakeup() and if the current task has higher precedence
> than the waking up task, then there will be just an activation of new
> task and no resched. Using resched_task above breaks
> check_preempt_wakeup() and always calls a resched on remote CPU after
> the IPI, which would be change in behavior.

Yeah, if the remote cpu is not idle, mine will change the behavior; but
if the remote cpu is idle, it will always rescheduled, right?

So maybe we could introduce resched_idle_cpu() to make things more clear:

int resched_idle_cpu(int cpu) 
{
        struct rq *rq = cpu_rq(cpu);
        unsigned long flags;
        int ret = 0; 

        if (!raw_spin_trylock_irqsave(&rq->lock, flags))
                goto out; 
        if (!idle_cpu(cpu))
                goto out_unlock;
        resched_task(cpu_curr(cpu));
                ret = 1; 
out_unlock:
        raw_spin_unlock_irqrestore(&rq->lock, flags);
out:
        return ret; 
}

Thanks,
Yong

> 
> Thanks,
> Venki
> 
> >
> > ?#ifdef CONFIG_NO_HZ
> > @@ -1484,7 +1485,8 @@ void scheduler_ipi(void)
> >
> > ?static void ttwu_queue_remote(struct task_struct *p, int cpu)
> > ?{
> > - ? ? ? if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list))
> > + ? ? ? if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list) &&
> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? !resched_cpu(cpu))
> > ? ? ? ? ? ? ? ?smp_send_reschedule(cpu);
> > ?}
> >
> > Thought?
> >
> > Thanks,
> > Yong
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
Only stand for myself

  reply	other threads:[~2012-02-09  2:19 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-06 20:42 [RFC] Extend mwait idle to optimize away IPIs when possible Venkatesh Pallipadi
2012-02-06 21:02 ` Peter Zijlstra
2012-02-06 21:26   ` Venki Pallipadi
2012-02-07  0:26 ` David Daney
2012-02-07  1:24   ` H. Peter Anvin
2012-02-07  1:34     ` David Daney
2012-02-07  1:45       ` H. Peter Anvin
2012-02-07  2:03         ` Venki Pallipadi
2012-02-07  2:24 ` Suresh Siddha
2012-02-07 21:39   ` Venki Pallipadi
2012-02-08  6:51 ` Yong Zhang
2012-02-08 23:28   ` Venki Pallipadi
2012-02-09  2:18     ` Yong Zhang [this message]
2012-02-10  2:17       ` Venki Pallipadi
2012-02-13  5:27         ` Yong Zhang
2012-02-10 19:19 ` Peter Zijlstra
2012-02-11  2:11   ` Venki Pallipadi
2012-02-11  3:09     ` Peter Zijlstra
2012-02-13  5:34     ` Yong Zhang
2012-02-14 13:52       ` Peter Zijlstra
2012-02-15  1:39         ` Yong Zhang
2012-02-15  2:32         ` Venki Pallipadi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120209021855.GA26152@zhy \
    --to=yong.zhang0@gmail.com \
    --cc=adurbin@google.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=suresh.b.siddha@intel.com \
    --cc=tglx@linutronix.de \
    --cc=venki@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox