All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yong Zhang <yong.zhang0@gmail.com>
To: Venki Pallipadi <venki@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Suresh Siddha <suresh.b.siddha@intel.com>,
	Aaron Durbin <adurbin@google.com>, Paul Turner <pjt@google.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC] Extend mwait idle to optimize away IPIs when possible
Date: Thu, 9 Feb 2012 10:18:55 +0800	[thread overview]
Message-ID: <20120209021855.GA26152@zhy> (raw)
In-Reply-To: <CABeCy1bAB6PSAUabk8zO8oh_sLVo3sOPVrr_kSovLdmOOtkfTg@mail.gmail.com>

On Wed, Feb 08, 2012 at 03:28:45PM -0800, Venki Pallipadi wrote:
> On Tue, Feb 7, 2012 at 10:51 PM, Yong Zhang <yong.zhang0@gmail.com> wrote:
> > On Mon, Feb 06, 2012 at 12:42:13PM -0800, Venkatesh Pallipadi wrote:
> >> smp_call_function_single and ttwu_queue_remote sends unconditional IPI
> >> to target CPU. However, if the target CPU is in mwait based idle, we can
> >> do IPI-less wakeups using the magical powers of monitor-mwait.
> >> Doing this has certain advantages:
> >
> > Actually I'm trying to do the similar thing on MIPS.
> >
> > The difference is that I want task_is_polling() to do something. The basic
> > idea is:
> >
> >> + ? ? ? ? ? ? ? ? ? ? if (ipi_pending()) {
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? clear_ipi_pending();
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? local_bh_disable();
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? local_irq_disable();
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? generic_smp_call_function_single_interrupt();
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? scheduler_wakeup_self_check();
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? local_irq_enable();
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? local_bh_enable();
> >
> > I let cpu_idle() check if there is anything to do as your above code.
> >
> > And task_is_polling() handle the others with below patch:
> > ---
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 5255c9d..09f633d 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -527,15 +527,16 @@ void resched_task(struct task_struct *p)
> > ? ? ? ? ? ? ? ?smp_send_reschedule(cpu);
> > ?}
> >
> > -void resched_cpu(int cpu)
> > +int resched_cpu(int cpu)
> > ?{
> > ? ? ? ?struct rq *rq = cpu_rq(cpu);
> > ? ? ? ?unsigned long flags;
> >
> > ? ? ? ?if (!raw_spin_trylock_irqsave(&rq->lock, flags))
> > - ? ? ? ? ? ? ? return;
> > + ? ? ? ? ? ? ? return 0;
> > ? ? ? ?resched_task(cpu_curr(cpu));
> > ? ? ? ?raw_spin_unlock_irqrestore(&rq->lock, flags);
> > + ? ? ? return 1;
> > ?}
> 

I assume we are talking about 'return from idle' but seems I don't
make it clear.

> Two points -
> rq->lock: I tried something similar first. One hurdle with checking
> task_is_polling() is that you need rq->lock to check it. And adding
> lock+unlock (without wait) in wakeup path ended up being no net gain
> compared to IPI. And when we actually end up spinning on that lock,
> thats going to add overhead in the common path. That is the reason I
> switched to atomic compare exchange and moving any wait onto the
> target CPU coming out of idle.

I see. But actually we will not spinning on that lock because we
use 'trylock' in resched_cpu(). And you are right there is indeed a
little overhead (resched_task()) if we hold the lock but it can be
tolerated IMHO.

BTW, mind showing you test case thus we can collect some common data?

> 
> resched_task: ttwu_queue_remote() does not imply that the remote CPU
> will do a resched. Today there is a IPI and IPI handler calls onto
> check_preempt_wakeup() and if the current task has higher precedence
> than the waking up task, then there will be just an activation of new
> task and no resched. Using resched_task above breaks
> check_preempt_wakeup() and always calls a resched on remote CPU after
> the IPI, which would be change in behavior.

Yeah, if the remote cpu is not idle, mine will change the behavior; but
if the remote cpu is idle, it will always rescheduled, right?

So maybe we could introduce resched_idle_cpu() to make things more clear:

int resched_idle_cpu(int cpu) 
{
        struct rq *rq = cpu_rq(cpu);
        unsigned long flags;
        int ret = 0; 

        if (!raw_spin_trylock_irqsave(&rq->lock, flags))
                goto out; 
        if (!idle_cpu(cpu))
                goto out_unlock;
        resched_task(cpu_curr(cpu));
                ret = 1; 
out_unlock:
        raw_spin_unlock_irqrestore(&rq->lock, flags);
out:
        return ret; 
}

Thanks,
Yong

> 
> Thanks,
> Venki
> 
> >
> > ?#ifdef CONFIG_NO_HZ
> > @@ -1484,7 +1485,8 @@ void scheduler_ipi(void)
> >
> > ?static void ttwu_queue_remote(struct task_struct *p, int cpu)
> > ?{
> > - ? ? ? if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list))
> > + ? ? ? if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list) &&
> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? !resched_cpu(cpu))
> > ? ? ? ? ? ? ? ?smp_send_reschedule(cpu);
> > ?}
> >
> > Thought?
> >
> > Thanks,
> > Yong
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
Only stand for myself

  reply	other threads:[~2012-02-09  2:19 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-06 20:42 [RFC] Extend mwait idle to optimize away IPIs when possible Venkatesh Pallipadi
2012-02-06 21:02 ` Peter Zijlstra
2012-02-06 21:26   ` Venki Pallipadi
2012-02-07  0:26 ` David Daney
2012-02-07  1:24   ` H. Peter Anvin
2012-02-07  1:34     ` David Daney
2012-02-07  1:45       ` H. Peter Anvin
2012-02-07  2:03         ` Venki Pallipadi
2012-02-07  2:24 ` Suresh Siddha
2012-02-07 21:39   ` Venki Pallipadi
2012-02-08  6:51 ` Yong Zhang
2012-02-08 23:28   ` Venki Pallipadi
2012-02-09  2:18     ` Yong Zhang [this message]
2012-02-10  2:17       ` Venki Pallipadi
2012-02-13  5:27         ` Yong Zhang
2012-02-10 19:19 ` Peter Zijlstra
2012-02-11  2:11   ` Venki Pallipadi
2012-02-11  3:09     ` Peter Zijlstra
2012-02-13  5:34     ` Yong Zhang
2012-02-14 13:52       ` Peter Zijlstra
2012-02-15  1:39         ` Yong Zhang
2012-02-15  2:32         ` Venki Pallipadi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120209021855.GA26152@zhy \
    --to=yong.zhang0@gmail.com \
    --cc=adurbin@google.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=suresh.b.siddha@intel.com \
    --cc=tglx@linutronix.de \
    --cc=venki@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.