public inbox for kvm@vger.kernel.org
From: Chris Wright <chrisw@sous-sol.org>
To: Rik van Riel <riel@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Avi Kivity <avi@redhat.com>,
	Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Ingo Molnar <mingo@elte.hu>,
	Anthony Liguori <aliguori@linux.vnet.ibm.com>
Subject: Re: [RFC PATCH 2/3] sched: add yield_to function
Date: Thu, 2 Dec 2010 16:50:08 -0800
Message-ID: <20101203005008.GU10050@sequoia.sous-sol.org>
In-Reply-To: <20101202144423.3ad1908d@annuminas.surriel.com>

* Rik van Riel (riel@redhat.com) wrote:
> Add a yield_to function to the scheduler code, allowing us to
> give the remainder of our timeslice to another thread.
> 
> We may want to use this to provide a sys_yield_to system call
> one day.
> 
> Signed-off-by: Rik van Riel <riel@redhat.com>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index c5f926c..4f3cce9 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1985,6 +1985,7 @@ extern void set_user_nice(struct task_struct *p, long nice);
>  extern int task_prio(const struct task_struct *p);
>  extern int task_nice(const struct task_struct *p);
>  extern int can_nice(const struct task_struct *p, const int nice);
> +extern void requeue_task(struct rq *rq, struct task_struct *p);
>  extern int task_curr(const struct task_struct *p);
>  extern int idle_cpu(int cpu);
>  extern int sched_setscheduler(struct task_struct *, int, struct sched_param *);
> @@ -2058,6 +2059,14 @@ extern int wake_up_state(struct task_struct *tsk, unsigned int state);
>  extern int wake_up_process(struct task_struct *tsk);
>  extern void wake_up_new_task(struct task_struct *tsk,
>  				unsigned long clone_flags);
> +
> +#ifdef CONFIG_SCHED_HRTICK
> +extern u64 slice_remain(struct task_struct *);
> +extern void yield_to(struct task_struct *);
> +#else
> +static inline void yield_to(struct task_struct *p) yield()

Missing braces and a semicolon?  The stub body should be { yield(); }.
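
For illustration, a minimal standalone sketch of the fallback stub with a
proper function body. The struct task_struct and yield() below are userspace
stand-ins invented for this example (yield_calls is a hypothetical counter),
not the kernel definitions.

```c
/* Stand-in for the kernel's task_struct; illustration only. */
struct task_struct {
	int pid;
};

/* Stand-in for the kernel's yield(); counts calls so the effect is visible. */
static int yield_calls;

static void yield(void)
{
	yield_calls++;
}

/*
 * Fallback for the !CONFIG_SCHED_HRTICK case: ignore the target task
 * and do a plain yield(). Note the braces and the semicolon, which the
 * quoted one-line stub lacks.
 */
static inline void yield_to(struct task_struct *p)
{
	(void)p;	/* the target is unused in the fallback */
	yield();
}
```

With this shape, callers can use yield_to(p) unconditionally and get ordinary
yield() behaviour when directed yield is compiled out.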

> +#endif
> +
>  #ifdef CONFIG_SMP
>   extern void kick_process(struct task_struct *tsk);
>  #else
> diff --git a/kernel/sched.c b/kernel/sched.c
> index f8e5a25..ef088cd 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -1909,6 +1909,26 @@ static void dequeue_task(struct rq *rq, struct task_struct *p, int sleep)
>  	p->se.on_rq = 0;
>  }
>  
> +/**
> + * requeue_task - requeue a task which priority got changed by yield_to
> + * @rq: the tasks's runqueue
> + * @p: the task in question
> + * Must be called with the runqueue lock held. Will cause the CPU to
> + * reschedule if p is now at the head of the runqueue.
> + */
> +void requeue_task(struct rq *rq, struct task_struct *p)
> +{
> +	assert_spin_locked(&rq->lock);
> +
> +	if (!p->se.on_rq || task_running(rq, p) || task_has_rt_policy(p))
> +		return;

The caller, yield_to(), has already checked task_running(rq, p) ||
task_has_rt_policy(p) with the rq lock held.

> +
> +	dequeue_task(rq, p, 0);
> +	enqueue_task(rq, p, 0);

Seems like you could condense these to save at least one update_rq_clock()
call.  I don't know whether the info_queued/info_dequeued accounting needs
to be updated.

> +	resched_task(p);
> +}
> +
>  /*
>   * __normal_prio - return the priority that is based on the static prio
>   */
> @@ -6797,6 +6817,36 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
>  	return ret;
>  }
>  
> +#ifdef CONFIG_SCHED_HRTICK
> +/*
> + * Yield the CPU, giving the remainder of our time slice to task p.
> + * Typically used to hand CPU time to another thread inside the same
> + * process, eg. when p holds a resource other threads are waiting for.
> + * Giving priority to p may help get that resource released sooner.
> + */
> +void yield_to(struct task_struct *p)
> +{
> +	unsigned long flags;
> +	struct sched_entity *se = &p->se;
> +	struct rq *rq;
> +	struct cfs_rq *cfs_rq;
> +	u64 remain = slice_remain(current);
> +
> +	rq = task_rq_lock(p, &flags);
> +	if (task_running(rq, p) || task_has_rt_policy(p))
> +		goto out;
> +	cfs_rq = cfs_rq_of(se);
> +	se->vruntime -= remain;
> +	if (se->vruntime < cfs_rq->min_vruntime)
> +		se->vruntime = cfs_rq->min_vruntime;

Should these details all be in sched_fair.c?  This seems like the wrong
layer.  And should that condition go the other way?  If the new vruntime is
smaller than the minimum, does it become the new cfs_rq->min_vruntime?
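
To make the question concrete, here is a userspace sketch of the clamp as the
quoted patch has it: the target's vruntime is lowered by the donated time but
never below the queue's min_vruntime. The helper name donate_slice() and its
flat arguments are hypothetical. The signed-delta comparison is an assumption
borrowed from how sched_fair.c's max_vruntime()/min_vruntime() helpers compare
values; the quoted patch uses a plain unsigned "<" instead.

```c
#include <stdint.h>

typedef uint64_t u64;
typedef int64_t s64;

/*
 * Hypothetical helper: credit `remain` nanoseconds to a task whose
 * current vruntime is `vruntime`, without letting the result fall
 * below the runqueue's `min_vruntime`.
 */
static u64 donate_slice(u64 vruntime, u64 min_vruntime, u64 remain)
{
	u64 boosted = vruntime - remain;	/* may wrap around below zero */

	/*
	 * Compare via a signed delta so the clamp still triggers when the
	 * subtraction wrapped (relies on two's complement conversion).
	 */
	if ((s64)(boosted - min_vruntime) < 0)
		boosted = min_vruntime;

	return boosted;
}
```

For example, donate_slice(100, 50, 200) wraps on the subtraction; a plain
unsigned comparison would leave the task with a huge vruntime, while the
signed delta clamps it to 50.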

> +	requeue_task(rq, p);
> + out:
> +	task_rq_unlock(rq, &flags);
> +	yield();
> +}
> +EXPORT_SYMBOL(yield_to);
> +#endif
> +
>  /**
>   * sys_sched_yield - yield the current processor to other threads.
>   *
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index 5119b08..2a0a595 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -974,6 +974,25 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
>   */
>  
>  #ifdef CONFIG_SCHED_HRTICK
> +u64 slice_remain(struct task_struct *p)
> +{
> +	unsigned long flags;
> +	struct sched_entity *se = &p->se;
> +	struct cfs_rq *cfs_rq;
> +	struct rq *rq;
> +	u64 slice, ran;
> +	s64 delta;
> +
> +	rq = task_rq_lock(p, &flags);
> +	cfs_rq = cfs_rq_of(se);
> +	slice = sched_slice(cfs_rq, se);
> +	ran = se->sum_exec_runtime - se->prev_sum_exec_runtime;
> +	delta = slice - ran;
> +	task_rq_unlock(rq, &flags);
> +
> +	return max(delta, 0LL);

Can delta go negative?
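
A userspace sketch of the arithmetic in question, assuming slice and ran are
both u64 as in the quoted code. If the task has run past its slice (which
seems possible, since the check happens at tick granularity), slice - ran
wraps as unsigned and reads as negative once stored in an s64. The helper name
remaining_slice() is hypothetical.

```c
#include <stdint.h>

typedef uint64_t u64;
typedef int64_t s64;

/* Hypothetical stand-in for the delta computation in slice_remain(). */
static u64 remaining_slice(u64 slice, u64 ran)
{
	/* Unsigned subtraction wraps; the s64 view is negative if ran > slice. */
	s64 delta = (s64)(slice - ran);

	/* Clamp to zero so an overrun never donates "negative" time. */
	return delta > 0 ? (u64)delta : 0;
}
```

Under that assumption the max(delta, 0LL) clamp in the patch is doing real
work rather than being dead code.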
