Re: [PATCH v3 1/7] sched: introduce primitives to account for CFS bandwidth tracking

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Peter Zijlstra <peterz@infradead.org>
To: balbir@linux.vnet.ibm.com
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org,
	Dhaval Giani <dhaval.giani@gmail.com>,
	Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
	Srivatsa Vaddagiri <vatsa@in.ibm.com>,
	Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@elte.hu>, Pavel Emelyanov <xemul@openvz.org>,
	Herbert Poetzl <herbert@13thfloor.at>,
	Avi Kivity <avi@redhat.com>, Chris Friesen <cfriesen@nortel.com>,
	Paul Menage <menage@google.com>,
	Mike Waychison <mikew@google.com>, Paul Turner <pjt@google.com>,
	Nikhil Rao <ncrao@google.com>
Subject: Re: [PATCH v3 1/7] sched: introduce primitives to account for CFS bandwidth tracking
Date: Thu, 14 Oct 2010 09:52:17 +0200	[thread overview]
Message-ID: <1287042737.29097.153.camel@twins> (raw)
In-Reply-To: <20101013130018.GC3914@balbir.in.ibm.com>

On Wed, 2010-10-13 at 18:30 +0530, Balbir Singh wrote:

> > -static void start_rt_bandwidth(struct rt_bandwidth *rt_b)
> > +static void start_bandwidth_timer(struct hrtimer *period_timer, ktime_t period)
> >  {
> > -	ktime_t now;
> > +	unsigned long delta;
> > +	ktime_t soft, hard, now;
> > +
> > +	for (;;) {
> > +		if (hrtimer_active(period_timer))
> > +			break;
> > 
> > +		now = hrtimer_cb_get_time(period_timer);
> > +		hrtimer_forward(period_timer, now, period);
> > +
> > +		soft = hrtimer_get_softexpires(period_timer);
> > +		hard = hrtimer_get_expires(period_timer);
> > +		delta = ktime_to_ns(ktime_sub(hard, soft));
> > +		__hrtimer_start_range_ns(period_timer, soft, delta,
> > +					 HRTIMER_MODE_ABS_PINNED, 0);
> 
> This code can be replaced with
> 
> hrtimer_start_expires(period_timer, HRTIMER_MODE_ABS_PINNED) if we
> don't care about wakeup_softirq, is there a reason we prefer to keep
> wakeup as 0?

You cannot do wakeups while holding the rq->lock, can you? :-)

> > +	}
> > +}


> > +static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer)
> > +{
> > +	struct cfs_bandwidth *cfs_b =
> > +		container_of(timer, struct cfs_bandwidth, period_timer);
> > +	ktime_t now;
> > +	int overrun;
> > +	int idle = 0;
> > +
> > +	for (;;) {
> > +		now = hrtimer_cb_get_time(timer);
> > +		overrun = hrtimer_forward(timer, now, cfs_b->period);
> > +
> > +		if (!overrun)
> > +			break;
> 
> What is the significance of overrun? overrun is set when delta >
> interval. The logic seems to be that hrtimer is forwarded in steps of
> cfs_b->period till we reach the desired time.
> 

Overrun is the number of periods missed. The goal is to increment the
quota for each period, if the timer is late 3 periods, we still need to
increment it 3 times, that's what overrun does.

> > +
> > +		idle = do_sched_cfs_period_timer(cfs_b, overrun);
> > +	}
> > +
> > +	return idle ? HRTIMER_NORESTART : HRTIMER_RESTART;
> > +}

> > +static void start_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
> > +{
> > +	if (cfs_b->quota == RUNTIME_INF)
> > +		return;
> > +
> > +	if (hrtimer_active(&cfs_b->period_timer))
> > +		return;
> 
> Why the double check, start_bandwidth_timer also checks this. Is it to
> avoid doing the check under cfs_b->lock?

Basically..

> > +
> > +	raw_spin_lock(&cfs_b->lock);
> > +	start_bandwidth_timer(&cfs_b->period_timer, cfs_b->period);
> > +	raw_spin_unlock(&cfs_b->lock);
> > +}
> > +

> > +#ifdef CONFIG_CFS_BANDWIDTH
> > +static int tg_set_cfs_bandwidth(struct task_group *tg, u64 period, u64 quota)
> > +{
> > +	int i;
> > +	static DEFINE_MUTEX(mutex);
> > +
> > +	if (tg == &init_task_group)
> > +		return -EINVAL;
> > +
> > +	if (!period)
> > +		return -EINVAL;
> > +
> > +	/*
> > +	 * Ensure we have at least one tick of bandwidth every period.  This is
> > +	 * to prevent reaching a state of large arrears when throttled via
> > +	 * entity_tick() resulting in prolonged exit starvation.
> > +	 */
> > +	if (NS_TO_JIFFIES(quota) < 1)
> > +		return -EINVAL;
> 
> I hope we document this in the Documentation :)

/me went and looked up arrears in a dictionary and wonders why 'debt'
wasn't good enough.

> > +
> > +	mutex_lock(&mutex);
> > +	raw_spin_lock_irq(&tg->cfs_bandwidth.lock);
> > +	tg->cfs_bandwidth.period = ns_to_ktime(period);
> > +	tg->cfs_bandwidth.runtime = tg->cfs_bandwidth.quota = quota;
> > +	raw_spin_unlock_irq(&tg->cfs_bandwidth.lock);
> > +
> > +	for_each_possible_cpu(i) {
> 
> Why not for_each_online_cpu()?

Probably could be cured with a hotplug handler, but then you need to
track more state iirc.

> > +		struct cfs_rq *cfs_rq = tg->cfs_rq[i];
> > +		struct rq *rq = rq_of(cfs_rq);
> > +
> > +		raw_spin_lock_irq(&rq->lock);
> > +		init_cfs_rq_quota(cfs_rq);
> > +		raw_spin_unlock_irq(&rq->lock);
> > +	}
> > +	mutex_unlock(&mutex);
> > +
> > +	return 0;
> > +}

next prev parent reply	other threads:[~2010-10-14  7:52 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-10-12  7:49 [PATCH v3 0/7] CFS Bandwidth Control Bharata B Rao
2010-10-12  7:50 ` [PATCH v3 1/7] sched: introduce primitives to account for CFS bandwidth tracking Bharata B Rao
2010-10-13 13:00   ` Balbir Singh
2010-10-14  5:14     ` Bharata B Rao
2010-10-14  7:52     ` Peter Zijlstra [this message]
2010-10-14 12:38       ` Balbir Singh
2010-10-14 13:24         ` Peter Zijlstra
2010-12-06  9:02         ` Bharata B Rao
2010-10-12  7:51 ` [PATCH v3 2/7] sched: accumulate per-cfs_rq cpu usage Bharata B Rao
2010-10-13 13:30   ` Balbir Singh
2010-10-13 13:46     ` Nikhil Rao
2010-10-13 13:59       ` Balbir Singh
2010-10-13 14:41         ` Nikhil Rao
2010-10-14  5:39           ` Balbir Singh
2010-10-14  8:57   ` Peter Zijlstra
2010-10-14  9:07     ` Paul Turner
2010-10-14  9:13       ` Peter Zijlstra
2010-10-14  9:01   ` Peter Zijlstra
2010-10-14  9:14     ` Paul Turner
2010-10-14  9:27       ` Peter Zijlstra
2010-10-14  9:53         ` Paul Turner
2010-10-14  9:19   ` Peter Zijlstra
2010-10-14  9:27     ` Paul Turner
2010-10-14  9:40       ` Bharata B Rao
2010-10-12  7:52 ` [PATCH v3 3/7] sched: throttle cfs_rq entities which exceed their local quota Bharata B Rao
2010-10-13  6:34   ` KAMEZAWA Hiroyuki
2010-10-13  6:44     ` Paul Turner
2010-10-13  6:47       ` Bharata B Rao
2010-10-13  6:52         ` Paul Turner
2010-10-13  7:00       ` KAMEZAWA Hiroyuki
2010-10-13  7:13         ` Paul Turner
2010-10-14  9:12     ` Peter Zijlstra
2010-10-14  9:50       ` KAMEZAWA Hiroyuki
2010-10-14  9:59         ` Peter Zijlstra
2010-10-14 10:08           ` KAMEZAWA Hiroyuki
2010-10-14 10:25             ` Paul Turner
2010-10-14 10:41               ` Peter Zijlstra
2010-10-14 23:30                 ` KAMEZAWA Hiroyuki
2010-10-14 10:37             ` Peter Zijlstra
2010-10-14  9:58       ` Paul Turner
2010-10-12  7:52 ` [PATCH v3 4/7] sched: unthrottle cfs_rq(s) who ran out of quota at period refresh Bharata B Rao
2010-10-15  4:45   ` Balbir Singh
2010-12-07 13:13     ` Bharata B Rao
2010-10-12  7:53 ` [PATCH v3 5/7] sched: add exports tracking cfs bandwidth control statistics Bharata B Rao
2010-10-12  7:54 ` [PATCH v3 6/7] sched: hierarchical task accounting for FAIR_GROUP_SCHED Bharata B Rao
2010-10-12  7:55 ` [PATCH v3 7/7] sched: Return/expire slack quota using generation counters Bharata B Rao
2010-10-13  5:14 ` [PATCH v3 0/7] CFS Bandwidth Control KAMEZAWA Hiroyuki
2010-10-13  5:44 ` Herbert Poetzl
2010-10-13  6:26   ` Paul Turner
2010-11-17  8:32 ` Lai Jiangshan
2010-11-19  3:24   ` Bharata B Rao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1287042737.29097.153.camel@twins \
    --to=peterz@infradead.org \
    --cc=avi@redhat.com \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=bharata@linux.vnet.ibm.com \
    --cc=cfriesen@nortel.com \
    --cc=dhaval.giani@gmail.com \
    --cc=herbert@13thfloor.at \
    --cc=kamalesh@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=menage@google.com \
    --cc=mikew@google.com \
    --cc=mingo@elte.hu \
    --cc=ncrao@google.com \
    --cc=pjt@google.com \
    --cc=svaidy@linux.vnet.ibm.com \
    --cc=vatsa@in.ibm.com \
    --cc=xemul@openvz.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.