[PATCH] cpufreq: ondemand+conservative=condemand

public inbox for linux-kernel@vger.kernel.org

* [PATCH] cpufreq: ondemand+conservative=condemand
@ 2005-06-26  1:01 Paolo Marchetti
  2005-06-27  0:31 ` Peter Chubb
  0 siblings, 1 reply; 5+ messages in thread
From: Paolo Marchetti @ 2005-06-26  1:01 UTC (permalink / raw)
  To: kernel

 > Just change defaults in conservative governor to make it more responsive.
>
Alexey,
I played with conservative governor trying to make it work decently on
my p4 with no results.
As you know it works but it isn't responsive, it takes eons to step up/down.

No matter how I tune sampling_rate and the other tunables, the
conservative governor is useless for me.
Please look at the code; you will see ondemand is better except for the
"give-me-all-the-power-here-and-now" thing.

ciao,
Paolo


* Re: [PATCH] cpufreq: ondemand+conservative=condemand
  2005-06-26  1:01 [PATCH] cpufreq: ondemand+conservative=condemand Paolo Marchetti
@ 2005-06-27  0:31 ` Peter Chubb
  2005-06-27 11:36   ` Paolo Marchetti
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Chubb @ 2005-06-27  0:31 UTC (permalink / raw)
  To: Paolo Marchetti; +Cc: kernel

>>>>> "Paolo" == Paolo Marchetti <natryum@gmail.com> writes:

>> Just change defaults in conservative governor to make it more
>> responsive.
>> 
Paolo> Alexey, I played with conservative governor trying to make it
Paolo> work decently on my p4 with no results.  As you know it works
Paolo> but it isn't responsive, it takes eons to step up/down.

You can always use a userspace governor.  I've attached the one I use.

Every five seconds (you can make it faster if you wish, but that seems
about right for my usage patterns), the program reads the load
averages from /proc/loadavg and decides whether to adjust the CPU
frequency.  If the one-minute load average is above $FASTTHRESHOLD, the
frequency is stepped up by $FASTINC; otherwise, if it's above
$SLOWTHRESHOLD, it's stepped up by $SLOWINC.  Failing both, if the
15-minute load average is below $DECTHRESHOLD, the frequency is stepped
down by $DEC.  So you get fast increases and slow decreases, but because
the time constant for the decrease is long, you can get good response
for a load spike, then fairly rapid decrease.  The aim is to keep the
load average around 0.9.
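
The decision rule described above can be tried out without touching
cpufreq at all.  What follows is only a dry-run sketch, not part of the
attached script: decide() is a hypothetical helper that takes the
short-term and long-term load averages (already scaled by 100) and
prints the action the controller would take, using the same default
thresholds as the script.

```shell
#!/bin/sh
# Dry-run sketch of the decision rule: prints an action, never writes to sysfs.
FASTINC=3;  FASTTHRESHOLD=100
SLOWINC=1;  SLOWTHRESHOLD=80
DEC=1;      DECTHRESHOLD=500

# decide SHORT LONG: both arguments are load averages scaled by 100.
# (Hypothetical helper, introduced here for illustration only.)
decide() {
    if [ "$1" -gt "$FASTTHRESHOLD" ]; then
        echo "up $FASTINC"
    elif [ "$1" -gt "$SLOWTHRESHOLD" ]; then
        echo "up $SLOWINC"
    elif [ "$2" -lt "$DECTHRESHOLD" ]; then
        echo "down $DEC"
    else
        echo "hold"
    fi
}

decide 150 120   # short-term load 1.50
decide 90 120    # short-term load 0.90
decide 40 120    # short-term load 0.40, long-term load 1.20
decide 40 600    # short-term load 0.40, long-term load 6.00
```

With the default DECTHRESHOLD of 500, a step down is allowed whenever
the short-term load is at or below 0.80 and the long-term load is under
5.0, so the last case is the only one of the four that holds the
current speed.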


--
#!/bin/bash

# Seconds to sleep between adjustments
INTERVAL=5

# The controller steps the frequency up by FASTINC states
# if the load average is over FASTTHRESHOLD.
# Thresholds are in percentage points of load average -- i.e., a
# one-minute load average of 1.0 corresponds to a threshold of 100.
FASTINC=3
FASTTHRESHOLD=100
# Slow increment
SLOWINC=1
SLOWTHRESHOLD=80
# Decrement
DEC=1
DECTHRESHOLD=500

cd /sys/devices/system/cpu/cpu0/cpufreq

# Do some parameter checks.
[ $FASTTHRESHOLD -le $SLOWTHRESHOLD ] && {
    echo >&2 "Fast Threshold $FASTTHRESHOLD must be greater than the"
    echo >&2 "slow threshold $SLOWTHRESHOLD"
    exit 1
}

[ \( $SLOWINC -ge 1 \) -a  \( $FASTINC -ge 1 \) -a \( $DEC -ge 1 \) ] || {
    echo >&2 "Increments must all be integers greater than or equal to 1"
    exit 1
}

# convert a two dec place number to an int scaled by 100.
function to_int()
{
        val=$1
	OIFS="$IFS"
	IFS="."
	set  $val
	IFS="$OIFS"
	expr $1 \* 100 + $2
}

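# Get load averages.  /proc/loadavg reports the 1-, 5- and 15-minute
# averages, so onesec and fifteensec hold the 1- and 15-minute values.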
function loadavg()
{
	read onesec fivesec fifteensec rest < /proc/loadavg
	onesec=`to_int $onesec`
	fifteensec=`to_int $fifteensec`
}

function getspeeds()
{
    echo userspace > scaling_governor
    set `cat scaling_available_frequencies`
    i=0
    for j
    do
	i=`expr $i + 1`
	eval speed$i=$j
    done
    nspeeds=$i
}

# Get current throttling factor.
# This can be changed automatically by the BIOS in response to power
# events (e.g., AC coming on line).
function throttle() {
	< scaling_cur_freq read curfreq
	i=1;
	# Use -le so that the last table entry (throttle 0) is matched too.
	while [ $i -le $nspeeds ]
	do
	    eval [ \$speed$i -eq 0$curfreq ] && expr $nspeeds - $i
	    i=`expr $i + 1`
	done
}

function set_speed() {
        x=`expr $nspeeds - $1`
	eval speed=\$speed$x
	echo $speed  > scaling_setspeed
}

# Increase the effective processor speed.
function up()
{
	 [ $current_throttle -eq 0 ] || {
		current_throttle=`expr $current_throttle - $1`
		[ $current_throttle -lt 0 ] && current_throttle=0
		set_speed $current_throttle
         }
}

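# Decrease the effective processor speed.
# The deepest throttle state is nspeeds - 1: set_speed maps throttle t
# to speed(nspeeds - t), and there is no speed0.
function down()
{
	 max_throttle=`expr $nspeeds - 1`
	 [ $current_throttle -ge $max_throttle ] || {
		current_throttle=`expr $current_throttle + $1`
		[ $current_throttle -gt $max_throttle ] && current_throttle=$max_throttle
		set_speed $current_throttle
	}
}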


getspeeds
current_throttle=`throttle`
while sleep $INTERVAL
do
	loadavg

	# Go up fast, then tail off.
	#
	if [ $onesec -gt $FASTTHRESHOLD ]
	then
		up $FASTINC
	elif [ $onesec -gt $SLOWTHRESHOLD ] 
	then
		up $SLOWINC
	elif [ $fifteensec -lt $DECTHRESHOLD ]
	then
		down $DEC
	fi
done


* Re: [PATCH] cpufreq: ondemand+conservative=condemand
  2005-06-27  0:31 ` Peter Chubb
@ 2005-06-27 11:36   ` Paolo Marchetti
  0 siblings, 0 replies; 5+ messages in thread
From: Paolo Marchetti @ 2005-06-27 11:36 UTC (permalink / raw)
  To: Peter Chubb, kernel

On 6/27/05, Peter Chubb <peter@chubb.wattle.id.au> wrote:
> You can always use a userspace governor.  I've attached the one I use.
> [...]
Thank you, I'll try it.
Unfortunately, the problem remains: how do I get the conservative
governor to work decently on a P4 laptop?


* [PATCH] cpufreq: ondemand+conservative=condemand
@ 2005-06-25 15:08 Paolo Marchetti
  2005-06-25 18:26 ` Alexey Dobriyan
  0 siblings, 1 reply; 5+ messages in thread
From: Paolo Marchetti @ 2005-06-25 15:08 UTC (permalink / raw)
  To: Dave Jones, Alexey Dobriyan, kernel

[-- Attachment #1: Type: text/plain, Size: 926 bytes --]

Hello world!

Dave, please consider this patch.

I'm a newbie, so I'm sure I've made a lot of mistakes, starting with the
ugly name :)
Sorry in advance.

'condemand' - This driver adds a dynamic cpufreq policy governor.
The governor does a periodic polling and 
changes frequency based on the CPU utilization.
The support for this governor depends on CPU capability to
do fast frequency switching (i.e, very low latency frequency
transitions). 
This driver takes inspiration (and code) from the ondemand and
conservative governors, it does fast scale down like ondemand
and gradual scale up like conservative.
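
To make the policy concrete, here is a rough user-space sketch of the
arithmetic the patch uses.  It is not the kernel code: next_freq() is a
hypothetical helper, the MIN and MAX limits are made-up values in kHz,
the clamp to MIN is an assumption about what the cpufreq core would do,
and sampling_down_factor is ignored (it defaults to 1).

```shell
#!/bin/sh
# User-space sketch of the condemand policy (illustration only).
MIN=800000; MAX=2000000   # made-up policy limits, kHz
UP_THRESHOLD=80           # default up_threshold in the patch
FREQ_STEP=5               # default freq_step in the patch, percent of MAX

# next_freq CUR LOAD: print the frequency to request next, where LOAD is
# the busy percentage (0-100) measured since the previous sample.
next_freq() {
    cur=$1; load=$2
    if [ "$load" -gt "$UP_THRESHOLD" ]; then
        # Busy: step up gradually, never past MAX.
        next=$((cur + FREQ_STEP * MAX / 100))
        [ "$next" -gt "$MAX" ] && next=$MAX
    else
        # Not busy: jump straight to the frequency that would put the
        # load 10 points under the threshold, but only if that is at
        # least 5% below the current frequency.
        next=$((load * cur / (UP_THRESHOLD - 10)))
        [ "$next" -gt $((cur * 95 / 100)) ] && next=$cur
        [ "$next" -lt "$MIN" ] && next=$MIN   # assumed clamp to policy min
    fi
    echo "$next"
}

next_freq 1000000 90   # busy at 1 GHz: one 5% step up
next_freq 2000000 35   # mostly idle at 2 GHz: single jump down
```

Under these assumptions a busy CPU at 1 GHz moves up by 100 MHz per
sample, while a mostly idle CPU at 2 GHz drops to 1 GHz in one go.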


By making a contribution to this project, I certify that:
The contribution was created in whole or in part by me and
I have the right to submit it under the open source license
indicated in the file.

Signed-off-by: Paolo Marchetti <natryum@gmail.com>
---

Patch attached (second mistake)

[-- Attachment #2: condemand.patch --]
[-- Type: application/octet-stream, Size: 16722 bytes --]

diff -uprN -X dontdiff linux-2.6.12/drivers/cpufreq/cpufreq_condemand.c my-linux-2.6.12/drivers/cpufreq/cpufreq_condemand.c
--- linux-2.6.12/drivers/cpufreq/cpufreq_condemand.c	1970-01-01 01:00:00.000000000 +0100
+++ my-linux-2.6.12/drivers/cpufreq/cpufreq_condemand.c	2005-06-25 16:30:46.000000000 +0200
@@ -0,0 +1,532 @@
+/*
+ *  drivers/cpufreq/cpufreq_condemand.c
+ *
+ *  Copyright (C)  2005 Paolo Marchetti <natryum@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/smp.h>
+#include <linux/ctype.h>
+#include <linux/cpufreq.h>
+#include <linux/sysctl.h>
+#include <linux/types.h>
+#include <linux/fs.h>
+#include <linux/sysfs.h>
+#include <linux/sched.h>
+#include <linux/kmod.h>
+#include <linux/workqueue.h>
+#include <linux/jiffies.h>
+#include <linux/kernel_stat.h>
+#include <linux/percpu.h>
+
+/*
+ * dbs is used in this file as a shortform for demandbased switching
+ * It helps to keep variable names smaller, simpler
+ */
+
+#define DEF_FREQUENCY_UP_THRESHOLD		(80)
+#define MIN_FREQUENCY_UP_THRESHOLD		(11)
+#define MAX_FREQUENCY_UP_THRESHOLD		(100)
+
+/* 
+ * The polling frequency of this governor depends on the capability of 
+ * the processor. Default polling frequency is 1000 times the transition
+ * latency of the processor. The governor will work on any processor with 
+ * transition latency <= 10mS, using appropriate sampling 
+ * rate.
+ * For CPUs with transition latency > 10mS (mostly drivers with CPUFREQ_ETERNAL)
+ * this governor will not work.
+ * All times here are in uS.
+ */
+static unsigned int 				def_sampling_rate;
+#define MIN_SAMPLING_RATE			(def_sampling_rate / 2)
+#define MAX_SAMPLING_RATE			(500 * def_sampling_rate)
+#define DEF_SAMPLING_RATE_LATENCY_MULTIPLIER	(1000)
+#define DEF_SAMPLING_DOWN_FACTOR		(1)
+#define MAX_SAMPLING_DOWN_FACTOR		(10)
+#define TRANSITION_LATENCY_LIMIT		(10 * 1000)
+
+static void do_dbs_timer(void *data);
+
+struct cpu_dbs_info_s {
+	struct cpufreq_policy 	*cur_policy;
+	unsigned int 		prev_cpu_idle_up;
+	unsigned int 		prev_cpu_idle_down;
+	unsigned int 		enable;
+};
+static DEFINE_PER_CPU(struct cpu_dbs_info_s, cpu_dbs_info);
+
+static unsigned int dbs_enable;	/* number of CPUs using this policy */
+
+static DECLARE_MUTEX 	(dbs_sem);
+static DECLARE_WORK	(dbs_work, do_dbs_timer, NULL);
+
+struct dbs_tuners {
+	unsigned int 		sampling_rate;
+	unsigned int		sampling_down_factor;
+	unsigned int		up_threshold;
+	unsigned int		ignore_nice;
+	unsigned int		freq_step;
+};
+
+static struct dbs_tuners dbs_tuners_ins = {
+	.up_threshold 		= DEF_FREQUENCY_UP_THRESHOLD,
+	.sampling_down_factor 	= DEF_SAMPLING_DOWN_FACTOR,
+};
+
+static inline unsigned int get_cpu_idle_time(unsigned int cpu)
+{
+	return	kstat_cpu(cpu).cpustat.idle +
+		kstat_cpu(cpu).cpustat.iowait +
+		( !dbs_tuners_ins.ignore_nice ? 
+		  kstat_cpu(cpu).cpustat.nice :
+		  0);
+}
+
+/************************** sysfs interface ************************/
+static ssize_t show_sampling_rate_max(struct cpufreq_policy *policy, char *buf)
+{
+	return sprintf (buf, "%u\n", MAX_SAMPLING_RATE);
+}
+
+static ssize_t show_sampling_rate_min(struct cpufreq_policy *policy, char *buf)
+{
+	return sprintf (buf, "%u\n", MIN_SAMPLING_RATE);
+}
+
+#define define_one_ro(_name) 					\
+static struct freq_attr _name =  				\
+__ATTR(_name, 0444, show_##_name, NULL)
+
+define_one_ro(sampling_rate_max);
+define_one_ro(sampling_rate_min);
+
+/* cpufreq_condemand Governor Tunables */
+#define show_one(file_name, object)					\
+static ssize_t show_##file_name						\
+(struct cpufreq_policy *unused, char *buf)				\
+{									\
+	return sprintf(buf, "%u\n", dbs_tuners_ins.object);		\
+}
+show_one(sampling_rate, sampling_rate);
+show_one(sampling_down_factor, sampling_down_factor);
+show_one(up_threshold, up_threshold);
+show_one(ignore_nice, ignore_nice);
+show_one(freq_step, freq_step);
+
+static ssize_t store_sampling_down_factor(struct cpufreq_policy *unused, 
+		const char *buf, size_t count)
+{
+	unsigned int input;
+	int ret;
+	ret = sscanf (buf, "%u", &input);
+	if (ret != 1 )
+		return -EINVAL;
+
+	if (input > MAX_SAMPLING_DOWN_FACTOR || input < 1)
+		return -EINVAL;
+
+	down(&dbs_sem);
+	dbs_tuners_ins.sampling_down_factor = input;
+	up(&dbs_sem);
+
+	return count;
+}
+
+static ssize_t store_sampling_rate(struct cpufreq_policy *unused, 
+		const char *buf, size_t count)
+{
+	unsigned int input;
+	int ret;
+	ret = sscanf (buf, "%u", &input);
+
+	down(&dbs_sem);
+	if (ret != 1 || input > MAX_SAMPLING_RATE || input < MIN_SAMPLING_RATE) {
+		up(&dbs_sem);
+		return -EINVAL;
+	}
+
+	dbs_tuners_ins.sampling_rate = input;
+	up(&dbs_sem);
+
+	return count;
+}
+
+static ssize_t store_up_threshold(struct cpufreq_policy *unused, 
+		const char *buf, size_t count)
+{
+	unsigned int input;
+	int ret;
+	ret = sscanf (buf, "%u", &input);
+
+	down(&dbs_sem);
+	if (ret != 1 || input > MAX_FREQUENCY_UP_THRESHOLD || 
+			input < MIN_FREQUENCY_UP_THRESHOLD) {
+		up(&dbs_sem);
+		return -EINVAL;
+	}
+
+	dbs_tuners_ins.up_threshold = input;
+	up(&dbs_sem);
+
+	return count;
+}
+
+static ssize_t store_ignore_nice(struct cpufreq_policy *policy,
+		const char *buf, size_t count)
+{
+	unsigned int input;
+	int ret;
+
+	unsigned int j;
+	
+	ret = sscanf (buf, "%u", &input);
+	if ( ret != 1 )
+		return -EINVAL;
+
+	if ( input > 1 )
+		input = 1;
+	
+	down(&dbs_sem);
+	if ( input == dbs_tuners_ins.ignore_nice ) { /* nothing to do */
+		up(&dbs_sem);
+		return count;
+	}
+	dbs_tuners_ins.ignore_nice = input;
+
+	/* we need to re-evaluate prev_cpu_idle_up and prev_cpu_idle_down */
+	for_each_online_cpu(j) {
+		struct cpu_dbs_info_s *j_dbs_info;
+		j_dbs_info = &per_cpu(cpu_dbs_info, j);
+		j_dbs_info->prev_cpu_idle_up = get_cpu_idle_time(j);
+		j_dbs_info->prev_cpu_idle_down = j_dbs_info->prev_cpu_idle_up;
+	}
+	up(&dbs_sem);
+
+	return count;
+}
+
+static ssize_t store_freq_step(struct cpufreq_policy *policy,
+		const char *buf, size_t count)
+{
+	unsigned int input;
+	int ret;
+
+	ret = sscanf (buf, "%u", &input);
+
+	if ( ret != 1 )
+		return -EINVAL;
+
+	if ( input > 100 )
+		input = 100;
+	
+	/* no need to test here if freq_step is zero as the user might actually
+	 * want this, they would be crazy though :) */
+	down(&dbs_sem);
+	dbs_tuners_ins.freq_step = input;
+	up(&dbs_sem);
+
+	return count;
+}
+
+
+
+#define define_one_rw(_name) \
+static struct freq_attr _name = \
+__ATTR(_name, 0644, show_##_name, store_##_name)
+
+define_one_rw(sampling_rate);
+define_one_rw(sampling_down_factor);
+define_one_rw(up_threshold);
+define_one_rw(ignore_nice);
+define_one_rw(freq_step);
+
+static struct attribute * dbs_attributes[] = {
+	&sampling_rate_max.attr,
+	&sampling_rate_min.attr,
+	&sampling_rate.attr,
+	&sampling_down_factor.attr,
+	&up_threshold.attr,
+	&ignore_nice.attr,
+	&freq_step.attr,
+	NULL
+};
+
+static struct attribute_group dbs_attr_group = {
+	.attrs = dbs_attributes,
+	.name = "condemand",
+};
+
+/************************** sysfs end ************************/
+
+static void dbs_check_cpu(int cpu)
+{
+	unsigned int idle_ticks, up_idle_ticks, total_ticks;
+	unsigned int freq_next;
+	unsigned int freq_step;
+	unsigned int freq_down_sampling_rate;
+	static int down_skip[NR_CPUS];
+	struct cpu_dbs_info_s *this_dbs_info;
+
+	struct cpufreq_policy *policy;
+	unsigned int j;
+
+	this_dbs_info = &per_cpu(cpu_dbs_info, cpu);
+	if (!this_dbs_info->enable)
+		return;
+
+	policy = this_dbs_info->cur_policy;
+	/*
+	 * Every sampling_rate, we check whether the current idle time is less
+	 * than 20% (default); if so, we try to increase the frequency.
+	 * Every sampling_rate*sampling_down_factor, we look for the lowest
+	 * frequency which can sustain the load while keeping idle time over
+	 * 30%. If such a frequency exists, we try to decrease to this frequency.
+	 *
+	 * Frequency increases are gradual: each one adds freq_step percent
+	 * (default 5) of the maximum frequency. Frequency reduction happens
+	 * at minimum steps of 5% of the current frequency.
+	 */
+
+	/* Check for frequency increase */
+	idle_ticks = UINT_MAX;
+	for_each_cpu_mask(j, policy->cpus) {
+		unsigned int tmp_idle_ticks, total_idle_ticks;
+		struct cpu_dbs_info_s *j_dbs_info;
+
+		j_dbs_info = &per_cpu(cpu_dbs_info, j);
+		total_idle_ticks = get_cpu_idle_time(j);
+		tmp_idle_ticks = total_idle_ticks -
+			j_dbs_info->prev_cpu_idle_up;
+		j_dbs_info->prev_cpu_idle_up = total_idle_ticks;
+
+		if (tmp_idle_ticks < idle_ticks)
+			idle_ticks = tmp_idle_ticks;
+	}
+
+	/* Scale idle ticks by 100 and compare with up and down ticks */
+	idle_ticks *= 100;
+	up_idle_ticks = (100 - dbs_tuners_ins.up_threshold) *
+			usecs_to_jiffies(dbs_tuners_ins.sampling_rate);
+
+	if (idle_ticks < up_idle_ticks) {
+		down_skip[cpu] = 0;
+		for_each_cpu_mask(j, policy->cpus) {
+			struct cpu_dbs_info_s *j_dbs_info;
+
+			j_dbs_info = &per_cpu(cpu_dbs_info, j);
+			j_dbs_info->prev_cpu_idle_down = 
+					j_dbs_info->prev_cpu_idle_up;
+		}
+		/* if we are already at full speed then break out early */
+		if (policy->cur == policy->max)
+			return;
+		
+		freq_step = (dbs_tuners_ins.freq_step * policy->max) / 100;
+
+		/* max freq cannot be less than 100. Paranoid!
+		if (unlikely(freq_step == 0))
+			freq_step = 5;
+		*/
+
+		policy->cur += freq_step;
+		if (policy->cur > policy->max)
+			policy->cur = policy->max;
+
+		__cpufreq_driver_target(policy, policy->cur,
+				CPUFREQ_RELATION_H);
+		return;
+	}
+
+	/* Check for frequency decrease */
+	down_skip[cpu]++;
+	if (down_skip[cpu] < dbs_tuners_ins.sampling_down_factor)
+		return;
+
+	idle_ticks = UINT_MAX;
+	for_each_cpu_mask(j, policy->cpus) {
+		unsigned int tmp_idle_ticks, total_idle_ticks;
+		struct cpu_dbs_info_s *j_dbs_info;
+
+		j_dbs_info = &per_cpu(cpu_dbs_info, j);
+		/* Check for frequency decrease */
+		total_idle_ticks = j_dbs_info->prev_cpu_idle_up;
+		tmp_idle_ticks = total_idle_ticks -
+			j_dbs_info->prev_cpu_idle_down;
+		j_dbs_info->prev_cpu_idle_down = total_idle_ticks;
+
+		if (tmp_idle_ticks < idle_ticks)
+			idle_ticks = tmp_idle_ticks;
+	}
+
+	down_skip[cpu] = 0;
+	/* if we cannot reduce the frequency anymore, break out early */
+	if (policy->cur == policy->min)
+		return;
+
+	/* Compute how many ticks there are between two measurements */
+	freq_down_sampling_rate = dbs_tuners_ins.sampling_rate *
+		dbs_tuners_ins.sampling_down_factor;
+	total_ticks = usecs_to_jiffies(freq_down_sampling_rate);
+
+	/*
+	 * The optimal frequency is the frequency that is the lowest that
+	 * can support the current CPU usage without triggering the up
+	 * policy. To be safe, we focus 10 points under the threshold.
+	 */
+	freq_next = ((total_ticks - idle_ticks) * 100) / total_ticks;
+	freq_next = (freq_next * policy->cur) / 
+			(dbs_tuners_ins.up_threshold - 10);
+
+	if (freq_next <= ((policy->cur * 95) / 100))
+		__cpufreq_driver_target(policy, freq_next, CPUFREQ_RELATION_L);
+}
+
+static void do_dbs_timer(void *data)
+{ 
+	int i;
+	down(&dbs_sem);
+	for_each_online_cpu(i)
+		dbs_check_cpu(i);
+	schedule_delayed_work(&dbs_work, 
+			usecs_to_jiffies(dbs_tuners_ins.sampling_rate));
+	up(&dbs_sem);
+} 
+
+static inline void dbs_timer_init(void)
+{
+	INIT_WORK(&dbs_work, do_dbs_timer, NULL);
+	schedule_delayed_work(&dbs_work,
+			usecs_to_jiffies(dbs_tuners_ins.sampling_rate));
+	return;
+}
+
+static inline void dbs_timer_exit(void)
+{
+	cancel_delayed_work(&dbs_work);
+	return;
+}
+
+static int cpufreq_governor_dbs(struct cpufreq_policy *policy,
+				   unsigned int event)
+{
+	unsigned int cpu = policy->cpu;
+	struct cpu_dbs_info_s *this_dbs_info;
+	unsigned int j;
+
+	this_dbs_info = &per_cpu(cpu_dbs_info, cpu);
+
+	switch (event) {
+	case CPUFREQ_GOV_START:
+		if ((!cpu_online(cpu)) || 
+		    (!policy->cur))
+			return -EINVAL;
+
+		if (policy->cpuinfo.transition_latency >
+				(TRANSITION_LATENCY_LIMIT * 1000))
+			return -EINVAL;
+		if (this_dbs_info->enable) /* Already enabled */
+			break;
+		 
+		down(&dbs_sem);
+		for_each_cpu_mask(j, policy->cpus) {
+			struct cpu_dbs_info_s *j_dbs_info;
+			j_dbs_info = &per_cpu(cpu_dbs_info, j);
+			j_dbs_info->cur_policy = policy;
+		
+			j_dbs_info->prev_cpu_idle_up = get_cpu_idle_time(j);
+			j_dbs_info->prev_cpu_idle_down
+				= j_dbs_info->prev_cpu_idle_up;
+		}
+		this_dbs_info->enable = 1;
+		sysfs_create_group(&policy->kobj, &dbs_attr_group);
+		dbs_enable++;
+		/*
+		 * Start the timerschedule work, when this governor
+		 * is used for first time
+		 */
+		if (dbs_enable == 1) {
+			unsigned int latency;
+			/* policy latency is in nS. Convert it to uS first */
+
+			latency = policy->cpuinfo.transition_latency;
+			if (latency < 1000)
+				latency = 1000;
+
+			def_sampling_rate = (latency / 1000) *
+					DEF_SAMPLING_RATE_LATENCY_MULTIPLIER;
+			dbs_tuners_ins.sampling_rate = def_sampling_rate;
+			dbs_tuners_ins.ignore_nice = 0;
+			dbs_tuners_ins.freq_step = 5;
+
+			dbs_timer_init();
+		}
+		
+		up(&dbs_sem);
+		break;
+
+	case CPUFREQ_GOV_STOP:
+		down(&dbs_sem);
+		this_dbs_info->enable = 0;
+		sysfs_remove_group(&policy->kobj, &dbs_attr_group);
+		dbs_enable--;
+		/*
+		 * Stop the timer-scheduled work when the last CPU
+		 * stops using this governor
+		 */
+		if (dbs_enable == 0) 
+			dbs_timer_exit();
+		
+		up(&dbs_sem);
+
+		break;
+
+	case CPUFREQ_GOV_LIMITS:
+		down(&dbs_sem);
+		if (policy->max < this_dbs_info->cur_policy->cur)
+			__cpufreq_driver_target(
+					this_dbs_info->cur_policy,
+				       	policy->max, CPUFREQ_RELATION_H);
+		else if (policy->min > this_dbs_info->cur_policy->cur)
+			__cpufreq_driver_target(
+					this_dbs_info->cur_policy,
+				       	policy->min, CPUFREQ_RELATION_L);
+		up(&dbs_sem);
+		break;
+	}
+	return 0;
+}
+
+static struct cpufreq_governor cpufreq_gov_dbs = {
+	.name		= "condemand",
+	.governor	= cpufreq_governor_dbs,
+	.owner		= THIS_MODULE,
+};
+
+static int __init cpufreq_gov_dbs_init(void)
+{
+	return cpufreq_register_governor(&cpufreq_gov_dbs);
+}
+
+static void __exit cpufreq_gov_dbs_exit(void)
+{
+	/* Make sure that the scheduled work is indeed not running */
+	flush_scheduled_work();
+
+	cpufreq_unregister_governor(&cpufreq_gov_dbs);
+}
+
+
+MODULE_AUTHOR ("Paolo Marchetti <natryum@gmail.com>");
+MODULE_DESCRIPTION ("'cpufreq_condemand' - A dynamic cpufreq governor for "
+		"Low Latency Frequency Transition capable processors "
+		"fast scale down, gradual scale up");
+MODULE_LICENSE ("GPL");
+
+module_init(cpufreq_gov_dbs_init);
+module_exit(cpufreq_gov_dbs_exit);
diff -uprN -X dontdiff linux-2.6.12/drivers/cpufreq/Kconfig my-linux-2.6.12/drivers/cpufreq/Kconfig
--- linux-2.6.12/drivers/cpufreq/Kconfig	2005-06-17 21:48:29.000000000 +0200
+++ my-linux-2.6.12/drivers/cpufreq/Kconfig	2005-06-25 16:27:23.000000000 +0200
@@ -139,4 +139,20 @@ config CPU_FREQ_GOV_CONSERVATIVE
 
 	  If in doubt, say N.
 
+config CPU_FREQ_GOV_CONDEMAND
+	tristate "'condemand' cpufreq policy governor"
+	help
+	  'condemand' - This driver adds a dynamic cpufreq policy governor.
+	  The governor does a periodic polling and 
+	  changes frequency based on the CPU utilization.
+	  The support for this governor depends on CPU capability to
+	  do fast frequency switching (i.e, very low latency frequency
+	  transitions). 
+
+	  This driver takes inspiration (and code) from the ondemand and
+	  conservative governors, it does fast scale down like ondemand
+	  and gradual scale up like conservative.
+
+	  If in doubt, say N.
+
 endif	# CPU_FREQ
diff -uprN -X dontdiff linux-2.6.12/drivers/cpufreq/Makefile my-linux-2.6.12/drivers/cpufreq/Makefile
--- linux-2.6.12/drivers/cpufreq/Makefile	2005-06-17 21:48:29.000000000 +0200
+++ my-linux-2.6.12/drivers/cpufreq/Makefile	2005-06-25 16:07:56.000000000 +0200
@@ -9,6 +9,7 @@ obj-$(CONFIG_CPU_FREQ_GOV_POWERSAVE)	+= 
 obj-$(CONFIG_CPU_FREQ_GOV_USERSPACE)	+= cpufreq_userspace.o
 obj-$(CONFIG_CPU_FREQ_GOV_ONDEMAND)	+= cpufreq_ondemand.o
 obj-$(CONFIG_CPU_FREQ_GOV_CONSERVATIVE)	+= cpufreq_conservative.o
+obj-$(CONFIG_CPU_FREQ_GOV_CONDEMAND)	+= cpufreq_condemand.o
 
 # CPUfreq cross-arch helpers
 obj-$(CONFIG_CPU_FREQ_TABLE)		+= freq_table.o
diff -uprN -X dontdiff linux-2.6.12/Makefile my-linux-2.6.12/Makefile
--- linux-2.6.12/Makefile	2005-06-17 21:48:29.000000000 +0200
+++ my-linux-2.6.12/Makefile	2005-06-25 16:33:06.000000000 +0200
@@ -1,7 +1,7 @@
 VERSION = 2
 PATCHLEVEL = 6
 SUBLEVEL = 12
-EXTRAVERSION =
+EXTRAVERSION = -second
 NAME=Woozy Numbat
 
 # *DOCUMENTATION*
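
For reference, the sampling-rate limits that the governor exposes
through sampling_rate_min and sampling_rate_max follow directly from
the driver's transition latency.  The sketch below only repeats the
arithmetic from the CPUFREQ_GOV_START path in user space;
sampling_bounds() is a hypothetical helper and the 100 uS latency is an
example value, not a measurement.

```shell
#!/bin/sh
# sampling_bounds LATENCY_NS: print "default min max" sampling rates in
# microseconds for a given transition latency in nanoseconds.
sampling_bounds() {
    latency=$1
    # Latencies below 1 uS are rounded up to 1 uS, as in the patch.
    [ "$latency" -lt 1000 ] && latency=1000
    # nS -> uS, then times DEF_SAMPLING_RATE_LATENCY_MULTIPLIER (1000).
    def=$((latency / 1000 * 1000))
    # MIN_SAMPLING_RATE is def / 2, MAX_SAMPLING_RATE is 500 * def.
    echo "$def $((def / 2)) $((500 * def))"
}

sampling_bounds 100000   # example: 100 uS transition latency
```

With a 100 uS transition latency this gives a default sampling rate of
100000 uS (0.1 s), a minimum of 50000 uS and a maximum of 50000000 uS.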


* Re: [PATCH] cpufreq: ondemand+conservative=condemand
  2005-06-25 15:08 Paolo Marchetti
@ 2005-06-25 18:26 ` Alexey Dobriyan
  0 siblings, 0 replies; 5+ messages in thread
From: Alexey Dobriyan @ 2005-06-25 18:26 UTC (permalink / raw)
  To: Paolo Marchetti; +Cc: Dave Jones, kernel

On Saturday 25 June 2005 19:08, Paolo Marchetti wrote:
> 'condemand' - This driver adds a dynamic cpufreq policy governor.
> The governor does a periodic polling and 
> changes frequency based on the CPU utilization.
> The support for this governor depends on CPU capability to
> do fast frequency switching (i.e, very low latency frequency
> transitions). 
> This driver takes inspiration (and code) from the ondemand and
> conservative governors, it does fast scale down like ondemand
> and gradual scale up like conservative.

Just change defaults in conservative governor to make it more responsive.
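
As a rough illustration of what changing those defaults at run time
could look like, the sketch below writes two tunables through sysfs.
Everything specific in it is an assumption: the directory path, the
tunable names freq_step and sampling_down_factor (taken from the
attribute names in the patch, which was derived from conservative), and
the values, which are examples rather than recommendations.
set_tunable() and tune_conservative() are hypothetical helpers.

```shell
#!/bin/sh
# set_tunable DIR NAME VALUE: write VALUE to DIR/NAME if it is writable.
set_tunable() {
    if [ -w "$1/$2" ]; then
        echo "$3" > "$1/$2"
    else
        echo "skipping $2: $1/$2 is not writable" >&2
        return 1
    fi
}

# Assumed location of the conservative governor's tunables.
GOV_DIR=${GOV_DIR:-/sys/devices/system/cpu/cpu0/cpufreq/conservative}

# Example values only: larger frequency steps, no extra delay before
# scaling down.
tune_conservative() {
    set_tunable "$GOV_DIR" freq_step 20 &&
    set_tunable "$GOV_DIR" sampling_down_factor 1
}
```

Calling tune_conservative as root, with the conservative governor
already selected, would apply the two values; nothing is written when
the files are missing.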

