* [PATCH RESEND v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
2016-01-12 10:24 [PATCH RESEND v4 0/4] cpufreq: powernv: Redesign the presentation of throttle notification Shilpasri G Bhat
@ 2016-01-12 10:24 ` Shilpasri G Bhat
2016-01-12 10:57 ` Shreyas B Prabhu
2016-01-12 10:24 ` [PATCH RESEND v4 2/4] cpufreq: powernv/tracing: Add powernv_throttle tracepoint Shilpasri G Bhat
` (2 subsequent siblings)
3 siblings, 1 reply; 8+ messages in thread
From: Shilpasri G Bhat @ 2016-01-12 10:24 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel
Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
Shilpasri G Bhat
cpu_to_chip_id() does a DT walk through to find out the chip id by taking a
contended device tree lock. This adds an unnecessary overhead in a hot-path.
So instead of cpu_to_chip_id() use PIR of the cpu to find the chip id.
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
---
drivers/cpufreq/powernv-cpufreq.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index cb50138..597a084 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -39,6 +39,7 @@
#define PMSR_PSAFE_ENABLE (1UL << 30)
#define PMSR_SPR_EM_DISABLE (1UL << 31)
#define PMSR_MAX(x) ((x >> 32) & 0xFF)
+#define pir_to_chip_id(pir) (((pir) >> 7) & 0x3f)
static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
static bool rebooting, throttled, occ_reset;
@@ -312,13 +313,14 @@ static inline unsigned int get_nominal_index(void)
static void powernv_cpufreq_throttle_check(void *data)
{
unsigned int cpu = smp_processor_id();
+ unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id());
unsigned long pmsr;
int pmsr_pmax, i;
pmsr = get_pmspr(SPRN_PMSR);
for (i = 0; i < nr_chips; i++)
- if (chips[i].id == cpu_to_chip_id(cpu))
+ if (chips[i].id == chip_id)
break;
/* Check for Pmax Capping */
@@ -558,7 +560,8 @@ static int init_chip_info(void)
unsigned int prev_chip_id = UINT_MAX;
for_each_possible_cpu(cpu) {
- unsigned int id = cpu_to_chip_id(cpu);
+ unsigned int id =
+ pir_to_chip_id(get_hard_smp_processor_id(cpu));
if (prev_chip_id != id) {
prev_chip_id = id;
--
1.9.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH RESEND v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
2016-01-12 10:24 ` [PATCH RESEND v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path Shilpasri G Bhat
@ 2016-01-12 10:57 ` Shreyas B Prabhu
0 siblings, 0 replies; 8+ messages in thread
From: Shreyas B Prabhu @ 2016-01-12 10:57 UTC (permalink / raw)
To: Shilpasri G Bhat
Cc: linuxppc-dev, linux-kernel, rjw, viresh.kumar, linux-pm, pc,
anton, ego
On 01/12/2016 03:54 PM, Shilpasri G Bhat wrote:
> cpu_to_chip_id() does a DT walk through to find out the chip id by taking a
> contended device tree lock. This adds an unnecessary overhead in a hot-path.
> So instead of cpu_to_chip_id() use PIR of the cpu to find the chip id.
>
> Reported-by: Anton Blanchard <anton@samba.org>
> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
> ---
> drivers/cpufreq/powernv-cpufreq.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index cb50138..597a084 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -39,6 +39,7 @@
> #define PMSR_PSAFE_ENABLE (1UL << 30)
> #define PMSR_SPR_EM_DISABLE (1UL << 31)
> #define PMSR_MAX(x) ((x >> 32) & 0xFF)
> +#define pir_to_chip_id(pir) (((pir) >> 7) & 0x3f)
Since this is platform specific and true only for power8, this is not
the right place to put it. Either you can move this to arch/powerpc or
you can maintain a cpu to chip map within the driver.
>
> static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
> static bool rebooting, throttled, occ_reset;
> @@ -312,13 +313,14 @@ static inline unsigned int get_nominal_index(void)
> static void powernv_cpufreq_throttle_check(void *data)
> {
> unsigned int cpu = smp_processor_id();
> + unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id());
> unsigned long pmsr;
> int pmsr_pmax, i;
>
> pmsr = get_pmspr(SPRN_PMSR);
>
> for (i = 0; i < nr_chips; i++)
> - if (chips[i].id == cpu_to_chip_id(cpu))
> + if (chips[i].id == chip_id)
> break;
>
> /* Check for Pmax Capping */
> @@ -558,7 +560,8 @@ static int init_chip_info(void)
> unsigned int prev_chip_id = UINT_MAX;
>
> for_each_possible_cpu(cpu) {
> - unsigned int id = cpu_to_chip_id(cpu);
> + unsigned int id =
> + pir_to_chip_id(get_hard_smp_processor_id(cpu));
>
> if (prev_chip_id != id) {
> prev_chip_id = id;
>
Thanks,
Shreyas
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH RESEND v4 2/4] cpufreq: powernv/tracing: Add powernv_throttle tracepoint
2016-01-12 10:24 [PATCH RESEND v4 0/4] cpufreq: powernv: Redesign the presentation of throttle notification Shilpasri G Bhat
2016-01-12 10:24 ` [PATCH RESEND v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path Shilpasri G Bhat
@ 2016-01-12 10:24 ` Shilpasri G Bhat
2016-01-12 10:24 ` [PATCH RESEND v4 3/4] cpufreq: powernv: Add a trace print for the throttle event Shilpasri G Bhat
2016-01-12 10:24 ` [PATCH RESEND v4 4/4] cpufreq: powernv: Add sysfs attributes to show throttle stats Shilpasri G Bhat
3 siblings, 0 replies; 8+ messages in thread
From: Shilpasri G Bhat @ 2016-01-12 10:24 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel
Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
Shilpasri G Bhat, Ingo Molnar, Steven Rostedt
This patch adds the powernv_throttle tracepoint to trace the CPU
frequency throttling event, which is used by the powernv-cpufreq
driver in POWER8.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
CC: Ingo Molnar <mingo@redhat.com>
CC: Steven Rostedt <rostedt@goodmis.org>
---
No changes from v2 and v3.
include/trace/events/power.h | 22 ++++++++++++++++++++++
kernel/trace/power-traces.c | 1 +
2 files changed, 23 insertions(+)
diff --git a/include/trace/events/power.h b/include/trace/events/power.h
index 284244e..19e5030 100644
--- a/include/trace/events/power.h
+++ b/include/trace/events/power.h
@@ -38,6 +38,28 @@ DEFINE_EVENT(cpu, cpu_idle,
TP_ARGS(state, cpu_id)
);
+TRACE_EVENT(powernv_throttle,
+
+ TP_PROTO(int chip_id, const char *reason, int pmax),
+
+ TP_ARGS(chip_id, reason, pmax),
+
+ TP_STRUCT__entry(
+ __field(int, chip_id)
+ __string(reason, reason)
+ __field(int, pmax)
+ ),
+
+ TP_fast_assign(
+ __entry->chip_id = chip_id;
+ __assign_str(reason, reason);
+ __entry->pmax = pmax;
+ ),
+
+ TP_printk("Chip %d Pmax %d %s", __entry->chip_id,
+ __entry->pmax, __get_str(reason))
+);
+
TRACE_EVENT(pstate_sample,
TP_PROTO(u32 core_busy,
diff --git a/kernel/trace/power-traces.c b/kernel/trace/power-traces.c
index eb4220a..81b8745 100644
--- a/kernel/trace/power-traces.c
+++ b/kernel/trace/power-traces.c
@@ -15,4 +15,5 @@
EXPORT_TRACEPOINT_SYMBOL_GPL(suspend_resume);
EXPORT_TRACEPOINT_SYMBOL_GPL(cpu_idle);
+EXPORT_TRACEPOINT_SYMBOL_GPL(powernv_throttle);
--
1.9.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH RESEND v4 3/4] cpufreq: powernv: Add a trace print for the throttle event
2016-01-12 10:24 [PATCH RESEND v4 0/4] cpufreq: powernv: Redesign the presentation of throttle notification Shilpasri G Bhat
2016-01-12 10:24 ` [PATCH RESEND v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path Shilpasri G Bhat
2016-01-12 10:24 ` [PATCH RESEND v4 2/4] cpufreq: powernv/tracing: Add powernv_throttle tracepoint Shilpasri G Bhat
@ 2016-01-12 10:24 ` Shilpasri G Bhat
2016-01-12 10:55 ` Gautham R Shenoy
2016-01-12 10:24 ` [PATCH RESEND v4 4/4] cpufreq: powernv: Add sysfs attributes to show throttle stats Shilpasri G Bhat
3 siblings, 1 reply; 8+ messages in thread
From: Shilpasri G Bhat @ 2016-01-12 10:24 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel
Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
Shilpasri G Bhat
Record the throttle event with a trace print replacing the printk,
except for events like throttling below nominal and occ reset
event which print a warning message.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
---
Changes from v3:
- Separate this patch to contain trace_point changes
- Move struct chip member 'restore' of type bool above 'mask' to reduce
structure padding.
No changes from v2.
Changes from v1:
- As suggested by Paul Clarke replaced char * throttle_reason[][30] by
const char * const throttle_reason[].
drivers/cpufreq/powernv-cpufreq.c | 95 ++++++++++++++++++++-------------------
1 file changed, 49 insertions(+), 46 deletions(-)
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 597a084..c98a6e7 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -28,6 +28,7 @@
#include <linux/of.h>
#include <linux/reboot.h>
#include <linux/slab.h>
+#include <trace/events/power.h>
#include <asm/cputhreads.h>
#include <asm/firmware.h>
@@ -44,12 +45,22 @@
static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
static bool rebooting, throttled, occ_reset;
+static const char * const throttle_reason[] = {
+ "No throttling",
+ "Power Cap",
+ "Processor Over Temperature",
+ "Power Supply Failure",
+ "Over Current",
+ "OCC Reset"
+};
+
static struct chip {
unsigned int id;
bool throttled;
+ bool restore;
+ u8 throt_reason;
cpumask_t mask;
struct work_struct throttle;
- bool restore;
} *chips;
static int nr_chips;
@@ -310,41 +321,49 @@ static inline unsigned int get_nominal_index(void)
return powernv_pstate_info.max - powernv_pstate_info.nominal;
}
-static void powernv_cpufreq_throttle_check(void *data)
+static void powernv_cpufreq_check_pmax(void)
{
unsigned int cpu = smp_processor_id();
unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id());
- unsigned long pmsr;
int pmsr_pmax, i;
- pmsr = get_pmspr(SPRN_PMSR);
+ pmsr_pmax = (s8)PMSR_MAX(get_pmspr(SPRN_PMSR));
for (i = 0; i < nr_chips; i++)
if (chips[i].id == chip_id)
break;
- /* Check for Pmax Capping */
- pmsr_pmax = (s8)PMSR_MAX(pmsr);
if (pmsr_pmax != powernv_pstate_info.max) {
if (chips[i].throttled)
- goto next;
+ return;
+
chips[i].throttled = true;
if (pmsr_pmax < powernv_pstate_info.nominal)
- pr_crit("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
- cpu, chips[i].id, pmsr_pmax,
- powernv_pstate_info.nominal);
- else
- pr_info("CPU %d on Chip %u has Pmax reduced below turbo frequency (%d < %d)\n",
- cpu, chips[i].id, pmsr_pmax,
- powernv_pstate_info.max);
+ pr_warn_once("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
+ cpu, chips[i].id, pmsr_pmax,
+ powernv_pstate_info.nominal);
+
+ trace_powernv_throttle(chips[i].id,
+ throttle_reason[chips[i].throt_reason],
+ pmsr_pmax);
} else if (chips[i].throttled) {
chips[i].throttled = false;
- pr_info("CPU %d on Chip %u has Pmax restored to %d\n", cpu,
- chips[i].id, pmsr_pmax);
+ trace_powernv_throttle(chips[i].id,
+ throttle_reason[chips[i].throt_reason],
+ pmsr_pmax);
}
+}
+
+static void powernv_cpufreq_throttle_check(void *data)
+{
+ unsigned long pmsr;
+
+ pmsr = get_pmspr(SPRN_PMSR);
+
+ /* Check for Pmax Capping */
+ powernv_cpufreq_check_pmax();
/* Check if Psafe_mode_active is set in PMSR. */
-next:
if (pmsr & PMSR_PSAFE_ENABLE) {
throttled = true;
pr_info("Pstate set to safe frequency\n");
@@ -358,7 +377,7 @@ next:
if (throttled) {
pr_info("PMSR = %16lx\n", pmsr);
- pr_crit("CPU Frequency could be throttled\n");
+ pr_warn("CPU Frequency could be throttled\n");
}
}
@@ -449,15 +468,6 @@ void powernv_cpufreq_work_fn(struct work_struct *work)
}
}
-static char throttle_reason[][30] = {
- "No throttling",
- "Power Cap",
- "Processor Over Temperature",
- "Power Supply Failure",
- "Over Current",
- "OCC Reset"
- };
-
static int powernv_cpufreq_occ_msg(struct notifier_block *nb,
unsigned long msg_type, void *_msg)
{
@@ -483,7 +493,7 @@ static int powernv_cpufreq_occ_msg(struct notifier_block *nb,
*/
if (!throttled) {
throttled = true;
- pr_crit("CPU frequency is throttled for duration\n");
+ pr_warn("CPU frequency is throttled for duration\n");
}
break;
@@ -507,23 +517,18 @@ static int powernv_cpufreq_occ_msg(struct notifier_block *nb,
return 0;
}
- if (omsg.throttle_status &&
+ for (i = 0; i < nr_chips; i++)
+ if (chips[i].id == omsg.chip)
+ break;
+
+ if (omsg.throttle_status >= 0 &&
omsg.throttle_status <= OCC_MAX_THROTTLE_STATUS)
- pr_info("OCC: Chip %u Pmax reduced due to %s\n",
- (unsigned int)omsg.chip,
- throttle_reason[omsg.throttle_status]);
- else if (!omsg.throttle_status)
- pr_info("OCC: Chip %u %s\n", (unsigned int)omsg.chip,
- throttle_reason[omsg.throttle_status]);
- else
- return 0;
+ chips[i].throt_reason = omsg.throttle_status;
- for (i = 0; i < nr_chips; i++)
- if (chips[i].id == omsg.chip) {
- if (!omsg.throttle_status)
- chips[i].restore = true;
- schedule_work(&chips[i].throttle);
- }
+ if (!omsg.throttle_status)
+ chips[i].restore = true;
+
+ schedule_work(&chips[i].throttle);
}
return 0;
}
@@ -569,16 +574,14 @@ static int init_chip_info(void)
}
}
- chips = kmalloc_array(nr_chips, sizeof(struct chip), GFP_KERNEL);
+ chips = kcalloc(nr_chips, sizeof(struct chip), GFP_KERNEL);
if (!chips)
return -ENOMEM;
for (i = 0; i < nr_chips; i++) {
chips[i].id = chip[i];
- chips[i].throttled = false;
cpumask_copy(&chips[i].mask, cpumask_of_node(chip[i]));
INIT_WORK(&chips[i].throttle, powernv_cpufreq_work_fn);
- chips[i].restore = false;
}
return 0;
--
1.9.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH RESEND v4 3/4] cpufreq: powernv: Add a trace print for the throttle event
2016-01-12 10:24 ` [PATCH RESEND v4 3/4] cpufreq: powernv: Add a trace print for the throttle event Shilpasri G Bhat
@ 2016-01-12 10:55 ` Gautham R Shenoy
0 siblings, 0 replies; 8+ messages in thread
From: Gautham R Shenoy @ 2016-01-12 10:55 UTC (permalink / raw)
To: Shilpasri G Bhat
Cc: linuxppc-dev, linux-kernel, rjw, viresh.kumar, linux-pm, pc,
anton, ego, shreyas
Hi Shilpa,
Just saw this resend!
On Tue, Jan 12, 2016 at 04:24:26AM -0600, Shilpasri G Bhat wrote:
> Record the throttle event with a trace print replacing the printk,
> except for events like throttling below nominal and occ reset
> event which print a warning message.
>
> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
> ---
[..snip..]
>
> -static void powernv_cpufreq_throttle_check(void *data)
> +static void powernv_cpufreq_check_pmax(void)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This function only contains code moved from
powernv_cpufreq_throttle_check with pr_crit/pr_warns replaced by
trace_powernv_throttle. Furthermore, it is not called from any other
place. Given that the original function was ~60 lines do we really
need to split it into two separate functions ? If yes, could it be an
inline function ?
> {
> unsigned int cpu = smp_processor_id();
> unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id());
> - unsigned long pmsr;
> int pmsr_pmax, i;
>
> - pmsr = get_pmspr(SPRN_PMSR);
> + pmsr_pmax = (s8)PMSR_MAX(get_pmspr(SPRN_PMSR));
>
> for (i = 0; i < nr_chips; i++)
> if (chips[i].id == chip_id)
> break;
>
> - /* Check for Pmax Capping */
> - pmsr_pmax = (s8)PMSR_MAX(pmsr);
> if (pmsr_pmax != powernv_pstate_info.max) {
> if (chips[i].throttled)
> - goto next;
> + return;
> +
> chips[i].throttled = true;
> if (pmsr_pmax < powernv_pstate_info.nominal)
> - pr_crit("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
> - cpu, chips[i].id, pmsr_pmax,
> - powernv_pstate_info.nominal);
> - else
> - pr_info("CPU %d on Chip %u has Pmax reduced below turbo frequency (%d < %d)\n",
> - cpu, chips[i].id, pmsr_pmax,
> - powernv_pstate_info.max);
> + pr_warn_once("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
> + cpu, chips[i].id, pmsr_pmax,
> + powernv_pstate_info.nominal);
> +
> + trace_powernv_throttle(chips[i].id,
> + throttle_reason[chips[i].throt_reason],
> + pmsr_pmax);
> } else if (chips[i].throttled) {
> chips[i].throttled = false;
> - pr_info("CPU %d on Chip %u has Pmax restored to %d\n", cpu,
> - chips[i].id, pmsr_pmax);
> + trace_powernv_throttle(chips[i].id,
> + throttle_reason[chips[i].throt_reason],
> + pmsr_pmax);
> }
> +}
> +
> +static void powernv_cpufreq_throttle_check(void *data)
> +{
> + unsigned long pmsr;
> +
> + pmsr = get_pmspr(SPRN_PMSR);
> +
> + /* Check for Pmax Capping */
> + powernv_cpufreq_check_pmax();
If you want to retain this function, you could pass pmsr as an
argument instead of computing it afresh in
powernv_cpufreq_check_pmax()
> /* Check if Psafe_mode_active is set in PMSR. */
> -next:
> if (pmsr & PMSR_PSAFE_ENABLE) {
> throttled = true;
> pr_info("Pstate set to safe frequency\n");
--
Thanks and Regards
gautham.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH RESEND v4 4/4] cpufreq: powernv: Add sysfs attributes to show throttle stats
2016-01-12 10:24 [PATCH RESEND v4 0/4] cpufreq: powernv: Redesign the presentation of throttle notification Shilpasri G Bhat
` (2 preceding siblings ...)
2016-01-12 10:24 ` [PATCH RESEND v4 3/4] cpufreq: powernv: Add a trace print for the throttle event Shilpasri G Bhat
@ 2016-01-12 10:24 ` Shilpasri G Bhat
2016-01-12 11:13 ` Gautham R Shenoy
3 siblings, 1 reply; 8+ messages in thread
From: Shilpasri G Bhat @ 2016-01-12 10:24 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel
Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
Shilpasri G Bhat
Create sysfs attributes to export throttle information in
/sys/devices/system/cpu/cpufreq/chipN. The newly added sysfs files are as
follows:
1)/sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies
This gives the throttle stats for each of the available frequencies.
The throttle stat of a frequency is the total number of times the max
frequency is reduced to that frequency.
# cat /sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies
4023000 0
3990000 0
3956000 1
3923000 0
3890000 0
3857000 2
3823000 0
3790000 0
3757000 2
3724000 1
3690000 1
...
2)/sys/devices/system/cpu/cpufreq/chip0/throttle_reasons
This directory contains throttle reason files. Each file gives the
total number of times the max frequency is throttled, except for
'throttle_reset', which gives the total number of times the max
frequency is unthrottled after being throttled.
# cd /sys/devices/system/cpu/cpufreq/chip0/throttle_reasons
# cat cpu_over_temperature
7
# cat occ_reset
0
# cat over_current
0
# cat power_cap
0
# cat power_supply_failure
0
# cat throttle_reset
7
3)/sys/devices/system/cpu/cpufreq/chip0/throttle_stat
This gives the total number of events of max frequency throttling to
lower frequencies in the turbo range of frequencies and the sub-turbo(at
and below nominal) range of frequencies.
# cat /sys/devices/system/cpu/cpufreq/chip0/throttle_stat
turbo 7
sub-turbo 0
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
---
Changes from v3:
- Seperate the patch to contain only the throttle sysfs attribute changes.
- Add helper inline function get_chip_index()
Changes from v2:
- Fixed kbuild test warning.
drivers/cpufreq/powernv-cpufreq.c:609:2: warning: ignoring return
value of 'kstrtoint', declared with attribute warn_unused_result
[-Wunused-result]
Changes from v1:
- Added a kobject to struct chip
- Grouped the throttle reasons under a separate attribute_group and
exported each reason as individual file.
- Moved the sysfs files from /sys/devices/system/node/nodeN to
/sys/devices/system/cpu/cpufreq/chipN
- As suggested by Paul Clarke replaced 'Nominal' with 'sub-turbo'.
- Modified the commit message.
drivers/cpufreq/powernv-cpufreq.c | 177 +++++++++++++++++++++++++++++++++++++-
1 file changed, 173 insertions(+), 4 deletions(-)
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index c98a6e7..40ccd9d 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -54,6 +54,16 @@ static const char * const throttle_reason[] = {
"OCC Reset"
};
+enum throt_reason_type {
+ NO_THROTTLE = 0,
+ POWERCAP,
+ CPU_OVERTEMP,
+ POWER_SUPPLY_FAILURE,
+ OVERCURRENT,
+ OCC_RESET_THROTTLE,
+ OCC_MAX_REASON
+};
+
static struct chip {
unsigned int id;
bool throttled;
@@ -61,6 +71,11 @@ static struct chip {
u8 throt_reason;
cpumask_t mask;
struct work_struct throttle;
+ int throt_turbo;
+ int throt_nominal;
+ int reason[OCC_MAX_REASON];
+ int *pstate_stat;
+ struct kobject *kobj;
} *chips;
static int nr_chips;
@@ -195,6 +210,113 @@ static struct freq_attr *powernv_cpu_freq_attr[] = {
NULL,
};
+static inline int get_chip_index(struct kobject *kobj)
+{
+ int i, id;
+
+ i = kstrtoint(kobj->name + 4, 0, &id);
+ if (i)
+ return i;
+
+ for (i = 0; i < nr_chips; i++)
+ if (chips[i].id == id)
+ return i;
+ return -EINVAL;
+}
+
+static ssize_t throttle_freq_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ int i, count = 0, id;
+
+ id = get_chip_index(kobj);
+ if (id < 0)
+ return id;
+
+ for (i = 0; i < powernv_pstate_info.nr_pstates; i++)
+ count += sprintf(&buf[count], "%d %d\n",
+ powernv_freqs[i].frequency,
+ chips[id].pstate_stat[i]);
+
+ return count;
+}
+
+static struct kobj_attribute attr_throttle_frequencies =
+__ATTR(throttle_frequencies, 0444, throttle_freq_show, NULL);
+
+static ssize_t throttle_stat_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ int id, count = 0;
+
+ id = get_chip_index(kobj);
+ if (id < 0)
+ return id;
+
+ count += sprintf(&buf[count], "turbo %d\n", chips[id].throt_turbo);
+ count += sprintf(&buf[count], "sub-turbo %d\n",
+ chips[id].throt_nominal);
+
+ return count;
+}
+
+static struct kobj_attribute attr_throttle_stat =
+__ATTR(throttle_stat, 0444, throttle_stat_show, NULL);
+
+#define define_throttle_reason_attr(attr_name, val) \
+static ssize_t attr_name##_show(struct kobject *kobj, \
+ struct kobj_attribute *attr, char *buf) \
+{ \
+ int id; \
+ \
+ id = get_chip_index(kobj); \
+ if (id < 0) \
+ return id; \
+ \
+ return sprintf(buf, "%d\n", chips[id].reason[val]); \
+} \
+ \
+static struct kobj_attribute attr_##attr_name = \
+__ATTR(attr_name, 0444, attr_name##_show, NULL)
+
+define_throttle_reason_attr(throttle_reset, NO_THROTTLE);
+define_throttle_reason_attr(power_cap, POWERCAP);
+define_throttle_reason_attr(cpu_over_temperature, CPU_OVERTEMP);
+define_throttle_reason_attr(power_supply_failure, POWER_SUPPLY_FAILURE);
+define_throttle_reason_attr(over_current, OVERCURRENT);
+define_throttle_reason_attr(occ_reset, OCC_RESET_THROTTLE);
+
+static struct attribute *throttle_reason_attrs[] = {
+ &attr_throttle_reset.attr,
+ &attr_power_cap.attr,
+ &attr_cpu_over_temperature.attr,
+ &attr_power_supply_failure.attr,
+ &attr_over_current.attr,
+ &attr_occ_reset.attr,
+ NULL
+};
+
+static struct attribute *throttle_stat_attrs[] = {
+ &attr_throttle_frequencies.attr,
+ &attr_throttle_stat.attr,
+ NULL
+};
+
+static const struct attribute_group throttle_reason_group = {
+ .name = "throttle_reasons",
+ .attrs = throttle_reason_attrs,
+};
+
+static const struct attribute_group throttle_stat_group = {
+ .attrs = throttle_stat_attrs,
+};
+
+static const struct attribute_group *throttle_attr_groups[] = {
+ &throttle_stat_group,
+ &throttle_reason_group,
+ NULL
+};
+
/* Helper routines */
/* Access helpers to power mgt SPR */
@@ -325,7 +447,7 @@ static void powernv_cpufreq_check_pmax(void)
{
unsigned int cpu = smp_processor_id();
unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id());
- int pmsr_pmax, i;
+ int pmsr_pmax, i, index;
pmsr_pmax = (s8)PMSR_MAX(get_pmspr(SPRN_PMSR));
@@ -338,10 +460,18 @@ static void powernv_cpufreq_check_pmax(void)
return;
chips[i].throttled = true;
- if (pmsr_pmax < powernv_pstate_info.nominal)
+ if (pmsr_pmax < powernv_pstate_info.nominal) {
pr_warn_once("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
cpu, chips[i].id, pmsr_pmax,
powernv_pstate_info.nominal);
+ chips[i].throt_nominal++;
+ } else {
+ chips[i].throt_turbo++;
+ }
+
+ index = powernv_pstate_info.max - pmsr_pmax;
+ if (index >= 0 && index < powernv_pstate_info.nr_pstates)
+ chips[i].pstate_stat[index]++;
trace_powernv_throttle(chips[i].id,
throttle_reason[chips[i].throt_reason],
@@ -522,8 +652,10 @@ static int powernv_cpufreq_occ_msg(struct notifier_block *nb,
break;
if (omsg.throttle_status >= 0 &&
- omsg.throttle_status <= OCC_MAX_THROTTLE_STATUS)
+ omsg.throttle_status <= OCC_MAX_THROTTLE_STATUS) {
chips[i].throt_reason = omsg.throttle_status;
+ chips[i].reason[omsg.throttle_status]++;
+ }
if (!omsg.throttle_status)
chips[i].restore = true;
@@ -563,6 +695,7 @@ static int init_chip_info(void)
unsigned int chip[256];
unsigned int cpu, i;
unsigned int prev_chip_id = UINT_MAX;
+ int ret = -ENOMEM;
for_each_possible_cpu(cpu) {
unsigned int id =
@@ -576,15 +709,43 @@ static int init_chip_info(void)
chips = kcalloc(nr_chips, sizeof(struct chip), GFP_KERNEL);
if (!chips)
- return -ENOMEM;
+ goto out;
for (i = 0; i < nr_chips; i++) {
+ char name[10];
+
chips[i].id = chip[i];
cpumask_copy(&chips[i].mask, cpumask_of_node(chip[i]));
INIT_WORK(&chips[i].throttle, powernv_cpufreq_work_fn);
+ chips[i].pstate_stat = kcalloc(powernv_pstate_info.nr_pstates,
+ sizeof(int), GFP_KERNEL);
+ if (!chips[i].pstate_stat)
+ goto free;
+
+ sprintf(name, "chip%d", chips[i].id);
+ chips[i].kobj = kobject_create_and_add(name,
+ cpufreq_global_kobject);
+ if (!chips[i].kobj)
+ goto free;
+
+ ret = sysfs_create_groups(chips[i].kobj, throttle_attr_groups);
+ if (ret) {
+ pr_info("Chip %d failed to create throttle sysfs group\n",
+ chips[i].id);
+ goto free;
+ }
}
return 0;
+free:
+ nr_chips = i;
+ for (i = 0; i <= nr_chips; i++) {
+ kobject_put(chips[i].kobj);
+ kfree(chips[i].pstate_stat);
+ }
+ kfree(chips);
+out:
+ return ret;
}
static int __init powernv_cpufreq_init(void)
@@ -615,9 +776,17 @@ module_init(powernv_cpufreq_init);
static void __exit powernv_cpufreq_exit(void)
{
+ int i;
+
unregister_reboot_notifier(&powernv_cpufreq_reboot_nb);
opal_message_notifier_unregister(OPAL_MSG_OCC,
&powernv_cpufreq_opal_nb);
+ for (i = 0; i < nr_chips; i++) {
+ kobject_put(chips[i].kobj);
+ kfree(chips[i].pstate_stat);
+ }
+ kfree(chips);
+
cpufreq_unregister_driver(&powernv_cpufreq_driver);
}
module_exit(powernv_cpufreq_exit);
--
1.9.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH RESEND v4 4/4] cpufreq: powernv: Add sysfs attributes to show throttle stats
2016-01-12 10:24 ` [PATCH RESEND v4 4/4] cpufreq: powernv: Add sysfs attributes to show throttle stats Shilpasri G Bhat
@ 2016-01-12 11:13 ` Gautham R Shenoy
0 siblings, 0 replies; 8+ messages in thread
From: Gautham R Shenoy @ 2016-01-12 11:13 UTC (permalink / raw)
To: Shilpasri G Bhat
Cc: linuxppc-dev, linux-kernel, rjw, viresh.kumar, linux-pm, pc,
anton, ego, shreyas
Hi Shilpa,
On Tue, Jan 12, 2016 at 04:24:27AM -0600, Shilpasri G Bhat wrote:
> +static inline int get_chip_index(struct kobject *kobj)
Probably have "get_chip_index(int id)". See the reason below.
> +{
> + int i, id;
> +
> + i = kstrtoint(kobj->name + 4, 0, &id);
> + if (i)
> + return i;
> +
> + for (i = 0; i < nr_chips; i++)
> + if (chips[i].id == id)
> + return i;
This pattern to obtain a chip index from the chip id is repeated in
multiple place inside this file. Might be worthwhile to move this to a
helper function, i.e get_chip_index(id)!
> + return -EINVAL;
> +}
> +
> +static ssize_t throttle_freq_show(struct kobject *kobj,
> + struct kobj_attribute *attr, char *buf)
> +{
> + int i, count = 0, id;
> +
We obtain the id from kobj here and then obtain the index from
id via the function below.
> + id = get_chip_index(kobj);
> + if (id < 0)
> + return id;
> +
> + for (i = 0; i < powernv_pstate_info.nr_pstates; i++)
> + count += sprintf(&buf[count], "%d %d\n",
> + powernv_freqs[i].frequency,
> + chips[id].pstate_stat[i]);
> +
> + return count;
> +}
> +
> +static struct kobj_attribute attr_throttle_frequencies =
> +__ATTR(throttle_frequencies, 0444, throttle_freq_show, NULL);
> +
--
Thanks and Regards
gautham.
^ permalink raw reply [flat|nested] 8+ messages in thread