* [PATCH RESEND v4 0/4] cpufreq: powernv: Redesign the presentation of throttle notification
@ 2016-01-12 10:24 Shilpasri G Bhat
2016-01-12 10:24 ` [PATCH RESEND v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path Shilpasri G Bhat
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Shilpasri G Bhat @ 2016-01-12 10:24 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel
Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
Shilpasri G Bhat
In POWER8, OCC(On-Chip-Controller) can throttle the frequency of the
CPU when the chip crosses its thermal and power limits. Currently,
powernv-cpufreq driver detects and reports this event as a console
message. Some machines may not sustain the max turbo frequency in all
conditions and can be throttled frequently. This can lead to the
flooding of console with throttle messages. So this patchset aims to
redesign the presentation of this event via sysfs counters and
tracepoints.
Patches [2] to [4] will add a perf trace point "power:powernv_throttle" and
sysfs throttle counter stats in /sys/devices/system/cpu/cpufreq/chipN.
Patch [1] solves a bug in powernv_cpufreq_throttle_check(), which calls in to
cpu_to_chip_id() in hot path which reads DT every time to find the chip id.
Resending the patchset as I has cc'ed stable@vger.kernel.org in developemnt
cycle and used --in-reply-to to post a new version.
Changes from v3:
- Add a fix to replace cpu_to_chip_id() with simpler PIR shift to obtain the
chip id.
- Break patch2 in to two patches separating the tracepoint and sysfs attribute
changes.
Changes from v2:
- Fixed kbuild test warning.
drivers/cpufreq/powernv-cpufreq.c:609:2: warning: ignoring return
value of 'kstrtoint', declared with attribute warn_unused_result
[-Wunused-result]
Shilpasri G Bhat (4):
cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
cpufreq: powernv/tracing: Add powernv_throttle tracepoint
cpufreq: powernv: Add a trace print for the throttle event
cpufreq: powernv: Add sysfs attributes to show throttle stats
drivers/cpufreq/powernv-cpufreq.c | 279 +++++++++++++++++++++++++++++++-------
include/trace/events/power.h | 22 +++
kernel/trace/power-traces.c | 1 +
3 files changed, 250 insertions(+), 52 deletions(-)
--
1.9.1
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH RESEND v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
2016-01-12 10:24 [PATCH RESEND v4 0/4] cpufreq: powernv: Redesign the presentation of throttle notification Shilpasri G Bhat
@ 2016-01-12 10:24 ` Shilpasri G Bhat
2016-01-12 10:57 ` Shreyas B Prabhu
2016-01-12 10:24 ` [PATCH RESEND v4 2/4] cpufreq: powernv/tracing: Add powernv_throttle tracepoint Shilpasri G Bhat
` (2 subsequent siblings)
3 siblings, 1 reply; 8+ messages in thread
From: Shilpasri G Bhat @ 2016-01-12 10:24 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel
Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
Shilpasri G Bhat
cpu_to_chip_id() does a DT walk through to find out the chip id by taking a
contended device tree lock. This adds an unnecessary overhead in a hot-path.
So instead of cpu_to_chip_id() use PIR of the cpu to find the chip id.
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
---
drivers/cpufreq/powernv-cpufreq.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index cb50138..597a084 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -39,6 +39,7 @@
#define PMSR_PSAFE_ENABLE (1UL << 30)
#define PMSR_SPR_EM_DISABLE (1UL << 31)
#define PMSR_MAX(x) ((x >> 32) & 0xFF)
+#define pir_to_chip_id(pir) (((pir) >> 7) & 0x3f)
static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
static bool rebooting, throttled, occ_reset;
@@ -312,13 +313,14 @@ static inline unsigned int get_nominal_index(void)
static void powernv_cpufreq_throttle_check(void *data)
{
unsigned int cpu = smp_processor_id();
+ unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id());
unsigned long pmsr;
int pmsr_pmax, i;
pmsr = get_pmspr(SPRN_PMSR);
for (i = 0; i < nr_chips; i++)
- if (chips[i].id == cpu_to_chip_id(cpu))
+ if (chips[i].id == chip_id)
break;
/* Check for Pmax Capping */
@@ -558,7 +560,8 @@ static int init_chip_info(void)
unsigned int prev_chip_id = UINT_MAX;
for_each_possible_cpu(cpu) {
- unsigned int id = cpu_to_chip_id(cpu);
+ unsigned int id =
+ pir_to_chip_id(get_hard_smp_processor_id(cpu));
if (prev_chip_id != id) {
prev_chip_id = id;
--
1.9.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH RESEND v4 2/4] cpufreq: powernv/tracing: Add powernv_throttle tracepoint
2016-01-12 10:24 [PATCH RESEND v4 0/4] cpufreq: powernv: Redesign the presentation of throttle notification Shilpasri G Bhat
2016-01-12 10:24 ` [PATCH RESEND v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path Shilpasri G Bhat
@ 2016-01-12 10:24 ` Shilpasri G Bhat
2016-01-12 10:24 ` [PATCH RESEND v4 3/4] cpufreq: powernv: Add a trace print for the throttle event Shilpasri G Bhat
2016-01-12 10:24 ` [PATCH RESEND v4 4/4] cpufreq: powernv: Add sysfs attributes to show throttle stats Shilpasri G Bhat
3 siblings, 0 replies; 8+ messages in thread
From: Shilpasri G Bhat @ 2016-01-12 10:24 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel
Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
Shilpasri G Bhat, Ingo Molnar, Steven Rostedt
This patch adds the powernv_throttle tracepoint to trace the CPU
frequency throttling event, which is used by the powernv-cpufreq
driver in POWER8.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
CC: Ingo Molnar <mingo@redhat.com>
CC: Steven Rostedt <rostedt@goodmis.org>
---
No changes from v2 and v3.
include/trace/events/power.h | 22 ++++++++++++++++++++++
kernel/trace/power-traces.c | 1 +
2 files changed, 23 insertions(+)
diff --git a/include/trace/events/power.h b/include/trace/events/power.h
index 284244e..19e5030 100644
--- a/include/trace/events/power.h
+++ b/include/trace/events/power.h
@@ -38,6 +38,28 @@ DEFINE_EVENT(cpu, cpu_idle,
TP_ARGS(state, cpu_id)
);
+TRACE_EVENT(powernv_throttle,
+
+ TP_PROTO(int chip_id, const char *reason, int pmax),
+
+ TP_ARGS(chip_id, reason, pmax),
+
+ TP_STRUCT__entry(
+ __field(int, chip_id)
+ __string(reason, reason)
+ __field(int, pmax)
+ ),
+
+ TP_fast_assign(
+ __entry->chip_id = chip_id;
+ __assign_str(reason, reason);
+ __entry->pmax = pmax;
+ ),
+
+ TP_printk("Chip %d Pmax %d %s", __entry->chip_id,
+ __entry->pmax, __get_str(reason))
+);
+
TRACE_EVENT(pstate_sample,
TP_PROTO(u32 core_busy,
diff --git a/kernel/trace/power-traces.c b/kernel/trace/power-traces.c
index eb4220a..81b8745 100644
--- a/kernel/trace/power-traces.c
+++ b/kernel/trace/power-traces.c
@@ -15,4 +15,5 @@
EXPORT_TRACEPOINT_SYMBOL_GPL(suspend_resume);
EXPORT_TRACEPOINT_SYMBOL_GPL(cpu_idle);
+EXPORT_TRACEPOINT_SYMBOL_GPL(powernv_throttle);
--
1.9.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH RESEND v4 3/4] cpufreq: powernv: Add a trace print for the throttle event
2016-01-12 10:24 [PATCH RESEND v4 0/4] cpufreq: powernv: Redesign the presentation of throttle notification Shilpasri G Bhat
2016-01-12 10:24 ` [PATCH RESEND v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path Shilpasri G Bhat
2016-01-12 10:24 ` [PATCH RESEND v4 2/4] cpufreq: powernv/tracing: Add powernv_throttle tracepoint Shilpasri G Bhat
@ 2016-01-12 10:24 ` Shilpasri G Bhat
2016-01-12 10:55 ` Gautham R Shenoy
2016-01-12 10:24 ` [PATCH RESEND v4 4/4] cpufreq: powernv: Add sysfs attributes to show throttle stats Shilpasri G Bhat
3 siblings, 1 reply; 8+ messages in thread
From: Shilpasri G Bhat @ 2016-01-12 10:24 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel
Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
Shilpasri G Bhat
Record the throttle event with a trace print replacing the printk,
except for events like throttling below nominal and occ reset
event which print a warning message.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
---
Changes from v3:
- Separate this patch to contain trace_point changes
- Move struct chip member 'restore' of type bool above 'mask' to reduce
structure padding.
No changes from v2.
Changes from v1:
- As suggested by Paul Clarke replaced char * throttle_reason[][30] by
const char * const throttle_reason[].
drivers/cpufreq/powernv-cpufreq.c | 95 ++++++++++++++++++++-------------------
1 file changed, 49 insertions(+), 46 deletions(-)
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 597a084..c98a6e7 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -28,6 +28,7 @@
#include <linux/of.h>
#include <linux/reboot.h>
#include <linux/slab.h>
+#include <trace/events/power.h>
#include <asm/cputhreads.h>
#include <asm/firmware.h>
@@ -44,12 +45,22 @@
static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
static bool rebooting, throttled, occ_reset;
+static const char * const throttle_reason[] = {
+ "No throttling",
+ "Power Cap",
+ "Processor Over Temperature",
+ "Power Supply Failure",
+ "Over Current",
+ "OCC Reset"
+};
+
static struct chip {
unsigned int id;
bool throttled;
+ bool restore;
+ u8 throt_reason;
cpumask_t mask;
struct work_struct throttle;
- bool restore;
} *chips;
static int nr_chips;
@@ -310,41 +321,49 @@ static inline unsigned int get_nominal_index(void)
return powernv_pstate_info.max - powernv_pstate_info.nominal;
}
-static void powernv_cpufreq_throttle_check(void *data)
+static void powernv_cpufreq_check_pmax(void)
{
unsigned int cpu = smp_processor_id();
unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id());
- unsigned long pmsr;
int pmsr_pmax, i;
- pmsr = get_pmspr(SPRN_PMSR);
+ pmsr_pmax = (s8)PMSR_MAX(get_pmspr(SPRN_PMSR));
for (i = 0; i < nr_chips; i++)
if (chips[i].id == chip_id)
break;
- /* Check for Pmax Capping */
- pmsr_pmax = (s8)PMSR_MAX(pmsr);
if (pmsr_pmax != powernv_pstate_info.max) {
if (chips[i].throttled)
- goto next;
+ return;
+
chips[i].throttled = true;
if (pmsr_pmax < powernv_pstate_info.nominal)
- pr_crit("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
- cpu, chips[i].id, pmsr_pmax,
- powernv_pstate_info.nominal);
- else
- pr_info("CPU %d on Chip %u has Pmax reduced below turbo frequency (%d < %d)\n",
- cpu, chips[i].id, pmsr_pmax,
- powernv_pstate_info.max);
+ pr_warn_once("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
+ cpu, chips[i].id, pmsr_pmax,
+ powernv_pstate_info.nominal);
+
+ trace_powernv_throttle(chips[i].id,
+ throttle_reason[chips[i].throt_reason],
+ pmsr_pmax);
} else if (chips[i].throttled) {
chips[i].throttled = false;
- pr_info("CPU %d on Chip %u has Pmax restored to %d\n", cpu,
- chips[i].id, pmsr_pmax);
+ trace_powernv_throttle(chips[i].id,
+ throttle_reason[chips[i].throt_reason],
+ pmsr_pmax);
}
+}
+
+static void powernv_cpufreq_throttle_check(void *data)
+{
+ unsigned long pmsr;
+
+ pmsr = get_pmspr(SPRN_PMSR);
+
+ /* Check for Pmax Capping */
+ powernv_cpufreq_check_pmax();
/* Check if Psafe_mode_active is set in PMSR. */
-next:
if (pmsr & PMSR_PSAFE_ENABLE) {
throttled = true;
pr_info("Pstate set to safe frequency\n");
@@ -358,7 +377,7 @@ next:
if (throttled) {
pr_info("PMSR = %16lx\n", pmsr);
- pr_crit("CPU Frequency could be throttled\n");
+ pr_warn("CPU Frequency could be throttled\n");
}
}
@@ -449,15 +468,6 @@ void powernv_cpufreq_work_fn(struct work_struct *work)
}
}
-static char throttle_reason[][30] = {
- "No throttling",
- "Power Cap",
- "Processor Over Temperature",
- "Power Supply Failure",
- "Over Current",
- "OCC Reset"
- };
-
static int powernv_cpufreq_occ_msg(struct notifier_block *nb,
unsigned long msg_type, void *_msg)
{
@@ -483,7 +493,7 @@ static int powernv_cpufreq_occ_msg(struct notifier_block *nb,
*/
if (!throttled) {
throttled = true;
- pr_crit("CPU frequency is throttled for duration\n");
+ pr_warn("CPU frequency is throttled for duration\n");
}
break;
@@ -507,23 +517,18 @@ static int powernv_cpufreq_occ_msg(struct notifier_block *nb,
return 0;
}
- if (omsg.throttle_status &&
+ for (i = 0; i < nr_chips; i++)
+ if (chips[i].id == omsg.chip)
+ break;
+
+ if (omsg.throttle_status >= 0 &&
omsg.throttle_status <= OCC_MAX_THROTTLE_STATUS)
- pr_info("OCC: Chip %u Pmax reduced due to %s\n",
- (unsigned int)omsg.chip,
- throttle_reason[omsg.throttle_status]);
- else if (!omsg.throttle_status)
- pr_info("OCC: Chip %u %s\n", (unsigned int)omsg.chip,
- throttle_reason[omsg.throttle_status]);
- else
- return 0;
+ chips[i].throt_reason = omsg.throttle_status;
- for (i = 0; i < nr_chips; i++)
- if (chips[i].id == omsg.chip) {
- if (!omsg.throttle_status)
- chips[i].restore = true;
- schedule_work(&chips[i].throttle);
- }
+ if (!omsg.throttle_status)
+ chips[i].restore = true;
+
+ schedule_work(&chips[i].throttle);
}
return 0;
}
@@ -569,16 +574,14 @@ static int init_chip_info(void)
}
}
- chips = kmalloc_array(nr_chips, sizeof(struct chip), GFP_KERNEL);
+ chips = kcalloc(nr_chips, sizeof(struct chip), GFP_KERNEL);
if (!chips)
return -ENOMEM;
for (i = 0; i < nr_chips; i++) {
chips[i].id = chip[i];
- chips[i].throttled = false;
cpumask_copy(&chips[i].mask, cpumask_of_node(chip[i]));
INIT_WORK(&chips[i].throttle, powernv_cpufreq_work_fn);
- chips[i].restore = false;
}
return 0;
--
1.9.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH RESEND v4 4/4] cpufreq: powernv: Add sysfs attributes to show throttle stats
2016-01-12 10:24 [PATCH RESEND v4 0/4] cpufreq: powernv: Redesign the presentation of throttle notification Shilpasri G Bhat
` (2 preceding siblings ...)
2016-01-12 10:24 ` [PATCH RESEND v4 3/4] cpufreq: powernv: Add a trace print for the throttle event Shilpasri G Bhat
@ 2016-01-12 10:24 ` Shilpasri G Bhat
2016-01-12 11:13 ` Gautham R Shenoy
3 siblings, 1 reply; 8+ messages in thread
From: Shilpasri G Bhat @ 2016-01-12 10:24 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel
Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
Shilpasri G Bhat
Create sysfs attributes to export throttle information in
/sys/devices/system/cpu/cpufreq/chipN. The newly added sysfs files are as
follows:
1)/sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies
This gives the throttle stats for each of the available frequencies.
The throttle stat of a frequency is the total number of times the max
frequency is reduced to that frequency.
# cat /sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies
4023000 0
3990000 0
3956000 1
3923000 0
3890000 0
3857000 2
3823000 0
3790000 0
3757000 2
3724000 1
3690000 1
...
2)/sys/devices/system/cpu/cpufreq/chip0/throttle_reasons
This directory contains throttle reason files. Each file gives the
total number of times the max frequency is throttled, except for
'throttle_reset', which gives the total number of times the max
frequency is unthrottled after being throttled.
# cd /sys/devices/system/cpu/cpufreq/chip0/throttle_reasons
# cat cpu_over_temperature
7
# cat occ_reset
0
# cat over_current
0
# cat power_cap
0
# cat power_supply_failure
0
# cat throttle_reset
7
3)/sys/devices/system/cpu/cpufreq/chip0/throttle_stat
This gives the total number of events of max frequency throttling to
lower frequencies in the turbo range of frequencies and the sub-turbo(at
and below nominal) range of frequencies.
# cat /sys/devices/system/cpu/cpufreq/chip0/throttle_stat
turbo 7
sub-turbo 0
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
---
Changes from v3:
- Seperate the patch to contain only the throttle sysfs attribute changes.
- Add helper inline function get_chip_index()
Changes from v2:
- Fixed kbuild test warning.
drivers/cpufreq/powernv-cpufreq.c:609:2: warning: ignoring return
value of 'kstrtoint', declared with attribute warn_unused_result
[-Wunused-result]
Changes from v1:
- Added a kobject to struct chip
- Grouped the throttle reasons under a separate attribute_group and
exported each reason as individual file.
- Moved the sysfs files from /sys/devices/system/node/nodeN to
/sys/devices/system/cpu/cpufreq/chipN
- As suggested by Paul Clarke replaced 'Nominal' with 'sub-turbo'.
- Modified the commit message.
drivers/cpufreq/powernv-cpufreq.c | 177 +++++++++++++++++++++++++++++++++++++-
1 file changed, 173 insertions(+), 4 deletions(-)
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index c98a6e7..40ccd9d 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -54,6 +54,16 @@ static const char * const throttle_reason[] = {
"OCC Reset"
};
+enum throt_reason_type {
+ NO_THROTTLE = 0,
+ POWERCAP,
+ CPU_OVERTEMP,
+ POWER_SUPPLY_FAILURE,
+ OVERCURRENT,
+ OCC_RESET_THROTTLE,
+ OCC_MAX_REASON
+};
+
static struct chip {
unsigned int id;
bool throttled;
@@ -61,6 +71,11 @@ static struct chip {
u8 throt_reason;
cpumask_t mask;
struct work_struct throttle;
+ int throt_turbo;
+ int throt_nominal;
+ int reason[OCC_MAX_REASON];
+ int *pstate_stat;
+ struct kobject *kobj;
} *chips;
static int nr_chips;
@@ -195,6 +210,113 @@ static struct freq_attr *powernv_cpu_freq_attr[] = {
NULL,
};
+static inline int get_chip_index(struct kobject *kobj)
+{
+ int i, id;
+
+ i = kstrtoint(kobj->name + 4, 0, &id);
+ if (i)
+ return i;
+
+ for (i = 0; i < nr_chips; i++)
+ if (chips[i].id == id)
+ return i;
+ return -EINVAL;
+}
+
+static ssize_t throttle_freq_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ int i, count = 0, id;
+
+ id = get_chip_index(kobj);
+ if (id < 0)
+ return id;
+
+ for (i = 0; i < powernv_pstate_info.nr_pstates; i++)
+ count += sprintf(&buf[count], "%d %d\n",
+ powernv_freqs[i].frequency,
+ chips[id].pstate_stat[i]);
+
+ return count;
+}
+
+static struct kobj_attribute attr_throttle_frequencies =
+__ATTR(throttle_frequencies, 0444, throttle_freq_show, NULL);
+
+static ssize_t throttle_stat_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ int id, count = 0;
+
+ id = get_chip_index(kobj);
+ if (id < 0)
+ return id;
+
+ count += sprintf(&buf[count], "turbo %d\n", chips[id].throt_turbo);
+ count += sprintf(&buf[count], "sub-turbo %d\n",
+ chips[id].throt_nominal);
+
+ return count;
+}
+
+static struct kobj_attribute attr_throttle_stat =
+__ATTR(throttle_stat, 0444, throttle_stat_show, NULL);
+
+#define define_throttle_reason_attr(attr_name, val) \
+static ssize_t attr_name##_show(struct kobject *kobj, \
+ struct kobj_attribute *attr, char *buf) \
+{ \
+ int id; \
+ \
+ id = get_chip_index(kobj); \
+ if (id < 0) \
+ return id; \
+ \
+ return sprintf(buf, "%d\n", chips[id].reason[val]); \
+} \
+ \
+static struct kobj_attribute attr_##attr_name = \
+__ATTR(attr_name, 0444, attr_name##_show, NULL)
+
+define_throttle_reason_attr(throttle_reset, NO_THROTTLE);
+define_throttle_reason_attr(power_cap, POWERCAP);
+define_throttle_reason_attr(cpu_over_temperature, CPU_OVERTEMP);
+define_throttle_reason_attr(power_supply_failure, POWER_SUPPLY_FAILURE);
+define_throttle_reason_attr(over_current, OVERCURRENT);
+define_throttle_reason_attr(occ_reset, OCC_RESET_THROTTLE);
+
+static struct attribute *throttle_reason_attrs[] = {
+ &attr_throttle_reset.attr,
+ &attr_power_cap.attr,
+ &attr_cpu_over_temperature.attr,
+ &attr_power_supply_failure.attr,
+ &attr_over_current.attr,
+ &attr_occ_reset.attr,
+ NULL
+};
+
+static struct attribute *throttle_stat_attrs[] = {
+ &attr_throttle_frequencies.attr,
+ &attr_throttle_stat.attr,
+ NULL
+};
+
+static const struct attribute_group throttle_reason_group = {
+ .name = "throttle_reasons",
+ .attrs = throttle_reason_attrs,
+};
+
+static const struct attribute_group throttle_stat_group = {
+ .attrs = throttle_stat_attrs,
+};
+
+static const struct attribute_group *throttle_attr_groups[] = {
+ &throttle_stat_group,
+ &throttle_reason_group,
+ NULL
+};
+
/* Helper routines */
/* Access helpers to power mgt SPR */
@@ -325,7 +447,7 @@ static void powernv_cpufreq_check_pmax(void)
{
unsigned int cpu = smp_processor_id();
unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id());
- int pmsr_pmax, i;
+ int pmsr_pmax, i, index;
pmsr_pmax = (s8)PMSR_MAX(get_pmspr(SPRN_PMSR));
@@ -338,10 +460,18 @@ static void powernv_cpufreq_check_pmax(void)
return;
chips[i].throttled = true;
- if (pmsr_pmax < powernv_pstate_info.nominal)
+ if (pmsr_pmax < powernv_pstate_info.nominal) {
pr_warn_once("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
cpu, chips[i].id, pmsr_pmax,
powernv_pstate_info.nominal);
+ chips[i].throt_nominal++;
+ } else {
+ chips[i].throt_turbo++;
+ }
+
+ index = powernv_pstate_info.max - pmsr_pmax;
+ if (index >= 0 && index < powernv_pstate_info.nr_pstates)
+ chips[i].pstate_stat[index]++;
trace_powernv_throttle(chips[i].id,
throttle_reason[chips[i].throt_reason],
@@ -522,8 +652,10 @@ static int powernv_cpufreq_occ_msg(struct notifier_block *nb,
break;
if (omsg.throttle_status >= 0 &&
- omsg.throttle_status <= OCC_MAX_THROTTLE_STATUS)
+ omsg.throttle_status <= OCC_MAX_THROTTLE_STATUS) {
chips[i].throt_reason = omsg.throttle_status;
+ chips[i].reason[omsg.throttle_status]++;
+ }
if (!omsg.throttle_status)
chips[i].restore = true;
@@ -563,6 +695,7 @@ static int init_chip_info(void)
unsigned int chip[256];
unsigned int cpu, i;
unsigned int prev_chip_id = UINT_MAX;
+ int ret = -ENOMEM;
for_each_possible_cpu(cpu) {
unsigned int id =
@@ -576,15 +709,43 @@ static int init_chip_info(void)
chips = kcalloc(nr_chips, sizeof(struct chip), GFP_KERNEL);
if (!chips)
- return -ENOMEM;
+ goto out;
for (i = 0; i < nr_chips; i++) {
+ char name[10];
+
chips[i].id = chip[i];
cpumask_copy(&chips[i].mask, cpumask_of_node(chip[i]));
INIT_WORK(&chips[i].throttle, powernv_cpufreq_work_fn);
+ chips[i].pstate_stat = kcalloc(powernv_pstate_info.nr_pstates,
+ sizeof(int), GFP_KERNEL);
+ if (!chips[i].pstate_stat)
+ goto free;
+
+ sprintf(name, "chip%d", chips[i].id);
+ chips[i].kobj = kobject_create_and_add(name,
+ cpufreq_global_kobject);
+ if (!chips[i].kobj)
+ goto free;
+
+ ret = sysfs_create_groups(chips[i].kobj, throttle_attr_groups);
+ if (ret) {
+ pr_info("Chip %d failed to create throttle sysfs group\n",
+ chips[i].id);
+ goto free;
+ }
}
return 0;
+free:
+ nr_chips = i;
+ for (i = 0; i <= nr_chips; i++) {
+ kobject_put(chips[i].kobj);
+ kfree(chips[i].pstate_stat);
+ }
+ kfree(chips);
+out:
+ return ret;
}
static int __init powernv_cpufreq_init(void)
@@ -615,9 +776,17 @@ module_init(powernv_cpufreq_init);
static void __exit powernv_cpufreq_exit(void)
{
+ int i;
+
unregister_reboot_notifier(&powernv_cpufreq_reboot_nb);
opal_message_notifier_unregister(OPAL_MSG_OCC,
&powernv_cpufreq_opal_nb);
+ for (i = 0; i < nr_chips; i++) {
+ kobject_put(chips[i].kobj);
+ kfree(chips[i].pstate_stat);
+ }
+ kfree(chips);
+
cpufreq_unregister_driver(&powernv_cpufreq_driver);
}
module_exit(powernv_cpufreq_exit);
--
1.9.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH RESEND v4 3/4] cpufreq: powernv: Add a trace print for the throttle event
2016-01-12 10:24 ` [PATCH RESEND v4 3/4] cpufreq: powernv: Add a trace print for the throttle event Shilpasri G Bhat
@ 2016-01-12 10:55 ` Gautham R Shenoy
0 siblings, 0 replies; 8+ messages in thread
From: Gautham R Shenoy @ 2016-01-12 10:55 UTC (permalink / raw)
To: Shilpasri G Bhat
Cc: linuxppc-dev, linux-kernel, rjw, viresh.kumar, linux-pm, pc,
anton, ego, shreyas
Hi Shilpa,
Just saw this resend!
On Tue, Jan 12, 2016 at 04:24:26AM -0600, Shilpasri G Bhat wrote:
> Record the throttle event with a trace print replacing the printk,
> except for events like throttling below nominal and occ reset
> event which print a warning message.
>
> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
> ---
[..snip..]
>
> -static void powernv_cpufreq_throttle_check(void *data)
> +static void powernv_cpufreq_check_pmax(void)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This function only contains code moved from
powernv_cpufreq_throttle_check with pr_crit/pr_warns replaced by
trace_powernv_throttle. Furthermore, it is not called from any other
place. Given that the original function was ~60 lines do we really
need to split it into two separate functions ? If yes, could it be an
inline function ?
> {
> unsigned int cpu = smp_processor_id();
> unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id());
> - unsigned long pmsr;
> int pmsr_pmax, i;
>
> - pmsr = get_pmspr(SPRN_PMSR);
> + pmsr_pmax = (s8)PMSR_MAX(get_pmspr(SPRN_PMSR));
>
> for (i = 0; i < nr_chips; i++)
> if (chips[i].id == chip_id)
> break;
>
> - /* Check for Pmax Capping */
> - pmsr_pmax = (s8)PMSR_MAX(pmsr);
> if (pmsr_pmax != powernv_pstate_info.max) {
> if (chips[i].throttled)
> - goto next;
> + return;
> +
> chips[i].throttled = true;
> if (pmsr_pmax < powernv_pstate_info.nominal)
> - pr_crit("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
> - cpu, chips[i].id, pmsr_pmax,
> - powernv_pstate_info.nominal);
> - else
> - pr_info("CPU %d on Chip %u has Pmax reduced below turbo frequency (%d < %d)\n",
> - cpu, chips[i].id, pmsr_pmax,
> - powernv_pstate_info.max);
> + pr_warn_once("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
> + cpu, chips[i].id, pmsr_pmax,
> + powernv_pstate_info.nominal);
> +
> + trace_powernv_throttle(chips[i].id,
> + throttle_reason[chips[i].throt_reason],
> + pmsr_pmax);
> } else if (chips[i].throttled) {
> chips[i].throttled = false;
> - pr_info("CPU %d on Chip %u has Pmax restored to %d\n", cpu,
> - chips[i].id, pmsr_pmax);
> + trace_powernv_throttle(chips[i].id,
> + throttle_reason[chips[i].throt_reason],
> + pmsr_pmax);
> }
> +}
> +
> +static void powernv_cpufreq_throttle_check(void *data)
> +{
> + unsigned long pmsr;
> +
> + pmsr = get_pmspr(SPRN_PMSR);
> +
> + /* Check for Pmax Capping */
> + powernv_cpufreq_check_pmax();
If you want to retain this function, you could pass pmsr as an
argument instead of computing it afresh in
powernv_cpufreq_check_pmax()
> /* Check if Psafe_mode_active is set in PMSR. */
> -next:
> if (pmsr & PMSR_PSAFE_ENABLE) {
> throttled = true;
> pr_info("Pstate set to safe frequency\n");
--
Thanks and Regards
gautham.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH RESEND v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
2016-01-12 10:24 ` [PATCH RESEND v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path Shilpasri G Bhat
@ 2016-01-12 10:57 ` Shreyas B Prabhu
0 siblings, 0 replies; 8+ messages in thread
From: Shreyas B Prabhu @ 2016-01-12 10:57 UTC (permalink / raw)
To: Shilpasri G Bhat
Cc: linuxppc-dev, linux-kernel, rjw, viresh.kumar, linux-pm, pc,
anton, ego
On 01/12/2016 03:54 PM, Shilpasri G Bhat wrote:
> cpu_to_chip_id() does a DT walk through to find out the chip id by taking a
> contended device tree lock. This adds an unnecessary overhead in a hot-path.
> So instead of cpu_to_chip_id() use PIR of the cpu to find the chip id.
>
> Reported-by: Anton Blanchard <anton@samba.org>
> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
> ---
> drivers/cpufreq/powernv-cpufreq.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index cb50138..597a084 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -39,6 +39,7 @@
> #define PMSR_PSAFE_ENABLE (1UL << 30)
> #define PMSR_SPR_EM_DISABLE (1UL << 31)
> #define PMSR_MAX(x) ((x >> 32) & 0xFF)
> +#define pir_to_chip_id(pir) (((pir) >> 7) & 0x3f)
Since this is platform specific and true only for power8, this is not
the right place to put it. Either you can move this to arch/powerpc or
you can maintain a cpu to chip map within the driver.
>
> static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
> static bool rebooting, throttled, occ_reset;
> @@ -312,13 +313,14 @@ static inline unsigned int get_nominal_index(void)
> static void powernv_cpufreq_throttle_check(void *data)
> {
> unsigned int cpu = smp_processor_id();
> + unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id());
> unsigned long pmsr;
> int pmsr_pmax, i;
>
> pmsr = get_pmspr(SPRN_PMSR);
>
> for (i = 0; i < nr_chips; i++)
> - if (chips[i].id == cpu_to_chip_id(cpu))
> + if (chips[i].id == chip_id)
> break;
>
> /* Check for Pmax Capping */
> @@ -558,7 +560,8 @@ static int init_chip_info(void)
> unsigned int prev_chip_id = UINT_MAX;
>
> for_each_possible_cpu(cpu) {
> - unsigned int id = cpu_to_chip_id(cpu);
> + unsigned int id =
> + pir_to_chip_id(get_hard_smp_processor_id(cpu));
>
> if (prev_chip_id != id) {
> prev_chip_id = id;
>
Thanks,
Shreyas
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH RESEND v4 4/4] cpufreq: powernv: Add sysfs attributes to show throttle stats
2016-01-12 10:24 ` [PATCH RESEND v4 4/4] cpufreq: powernv: Add sysfs attributes to show throttle stats Shilpasri G Bhat
@ 2016-01-12 11:13 ` Gautham R Shenoy
0 siblings, 0 replies; 8+ messages in thread
From: Gautham R Shenoy @ 2016-01-12 11:13 UTC (permalink / raw)
To: Shilpasri G Bhat
Cc: linuxppc-dev, linux-kernel, rjw, viresh.kumar, linux-pm, pc,
anton, ego, shreyas
Hi Shilpa,
On Tue, Jan 12, 2016 at 04:24:27AM -0600, Shilpasri G Bhat wrote:
> +static inline int get_chip_index(struct kobject *kobj)
Probably have "get_chip_index(int id)". See the reason below.
> +{
> + int i, id;
> +
> + i = kstrtoint(kobj->name + 4, 0, &id);
> + if (i)
> + return i;
> +
> + for (i = 0; i < nr_chips; i++)
> + if (chips[i].id == id)
> + return i;
This pattern to obtain a chip index from the chip id is repeated in
multiple place inside this file. Might be worthwhile to move this to a
helper function, i.e get_chip_index(id)!
> + return -EINVAL;
> +}
> +
> +static ssize_t throttle_freq_show(struct kobject *kobj,
> + struct kobj_attribute *attr, char *buf)
> +{
> + int i, count = 0, id;
> +
We obtain the id from kobj here and then obtain the index from
id via the function below.
> + id = get_chip_index(kobj);
> + if (id < 0)
> + return id;
> +
> + for (i = 0; i < powernv_pstate_info.nr_pstates; i++)
> + count += sprintf(&buf[count], "%d %d\n",
> + powernv_freqs[i].frequency,
> + chips[id].pstate_stat[i]);
> +
> + return count;
> +}
> +
> +static struct kobj_attribute attr_throttle_frequencies =
> +__ATTR(throttle_frequencies, 0444, throttle_freq_show, NULL);
> +
--
Thanks and Regards
gautham.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2016-01-12 11:13 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-01-12 10:24 [PATCH RESEND v4 0/4] cpufreq: powernv: Redesign the presentation of throttle notification Shilpasri G Bhat
2016-01-12 10:24 ` [PATCH RESEND v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path Shilpasri G Bhat
2016-01-12 10:57 ` Shreyas B Prabhu
2016-01-12 10:24 ` [PATCH RESEND v4 2/4] cpufreq: powernv/tracing: Add powernv_throttle tracepoint Shilpasri G Bhat
2016-01-12 10:24 ` [PATCH RESEND v4 3/4] cpufreq: powernv: Add a trace print for the throttle event Shilpasri G Bhat
2016-01-12 10:55 ` Gautham R Shenoy
2016-01-12 10:24 ` [PATCH RESEND v4 4/4] cpufreq: powernv: Add sysfs attributes to show throttle stats Shilpasri G Bhat
2016-01-12 11:13 ` Gautham R Shenoy
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).