From: mark gross <markgross@thegnar.org>
To: Tim Chen <tim.c.chen@linux.intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>,
mark gross <markgross@thegnar.org>,
James Bottomley <James.Bottomley@suse.de>,
David Alan Gilbert <linux@treblig.org>,
linux-kernel@vger.kernel.org, Len <len.brown@intel.com>,
Andi Kleen <ak@linux.intel.com>,
Arjan van de Ven <arjan@linux.intel.com>
Subject: Re: [Patch] idle governor: Avoid lock acquisition to read pm_qos before entering idle
Date: Wed, 9 Feb 2011 21:10:42 -0800
Message-ID: <20110210051042.GC4897@gvim.org>
In-Reply-To: <1297300864.2645.128.camel@schen9-DESK>
On Wed, Feb 09, 2011 at 05:21:04PM -0800, Tim Chen wrote:
> I noticed that before entering idle state, the menu idle governor will
> look up the current pm_qos value according to the list of qos request
> received. This look up currently needs the acquisition of a lock to go
> down a list of qos requests to find the qos value, slowing down the
Wait a second... it gets the target_value (which was an atomic variable).
Humpf. I was looking at 2.6.35, where this is true. If you want to put
back a target_value, why not put it back the way it was?
> entrance into idle state due to contention by multiple cpus to traverse
> this list. The contention is severe when there are a lot of cpus waking
> and going into idle. For example, for a simple workload that has 32
> pairs of processes ping-ponging messages to each other, where 64 cores
> are active in the test system, I see the following profile:
>
> - 37.82% swapper [kernel.kallsyms] [k]
> _raw_spin_lock_irqsave
> - _raw_spin_lock_irqsave
> - 95.65% pm_qos_request
> menu_select
> cpuidle_idle_call
> - cpu_idle
> 99.98% start_secondary
>
I'm surprised by this, as the last update to pm_qos replaced the lists
with an O(1) data structure, so there is no more walking of pending
requests.
What does the profile look like after the patch? The plist lookup should
be only one dereference and an if instruction slower than a cached value.
Does your patch remove the need for the locks? Because if it doesn't, I
don't see how it will make much of a difference.
> Perhaps a better approach will be to cache the updated pm_qos value so
> reading it does not require lock acquisition as in the patch below.
See v2.6.35 for a possible instance of the better approach.
>
> Tim
>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> diff --git a/include/linux/pm_qos_params.h b/include/linux/pm_qos_params.h
> index 77cbddb..a7d87f9 100644
> --- a/include/linux/pm_qos_params.h
> +++ b/include/linux/pm_qos_params.h
> @@ -16,6 +16,10 @@
> #define PM_QOS_NUM_CLASSES 4
> #define PM_QOS_DEFAULT_VALUE -1
>
> +#define PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC)
> +#define PM_QOS_NETWORK_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC)
> +#define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE 0
> +
> struct pm_qos_request_list {
> struct plist_node list;
> int pm_qos_class;
> diff --git a/kernel/pm_qos_params.c b/kernel/pm_qos_params.c
> index aeaa7f8..b6310d1 100644
> --- a/kernel/pm_qos_params.c
> +++ b/kernel/pm_qos_params.c
> @@ -58,6 +58,7 @@ struct pm_qos_object {
> struct blocking_notifier_head *notifiers;
> struct miscdevice pm_qos_power_miscdev;
> char *name;
> + s32 value;
> s32 default_value;
> enum pm_qos_type type;
> };
> @@ -70,7 +71,8 @@ static struct pm_qos_object cpu_dma_pm_qos = {
> .requests = PLIST_HEAD_INIT(cpu_dma_pm_qos.requests, pm_qos_lock),
> .notifiers = &cpu_dma_lat_notifier,
> .name = "cpu_dma_latency",
> - .default_value = 2000 * USEC_PER_SEC,
> + .value = PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE,
> + .default_value = PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE,
> .type = PM_QOS_MIN,
> };
>
> @@ -79,7 +81,8 @@ static struct pm_qos_object network_lat_pm_qos = {
> .requests = PLIST_HEAD_INIT(network_lat_pm_qos.requests, pm_qos_lock),
> .notifiers = &network_lat_notifier,
> .name = "network_latency",
> - .default_value = 2000 * USEC_PER_SEC,
> + .value = PM_QOS_NETWORK_LAT_DEFAULT_VALUE,
> + .default_value = PM_QOS_NETWORK_LAT_DEFAULT_VALUE,
> .type = PM_QOS_MIN
> };
>
> @@ -89,7 +92,8 @@ static struct pm_qos_object network_throughput_pm_qos = {
> .requests = PLIST_HEAD_INIT(network_throughput_pm_qos.requests, pm_qos_lock),
> .notifiers = &network_throughput_notifier,
> .name = "network_throughput",
> - .default_value = 0,
> + .value = PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE,
> + .default_value = PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE,
> .type = PM_QOS_MAX,
> };
>
> @@ -132,6 +136,16 @@ static inline int pm_qos_get_value(struct pm_qos_object *o)
> }
> }
>
> +static inline s32 pm_qos_read_value(struct pm_qos_object *o)
> +{
> + return o->value;
> +}
> +
> +static inline int pm_qos_set_value(struct pm_qos_object *o, s32 value)
> +{
> + o->value = value;
> +}
> +
> static void update_target(struct pm_qos_object *o, struct plist_node *node,
> int del, int value)
> {
> @@ -156,6 +170,7 @@ static void update_target(struct pm_qos_object *o, struct plist_node *node,
> plist_add(node, &o->requests);
> }
> curr_value = pm_qos_get_value(o);
> + pm_qos_set_value(o, curr_value);
> spin_unlock_irqrestore(&pm_qos_lock, flags);
>
> if (prev_value != curr_value)
> @@ -190,18 +205,11 @@ static int find_pm_qos_object_by_minor(int minor)
> * pm_qos_request - returns current system wide qos expectation
> * @pm_qos_class: identification of which qos value is requested
> *
> - * This function returns the current target value in an atomic manner.
> + * This function returns the current target value.
> */
> int pm_qos_request(int pm_qos_class)
> {
> - unsigned long flags;
> - int value;
> -
> - spin_lock_irqsave(&pm_qos_lock, flags);
> - value = pm_qos_get_value(pm_qos_array[pm_qos_class]);
Hmmm, I was looking at 2.6.35, where we had the return of an
atomic value here.
What happened to the atomic value that was in the 2.6.35 kernel?
> - spin_unlock_irqrestore(&pm_qos_lock, flags);
> -
> - return value;
> + return pm_qos_read_value(pm_qos_array[pm_qos_class]);
Why does this not need a lock? This value is getting updated by
multiple CPUs concurrently. It should probably at least be an atomic
read, like it used to be in 2.6.35.
It looks to me like your change really just removes the locks around
the current QoS target. The caching of the target to avoid hitting the
plist is really close to a noop.
--mark
> }
> EXPORT_SYMBOL_GPL(pm_qos_request);
>
>
>
Thread overview: 17+ messages
2011-02-10 1:21 [Patch] idle governor: Avoid lock acquisition to read pm_qos before entering idle Tim Chen
2011-02-10 3:46 ` Andi Kleen
2011-02-10 18:59 ` Rafael J. Wysocki
2011-02-10 5:10 ` mark gross [this message]
2011-02-10 16:20 ` Andi Kleen
2011-02-10 17:27 ` Tim Chen
2011-02-10 17:45 ` James Bottomley
2011-02-10 18:50 ` Rafael J. Wysocki
2011-02-10 19:33 ` Tim Chen
2011-02-10 20:37 ` James Bottomley
2011-02-10 17:55 ` mark gross
2011-02-10 23:17 ` Rafael J. Wysocki
2011-02-11 0:50 ` Tim Chen
2011-02-11 19:22 ` Rafael J. Wysocki
2011-02-11 19:27 ` Tim Chen
2011-02-11 19:33 ` James Bottomley
2011-02-11 20:03 ` Tim Chen