From mboxrd@z Thu Jan  1 00:00:00 1970
From: Daniel Lezcano
Subject: Re: [RFC] [PATCH 3/3] QoS: Enhance framework to support cpu/irq specific QoS requests
Date: Fri, 01 Aug 2014 17:58:39 +0200
Message-ID: <53DBB92F.60602@linaro.org>
References: <1406307334-8288-1-git-send-email-lina.iyer@linaro.org> <1406307334-8288-4-git-send-email-lina.iyer@linaro.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
In-Reply-To: <1406307334-8288-4-git-send-email-lina.iyer@linaro.org>
List-Id: linux-pm@vger.kernel.org
To: Lina Iyer, linux-pm@vger.kernel.org
Cc: linus.walleij@linaro.org, arnd.bergmann@linaro.org, rjw@rjwysocki.net, tglx@linutronix.de, Praveen Chidambaram

On 07/25/2014 06:55 PM, Lina Iyer wrote:
> A QoS request for CPU_DMA_LATENCY can be better optimized if the request
> can be set only for the required cpus and not for all cpus. This helps
> save power on the other cores while still guaranteeing the quality of
> service.
>
> Enhance the QoS constraints data structures to support a target value for
> each core. Requests specify whether the QoS applies to all cores (the
> default), to a selected subset of the cores, or to the core(s) that an
> IRQ is affine to.
>
> QoS requests that need to track an IRQ can be set to apply only to the
> cpus to which the IRQ's smp_affinity attribute is set. The QoS framework
> will automatically track IRQ migration between the cores; the QoS is
> updated to apply only to the core(s) that the IRQ has been migrated to.
>
> Idle drivers and other interested drivers can request a PM QoS value for
> a constraint across all cpus, for a specific cpu, or for a set of cpus.
> Separate APIs have been added to query the value for an individual cpu
> or for a cpumask. The default behaviour of PM QoS is maintained, i.e.,
> requests that do not specify a type continue to apply to all cores.
> Requests that specify an affinity to cpu(s) or to an irq set the type of
> the request and either the cpumask or the IRQ number, depending on the
> type. Updating the request does not reset its type.
>
> The userspace sysfs interface does not support CPU/IRQ affinity.
>
> Signed-off-by: Praveen Chidambaram
> Signed-off-by: Lina Iyer
> ---
>  Documentation/power/pm_qos_interface.txt |  18 ++++
>  include/linux/pm_qos.h                   |  16 +++
>  kernel/power/qos.c                       | 170 +++++++++++++++++++++++++
>  3 files changed, 204 insertions(+)
>
> diff --git a/Documentation/power/pm_qos_interface.txt b/Documentation/power/pm_qos_interface.txt
> index a5da5c7..32b864d 100644
> --- a/Documentation/power/pm_qos_interface.txt
> +++ b/Documentation/power/pm_qos_interface.txt
> @@ -41,6 +41,17 @@ registered notifiers are called only if the target value is now different.
>  Clients of pm_qos need to save the returned handle for future use in other
>  pm_qos API functions.
>
> +The handle is a pm_qos_request object. By default the request object sets
> +the request type to PM_QOS_REQ_ALL_CORES, in which case the PM QoS request
> +applies to all cores. However, the driver can also set the request type to
> +one of:
> +	PM_QOS_REQ_ALL_CORES,
> +	PM_QOS_REQ_AFFINE_CORES,
> +	PM_QOS_REQ_AFFINE_IRQ.
> +
> +Specify the cpumask when the type is set to PM_QOS_REQ_AFFINE_CORES, and
> +specify the IRQ number with PM_QOS_REQ_AFFINE_IRQ.
> +
>  void pm_qos_update_request(handle, new_target_value):
>  Will update the list element pointed to by the handle with the new target value
>  and recompute the new aggregated target, calling the notification tree if the
> @@ -54,6 +65,13 @@ the request.
>  int pm_qos_request(param_class):
>  Returns the aggregated value for a given PM QoS class.
>
> +int pm_qos_request_for_cpu(param_class, cpu):
> +Returns the aggregated value for a given PM QoS class for the specified cpu.
> +
> +int pm_qos_request_for_cpumask(param_class, cpumask):
> +Returns the aggregated value for a given PM QoS class for the specified
> +cpumask.
> +

You can get rid of 'pm_qos_request_for_cpu': callers can simply use
'pm_qos_request_for_cpumask' with cpumask_of(cpu).

IMO, you should split this patch into several. First bring in the per-cpu
support without adding new APIs, then introduce the different changes one
by one.

>  int pm_qos_request_active(handle):
>  Returns if the request is still active, i.e. it has not been removed from a
>  PM QoS class constraints list.
> diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h
> index e1b763d..05d7c46 100644
> --- a/include/linux/pm_qos.h
> +++ b/include/linux/pm_qos.h
> @@ -9,6 +9,8 @@
>  #include
>  #include
>  #include
> +#include
> +#include
>
>  enum {
>  	PM_QOS_RESERVED = 0,
> @@ -40,7 +42,18 @@ enum pm_qos_flags_status {
>  #define PM_QOS_FLAG_NO_POWER_OFF	(1 << 0)
>  #define PM_QOS_FLAG_REMOTE_WAKEUP	(1 << 1)
>
> +enum pm_qos_req_type {
> +	PM_QOS_REQ_ALL_CORES = 0,
> +	PM_QOS_REQ_AFFINE_CORES,
> +	PM_QOS_REQ_AFFINE_IRQ,
> +};
> +
>  struct pm_qos_request {
> +	enum pm_qos_req_type type;
> +	struct cpumask cpus_affine;
> +	uint32_t irq;
> +	/* Internal structure members */
> +	struct irq_affinity_notify irq_notify;
>  	struct plist_node node;
>  	int pm_qos_class;
>  	struct delayed_work work; /* for pm_qos_update_request_timeout */
> @@ -80,6 +93,7 @@ enum pm_qos_type {
>  struct pm_qos_constraints {
>  	struct plist_head list;
>  	s32 target_value;	/* Do not change to 64 bit */
> +	s32 target_per_cpu[NR_CPUS];
>  	s32 default_value;
>  	s32 no_constraint_value;
>  	enum pm_qos_type type;
> @@ -127,6 +141,8 @@ void pm_qos_update_request_timeout(struct pm_qos_request *req,
>  void pm_qos_remove_request(struct pm_qos_request *req);
>
>  int pm_qos_request(int pm_qos_class);
> +int pm_qos_request_for_cpu(int pm_qos_class, int cpu);
> +int pm_qos_request_for_cpumask(int pm_qos_class, struct cpumask *mask);
>  int pm_qos_add_notifier(int pm_qos_class, struct notifier_block *notifier);
>  int pm_qos_remove_notifier(int pm_qos_class, struct notifier_block *notifier);
>  int pm_qos_request_active(struct pm_qos_request *req);
> diff --git a/kernel/power/qos.c b/kernel/power/qos.c
> index d0b9c0f..92d8521 100644
> --- a/kernel/power/qos.c
> +++ b/kernel/power/qos.c
> @@ -41,6 +41,9 @@
>  #include
>  #include
>  #include
> +#include
> +#include
> +#include
>
>  #include
>  #include
> @@ -65,6 +68,8 @@ static BLOCKING_NOTIFIER_HEAD(cpu_dma_lat_notifier);
>  static struct pm_qos_constraints cpu_dma_constraints = {
>  	.list = PLIST_HEAD_INIT(cpu_dma_constraints.list),
>  	.target_value = PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE,
> +	.target_per_cpu = { [0 ... (NR_CPUS - 1)] =
> +				PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE },
>  	.default_value = PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE,
>  	.no_constraint_value = PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE,
>  	.type = PM_QOS_MIN,
> @@ -79,6 +84,8 @@ static BLOCKING_NOTIFIER_HEAD(network_lat_notifier);
>  static struct pm_qos_constraints network_lat_constraints = {
>  	.list = PLIST_HEAD_INIT(network_lat_constraints.list),
>  	.target_value = PM_QOS_NETWORK_LAT_DEFAULT_VALUE,
> +	.target_per_cpu = { [0 ... (NR_CPUS - 1)] =
> +				PM_QOS_NETWORK_LAT_DEFAULT_VALUE },
>  	.default_value = PM_QOS_NETWORK_LAT_DEFAULT_VALUE,
>  	.no_constraint_value = PM_QOS_NETWORK_LAT_DEFAULT_VALUE,
>  	.type = PM_QOS_MIN,
> @@ -94,6 +101,8 @@ static BLOCKING_NOTIFIER_HEAD(network_throughput_notifier);
>  static struct pm_qos_constraints network_tput_constraints = {
>  	.list = PLIST_HEAD_INIT(network_tput_constraints.list),
>  	.target_value = PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE,
> +	.target_per_cpu = { [0 ... (NR_CPUS - 1)] =
> +				PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE },
>  	.default_value = PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE,
>  	.no_constraint_value = PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE,
>  	.type = PM_QOS_MAX,
> @@ -157,6 +166,34 @@ static inline void pm_qos_set_value(struct pm_qos_constraints *c, s32 value)
>  	c->target_value = value;
>  }

Wouldn't it make sense to create a per-cpu pm_qos_constraints variable
instead of adding an array of cpus to the structure?

> +static inline void pm_qos_set_value_for_cpus(struct pm_qos_constraints *c)
> +{
> +	struct pm_qos_request *req = NULL;
> +	int cpu;
> +	s32 qos_val[NR_CPUS] = { [0 ... (NR_CPUS - 1)] = c->default_value };

The kernel stack size is small, IIRC 2*PAGE_SIZE. NR_CPUS could be big
(1024). That may lead to a stack overflow on some systems.
> +
> +	plist_for_each_entry(req, &c->list, node) {
> +		for_each_cpu(cpu, &req->cpus_affine) {
> +			switch (c->type) {
> +			case PM_QOS_MIN:
> +				if (qos_val[cpu] > req->node.prio)
> +					qos_val[cpu] = req->node.prio;
> +				break;
> +			case PM_QOS_MAX:
> +				if (req->node.prio > qos_val[cpu])
> +					qos_val[cpu] = req->node.prio;
> +				break;
> +			default:
> +				BUG();
> +				break;
> +			}
> +		}
> +	}
> +
> +	for_each_possible_cpu(cpu)
> +		c->target_per_cpu[cpu] = qos_val[cpu];
> +}
> +
>  /**
>   * pm_qos_update_target - manages the constraints list and calls the notifiers
>   * if needed
> @@ -206,6 +243,7 @@ int pm_qos_update_target(struct pm_qos_constraints *c,
>
>  	curr_value = pm_qos_get_value(c);
>  	pm_qos_set_value(c, curr_value);
> +	pm_qos_set_value_for_cpus(c);
>
>  	spin_unlock_irqrestore(&pm_qos_lock, flags);
>
> @@ -298,6 +336,44 @@ int pm_qos_request(int pm_qos_class)
>  }
>  EXPORT_SYMBOL_GPL(pm_qos_request);
>
> +int pm_qos_request_for_cpu(int pm_qos_class, int cpu)
> +{
> +	return pm_qos_array[pm_qos_class]->constraints->target_per_cpu[cpu];
> +}
> +EXPORT_SYMBOL(pm_qos_request_for_cpu);
> +
> +int pm_qos_request_for_cpumask(int pm_qos_class, struct cpumask *mask)
> +{
> +	unsigned long irqflags;
> +	int cpu;
> +	struct pm_qos_constraints *c = NULL;
> +	int val;
> +
> +	spin_lock_irqsave(&pm_qos_lock, irqflags);
> +	c = pm_qos_array[pm_qos_class]->constraints;
> +	val = c->default_value;
> +
> +	for_each_cpu(cpu, mask) {
> +		switch (c->type) {
> +		case PM_QOS_MIN:
> +			if (c->target_per_cpu[cpu] < val)
> +				val = c->target_per_cpu[cpu];
> +			break;
> +		case PM_QOS_MAX:
> +			if (c->target_per_cpu[cpu] > val)
> +				val = c->target_per_cpu[cpu];
> +			break;
> +		default:
> +			BUG();
> +			break;
> +		}
> +	}
> +	spin_unlock_irqrestore(&pm_qos_lock, irqflags);
> +
> +	return val;
> +}
> +EXPORT_SYMBOL(pm_qos_request_for_cpumask);
> +
>  int pm_qos_request_active(struct pm_qos_request *req)
>  {
>  	return req->pm_qos_class != 0;
> @@ -330,6 +406,35 @@ static void pm_qos_work_fn(struct work_struct *work)
>  	__pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);
>  }
>
> +static void pm_qos_irq_release(struct kref *ref)
> +{
> +	unsigned long flags;
> +	struct irq_affinity_notify *notify = container_of(ref,
> +			struct irq_affinity_notify, kref);
> +	struct pm_qos_request *req = container_of(notify,
> +			struct pm_qos_request, irq_notify);
> +
> +	spin_lock_irqsave(&pm_qos_lock, flags);
> +	cpumask_clear(&req->cpus_affine);
> +	spin_unlock_irqrestore(&pm_qos_lock, flags);
> +}
> +
> +static void pm_qos_irq_notify(struct irq_affinity_notify *notify,
> +		const cpumask_t *mask)
> +{
> +	unsigned long flags;
> +	struct pm_qos_request *req = container_of(notify,
> +			struct pm_qos_request, irq_notify);
> +	struct pm_qos_constraints *c =
> +		pm_qos_array[req->pm_qos_class]->constraints;
> +
> +	spin_lock_irqsave(&pm_qos_lock, flags);
> +	cpumask_copy(&req->cpus_affine, mask);
> +	spin_unlock_irqrestore(&pm_qos_lock, flags);
> +
> +	pm_qos_update_target(c, req, PM_QOS_UPDATE_REQ, req->node.prio);
> +}
> +
>  /**
>   * pm_qos_add_request - inserts new qos request into the list
>   * @req: pointer to a preallocated handle
> @@ -353,6 +458,51 @@ void pm_qos_add_request(struct pm_qos_request *req,
>  		WARN(1, KERN_ERR "pm_qos_add_request() called for already added request\n");
>  		return;
>  	}
> +
> +	switch (req->type) {
> +	case PM_QOS_REQ_AFFINE_CORES:
> +		if (cpumask_empty(&req->cpus_affine)) {
> +			req->type = PM_QOS_REQ_ALL_CORES;
> +			cpumask_setall(&req->cpus_affine);
> +			WARN(1, KERN_ERR "Affine cores not set for request with affinity flag\n");
> +		}
> +		break;
> +
> +	case PM_QOS_REQ_AFFINE_IRQ:
> +		if (irq_can_set_affinity(req->irq)) {
> +			int ret = 0;
> +			struct irq_desc *desc = irq_to_desc(req->irq);
> +			struct cpumask *mask = desc->irq_data.affinity;
> +
> +			/* Get the current affinity */
> +			cpumask_copy(&req->cpus_affine, mask);
> +			req->irq_notify.irq = req->irq;
> +			req->irq_notify.notify = pm_qos_irq_notify;
> +			req->irq_notify.release = pm_qos_irq_release;
> +
> +			ret = irq_set_affinity_notifier(req->irq,
> +					&req->irq_notify);
> +			if (ret) {
> +				WARN(1, KERN_ERR "IRQ affinity notify set failed\n");
> +				req->type = PM_QOS_REQ_ALL_CORES;
> +				cpumask_setall(&req->cpus_affine);
> +			}
> +		} else {
> +			req->type = PM_QOS_REQ_ALL_CORES;
> +			cpumask_setall(&req->cpus_affine);
> +			WARN(1, KERN_ERR "IRQ-%d not set for request with affinity flag\n",
> +					req->irq);
> +		}
> +		break;
> +
> +	default:
> +		WARN(1, KERN_ERR "Unknown request type %d\n", req->type);
> +		/* fall through */
> +	case PM_QOS_REQ_ALL_CORES:
> +		cpumask_setall(&req->cpus_affine);
> +		break;
> +	}
> +
>  	req->pm_qos_class = pm_qos_class;
>  	INIT_DELAYED_WORK(&req->work, pm_qos_work_fn);
>  	trace_pm_qos_add_request(pm_qos_class, value);
> @@ -426,10 +576,16 @@ void pm_qos_update_request_timeout(struct pm_qos_request *req, s32 new_value,
>   */
>  void pm_qos_remove_request(struct pm_qos_request *req)
>  {
> +	int count = 10;
> +
>  	if (!req) /*guard against callers passing in null */
>  		return;
>  		/* silent return to keep pcm code cleaner */
>
> +	/* Remove ourselves from the irq notification */
> +	if (req->type == PM_QOS_REQ_AFFINE_IRQ)
> +		irq_release_affinity_notifier(&req->irq_notify);
> +
>  	if (!pm_qos_request_active(req)) {
>  		WARN(1, KERN_ERR "pm_qos_remove_request() called for unknown object\n");
>  		return;
>  	}
> @@ -441,6 +597,20 @@ void pm_qos_remove_request(struct pm_qos_request *req)
>  	pm_qos_update_target(pm_qos_array[req->pm_qos_class]->constraints,
>  			req, PM_QOS_REMOVE_REQ,
>  			PM_QOS_DEFAULT_VALUE);
> +
> +	/**
> +	 * The 'release' callback of the notifier will not be called until
> +	 * there are no active users of the irq_notify object, i.e., until
> +	 * the kref count drops to zero. This could happen if an active
> +	 * 'notify' callback is running while pm_qos_remove_request is
> +	 * called. Wait until the release callback clears the cpus_affine
> +	 * mask.
> +	 */
> +	if (req->type == PM_QOS_REQ_AFFINE_IRQ) {
> +		while (!cpumask_empty(&req->cpus_affine) && count--)
> +			msleep(50);
> +		BUG_ON(!count);
> +	}

You shouldn't use this polling approach; use a locking mechanism instead.

>  	memset(req, 0, sizeof(*req));
>  }
>  EXPORT_SYMBOL_GPL(pm_qos_remove_request);
> --

-- 
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog