From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759598AbcAUNyk (ORCPT ); Thu, 21 Jan 2016 08:54:40 -0500 Received: from mail-wm0-f54.google.com ([74.125.82.54]:33277 "EHLO mail-wm0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1759002AbcAUNyi (ORCPT ); Thu, 21 Jan 2016 08:54:38 -0500 Subject: Re: [RFC V2 2/2] sched: idle: IRQ based next prediction for idle period To: Thomas Gleixner References: <1453305636-22156-1-git-send-email-daniel.lezcano@linaro.org> <1453305636-22156-3-git-send-email-daniel.lezcano@linaro.org> Cc: peterz@infradead.org, rafael@kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, nicolas.pitre@linaro.org, vincent.guittot@linaro.org From: Daniel Lezcano Message-ID: <56A0E31A.6020001@linaro.org> Date: Thu, 21 Jan 2016 14:54:34 +0100 User-Agent: Mozilla/5.0 (X11; Linux i686; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 01/20/2016 08:49 PM, Thomas Gleixner wrote: [ ... ] Thanks for all your comments. I agree with them. One question below. >> +static void sched_irq_timing_free(unsigned int irq) >> +{ >> + struct wakeup *w; >> + int cpu; >> + >> + for_each_possible_cpu(cpu) { >> + >> + w = per_cpu(wakeups[irq], cpu); >> + if (!w) >> + continue; >> + >> + per_cpu(wakeups[irq], cpu) = NULL; >> + kfree(w); > > Your simple array does not work. You need a radix_tree to handle SPARSE_IRQ > and you need proper protection against teardown. > > So we can avoid all that stuff and simply stick that data into irqdesc and let > the core handle it. That allows us to use proper percpu allocations and avoid > that for_each_possible_cpu() sillyness. > > That leaves the iterator, but that's a solvable problem. We simply can have an > iterator function in the irq core, which gives you the next sample > structure. Something like this: > > struct irqtiming_sample *irqtiming_get_next(int *irq) > { > struct irq_desc *desc; > int next; > > /* Do a racy lookup of the next allocated irq */ > next = irq_get_next_irq(*irq); > if (next >= nr_irqs) > return NULL; > > *irq = next + 1; > > /* Now lookup the descriptor. It's RCU protected. */ > desc = irq_to_desc(next); > if (!desc || !desc->irqtimings || !(desc->istate & IRQS_TIMING)) > return NULL; > > return this_cpu_ptr(&desc->irqtimings); > } > > And that needs to be called rcu protected; > > next = 0; > rcu_read_lock(); > sample = irqtiming_get_next(&next); > while (sample) { > .... > sample = irqtiming_get_next(&next); > } > rcu_read_unlock(); > > So the interrupt part becomes: > > if (desc->istate & IRQS_TIMING) > irqtimings_handle(__this_cpu_ptr(&desc->irqtimings)); > > So now for the allocation/free of that data. We simply allocate/free it along > with the irq descriptor. That IRQS_TIMING bit gets set in __setup_irq() except > for timer interrupts. That's simple and avoid _all_ the issues. Indeed, making this as part of the irq code makes everything much more simple and self contained. For the shared interrupts, shouldn't we put the timings samples into the irqaction structure instead of the irqdesc structure ? eg. #define IRQT_MAX_VALUES 4 struct irqaction { ... #ifdef CONFIG_IRQ_TIMINGS u32 irqtimings_samples[IRQT_MAX_VALUES]; #endif ... }; So we don't have to deal with the allocation/free under locks. The drawback is the array won't be used in the case of the timers. Does it make sense ? Thanks Thomas for your help, your time and your suggestions. -- Linaro.org │ Open source software for ARM SoCs Follow Linaro: Facebook | Twitter | Blog