From: Saravana Kannan <skannan@codeaurora.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Joonwoo Park <joonwoop@codeaurora.org>,
linux-kernel@vger.kernel.org,
John Stultz <john.stultz@linaro.org>, Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH v2 RESEND/RFC] timer: make deferrable cpu unbound timers really not bound to a cpu
Date: Tue, 24 Mar 2015 12:36:59 -0700 [thread overview]
Message-ID: <5511BCDB.208@codeaurora.org> (raw)
In-Reply-To: <550B6584.3020007@codeaurora.org>
On 03/19/2015 05:10 PM, Saravana Kannan wrote:
> On 09/23/2014 11:33 AM, Thomas Gleixner wrote:
>> On Mon, 15 Sep 2014, Joonwoo Park wrote:
>>> +#ifdef CONFIG_SMP
>>> +static struct tvec_base *tvec_base_deferral = &boot_tvec_bases;
>>> +#endif
>>
>> In principle I like the idea of a deferrable wheel, but this
>> implementation is going to go nowhere.
>
> Hi Thomas,
>
> To give some more context:
>
> This bug is a serious pain in the a** for anyone using deferrable timers
> or deferrable workqueues today for some periodic work and don't care for
> which CPU the code runs in.
>
> Couple of examples of such issues in existing code:
>
> 1) In a SMP system, CPUfreq governors (ondemand and conservative) end up
> queueing a deferrable work on every single CPU and the first one to run
> the deferrable workqueue cancels the work on all other CPUS, runs the
> work and then sets up a workqueue on all the CPUs again for the next
> sampling point.
>
> 2) Devfreq actually doesn't work well today because it doesn't do this
> nasty hack like CPUfreq. So, if the devfreq work happens to run on a CPU
> that's typically idle, then it doesn't run for a long time. I've
> actually seen logs where the devfreq polling interval is set to 20ms,
> but ends up running 800ms later.
>
> Don't know how many other drivers are suffering from this bug. Yes,
> IMHO, this is a bug. When the timer is CPU unbound, anyone who hasn't
> looked at the actual timer code would assume that it'll run as long as
> any CPU is busy.
>
>> First of all making it SMP only is silly. The deferrable stuff is a
>> pain in other places as well.
>>
>> But whats way worse is:
>>
>>> +static inline void __run_timers(struct tvec_base *base, bool try)
>>> {
>>> struct timer_list *timer;
>>>
>>> - spin_lock_irq(&base->lock);
>>> + if (!try)
>>> + spin_lock_irq(&base->lock);
>>> + else if (!spin_trylock_irq(&base->lock))
>>> + return;
>>
>> Yuck. All cpus fighting about a single spinlock roughly at the same
>> time? You just created a proper thundering herd cacheline bouncing
>> issue.
>
> I agree, This is not good. I think a simple way to fix this is to have
> only the CPU that's the jiffy owner to run through this deferrable timer
> list.
>
> That should address any unnecessary cache bouncing issues. Would that be
> an acceptable solution to this?
>
>>
>> No way. We have already mechanisms in place to deal with such
>> problems, you just have to use them.
>
> I'm not sure which problem you are referring to here. Or what the
> already existing solutions are.
>
> I don't think you were referring to the "deferrable timer getting
> delayed for long periods despite CPUs being active" problem, because I
> don't think we have a better solution than this patch (with the jiffy
> owner CPU fix). Asking every driver that doesn't care which CPU the work
> runs on to queue a work or set up a timer on every CPU is definitely not
> a good solution -- that's spreading the complexity to every other driver
> instead of fixing it in one place. And that causes unnecessary cache
> pollution too.
>
> Thoughts? I would really like to see a fix for this go in soon so that
> we can simplify cpufreq and have devfreq and other drivers work correctly.
>
> Thanks,
> Saravana
>
Thomas,
Bump.
-Saravana
--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
next prev parent reply other threads:[~2015-03-24 19:37 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-16 2:59 [PATCH v2 RESEND/RFC] timer: make deferrable cpu unbound timers really not bound to a cpu Joonwoo Park
2014-09-23 18:33 ` Thomas Gleixner
2014-09-26 20:54 ` Joonwoo Park
2015-03-20 0:10 ` Saravana Kannan
2015-03-24 19:36 ` Saravana Kannan [this message]
2015-03-30 20:16 ` Joonwoo Park
2015-04-28 2:39 ` [PATCH 2/2] " Joonwoo Park
2015-04-28 2:41 ` Joonwoo Park
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5511BCDB.208@codeaurora.org \
--to=skannan@codeaurora.org \
--cc=john.stultz@linaro.org \
--cc=joonwoop@codeaurora.org \
--cc=linux-kernel@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.