From: Dimitri Sivanich <sivanich@sgi.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [patch 2/3] timer: move calc_load to softirq
Date: Thu, 7 May 2009 13:18:10 -0500
Message-ID: <20090507181810.GB6549@sgi.com>
In-Reply-To: <alpine.LFD.2.00.0905022138010.3375@localhost.localdomain>

On Sat, May 02, 2009 at 09:54:49PM +0200, Thomas Gleixner wrote:
> On Sat, 2 May 2009, Andrew Morton wrote:
> > > +	spin_lock(&avenrun_lock);
> > > +	ticks = atomic_read(&avenrun_ticks);
> > > +	if (ticks >= LOAD_FREQ) {
> > > +		atomic_sub(LOAD_FREQ, &avenrun_ticks);
> > > +		calc_global_load();
> > >  	}
> > > +	spin_unlock(&avenrun_lock);
> > > +	*calc = 0;
> > > +}
> >
> > I wonder if we really really need avenrun_lock. Various bits of code
> > (eg net/sched/em_meta.c) cheerily read avenrun[] without locking.
>
> I don't care about the reader side anyway, the lock is just there to
> protect the calc_load update from two cpus, but that's probably
> paranoia.
>
> Though, there is a theoretical race between 2 cpus which might want to
> update avenrun_ticks in the NOHZ case, but thinking more about it we
> can just prevent this by clever usage of the atomic ops on
> avenrun_ticks.
>
> Thanks,
>
> tglx
>
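For anyone following along, the atomic-ops idea described above can be modelled in user space. This is only a minimal sketch, not the kernel code: it uses C11 <stdatomic.h> in place of the kernel's atomic_t, the HZ value is a placeholder, and the helper names ticks_add(), try_claim_interval() and demo_two_claimers() are hypothetical. A CPU subtracts one interval from the shared counter; if the result is still non-negative it owns the update, otherwise it puts the interval back.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define HZ        1000            /* assumption: placeholder tick rate */
#define LOAD_FREQ (5 * HZ + 1)    /* assumption: roughly a 5 second interval */

/* User-space stand-in for the shared tick counter. */
static atomic_int avenrun_ticks;

/* Tick path: account elapsed ticks, report whether an interval is due. */
static bool ticks_add(int ticks)
{
	/* atomic_fetch_add() returns the old value, so add ticks again. */
	int now = atomic_fetch_add(&avenrun_ticks, ticks) + ticks;

	return now >= LOAD_FREQ;
}

/* Softirq path: try to claim exactly one full interval. */
static bool try_claim_interval(void)
{
	int left = atomic_fetch_sub(&avenrun_ticks, LOAD_FREQ) - LOAD_FREQ;

	if (left >= 0)
		return true;	/* a full interval was available: we own the update */

	/* Another claimer got there first: undo our subtraction. */
	atomic_fetch_add(&avenrun_ticks, LOAD_FREQ);
	return false;
}

/* Two back-to-back claimers on one elapsed interval: only the first wins. */
static void demo_two_claimers(void)
{
	atomic_store(&avenrun_ticks, 0);
	assert(ticks_add(LOAD_FREQ));
	assert(try_claim_interval());
	assert(!try_claim_interval());
	assert(atomic_load(&avenrun_ticks) == 0);
}
```

The counter in this sketch is deliberately a signed int: between the subtract and the add-back of a losing claimer it is briefly negative, so the tick path has to compare it as a signed value.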
> ----------->
> Subject: timer: move calc_load to softirq
> From: Thomas Gleixner <tglx@linutronix.de>
> Date: Sat, 2 May 2009 19:43:41 +0200
>
> xtime_lock is held write locked across calc_load() which iterates over
> all online CPUs. That can cause long latencies for xtime_lock readers
> on large SMP systems. The load average calculation is a rough
> estimate anyway so there is no real need to protect the readers
> vs. the update. It's not a problem when the avenrun array is updated
> while a reader copies the values.
>
> Move the calculation to the softirq and reduce the xtime_lock write
> locked section. This also reduces the interrupts off section.
>
> Inspired by an initial patch from Dimitri Sivanich.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
> ---
> kernel/time/timekeeping.c | 2 -
> kernel/timer.c | 57 ++++++++++++++++++++++++++++++++--------------
> 2 files changed, 41 insertions(+), 18 deletions(-)
>
> Index: linux-2.6/kernel/time/timekeeping.c
> ===================================================================
> --- linux-2.6.orig/kernel/time/timekeeping.c
> +++ linux-2.6/kernel/time/timekeeping.c
> @@ -22,7 +22,7 @@
>
> /*
> * This read-write spinlock protects us from races in SMP while
> - * playing with xtime and avenrun.
> + * playing with xtime.
> */
> __cacheline_aligned_in_smp DEFINE_SEQLOCK(xtime_lock);
>
> Index: linux-2.6/kernel/timer.c
> ===================================================================
> --- linux-2.6.orig/kernel/timer.c
> +++ linux-2.6/kernel/timer.c
> @@ -1127,12 +1127,13 @@ void update_process_times(int user_tick)
> * imply that avenrun[] is the standard name for this kind of thing.
> * Nothing else seems to be standardized: the fractional size etc
> * all seem to differ on different machines.
> - *
> - * Requires xtime_lock to access.
> */
> unsigned long avenrun[3];
> EXPORT_SYMBOL(avenrun);
>
> +static atomic_t avenrun_ticks;
> +static DEFINE_PER_CPU(int, avenrun_calculate);
> +
> static unsigned long
> calc_load(unsigned long load, unsigned long exp, unsigned long active)
> {
> @@ -1143,23 +1144,44 @@ calc_load(unsigned long load, unsigned l
>
> /*
> * calc_load - given tick count, update the avenrun load estimates.
> - * This is called while holding a write_lock on xtime_lock.
> */
> -static void calc_global_load(unsigned long ticks)
> +static void calc_global_load(void)
>  {
> -	unsigned long active_tasks; /* fixed-point */
> -	static int count = LOAD_FREQ;
> +	unsigned long active_tasks = nr_active() * FIXED_1;
>
> -	count -= ticks;
> -	if (unlikely(count < 0)) {
> -		active_tasks = nr_active() * FIXED_1;
> -		do {
> -			avenrun[0] = calc_load(avenrun[0], EXP_1, active_tasks);
> -			avenrun[1] = calc_load(avenrun[1], EXP_5, active_tasks);
> -			avenrun[2] = calc_load(avenrun[2], EXP_15, active_tasks);
> -			count += LOAD_FREQ;
> -		} while (count < 0);
> -	}
> +	avenrun[0] = calc_load(avenrun[0], EXP_1, active_tasks);
> +	avenrun[1] = calc_load(avenrun[1], EXP_5, active_tasks);
> +	avenrun[2] = calc_load(avenrun[2], EXP_15, active_tasks);
> +}
> +
> +/*
> + * Check whether do_timer has set avenrun_calculate. The variable is
> + * cpu local so we avoid cache line bouncing of avenrun_ticks.
> + */
> +static void check_calc_load(void)
> +{
> +	int ticks, *calc = &__get_cpu_var(avenrun_calculate);
> +
> +	if (!*calc)
> +		return;
> +
> +	ticks = atomic_sub_return(LOAD_FREQ, &avenrun_ticks);
> +	if (ticks >= 0)
> +		calc_global_load();
> +	else
> +		atomic_add(LOAD_FREQ, &avenrun_ticks);
> +	*calc = 0;
> +}
> +
> +/*
> + * Update avenrun_ticks and trigger the load calculation when the
> + * result is >= LOAD_FREQ.
> + */
> +static void calc_load_update(unsigned long ticks)
> +{
> +	ticks = atomic_add_return(ticks, &avenrun_ticks);
> +	if (ticks >= LOAD_FREQ)
> +		__get_cpu_var(avenrun_calculate) = 1;
>  }
>
> /*
> @@ -1169,6 +1191,7 @@ static void run_timer_softirq(struct sof
>  {
>  	struct tvec_base *base = __get_cpu_var(tvec_bases);
>
> +	check_calc_load();
>  	hrtimer_run_pending();
>
>  	if (time_after_eq(jiffies, base->timer_jiffies))
> @@ -1192,7 +1215,7 @@ void run_local_timers(void)
>  static inline void update_times(unsigned long ticks)
>  {
>  	update_wall_time();
> -	calc_load_update(ticks);
> +	calc_load_update(ticks);
>  }
>
> /*
>
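As a side note on why a torn read of avenrun[] is harmless: each entry is an independent fixed-point exponential average, recomputed from its own previous value. The sketch below is a user-space illustration only. The body of calc_load() is not shown in this message, so the version here, and the constants FSHIFT, FIXED_1, EXP_1, EXP_5 and EXP_15, are assumptions taken from the kernel headers of this era as I remember them. The nr_running_tasks parameter is a hypothetical replacement for the kernel's nr_active() call.

```c
#include <assert.h>

/* Assumed fixed-point layout: 11 bits of fraction. */
#define FSHIFT   11
#define FIXED_1  (1UL << FSHIFT)   /* 1.0 in fixed point */
#define EXP_1    1884UL            /* decay factor for the 1 minute average */
#define EXP_5    2014UL            /* decay factor for the 5 minute average */
#define EXP_15   2037UL            /* decay factor for the 15 minute average */

static unsigned long avenrun[3];

/* One step of the exponential average: load*exp + active*(1 - exp). */
static unsigned long
calc_load(unsigned long load, unsigned long exp, unsigned long active)
{
	load *= exp;
	load += active * (FIXED_1 - exp);
	return load >> FSHIFT;
}

/* Mirrors the shape of the patched calc_global_load(), with the task
 * count passed in instead of being read from the scheduler. */
static void calc_global_load(unsigned long nr_running_tasks)
{
	unsigned long active_tasks = nr_running_tasks * FIXED_1;

	avenrun[0] = calc_load(avenrun[0], EXP_1, active_tasks);
	avenrun[1] = calc_load(avenrun[1], EXP_5, active_tasks);
	avenrun[2] = calc_load(avenrun[2], EXP_15, active_tasks);
}
```

Starting from zero with one active task, the first update moves the three averages by 164, 34 and 11 out of 2048 respectively, so a reader that sees a mix of old and new entries is off by a small fraction of one task at most.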