From: Tejun Heo <tj@kernel.org>
To: Christoph Lameter <cl@linux.com>
Cc: akpm@linux-foundation.org, Pekka Enberg <penberg@cs.helsinki.fi>,
	linux-kernel@vger.kernel.org,
	Eric Dumazet <eric.dumazet@gmail.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Subject: Re: [cpuops cmpxchg V2 4/5] vmstat: User per cpu atomics to avoid interrupt disable / enable
Date: Wed, 15 Dec 2010 17:45:22 +0100
Message-ID: <4D08F0A2.9010301@kernel.org>
In-Reply-To: <20101214162854.811759020@linux.com>

On 12/14/2010 05:28 PM, Christoph Lameter wrote:
> Currently the operations to increment vm counters must disable interrupts
> in order to not mess up their housekeeping of counters.
> 
> So use this_cpu_cmpxchg() to avoid the overhead. Since we can no longer
> count on preemption being disabled we still have some minor issues.
> The fetching of the counter thresholds is racy.
> A threshold from another cpu may be applied if we happen to be
> rescheduled on another cpu.  However, the following vmstat operation
> will then bring the counter again under the threshold limit.
> 
> The operations for __xxx_zone_state are not changed since the caller
> has taken care of the synchronization needs (and therefore the cycle
> count is even less than the optimized version for the irq disable case
> provided here).
> 
> The optimization using this_cpu_cmpxchg will only be used if the arch
> supports efficient this_cpu_ops (must have CONFIG_CMPXCHG_LOCAL set!)
> 
> The use of this_cpu_cmpxchg reduces the cycle count for the counter
> operations by 80% (inc_zone_page_state goes from 170 cycles to 32).
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>
>
+/*
+ * If we have cmpxchg_local support then we do not need to incur the overhead
+ * that comes with local_irq_save/restore if we use this_cpu_cmpxchg.
+ *
+ * mod_state() modifies the zone counter state through atomic per cpu
+ * operations.
+ *
+ * Overstep mode specifies how overstep should be handled:
+ *     0       No overstepping
+ *     1       Overstepping half of threshold
+ *     -1      Overstepping minus half of threshold
+ */
+static inline void mod_state(struct zone *zone,
+       enum zone_stat_item item, int delta, int overstep_mode)
+{
+	struct per_cpu_pageset __percpu *pcp = zone->pageset;
+	s8 __percpu *p = pcp->vm_stat_diff + item;
+	long o, n, t, z;
+
+	do {
+		z = 0;  /* overflow to zone counters */
+
+		/*
+		 * The fetching of the stat_threshold is racy. We may apply
+		 * a counter threshold to the wrong cpu if we get
+		 * rescheduled while executing here. However, the following
+		 * will apply the threshold again and therefore bring the
+		 * counter under the threshold.
+		 */

What does "the following" mean here?  Later executions of the
function?  It seems like the counter can go out of the threshold at
least temporarily, which probably is okay but I think the comment can
be improved a bit.
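
To make the concern concrete, here is a minimal userspace sketch. It assumes C11 atomics as a stand-in for this_cpu_cmpxchg(), and the rest of the loop body, which the quoted hunk does not show, is my assumption about how the overflow is folded into the zone counter. The names model_counter and model_mod_state are hypothetical. The threshold is passed in as a parameter so that a "wrong" value can be applied on purpose.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for one vm_stat_diff[] slot and its zone counter. */
struct model_counter {
	_Atomic signed char diff;	/* models the s8 per-cpu differential */
	_Atomic long zone_total;	/* models the global zone counter */
};

/*
 * Model of a cmpxchg-based update. The threshold t is supplied by the
 * caller, so a stale or foreign threshold can be simulated.
 * Assumption: overflow is folded into zone_total after the loop.
 */
static void model_mod_state(struct model_counter *c, int delta,
			    long t, int overstep_mode)
{
	signed char o = atomic_load(&c->diff);
	long n, z;

	/* Keep every stored value representable in the s8 differential. */
	assert(t >= 0 && t <= 125);

	do {
		z = 0;	/* overflow to zone counter */
		n = (long)o + delta;

		if (n > t || n < -t) {
			long os = overstep_mode * (t >> 1);

			/* Fold the differential (plus overstep) into the zone. */
			z = n + os;
			n = -os;
		}
		/* On failure, o is reloaded and the new value is recomputed. */
	} while (!atomic_compare_exchange_weak(&c->diff, &o, (signed char)n));

	if (z)
		atomic_fetch_add(&c->zone_total, z);
}
```

In this model, six increments with a threshold of 10 leave the differential at 6. If the threshold that should have applied was 4, the counter is over its limit until the next call, which sees 7 > 4 and folds the whole amount into the zone total. If that is what "the following" refers to, the comment could say "the next counter update on this cpu".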

Thanks.

-- 
tejun
