From: Mathieu Desnoyers <compudj@krystal.dyndns.org>
To: Christoph Lameter <cl@linux.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Yuriy Lalym <ylalym@gmail.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	ltt-dev@lists.casi.polymtl.ca, Tejun Heo <tj@kernel.org>,
	Ingo Molnar <mingo@elte.hu>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [ltt-dev] [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
Date: Fri, 1 May 2009 15:21:42 -0400
Message-ID: <20090501192142.GA18339@Krystal>
In-Reply-To: <alpine.DEB.1.10.0905010939090.18324@qirst.com>

* Christoph Lameter (cl@linux.com) wrote:
> On Thu, 30 Apr 2009, Mathieu Desnoyers wrote:
> 
> > By ZVC update, you mean Zone ... Counter update ? (which code exactly ?)
> 
> The code that you were modifying in vmstat.c.
> 
> > Hrm, I must admit I'm not sure I follow how your reasoning applies to my
> > code. I am using a percpu_add_return_irq() exactly for this reason : it
> > only ever touches the percpu variable once and atomically. The test for
> > overflow is done on the value returned by percpu_add_return_irq().
> 
> If the percpu differential goes over a certain boundary then the
> differential would be updated twice.
> 

Not with my approach, which tests for == 0, as you point out below.

> > Therefore, an interrupt scenario that would be close to what I
> > understand from your concerns would be :
> >
> > * Thread A
> >
> > inc_zone_page_state()
> >   p_ret = percpu_add_return(p, 1); (let's suppose this increment
> >                                     overflows the threshold, therefore
> >                                     (p_ret & mask) == 0)
> >
> > ----> interrupt comes in, preempts the current thread, execution in a
> >       different thread context (* Thread B) :
> >
> >      inc_zone_page_state()
> >        p_ret = percpu_add_return(p, 1);  ((p_ret & mask) == 1)
> >        if (!(p_ret & mask))
> >          increment global zone count. (not executed)
> >
> > ----> interrupt comes in, preempts the current thread, execution back to
> >       the original thread context (Thread A), on the same or on a
> >       different CPU :
> >
> >   if (!(p_ret & mask))
> >     increment global zone count.   -----> will therefore increment the
> >                                           global zone count only after
> >                                           scheduling back the original
> >                                           thread.
> >
> > So I guess what you say here is that if Thread B is preempted for too
> > long, we will have to wait until it gets scheduled back before the
> > global count is incremented. Do we really need such degree of precision
> > for those counters ?
> >
> > (I fear I'm not understanding your concern fully though)
> 
> Inc_zone_page_state modifies the differential which is u8 and can easily
> overflow.
> 
> Hmmm. But if you check for overflow to zero this way it may work without
> the need for cmpxchg. But if you rely on overflow then we only update the
> global count after 256 counts on the percpu differential. The tuning of
> the accuracy of the counter wont work anymore. The global counter could
> become wildly inaccurate with a lot of processors.
> 

I see that we are getting on the same page here. Good :) About the
overflow:

What I do here is let those u8 counters increment as free-running
counters. Yes, they will periodically overflow the 8 bits. But I don't
rely on that overflow to count the number of increments between
global counter updates: I use a bitmask derived from the threshold
value (which is now required to be a power of two) to detect overflow
of the low 0, 1, 2, 3, 4, 5, 6 or 7 bits of the counter. Therefore we
can still have the kind of granularity currently provided. The only
limitation is that the threshold has to be a power of two, so we end up
counting modulo a power of two, which is unaffected by the u8 overflow,
since 256 is itself a multiple of every such threshold.
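
To make the masking scheme concrete, here is a minimal userspace sketch.
It is not the vmstat.c code: the struct, the function names and the
single-threaded add_return_u8() stand-in for percpu_add_return() are all
hypothetical, and it assumes the global count is advanced by one full
threshold each time the masked bits wrap to zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration only; names do not come from the kernel. */
struct zone_counter {
	uint8_t diff;	/* free-running differential, wraps at 256 */
	uint8_t mask;	/* threshold - 1, threshold being a power of two */
	long global;	/* global count, advanced one threshold at a time */
};

static void zone_counter_init(struct zone_counter *c, unsigned int threshold)
{
	/* Thresholds 1..128 map to 0..7 masked bits. */
	assert(threshold != 0 && threshold <= 128);
	assert((threshold & (threshold - 1)) == 0);
	c->diff = 0;
	c->mask = (uint8_t)(threshold - 1);
	c->global = 0;
}

/*
 * Single-threaded stand-in for percpu_add_return(): touches the
 * variable once and returns the new value.
 */
static uint8_t add_return_u8(uint8_t *p, uint8_t v)
{
	*p = (uint8_t)(*p + v);
	return *p;
}

static void zone_counter_inc(struct zone_counter *c)
{
	uint8_t ret = add_return_u8(&c->diff, 1);

	/* The low bits wrapped to zero: one threshold's worth is due. */
	if (!(ret & c->mask))
		c->global += c->mask + 1;	/* assumed update size */
}
```

With a threshold of 8, the global count advances on every 8th increment,
including the increment that wraps the u8 from 255 to 0, so the 8-bit
overflow never adds or drops an update.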

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
