From: Rusty Russell <rusty@rustcorp.com.au>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-kernel" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] Correct nr_processes() when CPUs have been unplugged
Date: Wed, 4 Nov 2009 19:04:29 +1030
Message-ID: <200911041904.29362.rusty@rustcorp.com.au>
In-Reply-To: <1257243074.23110.779.camel@zakaz.uk.xensource.com>

On Tue, 3 Nov 2009 08:41:14 pm Ian Campbell wrote:
> nr_processes() returns the sum of the per cpu counter process_counts for
> all online CPUs. This counter is incremented for the current CPU on
> fork() and decremented for the current CPU on exit(). Since a process
> does not necessarily fork and exit on the same CPU the process_count for
> an individual CPU can be either positive or negative and effectively has
> no meaning in isolation.
> 
> Therefore calculating the sum of process_counts over only the online
> CPUs omits the processes which were started or stopped on any CPU which
> has since been unplugged. Only the sum of process_counts across all
> possible CPUs has meaning.
> 
> The only caller of nr_processes() is proc_root_getattr() which
> calculates the number of links to /proc as
>         stat->nlink = proc_root.nlink + nr_processes();
> 
> You don't have to be all that unlucky for nr_processes() to return a
> negative value, leading to a negative number of links (or rather, an
> apparently enormous number of links). If this happens, things like
> "ls /proc" start to fail because they get -EOVERFLOW from some stat()
> call.
> 
> Example with some debugging inserted to show what goes on:
>         # ps haux|wc -l
>         nr_processes: CPU0:     90
>         nr_processes: CPU1:     1030
>         nr_processes: CPU2:     -900
>         nr_processes: CPU3:     -136
>         nr_processes: TOTAL:    84
>         proc_root_getattr. nlink 12 + nr_processes() 84 = 96
>         84
>         # echo 0 >/sys/devices/system/cpu/cpu1/online
>         # ps haux|wc -l
>         nr_processes: CPU0:     85
>         nr_processes: CPU2:     -901
>         nr_processes: CPU3:     -137
>         nr_processes: TOTAL:    -953
>         proc_root_getattr. nlink 12 + nr_processes() -953 = -941
>         75
>         # stat /proc/
>         nr_processes: CPU0:     84
>         nr_processes: CPU2:     -901
>         nr_processes: CPU3:     -137
>         nr_processes: TOTAL:    -954
>         proc_root_getattr. nlink 12 + nr_processes() -954 = -942
>           File: `/proc/'
>           Size: 0               Blocks: 0          IO Block: 1024   directory
>         Device: 3h/3d   Inode: 1           Links: 4294966354
>         Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
>         Access: 2009-11-03 09:06:55.000000000 +0000
>         Modify: 2009-11-03 09:06:55.000000000 +0000
>         Change: 2009-11-03 09:06:55.000000000 +0000
> 
> I'm not 100% convinced that the per_cpu regions remain valid for offline
> CPUs, although my testing suggests that they do.

Yep.  And so code should usually start with for_each_possible_cpu() then:

> If not then I think the
> correct solution would be to aggregate the process_count for a given CPU
> into a global base value in cpu_down().

If it proves to be an issue.

Acked-by: Rusty Russell <rusty@rustcorp.com.au>

Thanks!
Rusty.
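
For readers following the arithmetic, here is a minimal user-space sketch of
the effect described in the quoted patch. It is not the kernel code: the plain
array stands in for the per-CPU process_counts, the `online` flags stand in for
the CPU online mask, and the helper names sum_online(), sum_possible() and
nlink_as_u32() are hypothetical. The counter values are the ones from the first
snapshot in the quoted log; the 32-bit width of the link count is an assumption
that matches the "Links: 4294966354" line in the stat output.

```c
#include <stdint.h>

#define NR_CPUS 4

/* Counter values from the first snapshot in the quoted log (CPU0..CPU3). */
static const int process_counts[NR_CPUS] = { 90, 1030, -900, -136 };

/* Sum only the CPUs flagged online: the behaviour the patch corrects. */
static int sum_online(const int *counts, const int *online)
{
	int cpu, total = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (online[cpu])
			total += counts[cpu];
	return total;
}

/* Sum every possible CPU, whether or not it is currently online. */
static int sum_possible(const int *counts)
{
	int cpu, total = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		total += counts[cpu];
	return total;
}

/* View a signed link count as a 32-bit unsigned value (assumed width). */
static uint32_t nlink_as_u32(int base, int nr)
{
	return (uint32_t)(base + nr);
}
```

With CPU1 marked offline (`online = { 1, 0, 1, 1 }`), sum_online() gives -946
for these values, while sum_possible() still gives 84, the total shown in the
log. Feeding the log's later figures into nlink_as_u32(12, -954) gives
4294966354, the link count that stat reported.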
