From: Brice Goglin <Brice.Goglin@inria.fr>
To: Paul Mundt <lethal@linux-sh.org>, Chris Snook <csnook@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: Purpose of numa_node?
Date: Thu, 31 Jan 2008 14:42:18 +0100
Message-ID: <47A1D03A.3020508@inria.fr>
In-Reply-To: <20080131074045.GA13788@linux-sh.org>

Paul Mundt wrote:
> On Wed, Jan 30, 2008 at 07:48:13PM -0500, Chris Snook wrote:
>   
>> While pondering ways to optimize I/O and swapping on large NUMA machines, I 
>> noticed that the numa_node field in struct device isn't actually used 
>> anywhere. We just have a couple dozen lines of code to conditionally 
>> create a sysfs file that will always return -1.  Is anyone even working on 
>> code to actually use this field?  I think it's a good piece of information 
>> to keep track of, so I'm not suggesting we remove it, but I want to make 
>> sure I'm not stepping on toes or duplicating effort if I try to make it 
>> useful.
>>     
> It's manipulated with accessors. If you look at the users of
> dev_to_node()/set_dev_node() you can see where it's being used. It's
> primarily used in allocation paths for node locality, and the existing
> set_dev_node() callsites are places where node locality information
> already exists (ie, which node a given controller sits on). You can see
> this in places like PCI (pcibus_to_node()) and USB, with node allocation
> hints used in places like the dmapool and skb alloc paths.
>
> The in-kernel use looks perfectly sane in that regard, though I'm not
> sure what the point of exporting this as a RO attribute to userspace is.
> Presumably someone has a tool somewhere that cares about this.
>   

I originally added the numa_node sysfs attribute to make it easier to
bind processes near a given device. So yes, I have a user-space tool
that uses it. On large machines it is much easier to use than the
local_cpus field, especially when you bind things through the libnuma
interface, since you don't have to translate between node numbers and
cpumasks.
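
As a rough illustration of that user-space side, here is a minimal
sketch of reading the attribute. The helper name read_numa_node() and
its error convention (-2 on failure) are assumptions made for this
example; they are not taken from the tool mentioned above. The PCI
address in the comment is likewise only a placeholder.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read a device's NUMA node from its sysfs
 * "numa_node" attribute, e.g.
 *   /sys/bus/pci/devices/0000:01:00.0/numa_node   (placeholder address)
 *
 * Returns the node number as reported by the kernel (-1 means the
 * kernel has no node information for the device), or -2 if the file
 * cannot be opened or does not contain an integer.
 */
static int read_numa_node(const char *attr_path)
{
    FILE *f = fopen(attr_path, "r");
    int node;

    if (f == NULL)
        return -2;              /* attribute missing or unreadable */

    if (fscanf(f, "%d", &node) != 1) {
        fclose(f);
        return -2;              /* contents were not an integer */
    }

    fclose(f);
    return node;
}
```

A non-negative result could then be handed directly to libnuma, for
instance numa_run_on_node(node), with no cpumask conversion in
between. A caller should treat -1 as "no locality information" and
skip the binding.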

It works fine on regular machines such as dual Opterons. However, I
recently noticed that it reports a wrong value on some quad-Opteron
machines (see http://marc.info/?l=linux-pci&m=119072400008538&w=2)
because something is not initialized in the right order. I haven't
tested 2.6.24 on this hardware yet, so I don't know whether this has
changed.

Brice