From: Yinghai Lu <Yinghai.Lu@Sun.COM>
To: Cornelia Huck <cornelia.huck@de.ibm.com>,
Stefan Richter <stefanr@s5r6.in-berlin.de>,
Greg KH <greg@kroah.com>,
Andrew Morton <akpm@linux-foundation.org>,
Andi Kleen <ak@suse.de>,
rientjes@google.com, Christoph Lameter <clameter@sgi.com>,
Christoph Hellwig <hch@infradead.org>,
David Miller <davem@davemloft.net>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
netdev@vger.kernel.org
Subject: [PATCH] try parent numa_node at first before using default
Date: Thu, 12 Jul 2007 10:59:53 -0700
Message-ID: <200707121059.54320.yinghai.lu@sun.com>
In-Reply-To: <20070712172310.4d662e1f@gondolin.boeblingen.de.ibm.com>
[PATCH] try parent numa_node at first before using default
For a pci_device, pcibios_scan_root and pci_scan_root call pci_device_add.
pci_device_add calls device_initialize and set_dev_node(&dev->dev,
pcibus_to_node(bus)).
For other devices such as netdev and usb_device, set_dev_node is never
called, so their numa_node field is always -1.
So a netdev has to go through dev->parent to reach the pci_device and
use its numa_node, especially in netdev_alloc_skb().
I am not sure how other devices such as infiniband handle that.
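As a rough illustration of the lookup described above, the following
userspace sketch models a device whose own numa_node is still -1 and that
has to consult its parent. struct model_device and
model_node_via_parent() are hypothetical stand-ins, not kernel APIs; only
the field names numa_node and parent come from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct device (not the kernel type). */
struct model_device {
	struct model_device *parent;	/* NULL for a top-level device */
	int numa_node;			/* -1 means no node was ever assigned */
};

/*
 * Return the device's own node if one was set; otherwise fall back to
 * the parent's node, the way a netdev has to consult its PCI parent.
 */
static int model_node_via_parent(const struct model_device *dev)
{
	assert(dev != NULL);
	if (dev->numa_node != -1)
		return dev->numa_node;
	if (dev->parent)
		return dev->parent->numa_node;
	return -1;
}
```

For a net device with numa_node == -1 whose parent PCI device sits on
node 1, model_node_via_parent() returns 1; for a device with no node and
no parent it returns -1.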
Actually, before the patch
[PATCH 1/2] x86_64: get mp_bus_to_node as early
there was a bug in the ordering of setting bus->sysdata and calling
pcibus_to_node: the numa_node of pci_dev->dev was never set correctly,
it was always 0.
So some code has to use pcibus_to_node(to_pci_dev(dev)->bus) directly,
such as dma_alloc_pages in arch/x86_64/kernel/pci-dma.c
or hwif_to_node in include/linux/ide.h.
According to Stefan Richter, there are three ways to handle this:
- Change all subsystems to set dev->parent before device_initialize().
*Document* that the device_initialize() API has this requirement.
This is counter-intuitive, amounts to some work across the kernel,
and could be gotten wrong again in future code because it's a
counter-intuitive API.
- Move your code from device_initialize() to device_add(). One minor
drawback is that node-specific allocations based on the device's
numa_node would not be optimized before device_add(), but there is
probably no need for this. Driver probes come after device_add().
- Let subsystems explicitly call set_dev_node() on their own.
This patch uses the second method.
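To make the difference between the first and the second option concrete,
here is a small userspace model. It assumes, as discussed in the thread,
that a subsystem may assign dev->parent only after the initialize step.
All names (struct model_device, model_initialize(),
model_initialize_inherit(), model_add()) are hypothetical and only mimic
the roles of device_initialize() and device_add().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct device (not the kernel type). */
struct model_device {
	struct model_device *parent;
	int numa_node;			/* -1 means no node assigned */
};

/* Plain initialization: the node starts out unassigned. */
static void model_initialize(struct model_device *dev)
{
	assert(dev != NULL);
	dev->numa_node = -1;
}

/* First option: inherit at initialization, which needs parent set first. */
static void model_initialize_inherit(struct model_device *dev)
{
	model_initialize(dev);
	if (dev->parent)
		dev->numa_node = dev->parent->numa_node;
}

/* Second option: inherit when the device is added, after parent is known. */
static void model_add(struct model_device *dev)
{
	assert(dev != NULL);
	if (dev->parent)
		dev->numa_node = dev->parent->numa_node;
}
```

If the parent is assigned only after model_initialize_inherit() has run,
the node stays at -1. With model_initialize() followed by model_add(),
the device picks up the parent's node regardless of when the parent was
assigned, as long as that happened before the add step.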
Also, we no longer need to call set_dev_node in pci_device_add, but we need
to make sure the numa_node of every PCI root bus's bridge device is set.
With this patch, we can use device->numa_node directly for all devices.
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
diff --git a/drivers/base/core.c b/drivers/base/core.c
index dd40d78..091c2b1 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -666,6 +666,11 @@ int device_add(struct device *dev)
 	if (error)
 		goto Error;
+	/* use parent numa_node */
+	if (parent) {
+		set_dev_node(dev, dev_to_node(dev->parent));
+	}
+
 	/* first, register with generic layer. */
 	kobject_set_name(&dev->kobj, "%s", dev->bus_id);
 	error = kobject_add(&dev->kobj);
@@ -1285,8 +1290,11 @@ int device_move(struct device *dev, struct device *new_parent)
 	dev->parent = new_parent;
 	if (old_parent)
 		klist_remove(&dev->knode_parent);
-	if (new_parent)
+	if (new_parent) {
 		klist_add_tail(&dev->knode_parent, &new_parent->klist_children);
+		set_dev_node(dev, dev_to_node(new_parent));
+	}
+
 	if (!dev->class)
 		goto out_put;
 	error = device_move_class_links(dev, old_parent, new_parent);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index e48fcf0..c029ffc 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -935,7 +938,6 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
 	dev->dev.release = pci_release_dev;
 	pci_dev_get(dev);
-	set_dev_node(&dev->dev, pcibus_to_node(bus));
 	dev->dev.dma_mask = &dev->dma_mask;
 	dev->dev.coherent_dma_mask = 0xffffffffull;
@@ -1096,6 +1098,9 @@ struct pci_bus * pci_create_bus(struct device *parent,
 		goto dev_reg_err;
 	b->bridge = get_device(dev);
+	if (!parent)
+		set_dev_node(b->bridge, pcibus_to_node(b));
+
 	b->class_dev.class = &pcibus_class;
 	sprintf(b->class_dev.class_id, "%04x:%02x", pci_domain_nr(b), bus);
 	error = class_device_register(&b->class_dev);
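
The combined effect of the hunks above can be modeled in userspace
roughly as follows. This is a sketch under assumptions, not kernel code:
struct model_device and the model_*() functions are hypothetical, the
parent is passed as an argument for brevity, and the root bridge node is
passed in directly where the patch calls pcibus_to_node().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct device (not the kernel type). */
struct model_device {
	struct model_device *parent;
	int numa_node;			/* -1 means no node assigned */
};

/* Root bus bridge: it has no parent, so its node is set explicitly. */
static void model_create_root_bridge(struct model_device *bridge, int bus_node)
{
	assert(bridge != NULL);
	bridge->parent = NULL;
	bridge->numa_node = bus_node;
}

/* Mirrors the device_add() hunk: a device with a parent copies its node. */
static void model_device_add(struct model_device *dev,
			     struct model_device *parent)
{
	assert(dev != NULL);
	dev->parent = parent;
	if (parent)
		dev->numa_node = parent->numa_node;
}

/* Mirrors the device_move() hunk: a new parent also means a new node. */
static void model_device_move(struct model_device *dev,
			      struct model_device *new_parent)
{
	assert(dev != NULL);
	dev->parent = new_parent;
	if (new_parent)
		dev->numa_node = new_parent->numa_node;
}
```

Adding a PCI device under a root bridge on node 1, and then a net device
under that PCI device, leaves both on node 1. Moving the net device under
a bridge on node 0 switches it to node 0, and moving it to no parent at
all leaves its node unchanged.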