public inbox for linux-kernel@vger.kernel.org
* [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
@ 2004-07-27  0:10 Matthew Dobson
  2004-07-27  3:38 ` Jesse Barnes
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Matthew Dobson @ 2004-07-27  0:10 UTC (permalink / raw)
  To: Jesse Barnes, Andi Kleen, LKML, Martin J. Bligh, LSE Tech

[-- Attachment #1: Type: text/plain, Size: 1675 bytes --]

So in discussions with Jesse at OLS, we decided that pcibus_to_node() is
a more generally useful function than pcibus_to_cpumask().  If anyone
disagrees with that, now would be a good time to let us know.

This is just a preliminary patch.  It needs review for x86_64, as I
don't know how to properly populate the mp_bus_to_node (which used to be
mp_bus_to_cpumask) array.

The main changes are as follows:

1) Replace instances of pcibus_to_cpumask(bus) with
node_to_cpumask(pcibus_to_node(bus)).  There are currently only 2 uses
of pcibus_to_cpumask(): flush_gart() in arch/x86_64/kernel/pci-gart.c
and pci_bus_show_cpuaffinity() in drivers/pci/probe.c.
2) Define the asm-generic version of pcibus_to_node() to always return
node 0, as this is the sensible non-NUMA behavior.
3) Drop the mips/mach-ip27 and ppc64 versions of pcibus_to_cpumask()
entirely, since they were simply defined to be identical to the
asm-generic version.
4) Define the i386 version of pcibus_to_node().
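
As a rough illustration of change (1), here is a minimal user-space sketch of
how the old pcibus_to_cpumask(bus) decomposes into
node_to_cpumask(pcibus_to_node(bus)). Everything prefixed toy_ is made up for
the example: the real cpumask_t is a kernel bitmap type, not an unsigned long,
and the two tables are invented data, not anything from the patch.

```c
/* Toy stand-in: in the kernel, cpumask_t is a bitmap type. */
typedef unsigned long toy_cpumask_t;

#define TOY_MAX_BUSES 4
#define TOY_MAX_NODES 2

/* Hypothetical tables: bus -> node, and node -> CPUs on that node. */
static const int toy_bus_to_node[TOY_MAX_BUSES] = { 0, 0, 1, 1 };
static const toy_cpumask_t toy_node_cpus[TOY_MAX_NODES] = { 0x0fUL, 0xf0UL };

/* The new, more general primitive: which node owns this bus? */
int toy_pcibus_to_node(int bus)
{
	return toy_bus_to_node[bus];
}

/* Already available in the topology API: which CPUs sit on this node? */
toy_cpumask_t toy_node_to_cpumask(int node)
{
	return toy_node_cpus[node];
}

/* What callers of the old pcibus_to_cpumask(bus) would compute instead. */
toy_cpumask_t toy_pcibus_to_cpumask(int bus)
{
	return toy_node_to_cpumask(toy_pcibus_to_node(bus));
}
```

The point of the split is that callers who only want the node (for example to
pick where to allocate memory) no longer have to go through a CPU mask.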

Future work:

1) Correctly map PCI buses to nodes for x86_64.
2) IA64 implementation?
3) Other arch implementations?

[mcd@arrakis source]$ diffstat ~/linux/patches/pcibus_to_node.patch
 arch/x86_64/kernel/mpparse.c          |    3 ++-
 arch/x86_64/kernel/pci-gart.c         |    2 +-
 drivers/pci/probe.c                   |    2 +-
 include/asm-generic/topology.h        |    4 ++--
 include/asm-i386/topology.h           |    4 ++--
 include/asm-mips/mach-ip27/topology.h |    1 -
 include/asm-ppc64/topology.h          |    2 --
 include/asm-x86_64/mpspec.h           |    2 +-
 include/asm-x86_64/topology.h         |    6 ++----
 9 files changed, 11 insertions(+), 15 deletions(-)

-Matt

[-- Attachment #2: pcibus_to_node.patch --]
[-- Type: text/x-patch, Size: 6456 bytes --]

diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.7-mm7/arch/x86_64/kernel/mpparse.c linux-2.6.7-mm7+pcibus_to_node/arch/x86_64/kernel/mpparse.c
--- linux-2.6.7-mm7/arch/x86_64/kernel/mpparse.c	2004-07-12 10:39:11.000000000 -0700
+++ linux-2.6.7-mm7+pcibus_to_node/arch/x86_64/kernel/mpparse.c	2004-07-26 15:09:30.000000000 -0700
@@ -44,7 +44,7 @@ int acpi_found_madt;
 int apic_version [MAX_APICS];
 unsigned char mp_bus_id_to_type [MAX_MP_BUSSES] = { [0 ... MAX_MP_BUSSES-1] = -1 };
 int mp_bus_id_to_pci_bus [MAX_MP_BUSSES] = { [0 ... MAX_MP_BUSSES-1] = -1 };
-cpumask_t mp_bus_to_cpumask [MAX_MP_BUSSES] = { [0 ... MAX_MP_BUSSES-1] = CPU_MASK_ALL };
+int mp_bus_to_node [MAX_MP_BUSSES] = { [0 ... MAX_MP_BUSSES-1] = -1 };
 
 int mp_current_pci_id = 0;
 /* I/O APIC entries */
@@ -169,6 +169,7 @@ static void __init MP_bus_info (struct m
 		mp_bus_id_to_type[m->mpc_busid] = MP_BUS_PCI;
 		mp_bus_id_to_pci_bus[m->mpc_busid] = mp_current_pci_id;
 		mp_current_pci_id++;
+		/* FIXME: Setup PCI bus to Node mapping here? */
 	} else if (strncmp(str, "MCA", 3) == 0) {
 		mp_bus_id_to_type[m->mpc_busid] = MP_BUS_MCA;
 	} else {
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.7-mm7/arch/x86_64/kernel/pci-gart.c linux-2.6.7-mm7+pcibus_to_node/arch/x86_64/kernel/pci-gart.c
--- linux-2.6.7-mm7/arch/x86_64/kernel/pci-gart.c	2004-07-12 10:39:11.000000000 -0700
+++ linux-2.6.7-mm7+pcibus_to_node/arch/x86_64/kernel/pci-gart.c	2004-07-26 15:12:55.000000000 -0700
@@ -148,7 +148,7 @@ static void flush_gart(struct pci_dev *d
 { 
 	unsigned long flags;
 	int bus = dev ? dev->bus->number : -1;
-	cpumask_t bus_cpumask = pcibus_to_cpumask(bus);
+	cpumask_t bus_cpumask = node_to_cpumask(pcibus_to_node(bus));
 	int flushed = 0;
 	int i;
 
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.7-mm7/drivers/pci/probe.c linux-2.6.7-mm7+pcibus_to_node/drivers/pci/probe.c
--- linux-2.6.7-mm7/drivers/pci/probe.c	2004-07-12 10:39:18.000000000 -0700
+++ linux-2.6.7-mm7+pcibus_to_node/drivers/pci/probe.c	2004-07-23 07:49:24.000000000 -0700
@@ -54,7 +54,7 @@ postcore_initcall(pcibus_class_init);
  */
 static ssize_t pci_bus_show_cpuaffinity(struct class_device *class_dev, char *buf)
 {
-	cpumask_t cpumask = pcibus_to_cpumask((to_pci_bus(class_dev))->number);
+	cpumask_t cpumask = node_to_cpumask(pcibus_to_node((to_pci_bus(class_dev))->number));
 	int ret;
 
 	ret = cpumask_scnprintf(buf, PAGE_SIZE, cpumask);
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.7-mm7/include/asm-generic/topology.h linux-2.6.7-mm7+pcibus_to_node/include/asm-generic/topology.h
--- linux-2.6.7-mm7/include/asm-generic/topology.h	2004-06-15 22:18:58.000000000 -0700
+++ linux-2.6.7-mm7+pcibus_to_node/include/asm-generic/topology.h	2004-07-26 15:11:27.000000000 -0700
@@ -41,8 +41,8 @@
 #ifndef node_to_first_cpu
 #define node_to_first_cpu(node)	(0)
 #endif
-#ifndef pcibus_to_cpumask
-#define pcibus_to_cpumask(bus)	(cpu_online_map)
+#ifndef pcibus_to_node
+#define pcibus_to_node(bus)	(0)
 #endif
 
 /* Cross-node load balancing interval. */
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.7-mm7/include/asm-i386/topology.h linux-2.6.7-mm7+pcibus_to_node/include/asm-i386/topology.h
--- linux-2.6.7-mm7/include/asm-i386/topology.h	2004-06-15 22:19:01.000000000 -0700
+++ linux-2.6.7-mm7+pcibus_to_node/include/asm-i386/topology.h	2004-07-23 07:56:16.000000000 -0700
@@ -61,9 +61,9 @@ static inline int node_to_first_cpu(int 
 }
 
 /* Returns the number of the node containing PCI bus 'bus' */
-static inline cpumask_t pcibus_to_cpumask(int bus)
+static inline int pcibus_to_node(int bus)
 {
-	return node_to_cpumask(mp_bus_id_to_node[bus]);
+	return mp_bus_id_to_node[bus];
 }
 
 /* Node-to-Node distance */
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.7-mm7/include/asm-mips/mach-ip27/topology.h linux-2.6.7-mm7+pcibus_to_node/include/asm-mips/mach-ip27/topology.h
--- linux-2.6.7-mm7/include/asm-mips/mach-ip27/topology.h	2004-06-15 22:20:26.000000000 -0700
+++ linux-2.6.7-mm7+pcibus_to_node/include/asm-mips/mach-ip27/topology.h	2004-07-23 07:54:20.000000000 -0700
@@ -7,7 +7,6 @@
 #define parent_node(node)	(node)
 #define node_to_cpumask(node)	(HUB_DATA(node)->h_cpus)
 #define node_to_first_cpu(node)	(first_cpu(node_to_cpumask(node)))
-#define pcibus_to_cpumask(bus)	(cpu_online_map)
 
 extern int node_distance(nasid_t nasid_a, nasid_t nasid_b);
 #define node_distance(from, to)	node_distance(from, to)
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.7-mm7/include/asm-ppc64/topology.h linux-2.6.7-mm7+pcibus_to_node/include/asm-ppc64/topology.h
--- linux-2.6.7-mm7/include/asm-ppc64/topology.h	2004-06-15 22:20:16.000000000 -0700
+++ linux-2.6.7-mm7+pcibus_to_node/include/asm-ppc64/topology.h	2004-07-23 07:54:30.000000000 -0700
@@ -33,8 +33,6 @@ static inline int node_to_first_cpu(int 
 	return first_cpu(tmp);
 }
 
-#define pcibus_to_cpumask(bus)	(cpu_online_map)
-
 #define nr_cpus_node(node)	(nr_cpus_in_node[node])
 
 /* Cross-node load balancing interval. */
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.7-mm7/include/asm-x86_64/mpspec.h linux-2.6.7-mm7+pcibus_to_node/include/asm-x86_64/mpspec.h
--- linux-2.6.7-mm7/include/asm-x86_64/mpspec.h	2004-07-12 10:39:40.000000000 -0700
+++ linux-2.6.7-mm7+pcibus_to_node/include/asm-x86_64/mpspec.h	2004-07-23 07:57:53.000000000 -0700
@@ -166,7 +166,7 @@ enum mp_bustype {
 };
 extern unsigned char mp_bus_id_to_type [MAX_MP_BUSSES];
 extern int mp_bus_id_to_pci_bus [MAX_MP_BUSSES];
-extern cpumask_t mp_bus_to_cpumask [MAX_MP_BUSSES];
+extern int mp_bus_to_node [MAX_MP_BUSSES];
 
 extern unsigned int boot_cpu_physical_apicid;
 extern int smp_found_config;
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.7-mm7/include/asm-x86_64/topology.h linux-2.6.7-mm7+pcibus_to_node/include/asm-x86_64/topology.h
--- linux-2.6.7-mm7/include/asm-x86_64/topology.h	2004-07-12 10:39:40.000000000 -0700
+++ linux-2.6.7-mm7+pcibus_to_node/include/asm-x86_64/topology.h	2004-07-23 07:59:29.000000000 -0700
@@ -20,11 +20,9 @@ extern cpumask_t     node_to_cpumask[];
 #define node_to_first_cpu(node) 	(__ffs(node_to_cpumask[node]))
 #define node_to_cpumask(node)		(node_to_cpumask[node])
 
-static inline cpumask_t pcibus_to_cpumask(int bus)
+static inline int pcibus_to_node(int bus)
 {
-	cpumask_t tmp;
-	cpus_and(tmp, mp_bus_to_cpumask[bus], cpu_online_map);
-	return tmp;
+	return mp_bus_to_node[bus];
 }
 
 #define NODE_BALANCE_RATE 30	/* CHECKME */ 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-27  0:10 [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node() Matthew Dobson
@ 2004-07-27  3:38 ` Jesse Barnes
  2004-07-27  9:51 ` [Lse-tech] " Christoph Hellwig
  2004-07-27 14:16 ` Andi Kleen
  2 siblings, 0 replies; 23+ messages in thread
From: Jesse Barnes @ 2004-07-27  3:38 UTC (permalink / raw)
  To: colpatch; +Cc: Andi Kleen, LKML, Martin J. Bligh, LSE Tech

On Monday, July 26, 2004 8:10 pm, Matthew Dobson wrote:
> So in discussions with Jesse at OLS, we decided that pcibus_to_node() is
> a more generally useful function than pcibus_to_cpumask().  If anyone
> disagrees with that, now would be a good time to let us know.

Thanks for putting the fact that I was an idiot so kindly... :)

> 1) Replace instances of pcibus_to_cpumask(bus) with
> node_to_cpumask(pcibus_to_node(bus)).  There are currently only 2 uses
> of pcibus_to_cpumask(): flush_gart() in arch/x86_64/kernel/pci-gart.c
> and pci_bus_show_cpuaffinity() in drivers/pci/probe.c.
> 2) Define the asm-generic version of pcibus_to_node() to always return
> node 0, as this is the sensible non-NUMA behavior.
> 3) Drop the mips/mach-ip27 and ppc64 versions of pcibus_to_cpumask()
> entirely, since they were simply defined to be identical to the
> asm-generic version.
> 4) Define the i386 version of pcibus_to_node().

Looks good to me.

> Future work:
>
> 1) Correctly map PCI buses to nodes for x86_64.
> 2) IA64 implementation?

I'll put this together, though the implementation will probably change as we 
add PROM support in the SLIT and SRAT tables for our host to PCI bridges.

Platforms that support it should probably also use pcibus_to_node in their 
pci_alloc_consistent and dma_alloc_coherent APIs if possible.

Thanks,
Jesse


* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-27  0:10 [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node() Matthew Dobson
  2004-07-27  3:38 ` Jesse Barnes
@ 2004-07-27  9:51 ` Christoph Hellwig
  2004-07-27 15:22   ` Jesse Barnes
  2004-07-27 14:16 ` Andi Kleen
  2 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2004-07-27  9:51 UTC (permalink / raw)
  To: Matthew Dobson; +Cc: Jesse Barnes, Andi Kleen, LKML, Martin J. Bligh, LSE Tech

On Mon, Jul 26, 2004 at 05:10:08PM -0700, Matthew Dobson wrote:
> So in discussions with Jesse at OLS, we decided that pcibus_to_node() is

Please do pcibus_to_nodemask() instead - there could be dual-ported pci
bridges.



* Re: [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-27  0:10 [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node() Matthew Dobson
  2004-07-27  3:38 ` Jesse Barnes
  2004-07-27  9:51 ` [Lse-tech] " Christoph Hellwig
@ 2004-07-27 14:16 ` Andi Kleen
  2004-07-27 15:15   ` Jesse Barnes
  2 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2004-07-27 14:16 UTC (permalink / raw)
  To: colpatch; +Cc: jbarnes, linux-kernel, mbligh, lse-tech

On Mon, 26 Jul 2004 17:10:08 -0700
Matthew Dobson <colpatch@us.ibm.com> wrote:

> So in discussions with Jesse at OLS, we decided that pcibus_to_node() is
> a more generally useful function than pcibus_to_cpumask().  If anyone
> disagrees with that, now would be a good time to let us know.

Not sure that is a good idea. Sometimes this information is not available.
With pcibus_to_cpumask() the fallback is obvious, but it isn't with
pcibus_to_node(). Returning a random node is wrong.
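
To make the fallback concern concrete, here is a small user-space sketch
(toy types and invented data, not kernel code). With a single node number the
"firmware didn't say" case needs a sentinel that every caller must remember to
check; with a mask, "unknown" can simply mean "all nodes". The names
TOY_NODE_UNKNOWN, TOY_NODE_ALL and the toy_ functions are assumptions made for
this example only.

```c
typedef unsigned long toy_nodemask_t;	/* one bit per node; toy type */

#define TOY_NR_NODES		4
#define TOY_NODE_ALL		((1UL << TOY_NR_NODES) - 1)
#define TOY_NODE_UNKNOWN	(-1)

/* Hypothetical firmware data: the last bus has no node information. */
static const int toy_fw_bus_node[3] = { 0, 2, TOY_NODE_UNKNOWN };

/* Single-node interface: the caller has to handle the sentinel itself. */
int toy_pcibus_to_node(int bus)
{
	return toy_fw_bus_node[bus];
}

/* Mask interface: missing information degrades to "any node". */
toy_nodemask_t toy_pcibus_to_nodemask(int bus)
{
	int node = toy_fw_bus_node[bus];

	if (node == TOY_NODE_UNKNOWN)
		return TOY_NODE_ALL;
	return 1UL << node;
}
```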


> This is just a preliminary patch.  It needs review for x86_64, as I
> don't know how to properly populate the mp_bus_to_node (which used to be
> mp_bus_to_cpumask) array.

It's impossible currently - I need an ACPI 3.0 BIOS to get this information.
Even then there will be machines that don't supply it.

I tried some time ago to get it from the hardware, but the hardware registers
were arcane enough that I couldn't work it out easily. Relying on firmware
for this thing is probably a better idea anyways.

-Andi



* Re: [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-27 14:16 ` Andi Kleen
@ 2004-07-27 15:15   ` Jesse Barnes
  2004-07-27 15:57     ` Andi Kleen
  0 siblings, 1 reply; 23+ messages in thread
From: Jesse Barnes @ 2004-07-27 15:15 UTC (permalink / raw)
  To: Andi Kleen; +Cc: colpatch, jbarnes, linux-kernel, mbligh, lse-tech

On Tuesday, July 27, 2004 7:16 am, Andi Kleen wrote:
> On Mon, 26 Jul 2004 17:10:08 -0700
>
> Matthew Dobson <colpatch@us.ibm.com> wrote:
> > So in discussions with Jesse at OLS, we decided that pcibus_to_node() is
> > a more generally useful function than pcibus_to_cpumask().  If anyone
> > disagrees with that, now would be a good time to let us know.
>
> Not sure that is a good idea. Sometimes this information is not available.
> With pcibus_to_cpumask() the fallback is obvious, but it isn't with
> pcibus_to_node(). Returning a random node is wrong.

Hmm... so there's no way for you to get a node or nodemask at all?

> It's impossible currently - I need an ACPI 3.0 BIOS to get this
> information. Even then there will be machines that don't supply it.
>
> I tried some time ago to get it from the hardware, but the hardware
> registers were arcane enough that I couldn't work it out easily. Relying on
> firmware for this thing is probably a better idea anyways.

Yeah, it's easier that way, but for the first cut on ia64, I'm gonna have to 
do it by hand too.  It's not that bad on sn2 at least.

Jesse


* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-27  9:51 ` [Lse-tech] " Christoph Hellwig
@ 2004-07-27 15:22   ` Jesse Barnes
  2004-07-27 18:32     ` Matthew Dobson
  0 siblings, 1 reply; 23+ messages in thread
From: Jesse Barnes @ 2004-07-27 15:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Matthew Dobson, Jesse Barnes, Andi Kleen, LKML, Martin J. Bligh,
	LSE Tech

On Tuesday, July 27, 2004 2:51 am, Christoph Hellwig wrote:
> On Mon, Jul 26, 2004 at 05:10:08PM -0700, Matthew Dobson wrote:
> > So in discussions with Jesse at OLS, we decided that pcibus_to_node() is
>
> Please do pcibus_to_nodemask() instead - there could be dual-ported pci
> bridges.

Do you know of any?  On sn2 there are dual ported xio->pci bridges, but in 
that case, half the busses are associated with one node and the other half 
with another node, so pcibus_to_node would work in that case.  And for 
numalink->pci bridges, we'll return the node id of the bridge in that case 
(which may not have any memory, but in that case alloc_pages_node will fall 
back to the next node).

I wonder though if we shouldn't add

  ...
#ifdef CONFIG_NUMA
  int node; /* or nodemask_t if necessary */
#endif
  ...

to struct pci_bus instead?  That would make the existing code paths a little 
faster and avoid the need for a global array, which tends to lead to TLB 
misses.

Anyway, my needs are very simple.  I'd like to do 
alloc_pages_node(pci_to_node(pci_dev)); in the sn2 version of 
pci_alloc_consistent and use the new routine to simplify the initial irq 
setup code, making it look more like build_zonelists and the sched domains 
patch I posted yesterday.  So as long as those needs are provided for, I'm ok 
with the interface.
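
A minimal user-space sketch of the per-bus field idea, assuming cut-down toy
versions of the PCI structures (the real struct pci_bus and struct pci_dev
have many more members, and toy_pcidev_to_node is a hypothetical helper, not
an existing kernel function). It shows why a device lookup needs no global
array: the node is one pointer dereference away from the device.

```c
/* Cut-down stand-ins for the kernel structures; toy types only. */
struct toy_pci_bus {
	int number;	/* PCI bus number */
	int node;	/* NUMA node this bus hangs off (the proposed field) */
};

struct toy_pci_dev {
	struct toy_pci_bus *bus;	/* bus the device is plugged into */
};

/* The bus carries the affinity; the device inherits it from its bus. */
int toy_pcibus_to_node(const struct toy_pci_bus *bus)
{
	return bus->node;
}

int toy_pcidev_to_node(const struct toy_pci_dev *dev)
{
	return toy_pcibus_to_node(dev->bus);
}
```

An allocation path along the lines of the one described above would then pass
the result of such a helper as the node argument of its allocator.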

Thanks,
Jesse


* Re: [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-27 15:15   ` Jesse Barnes
@ 2004-07-27 15:57     ` Andi Kleen
  2004-07-27 18:18       ` Matthew Dobson
  0 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2004-07-27 15:57 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: colpatch, jbarnes, linux-kernel, mbligh, lse-tech

On Tue, 27 Jul 2004 08:15:39 -0700
Jesse Barnes <jbarnes@engr.sgi.com> wrote:

> On Tuesday, July 27, 2004 7:16 am, Andi Kleen wrote:
> > On Mon, 26 Jul 2004 17:10:08 -0700
> >
> > Matthew Dobson <colpatch@us.ibm.com> wrote:
> > > So in discussions with Jesse at OLS, we decided that pcibus_to_node() is
> > > a more generally useful function than pcibus_to_cpumask().  If anyone
> > > disagrees with that, now would be a good time to let us know.
> >
> > Not sure that is a good idea. Sometimes this information is not available.
> > With pcibus_to_cpumask() the fallback is obvious, but it isn't with
> > pcibus_to_node(). Returning a random node is wrong.
> 
> Hmm... so there's no way for you to get a node or nodemask at all?

When the BIOS has _PXM methods there probably will be.
I just cannot guarantee that it has them, so there should be some clean fallback path.

If cpumask is too complicated for you a pcibus_to_nodemask would be fine
for me too, just please no single node number.


-Andi


* Re: [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-27 15:57     ` Andi Kleen
@ 2004-07-27 18:18       ` Matthew Dobson
  2004-07-29  8:34         ` Paul Jackson
  0 siblings, 1 reply; 23+ messages in thread
From: Matthew Dobson @ 2004-07-27 18:18 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Jesse Barnes, Jesse Barnes, LKML, Martin J. Bligh, LSE Tech

On Tue, 2004-07-27 at 08:57, Andi Kleen wrote:
> On Tue, 27 Jul 2004 08:15:39 -0700
> Jesse Barnes <jbarnes@engr.sgi.com> wrote:
> 
> > On Tuesday, July 27, 2004 7:16 am, Andi Kleen wrote:
> > > On Mon, 26 Jul 2004 17:10:08 -0700
> > >
> > > Matthew Dobson <colpatch@us.ibm.com> wrote:
> > > > So in discussions with Jesse at OLS, we decided that pcibus_to_node() is
> > > > a more generally useful function than pcibus_to_cpumask().  If anyone
> > > > disagrees with that, now would be a good time to let us know.
> > >
> > > Not sure that is a good idea. Sometimes this information is not available.
> > > With pcibus_to_cpumask() the fallback is obvious, but it isn't with
> > > pcibus_to_node(). Returning a random node is wrong.
> > 
> > Hmm... so there's no way for you to get a node or nodemask at all?
> 
> When the BIOS has _PXM methods there probably will be.
> I just cannot guarantee that it has them, so there should be some clean fallback path.
> 
> If cpumask is too complicated for you a pcibus_to_nodemask would be fine
> for me too, just please no single node number.
> 
> 
> -Andi

I guess I'm OK with a nodemask instead of a node.  That will make this
patch dependent on my nodemask_t patch, which I'll also be sending out
again later today, though...  A nodemask instead of a node also allows
us to return a mask of nearby memory-only nodes as well as CPU-only
nodes, if the arch supports that, for allocating buffers/doing DMA
from...
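
A sketch of how a node mask could be turned back into a CPU mask for the
existing callers, assuming toy unsigned-long masks and an invented per-node
CPU table. The helper name toy_nodemask_to_cpumask is hypothetical. Note that
a memory-only node contributes no CPUs, which is exactly the case a single
CPU mask could not express.

```c
typedef unsigned long toy_cpumask_t;	/* one bit per CPU; toy type */
typedef unsigned long toy_nodemask_t;	/* one bit per node; toy type */

#define TOY_NR_NODES 3

/* Invented layout: node 1 is memory-only and therefore has no CPUs. */
static const toy_cpumask_t toy_node_cpus[TOY_NR_NODES] = {
	0x3UL,	/* node 0: CPUs 0-1 */
	0x0UL,	/* node 1: memory only */
	0xcUL,	/* node 2: CPUs 2-3 */
};

/* Union of the CPU masks of every node set in 'nodes'. */
toy_cpumask_t toy_nodemask_to_cpumask(toy_nodemask_t nodes)
{
	toy_cpumask_t cpus = 0;
	int node;

	for (node = 0; node < TOY_NR_NODES; node++)
		if (nodes & (1UL << node))
			cpus |= toy_node_cpus[node];
	return cpus;
}
```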

-Matt



* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-27 15:22   ` Jesse Barnes
@ 2004-07-27 18:32     ` Matthew Dobson
  2004-07-27 18:40       ` Jesse Barnes
  2004-07-28 15:01       ` Martin J. Bligh
  0 siblings, 2 replies; 23+ messages in thread
From: Matthew Dobson @ 2004-07-27 18:32 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Christoph Hellwig, Jesse Barnes, Andi Kleen, LKML,
	Martin J. Bligh, LSE Tech

On Tue, 2004-07-27 at 08:22, Jesse Barnes wrote:
> On Tuesday, July 27, 2004 2:51 am, Christoph Hellwig wrote:
> > On Mon, Jul 26, 2004 at 05:10:08PM -0700, Matthew Dobson wrote:
> > > So in discussions with Jesse at OLS, we decided that pcibus_to_node() is
> >
> > Please do pcibus_to_nodemask() instead - there could be dual-ported pci
> > bridges.
> 
> Do you know of any?  On sn2 there are dual ported xio->pci bridges, but in 
> that case, half the busses are associated with one node and the other half 
> with another node, so pcibus_to_node would work in that case.  And for 
> numalink->pci bridges, we'll return the node id of the bridge in that case 
> (which may not have any memory, but in that case alloc_pages_node will fall 
> back to the next node).
> 
> I wonder though if we shouldn't add
> 
>   ...
> #ifdef CONFIG_NUMA
>   int node; /* or nodemask_t if necessary */
> #endif
>   ...
> 
> to struct pci_bus instead?  That would make the existing code paths a little 
> faster and avoid the need for a global array, which tends to lead to TLB 
> misses.

I like that idea!  Stick a nodemask_t in struct pci_bus, initialize it
to NODE_MASK_ALL.  If a particular arch wants to put something more
accurate in there, then great, if not, we're just in the same boat we're
in now.

Anyone else have opinions one way or the other on Jesse's idea?

> Anyway, my needs are very simple.  I'd like to do 
> alloc_pages_node(pci_to_node(pci_dev)); in the sn2 version of 
> pci_alloc_consistent and use the new routine to simplify the initial irq 
> setup code, making it look more like build_zonelists and the sched domains 
> patch I posted yesterday.  So as long as those needs are provided for, I'm ok 
> with the interface.
> 
> Thanks,
> Jesse

I'm trying to keep the dependency of topology on what the pci_dev and
pci_bus structs look like to a minimum.  That's why I'd like to keep the
topology function based on PCI bus numbers (or possibly struct pci_bus),
not struct pci_dev.  The pci_bus is what really has the node affinity
anyway, and the device only has that affinity through the fact that it
is physically plugged into a particular bus.

-Matt



* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-27 18:32     ` Matthew Dobson
@ 2004-07-27 18:40       ` Jesse Barnes
  2004-07-29  0:06         ` Matthew Dobson
  2004-07-28 15:01       ` Martin J. Bligh
  1 sibling, 1 reply; 23+ messages in thread
From: Jesse Barnes @ 2004-07-27 18:40 UTC (permalink / raw)
  To: colpatch
  Cc: Christoph Hellwig, Jesse Barnes, Andi Kleen, LKML,
	Martin J. Bligh, LSE Tech

On Tuesday, July 27, 2004 11:32 am, Matthew Dobson wrote:
> >   ...
> > #ifdef CONFIG_NUMA
> >   int node; /* or nodemask_t if necessary */
> > #endif
> >   ...
> >
> > to struct pci_bus instead?  That would make the existing code paths a
> > little faster and avoid the need for a global array, which tends to lead
> > to TLB misses.
>
> I like that idea!  Stick a nodemask_t in struct pci_bus, initialize it
> to NODE_MASK_ALL.  If a particular arch wants to put something more
> accurate in there, then great, if not, we're just in the same boat we're
> in now.

Cool, sounds like that'll work well.

> I'm trying to keep the dependency of topology on what the pci_dev and
> pci_bus structs look like to a minimum.  That's why I'd like to keep the
> topology function based on PCI bus numbers (or possibly struct pci_bus),
> not struct pci_dev.  The pci_bus is what really has the node affinity
> anyway, and the device only has that affinity through the fact that it
> is physically plugged into a particular bus.

Sure, that makes sense.  And it's easy enough to get a pci_bus from a pci_dev 
that we probably won't run into trouble.

Thanks,
Jesse


* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-27 18:32     ` Matthew Dobson
  2004-07-27 18:40       ` Jesse Barnes
@ 2004-07-28 15:01       ` Martin J. Bligh
  2004-07-28 19:10         ` Matthew Dobson
  1 sibling, 1 reply; 23+ messages in thread
From: Martin J. Bligh @ 2004-07-28 15:01 UTC (permalink / raw)
  To: colpatch, Jesse Barnes
  Cc: Christoph Hellwig, Jesse Barnes, Andi Kleen, LKML, LSE Tech

>> I wonder though if we shouldn't add
>> 
>>   ...
>> #ifdef CONFIG_NUMA
>>   int node; /* or nodemask_t if necessary */
>> #endif
>>   ...
>> 
>> to struct pci_bus instead?  That would make the existing code paths a little 
>> faster and avoid the need for a global array, which tends to lead to TLB 
>> misses.
> 
> I like that idea!  Stick a nodemask_t in struct pci_bus, initialize it
> to NODE_MASK_ALL.  If a particular arch wants to put something more
> accurate in there, then great, if not, we're just in the same boat we're
> in now.
> 
> Anyone else have opinions one way or the other on Jesse's idea?

Sounds great - if it's possible to add it to something more generic than
PCI, that'd be even better, but pci would still be very useful.

M.



* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-28 15:01       ` Martin J. Bligh
@ 2004-07-28 19:10         ` Matthew Dobson
  0 siblings, 0 replies; 23+ messages in thread
From: Matthew Dobson @ 2004-07-28 19:10 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Jesse Barnes, Christoph Hellwig, Jesse Barnes, Andi Kleen, LKML,
	LSE Tech

On Wed, 2004-07-28 at 08:01, Martin J. Bligh wrote:
> >> I wonder though if we shouldn't add
> >> 
> >>   ...
> >> #ifdef CONFIG_NUMA
> >>   int node; /* or nodemask_t if necessary */
> >> #endif
> >>   ...
> >> 
> >> to struct pci_bus instead?  That would make the existing code paths a little 
> >> faster and avoid the need for a global array, which tends to lead to TLB 
> >> misses.
> > 
> > I like that idea!  Stick a nodemask_t in struct pci_bus, initialize it
> > to NODE_MASK_ALL.  If a particular arch wants to put something more
> > accurate in there, then great, if not, we're just in the same boat we're
> > in now.
> > 
> > Anyone else have opinions one way or the other on Jesse's idea?
> 
> Sounds great - if it's possible to add it to something more generic than
> PCI, that'd be even better, but pci would still be very useful.
> 
> M.

Is there anything like that?  I'm not aware of any structure that keeps
track of general "buses", which would be what we want.  Something that
keeps track of PCI buses, Infiniband buses, arch-specific fabric buses,
etc.  Barring the existence of such a structure, I'll just shove it in
the PCI bus structure for now.

-Matt



* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-27 18:40       ` Jesse Barnes
@ 2004-07-29  0:06         ` Matthew Dobson
  2004-07-29 15:43           ` Jesse Barnes
  2004-07-29 17:02           ` Rajesh Shah
  0 siblings, 2 replies; 23+ messages in thread
From: Matthew Dobson @ 2004-07-29  0:06 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Christoph Hellwig, Jesse Barnes, Andi Kleen, LKML,
	Martin J. Bligh, LSE Tech

On Tue, 2004-07-27 at 11:40, Jesse Barnes wrote:
> On Tuesday, July 27, 2004 11:32 am, Matthew Dobson wrote:
> > >   ...
> > > #ifdef CONFIG_NUMA
> > >   int node; /* or nodemask_t if necessary */
> > > #endif
> > >   ...
> > >
> > > to struct pci_bus instead?  That would make the existing code paths a
> > > little faster and avoid the need for a global array, which tends to lead
> > > to TLB misses.
> >
> > I like that idea!  Stick a nodemask_t in struct pci_bus, initialize it
> > to NODE_MASK_ALL.  If a particular arch wants to put something more
> > accurate in there, then great, if not, we're just in the same boat we're
> > in now.
> 
> Cool, sounds like that'll work well.

Ok, so I'm no longer convinced that this will work as well as I once
thought.  It's pretty trivial to add a nodemask_t to the struct pci_bus,
and even initialize it to a reasonable value (i.e. NODE_MASK_ALL) since
there's the convenient pci_alloc_bus() function in drivers/pci/probe.c. 
The problem is where to put hooks for individual arches to put the
*real* nodemask in this field...  My only thought right now is to create
a per-arch callback function, arch_get_pcibus_nodemask() or something,
and use the value it returns to populate pci_bus->nodemask.  We would
have to call this function anywhere a struct pci_bus is allocated, and
probably pass along the PCI bus number so the arch could determine which
nodes it belongs to.  Would that work for everyone that cares?  We could
overload that to return NODE_MASK_ALL for non-NUMA systems, and have it
do the right thing for arches that care...
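
A user-space sketch of the callback idea, assuming the same #ifndef override
pattern that asm-generic/topology.h already uses for the other topology
macros. All toy_ names are hypothetical, the mask is a plain unsigned long
rather than a real nodemask_t, and the allocator here takes a bus number
purely for illustration; it is not the signature of the real pci_alloc_bus().

```c
#include <stdlib.h>

typedef unsigned long toy_nodemask_t;	/* one bit per node; toy type */
#define TOY_NODE_MASK_ALL (~0UL)

struct toy_pci_bus {
	int number;
	toy_nodemask_t nodemask;
};

/*
 * An arch that knows better would define this before the generic code
 * is compiled; everyone else gets the "all nodes" fallback.
 */
#ifndef toy_arch_get_pcibus_nodemask
#define toy_arch_get_pcibus_nodemask(busnr)	(TOY_NODE_MASK_ALL)
#endif

/* Allocate a bus and let the arch hook fill in its node affinity. */
struct toy_pci_bus *toy_pci_alloc_bus(int busnr)
{
	struct toy_pci_bus *b = calloc(1, sizeof(*b));

	if (b) {
		b->number = busnr;
		b->nodemask = toy_arch_get_pcibus_nodemask(busnr);
	}
	return b;
}
```

With this shape, non-NUMA builds and arches without topology information
behave exactly as they do today, and only arches that define the hook pay for
the lookup.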

Current, nowhere near complete patch attached...

-Matt

diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/arch/x86_64/kernel/mpparse.c linux-2.6.8-rc2-mm1+pcibus_to_nodemask/arch/x86_64/kernel/mpparse.c
--- linux-2.6.8-rc2-mm1/arch/x86_64/kernel/mpparse.c	2004-07-28 10:50:34.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/arch/x86_64/kernel/mpparse.c	2004-07-28 16:23:50.000000000 -0700
@@ -44,7 +44,6 @@ int acpi_found_madt;
 int apic_version [MAX_APICS];
 unsigned char mp_bus_id_to_type [MAX_MP_BUSSES] = { [0 ... MAX_MP_BUSSES-1] = -1 };
 int mp_bus_id_to_pci_bus [MAX_MP_BUSSES] = { [0 ... MAX_MP_BUSSES-1] = -1 };
-cpumask_t pci_bus_to_cpumask [256] = { [0 ... 255] = CPU_MASK_ALL };
 
 int mp_current_pci_id = 0;
 /* I/O APIC entries */
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/drivers/pci/probe.c linux-2.6.8-rc2-mm1+pcibus_to_nodemask/drivers/pci/probe.c
--- linux-2.6.8-rc2-mm1/drivers/pci/probe.c	2004-07-28 10:49:45.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/drivers/pci/probe.c	2004-07-28 17:00:30.000000000 -0700
@@ -6,7 +6,7 @@
 #include <linux/pci.h>
 #include <linux/slab.h>
 #include <linux/module.h>
-#include <linux/cpumask.h>
+#include <linux/topology.h>
 
 #undef DEBUG
 
@@ -54,7 +54,7 @@ postcore_initcall(pcibus_class_init);
  */
 static ssize_t pci_bus_show_cpuaffinity(struct class_device *class_dev, char *buf)
 {
-	cpumask_t cpumask = pcibus_to_cpumask((to_pci_bus(class_dev))->number);
+	cpumask_t cpumask = nodemask_to_cpumask(pcibus_to_nodemask(to_pci_bus(class_dev)));
 	int ret;
 
 	ret = cpumask_scnprintf(buf, PAGE_SIZE, cpumask);
@@ -270,6 +270,7 @@ static struct pci_bus * __devinit pci_al
 		INIT_LIST_HEAD(&b->node);
 		INIT_LIST_HEAD(&b->children);
 		INIT_LIST_HEAD(&b->devices);
+		b->nodemask = NODE_MASK_ALL;
 	}
 	return b;
 }
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/asm-alpha/topology.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-alpha/topology.h
--- linux-2.6.8-rc2-mm1/include/asm-alpha/topology.h	2004-07-28 10:49:58.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-alpha/topology.h	2004-07-28 16:42:29.000000000 -0700
@@ -42,8 +42,6 @@ static inline cpumask_t node_to_cpumask(
 /* Cross-node load balancing interval. */
 # define NODE_BALANCE_RATE 10
 
-#define pcibus_to_cpumask(bus)	(cpu_online_map)
-
 #else /* CONFIG_NUMA */
 # include <asm-generic/topology.h>
 #endif /* !CONFIG_NUMA */
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/asm-generic/topology.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-generic/topology.h
--- linux-2.6.8-rc2-mm1/include/asm-generic/topology.h	2004-06-15 22:18:58.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-generic/topology.h	2004-07-28 16:29:06.000000000 -0700
@@ -41,9 +41,6 @@
 #ifndef node_to_first_cpu
 #define node_to_first_cpu(node)	(0)
 #endif
-#ifndef pcibus_to_cpumask
-#define pcibus_to_cpumask(bus)	(cpu_online_map)
-#endif
 
 /* Cross-node load balancing interval. */
 #ifndef NODE_BALANCE_RATE
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/asm-i386/topology.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-i386/topology.h
--- linux-2.6.8-rc2-mm1/include/asm-i386/topology.h	2004-06-15 22:19:01.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-i386/topology.h	2004-07-28 16:29:35.000000000 -0700
@@ -60,12 +60,6 @@ static inline int node_to_first_cpu(int 
 	return first_cpu(mask);
 }
 
-/* Returns the number of the node containing PCI bus 'bus' */
-static inline cpumask_t pcibus_to_cpumask(int bus)
-{
-	return node_to_cpumask(mp_bus_id_to_node[bus]);
-}
-
 /* Node-to-Node distance */
 #define node_distance(from, to) (from != to)
 
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/asm-mips/mach-ip27/topology.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-mips/mach-ip27/topology.h
--- linux-2.6.8-rc2-mm1/include/asm-mips/mach-ip27/topology.h	2004-06-15 22:20:26.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-mips/mach-ip27/topology.h	2004-07-28 16:07:38.000000000 -0700
@@ -7,7 +7,6 @@
 #define parent_node(node)	(node)
 #define node_to_cpumask(node)	(HUB_DATA(node)->h_cpus)
 #define node_to_first_cpu(node)	(first_cpu(node_to_cpumask(node)))
-#define pcibus_to_cpumask(bus)	(cpu_online_map)
 
 extern int node_distance(nasid_t nasid_a, nasid_t nasid_b);
 #define node_distance(from, to)	node_distance(from, to)
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/asm-ppc64/topology.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-ppc64/topology.h
--- linux-2.6.8-rc2-mm1/include/asm-ppc64/topology.h	2004-06-15 22:20:16.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-ppc64/topology.h	2004-07-28 16:07:48.000000000 -0700
@@ -33,8 +33,6 @@ static inline int node_to_first_cpu(int 
 	return first_cpu(tmp);
 }
 
-#define pcibus_to_cpumask(bus)	(cpu_online_map)
-
 #define nr_cpus_node(node)	(nr_cpus_in_node[node])
 
 /* Cross-node load balancing interval. */
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/asm-x86_64/mpspec.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-x86_64/mpspec.h
--- linux-2.6.8-rc2-mm1/include/asm-x86_64/mpspec.h	2004-07-28 10:50:50.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-x86_64/mpspec.h	2004-07-28 16:27:01.000000000 -0700
@@ -166,7 +166,6 @@ enum mp_bustype {
 };
 extern unsigned char mp_bus_id_to_type [MAX_MP_BUSSES];
 extern int mp_bus_id_to_pci_bus [MAX_MP_BUSSES];
-extern cpumask_t pci_bus_to_cpumask [256];
 
 extern unsigned int boot_cpu_physical_apicid;
 extern int smp_found_config;
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/asm-x86_64/topology.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-x86_64/topology.h
--- linux-2.6.8-rc2-mm1/include/asm-x86_64/topology.h	2004-07-28 10:50:50.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-x86_64/topology.h	2004-07-28 16:30:17.000000000 -0700
@@ -20,13 +20,6 @@ extern cpumask_t     node_to_cpumask[];
 #define node_to_first_cpu(node) 	(__ffs(node_to_cpumask[node]))
 #define node_to_cpumask(node)		(node_to_cpumask[node])
 
-static inline cpumask_t pcibus_to_cpumask(int bus)
-{
-	cpumask_t res;
-	cpus_and(res,  pci_bus_to_cpumask[bus], cpu_online_map);
-	return res;
-}
-
 #define NODE_BALANCE_RATE 30	/* CHECKME */ 
 
 #endif
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/linux/pci.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/linux/pci.h
--- linux-2.6.8-rc2-mm1/include/linux/pci.h	2004-07-28 10:50:51.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/linux/pci.h	2004-07-28 16:58:46.000000000 -0700
@@ -590,6 +590,8 @@ struct pci_bus {
 	unsigned short  pad2;
 	struct device		*bridge;
 	struct class_device	class_dev;
+	nodemask_t	nodemask;	/* For NUMA systems, we care about which 
+					   node(s) this PCI bus is on/close to. */
 };
 
 #define pci_bus_b(n)	list_entry(n, struct pci_bus, node)
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/linux/topology.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/linux/topology.h
--- linux-2.6.8-rc2-mm1/include/linux/topology.h	2004-07-28 16:25:59.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/linux/topology.h	2004-07-28 17:00:26.000000000 -0700
@@ -28,6 +28,7 @@
 #define _LINUX_TOPOLOGY_H
 
 #include <linux/cpumask.h>
+#include <linux/nodemask.h>
 #include <linux/bitops.h>
 #include <linux/mmzone.h>
 #include <linux/smp.h>
@@ -54,6 +55,18 @@ static inline int __next_node_with_cpus(
 #define for_each_node_with_cpus(node) \
 	for (node = 0; node < numnodes; node = __next_node_with_cpus(node))
 
+static inline cpumask_t nodemask_to_cpumask(nodemask_t nodemask)
+{
+	cpumask_t ret, tmp;
+	int node;
+	cpus_clear(ret);
+	for_each_node_mask(node, nodemask) {
+		tmp = node_to_cpumask(node);
+		cpus_or(ret, ret, tmp);
+	}
+	return ret;
+}
+
 #ifndef node_distance
 #define node_distance(from,to)	(from != to)
 #endif
@@ -61,4 +74,9 @@ static inline int __next_node_with_cpus(
 #define PENALTY_FOR_NODE_WITH_CPUS	(1)
 #endif
 
+static inline nodemask_t pcibus_to_nodemask(struct pci_bus *bus)
+{
+	return bus->nodemask;
+}
+
 #endif /* _LINUX_TOPOLOGY_H */




* Re: [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-27 18:18       ` Matthew Dobson
@ 2004-07-29  8:34         ` Paul Jackson
  0 siblings, 0 replies; 23+ messages in thread
From: Paul Jackson @ 2004-07-29  8:34 UTC (permalink / raw)
  To: colpatch; +Cc: ak, jbarnes, jbarnes, linux-kernel, mbligh, lse-tech

Matthew wrote:
> That will make this patch dependent on my nodemask_t patch,

Did you notice that Andrew included your most recently published
nodemask_t in 2.6.8-rc2-mm1?

Congratulations !

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373


* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-29  0:06         ` Matthew Dobson
@ 2004-07-29 15:43           ` Jesse Barnes
  2004-07-29 22:23             ` Matthew Dobson
  2004-07-29 17:02           ` Rajesh Shah
  1 sibling, 1 reply; 23+ messages in thread
From: Jesse Barnes @ 2004-07-29 15:43 UTC (permalink / raw)
  To: colpatch
  Cc: Christoph Hellwig, Jesse Barnes, Andi Kleen, LKML,
	Martin J. Bligh, LSE Tech

On Wednesday, July 28, 2004 5:06 pm, Matthew Dobson wrote:
> Ok, so I'm no longer convinced that this will work as well as I once
> thought.  It's pretty trivial to add a nodemask_t to the struct pci_bus,
> and even initialize it to a reasonable value (ie: NODE_MASK_ALL) since
> there's the convenient pci_alloc_bus() function in drivers/pci/probe.c.
> The problem is where to put hooks for individual arches to put the
> *real* nodemask in this field...  My only thought right now is to create
> a per-arch callback function, arch_get_pcibus_nodemask() or something,

Yeah, that sounds reasonable.  You could protect a generic definition with 
#ifndef ARCH_HAS_PCIBUS_TO_NODEMASK or something...

> and use the value it returns to populate pci_bus->nodemask.  We would
> have to call this function anywhere a struct pci_bus is allocated, and
> probably pass along the PCI bus number so the arch could determine which
> nodes it belongs to.  Would that work for everyone that cares?  We could
> overload that to return NODE_MASK_ALL for non-NUMA systems, and have it
> do the right thing for arches that care...

Yeah, I think that would work.  The alternative is to simply add the field, 
initialize it in pci_alloc_bus like you're doing, and leave it to the arches 
to fill it in however they see fit.
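
To make the #ifndef idea concrete, here is a minimal userspace sketch, not kernel code.  It assumes a 64-bit integer stands in for nodemask_t, and the names pci_bus_model, pci_bus_model_init() and MAX_PCI_BUSES are made up for illustration; arch_get_pcibus_nodemask() and ARCH_HAS_PCIBUS_TO_NODEMASK are only the names floated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for nodemask_t: one bit per node (assumption). */
typedef uint64_t nodemask_t;
#define NODE_MASK_ALL ((nodemask_t)~0ULL)
#define MAX_PCI_BUSES 256	/* hypothetical bound for this model */

/*
 * An arch header that knows its topology would define
 * ARCH_HAS_PCIBUS_TO_NODEMASK and supply its own
 * arch_get_pcibus_nodemask().  Nothing does so here, so the
 * generic fallback below is the one that gets compiled.
 */
#ifndef ARCH_HAS_PCIBUS_TO_NODEMASK
static inline nodemask_t arch_get_pcibus_nodemask(int busnr)
{
	(void)busnr;	/* generic case: bus is equally close to every node */
	return NODE_MASK_ALL;
}
#endif

/* Hypothetical cut-down struct pci_bus. */
struct pci_bus_model {
	int number;
	nodemask_t nodemask;
};

/* Run wherever a bus is set up, so the field is always populated. */
static void pci_bus_model_init(struct pci_bus_model *b, int busnr)
{
	assert(busnr >= 0 && busnr < MAX_PCI_BUSES);
	b->number = busnr;
	b->nodemask = arch_get_pcibus_nodemask(busnr);
}
```

With no arch override defined, every bus ends up with NODE_MASK_ALL, which matches the non-NUMA behaviour described above.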

Jesse


* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-29  0:06         ` Matthew Dobson
  2004-07-29 15:43           ` Jesse Barnes
@ 2004-07-29 17:02           ` Rajesh Shah
  2004-07-29 22:27             ` Matthew Dobson
  1 sibling, 1 reply; 23+ messages in thread
From: Rajesh Shah @ 2004-07-29 17:02 UTC (permalink / raw)
  To: Matthew Dobson
  Cc: Jesse Barnes, Christoph Hellwig, Jesse Barnes, Andi Kleen, LKML,
	Martin J. Bligh, LSE Tech

On Wed, Jul 28, 2004 at 05:06:48PM -0700, Matthew Dobson wrote:
> 
> thought.  It's pretty trivial to add a nodemask_t to the struct pci_bus,
> and even initialize it to a reasonable value (ie: NODE_MASK_ALL) since
> there's the convenient pci_alloc_bus() function in drivers/pci/probe.c. 
> The problem is where to put hooks for individual arches to put the
> *real* nodemask in this field...  My only thought right now is to create
> a per-arch callback function, arch_get_pcibus_nodemask() or something,
> and use the value it returns to populate pci_bus->nodemask.  We would
> have to call this function anywhere a struct pci_bus is allocated, and
> probably pass along the PCI bus number so the arch could determine which
> nodes it belongs to.  Would that work for everyone that cares?  We could

With PCI root/p2p bridge hotplug, the code dealing with the
hotplug (e.g. ACPI hotplug code) will have this information, not 
arch specific code. How about having the PCI subsystem export
an interface to set the nodemask, and have the arch or hotplug
code call it to change the defaults? That way, pci_alloc_bus()
simply sets the default and does not perform any callback.
Does that work for everyone?
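
As a rough illustration of that split (default at allocation, explicit override later), here is a userspace sketch.  Everything in it is an assumption made for the example: nodemask_t is modelled as a 64-bit integer, and pci_bus_model, pci_bus_model_alloc_init() and pci_bus_model_set_nodemask() are hypothetical names, not an existing PCI interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for nodemask_t: one bit per node (assumption). */
typedef uint64_t nodemask_t;
#define NODE_MASK_ALL ((nodemask_t)~0ULL)

/* Hypothetical cut-down struct pci_bus. */
struct pci_bus_model {
	int number;
	nodemask_t nodemask;
};

/* Allocation path: only sets the default, performs no callback. */
static void pci_bus_model_alloc_init(struct pci_bus_model *b, int busnr)
{
	assert(b != NULL);
	b->number = busnr;
	b->nodemask = NODE_MASK_ALL;
}

/*
 * Hypothetical setter that arch or hotplug code would call once it
 * knows the real locality.  Returns 0 on success, -1 if the bus
 * pointer is NULL or the mask is empty (an empty mask would leave
 * the bus attached to no node at all).
 */
static int pci_bus_model_set_nodemask(struct pci_bus_model *b, nodemask_t mask)
{
	if (b == NULL || mask == 0)
		return -1;
	b->nodemask = mask;
	return 0;
}
```

The point of the design is that whoever learns the topology last (boot-time arch code or a hotplug handler) can correct the default without the allocation path needing to know who that is.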

Rajesh



* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-29 15:43           ` Jesse Barnes
@ 2004-07-29 22:23             ` Matthew Dobson
  2004-07-30 15:36               ` Jesse Barnes
  0 siblings, 1 reply; 23+ messages in thread
From: Matthew Dobson @ 2004-07-29 22:23 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Christoph Hellwig, Jesse Barnes, Andi Kleen, LKML,
	Martin J. Bligh, LSE Tech

On Thu, 2004-07-29 at 08:43, Jesse Barnes wrote:
> On Wednesday, July 28, 2004 5:06 pm, Matthew Dobson wrote:
> > Ok, so I'm no longer convinced that this will work as well as I once
> > thought.  It's pretty trivial to add a nodemask_t to the struct pci_bus,
> > and even initialize it to a reasonable value (ie: NODE_MASK_ALL) since
> > there's the convenient pci_alloc_bus() function in drivers/pci/probe.c.
> > The problem is where to put hooks for individual arches to put the
> > *real* nodemask in this field...  My only thought right now is to create
> > a per-arch callback function, arch_get_pcibus_nodemask() or something,
> 
> Yeah, that sounds reasonable.  You could protect a generic definition with 
> #ifndef ARCH_HAS_PCIBUS_TO_NODEMASK or something...
> 
> > and use the value it returns to populate pci_bus->nodemask.  We would
> > have to call this function anywhere a struct pci_bus is allocated, and
> > probably pass along the PCI bus number so the arch could determine which
> > nodes it belongs to.  Would that work for everyone that cares?  We could
> > overload that to return NODE_MASK_ALL for non-NUMA systems, and have it
> > do the right thing for arches that care...
> 
> Yeah, I think that would work.  The alternative is to simply add the field, 
> initialize it in pci_alloc_bus like you're doing, and leave it to the arches 
> to fill it in however they see fit.
> 
> Jesse

Ok...  Still an RFC, but moving closer to something that we can use. 
Anyone have any comments on this untested iteration? ;)

What I'm doing is basically ripping out all the old pcibus_to_cpumask()
calls.  The only arch that defined it to be anything other than all
online CPUs (cpu_online_map) was i386, and theirs should still work.
x86_64 had the beginnings of a PCI bus to CPU mask mapping, but it was
never filled in, just populated with CPU_MASK_ALL, so it does the same
with NODE_MASK_ALL now.  Those two arches, in their
include/asm-$ARCH/topology.h, define both ARCH_HAS_GET_PCIBUS_NODEMASK
and get_pcibus_nodemask(bus).  include/linux/topology.h defines a simple
get_pcibus_nodemask(bus), returning NODE_MASK_ALL, if there isn't an
arch-specific one provided.  We then, in drivers/pci/probe.c, populate
the nodemask field of struct pci_bus with this nodemask.  Lookup involves
simply returning the nodemask stored in the struct pci_bus.
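
For anyone who wants to see the nodemask-to-cpumask step in isolation, here is a small userspace model of it.  It is only a sketch under stated assumptions: masks are plain integers rather than the kernel's nodemask_t/cpumask_t, and the 4-node, 16-CPU layout in node_cpus[] is invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 4		/* made-up topology for illustration */

typedef uint32_t nodemask_t;	/* one bit per node (assumption) */
typedef uint64_t cpumask_t;	/* one bit per CPU (assumption) */

/* Hypothetical layout: node n owns CPUs 4n .. 4n+3. */
static const cpumask_t node_cpus[MAX_NODES] = {
	0x000F, 0x00F0, 0x0F00, 0xF000
};

/* Union of the CPU masks of every node set in 'nodes'. */
static cpumask_t nodemask_to_cpumask(nodemask_t nodes)
{
	cpumask_t ret = 0;
	int node;

	assert((nodes >> MAX_NODES) == 0);	/* no bits beyond the model */
	for (node = 0; node < MAX_NODES; node++)
		if (nodes & ((nodemask_t)1 << node))
			ret |= node_cpus[node];
	return ret;
}
```

A bus close to nodes 0 and 2 (mask 0x5) maps to CPU mask 0x0F0F in this model, and a bus left at "all nodes" maps to every CPU, which is what the cpuaffinity file showed before on arches without a real mapping.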

[mcd@arrakis source]$ diffstat ~/linux/patches/pcibus_to_nodemask.patch
 arch/x86_64/kernel/mpparse.c          |    2 +-
 drivers/pci/probe.c                   |    6 ++++--
 include/asm-alpha/topology.h          |    2 --
 include/asm-generic/topology.h        |    3 ---
 include/asm-i386/topology.h           |    7 ++-----
 include/asm-mips/mach-ip27/topology.h |    1 -
 include/asm-ppc64/topology.h          |    2 --
 include/asm-x86_64/mpspec.h           |    2 +-
 include/asm-x86_64/topology.h         |   10 +++-------
 include/linux/pci.h                   |    2 ++
 include/linux/topology.h              |   22 ++++++++++++++++++++++
 11 files changed, 35 insertions(+), 24 deletions(-)

-Matt

diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/arch/x86_64/kernel/mpparse.c linux-2.6.8-rc2-mm1+pcibus_to_nodemask/arch/x86_64/kernel/mpparse.c
--- linux-2.6.8-rc2-mm1/arch/x86_64/kernel/mpparse.c	2004-07-28 10:50:34.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/arch/x86_64/kernel/mpparse.c	2004-07-29 14:53:27.000000000 -0700
@@ -44,7 +44,7 @@ int acpi_found_madt;
 int apic_version [MAX_APICS];
 unsigned char mp_bus_id_to_type [MAX_MP_BUSSES] = { [0 ... MAX_MP_BUSSES-1] = -1 };
 int mp_bus_id_to_pci_bus [MAX_MP_BUSSES] = { [0 ... MAX_MP_BUSSES-1] = -1 };
-cpumask_t pci_bus_to_cpumask [256] = { [0 ... 255] = CPU_MASK_ALL };
+nodemask_t pci_bus_to_nodemask [256] = { [0 ... 255] = NODE_MASK_ALL };
 
 int mp_current_pci_id = 0;
 /* I/O APIC entries */
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/drivers/pci/probe.c linux-2.6.8-rc2-mm1+pcibus_to_nodemask/drivers/pci/probe.c
--- linux-2.6.8-rc2-mm1/drivers/pci/probe.c	2004-07-28 10:49:45.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/drivers/pci/probe.c	2004-07-29 15:06:26.000000000 -0700
@@ -6,7 +6,7 @@
 #include <linux/pci.h>
 #include <linux/slab.h>
 #include <linux/module.h>
-#include <linux/cpumask.h>
+#include <linux/topology.h>
 
 #undef DEBUG
 
@@ -54,7 +54,7 @@ postcore_initcall(pcibus_class_init);
  */
 static ssize_t pci_bus_show_cpuaffinity(struct class_device *class_dev, char *buf)
 {
-	cpumask_t cpumask = pcibus_to_cpumask((to_pci_bus(class_dev))->number);
+	cpumask_t cpumask = nodemask_to_cpumask(pcibus_to_nodemask(to_pci_bus(class_dev)));
 	int ret;
 
 	ret = cpumask_scnprintf(buf, PAGE_SIZE, cpumask);
@@ -305,6 +305,7 @@ pci_alloc_child_bus(struct pci_bus *pare
 	child->number = child->secondary = busnr;
 	child->primary = parent->secondary;
 	child->subordinate = 0xff;
+	child->nodemask = get_pcibus_nodemask(busnr);
 
 	/* Set up default resource pointers and names.. */
 	for (i = 0; i < 4; i++) {
@@ -786,6 +787,7 @@ struct pci_bus * __devinit pci_scan_bus_
 	b->number = b->secondary = bus;
 	b->resource[0] = &ioport_resource;
 	b->resource[1] = &iomem_resource;
+	b->nodemask = get_pcibus_nodemask(bus);
 
 	b->subordinate = pci_scan_child_bus(b);
 
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/asm-alpha/topology.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-alpha/topology.h
--- linux-2.6.8-rc2-mm1/include/asm-alpha/topology.h	2004-07-28 10:49:58.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-alpha/topology.h	2004-07-28 16:42:29.000000000 -0700
@@ -42,8 +42,6 @@ static inline cpumask_t node_to_cpumask(
 /* Cross-node load balancing interval. */
 # define NODE_BALANCE_RATE 10
 
-#define pcibus_to_cpumask(bus)	(cpu_online_map)
-
 #else /* CONFIG_NUMA */
 # include <asm-generic/topology.h>
 #endif /* !CONFIG_NUMA */
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/asm-generic/topology.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-generic/topology.h
--- linux-2.6.8-rc2-mm1/include/asm-generic/topology.h	2004-06-15 22:18:58.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-generic/topology.h	2004-07-28 16:29:06.000000000 -0700
@@ -41,9 +41,6 @@
 #ifndef node_to_first_cpu
 #define node_to_first_cpu(node)	(0)
 #endif
-#ifndef pcibus_to_cpumask
-#define pcibus_to_cpumask(bus)	(cpu_online_map)
-#endif
 
 /* Cross-node load balancing interval. */
 #ifndef NODE_BALANCE_RATE
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/asm-i386/topology.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-i386/topology.h
--- linux-2.6.8-rc2-mm1/include/asm-i386/topology.h	2004-06-15 22:19:01.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-i386/topology.h	2004-07-29 15:16:16.000000000 -0700
@@ -60,11 +60,8 @@ static inline int node_to_first_cpu(int 
 	return first_cpu(mask);
 }
 
-/* Returns the number of the node containing PCI bus 'bus' */
-static inline cpumask_t pcibus_to_cpumask(int bus)
-{
-	return node_to_cpumask(mp_bus_id_to_node[bus]);
-}
+#define ARCH_HAS_GET_PCIBUS_NODEMASK
+#define get_pcibus_nodemask(bus)	(nodemask_of_node(mp_bus_id_to_node[bus]))
 
 /* Node-to-Node distance */
 #define node_distance(from, to) (from != to)
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/asm-mips/mach-ip27/topology.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-mips/mach-ip27/topology.h
--- linux-2.6.8-rc2-mm1/include/asm-mips/mach-ip27/topology.h	2004-06-15 22:20:26.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-mips/mach-ip27/topology.h	2004-07-28 16:07:38.000000000 -0700
@@ -7,7 +7,6 @@
 #define parent_node(node)	(node)
 #define node_to_cpumask(node)	(HUB_DATA(node)->h_cpus)
 #define node_to_first_cpu(node)	(first_cpu(node_to_cpumask(node)))
-#define pcibus_to_cpumask(bus)	(cpu_online_map)
 
 extern int node_distance(nasid_t nasid_a, nasid_t nasid_b);
 #define node_distance(from, to)	node_distance(from, to)
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/asm-ppc64/topology.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-ppc64/topology.h
--- linux-2.6.8-rc2-mm1/include/asm-ppc64/topology.h	2004-06-15 22:20:16.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-ppc64/topology.h	2004-07-28 16:07:48.000000000 -0700
@@ -33,8 +33,6 @@ static inline int node_to_first_cpu(int 
 	return first_cpu(tmp);
 }
 
-#define pcibus_to_cpumask(bus)	(cpu_online_map)
-
 #define nr_cpus_node(node)	(nr_cpus_in_node[node])
 
 /* Cross-node load balancing interval. */
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/asm-x86_64/mpspec.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-x86_64/mpspec.h
--- linux-2.6.8-rc2-mm1/include/asm-x86_64/mpspec.h	2004-07-28 10:50:50.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-x86_64/mpspec.h	2004-07-29 15:08:24.000000000 -0700
@@ -166,7 +166,7 @@ enum mp_bustype {
 };
 extern unsigned char mp_bus_id_to_type [MAX_MP_BUSSES];
 extern int mp_bus_id_to_pci_bus [MAX_MP_BUSSES];
-extern cpumask_t pci_bus_to_cpumask [256];
+extern nodemask_t pci_bus_to_nodemask [256];
 
 extern unsigned int boot_cpu_physical_apicid;
 extern int smp_found_config;
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/asm-x86_64/topology.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-x86_64/topology.h
--- linux-2.6.8-rc2-mm1/include/asm-x86_64/topology.h	2004-07-28 10:50:50.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/asm-x86_64/topology.h	2004-07-29 15:09:58.000000000 -0700
@@ -20,15 +20,11 @@ extern cpumask_t     node_to_cpumask[];
 #define node_to_first_cpu(node) 	(__ffs(node_to_cpumask[node]))
 #define node_to_cpumask(node)		(node_to_cpumask[node])
 
-static inline cpumask_t pcibus_to_cpumask(int bus)
-{
-	cpumask_t res;
-	cpus_and(res,  pci_bus_to_cpumask[bus], cpu_online_map);
-	return res;
-}
-
 #define NODE_BALANCE_RATE 30	/* CHECKME */ 
 
+#define ARCH_HAS_GET_PCIBUS_NODEMASK
+#define get_pcibus_nodemask(bus)	(pci_bus_to_nodemask[bus])
+
 #endif
 
 #include <asm-generic/topology.h>
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/linux/pci.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/linux/pci.h
--- linux-2.6.8-rc2-mm1/include/linux/pci.h	2004-07-28 10:50:51.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/linux/pci.h	2004-07-28 16:58:46.000000000 -0700
@@ -590,6 +590,8 @@ struct pci_bus {
 	unsigned short  pad2;
 	struct device		*bridge;
 	struct class_device	class_dev;
+	nodemask_t	nodemask;	/* For NUMA systems, we care about which 
+					   node(s) this PCI bus is on/close to. */
 };
 
 #define pci_bus_b(n)	list_entry(n, struct pci_bus, node)
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.8-rc2-mm1/include/linux/topology.h linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/linux/topology.h
--- linux-2.6.8-rc2-mm1/include/linux/topology.h	2004-07-28 16:25:59.000000000 -0700
+++ linux-2.6.8-rc2-mm1+pcibus_to_nodemask/include/linux/topology.h	2004-07-29 14:58:43.000000000 -0700
@@ -28,6 +28,7 @@
 #define _LINUX_TOPOLOGY_H
 
 #include <linux/cpumask.h>
+#include <linux/nodemask.h>
 #include <linux/bitops.h>
 #include <linux/mmzone.h>
 #include <linux/smp.h>
@@ -54,6 +55,18 @@ static inline int __next_node_with_cpus(
 #define for_each_node_with_cpus(node) \
 	for (node = 0; node < numnodes; node = __next_node_with_cpus(node))
 
+static inline cpumask_t nodemask_to_cpumask(nodemask_t nodemask)
+{
+	cpumask_t ret, tmp;
+	int node;
+	cpus_clear(ret);
+	for_each_node_mask(node, nodemask) {
+		tmp = node_to_cpumask(node);
+		cpus_or(ret, ret, tmp);
+	}
+	return ret;
+}
+
 #ifndef node_distance
 #define node_distance(from,to)	(from != to)
 #endif
@@ -61,4 +74,13 @@ static inline int __next_node_with_cpus(
 #define PENALTY_FOR_NODE_WITH_CPUS	(1)
 #endif
 
+static inline nodemask_t pcibus_to_nodemask(struct pci_bus *bus)
+{
+	return bus->nodemask;
+}
+
+#ifndef ARCH_HAS_GET_PCIBUS_NODEMASK
+#define get_pcibus_nodemask(busnr)		(NODE_MASK_ALL)
+#endif
+
 #endif /* _LINUX_TOPOLOGY_H */




* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-29 17:02           ` Rajesh Shah
@ 2004-07-29 22:27             ` Matthew Dobson
  2004-07-30  0:02               ` Rajesh Shah
  0 siblings, 1 reply; 23+ messages in thread
From: Matthew Dobson @ 2004-07-29 22:27 UTC (permalink / raw)
  To: Rajesh Shah
  Cc: Jesse Barnes, Christoph Hellwig, Jesse Barnes, Andi Kleen, LKML,
	Martin J. Bligh, LSE Tech

On Thu, 2004-07-29 at 10:02, Rajesh Shah wrote:
> On Wed, Jul 28, 2004 at 05:06:48PM -0700, Matthew Dobson wrote:
> > 
> > thought.  It's pretty trivial to add a nodemask_t to the struct pci_bus,
> > and even initialize it to a reasonable value (ie: NODE_MASK_ALL) since
> > there's the convenient pci_alloc_bus() function in drivers/pci/probe.c. 
> > The problem is where to put hooks for individual arches to put the
> > *real* nodemask in this field...  My only thought right now is to create
> > a per-arch callback function, arch_get_pcibus_nodemask() or something,
> > and use the value it returns to populate pci_bus->nodemask.  We would
> > have to call this function anywhere a struct pci_bus is allocated, and
> > probably pass along the PCI bus number so the arch could determine which
> > nodes it belongs to.  Would that work for everyone that cares?  We could
> 
> With PCI root/p2p bridge hotplug, the code dealing with the
> hotplug (e.g. ACPI hotplug code) will have this information, not 
> arch specific code. How about having the PCI subsystem export
> an interface to set the nodemask, and have the arch or hotplug
> code call it to change the defaults? That way, pci_alloc_bus()
> simply sets the default and does not perform any callback.
> Does that work for everyone?
> 
> Rajesh

Does the patch I just posted in this thread work for you?  You could
have ACPI define the get_pcibus_nodemask(bus) call, and all should work
fine...

-Matt



* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-29 22:27             ` Matthew Dobson
@ 2004-07-30  0:02               ` Rajesh Shah
  0 siblings, 0 replies; 23+ messages in thread
From: Rajesh Shah @ 2004-07-30  0:02 UTC (permalink / raw)
  To: Matthew Dobson
  Cc: Rajesh Shah, Jesse Barnes, Christoph Hellwig, Jesse Barnes,
	Andi Kleen, LKML, Martin J. Bligh, LSE Tech

On Thu, Jul 29, 2004 at 03:27:46PM -0700, Matthew Dobson wrote:
> On Thu, 2004-07-29 at 10:02, Rajesh Shah wrote:
> > On Wed, Jul 28, 2004 at 05:06:48PM -0700, Matthew Dobson wrote:
> > > 
> > > and even initialize it to a reasonable value (ie: NODE_MASK_ALL) since
> > > there's the convenient pci_alloc_bus() function in drivers/pci/probe.c. 
> > > The problem is where to put hooks for individual arches to put the
> > > *real* nodemask in this field...  My only thought right now is to create
> > > a per-arch callback function, arch_get_pcibus_nodemask() or something,
> > > and use the value it returns to populate pci_bus->nodemask.  We would
> > > have to call this function anywhere a struct pci_bus is allocated, and
> > > probably pass along the PCI bus number so the arch could determine which
> > > nodes it belongs to.  Would that work for everyone that cares?  We could
> > 
> > With PCI root/p2p bridge hotplug, the code dealing with the
> > hotplug (e.g. ACPI hotplug code) will have this information, not 
> > arch specific code. How about having the PCI subsystem export
> > an interface to set the nodemask, and have the arch or hotplug
> > code call it to change the defaults? That way, pci_alloc_bus()
> > simply sets the default and does not perform any callback.
> > Does that work for everyone?
> > 
> 
> Does the patch I just posted in this thread work for you?  You could
> have ACPI define the get_pcibus_nodemask(bus) call, and all should work
> fine...
> 
Yes, the patch you posted is fine. I was talking about the part
that was not in the patch but mentioned above (arch callbacks).
I'm working on ACPI-based root/p2p bridge hotplug but am far from
being done. I can post the patches to get/set nodemask later, when
my work is farther along.

Rajesh


* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-29 22:23             ` Matthew Dobson
@ 2004-07-30 15:36               ` Jesse Barnes
  2004-07-30 22:17                 ` Matthew Dobson
  0 siblings, 1 reply; 23+ messages in thread
From: Jesse Barnes @ 2004-07-30 15:36 UTC (permalink / raw)
  To: colpatch
  Cc: Christoph Hellwig, Jesse Barnes, Andi Kleen, LKML,
	Martin J. Bligh, LSE Tech

[-- Attachment #1: Type: text/plain, Size: 1007 bytes --]

On Thursday, July 29, 2004 3:23 pm, Matthew Dobson wrote:
> What I'm doing is basically ripping out all the old pcibus_to_cpumask()
> calls.  The only arch that defined it to be anything other than
> CPU_MASK_ALL was i386, and theirs should still work.  x86_64 had the
> beginnings of a PCI bus to CPU mask mapping, but it was never filled in,
> just populated with CPU_MASK_ALL, so it does the same with NODE_MASK_ALL
> now.  Those two arches, in their include/asm-$ARCH/topology.h define
> both ARCH_HAS_GET_PCIBUS_NODEMASK and get_pcibus_nodemask(bus).
> include/linux/topology.h defines a simple get_pcibus_nodemask(bus) if
> there isn't an arch-specific one provided.  We then, in
> drivers/pci/probe.c, populate the nodemask field of struct pci_bus with
> this nodemask.  Lookup involves simply returning the nodemask stored in
> the struct pci_bus.

I think this will work.  My tree didn't have nodemask_t though, so it didn't 
compile :)  Here's a first stab at an ia64 portion of the patch.
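
The ia64 side routes the lookup through a per-platform function pointer with a generic fallback.  The sketch below models that dispatch in plain userspace C.  It is an assumption-laden illustration, not the ia64 machvec code: nodemask_t is a 64-bit integer, the fallback is chosen at run time rather than by #ifndef, and machvec_model, model_bus_to_node[] and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for nodemask_t: one bit per node (assumption). */
typedef uint64_t nodemask_t;
#define NODE_MASK_ALL ((nodemask_t)~0ULL)
#define MODEL_MAX_BUSES 4	/* made-up size for illustration */

typedef nodemask_t get_pcibus_nodemask_fn(int bus);

/* Hypothetical one-entry machine vector. */
struct machvec_model {
	get_pcibus_nodemask_fn *get_pcibus_nodemask;
};

/* Generic platform: no locality information, so report every node. */
static nodemask_t generic_get_pcibus_nodemask(int bus)
{
	(void)bus;
	return NODE_MASK_ALL;
}

/* Invented bus-to-node table for a NUMA platform. */
static const int model_bus_to_node[MODEL_MAX_BUSES] = { 0, 0, 1, 3 };

/* NUMA platform: exactly one nearby node per bus. */
static nodemask_t numa_get_pcibus_nodemask(int bus)
{
	assert(bus >= 0 && bus < MODEL_MAX_BUSES);
	return (nodemask_t)1 << model_bus_to_node[bus];
}

/* Dispatch through the vector, falling back to the generic version. */
static nodemask_t model_get_pcibus_nodemask(const struct machvec_model *mv,
					    int bus)
{
	get_pcibus_nodemask_fn *fn = mv->get_pcibus_nodemask;

	return fn ? fn(bus) : generic_get_pcibus_nodemask(bus);
}
```

A platform that leaves the slot empty gets NODE_MASK_ALL; one that fills it in gets a single-node mask per bus.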

Jesse

[-- Attachment #2: pci-bus-to-node-ia64.patch --]
[-- Type: text/plain, Size: 3874 bytes --]

===== arch/ia64/sn/io/machvec/pci_dma.c 1.30 vs edited =====
--- 1.30/arch/ia64/sn/io/machvec/pci_dma.c	2004-03-26 06:33:08 -08:00
+++ edited/arch/ia64/sn/io/machvec/pci_dma.c	2004-07-30 08:38:05 -07:00
@@ -662,6 +662,19 @@
 	return 0;
 }
 
+/**
+ * sn_get_pcibus_nodemask - return set of nearby nodes for a given PCI bus
+ * @bus: bus number
+ *
+ * Return a nodemask_t with nearby node(s).
+ */
+nodemask_t sn_get_pcibus_nodemask(int bus)
+{
+	int node = nasid_to_cnode(busnum_to_nid[bus]);
+
+	return nodemask_of_node(node);
+}
+
 EXPORT_SYMBOL(sn_dma_mapping_error);
 EXPORT_SYMBOL(sn_pci_unmap_single);
 EXPORT_SYMBOL(sn_pci_map_single);
===== include/asm-ia64/io.h 1.19 vs edited =====
--- 1.19/include/asm-ia64/io.h	2004-02-03 21:31:10 -08:00
+++ edited/include/asm-ia64/io.h	2004-07-30 08:37:56 -07:00
@@ -391,6 +391,11 @@
 # define outl_p		outl
 #endif
 
+static inline nodemask_t __ia64_get_pcibus_nodemask(int bus)
+{
+	return NODE_MASK_ALL;
+}
+
 /*
  * An "address" in IO memory space is not clearly either an integer or a pointer. We will
  * accept both, thus the casts.
===== include/asm-ia64/machvec.h 1.25 vs edited =====
--- 1.25/include/asm-ia64/machvec.h	2004-07-10 17:14:00 -07:00
+++ edited/include/asm-ia64/machvec.h	2004-07-30 08:38:35 -07:00
@@ -70,6 +70,7 @@
 typedef unsigned short ia64_mv_readw_relaxed_t (void *);
 typedef unsigned int ia64_mv_readl_relaxed_t (void *);
 typedef unsigned long ia64_mv_readq_relaxed_t (void *);
+typedef nodemask_t ia64_mv_get_pcibus_nodemask_t (int bus);
 
 static inline void
 machvec_noop (void)
@@ -138,6 +139,7 @@
 #  define platform_readw_relaxed        ia64_mv.readw_relaxed
 #  define platform_readl_relaxed        ia64_mv.readl_relaxed
 #  define platform_readq_relaxed        ia64_mv.readq_relaxed
+#  define platform_get_pcibus_nodemask  ia64_mv.get_pcibus_nodemask
 # endif
 
 /* __attribute__((__aligned__(16))) is required to make size of the
@@ -184,6 +186,7 @@
 	ia64_mv_readw_relaxed_t *readw_relaxed;
 	ia64_mv_readl_relaxed_t *readl_relaxed;
 	ia64_mv_readq_relaxed_t *readq_relaxed;
+	ia64_mv_get_pcibus_nodemask_t *get_pcibus_nodemask;
 } __attribute__((__aligned__(16))); /* align attrib? see above comment */
 
 #define MACHVEC_INIT(name)			\
@@ -226,6 +229,7 @@
 	platform_readw_relaxed,			\
 	platform_readl_relaxed,			\
 	platform_readq_relaxed,			\
+	platform_get_pcibus_nodemask,		\
 }
 
 extern struct ia64_machine_vector ia64_mv;
@@ -367,6 +371,9 @@
 #endif
 #ifndef platform_readq_relaxed
 # define platform_readq_relaxed	__ia64_readq_relaxed
+#endif
+#ifndef platform_get_pcibus_nodemask
+# define platform_get_pcibus_nodemask __ia64_get_pcibus_nodemask
 #endif
 
 #endif /* _ASM_IA64_MACHVEC_H */
===== include/asm-ia64/machvec_sn2.h 1.14 vs edited =====
--- 1.14/include/asm-ia64/machvec_sn2.h	2004-07-10 17:14:00 -07:00
+++ edited/include/asm-ia64/machvec_sn2.h	2004-07-30 08:31:29 -07:00
@@ -69,6 +69,7 @@
 extern ia64_mv_dma_sync_sg_for_device	sn_dma_sync_sg_for_device;
 extern ia64_mv_dma_mapping_error	sn_dma_mapping_error;
 extern ia64_mv_dma_supported		sn_dma_supported;
+extern ia64_mv_get_pcibus_nodemask_t	sn_get_pcibus_nodemask;
 
 /*
  * This stuff has dual use!
@@ -116,6 +117,7 @@
 #define platform_dma_sync_sg_for_device	sn_dma_sync_sg_for_device
 #define platform_dma_mapping_error		sn_dma_mapping_error
 #define platform_dma_supported		sn_dma_supported
+#define platform_get_pcibus_nodemask	sn_get_pcibus_nodemask
 
 #include <asm/sn/sn2/io.h>
 
===== include/asm-ia64/topology.h 1.10 vs edited =====
--- 1.10/include/asm-ia64/topology.h	2004-02-03 21:35:17 -08:00
+++ edited/include/asm-ia64/topology.h	2004-07-30 08:38:17 -07:00
@@ -45,6 +45,9 @@
 
 void build_cpu_to_node_map(void);
 
+#define ARCH_HAS_GET_PCIBUS_NODEMASK
+extern nodemask_t get_pcibus_nodemask(int bus);
+
 #endif /* CONFIG_NUMA */
 
 #include <asm-generic/topology.h>
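
[Editor's sketch, not part of the posted patch.]  The hunks above hook a
new get_pcibus_nodemask entry into the ia64 machine vector: a function
pointer is added to the vector struct, a platform (sn2) supplies its own
implementation, and a generic default is used otherwise.  The standalone
C below illustrates only that dispatch pattern.  Every sketch_* name is
hypothetical, the unsigned long bitmask merely stands in for the kernel's
nodemask_t, and the "generic returns node 0" and "odd buses on node 1"
behaviours are assumptions made for illustration, not what
__ia64_get_pcibus_nodemask or sn_get_pcibus_nodemask actually do.

```c
/* Hypothetical stand-in for nodemask_t: bit N set means node N. */
typedef unsigned long sketch_nodemask_t;

/* Function type for the hook, mirroring the ia64_mv_*_t typedef style. */
typedef sketch_nodemask_t sketch_get_pcibus_nodemask_t(int bus);

/* A cut-down machine vector holding only the new entry. */
struct sketch_machvec {
	const char *name;
	sketch_get_pcibus_nodemask_t *get_pcibus_nodemask;
};

/* Assumed generic fallback: report every bus as local to node 0. */
static sketch_nodemask_t sketch_generic_get_pcibus_nodemask(int bus)
{
	(void)bus;
	return 1UL << 0;
}

/* Assumed platform override: even buses on node 0, odd buses on node 1. */
static sketch_nodemask_t sketch_numa_get_pcibus_nodemask(int bus)
{
	return 1UL << (bus & 1);
}

static const struct sketch_machvec sketch_generic_mv = {
	"generic", sketch_generic_get_pcibus_nodemask
};

static const struct sketch_machvec sketch_numa_mv = {
	"numa-demo", sketch_numa_get_pcibus_nodemask
};

/* The active vector; starts out pointing at the generic one. */
static const struct sketch_machvec *sketch_mv = &sketch_generic_mv;

/* Pick a vector at run time (nonzero selects the NUMA demo platform). */
static void sketch_select_machvec(int numa)
{
	sketch_mv = numa ? &sketch_numa_mv : &sketch_generic_mv;
}

/* What callers use: dispatch through whichever vector is active. */
static sketch_nodemask_t sketch_pcibus_nodemask(int bus)
{
	return sketch_mv->get_pcibus_nodemask(bus);
}
```

Callers only ever go through sketch_pcibus_nodemask(); after
sketch_select_machvec(1) the same call is routed to the platform
version, which is the property the machvec entry is there to provide.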


* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-30 15:36               ` Jesse Barnes
@ 2004-07-30 22:17                 ` Matthew Dobson
  2004-07-30 22:21                   ` Jesse Barnes
  0 siblings, 1 reply; 23+ messages in thread
From: Matthew Dobson @ 2004-07-30 22:17 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Christoph Hellwig, Jesse Barnes, Andi Kleen, LKML,
	Martin J. Bligh, LSE Tech

On Fri, 2004-07-30 at 08:36, Jesse Barnes wrote:
> I think this will work.  My tree didn't have nodemask_t though, so it didn't 
> compile :)  Here's a first stab at an ia64 portion of the patch.
> 
> Jesse

Andrew picked it up in 2.6.8-rc2-mm1, so if you base your patch against
that it should compile...  That's what I based my patch off.  Our lab
has been down for a few days so I hope to do some testing on Monday for
my patches.  If all goes well, I'll add your code into my patch and
submit it early next week, ok?

Thanks!

-Matt



* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-30 22:17                 ` Matthew Dobson
@ 2004-07-30 22:21                   ` Jesse Barnes
  2004-07-30 22:33                     ` Matthew Dobson
  0 siblings, 1 reply; 23+ messages in thread
From: Jesse Barnes @ 2004-07-30 22:21 UTC (permalink / raw)
  To: colpatch
  Cc: Christoph Hellwig, Jesse Barnes, Andi Kleen, LKML,
	Martin J. Bligh, LSE Tech

On Friday, July 30, 2004 3:17 pm, Matthew Dobson wrote:
> On Fri, 2004-07-30 at 08:36, Jesse Barnes wrote:
> > I think this will work.  My tree didn't have nodemask_t though, so it
> > didn't compile :)  Here's a first stab at an ia64 portion of the patch.
> >
> > Jesse
>
> Andrew picked it up in 2.6.8-rc2-mm1, so if you base your patch against
> that it should compile...  That's what I based my patch off.  Our lab
> has been down for a few days so I hope to do some testing on Monday for
> my patches.  If all goes well, I'll add your code into my patch and
> submit it early next week, ok?

Sounds good, but it will probably need some fixes before it works correctly 
(my stuff I mean), so when you have something that looks good give me a few 
minutes with it before you send it on to Andrew if you would.

Thanks,
Jesse


* Re: [Lse-tech] [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node()
  2004-07-30 22:21                   ` Jesse Barnes
@ 2004-07-30 22:33                     ` Matthew Dobson
  0 siblings, 0 replies; 23+ messages in thread
From: Matthew Dobson @ 2004-07-30 22:33 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Christoph Hellwig, Jesse Barnes, Andi Kleen, LKML,
	Martin J. Bligh, LSE Tech

On Fri, 2004-07-30 at 15:21, Jesse Barnes wrote:
> On Friday, July 30, 2004 3:17 pm, Matthew Dobson wrote:
> > On Fri, 2004-07-30 at 08:36, Jesse Barnes wrote:
> > > I think this will work.  My tree didn't have nodemask_t though, so it
> > > didn't compile :)  Here's a first stab at an ia64 portion of the patch.
> > >
> > > Jesse
> >
> > Andrew picked it up in 2.6.8-rc2-mm1, so if you base your patch against
> > that it should compile...  That's what I based my patch off.  Our lab
> > has been down for a few days so I hope to do some testing on Monday for
> > my patches.  If all goes well, I'll add your code into my patch and
> > submit it early next week, ok?
> 
> Sounds good, but it will probably need some fixes before it works correctly 
> (my stuff I mean), so when you have something that looks good give me a few 
> minutes with it before you send it on to Andrew if you would.
> 
> Thanks,
> Jesse

No problem, Jesse.  Like I said, with lab machines being decidedly
unfriendly, I won't even be able to run any real tests on my code until
Monday at the earliest.  I'll certainly give you at least 5 minutes'
warning before I post any untested, potentially dangerous code! ;)

-Matt



Thread overview: 23+ messages
2004-07-27  0:10 [RFC][PATCH] Change pcibus_to_cpumask() to pcibus_to_node() Matthew Dobson
2004-07-27  3:38 ` Jesse Barnes
2004-07-27  9:51 ` [Lse-tech] " Christoph Hellwig
2004-07-27 15:22   ` Jesse Barnes
2004-07-27 18:32     ` Matthew Dobson
2004-07-27 18:40       ` Jesse Barnes
2004-07-29  0:06         ` Matthew Dobson
2004-07-29 15:43           ` Jesse Barnes
2004-07-29 22:23             ` Matthew Dobson
2004-07-30 15:36               ` Jesse Barnes
2004-07-30 22:17                 ` Matthew Dobson
2004-07-30 22:21                   ` Jesse Barnes
2004-07-30 22:33                     ` Matthew Dobson
2004-07-29 17:02           ` Rajesh Shah
2004-07-29 22:27             ` Matthew Dobson
2004-07-30  0:02               ` Rajesh Shah
2004-07-28 15:01       ` Martin J. Bligh
2004-07-28 19:10         ` Matthew Dobson
2004-07-27 14:16 ` Andi Kleen
2004-07-27 15:15   ` Jesse Barnes
2004-07-27 15:57     ` Andi Kleen
2004-07-27 18:18       ` Matthew Dobson
2004-07-29  8:34         ` Paul Jackson
