LinuxPPC-Dev Archive on lore.kernel.org
* Re: root file system mounted via NFS - retry?
From: Randy Smith @ 2006-04-12 14:01 UTC (permalink / raw)
  Cc: linuxppc-embedded
In-Reply-To: <200604120750.31421.bcook@bpointsys.com>



Brent Cook wrote:
> On Wednesday 12 April 2006 07:35, Randy Smith wrote:
>   
>> Hello,
>>
>> I have a general question regarding having a root file system mounted
>> via NFS.  Our system consists of a linux PC that acts as the NFS server
>> and two embedded ppc boards running linux that mount their file systems
>> via NFS.  The problem is that the embedded boards boot much faster than
>> the PC, and when they attempt to mount the root file system, the NFS
>> server is not up yet.  The embedded boards hang and it takes quite a
>> long time before the watchdog timer reboots them.  The next time round,
>> they come up just fine.
>> What I would like for them to do is keep trying until the NFS server
>> appears and then continue to boot.
>>
>>     
>
> Sorry to try the obvious, but could you make the firmware's auto-boot timeout 
> longer? I know that u-boot and several others support this.
>   
We are using U-boot and yes, I can set the delay.  This does work around 
the problem, but I was hoping for a more 'elegant' solution.
> Also, instead of mounting root NFS directly from the kernel, you could use an 
> initrd or an embedded initramfs that does the mount from a script. Then you 
> could retry as much as you wanted without having to teach the kernel 
> anything.
>   
Another possibility.  Thanks!
> Or, just hook up some GPIOs from your PC to the reset lines on those PPC's. 
> Then reset them once your PC is up.
>   
This is not possible with our current design, but it would solve the 
problem.
> Or rather than running the PPC kernel from flash, tftp it to RAM from your 
> PC - that way your PPCs can't even load the kernel until the PC is up 
> (assuming your firmware knows how to do this and handles faults gracefully.)
>   
The way our system was initially designed, it worked this way.  We are 
modifying the design to remove the dependency on NFS by embedding the 
boot loader, kernel and root file system on the flash.  We aren't quite 
there yet so in the meantime we moved the kernel to the flash and now we 
are seeing this problem.

And in fact, the problem may not be as I described.  I have been running 
some other tests and I am getting indications that the problem may be in 
the hardware that our ppc boards are talking to.  I stripped a ppc board 
down to just the computer part and hooked up a serial line to it.  
Cold-boot and the stripped down ppc reached the mount nfs part and 
waited for a little while, and then continued on, so I am discounting 
the theory that the NFS mount was failing due to timeout.  The other ppc 
board hung with the same behavior, but I cannot put a serial line on it.

The question remains, though: do the NFS timeo and retrans parameters 
on the nfsroot kernel command line apply to the 2.4.25 kernel supplied 
by DENX's ELDK?

Thanks,

-Randy
>  - Brent
>
>   


* problem booting custom board
From: Joachim Denil @ 2006-04-12 13:25 UTC (permalink / raw)
  To: linuxppc-embedded

Dear list,
 
When booting Linux (2.6.12) on a custom board with a ppc405gp processor, I
run into some problems. The board has 64 MB of SDRAM at 0x00000000. The
board is booted with a stripped version of ppc-boot (I added a
board_information structure with the right values and placed its address in
gpr3), which is located in SRAM and then jumps to 0x0000, where I placed my
raw binary image.
 
I made a platform file where I call ppc4xx_init, ppc4xx_setup_arch,
ppc4xx_map_io, and a function ppc405_map_irq, but with an empty table (no
PCI devices). I set the link/load address to 0x0 (where I placed my image)
and didn't touch the virtual address of the kernel base. As kernel arguments I
gave console=ttyS0,9600, and my rootfs is a busybox in an initramfs.
 
Still, I don't get anything on the UART (I checked both, just in case), so
there is definitely something wrong. I only have one day a week to work
on this, and I have access to a BDI2000 to debug this Friday, so that should
give me some insight into where it all goes wrong.
 
But can someone tell me what I'm doing wrong? (or right just to give me some
hope after all ;) )
 
Thank you,
 
Joachim Denil
Student @ IWT-KDG Antwerpen Belgium


* Re: root file system mounted via NFS - retry?
From: Brent Cook @ 2006-04-12 12:50 UTC (permalink / raw)
  To: linuxppc-embedded
In-Reply-To: <443CF3F8.408@imagemap.com>

On Wednesday 12 April 2006 07:35, Randy Smith wrote:
> Hello,
>
> I have a general question regarding having a root file system mounted
> via NFS.  Our system consists of a linux PC that acts as the NFS server
> and two embedded ppc boards running linux that mount their file systems
> via NFS.  The problem is that the embedded boards boot much faster than
> the PC, and when they attempt to mount the root file system, the NFS
> server is not up yet.  The embedded boards hang and it takes quite a
> long time before the watchdog timer reboots them.  The next time round,
> they come up just fine.
> What I would like for them to do is keep trying until the NFS server
> appears and then continue to boot.
>

Sorry to try the obvious, but could you make the firmware's auto-boot timeout 
longer? I know that u-boot and several others support this.

Also, instead of mounting root NFS directly from the kernel, you could use an 
initrd or an embedded initramfs that does the mount from a script. Then you 
could retry as much as you wanted without having to teach the kernel 
anything.
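
A minimal sketch of the retry logic such an initramfs init might use, written
in C rather than as a script. Everything named here is hypothetical:
mount_with_retry(), the two callback types, and the fake_mount() stand-in are
illustrations only, and the actual NFS mount attempt is deliberately left
behind a callback because how it is performed depends on the userspace tools
in the initramfs.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true once the root file system is mounted. */
typedef bool (*mount_attempt_fn)(void *ctx);
/* Hypothetical callback: waits between attempts (for example, wraps sleep()). */
typedef void (*delay_fn)(unsigned int seconds);

/*
 * Call try_mount until it succeeds. max_attempts == 0 means retry forever.
 * Returns the number of attempts made on success, or 0 if the limit was hit.
 */
unsigned int mount_with_retry(mount_attempt_fn try_mount, void *ctx,
                              delay_fn wait, unsigned int delay_seconds,
                              unsigned int max_attempts)
{
    unsigned int attempts = 0;

    for (;;) {
        if (attempts < UINT_MAX)   /* saturate instead of wrapping */
            attempts++;
        if (try_mount(ctx))
            return attempts;
        if (max_attempts != 0 && attempts >= max_attempts)
            return 0;
        if (wait != NULL)
            wait(delay_seconds);
    }
}

/* Stand-in for a real mount attempt: "server" comes up on the Nth call. */
struct fake_server {
    unsigned int calls;
    unsigned int up_after;
};

bool fake_mount(void *ctx)
{
    struct fake_server *s = ctx;

    s->calls++;
    return s->calls >= s->up_after;
}
```

With max_attempts set to 0 the board simply keeps trying until the server
appears, which is the behaviour asked for in the original question; a nonzero
limit lets the caller fall back to something else, such as letting the
watchdog reboot the board.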

Or, just hook up some GPIOs from your PC to the reset lines on those PPC's. 
Then reset them once your PC is up.

Or rather than running the PPC kernel from flash, tftp it to RAM from your 
PC - that way your PPCs can't even load the kernel until the PC is up 
(assuming your firmware knows how to do this and handles faults gracefully.)

 - Brent


* Re: root file system mounted via NFS - retry?
From: Jarno Manninen @ 2006-04-12 12:40 UTC (permalink / raw)
  To: linuxppc-embedded
In-Reply-To: <443CF3F8.408@imagemap.com>

On Wednesday 12 April 2006 15:35, Randy Smith wrote:

Hi,

If your system configuration is static, why not just sleep in the bootloader 
for a while? U-Boot has a sleep command.

- Jarno

> Hello,
>
> I have a general question regarding having a root file system mounted
> via NFS.  Our system consists of a linux PC that acts as the NFS server
> and two embedded ppc boards running linux that mount their file systems
> via NFS.  The problem is that the embedded boards boot much faster than
> the PC, and when they attempt to mount the root file system, the NFS
> server is not up yet.  The embedded boards hang and it takes quite a
> long time before the watchdog timer reboots them.  The next time round,
> they come up just fine.
> What I would like for them to do is keep trying until the NFS server
> appears and then continue to boot.
>
> We are running DENX's ELDK 2.4.25 kernel on the ppc boards and I know
> that for some linux releases, one may add values for timeo and retrans
> on the boot command line in the NFSROOT= command.  Can anyone tell me if
> they are valid for this release of the kernel?  Or if there is another
> way to accomplish this?
>
> Thanks,
>
> -Randy Smith
> Software Engineer
> ImageMap, Inc.
>
> _______________________________________________
> Linuxppc-embedded mailing list
> Linuxppc-embedded@ozlabs.org
> https://ozlabs.org/mailman/listinfo/linuxppc-embedded


* root file system mounted via NFS - retry?
From: Randy Smith @ 2006-04-12 12:35 UTC (permalink / raw)
  To: linuxppc-embedded

Hello,

I have a general question regarding having a root file system mounted 
via NFS.  Our system consists of a linux PC that acts as the NFS server 
and two embedded ppc boards running linux that mount their file systems 
via NFS.  The problem is that the embedded boards boot much faster than 
the PC, and when they attempt to mount the root file system, the NFS 
server is not up yet.  The embedded boards hang and it takes quite a 
long time before the watchdog timer reboots them.  The next time round, 
they come up just fine.
What I would like for them to do is keep trying until the NFS server 
appears and then continue to boot.

We are running DENX's ELDK 2.4.25 kernel on the ppc boards and I know 
that for some linux releases, one may add values for timeo and retrans 
on the boot command line in the NFSROOT= command.  Can anyone tell me if 
they are valid for this release of the kernel?  Or if there is another 
way to accomplish this?

Thanks,

-Randy Smith
Software Engineer
ImageMap, Inc.


* Re: BFD 2.16.1 assertion fail
From: jschopp @ 2006-04-12 11:28 UTC (permalink / raw)
  To: Sai prasanna; +Cc: linuxppc-dev
In-Reply-To: <20060412073814.39382.qmail@web54313.mail.yahoo.com>

If you just want a working cross compiler these tend to be pretty good:
http://developer.osdl.org/dev/plm/cross_compile/

Or, if you insist on building your own, crosstool is a very good idea:
http://www.kegel.com/crosstool/

-Joel

Sai prasanna wrote:
> 
>     Date: Wed, 12 Apr 2006 00:04:59 -0700 (PDT)
>     From: Sai prasanna <saiprasannasp@yahoo.com>
>     Subject: BFD 2.16.1 assertion fail
>     To: linux project group <linux_projects_2006@yahoogroups.com>
>     CC: saiprasannasp@yahoo.com
> 
>     Hi All,
> 
>     I have been trying to build a cross compiler for powerpc-ppc-linux-gnu.
>     I have successfully installed binutils-2.16.1, first-phase gcc-4.0.3, and
>     glibc-2.3.6, but when I finally tried to install the final gcc
>     I got errors such as
> 
>     /home/foo/build-install/powerpc-ppc-linux-gnu/bin/ld: BFD 2.16.1
>     assertion fail ../../binutils-2.16.1/bfd/elf32-ppc.c:5397
>     collect2: ld returned 1 exit status
>     make[3]: *** [libmudflap.la] Error 1
>     make[3]: Leaving directory
>     `/home/foo/build-gcc/powerpc-ppc-linux-gnu/libmudflap'
>     make[2]: *** [all-recursive] Error 1
>     make[2]: Leaving directory
>     `/home/foo/build-gcc/powerpc-ppc-linux-gnu/libmudflap'
>     make[1]: *** [all] Error 2
>     make[1]: Leaving directory
>     `/home/foo/build-gcc/powerpc-ppc-linux-gnu/libmudflap'
>     make: *** [all-target-libmudflap] Error 2
> 
>     I have used the following gcc configuration options
> 
>     [root@localhost build-gcc]# ../gcc-4.0.3/configure
>     --host=i686-pc-linux-gnu --target=powerpc-ppc-linux-gnu
>     --prefix=/home/foo/build-install --enable-languages=c,c++
> 
>     Someone please help me with this.
> 
>     Regards,
>     Sai     
> 
> 
>     Sai Prasanna.S,
>     Department Of ComputerScience,
>     University Of Madras,
>     Chennai.
> 
>     Ph: 9940357430
>     Email: saiprasannasp@yahoo.com
> 
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@ozlabs.org
> https://ozlabs.org/mailman/listinfo/linuxppc-dev


* Re: [PATCH 0/6] [RFC] Sizing zones and holes in an architecture independent manner
From: Mel Gorman @ 2006-04-12 10:59 UTC (permalink / raw)
  To: Bob Picco; +Cc: linuxppc-dev, ak, Luck, Tony, Linux Kernel Mailing List, davej
In-Reply-To: <20060412013824.GF23742@localhost>

On Tue, 11 Apr 2006, Bob Picco wrote:

> Mel Gorman wrote:	[Tue Apr 11 2006, 08:02:10PM EDT]
>> On Tue, 11 Apr 2006, Bob Picco wrote:
>>
>>> luck wrote:	[Tue Apr 11 2006, 06:20:29PM EDT]
>>>> On Tue, Apr 11, 2006 at 11:39:46AM +0100, Mel Gorman wrote:
>>>>
>>>>> The patches have only been *compile tested* for ia64 with a flatmem
>>>>> configuration. An attempt was made to boot test on an ancient RS/6000
>>>>> but the vanilla kernel does not boot so I have to investigate there.
>>>>
>>>> The good news: Compilation is clean on the ia64 config variants that
>>>> I usually build (all 10 of them).
>>>>
>>>> The bad (or at least consistent) news: It doesn't boot on an Intel
>>>> Tiger either (oops at kmem_cache_alloc+0x41).
>>>>
>>>> -Tony
>>> I had a reply queued to report the same failure with
>>> DISCONTIG+NUMA+VIRTUAL_MEM_MAP.  This was a 2 CPU HP rx2600. I'll take a
>>> closer look at the code tomorrow.
>>>
>>
>> hmm, ok, so discontig.c is in use which narrows things down. When
>> build_node_maps() is called, I assumed that the start and end pfn passed
>> in was for a valid page range. Was this a valid assumption? When I re-read
> The addresses are a valid physical range. The caution should be that
> filter_rsvd_memory converts the addresses from identity mapped to
> physical. efi_memmap_walk calls back to function with identity mapped
> addresses. What you've done seems okay.

It would have been ok if I had spotted that physical addresses were being 
passed into count_node_pages(). add_active_range() expects pfns, so a 
>> PAGE_SHIFT was missing there.

> BTW - I like what you are attempting to achieve.

Thanks

>> the comment, it implies that memory holes could be within this range which
>> would cause boot failures. If that is the case, the correct thing to do
>> was to call add_active_range() in count_node_pages() instead of
>> build_node_maps().
> Yes that helps because of granules and it boots.  The patch below is applied
> on top of your original post. But..
>
> <Patch Snipped>
>
> Page free/avail accounting is off and I'm done for tonight. I believe it's how
> you treat holes but haven't looked closely yet.
>

Thanks for trying it out so late in the evening. The accounting is off 
because I was passing in physical addresses instead of pfns. The fact it 
booted at all means we probably registered the memory near address 0 by 
accident and it would eventually oops.

>
> Let me wrap my head around this code again. It's been some time.

This is the same patch I posted to Tony, which hopefully fixes the problems on 
flatmem. The important changes for your discontig machine are:

o Registering in count_node_pages() as your patch fixed
o Converting the physical address passed to count_node_pages() to a PFN
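
A minimal sketch of the two points above, as a standalone C program fragment
rather than kernel code. It registers each range separately after converting
byte addresses to page frame numbers, then derives the hole count as spanned
pages minus present pages, the same relation the old zholes_size code used.
The names (phys_range, node_summary, summarize_ranges, hole_pages) are
hypothetical, the 4 KiB page size is an assumption chosen only for
illustration, and the ranges are assumed not to overlap.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages, illustration only */

/* Physical range in bytes; end is exclusive. */
struct phys_range {
    uint64_t start;
    uint64_t end;
};

struct node_summary {
    uint64_t min_pfn;       /* lowest pfn seen */
    uint64_t max_pfn;       /* one past the highest pfn seen */
    uint64_t present_pages; /* pages actually backed by memory */
};

/* Convert each byte range to pfns and accumulate per-node totals. */
struct node_summary summarize_ranges(const struct phys_range *r, size_t n)
{
    struct node_summary s = { UINT64_MAX, 0, 0 };

    for (size_t i = 0; i < n; i++) {
        uint64_t start_pfn = r[i].start >> SKETCH_PAGE_SHIFT;
        uint64_t end_pfn = r[i].end >> SKETCH_PAGE_SHIFT;

        if (start_pfn < s.min_pfn)
            s.min_pfn = start_pfn;
        if (end_pfn > s.max_pfn)
            s.max_pfn = end_pfn;
        s.present_pages += end_pfn - start_pfn;
    }
    return s;
}

/* Holes are whatever the span covers that no range registered. */
uint64_t hole_pages(const struct node_summary *s)
{
    if (s->max_pfn <= s->min_pfn)
        return 0; /* no ranges registered */
    return (s->max_pfn - s->min_pfn) - s->present_pages;
}
```

For an illustrative layout with memory at 0-2G and 6G-8G, this yields a span
of 2097152 pages, 1048576 present pages and 1048576 hole pages. Passing the
byte addresses through without the shift would inflate every figure by a
factor of 4096, which is the kind of accounting error described above.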

Can you try it out when you're next looking at this? Thanks

diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/Kconfig linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/Kconfig
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/Kconfig	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/Kconfig	2006-04-11 23:31:38.000000000 +0100
@@ -352,6 +352,9 @@ config NUMA
  	  Access).  This option is for configuring high-end multiprocessor
  	  server systems.  If in doubt, say N.

+config ARCH_POPULATES_NODE_MAP
+	def_bool y
+
  # VIRTUAL_MEM_MAP and FLAT_NODE_MEM_MAP are functionally equivalent.
  # VIRTUAL_MEM_MAP has been retained for historical reasons.
  config VIRTUAL_MEM_MAP
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/contig.c linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/contig.c
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/contig.c	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/contig.c	2006-04-11 23:56:45.000000000 +0100
@@ -26,10 +26,6 @@
  #include <asm/sections.h>
  #include <asm/mca.h>

-#ifdef CONFIG_VIRTUAL_MEM_MAP
-static unsigned long num_dma_physpages;
-#endif
-
  /**
   * show_mem - display a memory statistics summary
   *
@@ -212,18 +208,6 @@ count_pages (u64 start, u64 end, void *a
  	return 0;
  }

-#ifdef CONFIG_VIRTUAL_MEM_MAP
-static int
-count_dma_pages (u64 start, u64 end, void *arg)
-{
-	unsigned long *count = arg;
-
-	if (start < MAX_DMA_ADDRESS)
-		*count += (min(end, MAX_DMA_ADDRESS) - start) >> PAGE_SHIFT;
-	return 0;
-}
-#endif
-
  /*
   * Set up the page tables.
   */
@@ -232,47 +216,24 @@ void __init
  paging_init (void)
  {
  	unsigned long max_dma;
-	unsigned long zones_size[MAX_NR_ZONES];
  #ifdef CONFIG_VIRTUAL_MEM_MAP
-	unsigned long zholes_size[MAX_NR_ZONES];
+	unsigned long nid = 0;
  	unsigned long max_gap;
  #endif

-	/* initialize mem_map[] */
-
-	memset(zones_size, 0, sizeof(zones_size));
-
  	num_physpages = 0;
  	efi_memmap_walk(count_pages, &num_physpages);

  	max_dma = virt_to_phys((void *) MAX_DMA_ADDRESS) >> PAGE_SHIFT;

  #ifdef CONFIG_VIRTUAL_MEM_MAP
-	memset(zholes_size, 0, sizeof(zholes_size));
-
-	num_dma_physpages = 0;
-	efi_memmap_walk(count_dma_pages, &num_dma_physpages);
-
-	if (max_low_pfn < max_dma) {
-		zones_size[ZONE_DMA] = max_low_pfn;
-		zholes_size[ZONE_DMA] = max_low_pfn - num_dma_physpages;
-	} else {
-		zones_size[ZONE_DMA] = max_dma;
-		zholes_size[ZONE_DMA] = max_dma - num_dma_physpages;
-		if (num_physpages > num_dma_physpages) {
-			zones_size[ZONE_NORMAL] = max_low_pfn - max_dma;
-			zholes_size[ZONE_NORMAL] =
-				((max_low_pfn - max_dma) -
-				 (num_physpages - num_dma_physpages));
-		}
-	}
-
  	max_gap = 0;
+	efi_memmap_walk(register_active_ranges, &nid);
  	efi_memmap_walk(find_largest_hole, (u64 *)&max_gap);
  	if (max_gap < LARGE_GAP) {
  		vmem_map = (struct page *) 0;
-		free_area_init_node(0, NODE_DATA(0), zones_size, 0,
-				    zholes_size);
+		free_area_init_nodes(max_dma, max_dma,
+				max_low_pfn, max_low_pfn);
  	} else {
  		unsigned long map_size;

@@ -284,19 +245,14 @@ paging_init (void)
  		efi_memmap_walk(create_mem_map_page_table, NULL);

  		NODE_DATA(0)->node_mem_map = vmem_map;
-		free_area_init_node(0, NODE_DATA(0), zones_size,
-				    0, zholes_size);
+		free_area_init_nodes(max_dma, max_dma,
+				max_low_pfn, max_low_pfn);

  		printk("Virtual mem_map starts at 0x%p\n", mem_map);
  	}
  #else /* !CONFIG_VIRTUAL_MEM_MAP */
-	if (max_low_pfn < max_dma)
-		zones_size[ZONE_DMA] = max_low_pfn;
-	else {
-		zones_size[ZONE_DMA] = max_dma;
-		zones_size[ZONE_NORMAL] = max_low_pfn - max_dma;
-	}
-	free_area_init(zones_size);
+	add_active_range(0, 0, max_low_pfn);
+	free_area_init_nodes(max_dma, max_dma, max_low_pfn, max_low_pfn);
  #endif /* !CONFIG_VIRTUAL_MEM_MAP */
  	zero_page_memmap_ptr = virt_to_page(ia64_imva(empty_zero_page));
  }
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/discontig.c linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/discontig.c
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/discontig.c	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/discontig.c	2006-04-12 11:27:55.000000000 +0100
@@ -647,6 +647,7 @@ static __init int count_node_pages(unsig
  				     end >> PAGE_SHIFT);
  	mem_data[node].min_pfn = min(mem_data[node].min_pfn,
  				     start >> PAGE_SHIFT);
+	add_active_range(node, start >> PAGE_SHIFT, end >> PAGE_SHIFT);

  	return 0;
  }
@@ -660,9 +661,8 @@ static __init int count_node_pages(unsig
  void __init paging_init(void)
  {
  	unsigned long max_dma;
-	unsigned long zones_size[MAX_NR_ZONES];
-	unsigned long zholes_size[MAX_NR_ZONES];
  	unsigned long pfn_offset = 0;
+	unsigned long max_pfn = 0;
  	int node;

  	max_dma = virt_to_phys((void *) MAX_DMA_ADDRESS) >> PAGE_SHIFT;
@@ -679,46 +679,17 @@ void __init paging_init(void)
  #endif

  	for_each_online_node(node) {
-		memset(zones_size, 0, sizeof(zones_size));
-		memset(zholes_size, 0, sizeof(zholes_size));
-
  		num_physpages += mem_data[node].num_physpages;
-
-		if (mem_data[node].min_pfn >= max_dma) {
-			/* All of this node's memory is above ZONE_DMA */
-			zones_size[ZONE_NORMAL] = mem_data[node].max_pfn -
-				mem_data[node].min_pfn;
-			zholes_size[ZONE_NORMAL] = mem_data[node].max_pfn -
-				mem_data[node].min_pfn -
-				mem_data[node].num_physpages;
-		} else if (mem_data[node].max_pfn < max_dma) {
-			/* All of this node's memory is in ZONE_DMA */
-			zones_size[ZONE_DMA] = mem_data[node].max_pfn -
-				mem_data[node].min_pfn;
-			zholes_size[ZONE_DMA] = mem_data[node].max_pfn -
-				mem_data[node].min_pfn -
-				mem_data[node].num_dma_physpages;
-		} else {
-			/* This node has memory in both zones */
-			zones_size[ZONE_DMA] = max_dma -
-				mem_data[node].min_pfn;
-			zholes_size[ZONE_DMA] = zones_size[ZONE_DMA] -
-				mem_data[node].num_dma_physpages;
-			zones_size[ZONE_NORMAL] = mem_data[node].max_pfn -
-				max_dma;
-			zholes_size[ZONE_NORMAL] = zones_size[ZONE_NORMAL] -
-				(mem_data[node].num_physpages -
-				 mem_data[node].num_dma_physpages);
-		}
-
  		pfn_offset = mem_data[node].min_pfn;

  #ifdef CONFIG_VIRTUAL_MEM_MAP
  		NODE_DATA(node)->node_mem_map = vmem_map + pfn_offset;
  #endif
-		free_area_init_node(node, NODE_DATA(node), zones_size,
-				    pfn_offset, zholes_size);
+		if (mem_data[node].max_pfn > max_pfn)
+			max_pfn = mem_data[node].max_pfn;
  	}

+	free_area_init_nodes(max_dma, max_dma, max_pfn, max_pfn);
+
  	zero_page_memmap_ptr = virt_to_page(ia64_imva(empty_zero_page));
  }
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/init.c linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/init.c
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/init.c	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/init.c	2006-04-12 11:07:10.000000000 +0100
@@ -539,6 +539,18 @@ find_largest_hole (u64 start, u64 end, v
  	last_end = end;
  	return 0;
  }
+
+int __init
+register_active_ranges(u64 start, u64 end, void *nid)
+{
+	BUG_ON(nid == NULL);
+	BUG_ON(*(unsigned long *)nid >= MAX_NUMNODES);
+
+	add_active_range(*(unsigned long *)nid,
+				__pa(start) >> PAGE_SHIFT,
+				__pa(end) >> PAGE_SHIFT);
+	return 0;
+}
  #endif /* CONFIG_VIRTUAL_MEM_MAP */

  static int __init
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/include/asm-ia64/meminit.h linux-2.6.17-rc1-105-ia64_use_init_nodes/include/asm-ia64/meminit.h
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/include/asm-ia64/meminit.h	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/include/asm-ia64/meminit.h	2006-04-11 23:34:58.000000000 +0100
@@ -56,6 +56,7 @@ extern void efi_memmap_init(unsigned lon
    extern unsigned long vmalloc_end;
    extern struct page *vmem_map;
    extern int find_largest_hole (u64 start, u64 end, void *arg);
+  extern int register_active_ranges (u64 start, u64 end, void *arg);
    extern int create_mem_map_page_table (u64 start, u64 end, void *arg);
  #endif


* Re: [PATCH 0/6] [RFC] Sizing zones and holes in an architecture independent manner
From: Mel Gorman @ 2006-04-12 10:50 UTC (permalink / raw)
  To: Luck, Tony; +Cc: linuxppc-dev, ak, Linux Kernel Mailing List, davej
In-Reply-To: <20060412000500.GA8532@agluck-lia64.sc.intel.com>

On Tue, 11 Apr 2006, Luck, Tony wrote:

> On Wed, Apr 12, 2006 at 12:23:45AM +0100, Mel Gorman wrote:
>> Darn.
>>
>> o Did it boot on other IA64 machines or was the Tiger the first boot failure?
>
> I only tried to boot on the Tiger.
>

ok, based on your console log, I'm pretty sure it would have broken on 
almost any IA64.

>> o Possibly a stupid question but does the Tiger configuration use the
>>    flatmem memory model, sparsemem or discontig?
>
> I built using arch/ia64/configs/tiger_defconfig - a FLATMEM config with
> VIRT_MEM_MAP=y.  The machine has 4G of memory, 2G at 0-2G, and 2G at 6G-8G
> (so it is somewhat sparse ... but this is pretty normal for an ia64 with
> more than 2G).
>

That's useful to know. It means I know what pfn ranges I expect to see 
being passed to add_active_range().

>> If it's flatmem, I noticed I made a stupid mistake where vmem_map is not
>> getting set to (void *)0 for machines with small memory holes. Nothing
>> else really obvious jumped out at me.
>>
>> I've attached a patch called "105-ia64_use_init_nodes.patch". Can you
>> reverse Patch 5/6 and apply this one instead please? I've also attached
>> 107-debug.diff that applies on top of patch 6/6. It just prints out
>> debugging information during startup that may tell me where I went wrong
>> in arch/ia64. I'd really appreciate it if you could use both patches, let
>> me know if it still fails to boot and send me the console log of the
>> machine starting up if it fails so I can make guesses as to what is going
>> wrong.
>>
>> Thanks a lot for trying the patches out on ia64. It was the one arch of
>> the set I had no chance to test with at all :/
>
> Ok, I cloned a branch from patch4, applied the new patch 5, git-cherry-picked
> patch 6, and then applied the debug patch7.
>
> Here's the console log:
>
> <snip snip>
> add_active_range(0, 16140901064512634880, 16140901066637049856): New
> add_active_range(0, 16140901066641899520, 16140901066642489344): New
> add_active_range(0, 16140901070938308608, 16140901073083760640): New
> add_active_range(0, 16140901073084219392, 16140901073085480960): New
> <snip snip>

Good man Mel! The register_active_ranges() callback is getting 
*virtual addresses*, not PFNs (which is brutally obvious now!). For 
discontig, there is a similar story. count_node_pages() is getting a 
*physical address*, not a pfn (the parameter is also called start, which is 
a bit confusing, but that is a different problem).

So some thinking out loud to see if you spot problems;

o PAGE_OFFSET seems to be 16140901064495857664 from the header file
o Instead of using add_active_range(node, start, end), assume I had used
   add_active_range(node,
 		(start - PAGE_OFFSET) >> PAGE_SHIFT,
 		(end - PAGE_OFFSET) >> PAGE_SHIFT);

That would have made the console log look something like;

add_active_range(0, 4096, 522752): New
add_active_range(0, 523936, 524080): New
add_active_range(0, 1572864, 2096656): New
add_active_range(0, 2096768, 2097076): New

That seems to register memory about the 0-2G mark and 6-8G with some small 
holes here and there. Sounds like what you expected to happen. In case the 
1:1 virt->phys mapping is not always true on IA64, I decided to use __pa() 
instead of PAGE_OFFSET like;

add_active_range(node, __pa(start) >> PAGE_SHIFT, __pa(end) >> PAGE_SHIFT);

Is this the correct thing to do or is "start - PAGE_OFFSET" safer? 
Optimistically assuming __pa() is ok, the following patch (which replaces 
Patch 5/6 again) should boot (passed compile testing here). If it doesn't, 
can you send the console log again please?
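
The arithmetic above can be checked with a small standalone sketch in C. It
is not kernel code: the macro and function names are hypothetical, the
PAGE_OFFSET value is the one quoted above (0xE000000000000000), and the page
shift of 12 is an assumption inferred from the expected figures in this mail
rather than taken from any kernel configuration.

```c
#include <stdint.h>

/* Value quoted above: 16140901064495857664 == 0xE000000000000000. */
#define SKETCH_PAGE_OFFSET 0xE000000000000000ULL
/* Assumption: a shift of 12 reproduces the pfns listed in this mail. */
#define SKETCH_PAGE_SHIFT  12

/* Strip the identity-mapping offset, then convert bytes to a page frame. */
uint64_t sketch_virt_to_pfn(uint64_t vaddr)
{
    return (vaddr - SKETCH_PAGE_OFFSET) >> SKETCH_PAGE_SHIFT;
}
```

Feeding it the first logged pair, 16140901064512634880 and
16140901066637049856, gives 4096 and 522752, and the third pair gives 1572864
and 2096656, matching the ranges listed above. The sketch only covers the
"start - PAGE_OFFSET" variant; whether __pa() is equivalent on every ia64
configuration is the open question in this mail.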

Thanks again.

diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/Kconfig linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/Kconfig
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/Kconfig	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/Kconfig	2006-04-11 23:31:38.000000000 +0100
@@ -352,6 +352,9 @@ config NUMA
  	  Access).  This option is for configuring high-end multiprocessor
  	  server systems.  If in doubt, say N.

+config ARCH_POPULATES_NODE_MAP
+	def_bool y
+
  # VIRTUAL_MEM_MAP and FLAT_NODE_MEM_MAP are functionally equivalent.
  # VIRTUAL_MEM_MAP has been retained for historical reasons.
  config VIRTUAL_MEM_MAP
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/contig.c linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/contig.c
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/contig.c	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/contig.c	2006-04-11 23:56:45.000000000 +0100
@@ -26,10 +26,6 @@
  #include <asm/sections.h>
  #include <asm/mca.h>

-#ifdef CONFIG_VIRTUAL_MEM_MAP
-static unsigned long num_dma_physpages;
-#endif
-
  /**
   * show_mem - display a memory statistics summary
   *
@@ -212,18 +208,6 @@ count_pages (u64 start, u64 end, void *a
  	return 0;
  }

-#ifdef CONFIG_VIRTUAL_MEM_MAP
-static int
-count_dma_pages (u64 start, u64 end, void *arg)
-{
-	unsigned long *count = arg;
-
-	if (start < MAX_DMA_ADDRESS)
-		*count += (min(end, MAX_DMA_ADDRESS) - start) >> PAGE_SHIFT;
-	return 0;
-}
-#endif
-
  /*
   * Set up the page tables.
   */
@@ -232,47 +216,24 @@ void __init
  paging_init (void)
  {
  	unsigned long max_dma;
-	unsigned long zones_size[MAX_NR_ZONES];
  #ifdef CONFIG_VIRTUAL_MEM_MAP
-	unsigned long zholes_size[MAX_NR_ZONES];
+	unsigned long nid = 0;
  	unsigned long max_gap;
  #endif

-	/* initialize mem_map[] */
-
-	memset(zones_size, 0, sizeof(zones_size));
-
  	num_physpages = 0;
  	efi_memmap_walk(count_pages, &num_physpages);

  	max_dma = virt_to_phys((void *) MAX_DMA_ADDRESS) >> PAGE_SHIFT;

  #ifdef CONFIG_VIRTUAL_MEM_MAP
-	memset(zholes_size, 0, sizeof(zholes_size));
-
-	num_dma_physpages = 0;
-	efi_memmap_walk(count_dma_pages, &num_dma_physpages);
-
-	if (max_low_pfn < max_dma) {
-		zones_size[ZONE_DMA] = max_low_pfn;
-		zholes_size[ZONE_DMA] = max_low_pfn - num_dma_physpages;
-	} else {
-		zones_size[ZONE_DMA] = max_dma;
-		zholes_size[ZONE_DMA] = max_dma - num_dma_physpages;
-		if (num_physpages > num_dma_physpages) {
-			zones_size[ZONE_NORMAL] = max_low_pfn - max_dma;
-			zholes_size[ZONE_NORMAL] =
-				((max_low_pfn - max_dma) -
-				 (num_physpages - num_dma_physpages));
-		}
-	}
-
  	max_gap = 0;
+	efi_memmap_walk(register_active_ranges, &nid);
  	efi_memmap_walk(find_largest_hole, (u64 *)&max_gap);
  	if (max_gap < LARGE_GAP) {
  		vmem_map = (struct page *) 0;
-		free_area_init_node(0, NODE_DATA(0), zones_size, 0,
-				    zholes_size);
+		free_area_init_nodes(max_dma, max_dma,
+				max_low_pfn, max_low_pfn);
  	} else {
  		unsigned long map_size;

@@ -284,19 +245,14 @@ paging_init (void)
  		efi_memmap_walk(create_mem_map_page_table, NULL);

  		NODE_DATA(0)->node_mem_map = vmem_map;
-		free_area_init_node(0, NODE_DATA(0), zones_size,
-				    0, zholes_size);
+		free_area_init_nodes(max_dma, max_dma,
+				max_low_pfn, max_low_pfn);

  		printk("Virtual mem_map starts at 0x%p\n", mem_map);
  	}
  #else /* !CONFIG_VIRTUAL_MEM_MAP */
-	if (max_low_pfn < max_dma)
-		zones_size[ZONE_DMA] = max_low_pfn;
-	else {
-		zones_size[ZONE_DMA] = max_dma;
-		zones_size[ZONE_NORMAL] = max_low_pfn - max_dma;
-	}
-	free_area_init(zones_size);
+	add_active_range(0, 0, max_low_pfn);
+	free_area_init_nodes(max_dma, max_dma, max_low_pfn, max_low_pfn);
  #endif /* !CONFIG_VIRTUAL_MEM_MAP */
  	zero_page_memmap_ptr = virt_to_page(ia64_imva(empty_zero_page));
  }
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/discontig.c linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/discontig.c
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/discontig.c	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/discontig.c	2006-04-12 11:27:55.000000000 +0100
@@ -647,6 +647,7 @@ static __init int count_node_pages(unsig
  				     end >> PAGE_SHIFT);
  	mem_data[node].min_pfn = min(mem_data[node].min_pfn,
  				     start >> PAGE_SHIFT);
+	add_active_range(node, start >> PAGE_SHIFT, end >> PAGE_SHIFT);

  	return 0;
  }
@@ -660,9 +661,8 @@ static __init int count_node_pages(unsig
  void __init paging_init(void)
  {
  	unsigned long max_dma;
-	unsigned long zones_size[MAX_NR_ZONES];
-	unsigned long zholes_size[MAX_NR_ZONES];
  	unsigned long pfn_offset = 0;
+	unsigned long max_pfn = 0;
  	int node;

  	max_dma = virt_to_phys((void *) MAX_DMA_ADDRESS) >> PAGE_SHIFT;
@@ -679,46 +679,17 @@ void __init paging_init(void)
  #endif

  	for_each_online_node(node) {
-		memset(zones_size, 0, sizeof(zones_size));
-		memset(zholes_size, 0, sizeof(zholes_size));
-
  		num_physpages += mem_data[node].num_physpages;
-
-		if (mem_data[node].min_pfn >= max_dma) {
-			/* All of this node's memory is above ZONE_DMA */
-			zones_size[ZONE_NORMAL] = mem_data[node].max_pfn -
-				mem_data[node].min_pfn;
-			zholes_size[ZONE_NORMAL] = mem_data[node].max_pfn -
-				mem_data[node].min_pfn -
-				mem_data[node].num_physpages;
-		} else if (mem_data[node].max_pfn < max_dma) {
-			/* All of this node's memory is in ZONE_DMA */
-			zones_size[ZONE_DMA] = mem_data[node].max_pfn -
-				mem_data[node].min_pfn;
-			zholes_size[ZONE_DMA] = mem_data[node].max_pfn -
-				mem_data[node].min_pfn -
-				mem_data[node].num_dma_physpages;
-		} else {
-			/* This node has memory in both zones */
-			zones_size[ZONE_DMA] = max_dma -
-				mem_data[node].min_pfn;
-			zholes_size[ZONE_DMA] = zones_size[ZONE_DMA] -
-				mem_data[node].num_dma_physpages;
-			zones_size[ZONE_NORMAL] = mem_data[node].max_pfn -
-				max_dma;
-			zholes_size[ZONE_NORMAL] = zones_size[ZONE_NORMAL] -
-				(mem_data[node].num_physpages -
-				 mem_data[node].num_dma_physpages);
-		}
-
  		pfn_offset = mem_data[node].min_pfn;

  #ifdef CONFIG_VIRTUAL_MEM_MAP
  		NODE_DATA(node)->node_mem_map = vmem_map + pfn_offset;
  #endif
-		free_area_init_node(node, NODE_DATA(node), zones_size,
-				    pfn_offset, zholes_size);
+		if (mem_data[node].max_pfn > max_pfn)
+			max_pfn = mem_data[node].max_pfn;
  	}

+	free_area_init_nodes(max_dma, max_dma, max_pfn, max_pfn);
+
  	zero_page_memmap_ptr = virt_to_page(ia64_imva(empty_zero_page));
  }
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/init.c linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/init.c
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/init.c	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/init.c	2006-04-12 11:07:10.000000000 +0100
@@ -539,6 +539,18 @@ find_largest_hole (u64 start, u64 end, v
  	last_end = end;
  	return 0;
  }
+
+int __init
+register_active_ranges(u64 start, u64 end, void *nid)
+{
+	BUG_ON(nid == NULL);
+	BUG_ON(*(unsigned long *)nid >= MAX_NUMNODES);
+
+	add_active_range(*(unsigned long *)nid,
+				__pa(start) >> PAGE_SHIFT,
+				__pa(end) >> PAGE_SHIFT);
+	return 0;
+}
  #endif /* CONFIG_VIRTUAL_MEM_MAP */

  static int __init
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/include/asm-ia64/meminit.h linux-2.6.17-rc1-105-ia64_use_init_nodes/include/asm-ia64/meminit.h
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/include/asm-ia64/meminit.h	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/include/asm-ia64/meminit.h	2006-04-11 23:34:58.000000000 +0100
@@ -56,6 +56,7 @@ extern void efi_memmap_init(unsigned lon
    extern unsigned long vmalloc_end;
    extern struct page *vmem_map;
    extern int find_largest_hole (u64 start, u64 end, void *arg);
+  extern int register_active_ranges (u64 start, u64 end, void *arg);
    extern int create_mem_map_page_table (u64 start, u64 end, void *arg);
  #endif

^ permalink raw reply

* Re: Watchdog on MPC82xx
From: Bastos Fernandez Alexandre @ 2006-04-12 11:44 UTC (permalink / raw)
  To: linuxppc-embedded list
In-Reply-To: <1144835330.443ccd021b40a@webmail.televes.com:443>


> So, can anyone give me some guidelines for this job? I have seen from
> Freescale docs that the MPC83xx watchdog may be the same as the MPC82xx
> one. So, could I use the same approach as used on MPC83xx boards?
>
> http://patchwork.ozlabs.org/linuxppc//patch?id=4118
>

Though Freescale's doc for MPC83xx says:

• Functional and programming compatibility with MPC8260 watchdog timer

further reading shows that there is a "little" difference. MPC83xx can
come out of reset with WDT disabled, while MPC82xx can't. So the approach
for MPC83xx is not possible on those other systems.

So, any other idea? Is the WDT on MPC8248 usable at all?

Thanks,

Alex

^ permalink raw reply

* Re: BFD 2.16.1 assertion fail
From: Alan Modra @ 2006-04-12  8:53 UTC (permalink / raw)
  To: Sai prasanna; +Cc: linuxppc-dev
In-Reply-To: <20060412073814.39382.qmail@web54313.mail.yahoo.com>

On Wed, Apr 12, 2006 at 12:38:14AM -0700, Sai prasanna wrote:
> /home/foo/build-install/powerpc-ppc-linux-gnu/bin/ld: BFD 2.16.1 assertion fail../../binutils-2.16.1/bfd/elf32-ppc.c:5397

You will likely be much better off with a new CVS binutils.  2.16.1 was
released almost a year ago, and a lot of bugs have been fixed since
then.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply

* Watchdog on MPC82xx
From: Bastos Fernandez Alexandre @ 2006-04-12  9:48 UTC (permalink / raw)
  To: linuxppc-embedded list


Hi list,

I am trying to use the on-chip watchdog timer in the MPC8248 to
recover from system hangs.

I have been following the guidelines from the changes I have found
from Compulab.

In u-boot, the watchdog is activated and works fine. The changes in
the kernel are, basically, assigning ppc_md.heartbeat to a function
which resets the watchdog. This is supposed to start resetting the wdt
after ppc_md.setup_arch is invoked, but this is not working, and the
kernel keeps rebooting during startup.

I have tried to catch the point when this happens to force a WD service
sequence before that, but I had no success.
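
For reference, a WD service sequence on this family is a pair of key
writes to the service register. The following is only a minimal sketch,
assuming the 0x556C / 0xAA39 keys and a 16-bit SWSR as described for the
MPC8260-style watchdog (please verify against the MPC8248 reference
manual). The function name and the register pointer parameter are
hypothetical and not taken from the Compulab changes:

```c
#include <stdint.h>

/* Assumed key values for the software watchdog service register. */
#define WDT_KEY1 0x556Cu
#define WDT_KEY2 0xAA39u

/*
 * Hypothetical helper: service the watchdog by writing both keys,
 * in order, to the (memory-mapped) service register passed in.
 */
static void wdt_service(volatile uint16_t *swsr)
{
	*swsr = WDT_KEY1;	/* first key arms the sequence */
	*swsr = WDT_KEY2;	/* second key reloads the counter */
}
```

On real hardware the pointer would refer to the SWSR inside the IMMR
block; passing it in keeps the sketch independent of any board header.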

So, can anyone give me some guidelines for this job? I have seen from
Freescale docs that the MPC83xx watchdog may be the same as the MPC82xx
one. So, could I use the same approach as used on MPC83xx boards?

http://patchwork.ozlabs.org/linuxppc//patch?id=4118


Thanks in advance

Alex

^ permalink raw reply

* BFD 2.16.1 assertion fail
From: Sai prasanna @ 2006-04-12  7:38 UTC (permalink / raw)
  To: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 1673 bytes --]


 Date: Wed, 12 Apr 2006 00:04:59 -0700 (PDT)
From: Sai prasanna <saiprasannasp@yahoo.com>
Subject: BFD 2.16.1 assertion fail
To: linux project group <linux_projects_2006@yahoogroups.com>
CC: saiprasannasp@yahoo.com

 Hi All,

I have been trying to build a cross compiler for powerpc-ppc-linux-gnu. I have successfully installed binutils-2.16.1, the first-phase gcc-4.0.3, and glibc-2.3.6, but when I finally tried to install the final gcc I am getting errors such as

/home/foo/build-install/powerpc-ppc-linux-gnu/bin/ld: BFD 2.16.1 assertion fail../../binutils-2.16.1/bfd/elf32-ppc.c:5397
collect2: ld returned 1 exit status
make[3]: *** [libmudflap.la] Error 1
make[3]: Leaving directory `/home/foo/build-gcc/powerpc-ppc-linux-gnu/libmudflap'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/home/foo/build-gcc/powerpc-ppc-linux-gnu/libmudflap'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/home/foo/build-gcc/powerpc-ppc-linux-gnu/libmudflap'
make: *** [all-target-libmudflap] Error 2

I have used the following gcc configuration options

[root@localhost build-gcc]# ../gcc-4.0.3/configure --host=i686-pc-linux-gnu  --target=powerpc-ppc-linux-gnu --prefix=/home/foo/build-install --enable-languages=c,c++ 

Someone please help me with this.

Regards,
Sai      


Sai Prasanna.S,
Department Of ComputerScience,
University Of Madras,
Chennai.

Ph: 9940357430
Email: saiprasannasp@yahoo.com   


[-- Attachment #2: Type: text/html, Size: 2175 bytes --]

^ permalink raw reply

* Re: [PATCH 1/2] tickless idle cpus: core patch - v2
From: Srivatsa Vaddagiri @ 2006-04-12  4:50 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev
In-Reply-To: <17467.59608.503042.216312@cargo.ozlabs.ibm.com>

On Wed, Apr 12, 2006 at 03:35:20AM +1000, Paul Mackerras wrote:
> It would be nice if we could arrange to call stop_hz_timer from the
> top-level cpu_idle() function rather than having to call it from the
> individual power_save() functions such as power4_idle().  Can you see
> a problem with doing that?

I had considered doing that, but one problem with it is - how do we ensure that 
start_hz_timer will be called before idle thread calls schedule? A problem 
scenario is when the power_save() function returns without taking an interrupt 
(as is possible in pseries_dedicated_idle_sleep?), since start_hz_timer is 
currently called from only an interrupt context.

Now we could contemplate calling start_hz_timer directly from cpu_idle
when power_save() function returns - but how do we get the register
context required as an argument in start_hz_timer()?
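
To make the ordering problem concrete, here is a toy model only. The
toy_* names are stand-ins invented for this sketch, not the functions
from the patch, and it deliberately sidesteps the register-context
question by giving the restart function no arguments. It just shows an
idle pass that restarts the tick itself when power_save() comes back
without an interrupt having done so:

```c
/* Toy state: 1 while the periodic tick is stopped. */
static int toy_tick_stopped;

static void toy_stop_hz_timer(void)  { toy_tick_stopped = 1; }
static void toy_start_hz_timer(void) { toy_tick_stopped = 0; }

/* power_save variant that returns without any interrupt being taken */
static void toy_power_save_no_irq(void) { }

/* power_save variant where an interrupt handler restarts the tick */
static void toy_power_save_with_irq(void) { toy_start_hz_timer(); }

/*
 * One pass of a cpu_idle()-style loop. Returns 1 if the tick is
 * running at the point where schedule() would be called.
 */
static int toy_idle_pass(void (*power_save)(void))
{
	toy_stop_hz_timer();
	power_save();
	if (toy_tick_stopped)		/* no interrupt restarted it */
		toy_start_hz_timer();
	return !toy_tick_stopped;
}
```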

-- 
Regards,
vatsa

^ permalink raw reply

* Check it out
From: jefferxu @ 2006-04-12  6:25 UTC (permalink / raw)
  To: linuxppc-embedded

[-- Attachment #1: Type: text/html, Size: 586 bytes --]

^ permalink raw reply

* Check it out
From: vijesh.vh @ 2006-04-12  3:18 UTC (permalink / raw)
  To: linuxppc-embedded

[-- Attachment #1: Type: text/html, Size: 605 bytes --]

^ permalink raw reply

* Re: [PATCH 1/2] Base pSeries PCIe support
From: Christoph Hellwig @ 2006-04-12  4:06 UTC (permalink / raw)
  To: Jake Moilanen; +Cc: linuxppc-dev, paulus
In-Reply-To: <20060331160545.db3fa210.moilanen@austin.ibm.com>

On Fri, Mar 31, 2006 at 04:05:45PM -0600, Jake Moilanen wrote:
> +source "drivers/pci/pcie/Kconfig"

does it actually work on ppc64?  last time I checked that code was
utterly x86-centric.

> -		if (node->type == NULL || strcmp(node->type, "pci") != 0)
> +
> +		if (node->type == NULL || ((strcmp(node->type, "pci") != 0) && (strcmp(node->type, "pciex") != 0)))
>  			continue;

please don't add superfluous braces
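
As an illustration only, the same test with the inner grouping dropped
could look like the sketch below. The helper name skip_node_type is
hypothetical and not part of the patch; one pair of parentheses is kept
around the && because gcc otherwise suggests it:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 when a node with this type string
 * should be skipped, i.e. it is neither "pci" nor "pciex".
 */
static int skip_node_type(const char *type)
{
	return type == NULL ||
	       (strcmp(type, "pci") != 0 && strcmp(type, "pciex") != 0);
}
```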

^ permalink raw reply

* Re: [PATCH 2/2] Base pSeries PCIe support
From: Christoph Hellwig @ 2006-04-12  4:05 UTC (permalink / raw)
  To: Jake Moilanen; +Cc: linuxppc-dev, paulus
In-Reply-To: <20060331161330.3c723103.moilanen@austin.ibm.com>

On Fri, Mar 31, 2006 at 04:13:30PM -0600, Jake Moilanen wrote:
> This patch hooks our current interrupt subsystem and sets up a single
> vector MSI as if it was a LSI.  Multiple MSI vectors is coming in the
> future.

This is broken.  Linux drivers expect MSI to be disabled on ->probe.
There are at least two reasons for that:

 (1) Many devices that claim to implement MSI are actually broken in
     more or less subtle ways, and thus must use traditional INTx pins.
 (2) MSI defines relaxed semantics for DMA synchronization.  Silently
     enabling MSI could cause subtle data corruption.

^ permalink raw reply

* Re: question about Linux 2.6 with Xilinx ML-403
From: Grant Likely @ 2006-04-12  4:04 UTC (permalink / raw)
  To: yding; +Cc: linuxppc-embedded
In-Reply-To: <443C61BA.1030404@lnxw.com>

On 4/11/06, yding <yding@lnxw.com> wrote:
>  I just checked out Linus's GIT tree from
> rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> linux-2.6.
>
>  Where is the location for Paul's GIT tree ? Just curious ...

rsync://rsync.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git

g.

--
Grant Likely, B.Sc. P.Eng.
Secret Lab Technologies Ltd.
(403) 399-0195

^ permalink raw reply

* Re: question about Linux 2.6 with Xilinx ML-403
From: yding @ 2006-04-12  2:11 UTC (permalink / raw)
  To: Grant Likely; +Cc: linuxppc-embedded
In-Reply-To: <528646bc0604071518i5f95b4dbtd232d8cb92ef4fb9@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 926 bytes --]

Hi, Grant,

Thanks for the information.

I just checked out Linus's GIT tree from
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
linux-2.6.

Where is the location for Paul's GIT tree ? Just curious ...

Best Regards,
--
Ying Ding


Grant Likely wrote:

>On 4/7/06, yding <yding@lnxw.com> wrote:
>  
>
>> HI, Grant,
>>
>> I found this message :
>>http://patchwork.ozlabs.org/linuxppc/patch?id=3841 on
>>Internet.
>> It looks like you created some patch files for supporting Linux 2.6 with
>>Xilinx ML-403.
>>
>>how can I download the whole kernel source tree with your patched files (via
>>cvs or bitkeeper) ?
>>    
>>
>
>I believe they are now in Linus' mainline git tree.  If not, they are
>in Paul's powerpc git tree.
>
>BTW, please CC the linuxppc-embedded mailing list when emailing me directly.
>
>Cheers,
>g.
>--
>Grant Likely, B.Sc. P.Eng.
>Secret Lab Technologies Ltd.
>(403) 399-0195
>
>  
>

[-- Attachment #2: Type: text/html, Size: 1559 bytes --]

^ permalink raw reply

* Re: [PATCH 0/6] [RFC] Sizing zones and holes in an architecture independent manner
From: Bob Picco @ 2006-04-12  1:38 UTC (permalink / raw)
  To: Mel Gorman
  Cc: davej, Luck, Tony, ak, Bob Picco, Linux Kernel Mailing List,
	linuxppc-dev
In-Reply-To: <Pine.LNX.4.64.0604120053080.10268@skynet.skynet.ie>

Mel Gorman wrote:	[Tue Apr 11 2006, 08:02:10PM EDT]
> On Tue, 11 Apr 2006, Bob Picco wrote:
> 
> >luck wrote:	[Tue Apr 11 2006, 06:20:29PM EDT]
> >>On Tue, Apr 11, 2006 at 11:39:46AM +0100, Mel Gorman wrote:
> >>
> >>>The patches have only been *compile tested* for ia64 with a flatmem
> >>>configuration. An attempt was made to boot test on an ancient RS/6000
> >>>but the vanilla kernel does not boot so I have to investigate there.
> >>
> >>The good news: Compilation is clean on the ia64 config variants that
> >>I usually build (all 10 of them).
> >>
> >>The bad (or at least consistent) news: It doesn't boot on an Intel
> >>Tiger either (oops at kmem_cache_alloc+0x41).
> >>
> >>-Tony
> >I had a reply queued to report the same failure with
> >DISCONTIG+NUMA+VIRTUAL_MEM_MAP.  This was 2 CPU HP rx2600. I'll take a 
> >closer
> >look at the code tomorrow.
> >
> 
> hmm, ok, so discontig.c is in use which narrows things down. When 
> build_node_maps() is called, I assumed that the start and end pfn passed 
> in was for a valid page range. Was this a valid assumption? When I re-read 
The addresses are a valid physical range. The caution should be that
filter_rsvd_memory converts the addresses from identity mapped to
physical. efi_memmap_walk calls back to function with identity mapped
addresses. What you've done seems okay.
BTW - I like what you are attempting to achieve.
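
A rough sketch of that identity-mapped to physical conversion, for
anyone following along. This is a toy model, not the kernel's __pa():
the toy_* names are made up, and the 3 region bits and the 16KB page
size (shift of 14) are assumptions about a typical ia64 configuration:

```c
#include <stdint.h>

/* Assumption: the top 3 address bits select the ia64 region. */
#define TOY_REGION_MASK	(7ULL << 61)
/* Assumption: 16KB pages; the real PAGE_SHIFT is a config choice. */
#define TOY_PAGE_SHIFT	14

/* Strip the region bits to get a physical address. */
static uint64_t toy_virt_to_phys(uint64_t vaddr)
{
	return vaddr & ~TOY_REGION_MASK;
}

/* Physical address to page frame number. */
static uint64_t toy_virt_to_pfn(uint64_t vaddr)
{
	return toy_virt_to_phys(vaddr) >> TOY_PAGE_SHIFT;
}
```

With these assumptions an identity-mapped address 16MB into region 7
(0xE000000001000000) becomes pfn 1024.
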
> the comment, it implies that memory holes could be within this range which 
> would cause boot failures. If that is the case, the correct thing to do 
> was to call add_active_range() in count_node_pages() instead of 
> build_node_maps().
Yes that helps because of granules and it boots.  The patch below is applied 
on top of your original post. But..

Index: linux-2.6.17-rc1/arch/ia64/mm/discontig.c
===================================================================
--- linux-2.6.17-rc1.orig/arch/ia64/mm/discontig.c	2006-04-11 20:36:15.000000000 -0400
+++ linux-2.6.17-rc1/arch/ia64/mm/discontig.c	2006-04-11 20:52:59.000000000 -0400
@@ -88,9 +88,6 @@ static int __init build_node_maps(unsign
 	min_low_pfn = min(min_low_pfn, bdp->node_boot_start>>PAGE_SHIFT);
 	max_low_pfn = max(max_low_pfn, bdp->node_low_pfn);
 
-	/* Add a known active range */
-	add_active_range(node, start, end);
-
 	return 0;
 }
 
@@ -651,6 +648,8 @@ static __init int count_node_pages(unsig
 	mem_data[node].min_pfn = min(mem_data[node].min_pfn,
 				     start >> PAGE_SHIFT);
 
+	add_active_range(node, start, end);
+
 	return 0;
 }

Page free/avail accounting is off and I'm done for tonight. I believe it's how 
you treat holes but haven't looked closely yet.
 

Let me wrap my head around this code again. It's been some time.
> 
bob

^ permalink raw reply

* The behavior of L2 cache controller for PPC440
From: Shawn Jin @ 2006-04-12  1:04 UTC (permalink / raw)
  To: ppcembed

Hi,

I'm testing the error interrupts of a PPC440 L2 cache controller and
am confused about the behavior of the tag parity error. I hope some
people on the list who have worked on this before can shed some light
on it.

I use the error injection bit in L2_CONFIG to generate a tag parity
error. The L2 cache error handler comes from ibm440gx_common.c, which
uses the CTP command to clear a tag error. However, once a tag parity
error occurs, the handler keeps getting invoked and the error interrupt
stays asserted. It looks like the CTP command does not clear the tag
error or the tag error status bit in L2_STATUS.

The L2 cache controller spec only says that CTP command will reset the
tag trap address and way registers within the L2 design such that they
can trap a new error. From this statement I assume it doesn't clear
the status bit in L2_STATUS. If my assumption is correct, then the
L2_INTERRUPT will stay asserted until a CLEAR command invalidates the
trapped cache line. This is NOT implemented in the ibm440gx_common.c's
l2_error_handler().

However is it really necessary to use CLEAR command to invalidate the
cache line? The spec describes tag array parity in section 2.6, which
says tag parity errors are self correcting since the way with a tag
parity error will be LRUed out of the L2 cache. So it seems that no
explicit invalidation is required.

So my question is "Does or should the CTP command clear the status bit
in L2_STATUS?"

Thanks for your comments,
-Shawn.

^ permalink raw reply

* Re: [PATCH 0/6] [RFC] Sizing zones and holes in an architecture independent manner
From: Luck, Tony @ 2006-04-12  0:05 UTC (permalink / raw)
  To: Mel Gorman; +Cc: linuxppc-dev, ak, Linux Kernel Mailing List, davej
In-Reply-To: <Pine.LNX.4.64.0604112352230.6624@skynet.skynet.ie>

On Wed, Apr 12, 2006 at 12:23:45AM +0100, Mel Gorman wrote:
> Darn.
> 
> o Did it boot on other IA64 machines or was the Tiger the first boot failure?

I only tried to boot on the Tiger.

> o Possibly a stupid question but does the Tiger configuration use the
>    flatmem memory model, sparsemem or discontig?

I built using arch/ia64/configs/tiger_defconfig - a FLATMEM config with
VIRT_MEM_MAP=y.  The machine has 4G of memory, 2G at 0-2G, and 2G at 6G-8G
(so it is somewhat sparse ... but this is pretty normal for an ia64 with
>2G).

> If it's flatmem, I noticed I made a stupid mistake where vmem_map is not 
> getting set to (void *)0 for machines with small memory holes. Nothing 
> else really obvious jumped out at me.
> 
> I've attached a patch called "105-ia64_use_init_nodes.patch". Can you 
> reverse Patch 5/6 and apply this one instead please? I've also attached 
> 107-debug.diff that applies on top of patch 6/6. It just prints out 
> debugging information during startup that may tell me where I went wrong 
> in arch/ia64. I'd really appreciate it if you could use both patches, let 
> me know if it still fails to boot and send me the console log of the 
> machine starting up if it fails so I can make guesses as to what is going 
> wrong.
> 
> Thanks a lot for trying the patches out on ia64. It was the one arch of 
> the set I had no chance to test with at all :/

Ok, I cloned a branch from patch4, applied the new patch 5, git-cherry-picked
patch 6, and then applied the debug patch7.

Here's the console log:

Linux version 2.6.17-rc1-tiger-smpxx (aegl@linux-t10) (gcc version 3.4.3 20050227 (Red Hat 3.4.3-22.1)) #2 SMP Tue Apr 11 16:45:31 PDT 2006
EFI v1.10 by INTEL: SALsystab=0x7fe54980 ACPI=0x7ff84000 ACPI 2.0=0x7ff83000 MPS=0x7ff82000 SMBIOS=0xf0000
Early serial console at I/O port 0x2f8 (options '115200')
Initial ramdisk at: 0xe0000001fedf5000 (1303568 bytes)
SAL 3.20: Intel Corp                       SR870BN4                         version 3.0
SAL Platform features: BusLock IRQ_Redirection
SAL: AP wakeup using external interrupt vector 0xf0
No logical to physical processor mapping available
iosapic_system_init: Disabling PC-AT compatible 8259 interrupts
ACPI: Local APIC address c0000000fee00000
PLATFORM int CPEI (0x3): GSI 22 (level, low) -> CPU 0 (0xc618) vector 30
register_intr: changing vector 39 from IO-SAPIC-edge to IO-SAPIC-level
4 CPUs available, 4 CPUs total
MCA related initialization done
add_active_range(0, 16140901064512634880, 16140901066637049856): New
add_active_range(0, 16140901066641899520, 16140901066642489344): New
add_active_range(0, 16140901070938308608, 16140901073083760640): New
add_active_range(0, 16140901073084219392, 16140901073085480960): New
Dumping sorted node map
entry 0: 0  16140901064512634880 -> 16140901066637049856
entry 1: 0  16140901066641899520 -> 16140901066642489344
entry 2: 0  16140901070938308608 -> 16140901073083760640
entry 3: 0  16140901073084219392 -> 16140901073085480960
Virtual mem_map starts at 0x0000000000000000
SMP: Allowing 4 CPUs, 0 hotplug CPUs
Built 1 zonelists
Kernel command line: BOOT_IMAGE=scsi0:EFI\redhat\l-tiger-smpxx.gz  root=LABEL=/ console=uart,io,0x2f8 ro
PID hash table entries: 16 (order: 4, 128 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 1 (order: -11, 8 bytes)
Inode-cache hash table entries: 1 (order: -11, 8 bytes)
Placing software IO TLB between 0x4a84000 - 0x8a84000
kernel BUG at arch/ia64/mm/init.c:609!
swapper[0]: bugcheck! 0 [1]
Modules linked in:

Pid: 0, CPU 0, comm:              swapper
psr : 00001010084a6010 ifs : 800000000000040f ip  : [<a0000001007dd620>]    Not tainted
ip is at mem_init+0x80/0x580
unat: 0000000000000000 pfs : 000000000000040f rsc : 0000000000000003
rnat: a00000010095fd80 bsps: 00000000000002f9 pr  : 80000000afb5666b
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0930ffff00090000 ssd : 0930ffff00090000
b0  : a0000001007dd620 b6  : a0000001007f6ba0 b7  : a0000001003c27e0
f6  : 0fffbccccccccc8c00000 f7  : 0ffdbf300000000000000
f8  : 10001c000000000000000 f9  : 10002a000000000000000
f10 : 0fffe9999999996900000 f11 : 1003e0000000000000000
r1  : a000000100b460d0 r2  : 0000000000000000 r3  : a00000010095cba0
r8  : 000000000000002a r9  : a00000010095cb90 r10 : 00000000000002f9
r11 : 00000000000be000 r12 : a00000010080fe10 r13 : a000000100808000
r14 : 0000000000004000 r15 : a00000010095cba8 r16 : 0000000000000001
r17 : a00000010095cb98 r18 : ffffffffffffffff r19 : a00000010095fd88
r20 : 00000000000000be r21 : a00000010095bd50 r22 : 0000000000000000
r23 : a00000010095cbb8 r24 : a00000010087f7e8 r25 : a00000010087f7e0
r26 : a000000100946308 r27 : 00000010084a6010 r28 : 00000000000002f9
r29 : 00000000000002f8 r30 : 0000000000000000 r31 : a00000010095cb68
Unable to handle kernel NULL pointer dereference (address 0000000000000000)
swapper[0]: Oops 11012296146944 [2]
Modules linked in:

Pid: 0, CPU 0, comm:              swapper
psr : 0000121008022018 ifs : 8000000000000287 ip  : [<a000000100116b81>]    Not tainted
ip is at kmem_cache_alloc+0x41/0x100
unat: 0000000000000000 pfs : 0000000000000793 rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr  : 80000000afb56967
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70033f
csd : 0930ffff00090000 ssd : 0930ffff00090000
b0  : a00000010003d820 b6  : a00000010003e600 b7  : a00000010000c9d0
f6  : 1003ea08f5c3b783104ea f7  : 1003e9e3779b97f4a7c16
f8  : 1003e0a0000001000117f f9  : 1003e000000000000007f
f10 : 1003e0000000000000379 f11 : 1003e6db6db6db6db6db7
r1  : a000000100b460d0 r2  : 0000000000000000 r3  : a000000100949240
r8  : 0000000000000000 r9  : 0000000000000000 r10 : 0000000000000000
r11 : a000000100808f14 r12 : a00000010080f260 r13 : a000000100808000
r14 : 0000000000000000 r15 : 000000000000000f r16 : a00000010080f2f0
r17 : 0000000000000000 r18 : a000000100876bd8 r19 : a00000010080f2ec
r20 : a00000010080f2e8 r21 : 000000007fffffff r22 : 0000000000000000
r23 : 0000000000000050 r24 : a0000001000117f0 r25 : a0000001000117a0
r26 : a000000100885480 r27 : a000000100946fb0 r28 : a00000010087df40
r29 : 0000000000000002 r30 : 0000000000000002 r31 : 00000000000000c0

^ permalink raw reply

* Re: [PATCH 0/6] [RFC] Sizing zones and holes in an architecture independent manner
From: Mel Gorman @ 2006-04-12  0:02 UTC (permalink / raw)
  To: Bob Picco; +Cc: linuxppc-dev, ak, Luck, Tony, Linux Kernel Mailing List, davej
In-Reply-To: <20060411232944.GE23742@localhost>

On Tue, 11 Apr 2006, Bob Picco wrote:

> luck wrote:	[Tue Apr 11 2006, 06:20:29PM EDT]
>> On Tue, Apr 11, 2006 at 11:39:46AM +0100, Mel Gorman wrote:
>>
>>> The patches have only been *compile tested* for ia64 with a flatmem
>>> configuration. An attempt was made to boot test on an ancient RS/6000
>>> but the vanilla kernel does not boot so I have to investigate there.
>>
>> The good news: Compilation is clean on the ia64 config variants that
>> I usually build (all 10 of them).
>>
>> The bad (or at least consistent) news: It doesn't boot on an Intel
>> Tiger either (oops at kmem_cache_alloc+0x41).
>>
>> -Tony
> I had a reply queued to report the same failure with
> DISCONTIG+NUMA+VIRTUAL_MEM_MAP.  This was 2 CPU HP rx2600. I'll take a closer
> look at the code tomorrow.
>

hmm, ok, so discontig.c is in use which narrows things down. When 
build_node_maps() is called, I assumed that the start and end pfn passed 
in was for a valid page range. Was this a valid assumption? When I re-read 
the comment, it implies that memory holes could be within this range which 
would cause boot failures. If that is the case, the correct thing to do 
was to call add_active_range() in count_node_pages() instead of 
build_node_maps().
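
To illustrate why the holes matter, here is a small stand-alone model.
It is not the code from the patch set: the toy_* names, the fixed-size
table and the assumption that registered ranges never overlap are all
mine. If only truly present ranges are registered, the pages missing
from any span can be derived instead of being passed in by the arch
code:

```c
#define TOY_MAX_RANGES 8

struct toy_range {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
};

static struct toy_range toy_map[TOY_MAX_RANGES];
static int toy_nr_ranges;

/* Record one range of pages known to be present (no holes inside). */
static int toy_add_active_range(unsigned long start_pfn,
				unsigned long end_pfn)
{
	if (toy_nr_ranges >= TOY_MAX_RANGES || start_pfn >= end_pfn)
		return -1;
	toy_map[toy_nr_ranges].start_pfn = start_pfn;
	toy_map[toy_nr_ranges].end_pfn = end_pfn;
	toy_nr_ranges++;
	return 0;
}

/* Pages in [lo, hi) covered by a registered range. */
static unsigned long toy_present_pages(unsigned long lo, unsigned long hi)
{
	unsigned long present = 0;
	int i;

	for (i = 0; i < toy_nr_ranges; i++) {
		unsigned long s = toy_map[i].start_pfn > lo ?
				  toy_map[i].start_pfn : lo;
		unsigned long e = toy_map[i].end_pfn < hi ?
				  toy_map[i].end_pfn : hi;

		if (s < e)
			present += e - s;
	}
	return present;
}

/* Pages in [lo, hi) that fall into holes. */
static unsigned long toy_absent_pages(unsigned long lo, unsigned long hi)
{
	return (hi - lo) - toy_present_pages(lo, hi);
}
```

Registering a span that silently contains a hole would make the hole
look like usable memory in this model, which matches the kind of boot
failure being discussed.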

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

^ permalink raw reply

* Re: [PATCH 0/6] [RFC] Sizing zones and holes in an architecture independent manner
From: Bob Picco @ 2006-04-11 23:29 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Mel Gorman, linuxppc-dev, ak, linux-kernel, davej
In-Reply-To: <20060411222029.GA7743@agluck-lia64.sc.intel.com>

luck wrote:	[Tue Apr 11 2006, 06:20:29PM EDT]
> On Tue, Apr 11, 2006 at 11:39:46AM +0100, Mel Gorman wrote:
> 
> > The patches have only been *compile tested* for ia64 with a flatmem
> > configuration. An attempt was made to boot test on an ancient RS/6000
> > but the vanilla kernel does not boot so I have to investigate there.
> 
> The good news: Compilation is clean on the ia64 config variants that
> I usually build (all 10 of them).
> 
> The bad (or at least consistent) news: It doesn't boot on an Intel
> Tiger either (oops at kmem_cache_alloc+0x41).
> 
> -Tony
I had a reply queued to report the same failure with
DISCONTIG+NUMA+VIRTUAL_MEM_MAP.  This was 2 CPU HP rx2600. I'll take a closer 
look at the code tomorrow. 

bob

^ permalink raw reply

* Re: [PATCH 0/6] [RFC] Sizing zones and holes in an architecture independent manner
From: Mel Gorman @ 2006-04-11 23:23 UTC (permalink / raw)
  To: Luck, Tony; +Cc: linuxppc-dev, ak, Linux Kernel Mailing List, davej
In-Reply-To: <20060411222029.GA7743@agluck-lia64.sc.intel.com>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1754 bytes --]

On Tue, 11 Apr 2006, Luck, Tony wrote:

> On Tue, Apr 11, 2006 at 11:39:46AM +0100, Mel Gorman wrote:
>
>> The patches have only been *compile tested* for ia64 with a flatmem
>> configuration. An attempt was made to boot test on an ancient RS/6000
>> but the vanilla kernel does not boot so I have to investigate there.
>
> The good news: Compilation is clean on the ia64 config variants that
> I usually build (all 10 of them).
>

One plus at least.

> The bad (or at least consistent) news: It doesn't boot on an Intel
> Tiger either (oops at kmem_cache_alloc+0x41).
>

Darn.

o Did it boot on other IA64 machines or was the Tiger the first boot failure?
o Possibly a stupid question but does the Tiger configuration use the
   flatmem memory model, sparsemem or discontig?

If it's flatmem, I noticed I made a stupid mistake where vmem_map is not 
getting set to (void *)0 for machines with small memory holes. Nothing 
else really obvious jumped out at me.

I've attached a patch called "105-ia64_use_init_nodes.patch". Can you 
reverse Patch 5/6 and apply this one instead please? I've also attached 
107-debug.diff that applies on top of patch 6/6. It just prints out 
debugging information during startup that may tell me where I went wrong 
in arch/ia64. I'd really appreciate it if you could use both patches, let 
me know if it still fails to boot and send me the console log of the 
machine starting up if it fails so I can make guesses as to what is going 
wrong.

Thanks a lot for trying the patches out on ia64. It was the one arch of 
the set I had no chance to test with at all :/

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

[-- Attachment #2: 105-ia64_use_init_nodes.patch --]
[-- Type: TEXT/PLAIN, Size: 8361 bytes --]

diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/Kconfig linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/Kconfig
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/Kconfig	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/Kconfig	2006-04-11 23:31:38.000000000 +0100
@@ -352,6 +352,9 @@ config NUMA
 	  Access).  This option is for configuring high-end multiprocessor
 	  server systems.  If in doubt, say N.
 
+config ARCH_POPULATES_NODE_MAP
+	def_bool y
+
 # VIRTUAL_MEM_MAP and FLAT_NODE_MEM_MAP are functionally equivalent.
 # VIRTUAL_MEM_MAP has been retained for historical reasons.
 config VIRTUAL_MEM_MAP
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/contig.c linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/contig.c
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/contig.c	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/contig.c	2006-04-11 23:56:45.000000000 +0100
@@ -26,10 +26,6 @@
 #include <asm/sections.h>
 #include <asm/mca.h>
 
-#ifdef CONFIG_VIRTUAL_MEM_MAP
-static unsigned long num_dma_physpages;
-#endif
-
 /**
  * show_mem - display a memory statistics summary
  *
@@ -212,18 +208,6 @@ count_pages (u64 start, u64 end, void *a
 	return 0;
 }
 
-#ifdef CONFIG_VIRTUAL_MEM_MAP
-static int
-count_dma_pages (u64 start, u64 end, void *arg)
-{
-	unsigned long *count = arg;
-
-	if (start < MAX_DMA_ADDRESS)
-		*count += (min(end, MAX_DMA_ADDRESS) - start) >> PAGE_SHIFT;
-	return 0;
-}
-#endif
-
 /*
  * Set up the page tables.
  */
@@ -232,47 +216,24 @@ void __init
 paging_init (void)
 {
 	unsigned long max_dma;
-	unsigned long zones_size[MAX_NR_ZONES];
 #ifdef CONFIG_VIRTUAL_MEM_MAP
-	unsigned long zholes_size[MAX_NR_ZONES];
+	unsigned long nid = 0;
 	unsigned long max_gap;
 #endif
 
-	/* initialize mem_map[] */
-
-	memset(zones_size, 0, sizeof(zones_size));
-
 	num_physpages = 0;
 	efi_memmap_walk(count_pages, &num_physpages);
 
 	max_dma = virt_to_phys((void *) MAX_DMA_ADDRESS) >> PAGE_SHIFT;
 
 #ifdef CONFIG_VIRTUAL_MEM_MAP
-	memset(zholes_size, 0, sizeof(zholes_size));
-
-	num_dma_physpages = 0;
-	efi_memmap_walk(count_dma_pages, &num_dma_physpages);
-
-	if (max_low_pfn < max_dma) {
-		zones_size[ZONE_DMA] = max_low_pfn;
-		zholes_size[ZONE_DMA] = max_low_pfn - num_dma_physpages;
-	} else {
-		zones_size[ZONE_DMA] = max_dma;
-		zholes_size[ZONE_DMA] = max_dma - num_dma_physpages;
-		if (num_physpages > num_dma_physpages) {
-			zones_size[ZONE_NORMAL] = max_low_pfn - max_dma;
-			zholes_size[ZONE_NORMAL] =
-				((max_low_pfn - max_dma) -
-				 (num_physpages - num_dma_physpages));
-		}
-	}
-
 	max_gap = 0;
+	efi_memmap_walk(register_active_ranges, &nid);
 	efi_memmap_walk(find_largest_hole, (u64 *)&max_gap);
 	if (max_gap < LARGE_GAP) {
 		vmem_map = (struct page *) 0;
-		free_area_init_node(0, NODE_DATA(0), zones_size, 0,
-				    zholes_size);
+		free_area_init_nodes(max_dma, max_dma,
+				max_low_pfn, max_low_pfn);
 	} else {
 		unsigned long map_size;
 
@@ -284,19 +245,14 @@ paging_init (void)
 		efi_memmap_walk(create_mem_map_page_table, NULL);
 
 		NODE_DATA(0)->node_mem_map = vmem_map;
-		free_area_init_node(0, NODE_DATA(0), zones_size,
-				    0, zholes_size);
+		free_area_init_nodes(max_dma, max_dma,
+				max_low_pfn, max_low_pfn);
 
 		printk("Virtual mem_map starts at 0x%p\n", mem_map);
 	}
 #else /* !CONFIG_VIRTUAL_MEM_MAP */
-	if (max_low_pfn < max_dma)
-		zones_size[ZONE_DMA] = max_low_pfn;
-	else {
-		zones_size[ZONE_DMA] = max_dma;
-		zones_size[ZONE_NORMAL] = max_low_pfn - max_dma;
-	}
-	free_area_init(zones_size);
+	add_active_range(0, 0, max_low_pfn);
+	free_area_init_nodes(max_dma, max_dma, max_low_pfn, max_low_pfn);
 #endif /* !CONFIG_VIRTUAL_MEM_MAP */
 	zero_page_memmap_ptr = virt_to_page(ia64_imva(empty_zero_page));
 }
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/discontig.c linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/discontig.c
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/discontig.c	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/discontig.c	2006-04-11 23:54:06.000000000 +0100
@@ -87,6 +87,7 @@ static int __init build_node_maps(unsign
 
 	min_low_pfn = min(min_low_pfn, bdp->node_boot_start>>PAGE_SHIFT);
 	max_low_pfn = max(max_low_pfn, bdp->node_low_pfn);
+	add_active_range(node, start, end);
 
 	return 0;
 }
@@ -660,9 +661,8 @@ static __init int count_node_pages(unsig
 void __init paging_init(void)
 {
 	unsigned long max_dma;
-	unsigned long zones_size[MAX_NR_ZONES];
-	unsigned long zholes_size[MAX_NR_ZONES];
 	unsigned long pfn_offset = 0;
+	unsigned long max_pfn = 0;
 	int node;
 
 	max_dma = virt_to_phys((void *) MAX_DMA_ADDRESS) >> PAGE_SHIFT;
@@ -679,46 +679,17 @@ void __init paging_init(void)
 #endif
 
 	for_each_online_node(node) {
-		memset(zones_size, 0, sizeof(zones_size));
-		memset(zholes_size, 0, sizeof(zholes_size));
-
 		num_physpages += mem_data[node].num_physpages;
-
-		if (mem_data[node].min_pfn >= max_dma) {
-			/* All of this node's memory is above ZONE_DMA */
-			zones_size[ZONE_NORMAL] = mem_data[node].max_pfn -
-				mem_data[node].min_pfn;
-			zholes_size[ZONE_NORMAL] = mem_data[node].max_pfn -
-				mem_data[node].min_pfn -
-				mem_data[node].num_physpages;
-		} else if (mem_data[node].max_pfn < max_dma) {
-			/* All of this node's memory is in ZONE_DMA */
-			zones_size[ZONE_DMA] = mem_data[node].max_pfn -
-				mem_data[node].min_pfn;
-			zholes_size[ZONE_DMA] = mem_data[node].max_pfn -
-				mem_data[node].min_pfn -
-				mem_data[node].num_dma_physpages;
-		} else {
-			/* This node has memory in both zones */
-			zones_size[ZONE_DMA] = max_dma -
-				mem_data[node].min_pfn;
-			zholes_size[ZONE_DMA] = zones_size[ZONE_DMA] -
-				mem_data[node].num_dma_physpages;
-			zones_size[ZONE_NORMAL] = mem_data[node].max_pfn -
-				max_dma;
-			zholes_size[ZONE_NORMAL] = zones_size[ZONE_NORMAL] -
-				(mem_data[node].num_physpages -
-				 mem_data[node].num_dma_physpages);
-		}
-
 		pfn_offset = mem_data[node].min_pfn;
 
 #ifdef CONFIG_VIRTUAL_MEM_MAP
 		NODE_DATA(node)->node_mem_map = vmem_map + pfn_offset;
 #endif
-		free_area_init_node(node, NODE_DATA(node), zones_size,
-				    pfn_offset, zholes_size);
+		if (mem_data[node].max_pfn > max_pfn)
+			max_pfn = mem_data[node].max_pfn;
 	}
 
+	free_area_init_nodes(max_dma, max_dma, max_pfn, max_pfn);
+
 	zero_page_memmap_ptr = virt_to_page(ia64_imva(empty_zero_page));
 }
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/init.c linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/init.c
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/arch/ia64/mm/init.c	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/arch/ia64/mm/init.c	2006-04-11 23:40:15.000000000 +0100
@@ -539,6 +539,16 @@ find_largest_hole (u64 start, u64 end, v
 	last_end = end;
 	return 0;
 }
+
+int __init
+register_active_ranges(u64 start, u64 end, void *nid)
+{
+	BUG_ON(nid == NULL);
+	BUG_ON(*(unsigned long *)nid >= MAX_NUMNODES);
+
+	add_active_range(*(unsigned long *)nid, start, end);
+	return 0;
+}
 #endif /* CONFIG_VIRTUAL_MEM_MAP */
 
 static int __init
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-104-x86_64_use_init_nodes/include/asm-ia64/meminit.h linux-2.6.17-rc1-105-ia64_use_init_nodes/include/asm-ia64/meminit.h
--- linux-2.6.17-rc1-104-x86_64_use_init_nodes/include/asm-ia64/meminit.h	2006-04-03 04:22:10.000000000 +0100
+++ linux-2.6.17-rc1-105-ia64_use_init_nodes/include/asm-ia64/meminit.h	2006-04-11 23:34:58.000000000 +0100
@@ -56,6 +56,7 @@ extern void efi_memmap_init(unsigned lon
   extern unsigned long vmalloc_end;
   extern struct page *vmem_map;
   extern int find_largest_hole (u64 start, u64 end, void *arg);
+  extern int register_active_ranges (u64 start, u64 end, void *arg);
   extern int create_mem_map_page_table (u64 start, u64 end, void *arg);
 #endif
 

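A note for readers following the ia64 conversion above. The removed zones_size[]/zholes_size[] arithmetic is what free_area_init_nodes() is expected to derive from the ranges registered with add_active_range(). The sketch below is a userspace model of that idea only, not the code from mm/mem_init.c. struct pfn_range and the zone_*_pages() helpers are hypothetical names, and the model assumes a single node with non-overlapping ranges.

```c
#include <assert.h>

/* Hypothetical stand-in for one registered active range, in PFNs. */
struct pfn_range {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
};

/* Clamp a PFN into the zone [lo, hi]. */
static unsigned long clamp_pfn(unsigned long pfn, unsigned long lo,
			       unsigned long hi)
{
	assert(lo <= hi);
	if (pfn < lo)
		return lo;
	if (pfn > hi)
		return hi;
	return pfn;
}

/* Pages of real memory that fall inside the zone [zone_start, zone_end). */
static unsigned long zone_present_pages(const struct pfn_range *r, int n,
					unsigned long zone_start,
					unsigned long zone_end)
{
	unsigned long present = 0;
	int i;

	for (i = 0; i < n; i++)
		present += clamp_pfn(r[i].end_pfn, zone_start, zone_end) -
			   clamp_pfn(r[i].start_pfn, zone_start, zone_end);
	return present;
}

/* Pages the zone spans, holes included (the old zones_size[] value). */
static unsigned long zone_spanned_pages(const struct pfn_range *r, int n,
					unsigned long zone_start,
					unsigned long zone_end)
{
	unsigned long lo, hi;
	int i;

	if (n == 0)
		return 0;
	lo = r[0].start_pfn;
	hi = r[0].end_pfn;
	for (i = 1; i < n; i++) {
		if (r[i].start_pfn < lo)
			lo = r[i].start_pfn;
		if (r[i].end_pfn > hi)
			hi = r[i].end_pfn;
	}
	return clamp_pfn(hi, zone_start, zone_end) -
	       clamp_pfn(lo, zone_start, zone_end);
}

/* Hole pages inside the zone (the old zholes_size[] value). */
static unsigned long zone_absent_pages(const struct pfn_range *r, int n,
				       unsigned long zone_start,
				       unsigned long zone_end)
{
	return zone_spanned_pages(r, n, zone_start, zone_end) -
	       zone_present_pages(r, n, zone_start, zone_end);
}
```

With ranges 0-100, 150-300 and 400-500, max_dma = 200 and max_low_pfn = 500, the model gives ZONE_DMA 200 spanned pages with 50 absent, and ZONE_NORMAL 300 spanned pages with 100 absent. Working the removed max_dma/num_dma_physpages arithmetic by hand for the same layout gives the same numbers.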
[-- Attachment #3: 107-debug.diff --]
[-- Type: TEXT/PLAIN, Size: 3852 bytes --]

diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.17-rc1-106-breakout_mem_init/mm/mem_init.c linux-2.6.17-rc1-107-debug/mm/mem_init.c
--- linux-2.6.17-rc1-106-breakout_mem_init/mm/mem_init.c	2006-04-11 23:49:52.000000000 +0100
+++ linux-2.6.17-rc1-107-debug/mm/mem_init.c	2006-04-11 23:52:00.000000000 +0100
@@ -645,13 +645,23 @@ void __init free_bootmem_with_active_reg
 	for_each_active_range_index_in_nid(i, nid) {
 		unsigned long size_pages = 0;
 		unsigned long end_pfn = early_node_map[i].end_pfn;
-		if (early_node_map[i].start_pfn >= max_low_pfn)
+		if (early_node_map[i].start_pfn >= max_low_pfn) {
+			printk("start_pfn %lu >= %lu\n", early_node_map[i].start_pfn,
+								max_low_pfn);
 			continue;
+		}
 
-		if (end_pfn > max_low_pfn)
+		if (end_pfn > max_low_pfn) {
+			printk("end_pfn %lu going back to %lu\n", early_node_map[i].end_pfn,
+									max_low_pfn);
 			end_pfn = max_low_pfn;
+		}
 
 		size_pages = end_pfn - early_node_map[i].start_pfn;
+		printk("free_bootmem_node(%d, %lu, %lu)\n",
+				early_node_map[i].nid,
+				PFN_PHYS(early_node_map[i].start_pfn),
+				PFN_PHYS(size_pages));
 		free_bootmem_node(NODE_DATA(early_node_map[i].nid),
 				PFN_PHYS(early_node_map[i].start_pfn),
 				PFN_PHYS(size_pages));
@@ -661,10 +671,15 @@ void __init free_bootmem_with_active_reg
 void __init memory_present_with_active_regions(int nid)
 {
 	unsigned int i;
-	for_each_active_range_index_in_nid(i, nid)
+	for_each_active_range_index_in_nid(i, nid) {
+		printk("memory_present(%d, %lu, %lu)\n",
+			early_node_map[i].nid,
+			early_node_map[i].start_pfn,
+			early_node_map[i].end_pfn);
 		memory_present(early_node_map[i].nid,
 				early_node_map[i].start_pfn,
 				early_node_map[i].end_pfn);
+	}
 }
 
 void __init get_pfn_range_for_nid(unsigned int nid,
@@ -752,8 +767,16 @@ unsigned long __init zone_absent_pages_i
 		/* Increase the hole size if the hole is within the zone */
 		start_pfn = early_node_map[i].start_pfn;
 		if (pfn_range_in_zone(prev_end_pfn, start_pfn, zone_type)) {
-			BUG_ON(prev_end_pfn > start_pfn);
+			if (prev_end_pfn > start_pfn) {
+				printk("prev_end > start_pfn : %lu > %lu\n",
+						prev_end_pfn,
+						start_pfn);
+				BUG();
+			}
+			//BUG_ON(prev_end_pfn > start_pfn);
 			hole_pages += start_pfn - prev_end_pfn;
+			printk("Hole found index %d: %lu -> %lu\n",
+					i, prev_end_pfn, start_pfn);
 		}
 
 		prev_end_pfn = early_node_map[i].end_pfn;
@@ -907,17 +930,21 @@ void __init add_active_range(unsigned in
 	unsigned int i;
 	unsigned long pages = end_pfn - start_pfn;
 
+	printk("add_active_range(%d, %lu, %lu): ",
+			nid, start_pfn, end_pfn);
 	/* Merge with existing active regions if possible */
 	for (i = 0; early_node_map[i].end_pfn; i++) {
 		if (early_node_map[i].nid != nid)
 			continue;
 
 		if (early_node_map[i].end_pfn == start_pfn) {
+			printk("Merging forward\n");
 			early_node_map[i].end_pfn += pages;
 			return;
 		}
 
 		if (early_node_map[i].start_pfn == (start_pfn + pages)) {
+			printk("Merging backwards\n");
 			early_node_map[i].start_pfn -= pages;
 			return;
 		}
@@ -933,6 +960,7 @@ void __init add_active_range(unsigned in
 		return;
 	}
 
+	printk("New\n");
 	early_node_map[i].nid = nid;
 	early_node_map[i].start_pfn = start_pfn;
 	early_node_map[i].end_pfn = end_pfn;
@@ -962,6 +990,14 @@ static void __init sort_node_map(void)
 
 	sort(early_node_map, num, sizeof(struct node_active_region),
 						cmp_node_active_region, NULL);
+
+	printk("Dumping sorted node map\n");
+	for (num = 0; early_node_map[num].end_pfn; num++) {
+		printk("entry %lu: %d  %lu -> %lu\n", num,
+				early_node_map[num].nid,
+				early_node_map[num].start_pfn,
+				early_node_map[num].end_pfn);
+	}
 }
 
 unsigned long __init find_min_pfn(void)
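
To make the "Merging forward" / "Merging backwards" / "New" debug output above easier to read, here is a minimal userspace sketch of the merge loop as it appears in the add_active_range() hunks. It is an illustration under assumptions, not the kernel function: MAX_ACTIVE_REGIONS, enum add_result and dump_node_map() are hypothetical names, the array carries one extra zeroed slot so the end_pfn == 0 terminator is always present, and the bounds check is a guess at the lines the diff does not show.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_ACTIVE_REGIONS 8	/* hypothetical limit for this sketch */

struct node_active_region {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* 0 marks the end of the map */
	int nid;
};

/* One spare slot keeps a zeroed terminator after the last real entry. */
static struct node_active_region early_node_map[MAX_ACTIVE_REGIONS + 1];

enum add_result { ADDED_NEW, MERGED_FORWARD, MERGED_BACKWARD, MAP_FULL };

static enum add_result add_active_range(int nid, unsigned long start_pfn,
					unsigned long end_pfn)
{
	unsigned int i;
	unsigned long pages;

	assert(start_pfn < end_pfn);
	pages = end_pfn - start_pfn;

	/* Merge with an existing region on the same node if adjacent. */
	for (i = 0; early_node_map[i].end_pfn; i++) {
		if (early_node_map[i].nid != nid)
			continue;

		/* New range starts where this entry ends: grow the end. */
		if (early_node_map[i].end_pfn == start_pfn) {
			early_node_map[i].end_pfn += pages;
			return MERGED_FORWARD;
		}

		/* New range ends where this entry starts: grow the start. */
		if (early_node_map[i].start_pfn == start_pfn + pages) {
			early_node_map[i].start_pfn -= pages;
			return MERGED_BACKWARD;
		}
	}

	/* Assumed bounds check; the diff does not show the real one. */
	if (i >= MAX_ACTIVE_REGIONS)
		return MAP_FULL;

	early_node_map[i].nid = nid;
	early_node_map[i].start_pfn = start_pfn;
	early_node_map[i].end_pfn = end_pfn;
	return ADDED_NEW;
}

/* Print the map in the same format as the debug patch. */
static void dump_node_map(void)
{
	unsigned int i;

	for (i = 0; early_node_map[i].end_pfn; i++)
		printf("entry %u: %d  %lu -> %lu\n", i,
		       early_node_map[i].nid,
		       early_node_map[i].start_pfn,
		       early_node_map[i].end_pfn);
}
```

Registering 100-200, then 200-300, then 50-100 on node 0 leaves a single entry covering 50-300. A range on another node always gets its own entry, even when it is adjacent. In the lines shown, only exact adjacency is merged; overlapping ranges are not detected by this loop.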
