* Make show_mem() skip holes in a pgdat.
@ 2006-04-13 3:15 Robin Holt
2006-04-13 8:05 ` Andreas Schwab
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Robin Holt @ 2006-04-13 3:15 UTC (permalink / raw)
To: linux-ia64
This patch modifies ia64's show_mem() to walk the vmem_map page tables and
rapidly skip forward across regions where the page tables are missing.
This prevents the pfn_valid() check from causing numerous unnecessary
page faults.
Without this patch on a 512 node 512 cpu system where every node has four
memory holes, the show_mem() call takes 1 hour 18 minutes. With this
patch, it takes less than 3 seconds.
Signed-off-by: Robin Holt <holt@sgi.com>
Index: linux-2.6/arch/ia64/mm/discontig.c
===================================================================
--- linux-2.6.orig/arch/ia64/mm/discontig.c 2006-04-12 18:20:44.374700839 -0500
+++ linux-2.6/arch/ia64/mm/discontig.c 2006-04-12 22:11:31.971106982 -0500
@@ -547,8 +547,71 @@ void show_mem(void)
struct page *page;
if (pfn_valid(pgdat->node_start_pfn + i))
page = pfn_to_page(pgdat->node_start_pfn + i);
- else
+ else {
+ /*
+ * At the beginning of a hole. Search vmem_map
+ * page tables for the end.
+ */
+ unsigned long end_address, hole_end_pfn;
+ unsigned long stop_address;
+
+ end_address = (unsigned long) &vmem_map[pgdat->node_start_pfn + i];
+ end_address = PAGE_ALIGN(end_address);
+
+ stop_address = (unsigned long) &vmem_map[
+ pgdat->node_start_pfn + pgdat->node_spanned_pages];
+
+ do {
+ pgd_t *pgd;
+ pud_t *pud;
+ pmd_t *pmd;
+ pte_t *pte;
+
+ pgd = pgd_offset_k(end_address);
+ if (pgd_none(*pgd)) {
+ end_address += PTRS_PER_PUD *
+ PTRS_PER_PMD *
+ PTRS_PER_PTE *
+ PAGE_SIZE;
+ continue;
+ }
+
+ pud = pud_offset(pgd, end_address);
+ if (pud_none(*pud)) {
+ end_address += PTRS_PER_PMD *
+ PTRS_PER_PTE *
+ PAGE_SIZE;
+ continue;
+ }
+
+ pmd = pmd_offset(pud, end_address);
+ if (pmd_none(*pmd)) {
+ end_address += PTRS_PER_PTE *
+ PAGE_SIZE;
+ continue;
+ }
+
+ pte = pte_offset_kernel(pmd, end_address);
+
+retry_pte:
+ if (pte_none(*pte)) {
+ end_address += PAGE_SIZE;
+ pte++;
+ if ((end_address < stop_address) &&
+ (end_address != ALIGN(end_address, 1UL << PMD_SHIFT)))
+ goto retry_pte;
+ continue;
+ }
+ /* Found next valid vmem_map page */
+ break;
+ } while (end_address < stop_address);
+
+ end_address = end_address - (unsigned long) vmem_map - 1;
+ hole_end_pfn = end_address / sizeof(struct page);
+ i = hole_end_pfn - pgdat->node_start_pfn;
+
continue;
+ }
if (PageReserved(page))
reserved++;
else if (PageSwapCache(page))
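
For readers following along outside the kernel, the skipping idea from the patch description can be sketched in userspace. This is only a toy model under stated assumptions: a two-level table stands in for the pgd/pud/pmd/pte walk, a NULL leaf pointer stands in for pmd_none(), and every name here (toy_table, toy_next_valid, LEAF_ENTRIES, TOP_ENTRIES) is hypothetical and not part of the patch.

```c
#include <stddef.h>

#define LEAF_ENTRIES 8	/* slots per leaf table (hypothetical) */
#define TOP_ENTRIES  4	/* leaf tables per top-level table (hypothetical) */
#define TOTAL_SLOTS  (LEAF_ENTRIES * TOP_ENTRIES)

/* A NULL leaf pointer means the whole range it would cover is a hole. */
struct toy_table {
	const unsigned char *leaf[TOP_ENTRIES];
};

/*
 * Return the first populated slot at or after 'start', or 'stop' if there
 * is none. 'stop' must not exceed TOTAL_SLOTS. 'probes' counts loop
 * iterations, i.e. how many table lookups were needed.
 */
static size_t toy_next_valid(const struct toy_table *t, size_t start,
			     size_t stop, size_t *probes)
{
	size_t idx = start;

	while (idx < stop) {
		const unsigned char *leaf = t->leaf[idx / LEAF_ENTRIES];

		(*probes)++;
		if (leaf == NULL) {
			/* Missing leaf table: jump to the start of the next one. */
			idx = (idx / LEAF_ENTRIES + 1) * LEAF_ENTRIES;
			continue;
		}
		if (leaf[idx % LEAF_ENTRIES])
			return idx;	/* found the next valid slot */
		idx++;
	}
	return stop;
}
```

With three missing leaf tables ahead of the first populated slot, the search inspects a handful of entries instead of one per slot, which is the effect the patch aims for on a much larger scale.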
* Re: Make show_mem() skip holes in a pgdat.
2006-04-13 3:15 Make show_mem() skip holes in a pgdat Robin Holt
@ 2006-04-13 8:05 ` Andreas Schwab
2006-04-13 13:14 ` Robin Holt
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Andreas Schwab @ 2006-04-13 8:05 UTC (permalink / raw)
To: linux-ia64
Robin Holt <holt@sgi.com> writes:
> Index: linux-2.6/arch/ia64/mm/discontig.c
> ===================================================================
> --- linux-2.6.orig/arch/ia64/mm/discontig.c 2006-04-12 18:20:44.374700839 -0500
> +++ linux-2.6/arch/ia64/mm/discontig.c 2006-04-12 22:11:31.971106982 -0500
> @@ -547,8 +547,71 @@ void show_mem(void)
> struct page *page;
> if (pfn_valid(pgdat->node_start_pfn + i))
> page = pfn_to_page(pgdat->node_start_pfn + i);
> - else
> + else {
> + /*
> + * At the beginning of a hole. Search vmem_map
> + * page tables for the end.
> + */
> + unsigned long end_address, hole_end_pfn;
> + unsigned long stop_address;
> +
> + end_address = (unsigned long) &vmem_map[pgdat->node_start_pfn + i];
> + end_address = PAGE_ALIGN(end_address);
> +
> + stop_address = (unsigned long) &vmem_map[
> + pgdat->node_start_pfn + pgdat->node_spanned_pages];
When you need more than 3 levels of indentation you should factor it out
into an inline function.
Andreas.
--
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
* Make show_mem() skip holes in a pgdat.
2006-04-13 3:15 Make show_mem() skip holes in a pgdat Robin Holt
2006-04-13 8:05 ` Andreas Schwab
@ 2006-04-13 13:14 ` Robin Holt
2006-04-13 16:04 ` Bob Picco
2006-04-13 16:36 ` Chen, Kenneth W
3 siblings, 0 replies; 5+ messages in thread
From: Robin Holt @ 2006-04-13 13:14 UTC (permalink / raw)
To: linux-ia64
This patch modifies ia64's show_mem() to walk the vmem_map page tables and
rapidly skip forward across regions where the page tables are missing.
This prevents the pfn_valid() check from causing numerous unnecessary
page faults.
Without this patch on a 512 node 512 cpu system where every node has four
memory holes, the show_mem() call takes 1 hour 18 minutes. With this
patch, it takes less than 3 seconds.
Signed-off-by: Robin Holt <holt@sgi.com>
---
Fixed: the hole search is now factored out into find_next_valid_pfn_for_pgdat(), as Andreas suggested.
Index: linux-2.6/arch/ia64/mm/discontig.c
===================================================================
--- linux-2.6.orig/arch/ia64/mm/discontig.c 2006-04-13 06:16:00.500029306 -0500
+++ linux-2.6/arch/ia64/mm/discontig.c 2006-04-13 07:01:31.668069891 -0500
@@ -519,6 +519,69 @@ void __cpuinit *per_cpu_init(void)
}
#endif /* CONFIG_SMP */
+
+static int find_next_valid_pfn_for_pgdat(pg_data_t *pgdat, int i)
+{
+ unsigned long end_address, hole_next_pfn;
+ unsigned long stop_address;
+
+ end_address = (unsigned long) &vmem_map[pgdat->node_start_pfn + i];
+ end_address = PAGE_ALIGN(end_address);
+
+ stop_address = (unsigned long) &vmem_map[
+ pgdat->node_start_pfn + pgdat->node_spanned_pages];
+
+ do {
+ pgd_t *pgd;
+ pud_t *pud;
+ pmd_t *pmd;
+ pte_t *pte;
+
+ pgd = pgd_offset_k(end_address);
+ if (pgd_none(*pgd)) {
+ end_address += PTRS_PER_PUD *
+ PTRS_PER_PMD *
+ PTRS_PER_PTE *
+ PAGE_SIZE;
+ continue;
+ }
+
+ pud = pud_offset(pgd, end_address);
+ if (pud_none(*pud)) {
+ end_address += PTRS_PER_PMD *
+ PTRS_PER_PTE *
+ PAGE_SIZE;
+ continue;
+ }
+
+ pmd = pmd_offset(pud, end_address);
+ if (pmd_none(*pmd)) {
+ end_address += PTRS_PER_PTE *
+ PAGE_SIZE;
+ continue;
+ }
+
+ pte = pte_offset_kernel(pmd, end_address);
+retry_pte:
+ if (pte_none(*pte)) {
+ end_address += PAGE_SIZE;
+ pte++;
+ if ((end_address < stop_address) &&
+ (end_address != ALIGN(end_address, 1UL << PMD_SHIFT)))
+ goto retry_pte;
+ continue;
+ }
+ /* Found next valid vmem_map page */
+ break;
+ } while (end_address < stop_address);
+
+ end_address = min(end_address, stop_address);
+ end_address = end_address - (unsigned long) vmem_map + sizeof(struct page) - 1;
+ hole_next_pfn = end_address / sizeof(struct page);
+ return hole_next_pfn - pgdat->node_start_pfn;
+}
+
+
/**
* show_mem - give short summary of memory stats
*
@@ -547,8 +610,10 @@ void show_mem(void)
struct page *page;
if (pfn_valid(pgdat->node_start_pfn + i))
page = pfn_to_page(pgdat->node_start_pfn + i);
- else
+ else {
+ i = find_next_valid_pfn_for_pgdat(pgdat, i) - 1;
continue;
+ }
if (PageReserved(page))
reserved++;
else if (PageSwapCache(page))
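
The last step of find_next_valid_pfn_for_pgdat() turns a byte offset into vmem_map into a page index by adding sizeof(struct page) - 1 before dividing, which rounds up to the first struct page that starts at or after that offset. A minimal sketch of that arithmetic follows; the 56-byte entry size and the names toy_offset_to_index and TOY_STRUCT_PAGE_SIZE are assumptions made for illustration, since the real sizeof(struct page) depends on the kernel configuration.

```c
/* Hypothetical stand-in for sizeof(struct page). */
#define TOY_STRUCT_PAGE_SIZE 56UL

/*
 * Convert a byte offset into the map to the index of the first entry that
 * starts at or after that offset (ceiling division).
 */
static unsigned long toy_offset_to_index(unsigned long offset)
{
	return (offset + TOY_STRUCT_PAGE_SIZE - 1) / TOY_STRUCT_PAGE_SIZE;
}
```

Under these assumed sizes, an offset of 16384 (one 16 KB page into the map) falls inside entry 292, which starts at byte 16352, so the function returns 293, the first entry lying wholly at or beyond that offset.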
* Re: Make show_mem() skip holes in a pgdat.
2006-04-13 3:15 Make show_mem() skip holes in a pgdat Robin Holt
2006-04-13 8:05 ` Andreas Schwab
2006-04-13 13:14 ` Robin Holt
@ 2006-04-13 16:04 ` Bob Picco
2006-04-13 16:36 ` Chen, Kenneth W
3 siblings, 0 replies; 5+ messages in thread
From: Bob Picco @ 2006-04-13 16:04 UTC (permalink / raw)
To: linux-ia64
Robin Holt wrote: [Thu Apr 13 2006, 09:14:42AM EDT]
>
> This patch modifies ia64's show_mem() to walk the vmem_map page tables and
> rapidly skip forward across regions where the page tables are missing.
> This prevents the pfn_valid() check from causing numerous unnecessary
> page faults.
>
> Without this patch on a 512 node 512 cpu system where every node has four
> memory holes, the show_mem() call takes 1 hour 18 minutes. With this
> patch, it takes less than 3 seconds.
>
> Signed-off-by: Robin Holt <holt@sgi.com>
>
> ---
> Fixed.
>
>
> Index: linux-2.6/arch/ia64/mm/discontig.c
> ===================================================================
> --- linux-2.6.orig/arch/ia64/mm/discontig.c 2006-04-13 06:16:00.500029306 -0500
> +++ linux-2.6/arch/ia64/mm/discontig.c 2006-04-13 07:01:31.668069891 -0500
> @@ -519,6 +519,69 @@ void __cpuinit *per_cpu_init(void)
> }
> #endif /* CONFIG_SMP */
How about these changes to fix SPARSEMEM?
>
> +
#ifdef CONFIG_VIRTUAL_MEM_MAP
> +static int find_next_valid_pfn_for_pgdat(pg_data_t *pgdat, int i)
> +{
> + unsigned long end_address, hole_next_pfn;
> + unsigned long stop_address;
> +
> + end_address = (unsigned long) &vmem_map[pgdat->node_start_pfn + i];
> + end_address = PAGE_ALIGN(end_address);
> +
> + stop_address = (unsigned long) &vmem_map[
> + pgdat->node_start_pfn + pgdat->node_spanned_pages];
> +
> + do {
> + pgd_t *pgd;
> + pud_t *pud;
> + pmd_t *pmd;
> + pte_t *pte;
> +
> + pgd = pgd_offset_k(end_address);
> + if (pgd_none(*pgd)) {
> + end_address += PTRS_PER_PUD *
> + PTRS_PER_PMD *
> + PTRS_PER_PTE *
> + PAGE_SIZE;
> + continue;
> + }
> +
> + pud = pud_offset(pgd, end_address);
> + if (pud_none(*pud)) {
> + end_address += PTRS_PER_PMD *
> + PTRS_PER_PTE *
> + PAGE_SIZE;
> + continue;
> + }
> +
> + pmd = pmd_offset(pud, end_address);
> + if (pmd_none(*pmd)) {
> + end_address += PTRS_PER_PTE *
> + PAGE_SIZE;
> + continue;
> + }
> +
> + pte = pte_offset_kernel(pmd, end_address);
> +retry_pte:
> + if (pte_none(*pte)) {
> + end_address += PAGE_SIZE;
> + pte++;
> + if ((end_address < stop_address) &&
> + (end_address != ALIGN(end_address, 1UL << PMD_SHIFT)))
> + goto retry_pte;
> + continue;
> + }
> + /* Found next valid vmem_map page */
> + break;
> + } while (end_address < stop_address);
> +
> + end_address = min(end_address, stop_address);
> + end_address = end_address - (unsigned long) vmem_map + sizeof(struct page) - 1;
> + hole_next_pfn = end_address / sizeof(struct page);
> + return hole_next_pfn - pgdat->node_start_pfn;
> +}
#else
static inline int find_next_valid_pfn_for_pgdat(pg_data_t *pgdat, int i)
{
return i + 1;
}
#endif
With this stub, the call below should optimize out for SPARSEMEM.
bob
> +
> +
> /**
> * show_mem - give short summary of memory stats
> *
> @@ -547,8 +610,10 @@ void show_mem(void)
> struct page *page;
> if (pfn_valid(pgdat->node_start_pfn + i))
> page = pfn_to_page(pgdat->node_start_pfn + i);
> - else
> + else {
> + i = find_next_valid_pfn_for_pgdat(pgdat, i) - 1;
> continue;
> + }
> if (PageReserved(page))
> reserved++;
> else if (PageSwapCache(page))
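
To see why a stub that returns i + 1 leaves the show_mem() loop unchanged, here is a small userspace sketch. It assumes the caller pattern from the patch (assign the return value minus one, then continue so the loop's own increment runs); toy_next_valid_index and toy_count_visits are hypothetical names used only for this illustration.

```c
/* Hypothetical stub mirroring the #else branch: nothing to skip, step by one. */
static inline int toy_next_valid_index(int i)
{
	return i + 1;
}

/* Count loop iterations over [0, n) when every odd index is treated as invalid. */
static int toy_count_visits(int n)
{
	int visits = 0;

	for (int i = 0; i < n; i++) {
		visits++;
		if (i % 2) {
			/* The loop's i++ undoes the "- 1", landing on the returned index. */
			i = toy_next_valid_index(i) - 1;
			continue;
		}
	}
	return visits;
}
```

Because the stub's result minus one is just i again, the assignment is a no-op and every index is still visited once, which is why a compiler can be expected to drop it.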
* RE: Make show_mem() skip holes in a pgdat.
2006-04-13 3:15 Make show_mem() skip holes in a pgdat Robin Holt
` (2 preceding siblings ...)
2006-04-13 16:04 ` Bob Picco
@ 2006-04-13 16:36 ` Chen, Kenneth W
3 siblings, 0 replies; 5+ messages in thread
From: Chen, Kenneth W @ 2006-04-13 16:36 UTC (permalink / raw)
To: linux-ia64
Robin Holt wrote on Wednesday, April 12, 2006 8:15 PM
> This patch modifies ia64's show_mem() to walk the vmem_map page tables and
> rapidly skip forward across regions where the page tables are missing.
> This prevents the pfn_valid() check from causing numerous unnecessary
> page faults.
>
> Without this patch on a 512 node 512 cpu system where every node has four
> memory holes, the show_mem() call takes 1 hour 18 minutes. With this
> patch, it takes less than 3 seconds.
If you are going to respin another rev, please consider the following.
No biggie, just some cosmetic stuff.
> + pgd = pgd_offset_k(end_address);
> + if (pgd_none(*pgd)) {
> + end_address += PTRS_PER_PUD *
> + PTRS_PER_PMD *
> + PTRS_PER_PTE *
> + PAGE_SIZE;
end_address += PGDIR_SIZE;
> + pud = pud_offset(pgd, end_address);
> + if (pud_none(*pud)) {
> + end_address += PTRS_PER_PMD *
> + PTRS_PER_PTE *
> + PAGE_SIZE;
end_address += PUD_SIZE;
> +
> + pmd = pmd_offset(pud, end_address);
> + if (pmd_none(*pmd)) {
> + end_address += PTRS_PER_PTE *
> + PAGE_SIZE;
end_address += PMD_SIZE;
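
The substitutions above rely on each *_SIZE constant being equal to the corresponding PTRS_PER_* product. A quick userspace check of that identity is sketched below. It assumes 16 KB pages, 8-byte entries, and four table levels that each fill exactly one page; the TOY_* names and toy_*_span helpers are made up for this example and are not the kernel's definitions.

```c
/* Assumed layout: 16 KB pages, 8-byte entries, one page per table level. */
#define TOY_PAGE_SHIFT   14
#define TOY_PAGE_SIZE    (1ULL << TOY_PAGE_SHIFT)
#define TOY_PTRS_SHIFT   (TOY_PAGE_SHIFT - 3)	/* entries per table = page size / 8 */
#define TOY_PTRS_PER_PTE (1ULL << TOY_PTRS_SHIFT)
#define TOY_PTRS_PER_PMD (1ULL << TOY_PTRS_SHIFT)
#define TOY_PTRS_PER_PUD (1ULL << TOY_PTRS_SHIFT)

#define TOY_PMD_SHIFT    (TOY_PAGE_SHIFT + TOY_PTRS_SHIFT)
#define TOY_PUD_SHIFT    (TOY_PMD_SHIFT + TOY_PTRS_SHIFT)
#define TOY_PGDIR_SHIFT  (TOY_PUD_SHIFT + TOY_PTRS_SHIFT)
#define TOY_PMD_SIZE     (1ULL << TOY_PMD_SHIFT)
#define TOY_PUD_SIZE     (1ULL << TOY_PUD_SHIFT)
#define TOY_PGDIR_SIZE   (1ULL << TOY_PGDIR_SHIFT)

/* Bytes covered by one pmd entry, written as the product used in the patch. */
static unsigned long long toy_pmd_span(void)
{
	return TOY_PTRS_PER_PTE * TOY_PAGE_SIZE;
}

/* Bytes covered by one pud entry. */
static unsigned long long toy_pud_span(void)
{
	return TOY_PTRS_PER_PMD * TOY_PTRS_PER_PTE * TOY_PAGE_SIZE;
}

/* Bytes covered by one pgd entry. */
static unsigned long long toy_pgd_span(void)
{
	return TOY_PTRS_PER_PUD * TOY_PTRS_PER_PMD * TOY_PTRS_PER_PTE *
	       TOY_PAGE_SIZE;
}
```

Under these assumptions each product matches its *_SIZE counterpart, so the shorter spelling changes readability only, not the distance skipped.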