public inbox for linux-ia64@vger.kernel.org
* [PATCH] - deleting huge pages
@ 2004-05-02 12:30 Jack Steiner
  2004-05-02 18:33 ` Chen, Kenneth W
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Jack Steiner @ 2004-05-02 12:30 UTC (permalink / raw)
  To: linux-ia64


I found this problem in 2.4.21, but AFAICT the same problem
exists in 2.6.5.

If you attempt to allocate a LOT more huge pages than are physically available,
the kernel may reference invalid PGDs or PMDs. 

Here is the 2.4 backtrace of a failure. If the mmap fails, do_mmap_pgoff attempts to
unmap the vma range it was mapping. Depending on where it failed during
the mmap, some of the higher level PGD/PMDs may not have been assigned.

The bug (at least in 2.4) exists on all platforms but on our platform
attempts to dereference NULL pointers usually cause MCAs. (If a platform
has zeros in page 0, you may be lucky & the code would appear to work,
but it is still a bug).

	Stack traceback for pid 6817
	0xe00025307ba50000     6817     6663  0  148   D  0xe00025307ba50420  toy
	0xe00000000445e180 unmap_hugepage_range+0x160  << mca surfaced here
	0xe00000000445e300 zap_hugepage_range+0x80
	0xe00000000452dbc0 do_mmap_pgoff+0xea0
	0xe000000004432910 sys_mmap+0x210
	0xe00000000440e2a0 ia64_ret_from_syscall

The MCA was caused by the NULL pmd dereference in huge_pte_offset. The
MCA doesn't surface until the bad data is consumed.

A patch against 2.6.5:



Index: linux/arch/ia64/mm/hugetlbpage.c
===================================================================
--- linux.orig/arch/ia64/mm/hugetlbpage.c	2004-05-01 20:51:52.000000000 -0500
+++ linux/arch/ia64/mm/hugetlbpage.c	2004-05-01 20:51:54.000000000 -0500
@@ -111,9 +111,16 @@
 	pte_t *pte = NULL;
 
 	pgd = pgd_offset(mm, taddr);
+	if (pgd_none(*pgd) || pgd_bad(*pgd))
+		goto out;
 	pmd = pmd_offset(pgd, taddr);
+	if (pmd_none(*pmd) || pmd_bad(*pmd))
+		goto out;
 	pte = pte_offset_map(pmd, taddr);
 	return pte;
+
+out:
+	return 0;
 }
 
 #define mk_pte_huge(entry) { pte_val(entry) |= _PAGE_P; }
@@ -331,7 +338,7 @@
 
 	for (address = start; address < end; address += HPAGE_SIZE) {
 		pte = huge_pte_offset(mm, address);
-		if (pte_none(*pte))
+		if (!pte || pte_none(*pte))
 			continue;
 		page = pte_page(*pte);
 		huge_page_release(page);
-- 
Thanks

Jack Steiner (steiner@sgi.com)          651-683-5302
Principal Engineer                      SGI - Silicon Graphics, Inc.




Thread overview: 5+ messages
2004-05-02 12:30 [PATCH] - deleting huge pages Jack Steiner
2004-05-02 18:33 ` Chen, Kenneth W
2004-05-03 14:53 ` Jack Steiner
2004-05-03 17:12 ` Chen, Kenneth W
2004-05-03 19:47 ` Jack Steiner
