linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Jon Tollefson <kniht@linux.vnet.ibm.com>
To: libhugetlbfs-devel@lists.sourceforge.net,
	linuxppc-dev@ozlabs.org,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>
Subject: Re: Buglet in 16G page handling
Date: Tue, 02 Sep 2008 12:12:27 -0500	[thread overview]
Message-ID: <48BD73FB.1050601@linux.vnet.ibm.com> (raw)
In-Reply-To: <20080902050510.GB12965@yookeroo.seuss>

David Gibson wrote:
> When BenH and I were looking at the new code for handling 16G pages,
> we noticed a small bug.  It doesn't actually break anything user
> visible, but it's certainly not the way things are supposed to be.
> The 16G patches didn't update the huge_pte_offset() and
> huge_pte_alloc() functions, which means that the hugepte tables for
> 16G pages will be allocated much further down the page table tree than
> they should be - allocating several levels of page table with a single
> entry in them along the way.
>
> The patch below is supposed to fix this, cleaning up the existing
> handling of 64k vs 16M pages while its at it.  However, it needs some
> testing.
>
> I've checked that it doesn't break existing 16M support, either with
> 4k or 64k base pages.  I haven't figured out how to test with 64k
> pages yet, at least until the multisize support goes into
> libhugetlbfs.  For 16G pages, I just don't have access to a machine
> with enough memory to test.  Jon, presumably you must have found such
> a machine when you did the 16G page support in the first place.  Do
> you still have access, and can you test this patch?
>   
I do have access to a machine to test it.  I applied the patch to -rc4
and used a pseries_defconfig.  I boot with
default_hugepagesz=16G... in order to test huge page sizes other then
16M at this point.

Running the libhugetlbfs test suite it gets as far as   Readback (64):  
PASS
before it hits the following program check.

kernel BUG at arch/powerpc/mm/hugetlbpage.c:98!
cpu 0x0: Vector: 700 (Program Check) at [c0000002843db580]
    pc: c000000000035ff4: .free_hugepte_range+0x2c/0x7c
    lr: c000000000036af0: .hugetlb_free_pgd_range+0x2c0/0x398
    sp: c0000002843db800
   msr: 8000000000029032
  current = 0xc00000028417a2a0
  paca    = 0xc0000000008d4300
    pid   = 3334, comm = readback
kernel BUG at arch/powerpc/mm/hugetlbpage.c:98!
enter ? for help
[c0000002843db880] c000000000036af0 .hugetlb_free_pgd_range+0x2c0/0x398
[c0000002843db980] c0000000000da224 .free_pgtables+0x98/0x140
[c0000002843dba40] c0000000000dc4d8 .exit_mmap+0x13c/0x22c
[c0000002843dbb00] c00000000005b218 .mmput+0x78/0x148
[c0000002843dbba0] c000000000060528 .exit_mm+0x164/0x18c
[c0000002843dbc50] c000000000062718 .do_exit+0x2e8/0x858
[c0000002843dbd10] c000000000062d24 .do_group_exit+0x9c/0xd0
[c0000002843dbdb0] c000000000062d74 .sys_exit_group+0x1c/0x30
[c0000002843dbe30] c0000000000086d4 syscall_exit+0x0/0x40
--- Exception: c00 (System Call) at 000000802db7a530
SP (fffffa6e290) is in userspace


Line 98 appears to be this BUG_ON

static inline pte_t *hugepd_page(hugepd_t hpd)
{
        BUG_ON(!(hpd.pd & HUGEPD_OK));


Jon

> Index: working-2.6/arch/powerpc/mm/hugetlbpage.c
> ===================================================================
> --- working-2.6.orig/arch/powerpc/mm/hugetlbpage.c	2008-09-02 13:39:52.000000000 +1000
> +++ working-2.6/arch/powerpc/mm/hugetlbpage.c	2008-09-02 14:08:56.000000000 +1000
> @@ -128,29 +128,37 @@ static int __hugepte_alloc(struct mm_str
>  	return 0;
>  }
>
> -/* Base page size affects how we walk hugetlb page tables */
> -#ifdef CONFIG_PPC_64K_PAGES
> -#define hpmd_offset(pud, addr, h)	pmd_offset(pud, addr)
> -#define hpmd_alloc(mm, pud, addr, h)	pmd_alloc(mm, pud, addr)
> -#else
> -static inline
> -pmd_t *hpmd_offset(pud_t *pud, unsigned long addr, struct hstate *hstate)
> +
> +static pud_t *hpud_offset(pgd_t *pgd, unsigned long addr, struct hstate *hstate)
> +{
> +	if (huge_page_shift(hstate) < PUD_SHIFT)
> +		return pud_offset(pgd, addr);
> +	else
> +		return (pud_t *) pgd;
> +}
> +static pud_t *hpud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long addr,
> +			 struct hstate *hstate)
>  {
> -	if (huge_page_shift(hstate) == PAGE_SHIFT_64K)
> +	if (huge_page_shift(hstate) < PUD_SHIFT)
> +		return pud_alloc(mm, pgd, addr);
> +	else
> +		return (pud_t *) pgd;
> +}
> +static pmd_t *hpmd_offset(pud_t *pud, unsigned long addr, struct hstate *hstate)
> +{
> +	if (huge_page_shift(hstate) < PMD_SHIFT)
>  		return pmd_offset(pud, addr);
>  	else
>  		return (pmd_t *) pud;
>  }
> -static inline
> -pmd_t *hpmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long addr,
> -		  struct hstate *hstate)
> +static pmd_t *hpmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long addr,
> +			 struct hstate *hstate)
>  {
> -	if (huge_page_shift(hstate) == PAGE_SHIFT_64K)
> +	if (huge_page_shift(hstate) < PMD_SHIFT)
>  		return pmd_alloc(mm, pud, addr);
>  	else
>  		return (pmd_t *) pud;
>  }
> -#endif
>
>  /* Build list of addresses of gigantic pages.  This function is used in early
>   * boot before the buddy or bootmem allocator is setup.
> @@ -204,7 +212,7 @@ pte_t *huge_pte_offset(struct mm_struct 
>
>  	pg = pgd_offset(mm, addr);
>  	if (!pgd_none(*pg)) {
> -		pu = pud_offset(pg, addr);
> +		pu = hpud_offset(pg, addr, hstate);
>  		if (!pud_none(*pu)) {
>  			pm = hpmd_offset(pu, addr, hstate);
>  			if (!pmd_none(*pm))
> @@ -233,7 +241,7 @@ pte_t *huge_pte_alloc(struct mm_struct *
>  	addr &= hstate->mask;
>
>  	pg = pgd_offset(mm, addr);
> -	pu = pud_alloc(mm, pg, addr);
> +	pu = hpud_alloc(mm, pg, addr, hstate);
>
>  	if (pu) {
>  		pm = hpmd_alloc(mm, pu, addr, hstate);
>
>
>   

  parent reply	other threads:[~2008-09-02 17:12 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-09-02  5:05 Buglet in 16G page handling David Gibson
2008-09-02 12:44 ` [Libhugetlbfs-devel] " Mel Gorman
2008-09-02 16:25   ` Nishanth Aravamudan
2008-09-02 21:05   ` Benjamin Herrenschmidt
2008-09-02 22:16     ` Jon Tollefson
2008-09-02 22:53       ` Benjamin Herrenschmidt
2008-09-03 14:11         ` Jon Tollefson
2008-09-02 17:12 ` Jon Tollefson [this message]
2008-09-03  0:20   ` David Gibson
2008-09-03 22:19     ` Jon Tollefson
2008-09-04  6:22       ` David Gibson
2008-09-04 21:08       ` Jon Tollefson
2008-09-05  1:36         ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48BD73FB.1050601@linux.vnet.ibm.com \
    --to=kniht@linux.vnet.ibm.com \
    --cc=benh@kernel.crashing.org \
    --cc=libhugetlbfs-devel@lists.sourceforge.net \
    --cc=linuxppc-dev@ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).