linuxppc-dev.lists.ozlabs.org archive mirror
* hugetlbfs for ppc440 - kernel BUG
From: Satya @ 2007-07-10 18:38 UTC
  To: linuxppc-dev; +Cc: kazutomo, edi, david

hello,
I am trying to implement hugetlbfs on the IBM Bluegene/L IO node
(ppc440) and I have a big problem as well as a few questions to ask
the group. I patched a 2.6.21.6 linux kernel (manually) with Edi
Shmueli's hugetlbfs implementation (found here:
http://patchwork.ozlabs.org/linuxppc/patch?id=8427) for this. I did
have to make slight changes (described at the end) to make it work.
My test program is a shortened version of the System V shared memory
example described in Documentation/vm/hugetlbpage.txt.

I get the following kernel BUG when a page fault occurs on a huge page address:
BUG: scheduling while atomic: shmtest2/0x10000001/1291
Call Trace:
[CFF0BCE0] [C00084F4] show_stack+0x4c/0x194 (unreliable)
[CFF0BD20] [C01A53C4] schedule+0x664/0x668
[CFF0BD60] [C00175F8] __cond_resched+0x24/0x50
[CFF0BD80] [C01A5A6C] cond_resched+0x50/0x58
[CFF0BD90] [C005A31C] clear_huge_page+0x28/0x174
[CFF0BDC0] [C005B360] hugetlb_no_page+0xb4/0x220
[CFF0BE00] [C005B5BC] hugetlb_fault+0xf0/0xf4
[CFF0BE30] [C0052AC0] __handle_mm_fault+0x3a8/0x3ac
[CFF0BE70] [C00094A0] do_page_fault+0x118/0x428
[CFF0BF40] [C0002360] handle_page_fault+0xc/0x80
BUG: scheduling while atomic: shmtest2/0x10000001/1291
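
For readers decoding the BUG line: the middle field (0x10000001 here) is the value of preempt_count() at the time of the check. Below is a minimal sketch of splitting that value into its parts. It assumes the 2.6.21-era layout (8 preempt bits, 8 softirq bits, 12 hardirq bits, and PREEMPT_ACTIVE = 0x10000000 on ppc), so the masks should be checked against include/linux/hardirq.h and asm/thread_info.h of the kernel actually in use. The struct and function names are hypothetical, not kernel APIs.

```c
#include <stdio.h>

/* Assumed 2.6.21-era preempt_count layout; verify against the kernel
 * headers of the tree being debugged. */
#define PREEMPT_MASK    0x000000ffUL  /* preempt_disable() nesting depth */
#define SOFTIRQ_MASK    0x0000ff00UL  /* softirq nesting */
#define HARDIRQ_MASK    0x0fff0000UL  /* hardirq nesting */
#define PREEMPT_ACTIVE  0x10000000UL  /* set while a preemption is in progress */

/* Hypothetical helper type: the decoded fields of a preempt_count value. */
struct preempt_fields {
        unsigned long preempt_depth;
        unsigned long softirq_count;
        unsigned long hardirq_count;
        int preempt_active;
};

/* Split a raw preempt_count value into its fields. */
struct preempt_fields decode_preempt_count(unsigned long count)
{
        struct preempt_fields f;

        f.preempt_depth  = count & PREEMPT_MASK;
        f.softirq_count  = (count & SOFTIRQ_MASK) >> 8;
        f.hardirq_count  = (count & HARDIRQ_MASK) >> 16;
        f.preempt_active = (count & PREEMPT_ACTIVE) != 0;
        return f;
}

/* Print the decoded fields in a readable form. */
void print_preempt_fields(unsigned long count)
{
        struct preempt_fields f = decode_preempt_count(count);

        printf("preempt depth: %lu, softirq: %lu, hardirq: %lu, PREEMPT_ACTIVE: %d\n",
               f.preempt_depth, f.softirq_count, f.hardirq_count,
               f.preempt_active);
}
```

Under these assumptions, decode_preempt_count(0x10000001UL) gives a preempt depth of 1 with no softirq or hardirq nesting. That would mean something on the fault path disabled preemption once and had not re-enabled it when cond_resched() ran, rather than the fault having been taken in interrupt context.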

Now for my questions:

1. Can the kernel really reschedule in a page fault handler context?

2. To find where this "scheduling while atomic" bug arises, I put
schedule() calls at various places along the path of the stack trace
shown above.
I found that a call to pte_alloc_map() puts the kernel in a context
where it cannot reschedule without triggering this BUG. Here is the
call path involved:

__handle_mm_fault -> hugetlb_fault -> huge_pte_alloc() -> pte_alloc_map()

A call to schedule() anywhere before pte_alloc_map() does not trigger this
error. This might be a flawed experiment, though; I am no expert kernel
hacker. Does this shed any light on the problem?
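
One reading that would be consistent with this experiment is that pte_alloc_map() maps the PTE page with kmap_atomic(), which raises the preempt count until the matching pte_unmap(). This is an assumption and has not been verified against this patched tree. If it holds, any reschedule between the map and the unmap would be flagged. The user-space model below only illustrates that ordering; fake_pte_map(), fake_pte_unmap(), fake_resched_would_bug() and fault_path_would_bug() are hypothetical stand-ins, not kernel functions.

```c
#include <stdbool.h>

/* Toy stand-in for the per-task preempt count. */
static int preempt_count_model;

/* Models an atomic PTE mapping: preemption is disabled until unmap. */
void fake_pte_map(void)
{
        preempt_count_model++;
}

/* Models the matching unmap: preemption is re-enabled. */
void fake_pte_unmap(void)
{
        preempt_count_model--;
}

/* True when a reschedule at this point would be reported as
 * "scheduling while atomic". */
bool fake_resched_would_bug(void)
{
        return preempt_count_model != 0;
}

/* Models the fault path: map the PTE, then reach the long-running
 * clear step (which may reschedule) either before or after unmapping. */
bool fault_path_would_bug(bool unmap_before_clear)
{
        bool bug;

        preempt_count_model = 0;
        fake_pte_map();
        if (unmap_before_clear)
                fake_pte_unmap();

        /* Stand-in for clear_huge_page() -> cond_resched(). */
        bug = fake_resched_would_bug();

        if (!unmap_before_clear)
                fake_pte_unmap();
        return bug;
}
```

In this model, fault_path_would_bug(false) reports the bug and fault_path_would_bug(true) does not, which matches the observation that schedule() is only a problem after pte_alloc_map().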

Here are the modifications I made to Edi's patch:

arch/ppc/mm/hugetlbpage.c
struct page *
follow_huge_addr(struct mm_struct *mm, unsigned long address, int write)
{
  pte_t *pte;
  struct page *page;
+  struct vm_area_struct *vma;
+
+  vma = find_vma(mm, address);
+  if (!vma || !is_vm_hugetlb_page(vma))
+    return ERR_PTR(-EINVAL);

  pte = huge_pte_offset(mm, address);
  page = pte_page(*pte);
  return page;
}

+int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep)
+{
+        return 0;
+}

Here is my test program:

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/mman.h>

#ifndef SHM_HUGETLB
#define SHM_HUGETLB 04000
#endif

#define LENGTH (16UL*1024*1024)

#define dprintf(x)  printf(x)

#define ADDR (void *)(0x0UL)
#define SHMAT_FLAGS (0)


int main(void)
{
        int shmid;
        unsigned long i;
        char *shmaddr;

        if ((shmid = shmget(2, LENGTH,
                            SHM_HUGETLB | IPC_CREAT | SHM_R | SHM_W)) < 0) {
                perror("shmget");
                exit(1);
        }
        printf("shmid: 0x%x\n", shmid);

        shmaddr = shmat(shmid, ADDR, SHMAT_FLAGS);
        if (shmaddr == (char *)-1) {
                perror("Shared memory attach failure");
                shmctl(shmid, IPC_RMID, NULL);
                exit(2);
        }
        printf("shmaddr: %p\n", shmaddr);
        printf("touching a huge page..\n");

        shmaddr[0]='a';
        shmaddr[1]='b';

        if (shmdt((const void *)shmaddr) != 0) {
                perror("Detach failure");
                shmctl(shmid, IPC_RMID, NULL);
                exit(3);
        }

        shmctl(shmid, IPC_RMID, NULL);

        return 0;
}

thanks!
Satya.

* Re: hugetlbfs for ppc440 - kernel BUG
From: Benjamin Herrenschmidt @ 2008-10-21 20:47 UTC
  To: Satya, edi; +Cc: kazutomo, linuxppc-dev, david

On Tue, 2007-07-10 at 13:38 -0500, Satya wrote:
> hello,
> I am trying to implement hugetlbfs on the IBM Bluegene/L IO node
> (ppc440) and I have a big problem as well as a few questions to ask
> the group. I patched a 2.6.21.6 linux kernel (manually) with Edi
> Shmueli's hugetlbfs implementation (found here:
> http://patchwork.ozlabs.org/linuxppc/patch?id=8427) for this. I did
> have to make slight changes (described at the end) to make it work.
> My test program is a shortened version of a sys v shared memory
> example described in Documentation/vm/hugetlbpage.txt

Hi !

The patchwork link unfortunately didn't survive the transition to
patchwork 2.

Do you know the status of hugetlb support for 44x? Is there any
plan to submit it for upstream inclusion?

Cheers,
Ben.

* Re: hugetlbfs for ppc440 - kernel BUG
From: Satya @ 2008-10-21 22:46 UTC
  To: benh; +Cc: kazutomo, linuxppc-dev, edi, david

Ben,
Look here: http://www-unix.mcs.anl.gov/zeptoos/hugepages/

thanks,
./satya

* Re: hugetlbfs for ppc440 - kernel BUG
From: Satya @ 2008-10-21 22:50 UTC
  To: benh; +Cc: kazutomo, linuxppc-dev, edi, david

On Tue, Oct 21, 2008 at 3:46 PM, Satya <satyakiran@gmail.com> wrote:

> Ben,
> Look here: http://www-unix.mcs.anl.gov/zeptoos/hugepages/
>
> thanks,
> ./satya
>
>
> On Tue, Oct 21, 2008 at 1:47 PM, Benjamin Herrenschmidt <
> benh@kernel.crashing.org> wrote:
>
>> On Tue, 2007-07-10 at 13:38 -0500, Satya wrote:
>> > hello,
>> > I am trying to implement hugetlbfs on the IBM Bluegene/L IO node
>> > (ppc440) and I have a big problem as well as a few questions to ask
>> > the group. I patched a 2.6.21.6 linux kernel (manually) with Edi
>> > Shmueli's hugetlbfs implementation (found here:
>> > http://patchwork.ozlabs.org/linuxppc/patch?id=8427) for this. I did
>> > have to make slight changes (described at the end) to make it work.
>> > My test program is a shortened version of a sys v shared memory
>> > example described in Documentation/vm/hugetlbpage.txt
>>
>> Hi !
>>
>> The patchwork link unfortunately didn't survive the transition to
>> patchwork 2.
>>
>> Do you know what's the status of Hugetlb support for 44x ? Is there any
>> plan to release that for upstream inclusion ?
>>
>> Cheers,
>> Ben.
>>
>>
>>

whoops, sorry for top-posting. Here is a patch that worked at that time:
http://www-unix.mcs.anl.gov/zeptoos/hugepages/hugetlbpage_44x.patch

I didn't follow up after this to get it merged upstream. Also, I don't know
whether the hugetlb core has changed to deal with PTEs in high memory.

./satya

* Re: hugetlbfs for ppc440 - kernel BUG
From: Benjamin Herrenschmidt @ 2008-10-21 22:53 UTC
  To: Satya; +Cc: kazutomo, linuxppc-dev, edi, david

On Tue, 2008-10-21 at 15:46 -0700, Satya wrote:
> Ben,
> Look here: http://www-unix.mcs.anl.gov/zeptoos/hugepages/

Thanks. What is the status? Do they work fine? Are they going to be
re-submitted for inclusion?

Cheers,
Ben.


* Re: hugetlbfs for ppc440 - kernel BUG
From: David Gibson @ 2008-10-22  0:12 UTC
  To: Benjamin Herrenschmidt; +Cc: kazutomo, linuxppc-dev, edi, Satya

On Wed, Oct 22, 2008 at 09:53:13AM +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2008-10-21 at 15:46 -0700, Satya wrote:
> > Ben,
> > Look here: http://www-unix.mcs.anl.gov/zeptoos/hugepages/
> 
> Thanks. What is the status ? Do they work fine ? Are they going to be
> re-submitted for inclusion ?

Hrm.  Last I looked at the 440 hugepage patches they appeared to have
several serious bugs (I was surprised they worked at all).  I had
meant to fix them up and push, but I never quite got around to it.
I'll have a look at this link later today.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: hugetlbfs for ppc440 - kernel BUG
From: David Gibson @ 2008-10-23  2:42 UTC
  To: Satya; +Cc: kazutomo, edi, linuxppc-dev

On Tue, Oct 21, 2008 at 03:50:30PM -0700, Satya wrote:
> On Tue, Oct 21, 2008 at 3:46 PM, Satya <satyakiran@gmail.com> wrote:
> 
> > Ben,
> > Look here: http://www-unix.mcs.anl.gov/zeptoos/hugepages/
> >
> > thanks,
> > ./satya
> >
> >
> > On Tue, Oct 21, 2008 at 1:47 PM, Benjamin Herrenschmidt <
> > benh@kernel.crashing.org> wrote:
> >
> >> On Tue, 2007-07-10 at 13:38 -0500, Satya wrote:
> >> > hello,
> >> > I am trying to implement hugetlbfs on the IBM Bluegene/L IO node
> >> > (ppc440) and I have a big problem as well as a few questions to ask
> >> > the group. I patched a 2.6.21.6 linux kernel (manually) with Edi
> >> > Shmueli's hugetlbfs implementation (found here:
> >> > http://patchwork.ozlabs.org/linuxppc/patch?id=8427) for this. I did
> >> > have to make slight changes (described at the end) to make it work.
> >> > My test program is a shortened version of a sys v shared memory
> >> > example described in Documentation/vm/hugetlbpage.txt
> >>
> >> Hi !
> >>
> >> The patchwork link unfortunately didn't survive the transition to
> >> patchwork 2.
> >>
> >> Do you know what's the status of Hugetlb support for 44x ? Is there any
> >> plan to release that for upstream inclusion ?
> >>
> >> Cheers,
> >> Ben.
> >>
> >>
> >>
> 
> whoops, sorry for top-posting. Here is a patch that worked at that time:
> http://www-unix.mcs.anl.gov/zeptoos/hugepages/hugetlbpage_44x.patch
> 
> I didn't follow up after this to get it merged upstream. Also I don't know
> if hugetlb core has changed to deal with PTEs in high memory.

Ok, had a look at this.  It's had some tweaks since I last looked at
the bluegene hugepage/440 patch.  It still has the rather ugly
approach of storing the hugepage PTEs always at the bottom level, and
duplicating them umpteen times (including pointing multiple PMDs at a
single PTE page when the hugepage size exceeds the area mapped by a
PMD).  It also has the most serious bug I remember from the old
version - the DIRTY and ACCESSED handling is completely bogus, because
it doesn't keep the copies of the bits in the many copies of the PTEs
in sync.  Between the TLB miss rewrite that's happened in the meantime
and my patch to handle these from hugetlb_fault() it's at least now
easier to fix this bug.  Also the patch is arch/ppc based.
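
To make the DIRTY/ACCESSED problem concrete, here is a toy sketch of a hugepage PTE that is replicated into several slots. It is not taken from the patch: the toy_pte_t type, the flag values and the function names are all hypothetical, chosen only to show how the copies drift apart when a single slot is updated and how updating every slot keeps them consistent.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy PTE type and flag bits; the values are illustrative only. */
typedef uint64_t toy_pte_t;
#define TOY_PTE_ACCESSED 0x1ULL
#define TOY_PTE_DIRTY    0x2ULL

/* Buggy pattern: only the copy that took the fault gets the flags. */
void toy_set_flags_one(toy_pte_t *copies, size_t idx, toy_pte_t flags)
{
        copies[idx] |= flags;
}

/* Consistent pattern: every replicated copy gets the same flags. */
void toy_set_flags_all(toy_pte_t *copies, size_t ncopies, toy_pte_t flags)
{
        size_t i;

        for (i = 0; i < ncopies; i++)
                copies[i] |= flags;
}

/* Returns 1 if all copies hold the same value, 0 otherwise. */
int toy_copies_in_sync(const toy_pte_t *copies, size_t ncopies)
{
        size_t i;

        for (i = 1; i < ncopies; i++)
                if (copies[i] != copies[0])
                        return 0;
        return 1;
}
```

With the first pattern, code that reads a different copy than the one updated would miss the DIRTY bit, which is the kind of inconsistency described above. A design that stores a single entry per hugepage avoids the question entirely.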

I'll try to sort this out in the near future.  I guess the only big
question is whether it's important to support hugepage sizes < 2M.  For
hugepage sizes >=2M (16M and 256M) we can just make PMD pointers into
hugepage pointers with the addition of a suitable size field, as we do
for 40x.  For page sizes <2M things get more complicated because we
need some sort of second level hugepage tables (which may or may not
be distinct from the ordinary second level tables).
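
The 2M threshold can be checked with a little arithmetic. The sketch below assumes 4K base pages and 8-byte PTEs (neither is stated in this thread), so one PTE page holds 512 entries and one PMD entry covers 2M. The macro and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed configuration: 4K base pages, 8-byte (64-bit) PTEs. */
#define BASE_PAGE_SIZE 4096ULL
#define PTE_SIZE       8ULL

/* Bytes of address space mapped by one PMD entry, i.e. by one full
 * page of PTEs. */
uint64_t pmd_coverage(void)
{
        return (BASE_PAGE_SIZE / PTE_SIZE) * BASE_PAGE_SIZE;
}

/* Number of PMD entries a hugepage of the given size spans.
 * Returns 0 when the hugepage is smaller than one PMD's coverage,
 * meaning it cannot be described by a PMD entry alone and would need
 * some second-level table. */
uint64_t pmds_per_hugepage(uint64_t hugepage_size)
{
        uint64_t coverage = pmd_coverage();

        if (hugepage_size < coverage)
                return 0;
        return hugepage_size / coverage;
}
```

Under these assumptions a 16M hugepage spans 8 PMD entries and a 256M hugepage spans 128, while a 1M hugepage falls below one PMD's coverage and lands in the more complicated case.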

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
