From: Michael Ellerman <michael@ellerman.id.au>
To: Mel Gorman <mel@csn.ul.ie>
Cc: linuxppc-dev@ozlabs.org,
Linus Torvalds <torvalds@linux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [BUG] 2.6.30-rc3: BUG triggered on some hugepage usages
Date: Sat, 25 Apr 2009 01:24:50 +1000
Message-ID: <1240586690.12551.31.camel@localhost>
In-Reply-To: <20090424095116.GB14283@csn.ul.ie>
On Fri, 2009-04-24 at 10:51 +0100, Mel Gorman wrote:
> On Tue, Apr 21, 2009 at 08:27:57PM -0700, Linus Torvalds wrote:
> > Another week, another -rc.
> >
>
> I'm seeing some tests with sysbench+postgres+large pages fail on ppc64
> although a very clear pattern is not forming as to what exactly is
> causing it. However, the libhugetlbfs regression tests (make && make
> func) are triggering the following oops when calling mlock() and so are
> likely related.
>
> ------------[ cut here ]------------
> kernel BUG at arch/powerpc/mm/pgtable.c:243!
> Oops: Exception in kernel mode, sig: 5 [#1]
> SMP NR_CPUS=128 NUMA pSeries
> Modules linked in: dm_snapshot dm_mirror dm_region_hash dm_log qla2xxx
> loop nfnetlink iptable_filter iptable_nat nf_nat ip_tables
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT
> xt_tcpudp xt_limit ipt_LOG xt_pkttype x_tables
> NIP: c00000000002becc LR: c00000000002c02c CTR: 0000000000000000
> REGS: c0000000ea92b4c0 TRAP: 0700 Not tainted (2.6.30-rc3-autokern1)
> MSR: 8000000000029032 <EE,ME,CE,IR,DR> CR: 28000484 XER: 20000020
> TASK = c00000000395b660[7611] 'mlock' THREAD: c0000000ea928000 CPU: 3
> GPR00: 0000000000000001 c0000000ea92b740 c0000000008ea170 c0000000ec7d4980
> GPR04: 000000003f000000 c0000001e2278cf8 0000001900000393 0000000000000001
> GPR08: f000000002bc0000 0000000000000000 0000000000000113 c0000001e2278c81
> GPR12: 0000000044000482 c00000000093b880 0000000028004422 0000000000000000
> GPR16: c0000000ea92bbf0 c0000000009f06f0 0000001900000113 c0000000ec7d4980
> GPR20: 0000000000000000 f000000002bc0000 000000003f000000 c0000001e2278cf8
> GPR24: c0000000eaa90bb0 0000000000000000 c0000000eaa90bb0 c0000000ea928000
> GPR28: f000000002bc0000 0000001900000393 0000000000000001 c0000001e2278cf8
> NIP [c00000000002becc] .assert_pte_locked+0x54/0x8c
> LR [c00000000002c02c] .ptep_set_access_flags+0x50/0x8c
> Call Trace:
> [c0000000ea92b740] [c0000000eaa90bb0] 0xc0000000eaa90bb0 (unreliable)
> [c0000000ea92b7d0] [c0000000000ed1b0] .hugetlb_cow+0xd4/0x654
> [c0000000ea92b900] [c0000000000edbf0] .hugetlb_fault+0x4c0/0x708
> [c0000000ea92b9f0] [c0000000000ee890] .follow_hugetlb_page+0x174/0x364
> [c0000000ea92bae0] [c0000000000d8d30] .__get_user_pages+0x288/0x4c0
> [c0000000ea92bbb0] [c0000000000da10c] .make_pages_present+0xa0/0xe0
> [c0000000ea92bc40] [c0000000000db758] .mlock_fixup+0x90/0x228
> [c0000000ea92bd00] [c0000000000dbb38] .do_mlock+0xc4/0x128
> [c0000000ea92bda0] [c0000000000dbccc] .SyS_mlock+0xb0/0xec
> [c0000000ea92be30] [c00000000000852c] syscall_exit+0x0/0x40
> Instruction dump:
> 0b000000 78892662 79291f24 7d69582a 7d600074 7800d182 0b000000 78895e62
> 79291f24 7d29582a 7d200074 7800d182 <0b000000> 3c004000 3960ffff
> 780007c6
> ---[ end trace 36a7faa04fa9452b ]---
>
> This corresponds to
>
> #ifdef CONFIG_DEBUG_VM
> void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
> {
>         pgd_t *pgd;
>         pud_t *pud;
>         pmd_t *pmd;
>
>         if (mm == &init_mm)
>                 return;
>         pgd = mm->pgd + pgd_index(addr);
>         BUG_ON(pgd_none(*pgd));
>         pud = pud_offset(pgd, addr);
>         BUG_ON(pud_none(*pud));
>         pmd = pmd_offset(pud, addr);
>         BUG_ON(!pmd_present(*pmd)); <----- THIS LINE
>         BUG_ON(!spin_is_locked(pte_lockptr(mm, pmd)));
> }
> #endif /* CONFIG_DEBUG_VM */
>
> This area was last changed by commit 8d30c14cab30d405a05f2aaceda1e9ad57800f36
> in the 2.6.30-rc1 timeframe. I think there was another hugepage-related
> problem with this patch but I can't remember what it was.
It broke modules, but I don't remember anything hugepage related.
So the code changed from:
-#define ptep_set_access_flags(__vma, __address, __ptep, __entry, __dirty) \
-({ \
-        int __changed = !pte_same(*(__ptep), __entry); \
-        if (__changed) { \
-                __ptep_set_access_flags(__ptep, __entry, __dirty); \
-                flush_tlb_page_nohash(__vma, __address); \
-        } \
-        __changed; \
-})

to:

+int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
+                          pte_t *ptep, pte_t entry, int dirty)
+{
+        int changed;
+        if (!dirty && pte_need_exec_flush(entry, 0))
+                entry = do_dcache_icache_coherency(entry);
+        changed = !pte_same(*(ptep), entry);
+        if (changed) {
+                assert_pte_locked(vma->vm_mm, address);
+                __ptep_set_access_flags(ptep, entry);
+                flush_tlb_page_nohash(vma, address);
+        }
+        return changed;
+}
So the call to assert_pte_locked() is new. And it's never going to work
for huge pages; the page table structure is different, right? Notice that
pte_update() checks (arch/powerpc/include/asm/pgtable-ppc64.h):

        /* huge pages use the old page table lock */
        if (!huge)
                assert_pte_locked(mm, addr);

But unlike pte_update(), ptep_set_access_flags() has no way of knowing
it's been called from huge_ptep_set_access_flags().
So my guess is we either remove the call to assert_pte_locked() in
there, or have assert_pte_locked() check whether it's being called for a
huge pte.
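
To make the second option a bit more concrete, here is a rough user-space
sketch of the idea, not kernel code and not a proposed patch. Everything
named model_* is made up for illustration: struct model_pmd stands in for
whatever the pmd-level walk finds, and a false return stands in for the
places where the real assert_pte_locked() would hit BUG_ON(). The only
point is that a caller which knows it is handling a huge pte skips the
normal-page check, the same way pte_update() does with its huge argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for what the pmd-level walk sees. */
struct model_pmd {
        bool present;        /* models pmd_present(*pmd) */
        bool pte_lock_held;  /* models spin_is_locked(pte_lockptr(mm, pmd)) */
};

/*
 * Models assert_pte_locked(): returns false wherever the kernel
 * version would trigger BUG_ON(), true otherwise.
 */
bool model_assert_pte_locked(const struct model_pmd *pmd)
{
        if (!pmd->present)
                return false;           /* BUG_ON(!pmd_present(*pmd)) */
        return pmd->pte_lock_held;      /* BUG_ON(!spin_is_locked(...)) */
}

/*
 * Models a pte_update()-style guard: huge mappings are assumed to be
 * protected by a different lock, so the normal-page check is skipped.
 */
bool model_check_before_update(const struct model_pmd *pmd, bool huge)
{
        if (huge)
                return true;
        return model_assert_pte_locked(pmd);
}
```

In this model, a huge mapping whose pmd-level entry does not look like a
normal present pmd fails the plain check but passes once the caller says
it is huge, which is the behaviour ptep_set_access_flags() currently has
no way to ask for.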
cheers