From: Michael Ellerman <michael@ellerman.id.au>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
linuxppc-dev@ozlabs.org,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>
Subject: Re: [BUG] 2.6.30-rc3: BUG triggered on some hugepage usages
Date: Sat, 25 Apr 2009 01:24:50 +1000
Message-ID: <1240586690.12551.31.camel@localhost>
In-Reply-To: <20090424095116.GB14283@csn.ul.ie>
On Fri, 2009-04-24 at 10:51 +0100, Mel Gorman wrote:
> On Tue, Apr 21, 2009 at 08:27:57PM -0700, Linus Torvalds wrote:
> > Another week, another -rc.
> >
>
> I'm seeing some tests with sysbench+postgres+large pages fail on ppc64
> although a very clear pattern is not forming as to what exactly is
> causing it. However, the libhugetlbfs regression tests (make && make
> func) are triggering the following oops when calling mlock() and so are
> likely related.
>
> ------------[ cut here ]------------
> kernel BUG at arch/powerpc/mm/pgtable.c:243!
> Oops: Exception in kernel mode, sig: 5 [#1]
> SMP NR_CPUS=128 NUMA pSeries
> Modules linked in: dm_snapshot dm_mirror dm_region_hash dm_log qla2xxx
> loop nfnetlink iptable_filter iptable_nat nf_nat ip_tables
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT
> xt_tcpudp xt_limit ipt_LOG xt_pkttype x_tables
> NIP: c00000000002becc LR: c00000000002c02c CTR: 0000000000000000
> REGS: c0000000ea92b4c0 TRAP: 0700 Not tainted (2.6.30-rc3-autokern1)
> MSR: 8000000000029032 <EE,ME,CE,IR,DR> CR: 28000484 XER: 20000020
> TASK = c00000000395b660[7611] 'mlock' THREAD: c0000000ea928000 CPU: 3
> GPR00: 0000000000000001 c0000000ea92b740 c0000000008ea170 c0000000ec7d4980
> GPR04: 000000003f000000 c0000001e2278cf8 0000001900000393 0000000000000001
> GPR08: f000000002bc0000 0000000000000000 0000000000000113 c0000001e2278c81
> GPR12: 0000000044000482 c00000000093b880 0000000028004422 0000000000000000
> GPR16: c0000000ea92bbf0 c0000000009f06f0 0000001900000113 c0000000ec7d4980
> GPR20: 0000000000000000 f000000002bc0000 000000003f000000 c0000001e2278cf8
> GPR24: c0000000eaa90bb0 0000000000000000 c0000000eaa90bb0 c0000000ea928000
> GPR28: f000000002bc0000 0000001900000393 0000000000000001 c0000001e2278cf8
> NIP [c00000000002becc] .assert_pte_locked+0x54/0x8c
> LR [c00000000002c02c] .ptep_set_access_flags+0x50/0x8c
> Call Trace:
> [c0000000ea92b740] [c0000000eaa90bb0] 0xc0000000eaa90bb0 (unreliable)
> [c0000000ea92b7d0] [c0000000000ed1b0] .hugetlb_cow+0xd4/0x654
> [c0000000ea92b900] [c0000000000edbf0] .hugetlb_fault+0x4c0/0x708
> [c0000000ea92b9f0] [c0000000000ee890] .follow_hugetlb_page+0x174/0x364
> [c0000000ea92bae0] [c0000000000d8d30] .__get_user_pages+0x288/0x4c0
> [c0000000ea92bbb0] [c0000000000da10c] .make_pages_present+0xa0/0xe0
> [c0000000ea92bc40] [c0000000000db758] .mlock_fixup+0x90/0x228
> [c0000000ea92bd00] [c0000000000dbb38] .do_mlock+0xc4/0x128
> [c0000000ea92bda0] [c0000000000dbccc] .SyS_mlock+0xb0/0xec
> [c0000000ea92be30] [c00000000000852c] syscall_exit+0x0/0x40
> Instruction dump:
> 0b000000 78892662 79291f24 7d69582a 7d600074 7800d182 0b000000 78895e62
> 79291f24 7d29582a 7d200074 7800d182 <0b000000> 3c004000 3960ffff
> 780007c6
> ---[ end trace 36a7faa04fa9452b ]---
>
> This corresponds to
>
> #ifdef CONFIG_DEBUG_VM
> void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
> {
> 	pgd_t *pgd;
> 	pud_t *pud;
> 	pmd_t *pmd;
>
> 	if (mm == &init_mm)
> 		return;
> 	pgd = mm->pgd + pgd_index(addr);
> 	BUG_ON(pgd_none(*pgd));
> 	pud = pud_offset(pgd, addr);
> 	BUG_ON(pud_none(*pud));
> 	pmd = pmd_offset(pud, addr);
> 	BUG_ON(!pmd_present(*pmd)); <----- THIS LINE
> 	BUG_ON(!spin_is_locked(pte_lockptr(mm, pmd)));
> }
> #endif /* CONFIG_DEBUG_VM */
>
> This area was last changed by commit 8d30c14cab30d405a05f2aaceda1e9ad57800f36
> in the 2.6.30-rc1 timeframe. I think there was another hugepage-related
> problem with this patch but I can't remember what it was.

It broke modules, but I don't remember anything hugepage-related.

So the code changed from:

-#define ptep_set_access_flags(__vma, __address, __ptep, __entry, __dirty) \
-({ \
-	int __changed = !pte_same(*(__ptep), __entry); \
-	if (__changed) { \
-		__ptep_set_access_flags(__ptep, __entry, __dirty); \
-		flush_tlb_page_nohash(__vma, __address); \
-	} \
-	__changed; \
-})

to:

+int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
+			  pte_t *ptep, pte_t entry, int dirty)
+{
+	int changed;
+	if (!dirty && pte_need_exec_flush(entry, 0))
+		entry = do_dcache_icache_coherency(entry);
+	changed = !pte_same(*(ptep), entry);
+	if (changed) {
+		assert_pte_locked(vma->vm_mm, address);
+		__ptep_set_access_flags(ptep, entry);
+		flush_tlb_page_nohash(vma, address);
+	}
+	return changed;
+}

So the call to assert_pte_locked() is new. And it's never going to work
for huge pages, because the page table structure is different, right?
Notice that pte_update() checks for this
(arch/powerpc/include/asm/pgtable-ppc64.h):

	/* huge pages use the old page table lock */
	if (!huge)
		assert_pte_locked(mm, addr);

But unlike pte_update(), ptep_set_access_flags() has no way of knowing
that it's been called from huge_ptep_set_access_flags().
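
To make the asymmetry concrete, here is a small userspace model, not
kernel code. Every name in it (toy_pmd, toy_pte_locked_check,
toy_pte_update, toy_set_access_flags) is made up for illustration, and
it assumes, as a simplification, that a huge mapping leaves the pmd slot
looking "not present" to the normal page table walk.

```c
#include <stdbool.h>

/* Toy stand-in for one pmd slot; this is not the kernel's pmd_t. */
struct toy_pmd {
	bool present;	/* what a normal page table walk would see */
	bool lock_held;	/* whether the pte lock for this slot is held */
};

/* Returns true if the debug check passes, false where the kernel would BUG. */
bool toy_pte_locked_check(const struct toy_pmd *pmd)
{
	return pmd->present && pmd->lock_held;
}

/* Modelled on pte_update(): the caller says whether this is a huge pte. */
bool toy_pte_update(const struct toy_pmd *pmd, bool huge)
{
	if (!huge)
		return toy_pte_locked_check(pmd);
	return true;	/* check skipped for huge pages */
}

/* Modelled on the new ptep_set_access_flags(): no huge argument,
 * so the check runs unconditionally. */
bool toy_set_access_flags(const struct toy_pmd *pmd)
{
	return toy_pte_locked_check(pmd);
}
```

With a slot that looks not present, toy_pte_update(&slot, true) passes
because the caller opted out, while toy_set_access_flags(&slot) fails,
which is the shape of the oops above.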

So my guess is we either remove the call to assert_pte_locked() from
ptep_set_access_flags(), or have assert_pte_locked() check whether it's
being called for a huge pte.
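
A rough userspace sketch of the two options, for comparison only. The
types and function names are hypothetical, and keying the huge check off
a per-vma flag is an assumption about how a huge mapping would be
recognised, not a claim about what the real fix should look like.

```c
#include <stdbool.h>

/* Toy stand-ins, not kernel types. */
struct toy_vma {
	bool is_hugetlb;	/* assumed way of recognising a huge mapping */
};

struct toy_pmd {
	bool present;		/* what a normal page table walk would see */
	bool lock_held;		/* whether the pte lock for this slot is held */
};

/* Option 1: the access-flags path stops calling the check at all.
 * Nothing is verified any more, for normal or huge ptes. */
bool toy_set_access_flags_unchecked(const struct toy_pmd *pmd)
{
	(void)pmd;
	return true;
}

/* Option 2: the check itself learns about huge mappings.
 * Returns true if it passes, false where the kernel would BUG. */
bool toy_assert_pte_locked_huge_aware(const struct toy_vma *vma,
				      const struct toy_pmd *pmd)
{
	if (vma->is_hugetlb)
		return true;	/* huge ptes use a different lock, so skip */
	return pmd->present && pmd->lock_held;
}
```

The trade-off in the model is that option 1 also gives up the debug
coverage for normal ptes on this path, while option 2 keeps it but needs
some way to tell the two cases apart.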
cheers