From: Mike Kravetz <mike.kravetz@oracle.com>
To: Sachin Sant <sachinp@linux.ibm.com>
Cc: linux-mm@kvack.org, linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [powerpc] Kernel crash with THP tests (next-20220920)
Date: Wed, 21 Sep 2022 16:41:03 -0700
Message-ID: <YyuhD+7N022PgRA+@monkey>
In-Reply-To: <C2C8DA4F-F00F-43E9-ACD8-2A8BACA55893@linux.ibm.com>

On 09/21/22 12:00, Sachin Sant wrote:
> While running transparent huge page tests [1] against 6.0.0-rc6-next-20220920
> following crash is seen on IBM Power server.

Thanks Sachin,

Naoya reported this, with my analysis here:
https://lore.kernel.org/linux-mm/YyqCS6+OXAgoqI8T@monkey/

An updated version of the patch was posted here:
https://lore.kernel.org/linux-mm/20220921202702.106069-1-mike.kravetz@oracle.com/

Sorry about that,
-- 
Mike Kravetz

> 
> Kernel attempted to read user page (34) - exploit attempt? (uid: 0)
> BUG: Kernel NULL pointer dereference on read at 0x00000034
> Faulting instruction address: 0xc0000000004d2744
> Oops: Kernel access of bad area, sig: 11 [#1]
> LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> Modules linked in: dm_mod(E) bonding(E) rfkill(E) tls(E) sunrpc(E) nd_pmem(E) nd_btt(E) dax_pmem(E) papr_scm(E) libnvdimm(E) pseries_rng(E) vmx_crypto(E) ext4(E) mbcache(E) jbd2(E) sd_mod(E) t10_pi(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) fuse(E)
> CPU: 37 PID: 2219255 Comm: sysctl Tainted: G            E      6.0.0-rc6-next-20220920 #1
> NIP:  c0000000004d2744 LR: c0000000004d2734 CTR: 0000000000000000
> REGS: c0000012801bf660 TRAP: 0300   Tainted: G            E       (6.0.0-rc6-next-20220920)
> MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24048222  XER: 20040000
> CFAR: c0000000004b0eac DAR: 0000000000000034 DSISR: 40000000 IRQMASK: 0 
> GPR00: c0000000004d2734 c0000012801bf900 c000000002a92300 0000000000000000 
> GPR04: c000000002ac8ac0 c000000001209340 0000000000000005 c000001286714b80 
> GPR08: 0000000000000034 0000000000000000 0000000000000000 0000000000000000 
> GPR12: 0000000028048242 c00000167fff6b00 0000000000000000 0000000000000000 
> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> GPR20: c0000012801bfae8 0000000000000001 0000000000000100 0000000000000001 
> GPR24: c0000012801bfae8 c000000002ac8ac0 0000000000000002 0000000000000005 
> GPR28: 0000000000000000 0000000000000001 0000000000000000 0000000000346cca 
> NIP [c0000000004d2744] alloc_buddy_huge_page+0xd4/0x240
> LR [c0000000004d2734] alloc_buddy_huge_page+0xc4/0x240
> Call Trace:
> [c0000012801bf900] [c0000000004d2734] alloc_buddy_huge_page+0xc4/0x240 (unreliable)
> [c0000012801bf9b0] [c0000000004d46a4] alloc_fresh_huge_page.part.72+0x214/0x2a0
> [c0000012801bfa40] [c0000000004d7f88] alloc_pool_huge_page+0x118/0x190
> [c0000012801bfa90] [c0000000004d84dc] __nr_hugepages_store_common+0x4dc/0x610
> [c0000012801bfb70] [c0000000004d88bc] hugetlb_sysctl_handler_common+0x13c/0x180
> [c0000012801bfc10] [c0000000006380e0] proc_sys_call_handler+0x210/0x350
> [c0000012801bfc90] [c000000000551c00] vfs_write+0x2e0/0x460
> [c0000012801bfd50] [c000000000551f5c] ksys_write+0x7c/0x140
> [c0000012801bfda0] [c000000000033f58] system_call_exception+0x188/0x3f0
> [c0000012801bfe10] [c00000000000c53c] system_call_common+0xec/0x270
> --- interrupt: c00 at 0x7fffa9520c34
> NIP:  00007fffa9520c34 LR: 00000001024754bc CTR: 0000000000000000
> REGS: c0000012801bfe80 TRAP: 0c00   Tainted: G            E       (6.0.0-rc6-next-20220920)
> MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 28002202  XER: 00000000
> IRQMASK: 0 
> GPR00: 0000000000000004 00007fffccd76cd0 00007fffa9607300 0000000000000003 
> GPR04: 0000000138da6970 0000000000000006 fffffffffffffff6 0000000000000000 
> GPR08: 0000000138da6970 0000000000000000 0000000000000000 0000000000000000 
> GPR12: 0000000000000000 00007fffa9a40940 0000000000000000 0000000000000000 
> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> GPR24: 0000000000000001 0000000000000010 0000000000000006 0000000138da8aa0 
> GPR28: 00007fffa95fc2c8 0000000138da8aa0 0000000000000006 0000000138da6930 
> NIP [00007fffa9520c34] 0x7fffa9520c34
> LR [00000001024754bc] 0x1024754bc
> --- interrupt: c00
> Instruction dump:
> 3b400002 3ba00001 3b800000 7f26cb78 7fc5f378 7f64db78 7fe3fb78 4bfde5b9 
> 60000000 7c691b78 39030034 7c0004ac <7d404028> 7c0ae800 40c20010 7f80412d 
> ---[ end trace 0000000000000000 ]---
> 
> Kernel panic - not syncing: Fatal exception
> 
> Bisect points to following patch:
> commit f2f3c25dea3acfb17aecb7273541e7266dfc8842
>     hugetlb: freeze allocated pages before creating hugetlb pages
> 
> Reverting the patch allows the test to run successfully.
> 
> Thanks
> - Sachin
> 
> [1] https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/transparent_hugepages_defrag.py
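
The register dump in the quoted oops can be cross-checked against the
instruction dump by decoding the word marked <7d404028> and the addi two
words before it. The following is a minimal sketch for illustration only and
is not part of the original report; the helper names bits() and decode() are
hypothetical, and only the two encodings needed here (D-form addi and X-form
lwarx) are handled.

```python
def bits(word, start, length):
    """Return `length` bits of a 32-bit word, starting at bit `start`.

    Power ISA numbers bits from the most significant end, so bit 0 is the MSB.
    """
    return (word >> (32 - start - length)) & ((1 << length) - 1)


def base(ra):
    """RA = 0 in these forms means the literal value 0, not register r0."""
    return "0" if ra == 0 else f"r{ra}"


def decode(word):
    """Decode the two instruction forms that appear around the fault."""
    opcode = bits(word, 0, 6)
    rt = bits(word, 6, 5)
    ra = bits(word, 11, 5)

    if opcode == 14:  # D-form: addi RT,RA,SI
        imm = bits(word, 16, 16)
        if imm & 0x8000:  # sign-extend the 16-bit immediate
            imm -= 0x10000
        return f"addi r{rt},{base(ra)},{imm:#x}"

    if opcode == 31 and bits(word, 21, 10) == 20:  # X-form: lwarx RT,RA,RB
        rb = bits(word, 16, 5)
        return f"lwarx r{rt},{base(ra)},r{rb}"

    return f"unknown ({word:#010x})"


if __name__ == "__main__":
    print(decode(0x39030034))  # addi r8,r3,0x34
    print(decode(0x7D404028))  # lwarx r10,0,r8
```

The faulting instruction is therefore a load-and-reserve from the address in
r8, and r8 was computed as r3 + 0x34. With GPR03 shown as 0 in the register
dump, that gives 0x34, which matches both GPR08 and DAR.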
