* [BUG] madvise05 leads kernel panic on 4.9.122
@ 2018-08-21 18:37 Yang Shi
2018-08-21 18:43 ` David Woodhouse
0 siblings, 1 reply; 5+ messages in thread
From: Yang Shi @ 2018-08-21 18:37 UTC (permalink / raw)
To: gregkh, ak, tglx, dave.hansen, dwmw; +Cc: stable, x86, linux-kernel, YangShi
Hi folks,
I just ran some regression tests on stable 4.9.122 with LTP. The madvise05
test triggers the kernel panic below:
[ 6785.089994] BUG: unable to handle kernel paging request at ffffeaff44488020
[ 6785.097952] IP: [] page_remove_rmap+0x27/0x580
[ 6785.104810] PGD 0
[ 6785.106859]
[ 6785.108526] Oops: 0000 [#1] SMP
[ 6785.112029] Modules linked in: mptctl(E) mptbase(E) tun(E) fuse(E) vfat(E) fat(E) btrfs(E) xor(E) raid6_pq(E) xfs(E)
[ 6785.123905] CPU: 14 PID: 77983 Comm: madvise05 Tainted: G E 4.9.122-001.ali3000_nightly_cov_20180820_193.test.alios7.x86_64 #1
[ 6785.137880] Hardware name: Dell Inc. PowerEdge R720xd/0X6FFV, BIOS 1.3.6 09/11/2012
[ 6785.146425] task: ffff882daeb78000 task.stack: ffffc9001b438000
[ 6785.153031] RIP: 0010:[] [] page_remove_rmap+0x27/0x580
[ 6785.162461] RSP: 0018:ffffc9001b43bc50 EFLAGS: 00010246
[ 6785.168388] RAX: 0000000000000000 RBX: ffffeaff44488000 RCX: 0000000000000080
[ 6785.176351] RDX: ffffeaff44488000 RSI: 0000000000000001 RDI: ffffeaff44488000
[ 6785.184315] RBP: ffffc9001b43bc80 R08: ffff882f9d4a6540 R09: ffffffff84bcdf60
[ 6785.192277] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[ 6785.200241] R13: ffff882db6996910 R14: ffffea00b6da65b0 R15: 00003fffffe00000
[ 6785.208205] FS: 00002ae5a3dfbb80(0000) GS:ffff882fbf180000(0000) knlGS:0000000000000000
[ 6785.217234] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6785.223646] CR2: ffffeaff44488020 CR3: 0000002f9d51c000 CR4: 00000000000606f0
[ 6785.231610] Stack:
[ 6785.233851] 0000000000000000 00003ffffd6b3320 0434f9415ac47f2c ffffeaff44488000
[ 6785.242143] ffffc9001b43bdd8 ffff882db6996910 ffffc9001b43bcc0 ffffffff81355aaa
[ 6785.250438] ffff882f9d4a6540 00002ae5a4400000 00002ae5a4600000 ffffffff84bcd7a0
[ 6785.258728] Call Trace:
[ 6785.261460] [] zap_huge_pmd+0x28a/0x640
[ 6785.267585] [] unmap_page_range+0x532/0x630
[ 6785.274087] [] unmap_single_vma+0xa9/0x160
[ 6785.280501] [] unmap_vmas+0x5f/0xe0
[ 6785.286236] [] unmap_region+0xe4/0x1e0
[ 6785.292263] [] ? blk_finish_plug+0x3c/0x60
[ 6785.298669] [] ? SYSC_madvise+0x69b/0xed0
[ 6785.304985] [] do_munmap+0x39b/0x5b0
[ 6785.310818] [] SyS_munmap+0x78/0xb0
[ 6785.316552] [] do_syscall_64+0xf4/0x350
[ 6785.322676] [] entry_SYSCALL_64_after_swapgs+0x58/0xca
[ 6785.330244] Code: 00 00 00 00 66 66 66 66 90 55 <48> 8b 57 20 48 89 f8 f6 c2 01 0f
[ 6785.339011] RIP [] page_remove_rmap+0x27/0x580
[ 6785.345825] RSP
[ 6785.349715] CR2: ffffeaff44488020
The same test case passes on both 4.9.119 and Linus's latest tree, so it
looks like the problem is caused by the L1TF patches in the stable tree.
The madvise05 test case can be reduced to the following test program:
#include <stdio.h>
#include <sys/mman.h>

#define LEN (32UL * 1024 * 1024)

int main(void)
{
    void *addr;

    /* Map and prefault 32 MiB of anonymous memory. */
    addr = mmap(NULL, LEN, PROT_READ,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    if (addr == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* Turn the whole range into a PROT_NONE mapping. */
    if (mprotect(addr, LEN, PROT_NONE) < 0) {
        perror("mprotect");
        return 1;
    }

    /* The oops above is hit while unmapping the PROT_NONE range. */
    munmap(addr, LEN);
    return 0;
}
You may already be aware of this problem; if not, any hint is appreciated.
Thanks,
Yang
* Re: [BUG] madvise05 leads kernel panic on 4.9.122
From: David Woodhouse @ 2018-08-21 18:43 UTC (permalink / raw)
To: Yang Shi, gregkh, ak, tglx, dave.hansen; +Cc: stable, x86, linux-kernel
On Tue, 2018-08-21 at 11:37 -0700, Yang Shi wrote:
>
> I just ran some regression test on stable 4.9.122 with LTP. madvise05
> triggers the below kernel panic:
Please could you try 4.9.123-rc1, specifically this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/commit/?h=linux-4.9.y&id=64fc89a2702e0e70e34ced0a270ef556242d5f26
* Re: [BUG] madvise05 leads kernel panic on 4.9.122
From: Yang Shi @ 2018-08-21 20:30 UTC (permalink / raw)
To: David Woodhouse, gregkh, ak, tglx, dave.hansen; +Cc: stable, x86, linux-kernel
On 8/21/18 11:43 AM, David Woodhouse wrote:
> On Tue, 2018-08-21 at 11:37 -0700, Yang Shi wrote:
>> I just ran some regression test on stable 4.9.122 with LTP. madvise05
>> triggers the below kernel panic:
Thanks, David. It works. A silly question: I don't see why this commit
fixes the issue, since it looks like just a code refactor. Is it only
because it changes how the pfn is obtained from page table entries? And
does the absence of this change cause a mismatch on 4.9 stable?
> Please could you try 4.9.123-rc1, specifically this commit:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/commit/?h=linux-4.9.y&id=64fc89a2702e0e70e34ced0a270ef556242d5f26
* Re: [BUG] madvise05 leads kernel panic on 4.9.122
From: Andi Kleen @ 2018-08-21 20:36 UTC (permalink / raw)
To: yang.shi
Cc: David Woodhouse, gregkh, tglx, dave.hansen, stable, x86,
linux-kernel
On Tue, Aug 21, 2018 at 01:30:20PM -0700, yang.shi@linux.alibaba.com wrote:
>
>
> On 8/21/18 11:43 AM, David Woodhouse wrote:
> > On Tue, 2018-08-21 at 11:37 -0700, Yang Shi wrote:
> > > I just ran some regression test on stable 4.9.122 with LTP. madvise05
> > > triggers the below kernel panic:
>
> Thanks, David. It works. A silly question, I don't get why this commit could
> solve this issue, it looks just like a code refactor. Just because it
> changed how to get pfn from page table entries? And, this may cause some
> mismatch on 4.9 stable without it?
With the L1TF patches, open-coded use of pte_val() to get the PFN can cause
problems, because it doesn't undo the inversion applied to PROT_NONE mappings.
The cleanup changes the open-coded versions to use p*_pfn(), which always
works correctly.
-Andi
* Re: [BUG] madvise05 leads kernel panic on 4.9.122
From: Yang Shi @ 2018-08-21 21:32 UTC (permalink / raw)
To: Andi Kleen
Cc: David Woodhouse, gregkh, tglx, dave.hansen, stable, x86,
linux-kernel
On 8/21/18 1:36 PM, Andi Kleen wrote:
> On Tue, Aug 21, 2018 at 01:30:20PM -0700, yang.shi@linux.alibaba.com wrote:
>>
>> On 8/21/18 11:43 AM, David Woodhouse wrote:
>>> On Tue, 2018-08-21 at 11:37 -0700, Yang Shi wrote:
>>>> I just ran some regression test on stable 4.9.122 with LTP. madvise05
>>>> triggers the below kernel panic:
>> Thanks, David. It works. A silly question, I don't get why this commit could
>> solve this issue, it looks just like a code refactor. Just because it
>> changed how to get pfn from page table entries? And, this may cause some
>> mismatch on 4.9 stable without it?
> With the L1TF patches open coded pte_val() to get the PFN can cause problems
> because it doesn't do the invert for PROT_NONE mappings
>
> The cleanup changes the open coded versions to use p*_pfn(), which always
> works correctly.
Thanks. Got it.
>
> -Andi