public inbox for linux-kernel@vger.kernel.org
* [BUG] commit 444d13ff10f introduced boot failure on s390x
@ 2016-08-10 15:21 Eryu Guan
  2016-08-10 22:58 ` Jessica Yu
  0 siblings, 1 reply; 4+ messages in thread
From: Eryu Guan @ 2016-08-10 15:21 UTC (permalink / raw)
  To: live-patching; +Cc: linux-kernel, Jessica Yu, Rusty Russell

Hi,

I hit a boot failure on s390x hosts starting with the 4.8-rc1 kernel; the
4.7 kernel works fine. I bisected it to commit 444d13ff10fb:

    commit 444d13ff10fb13bc3e64859c3cf9ce43dcfeb075
    Author: Jessica Yu <jeyu@redhat.com>
    Date:   Wed Jul 27 12:06:21 2016 +0930
    
        modules: add ro_after_init support
    
        Add ro_after_init support for modules by adding a new page-aligned section
        in the module layout (after rodata) for ro_after_init data and enabling RO
        protection for that section after module init runs.
    
        Signed-off-by: Jessica Yu <jeyu@redhat.com>
        Acked-by: Kees Cook <keescook@chromium.org>
        Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

and I've only hit this panic on s390x hosts. Console log is appended at
the end of email.

Thanks,
Eryu

[    2.050197] device-mapper: uevent: version 1.0.3
[    2.050370] device-mapper: ioctl: 4.34.0-ioctl (2015-10-28) initialised: dm-devel@redhat.com
[    2.057615] Unable to handle kernel pointer dereference in virtual kernel address space
[    2.057619] Failing address: 000003ff8001d000 TEID: 000003ff8001d407
[    2.057620] Fault in home space mode while using kernel ASCE.
[    2.057622] AS:0000000000a7c007 R3:000000007c974007 S:000000007cc24800 P:000000000239b21d
[    2.057665] Oops: 0004 ilc:3 [#1] SMP
[    2.057667] Modules linked in: dm_mod
[    2.057670] CPU: 0 PID: 399 Comm: modprobe Not tainted 4.7.0+ #7
[    2.057672] Hardware name: IBM              2827 H43              400              (z/VM)
[    2.057673] task: 000000007cccd100 ti: 0000000002324000 task.ti: 0000000002324000
[    2.057675] Krnl PSW : 0704c00180000000 000000000043a5c8 (__list_add_rcu+0x50/0xa8)
[    2.057683]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000006 000000000098b278 000003ff80117208 000000000098b278
[    2.057685]            000003ff8001df08 00000000001c913c 0000000000000002 000000008001e55c
[    2.057686]            0000000000000560 0000000002327e00 000003ff80117218 000003ff80117208
[    2.057687]            000000000098b278 000003ff8001df08 0000000002327cc0 0000000002327c90
[    2.057696] Krnl Code: 000000000043a5b6: e3d0b0000024       stg     %r13,0(%r11)
           000000000043a5bc: e3c0b0080024       stg     %r12,8(%r11)
          #000000000043a5c2: e3b0c0000024       stg     %r11,0(%r12)
          >000000000043a5c8: e3b0d0080024       stg     %r11,8(%r13)
           000000000043a5ce: e340f0b80004       lg      %r4,184(%r15)
           000000000043a5d4: ebbff0a00004       lmg     %r11,%r15,160(%r15)
           000000000043a5da: 07f4               bcr     15,%r4
           000000000043a5dc: e34040080004       lg      %r4,8(%r4)
[    2.057706] Call Trace:
[    2.057708] ([<0000000002327cc0>] 0x2327cc0)
[    2.057714] ([<00000000001c98d0>] load_module+0x8e0/0x1870)
[    2.057715] ([<00000000001caa74>] SyS_finit_module+0xb4/0xf0)
[    2.057720] ([<00000000006678b6>] system_call+0xd6/0x264)
[    2.057721] Last Breaking-Event-Address:
[    2.057722]  [<00000000001c98ca>] load_module+0x8da/0x1870
[    2.057723]
[    2.057724] Kernel panic - not syncing: Fatal exception: panic_on_oops


* Re: commit 444d13ff10f introduced boot failure on s390x
  2016-08-10 15:21 [BUG] commit 444d13ff10f introduced boot failure on s390x Eryu Guan
@ 2016-08-10 22:58 ` Jessica Yu
  2016-08-15 19:12   ` Jessica Yu
  0 siblings, 1 reply; 4+ messages in thread
From: Jessica Yu @ 2016-08-10 22:58 UTC (permalink / raw)
  To: Eryu Guan; +Cc: live-patching, linux-kernel, Rusty Russell

+++ Eryu Guan [10/08/16 23:21 +0800]:
>Hi,
>
>I hit a boot failure on s390x hosts starting with the 4.8-rc1 kernel; the
>4.7 kernel works fine. I bisected it to commit 444d13ff10fb:
>
>    commit 444d13ff10fb13bc3e64859c3cf9ce43dcfeb075
>    Author: Jessica Yu <jeyu@redhat.com>
>    Date:   Wed Jul 27 12:06:21 2016 +0930
>
>        modules: add ro_after_init support
>
>        Add ro_after_init support for modules by adding a new page-aligned section
>        in the module layout (after rodata) for ro_after_init data and enabling RO
>        protection for that section after module init runs.
>
>        Signed-off-by: Jessica Yu <jeyu@redhat.com>
>        Acked-by: Kees Cook <keescook@chromium.org>
>        Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
>
>and I've only hit this panic on s390x hosts. Console log is appended at
>the end of email.
>
>Thanks,
>Eryu

Hi Eryu, thanks for reporting this. It's a bit difficult to tell from
the stacktrace alone what's really going on, so I'll attempt to
reproduce this on a 4.8-rc1 kernel once I get my hands on an s390x
system and report back.



* Re: commit 444d13ff10f introduced boot failure on s390x
  2016-08-10 22:58 ` Jessica Yu
@ 2016-08-15 19:12   ` Jessica Yu
  2016-08-16  6:48     ` Heiko Carstens
  0 siblings, 1 reply; 4+ messages in thread
From: Jessica Yu @ 2016-08-15 19:12 UTC (permalink / raw)
  To: Eryu Guan
  Cc: live-patching, linux-kernel, Rusty Russell, Heiko Carstens,
	Martin Schwidefsky

+++ Jessica Yu [10/08/16 18:58 -0400]:
>+++ Eryu Guan [10/08/16 23:21 +0800]:
>>Hi,
>>
>>I hit a boot failure on s390x hosts starting with the 4.8-rc1 kernel; the
>>4.7 kernel works fine. I bisected it to commit 444d13ff10fb:
>>
>>   commit 444d13ff10fb13bc3e64859c3cf9ce43dcfeb075
>>   Author: Jessica Yu <jeyu@redhat.com>
>>   Date:   Wed Jul 27 12:06:21 2016 +0930
>>
>>       modules: add ro_after_init support
>>
>>       Add ro_after_init support for modules by adding a new page-aligned section
>>       in the module layout (after rodata) for ro_after_init data and enabling RO
>>       protection for that section after module init runs.
>>
>>       Signed-off-by: Jessica Yu <jeyu@redhat.com>
>>       Acked-by: Kees Cook <keescook@chromium.org>
>>       Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
>>
>>and I've only hit this panic on s390x hosts. Console log is appended at
>>the end of email.
>>
>>Thanks,
>>Eryu
>
>Hi Eryu, thanks for reporting this. It's a bit difficult to tell from
>the stacktrace alone what's really going on, so I'll attempt to
>reproduce this on a 4.8-rc1 kernel once I get my hands on an s390x
>system and report back.

[ CC'ing Heiko and Martin ]

So this panic is related to recent changes to set_memory_{ro,rw} on
s390; see commit e8a97e42dc98 ("s390/pageattr: allow kernel page table
splitting"). The new s390 implementation of set_memory_{ro,rw} doesn't
handle the case where numpages is 0.

Recall the general layout of a module:
    [text] [rodata] [ro-after-init] [writable data]

Normally a module's ro-after-init section sits between rodata and
writable data. When a module has no ro-after-init section,
set_memory_ro gets called with the first page-aligned address after
rodata, but with numpages = 0. Since set_memory_ro doesn't handle
numpages == 0, it walks the page table anyway and marks a normally
writable page read-only. Adding a simple numpages == 0 check to
set_memory_{ro,rw} that returns early fixes the panic.

Jessica



* Re: commit 444d13ff10f introduced boot failure on s390x
  2016-08-15 19:12   ` Jessica Yu
@ 2016-08-16  6:48     ` Heiko Carstens
  0 siblings, 0 replies; 4+ messages in thread
From: Heiko Carstens @ 2016-08-16  6:48 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Eryu Guan, live-patching, linux-kernel, Rusty Russell,
	Martin Schwidefsky

On Mon, Aug 15, 2016 at 03:12:53PM -0400, Jessica Yu wrote:
> +++ Jessica Yu [10/08/16 18:58 -0400]:
> >+++ Eryu Guan [10/08/16 23:21 +0800]:
> >>Hi,
> >>
> >>I hit a boot failure on s390x hosts starting with the 4.8-rc1 kernel; the
> >>4.7 kernel works fine. I bisected it to commit 444d13ff10fb:
> >>
> >>  commit 444d13ff10fb13bc3e64859c3cf9ce43dcfeb075
> >>  Author: Jessica Yu <jeyu@redhat.com>
> >>  Date:   Wed Jul 27 12:06:21 2016 +0930
> >>
> >>      modules: add ro_after_init support
> >>
> >>      Add ro_after_init support for modules by adding a new page-aligned section
> >>      in the module layout (after rodata) for ro_after_init data and enabling RO
> >>      protection for that section after module init runs.
> >>
> >>      Signed-off-by: Jessica Yu <jeyu@redhat.com>
> >>      Acked-by: Kees Cook <keescook@chromium.org>
> >>      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> >>
> >>and I've only hit this panic on s390x hosts. Console log is appended at
> >>the end of email.
> >>
> >>Thanks,
> >>Eryu
> >
> >Hi Eryu, thanks for reporting this. It's a bit difficult to tell from
> >the stacktrace alone what's really going on, so I'll attempt to
> >reproduce this on a 4.8-rc1 kernel once I get my hands on an s390x
> >system and report back.
> 
> [ CC'ing Heiko and Martin ]
> 
> So this panic is related to recent changes to set_memory_{ro,rw} on
> s390; see commit e8a97e42dc98 ("s390/pageattr: allow kernel page table
> splitting"). The new s390 implementation of set_memory_{ro,rw} doesn't
> handle the case where numpages is 0.
> 
> Recall the general layout of a module:
>    [text] [rodata] [ro-after-init] [writable data]
> 
> Normally a module's ro-after-init section sits between rodata and
> writable data. When a module has no ro-after-init section,
> set_memory_ro gets called with the first page-aligned address after
> rodata, but with numpages = 0. Since set_memory_ro doesn't handle
> numpages == 0, it walks the page table anyway and marks a normally
> writable page read-only. Adding a simple numpages == 0 check to
> set_memory_{ro,rw} that returns early fixes the panic.
> 
> Jessica

Everything you wrote is correct. The patch below has been sitting in our
"fixes" branch for a week:

https://git.kernel.org/cgit/linux/kernel/git/s390/linux.git/log/?h=fixes

I assume there will be a pull request from Martin soon.

From 4d81aaa53c2dea220ddf88e19c33033d6cf4f8cb Mon Sep 17 00:00:00 2001
From: Heiko Carstens <heiko.carstens@de.ibm.com>
Date: Tue, 9 Aug 2016 12:26:28 +0200
Subject: [PATCH] s390/pageattr: handle numpages parameter correctly

Both set_memory_ro() and set_memory_rw() will modify the page
attributes of at least one page, even if the numpages parameter is
zero.

The author expected that calling these functions with numpages == zero
would never happen. However with the new 444d13ff10fb ("modules: add
ro_after_init support") feature this happens frequently.

Therefore do the right thing and make these two functions return
gracefully if nothing should be done.

Fixes crashes on module load like this one:

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 000003ff80008000 TEID: 000003ff80008407
Fault in home space mode while using kernel ASCE.
AS:0000000000d18007 R3:00000001e6aa4007 S:00000001e6a10800 P:00000001e34ee21d
Oops: 0004 ilc:3 [#1] SMP
Modules linked in: x_tables
CPU: 10 PID: 1 Comm: systemd Not tainted 4.7.0-11895-g3fa9045 #4
Hardware name: IBM              2964 N96              703              (LPAR)
task: 00000001e9118000 task.stack: 00000001e9120000
Krnl PSW : 0704e00180000000 00000000005677f8 (rb_erase+0xf0/0x4d0)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 000003ff80008b20 000003ff80008b20 000003ff80008b70 0000000000b9d608
           000003ff80008b20 0000000000000000 00000001e9123e88 000003ff80008950
           00000001e485ab40 000003ff00000000 000003ff80008b00 00000001e4858480
           0000000100000000 000003ff80008b68 00000000001d5998 00000001e9123c28
Krnl Code: 00000000005677e8: ec1801c3007c        cgij    %r1,0,8,567b6e
           00000000005677ee: e32010100020        cg      %r2,16(%r1)
          #00000000005677f4: a78401c2            brc     8,567b78
          >00000000005677f8: e35010080024        stg     %r5,8(%r1)
           00000000005677fe: ec5801af007c        cgij    %r5,0,8,567b5c
           0000000000567804: e30050000024        stg     %r0,0(%r5)
           000000000056780a: ebacf0680004        lmg     %r10,%r12,104(%r15)
           0000000000567810: 07fe                bcr     15,%r14
Call Trace:
([<000003ff80008900>] __this_module+0x0/0xffffffffffffd700 [x_tables])
([<0000000000264fd4>] do_init_module+0x12c/0x220)
([<00000000001da14a>] load_module+0x24e2/0x2b10)
([<00000000001da976>] SyS_finit_module+0xbe/0xd8)
([<0000000000803b26>] system_call+0xd6/0x264)
Last Breaking-Event-Address:
 [<000000000056771a>] rb_erase+0x12/0x4d0
 Kernel panic - not syncing: Fatal exception: panic_on_oops

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Fixes: e8a97e42dc98 ("s390/pageattr: allow kernel page table splitting")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 arch/s390/mm/pageattr.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/s390/mm/pageattr.c b/arch/s390/mm/pageattr.c
index 7104ffb5a67f..af7cf28cf97e 100644
--- a/arch/s390/mm/pageattr.c
+++ b/arch/s390/mm/pageattr.c
@@ -252,6 +252,8 @@ static int change_page_attr(unsigned long addr, unsigned long end,
 	int rc = -EINVAL;
 	pgd_t *pgdp;
 
+	if (addr == end)
+		return 0;
 	if (end >= MODULES_END)
 		return -EINVAL;
 	mutex_lock(&cpa_mutex);
-- 
2.6.6
