From: Sven Schnelle <svens@stackframe.org>
To: Helge Deller <deller@gmx.de>
Cc: linux-parisc@vger.kernel.org,
Mikulas Patocka <mpatocka@redhat.com>,
James Bottomley <James.Bottomley@hansenpartnership.com>,
John David Anglin <dave.anglin@bell.net>,
Mel Gorman <mgorman@techsingularity.net>
Subject: Re: [PATCH] parisc: Switch from DISCONTIGMEM to SPARSEMEM
Date: Mon, 15 Apr 2019 21:52:03 +0200
Message-ID: <20190415195203.GC4827@t470p.stackframe.org>
In-Reply-To: <20190410173911.GA11288@ls3530.dellerweb.de>
Hi,
On Wed, Apr 10, 2019 at 07:39:11PM +0200, Helge Deller wrote:
> The commit 1c30844d2dfe ("mm: reclaim small amounts of memory when an
> external fragmentation event occurs") breaks memory management on a
> parisc c8000 workstation with this memory layout:
>
> 0) Start 0x0000000000000000 End 0x000000003fffffff Size 1024 MB
> 1) Start 0x0000000100000000 End 0x00000001bfdfffff Size 3070 MB
> 2) Start 0x0000004040000000 End 0x00000040ffffffff Size 3072 MB
>
> With the patch 1c30844d2dfe, the kernel will incorrectly reclaim the
> first zone when it fills up, ignoring the fact that there are two
> completely free zones. Basically, it limits cache size to 1GiB.
>
> The parisc kernel is currently using the DISCONTIGMEM implementation,
> but isn't NUMA. Avoid this issue and strange work-arounds by switching
> to the more commonly used SPARSEMEM implementation.
> [..]
Unfortunately, this patch breaks booting on my J5000. The second CPU fails
to start and triggers an HPMC (bus timeout). Booting with this patch and
the nosmp command line option works. On my C3750 there's no problem.
Here's the dmesg:
[ 0.000000] Linux version 5.1.0-rc3-64bit+ (svens@t470p) (gcc version 7.4.0 (GCC)) #259 SMP Mon Apr 15 20:57:57 CEST 2019
[ 0.000000] CPU0: thread -1, cpu 0, socket 0
[ 0.000000] FP[0] enabled: Rev 1 Model 16
[ 0.000000] The 64-bit Kernel has started...
[ 0.000000] Kernel default page size is 4 KB. Huge pages disabled.
[ 0.000000] printk: bootconsole [ttyB0] enabled
[ 0.000000] Initialized PDC Console for debugging.
[ 0.000000] Determining PDC firmware type: System Map.
[ 0.000000] model 00005bd0 00000491 00000000 00000002 782482ee 100000f0 00000008 000000b2 000000b2
[ 0.000000] vers 00000201
[ 0.000000] CPUID vers 17 rev 5 (0x00000225)
[ 0.000000] capabilities 0x3
[ 0.000000] model 9000/785/J5000
[ 0.000000] Memory Ranges:
[ 0.000000] 0) Start 0x0000000000000000 End 0x00000000efffffff Size 3840 MB
[ 0.000000] 1) Start 0x00000010f0000000 End 0x00000010ffffffff Size 256 MB
[ 0.000000] Total Memory: 4096 MB
[ 0.000000] PDT: type PDT_PDC, size 50, entries 0, status 2, dbe_loc 0xffffffffffffffff, good_mem 171 MB
[ 0.000000] PDT: Firmware reports all memory OK.
[ 0.000000] LCD display at fffffff0f05d0008,fffffff0f05d0000 registered
[ 0.000000] percpu: Embedded 25 pages/cpu @(____ptrval____) s64064 r8192 d30144 u102400
[ 0.000000] SMP: bootstrap CPU ID is 0
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 1032192
[ 0.000000] Kernel command line: HOME=/ root=/dev/sda4 panic_timeout=60 panic=10 console=ttyS0,9600 kgdboc=ttyS0,9600 palo_kernel=0/vmlinuz
[ 0.000000] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
[ 0.000000] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
[ 0.000000] Memory: 4102676K/4194304K available (5660K kernel code, 1638K rwdata, 940K rodata, 444K init, 932K bss, 91628K reserved, 0K cma-reserved)
[ 0.000000] SLUB: HWalign=16, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[ 0.000000] rcu: Hierarchical RCU implementation.
[ 0.000000] rcu: RCU event tracing is enabled.
[ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[ 0.000000] NR_IRQS: 128
[ 0.000019] sched_clock: 64 bits at 440MHz, resolution 2ns, wraps every 4398046511103ns
[ 0.106184] Console: colour dummy device 160x64
[ 0.165835] Calibrating delay loop... 872.44 BogoMIPS (lpj=1744896)
[ 0.269845] pid_max: default: 32768 minimum: 301
[ 0.330589] Mount-cache hash table entries: 8192 (order: 4, 65536 bytes)
[ 0.422023] Mountpoint-cache hash table entries: 8192 (order: 4, 65536 bytes)
[ 0.514921] *** VALIDATE proc ***
[ 0.558204] *** VALIDATE cgroup1 ***
[ 0.605858] *** VALIDATE cgroup2 ***
[ 0.655933] rcu: Hierarchical SRCU implementation.
[ 0.878065] smp: Bringing up secondary CPUs ...
[ 0.937862] smp: Brought up 1 node, 1 CPU
[ 0.994306] devtmpfs: initialized
[ 1.040437] random: get_random_u32 called from bucket_table_alloc+0x270/0x2a0 with crng_init=0
[ 1.154583] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 1.281884] futex hash table entries: 1024 (order: 3, 32768 bytes)
[ 1.366411] NET: Registered protocol family 16
[ 1.426212] Searching for devices...
[ 1.717848] Found devices:
[ 1.749864] 1. Astro BC Runway Port at 0xfffffffffed00000 [10] { 12, 0x0, 0x582, 0x0000b }
[ 1.861898] 2. Elroy PCI Bridge at 0xfffffffffed30000 [10/0] { 13, 0x0, 0x782, 0x0000a }
[ 1.965864] 3. Elroy PCI Bridge at 0xfffffffffed32000 [10/1] { 13, 0x0, 0x782, 0x0000a }
[ 2.073861] 4. Elroy PCI Bridge at 0xfffffffffed34000 [10/2] { 13, 0x0, 0x782, 0x0000a }
[ 2.177861] 5. Elroy PCI Bridge at 0xfffffffffed38000 [10/4] { 13, 0x0, 0x782, 0x0000a }
[ 2.285861] 6. Elroy PCI Bridge at 0xfffffffffed3c000 [10/6] { 13, 0x0, 0x782, 0x0000a }
[ 2.393860] 7. Forte W 2-way at 0xfffffffffffa0000 [32] { 0, 0x0, 0x5bd, 0x00004 }
[ 2.493860] 8. Forte W 2-way at 0xfffffffffffa2000 [34] { 0, 0x0, 0x5bd, 0x00004 }
[ 2.593860] 9. Memory at 0xfffffffffed10200 [49] { 1, 0x0, 0x088, 0x00009 }
[ 2.681855] Enabling regular chassis codes support v0.05
[ 2.874416] CPU1: thread -1, cpu 0, socket 1
[ 2.935257] Releasing cpu 1 now, hpa=fffffffffffa2000
[hangs here forever]
One interesting detail is that if I reserve PAGE0 from memory, at least the HPMC
handler from the kernel is triggered:
[ 2.785875] Backtrace:
[ 2.785875] [<00000000401d20b4>] smp_boot_one_cpu+0x15c/0x1e8
[ 2.785875] [<00000000401d2270>] __cpu_up+0xe0/0xf0
[ 2.785875] [<00000000401f2580>] bringup_cpu+0xa0/0x1e0
[ 2.785875] [<00000000401f1770>] cpuhp_invoke_callback+0x118/0x848
[ 2.785875] [<00000000401f4848>] do_cpu_up+0x290/0x3d8
[ 2.785875] [<00000000401f49f8>] cpu_up+0x68/0x80
[ 2.785875] [<000000004010c47c>] processor_probe+0x3ec/0x420
[ 2.785875] [<00000000401cae7c>] parisc_driver_probe+0x6c/0x98
[ 2.785875] [<000000004083cb20>] really_probe+0x398/0x560
[ 2.785875] [<000000004083d1e8>] driver_probe_device+0x198/0x1a0
[ 2.785875] [<000000004083d3d0>] __driver_attach+0x1e0/0x1e8
[ 2.785875] [<0000000040837ba0>] bus_for_each_dev+0x108/0x170
[ 2.785875] [<000000004083bbb8>] driver_attach+0x80/0x98
[ 2.785875] [<000000004083ab70>] bus_add_driver+0x298/0x4b8
[ 2.785875] [<000000004083e628>] driver_register+0xe0/0x268
[ 2.785875] [<00000000401cb0a0>] register_parisc_driver+0xa0/0x118
[ 2.785875] [<000000004010cb44>] processor_init+0x6c/0x80
[ 2.785875] [<0000000040108348>] parisc_init+0x348/0x5c0
[ 2.785875] [<00000000401b30bc>] do_one_initcall+0xb4/0x2c8
[ 2.785875] [<0000000040102ac0>] kernel_init_freeable+0x5a0/0x730
[ 2.785875] [<0000000040bb0890>] kernel_init+0x60/0x318
[ 2.785875] [<00000000401be020>] ret_from_kernel_thread+0x20/0x28
[ 2.785875]
[ 2.785875]
[ 2.785875] High Priority Machine Check (HPMC): Code=1 (High-priority machine check (HPMC)) at addr 0000000000000000
[ 2.785875] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.0-64bit+ #116
[ 2.785875] Hardware name: 9000/785/J5000
[ 2.785875]
[ 2.785875] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
[ 2.785875] PSW: 00001000000001001111011100001111 Not tainted
[ 2.785875] r00-03 000000ff0804f70f 0000000040d27a80 0000000040bab790 00000000400ecfe0
[ 2.785875] r04-07 0000000040c58a80 0000000000000064 0000000000000002 0000000040ecee30
[ 2.785875] r08-11 00000000400ecfb0 0000000040c7ba80 0000000000000001 000000004106c858
[ 2.785875] r12-15 000000004106c860 0000000040f0c9e8 0000000000000064 0000000000000001
[ 2.785875] r16-19 0000000000000000 0000000040d29a80 0000000040c78280 000000005666f671
[ 2.785875] r20-23 0000000000000000 0000000000000000 00000000000001b8 000000000000abe0
[ 2.785875] r24-27 00000000400ecfe0 0000000000000000 0000000000000000 0000000040c58a80
[ 2.785875] r28-31 000000000000abe0 00000000400ed040 00000000400ed070 0000000056679e96
[ 2.785875] sr00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2.785875] sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2.785875]
[ 2.785875] IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040bab7c8 0000000040bab7cc
[ 2.785875] IIR: 82953fb5 ISR: 0000000010340000 IOR: 000000003b4ed078
[ 2.785875] CPU: 0 CR30: 00000000400ec000 CR31: 00000000ffffffff
[ 2.785875] ORIG_R28: 0000000000000000
[ 2.785875] IAOQ[0]: __udelay+0xe0/0x110
[ 2.785875] IAOQ[1]: __udelay+0xe4/0x110
[ 2.785875] RP(r2): __udelay+0xa8/0x110
[ 2.785875] Backtrace:
[ 2.785875] [<00000000401d20b4>] smp_boot_one_cpu+0x15c/0x1e8
[ 2.785875] [<00000000401d2270>] __cpu_up+0xe0/0xf0
[ 2.785875] [<00000000401f2580>] bringup_cpu+0xa0/0x1e0
[ 2.785875] [<00000000401f1770>] cpuhp_invoke_callback+0x118/0x848
[ 2.785875] [<00000000401f4848>] do_cpu_up+0x290/0x3d8
[ 2.785875] [<00000000401f49f8>] cpu_up+0x68/0x80
[ 2.785875] [<000000004010c47c>] processor_probe+0x3ec/0x420
[ 2.785875] [<00000000401cae7c>] parisc_driver_probe+0x6c/0x98
[ 2.785875] [<000000004083cb20>] really_probe+0x398/0x560
[ 2.785875] [<000000004083d1e8>] driver_probe_device+0x198/0x1a0
[ 2.785875] [<000000004083d3d0>] __driver_attach+0x1e0/0x1e8
[ 2.785875] [<0000000040837ba0>] bus_for_each_dev+0x108/0x170
[ 2.785875] [<000000004083bbb8>] driver_attach+0x80/0x98
[ 2.785875] [<000000004083ab70>] bus_add_driver+0x298/0x4b8
[ 2.785875] [<000000004083e628>] driver_register+0xe0/0x268
[ 2.785875] [<00000000401cb0a0>] register_parisc_driver+0xa0/0x118
[ 2.785875] [<000000004010cb44>] processor_init+0x6c/0x80
[ 2.785875] [<0000000040108348>] parisc_init+0x348/0x5c0
[ 2.785875] [<00000000401b30bc>] do_one_initcall+0xb4/0x2c8
[ 2.785875] [<0000000040102ac0>] kernel_init_freeable+0x5a0/0x730
[ 2.785875] [<0000000040bb0890>] kernel_init+0x60/0x318
[ 2.785875] [<00000000401be020>] ret_from_kernel_thread+0x20/0x28
[ 2.785875]
[ 2.785875] Kernel panic - not syncing: High Priority Machine Check (HPMC)
[ 2.785875] Rebooting in 10 seconds..
The PIM record shows:
----------------- Processor 1 HPMC Information ------------------
Timestamp =
Thu Apr 11 13:35:21 GMT 2019 (20:19:04:11:13:35:21)
HPMC Chassis Codes = 2cbf0 2510b 2cbf5 2cbfc
General Registers 0 - 31
00-03 0000000000000000 0000000040000000 000000f0f0002090 0000000000000000
04-07 0000000000000e33 fffffff0f0400008 00000000000000fa fffffff0f0002f68
08-11 fffffffffee003f8 00000000000000c4 000000000000000a fffffff0f0001608
12-15 00000000000000f2 0000000000000001 0000000000000001 00000000000000f3
16-19 0000000002020202 0000000000000002 fffffff0f000016c 0440c24000000000
20-23 00000000000000cc 0000000000000001 0000000000000009 0000000000000000
24-27 000000000400c240 fffffffffffa2000 fffffff0f0000018 fffffff0f0412000
28-31 fffffffffffa2000 fffffff0f040ae70 00000010fb0e6f60 0000000000000000
Control Registers 0 - 31
00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000
04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
08-11 0000000000000000 0000000000000000 0000000000000000 0000000000000000
12-15 0000000000000000 0000000000000000 0000000000000000 0000000000000000
16-19 0000001959e11c2f 0000000000000000 0000000000100274 000000000fd010de
20-23 00000000a637ffec c0000000398e6f68 000000ff00007f08 0000000000000000
24-27 ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
28-31 ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
Space Registers 0 - 7
00-03 00000000 00000000 00000000 00000000
04-07 00000000 00000000 00000000 00000000
IIA Space = 0x0000000000000000
IIA Offset = 0x0000000000100278
Check Type = 0x20000000
CPU State = 0x9e000004
Cache Check = 0x00000000
TLB Check = 0x00000000
Bus Check = 0x0030103b
Assists Check = 0x00000000
Assist State = 0x00000000
Path Info = 0x00000000
System Responder Address = 0x000000fffb0e6f68
System Requestor Address = 0xfffffffffffa2000
0000000040100250 <smp_slave_stext>:
40100250: 00 00 38 20 mtsp r0,sr4
40100254: 00 00 78 20 mtsp r0,sr5
40100258: 00 00 b8 20 mtsp r0,sr6
4010025c: 00 00 f8 20 mtsp r0,sr7
40100260: 23 d6 50 20 ldil L%106c800,sp
40100264: 37 de 00 b0 ldo 58(sp),sp
40100268: 0f c0 10 de ldd 0(sp),sp
4010026c: 20 20 08 00 ldil L%40000000,r1
40100270: 08 3e 04 1e sub sp,r1,sp
40100274: 0f d0 10 de ldd 8(sp),sp <-- HPMC
40100278: 03 de 18 40 mtctl sp,tr6
4010027c: 37 de 01 80 ldo c0(sp),sp
40100280: 20 94 20 20 ldil L%1029000,r4
40100284: 34 84 00 00 ldo 0(r4),r4
40100288: 03 04 18 40 mtctl r4,tr0
4010028c: 03 24 18 40 mtctl r4,tr1
40100290: 08 1a 02 43 copy r26,r3
40100294: 21 66 18 02 ldil L%4010c800,r11
40100298: 35 6b 0e c0 ldo 760(r11),r11
4010029c: e8 1f 1c b5 b,l 401000fc <common_stext>,r0
401002a0: 08 00 02 40 nop
sp (r30) is 00000010fb0e6f60, which is valid RAM. However, the load triggers an HPMC
and the display shows a bus timeout. Does anyone have an idea what's going wrong?
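As a rough cross-check of the addresses involved, here is a minimal Python
sketch. It only redoes the arithmetic from the dmesg, the PIM record and the
disassembly above; the names (MEMORY_RANGES, find_range, R1_OFFSET and so on)
are made up for illustration and are not kernel symbols. It assumes that
"sub sp,r1,sp" subtracts r1 from sp and that "ldd 8(sp),sp" reads at sp + 8.

```python
# Values copied from the dmesg and PIM output; helper names are illustrative.
MEMORY_RANGES = [
    (0x0000000000000000, 0x00000000EFFFFFFF),  # range 0, 3840 MB
    (0x00000010F0000000, 0x00000010FFFFFFFF),  # range 1, 256 MB
]
R1_OFFSET = 0x40000000            # r1 after "ldil L%40000000,r1" (GR01 in PIM)
SP_AT_HPMC = 0x00000010FB0E6F60   # r30 in the PIM record
RESPONDER = 0x000000FFFB0E6F68    # "System Responder Address" in the PIM record


def find_range(addr):
    """Return the index of the memory range containing addr, or None."""
    for index, (start, end) in enumerate(MEMORY_RANGES):
        if start <= addr <= end:
            return index
    return None


# Assumption: "ldd 8(sp),sp" reads 8 bytes at sp + 8.
load_addr = SP_AT_HPMC + 8
# Assumption: "sub sp,r1,sp" computed sp - r1 just before, so the value
# fetched by "ldd 0(sp),sp" must have been sp + r1.
pointer_before_sub = SP_AT_HPMC + R1_OFFSET

print(f"pointer before sub: {pointer_before_sub:#018x}")
print(f"load address:       {load_addr:#018x} -> range {find_range(load_addr)}")
print(f"responder address:  {RESPONDER:#018x} -> range {find_range(RESPONDER)}")
print(f"load ^ responder:   {load_addr ^ RESPONDER:#018x}")
```

With these inputs the load address 0x10fb0e6f68 falls into memory range 1,
while the System Responder Address 0xfffb0e6f68 is in neither range. The two
differ only in bits 32-39 (0x10 vs. 0xff). Whether that difference is related
to the bus timeout is not something this arithmetic can answer.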
Regards,
Sven