* MCEs on MIPS: multiple matching TLB entries
From: Yu Zhao @ 2024-06-28 17:57 UTC
To: linux-mips; +Cc: Linux-MM
Hi,
OpenWrt folks ran into MCEs caused by multiple matching TLB entries
[1], after they updated their kernel from v6.1 to v6.6.
I reported similar crashes previously [2], on v6.4. So they asked me
whether I'm aware of a fix from the mainline, which I am not.
I took a quick look from the MM's POV and found nothing obviously
wrong. I'm hoping they have better luck with the MIPS experts.
Thanks!
[1] https://github.com/openwrt/openwrt/pull/15635
[2] https://lore.kernel.org/linux-mm/CAOUHufbAjZd4Mxkio9OGct-TZ=L0QRG+_6Xa7atQVFN_4ez86w@mail.gmail.com/
Copying and pasting one of the crashes from OpenWrt:
CFE for WNR3500L version: v1.0.36
Build Date: Tue Aug 11 15:09:14 CST 2009
Init Arena
Init Devs.
Boot partition size = 262144(0x40000)
Found a 8MB ST compatible serial flash
et0: Broadcom BCM47XX 10/100/1000 Mbps Ethernet Controller 5.10.56.28
CPU type 0x19740: 453MHz
Tot mem: 65536 KBytes
Device eth0: hwaddr 00-FF-FF-FF-FF-FF, ipaddr 192.168.1.1, mask 255.255.255.0
gateway not set, nameserver not set
too long file.
LZMA boot failed
Loader:raw Filesys:raw Dev:flash0.os File: Options:(null)
Loading: .. 3808 bytes read
Entry at 0x80001000
Closing network.
Starting program at 0x80001000
[ 0.000000] Linux version 6.6.35 (user@connors) (mipsel-openwrt-linux-musl-gcc (OpenWrt GCC 13.3.0 r25518+987-f7a68458b4) 13.3.0, GNU ld (GNU Binutils) 2.42) #0 Sun Jun 23 09:14:12 2024
[ 0.000000] CPU0 revision is: 00019740 (MIPS 74Kc)
[ 0.000000] bcm47xx: Using bcma bus
[ 0.000000] (NULL device *): bus0: Found chip with id 0x4716, rev 0x01 and package 0x0A
[ 0.000000] Initrd not found or empty - disabling initrd
[ 0.000000] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[ 0.000000] Primary data cache 32kB, 4-way, VIPT, cache aliases, linesize 32 bytes
[ 0.000000] This processor doesn't support highmem. -65536k highmem ignored
[ 0.000000] Zone ranges:
[ 0.000000] Normal [mem 0x0000000000000000-0x0000000003ffffff]
[ 0.000000] HighMem empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000000000000-0x0000000003ffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000003ffffff]
[ 0.000000] Kernel command line: noinitrd console=ttyS0,115200
[ 0.000000] Dentry cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 4096 (order: 2, 16384 bytes, linear)
[ 0.000000] Writing ErrCtl register=00000000
[ 0.000000] Readback ErrCtl register=00000000
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 16240
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] Memory: 56672K/65536K available (5819K kernel code, 596K rwdata, 1244K rodata, 204K init, 297K bss, 8864K reserved, 0K cma-reserved, 0K highmem)
[ 0.000000] NR_IRQS: 256
[ 0.000000] bcm47xx_soc: bus0: Core 0 found: ChipCommon (manuf 0x4BF, id 0x800, rev 0x1F, class 0x0)
[ 0.000000] bcm47xx_soc: bus0: Core 1 found: IEEE 802.11 (manuf 0x4BF, id 0x812, rev 0x11, class 0x0)
[ 0.000000] bcm47xx_soc: bus0: Core 2 found: GBit MAC (manuf 0x4BF, id 0x82D, rev 0x00, class 0x0)
[ 0.000000] bcm47xx_soc: bus0: Core 3 found: MIPS 74K (manuf 0x4A7, id 0x82C, rev 0x01, class 0x0)
[ 0.000000] bcm47xx_soc: bus0: Core 4 found: USB 2.0 Host (manuf 0x4BF, id 0x819, rev 0x04, class 0x0)
[ 0.000000] bcm47xx_soc: bus0: Core 5 found: PCIe (manuf 0x4BF, id 0x820, rev 0x0E, class 0x0)
[ 0.000000] bcm47xx_soc: bus0: Core 6 found: DDR1/DDR2 Memory Controller (manuf 0x4BF, id 0x82E, rev 0x01, class 0x0)
[ 0.000000] bcm47xx_soc: bus0: Core 7 found: Internal Memory (manuf 0x4BF, id 0x80E, rev 0x07, class 0x0)
[ 0.000000] bcm47xx_soc: bus0: Core 8 found: I2S (manuf 0x4BF, id 0x834, rev 0x00, class 0x0)
[ 0.000000] bcm47xx_soc: bus0: Found M25P64 serial flash (size: 8192KiB, blocksize: 0x10000, blocks: 128)
[ 0.000000] bcm47xx_soc: bus0: Early bus registered
[ 0.000000] MIPS: machine is Netgear WNR3500L
[ 0.000000] bcm47xx: Setting up vectored interrupts
[ 0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 8438235966 ns
[ 0.000003] sched_clock: 32 bits at 227MHz, resolution 4ns, wraps every 9481163773ns
[ 0.009630] Calibrating delay loop... 226.09 BogoMIPS (lpj=1130496)
[ 0.080067] pid_max: default: 32768 minimum: 301
[ 0.098070] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.098182] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.115630] RCU Tasks Trace: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1.
[ 0.119327] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[ 0.119449] futex hash table entries: 256 (order: -1, 3072 bytes, linear)
[ 0.127277] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[ 0.147294] clocksource: Switched to clocksource MIPS
[ 0.162062] NET: Registered PF_INET protocol family
[ 0.162678] IP idents hash table entries: 2048 (order: 2, 16384 bytes, linear)
[ 0.164718] tcp_listen_portaddr_hash hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.167040] Table-perturb hash table entries: 65536 (order: 6, 262144 bytes, linear)
[ 0.167138] TCP established hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.167258] TCP bind hash table entries: 1024 (order: 1, 8192 bytes, linear)
[ 0.167386] TCP: Hash tables configured (established 1024 bind 1024)
[ 0.168124] UDP hash table entries: 256 (order: 0, 4096 bytes, linear)
[ 0.168379] UDP-Lite hash table entries: 256 (order: 0, 4096 bytes, linear)
[ 0.169663] NET: Registered PF_UNIX/PF_LOCAL protocol family
[ 0.169984] PCI: CLS 0 bytes, default 32
[ 0.201695] bcm47xx_soc: bus0: PCIEcore in host mode found
[ 0.201712] bcm47xx_soc: bus0: This PCIE core is disabled and not working
[ 0.203394] gpio gpiochip0: Static allocation of GPIO base is deprecated, use dynamic allocation.
[ 0.204527] bcm47xx_soc: bus0: Bus registered
[ 0.230753] workingset: timestamp_bits=14 max_order=14 bucket_order=0
[ 0.233286] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[ 0.233331] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME) (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
[ 0.244147] Serial: 8250/16550 driver, 2 ports, IRQ sharing enabled
[ 0.246592] printk: console [ttyS0] disabled
[ 0.267632] serial8250.0: ttyS0 at MMIO 0xb8000300 (irq = 2, base_baud = 1250000) is a U6_16550A
[ 0.267762] printk: console [ttyS0] enabled
[ 0.793749] 3 bcm47xxpart partitions found on MTD device bcm47xxsflash
[ 0.800497] Creating 3 MTD partitions on "bcm47xxsflash":
[ 0.806143] 0x000000000000-0x000000040000 : "boot"
[ 0.844873] 0x000000040000-0x0000007f0000 : "firmware"
[ 0.853949] failed to parse "brcm,trx-magic" DT attribute, using default: -89
[ 0.861402] 3 trx partitions found on MTD device firmware
[ 0.866918] Creating 3 MTD partitions on "firmware":
[ 0.872045] 0x00000000001c-0x000000000928 : "loader"
[ 0.877127] mtd: partition "loader" doesn't start on an erase/write block boundary -- force read-only
[ 0.892292] 0x000000000928-0x00000024f800 : "linux"
[ 0.897409] mtd: partition "linux" doesn't start on an erase/write block boundary -- force read-only
[ 0.911927] 0x00000024f800-0x0000007b0000 : "rootfs"
[ 0.917042] mtd: partition "rootfs" doesn't start on an erase/write block boundary -- force read-only
[ 0.930529] mtd: setting mtd4 (rootfs) as root device
[ 0.935780] 1 squashfs-split partitions found on MTD device rootfs
[ 0.942210] 0x000000560000-0x0000007b0000 : "rootfs_data"
[ 0.953347] 0x0000007f0000-0x000000800000 : "nvram"
[ 0.976291] bgmac_bcma bcma0:2: Found PHY addr: 30 (NOREGS)
[ 1.089641] b53_common: found switch: BCM53115, rev 8
[ 1.095225] bgmac_bcma bcma0:2: Support for Roboswitch not implemented
[ 1.104192] bgmac_bcma: Broadcom 47xx GBit MAC driver loaded
[ 1.110909] bcm47xx-wdt bcm47xx-wdt.0: BCM47xx Watchdog Timer enabled (30 seconds)
[ 1.121569] NET: Registered PF_INET6 protocol family
[ 1.146114] Segment Routing with IPv6
[ 1.150302] In-situ OAM (IOAM) with IPv6
[ 1.154877] NET: Registered PF_PACKET protocol family
[ 1.160253] 8021q: 802.1Q VLAN Support v1.8
[ 1.273304] VFS: Mounted root (squashfs filesystem) readonly on device 31:4.
[ 1.282239] Freeing unused kernel image (initmem) memory: 204K
[ 1.288306] This architecture does not have kernel memory protection.
[ 1.294893] Run /sbin/init as init process
[ 2.348011] init: Console is alive
[ 2.352416] init: - watchdog -
[ 3.465026] kmodloader: loading kernel modules from /etc/modules-boot.d/*
[ 3.645823] usbcore: registered new interface driver usbfs
[ 3.651824] usbcore: registered new interface driver hub
[ 3.657587] usbcore: registered new device driver usb
[ 3.678605] gpio_button_hotplug: loading out-of-tree module taints kernel.
[ 3.722626] ehci-platform ehci-platform.0: EHCI Host Controller
[ 3.728888] ehci-platform ehci-platform.0: new USB bus registered, assigned bus number 1
[ 3.737524] ehci-platform ehci-platform.0: irq 5, io mem 0x18004000
[ 3.767369] ehci-platform ehci-platform.0: USB 2.0 started, EHCI 1.00
[ 3.776631] hub 1-0:1.0: USB hub found
[ 3.782776] hub 1-0:1.0: 2 ports detected
[ 3.813642] ohci-platform ohci-platform.0: Generic Platform OHCI controller
[ 3.820986] ohci-platform ohci-platform.0: new USB bus registered, assigned bus number 2
[ 3.829703] ohci-platform ohci-platform.0: irq 5, io mem 0x18009000
[ 3.903777] hub 2-0:1.0: USB hub found
[ 3.909946] hub 2-0:1.0: 2 ports detected
[ 3.940493] kmodloader: done loading kernel modules from /etc/modules-boot.d/*
[ 3.959102] init: - preinit -
[ 3.965715] Got mcheck at 800104ec
[ 3.969240] CPU: 0 PID: 245 Comm: init Tainted: G O 6.6.35 #0
[ 3.976522] $ 0 : 00000000 00000001 fffd5000 00000001
[ 3.981900] $ 4 : 00000004 00000003 00026edf 7fc8c008
[ 3.987273] $ 8 : 00000000 00000001 0000001b 00000068
[ 3.992635] $12 : 7fc8d600 81c2376c 81c2370c ffffff00
[ 3.998007] $16 : 7fc8d000 8192dad0 00000000 806ed280
[ 4.003379] $20 : 8192dad0 00000001 807b1000 7fc8d7c0
[ 4.008751] $24 : 77d75000 ffffffff
[ 4.014122] $28 : 81ad6000 81ad7dc8 00000000 80017678
[ 4.019486] Hi : 00000071
[ 4.022428] Lo : 0ceb0000
[ 4.025369] epc : 800104ec __kmap_pgprot+0xdc/0x108
[ 4.030547] ra : 80017678 r4k_flush_cache_page+0x24c/0x29c
[ 4.036430] Status: 1120a402 KERNEL EXL
[ 4.040455] Cause : 00800060 (ExcCode 18)
[ 4.044541] PrId : 00019740 (MIPS 74Kc)
[ 4.048543] CPU: 0 PID: 245 Comm: init Tainted: G O 6.6.35 #0
[ 4.055816] Stack : 00000000 00000001 807b1000 800625ac 00000000 00000004 00000000 00000000
[ 4.064399] 81ad7c8c 807b0000 80780000 8064d57c 81ac1528 00000001 81ad7c30 860f9307
[ 4.072985] 00000000 00000000 8064d57c 81ad7b70 ffffefff 00000000 00000000 ffffffea
[ 4.081562] 00000000 81ad7b7c 000000a9 806f5a88 00000000 8064d57c 00000000 806ed280
[ 4.090140] 8192dad0 00000001 807b1000 7fc8d7c0 00000018 80324b6c 89052010 00000060
[ 4.098718] ...
[ 4.101229] Call Trace:
[ 4.103730] [<80006fd0>] show_stack+0x28/0xf0
[ 4.108219] [<8057ec8c>] dump_stack_lvl+0x38/0x60
[ 4.113077] [<80008108>] do_mcheck+0x2c/0xa0
[ 4.117462] [<80003d34>] handle_mcheck_int+0x3c/0x48
[ 4.122547]
[ 4.124072] Index : 3
[ 4.126661] PageMask : 0
[ 4.129241] EntryHi : fffd4000
[ 4.132447] EntryLo0 : 00026edf
[ 4.135652] EntryLo1 : 00026edf
[ 4.138848] Wired : 4
[ 4.141430]
[ 4.142954] Index: 0 pgmask=4kb va=fffd4000 asid=00
[ 4.142954] [pa=007af000 c=3 d=1 v=1 g=1] [pa=007af000 c=3 d=1 v=1 g=1]
[ 4.154868] Index: 1 pgmask=4kb va=fffd2000 asid=00
[ 4.154868] [pa=0105b000 c=3 d=1 v=1 g=1] [pa=0105b000 c=3 d=1 v=1 g=1]
[ 4.166775] Index: 2 pgmask=4kb va=fffd0000 asid=00
[ 4.166775] [pa=01018000 c=3 d=1 v=1 g=1] [pa=01018000 c=3 d=1 v=1 g=1]
[ 4.178682] Index: 3 pgmask=4kb va=fffd4000 asid=00
[ 4.178682] [pa=009bb000 c=3 d=1 v=1 g=1] [pa=009bb000 c=3 d=1 v=1 g=1]
[ 4.190590] Index: 19 pgmask=4kb va=80026000 asid=00
[ 4.190590] [pa=00000000 c=0 d=0 v=0 g=0] [pa=00000000 c=0 d=0 v=0 g=0]
[ 4.202496] Index: 28 pgmask=4kb va=80038000 asid=00
[ 4.202496] [pa=00000000 c=0 d=0 v=0 g=0] [pa=00000000 c=0 d=0 v=0 g=0]
[ 4.214400] Index: 29 pgmask=4kb va=8003a000 asid=00
[ 4.214400] [pa=00000000 c=0 d=0 v=0 g=0] [pa=00000000 c=0 d=0 v=0 g=0]
[ 4.226312]
[ 4.227840] Code: 40843000 40850000 000000c0 <42000002> 000000c0 40875000 10600002 41606000 41606020
[ 4.237866] Kernel panic - not syncing: Caught Machine Check exception - caused by multiple matching entries in the TLB.
[ 4.248920] Rebooting in 1 seconds..
[ 5.250906] bcm47xx: Please stand by while rebooting the system...
Decompressing..........done
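For anyone reading the dump above, here is a minimal sketch in Python of how the Cause and Status values decode. The bit positions are assumed from the MIPS32 privileged architecture (ExcCode in Cause[6:2], TS in Status[21], EXL in Status[1]); the helper names are made up for this illustration and are not kernel functions.

```python
# Illustrative helpers (hypothetical names, not kernel code).
# Bit layout assumed from the MIPS32 privileged architecture:
#   Cause[6:2] = ExcCode
#   Status[21] = TS  (set when a machine check is due to a TLB multiple match)
#   Status[1]  = EXL (exception level)

# Only a few exception codes are named here; the rest print as "other".
EXC_NAMES = {0: "Int", 1: "Mod", 2: "TLBL", 3: "TLBS", 24: "MCheck"}


def exc_code(cause: int) -> int:
    """Extract the 5-bit ExcCode field from a CP0 Cause value."""
    return (cause >> 2) & 0x1F


def status_ts(status: int) -> bool:
    """Return True if the TS bit (bit 21) is set in a CP0 Status value."""
    return bool(status & (1 << 21))


def status_exl(status: int) -> bool:
    """Return True if the EXL bit (bit 1) is set in a CP0 Status value."""
    return bool(status & (1 << 1))


CAUSE = 0x00800060   # from the "Cause :" line of the dump
STATUS = 0x1120A402  # from the "Status:" line of the dump

code = exc_code(CAUSE)
print(f"ExcCode = {code} (0x{code:02x}), {EXC_NAMES.get(code, 'other')}")
print(f"Status.TS = {status_ts(STATUS)}, Status.EXL = {status_exl(STATUS)}")
```

The decoded value 24 is 0x18, which matches the "ExcCode 18" in the dump if that field is printed in hexadecimal, and a set TS bit is consistent with the "multiple matching entries in the TLB" panic message.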
* Re: MCEs on MIPS: multiple matching TLB entries
From: Jiaxun Yang @ 2024-06-30 2:22 UTC
To: Yu Zhao, linux-mips@vger.kernel.org; +Cc: Linux-MM
On June 28, 2024, at 6:57 PM, Yu Zhao wrote:
> Hi,
>
> OpenWrt folks ran into MCEs caused by multiple matching TLB entries
> [1], after they updated their kernel from v6.1 to v6.6.
>
> I reported similar crashes previously [2], on v6.4. So they asked me
> whether I'm aware of a fix from the mainline, which I am not.
>
> I took a quick look from the MM's POV and found nothing obviously
> wrong. I'm hoping they have better luck with the MIPS experts.
Hi Yu,
I have never hit such a problem on my (non-bcm) 74Kc systems.
However, a quick glance suggests it may be related to the Wired TLB entries
on your platform.
The duplicated TLB entries, Index 0 and 3 (both va=fffd4000), are below the
"Wired" setting of 4, which means they are managed not by mm but by platform code.
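The TLB dump pasted in the first message can also be checked mechanically. The following is only a sketch: it assumes the "Index: N pgmask=... va=... asid=..." line format seen in that dump, the function name is made up, and it keys entries on the (va, asid) pair without modelling the global bit or page masks.

```python
import re
from collections import defaultdict

# "Index:" lines copied from the crash log in this thread.
DUMP = """\
Index: 0 pgmask=4kb va=fffd4000 asid=00
Index: 1 pgmask=4kb va=fffd2000 asid=00
Index: 2 pgmask=4kb va=fffd0000 asid=00
Index: 3 pgmask=4kb va=fffd4000 asid=00
Index: 19 pgmask=4kb va=80026000 asid=00
Index: 28 pgmask=4kb va=80038000 asid=00
Index: 29 pgmask=4kb va=8003a000 asid=00
"""
WIRED = 4  # from the "Wired :" line of the same log

ENTRY = re.compile(
    r"Index:\s*(\d+)\s+pgmask=\S+\s+va=([0-9a-f]+)\s+asid=([0-9a-f]+)"
)


def find_duplicates(dump: str) -> dict:
    """Map each (va, asid) pair that occurs more than once to its TLB indices."""
    by_key = defaultdict(list)
    for match in ENTRY.finditer(dump):
        index = int(match.group(1))
        va = int(match.group(2), 16)
        asid = int(match.group(3), 16)
        by_key[(va, asid)].append(index)
    return {key: idx for key, idx in by_key.items() if len(idx) > 1}


for (va, asid), indices in find_duplicates(DUMP).items():
    below_wired = all(i < WIRED for i in indices)
    print(f"va={va:08x} asid={asid:02x} indices={indices} "
          f"all below Wired={below_wired}")
```

For the dump in this thread it reports indices 0 and 3 for va=fffd4000, both below the Wired value of 4.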
Thanks
- Jiaxun
>
> Thanks!
>
> [1] https://github.com/openwrt/openwrt/pull/15635
> [2]
> https://lore.kernel.org/linux-mm/CAOUHufbAjZd4Mxkio9OGct-TZ=L0QRG+_6Xa7atQVFN_4ez86w@mail.gmail.com/
>
--
- Jiaxun
* Re: MCEs on MIPS: multiple matching TLB entries
From: Jiaxun Yang @ 2024-06-30 3:01 UTC
To: Yu Zhao, linux-mips@vger.kernel.org; +Cc: Linux-MM
On June 30, 2024, at 3:22 AM, Jiaxun Yang wrote:
> On June 28, 2024, at 6:57 PM, Yu Zhao wrote:
>> Hi,
>>
>> OpenWrt folks ran into MCEs caused by multiple matching TLB entries
>> [1], after they updated their kernel from v6.1 to v6.6.
>>
>> I reported similar crashes previously [2], on v6.4. So they asked me
>> whether I'm aware of a fix from the mainline, which I am not.
>>
>> I took a quick look from the MM's POV and found nothing obviously
>> wrong. I'm hoping they have better luck with the MIPS experts.
>
> Hi Yu,
>
> I never hit such problem on my (non-bcm) 74Kc systems.
>
> However a quick glance suggested it may be related to Wired TLB entries
> on your platform.
>
> Both duplicated TLB entries, Index 2 and 3, are all below "Wired" setting,
> which means they are not managed by mm, but platform code.
I just dug into the bcm47xx platform code, and I think we should blame
bcm47xx_prom_highmem_init, which creates a wired entry for highmem that may
conflict with the kernel's mapping.
Nowadays the MIPS mm code can handle highmem on its own, so there is no need
to create such an entry, IMO.
>
> Thanks
> - Jiaxun
>
>>
>> Thanks!
>>
>> [1] https://github.com/openwrt/openwrt/pull/15635
>> [2]
>> https://lore.kernel.org/linux-mm/CAOUHufbAjZd4Mxkio9OGct-TZ=L0QRG+_6Xa7atQVFN_4ez86w@mail.gmail.com/
>>
> --
> - Jiaxun
--
- Jiaxun
* Re: MCEs on MIPS: multiple matching TLB entries
From: Jiaxun Yang @ 2024-06-30 17:25 UTC
To: Yu Zhao, linux-mips@vger.kernel.org; +Cc: Linux-MM
On June 30, 2024, at 4:01 AM, Jiaxun Yang wrote:
> On June 30, 2024, at 3:22 AM, Jiaxun Yang wrote:
>> On June 28, 2024, at 6:57 PM, Yu Zhao wrote:
>>> Hi,
>>>
>>> OpenWrt folks ran into MCEs caused by multiple matching TLB entries
>>> [1], after they updated their kernel from v6.1 to v6.6.
>>>
>>> I reported similar crashes previously [2], on v6.4. So they asked me
>>> whether I'm aware of a fix from the mainline, which I am not.
>>>
>>> I took a quick look from the MM's POV and found nothing obviously
>>> wrong. I'm hoping they have better luck with the MIPS experts.
>>
>> Hi Yu,
>>
>> I never hit such problem on my (non-bcm) 74Kc systems.
>>
>> However a quick glance suggested it may be related to Wired TLB entries
>> on your platform.
>>
>> Both duplicated TLB entries, Index 2 and 3, are all below "Wired" setting,
>> which means they are not managed by mm, but platform code.
>
> I just tried to dig into bcm47xx platform code and I think we should blame
> bcm47xx_prom_highmem_init, which created wired entry for high mem and may
> conflict with kernel's mapping.
>
> Nowadays, MIPS mm code can handle highmem on it's own, so there is no need
> to create such entry IMO.
Sorry, I think I made a wrong diagnosis; it's actually a problem in our cache
alias code.
I will try to fix it.
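As background on why this CPU has cache aliases at all, here is a rough arithmetic sketch using the figures from the boot log (32kB, 4-way, VIPT data cache) and the 4kB page size shown in the TLB dump. The function names and the two sample addresses are illustrative only and do not come from the kernel.

```python
# Back-of-the-envelope cache alias arithmetic (illustrative, not kernel code).
# A virtually indexed cache can alias when one way is larger than a page,
# because index bits above the page offset then come from the virtual address.

DCACHE_BYTES = 32 * 1024  # "Primary data cache 32kB"
DCACHE_WAYS = 4           # "4-way"
PAGE_BYTES = 4 * 1024     # "pgmask=4kb"


def cache_colours(cache_bytes: int, ways: int, page_bytes: int) -> int:
    """Number of page colours: way size divided by page size, at least 1."""
    way_bytes = cache_bytes // ways
    return max(1, way_bytes // page_bytes)


def page_colour(vaddr: int, cache_bytes: int, ways: int, page_bytes: int) -> int:
    """Colour of the page holding vaddr, from the index bits above the page offset."""
    way_bytes = cache_bytes // ways
    return (vaddr % way_bytes) // page_bytes


colours = cache_colours(DCACHE_BYTES, DCACHE_WAYS, PAGE_BYTES)
print(f"way size = {DCACHE_BYTES // DCACHE_WAYS} bytes, colours = {colours}")

# Two sample virtual addresses on adjacent pages (illustrative values).
for vaddr in (0x7FC8C000, 0x7FC8D000):
    colour = page_colour(vaddr, DCACHE_BYTES, DCACHE_WAYS, PAGE_BYTES)
    print(f"vaddr {vaddr:08x} -> colour {colour}")
```

With an 8kB way and 4kB pages there are two colours, so two virtual mappings of the same physical page can land in different cache lines unless they share a colour. That is the situation the cache alias code has to handle.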
Thanks
>
>>
>> Thanks
>> - Jiaxun
>>
>>>
>>> Thanks!
>>>
>>> [1] https://github.com/openwrt/openwrt/pull/15635
>>> [2]
>>> https://lore.kernel.org/linux-mm/CAOUHufbAjZd4Mxkio9OGct-TZ=L0QRG+_6Xa7atQVFN_4ez86w@mail.gmail.com/
>>>
>> --
>> - Jiaxun
>
> --
> - Jiaxun
--
- Jiaxun
* Re: MCEs on MIPS: multiple matching TLB entries
From: Yu Zhao @ 2024-06-30 18:23 UTC
To: Jiaxun Yang; +Cc: linux-mips@vger.kernel.org, Linux-MM
On Sun, Jun 30, 2024 at 11:25 AM Jiaxun Yang <jiaxun.yang@flygoat.com> wrote:
>
>
>
> On June 30, 2024, at 4:01 AM, Jiaxun Yang wrote:
> > On June 30, 2024, at 3:22 AM, Jiaxun Yang wrote:
> >> On June 28, 2024, at 6:57 PM, Yu Zhao wrote:
> >>> Hi,
> >>>
> >>> OpenWrt folks ran into MCEs caused by multiple matching TLB entries
> >>> [1], after they updated their kernel from v6.1 to v6.6.
> >>>
> >>> I reported similar crashes previously [2], on v6.4. So they asked me
> >>> whether I'm aware of a fix from the mainline, which I am not.
> >>>
> >>> I took a quick look from the MM's POV and found nothing obviously
> >>> wrong. I'm hoping they have better luck with the MIPS experts.
> >>
> >> Hi Yu,
> >>
> >> I never hit such problem on my (non-bcm) 74Kc systems.
> >>
> >> However a quick glance suggested it may be related to Wired TLB entries
> >> on your platform.
> >>
> >> Both duplicated TLB entries, Index 2 and 3, are all below "Wired" setting,
> >> which means they are not managed by mm, but platform code.
> >
> > I just tried to dig into bcm47xx platform code and I think we should blame
> > bcm47xx_prom_highmem_init, which created wired entry for high mem and may
> > conflict with kernel's mapping.
> >
> > Nowadays, MIPS mm code can handle highmem on it's own, so there is no need
> > to create such entry IMO.
>
> Sorry, I think I made a wrong diagnoses, it's actually a problem in our cache
> alias code.
>
> Will try to fix.
Thanks for looking into this!
* Re: MCEs on MIPS: multiple matching TLB entries
From: Jiaxun Yang @ 2024-07-03 5:13 UTC
To: Yu Zhao; +Cc: linux-mips@vger.kernel.org, Linux-MM
On July 1, 2024, at 2:23 AM, Yu Zhao wrote:
> On Sun, Jun 30, 2024 at 11:25 AM Jiaxun Yang <jiaxun.yang@flygoat.com> wrote:
>>
>>
>>
>> On June 30, 2024, at 4:01 AM, Jiaxun Yang wrote:
>> > On June 30, 2024, at 3:22 AM, Jiaxun Yang wrote:
>> >> On June 28, 2024, at 6:57 PM, Yu Zhao wrote:
>> >>> Hi,
>> >>>
>> >>> OpenWrt folks ran into MCEs caused by multiple matching TLB entries
>> >>> [1], after they updated their kernel from v6.1 to v6.6.
>> >>>
>> >>> I reported similar crashes previously [2], on v6.4. So they asked me
>> >>> whether I'm aware of a fix from the mainline, which I am not.
>> >>>
>> >>> I took a quick look from the MM's POV and found nothing obviously
>> >>> wrong. I'm hoping they have better luck with the MIPS experts.
>> >>
>> >> Hi Yu,
>> >>
>> >> I never hit such problem on my (non-bcm) 74Kc systems.
>> >>
>> >> However a quick glance suggested it may be related to Wired TLB entries
>> >> on your platform.
>> >>
>> >> Both duplicated TLB entries, Index 2 and 3, are all below "Wired" setting,
>> >> which means they are not managed by mm, but platform code.
>> >
>> > I just tried to dig into bcm47xx platform code and I think we should blame
>> > bcm47xx_prom_highmem_init, which created wired entry for high mem and may
>> > conflict with kernel's mapping.
>> >
>> > Nowadays, MIPS mm code can handle highmem on it's own, so there is no need
>> > to create such entry IMO.
>>
>> Sorry, I think I made a wrong diagnoses, it's actually a problem in our cache
>> alias code.
>>
>> Will try to fix.
>
> Thanks for looking into this!
So the problem was introduced by an OpenWrt downstream patch rather than by upstream code.
It has been sorted out at: https://github.com/openwrt/openwrt/pull/15635
No further action required :-)
Thanks
--
- Jiaxun