Linux PARISC architecture development
From: matoro <matoro_mailinglist_kernel@matoro.tk>
To: John David Anglin <dave.anglin@bell.net>
Cc: Vidra.Jonas@seznam.cz, linux-parisc@vger.kernel.org,
	John David Anglin <dave@parisc-linux.org>,
	Helge Deller <deller@gmx.de>
Subject: Re: [PATCH] parisc: Try to fix random segmentation faults in package builds
Date: Mon, 10 Jun 2024 15:52:52 -0400
Message-ID: <13894865a496a7f2a6ed607e2ef708c4@matoro.tk>
In-Reply-To: <213f7afe-5bc8-40ff-835c-1fadaae0a96d@bell.net>

On 2024-06-04 13:08, John David Anglin wrote:
> On 2024-06-04 11:07 a.m., matoro wrote:
>>> Thanks a ton Dave, I've applied this on top of 6.9.2 and also think I'm
>>> seeing improvement!  No panics yet, I have a couple of weeks' worth of
>>> package testing to catch up on so I'll report if I see anything!
>> 
>> I've seen a few warnings in my dmesg while testing, although I didn't see 
>> any immediately corresponding failures.  Any danger?
> We have determined most of the warnings arise from pages that have been
> swapped out.  Mostly, it seems these pages have been flushed to memory
> before the pte is changed to a swap pte.  There might be issues for pages
> that have been cleared.  It is possible the random faults aren't related
> to the warning I added for pages with an invalid pfn in
> flush_cache_page_if_present.  The only thing I know for certain is there
> is no way to flush these pages on parisc other than flushing the whole
> cache.
>
> My c8000 has run almost two weeks without any random faults.  On the other
> hand, Helge has two machines that frequently fault and generate these
> warnings.
>
> Flushing the whole cache in flush_cache_mm and flush_cache_range might
> eliminate the random faults but there will be a significant performance
> hit.
> 
> Dave

Unfortunately, I had a few of these faults trip today after ~4 days of
uptime, with corresponding random segfaults.  One of the WARNs was emitted
shortly before, though not for the same PID.  I reattempted the build twice,
and it randomly segfaulted all 3 times.  I had to reboot as usual to get the
machine out of the bad state.

[Mon Jun 10 14:26:20 2024] ------------[ cut here ]------------
[Mon Jun 10 14:26:20 2024] WARNING: CPU: 1 PID: 26453 at arch/parisc/kernel/cache.c:624 flush_cache_page_if_present+0x1a4/0x330
[Mon Jun 10 14:26:20 2024] Modules linked in: nfnetlink af_packet overlay loop nfsv4 dns_resolver nfs lockd grace sunrpc netfs autofs4 binfmt_misc sr_mod ohci_pci cdrom ehci_pci ohci_hcd ehci_hcd tg3 usbcore pata_cmd64x ipmi_si hwmon usb_common ipmi_devintf libata libphy nls_base ipmi_msghandler
[Mon Jun 10 14:26:20 2024] CPU: 1 PID: 26453 Comm: ld.so.1 Tainted: G        W          6.9.3-gentoo-parisc64 #1
[Mon Jun 10 14:26:20 2024] Hardware name: 9000/800/rp3440

[Mon Jun 10 14:26:20 2024]      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
[Mon Jun 10 14:26:20 2024] PSW: 00001000000001001111100100001111 Tainted: G        W
[Mon Jun 10 14:26:20 2024] r00-03  000000ff0804f90f 000000004106b280 00000000402090bc 000000007f4c85f0
[Mon Jun 10 14:26:20 2024] r04-07  0000000040f99a80 00000000f855d000 00000000561b6360 000000000800000f
[Mon Jun 10 14:26:20 2024] r08-11  0000000c009674de 0000000000000000 0000004100b2e39c 000000007f4c81c0
[Mon Jun 10 14:26:20 2024] r12-15  00000000561b6360 0000004100b2e330 0000000000000002 0000000000000000
[Mon Jun 10 14:26:20 2024] r16-19  0000000040f50360 fffffffffffffff4 000000007f4c8108 0000000000000003
[Mon Jun 10 14:26:20 2024] r20-23  0000000000001a46 0000000011b81000 ffffffffc0000000 00000000f859d000
[Mon Jun 10 14:26:20 2024] r24-27  0000000000000000 000000000800000f 0000004100b2e3a0 0000000040f99a80
[Mon Jun 10 14:26:20 2024] r28-31  0000000000000000 000000007f4c8670 000000007f4c86a0 0000000000000000
[Mon Jun 10 14:26:20 2024] sr00-03  000000000604d000 000000000604d000 0000000000000000 000000000604d000
[Mon Jun 10 14:26:20 2024] sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000

[Mon Jun 10 14:26:20 2024] IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040209104 0000000040209108
[Mon Jun 10 14:26:20 2024]  IIR: 03ffe01f    ISR: 0000000000000000  IOR: 0000000000000000
[Mon Jun 10 14:26:20 2024]  CPU:        1   CR30: 00000001e700e780 CR31: fffffff0f0e05ee0
[Mon Jun 10 14:26:20 2024]  ORIG_R28: 00000000414cab90
[Mon Jun 10 14:26:20 2024]  IAOQ[0]: flush_cache_page_if_present+0x1a4/0x330
[Mon Jun 10 14:26:20 2024]  IAOQ[1]: flush_cache_page_if_present+0x1a8/0x330
[Mon Jun 10 14:26:20 2024]  RP(r2): flush_cache_page_if_present+0x15c/0x330
[Mon Jun 10 14:26:20 2024] Backtrace:
[Mon Jun 10 14:26:20 2024]  [<000000004020b110>] flush_cache_range+0x138/0x158
[Mon Jun 10 14:26:20 2024]  [<00000000405fdfc8>] change_protection+0x134/0xb78
[Mon Jun 10 14:26:20 2024]  [<00000000405feb4c>] mprotect_fixup+0x140/0x478
[Mon Jun 10 14:26:20 2024]  [<00000000405ff15c>] do_mprotect_pkey.constprop.0+0x2d8/0x5f0
[Mon Jun 10 14:26:20 2024]  [<00000000405ff4a4>] sys_mprotect+0x30/0x60
[Mon Jun 10 14:26:20 2024]  [<0000000040203fbc>] syscall_exit+0x0/0x10

[Mon Jun 10 14:26:20 2024] ---[ end trace 0000000000000000 ]---
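
For anyone reading the PSW line in the dump above: the letter row printed
just above it appears to be a per-bit legend, one letter per bit, and the
low 32 bits of r00 in the dump match the printed PSW value.  Below is a
minimal Python sketch that pairs the two, assuming the legend and the bit
string line up column for column.  The function name psw_flags is
hypothetical and not part of any kernel tooling.

```python
# Legend row copied verbatim from the register dump in this message.
PSW_LEGEND = "YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI"


def psw_flags(r00: int) -> str:
    """Return the legend letters whose PSW bit is set.

    Assumption: the PSW is the low 32 bits of r00, and bit i of the
    printed binary string corresponds to PSW_LEGEND[i].
    """
    bits = format(r00 & 0xFFFFFFFF, "032b")
    return "".join(name for name, bit in zip(PSW_LEGEND, bits) if bit == "1")


if __name__ == "__main__":
    # r00 from the WARNING dump (kernel context).
    print(psw_flags(0x000000FF0804F90F))
```

For the r00 value in the WARNING dump this gives the letters W, C, several
carry/borrow (cb) bits, and Q, P, D, I, which is what one would expect for
64-bit kernel code running with address translation enabled.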

[Mon Jun 10 14:28:04 2024] do_page_fault() command='ld.so.1' type=15 address=0x161236a0 in libc.so[f8b9c000+1b6000]
                            trap #15: Data TLB miss fault, vm_start = 0x4208e000, vm_end = 0x420af000
[Mon Jun 10 14:28:04 2024] CPU: 0 PID: 26681 Comm: ld.so.1 Tainted: G        W          6.9.3-gentoo-parisc64 #1
[Mon Jun 10 14:28:04 2024] Hardware name: 9000/800/rp3440

[Mon Jun 10 14:28:04 2024]      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
[Mon Jun 10 14:28:04 2024] PSW: 00000000000001100000000000001111 Tainted: G        W
[Mon Jun 10 14:28:04 2024] r00-03  000000000006000f 00000000f8d584a8 00000000f8c46e33 0000000000000028
[Mon Jun 10 14:28:04 2024] r04-07  00000000f8d54660 00000000f8d54648 0000000000000020 000000000001ab91
[Mon Jun 10 14:28:04 2024] r08-11  00000000f8d54654 00000000f8d5bf78 0000000000000005 00000000f9ad87c8
[Mon Jun 10 14:28:04 2024] r12-15  0000000000000000 0000000000000000 000000000000003f 00000000000003e9
[Mon Jun 10 14:28:04 2024] r16-19  000000000001a000 000000000001a000 000000000001a000 00000000f8d56ca8
[Mon Jun 10 14:28:04 2024] r20-23  0000000000000000 00000000f8c46bcc 000000000001a2d8 00000000ffffffff
[Mon Jun 10 14:28:04 2024] r24-27  0000000000000000 0000000000000020 00000000f8d54648 000000000001a000
[Mon Jun 10 14:28:04 2024] r28-31  0000000000000001 0000000016123698 00000000f9ad8cc0 00000000f9ad8c2c
[Mon Jun 10 14:28:04 2024] sr00-03  0000000006069400 0000000006069400 0000000000000000 0000000006069400
[Mon Jun 10 14:28:04 2024] sr04-07  0000000006069400 0000000006069400 0000000006069400 0000000006069400

[Mon Jun 10 14:28:04 2024]       VZOUICununcqcqcqcqcqcrmunTDVZOUI
[Mon Jun 10 14:28:04 2024] FPSR: 00000000000000000000000000000000
[Mon Jun 10 14:28:04 2024] FPER1: 00000000
[Mon Jun 10 14:28:04 2024] fr00-03  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[Mon Jun 10 14:28:04 2024] fr04-07  3fbc58dcd6e825cf 41d98fdb92c00000 00001d29b5e9bfb4 41d999952df718f9
[Mon Jun 10 14:28:04 2024] fr08-11  ffe3d998c543273c ff60537aba025d00 004698b61bd9b9ee 000527c1bed53af7
[Mon Jun 10 14:28:04 2024] fr12-15  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[Mon Jun 10 14:28:04 2024] fr16-19  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[Mon Jun 10 14:28:04 2024] fr20-23  0000000000000000 0000000000000000 0000000000000020 0000000000000000
[Mon Jun 10 14:28:04 2024] fr24-27  0000000000000003 0000000000000000 3d473181aed58d64 bff0000000000000
[Mon Jun 10 14:28:04 2024] fr28-31  3fc999b324f10111 057028cc5c564e70 dbc91a3f6bd13476 02632fb493c76730

[Mon Jun 10 14:28:04 2024] IASQ: 0000000006069400 0000000006069400 IAOQ: 00000000f8c44063 00000000f8c44067
[Mon Jun 10 14:28:04 2024]  IIR: 0fb0109c    ISR: 0000000006069400  IOR: 00000000161236a0
[Mon Jun 10 14:28:04 2024]  CPU:        0   CR30: 00000001e70099e0 CR31: fffffff0f0e05ee0
[Mon Jun 10 14:28:04 2024]  ORIG_R28: 0000000000000000
[Mon Jun 10 14:28:04 2024]  IAOQ[0]: 00000000f8c44063
[Mon Jun 10 14:28:04 2024]  IAOQ[1]: 00000000f8c44067
[Mon Jun 10 14:28:04 2024]  RP(r2): 00000000f8c46e33
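
As a reading aid for the fault report above, here is a small Python sketch
that locates the faulting instruction inside libc.so and checks the data
address against the ranges the kernel printed.  All numbers are copied from
the dump.  The helper name in_range is hypothetical, and the sketch relies
on the PA-RISC convention that the low two bits of IAOQ hold the privilege
level rather than address bits.

```python
def in_range(addr: int, start: int, length: int) -> bool:
    """True if addr lies in the half-open range [start, start + length)."""
    return start <= addr < start + length


# Values copied from the do_page_fault() report in this message.
LIBC_BASE, LIBC_LEN = 0xF8B9C000, 0x1B6000   # libc.so[f8b9c000+1b6000]
VM_START, VM_END = 0x4208E000, 0x420AF000    # vm_start / vm_end as printed
FAULT_ADDR = 0x161236A0                      # address= / IOR
IAOQ_FRONT = 0xF8C44063                      # IAOQ[0]

# Strip the two privilege-level bits to get the instruction address.
pc = IAOQ_FRONT & ~0x3
libc_offset = pc - LIBC_BASE

if __name__ == "__main__":
    print(f"faulting instruction at libc.so+{libc_offset:#x}")
    print("pc inside libc.so:", in_range(pc, LIBC_BASE, LIBC_LEN))
    print("fault address inside printed VMA:",
          in_range(FAULT_ADDR, VM_START, VM_END - VM_START))
```

With these inputs the instruction sits at libc.so+0xa8060, while the data
address 0x161236a0 falls outside both libc.so and the printed vm_start to
vm_end range.  It is also exactly r29 + 8 (r29 is 0x16123698 in the dump),
so a bad pointer in r29 is one possible reading, though the dump alone does
not prove it.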
