public inbox for linux-kernel@vger.kernel.org
* Oops on /proc/interrupt access with 6.5-rc1
@ 2023-07-11 15:51 Johan Hovold
  2023-07-11 16:56 ` Shanker Donthineni
  2023-07-11 18:14 ` Marc Zyngier
  0 siblings, 2 replies; 5+ messages in thread
From: Johan Hovold @ 2023-07-11 15:51 UTC
  To: Marc Zyngier, Thomas Gleixner, Shanker Donthineni
  Cc: Konrad Dybcio, linux-kernel

Hi,

Konrad reported on IRC that he hit a segfault and hang when watching
/proc/interrupts with 6.5-rc1.

I tried simply catting it and hit the below oops immediately with my
X13s (aarch64).

Commit 721255b9826b ("genirq: Use a maple tree for interrupt descriptor
management") stood out when skimming the log, and Marc soon suggested
the same possible culprit on IRC.

I have not been able to reproduce it with the maple tree patch reverted,
but I hit it again after adding it back. It did not trigger immediately
after boot, though; the machine had been idling for a few minutes in
between.

Marc asked for a dump, so I figured I'd CC the list as well.

Johan


[ 2546.693932] Unable to handle kernel paging request at virtual address ffff80008106bb19
[ 2546.695148] Mem abort info:
[ 2546.695562]   ESR = 0x0000000096000007
[ 2546.695976]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 2546.696394]   SET = 0, FnV = 0
[ 2546.696807]   EA = 0, S1PTW = 0
[ 2546.697220]   FSC = 0x07: level 3 translation fault
[ 2546.697642] Data abort info:
[ 2546.698066]   ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
[ 2546.698494]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 2546.698922]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 2546.699355] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000002d7a05000
[ 2546.699792] [ffff80008106bb19] pgd=10000001000a5003, p4d=10000001000a5003, pud=10000001000a6003, pmd=1000000100d5a003, pte=0000000000000000
[ 2546.700387] Internal error: Oops: 0000000096000007 [#1] PREEMPT SMP
[ 2546.700796] Modules linked in: snd_soc_wsa883x q6prm_clocks q6apm_lpass_dais snd_q6dsp_common q6apm_dai q6prm michael_mic cbc des_generic libdes ecb algif_skcipher md5 algif_hash af_alg ip6_tables xt_LOG nf_log_syslog ipt_REJECT nf_reject_ipv4 xt_tcpudp snd_q6apm xt_conntrack nf_conntrack libcrc32c nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter r8152 mii qrtr_mhi panel_edp snd_soc_hdmi_codec venus_enc venus_dec apr videobuf2_dma_contig videobuf2_memops fastrpc qrtr_smd rpmsg_ctrl rpmsg_char qcom_pm8008_regulator qcom_battmgr pmic_glink_altmode ath11k_pci ath11k venus_core snd_soc_wcd938x v4l2_mem2mem hci_uart mac80211 msm videobuf2_v4l2 snd_soc_wcd938x_sdw snd_soc_sc8280xp libarc4 btqca snd_soc_qcom_common regmap_sdw snd_soc_lpass_rx_macro videodev snd_soc_lpass_va_macro soundwire_qcom leds_qcom_lpg snd_soc_lpass_wsa_macro bluetooth snd_soc_lpass_tx_macro qcom_spmi_adc_tm5 snd_soc_wcd_mbhc snd_soc_qcom_sdw snd_soc_lpass_macro_common cfg80211 gpu_sched gpio_sbu_mux videobuf2_common qcom_spmi_temp_alarm snd_soc_core
[ 2546.700875]  qcom_spmi_adc5 ecdh_generic drm_display_helper ecc qcom_pon mc snd_compress qcom_q6v5_pas industrialio rtc_pm8xxx reboot_mode phy_qcom_qmp_combo mhi led_class_multicolor nvmem_qcom_spmi_sdam drm_dp_aux_bus rfkill qcom_vadc_common snd_pcm qcom_pil_info drm_kms_helper qcom_common phy_qcom_edp qcom_pm8008 qcom_stats qrtr qcom_glink_smem snd_timer typec videocc_sc8280xp icc_bwmon qcom_q6v5 phy_qcom_qmp_usb pinctrl_sc8280xp_lpass_lpi regmap_i2c snd qcom_sysmon phy_qcom_snps_femto_v2 pmic_glink soundwire_bus pinctrl_lpass_lpi pdr_interface lpasscc_sc8280xp icc_osm_l3 mdt_loader soundcore socinfo qcom_wdt qcom_rng qmi_helpers pwm_bl drm dm_mod ip_tables x_tables ipv6 pcie_qcom crc8 phy_qcom_qmp_pcie nvme nvme_core hid_multitouch i2c_qcom_geni i2c_hid_of i2c_hid i2c_core
[ 2546.705703] CPU: 4 PID: 610 Comm: cat Not tainted 6.5.0-rc1 #45
[ 2546.706287] Hardware name: LENOVO 21BYZ9SRUS/21BYZ9SRUS, BIOS N3HET53W (1.25 ) 10/12/2022
[ 2546.706880] pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2546.707476] pc : string+0x4c/0xfc
[ 2546.708080] lr : vsnprintf+0x170/0x748
[ 2546.708674] sp : ffff800083563ac0
[ 2546.709265] x29: ffff800083563ac0 x28: ffff11b942bca791 x27: ffffbb03f92e0974
[ 2546.709866] x26: ffffbb03f92e0974 x25: 0000000000000020 x24: 0000000000000871
[ 2546.710476] x23: 00000000ffffffd8 x22: ffffbb03f9161778 x21: ffff800083563c10
[ 2546.711083] x20: ffff11b942bca78f x19: ffff11b942bcb000 x18: 0000000000000020
[ 2546.711688] x17: 0000000000000000 x16: 0000000000000000 x15: ffffffffffffffff
[ 2546.712297] x14: 0000000000000001 x13: 0000000000000003 x12: ffff11b942bca783
[ 2546.712910] x11: 0000000000000000 x10: 0000000000000020 x9 : 0000000000000000
[ 2546.713522] x8 : 00000000ffffffff x7 : ffff800083563c10 x6 : 0000000000000020
[ 2546.714133] x5 : ffff11b942bcb000 x4 : 0000000000000000 x3 : ffff0a00ffffff04
[ 2546.714752] x2 : ffff80008106bb19 x1 : ffffffffffffffff x0 : ffff11b942bca791
[ 2546.715362] Call trace:
[ 2546.715962]  string+0x4c/0xfc
[ 2546.716557]  vsnprintf+0x170/0x748
[ 2546.717152]  seq_printf+0xb4/0xd0
[ 2546.717746]  show_interrupts+0x2f4/0x4e8
[ 2546.718345]  seq_read_iter+0x3bc/0x4ac
[ 2546.718940]  proc_reg_read_iter+0x84/0xd8
[ 2546.719539]  vfs_read+0x1d4/0x294
[ 2546.720137]  ksys_read+0x68/0xf4
[ 2546.720735]  __arm64_sys_read+0x1c/0x28
[ 2546.721335]  invoke_syscall+0x48/0x114
[ 2546.721934]  el0_svc_common.constprop.0+0x60/0x10c
[ 2546.722536]  do_el0_svc+0x30/0x88
[ 2546.723132]  el0_svc+0x40/0xac
[ 2546.723729]  el0t_64_sync_handler+0xc0/0xc4
[ 2546.724329]  el0t_64_sync+0x190/0x194
[ 2546.724930] Code: 91000400 110004e1 eb08009f 540000e0 (38646846) 
[ 2546.725536] ---[ end trace 0000000000000000 ]---
[ 2546.726143] note: cat[610] exited with irqs disabled
[ 2546.726781] note: cat[610] exited with preempt_count 1
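
The "Code:" line and the register dump above can be cross-checked against
the reported fault address. The following is a minimal sketch, assuming the
A64 encoding of LDRB (register) and that the word in parentheses is the
faulting instruction; the helper name is made up for illustration and is
not a kernel or binutils API.

```python
# Hypothetical helper: decode an A64 "LDRB Wt, [Xn, Xm]" (register offset).
# Assumed fixed bits: size=00 111 V=0 00 opc=01 1 Rm option S 10 Rn Rt.
LDRB_REG_MASK = 0xFFE00C00
LDRB_REG_BITS = 0x38600800


def decode_ldrb_register(insn: int):
    """Return (rt, rn, rm, option) or None if insn is not LDRB (register)."""
    if insn & LDRB_REG_MASK != LDRB_REG_BITS:
        return None
    rm = (insn >> 16) & 0x1F      # offset register
    option = (insn >> 13) & 0x7   # 0b011 selects a plain 64-bit Xm offset
    rn = (insn >> 5) & 0x1F       # base register
    rt = insn & 0x1F              # destination register
    return rt, rn, rm, option


# Values copied from the oops above.
FAULTING_INSN = 0x38646846
REGS = {2: 0xFFFF80008106BB19, 4: 0x0000000000000000}
FAULT_VA = 0xFFFF80008106BB19

if __name__ == "__main__":
    rt, rn, rm, option = decode_ldrb_register(FAULTING_INSN)
    addr = (REGS[rn] + REGS[rm]) & 0xFFFFFFFFFFFFFFFF
    print(f"ldrb w{rt}, [x{rn}, x{rm}] -> load from {addr:#x}")
    print("matches fault address:", addr == FAULT_VA)
```

Under these assumptions the instruction is a byte load from x2 + x4, which
fits string() walking the bytes of a %s argument.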



* Re: Oops on /proc/interrupt access with 6.5-rc1
  2023-07-11 15:51 Oops on /proc/interrupt access with 6.5-rc1 Johan Hovold
@ 2023-07-11 16:56 ` Shanker Donthineni
  2023-07-11 18:14 ` Marc Zyngier
  1 sibling, 0 replies; 5+ messages in thread
From: Shanker Donthineni @ 2023-07-11 16:56 UTC
  To: Johan Hovold, Marc Zyngier, Thomas Gleixner; +Cc: Konrad Dybcio, linux-kernel

Hi,

On 7/11/23 10:51, Johan Hovold wrote:
> Hi,
> 
> Konrad reported on IRC that he hit a segfault and hang when watching
> /proc/interrupts with 6.5-rc1.
> 
> I tried simply catting it and hit the below oops immediately with my
> X13s (aarch64).
> 

I ran "cat /proc/interrupts" on the NVIDIA Grace server platform with
v6.5.0-rc1 and saw no errors. I tested with 8, 16 and 72 CPUs by limiting
the number of CPUs on the kernel command line (maxcpus=), and could not
reproduce the Oops in about 10 attempts.

root@Grace# uname -a
Linux Grace 6.5.0-rc1 #2 SMP Tue Jul 11 11:13:59 CDT 2023 aarch64 GNU/Linux

root@Grace# cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
   9:          0          0          0          0          0          0          0          0     GICv3  25 Level     vgic
  10:          0          0          0          0          0          0          0          0     GICv3  30 Level     kvm guest ptimer
  11:          0          0          0          0          0          0          0          0     GICv3  27 Level     kvm guest vtimer
  12:       3315       1855       1750       3268      10540       2394       8336       1607     GICv3  26 Level     arch_timer
  18:          0          0          0          0          0          0          0          0     GICv3 276 Edge      arm-smmu-v3-evtq
  19:          0          0          0          0          0          0          0          0     GICv3 277 Edge      arm-smmu-v3-gerror
  20:          0          0          0          0          0          0          0          0     GICv3 285 Edge      arm-smmu-v3-evtq
  21:          0          0          0          0          0          0          0          0     GICv3 286 Edge      arm-smmu-v3-gerror
  22:          0          0          0          0          0          0          0          0     GICv3 294 Edge      arm-smmu-v3-evtq
  23:          0          0          0          0          0          0          0          0     GICv3 295 Edge      arm-smmu-v3-gerror
  24:          0          0          0          0          0          0          0          0     GICv3 303 Edge      arm-smmu-v3-evtq
  25:          0          0          0          0          0          0          0          0     GICv3 304 Edge      arm-smmu-v3-gerror
  26:          3          0          0          0          0          0          0          0     GICv3 312 Edge      arm-smmu-v3-evtq
  27:          0          0          0          0          0          0          0          0     GICv3 313 Edge      arm-smmu-v3-gerror
  33:          0          0          0          0          0          0          0          0     GICv3 226 Level     ACPI:Ged
  34:          0          0          0          0          0          0          0          0     GICv3 227 Level     ACPI:Ged
  65:       1724          0          0          0          0          0          0          0     GICv3 202 Level     uart-pl011
  68:          0          0          0          0          0          0          0          0   ITS-MSI 1077444608 Edge      ehci_hcd:usb1
  69:          0          0          0          0          0          0          0          0     GICv3  23 Level     arm-pmu
  84:          0        150          0          0          0          0          0          0   ITS-MSI 1075314688 Edge      nvme0q0
  85:          0          0          0          0          0          0          0          0   ITS-MSI 1075314689 Edge      nvme0q1
  86:          0          0          0          0          0          0          0          0   ITS-MSI 1075314690 Edge      nvme0q2
  87:          0          0          0          0          0          0          0          0   ITS-MSI 1075314691 Edge      nvme0q3
  88:          0          0          0          0          0          0          0          0   ITS-MSI 1075314692 Edge      nvme0q4
  89:          0          0          0          0         10          0          0          0   ITS-MSI 1075314693 Edge      nvme0q5
  90:          0          0          0          0          0          0          0          0   ITS-MSI 1075314694 Edge      nvme0q6
  91:          0          0          0          0          0          0          0          0   ITS-MSI 1075314695 Edge      nvme0q7
  92:          0          0          0          0          0          0          0          0   ITS-MSI 1075314696 Edge      nvme0q8
  93:          0          0          0          0          0          0          0          0   ITS-MSI 1075314697 Edge      nvme0q9
  94:          0          0          0          0          0          0          0          0   ITS-MSI 1075314698 Edge      nvme0q10
  95:          0          0          0          0          0          0          0          0   ITS-MSI 1075314699 Edge      nvme0q11
  96:          0          0          0          0          0          0          0          0   ITS-MSI 1075314700 Edge      nvme0q12
  97:          0          0          0          0          0          0          0          0   ITS-MSI 1075314701 Edge      nvme0q13
  98:          0          0          0          0          0          0          0          0   ITS-MSI 1075314702 Edge      nvme0q14
  99:          0          0          0          0          0          0          0          0   ITS-MSI 1075314703 Edge      nvme0q15
100:          0          0          0          0          0          0          0          0   ITS-MSI 1075314704 Edge      nvme0q16
101:          0          0          0          0          0          0          0          0   ITS-MSI 1075314705 Edge      nvme0q17
102:          0          0          0          0          0          0          0          0   ITS-MSI 1075314706 Edge      nvme0q18
103:          0          0          0          0          0          0          0          0   ITS-MSI 1075314707 Edge      nvme0q19
104:          0          0          0          0          0          0          0          0   ITS-MSI 1075314708 Edge      nvme0q20
105:          0          0          0          0          0          0          0          0   ITS-MSI 1075314709 Edge      nvme0q21
106:          0          0          0          0          0          0          0          0   ITS-MSI 1075314710 Edge      nvme0q22
107:          0          0          0          0          0          0          0          0   ITS-MSI 1075314711 Edge      nvme0q23
108:          0          0          0          0          0          0          0          0   ITS-MSI 1075314712 Edge      nvme0q24
109:          0          0          0          0          0          0          0          0   ITS-MSI 1075314713 Edge      nvme0q25
110:          0          0          0          0          0          0          0          0   ITS-MSI 1075314714 Edge      nvme0q26
111:          0          0          0          0          0          0          0          0   ITS-MSI 1075314715 Edge      nvme0q27
112:          0          0          0          0          0          0          0          0   ITS-MSI 1075314716 Edge      nvme0q28
113:          0          0          0          0          0          0          0          0   ITS-MSI 1075314717 Edge      nvme0q29
114:          0          0          0          0          0          0          0          0   ITS-MSI 1075314718 Edge      nvme0q30
115:          0          0          0          0          0          0          0          0   ITS-MSI 1075314719 Edge      nvme0q31
116:          0          0          0          0          0          0          0          0   ITS-MSI 1075314720 Edge      nvme0q32
117:          0          0          0          0          0          0          0          0   ITS-MSI 1075314721 Edge      nvme0q33
118:          0          0          0          0          0          0          0          0   ITS-MSI 1075314722 Edge      nvme0q34
119:          0          0          0          0          0          0          0          0   ITS-MSI 1075314723 Edge      nvme0q35
120:          0          0          0          0          0          0          0          0   ITS-MSI 1075314724 Edge      nvme0q36
121:          0          0          0          0          0          0          0          0   ITS-MSI 1075314725 Edge      nvme0q37
122:          0          0          0          0          0          0          0          0   ITS-MSI 1075314726 Edge      nvme0q38
123:          0          0          0          0          0          0          0          0   ITS-MSI 1075314727 Edge      nvme0q39
124:          0          0          0          0          0          0          0          0   ITS-MSI 1075314728 Edge      nvme0q40
125:          0          0          0          0          0          0          0          0   ITS-MSI 1075314729 Edge      nvme0q41
126:          0          0          0          0          0          0          0          0   ITS-MSI 1075314730 Edge      nvme0q42
127:          0          0          0          0          0          0          0          0   ITS-MSI 1075314731 Edge      nvme0q43
128:          0          0          0          0          0          0          0          0   ITS-MSI 1075314732 Edge      nvme0q44
129:          0          0          0          0          0          0          0          0   ITS-MSI 1075314733 Edge      nvme0q45
130:          0          0          0          0          0          0          0          0   ITS-MSI 1075314734 Edge      nvme0q46
131:          0          0          0          0          0          0          0          0   ITS-MSI 1075314735 Edge      nvme0q47
132:          0          0          0          0          0          0          0          0   ITS-MSI 1075314736 Edge      nvme0q48
133:          0          0          0          0          0          0          0          0   ITS-MSI 1075314737 Edge      nvme0q49
134:          0          0          0          0          0          0          0          0   ITS-MSI 1075314738 Edge      nvme0q50
135:          0          0          0          0          0          0          0          0   ITS-MSI 1075314739 Edge      nvme0q51
136:          0          0          0          0          0          0          0          0   ITS-MSI 1075314740 Edge      nvme0q52
137:          0          0          0          0          0          0          0          0   ITS-MSI 1075314741 Edge      nvme0q53
138:          0          0          0          0          0          0          0          0   ITS-MSI 1075314742 Edge      nvme0q54
139:          0          0          0          0          0          0          0          0   ITS-MSI 1075314743 Edge      nvme0q55
140:          0          0          0          0          0          0          0          0   ITS-MSI 1075314744 Edge      nvme0q56
141:          0          0          0          0          0          0          0          0   ITS-MSI 1075314745 Edge      nvme0q57
142:          0          0          0          0          0          0          0          0   ITS-MSI 1075314746 Edge      nvme0q58
143:          0          0          0          0          0          0          0          0   ITS-MSI 1075314747 Edge      nvme0q59
144:          0          0          0          0          0          0          0          0   ITS-MSI 1075314748 Edge      nvme0q60
145:          0          0          0          0          0          0          0          0   ITS-MSI 1075314749 Edge      nvme0q61
146:          0          0          0          0          0          0          0          0   ITS-MSI 1075314750 Edge      nvme0q62
147:          0          0          0          0          0          0          0          0   ITS-MSI 1075314751 Edge      nvme0q63
148:          0          0          0          0          0          0          0          0   ITS-MSI 1075314752 Edge      nvme0q64
149:          0          0          0          0          0          0          0          0   ITS-MSI 1075314753 Edge      nvme0q65
150:          0          0          0          0          0          0          0          0   ITS-MSI 1075314754 Edge      nvme0q66
151:          0          0          0          0          0          0          0          0   ITS-MSI 1075314755 Edge      nvme0q67
152:          0          0          0          0          0          0          0          0   ITS-MSI 1075314756 Edge      nvme0q68
153:          0          0          0          0          0          0          0          0   ITS-MSI 1075314757 Edge      nvme0q69
154:          0          0          0          0          0          0          0          0   ITS-MSI 1075314758 Edge      nvme0q70
155:          0          0          0          0          0          0          0          0   ITS-MSI 1075314759 Edge      nvme0q71
156:          0          0          0          0          0          0          0          0   ITS-MSI 1075314760 Edge      nvme0q72
IPI0:        15          3          7         18         16         23         19         22       Rescheduling interrupts
IPI1:      4429        473        294       1307       3926       1535       1897        216       Function call interrupts
IPI2:         0          0          0          0          0          0          0          0       CPU stop interrupts
IPI3:         0          0          0          0          0          0          0          0       CPU stop (for crash dump) interrupts
IPI4:         0          0          0          0          0          0          0          0       Timer broadcast interrupts
IPI5:         0          0          0          0          0          0          0          0       IRQ work interrupts
IPI6:         0          0          0          0          0          0          0          0       CPU wake-up interrupts
Err:          0

> Commit 721255b9826b ("genirq: Use a maple tree for interrupt descriptor
> management") stood out when skimming the log, and Marc soon suggested
> the same possible culprit on IRC.
> 
> I have not been able to reproduce it with the maple tree patch reverted,
> but I hit it again after adding it back. Did not trigger immediately
> after boot though, I had had the machine idling for a few minutes in
> between.
> 
> Marc asked for a dump so figured I'd CC the list as well.
> 
> Johan
> 
> 
> [ 2546.693932] Unable to handle kernel paging request at virtual address ffff80008106bb19
> [ 2546.695148] Mem abort info:
> [ 2546.695562]   ESR = 0x0000000096000007
> [ 2546.695976]   EC = 0x25: DABT (current EL), IL = 32 bits
> [ 2546.696394]   SET = 0, FnV = 0
> [ 2546.696807]   EA = 0, S1PTW = 0
> [ 2546.697220]   FSC = 0x07: level 3 translation fault
> [ 2546.697642] Data abort info:
> [ 2546.698066]   ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
> [ 2546.698494]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [ 2546.698922]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [ 2546.699355] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000002d7a05000
> [ 2546.699792] [ffff80008106bb19] pgd=10000001000a5003, p4d=10000001000a5003, pud=10000001000a6003, pmd=1000000100d5a003, pte=0000000000000000
> [ 2546.700387] Internal error: Oops: 0000000096000007 [#1] PREEMPT SMP
> [ 2546.700796] Modules linked in: snd_soc_wsa883x q6prm_clocks q6apm_lpass_dais snd_q6dsp_common q6apm_dai q6prm michael_mic cbc des_generic libdes ecb algif_skcipher md5 algif_hash af_alg ip6_tables xt_LOG nf_log_syslog ipt_REJECT nf_reject_ipv4 xt_tcpudp snd_q6apm xt_conntrack nf_conntrack libcrc32c nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter r8152 mii qrtr_mhi panel_edp snd_soc_hdmi_codec venus_enc venus_dec apr videobuf2_dma_contig videobuf2_memops fastrpc qrtr_smd rpmsg_ctrl rpmsg_char qcom_pm8008_regulator qcom_battmgr pmic_glink_altmode ath11k_pci ath11k venus_core snd_soc_wcd938x v4l2_mem2mem hci_uart mac80211 msm videobuf2_v4l2 snd_soc_wcd938x_sdw snd_soc_sc8280xp libarc4 btqca snd_soc_qcom_common regmap_sdw snd_soc_lpass_rx_macro videodev snd_soc_lpass_va_macro soundwire_qcom leds_qcom_lpg snd_soc_lpass_wsa_macro bluetooth snd_soc_lpass_tx_macro qcom_spmi_adc_tm5 snd_soc_wcd_mbhc snd_soc_qcom_sdw snd_soc_lpass_macro_common cfg80211 gpu_sched gpio_sbu_mux videobuf2_common qcom_spmi_temp_alarm snd_soc_core
> [ 2546.700875]  qcom_spmi_adc5 ecdh_generic drm_display_helper ecc qcom_pon mc snd_compress qcom_q6v5_pas industrialio rtc_pm8xxx reboot_mode phy_qcom_qmp_combo mhi led_class_multicolor nvmem_qcom_spmi_sdam drm_dp_aux_bus rfkill qcom_vadc_common snd_pcm qcom_pil_info drm_kms_helper qcom_common phy_qcom_edp qcom_pm8008 qcom_stats qrtr qcom_glink_smem snd_timer typec videocc_sc8280xp icc_bwmon qcom_q6v5 phy_qcom_qmp_usb pinctrl_sc8280xp_lpass_lpi regmap_i2c snd qcom_sysmon phy_qcom_snps_femto_v2 pmic_glink soundwire_bus pinctrl_lpass_lpi pdr_interface lpasscc_sc8280xp icc_osm_l3 mdt_loader soundcore socinfo qcom_wdt qcom_rng qmi_helpers pwm_bl drm dm_mod ip_tables x_tables ipv6 pcie_qcom crc8 phy_qcom_qmp_pcie nvme nvme_core hid_multitouch i2c_qcom_geni i2c_hid_of i2c_hid i2c_core
> [ 2546.705703] CPU: 4 PID: 610 Comm: cat Not tainted 6.5.0-rc1 #45
> [ 2546.706287] Hardware name: LENOVO 21BYZ9SRUS/21BYZ9SRUS, BIOS N3HET53W (1.25 ) 10/12/2022
> [ 2546.706880] pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 2546.707476] pc : string+0x4c/0xfc
> [ 2546.708080] lr : vsnprintf+0x170/0x748
> [ 2546.708674] sp : ffff800083563ac0
> [ 2546.709265] x29: ffff800083563ac0 x28: ffff11b942bca791 x27: ffffbb03f92e0974
> [ 2546.709866] x26: ffffbb03f92e0974 x25: 0000000000000020 x24: 0000000000000871
> [ 2546.710476] x23: 00000000ffffffd8 x22: ffffbb03f9161778 x21: ffff800083563c10
> [ 2546.711083] x20: ffff11b942bca78f x19: ffff11b942bcb000 x18: 0000000000000020
> [ 2546.711688] x17: 0000000000000000 x16: 0000000000000000 x15: ffffffffffffffff
> [ 2546.712297] x14: 0000000000000001 x13: 0000000000000003 x12: ffff11b942bca783
> [ 2546.712910] x11: 0000000000000000 x10: 0000000000000020 x9 : 0000000000000000
> [ 2546.713522] x8 : 00000000ffffffff x7 : ffff800083563c10 x6 : 0000000000000020
> [ 2546.714133] x5 : ffff11b942bcb000 x4 : 0000000000000000 x3 : ffff0a00ffffff04
> [ 2546.714752] x2 : ffff80008106bb19 x1 : ffffffffffffffff x0 : ffff11b942bca791
> [ 2546.715362] Call trace:
> [ 2546.715962]  string+0x4c/0xfc
> [ 2546.716557]  vsnprintf+0x170/0x748
> [ 2546.717152]  seq_printf+0xb4/0xd0
> [ 2546.717746]  show_interrupts+0x2f4/0x4e8
> [ 2546.718345]  seq_read_iter+0x3bc/0x4ac
> [ 2546.718940]  proc_reg_read_iter+0x84/0xd8
> [ 2546.719539]  vfs_read+0x1d4/0x294
> [ 2546.720137]  ksys_read+0x68/0xf4
> [ 2546.720735]  __arm64_sys_read+0x1c/0x28
> [ 2546.721335]  invoke_syscall+0x48/0x114
> [ 2546.721934]  el0_svc_common.constprop.0+0x60/0x10c
> [ 2546.722536]  do_el0_svc+0x30/0x88
> [ 2546.723132]  el0_svc+0x40/0xac
> [ 2546.723729]  el0t_64_sync_handler+0xc0/0xc4
> [ 2546.724329]  el0t_64_sync+0x190/0x194
> [ 2546.724930] Code: 91000400 110004e1 eb08009f 540000e0 (38646846)
> [ 2546.725536] ---[ end trace 0000000000000000 ]---
> [ 2546.726143] note: cat[610] exited with irqs disabled
> [ 2546.726781] note: cat[610] exited with preempt_count 1
> 


* Re: Oops on /proc/interrupt access with 6.5-rc1
  2023-07-11 15:51 Oops on /proc/interrupt access with 6.5-rc1 Johan Hovold
  2023-07-11 16:56 ` Shanker Donthineni
@ 2023-07-11 18:14 ` Marc Zyngier
  2023-07-12  8:53   ` Johan Hovold
  2023-07-12 11:02   ` Johan Hovold
  1 sibling, 2 replies; 5+ messages in thread
From: Marc Zyngier @ 2023-07-11 18:14 UTC
  To: Johan Hovold
  Cc: Thomas Gleixner, Shanker Donthineni, Konrad Dybcio, linux-kernel

On Tue, 11 Jul 2023 16:51:10 +0100,
Johan Hovold <johan@kernel.org> wrote:
> 
> Hi,
> 
> Konrad reported on IRC that he hit a segfault and hang when watching
> /proc/interrupts with 6.5-rc1.
> 
> I tried simply catting it and hit the below oops immediately with my
> X13s (aarch64).
> 
> Commit 721255b9826b ("genirq: Use a maple tree for interrupt descriptor
> management") stood out when skimming the log, and Marc soon suggested
> the same possible culprit on IRC.
> 
> I have not been able to reproduce it with the maple tree patch reverted,
> but I hit it again after adding it back. Did not trigger immediately
> after boot though, I had had the machine idling for a few minutes in
> between.
> 
> Marc asked for a dump so figured I'd CC the list as well.

Thanks for that. I've been trying to reproduce this locally, but
have failed so far. I'll try a different part of the zoo to see if I
have more luck.

I wonder if you have a driver that periodically allocates an interrupt
and then frees it...

[...]

> [ 2546.693932] Unable to handle kernel paging request at virtual address ffff80008106bb19

The VA seems legitimate, and not unusual for a string.

> [ 2546.695148] Mem abort info:
> [ 2546.695562]   ESR = 0x0000000096000007
> [ 2546.695976]   EC = 0x25: DABT (current EL), IL = 32 bits
> [ 2546.696394]   SET = 0, FnV = 0
> [ 2546.696807]   EA = 0, S1PTW = 0
> [ 2546.697220]   FSC = 0x07: level 3 translation fault
> [ 2546.697642] Data abort info:
> [ 2546.698066]   ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
> [ 2546.698494]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0

This is a read, but we don't have any valid syndrome information.
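
The fields quoted above can be re-derived from the raw ESR value. A minimal
sketch, assuming the Armv8-A ESR_ELx layout for a data abort (EC in bits
[31:26], IL in bit 25, ISS in bits [24:0], with ISV at ISS bit 24, WnR at
ISS bit 6 and the fault status code in ISS bits [5:0]); the function name
is made up for illustration and is not a kernel API.

```python
def decode_data_abort_esr(esr: int) -> dict:
    """Split an ESR_ELx value for a data abort into the fields of interest."""
    iss = esr & 0x1FFFFFF                 # instruction specific syndrome
    dfsc = iss & 0x3F                     # data fault status code
    fields = {
        "ec": (esr >> 26) & 0x3F,         # exception class
        "il": (esr >> 25) & 0x1,          # 1 = 32-bit instruction
        "iss": iss,
        "isv": (iss >> 24) & 0x1,         # 1 = ISS[23:14] holds valid details
        "wnr": (iss >> 6) & 0x1,          # 0 = read, 1 = write
        "dfsc": dfsc,
    }
    # Only translation faults are named here; other codes are left numeric.
    if 0x04 <= dfsc <= 0x07:
        fields["fault"] = f"level {dfsc & 0x3} translation fault"
    else:
        fields["fault"] = f"DFSC {dfsc:#04x}"
    return fields


if __name__ == "__main__":
    for name, value in decode_data_abort_esr(0x96000007).items():
        print(f"{name:>5}: {value if isinstance(value, str) else hex(value)}")
```

With ESR = 0x96000007 this gives EC = 0x25, WnR = 0 (a read) and ISV = 0,
so the access-size and register fields in ISS[23:14] carry no information,
which is presumably what "no valid syndrome information" refers to.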

Could you try and enable KASAN?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: Oops on /proc/interrupt access with 6.5-rc1
  2023-07-11 18:14 ` Marc Zyngier
@ 2023-07-12  8:53   ` Johan Hovold
  2023-07-12 11:02   ` Johan Hovold
  1 sibling, 0 replies; 5+ messages in thread
From: Johan Hovold @ 2023-07-12  8:53 UTC
  To: Marc Zyngier
  Cc: Thomas Gleixner, Shanker Donthineni, Konrad Dybcio, linux-kernel

On Tue, Jul 11, 2023 at 07:14:02PM +0100, Marc Zyngier wrote:
> On Tue, 11 Jul 2023 16:51:10 +0100,
> Johan Hovold <johan@kernel.org> wrote:

> > Konrad reported on IRC that he hit a segfault and hang when watching
> > /proc/interrupts with 6.5-rc1.
> > 
> > I tried simply catting it and hit the below oops immediately with my
> > X13s (aarch64).

> I wonder if you have a driver that periodically allocates an interrupt
> and then frees it...

I checked by instrumenting the descriptor allocator, but that does not
appear to be the case.

> > [ 2546.693932] Unable to handle kernel paging request at virtual address ffff80008106bb19
> 
> The VA seems legitimate, and not unusual for a string.
> 
> > [ 2546.695148] Mem abort info:
> > [ 2546.695562]   ESR = 0x0000000096000007
> > [ 2546.695976]   EC = 0x25: DABT (current EL), IL = 32 bits
> > [ 2546.696394]   SET = 0, FnV = 0
> > [ 2546.696807]   EA = 0, S1PTW = 0
> > [ 2546.697220]   FSC = 0x07: level 3 translation fault
> > [ 2546.697642] Data abort info:
> > [ 2546.698066]   ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
> > [ 2546.698494]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> 
> This is a read, but we don't have any valid syndrome information.
> 
> Could you try and enable KASAN?

Just reproduced it with KASAN enabled. See splat below.

Johan

[  537.007382] ==================================================================
[  537.007536] BUG: KASAN: vmalloc-out-of-bounds in string+0xec/0x1ec
[  537.007635] Read of size 1 at addr ffff8000813478d0 by task cat/533

[  537.007752] CPU: 6 PID: 533 Comm: cat Not tainted 6.5.0-rc1 #4
[  537.007836] Hardware name: LENOVO 21BYZ9SRUS/21BYZ9SRUS, BIOS N3HET53W (1.25 ) 10/12/2022
[  537.007947] Call trace:
[  537.007984]  dump_backtrace+0x9c/0x11c
[  537.008042]  show_stack+0x18/0x24
[  537.008092]  dump_stack_lvl+0x60/0xac
[  537.008147]  print_address_description.constprop.0+0x84/0x394
[  537.008231]  kasan_report+0x110/0x144
[  537.008287]  __asan_load1+0x60/0x6c
[  537.008338]  string+0xec/0x1ec
[  537.008386]  vsnprintf+0x224/0x8b8
[  537.008438]  seq_printf+0x164/0x194
[  537.008491]  show_interrupts+0x40c/0x5e8
[  537.008551]  seq_read_iter+0x5d0/0x738
[  537.008605]  proc_reg_read_iter+0xe8/0x140
[  537.008668]  vfs_read+0x33c/0x444
[  537.008720]  ksys_read+0xc4/0x168
[  537.008770]  __arm64_sys_read+0x44/0x58
[  537.008827]  invoke_syscall+0x60/0x190
[  537.008884]  el0_svc_common.constprop.0+0x80/0x154
[  537.008955]  do_el0_svc+0x38/0xa0
[  537.009006]  el0_svc+0x44/0x90
[  537.009053]  el0t_64_sync_handler+0xc0/0xc4
[  537.009115]  el0t_64_sync+0x190/0x194

[  537.009201] Memory state around the buggy address:
[  537.009269]  ffff800081347780: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
[  537.009365]  ffff800081347800: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
[  537.009462] >ffff800081347880: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
[  537.009557]                                                  ^
[  537.009637]  ffff800081347900: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
[  537.009733]  ffff800081347980: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
[  537.009829] ==================================================================
[  537.009925] Disabling lock debugging due to kernel taint
[  537.009998] Unable to handle kernel paging request at virtual address ffff8000813478d0
[  537.010100] Mem abort info:
[  537.010139]   ESR = 0x0000000096000007
[  537.010191]   EC = 0x25: DABT (current EL), IL = 32 bits
[  537.010261]   SET = 0, FnV = 0
[  537.010304]   EA = 0, S1PTW = 0
[  537.010347]   FSC = 0x07: level 3 translation fault
[  537.013925] Data abort info:
[  537.017460]   ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
[  537.021044]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[  537.024609]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  537.028157] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000000032ef05000
[  537.031731] [ffff8000813478d0] pgd=10000001000de003, p4d=10000001000de003, pud=10000001000df003, pmd=1000000100f0f003, pte=0000000000000000
[  537.035463] Internal error: Oops: 0000000096000007 [#1] PREEMPT SMP
[  537.039178] Modules linked in: snd_soc_wsa883x q6prm_clocks q6apm_lpass_dais snd_q6dsp_common q6apm_dai q6prm michael_mic cbc des_generic libdes ecb algif_skcipher md5 algif_hash af_alg ip6_tables xt_LOG nf_log_syslog ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_conntrack snd_q6apm nf_conntrack libcrc32c nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter r8152 qrtr_mhi mii panel_edp snd_soc_hdmi_codec fastrpc qrtr_smd venus_dec venus_enc videobuf2_dma_contig rpmsg_ctrl apr videobuf2_memops rpmsg_char qcom_pm8008_regulator pmic_glink_altmode qcom_battmgr qcom_spmi_adc5 leds_qcom_lpg qcom_spmi_temp_alarm qcom_pon qcom_spmi_adc_tm5 led_class_multicolor reboot_mode industrialio rtc_pm8xxx qcom_vadc_common nvmem_qcom_spmi_sdam hci_uart msm btqca snd_soc_sc8280xp ath11k_pci snd_soc_qcom_common gpu_sched ath11k bluetooth qcom_pm8008 snd_soc_qcom_sdw regmap_i2c gpio_sbu_mux venus_core ecdh_generic ecc drm_display_helper snd_soc_wcd938x mac80211 v4l2_mem2mem videobuf2_v4l2 snd_soc_wcd938x_sdw snd_soc_lpass_tx_macro qcom_stats
[  537.039382]  snd_soc_lpass_va_macro snd_soc_lpass_rx_macro regmap_sdw snd_soc_lpass_wsa_macro soundwire_qcom snd_soc_wcd_mbhc snd_soc_lpass_macro_common videodev drm_dp_aux_bus libarc4 qcom_q6v5_pas phy_qcom_edp videobuf2_common snd_soc_core mc qcom_pil_info videocc_sc8280xp snd_compress qcom_common snd_pcm icc_bwmon cfg80211 qcom_glink_smem phy_qcom_qmp_combo qcom_q6v5 drm_kms_helper snd_timer rfkill phy_qcom_qmp_usb qcom_sysmon qrtr pmic_glink typec mhi snd pinctrl_sc8280xp_lpass_lpi pdr_interface mdt_loader soundwire_bus phy_qcom_snps_femto_v2 pinctrl_lpass_lpi lpasscc_sc8280xp qmi_helpers soundcore pwm_bl socinfo icc_osm_l3 qcom_wdt qcom_rng drm dm_mod ip_tables x_tables ipv6 pcie_qcom crc8 phy_qcom_qmp_pcie nvme nvme_core hid_multitouch i2c_qcom_geni i2c_hid_of i2c_hid i2c_core
[  537.081081] CPU: 6 PID: 533 Comm: cat Tainted: G    B              6.5.0-rc1 #4
[  537.086148] Hardware name: LENOVO 21BYZ9SRUS/21BYZ9SRUS, BIOS N3HET53W (1.25 ) 10/12/2022
[  537.091295] pstate: 404000c5 (nZcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  537.096402] pc : string+0xec/0x1ec
[  537.101518] lr : string+0xec/0x1ec
[  537.106554] sp : ffff8000802876f0
[  537.111608] x29: ffff8000802876f0 x28: ffffcfdaf4ef5144 x27: 1ffff00010050f06
[  537.116753] x26: ffff0a00ffffff04 x25: ffff37069f4a4790 x24: 0000000000000000
[  537.121933] x23: 00000000ffffffff x22: 1ffff00010050eea x21: ffff37059f4a5000
[  537.127075] x20: ffff8000813478d0 x19: ffff37059f4a4791 x18: 0000000000000000
[  537.132149] x17: 0000000000000000 x16: ffffcfdaf413a5d4 x15: 0000000000000000
[  537.137241] x14: 1ffff00010050dec x13: 0000000041b58ab3 x12: ffff79fb5ec6191d
[  537.142267] x11: 1ffff9fb5ec6191c x10: ffff79fb5ec6191c x9 : dfff800000000000
[  537.147258] x8 : 00008604a139e6e4 x7 : ffffcfdaf630c8e7 x6 : 0000000000000001
[  537.152222] x5 : ffffcfdaf630c8e0 x4 : ffff79fb5ec6191d x3 : ffffcfdaf41015cc
[  537.157174] x2 : 0000000000000001 x1 : ffff3705815a4e00 x0 : 0000000000000001
[  537.162123] Call trace:
[  537.167016]  string+0xec/0x1ec
[  537.171904]  vsnprintf+0x224/0x8b8
[  537.176750]  seq_printf+0x164/0x194
[  537.181551]  show_interrupts+0x40c/0x5e8
[  537.186359]  seq_read_iter+0x5d0/0x738
[  537.191160]  proc_reg_read_iter+0xe8/0x140
[  537.195983]  vfs_read+0x33c/0x444
[  537.200785]  ksys_read+0xc4/0x168
[  537.205557]  __arm64_sys_read+0x44/0x58
[  537.210342]  invoke_syscall+0x60/0x190
[  537.215131]  el0_svc_common.constprop.0+0x80/0x154
[  537.219946]  do_el0_svc+0x38/0xa0
[  537.224761]  el0_svc+0x44/0x90
[  537.229569]  el0t_64_sync_handler+0xc0/0xc4
[  537.234393]  el0t_64_sync+0x190/0x194
[  537.239204] Code: eb19027f 540000a0 aa1403e0 97d8fc79 (38401697) 
[  537.244064] ---[ end trace 0000000000000000 ]---
[  537.248926] note: cat[533] exited with irqs disabled
[  537.254196] note: cat[533] exited with preempt_count 1


* Re: Oops on /proc/interrupt access with 6.5-rc1
  2023-07-11 18:14 ` Marc Zyngier
  2023-07-12  8:53   ` Johan Hovold
@ 2023-07-12 11:02   ` Johan Hovold
  1 sibling, 0 replies; 5+ messages in thread
From: Johan Hovold @ 2023-07-12 11:02 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Thomas Gleixner, Shanker Donthineni, Konrad Dybcio, linux-kernel

On Tue, Jul 11, 2023 at 07:14:02PM +0100, Marc Zyngier wrote:
> On Tue, 11 Jul 2023 16:51:10 +0100,
> Johan Hovold <johan@kernel.org> wrote:

> > Konrad reported on IRC that he hit a segfault and hang when watch:ing
> > /proc/interrupts with 6.5-rc1.
> > 
> > I tried simply catting it and hit the below oops immediately with my
> > X13s (aarch64).
> > 
> > Commit 721255b9826b ("genirq: Use a maple tree for interrupt descriptor
> > management") stood out when skimming the log, and Marc soon suggested
> > the same possible culprit on IRC.

Turns out we had a buggy patch that requested an irq using a
stack-allocated name, which left a dangling pointer for /proc/interrupts
to print. So false alarm.

Sorry about the noise.

Johan
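
For readers landing on this thread: the buggy patch itself is not shown
here, so the following is only a userspace sketch of the kind of lifetime
bug described above. request_irq() keeps the name pointer it is given
rather than copying the string, and /proc/interrupts later prints it with
%s. Everything named fake_* or register_with_owned_name below is
hypothetical and only mimics that behaviour. In kernel code the name would
more likely be a static string or come from something like
devm_kasprintf().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for the kernel's struct irqaction.
 * Like the real one, it stores only a pointer to the name, not a copy.
 */
struct fake_irqaction {
	const char *name;
};

/* Mimics request_irq(): the caller's name pointer is kept as-is. */
static void fake_request_irq(struct fake_irqaction *action, const char *name)
{
	action->name = name;
}

/*
 * Buggy pattern (left as a comment, since running it is undefined
 * behaviour):
 *
 *	char name[32];
 *	snprintf(name, sizeof(name), "demo-irq-%d", index);
 *	fake_request_irq(action, name);
 *	return;		// 'name' dies here, action->name now dangles
 */

/*
 * Safer pattern: give the name a lifetime at least as long as the
 * registration. Returns 0 on success, -1 on allocation failure.
 */
static int register_with_owned_name(struct fake_irqaction *action, int index)
{
	char tmp[32];
	char *copy;

	snprintf(tmp, sizeof(tmp), "demo-irq-%d", index);
	copy = malloc(strlen(tmp) + 1);
	if (!copy)
		return -1;
	strcpy(copy, tmp);

	fake_request_irq(action, copy);
	return 0;
}

/* Mimics free_irq(): only now may the name storage go away. */
static void fake_free_irq(struct fake_irqaction *action)
{
	free((void *)action->name);
	action->name = NULL;
}
```

After register_with_owned_name() returns, action->name still points at
valid heap memory, so a later reader (the analogue of show_interrupts())
can print it safely until fake_free_irq() is called.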
