* B580 on POWER9: Unable to handle kernel data access on read at 0xc00a0000000003cc
@ 2025-03-09 16:04 Simon Richter
From: Simon Richter @ 2025-03-09 16:04 UTC (permalink / raw)
To: intel-gfx
Hi,
I've built a horrible contraption and received the following output:
BUG: Unable to handle kernel data access on read at 0xc00a0000000003cc
Faulting instruction address: 0xc00800000dd46484
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=4K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: xe(+) xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat
nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
xfrm_user xfrm_algo xt_addrtype nft_compat x_tables nf_tables libcrc32c
nfnetlink br_netfilter bridge stp llc overlay binfmt_misc evdev joydev
hid_generic usbhid hid cec rc_core drm_gpuvm drm_exec drm_buddy
gpu_sched drm_suballoc_helper drm_display_helper ast drm_ttm_helper ttm
snd_hda_intel drm_shmem_helper snd_intel_dspcfg drm_kms_helper
snd_hda_codec snd_hda_core drm ofpart snd_hwdep snd_pcm xts ipmi_powernv
snd_timer ipmi_devintf powernv_flash snd vmx_crypto gf128mul mtd
ipmi_msghandler opal_prd soundcore at24 drm_panel_orientation_quirks
i2c_algo_bit regmap_i2c ext4 crc16 mbcache jbd2 crc32c_generic dm_mod
xhci_pci xhci_hcd tg3 nvme crc32c_vpmsum usbcore nvme_core libphy
usb_common nvme_auth [last unloaded: xe]
CPU: 32 UID: 0 PID: 453 Comm: kworker/32:1 Not tainted 6.12.12+bpo-powerpc64le #1 Debian 6.12.12-1~bpo12+1
Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-9858186 PowerNV
Workqueue: events work_for_cpu_fn
NIP: c00800000dd46484 LR: c00800000dd46384 CTR: 000000003006c394
REGS: c00020000a637660 TRAP: 0300 Not tainted (6.12.12+bpo-powerpc64le Debian 6.12.12-1~bpo12+1)
MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24008488 XER: 20040000
CFAR: c00800000dd46398 DAR: c00a0000000003cc DSISR: 40000000 IRQMASK: 0
GPR00: c00800000dd46384 c00020000a637900 c00800000db30600 c00a0000000003cc
GPR04: 0000000000000000 0000000000000004 0000000035c20060 0000000035c20060
GPR08: 0000000000000000 c00a000000000000 0000000000000000 0000000000008000
GPR12: 0000000000000000 c0002007ff7ffb00 c00000000017d1cc c000200005d5c200
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 c0002007fbeeb280 c00800000df35350 c00800000df35700
GPR24: c00800000df35730 c000200006f60000 0000000000000001 c000200006f61628
GPR28: 0000000000000001 c0000000027703e0 0000000000000000 c00000000c167000
NIP [c00800000dd46484] intel_vga_reset_io_mem+0x140/0x188 [xe]
LR [c00800000dd46384] intel_vga_reset_io_mem+0x40/0x188 [xe]
Call Trace:
[c00020000a637900] [c00800000dd46384] intel_vga_reset_io_mem+0x40/0x188 [xe] (unreliable)
[c00020000a637940] [c00800000dcbbd88] hsw_power_well_enable+0x1f0/0x238 [xe]
[c00020000a637990] [c00800000dcbe470] intel_power_well_enable+0x8c/0xb8 [xe]
[c00020000a637a00] [c00800000dcb6d08] __intel_display_power_get_domain.part.0+0x98/0xf4 [xe]
[c00020000a637a50] [c00800000dcb9260] intel_power_domains_init_hw+0x94/0x3a8 [xe]
[c00020000a637af0] [c00800000dcafca4] intel_display_driver_probe_noirq+0xf4/0x300 [xe]
[c00020000a637b70] [c00800000dc4f124] xe_display_init_noirq+0x80/0x124 [xe]
[c00020000a637ba0] [c00800000dbcee98] xe_device_probe+0x414/0x768 [xe]
[c00020000a637c30] [c00800000dc13370] xe_pci_probe+0x77c/0xc60 [xe]
[c00020000a637d90] [c0000000009f4540] local_pci_probe+0x68/0xf4
[c00020000a637e10] [c000000000169cdc] work_for_cpu_fn+0x38/0x54
[c00020000a637e40] [c00000000016ff24] process_one_work+0x1fc/0x4d4
[c00020000a637ef0] [c000000000170e3c] worker_thread+0x33c/0x504
[c00020000a637f90] [c00000000017d2fc] kthread+0x138/0x140
[c00020000a637fe0] [c00000000000de58] start_kernel_thread+0x14/0x18
Code: ebc1fff0 ebe1fff8 7c0803a6 4e800020 60000000 60000000 60420000 3d220000 e9291800 e9290000 386903cc 7c0004ac <8bc903cc> 0c1e0000 4c00012c 57c9063e
---[ end trace 0000000000000000 ]---
note: kworker/32:1[453] exited with irqs disabled
The faulting line is
    outb(inb(VGA_MIS_R), VGA_MIS_W);
in drivers/gpu/drm/i915/display/intel_vga.c
The 0x3cc offset is VGA_MIS_R, and r9 holds 0xc00a000000000000, the base
at which the I/O ports are mapped. However, the legacy ports don't seem to
be active if the card was never initialized at boot, which it wasn't here,
because the BMC's VGA emulation is preferred.
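To make the link between the oops and the inb() concrete, the faulting word marked in the Code: line, <8bc903cc>, can be decoded by hand. The following is a minimal user-space sketch, not kernel code; the struct and function names are hypothetical, and it only handles the PowerPC D-form layout (6-bit opcode, 5-bit RT, 5-bit RA, 16-bit signed displacement) without the RA=0 special case.

```c
#include <stdint.h>

/* Fields of a PowerPC D-form instruction such as "lbz RT, D(RA)". */
struct dform {
    unsigned opcode;  /* primary opcode (top 6 bits) */
    unsigned rt;      /* target register number */
    unsigned ra;      /* base register number */
    int16_t  disp;    /* signed 16-bit displacement */
};

/* Split a 32-bit instruction word into its D-form fields. */
struct dform decode_dform(uint32_t insn)
{
    struct dform d;

    d.opcode = insn >> 26;
    d.rt     = (insn >> 21) & 0x1f;
    d.ra     = (insn >> 16) & 0x1f;
    d.disp   = (int16_t)(insn & 0xffff);
    return d;
}

/*
 * Effective address of a D-form load: contents of RA plus the
 * sign-extended displacement. The RA == 0 case (which means "use 0",
 * not "use r0") is deliberately not modelled here.
 */
uint64_t effective_address(uint64_t ra_value, int16_t disp)
{
    return ra_value + (uint64_t)(int64_t)disp;
}
```

Decoding 0x8bc903cc this way gives primary opcode 34 (lbz), RT 30, RA 9 and displacement 0x3cc. With GPR09 = 0xc00a000000000000 from the register dump, the effective address is 0xc00a0000000003cc, which matches the DAR.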
Commenting out the call to intel_vga_reset_io_mem solves the problem,
and I get wonderful 1080p60 output -- but obviously that is not a
generic solution.
Simon
* Re: B580 on POWER9: Unable to handle kernel data access on read at 0xc00a0000000003cc
@ 2025-03-10 17:09 ` Rodrigo Vivi
From: Rodrigo Vivi @ 2025-03-10 17:09 UTC (permalink / raw)
To: Simon Richter; +Cc: intel-gfx
On Mon, Mar 10, 2025 at 01:04:36AM +0900, Simon Richter wrote:
> Hi,
>
> I've built a horrible contraption and received the following output:
>
> BUG: Unable to handle kernel data access on read at 0xc00a0000000003cc
> Faulting instruction address: 0xc00800000dd46484
> Oops: Kernel access of bad area, sig: 11 [#1]
> LE PAGE_SIZE=4K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
> Modules linked in: xe(+) xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat
> nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user
> xfrm_algo xt_addrtype nft_compat x_tables nf_tables libcrc32c nfnetlink
> br_netfilter bridge stp llc overlay binfmt_misc evdev joydev hid_generic
> usbhid hid cec rc_core drm_gpuvm drm_exec drm_buddy gpu_sched
> drm_suballoc_helper drm_display_helper ast drm_ttm_helper ttm snd_hda_intel
> drm_shmem_helper snd_intel_dspcfg drm_kms_helper snd_hda_codec snd_hda_core
> drm ofpart snd_hwdep snd_pcm xts ipmi_powernv snd_timer ipmi_devintf
> powernv_flash snd vmx_crypto gf128mul mtd ipmi_msghandler opal_prd soundcore
> at24 drm_panel_orientation_quirks i2c_algo_bit regmap_i2c ext4 crc16 mbcache
> jbd2 crc32c_generic dm_mod xhci_pci xhci_hcd tg3 nvme crc32c_vpmsum usbcore
> nvme_core libphy usb_common nvme_auth [last unloaded: xe]
> CPU: 32 UID: 0 PID: 453 Comm: kworker/32:1 Not tainted
> 6.12.12+bpo-powerpc64le #1 Debian 6.12.12-1~bpo12+1
> Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-9858186 PowerNV
> Workqueue: events work_for_cpu_fn
> NIP: c00800000dd46484 LR: c00800000dd46384 CTR: 000000003006c394
> REGS: c00020000a637660 TRAP: 0300 Not tainted (6.12.12+bpo-powerpc64le
> Debian 6.12.12-1~bpo12+1)
> MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24008488 XER:
> 20040000
> CFAR: c00800000dd46398 DAR: c00a0000000003cc DSISR: 40000000 IRQMASK: 0
> GPR00: c00800000dd46384 c00020000a637900 c00800000db30600 c00a0000000003cc
> GPR04: 0000000000000000 0000000000000004 0000000035c20060 0000000035c20060
> GPR08: 0000000000000000 c00a000000000000 0000000000000000 0000000000008000
> GPR12: 0000000000000000 c0002007ff7ffb00 c00000000017d1cc c000200005d5c200
> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR20: 0000000000000000 c0002007fbeeb280 c00800000df35350 c00800000df35700
> GPR24: c00800000df35730 c000200006f60000 0000000000000001 c000200006f61628
> GPR28: 0000000000000001 c0000000027703e0 0000000000000000 c00000000c167000
> NIP [c00800000dd46484] intel_vga_reset_io_mem+0x140/0x188 [xe]
> LR [c00800000dd46384] intel_vga_reset_io_mem+0x40/0x188 [xe]
> Call Trace:
> [c00020000a637900] [c00800000dd46384] intel_vga_reset_io_mem+0x40/0x188 [xe]
> (unreliable)
> [c00020000a637940] [c00800000dcbbd88] hsw_power_well_enable+0x1f0/0x238 [xe]
> [c00020000a637990] [c00800000dcbe470] intel_power_well_enable+0x8c/0xb8 [xe]
> [c00020000a637a00] [c00800000dcb6d08]
> __intel_display_power_get_domain.part.0+0x98/0xf4 [xe]
> [c00020000a637a50] [c00800000dcb9260] intel_power_domains_init_hw+0x94/0x3a8
> [xe]
> [c00020000a637af0] [c00800000dcafca4]
> intel_display_driver_probe_noirq+0xf4/0x300 [xe]
> [c00020000a637b70] [c00800000dc4f124] xe_display_init_noirq+0x80/0x124 [xe]
> [c00020000a637ba0] [c00800000dbcee98] xe_device_probe+0x414/0x768 [xe]
> [c00020000a637c30] [c00800000dc13370] xe_pci_probe+0x77c/0xc60 [xe]
> [c00020000a637d90] [c0000000009f4540] local_pci_probe+0x68/0xf4
> [c00020000a637e10] [c000000000169cdc] work_for_cpu_fn+0x38/0x54
> [c00020000a637e40] [c00000000016ff24] process_one_work+0x1fc/0x4d4
> [c00020000a637ef0] [c000000000170e3c] worker_thread+0x33c/0x504
> [c00020000a637f90] [c00000000017d2fc] kthread+0x138/0x140
> [c00020000a637fe0] [c00000000000de58] start_kernel_thread+0x14/0x18
> Code: ebc1fff0 ebe1fff8 7c0803a6 4e800020 60000000 60000000 60420000
> 3d220000 e9291800 e9290000 386903cc 7c0004ac <8bc903cc> 0c1e0000 4c00012c
> 57c9063e
> ---[ end trace 0000000000000000 ]---
>
> note: kworker/32:1[453] exited with irqs disabled
>
> The faulting line is
>
> outb(inb(VGA_MIS_R), VGA_MIS_W);
>
> in drivers/gpu/drm/i915/display/intel_vga.c
>
> The 0x3cc offset is VGA_MIS_R, and R9 points to c00a000000000000, where the
> I/O ports are mapped -- however the legacy ports don't seem to be active if
> the card was never initialized at boot, which it isn't, because the BMC's
> VGA emulation is preferred.
>
> Commenting out the call to intel_vga_reset_io_mem solves the problem, and I
> get wonderful 1080p60 output -- but obviously that is not a generic
> solution.
Hi Simon, thanks for reporting this. Could you please report this to our gitlab
issues so we ensure this doesn't get lost on this busy mailing list?
https://drm.pages.freedesktop.org/intel-docs/how-to-file-i915-bugs.html
Since you are building your own kernel, would you mind trying to reproduce
with the latest drm-tip branch from https://gitlab.freedesktop.org/drm/tip?
I believe Maarten has made some changes in this area lately.
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
in case he has some quick thoughts...
Thanks,
Rodrigo.
>
> Simon
* Re: B580 on POWER9: Unable to handle kernel data access on read at 0xc00a0000000003cc
@ 2025-03-11 5:22 ` Simon Richter
From: Simon Richter @ 2025-03-11 5:22 UTC (permalink / raw)
To: Rodrigo Vivi; +Cc: intel-gfx
Hi,
> Hi Simon, thanks for reporting this. Could you please report this to our gitlab
> issues so we ensure this doesn't get lost on this busy mailing list?
I've added it to
https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1824
which is the same problem on aarch64.
I don't think this is a porting issue. Non-x86 platforms are simply more
likely to have either something else or nothing at all at the ISA VGA
addresses, and are thus more likely to trip up, but this would also
affect PCs that have a different primary VGA device (BMC, onboard,
multiple GPUs).
This seems to be a workaround for some bug in VGA emulation mode, so I don't
know whether it's safe to just skip it when the card is not mapped to
VGA ports (that would be somewhat easy to do), when the expansion ROM is
disabled (even easier to do), or if another mechanism is needed.
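As an illustration of the "not mapped to VGA ports" condition only, here is a minimal user-space sketch of how that state could be derived from PCI config register values. It is not the driver's code and not a proposed patch: the helper name and its parameters are hypothetical, and only the two bit values are taken from Linux's include/uapi/linux/pci_regs.h. Whether skipping the workaround in that state is actually safe is the open question above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values as defined in Linux's include/uapi/linux/pci_regs.h. */
#define PCI_COMMAND_IO      0x1  /* device responds to I/O space accesses */
#define PCI_BRIDGE_CTL_VGA  0x8  /* bridge forwards the legacy VGA ranges */

/*
 * Hypothetical helper, not a kernel API.
 *
 * dev_command: the device's PCI command register value.
 * bridge_ctl:  bridge control register values of every bridge between
 *              the host and the device (may be NULL if n_bridges is 0).
 *
 * Returns true only if the device decodes I/O space and every upstream
 * bridge forwards legacy VGA accesses, i.e. only if a read of a legacy
 * port such as 0x3cc could reach this card at all.
 */
bool legacy_vga_io_routed(uint16_t dev_command,
                          const uint16_t *bridge_ctl, size_t n_bridges)
{
    if (!(dev_command & PCI_COMMAND_IO))
        return false;

    for (size_t i = 0; i < n_bridges; i++) {
        if (!(bridge_ctl[i] & PCI_BRIDGE_CTL_VGA))
            return false;
    }
    return true;
}
```

In a setup like the one reported here, where the BMC is the primary VGA device, the expectation would be that at least one bridge above the B580 has the VGA forwarding bit clear, so the helper would return false.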
> Since you are building your own kernel, would you mind trying to reproduce
> with the latest drm-tip branch from https://gitlab.freedesktop.org/drm/tip?
Same behaviour, added the log to the bug as well.
Simon