* further issues with MGA G200 graphics chipset
@ 2026-04-22 23:55 Jacob Keller
2026-04-23 0:05 ` David Airlie
2026-04-23 7:44 ` Thomas Zimmermann
0 siblings, 2 replies; 20+ messages in thread
From: Jacob Keller @ 2026-04-22 23:55 UTC (permalink / raw)
To: Jocelyn Falempe, Thomas Zimmermann, airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
Hello,
You may recall the issues I recently reported and submitted a fix for in
the mgag200 DRM driver from [1].
[1]:
https://lore.kernel.org/all/20260202-jk-mgag200-fix-bad-udelay-v2-1-ce1e9665987d@intel.com/
I have recently been running into another issue with the mgag200
graphics driver on a similar platform. I noticed occasional spikes where
Tx timestamps from the ice driver were delayed, very similar to the
behavior in the original bug report. However, this was on a
system running v6.12.76, which contains my MGA G200 usleep fix.
I analyzed the data with perf and have discovered what looks like
another issue where the mgag200 polling routine is causing us issues.
Here's a perf report which captures the cycles samples between the start
of a Tx timestamp request and the point where we report it to the stack:
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] ret_from_fork_asm
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] ret_from_fork
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] kthread
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] worker_thread
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] process_one_work
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] output_poll_execute
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_client_dev_hotplug
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_fbdev_shmem_client_hotplug
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_fb_helper_hotplug_event
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_client_modeset_probe
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_helper_probe_single_connector_modes
> + 89.87% 0.00% kworker/65:1-ev [mgag200] [k] mgag200_vga_bmc_connector_helper_get_modes
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_connector_helper_get_modes
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_edid_read
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_edid_read_custom
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] _drm_do_get_edid
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] edid_block_read
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_do_probe_ddc_edid
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] i2c_transfer
> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] __i2c_transfer
> + 89.87% 0.00% kworker/65:1-ev [i2c_algo_bit] [k] bit_xfer
> - 59.65% 59.65% kworker/65:1-ev [kernel.kallsyms] [k] delay_halt_tpause
> ret_from_fork_asm
> ret_from_fork
> kthread
> worker_thread
> process_one_work
> output_poll_execute
> drm_client_dev_hotplug
> drm_fbdev_shmem_client_hotplug
> drm_fb_helper_hotplug_event
> drm_client_modeset_probe
> drm_helper_probe_single_connector_modes
> mgag200_vga_bmc_connector_helper_get_modes
> drm_connector_helper_get_modes
> drm_edid_read
> drm_edid_read_custom
> _drm_do_get_edid
> edid_block_read
> drm_do_probe_ddc_edid
> i2c_transfer
> __i2c_transfer
> + bit_xfer
> + 59.65% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] __udelay
> + 59.65% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] __const_udelay
> + 51.11% 0.00% kworker/65:1-ev [i2c_algo_bit] [k] sclhi
> + 30.22% 30.22% kworker/65:1-ev [kernel.kallsyms] [k] ioread8
> + 7.30% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] delay_halt
> + 7.30% 0.00% kworker/65:1-ev [i2c_algo_bit] [k] acknak
> + 7.29% 0.00% kworker/65:1-ev [mgag200] [k] mgag200_ddc_algo_bit_data_setscl
> + 5.02% 0.00% swapper [kernel.kallsyms] [k] secondary_startup_64
> + 5.02% 0.00% swapper [kernel.kallsyms] [k] start_secondary
> + 5.02% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
> + 5.02% 0.00% swapper [kernel.kallsyms] [k] do_idle
> + 3.60% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle
> + 3.60% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter
> + 3.53% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state
> + 2.57% 0.00% kworker/65:1-ev [mgag200] [k] mgag200_ddc_algo_bit_data_setsda
> + 2.14% 0.00% perf [unknown] [k] 0xffffffffffffffff
> + 2.14% 0.00% perf perf [.] __cmd_record.constprop.0
> + 2.14% 0.00% perf [kernel.kallsyms] [k] entry_SYSCALL_64
> + 2.14% 0.00% perf [kernel.kallsyms] [k] do_syscall_64
> + 2.14% 0.00% perf [kernel.kallsyms] [k] x64_sys_call
> + 2.06% 2.06% swapper [kernel.kallsyms] [k] intel_idle
> + 1.31% 0.42% perf [kernel.kallsyms] [k] do_sys_poll
> + 1.31% 0.00% perf perf [.] fdarray__poll
> + 1.31% 0.00% perf libc.so.6 [.] __poll
> + 1.31% 0.00% perf [kernel.kallsyms] [k] __x64_sys_poll
> + 1.06% 0.00% systemd-journal systemd-journald [.] 0x00005d6bb7cb3f64
> + 1.06% 0.00% systemd-journal libc.so.6 [.] __libc_start_main
> + 1.06% 0.00% systemd-journal libc.so.6 [.] 0x00007d6ce3a2a1c9
> + 1.06% 0.00% systemd-journal systemd-journald [.] 0x00005d6bb7cb389e
> + 1.06% 0.00% systemd-journal libsystemd-shared-255.so [.] sd_event_run
> + 1.06% 0.00% systemd-journal libsystemd-shared-255.so [.] sd_event_dispatch
> + 1.06% 0.00% systemd-journal libsystemd-shared-255.so [.] 0x00007d6ce409d413
> + 1.00% 0.00% kworker/65:1-ev [i2c_algo_bit] [k] i2c_stop
> + 0.83% 0.00% perf [kernel.kallsyms] [k] perf_poll
> + 0.83% 0.00% perf perf [.] record__mmap_read_evlist
>
As you can see, in this case we are spending about 60% of the cycles in
delay_halt_tpause, which is reached through udelay from the bit_xfer
function that implements bit-banged i2c. Another 30% is spent in ioread8.
I also occasionally see these messages in dmesg:
> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 5 times, consider switching to WQ_UNBOUND
> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 7 times, consider switching to WQ_UNBOUND
> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 11 times, consider switching to WQ_UNBOUND
> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 19 times, consider switching to WQ_UNBOUND
> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 35 times, consider switching to WQ_UNBOUND
> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 67 times, consider switching to WQ_UNBOUND
> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 131 times, consider switching to WQ_UNBOUND
> Apr 20 23:14:44 1762811 kernel: workqueue: work_for_cpu_fn hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
> Apr 20 23:14:44 1762811 kernel: workqueue: work_for_cpu_fn hogged CPU for >10000us 5 times, consider switching to WQ_UNBOUND
> Apr 20 23:14:44 1762811 kernel: workqueue: work_for_cpu_fn hogged CPU for >10000us 7 times, consider switching to WQ_UNBOUND
> Apr 20 23:14:45 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 259 times, consider switching to WQ_UNBOUND
> Apr 20 23:15:15 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
> Apr 20 23:15:25 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 5 times, consider switching to WQ_UNBOUND
> Apr 20 23:15:46 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 7 times, consider switching to WQ_UNBOUND
> Apr 20 23:16:27 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 11 times, consider switching to WQ_UNBOUND
> Apr 20 23:16:45 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
> Apr 20 23:16:45 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 5 times, consider switching to WQ_UNBOUND
> Apr 20 23:16:45 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 7 times, consider switching to WQ_UNBOUND
> Apr 20 23:16:45 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 11 times, consider switching to WQ_UNBOUND
> Apr 20 23:16:45 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 19 times, consider switching to WQ_UNBOUND
> Apr 20 23:17:49 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 19 times, consider switching to WQ_UNBOUND
> Apr 20 23:20:33 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 35 times, consider switching to WQ_UNBOUND
> Apr 20 23:26:00 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 67 times, consider switching to WQ_UNBOUND
> Apr 20 23:36:56 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 131 times, consider switching to WQ_UNBOUND
> Apr 20 23:58:46 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 259 times, consider switching to WQ_UNBOUND
> Apr 21 00:34:27 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 515 times, consider switching to WQ_UNBOUND
> Apr 21 00:42:28 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 515 times, consider switching to WQ_UNBOUND
> Apr 21 02:09:51 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 1027 times, consider switching to WQ_UNBOUND
> Apr 21 03:27:40 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 1027 times, consider switching to WQ_UNBOUND
> Apr 21 05:04:37 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 2051 times, consider switching to WQ_UNBOUND
> Apr 21 08:09:39 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 35 times, consider switching to WQ_UNBOUND
> Apr 21 08:10:07 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 67 times, consider switching to WQ_UNBOUND
> Apr 21 08:10:10 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 131 times, consider switching to WQ_UNBOUND
> Apr 21 08:10:21 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 259 times, consider switching to WQ_UNBOUND
> Apr 21 09:14:18 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 2051 times, consider switching to WQ_UNBOUND
> Apr 21 10:54:08 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 4099 times, consider switching to WQ_UNBOUND
> Apr 21 21:11:47 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 4099 times, consider switching to WQ_UNBOUND
> Apr 21 22:33:11 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 8195 times, consider switching to WQ_UNBOUND
> Apr 22 20:31:04 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 8195 times, consider switching to WQ_UNBOUND
> Apr 22 21:51:17 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 16387 times, consider switching to WQ_UNBOUND
These all appear to be workqueue warnings about functions that are
hogging CPU. If I look carefully, it looks like they are all possibly
related to the same mgag200 driver. At the very least
output_poll_execute is certainly related to the mgag200 stall.
I do not understand exactly what is causing the driver to get stuck, but
it's something in the i2c routine for reading the EDID block.
I also see this being printed:
EDID block 0 (tag 0x00) checksum is invalid, remainder is 125
It appears to print quite consistently every few seconds. I guess this
might be related to a bad EDID block on the mgag200 device?
What does this message even mean?
I am not sure how I'd go about verifying this, or root causing what is
going wrong.
It looks like we print the message as part of _drm_do_get_edid(), and
this definitely is called as part of the mgag200 routines:
> - 33.33% 33.33% kworker/64:1-ev [kernel.kallsyms] [k] _drm_do_get_edid
> ret_from_fork_asm
> ret_from_fork
> kthread
> worker_thread
> process_one_work
> output_poll_execute
> drm_client_dev_hotplug
> drm_fbdev_shmem_client_hotplug
> drm_fb_helper_hotplug_event
> drm_client_modeset_probe
> drm_helper_probe_single_connector_modes
> mgag200_vga_bmc_connector_helper_get_modes
> drm_connector_helper_get_modes
> drm_edid_read
> drm_edid_read_custom
> _drm_do_get_edid
This makes me think that we're reading a bad EDID. I enabled the drm.debug
setting to get more data:
> Apr 22 23:47:11 1762811 kernel: EDID block 0 (tag 0x00) checksum is invalid, remainder is 125
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:connector_bad_edid] [CONNECTOR:36:VGA-1] EDID is invalid:
> Apr 22 23:47:11 1762811 kernel: [00] BAD 00 ff ff ff ff ff ff 00 ff ff ff ff ff ff ff ff
> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x720": 60 74250 1280 1390 1430 1650 720 725 730 750 0x40 0x5 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x768": 60 68250 1280 1328 1360 1440 768 771 778 790 0x40 0x9 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x768": 60 79500 1280 1344 1472 1664 768 771 778 798 0x40 0x6 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x800": 60 71000 1280 1328 1360 1440 800 803 809 823 0x40 0x9 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x800": 60 83500 1280 1352 1480 1680 800 803 809 831 0x40 0x6 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x960": 60 108000 1280 1376 1488 1800 960 961 964 1000 0x40 0x5 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x1024": 60 108000 1280 1328 1440 1688 1024 1025 1028 1066 0x40 0x5 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1360x768": 60 85500 1360 1424 1536 1792 768 771 777 795 0x40 0x5 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1366x768": 60 85500 1366 1436 1579 1792 768 771 774 798 0x40 0x5 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1366x768": 60 72000 1366 1380 1436 1500 768 769 772 800 0x40 0x5 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1400x1050": 60 101000 1400 1448 1480 1560 1050 1053 1057 1080 0x40 0x9 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1400x1050": 60 121750 1400 1488 1632 1864 1050 1053 1057 1089 0x40 0x6 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1440x900": 60 88750 1440 1488 1520 1600 900 903 909 926 0x40 0x9 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1440x900": 60 106500 1440 1520 1672 1904 900 903 909 934 0x40 0x6 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1600x900": 60 108000 1600 1624 1704 1800 900 901 904 1000 0x40 0x5 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1600x1200": 60 162000 1600 1664 1856 2160 1200 1201 1204 1250 0x40 0x5 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1680x1050": 60 119000 1680 1728 1760 1840 1050 1053 1059 1080 0x40 0x9 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1680x1050": 60 146250 1680 1784 1960 2240 1050 1053 1059 1089 0x40 0x6 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1792x1344": 60 204750 1792 1920 2120 2448 1344 1345 1348 1394 0x40 0x6 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1856x1392": 60 218250 1856 1952 2176 2528 1392 1393 1396 1439 0x40 0x6 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1920x1080": 60 148500 1920 2008 2052 2200 1080 1084 1089 1125 0x40 0xa (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1920x1200": 60 154000 1920 1968 2000 2080 1200 1203 1209 1235 0x40 0x9 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1920x1200": 60 193250 1920 2056 2256 2592 1200 1203 1209 1245 0x40 0x6 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1920x1440": 60 234000 1920 2048 2256 2600 1440 1441 1444 1500 0x40 0x6 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "2048x1152": 60 162000 2048 2074 2154 2250 1152 1153 1156 1200 0x40 0x5 (VIRTUAL_X)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:36:VGA-1] probed modes:
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "1024x768": 60 65000 1024 1048 1184 1344 768 771 777 806 0x48 0xa
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "800x600": 60 40000 800 840 968 1056 600 601 605 628 0x40 0x5
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "800x600": 56 36000 800 824 896 1024 600 601 603 625 0x40 0x5
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "848x480": 60 33750 848 864 976 1088 480 486 494 517 0x40 0x5
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "640x480": 60 25175 640 656 752 800 480 490 492 525 0x40 0xa
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CONNECTOR:36:VGA-1] enabled? yes
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] Not using firmware configuration
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CONNECTOR:36:VGA-1] looking for cmdline mode
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CONNECTOR:36:VGA-1] looking for preferred mode, tile 0
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CONNECTOR:36:VGA-1] Found mode 1024x768
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] picking CRTCs for 1024x768 config
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CRTC:34:crtc-0] desired mode 1024x768 set (0,0)
> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_dev_hotplug] fbdev: ret=0
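As a rough way to make sense of the "remainder is 125" message, here is a minimal sketch that recomputes it from the dumped bytes. It assumes the printed value is 0x100 minus the sum of the first 127 bytes, truncated to one byte (that is, the checksum byte the block would need in order to sum to 0 modulo 256). That formula is an assumption on my part, but it does reproduce 125 for the block above. The names `EDID_BLOCK`, `expected_checksum` and `block_is_valid` are made up for illustration.

```python
# Bytes copied from the "[00] BAD" dump above: one row starting with the
# 00 ff ff ff ff ff ff 00 header, followed by seven rows of sixteen 0xff bytes.
FIRST_ROW = [0x00] + [0xFF] * 6 + [0x00] + [0xFF] * 8
EDID_BLOCK = bytes(FIRST_ROW + [0xFF] * 16 * 7)

EDID_LENGTH = 128  # size of one EDID block in bytes


def expected_checksum(block: bytes) -> int:
    """Return the last byte that would make the block sum to 0 mod 256.

    Assumption: this is how the "remainder" in the kernel message is derived.
    """
    if len(block) != EDID_LENGTH:
        raise ValueError("an EDID block must be exactly 128 bytes")
    return (0x100 - sum(block[:-1])) & 0xFF


def block_is_valid(block: bytes) -> bool:
    """An EDID block is valid when all 128 bytes sum to 0 modulo 256."""
    return sum(block) % 256 == 0


if __name__ == "__main__":
    print("expected checksum byte:", expected_checksum(EDID_BLOCK))
    print("actual checksum byte:  ", EDID_BLOCK[-1])
    print("block valid:           ", block_is_valid(EDID_BLOCK))
```

Under that assumption the block would need a final byte of 125 (0x7d), but the dump ends in 0xff, so the check fails. Apart from the header, every byte is 0xff, which is typically what a read looks like when nothing on the bus drives the data line low.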
Does anyone have any idea what's going wrong here? A Google search seems
to imply this is reading the EDID data from the VGA cable...
I'm also curious whether it's possible to avoid busy-waiting for so long
with udelay in the i2c logic. I am not very familiar with i2c, but it is
frustrating that this driver is causing yet another stall that is
impacting timing-sensitive data. Even if in this case it's due to a
faulty cable, it is frustrating that such a result causes the PTP
failures. Would switching to WQ_UNBOUND be helpful here at all?
Thanks,
Jake
* Re: further issues with MGA G200 graphics chipset
2026-04-22 23:55 further issues with MGA G200 graphics chipset Jacob Keller
@ 2026-04-23 0:05 ` David Airlie
2026-04-23 21:39 ` Jacob Keller
2026-04-23 7:44 ` Thomas Zimmermann
1 sibling, 1 reply; 20+ messages in thread
From: David Airlie @ 2026-04-23 0:05 UTC (permalink / raw)
To: Jacob Keller
Cc: Jocelyn Falempe, Thomas Zimmermann, dri-devel,
linux-kernel@vger.kernel.org, Pasi Vaananen
>
> These all appear to be workqueue warnings about functions that are
> hogging CPU. If I look carefully, it looks like they are all possibly
> related to the same mgag200 driver. At the very least
> output_poll_execute is certainly related to the mgag200 stall.
>
> I do not understand exactly what is causing the driver to get stuck, but
> it's something in the i2c routine for reading the EDID block.
>
> I also see this being printed:
>
> EDID block 0 (tag 0x00) checksum is invalid, remainder is 125
>
> It appears to print quite consistently every few seconds. I guess this
> might be possibly related to a bad EDID block on the mgag200 device?
> What does this even mean?
>
It sounds like the polling is having trouble with the i2c bus even if
there is no cable plugged in. They probably cheaped out on some
pull-up/pull-down resistors on the VGA connector.
Does adding drm_kms_helper.poll=0 to the kernel command line help?
Dave.
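To confirm whether the setting took effect after a reboot, a minimal sketch along these lines could read the parameter back. It assumes the parameter is exposed at /sys/module/drm_kms_helper/parameters/poll and reads as "Y" or "N"; that path and the helper name `polling_enabled` are assumptions for illustration and may not match every kernel configuration.

```python
from pathlib import Path
from typing import Optional

# Assumed sysfs location of the drm_kms_helper "poll" module parameter.
POLL_PARAM = Path("/sys/module/drm_kms_helper/parameters/poll")


def polling_enabled(param: Path = POLL_PARAM) -> Optional[bool]:
    """Return True/False for the poll setting, or None if it cannot be read."""
    try:
        value = param.read_text().strip()
    except OSError:
        # Parameter file missing or unreadable (for example, no root access).
        return None
    return value in ("Y", "1")


if __name__ == "__main__":
    state = polling_enabled()
    if state is None:
        print("could not read", POLL_PARAM)
    else:
        print("output polling enabled:", state)
```

If this reports that polling is disabled and the output_poll_execute warnings stop, that would point at the periodic connector probe as the trigger for the stalls.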
> I am not sure how I'd go about verifying this, or root causing what is
> going wrong.
>
> It looks like we print the message as part of _drm_do_get_edid(), and
> this definitely is called as part of the mgag200 routines:
>
> > - 33.33% 33.33% kworker/64:1-ev [kernel.kallsyms] [k] _drm_do_get_edid
> > ret_from_fork_asm
> > ret_from_fork
> > kthread
> > worker_thread
> > process_one_work
> > output_poll_execute
> > drm_client_dev_hotplug
> > drm_fbdev_shmem_client_hotplug
> > drm_fb_helper_hotplug_event
> > drm_client_modeset_probe
> > drm_helper_probe_single_connector_modes
> > mgag200_vga_bmc_connector_helper_get_modes
> > drm_connector_helper_get_modes
> > drm_edid_read
> > drm_edid_read_custom
> > _drm_do_get_edid
>
> This makes me think that we're reading a bad EDID. I enabled drm.debug
> setting to get more data:
>
> > Apr 22 23:47:11 1762811 kernel: EDID block 0 (tag 0x00) checksum is invalid, remainder is 125
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:connector_bad_edid] [CONNECTOR:36:VGA-1] EDID is invalid:
> > Apr 22 23:47:11 1762811 kernel: [00] BAD 00 ff ff ff ff ff ff 00 ff ff ff ff ff ff ff ff
> > Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> > Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> > Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> > Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> > Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> > Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> > Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x720": 60 74250 1280 1390 1430 1650 720 725 730 750 0x40 0x5 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x768": 60 68250 1280 1328 1360 1440 768 771 778 790 0x40 0x9 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x768": 60 79500 1280 1344 1472 1664 768 771 778 798 0x40 0x6 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x800": 60 71000 1280 1328 1360 1440 800 803 809 823 0x40 0x9 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x800": 60 83500 1280 1352 1480 1680 800 803 809 831 0x40 0x6 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x960": 60 108000 1280 1376 1488 1800 960 961 964 1000 0x40 0x5 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x1024": 60 108000 1280 1328 1440 1688 1024 1025 1028 1066 0x40 0x5 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1360x768": 60 85500 1360 1424 1536 1792 768 771 777 795 0x40 0x5 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1366x768": 60 85500 1366 1436 1579 1792 768 771 774 798 0x40 0x5 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1366x768": 60 72000 1366 1380 1436 1500 768 769 772 800 0x40 0x5 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1400x1050": 60 101000 1400 1448 1480 1560 1050 1053 1057 1080 0x40 0x9 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1400x1050": 60 121750 1400 1488 1632 1864 1050 1053 1057 1089 0x40 0x6 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1440x900": 60 88750 1440 1488 1520 1600 900 903 909 926 0x40 0x9 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1440x900": 60 106500 1440 1520 1672 1904 900 903 909 934 0x40 0x6 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1600x900": 60 108000 1600 1624 1704 1800 900 901 904 1000 0x40 0x5 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1600x1200": 60 162000 1600 1664 1856 2160 1200 1201 1204 1250 0x40 0x5 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1680x1050": 60 119000 1680 1728 1760 1840 1050 1053 1059 1080 0x40 0x9 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1680x1050": 60 146250 1680 1784 1960 2240 1050 1053 1059 1089 0x40 0x6 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1792x1344": 60 204750 1792 1920 2120 2448 1344 1345 1348 1394 0x40 0x6 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1856x1392": 60 218250 1856 1952 2176 2528 1392 1393 1396 1439 0x40 0x6 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1920x1080": 60 148500 1920 2008 2052 2200 1080 1084 1089 1125 0x40 0xa (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1920x1200": 60 154000 1920 1968 2000 2080 1200 1203 1209 1235 0x40 0x9 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1920x1200": 60 193250 1920 2056 2256 2592 1200 1203 1209 1245 0x40 0x6 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1920x1440": 60 234000 1920 2048 2256 2600 1440 1441 1444 1500 0x40 0x6 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "2048x1152": 60 162000 2048 2074 2154 2250 1152 1153 1156 1200 0x40 0x5 (VIRTUAL_X)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:36:VGA-1] probed modes:
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "1024x768": 60 65000 1024 1048 1184 1344 768 771 777 806 0x48 0xa
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "800x600": 60 40000 800 840 968 1056 600 601 605 628 0x40 0x5
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "800x600": 56 36000 800 824 896 1024 600 601 603 625 0x40 0x5
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "848x480": 60 33750 848 864 976 1088 480 486 494 517 0x40 0x5
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "640x480": 60 25175 640 656 752 800 480 490 492 525 0x40 0xa
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CONNECTOR:36:VGA-1] enabled? yes
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] Not using firmware configuration
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CONNECTOR:36:VGA-1] looking for cmdline mode
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CONNECTOR:36:VGA-1] looking for preferred mode, tile 0
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CONNECTOR:36:VGA-1] Found mode 1024x768
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] picking CRTCs for 1024x768 config
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CRTC:34:crtc-0] desired mode 1024x768 set (0,0)
> > Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_dev_hotplug] fbdev: ret=0
>
> Does anyone have any idea what's going wrong here? A Google search seems
> to imply this is reading the EDID data from the VGA cable...
>
> I'm also curious whether it's possible to avoid busy-waiting for so long
> with udelay in the i2c logic. I am not very familiar with i2c, but it is
> frustrating that this driver is causing yet another stall that is
> impacting timing-sensitive data. Even if in this case it's due to a
> faulty cable, it is frustrating that such a result causes the PTP
> failures. Would switching to WQ_UNBOUND be helpful here at all?
>
> Thanks,
> Jake
>
* Re: further issues with MGA G200 graphics chipset
2026-04-22 23:55 further issues with MGA G200 graphics chipset Jacob Keller
2026-04-23 0:05 ` David Airlie
@ 2026-04-23 7:44 ` Thomas Zimmermann
2026-04-23 16:35 ` Jacob Keller
1 sibling, 1 reply; 20+ messages in thread
From: Thomas Zimmermann @ 2026-04-23 7:44 UTC (permalink / raw)
To: Jacob Keller, Jocelyn Falempe, airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
Hi
On 23.04.26 at 01:55, Jacob Keller wrote:
> Hello,
>
> You may recall the issues I recently reported and submitted a fix for in
> the mgag200 DRM driver from [1].
>
> [1]:
> https://lore.kernel.org/all/20260202-jk-mgag200-fix-bad-udelay-v2-1-ce1e9665987d@intel.com/
>
> I recently have been running into another issue with the mgag200
> graphics driver on a similar platform. I noticed occasional spikes where
> Tx timestamps from the ice driver were delayed, very similar behavior to
> what was going on with the original bug report. However, this was on a
> system running v6.12.76, which contains my MGA G200 usleep fix.
>
> I analyzed the data with perf and have discovered what looks like
> another issue where the mgag200 polling routine is causing us issues.
>
> Here's a perf report which captures the cycles samples between the start
> of a Tx timestamp request and the point where we report it to the stack:
>
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] ret_from_fork_asm
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] ret_from_fork
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] kthread
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] worker_thread
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] process_one_work
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] output_poll_execute
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_client_dev_hotplug
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_fbdev_shmem_client_hotplug
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_fb_helper_hotplug_event
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_client_modeset_probe
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_helper_probe_single_connector_modes
>> + 89.87% 0.00% kworker/65:1-ev [mgag200] [k] mgag200_vga_bmc_connector_helper_get_modes
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_connector_helper_get_modes
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_edid_read
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_edid_read_custom
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] _drm_do_get_edid
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] edid_block_read
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_do_probe_ddc_edid
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] i2c_transfer
>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] __i2c_transfer
>> + 89.87% 0.00% kworker/65:1-ev [i2c_algo_bit] [k] bit_xfer
>> - 59.65% 59.65% kworker/65:1-ev [kernel.kallsyms] [k] delay_halt_tpause
>> ret_from_fork_asm
>> ret_from_fork
>> kthread
>> worker_thread
>> process_one_work
>> output_poll_execute
>> drm_client_dev_hotplug
>> drm_fbdev_shmem_client_hotplug
>> drm_fb_helper_hotplug_event
>> drm_client_modeset_probe
>> drm_helper_probe_single_connector_modes
>> mgag200_vga_bmc_connector_helper_get_modes
>> drm_connector_helper_get_modes
>> drm_edid_read
>> drm_edid_read_custom
>> _drm_do_get_edid
>> edid_block_read
>> drm_do_probe_ddc_edid
>> i2c_transfer
>> __i2c_transfer
>> + bit_xfer
>> + 59.65% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] __udelay
>> + 59.65% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] __const_udelay
>> + 51.11% 0.00% kworker/65:1-ev [i2c_algo_bit] [k] sclhi
>> + 30.22% 30.22% kworker/65:1-ev [kernel.kallsyms] [k] ioread8
>> + 7.30% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] delay_halt
>> + 7.30% 0.00% kworker/65:1-ev [i2c_algo_bit] [k] acknak
>> + 7.29% 0.00% kworker/65:1-ev [mgag200] [k] mgag200_ddc_algo_bit_data_setscl
>> + 5.02% 0.00% swapper [kernel.kallsyms] [k] secondary_startup_64
>> + 5.02% 0.00% swapper [kernel.kallsyms] [k] start_secondary
>> + 5.02% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
>> + 5.02% 0.00% swapper [kernel.kallsyms] [k] do_idle
>> + 3.60% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle
>> + 3.60% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter
>> + 3.53% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state
>> + 2.57% 0.00% kworker/65:1-ev [mgag200] [k] mgag200_ddc_algo_bit_data_setsda
>> + 2.14% 0.00% perf [unknown] [k] 0xffffffffffffffff
>> + 2.14% 0.00% perf perf [.] __cmd_record.constprop.0
>> + 2.14% 0.00% perf [kernel.kallsyms] [k] entry_SYSCALL_64
>> + 2.14% 0.00% perf [kernel.kallsyms] [k] do_syscall_64
>> + 2.14% 0.00% perf [kernel.kallsyms] [k] x64_sys_call
>> + 2.06% 2.06% swapper [kernel.kallsyms] [k] intel_idle
>> + 1.31% 0.42% perf [kernel.kallsyms] [k] do_sys_poll
>> + 1.31% 0.00% perf perf [.] fdarray__poll
>> + 1.31% 0.00% perf libc.so.6 [.] __poll
>> + 1.31% 0.00% perf [kernel.kallsyms] [k] __x64_sys_poll
>> + 1.06% 0.00% systemd-journal systemd-journald [.] 0x00005d6bb7cb3f64
>> + 1.06% 0.00% systemd-journal libc.so.6 [.] __libc_start_main
>> + 1.06% 0.00% systemd-journal libc.so.6 [.] 0x00007d6ce3a2a1c9
>> + 1.06% 0.00% systemd-journal systemd-journald [.] 0x00005d6bb7cb389e
>> + 1.06% 0.00% systemd-journal libsystemd-shared-255.so [.] sd_event_run
>> + 1.06% 0.00% systemd-journal libsystemd-shared-255.so [.] sd_event_dispatch
>> + 1.06% 0.00% systemd-journal libsystemd-shared-255.so [.] 0x00007d6ce409d413
>> + 1.00% 0.00% kworker/65:1-ev [i2c_algo_bit] [k] i2c_stop
>> + 0.83% 0.00% perf [kernel.kallsyms] [k] perf_poll
>> + 0.83% 0.00% perf perf [.] record__mmap_read_evlist
>>
> As you can see, in this case we are spending +60% of the cycles in
> delay_halt_tpause which is part of the bit_xfer function for
> implementing i2c.
That's from the DDC's i2c channel, which we poll at regular intervals
when we update the connector status. Dave's suggestion should at least
mitigate the problem.
>
> I also occasionally see these messages coming on dmesg:
>> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
>> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 5 times, consider switching to WQ_UNBOUND
>> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 7 times, consider switching to WQ_UNBOUND
>> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 11 times, consider switching to WQ_UNBOUND
>> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 19 times, consider switching to WQ_UNBOUND
>> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 35 times, consider switching to WQ_UNBOUND
>> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 67 times, consider switching to WQ_UNBOUND
>> Apr 20 23:14:44 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 131 times, consider switching to WQ_UNBOUND
>> Apr 20 23:14:44 1762811 kernel: workqueue: work_for_cpu_fn hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
>> Apr 20 23:14:44 1762811 kernel: workqueue: work_for_cpu_fn hogged CPU for >10000us 5 times, consider switching to WQ_UNBOUND
>> Apr 20 23:14:44 1762811 kernel: workqueue: work_for_cpu_fn hogged CPU for >10000us 7 times, consider switching to WQ_UNBOUND
>> Apr 20 23:14:45 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 259 times, consider switching to WQ_UNBOUND
>> Apr 20 23:15:15 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
>> Apr 20 23:15:25 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 5 times, consider switching to WQ_UNBOUND
>> Apr 20 23:15:46 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 7 times, consider switching to WQ_UNBOUND
>> Apr 20 23:16:27 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 11 times, consider switching to WQ_UNBOUND
>> Apr 20 23:16:45 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
>> Apr 20 23:16:45 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 5 times, consider switching to WQ_UNBOUND
>> Apr 20 23:16:45 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 7 times, consider switching to WQ_UNBOUND
>> Apr 20 23:16:45 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 11 times, consider switching to WQ_UNBOUND
>> Apr 20 23:16:45 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 19 times, consider switching to WQ_UNBOUND
>> Apr 20 23:17:49 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 19 times, consider switching to WQ_UNBOUND
>> Apr 20 23:20:33 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 35 times, consider switching to WQ_UNBOUND
>> Apr 20 23:26:00 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 67 times, consider switching to WQ_UNBOUND
>> Apr 20 23:36:56 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 131 times, consider switching to WQ_UNBOUND
>> Apr 20 23:58:46 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 259 times, consider switching to WQ_UNBOUND
>> Apr 21 00:34:27 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 515 times, consider switching to WQ_UNBOUND
>> Apr 21 00:42:28 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 515 times, consider switching to WQ_UNBOUND
>> Apr 21 02:09:51 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 1027 times, consider switching to WQ_UNBOUND
>> Apr 21 03:27:40 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 1027 times, consider switching to WQ_UNBOUND
>> Apr 21 05:04:37 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 2051 times, consider switching to WQ_UNBOUND
>> Apr 21 08:09:39 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 35 times, consider switching to WQ_UNBOUND
>> Apr 21 08:10:07 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 67 times, consider switching to WQ_UNBOUND
>> Apr 21 08:10:10 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 131 times, consider switching to WQ_UNBOUND
>> Apr 21 08:10:21 1762811 kernel: workqueue: vmstat_update hogged CPU for >10000us 259 times, consider switching to WQ_UNBOUND
>> Apr 21 09:14:18 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 2051 times, consider switching to WQ_UNBOUND
>> Apr 21 10:54:08 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 4099 times, consider switching to WQ_UNBOUND
>> Apr 21 21:11:47 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 4099 times, consider switching to WQ_UNBOUND
>> Apr 21 22:33:11 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 8195 times, consider switching to WQ_UNBOUND
>> Apr 22 20:31:04 1762811 kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 8195 times, consider switching to WQ_UNBOUND
>> Apr 22 21:51:17 1762811 kernel: workqueue: output_poll_execute hogged CPU for >10000us 16387 times, consider switching to WQ_UNBOUND
> These all appear to be workqueue warnings about functions that are
> hogging CPU. If I look carefully, it looks like they are all possibly
> related to the same mgag200 driver. At the very least
> output_poll_execute is certainly related to the mgag200 stall.
Polling the DDC involves acquiring locks so that it does not interfere
with display updates. These errors about drm_fb_helper_damage_work() are
fallout. The function most likely waits for the DDC polling to finish.
>
> I do not understand exactly what is causing the driver to get stuck;
> it's something in the i2c routine for reading the EDID block.
>
> I also see this being printed:
>
> EDID block 0 (tag 0x00) checksum is invalid, remainder is 125
>
> It appears to print quite consistently every few seconds. I guess this
> might be possibly related to a bad EDID block on the mgag200 device?
> What does this even mean?
The monitor's EDID is wrong. This is likely another fallout from the issue.
>
> I am not sure how I'd go about verifying this, or root causing what is
> going wrong.
>
> It looks like we print the message as part of _drm_do_get_edid(), and
> this definitely is called as part of the mgag200 routines:
>
>> - 33.33% 33.33% kworker/64:1-ev [kernel.kallsyms] [k] _drm_do_get_edid
>> ret_from_fork_asm
>> ret_from_fork
>> kthread
>> worker_thread
>> process_one_work
>> output_poll_execute
>> drm_client_dev_hotplug
>> drm_fbdev_shmem_client_hotplug
>> drm_fb_helper_hotplug_event
>> drm_client_modeset_probe
>> drm_helper_probe_single_connector_modes
>> mgag200_vga_bmc_connector_helper_get_modes
>> drm_connector_helper_get_modes
>> drm_edid_read
>> drm_edid_read_custom
>> _drm_do_get_edid
> This makes me think that we're reading a bad EDID. I enabled drm.debug
> setting to get more data:
>
>> Apr 22 23:47:11 1762811 kernel: EDID block 0 (tag 0x00) checksum is invalid, remainder is 125
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:connector_bad_edid] [CONNECTOR:36:VGA-1] EDID is invalid:
>> Apr 22 23:47:11 1762811 kernel: [00] BAD 00 ff ff ff ff ff ff 00 ff ff ff ff ff ff ff ff
>> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
This EDID has a correct identifier in the first 8 bytes and the rest is
garbage.
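For reference, the "remainder is 125" figure in the log can be reproduced
from the dump itself. The following is a minimal user-space sketch, not the
kernel's code: it assumes the logged remainder is the checksum byte that the
first 127 bytes would require, and the function names are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EDID_BLOCK_LEN 128

/* Fill a block with the pattern from the dump: valid header, then all 0xff. */
void fill_dumped_block(uint8_t block[EDID_BLOCK_LEN])
{
	static const uint8_t header[8] = {
		0x00, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00
	};

	assert(block != NULL);
	memset(block, 0xff, EDID_BLOCK_LEN);
	memcpy(block, header, sizeof(header));
}

/* Checksum byte that would make all 128 bytes sum to 0 modulo 256. */
uint8_t edid_expected_checksum(const uint8_t block[EDID_BLOCK_LEN])
{
	uint8_t sum = 0;
	int i;

	/* Sum everything except the final (checksum) byte. */
	for (i = 0; i < EDID_BLOCK_LEN - 1; i++)
		sum = (uint8_t)(sum + block[i]);

	return (uint8_t)(0x100 - sum);
}

/* A block passes when all 128 bytes sum to 0 modulo 256. */
int edid_block_sum_ok(const uint8_t block[EDID_BLOCK_LEN])
{
	uint8_t sum = 0;
	int i;

	for (i = 0; i < EDID_BLOCK_LEN; i++)
		sum = (uint8_t)(sum + block[i]);

	return sum == 0;
}
```

For the dumped block, edid_expected_checksum() comes out as 125, while the
byte actually stored at offset 127 is 0xff, so the block fails the check.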
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x720": 60 74250 1280 1390 1430 1650 720 725 730 750 0x40 0x5 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x768": 60 68250 1280 1328 1360 1440 768 771 778 790 0x40 0x9 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x768": 60 79500 1280 1344 1472 1664 768 771 778 798 0x40 0x6 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x800": 60 71000 1280 1328 1360 1440 800 803 809 823 0x40 0x9 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x800": 60 83500 1280 1352 1480 1680 800 803 809 831 0x40 0x6 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x960": 60 108000 1280 1376 1488 1800 960 961 964 1000 0x40 0x5 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1280x1024": 60 108000 1280 1328 1440 1688 1024 1025 1028 1066 0x40 0x5 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1360x768": 60 85500 1360 1424 1536 1792 768 771 777 795 0x40 0x5 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1366x768": 60 85500 1366 1436 1579 1792 768 771 774 798 0x40 0x5 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1366x768": 60 72000 1366 1380 1436 1500 768 769 772 800 0x40 0x5 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1400x1050": 60 101000 1400 1448 1480 1560 1050 1053 1057 1080 0x40 0x9 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1400x1050": 60 121750 1400 1488 1632 1864 1050 1053 1057 1089 0x40 0x6 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1440x900": 60 88750 1440 1488 1520 1600 900 903 909 926 0x40 0x9 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1440x900": 60 106500 1440 1520 1672 1904 900 903 909 934 0x40 0x6 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1600x900": 60 108000 1600 1624 1704 1800 900 901 904 1000 0x40 0x5 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1600x1200": 60 162000 1600 1664 1856 2160 1200 1201 1204 1250 0x40 0x5 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1680x1050": 60 119000 1680 1728 1760 1840 1050 1053 1059 1080 0x40 0x9 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1680x1050": 60 146250 1680 1784 1960 2240 1050 1053 1059 1089 0x40 0x6 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1792x1344": 60 204750 1792 1920 2120 2448 1344 1345 1348 1394 0x40 0x6 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1856x1392": 60 218250 1856 1952 2176 2528 1392 1393 1396 1439 0x40 0x6 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1920x1080": 60 148500 1920 2008 2052 2200 1080 1084 1089 1125 0x40 0xa (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1920x1200": 60 154000 1920 1968 2000 2080 1200 1203 1209 1235 0x40 0x9 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1920x1200": 60 193250 1920 2056 2256 2592 1200 1203 1209 1245 0x40 0x6 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "1920x1440": 60 234000 1920 2048 2256 2600 1440 1441 1444 1500 0x40 0x6 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_mode_prune_invalid] Rejected mode: "2048x1152": 60 162000 2048 2074 2154 2250 1152 1153 1156 1200 0x40 0x5 (VIRTUAL_X)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:36:VGA-1] probed modes:
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "1024x768": 60 65000 1024 1048 1184 1344 768 771 777 806 0x48 0xa
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "800x600": 60 40000 800 840 968 1056 600 601 605 628 0x40 0x5
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "800x600": 56 36000 800 824 896 1024 600 601 603 625 0x40 0x5
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "848x480": 60 33750 848 864 976 1088 480 486 494 517 0x40 0x5
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_helper_probe_single_connector_modes] Probed mode: "640x480": 60 25175 640 656 752 800 480 490 492 525 0x40 0xa
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CONNECTOR:36:VGA-1] enabled? yes
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] Not using firmware configuration
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CONNECTOR:36:VGA-1] looking for cmdline mode
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CONNECTOR:36:VGA-1] looking for preferred mode, tile 0
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CONNECTOR:36:VGA-1] Found mode 1024x768
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] picking CRTCs for 1024x768 config
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_modeset_probe] [CRTC:34:crtc-0] desired mode 1024x768 set (0,0)
>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_dev_hotplug] fbdev: ret=0
> Does anyone have any idea whats going wrong here? A google search seems
> to imply this is reading the EDID data from the VGA cable...
The HW is probably broken.
>
> I'm also curious if its possible to stop polling for so long with udelay
> in the i2c logic somehow? I am not very familiar with i2c, but it is
> frustrating that this driver is causing yet another stall that is
> impacting timing sensitive data. Even if in this case its due to a
> faulty cable.. it is frustrating that such result causes the PTP
> failures. Would switching to WQ_UNBOUND be helpful here at all?
Try Dave's suggestion to avoid polling. The driver won't be able to
detect changes to the connector status, though.
Best regards
Thomas
>
> Thanks,
> Jake
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: further issues with MGA G200 graphics chipset
2026-04-23 7:44 ` Thomas Zimmermann
@ 2026-04-23 16:35 ` Jacob Keller
2026-04-23 19:22 ` Jocelyn Falempe
0 siblings, 1 reply; 20+ messages in thread
From: Jacob Keller @ 2026-04-23 16:35 UTC (permalink / raw)
To: Thomas Zimmermann, Jocelyn Falempe, airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
On 4/23/2026 12:44 AM, Thomas Zimmermann wrote:
> Hi
>
> Am 23.04.26 um 01:55 schrieb Jacob Keller:
>> Hello,
>>
>> You may recall the issues I recently reported and submitted a fix for in
>> the mgag200 DRM driver from [1].
>>
>> [1]:
>> https://lore.kernel.org/all/20260202-jk-mgag200-fix-bad-udelay-v2-1-ce1e9665987d@intel.com/
>>
>> I recently have been running into another issue with the mgag200
>> graphics driver on a similar platform. I noticed occasional spikes where
>> Tx timestamps from the ice driver were delayed, very similar behavior to
>> what was going on with the original bug report. However, this was on a
>> system running v6.12.76, which contains my MGA G200 usleep fix.
>>
>> I analyzed the data with perf and have discovered what looks like
>> another issue where the mgag200 polling routine is causing us issues.
>>
>> Here's a perf report which captures the cycles samples between the start
>> of a Tx timestamp request and the point where we report it to the stack:
>>
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] ret_from_fork_asm
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] ret_from_fork
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] kthread
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] worker_thread
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] process_one_work
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] output_poll_execute
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_client_dev_hotplug
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_fbdev_shmem_client_hotplug
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_fb_helper_hotplug_event
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_client_modeset_probe
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_helper_probe_single_connector_modes
>>> + 89.87% 0.00% kworker/65:1-ev [mgag200] [k] mgag200_vga_bmc_connector_helper_get_modes
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_connector_helper_get_modes
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_edid_read
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_edid_read_custom
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] _drm_do_get_edid
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] edid_block_read
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] drm_do_probe_ddc_edid
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] i2c_transfer
>>> + 89.87% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] __i2c_transfer
>>> + 89.87% 0.00% kworker/65:1-ev [i2c_algo_bit] [k] bit_xfer
>>> - 59.65% 59.65% kworker/65:1-ev [kernel.kallsyms] [k] delay_halt_tpause
>>> ret_from_fork_asm
>>> ret_from_fork
>>> kthread
>>> worker_thread
>>> process_one_work
>>> output_poll_execute
>>> drm_client_dev_hotplug
>>> drm_fbdev_shmem_client_hotplug
>>> drm_fb_helper_hotplug_event
>>> drm_client_modeset_probe
>>> drm_helper_probe_single_connector_modes
>>> mgag200_vga_bmc_connector_helper_get_modes
>>> drm_connector_helper_get_modes
>>> drm_edid_read
>>> drm_edid_read_custom
>>> _drm_do_get_edid
>>> edid_block_read
>>> drm_do_probe_ddc_edid
>>> i2c_transfer
>>> __i2c_transfer
>>> + bit_xfer
>>> + 59.65% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] __udelay
>>> + 59.65% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] __const_udelay
>>> + 51.11% 0.00% kworker/65:1-ev [i2c_algo_bit] [k] sclhi
>>> + 30.22% 30.22% kworker/65:1-ev [kernel.kallsyms] [k] ioread8
>>> + 7.30% 0.00% kworker/65:1-ev [kernel.kallsyms] [k] delay_halt
>>> + 7.30% 0.00% kworker/65:1-ev [i2c_algo_bit] [k] acknak
>>> + 7.29% 0.00% kworker/65:1-ev [mgag200] [k] mgag200_ddc_algo_bit_data_setscl
>>> + 5.02% 0.00% swapper [kernel.kallsyms] [k] secondary_startup_64
>>> + 5.02% 0.00% swapper [kernel.kallsyms] [k] start_secondary
>>> + 5.02% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
>>> + 5.02% 0.00% swapper [kernel.kallsyms] [k] do_idle
>>> + 3.60% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle
>>> + 3.60% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter
>>> + 3.53% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state
>>> + 2.57% 0.00% kworker/65:1-ev [mgag200] [k] mgag200_ddc_algo_bit_data_setsda
>>> + 2.14% 0.00% perf [unknown] [k] 0xffffffffffffffff
>>> + 2.14% 0.00% perf perf [.] __cmd_record.constprop.0
>>> + 2.14% 0.00% perf [kernel.kallsyms] [k] entry_SYSCALL_64
>>> + 2.14% 0.00% perf [kernel.kallsyms] [k] do_syscall_64
>>> + 2.14% 0.00% perf [kernel.kallsyms] [k] x64_sys_call
>>> + 2.06% 2.06% swapper [kernel.kallsyms] [k] intel_idle
>>> + 1.31% 0.42% perf [kernel.kallsyms] [k] do_sys_poll
>>> + 1.31% 0.00% perf perf [.] fdarray__poll
>>> + 1.31% 0.00% perf libc.so.6 [.] __poll
>>> + 1.31% 0.00% perf [kernel.kallsyms] [k] __x64_sys_poll
>>> + 1.06% 0.00% systemd-journal systemd-journald [.] 0x00005d6bb7cb3f64
>>> + 1.06% 0.00% systemd-journal libc.so.6 [.] __libc_start_main
>>> + 1.06% 0.00% systemd-journal libc.so.6 [.] 0x00007d6ce3a2a1c9
>>> + 1.06% 0.00% systemd-journal systemd-journald [.] 0x00005d6bb7cb389e
>>> + 1.06% 0.00% systemd-journal libsystemd-shared-255.so [.] sd_event_run
>>> + 1.06% 0.00% systemd-journal libsystemd-shared-255.so [.] sd_event_dispatch
>>> + 1.06% 0.00% systemd-journal libsystemd-shared-255.so [.] 0x00007d6ce409d413
>>> + 1.00% 0.00% kworker/65:1-ev [i2c_algo_bit] [k] i2c_stop
>>> + 0.83% 0.00% perf [kernel.kallsyms] [k] perf_poll
>>> + 0.83% 0.00% perf perf [.] record__mmap_read_evlist
>>>
>> As you can see, in this case we are spending +60% of the cycles in
>> delay_halt_tpause which is part of the bit_xfer function for
>> implementing i2c.
>
> That's from the DDC's i2c channel, which we poll at regular intervals
> when we update the connector status. Dave's suggestion should at least
> mitigate the problem.
>
Right.
>
> Polling the DDC involves acquiring locks so that it does not interfere
> with display updates. These errors about drm_fb_helper_damage_work() are
> fallout. The function most likely waits for the DDC polling to finish.
Makes sense. I'm still wondering if it makes sense to convert to
WQ_UNBOUND so that the task doesn't get bound to a CPU and (hopefully?)
doesn't cause other critical processes like IRQs to get stuck when they
*happen* to be bound to the same CPU? I'm not entirely sure. It seems
crazy to me that this simple background polling thread stalls my IRQ
from executing for 30 milliseconds, but that appears to be what is
happening.

I am guessing that refactoring i2c-algo-bit to allow usleep is not
really possible either, so we can't make this part of the logic actually
sleep instead of busy-waiting... :(
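
As a rough back-of-the-envelope check on why a single EDID block read can
hog a CPU for tens of milliseconds, here is a small sketch. It assumes two
udelay() periods per clocked bit, and the 10 us value used in the note below
is an assumed setting, not one confirmed in this thread; the function name
is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bits clocked per byte on the wire: 8 data bits plus 1 ACK/NAK bit. */
#define I2C_BITS_PER_BYTE 9

/*
 * Rough lower bound, in microseconds, on the time spent busy-waiting while
 * bit-banging nbytes over i2c. Assumes each clocked bit costs two delays of
 * udelay_us in total; start/stop conditions, address bytes and register
 * access time are ignored.
 */
uint64_t bitbang_busy_wait_us(uint32_t nbytes, uint32_t udelay_us)
{
	assert(udelay_us > 0);
	return (uint64_t)nbytes * I2C_BITS_PER_BYTE * 2u * udelay_us;
}
```

With 128 bytes and an assumed 10 us delay this gives 23040 us, roughly 23 ms
per block, which is the same order of magnitude as the stall described above.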
>>
>> I do not understand exactly what is causing the driver to get stuck;
>> it's something in the i2c routine for reading the EDID block.
>>
>> I also see this being printed:
>>
>> EDID block 0 (tag 0x00) checksum is invalid, remainder is 125
>>
>> It appears to print quite consistently every few seconds. I guess this
>> might be possibly related to a bad EDID block on the mgag200 device?
>> What does this even mean?
>
> The monitor's EDID is wrong. This is likely another fallout from the issue.
>
It turns out that the platform doesn't even seem to have a physical VGA
port. This makes me suspect Dave's point about a cheap resistor is quite
plausible.
>>
>> I am not sure how I'd go about verifying this, or root causing what is
>> going wrong.
>>
>> It looks like we print the message as part of _drm_do_get_edid(), and
>> this definitely is called as part of the mgag200 routines:
>>
>>> - 33.33% 33.33% kworker/64:1-ev [kernel.kallsyms] [k] _drm_do_get_edid
>>> ret_from_fork_asm
>>> ret_from_fork
>>> kthread
>>> worker_thread
>>> process_one_work
>>> output_poll_execute
>>> drm_client_dev_hotplug
>>> drm_fbdev_shmem_client_hotplug
>>> drm_fb_helper_hotplug_event
>>> drm_client_modeset_probe
>>> drm_helper_probe_single_connector_modes
>>> mgag200_vga_bmc_connector_helper_get_modes
>>> drm_connector_helper_get_modes
>>> drm_edid_read
>>> drm_edid_read_custom
>>> _drm_do_get_edid
>> This makes me think that we're reading a bad EDID. I enabled drm.debug
>> setting to get more data:
>>
>>> Apr 22 23:47:11 1762811 kernel: EDID block 0 (tag 0x00) checksum is invalid, remainder is 125
>>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:connector_bad_edid] [CONNECTOR:36:VGA-1] EDID is invalid:
>>> Apr 22 23:47:11 1762811 kernel: [00] BAD 00 ff ff ff ff ff ff 00 ff ff ff ff ff ff ff ff
>>> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>> Apr 22 23:47:11 1762811 kernel: [00] BAD ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>
> This EDID has a correct identifier in the first 8 bytes and the rest is
> garbage.
>
Yep.
>>> Apr 22 23:47:11 1762811 kernel: mgag200 0000:b5:00.0: [drm:drm_client_dev_hotplug] fbdev: ret=0
>> Does anyone have any idea whats going wrong here? A google search seems
>> to imply this is reading the EDID data from the VGA cable...
>
> The HW is probably broken.
>
Right. I thought we had a KVM dongle plugged into the VGA port, but
further inspection shows that there doesn't even appear to be a physical
VGA port on the system, and the mgag200 is only used for its BMC
connection! (We have a Mini DisplayPort to VGA adapter in use, and I've
asked the team to swap that out just to confirm it's not related.)
>>
>> I'm also curious if its possible to stop polling for so long with udelay
>> in the i2c logic somehow? I am not very familiar with i2c, but it is
>> frustrating that this driver is causing yet another stall that is
>> impacting timing sensitive data. Even if in this case its due to a
>> faulty cable.. it is frustrating that such result causes the PTP
>> failures. Would switching to WQ_UNBOUND be helpful here at all?
>
> Try Dave's suggestion to avoid polling. The driver won't be able to
> detect changes to the connector status, though.
>
That's fine. I don't think we're even using the device. It looks like it
might only be in use for BMC, and the VGA connection isn't actually
physically available, so there are no changes to detect.
Is this polling really only to detect when VGA is enabled? Would it make
sense to only poll on platforms which actually *have* that VGA connection?
I'd like a solution where we don't have to go to each individual
customer and have them ban the mgag200 driver or set some kernel
parameter like drm_kms_helper.poll=0 to prevent issues. If the VGA
connector isn't even available to *be* plugged in, then it doesn't make
sense to constantly poll to check if it was...
Many system admins likely aren't even aware of the device's existence,
and it ends up causing stall issues like this, which for timing
sensitive tasks results in service disruption.
It is unpleasant that the mere *existence* of the device+driver causes
such problems.
> Best regards
> Thomas
>
>>
>> Thanks,
>> Jake
>
* Re: further issues with MGA G200 graphics chipset
2026-04-23 16:35 ` Jacob Keller
@ 2026-04-23 19:22 ` Jocelyn Falempe
2026-04-23 19:42 ` Jacob Keller
0 siblings, 1 reply; 20+ messages in thread
From: Jocelyn Falempe @ 2026-04-23 19:22 UTC (permalink / raw)
To: Jacob Keller, Thomas Zimmermann, airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
On 23/04/2026 18:35, Jacob Keller wrote:
> On 4/23/2026 12:44 AM, Thomas Zimmermann wrote:
>> Hi
>>
>> Am 23.04.26 um 01:55 schrieb Jacob Keller:
>>> Hello,
>>> <snip>
>>> I'm also curious if its possible to stop polling for so long with udelay
>>> in the i2c logic somehow? I am not very familiar with i2c, but it is
>>> frustrating that this driver is causing yet another stall that is
>>> impacting timing sensitive data. Even if in this case its due to a
>>> faulty cable.. it is frustrating that such result causes the PTP
>>> failures. Would switching to WQ_UNBOUND be helpful here at all?
>>
>> Try Dave's suggestion to avoid polling. The driver won't be able to
>> detect changes to the connector status, though.
>>
>
> That's fine. I don't think we're even using the device. It looks like it
> might only be in use for BMC, and the VGA connection isn't actually
> physically available, so there are no changes to detect.
>
> Is this polling really only to detect when VGA is enabled? Would it make
> sense to only poll on platforms which actually *have* that VGA connection?
>
Polling was introduced with https://patchwork.freedesktop.org/series/131977/
The driver needs to know if a VGA monitor is connected or not, to
provide the right available resolutions to the userspace.
Otherwise you can set a high resolution that works from the BMC, but
then connecting a VGA monitor will not work, as the driver won't notice
that something has been connected.
The mgag200 doesn't have an IRQ or a register to check if something is
connected on the VGA port, so the driver uses the i2c and tries to read
the EDID.
Unfortunately, there is no way to know reliably if a VGA connector is
present. It's possible to disable polling on some machines using DMI
quirks, but I don't think this approach will scale.
>
> I'd like a solution where we don't have to go to each individual
> customer and have them ban the mgag200 driver or set some kernel
> parameter like drm_kms_helper.poll=0 to prevent issues. If the VGA
> connector isn't even available to *be* plugged in, then it doesn't make
> sense to constantly poll to check if it was...
>
> Many system admins likely aren't even aware of the devices existence,
> and it ends up causing stall issues like this, which for timing
> sensitive tasks results in service disruption.
>
> It is unpleasant that the mere *existence* of the device+driver causes
> such problems.
>
>> Best regards
>> Thomas
>>
>>>
>>> Thanks,
>>> Jake
>>
>
Best regards,
--
Jocelyn
* Re: further issues with MGA G200 graphics chipset
2026-04-23 19:22 ` Jocelyn Falempe
@ 2026-04-23 19:42 ` Jacob Keller
2026-04-23 21:02 ` David Airlie
2026-04-24 6:20 ` Thomas Zimmermann
0 siblings, 2 replies; 20+ messages in thread
From: Jacob Keller @ 2026-04-23 19:42 UTC (permalink / raw)
To: Jocelyn Falempe, Thomas Zimmermann, airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
On 4/23/2026 12:22 PM, Jocelyn Falempe wrote:
> On 23/04/2026 18:35, Jacob Keller wrote:
>> On 4/23/2026 12:44 AM, Thomas Zimmermann wrote:
>>> Hi
>>>
>>> Am 23.04.26 um 01:55 schrieb Jacob Keller:
>>>> Hello,
>>>> <snip>
>>>> I'm also curious if its possible to stop polling for so long with udelay
>>>> in the i2c logic somehow? I am not very familiar with i2c, but it is
>>>> frustrating that this driver is causing yet another stall that is
>>>> impacting timing sensitive data. Even if in this case its due to a
>>>> faulty cable.. it is frustrating that such result causes the PTP
>>>> failures. Would switching to WQ_UNBOUND be helpful here at all?
>>>
>>> Try Dave's suggestion to avoid polling. The driver won't be able to
>>> detect changes to the connector status, though.
>>>
>>
>> That's fine. I don't think we're even using the device. It looks like it
>> might only be in use for BMC, and the VGA connection isn't actually
>> physically available, so there are no changes to detect.
>>
>> Is this polling really only to detect when VGA is enabled? Would it make
>> sense to only poll on platforms which actually *have* that VGA connection?
>>
> Polling was introduced with https://patchwork.freedesktop.org/series/131977/
>
> The driver needs to know if a VGA monitor is connected or not, to
> provide the right available resolutions to the userspace.
> Otherwise you can set a high resolution that works from the BMC, but
> then connecting a VGA monitor will not work, as the driver won't notice
> that something has been connected.
>
> The mgag200 doesn't have an IRQ or a register to check if something is
> connected on the VGA port, so the driver uses the i2c and tries to read
> the EDID.
>
> Unfortunately, there is no way to know reliably if a VGA connector is
> present. It's possible to disable polling on some machines using DMI
> quirks, but I don't think this approach will scale.
>
Timing sensitive setups like mine must have system admins who know to
manually disable mgag200 or disable polling. Many users won't be aware
of this. If the polling were not intrusive, this would not be an issue.
But....
Faulty hardware (perhaps just a cheap pull down resistor on the VGA
connection as Dave Airlie suggests) means that any such affected
platform has a polling routine that causes significant issues on any
timing sensitive applications.
Right now, I am stuck in a situation which means that I have to fight to
reach every customer who uses one of these platforms and confirm they
either disable polling or ban the module so it won't even load.
This is frustrating, as it is unlikely I'll reach everyone.
I doubt that I'm the only one with users who are affected by mysterious
performance or timing problems related to this. While it's true that not
*every* instance of the device is problematic (at least not now that we
fixed the other issue with the udelay...), many systems using the
controller *are* negatively impacted even with the timing fix, as I have
now seen...
Unfortunately, I also have no better idea than a DMI quirk table to
record known platforms that include the controller but don't have a
physical VGA connection exposed.
Thus, I'm wondering what else we can do? Using WQ_UNBOUND might help
somewhat? I have no idea if it's safe to sleep instead of spin while
reading the i2c connections... As far as I can tell the non-atomic
version has nothing that *strictly* prevents sleep.. but maybe i2c
access has tighter timing requirements than what usleep_range can
fulfill? I am not sure...
I'd just really like to not have to worry about going to every single
user and asking them to unload and ban a driver for these big server
platforms...
* Re: further issues with MGA G200 graphics chipset
2026-04-23 19:42 ` Jacob Keller
@ 2026-04-23 21:02 ` David Airlie
2026-04-23 21:18 ` Jacob Keller
2026-04-24 6:16 ` Thomas Zimmermann
2026-04-24 6:20 ` Thomas Zimmermann
1 sibling, 2 replies; 20+ messages in thread
From: David Airlie @ 2026-04-23 21:02 UTC (permalink / raw)
To: Jacob Keller
Cc: Jocelyn Falempe, Thomas Zimmermann, dri-devel,
linux-kernel@vger.kernel.org, Pasi Vaananen
On Fri, Apr 24, 2026 at 5:42 AM Jacob Keller <jacob.e.keller@intel.com> wrote:
>
> On 4/23/2026 12:22 PM, Jocelyn Falempe wrote:
> > On 23/04/2026 18:35, Jacob Keller wrote:
> >> On 4/23/2026 12:44 AM, Thomas Zimmermann wrote:
> >>> Hi
> >>>
> >>> Am 23.04.26 um 01:55 schrieb Jacob Keller:
> >>>> Hello,
> >>>> <snip>
> >>>> I'm also curious if its possible to stop polling for so long with udelay
> >>>> in the i2c logic somehow? I am not very familiar with i2c, but it is
> >>>> frustrating that this driver is causing yet another stall that is
> >>>> impacting timing sensitive data. Even if in this case its due to a
> >>>> faulty cable.. it is frustrating that such result causes the PTP
> >>>> failures. Would switching to WQ_UNBOUND be helpful here at all?
> >>>
> >>> Try Dave's suggestion to avoid polling. The driver won't be able to
> >>> detect changes to the connector status, though.
> >>>
> >>
> >> That's fine. I don't think we're even using the device. It looks like it
> >> might only be in use for BMC, and the VGA connection isn't actually
> >> physically available, so there are no changes to detect.
> >>
> >> Is this polling really only to detect when VGA is enabled? Would it make
> >> sense to only poll on platforms which actually *have* that VGA connection?
> >>
> > Polling was introduced with https://patchwork.freedesktop.org/series/131977/
> >
> > The driver needs to know if a VGA monitor is connected or not, to
> > provide the right available resolutions to the userspace.
> > Otherwise you can set a high resolution that works from the BMC, but
> > then connecting a VGA monitor will not work, as the driver won't notice
> > that something has been connected.
> >
> > The mgag200 doesn't have an IRQ or a register to check if something is
> > connected on the VGA port, so the driver uses the i2c and tries to read
> > the EDID.
> >
> > Unfortunately, there is no way to know reliably if a VGA connector is
> > present. It's possible to disable polling on some machines using DMI
> > quirks, but I don't think this approach will scale.
> >
>
> Timing sensitive setups like mine must have system admins who know to
> manually disable mgag200 or disable polling. Many users won't be aware
> of this. If the polling were not intrusive, this would not be an issue.
> But....
>
> Faulty hardware (perhaps just a cheap pull down resistor on the VGA
> connection as Dave Airlie suggests) means that any such affected
> platform has a polling routine that causes significant issues on any
> timing sensitive applications.
We could write a patch that simply gives up after 10 bogus EDID polls
and says so loudly in the logs.
This might break some crash-cart plugins in some data centers though.
I don't think we have contacts at Matrox or the server vendors who
make the hw to say how they recommend finding this info.
It might be in ACPI or dmidecodes.
Dave.
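A rough userspace sketch of the "give up after 10 bogus EDID polls" idea, just to make the proposal concrete. Nothing here exists in mgag200; the struct, enum and function names are hypothetical, and it assumes the count means consecutive bogus reads (a clean result or no reply resets it).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BOGUS_EDID_LIMIT 10 /* threshold taken from the suggestion above */

/* Hypothetical outcome of one connector poll. */
enum probe_result { PROBE_NO_REPLY, PROBE_EDID_OK, PROBE_EDID_BOGUS };

/* Hypothetical per-connector bookkeeping. */
struct poll_guard {
	unsigned int bogus_streak; /* consecutive bogus EDID reads */
	bool gave_up;              /* set once the limit is reached */
};

/* Record one poll result; returns true while polling should continue. */
static bool poll_guard_update(struct poll_guard *g, enum probe_result r)
{
	assert(g != NULL);

	if (g->gave_up)
		return false;

	if (r == PROBE_EDID_BOGUS) {
		if (++g->bogus_streak >= BOGUS_EDID_LIMIT)
			g->gave_up = true; /* a real driver would log loudly here */
	} else {
		g->bogus_streak = 0; /* any non-bogus result restarts the count */
	}

	return !g->gave_up;
}
```

Once gave_up is set the sketch never resumes polling, which is exactly the crash-cart concern: a monitor plugged in later would go unnoticed until something resets the state.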
>
> Right now, I am stuck in a situation which means that I have to fight to
> reach every customer who uses one of these platforms and confirm they
> either disable polling or ban the module so it won't even load.
>
> This is frustrating, as it is unlikely I'll reach everyone.
>
> I doubt that I'm the only one with users who are affected by mysterious
> performance or timing problems related to this. While its true that not
> *every* instance of the device is problematic (at least not now that we
> fixed the other issue with the udelay...), but many systems using the
> controller *are* negatively impacted even with the timing fix, as I have
> now seen...
>
> Unfortunately, I also have no better idea than a DMI quirk table to
> record known platforms that include the controller but don't have a
> physical VGA connection exposed.
>
> Thus, I'm wondering what else we can do? Using WQ_UNBOUND might help
> somewhat? I have no idea if its safe to sleep instead of spin while
> reading the i2c connections... As far as I can tell the non-atomic
> version has nothing that *strictly* prevents sleep.. but maybe i2c
> access has tighter timing requirements than what usleep_range can
> fulfill? I am not sure...
>
> I'd just really like to not have to worry about going to every single
> user and asking them to unload and ban a driver for these big server
> platforms...
>
* Re: further issues with MGA G200 graphics chipset
2026-04-23 21:02 ` David Airlie
@ 2026-04-23 21:18 ` Jacob Keller
2026-04-24 6:16 ` Thomas Zimmermann
1 sibling, 0 replies; 20+ messages in thread
From: Jacob Keller @ 2026-04-23 21:18 UTC (permalink / raw)
To: David Airlie
Cc: Jocelyn Falempe, Thomas Zimmermann, dri-devel,
linux-kernel@vger.kernel.org, Pasi Vaananen
On 4/23/2026 2:02 PM, David Airlie wrote:
> On Fri, Apr 24, 2026 at 5:42 AM Jacob Keller <jacob.e.keller@intel.com> wrote:
>>
>> On 4/23/2026 12:22 PM, Jocelyn Falempe wrote:
>>> On 23/04/2026 18:35, Jacob Keller wrote:
>>>> On 4/23/2026 12:44 AM, Thomas Zimmermann wrote:
>>>>> Hi
>>>>>
>>>>> Am 23.04.26 um 01:55 schrieb Jacob Keller:
>>>>>> Hello,
>>>>>> <snip>
>>>>>> I'm also curious if its possible to stop polling for so long with udelay
>>>>>> in the i2c logic somehow? I am not very familiar with i2c, but it is
>>>>>> frustrating that this driver is causing yet another stall that is
>>>>>> impacting timing sensitive data. Even if in this case its due to a
>>>>>> faulty cable.. it is frustrating that such result causes the PTP
>>>>>> failures. Would switching to WQ_UNBOUND be helpful here at all?
>>>>>
>>>>> Try Dave's suggestion to avoid polling. The driver won't be able to
>>>>> detect changes to the connector status, though.
>>>>>
>>>>
>>>> That's fine. I don't think we're even using the device. It looks like it
>>>> might only be in use for BMC, and the VGA connection isn't actually
>>>> physically available, so there are no changes to detect.
>>>>
>>>> Is this polling really only to detect when VGA is enabled? Would it make
>>>> sense to only poll on platforms which actually *have* that VGA connection?
>>>>
>>> Polling was introduced with https://patchwork.freedesktop.org/series/131977/
>>>
>>> The driver needs to know if a VGA monitor is connected or not, to
>>> provide the right available resolutions to the userspace.
>>> Otherwise you can set a high resolution that works from the BMC, but
>>> then connecting a VGA monitor will not work, as the driver won't notice
>>> that something has been connected.
>>>
>>> The mgag200 doesn't have an IRQ or a register to check if something is
>>> connected on the VGA port, so the driver uses the i2c and tries to read
>>> the EDID.
>>>
>>> Unfortunately, there is no way to know reliably if a VGA connector is
>>> present. It's possible to disable polling on some machines using DMI
>>> quirks, but I don't think this approach will scale.
>>>
>>
>> Timing sensitive setups like mine must have system admins who know to
>> manually disable mgag200 or disable polling. Many users won't be aware
>> of this. If the polling were not intrusive, this would not be an issue.
>> But....
>>
>> Faulty hardware (perhaps just a cheap pull down resistor on the VGA
>> connection as Dave Airlie suggests) means that any such affected
>> platform has a polling routine that causes significant issues on any
>> timing sensitive applications.
>
> We could write a patch to just say if we see 10 bogus EDID polls we
> just give up and loudly say in the logs.
>
That would certainly be a better situation for me...
> This might break some crash-cart plugins in some data centers though,
> I don't think we have contracts in Matrox or the server vendors who
> make the hw to say how they recommend finding this info.
>
But I could see this being a problem for data centers who previously saw
"no issue" and now see "this device is causing a problem", especially if
that problem is really non-existent?
> It might be in ACPI or dmidecodes.
>
I can try checking if anything obvious shows up in dmidecodes for the
device.
> Dave.
* Re: further issues with MGA G200 graphics chipset
2026-04-23 0:05 ` David Airlie
@ 2026-04-23 21:39 ` Jacob Keller
0 siblings, 0 replies; 20+ messages in thread
From: Jacob Keller @ 2026-04-23 21:39 UTC (permalink / raw)
To: David Airlie
Cc: Jocelyn Falempe, Thomas Zimmermann, dri-devel,
linux-kernel@vger.kernel.org, Pasi Vaananen
On 4/22/2026 5:05 PM, David Airlie wrote:
>>
>> These all appear to be workqueue warnings about functions that are
>> hogging CPU. If I look carefully, it looks like they are all possibly
>> related to the same mgag200 driver. At the very least
>> output_poll_execute is certainly related to the mgag200 stall.
>>
>> I do not understand exactly what is causing the driver to get stuck;
>> it's something in the i2c routine for reading the EDID block.
>>
>> I also see this being printed:
>>
>> EDID block 0 (tag 0x00) checksum is invalid, remainder is 125
>>
>> It appears to print quite consistently every few seconds. I guess this
>> might be possibly related to a bad EDID block on the mgag200 device?
>> What does this even mean?
>>
>
> It sounds like the polling is having trouble with the i2c bus even if
> there is no cable plugged in, probably cheaped out on some pull
> up/down resistors on the VGA connector.
>
> does adding drm_kms_helper.poll=0 help to the command line help?
>
> Dave.
>
This looks like it is a global parameter for all users of the
drm_kms_helper. Would it be feasible to have a mgag200 specific
parameter made available?
I am testing this out now, but if it helps, it would be good to be able
to disable polling only for mgag200 on the off chance that some system
has another device which depends on its functionality? I guess that may
not be super common so maybe it's not a big deal...
Thanks,
Jake
* Re: further issues with MGA G200 graphics chipset
2026-04-23 21:02 ` David Airlie
2026-04-23 21:18 ` Jacob Keller
@ 2026-04-24 6:16 ` Thomas Zimmermann
1 sibling, 0 replies; 20+ messages in thread
From: Thomas Zimmermann @ 2026-04-24 6:16 UTC (permalink / raw)
To: David Airlie, Jacob Keller
Cc: Jocelyn Falempe, dri-devel, linux-kernel@vger.kernel.org,
Pasi Vaananen
Hi
Am 23.04.26 um 23:02 schrieb David Airlie:
[...]
>> Faulty hardware (perhaps just a cheap pull down resistor on the VGA
>> connection as Dave Airlie suggests) means that any such affected
>> platform has a polling routine that causes significant issues on any
>> timing sensitive applications.
> We could write a patch to just say if we see 10 bogus EDID polls we
> just give up and loudly say in the logs.
I don't think we should do that. The fallout might just backfire as well.
Best regards
Thomas
>
> This might break some crash-cart plugins in some data centers though,
> I don't think we have contracts in Matrox or the server vendors who
> make the hw to say how they recommend finding this info.
>
> It might be in ACPI or dmidecodes.
>
> Dave.
>
>
>> Right now, I am stuck in a situation which means that I have to fight to
>> reach every customer who uses one of these platforms and confirm they
>> either disable polling or ban the module so it won't even load.
>>
>> This is frustrating, as it is unlikely I'll reach everyone.
>>
>> I doubt that I'm the only one with users who are affected by mysterious
>> performance or timing problems related to this. While its true that not
>> *every* instance of the device is problematic (at least not now that we
>> fixed the other issue with the udelay...), but many systems using the
>> controller *are* negatively impacted even with the timing fix, as I have
>> now seen...
>>
>> Unfortunately, I also have no better idea than a DMI quirk table to
>> record known platforms that include the controller but don't have a
>> physical VGA connection exposed.
>>
>> Thus, I'm wondering what else we can do? Using WQ_UNBOUND might help
>> somewhat? I have no idea if its safe to sleep instead of spin while
>> reading the i2c connections... As far as I can tell the non-atomic
>> version has nothing that *strictly* prevents sleep.. but maybe i2c
>> access has tighter timing requirements than what usleep_range can
>> fulfill? I am not sure...
>>
>> I'd just really like to not have to worry about going to every single
>> user and asking them to unload and ban a driver for these big server
>> platforms...
>>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)
* Re: further issues with MGA G200 graphics chipset
2026-04-23 19:42 ` Jacob Keller
2026-04-23 21:02 ` David Airlie
@ 2026-04-24 6:20 ` Thomas Zimmermann
2026-04-24 7:36 ` Jocelyn Falempe
1 sibling, 1 reply; 20+ messages in thread
From: Thomas Zimmermann @ 2026-04-24 6:20 UTC (permalink / raw)
To: Jacob Keller, Jocelyn Falempe, airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
Hi
Am 23.04.26 um 21:42 schrieb Jacob Keller:
[...]
> Unfortunately, I also have no better idea than a DMI quirk table to
> record known platforms that include the controller but don't have a
> physical VGA connection exposed.
I'm in favor of this. If you send a meaningful DMI identifier for your
system, I'd make you a patch for testing.
I don't know of any way for detecting the presence of a physical VGA
connector BTW.
Best regards
Thomas
>
> Thus, I'm wondering what else we can do? Using WQ_UNBOUND might help
> somewhat? I have no idea if its safe to sleep instead of spin while
> reading the i2c connections... As far as I can tell the non-atomic
> version has nothing that *strictly* prevents sleep.. but maybe i2c
> access has tighter timing requirements than what usleep_range can
> fulfill? I am not sure...
>
> I'd just really like to not have to worry about going to every single
> user and asking them to unload and ban a driver for these big server
> platforms...
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)
* Re: further issues with MGA G200 graphics chipset
2026-04-24 6:20 ` Thomas Zimmermann
@ 2026-04-24 7:36 ` Jocelyn Falempe
2026-04-24 7:47 ` Thomas Zimmermann
0 siblings, 1 reply; 20+ messages in thread
From: Jocelyn Falempe @ 2026-04-24 7:36 UTC (permalink / raw)
To: Thomas Zimmermann, Jacob Keller, airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
On 24/04/2026 08:20, Thomas Zimmermann wrote:
> Hi
>
> Am 23.04.26 um 21:42 schrieb Jacob Keller:
> [...]
>> Unfortunately, I also have no better idea than a DMI quirk table to
>> record known platforms that include the controller but don't have a
>> physical VGA connection exposed.
>
> I'm in favor of this. If you send a meaningful DMI identifier for your
> system, I'd make you a patch for testing.
I didn't find anything related to the VGA connector in dmidecode.
My suggestion would be to use the chassis-type [1], and disable polling
on Blade (0x1C and 0x1D) and Rack Mount (0x17) as they are less likely
to have a real VGA monitor connected.
My Dell T310, which is kind of a Tower, has a chassis-type of 0x11 "Main
server chassis" so it might not be very reliable.
Another option would be to disable polling if PREEMPT_RT is set, so if
the user expects low latency, they can actually get it.
As a last resort, the driver did work for 2 decades without polling
the VGA connector, so maybe we can revert to that behavior.
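To illustrate the chassis-type option, here is a small userspace sketch of the policy. The function and enum names are hypothetical, and it assumes the raw SMBIOS chassis "Type" byte is passed in, with bit 7 treated as the chassis-lock flag and masked off before comparing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SMBIOS chassis type values mentioned in this message. */
enum {
	CHASSIS_MAIN_SERVER     = 0x11,
	CHASSIS_RACK_MOUNT      = 0x17,
	CHASSIS_BLADE           = 0x1c,
	CHASSIS_BLADE_ENCLOSURE = 0x1d,
};

/* True if the chassis type suggests that VGA polling could be skipped. */
static bool chassis_suggests_no_vga_polling(uint8_t raw_type)
{
	uint8_t type = raw_type & 0x7f; /* drop the chassis-lock bit */

	switch (type) {
	case CHASSIS_RACK_MOUNT:
	case CHASSIS_BLADE:
	case CHASSIS_BLADE_ENCLOSURE:
		return true;
	default:
		return false;
	}
}

/* Spot checks against the values quoted in this message. */
static void chassis_policy_selfcheck(void)
{
	assert(chassis_suggests_no_vga_polling(CHASSIS_RACK_MOUNT));
	assert(!chassis_suggests_no_vga_polling(CHASSIS_MAIN_SERVER));
}
```

With this policy a machine reporting 0x11 keeps polling, so the heuristic only helps if affected systems actually report one of the three listed types.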
--
Jocelyn
[1]
https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.9.0.pdf
>
> I don't know of any way for detecting the presence of a physical VGA
> connector BTW.
>
> Best regards
> Thomas
>
>>
>> Thus, I'm wondering what else we can do? Using WQ_UNBOUND might help
>> somewhat? I have no idea if its safe to sleep instead of spin while
>> reading the i2c connections... As far as I can tell the non-atomic
>> version has nothing that *strictly* prevents sleep.. but maybe i2c
>> access has tighter timing requirements than what usleep_range can
>> fulfill? I am not sure...
>>
>> I'd just really like to not have to worry about going to every single
>> user and asking them to unload and ban a driver for these big server
>> platforms...
>
* Re: further issues with MGA G200 graphics chipset
2026-04-24 7:36 ` Jocelyn Falempe
@ 2026-04-24 7:47 ` Thomas Zimmermann
2026-04-24 23:29 ` Jacob Keller
0 siblings, 1 reply; 20+ messages in thread
From: Thomas Zimmermann @ 2026-04-24 7:47 UTC (permalink / raw)
To: Jocelyn Falempe, Jacob Keller, airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
Hi Jocelyn
Am 24.04.26 um 09:36 schrieb Jocelyn Falempe:
> On 24/04/2026 08:20, Thomas Zimmermann wrote:
>> Hi
>>
>> Am 23.04.26 um 21:42 schrieb Jacob Keller:
>> [...]
>>> Unfortunately, I also have no better idea than a DMI quirk table to
>>> record known platforms that include the controller but don't have a
>>> physical VGA connection exposed.
>>
>> I'm in favor of this. If you send a meaningful DMI identifier for
>> your system, I'd make you a patch for testing.
>
> I didn't find something related to VGA connector in dmidecode.
> My suggestion would be to use the chassis-type [1], and disable
> polling on Blade (0x1C and 0x1D) and Rack Mount (0x17) as they are
> less likely to have a real VGA monitor connected.
> My Dell T310, which is kind of a Tower, has a chassis-type of 0x11
> "Main server chassis" so it might not be very reliable.
This is the first time I've heard about this problem. I don't think it's
very common, and it appears to be caused only by cheap hardware
manufacturing.
So I suggest picking Manufacturer, Product, Version as the key. I'd be
surprised if we find more than a handful of systems with the issue. If
we see a trend or common pattern, we can generalize later on.
>
> Another option would be to disable polling if PREEMPT_RT is set, so if
> the user expects low latency, he can actually have it.
>
> Last resort is that the driver did work for 2 decades without polling
> the VGA connector, maybe we can revert to that behavior.
It didn't actually work. Removing it will break a lot of systems.
Best regards
Thomas
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)
* Re: further issues with MGA G200 graphics chipset
2026-04-24 7:47 ` Thomas Zimmermann
@ 2026-04-24 23:29 ` Jacob Keller
2026-04-27 12:14 ` Thomas Zimmermann
0 siblings, 1 reply; 20+ messages in thread
From: Jacob Keller @ 2026-04-24 23:29 UTC (permalink / raw)
To: Thomas Zimmermann, Jocelyn Falempe, airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
On 4/24/2026 12:47 AM, Thomas Zimmermann wrote:
> Hi Jocelyn
>
> Am 24.04.26 um 09:36 schrieb Jocelyn Falempe:
>> On 24/04/2026 08:20, Thomas Zimmermann wrote:
>>> Hi
>>>
>>> Am 23.04.26 um 21:42 schrieb Jacob Keller:
>>> [...]
>>>> Unfortunately, I also have no better idea than a DMI quirk table to
>>>> record known platforms that include the controller but don't have a
>>>> physical VGA connection exposed.
>>>
>>> I'm in favor of this. If you send a meaningful DMI identifier for
>>> your system, I'd make you a patch for testing.
>>
>> I didn't find something related to VGA connector in dmidecode.
>> My suggestion would be to use the chassis-type [1], and disable
>> polling on Blade (0x1C and 0x1D) and Rack Mount (0x17) as they are
>> less likely to have a real VGA monitor connected.
>> My Dell T310, which is kind of a Tower, has a chassis-type of 0x11
>> "Main server chassis" so it might not be very reliable.
>
> This is the first time, I hear about this problem. I don't think it's
> very common. And it appears only to be related due to cheap hardware
> manufacturing.
>
> So I suggest to pick Manufacturer, Product, Version as key. I'd be
> surprised if we find more than a hand full of systems with the issue. If
> we see a trend or common pattern, we can generalize later on.
>
I think this is the best solution. Keep it focused for now. I believe
Intel has two major platforms that we care about with respect to this
issue. I'll see if I can dig up the data. The systems install the MGA
G200 for BMC use but don't seem to expose the VGA connection.
For the specific system I have that was faulty, we have the following:
$ for t in system-manufacturer system-product-name system-version ; \
do dmidecode -s ${t}; \
done
Dell Inc.
PowerEdge XR8720t
Not Specified
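As a sketch of how those three strings could key a quirk list, here is a userspace stand-in. In the kernel this kind of table is normally written with struct dmi_system_id and checked with dmi_check_system(); the struct and function names below are hypothetical, and treating a NULL version as a wildcard is an assumption made because "Not Specified" does not identify anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical quirk key: manufacturer, product, optional version. */
struct dmi_key {
	const char *manufacturer;
	const char *product;
	const char *version; /* NULL means "match any version" */
};

/* Systems on which connector polling would be skipped (illustrative). */
static const struct dmi_key no_vga_poll_quirks[] = {
	{ "Dell Inc.", "PowerEdge XR8720t", NULL },
};

/* Exact string comparison on every field the entry specifies. */
static bool dmi_key_matches(const struct dmi_key *q, const char *man,
			    const char *prod, const char *ver)
{
	return strcmp(q->manufacturer, man) == 0 &&
	       strcmp(q->product, prod) == 0 &&
	       (q->version == NULL || strcmp(q->version, ver) == 0);
}

/* True if the running system appears in the quirk list. */
static bool system_is_quirked(const char *man, const char *prod,
			      const char *ver)
{
	size_t i;
	size_t n = sizeof(no_vga_poll_quirks) / sizeof(no_vga_poll_quirks[0]);

	assert(man != NULL && prod != NULL && ver != NULL);
	for (i = 0; i < n; i++)
		if (dmi_key_matches(&no_vga_poll_quirks[i], man, prod, ver))
			return true;
	return false;
}
```

Calling system_is_quirked() with the three dmidecode strings above returns true; any other vendor or product string returns false, which keeps the quirk as narrow as possible.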
I believe there was also some concern about HP systems which similarly
use this chipset, but I don't have the DMI data for that one offhand.
I've asked some colleagues to confirm the situation and obtain that
data. I'll get back early next week if we think there are any other
systems possibly affected.
In the meantime, I'm happy to have our team test any patch to confirm
that it behaves as expected and resolves the service interruptions.
Appreciate all the feedback on this thread.
Thanks,
Jake
* Re: further issues with MGA G200 graphics chipset
2026-04-24 23:29 ` Jacob Keller
@ 2026-04-27 12:14 ` Thomas Zimmermann
2026-04-27 22:53 ` Jacob Keller
2026-04-28 19:12 ` stuart hayes
0 siblings, 2 replies; 20+ messages in thread
From: Thomas Zimmermann @ 2026-04-27 12:14 UTC (permalink / raw)
To: Jacob Keller, Jocelyn Falempe, airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
[-- Attachment #1: Type: text/plain, Size: 2275 bytes --]
Hi
Am 25.04.26 um 01:29 schrieb Jacob Keller:
[...]
>>
>> So I suggest to pick Manufacturer, Product, Version as key. I'd be
>> surprised if we find more than a hand full of systems with the issue. If
>> we see a trend or common pattern, we can generalize later on.
>>
> I think this is the best solution. Keep it focused for now. I believe
> Intel has two major platforms that we care about with respect to this
> issue. I'll see if I can dig up the data. The systems install the MGA
> G200 for BMC use but don't seem to expose the VGA connection.
>
> For the specific system I have that was faulty, we have the following:
>
> $ for t in system-manufacturer system-product-name system-version ; \
> do dmidecode -s ${t}; \
> done
> Dell Inc.
> PowerEdge XR8720t
> Not Specified
>
>
>
> I believe there was also some concern about HP systems which similarly
> use this chipset, but I don't have the DMI data for that one off hand.
> I've asked some colleagues to confirm the situation and obtain that
> data. I'll get back early next week if we think there are any other
> systems possibly affected.
>
> In the mean time, I'm happy to have our team test any patch to confirm
> that it behaves as expected and resolves the service interruptions.
For now, I've modified the two places that have BMC support in the
driver. Could you please also tell me your system's exact Matrox chipset
or its PCI id?
The patch is attached for your testing. It would work against drm-tip or
v7.1-rc1.
I've also found the page at [1], which claims that there's a Mini-DP
port at the front. If so, I'd assume that there's also an extra encoder
chip to replace the VGA. If we ever get specs for that, we could
implement real support in the driver.
In the meantime, the current fix should work. In the worst case, that
Mini-DP port would give a lower default resolution.
[1]
https://www.dell.com/en-us/shop/ipovw/poweredge-xr8720t?hve=shop+now#techspecs_section
Best regards
Thomas
>
> Appreciate all the feedback on this thread.
>
> Thanks,
> Jake
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)
[-- Attachment #2: 0001-drm-mgag200-Add-BMC-only-connector.patch --]
[-- Type: text/x-patch, Size: 7785 bytes --]
From 95d72c2e4abef9fc45433076d3b130336c734e75 Mon Sep 17 00:00:00 2001
From: Thomas Zimmermann <tzimmermann@suse.de>
Date: Fri, 24 Apr 2026 09:05:14 +0200
Subject: [PATCH] drm/mgag200: Add BMC-only connector
---
drivers/gpu/drm/mgag200/mgag200_bmc.c | 109 ++++++++++++++++++++++
drivers/gpu/drm/mgag200/mgag200_drv.h | 7 +-
drivers/gpu/drm/mgag200/mgag200_g200ew3.c | 17 +++-
drivers/gpu/drm/mgag200/mgag200_g200wb.c | 17 +++-
4 files changed, 147 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/mgag200/mgag200_bmc.c b/drivers/gpu/drm/mgag200/mgag200_bmc.c
index bbdeb791c5b3..8d974e2c1810 100644
--- a/drivers/gpu/drm/mgag200/mgag200_bmc.c
+++ b/drivers/gpu/drm/mgag200/mgag200_bmc.c
@@ -6,6 +6,8 @@
#include <drm/drm_atomic_helper.h>
#include <drm/drm_edid.h>
#include <drm/drm_managed.h>
+#include <drm/drm_modeset_helper_vtables.h>
+#include <drm/drm_print.h>
#include <drm/drm_probe_helper.h>
#include "mgag200_drv.h"
@@ -90,3 +92,110 @@ void mgag200_bmc_start_scanout(struct mga_device *mdev)
tmp &= ~0x10;
WREG_DAC(MGA1064_GEN_IO_DATA, tmp);
}
+
+static void mgag200_bmc_encoder_atomic_disable(struct drm_encoder *encoder,
+					       struct drm_atomic_state *state)
+{
+	struct mga_device *mdev = to_mga_device(encoder->dev);
+
+	if (mdev->info->sync_bmc)
+		mgag200_bmc_stop_scanout(mdev);
+}
+
+static void mgag200_bmc_encoder_atomic_enable(struct drm_encoder *encoder,
+					      struct drm_atomic_state *state)
+{
+	struct mga_device *mdev = to_mga_device(encoder->dev);
+
+	if (mdev->info->sync_bmc)
+		mgag200_bmc_start_scanout(mdev);
+}
+
+static int mgag200_bmc_encoder_atomic_check(struct drm_encoder *encoder,
+					    struct drm_crtc_state *new_crtc_state,
+					    struct drm_connector_state *new_connector_state)
+{
+	struct mga_device *mdev = to_mga_device(encoder->dev);
+	struct mgag200_crtc_state *new_mgag200_crtc_state = to_mgag200_crtc_state(new_crtc_state);
+
+	new_mgag200_crtc_state->set_vidrst = mdev->info->sync_bmc;
+
+	return 0;
+}
+
+static const struct drm_encoder_helper_funcs mgag200_bmc_encoder_helper_funcs = {
+	.atomic_disable = mgag200_bmc_encoder_atomic_disable,
+	.atomic_enable = mgag200_bmc_encoder_atomic_enable,
+	.atomic_check = mgag200_bmc_encoder_atomic_check,
+};
+
+static const struct drm_encoder_funcs mgag200_bmc_encoder_funcs = {
+	.destroy = drm_encoder_cleanup
+};
+
+static int mgag200_bmc_connector_helper_get_modes(struct drm_connector *connector)
+{
+	struct mga_device *mdev = to_mga_device(connector->dev);
+	const struct mgag200_device_info *minfo = mdev->info;
+	int count;
+
+	/*
+	 * There's no EDID data without a connected monitor. Set BMC-
+	 * compatible modes in this case. The XGA default resolution
+	 * should work well for all BMCs.
+	 */
+	count = drm_add_modes_noedid(connector, minfo->max_hdisplay, minfo->max_vdisplay);
+	if (count)
+		drm_set_preferred_mode(connector, 1024, 768);
+
+	return count;
+}
+
+static const struct drm_connector_helper_funcs mgag200_bmc_connector_helper_funcs = {
+	.get_modes = mgag200_bmc_connector_helper_get_modes,
+};
+
+static const struct drm_connector_funcs mgag200_bmc_connector_funcs = {
+	.reset = drm_atomic_helper_connector_reset,
+	.fill_modes = drm_helper_probe_single_connector_modes,
+	.destroy = drm_connector_cleanup,
+	.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
+	.atomic_destroy_state = drm_atomic_helper_connector_destroy_state
+};
+
+int mgag200_bmc_output_init(struct mga_device *mdev)
+{
+	struct drm_device *dev = &mdev->base;
+	struct drm_crtc *crtc = &mdev->crtc;
+	struct drm_encoder *encoder;
+	struct drm_connector *connector;
+	int ret;
+
+	encoder = &mdev->output.bmc.encoder;
+	ret = drm_encoder_init(dev, encoder, &mgag200_bmc_encoder_funcs,
+			       DRM_MODE_ENCODER_VIRTUAL, NULL);
+	if (ret) {
+		drm_err(dev, "drm_encoder_init() failed: %d\n", ret);
+		return ret;
+	}
+	drm_encoder_helper_add(encoder, &mgag200_bmc_encoder_helper_funcs);
+
+	encoder->possible_crtcs = drm_crtc_mask(crtc);
+
+	connector = &mdev->output.bmc.connector;
+	ret = drm_connector_init(dev, connector, &mgag200_bmc_connector_funcs,
+				 DRM_MODE_CONNECTOR_VGA);
+	if (ret) {
+		drm_err(dev, "drm_connector_init() failed: %d\n", ret);
+		return ret;
+	}
+	drm_connector_helper_add(connector, &mgag200_bmc_connector_helper_funcs);
+
+	ret = drm_connector_attach_encoder(connector, encoder);
+	if (ret) {
+		drm_err(dev, "drm_connector_attach_encoder() failed: %d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h b/drivers/gpu/drm/mgag200/mgag200_drv.h
index a875c4bf8cbe..f126f6d61ed0 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.h
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.h
@@ -279,7 +279,11 @@ struct mga_device {
struct drm_plane primary_plane;
struct drm_crtc crtc;
- struct {
+ union {
+ struct {
+ struct drm_encoder encoder;
+ struct drm_connector connector;
+ } bmc;
struct {
struct drm_encoder encoder;
struct drm_connector connector;
@@ -435,5 +439,6 @@ int mgag200_vga_output_init(struct mga_device *mdev);
/* mgag200_bmc.c */
void mgag200_bmc_stop_scanout(struct mga_device *mdev);
void mgag200_bmc_start_scanout(struct mga_device *mdev);
+int mgag200_bmc_output_init(struct mga_device *mdev);
#endif /* __MGAG200_DRV_H__ */
diff --git a/drivers/gpu/drm/mgag200/mgag200_g200ew3.c b/drivers/gpu/drm/mgag200/mgag200_g200ew3.c
index e387a455eae5..12047066b615 100644
--- a/drivers/gpu/drm/mgag200/mgag200_g200ew3.c
+++ b/drivers/gpu/drm/mgag200/mgag200_g200ew3.c
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/dmi.h>
#include <linux/pci.h>
#include <drm/drm_atomic.h>
@@ -11,6 +12,17 @@
#include "mgag200_drv.h"
+static const struct dmi_system_id mgag200_g200ew3_novga[] = {
+	{
+		.ident = "PowerEdge XR8720t",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge XR8720t"),
+		},
+	},
+	{},
+};
+
static void mgag200_g200ew3_init_registers(struct mga_device *mdev)
{
mgag200_g200wb_init_registers(mdev); // same as G200WB
@@ -128,7 +140,10 @@ static int mgag200_g200ew3_pipeline_init(struct mga_device *mdev)
drm_mode_crtc_set_gamma_size(crtc, MGAG200_LUT_SIZE);
drm_crtc_enable_color_mgmt(crtc, 0, false, MGAG200_LUT_SIZE);
-	ret = mgag200_vga_bmc_output_init(mdev);
+	if (dmi_check_system(mgag200_g200ew3_novga))
+		ret = mgag200_bmc_output_init(mdev);
+	else
+		ret = mgag200_vga_bmc_output_init(mdev);
if (ret)
return ret;
diff --git a/drivers/gpu/drm/mgag200/mgag200_g200wb.c b/drivers/gpu/drm/mgag200/mgag200_g200wb.c
index d847fa8ded8c..e6ce1130d5eb 100644
--- a/drivers/gpu/drm/mgag200/mgag200_g200wb.c
+++ b/drivers/gpu/drm/mgag200/mgag200_g200wb.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0-only
#include <linux/delay.h>
+#include <linux/dmi.h>
#include <linux/pci.h>
#include <drm/drm_atomic.h>
@@ -12,6 +13,17 @@
#include "mgag200_drv.h"
+static const struct dmi_system_id mgag200_g200wb_novga[] = {
+	{
+		.ident = "PowerEdge XR8720t",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge XR8720t"),
+		},
+	},
+	{},
+};
+
void mgag200_g200wb_init_registers(struct mga_device *mdev)
{
static const u8 dacvalue[] = {
@@ -262,7 +274,10 @@ static int mgag200_g200wb_pipeline_init(struct mga_device *mdev)
drm_mode_crtc_set_gamma_size(crtc, MGAG200_LUT_SIZE);
drm_crtc_enable_color_mgmt(crtc, 0, false, MGAG200_LUT_SIZE);
-	ret = mgag200_vga_bmc_output_init(mdev);
+	if (dmi_check_system(mgag200_g200wb_novga))
+		ret = mgag200_bmc_output_init(mdev);
+	else
+		ret = mgag200_vga_bmc_output_init(mdev);
if (ret)
return ret;
--
2.54.0
* Re: further issues with MGA G200 graphics chipset
2026-04-27 12:14 ` Thomas Zimmermann
@ 2026-04-27 22:53 ` Jacob Keller
2026-04-27 23:32 ` Jacob Keller
2026-04-28 19:12 ` stuart hayes
1 sibling, 1 reply; 20+ messages in thread
From: Jacob Keller @ 2026-04-27 22:53 UTC (permalink / raw)
To: Thomas Zimmermann, Jocelyn Falempe, airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
On 4/27/2026 5:14 AM, Thomas Zimmermann wrote:
> Hi
>
> Am 25.04.26 um 01:29 schrieb Jacob Keller:
> [...]
>>>
>>> So I suggest to pick Manufacturer, Product, Version as key. I'd be
>>> surprised if we find more than a hand full of systems with the issue. If
>>> we see a trend or common pattern, we can generalize later on.
>>>
>> I think this is the best solution. Keep it focused for now. I believe
>> Intel has two major platforms that we care about with respect to this
>> issue. I'll see if I can dig up the data. The systems install the MGA
>> G200 for BMC use but don't seem to expose the VGA connection.
>>
>> For the specific system I have that was faulty, we have the following:
>>
>> $ for t in system-manufacturer system-product-name system-version ; \
>> do dmidecode -s ${t}; \
>> done
>> Dell Inc.
>> PowerEdge XR8720t
>> Not Specified
>>
>>
>>
>> I believe there was also some concern about HP systems which similarly
>> use this chipset, but I don't have the DMI data for that one off hand.
>> I've asked some colleagues to confirm the situation and obtain that
>> data. I'll get back early next week if we think there are any other
>> systems possibly affected.
>>
>> In the mean time, I'm happy to have our team test any patch to confirm
>> that it behaves as expected and resolves the service interruptions.
>
> For now, I've modified the two places that have BMC support in the
> driver. Could you please also tell me your system's exact Matrox chipset
> or its PCI id?
>
Here's the lspci output:
>
> b5:00.0 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. Integrated Matrox G200eW3 Graphics Controller [102b:0536] (rev 08) (prog-if 00 [VGA controller])
> DeviceName: Embedded Video
> Subsystem: Dell Integrated Matrox G200eW3 Graphics Controller [1028:0d38]
> Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> Interrupt: pin A routed to IRQ 16
> NUMA node: 0
> IOMMU group: 16
> Region 0: Memory at e5000000 (32-bit, prefetchable) [size=16M]
> Region 1: Memory at e6810000 (32-bit, non-prefetchable) [size=16K]
> Region 2: Memory at e6000000 (32-bit, non-prefetchable) [size=8M]
> Capabilities: [dc] Power Management version 3
> Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
> Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> Kernel driver in use: mgag200
> Kernel modules: mgag200
The device ID looks to be 0x0536, and the subdevice ID is Dell 0x0D38. I
don't see anything specifically related to mini display port. It is
plausible there is an encoder between that output and the G200eW3.
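For anyone who wants to compare their own system against these IDs programmatically, here is a minimal userspace sketch. It only parses a "vendor:device" pair in the bracketed form shown in the lspci output above; the function name parse_pci_id is made up for illustration and is not part of any kernel or pciutils API.

```c
#include <stdio.h>

/*
 * Hypothetical helper: parse a "vvvv:dddd" hexadecimal PCI ID pair such
 * as "102b:0536". Returns 0 on success, -1 if the pair is incomplete or
 * followed by trailing characters.
 */
int parse_pci_id(const char *s, unsigned int *vendor, unsigned int *device)
{
	char trailing;

	/* Exactly two conversions must succeed; a third means junk followed. */
	if (sscanf(s, "%4x:%4x%c", vendor, device, &trailing) != 2)
		return -1;

	return 0;
}
```

Parsing "102b:0536" and "1028:0d38" this way gives the device and subsystem IDs quoted above.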
> The patch is attached for your testing. It would work against drm-tip or
> v7.1-rc1.
>
I'll give it a shot.
> I've also found the page at [1], which claims that there's a Mini-DP
> port at the front. If so, I'd assume that there's also an extra encoder
> chip to replace the VGA. If we ever get specs for that, we could
> implement real support in the driver.
>
There does appear to be a mini display port cable. I am not certain if
that is driven from the Matrox graphics or not, but looking at lspci
there doesn't appear to be any other graphics chipset on the system, so
you might be right, but I am not certain.
> In the meantime, the current fix should work. In the worst case, that
> Mini-DP port would give a lower default resolution.
>
I can check the behavior of the mini-DP output too and see if this
changes anything for it.
> [1] https://www.dell.com/en-us/shop/ipovw/poweredge-xr8720t?
> hve=shop+now#techspecs_section
>
> Best regards
> Thomas
>
>
>
>
>>
>> Appreciate all the feedback on this thread.
>>
>> Thanks,
>> Jake
>
* Re: further issues with MGA G200 graphics chipset
2026-04-27 22:53 ` Jacob Keller
@ 2026-04-27 23:32 ` Jacob Keller
0 siblings, 0 replies; 20+ messages in thread
From: Jacob Keller @ 2026-04-27 23:32 UTC (permalink / raw)
To: Thomas Zimmermann, Jocelyn Falempe, airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
On 4/27/2026 3:53 PM, Jacob Keller wrote:
> On 4/27/2026 5:14 AM, Thomas Zimmermann wrote:
>> Hi
>>> In the mean time, I'm happy to have our team test any patch to confirm
>>> that it behaves as expected and resolves the service interruptions.
>>
>> For now, I've modified the two places that have BMC support in the
>> driver. Could you please also tell me your system's exact Matrox chipset
>> or its PCI id?
>>
>
> Here's the lspci output:
>
>>
>> b5:00.0 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. Integrated Matrox G200eW3 Graphics Controller [102b:0536] (rev 08) (prog-if 00 [VGA controller])
>> DeviceName: Embedded Video
>> Subsystem: Dell Integrated Matrox G200eW3 Graphics Controller [1028:0d38]
>> Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>> Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>> Interrupt: pin A routed to IRQ 16
>> NUMA node: 0
>> IOMMU group: 16
>> Region 0: Memory at e5000000 (32-bit, prefetchable) [size=16M]
>> Region 1: Memory at e6810000 (32-bit, non-prefetchable) [size=16K]
>> Region 2: Memory at e6000000 (32-bit, non-prefetchable) [size=8M]
>> Capabilities: [dc] Power Management version 3
>> Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
>> Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>> Kernel driver in use: mgag200
>> Kernel modules: mgag200
>
> The device ID looks to be 0x0536, and the subdevice ID is Dell 0x0D38. I
> don't see anything specifically related to mini display port. It is
> plausible there is an encoder between that output and the G200eW3.
>
>> The patch is attached for your testing. It would work against drm-tip or
>> v7.1-rc1.
>>
>
> I'll give it a shot.
>
The systems that were having trouble are currently being used by other
folks on my team to check other issues. It might take a day or two
before I can get access again to test this. I'll update you once I've
gotten access and had a chance to test the changes.
Thanks,
Jake
* Re: further issues with MGA G200 graphics chipset
2026-04-27 12:14 ` Thomas Zimmermann
2026-04-27 22:53 ` Jacob Keller
@ 2026-04-28 19:12 ` stuart hayes
2026-04-28 21:07 ` Jacob Keller
2026-04-29 6:40 ` Thomas Zimmermann
1 sibling, 2 replies; 20+ messages in thread
From: stuart hayes @ 2026-04-28 19:12 UTC (permalink / raw)
To: Thomas Zimmermann, Jacob Keller, Jocelyn Falempe,
airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
On 4/27/2026 7:14 AM, Thomas Zimmermann wrote:
> Hi
>
> Am 25.04.26 um 01:29 schrieb Jacob Keller:
> [...]
>>>
>>> So I suggest to pick Manufacturer, Product, Version as key. I'd be
>>> surprised if we find more than a hand full of systems with the issue. If
>>> we see a trend or common pattern, we can generalize later on.
>>>
>> I think this is the best solution. Keep it focused for now. I believe
>> Intel has two major platforms that we care about with respect to this
>> issue. I'll see if I can dig up the data. The systems install the MGA
>> G200 for BMC use but don't seem to expose the VGA connection.
>>
>> For the specific system I have that was faulty, we have the following:
>>
>> $ for t in system-manufacturer system-product-name system-version ; \
>> do dmidecode -s ${t}; \
>> done
>> Dell Inc.
>> PowerEdge XR8720t
>> Not Specified
>>
>>
>>
>> I believe there was also some concern about HP systems which similarly
>> use this chipset, but I don't have the DMI data for that one off hand.
>> I've asked some colleagues to confirm the situation and obtain that
>> data. I'll get back early next week if we think there are any other
>> systems possibly affected.
>>
>> In the mean time, I'm happy to have our team test any patch to confirm
>> that it behaves as expected and resolves the service interruptions.
>
> For now, I've modified the two places that have BMC support in the
> driver. Could you please also tell me your system's exact Matrox chipset
> or its PCI id?
>
> The patch is attached for your testing. It would work against drm-tip or
> v7.1-rc1.
>
> I've also found the page at [1], which claims that there's a Mini-DP
> port at the front. If so, I'd assume that there's also an extra encoder
> chip to replace the VGA. If we ever get specs for that, we could
> implement real support in the driver.
>
> In the meantime, the current fix should work. In the worst case, that
> Mini-DP port would give a lower default resolution.
>
> [1] https://www.dell.com/en-us/shop/ipovw/poweredge-xr8720t?
> hve=shop+now#techspecs_section
>
> Best regards
> Thomas
>
>
So this patch disables DDC polling if dmi_check_system() matches. If
this were to happen on a system that _does_ have a physical VGA connector,
would that port still be active, just with a resolution that may not be
compatible with the monitor that's plugged in?
I haven't seen anyone confirm that DDC polling is free of excessive
latency for real-time kernels on other systems that do have a VGA
connector. Did I miss that, or is there a chance that a lot of other
systems that use this driver might also have issues with a real-time kernel?
>
>
>>
>> Appreciate all the feedback on this thread.
>>
>> Thanks,
>> Jake
>
* Re: further issues with MGA G200 graphics chipset
2026-04-28 19:12 ` stuart hayes
@ 2026-04-28 21:07 ` Jacob Keller
2026-04-29 6:40 ` Thomas Zimmermann
1 sibling, 0 replies; 20+ messages in thread
From: Jacob Keller @ 2026-04-28 21:07 UTC (permalink / raw)
To: stuart hayes, Thomas Zimmermann, Jocelyn Falempe,
airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
On 4/28/2026 12:12 PM, stuart hayes wrote:
> On 4/27/2026 7:14 AM, Thomas Zimmermann wrote:
>> Hi
>>
>> Am 25.04.26 um 01:29 schrieb Jacob Keller:
>> [...]
>>>>
>>>> So I suggest to pick Manufacturer, Product, Version as key. I'd be
>>>> surprised if we find more than a hand full of systems with the
>>>> issue. If
>>>> we see a trend or common pattern, we can generalize later on.
>>>>
>>> I think this is the best solution. Keep it focused for now. I believe
>>> Intel has two major platforms that we care about with respect to this
>>> issue. I'll see if I can dig up the data. The systems install the MGA
>>> G200 for BMC use but don't seem to expose the VGA connection.
>>>
>>> For the specific system I have that was faulty, we have the following:
>>>
>>> $ for t in system-manufacturer system-product-name system-version ; \
>>> do dmidecode -s ${t}; \
>>> done
>>> Dell Inc.
>>> PowerEdge XR8720t
>>> Not Specified
>>>
>>>
>>>
>>> I believe there was also some concern about HP systems which similarly
>>> use this chipset, but I don't have the DMI data for that one off hand.
>>> I've asked some colleagues to confirm the situation and obtain that
>>> data. I'll get back early next week if we think there are any other
>>> systems possibly affected.
>>>
>>> In the mean time, I'm happy to have our team test any patch to confirm
>>> that it behaves as expected and resolves the service interruptions.
>>
>> For now, I've modified the two places that have BMC support in the
>> driver. Could you please also tell me your system's exact Matrox
>> chipset or its PCI id?
>>
>> The patch is attached for your testing. It would work against drm-tip
>> or v7.1-rc1.
>>
>> I've also found the page at [1], which claims that there's a Mini-DP
>> port at the front. If so, I'd assume that there's also an extra
>> encoder chip to replace the VGA. If we ever get specs for that, we
>> could implement real support in the driver.
>>
>> In the meantime, the current fix should work. In the worst case, that
>> Mini-DP port would give a lower default resolution.
>>
>> [1] https://www.dell.com/en-us/shop/ipovw/poweredge-xr8720t?
>> hve=shop+now#techspecs_section
>>
>> Best regards
>> Thomas
>>
>>
>
> So this patch disables DDC polling if the dmi_check_system() matches. If
> this was to happen on systems that _do_ have a physical VGA connector,
> will that port still be active, just with a resolution that may not be
> compatible with the monitor that's plugged in?
>
That is my understanding, yes. At least as far as I can tell none of the
Dell PowerEdge systems with this chipset have a true VGA port, but it is
still unconfirmed if the mini DisplayPort is connected to the MGA G200
through some sort of encoder, and how it interacts with the polling
disabled.
What I can confirm so far is that the mini DisplayPort output does seem
to work despite the mgag200 driver continuously complaining about
bad/faulty EDID checksums. I haven't had time on the system again to
check the patch or to confirm what happens with mgag200 polling disabled.
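For reference, the checksum rule behind those complaints is simple: the 128 bytes of an EDID block must sum to zero modulo 256. The following is a minimal userspace sketch of that rule, not the DRM core's actual validation code; the function name is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define EDID_BLOCK_SIZE 128

/*
 * Returns 1 if the 128 bytes of the block sum to 0 modulo 256, else 0.
 * `block` must point to at least EDID_BLOCK_SIZE bytes.
 */
int edid_block_checksum_ok(const uint8_t *block)
{
	uint8_t sum = 0; /* wraps modulo 256 on every addition */
	size_t i;

	for (i = 0; i < EDID_BLOCK_SIZE; i++)
		sum += block[i];

	return sum == 0;
}
```

A read that returns partial or corrupted data over DDC will usually fail this test, which would be consistent with the repeated checksum messages, although I have not confirmed what the bytes on the wire actually look like.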
> I don't see anyone say that the DDC polling doesn't cause too much
> latency for real time kernels on other systems that do have a VGA
> connector... did I miss that, or is there a chance that a lot of other
> systems that use this driver might also have issues with a real time
> kernel?
>
Two problems have been identified so far:
1) The DDC polling was causing issues due to spinning where it could
sleep. I fixed that a while ago with 0e0c8f4d16de ("drm/mgag200: fix
mgag200_bmc_stop_scanout()"). This was causing 300 millisecond delays
that impacted PTP functionality. It affected both RT and non-RT
systems (both setups are timing sensitive, but only one was actually
using PREEMPT_RT). This issue, I believe, widely affects all systems
which use the MGA G200. It has been fixed.
2) The issue I reported here. This appears to possibly be due to a
fault in the hardware, and does not happen on *every* Dell PowerEdge we
have access to. The driver fails to read data over DDC when reading the
EDID for the connector, which results in it continuously retrying.
Because the i2c bit-banging algorithm (i2c-algo-bit) uses udelay(), this
results in enough spinning that it impacts my PTP setup.
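To give a feel for the scale of the spinning in (2), here is a back-of-the-envelope sketch. It assumes each byte costs 9 clock cycles (8 data bits plus ACK) and each cycle consists of two udelay() half-periods; it ignores START/STOP conditions, clock stretching, timeouts and retries. The function name and the 10 microsecond half-period used in the note below are assumptions for illustration, not values confirmed from the driver.

```c
#include <stdint.h>

/*
 * Rough estimate of the time spent busy-waiting while clocking `bytes`
 * bytes over a bit-banged I2C bus.
 *
 * Each byte takes 9 SCL cycles (8 data bits + ACK) and each cycle is
 * modelled as two half-periods of `half_period_us` microseconds.
 */
uint64_t bitbang_busy_wait_us(uint32_t bytes, uint32_t half_period_us)
{
	return (uint64_t)bytes * 9u * 2u * half_period_us;
}
```

Under those assumptions, one 128-byte EDID block with a 10 microsecond half-period works out to bitbang_busy_wait_us(128, 10) = 23040 microseconds, roughly 23 ms of udelay() per attempt. Treat that as an order-of-magnitude illustration only.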
I do not know how wide the impacts from (2) are. I also do not know if
the issue causes problems on systems that have a VGA port and which also
have the VGA port plugged in. Given my experience, it seems entirely
plausible. Here's a more detailed summary of the issue I saw:
On a regular (not PREEMPT_RT) kernel, the udelay on a CPU appears to block
interrupts from firing on the CPU that is spinning. The polling work
doesn't use WQ_UNBOUND, so it is scheduled on the specific CPU that
executed the scheduling function. If that happens to be the same CPU
as the one assigned the IRQ for the ice driver, the IRQ won't fire.
Since the polling thread ends up in udelay for potentially a long
time, this causes delays of ~20-30 milliseconds, which is enough to
impact the PTP functionality.
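One way to check whether a given system is exposed to this overlap is to compare the CPU the polling kworker runs on (for example, CPU 65 in the perf report at the top of the thread) with the affinity mask of the IRQ in question. Below is a minimal userspace sketch that tests one CPU against a mask string in the format of /proc/irq/<N>/smp_affinity (comma-separated 32-bit hex groups, most significant first). The function name is made up for illustration, and reading the file itself is left out.

```c
#include <ctype.h>
#include <string.h>

/*
 * Returns 1 if `cpu` is set in `mask`, 0 if it is clear or lies beyond
 * the end of the mask, and -1 if a non-hex character is encountered
 * while scanning from the least significant digit.
 */
int cpu_in_affinity_mask(const char *mask, unsigned int cpu)
{
	size_t len = strlen(mask);
	size_t i;
	unsigned int digit_index = 0;   /* hex digits seen, counted from the right */
	unsigned int target = cpu / 4;  /* each hex digit covers 4 CPUs */

	/* Ignore a trailing newline or other whitespace. */
	while (len > 0 && isspace((unsigned char)mask[len - 1]))
		len--;

	for (i = len; i > 0; i--) {
		unsigned char c = (unsigned char)mask[i - 1];
		unsigned int value;

		if (c == ',')
			continue;
		if (!isxdigit(c))
			return -1;
		if (digit_index == target) {
			value = isdigit(c) ? (unsigned int)(c - '0')
					   : (unsigned int)(tolower(c) - 'a') + 10u;
			return (int)((value >> (cpu % 4)) & 1u);
		}
		digit_index++;
	}

	return 0;
}
```

If the polling CPU is in the IRQ's mask, moving the IRQ elsewhere may hide the symptom, but it does not remove the underlying spinning.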
I suspect this also causes issues with PREEMPT_RT but they might be
different issues, due to the nature of PREEMPT_RT changing a lot of the
way various critical sections work.
I am fairly certain that other timing-sensitive applications will be
affected by this issue. It is plausible that such problems are below
the radar for many deployments and aren't ultimately causing a "real"
impact for everyone, but it's definitely causing problems and hiccups
for Intel as well as some of our partners and customers. The actual
failure symptom we get is somewhat inconsistent.
Even though it fails to read the EDID every second or two, timestamps are
only impacted every few minutes. But that missed timestamp is considered
catastrophic failure for us. It results in ptp4l going to fault and
losing synchronization for several seconds. It also means such setups do
not pass various industry tests.
We have been recently recommending that users remove the mgag200 driver.
However this is a poor workaround as it results in inability to access
the video over BMC, which some customers rely on. I also plan to confirm
whether the mini DisplayPort is also affected by the driver removal. If
it is, then removal of the driver could also result in the output being
stopped. Some of our customers rely on the BMC to connect to the system,
and I suspect it is useful to have local video access when debugging if
something goes wrong with your remote access methods.
Thus, I am trying to find another solution that resolves the issues
we're having without needing to completely remove the driver.
Thanks,
Jake
>
>>
>>
>>>
>>> Appreciate all the feedback on this thread.
>>>
>>> Thanks,
>>> Jake
>>
>
* Re: further issues with MGA G200 graphics chipset
2026-04-28 19:12 ` stuart hayes
2026-04-28 21:07 ` Jacob Keller
@ 2026-04-29 6:40 ` Thomas Zimmermann
1 sibling, 0 replies; 20+ messages in thread
From: Thomas Zimmermann @ 2026-04-29 6:40 UTC (permalink / raw)
To: stuart hayes, Jacob Keller, Jocelyn Falempe, airlied@redhat.com
Cc: dri-devel, linux-kernel@vger.kernel.org, Pasi Vaananen
Hi
Am 28.04.26 um 21:12 schrieb stuart hayes:
> On 4/27/2026 7:14 AM, Thomas Zimmermann wrote:
>> Hi
>>
>> Am 25.04.26 um 01:29 schrieb Jacob Keller:
>> [...]
>>>>
>>>> So I suggest to pick Manufacturer, Product, Version as key. I'd be
>>>> surprised if we find more than a hand full of systems with the
>>>> issue. If
>>>> we see a trend or common pattern, we can generalize later on.
>>>>
>>> I think this is the best solution. Keep it focused for now. I believe
>>> Intel has two major platforms that we care about with respect to this
>>> issue. I'll see if I can dig up the data. The systems install the MGA
>>> G200 for BMC use but don't seem to expose the VGA connection.
>>>
>>> For the specific system I have that was faulty, we have the following:
>>>
>>> $ for t in system-manufacturer system-product-name system-version ; \
>>> do dmidecode -s ${t}; \
>>> done
>>> Dell Inc.
>>> PowerEdge XR8720t
>>> Not Specified
>>>
>>>
>>>
>>> I believe there was also some concern about HP systems which similarly
>>> use this chipset, but I don't have the DMI data for that one off hand.
>>> I've asked some colleagues to confirm the situation and obtain that
>>> data. I'll get back early next week if we think there are any other
>>> systems possibly affected.
>>>
>>> In the mean time, I'm happy to have our team test any patch to confirm
>>> that it behaves as expected and resolves the service interruptions.
>>
>> For now, I've modified the two places that have BMC support in the
>> driver. Could you please also tell me your system's exact Matrox
>> chipset or its PCI id?
>>
>> The patch is attached for your testing. It would work against drm-tip
>> or v7.1-rc1.
>>
>> I've also found the page at [1], which claims that there's a Mini-DP
>> port at the front. If so, I'd assume that there's also an extra
>> encoder chip to replace the VGA. If we ever get specs for that, we
>> could implement real support in the driver.
>>
>> In the meantime, the current fix should work. In the worst case, that
>> Mini-DP port would give a lower default resolution.
>>
>> [1] https://www.dell.com/en-us/shop/ipovw/poweredge-xr8720t?
>> hve=shop+now#techspecs_section
>>
>> Best regards
>> Thomas
>>
>>
>
> So this patch disables DDC polling if the dmi_check_system() matches.
> If this was to happen on systems that _do_ have a physical VGA
> connector, will that port still be active, just with a resolution that
> may not be compatible with the monitor that's plugged in?
The DMI test is supposed to only match on affected systems. If there's a
matched system with a VGA port, or possibly that Mini-DP port, users
will get a number of bogus display modes at the worst. But the default
mode is 1024x768, which should work on any display.
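As a side note on how narrow the match is: DMI_MATCH() entries are compared as substrings of the firmware-provided strings (DMI_EXACT_MATCH() exists for exact comparison), and all listed fields have to match. The following userspace model is only a sketch of that behavior; the struct and function names are made up for illustration and are not the kernel's dmi_check_system() implementation.

```c
#include <string.h>

/* Hypothetical container for the two DMI fields used in the patch. */
struct dmi_id {
	const char *sys_vendor;
	const char *product_name;
};

/*
 * Returns 1 if every field of `want` occurs as a substring of the
 * corresponding field of `sys`, else 0.
 */
int dmi_substr_match(const struct dmi_id *sys, const struct dmi_id *want)
{
	return strstr(sys->sys_vendor, want->sys_vendor) != NULL &&
	       strstr(sys->product_name, want->product_name) != NULL;
}
```

With the entry from the patch ("Dell Inc." and "PowerEdge XR8720t"), a system reporting a different product name does not match and keeps the regular VGA+BMC output.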
>
> I don't see anyone say that the DDC polling doesn't cause too much
> latency for real time kernels on other systems that do have a VGA
> connector... did I miss that, or is there a chance that a lot of other
> systems that use this driver might also have issues with a real time
> kernel?
That missing VGA port is a problem on any workload. So we fix it as far
as possible. Where the connector polling interferes with RT, it can also
be disabled with drm_kms_helper.poll=0.
Best regards
Thomas
>
>
>>
>>
>>>
>>> Appreciate all the feedback on this thread.
>>>
>>> Thanks,
>>> Jake
>>
>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)
Thread overview: 20+ messages
2026-04-22 23:55 further issues with MGA G200 graphics chipset Jacob Keller
2026-04-23 0:05 ` David Airlie
2026-04-23 21:39 ` Jacob Keller
2026-04-23 7:44 ` Thomas Zimmermann
2026-04-23 16:35 ` Jacob Keller
2026-04-23 19:22 ` Jocelyn Falempe
2026-04-23 19:42 ` Jacob Keller
2026-04-23 21:02 ` David Airlie
2026-04-23 21:18 ` Jacob Keller
2026-04-24 6:16 ` Thomas Zimmermann
2026-04-24 6:20 ` Thomas Zimmermann
2026-04-24 7:36 ` Jocelyn Falempe
2026-04-24 7:47 ` Thomas Zimmermann
2026-04-24 23:29 ` Jacob Keller
2026-04-27 12:14 ` Thomas Zimmermann
2026-04-27 22:53 ` Jacob Keller
2026-04-27 23:32 ` Jacob Keller
2026-04-28 19:12 ` stuart hayes
2026-04-28 21:07 ` Jacob Keller
2026-04-29 6:40 ` Thomas Zimmermann