public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Thomas Zimmermann <tzimmermann@suse.de>,
	Jacob Keller <jacob.e.keller@intel.com>,
	Jocelyn Falempe <jfalempe@redhat.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 6.1 56/69] drm/mgag200: fix mgag200_bmc_stop_scanout()
Date: Mon,  9 Feb 2026 15:24:24 +0100	[thread overview]
Message-ID: <20260209142303.940808045@linuxfoundation.org> (raw)
In-Reply-To: <20260209142301.913348974@linuxfoundation.org>

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jacob Keller <jacob.e.keller@intel.com>

[ Upstream commit 0e0c8f4d16de92520623aa1ea485cadbf64e6929 ]

The mgag200_bmc_stop_scanout() function is called by the .atomic_disable()
handler for the MGA G200 VGA BMC encoder. This function performs a few
register writes to inform the BMC of an upcoming mode change, and then
polls to wait until the BMC actually stops.

The polling is implemented using a busy loop with udelay() and an iteration
timeout of 300, resulting in the function blocking for 300 milliseconds.

The function gets called ultimately by the output_poll_execute work thread
for the DRM output change polling thread of the mgag200 driver:

kworker/0:0-mm_    3528 [000]  4555.315364:
        ffffffffaa0e25b3 delay_halt.part.0+0x33
        ffffffffc03f6188 mgag200_bmc_stop_scanout+0x178
        ffffffffc087ae7a disable_outputs+0x12a
        ffffffffc087c12a drm_atomic_helper_commit_tail+0x1a
        ffffffffc03fa7b6 mgag200_mode_config_helper_atomic_commit_tail+0x26
        ffffffffc087c9c1 commit_tail+0x91
        ffffffffc087d51b drm_atomic_helper_commit+0x11b
        ffffffffc0509694 drm_atomic_commit+0xa4
        ffffffffc05105e8 drm_client_modeset_commit_atomic+0x1e8
        ffffffffc0510ce6 drm_client_modeset_commit_locked+0x56
        ffffffffc0510e24 drm_client_modeset_commit+0x24
        ffffffffc088a743 __drm_fb_helper_restore_fbdev_mode_unlocked+0x93
        ffffffffc088a683 drm_fb_helper_hotplug_event+0xe3
        ffffffffc050f8aa drm_client_dev_hotplug+0x9a
        ffffffffc088555a output_poll_execute+0x29a
        ffffffffa9b35924 process_one_work+0x194
        ffffffffa9b364ee worker_thread+0x2fe
        ffffffffa9b3ecad kthread+0xdd
        ffffffffa9a08549 ret_from_fork+0x29

On a server running ptp4l with the mgag200 driver loaded, we found that
ptp4l would sometimes get blocked from execution because of this busy
waiting loop.

Every so often, approximately once every 20 minutes -- though with large
variance -- the output_poll_execute() thread would detect some sort of
change that required performing a hotplug event which results in attempting
to stop the BMC scanout, resulting in a 300msec delay on one CPU.

On this system, ptp4l was pinned to a single CPU. When the
output_poll_execute() thread ran on that CPU, it blocked ptp4l from
executing for its 300 millisecond duration.

This resulted in PTP service disruptions such as failure to send a SYNC
message on time, failure to handle ANNOUNCE messages on time, and clock
check warnings from the application. All of this despite the application
being configured with FIFO_RT and a higher priority than the background
workqueue tasks. (However, note that the kernel did not use
CONFIG_PREEMPT...)

It is unclear if the event is due to a faulty VGA connection, another bug,
or actual events causing a change in the connection. At least on the system
under test it is not a one-time event and consistently causes disruption to
the time sensitive applications.

The function has some helpful comments explaining what steps it is
attempting to take. In particular, step 3a and 3b are explained as such:

  3a - The third step is to verify if there is an active scan. We are
       waiting on a 0 on remhsyncsts (<XSPAREREG<0>.

  3b - This step occurs only if the remove is actually scanning. We are
       waiting for the end of the frame which is a 1 on remvsyncsts
       (<XSPAREREG<1>).

The actual steps 3a and 3b are implemented as while loops with a
non-sleeping udelay(). The first step iterates while the tmp value at
position 0 is *not* set. That is, it keeps iterating as long as the bit is
zero. If the bit is already 0 (because there is no active scan), it will
iterate the entire 300 attempts which wastes 300 milliseconds in total.
This is opposite of what the description claims.

The step 3b logic only executes if we do not iterate over the entire 300
attempts in the first loop. If it does trigger, it is trying to check and
wait for a 1 on the remvsyncsts. However, again the condition is actually
inverted and it will loop as long as the bit is 1, stopping once it hits
zero (rather than the explained attempt to wait until we see a 1).

Worse, both loops are implemented using non-sleeping waits which spin
instead of allowing the scheduler to run other processes. If the kernel is
not configured to allow arbitrary preemption, it will waste valuable CPU
time doing nothing.

There does not appear to be any documentation for the BMC register
interface, beyond what is in the comments here. It seems more probable that
the comment here is correct and the implementation accidentally got
inverted from the intended logic.

Reading through other DRM driver implementations, it does not appear that
the .atomic_enable or .atomic_disable handlers need to delay instead of
sleep. For example, the ast_astdp_encoder_helper_atomic_disable() function
calls ast_dp_set_phy_sleep() which uses msleep(). The "atomic" in the name
is referring to the atomic modesetting support, which is the support to
enable atomic configuration from userspace, and not to the "atomic context"
of the kernel. There is no reason to use udelay() here if a sleep would be
sufficient.

Replace the while loops with a read_poll_timeout() based implementation
that will sleep between iterations, and which stops polling once the
condition is met (instead of looping as long as the condition is met). This
aligns with the commented behavior and avoids blocking on the CPU while
doing nothing.

Note the RREG_DAC is implemented using a statement expression to allow
working properly with the read_poll_timeout family of functions. The other
RREG_<TYPE> macros ought to be cleaned up to have better semantics, and
several places in the mgag200 driver could make use of RREG_DAC or similar
RREG_* macros should likely be cleaned up for better semantics as well, but
that task has been left as a future cleanup for a non-bugfix.

Fixes: 414c45310625 ("mgag200: initial g200se driver (v2)")
Suggested-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260202-jk-mgag200-fix-bad-udelay-v2-1-ce1e9665987d@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/mgag200/mgag200_bmc.c | 31 +++++++++++----------------
 drivers/gpu/drm/mgag200/mgag200_drv.h |  6 ++++++
 2 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_bmc.c b/drivers/gpu/drm/mgag200/mgag200_bmc.c
index 2ba2e3c5086a5..852a82f6309ba 100644
--- a/drivers/gpu/drm/mgag200/mgag200_bmc.c
+++ b/drivers/gpu/drm/mgag200/mgag200_bmc.c
@@ -1,13 +1,14 @@
 // SPDX-License-Identifier: GPL-2.0-only
 
 #include <linux/delay.h>
+#include <linux/iopoll.h>
 
 #include "mgag200_drv.h"
 
 void mgag200_bmc_disable_vidrst(struct mga_device *mdev)
 {
 	u8 tmp;
-	int iter_max;
+	int ret;
 
 	/*
 	 * 1 - The first step is to inform the BMC of an upcoming mode
@@ -37,30 +38,22 @@ void mgag200_bmc_disable_vidrst(struct mga_device *mdev)
 
 	/*
 	 * 3a- The third step is to verify if there is an active scan.
-	 * We are waiting for a 0 on remhsyncsts <XSPAREREG<0>).
+	 * We are waiting for a 0 on remhsyncsts (<XSPAREREG<0>).
 	 */
-	iter_max = 300;
-	while (!(tmp & 0x1) && iter_max) {
-		WREG8(DAC_INDEX, MGA1064_SPAREREG);
-		tmp = RREG8(DAC_DATA);
-		udelay(1000);
-		iter_max--;
-	}
+	ret = read_poll_timeout(RREG_DAC, tmp, !(tmp & 0x1),
+				1000, 300000, false,
+				MGA1064_SPAREREG);
+	if (ret == -ETIMEDOUT)
+		return;
 
 	/*
-	 * 3b- This step occurs only if the remove is actually
+	 * 3b- This step occurs only if the remote BMC is actually
 	 * scanning. We are waiting for the end of the frame which is
 	 * a 1 on remvsyncsts (XSPAREREG<1>)
 	 */
-	if (iter_max) {
-		iter_max = 300;
-		while ((tmp & 0x2) && iter_max) {
-			WREG8(DAC_INDEX, MGA1064_SPAREREG);
-			tmp = RREG8(DAC_DATA);
-			udelay(1000);
-			iter_max--;
-		}
-	}
+	(void)read_poll_timeout(RREG_DAC, tmp, (tmp & 0x2),
+				1000, 300000, false,
+				MGA1064_SPAREREG);
 }
 
 void mgag200_bmc_enable_vidrst(struct mga_device *mdev)
diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h b/drivers/gpu/drm/mgag200/mgag200_drv.h
index aebd09e2d4087..c84c3d0865345 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.h
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.h
@@ -116,6 +116,12 @@
 #define DAC_INDEX 0x3c00
 #define DAC_DATA 0x3c0a
 
+#define RREG_DAC(reg)						\
+	({							\
+		WREG8(DAC_INDEX, reg);				\
+		RREG8(DAC_DATA);				\
+	})							\
+
 #define WREG_DAC(reg, v)					\
 	do {							\
 		WREG8(DAC_INDEX, reg);				\
-- 
2.51.0




  parent reply	other threads:[~2026-02-09 14:44 UTC|newest]

Thread overview: 81+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-09 14:23 [PATCH 6.1 00/69] 6.1.163-rc1 review Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 01/69] nvmet-tcp: add bounds checks in nvmet_tcp_build_pdu_iovec Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 02/69] x86/kfence: fix booting on 32bit non-PAE systems Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 03/69] platform/x86: intel_telemetry: Fix swapped arrays in PSS output Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 04/69] rbd: check for EOD after exclusive lock is ensured to be held Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 05/69] ARM: 9468/1: fix memset64() on big-endian Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 06/69] Revert "drm/amd: Check if ASPM is enabled from PCIe subsystem" Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 07/69] KVM: Dont clobber irqfd routing type when deassigning irqfd Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 08/69] netfilter: nft_set_pipapo: clamp maximum map bucket size to INT_MAX Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 09/69] binder: fix BR_FROZEN_REPLY error log Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 10/69] binderfs: fix ida_alloc_max() upper bound Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 11/69] pmdomain: imx8mp-blk-ctrl: Keep gpc power domain on for system wakeup Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 12/69] pmdomain: imx8mp-blk-ctrl: Keep usb phy " Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 13/69] pmdomain: imx8m-blk-ctrl: fix out-of-range access of bc->domains Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 14/69] gve: Fix stats report corruption on queue count change Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 15/69] tracing: Fix ftrace event field alignments Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 16/69] gve: Correct ethtool rx_dropped calculation Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 17/69] KVM: selftests: Add -U_FORTIFY_SOURCE to avoid some unpredictable test failures Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 18/69] wifi: mac80211: ocb: skip rx_no_sta when interface is not joined Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 19/69] wifi: wlcore: ensure skb headroom before skb_push Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 20/69] net: usb: sr9700: support devices with virtual driver CD Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 21/69] block,bfq: fix aux stat accumulation destination Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 22/69] smb/server: call ksmbd_session_rpc_close() on error path in create_smb2_pipe() Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 23/69] LoongArch: Set correct protection_map[] for VM_NONE/VM_SHARED Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 24/69] LoongArch: Enable exception fixup for specific ADE subcode Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 25/69] HID: intel-ish-hid: Update ishtp bus match to support device ID table Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 26/69] HID: multitouch: add MT_QUIRK_STICKY_FINGERS to MT_CLS_VTL Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 27/69] btrfs: fix reservation leak in some error paths when inserting inline extent Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 28/69] HID: intel-ish-hid: Reset enum_devices_done before enumeration Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 29/69] HID: playstation: Center initial joystick axes to prevent spurious events Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 30/69] ALSA: hda/realtek: add HP Laptop 15s-eq1xxx mute LED quirk Greg Kroah-Hartman
2026-02-09 14:23 ` [PATCH 6.1 31/69] netfilter: replace -EEXIST with -EBUSY Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 32/69] HID: quirks: Add another Chicony HP 5MP Cameras to hid_ignore_list Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 33/69] HID: i2c-hid: fix potential buffer overflow in i2c_hid_get_report() Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 34/69] HID: Apply quirk HID_QUIRK_ALWAYS_POLL to Edifier QR30 (2d99:a101) Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 35/69] ring-buffer: Avoid softlockup in ring_buffer_resize() during memory free Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 36/69] wifi: mac80211: collect station statistics earlier when disconnect Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 37/69] ASoC: davinci-evm: Fix reference leak in davinci_evm_probe Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 38/69] nvme-fc: release admin tagset if init fails Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 39/69] ASoC: tlv320adcx140: Propagate error codes during probe Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 40/69] wifi: cfg80211: Fix bitrate calculation overflow for HE rates Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 41/69] scsi: target: iscsi: Fix use-after-free in iscsit_dec_session_usage_count() Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 42/69] ALSA: hda/realtek: Fix headset mic for TongFang X6AR55xU Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 43/69] scsi: target: iscsi: Fix use-after-free in iscsit_dec_conn_usage_count() Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 44/69] wifi: mac80211: dont increment crypto_tx_tailroom_needed_cnt twice Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 45/69] platform/x86: toshiba_haps: Fix memory leaks in add/remove routines Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 46/69] platform/x86: intel_telemetry: Fix PSS event register mask Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 47/69] smb/client: fix memory leak in smb2_open_file() Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 48/69] dpaa2-switch: prevent ZERO_SIZE_PTR dereference when num_ifs is zero Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 49/69] net: liquidio: Initialize netdev pointer before queue setup Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 50/69] net: liquidio: Fix off-by-one error in PF setup_nic_devices() cleanup Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 51/69] net: liquidio: Fix off-by-one error in VF " Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 52/69] dpaa2-switch: add bounds check for if_id in IRQ handler Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 53/69] macvlan: fix error recovery in macvlan_common_newlink() Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 54/69] net: dont touch dev->stats in BPF redirect paths Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 55/69] tipc: use kfree_sensitive() for session key material Greg Kroah-Hartman
2026-02-09 14:24 ` Greg Kroah-Hartman [this message]
2026-02-09 14:24 ` [PATCH 6.1 57/69] hwmon: (occ) Mark occ_init_attribute() as __printf Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 58/69] netfilter: nf_tables: fix inverted genmask check in nft_map_catchall_activate() Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 59/69] ASoC: amd: fix memory leak in acp3x pdm dma ops Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 60/69] hfsplus: fix slab-out-of-bounds read in hfsplus_uni2asc() Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 61/69] riscv: uprobes: Add missing fence.i after building the XOL buffer Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 62/69] iommu: disable SVA when CONFIG_X86 is set Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 63/69] spi: tegra210-quad: Return IRQ_HANDLED when timeout already processed transfer Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 64/69] spi: tegra210-quad: Move curr_xfer read inside spinlock Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 65/69] spi: tegra210-quad: Protect curr_xfer assignment in tegra_qspi_setup_transfer_one Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 66/69] spi: tegra210-quad: Protect curr_xfer in tegra_qspi_combined_seq_xfer Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 67/69] spi: tegra210-quad: Protect curr_xfer clearing in tegra_qspi_non_combined_seq_xfer Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 68/69] spi: tegra: Fix a memory leak in tegra_slink_probe() Greg Kroah-Hartman
2026-02-09 14:24 ` [PATCH 6.1 69/69] ALSA: hda/realtek: Really fix headset mic for TongFang X6AR55xU Greg Kroah-Hartman
2026-02-09 16:48 ` [PATCH 6.1 00/69] 6.1.163-rc1 review Francesco Dolcini
2026-02-09 18:16 ` Brett A C Sheffield
2026-02-09 19:49 ` Florian Fainelli
2026-02-09 20:49 ` Hardik Garg
2026-02-09 20:55 ` Jon Hunter
2026-02-09 22:04 ` Peter Schneider
2026-02-10  8:55 ` Ron Economos
2026-02-11 11:34   ` Greg Kroah-Hartman
2026-02-10 13:08 ` Mark Brown
2026-02-11 10:08 ` Jeffrin Thalakkottoor
2026-02-11 13:44 ` Miguel Ojeda

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260209142303.940808045@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=jacob.e.keller@intel.com \
    --cc=jfalempe@redhat.com \
    --cc=patches@lists.linux.dev \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox