stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Dexuan Cui <decui@microsoft.com>,
	Michael Kelley <mikelley@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.10 03/43] x86/hyperv: Initialize clockevents after LAPIC is initialized
Date: Fri, 22 Jan 2021 15:12:19 +0100	[thread overview]
Message-ID: <20210122135735.780499208@linuxfoundation.org> (raw)
In-Reply-To: <20210122135735.652681690@linuxfoundation.org>

From: Dexuan Cui <decui@microsoft.com>

[ Upstream commit fff7b5e6ee63c5d20406a131b260c619cdd24fd1 ]

With commit 4df4cb9e99f8, the Hyper-V direct-mode STIMER is actually
initialized before LAPIC is initialized: see

  apic_intr_mode_init()

    x86_platform.apic_post_init()
      hyperv_init()
        hv_stimer_alloc()

    apic_bsp_setup()
      setup_local_APIC()

setup_local_APIC() temporarily disables LAPIC, initializes it and
re-eanble it.  The direct-mode STIMER depends on LAPIC, and when it's
registered, it can be programmed immediately and the timer can fire
very soon:

  hv_stimer_init
    clockevents_config_and_register
      clockevents_register_device
        tick_check_new_device
          tick_setup_device
            tick_setup_periodic(), tick_setup_oneshot()
              clockevents_program_event

When the timer fires in the hypervisor, if the LAPIC is in the
disabled state, new versions of Hyper-V ignore the event and don't inject
the timer interrupt into the VM, and hence the VM hangs when it boots.

Note: when the VM starts/reboots, the LAPIC is pre-enabled by the
firmware, so the window of LAPIC being temporarily disabled is pretty
small, and the issue can only happen once out of 100~200 reboots for
a 40-vCPU VM on one dev host, and on another host the issue doesn't
reproduce after 2000 reboots.

The issue is more noticeable for kdump/kexec, because the LAPIC is
disabled by the first kernel, and stays disabled until the kdump/kexec
kernel enables it. This is especially an issue to a Generation-2 VM
(for which Hyper-V doesn't emulate the PIT timer) when CONFIG_HZ=1000
(rather than CONFIG_HZ=250) is used.

Fix the issue by moving hv_stimer_alloc() to a later place where the
LAPIC timer is initialized.

Fixes: 4df4cb9e99f8 ("x86/hyperv: Initialize clockevents earlier in CPU onlining")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by:  Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210116223136.13892-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/hyperv/hv_init.c |   29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -312,6 +312,25 @@ static struct syscore_ops hv_syscore_ops
 	.resume		= hv_resume,
 };
 
+static void (* __initdata old_setup_percpu_clockev)(void);
+
+static void __init hv_stimer_setup_percpu_clockev(void)
+{
+	/*
+	 * Ignore any errors in setting up stimer clockevents
+	 * as we can run with the LAPIC timer as a fallback.
+	 */
+	(void)hv_stimer_alloc();
+
+	/*
+	 * Still register the LAPIC timer, because the direct-mode STIMER is
+	 * not supported by old versions of Hyper-V. This also allows users
+	 * to switch to LAPIC timer via /sys, if they want to.
+	 */
+	if (old_setup_percpu_clockev)
+		old_setup_percpu_clockev();
+}
+
 /*
  * This function is to be invoked early in the boot sequence after the
  * hypervisor has been detected.
@@ -390,10 +409,14 @@ void __init hyperv_init(void)
 	wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
 
 	/*
-	 * Ignore any errors in setting up stimer clockevents
-	 * as we can run with the LAPIC timer as a fallback.
+	 * hyperv_init() is called before LAPIC is initialized: see
+	 * apic_intr_mode_init() -> x86_platform.apic_post_init() and
+	 * apic_bsp_setup() -> setup_local_APIC(). The direct-mode STIMER
+	 * depends on LAPIC, so hv_stimer_alloc() should be called from
+	 * x86_init.timers.setup_percpu_clockev.
 	 */
-	(void)hv_stimer_alloc();
+	old_setup_percpu_clockev = x86_init.timers.setup_percpu_clockev;
+	x86_init.timers.setup_percpu_clockev = hv_stimer_setup_percpu_clockev;
 
 	hv_apic_init();
 



  parent reply	other threads:[~2021-01-22 14:55 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-22 14:12 [PATCH 5.10 00/43] 5.10.10-rc1 review Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 01/43] Revert "kconfig: remove kvmconfig and xenconfig shorthands" Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 02/43] bpf: Fix selftest compilation on clang 11 Greg Kroah-Hartman
2021-01-22 14:12 ` Greg Kroah-Hartman [this message]
2021-01-22 14:12 ` [PATCH 5.10 04/43] drm/amdgpu/display: drop DCN support for aarch64 Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 05/43] bpf: Fix signed_{sub,add32}_overflows type handling Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 06/43] X.509: Fix crash caused by NULL pointer Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 07/43] nfsd4: readdirplus shouldnt return parent of export Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 08/43] bpf: Dont leak memory in bpf getsockopt when optlen == 0 Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 09/43] bpf: Support PTR_TO_MEM{,_OR_NULL} register spilling Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 10/43] bpf: Fix helper bpf_map_peek_elem_proto pointing to wrong callback Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 11/43] net: ipa: modem: add missing SET_NETDEV_DEV() for proper sysfs links Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 12/43] net: fix use-after-free when UDP GRO with shared fraglist Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 13/43] udp: Prevent reuseport_select_sock from reading uninitialized socks Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 14/43] netxen_nic: fix MSI/MSI-x interrupts Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 15/43] net: ipv6: Validate GSO SKB before finish IPv6 processing Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 16/43] tipc: fix NULL deref in tipc_link_xmit() Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 17/43] mlxsw: core: Add validation of transceiver temperature thresholds Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 18/43] mlxsw: core: Increase critical threshold for ASIC thermal zone Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 19/43] net: mvpp2: Remove Pause and Asym_Pause support Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 20/43] rndis_host: set proper input size for OID_GEN_PHYSICAL_MEDIUM request Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 21/43] esp: avoid unneeded kmap_atomic call Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 22/43] net: dcb: Validate netlink message in DCB handler Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 23/43] net: dcb: Accept RTM_GETDCB messages carrying set-like DCB commands Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 24/43] rxrpc: Call state should be read with READ_ONCE() under some circumstances Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 25/43] i40e: fix potential NULL pointer dereferencing Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 26/43] net: stmmac: Fixed mtu channged by cache aligned Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 27/43] net: sit: unregister_netdevice on newlinks error path Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 28/43] net: stmmac: fix taprio schedule configuration Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 29/43] net: stmmac: fix taprio configuration when base_time is in the past Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 30/43] net: avoid 32 x truesize under-estimation for tiny skbs Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 31/43] dt-bindings: net: renesas,etheravb: RZ/G2H needs tx-internal-delay-ps Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 32/43] net: phy: smsc: fix clk error handling Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 33/43] net: dsa: clear devlink port type before unregistering slave netdevs Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 34/43] rxrpc: Fix handling of an unsupported token type in rxrpc_read() Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 35/43] net: stmmac: use __napi_schedule() for PREEMPT_RT Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 36/43] can: mcp251xfd: mcp251xfd_handle_rxif_one(): fix wrong NULL pointer check Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 37/43] drm/panel: otm8009a: allow using non-continuous dsi clock Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 38/43] mac80211: do not drop tx nulldata packets on encrypted links Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 39/43] mac80211: check if atf has been disabled in __ieee80211_schedule_txq Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 40/43] net: dsa: unbind all switches from tree when DSA master unbinds Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 41/43] cxgb4/chtls: Fix tid stuck due to wrong update of qid Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 42/43] spi: fsl: Fix driver breakage when SPI_CS_HIGH is not set in spi->mode Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 43/43] spi: cadence: cache reference clock rate during probe Greg Kroah-Hartman
2021-01-23  0:24 ` [PATCH 5.10 00/43] 5.10.10-rc1 review Shuah Khan
2021-01-23 15:06   ` Greg Kroah-Hartman
2021-01-23  5:44 ` Naresh Kamboju
2021-01-23  7:20   ` Naresh Kamboju
2021-01-23 15:06     ` Greg Kroah-Hartman
2021-01-23  9:52 ` Pavel Machek
2021-01-23 15:06   ` Greg Kroah-Hartman
2021-01-23  9:59 ` Jon Hunter
2021-01-23 15:19   ` Greg Kroah-Hartman
2021-01-23 14:36 ` Guenter Roeck
2021-01-23 15:07   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210122135735.780499208@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=decui@microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mikelley@microsoft.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=wei.liu@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).