From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EFDE0C678D7 for ; Tue, 10 Jan 2023 18:32:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235436AbjAJSce (ORCPT ); Tue, 10 Jan 2023 13:32:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43824 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239462AbjAJSbv (ORCPT ); Tue, 10 Jan 2023 13:31:51 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7BF37A2AA5 for ; Tue, 10 Jan 2023 10:27:05 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B6A3861864 for ; Tue, 10 Jan 2023 18:27:04 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BFC68C433D2; Tue, 10 Jan 2023 18:27:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1673375224; bh=go9v8viwIjZtLWIHYEBl2vH+FrY8ZELBcbuiFrPAVKg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=wNtiFzrYvDhc5QSWngAzh8aPUk9tvmEwyULZ+IecqxKjPWoXzp/U3JN0wzBUe/Aoh 3IK8+4szN7zCyNSZUus6h7PeIvoD74T+p9lE26x8lQd4Zlmj4/nF/wpMBYMuoVQgKA nKUlKIqjcmKkF4TaHqwRrRibc7rI0pIJxSOQdreg= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Zhang Yuchen , Corey Minyard Subject: [PATCH 5.15 119/290] ipmi: fix long wait in unload when IPMI disconnect Date: Tue, 10 Jan 2023 19:03:31 +0100 Message-Id: <20230110180035.910475961@linuxfoundation.org> X-Mailer: git-send-email 2.39.0 In-Reply-To: <20230110180031.620810905@linuxfoundation.org> References: <20230110180031.620810905@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Zhang Yuchen commit f6f1234d98cce69578bfac79df147a1f6660596c upstream. When fixing the problem mentioned in PATCH1, we also found the following problem: If the IPMI is disconnected and in the sending process, the uninstallation driver will be stuck for a long time. The main problem is that uninstalling the driver waits for curr_msg to be sent or HOSED. After stopping tasklet, the only place to trigger the timeout mechanism is the circular poll in shutdown_smi. The poll function delays 10us and calls smi_event_handler(smi_info,10). Smi_event_handler deducts 10us from kcs->ibf_timeout. But the poll func is followed by schedule_timeout_uninterruptible(1). The time consumed here is not counted in kcs->ibf_timeout. So when 10us is deducted from kcs->ibf_timeout, at least 1 jiffies has actually passed. The waiting time has increased by more than a hundredfold. Now instead of calling poll(). call smi_event_handler() directly and calculate the elapsed time. For verification, you can directly use ebpf to check the kcs-> ibf_timeout for each call to kcs_event() when IPMI is disconnected. Decrement at normal rate before unloading. The decrement rate becomes very slow after unloading. $ bpftrace -e 'kprobe:kcs_event {printf("kcs->ibftimeout : %d\n", *(arg0+584));}' Signed-off-by: Zhang Yuchen Message-Id: <20221007092617.87597-3-zhangyuchen.lcr@bytedance.com> Signed-off-by: Corey Minyard Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/char/ipmi/ipmi_si_intf.c | 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-) --- a/drivers/char/ipmi/ipmi_si_intf.c +++ b/drivers/char/ipmi/ipmi_si_intf.c @@ -2152,6 +2152,20 @@ skip_fallback_noirq: } module_init(init_ipmi_si); +static void wait_msg_processed(struct smi_info *smi_info) +{ + unsigned long jiffies_now; + long time_diff; + + while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) { + jiffies_now = jiffies; + time_diff = (((long)jiffies_now - (long)smi_info->last_timeout_jiffies) + * SI_USEC_PER_JIFFY); + smi_event_handler(smi_info, time_diff); + schedule_timeout_uninterruptible(1); + } +} + static void shutdown_smi(void *send_info) { struct smi_info *smi_info = send_info; @@ -2186,16 +2200,13 @@ static void shutdown_smi(void *send_info * in the BMC. Note that timers and CPU interrupts are off, * so no need for locks. */ - while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) { - poll(smi_info); - schedule_timeout_uninterruptible(1); - } + wait_msg_processed(smi_info); + if (smi_info->handlers) disable_si_irq(smi_info); - while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) { - poll(smi_info); - schedule_timeout_uninterruptible(1); - } + + wait_msg_processed(smi_info); + if (smi_info->handlers) smi_info->handlers->cleanup(smi_info->si_sm);