All of lore.kernel.org
 help / color / mirror / Atom feed
From: Corey Minyard <minyard@acm.org>
To: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
Cc: openipmi-developer@lists.sourceforge.net,
	linux-kernel@vger.kernel.org, qi.zheng@linux.dev
Subject: Re: [PATCH 2/3] ipmi: fix long wait in unload when IPMI disconnect
Date: Fri, 7 Oct 2022 14:48:09 -0500	[thread overview]
Message-ID: <Y0CCeW1GtiCk0XdQ@minyard.net> (raw)
In-Reply-To: <20221007092617.87597-3-zhangyuchen.lcr@bytedance.com>

On Fri, Oct 07, 2022 at 05:26:16PM +0800, Zhang Yuchen wrote:
> When fixing the problem mentioned in PATCH1, we also found
> the following problem:
> 
> If the IPMI is disconnected and in the sending process, the
> uninstallation driver will be stuck for a long time.
> 
> The main problem is that uninstalling the driver waits for curr_msg to
> be sent or HOSED. After stopping tasklet, the only place to trigger the
> timeout mechanism is the circular poll in shutdown_smi.
> 
> The poll function delays 10us and calls smi_event_handler(smi_info,10).
> Smi_event_handler deducts 10us from kcs->ibf_timeout.
> 
> But the poll func is followed by schedule_timeout_uninterruptible(1).
> The time consumed here is not counted in kcs->ibf_timeout.
> 
> So when 10us is deducted from kcs->ibf_timeout, at least 1 jiffies has
> actually passed. The waiting time has increased by more than a
> hundredfold.
> 
> Now instead of calling poll(). call smi_event_handler() directly and
> calculate the elapsed time.

Yes, you are correct.

I've included this for 6.2, and added:

  Cc: stable@vger.kernel.org

I would like it to soak for a bit.

-corey

> 
> For verification, you can directly use ebpf to check the kcs->
> ibf_timeout for each call to kcs_event() when IPMI is disconnected.
> Decrement at normal rate before unloading. The decrement rate becomes
> very slow after unloading.
> 
>   $ bpftrace -e 'kprobe:kcs_event {printf("kcs->ibftimeout : %d\n",
>       *(arg0+584));}'
> 
> Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
> ---
>  drivers/char/ipmi/ipmi_si_intf.c | 27 +++++++++++++++++++--------
>  1 file changed, 19 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> index 6e357ad76f2e..abddd7e43a9a 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -2153,6 +2153,20 @@ static int __init init_ipmi_si(void)
>  }
>  module_init(init_ipmi_si);
>  
> +static void wait_msg_processed(struct smi_info *smi_info)
> +{
> +	unsigned long jiffies_now;
> +	long time_diff;
> +
> +	while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) {
> +		jiffies_now = jiffies;
> +		time_diff = (((long)jiffies_now - (long)smi_info->last_timeout_jiffies)
> +		     * SI_USEC_PER_JIFFY);
> +		smi_event_handler(smi_info, time_diff);
> +		schedule_timeout_uninterruptible(1);
> +	}
> +}
> +
>  static void shutdown_smi(void *send_info)
>  {
>  	struct smi_info *smi_info = send_info;
> @@ -2187,16 +2201,13 @@ static void shutdown_smi(void *send_info)
>  	 * in the BMC.  Note that timers and CPU interrupts are off,
>  	 * so no need for locks.
>  	 */
> -	while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) {
> -		poll(smi_info);
> -		schedule_timeout_uninterruptible(1);
> -	}
> +	wait_msg_processed(smi_info);
> +
>  	if (smi_info->handlers)
>  		disable_si_irq(smi_info);
> -	while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) {
> -		poll(smi_info);
> -		schedule_timeout_uninterruptible(1);
> -	}
> +
> +	wait_msg_processed(smi_info);
> +
>  	if (smi_info->handlers)
>  		smi_info->handlers->cleanup(smi_info->si_sm);
>  
> -- 
> 2.30.2
> 

  reply	other threads:[~2022-10-07 19:48 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-07  9:26 [PATCH 0/3] ipmi: fix ipmi memleak and unload bug Zhang Yuchen
2022-10-07  9:26 ` [PATCH 1/3] ipmi: fix msg stack when IPMI is disconnected Zhang Yuchen
2022-10-07 19:43   ` Corey Minyard
2022-10-08  1:36     ` Yuchen Zhang
2022-10-08 11:49       ` Corey Minyard
2022-10-09  0:53         ` Yuchen Zhang
2022-10-07  9:26 ` [PATCH 2/3] ipmi: fix long wait in unload when IPMI disconnect Zhang Yuchen
2022-10-07 19:48   ` Corey Minyard [this message]
2022-10-07  9:26 ` [PATCH 3/3] ipmi: fix memleak when unload ipmi driver Zhang Yuchen
2022-10-07 19:51   ` Corey Minyard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y0CCeW1GtiCk0XdQ@minyard.net \
    --to=minyard@acm.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=openipmi-developer@lists.sourceforge.net \
    --cc=qi.zheng@linux.dev \
    --cc=zhangyuchen.lcr@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.