From: Thomas Gleixner <tglx@linutronix.de>
To: Fabio Estevam <festevam@gmail.com>
Cc: "Paul E . McKenney" <paulmck@kernel.org>,
Kalle Valo <kvalo@codeaurora.org>,
ath10k@lists.infradead.org, linux-mmc <linux-mmc@vger.kernel.org>,
Ulf Hansson <ulf.hansson@linaro.org>, Marek Vasut <marex@denx.de>,
qais.yousef@arm.com, Frederic Weisbecker <frederic@kernel.org>
Subject: Re: NOHZ tick-stop error with ath10k SDIO
Date: Sun, 05 Sep 2021 15:00:32 +0200
Message-ID: <87y28b9nyn.ffs@tglx>
In-Reply-To: <CAOMZO5BnPEnF-HNM7vCzeUrRW7BsQ-hhm4fcVmO_QieKf6oJsw@mail.gmail.com>
Fabio,
On Sat, Sep 04 2021 at 18:10, Fabio Estevam wrote:
> On Fri, Sep 3, 2021 at 5:07 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> I did as suggested and here is trace.txt:
> https://pastebin.com/VUfLRJ8a
Lacks a stack trace, but yes this one is the culprit:
kworker/u4:2-70 [000] d..1 87.940929: softirq_raise: vec=3 [action=NET_RX]
It has only interrupts and preemption disabled and it's in task
context. So if there is no interrupt raised and no local_bh_disable /
enable() pair invoked before the CPU goes idle nothing will handle the
softirq and the raised bit stays pending which makes the NOHZ idle code
complain.
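
The reasoning above can be sketched as a userspace toy model. This is not kernel code: every function and variable below is a hypothetical stand-in invented for illustration, and the model deliberately ignores interrupts and ksoftirqd. It only shows why a softirq bit raised from task context stays pending unless a local_bh_disable() / local_bh_enable() pair surrounds the raise.

```c
/* Userspace toy model of the pending-softirq problem; NOT kernel code.
 * All names are hypothetical stand-ins that mimic the kernel's naming. */
#include <assert.h>
#include <stdbool.h>

enum { NET_RX_SOFTIRQ_MODEL = 3 };	/* matches vec=3 in the trace */

unsigned int softirq_pending;		/* models the per-CPU pending bitmask */
unsigned int bh_disable_depth;		/* models local_bh_disable() nesting */
unsigned int softirq_handled;		/* bits for which a handler has run */

/* Models running the handlers: everything pending is handled and cleared. */
void do_softirq_model(void)
{
	softirq_handled |= softirq_pending;
	softirq_pending = 0;
}

void local_bh_disable_model(void)
{
	bh_disable_depth++;
}

/* Leaving the outermost BH-disabled section processes pending softirqs. */
void local_bh_enable_model(void)
{
	assert(bh_disable_depth > 0);
	if (--bh_disable_depth == 0 && softirq_pending)
		do_softirq_model();
}

/* Stand-in for the raise done by napi_schedule(): it only sets the bit. */
void raise_net_rx_model(void)
{
	softirq_pending |= 1u << NET_RX_SOFTIRQ_MODEL;
}

/* Models the NOHZ idle check: complain if anything is still pending. */
bool idle_would_complain(void)
{
	return softirq_pending != 0;
}

/* Work item without the fix: raise from task context, then go idle. */
bool worker_unfixed(void)
{
	softirq_pending = bh_disable_depth = softirq_handled = 0;
	raise_net_rx_model();
	return idle_would_complain();
}

/* Work item with the fix: the raise is wrapped in a BH disable/enable pair. */
bool worker_fixed(void)
{
	softirq_pending = bh_disable_depth = softirq_handled = 0;
	local_bh_disable_model();
	raise_net_rx_model();
	local_bh_enable_model();
	return idle_would_complain();
}
```

In this model worker_unfixed() reports that the idle check would complain, while worker_fixed() does not, because leaving the BH-disabled section is what runs the pending handler.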
> Also, while investigating this problem I saw a commit that fixed a
> similar issue:
> e63052a5dd3c ("mlx5e: add add missing BH locking around napi_schdule()").
>
> I then tried the same approach on the ath10k sdio driver:
>
> diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c
> index b746052737e0..eb705214f3f0 100644
> --- a/drivers/net/wireless/ath/ath10k/sdio.c
> +++ b/drivers/net/wireless/ath/ath10k/sdio.c
> @@ -1363,8 +1363,11 @@ static void ath10k_rx_indication_async_work(struct work_struct *work)
>                 ep->ep_ops.ep_rx_complete(ar, skb);
>         }
>
> -       if (test_bit(ATH10K_FLAG_CORE_REGISTERED, &ar->dev_flags))
> +       if (test_bit(ATH10K_FLAG_CORE_REGISTERED, &ar->dev_flags)) {
> +               local_bh_disable();
>                 napi_schedule(&ar->napi);
> +               local_bh_enable();
> +       }
>  }
>
> and no longer get the "NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!" error messages after launching hostapd.
>
> Is this a proper fix?
Yes. This is correct. See above.
Thanks,
tglx
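
The "handler #08" in the quoted warning and the "vec=3" in the trace can be related with a small decoder. The sketch below assumes the number in the warning is the pending-softirq bitmask printed in hex, and that the vector order is the one in the kernel's softirq enum around the time of this thread (HI first, RCU last); decode_pending() and the name table are hypothetical helpers, not kernel API.

```c
/* Userspace helper: decode a pending-softirq bitmask into vector names.
 * Assumption: bit N corresponds to softirq vector N in the order below. */
#include <stdio.h>
#include <string.h>

static const char *const softirq_names[] = {
	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK",
	"IRQ_POLL", "TASKLET", "SCHED", "HRTIMER", "RCU",
};

/* Writes space-separated names of the set bits into buf.
 * Returns the number of names written; stops early if buf is too small. */
int decode_pending(unsigned int mask, char *buf, size_t len)
{
	size_t used = 0;
	int count = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (size_t i = 0; i < sizeof(softirq_names) / sizeof(softirq_names[0]); i++) {
		if (!(mask & (1u << i)))
			continue;
		int w = snprintf(buf + used, len - used, "%s%s",
				 count ? " " : "", softirq_names[i]);
		if (w < 0 || (size_t)w >= len - used) {
			buf[used] = '\0';	/* drop the truncated name */
			break;
		}
		used += (size_t)w;
		count++;
	}
	return count;
}
```

Under those assumptions, a mask of 0x08 has only bit 3 set, which decodes to NET_RX, the same vector the softirq_raise trace line reports.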