From: Peter Seiderer <ps.report@gmx.net>
To: "Toke Høiland-Jørgensen" <toke@toke.dk>
Cc: Kalle Valo <kvalo@kernel.org>,
linux-wireless@vger.kernel.org,
Bagas Sanjaya <bagasdotme@gmail.com>
Subject: Re: [PATCH v5.18] ath9k: Properly clear TX status area before reporting to mac80211
Date: Mon, 4 Apr 2022 13:04:53 +0200 [thread overview]
Message-ID: <20220404130453.5ec6e045@gmx.net> (raw)
In-Reply-To: <20220402181910.5b35f0cb@gmx.net>
Hello Toke,
On Sat, 2 Apr 2022 18:19:10 +0200, Peter Seiderer <ps.report@gmx.net> wrote:
> Hello Toke,
>
> On Sat, 02 Apr 2022 17:11:54 +0200, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> > Peter Seiderer <ps.report@gmx.net> writes:
> >
> > > Hello Toke,
> > >
> > > On Fri, 1 Apr 2022 19:26:57 +0200, Peter Seiderer <ps.report@gmx.net> wrote:
> > >
> > >> Hello Toke,
> > >>
> > >> On Wed, 30 Mar 2022 18:44:09 +0200, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
> > >>
> > >> > The ath9k driver was not properly clearing the status area in the
> > >> > ieee80211_tx_info struct before reporting TX status to mac80211. Instead,
> > >> > it was manually filling in fields, which meant that fields introduced later
> > >> > were left as-is.
> > >> >
> > >> > Conveniently, mac80211 actually provides a helper to zero out the status
> > >> > area, so use that to make sure we zero everything.
> > >> >
> > >> > The last commit touching the driver function writing the status information
> > >> > seems to have actually been fixing an issue that was also caused by the
> > >> > area being uninitialised; but it only added clearing of a single field
> > >> > instead of the whole struct. That is now redundant, though, so revert that
> > >> > commit and use it as a convenient Fixes tag.
> > >> >
> > >> > Fixes: cc591d77aba1 ("ath9k: Make sure to zero status.tx_time before reporting TX status")
> > >> > Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
> > >> > Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
> > >> > ---
> > >> > drivers/net/wireless/ath/ath9k/xmit.c | 5 ++---
> > >> > 1 file changed, 2 insertions(+), 3 deletions(-)
> > >> >
> > >> > diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
> > >> > index d0caf1de2bde..cbcf96ac303e 100644
> > >> > --- a/drivers/net/wireless/ath/ath9k/xmit.c
> > >> > +++ b/drivers/net/wireless/ath/ath9k/xmit.c
> > >> > @@ -2553,6 +2553,8 @@ static void ath_tx_rc_status(struct ath_softc *sc, struct ath_buf *bf,
> > >> > struct ath_hw *ah = sc->sc_ah;
> > >> > u8 i, tx_rateindex;
> > >> >
> > >> > + ieee80211_tx_info_clear_status(tx_info);
> > >> > +
> > >>
> > >> As this also clears the status.rates[].count, see include/net/mac80211.h:
> > >>
> > >> 1195 static inline void
> > >> 1196 ieee80211_tx_info_clear_status(struct ieee80211_tx_info *info)
> > >> 1197 {
> > >> 1198 int i;
> > >> 1199
> > >> 1200 BUILD_BUG_ON(offsetof(struct ieee80211_tx_info, status.rates) !=
> > >> 1201 offsetof(struct ieee80211_tx_info, control.rates));
> > >> 1202 BUILD_BUG_ON(offsetof(struct ieee80211_tx_info, status.rates) !=
> > >> 1203 offsetof(struct ieee80211_tx_info, driver_rates));
> > >> 1204 BUILD_BUG_ON(offsetof(struct ieee80211_tx_info, status.rates) != 8) ;
> > >> 1205 /* clear the rate counts */
> > >> 1206 for (i = 0; i < IEEE80211_TX_MAX_RATES; i++)
> > >> 1207 info->status.rates[i].count = 0;
> > >> 1208 memset_after(&info->status, 0, rates);
> > >> 1209 }
> > >>
> > >> I would have expected some lines added to restore the count (for the
> > >> rates with index < tx_rateindex), e.g. as done in
> > >> drivers/net/wireless/ath/ath5k/base.c:
> > >>
> > >> 1731 ieee80211_tx_info_clear_status(info);
> > >> 1732
> > >> 1733 for (i = 0; i < ts->ts_final_idx; i++) {
> > >> 1734 struct ieee80211_tx_rate *r =
> > >> 1735 &info->status.rates[i];
> > >> 1736
> > >> 1737 r->count = tries[i];
> > >> 1738 }
> > >>
> > >> In drivers/net/wireless/ath/ath9k/xmit.c this is only done/changed for
> > >> the tx_rateindex index (which is often zero in case the first suggested rate
> > >> succeeds, but in noisy environment is sometimes > 0)...
> > >>
> > >> > if (txok)
> > >> > tx_info->status.ack_signal = ts->ts_rssi;
> > >> >
> > >> > @@ -2595,9 +2597,6 @@ static void ath_tx_rc_status(struct ath_softc *sc, struct ath_buf *bf,
> > >> > }
> > >> >
> > >> > tx_info->status.rates[)].count = ts->ts_longretry + 1;
> > >> > -
> > >> > - /* we report airtime in ath_tx_count_airtime(), don't report twice */
> > >> > - tx_info->status.tx_time = 0;
> > >> > }
> > >> >
> > >> > static void ath_tx_processq(struct ath_softc *sc, struct ath_txq *txq)
> > >>
> > >>
> > >> And from drivers/net/wireless/ath/ath9k/xmit.c:
> > >>
> > >> 2592 for (i = tx_rateindex + 1; i < hw->max_rates; i++) {
> > >> 2593 tx_info->status.rates[i].count = 0;
> > >> 2594 tx_info->status.rates[i].idx = -1;
> > >> 2595 }
> > >>
> > >> Line 2593 can be deleted as status.rates[].count is already zeroed through the
> > >> ieee80211_tx_info_clear_status() call...
> > >>
> > >> And it should be sufficient to do:
> > >>
> > >> if (tx_rateindex + 1 < hw->max_rates)
> > >> tx_info->status.rates[tx_rateindex + 1].idx = -1;
> > >>
> > >> Regards,
> > >> Peter
> > >
> > > And one additional problem found with your patch applied (to 5.16.18), the log
> > > get spammed by:
> > >
> > > [ 445.489139] ------------[ cut here ]------------
> > > [ 445.489150] WARNING: CPU: 0 PID: 8630 at drivers/net/wireless/ath/ath9k/xmit.c:174 ath_tx_complete+0x190/0x1a0 [ath9k]
> > > [ 445.489178] Modules linked in: ath9k ath9k_common ath9k_hw mac80211 ppp_async ppp_generic slhc xt_u32 xt_TCPMSS xt_tcpmss xt_MASQUERADE iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter iptable_filter ip6table_mangle ip6_tables cls_u32 sch_prio qmi_wwan cdc_wdm usbnet qcserial usb_wwan gpio_pca953x xt_tcpudp xt_mark iptable_mangle ip_tables x_tables dummy tqmx86_wdt watchdog i2c_machxo2 gpio_tqmx86 libarc4 ath tqmx86 mfd_core ftdi_sio cfg80211 usbserial nct7802 regmap_i2c [last unloaded: mac80211]
> > > [ 445.489298] CPU: 0 PID: 8630 Comm: modprobe Tainted: G W 5.16.18+ #2
> > > [ 445.489307] Hardware name: TQ-Group TQMxE38M X64/Type2 - Board Product Name, BIOS TQMxE38M.5.4.48.0028.12 01/31/2017
> > > [ 445.489311] EIP: ath_tx_complete+0x190/0x1a0 [ath9k]
> > > [ 445.489328] Code: ff 75 e8 68 d7 a1 2c f8 89 55 ec e8 d5 4f e3 ff 8b 45 f0 8b 55 ec 83 c4 10 8b 80 58 0a 00 00 e9 a8 fe ff ff 8d b6 00 00 00 00 <0f> 0b c7 42 68 00 00 00 00 e9 1d ff ff ff 66 90 55 89 e5 57 56 89
> > > [ 445.489336] EAX: 00000000 EBX: c2e67900 ECX: c23cc040 EDX: c235a980
> > > [ 445.489342] ESI: c235a888 EDI: c2359860 EBP: c1089e18 ESP: c1089dfc
> > > [ 445.489348] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010297
> > > [ 445.489355] CR0: 80050033 CR2: b7f81000 CR3: 02ec7000 CR4: 001006d0
> > > [ 445.489362] Call Trace:
> > > [ 445.489366] <SOFTIRQ>
> > > [ 445.489372] ath_tx_complete_buf+0x100/0x130 [ath9k]
> > > [ 445.489390] ath_tx_process_buffer+0x187/0xb40 [ath9k]
> > > [ 445.489407] ? resched_curr+0x1d/0xc0
> > > [ 445.489419] ? check_preempt_wakeup+0x115/0x250
> > > [ 445.489429] ? task_fork_fair+0xc0/0x170
> > > [ 445.489436] ? put_prev_task_fair+0x40/0x40
> > > [ 445.489444] ? check_preempt_curr+0x62/0x70
> > > [ 445.489452] ? __local_bh_enable_ip+0x33/0x80
> > > [ 445.489460] ? _raw_spin_unlock_bh+0x13/0x20
> > > [ 445.489468] ? ath_txq_unlock_complete+0x5a/0x100 [ath9k]
> > > [ 445.489485] ? _raw_spin_unlock_irqrestore+0x16/0x30
> > > [ 445.489492] ? try_to_wake_up+0x7c/0x550
> > > [ 445.489499] ath_tx_tasklet+0x1de/0x2e0 [ath9k]
> > > [ 445.489519] ath9k_tasklet+0x22f/0x330 [ath9k]
> > > [ 445.489535] tasklet_action_common.constprop.0+0x89/0xb0
> > > [ 445.489544] tasklet_action+0x21/0x30
> > > [ 445.489551] __do_softirq+0xc5/0x28b
> > > [ 445.489559] ? __entry_text_end+0x4/0x4
> > > [ 445.489567] call_on_stack+0x40/0x50
> > > [ 445.489576] </SOFTIRQ>
> > > [ 445.489580] ? irq_exit_rcu+0x92/0x100
> > > [ 445.489587] ? common_interrupt+0x27/0x40
> > > [ 445.489595] ? asm_common_interrupt+0x102/0x140
> > > [ 445.489605] ---[ end trace 5c176b666255bca1 ]---
> > > [ 445.607206] ------------[ cut here ]------------
> > >
> > >
> > > Could be fixed by the following additional change:
> > >
> > > --- a/drivers/net/wireless/ath/ath9k/xmit.c
> > > +++ b/drivers/net/wireless/ath/ath9k/xmit.c
> > > @@ -2511,9 +2511,12 @@ static void ath_tx_rc_status(struct ath_softc *sc, struct ath_buf *bf,
> > > struct ieee80211_hw *hw = sc->hw;
> > > struct ath_hw *ah = sc->sc_ah;
> > > u8 i, tx_rateindex;
> > > + bool is_valid_ack_signal = tx_info->status.is_valid_ack_signal;
> > >
> > > ieee80211_tx_info_clear_status(tx_info);
> > >
> > > + tx_info->status.is_valid_ack_signal = is_valid_ack_signal;
> > > +
> > > if (txok)
> >
> > That doesn't seem right, zeroing is_valid_ack_signal was the whole point
> > of the first patch... Something seems off in that backtrace; could you
> > try running it through ./scripts/decode_stacktrace.sh please?
>
> Will do (and re-run the test with an fresh compiled kernel as CONFIG_DEBUG_KERNEL
> was not set - but need to wait until next week as I need physical access to
> the test system for kernel update)...
Same with an fresh build (and with stacktrace run through decode_stacktrace.sh), but
did some more investigations, added an debug print of tx_info->status.is_valid_ack_signal
in ath_tx_rc_status() and an corresponding one in ath_txq_skb_done() printing out
fi->txq, with the additional
tx_info->status.is_valid_ack_signal = is_valid_ack_signal;
line added showed the following output
[ 839.543175] XXX - ath_tx_rc_status() - is_valid_ack_signal: 2
[ 839.543195] XXX - ath_txq_skb_done() txq: 2
[ 839.617898] XXX - ath_tx_rc_status() - is_valid_ack_signal: 2
[ 839.617961] XXX - ath_txq_skb_done() txq: 2
[ 840.024845] XXX - ath_tx_rc_status() - is_valid_ack_signal: 2
[ 840.024863] XXX - ath_txq_skb_done() txq: 2
[...]
and without the following:
[ 924.614745] XXX - ath_tx_rc_status() - is_valid_ack_signal: 2
[ 924.614762] XXX - ath_txq_skb_done() txq: 0
[ 924.614805] ------------[ cut here ]------------
[ 924.614809] WARNING: CPU: 0 PID: 0 at drivers/net/wireless/ath/ath9k/xmit.c:176 ath_txq_skb_done.constprop.0+0x28/0x38 [ath9k]
[...]
Seems ieee80211_tx_info_clear_status() clears/destroys the dual use of the
struct ieee80211_tx_info.status vs. ieee80211_tx_info.rate_driver_data...
See drivers/net/wireless/ath/ath9k/xmit.c
140 static struct ath_frame_info *get_frame_info(struct sk_buff *skb)
141 {
142 struct ieee80211_tx_info *tx_info = IEEE80211_SKB_CB(skb);
143 BUILD_BUG_ON(sizeof(struct ath_frame_info) >
144 sizeof(tx_info->rate_driver_data));
145 return (struct ath_frame_info *) &tx_info->rate_driver_data[0];
146 }
205 static void ath_txq_skb_done(struct ath_softc *sc, struct ath_txq *txq,
206 struct sk_buff *skb)
207 {
208 struct ath_frame_info *fi = get_frame_info(skb);
209 int q = fi->txq;
210
211 if (q < 0)
212 return;
213
214 txq = sc->tx.txq_map[q];
215 if (WARN_ON(--txq->pending_frames < 0))
216 txq->pending_frames = 0;
217
218 }
Seems it only worked by chance (?) before...
Regards,
Peter
>
> Regards,
> Peter
>
> >
> > -Toke
>
next prev parent reply other threads:[~2022-04-04 11:05 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-30 16:44 [PATCH v5.18] ath9k: Properly clear TX status area before reporting to mac80211 Toke Høiland-Jørgensen
2022-03-31 5:06 ` Bagas Sanjaya
2022-03-31 5:18 ` Bagas Sanjaya
2022-03-31 5:36 ` Kalle Valo
2022-03-31 8:35 ` Toke Høiland-Jørgensen
2022-03-31 9:31 ` Kalle Valo
2022-04-01 6:37 ` [v5.18] " Kalle Valo
2022-04-01 17:26 ` [PATCH v5.18] " Peter Seiderer
2022-04-02 12:00 ` Toke Høiland-Jørgensen
2022-04-02 14:33 ` Peter Seiderer
2022-04-02 15:11 ` Toke Høiland-Jørgensen
2022-04-02 16:19 ` Peter Seiderer
2022-04-04 11:04 ` Peter Seiderer [this message]
2022-04-04 16:03 ` Toke Høiland-Jørgensen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220404130453.5ec6e045@gmx.net \
--to=ps.report@gmx.net \
--cc=bagasdotme@gmail.com \
--cc=kvalo@kernel.org \
--cc=linux-wireless@vger.kernel.org \
--cc=toke@toke.dk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.