All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Seiderer <ps.report@gmx.net>
To: "Toke Høiland-Jørgensen" <toke@toke.dk>
Cc: Kalle Valo <kvalo@kernel.org>,
	linux-wireless@vger.kernel.org,
	Bagas Sanjaya <bagasdotme@gmail.com>
Subject: Re: [PATCH v5.18] ath9k: Properly clear TX status area before reporting to mac80211
Date: Mon, 4 Apr 2022 13:04:53 +0200	[thread overview]
Message-ID: <20220404130453.5ec6e045@gmx.net> (raw)
In-Reply-To: <20220402181910.5b35f0cb@gmx.net>

Hello Toke,

On Sat, 2 Apr 2022 18:19:10 +0200, Peter Seiderer <ps.report@gmx.net> wrote:

> Hello Toke,
> 
> On Sat, 02 Apr 2022 17:11:54 +0200, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
> 
> > Peter Seiderer <ps.report@gmx.net> writes:
> >   
> > > Hello Toke,
> > >
> > > On Fri, 1 Apr 2022 19:26:57 +0200, Peter Seiderer <ps.report@gmx.net> wrote:
> > >    
> > >> Hello Toke,
> > >> 
> > >> On Wed, 30 Mar 2022 18:44:09 +0200, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
> > >>     
> > >> > The ath9k driver was not properly clearing the status area in the
> > >> > ieee80211_tx_info struct before reporting TX status to mac80211. Instead,
> > >> > it was manually filling in fields, which meant that fields introduced later
> > >> > were left as-is.
> > >> > 
> > >> > Conveniently, mac80211 actually provides a helper to zero out the status
> > >> > area, so use that to make sure we zero everything.
> > >> > 
> > >> > The last commit touching the driver function writing the status information
> > >> > seems to have actually been fixing an issue that was also caused by the
> > >> > area being uninitialised; but it only added clearing of a single field
> > >> > instead of the whole struct. That is now redundant, though, so revert that
> > >> > commit and use it as a convenient Fixes tag.
> > >> > 
> > >> > Fixes: cc591d77aba1 ("ath9k: Make sure to zero status.tx_time before reporting TX status")
> > >> > Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
> > >> > Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
> > >> > ---
> > >> >  drivers/net/wireless/ath/ath9k/xmit.c | 5 ++---
> > >> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > >> > 
> > >> > diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
> > >> > index d0caf1de2bde..cbcf96ac303e 100644
> > >> > --- a/drivers/net/wireless/ath/ath9k/xmit.c
> > >> > +++ b/drivers/net/wireless/ath/ath9k/xmit.c
> > >> > @@ -2553,6 +2553,8 @@ static void ath_tx_rc_status(struct ath_softc *sc, struct ath_buf *bf,
> > >> >  	struct ath_hw *ah = sc->sc_ah;
> > >> >  	u8 i, tx_rateindex;
> > >> >  
> > >> > +	ieee80211_tx_info_clear_status(tx_info);
> > >> > +      
> > >> 
> > >> As this also clears the status.rates[].count, see include/net/mac80211.h:
> > >> 
> > >> 1195 static inline void
> > >> 1196 ieee80211_tx_info_clear_status(struct ieee80211_tx_info *info)
> > >> 1197 {
> > >> 1198         int i;
> > >> 1199         
> > >> 1200         BUILD_BUG_ON(offsetof(struct ieee80211_tx_info, status.rates) !=
> > >> 1201                      offsetof(struct ieee80211_tx_info, control.rates));
> > >> 1202         BUILD_BUG_ON(offsetof(struct ieee80211_tx_info, status.rates) !=
> > >> 1203                      offsetof(struct ieee80211_tx_info, driver_rates));
> > >> 1204         BUILD_BUG_ON(offsetof(struct ieee80211_tx_info, status.rates) != 8)     ;       
> > >> 1205         /* clear the rate counts */
> > >> 1206         for (i = 0; i < IEEE80211_TX_MAX_RATES; i++)
> > >> 1207                 info->status.rates[i].count = 0;
> > >> 1208         memset_after(&info->status, 0, rates);
> > >> 1209 }
> > >> 
> > >> I would have expected some lines added to restore the count (for the
> > >> rates with index < tx_rateindex), e.g. as done in
> > >> drivers/net/wireless/ath/ath5k/base.c:
> > >> 
> > >> 1731         ieee80211_tx_info_clear_status(info);
> > >> 1732 
> > >> 1733         for (i = 0; i < ts->ts_final_idx; i++) {
> > >> 1734                 struct ieee80211_tx_rate *r =
> > >> 1735                         &info->status.rates[i];
> > >> 1736 
> > >> 1737                 r->count = tries[i];
> > >> 1738         }
> > >> 
> > >> In drivers/net/wireless/ath/ath9k/xmit.c this is only done/changed for
> > >> the tx_rateindex index (which is often zero in case the first suggested rate
> > >> succeeds, but in noisy environment is sometimes > 0)...
> > >>     
> > >> >  	if (txok)
> > >> >  		tx_info->status.ack_signal = ts->ts_rssi;
> > >> >  
> > >> > @@ -2595,9 +2597,6 @@ static void ath_tx_rc_status(struct ath_softc *sc, struct ath_buf *bf,
> > >> >  	}
> > >> >  
> > >> >  	tx_info->status.rates[)].count = ts->ts_longretry + 1;
> > >> > -
> > >> > -	/* we report airtime in ath_tx_count_airtime(), don't report twice */
> > >> > -	tx_info->status.tx_time = 0;
> > >> >  }
> > >> >  
> > >> >  static void ath_tx_processq(struct ath_softc *sc, struct ath_txq *txq)      
> > >> 
> > >> 
> > >> And from drivers/net/wireless/ath/ath9k/xmit.c:
> > >> 
> > >> 2592         for (i = tx_rateindex + 1; i < hw->max_rates; i++) {
> > >> 2593                 tx_info->status.rates[i].count = 0;
> > >> 2594                 tx_info->status.rates[i].idx = -1;
> > >> 2595         }
> > >> 
> > >> Line 2593 can be deleted as status.rates[].count is already zeroed through the
> > >> ieee80211_tx_info_clear_status() call...
> > >> 
> > >> And it should be sufficient to do:
> > >> 
> > >> 	if (tx_rateindex + 1 < hw->max_rates)
> > >> 		tx_info->status.rates[tx_rateindex + 1].idx = -1;
> > >> 
> > >> Regards,
> > >> Peter    
> > >
> > > And one additional problem found with your patch applied (to 5.16.18), the log
> > > get spammed by:
> > >
> > > [  445.489139] ------------[ cut here ]------------
> > > [  445.489150] WARNING: CPU: 0 PID: 8630 at drivers/net/wireless/ath/ath9k/xmit.c:174 ath_tx_complete+0x190/0x1a0 [ath9k]
> > > [  445.489178] Modules linked in: ath9k ath9k_common ath9k_hw mac80211 ppp_async ppp_generic slhc xt_u32 xt_TCPMSS xt_tcpmss xt_MASQUERADE iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter iptable_filter ip6table_mangle ip6_tables cls_u32 sch_prio qmi_wwan cdc_wdm usbnet qcserial usb_wwan gpio_pca953x xt_tcpudp xt_mark iptable_mangle ip_tables x_tables dummy tqmx86_wdt watchdog i2c_machxo2 gpio_tqmx86 libarc4 ath tqmx86 mfd_core ftdi_sio cfg80211 usbserial nct7802 regmap_i2c [last unloaded: mac80211]
> > > [  445.489298] CPU: 0 PID: 8630 Comm: modprobe Tainted: G        W         5.16.18+ #2
> > > [  445.489307] Hardware name: TQ-Group TQMxE38M X64/Type2 - Board Product Name, BIOS TQMxE38M.5.4.48.0028.12 01/31/2017
> > > [  445.489311] EIP: ath_tx_complete+0x190/0x1a0 [ath9k]
> > > [  445.489328] Code: ff 75 e8 68 d7 a1 2c f8 89 55 ec e8 d5 4f e3 ff 8b 45 f0 8b 55 ec 83 c4 10 8b 80 58 0a 00 00 e9 a8 fe ff ff 8d b6 00 00 00 00 <0f> 0b c7 42 68 00 00 00 00 e9 1d ff ff ff 66 90 55 89 e5 57 56 89
> > > [  445.489336] EAX: 00000000 EBX: c2e67900 ECX: c23cc040 EDX: c235a980
> > > [  445.489342] ESI: c235a888 EDI: c2359860 EBP: c1089e18 ESP: c1089dfc
> > > [  445.489348] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010297
> > > [  445.489355] CR0: 80050033 CR2: b7f81000 CR3: 02ec7000 CR4: 001006d0
> > > [  445.489362] Call Trace:
> > > [  445.489366]  <SOFTIRQ>
> > > [  445.489372]  ath_tx_complete_buf+0x100/0x130 [ath9k]
> > > [  445.489390]  ath_tx_process_buffer+0x187/0xb40 [ath9k]
> > > [  445.489407]  ? resched_curr+0x1d/0xc0
> > > [  445.489419]  ? check_preempt_wakeup+0x115/0x250
> > > [  445.489429]  ? task_fork_fair+0xc0/0x170
> > > [  445.489436]  ? put_prev_task_fair+0x40/0x40
> > > [  445.489444]  ? check_preempt_curr+0x62/0x70
> > > [  445.489452]  ? __local_bh_enable_ip+0x33/0x80
> > > [  445.489460]  ? _raw_spin_unlock_bh+0x13/0x20
> > > [  445.489468]  ? ath_txq_unlock_complete+0x5a/0x100 [ath9k]
> > > [  445.489485]  ? _raw_spin_unlock_irqrestore+0x16/0x30
> > > [  445.489492]  ? try_to_wake_up+0x7c/0x550
> > > [  445.489499]  ath_tx_tasklet+0x1de/0x2e0 [ath9k]
> > > [  445.489519]  ath9k_tasklet+0x22f/0x330 [ath9k]
> > > [  445.489535]  tasklet_action_common.constprop.0+0x89/0xb0
> > > [  445.489544]  tasklet_action+0x21/0x30
> > > [  445.489551]  __do_softirq+0xc5/0x28b
> > > [  445.489559]  ? __entry_text_end+0x4/0x4
> > > [  445.489567]  call_on_stack+0x40/0x50
> > > [  445.489576]  </SOFTIRQ>
> > > [  445.489580]  ? irq_exit_rcu+0x92/0x100
> > > [  445.489587]  ? common_interrupt+0x27/0x40
> > > [  445.489595]  ? asm_common_interrupt+0x102/0x140
> > > [  445.489605] ---[ end trace 5c176b666255bca1 ]---
> > > [  445.607206] ------------[ cut here ]------------
> > >
> > >
> > > Could be fixed by the following additional change:
> > >
> > > --- a/drivers/net/wireless/ath/ath9k/xmit.c
> > > +++ b/drivers/net/wireless/ath/ath9k/xmit.c
> > > @@ -2511,9 +2511,12 @@ static void ath_tx_rc_status(struct ath_softc *sc, struct ath_buf *bf,
> > >         struct ieee80211_hw *hw = sc->hw;
> > >         struct ath_hw *ah = sc->sc_ah;
> > >         u8 i, tx_rateindex;
> > > +       bool is_valid_ack_signal = tx_info->status.is_valid_ack_signal;
> > >
> > >         ieee80211_tx_info_clear_status(tx_info);
> > >
> > > +       tx_info->status.is_valid_ack_signal = is_valid_ack_signal;
> > > +
> > >         if (txok)    
> > 
> > That doesn't seem right, zeroing is_valid_ack_signal was the whole point
> > of the first patch... Something seems off in that backtrace; could you
> > try running it through ./scripts/decode_stacktrace.sh please?  
> 
> Will do (and re-run the test with an fresh compiled kernel as CONFIG_DEBUG_KERNEL
> was not set - but need to wait until next week as I need physical access to
> the test system for kernel update)...

Same with an fresh build (and with stacktrace run through decode_stacktrace.sh), but
did some more investigations, added an debug print of tx_info->status.is_valid_ack_signal
in ath_tx_rc_status() and an corresponding one in ath_txq_skb_done() printing out
fi->txq, with the additional

	tx_info->status.is_valid_ack_signal = is_valid_ack_signal;

line added showed the following output

[  839.543175] XXX - ath_tx_rc_status() - is_valid_ack_signal: 2
[  839.543195] XXX - ath_txq_skb_done() txq: 2
[  839.617898] XXX - ath_tx_rc_status() - is_valid_ack_signal: 2
[  839.617961] XXX - ath_txq_skb_done() txq: 2
[  840.024845] XXX - ath_tx_rc_status() - is_valid_ack_signal: 2
[  840.024863] XXX - ath_txq_skb_done() txq: 2
[...]

and without the following:

[  924.614745] XXX - ath_tx_rc_status() - is_valid_ack_signal: 2
[  924.614762] XXX - ath_txq_skb_done() txq: 0
[  924.614805] ------------[ cut here ]------------
[  924.614809] WARNING: CPU: 0 PID: 0 at drivers/net/wireless/ath/ath9k/xmit.c:176 ath_txq_skb_done.constprop.0+0x28/0x38 [ath9k]
[...]


Seems ieee80211_tx_info_clear_status() clears/destroys the dual use of the
struct ieee80211_tx_info.status vs. ieee80211_tx_info.rate_driver_data...

See drivers/net/wireless/ath/ath9k/xmit.c

 140 static struct ath_frame_info *get_frame_info(struct sk_buff *skb)
 141 {
 142         struct ieee80211_tx_info *tx_info = IEEE80211_SKB_CB(skb);
 143         BUILD_BUG_ON(sizeof(struct ath_frame_info) >
 144                      sizeof(tx_info->rate_driver_data));
 145         return (struct ath_frame_info *) &tx_info->rate_driver_data[0];
 146 }


 205 static void ath_txq_skb_done(struct ath_softc *sc, struct ath_txq *txq,
 206                              struct sk_buff *skb)
 207 {
 208         struct ath_frame_info *fi = get_frame_info(skb);
 209         int q = fi->txq;
 210 
 211         if (q < 0)
 212                 return;
 213 
 214         txq = sc->tx.txq_map[q];
 215         if (WARN_ON(--txq->pending_frames < 0))
 216                 txq->pending_frames = 0;
 217 
 218 }

Seems it only worked by chance (?) before...

Regards,
Peter


> 
> Regards,
> Peter
> 
> > 
> > -Toke  
> 


  reply	other threads:[~2022-04-04 11:05 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-30 16:44 [PATCH v5.18] ath9k: Properly clear TX status area before reporting to mac80211 Toke Høiland-Jørgensen
2022-03-31  5:06 ` Bagas Sanjaya
2022-03-31  5:18   ` Bagas Sanjaya
2022-03-31  5:36   ` Kalle Valo
2022-03-31  8:35     ` Toke Høiland-Jørgensen
2022-03-31  9:31       ` Kalle Valo
2022-04-01  6:37 ` [v5.18] " Kalle Valo
2022-04-01 17:26 ` [PATCH v5.18] " Peter Seiderer
2022-04-02 12:00   ` Toke Høiland-Jørgensen
2022-04-02 14:33   ` Peter Seiderer
2022-04-02 15:11     ` Toke Høiland-Jørgensen
2022-04-02 16:19       ` Peter Seiderer
2022-04-04 11:04         ` Peter Seiderer [this message]
2022-04-04 16:03           ` Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220404130453.5ec6e045@gmx.net \
    --to=ps.report@gmx.net \
    --cc=bagasdotme@gmail.com \
    --cc=kvalo@kernel.org \
    --cc=linux-wireless@vger.kernel.org \
    --cc=toke@toke.dk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.