linux-wireless.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* lockdep warning in mac80211/tx.c, 4.13.16+ kernel, ath9k related
@ 2018-02-01  0:53 Ben Greear
  2018-02-01 22:23 ` Johannes Berg
  0 siblings, 1 reply; 7+ messages in thread
From: Ben Greear @ 2018-02-01  0:53 UTC (permalink / raw)
  To: linux-wireless@vger.kernel.org

Hello,

This first splat comes from this code:

static ieee80211_tx_result debug_noinline
ieee80211_tx_h_select_key(struct ieee80211_tx_data *tx)
{
	struct ieee80211_key *key;
	struct ieee80211_tx_info *info = IEEE80211_SKB_CB(tx->skb);
	struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)tx->skb->data;

	if (unlikely(info->flags & IEEE80211_TX_INTFL_DONT_ENCRYPT))
		tx->key = NULL;
	else if (tx->sta &&
### line 605 ###		 (key = rcu_dereference(tx->sta->ptk[tx->sta->ptk_idx])))
		tx->key = key;
	else if (ieee80211_is_group_privacy_action(tx->skb) &&
		(key = rcu_dereference(tx->sdata->default_multicast_key)))
		tx->key = key;
	else if (ieee80211_is_mgmt(hdr->frame_control) &&
		 is_multicast_ether_addr(hdr->addr1) &&
		 ieee80211_is_robust_mgmt_frame(tx->skb) &&
		 (key = rcu_dereference(tx->sdata->default_mgmt_key)))
		tx->key = key;
	else if (is_multicast_ether_addr(hdr->addr1) &&
		 (key = rcu_dereference(tx->sdata->default_multicast_key)))
		tx->key = key;
	else if (!is_multicast_ether_addr(hdr->addr1) &&
### line 619 ###		 (key = rcu_dereference(tx->sdata->default_unicast_key)))
		tx->key = key;
	else
		tx->key = NULL;

This ath9k has some local modifications, including removal of the airtime fairness logic
that was breaking my test case very quickly, so could be my fault of course.

Full tree is here:

https://github.com/greearb/linux-ct-4.13


Any idea why this might be complaining?

Test case is to bring up 200 virtual stations on each of 6 radios and then randomly
restart the stations and/or APs they are connected to.  I'm trying to shake out
stability bugs and such...


  30917 Jan 31 15:21:01 2u-6n kernel: =============================
  30918 Jan 31 15:21:01 2u-6n kernel: WARNING: suspicious RCU usage
  30919 Jan 31 15:21:01 2u-6n kernel: 4.13.16+ #2 Tainted: G        W  O
  30920 Jan 31 15:21:01 2u-6n kernel: -----------------------------
  30921 Jan 31 15:21:01 2u-6n kernel: /home/greearb/git/linux-4.13.dev.y/net/mac80211/tx.c:605 suspicious rcu_dereference_check() usage!
  30922 Jan 31 15:21:01 2u-6n kernel:
  30923                               other info that might help us debug this:
  30924 Jan 31 15:21:01 2u-6n kernel:
  30925                               rcu_scheduler_active = 2, debug_locks = 1
  30926 Jan 31 15:21:01 2u-6n kernel: 5 locks held by ip/19628:
  30927 Jan 31 15:21:01 2u-6n kernel:  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff818a40f6>] rtnetlink_rcv+0x16/0x30
  30928 Jan 31 15:21:01 2u-6n kernel:  #1:  (&wdev->mtx){+.+.+.}, at: [<ffffffffa04b0e4d>] cfg80211_leave+0x1d/0x40 [cfg80211]
  30929 Jan 31 15:21:01 2u-6n kernel:  #2:  (&local->sta_mtx){+.+.+.}, at: [<ffffffffa093999c>] __sta_info_flush+0x7c/0x170 [mac80211]
  30930 Jan 31 15:21:01 2u-6n kernel:  #3:  (&(&txq->axq_lock)->rlock){+.-...}, at: [<ffffffffa0485db6>] ath_tx_node_cleanup+0x66/0x160 [ath9k]
  30931 Jan 31 15:21:01 2u-6n kernel:  #4:  (&(&fq->lock)->rlock){+.-...}, at: [<ffffffffa096f865>] ieee80211_tx_dequeue+0x45/0xca0 [mac80211]
  30932 Jan 31 15:21:01 2u-6n kernel:
  30933                               stack backtrace:
  30934 Jan 31 15:21:01 2u-6n kernel: CPU: 1 PID: 19628 Comm: ip Tainted: G        W  O    4.13.16+ #2
  30935 Jan 31 15:21:01 2u-6n kernel: Hardware name: Iron_Systems,Inc CS-CAD-2U-A02/X10SRL-F, BIOS 2.0b 05/02/2017
  30936 Jan 31 15:21:01 2u-6n kernel: Call Trace:
  30937 Jan 31 15:21:01 2u-6n kernel:  dump_stack+0x85/0xc7
  30938 Jan 31 15:21:01 2u-6n kernel:  lockdep_rcu_suspicious+0xc5/0x100
  30939 Jan 31 15:21:01 2u-6n kernel:  ieee80211_tx_h_select_key+0x1c9/0x4e0 [mac80211]
  30940 Jan 31 15:21:01 2u-6n kernel:  ieee80211_tx_dequeue+0x376/0xca0 [mac80211]
  30941 Jan 31 15:21:01 2u-6n kernel:  ath_tid_dequeue+0x9c/0x110 [ath9k]
  30942 Jan 31 15:21:01 2u-6n kernel:  ath_tx_node_cleanup+0xa4/0x160 [ath9k]
  30943 Jan 31 15:21:01 2u-6n kernel:  ath9k_sta_state+0x6b/0x1e0 [ath9k]
  30944 Jan 31 15:21:01 2u-6n kernel:  drv_sta_state+0xad/0xa80 [mac80211]
  30945 Jan 31 15:21:01 2u-6n kernel:  __sta_info_destroy_part2+0x178/0x1d0 [mac80211]
  30946 Jan 31 15:21:01 2u-6n kernel:  __sta_info_flush+0xef/0x170 [mac80211]
  30947 Jan 31 15:21:01 2u-6n kernel:  ieee80211_set_disassoc+0xc6/0x400 [mac80211]
  30948 Jan 31 15:21:01 2u-6n kernel:  ieee80211_mgd_deauth+0x2f6/0x830 [mac80211]
  30949 Jan 31 15:21:01 2u-6n kernel:  ieee80211_deauth+0x13/0x20 [mac80211]
  30950 Jan 31 15:21:01 2u-6n kernel:  cfg80211_mlme_deauth+0x16b/0x3e0 [cfg80211]
  30951 Jan 31 15:21:01 2u-6n kernel:  cfg80211_mlme_down+0x6d/0xa0 [cfg80211]
  30952 Jan 31 15:21:01 2u-6n kernel:  cfg80211_disconnect+0x2f4/0x3f0 [cfg80211]
  30953 Jan 31 15:21:01 2u-6n kernel:  ? kfree+0x24d/0x2b0
  30954 Jan 31 15:21:01 2u-6n kernel:  __cfg80211_leave+0x134/0x190 [cfg80211]
  30955 Jan 31 15:21:01 2u-6n kernel:  cfg80211_leave+0x28/0x40 [cfg80211]
  30956 Jan 31 15:21:01 2u-6n kernel:  cfg80211_netdev_notifier_call+0x49c/0x820 [cfg80211]
  30957 Jan 31 15:21:01 2u-6n kernel:  ? lockdep_rtnl_is_held+0x15/0x20
  30958 Jan 31 15:21:01 2u-6n kernel:  ? addrconf_notify+0x6b/0xcc0 [ipv6]
  30959 Jan 31 15:21:01 2u-6n kernel:  ? packet_notifier+0xee/0x2a0
  30960 Jan 31 15:21:01 2u-6n kernel:  notifier_call_chain+0x45/0x70
  30961 Jan 31 15:21:01 2u-6n kernel:  raw_notifier_call_chain+0x11/0x20
  30962 Jan 31 15:21:01 2u-6n kernel:  call_netdevice_notifiers_info+0x30/0x60
  30963 Jan 31 15:21:01 2u-6n kernel:  __dev_close_many+0x54/0xe0
  30964 Jan 31 15:21:01 2u-6n kernel:  __dev_close+0x31/0x50
  30965 Jan 31 15:21:01 2u-6n kernel:  __dev_change_flags+0x98/0x160
  30966 Jan 31 15:21:01 2u-6n kernel:  dev_change_flags+0x24/0x60
  30967 Jan 31 15:21:01 2u-6n kernel:  do_setlink+0x367/0xcd0
  30968 Jan 31 15:21:01 2u-6n kernel:  ? mark_held_locks+0x6f/0xa0
  30969 Jan 31 15:21:01 2u-6n kernel:  ? get_page_from_freelist+0x1fd/0xc10
  30970 Jan 31 15:21:01 2u-6n kernel:  ? trace_hardirqs_on_caller+0x11f/0x190
  30971 Jan 31 15:21:01 2u-6n kernel:  ? nla_parse+0x36/0x150
  30972 Jan 31 15:21:01 2u-6n kernel:  rtnl_newlink+0x776/0x8e0
  30973 Jan 31 15:21:01 2u-6n kernel:  ? __wake_up+0x1e/0x50
30974 Jan 31 15:21:01 2u-6n kernel:  ? ns_capable_common+0x75/0x90
  30975 Jan 31 15:21:01 2u-6n kernel:  ? ns_capable+0xe/0x10
  30976 Jan 31 15:21:01 2u-6n kernel:  rtnetlink_rcv_msg+0x85/0x1f0
  30977 Jan 31 15:21:01 2u-6n kernel:  ? lock_acquire+0xac/0x200
  30978 Jan 31 15:21:01 2u-6n kernel:  ? rtnl_newlink+0x8e0/0x8e0
  30979 Jan 31 15:21:01 2u-6n kernel:  netlink_rcv_skb+0xe2/0x110
  30980 Jan 31 15:21:01 2u-6n kernel:  rtnetlink_rcv+0x25/0x30
  30981 Jan 31 15:21:01 2u-6n kernel:  netlink_unicast+0x1cb/0x2e0
  30982 Jan 31 15:21:01 2u-6n kernel:  netlink_sendmsg+0x2c6/0x3a0
  30983 Jan 31 15:21:01 2u-6n kernel:  sock_sendmsg+0x33/0x40
  30984 Jan 31 15:21:01 2u-6n kernel:  ___sys_sendmsg+0x2f3/0x300
  30985 Jan 31 15:21:01 2u-6n kernel:  ? lock_acquire+0xac/0x200
  30986 Jan 31 15:21:01 2u-6n kernel:  ? __handle_mm_fault+0x5e8/0xfb0
  30987 Jan 31 15:21:01 2u-6n kernel:  ? getnstimeofday64+0x9/0x20
  30988 Jan 31 15:21:01 2u-6n kernel:  ? trace_hardirqs_on_caller+0x11f/0x190
  30989 Jan 31 15:21:01 2u-6n kernel:  __sys_sendmsg+0x40/0x70
  30990 Jan 31 15:21:01 2u-6n kernel:  ? __sys_sendmsg+0x40/0x70
  30991 Jan 31 15:21:01 2u-6n kernel:  SyS_sendmsg+0xd/0x20
  30992 Jan 31 15:21:01 2u-6n kernel:  do_syscall_64+0x64/0x140
  30993 Jan 31 15:21:01 2u-6n kernel:  entry_SYSCALL64_slow_path+0x25/0x25
  30994 Jan 31 15:21:01 2u-6n kernel: RIP: 0033:0x7f1e8363a150
  30995 Jan 31 15:21:01 2u-6n kernel: RSP: 002b:00007ffee8bb8bf8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
  30996 Jan 31 15:21:01 2u-6n kernel: RAX: ffffffffffffffda RBX: 000000005a724f5d RCX: 00007f1e8363a150
  30997 Jan 31 15:21:01 2u-6n kernel: RDX: 0000000000000000 RSI: 00007ffee8bb8c70 RDI: 0000000000000004
  30998 Jan 31 15:21:01 2u-6n kernel: RBP: 00007ffee8bb8c70 R08: 0000000000000001 R09: 0000000000000000
  30999 Jan 31 15:21:01 2u-6n kernel: R10: 00000000000005e7 R11: 0000000000000246 R12: 00007ffee8bb8cb0
  31000 Jan 31 15:21:01 2u-6n kernel: R13: 000000000066b460 R14: 00007ffee8bc0d20 R15: 0000000000000000
  31001 Jan 31 15:21:01 2u-6n kernel:
  31002 Jan 31 15:21:01 2u-6n kernel: =============================
  31003 Jan 31 15:21:01 2u-6n kernel: WARNING: suspicious RCU usage
  31004 Jan 31 15:21:01 2u-6n kernel: 4.13.16+ #2 Tainted: G        W  O
  31005 Jan 31 15:21:01 2u-6n kernel: -----------------------------
  31006 Jan 31 15:21:01 2u-6n kernel: /home/greearb/git/linux-4.13.dev.y/net/mac80211/tx.c:619 suspicious rcu_dereference_check() usage!
  31007 Jan 31 15:21:01 2u-6n kernel:
  31008                               other info that might help us debug this:
  31009 Jan 31 15:21:01 2u-6n kernel:
  31010                               rcu_scheduler_active = 2, debug_locks = 1
  31011 Jan 31 15:21:01 2u-6n kernel: 5 locks held by ip/19628:
  31012 Jan 31 15:21:01 2u-6n kernel:  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff818a40f6>] rtnetlink_rcv+0x16/0x30
  31013 Jan 31 15:21:01 2u-6n kernel:  #1:  (&wdev->mtx){+.+.+.}, at: [<ffffffffa04b0e4d>] cfg80211_leave+0x1d/0x40 [cfg80211]
  31014 Jan 31 15:21:01 2u-6n kernel:  #2:  (&local->sta_mtx){+.+.+.}, at: [<ffffffffa093999c>] __sta_info_flush+0x7c/0x170 [mac80211]
  31015 Jan 31 15:21:01 2u-6n kernel:  #3:  (&(&txq->axq_lock)->rlock){+.-...}, at: [<ffffffffa0485db6>] ath_tx_node_cleanup+0x66/0x160 [ath9k]
  31016 Jan 31 15:21:01 2u-6n kernel:  #4:  (&(&fq->lock)->rlock){+.-...}, at: [<ffffffffa096f865>] ieee80211_tx_dequeue+0x45/0xca0 [mac80211]
  31017 Jan 31 15:21:01 2u-6n kernel:
  31018                               stack backtrace:
  31019 Jan 31 15:21:01 2u-6n kernel: CPU: 1 PID: 19628 Comm: ip Tainted: G        W  O    4.13.16+ #2
  31020 Jan 31 15:21:01 2u-6n kernel: Hardware name: Iron_Systems,Inc CS-CAD-2U-A02/X10SRL-F, BIOS 2.0b 05/02/2017
  31021 Jan 31 15:21:01 2u-6n kernel: Call Trace:
  31022 Jan 31 15:21:01 2u-6n kernel:  dump_stack+0x85/0xc7
  31023 Jan 31 15:21:01 2u-6n kernel:  lockdep_rcu_suspicious+0xc5/0x100
  31024 Jan 31 15:21:01 2u-6n kernel:  ieee80211_tx_h_select_key+0x431/0x4e0 [mac80211]
  31025 Jan 31 15:21:01 2u-6n kernel:  ieee80211_tx_dequeue+0x376/0xca0 [mac80211]
  31026 Jan 31 15:21:01 2u-6n kernel:  ath_tid_dequeue+0x9c/0x110 [ath9k]
  31027 Jan 31 15:21:01 2u-6n kernel:  ath_tx_node_cleanup+0xa4/0x160 [ath9k]
  31028 Jan 31 15:21:01 2u-6n kernel:  ath9k_sta_state+0x6b/0x1e0 [ath9k]
  31029 Jan 31 15:21:01 2u-6n kernel:  drv_sta_state+0xad/0xa80 [mac80211]
  31030 Jan 31 15:21:01 2u-6n kernel:  __sta_info_destroy_part2+0x178/0x1d0 [mac80211]
31031 Jan 31 15:21:01 2u-6n kernel:  __sta_info_flush+0xef/0x170 [mac80211]
  31032 Jan 31 15:21:01 2u-6n kernel:  ieee80211_set_disassoc+0xc6/0x400 [mac80211]
  31033 Jan 31 15:21:01 2u-6n kernel:  ieee80211_mgd_deauth+0x2f6/0x830 [mac80211]
  31034 Jan 31 15:21:01 2u-6n kernel:  ieee80211_deauth+0x13/0x20 [mac80211]
  31035 Jan 31 15:21:01 2u-6n kernel:  cfg80211_mlme_deauth+0x16b/0x3e0 [cfg80211]
  31036 Jan 31 15:21:01 2u-6n kernel:  cfg80211_mlme_down+0x6d/0xa0 [cfg80211]
  31037 Jan 31 15:21:01 2u-6n kernel:  cfg80211_disconnect+0x2f4/0x3f0 [cfg80211]
  31038 Jan 31 15:21:01 2u-6n kernel:  ? kfree+0x24d/0x2b0
  31039 Jan 31 15:21:01 2u-6n kernel:  __cfg80211_leave+0x134/0x190 [cfg80211]
  31040 Jan 31 15:21:01 2u-6n kernel:  cfg80211_leave+0x28/0x40 [cfg80211]
  31041 Jan 31 15:21:01 2u-6n kernel:  cfg80211_netdev_notifier_call+0x49c/0x820 [cfg80211]
  31042 Jan 31 15:21:01 2u-6n kernel:  ? lockdep_rtnl_is_held+0x15/0x20
  31043 Jan 31 15:21:01 2u-6n kernel:  ? addrconf_notify+0x6b/0xcc0 [ipv6]
  31044 Jan 31 15:21:01 2u-6n kernel:  ? packet_notifier+0xee/0x2a0
  31045 Jan 31 15:21:01 2u-6n kernel:  notifier_call_chain+0x45/0x70
  31046 Jan 31 15:21:01 2u-6n kernel:  raw_notifier_call_chain+0x11/0x20
  31047 Jan 31 15:21:01 2u-6n kernel:  call_netdevice_notifiers_info+0x30/0x60
  31048 Jan 31 15:21:01 2u-6n kernel:  __dev_close_many+0x54/0xe0
  31049 Jan 31 15:21:01 2u-6n kernel:  __dev_close+0x31/0x50
  31050 Jan 31 15:21:01 2u-6n kernel:  __dev_change_flags+0x98/0x160
  31051 Jan 31 15:21:01 2u-6n kernel:  dev_change_flags+0x24/0x60
  31052 Jan 31 15:21:01 2u-6n kernel:  do_setlink+0x367/0xcd0
  31053 Jan 31 15:21:01 2u-6n kernel:  ? mark_held_locks+0x6f/0xa0
  31054 Jan 31 15:21:01 2u-6n kernel:  ? get_page_from_freelist+0x1fd/0xc10
  31055 Jan 31 15:21:01 2u-6n kernel:  ? trace_hardirqs_on_caller+0x11f/0x190
  31056 Jan 31 15:21:01 2u-6n kernel:  ? nla_parse+0x36/0x150
  31057 Jan 31 15:21:01 2u-6n kernel:  rtnl_newlink+0x776/0x8e0
  31058 Jan 31 15:21:01 2u-6n kernel:  ? __wake_up+0x1e/0x50
  31059 Jan 31 15:21:01 2u-6n kernel:  ? ns_capable_common+0x75/0x90
  31060 Jan 31 15:21:01 2u-6n kernel:  ? ns_capable+0xe/0x10
  31061 Jan 31 15:21:01 2u-6n kernel:  rtnetlink_rcv_msg+0x85/0x1f0
  31062 Jan 31 15:21:01 2u-6n kernel:  ? lock_acquire+0xac/0x200
  31063 Jan 31 15:21:01 2u-6n kernel:  ? rtnl_newlink+0x8e0/0x8e0
  31064 Jan 31 15:21:01 2u-6n kernel:  netlink_rcv_skb+0xe2/0x110
  31065 Jan 31 15:21:01 2u-6n kernel:  rtnetlink_rcv+0x25/0x30
  31066 Jan 31 15:21:01 2u-6n kernel:  netlink_unicast+0x1cb/0x2e0
  31067 Jan 31 15:21:01 2u-6n kernel:  netlink_sendmsg+0x2c6/0x3a0
  31068 Jan 31 15:21:01 2u-6n kernel:  sock_sendmsg+0x33/0x40
  31069 Jan 31 15:21:01 2u-6n kernel:  ___sys_sendmsg+0x2f3/0x300
  31070 Jan 31 15:21:01 2u-6n kernel:  ? lock_acquire+0xac/0x200
  31071 Jan 31 15:21:01 2u-6n kernel:  ? __handle_mm_fault+0x5e8/0xfb0
  31072 Jan 31 15:21:01 2u-6n kernel:  ? getnstimeofday64+0x9/0x20
  31073 Jan 31 15:21:01 2u-6n kernel:  ? trace_hardirqs_on_caller+0x11f/0x190
  31074 Jan 31 15:21:01 2u-6n kernel:  __sys_sendmsg+0x40/0x70
  31075 Jan 31 15:21:01 2u-6n kernel:  ? __sys_sendmsg+0x40/0x70
  31076 Jan 31 15:21:01 2u-6n kernel:  SyS_sendmsg+0xd/0x20
  31077 Jan 31 15:21:01 2u-6n kernel:  do_syscall_64+0x64/0x140
  31078 Jan 31 15:21:01 2u-6n kernel:  entry_SYSCALL64_slow_path+0x25/0x25
  31079 Jan 31 15:21:01 2u-6n kernel: RIP: 0033:0x7f1e8363a150
  31080 Jan 31 15:21:01 2u-6n kernel: RSP: 002b:00007ffee8bb8bf8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
  31081 Jan 31 15:21:01 2u-6n kernel: RAX: ffffffffffffffda RBX: 000000005a724f5d RCX: 00007f1e8363a150
  31082 Jan 31 15:21:01 2u-6n kernel: RDX: 0000000000000000 RSI: 00007ffee8bb8c70 RDI: 0000000000000004
  31083 Jan 31 15:21:01 2u-6n kernel: RBP: 00007ffee8bb8c70 R08: 0000000000000001 R09: 0000000000000000
  31084 Jan 31 15:21:01 2u-6n kernel: R10: 00000000000005e7 R11: 0000000000000246 R12: 00007ffee8bb8cb0
  31085 Jan 31 15:21:01 2u-6n kernel: R13: 000000000066b460 R14: 00007ffee8bc0d20 R15: 0000000000000000


Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: lockdep warning in mac80211/tx.c, 4.13.16+ kernel, ath9k related
  2018-02-01  0:53 lockdep warning in mac80211/tx.c, 4.13.16+ kernel, ath9k related Ben Greear
@ 2018-02-01 22:23 ` Johannes Berg
  2018-02-01 22:33   ` Ben Greear
  0 siblings, 1 reply; 7+ messages in thread
From: Johannes Berg @ 2018-02-01 22:23 UTC (permalink / raw)
  To: Ben Greear, linux-wireless@vger.kernel.org

On Wed, 2018-01-31 at 16:53 -0800, Ben Greear wrote:
> 
> Any idea why this might be complaining?

>   30917 Jan 31 15:21:01 2u-6n kernel: =============================
>   30918 Jan 31 15:21:01 2u-6n kernel: WARNING: suspicious RCU usage

well, that's why? :-)

>   30933                               stack backtrace:
>   30934 Jan 31 15:21:01 2u-6n kernel: CPU: 1 PID: 19628 Comm: ip Tainted: G        W  O    4.13.16+ #2
>   30935 Jan 31 15:21:01 2u-6n kernel: Hardware name: Iron_Systems,Inc CS-CAD-2U-A02/X10SRL-F, BIOS 2.0b 05/02/2017
>   30936 Jan 31 15:21:01 2u-6n kernel: Call Trace:
>   30937 Jan 31 15:21:01 2u-6n kernel:  dump_stack+0x85/0xc7
>   30938 Jan 31 15:21:01 2u-6n kernel:  lockdep_rcu_suspicious+0xc5/0x100
>   30939 Jan 31 15:21:01 2u-6n kernel:  ieee80211_tx_h_select_key+0x1c9/0x4e0 [mac80211]
>   30940 Jan 31 15:21:01 2u-6n kernel:  ieee80211_tx_dequeue+0x376/0xca0 [mac80211]
>   30941 Jan 31 15:21:01 2u-6n kernel:  ath_tid_dequeue+0x9c/0x110 [ath9k]
>   30942 Jan 31 15:21:01 2u-6n kernel:  ath_tx_node_cleanup+0xa4/0x160 [ath9k]
>   30943 Jan 31 15:21:01 2u-6n kernel:  ath9k_sta_state+0x6b/0x1e0 [ath9k]
>   30944 Jan 31 15:21:01 2u-6n kernel:  drv_sta_state+0xad/0xa80 [mac80211]

I'd argue your hacks are at fault - somewhere in your modifications
around this call stack you lost the necessary rcu_read_lock()
protection.

joahnnes

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: lockdep warning in mac80211/tx.c, 4.13.16+ kernel, ath9k related
  2018-02-01 22:23 ` Johannes Berg
@ 2018-02-01 22:33   ` Ben Greear
  2018-02-01 22:40     ` Johannes Berg
  0 siblings, 1 reply; 7+ messages in thread
From: Ben Greear @ 2018-02-01 22:33 UTC (permalink / raw)
  To: Johannes Berg, linux-wireless@vger.kernel.org

On 02/01/2018 02:23 PM, Johannes Berg wrote:
> On Wed, 2018-01-31 at 16:53 -0800, Ben Greear wrote:
>>
>> Any idea why this might be complaining?
>
>>   30917 Jan 31 15:21:01 2u-6n kernel: =============================
>>   30918 Jan 31 15:21:01 2u-6n kernel: WARNING: suspicious RCU usage
>
> well, that's why? :-)

All of RCU is suspicious as far as my feeble brain is concerned...I was wondering *why*
it was suspicious.  I see you answered below... :)

>
>>   30933                               stack backtrace:
>>   30934 Jan 31 15:21:01 2u-6n kernel: CPU: 1 PID: 19628 Comm: ip Tainted: G        W  O    4.13.16+ #2
>>   30935 Jan 31 15:21:01 2u-6n kernel: Hardware name: Iron_Systems,Inc CS-CAD-2U-A02/X10SRL-F, BIOS 2.0b 05/02/2017
>>   30936 Jan 31 15:21:01 2u-6n kernel: Call Trace:
>>   30937 Jan 31 15:21:01 2u-6n kernel:  dump_stack+0x85/0xc7
>>   30938 Jan 31 15:21:01 2u-6n kernel:  lockdep_rcu_suspicious+0xc5/0x100
>>   30939 Jan 31 15:21:01 2u-6n kernel:  ieee80211_tx_h_select_key+0x1c9/0x4e0 [mac80211]
>>   30940 Jan 31 15:21:01 2u-6n kernel:  ieee80211_tx_dequeue+0x376/0xca0 [mac80211]
>>   30941 Jan 31 15:21:01 2u-6n kernel:  ath_tid_dequeue+0x9c/0x110 [ath9k]
>>   30942 Jan 31 15:21:01 2u-6n kernel:  ath_tx_node_cleanup+0xa4/0x160 [ath9k]
>>   30943 Jan 31 15:21:01 2u-6n kernel:  ath9k_sta_state+0x6b/0x1e0 [ath9k]
>>   30944 Jan 31 15:21:01 2u-6n kernel:  drv_sta_state+0xad/0xa80 [mac80211]
>
> I'd argue your hacks are at fault - somewhere in your modifications
> around this call stack you lost the necessary rcu_read_lock()
> protection.

Ok, I'll go looking for bugs like that.

Thanks!
--Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: lockdep warning in mac80211/tx.c, 4.13.16+ kernel, ath9k related
  2018-02-01 22:33   ` Ben Greear
@ 2018-02-01 22:40     ` Johannes Berg
  2018-02-01 22:47       ` Johannes Berg
  0 siblings, 1 reply; 7+ messages in thread
From: Johannes Berg @ 2018-02-01 22:40 UTC (permalink / raw)
  To: Ben Greear, linux-wireless@vger.kernel.org

On Thu, 2018-02-01 at 14:33 -0800, Ben Greear wrote:
> 
> All of RCU is suspicious

Heh.

The code does a plain rcu_dereference(), no locking other than
rcu_read_lock() involved.

On second thought though, I'm not convinced that your modifications
caused the problem.

Given your call stack, we'd expect rcu_read_lock() somewhere around
ath_tid_dequeue (or its caller(s)), since ieee80211_tx_dequeue clearly
requires it.

Normally, ieee80211_tx_dequeue() is called from various places that
probably come from mac80211 and already hold the rcu_read_lock(), e.g.
the wake_tx_queue op.

In this case, you're coming from drv_sta_state, so not sure why the
driver thinks it's OK to call the dequeue there.

johannes

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: lockdep warning in mac80211/tx.c, 4.13.16+ kernel, ath9k related
  2018-02-01 22:40     ` Johannes Berg
@ 2018-02-01 22:47       ` Johannes Berg
  2018-02-01 23:21         ` Ben Greear
  0 siblings, 1 reply; 7+ messages in thread
From: Johannes Berg @ 2018-02-01 22:47 UTC (permalink / raw)
  To: Ben Greear, linux-wireless@vger.kernel.org

On Thu, 2018-02-01 at 23:40 +0100, Johannes Berg wrote:
> 
> The code does a plain rcu_dereference(), no locking other than
> rcu_read_lock() involved.
> 
> On second thought though, I'm not convinced that your modifications
> caused the problem.
> 
> Given your call stack, we'd expect rcu_read_lock() somewhere around
> ath_tid_dequeue (or its caller(s)), since ieee80211_tx_dequeue clearly
> requires it.
> 
> Normally, ieee80211_tx_dequeue() is called from various places that
> probably come from mac80211 and already hold the rcu_read_lock(), e.g.
> the wake_tx_queue op.
> 
> In this case, you're coming from drv_sta_state, so not sure why the
> driver thinks it's OK to call the dequeue there.

Just to clarify - it could just be that in the "normal" case, when a
station dies, there's nothing on the queues - so the dequeue just
doesn't do anything and never goes into the code path where the
rcu_dereference() is, hence it might be a bug in mainline that just
never triggers in ordinary scenarios.

johannes

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: lockdep warning in mac80211/tx.c, 4.13.16+ kernel, ath9k related
  2018-02-01 22:47       ` Johannes Berg
@ 2018-02-01 23:21         ` Ben Greear
  2018-02-02 10:19           ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 7+ messages in thread
From: Ben Greear @ 2018-02-01 23:21 UTC (permalink / raw)
  To: Johannes Berg, linux-wireless@vger.kernel.org

On 02/01/2018 02:47 PM, Johannes Berg wrote:
> On Thu, 2018-02-01 at 23:40 +0100, Johannes Berg wrote:
>>
>> The code does a plain rcu_dereference(), no locking other than
>> rcu_read_lock() involved.
>>
>> On second thought though, I'm not convinced that your modifications
>> caused the problem.
>>
>> Given your call stack, we'd expect rcu_read_lock() somewhere around
>> ath_tid_dequeue (or its caller(s)), since ieee80211_tx_dequeue clearly
>> requires it.
>>
>> Normally, ieee80211_tx_dequeue() is called from various places that
>> probably come from mac80211 and already hold the rcu_read_lock(), e.g.
>> the wake_tx_queue op.
>>
>> In this case, you're coming from drv_sta_state, so not sure why the
>> driver thinks it's OK to call the dequeue there.
>
> Just to clarify - it could just be that in the "normal" case, when a
> station dies, there's nothing on the queues - so the dequeue just
> doesn't do anything and never goes into the code path where the
> rcu_dereference() is, hence it might be a bug in mainline that just
> never triggers in ordinary scenarios.

It looks like the code in ath9k has not been changed in that area for
some time.

Would adding rcu_read_lock in drv_sta_state() be a possibility?

Thanks,
Ben

>
> johannes
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: lockdep warning in mac80211/tx.c, 4.13.16+ kernel, ath9k related
  2018-02-01 23:21         ` Ben Greear
@ 2018-02-02 10:19           ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 7+ messages in thread
From: Toke Høiland-Jørgensen @ 2018-02-02 10:19 UTC (permalink / raw)
  To: Ben Greear, Johannes Berg, linux-wireless@vger.kernel.org

Ben Greear <greearb@candelatech.com> writes:

> On 02/01/2018 02:47 PM, Johannes Berg wrote:
>> On Thu, 2018-02-01 at 23:40 +0100, Johannes Berg wrote:
>>>
>>> The code does a plain rcu_dereference(), no locking other than
>>> rcu_read_lock() involved.
>>>
>>> On second thought though, I'm not convinced that your modifications
>>> caused the problem.
>>>
>>> Given your call stack, we'd expect rcu_read_lock() somewhere around
>>> ath_tid_dequeue (or its caller(s)), since ieee80211_tx_dequeue clearly
>>> requires it.
>>>
>>> Normally, ieee80211_tx_dequeue() is called from various places that
>>> probably come from mac80211 and already hold the rcu_read_lock(), e.g.
>>> the wake_tx_queue op.
>>>
>>> In this case, you're coming from drv_sta_state, so not sure why the
>>> driver thinks it's OK to call the dequeue there.
>>
>> Just to clarify - it could just be that in the "normal" case, when a
>> station dies, there's nothing on the queues - so the dequeue just
>> doesn't do anything and never goes into the code path where the
>> rcu_dereference() is, hence it might be a bug in mainline that just
>> never triggers in ordinary scenarios.
>
> It looks like the code in ath9k has not been changed in that area for
> some time.

Hmm, I think the issue here is that after the switch to mac80211 txqs,
the driver is now draining mac80211 queues, whereas before it was only
draining its own driver-internal queues (which are protected by the
ath_txq_lock() that ath_tx_node_cleanup() does grab). And when the
switch to the mac80211 queues happened, a call to rcu_read_lock() should
have been added. But, well, I had no idea this was needed until just
now, so obviously I didn't add that... :)

> Would adding rcu_read_lock in drv_sta_state() be a possibility?

Think it should probably just be added in ath_tx_node_cleanup()? Can
send a patch...

-Toke

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-02-02 10:19 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-02-01  0:53 lockdep warning in mac80211/tx.c, 4.13.16+ kernel, ath9k related Ben Greear
2018-02-01 22:23 ` Johannes Berg
2018-02-01 22:33   ` Ben Greear
2018-02-01 22:40     ` Johannes Berg
2018-02-01 22:47       ` Johannes Berg
2018-02-01 23:21         ` Ben Greear
2018-02-02 10:19           ` Toke Høiland-Jørgensen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).