Linux wireless drivers development
* Re: ath10k firmware crashes in mesh mode on QCA9880
From: Benjamin Morgan @ 2016-12-13 18:42 UTC
  To: Nagarajan, Ashok Raj, Mohammed Shafi Shajakhan
  Cc: agreen@cococorp.com, lede-dev@lists.infradead.org,
	linux-wireless@vger.kernel.org, ath10k@lists.infradead.org
In-Reply-To: <58472E7B.7090603@cococorp.com>

Just tested the latest 10.2.4.70.59-2 firmware and it still crashes with 
wpa_supplicant encrypted mesh =(

[   85.201440] ath10k_pci 0000:01:00.0: firmware crashed! (uuid 
b7f44483-0488-46af-8dff-db88f4b56327)
[   85.210573] ath10k_pci 0000:01:00.0: qca988x hw2.0 target 0x4100016c 
chip_id 0x043202ff sub 0000:0000
[   85.219940] ath10k_pci 0000:01:00.0: kconfig debug 1 debugfs 1 
tracing 0 dfs 1 testmode 1
[   85.233034] ath10k_pci 0000:01:00.0: firmware ver 10.2.4.70.59-2 api 
5 features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 4159f498
[   85.245177] ath10k_pci 0000:01:00.0: board_file api 1 bmi_id N/A 
crc32 bebc7c08
[   85.252592] ath10k_pci 0000:01:00.0: htt-ver 2.1 wmi-op 5 htt-op 2 
cal file max-sta 128 raw 0 hwcrypto 1
[   85.264235] ath10k_pci 0000:01:00.0: firmware register dump:
[   85.269992] ath10k_pci 0000:01:00.0: [00]: 0x4100016C 0x000015B3 
0x009A45AF 0x00955B31
[   85.278031] ath10k_pci 0000:01:00.0: [04]: 0x009A45AF 0x00060130 
0x00000002 0x00439E98
[   85.286078] ath10k_pci 0000:01:00.0: [08]: 0x0044110C 0x00442074 
0x00407120 0x004436CC
[   85.294107] ath10k_pci 0000:01:00.0: [12]: 0x00000009 0x00000000 
0x009A3550 0x009A355E
[   85.302152] ath10k_pci 0000:01:00.0: [16]: 0x00958080 0x0094085D 
0x00000000 0x00000000
[   85.310195] ath10k_pci 0000:01:00.0: [20]: 0x409A45AF 0x0040AAC4 
0x0040AC60 0x0040AC09
[   85.318239] ath10k_pci 0000:01:00.0: [24]: 0x809A44F2 0x0040AB24 
0x00400000 0xC09A45AF
[   85.326282] ath10k_pci 0000:01:00.0: [28]: 0x809A3A16 0x0040AB84 
0x0044110C 0x00442074
[   85.334314] ath10k_pci 0000:01:00.0: [32]: 0x809A601A 0x0040ABB4 
0x0044110C 0x00407120
[   85.342350] ath10k_pci 0000:01:00.0: [36]: 0x809A2EA4 0x0040ABF4 
0x0040AC14 0x00001580
[   85.350393] ath10k_pci 0000:01:00.0: [40]: 0x80990F63 0x0040AD04 
0x009C6458 0x004436CC
[   85.358437] ath10k_pci 0000:01:00.0: [44]: 0x80998520 0x0040AD64 
0x004208FC 0x00439E4C
[   85.366479] ath10k_pci 0000:01:00.0: [48]: 0x8099AEA5 0x0040AD84 
0x004208FC 0x00425AAC
[   85.374512] ath10k_pci 0000:01:00.0: [52]: 0x809BFC39 0x0040AEE4 
0x00424FE8 0x00000002
[   85.382548] ath10k_pci 0000:01:00.0: [56]: 0x80940F18 0x0040AF14 
0x00000004 0x004039D0
[   85.487067] ieee80211 phy0: Hardware restart was requested
[   85.492701] ath10k_pci 0000:01:00.0: wmi disable pktlog

Any new leads on tracking down this issue?

~Benjamin


On 12/06/2016 01:32 PM, Benjamin Morgan wrote:
> 1. Yes, this appears to happen every time a unicast packet with 
> wpa_supplicant encryption in VHT80 mode is received. I haven't seen a 
> successful ping-pong pair.
> 2. We tried with 10.2.4.70.42-2 firmware and still saw crashes.
> 3. We ran our experiment again with extra debugging turned on.
>     Node A: 18:A6:F7:23:6E:66 | 10.230.5.41
>     Node B: 18:A6:F7:26:0F:21 | 10.230.5.42
>     The ping command we ran on Node A was 'ping -s 1500 -i 
> 0.1 10.230.5.42'
>     Here is the dmesg log from Node B.
>
> [ 5413.478170] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5413.503954] ath10k_pci 0000:01:00.0: scan event bss channel type 4 
> reason 3 freq 5825 req_id 40961 scan_id 40960 vdev_id 0 state running (2)
> [ 5413.503985] ath10k_pci 0000:01:00.0: chan info err_code 0 freq 5825 
> cmd_flags 1 noise_floor -105 rx_clear_count 7692807 cycle_count 312271423
> [ 5413.504029] ath10k_pci 0000:01:00.0: scan event completed type 2 
> reason 0 freq 5825 req_id 40961 scan_id 40960 vdev_id 0 state running (2)
> [ 5413.525868] ath10k_pci 0000:01:00.0: wmi vdev install key idx 1 
> cipher 4 len 16
> [ 5413.526014] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 31 
> value 1
> [ 5413.526193] ath10k_pci 0000:01:00.0: mac vdev 0 set keyidx 1
> [ 5413.526216] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 31 
> value 1
> [ 5413.526532] ath10k_pci 0000:01:00.0: mac chanctx add freq 5180 
> width 3 ptr 86db29b0
> [ 5413.526556] ath10k_pci 0000:01:00.0: mac monitor recalc started? 0 
> needed? 0 allowed? 1
> [ 5413.526574] ath10k_pci 0000:01:00.0: mac chanctx assign ptr 
> 86db29b0 vdev_id 0
> [ 5413.526592] ath10k_pci 0000:01:00.0: mac vdev 0 start center_freq 
> 5180 phymode 11ac-vht80
> [ 5413.526616] ath10k_pci 0000:01:00.0: wmi vdev start id 0x0 flags: 
> 0x0, freq 5180, mode 10, ch_flags: 0xA000000, max_power: 46
> [ 5413.533099] ath10k_pci 0000:01:00.0: WMI_VDEV_START_RESP_EVENTID
> [ 5413.533148] ath10k_pci 0000:01:00.0: mac vdev_id 0 txpower 23
> [ 5413.533163] ath10k_pci 0000:01:00.0: mac txpower 23
> [ 5413.533180] ath10k_pci 0000:01:00.0: wmi pdev set param 3 value 46
> [ 5413.533247] ath10k_pci 0000:01:00.0: wmi pdev set param 4 value 46
> [ 5413.533295] ath10k_pci 0000:01:00.0: mac chanctx change freq 5180 
> width 3 ptr 86db29b0 changed 10
> [ 5413.533318] ath10k_pci 0000:01:00.0: mac chanctx change freq 5180 
> width 3 ptr 86db29b0 changed 2
> [ 5413.533337] ath10k_pci 0000:01:00.0: mac monitor recalc started? 0 
> needed? 1 allowed? 1
> [ 5413.533357] ath10k_pci 0000:01:00.0: WMI vdev create: id 1 type 4 
> subtype 0 macaddr 18:a6:f7:26:0f:21
> [ 5413.533412] ath10k_pci 0000:01:00.0: mac monitor vdev 1 created
> [ 5413.533463] ath10k_pci 0000:01:00.0: wmi vdev start id 0x1 flags: 
> 0x0, freq 5180, mode 10, ch_flags: 0xA000000, max_power: 46
> [ 5413.937652] ath10k_pci 0000:01:00.0: wmi event debug mesg len 152
> [ 5413.978273] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5414.478363] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5414.527015] ath10k_pci 0000:01:00.0: WMI_VDEV_START_RESP_EVENTID
> [ 5414.527067] ath10k_pci 0000:01:00.0: wmi mgmt vdev up id 0x1 assoc 
> id 0 bssid 18:a6:f7:26:0f:21
> [ 5414.527121] ath10k_pci 0000:01:00.0: mac monitor vdev 1 started
> [ 5414.527165] ath10k_pci 0000:01:00.0: mac monitor started
> [ 5414.527216] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 3 
> value 1000
> [ 5414.527262] ath10k_pci 0000:01:00.0: mac vdev 0 beacon_interval 1000
> [ 5414.527278] ath10k_pci 0000:01:00.0: vdev 0 set beacon tx mode to 
> staggered
> [ 5414.527294] ath10k_pci 0000:01:00.0: wmi pdev set param 7 value 0
> [ 5414.527314] ath10k_pci 0000:01:00.0: mac vdev 0 dtim_period 2
> [ 5414.527330] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 13 
> value 2
> [ 5414.527457] ath10k_pci 0000:01:00.0: wmi mgmt vdev up id 0x0 assoc 
> id 0 bssid 00:00:00:00:00:00
> [ 5414.527501] ath10k_pci 0000:01:00.0: mac vdev 0 up
> [ 5414.527564] ath10k_pci 0000:01:00.0: WMI_TBTTOFFSET_UPDATE_EVENTID
> [ 5414.541090] ath10k_pci 0000:01:00.0: mac monitor recalc started? 1 
> needed? 1 allowed? 1
> [ 5414.978454] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5415.478548] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5415.978649] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5416.445280] ath10k_pci 0000:01:00.0: mac monitor recalc started? 1 
> needed? 1 allowed? 1
> [ 5416.478761] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5416.978879] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5417.478985] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5417.979081] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5418.479190] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5418.979301] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5419.479403] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5419.979551] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5420.479643] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5420.979746] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5421.479841] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5421.979940] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5422.480288] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5422.980386] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5423.480490] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5423.980600] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5424.480702] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5424.971969] ath10k_pci 0000:01:00.0: mac vdev 0 peer create 
> 18:a6:f7:23:6e:66 (new sta) sta 1 / 128 peer 2 / 144
> [ 5424.972000] ath10k_pci 0000:01:00.0: wmi peer create vdev_id 0 
> peer_addr 18:a6:f7:23:6e:66
> [ 5424.975107] ath10k_pci 0000:01:00.0: vdev 0 set beacon tx mode to 
> staggered
> [ 5424.975134] ath10k_pci 0000:01:00.0: wmi pdev set param 7 value 0
> [ 5424.975219] ath10k_pci 0000:01:00.0: mac vdev 0 dtim_period 2
> [ 5424.975238] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 13 
> value 2
> [ 5424.980787] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5425.204468] ath10k_pci 0000:01:00.0: mac sta 18:a6:f7:23:6e:66 
> associated
> [ 5425.204531] ath10k_pci 0000:01:00.0: mac ht peer 18:a6:f7:23:6e:66 
> mcs cnt 24 nss 3
> [ 5425.204548] ath10k_pci 0000:01:00.0: mac peer 18:a6:f7:23:6e:66 qos 1
> [ 5425.204563] ath10k_pci 0000:01:00.0: mac peer 18:a6:f7:23:6e:66 
> phymode 11na-ht40
> [ 5425.204585] ath10k_pci 0000:01:00.0: wmi peer assoc vdev 0 addr 
> 18:a6:f7:23:6e:66 (new)
> [ 5425.204614] ath10k_pci 0000:01:00.0: wmi vdev 0 peer 
> 0x18:a6:f7:23:6e:66 set param 1 value 0
> [ 5425.205376] ath10k_pci 0000:01:00.0: received event id 36891 not 
> implemented
> [ 5425.209240] ath10k_pci 0000:01:00.0: wmi vdev install key idx 0 
> cipher 4 len 16
> [ 5425.209655] ath10k_pci 0000:01:00.0: wmi vdev install key idx 1 
> cipher 4 len 16
> [ 5425.209848] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 31 
> value 1
> [ 5425.210196] ath10k_pci 0000:01:00.0: vdev 0 set beacon tx mode to 
> staggered
> [ 5425.210221] ath10k_pci 0000:01:00.0: wmi pdev set param 7 value 0
> [ 5425.210296] ath10k_pci 0000:01:00.0: mac vdev 0 dtim_period 2
> [ 5425.210315] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 13 
> value 2
> [ 5425.480863] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5425.938619] ath10k_pci 0000:01:00.0: wmi event debug mesg len 100
> [ 5425.980946] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5425.995007] ath10k_pci 0000:01:00.0: mac sta rc update for 
> 18:a6:f7:23:6e:66 changed 00000001 bw 2 nss 3 smps 1
> [ 5425.995060] ath10k_pci 0000:01:00.0: mac update sta 
> 18:a6:f7:23:6e:66 peer bw 2
> [ 5425.995081] ath10k_pci 0000:01:00.0: wmi vdev 0 peer 
> 0x18:a6:f7:23:6e:66 set param 4 value 2
> [ 5426.481030] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5426.981117] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5427.481206] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5427.981294] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5428.481628] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5428.981718] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5429.481812] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5429.981894] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5430.481985] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5430.982073] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5431.482174] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5431.982505] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5432.482597] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5432.982679] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5433.482765] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5433.982857] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5434.482946] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5434.983008] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5435.483100] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5435.983181] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5436.483276] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5436.983366] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5437.483445] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5437.983516] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5438.483607] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5438.983692] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [ 5439.439875] ath10k_pci 0000:01:00.0: firmware crashed! (uuid 
> db76b67c-ca98-4519-a762-4ff4edb45526)
> [ 5439.449007] ath10k_pci 0000:01:00.0: qca988x hw2.0 target 
> 0x4100016c chip_id 0x043202ff sub 0000:0000
> [ 5439.458378] ath10k_pci 0000:01:00.0: kconfig debug 1 debugfs 1 
> tracing 0 dfs 1 testmode 1
> [ 5439.471460] ath10k_pci 0000:01:00.0: firmware ver 10.2.4.70.54 api 
> 5 features no-p2p,raw-mode,mfp crc32 9d340dd9
> [ 5439.481844] ath10k_pci 0000:01:00.0: board_file api 1 bmi_id N/A 
> crc32 bebc7c08
> [ 5439.489267] ath10k_pci 0000:01:00.0: htt-ver 2.1 wmi-op 5 htt-op 2 
> cal file max-sta 128 raw 0 hwcrypto 1
> [ 5439.500918] ath10k_pci 0000:01:00.0: firmware register dump:
> [ 5439.506678] ath10k_pci 0000:01:00.0: [00]: 0x4100016C 0x000015B3 
> 0x009A4577 0x00955B31
> [ 5439.514706] ath10k_pci 0000:01:00.0: [04]: 0x009A4577 0x00060130 
> 0x00000002 0x00439E98
> [ 5439.522751] ath10k_pci 0000:01:00.0: [08]: 0x0044110C 0x00442074 
> 0x00407120 0x004436CC
> [ 5439.530794] ath10k_pci 0000:01:00.0: [12]: 0x00000009 0x00000000 
> 0x009A3518 0x009A3526
> [ 5439.538834] ath10k_pci 0000:01:00.0: [16]: 0x00958080 0x0094085D 
> 0x00000000 0x00000000
> [ 5439.546871] ath10k_pci 0000:01:00.0: [20]: 0x409A4577 0x0040AAC4 
> 0x0040AC60 0x0040AC09
> [ 5439.554915] ath10k_pci 0000:01:00.0: [24]: 0x809A44BA 0x0040AB24 
> 0x00400000 0xC09A4577
> [ 5439.562948] ath10k_pci 0000:01:00.0: [28]: 0x809A39DE 0x0040AB84 
> 0x0044110C 0x00442074
> [ 5439.570992] ath10k_pci 0000:01:00.0: [32]: 0x809A5FE2 0x0040ABB4 
> 0x0044110C 0x00407120
> [ 5439.579032] ath10k_pci 0000:01:00.0: [36]: 0x809A2E6C 0x0040ABF4 
> 0x0040AC14 0x00001580
> [ 5439.587070] ath10k_pci 0000:01:00.0: [40]: 0x80990F6F 0x0040AD04 
> 0x009C643C 0x004436CC
> [ 5439.595113] ath10k_pci 0000:01:00.0: [44]: 0x80998510 0x0040AD64 
> 0x004208FC 0x00439E4C
> [ 5439.603146] ath10k_pci 0000:01:00.0: [48]: 0x8099AE95 0x0040AD84 
> 0x004208FC 0x00425E00
> [ 5439.611191] ath10k_pci 0000:01:00.0: [52]: 0x809BFC55 0x0040AEE4 
> 0x00424FE8 0x00000002
> [ 5439.619230] ath10k_pci 0000:01:00.0: [56]: 0x80940F18 0x0040AF14 
> 0x00000004 0x004039D0
> [ 5439.726818] ieee80211 phy0: Hardware restart was requested
> [ 5439.732433] ath10k_pci 0000:01:00.0: wmi mgmt vdev down id 0x1
> [ 5439.732461] ath10k_pci 0000:01:00.0: wmi vdev stop id 0x1
> [ 5439.732482] ath10k_pci 0000:01:00.0: failed to synchronize monitor 
> vdev 1 stop: -143
> [ 5439.740370] ath10k_pci 0000:01:00.0: mac monitor vdev 1 stopped
> [ 5439.740386] ath10k_pci 0000:01:00.0: failed to stop monitor vdev: -143
> [ 5439.747042] ath10k_pci 0000:01:00.0: wmi disable pktlog
>
> We noticed in this log that when the radio starts up it says that it 
> is in VHT80 mode:
> [ 5413.526592] ath10k_pci 0000:01:00.0: mac vdev 0 start center_freq 
> 5180 phymode 11ac-vht80
>
> But when a peer connects it seems to think the peer is in HT40 mode:
> [ 5425.204563] ath10k_pci 0000:01:00.0: mac peer 18:a6:f7:23:6e:66 
> phymode 11na-ht40
>
> Compare this with the no-encryption case; this log was taken from Node A:
>
> [   24.874253] ath10k_pci 0000:01:00.0: mac chanctx add freq 5180 
> width 3 ptr 86d26db0
> [   24.874278] ath10k_pci 0000:01:00.0: mac monitor recalc started? 0 
> needed? 0 allowed? 1
> [   24.874296] ath10k_pci 0000:01:00.0: mac chanctx assign ptr 
> 86d26db0 vdev_id 0
> [   24.874312] ath10k_pci 0000:01:00.0: mac vdev 0 start center_freq 
> 5180 phymode 11ac-vht80
> [   24.874337] ath10k_pci 0000:01:00.0: wmi vdev start id 0x0 flags: 
> 0x0, freq 5180, mode 10, ch_flags: 0xA000000, max_power: 46
> [   24.881335] ath10k_pci 0000:01:00.0: WMI_VDEV_START_RESP_EVENTID
> [   24.881423] ath10k_pci 0000:01:00.0: mac vdev_id 0 txpower 23
> [   24.881438] ath10k_pci 0000:01:00.0: mac txpower 23
> [   24.881454] ath10k_pci 0000:01:00.0: wmi pdev set param 3 value 46
> [   24.881491] ath10k_pci 0000:01:00.0: wmi pdev set param 4 value 46
> [   24.881515] ath10k_pci 0000:01:00.0: mac chanctx change freq 5180 
> width 3 ptr 86d26db0 changed 10
> [   24.881535] ath10k_pci 0000:01:00.0: mac chanctx change freq 5180 
> width 3 ptr 86d26db0 changed 2
> [   24.881554] ath10k_pci 0000:01:00.0: mac monitor recalc started? 0 
> needed? 1 allowed? 1
> [   24.881574] ath10k_pci 0000:01:00.0: WMI vdev create: id 1 type 4 
> subtype 0 macaddr 18:a6:f7:23:6e:66
> [   24.881689] ath10k_pci 0000:01:00.0: mac monitor vdev 1 created
> [   24.881745] ath10k_pci 0000:01:00.0: wmi vdev start id 0x1 flags: 
> 0x0, freq 5180, mode 10, ch_flags: 0xA000000, max_power: 46
> [   25.273460] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [   25.730570] ath10k_pci 0000:01:00.0: wmi event debug mesg len 300
> [   25.773566] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [   25.874556] ath10k_pci 0000:01:00.0: WMI_VDEV_START_RESP_EVENTID
> [   25.879992] ath10k_pci 0000:01:00.0: wmi mgmt vdev up id 0x1 assoc 
> id 0 bssid 18:a6:f7:23:6e:66
> [   25.880077] ath10k_pci 0000:01:00.0: mac monitor vdev 1 started
> [   25.880093] ath10k_pci 0000:01:00.0: mac monitor started
> [   25.880139] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 3 
> value 1000
> [   25.880184] ath10k_pci 0000:01:00.0: mac vdev 0 beacon_interval 1000
> [   25.880199] ath10k_pci 0000:01:00.0: vdev 0 set beacon tx mode to 
> staggered
> [   25.880215] ath10k_pci 0000:01:00.0: wmi pdev set param 7 value 0
> [   25.880235] ath10k_pci 0000:01:00.0: mac vdev 0 dtim_period 2
> [   25.880250] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 13 
> value 2
> [   25.880988] ath10k_pci 0000:01:00.0: wmi mgmt vdev up id 0x0 assoc 
> id 0 bssid 00:00:00:00:00:00
> [   25.881035] ath10k_pci 0000:01:00.0: mac vdev 0 up
> [   25.881097] ath10k_pci 0000:01:00.0: WMI_TBTTOFFSET_UPDATE_EVENTID
> [   25.882968] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
> [   25.928796] ath10k_pci 0000:01:00.0: vdev 0 set beacon tx mode to 
> staggered
> [   25.928821] ath10k_pci 0000:01:00.0: wmi pdev set param 7 value 0
> [   25.928866] ath10k_pci 0000:01:00.0: mac vdev 0 dtim_period 2
> [   25.928883] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 13 
> value 2
> [   25.929020] ath10k_pci 0000:01:00.0: mac monitor recalc started? 1 
> needed? 1 allowed? 1
> [   25.941886] ath10k_pci 0000:01:00.0: vdev 0 set beacon tx mode to 
> staggered
> [   25.941911] ath10k_pci 0000:01:00.0: wmi pdev set param 7 value 0
> [   25.941955] ath10k_pci 0000:01:00.0: mac vdev 0 dtim_period 2
> [   25.941972] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 13 
> value 2
> [   25.953727] ath10k_pci 0000:01:00.0: vdev 0 set beacon tx mode to 
> staggered
> [   25.953753] ath10k_pci 0000:01:00.0: wmi pdev set param 7 value 0
> [   25.953798] ath10k_pci 0000:01:00.0: mac vdev 0 dtim_period 2
> [   25.953817] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 13 
> value 2
> [   25.970588] ath10k_pci 0000:01:00.0: vdev 0 set beacon tx mode to 
> staggered
> [   25.970614] ath10k_pci 0000:01:00.0: wmi pdev set param 7 value 0
> [   25.970659] ath10k_pci 0000:01:00.0: mac vdev 0 dtim_period 2
> [   25.970676] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 13 
> value 2
> [   25.989056] ath10k_pci 0000:01:00.0: vdev 0 set beacon tx mode to 
> staggered
> [   25.989081] ath10k_pci 0000:01:00.0: wmi pdev set param 7 value 0
> [   25.989126] ath10k_pci 0000:01:00.0: mac vdev 0 dtim_period 2
> [   25.989143] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 13 
> value 2
> [   26.071686] ath10k_pci 0000:01:00.0: mac vdev 0 peer create 
> 18:a6:f7:26:0f:21 (new sta) sta 1 / 128 peer 2 / 144
> [   26.071712] ath10k_pci 0000:01:00.0: wmi peer create vdev_id 0 
> peer_addr 18:a6:f7:26:0f:21
> [   26.071952] ath10k_pci 0000:01:00.0: mac sta 18:a6:f7:26:0f:21 
> associated
> [   26.071981] ath10k_pci 0000:01:00.0: mac ht peer 18:a6:f7:26:0f:21 
> mcs cnt 24 nss 3
> [   26.071999] ath10k_pci 0000:01:00.0: mac vht peer 18:a6:f7:26:0f:21 
> max_mpdu 1048575 flags 0x601b001
> [   26.072013] ath10k_pci 0000:01:00.0: mac peer 18:a6:f7:26:0f:21 qos 1
> [   26.072028] ath10k_pci 0000:01:00.0: mac peer 18:a6:f7:26:0f:21 
> phymode 11ac-vht80
> [   26.072047] ath10k_pci 0000:01:00.0: wmi peer assoc vdev 0 addr 
> 18:a6:f7:26:0f:21 (new)
> [   26.072071] ath10k_pci 0000:01:00.0: wmi vdev 0 peer 
> 0x18:a6:f7:26:0f:21 set param 1 value 0
> [   26.072502] ath10k_pci 0000:01:00.0: received event id 36891 not 
> implemented
> [   26.074194] ath10k_pci 0000:01:00.0: mac sta rc update for 
> 18:a6:f7:26:0f:21 changed 00000000 bw 2 nss 3 smps 1
> [   26.074586] ath10k_pci 0000:01:00.0: vdev 0 set beacon tx mode to 
> staggered
> [   26.074609] ath10k_pci 0000:01:00.0: wmi pdev set param 7 value 0
> [   26.074682] ath10k_pci 0000:01:00.0: mac vdev 0 dtim_period 2
> [   26.074701] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 13 
> value 2
> [   26.074760] ath10k_pci 0000:01:00.0: mac vdev 0 slot_time 2
> [   26.074779] ath10k_pci 0000:01:00.0: wmi vdev id 0x0 set param 7 
> value 2
> [   26.273652] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [   26.730650] ath10k_pci 0000:01:00.0: wmi event debug mesg len 44
> [   26.773733] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
> [   27.135445] ath10k_pci 0000:01:00.0: mac monitor recalc started? 1 
> needed? 1 allowed? 1
> [   27.273810] ath10k_pci 0000:01:00.0: WMI_UPDATE_STATS_EVENTID
>
> It seems to start up in VHT80 mode and when it peers with Node B it 
> thinks Node B is also in VHT80 mode and ping works.
>
> 4. Beacons are sent at 6 Mb/s basic rate and unicast QoS Data is sent 
> with three spatial streams. Attached is the full pcap of the experiment.
>
> Thank you for looking into this!
>
> ~Benjamin
>
> On 12/05/2016 11:24 AM, Nagarajan, Ashok Raj wrote:
>> 0x009A4577 0x00955B31
>>
>> Benjamin, Thanks for the logs.
>> Quick questions to further debug the issue here,
>>
>> 1. Is this issue seen every time you start sending data traffic?
>> 2. Issue seen with older firmwares? (FYR, 
>> http://linuxwireless.org/en/users/Drivers/ath10k/firmware/ )
>> 3. Could you please share the dmesg from your device after enabling 
>> MAC and WMI logs in ath10k driver
>>     To enable debug logs please see 
>> http://linuxwireless.org/en/users/Drivers/ath10k/debug/
>> 4. Do you know what is the Number of Spatial Streams seen in mesh 
>> beacons and in mesh data packet?
>>
>> Thanks,
>> Ashok
>


* Re: [RFC v2 05/11] ath10k: htc: refactorization
From: Erik Stromdahl @ 2016-12-13 18:37 UTC
  To: Valo, Kalle, michal.kazior@tieto.com
  Cc: linux-wireless@vger.kernel.org, ath10k@lists.infradead.org
In-Reply-To: <871sxbzqo6.fsf@kamboji.qca.qualcomm.com>



On 12/13/2016 06:26 PM, Valo, Kalle wrote:
> Michal Kazior <michal.kazior@tieto.com> writes:
> 
>> On 13 December 2016 at 14:44, Valo, Kalle <kvalo@qca.qualcomm.com> wrote:
>>> Erik Stromdahl <erik.stromdahl@gmail.com> writes:
>>>
>>>> Code refactorization:
>>>>
>>>> Moved the code for ep 0 in ath10k_htc_rx_completion_handler
>>>> to ath10k_htc_control_rx_complete.
>>>>
>>>> This eases the implementation of SDIO/mbox significantly since
>>>> the ep_rx_complete cb is invoked directly from the SDIO/mbox
>>>> hif layer.
>>>>
>>>> Since the ath10k_htc_control_rx_complete already is present
>>>> (only containing a warning message) there is no reason for not
>>>> using it (instead of having a special case for ep 0 in
>>>> ath10k_htc_rx_completion_handler).
>>>>
>>>> Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
>>>
>>> I tested this on QCA988X PCI board just to see if there are any
>>> regressions. It crashes immediately during module load, every time, and
>>> bisected that the crashing starts on this patch:
>>>
>>> [ 1239.715325] ath10k_pci 0000:02:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
>>> [ 1239.885125] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:02:00.0.bin failed with error -2
>>> [ 1239.885260] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/cal-pci-0000:02:00.0.bin failed with error -2
>>> [ 1239.885687] ath10k_pci 0000:02:00.0: qca988x hw2.0 target 0x4100016c chip_id 0x043202ff sub 0000:0000
>>> [ 1239.885699] ath10k_pci 0000:02:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 1 testmode 1
>>> [ 1239.885899] ath10k_pci 0000:02:00.0: firmware ver 10.2.4.70.59-2 api 5 features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 4159f498
>>> [ 1239.941836] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
>>> [ 1239.941993] ath10k_pci 0000:02:00.0: board_file api 1 bmi_id N/A crc32 bebc7c08
>>> [ 1241.136693] BUG: unable to handle kernel NULL pointer dereference at   (null)
>>> [ 1241.136738] IP: [<  (null)>]   (null)
>>> [ 1241.136759] *pdpt = 0000000000000000 *pde = f0002a55f0002a55 [ 1241.136781]
>>> [ 1241.136793] Oops: 0010 [#1] SMP
>>>
>>> What's odd is that when I added some printks of my own and enabled both
>>> boot and htc debug levels it doesn't crash anymore. Everything works
>>> normally after that, I can start AP mode and connect to it. Is it
>>> a race somewhere?
>>
>> Yes. htc_wait_target() is called after hif_start(). The ep_rx_complete
>> is set in htc_wait_target() [changed patch 4, but still too late].
>>
>> ep_rx_complete must be set prior to calling hif_start(). You probably
>> crash on end of ath10k_htc_rx_completion_handler() when trying to call
>> ep->ep_ops.ep_rx_complete(ar, skb).
> 
> Yeah, just checked and ep->ep_ops.ep_rx_complete is NULL at the end of
> ath10k_htc_rx_completion_handler().
> 
Michal is indeed correct: there is a risk that the first HTC control
message (typically an HTC ready message) is received before the HTC
control endpoint is connected.

I have experienced a similar race with my SDIO implementation as well.
In this case I did solve the issue by enabling HIF target interrupts
after the HTC control endpoint was connected. I am not sure however if
this is the most elegant way to solve this problem.

My SDIO target won't send the HTC ready message before this is done.
The fix essentially consists of moving the ..._irq_enable call from
hif_start into another hif op.

I have made a few updates since I submitted the original RFC and created
a repo on github:

https://github.com/erstrom/linux-ath

I have a bunch of branches that are all based on the tags on the ath master.

As of this moment the latest version is:

ath-201612131156-ath10k-sdio

This branch contains the original RFC patches plus some addons/fixes.

In the above mentioned branch there are a few commits related to this
race condition. Perhaps you can have a look at them?

The commits are:
821672913328cf737c9616786dc28d2e4e8a4a90
dd7fcf0a1f78e68876d14f90c12bd37f3a700ad7
7434b7b40875bd08a3a48a437ba50afed7754931

Perhaps this approach can work with PCIe as well?

/Erik


* Re: [RFC v2 05/11] ath10k: htc: refactorization
From: Valo, Kalle @ 2016-12-13 17:26 UTC
  To: michal.kazior@tieto.com
  Cc: Erik Stromdahl, linux-wireless@vger.kernel.org,
	ath10k@lists.infradead.org
In-Reply-To: <CA+BoTQkuUFxnra75B4LpF5+k9yE7hkNYDXbtUMBLq2LFuPSasg@mail.gmail.com>

Michal Kazior <michal.kazior@tieto.com> writes:

> On 13 December 2016 at 14:44, Valo, Kalle <kvalo@qca.qualcomm.com> wrote:
>> Erik Stromdahl <erik.stromdahl@gmail.com> writes:
>>
>>> Code refactorization:
>>>
>>> Moved the code for ep 0 in ath10k_htc_rx_completion_handler
>>> to ath10k_htc_control_rx_complete.
>>>
>>> This eases the implementation of SDIO/mbox significantly since
>>> the ep_rx_complete cb is invoked directly from the SDIO/mbox
>>> hif layer.
>>>
>>> Since the ath10k_htc_control_rx_complete already is present
>>> (only containing a warning message) there is no reason for not
>>> using it (instead of having a special case for ep 0 in
>>> ath10k_htc_rx_completion_handler).
>>>
>>> Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
>>
>> I tested this on QCA988X PCI board just to see if there are any
>> regressions. It crashes immediately during module load, every time, and
>> bisected that the crashing starts on this patch:
>>
>> [ 1239.715325] ath10k_pci 0000:02:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
>> [ 1239.885125] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:02:00.0.bin failed with error -2
>> [ 1239.885260] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/cal-pci-0000:02:00.0.bin failed with error -2
>> [ 1239.885687] ath10k_pci 0000:02:00.0: qca988x hw2.0 target 0x4100016c chip_id 0x043202ff sub 0000:0000
>> [ 1239.885699] ath10k_pci 0000:02:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 1 testmode 1
>> [ 1239.885899] ath10k_pci 0000:02:00.0: firmware ver 10.2.4.70.59-2 api 5 features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 4159f498
>> [ 1239.941836] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
>> [ 1239.941993] ath10k_pci 0000:02:00.0: board_file api 1 bmi_id N/A crc32 bebc7c08
>> [ 1241.136693] BUG: unable to handle kernel NULL pointer dereference at   (null)
>> [ 1241.136738] IP: [<  (null)>]   (null)
>> [ 1241.136759] *pdpt = 0000000000000000 *pde = f0002a55f0002a55 [ 1241.136781]
>> [ 1241.136793] Oops: 0010 [#1] SMP
>>
>> What's odd is that when I added some printks of my own and enabled both
>> boot and htc debug levels it doesn't crash anymore. Everything works
>> normally after that, I can start AP mode and connect to it. Is it
>> a race somewhere?
>
> Yes. htc_wait_target() is called after hif_start(). The ep_rx_complete
> is set in htc_wait_target() [changed patch 4, but still too late].
>
> ep_rx_complete must be set prior to calling hif_start(). You probably
> crash on end of ath10k_htc_rx_completion_handler() when trying to call
> ep->ep_ops.ep_rx_complete(ar, skb).

Yeah, just checked and ep->ep_ops.ep_rx_complete is NULL at the end of
ath10k_htc_rx_completion_handler().

-- 
Kalle Valo


* [PATCH v2] ath9k: do not return early to fix rcu unlocking
From: Tobias Klausmann @ 2016-12-13 17:08 UTC
  To: kvalo, helgaas, linux-kernel, linux-pci, marc.zyngier,
	Janusz.Dziedzic, ath9k-devel, linux-wireless, rmanohar,
	bharat.kumar.gogada
  Cc: Tobias Klausmann, # v4 . 9
In-Reply-To: <e15c5efc-d548-8b32-ca86-5ec26506ff7c@nbd.name>

Starting with commit d94a461d7a7d ("ath9k: use ieee80211_tx_status_noskb
where possible") the driver uses rcu_read_lock() and rcu_read_unlock(), yet
on returning early from ath_tx_edma_tasklet() the unlock is missing, leading
to stalls and suspicious RCU usage:

 ===============================
 [ INFO: suspicious RCU usage. ]
 4.9.0-rc8 #11 Not tainted
 -------------------------------
 kernel/rcu/tree.c:705 Illegal idle entry in RCU read-side critical section.!

 other info that might help us debug this:

 RCU used illegally from idle CPU!
 rcu_scheduler_active = 1, debug_locks = 0
 RCU used illegally from extended quiescent state!
 1 lock held by swapper/7/0:
 #0:  (rcu_read_lock){......}, at: [<ffffffffa06ed110>] ath_tx_edma_tasklet+0x0/0x450 [ath9k]

 stack backtrace:
 CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.9.0-rc8 #11
 Hardware name: Acer Aspire V3-571G/VA50_HC_CR, BIOS V2.21 12/16/2013
  ffff88025efc3f38 ffffffff8132b1e5 ffff88017ede4540 0000000000000001
  ffff88025efc3f68 ffffffff810a25f7 ffff88025efcee60 ffff88017edebdd8
  ffff88025eeb5400 0000000000000091 ffff88025efc3f88 ffffffff810c3cd4
 Call Trace:
  <IRQ>
  [<ffffffff8132b1e5>] dump_stack+0x68/0x93
  [<ffffffff810a25f7>] lockdep_rcu_suspicious+0xd7/0x110
  [<ffffffff810c3cd4>] rcu_eqs_enter_common.constprop.85+0x154/0x200
  [<ffffffff810c5a54>] rcu_irq_exit+0x44/0xa0
  [<ffffffff81058631>] irq_exit+0x61/0xd0
  [<ffffffff81018d25>] do_IRQ+0x65/0x110
  [<ffffffff81672189>] common_interrupt+0x89/0x89
  <EOI>
  [<ffffffff814ffe11>] ? cpuidle_enter_state+0x151/0x200
  [<ffffffff814ffee2>] cpuidle_enter+0x12/0x20
  [<ffffffff8109a6ae>] call_cpuidle+0x1e/0x40
  [<ffffffff8109a8f6>] cpu_startup_entry+0x146/0x220
  [<ffffffff810336f8>] start_secondary+0x148/0x170

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Fixes: d94a461d7a7d ("ath9k: use ieee80211_tx_status_noskb where possible")
Cc: <stable@vger.kernel.org> # v4.9

---
v2: break instead of unlock (rename patch) [Felix Fietkau],
    fix reference to commit [Kalle Valo]
---
 drivers/net/wireless/ath/ath9k/xmit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
index 52bfbb988611..e47286bf378e 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -2787,7 +2787,7 @@ void ath_tx_edma_tasklet(struct ath_softc *sc)
 		fifo_list = &txq->txq_fifo[txq->txq_tailidx];
 		if (list_empty(fifo_list)) {
 			ath_txq_unlock(sc, txq);
-			return;
+			break;
 		}
 
 		bf = list_first_entry(fifo_list, struct ath_buf, list);
-- 
2.11.0

^ permalink raw reply related

* Re: [PATCH] orinoco: Use shash instead of ahash for MIC calculations
From: Kalle Valo @ 2016-12-13 17:03 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andy Lutomirski, linux-kernel@vger.kernel.org, USB list,
	Linux Wireless List, Eric Biggers, linux-crypto, Herbert Xu,
	Stephan Mueller
In-Reply-To: <CALCETrXxQ9FxuqV5A1rkj2SpeFfd89njDP9h5VBuNx387ieKdQ@mail.gmail.com>

Andy Lutomirski <luto@amacapital.net> writes:

> On Tue, Dec 13, 2016 at 3:35 AM, Kalle Valo <kvalo@codeaurora.org> wrote:
>> Andy Lutomirski <luto@kernel.org> writes:
>>
>>> Eric Biggers pointed out that the orinoco driver pointed scatterlists
>>> at the stack.
>>>
>>> Fix it by switching from ahash to shash.  The result should be
>>> simpler, faster, and more correct.
>>>
>>> Cc: stable@vger.kernel.org # 4.9 only
>>> Reported-by: Eric Biggers <ebiggers3@gmail.com>
>>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>>
>> "more correct"? Does this fix a real user visible bug or what? And why
>> just stable 4.9, does this maybe have something to do with
>> CONFIG_VMAP_STACK?
>
> Whoops, I had that text in some other patches but forgot to put it in
> this one.  It'll blow up with CONFIG_VMAP_STACK=y if a debug option
> like CONFIG_DEBUG_VIRTUAL=y is set.  It may work by accident if
> debugging is off.

Makes sense now, thanks. I'll add that to the commit log and queue this
to 4.10.

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH] orinoco: Use shash instead of ahash for MIC calculations
From: Andy Lutomirski @ 2016-12-13 16:41 UTC (permalink / raw)
  To: Kalle Valo
  Cc: Andy Lutomirski, linux-kernel@vger.kernel.org, USB list,
	Linux Wireless List, Eric Biggers, linux-crypto, Herbert Xu,
	Stephan Mueller
In-Reply-To: <87mvg0kqno.fsf@purkki.adurom.net>

On Tue, Dec 13, 2016 at 3:35 AM, Kalle Valo <kvalo@codeaurora.org> wrote:
> Andy Lutomirski <luto@kernel.org> writes:
>
>> Eric Biggers pointed out that the orinoco driver pointed scatterlists
>> at the stack.
>>
>> Fix it by switching from ahash to shash.  The result should be
>> simpler, faster, and more correct.
>>
>> Cc: stable@vger.kernel.org # 4.9 only
>> Reported-by: Eric Biggers <ebiggers3@gmail.com>
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>
> "more correct"? Does this fix a real user visible bug or what? And why
> just stable 4.9, does this maybe have something to do with
> CONFIG_VMAP_STACK?

Whoops, I had that text in some other patches but forgot to put it in
this one.  It'll blow up with CONFIG_VMAP_STACK=y if a debug option
like CONFIG_DEBUG_VIRTUAL=y is set.  It may work by accident if
debugging is off.

--Andy

^ permalink raw reply

* Re: [RFC V3 01/11] nl80211: add reporting of gscan capabilities
From: Johannes Berg @ 2016-12-13 16:15 UTC (permalink / raw)
  To: Arend van Spriel; +Cc: linux-wireless, Arend van Spriel
In-Reply-To: <1481543997-24624-2-git-send-email-arend.vanspriel@broadcom.com>


> +	case 14:
> +		if (!rdev->wiphy.gscan) {
> +			/* done */
> +			state->split_start = 0;
> +			break;
> +		}
> 

Nit, but I'm not really happy with this - this assumes that case 14 is
the last case, if anyone ever adds one we break this code but it would
still work if the device has gscan. Move the gscan stuff into a new
function and make that return immediately if gscan is NULL or so?
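
The suggestion above could look roughly like the following user-space sketch. It is not the nl80211 code: `fake_wiphy`, `fake_gscan_caps` and `fake_put_gscan_caps()` are hypothetical names invented for illustration. The only point is that the helper returns immediately when there is no gscan data, so the caller no longer has to assume that this is the last case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the real structures. */
struct fake_gscan_caps {
	int max_buckets;
};

struct fake_wiphy {
	const struct fake_gscan_caps *gscan;	/* NULL if unsupported */
};

/*
 * Write the gscan capabilities into 'out' and return the number of
 * values written. Returns 0 right away when the device has no gscan
 * support, so the caller can simply carry on with any later cases.
 */
static int fake_put_gscan_caps(const struct fake_wiphy *wiphy,
			       int *out, int out_len)
{
	if (!wiphy->gscan)
		return 0;

	assert(out_len >= 1);
	out[0] = wiphy->gscan->max_buckets;
	return 1;
}
```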

johannes

^ permalink raw reply

* Re: [RFC V3 04/11] nl80211: add driver api for gscan notifications
From: Johannes Berg @ 2016-12-13 16:20 UTC (permalink / raw)
  To: Arend van Spriel; +Cc: linux-wireless
In-Reply-To: <1481543997-24624-5-git-send-email-arend.vanspriel@broadcom.com>

On Mon, 2016-12-12 at 11:59 +0000, Arend van Spriel wrote:
> The driver can indicate gscan results are available or gscan
> operation has stopped.

This patch is renumbering the previous patches' nl80211 API, which is
best avoided, even if I do realize it doesn't matter now. :)

Even here it's not clear how things are reported though. Somehow I
thought that gscan was reporting only partial information through the
buckets, or is that not true?

johannes

^ permalink raw reply

* Re: [RFC V3 03/11] nl80211: add support for gscan
From: Johannes Berg @ 2016-12-13 16:19 UTC (permalink / raw)
  To: Arend van Spriel; +Cc: linux-wireless
In-Reply-To: <1481543997-24624-4-git-send-email-arend.vanspriel@broadcom.com>

On Mon, 2016-12-12 at 11:59 +0000, Arend van Spriel wrote:
> This patch adds support for GScan which is a scan offload feature
> used in Android.

Found a few places with spaces instead of tabs as indentation, and
spurious braces around single-statement things, but other than that it
looks fine from a patch/nl80211 POV.

Haven't really looked into the details of gscan itself now though,
sorry.

There's a bit of a weird hard-coded restriction to 16 channels too,
that's due to the bucket map?

johannes

^ permalink raw reply

* Re: [PATCH 3/3][RFC] nl80211/mac80211: Accept multiple RSSI thresholds for CQM
From: Johannes Berg @ 2016-12-13 16:11 UTC (permalink / raw)
  To: Andrew Zaborowski; +Cc: linux-wireless
In-Reply-To: <CAOq732K88CygXBmDfGmU21AcQHurcWx081Q3SM-b0ecm+848gA@mail.gmail.com>


> I wasn't clear: nl80211 sets the thresholds so that "high" is higher
> than last known value and "low" is lower than last known value, also
> the distance is at least 2 x hysteresis.  There's no purpose for
> reporting "middle" rssi events because we have to set a new range as
> soon as we receive a high or a low event.  I realize I need to
> document better.

But there can be a delay between reporting and reprogramming; what if a
new event is reported during that time? I guess it doesn't
matter much if we assume that upon reprogramming the driver will always
report a new event if the current value falls outside the new range
(either high or low)... it just seemed a little bit more consistent to
unconditionally report a new event at the beginning, even if that new
event is "yup - falling into the middle of your range now".
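
For readers following along, the range selection described in the quoted text might be sketched like this. It is an assumption about the general idea, not the code from the patch: `pick_range()` and `struct rssi_range` are hypothetical, the thresholds are assumed to be sorted in ascending order, and INT_MIN/INT_MAX stand for "no bound on this side".

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct rssi_range {
	int low;	/* report a "low" event below this value */
	int high;	/* report a "high" event above this value */
};

/*
 * Pick the pair of thresholds surrounding the last known RSSI and
 * widen the range by the hysteresis on both sides, so that low and
 * high are at least 2 * hyst apart.
 */
static struct rssi_range pick_range(const int *thresholds, size_t n,
				    int last, int hyst)
{
	struct rssi_range r = { INT_MIN, INT_MAX };
	size_t i;

	assert(hyst >= 0);
	for (i = 0; i < n; i++) {
		if (thresholds[i] <= last) {
			/* overwritten until the largest one at or below 'last' */
			r.low = thresholds[i] - hyst;
		} else {
			r.high = thresholds[i] + hyst;
			break;
		}
	}
	return r;
}
```

With thresholds of -80, -70 and -60 dBm, a last value of -65 dBm and a hysteresis of 2, this yields the range -72 to -58.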

johannes

^ permalink raw reply

* Re: [PATCH 2/4] cfg80211: Add new NL80211_CMD_SET_BTCOEX_PRIORITY to support BTCOEX
From: Johannes Berg @ 2016-12-13 16:09 UTC (permalink / raw)
  To: Tamizh chelvam; +Cc: c_traja, linux-wireless, ath10k
In-Reply-To: <5e5e8971c96293a81e7cb37bcdfbd593@codeaurora.org>


> > >  /**
> > > + * wiphy_btcoex_support_flags
> > > + *	This enum has the driver supported frame types for
> > > BTCOEX.
> > > + * @WIPHY_WLAN_BE_PREFERRED - Supports Best Effort frame for
> > > BTCOEX
> > > + * @WIPHY_WLAN_BK_PREFERRED - supports Background frame for
> > > BTCOEX
> > > + * @WIPHY_WLAN_VI_PREFERRED - supports Video frame for BTCOEX
> > > + * @WIPHY_WLAN_VO_PREFERRED - supports Voice frame for BTCOEX
> > > + * @WIPHY_WLAN_BEACON_PREFERRED - supports Beacon frame for
> > > BTCOEX
> > > + * @WIPHY_WLAN_MGMT_PREFERRED - supports Management frames for
> > > BTCOEX.
> > > + */
> > 
> > That's not making much sense to me?
> > 
> 
> is it fine to have as WIPHY_BTCOEX_BE_PREFERRED ?

It's not really clear to me what you intend to do with this - if it's
really support flags then you really should name those better.

> > > +/**
> > > + * enum wiphy_btcoex_priority - BTCOEX priority level
> > > + *	This enum defines priority level for BTCOEX
> > > + * WIPHY_WLAN_PREFERRED_LOW - low priority frames over BT
> > > traffic
> > > + * WIPHY_WLAN_PREFERRED_HIGH - high priority frames over BT
> > > traffic
> > > + */
> > > +
> > > +enum wiphy_btcoex_priority {
> > > +	WIPHY_WLAN_PREFERRED_LOW = false,
> > > +	WIPHY_WLAN_PREFERRED_HIGH = true,
> > > +};
> > 
> > That false/true seems just strange.
> > 
> 
> I will just use as a enum without assigning false/true.

What do you even need this enum for though?

> > > +enum nl80211_btcoex_priority {
> > > +	__NL80211_WLAN_PREFERRED_INVALID,
> > > +	NL80211_WLAN_BE_PREFERRED,
> > > +	NL80211_WLAN_BK_PREFERRED,
> > > +	NL80211_WLAN_VI_PREFERRED,
> > > +	NL80211_WLAN_VO_PREFERRED,
> > > +	NL80211_WLAN_BEACON_PREFERRED,
> > > +	NL80211_WLAN_MGMT_PREFERRED,
> > > +	__NL80211_WLAN_PREFERRED_LAST,
> > > +	NL80211_WLAN_PREFERRED_MAX =
> > > +			__NL80211_WLAN_PREFERRED_LAST - 1,
> > > +};
> > 
> > Wouldn't a bitmap be easier?
> > 
> since this is to distinguish between different btcoex priorities and
> we 
> are not going to do any manipulations on these parameters.
> It is just used as flag attribute.

But why the (parsing) complexity, when a single bitmap would do?
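
A rough sketch of what the bitmap alternative could look like in plain C. The `BTCOEX_PREF_*` names and the helper are made up for illustration and are not part of nl80211; the point is only that a single 32-bit value can carry all six frame types from the quoted enum and that validation reduces to a mask check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names, one bit per frame type in the quoted enum. */
enum btcoex_pref_flags {
	BTCOEX_PREF_BE     = 1u << 0,
	BTCOEX_PREF_BK     = 1u << 1,
	BTCOEX_PREF_VI     = 1u << 2,
	BTCOEX_PREF_VO     = 1u << 3,
	BTCOEX_PREF_BEACON = 1u << 4,
	BTCOEX_PREF_MGMT   = 1u << 5,
};

#define BTCOEX_PREF_ALL ((1u << 6) - 1)

/* A request is acceptable if it only sets bits the driver supports. */
static bool btcoex_pref_valid(uint32_t requested, uint32_t supported)
{
	assert((supported & ~BTCOEX_PREF_ALL) == 0);
	return (requested & ~supported) == 0;
}
```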

johannes

^ permalink raw reply

* Re: [PATCH] RFC: Universal scan proposal
From: Johannes Berg @ 2016-12-13 16:06 UTC (permalink / raw)
  To: Dmitry Shmidt, Arend Van Spriel; +Cc: linux-wireless
In-Reply-To: <CAH7ZN-wGseBVzV3Vuq+6=kgaSL7e0UnndGXPdu4PQKZw8H47YQ@mail.gmail.com>


> Supporting requests (or more precisely requests and results)
> differentiated by user-space entity can be tricky. Right now we are
> not checking current caller pid, right? Maybe it is also good idea -
> or maybe we can just make result filtering per user-space caller?

Could be done.

You seem to be very worried about the partial results - I'm not too
worried about that I guess, the connection manager itself will always
be able to wait for the full scan to finish before making a decision,
but it may not even want to (see the separate discussion on per-channel 
"done" notifications etc.)

I'm much more worried about the "bucket reporting" since that doesn't
fit into the current full BSS reporting model at all. What's your
suggestion for this?

johannes

^ permalink raw reply

* Re: [PATCH] RFC: Universal scan proposal
From: Johannes Berg @ 2016-12-13 16:04 UTC (permalink / raw)
  To: Dmitry Shmidt; +Cc: linux-wireless
In-Reply-To: <CAH7ZN-wP+9AGrXFUS4RY65-RyfP-J46svBvLdytP2c=QPtiaug@mail.gmail.com>

> > Well eventually we also have to clear for location if we run out of
> > memory, that usually means dumping them out to the host, no?
> 
> Being out of memory and consuming more memory are different
> things, but I agree - maybe we don't need to worry about it.

Well, reaching the limit of what we're willing to spend on it is
equivalent I guess :)

> > I'm not entirely sure about this case - surely noticing "we can do
> > better now" is still better than waiting for being able to make the
> > perfect decision?
> 
> Maybe we can just keep flag saying that currently available results
> were not received by usual full scan.

Elsewhere we were planning per-channel results, and a cookie to filter
them - perhaps we could have a similar thing where you may even have to
request these scan results specifically with a certain cookie you got
from the scanning, or so. Or indicate the cookie there so you can tie
it back to the scan request somehow?

> So, let's summarize:
> Instead of creating new type of generic scan with special types,
> we want to go with additional expansion of scheduled scan options and
> parameters (in order not to "multiply entities"), including ability
> to send new scheduled scan request without stopping previous one.
> 
> Is it Ok?

Sounds fine to me.

johannes

^ permalink raw reply

* Re: [PATCH v2 2/2] cfg80211: Add support to sched scan to report better BSSs
From: Johannes Berg @ 2016-12-13 15:56 UTC (permalink / raw)
  To: Arend Van Spriel, Malinen, Jouni
  Cc: Vamsi, Krishna, linux-wireless@vger.kernel.org
In-Reply-To: <63343007-2245-1861-94fd-bdda0de2f7dc@broadcom.com>

Ok... this is getting complicated :)

Regarding reusing attributes, we have (for the BSS selection thing) the
attribute NL80211_BSS_SELECT_ATTR_RSSI_ADJUST, which is really quite
similar to your new NL80211_ATTR_SCHED_SCAN_RELATIVE_RSSI_5G_PREF since
while connected (which BSS_SELECT_ATTR_* assumes) the current BSS is
always part of the considered BSSes, I'd think.

However, I tend to think now that reusing the attribute is perhaps not
the right thing to do - but defining them with the same semantics would
still make sense.

Assuming that the value defined in NL80211_BSS_SELECT_ATTR_RSSI_ADJUST
applies also to the *current* BSS, it's actually quite pointless to
define there the band to adjust - if you want to adjust 2.4 GHz
positively you might as well adjust 5 GHz negatively, and vice versa,
and both ways are supported.

OTOH, the new NL80211_ATTR_SCHED_SCAN_RELATIVE_RSSI_5G_PREF doesn't
make this quite clear - is the current BSS to be adjusted before
comparing, if it's 5 GHz? If so, the semantics are equivalent. If not,
it doesn't actually make much sense ;-)
So assuming that it is in fact taken into account after the same
adjustment, the two attributes are equivalent, and then perhaps it
would make sense to use struct nl80211_bss_select_rssi_adjust for the
new attribute. If a driver doesn't support arbitrary bands, but just 5
GHz as in your example, it can just flip it around to 2.4 GHz by
switching the sign.
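
The sign-flip argument can be checked with a small user-space sketch. Everything here is hypothetical (`struct bss`, `adjusted_rssi()`, `is_better()`); it assumes, as the paragraph above does, that the adjustment is applied to the current BSS as well as to the candidate before the two are compared.

```c
#include <assert.h>
#include <stdbool.h>

enum band { BAND_2GHZ, BAND_5GHZ };	/* simplified, hypothetical */

struct bss {
	enum band band;
	int rssi;	/* dBm */
};

/* Add 'delta' to the RSSI of BSSes on 'adj_band' only. */
static int adjusted_rssi(const struct bss *b, enum band adj_band, int delta)
{
	return b->rssi + (b->band == adj_band ? delta : 0);
}

/* True if 'cand' beats 'cur' once both have been adjusted. */
static bool is_better(const struct bss *cand, const struct bss *cur,
		      enum band adj_band, int delta)
{
	assert(cand && cur);
	return adjusted_rssi(cand, adj_band, delta) >
	       adjusted_rssi(cur, adj_band, delta);
}
```

Adjusting 5 GHz by +d and adjusting 2.4 GHz by -d shift every adjusted value by the same constant relative to each other, so `is_better()` gives the same answer either way.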

Perhaps we should even consider doing that in cfg80211 and adjusting
the internal API for both that way?

> I am not saying it should be avoided. Just looking at it conceptually
> the scheduled scan request holds so-called matchsets that specify the
> constraints to determine whether a BSS was found that is worth
> notifying the host/user-space about. As such I would expect the
> relative RSSI attribute(s) to be part of the matchset. That way you
> can specify it together with the currently connected SSID in a single
> matchset.

I think this makes a lot of sense.

We already have NL80211_SCHED_SCAN_MATCH_ATTR_RSSI, which asks to be
reporting only networks that have an *absolute* RSSI value above the
value of the attribute - a new attribute to make it relative to the
current network instead would make sense.

That would indeed be equivalent to NL80211_BSS_SELECT_ATTR_RSSI then.

Now, if we consider this, NL80211_ATTR_SCHED_SCAN_RELATIVE_RSSI
actually is equivalent to NL80211_BSS_SELECT_ATTR_RSSI (a flag
attribute indicating whether or not RSSI-based selection/matching is
done) and NL80211_ATTR_SCHED_SCAN_RELATIVE_RSSI_5G_PREF is equivalent
to NL80211_BSS_SELECT_ATTR_RSSI_ADJUST, both need to be given with the
flag and affect operation.

However, NL80211_BSS_SELECT_ATTR_BAND_PREF doesn't exist, and reusing
the BSS_SELECT namespace also doesn't make sense.


So, how about we move these into NL80211_SCHED_SCAN_MATCH_ATTR_* as
suggested by Arend, and define them with the same content as the
corresponding NL80211_BSS_SELECT_ATTR_*?

If they're part of match attributes, we might even remove the feature
flag entirely - those were always defined to be optional, but it may very
well be worthwhile for userspace to know if they're supported if it
wants to behave differently depending on whether they're supported or
not, I'll leave that up to you since presumably you know the userspace
implementation that you're planning to create.

johannes

^ permalink raw reply

* Re: [PATCH v3] cfg80211: NL80211_ATTR_SOCKET_OWNER support for CMD_CONNECT
From: Johannes Berg @ 2016-12-13 15:36 UTC (permalink / raw)
  To: Andrew Zaborowski, linux-wireless
In-Reply-To: <1481643200.20412.9.camel@sipsolutions.net>


> All the code you added to nl80211.c is racy.

Maybe it's not? But that'd be hard to reason about, having to look into
af_netlink.c and all, so it's easier to just make it not appear racy
here.

johannes

^ permalink raw reply

* Re: [PATCH v3] cfg80211: NL80211_ATTR_SOCKET_OWNER support for CMD_CONNECT
From: Johannes Berg @ 2016-12-13 15:33 UTC (permalink / raw)
  To: Andrew Zaborowski, linux-wireless
In-Reply-To: <20161212164500.691-1-andrew.zaborowski@intel.com>

[snip]

Please fix coding style, particularly indentation.

> +static void cfg80211_disconnect_wk(struct work_struct *work)
> +{
> +       struct cfg80211_registered_device *rdev;
> +       struct wireless_dev *wdev;
> +
> +       wdev = container_of(work, struct wireless_dev, disconnect_wk);
> +       rdev = wiphy_to_rdev(wdev->wiphy);


Those should also be possible as initializers on the same line, I
guess?
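
As an illustration of the initializer form, here is a user-space sketch. The `container_of()` macro below is a simplified equivalent of the kernel one (no type checking), and `fake_work`, `fake_wdev` and `fake_disconnect_wk()` are hypothetical stand-ins, not the cfg80211 structures.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space version of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct fake_work {
	int pending;
};

struct fake_wdev {
	int ifindex;
	struct fake_work disconnect_wk;
};

/* Declaration and initialization on the same line. */
static int fake_disconnect_wk(struct fake_work *work)
{
	struct fake_wdev *wdev = container_of(work, struct fake_wdev,
					      disconnect_wk);

	/* sanity check: the embedded member is the pointer we started from */
	assert(&wdev->disconnect_wk == work);
	return wdev->ifindex;
}
```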

It might also be worthwhile moving this function into a better file,
even if then it needs a prototype in core.h (it can't be inlined anyway
since it's called through a function pointer in the work struct)

> +       if (!wdev->netdev)
> +               return;

This obviously cannot happen.

All the code you added to nl80211.c is racy.

johannes

^ permalink raw reply

* Re: [PATCH v2 1/2] mac80211: Remove invalid flag operations in mesh TSF synchronization
From: Johannes Berg @ 2016-12-13 15:23 UTC (permalink / raw)
  To: Masashi Honma, me; +Cc: linux-wireless
In-Reply-To: <1481159751-4097-1-git-send-email-masashi.honma@gmail.com>

On Thu, 2016-12-08 at 10:15 +0900, Masashi Honma wrote:
> mesh_sync_offset_adjust_tbtt() implements Extensible synchronization
> framework ([1] 13.13.2 Extensible synchronization framework). It
> shall
> not operate the flag "TBTT Adjusting subfield" ([1] 8.4.2.100.8 Mesh
> Capability), since it is used only for MBCA ([1] 13.13.4 Mesh beacon
> collision avoidance, see 13.13.4.4.3 TBTT scanning and adjustment
> procedures for detail). So this patch removes the flag operations.
> 

Both applied; I changed this patch to remove ifmsh->adjusting_tbtt
completely since it was now unused.

johannes

^ permalink raw reply

* Re: [PATCH v2 2/2] net: rfkill: Add rfkill-any LED trigger
From: Johannes Berg @ 2016-12-13 15:18 UTC (permalink / raw)
  To: Michał Kępień, David S . Miller
  Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20161208073052.12988-2-kernel@kempniu.pl>

On Thu, 2016-12-08 at 08:30 +0100, Michał Kępień wrote:
> Add a new "global" (i.e. not per-rfkill device) LED trigger, rfkill-
> any,
> which may be useful on laptops with a single "radio LED" and multiple
> radio transmitters.  The trigger is meant to turn a LED on whenever
> there is at least one radio transmitter active and turn it off
> otherwise.
> 

Also applied, but I moved the discussion of the mutex into the recorded
commit log.

johannes

^ permalink raw reply

* Re: [PATCH v2 1/2] net: rfkill: Cleanup error handling in rfkill_init()
From: Johannes Berg @ 2016-12-13 15:16 UTC (permalink / raw)
  To: Michał Kępień, David S . Miller
  Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20161208073052.12988-1-kernel@kempniu.pl>

On Thu, 2016-12-08 at 08:30 +0100, Michał Kępień wrote:
> Use a separate label per error condition in rfkill_init() to make it
> a bit cleaner and easier to extend.

applied.

johannes

^ permalink raw reply

* Re: [PATCH v2] mac80211: Ensure enough headroom when forwarding mesh pkt
From: Johannes Berg @ 2016-12-13 15:09 UTC (permalink / raw)
  To: Cedric Izoard, linux-wireless@vger.kernel.org
In-Reply-To: <1ffe01100a724290ab910d68980604ba@ceva-dsp.com>

On Wed, 2016-12-07 at 09:59 +0000, Cedric Izoard wrote:
> When a buffer is duplicated during MESH packet forwarding,
> this patch ensures that the new buffer has enough headroom.

Applied.

johannes

^ permalink raw reply

* Re: [PATCH] ath9k: unlock rcu read when returning early
From: Felix Fietkau @ 2016-12-13 13:52 UTC (permalink / raw)
  To: Tobias Klausmann, kvalo, helgaas, linux-kernel, linux-pci,
	marc.zyngier, Janusz.Dziedzic, rmanohar, ath9k-devel,
	linux-wireless, rmanohar, bharat.kumar.gogada
In-Reply-To: <7b4b7748-06d6-92d4-228c-e7ebf00f8699@mni.thm.de>

On 2016-12-13 14:41, Tobias Klausmann wrote:
> On 13.12.2016 11:41, Felix Fietkau wrote:
>> On 2016-12-12 19:50, Tobias Klausmann wrote:
>>> diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
>>> index 52bfbb988611..857d5ae09a1d 100644
>>> --- a/drivers/net/wireless/ath/ath9k/xmit.c
>>> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
>>> @@ -2787,6 +2787,7 @@ void ath_tx_edma_tasklet(struct ath_softc *sc)
>>>   		fifo_list = &txq->txq_fifo[txq->txq_tailidx];
>>>   		if (list_empty(fifo_list)) {
>>>   			ath_txq_unlock(sc, txq);
>>> +			rcu_read_unlock();
>> Technically this is fine as well, but I'd prefer a fix where you replace
>> the 'return' with 'break', thus avoiding the duplication of
>> rcu_read_unlock()
> 
> Actually if you want to avoid it, maybe skipping over the rest is better 
> (as originally intended):
> 
> ...
> 
> ath_txq_unlock(sc, txq);
> 
> 
> goto unlock;
> }
> ...
> 
> unlock:
> rcu_read_unlock();
There are already other places that skip to the rcu_read_unlock() part
by using 'break'. I don't see how adding an unnecessary goto makes
things any better.
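
To make the 'break' argument concrete, here is a small user-space sketch; it is not the ath9k code. The counter `rcu_depth` and the helpers `fake_rcu_read_lock()` / `fake_rcu_read_unlock()` are hypothetical stand-ins for the real RCU primitives, and `process_queues()` is an invented name. It only shows that leaving the loop with 'break' reaches the single unlock at the end of the function, whereas an early 'return' at the same spot would skip it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for rcu_read_lock()/rcu_read_unlock(). */
static int rcu_depth;

static void fake_rcu_read_lock(void)
{
	rcu_depth++;
}

static void fake_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);
	rcu_depth--;
}

/*
 * Process queues in order and stop at the first empty one.
 * Returns the number of entries handled.
 */
static int process_queues(const int *queue_len, size_t n)
{
	int handled = 0;
	size_t i;

	fake_rcu_read_lock();
	for (i = 0; i < n; i++) {
		if (queue_len[i] == 0)
			break;	/* a 'return' here would leave rcu_depth at 1 */
		handled += queue_len[i];
	}
	fake_rcu_read_unlock();

	return handled;
}
```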

- Felix

^ permalink raw reply

* Re: [RFC v2 05/11] ath10k: htc: refactorization
From: Michal Kazior @ 2016-12-13 13:52 UTC (permalink / raw)
  To: Valo, Kalle
  Cc: Erik Stromdahl, linux-wireless@vger.kernel.org,
	ath10k@lists.infradead.org
In-Reply-To: <87inqoymd0.fsf@kamboji.qca.qualcomm.com>

On 13 December 2016 at 14:44, Valo, Kalle <kvalo@qca.qualcomm.com> wrote:
> Erik Stromdahl <erik.stromdahl@gmail.com> writes:
>
>> Code refactorization:
>>
>> Moved the code for ep 0 in ath10k_htc_rx_completion_handler
>> to ath10k_htc_control_rx_complete.
>>
>> This eases the implementation of SDIO/mbox significantly since
>> the ep_rx_complete cb is invoked directly from the SDIO/mbox
>> hif layer.
>>
>> Since the ath10k_htc_control_rx_complete already is present
>> (only containing a warning message) there is no reason for not
>> using it (instead of having a special case for ep 0 in
>> ath10k_htc_rx_completion_handler).
>>
>> Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
>
> I tested this on a QCA988X PCI board just to see if there are any
> regressions. It crashes immediately during module load, every time, and
> I bisected the crash to this patch:
>
> [ 1239.715325] ath10k_pci 0000:02:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
> [ 1239.885125] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:02:00.0.bin failed with error -2
> [ 1239.885260] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/cal-pci-0000:02:00.0.bin failed with error -2
> [ 1239.885687] ath10k_pci 0000:02:00.0: qca988x hw2.0 target 0x4100016c chip_id 0x043202ff sub 0000:0000
> [ 1239.885699] ath10k_pci 0000:02:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 1 testmode 1
> [ 1239.885899] ath10k_pci 0000:02:00.0: firmware ver 10.2.4.70.59-2 api 5 features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 4159f498
> [ 1239.941836] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
> [ 1239.941993] ath10k_pci 0000:02:00.0: board_file api 1 bmi_id N/A crc32 bebc7c08
> [ 1241.136693] BUG: unable to handle kernel NULL pointer dereference at   (null)
> [ 1241.136738] IP: [<  (null)>]   (null)
> [ 1241.136759] *pdpt = 0000000000000000 *pde = f0002a55f0002a55 [ 1241.136781]
> [ 1241.136793] Oops: 0010 [#1] SMP
>
> What's odd is that when I added some printks of my own and enabled both
> boot and htc debug levels, it didn't crash anymore. Everything works
> normally after that: I can start AP mode and connect to it. Is there
> a race somewhere?

Yes. htc_wait_target() is called after hif_start(). The ep_rx_complete
is set in htc_wait_target() [changed in patch 4, but still too late].

ep_rx_complete must be set prior to calling hif_start(). You probably
crash on end of ath10k_htc_rx_completion_handler() when trying to call
ep->ep_ops.ep_rx_complete(ar, skb).
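
The ordering problem can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not the ath10k code: `fake_ep`, `fake_hif_start()` and `fake_bring_up()` are hypothetical, and where the model returns -1 the real driver would call through a NULL function pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified endpoint with a single rx callback. */
struct fake_ep {
	void (*ep_rx_complete)(int *counter);
};

static void count_rx(int *counter)
{
	(*counter)++;
}

/*
 * Models a transport that may deliver a message as soon as it is
 * started. Returns 0 if the message was delivered, -1 if no handler
 * was installed yet.
 */
static int fake_hif_start(struct fake_ep *ep, int *counter)
{
	if (!ep->ep_rx_complete)
		return -1;
	ep->ep_rx_complete(counter);
	return 0;
}

/* Safe ordering: install the handler first, then start the transport. */
static int fake_bring_up(struct fake_ep *ep, int *counter)
{
	assert(ep && counter);
	ep->ep_rx_complete = count_rx;
	return fake_hif_start(ep, counter);
}
```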


Michał

^ permalink raw reply

* Re: [RFC v2 05/11] ath10k: htc: refactorization
From: Valo, Kalle @ 2016-12-13 13:44 UTC (permalink / raw)
  To: Erik Stromdahl; +Cc: linux-wireless@vger.kernel.org, ath10k@lists.infradead.org
In-Reply-To: <1479496971-19174-6-git-send-email-erik.stromdahl@gmail.com>

Erik Stromdahl <erik.stromdahl@gmail.com> writes:

> Code refactorization:
>
> Moved the code for ep 0 in ath10k_htc_rx_completion_handler
> to ath10k_htc_control_rx_complete.
>
> This eases the implementation of SDIO/mbox significantly since
> the ep_rx_complete cb is invoked directly from the SDIO/mbox
> hif layer.
>
> Since the ath10k_htc_control_rx_complete already is present
> (only containing a warning message) there is no reason for not
> using it (instead of having a special case for ep 0 in
> ath10k_htc_rx_completion_handler).
>
> Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>

I tested this on a QCA988X PCI board just to see if there are any
regressions. It crashes immediately during module load, every time, and
I bisected the crash to this patch:

[ 1239.715325] ath10k_pci 0000:02:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[ 1239.885125] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:02:00.0.bin failed with error -2
[ 1239.885260] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/cal-pci-0000:02:00.0.bin failed with error -2
[ 1239.885687] ath10k_pci 0000:02:00.0: qca988x hw2.0 target 0x4100016c chip_id 0x043202ff sub 0000:0000
[ 1239.885699] ath10k_pci 0000:02:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 1 testmode 1
[ 1239.885899] ath10k_pci 0000:02:00.0: firmware ver 10.2.4.70.59-2 api 5 features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 4159f498
[ 1239.941836] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
[ 1239.941993] ath10k_pci 0000:02:00.0: board_file api 1 bmi_id N/A crc32 bebc7c08
[ 1241.136693] BUG: unable to handle kernel NULL pointer dereference at   (null)
[ 1241.136738] IP: [<  (null)>]   (null)
[ 1241.136759] *pdpt = 0000000000000000 *pde = f0002a55f0002a55 [ 1241.136781]
[ 1241.136793] Oops: 0010 [#1] SMP

What's odd is that when I added some printks of my own and enabled both
boot and htc debug levels, it didn't crash anymore. Everything works
normally after that: I can start AP mode and connect to it. Is there
a race somewhere?

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH] ath9k: unlock rcu read when returning early
From: Tobias Klausmann @ 2016-12-13 13:41 UTC (permalink / raw)
  To: Felix Fietkau, kvalo, helgaas, linux-kernel, linux-pci,
	marc.zyngier, Janusz.Dziedzic, rmanohar, ath9k-devel,
	linux-wireless, rmanohar, bharat.kumar.gogada
In-Reply-To: <e15c5efc-d548-8b32-ca86-5ec26506ff7c@nbd.name>



On 13.12.2016 11:41, Felix Fietkau wrote:
> On 2016-12-12 19:50, Tobias Klausmann wrote:
>> Starting with ath9k: use ieee80211_tx_status_noskb where possible
>> [d94a461d7a7df68991fb9663531173f60ef89c68] the driver uses rcu_read_lock() &&
>> rcu_read_unlock() yet on returning early in ath_tx_edma_tasklet() the unlock is
>> missing leading to stalls and suspicious RCU usage:
>>
>>   ===============================
>>   [ INFO: suspicious RCU usage. ]
>>   4.9.0-rc8 #11 Not tainted
>>   -------------------------------
>>   kernel/rcu/tree.c:705 Illegal idle entry in RCU read-side critical section.!
>>
>>   other info that might help us debug this:
>>
>>   RCU used illegally from idle CPU!
>>   rcu_scheduler_active = 1, debug_locks = 0
>>   RCU used illegally from extended quiescent state!
>>   1 lock held by swapper/7/0:
>>   #0:
>>    (
>>   rcu_read_lock
>>   ){......}
>>   , at:
>>   [<ffffffffa06ed110>] ath_tx_edma_tasklet+0x0/0x450 [ath9k]
>>
>>   stack backtrace:
>>   CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.9.0-rc8 #11
>>   Hardware name: Acer Aspire V3-571G/VA50_HC_CR, BIOS V2.21 12/16/2013
>>    ffff88025efc3f38 ffffffff8132b1e5 ffff88017ede4540 0000000000000001
>>    ffff88025efc3f68 ffffffff810a25f7 ffff88025efcee60 ffff88017edebdd8
>>    ffff88025eeb5400 0000000000000091 ffff88025efc3f88 ffffffff810c3cd4
>>   Call Trace:
>>    <IRQ>
>>    [<ffffffff8132b1e5>] dump_stack+0x68/0x93
>>    [<ffffffff810a25f7>] lockdep_rcu_suspicious+0xd7/0x110
>>    [<ffffffff810c3cd4>] rcu_eqs_enter_common.constprop.85+0x154/0x200
>>    [<ffffffff810c5a54>] rcu_irq_exit+0x44/0xa0
>>    [<ffffffff81058631>] irq_exit+0x61/0xd0
>>    [<ffffffff81018d25>] do_IRQ+0x65/0x110
>>    [<ffffffff81672189>] common_interrupt+0x89/0x89
>>    <EOI>
>>    [<ffffffff814ffe11>] ? cpuidle_enter_state+0x151/0x200
>>    [<ffffffff814ffee2>] cpuidle_enter+0x12/0x20
>>    [<ffffffff8109a6ae>] call_cpuidle+0x1e/0x40
>>    [<ffffffff8109a8f6>] cpu_startup_entry+0x146/0x220
>>    [<ffffffff810336f8>] start_secondary+0x148/0x170
>>
>> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
>> ---
>>   drivers/net/wireless/ath/ath9k/xmit.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
>> index 52bfbb988611..857d5ae09a1d 100644
>> --- a/drivers/net/wireless/ath/ath9k/xmit.c
>> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
>> @@ -2787,6 +2787,7 @@ void ath_tx_edma_tasklet(struct ath_softc *sc)
>>   		fifo_list = &txq->txq_fifo[txq->txq_tailidx];
>>   		if (list_empty(fifo_list)) {
>>   			ath_txq_unlock(sc, txq);
>> +			rcu_read_unlock();
> Technically this is fine as well, but I'd prefer a fix where you replace
> the 'return' with 'break', thus avoiding the duplication of
> rcu_read_unlock()

Actually if you want to avoid it, maybe skipping over the rest is better 
(as originally intended):

...

ath_txq_unlock(sc, txq);


goto unlock;
}
...

unlock:
rcu_read_unlock();

Thanks,
Tobias
>
> Thanks,
>
> - Felix
>

^ permalink raw reply

* Re: [RFC v2 11/11] ath10k: Added sdio support
From: Valo, Kalle @ 2016-12-13 13:10 UTC (permalink / raw)
  To: Erik Stromdahl; +Cc: linux-wireless@vger.kernel.org, ath10k@lists.infradead.org
In-Reply-To: <1479496971-19174-12-git-send-email-erik.stromdahl@gmail.com>

Erik Stromdahl <erik.stromdahl@gmail.com> writes:

> Initial HIF sdio/mailbox implementation.
>
> Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>

While testing this I noticed a few new warnings:

drivers/net/wireless/ath/ath10k/sdio.c: In function ath10k_sdio_probe:
drivers/net/wireless/ath/ath10k/sdio.c:1723:6: warning: 'ret' may be used uninitialized in this function [-Wuninitialized]
drivers/net/wireless/ath/ath10k/sdio.c:375:5: warning: symbol 'ath10k_sdio_mbox_rxmsg_pending_handler' was not declared. Should it be static?
drivers/net/wireless/ath/ath10k/sdio.c:1018:5: warning: symbol 'ath10k_sdio_hif_tx_sg' was not declared. Should it be static?
drivers/net/wireless/ath/ath10k/sdio.c:1415:5: warning: symbol 'ath10k_sdio_hif_exchange_bmi_msg' was not declared. Should it be static?
drivers/net/wireless/ath/ath10k/sdio.c:1555:5: warning: symbol 'ath10k_sdio_hif_map_service_to_pipe' was not declared. Should it be static?
drivers/net/wireless/ath/ath10k/sdio.c:1635:6: warning: symbol 'ath10k_sdio_hif_get_default_pipe' was not declared. Should it be static?
drivers/net/wireless/ath/ath10k/htc.c:265: line over 90 characters
drivers/net/wireless/ath/ath10k/htc.c:355: line over 90 characters

-- 
Kalle Valo

^ permalink raw reply

