* RE: ath10k firmware sends probes on DFS channels without radar detection
From: Jean-Pierre Tosoni @ 2016-12-14 18:14 UTC
To: 'Ben Greear', ath10k, linux-wireless
In-Reply-To: <63c731f2-ee21-6450-2ae4-2808e7130eb5@candelatech.com>
On 12/06/15 08:36 PM, Ben Greear wrote:
> On 12/06/2016 09:02 AM, Jean-Pierre Tosoni wrote:
> > This follows on the previous discussion
> > "Client station sends probes on DFS channels"
> >
> > Problem:
> > The combination of QCA988X firmware v10.2.4.70-2 + ath10k +
> > wpa_supplicant do not comply with the norm ETSI/EN 301-893 section
> > 4.7; because they can send probes for 600s when no AP is around.
> >
> > Analysis:
> > The problem seems to lie in the firmware, which regards the presence
> > of *any* beacon as a proof that the channel is radar-clean for 600s.
> >
> > This is a wrong hypothesis, since a rogue AP sending fraudulent
> > beacons should not induce a scrupulous STA in sending illegal probes.
> >
> > Moreover, the norm (table D.1) sets a time limit of 10s to shutdown
> > when no AP positively allows the STA to transmit on the DFS channel.
> >
> > Status:
> > - there is no known plan at QCA to fix the issue.
> > - ath10k firmware is not publicly available for fixes.
> > - there is no obvious fix working in ath10k.
> > - the issue does not show up with other mac80211 devices like ath9k.
> > - wpa_supplicant considers this is a kernel issue [2]
>
> Have you confirmed that there are no probe requests being sent by ath10k
> driver?

I put a printk in the ath10k_tx function (in the management-frame path)
and its output never shows up.
On the other hand, I cannot find any other function that prepares or sends
probes. There is only the WMI command that starts scans.

>
> You could also try disabling the station keep-alive and roaming logic in
> the firmware by tweaking the wmi initial setup logic. I disable that in
> my firmware, for instance, because mac80211 can do a better job and then I
> can save resources in the firmware.

Do you mean the values set in ath10k_wmi_start_scan_init() in wmi.c?
Should I replace the whole scan offload with a mac80211-controlled scan, as
in ath9k? The problem then is switching channels; channel switching in
ath10k seems very different from ath9k.

Thanks,
JP
> Finally, if that doesn't work, then I could probably fix that in CT
> firmware in case that is of interest.
>
> Thanks,
> Ben
>
> >
> > Jean-Pierre Tosoni
> >
> >
> >
> >> -----Message d'origine-----
> >> De : ath10k [mailto:ath10k-bounces@lists.infradead.org] De la part de
> >> Jean-Pierre Tosoni Envoyé : mercredi 30 novembre 2016 19:04 À :
> >> ath10k@lists.infradead.org Objet : Client station sends probes on DFS
> >> channels
> >>
> >> Hello list,
> >>
> >> There is a case where I can see probes on a DFS channel, from a non-
> >> associated station using ath10k. (note that the problem does not
> >> arise when using ath9k).
> >>
> >> *The setup*
> >>
> >> I am using Openwrt, wpa_supplicant and compat-wireless 2016-10-08,
> >> Card firmware is ath10k_pci: qca988x hw2.0 (0x4100016c, 0x043202ff)
> >> fw 10.2.4.70-2 api 5 htt-ver 2.1 wmi-op 5 htt-op 2
> >> cal otp max-sta 128 features no-p2p
> >> ath10k_pci: debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
> >>
> >> I am using channel 116, regdom US or FR, where I see no traffic at
> >> all using wireshark+Airpcap.
> >> I set wpa_supplicant to scan this channel only for a specific SSID
> >> "ssid1".
> >>
> >> At initial startup of the client device, no probes are going out,
> >> which is OK.
> >>
> >> Then, on another device, I start a hostapd set to channel 116, with a
> >> different SSID "otherssid" so that the supplicant will not associate.
> >>
> >> Shortly (1-2s) after the beacons appear in the air, the client begins
> >> to Send probe request in the air, which is unexpected, but acceptable
> >> since the client can infer absence of radars from the presence of
> beacons.
> >>
> >> *The problem*
> >>
> >> If I power down the AP, the client continues to send probes for
> >> around
> >> 10 minutes, which is unacceptable since it cannot handle radar
> >> detection, as it is a slave device in the meaning of ETSI/EN 301-
> 893[1].
> >>
> >>
> >> Some tests I made:
> >> - I tried to investigate the "beacon hint" mechanism but it appears
> >> that it is not used on DFS channels.
> >>
> >> - I tried to force the IEEE80211_NO_IR flag when IEEE80211_CHAN_DFS
> >> is set.
> >>
> >> - When I reload the reg. domain using "iw reg set", the probes cease,
> >> but will reappear if I cycle my AP again On then Off.
> >>
> >> - When I let the client associate, then disassociate and stop the AP,
> >> the same problem arises. It disappears if I add a call to
> >> ath10k_regd_update() in mac.c after a disconnection (This is not a
> >> fix, since in my case the client never associates).
> >>
> >> - Since at startup, the client does not send probes, I infer that the
> >> problem is *not* caused by a hidden AP that the card could see but
> >> not airpcap.
> >>
> >> - I tried with channels 52 and 100, with regdom FR or US: same problem.
> >>
> >> Any ideas?
> >>
> >>
> >> [1] http://www.etsi.org/deliver/etsi_en/301800_301899/301893/
> >> 01.08.01_60/en_301893v010801p.pdf
> > [2] http://lists.shmoo.com/pipermail/hostap/2015-January/031906.html
> >>
> >> J.P. Tosoni - ACKSYS
> >>
> >>
> >>
> >>
> >> _______________________________________________
> >> ath10k mailing list
> >> ath10k@lists.infradead.org
> >> http://lists.infradead.org/mailman/listinfo/ath10k
> >
>
>
> --
> Ben Greear <greearb@candelatech.com>
> Candela Technologies Inc http://www.candelatech.com
>
>
> _______________________________________________
> ath10k mailing list
> ath10k@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/ath10k
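
The timing gap described in the report (probes observed for about 600 s, against the 10 s limit cited from table D.1) can be illustrated with a toy model. This is only a sketch, not ath10k or firmware code: the function name `may_send_probe` and the notion of a single "last beacon seen" timestamp are assumptions made for illustration; only the two constants come from the thread.

```python
# Toy model of a DFS slave deciding whether it may still send probes.
# The two constants are taken from the thread; the rest is hypothetical.
FIRMWARE_WINDOW_S = 600.0  # behaviour observed with firmware 10.2.4.70-2
NORM_LIMIT_S = 10.0        # limit quoted from ETSI EN 301 893, table D.1


def may_send_probe(now_s, last_beacon_s, window_s):
    """Return True if a beacon was heard within the last window_s seconds."""
    if last_beacon_s is None:
        return False  # no AP ever heard: stay passive
    return (now_s - last_beacon_s) <= window_s
```

With the last beacon at t = 0 s and the AP then powered down, a check at t = 30 s still permits probing under the 600 s window but not under the 10 s limit, which is the discrepancy reported above.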
* Re: [Patch] NFC: trf7970a:
From: Geoff Lansberry @ 2016-12-14 18:35 UTC
To: Mark Greer
Cc: linux-wireless, Lauro Ramos Venancio, Aloisio Almeida Jr,
Samuel Ortiz, Justin Bronder
In-Reply-To: <20161214171010.GA29321@animalcreek.com>
On Wed, Dec 14, 2016 at 12:10 PM, Mark Greer <mgreer@animalcreek.com> wrote:
> On Wed, Dec 14, 2016 at 11:17:33AM -0500, Geoff Lansberry wrote:
>> On Wed, Dec 14, 2016 at 10:57 AM, Mark Greer <mgreer@animalcreek.com> wrote:
>> >
>> > On Tue, Dec 13, 2016 at 08:50:04PM -0500, Geoff Lansberry wrote:
>> > > Hi Mark - Thanks for getting back to me. It's funny that you ask,
>> > > because we are currently chasing a segfault that is happening in neard, but
>> > > may end up back in the trf7970a driver. Have you ever heard on anyone
>> > > having segfault problems related to the trf7970a hardware drivers?
>> >
>> > No. Mind sharing more info on that segfault?
>> >
>> > > I'll get you an update later tonight or tomorrow.
>> >
>> > Okay, thanks.
>> >
>> > Mark
>> > --
>>
>> Mark - The segfault issue is only happening on writing, The work on
>> the segfault is being done by a consultant, but here is his statement
>> on how to recreate it on our build:
>>
>> I am able to reliably force neard to segfault by flooding it with
>> write requests. I have attached a python script called flood.py that
>> can be used to do this. The script uses utilities that ship with
>> neard.
>>
>> The segfault does not appear deterministic. It usually happens within
>> 1000 writes, but the time can varying greatly. The logs output from
>> neard are inconsistent between crashes, which suggests this may be a
>> timing or race condition related issue.
>>
>> I have been running neard manually to obtain the log information and a
>> core file for debugging (attached). I run neard as,
>>
>> $ /usr/lib/neard/nfc/neard -d -n
>>
>> In a separate terminal I run,
>>
>> $ python flood.py
>>
>> And the resulting core file provides the following backtrace,
>>
>> (gdb) bt
>> #0 0xb6caed64 in ?? ()
>> #1 0x0001ed7c in data_recv (resp=0x5bd90 "", length=17, data=0x58348)
>> at plugins/nfctype2.c:156
>> #2 0x00024ecc in execute_recv_cb (user_data=0x5bd88) at src/adapter.c:979
>> #3 0xb6e70d60 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb)
>>
>> The line at nfctype2.c:156 contains a memcpy operation.
>
> Thanks Geoff.
>
> What are the values of the arguments to memcpy()?
>
> I will look at it later today/tomorrow but if you have another NFC device
> to test with, it would help isolate whether it is neard or the trf7970a
> driver. The driver shouldn't be able to make neard crash like this but
> who knows.
>
> You could also try testing older versions of neard to see if they also
> fail and if not, start bisecting from there. Maybe test a different
> tag type too.
>
> Mark
> --

Mark - We can't seem to get gdb to run on our board, so we can't see
the exact arguments. Here is what our consultant has to say about
your question:

The backtrace seems to indicate that the error is occurring in neard,
not the driver.

Since the driver is built as a module, your kernel won't crash if
there is a problem in it, but you should be told that the error is
originating in the module.

It is also possible that the NFC driver does have a non-fatal problem
in it (such as returning unexpected data) that is propagating to neard
and causing the error there.

Of course, it is also worth noting:

    Backtrace stopped: previous frame identical to this frame (corrupt stack?)

and the same address appearing twice -- what I would assume to be your
memcpy address, since that is the last call made on a given source
line. If the stack is corrupt, then the error could very well
originate in the driver and not neard.
* Re: ath10k firmware sends probes on DFS channels without radar detection
From: Ben Greear @ 2016-12-14 18:28 UTC
To: Jean-Pierre Tosoni, ath10k, linux-wireless
In-Reply-To: <00e701d25635$e0108480$a0318d80$@acksys.fr>
On 12/14/2016 10:14 AM, Jean-Pierre Tosoni wrote:
> On 12/06/15 08:36 PM, Ben Greear wrote:
>> On 12/06/2016 09:02 AM, Jean-Pierre Tosoni wrote:
>>> This follows on the previous discussion
>>> "Client station sends probes on DFS channels"
>>>
>>> Problem:
>>> The combination of QCA988X firmware v10.2.4.70-2 + ath10k +
>>> wpa_supplicant do not comply with the norm ETSI/EN 301-893 section
>>> 4.7; because they can send probes for 600s when no AP is around.
>>>
>>> Analysis:
>>> The problem seems to lie in the firmware, which regards the presence
>>> of *any* beacon as a proof that the channel is radar-clean for 600s.
>>>
>>> This is a wrong hypothesis, since a rogue AP sending fraudulent
>>> beacons should not induce a scrupulous STA in sending illegal probes.
>>>
>>> Moreover, the norm (table D.1) sets a time limit of 10s to shutdown
>>> when no AP positively allows the STA to transmit on the DFS channel.
>>>
>>> Status:
>>> - there is no known plan at QCA to fix the issue.
>>> - ath10k firmware is not publicly available for fixes.
>>> - there is no obvious fix working in ath10k.
>>> - the issue does not show up with other mac80211 devices like ath9k.
>>> - wpa_supplicant considers this is a kernel issue [2]
>>
>> Have you confirmed that there are no probe requests being sent by ath10k
>> driver?
>
> I have put a printk in the ath10k_tx function (in the path for management
> frames) and it does not show up.
> On another hand I cannot find any other function preparing / sending probes.
> There is only the wmi command that starts scans.
That wmi command causes probes, so make sure it is not being called.
>> You could also try disabling the station keep-alive and roaming logic in
>> the firmware by tweaking the wmi initial setup logic. I disable that in
>> my firmware, for instance, because mac80211 can do a better job and then I
>> can save resources in the firmware.
>
> Do you mean the values set in wmi.c:: ath10k_wmi_start_scan_init() ?
>
> Should I replace all the scan offload by a mac80211-controlled scan like
> in ath9k? The problem then is to switch channels; channel switching
> seems very different from ath9k.

You might try one or more of these settings in the ath10k_wmi_10_1_op_gen_init
method (or 10.2 if that is the FW you are using):

    config.roam_offload_max_vdev = 0; /* disable roaming */
    config.roam_offload_max_ap_profiles = 0; /* disable roaming */
    config.bmiss_offload_max_vdev = 0;

I have only tested this using my CT firmware, and I have some additional
patches in my driver to enable mac80211 keep-alive logic when using my
CT firmware.

Thanks,
Ben
>
> Thanks,
> JP
>
>> Finally, if that doesn't work, then I could probably fix that in CT
>> firmware in case that is of interest.
>>
>> Thanks,
>> Ben
>>
>>>
>>> Jean-Pierre Tosoni
>>>
>>>
>>>
>>>> -----Message d'origine-----
>>>> De : ath10k [mailto:ath10k-bounces@lists.infradead.org] De la part de
>>>> Jean-Pierre Tosoni Envoyé : mercredi 30 novembre 2016 19:04 À :
>>>> ath10k@lists.infradead.org Objet : Client station sends probes on DFS
>>>> channels
>>>>
>>>> Hello list,
>>>>
>>>> There is a case where I can see probes on a DFS channel, from a non-
>>>> associated station using ath10k. (note that the problem does not
>>>> arise when using ath9k).
>>>>
>>>> *The setup*
>>>>
>>>> I am using Openwrt, wpa_supplicant and compat-wireless 2016-10-08,
>>>> Card firmware is ath10k_pci: qca988x hw2.0 (0x4100016c, 0x043202ff)
>>>> fw 10.2.4.70-2 api 5 htt-ver 2.1 wmi-op 5 htt-op 2
>>>> cal otp max-sta 128 features no-p2p
>>>> ath10k_pci: debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
>>>>
>>>> I am using channel 116, regdom US or FR, where I see no traffic at
>>>> all using wireshark+Airpcap.
>>>> I set wpa_supplicant to scan this channel only for a specific SSID
>>>> "ssid1".
>>>>
>>>> At initial startup of the client device, no probes are going out,
>>>> which is OK.
>>>>
>>>> Then, on another device, I start a hostapd set to channel 116, with a
>>>> different SSID "otherssid" so that the supplicant will not associate.
>>>>
>>>> Shortly (1-2s) after the beacons appear in the air, the client begins
>>>> to Send probe request in the air, which is unexpected, but acceptable
>>>> since the client can infer absence of radars from the presence of
>> beacons.
>>>>
>>>> *The problem*
>>>>
>>>> If I power down the AP, the client continues to send probes for
>>>> around
>>>> 10 minutes, which is unacceptable since it cannot handle radar
>>>> detection, as it is a slave device in the meaning of ETSI/EN 301-
>> 893[1].
>>>>
>>>>
>>>> Some tests I made:
>>>> - I tried to investigate the "beacon hint" mechanism but it appears
>>>> that it is not used on DFS channels.
>>>>
>>>> - I tried to force the IEEE80211_NO_IR flag when IEEE80211_CHAN_DFS
>>>> is set.
>>>>
>>>> - When I reload the reg. domain using "iw reg set", the probes cease,
>>>> but will reappear if I cycle my AP again On then Off.
>>>>
>>>> - When I let the client associate, then disassociate and stop the AP,
>>>> the same problem arises. It disappears if I add a call to
>>>> ath10k_regd_update() in mac.c after a disconnection (This is not a
>>>> fix, since in my case the client never associates).
>>>>
>>>> - Since at startup, the client does not send probes, I infer that the
>>>> problem is *not* caused by a hidden AP that the card could see but
>>>> not airpcap.
>>>>
>>>> - I tried with channels 52 and 100, with regdom FR or US: same problem.
>>>>
>>>> Any ideas?
>>>>
>>>>
>>>> [1] http://www.etsi.org/deliver/etsi_en/301800_301899/301893/
>>>> 01.08.01_60/en_301893v010801p.pdf
>>> [2] http://lists.shmoo.com/pipermail/hostap/2015-January/031906.html
>>>>
>>>> J.P. Tosoni - ACKSYS
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> ath10k mailing list
>>>> ath10k@lists.infradead.org
>>>> http://lists.infradead.org/mailman/listinfo/ath10k
>>>
>>
>>
>> --
>> Ben Greear <greearb@candelatech.com>
>> Candela Technologies Inc http://www.candelatech.com
>>
>>
>> _______________________________________________
>> ath10k mailing list
>> ath10k@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/ath10k
>
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: [Patch] NFC: trf7970a:
From: Mark Greer @ 2016-12-14 17:10 UTC
To: Geoff Lansberry
Cc: linux-wireless, Lauro Ramos Venancio, Aloisio Almeida Jr,
Samuel Ortiz, Justin Bronder
In-Reply-To: <CAO7Z3WKqhS5Q6qAaDs8364KP5-7ma=b_ic2B10=njngMmp5noQ@mail.gmail.com>
On Wed, Dec 14, 2016 at 11:17:33AM -0500, Geoff Lansberry wrote:
> On Wed, Dec 14, 2016 at 10:57 AM, Mark Greer <mgreer@animalcreek.com> wrote:
> >
> > On Tue, Dec 13, 2016 at 08:50:04PM -0500, Geoff Lansberry wrote:
> > > Hi Mark - Thanks for getting back to me. It's funny that you ask,
> > > because we are currently chasing a segfault that is happening in neard, but
> > > may end up back in the trf7970a driver. Have you ever heard on anyone
> > > having segfault problems related to the trf7970a hardware drivers?
> >
> > No. Mind sharing more info on that segfault?
> >
> > > I'll get you an update later tonight or tomorrow.
> >
> > Okay, thanks.
> >
> > Mark
> > --
>
> Mark - The segfault issue is only happening on writing, The work on
> the segfault is being done by a consultant, but here is his statement
> on how to recreate it on our build:
>
> I am able to reliably force neard to segfault by flooding it with
> write requests. I have attached a python script called flood.py that
> can be used to do this. The script uses utilities that ship with
> neard.
>
> The segfault does not appear deterministic. It usually happens within
> 1000 writes, but the time can varying greatly. The logs output from
> neard are inconsistent between crashes, which suggests this may be a
> timing or race condition related issue.
>
> I have been running neard manually to obtain the log information and a
> core file for debugging (attached). I run neard as,
>
> $ /usr/lib/neard/nfc/neard -d -n
>
> In a separate terminal I run,
>
> $ python flood.py
>
> And the resulting core file provides the following backtrace,
>
> (gdb) bt
> #0 0xb6caed64 in ?? ()
> #1 0x0001ed7c in data_recv (resp=0x5bd90 "", length=17, data=0x58348)
> at plugins/nfctype2.c:156
> #2 0x00024ecc in execute_recv_cb (user_data=0x5bd88) at src/adapter.c:979
> #3 0xb6e70d60 in ?? ()
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> (gdb)
>
> The line at nfctype2.c:156 contains a memcpy operation.

Thanks Geoff.

What are the values of the arguments to memcpy()?

I will look at it later today/tomorrow but if you have another NFC device
to test with, it would help isolate whether it is neard or the trf7970a
driver. The driver shouldn't be able to make neard crash like this but
who knows.

You could also try testing older versions of neard to see if they also
fail and if not, start bisecting from there. Maybe test a different
tag type too.

Mark
--
* Re: [PATCH 1/2] ath10k: add accounting for the extended peer statistics
From: Christian Lamparter @ 2016-12-14 16:38 UTC
To: Mohammed Shafi Shajakhan; +Cc: linux-wireless, ath10k, Kalle Valo
In-Reply-To: <20161214073337.GA4046@atheros-ThinkPad-T61>
On Wednesday, December 14, 2016 1:03:38 PM CET Mohammed Shafi Shajakhan wrote:
> > On Wednesday, December 7, 2016 11:58:24 AM CET Mohammed Shafi Shajakhan wrote:
> > > On Mon, Dec 05, 2016 at 10:52:45PM +0100, Christian Lamparter wrote:
> > > > @@ -409,10 +410,12 @@ void ath10k_debug_fw_stats_process(struct ath10k *ar, struct sk_buff *skb)
> > > > goto free;
> > > > }
> > > >
> > > > + if (!list_empty(&stats.peers))
> > >
> > > [shafi] sorry please correct me if i am wrong, for 'extended peer stats' we are checking
> > > for normal 'peer stats' ? Is this the fix intended, i had started a build to
> > > check your change and we will keep you posted, does this fix displaying
> > > 'rx_duration' in ath10k fw_stats.
> > The idea is not to queue any "extended peer stats" when there where no "peer stats" to
> > begin with. Because otherwise, the function is still vulnerable to OOM since the
> > extended peers stats will be queued unchecked (not that this is currently a problem).
>
> [shafi] list_splice_tail_init should still check for non-empty 'peers_extd' list
> and append if required ? please let me know if i am still missing something
Well, you can also count the entries in peers_extd and delete the old entries
if they start to overflow. If you want to do it differently, please go ahead.
Regards,
Christian
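
The suggestion above, counting the queued entries and dropping the oldest ones once they overflow, amounts to a bounded queue. Here is a minimal Python sketch of that idea; it is not the kernel implementation. `MAX_PEERS_EXTD` and `queue_extd_stats` are hypothetical names, and the real code would walk a `struct list_head` rather than use a deque.

```python
from collections import deque

MAX_PEERS_EXTD = 4  # hypothetical cap, chosen only for this example


def queue_extd_stats(queue, new_entries):
    """Append entries; a deque built with maxlen discards the oldest ones."""
    queue.extend(new_entries)
    return queue


peers_extd = deque(maxlen=MAX_PEERS_EXTD)
queue_extd_stats(peers_extd, ["p0", "p1", "p2"])
queue_extd_stats(peers_extd, ["p3", "p4", "p5"])
```

After the second call only the four newest entries remain, so memory use stays bounded regardless of how many statistics events arrive.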
* [PATCH] mac80211: only alloc mem if a station doesnt exist yet
From: Koen Vandeputte @ 2016-12-14 16:28 UTC
To: johannes; +Cc: linux-wireless, Koen Vandeputte
This speeds up the function in case a station already exists by avoiding
calling an expensive kzalloc just to free it again after the next check.
Signed-off-by: Koen Vandeputte <koen.vandeputte@ncentric.com>
---
net/mac80211/sta_info.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 1711bae..0a42e6e 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -513,23 +513,23 @@ static int sta_info_insert_finish(struct sta_info *sta) __acquires(RCU)
 {
 	struct ieee80211_local *local = sta->local;
 	struct ieee80211_sub_if_data *sdata = sta->sdata;
-	struct station_info *sinfo;
+	struct station_info *sinfo = NULL;
 	int err = 0;
 
 	lockdep_assert_held(&local->sta_mtx);
 
-	sinfo = kzalloc(sizeof(struct station_info), GFP_KERNEL);
-	if (!sinfo) {
-		err = -ENOMEM;
-		goto out_err;
-	}
-
 	/* check if STA exists already */
 	if (sta_info_get_bss(sdata, sta->sta.addr)) {
 		err = -EEXIST;
 		goto out_err;
 	}
 
+	sinfo = kzalloc(sizeof(struct station_info), GFP_KERNEL);
+	if (!sinfo) {
+		err = -ENOMEM;
+		goto out_err;
+	}
+
 	local->num_sta++;
 	local->sta_generation++;
 	smp_mb();
--
2.7.4
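
The reordering in this patch (check for an existing station first, allocate only afterwards) can be shown outside the kernel with a small model. This is a sketch of the pattern only: the `Registry` class, its allocation counter, and the string return values are all hypothetical stand-ins, not mac80211 code.

```python
class Registry:
    """Toy stand-in for the station table; counts allocations made."""

    def __init__(self):
        self.entries = {}
        self.allocations = 0

    def _alloc_info(self):
        self.allocations += 1  # models the kzalloc() call
        return {}

    def insert(self, addr):
        # Check for an existing entry first, so the duplicate path
        # returns without allocating anything.
        if addr in self.entries:
            return "EEXIST"
        self.entries[addr] = self._alloc_info()
        return "OK"
```

Inserting the same address twice then performs a single allocation, whereas allocating before the check would perform two and immediately free one.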
* Re: [Patch] NFC: trf7970a:
From: Geoff Lansberry @ 2016-12-14 16:17 UTC
To: Mark Greer
Cc: linux-wireless, Lauro Ramos Venancio, Aloisio Almeida Jr,
Samuel Ortiz, Justin Bronder
In-Reply-To: <20161214155743.GA22282@animalcreek.com>
On Wed, Dec 14, 2016 at 10:57 AM, Mark Greer <mgreer@animalcreek.com> wrote:
>
> On Tue, Dec 13, 2016 at 08:50:04PM -0500, Geoff Lansberry wrote:
> > Hi Mark - Thanks for getting back to me. It's funny that you ask,
> > because we are currently chasing a segfault that is happening in neard, but
> > may end up back in the trf7970a driver. Have you ever heard on anyone
> > having segfault problems related to the trf7970a hardware drivers?
>
> No. Mind sharing more info on that segfault?
>
> > I'll get you an update later tonight or tomorrow.
>
> Okay, thanks.
>
> Mark
> --

Mark - The segfault issue is only happening on writing. The work on
the segfault is being done by a consultant, but here is his statement
on how to recreate it on our build:

I am able to reliably force neard to segfault by flooding it with
write requests. I have attached a python script called flood.py that
can be used to do this. The script uses utilities that ship with
neard.

The segfault does not appear deterministic. It usually happens within
1000 writes, but the time can vary greatly. The logs output from
neard are inconsistent between crashes, which suggests this may be a
timing or race condition related issue.

I have been running neard manually to obtain the log information and a
core file for debugging (attached). I run neard as,

    $ /usr/lib/neard/nfc/neard -d -n

In a separate terminal I run,

    $ python flood.py

And the resulting core file provides the following backtrace,

    (gdb) bt
    #0 0xb6caed64 in ?? ()
    #1 0x0001ed7c in data_recv (resp=0x5bd90 "", length=17, data=0x58348)
       at plugins/nfctype2.c:156
    #2 0x00024ecc in execute_recv_cb (user_data=0x5bd88) at src/adapter.c:979
    #3 0xb6e70d60 in ?? ()
    Backtrace stopped: previous frame identical to this frame (corrupt stack?)
    (gdb)

The line at nfctype2.c:156 contains a memcpy operation.

Below is the flood.py script:

#!/usr/bin/python

import neardutils
import dbus
import time

bus = dbus.SystemBus()
DURATION = 0.05


def write():
    # Get an adapter interface
    objects = neardutils.get_managed_objects()
    for path, interfaces in objects.iteritems():
        if "org.neard.Adapter" in interfaces:
            break
    else:
        raise Exception("Unable to find adapter")
    print("adapter object path: " + path)
    adapter = dbus.Interface(bus.get_object("org.neard", path),
                             "org.freedesktop.DBus.Properties")

    # power cycle
    try:
        adapter.Set("org.neard.Adapter", "Powered", dbus.Boolean(0))
        time.sleep(DURATION)
    except:
        pass
    try:
        adapter.Set("org.neard.Adapter", "Powered", dbus.Boolean(1))
        time.sleep(DURATION)
    except:
        pass

    # Set polling
    adapter = dbus.Interface(bus.get_object("org.neard", path),
                             "org.neard.Adapter")
    adapter.StartPollLoop("Initiator")
    time.sleep(DURATION)

    # write tag
    objects = neardutils.get_managed_objects()
    for path, interfaces in objects.iteritems():
        if "org.neard.Tag" in interfaces:
            break
    else:
        raise Exception("Unable to find tag")
    print("tag object path: " + path)
    time.sleep(DURATION)
    tag = dbus.Interface(bus.get_object("org.neard", path), "org.neard.Tag")
    tag.Write(({
        "Type": "Text",
        "Encoding": "UTF-8",
        "Language": "en",
        "Representation": "omen_red_2014",
    }))
    time.sleep(DURATION)

    objects = neardutils.get_managed_objects()
    for path, interfaces in objects.iteritems():
        if "org.neard.Record" in interfaces:
            break
    else:
        raise Exception("Unable to read record")
    print("record object path: " + path)
    time.sleep(DURATION)
    record = dbus.Interface(bus.get_object("org.neard", path),
                            "org.freedesktop.DBus.Properties")
    print("representation: " + record.Get("org.neard.Record", "Representation"))


def main():
    for iteration in range(1000):
        try:
            print("==================================================")
            print("iteration: " + str(iteration))
            write()
            print("SUCCESS")
        except Exception, e:
            print(str(e))
            print("FAILURE")


if __name__ == "__main__":
    main()
-----
If we find the source of the problem, we will share it upstream. If
you have any thoughts on where to look, please share.
Geoff Lansberry
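
flood.py repeats the same lookup three times: walk the managed objects and stop at the first path that exposes a given interface. That step could be factored into one helper, sketched below. The helper name `find_object_path` is hypothetical, and the sketch assumes the managed-objects result behaves like a mapping from object path to a collection of interface names, which is what the loops in the script suggest. It uses `items()` so that it also runs on a plain dict under Python 3.

```python
def find_object_path(objects, interface):
    """Return the first object path that exposes the given interface."""
    for path, interfaces in objects.items():
        if interface in interfaces:
            return path
    raise LookupError("Unable to find " + interface)


# Stand-in data for illustration; real data would come from neard over D-Bus.
sample = {
    "/org/neard/nfc0": {"org.neard.Adapter": {}},
    "/org/neard/nfc0/tag0": {"org.neard.Tag": {}},
}
adapter_path = find_object_path(sample, "org.neard.Adapter")
tag_path = find_object_path(sample, "org.neard.Tag")
```

Raising when nothing matches keeps the behaviour of the original for/else loops, while removing the risk of using a stale `path` left over from an earlier loop.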
* Re: [Patch] NFC: trf7970a:
From: Mark Greer @ 2016-12-14 15:57 UTC
To: Geoff Lansberry
Cc: linux-wireless, Lauro Ramos Venancio, Aloisio Almeida Jr,
Samuel Ortiz, Justin Bronder
In-Reply-To: <CAO7Z3WJwf80mCqubSYTeK=BHN9sd=mzmL9th4Su-E25de6TmAg@mail.gmail.com>
On Tue, Dec 13, 2016 at 08:50:04PM -0500, Geoff Lansberry wrote:
> Hi Mark - Thanks for getting back to me. It's funny that you ask,
> because we are currently chasing a segfault that is happening in neard, but
> may end up back in the trf7970a driver. Have you ever heard on anyone
> having segfault problems related to the trf7970a hardware drivers?
No. Mind sharing more info on that segfault?
> I'll get you an update later tonight or tomorrow.
Okay, thanks.
Mark
--
* [PATCH 4/5] mwifiex: get rid of __mwifiex_sdio_remove helper
From: Amitkumar Karwar @ 2016-12-14 14:10 UTC
To: linux-wireless
Cc: Cathy Luo, Nishant Sarmukadam, rajatja, briannorris,
dmitry.torokhov, Xinming Hu, Amitkumar Karwar
In-Reply-To: <1481724651-30397-1-git-send-email-akarwar@marvell.com>
From: Xinming Hu <huxm@marvell.com>
__mwifiex_sdio_remove helper is not needed after
our enhancements in SDIO card reset.
Signed-off-by: Xinming Hu <huxm@marvell.com>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
---
drivers/net/wireless/marvell/mwifiex/sdio.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
index b3aca10..0fda87a 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.c
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
@@ -370,7 +370,7 @@ static int mwifiex_check_winner_status(struct mwifiex_adapter *adapter)
  * This function removes the interface and frees up the card structure.
  */
 static void
-__mwifiex_sdio_remove(struct sdio_func *func)
+mwifiex_sdio_remove(struct sdio_func *func)
 {
 	struct sdio_mmc_card *card;
 	struct mwifiex_adapter *adapter;
@@ -388,6 +388,8 @@ static int mwifiex_check_winner_status(struct mwifiex_adapter *adapter)
 	if (!adapter || !adapter->priv_num)
 		return;
 
+	cancel_work_sync(&sdio_work);
+
 	mwifiex_dbg(adapter, INFO, "info: SDIO func num=%d\n", func->num);
 
 	ret = mwifiex_sdio_read_fw_status(adapter, &firmware_stat);
@@ -402,13 +404,6 @@ static int mwifiex_check_winner_status(struct mwifiex_adapter *adapter)
 	mwifiex_remove_card(adapter);
 }
 
-static void
-mwifiex_sdio_remove(struct sdio_func *func)
-{
-	cancel_work_sync(&sdio_work);
-	__mwifiex_sdio_remove(func);
-}
-
 /*
  * SDIO suspend.
  *
--
1.9.1
* [PATCH 2/5] mwifiex: cleanup in PCIe flr code path
From: Amitkumar Karwar @ 2016-12-14 14:10 UTC
To: linux-wireless
Cc: Cathy Luo, Nishant Sarmukadam, rajatja, briannorris,
dmitry.torokhov, Xinming Hu, Amitkumar Karwar
In-Reply-To: <1481724651-30397-1-git-send-email-akarwar@marvell.com>
From: Xinming Hu <huxm@marvell.com>
adapter and card variables don't get freed during PCIe function level
reset. "adapter->ext_scan" variable need not be re-initialized.
fw_name and tx_buf_size initialization is moved to pcie specific code
so that mwifiex_reinit_sw() can be used by SDIO.
Signed-off-by: Xinming Hu <huxm@marvell.com>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
---
drivers/net/wireless/marvell/mwifiex/main.c | 9 ---------
drivers/net/wireless/marvell/mwifiex/pcie.c | 12 +++++++++++-
2 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/drivers/net/wireless/marvell/mwifiex/main.c b/drivers/net/wireless/marvell/mwifiex/main.c
index 3fc6221..9d80180 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.c
+++ b/drivers/net/wireless/marvell/mwifiex/main.c
@@ -1425,9 +1425,6 @@ static void mwifiex_main_work_queue(struct work_struct *work)
int
mwifiex_reinit_sw(struct mwifiex_adapter *adapter)
{
- char fw_name[32];
- struct pcie_service_card *card = adapter->card;
-
mwifiex_init_lock_list(adapter);
if (adapter->if_ops.up_dev)
adapter->if_ops.up_dev(adapter);
@@ -1468,18 +1465,12 @@ static void mwifiex_main_work_queue(struct work_struct *work)
* mwifiex_register_dev()
*/
mwifiex_dbg(adapter, INFO, "%s, mwifiex_init_hw_fw()...\n", __func__);
- strcpy(fw_name, adapter->fw_name);
- strcpy(adapter->fw_name, PCIE8997_DEFAULT_WIFIFW_NAME);
- adapter->tx_buf_size = card->pcie.tx_buf_size;
- adapter->ext_scan = card->pcie.can_ext_scan;
if (mwifiex_init_hw_fw(adapter, false)) {
- strcpy(adapter->fw_name, fw_name);
mwifiex_dbg(adapter, ERROR,
"%s: firmware init failed\n", __func__);
goto err_init_fw;
}
- strcpy(adapter->fw_name, fw_name);
mwifiex_dbg(adapter, INFO, "%s, successful\n", __func__);
complete_all(adapter->fw_done);
diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
index 02f6db0..66226c6 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -3082,6 +3082,17 @@ static void mwifiex_pcie_up_dev(struct mwifiex_adapter *adapter)
struct pci_dev *pdev = card->dev;
const struct mwifiex_pcie_card_reg *reg = card->pcie.reg;
+ /* Bluetooth is not on pcie interface. Download Wifi only firmware
+ * during pcie FLR, so that bluetooth part of firmware which is
+ * already running doesn't get affected.
+ */
+ strcpy(adapter->fw_name, PCIE8997_DEFAULT_WIFIFW_NAME);
+
+ /* tx_buf_size might be changed to 3584 by firmware during
+ * data transfer, we should reset it to default size.
+ */
+ adapter->tx_buf_size = card->pcie.tx_buf_size;
+
card->cmdrsp_buf = NULL;
ret = mwifiex_pcie_create_txbd_ring(adapter);
if (ret) {
@@ -3143,7 +3154,6 @@ static void mwifiex_pcie_down_dev(struct mwifiex_adapter *adapter)
mwifiex_dbg(adapter, ERROR, "Failed to write driver not-ready signature\n");
adapter->seq_num = 0;
- adapter->tx_buf_size = MWIFIEX_TX_DATA_BUF_SIZE_4K;
if (reg->sleep_cookie)
mwifiex_pcie_delete_sleep_cookie_buf(adapter);
--
1.9.1
* [PATCH 5/5] mwifiex: get rid of global save_adapter and sdio_work
From: Amitkumar Karwar @ 2016-12-14 14:10 UTC (permalink / raw)
To: linux-wireless
Cc: Cathy Luo, Nishant Sarmukadam, rajatja, briannorris,
dmitry.torokhov, Xinming Hu, Amitkumar Karwar
In-Reply-To: <1481724651-30397-1-git-send-email-akarwar@marvell.com>
From: Xinming Hu <huxm@marvell.com>
This patch moves sdio_work to card structure, in this way we can get
adapter structure in the work, so save_adapter won't be needed.
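The idea can be illustrated outside the kernel. This is only a user-space sketch: sketch_container_of, struct sketch_work and struct sketch_card are hypothetical stand-ins for the kernel's container_of(), struct work_struct and struct sdio_mmc_card, and it shows nothing more than how a handler that receives just the embedded work pointer can get back to its owning card.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's container_of(). */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct sketch_work {
	int pending;
};

struct sketch_card {
	int id;
	struct sketch_work work;	/* embedded, like card->work in the patch */
	unsigned long work_flags;	/* per-card flags instead of a global */
};

/* The handler only receives the work pointer and recovers its card. */
static struct sketch_card *sketch_card_from_work(struct sketch_work *work)
{
	assert(work != NULL);
	return sketch_container_of(work, struct sketch_card, work);
}
```

Because each card carries its own work item and flags, two cards can have resets pending at the same time without sharing any global state.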
Signed-off-by: Xinming Hu <huxm@marvell.com>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
---
drivers/net/wireless/marvell/mwifiex/sdio.c | 39 ++++++++++++++++-------------
drivers/net/wireless/marvell/mwifiex/sdio.h | 3 +++
2 files changed, 24 insertions(+), 18 deletions(-)
diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
index 0fda87a..a4b356d 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.c
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
@@ -32,10 +32,8 @@
#define SDIO_VERSION "1.0"
static void mwifiex_sdio_work(struct work_struct *work);
-static DECLARE_WORK(sdio_work, mwifiex_sdio_work);
static struct mwifiex_if_ops sdio_ops;
-static unsigned long iface_work_flags;
static struct memory_type_mapping generic_mem_type_map[] = {
{"DUMP", NULL, 0, 0xDD},
@@ -123,6 +121,7 @@ static int mwifiex_sdio_probe_of(struct device *dev)
card->fw_dump_enh = data->fw_dump_enh;
card->can_auto_tdls = data->can_auto_tdls;
card->can_ext_scan = data->can_ext_scan;
+ INIT_WORK(&card->work, mwifiex_sdio_work);
}
sdio_claim_host(func);
@@ -388,7 +387,7 @@ static int mwifiex_check_winner_status(struct mwifiex_adapter *adapter)
if (!adapter || !adapter->priv_num)
return;
- cancel_work_sync(&sdio_work);
+ cancel_work_sync(&card->work);
mwifiex_dbg(adapter, INFO, "info: SDIO func num=%d\n", func->num);
@@ -2190,7 +2189,6 @@ static void mwifiex_cleanup_sdio(struct mwifiex_adapter *adapter)
port, card->mp_data_port_mask);
}
-static struct mwifiex_adapter *save_adapter;
static void mwifiex_sdio_card_reset_work(struct mwifiex_adapter *adapter)
{
struct sdio_mmc_card *card = adapter->card;
@@ -2206,8 +2204,8 @@ static void mwifiex_sdio_card_reset_work(struct mwifiex_adapter *adapter)
/* Previous save_adapter won't be valid after this. We will cancel
* pending work requests.
*/
- clear_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &iface_work_flags);
- clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &iface_work_flags);
+ clear_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &card->work_flags);
+ clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &card->work_flags);
mwifiex_reinit_sw(adapter);
}
@@ -2513,35 +2511,40 @@ static void mwifiex_sdio_device_dump_work(struct mwifiex_adapter *adapter)
static void mwifiex_sdio_work(struct work_struct *work)
{
+ struct sdio_mmc_card *card =
+ container_of(work, struct sdio_mmc_card, work);
+
if (test_and_clear_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP,
- &iface_work_flags))
- mwifiex_sdio_device_dump_work(save_adapter);
+ &card->work_flags))
+ mwifiex_sdio_device_dump_work(card->adapter);
if (test_and_clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET,
- &iface_work_flags))
- mwifiex_sdio_card_reset_work(save_adapter);
+ &card->work_flags))
+ mwifiex_sdio_card_reset_work(card->adapter);
}
/* This function resets the card */
static void mwifiex_sdio_card_reset(struct mwifiex_adapter *adapter)
{
- save_adapter = adapter;
- if (test_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &iface_work_flags))
+ struct sdio_mmc_card *card = adapter->card;
+
+ if (test_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &card->work_flags))
return;
- set_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &iface_work_flags);
+ set_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &card->work_flags);
- schedule_work(&sdio_work);
+ schedule_work(&card->work);
}
/* This function dumps FW information */
static void mwifiex_sdio_device_dump(struct mwifiex_adapter *adapter)
{
- save_adapter = adapter;
- if (test_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &iface_work_flags))
+ struct sdio_mmc_card *card = adapter->card;
+
+ if (test_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &card->work_flags))
return;
- set_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &iface_work_flags);
- schedule_work(&sdio_work);
+ set_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &card->work_flags);
+ schedule_work(&card->work);
}
/* Function to dump SDIO function registers and SDIO scratch registers in case
diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.h b/drivers/net/wireless/marvell/mwifiex/sdio.h
index afa10d5..dccf7fd 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.h
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.h
@@ -267,6 +267,9 @@ struct sdio_mmc_card {
struct mwifiex_sdio_mpa_tx mpa_tx;
struct mwifiex_sdio_mpa_rx mpa_rx;
+
+ struct work_struct work;
+ unsigned long work_flags;
};
struct mwifiex_sdio_device {
--
1.9.1
^ permalink raw reply related
* [PATCH 3/5] mwifiex: sdio card reset enhancement
From: Amitkumar Karwar @ 2016-12-14 14:10 UTC (permalink / raw)
To: linux-wireless
Cc: Cathy Luo, Nishant Sarmukadam, rajatja, briannorris,
dmitry.torokhov, Xinming Hu, Amitkumar Karwar
In-Reply-To: <1481724651-30397-1-git-send-email-akarwar@marvell.com>
From: Xinming Hu <huxm@marvell.com>
Commit b4336a282db8 ("mwifiex: sdio: reset adapter using mmc_hw_reset")
introduces a simple sdio card reset solution based on card remove and
re-probe. This solution has proved to be vulnerable, as card and
adapter structures are not protected, concurrent access will result in
kernel panic issues.
Let's reuse PCIe FLR's functions for SDIO reset to avoid freeing and
reallocating adapter and card structures.
Signed-off-by: Xinming Hu <huxm@marvell.com>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
---
drivers/net/wireless/marvell/mwifiex/sdio.c | 73 +++++++++++++----------------
drivers/net/wireless/marvell/mwifiex/sdio.h | 3 --
2 files changed, 33 insertions(+), 43 deletions(-)
diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
index 44eb65a..b3aca10 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.c
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
@@ -104,7 +104,6 @@ static int mwifiex_sdio_probe_of(struct device *dev)
init_completion(&card->fw_done);
card->func = func;
- card->device_id = id;
func->card->quirks |= MMC_QUIRK_BLKSZ_FOR_BYTE_MODE;
@@ -2196,33 +2195,13 @@ static void mwifiex_cleanup_sdio(struct mwifiex_adapter *adapter)
port, card->mp_data_port_mask);
}
-static void mwifiex_recreate_adapter(struct sdio_mmc_card *card)
+static struct mwifiex_adapter *save_adapter;
+static void mwifiex_sdio_card_reset_work(struct mwifiex_adapter *adapter)
{
+ struct sdio_mmc_card *card = adapter->card;
struct sdio_func *func = card->func;
- const struct sdio_device_id *device_id = card->device_id;
-
- /* TODO mmc_hw_reset does not require destroying and re-probing the
- * whole adapter. Hence there was no need to for this rube-goldberg
- * design to reload the fw from an external workqueue. If we don't
- * destroy the adapter we could reload the fw from
- * mwifiex_main_work_queue directly.
- * The real difficulty with fw reset is to restore all the user
- * settings applied through ioctl. By destroying and recreating the
- * adapter, we take the easy way out, since we rely on user space to
- * restore them. We assume that user space will treat the new
- * incarnation of the adapter(interfaces) as if they had been just
- * discovered and initializes them from scratch.
- */
- __mwifiex_sdio_remove(func);
-
- /*
- * Normally, we would let the driver core take care of releasing these.
- * But we're not letting the driver core handle this one. See above
- * TODO.
- */
- sdio_set_drvdata(func, NULL);
- devm_kfree(&func->dev, card);
+ mwifiex_shutdown_sw(adapter);
/* power cycle the adapter */
sdio_claim_host(func);
@@ -2235,21 +2214,7 @@ static void mwifiex_recreate_adapter(struct sdio_mmc_card *card)
clear_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &iface_work_flags);
clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &iface_work_flags);
- mwifiex_sdio_probe(func, device_id);
-}
-
-static struct mwifiex_adapter *save_adapter;
-static void mwifiex_sdio_card_reset_work(struct mwifiex_adapter *adapter)
-{
- struct sdio_mmc_card *card = adapter->card;
-
- /* TODO card pointer is unprotected. If the adapter is removed
- * physically, sdio core might trigger mwifiex_sdio_remove, before this
- * workqueue is run, which will destroy the adapter struct. When this
- * workqueue eventually exceutes it will dereference an invalid adapter
- * pointer
- */
- mwifiex_recreate_adapter(card);
+ mwifiex_reinit_sw(adapter);
}
/* This function read/write firmware */
@@ -2677,6 +2642,33 @@ static void mwifiex_sdio_device_dump(struct mwifiex_adapter *adapter)
return p - drv_buf;
}
+/* sdio device/function initialization, code is extracted
+ * from init_if handler and register_dev handler.
+ */
+static void mwifiex_sdio_up_dev(struct mwifiex_adapter *adapter)
+{
+ struct sdio_mmc_card *card = adapter->card;
+ u8 sdio_ireg;
+
+ sdio_claim_host(card->func);
+ sdio_enable_func(card->func);
+ sdio_set_block_size(card->func, MWIFIEX_SDIO_BLOCK_SIZE);
+ sdio_release_host(card->func);
+
+ /* tx_buf_size might be changed to 3584 by firmware during
+ * data transfer, we will reset to default size.
+ */
+ adapter->tx_buf_size = card->tx_buf_size;
+
+ /* Read the host_int_status_reg for ACK the first interrupt got
+ * from the bootloader. If we don't do this we get a interrupt
+ * as soon as we register the irq.
+ */
+ mwifiex_read_reg(adapter, card->reg->host_int_status_reg, &sdio_ireg);
+
+ mwifiex_init_sdio_ioport(adapter);
+}
+
static struct mwifiex_if_ops sdio_ops = {
.init_if = mwifiex_init_sdio,
.cleanup_if = mwifiex_cleanup_sdio,
@@ -2702,6 +2694,7 @@ static void mwifiex_sdio_device_dump(struct mwifiex_adapter *adapter)
.reg_dump = mwifiex_sdio_reg_dump,
.device_dump = mwifiex_sdio_device_dump,
.deaggr_pkt = mwifiex_deaggr_sdio_pkt,
+ .up_dev = mwifiex_sdio_up_dev,
};
module_driver(mwifiex_sdio, sdio_register_driver, sdio_unregister_driver);
diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.h b/drivers/net/wireless/marvell/mwifiex/sdio.h
index cdbf3a3a..afa10d5 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.h
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.h
@@ -267,9 +267,6 @@ struct sdio_mmc_card {
struct mwifiex_sdio_mpa_tx mpa_tx;
struct mwifiex_sdio_mpa_rx mpa_rx;
-
- /* needed for card reset */
- const struct sdio_device_id *device_id;
};
struct mwifiex_sdio_device {
--
1.9.1
^ permalink raw reply related
* [PATCH 1/5] mwifiex: get rid of mwifiex_do_flr wrapper
From: Amitkumar Karwar @ 2016-12-14 14:10 UTC (permalink / raw)
To: linux-wireless
Cc: Cathy Luo, Nishant Sarmukadam, rajatja, briannorris,
dmitry.torokhov, Xinming Hu, Amitkumar Karwar
From: Xinming Hu <huxm@marvell.com>
This patch gets rid of mwifiex_do_flr. We will call
mwifiex_shutdown_sw() and mwifiex_reinit_sw() directly.
These two general purpose functions will be useful for
sdio card reset handler.
Signed-off-by: Xinming Hu <huxm@marvell.com>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
---
drivers/net/wireless/marvell/mwifiex/main.c | 31 +++++------------------------
drivers/net/wireless/marvell/mwifiex/main.h | 3 ++-
drivers/net/wireless/marvell/mwifiex/pcie.c | 4 ++--
3 files changed, 9 insertions(+), 29 deletions(-)
diff --git a/drivers/net/wireless/marvell/mwifiex/main.c b/drivers/net/wireless/marvell/mwifiex/main.c
index c1821aa..3fc6221 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.c
+++ b/drivers/net/wireless/marvell/mwifiex/main.c
@@ -1348,7 +1348,7 @@ static void mwifiex_main_work_queue(struct work_struct *work)
* This function gets called during PCIe function level reset. Required
* code is extracted from mwifiex_remove_card()
*/
-static int
+int
mwifiex_shutdown_sw(struct mwifiex_adapter *adapter)
{
struct mwifiex_private *priv;
@@ -1417,13 +1417,13 @@ static void mwifiex_main_work_queue(struct work_struct *work)
exit_return:
return 0;
}
+EXPORT_SYMBOL_GPL(mwifiex_shutdown_sw);
/* This function gets called during PCIe function level reset. Required
* code is extracted from mwifiex_add_card()
*/
-static int
-mwifiex_reinit_sw(struct mwifiex_adapter *adapter, struct completion *fw_done,
- struct mwifiex_if_ops *if_ops, u8 iface_type)
+int
+mwifiex_reinit_sw(struct mwifiex_adapter *adapter)
{
char fw_name[32];
struct pcie_service_card *card = adapter->card;
@@ -1432,9 +1432,6 @@ static void mwifiex_main_work_queue(struct work_struct *work)
if (adapter->if_ops.up_dev)
adapter->if_ops.up_dev(adapter);
- adapter->iface_type = iface_type;
- adapter->fw_done = fw_done;
-
adapter->hw_status = MWIFIEX_HW_STATUS_INITIALIZING;
adapter->surprise_removed = false;
init_waitqueue_head(&adapter->init_wait_q);
@@ -1507,25 +1504,7 @@ static void mwifiex_main_work_queue(struct work_struct *work)
return -1;
}
-
-/* This function processes pre and post PCIe function level resets.
- * It performs software cleanup without touching PCIe specific code.
- * Also, during initialization PCIe stuff is skipped.
- */
-void mwifiex_do_flr(struct mwifiex_adapter *adapter, bool prepare)
-{
- struct mwifiex_if_ops if_ops;
-
- if (!prepare) {
- mwifiex_reinit_sw(adapter, adapter->fw_done, &if_ops,
- adapter->iface_type);
- } else {
- memcpy(&if_ops, &adapter->if_ops,
- sizeof(struct mwifiex_if_ops));
- mwifiex_shutdown_sw(adapter);
- }
-}
-EXPORT_SYMBOL_GPL(mwifiex_do_flr);
+EXPORT_SYMBOL_GPL(mwifiex_reinit_sw);
static irqreturn_t mwifiex_irq_wakeup_handler(int irq, void *priv)
{
diff --git a/drivers/net/wireless/marvell/mwifiex/main.h b/drivers/net/wireless/marvell/mwifiex/main.h
index d501d03..5f7a010 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.h
+++ b/drivers/net/wireless/marvell/mwifiex/main.h
@@ -1666,5 +1666,6 @@ void mwifiex_process_multi_chan_event(struct mwifiex_private *priv,
void mwifiex_dev_debugfs_init(struct mwifiex_private *priv);
void mwifiex_dev_debugfs_remove(struct mwifiex_private *priv);
#endif
-void mwifiex_do_flr(struct mwifiex_adapter *adapter, bool prepare);
+int mwifiex_reinit_sw(struct mwifiex_adapter *adapter);
+int mwifiex_shutdown_sw(struct mwifiex_adapter *adapter);
#endif /* !_MWIFIEX_MAIN_H_ */
diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
index 55c79e3..02f6db0 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -375,7 +375,7 @@ static void mwifiex_pcie_reset_notify(struct pci_dev *pdev, bool prepare)
* Cleanup all software without cleaning anything related to
* PCIe and HW.
*/
- mwifiex_do_flr(adapter, prepare);
+ mwifiex_shutdown_sw(adapter);
adapter->surprise_removed = true;
} else {
/* Kernel stores and restores PCIe function context before and
@@ -383,7 +383,7 @@ static void mwifiex_pcie_reset_notify(struct pci_dev *pdev, bool prepare)
* and firmware including firmware redownload
*/
adapter->surprise_removed = false;
- mwifiex_do_flr(adapter, prepare);
+ mwifiex_reinit_sw(adapter);
}
mwifiex_dbg(adapter, INFO, "%s, successful\n", __func__);
}
--
1.9.1
^ permalink raw reply related
* Re: [RFC v2 05/11] ath10k: htc: refactorization
From: Valo, Kalle @ 2016-12-14 13:49 UTC (permalink / raw)
To: michal.kazior@tieto.com
Cc: Erik Stromdahl, linux-wireless@vger.kernel.org,
ath10k@lists.infradead.org
In-Reply-To: <CA+BoTQmRV59imnmSykjs=ym0qut9akUHzFKfAMioK=MT8my98A@mail.gmail.com>
Michal Kazior <michal.kazior@tieto.com> writes:
>> I have made a few updates since I submitted the original RFC and created
>> a repo on github:
>>
>> https://github.com/erstrom/linux-ath
>>
>> I have a bunch of branches that are all based on the tags on the ath master.
>>
>> As of this moment the latest version is:
>>
>> ath-201612131156-ath10k-sdio
>>
>> This branch contains the original RFC patches plus some addons/fixes.
>>
>> In the above mentioned branch there are a few commits related to this
>> race condition. Perhaps you can have a look at them?
>>
>> The commits are:
>> 821672913328cf737c9616786dc28d2e4e8a4a90
>
> I would avoid if(bus==xx) checks.
Very much agreed. For example to enable HTT high latency support you
could add an enum to ath10k_core_create() with values for both high and
low latency. This way sdio.c and pci.c can enable the correct mode
during initialisation.
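Roughly what I have in mind, as a sketch only. All names below (sketch_htt_mode, sketch_core, sketch_core_create) are made up for illustration and are not the existing ath10k API; the point is just that the bus code states the mode once and the common code never asks which bus it runs on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; not the actual ath10k interfaces. */
enum sketch_htt_mode {
	SKETCH_HTT_LOW_LATENCY,		/* what a PCI front end would pass */
	SKETCH_HTT_HIGH_LATENCY,	/* what an SDIO or USB front end would pass */
};

struct sketch_core {
	enum sketch_htt_mode htt_mode;
};

/* The bus front end selects the mode once, at creation time. */
static void sketch_core_create(struct sketch_core *core,
			       enum sketch_htt_mode mode)
{
	assert(core != NULL);
	core->htt_mode = mode;
}

/* Common code tests the capability, never the bus type. */
static bool sketch_core_is_high_latency(const struct sketch_core *core)
{
	return core->htt_mode == SKETCH_HTT_HIGH_LATENCY;
}
```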
-- 
Kalle Valo
^ permalink raw reply
* Re: [RFC v2 05/11] ath10k: htc: refactorization
From: Valo, Kalle @ 2016-12-14 13:46 UTC (permalink / raw)
To: Erik Stromdahl
Cc: michal.kazior@tieto.com, linux-wireless@vger.kernel.org,
ath10k@lists.infradead.org
In-Reply-To: <70b8a055-e6ba-857d-8b03-06f50e3f10fe@gmail.com>
Erik Stromdahl <erik.stromdahl@gmail.com> writes:
> I have made a few updates since I submitted the original RFC and created
> a repo on github:
>
> https://github.com/erstrom/linux-ath
>
> I have a bunch of branches that are all based on the tags on the ath master.
>
> As of this moment the latest version is:
>
> ath-201612131156-ath10k-sdio
>
> This branch contains the original RFC patches plus some addons/fixes.
Good, this makes it easier to follow the development. So what's the
current status with this branch? What works and what doesn't?
I'm especially interested in the state of the HTT high latency
support. How much work would it be to add that? It would also make it
easier to add USB support to ath10k.
-- 
Kalle Valo
^ permalink raw reply
* Re: [RFC V3 04/11] nl80211: add driver api for gscan notifications
From: Arend Van Spriel @ 2016-12-14 10:07 UTC (permalink / raw)
To: Johannes Berg; +Cc: linux-wireless
In-Reply-To: <1481646049.20412.43.camel@sipsolutions.net>
On 13-12-2016 17:20, Johannes Berg wrote:
> On Mon, 2016-12-12 at 11:59 +0000, Arend van Spriel wrote:
>> The driver can indicate gscan results are available or gscan
>> operation has stopped.
>
> This patch is renumbering the previous patches' nl80211 API, which is
> best avoided, even if I do realize it doesn't matter now. :)
Indeed. Will be more careful in upcoming revision(s).
> Even here it's not clear how things are reported though. Somehow I
> thought that gscan was reporting only partial information through the
> buckets, or is that not true?
Not sure what is meant by "through the buckets". Referring to your
remark/question in the "Universal scan proposal" thread:
"""
I'm much more worried about the "bucket reporting" since that doesn't
fit into the current full BSS reporting model at all. What's your
suggestion for this?
"""
So this is exactly the dilemma I considered, so I decided to stick with
the full BSS reporting model for gscan as well, merely to get it
discussed, so glad you brought it up ;-). The problem here is that
gscan is a vehicle that serves a number of use-cases. So ignoring
hotlists, ePNO, etc., the gscan configuration still holds several
notification types:
- report after completing M scans capturing N best APs or a
percentage of (M * N).
- report after completing a scan including a specific bucket.
- report full scan results.
The first two notifications trigger retrieval of gscan results which are
historic, i.e. partial scan results for max M scans.
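To make the first notification type concrete, here is how I read it, as a sketch only. The struct and function names are hypothetical, and the interpretation (report once M scans are done, or once the stored results reach a given percentage of the M * N capacity) is my assumption rather than anything the patches define.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reporting configuration; not part of the nl80211 API. */
struct sketch_report_cfg {
	unsigned int max_scans;		/* M: scans kept as history */
	unsigned int aps_per_scan;	/* N: best APs kept per scan */
	unsigned int threshold_pct;	/* 0 disables the percentage trigger */
};

/* Decide whether the host should be told to fetch the historic results. */
static bool sketch_should_report(const struct sketch_report_cfg *cfg,
				 unsigned int scans_done,
				 unsigned int results_stored)
{
	unsigned int capacity;

	assert(cfg != NULL);
	capacity = cfg->max_scans * cfg->aps_per_scan;

	if (scans_done >= cfg->max_scans)
		return true;
	if (!cfg->threshold_pct)
		return false;
	/* Multiply before comparing so integer division loses nothing. */
	return results_stored * 100 >= capacity * cfg->threshold_pct;
}
```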
As said earlier the universal scan proposal has some similarities to
gscan. Guess you share that as you are using the term "bucket reporting"
in that discussion ;-). The historic results are needed for location (if
I am not mistaken) so the full BSS reporting model does not fit that.
The question is which particular attribute in the historic results is needed
for location (suspecting only rssi and possibly the timestamp, but would
like to see that confirmed). I was thinking about having historic storage
in cfg80211 so we do not need a per-driver solution.
Regards,
Arend
^ permalink raw reply
* Re: [RFC V3 03/11] nl80211: add support for gscan
From: Arend Van Spriel @ 2016-12-14 9:01 UTC (permalink / raw)
To: Johannes Berg; +Cc: linux-wireless
In-Reply-To: <1481668194.22319.0.camel@sipsolutions.net>
On 13-12-2016 23:29, Johannes Berg wrote:
> On Tue, 2016-12-13 at 21:09 +0100, Arend Van Spriel wrote:
>>
>>> There's a bit of a weird hard-coded restriction to 16 channels too,
>>> that's due to the bucket map?
>>
>> Uhm. Is there? I will check, but if you can give me a pointer where
>> to look it is appreciated.
>
> Just look for "< 16" or "<= 16" or so in the patch. I do think that's
> because the channel map is a u16 though, not sure we'd want to change
> that.
Had to look for "> 16" ;-)
> + /* ignore channels if band is specified */
> + if (band_select)
> + return 0;
> +
> + nla_for_each_nested(chan, tb[NL80211_GSCAN_BUCKET_ATTR_CHANNELS], rem) {
> + num_chans++;
> + }
Here is an instance of the tab vs. space issue you mentioned. Will go over
the patch and fix that.
> + if (num_chans > 16)
> + return -EINVAL;
I suspect this is the restriction you were referring to. There is no
reason for this, although the Android wifi HAL has max 16 channels in a
bucket so I might have picked that up. So could a driver have a similar
limit, and should we add such a limit to the gscan capabilities? For
instance our firmware API has a nasty restriction of 64 channels for all
buckets together, e.g. it can do 4 buckets of 16 channels each.
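If limits like these were advertised as capabilities, the check could look something like the sketch below. The struct and function names are hypothetical and nothing here is existing nl80211 or cfg80211 code; it only shows a per-bucket limit and an overall limit being enforced together.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability limits a driver might advertise. */
struct sketch_gscan_caps {
	unsigned int max_chans_per_bucket;	/* 0 means no per-bucket limit */
	unsigned int max_chans_total;		/* 0 means no overall limit */
};

/* Returns 0 if the per-bucket channel counts fit the limits, -1 if not. */
static int sketch_gscan_check_channels(const struct sketch_gscan_caps *caps,
				       const unsigned int *bucket_chans,
				       size_t num_buckets)
{
	unsigned int total = 0;
	size_t i;

	assert(caps != NULL);
	for (i = 0; i < num_buckets; i++) {
		if (caps->max_chans_per_bucket &&
		    bucket_chans[i] > caps->max_chans_per_bucket)
			return -1;
		total += bucket_chans[i];
	}
	if (caps->max_chans_total && total > caps->max_chans_total)
		return -1;
	return 0;
}
```

With limits of 16 per bucket and 64 overall, 4 buckets of 16 channels pass and a fifth bucket of 16 is rejected.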
Regards,
Arend
^ permalink raw reply
* Re: [PATCH 1/2] ath10k: add accounting for the extended peer statistics
From: Mohammed Shafi Shajakhan @ 2016-12-14 7:33 UTC (permalink / raw)
To: Christian Lamparter; +Cc: linux-wireless, ath10k, Kalle Valo
In-Reply-To: <2870294.lMmxz9iPmq@debian64>
On Tue, Dec 13, 2016 at 01:41:33PM +0100, Christian Lamparter wrote:
> Hello,
>
> It looks like google put your mail into the spam-can.
> I'm sorry for not answering sooner.
[shafi] np, thanks for your reply !
>
> On Wednesday, December 7, 2016 11:58:24 AM CET Mohammed Shafi Shajakhan wrote:
> > On Mon, Dec 05, 2016 at 10:52:45PM +0100, Christian Lamparter wrote:
> > > The 10.4 firmware adds extended peer information to the
> > > firmware's statistics payload. This additional info is
> > > stored as a separate data field and the elements are
> > > stored in their own "peers_extd" list.
> > >
> > > These elements can pile up in the same way as the peer
> > > information elements. This is because the
> > > ath10k_wmi_10_4_op_pull_fw_stats() function tries to
> > > pull the same amount (num_peer_stats) for every statistic
> > > data unit.
> > >
> > > Fixes: 4a49ae94a448faa ("ath10k: fix 10.4 extended peer stats update")
> > > Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
> > > ---
> > > drivers/net/wireless/ath/ath10k/debug.c | 7 +++++--
> > > 1 file changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/net/wireless/ath/ath10k/debug.c b/drivers/net/wireless/ath/ath10k/debug.c
> > > index 82a4c67f3672..4acd9eb65910 100644
> > > --- a/drivers/net/wireless/ath/ath10k/debug.c
> > > +++ b/drivers/net/wireless/ath/ath10k/debug.c
> > > @@ -399,6 +399,7 @@ void ath10k_debug_fw_stats_process(struct ath10k *ar, struct sk_buff *skb)
> > > * prevent firmware from DoS-ing the host.
> > > */
> > > ath10k_fw_stats_peers_free(&ar->debug.fw_stats.peers);
> > > + ath10k_fw_extd_stats_peers_free(&ar->debug.fw_stats.peers_extd);
> >
> > [shafi] thanks for fixing this !
> >
> > > ath10k_warn(ar, "dropping fw peer stats\n");
> > > goto free;
> > > }
> > > @@ -409,10 +410,12 @@ void ath10k_debug_fw_stats_process(struct ath10k *ar, struct sk_buff *skb)
> > > goto free;
> > > }
> > >
> > > + if (!list_empty(&stats.peers))
> >
> > [shafi] sorry please correct me if i am wrong, for 'extended peer stats' we are checking
> > for normal 'peer stats' ? Is this the fix intended, i had started a build to
> > check your change and we will keep you posted, does this fix displaying
> > 'rx_duration' in ath10k fw_stats.
> The idea is not to queue any "extended peer stats" when there were no "peer stats" to
> begin with. Because otherwise, the function is still vulnerable to OOM since the
> extended peers stats will be queued unchecked (not that this is currently a problem).
[shafi] list_splice_tail_init should still check for non-empty 'peers_extd' list
and append if required ? please let me know if i am still missing something
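To spell out the two checks being discussed, here is a user-space sketch. The list helpers are minimal hypothetical stand-ins for the kernel's list.h (the kernel's list_splice_tail_init() is likewise a no-op when the source list is empty), and sketch_queue() is not the ath10k code; it only mirrors the ordering in the patch, where the guard looks at the peers list rather than at peers_extd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list; stand-in for the kernel's list.h. */
struct node {
	struct node *prev, *next;
};

static void list_init(struct node *h)
{
	h->prev = h->next = h;
}

static bool list_is_empty(const struct node *h)
{
	return h->next == h;
}

static void list_add_tail_node(struct node *n, struct node *h)
{
	assert(n != NULL && h != NULL);
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Move everything on src to the tail of dst, then reinitialise src.
 * Does nothing when src is empty.
 */
static void list_splice_tail_reinit(struct node *src, struct node *dst)
{
	if (list_is_empty(src))
		return;
	src->next->prev = dst->prev;
	dst->prev->next = src->next;
	src->prev->next = dst;
	dst->prev = src->prev;
	list_init(src);
}

static unsigned int list_count(const struct node *h)
{
	const struct node *p;
	unsigned int n = 0;

	for (p = h->next; p != h; p = p->next)
		n++;
	return n;
}

struct sketch_stats {
	struct node peers, peers_extd;
};

/* Queue extended stats only when peer stats arrived along with them. */
static void sketch_queue(struct sketch_stats *in, struct sketch_stats *acc)
{
	if (!list_is_empty(&in->peers))
		list_splice_tail_reinit(&in->peers_extd, &acc->peers_extd);
	list_splice_tail_reinit(&in->peers, &acc->peers);
}
```

So the splice itself already copes with an empty peers_extd list; what the extra guard adds is that extended entries are not accumulated when no peer entries came with them.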
>
> > > + list_splice_tail_init(&stats.peers_extd,
> > > + &ar->debug.fw_stats.peers_extd);
> > > +
> > > list_splice_tail_init(&stats.peers, &ar->debug.fw_stats.peers);
> > > list_splice_tail_init(&stats.vdevs, &ar->debug.fw_stats.vdevs);
> > > - list_splice_tail_init(&stats.peers_extd,
> > > - &ar->debug.fw_stats.peers_extd);
> > > }
> > >
> > > complete(&ar->debug.fw_stats_complete);
>
> Regards,
> Christian
>
>
^ permalink raw reply
* RE: ATH9 driver issues on ARM64
From: Bharat Kumar Gogada @ 2016-12-14 5:09 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Tobias Klausmann, Kalle Valo, linux-kernel@vger.kernel.org,
linux-pci@vger.kernel.org, Marc Zyngier,
Janusz.Dziedzic@tieto.com, rmanohar@qti.qualcomm.com,
ath9k-devel@qca.qualcomm.com, linux-wireless@vger.kernel.org
In-Reply-To: <20161212163114.GA32712@bhelgaas-glaptop.roam.corp.google.com>
> On Sat, Dec 10, 2016 at 02:40:48PM +0000, Bharat Kumar Gogada wrote:
> > Hi,
> >
> > After taking some more lecroy traces, we see that after 2nd ASSERT from EP
> > on ARM64 we see continuous data movement of 32 dwords or 12 dwords and
> > never sign of DEASSERT.
> > Comparatively on working traces (x86) after 2nd assert there are only BAR
> > register reads and writes and then DEASSERT, for almost most of the interrupts
> > and we haven't seen 12 or 32 dwords data movement on this trace.
> >
> > I did not work on EP wifi/network drivers, any help why EP needs those many
> > number of data at scan time ?
>
> The device doesn't know whether it's in an x86 or an arm64 system. If it works
> differently, it must be because the PCI core or the driver is programming the
> device differently.
>
> You should be able to match up Memory transactions from the host in the trace
> with things the driver does. For example, if you see an Assert_INTx message
> from the device, you should eventually see a Memory Read from the host to get
> the ISR, i.e., some read done in the bowels of ath9k_hw_getisr().
>
> I don't know how the ath9k device works, but there must be some Memory Read
> or Write done by the driver that tells the device "we've handled this interrupt".
> The device should then send a Deassert_INTx; of course, if the device still
> requires service, e.g., because it has received more packets, it might leave the
> INTx asserted.
>
> I doubt you'd see exactly the same traces on x86 and arm64 because they aren't
> seeing the same network packets and the driver is executing at different rates.
> But you should at least be able to identify interrupt assertion and the actions of
> the driver's interrupt service routine.
Thanks Bjorn.
As you mentioned, we did try to debug in that path. After we start the scan,
after the 2nd ASSERT we see lots of 32 and 12 dword data, and in function
void ath9k_hw_enable_interrupts(struct ath_hw *ah)
{
	...
	..
	REG_WRITE(ah, AR_IER, AR_IER_ENABLE);
	// EP driver hangs at this position after 2nd ASSERT
	// The following writes are not happening
	if (!AR_SREV_9100(ah)) {
		REG_WRITE(ah, AR_INTR_ASYNC_ENABLE, async_mask);
		REG_WRITE(ah, AR_INTR_ASYNC_MASK, async_mask);
		REG_WRITE(ah, AR_INTR_SYNC_ENABLE, sync_default);
		REG_WRITE(ah, AR_INTR_SYNC_MASK, sync_default);
	}
	ath_dbg(common, INTERRUPT, "AR_IMR 0x%x IER 0x%x\n",
		REG_READ(ah, AR_IMR), REG_READ(ah, AR_IER));
}
The above function is invoked from a tasklet.
I tried several boots; every time it stops here. The condition
(!AR_SREV_9100(ah)) is true, as it was before the 1st ASSERT handling.
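For reference, the level-triggered behaviour we are reasoning about can be modelled as below. This is a generic sketch, not ath9k code: struct sketch_dev and its two fields are hypothetical, and the model only assumes that the line stays asserted for as long as an enabled cause is still pending, so a handler that never completes its acknowledging access never produces a deassert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device model; not the ath9k register layout. */
struct sketch_dev {
	uint32_t pending;	/* causes the device still wants serviced */
	uint32_t enabled;	/* causes allowed to raise the interrupt */
};

/* Level-triggered: asserted for as long as an enabled cause is pending. */
static bool sketch_intx_asserted(const struct sketch_dev *d)
{
	return (d->pending & d->enabled) != 0;
}

/* Host side: read the enabled causes and acknowledge them.
 * Returns the causes that were seen.
 */
static uint32_t sketch_isr(struct sketch_dev *d)
{
	uint32_t cause;

	assert(d != NULL);
	cause = d->pending & d->enabled;
	d->pending &= ~cause;	/* the acknowledging write */
	return cause;
}
```

In this model the deassert follows only from the acknowledging write, and enabling a further cause that is already pending asserts the line again immediately.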
Regards,
Bharat
>
> > > Hello there,
> > >
> > > as this is a thread about ath9k and ARM64, i'm not sure if i should
> > > answer here or not, but i have similar "stalls" with ath9k on x86_64
> > > (starting with 4.9rc), stack trace is posted down below where the original
> > > ARM64 stall traces are.
> > >
> > > Greetings,
> > >
> > > Tobias
> > >
> > >
> > > On 08.12.2016 18:36, Kalle Valo wrote:
> > > > Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com> writes:
> > > >
> > > >> > [+cc Kalle, ath9k list]
> > > > Thanks, but please also CC linux-wireless. Full thread below for
> > > > the folks there.
> > > >
> > > >>> On Thu, Dec 08, 2016 at 01:49:42PM +0000, Bharat Kumar Gogada wrote:
> > > >>>> Hi,
> > > >>>>
> > > >>>> Did anyone test Atheros ATH9
> > > >>>> driver(drivers/net/wireless/ath/ath9k/)
> > > >>>> on ARM64. The end point is TP link wifi card with which
> > > >>>> supports only legacy interrupts.
> > > >>> If it works on other arches and the arm64 PCI enumeration works,
> > > >>> my first guess would be an INTx issue, e.g., maybe the driver is
> > > >>> waiting for an interrupt that never arrives.
> > > >> We are not sure for now.
> > > >>>> We are trying to test it on ARM64 with
> > > >>>> (drivers/pci/host/pcie-xilinx-nwl.c) as root port.
> > > >>>>
> > > >>>> EP is getting enumerated and able to link up.
> > > >>>>
> > > >>>> But when we start scan system gets hanged.
> > > >>> When you say the system hangs when you start a scan, I assume
> > > >>> you mean a wifi scan, not the PCI enumeration. A problem with a
> > > >>> wifi scan might cause a *process* to hang, but it shouldn't hang
> > > >>> the entire system.
> > > >>>
> > > >> Yes wifi scan.
> > > >>>> When we took trace we see that after we start scan assert
> > > >>>> message is sent but there is no de assert from end point.
> > > >>> Are you talking about a trace from a PCIe analyzer? Do you see
> > > >>> an Assert_INTx PCIe message on the link?
> > > >>>
> > > >> Yes lecroy trace, yes we do see Assert_INTx and Deassert_INTx
> > > >> happening when we do interface link up.
> > > >> When we have less debug prints in Atheros driver, and do wifi
> > > >> scan we see Assert_INTx but never Deassert_INTx,
> > > >>>> What might cause end point not sending de assert ?
> > > >>> If the endpoint doesn't send a Deassert_INTx message, I expect
> > > >>> that would mean the driver didn't service the interrupt and
> > > >>> remove the condition that caused the device to assert the
> > > >>> interrupt in the first place.
> > > >>>
> > > >>> If the driver didn't receive the interrupt, it couldn't service
> > > >>> it, of course. You could add a printk in the ath9k interrupt
> > > >>> service routine to see if you ever get there.
> > > >>>
> > > >> The interrupt behavior is changing w.r.t amount of debug prints
> > > >> we add. (I kept many prints to aid debug)
> > > >> root@Xilinx-ZCU102-2016_3:~# iw dev wlan0 scan
> > > >> [ 83.064675] ath9k: ath9k_iowrite32 ffffff800a400024
> > > >> [ 83.069486] ath9k: ath9k_ioread32 ffffff800a400024
> > > >> [ 83.074257] ath9k_hw_kill_interrupts 793
> > > >> [ 83.078260] ath9k: ath9k_iowrite32 ffffff800a400024
> > > >> [ 83.083107] ath9k: ath9k_ioread32 ffffff800a400024
> > > >> [ 83.087882] ath9k_hw_kill_interrupts 793
> > > >> [ 83.095450] ath9k_hw_enable_interrupts 821
> > > >> [ 83.099557] ath9k_hw_enable_interrupts 825
> > > >> [ 83.103721] ath9k_hw_enable_interrupts 832
> > > >> [ 83.107887] ath9k: ath9k_iowrite32 ffffff800a400024
> > > >> [ 83.112748] AR_SREV_9100 0
> > > >> [ 83.115438] ath9k_hw_enable_interrupts 848
> > > >> [ 83.119607] ath9k: ath9k_ioread32 ffffff800a400024
> > > >> [ 83.124389] ath9k_hw_intrpend 762
> > > >> [ 83.127761] (AR_SREV_9340(ah) val 0
> > > >> [ 83.131234] ath9k_hw_intrpend 767
> > > >> [ 83.134628] ath_isr 603
> > > >> [ 83.137134] ath9k: ath9k_iowrite32 ffffff800a400024
> > > >> [ 83.141995] ath9k: ath9k_ioread32 ffffff800a400024
> > > >> [ 83.146771] ath9k_hw_kill_interrupts 793
> > > >> [ 83.150864] ath9k_hw_enable_interrupts 821
> > > >> [ 83.154971] ath9k_hw_enable_interrupts 825
> > > >> [ 83.159135] ath9k_hw_enable_interrupts 832
> > > >> [ 83.163300] ath9k: ath9k_iowrite32 ffffff800a400024
> > > >> [ 83.168161] AR_SREV_9100 0
> > > >> [ 83.170852] ath9k_hw_enable_interrupts 848
> > > >> [ 83.170855] ath9k_hw_intrpend 762
> > > >> [ 83.178398] (AR_SREV_9340(ah) val 0
> > > >> [ 83.181873] ath9k_hw_intrpend 767
> > > >> [ 83.185265] ath_isr 603
> > > >> [ 83.187773] ath9k: ath9k_iowrite32 ffffff800a400024
> > > >> [ 83.192635] ath9k: ath9k_ioread32 ffffff800a400024
> > > >> [ 83.197411] ath9k_hw_kill_interrupts 793
> > > >> [ 83.201414] ath9k: ath9k_ioread32 ffffff800a400024
> > > >> [ 83.206258] ath9k_hw_enable_interrupts 821
> > > >> [ 83.210368] ath9k_hw_enable_interrupts 825
> > > >> [ 83.214531] ath9k_hw_enable_interrupts 832
> > > >> [ 83.218698] ath9k: ath9k_iowrite32 ffffff800a400024
> > > >> [ 83.223558] AR_SREV_9100 0
> > > >> [ 83.226243] ath9k_hw_enable_interrupts 848
> > > >> [ 83.226246] ath9k_hw_intrpend 762
> > > >> [ 83.233794] (AR_SREV_9340(ah) val 0
> > > >> [ 83.237268] ath9k_hw_intrpend 767
> > > >> [ 83.240661] ath_isr 603
> > > >> [ 83.243169] ath9k: ath9k_iowrite32 ffffff800a400024
> > > >> [ 83.248030] ath9k: ath9k_ioread32 ffffff800a400024
> > > >> [ 83.252806] ath9k_hw_kill_interrupts 793
> > > >> [ 83.256811] ath9k: ath9k_ioread32 ffffff800a400024
> > > >> [ 83.261651] ath9k_hw_enable_interrupts 821
> > > >> [ 83.265753] ath9k_hw_enable_interrupts 825
> > > >> [ 83.269919] ath9k_hw_enable_interrupts 832
> > > >> [ 83.274083] ath9k: ath9k_iowrite32 ffffff800a400024
> > > >> [ 83.278945] AR_SREV_9100 0
> > > >> [ 83.281630] ath9k_hw_enable_interrupts 848
> > > >> [ 83.281633] ath9k_hw_intrpend 762
> > > >> [ 83.281634] (AR_SREV_9340(ah) val 0
> > > >> [ 83.281637] ath9k_hw_intrpend 767
> > > >> [ 83.281648] ath_isr 603
> > > >> [ 83.281649] ath9k: ath9k_iowrite32 ffffff800a400024
> > > >> [ 83.281651] ath9k: ath9k_ioread32 ffffff800a400024
> > > >> [ 83.281654] ath9k_hw_kill_interrupts 793
> > > >> [ 83.312192] ath9k: ath9k_ioread32 ffffff800a400024
> > > >> [ 83.317030] ath9k_hw_enable_interrupts 821
> > > >> [ 83.321132] ath9k_hw_enable_interrupts 825
> > > >> [ 83.325297] ath9k_hw_enable_interrupts 832
> > > >> [ 83.329463] ath9k: ath9k_iowrite32 ffffff800a400024
> > > >> [ 83.334324] AR_SREV_9100 0
> > > >> [ 83.337014] ath9k_hw_enable_interrupts 848
> > > >> ..
> > > >> ..
> > > >> This log continues until I turn off board without obtaining scanning result.
> > > >>
> > > >> In between I get following cpu stall outputs :
> > > >> [ 230.457179] INFO: rcu_sched self-detected stall on CPU
> > > >> [ 230.457185] 2-...: (31314 ticks this GP) idle=2d1/140000000000001/0 softirq=1400/1400 fqs=36713
> > > >> [ 230.457189] (t=36756 jiffies g=161 c=160 q=16169)
> > > >> [ 230.457191] Task dump for CPU 2:
> > > >> [ 230.457196] kworker/u8:4 R running task 0 1342 2 0x00000002
> > > >> [ 230.457207] Workqueue: phy0 ieee80211_scan_work
> > > >> [ 230.457208] Call trace:
> > > >> [ 230.457214] [<ffffff8008089860>] dump_backtrace+0x0/0x198
> > > >> [ 230.457219] [<ffffff8008089a0c>] show_stack+0x14/0x20
> > > >> [ 230.457224] [<ffffff80080c0930>] sched_show_task+0x98/0xf8
> > > >> [ 230.457228] [<ffffff80080c2628>] dump_cpu_task+0x40/0x50
> > > >> [ 230.457233] [<ffffff80080e14a8>] rcu_dump_cpu_stacks+0xa0/0xf0
> > > >> [ 230.457239] [<ffffff80080e4cd8>] rcu_check_callbacks+0x468/0x748
> > > >> [ 230.457243] [<ffffff80080e7cfc>] update_process_times+0x3c/0x68
> > > >> [ 230.457249] [<ffffff80080f6dfc>] tick_sched_handle.isra.5+0x3c/0x50
> > > >> [ 230.457253] [<ffffff80080f6e54>] tick_sched_timer+0x44/0x90
> > > >> [ 230.457257] [<ffffff80080e86b0>] __hrtimer_run_queues+0xf0/0x178
> > > >> ** 10 printk messages dropped **
> > > >> [ 230.457302] f8c0: 0000000000000000 0000000005f5e0ff 000000000001379a 3866666666666620
> > > >> [ 230.457306] f8e0: ffffff800a1b4065 0000000000000006 ffffff800a129000 ffffffc87b8010a8
> > > >> [ 230.457310] f900: ffffff808a1b4057 ffffff800a1c3000 ffffff800a1b3000 ffffff800a13b000
> > > >> [ 230.457314] f920: 0000000000000140 0000000000000006 ffffff800a1b3b10 ffffff800a1c39e8
> > > >> [ 230.457318] f940: 000000000000002f ffffff800a1b8a98 ffffff800a1b3ae8 ffffffc87b07f990
> > > >> [ 230.457322] f960: ffffff80080d6230 ffffffc87b07f990 ffffff80080d6234 0000000060000145
> > > >> ** 1 printk messages dropped **
> > > >> [ 230.457329] [<ffffff8008085720>] el1_irq+0xa0/0x100
> > > >> ** 9 printk messages dropped **
> > > >> [ 230.457373] [<ffffff800885ad60>] ieee80211_hw_config+0x50/0x290
> > > >> [ 230.457377] [<ffffff8008863690>] ieee80211_scan_work+0x1f8/0x480
> > > >> [ 230.457383] [<ffffff80080b15d0>] process_one_work+0x120/0x378
> > > >> [ 230.457386] [<ffffff80080b1870>] worker_thread+0x48/0x4b0
> > > >> [ 230.457391] [<ffffff80080b7108>] kthread+0xd0/0xe8
> > > >> [ 230.457395] [<ffffff8008085dd0>] ret_from_fork+0x10/0x40
> > > >> [ 230.480389] ath9k_hw_intrpend 762
> > > >>
> > > >>
> > > >> [ 545.487987] ath9k: ath9k_ioread32 ffffff800a400024
> > > >> [ 545.526189] INFO: rcu_sched self-detected stall on CPU
> > > >> [ 545.526195] 2-...: (97636 ticks this GP) idle=2d1/140000000000001/0 softirq=1400/1400 fqs=115374
> > > >> [ 545.526199] (t=115523 jiffies g=161 c=160 q=51066)
> > > >> [ 545.526201] Task dump for CPU 2:
> > > >> [ 545.526206] kworker/u8:4 R running task 0 1342 2 0x00000002
> > > >> ** 3 printk messages dropped **
> > > >> [ 545.526231] [<ffffff8008089a0c>] show_stack+0x14/0x20
> > > >> ** 9 printk messages dropped **
> > > >> [ 545.526280] [<ffffff80086a71e8>] arch_timer_handler_phys+0x30/0x40
> > > >> [ 545.526284] [<ffffff80080dbe18>] handle_percpu_devid_irq+0x78/0xa0
> > > >> [ 545.526291] [<ffffff80080d760c>] generic_handle_irq+0x24/0x38
> > > >> [ 545.526296] [<ffffff80080d7944>] __handle_domain_irq+0x5c/0xb8
> > > >> [ 545.526299] [<ffffff80080824bc>] gic_handle_irq+0x64/0xc0
> > > >> [ 545.526302] Exception stack(0xffffffc87b07f870 to 0xffffffc87b07f990)
> > > >> [ 545.526306] f860: 0000000000009732 ffffff800a1eaaa8
> > > >> ** 8 printk messages dropped **
> > > >> [ 545.526341] f980: ffffff800a1c39e8 0000000000000036
> > > >> [ 545.526345] [<ffffff8008085720>] el1_irq+0xa0/0x100
> > > >> [ 545.526349] [<ffffff80080d6234>] console_unlock+0x384/0x5b0
> > > >> [ 545.526353] [<ffffff80080d673c>] vprintk_emit+0x2dc/0x4b0
> > > >> [ 545.526357] [<ffffff80080d6a50>] vprintk_default+0x38/0x40
> > > >> [ 545.526362] [<ffffff8008129704>] printk+0x58/0x60
> > > >> [ 545.526366] [<ffffff800859e3e4>] ath9k_iowrite32+0x9c/0xa8
> > > >> [ 545.526372] [<ffffff80085c7ca8>] ath9k_hw_kill_interrupts+0x28/0xf0
> > > >> [ 545.526376] [<ffffff80085a18ec>] ath_reset+0x24/0x68
> > > >> ** 2 printk messages dropped **
> > > >> [ 545.526391] [<ffffff800885ad60>] ieee80211_hw_config+0x50/0x290
> > > >> ** 11 printk messages dropped **
> > > >> [ 545.532834] ath9k_hw_kill_interrupts 793
> > > >> [ 545.532890] ath9k_hw_enable_interrupts 821
> > >
> > > [ 81.876902] INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > [ 81.876912] Tasks blocked on level-0 rcu_node (CPUs 0-7): P0
> > > [ 81.876932] (detected by 4, t=60002 jiffies, g=1873, c=1872, q=4967)
> > > [ 81.876936] swapper/4 R running task 0 0 1 0x00000000
> > > [ 81.876941] 0000000000000001 ffffffff810725f6 ffff88017edbc240 ffffffff81a3dc40
> > > [ 81.876945] ffffffff81101e46 ffff88025ef173c0 ffffffff81a3dc40 ffffffff81a3dc40
> > > [ 81.876948] 00000000ffffffff ffffffff810a7333 ffff88017ecee698 ffff88017edbc240
> > > [ 81.876951] Call Trace:
> > > [ 81.876970] <IRQ>
> > > [ 81.876979] [<ffffffff810725f6>] ? sched_show_task+0xd6/0x140
> > > [ 81.876983] [<ffffffff81101e46>] ? rcu_print_detail_task_stall_rnp+0x40/0x61
> > > [ 81.876989] [<ffffffff810a7333>] ? rcu_check_callbacks+0x6b3/0x8c0
> > > [ 81.876993] [<ffffffff810b8350>] ? tick_sched_handle.isra.14+0x40/0x40
> > > [ 81.876996] [<ffffffff810aa4c3>] ? update_process_times+0x23/0x50
> > > [ 81.876999] [<ffffffff810b8383>] ? tick_sched_timer+0x33/0x60
> > > [ 81.877002] [<ffffffff810aaf09>] ? __hrtimer_run_queues+0xb9/0x150
> > > [ 81.877004] [<ffffffff810ab198>] ? hrtimer_interrupt+0x98/0x1a0
> > > [ 81.877008] [<ffffffff81031b1e>] ? smp_trace_apic_timer_interrupt+0x5e/0x90
> > > [ 81.877012] [<ffffffff815b31bf>] ? apic_timer_interrupt+0x7f/0x90
> > > [ 81.877013] <EOI>
> > > [ 81.877017] [<ffffffff8147f28d>] ? cpuidle_enter_state+0x13d/0x1f0
> > > [ 81.877019] [<ffffffff8147f289>] ? cpuidle_enter_state+0x139/0x1f0
> > > [ 81.877021] [<ffffffff81088c19>] ? cpu_startup_entry+0x139/0x210
> > > [ 81.877027] [<ffffffff8102fc9e>] ? start_secondary+0x13e/0x170
> > > [ 81.877029] swapper/4 R running task 0 0 1 0x00000000
> > > [ 81.877032] 0000000000000001 ffffffff810725f6 ffff88017edbc240 ffffffff81a3dc40
> > > [ 81.877035] ffffffff81101e46 ffff88025ef173c0 ffffffff81a3dc40 ffffffff81a3dc40
> > > [ 81.877038] 00000000ffffffff ffffffff810a7368 ffff88017ecee698 ffff88017edbc240
> > > [ 81.877041] Call Trace:
> > > [ 81.877045] <IRQ>
> > > [ 81.877049] [<ffffffff810725f6>] ? sched_show_task+0xd6/0x140
> > > [ 81.877051] [<ffffffff81101e46>] ? rcu_print_detail_task_stall_rnp+0x40/0x61
> > > [ 81.877055] [<ffffffff810a7368>] ? rcu_check_callbacks+0x6e8/0x8c0
> > > [ 81.877058] [<ffffffff810b8350>] ? tick_sched_handle.isra.14+0x40/0x40
> > > [ 81.877060] [<ffffffff810aa4c3>] ? update_process_times+0x23/0x50
> > > [ 81.877063] [<ffffffff810b8383>] ? tick_sched_timer+0x33/0x60
> > > [ 81.877065] [<ffffffff810aaf09>] ? __hrtimer_run_queues+0xb9/0x150
> > > [ 81.877068] [<ffffffff810ab198>] ? hrtimer_interrupt+0x98/0x1a0
> > > [ 81.877070] [<ffffffff81031b1e>] ? smp_trace_apic_timer_interrupt+0x5e/0x90
> > > [ 81.877073] [<ffffffff815b31bf>] ? apic_timer_interrupt+0x7f/0x90
> > > [ 81.877074] <EOI>
> > > [ 81.877076] [<ffffffff8147f28d>] ? cpuidle_enter_state+0x13d/0x1f0
> > > [ 81.877078] [<ffffffff8147f289>] ? cpuidle_enter_state+0x139/0x1f0
> > > [ 81.877080] [<ffffffff81088c19>] ? cpu_startup_entry+0x139/0x210
> > > [ 81.877084] [<ffffffff8102fc9e>] ? start_secondary+0x13e/0x170
> > > [ 91.132787] INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P0 } 63785 jiffies s: 505 root: 0x0/T
> > > [ 91.132796] blocking rcu_node structures:
> > >
> > > >>
> > > >>
> > > >> But if we have less debug prints it does not reach EP handler
> > > >> sometimes, due to following Condition in "kernel/irq/chip.c" in
> > > >> function handle_simple_irq
> > > >>
> > > >> if (unlikely(!desc->action || irqd_irq_disabled(&desc->irq_data))) {
> > > >>         desc->istate |= IRQS_PENDING;
> > > >>         goto out_unlock;
> > > >> }
> > > >> Here irqd_irq_disabled is being set to 1.
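For readers following the archive, the effect of the check quoted above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel implementation: the names sketch_irq_desc, sketch_handle_simple_irq, sketch_count_handler and SKETCH_IRQS_PENDING are hypothetical, and locking, statistics and the later replay of a pending interrupt are left out. It shows only that, while the line is marked disabled, the handler is skipped and the interrupt is recorded as pending.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's IRQS_PENDING state bit. */
#define SKETCH_IRQS_PENDING 0x1u

/* Hypothetical, heavily reduced model of an interrupt descriptor. */
struct sketch_irq_desc {
    void (*action)(void *ctx); /* registered handler, NULL if none */
    void *ctx;                 /* argument passed to the handler */
    bool disabled;             /* models irqd_irq_disabled() returning true */
    unsigned int istate;       /* internal state bits */
};

/*
 * Models the quoted condition: with no handler registered, or with the
 * line disabled, the interrupt is only marked pending and the handler
 * is not called. Returns true if the handler actually ran.
 */
static bool sketch_handle_simple_irq(struct sketch_irq_desc *desc)
{
    if (desc->action == NULL || desc->disabled) {
        desc->istate |= SKETCH_IRQS_PENDING;
        return false;
    }
    desc->action(desc->ctx);
    return true;
}

/* Example handler: counts how many times it was invoked. */
static void sketch_count_handler(void *ctx)
{
    ++*(unsigned int *)ctx;
}
```

In this model, a descriptor with disabled set to true never reaches sketch_count_handler, which matches the reported symptom that the endpoint handler is sometimes not reached.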
> > > >>
> > > >> With lesser debug prints it stops after following prints:
> > > >> root@Xilinx-ZCU102-2016_3:~# iw dev wlan0 scan
> > > >> [ 54.781045] ath9k_hw_kill_interrupts 793
> > > >> [ 54.785007] ath9k_hw_kill_interrupts 793
> > > >> [ 54.792535] ath9k_hw_enable_interrupts 821
> > > >> [ 54.796642] ath9k_hw_enable_interrupts 825
> > > >> [ 54.800807] ath9k_hw_enable_interrupts 832
> > > >> [ 54.804973] AR_SREV_9100 0
> > > >> [ 54.807663] ath9k_hw_enable_interrupts 848
> > > >> [ 54.811843] ath9k_hw_intrpend 762
> > > >> [ 54.815211] (AR_SREV_9340(ah) val 0
> > > >> [ 54.818684] ath9k_hw_intrpend 767
> > > >> [ 54.822078] ath_isr 603
> > > >> [ 54.824587] ath9k_hw_kill_interrupts 793
> > > >> [ 54.828601] ath9k_hw_enable_interrupts 821
> > > >> [ 54.832750] ath9k_hw_enable_interrupts 825
> > > >> [ 54.836916] ath9k_hw_enable_interrupts 832
> > > >> [ 54.841082] AR_SREV_9100 0
> > > >> [ 54.843772] ath9k_hw_enable_interrupts 848
> > > >> [ 54.843775] ath9k_hw_intrpend 762
> > > >> [ 54.851319] (AR_SREV_9340(ah) val 0
> > > >> [ 54.854793] ath9k_hw_intrpend 767
> > > >> [ 54.858185] ath_isr 603
> > > >> [ 54.860696] ath9k_hw_kill_interrupts 793
> > > >> [ 54.864776] ath9k_hw_enable_interrupts 821
> > > >> [ 54.867061] ath9k_hw_kill_interrupts 793
> > > >> [ 54.872870] ath9k_hw_enable_interrupts 825
> > > >> [ 54.877036] ath9k_hw_enable_interrupts 832
> > > >> [ 54.881202] AR_SREV_9100 0
> > > >> [ 54.883892] ath9k_hw_enable_interrupts 848
> > > >> [ 75.963129] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > >> [ 75.968602] 0-...: (2 GPs behind) idle=9d5/140000000000001/0 softirq=1103/1109 fqs=519
> > > >> [ 75.976675] (detected by 2, t=5274 jiffies, g=64, c=63, q=11)
> > > >> [ 75.982485] Task dump for CPU 0:
> > > >> [ 75.985696] ksoftirqd/0 R running task 0 3 2 0x00000002
> > > >> [ 75.992726] Call trace:
> > > >> [ 75.995165] [<ffffff8008086b3c>] __switch_to+0xc4/0xd0
> > > >> [ 76.000281] [<ffffffc87b830500>] 0xffffffc87b830500
> > > >> [ 139.059027] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > >> [ 139.064430] 0-...: (2 GPs behind) idle=9d5/140000000000001/0 softirq=1103/1109 fqs=2097
> > > >> [ 139.072593] (detected by 2, t=21049 jiffies, g=64, c=63, q=11)
> > > >> [ 139.078489] Task dump for CPU 0:
> > > >> [ 139.081700] ksoftirqd/0 R running task 0 3 2 0x00000002
> > > >> [ 139.088731] Call trace:
> > > >> [ 139.091165] [<ffffff8008086b3c>] __switch_to+0xc4/0xd0
> > > >> [ 139.096285] [<ffffffc87b830500>] 0xffffffc87b830500
> > > >>
> > > >>
> > > >>>> We are not seeing any issues on 32-bit ARM platform and X86
> > > >>>> platform.
> > > >>> Can you collect a dmesg log (or, if the system hang means you
> > > >>> can't collect that, a console log with "ignore_loglevel"), and "lspci -vv"
> > > >>> output as root? That should have clues about whether the INTx
> > > >>> got routed correctly. /proc/interrupts should also show whether
> > > >>> we're receiving interrupts from the device.
> > > >> Here is the lspci output:
> > > >> 00:00.0 PCI bridge: Xilinx Corporation Device d022 (prog-if 00 [Normal decode])
> > > >> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> > > >> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > > >> Latency: 0
> > > >> Interrupt: pin A routed to IRQ 224
> > > >> Bus: primary=00, secondary=01, subordinate=0c, sec-latency=0
> > > >> I/O behind bridge: 00000000-00000fff
> > > >> Memory behind bridge: e0000000-e00fffff
> > > >> Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
> > > >> Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
> > > >> BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
> > > >> PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
> > > >> Capabilities: [40] Power Management version 3
> > > >> Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
> > > >> Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > > >> Capabilities: [60] Express (v2) Root Port (Slot-), MSI 00
> > > >> DevCap: MaxPayload 256 bytes, PhantFunc 0
> > > >> ExtTag- RBE+
> > > >> DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> > > >> RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
> > > >> MaxPayload 128 bytes, MaxReadReq 512 bytes
> > > >> DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend+
> > > >> LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
> > > >> ClockPM- Surprise- LLActRep- BwNot+ ASPMOptComp+
> > > >> LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
> > > >> ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> > > >> LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > > >> RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible+
> > > >> RootCap: CRSVisible+
> > > >> RootSta: PME ReqID 0000, PMEStatus- PMEPending-
> > > >> DevCap2: Completion Timeout: Range B, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
> > > >> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
> > > >> LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
> > > >> Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> > > >> Compliance De-emphasis: -6dB
> > > >> LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
> > > >> EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> > > >> Capabilities: [100 v1] Device Serial Number 00-00-00-00-00-00-00-00
> > > >> Capabilities: [10c v1] Virtual Channel
> > > >> Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
> > > >> Arb: Fixed- WRR32- WRR64- WRR128-
> > > >> Ctrl: ArbSelect=Fixed
> > > >> Status: InProgress-
> > > >> VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
> > > >> Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
> > > >> Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
> > > >> Status: NegoPending- InProgress-
> > > >> Capabilities: [128 v1] Vendor Specific Information: ID=1234 Rev=1 Len=018 <?>
> > > >>
> > > >> 01:00.0 Network controller: Qualcomm Atheros AR93xx Wireless Network Adapter (rev 01)
> > > >> Subsystem: Qualcomm Atheros Device 3112
> > > >> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> > > >> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > > >> Latency: 0, Cache Line Size: 128 bytes
> > > >> Interrupt: pin A routed to IRQ 224
> > > >> Region 0: Memory at e0000000 (64-bit, non-prefetchable) [size=128K]
> > > >> [virtual] Expansion ROM at e0020000 [disabled] [size=64K]
> > > >> Capabilities: [40] Power Management version 3
> > > >> Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
> > > >> Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> > > >> Capabilities: [50] MSI: Enable- Count=1/4 Maskable+ 64bit+
> > > >> Address: 0000000000000000 Data: 0000
> > > >> Masking: 00000000 Pending: 00000000
> > > >> Capabilities: [70] Express (v2) Endpoint, MSI 00
> > > >> DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
> > > >> ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
> > > >> DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> > > >> RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
> > > >> MaxPayload 128 bytes, MaxReadReq 512 bytes
> > > >> DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> > > >> LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <2us, L1 <64us
> > > >> ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
> > > >> LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
> > > >> ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> > > >> LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > > >> DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
> > > >> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> > > >> LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
> > > >> Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> > > >> Compliance De-emphasis: -6dB
> > > >> LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
> > > >> EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> > > >> Capabilities: [100 v1] Advanced Error Reporting
> > > >> UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > > >> UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > > >> UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> > > >> CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> > > >> CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > > >> AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
> > > >> Capabilities: [140 v1] Virtual Channel
> > > >> Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
> > > >> Arb: Fixed- WRR32- WRR64- WRR128-
> > > >> Ctrl: ArbSelect=Fixed
> > > >> Status: InProgress-
> > > >> VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
> > > >> Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
> > > >> Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
> > > >> Status: NegoPending- InProgress-
> > > >> Capabilities: [300 v1] Device Serial Number 00-00-00-00-00-00-00-00
> > > >> Kernel driver in use: ath9k
> > > >>
> > > >> Here is the cat /proc/interrupts (after we do interface up):
> > > >>
> > > >> root@:~# ifconfig wlan0 up
> > > >> [ 1548.926601] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
> > > >> root@Xilinx-ZCU102-2016_3:~# cat /proc/interrupts
> > > >>        CPU0   CPU1   CPU2   CPU3
> > > >>   1:      0      0      0      0  GICv2  29 Edge   arch_timer
> > > >>   2:  19873  20058  19089  17435  GICv2  30 Edge   arch_timer
> > > >>  12:      0      0      0      0  GICv2 156 Level  zynqmp-dma
> > > >>  13:      0      0      0      0  GICv2 157 Level  zynqmp-dma
> > > >>  14:      0      0      0      0  GICv2 158 Level  zynqmp-dma
> > > >>  15:      0      0      0      0  GICv2 159 Level  zynqmp-dma
> > > >>  16:      0      0      0      0  GICv2 160 Level  zynqmp-dma
> > > >>  17:      0      0      0      0  GICv2 161 Level  zynqmp-dma
> > > >>  18:      0      0      0      0  GICv2 162 Level  zynqmp-dma
> > > >>  19:      0      0      0      0  GICv2 163 Level  zynqmp-dma
> > > >>  20:      0      0      0      0  GICv2 164 Level  Mali_GP_MMU, Mali_GP, Mali_PP0_MMU, Mali_PP0, Mali_PP1_MMU, Mali_PP1
> > > >>  30:      0      0      0      0  GICv2  95 Level  eth0, eth0
> > > >> 206:    314      0      0      0  GICv2  49 Level  cdns-i2c
> > > >> 207:     40      0      0      0  GICv2  50 Level  cdns-i2c
> > > >> 209:      0      0      0      0  GICv2 150 Level  nwl_pcie:misc
> > > >> 214:     12      0      0      0  GICv2  47 Level  ff0f0000.spi
> > > >> 215:      0      0      0      0  GICv2  58 Level  ffa60000.rtc
> > > >> 216:      0      0      0      0  GICv2  59 Level  ffa60000.rtc
> > > >> 217:      0      0      0      0  GICv2 165 Level  ahci-ceva[fd0c0000.ahci]
> > > >> 218:     61      0      0      0  GICv2  81 Level  mmc0
> > > >> 219:      0      0      0      0  GICv2 187 Level  arm-smmu global fault
> > > >> 220:    471      0      0      0  GICv2  53 Level  xuartps
> > > >> 223:      0      0      0      0  GICv2 154 Level  fd4c0000.dma
> > > >> 224:      3      0      0      0  dummy   1 Edge   ath9k
> > > >> 225:      0      0      0      0  GICv2  97 Level  xhci-hcd:usb1
> > > >>
> > > >> Regards,
> > > >> Bharat
> >
^ permalink raw reply
* Re: [PATCH v2 0/7] ath9k: EEPROM swapping improvements
From: Adrian Chadd @ 2016-12-14 6:45 UTC (permalink / raw)
To: Martin Blumenstingl
Cc: Valo, Kalle, ath9k-devel, linux-wireless@vger.kernel.org,
ath9k-devel@lists.ath9k.org, devicetree@vger.kernel.org,
arnd@arndb.de, chunkeey@googlemail.com, nbd@nbd.name
In-Reply-To: <CAFBinCC6JWBhZwma=66fBi3_to2SaHOMNDQS23jHNhcc+RUcYQ@mail.gmail.com>
hi,
On 12 December 2016 at 12:05, Martin Blumenstingl
<martin.blumenstingl@googlemail.com> wrote:
>
> It seems that there are a few devices out there where the whole EEPROM
> is swab16'ed which switches the position of the 1-byte fields
> opCapFlags and eepMisc.
> those still work fine with the new code, however I had a second patch
> in LEDE [0] which results in ath9k_platform_data.endian_check NOT
> being set anymore.
> that endian_check flag was used before to swab16 the whole EEPROM, to
> correct the position of the 1-byte fields again.
> Currently we are fixing this in the firmware hotplug script: [1]
> This is definitely not a blocker for this series though (if we want to
> have a devicetree replacement for "ath9k_platform_data.endian_check"
> then I'd work on that within a separate series, but I somewhat
> consider these EEPROMs as "broken" so fixing them in
> userspace/firmware hotplug script is fine for me)
As a reference - the reference driver has been doing this for a while.
It attempts to detect the endianness by looking at the 0xa55a
signature endian and figuring out which endian the actual contents are
in.
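A minimal sketch of that kind of signature check, assuming the EEPROM is read as 16-bit words and that a byte-swapped image shows the signature with its two bytes exchanged. The names below (EEPROM_MAGIC_SKETCH, eeprom_detect_endian, eeprom_swab16_all, swab16_sketch) are made up for illustration; this is not the reference driver's or ath9k's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Signature value mentioned in the thread. */
#define EEPROM_MAGIC_SKETCH 0xa55au

enum eeprom_endian_sketch {
    EEPROM_NATIVE,   /* signature reads as 0xa55a, no swap needed */
    EEPROM_SWAPPED,  /* signature reads as 0x5aa5, contents are swab16'ed */
    EEPROM_INVALID   /* neither form of the signature was found */
};

/* Exchange the two bytes of a 16-bit word. */
static uint16_t swab16_sketch(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Classify the EEPROM contents from the 16-bit signature word alone. */
static enum eeprom_endian_sketch eeprom_detect_endian(uint16_t magic_word)
{
    if (magic_word == EEPROM_MAGIC_SKETCH)
        return EEPROM_NATIVE;
    if (swab16_sketch(magic_word) == EEPROM_MAGIC_SKETCH)
        return EEPROM_SWAPPED;
    return EEPROM_INVALID;
}

/*
 * Undo a whole-image swab16. Every pair of adjacent 1-byte fields
 * (such as opCapFlags and eepMisc in the quoted mail) changes places
 * again, which puts them back where the parser expects them.
 */
static void eeprom_swab16_all(uint16_t *words, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        words[i] = swab16_sketch(words[i]);
}
```

A caller would run eeprom_detect_endian() on the signature word first and call eeprom_swab16_all() only when it reports EEPROM_SWAPPED.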
So just FYI yeah, this is a "thing" for reasons I don't quite know.
-adrian
>
>
> Regards,
> Martin
>
>
> [0] https://git.lede-project.org/?p=source.git;a=commitdiff;h=a20616863d32d91163043b6657a63c836bd9c5ba
> [1] https://git.lede-project.org/?p=source.git;a=commitdiff;h=afa37092663d00aa0abf8c61943d9a1b5558b144
^ permalink raw reply
* Re: ath10k firmware crashes in mesh mode on QCA9880
From: Adrian Chadd @ 2016-12-14 6:17 UTC (permalink / raw)
To: Alexis Green
Cc: Manoharan, Rajkumar, Benjamin Morgan, Nagarajan, Ashok Raj,
Mohammed Shafi Shajakhan, lede-dev@lists.infradead.org,
linux-wireless@vger.kernel.org, ath10k@lists.infradead.org
In-Reply-To: <CAAnMG+MY=LhfJxxUVdfRNYUW-2R+EXdLm2o6yDcCH0T1brRVxw@mail.gmail.com>
Hi!
ok, thanks! I've seen some .. annoying rate control related firmware
crashes if you aren't using 11ac / 11n rates (ie you're /really/
legacy), so I wondered if something similar is going on here.
Thanks!
-a
On 13 December 2016 at 22:06, Alexis Green <agreen@cococorp.com> wrote:
> Hi Adrian,
>
> I have not done much testing of ath10k and ath9k devices in a single
> encrypted mesh recently, but I have a memory of only having this issue
> when communicating between ath10k devices.
>
> Alexis
>
> On Tue, Dec 13, 2016 at 9:53 PM, Adrian Chadd <adrian@freebsd.org> wrote:
>> Hi!
>>
>> Hm! So is there a firmware bug if there are 11n only capable nodes in
>> an 11s mesh?
>>
>>
>>
>> -adrian
^ permalink raw reply
* Re: ath10k firmware crashes in mesh mode on QCA9880
From: Adrian Chadd @ 2016-12-14 5:53 UTC (permalink / raw)
To: Alexis Green
Cc: Manoharan, Rajkumar, Benjamin Morgan, Nagarajan, Ashok Raj,
Mohammed Shafi Shajakhan, lede-dev@lists.infradead.org,
linux-wireless@vger.kernel.org, ath10k@lists.infradead.org
In-Reply-To: <CAAnMG+OmeataRKYRjOjtBXF_1-De-F4zWbUsWoYRhVu02NPrrw@mail.gmail.com>
Hi!
Hm! So is there a firmware bug if there are 11n only capable nodes in
an 11s mesh?
-adrian
^ permalink raw reply
* Re: ath10k firmware crashes in mesh mode on QCA9880
From: Alexis Green @ 2016-12-14 6:06 UTC (permalink / raw)
To: Adrian Chadd
Cc: Manoharan, Rajkumar, Benjamin Morgan, Nagarajan, Ashok Raj,
Mohammed Shafi Shajakhan, lede-dev@lists.infradead.org,
linux-wireless@vger.kernel.org, ath10k@lists.infradead.org
In-Reply-To: <CAJ-Vmo=iLryhOS--PPYCGa9pE2GovPDOXi6xoH22xFOYXXKNVA@mail.gmail.com>
Hi Adrian,
I have not done much testing of ath10k and ath9k devices in a single
encrypted mesh recently, but I have a memory of only having this issue
when communicating between ath10k devices.
Alexis
On Tue, Dec 13, 2016 at 9:53 PM, Adrian Chadd <adrian@freebsd.org> wrote:
> Hi!
>
> Hm! So is there a firmware bug if there are 11n only capable nodes in
> an 11s mesh?
>
>
>
> -adrian
^ permalink raw reply
* Re: ath10k firmware crashes in mesh mode on QCA9880
From: Alexis Green @ 2016-12-14 5:36 UTC (permalink / raw)
To: Manoharan, Rajkumar
Cc: Benjamin Morgan, Nagarajan, Ashok Raj, Mohammed Shafi Shajakhan,
lede-dev@lists.infradead.org, linux-wireless@vger.kernel.org,
ath10k@lists.infradead.org
In-Reply-To: <ccf17009a52748c99accdaa005ea837a@NALASEXR01H.na.qualcomm.com>
Thank you for your help Rajkumar,
We've traced the problem down to a peering issue. Looks like there was
a missing compile flag that caused some kind of incongruence. My best
guess is that beacons are generated by firmware and advertise support
for AC mode, whereas wpa_supplicant, when not compiled with
CONFIG_IEEE80211AC=y, sends mesh peering messages and creates peers
without AC support, causing firmware to get confused. After
recompiling supplicant with the correct flag, no more crashes were
observed in casual testing. I submitted a pull request to LEDE to,
hopefully, fix it in upstream.
Best regards,
Alexis
On Tue, Dec 13, 2016 at 3:51 PM, Manoharan, Rajkumar
<rmanohar@qca.qualcomm.com> wrote:
>> Tested the 10.2.4.70.59-2 firmware and wpa_supplicant running WITHOUT
>> encryption and it still crashes. I suspect this means wpa_supplicant is setting up
>> the interface incorrectly and/or transmitting a malformed packet that is causing
>> the driver to crash.
>>
> Ben,
>
> IIRC mesh support was validated in qca988x in VHT mode while ago. Either it could
> be regression in driver/fw or lede mac80211 package.
>
> 1) Could you please try plain backports in lede w/o applying ath10k patches.
> I do see 160MHz support in LEDE.
> 2) There are some peer stats dump from your earlier log. Disable peer stats
> by "peer_stats" debugfs.
> 3) Please confirm the behavior with older firmware revisions.
> 4) use iw to bring up open mesh to rule out wpa_s config
>
> -Rajkumar
>
^ permalink raw reply
* RE: ath10k firmware crashes in mesh mode on QCA9880
From: Manoharan, Rajkumar @ 2016-12-13 23:51 UTC (permalink / raw)
To: Benjamin Morgan, Nagarajan, Ashok Raj, Mohammed Shafi Shajakhan
Cc: agreen@cococorp.com, lede-dev@lists.infradead.org,
linux-wireless@vger.kernel.org, ath10k@lists.infradead.org
In-Reply-To: <5850831D.7020206@cococorp.com>
> Tested the 10.2.4.70.59-2 firmware and wpa_supplicant running WITHOUT
> encryption and it still crashes. I suspect this means wpa_supplicant is setting up
> the interface incorrectly and/or transmitting a malformed packet that is causing
> the driver to crash.
>
Ben,

IIRC mesh support was validated in qca988x in VHT mode while ago. Either it could
be regression in driver/fw or lede mac80211 package.

1) Could you please try plain backports in lede w/o applying ath10k patches.
     I do see 160MHz support in LEDE.
2) There are some peer stats dump from your earlier log. Disable peer stats
     by "peer_stats" debugfs.
3) Please confirm the behavior with older firmware revisions.
4) use iw to bring up open mesh to rule out wpa_s config

-Rajkumar
^ permalink raw reply