* memory leak in scan with 9170?
From: Chuck Crisler @ 2010-09-27 17:16 UTC
To: linux-wireless

I have modified my code that is using a 9170. I am really concerned about roaming and so am testing that pretty hard. Yesterday I had a loop that forced a DISCONNECT followed by a REASSOCIATE every 30 seconds. After between 1:30 and 1:40 it failed by no longer receiving scan results. When I looked into a log, the very last scan results that I received had a reduced number of BSSs, down from 10-12 per scan to 4, then the next scan was zero. It never recovered. All scans always failed to return any results from then on and, of course, the re-associate failed.

This 'feels' to me like a memory leak somewhere, either in the firmware or the driver. I am running the 2.6.31 kernel/driver and the dual file firmware and version 0.6.10 of the supplicant. At the moment I am running another test where it roams every 60 seconds rather than 30 seconds to see what kind of difference that makes.

I know that my kernel is old, but for now I don't have a choice. Does anyone have any experience like this or insight into this new problem? This is an embedded device that doesn't have the memory of a PC. Is there some way that I could instrument something to check this?

Thank you,
Chuck
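
One rough way to instrument the suspected leak is to sample /proc/meminfo while the disconnect/reassociate loop runs and watch whether free memory drifts down or slab usage drifts up. The following is only a minimal sketch, assuming Python is available on the target (on a small embedded system the same idea works with cat and a shell loop). The function names, the choice of the MemFree and Slab fields, and the 30-second interval are illustrative assumptions, not anything taken from the thread.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value} dict (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_delta(before, after, fields=("MemFree", "Slab")):
    """Return after - before for each field present in both samples."""
    return {f: after[f] - before[f] for f in fields if f in before and f in after}


def sample_forever(interval_s=30, path="/proc/meminfo"):
    """Print the drift relative to the first sample every interval_s seconds."""
    with open(path) as fh:
        baseline = parse_meminfo(fh.read())
    while True:
        time.sleep(interval_s)
        with open(path) as fh:
            current = parse_meminfo(fh.read())
        print(time.strftime("%H:%M:%S"), memory_delta(baseline, current))
```

Calling sample_forever() in a second terminal during the roam test gives one line per interval. A steadily shrinking MemFree together with a growing Slab would point toward a kernel-side leak, while flat numbers would suggest looking elsewhere. This only sees host memory, so it cannot confirm or rule out a leak inside the device firmware.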

* Re: memory leak in scan with 9170?
From: Luis R. Rodriguez @ 2010-09-27 17:31 UTC
To: Chuck Crisler; +Cc: linux-wireless

On Mon, Sep 27, 2010 at 10:16 AM, Chuck Crisler <ccrisler@vgocom.com> wrote:
> I have modified my code that is using a 9170. I am really concerned about
> roaming and so am testing that pretty hard. Yesterday I had a loop that
> forced a DISCONNECT followed by a REASSOCIATE every 30 seconds. After
> between 1:30 and 1:40 it failed by no longer receiving scan results. When I
> looked into a log, the very last scan results that I received had a reduced
> number of BSSs, down from 10-12 per scan to 4, then the next scan was zero.
> It never recovered. All scans always failed to return any results from then
> on and, of course, the re-associate failed. This 'feels' to me like a memory
> leak somewhere, either in the firmware or the driver. I am running the
> 2.6.31 kernel/driver and the dual file firmware and version 0.6.10 of the
> supplicant.

Both are ancient. Please try compat-wireless-2.6.36-rc3-1. I will soon make a new release with some stable fixes applied which are not yet in Linus' tree, and which I think will help a lot with your roaming testing. I should also note roaming was not possible until circa 2.6.33, when Jouni allowed cfg80211 to authenticate to two APs at the same time and then move over to the new one to associate.

Also, although technically older userspace should work with newer kernels, I have noted some issues with some really old supplicants on current kernels. I don't think there has been enough motivation to track down the exact issues, though, so your best bet is to just upgrade the supplicant.

> At the moment I am running another test where it roams every 60
> seconds rather than 30 seconds to see what kind of difference that makes. I
> know that my kernel is old, but for now I don't have a choice. Does anyone
> have any experience like this or insight into this new problem? This is an
> embedded device that doesn't have the memory of a PC. Is there some way that
> I could instrument something to check this?

I'm testing roaming by using wpa_cli roam <bss> in an ESS every 5 seconds. To really stress test the hell out of this I force a roam every second too; it's quite fun. It created a crash, but I think we now know one of the main issues behind some warnings, and Johannes has been brainstorming some solution. I don't suspect you'll hit these corner cases unless you roam every 2 seconds or so. The warnings are related to the fact that we assume the STA peer channel is the currently operating one when we TX a frame, and if we already associated to another station when moving from 2.4 GHz to 5 GHz we can potentially be trying to send a frame to a peer with no valid bitrate.

You can use my script to test stuff as well:

http://bombadil.infradead.org/~mcgrof/test-roam

For example, if you already know your ESS, just replace the ESS variable with the set of BSSes for your ESS; they all must be on the same SSID, though.

Luis
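
The roam-every-N-seconds test described above can be sketched as a small round-robin loop around wpa_cli roam. This is not the test-roam script linked in the message (its contents are not shown in this thread); it is a hypothetical illustration that assumes a wpa_supplicant recent enough to support the roam command, an interface named wlan0, and a caller-supplied list of BSSIDs that all belong to the same SSID. The run and sleep parameters exist only so the loop can be exercised without real hardware.

```python
import itertools
import subprocess
import time


def roam_command(bssid, iface="wlan0"):
    """Build the wpa_cli invocation that asks the supplicant to roam to bssid."""
    return ["wpa_cli", "-i", iface, "roam", bssid]


def roam_loop(bssids, count, interval_s=5.0, iface="wlan0",
              run=subprocess.call, sleep=time.sleep):
    """Roam round-robin across bssids `count` times.

    Returns a list of (bssid, exit_status) pairs, one per attempt.
    """
    results = []
    for bssid in itertools.islice(itertools.cycle(bssids), count):
        status = run(roam_command(bssid, iface))
        results.append((bssid, status))
        sleep(interval_s)  # pause before forcing the next roam
    return results
```

For instance, roam_loop(["02:00:00:00:00:01", "02:00:00:00:00:02"], count=200) would force 200 roams at the 5-second pace mentioned above; the two BSSIDs here are placeholders. Lowering interval_s toward 1 or 2 seconds moves into the range where the warnings discussed above were reported.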

* Re: memory leak in scan with 9170?
From: Chuck Crisler @ 2010-09-27 22:40 UTC
To: linux-wireless

Well, (as usual) I was wrong. It isn't a memory problem. It seems that after some indeterminate time, the USB interface locks up. When we try to take it down (ifconfig wlan0 down) we get a message about outstanding urbs. By powering down the 9170 we can re-set the device and get it to re-associate and resume work. So, the problem is a USB problem. The question is if it is a module problem or a system problem. We are typically seeing this after 50-200 reassociations. If we don't reassociate, it doesn't seem to occur. Does anyone else have experience or insight into this?

Chuck
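
To pin down at which reassociation the lockup happens, it may help to log the number of BSSs in each scan and flag the point where results stop arriving. The sketch below counts entries in the text printed by wpa_cli scan_results and reports a stall after consecutive empty scans. It assumes each result line begins with a BSSID in aa:bb:cc:dd:ee:ff form followed by whitespace; the function names and the two-scan threshold are illustrative choices, not part of any tool mentioned in the thread.

```python
import re

# A result line is assumed to start with a MAC-style BSSID followed by whitespace.
_BSSID_LINE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}\s")


def count_bss(scan_results_text):
    """Count the lines of scan output that start with a BSSID."""
    return sum(1 for line in scan_results_text.splitlines()
               if _BSSID_LINE.match(line))


def looks_stalled(counts, consecutive=2):
    """True when the most recent `consecutive` scans all returned zero BSSs."""
    if len(counts) < consecutive:
        return False
    return all(c == 0 for c in counts[-consecutive:])
```

Appending count_bss(...) to a list after every reassociation and checking looks_stalled(...) gives a trigger for capturing the kernel log before the device is power-cycled, which is when the outstanding-urb state is still visible.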

* Re: memory leak in scan with 9170?
From: Luis R. Rodriguez @ 2010-09-27 23:01 UTC
To: Chuck Crisler; +Cc: linux-wireless

On Mon, Sep 27, 2010 at 3:40 PM, Chuck Crisler <ccrisler@vgocom.com> wrote:
> Well, (as usual) I was wrong. It isn't a memory problem. It seems that after
> some indeterminate time, the USB interface locks up. When we try to take it
> down (ifconfig wlan0 down) we get a message about outstanding urbs. By
> powering down the 9170 we can re-set the device and get it to re-associate
> and resume work. So, the problem is a USB problem. The question is if it is
> a module problem or a system problem. We are typically seeing this after
> 50-200 reassociations. If we don't reassociate, it doesn't seem to occur.
> Does anyone else have experience or insight into this?

Upgrade.

Luis

* Re: memory leak in scan with 9170?
From: Luis R. Rodriguez @ 2010-09-27 23:02 UTC
To: Chuck Crisler; +Cc: linux-wireless

On Mon, Sep 27, 2010 at 4:01 PM, Luis R. Rodriguez <mcgrof@gmail.com> wrote:
> On Mon, Sep 27, 2010 at 3:40 PM, Chuck Crisler <ccrisler@vgocom.com> wrote:
>> Well, (as usual) I was wrong. It isn't a memory problem. It seems that after
>> some indeterminate time, the USB interface locks up. When we try to take it
>> down (ifconfig wlan0 down) we get a message about outstanding urbs. By
>> powering down the 9170 we can re-set the device and get it to re-associate
>> and resume work. So, the problem is a USB problem. The question is if it is
>> a module problem or a system problem. We are typically seeing this after
>> 50-200 reassociations. If we don't reassociate, it doesn't seem to occur.
>> Does anyone else have experience or insight into this?
>
> Upgrade.

Let me clarify: 2.6.31 is not supported; it's not listed on kernel.org any more as a supported kernel. You are losing valuable fixes by not moving away from it. If you don't have a plan to move, you need one; if you have policies that lock you down to old kernels, try to change them.

Luis

* Re: memory leak in scan with 9170?
From: Johannes Berg @ 2010-09-28 7:24 UTC
To: Luis R. Rodriguez; +Cc: Chuck Crisler, linux-wireless

On Mon, 2010-09-27 at 16:01 -0700, Luis R. Rodriguez wrote:
> On Mon, Sep 27, 2010 at 3:40 PM, Chuck Crisler <ccrisler@vgocom.com> wrote:
> > Well, (as usual) I was wrong. It isn't a memory problem. It seems that after
> > some indeterminate time, the USB interface locks up. When we try to take it
> > down (ifconfig wlan0 down) we get a message about outstanding urbs. By
> > powering down the 9170 we can re-set the device and get it to re-associate
> > and resume work. So, the problem is a USB problem. The question is if it is
> > a module problem or a system problem. We are typically seeing this after
> > 50-200 reassociations. If we don't reassociate, it doesn't seem to occur.
> > Does anyone else have experience or insight into this?
>
> Upgrade.

Won't help. I've seen that issue as recently as 2.6.35 (I think) with ar9170, and eventually figured I wouldn't bother and started using carl9170.

johannes