b43-dev.lists.infradead.org archive mirror
* Regression affecting b43 LP-PHY card
@ 2011-05-09 22:52 Rafał Miłecki
  2011-05-09 22:58 ` Ben Greear
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Rafał Miłecki @ 2011-05-09 22:52 UTC (permalink / raw)
  To: b43-dev, linux-wireless, Juan Carlos Romero

Juan owns a Lenovo affected by the well-known LP-PHY DMA errors. His testing
procedure is as follows:
modprobe wl; connect; download something small; rmmod wl;
modprobe b43; download 2GB

While working on the DMA errors we discovered that wireless-testing is not
working well for him. Even after performing the above procedure his
machine disconnects quickly and he is not able to reconnect. We tested
2.6.39-rc6 from the tarball and it was working fine. I'd like to highlight
here that we switched between mainline and wireless-testing a few
times. It is not a random issue.

I suspected this regression could be caused by my recent ssb patches.
So I reverted all of them but this didn't help.

In this situation we decided to bisect. I was a little wary of the latest
merges, so we took the older 2.6.38 as GOOD (we tested this twice) and the
wireless-testing commit before my ssb changes as BAD. Today Juan
finished bisecting the kernel:
http://pastebin.com/HSKbRzpB
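
The bisect setup described above can be sketched as follows. This is a
minimal sketch, not the exact commands Juan ran: the function names are
hypothetical, and the BAD revision (the wireless-testing commit before the
ssb changes) is passed in as an argument because its hash is not given here.

```shell
#!/bin/sh
# Sketch of the bisect session. Function names are hypothetical; the bad
# revision is an argument because its hash is not stated in this thread.

# Start a bisect between a known-good and a known-bad revision.
start_bisect() {
    good_rev=$1
    bad_rev=$2
    git bisect start &&
    git bisect bad "$bad_rev" &&
    git bisect good "$good_rev"
}

# After building, booting and testing the kernel that git checked out,
# record the verdict: "good" (stable 2GB download) or "bad" (disconnects).
mark_result() {
    case $1 in
        good|bad) git bisect "$1" ;;
        *) echo "usage: mark_result good|bad" >&2; return 2 ;;
    esac
}

# Print the steps taken so far (useful for sharing), then leave bisect mode.
finish_bisect() {
    git bisect log &&
    git bisect reset
}

# Usage, run inside a kernel tree:
#   start_bisect v2.6.38 "$BAD_REV"
#   mark_result bad        # repeat after each test until git names a commit
#   finish_bisect
```

Each round needs a full kernel build and a reboot, which is why a bisect
like this can take days on a slow machine.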

According to his bisection the first bad commit is
e06383db9ec591696a06654257474b85bac1f8cb [0]:
hrtimers: extend hrtimer base code to handle more then 2 clockids

Does it make any sense to you? Could this be some timing issue?

It was too late to test this today; we (Juan) will work on this
tomorrow. It's impossible to revert this commit from HEAD of
wireless-testing, so my idea is to check out the commit, test, revert,
test.
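
The check out, test, revert, test plan can be sketched like this. It is
only a sketch: the function names are hypothetical, and building,
installing and booting each kernel is left out because it depends on the
machine.

```shell
#!/bin/sh
# Commit reported as first bad by the bisection.
SUSPECT=e06383db9ec591696a06654257474b85bac1f8cb

# Step 1: move the tree to the suspect commit (detached HEAD).
# Build and test this kernel; the problem is expected to show up.
checkout_suspect() {
    git checkout "$SUSPECT"
}

# Step 2: revert the suspect commit directly on top of itself, where the
# revert cannot conflict with later changes. Build and test again; if the
# commit is the cause, the connection should now stay up.
revert_suspect() {
    git revert --no-edit "$SUSPECT"
}

# Usage, run inside the kernel tree:
#   checkout_suspect   # then build, boot, run the download test
#   revert_suspect     # then build, boot, run the download test again
```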

Has anyone else experienced similar problems with the latest wireless-testing?


[0] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e06383db9ec591696a06654257474b85bac1f8cb

-- 
Rafał


* Regression affecting b43 LP-PHY card
  2011-05-09 22:52 Regression affecting b43 LP-PHY card Rafał Miłecki
@ 2011-05-09 22:58 ` Ben Greear
  2011-05-10 18:31 ` Larry Finger
  2011-05-10 18:57 ` [hrtimers] " Rafał Miłecki
  2 siblings, 0 replies; 6+ messages in thread
From: Ben Greear @ 2011-05-09 22:58 UTC (permalink / raw)
  To: Rafał Miłecki; +Cc: b43-dev, linux-wireless, Juan Carlos Romero

On 05/09/2011 03:52 PM, Rafał Miłecki wrote:

> Did anyone else experience any similar problems with latest wireless-testing?

With the ath5k patch I posted, and the ath9k patches that Felix
posted recently in response to my bug reports, I've had good
luck with ath9k and ath5k.

I also pulled in the slub cmpxchg fix that fixes fatal bugs
in kernels compiled for processors earlier than the Pentium-II.

Hopefully -rc7 will be out shortly (which will contain the slub
fix, and if we're lucky..the ath9k and ath5k fixes too).

I don't test with any other wifi nics..just ath5k and ath9k.

I've mostly been testing virtual stations...will crank up some
ath9k APs shortly...

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Regression affecting b43 LP-PHY card
  2011-05-09 22:52 Regression affecting b43 LP-PHY card Rafał Miłecki
  2011-05-09 22:58 ` Ben Greear
@ 2011-05-10 18:31 ` Larry Finger
  2011-05-10 18:57 ` [hrtimers] " Rafał Miłecki
  2 siblings, 0 replies; 6+ messages in thread
From: Larry Finger @ 2011-05-10 18:31 UTC (permalink / raw)
  To: Rafał Miłecki; +Cc: b43-dev, linux-wireless, Juan Carlos Romero

On 05/09/2011 05:52 PM, Rafał Miłecki wrote:
> Juan owns Lenovo affected by well-known LP-PHY DMA errors. His testing
> procedure is following:
> modprobe wl; connect; download sth small; rmmod wl;
> modprobe b43; download 2GB
>
> When working on DMA errors we discovered that wireless-testing is not
> working well for him. Even after performing above procedure his
> machine disconnects quickly and he is not able to reconnect. We tested
> 2.6.39-rc6 from tarball and it was working fine. I'd like to highlight
> here, that we were switching between mainline and wireless-testing few
> times. It is not a random issue.
>
> I suspected this regression could be caused by my recent ssb patches.
> So I reverted all of them but this didn't help.
>
> In this situation we decided to bisect. I was a little afraid of last
> merges so we took older 2.6.38 as GOOD (we tested this twice) and
> wireless-testing commit before my ssb changes as BAD. Today Juan
> finished bisecting kernel:
> http://pastebin.com/HSKbRzpB
>
> According to his bisection the first bad commit is
> e06383db9ec591696a06654257474b85bac1f8cb [0]:
> hrtimers: extend hrtimer base code to handle more then 2 clockids
>
> Does it make any sense to you? Could this be some timing issue?
>
> It was too late to test this today, we (Juan) will work on this
> tomorrow. It's impossible to revert this commit from HEAD of
> wireless-testing, so my idea is to checkout commit, test, revert,
> test.
>
> Did anyone else experience any similar problems with latest wireless-testing?

I did some testing over the weekend using the LP-PHY device in my HP Mini 110 
netbook. This one does not have any DMA issues, but b43 generates PHY 
transmission errors and dies when I try to copy a file over my LAN. The source 
material is contained on an NFS-mounted volume. The machine that exports the 
volume is connected by wire to the router/switch. When I get a file from the 
Internet, there are no problems. In the latter case, the transfer rate of the 
download is up to 1.2 MB/s. I don't know what the peak rate is for the NFS copy 
operation.

I was able to test kernels from the wireless-testing tree back to v2.6.36. All 
behaved the same, thus my problem is not a regression.

Larry


* [hrtimers] Re: Regression affecting b43 LP-PHY card
  2011-05-09 22:52 Regression affecting b43 LP-PHY card Rafał Miłecki
  2011-05-09 22:58 ` Ben Greear
  2011-05-10 18:31 ` Larry Finger
@ 2011-05-10 18:57 ` Rafał Miłecki
       [not found]   ` <1305054401.2939.52.camel@work-vm>
  2 siblings, 1 reply; 6+ messages in thread
From: Rafał Miłecki @ 2011-05-10 18:57 UTC (permalink / raw)
  To: b43-dev, linux-wireless, Juan Carlos Romero, Larry Finger,
	Ben Greear, John Stultz, Jamie Lokier, Thomas Gleixner,
	Alexander Shishkin, Arve Hjønnevåg, Rafael J. Wysocki,
	Linux Kernel Mailing List

On 10 May 2011 00:52, Rafał Miłecki <zajec5@gmail.com> wrote:
> Juan owns Lenovo affected by well-known LP-PHY DMA errors. His testing
> procedure is following:
> modprobe wl; connect; download sth small; rmmod wl;
> modprobe b43; download 2GB
>
> When working on DMA errors we discovered that wireless-testing is not
> working well for him. Even after performing above procedure his
> machine disconnects quickly and he is not able to reconnect. We tested
> 2.6.39-rc6 from tarball and it was working fine. I'd like to highlight
> here, that we were switching between mainline and wireless-testing few
> times. It is not a random issue.
>
> I suspected this regression could be caused by my recent ssb patches.
> So I reverted all of them but this didn't help.
>
> In this situation we decided to bisect. I was a little afraid of last
> merges so we took older 2.6.38 as GOOD (we tested this twice) and
> wireless-testing commit before my ssb changes as BAD. Today Juan
> finished bisecting kernel:
> http://pastebin.com/HSKbRzpB
>
> According to his bisection the first bad commit is
> e06383db9ec591696a06654257474b85bac1f8cb [0]:
> hrtimers: extend hrtimer base code to handle more then 2 clockids
>
> Does it make any sense to you? Could this be some timing issue?
>
> It was too late to test this today, we (Juan) will work on this
> tomorrow. It's impossible to revert this commit from HEAD of
> wireless-testing, so my idea is to checkout commit, test, revert,
> test.
>
> Did anyone else experience any similar problems with latest wireless-testing?
>
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e06383db9ec591696a06654257474b85bac1f8cb

Today Juan checked out commit e06383db9ec591696a06654257474b85bac1f8cb
and tested it. He was disconnected very quickly.

Then he reverted e06383db9ec591696a06654257474b85bac1f8cb and tested
again. The connection was stable; he downloaded a 2GB file over the network.


John S.: your commit does not touch the Broadcom card directly, but it
seems to affect it somehow. I suspect there may be some timing issue.
Do you have any idea what it could be and how we can debug this?

-- 
Rafał


* [hrtimers] Re: Regression affecting b43 LP-PHY card
       [not found]   ` <1305054401.2939.52.camel@work-vm>
@ 2011-05-10 19:45     ` Rafał Miłecki
  2011-05-10 21:48       ` Rafał Miłecki
  0 siblings, 1 reply; 6+ messages in thread
From: Rafał Miłecki @ 2011-05-10 19:45 UTC (permalink / raw)
  To: John Stultz
  Cc: b43-dev, linux-wireless, Juan Carlos Romero, Larry Finger,
	Ben Greear, Jamie Lokier, Thomas Gleixner, Alexander Shishkin,
	Arve Hjønnevåg, Rafael J. Wysocki,
	Linux Kernel Mailing List

2011/5/10 John Stultz <john.stultz@linaro.org>:
> On Tue, 2011-05-10 at 20:57 +0200, Rafał Miłecki wrote:
>> On 10 May 2011 00:52, Rafał Miłecki <zajec5@gmail.com> wrote:
>> > In this situation we decided to bisect. I was a little afraid of last
>> > merges so we took older 2.6.38 as GOOD (we tested this twice) and
>> > wireless-testing commit before my ssb changes as BAD. Today Juan
>> > finished bisecting kernel:
>> > http://pastebin.com/HSKbRzpB
>> >
>> > According to his bisection the first bad commit is
>> > e06383db9ec591696a06654257474b85bac1f8cb [0]:
>> > hrtimers: extend hrtimer base code to handle more then 2 clockids
>> >
>> > Does it make any sense to you? Could this be some timing issue?
>> >
>> > It was too late to test this today, we (Juan) will work on this
>> > tomorrow. It's impossible to revert this commit from HEAD of
>> > wireless-testing, so my idea is to checkout commit, test, revert,
>> > test.
>> >
>> > Did anyone else experience any similar problems with latest wireless-testing?
>> >
>> >
>> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e06383db9ec591696a06654257474b85bac1f8cb
>>
>> Today Juan checkouted commit e06383db9ec591696a06654257474b85bac1f8cb
>> and tested it. He was disconnected really soon.
>>
>> Then he reverted e06383db9ec591696a06654257474b85bac1f8cb and tested
>> again. Connection was stable, he downloaded 2GB file over network.
>>
>>
>> John S.: your commit does not touch Broadcom card directly, but it
>> seems it somehow affects it. I suspect there can be some timing issue.
>> Do you have any idea what could it be, how can we debug this?
>
> Sorry for the trouble!
>
> My commit exposed a few spots where hrtimers were being initialized
> before hrtimer_init is called, which caused problems. Thomas provided a
> solution that makes such behavior still function ok:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=ce31332d3c77532d6ea97ddcb475a2b02dd358b4
>
> Let me know if the issue is still reproducible with Linus' latest git
> tree.

We were using the wireless-testing git tree, commit:
1e664a777e5eb4b23e65e76fbeadd2376fe8d8d8

I cannot see ce31332d3c77532d6ea97ddcb475a2b02dd358b4 in git log.
I'll apply and test, thanks!
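
Checking whether a tree already contains the fix, and applying it if not,
can be sketched as below. This is an illustration, not what was actually
run here: the function names are hypothetical, `git merge-base
--is-ancestor` needs git 1.8.0 or newer, and the cherry-pick assumes the
fix commit has already been fetched from mainline.

```shell
#!/bin/sh
# Fix commit named by John Stultz.
FIX=ce31332d3c77532d6ea97ddcb475a2b02dd358b4

# Succeeds if commit $1 is an ancestor of HEAD, i.e. already in this tree.
has_commit() {
    git merge-base --is-ancestor "$1" HEAD
}

# Cherry-pick the fix only when the current tree lacks it. The commit
# object must already be in the repository (e.g. after fetching mainline).
apply_fix_if_missing() {
    if has_commit "$FIX"; then
        echo "fix already present"
    else
        git cherry-pick "$FIX"
    fi
}

# Usage, run inside the wireless-testing tree:
#   apply_fix_if_missing
```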

-- 
Rafał


* [hrtimers] Re: Regression affecting b43 LP-PHY card
  2011-05-10 19:45     ` Rafał Miłecki
@ 2011-05-10 21:48       ` Rafał Miłecki
  0 siblings, 0 replies; 6+ messages in thread
From: Rafał Miłecki @ 2011-05-10 21:48 UTC (permalink / raw)
  To: John Stultz
  Cc: b43-dev, linux-wireless, Juan Carlos Romero, Larry Finger,
	Ben Greear, Jamie Lokier, Thomas Gleixner, Alexander Shishkin,
	Arve Hjønnevåg, Rafael J. Wysocki,
	Linux Kernel Mailing List

2011/5/10 Rafał Miłecki <zajec5@gmail.com>:
> 2011/5/10 John Stultz <john.stultz@linaro.org>:
>> On Tue, 2011-05-10 at 20:57 +0200, Rafał Miłecki wrote:
>>> On 10 May 2011 00:52, Rafał Miłecki <zajec5@gmail.com> wrote:
>>> > In this situation we decided to bisect. I was a little afraid of last
>>> > merges so we took older 2.6.38 as GOOD (we tested this twice) and
>>> > wireless-testing commit before my ssb changes as BAD. Today Juan
>>> > finished bisecting kernel:
>>> > http://pastebin.com/HSKbRzpB
>>> >
>>> > According to his bisection the first bad commit is
>>> > e06383db9ec591696a06654257474b85bac1f8cb [0]:
>>> > hrtimers: extend hrtimer base code to handle more then 2 clockids
>>> >
>>> > Does it make any sense to you? Could this be some timing issue?
>>> >
>>> > It was too late to test this today, we (Juan) will work on this
>>> > tomorrow. It's impossible to revert this commit from HEAD of
>>> > wireless-testing, so my idea is to checkout commit, test, revert,
>>> > test.
>>> >
>>> > Did anyone else experience any similar problems with latest wireless-testing?
>>> >
>>> >
>>> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e06383db9ec591696a06654257474b85bac1f8cb
>>>
>>> Today Juan checkouted commit e06383db9ec591696a06654257474b85bac1f8cb
>>> and tested it. He was disconnected really soon.
>>>
>>> Then he reverted e06383db9ec591696a06654257474b85bac1f8cb and tested
>>> again. Connection was stable, he downloaded 2GB file over network.
>>>
>>>
>>> John S.: your commit does not touch Broadcom card directly, but it
>>> seems it somehow affects it. I suspect there can be some timing issue.
>>> Do you have any idea what could it be, how can we debug this?
>>
>> Sorry for the trouble!
>>
>> My commit exposed a few spots where hrtimers were being initialized
>> before hrtimer_init is called, which caused problems. Thomas provided a
>> solution that makes such behavior still function ok:
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=ce31332d3c77532d6ea97ddcb475a2b02dd358b4
>>
>> Let me know if the issue is still reproducible with Linus' latest git
>> tree.
>
> We were using wireless-testing git tree, commit:
> 1e664a777e5eb4b23e65e76fbeadd2376fe8d8d8
>
> I can not see ce31332d3c77532d6ea97ddcb475a2b02dd358b4 it git log.
> I'll apply and test, thanks!

I can confirm that the updated wireless-testing resolves this issue!

Too bad Juan spent 2-3 days bisecting on his Atom... but at least
we ended up at a fix that was already available. It could have been painful
to find a solution based on our feedback alone; I'm aware it was not very
detailed.

Thanks for your help :)

-- 
Rafał

