* Re: ath9k: panic on tip/master
@ 2008-10-03 18:09 ` John W. Linville
0 siblings, 0 replies; 13+ messages in thread
From: John W. Linville @ 2008-10-03 18:09 UTC (permalink / raw)
To: Ingo Molnar
Cc: Steven Noonan, linux-kernel, ath9k-devel, lrodriguez,
linux-wireless
On Fri, Oct 03, 2008 at 11:35:23AM -0400, John W. Linville wrote:
> On Fri, Oct 03, 2008 at 12:02:11PM +0200, Ingo Molnar wrote:
> >
> > * Steven Noonan <steven@uplinklabs.net> wrote:
> >
> > > Hey folks,
> > >
> > > Just got a panic on tip. According to the stack trace, ath9k is what
> > > decided to bomb.
> > >
> > > http://www.uplinklabs.net/~tycho/linux/ath9k_panic_tip_10.3.2008.jpg
> > >
> > > Note: Although it says 'sudo modprobe radeon' on the bash prompt above
> > > the panic, I never got to hit 'enter' on that command before the panic
> > > occurred.
> >
> > it appears to me that ath9k's eth_rx_input() takes a spinlock that is
> > not initialized (or already destroyed by the allocator).
>
> Seems reasonable...
>
> > this would be consistent with an IRQ storm hitting some race in the
> > ath9k driver init sequence. For example if request_irq() is done before
> > all structures that the IRQ handler relies on are properly initialized.
> >
> > i.e. this has the signature of a genuine ath9k bug.
>
> Agreed, although I don't see anything specifically relating to
> request_irq or the like.
>
> I think the spin_lock call may actually be in ath_ampdu_input (called
> from ath_rx_input), which perhaps is getting called simultaneous
> with ath_rx_node_init still running? With no locks in between them,
> it seems like this could be the culprit?
>
> Sorry to not be more immediately helpful, but I'm going to have to
> run in a few minutes. Perhaps this insight is helpful for someone
> more familiar with the internals of this driver?
This is probably a dead end... I don't think the ath_node_find
in ath__rx_indicate will be able to find the ath_node used
in ath_ampdu_input unless ath_rx_node_init had already completed.
Back to square one...
John
--
John W. Linville Linux should be at the core
linville@tuxdriver.com of your literate lifestyle.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [ath9k-devel] ath9k: panic on tip/master
2008-10-03 18:09 ` John W. Linville
(?)
@ 2008-10-03 11:49 ` Luis R. Rodriguez
-1 siblings, 0 replies; 13+ messages in thread
From: Luis R. Rodriguez @ 2008-10-03 11:49 UTC (permalink / raw)
To: ath9k-devel
On Fri, Oct 03, 2008 at 11:09:31AM -0700, John W. Linville wrote:
> On Fri, Oct 03, 2008 at 11:35:23AM -0400, John W. Linville wrote:
> > On Fri, Oct 03, 2008 at 12:02:11PM +0200, Ingo Molnar wrote:
> > >
> > > * Steven Noonan <steven@uplinklabs.net> wrote:
> > >
> > > > Hey folks,
> > > >
> > > > Just got a panic on tip. According to the stack trace, ath9k is what
> > > > decided to bomb.
> > > >
> > > > http://www.uplinklabs.net/~tycho/linux/ath9k_panic_tip_10.3.2008.jpg
> > > >
> > > > Note: Although it says 'sudo modprobe radeon' on the bash prompt above
> > > > the panic, I never got to hit 'enter' on that command before the panic
> > > > occurred.
> > >
> > > it appears to me that ath9k's eth_rx_input() takes a spinlock that is
> > > not initialized (or already destroyed by the allocator).
> >
> > Seems reasonable...
> >
> > > this would be consistent with an IRQ storm hitting some race in the
> > > ath9k driver init sequence. For example if request_irq() is done before
> > > all structures that the IRQ handler relies on are properly initialized.
> > >
> > > i.e. this has the signature of a genuine ath9k bug.
> >
> > Agreed, although I don't see anything specifically relating to
> > request_irq or the like.
> >
> > I think the spin_lock call may actually be in ath_ampdu_input (called
> > from ath_rx_input), which perhaps is getting called simultaneous
> > with ath_rx_node_init still running? With no locks in between them,
> > it seems like this could be the culprit?
> >
> > Sorry to not be more immediately helpful, but I'm going to have to
> > run in a few minutes. Perhaps this insight is helpful for someone
> > more familiar with the internals of this driver?
>
> This is probably a dead-end...I don't think the ath_node_find
> in ath__rx_indicate will be able to find the ath_node used
> in ath_ampdu_input unless ath_rx_node_init had already complete.
> Back to square one...
Well Steven, please give this a shot, we think this is the culprit.
[PATCH] ath9k: fix oops on trying to hold the wrong spinlock
We were trying to hold the wrong spinlock due to a typo
in the definition of IEEE80211_BAR_CTL_TID_S. We use this to
compute the tid number and then hold that tid number's
spinlock during ath_bar_rx().
Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com>
Signed-off-by: Sujith <Sujith.Manoharan@atheros.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
---
drivers/net/wireless/ath9k/core.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/net/wireless/ath9k/core.h b/drivers/net/wireless/ath9k/core.h
index 2f84093..88f4cc3 100644
--- a/drivers/net/wireless/ath9k/core.h
+++ b/drivers/net/wireless/ath9k/core.h
@@ -316,7 +316,7 @@ void ath_descdma_cleanup(struct ath_softc *sc,
#define ATH_RX_TIMEOUT 40 /* 40 milliseconds */
#define WME_NUM_TID 16
#define IEEE80211_BAR_CTL_TID_M 0xF000 /* tid mask */
-#define IEEE80211_BAR_CTL_TID_S 2 /* tid shift */
+#define IEEE80211_BAR_CTL_TID_S 12 /* tid shift */
enum ATH_RX_TYPE {
ATH_RX_NON_CONSUMED = 0,
--
1.5.6.3
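For readers following the thread, here is a minimal sketch of why the shift value matters. It assumes the tid is extracted as (control field & mask) >> shift, which is the usual mask/shift idiom but is not shown in the patch itself. The helper bar_tid() and the BAR_CTL_TID_S_OLD/BAR_CTL_TID_S_NEW names are hypothetical and exist only for this illustration; the mask and WME_NUM_TID values are taken from the diff above.

```c
#include <stdint.h>

#define WME_NUM_TID             16
#define IEEE80211_BAR_CTL_TID_M 0xF000 /* tid mask, as in core.h */

/* Hypothetical names for the two shift values seen in the diff. */
#define BAR_CTL_TID_S_OLD 2   /* value before the patch */
#define BAR_CTL_TID_S_NEW 12  /* value after the patch */

/*
 * Assumed extraction idiom: mask out the tid bits, then shift them
 * down to the low end of the word. Not copied from the driver.
 */
static unsigned int bar_tid(uint16_t bar_ctl, unsigned int shift)
{
	return (unsigned int)(bar_ctl & IEEE80211_BAR_CTL_TID_M) >> shift;
}

/* Returns 1 if the value can safely index an array of WME_NUM_TID entries. */
static int tid_in_range(unsigned int tid)
{
	return tid < WME_NUM_TID;
}
```

With a control field of 0x5000 (tid 5 in the top four bits), a shift of 2 yields 0x1400, far beyond WME_NUM_TID, so any per-tid lookup lands outside the array and the "spinlock" found there is arbitrary memory. A shift of 12 yields 5, as intended. That would be consistent with the uninitialized-spinlock symptom reported earlier in the thread.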
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: ath9k: panic on tip/master
@ 2008-10-03 11:49 ` Luis R. Rodriguez
0 siblings, 0 replies; 13+ messages in thread
From: Luis R. Rodriguez @ 2008-10-03 11:49 UTC (permalink / raw)
To: John W. Linville
Cc: Ingo Molnar, Steven Noonan, linux-kernel@vger.kernel.org,
ath9k-devel@lists.ath9k.org, Luis Rodriguez,
linux-wireless@vger.kernel.org
On Fri, Oct 03, 2008 at 11:09:31AM -0700, John W. Linville wrote:
> On Fri, Oct 03, 2008 at 11:35:23AM -0400, John W. Linville wrote:
> > On Fri, Oct 03, 2008 at 12:02:11PM +0200, Ingo Molnar wrote:
> > >
> > > * Steven Noonan <steven@uplinklabs.net> wrote:
> > >
> > > > Hey folks,
> > > >
> > > > Just got a panic on tip. According to the stack trace, ath9k is what
> > > > decided to bomb.
> > > >
> > > > http://www.uplinklabs.net/~tycho/linux/ath9k_panic_tip_10.3.2008.jpg
> > > >
> > > > Note: Although it says 'sudo modprobe radeon' on the bash prompt above
> > > > the panic, I never got to hit 'enter' on that command before the panic
> > > > occurred.
> > >
> > > it appears to me that ath9k's eth_rx_input() takes a spinlock that is
> > > not initialized (or already destroyed by the allocator).
> >
> > Seems reasonable...
> >
> > > this would be consistent with an IRQ storm hitting some race in the
> > > ath9k driver init sequence. For example if request_irq() is done before
> > > all structures that the IRQ handler relies on are properly initialized.
> > >
> > > i.e. this has the signature of a genuine ath9k bug.
> >
> > Agreed, although I don't see anything specifically relating to
> > request_irq or the like.
> >
> > I think the spin_lock call may actually be in ath_ampdu_input (called
> > from ath_rx_input), which perhaps is getting called simultaneous
> > with ath_rx_node_init still running? With no locks in between them,
> > it seems like this could be the culprit?
> >
> > Sorry to not be more immediately helpful, but I'm going to have to
> > run in a few minutes. Perhaps this insight is helpful for someone
> > more familiar with the internals of this driver?
>
> This is probably a dead-end...I don't think the ath_node_find
> in ath__rx_indicate will be able to find the ath_node used
> in ath_ampdu_input unless ath_rx_node_init had already complete.
> Back to square one...
Well Steven, please give this a shot, we think this is the culprit.
[PATCH] ath9k: fix oops on trying to hold the wrong spinlock
We were trying to hold the wrong spinlock due to a typo
in the definition of IEEE80211_BAR_CTL_TID_S. We use this to
compute the tid number and then hold that tid number's
spinlock during ath_bar_rx().
Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com>
Signed-off-by: Sujith <Sujith.Manoharan@atheros.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
---
drivers/net/wireless/ath9k/core.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/net/wireless/ath9k/core.h b/drivers/net/wireless/ath9k/core.h
index 2f84093..88f4cc3 100644
--- a/drivers/net/wireless/ath9k/core.h
+++ b/drivers/net/wireless/ath9k/core.h
@@ -316,7 +316,7 @@ void ath_descdma_cleanup(struct ath_softc *sc,
#define ATH_RX_TIMEOUT 40 /* 40 milliseconds */
#define WME_NUM_TID 16
#define IEEE80211_BAR_CTL_TID_M 0xF000 /* tid mask */
-#define IEEE80211_BAR_CTL_TID_S 2 /* tid shift */
+#define IEEE80211_BAR_CTL_TID_S 12 /* tid shift */
enum ATH_RX_TYPE {
ATH_RX_NON_CONSUMED = 0,
--
1.5.6.3
^ permalink raw reply related [flat|nested] 13+ messages in thread