Linux wireless drivers development
* [RFC] p54pci: skb_over_panic, soft lockup, stall under flood
From: Quintin Pitts @ 2009-10-11 14:28 UTC
  To: John Linville; +Cc: linux-wireless, Christian Lamparter

Hi,

Sorry for my lack of experience in all aspects - first time
submitting!!!

While trying to get the p54pci driver stable on my platform and
hardware, I ended up with the generic patch below, which seems to
accomplish that.  Since the ViewSonic V210 uses the IT8152 PCI bridge,
some attention was needed to get DMA-related allocations into the first
64M of physical memory.  I have verified that the DMA allocations are in
the first 64M and that dmabounce is not being used - just for those
wondering whether that was part of the problem.

Platform: ViewSonic V210 arm pxa255
Kernel 2.6.30.5 eabi
Wireless drivers from compat-wireless-2009-09-30, which the patch below was applied to.
Firmware used: FW rev 2.13.12.0 - Softmac protocol 5.9

Wireless card: GemTek WL-850FJB minipci card.

phy0: p54 detected a LM86 firmware
p54: rx_mtu reduced from 3240 to 2376
phy0: FW rev 2.13.12.0 - Softmac protocol 5.9
phy0: cryptographic accelerator WEP:YES, TKIP:YES, CCMP:YES
phy0: hwaddr 00:90:4b:c1:06:bc, MAC:isl3890 RF:Frisbee
phy0: Selected rate control algorithm 'minstrel'

device pci info (lspci -v):

00:06.0 Network controller: Intersil Corporation ISL3886 [Prism Javelin/Prism Xbow] (rev 01)
Subsystem: Intersil Corporation Device 0000
Flags: bus master, medium devsel, latency 56, IRQ 217
Memory at 11000000 (32-bit, non-prefetchable) [size=8K]
Capabilities: [dc] Power Management version 1
Kernel driver in use: p54pci
Kernel modules: prism54, p54pci

The patch tries to solve the problems below.

1.  p54p_check_rx_ring - skb_over_panic: Under a ping flood, or after
just running for a while, the kernel would panic with skb_over_panic.
Investigation showed that, for some odd reason, the device/firmware had
not written a length into the rx data ring descriptor (desc->len) but
had instead written the whole DMA address (desc->host_addr) over the
len/flags fields (desc->len and desc->flags), i.e. the same DMA address
that was already in that descriptor.  I added the following condition to
p54p_check_rx_ring to trap this case; it only trims the skb and resets
len and flags.  By the way, I used HaRET to see whether I could show it
happening under Windows CE: I located the DMA memory used for the rings
and scanned the ring about 1000 times per second (I think) during a
flood or iperf run.  It happens under Windows CE as well, with len/flags
set to the descriptor's own host address.  Under load this may happen
only a few times every minute; under normal operation maybe a few times
a day.

   if(unlikely(len == (desc->host_addr & 0xffff)
   && (desc->flags == ((desc->host_addr & 0xffff0000) >> 16))) )

2.  p54p_refill_rx_ring - eventual stall: Under very heavy load (flood)
the refill can overrun the last processed rx data ring index and corrupt
the following ring entries.  This causes havoc such as a difference of
some 13 indexes between priv->rx_idx_data and the ring_control host_idx
on an 8-entry ring.  This appears to eventually fill up the TX queue,
with p54_assign_address (txrx.c) returning -ENOSPC, because the ring
corruption makes us miss some TX releases.  I changed
p54p_refill_rx_ring to take an index parameter and use it as the last
processed ring index instead of the ring_control device_idx.

3.  p54p_check_rx_ring - eventual stall: On a ping flood,
P54_CONTROL_TYPE_TXDONE control packets whose skb is reused seem to
cause a problem the next time the same index comes around: even though
the length was not the same, the buffer was still seen as a
P54_CONTROL_TYPE_TXDONE packet again.  Side effects varied; the main one
was the same as in #2 above: TX not being released and
p54_assign_address (txrx.c) returning -ENOSPC - stall.  The problem went
away when the skb was not reused but unmapped and freed with
dev_kfree_skb if p54_rx returned zero.  It is still unclear why this is,
but I had no problems with the patch afterwards.

4.  p54p_check_rx_ring - soft lockup in p54p_refill_rx_ring: This only
occurred during a 5 minute iperf run on a fast wireless network, or
after the unit had been up for 1 to 2 days.  I discovered that the
device had lost its mind and set ring_control->device_idx[ring_index]
exactly 0xFF (255) less than it should be (RAM issue?? I don't know).
It happens the same way on three of my devices.  If left to continue,
the limit in the p54p_refill_rx_ring while loop goes negative and the
result is a soft lockup.  The patch traps this and returns if
device_idx - (*index) is greater than ring_limit.  The error only trips
once, meaning that the next time p54p_check_rx_ring is called the device
index is back to what it should have been.

5.  p54p_open - about 1 out of 10 boots produces "device does not
respond!" or "Cannot boot firmware!".  Minor, but frustrating all the
same.  rmmod p54pci followed by modprobe p54pci always works.  It seems
that if p54p_open returns an error, trying again works, and if
p54_read_eeprom fails, trying again works too.
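
As an illustration of the trap described in item 1, here is a minimal userspace sketch of the comparison. It assumes a simplified descriptor with host-endian fields; the struct and helper names are hypothetical and are not taken from the driver. In the driver itself the descriptor fields are little-endian (the patch uses le32_to_cpu(desc->host_addr) and cpu_to_le16() on desc->len), so a real check would probably convert before comparing. The second helper is a more general length guard against the buffer size; it is not part of the patch and is shown only as a possible alternative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified rx descriptor; fields are host-endian here. */
struct rx_desc_sketch {
	uint32_t host_addr;	/* DMA address of the rx buffer */
	uint16_t len;		/* length reported by the device */
	uint16_t flags;		/* flags reported by the device */
};

/*
 * True if len/flags together hold a copy of host_addr, which is the
 * corruption pattern described in item 1 (low 16 bits in len, high
 * 16 bits in flags).
 */
static bool desc_len_is_address(const struct rx_desc_sketch *d)
{
	return d->len == (uint16_t)(d->host_addr & 0xffff) &&
	       d->flags == (uint16_t)(d->host_addr >> 16);
}

/*
 * Assumed alternative guard (not in the patch): reject any length that
 * cannot fit in the rx buffer, whatever the reason for the bad value.
 */
static bool desc_len_fits(const struct rx_desc_sketch *d, uint32_t buf_size)
{
	return d->len <= buf_size;
}
```

With the buffer size used in the patch (rx_mtu + 32, i.e. 2376 + 32 on this hardware), a descriptor whose len field holds the low half of a DMA address would usually fail the second check as well, but only the first check identifies this specific pattern.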

The below was applied to compat-wireless-2009-09-30:

Thanks,

Quintin.

Signed-off-by: Quintin Pitts <geek4linux@gmail.com>

--- 

--- a/drivers/net/wireless/p54/p54pci.c	2009-09-29 23:13:58.000000000 -0500
+++ b/drivers/net/wireless/p54/p54pci.c	2009-10-09 08:15:58.000000000 -0500
@@ -131,7 +131,7 @@ static int p54p_upload_firmware(struct i
 
 static void p54p_refill_rx_ring(struct ieee80211_hw *dev,
 	int ring_index, struct p54p_desc *ring, u32 ring_limit,
-	struct sk_buff **rx_buf)
+	struct sk_buff **rx_buf, u32 index)
 {
 	struct p54p_priv *priv = dev->priv;
 	struct p54p_ring_control *ring_control = priv->ring_control;
@@ -139,7 +139,11 @@ static void p54p_refill_rx_ring(struct i
 
 	idx = le32_to_cpu(ring_control->host_idx[ring_index]);
 	limit = idx;
-	limit -= le32_to_cpu(ring_control->device_idx[ring_index]);
+/*
+ *           Use last processed index instead of device_idx
+ *           so we don't corrupt our ring 
+ */
+	limit -= index;
 	limit = ring_limit - limit;
 
 	i = idx % ring_limit;
@@ -181,9 +185,26 @@ static void p54p_check_rx_ring(struct ie
 	struct p54p_ring_control *ring_control = priv->ring_control;
 	struct p54p_desc *desc;
 	u32 idx, i;
+	int ret;
 
+	idx = le32_to_cpu(ring_control->device_idx[ring_index]);
 	i = (*index) % ring_limit;
-	(*index) = idx = le32_to_cpu(ring_control->device_idx[ring_index]);
+	if(unlikely((idx - (*index)) > ring_limit || 
+ (le32_to_cpu(ring_control->host_idx[ring_index]) - (*index)) > ring_limit)) { 
+  	printk(KERN_DEBUG "%s: devidx jumped *index=%d devidx=%d hostidx=%d ring_limit=%d\n",
+	__func__,(*index),idx,ring_control->host_idx[ring_index],ring_limit);
+/* 
+ * Do nothing things are really wrong - device index has jumped got corrupted
+ *  - wait for it to stabilize 
+ * So far device idx exactly 0xFF (255) bytes less than what it should be. 
+ * only seen to happen on very fast wireless and packet floods and/or iperf test
+ * In testing this error only encountered once - so next time around the 
+ * device index is correct.
+ * if to continue would soft lockup/hang in while loop in p54p_refill_rx_ring
+ */
+		return;
+		}
+	(*index) = idx;
 	idx %= ring_limit;
 	while (i != idx) {
 		u16 len;
@@ -197,25 +218,40 @@ static void p54p_check_rx_ring(struct ie
 			i %= ring_limit;
 			continue;
 		}
+		if(unlikely(len == (desc->host_addr & 0xffff) 
+	&& (desc->flags == ((desc->host_addr & 0xffff0000) >> 16))) ) {
+/* device has put device dma in desc len/flag location - will crash in skb_put
+ * desc->len and desc->flags contain the host_addr -
+ * trap before skb_put and discard
+ * ViewSonic V210 and wireless card GENTEK WL-850 , IT8152 PCI bridge 
+ * happens occasionally - no clear reason or frequency.
+ *  
+ */ 
+		printk(KERN_DEBUG "%s: rx_ring len/flags has address - skipping!\n",__func__); 
+                  skb_trim(skb,0);
+		  desc->len = cpu_to_le16(priv->common.rx_mtu + 32);
+		  desc->flags=0;
+                 
+		} else {
+
 		skb_put(skb, len);
 
-		if (p54_rx(dev, skb)) {
-			pci_unmap_single(priv->pdev,
+		ret=p54_rx(dev,skb);
+		pci_unmap_single(priv->pdev,
 					 le32_to_cpu(desc->host_addr),
 					 priv->common.rx_mtu + 32,
 					 PCI_DMA_FROMDEVICE);
-			rx_buf[i] = NULL;
-			desc->host_addr = 0;
-		} else {
-			skb_trim(skb, 0);
-			desc->len = cpu_to_le16(priv->common.rx_mtu + 32);
-		}
+		if(ret==0)
+			dev_kfree_skb(skb);
+		rx_buf[i] = NULL;
+		desc->host_addr = 0;
+		} /* end of desc->len skb corrupt crash test */
 
 		i++;
 		i %= ring_limit;
 	}
 
-	p54p_refill_rx_ring(dev, ring_index, ring, ring_limit, rx_buf);
+	p54p_refill_rx_ring(dev, ring_index, ring, ring_limit, rx_buf, (*index));
 }
 
 /* caller must hold priv->lock */
@@ -428,10 +464,10 @@ static int p54p_open(struct ieee80211_hw
 	priv->rx_idx_mgmt = priv->tx_idx_mgmt = 0;
 
 	p54p_refill_rx_ring(dev, 0, priv->ring_control->rx_data,
-		ARRAY_SIZE(priv->ring_control->rx_data), priv->rx_buf_data);
+		ARRAY_SIZE(priv->ring_control->rx_data), priv->rx_buf_data, 0);
 
 	p54p_refill_rx_ring(dev, 2, priv->ring_control->rx_mgmt,
-		ARRAY_SIZE(priv->ring_control->rx_mgmt), priv->rx_buf_mgmt);
+		ARRAY_SIZE(priv->ring_control->rx_mgmt), priv->rx_buf_mgmt, 0);
 
 	P54P_WRITE(ring_control_base, cpu_to_le32(priv->ring_control_dma));
 	P54P_READ(ring_control_base);
@@ -550,9 +586,26 @@ static int __devinit p54p_probe(struct p
 	}
 
 	err = p54p_open(dev);
-	if (err)
-		goto err_free_common;
+	if (err) {
+                
+		printk(KERN_DEBUG "%s: p54p_open failed - trying again\n",__func__);
+                msleep(10);
+		err = p54p_open(dev);
+		if (err)
+			goto err_free_common;
+        }
 	err = p54_read_eeprom(dev);
+	if (err)
+	{
+                printk(KERN_DEBUG "%s: p54_read_eeprom failed - trying again\n",__func__);
+		p54p_stop(dev);
+		err = p54p_open(dev);
+                if (err)
+			goto err_free_common;
+		msleep(10);
+		err = p54_read_eeprom(dev);
+             
+	}
 	p54p_stop(dev);
 	if (err)
 		goto err_free_common;



* Re: [mmotm 2009-10-09-01-07] b43/wireless possible circular locking
From: Larry Finger @ 2009-10-11 14:23 UTC
  To: Johannes Berg; +Cc: Dave Young, akpm, bcm43xx-dev, linux-wireless, linux-kernel
In-Reply-To: <1255254687.4095.46.camel@johannes.local>

On 10/11/2009 04:51 AM, Johannes Berg wrote:
> On Sun, 2009-10-11 at 17:41 +0800, Dave Young wrote:
>> Hi,
>>
>> I got lockdep warnings about possible circular lock with
>> b43 interface startup. It looks like a real problem.
>>
>> [   71.974542] wlan0: deauthenticating from 00:19:e0:db:24:de by local choice (reason=3)
>> [   72.004352] b43-phy0 debug: Removing Interface type 2
>> [   72.005431] 
>> [   72.005435] =======================================================
>> [   72.006168] [ INFO: possible circular locking dependency detected ]
>> [   72.006759] 2.6.32-rc3-mm1 #4
>> [   72.007047] -------------------------------------------------------
>> [   72.007617] ifconfig/2175 is trying to acquire lock:
>> [   72.007617]  (&(&rfkill->poll_work)->work){+.+...}, at: [<c0239375>] __cancel_work_timer+0x8c/0x18e
>> [   72.007617] 
>> [   72.007617] but task is already holding lock:
>> [   72.007617]  (&wl->mutex){+.+.+.}, at: [<f8fa5359>] b43_op_stop+0x28/0x6a [b43]
> 
> I believe this is already taken care of by Larry.

Yes, I introduced this bug and fixed it. The latest wireless-testing
should have the fix. In addition, it is on its way to Linus through
DaveM. Unfortunately, that fix has another bug that will not show up
in the logs: with it, it is impossible to turn the radio back on after
it is killed via the rfkill switch. A "final" (and I hope that is
true!) fix is out for testing; however, the OP for the bug that
started this whole chain has only limited access to the machine in
question.

Larry


* Re: [RFC] p54pci: skb_over_panic, soft lockup, stall under flood
From: Larry Finger @ 2009-10-11 15:31 UTC
  To: Quintin Pitts; +Cc: John Linville, linux-wireless, Christian Lamparter
In-Reply-To: <4AD1EBA7.904@gmail.com>

On 10/11/2009 09:28 AM, Quintin Pitts wrote:
> Hi,
> 
> Sorry for my lack of experience in all aspects - first time
> submitting!!!

Everyone who goes through this "rite of passage" gets somewhat
discouraged by the response. My advice is to hang in there.

My first piece of advice is to run every patch you submit through
scripts/checkpatch.pl. This one shows 95 errors and 7
warnings in 136 lines. Most of the errors are due to "DOS line
endings". We really hate carriage returns - a really useless occupier
of space unless it is _NOT_ followed by \n!

As I understand it, this patch is to fix the driver to work around
firmware errors. If that is correct, please state that clearly. If
only partially correct, then indicate which parts are to fix firmware
errors, and which are to fix driver errors. Has your analysis included
thinking about where the driver might delay to avoid firmware problems?

> While trying to get the p54pci driver stable on my platform and
> hardware, I ended up with the generic patch below, which seems to
> accomplish that.  Since the ViewSonic V210 uses the IT8152 PCI bridge,
> some attention was needed to get DMA-related allocations into the first
> 64M of physical memory.  I have verified that the DMA allocations are in
> the first 64M and that dmabounce is not being used - just for those
> wondering whether that was part of the problem.
> 
> Platform: ViewSonic V210 arm pxa255
> Kernel 2.6.30.5 eabi
> Wireless drivers from compat-wireless-2009-09-30, which the patch below was applied to.
> Firmware used: FW rev 2.13.12.0 - Softmac protocol 5.9
> 
> Wireless card: GemTek WL-850FJB minipci card.
> 
> phy0: p54 detected a LM86 firmware
> p54: rx_mtu reduced from 3240 to 2376
> phy0: FW rev 2.13.12.0 - Softmac protocol 5.9
> phy0: cryptographic accelerator WEP:YES, TKIP:YES, CCMP:YES
> phy0: hwaddr 00:90:4b:c1:06:bc, MAC:isl3890 RF:Frisbee
> phy0: Selected rate control algorithm 'minstrel'
> 
> device pci info (lspci -v):
> 
> 00:06.0 Network controller: Intersil Corporation ISL3886 [Prism Javelin/Prism Xbow] (rev 01)
> Subsystem: Intersil Corporation Device 0000
> Flags: bus master, medium devsel, latency 56, IRQ 217
> Memory at 11000000 (32-bit, non-prefetchable) [size=8K]
> Capabilities: [dc] Power Management version 1
> Kernel driver in use: p54pci
> Kernel modules: prism54, p54pci

Much of the above is useless detail. Stating the device and the
platform should be sufficient.

> The patch tries to solve the problems below.
> 
> 1.  p54p_check_rx_ring - skb_over_panic: Under a ping flood, or after
> just running for a while, the kernel would panic with skb_over_panic.
> Investigation showed that, for some odd reason, the device/firmware had
> not written a length into the rx data ring descriptor (desc->len) but
> had instead written the whole DMA address (desc->host_addr) over the
> len/flags fields (desc->len and desc->flags), i.e. the same DMA address
> that was already in that descriptor.  I added the following condition to
> p54p_check_rx_ring to trap this case; it only trims the skb and resets
> len and flags.  By the way, I used HaRET to see whether I could show it
> happening under Windows CE: I located the DMA memory used for the rings
> and scanned the ring about 1000 times per second (I think) during a
> flood or iperf run.  It happens under Windows CE as well, with len/flags
> set to the descriptor's own host address.  Under load this may happen
> only a few times every minute; under normal operation maybe a few times
> a day.
> 
>    if(unlikely(len == (desc->host_addr & 0xffff)
>    && (desc->flags == ((desc->host_addr & 0xffff0000) >> 16))) )
> 
> 2.  p54p_refill_rx_ring - eventual stall: Under very heavy load (flood)
> the refill can overrun the last processed rx data ring index and corrupt
> the following ring entries.  This causes havoc such as a difference of
> some 13 indexes between priv->rx_idx_data and the ring_control host_idx
> on an 8-entry ring.  This appears to eventually fill up the TX queue,
> with p54_assign_address (txrx.c) returning -ENOSPC, because the ring
> corruption makes us miss some TX releases.  I changed
> p54p_refill_rx_ring to take an index parameter and use it as the last
> processed ring index instead of the ring_control device_idx.
> 
> 3.  p54p_check_rx_ring - eventual stall: On a ping flood,
> P54_CONTROL_TYPE_TXDONE control packets whose skb is reused seem to
> cause a problem the next time the same index comes around: even though
> the length was not the same, the buffer was still seen as a
> P54_CONTROL_TYPE_TXDONE packet again.  Side effects varied; the main one
> was the same as in #2 above: TX not being released and
> p54_assign_address (txrx.c) returning -ENOSPC - stall.  The problem went
> away when the skb was not reused but unmapped and freed with
> dev_kfree_skb if p54_rx returned zero.  It is still unclear why this is,
> but I had no problems with the patch afterwards.
> 
> 4.  p54p_check_rx_ring - soft lockup in p54p_refill_rx_ring: This only
> occurred during a 5 minute iperf run on a fast wireless network, or
> after the unit had been up for 1 to 2 days.  I discovered that the
> device had lost its mind and set ring_control->device_idx[ring_index]
> exactly 0xFF (255) less than it should be (RAM issue?? I don't know).
> It happens the same way on three of my devices.  If left to continue,
> the limit in the p54p_refill_rx_ring while loop goes negative and the
> result is a soft lockup.  The patch traps this and returns if
> device_idx - (*index) is greater than ring_limit.  The error only trips
> once, meaning that the next time p54p_check_rx_ring is called the device
> index is back to what it should have been.
> 
> 5.  p54p_open - about 1 out of 10 boots produces "device does not
> respond!" or "Cannot boot firmware!".  Minor, but frustrating all the
> same.  rmmod p54pci followed by modprobe p54pci always works.  It seems
> that if p54p_open returns an error, trying again works, and if
> p54_read_eeprom fails, trying again works too.
> 
> The below was applied to compat-wireless-2009-09-30:
> 
> Thanks,
> 
> Quintin.
> 
> Signed-off-by: Quintin Pitts <geek4linux@gmail.com>
> 
> --- 
> 
> --- a/drivers/net/wireless/p54/p54pci.c	2009-09-29 23:13:58.000000000 -0500
> +++ b/drivers/net/wireless/p54/p54pci.c	2009-10-09 08:15:58.000000000 -0500
> @@ -131,7 +131,7 @@ static int p54p_upload_firmware(struct i
>  
>  static void p54p_refill_rx_ring(struct ieee80211_hw *dev,
>  	int ring_index, struct p54p_desc *ring, u32 ring_limit,
> -	struct sk_buff **rx_buf)
> +	struct sk_buff **rx_buf, u32 index)
>  {
>  	struct p54p_priv *priv = dev->priv;
>  	struct p54p_ring_control *ring_control = priv->ring_control;
> @@ -139,7 +139,11 @@ static void p54p_refill_rx_ring(struct i
>  
>  	idx = le32_to_cpu(ring_control->host_idx[ring_index]);
>  	limit = idx;
> -	limit -= le32_to_cpu(ring_control->device_idx[ring_index]);
> +/*
> + *           Use last processed index instead of device_idx
> + *           so we don't corrupt our ring 
> + */
> +	limit -= index;
>  	limit = ring_limit - limit;
>  
>  	i = idx % ring_limit;
> @@ -181,9 +185,26 @@ static void p54p_check_rx_ring(struct ie
>  	struct p54p_ring_control *ring_control = priv->ring_control;
>  	struct p54p_desc *desc;
>  	u32 idx, i;
> +	int ret;
>  
> +	idx = le32_to_cpu(ring_control->device_idx[ring_index]);
>  	i = (*index) % ring_limit;
> -	(*index) = idx = le32_to_cpu(ring_control->device_idx[ring_index]);
> +	if(unlikely((idx - (*index)) > ring_limit || 
> + (le32_to_cpu(ring_control->host_idx[ring_index]) - (*index)) > ring_limit)) { 

The indentation in this section is strange.

> +  	printk(KERN_DEBUG "%s: devidx jumped *index=%d devidx=%d hostidx=%d ring_limit=%d\n",
> +	__func__,(*index),idx,ring_control->host_idx[ring_index],ring_limit);
> +/* 
> + * Do nothing things are really wrong - device index has jumped got corrupted
> + *  - wait for it to stabilize 
> + * So far device idx exactly 0xFF (255) bytes less than what it should be. 
> + * only seen to happen on very fast wireless and packet floods and/or iperf test
> + * In testing this error only encountered once - so next time around the 
> + * device index is correct.
> + * if to continue would soft lockup/hang in while loop in p54p_refill_rx_ring
> + */
> +		return;
> +		}

This section looks like one where a driver delay might resolve the
problem. I have no objections to your fix. I'm just curious.
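
To make the arithmetic behind this trap concrete, here is a small userspace sketch. It assumes the indexes are free-running 32-bit counters (idx is declared u32 in p54p_check_rx_ring in the quoted patch); the helper names are hypothetical. When device_idx is 255 behind the last processed index, the unsigned difference wraps to a value close to 2^32, which is far larger than any ring_limit, so the trap fires instead of the difference being used as a loop bound.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of entries the device claims to have
 * completed since the host last processed the ring.  Unsigned
 * subtraction is defined modulo 2^32, so a normal counter wrap still
 * yields a small value.
 */
static uint32_t ring_pending(uint32_t device_idx, uint32_t last_idx)
{
	return device_idx - last_idx;
}

/*
 * Mirrors the condition in the patch: the index is treated as corrupt
 * when the device appears to be more than ring_limit entries ahead.
 */
static bool ring_index_sane(uint32_t device_idx, uint32_t last_idx,
			    uint32_t ring_limit)
{
	return ring_pending(device_idx, last_idx) <= ring_limit;
}
```

For example, with last_idx = 1000 and device_idx = 745 (255 less), ring_pending() returns 0xFFFFFF01, while a genuine wrap such as last_idx = 0xFFFFFFFE and device_idx = 2 returns 4 and passes the check on an 8-entry ring.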

> +	(*index) = idx;
>  	idx %= ring_limit;
>  	while (i != idx) {
>  		u16 len;
> @@ -197,25 +218,40 @@ static void p54p_check_rx_ring(struct ie
>  			i %= ring_limit;
>  			continue;
>  		}
> +		if(unlikely(len == (desc->host_addr & 0xffff) 
> +	&& (desc->flags == ((desc->host_addr & 0xffff0000) >> 16))) ) {

You have a whitespace problem (no space after "if") and again an
indentation problem. Usually, the form is as follows:

                if (unlikely(len == (desc->host_addr & 0xffff) &&
                   (desc->flags....

> +/* device has put device dma in desc len/flag location - will crash in skb_put
> + * desc->len and desc->flags contain the host_addr -
> + * trap before skb_put and discard
> + * ViewSonic V210 and wireless card GENTEK WL-850 , IT8152 PCI bridge 
> + * happens occasionally - no clear reason or frequency.
> + *  
> + */ 
> +		printk(KERN_DEBUG "%s: rx_ring len/flags has address - skipping!\n",__func__); 
> +                  skb_trim(skb,0);
> +		  desc->len = cpu_to_le16(priv->common.rx_mtu + 32);
> +		  desc->flags=0;

There is an indentation problem here.

> +                 
> +		} else {
> +
>  		skb_put(skb, len);
>  
> -		if (p54_rx(dev, skb)) {
> -			pci_unmap_single(priv->pdev,
> +		ret=p54_rx(dev,skb);
> +		pci_unmap_single(priv->pdev,

Here too.

>  					 le32_to_cpu(desc->host_addr),
>  					 priv->common.rx_mtu + 32,
>  					 PCI_DMA_FROMDEVICE);
> -			rx_buf[i] = NULL;
> -			desc->host_addr = 0;
> -		} else {
> -			skb_trim(skb, 0);
> -			desc->len = cpu_to_le16(priv->common.rx_mtu + 32);
> -		}
> +		if(ret==0)
> +			dev_kfree_skb(skb);
> +		rx_buf[i] = NULL;
> +		desc->host_addr = 0;
> +		} /* end of desc->len skb corrupt crash test */
>  
>  		i++;
>  		i %= ring_limit;
>  	}
>  
> -	p54p_refill_rx_ring(dev, ring_index, ring, ring_limit, rx_buf);
> +	p54p_refill_rx_ring(dev, ring_index, ring, ring_limit, rx_buf, (*index));
>  }
>  
>  /* caller must hold priv->lock */
> @@ -428,10 +464,10 @@ static int p54p_open(struct ieee80211_hw
>  	priv->rx_idx_mgmt = priv->tx_idx_mgmt = 0;
>  
>  	p54p_refill_rx_ring(dev, 0, priv->ring_control->rx_data,
> -		ARRAY_SIZE(priv->ring_control->rx_data), priv->rx_buf_data);
> +		ARRAY_SIZE(priv->ring_control->rx_data), priv->rx_buf_data, 0);
>  
>  	p54p_refill_rx_ring(dev, 2, priv->ring_control->rx_mgmt,
> -		ARRAY_SIZE(priv->ring_control->rx_mgmt), priv->rx_buf_mgmt);
> +		ARRAY_SIZE(priv->ring_control->rx_mgmt), priv->rx_buf_mgmt, 0);
>  
>  	P54P_WRITE(ring_control_base, cpu_to_le32(priv->ring_control_dma));
>  	P54P_READ(ring_control_base);
> @@ -550,9 +586,26 @@ static int __devinit p54p_probe(struct p
>  	}
>  
>  	err = p54p_open(dev);
> -	if (err)
> -		goto err_free_common;
> +	if (err) {
> +                
> +		printk(KERN_DEBUG "%s: p54p_open failed - trying again\n",__func__);
> +                msleep(10);
> +		err = p54p_open(dev);
> +		if (err)
> +			goto err_free_common;
> +        }
>  	err = p54_read_eeprom(dev);
> +	if (err)
> +	{
> +                printk(KERN_DEBUG "%s: p54_read_eeprom failed - trying again\n",__func__);
> +		p54p_stop(dev);
> +		err = p54p_open(dev);
> +                if (err)
> +			goto err_free_common;
> +		msleep(10);
> +		err = p54_read_eeprom(dev);
> +             
> +	}
>  	p54p_stop(dev);
>  	if (err)
>  		goto err_free_common;
> 

I will let Christian comment more on the technical merits of the patch
as he understands the device much better than I do.

Larry


* Re: [PATCH] b43: fix ieee80211_rx() context
From: Kalle Valo @ 2009-10-11 15:59 UTC
  To: Johannes Berg; +Cc: John Linville, David Miller, Dave Young, linux-wireless
In-Reply-To: <1255256361.4095.56.camel@johannes.local>

Johannes Berg <johannes@sipsolutions.net> writes:

> Due to the way it interacts with the networking
> stack and other parts of mac80211, ieee80211_rx()
> must be called with disabled softirqs.

[...]

> +	local_bh_disable();
>  	ieee80211_rx(dev->wl->hw, skb);
> +	local_bh_enable();

This is a bit awkward from drivers' point of view, we have to add the
same code to all mac80211 drivers using either SPI or SDIO buses.

What about adding a new inline function ieee80211_rx_ni() which would
disable bottom halves like above and call ieee80211_rx()? IMHO that's
easier for the driver developers to understand and also easier to
document ("use this function when calling from process context"). If
this is acceptable, I can create a patch.
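
A rough sketch of what such a wrapper could look like follows. To keep it compilable outside the kernel, local_bh_disable(), local_bh_enable() and ieee80211_rx() are replaced by userspace stand-ins that only count calls; those stand-ins and the counters are assumptions made for illustration, and only the body of ieee80211_rx_ni() reflects the three lines quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque stand-ins for the kernel types; never dereferenced here. */
struct ieee80211_hw;
struct sk_buff;

/* Userspace stand-ins for the kernel primitives (illustration only). */
static int bh_disable_depth;		/* nesting depth of "BH disabled" */
static int rx_calls_total;		/* how often the rx stub ran */
static int rx_calls_with_bh_disabled;	/* ... with BHs disabled */

static void local_bh_disable(void)
{
	bh_disable_depth++;
}

static void local_bh_enable(void)
{
	assert(bh_disable_depth > 0);
	bh_disable_depth--;
}

/* Stub: records whether it was called with bottom halves disabled. */
static void ieee80211_rx(struct ieee80211_hw *hw, struct sk_buff *skb)
{
	(void)hw;
	(void)skb;
	rx_calls_total++;
	if (bh_disable_depth > 0)
		rx_calls_with_bh_disabled++;
}

/* The proposed helper: safe to call from process context. */
static inline void ieee80211_rx_ni(struct ieee80211_hw *hw,
				   struct sk_buff *skb)
{
	local_bh_disable();
	ieee80211_rx(hw, skb);
	local_bh_enable();
}
```

A driver running its receive path in process context (for example from an SPI or SDIO worker) would then call ieee80211_rx_ni(hw, skb) instead of open-coding the disable/enable pair around ieee80211_rx().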

-- 
Kalle Valo


* Re: [PATCH] b43: fix ieee80211_rx() context
From: Johannes Berg @ 2009-10-11 16:02 UTC
  To: Kalle Valo; +Cc: John Linville, David Miller, Dave Young, linux-wireless
In-Reply-To: <873a5pq5vc.fsf@purkki.valot.fi>

On Sun, 2009-10-11 at 18:59 +0300, Kalle Valo wrote:
> Johannes Berg <johannes@sipsolutions.net> writes:
> 
> > Due to the way it interacts with the networking
> > stack and other parts of mac80211, ieee80211_rx()
> > must be called with disabled softirqs.
> 
> [...]
> 
> > +	local_bh_disable();
> >  	ieee80211_rx(dev->wl->hw, skb);
> > +	local_bh_enable();
> 
> This is a bit awkward from drivers' point of view, we have to add the
> same code to all mac80211 drivers using either SPI or SDIO buses.
> 
> What about adding a new inline function ieee80211_rx_ni() which would
> disable bottom halves like above and call ieee80211_rx()? IMHO that's
> easier for the driver developers to understand and also easier to
> document ("use this function when calling from process context"). If
> this is acceptable, I can create a patch.

I really don't see the point, since it's just three lines of code, but I
wouldn't mind all that much either.

johannes



* Re: [PATCH] b43: fix ieee80211_rx() context
From: Kalle Valo @ 2009-10-11 16:08 UTC
  To: Johannes Berg; +Cc: John Linville, David Miller, Dave Young, linux-wireless
In-Reply-To: <1255276971.4095.241.camel@johannes.local>

Johannes Berg <johannes@sipsolutions.net> writes:

>> > +	local_bh_disable();
>> >  	ieee80211_rx(dev->wl->hw, skb);
>> > +	local_bh_enable();
>> 
>> This is a bit awkward from drivers' point of view, we have to add the
>> same code to all mac80211 drivers using either SPI or SDIO buses.
>> 
>> What about adding a new inline function ieee80211_rx_ni() which would
>> disable bottom halves like above and call ieee80211_rx()? IMHO that's
>> easier for the driver developers to understand and also easier to
>> document ("use this function when calling from process context"). If
>> this is acceptable, I can create a patch.
>
> I really don't see the point, since it's just three lines of code, but I
> wouldn't mind all that much either.

My worry is the developers who don't even know what a bottom half is
and might get it all wrong. (Yes, there really are such people.)

But if you don't see any benefit from adding a new function, I'll drop
the idea. No need to complicate this anymore :)

-- 
Kalle Valo


* ath5k AP kernel panic when client uses SCP
From: Tomasz Chmielewski @ 2009-10-11 17:49 UTC
  To: linux-wireless, linux-netdev, linux-mips

I am able to trigger this kernel panic when a client copies data using SCP to another client using the same AP:

client_1 <---wired---> AP <---wireless---> client_2

The panic happens after transferring around 30-40 MB of data.


The AP is stable with normal traffic (HTTP, HTTPS, IMAPS, text SSH).



The AP is Asus WL-500gP, it's a MIPS platform, running 2.6.31.1 kernel, hostapd v0.6.9.

I can reproduce the issue reliably.

Let me know if you need more info here.

[67359.700000] ------------[ cut here ]------------
[67359.710000] WARNING: at net/core/dev.c:1566 0x80280890()
[67359.710000] b44: caps=(0x0, 0x0) len=80 data_len=0 ip_summed=1
[67359.720000] Modules linked in: tun sch_sfq cls_fw sch_htb ipt_MASQUERADE iptable_nat nf_nat xt_MARK iptable_mangle ipt_ULOG xt_recent nf_conntrack_ipv4 nf_defrag1
[67359.740000] Call Trace:[<8002df58>] 0x8002df58
[67359.750000] [<8001371c>] 0x8001371c
[67359.750000] [<8001371c>] 0x8001371c
[67359.750000] [<8002cfb0>] 0x8002cfb0
[67359.760000] [<80280890>] 0x80280890
[67359.760000] [<8002d018>] 0x8002d018
[67359.770000] [<80280890>] 0x80280890
[67359.770000] [<80280c9c>] 0x80280c9c
[67359.770000] [<80280c20>] 0x80280c20
[67359.780000] [<802980e0>] 0x802980e0
[67359.780000] [<80298078>] 0x80298078
[67359.780000] [<8030dd3c>] 0x8030dd3c
[67359.790000] [<80284fbc>] 0x80284fbc
[67359.790000] [<80284f18>] 0x80284f18
[67359.790000] [<8030dd3c>] 0x8030dd3c
[67359.800000] [<803089d4>] 0x803089d4
[67359.800000] [<8030893c>] 0x8030893c
[67359.800000] [<8030ea74>] 0x8030ea74
[67359.810000] [<8030dd3c>] 0x8030dd3c
[67359.810000] [<8030dd3c>] 0x8030dd3c
[67359.820000] [<802a52f0>] 0x802a52f0
[67359.820000] [<8030893c>] 0x8030893c
[67359.820000] [<8030893c>] 0x8030893c
[67359.830000] [<8030893c>] 0x8030893c
[67359.830000] [<802a5440>] 0x802a5440
[67359.830000] [<8030893c>] 0x8030893c
[67359.840000] [<803089e8>] 0x803089e8
[67359.840000] [<80308a3c>] 0x80308a3c
[67359.840000] [<8030893c>] 0x8030893c
[67359.850000] [<8030dec8>] 0x8030dec8
[67359.850000] [<803089e8>] 0x803089e8
[67359.850000] [<803089e8>] 0x803089e8
[67359.860000] [<8030efa8>] 0x8030efa8
[67359.860000] [<8030ddb4>] 0x8030ddb4
[67359.870000] [<8030ddb4>] 0x8030ddb4
[67359.870000] [<802a52f0>] 0x802a52f0
[67359.870000] [<803089e8>] 0x803089e8
[67359.880000] [<803089e8>] 0x803089e8
[67359.880000] [<802a5440>] 0x802a5440
[67359.880000] [<806a5858>] 0x806a5858
[67359.890000] [<803089e8>] 0x803089e8
[67359.890000] [<80309978>] 0x80309978
[67359.890000] [<80308af4>] 0x80308af4
[67359.900000] [<80309978>] 0x80309978
[67359.900000] [<802a5440>] 0x802a5440
[67359.900000] [<803089e8>] 0x803089e8
[67359.910000] [<80309ae0>] 0x80309ae0
[67359.910000] [<80309978>] 0x80309978
[67359.920000] [<8030e668>] 0x8030e668
[67359.920000] [<8030e60c>] 0x8030e60c
[67359.920000] [<8030e25c>] 0x8030e25c
[67359.930000] [<80309978>] 0x80309978
[67359.930000] [<8030e25c>] 0x8030e25c
[67359.930000] [<802a5440>] 0x802a5440
[67359.940000] [<80014aa0>] 0x80014aa0
[67359.940000] [<8030e25c>] 0x8030e25c
[67359.940000] [<8030f96c>] 0x8030f96c
[67359.950000] [<8079301c>] 0x8079301c
[67359.950000] [<8030e25c>] 0x8030e25c
[67359.950000] [<802a52f0>] 0x802a52f0
[67359.960000] [<80309978>] 0x80309978
[67359.960000] [<80309978>] 0x80309978
[67359.970000] [<802a5440>] 0x802a5440
[67359.970000] [<8008feec>] 0x8008feec
[67359.970000] [<80309978>] 0x80309978
[67359.980000] [<80309d44>] 0x80309d44
[67359.980000] [<8008feec>] 0x8008feec
[67359.980000] [<8008ffdc>] 0x8008ffdc
[67359.990000] [<80276190>] 0x80276190
[67359.990000] [<80309978>] 0x80309978
[67359.990000] [<800900a4>] 0x800900a4
[67360.000000] [<8027fe3c>] 0x8027fe3c
[67360.000000] [<802964f8>] 0x802964f8
[67360.000000] [<8008feec>] 0x8008feec
[67360.010000] [<802828e8>] 0x802828e8
[67360.010000] [<80276190>] 0x80276190
[67360.020000] [<80283b40>] 0x80283b40
[67360.020000] [<80282a94>] 0x80282a94
[67360.020000] [<80033764>] 0x80033764
[67360.030000] [<8005d0f0>] 0x8005d0f0
[67360.030000] [<800543e0>] 0x800543e0
[67360.030000] [<80033874>] 0x80033874
[67360.040000] [<80033d74>] 0x80033d74
[67360.040000] [<80001844>] 0x80001844
[67360.040000] [<80001844>] 0x80001844
[67360.050000] [<80001a60>] 0x80001a60
[67360.050000] [<800149fc>] 0x800149fc
[67360.050000] [<8000efc8>] 0x8000efc8
[67360.060000] [<8000efc8>] 0x8000efc8
[67360.060000] [<8039c9ec>] 0x8039c9ec
[67360.070000] [<8039c9d0>] 0x8039c9d0
[67360.070000] [<8039c110>] 0x8039c110
[67360.070000]
[67360.070000] ---[ end trace 94ff764c3a95abf9 ]---
[67360.080000] Unhandled kernel unaligned access[#1]:
[67360.080000] Cpu 0
[67360.080000] $ 0   : 00000000 1000dc00 00000001 81445a40
[67360.080000] $ 4   : 05f20d4d 00000000 00000001 00000083
[67360.080000] $ 8   : 00000000 00000083 803d0000 ffffffea
[67360.080000] $12   : 803d0000 00000000 00000000 00000000
[67360.080000] $16   : 0000000c 00000001 8099ee20 81c72000
[67360.080000] $20   : 81d17e00 80330090 8030893c 81c72000
[67360.080000] $24   : 00010720 802ced34
[67360.080000] $28   : 8037c000 8037d880 00000010 80276890
[67360.080000] Hi    : 00000000
[67360.080000] Lo    : 00000000
[67360.080000] epc   : 8006f058 0x8006f058
[67360.080000]     Tainted: G        W
[67360.080000] ra    : 80276890 0x80276890
[67360.080000] Status: 1000dc03    KERNEL EXL IE
[67360.080000] Cause : 00800010
[67360.080000] BadVA : 05f20d4d
[67360.080000] PrId  : 00029006 (Broadcom BCM3302)
[67360.080000] Modules linked in: tun sch_sfq cls_fw sch_htb ipt_MASQUERADE iptable_nat nf_nat xt_MARK iptable_mangle ipt_ULOG xt_recent nf_conntrack_ipv4 nf_defrag1
[67360.080000] Process swapper (pid: 0, threadinfo=8037c000, task=8037e000, tls=00000000)
[67360.080000] Stack : 8099ee20 80276190 00000000 00000000 8099ee20 803d80a8 8099ee20 802761f0
[67360.080000]         81d17e00 803d80a8 8099ee20 81c72000 81d17e00 80280d50 8099ee20 81404f80
[67360.080000]         8037d900 80da8000 80da8000 81d17e00 81d17e00 00000001 80da8000 8099ee20
[67360.080000]         81c72000 00665332 802980e0 80298078 80da8000 00000002 00000000 8030dd3c
[67360.080000]         80da8000 81d17e00 8099ee20 00000000 803d86e0 80000000 80284fbc 80284f18
[67360.080000]         ...
[67360.080000] Call Trace:[<80276190>] 0x80276190
[67360.080000] [<802761f0>] 0x802761f0
[67360.080000] [<80280d50>] 0x80280d50
[67360.080000] [<802980e0>] 0x802980e0
[67360.080000] [<80298078>] 0x80298078
[67360.080000] [<8030dd3c>] 0x8030dd3c
[67360.080000] [<80284fbc>] 0x80284fbc
[67360.080000] [<80284f18>] 0x80284f18
[67360.080000] [<8030dd3c>] 0x8030dd3c
[67360.080000] [<803089d4>] 0x803089d4
[67360.080000] [<8030893c>] 0x8030893c
[67360.080000] [<8030ea74>] 0x8030ea74
[67360.080000] [<8030dd3c>] 0x8030dd3c
[67360.080000] [<8030dd3c>] 0x8030dd3c
[67360.080000] [<802a52f0>] 0x802a52f0
[67360.080000] [<8030893c>] 0x8030893c
[67360.080000] [<8030893c>] 0x8030893c
[67360.080000] [<8030893c>] 0x8030893c
[67360.080000] [<802a5440>] 0x802a5440
[67360.080000] [<8030893c>] 0x8030893c
[67360.080000] [<803089e8>] 0x803089e8
[67360.080000] [<80308a3c>] 0x80308a3c
[67360.080000] [<8030893c>] 0x8030893c
[67360.080000] [<8030dec8>] 0x8030dec8
[67360.080000] [<803089e8>] 0x803089e8
[67360.080000] [<803089e8>] 0x803089e8
[67360.080000] [<8030efa8>] 0x8030efa8
[67360.080000] [<8030ddb4>] 0x8030ddb4
[67360.080000] [<8030ddb4>] 0x8030ddb4
[67360.080000] [<802a52f0>] 0x802a52f0
[67360.080000] [<803089e8>] 0x803089e8
[67360.080000] [<803089e8>] 0x803089e8
[67360.080000] [<802a5440>] 0x802a5440
[67360.080000] [<806a5858>] 0x806a5858
[67360.080000] [<803089e8>] 0x803089e8
[67360.080000] [<80309978>] 0x80309978
[67360.080000] [<80308af4>] 0x80308af4
[67360.080000] [<80309978>] 0x80309978
[67360.080000] [<802a5440>] 0x802a5440
[67360.080000] [<803089e8>] 0x803089e8
[67360.080000] [<80309ae0>] 0x80309ae0
[67360.080000] [<80309978>] 0x80309978
[67360.080000] [<8030e668>] 0x8030e668
[67360.080000] [<8030e60c>] 0x8030e60c
[67360.080000] [<8030e25c>] 0x8030e25c
[67360.080000] [<80309978>] 0x80309978
[67360.080000] [<8030e25c>] 0x8030e25c
[67360.080000] [<802a5440>] 0x802a5440
[67360.080000] [<80014aa0>] 0x80014aa0
[67360.080000] [<8030e25c>] 0x8030e25c
[67360.080000] [<8030f96c>] 0x8030f96c
[67360.080000] [<8079301c>] 0x8079301c
[67360.080000] [<8030e25c>] 0x8030e25c
[67360.080000] [<802a52f0>] 0x802a52f0
[67360.080000] [<80309978>] 0x80309978
[67360.080000] [<80309978>] 0x80309978
[67360.080000] [<802a5440>] 0x802a5440
[67360.080000] [<8008feec>] 0x8008feec
[67360.080000] [<80309978>] 0x80309978
[67360.080000] [<80309d44>] 0x80309d44
[67360.080000] [<8008feec>] 0x8008feec
[67360.080000] [<8008ffdc>] 0x8008ffdc
[67360.080000] [<80276190>] 0x80276190
[67360.080000] [<80309978>] 0x80309978
[67360.080000] [<800900a4>] 0x800900a4
[67360.080000] [<8027fe3c>] 0x8027fe3c
[67360.080000] [<802964f8>] 0x802964f8
[67360.080000] [<8008feec>] 0x8008feec
[67360.080000] [<802828e8>] 0x802828e8
[67360.080000] [<80276190>] 0x80276190
[67360.080000] [<80283b40>] 0x80283b40
[67360.080000] [<80282a94>] 0x80282a94
[67360.080000] [<80033764>] 0x80033764
[67360.080000] [<8005d0f0>] 0x8005d0f0
[67360.080000] [<800543e0>] 0x800543e0
[67360.080000] [<80033874>] 0x80033874
[67360.080000] [<80033d74>] 0x80033d74
[67360.080000] [<80001844>] 0x80001844
[67360.080000] [<80001844>] 0x80001844
[67360.080000] [<80001a60>] 0x80001a60
[67360.080000] [<800149fc>] 0x800149fc
[67360.080000] [<8000efc8>] 0x8000efc8
[67360.080000] [<8000efc8>] 0x8000efc8
[67360.080000] [<8039c9ec>] 0x8039c9ec
[67360.080000] [<8039c9d0>] 0x8039c9d0
[67360.080000] [<8039c110>] 0x8039c110
[67360.080000]
[67360.080000]
[67360.080000] Code: 3c048007  08010471  2484f044 <8c820000> 3042c000  10400003  00803821  0801b8d1  00000000
[67360.080000] Disabling lock debugging due to kernel taint
[67360.560000] Kernel panic - not syncing: Fatal exception in interrupt




-- 
Tomasz Chmielewski
http://wpkg.org


* Re: ath5k AP kernel panic when client uses SCP
From: Tomasz Chmielewski @ 2009-10-11 17:55 UTC (permalink / raw)
  To: linux-wireless, linux-net, linux-mips
In-Reply-To: <4AD21AB4.6010208@wpkg.org>

Added linux-net as I used a wrong address in the original mail.


> I am able to trigger this kernel panic when a client copies data using 
> SCP to another client using the same AP:
> 
> client_1 <---wired---> AP <---wireless---> client_2
> 
> The panic happens after transferring around 30-40 MB of data.
> 
> 
> The AP behaves stable with normal traffic (HTTP, HTTPS, IMAPS, text SSH).
> 
> 
> 
> The AP is Asus WL-500gP, it's a MIPS platform, running 2.6.31.1 kernel, 
> hostapd v0.6.9.
> 
> I can reproduce the issue reliably.
> 
> Let me know if you need more info here.
> 
> [67359.700000] ------------[ cut here ]------------
> [67359.710000] WARNING: at net/core/dev.c:1566 0x80280890()
> [67359.710000] b44: caps=(0x0, 0x0) len=80 data_len=0 ip_summed=1
> [67359.720000] Modules linked in: tun sch_sfq cls_fw sch_htb 
> ipt_MASQUERADE iptable_nat nf_nat xt_MARK iptable_mangle ipt_ULOG 
> xt_recent nf_conntrack_ipv4 nf_defrag1
> [67359.740000] Call Trace:[<8002df58>] 0x8002df58
> [67359.750000] [<8001371c>] 0x8001371c
> [67359.750000] [<8001371c>] 0x8001371c
> [67359.750000] [<8002cfb0>] 0x8002cfb0
> [67359.760000] [<80280890>] 0x80280890
> [67359.760000] [<8002d018>] 0x8002d018
> [67359.770000] [<80280890>] 0x80280890
> [67359.770000] [<80280c9c>] 0x80280c9c
> [67359.770000] [<80280c20>] 0x80280c20
> [67359.780000] [<802980e0>] 0x802980e0
> [67359.780000] [<80298078>] 0x80298078
> [67359.780000] [<8030dd3c>] 0x8030dd3c
> [67359.790000] [<80284fbc>] 0x80284fbc
> [67359.790000] [<80284f18>] 0x80284f18
> [67359.790000] [<8030dd3c>] 0x8030dd3c
> [67359.800000] [<803089d4>] 0x803089d4
> [67359.800000] [<8030893c>] 0x8030893c
> [67359.800000] [<8030ea74>] 0x8030ea74
> [67359.810000] [<8030dd3c>] 0x8030dd3c
> [67359.810000] [<8030dd3c>] 0x8030dd3c
> [67359.820000] [<802a52f0>] 0x802a52f0
> [67359.820000] [<8030893c>] 0x8030893c
> [67359.820000] [<8030893c>] 0x8030893c
> [67359.830000] [<8030893c>] 0x8030893c
> [67359.830000] [<802a5440>] 0x802a5440
> [67359.830000] [<8030893c>] 0x8030893c
> [67359.840000] [<803089e8>] 0x803089e8
> [67359.840000] [<80308a3c>] 0x80308a3c
> [67359.840000] [<8030893c>] 0x8030893c
> [67359.850000] [<8030dec8>] 0x8030dec8
> [67359.850000] [<803089e8>] 0x803089e8
> [67359.850000] [<803089e8>] 0x803089e8
> [67359.860000] [<8030efa8>] 0x8030efa8
> [67359.860000] [<8030ddb4>] 0x8030ddb4
> [67359.870000] [<8030ddb4>] 0x8030ddb4
> [67359.870000] [<802a52f0>] 0x802a52f0
> [67359.870000] [<803089e8>] 0x803089e8
> [67359.880000] [<803089e8>] 0x803089e8
> [67359.880000] [<802a5440>] 0x802a5440
> [67359.880000] [<806a5858>] 0x806a5858
> [67359.890000] [<803089e8>] 0x803089e8
> [67359.890000] [<80309978>] 0x80309978
> [67359.890000] [<80308af4>] 0x80308af4
> [67359.900000] [<80309978>] 0x80309978
> [67359.900000] [<802a5440>] 0x802a5440
> [67359.900000] [<803089e8>] 0x803089e8
> [67359.910000] [<80309ae0>] 0x80309ae0
> [67359.910000] [<80309978>] 0x80309978
> [67359.920000] [<8030e668>] 0x8030e668
> [67359.920000] [<8030e60c>] 0x8030e60c
> [67359.920000] [<8030e25c>] 0x8030e25c
> [67359.930000] [<80309978>] 0x80309978
> [67359.930000] [<8030e25c>] 0x8030e25c
> [67359.930000] [<802a5440>] 0x802a5440
> [67359.940000] [<80014aa0>] 0x80014aa0
> [67359.940000] [<8030e25c>] 0x8030e25c
> [67359.940000] [<8030f96c>] 0x8030f96c
> [67359.950000] [<8079301c>] 0x8079301c
> [67359.950000] [<8030e25c>] 0x8030e25c
> [67359.950000] [<802a52f0>] 0x802a52f0
> [67359.960000] [<80309978>] 0x80309978
> [67359.960000] [<80309978>] 0x80309978
> [67359.970000] [<802a5440>] 0x802a5440
> [67359.970000] [<8008feec>] 0x8008feec
> [67359.970000] [<80309978>] 0x80309978
> [67359.980000] [<80309d44>] 0x80309d44
> [67359.980000] [<8008feec>] 0x8008feec
> [67359.980000] [<8008ffdc>] 0x8008ffdc
> [67359.990000] [<80276190>] 0x80276190
> [67359.990000] [<80309978>] 0x80309978
> [67359.990000] [<800900a4>] 0x800900a4
> [67360.000000] [<8027fe3c>] 0x8027fe3c
> [67360.000000] [<802964f8>] 0x802964f8
> [67360.000000] [<8008feec>] 0x8008feec
> [67360.010000] [<802828e8>] 0x802828e8
> [67360.010000] [<80276190>] 0x80276190
> [67360.020000] [<80283b40>] 0x80283b40
> [67360.020000] [<80282a94>] 0x80282a94
> [67360.020000] [<80033764>] 0x80033764
> [67360.030000] [<8005d0f0>] 0x8005d0f0
> [67360.030000] [<800543e0>] 0x800543e0
> [67360.030000] [<80033874>] 0x80033874
> [67360.040000] [<80033d74>] 0x80033d74
> [67360.040000] [<80001844>] 0x80001844
> [67360.040000] [<80001844>] 0x80001844
> [67360.050000] [<80001a60>] 0x80001a60
> [67360.050000] [<800149fc>] 0x800149fc
> [67360.050000] [<8000efc8>] 0x8000efc8
> [67360.060000] [<8000efc8>] 0x8000efc8
> [67360.060000] [<8039c9ec>] 0x8039c9ec
> [67360.070000] [<8039c9d0>] 0x8039c9d0
> [67360.070000] [<8039c110>] 0x8039c110
> [67360.070000]
> [67360.070000] ---[ end trace 94ff764c3a95abf9 ]---
> [67360.080000] Unhandled kernel unaligned access[#1]:
> [67360.080000] Cpu 0
> [67360.080000] $ 0   : 00000000 1000dc00 00000001 81445a40
> [67360.080000] $ 4   : 05f20d4d 00000000 00000001 00000083
> [67360.080000] $ 8   : 00000000 00000083 803d0000 ffffffea
> [67360.080000] $12   : 803d0000 00000000 00000000 00000000
> [67360.080000] $16   : 0000000c 00000001 8099ee20 81c72000
> [67360.080000] $20   : 81d17e00 80330090 8030893c 81c72000
> [67360.080000] $24   : 00010720 802ced34
> [67360.080000] $28   : 8037c000 8037d880 00000010 80276890
> [67360.080000] Hi    : 00000000
> [67360.080000] Lo    : 00000000
> [67360.080000] epc   : 8006f058 0x8006f058
> [67360.080000]     Tainted: G        W
> [67360.080000] ra    : 80276890 0x80276890
> [67360.080000] Status: 1000dc03    KERNEL EXL IE
> [67360.080000] Cause : 00800010
> [67360.080000] BadVA : 05f20d4d
> [67360.080000] PrId  : 00029006 (Broadcom BCM3302)
> [67360.080000] Modules linked in: tun sch_sfq cls_fw sch_htb 
> ipt_MASQUERADE iptable_nat nf_nat xt_MARK iptable_mangle ipt_ULOG 
> xt_recent nf_conntrack_ipv4 nf_defrag1
> [67360.080000] Process swapper (pid: 0, threadinfo=8037c000, 
> task=8037e000, tls=00000000)
> [67360.080000] Stack : 8099ee20 80276190 00000000 00000000 8099ee20 
> 803d80a8 8099ee20 802761f0
> [67360.080000]         81d17e00 803d80a8 8099ee20 81c72000 81d17e00 
> 80280d50 8099ee20 81404f80
> [67360.080000]         8037d900 80da8000 80da8000 81d17e00 81d17e00 
> 00000001 80da8000 8099ee20
> [67360.080000]         81c72000 00665332 802980e0 80298078 80da8000 
> 00000002 00000000 8030dd3c
> [67360.080000]         80da8000 81d17e00 8099ee20 00000000 803d86e0 
> 80000000 80284fbc 80284f18
> [67360.080000]         ...
> [67360.080000] Call Trace:[<80276190>] 0x80276190
> [67360.080000] [<802761f0>] 0x802761f0
> [67360.080000] [<80280d50>] 0x80280d50
> [67360.080000] [<802980e0>] 0x802980e0
> [67360.080000] [<80298078>] 0x80298078
> [67360.080000] [<8030dd3c>] 0x8030dd3c
> [67360.080000] [<80284fbc>] 0x80284fbc
> [67360.080000] [<80284f18>] 0x80284f18
> [67360.080000] [<8030dd3c>] 0x8030dd3c
> [67360.080000] [<803089d4>] 0x803089d4
> [67360.080000] [<8030893c>] 0x8030893c
> [67360.080000] [<8030ea74>] 0x8030ea74
> [67360.080000] [<8030dd3c>] 0x8030dd3c
> [67360.080000] [<8030dd3c>] 0x8030dd3c
> [67360.080000] [<802a52f0>] 0x802a52f0
> [67360.080000] [<8030893c>] 0x8030893c
> [67360.080000] [<8030893c>] 0x8030893c
> [67360.080000] [<8030893c>] 0x8030893c
> [67360.080000] [<802a5440>] 0x802a5440
> [67360.080000] [<8030893c>] 0x8030893c
> [67360.080000] [<803089e8>] 0x803089e8
> [67360.080000] [<80308a3c>] 0x80308a3c
> [67360.080000] [<8030893c>] 0x8030893c
> [67360.080000] [<8030dec8>] 0x8030dec8
> [67360.080000] [<803089e8>] 0x803089e8
> [67360.080000] [<803089e8>] 0x803089e8
> [67360.080000] [<8030efa8>] 0x8030efa8
> [67360.080000] [<8030ddb4>] 0x8030ddb4
> [67360.080000] [<8030ddb4>] 0x8030ddb4
> [67360.080000] [<802a52f0>] 0x802a52f0
> [67360.080000] [<803089e8>] 0x803089e8
> [67360.080000] [<803089e8>] 0x803089e8
> [67360.080000] [<802a5440>] 0x802a5440
> [67360.080000] [<806a5858>] 0x806a5858
> [67360.080000] [<803089e8>] 0x803089e8
> [67360.080000] [<80309978>] 0x80309978
> [67360.080000] [<80308af4>] 0x80308af4
> [67360.080000] [<80309978>] 0x80309978
> [67360.080000] [<802a5440>] 0x802a5440
> [67360.080000] [<803089e8>] 0x803089e8
> [67360.080000] [<80309ae0>] 0x80309ae0
> [67360.080000] [<80309978>] 0x80309978
> [67360.080000] [<8030e668>] 0x8030e668
> [67360.080000] [<8030e60c>] 0x8030e60c
> [67360.080000] [<8030e25c>] 0x8030e25c
> [67360.080000] [<80309978>] 0x80309978
> [67360.080000] [<8030e25c>] 0x8030e25c
> [67360.080000] [<802a5440>] 0x802a5440
> [67360.080000] [<80014aa0>] 0x80014aa0
> [67360.080000] [<8030e25c>] 0x8030e25c
> [67360.080000] [<8030f96c>] 0x8030f96c
> [67360.080000] [<8079301c>] 0x8079301c
> [67360.080000] [<8030e25c>] 0x8030e25c
> [67360.080000] [<802a52f0>] 0x802a52f0
> [67360.080000] [<80309978>] 0x80309978
> [67360.080000] [<80309978>] 0x80309978
> [67360.080000] [<802a5440>] 0x802a5440
> [67360.080000] [<8008feec>] 0x8008feec
> [67360.080000] [<80309978>] 0x80309978
> [67360.080000] [<80309d44>] 0x80309d44
> [67360.080000] [<8008feec>] 0x8008feec
> [67360.080000] [<8008ffdc>] 0x8008ffdc
> [67360.080000] [<80276190>] 0x80276190
> [67360.080000] [<80309978>] 0x80309978
> [67360.080000] [<800900a4>] 0x800900a4
> [67360.080000] [<8027fe3c>] 0x8027fe3c
> [67360.080000] [<802964f8>] 0x802964f8
> [67360.080000] [<8008feec>] 0x8008feec
> [67360.080000] [<802828e8>] 0x802828e8
> [67360.080000] [<80276190>] 0x80276190
> [67360.080000] [<80283b40>] 0x80283b40
> [67360.080000] [<80282a94>] 0x80282a94
> [67360.080000] [<80033764>] 0x80033764
> [67360.080000] [<8005d0f0>] 0x8005d0f0
> [67360.080000] [<800543e0>] 0x800543e0
> [67360.080000] [<80033874>] 0x80033874
> [67360.080000] [<80033d74>] 0x80033d74
> [67360.080000] [<80001844>] 0x80001844
> [67360.080000] [<80001844>] 0x80001844
> [67360.080000] [<80001a60>] 0x80001a60
> [67360.080000] [<800149fc>] 0x800149fc
> [67360.080000] [<8000efc8>] 0x8000efc8
> [67360.080000] [<8000efc8>] 0x8000efc8
> [67360.080000] [<8039c9ec>] 0x8039c9ec
> [67360.080000] [<8039c9d0>] 0x8039c9d0
> [67360.080000] [<8039c110>] 0x8039c110
> [67360.080000]
> [67360.080000]
> [67360.080000] Code: 3c048007  08010471  2484f044 <8c820000> 3042c000  
> 10400003  00803821  0801b8d1  00000000
> [67360.080000] Disabling lock debugging due to kernel taint
> [67360.560000] Kernel panic - not syncing: Fatal exception in interrupt


-- 
Tomasz Chmielewski
http://wpkg.org



* Re: [PATCH] compat-wireless: Fix the bleeding-edge version to build on 2.6.27
From: Hauke Mehrtens @ 2009-10-11 19:00 UTC (permalink / raw)
  To: Larry Finger; +Cc: lrodriguez, linux-wireless
In-Reply-To: <4ac55601.EpUUwD1vnjBKSXDy%Larry.Finger@lwfinger.net>

[-- Attachment #1: Type: text/plain, Size: 3546 bytes --]

Larry Finger wrote:
> When building the bleeding-edge compat-wireless for kernel 2.6.27,
> several compilation errors were detected.
> 
> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
> ---
> 
> Luis,
> 
> I checked these patches on 2.6.27 and 2.6.31, but not for the intermediate
> releases.
> 
> Larry
> ---
> 
> Index: compat-wireless-2009-09-05/include/net/compat-2.6.28.h
> ===================================================================
> --- compat-wireless-2009-09-05.orig/include/net/compat-2.6.28.h
> +++ compat-wireless-2009-09-05/include/net/compat-2.6.28.h
> @@ -149,6 +149,7 @@ static inline void skb_queue_splice_tail
>  struct module;
>  struct tracepoint;
>  
> +#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,28))
>  struct tracepoint {
>  	const char *name;		/* Tracepoint name */
>  	int state;			/* State. */
> @@ -159,6 +160,7 @@ struct tracepoint {
>  					 * align these on the structure size.
>  					 * Keep in sync with vmlinux.lds.h.
>  					 */
> +#endif
>  
>  #ifndef DECLARE_TRACE
>  
> @@ -179,13 +181,17 @@ struct tracepoint {
>  		return -ENOSYS;						\
>  	}
>  
> +#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,28))
>  #define DEFINE_TRACE(name)
> +#endif
>  #define EXPORT_TRACEPOINT_SYMBOL_GPL(name)
>  #define EXPORT_TRACEPOINT_SYMBOL(name)
>  
> +#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,28))
>  static inline void tracepoint_update_probe_range(struct tracepoint *begin,
>  	struct tracepoint *end)
>  { }
> +#endif
>  
>  #endif
>  

LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,28) cannot be true in
compat-2.6.28.h. The definitions are no longer needed in compat-wireless.
Removing them does not break compilation against mainline kernels 2.6.25
to 2.6.32.
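
To illustrate the point about the version guard, here is a minimal
sketch (not part of the patch). It assumes that the contents of
compat-2.6.28.h sit inside an outer
"#if (LINUX_VERSION_CODE < KERNEL_VERSION(2,6,28))" block, as the
matching .c file does, and it pretends the build is against 2.6.27.
The names OUTER_GUARD_ACTIVE, INNER_GUARD_ACTIVE, version_code() and
inner_guard_reachable() are hypothetical and exist only for this
example.

```c
#include <assert.h>

/* Same encoding the kernel's version header used at the time. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Assumption for illustration: we are building against 2.6.27. */
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(2, 6, 27)
#endif

#if (LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 28))
/* Assumed outer guard of compat-2.6.28.h. */
#define OUTER_GUARD_ACTIVE 1
#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 28))
/* Contradicts the outer guard, so this branch is never compiled. */
#define INNER_GUARD_ACTIVE 1
#else
#define INNER_GUARD_ACTIVE 0
#endif
#else
#define OUTER_GUARD_ACTIVE 0
#define INNER_GUARD_ACTIVE 0
#endif

/* Hypothetical helper: run-time form of the same encoding. */
static unsigned int version_code(unsigned int a, unsigned int b,
				 unsigned int c)
{
	assert(b < 256 && c < 256);
	return KERNEL_VERSION(a, b, c);
}

/* Hypothetical helper: 1 only if both guards could hold at once. */
static int inner_guard_reachable(void)
{
	return OUTER_GUARD_ACTIVE && INNER_GUARD_ACTIVE;
}
```

Under these assumptions inner_guard_reachable() returns 0 for any
kernel version, so the struct, DEFINE_TRACE and
tracepoint_update_probe_range() wrapped by the added "#if" lines are
effectively removed from the header.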

> Index: compat-wireless-2009-09-05/net/wireless/compat-2.6.28.c
> ===================================================================
> --- compat-wireless-2009-09-05.orig/net/wireless/compat-2.6.28.c
> +++ compat-wireless-2009-09-05/net/wireless/compat-2.6.28.c
> @@ -260,6 +260,7 @@ static unsigned long round_jiffies_commo
>  	return j;
>  }
>  
> +#if 0
>  /**
>   * round_jiffies_up - function to round jiffies up to a full second
>   * @j: the time in (absolute) jiffies that should be rounded
> @@ -274,5 +275,6 @@ unsigned long round_jiffies_up(unsigned
>  	return round_jiffies_common(j, raw_smp_processor_id(), true);
>  }
>  EXPORT_SYMBOL_GPL(round_jiffies_up);
> +#endif
>  
>  #endif /* LINUX_VERSION_CODE < KERNEL_VERSION(2,6,28) */

The mainline kernel 2.6.27 does not contain round_jiffies_up. Are you
using SUSE? SUSE adds some extra extensions to the kernel, so we need
another way to deactivate the round_jiffies_up export. With this patch
the symbol is missing when compiling against mainline kernels <=
2.6.27. Another user reported a problem with the SUSE kernel in:
http://marc.info/?l=linux-wireless&m=125393384728475

> Index: compat-wireless-2009-09-05/net/wireless/scan.c
> ===================================================================
> --- compat-wireless-2009-09-05.orig/net/wireless/scan.c
> +++ compat-wireless-2009-09-05/net/wireless/scan.c
> @@ -499,8 +499,10 @@ cfg80211_inform_bss(struct wiphy *wiphy,
>  
>  	kref_init(&res->ref);
>  
> +#if (LINUX_VERSION_CODE > KERNEL_VERSION(2,6,30))
>  	/* cfg80211_bss_update() eats up res - we ensure we free it there */
>  	kmemleak_ignore(res);
> +#endif
>  
>  	res = cfg80211_bss_update(wiphy_to_dev(wiphy), res, 0);
>  	if (!res)

Hauke




* Re: Massive packet loss with ath9k, AR9280, hostapd in 802.11n mode
From: Rene Mayrhofer @ 2009-10-11 19:13 UTC (permalink / raw)
  To: Holger Schurig; +Cc: Bob Copeland, linux-wireless, leitner
In-Reply-To: <200909280956.23887.hs4233@mail.mn-solutions.de>

Hi everybody,
 
>> [ 1698.498801] ath: EEPROM regdomain: 0x0 
>> 
>> Does this indicate that the EEPROM is locked to country code
>> 0x0 (whatever that is, probably US)? "iw reg" doesn't seem to
>> change anything: 
> 
> Yep, that's the case.
> 
> However, "iw XXX reg set XX" should *STILL* change some things, 
> so I guess that crda/regdb isn't still correctly installed. And 
> you should still see something in your "dmesg" output. Hey, 
> even "COUNTRY=AT crda" should change/produce something, e.g. 
> check "dmesg" and "iw list".

After installing Ubuntu's wireless-crda package (which includes crda, the
regulatory.bin, and the udev rule), things started to work. One kernel
module update later, I was able to issue "iw reg set AT" manually and via
hostapd and it seems to do something - "iw reg get" now returns "country
98". This should now let me use channel 40 (5GHz band), which I will try
later on this week.
 
> Oh, and what I'm not yet getting: how can a wrong country setting 
> lead to so much packet-loss?

I can now confirm that the country setting was only a side issue that
prevented me from switching to the 5GHz band, but had nothing to do with the
massive packet loss issue. With or without the correct country setting, the
packet loss happened.


However, a (very) recent update seems to have solved this problem. While (if
I remember correctly) compat-wireless-2009-09-25 still caused the packet
loss, compat-wireless-2009-10-09 seems to be stable. With kernel 2.6.30.9
and compat-wireless-2009-10-09, scripts/driver-select set to "ath" and
loading the updated modules, I have now had a stable (nearly
packet-loss-free) connection for nearly 2h. This is a new record ;-) 
Unfortunately, the connection to a Linksys WUSB600N v2 in client mode at
about 1m distance from the miniPCI Atheros AR9280 with two antennas in
access point mode is still limited to a maximum of roughly 30MBit/s of
actual transfer rate as measured with iperf (the Windows client with Linksys
rt2870 drivers reports a connection speed of 270 to 300MBit/s). hostapd is
set to 

channel=6
ieee80211n=1
ht_capab=[HT40-][SHORT-GI-40]
wpa=3
wpa_passphrase=long...
wpa_key_mgmt=WPA-PSK
wpa_pairwise=CCMP
rsn_pairwise=CCMP
wme_enabled=1
#
# Low priority / AC_BK = background
wme_ac_bk_cwmin=4
wme_ac_bk_cwmax=10
wme_ac_bk_aifs=7
wme_ac_bk_txop_limit=0
wme_ac_bk_acm=0
# Note: for IEEE 802.11b mode: cWmin=5 cWmax=10
#
# Normal priority / AC_BE = best effort
wme_ac_be_aifs=3
wme_ac_be_cwmin=4
wme_ac_be_cwmax=10
wme_ac_be_txop_limit=0
wme_ac_be_acm=0
# Note: for IEEE 802.11b mode: cWmin=5 cWmax=7
#
# High priority / AC_VI = video
wme_ac_vi_aifs=2
wme_ac_vi_cwmin=3
wme_ac_vi_cwmax=4
wme_ac_vi_txop_limit=94
wme_ac_vi_acm=0
# Note: for IEEE 802.11b mode: cWmin=4 cWmax=5 txop_limit=188
#
# Highest priority / AC_VO = voice
wme_ac_vo_aifs=2
wme_ac_vo_cwmin=2
wme_ac_vo_cwmax=3
wme_ac_vo_txop_limit=47
wme_ac_vo_acm=0
# Note: for IEEE 802.11b mode: cWmin=3 cWmax=4 burst=102
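
For readers decoding the WMM values above: as far as I understand the
hostapd.conf documentation, cwmin/cwmax are given in exponent form
(the contention window actually used is (2^n)-1 slots) and txop_limit
is given in units of 32 microseconds. A minimal sketch of that
arithmetic follows; the helper names cw_slots() and txop_usec() are
hypothetical and not part of hostapd.

```c
#include <assert.h>

/* Contention window in slots from the exponent form: (2^n) - 1. */
static unsigned int cw_slots(unsigned int exponent)
{
	assert(exponent < 32);
	return (1u << exponent) - 1u;
}

/* TXOP limit in microseconds, assuming units of 32 us. */
static unsigned int txop_usec(unsigned int units)
{
	return units * 32u;
}
```

With the settings above, best effort (cwmin=4, cwmax=10) would
contend over 15 to 1023 slots, while voice (cwmin=2, cwmax=3) uses 3
to 7 slots with a TXOP of 47 * 32 = 1504 us.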

Should the current ath9k code in access point mode be able to provide higher
throughput? The Linksys WUSB600N v2 is, according to multiple reviews,
capable of much more (which is the reason why I bought it as a test client).

On a side note, "rmmod cfg80211" causes a kernel BUG with subsequent network
subsystem instability independently of the module version (upstream
2.6.30.9, compat-wireless-2009-09-25 and 2009-10-09).

best regards,
Rene




* Re: [RFC] p54pci: skb_over_panic, soft lockup, stall under flood
From: Christian Lamparter @ 2009-10-11 19:41 UTC (permalink / raw)
  To: Larry Finger; +Cc: Quintin Pitts, John Linville, linux-wireless
In-Reply-To: <4AD1FA5E.1010201@lwfinger.net>

On Sunday 11 October 2009 17:31:42 Larry Finger wrote:
> > In trying to get p54pci driver to be stable on my platform and hardware
> > - here is a generic patch that seems to accomplish that.  Since the
> > ViewSonic V210 uses the IT8152 pci bridge - some attention was needed to
> > get dma related allocation in the first physical 64M.  I have verified
> > that the dma related allocation is in the first 64M and dmabounce is not
> > being used - just for those wondering if that was part of the problems.

it8152 was an important bit:

http://lkml.indiana.edu/hypermail/linux/kernel/0702.1/0645.html

the commit sparked a discussion about it8152 pci reliability.
It doesn't look good:

(commit author):
"I have no idea if it's possible to get a reliable PCI bus or
not with this chip. Right now, we only use it for it's built-in OHCI
USB host controller and UART. You're making me hope I never have to
use it for interfacing a PCI card!"

( http://lkml.indiana.edu/hypermail/linux/kernel/0702.1/1907.html )

[...]

"Well on the system on a board we were trying, using the development
baseboard from the same supplier, by simply doing a ping flood through
the onboard rtl8139 I managed to get corrupted ethernet packets fairly
frequently. "
( http://lkml.indiana.edu/hypermail/linux/kernel/0702.1/1917.html )

the sad conclusion is that: no matter what fixes you throw
at the driver (BTW: isl38xx isn't 100% pci v2.1 compliant either)
your chances of getting the device (with the softmac fw)
working properly with that board are next to... nil,
unless you can magically fix the pci-bridge issues.

Regards,
	Chr


* Re: [PATCH] compat-wireless: Fix the bleeding-edge version to build on 2.6.27
From: Larry Finger @ 2009-10-11 21:31 UTC (permalink / raw)
  To: Hauke Mehrtens; +Cc: lrodriguez, linux-wireless
In-Reply-To: <4AD22B39.3070006@hauke-m.de>

On 10/11/2009 02:00 PM, Hauke Mehrtens wrote:
> Larry Finger wrote:
>> When building the bleeding-edge compat-wireless for kernel 2.6.27,
>> several compilation errors were detected.
>>
>> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
>> ---
>>
>> Luis,
>>
>> I checked these patches on 2.6.27 and 2.6.31, but not for the intermediate
>> releases.
>>
>> Larry
>> ---
>>
>> Index: compat-wireless-2009-09-05/include/net/compat-2.6.28.h
>> ===================================================================
>> --- compat-wireless-2009-09-05.orig/include/net/compat-2.6.28.h
>> +++ compat-wireless-2009-09-05/include/net/compat-2.6.28.h
>> @@ -149,6 +149,7 @@ static inline void skb_queue_splice_tail
>>  struct module;
>>  struct tracepoint;
>>  
>> +#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,28))
>>  struct tracepoint {
>>  	const char *name;		/* Tracepoint name */
>>  	int state;			/* State. */
>> @@ -159,6 +160,7 @@ struct tracepoint {
>>  					 * align these on the structure size.
>>  					 * Keep in sync with vmlinux.lds.h.
>>  					 */
>> +#endif
>>  
>>  #ifndef DECLARE_TRACE
>>  
>> @@ -179,13 +181,17 @@ struct tracepoint {
>>  		return -ENOSYS;						\
>>  	}
>>  
>> +#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,28))
>>  #define DEFINE_TRACE(name)
>> +#endif
>>  #define EXPORT_TRACEPOINT_SYMBOL_GPL(name)
>>  #define EXPORT_TRACEPOINT_SYMBOL(name)
>>  
>> +#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,28))
>>  static inline void tracepoint_update_probe_range(struct tracepoint *begin,
>>  	struct tracepoint *end)
>>  { }
>> +#endif
>>  
>>  #endif
>>  
> 
> LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,28) cannot be true in
> compat-2.6.28.h. The definitions are no longer needed in compat-wireless.
> Removing them does not break compilation against mainline kernels 2.6.25
> to 2.6.32.

Well... I was using the openSUSE 2.6.27 kernel, and it broke without
this statement, as did the compilation for the user on the openSUSE
forums!

>> Index: compat-wireless-2009-09-05/net/wireless/compat-2.6.28.c
>> ===================================================================
>> --- compat-wireless-2009-09-05.orig/net/wireless/compat-2.6.28.c
>> +++ compat-wireless-2009-09-05/net/wireless/compat-2.6.28.c
>> @@ -260,6 +260,7 @@ static unsigned long round_jiffies_commo
>>  	return j;
>>  }
>>  
>> +#if 0
>>  /**
>>   * round_jiffies_up - function to round jiffies up to a full second
>>   * @j: the time in (absolute) jiffies that should be rounded
>> @@ -274,5 +275,6 @@ unsigned long round_jiffies_up(unsigned
>>  	return round_jiffies_common(j, raw_smp_processor_id(), true);
>>  }
>>  EXPORT_SYMBOL_GPL(round_jiffies_up);
>> +#endif
>>  
>>  #endif /* LINUX_VERSION_CODE < KERNEL_VERSION(2,6,28) */
> 
> The mainline kernel 2.6.27 does not contain round_jiffies_up. Are you
> using SUSE? SUSE adds some extra extensions to the kernel, so we need
> another way to deactivate the round_jiffies_up export. With this patch
> the symbol is missing when compiling against mainline kernels <=
> 2.6.27. Another user reported a problem with the SUSE kernel in:
> http://marc.info/?l=linux-wireless&m=125393384728475

Yes, the openSUSE patched kernel sources for 2.6.27.

Larry


* Re: 2.6.31.[12] ath5k regression
From: Richard Zidlicky @ 2009-10-11 22:00 UTC (permalink / raw)
  To: Bob Copeland; +Cc: linux-wireless
In-Reply-To: <20091011133010.GA21494@hash.localnet>

On Sun, Oct 11, 2009 at 09:30:10AM -0400, Bob Copeland wrote:
> On Sun, Oct 11, 2009 at 02:26:16PM +0200, Richard Zidlicky wrote:
> > thanks, compiling it right now. Not quite sure - which version of this
> > > > -       ret = ath5k_hw_reset(ah, sc->opmode, sc->curchan, true);
> > > > +       ret = ath5k_hw_reset(ah, sc->opmode, sc->curchan, chan != NULL);
> > 
> > is it supposed to be tested with?
> 
> The "chan != NULL" case.  The patch should apply against latest
> wireless-testing (but will probably work with linus-2.6).

So the results are the same as before. The printk message appeared only once;
I will try to gather more debug info tomorrow.

It is striking that after running the non-working driver, the machine requires
a power-off reboot before ath5k works again.

Richard


* 2.6.32-rc4: Reported regressions from 2.6.31
From: Rafael J. Wysocki @ 2009-10-11 22:07 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Adrian Bunk, Andrew Morton, Linus Torvalds, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List, Linux Wireless List, DRI

[Note:
  We started to get more regression reports for 2.6.32-rc (100%+ jump in the
  last 10 days).  Nevertheless, we're still receiving new reports of regressions
  from 2.6.30.]

This message contains a list of some regressions from 2.6.31, for which there
are no fixes in the mainline I know of.  If any of them have been fixed already,
please let me know.

If you know of any other unresolved regressions from 2.6.31, please let me know
as well and I'll add them to the list.  Also, please let me know if any of the
entries below are invalid.

Each entry from the list will be sent additionally in an automatic reply to
this message with CCs to the people involved in reporting and handling the
issue.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2009-10-12       48       31          27
  2009-10-02       22       15           9
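
The "100%+ jump" mentioned in the note can be checked against the two
rows of the table. A minimal sketch of the arithmetic follows; the
helper name percent_increase() is hypothetical and the result is
truncated to a whole percent.

```c
#include <assert.h>

/*
 * Integer percentage increase from old_count to new_count,
 * truncated toward zero. old_count must be positive.
 */
static long percent_increase(long old_count, long new_count)
{
	assert(old_count > 0);
	return (new_count - old_count) * 100 / old_count;
}
```

Applied to the table, Total goes from 22 to 48 (118%), Pending from
15 to 31 (106%) and Unresolved from 9 to 27 (200%).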


Unresolved regressions
----------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14392
Subject		: Touchpad "paste" stops working after suspend to RAM
Submitter	: Carlos R. Mafra <crmafra2@gmail.com>
Date		: 2009-10-11 16:21 (1 days old)
References	: http://marc.info/?l=linux-kernel&m=125527987316493&w=4
Handled-By	: Dmitry Torokhov <dmitry.torokhov@gmail.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14390
Subject		: "bind" a device to a driver doesn't not work anymore
Submitter	: Éric Piel <Eric.Piel@tremplin-utc.net>
Date		: 2009-10-11 0:04 (1 days old)
References	: http://marc.info/?l=linux-kernel&m=125521979921241&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14389
Subject		: Build system issue
Submitter	: Peter Zijlstra <peterz@infradead.org>
Date		: 2009-10-09 8:58 (3 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=575543347b5baed0ca927cb90ba8807396fe9cc9
References	: http://marc.info/?l=linux-kernel&m=125507914909152&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14387
Subject		: deadlock with fallocate
Submitter	: Thomas Neumann <tneumann@users.sourceforge.net>
Date		: 2009-10-07 3:00 (5 days old)
References	: http://marc.info/?l=linux-kernel&m=125488495526471&w=4
Handled-By	: Christoph Hellwig <hch@lst.de>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14386
Subject		: GPF in snd_hda_intel
Submitter	: Luca Tettamanti <kronos.it@gmail.com>
Date		: 2009-10-10 13:01 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=125517999408019&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14384
Subject		: tbench regression with 2.6.32-rc1
Submitter	: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
Date		: 2009-10-09 9:51 (3 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=59abf02644c45f1591e1374ee7bb45dc757fcb88
References	: http://marc.info/?l=linux-kernel&m=125508216713138&w=4
Handled-By	: Peter Zijlstra <a.p.zijlstra@chello.nl>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14383
Subject		: hackbench regression with kernel 2.6.32-rc1
Submitter	: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
Date		: 2009-10-09 9:19 (3 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=29cd8bae396583a2ee9a3340db8c5102acf9f6fd
References	: http://marc.info/?l=linux-kernel&m=125508007510274&w=4
Handled-By	: Peter Zijlstra <a.p.zijlstra@chello.nl>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14381
Subject		: iwlagn lost connection after s2ram (with warnings)
Submitter	: Carlos R. Mafra <crmafra2@gmail.com>
Date		: 2009-10-07 14:20 (5 days old)
References	: http://marc.info/?l=linux-kernel&m=125492569119947&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14380
Subject		: Video tearing/glitching with T400 laptops
Submitter	: Theodore Ts'o <tytso@mit.edu>
Date		: 2009-10-02 22:40 (10 days old)
References	: http://marc.info/?l=linux-kernel&m=125452324520623&w=4
Handled-By	: Jesse Barnes <jbarnes@virtuousgeek.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14379
Subject		: ACPI Warning for _SB_.BAT0._BIF: Converted Buffer to expected String
Submitter	: Justin Mattock <justinmattock@gmail.com>
Date		: 2009-10-08 21:46 (4 days old)
References	: http://marc.info/?l=linux-kernel&m=125504031328941&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14378
Subject		: Problems with net/core/skbuff.c
Submitter	: Massimo Cetra <mcetra@navynet.it>
Date		: 2009-10-08 14:51 (4 days old)
References	: http://marc.info/?l=linux-kernel&m=125501488220358&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14376
Subject		: Kernel NULL pointer dereference/ kvm subsystem
Submitter	: Don Dupuis <dondster@gmail.com>
Date		: 2009-10-06 14:38 (6 days old)
References	: http://marc.info/?l=linux-kernel&m=125484025021737&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14374
Subject		: MCEs caused by commit db8be50c4307dac2b37305fc59c8dc0f978d09ea
Submitter	: Nick Piggin <npiggin@suse.de>
Date		: 2009-10-02 7:34 (10 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=db8be50c4307dac2b37305fc59c8dc0f978d09ea
References	: http://marc.info/?l=linux-kernel&m=125446885705223&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14373
Subject		: Task blocked for more than 120 seconds
Submitter	: Zeno Davatz <zdavatz@gmail.com>
Date		: 2009-10-02 10:16 (10 days old)
References	: http://marc.info/?l=linux-kernel&m=125447858618412&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14372
Subject		: Wireless not working after suspend-resume
Submitter	: Fabio Comolli <fabio.comolli@gmail.com>
Date		: 2009-10-03 15:36 (9 days old)
References	: http://lkml.org/lkml/2009/10/3/91


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14370
Subject		: ext4 corruptions
Submitter	: Alexey Fisher <bug-track@fisher-privat.net>
Date		: 2009-10-09 19:20 (3 days old)
References	: http://marc.info/?l=linux-kernel&m=125511643504864&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14355
Subject		: USB serial regression after 2.6.31.1 with Huawei E169 GSM modem
Submitter	: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date		: 2009-10-10 03:07 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=125513456327542&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14354
Subject		: Bad corruption with 2.6.32-rc1 and upwards
Submitter	: Holger Freyther <zecke@selfish.org>
Date		: 2009-10-09 15:42 (3 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14353
Subject		: BUG: sleeping function called from invalid context at kernel/mutex.c:280
Submitter	: Miles Lane <miles.lane@gmail.com>
Date		: 2009-10-05 3:39 (7 days old)
References	: http://marc.info/?l=linux-kernel&m=125471432208671&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14352
Subject		: WARNING: at net/mac80211/scan.c:267
Submitter	: Maciej Rutecki <maciej.rutecki@gmail.com>
Date		: 2009-10-08 00:30 (4 days old)
References	: http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2089#c7


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14334
Subject		: pcmcia suspend regression from 2.6.31.1 to 2.6.31.2 - Dell Inspiron 600m
Submitter	: Jose Marino <braket@hotmail.com>
Date		: 2009-10-06 15:44 (6 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14299
Subject		: oops in wireless, iwl3945 related?
Submitter	: Pavel Machek <pavel@ucw.cz>
Date		: 2009-09-29 17:12 (13 days old)
References	: http://marc.info/?l=linux-kernel&m=125424439725743&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14298
Subject		: warning at manage.c:361 (set_irq_wake), matrix-keypad related?
Submitter	: Pavel Machek <pavel@ucw.cz>
Date		: 2009-09-30 20:07 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=125434130703538&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14297
Subject		: console resume broken since ba15ab0e8d
Submitter	: Sascha Hauer <s.hauer@pengutronix.de>
Date		: 2009-09-30 15:11 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=125432349404060&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14296
Subject		: spitz boots but suspend/resume is broken
Submitter	: Pavel Machek <pavel@ucw.cz>
Date		: 2009-09-30 12:06 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=125431244516449&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14279
Subject		: Suspend to RAM freeze totally since 2.6.32-rc1 - Acer Aspire 1511Lmi laptop
Submitter	: Christian Casteyde <casteyde.christian@free.fr>
Date		: 2009-09-30 18:14 (12 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5f68563996e812f9ca35b3939ad2a42e5d254d66


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14277
Subject		: Caught 8-bit read from freed memory in b43 driver at association
Submitter	: Christian Casteyde <casteyde.christian@free.fr>
Date		: 2009-09-30 18:06 (12 days old)


Regressions with patches
------------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14382
Subject		: Transmit failure in et131x.
Submitter	: Nick Bowler <nbowler@elliptictech.com>
Date		: 2009-10-08 14:08 (4 days old)
References	: http://marc.info/?l=linux-kernel&m=125501117713744&w=4
Handled-By	: Alan Cox <alan@lxorguk.ukuu.org.uk>
Patch		: http://patchwork.kernel.org/patch/52698/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14375
Subject		: Intel(R) I/OAT DMA Engine init failed
Submitter	: Alexander Beregalov <a.beregalov@gmail.com>
Date		: 2009-10-02 9:46 (10 days old)
References	: http://marc.info/?l=linux-kernel&m=125447680016160&w=4
Handled-By	: Dan Williams <dan.j.williams@intel.com>
Patch		: http://patchwork.kernel.org/patch/51808/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14302
Subject		: Kernel panic on i386 machine when booting with profile=2
Submitter	: Shi, Alex <alex.shi@intel.com>
Date		: 2009-10-01 3:23 (11 days old)
References	: http://marc.info/?l=linux-kernel&m=125436749607199&w=4
Handled-By	: Alex Shi <alex.shi@intel.com>
Patch		: http://patchwork.kernel.org/patch/50813/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14278
Subject		: New message "NOHZ: local_softirq_pending 08" at each ping request
Submitter	: Christian Casteyde <casteyde.christian@free.fr>
Date		: 2009-09-30 18:12 (12 days old)
Handled-By	: Michael Buesch <mb@bu3sch.de>
Patch		: http://bugzilla.kernel.org/attachment.cgi?id=23220


For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions from 2.6.31,
unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=14230

Please let me know if there are any Bugzilla entries that should be added to
the list in there.

Thanks,
Rafael



* Re: 2.6.31.[12] ath5k regression
From: Bob Copeland @ 2009-10-11 22:23 UTC (permalink / raw)
  To: Richard Zidlicky; +Cc: linux-wireless
In-Reply-To: <20091011220002.GA11603@linux-m68k.org>

On Mon, 12 Oct 2009 00:00:02 +0200, Richard Zidlicky wrote:
> so the results are same like before. The printk message came once only,
> I will try to gather more debug info tomorrow.

Meaning it works the same as with your change (replacing "chan != null"
with "true") or it works the same as mainline, i.e. it fails?

-- 
Bob Copeland %% www.bobcopeland.com




* 2.6.32-rc4: Reported regressions 2.6.30 -> 2.6.31
From: Rafael J. Wysocki @ 2009-10-11 22:41 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Andrew Morton, Linus Torvalds, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List, Linux Wireless List, DRI

[Note:
  10 new reports in the last 10 days, but fortunately we're fixing them faster
  than they're being reported.]

This message contains a list of some regressions introduced between 2.6.30 and
2.6.31, for which there are no fixes in the mainline I know of.  If any of them
have been fixed already, please let me know.

If you know of any other unresolved regressions introduced between 2.6.30
and 2.6.31, please let me know as well and I'll add them to the list.
Also, please let me know if any of the entries below are invalid.

Each entry from the list will be sent additionally in an automatic reply to
this message with CCs to the people involved in reporting and handling the
issue.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2009-10-12      161       45          35
  2009-10-02      151       49          42
  2009-09-06      123       34          27
  2009-08-26      108       33          26
  2009-08-20      102       32          29
  2009-08-10       89       27          24
  2009-08-02       76       36          28
  2009-07-27       70       51          43
  2009-07-07       35       25          21
  2009-06-29       22       22          15


Unresolved regressions
----------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14391
Subject		: use after free of struct powernow_k8_data
Submitter	: Michal Schmidt <mschmidt@redhat.com>
Date		: 2009-09-24 14:51 (18 days old)
References	: http://marc.info/?l=linux-kernel&m=125380383515615&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14388
Subject		: keyboard under X with 2.6.31
Submitter	: Frédéric L. W. Meunier <fredlwm@gmail.com>
Date		: 2009-10-07 20:19 (5 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e043e42bdb66885b3ac10d27a01ccb9972e2b0a3
References	: http://marc.info/?l=linux-kernel&m=125494753228217&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14385
Subject		: DMAR regression in 2.6.31 leads to ext4 corruption?
Submitter	: Andy Isaacson <adi@hexapodia.org>
Date		: 2009-10-08 23:56 (4 days old)
References	: http://marc.info/?l=linux-kernel&m=125504643703877&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14377
Subject		: "conservative" cpufreq governor broken
Submitter	: Steven Noonan <steven@uplinklabs.net>
Date		: 2009-10-05 16:32 (7 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f2e21c9610991e95621a81407cdbab881226419b
References	: http://marc.info/?l=linux-kernel&m=125476067108252&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14329
Subject		: Sata disk doesn't wake up after S3 suspend
Submitter	:  <frodone@gmail.com>
Date		: 2009-10-05 22:58 (7 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14309
Subject		: MCA on hp rx8640
Submitter	: Andrew Patterson <andrew.patterson@hp.com>
Date		: 2009-09-29 17:20 (13 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=db8be50c4307dac2b37305fc59c8dc0f978d09ea
References	: http://www.spinics.net/lists/linux-usb/msg22799.html


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14294
Subject		: kernel BUG at drivers/ide/ide-disk.c:187
Submitter	: Santiago Garcia Mantinan <manty@manty.net>
Date		: 2009-09-30 11:05 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=125430926311466&w=4
Handled-By	: David Miller <davem@davemloft.net>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14267
Subject		: Disassociating atheros wlan
Submitter	: Kristoffer Ericson <kristoffer.ericson@gmail.com>
Date		: 2009-09-24 10:16 (18 days old)
References	: http://marc.info/?l=linux-kernel&m=125378723723384&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14266
Subject		: regression in page writeback
Submitter	: Shaohua Li <shaohua.li@intel.com>
Date		: 2009-09-22 5:49 (20 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d7831a0bdf06b9f722b947bb0c205ff7d77cebd8
References	: http://marc.info/?l=linux-kernel&m=125359858117176&w=4
Handled-By	: Wu Fengguang <fengguang.wu@intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14265
Subject		: ifconfig: page allocation failure. order:5, mode:0x8020 w/ e100
Submitter	: Karol Lewandowski <karol.k.lewandowski@gmail.com>
Date		: 2009-09-15 12:05 (27 days old)
References	: http://marc.info/?l=linux-kernel&m=125301636509517&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14264
Subject		: ehci problem - mouse dead on scroll
Submitter	: Volker Armin Hemmann <volkerarmin@googlemail.com>
Date		: 2009-09-12 7:46 (30 days old)
References	: http://marc.info/?l=linux-kernel&m=125274202707893&w=4
Handled-By	: Alan Stern <stern@rowland.harvard.edu>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14257
Subject		: Not able to boot on 32 bit System
Submitter	: Rishikesh <risrajak@linux.vnet.ibm.com>
Date		: 2009-09-21 15:25 (21 days old)
References	: http://marc.info/?l=linux-kernel&m=125354604314412&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14256
Subject		: kernel BUG at fs/ext3/super.c:435
Submitter	: Mikael Pettersson <mikpe@it.uu.se>
Date		: 2009-09-21 7:29 (21 days old)
References	: http://marc.info/?l=linux-kernel&m=125351816109264&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14252
Subject		: WARNING: at include/linux/skbuff.h:1382 w/ e1000
Submitter	: Stephan von Krawczynski <skraw@ithnet.com>
Date		: 2009-09-20 11:26 (22 days old)
References	: http://marc.info/?l=linux-kernel&m=125344599006033&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14249
Subject		: BUG: oops in gss_validate on 2.6.31
Submitter	: Bastian Blank <bastian@waldi.eu.org>
Date		: 2009-09-16 10:29 (26 days old)
References	: http://marc.info/?l=linux-kernel&m=125309700417283&w=4
Handled-By	: Trond Myklebust <trond.myklebust@fys.uio.no>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14248
Subject		: 2.6.31 wireless: WARNING: at net/wireless/ibss.c:34
Submitter	: Jurriaan <thunder8@xs4all.nl>
Date		: 2009-09-13 7:32 (29 days old)
References	: http://marc.info/?l=linux-kernel&m=125282721113553&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14204
Subject		: MCE prevent booting on my computer(pentium iii @500Mhz)
Submitter	: GNUtoo <GNUtoo@no-log.org>
Date		: 2009-09-21 20:36 (21 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14185
Subject		: Oops in driversbasefirmware_class
Submitter	:  <lars_ericsson@telia.com>
Date		: 2009-09-17 05:09 (25 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6e03a201bbe8137487f340d26aa662110e324b20


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14181
Subject		: b43 causes panic at ifconfig down / shutdown
Submitter	: Jeremy Huddleston <jeremyhu@freedesktop.org>
Date		: 2009-09-15 18:34 (27 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14157
Subject		: end_request: I/O error, dev cciss/cXdX, sector 0
Submitter	:  <jiri.harcarik@gmail.com>
Date		: 2009-09-11 07:42 (31 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14143
Subject		: OOPS when setting nr_requests for md devices
Submitter	: aCaB <acab@clamav.net>
Date		: 2009-09-08 08:48 (34 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14141
Subject		: order 2 page allocation failures in iwlagn
Submitter	: Frans Pop <elendil@planet.nl>
Date		: 2009-09-06 7:40 (36 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2ff05b2b4eac2e63d345fc731ea151a060247f53
References	: http://marc.info/?l=linux-kernel&m=125222287419691&w=4
		  http://lkml.org/lkml/2009/10/2/86
		  http://lkml.org/lkml/2009/10/5/24
Handled-By	: Pekka Enberg <penberg@cs.helsinki.fi>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14114
Subject		: Tuning a saa7134 based card is broken in kernel 2.6.31-rc7
Submitter	: Tsvety Petrov <Tsvetoslav.Petrov@itron.com>
Date		: 2009-09-03 21:06 (39 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14090
Subject		: WARNING: at fs/notify/inotify/inotify_user.c:394
Submitter	: Joerg Platte <bugzilla@jako.ping.de>
Date		: 2009-08-30 15:21 (43 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14070
Subject		: lockdep warning triggered by dup_fd
Submitter	: Bart Van Assche <bart.vanassche@gmail.com>
Date		: 2009-08-23 09:36 (50 days old)
References	: http://lkml.org/lkml/2009/8/23/8


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14058
Subject		: Oops in fsnotify
Submitter	: Grant Wilson <grant.wilson@zen.co.uk>
Date		: 2009-08-20 15:48 (53 days old)
References	: http://marc.info/?l=linux-kernel&m=125078450923133&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14013
Subject		: hd don't show up
Submitter	: Tim Blechmann <tim@klingt.org>
Date		: 2009-08-14 8:26 (59 days old)
References	: http://marc.info/?l=linux-kernel&m=125023842514480&w=4
Handled-By	: Tejun Heo <tj@kernel.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13987
Subject		: Received NMI interrupt at resume
Submitter	: Christian Casteyde <casteyde.christian@free.fr>
Date		: 2009-08-15 07:55 (58 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13943
Subject		: WARNING: at net/mac80211/mlme.c:2292 with ath5k
Submitter	: Fabio Comolli <fabio.comolli@gmail.com>
Date		: 2009-08-06 20:15 (67 days old)
References	: http://marc.info/?l=linux-kernel&m=124958978600600&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13941
Subject		: x86 Geode issue
Submitter	: Martin-Éric Racine <q-funk@iki.fi>
Date		: 2009-08-03 12:58 (70 days old)
References	: http://marc.info/?l=linux-kernel&m=124930434732481&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13906
Subject		: Huawei E169 GPRS connection causes Ooops
Submitter	: Clemens Eisserer <linuxhippy@gmail.com>
Date		: 2009-08-04 09:02 (69 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13836
Subject		: suspend script fails, related to stdout?
Submitter	: Tomas M. <tmezzadra@gmail.com>
Date		: 2009-07-17 21:24 (87 days old)
References	: http://marc.info/?l=linux-kernel&m=124785853811667&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13809
Subject		: oprofile: possible circular locking dependency detected
Submitter	: Jerome Marchand <jmarchan@redhat.com>
Date		: 2009-07-22 13:35 (82 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13733
Subject		: 2.6.31-rc2: irq 16: nobody cared
Submitter	: Niel Lambrechts <niel.lambrechts@gmail.com>
Date		: 2009-07-06 18:32 (98 days old)
References	: http://marc.info/?l=linux-kernel&m=124690524027166&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13645
Subject		: NULL pointer dereference at (null) (level2_spare_pgt)
Submitter	: poornima nayak <mpnayak@linux.vnet.ibm.com>
Date		: 2009-06-17 17:56 (117 days old)
References	: http://lkml.org/lkml/2009/6/17/194


Regressions with patches
------------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14301
Subject		: WARNING: at net/ipv4/af_inet.c:154
Submitter	: Ralf Hildebrandt <Ralf.Hildebrandt@charite.de>
Date		: 2009-09-30 12:24 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=125431350218137&w=4
Handled-By	: Eric Dumazet <eric.dumazet@gmail.com>
Patch		: http://patchwork.kernel.org/patch/52743/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14275
Subject		: kernel>=2.6.31: ahci.c: do not force unconditionally sb600 to 32bit dma any more?
Submitter	: gabriele balducci <balducci@units.it>
Date		: 2009-09-30 15:02 (12 days old)
Patch		: http://bugzilla.kernel.org/show_bug.cgi?id=14275#c0


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14261
Subject		: e1000e jumbo frames no longer work: 'Unsupported MTU setting'
Submitter	: Nix <nix@esperi.org.uk>
Date		: 2009-09-26 11:16 (16 days old)
References	: http://marc.info/?l=linux-kernel&m=125396433321342&w=4
Handled-By	: Alexander Duyck <alexander.duyck@gmail.com>
Patch		: http://patchwork.kernel.org/patch/50277/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14258
Subject		: Memory leak in SCSI initialization
Submitter	: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Date		: 2009-09-22 4:18 (20 days old)
References	: http://marc.info/?l=linux-kernel&m=125359311312243&w=4
Handled-By	: Michael Ellerman <michael@ellerman.id.au>
		  James Bottomley <James.Bottomley@suse.de>
Patch		: http://patchwork.kernel.org/patch/51412/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14253
Subject		: Oops in driversbasefirmware_class
Submitter	: Lars Ericsson <Lars_Ericsson@telia.com>
Date		: 2009-09-16 20:44 (26 days old)
References	: http://lkml.org/lkml/2009/9/16/461
Handled-By	: Frederik Deweerdt <frederik.deweerdt@xprog.eu>
Patch		: http://patchwork.kernel.org/patch/49914/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14137
Subject		: usb console regressions
Submitter	: Jason Wessel <jason.wessel@windriver.com>
Date		: 2009-09-05 21:08 (37 days old)
References	: http://marc.info/?l=linux-kernel&m=125218501310512&w=4
Handled-By	: Jason Wessel <jason.wessel@windriver.com>
Patch		: http://patchwork.kernel.org/patch/45953/
		  http://patchwork.kernel.org/patch/45952/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14129
Subject		: 2.6.31 regression - pci_get_slot oops, udev boot hang - toshiba X200
Submitter	: chepioq <chepioq@gmail.com>
Date		: 2009-09-06 07:01 (36 days old)
Handled-By	: Alex Chiang <achiang@hp.com>
		  Rafael J. Wysocki <rjw@sisk.pl>
Patch		: http://patchwork.kernel.org/patch/51834/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14017
Subject		: _end symbol missing from Symbol.map
Submitter	: Hannes Reinecke <hare@suse.de>
Date		: 2009-08-13 6:45 (60 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=091e52c3551d3031343df24b573b770b4c6c72b6
References	: http://marc.info/?l=linux-kernel&m=125014649102253&w=4
Handled-By	: Hannes Reinecke <hare@suse.de>
Patch		: http://marc.info/?l=linux-kernel&m=125014649102253&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13948
Subject		: ath5k broken after suspend-to-ram
Submitter	: Johannes Stezenbach <js@sig21.net>
Date		: 2009-08-07 21:51 (66 days old)
References	: http://marc.info/?l=linux-kernel&m=124968192727854&w=4
Handled-By	: Nick Kossifidis <mickflemm@gmail.com>
Patch		: http://patchwork.kernel.org/patch/38550/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13940
Subject		: 2.6.31-rc1 - iwlagn and sky2 stopped working when ACPI enabled - Toshiba U400-17b, Acer Aspire 8935G
Submitter	: Ricardo Jorge da Fonseca Marques Ferreira <storm@sys49152.net>
Date		: 2009-08-07 22:33 (66 days old)
References	: http://marc.info/?l=linux-kernel&m=124968457731107&w=4
Handled-By	: Len Brown <lenb@kernel.org>
Patch		: http://bugzilla.kernel.org/attachment.cgi?id=23280


For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions introduced
between 2.6.30 and 2.6.31, unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=13615

Please let me know if there are any Bugzilla entries that should be added to
the list in there.

Thanks,
Rafael



* Re: ath5k AP kernel panic when client uses SCP
From: Bob Copeland @ 2009-10-11 22:47 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: linux-wireless, linux-mips
In-Reply-To: <4AD21AB4.6010208@wpkg.org>

On Sun, Oct 11, 2009 at 1:49 PM, Tomasz Chmielewski <mangoo@wpkg.org> wrote:
> The AP is Asus WL-500gP, it's a MIPS platform, running 2.6.31.1 kernel,
> hostapd v0.6.9.
>
> I can reproduce the issue reliably.
>
> Let me know if you need more info here.

Yes, please -- it would really be helpful if instead of just the addresses
we have all the function names in the stack trace.  Or at least what is
at 8006f058 and 80276190.  We've had a couple of reports of unaligned
accesses but I haven't yet seen a useful stack trace.
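
As a rough illustration of what turning raw addresses into function names
involves, here is a minimal sketch that looks up the nearest preceding symbol
in System.map-style text. The symbol names and the map excerpt are made up for
the example; real names would have to come from the System.map of the exact
kernel build that produced the trace.

```python
import bisect

# Hypothetical System.map excerpt (assumption: not taken from the reporter's
# kernel). Format per line: <hex address> <type> <symbol name>.
SAMPLE_SYSTEM_MAP = """\
8006f000 T example_func_a
8006f100 T example_func_b
80276000 T example_func_c
80280800 T example_func_d
"""


def parse_system_map(text):
    """Return a sorted list of (address, name) pairs from System.map text."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Map an address to 'name+0xoffset' using the nearest preceding symbol."""
    addrs = [addr for addr, _ in symbols]
    i = bisect.bisect_right(addrs, address) - 1
    if i < 0:
        return None  # address lies below the first known symbol
    base, name = symbols[i]
    return f"{name}+{address - base:#x}"


symbols = parse_system_map(SAMPLE_SYSTEM_MAP)
for addr in (0x8006F058, 0x80276190, 0x80280890):
    print(f"{addr:08x} -> {resolve(symbols, addr)}")
```

The lookup only gives a symbol plus offset; it does not replace a kernel built
with symbol resolution enabled, which would print the names directly.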

> [67359.700000] ------------[ cut here ]------------
> [67359.710000] WARNING: at net/core/dev.c:1566 0x80280890()
> [67359.710000] b44: caps=(0x0, 0x0) len=80 data_len=0 ip_summed=1
> [67359.720000] Modules linked in: tun sch_sfq cls_fw sch_htb ipt_MASQUERADE
> iptable_nat nf_nat xt_MARK iptable_mangle ipt_ULOG xt_recent
> nf_conntrack_ipv4 nf_defrag1

Did you replace the wireless device?  I don't see ath5k in the
above list; IIRC that AP is originally some broadcom chipset.

> [67360.080000] Disabling lock debugging due to kernel taint
> [67360.560000] Kernel panic - not syncing: Fatal exception in interrupt

Why's it tainted?  I can't remember and can't check right now if
TAINT_WARN counts.

-- 
Bob Copeland %% www.bobcopeland.com


* Re: 2.6.32-rc4: Reported regressions 2.6.30 -> 2.6.31
From: Larry Finger @ 2009-10-11 23:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Andrew Morton, Linus Torvalds,
	Natalie Protasevich, Kernel Testers List, Network Development,
	Linux ACPI, Linux PM List, Linux SCSI List, Linux Wireless List,
	DRI
In-Reply-To: <56acieJJ2fF.A.nEB.Hzl0KB@chimera>

On 10/11/2009 05:41 PM, Rafael J. Wysocki wrote:
> [Note:
>   10 new reports in the last 10 days, but fortunately we're fixing them faster
>   than they're being reported.]

> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14181
> Subject		: b43 causes panic at ifconfig down / shutdown
> Submitter	: Jeremy Huddleston <jeremyhu@freedesktop.org>
> Date		: 2009-09-15 18:34 (27 days old)

A patch to fix this one is in the hands of the OP. It should be tested
within the next couple of days.

Larry


* Re: [RFC] p54pci: skb_over_panic, soft lockup, stall under flood
From: Quintin Pitts @ 2009-10-12  0:09 UTC (permalink / raw)
  To: Larry Finger; +Cc: John Linville, linux-wireless, Christian Lamparter
In-Reply-To: <4AD1FA5E.1010201@lwfinger.net>

On Sun Oct 11 2009 10:31:42 GMT-0500 (CDT), Larry Finger wrote:
> On 10/11/2009 09:28 AM, Quintin Pitts wrote:
>> Hi,
>>
>> Sorry for my lack of experience in all aspects - first time
>> submitting!!!
> 
> Everyone that goes through this "rite of passage" gets somewhat
> discouraged by the response. My advice is to hang in.
> 
> My first advice is for you to run every submitted patch through the
> check at scripts/checkpatch.pl. This one shows 95 errors and 7
> warnings in 136 lines. Most of the errors are due to "DOS line
> endings". We really hate carriage returns - a really useless occupier
> of space unless it is _NOT_ followed by \n!

Thanks for the advice!
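
For anyone else hitting the "DOS line endings" complaint, a quick pre-check
before running scripts/checkpatch.pl could look like the sketch below. It only
reports and strips CRLF endings in a patch held in memory; the sample patch
text and the file name in it are invented for illustration, and the helper
names are not part of any kernel tooling.

```python
def find_crlf_lines(data: bytes):
    """Return 1-based numbers of lines that end in CRLF rather than LF."""
    bad = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        if line.endswith(b"\r"):
            bad.append(number)
    return bad


def strip_crlf(data: bytes) -> bytes:
    """Convert CRLF line endings to plain LF; lone CR bytes are left alone."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical patch fragment: two CRLF lines followed by one LF line.
sample = b"--- a/example.c\r\n+++ b/example.c\r\n+\tint x;\n"

print("CRLF on lines:", find_crlf_lines(sample))
print("after cleanup:", find_crlf_lines(strip_crlf(sample)))
```

Working on bytes rather than text avoids any newline translation by the
runtime, so the check sees exactly what the mailer would send.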
 

> As I understand it, this patch is to fix the driver to work around
> firmware errors. If that is correct, please state that clearly. If
> only partially correct, then indicate which parts are to fix firmware
> errors, and which are to fix driver errors. Has your analysis included
> thinking about where the driver might delay to avoid firmware problems?

I think Christian has hit the nail on the head.  Mostly flaky hardware
or implementation (it8152 pci bridge) when pushed.

> 
> I will let Christian comment more on the technical merits of the patch
> as he understands the device much better than I do.
> 
> Larry

Thanks,

Quintin.


* Re: [RFC] p54pci: skb_over_panic, soft lockup, stall under flood
From: Quintin Pitts @ 2009-10-12  0:26 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: Larry Finger, John Linville, linux-wireless
In-Reply-To: <200910112141.01747.chunkeey@googlemail.com>

On Sun Oct 11 2009 14:41:01 GMT-0500 (CDT), Christian Lamparter wrote:
> On Sunday 11 October 2009 17:31:42 Larry Finger wrote:
>>> In trying to get p54pci driver to be stable on my platform and hardware
>>> - here is a generic patch that seems to accomplish that.  Since the
>>> ViewSonic V210 uses the IT8152 pci bridge - some attention was needed to
>>> get dma related allocation in the first physical 64M.  I have verified
>>> that the dma related allocation is in the first 64M and dmabounce is not
>>> being used - just for those wondering if that was part of the problems.
> 
> it8152 was an important bit:
> 
> http://lkml.indiana.edu/hypermail/linux/kernel/0702.1/0645.html
> 
> the commit sparked a discussion about it8152 pci reliability.
> It doesn't look good:
> 
> (commit author):
> "I have no idea if it's possible to get a reliable PCI bus or
> not with this chip. Right now, we only use it for its built-in OHCI
> USB host controller and UART. You're making me hope I never have to
> use it for interfacing a PCI card!"
> 
> ( http://lkml.indiana.edu/hypermail/linux/kernel/0702.1/1907.html )
> 
> [...]
> 
> "Well on the system on a board we were trying, using the development
> baseboard from the same supplier, by simply doing a ping flood through
> the onboard rtl8139 I managed to get corrupted ethernet packets fairly
> frequently. "
> ( http://lkml.indiana.edu/hypermail/linux/kernel/0702.1/1917.html )
> 
> the sad conclusion is that: no matter what fixes you throw
> at the driver (BTW: isl38xx isn't 100% pci v2.1 compliant either)
> your chances of getting the device (with the softmac fw)
> working properly with that board are next to... not,
> unless you can magically fix the pci-bridge issues.

I feared as much.

At least that patch has given me hope.  It survives an iperf and ping
flood now without going belly up.  The device has been up for three days
now with the network still working, though glitches will likely show up.

iperf tests at about 9.6 to 15 Mbits/sec on an 802.11g WPA network, which is
about 2 times faster than what I had under WinCE.

web browsing and ssh and scp all work great at the moment.

Time will tell...

Thanks much!

Quintin.



* Re: [PATCH] b43: fix ieee80211_rx() context
From: David Miller @ 2009-10-12  3:08 UTC (permalink / raw)
  To: kalle.valo; +Cc: johannes, linville, hidave.darkstar, linux-wireless
In-Reply-To: <87y6nhoqud.fsf@purkki.valot.fi>

From: Kalle Valo <kalle.valo@iki.fi>
Date: Sun, 11 Oct 2009 19:08:58 +0300

> Johannes Berg <johannes@sipsolutions.net> writes:
> 
>>> > +	local_bh_disable();
>>> >  	ieee80211_rx(dev->wl->hw, skb);
>>> > +	local_bh_enable();
>>> 
>>> This is a bit awkward from drivers' point of view, we have to add the
>>> same code to all mac80211 drivers using either SPI or SDIO buses.
>>> 
>>> What about adding a new inline function ieee80211_rx_ni() which would
>>> disable bottom halves like above and call ieee80211_rx()? IMHO that's
>>> easier for the driver developers to understand and also easier to
>>> document ("use this function when calling from process context"). If
>>> this is acceptable, I can create a patch.
>>
>> I really don't see the point, since it's just three lines of code, but I
>> wouldn't mind all that much either.
> 
> My worry are the developers who even don't know what is a bottom half
> and might get it all wrong. (Yes, there really are such people.)

And the difference between this and knowing you need to call the
ieee80211_rx_ni() thing is?

You have to know what the heck a bottom half is to even know that you
would need to call the ieee80211_rx_ni() thing.

And that's the same amount of knowledge necessary to simply wrap the
thing in a BH disable/enable sequence.
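
For reference, the helper under discussion would amount to roughly the
sketch below. Only the three-line body of ieee80211_rx_ni() comes from the
quoted patch and proposal; the struct definitions, the bh_disable_depth
counter and the stub ieee80211_rx() are userspace stand-ins I am assuming
purely so the example compiles outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins (assumptions), not the real kernel types. */
struct ieee80211_hw { int rx_count; };
struct sk_buff { size_t len; };

/* Models the softirq-disable nesting count; hypothetical. */
static int bh_disable_depth;

static void local_bh_disable(void) { bh_disable_depth++; }
static void local_bh_enable(void)  { bh_disable_depth--; }

/* Stub receive path: like the real one, it expects BHs to be disabled. */
static void ieee80211_rx(struct ieee80211_hw *hw, struct sk_buff *skb)
{
	(void)skb;
	assert(bh_disable_depth > 0);
	hw->rx_count++;
}

/* Proposed wrapper for callers in process context (e.g. SPI/SDIO drivers). */
static inline void ieee80211_rx_ni(struct ieee80211_hw *hw,
				   struct sk_buff *skb)
{
	local_bh_disable();
	ieee80211_rx(hw, skb);
	local_bh_enable();
}
```

Either way, the caller still has to know it is running in process context
in order to pick the _ni variant over plain ieee80211_rx().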


* 2.6.32-rc3: wifi warning?
From: Andrew Lutomirski @ 2009-10-12  3:22 UTC (permalink / raw)
  To: linux-wireless

Not really sure what happened here, but I think I triggered it by
trying to suspend.  This is 36a07902c2134649c4af7f07980413ffb1a56085
from Linus's tree, which is a little bit after 2.6.32-rc3.

--Andy

[  545.919955] usb 2-6: USB disconnect, address 4
[  548.724525] phy0: device now idle
[  548.735689] phy0: Removed STA 00:11:50:64:93:75
[  548.738313] phy0: Destroyed STA 00:11:50:64:93:75
[  548.738318] wlan0: deauthenticating from 00:11:50:64:93:75 by local choice (reason=3)
[  548.742752] wlan0: deauthenticating from 00:11:50:64:93:75 by local choice (reason=3)
[  548.742772] phy0: device no longer idle - scanning
[  548.753108] phy0: device now idle
[  548.778800] ------------[ cut here ]------------
[  548.778821] WARNING: at net/wireless/core.c:613 wdev_cleanup_work+0x69/0xec [cfg80211]()
[  548.778824] Hardware name: 7465CTO
[  548.778826] Modules linked in: vfat fat usb_storage fuse tp_smapi
thinkpad_ec bridge stp llc bnep sco l2cap bluetooth ip6t_REJECT
nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand
dm_multipath uinput arc4 ecb thinkpad_acpi hwmon
snd_hda_codec_conexant snd_hda_intel snd_hda_codec iwlagn iwlcore
i2400m_usb snd_hwdep snd_pcm snd_timer mac80211 snd i2400m i2c_i801
iTCO_wdt soundcore cfg80211 snd_page_alloc iTCO_vendor_support xts
gf128mul aes_x86_64 aes_generic dm_crypt i915 drm_kms_helper drm
i2c_algo_bit i2c_core video output [last unloaded: microcode]
[  548.778878] Pid: 10, comm: events/1 Not tainted 2.6.32-rc3 #9
[  548.778881] Call Trace:
[  548.778890]  [<ffffffff810567f3>] warn_slowpath_common+0x8d/0xbb
[  548.778895]  [<ffffffff81056848>] warn_slowpath_null+0x27/0x3d
[  548.778902]  [<ffffffffa00ec65c>] wdev_cleanup_work+0x69/0xec [cfg80211]
[  548.778908]  [<ffffffff81071c4d>] worker_thread+0x1d0/0x270
[  548.778915]  [<ffffffffa00ec5f3>] ? wdev_cleanup_work+0x0/0xec [cfg80211]
[  548.778920]  [<ffffffff81076cdf>] ? autoremove_wake_function+0x0/0x5f
[  548.778925]  [<ffffffff81040379>] ? __spin_unlock_irq+0x23/0x3a
[  548.778928]  [<ffffffff81071a7d>] ? worker_thread+0x0/0x270
[  548.778932]  [<ffffffff8107685d>] kthread+0x8e/0x96
[  548.778936]  [<ffffffff8100cfca>] child_rip+0xa/0x20
[  548.778940]  [<ffffffff8100c969>] ? restore_args+0x0/0x30
[  548.778944]  [<ffffffff810767cf>] ? kthread+0x0/0x96
[  548.778947]  [<ffffffff8100cfc0>] ? child_rip+0x0/0x20
[  548.778950] ---[ end trace 6da4e1284e480ee1 ]---


* Re: [PATCH 1/3] iwmc3200top: Add Intel Wireless MultiCom 3200 top driver.
From: David Miller @ 2009-10-12  6:07 UTC (permalink / raw)
  To: eric.dumazet
  Cc: tomasw, linville, netdev, linux-wireless, linux-mmc, yi.zhu,
	inaky.perez-gonzalez, cindy.h.kao, guy.cohen, ron.rindjunsky,
	tomas.winkler, joe
In-Reply-To: <4AD18C06.8040002@gmail.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sun, 11 Oct 2009 09:40:54 +0200

> diff --git a/MAINTAINERS b/MAINTAINERS
> index e1da925..18244ad 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3654,6 +3654,7 @@ NETWORKING [GENERAL]
>  M:	"David S. Miller" <davem@davemloft.net>
>  L:	netdev@vger.kernel.org
>  W:	http://www.linuxfoundation.org/en/Net
> +W:	http://patchwork.ozlabs.org/project/netdev/list/
>  T:	git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.git
>  S:	Maintained
>  F:	net/

I've applied this, thanks Eric.


* Re: deauthentication and disassociation nl80211 commands
From: Jouni Malinen @ 2009-10-12  6:52 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: hostap@lists.shmoo.com, linux-wireless
In-Reply-To: <1254708707.24430.68.camel@maxim-laptop>

On Mon, Oct 05, 2009 at 04:11:47AM +0200, Maxim Levitsky wrote:

> Today kernel explicitly requests the driver to perform both
> disassociation and deauthentication in that order.

I hope that deauthentication alone would be enough since that is
supposed to implicitly first take care of disassociation as far as the
IEEE 802.11 standard is concerned. It should also be noted that "kernel"
here is referring to mac80211; most other drivers/IEEE 802.11 stacks
do not have this type of restriction.

> However, currently wpa_supplicant assumes that once it called
> wpa_drv_disassociate it can again start the complete connect sequence
> from the authentication.

Actually, wpa_supplicant assumes that it can authenticate again at any
point, i.e., even without first calling wpa_drv_disassociate.

> In fact I have carefully studied the code and found that calls to
> wpa_supplicant_deauthenticate (which is the only user of
> wpa_drv_deauthenticate) only happen at deinitialization of wireless
> interface and when wpa_supplicant really has to do it, that is if there
> is a failure (mic failure for example).

Yes, wpa_supplicant tries to follow the operations as defined in IEEE
802.11 and does not unnecessarily deauthenticate. In addition, when
reassociating back to the same AP (e.g., to change some parameters),
there will be no deauthentication/disassociation at all.

> My hacky patch that was rejected on the grounds that it is not right to
> introduce the driver dependent behavior might actually be the correct
> solution. It just makes the wpa_supplicant_disassociate do both
> disassociation and deauthentication, as was always assumed by the
> wpa_supplicant core.

There is no such assumption and the patch is not correct.

> Or kernel should become smarter and do the work for wpa_supplicant.

No, it should not do that either. Rather, it should allow valid IEEE
802.11 operations to be performed (authentication while authenticated is
allowed).

> If mac80211 is already authenticated to the AP that was requested, it
> should just return success.

No. It should start new authentication in that case.

> If it isn't authenticated to new AP then, new authentication should be
> made.
> (and old one can be kept, but removed after a timeout)

That should be done regardless of the current authentication/association
state.

> When do you plan to switch officially the wpa_supplicant to
> driver_nl80211?

To whom is this "you" referring? wpa_supplicant does support both WEXT
and nl80211. It is up to the user (of wpa_supplicant; e.g., NM) to
decide which driver wrapper it wants to use.


As far as working out this issue is concerned, I committed the following
change to wpa_supplicant:
http://w1.fi/gitweb/gitweb.cgi?p=hostap.git;a=commitdiff;h=6d6f4bb87f33278aed133875d0d561eb55d7ae59

I cannot say that I exactly like this due to the required extra code in
events.c, but the part in driver_nl80211.c shows how this type of
driver-specific cases should be handled in general. Anyway, I hope that
this can be eventually removed from wpa_supplicant.

-- 
Jouni Malinen                                            PGP id EFC895FA


* Re: deauthentication and disassociation nl80211 commands
From: Jouni Malinen @ 2009-10-12  6:55 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Maxim Levitsky, hostap@lists.shmoo.com, linux-wireless
In-Reply-To: <1255191866.4095.32.camel@johannes.local>

On Sat, Oct 10, 2009 at 06:24:26PM +0200, Johannes Berg wrote:
> On the other hand, I think Jouni's argument is that you should be able
> to authenticate (force an auth frame exchange) even while authenticated.
> I don't really disagree with that all that much, but I'm not sure how to
> cleanly fit it in. mac80211 would have to reset the auth state without
> sending a deauth.

Yes, this is exactly what I would like to see happening when using
mac80211. For now, I think we can work around the issue in
wpa_supplicant, but eventually, this change in mac80211 would allow the
code in wpa_supplicant to be cleaned up and the need for an extra
deauthentication frame could be removed.
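
To make the desired behavior concrete, here is a minimal sketch of the
state handling, assuming a toy client-side state machine. None of these
names (sta_mlme, mlme_authenticate, and so on) exist in mac80211 or
wpa_supplicant; they are hypothetical and only illustrate that an
authentication request starts a new exchange from any state and resets
the old state locally without a deauthentication frame going out.

```c
#include <stdbool.h>

/* Hypothetical client-side state; illustrative names only. */
enum sta_state { STA_IDLE, STA_AUTHENTICATED, STA_ASSOCIATED };

struct sta_mlme {
	enum sta_state state;
	int auth_frames_sent;
	int deauth_frames_sent;
};

/*
 * Authentication request: always start a fresh exchange, regardless of
 * the current state. Old state is dropped locally; no deauth is sent.
 */
static void mlme_authenticate(struct sta_mlme *m)
{
	m->state = STA_IDLE;           /* local reset, nothing on the air */
	m->auth_frames_sent++;         /* new auth frame exchange */
	m->state = STA_AUTHENTICATED;  /* sketch assumes the exchange succeeds */
}

/* Association is only valid once authenticated. */
static bool mlme_associate(struct sta_mlme *m)
{
	if (m->state == STA_IDLE)
		return false;
	m->state = STA_ASSOCIATED;
	return true;
}

/* Deauthentication implicitly takes care of disassociation as well. */
static void mlme_deauthenticate(struct sta_mlme *m)
{
	if (m->state == STA_IDLE)
		return;
	m->deauth_frames_sent++;
	m->state = STA_IDLE;
}
```

With this shape, calling mlme_authenticate() twice in a row produces two
authentication exchanges and no deauthentication frame in between, which
is the case the wpa_supplicant workaround currently has to cover.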

-- 
Jouni Malinen                                            PGP id EFC895FA


