linux-wireless.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] ath10k: pci fixes 2014-06-05
@ 2014-06-05  7:48 Michal Kazior
  2014-06-05  7:48 ` [PATCH 1/2] ath10k: fix bmi exchange tx/rx race Michal Kazior
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Michal Kazior @ 2014-06-05  7:48 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

Hi,

I've found a few issues while I was testing ath10k
in QEMU/KVM with PCI passthrough. The second patch
is something that I've found while analysing the
bmi race and I don't think anyone experienced
problems associated with it.

There's still a weird synchronization issue with
(what I think is) iomap write propagation. The
initial fw bootup interrupt doesn't come in and
it seems CE interrupts are unmasked with a lag
because explicit polling after ctl_resp timeout
seems to be sufficient to work around it. I
suspect this might be a virtualization problem
rather than ath10k. Ideas, anyone?


Michal Kazior (2):
  ath10k: fix bmi exchange tx/rx race
  ath10k: sanitize tx ring index access properly

 drivers/net/wireless/ath/ath10k/ce.c  | 11 +++++++----
 drivers/net/wireless/ath/ath10k/pci.c | 11 +++--------
 drivers/net/wireless/ath/ath10k/pci.h |  3 ++-
 3 files changed, 12 insertions(+), 13 deletions(-)

-- 
1.8.5.3


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] ath10k: fix bmi exchange tx/rx race
  2014-06-05  7:48 [PATCH 0/2] ath10k: pci fixes 2014-06-05 Michal Kazior
@ 2014-06-05  7:48 ` Michal Kazior
  2014-06-05  7:48 ` [PATCH 2/2] ath10k: sanitize tx ring index access properly Michal Kazior
  2014-07-15  8:22 ` [PATCH 0/2] ath10k: pci fixes 2014-06-05 Kalle Valo
  2 siblings, 0 replies; 5+ messages in thread
From: Michal Kazior @ 2014-06-05  7:48 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

It was possible for tx completion not to be
processed. In that case an old stack pointer was
left on copy engine tx ring. Next bmi exchange
would immediately pop it and use complete() on the
completion struct there causing corruption.

Make sure to wait for both tx and rx completions
properly.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/pci.c | 11 +++--------
 drivers/net/wireless/ath/ath10k/pci.h |  3 ++-
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
index d0004d5..06840d1 100644
--- a/drivers/net/wireless/ath/ath10k/pci.c
+++ b/drivers/net/wireless/ath/ath10k/pci.c
@@ -1362,8 +1362,6 @@ static int ath10k_pci_hif_exchange_bmi_msg(struct ath10k *ar,
 		ath10k_ce_recv_buf_enqueue(ce_rx, &xfer, resp_paddr);
 	}
 
-	init_completion(&xfer.done);
-
 	ret = ath10k_ce_send(ce_tx, &xfer, req_paddr, req_len, -1, 0);
 	if (ret)
 		goto err_resp;
@@ -1414,10 +1412,7 @@ static void ath10k_pci_bmi_send_done(struct ath10k_ce_pipe *ce_state)
 					  &nbytes, &transfer_id))
 		return;
 
-	if (xfer->wait_for_resp)
-		return;
-
-	complete(&xfer->done);
+	xfer->tx_done = true;
 }
 
 static void ath10k_pci_bmi_recv_data(struct ath10k_ce_pipe *ce_state)
@@ -1438,7 +1433,7 @@ static void ath10k_pci_bmi_recv_data(struct ath10k_ce_pipe *ce_state)
 	}
 
 	xfer->resp_len = nbytes;
-	complete(&xfer->done);
+	xfer->rx_done = true;
 }
 
 static int ath10k_pci_bmi_wait(struct ath10k_ce_pipe *tx_pipe,
@@ -1451,7 +1446,7 @@ static int ath10k_pci_bmi_wait(struct ath10k_ce_pipe *tx_pipe,
 		ath10k_pci_bmi_send_done(tx_pipe);
 		ath10k_pci_bmi_recv_data(rx_pipe);
 
-		if (completion_done(&xfer->done))
+		if (xfer->tx_done && (xfer->rx_done == xfer->wait_for_resp))
 			return 0;
 
 		schedule();
diff --git a/drivers/net/wireless/ath/ath10k/pci.h b/drivers/net/wireless/ath/ath10k/pci.h
index dfdebb4..9401292 100644
--- a/drivers/net/wireless/ath/ath10k/pci.h
+++ b/drivers/net/wireless/ath/ath10k/pci.h
@@ -38,7 +38,8 @@
 #define DIAG_TRANSFER_LIMIT 2048
 
 struct bmi_xfer {
-	struct completion done;
+	bool tx_done;
+	bool rx_done;
 	bool wait_for_resp;
 	u32 resp_len;
 };
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] ath10k: sanitize tx ring index access properly
  2014-06-05  7:48 [PATCH 0/2] ath10k: pci fixes 2014-06-05 Michal Kazior
  2014-06-05  7:48 ` [PATCH 1/2] ath10k: fix bmi exchange tx/rx race Michal Kazior
@ 2014-06-05  7:48 ` Michal Kazior
  2014-07-14 13:20   ` Kalle Valo
  2014-07-15  8:22 ` [PATCH 0/2] ath10k: pci fixes 2014-06-05 Kalle Valo
  2 siblings, 1 reply; 5+ messages in thread
From: Michal Kazior @ 2014-06-05  7:48 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

The tx ring index was immediately trimmed with a
bitmask. This discarded the 0xFFFFFFFF error case
(which theoretically can happen when a device is
abruptly disconnected) and led to using an invalid
tx ring index. This could lead to memory
corruption.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/ce.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/ce.c b/drivers/net/wireless/ath/ath10k/ce.c
index d185dc0..7c6c7d5 100644
--- a/drivers/net/wireless/ath/ath10k/ce.c
+++ b/drivers/net/wireless/ath/ath10k/ce.c
@@ -603,16 +603,19 @@ static int ath10k_ce_completed_send_next_nolock(struct ath10k_ce_pipe *ce_state,
 		if (ret)
 			return ret;
 
-		src_ring->hw_index =
-			ath10k_ce_src_ring_read_index_get(ar, ctrl_addr);
-		src_ring->hw_index &= nentries_mask;
+		read_index = ath10k_ce_src_ring_read_index_get(ar, ctrl_addr);
+		if (read_index == 0xFFFFFFFF)
+			return -ENODEV;
+
+		read_index &= nentries_mask;
+		src_ring->hw_index = read_index;
 
 		ath10k_pci_sleep(ar);
 	}
 
 	read_index = src_ring->hw_index;
 
-	if ((read_index == sw_index) || (read_index == 0xffffffff))
+	if (read_index == sw_index)
 		return -EIO;
 
 	sbase = src_ring->shadow_base;
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] ath10k: sanitize tx ring index access properly
  2014-06-05  7:48 ` [PATCH 2/2] ath10k: sanitize tx ring index access properly Michal Kazior
@ 2014-07-14 13:20   ` Kalle Valo
  0 siblings, 0 replies; 5+ messages in thread
From: Kalle Valo @ 2014-07-14 13:20 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k, linux-wireless

Michal Kazior <michal.kazior@tieto.com> writes:

> The tx ring index was immediately trimmed with a
> bitmask. This discarded the 0xFFFFFFFF error case
> (which theoretically can happen when a device is
> abruptly disconnected) and led to using an invalid
> tx ring index. This could lead to memory
> corruption.
>
> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
> ---
>  drivers/net/wireless/ath/ath10k/ce.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath10k/ce.c b/drivers/net/wireless/ath/ath10k/ce.c
> index d185dc0..7c6c7d5 100644
> --- a/drivers/net/wireless/ath/ath10k/ce.c
> +++ b/drivers/net/wireless/ath/ath10k/ce.c
> @@ -603,16 +603,19 @@ static int ath10k_ce_completed_send_next_nolock(struct ath10k_ce_pipe *ce_state,
>  		if (ret)
>  			return ret;
>  
> -		src_ring->hw_index =
> -			ath10k_ce_src_ring_read_index_get(ar, ctrl_addr);
> -		src_ring->hw_index &= nentries_mask;
> +		read_index = ath10k_ce_src_ring_read_index_get(ar, ctrl_addr);
> +		if (read_index == 0xFFFFFFFF)
> +			return -ENODEV;

I changed this to lower case, as it was before. Let's use lower case hex
values in ath10k.

-- 
Kalle Valo

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/2] ath10k: pci fixes 2014-06-05
  2014-06-05  7:48 [PATCH 0/2] ath10k: pci fixes 2014-06-05 Michal Kazior
  2014-06-05  7:48 ` [PATCH 1/2] ath10k: fix bmi exchange tx/rx race Michal Kazior
  2014-06-05  7:48 ` [PATCH 2/2] ath10k: sanitize tx ring index access properly Michal Kazior
@ 2014-07-15  8:22 ` Kalle Valo
  2 siblings, 0 replies; 5+ messages in thread
From: Kalle Valo @ 2014-07-15  8:22 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k, linux-wireless

Michal Kazior <michal.kazior@tieto.com> writes:

> I've found a few issues while I was testing ath10k
> in QEMU/KVM with PCI passthrough. The second patch
> is something that I've found while analysing the
> bmi race and I don't think anyone experienced
> problems associated with it.
>
> There's still a weird synchronization issue with
> (what I think is) iomap write propagation. The
> initial fw bootup interrupt doesn't come in and
> it seems CE interrupts are unmasked with a lag
> because explicit polling after ctl_resp timeout
> seems to be sufficient to work around it. I
> suspect this might be a virtualization problem
> rather than ath10k. Ideas, anyone?

No ideas, but it would be great to get ath10k working in KVM. If we
cannot find a better solution, could we add the explicit polling anyway?
(With an approriate comment why such a hack is needed, of course)

> Michal Kazior (2):
>   ath10k: fix bmi exchange tx/rx race
>   ath10k: sanitize tx ring index access properly

Thanks, both patches applied.

-- 
Kalle Valo

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-07-15  8:32 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-06-05  7:48 [PATCH 0/2] ath10k: pci fixes 2014-06-05 Michal Kazior
2014-06-05  7:48 ` [PATCH 1/2] ath10k: fix bmi exchange tx/rx race Michal Kazior
2014-06-05  7:48 ` [PATCH 2/2] ath10k: sanitize tx ring index access properly Michal Kazior
2014-07-14 13:20   ` Kalle Valo
2014-07-15  8:22 ` [PATCH 0/2] ath10k: pci fixes 2014-06-05 Kalle Valo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).