* pull request: sfc-next-2.6 2011-03-07
@ 2011-03-07 23:23 Ben Hutchings
  2011-03-07 23:25 ` [PATCH net-next-2.6] sfc: Use write-combining to reduce TX latency Ben Hutchings
  2011-03-08 19:33 ` pull request: sfc-next-2.6 2011-03-07 David Miller
  0 siblings, 2 replies; 3+ messages in thread
From: Ben Hutchings @ 2011-03-07 23:23 UTC (permalink / raw)
  To: David Miller; +Cc: sf-linux-drivers, netdev

The following changes since commit 07df5294a753dfac2cc9f75e6159fc25fdc22149:

  inet: Replace left-over references to inet->cork (2011-03-01 23:00:58 -0800)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next-2.6.git for-davem

Just one more performance optimisation.

Ben.

Ben Hutchings (1):
      sfc: Use write-combining to reduce TX latency

 drivers/net/sfc/efx.c  |    4 ++--
 drivers/net/sfc/io.h   |   13 +++++++++----
 drivers/net/sfc/mcdi.c |    9 +++++----
 3 files changed, 16 insertions(+), 10 deletions(-)

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* [PATCH net-next-2.6] sfc: Use write-combining to reduce TX latency
  2011-03-07 23:23 pull request: sfc-next-2.6 2011-03-07 Ben Hutchings
@ 2011-03-07 23:25 ` Ben Hutchings
  2011-03-08 19:33 ` pull request: sfc-next-2.6 2011-03-07 David Miller
  1 sibling, 0 replies; 3+ messages in thread
From: Ben Hutchings @ 2011-03-07 23:25 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers

Based on work by Neil Turton <nturton@solarflare.com> and
Kieran Mansley <kmansley@solarflare.com>.

The BIU has now been verified to correctly handle 3- and 4-dword writes
within a single 128-bit register.  This means we can enable
write-combining and insert write barriers only between writes to
distinct registers.

This has been observed to save about 0.5 us when pushing a TX
descriptor to an empty TX queue.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
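Illustrative note, not part of the patch: the barrier placement described
above can be sketched with a toy userspace model.  Everything in it is
hypothetical (struct wc_model and the model_* helpers are invented for
illustration and are not driver code), and it makes one large simplifying
assumption: combined writes reach the device only when a barrier is
issued.  Real write-combining hardware may flush earlier.  The sketch only
shows why one wmb() per register lets the four dwords of a 128-bit
register travel together, while a wmb() after every dword keeps each
write separate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only.  All names below are hypothetical, not sfc driver API. */
#define MODEL_MAX_BURSTS 16

struct wc_model {
	size_t pending_bytes;               /* bytes combined since last barrier */
	size_t burst_len[MODEL_MAX_BURSTS]; /* size of each burst seen by the device */
	size_t n_bursts;
};

/* Stand-in for a 32-bit MMIO write through a write-combining mapping. */
static void model_writed(struct wc_model *m, uint32_t value, size_t offset)
{
	(void)value;             /* the model tracks sizes only, not contents */
	assert(offset % 4 == 0); /* dword writes must be dword-aligned */
	m->pending_bytes += sizeof(uint32_t);
}

/* Stand-in for wmb(): push out whatever has been combined so far. */
static void model_wmb(struct wc_model *m)
{
	if (m->pending_bytes == 0)
		return;
	assert(m->n_bursts < MODEL_MAX_BURSTS);
	m->burst_len[m->n_bursts++] = m->pending_bytes;
	m->pending_bytes = 0;
}

/* 128-bit register pattern: four dword writes, then a single barrier. */
static void model_writeo(struct wc_model *m, const uint32_t v[4], size_t reg)
{
	for (size_t i = 0; i < 4; i++)
		model_writed(m, v[i], reg + 4 * i);
	model_wmb(m);
}

/* Payload copy pattern: a barrier after every dword inhibits combining. */
static void model_copyin(struct wc_model *m, const uint32_t *buf,
			 size_t n_dwords, size_t base)
{
	for (size_t i = 0; i < n_dwords; i++) {
		model_writed(m, buf[i], base + 4 * i);
		model_wmb(m);
	}
}
```

In this model a call to model_writeo() records a single 16-byte burst,
whereas model_copyin() records one 4-byte burst per dword.
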
 drivers/net/sfc/efx.c  |    4 ++--
 drivers/net/sfc/io.h   |   13 +++++++++----
 drivers/net/sfc/mcdi.c |    9 +++++----
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index d563049..b8bd936 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -1104,8 +1104,8 @@ static int efx_init_io(struct efx_nic *efx)
 		rc = -EIO;
 		goto fail3;
 	}
-	efx->membase = ioremap_nocache(efx->membase_phys,
-				       efx->type->mem_map_size);
+	efx->membase = ioremap_wc(efx->membase_phys,
+				  efx->type->mem_map_size);
 	if (!efx->membase) {
 		netif_err(efx, probe, efx->net_dev,
 			  "could not map memory BAR at %llx+%x\n",
diff --git a/drivers/net/sfc/io.h b/drivers/net/sfc/io.h
index dc45110..d9d8c2e 100644
--- a/drivers/net/sfc/io.h
+++ b/drivers/net/sfc/io.h
@@ -48,9 +48,9 @@
  *   replacing the low 96 bits with zero does not affect functionality.
  * - If the host writes to the last dword address of such a register
  *   (i.e. the high 32 bits) the underlying register will always be
- *   written.  If the collector does not hold values for the low 96
- *   bits of the register, they will be written as zero.  Writing to
- *   the last qword does not have this effect and must not be done.
+ *   written.  If the collector and the current write together do not
+ *   provide values for all 128 bits of the register, the low 96 bits
+ *   will be written as zero.
  * - If the host writes to the address of any other part of such a
  *   register while the collector already holds values for some other
  *   register, the write is discarded and the collector maintains its
@@ -103,6 +103,7 @@ static inline void efx_writeo(struct efx_nic *efx, efx_oword_t *value,
 	_efx_writed(efx, value->u32[2], reg + 8);
 	_efx_writed(efx, value->u32[3], reg + 12);
 #endif
+	wmb();
 	mmiowb();
 	spin_unlock_irqrestore(&efx->biu_lock, flags);
 }
@@ -125,6 +126,7 @@ static inline void efx_sram_writeq(struct efx_nic *efx, void __iomem *membase,
 	__raw_writel((__force u32)value->u32[0], membase + addr);
 	__raw_writel((__force u32)value->u32[1], membase + addr + 4);
 #endif
+	wmb();
 	mmiowb();
 	spin_unlock_irqrestore(&efx->biu_lock, flags);
 }
@@ -139,6 +141,7 @@ static inline void efx_writed(struct efx_nic *efx, efx_dword_t *value,
 
 	/* No lock required */
 	_efx_writed(efx, value->u32[0], reg);
+	wmb();
 }
 
 /* Read a 128-bit CSR, locking as appropriate. */
@@ -237,12 +240,14 @@ static inline void _efx_writeo_page(struct efx_nic *efx, efx_oword_t *value,
 
 #ifdef EFX_USE_QWORD_IO
 	_efx_writeq(efx, value->u64[0], reg + 0);
+	_efx_writeq(efx, value->u64[1], reg + 8);
 #else
 	_efx_writed(efx, value->u32[0], reg + 0);
 	_efx_writed(efx, value->u32[1], reg + 4);
-#endif
 	_efx_writed(efx, value->u32[2], reg + 8);
 	_efx_writed(efx, value->u32[3], reg + 12);
+#endif
+	wmb();
 }
 #define efx_writeo_page(efx, value, reg, page)				\
 	_efx_writeo_page(efx, value,					\
diff --git a/drivers/net/sfc/mcdi.c b/drivers/net/sfc/mcdi.c
index 8bba895..5e118f0 100644
--- a/drivers/net/sfc/mcdi.c
+++ b/drivers/net/sfc/mcdi.c
@@ -94,14 +94,15 @@ static void efx_mcdi_copyin(struct efx_nic *efx, unsigned cmd,
 
 	efx_writed(efx, &hdr, pdu);
 
-	for (i = 0; i < inlen; i += 4)
+	for (i = 0; i < inlen; i += 4) {
 		_efx_writed(efx, *((__le32 *)(inbuf + i)), pdu + 4 + i);
-
-	/* Ensure the payload is written out before the header */
-	wmb();
+		/* use wmb() within loop to inhibit write combining */
+		wmb();
+	}
 
 	/* ring the doorbell with a distinctive value */
 	_efx_writed(efx, (__force __le32) 0x45789abc, doorbell);
+	wmb();
 }
 
 static void efx_mcdi_copyout(struct efx_nic *efx, u8 *outbuf, size_t outlen)
-- 
1.7.4





* Re: pull request: sfc-next-2.6 2011-03-07
  2011-03-07 23:23 pull request: sfc-next-2.6 2011-03-07 Ben Hutchings
  2011-03-07 23:25 ` [PATCH net-next-2.6] sfc: Use write-combining to reduce TX latency Ben Hutchings
@ 2011-03-08 19:33 ` David Miller
  1 sibling, 0 replies; 3+ messages in thread
From: David Miller @ 2011-03-08 19:33 UTC (permalink / raw)
  To: bhutchings; +Cc: linux-net-drivers, netdev

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Mon, 07 Mar 2011 23:23:14 +0000

> The following changes since commit 07df5294a753dfac2cc9f75e6159fc25fdc22149:
> 
>   inet: Replace left-over references to inet->cork (2011-03-01 23:00:58 -0800)
> 
> are available in the git repository at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next-2.6.git for-davem
> 
> Just one more performance optimisation.

Pulled, thanks Ben.

