[PATCHv2 net-next] net: igb: Only dma sync frame length

* [PATCHv2 net-next] net: igb: Only dma sync frame length
@ 2016-06-03 21:03 Andrew Lunn
  2016-06-15  1:40 ` [Intel-wired-lan] " Brown, Aaron F
  0 siblings, 1 reply; 3+ messages in thread
From: Andrew Lunn @ 2016-06-03 21:03 UTC (permalink / raw)
  To: Jeff Kirsher, David Miller
  Cc: netdev, intel-wired-lan, Alexander Duyck, Andrew Lunn

On some platforms, syncing a buffer for DMA is expensive. Rather than
sync the whole 2K receive buffer, only synchronise the length of the
frame, which will typically be the MTU, or a much smaller TCP ACK.

On an IMX6Q, this gives around 6% higher TCP receive performance, a
workload which is bound by cache operations, and reduces CPU load for
TCP transmit.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
---
v2:
Order the local variables in reverse Christmas tree style.
Pass size into igb_add_rx_frag() rather than repeating the endianness swap.
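
For illustration only, here is a rough userspace sketch of why a shorter
sync span can be cheaper on platforms where the sync is done by cache
maintenance. It is not driver code. The 32-byte cache line size is an
assumption, and frame_len_from_le16(), cache_lines_touched() and
cache_lines_saved() are hypothetical helpers that do not exist in igb.
The first one mimics what a single le16_to_cpu() of the descriptor length
yields, so the value can be computed once and reused.

```c
#include <assert.h>
#include <stdint.h>

#define RX_BUFSZ   2048u  /* the "2K receive buffer" from the commit message */
#define CACHE_LINE 32u    /* assumption: 32-byte cache lines; platform specific */

/* Hypothetical stand-in for le16_to_cpu(): decode a little-endian 16-bit
 * value from raw bytes, independent of the host byte order.
 */
static unsigned int frame_len_from_le16(const uint8_t raw[2])
{
	return (unsigned int)raw[0] | ((unsigned int)raw[1] << 8);
}

/* Count the cache lines that overlap the byte range [offset, offset + len). */
static unsigned int cache_lines_touched(unsigned int offset, unsigned int len)
{
	if (len == 0)
		return 0;

	return (offset + len - 1) / CACHE_LINE - offset / CACHE_LINE + 1;
}

/* Cache lines skipped by syncing only len bytes instead of the whole buffer. */
static unsigned int cache_lines_saved(unsigned int offset, unsigned int len)
{
	assert(len <= RX_BUFSZ);

	return cache_lines_touched(offset, RX_BUFSZ) -
	       cache_lines_touched(offset, len);
}
```

Under these assumed numbers, a 1514-byte frame covers 48 of the 64 cache
lines in a 2K buffer, and a 60-byte frame such as a bare TCP ACK covers 2.
The actual saving depends on the platform's DMA and cache implementation.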
---
 drivers/net/ethernet/intel/igb/igb_main.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 0a289dda604a..8fa9e6e8c3b0 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -6856,12 +6856,12 @@ static bool igb_can_reuse_rx_page(struct igb_rx_buffer *rx_buffer,
  **/
 static bool igb_add_rx_frag(struct igb_ring *rx_ring,
 			    struct igb_rx_buffer *rx_buffer,
+			    unsigned int size,
 			    union e1000_adv_rx_desc *rx_desc,
 			    struct sk_buff *skb)
 {
 	struct page *page = rx_buffer->page;
 	unsigned char *va = page_address(page) + rx_buffer->page_offset;
-	unsigned int size = le16_to_cpu(rx_desc->wb.upper.length);
 #if (PAGE_SIZE < 8192)
 	unsigned int truesize = IGB_RX_BUFSZ;
 #else
@@ -6913,6 +6913,7 @@ static struct sk_buff *igb_fetch_rx_buffer(struct igb_ring *rx_ring,
 					   union e1000_adv_rx_desc *rx_desc,
 					   struct sk_buff *skb)
 {
+	unsigned int size = le16_to_cpu(rx_desc->wb.upper.length);
 	struct igb_rx_buffer *rx_buffer;
 	struct page *page;
 
@@ -6948,11 +6949,11 @@ static struct sk_buff *igb_fetch_rx_buffer(struct igb_ring *rx_ring,
 	dma_sync_single_range_for_cpu(rx_ring->dev,
 				      rx_buffer->dma,
 				      rx_buffer->page_offset,
-				      IGB_RX_BUFSZ,
+				      size,
 				      DMA_FROM_DEVICE);
 
 	/* pull page into skb */
-	if (igb_add_rx_frag(rx_ring, rx_buffer, rx_desc, skb)) {
+	if (igb_add_rx_frag(rx_ring, rx_buffer, size, rx_desc, skb)) {
 		/* hand second half of page back to the ring */
 		igb_reuse_rx_page(rx_ring, rx_buffer);
 	} else {
-- 
2.8.1

* [PATCHv2 net-next] net: igb: Only dma sync frame length
  2016-06-04 19:16 [PATCHv2 net-next 00/17] New DSA bind, switches as devices Andrew Lunn
@ 2016-06-04 19:16 ` Andrew Lunn
  0 siblings, 0 replies; 3+ messages in thread
From: Andrew Lunn @ 2016-06-04 19:16 UTC (permalink / raw)
  To: David Miller, Vivien Didelot, Florian Fainelli
  Cc: netdev, Shawn Guo, Andrew Lunn

* RE: [Intel-wired-lan] [PATCHv2 net-next] net: igb: Only dma sync frame length
  2016-06-03 21:03 [PATCHv2 net-next] net: igb: Only dma sync frame length Andrew Lunn
@ 2016-06-15  1:40 ` Brown, Aaron F
  0 siblings, 0 replies; 3+ messages in thread
From: Brown, Aaron F @ 2016-06-15  1:40 UTC (permalink / raw)
  To: Andrew Lunn, Kirsher, Jeffrey T, David Miller
  Cc: netdev, intel-wired-lan@lists.osuosl.org

> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@lists.osuosl.org] On
> Behalf Of Andrew Lunn
> Sent: Friday, June 3, 2016 2:03 PM
> To: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>; David Miller
> <davem@davemloft.net>
> Cc: netdev <netdev@vger.kernel.org>; intel-wired-lan@lists.osuosl.org;
> Andrew Lunn <andrew@lunn.ch>
> Subject: [Intel-wired-lan] [PATCHv2 net-next] net: igb: Only dma sync frame
> length
> 
> On some platforms, syncing a buffer for DMA is expensive. Rather than
> sync the whole 2K receive buffer, only synchronise the length of the
> frame, which will typically be the MTU, or a much smaller TCP ACK.
> 
> For an IMX6Q, this gives around 6% increased TCP receive performance,
> which is cache operations bound and reduces CPU load for TCP transmit.
> 
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> ---
> v2:
> Christmas tree the local variables
> Pass size into igb_add_rx_frag() rather than repeating the endianness swap.
> ---
>  drivers/net/ethernet/intel/igb/igb_main.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)

Tested-by: Aaron Brown <aaron.f.brown@intel.com>
