* [PATCHv2 net-next] net: igb: Only dma sync frame length
@ 2016-06-03 21:03 Andrew Lunn
2016-06-15 1:40 ` [Intel-wired-lan] " Brown, Aaron F
From: Andrew Lunn @ 2016-06-03 21:03 UTC
To: Jeff Kirsher, David Miller
Cc: netdev, intel-wired-lan, Alexander Duyck, Andrew Lunn

On some platforms, syncing a buffer for DMA is expensive. Rather than
sync the whole 2K receive buffer, only synchronise the length of the
frame, which will typically be the MTU, or a much smaller TCP ACK.

For an IMX6Q, where TCP receive is bound by cache operations, this gives
around 6% better TCP receive performance. It also reduces CPU load for
TCP transmit.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
---
v2:
Christmas tree the local variables
Pass size into igb_add_rx_frag() rather than repeating the endianness swap.
---
drivers/net/ethernet/intel/igb/igb_main.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 0a289dda604a..8fa9e6e8c3b0 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -6856,12 +6856,12 @@ static bool igb_can_reuse_rx_page(struct igb_rx_buffer *rx_buffer,
  **/
 static bool igb_add_rx_frag(struct igb_ring *rx_ring,
 			    struct igb_rx_buffer *rx_buffer,
+			    unsigned int size,
 			    union e1000_adv_rx_desc *rx_desc,
 			    struct sk_buff *skb)
 {
 	struct page *page = rx_buffer->page;
 	unsigned char *va = page_address(page) + rx_buffer->page_offset;
-	unsigned int size = le16_to_cpu(rx_desc->wb.upper.length);
 #if (PAGE_SIZE < 8192)
 	unsigned int truesize = IGB_RX_BUFSZ;
 #else
@@ -6913,6 +6913,7 @@ static struct sk_buff *igb_fetch_rx_buffer(struct igb_ring *rx_ring,
 					   union e1000_adv_rx_desc *rx_desc,
 					   struct sk_buff *skb)
 {
+	unsigned int size = le16_to_cpu(rx_desc->wb.upper.length);
 	struct igb_rx_buffer *rx_buffer;
 	struct page *page;

@@ -6948,11 +6949,11 @@ static struct sk_buff *igb_fetch_rx_buffer(struct igb_ring *rx_ring,
 	dma_sync_single_range_for_cpu(rx_ring->dev,
 				      rx_buffer->dma,
 				      rx_buffer->page_offset,
-				      IGB_RX_BUFSZ,
+				      size,
 				      DMA_FROM_DEVICE);

 	/* pull page into skb */
-	if (igb_add_rx_frag(rx_ring, rx_buffer, rx_desc, skb)) {
+	if (igb_add_rx_frag(rx_ring, rx_buffer, size, rx_desc, skb)) {
 		/* hand second half of page back to the ring */
 		igb_reuse_rx_page(rx_ring, rx_buffer);
 	} else {
--
2.8.1
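
To give a rough feel for why the sync length matters, here is a minimal
userspace sketch (not driver code) that counts how many cache lines a
sync of a given length would cover. Everything in it is an assumption made
for illustration: the 32-byte CACHE_LINE value, the model_rx_desc layout,
and the helper names model_le16_to_cpu() and lines_touched() are
hypothetical and do not come from the igb driver. The real cost of
dma_sync_single_range_for_cpu() depends on the architecture and may be
close to zero on cache-coherent platforms.

```c
#include <assert.h>
#include <stdint.h>

#define RX_BUFSZ   2048u /* the "2K receive buffer" from the commit message */
#define CACHE_LINE 32u   /* ASSUMPTION: cache line size; adjust per platform */

/* Hypothetical stand-in for the descriptor: length stored little-endian. */
struct model_rx_desc {
	uint8_t length_le[2];
};

/* Portable model of le16_to_cpu(): decode two little-endian bytes. */
unsigned int model_le16_to_cpu(const uint8_t b[2])
{
	return (unsigned int)b[0] | ((unsigned int)b[1] << 8);
}

/* Number of cache lines overlapped by the byte range [offset, offset + size). */
unsigned int lines_touched(unsigned int offset, unsigned int size)
{
	unsigned int first, last;

	if (size == 0)
		return 0;

	first = offset / CACHE_LINE;
	last = (offset + size - 1) / CACHE_LINE;
	return last - first + 1;
}

/*
 * Decode the frame length once, then report how many cache lines a sync
 * of exactly that length would cover. This mirrors the v2 change of
 * computing size in one place and passing it on, in model form only.
 */
unsigned int lines_for_frame(const struct model_rx_desc *desc,
			     unsigned int page_offset)
{
	unsigned int size = model_le16_to_cpu(desc->length_le);

	return lines_touched(page_offset, size);
}
```

Under these assumptions, lines_touched(0, RX_BUFSZ) is 64, a 1514-byte
frame covers 48 lines, and a 66-byte ACK covers 3 lines, so the saving is
largest for small frames.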
* RE: [Intel-wired-lan] [PATCHv2 net-next] net: igb: Only dma sync frame length
2016-06-03 21:03 [PATCHv2 net-next] net: igb: Only dma sync frame length Andrew Lunn
@ 2016-06-15 1:40 ` Brown, Aaron F
From: Brown, Aaron F @ 2016-06-15 1:40 UTC
To: Andrew Lunn, Kirsher, Jeffrey T, David Miller
Cc: netdev, intel-wired-lan@lists.osuosl.org
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@lists.osuosl.org] On
> Behalf Of Andrew Lunn
> Sent: Friday, June 3, 2016 2:03 PM
> To: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>; David Miller
> <davem@davemloft.net>
> Cc: netdev <netdev@vger.kernel.org>; intel-wired-lan@lists.osuosl.org;
> Andrew Lunn <andrew@lunn.ch>
> Subject: [Intel-wired-lan] [PATCHv2 net-next] net: igb: Only dma sync frame
> length
>
> On some platforms, syncing a buffer for DMA is expensive. Rather than
> sync the whole 2K receive buffer, only synchronise the length of the
> frame, which will typically be the MTU, or a much smaller TCP ACK.
>
> For an IMX6Q, where TCP receive is bound by cache operations, this gives
> around 6% better TCP receive performance. It also reduces CPU load for
> TCP transmit.
>
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> ---
> v2:
> Christmas tree the local variables
> Pass size into igb_add_rx_frag() rather than repeating the endianness swap.
> ---
> drivers/net/ethernet/intel/igb/igb_main.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Thread overview: 3+ messages
2016-06-03 21:03 [PATCHv2 net-next] net: igb: Only dma sync frame length Andrew Lunn
2016-06-15 1:40 ` [Intel-wired-lan] " Brown, Aaron F
-- strict thread matches above, loose matches on Subject: below --
2016-06-04 19:16 [PATCHv2 net-next 00/17] New DSA bind, switches as devices Andrew Lunn
2016-06-04 19:16 ` [PATCHv2 net-next] net: igb: Only dma sync frame length Andrew Lunn