[PATCH 3/7] mlx4_en: Incoming traffic alignment optimizations

* [PATCH 3/7] mlx4_en: Incoming traffic alignment optimizations
@ 2011-10-17 20:17 Yevgeny Petrilin
  2011-10-18  1:53 ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Yevgeny Petrilin @ 2011-10-17 20:17 UTC (permalink / raw)
  To: davem; +Cc: netdev, yevgenyp


Packet headers are copied to skb linear part (which is IP aligned), so there is no reason for
the scatter entry to be IP aligned.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c   |   12 +++---------
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |    2 +-
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index fbf1dcf..8ae1eb5 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -744,15 +744,9 @@ void mlx4_en_calc_rx_buf(struct net_device *dev)
 			(eff_mtu > buf_size + frag_sizes[i]) ?
 				frag_sizes[i] : eff_mtu - buf_size;
 		priv->frag_info[i].frag_prefix_size = buf_size;
-		if (!i)	{
-			priv->frag_info[i].frag_align = NET_IP_ALIGN;
-			priv->frag_info[i].frag_stride =
-				ALIGN(frag_sizes[i] + NET_IP_ALIGN, SMP_CACHE_BYTES);
-		} else {
-			priv->frag_info[i].frag_align = 0;
-			priv->frag_info[i].frag_stride =
-				ALIGN(frag_sizes[i], SMP_CACHE_BYTES);
-		}
+		priv->frag_info[i].frag_align = 0;
+		priv->frag_info[i].frag_stride =
+			ALIGN(frag_sizes[i], SMP_CACHE_BYTES);
 		priv->frag_info[i].last_offset = mlx4_en_last_alloc_offset(
 						priv, priv->frag_info[i].frag_stride,
 						priv->frag_info[i].frag_align);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 3b753f7..013cda2 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -91,7 +91,7 @@
 /* Receive fragment sizes; we use at most 4 fragments (for 9600 byte MTU
  * and 4K allocations) */
 enum {
-	FRAG_SZ0 = 512 - NET_IP_ALIGN,
+	FRAG_SZ0 = 2048,
 	FRAG_SZ1 = 1024,
 	FRAG_SZ2 = 4096,
 	FRAG_SZ3 = MLX4_EN_ALLOC_SIZE
-- 
1.7.7


* Re: [PATCH 3/7] mlx4_en: Incoming traffic alignment optimizations
  2011-10-17 20:17 [PATCH 3/7] mlx4_en: Incoming traffic alignment optimizations Yevgeny Petrilin
@ 2011-10-18  1:53 ` Eric Dumazet
  2011-10-18  7:43   ` Yevgeny Petrilin
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2011-10-18  1:53 UTC (permalink / raw)
  To: Yevgeny Petrilin; +Cc: davem, netdev

On Monday, 17 October 2011 at 22:17 +0200, Yevgeny Petrilin wrote:
> Packet headers are copied to skb linear part (which is IP aligned), so there is no reason for
> the scatter entry to be IP aligned.
> 
> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
> ---

> @@ -91,7 +91,7 @@
>  /* Receive fragment sizes; we use at most 4 fragments (for 9600 byte MTU
>   * and 4K allocations) */
>  enum {
> -	FRAG_SZ0 = 512 - NET_IP_ALIGN,
> +	FRAG_SZ0 = 2048,
>  	FRAG_SZ1 = 1024,
>  	FRAG_SZ2 = 4096,
>  	FRAG_SZ3 = MLX4_EN_ALLOC_SIZE

Is the 512 -> 2048 change really wanted? It's not mentioned in the
changelog and is confusing.

This means mlx4 lost the ability to use a small frag (512 bytes) to
store small frames.
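
To make the size change concrete, here is a minimal sketch of the stride arithmetic from the hunk in en_rx.c, before and after the patch. The SKETCH_* constants and helper names are assumptions chosen for illustration (NET_IP_ALIGN of 2, 64-byte cache lines, a 4096-byte page); the real values depend on architecture and kernel configuration.

```c
#include <stddef.h>

/* Assumed values for illustration only; not taken from any kernel config. */
#define SKETCH_NET_IP_ALIGN    2
#define SKETCH_SMP_CACHE_BYTES 64
#define SKETCH_PAGE_SIZE       4096

/* Round x up to a multiple of a (a must be a power of two),
 * mirroring what the kernel's ALIGN() macro does. */
static size_t sketch_align(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Fragment 0 stride before the patch: FRAG_SZ0 plus the IP alignment
 * pad, rounded up to a cache line. */
static size_t stride_before(void)
{
	size_t frag_sz0 = 512 - SKETCH_NET_IP_ALIGN;

	return sketch_align(frag_sz0 + SKETCH_NET_IP_ALIGN,
			    SKETCH_SMP_CACHE_BYTES);
}

/* Fragment 0 stride after the patch: no pad, FRAG_SZ0 = 2048. */
static size_t stride_after(void)
{
	return sketch_align(2048, SKETCH_SMP_CACHE_BYTES);
}

/* Number of fragment-0 buffers that fit in one page. */
static size_t frags_per_page(size_t stride)
{
	return SKETCH_PAGE_SIZE / stride;
}
```

Under these assumptions the stride grows from 512 to 2048 bytes, so one page holds 2 fragment-0 buffers instead of 8.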


* RE: [PATCH 3/7] mlx4_en: Incoming traffic alignment optimizations
  2011-10-18  1:53 ` Eric Dumazet
@ 2011-10-18  7:43   ` Yevgeny Petrilin
  2011-10-18  8:40     ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Yevgeny Petrilin @ 2011-10-18  7:43 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem@davemloft.net, netdev@vger.kernel.org

> 
> > @@ -91,7 +91,7 @@
> >  /* Receive fragment sizes; we use at most 4 fragments (for 9600 byte MTU
> >   * and 4K allocations) */
> >  enum {
> > -	FRAG_SZ0 = 512 - NET_IP_ALIGN,
> > +	FRAG_SZ0 = 2048,
> >  	FRAG_SZ1 = 1024,
> >  	FRAG_SZ2 = 4096,
> >  	FRAG_SZ3 = MLX4_EN_ALLOC_SIZE
> 
> Is the 512 -> 2048 change really wanted? It's not mentioned in the changelog and is confusing.
> 
> This means mlx4 lost the ability to use a small frag (512 bytes) to store small frames.
> 
The change is wanted as an optimization for our HW.
We do get better numbers with this change, even with small packets.
You are correct, I should have mentioned it in the changelog.

Yevgeny

 



* RE: [PATCH 3/7] mlx4_en: Incoming traffic alignment optimizations
  2011-10-18  7:43   ` Yevgeny Petrilin
@ 2011-10-18  8:40     ` Eric Dumazet
  2011-10-18  8:52       ` Yevgeny Petrilin
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2011-10-18  8:40 UTC (permalink / raw)
  To: Yevgeny Petrilin; +Cc: davem@davemloft.net, netdev@vger.kernel.org

On Tuesday, 18 October 2011 at 07:43 +0000, Yevgeny Petrilin wrote:
> > 
> > > @@ -91,7 +91,7 @@
> > >  /* Receive fragment sizes; we use at most 4 fragments (for 9600 byte MTU
> > >   * and 4K allocations) */
> > >  enum {
> > > -	FRAG_SZ0 = 512 - NET_IP_ALIGN,
> > > +	FRAG_SZ0 = 2048,
> > >  	FRAG_SZ1 = 1024,
> > >  	FRAG_SZ2 = 4096,
> > >  	FRAG_SZ3 = MLX4_EN_ALLOC_SIZE
> > 
> > Is the 512 -> 2048 change really wanted? It's not mentioned in the changelog and is confusing.
> > 
> > This means mlx4 lost the ability to use a small frag (512 bytes) to store small frames.
> > 
> The change is wanted as an optimization for our HW.
> We do get better numbers with this change, even with small packets.
> You are correct, I should have mentioned it in the changelog.

Oh my...

Of course you are aware that 'truesize' accounting means that using a
big frag size will probably lower your performance numbers, unless you
allow protocol stacks to use more RAM?
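
The truesize point can be illustrated with a deliberately simplified model: each queued frame is charged for its whole fragment buffer plus some bookkeeping overhead, and a receiver stops queuing once a fixed memory budget is used up. SKETCH_SKB_OVERHEAD and both function names are hypothetical, and the real accounting in the stack is more involved than this.

```c
#include <stddef.h>

/* Hypothetical per-frame bookkeeping overhead in bytes; not a kernel constant. */
#define SKETCH_SKB_OVERHEAD 256

/* Memory charged for one received frame in this model: the whole
 * fragment buffer counts, not just the bytes the frame occupies. */
static size_t sketch_truesize(size_t frag_size)
{
	return SKETCH_SKB_OVERHEAD + frag_size;
}

/* Number of frames that can be queued before the budget is exhausted. */
static size_t frames_within_budget(size_t budget, size_t frag_size)
{
	return budget / sketch_truesize(frag_size);
}
```

With a 128 KiB budget, this model queues 170 small frames when each sits in a 512-byte fragment but only 56 when each sits in a 2048-byte fragment, which is why a larger budget would be needed to keep the same queue depth.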

The only possible drawback of using 512 bytes instead of 2048 is the
cache-line bounce on the page->_count field. So I would say your change
hides a performance issue in your driver.

Maybe you should make sure you don't touch it too often [ You should use
a single add per allocated PAGE, not 2 (for 2048-byte frags) or 8 (for
512-byte frags) ]
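
A userspace sketch of the "single add per page" idea follows, using C11 atomics as a stand-in for the page reference counter. struct sketch_page, its rmw_ops counter, and the function names are all hypothetical; this is not the mlx4_en code, only a model of how many atomic operations touch the shared counter in each approach.

```c
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096 /* assumed page size */

/* Stand-in for struct page: only the shared reference counter matters. */
struct sketch_page {
	atomic_int count;      /* models page->_count */
	unsigned long rmw_ops; /* atomic read-modify-write operations issued */
};

static void sketch_page_init(struct sketch_page *p)
{
	atomic_init(&p->count, 1); /* the allocator's own reference */
	p->rmw_ops = 0;
}

/* One atomic increment for every fragment carved from the page. */
static void take_refs_per_frag(struct sketch_page *p, size_t frag_size)
{
	size_t n = SKETCH_PAGE_SIZE / frag_size;

	for (size_t i = 0; i < n; i++) {
		atomic_fetch_add(&p->count, 1);
		p->rmw_ops++;
	}
}

/* A single atomic add that covers every fragment carved from the page. */
static void take_refs_batched(struct sketch_page *p, size_t frag_size)
{
	size_t n = SKETCH_PAGE_SIZE / frag_size;

	atomic_fetch_add(&p->count, (int)n);
	p->rmw_ops++;
}
```

Both functions leave the counter at the same value; the batched variant issues one atomic operation per page regardless of fragment size, so the choice between 512-byte and 2048-byte fragments no longer changes how often the counter's cache line is written at allocation time.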


* RE: [PATCH 3/7] mlx4_en: Incoming traffic alignment optimizations
  2011-10-18  8:40     ` Eric Dumazet
@ 2011-10-18  8:52       ` Yevgeny Petrilin
  0 siblings, 0 replies; 5+ messages in thread
From: Yevgeny Petrilin @ 2011-10-18  8:52 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem@davemloft.net, netdev@vger.kernel.org

> 
> Oh my...
> 
> Of course you are aware that 'truesize' accounting means that using a big frag size will probably lower your performance numbers, unless you allow protocol stacks to use more RAM?
> 
> The only possible drawback of using 512 bytes instead of 2048 is the cache-line bounce on the page->_count field. So I would say your change hides a performance issue in your driver.
> 
> Maybe you should make sure you don't touch it too often [ You should use a single add per allocated PAGE, not 2 (for 2048-byte frags) or 8 (for 512-byte frags) ]
> 
> 
Thanks Eric,
I'll check this.
In the meantime, I will resubmit the series without this change.

