* [PATCH net-next] net/mlx4: use one page fragment per incoming frame
From: Eric Dumazet @ 2013-06-03 17:54 UTC
To: David Miller; +Cc: netdev, Amir Vadai
From: Eric Dumazet <edumazet@google.com>
The mlx4 driver has a suboptimal memory allocation strategy for regular
MTU=1500 frames, as it uses two page fragments per frame:
one of 512 bytes and one of 1024 bytes.
This makes GRO less effective, as each GSO packet contains 8 MSS instead
of 16 MSS.
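A back-of-envelope sketch of where the 8 vs 16 MSS figures come from;
the fragment budget of 16 is an assumption about the effective
MAX_SKB_FRAGS limit, not a value taken from the patch:

#include <stdio.h>

int main(void)
{
    /* Assumed GRO budget: about MAX_SKB_FRAGS page fragments per skb. */
    const int frag_budget = 16;

    /* Old scheme: a 1500-byte frame spans two fragments (512 + 1024). */
    printf("MSS per GRO skb, 2 frags/frame: %d\n", frag_budget / 2);
    /* New scheme: a 1500-byte frame fits in one 1536-byte fragment. */
    printf("MSS per GRO skb, 1 frag/frame:  %d\n", frag_budget / 1);
    return 0;
}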
A single TCP flow gains about 25% in throughput with the following
patch.
Before patch:

A:~# netperf -H 192.168.0.2 -Cc
MIGRATED TCP STREAM TEST ...

Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.00    13798.47    3.06     4.20     0.436   0.598
After patch:

A:~# netperf -H 192.168.0.2 -Cc
MIGRATED TCP STREAM TEST ...

Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.00    17273.80    3.44     4.19     0.391   0.477
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Amir Vadai <amirv@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index b1d7657..b1f51c1 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -98,11 +98,11 @@
 #define MLX4_EN_ALLOC_SIZE	PAGE_ALIGN(16384)
 #define MLX4_EN_ALLOC_ORDER	get_order(MLX4_EN_ALLOC_SIZE)
 
-/* Receive fragment sizes; we use at most 4 fragments (for 9600 byte MTU
+/* Receive fragment sizes; we use at most 3 fragments (for 9600 byte MTU
  * and 4K allocations) */
 enum {
-	FRAG_SZ0 = 512 - NET_IP_ALIGN,
-	FRAG_SZ1 = 1024,
+	FRAG_SZ0 = 1536 - NET_IP_ALIGN,
+	FRAG_SZ1 = 4096,
 	FRAG_SZ2 = 4096,
 	FRAG_SZ3 = MLX4_EN_ALLOC_SIZE
 };
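As a sanity check on the new ladder, a tiny sketch (assumptions: 14-byte
Ethernet header, NET_IP_ALIGN of 2, VLAN tags ignored; not part of the
patch) of how the two interesting MTUs map onto the fragments:

#include <stdio.h>

int main(void)
{
    /* New receive fragment ladder from the patch (NET_IP_ALIGN = 2). */
    const int frag_sz[] = { 1536 - 2, 4096, 4096 };
    const int mtus[] = { 1500, 9600 };

    for (int m = 0; m < 2; m++) {
        int left = mtus[m] + 14;    /* frame = payload + Ethernet header */
        int used = 0;

        while (left > 0)
            left -= frag_sz[used++];
        printf("MTU %5d -> %d fragment(s)\n", mtus[m], used);
    }
    return 0;
}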
* Re: [PATCH net-next] net/mlx4: use one page fragment per incoming frame
From: Rick Jones @ 2013-06-03 18:05 UTC
To: Eric Dumazet; +Cc: David Miller, netdev, Amir Vadai
On 06/03/2013 10:54 AM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> The mlx4 driver has a suboptimal memory allocation strategy for regular
> MTU=1500 frames, as it uses two page fragments per frame:
>
> one of 512 bytes and one of 1024 bytes.
>
> This makes GRO less effective, as each GSO packet contains 8 MSS instead
> of 16 MSS.
>
> A single TCP flow gains about 25% in throughput with the following
> patch.
>
> Before patch:
>
> A:~# netperf -H 192.168.0.2 -Cc
> MIGRATED TCP STREAM TEST ...
>
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>
>  87380  16384  16384    10.00    13798.47    3.06     4.20     0.436   0.598
>
> After patch:
>
> A:~# netperf -H 192.168.0.2 -Cc
> MIGRATED TCP STREAM TEST ...
>
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>
>  87380  16384  16384    10.00    17273.80    3.44     4.19     0.391   0.477
I take it this is a > 10 Gbit/s NIC?
What, if any, is the downside for an incoming stream of small packets?
rick jones
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Amir Vadai <amirv@mellanox.com>
> ---
> drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> index b1d7657..b1f51c1 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> @@ -98,11 +98,11 @@
>  #define MLX4_EN_ALLOC_SIZE	PAGE_ALIGN(16384)
>  #define MLX4_EN_ALLOC_ORDER	get_order(MLX4_EN_ALLOC_SIZE)
> 
> -/* Receive fragment sizes; we use at most 4 fragments (for 9600 byte MTU
> +/* Receive fragment sizes; we use at most 3 fragments (for 9600 byte MTU
>   * and 4K allocations) */
>  enum {
> -	FRAG_SZ0 = 512 - NET_IP_ALIGN,
> -	FRAG_SZ1 = 1024,
> +	FRAG_SZ0 = 1536 - NET_IP_ALIGN,
> +	FRAG_SZ1 = 4096,
> 	FRAG_SZ2 = 4096,
> 	FRAG_SZ3 = MLX4_EN_ALLOC_SIZE
>  };
* Re: [PATCH net-next] net/mlx4: use one page fragment per incoming frame
From: Eric Dumazet @ 2013-06-03 18:12 UTC
To: Rick Jones; +Cc: David Miller, netdev, Amir Vadai
On Mon, 2013-06-03 at 11:05 -0700, Rick Jones wrote:
> I take it this is a > 10 Gbit/s NIC?
>
Well, I guess so!
Or something is really strange with this NIC!
> What, if any, is the downside for an incoming stream of small packets?
This kind of NIC is aimed at performance, I was told.
1) It would be strange to use them on memory-constrained machines.
2) Using 1536-byte fragments is still better than most other NIC drivers,
as they usually use 2048- or 4096-byte frags.
3) Current memory allocations in mlx4 use order-2 pages, and nobody has
yet complained that these allocations might fail...
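A quick sanity check on the order-2 claim, assuming 4 KB pages
(get_order_4k() below is a userspace stand-in for the kernel's
get_order(), not kernel code):

#include <stdio.h>

/* Smallest order such that (PAGE_SIZE << order) >= size, for 4 KB pages. */
static int get_order_4k(unsigned long size)
{
    int order = 0;

    while ((4096UL << order) < size)
        order++;
    return order;
}

int main(void)
{
    /* MLX4_EN_ALLOC_SIZE is PAGE_ALIGN(16384): four contiguous pages. */
    printf("get_order(16384) = %d\n", get_order_4k(16384));
    return 0;
}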
* Re: [PATCH net-next] net/mlx4: use one page fragment per incoming frame
From: Rick Jones @ 2013-06-03 18:24 UTC
To: Eric Dumazet; +Cc: David Miller, netdev, Amir Vadai
On 06/03/2013 11:12 AM, Eric Dumazet wrote:
> On Mon, 2013-06-03 at 11:05 -0700, Rick Jones wrote:
>
>> I take it this is a > 10 Gbit/s NIC?
>>
>
> Well, I guess so!
>
> Or something is really strange with this NIC!
Indeed.
>> What, if any, is the downside for an incoming stream of small packets?
>
> This kind of NIC is aimed at performance, I was told.
>
> 1) It would be strange to use them on memory-constrained machines.
>
> 2) Using 1536-byte fragments is still better than most other NIC drivers,
> as they usually use 2048- or 4096-byte frags.
>
> 3) Current memory allocations in mlx4 use order-2 pages, and nobody has
> yet complained that these allocations might fail...
I was thinking more about a packet getting up to a socket and finding
insufficient buffer space remaining to hold it, given the larger
per-packet overhead.
rick
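A rough sketch of that concern, assuming the whole receive fragment is
charged to skb->truesize; the per-skb overhead figure below is an
illustrative assumption, not a measurement:

#include <stdio.h>

int main(void)
{
    /* Assumed fixed per-skb accounting overhead (struct sk_buff etc.). */
    const int skb_overhead = 256;
    const int old_frag = 512, new_frag = 1536;

    /* A tiny (e.g. 64-byte) packet still pins a whole fragment, so its
     * truesize roughly triples and the receive buffer fills faster. */
    printf("approx truesize before: %d bytes\n", skb_overhead + old_frag);
    printf("approx truesize after:  %d bytes\n", skb_overhead + new_frag);
    return 0;
}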
* Re: [PATCH net-next] net/mlx4: use one page fragment per incoming frame
From: Amir Vadai @ 2013-06-04 8:40 UTC
To: Eric Dumazet; +Cc: David Miller, netdev
On 03/06/2013 20:54, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> The mlx4 driver has a suboptimal memory allocation strategy for regular
> MTU=1500 frames, as it uses two page fragments per frame:
>
> one of 512 bytes and one of 1024 bytes.
>
> This makes GRO less effective, as each GSO packet contains 8 MSS instead
> of 16 MSS.
>
> A single TCP flow gains about 25% in throughput with the following
> patch.
>
> Before patch:
>
> A:~# netperf -H 192.168.0.2 -Cc
> MIGRATED TCP STREAM TEST ...
>
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>
>  87380  16384  16384    10.00    13798.47    3.06     4.20     0.436   0.598
>
> After patch:
>
> A:~# netperf -H 192.168.0.2 -Cc
> MIGRATED TCP STREAM TEST ...
>
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>
>  87380  16384  16384    10.00    17273.80    3.44     4.19     0.391   0.477
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Amir Vadai <amirv@mellanox.com>
> ---
> drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> index b1d7657..b1f51c1 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> @@ -98,11 +98,11 @@
>  #define MLX4_EN_ALLOC_SIZE	PAGE_ALIGN(16384)
>  #define MLX4_EN_ALLOC_ORDER	get_order(MLX4_EN_ALLOC_SIZE)
> 
> -/* Receive fragment sizes; we use at most 4 fragments (for 9600 byte MTU
> +/* Receive fragment sizes; we use at most 3 fragments (for 9600 byte MTU
>   * and 4K allocations) */
>  enum {
> -	FRAG_SZ0 = 512 - NET_IP_ALIGN,
> -	FRAG_SZ1 = 1024,
> +	FRAG_SZ0 = 1536 - NET_IP_ALIGN,
> +	FRAG_SZ1 = 4096,
> 	FRAG_SZ2 = 4096,
> 	FRAG_SZ3 = MLX4_EN_ALLOC_SIZE
>  };
>
>
Acked-by: Amir Vadai <amirv@mellanox.com>
We are currently working on a patch to change the skb allocation scheme
on the RX side, because the current mlx4_en architecture behaves very
badly when the IOMMU is enabled.
After this change, the driver will use one fragment per frame, among
other improvements. The most important will be fragment recycling, to
save alloc/free and dma_map/unmap operations.
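A minimal userspace sketch of that recycling idea (an illustration of
the concept, not the actual Mellanox patch): take one page reference per
fragment handed out, and reuse the page, which can stay DMA-mapped, once
the stack has dropped every reference:

#include <stdio.h>

#define PAGE_SZ 4096
#define FRAG_SZ 1536

struct rx_page {
    int refcount;   /* stands in for the struct page refcount */
    int offset;     /* next free chunk offset within the page */
};

/* Hand out the next fragment, or report that the page is exhausted. */
static int get_frag(struct rx_page *p)
{
    if (p->offset + FRAG_SZ > PAGE_SZ)
        return -1;
    p->refcount++;              /* the stack now owns one reference */
    p->offset += FRAG_SZ;
    return p->offset - FRAG_SZ;
}

/* Called when the stack frees the skb that used a fragment. */
static void put_frag(struct rx_page *p)
{
    if (--p->refcount == 0)
        p->offset = 0;          /* page is ours again: recycle it */
}

int main(void)
{
    struct rx_page p = { .refcount = 0, .offset = 0 };
    int off;

    while ((off = get_frag(&p)) >= 0)
        printf("fragment at offset %d\n", off);
    put_frag(&p);
    put_frag(&p);
    printf("page recycled, offset reset to %d\n", p.offset);
    return 0;
}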
But I think it is a good idea to apply your fix now, in case that work
is delayed.
Thanks,
Amir
* Re: [PATCH net-next] net/mlx4: use one page fragment per incoming frame
From: David Miller @ 2013-06-05 0:28 UTC
To: eric.dumazet; +Cc: netdev, amirv
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 03 Jun 2013 10:54:55 -0700
> From: Eric Dumazet <edumazet@google.com>
>
> The mlx4 driver has a suboptimal memory allocation strategy for regular
> MTU=1500 frames, as it uses two page fragments per frame:
>
> one of 512 bytes and one of 1024 bytes.
>
> This makes GRO less effective, as each GSO packet contains 8 MSS instead
> of 16 MSS.
>
> A single TCP flow gains about 25% in throughput with the following
> patch.
...
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Applied, thanks.