* [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
@ 2015-06-30 17:25 Nicolae Rosia
2015-07-01 10:03 ` Nicolas Ferre
2015-07-01 10:56 ` Eric Dumazet
0 siblings, 2 replies; 15+ messages in thread
From: Nicolae Rosia @ 2015-06-30 17:25 UTC (permalink / raw)
To: netdev; +Cc: 'Nicolas Ferre
Signed-off-by: Nicolae Rosia <nicolae.rosia@certsign.ro>
---
drivers/net/ethernet/cadence/macb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index caeb395..dbb5160 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -2554,9 +2554,9 @@ static void at91ether_rx(struct net_device *dev)
while (lp->rx_ring[lp->rx_tail].addr & MACB_BIT(RX_USED)) {
p_recv = lp->rx_buffers + lp->rx_tail * AT91ETHER_MAX_RBUFF_SZ;
pktlen = MACB_BF(RX_FRMLEN, lp->rx_ring[lp->rx_tail].ctrl);
- skb = netdev_alloc_skb(dev, pktlen + 2);
+ skb = netdev_alloc_skb(dev, pktlen + NET_IP_ALIGN);
if (skb) {
- skb_reserve(skb, 2);
+ skb_reserve(skb, NET_IP_ALIGN);
memcpy(skb_put(skb, pktlen), p_recv, pktlen);
skb->protocol = eth_type_trans(skb, dev);
--
2.1.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
2015-06-30 17:25 [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN Nicolae Rosia
@ 2015-07-01 10:03 ` Nicolas Ferre
2015-07-01 10:56 ` Eric Dumazet
1 sibling, 0 replies; 15+ messages in thread
From: Nicolas Ferre @ 2015-07-01 10:03 UTC (permalink / raw)
To: Nicolae Rosia, netdev; +Cc: Cyrille Pitchen
Le 30/06/2015 19:25, Nicolae Rosia a écrit :
>
> Signed-off-by: Nicolae Rosia <nicolae.rosia@certsign.ro>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Thanks, bye.
> ---
> drivers/net/ethernet/cadence/macb.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> index caeb395..dbb5160 100644
> --- a/drivers/net/ethernet/cadence/macb.c
> +++ b/drivers/net/ethernet/cadence/macb.c
> @@ -2554,9 +2554,9 @@ static void at91ether_rx(struct net_device *dev)
> while (lp->rx_ring[lp->rx_tail].addr & MACB_BIT(RX_USED)) {
> p_recv = lp->rx_buffers + lp->rx_tail * AT91ETHER_MAX_RBUFF_SZ;
> pktlen = MACB_BF(RX_FRMLEN, lp->rx_ring[lp->rx_tail].ctrl);
> - skb = netdev_alloc_skb(dev, pktlen + 2);
> + skb = netdev_alloc_skb(dev, pktlen + NET_IP_ALIGN);
> if (skb) {
> - skb_reserve(skb, 2);
> + skb_reserve(skb, NET_IP_ALIGN);
> memcpy(skb_put(skb, pktlen), p_recv, pktlen);
>
> skb->protocol = eth_type_trans(skb, dev);
>
--
Nicolas Ferre
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
2015-06-30 17:25 [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN Nicolae Rosia
2015-07-01 10:03 ` Nicolas Ferre
@ 2015-07-01 10:56 ` Eric Dumazet
2015-07-01 13:29 ` Nicolae Rosia
2015-07-01 17:06 ` Joe Perches
1 sibling, 2 replies; 15+ messages in thread
From: Eric Dumazet @ 2015-07-01 10:56 UTC (permalink / raw)
To: Nicolae Rosia; +Cc: netdev, 'Nicolas Ferre
On Tue, 2015-06-30 at 20:25 +0300, Nicolae Rosia wrote:
> Signed-off-by: Nicolae Rosia <nicolae.rosia@certsign.ro>
> ---
> drivers/net/ethernet/cadence/macb.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> index caeb395..dbb5160 100644
> --- a/drivers/net/ethernet/cadence/macb.c
> +++ b/drivers/net/ethernet/cadence/macb.c
> @@ -2554,9 +2554,9 @@ static void at91ether_rx(struct net_device *dev)
> while (lp->rx_ring[lp->rx_tail].addr & MACB_BIT(RX_USED)) {
> p_recv = lp->rx_buffers + lp->rx_tail * AT91ETHER_MAX_RBUFF_SZ;
> pktlen = MACB_BF(RX_FRMLEN, lp->rx_ring[lp->rx_tail].ctrl);
> - skb = netdev_alloc_skb(dev, pktlen + 2);
> + skb = netdev_alloc_skb(dev, pktlen + NET_IP_ALIGN);
> if (skb) {
> - skb_reserve(skb, 2);
> + skb_reserve(skb, NET_IP_ALIGN);
> memcpy(skb_put(skb, pktlen), p_recv, pktlen);
>
> skb->protocol = eth_type_trans(skb, dev);
Then please use netdev_alloc_skb_ip_align(), so that you get rid of
skb_reserve()
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
2015-07-01 10:56 ` Eric Dumazet
@ 2015-07-01 13:29 ` Nicolae Rosia
2015-07-01 13:44 ` Eric Dumazet
2015-07-01 17:06 ` Joe Perches
1 sibling, 1 reply; 15+ messages in thread
From: Nicolae Rosia @ 2015-07-01 13:29 UTC (permalink / raw)
To: Eric Dumazet, Nicolas Ferre; +Cc: netdev
On 07/01/2015 01:56 PM, Eric Dumazet wrote:
> Then please use netdev_alloc_skb_ip_align(), so that you get rid of
> skb_reserve()
Thank you for the suggestion.
I can do that.
A related question, should I also replace netdev_alloc with
napi_alloc_skb in places where I have a napi struct?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
2015-07-01 13:29 ` Nicolae Rosia
@ 2015-07-01 13:44 ` Eric Dumazet
2015-07-01 14:29 ` Nicolae Rosia
0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2015-07-01 13:44 UTC (permalink / raw)
To: Nicolae Rosia; +Cc: Nicolas Ferre, netdev
On Wed, 2015-07-01 at 16:29 +0300, Nicolae Rosia wrote:
> On 07/01/2015 01:56 PM, Eric Dumazet wrote:
> > Then please use netdev_alloc_skb_ip_align(), so that you get rid of
> > skb_reserve()
> Thank you for the suggestion.
> I can do that.
> A related question, should I also replace netdev_alloc with
> napi_alloc_skb in places where I have a napi struct?
I really doubt this adapter can process millions of packets per second ?
I would rather enable GRO, it would be more useful.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
2015-07-01 13:44 ` Eric Dumazet
@ 2015-07-01 14:29 ` Nicolae Rosia
2015-07-01 15:33 ` Eric Dumazet
0 siblings, 1 reply; 15+ messages in thread
From: Nicolae Rosia @ 2015-07-01 14:29 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Nicolas Ferre, netdev
On 07/01/2015 04:44 PM, Eric Dumazet wrote:
> I really doubt this adapter can process millions of packets per second ?
I was suggesting this since I was taking into consideration the comment
from skbuff.c, "we can save several CPU cycles by avoiding having to
disable and re-enable IRQs."
Is there a downside to this?
>
> I would rather enable GRO, it would be more useful.
I had no idea what GRO is, so I have read about it [0] and looked at a
couple of drivers which use it. They all seem to
replace netif_receive_skb with napi_gro_receive and when there are no
more packets in napi_pool they call napi_gro_flush.
Is it that simple?
Regards
[0] https://lwn.net/Articles/358910/
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
2015-07-01 14:29 ` Nicolae Rosia
@ 2015-07-01 15:33 ` Eric Dumazet
2015-07-01 15:53 ` Nicolae Rosia
0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2015-07-01 15:33 UTC (permalink / raw)
To: Nicolae Rosia; +Cc: Nicolas Ferre, netdev
On Wed, 2015-07-01 at 17:29 +0300, Nicolae Rosia wrote:
> On 07/01/2015 04:44 PM, Eric Dumazet wrote:
> > I really doubt this adapter can process millions of packets per second ?
> I was suggesting this since I was taking into consideration the comment
> from skbuff.c, "we can save several CPU cycles by avoiding having to
> disable and re-enable IRQs."
> Is there a downside to this?
This only matters in terms of few nano seconds per packet, so for 10Gb+
NIC anyway. Absolute noise for most NIC.
>
> >
> > I would rather enable GRO, it would be more useful.
> I had no idea what GRO is, so I have read about it [0] and looked at a
> couple of drivers which use it. They all seem to
> replace netif_receive_skb with napi_gro_receive and when there are no
> more packets in napi_pool they call napi_gro_flush.
> Is it that simple?
Yes, but main question is : Do you have the hardware to test your
changes ?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
2015-07-01 15:33 ` Eric Dumazet
@ 2015-07-01 15:53 ` Nicolae Rosia
2015-07-01 16:19 ` Eric Dumazet
0 siblings, 1 reply; 15+ messages in thread
From: Nicolae Rosia @ 2015-07-01 15:53 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Nicolae Rosia, Nicolas Ferre, netdev
On Wed, Jul 1, 2015 at 6:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
[...]
> This only matters in terms of few nano seconds per packet, so for 10Gb+
> NIC anyway. Absolute noise for most NIC.
>
I'll give it a try and benchmark.
I achieved a huge speedup by moving TX into napi [0], but my hardware doesn't
support multiple TX queues and I can't test that situation.
> Yes, but main question is : Do you have the hardware to test your
> changes ?
Yes, I have a Xilinx ZC706 board with a Zynq7 XC7Z045 processor [1]
[0] https://patchwork.ozlabs.org/patch/488949/
[1] http://www.xilinx.com/products/boards-and-kits/ek-z7-zc706-g.html
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
2015-07-01 15:53 ` Nicolae Rosia
@ 2015-07-01 16:19 ` Eric Dumazet
0 siblings, 0 replies; 15+ messages in thread
From: Eric Dumazet @ 2015-07-01 16:19 UTC (permalink / raw)
To: Nicolae Rosia; +Cc: Nicolae Rosia, Nicolas Ferre, netdev
On Wed, 2015-07-01 at 18:53 +0300, Nicolae Rosia wrote:
> On Wed, Jul 1, 2015 at 6:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> [...]
> > This only matters in terms of few nano seconds per packet, so for 10Gb+
> > NIC anyway. Absolute noise for most NIC.
> >
> I'll give it a try and benchmark.
> I achieved a huge speedup by moving TX into napi [0], but my hardware doesn't
> support multiple TX queues and I can't test that situation.
>
> > Yes, but main question is : Do you have the hardware to test your
> > changes ?
> Yes, I have a Xilinx ZC706 board with a Zynq7 XC7Z045 processor [1]
>
> [0] https://patchwork.ozlabs.org/patch/488949/
> [1] http://www.xilinx.com/products/boards-and-kits/ek-z7-zc706-g.html
OK, then enabling GRO should be quite easy, given driver already has
most of it.
diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index caeb39561567237261ac0d50befebad666cfbeb3..24a93c769caa5430ca61efe002b458fef7281e99 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -815,7 +815,7 @@ static int gem_rx(struct macb *bp, int budget)
skb->data, 32, true);
#endif
- netif_receive_skb(skb);
+ napi_gro_receive(&bp->napi, skb);
}
gem_rx_refill(bp);
@@ -896,7 +896,7 @@ static int macb_rx_frame(struct macb *bp, unsigned int first_frag,
bp->stats.rx_bytes += skb->len;
netdev_vdbg(bp->dev, "received skb of length %u, csum: %08x\n",
skb->len, skb->csum);
- netif_receive_skb(skb);
+ napi_gro_receive(&bp->napi, skb);
return 0;
}
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
2015-07-01 10:56 ` Eric Dumazet
2015-07-01 13:29 ` Nicolae Rosia
@ 2015-07-01 17:06 ` Joe Perches
2015-07-01 22:13 ` Eric Dumazet
1 sibling, 1 reply; 15+ messages in thread
From: Joe Perches @ 2015-07-01 17:06 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Nicolae Rosia, netdev, 'Nicolas Ferre
On Wed, 2015-07-01 at 12:56 +0200, Eric Dumazet wrote:
> On Tue, 2015-06-30 at 20:25 +0300, Nicolae Rosia wrote:
> > diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
[]
> > @@ -2554,9 +2554,9 @@ static void at91ether_rx(struct net_device *dev)
> > while (lp->rx_ring[lp->rx_tail].addr & MACB_BIT(RX_USED)) {
> > p_recv = lp->rx_buffers + lp->rx_tail * AT91ETHER_MAX_RBUFF_SZ;
> > pktlen = MACB_BF(RX_FRMLEN, lp->rx_ring[lp->rx_tail].ctrl);
> > - skb = netdev_alloc_skb(dev, pktlen + 2);
> > + skb = netdev_alloc_skb(dev, pktlen + NET_IP_ALIGN);
> > if (skb) {
> > - skb_reserve(skb, 2);
> > + skb_reserve(skb, NET_IP_ALIGN);
> > memcpy(skb_put(skb, pktlen), p_recv, pktlen);
> >
> > skb->protocol = eth_type_trans(skb, dev);
>
> Then please use netdev_alloc_skb_ip_align(), so that you get rid of
> skb_reserve()
It seems there are ~50 of these in the kernel tree
that could be converted.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
2015-07-01 17:06 ` Joe Perches
@ 2015-07-01 22:13 ` Eric Dumazet
2015-07-01 23:00 ` Joe Perches
2015-07-03 16:18 ` David Laight
0 siblings, 2 replies; 15+ messages in thread
From: Eric Dumazet @ 2015-07-01 22:13 UTC (permalink / raw)
To: Joe Perches; +Cc: Nicolae Rosia, netdev, 'Nicolas Ferre
On Wed, 2015-07-01 at 10:06 -0700, Joe Perches wrote:
> On Wed, 2015-07-01 at 12:56 +0200, Eric Dumazet wrote:
> > On Tue, 2015-06-30 at 20:25 +0300, Nicolae Rosia wrote:
> > > diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> []
> > > @@ -2554,9 +2554,9 @@ static void at91ether_rx(struct net_device *dev)
> > > while (lp->rx_ring[lp->rx_tail].addr & MACB_BIT(RX_USED)) {
> > > p_recv = lp->rx_buffers + lp->rx_tail * AT91ETHER_MAX_RBUFF_SZ;
> > > pktlen = MACB_BF(RX_FRMLEN, lp->rx_ring[lp->rx_tail].ctrl);
> > > - skb = netdev_alloc_skb(dev, pktlen + 2);
> > > + skb = netdev_alloc_skb(dev, pktlen + NET_IP_ALIGN);
> > > if (skb) {
> > > - skb_reserve(skb, 2);
> > > + skb_reserve(skb, NET_IP_ALIGN);
> > > memcpy(skb_put(skb, pktlen), p_recv, pktlen);
> > >
> > > skb->protocol = eth_type_trans(skb, dev);
> >
> > Then please use netdev_alloc_skb_ip_align(), so that you get rid of
> > skb_reserve()
>
> It seems there are ~50 of these in the kernel tree
> that could be converted.
>
Make sure the 2 is really NET_IP_ALIGN
Some hardwares need 2, even if NET_IP_ALIGN is 0 (on x86 arches for
example)
I would rather not touch this without testing the change on real
hardware.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
2015-07-01 22:13 ` Eric Dumazet
@ 2015-07-01 23:00 ` Joe Perches
2015-07-03 16:18 ` David Laight
1 sibling, 0 replies; 15+ messages in thread
From: Joe Perches @ 2015-07-01 23:00 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Nicolae Rosia, netdev, 'Nicolas Ferre
On Thu, 2015-07-02 at 00:13 +0200, Eric Dumazet wrote:
> On Wed, 2015-07-01 at 10:06 -0700, Joe Perches wrote:
> > On Wed, 2015-07-01 at 12:56 +0200, Eric Dumazet wrote:
> > > On Tue, 2015-06-30 at 20:25 +0300, Nicolae Rosia wrote:
> > > > diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> > []
> > > > @@ -2554,9 +2554,9 @@ static void at91ether_rx(struct net_device *dev)
> > > > while (lp->rx_ring[lp->rx_tail].addr & MACB_BIT(RX_USED)) {
> > > > p_recv = lp->rx_buffers + lp->rx_tail * AT91ETHER_MAX_RBUFF_SZ;
> > > > pktlen = MACB_BF(RX_FRMLEN, lp->rx_ring[lp->rx_tail].ctrl);
> > > > - skb = netdev_alloc_skb(dev, pktlen + 2);
> > > > + skb = netdev_alloc_skb(dev, pktlen + NET_IP_ALIGN);
> > > > if (skb) {
> > > > - skb_reserve(skb, 2);
> > > > + skb_reserve(skb, NET_IP_ALIGN);
> > > > memcpy(skb_put(skb, pktlen), p_recv, pktlen);
> > > >
> > > > skb->protocol = eth_type_trans(skb, dev);
> > >
> > > Then please use netdev_alloc_skb_ip_align(), so that you get rid of
> > > skb_reserve()
> >
> > It seems there are ~50 of these in the kernel tree
> > that could be converted.
> >
>
> Make sure the 2 is really NET_IP_ALIGN
>
> Some hardwares need 2, even if NET_IP_ALIGN is 0 (on x86 arches for
> example)
>
> I would rather not touch this without testing the change on real
> hardware.
Nor I really.
Most all of those are in fairly old hardware drivers.
I just wanted to point out that more exist.
^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
2015-07-01 22:13 ` Eric Dumazet
2015-07-01 23:00 ` Joe Perches
@ 2015-07-03 16:18 ` David Laight
2015-07-03 16:39 ` Eric Dumazet
1 sibling, 1 reply; 15+ messages in thread
From: David Laight @ 2015-07-03 16:18 UTC (permalink / raw)
To: 'Eric Dumazet', Joe Perches
Cc: Nicolae Rosia, netdev@vger.kernel.org, 'Nicolas Ferre
From: Eric Dumazet
> Sent: 01 July 2015 23:14
> To: Joe Perches
...
> > > Then please use netdev_alloc_skb_ip_align(), so that you get rid of
> > > skb_reserve()
> >
> > It seems there are ~50 of these in the kernel tree
> > that could be converted.
> >
>
> Make sure the 2 is really NET_IP_ALIGN
>
> Some hardwares need 2, even if NET_IP_ALIGN is 0 (on x86 arches for
> example)
Even on x86 aligning the ethernet receive data on a 4n+2
boundary is likely to give marginally better performance
than aligning on a 4n boundary.
David
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
2015-07-03 16:18 ` David Laight
@ 2015-07-03 16:39 ` Eric Dumazet
2015-07-06 8:54 ` David Laight
0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2015-07-03 16:39 UTC (permalink / raw)
To: David Laight
Cc: Joe Perches, Nicolae Rosia, netdev@vger.kernel.org,
'Nicolas Ferre
On Fri, 2015-07-03 at 16:18 +0000, David Laight wrote:
> Even on x86 aligning the ethernet receive data on a 4n+2
> boundary is likely to give marginally better performance
> than aligning on a 4n boundary.
You are coming late to the party.
Intel guys decided to change NET_IP_ALIGN to 0 (it was 2 in the past)
commit ea812ca1b06113597adcd8e70c0f84a413d97544
Author: Alexander Duyck <alexander.h.duyck@intel.com>
Date: Tue Jun 29 18:38:00 2010 +0000
x86: Align skb w/ start of cacheline on newer core 2/Xeon Arch
x86 architectures can handle unaligned accesses in hardware, and it has
been shown that unaligned DMA accesses can be expensive on Nehalem
architectures. As such we should overwrite NET_IP_ALIGN to resolve
this issue.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN
2015-07-03 16:39 ` Eric Dumazet
@ 2015-07-06 8:54 ` David Laight
0 siblings, 0 replies; 15+ messages in thread
From: David Laight @ 2015-07-06 8:54 UTC (permalink / raw)
To: 'Eric Dumazet'
Cc: Joe Perches, Nicolae Rosia, netdev@vger.kernel.org,
'Nicolas Ferre
From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: 03 July 2015 17:39
> On Fri, 2015-07-03 at 16:18 +0000, David Laight wrote:
>
> > Even on x86 aligning the ethernet receive data on a 4n+2
> > boundary is likely to give marginally better performance
> > than aligning on a 4n boundary.
>
> You are coming late to the party.
I've been to many parties at many different times....
Going back many years, Sun's original sbus DMA part generated a lot
of single sbus transfers for 4n+2 aligned buffers - so it was necessary
to do a 'realignment' copy. The later DMA+ (definitely the DMA2) part
did sbus burst transfers even when the buffer was 4n+2 aligned.
So with the later parts you could correctly align the buffer.
> Intel guys decided to change NET_IP_ALIGN to 0 (it was 2 in the past)
...
> x86: Align skb w/ start of cacheline on newer core 2/Xeon Arch
>
> x86 architectures can handle unaligned accesses in hardware, and it has
> been shown that unaligned DMA accesses can be expensive on Nehalem
> architectures. As such we should overwrite NET_IP_ALIGN to resolve
> this issue.
My 2 cents:
I'd have thought it would depend on the nature of the 'DMA' requests
generated by the hardware - so ethernet hardware dependant.
The above may be correct for PCI masters - especially those that do
paired 16bit accesses for every 32bit word.
If the hardware generated cache line aligned PCI bursts I wouldn't
have thought it would matter.
I doubt it is valid for PCIe transfers - where the ethernet frame
will be split into (probably) 128byte TLPs. Even if it starts on
a 64n+2 boundary the splits will be on 64 byte boundaries since the
first and last 32bit words of the TLP have separate byte enables.
So I'd expect to see a cache line RMW for the first and last cache
lines - That may, or may not, be slower than the misaligned accesses
for the entire frame (1 clock data delay per access?)
Of course, modern nics will write 2 bytes of 'crap' before the frame.
Rounding up the transfer to the end of a cache line might also help
(especially if only a few extra words are needed).
David
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2015-07-06 8:57 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-06-30 17:25 [PATCH net-next] net: macb: replace literal constant with NET_IP_ALIGN Nicolae Rosia
2015-07-01 10:03 ` Nicolas Ferre
2015-07-01 10:56 ` Eric Dumazet
2015-07-01 13:29 ` Nicolae Rosia
2015-07-01 13:44 ` Eric Dumazet
2015-07-01 14:29 ` Nicolae Rosia
2015-07-01 15:33 ` Eric Dumazet
2015-07-01 15:53 ` Nicolae Rosia
2015-07-01 16:19 ` Eric Dumazet
2015-07-01 17:06 ` Joe Perches
2015-07-01 22:13 ` Eric Dumazet
2015-07-01 23:00 ` Joe Perches
2015-07-03 16:18 ` David Laight
2015-07-03 16:39 ` Eric Dumazet
2015-07-06 8:54 ` David Laight
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).