netdev.vger.kernel.org archive mirror
From: Anton Blanchard <anton@samba.org>
To: davem@redhat.com
Cc: netdev@oss.sgi.com
Subject: Allow IP header alignment to be overriden
Date: Fri, 11 Jun 2004 11:27:27 +1000
Message-ID: <20040611012727.GA27672@krispykreme>


Hi,

The networking layer currently aligns IP headers in rx packets. It does
this via skb_reserve(skb, 2).
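
As a rough illustration of why the reserve is 2 bytes (this is a sketch, not
kernel code): the Ethernet header is 14 bytes, so pushing the start of the
frame forward by 2 puts the IP header at offset 16. The helper names below
are hypothetical, and the sketch assumes the buffer itself starts on a
4-byte boundary.

```c
/* Ethernet header: 6 bytes dst MAC + 6 bytes src MAC + 2 bytes ethertype. */
#define ETH_HEADER_BYTES 14u

/* Offset of the IP header from the start of the buffer, given how many
 * pad bytes were reserved in front of the Ethernet header. */
unsigned int ip_header_offset(unsigned int reserved)
{
	return reserved + ETH_HEADER_BYTES;
}

/* Nonzero if the IP header lands on a 4-byte boundary, assuming the
 * buffer start is itself 4-byte aligned. */
int ip_header_is_aligned(unsigned int reserved)
{
	return ip_header_offset(reserved) % 4 == 0;
}
```

With a reserve of 2 the IP header sits at offset 16 and is aligned; with no
reserve it sits at offset 14 and is not.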

On some architectures (like ppc64) we handle most unaligned accesses in
hardware. This means we gain little from this header alignment. However,
forcing this alignment means we attempt to DMA from an unaligned
address.

In the lab we see DMAs beginning at 2 bytes into the page. On some of
our chips we have to do power-of-2 writes of increasing size until we
hit a reasonable alignment. It was noticeable on gigabit and now with
10Gbit appearing it's becoming a real problem.
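
To make the cost concrete, here is a simplified model of that ramp-up (a
sketch only, not a description of any particular chip). It assumes each
write is the largest naturally aligned power-of-2 chunk available at the
current address, and that the "reasonable alignment" is a power of two
supplied by the caller. The function name and the 128-byte target used
below are assumptions made for illustration.

```c
/* Count the small writes issued before the address reaches target_align.
 * target_align must be a power of two. Simplified model: each write covers
 * the largest power-of-2 chunk that the current address is aligned to. */
unsigned int rampup_writes(unsigned long addr, unsigned long target_align)
{
	unsigned int writes = 0;

	while (addr % target_align != 0) {
		/* Lowest set bit of addr: the largest power of two dividing it. */
		unsigned long chunk = addr & -addr;

		addr += chunk;
		writes++;
	}
	return writes;
}
```

Under this model a DMA starting 2 bytes into the page needs writes of 2, 4,
8, 16, 32 and 64 bytes (six in total) before it reaches a 128-byte boundary,
whereas a DMA starting at offset 0 needs none.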

I'd be surprised if other architectures aren't seeing similar issues, with
bridges that disconnect at power-of-two boundaries.

The following patch creates skb_align and allows an architecture to
override it. Thoughts?
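
The override mechanism can be sketched in user space as follows. This is a
toy model, not the kernel's struct sk_buff: toy_skb, toy_skb_reserve and
toy_skb_align are hypothetical names, and only the data and tail pointers
are modelled.

```c
/* Toy stand-in for struct sk_buff: only the two pointers that matter here. */
struct toy_skb {
	unsigned char *data;
	unsigned char *tail;
};

/* Same idea as skb_reserve() on an empty buffer: move data and tail
 * forward by len bytes to leave headroom. */
void toy_skb_reserve(struct toy_skb *skb, unsigned int len)
{
	skb->data += len;
	skb->tail += len;
}

/* An architecture that prefers aligned DMA would define these two
 * before this point to turn the alignment into a no-op:
 *
 *   #define ARCH_HAS_SKB_ALIGN
 *   #define toy_skb_align(SKB, LEN) do { } while (0)
 */

/* Generic fallback: aligning is just a reserve. */
#ifndef ARCH_HAS_SKB_ALIGN
#define toy_skb_align(SKB, LEN) toy_skb_reserve((SKB), (LEN))
#endif
```

Drivers call the align macro unconditionally; whether it moves the pointers
is decided once, by the architecture header, at compile time.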

Anton

===== drivers/net/acenic.c 1.44 vs edited =====
--- 1.44/drivers/net/acenic.c	Tue Apr  6 18:01:26 2004
+++ edited/drivers/net/acenic.c	Fri Jun 11 08:09:27 2004
@@ -1695,7 +1695,7 @@
 		/*
 		 * Make sure IP header starts on a fresh cache line.
 		 */
-		skb_reserve(skb, 2 + 16);
+		skb_align(skb, 2 + 16);
 		mapping = pci_map_page(ap->pdev, virt_to_page(skb->data),
 				       offset_in_page(skb->data),
 				       ACE_STD_BUFSIZE - (2 + 16),
@@ -1761,7 +1761,7 @@
 		/*
 		 * Make sure the IP header ends up on a fresh cache line
 		 */
-		skb_reserve(skb, 2 + 16);
+		skb_align(skb, 2 + 16);
 		mapping = pci_map_page(ap->pdev, virt_to_page(skb->data),
 				       offset_in_page(skb->data),
 				       ACE_MINI_BUFSIZE - (2 + 16),
@@ -1822,7 +1822,7 @@
 		/*
 		 * Make sure the IP header ends up on a fresh cache line
 		 */
-		skb_reserve(skb, 2 + 16);
+		skb_align(skb, 2 + 16);
 		mapping = pci_map_page(ap->pdev, virt_to_page(skb->data),
 				       offset_in_page(skb->data),
 				       ACE_JUMBO_BUFSIZE - (2 + 16),
===== drivers/net/e100.c 1.15 vs edited =====
--- 1.15/drivers/net/e100.c	Sat Jun  5 01:49:59 2004
+++ edited/drivers/net/e100.c	Fri Jun 11 08:00:49 2004
@@ -1395,7 +1395,7 @@
 
 	/* Align, init, and map the RFD. */
 	rx->skb->dev = nic->netdev;
-	skb_reserve(rx->skb, rx_offset);
+	skb_align(rx->skb, rx_offset);
 	memcpy(rx->skb->data, &nic->blank_rfd, sizeof(struct rfd));
 	rx->dma_addr = pci_map_single(nic->pdev, rx->skb->data,
 		RFD_BUF_LEN, PCI_DMA_BIDIRECTIONAL);
===== drivers/net/s2io.c 1.5 vs edited =====
--- 1.5/drivers/net/s2io.c	Fri Jun  4 12:00:15 2004
+++ edited/drivers/net/s2io.c	Fri Jun 11 08:06:52 2004
@@ -1431,7 +1431,7 @@
 			DBG_PRINT(ERR_DBG, "memory to allocate SKBs\n");
 			return -ENOMEM;
 		}
-		skb_reserve(skb, HEADER_ALIGN_LAYER_3);
+		skb_align(skb, HEADER_ALIGN_LAYER_3);
 		memset(rxdp, 0, sizeof(RxD_t));
 		rxdp->Buffer0_ptr = pci_map_single
 		    (nic->pdev, skb->data, size, PCI_DMA_FROMDEVICE);
===== drivers/net/tg3.c 1.180 vs edited =====
--- 1.180/drivers/net/tg3.c	Sat Jun  5 01:49:59 2004
+++ edited/drivers/net/tg3.c	Fri Jun 11 08:07:28 2004
@@ -2472,7 +2472,7 @@
 				goto drop_it_no_recycle;
 
 			copy_skb->dev = tp->dev;
-			skb_reserve(copy_skb, 2);
+			skb_align(copy_skb, 2);
 			skb_put(copy_skb, len);
 			pci_dma_sync_single_for_cpu(tp->pdev, dma_addr, len, PCI_DMA_FROMDEVICE);
 			memcpy(copy_skb->data, skb->data, len);
===== drivers/net/e1000/e1000_ethtool.c 1.45 vs edited =====
--- 1.45/drivers/net/e1000/e1000_ethtool.c	Fri May 28 06:59:25 2004
+++ edited/drivers/net/e1000/e1000_ethtool.c	Fri Jun 11 08:10:26 2004
@@ -1008,7 +1008,7 @@
 			ret_val = 6;
 			goto err_nomem;
 		}
-		skb_reserve(skb, 2);
+		skb_align(skb, 2);
 		rxdr->buffer_info[i].skb = skb;
 		rxdr->buffer_info[i].length = E1000_RXBUFFER_2048;
 		rxdr->buffer_info[i].dma =
===== drivers/net/e1000/e1000_main.c 1.118 vs edited =====
--- 1.118/drivers/net/e1000/e1000_main.c	Fri Jun  4 10:59:04 2004
+++ edited/drivers/net/e1000/e1000_main.c	Fri Jun 11 08:05:39 2004
@@ -2387,7 +2387,7 @@
 		 * this will result in a 16 byte aligned IP header after
 		 * the 14 byte MAC header is removed
 		 */
-		skb_reserve(skb, reserve_len);
+		skb_align(skb, reserve_len);
 
 		skb->dev = netdev;
 
===== drivers/net/ixgb/ixgb_main.c 1.13 vs edited =====
--- 1.13/drivers/net/ixgb/ixgb_main.c	Tue Jun  1 10:01:23 2004
+++ edited/drivers/net/ixgb/ixgb_main.c	Fri Jun 11 08:41:13 2004
@@ -1906,7 +1906,7 @@
 		 * this will result in a 16 byte aligned IP header after
 		 * the 14 byte MAC header is removed
 		 */
-		skb_reserve(skb, reserve_len);
+		skb_align(skb, reserve_len);
 
 		skb->dev = netdev;
 
===== include/asm-ppc64/system.h 1.28 vs edited =====
--- 1.28/include/asm-ppc64/system.h	Fri May 21 17:50:12 2004
+++ edited/include/asm-ppc64/system.h	Fri Jun 11 08:39:11 2004
@@ -277,5 +277,15 @@
 				    (unsigned long)_n_, sizeof(*(ptr))); \
   })
 
+/*
+ * We handle most unaligned accesses in hardware. On the other hand
+ * unaligned DMA can be very expensive on some ppc64 IO chips (it does
+ * powers of 2 writes until it reaches sufficient alignment).
+ *
+ * Based on this we disable the IP header alignment in network drivers.
+ */
+#define ARCH_HAS_SKB_ALIGN
+#define skb_align(SKB, LEN)     do { } while (0)
+
 #endif /* __KERNEL__ */
 #endif
===== include/linux/skbuff.h 1.43 vs edited =====
--- 1.43/include/linux/skbuff.h	Mon May 31 05:09:46 2004
+++ edited/include/linux/skbuff.h	Fri Jun 11 08:27:42 2004
@@ -816,6 +816,20 @@
 	skb->tail += len;
 }
 
+/**
+ *	skb_align - align a buffer
+ *	@skb: buffer to alter
+ *	@len: bytes required to align
+ *
+ *	Shift a buffer by len bytes for the purposes of alignment. On
+ *	some architectures that handle unaligned accesses in hardware
+ *	the effect of unaligned DMA is more costly so we allow it to
+ *	be overridden. This is only allowed for an empty buffer.
+ */
+#ifndef ARCH_HAS_SKB_ALIGN
+#define skb_align(SKB, LEN)	skb_reserve((SKB), (LEN))
+#endif
+
 extern int ___pskb_trim(struct sk_buff *skb, unsigned int len, int realloc);
 
 static inline void __skb_trim(struct sk_buff *skb, unsigned int len)
