* [1/2] e1000e: Invoke VLAN GRO handler
@ 2009-01-13 9:26 Herbert Xu
2009-01-13 9:28 ` [2/2] igb: Replace LRO with GRO Herbert Xu
` (3 more replies)
0 siblings, 4 replies; 37+ messages in thread
From: Herbert Xu @ 2009-01-13 9:26 UTC (permalink / raw)
To: David S. Miller, Jeff Kirsher, netdev
Hi Dave:
I'm starting to convert the LRO users now. Here's the first
patch which is trivial. It simply completes the test driver
that I started out with.
e1000e: Invoke VLAN GRO handler
Now that VLAN has GRO support as well, we can call its GRO handler
as well.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index 91817d0..f73faac 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -99,8 +99,8 @@ static void e1000_receive_skb(struct e1000_adapter *adapter,
skb->protocol = eth_type_trans(skb, netdev);
if (adapter->vlgrp && (status & E1000_RXD_STAT_VP))
- vlan_hwaccel_receive_skb(skb, adapter->vlgrp,
- le16_to_cpu(vlan));
+ vlan_gro_receive(&adapter->napi, adapter->vlgrp,
+ le16_to_cpu(vlan), skb);
else
napi_gro_receive(&adapter->napi, skb);
}
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [2/2] igb: Replace LRO with GRO
2009-01-13 9:26 [1/2] e1000e: Invoke VLAN GRO handler Herbert Xu
@ 2009-01-13 9:28 ` Herbert Xu
2009-01-14 8:44 ` ixgbe: " Herbert Xu
` (2 more replies)
2009-01-15 4:24 ` sfc: " Herbert Xu
` (2 subsequent siblings)
3 siblings, 3 replies; 37+ messages in thread
From: Herbert Xu @ 2009-01-13 9:28 UTC (permalink / raw)
To: David S. Miller, Jeff Kirsher, netdev
Hi:
And here is the first real conversion.
I can't test this one because I don't have the hardware. However,
GRO is a software feature and I have tested the various code paths
using e1000e.
igb: Replace LRO with GRO
This patch makes igb invoke the GRO hooks instead of LRO. As
GRO has a compatible external interface to LRO this is a very
straightforward replacement.
Three things of note:
1) I've kept the LRO Kconfig option until we decide to enable
GRO across the board at which point it can also be killed.
2) The poll_controller stuff is broken in igb as it tries to do
the same work as the normal poll routine. Since poll_controller
can be called in the middle of a poll, this can't be good.
I noticed this because poll_controller can invoke the GRO hooks
without flushing held GRO packets.
However, this should be harmless (assuming the poll_controller
bug above doesn't kill you first :) since the next ->poll will
clear the backlog. The only time when we'll have a problem is
if we're already executing the GRO code on the same ring, but
that's no worse than what happens now.
3) I kept the ip_summed check before calling GRO so that we're
on par with previous behaviour.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 65afda4..cf4fea0 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2022,7 +2022,6 @@ config IGB
config IGB_LRO
bool "Use software LRO"
depends on IGB && INET
- select INET_LRO
---help---
Say Y here if you want to use large receive offload.
diff --git a/drivers/net/igb/igb.h b/drivers/net/igb/igb.h
index 5a27825..7d8c887 100644
--- a/drivers/net/igb/igb.h
+++ b/drivers/net/igb/igb.h
@@ -36,12 +36,6 @@
struct igb_adapter;
-#ifdef CONFIG_IGB_LRO
-#include <linux/inet_lro.h>
-#define MAX_LRO_AGGR 32
-#define MAX_LRO_DESCRIPTORS 8
-#endif
-
/* Interrupt defines */
#define IGB_MIN_DYN_ITR 3000
#define IGB_MAX_DYN_ITR 96000
@@ -176,10 +170,6 @@ struct igb_ring {
struct napi_struct napi;
int set_itr;
struct igb_ring *buddy;
-#ifdef CONFIG_IGB_LRO
- struct net_lro_mgr lro_mgr;
- bool lro_used;
-#endif
};
};
@@ -288,12 +278,6 @@ struct igb_adapter {
int need_ioport;
struct igb_ring *multi_tx_table[IGB_MAX_TX_QUEUES];
-#ifdef CONFIG_IGB_LRO
- unsigned int lro_max_aggr;
- unsigned int lro_aggregated;
- unsigned int lro_flushed;
- unsigned int lro_no_desc;
-#endif
unsigned int tx_ring_count;
unsigned int rx_ring_count;
};
diff --git a/drivers/net/igb/igb_ethtool.c b/drivers/net/igb/igb_ethtool.c
index 3c831f1..4606e63 100644
--- a/drivers/net/igb/igb_ethtool.c
+++ b/drivers/net/igb/igb_ethtool.c
@@ -93,11 +93,6 @@ static const struct igb_stats igb_gstrings_stats[] = {
{ "tx_smbus", IGB_STAT(stats.mgptc) },
{ "rx_smbus", IGB_STAT(stats.mgprc) },
{ "dropped_smbus", IGB_STAT(stats.mgpdc) },
-#ifdef CONFIG_IGB_LRO
- { "lro_aggregated", IGB_STAT(lro_aggregated) },
- { "lro_flushed", IGB_STAT(lro_flushed) },
- { "lro_no_desc", IGB_STAT(lro_no_desc) },
-#endif
};
#define IGB_QUEUE_STATS_LEN \
@@ -1921,18 +1916,6 @@ static void igb_get_ethtool_stats(struct net_device *netdev,
int stat_count = sizeof(struct igb_queue_stats) / sizeof(u64);
int j;
int i;
-#ifdef CONFIG_IGB_LRO
- int aggregated = 0, flushed = 0, no_desc = 0;
-
- for (i = 0; i < adapter->num_rx_queues; i++) {
- aggregated += adapter->rx_ring[i].lro_mgr.stats.aggregated;
- flushed += adapter->rx_ring[i].lro_mgr.stats.flushed;
- no_desc += adapter->rx_ring[i].lro_mgr.stats.no_desc;
- }
- adapter->lro_aggregated = aggregated;
- adapter->lro_flushed = flushed;
- adapter->lro_no_desc = no_desc;
-#endif
igb_update_stats(adapter);
for (i = 0; i < IGB_GLOBAL_STATS_LEN; i++) {
diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index b82b0fb..ed3d881 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -115,9 +115,6 @@ static bool igb_clean_tx_irq(struct igb_ring *);
static int igb_poll(struct napi_struct *, int);
static bool igb_clean_rx_irq_adv(struct igb_ring *, int *, int);
static void igb_alloc_rx_buffers_adv(struct igb_ring *, int);
-#ifdef CONFIG_IGB_LRO
-static int igb_get_skb_hdr(struct sk_buff *skb, void **, void **, u64 *, void *);
-#endif
static int igb_ioctl(struct net_device *, struct ifreq *, int cmd);
static void igb_tx_timeout(struct net_device *);
static void igb_reset_task(struct work_struct *);
@@ -1189,7 +1186,7 @@ static int __devinit igb_probe(struct pci_dev *pdev,
netdev->features |= NETIF_F_TSO6;
#ifdef CONFIG_IGB_LRO
- netdev->features |= NETIF_F_LRO;
+ netdev->features |= NETIF_F_GRO;
#endif
netdev->vlan_features |= NETIF_F_TSO;
@@ -1739,14 +1736,6 @@ int igb_setup_rx_resources(struct igb_adapter *adapter,
struct pci_dev *pdev = adapter->pdev;
int size, desc_len;
-#ifdef CONFIG_IGB_LRO
- size = sizeof(struct net_lro_desc) * MAX_LRO_DESCRIPTORS;
- rx_ring->lro_mgr.lro_arr = vmalloc(size);
- if (!rx_ring->lro_mgr.lro_arr)
- goto err;
- memset(rx_ring->lro_mgr.lro_arr, 0, size);
-#endif
-
size = sizeof(struct igb_buffer) * rx_ring->count;
rx_ring->buffer_info = vmalloc(size);
if (!rx_ring->buffer_info)
@@ -1773,10 +1762,6 @@ int igb_setup_rx_resources(struct igb_adapter *adapter,
return 0;
err:
-#ifdef CONFIG_IGB_LRO
- vfree(rx_ring->lro_mgr.lro_arr);
- rx_ring->lro_mgr.lro_arr = NULL;
-#endif
vfree(rx_ring->buffer_info);
dev_err(&adapter->pdev->dev, "Unable to allocate memory for "
"the receive descriptor ring\n");
@@ -1930,16 +1915,6 @@ static void igb_configure_rx(struct igb_adapter *adapter)
rxdctl |= IGB_RX_HTHRESH << 8;
rxdctl |= IGB_RX_WTHRESH << 16;
wr32(E1000_RXDCTL(j), rxdctl);
-#ifdef CONFIG_IGB_LRO
- /* Intitial LRO Settings */
- ring->lro_mgr.max_aggr = MAX_LRO_AGGR;
- ring->lro_mgr.max_desc = MAX_LRO_DESCRIPTORS;
- ring->lro_mgr.get_skb_header = igb_get_skb_hdr;
- ring->lro_mgr.features = LRO_F_NAPI | LRO_F_EXTRACT_VLAN_ID;
- ring->lro_mgr.dev = adapter->netdev;
- ring->lro_mgr.ip_summed = CHECKSUM_UNNECESSARY;
- ring->lro_mgr.ip_summed_aggr = CHECKSUM_UNNECESSARY;
-#endif
}
if (adapter->num_rx_queues > 1) {
@@ -2128,11 +2103,6 @@ void igb_free_rx_resources(struct igb_ring *rx_ring)
vfree(rx_ring->buffer_info);
rx_ring->buffer_info = NULL;
-#ifdef CONFIG_IGB_LRO
- vfree(rx_ring->lro_mgr.lro_arr);
- rx_ring->lro_mgr.lro_arr = NULL;
-#endif
-
pci_free_consistent(pdev, rx_ring->size, rx_ring->desc, rx_ring->dma);
rx_ring->desc = NULL;
@@ -3768,39 +3738,6 @@ static bool igb_clean_tx_irq(struct igb_ring *tx_ring)
return (count < tx_ring->count);
}
-#ifdef CONFIG_IGB_LRO
- /**
- * igb_get_skb_hdr - helper function for LRO header processing
- * @skb: pointer to sk_buff to be added to LRO packet
- * @iphdr: pointer to ip header structure
- * @tcph: pointer to tcp header structure
- * @hdr_flags: pointer to header flags
- * @priv: pointer to the receive descriptor for the current sk_buff
- **/
-static int igb_get_skb_hdr(struct sk_buff *skb, void **iphdr, void **tcph,
- u64 *hdr_flags, void *priv)
-{
- union e1000_adv_rx_desc *rx_desc = priv;
- u16 pkt_type = rx_desc->wb.lower.lo_dword.pkt_info &
- (E1000_RXDADV_PKTTYPE_IPV4 | E1000_RXDADV_PKTTYPE_TCP);
-
- /* Verify that this is a valid IPv4 TCP packet */
- if (pkt_type != (E1000_RXDADV_PKTTYPE_IPV4 |
- E1000_RXDADV_PKTTYPE_TCP))
- return -1;
-
- /* Set network headers */
- skb_reset_network_header(skb);
- skb_set_transport_header(skb, ip_hdrlen(skb));
- *iphdr = ip_hdr(skb);
- *tcph = tcp_hdr(skb);
- *hdr_flags = LRO_IPV4 | LRO_TCP;
-
- return 0;
-
-}
-#endif /* CONFIG_IGB_LRO */
-
/**
* igb_receive_skb - helper function to handle rx indications
* @ring: pointer to receive ring receving this packet
@@ -3815,28 +3752,20 @@ static void igb_receive_skb(struct igb_ring *ring, u8 status,
struct igb_adapter * adapter = ring->adapter;
bool vlan_extracted = (adapter->vlgrp && (status & E1000_RXD_STAT_VP));
-#ifdef CONFIG_IGB_LRO
- if (adapter->netdev->features & NETIF_F_LRO &&
- skb->ip_summed == CHECKSUM_UNNECESSARY) {
+ if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
if (vlan_extracted)
- lro_vlan_hwaccel_receive_skb(&ring->lro_mgr, skb,
- adapter->vlgrp,
- le16_to_cpu(rx_desc->wb.upper.vlan),
- rx_desc);
+ vlan_gro_receive(&ring->napi, adapter->vlgrp,
+ le16_to_cpu(rx_desc->wb.upper.vlan),
+ skb);
else
- lro_receive_skb(&ring->lro_mgr,skb, rx_desc);
- ring->lro_used = 1;
+ napi_gro_receive(&ring->napi, skb);
} else {
-#endif
if (vlan_extracted)
vlan_hwaccel_receive_skb(skb, adapter->vlgrp,
le16_to_cpu(rx_desc->wb.upper.vlan));
else
-
netif_receive_skb(skb);
-#ifdef CONFIG_IGB_LRO
}
-#endif
}
@@ -3991,13 +3920,6 @@ next_desc:
rx_ring->next_to_clean = i;
cleaned_count = IGB_DESC_UNUSED(rx_ring);
-#ifdef CONFIG_IGB_LRO
- if (rx_ring->lro_used) {
- lro_flush_all(&rx_ring->lro_mgr);
- rx_ring->lro_used = 0;
- }
-#endif
-
if (cleaned_count)
igb_alloc_rx_buffers_adv(rx_ring, cleaned_count);
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 37+ messages in thread
* ixgbe: Replace LRO with GRO
2009-01-13 9:28 ` [2/2] igb: Replace LRO with GRO Herbert Xu
@ 2009-01-14 8:44 ` Herbert Xu
2009-01-14 11:53 ` Jeff Kirsher
2009-01-15 3:46 ` Herbert Xu
2009-01-14 11:49 ` [2/2] igb: " Jeff Kirsher
2009-01-19 5:47 ` David Miller
2 siblings, 2 replies; 37+ messages in thread
From: Herbert Xu @ 2009-01-14 8:44 UTC (permalink / raw)
To: David S. Miller, Jeff Kirsher, netdev
Hi:
Another day, another driver.
This one's pretty simple, there is some netpoll bogosity around
with the switch to netif_rx but at least it's harmless.
ixgbe: Replace LRO with GRO
This patch makes igb invoke the GRO hooks instead of LRO. As
GRO has a compatible external interface to LRO this is a very
straightforward replacement.
As GRO uses the napi structure to track the held packets, I've
modified the code paths involved to pass that along.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index e112008..6ac361a 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -31,7 +31,6 @@
#include <linux/types.h>
#include <linux/pci.h>
#include <linux/netdevice.h>
-#include <linux/inet_lro.h>
#include <linux/aer.h>
#include "ixgbe_type.h"
@@ -88,9 +87,6 @@
#define IXGBE_TX_FLAGS_VLAN_PRIO_MASK 0x0000e000
#define IXGBE_TX_FLAGS_VLAN_SHIFT 16
-#define IXGBE_MAX_LRO_DESCRIPTORS 8
-#define IXGBE_MAX_LRO_AGGREGATE 32
-
/* wrapper around a pointer to a socket buffer,
* so a DMA handle can be stored along with the buffer */
struct ixgbe_tx_buffer {
@@ -142,8 +138,6 @@ struct ixgbe_ring {
/* cpu for tx queue */
int cpu;
#endif
- struct net_lro_mgr lro_mgr;
- bool lro_used;
struct ixgbe_queue_stats stats;
u16 v_idx; /* maps directly to the index for this ring in the hardware
* vector array, can also be used for finding the bit in EICR
@@ -301,9 +295,6 @@ struct ixgbe_adapter {
unsigned long state;
u64 tx_busy;
- u64 lro_aggregated;
- u64 lro_flushed;
- u64 lro_no_desc;
unsigned int tx_ring_count;
unsigned int rx_ring_count;
diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index 67f87a7..4f6b5df 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -89,8 +89,6 @@ static struct ixgbe_stats ixgbe_gstrings_stats[] = {
{"rx_header_split", IXGBE_STAT(rx_hdr_split)},
{"alloc_rx_page_failed", IXGBE_STAT(alloc_rx_page_failed)},
{"alloc_rx_buff_failed", IXGBE_STAT(alloc_rx_buff_failed)},
- {"lro_aggregated", IXGBE_STAT(lro_aggregated)},
- {"lro_flushed", IXGBE_STAT(lro_flushed)},
};
#define IXGBE_QUEUE_STATS_LEN \
@@ -808,15 +806,6 @@ static void ixgbe_get_ethtool_stats(struct net_device *netdev,
int stat_count = sizeof(struct ixgbe_queue_stats) / sizeof(u64);
int j, k;
int i;
- u64 aggregated = 0, flushed = 0, no_desc = 0;
- for (i = 0; i < adapter->num_rx_queues; i++) {
- aggregated += adapter->rx_ring[i].lro_mgr.stats.aggregated;
- flushed += adapter->rx_ring[i].lro_mgr.stats.flushed;
- no_desc += adapter->rx_ring[i].lro_mgr.stats.no_desc;
- }
- adapter->lro_aggregated = aggregated;
- adapter->lro_flushed = flushed;
- adapter->lro_no_desc = no_desc;
ixgbe_update_stats(adapter);
for (i = 0; i < IXGBE_GLOBAL_STATS_LEN; i++) {
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index acef3c6..23f25fe 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -400,23 +400,20 @@ static int __ixgbe_notify_dca(struct device *dev, void *data)
* @rx_ring: rx descriptor ring (for a specific queue) to setup
* @rx_desc: rx descriptor
**/
-static void ixgbe_receive_skb(struct ixgbe_adapter *adapter,
+static void ixgbe_receive_skb(struct ixgbe_q_vector *q_vector,
struct sk_buff *skb, u8 status,
- struct ixgbe_ring *ring,
union ixgbe_adv_rx_desc *rx_desc)
{
+ struct ixgbe_adapter *adapter = q_vector->adapter;
+ struct napi_struct *napi = &q_vector->napi;
bool is_vlan = (status & IXGBE_RXD_STAT_VP);
u16 tag = le16_to_cpu(rx_desc->wb.upper.vlan);
- if (adapter->netdev->features & NETIF_F_LRO &&
- skb->ip_summed == CHECKSUM_UNNECESSARY) {
+ if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
if (adapter->vlgrp && is_vlan && (tag != 0))
- lro_vlan_hwaccel_receive_skb(&ring->lro_mgr, skb,
- adapter->vlgrp, tag,
- rx_desc);
+ vlan_gro_receive(napi, adapter->vlgrp, tag, skb);
else
- lro_receive_skb(&ring->lro_mgr, skb, rx_desc);
- ring->lro_used = true;
+ napi_gro_receive(napi, skb);
} else {
if (!(adapter->flags & IXGBE_FLAG_IN_NETPOLL)) {
if (adapter->vlgrp && is_vlan && (tag != 0))
@@ -571,10 +568,11 @@ static inline u16 ixgbe_get_pkt_info(union ixgbe_adv_rx_desc *rx_desc)
return rx_desc->wb.lower.lo_dword.hs_rss.pkt_info;
}
-static bool ixgbe_clean_rx_irq(struct ixgbe_adapter *adapter,
+static bool ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
struct ixgbe_ring *rx_ring,
int *work_done, int work_to_do)
{
+ struct ixgbe_adapter *adapter = q_vector->adapter;
struct pci_dev *pdev = adapter->pdev;
union ixgbe_adv_rx_desc *rx_desc, *next_rxd;
struct ixgbe_rx_buffer *rx_buffer_info, *next_buffer;
@@ -675,7 +673,7 @@ static bool ixgbe_clean_rx_irq(struct ixgbe_adapter *adapter,
total_rx_packets++;
skb->protocol = eth_type_trans(skb, adapter->netdev);
- ixgbe_receive_skb(adapter, skb, staterr, rx_ring, rx_desc);
+ ixgbe_receive_skb(q_vector, skb, staterr, rx_desc);
next_desc:
rx_desc->wb.upper.status_error = 0;
@@ -693,11 +691,6 @@ next_desc:
staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
}
- if (rx_ring->lro_used) {
- lro_flush_all(&rx_ring->lro_mgr);
- rx_ring->lro_used = false;
- }
-
rx_ring->next_to_clean = i;
cleaned_count = IXGBE_DESC_UNUSED(rx_ring);
@@ -1049,7 +1042,7 @@ static int ixgbe_clean_rxonly(struct napi_struct *napi, int budget)
ixgbe_update_rx_dca(adapter, rx_ring);
#endif
- ixgbe_clean_rx_irq(adapter, rx_ring, &work_done, budget);
+ ixgbe_clean_rx_irq(q_vector, rx_ring, &work_done, budget);
/* If all Rx work done, exit the polling mode */
if (work_done < budget) {
@@ -1092,7 +1085,7 @@ static int ixgbe_clean_rxonly_many(struct napi_struct *napi, int budget)
if (adapter->flags & IXGBE_FLAG_DCA_ENABLED)
ixgbe_update_rx_dca(adapter, rx_ring);
#endif
- ixgbe_clean_rx_irq(adapter, rx_ring, &work_done, budget);
+ ixgbe_clean_rx_irq(q_vector, rx_ring, &work_done, budget);
enable_mask |= rx_ring->v_idx;
r_idx = find_next_bit(q_vector->rxr_idx, adapter->num_rx_queues,
r_idx + 1);
@@ -1565,33 +1558,6 @@ static void ixgbe_configure_srrctl(struct ixgbe_adapter *adapter, int index)
IXGBE_WRITE_REG(&adapter->hw, IXGBE_SRRCTL(index), srrctl);
}
-/**
- * ixgbe_get_skb_hdr - helper function for LRO header processing
- * @skb: pointer to sk_buff to be added to LRO packet
- * @iphdr: pointer to ip header structure
- * @tcph: pointer to tcp header structure
- * @hdr_flags: pointer to header flags
- * @priv: private data
- **/
-static int ixgbe_get_skb_hdr(struct sk_buff *skb, void **iphdr, void **tcph,
- u64 *hdr_flags, void *priv)
-{
- union ixgbe_adv_rx_desc *rx_desc = priv;
-
- /* Verify that this is a valid IPv4 TCP packet */
- if (!((ixgbe_get_pkt_info(rx_desc) & IXGBE_RXDADV_PKTTYPE_IPV4) &&
- (ixgbe_get_pkt_info(rx_desc) & IXGBE_RXDADV_PKTTYPE_TCP)))
- return -1;
-
- /* Set network headers */
- skb_reset_network_header(skb);
- skb_set_transport_header(skb, ip_hdrlen(skb));
- *iphdr = ip_hdr(skb);
- *tcph = tcp_hdr(skb);
- *hdr_flags = LRO_IPV4 | LRO_TCP;
- return 0;
-}
-
#define PAGE_USE_COUNT(S) (((S) >> PAGE_SHIFT) + \
(((S) & (PAGE_SIZE - 1)) ? 1 : 0))
@@ -1663,16 +1629,6 @@ static void ixgbe_configure_rx(struct ixgbe_adapter *adapter)
adapter->rx_ring[i].head = IXGBE_RDH(j);
adapter->rx_ring[i].tail = IXGBE_RDT(j);
adapter->rx_ring[i].rx_buf_len = rx_buf_len;
- /* Intitial LRO Settings */
- adapter->rx_ring[i].lro_mgr.max_aggr = IXGBE_MAX_LRO_AGGREGATE;
- adapter->rx_ring[i].lro_mgr.max_desc = IXGBE_MAX_LRO_DESCRIPTORS;
- adapter->rx_ring[i].lro_mgr.get_skb_header = ixgbe_get_skb_hdr;
- adapter->rx_ring[i].lro_mgr.features = LRO_F_EXTRACT_VLAN_ID;
- if (!(adapter->flags & IXGBE_FLAG_IN_NETPOLL))
- adapter->rx_ring[i].lro_mgr.features |= LRO_F_NAPI;
- adapter->rx_ring[i].lro_mgr.dev = adapter->netdev;
- adapter->rx_ring[i].lro_mgr.ip_summed = CHECKSUM_UNNECESSARY;
- adapter->rx_ring[i].lro_mgr.ip_summed_aggr = CHECKSUM_UNNECESSARY;
ixgbe_configure_srrctl(adapter, j);
}
@@ -2303,7 +2259,7 @@ static int ixgbe_poll(struct napi_struct *napi, int budget)
#endif
tx_cleaned = ixgbe_clean_tx_irq(adapter, adapter->tx_ring);
- ixgbe_clean_rx_irq(adapter, adapter->rx_ring, &work_done, budget);
+ ixgbe_clean_rx_irq(q_vector, adapter->rx_ring, &work_done, budget);
if (tx_cleaned)
work_done = budget;
@@ -2919,12 +2875,6 @@ int ixgbe_setup_rx_resources(struct ixgbe_adapter *adapter,
struct pci_dev *pdev = adapter->pdev;
int size;
- size = sizeof(struct net_lro_desc) * IXGBE_MAX_LRO_DESCRIPTORS;
- rx_ring->lro_mgr.lro_arr = vmalloc(size);
- if (!rx_ring->lro_mgr.lro_arr)
- return -ENOMEM;
- memset(rx_ring->lro_mgr.lro_arr, 0, size);
-
size = sizeof(struct ixgbe_rx_buffer) * rx_ring->count;
rx_ring->rx_buffer_info = vmalloc(size);
if (!rx_ring->rx_buffer_info) {
@@ -2953,8 +2903,6 @@ int ixgbe_setup_rx_resources(struct ixgbe_adapter *adapter,
return 0;
alloc_failed:
- vfree(rx_ring->lro_mgr.lro_arr);
- rx_ring->lro_mgr.lro_arr = NULL;
return -ENOMEM;
}
@@ -3032,9 +2980,6 @@ void ixgbe_free_rx_resources(struct ixgbe_adapter *adapter,
{
struct pci_dev *pdev = adapter->pdev;
- vfree(rx_ring->lro_mgr.lro_arr);
- rx_ring->lro_mgr.lro_arr = NULL;
-
ixgbe_clean_rx_ring(adapter, rx_ring);
vfree(rx_ring->rx_buffer_info);
@@ -4136,7 +4081,7 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
netdev->features |= NETIF_F_IPV6_CSUM;
netdev->features |= NETIF_F_TSO;
netdev->features |= NETIF_F_TSO6;
- netdev->features |= NETIF_F_LRO;
+ netdev->features |= NETIF_F_GRO;
netdev->vlan_features |= NETIF_F_TSO;
netdev->vlan_features |= NETIF_F_TSO6;
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [2/2] igb: Replace LRO with GRO
2009-01-13 9:28 ` [2/2] igb: Replace LRO with GRO Herbert Xu
2009-01-14 8:44 ` ixgbe: " Herbert Xu
@ 2009-01-14 11:49 ` Jeff Kirsher
2009-01-14 12:36 ` Herbert Xu
2009-01-19 5:47 ` David Miller
2 siblings, 1 reply; 37+ messages in thread
From: Jeff Kirsher @ 2009-01-14 11:49 UTC (permalink / raw)
To: Herbert Xu; +Cc: David S. Miller, netdev
On Tue, Jan 13, 2009 at 1:28 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> Hi:
>
> And here is the first real conversion.
>
> I can't test this one because I don't have the hardware. However,
> GRO is a software feature and I have tested the various code paths
> using e1000e.
>
> igb: Replace LRO with GRO
>
> This patch makes igb invoke the GRO hooks instead of LRO. As
> GRO has a compatible external interface to LRO this is a very
> straightforward replacement.
>
> Three things of note:
>
> 1) I've kept the LRO Kconfig option until we decide to enable
> GRO across the board at which point it can also be killed.
>
> 2) The poll_controller stuff is broken in igb as it tries to do
> the same work as the normal poll routine. Since poll_controller
> can be called in the middle of a poll, this can't be good.
>
> I noticed this because poll_controller can invoke the GRO hooks
> without flushing held GRO packets.
>
> However, this should be harmless (assuming the poll_controller
> bug above doesn't kill you first :) since the next ->poll will
> clear the backlog. The only time when we'll have a problem is
> if we're already executing the GRO code on the same ring, but
> that's no worse than what happens now.
>
> 3) I kept the ip_summed check before calling GRO so that we're
> on par with previous behaviour.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
> index 65afda4..cf4fea0 100644
> --- a/drivers/net/Kconfig
> +++ b/drivers/net/Kconfig
> @@ -2022,7 +2022,6 @@ config IGB
> config IGB_LRO
> bool "Use software LRO"
> depends on IGB && INET
> - select INET_LRO
> ---help---
> Say Y here if you want to use large receive offload.
>
> diff --git a/drivers/net/igb/igb.h b/drivers/net/igb/igb.h
> index 5a27825..7d8c887 100644
> --- a/drivers/net/igb/igb.h
> +++ b/drivers/net/igb/igb.h
> @@ -36,12 +36,6 @@
>
> struct igb_adapter;
>
> -#ifdef CONFIG_IGB_LRO
> -#include <linux/inet_lro.h>
> -#define MAX_LRO_AGGR 32
> -#define MAX_LRO_DESCRIPTORS 8
> -#endif
> -
> /* Interrupt defines */
> #define IGB_MIN_DYN_ITR 3000
> #define IGB_MAX_DYN_ITR 96000
> @@ -176,10 +170,6 @@ struct igb_ring {
> struct napi_struct napi;
> int set_itr;
> struct igb_ring *buddy;
> -#ifdef CONFIG_IGB_LRO
> - struct net_lro_mgr lro_mgr;
> - bool lro_used;
> -#endif
> };
> };
>
> @@ -288,12 +278,6 @@ struct igb_adapter {
> int need_ioport;
>
> struct igb_ring *multi_tx_table[IGB_MAX_TX_QUEUES];
> -#ifdef CONFIG_IGB_LRO
> - unsigned int lro_max_aggr;
> - unsigned int lro_aggregated;
> - unsigned int lro_flushed;
> - unsigned int lro_no_desc;
> -#endif
> unsigned int tx_ring_count;
> unsigned int rx_ring_count;
> };
> diff --git a/drivers/net/igb/igb_ethtool.c b/drivers/net/igb/igb_ethtool.c
> index 3c831f1..4606e63 100644
> --- a/drivers/net/igb/igb_ethtool.c
> +++ b/drivers/net/igb/igb_ethtool.c
> @@ -93,11 +93,6 @@ static const struct igb_stats igb_gstrings_stats[] = {
> { "tx_smbus", IGB_STAT(stats.mgptc) },
> { "rx_smbus", IGB_STAT(stats.mgprc) },
> { "dropped_smbus", IGB_STAT(stats.mgpdc) },
> -#ifdef CONFIG_IGB_LRO
> - { "lro_aggregated", IGB_STAT(lro_aggregated) },
> - { "lro_flushed", IGB_STAT(lro_flushed) },
> - { "lro_no_desc", IGB_STAT(lro_no_desc) },
> -#endif
> };
>
> #define IGB_QUEUE_STATS_LEN \
> @@ -1921,18 +1916,6 @@ static void igb_get_ethtool_stats(struct net_device *netdev,
> int stat_count = sizeof(struct igb_queue_stats) / sizeof(u64);
> int j;
> int i;
> -#ifdef CONFIG_IGB_LRO
> - int aggregated = 0, flushed = 0, no_desc = 0;
> -
> - for (i = 0; i < adapter->num_rx_queues; i++) {
> - aggregated += adapter->rx_ring[i].lro_mgr.stats.aggregated;
> - flushed += adapter->rx_ring[i].lro_mgr.stats.flushed;
> - no_desc += adapter->rx_ring[i].lro_mgr.stats.no_desc;
> - }
> - adapter->lro_aggregated = aggregated;
> - adapter->lro_flushed = flushed;
> - adapter->lro_no_desc = no_desc;
> -#endif
>
> igb_update_stats(adapter);
> for (i = 0; i < IGB_GLOBAL_STATS_LEN; i++) {
> diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
> index b82b0fb..ed3d881 100644
> --- a/drivers/net/igb/igb_main.c
> +++ b/drivers/net/igb/igb_main.c
> @@ -115,9 +115,6 @@ static bool igb_clean_tx_irq(struct igb_ring *);
> static int igb_poll(struct napi_struct *, int);
> static bool igb_clean_rx_irq_adv(struct igb_ring *, int *, int);
> static void igb_alloc_rx_buffers_adv(struct igb_ring *, int);
> -#ifdef CONFIG_IGB_LRO
> -static int igb_get_skb_hdr(struct sk_buff *skb, void **, void **, u64 *, void *);
> -#endif
> static int igb_ioctl(struct net_device *, struct ifreq *, int cmd);
> static void igb_tx_timeout(struct net_device *);
> static void igb_reset_task(struct work_struct *);
> @@ -1189,7 +1186,7 @@ static int __devinit igb_probe(struct pci_dev *pdev,
> netdev->features |= NETIF_F_TSO6;
>
> #ifdef CONFIG_IGB_LRO
> - netdev->features |= NETIF_F_LRO;
> + netdev->features |= NETIF_F_GRO;
> #endif
>
> netdev->vlan_features |= NETIF_F_TSO;
> @@ -1739,14 +1736,6 @@ int igb_setup_rx_resources(struct igb_adapter *adapter,
> struct pci_dev *pdev = adapter->pdev;
> int size, desc_len;
>
> -#ifdef CONFIG_IGB_LRO
> - size = sizeof(struct net_lro_desc) * MAX_LRO_DESCRIPTORS;
> - rx_ring->lro_mgr.lro_arr = vmalloc(size);
> - if (!rx_ring->lro_mgr.lro_arr)
> - goto err;
> - memset(rx_ring->lro_mgr.lro_arr, 0, size);
> -#endif
> -
> size = sizeof(struct igb_buffer) * rx_ring->count;
> rx_ring->buffer_info = vmalloc(size);
> if (!rx_ring->buffer_info)
> @@ -1773,10 +1762,6 @@ int igb_setup_rx_resources(struct igb_adapter *adapter,
> return 0;
>
> err:
> -#ifdef CONFIG_IGB_LRO
> - vfree(rx_ring->lro_mgr.lro_arr);
> - rx_ring->lro_mgr.lro_arr = NULL;
> -#endif
> vfree(rx_ring->buffer_info);
> dev_err(&adapter->pdev->dev, "Unable to allocate memory for "
> "the receive descriptor ring\n");
> @@ -1930,16 +1915,6 @@ static void igb_configure_rx(struct igb_adapter *adapter)
> rxdctl |= IGB_RX_HTHRESH << 8;
> rxdctl |= IGB_RX_WTHRESH << 16;
> wr32(E1000_RXDCTL(j), rxdctl);
> -#ifdef CONFIG_IGB_LRO
> - /* Intitial LRO Settings */
> - ring->lro_mgr.max_aggr = MAX_LRO_AGGR;
> - ring->lro_mgr.max_desc = MAX_LRO_DESCRIPTORS;
> - ring->lro_mgr.get_skb_header = igb_get_skb_hdr;
> - ring->lro_mgr.features = LRO_F_NAPI | LRO_F_EXTRACT_VLAN_ID;
> - ring->lro_mgr.dev = adapter->netdev;
> - ring->lro_mgr.ip_summed = CHECKSUM_UNNECESSARY;
> - ring->lro_mgr.ip_summed_aggr = CHECKSUM_UNNECESSARY;
> -#endif
> }
>
> if (adapter->num_rx_queues > 1) {
> @@ -2128,11 +2103,6 @@ void igb_free_rx_resources(struct igb_ring *rx_ring)
> vfree(rx_ring->buffer_info);
> rx_ring->buffer_info = NULL;
>
> -#ifdef CONFIG_IGB_LRO
> - vfree(rx_ring->lro_mgr.lro_arr);
> - rx_ring->lro_mgr.lro_arr = NULL;
> -#endif
> -
> pci_free_consistent(pdev, rx_ring->size, rx_ring->desc, rx_ring->dma);
>
> rx_ring->desc = NULL;
> @@ -3768,39 +3738,6 @@ static bool igb_clean_tx_irq(struct igb_ring *tx_ring)
> return (count < tx_ring->count);
> }
>
> -#ifdef CONFIG_IGB_LRO
> - /**
> - * igb_get_skb_hdr - helper function for LRO header processing
> - * @skb: pointer to sk_buff to be added to LRO packet
> - * @iphdr: pointer to ip header structure
> - * @tcph: pointer to tcp header structure
> - * @hdr_flags: pointer to header flags
> - * @priv: pointer to the receive descriptor for the current sk_buff
> - **/
> -static int igb_get_skb_hdr(struct sk_buff *skb, void **iphdr, void **tcph,
> - u64 *hdr_flags, void *priv)
> -{
> - union e1000_adv_rx_desc *rx_desc = priv;
> - u16 pkt_type = rx_desc->wb.lower.lo_dword.pkt_info &
> - (E1000_RXDADV_PKTTYPE_IPV4 | E1000_RXDADV_PKTTYPE_TCP);
> -
> - /* Verify that this is a valid IPv4 TCP packet */
> - if (pkt_type != (E1000_RXDADV_PKTTYPE_IPV4 |
> - E1000_RXDADV_PKTTYPE_TCP))
> - return -1;
> -
> - /* Set network headers */
> - skb_reset_network_header(skb);
> - skb_set_transport_header(skb, ip_hdrlen(skb));
> - *iphdr = ip_hdr(skb);
> - *tcph = tcp_hdr(skb);
> - *hdr_flags = LRO_IPV4 | LRO_TCP;
> -
> - return 0;
> -
> -}
> -#endif /* CONFIG_IGB_LRO */
> -
> /**
> * igb_receive_skb - helper function to handle rx indications
> * @ring: pointer to receive ring receving this packet
> @@ -3815,28 +3752,20 @@ static void igb_receive_skb(struct igb_ring *ring, u8 status,
> struct igb_adapter * adapter = ring->adapter;
> bool vlan_extracted = (adapter->vlgrp && (status & E1000_RXD_STAT_VP));
>
> -#ifdef CONFIG_IGB_LRO
> - if (adapter->netdev->features & NETIF_F_LRO &&
> - skb->ip_summed == CHECKSUM_UNNECESSARY) {
> + if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
> if (vlan_extracted)
> - lro_vlan_hwaccel_receive_skb(&ring->lro_mgr, skb,
> - adapter->vlgrp,
> - le16_to_cpu(rx_desc->wb.upper.vlan),
> - rx_desc);
> + vlan_gro_receive(&ring->napi, adapter->vlgrp,
> + le16_to_cpu(rx_desc->wb.upper.vlan),
> + skb);
> else
> - lro_receive_skb(&ring->lro_mgr,skb, rx_desc);
> - ring->lro_used = 1;
> + napi_gro_receive(&ring->napi, skb);
> } else {
> -#endif
> if (vlan_extracted)
> vlan_hwaccel_receive_skb(skb, adapter->vlgrp,
> le16_to_cpu(rx_desc->wb.upper.vlan));
> else
> -
> netif_receive_skb(skb);
> -#ifdef CONFIG_IGB_LRO
> }
> -#endif
> }
>
>
> @@ -3991,13 +3920,6 @@ next_desc:
> rx_ring->next_to_clean = i;
> cleaned_count = IGB_DESC_UNUSED(rx_ring);
>
> -#ifdef CONFIG_IGB_LRO
> - if (rx_ring->lro_used) {
> - lro_flush_all(&rx_ring->lro_mgr);
> - rx_ring->lro_used = 0;
> - }
> -#endif
> -
> if (cleaned_count)
> igb_alloc_rx_buffers_adv(rx_ring, cleaned_count);
>
> Cheers,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Ok, a couple of things...
First is that we should have the patch in testing (most likely today,
do not worry dave, it won't take a week)
Second is I am not sure you need to keep this code wraped in CONFIG_IGB_LRO...
#ifdef CONFIG_IGB_LRO
- netdev->features |= NETIF_F_LRO;
+ netdev->features |= NETIF_F_GRO;
#endif
Also, you left part of the lro code in igb_receive_skb and then put in
the GRO code. Our preference is that you mirrored what you did with
e1000e and just replaced the main vlan_hwacel and netif_receive_skb
calls.
--
Cheers,
Jeff
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: ixgbe: Replace LRO with GRO
2009-01-14 8:44 ` ixgbe: " Herbert Xu
@ 2009-01-14 11:53 ` Jeff Kirsher
2009-01-15 3:46 ` Herbert Xu
1 sibling, 0 replies; 37+ messages in thread
From: Jeff Kirsher @ 2009-01-14 11:53 UTC (permalink / raw)
To: Herbert Xu; +Cc: David S. Miller, netdev
On Wed, Jan 14, 2009 at 12:44 AM, Herbert Xu
<herbert@gondor.apana.org.au> wrote:
> Hi:
>
> Another day, another driver.
>
> This one's pretty simple, there is some netpoll bogosity around
> with the switch to netif_rx but at least it's harmless.
>
> ixgbe: Replace LRO with GRO
>
> This patch makes igb invoke the GRO hooks instead of LRO. As
> GRO has a compatible external interface to LRO this is a very
> straightforward replacement.
>
> As GRO uses the napi structure to track the held packets, I've
> modified the code paths involved to pass that along.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
> index e112008..6ac361a 100644
> --- a/drivers/net/ixgbe/ixgbe.h
> +++ b/drivers/net/ixgbe/ixgbe.h
> @@ -31,7 +31,6 @@
> #include <linux/types.h>
> #include <linux/pci.h>
> #include <linux/netdevice.h>
> -#include <linux/inet_lro.h>
> #include <linux/aer.h>
>
> #include "ixgbe_type.h"
> @@ -88,9 +87,6 @@
> #define IXGBE_TX_FLAGS_VLAN_PRIO_MASK 0x0000e000
> #define IXGBE_TX_FLAGS_VLAN_SHIFT 16
>
> -#define IXGBE_MAX_LRO_DESCRIPTORS 8
> -#define IXGBE_MAX_LRO_AGGREGATE 32
> -
> /* wrapper around a pointer to a socket buffer,
> * so a DMA handle can be stored along with the buffer */
> struct ixgbe_tx_buffer {
> @@ -142,8 +138,6 @@ struct ixgbe_ring {
> /* cpu for tx queue */
> int cpu;
> #endif
> - struct net_lro_mgr lro_mgr;
> - bool lro_used;
> struct ixgbe_queue_stats stats;
> u16 v_idx; /* maps directly to the index for this ring in the hardware
> * vector array, can also be used for finding the bit in EICR
> @@ -301,9 +295,6 @@ struct ixgbe_adapter {
>
> unsigned long state;
> u64 tx_busy;
> - u64 lro_aggregated;
> - u64 lro_flushed;
> - u64 lro_no_desc;
> unsigned int tx_ring_count;
> unsigned int rx_ring_count;
>
> diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
> index 67f87a7..4f6b5df 100644
> --- a/drivers/net/ixgbe/ixgbe_ethtool.c
> +++ b/drivers/net/ixgbe/ixgbe_ethtool.c
> @@ -89,8 +89,6 @@ static struct ixgbe_stats ixgbe_gstrings_stats[] = {
> {"rx_header_split", IXGBE_STAT(rx_hdr_split)},
> {"alloc_rx_page_failed", IXGBE_STAT(alloc_rx_page_failed)},
> {"alloc_rx_buff_failed", IXGBE_STAT(alloc_rx_buff_failed)},
> - {"lro_aggregated", IXGBE_STAT(lro_aggregated)},
> - {"lro_flushed", IXGBE_STAT(lro_flushed)},
> };
>
> #define IXGBE_QUEUE_STATS_LEN \
> @@ -808,15 +806,6 @@ static void ixgbe_get_ethtool_stats(struct net_device *netdev,
> int stat_count = sizeof(struct ixgbe_queue_stats) / sizeof(u64);
> int j, k;
> int i;
> - u64 aggregated = 0, flushed = 0, no_desc = 0;
> - for (i = 0; i < adapter->num_rx_queues; i++) {
> - aggregated += adapter->rx_ring[i].lro_mgr.stats.aggregated;
> - flushed += adapter->rx_ring[i].lro_mgr.stats.flushed;
> - no_desc += adapter->rx_ring[i].lro_mgr.stats.no_desc;
> - }
> - adapter->lro_aggregated = aggregated;
> - adapter->lro_flushed = flushed;
> - adapter->lro_no_desc = no_desc;
>
> ixgbe_update_stats(adapter);
> for (i = 0; i < IXGBE_GLOBAL_STATS_LEN; i++) {
> diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
> index acef3c6..23f25fe 100644
> --- a/drivers/net/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ixgbe/ixgbe_main.c
> @@ -400,23 +400,20 @@ static int __ixgbe_notify_dca(struct device *dev, void *data)
> * @rx_ring: rx descriptor ring (for a specific queue) to setup
> * @rx_desc: rx descriptor
> **/
> -static void ixgbe_receive_skb(struct ixgbe_adapter *adapter,
> +static void ixgbe_receive_skb(struct ixgbe_q_vector *q_vector,
> struct sk_buff *skb, u8 status,
> - struct ixgbe_ring *ring,
> union ixgbe_adv_rx_desc *rx_desc)
> {
> + struct ixgbe_adapter *adapter = q_vector->adapter;
> + struct napi_struct *napi = &q_vector->napi;
> bool is_vlan = (status & IXGBE_RXD_STAT_VP);
> u16 tag = le16_to_cpu(rx_desc->wb.upper.vlan);
>
> - if (adapter->netdev->features & NETIF_F_LRO &&
> - skb->ip_summed == CHECKSUM_UNNECESSARY) {
> + if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
> if (adapter->vlgrp && is_vlan && (tag != 0))
> - lro_vlan_hwaccel_receive_skb(&ring->lro_mgr, skb,
> - adapter->vlgrp, tag,
> - rx_desc);
> + vlan_gro_receive(napi, adapter->vlgrp, tag, skb);
> else
> - lro_receive_skb(&ring->lro_mgr, skb, rx_desc);
> - ring->lro_used = true;
> + napi_gro_receive(napi, skb);
> } else {
> if (!(adapter->flags & IXGBE_FLAG_IN_NETPOLL)) {
> if (adapter->vlgrp && is_vlan && (tag != 0))
> @@ -571,10 +568,11 @@ static inline u16 ixgbe_get_pkt_info(union ixgbe_adv_rx_desc *rx_desc)
> return rx_desc->wb.lower.lo_dword.hs_rss.pkt_info;
> }
>
> -static bool ixgbe_clean_rx_irq(struct ixgbe_adapter *adapter,
> +static bool ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
> struct ixgbe_ring *rx_ring,
> int *work_done, int work_to_do)
> {
> + struct ixgbe_adapter *adapter = q_vector->adapter;
> struct pci_dev *pdev = adapter->pdev;
> union ixgbe_adv_rx_desc *rx_desc, *next_rxd;
> struct ixgbe_rx_buffer *rx_buffer_info, *next_buffer;
> @@ -675,7 +673,7 @@ static bool ixgbe_clean_rx_irq(struct ixgbe_adapter *adapter,
> total_rx_packets++;
>
> skb->protocol = eth_type_trans(skb, adapter->netdev);
> - ixgbe_receive_skb(adapter, skb, staterr, rx_ring, rx_desc);
> + ixgbe_receive_skb(q_vector, skb, staterr, rx_desc);
>
> next_desc:
> rx_desc->wb.upper.status_error = 0;
> @@ -693,11 +691,6 @@ next_desc:
> staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
> }
>
> - if (rx_ring->lro_used) {
> - lro_flush_all(&rx_ring->lro_mgr);
> - rx_ring->lro_used = false;
> - }
> -
> rx_ring->next_to_clean = i;
> cleaned_count = IXGBE_DESC_UNUSED(rx_ring);
>
> @@ -1049,7 +1042,7 @@ static int ixgbe_clean_rxonly(struct napi_struct *napi, int budget)
> ixgbe_update_rx_dca(adapter, rx_ring);
> #endif
>
> - ixgbe_clean_rx_irq(adapter, rx_ring, &work_done, budget);
> + ixgbe_clean_rx_irq(q_vector, rx_ring, &work_done, budget);
>
> /* If all Rx work done, exit the polling mode */
> if (work_done < budget) {
> @@ -1092,7 +1085,7 @@ static int ixgbe_clean_rxonly_many(struct napi_struct *napi, int budget)
> if (adapter->flags & IXGBE_FLAG_DCA_ENABLED)
> ixgbe_update_rx_dca(adapter, rx_ring);
> #endif
> - ixgbe_clean_rx_irq(adapter, rx_ring, &work_done, budget);
> + ixgbe_clean_rx_irq(q_vector, rx_ring, &work_done, budget);
> enable_mask |= rx_ring->v_idx;
> r_idx = find_next_bit(q_vector->rxr_idx, adapter->num_rx_queues,
> r_idx + 1);
> @@ -1565,33 +1558,6 @@ static void ixgbe_configure_srrctl(struct ixgbe_adapter *adapter, int index)
> IXGBE_WRITE_REG(&adapter->hw, IXGBE_SRRCTL(index), srrctl);
> }
>
> -/**
> - * ixgbe_get_skb_hdr - helper function for LRO header processing
> - * @skb: pointer to sk_buff to be added to LRO packet
> - * @iphdr: pointer to ip header structure
> - * @tcph: pointer to tcp header structure
> - * @hdr_flags: pointer to header flags
> - * @priv: private data
> - **/
> -static int ixgbe_get_skb_hdr(struct sk_buff *skb, void **iphdr, void **tcph,
> - u64 *hdr_flags, void *priv)
> -{
> - union ixgbe_adv_rx_desc *rx_desc = priv;
> -
> - /* Verify that this is a valid IPv4 TCP packet */
> - if (!((ixgbe_get_pkt_info(rx_desc) & IXGBE_RXDADV_PKTTYPE_IPV4) &&
> - (ixgbe_get_pkt_info(rx_desc) & IXGBE_RXDADV_PKTTYPE_TCP)))
> - return -1;
> -
> - /* Set network headers */
> - skb_reset_network_header(skb);
> - skb_set_transport_header(skb, ip_hdrlen(skb));
> - *iphdr = ip_hdr(skb);
> - *tcph = tcp_hdr(skb);
> - *hdr_flags = LRO_IPV4 | LRO_TCP;
> - return 0;
> -}
> -
> #define PAGE_USE_COUNT(S) (((S) >> PAGE_SHIFT) + \
> (((S) & (PAGE_SIZE - 1)) ? 1 : 0))
>
> @@ -1663,16 +1629,6 @@ static void ixgbe_configure_rx(struct ixgbe_adapter *adapter)
> adapter->rx_ring[i].head = IXGBE_RDH(j);
> adapter->rx_ring[i].tail = IXGBE_RDT(j);
> adapter->rx_ring[i].rx_buf_len = rx_buf_len;
> - /* Intitial LRO Settings */
> - adapter->rx_ring[i].lro_mgr.max_aggr = IXGBE_MAX_LRO_AGGREGATE;
> - adapter->rx_ring[i].lro_mgr.max_desc = IXGBE_MAX_LRO_DESCRIPTORS;
> - adapter->rx_ring[i].lro_mgr.get_skb_header = ixgbe_get_skb_hdr;
> - adapter->rx_ring[i].lro_mgr.features = LRO_F_EXTRACT_VLAN_ID;
> - if (!(adapter->flags & IXGBE_FLAG_IN_NETPOLL))
> - adapter->rx_ring[i].lro_mgr.features |= LRO_F_NAPI;
> - adapter->rx_ring[i].lro_mgr.dev = adapter->netdev;
> - adapter->rx_ring[i].lro_mgr.ip_summed = CHECKSUM_UNNECESSARY;
> - adapter->rx_ring[i].lro_mgr.ip_summed_aggr = CHECKSUM_UNNECESSARY;
>
> ixgbe_configure_srrctl(adapter, j);
> }
> @@ -2303,7 +2259,7 @@ static int ixgbe_poll(struct napi_struct *napi, int budget)
> #endif
>
> tx_cleaned = ixgbe_clean_tx_irq(adapter, adapter->tx_ring);
> - ixgbe_clean_rx_irq(adapter, adapter->rx_ring, &work_done, budget);
> + ixgbe_clean_rx_irq(q_vector, adapter->rx_ring, &work_done, budget);
>
> if (tx_cleaned)
> work_done = budget;
> @@ -2919,12 +2875,6 @@ int ixgbe_setup_rx_resources(struct ixgbe_adapter *adapter,
> struct pci_dev *pdev = adapter->pdev;
> int size;
>
> - size = sizeof(struct net_lro_desc) * IXGBE_MAX_LRO_DESCRIPTORS;
> - rx_ring->lro_mgr.lro_arr = vmalloc(size);
> - if (!rx_ring->lro_mgr.lro_arr)
> - return -ENOMEM;
> - memset(rx_ring->lro_mgr.lro_arr, 0, size);
> -
> size = sizeof(struct ixgbe_rx_buffer) * rx_ring->count;
> rx_ring->rx_buffer_info = vmalloc(size);
> if (!rx_ring->rx_buffer_info) {
> @@ -2953,8 +2903,6 @@ int ixgbe_setup_rx_resources(struct ixgbe_adapter *adapter,
> return 0;
>
> alloc_failed:
> - vfree(rx_ring->lro_mgr.lro_arr);
> - rx_ring->lro_mgr.lro_arr = NULL;
> return -ENOMEM;
> }
>
> @@ -3032,9 +2980,6 @@ void ixgbe_free_rx_resources(struct ixgbe_adapter *adapter,
> {
> struct pci_dev *pdev = adapter->pdev;
>
> - vfree(rx_ring->lro_mgr.lro_arr);
> - rx_ring->lro_mgr.lro_arr = NULL;
> -
> ixgbe_clean_rx_ring(adapter, rx_ring);
>
> vfree(rx_ring->rx_buffer_info);
> @@ -4136,7 +4081,7 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
> netdev->features |= NETIF_F_IPV6_CSUM;
> netdev->features |= NETIF_F_TSO;
> netdev->features |= NETIF_F_TSO6;
> - netdev->features |= NETIF_F_LRO;
> + netdev->features |= NETIF_F_GRO;
>
> netdev->vlan_features |= NETIF_F_TSO;
> netdev->vlan_features |= NETIF_F_TSO6;
>
> Cheers,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Thanks Herbert, we will get this in testing later today. Initially
the patch looks fine.
--
Cheers,
Jeff
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [2/2] igb: Replace LRO with GRO
2009-01-14 11:49 ` [2/2] igb: " Jeff Kirsher
@ 2009-01-14 12:36 ` Herbert Xu
2009-01-15 0:03 ` Jeff Kirsher
0 siblings, 1 reply; 37+ messages in thread
From: Herbert Xu @ 2009-01-14 12:36 UTC (permalink / raw)
To: Jeff Kirsher; +Cc: davem, netdev
Jeff Kirsher <jeffrey.t.kirsher@intel.com> wrote:
>
> Ok, a couple of things...
> First is that we should have the patch in testing (most likely today,
> do not worry dave, it won't take a week)
Thanks!
> Second is I am not sure you need to keep this code wraped in CONFIG_IGB_LRO...
>
> #ifdef CONFIG_IGB_LRO
> - netdev->features |= NETIF_F_LRO;
> + netdev->features |= NETIF_F_GRO;
> #endif
My objective is to minimise changes with respect to the current
LRO base (the e1000e was an exception, it was the only hardware
I had :)
Therefore this patch is simply trying to replace LRO with GRO
letter by letter, so to speak. In this case, it's preserving
the semantics of the Kconfig option, i.e., if LRO was off before,
then GRO will be off too (although it can now be enabled using
ethtool without recompiling).
> Also, you left part of the lro code in igb_receive_skb and then put in
> the GRO code. Our preference is that you mirrored what you did with
> e1000e and just replaced the main vlan_hwacel and netif_receive_skb
> calls.
No I left in the netif_receive_skb but replaced the lro calls by
the corresponding gro calls. Again the point is to replace LRO
exactly as it is.
Later on we can make further changes.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [2/2] igb: Replace LRO with GRO
2009-01-14 12:36 ` Herbert Xu
@ 2009-01-15 0:03 ` Jeff Kirsher
2009-01-15 0:32 ` Herbert Xu
0 siblings, 1 reply; 37+ messages in thread
From: Jeff Kirsher @ 2009-01-15 0:03 UTC (permalink / raw)
To: Herbert Xu; +Cc: davem, netdev, Emil Tantilov
On Wed, Jan 14, 2009 at 4:36 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> Jeff Kirsher <jeffrey.t.kirsher@intel.com> wrote:
>>
>> Ok, a couple of things...
>> First is that we should have the patch in testing (most likely today,
>> do not worry dave, it won't take a week)
>
> Thanks!
>
>> Second is I am not sure you need to keep this code wraped in CONFIG_IGB_LRO...
>>
>> #ifdef CONFIG_IGB_LRO
>> - netdev->features |= NETIF_F_LRO;
>> + netdev->features |= NETIF_F_GRO;
>> #endif
>
> My objective is to minimise changes with respect to the current
> LRO base (the e1000e was an exception, it was the only hardware
> I had :)
>
> Therefore this patch is simply trying to replace LRO with GRO
> letter by letter, so to speak. In this case, it's preserving
> the semantics of the Kconfig option, i.e., if LRO was off before,
> then GRO will be off too (although it can now be enabled using
> ethtool without recompiling).
>
>> Also, you left part of the lro code in igb_receive_skb and then put in
>> the GRO code. Our preference is that you mirrored what you did with
>> e1000e and just replaced the main vlan_hwacel and netif_receive_skb
>> calls.
>
> No I left in the netif_receive_skb but replaced the lro calls by
> the corresponding gro calls. Again the point is to replace LRO
> exactly as it is.
>
> Later on we can make further changes.
>
> Cheers,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> --
We are seeing a kernel panic during our testing using jumbo frames,
below is the trace.
BUG: Bad page state in process swapper pfn:1194ff
page:ffffe200060b37a8 flags:8000000000000000 count:-1 mapcount:0
mapping:(null) index:0
Pid: 0, comm: swapper Tainted: G W 2.6.29-rc1-net-2.6-igb_gro #1
Call Trace:
<IRQ> [<ffffffff80283e5e>] bad_page+0x105/0x11c
[<ffffffff8028489f>] prep_new_page+0x4d/0xb2 [<ffffffff80284ae1>]
buffered_rmqueue+0x1dd/0x243 [<ffffffff80284bfb>]
get_page_from_freelist+0xb4/0xef [<ffffffff80284ec6>]
__alloc_pages_internal+0xa1/0x38f [<ffffffff8036ad24>]
address_needs_mapping+0xd/0x1e [<ffffffff8036ad4c>]
range_needs_mapping+0x17/0x21 [<ffffffffa0072a45>]
igb_alloc_rx_buffers_adv+0x6a/0x1bc [igb] [<ffffffff804cdf06>]
dev_gro_receive+0x43/0x24f [<ffffffffa0073a5c>]
igb_clean_rx_irq_adv+0x306/0x34b [igb] [<ffffffff80559baf>]
_spin_unlock_irqrestore+0x63/0x74 [<ffffffffa0074df8>]
igb_clean_rx_ring_msix+0x4a/0xca [igb] [<ffffffff804d0750>]
net_rx_action+0xa0/0x14f [<ffffffff80244d6a>] __do_softirq+0x7b/0x116
[<ffffffff8020cb3c>] call_softirq+0x1c/0x28 [<ffffffff8020de92>]
do_softirq+0x31/0x83 [<ffffffff80244c47>] irq_exit+0x45/0x87
[<ffffffff8020de4b>] do_IRQ+0xa5/0xbb [<ffffffff8020c353>]
ret_from_intr+0x0/0xf <EOI> <1>BUG: Bad page state in process swapper
pfn:12c8f4 page:ffffe200067513e0 flags:8000000000000000 count:-1
mapcount:0 mapping:(null) index:6
Pid: 0, comm: swapper Tainted: G B W 2.6.29-rc1-net-2.6-igb_gro #1
Call Trace:
<IRQ> [<ffffffff80283e5e>] bad_page+0x105/0x11c
[<ffffffff8028489f>] prep_new_page+0x4d/0xb2 [<ffffffff80284ae1>]
buffered_rmqueue+0x1dd/0x243 [<ffffffff80284bfb>]
get_page_from_freelist+0xb4/0xef [<ffffffff80284ec6>]
__alloc_pages_internal+0xa1/0x38f [<ffffffff8036ad24>]
address_needs_mapping+0xd/0x1e [<ffffffff8036ad4c>]
range_needs_mapping+0x17/0x21 [<ffffffffa0072a45>]
igb_alloc_rx_buffers_adv+0x6a/0x1bc [igb] [<ffffffff804cdf06>]
dev_gro_receive+0x43/0x24f [<ffffffffa0073a5c>]
igb_clean_rx_irq_adv+0x306/0x34b [igb] [<ffffffff80559baf>]
_spin_unlock_irqrestore+0x63/0x74 [<ffffffffa0074df8>]
igb_clean_rx_ring_msix+0x4a/0xca [igb] [<ffffffff804d0750>]
net_rx_action+0xa0/0x14f [<ffffffff80244d6a>] __do_softirq+0x7b/0x116
[<ffffffff8020cb3c>] call_softirq+0x1c/0x28 [<ffffffff8020de92>]
do_softirq+0x31/0x83 [<ffffffff80244c47>] irq_exit+0x45/0x87
[<ffffffff8020de4b>] do_IRQ+0xa5/0xbb [<ffffffff8020c353>]
ret_from_intr+0x0/0xf <EOI> <1>BUG: Bad page state in process swapper
pfn:12999c page:ffffe2000664cda0 flags:8000000000000000 count:-1
mapcount:0 mapping:(null) index:0
Pid: 0, comm: swapper Tainted: G B W 2.6.29-rc1-net-2.6-igb_gro #1
Call Trace:
<IRQ> [<ffffffff80283e5e>] bad_page+0x105/0x11c
[<ffffffff8028489f>] prep_new_page+0x4d/0xb2 [<ffffffff80284ae1>]
buffered_rmqueue+0x1dd/0x243 [<ffffffff80284bfb>]
get_page_from_freelist+0xb4/0xef [<ffffffff80284ec6>]
__alloc_pages_internal+0xa1/0x38f [<ffffffff8036ad24>]
address_needs_mapping+0xd/0x1e [<ffffffff8036ad4c>]
range_needs_mapping+0x17/0x21 [<ffffffffa0072a45>]
igb_alloc_rx_buffers_adv+0x6a/0x1bc [igb] [<ffffffff804cdf06>]
dev_gro_receive+0x43/0x24f [<ffffffffa0073a5c>]
igb_clean_rx_irq_adv+0x306/0x34b [igb] [<ffffffff80559baf>]
_spin_unlock_irqrestore+0x63/0x74 [<ffffffffa0074df8>]
igb_clean_rx_ring_msix+0x4a/0xca [igb] [<ffffffff804d0750>]
net_rx_action+0xa0/0x14f [<ffffffff80244d6a>] __do_softirq+0x7b/0x116
[<ffffffff8020cb3c>] call_softirq+0x1c/0x28 [<ffffffff8020de92>]
do_softirq+0x31/0x83 [<ffffffff80244c47>] irq_exit+0x45/0x87
[<ffffffff8020de4b>] do_IRQ+0xa5/0xbb [<ffffffff8020c353>]
ret_from_intr+0x0/0xf <EOI>
I have added Emil to mail thread, he can give additional testing
details, if necessary.
--
Cheers,
Jeff
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [2/2] igb: Replace LRO with GRO
2009-01-15 0:03 ` Jeff Kirsher
@ 2009-01-15 0:32 ` Herbert Xu
2009-01-15 1:35 ` Jeff Kirsher
2009-01-15 4:40 ` David Miller
0 siblings, 2 replies; 37+ messages in thread
From: Herbert Xu @ 2009-01-15 0:32 UTC (permalink / raw)
To: Jeff Kirsher; +Cc: davem, netdev, Emil Tantilov
On Wed, Jan 14, 2009 at 04:03:10PM -0800, Jeff Kirsher wrote:
>
> We are seeing a kernel panic during our testing using jumbo frames,
> below is the trace.
Thanks! This was the one case that I didn't test with e1000e,
namely an skb with page frags which comes from the driver (as
opposed to being constructed by the stack through gro_receive_frags).
gro: Fix page ref count for skbs freed normally
When an skb with page frags is merged into an existing one, we
cannibalise its reference count. This is OK when the skb is
reused because we set nr_frags to zero in that case. However,
for the case where the skb is freed through kfree_skb, we didn't
clear nr_frags which causes the page to be freed prematurely.
This is fixed by moving the skb resetting into skb_gro_receive.
Reported-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/net/core/dev.c b/net/core/dev.c
index 972a47d..4f69a2d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2481,12 +2481,6 @@ EXPORT_SYMBOL(napi_gro_receive);
void napi_reuse_skb(struct napi_struct *napi, struct sk_buff *skb)
{
- skb_shinfo(skb)->nr_frags = 0;
-
- skb->len -= skb->data_len;
- skb->truesize -= skb->data_len;
- skb->data_len = 0;
-
__skb_pull(skb, skb_headlen(skb));
skb_reserve(skb, NET_IP_ALIGN - skb_headroom(skb));
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 5110b35..65eac77 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2602,6 +2602,12 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
skb_shinfo(skb)->nr_frags * sizeof(skb_frag_t));
skb_shinfo(p)->nr_frags += skb_shinfo(skb)->nr_frags;
+ skb_shinfo(skb)->nr_frags = 0;
+
+ skb->truesize -= skb->data_len;
+ skb->len -= skb->data_len;
+ skb->data_len = 0;
+
NAPI_GRO_CB(skb)->free = 1;
goto done;
}
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [2/2] igb: Replace LRO with GRO
2009-01-15 0:32 ` Herbert Xu
@ 2009-01-15 1:35 ` Jeff Kirsher
2009-01-15 1:56 ` David Miller
2009-01-15 4:40 ` David Miller
1 sibling, 1 reply; 37+ messages in thread
From: Jeff Kirsher @ 2009-01-15 1:35 UTC (permalink / raw)
To: Herbert Xu; +Cc: davem, netdev, Emil Tantilov
On Wed, Jan 14, 2009 at 4:32 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Wed, Jan 14, 2009 at 04:03:10PM -0800, Jeff Kirsher wrote:
>>
>> We are seeing a kernel panic during our testing using jumbo frames,
>> below is the trace.
>
> Thanks! This was the one case that I didn't test with e1000e,
> namely an skb with page frags which comes from the driver (as
> opposed to being constructed by the stack through gro_receive_frags).
>
> gro: Fix page ref count for skbs freed normally
>
> When an skb with page frags is merged into an existing one, we
> cannibalise its reference count. This is OK when the skb is
> reused because we set nr_frags to zero in that case. However,
> for the case where the skb is freed through kfree_skb, we didn't
> clear nr_frags which causes the page to be freed prematurely.
>
> This is fixed by moving the skb resetting into skb_gro_receive.
>
> Reported-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 972a47d..4f69a2d 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2481,12 +2481,6 @@ EXPORT_SYMBOL(napi_gro_receive);
>
> void napi_reuse_skb(struct napi_struct *napi, struct sk_buff *skb)
> {
> - skb_shinfo(skb)->nr_frags = 0;
> -
> - skb->len -= skb->data_len;
> - skb->truesize -= skb->data_len;
> - skb->data_len = 0;
> -
> __skb_pull(skb, skb_headlen(skb));
> skb_reserve(skb, NET_IP_ALIGN - skb_headroom(skb));
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 5110b35..65eac77 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2602,6 +2602,12 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
> skb_shinfo(skb)->nr_frags * sizeof(skb_frag_t));
>
> skb_shinfo(p)->nr_frags += skb_shinfo(skb)->nr_frags;
> + skb_shinfo(skb)->nr_frags = 0;
> +
> + skb->truesize -= skb->data_len;
> + skb->len -= skb->data_len;
> + skb->data_len = 0;
> +
> NAPI_GRO_CB(skb)->free = 1;
> goto done;
> }
>
> Cheers,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> --
>
Might want to redo the subject line so that Dave knows that this is a
different patch, since you already have the IGB conversion to GRO
patch in this mail thread.
--
Cheers,
Jeff
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [2/2] igb: Replace LRO with GRO
2009-01-15 1:35 ` Jeff Kirsher
@ 2009-01-15 1:56 ` David Miller
2009-01-15 2:02 ` Jeff Kirsher
0 siblings, 1 reply; 37+ messages in thread
From: David Miller @ 2009-01-15 1:56 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: herbert, netdev, emil.s.tantilov
From: "Jeff Kirsher" <jeffrey.t.kirsher@intel.com>
Date: Wed, 14 Jan 2009 17:35:00 -0800
> Might want to redo the subject line so that Dave knows that this is a
> different patch, since you already have the IGB conversion to GRO
> patch in this mail thread.
Come on Jeff, I'm not that much of a robot :-)
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [2/2] igb: Replace LRO with GRO
2009-01-15 1:56 ` David Miller
@ 2009-01-15 2:02 ` Jeff Kirsher
0 siblings, 0 replies; 37+ messages in thread
From: Jeff Kirsher @ 2009-01-15 2:02 UTC (permalink / raw)
To: David Miller; +Cc: herbert, netdev, emil.s.tantilov
On Wed, Jan 14, 2009 at 5:56 PM, David Miller <davem@davemloft.net> wrote:
> From: "Jeff Kirsher" <jeffrey.t.kirsher@intel.com>
> Date: Wed, 14 Jan 2009 17:35:00 -0800
>
>> Might want to redo the subject line so that Dave knows that this is a
>> different patch, since you already have the IGB conversion to GRO
>> patch in this mail thread.
>
> Come on Jeff, I'm not that much of a robot :-)
> --
Just trying to help you out. :P
--
Cheers,
Jeff
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: ixgbe: Replace LRO with GRO
2009-01-14 8:44 ` ixgbe: " Herbert Xu
2009-01-14 11:53 ` Jeff Kirsher
@ 2009-01-15 3:46 ` Herbert Xu
2009-01-15 4:22 ` Herbert Xu
2009-01-16 23:32 ` Jeff Kirsher
1 sibling, 2 replies; 37+ messages in thread
From: Herbert Xu @ 2009-01-15 3:46 UTC (permalink / raw)
To: David S. Miller, Jeff Kirsher, netdev
On Wed, Jan 14, 2009 at 07:44:12PM +1100, Herbert Xu wrote:
>
> ixgbe: Replace LRO with GRO
I forgot to delete the Kconfig dependency, here is an updated
version.
ixgbe: Replace LRO with GRO
This patch makes igb invoke the GRO hooks instead of LRO. As
GRO has a compatible external interface to LRO this is a very
straightforward replacement.
As GRO uses the napi structure to track the held packets, I've
modified the code paths involved to pass that along.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index cf4fea0..ee37540 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2443,7 +2443,6 @@ config ENIC
config IXGBE
tristate "Intel(R) 10GbE PCI Express adapters support"
depends on PCI && INET
- select INET_LRO
---help---
This driver supports Intel(R) 10GbE PCI Express family of
adapters. For more information on how to identify your adapter, go
diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index e112008..6ac361a 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -31,7 +31,6 @@
#include <linux/types.h>
#include <linux/pci.h>
#include <linux/netdevice.h>
-#include <linux/inet_lro.h>
#include <linux/aer.h>
#include "ixgbe_type.h"
@@ -88,9 +87,6 @@
#define IXGBE_TX_FLAGS_VLAN_PRIO_MASK 0x0000e000
#define IXGBE_TX_FLAGS_VLAN_SHIFT 16
-#define IXGBE_MAX_LRO_DESCRIPTORS 8
-#define IXGBE_MAX_LRO_AGGREGATE 32
-
/* wrapper around a pointer to a socket buffer,
* so a DMA handle can be stored along with the buffer */
struct ixgbe_tx_buffer {
@@ -142,8 +138,6 @@ struct ixgbe_ring {
/* cpu for tx queue */
int cpu;
#endif
- struct net_lro_mgr lro_mgr;
- bool lro_used;
struct ixgbe_queue_stats stats;
u16 v_idx; /* maps directly to the index for this ring in the hardware
* vector array, can also be used for finding the bit in EICR
@@ -301,9 +295,6 @@ struct ixgbe_adapter {
unsigned long state;
u64 tx_busy;
- u64 lro_aggregated;
- u64 lro_flushed;
- u64 lro_no_desc;
unsigned int tx_ring_count;
unsigned int rx_ring_count;
diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index 67f87a7..4f6b5df 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -89,8 +89,6 @@ static struct ixgbe_stats ixgbe_gstrings_stats[] = {
{"rx_header_split", IXGBE_STAT(rx_hdr_split)},
{"alloc_rx_page_failed", IXGBE_STAT(alloc_rx_page_failed)},
{"alloc_rx_buff_failed", IXGBE_STAT(alloc_rx_buff_failed)},
- {"lro_aggregated", IXGBE_STAT(lro_aggregated)},
- {"lro_flushed", IXGBE_STAT(lro_flushed)},
};
#define IXGBE_QUEUE_STATS_LEN \
@@ -808,15 +806,6 @@ static void ixgbe_get_ethtool_stats(struct net_device *netdev,
int stat_count = sizeof(struct ixgbe_queue_stats) / sizeof(u64);
int j, k;
int i;
- u64 aggregated = 0, flushed = 0, no_desc = 0;
- for (i = 0; i < adapter->num_rx_queues; i++) {
- aggregated += adapter->rx_ring[i].lro_mgr.stats.aggregated;
- flushed += adapter->rx_ring[i].lro_mgr.stats.flushed;
- no_desc += adapter->rx_ring[i].lro_mgr.stats.no_desc;
- }
- adapter->lro_aggregated = aggregated;
- adapter->lro_flushed = flushed;
- adapter->lro_no_desc = no_desc;
ixgbe_update_stats(adapter);
for (i = 0; i < IXGBE_GLOBAL_STATS_LEN; i++) {
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index acef3c6..23f25fe 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -400,23 +400,20 @@ static int __ixgbe_notify_dca(struct device *dev, void *data)
* @rx_ring: rx descriptor ring (for a specific queue) to setup
* @rx_desc: rx descriptor
**/
-static void ixgbe_receive_skb(struct ixgbe_adapter *adapter,
+static void ixgbe_receive_skb(struct ixgbe_q_vector *q_vector,
struct sk_buff *skb, u8 status,
- struct ixgbe_ring *ring,
union ixgbe_adv_rx_desc *rx_desc)
{
+ struct ixgbe_adapter *adapter = q_vector->adapter;
+ struct napi_struct *napi = &q_vector->napi;
bool is_vlan = (status & IXGBE_RXD_STAT_VP);
u16 tag = le16_to_cpu(rx_desc->wb.upper.vlan);
- if (adapter->netdev->features & NETIF_F_LRO &&
- skb->ip_summed == CHECKSUM_UNNECESSARY) {
+ if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
if (adapter->vlgrp && is_vlan && (tag != 0))
- lro_vlan_hwaccel_receive_skb(&ring->lro_mgr, skb,
- adapter->vlgrp, tag,
- rx_desc);
+ vlan_gro_receive(napi, adapter->vlgrp, tag, skb);
else
- lro_receive_skb(&ring->lro_mgr, skb, rx_desc);
- ring->lro_used = true;
+ napi_gro_receive(napi, skb);
} else {
if (!(adapter->flags & IXGBE_FLAG_IN_NETPOLL)) {
if (adapter->vlgrp && is_vlan && (tag != 0))
@@ -571,10 +568,11 @@ static inline u16 ixgbe_get_pkt_info(union ixgbe_adv_rx_desc *rx_desc)
return rx_desc->wb.lower.lo_dword.hs_rss.pkt_info;
}
-static bool ixgbe_clean_rx_irq(struct ixgbe_adapter *adapter,
+static bool ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
struct ixgbe_ring *rx_ring,
int *work_done, int work_to_do)
{
+ struct ixgbe_adapter *adapter = q_vector->adapter;
struct pci_dev *pdev = adapter->pdev;
union ixgbe_adv_rx_desc *rx_desc, *next_rxd;
struct ixgbe_rx_buffer *rx_buffer_info, *next_buffer;
@@ -675,7 +673,7 @@ static bool ixgbe_clean_rx_irq(struct ixgbe_adapter *adapter,
total_rx_packets++;
skb->protocol = eth_type_trans(skb, adapter->netdev);
- ixgbe_receive_skb(adapter, skb, staterr, rx_ring, rx_desc);
+ ixgbe_receive_skb(q_vector, skb, staterr, rx_desc);
next_desc:
rx_desc->wb.upper.status_error = 0;
@@ -693,11 +691,6 @@ next_desc:
staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
}
- if (rx_ring->lro_used) {
- lro_flush_all(&rx_ring->lro_mgr);
- rx_ring->lro_used = false;
- }
-
rx_ring->next_to_clean = i;
cleaned_count = IXGBE_DESC_UNUSED(rx_ring);
@@ -1049,7 +1042,7 @@ static int ixgbe_clean_rxonly(struct napi_struct *napi, int budget)
ixgbe_update_rx_dca(adapter, rx_ring);
#endif
- ixgbe_clean_rx_irq(adapter, rx_ring, &work_done, budget);
+ ixgbe_clean_rx_irq(q_vector, rx_ring, &work_done, budget);
/* If all Rx work done, exit the polling mode */
if (work_done < budget) {
@@ -1092,7 +1085,7 @@ static int ixgbe_clean_rxonly_many(struct napi_struct *napi, int budget)
if (adapter->flags & IXGBE_FLAG_DCA_ENABLED)
ixgbe_update_rx_dca(adapter, rx_ring);
#endif
- ixgbe_clean_rx_irq(adapter, rx_ring, &work_done, budget);
+ ixgbe_clean_rx_irq(q_vector, rx_ring, &work_done, budget);
enable_mask |= rx_ring->v_idx;
r_idx = find_next_bit(q_vector->rxr_idx, adapter->num_rx_queues,
r_idx + 1);
@@ -1565,33 +1558,6 @@ static void ixgbe_configure_srrctl(struct ixgbe_adapter *adapter, int index)
IXGBE_WRITE_REG(&adapter->hw, IXGBE_SRRCTL(index), srrctl);
}
-/**
- * ixgbe_get_skb_hdr - helper function for LRO header processing
- * @skb: pointer to sk_buff to be added to LRO packet
- * @iphdr: pointer to ip header structure
- * @tcph: pointer to tcp header structure
- * @hdr_flags: pointer to header flags
- * @priv: private data
- **/
-static int ixgbe_get_skb_hdr(struct sk_buff *skb, void **iphdr, void **tcph,
- u64 *hdr_flags, void *priv)
-{
- union ixgbe_adv_rx_desc *rx_desc = priv;
-
- /* Verify that this is a valid IPv4 TCP packet */
- if (!((ixgbe_get_pkt_info(rx_desc) & IXGBE_RXDADV_PKTTYPE_IPV4) &&
- (ixgbe_get_pkt_info(rx_desc) & IXGBE_RXDADV_PKTTYPE_TCP)))
- return -1;
-
- /* Set network headers */
- skb_reset_network_header(skb);
- skb_set_transport_header(skb, ip_hdrlen(skb));
- *iphdr = ip_hdr(skb);
- *tcph = tcp_hdr(skb);
- *hdr_flags = LRO_IPV4 | LRO_TCP;
- return 0;
-}
-
#define PAGE_USE_COUNT(S) (((S) >> PAGE_SHIFT) + \
(((S) & (PAGE_SIZE - 1)) ? 1 : 0))
@@ -1663,16 +1629,6 @@ static void ixgbe_configure_rx(struct ixgbe_adapter *adapter)
adapter->rx_ring[i].head = IXGBE_RDH(j);
adapter->rx_ring[i].tail = IXGBE_RDT(j);
adapter->rx_ring[i].rx_buf_len = rx_buf_len;
- /* Intitial LRO Settings */
- adapter->rx_ring[i].lro_mgr.max_aggr = IXGBE_MAX_LRO_AGGREGATE;
- adapter->rx_ring[i].lro_mgr.max_desc = IXGBE_MAX_LRO_DESCRIPTORS;
- adapter->rx_ring[i].lro_mgr.get_skb_header = ixgbe_get_skb_hdr;
- adapter->rx_ring[i].lro_mgr.features = LRO_F_EXTRACT_VLAN_ID;
- if (!(adapter->flags & IXGBE_FLAG_IN_NETPOLL))
- adapter->rx_ring[i].lro_mgr.features |= LRO_F_NAPI;
- adapter->rx_ring[i].lro_mgr.dev = adapter->netdev;
- adapter->rx_ring[i].lro_mgr.ip_summed = CHECKSUM_UNNECESSARY;
- adapter->rx_ring[i].lro_mgr.ip_summed_aggr = CHECKSUM_UNNECESSARY;
ixgbe_configure_srrctl(adapter, j);
}
@@ -2303,7 +2259,7 @@ static int ixgbe_poll(struct napi_struct *napi, int budget)
#endif
tx_cleaned = ixgbe_clean_tx_irq(adapter, adapter->tx_ring);
- ixgbe_clean_rx_irq(adapter, adapter->rx_ring, &work_done, budget);
+ ixgbe_clean_rx_irq(q_vector, adapter->rx_ring, &work_done, budget);
if (tx_cleaned)
work_done = budget;
@@ -2919,12 +2875,6 @@ int ixgbe_setup_rx_resources(struct ixgbe_adapter *adapter,
struct pci_dev *pdev = adapter->pdev;
int size;
- size = sizeof(struct net_lro_desc) * IXGBE_MAX_LRO_DESCRIPTORS;
- rx_ring->lro_mgr.lro_arr = vmalloc(size);
- if (!rx_ring->lro_mgr.lro_arr)
- return -ENOMEM;
- memset(rx_ring->lro_mgr.lro_arr, 0, size);
-
size = sizeof(struct ixgbe_rx_buffer) * rx_ring->count;
rx_ring->rx_buffer_info = vmalloc(size);
if (!rx_ring->rx_buffer_info) {
@@ -2953,8 +2903,6 @@ int ixgbe_setup_rx_resources(struct ixgbe_adapter *adapter,
return 0;
alloc_failed:
- vfree(rx_ring->lro_mgr.lro_arr);
- rx_ring->lro_mgr.lro_arr = NULL;
return -ENOMEM;
}
@@ -3032,9 +2980,6 @@ void ixgbe_free_rx_resources(struct ixgbe_adapter *adapter,
{
struct pci_dev *pdev = adapter->pdev;
- vfree(rx_ring->lro_mgr.lro_arr);
- rx_ring->lro_mgr.lro_arr = NULL;
-
ixgbe_clean_rx_ring(adapter, rx_ring);
vfree(rx_ring->rx_buffer_info);
@@ -4136,7 +4081,7 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
netdev->features |= NETIF_F_IPV6_CSUM;
netdev->features |= NETIF_F_TSO;
netdev->features |= NETIF_F_TSO6;
- netdev->features |= NETIF_F_LRO;
+ netdev->features |= NETIF_F_GRO;
netdev->vlan_features |= NETIF_F_TSO;
netdev->vlan_features |= NETIF_F_TSO6;
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: ixgbe: Replace LRO with GRO
2009-01-15 3:46 ` Herbert Xu
@ 2009-01-15 4:22 ` Herbert Xu
2009-01-19 5:49 ` David Miller
2009-01-16 23:32 ` Jeff Kirsher
1 sibling, 1 reply; 37+ messages in thread
From: Herbert Xu @ 2009-01-15 4:22 UTC (permalink / raw)
To: David S. Miller, Jeff Kirsher, netdev
On Thu, Jan 15, 2009 at 02:46:19PM +1100, Herbert Xu wrote:
>
> ixgbe: Replace LRO with GRO
>
> This patch makes igb invoke the GRO hooks instead of LRO. As
Doh I left the name igb in the there.
ixgbe: Replace LRO with GRO
This patch makes ixgbe invoke the GRO hooks instead of LRO. As
GRO has a compatible external interface to LRO this is a very
straightforward replacement.
As GRO uses the napi structure to track the held packets, I've
modified the code paths involved to pass that along.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index cf4fea0..ee37540 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2443,7 +2443,6 @@ config ENIC
config IXGBE
tristate "Intel(R) 10GbE PCI Express adapters support"
depends on PCI && INET
- select INET_LRO
---help---
This driver supports Intel(R) 10GbE PCI Express family of
adapters. For more information on how to identify your adapter, go
diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index e112008..6ac361a 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -31,7 +31,6 @@
#include <linux/types.h>
#include <linux/pci.h>
#include <linux/netdevice.h>
-#include <linux/inet_lro.h>
#include <linux/aer.h>
#include "ixgbe_type.h"
@@ -88,9 +87,6 @@
#define IXGBE_TX_FLAGS_VLAN_PRIO_MASK 0x0000e000
#define IXGBE_TX_FLAGS_VLAN_SHIFT 16
-#define IXGBE_MAX_LRO_DESCRIPTORS 8
-#define IXGBE_MAX_LRO_AGGREGATE 32
-
/* wrapper around a pointer to a socket buffer,
* so a DMA handle can be stored along with the buffer */
struct ixgbe_tx_buffer {
@@ -142,8 +138,6 @@ struct ixgbe_ring {
/* cpu for tx queue */
int cpu;
#endif
- struct net_lro_mgr lro_mgr;
- bool lro_used;
struct ixgbe_queue_stats stats;
u16 v_idx; /* maps directly to the index for this ring in the hardware
* vector array, can also be used for finding the bit in EICR
@@ -301,9 +295,6 @@ struct ixgbe_adapter {
unsigned long state;
u64 tx_busy;
- u64 lro_aggregated;
- u64 lro_flushed;
- u64 lro_no_desc;
unsigned int tx_ring_count;
unsigned int rx_ring_count;
diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index 67f87a7..4f6b5df 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -89,8 +89,6 @@ static struct ixgbe_stats ixgbe_gstrings_stats[] = {
{"rx_header_split", IXGBE_STAT(rx_hdr_split)},
{"alloc_rx_page_failed", IXGBE_STAT(alloc_rx_page_failed)},
{"alloc_rx_buff_failed", IXGBE_STAT(alloc_rx_buff_failed)},
- {"lro_aggregated", IXGBE_STAT(lro_aggregated)},
- {"lro_flushed", IXGBE_STAT(lro_flushed)},
};
#define IXGBE_QUEUE_STATS_LEN \
@@ -808,15 +806,6 @@ static void ixgbe_get_ethtool_stats(struct net_device *netdev,
int stat_count = sizeof(struct ixgbe_queue_stats) / sizeof(u64);
int j, k;
int i;
- u64 aggregated = 0, flushed = 0, no_desc = 0;
- for (i = 0; i < adapter->num_rx_queues; i++) {
- aggregated += adapter->rx_ring[i].lro_mgr.stats.aggregated;
- flushed += adapter->rx_ring[i].lro_mgr.stats.flushed;
- no_desc += adapter->rx_ring[i].lro_mgr.stats.no_desc;
- }
- adapter->lro_aggregated = aggregated;
- adapter->lro_flushed = flushed;
- adapter->lro_no_desc = no_desc;
ixgbe_update_stats(adapter);
for (i = 0; i < IXGBE_GLOBAL_STATS_LEN; i++) {
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index acef3c6..23f25fe 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -400,23 +400,20 @@ static int __ixgbe_notify_dca(struct device *dev, void *data)
* @rx_ring: rx descriptor ring (for a specific queue) to setup
* @rx_desc: rx descriptor
**/
-static void ixgbe_receive_skb(struct ixgbe_adapter *adapter,
+static void ixgbe_receive_skb(struct ixgbe_q_vector *q_vector,
struct sk_buff *skb, u8 status,
- struct ixgbe_ring *ring,
union ixgbe_adv_rx_desc *rx_desc)
{
+ struct ixgbe_adapter *adapter = q_vector->adapter;
+ struct napi_struct *napi = &q_vector->napi;
bool is_vlan = (status & IXGBE_RXD_STAT_VP);
u16 tag = le16_to_cpu(rx_desc->wb.upper.vlan);
- if (adapter->netdev->features & NETIF_F_LRO &&
- skb->ip_summed == CHECKSUM_UNNECESSARY) {
+ if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
if (adapter->vlgrp && is_vlan && (tag != 0))
- lro_vlan_hwaccel_receive_skb(&ring->lro_mgr, skb,
- adapter->vlgrp, tag,
- rx_desc);
+ vlan_gro_receive(napi, adapter->vlgrp, tag, skb);
else
- lro_receive_skb(&ring->lro_mgr, skb, rx_desc);
- ring->lro_used = true;
+ napi_gro_receive(napi, skb);
} else {
if (!(adapter->flags & IXGBE_FLAG_IN_NETPOLL)) {
if (adapter->vlgrp && is_vlan && (tag != 0))
@@ -571,10 +568,11 @@ static inline u16 ixgbe_get_pkt_info(union ixgbe_adv_rx_desc *rx_desc)
return rx_desc->wb.lower.lo_dword.hs_rss.pkt_info;
}
-static bool ixgbe_clean_rx_irq(struct ixgbe_adapter *adapter,
+static bool ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
struct ixgbe_ring *rx_ring,
int *work_done, int work_to_do)
{
+ struct ixgbe_adapter *adapter = q_vector->adapter;
struct pci_dev *pdev = adapter->pdev;
union ixgbe_adv_rx_desc *rx_desc, *next_rxd;
struct ixgbe_rx_buffer *rx_buffer_info, *next_buffer;
@@ -675,7 +673,7 @@ static bool ixgbe_clean_rx_irq(struct ixgbe_adapter *adapter,
total_rx_packets++;
skb->protocol = eth_type_trans(skb, adapter->netdev);
- ixgbe_receive_skb(adapter, skb, staterr, rx_ring, rx_desc);
+ ixgbe_receive_skb(q_vector, skb, staterr, rx_desc);
next_desc:
rx_desc->wb.upper.status_error = 0;
@@ -693,11 +691,6 @@ next_desc:
staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
}
- if (rx_ring->lro_used) {
- lro_flush_all(&rx_ring->lro_mgr);
- rx_ring->lro_used = false;
- }
-
rx_ring->next_to_clean = i;
cleaned_count = IXGBE_DESC_UNUSED(rx_ring);
@@ -1049,7 +1042,7 @@ static int ixgbe_clean_rxonly(struct napi_struct *napi, int budget)
ixgbe_update_rx_dca(adapter, rx_ring);
#endif
- ixgbe_clean_rx_irq(adapter, rx_ring, &work_done, budget);
+ ixgbe_clean_rx_irq(q_vector, rx_ring, &work_done, budget);
/* If all Rx work done, exit the polling mode */
if (work_done < budget) {
@@ -1092,7 +1085,7 @@ static int ixgbe_clean_rxonly_many(struct napi_struct *napi, int budget)
if (adapter->flags & IXGBE_FLAG_DCA_ENABLED)
ixgbe_update_rx_dca(adapter, rx_ring);
#endif
- ixgbe_clean_rx_irq(adapter, rx_ring, &work_done, budget);
+ ixgbe_clean_rx_irq(q_vector, rx_ring, &work_done, budget);
enable_mask |= rx_ring->v_idx;
r_idx = find_next_bit(q_vector->rxr_idx, adapter->num_rx_queues,
r_idx + 1);
@@ -1565,33 +1558,6 @@ static void ixgbe_configure_srrctl(struct ixgbe_adapter *adapter, int index)
IXGBE_WRITE_REG(&adapter->hw, IXGBE_SRRCTL(index), srrctl);
}
-/**
- * ixgbe_get_skb_hdr - helper function for LRO header processing
- * @skb: pointer to sk_buff to be added to LRO packet
- * @iphdr: pointer to ip header structure
- * @tcph: pointer to tcp header structure
- * @hdr_flags: pointer to header flags
- * @priv: private data
- **/
-static int ixgbe_get_skb_hdr(struct sk_buff *skb, void **iphdr, void **tcph,
- u64 *hdr_flags, void *priv)
-{
- union ixgbe_adv_rx_desc *rx_desc = priv;
-
- /* Verify that this is a valid IPv4 TCP packet */
- if (!((ixgbe_get_pkt_info(rx_desc) & IXGBE_RXDADV_PKTTYPE_IPV4) &&
- (ixgbe_get_pkt_info(rx_desc) & IXGBE_RXDADV_PKTTYPE_TCP)))
- return -1;
-
- /* Set network headers */
- skb_reset_network_header(skb);
- skb_set_transport_header(skb, ip_hdrlen(skb));
- *iphdr = ip_hdr(skb);
- *tcph = tcp_hdr(skb);
- *hdr_flags = LRO_IPV4 | LRO_TCP;
- return 0;
-}
-
#define PAGE_USE_COUNT(S) (((S) >> PAGE_SHIFT) + \
(((S) & (PAGE_SIZE - 1)) ? 1 : 0))
@@ -1663,16 +1629,6 @@ static void ixgbe_configure_rx(struct ixgbe_adapter *adapter)
adapter->rx_ring[i].head = IXGBE_RDH(j);
adapter->rx_ring[i].tail = IXGBE_RDT(j);
adapter->rx_ring[i].rx_buf_len = rx_buf_len;
- /* Intitial LRO Settings */
- adapter->rx_ring[i].lro_mgr.max_aggr = IXGBE_MAX_LRO_AGGREGATE;
- adapter->rx_ring[i].lro_mgr.max_desc = IXGBE_MAX_LRO_DESCRIPTORS;
- adapter->rx_ring[i].lro_mgr.get_skb_header = ixgbe_get_skb_hdr;
- adapter->rx_ring[i].lro_mgr.features = LRO_F_EXTRACT_VLAN_ID;
- if (!(adapter->flags & IXGBE_FLAG_IN_NETPOLL))
- adapter->rx_ring[i].lro_mgr.features |= LRO_F_NAPI;
- adapter->rx_ring[i].lro_mgr.dev = adapter->netdev;
- adapter->rx_ring[i].lro_mgr.ip_summed = CHECKSUM_UNNECESSARY;
- adapter->rx_ring[i].lro_mgr.ip_summed_aggr = CHECKSUM_UNNECESSARY;
ixgbe_configure_srrctl(adapter, j);
}
@@ -2303,7 +2259,7 @@ static int ixgbe_poll(struct napi_struct *napi, int budget)
#endif
tx_cleaned = ixgbe_clean_tx_irq(adapter, adapter->tx_ring);
- ixgbe_clean_rx_irq(adapter, adapter->rx_ring, &work_done, budget);
+ ixgbe_clean_rx_irq(q_vector, adapter->rx_ring, &work_done, budget);
if (tx_cleaned)
work_done = budget;
@@ -2919,12 +2875,6 @@ int ixgbe_setup_rx_resources(struct ixgbe_adapter *adapter,
struct pci_dev *pdev = adapter->pdev;
int size;
- size = sizeof(struct net_lro_desc) * IXGBE_MAX_LRO_DESCRIPTORS;
- rx_ring->lro_mgr.lro_arr = vmalloc(size);
- if (!rx_ring->lro_mgr.lro_arr)
- return -ENOMEM;
- memset(rx_ring->lro_mgr.lro_arr, 0, size);
-
size = sizeof(struct ixgbe_rx_buffer) * rx_ring->count;
rx_ring->rx_buffer_info = vmalloc(size);
if (!rx_ring->rx_buffer_info) {
@@ -2953,8 +2903,6 @@ int ixgbe_setup_rx_resources(struct ixgbe_adapter *adapter,
return 0;
alloc_failed:
- vfree(rx_ring->lro_mgr.lro_arr);
- rx_ring->lro_mgr.lro_arr = NULL;
return -ENOMEM;
}
@@ -3032,9 +2980,6 @@ void ixgbe_free_rx_resources(struct ixgbe_adapter *adapter,
{
struct pci_dev *pdev = adapter->pdev;
- vfree(rx_ring->lro_mgr.lro_arr);
- rx_ring->lro_mgr.lro_arr = NULL;
-
ixgbe_clean_rx_ring(adapter, rx_ring);
vfree(rx_ring->rx_buffer_info);
@@ -4136,7 +4081,7 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
netdev->features |= NETIF_F_IPV6_CSUM;
netdev->features |= NETIF_F_TSO;
netdev->features |= NETIF_F_TSO6;
- netdev->features |= NETIF_F_LRO;
+ netdev->features |= NETIF_F_GRO;
netdev->vlan_features |= NETIF_F_TSO;
netdev->vlan_features |= NETIF_F_TSO6;
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 37+ messages in thread
* sfc: Replace LRO with GRO
2009-01-13 9:26 [1/2] e1000e: Invoke VLAN GRO handler Herbert Xu
2009-01-13 9:28 ` [2/2] igb: Replace LRO with GRO Herbert Xu
@ 2009-01-15 4:24 ` Herbert Xu
2009-01-19 5:50 ` David Miller
2009-01-15 4:38 ` [1/2] e1000e: Invoke VLAN GRO handler David Miller
2009-01-15 6:59 ` cxgb3: Replace LRO with GRO Herbert Xu
3 siblings, 1 reply; 37+ messages in thread
From: Herbert Xu @ 2009-01-15 4:24 UTC (permalink / raw)
To: David S. Miller, Ben Hutchings, netdev
Hi:
Here's the first shot at a driver using the frags interface.
sfc: Replace LRO with GRO
This patch makes sfc invoke the GRO hooks instead of LRO. As
GRO has a compatible external interface to LRO this is a very
straightforward replacement.
Everything should appear identical to the user except that the
offload is now controlled by the GRO ethtool option instead of
LRO. I've kept the lro module parameter as is since that's for
compatibility only.
I have eliminated efx_rx_mk_skb as the GRO layer can take care
of all packets regardless of whether GRO is enabled or not.
So the only case where we don't call GRO is if the packet checksum
is absent. This is to keep the behaviour changes of the patch to
a minimum.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/drivers/net/sfc/Kconfig b/drivers/net/sfc/Kconfig
index c535408..12a8296 100644
--- a/drivers/net/sfc/Kconfig
+++ b/drivers/net/sfc/Kconfig
@@ -2,7 +2,6 @@ config SFC
tristate "Solarflare Solarstorm SFC4000 support"
depends on PCI && INET
select MII
- select INET_LRO
select CRC32
select I2C
select I2C_ALGOBIT
diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index 7673fd9..38247a1 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -182,7 +182,6 @@ static int efx_process_channel(struct efx_channel *channel, int rx_quota)
channel->rx_pkt = NULL;
}
- efx_flush_lro(channel);
efx_rx_strategy(channel);
efx_fast_push_rx_descriptors(&efx->rx_queue[channel->channel]);
@@ -1269,18 +1268,11 @@ static int efx_ioctl(struct net_device *net_dev, struct ifreq *ifr, int cmd)
static int efx_init_napi(struct efx_nic *efx)
{
struct efx_channel *channel;
- int rc;
efx_for_each_channel(channel, efx) {
channel->napi_dev = efx->net_dev;
- rc = efx_lro_init(&channel->lro_mgr, efx);
- if (rc)
- goto err;
}
return 0;
- err:
- efx_fini_napi(efx);
- return rc;
}
static void efx_fini_napi(struct efx_nic *efx)
@@ -1288,7 +1280,6 @@ static void efx_fini_napi(struct efx_nic *efx)
struct efx_channel *channel;
efx_for_each_channel(channel, efx) {
- efx_lro_fini(&channel->lro_mgr);
channel->napi_dev = NULL;
}
}
@@ -2097,7 +2088,7 @@ static int __devinit efx_pci_probe(struct pci_dev *pci_dev,
net_dev->features |= (NETIF_F_IP_CSUM | NETIF_F_SG |
NETIF_F_HIGHDMA | NETIF_F_TSO);
if (lro)
- net_dev->features |= NETIF_F_LRO;
+ net_dev->features |= NETIF_F_GRO;
/* Mask for features that also apply to VLAN devices */
net_dev->vlan_features |= (NETIF_F_ALL_CSUM | NETIF_F_SG |
NETIF_F_HIGHDMA | NETIF_F_TSO);
diff --git a/drivers/net/sfc/net_driver.h b/drivers/net/sfc/net_driver.h
index 5f255f7..8643505 100644
--- a/drivers/net/sfc/net_driver.h
+++ b/drivers/net/sfc/net_driver.h
@@ -25,15 +25,11 @@
#include <linux/device.h>
#include <linux/highmem.h>
#include <linux/workqueue.h>
-#include <linux/inet_lro.h>
#include <linux/i2c.h>
#include "enum.h"
#include "bitfield.h"
-#define EFX_MAX_LRO_DESCRIPTORS 8
-#define EFX_MAX_LRO_AGGR MAX_SKB_FRAGS
-
/**************************************************************************
*
* Build definitions
@@ -340,13 +336,10 @@ enum efx_rx_alloc_method {
* @eventq_read_ptr: Event queue read pointer
* @last_eventq_read_ptr: Last event queue read pointer value.
* @eventq_magic: Event queue magic value for driver-generated test events
- * @lro_mgr: LRO state
* @rx_alloc_level: Watermark based heuristic counter for pushing descriptors
* and diagnostic counters
* @rx_alloc_push_pages: RX allocation method currently in use for pushing
* descriptors
- * @rx_alloc_pop_pages: RX allocation method currently in use for popping
- * descriptors
* @n_rx_tobe_disc: Count of RX_TOBE_DISC errors
* @n_rx_ip_frag_err: Count of RX IP fragment errors
* @n_rx_ip_hdr_chksum_err: Count of RX IP header checksum errors
@@ -371,10 +364,8 @@ struct efx_channel {
unsigned int last_eventq_read_ptr;
unsigned int eventq_magic;
- struct net_lro_mgr lro_mgr;
int rx_alloc_level;
int rx_alloc_push_pages;
- int rx_alloc_pop_pages;
unsigned n_rx_tobe_disc;
unsigned n_rx_ip_frag_err;
diff --git a/drivers/net/sfc/rx.c b/drivers/net/sfc/rx.c
index b8ba4bb..a0345b3 100644
--- a/drivers/net/sfc/rx.c
+++ b/drivers/net/sfc/rx.c
@@ -99,109 +99,6 @@ static inline unsigned int efx_rx_buf_size(struct efx_nic *efx)
}
-/**************************************************************************
- *
- * Linux generic LRO handling
- *
- **************************************************************************
- */
-
-static int efx_lro_get_skb_hdr(struct sk_buff *skb, void **ip_hdr,
- void **tcpudp_hdr, u64 *hdr_flags, void *priv)
-{
- struct efx_channel *channel = priv;
- struct iphdr *iph;
- struct tcphdr *th;
-
- iph = (struct iphdr *)skb->data;
- if (skb->protocol != htons(ETH_P_IP) || iph->protocol != IPPROTO_TCP)
- goto fail;
-
- th = (struct tcphdr *)(skb->data + iph->ihl * 4);
-
- *tcpudp_hdr = th;
- *ip_hdr = iph;
- *hdr_flags = LRO_IPV4 | LRO_TCP;
-
- channel->rx_alloc_level += RX_ALLOC_FACTOR_LRO;
- return 0;
-fail:
- channel->rx_alloc_level += RX_ALLOC_FACTOR_SKB;
- return -1;
-}
-
-static int efx_get_frag_hdr(struct skb_frag_struct *frag, void **mac_hdr,
- void **ip_hdr, void **tcpudp_hdr, u64 *hdr_flags,
- void *priv)
-{
- struct efx_channel *channel = priv;
- struct ethhdr *eh;
- struct iphdr *iph;
-
- /* We support EtherII and VLAN encapsulated IPv4 */
- eh = page_address(frag->page) + frag->page_offset;
- *mac_hdr = eh;
-
- if (eh->h_proto == htons(ETH_P_IP)) {
- iph = (struct iphdr *)(eh + 1);
- } else {
- struct vlan_ethhdr *veh = (struct vlan_ethhdr *)eh;
- if (veh->h_vlan_encapsulated_proto != htons(ETH_P_IP))
- goto fail;
-
- iph = (struct iphdr *)(veh + 1);
- }
- *ip_hdr = iph;
-
- /* We can only do LRO over TCP */
- if (iph->protocol != IPPROTO_TCP)
- goto fail;
-
- *hdr_flags = LRO_IPV4 | LRO_TCP;
- *tcpudp_hdr = (struct tcphdr *)((u8 *) iph + iph->ihl * 4);
-
- channel->rx_alloc_level += RX_ALLOC_FACTOR_LRO;
- return 0;
- fail:
- channel->rx_alloc_level += RX_ALLOC_FACTOR_SKB;
- return -1;
-}
-
-int efx_lro_init(struct net_lro_mgr *lro_mgr, struct efx_nic *efx)
-{
- size_t s = sizeof(struct net_lro_desc) * EFX_MAX_LRO_DESCRIPTORS;
- struct net_lro_desc *lro_arr;
-
- /* Allocate the LRO descriptors structure */
- lro_arr = kzalloc(s, GFP_KERNEL);
- if (lro_arr == NULL)
- return -ENOMEM;
-
- lro_mgr->lro_arr = lro_arr;
- lro_mgr->max_desc = EFX_MAX_LRO_DESCRIPTORS;
- lro_mgr->max_aggr = EFX_MAX_LRO_AGGR;
- lro_mgr->frag_align_pad = EFX_PAGE_SKB_ALIGN;
-
- lro_mgr->get_skb_header = efx_lro_get_skb_hdr;
- lro_mgr->get_frag_header = efx_get_frag_hdr;
- lro_mgr->dev = efx->net_dev;
-
- lro_mgr->features = LRO_F_NAPI;
-
- /* We can pass packets up with the checksum intact */
- lro_mgr->ip_summed = CHECKSUM_UNNECESSARY;
-
- lro_mgr->ip_summed_aggr = CHECKSUM_UNNECESSARY;
-
- return 0;
-}
-
-void efx_lro_fini(struct net_lro_mgr *lro_mgr)
-{
- kfree(lro_mgr->lro_arr);
- lro_mgr->lro_arr = NULL;
-}
-
/**
* efx_init_rx_buffer_skb - create new RX buffer using skb-based allocation
*
@@ -549,77 +446,31 @@ static void efx_rx_packet__check_len(struct efx_rx_queue *rx_queue,
static void efx_rx_packet_lro(struct efx_channel *channel,
struct efx_rx_buffer *rx_buf)
{
- struct net_lro_mgr *lro_mgr = &channel->lro_mgr;
- void *priv = channel;
+ struct napi_struct *napi = &channel->napi_str;
/* Pass the skb/page into the LRO engine */
if (rx_buf->page) {
- struct skb_frag_struct frags;
+ struct napi_gro_fraginfo info;
- frags.page = rx_buf->page;
- frags.page_offset = efx_rx_buf_offset(rx_buf);
- frags.size = rx_buf->len;
+ info.frags[0].page = rx_buf->page;
+ info.frags[0].page_offset = efx_rx_buf_offset(rx_buf);
+ info.frags[0].size = rx_buf->len;
+ info.nr_frags = 1;
+ info.ip_summed = CHECKSUM_UNNECESSARY;
+ info.len = rx_buf->len;
- lro_receive_frags(lro_mgr, &frags, rx_buf->len,
- rx_buf->len, priv, 0);
+ napi_gro_frags(napi, &info);
EFX_BUG_ON_PARANOID(rx_buf->skb);
rx_buf->page = NULL;
} else {
EFX_BUG_ON_PARANOID(!rx_buf->skb);
- lro_receive_skb(lro_mgr, rx_buf->skb, priv);
+ napi_gro_receive(napi, rx_buf->skb);
rx_buf->skb = NULL;
}
}
-/* Allocate and construct an SKB around a struct page.*/
-static struct sk_buff *efx_rx_mk_skb(struct efx_rx_buffer *rx_buf,
- struct efx_nic *efx,
- int hdr_len)
-{
- struct sk_buff *skb;
-
- /* Allocate an SKB to store the headers */
- skb = netdev_alloc_skb(efx->net_dev, hdr_len + EFX_PAGE_SKB_ALIGN);
- if (unlikely(skb == NULL)) {
- EFX_ERR_RL(efx, "RX out of memory for skb\n");
- return NULL;
- }
-
- EFX_BUG_ON_PARANOID(skb_shinfo(skb)->nr_frags);
- EFX_BUG_ON_PARANOID(rx_buf->len < hdr_len);
-
- skb->ip_summed = CHECKSUM_UNNECESSARY;
- skb_reserve(skb, EFX_PAGE_SKB_ALIGN);
-
- skb->len = rx_buf->len;
- skb->truesize = rx_buf->len + sizeof(struct sk_buff);
- memcpy(skb->data, rx_buf->data, hdr_len);
- skb->tail += hdr_len;
-
- /* Append the remaining page onto the frag list */
- if (unlikely(rx_buf->len > hdr_len)) {
- struct skb_frag_struct *frag = skb_shinfo(skb)->frags;
- frag->page = rx_buf->page;
- frag->page_offset = efx_rx_buf_offset(rx_buf) + hdr_len;
- frag->size = skb->len - hdr_len;
- skb_shinfo(skb)->nr_frags = 1;
- skb->data_len = frag->size;
- } else {
- __free_pages(rx_buf->page, efx->rx_buffer_order);
- skb->data_len = 0;
- }
-
- /* Ownership has transferred from the rx_buf to skb */
- rx_buf->page = NULL;
-
- /* Move past the ethernet header */
- skb->protocol = eth_type_trans(skb, efx->net_dev);
-
- return skb;
-}
-
void efx_rx_packet(struct efx_rx_queue *rx_queue, unsigned int index,
unsigned int len, bool checksummed, bool discard)
{
@@ -687,7 +538,6 @@ void __efx_rx_packet(struct efx_channel *channel,
{
struct efx_nic *efx = channel->efx;
struct sk_buff *skb;
- bool lro = !!(efx->net_dev->features & NETIF_F_LRO);
/* If we're in loopback test, then pass the packet directly to the
* loopback layer, and free the rx_buf here
@@ -709,41 +559,21 @@ void __efx_rx_packet(struct efx_channel *channel,
efx->net_dev);
}
- /* Both our generic-LRO and SFC-SSR support skb and page based
- * allocation, but neither support switching from one to the
- * other on the fly. If we spot that the allocation mode has
- * changed, then flush the LRO state.
- */
- if (unlikely(channel->rx_alloc_pop_pages != (rx_buf->page != NULL))) {
- efx_flush_lro(channel);
- channel->rx_alloc_pop_pages = (rx_buf->page != NULL);
- }
- if (likely(checksummed && lro)) {
+ if (likely(checksummed || rx_buf->page)) {
efx_rx_packet_lro(channel, rx_buf);
goto done;
}
- /* Form an skb if required */
- if (rx_buf->page) {
- int hdr_len = min(rx_buf->len, EFX_SKB_HEADERS);
- skb = efx_rx_mk_skb(rx_buf, efx, hdr_len);
- if (unlikely(skb == NULL)) {
- efx_free_rx_buffer(efx, rx_buf);
- goto done;
- }
- } else {
- /* We now own the SKB */
- skb = rx_buf->skb;
- rx_buf->skb = NULL;
- }
+ /* We now own the SKB */
+ skb = rx_buf->skb;
+ rx_buf->skb = NULL;
EFX_BUG_ON_PARANOID(rx_buf->page);
EFX_BUG_ON_PARANOID(rx_buf->skb);
EFX_BUG_ON_PARANOID(!skb);
/* Set the SKB flags */
- if (unlikely(!checksummed || !efx->rx_checksum_enabled))
- skb->ip_summed = CHECKSUM_NONE;
+ skb->ip_summed = CHECKSUM_NONE;
/* Pass the packet up */
netif_receive_skb(skb);
@@ -760,7 +590,7 @@ void efx_rx_strategy(struct efx_channel *channel)
enum efx_rx_alloc_method method = rx_alloc_method;
/* Only makes sense to use page based allocation if LRO is enabled */
- if (!(channel->efx->net_dev->features & NETIF_F_LRO)) {
+ if (!(channel->efx->net_dev->features & NETIF_F_GRO)) {
method = RX_ALLOC_METHOD_SKB;
} else if (method == RX_ALLOC_METHOD_AUTO) {
/* Constrain the rx_alloc_level */
@@ -865,11 +695,6 @@ void efx_remove_rx_queue(struct efx_rx_queue *rx_queue)
rx_queue->buffer = NULL;
}
-void efx_flush_lro(struct efx_channel *channel)
-{
- lro_flush_all(&channel->lro_mgr);
-}
-
module_param(rx_alloc_method, int, 0644);
MODULE_PARM_DESC(rx_alloc_method, "Allocation method used for RX buffers");
diff --git a/drivers/net/sfc/rx.h b/drivers/net/sfc/rx.h
index 0e88a9d..42ee755 100644
--- a/drivers/net/sfc/rx.h
+++ b/drivers/net/sfc/rx.h
@@ -17,9 +17,6 @@ void efx_remove_rx_queue(struct efx_rx_queue *rx_queue);
void efx_init_rx_queue(struct efx_rx_queue *rx_queue);
void efx_fini_rx_queue(struct efx_rx_queue *rx_queue);
-int efx_lro_init(struct net_lro_mgr *lro_mgr, struct efx_nic *efx);
-void efx_lro_fini(struct net_lro_mgr *lro_mgr);
-void efx_flush_lro(struct efx_channel *channel);
void efx_rx_strategy(struct efx_channel *channel);
void efx_fast_push_rx_descriptors(struct efx_rx_queue *rx_queue);
void efx_rx_work(struct work_struct *data);
diff --git a/drivers/net/sfc/sfe4001.c b/drivers/net/sfc/sfe4001.c
index 16b80ac..d21d014 100644
--- a/drivers/net/sfc/sfe4001.c
+++ b/drivers/net/sfc/sfe4001.c
@@ -24,6 +24,7 @@
*/
#include <linux/delay.h>
+#include <linux/rtnetlink.h>
#include "net_driver.h"
#include "efx.h"
#include "phy.h"
diff --git a/drivers/net/sfc/tenxpress.c b/drivers/net/sfc/tenxpress.c
index b976876..67f834e 100644
--- a/drivers/net/sfc/tenxpress.c
+++ b/drivers/net/sfc/tenxpress.c
@@ -8,6 +8,7 @@
*/
#include <linux/delay.h>
+#include <linux/rtnetlink.h>
#include <linux/seq_file.h>
#include "efx.h"
#include "mdio_10g.h"
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [1/2] e1000e: Invoke VLAN GRO handler
2009-01-13 9:26 [1/2] e1000e: Invoke VLAN GRO handler Herbert Xu
2009-01-13 9:28 ` [2/2] igb: Replace LRO with GRO Herbert Xu
2009-01-15 4:24 ` sfc: " Herbert Xu
@ 2009-01-15 4:38 ` David Miller
2009-01-15 6:59 ` cxgb3: Replace LRO with GRO Herbert Xu
3 siblings, 0 replies; 37+ messages in thread
From: David Miller @ 2009-01-15 4:38 UTC (permalink / raw)
To: herbert; +Cc: jeffrey.t.kirsher, netdev
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Tue, 13 Jan 2009 20:26:25 +1100
> e1000e: Invoke VLAN GRO handler
>
> Now that VLAN has GRO support as well, we can call its GRO handler
> as well.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Applied to net-next-2.6
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [2/2] igb: Replace LRO with GRO
2009-01-15 0:32 ` Herbert Xu
2009-01-15 1:35 ` Jeff Kirsher
@ 2009-01-15 4:40 ` David Miller
1 sibling, 0 replies; 37+ messages in thread
From: David Miller @ 2009-01-15 4:40 UTC (permalink / raw)
To: herbert; +Cc: jeffrey.t.kirsher, netdev, emil.s.tantilov
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 15 Jan 2009 11:32:46 +1100
> gro: Fix page ref count for skbs freed normally
>
> When an skb with page frags is merged into an existing one, we
> cannibalise its reference count. This is OK when the skb is
> reused because we set nr_frags to zero in that case. However,
> for the case where the skb is freed through kfree_skb, we didn't
> clear nr_frags which causes the page to be freed prematurely.
>
> This is fixed by moving the skb resetting into skb_gro_receive.
>
> Reported-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Applied to net-2.6
^ permalink raw reply [flat|nested] 37+ messages in thread
* cxgb3: Replace LRO with GRO
2009-01-13 9:26 [1/2] e1000e: Invoke VLAN GRO handler Herbert Xu
` (2 preceding siblings ...)
2009-01-15 4:38 ` [1/2] e1000e: Invoke VLAN GRO handler David Miller
@ 2009-01-15 6:59 ` Herbert Xu
3 siblings, 0 replies; 37+ messages in thread
From: Herbert Xu @ 2009-01-15 6:59 UTC (permalink / raw)
To: David S. Miller, Divy Le Ray, netdev
Hi Dave:
This one is slightly different because the driver has chosen to
expose to user-space per-queue switches to control LRO. I've
kept those for now.
cxgb3: Replace LRO with GRO
This patch makes cxgb3 invoke the GRO hooks instead of LRO. As
GRO has a compatible external interface to LRO this is a very
straightforward replacement.
I've kept the ioctl controls for per-queue LRO switches. However,
we should not encourage anyone to use these.
Because of that, I've also kept the skb construction code in
cxgb3. Hopefully we can phase out those per-queue switches
and then kill this too.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index ee37540..d6bf215 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2407,7 +2407,6 @@ config CHELSIO_T3
tristate "Chelsio Communications T3 10Gb Ethernet support"
depends on CHELSIO_T3_DEPENDS
select FW_LOADER
- select INET_LRO
help
This driver supports Chelsio T3-based gigabit and 10Gb Ethernet
adapters.
diff --git a/drivers/net/cxgb3/adapter.h b/drivers/net/cxgb3/adapter.h
index 5b346f9..41b87a1 100644
--- a/drivers/net/cxgb3/adapter.h
+++ b/drivers/net/cxgb3/adapter.h
@@ -42,7 +42,6 @@
#include <linux/cache.h>
#include <linux/mutex.h>
#include <linux/bitops.h>
-#include <linux/inet_lro.h>
#include "t3cdev.h"
#include <asm/io.h>
@@ -173,15 +172,11 @@ enum { /* per port SGE statistics */
SGE_PSTAT_TX_CSUM, /* # of TX checksum offloads */
SGE_PSTAT_VLANEX, /* # of VLAN tag extractions */
SGE_PSTAT_VLANINS, /* # of VLAN tag insertions */
- SGE_PSTAT_LRO_AGGR, /* # of page chunks added to LRO sessions */
- SGE_PSTAT_LRO_FLUSHED, /* # of flushed LRO sessions */
- SGE_PSTAT_LRO_NO_DESC, /* # of overflown LRO sessions */
SGE_PSTAT_MAX /* must be last */
};
-#define T3_MAX_LRO_SES 8
-#define T3_MAX_LRO_MAX_PKTS 64
+struct napi_gro_fraginfo;
struct sge_qset { /* an SGE queue set */
struct adapter *adap;
@@ -189,12 +184,8 @@ struct sge_qset { /* an SGE queue set */
struct sge_rspq rspq;
struct sge_fl fl[SGE_RXQ_PER_SET];
struct sge_txq txq[SGE_TXQ_PER_SET];
- struct net_lro_mgr lro_mgr;
- struct net_lro_desc lro_desc[T3_MAX_LRO_SES];
- struct skb_frag_struct *lro_frag_tbl;
- int lro_nfrags;
+ struct napi_gro_fraginfo lro_frag_tbl;
int lro_enabled;
- int lro_frag_len;
void *lro_va;
struct net_device *netdev;
struct netdev_queue *tx_q; /* associated netdev TX queue */
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index 2847f94..e94430d 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -508,19 +508,9 @@ static void set_qset_lro(struct net_device *dev, int qset_idx, int val)
{
struct port_info *pi = netdev_priv(dev);
struct adapter *adapter = pi->adapter;
- int i, lro_on = 1;
adapter->params.sge.qset[qset_idx].lro = !!val;
adapter->sge.qs[qset_idx].lro_enabled = !!val;
-
- /* let ethtool report LRO on only if all queues are LRO enabled */
- for (i = pi->first_qset; i < pi->first_qset + pi->nqsets; ++i)
- lro_on &= adapter->params.sge.qset[i].lro;
-
- if (lro_on)
- dev->features |= NETIF_F_LRO;
- else
- dev->features &= ~NETIF_F_LRO;
}
/**
@@ -1433,9 +1423,9 @@ static void get_stats(struct net_device *dev, struct ethtool_stats *stats,
*data++ = collect_sge_port_stats(adapter, pi, SGE_PSTAT_VLANINS);
*data++ = collect_sge_port_stats(adapter, pi, SGE_PSTAT_TX_CSUM);
*data++ = collect_sge_port_stats(adapter, pi, SGE_PSTAT_RX_CSUM_GOOD);
- *data++ = collect_sge_port_stats(adapter, pi, SGE_PSTAT_LRO_AGGR);
- *data++ = collect_sge_port_stats(adapter, pi, SGE_PSTAT_LRO_FLUSHED);
- *data++ = collect_sge_port_stats(adapter, pi, SGE_PSTAT_LRO_NO_DESC);
+ *data++ = 0;
+ *data++ = 0;
+ *data++ = 0;
*data++ = s->rx_cong_drops;
*data++ = s->num_toggled;
@@ -1824,25 +1814,6 @@ static void get_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
memset(&wol->sopass, 0, sizeof(wol->sopass));
}
-static int cxgb3_set_flags(struct net_device *dev, u32 data)
-{
- struct port_info *pi = netdev_priv(dev);
- int i;
-
- if (data & ETH_FLAG_LRO) {
- if (!pi->rx_csum_offload)
- return -EINVAL;
-
- for (i = pi->first_qset; i < pi->first_qset + pi->nqsets; i++)
- set_qset_lro(dev, i, 1);
-
- } else
- for (i = pi->first_qset; i < pi->first_qset + pi->nqsets; i++)
- set_qset_lro(dev, i, 0);
-
- return 0;
-}
-
static const struct ethtool_ops cxgb_ethtool_ops = {
.get_settings = get_settings,
.set_settings = set_settings,
@@ -1872,8 +1843,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
.get_regs = get_regs,
.get_wol = get_wol,
.set_tso = ethtool_op_set_tso,
- .get_flags = ethtool_op_get_flags,
- .set_flags = cxgb3_set_flags,
};
static int in_range(int val, int lo, int hi)
diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index 6c641a8..893783b 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -585,8 +585,7 @@ static void t3_reset_qset(struct sge_qset *q)
memset(q->txq, 0, sizeof(struct sge_txq) * SGE_TXQ_PER_SET);
q->txq_stopped = 0;
q->tx_reclaim_timer.function = NULL; /* for t3_stop_sge_timers() */
- kfree(q->lro_frag_tbl);
- q->lro_nfrags = q->lro_frag_len = 0;
+ q->lro_frag_tbl.nr_frags = q->lro_frag_tbl.len = 0;
}
@@ -1945,10 +1944,8 @@ static void rx_eth(struct adapter *adap, struct sge_rspq *rq,
qs->port_stats[SGE_PSTAT_VLANEX]++;
if (likely(grp))
if (lro)
- lro_vlan_hwaccel_receive_skb(&qs->lro_mgr, skb,
- grp,
- ntohs(p->vlan),
- p);
+ vlan_gro_receive(&qs->napi, grp,
+ ntohs(p->vlan), skb);
else {
if (unlikely(pi->iscsi_ipv4addr &&
is_arp(skb))) {
@@ -1965,7 +1962,7 @@ static void rx_eth(struct adapter *adap, struct sge_rspq *rq,
dev_kfree_skb_any(skb);
} else if (rq->polling) {
if (lro)
- lro_receive_skb(&qs->lro_mgr, skb, p);
+ napi_gro_receive(&qs->napi, skb);
else {
if (unlikely(pi->iscsi_ipv4addr && is_arp(skb)))
cxgb3_arp_process(adap, skb);
@@ -1981,59 +1978,6 @@ static inline int is_eth_tcp(u32 rss)
}
/**
- * lro_frame_ok - check if an ingress packet is eligible for LRO
- * @p: the CPL header of the packet
- *
- * Returns true if a received packet is eligible for LRO.
- * The following conditions must be true:
- * - packet is TCP/IP Ethernet II (checked elsewhere)
- * - not an IP fragment
- * - no IP options
- * - TCP/IP checksums are correct
- * - the packet is for this host
- */
-static inline int lro_frame_ok(const struct cpl_rx_pkt *p)
-{
- const struct ethhdr *eh = (struct ethhdr *)(p + 1);
- const struct iphdr *ih = (struct iphdr *)(eh + 1);
-
- return (*((u8 *)p + 1) & 0x90) == 0x10 && p->csum == htons(0xffff) &&
- eh->h_proto == htons(ETH_P_IP) && ih->ihl == (sizeof(*ih) >> 2);
-}
-
-static int t3_get_lro_header(void **eh, void **iph, void **tcph,
- u64 *hdr_flags, void *priv)
-{
- const struct cpl_rx_pkt *cpl = priv;
-
- if (!lro_frame_ok(cpl))
- return -1;
-
- *eh = (struct ethhdr *)(cpl + 1);
- *iph = (struct iphdr *)((struct ethhdr *)*eh + 1);
- *tcph = (struct tcphdr *)((struct iphdr *)*iph + 1);
-
- *hdr_flags = LRO_IPV4 | LRO_TCP;
- return 0;
-}
-
-static int t3_get_skb_header(struct sk_buff *skb,
- void **iph, void **tcph, u64 *hdr_flags,
- void *priv)
-{
- void *eh;
-
- return t3_get_lro_header(&eh, iph, tcph, hdr_flags, priv);
-}
-
-static int t3_get_frag_header(struct skb_frag_struct *frag, void **eh,
- void **iph, void **tcph, u64 *hdr_flags,
- void *priv)
-{
- return t3_get_lro_header(eh, iph, tcph, hdr_flags, priv);
-}
-
-/**
* lro_add_page - add a page chunk to an LRO session
* @adap: the adapter
* @qs: the associated queue set
@@ -2049,8 +1993,9 @@ static void lro_add_page(struct adapter *adap, struct sge_qset *qs,
{
struct rx_sw_desc *sd = &fl->sdesc[fl->cidx];
struct cpl_rx_pkt *cpl;
- struct skb_frag_struct *rx_frag = qs->lro_frag_tbl;
- int nr_frags = qs->lro_nfrags, frag_len = qs->lro_frag_len;
+ struct skb_frag_struct *rx_frag = qs->lro_frag_tbl.frags;
+ int nr_frags = qs->lro_frag_tbl.nr_frags;
+ int frag_len = qs->lro_frag_tbl.len;
int offset = 0;
if (!nr_frags) {
@@ -2069,13 +2014,13 @@ static void lro_add_page(struct adapter *adap, struct sge_qset *qs,
rx_frag->page_offset = sd->pg_chunk.offset + offset;
rx_frag->size = len;
frag_len += len;
- qs->lro_nfrags++;
- qs->lro_frag_len = frag_len;
+ qs->lro_frag_tbl.nr_frags++;
+ qs->lro_frag_tbl.len = frag_len;
if (!complete)
return;
- qs->lro_nfrags = qs->lro_frag_len = 0;
+ qs->lro_frag_tbl.ip_summed = CHECKSUM_UNNECESSARY;
cpl = qs->lro_va;
if (unlikely(cpl->vlan_valid)) {
@@ -2084,35 +2029,15 @@ static void lro_add_page(struct adapter *adap, struct sge_qset *qs,
struct vlan_group *grp = pi->vlan_grp;
if (likely(grp != NULL)) {
- lro_vlan_hwaccel_receive_frags(&qs->lro_mgr,
- qs->lro_frag_tbl,
- frag_len, frag_len,
- grp, ntohs(cpl->vlan),
- cpl, 0);
- return;
+ vlan_gro_frags(&qs->napi, grp, ntohs(cpl->vlan),
+ &qs->lro_frag_tbl);
+ goto out;
}
}
- lro_receive_frags(&qs->lro_mgr, qs->lro_frag_tbl,
- frag_len, frag_len, cpl, 0);
-}
+ napi_gro_frags(&qs->napi, &qs->lro_frag_tbl);
-/**
- * init_lro_mgr - initialize a LRO manager object
- * @lro_mgr: the LRO manager object
- */
-static void init_lro_mgr(struct sge_qset *qs, struct net_lro_mgr *lro_mgr)
-{
- lro_mgr->dev = qs->netdev;
- lro_mgr->features = LRO_F_NAPI;
- lro_mgr->ip_summed = CHECKSUM_UNNECESSARY;
- lro_mgr->ip_summed_aggr = CHECKSUM_UNNECESSARY;
- lro_mgr->max_desc = T3_MAX_LRO_SES;
- lro_mgr->lro_arr = qs->lro_desc;
- lro_mgr->get_frag_header = t3_get_frag_header;
- lro_mgr->get_skb_header = t3_get_skb_header;
- lro_mgr->max_aggr = T3_MAX_LRO_MAX_PKTS;
- if (lro_mgr->max_aggr > MAX_SKB_FRAGS)
- lro_mgr->max_aggr = MAX_SKB_FRAGS;
+out:
+ qs->lro_frag_tbl.nr_frags = qs->lro_frag_tbl.len = 0;
}
/**
@@ -2356,10 +2281,6 @@ next_fl:
}
deliver_partial_bundle(&adap->tdev, q, offload_skbs, ngathered);
- lro_flush_all(&qs->lro_mgr);
- qs->port_stats[SGE_PSTAT_LRO_AGGR] = qs->lro_mgr.stats.aggregated;
- qs->port_stats[SGE_PSTAT_LRO_FLUSHED] = qs->lro_mgr.stats.flushed;
- qs->port_stats[SGE_PSTAT_LRO_NO_DESC] = qs->lro_mgr.stats.no_desc;
if (sleeping)
check_ring_db(adap, qs, sleeping);
@@ -2906,7 +2827,6 @@ int t3_sge_alloc_qset(struct adapter *adapter, unsigned int id, int nports,
{
int i, avail, ret = -ENOMEM;
struct sge_qset *q = &adapter->sge.qs[id];
- struct net_lro_mgr *lro_mgr = &q->lro_mgr;
init_qset_cntxt(q, id);
setup_timer(&q->tx_reclaim_timer, sge_timer_cb, (unsigned long)q);
@@ -2986,10 +2906,6 @@ int t3_sge_alloc_qset(struct adapter *adapter, unsigned int id, int nports,
q->fl[0].order = FL0_PG_ORDER;
q->fl[1].order = FL1_PG_ORDER;
- q->lro_frag_tbl = kcalloc(MAX_FRAME_SIZE / FL1_PG_CHUNK_SIZE + 1,
- sizeof(struct skb_frag_struct),
- GFP_KERNEL);
- q->lro_nfrags = q->lro_frag_len = 0;
spin_lock_irq(&adapter->sge.reg_lock);
/* FL threshold comparison uses < */
@@ -3041,8 +2957,6 @@ int t3_sge_alloc_qset(struct adapter *adapter, unsigned int id, int nports,
q->tx_q = netdevq;
t3_update_qset_coalesce(q, p);
- init_lro_mgr(q, lro_mgr);
-
avail = refill_fl(adapter, &q->fl[0], q->fl[0].size,
GFP_KERNEL | __GFP_COMP);
if (!avail) {
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: ixgbe: Replace LRO with GRO
2009-01-15 3:46 ` Herbert Xu
2009-01-15 4:22 ` Herbert Xu
@ 2009-01-16 23:32 ` Jeff Kirsher
2009-01-17 0:36 ` Herbert Xu
1 sibling, 1 reply; 37+ messages in thread
From: Jeff Kirsher @ 2009-01-16 23:32 UTC (permalink / raw)
To: Herbert Xu; +Cc: David S. Miller, netdev, Emil Tantilov
On Wed, Jan 14, 2009 at 7:46 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Wed, Jan 14, 2009 at 07:44:12PM +1100, Herbert Xu wrote:
>>
>> ixgbe: Replace LRO with GRO
>
> I forgot to delete the Kconfig dependency, here is an updated
> version.
>
> ixgbe: Replace LRO with GRO
>
> This patch makes igb invoke the GRO hooks instead of LRO. As
> GRO has a compatible external interface to LRO this is a very
> straightforward replacement.
>
> As GRO uses the napi structure to track the held packets, I've
> modified the code paths involved to pass that along.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
Adding Emil to provide further testing details...
We are seeing data corruption with this patch applied. When we
disable GRO the data corruption goes away.
We have this simple ftp test that does put/get and then compares
md5sum of the file. The file we get back is different then the one we
sent with GRO enabled. The file is around 34MB. Note that we don't
always see the corruption with 30meg file, but it becomes more
apparent as we increase the size of the file.
If this helps - Emil did a diff off of a hexdump output between the
sent and received file. Looks like the receive was cut off at the end:
--- sent.dump 2009-01-16 11:48:50.000000000 -0800
+++ received.dump 2009-01-16 11:48:47.000000000 -0800
@@ -62714,43 +62714,5 @@
00f4f90 0000 0000 ffff ffff 0000 0000 ffff ffff 00f4fa0 0000 0000
ffff ffff 0000 0000 ffff ffff 00f4fb0 0000 0000 ffff ffff 0000 0000
ffff ffff -00f4fc0 0000 0000 ffff ffff 0000 0000 ffff ffff -00f4fd0
0000 0000 ffff ffff 0000 0000 ffff ffff -00f4fe0 0000 0000 ffff ffff
0000 0000 ffff ffff -00f4ff0 0000 0000 ffff ffff 0000 0000 ffff ffff
-00f5000 0000 0000 ffff ffff 0000 0000 ffff ffff -00f5010 0000 0000
ffff ffff 0000 0000 ffff ffff -00f5020 0000 0000 ffff ffff 0000 0000
ffff ffff -00f5030 0000 0000 ffff ffff 0000 0000 ffff ffff -00f5040
0000 0000 ffff ffff 0000 0000 ffff ffff -00f5050 0000 0000 ffff ffff
0000 0000 ffff ffff -00f5060 0000 0000 ffff ffff 0000 0000 ffff ffff
-00f5070 0000 0000 ffff ffff 0000 0000 ffff ffff -00f5080 0000 0000
ffff ffff 0000 0000 ffff ffff -00f5090 0000 0000 ffff ffff 0000 0000
ffff ffff -00f50a0 0000 0000 ffff ffff 0000 0000 ffff ffff -00f50b0
0000 0000 ffff ffff 0000 0000 ffff ffff -00f50c0 0000 0000 ffff ffff
0000 0000 ffff ffff -00f50d0 0000 0000 ffff ffff 0000 0000 ffff ffff
-00f50e0 0000 0000 ffff ffff 0000 0000 ffff ffff -00f50f0 0000 0000
ffff ffff 0000 0000 ffff ffff -00f5100 0000 0000 ffff ffff 0000 0000
ffff ffff -00f5110 0000 0000 ffff ffff 0000 0000 ffff ffff -00f5120
0000 0000 ffff ffff 0000 0000 ffff ffff -00f5130 0000 0000 ffff ffff
0000 0000 ffff ffff -00f5140 0000 0000 ffff ffff 0000 0000 ffff ffff
-00f5150 0000 0000 ffff ffff 0000 0000 ffff ffff -00f5160 0000 0000
ffff ffff 0000 0000 ffff ffff -00f5170 0000 0000 ffff ffff 0000 0000
ffff ffff -00f5180 0000 0000 ffff ffff 0000 0000 ffff ffff -00f5190
0000 0000 ffff ffff 0000 0000 ffff ffff -00f51a0 0000 0000 ffff ffff
0000 0000 ffff ffff -00f51b0 0000 0000 ffff ffff 0000 0000 ffff ffff
-00f51c0 0000 0000 ffff ffff 0000 0000 ffff ffff -00f51d0 0000 0000
ffff ffff 0000 0000 ffff ffff -00f51e0 0000 0000 ffff ffff 0000 0000
ffff ffff -00f51f0 0000 0000 ffff ffff 0000 0000 ffff ffff -00f5200
0000 0000 ffff ffff 0000 0000 ffff ffff -00f5210 0000 0000 ffff ffff
0000 0000 ffff ffff -00f5220 0000 0000 ffff ffff 0000 0000 ffff ffff
-00f5230
+00f4fc0 0000 0000 ffff ffff
+00f4fc8
--
Cheers,
Jeff
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: ixgbe: Replace LRO with GRO
2009-01-16 23:32 ` Jeff Kirsher
@ 2009-01-17 0:36 ` Herbert Xu
2009-01-17 1:06 ` Tantilov, Emil S
0 siblings, 1 reply; 37+ messages in thread
From: Herbert Xu @ 2009-01-17 0:36 UTC (permalink / raw)
To: Jeff Kirsher; +Cc: David S. Miller, netdev, Emil Tantilov
On Fri, Jan 16, 2009 at 03:32:56PM -0800, Jeff Kirsher wrote:
>
> Adding Emil to provide further testing details...
>
> We are seeing data corruption with this patch applied. When we
> disable GRO the data corruption goes away.
>
> We have this simple ftp test that does put/get and then compares
> md5sum of the file. The file we get back is different then the one we
> sent with GRO enabled. The file is around 34MB. Note that we don't
> always see the corruption with 30meg file, but it becomes more
> apparent as we increase the size of the file.
>
> If this helps - Emil did a diff off of a hexdump output between the
> sent and received file. Looks like the receive was cut off at the end:
Thanks for the info Jeff!
As the file isn't too big, could you do two packet dumps, one
at the sender and one at the receiver? Capturing just the headers
should be sufficient for now.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 37+ messages in thread
* RE: ixgbe: Replace LRO with GRO
2009-01-17 0:36 ` Herbert Xu
@ 2009-01-17 1:06 ` Tantilov, Emil S
2009-01-17 3:45 ` Herbert Xu
0 siblings, 1 reply; 37+ messages in thread
From: Tantilov, Emil S @ 2009-01-17 1:06 UTC (permalink / raw)
To: Herbert Xu, Kirsher, Jeffrey T; +Cc: David S. Miller, netdev@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 1425 bytes --]
See attached.
Thanks,
Emil
-----Original Message-----
From: Herbert Xu [mailto:herbert@gondor.apana.org.au]
Sent: Friday, January 16, 2009 4:36 PM
To: Kirsher, Jeffrey T
Cc: David S. Miller; netdev@vger.kernel.org; Tantilov, Emil S
Subject: Re: ixgbe: Replace LRO with GRO
On Fri, Jan 16, 2009 at 03:32:56PM -0800, Jeff Kirsher wrote:
>
> Adding Emil to provide further testing details...
>
> We are seeing data corruption with this patch applied. When we
> disable GRO the data corruption goes away.
>
> We have this simple ftp test that does put/get and then compares
> md5sum of the file. The file we get back is different then the one we
> sent with GRO enabled. The file is around 34MB. Note that we don't
> always see the corruption with 30meg file, but it becomes more
> apparent as we increase the size of the file.
>
> If this helps - Emil did a diff off of a hexdump output between the
> sent and received file. Looks like the receive was cut off at the end:
Thanks for the info Jeff!
As the file isn't too big, could you do two packet dumps, one
at the sender and one at the receiver? Capturing just the headers
should be sufficient for now.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[-- Attachment #2: gro_corruption.tar.gz --]
[-- Type: application/x-gzip, Size: 92972 bytes --]
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: ixgbe: Replace LRO with GRO
2009-01-17 1:06 ` Tantilov, Emil S
@ 2009-01-17 3:45 ` Herbert Xu
2009-01-17 8:07 ` Tantilov, Emil S
0 siblings, 1 reply; 37+ messages in thread
From: Herbert Xu @ 2009-01-17 3:45 UTC (permalink / raw)
To: Tantilov, Emil S
Cc: Kirsher, Jeffrey T, David S. Miller, netdev@vger.kernel.org
On Fri, Jan 16, 2009 at 06:06:18PM -0700, Tantilov, Emil S wrote:
> See attached.
Thanks Emil!
However, it's not very clear who is the sender in this case :)
FTP has a separate data connection, so the sender of the control
connection is not necessarily the sender of the data connection.
In fact, looking at the dump of the data connection it'd appear
that the one you marked as the sender dump is in fact the receiver
for the data connection:
12:07:17.045244 IP (tos 0x8, ttl 64, id 19084, offset 0, flags [DF], proto TCP (6), length 1500) 190.0.15.1.9504 > 190.0.15.3.41206: . 1:1449(1448) ack 1 win 46 <nop,nop,timestamp 199022661 4897363>
12:07:17.045260 IP (tos 0x8, ttl 64, id 15863, offset 0, flags [DF], proto TCP (6), length 52) 190.0.15.3.41206 > 190.0.15.1.9504: ., cksum 0x72a1 (correct), ack 1449 win 69 <nop,nop,timestamp 4897363 199022661>
We can see that because the delta between the data and the ack
is much smaller on this side.
It's also peculiar that there seems to be no GRO aggregation
at all on either side. Were these dumps taken with it enabled
on both sides? You'll need a patched ethtool to check.
The transfer itself appears to be encountering a lot of packet loss,
which presumably caused a premature end which is why you saw a
missing chunk in the file.
However, it is not very clear what is causing this loss. BTW,
did you apply the fix b0059b50b70dc3a908bea4ece2f9494a22200018
(gro: Fix page ref count for skbs freed normally) on both sides?
That one is required to make igb or ixgbe work properly.
Another thing to try is to disable GSO/TSO on both sides as that
is complicating the interpretation of the dumps.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 37+ messages in thread
* RE: ixgbe: Replace LRO with GRO
2009-01-17 3:45 ` Herbert Xu
@ 2009-01-17 8:07 ` Tantilov, Emil S
2009-01-17 9:52 ` Herbert Xu
0 siblings, 1 reply; 37+ messages in thread
From: Tantilov, Emil S @ 2009-01-17 8:07 UTC (permalink / raw)
To: Herbert Xu; +Cc: Kirsher, Jeffrey T, David S. Miller, netdev@vger.kernel.org
Herbert Xu wrote:
> On Fri, Jan 16, 2009 at 06:06:18PM -0700, Tantilov, Emil S wrote:
>> See attached.
>
> Thanks Emil!
>
> However, it's not very clear who is the sender in this case :)
>
> FTP has a separate data connection, so the sender of the control
> connection is not necessarily the sender of the data connection.
>
> In fact, looking at the dump of the data connection it'd appear
> that the one you marked as the sender dump is in fact the receiver
> for the data connection:
The test is done by a script that:
1. checks the connection to the secondary system (ping)
2. copies the test file to the secondary test system (put)
3. copies the same file again from the secondary to the test system (get)
4. Compares the checksum of the sent vs. the received file
Since I had tcpdump on both ends - I think you should see both session in each. The file I named sender (probably a bit incorrect) is just the capture taken from the primary test system (the one that started the script). This is also the system on which I have the latest git with GRO.
> 12:07:17.045244 IP (tos 0x8, ttl 64, id 19084, offset 0, flags [DF],
> proto TCP (6), length 1500) 190.0.15.1.9504 > 190.0.15.3.41206: .
> 1:1449(1448) ack 1 win 46 <nop,nop,timestamp 199022661 4897363>
> 12:07:17.045260 IP (tos 0x8, ttl 64, id 15863, offset 0, flags [DF],
> proto TCP (6), length 52) 190.0.15.3.41206 > 190.0.15.1.9504: .,
> cksum 0x72a1 (correct), ack 1449 win 69 <nop,nop,timestamp 4897363
> 199022661>
>
> We can see that because the delta between the data and the ack
> is much smaller on this side.
>
> It's also peculiar that there seems to be no GRO aggregation
> at all on either side. Were these dumps taken with it enabled
> on both sides? You'll need a patched ethtool to check.
No only one of the systems (the sender) runs the latest git with GRO. The secondary test system runs on stock RHEL5.2 kernel.
> The transfer itself appears to be encountering a lot of packet loss,
> which presumably caused a premature end which is why you saw a
> missing chunk in the file.
>
> However, it is not very clear what is causing this loss. BTW,
> did you apply the fix b0059b50b70dc3a908bea4ece2f9494a22200018
> (gro: Fix page ref count for skbs freed normally) on both sides?
> That one is required to make igb or ixgbe work properly.
Ah - doesn't seem like I have this fix. Jeff will have to pull this in our test tree.
> Another thing to try is to disable GSO/TSO on both sides as that
> is complicating the interpretation of the dumps.
Will do if the fix above doesn't help.
Thanks,
Emil
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: ixgbe: Replace LRO with GRO
2009-01-17 8:07 ` Tantilov, Emil S
@ 2009-01-17 9:52 ` Herbert Xu
2009-01-17 13:04 ` Herbert Xu
2009-01-17 15:44 ` Tantilov, Emil S
0 siblings, 2 replies; 37+ messages in thread
From: Herbert Xu @ 2009-01-17 9:52 UTC (permalink / raw)
To: Tantilov, Emil S
Cc: Kirsher, Jeffrey T, David S. Miller, netdev@vger.kernel.org
On Sat, Jan 17, 2009 at 01:07:46AM -0700, Tantilov, Emil S wrote:
>
> The test is done by a script that:
>
> 1. checks the connection to the secondary system (ping)
> 2. copies the test file to the secondary test system (put)
> 3. copies the same file again from the secondary to the test system (get)
> 4. Compares the checksum of the sent vs. the received file
>
> Since I had tcpdump on both ends - I think you should see both session in each. The file I named sender (probably a bit incorrect) is just the capture taken from the primary test system (the one that started the script). This is also the system on which I have the latest git with GRO.
Thanks, that makes a lot more sense. Just to confirm, the second
part of the sender dump should be the one showing GRO, right?
> > However, it is not very clear what is causing this loss. BTW,
> > did you apply the fix b0059b50b70dc3a908bea4ece2f9494a22200018
> > (gro: Fix page ref count for skbs freed normally) on both sides?
> > That one is required to make igb or ixgbe work properly.
> Ah - doesn't seem like I have this fix. Jeff will have to pull this in our test tree.
OK, although this bug should only manifest itself if aggregation
occurs, which apparently isn't the case according to the dump.
> > Another thing to try is to disable GSO/TSO on both sides as that
> > is complicating the interpretation of the dumps.
> Will do if the fix above doesn't help.
BTW, when you're testing without GRO are you disabling GRO with
ethtool or loading an ixgbe driver without the patch? If the latter
please try it with ethtool (even if you don't have a patched ethtool
that supports GRO you can always disable RX checksums).
Maybe I just stuffed up the ixgbe patch.
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: ixgbe: Replace LRO with GRO
2009-01-17 9:52 ` Herbert Xu
@ 2009-01-17 13:04 ` Herbert Xu
2009-01-20 23:00 ` Tantilov, Emil S
2009-01-17 15:44 ` Tantilov, Emil S
1 sibling, 1 reply; 37+ messages in thread
From: Herbert Xu @ 2009-01-17 13:04 UTC (permalink / raw)
To: Tantilov, Emil S
Cc: Kirsher, Jeffrey T, David S. Miller, netdev@vger.kernel.org
On Sat, Jan 17, 2009 at 08:52:56PM +1100, Herbert Xu wrote:
>
> > > However, it is not very clear what is causing this loss. BTW,
> > > did you apply the fix b0059b50b70dc3a908bea4ece2f9494a22200018
> > > (gro: Fix page ref count for skbs freed normally) on both sides?
> > > That one is required to make igb or ixgbe work properly.
> > Ah - doesn't seem like I have this fix. Jeff will have to pull this in our test tree.
>
> OK, although this bug should only manifest itself if aggregation
> occurs, which apparently isn't the case according to the dump.
Doh, it looks like the bug fix itself was broken. In fact, I
think you do have the fix as it results in exactly what you were
seeing.
Please apply this on top of it.
gro: Fix merging of paged packets
The previous fix to paged packets broke the merging because it
reset the skb->len before we added it to the merged packet. This
wasn't detected because it simply resulted in the truncation of
the packet while the missing bit is subsequently retransmitted.
The fix is to store skb->len before we clobber it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 65eac77..9127c47 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2588,8 +2588,9 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
struct sk_buff *nskb;
unsigned int headroom;
unsigned int hlen = p->data - skb_mac_header(p);
+ unsigned int len = skb->len;
- if (hlen + p->len + skb->len >= 65536)
+ if (hlen + p->len + len >= 65536)
return -E2BIG;
if (skb_shinfo(p)->frag_list)
@@ -2651,9 +2652,9 @@ merge:
done:
NAPI_GRO_CB(p)->count++;
- p->data_len += skb->len;
- p->truesize += skb->len;
- p->len += skb->len;
+ p->data_len += len;
+ p->truesize += len;
+ p->len += len;
NAPI_GRO_CB(skb)->same_flow = 1;
return 0;
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 37+ messages in thread
* RE: ixgbe: Replace LRO with GRO
2009-01-17 9:52 ` Herbert Xu
2009-01-17 13:04 ` Herbert Xu
@ 2009-01-17 15:44 ` Tantilov, Emil S
1 sibling, 0 replies; 37+ messages in thread
From: Tantilov, Emil S @ 2009-01-17 15:44 UTC (permalink / raw)
To: Herbert Xu; +Cc: Kirsher, Jeffrey T, David S. Miller, netdev@vger.kernel.org
Herbert Xu wrote:
> On Sat, Jan 17, 2009 at 01:07:46AM -0700, Tantilov, Emil S wrote:
>>
>> The test is done by a script that:
>>
>> 1. checks the connection to the secondary system (ping)
>> 2. copies the test file to the secondary test system (put)
>> 3. copies the same file again from the secondary to the test system
>> (get)
>> 4. Compares the checksum of the sent vs. the received file
>>
>> Since I had tcpdump on both ends - I think you should see both
>> session in each. The file I named sender (probably a bit incorrect)
>> is just the capture taken from the primary test system (the one that
>> started the script). This is also the system on which I have the
>> latest git with GRO.
>
> Thanks, that makes a lot more sense. Just to confirm, the second
> part of the sender dump should be the one showing GRO, right?
Yes - the second part is the one where the primary test system is receiving (hence GRO).
>>> However, it is not very clear what is causing this loss. BTW,
>>> did you apply the fix b0059b50b70dc3a908bea4ece2f9494a22200018
>>> (gro: Fix page ref count for skbs freed normally) on both sides?
>>> That one is required to make igb or ixgbe work properly.
>> Ah - doesn't seem like I have this fix. Jeff will have to pull this
>> in our test tree.
>
> OK, although this bug should only manifest itself if aggregation
> occurs, which apparently isn't the case according to the dump.
>
>>> Another thing to try is to disable GSO/TSO on both sides as that
>>> is complicating the interpretation of the dumps.
>> Will do if the fix above doesn't help.
>
> BTW, when you're testing without GRO are you disabling GRO with
> ethtool or loading an ixgbe driver without the patch? If the latter
> please try it with ethtool (even if you don't have a patched ethtool
> that supports GRO you can always disable RX checksums).
To disable GRO I just recompile ixgbe with "netdev->features |= NETIF_F_GRO" commented out. Is there a patch for ethtool?
>
> Maybe I just stuffed up the ixgbe patch.
>
> Thanks,
Thanks for looking into it.
Emil
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [2/2] igb: Replace LRO with GRO
2009-01-13 9:28 ` [2/2] igb: Replace LRO with GRO Herbert Xu
2009-01-14 8:44 ` ixgbe: " Herbert Xu
2009-01-14 11:49 ` [2/2] igb: " Jeff Kirsher
@ 2009-01-19 5:47 ` David Miller
2009-01-19 11:16 ` Herbert Xu
2 siblings, 1 reply; 37+ messages in thread
From: David Miller @ 2009-01-19 5:47 UTC (permalink / raw)
To: herbert; +Cc: jeffrey.t.kirsher, netdev
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Tue, 13 Jan 2009 20:28:28 +1100
> igb: Replace LRO with GRO
...
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Have the crashes reported by Intel been sorted out yet?
If so, I'll queue this up for 2.6.30
Otherwise please resubmit once the problem is resolved.
Thanks.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: ixgbe: Replace LRO with GRO
2009-01-15 4:22 ` Herbert Xu
@ 2009-01-19 5:49 ` David Miller
0 siblings, 0 replies; 37+ messages in thread
From: David Miller @ 2009-01-19 5:49 UTC (permalink / raw)
To: herbert; +Cc: jeffrey.t.kirsher, netdev
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 15 Jan 2009 15:22:49 +1100
> ixgbe: Replace LRO with GRO
>
> This patch makes ixgbe invoke the GRO hooks instead of LRO. As
> GRO has a compatible external interface to LRO this is a very
> straightforward replacement.
>
> As GRO uses the napi structure to track the held packets, I've
> modified the code paths involved to pass that along.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Applied for 2.6.30
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: sfc: Replace LRO with GRO
2009-01-15 4:24 ` sfc: " Herbert Xu
@ 2009-01-19 5:50 ` David Miller
2009-01-19 14:40 ` Ben Hutchings
0 siblings, 1 reply; 37+ messages in thread
From: David Miller @ 2009-01-19 5:50 UTC (permalink / raw)
To: herbert; +Cc: bhutchings, netdev
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 15 Jan 2009 15:24:22 +1100
> sfc: Replace LRO with GRO
>
> This patch makes sfc invoke the GRO hooks instead of LRO. As
> GRO has a compatible external interface to LRO this is a very
> straightforward replacement.
>
> Everything should appear identical to the user except that the
> offload is now controlled by the GRO ethtool option instead of
> LRO. I've kept the lro module parameter as is since that's for
> compatibility only.
>
> I have eliminated efx_rx_mk_skb as the GRO layer can take care
> of all packets regardless of whether GRO is enabled or not.
>
> So the only case where we don't call GRO is if the packet checksum
> is absent. This is to keep the behaviour changes of the patch to
> a minimum.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Applied for 2.6.30
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [2/2] igb: Replace LRO with GRO
2009-01-19 5:47 ` David Miller
@ 2009-01-19 11:16 ` Herbert Xu
2009-01-19 22:34 ` Jeff Kirsher
0 siblings, 1 reply; 37+ messages in thread
From: Herbert Xu @ 2009-01-19 11:16 UTC (permalink / raw)
To: David Miller; +Cc: jeffrey.t.kirsher, netdev
On Sun, Jan 18, 2009 at 09:47:29PM -0800, David Miller wrote:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Tue, 13 Jan 2009 20:28:28 +1100
>
> > igb: Replace LRO with GRO
> ...
> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> Have the crashes reported by Intel been sorted out yet?
> If so, I'll queue this up for 2.6.30
>
> Otherwise please resubmit once the problem is resolved.
I believe it should be cured. But it wouldn't hurt to wait for
them to reconfirm.
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: sfc: Replace LRO with GRO
2009-01-19 5:50 ` David Miller
@ 2009-01-19 14:40 ` Ben Hutchings
2009-01-19 21:29 ` David Miller
0 siblings, 1 reply; 37+ messages in thread
From: Ben Hutchings @ 2009-01-19 14:40 UTC (permalink / raw)
To: David Miller; +Cc: herbert, netdev
On Sun, 2009-01-18 at 21:50 -0800, David Miller wrote:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Thu, 15 Jan 2009 15:24:22 +1100
>
> > sfc: Replace LRO with GRO
> >
> > This patch makes sfc invoke the GRO hooks instead of LRO. As
> > GRO has a compatible external interface to LRO this is a very
> > straightforward replacement.
> >
> > Everything should appear identical to the user except that the
> > offload is now controlled by the GRO ethtool option instead of
> > LRO. I've kept the lro module parameter as is since that's for
> > compatibility only.
> >
> > I have eliminated efx_rx_mk_skb as the GRO layer can take care
> > of all packets regardless of whether GRO is enabled or not.
> >
> > So the only case where we don't call GRO is if the packet checksum
> > is absent. This is to keep the behaviour changes of the patch to
> > a minimum.
> >
> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> Applied for 2.6.30
Please could you push this and other GRO changes for .30 to
net-next-2.6?
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: sfc: Replace LRO with GRO
2009-01-19 14:40 ` Ben Hutchings
@ 2009-01-19 21:29 ` David Miller
2009-01-19 22:12 ` Harvey Harrison
0 siblings, 1 reply; 37+ messages in thread
From: David Miller @ 2009-01-19 21:29 UTC (permalink / raw)
To: bhutchings; +Cc: herbert, netdev
From: Ben Hutchings <bhutchings@solarflare.com>
Date: Mon, 19 Jan 2009 14:40:06 +0000
> On Sun, 2009-01-18 at 21:50 -0800, David Miller wrote:
> > From: Herbert Xu <herbert@gondor.apana.org.au>
> > Date: Thu, 15 Jan 2009 15:24:22 +1100
> >
> > > sfc: Replace LRO with GRO
> > >
> > > This patch makes sfc invoke the GRO hooks instead of LRO. As
> > > GRO has a compatible external interface to LRO this is a very
> > > straightforward replacement.
> > >
> > > Everything should appear identical to the user except that the
> > > offload is now controlled by the GRO ethtool option instead of
> > > LRO. I've kept the lro module parameter as is since that's for
> > > compatibility only.
> > >
> > > I have eliminated efx_rx_mk_skb as the GRO layer can take care
> > > of all packets regardless of whether GRO is enabled or not.
> > >
> > > So the only case where we don't call GRO is if the packet checksum
> > > is absent. This is to keep the behaviour changes of the patch to
> > > a minimum.
> > >
> > > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> >
> > Applied for 2.6.30
>
> Please could you push this and other GRO changes for .30 to
> net-next-2.6?
All of Herbert's bug fixes are going into net-2.6, and that's
where they will stay.
I haven't published my net-next-2.6 tree at all, because
Stephen Rothwell hasn't stated that such content is
OK yet for his -next tree.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: sfc: Replace LRO with GRO
2009-01-19 21:29 ` David Miller
@ 2009-01-19 22:12 ` Harvey Harrison
2009-01-19 23:27 ` David Miller
2009-01-20 0:57 ` Stephen Rothwell
0 siblings, 2 replies; 37+ messages in thread
From: Harvey Harrison @ 2009-01-19 22:12 UTC (permalink / raw)
To: David Miller; +Cc: bhutchings, herbert, netdev, Stephen Rothwell
On Mon, 2009-01-19 at 13:29 -0800, David Miller wrote:
> From: Ben Hutchings <bhutchings@solarflare.com>
> >
> > Please could you push this and other GRO changes for .30 to
> > net-next-2.6?
>
> All of Herbert's bug fixes are going into net-2.6, and that's
> where they will stay.
>
> I haven't published my net-next-2.6 tree at all, because
> Stephen Rothwell hasn't stated that such content is
> OK yet for his -next tree.
I thought Stephen had stated that anytime _after_ -rc1 was released was
OK for new material destined for the next release.
If you're waiting for an explicit OK, I don't think it will forthcoming.
Harvey
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [2/2] igb: Replace LRO with GRO
2009-01-19 11:16 ` Herbert Xu
@ 2009-01-19 22:34 ` Jeff Kirsher
2009-01-19 23:21 ` David Miller
0 siblings, 1 reply; 37+ messages in thread
From: Jeff Kirsher @ 2009-01-19 22:34 UTC (permalink / raw)
To: Herbert Xu; +Cc: David Miller, netdev
On Mon, Jan 19, 2009 at 3:16 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Sun, Jan 18, 2009 at 09:47:29PM -0800, David Miller wrote:
>> From: Herbert Xu <herbert@gondor.apana.org.au>
>> Date: Tue, 13 Jan 2009 20:28:28 +1100
>>
>> > igb: Replace LRO with GRO
>> ...
>> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>>
>> Have the crashes reported by Intel been sorted out yet?
>> If so, I'll queue this up for 2.6.30
>>
>> Otherwise please resubmit once the problem is resolved.
>
> I believe it should be cured. But it wouldn't hurt to wait for
> them to reconfirm.
>
> Thanks,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> --
>
Last word I have from test is that igb loads fine now, and no kernel
panics have been reported.
Go ahead and queue it up for 2.6.30.
--
Cheers,
Jeff
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [2/2] igb: Replace LRO with GRO
2009-01-19 22:34 ` Jeff Kirsher
@ 2009-01-19 23:21 ` David Miller
0 siblings, 0 replies; 37+ messages in thread
From: David Miller @ 2009-01-19 23:21 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: herbert, netdev
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Mon, 19 Jan 2009 14:34:22 -0800
> On Mon, Jan 19, 2009 at 3:16 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> > On Sun, Jan 18, 2009 at 09:47:29PM -0800, David Miller wrote:
> >> From: Herbert Xu <herbert@gondor.apana.org.au>
> >> Date: Tue, 13 Jan 2009 20:28:28 +1100
> >>
> >> > igb: Replace LRO with GRO
> >> ...
> >> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> >>
> >> Have the crashes reported by Intel been sorted out yet?
> >> If so, I'll queue this up for 2.6.30
> >>
> >> Otherwise please resubmit once the problem is resolved.
> >
> > I believe it should be cured. But it wouldn't hurt to wait for
> > them to reconfirm.
> >
> > Thanks,
> > --
> > Visit Openswan at http://www.openswan.org/
> > Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
> > Home Page: http://gondor.apana.org.au/~herbert/
> > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> > --
> >
>
> Last word I have from test is that igb loads fine now, and no kernel
> panics have been reported.
> Go ahead and queue it up for 2.6.30.
Done, thanks!
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: sfc: Replace LRO with GRO
2009-01-19 22:12 ` Harvey Harrison
@ 2009-01-19 23:27 ` David Miller
2009-01-20 0:57 ` Stephen Rothwell
1 sibling, 0 replies; 37+ messages in thread
From: David Miller @ 2009-01-19 23:27 UTC (permalink / raw)
To: harvey.harrison; +Cc: bhutchings, herbert, netdev, sfr
From: Harvey Harrison <harvey.harrison@gmail.com>
Date: Mon, 19 Jan 2009 14:12:47 -0800
> On Mon, 2009-01-19 at 13:29 -0800, David Miller wrote:
> > From: Ben Hutchings <bhutchings@solarflare.com>
> > >
> > > Please could you push this and other GRO changes for .30 to
> > > net-next-2.6?
> >
> > All of Herbert's bug fixes are going into net-2.6, and that's
> > where they will stay.
> >
> > I haven't published my net-next-2.6 tree at all, because
> > Stephen Rothwell hasn't stated that such content is
> > OK yet for his -next tree.
>
> I thought Stephen had stated that anytime _after_ -rc1 was released was
> OK for new material destined for the next release.
>
> If you're waiting for an explicit OK, I don't think it will forthcoming.
I, unlike some others, value Stephen Rothwell's sanity. So I will ask
explicitly before potentially spoiling his afternoon :-)
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: sfc: Replace LRO with GRO
2009-01-19 22:12 ` Harvey Harrison
2009-01-19 23:27 ` David Miller
@ 2009-01-20 0:57 ` Stephen Rothwell
1 sibling, 0 replies; 37+ messages in thread
From: Stephen Rothwell @ 2009-01-20 0:57 UTC (permalink / raw)
To: Harvey Harrison; +Cc: David Miller, bhutchings, herbert, netdev, Andrew Morton
[-- Attachment #1: Type: text/plain, Size: 889 bytes --]
Hi,
On Mon, 19 Jan 2009 14:12:47 -0800 Harvey Harrison <harvey.harrison@gmail.com> wrote:
>
> On Mon, 2009-01-19 at 13:29 -0800, David Miller wrote:
> >
> > I haven't published my net-next-2.6 tree at all, because
> > Stephen Rothwell hasn't stated that such content is
> > OK yet for his -next tree.
>
> I thought Stephen had stated that anytime _after_ -rc1 was released was
> OK for new material destined for the next release.
>
> If you're waiting for an explicit OK, I don't think it will forthcoming.
Here it is: "OK". :-)
Yeah any time after -rc1 is fine. I thought -rc2 would be better, but
Andrew beat me into submission :-) (I suspect he suffers more from that
decision then I do).
It also has the advantage of ruining LCA for you all. :-)
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 37+ messages in thread
* RE: ixgbe: Replace LRO with GRO
2009-01-17 13:04 ` Herbert Xu
@ 2009-01-20 23:00 ` Tantilov, Emil S
0 siblings, 0 replies; 37+ messages in thread
From: Tantilov, Emil S @ 2009-01-20 23:00 UTC (permalink / raw)
To: Herbert Xu; +Cc: Kirsher, Jeffrey T, David S. Miller, netdev@vger.kernel.org
Herbert Xu wrote:
> On Sat, Jan 17, 2009 at 08:52:56PM +1100, Herbert Xu wrote:
>>
>>>> However, it is not very clear what is causing this loss. BTW,
>>>> did you apply the fix b0059b50b70dc3a908bea4ece2f9494a22200018
>>>> (gro: Fix page ref count for skbs freed normally) on both sides?
>>>> That one is required to make igb or ixgbe work properly.
>>> Ah - doesn't seem like I have this fix. Jeff will have to pull this
>>> in our test tree.
>>
>> OK, although this bug should only manifest itself if aggregation
>> occurs, which apparently isn't the case according to the dump.
>
> Doh, it looks like the bug fix itself was broken. In fact, I
> think you do have the fix as it results in exactly what you were
> seeing.
>
> Please apply this on top of it.
>
> gro: Fix merging of paged packets
>
> The previous fix to paged packets broke the merging because it
> reset the skb->len before we added it to the merged packet. This
> wasn't detected because it simply resulted in the truncation of
> the packet while the missing bit is subsequently retransmitted.
>
> The fix is to store skb->len before we clobber it.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 65eac77..9127c47 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2588,8 +2588,9 @@ int skb_gro_receive(struct sk_buff **head,
> struct sk_buff *skb) struct sk_buff *nskb;
> unsigned int headroom;
> unsigned int hlen = p->data - skb_mac_header(p);
> + unsigned int len = skb->len;
>
> - if (hlen + p->len + skb->len >= 65536)
> + if (hlen + p->len + len >= 65536)
> return -E2BIG;
>
> if (skb_shinfo(p)->frag_list)
> @@ -2651,9 +2652,9 @@ merge:
>
> done:
> NAPI_GRO_CB(p)->count++;
> - p->data_len += skb->len;
> - p->truesize += skb->len;
> - p->len += skb->len;
> + p->data_len += len;
> + p->truesize += len;
> + p->len += len;
>
> NAPI_GRO_CB(skb)->same_flow = 1;
> return 0;
>
> Cheers,
With this patch applied I am no longer seeing the packet loss. I will run some more tests to make sure nothing else is broken, but looks like the original issue is resolved.
Sorry for the late response BTW - I was out on vacation and was only able to do limited testing over the weekend.
Thanks,
Emil
^ permalink raw reply [flat|nested] 37+ messages in thread
end of thread, other threads:[~2009-01-20 23:00 UTC | newest]
Thread overview: 37+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-13 9:26 [1/2] e1000e: Invoke VLAN GRO handler Herbert Xu
2009-01-13 9:28 ` [2/2] igb: Replace LRO with GRO Herbert Xu
2009-01-14 8:44 ` ixgbe: " Herbert Xu
2009-01-14 11:53 ` Jeff Kirsher
2009-01-15 3:46 ` Herbert Xu
2009-01-15 4:22 ` Herbert Xu
2009-01-19 5:49 ` David Miller
2009-01-16 23:32 ` Jeff Kirsher
2009-01-17 0:36 ` Herbert Xu
2009-01-17 1:06 ` Tantilov, Emil S
2009-01-17 3:45 ` Herbert Xu
2009-01-17 8:07 ` Tantilov, Emil S
2009-01-17 9:52 ` Herbert Xu
2009-01-17 13:04 ` Herbert Xu
2009-01-20 23:00 ` Tantilov, Emil S
2009-01-17 15:44 ` Tantilov, Emil S
2009-01-14 11:49 ` [2/2] igb: " Jeff Kirsher
2009-01-14 12:36 ` Herbert Xu
2009-01-15 0:03 ` Jeff Kirsher
2009-01-15 0:32 ` Herbert Xu
2009-01-15 1:35 ` Jeff Kirsher
2009-01-15 1:56 ` David Miller
2009-01-15 2:02 ` Jeff Kirsher
2009-01-15 4:40 ` David Miller
2009-01-19 5:47 ` David Miller
2009-01-19 11:16 ` Herbert Xu
2009-01-19 22:34 ` Jeff Kirsher
2009-01-19 23:21 ` David Miller
2009-01-15 4:24 ` sfc: " Herbert Xu
2009-01-19 5:50 ` David Miller
2009-01-19 14:40 ` Ben Hutchings
2009-01-19 21:29 ` David Miller
2009-01-19 22:12 ` Harvey Harrison
2009-01-19 23:27 ` David Miller
2009-01-20 0:57 ` Stephen Rothwell
2009-01-15 4:38 ` [1/2] e1000e: Invoke VLAN GRO handler David Miller
2009-01-15 6:59 ` cxgb3: Replace LRO with GRO Herbert Xu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).