* [net-next 00/11][pull request] Intel Wired LAN Driver Updates
@ 2011-10-08 6:47 Jeff Kirsher
2011-10-08 6:47 ` [net-next 01/11] igb: push data into first igb_tx_buffer sooner to reduce stack usage Jeff Kirsher
` (11 more replies)
0 siblings, 12 replies; 20+ messages in thread
From: Jeff Kirsher @ 2011-10-08 6:47 UTC (permalink / raw)
To: davem; +Cc: Jeff Kirsher, netdev, gospo, sassmann
The following series contains updates to igb only. It continues the
cleanup and refactoring work that Alex has been doing. After this
series, 4-5 more patches remain to complete Alex's work on igb.
The following changes since commit 1d0861acfb24d0ca0661ff5a156b992b2c589458
("Add ethtool -g support to 8139cp")
are available in the git repository at
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next.git
or
git://github.com/Jkirsher/net-next.git
Alexander Duyck (11):
igb: push data into first igb_tx_buffer sooner to reduce stack usage
igb: Use node specific allocations for the q_vectors and rings
igb: avoid unnecessary conversions from u16 to int
igb: Consolidate all of the ring feature flags into a single value
igb: Move ITR related data into work container within the q_vector
igb: cleanup IVAR configuration
igb: retire the RX_CSUM flag and use the netdev flag instead
igb: leave staterr in place and instead use a helper function to check
bits
igb: fix recent VLAN changes that would leave VLANs disabled after
reset
igb: move TX hang check flag into ring->flags
igb: add support for NETIF_F_RXHASH
drivers/net/ethernet/intel/igb/e1000_defines.h | 3 +
drivers/net/ethernet/intel/igb/igb.h | 53 ++-
drivers/net/ethernet/intel/igb/igb_ethtool.c | 14 +-
drivers/net/ethernet/intel/igb/igb_main.c | 675 +++++++++++++-----------
4 files changed, 411 insertions(+), 334 deletions(-)
--
1.7.6.4
* [net-next 01/11] igb: push data into first igb_tx_buffer sooner to reduce stack usage
2011-10-08 6:47 [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
@ 2011-10-08 6:47 ` Jeff Kirsher
2011-10-08 6:47 ` [net-next 02/11] igb: Use node specific allocations for the q_vectors and rings Jeff Kirsher
` (10 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2011-10-08 6:47 UTC (permalink / raw)
To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
From: Alexander Duyck <alexander.h.duyck@intel.com>
Instead of storing most of the data for the TX hot path on the stack
until we are ready to write the descriptor, we can save ourselves some
time and effort by pushing the SKB, tx_flags, gso_size, bytecount, and
protocol into the first igb_tx_buffer, since that is where they will
end up anyway.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
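A rough sketch of the pattern, with simplified stand-in types and a
hypothetical helper name (not the driver's actual code): capture the
per-packet metadata once in the first buffer, so the TSO/checksum/map
helpers can take a single pointer instead of carrying skb, flags, and
protocol on the stack.

#include <linux/skbuff.h>
#include <linux/if_vlan.h>

struct tx_buffer_sketch {
	struct sk_buff *skb;
	unsigned int bytecount;
	u16 gso_segs;
	__be16 protocol;
	u32 tx_flags;
};

/* record per-packet state at the first descriptor's buffer */
static void record_first_sketch(struct tx_buffer_sketch *first,
				struct sk_buff *skb, u32 tx_flags)
{
	first->skb = skb;
	first->bytecount = skb->len;
	first->gso_segs = 1;
	first->tx_flags = tx_flags;
	first->protocol = vlan_get_protocol(skb);
}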
drivers/net/ethernet/intel/igb/igb.h | 1 +
drivers/net/ethernet/intel/igb/igb_main.c | 103 +++++++++++++++--------------
2 files changed, 54 insertions(+), 50 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index 77793a9..de35c02 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -146,6 +146,7 @@ struct igb_tx_buffer {
struct sk_buff *skb;
unsigned int bytecount;
u16 gso_segs;
+ __be16 protocol;
dma_addr_t dma;
u32 length;
u32 tx_flags;
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 862dd7c..1c234f0 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3975,10 +3975,11 @@ void igb_tx_ctxtdesc(struct igb_ring *tx_ring, u32 vlan_macip_lens,
context_desc->mss_l4len_idx = cpu_to_le32(mss_l4len_idx);
}
-static inline int igb_tso(struct igb_ring *tx_ring, struct sk_buff *skb,
- u32 tx_flags, __be16 protocol, u8 *hdr_len)
+static int igb_tso(struct igb_ring *tx_ring,
+ struct igb_tx_buffer *first,
+ u8 *hdr_len)
{
- int err;
+ struct sk_buff *skb = first->skb;
u32 vlan_macip_lens, type_tucmd;
u32 mss_l4len_idx, l4len;
@@ -3986,7 +3987,7 @@ static inline int igb_tso(struct igb_ring *tx_ring, struct sk_buff *skb,
return 0;
if (skb_header_cloned(skb)) {
- err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
+ int err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
if (err)
return err;
}
@@ -3994,7 +3995,7 @@ static inline int igb_tso(struct igb_ring *tx_ring, struct sk_buff *skb,
/* ADV DTYP TUCMD MKRLOC/ISCSIHEDLEN */
type_tucmd = E1000_ADVTXD_TUCMD_L4T_TCP;
- if (protocol == __constant_htons(ETH_P_IP)) {
+ if (first->protocol == __constant_htons(ETH_P_IP)) {
struct iphdr *iph = ip_hdr(skb);
iph->tot_len = 0;
iph->check = 0;
@@ -4003,16 +4004,26 @@ static inline int igb_tso(struct igb_ring *tx_ring, struct sk_buff *skb,
IPPROTO_TCP,
0);
type_tucmd |= E1000_ADVTXD_TUCMD_IPV4;
+ first->tx_flags |= IGB_TX_FLAGS_TSO |
+ IGB_TX_FLAGS_CSUM |
+ IGB_TX_FLAGS_IPV4;
} else if (skb_is_gso_v6(skb)) {
ipv6_hdr(skb)->payload_len = 0;
tcp_hdr(skb)->check = ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
&ipv6_hdr(skb)->daddr,
0, IPPROTO_TCP, 0);
+ first->tx_flags |= IGB_TX_FLAGS_TSO |
+ IGB_TX_FLAGS_CSUM;
}
+ /* compute header lengths */
l4len = tcp_hdrlen(skb);
*hdr_len = skb_transport_offset(skb) + l4len;
+ /* update gso size and bytecount with header size */
+ first->gso_segs = skb_shinfo(skb)->gso_segs;
+ first->bytecount += (first->gso_segs - 1) * *hdr_len;
+
/* MSS L4LEN IDX */
mss_l4len_idx = l4len << E1000_ADVTXD_L4LEN_SHIFT;
mss_l4len_idx |= skb_shinfo(skb)->gso_size << E1000_ADVTXD_MSS_SHIFT;
@@ -4020,26 +4031,26 @@ static inline int igb_tso(struct igb_ring *tx_ring, struct sk_buff *skb,
/* VLAN MACLEN IPLEN */
vlan_macip_lens = skb_network_header_len(skb);
vlan_macip_lens |= skb_network_offset(skb) << E1000_ADVTXD_MACLEN_SHIFT;
- vlan_macip_lens |= tx_flags & IGB_TX_FLAGS_VLAN_MASK;
+ vlan_macip_lens |= first->tx_flags & IGB_TX_FLAGS_VLAN_MASK;
igb_tx_ctxtdesc(tx_ring, vlan_macip_lens, type_tucmd, mss_l4len_idx);
return 1;
}
-static inline bool igb_tx_csum(struct igb_ring *tx_ring, struct sk_buff *skb,
- u32 tx_flags, __be16 protocol)
+static void igb_tx_csum(struct igb_ring *tx_ring, struct igb_tx_buffer *first)
{
+ struct sk_buff *skb = first->skb;
u32 vlan_macip_lens = 0;
u32 mss_l4len_idx = 0;
u32 type_tucmd = 0;
if (skb->ip_summed != CHECKSUM_PARTIAL) {
- if (!(tx_flags & IGB_TX_FLAGS_VLAN))
- return false;
+ if (!(first->tx_flags & IGB_TX_FLAGS_VLAN))
+ return;
} else {
u8 l4_hdr = 0;
- switch (protocol) {
+ switch (first->protocol) {
case __constant_htons(ETH_P_IP):
vlan_macip_lens |= skb_network_header_len(skb);
type_tucmd |= E1000_ADVTXD_TUCMD_IPV4;
@@ -4053,7 +4064,7 @@ static inline bool igb_tx_csum(struct igb_ring *tx_ring, struct sk_buff *skb,
if (unlikely(net_ratelimit())) {
dev_warn(tx_ring->dev,
"partial checksum but proto=%x!\n",
- protocol);
+ first->protocol);
}
break;
}
@@ -4081,14 +4092,15 @@ static inline bool igb_tx_csum(struct igb_ring *tx_ring, struct sk_buff *skb,
}
break;
}
+
+ /* update TX checksum flag */
+ first->tx_flags |= IGB_TX_FLAGS_CSUM;
}
vlan_macip_lens |= skb_network_offset(skb) << E1000_ADVTXD_MACLEN_SHIFT;
- vlan_macip_lens |= tx_flags & IGB_TX_FLAGS_VLAN_MASK;
+ vlan_macip_lens |= first->tx_flags & IGB_TX_FLAGS_VLAN_MASK;
igb_tx_ctxtdesc(tx_ring, vlan_macip_lens, type_tucmd, mss_l4len_idx);
-
- return (skb->ip_summed == CHECKSUM_PARTIAL);
}
static __le32 igb_tx_cmd_type(u32 tx_flags)
@@ -4113,8 +4125,9 @@ static __le32 igb_tx_cmd_type(u32 tx_flags)
return cmd_type;
}
-static __le32 igb_tx_olinfo_status(u32 tx_flags, unsigned int paylen,
- struct igb_ring *tx_ring)
+static void igb_tx_olinfo_status(struct igb_ring *tx_ring,
+ union e1000_adv_tx_desc *tx_desc,
+ u32 tx_flags, unsigned int paylen)
{
u32 olinfo_status = paylen << E1000_ADVTXD_PAYLEN_SHIFT;
@@ -4132,7 +4145,7 @@ static __le32 igb_tx_olinfo_status(u32 tx_flags, unsigned int paylen,
olinfo_status |= E1000_TXD_POPTS_IXSM << 8;
}
- return cpu_to_le32(olinfo_status);
+ tx_desc->read.olinfo_status = cpu_to_le32(olinfo_status);
}
/*
@@ -4140,12 +4153,13 @@ static __le32 igb_tx_olinfo_status(u32 tx_flags, unsigned int paylen,
* maintain a power of two alignment we have to limit ourselves to 32K.
*/
#define IGB_MAX_TXD_PWR 15
-#define IGB_MAX_DATA_PER_TXD (1 << IGB_MAX_TXD_PWR)
+#define IGB_MAX_DATA_PER_TXD (1<<IGB_MAX_TXD_PWR)
-static void igb_tx_map(struct igb_ring *tx_ring, struct sk_buff *skb,
- struct igb_tx_buffer *first, u32 tx_flags,
+static void igb_tx_map(struct igb_ring *tx_ring,
+ struct igb_tx_buffer *first,
const u8 hdr_len)
{
+ struct sk_buff *skb = first->skb;
struct igb_tx_buffer *tx_buffer_info;
union e1000_adv_tx_desc *tx_desc;
dma_addr_t dma;
@@ -4154,24 +4168,12 @@ static void igb_tx_map(struct igb_ring *tx_ring, struct sk_buff *skb,
unsigned int size = skb_headlen(skb);
unsigned int paylen = skb->len - hdr_len;
__le32 cmd_type;
+ u32 tx_flags = first->tx_flags;
u16 i = tx_ring->next_to_use;
- u16 gso_segs;
-
- if (tx_flags & IGB_TX_FLAGS_TSO)
- gso_segs = skb_shinfo(skb)->gso_segs;
- else
- gso_segs = 1;
-
- /* multiply data chunks by size of headers */
- first->bytecount = paylen + (gso_segs * hdr_len);
- first->gso_segs = gso_segs;
- first->skb = skb;
tx_desc = IGB_TX_DESC(tx_ring, i);
- tx_desc->read.olinfo_status =
- igb_tx_olinfo_status(tx_flags, paylen, tx_ring);
-
+ igb_tx_olinfo_status(tx_ring, tx_desc, tx_flags, paylen);
cmd_type = igb_tx_cmd_type(tx_flags);
dma = dma_map_single(tx_ring->dev, skb->data, size, DMA_TO_DEVICE);
@@ -4181,7 +4183,6 @@ static void igb_tx_map(struct igb_ring *tx_ring, struct sk_buff *skb,
/* record length, and DMA address */
first->length = size;
first->dma = dma;
- first->tx_flags = tx_flags;
tx_desc->read.buffer_addr = cpu_to_le64(dma);
for (;;) {
@@ -4336,6 +4337,12 @@ netdev_tx_t igb_xmit_frame_ring(struct sk_buff *skb,
return NETDEV_TX_BUSY;
}
+ /* record the location of the first descriptor for this packet */
+ first = &tx_ring->tx_buffer_info[tx_ring->next_to_use];
+ first->skb = skb;
+ first->bytecount = skb->len;
+ first->gso_segs = 1;
+
if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) {
skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
tx_flags |= IGB_TX_FLAGS_TSTAMP;
@@ -4346,22 +4353,17 @@ netdev_tx_t igb_xmit_frame_ring(struct sk_buff *skb,
tx_flags |= (vlan_tx_tag_get(skb) << IGB_TX_FLAGS_VLAN_SHIFT);
}
- /* record the location of the first descriptor for this packet */
- first = &tx_ring->tx_buffer_info[tx_ring->next_to_use];
+ /* record initial flags and protocol */
+ first->tx_flags = tx_flags;
+ first->protocol = protocol;
- tso = igb_tso(tx_ring, skb, tx_flags, protocol, &hdr_len);
- if (tso < 0) {
+ tso = igb_tso(tx_ring, first, &hdr_len);
+ if (tso < 0)
goto out_drop;
- } else if (tso) {
- tx_flags |= IGB_TX_FLAGS_TSO | IGB_TX_FLAGS_CSUM;
- if (protocol == htons(ETH_P_IP))
- tx_flags |= IGB_TX_FLAGS_IPV4;
- } else if (igb_tx_csum(tx_ring, skb, tx_flags, protocol) &&
- (skb->ip_summed == CHECKSUM_PARTIAL)) {
- tx_flags |= IGB_TX_FLAGS_CSUM;
- }
+ else if (!tso)
+ igb_tx_csum(tx_ring, first);
- igb_tx_map(tx_ring, skb, first, tx_flags, hdr_len);
+ igb_tx_map(tx_ring, first, hdr_len);
/* Make sure there is space in the ring for the next send. */
igb_maybe_stop_tx(tx_ring, MAX_SKB_FRAGS + 4);
@@ -4369,7 +4371,8 @@ netdev_tx_t igb_xmit_frame_ring(struct sk_buff *skb,
return NETDEV_TX_OK;
out_drop:
- dev_kfree_skb_any(skb);
+ igb_unmap_and_free_tx_resource(tx_ring, first);
+
return NETDEV_TX_OK;
}
--
1.7.6.4
* [net-next 02/11] igb: Use node specific allocations for the q_vectors and rings
2011-10-08 6:47 [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
2011-10-08 6:47 ` [net-next 01/11] igb: push data into first igb_tx_buffer sooner to reduce stack usage Jeff Kirsher
@ 2011-10-08 6:47 ` Jeff Kirsher
2011-10-08 19:51 ` David Miller
2011-10-09 18:08 ` Andi Kleen
2011-10-08 6:47 ` [net-next 03/11] igb: avoid unnecessary conversions from u16 to int Jeff Kirsher
` (9 subsequent siblings)
11 siblings, 2 replies; 20+ messages in thread
From: Jeff Kirsher @ 2011-10-08 6:47 UTC (permalink / raw)
To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
From: Alexander Duyck <alexander.h.duyck@intel.com>
This change updates the ring and vector allocations so that they are
made per node instead of allocating everything on the node where
ifconfig/modprobe was called. Doing this cuts down significantly on
cross-node traffic.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
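The allocation pattern, sketched with illustrative names (the helper
below is not the driver's own): walk the online nodes round-robin so
successive rings land on different nodes, and fall back to an any-node
allocation rather than failing outright.

#include <linux/slab.h>
#include <linux/nodemask.h>

static void *alloc_ring_on_next_node(size_t size, int *node)
{
	void *ring;

	/* advance to the next online node, wrapping at the end */
	*node = next_online_node(*node);
	if (*node == MAX_NUMNODES)
		*node = first_online_node;

	ring = kzalloc_node(size, GFP_KERNEL, *node);
	if (!ring)	/* fall back to any node rather than fail */
		ring = kzalloc(size, GFP_KERNEL);
	return ring;
}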
drivers/net/ethernet/intel/igb/igb.h | 4 ++
drivers/net/ethernet/intel/igb/igb_main.c | 80 +++++++++++++++++++++++++++--
2 files changed, 79 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index de35c02..9e4bed3 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -185,6 +185,8 @@ struct igb_q_vector {
u16 cpu;
u16 tx_work_limit;
+ int numa_node;
+
u16 itr_val;
u8 set_itr;
void __iomem *itr_register;
@@ -232,6 +234,7 @@ struct igb_ring {
};
/* Items past this point are only used during ring alloc / free */
dma_addr_t dma; /* phys address of the ring */
+ int numa_node; /* node to alloc ring memory on */
};
#define IGB_RING_FLAG_RX_CSUM 0x00000001 /* RX CSUM enabled */
@@ -341,6 +344,7 @@ struct igb_adapter {
int vf_rate_link_speed;
u32 rss_queues;
u32 wvbr;
+ int node;
};
#define IGB_FLAG_HAS_MSI (1 << 0)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 1c234f0..287be85 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -687,41 +687,68 @@ static int igb_alloc_queues(struct igb_adapter *adapter)
{
struct igb_ring *ring;
int i;
+ int orig_node = adapter->node;
for (i = 0; i < adapter->num_tx_queues; i++) {
- ring = kzalloc(sizeof(struct igb_ring), GFP_KERNEL);
+ if (orig_node == -1) {
+ int cur_node = next_online_node(adapter->node);
+ if (cur_node == MAX_NUMNODES)
+ cur_node = first_online_node;
+ adapter->node = cur_node;
+ }
+ ring = kzalloc_node(sizeof(struct igb_ring), GFP_KERNEL,
+ adapter->node);
+ if (!ring)
+ ring = kzalloc(sizeof(struct igb_ring), GFP_KERNEL);
if (!ring)
goto err;
ring->count = adapter->tx_ring_count;
ring->queue_index = i;
ring->dev = &adapter->pdev->dev;
ring->netdev = adapter->netdev;
+ ring->numa_node = adapter->node;
/* For 82575, context index must be unique per ring. */
if (adapter->hw.mac.type == e1000_82575)
ring->flags = IGB_RING_FLAG_TX_CTX_IDX;
adapter->tx_ring[i] = ring;
}
+ /* Restore the adapter's original node */
+ adapter->node = orig_node;
for (i = 0; i < adapter->num_rx_queues; i++) {
- ring = kzalloc(sizeof(struct igb_ring), GFP_KERNEL);
+ if (orig_node == -1) {
+ int cur_node = next_online_node(adapter->node);
+ if (cur_node == MAX_NUMNODES)
+ cur_node = first_online_node;
+ adapter->node = cur_node;
+ }
+ ring = kzalloc_node(sizeof(struct igb_ring), GFP_KERNEL,
+ adapter->node);
+ if (!ring)
+ ring = kzalloc(sizeof(struct igb_ring), GFP_KERNEL);
if (!ring)
goto err;
ring->count = adapter->rx_ring_count;
ring->queue_index = i;
ring->dev = &adapter->pdev->dev;
ring->netdev = adapter->netdev;
+ ring->numa_node = adapter->node;
ring->flags = IGB_RING_FLAG_RX_CSUM; /* enable rx checksum */
/* set flag indicating ring supports SCTP checksum offload */
if (adapter->hw.mac.type >= e1000_82576)
ring->flags |= IGB_RING_FLAG_RX_SCTP_CSUM;
adapter->rx_ring[i] = ring;
}
+ /* Restore the adapter's original node */
+ adapter->node = orig_node;
igb_cache_ring_register(adapter);
return 0;
err:
+ /* Restore the adapter's original node */
+ adapter->node = orig_node;
igb_free_queues(adapter);
return -ENOMEM;
@@ -1087,9 +1114,24 @@ static int igb_alloc_q_vectors(struct igb_adapter *adapter)
struct igb_q_vector *q_vector;
struct e1000_hw *hw = &adapter->hw;
int v_idx;
+ int orig_node = adapter->node;
for (v_idx = 0; v_idx < adapter->num_q_vectors; v_idx++) {
- q_vector = kzalloc(sizeof(struct igb_q_vector), GFP_KERNEL);
+ if ((adapter->num_q_vectors == (adapter->num_rx_queues +
+ adapter->num_tx_queues)) &&
+ (adapter->num_rx_queues == v_idx))
+ adapter->node = orig_node;
+ if (orig_node == -1) {
+ int cur_node = next_online_node(adapter->node);
+ if (cur_node == MAX_NUMNODES)
+ cur_node = first_online_node;
+ adapter->node = cur_node;
+ }
+ q_vector = kzalloc_node(sizeof(struct igb_q_vector), GFP_KERNEL,
+ adapter->node);
+ if (!q_vector)
+ q_vector = kzalloc(sizeof(struct igb_q_vector),
+ GFP_KERNEL);
if (!q_vector)
goto err_out;
q_vector->adapter = adapter;
@@ -1098,9 +1140,14 @@ static int igb_alloc_q_vectors(struct igb_adapter *adapter)
netif_napi_add(adapter->netdev, &q_vector->napi, igb_poll, 64);
adapter->q_vector[v_idx] = q_vector;
}
+ /* Restore the adapter's original node */
+ adapter->node = orig_node;
+
return 0;
err_out:
+ /* Restore the adapter's original node */
+ adapter->node = orig_node;
igb_free_q_vectors(adapter);
return -ENOMEM;
}
@@ -2409,6 +2456,8 @@ static int __devinit igb_sw_init(struct igb_adapter *adapter)
VLAN_HLEN;
adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
+ adapter->node = -1;
+
spin_lock_init(&adapter->stats64_lock);
#ifdef CONFIG_PCI_IOV
switch (hw->mac.type) {
@@ -2579,10 +2628,13 @@ static int igb_close(struct net_device *netdev)
int igb_setup_tx_resources(struct igb_ring *tx_ring)
{
struct device *dev = tx_ring->dev;
+ int orig_node = dev_to_node(dev);
int size;
size = sizeof(struct igb_tx_buffer) * tx_ring->count;
- tx_ring->tx_buffer_info = vzalloc(size);
+ tx_ring->tx_buffer_info = vzalloc_node(size, tx_ring->numa_node);
+ if (!tx_ring->tx_buffer_info)
+ tx_ring->tx_buffer_info = vzalloc(size);
if (!tx_ring->tx_buffer_info)
goto err;
@@ -2590,16 +2642,24 @@ int igb_setup_tx_resources(struct igb_ring *tx_ring)
tx_ring->size = tx_ring->count * sizeof(union e1000_adv_tx_desc);
tx_ring->size = ALIGN(tx_ring->size, 4096);
+ set_dev_node(dev, tx_ring->numa_node);
tx_ring->desc = dma_alloc_coherent(dev,
tx_ring->size,
&tx_ring->dma,
GFP_KERNEL);
+ set_dev_node(dev, orig_node);
+ if (!tx_ring->desc)
+ tx_ring->desc = dma_alloc_coherent(dev,
+ tx_ring->size,
+ &tx_ring->dma,
+ GFP_KERNEL);
if (!tx_ring->desc)
goto err;
tx_ring->next_to_use = 0;
tx_ring->next_to_clean = 0;
+
return 0;
err:
@@ -2722,10 +2782,13 @@ static void igb_configure_tx(struct igb_adapter *adapter)
int igb_setup_rx_resources(struct igb_ring *rx_ring)
{
struct device *dev = rx_ring->dev;
+ int orig_node = dev_to_node(dev);
int size, desc_len;
size = sizeof(struct igb_rx_buffer) * rx_ring->count;
- rx_ring->rx_buffer_info = vzalloc(size);
+ rx_ring->rx_buffer_info = vzalloc_node(size, rx_ring->numa_node);
+ if (!rx_ring->rx_buffer_info)
+ rx_ring->rx_buffer_info = vzalloc(size);
if (!rx_ring->rx_buffer_info)
goto err;
@@ -2735,10 +2798,17 @@ int igb_setup_rx_resources(struct igb_ring *rx_ring)
rx_ring->size = rx_ring->count * desc_len;
rx_ring->size = ALIGN(rx_ring->size, 4096);
+ set_dev_node(dev, rx_ring->numa_node);
rx_ring->desc = dma_alloc_coherent(dev,
rx_ring->size,
&rx_ring->dma,
GFP_KERNEL);
+ set_dev_node(dev, orig_node);
+ if (!rx_ring->desc)
+ rx_ring->desc = dma_alloc_coherent(dev,
+ rx_ring->size,
+ &rx_ring->dma,
+ GFP_KERNEL);
if (!rx_ring->desc)
goto err;
--
1.7.6.4
* [net-next 03/11] igb: avoid unnecessary conversions from u16 to int
2011-10-08 6:47 [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
2011-10-08 6:47 ` [net-next 01/11] igb: push data into first igb_tx_buffer sooner to reduce stack usage Jeff Kirsher
2011-10-08 6:47 ` [net-next 02/11] igb: Use node specific allocations for the q_vectors and rings Jeff Kirsher
@ 2011-10-08 6:47 ` Jeff Kirsher
2011-10-08 6:47 ` [net-next 04/11] igb: Consolidate all of the ring feature flags into a single value Jeff Kirsher
` (8 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2011-10-08 6:47 UTC (permalink / raw)
To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
From: Alexander Duyck <alexander.h.duyck@intel.com>
There are a number of places where values stored as u16 are being
converted to int unnecessarily. To avoid that, convert all variables
dealing with next_to_clean, next_to_use, and count to u16 values.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
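For illustration, a sketch with a simplified ring (not the driver's
struct): once count and both indices are u16, the usual free-descriptor
arithmetic stays on u16 values end to end, matching igb_desc_unused().

#include <linux/types.h>

struct ring_sketch {
	u16 count;
	u16 next_to_use;
	u16 next_to_clean;
};

/* number of unused descriptors, computed entirely on u16 values */
static u16 desc_unused_sketch(const struct ring_sketch *r)
{
	if (r->next_to_clean > r->next_to_use)
		return r->next_to_clean - r->next_to_use - 1;
	return r->count + r->next_to_clean - r->next_to_use - 1;
}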
drivers/net/ethernet/intel/igb/igb_ethtool.c | 5 +++--
drivers/net/ethernet/intel/igb/igb_main.c | 9 ++++-----
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index f227fc5..a893da1 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -1581,8 +1581,8 @@ static int igb_clean_test_rings(struct igb_ring *rx_ring,
union e1000_adv_rx_desc *rx_desc;
struct igb_rx_buffer *rx_buffer_info;
struct igb_tx_buffer *tx_buffer_info;
- int rx_ntc, tx_ntc, count = 0;
u32 staterr;
+ u16 rx_ntc, tx_ntc, count = 0;
/* initialize next to clean and descriptor values */
rx_ntc = rx_ring->next_to_clean;
@@ -1634,7 +1634,8 @@ static int igb_run_loopback_test(struct igb_adapter *adapter)
{
struct igb_ring *tx_ring = &adapter->test_tx_ring;
struct igb_ring *rx_ring = &adapter->test_rx_ring;
- int i, j, lc, good_cnt, ret_val = 0;
+ u16 i, j, lc, good_cnt;
+ int ret_val = 0;
unsigned int size = IGB_RX_HDR_LEN;
netdev_tx_t tx_ret_val;
struct sk_buff *skb;
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 287be85..3a5c75d 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -338,14 +338,13 @@ static void igb_dump(struct igb_adapter *adapter)
struct net_device *netdev = adapter->netdev;
struct e1000_hw *hw = &adapter->hw;
struct igb_reg_info *reginfo;
- int n = 0;
struct igb_ring *tx_ring;
union e1000_adv_tx_desc *tx_desc;
struct my_u0 { u64 a; u64 b; } *u0;
struct igb_ring *rx_ring;
union e1000_adv_rx_desc *rx_desc;
u32 staterr;
- int i = 0;
+ u16 i, n;
if (!netif_msg_hw(adapter))
return;
@@ -3239,7 +3238,7 @@ static void igb_clean_tx_ring(struct igb_ring *tx_ring)
{
struct igb_tx_buffer *buffer_info;
unsigned long size;
- unsigned int i;
+ u16 i;
if (!tx_ring->tx_buffer_info)
return;
@@ -4355,7 +4354,7 @@ dma_error:
tx_ring->next_to_use = i;
}
-static int __igb_maybe_stop_tx(struct igb_ring *tx_ring, int size)
+static int __igb_maybe_stop_tx(struct igb_ring *tx_ring, const u16 size)
{
struct net_device *netdev = tx_ring->netdev;
@@ -4381,7 +4380,7 @@ static int __igb_maybe_stop_tx(struct igb_ring *tx_ring, int size)
return 0;
}
-static inline int igb_maybe_stop_tx(struct igb_ring *tx_ring, int size)
+static inline int igb_maybe_stop_tx(struct igb_ring *tx_ring, const u16 size)
{
if (igb_desc_unused(tx_ring) >= size)
return 0;
--
1.7.6.4
* [net-next 04/11] igb: Consolidate all of the ring feature flags into a single value
2011-10-08 6:47 [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
` (2 preceding siblings ...)
2011-10-08 6:47 ` [net-next 03/11] igb: avoid unnecessary conversions from u16 to int Jeff Kirsher
@ 2011-10-08 6:47 ` Jeff Kirsher
2011-10-08 6:47 ` [net-next 05/11] igb: Move ITR related data into work container within the q_vector Jeff Kirsher
` (7 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2011-10-08 6:47 UTC (permalink / raw)
To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
From: Alexander Duyck <alexander.h.duyck@intel.com>
This change moves all of the ring feature flags into a single value.
The advantage is that all of these flags now live in one central place
and can all make use of the set_bit/test_bit operations.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
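In sketch form (the enum mirrors the patch; the flags word and usage
below are illustrative): each flag becomes a bit number into a single
field that is manipulated with the standard bitops.

#include <linux/bitops.h>

enum ring_flags_sketch {
	RING_FLAG_RX_CSUM,
	RING_FLAG_RX_SCTP_CSUM,
	RING_FLAG_TX_CTX_IDX,
	RING_FLAG_TX_DETECT_HANG,
};

static void ring_flags_example(unsigned long *flags)
{
	/* one central flags word, atomic set/test/clear */
	set_bit(RING_FLAG_TX_CTX_IDX, flags);
	if (test_bit(RING_FLAG_TX_CTX_IDX, flags))
		clear_bit(RING_FLAG_TX_DETECT_HANG, flags);
}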
drivers/net/ethernet/intel/igb/igb.h | 10 ++++++----
drivers/net/ethernet/intel/igb/igb_main.c | 23 +++++++++++++----------
2 files changed, 19 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index 9e4bed3..0df040a 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -237,10 +237,12 @@ struct igb_ring {
int numa_node; /* node to alloc ring memory on */
};
-#define IGB_RING_FLAG_RX_CSUM 0x00000001 /* RX CSUM enabled */
-#define IGB_RING_FLAG_RX_SCTP_CSUM 0x00000002 /* SCTP CSUM offload enabled */
-
-#define IGB_RING_FLAG_TX_CTX_IDX 0x00000001 /* HW requires context index */
+enum e1000_ring_flags_t {
+ IGB_RING_FLAG_RX_CSUM,
+ IGB_RING_FLAG_RX_SCTP_CSUM,
+ IGB_RING_FLAG_TX_CTX_IDX,
+ IGB_RING_FLAG_TX_DETECT_HANG
+};
#define IGB_TXD_DCMD (E1000_ADVTXD_DCMD_EOP | E1000_ADVTXD_DCMD_RS)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 3a5c75d..f339de9 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -708,7 +708,7 @@ static int igb_alloc_queues(struct igb_adapter *adapter)
ring->numa_node = adapter->node;
/* For 82575, context index must be unique per ring. */
if (adapter->hw.mac.type == e1000_82575)
- ring->flags = IGB_RING_FLAG_TX_CTX_IDX;
+ set_bit(IGB_RING_FLAG_TX_CTX_IDX, &ring->flags);
adapter->tx_ring[i] = ring;
}
/* Restore the adapter's original node */
@@ -732,10 +732,11 @@ static int igb_alloc_queues(struct igb_adapter *adapter)
ring->dev = &adapter->pdev->dev;
ring->netdev = adapter->netdev;
ring->numa_node = adapter->node;
- ring->flags = IGB_RING_FLAG_RX_CSUM; /* enable rx checksum */
+ /* enable rx checksum */
+ set_bit(IGB_RING_FLAG_RX_CSUM, &ring->flags);
/* set flag indicating ring supports SCTP checksum offload */
if (adapter->hw.mac.type >= e1000_82576)
- ring->flags |= IGB_RING_FLAG_RX_SCTP_CSUM;
+ set_bit(IGB_RING_FLAG_RX_SCTP_CSUM, &ring->flags);
adapter->rx_ring[i] = ring;
}
/* Restore the adapter's original node */
@@ -1822,9 +1823,11 @@ static int igb_set_features(struct net_device *netdev, u32 features)
for (i = 0; i < adapter->num_rx_queues; i++) {
if (features & NETIF_F_RXCSUM)
- adapter->rx_ring[i]->flags |= IGB_RING_FLAG_RX_CSUM;
+ set_bit(IGB_RING_FLAG_RX_CSUM,
+ &adapter->rx_ring[i]->flags);
else
- adapter->rx_ring[i]->flags &= ~IGB_RING_FLAG_RX_CSUM;
+ clear_bit(IGB_RING_FLAG_RX_CSUM,
+ &adapter->rx_ring[i]->flags);
}
if (changed & NETIF_F_HW_VLAN_RX)
@@ -4035,7 +4038,7 @@ void igb_tx_ctxtdesc(struct igb_ring *tx_ring, u32 vlan_macip_lens,
type_tucmd |= E1000_TXD_CMD_DEXT | E1000_ADVTXD_DTYP_CTXT;
/* For 82575, context index must be unique per ring. */
- if (tx_ring->flags & IGB_RING_FLAG_TX_CTX_IDX)
+ if (test_bit(IGB_RING_FLAG_TX_CTX_IDX, &tx_ring->flags))
mss_l4len_idx |= tx_ring->reg_idx << 4;
context_desc->vlan_macip_lens = cpu_to_le32(vlan_macip_lens);
@@ -4202,7 +4205,7 @@ static void igb_tx_olinfo_status(struct igb_ring *tx_ring,
/* 82575 requires a unique index per ring if any offload is enabled */
if ((tx_flags & (IGB_TX_FLAGS_CSUM | IGB_TX_FLAGS_VLAN)) &&
- (tx_ring->flags & IGB_RING_FLAG_TX_CTX_IDX))
+ test_bit(IGB_RING_FLAG_TX_CTX_IDX, &tx_ring->flags))
olinfo_status |= tx_ring->reg_idx << 4;
/* insert L4 checksum */
@@ -5828,7 +5831,7 @@ static inline void igb_rx_checksum(struct igb_ring *ring,
skb_checksum_none_assert(skb);
/* Ignore Checksum bit is set or checksum is disabled through ethtool */
- if (!(ring->flags & IGB_RING_FLAG_RX_CSUM) ||
+ if (!test_bit(IGB_RING_FLAG_RX_CSUM, &ring->flags) ||
(status_err & E1000_RXD_STAT_IXSM))
return;
@@ -5840,8 +5843,8 @@ static inline void igb_rx_checksum(struct igb_ring *ring,
* L4E bit is set incorrectly on 64 byte (60 byte w/o crc)
* packets, (aka let the stack check the crc32c)
*/
- if ((skb->len == 60) &&
- (ring->flags & IGB_RING_FLAG_RX_SCTP_CSUM)) {
+ if (!((skb->len == 60) &&
+ test_bit(IGB_RING_FLAG_RX_SCTP_CSUM, &ring->flags))) {
u64_stats_update_begin(&ring->rx_syncp);
ring->rx_stats.csum_err++;
u64_stats_update_end(&ring->rx_syncp);
--
1.7.6.4
* [net-next 05/11] igb: Move ITR related data into work container within the q_vector
2011-10-08 6:47 [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
` (3 preceding siblings ...)
2011-10-08 6:47 ` [net-next 04/11] igb: Consolidate all of the ring feature flags into a single value Jeff Kirsher
@ 2011-10-08 6:47 ` Jeff Kirsher
2011-10-08 6:47 ` [net-next 06/11] igb: cleanup IVAR configuration Jeff Kirsher
` (6 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2011-10-08 6:47 UTC (permalink / raw)
To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
From: Alexander Duyck <alexander.h.duyck@intel.com>
This change moves information related to interrupt throttle rate
configuration into a separate q_vector sub-structure called a work
container. A similar change has already been made for ixgbe, and this
work is based on that.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
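The new ITR constants follow from the granularity given in the old
IGB_START_ITR comment; a sketch of the arithmetic, assuming that
1024 ns figure and the two reserved low register bits (hence the
<< 2). The driver rounds the last three values up slightly.

#define ITR_FOR_RATE(ints_per_sec) \
	((1000000000UL / ((ints_per_sec) * 1024UL)) << 2)

/* ITR_FOR_RATE(6000)  == 648  -> IGB_START_ITR
 * ITR_FOR_RATE(20000) == 192  -> IGB_20K_ITR (196)
 * ITR_FOR_RATE(70000) == 52   -> IGB_70K_ITR (56)
 * ITR_FOR_RATE(4000)  == 976  -> IGB_4K_ITR  (980) */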
drivers/net/ethernet/intel/igb/e1000_defines.h | 3 +
drivers/net/ethernet/intel/igb/igb.h | 31 +++--
drivers/net/ethernet/intel/igb/igb_ethtool.c | 4 +-
drivers/net/ethernet/intel/igb/igb_main.c | 203 +++++++++++-------------
4 files changed, 118 insertions(+), 123 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h b/drivers/net/ethernet/intel/igb/e1000_defines.h
index 7b8ddd8..68558be 100644
--- a/drivers/net/ethernet/intel/igb/e1000_defines.h
+++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
@@ -409,6 +409,9 @@
#define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset Aserted */
/* Extended Interrupt Cause Set */
+/* E1000_EITR_CNT_IGNR is only for 82576 and newer */
+#define E1000_EITR_CNT_IGNR 0x80000000 /* Don't reset counters on write */
+
/* Transmit Descriptor Control */
/* Enable the counting of descriptors still to be processed. */
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index 0df040a..91f90fe 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -42,8 +42,11 @@
struct igb_adapter;
-/* ((1000000000ns / (6000ints/s * 1024ns)) << 2 = 648 */
-#define IGB_START_ITR 648
+/* Interrupt defines */
+#define IGB_START_ITR 648 /* ~6000 ints/sec */
+#define IGB_4K_ITR 980
+#define IGB_20K_ITR 196
+#define IGB_70K_ITR 56
/* TX/RX descriptor defines */
#define IGB_DEFAULT_TXD 256
@@ -175,16 +178,23 @@ struct igb_rx_queue_stats {
u64 alloc_failed;
};
+struct igb_ring_container {
+ struct igb_ring *ring; /* pointer to linked list of rings */
+ unsigned int total_bytes; /* total bytes processed this int */
+ unsigned int total_packets; /* total packets processed this int */
+ u16 work_limit; /* total work allowed per interrupt */
+ u8 count; /* total number of rings in vector */
+ u8 itr; /* current ITR setting for ring */
+};
+
struct igb_q_vector {
- struct igb_adapter *adapter; /* backlink */
- struct igb_ring *rx_ring;
- struct igb_ring *tx_ring;
- struct napi_struct napi;
+ struct igb_adapter *adapter; /* backlink */
+ int cpu; /* CPU for DCA */
+ u32 eims_value; /* EIMS mask value */
- u32 eims_value;
- u16 cpu;
- u16 tx_work_limit;
+ struct igb_ring_container rx, tx;
+ struct napi_struct napi;
int numa_node;
u16 itr_val;
@@ -215,9 +225,6 @@ struct igb_ring {
u16 next_to_clean ____cacheline_aligned_in_smp;
u16 next_to_use;
- unsigned int total_bytes;
- unsigned int total_packets;
-
union {
/* TX */
struct {
diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index a893da1..5ebe992 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -2013,8 +2013,8 @@ static int igb_set_coalesce(struct net_device *netdev,
for (i = 0; i < adapter->num_q_vectors; i++) {
struct igb_q_vector *q_vector = adapter->q_vector[i];
- q_vector->tx_work_limit = adapter->tx_work_limit;
- if (q_vector->rx_ring)
+ q_vector->tx.work_limit = adapter->tx_work_limit;
+ if (q_vector->rx.ring)
q_vector->itr_val = adapter->rx_itr_setting;
else
q_vector->itr_val = adapter->tx_itr_setting;
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index f339de9..8dc04e0 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -764,10 +764,10 @@ static void igb_assign_vector(struct igb_q_vector *q_vector, int msix_vector)
int rx_queue = IGB_N0_QUEUE;
int tx_queue = IGB_N0_QUEUE;
- if (q_vector->rx_ring)
- rx_queue = q_vector->rx_ring->reg_idx;
- if (q_vector->tx_ring)
- tx_queue = q_vector->tx_ring->reg_idx;
+ if (q_vector->rx.ring)
+ rx_queue = q_vector->rx.ring->reg_idx;
+ if (q_vector->tx.ring)
+ tx_queue = q_vector->tx.ring->reg_idx;
switch (hw->mac.type) {
case e1000_82575:
@@ -950,15 +950,15 @@ static int igb_request_msix(struct igb_adapter *adapter)
q_vector->itr_register = hw->hw_addr + E1000_EITR(vector);
- if (q_vector->rx_ring && q_vector->tx_ring)
+ if (q_vector->rx.ring && q_vector->tx.ring)
sprintf(q_vector->name, "%s-TxRx-%u", netdev->name,
- q_vector->rx_ring->queue_index);
- else if (q_vector->tx_ring)
+ q_vector->rx.ring->queue_index);
+ else if (q_vector->tx.ring)
sprintf(q_vector->name, "%s-tx-%u", netdev->name,
- q_vector->tx_ring->queue_index);
- else if (q_vector->rx_ring)
+ q_vector->tx.ring->queue_index);
+ else if (q_vector->rx.ring)
sprintf(q_vector->name, "%s-rx-%u", netdev->name,
- q_vector->rx_ring->queue_index);
+ q_vector->rx.ring->queue_index);
else
sprintf(q_vector->name, "%s-unused", netdev->name);
@@ -1157,8 +1157,9 @@ static void igb_map_rx_ring_to_vector(struct igb_adapter *adapter,
{
struct igb_q_vector *q_vector = adapter->q_vector[v_idx];
- q_vector->rx_ring = adapter->rx_ring[ring_idx];
- q_vector->rx_ring->q_vector = q_vector;
+ q_vector->rx.ring = adapter->rx_ring[ring_idx];
+ q_vector->rx.ring->q_vector = q_vector;
+ q_vector->rx.count++;
q_vector->itr_val = adapter->rx_itr_setting;
if (q_vector->itr_val && q_vector->itr_val <= 3)
q_vector->itr_val = IGB_START_ITR;
@@ -1169,10 +1170,11 @@ static void igb_map_tx_ring_to_vector(struct igb_adapter *adapter,
{
struct igb_q_vector *q_vector = adapter->q_vector[v_idx];
- q_vector->tx_ring = adapter->tx_ring[ring_idx];
- q_vector->tx_ring->q_vector = q_vector;
+ q_vector->tx.ring = adapter->tx_ring[ring_idx];
+ q_vector->tx.ring->q_vector = q_vector;
+ q_vector->tx.count++;
q_vector->itr_val = adapter->tx_itr_setting;
- q_vector->tx_work_limit = adapter->tx_work_limit;
+ q_vector->tx.work_limit = adapter->tx_work_limit;
if (q_vector->itr_val && q_vector->itr_val <= 3)
q_vector->itr_val = IGB_START_ITR;
}
@@ -3826,33 +3828,24 @@ static void igb_update_ring_itr(struct igb_q_vector *q_vector)
int new_val = q_vector->itr_val;
int avg_wire_size = 0;
struct igb_adapter *adapter = q_vector->adapter;
- struct igb_ring *ring;
unsigned int packets;
/* For non-gigabit speeds, just fix the interrupt rate at 4000
* ints/sec - ITR timer value of 120 ticks.
*/
if (adapter->link_speed != SPEED_1000) {
- new_val = 976;
+ new_val = IGB_4K_ITR;
goto set_itr_val;
}
- ring = q_vector->rx_ring;
- if (ring) {
- packets = ACCESS_ONCE(ring->total_packets);
-
- if (packets)
- avg_wire_size = ring->total_bytes / packets;
- }
+ packets = q_vector->rx.total_packets;
+ if (packets)
+ avg_wire_size = q_vector->rx.total_bytes / packets;
- ring = q_vector->tx_ring;
- if (ring) {
- packets = ACCESS_ONCE(ring->total_packets);
-
- if (packets)
- avg_wire_size = max_t(u32, avg_wire_size,
- ring->total_bytes / packets);
- }
+ packets = q_vector->tx.total_packets;
+ if (packets)
+ avg_wire_size = max_t(u32, avg_wire_size,
+ q_vector->tx.total_bytes / packets);
/* if avg_wire_size isn't set no work was done */
if (!avg_wire_size)
@@ -3870,9 +3863,11 @@ static void igb_update_ring_itr(struct igb_q_vector *q_vector)
else
new_val = avg_wire_size / 2;
- /* when in itr mode 3 do not exceed 20K ints/sec */
- if (adapter->rx_itr_setting == 3 && new_val < 196)
- new_val = 196;
+ /* conservative mode (itr 3) eliminates the lowest_latency setting */
+ if (new_val < IGB_20K_ITR &&
+ ((q_vector->rx.ring && adapter->rx_itr_setting == 3) ||
+ (!q_vector->rx.ring && adapter->tx_itr_setting == 3)))
+ new_val = IGB_20K_ITR;
set_itr_val:
if (new_val != q_vector->itr_val) {
@@ -3880,14 +3875,10 @@ set_itr_val:
q_vector->set_itr = 1;
}
clear_counts:
- if (q_vector->rx_ring) {
- q_vector->rx_ring->total_bytes = 0;
- q_vector->rx_ring->total_packets = 0;
- }
- if (q_vector->tx_ring) {
- q_vector->tx_ring->total_bytes = 0;
- q_vector->tx_ring->total_packets = 0;
- }
+ q_vector->rx.total_bytes = 0;
+ q_vector->rx.total_packets = 0;
+ q_vector->tx.total_bytes = 0;
+ q_vector->tx.total_packets = 0;
}
/**
@@ -3903,106 +3894,102 @@ clear_counts:
* parameter (see igb_param.c)
* NOTE: These calculations are only valid when operating in a single-
* queue environment.
- * @adapter: pointer to adapter
- * @itr_setting: current q_vector->itr_val
- * @packets: the number of packets during this measurement interval
- * @bytes: the number of bytes during this measurement interval
+ * @q_vector: pointer to q_vector
+ * @ring_container: ring info to update the itr for
**/
-static unsigned int igb_update_itr(struct igb_adapter *adapter, u16 itr_setting,
- int packets, int bytes)
+static void igb_update_itr(struct igb_q_vector *q_vector,
+ struct igb_ring_container *ring_container)
{
- unsigned int retval = itr_setting;
+ unsigned int packets = ring_container->total_packets;
+ unsigned int bytes = ring_container->total_bytes;
+ u8 itrval = ring_container->itr;
+ /* no packets, exit with status unchanged */
if (packets == 0)
- goto update_itr_done;
+ return;
- switch (itr_setting) {
+ switch (itrval) {
case lowest_latency:
/* handle TSO and jumbo frames */
if (bytes/packets > 8000)
- retval = bulk_latency;
+ itrval = bulk_latency;
else if ((packets < 5) && (bytes > 512))
- retval = low_latency;
+ itrval = low_latency;
break;
case low_latency: /* 50 usec aka 20000 ints/s */
if (bytes > 10000) {
/* this if handles the TSO accounting */
if (bytes/packets > 8000) {
- retval = bulk_latency;
+ itrval = bulk_latency;
} else if ((packets < 10) || ((bytes/packets) > 1200)) {
- retval = bulk_latency;
+ itrval = bulk_latency;
} else if ((packets > 35)) {
- retval = lowest_latency;
+ itrval = lowest_latency;
}
} else if (bytes/packets > 2000) {
- retval = bulk_latency;
+ itrval = bulk_latency;
} else if (packets <= 2 && bytes < 512) {
- retval = lowest_latency;
+ itrval = lowest_latency;
}
break;
case bulk_latency: /* 250 usec aka 4000 ints/s */
if (bytes > 25000) {
if (packets > 35)
- retval = low_latency;
+ itrval = low_latency;
} else if (bytes < 1500) {
- retval = low_latency;
+ itrval = low_latency;
}
break;
}
-update_itr_done:
- return retval;
+ /* clear work counters since we have the values we need */
+ ring_container->total_bytes = 0;
+ ring_container->total_packets = 0;
+
+ /* write updated itr to ring container */
+ ring_container->itr = itrval;
}
-static void igb_set_itr(struct igb_adapter *adapter)
+static void igb_set_itr(struct igb_q_vector *q_vector)
{
- struct igb_q_vector *q_vector = adapter->q_vector[0];
- u16 current_itr;
+ struct igb_adapter *adapter = q_vector->adapter;
u32 new_itr = q_vector->itr_val;
+ u8 current_itr = 0;
/* for non-gigabit speeds, just fix the interrupt rate at 4000 */
if (adapter->link_speed != SPEED_1000) {
current_itr = 0;
- new_itr = 4000;
+ new_itr = IGB_4K_ITR;
goto set_itr_now;
}
- adapter->rx_itr = igb_update_itr(adapter,
- adapter->rx_itr,
- q_vector->rx_ring->total_packets,
- q_vector->rx_ring->total_bytes);
+ igb_update_itr(q_vector, &q_vector->tx);
+ igb_update_itr(q_vector, &q_vector->rx);
- adapter->tx_itr = igb_update_itr(adapter,
- adapter->tx_itr,
- q_vector->tx_ring->total_packets,
- q_vector->tx_ring->total_bytes);
- current_itr = max(adapter->rx_itr, adapter->tx_itr);
+ current_itr = max(q_vector->rx.itr, q_vector->tx.itr);
/* conservative mode (itr 3) eliminates the lowest_latency setting */
- if (adapter->rx_itr_setting == 3 && current_itr == lowest_latency)
+ if (current_itr == lowest_latency &&
+ ((q_vector->rx.ring && adapter->rx_itr_setting == 3) ||
+ (!q_vector->rx.ring && adapter->tx_itr_setting == 3)))
current_itr = low_latency;
switch (current_itr) {
/* counts and packets in update_itr are dependent on these numbers */
case lowest_latency:
- new_itr = 56; /* aka 70,000 ints/sec */
+ new_itr = IGB_70K_ITR; /* 70,000 ints/sec */
break;
case low_latency:
- new_itr = 196; /* aka 20,000 ints/sec */
+ new_itr = IGB_20K_ITR; /* 20,000 ints/sec */
break;
case bulk_latency:
- new_itr = 980; /* aka 4,000 ints/sec */
+ new_itr = IGB_4K_ITR; /* 4,000 ints/sec */
break;
default:
break;
}
set_itr_now:
- q_vector->rx_ring->total_bytes = 0;
- q_vector->rx_ring->total_packets = 0;
- q_vector->tx_ring->total_bytes = 0;
- q_vector->tx_ring->total_packets = 0;
-
if (new_itr != q_vector->itr_val) {
/* this attempts to bias the interrupt rate towards Bulk
* by adding intermediate steps when interrupt rate is
@@ -4010,7 +3997,7 @@ set_itr_now:
new_itr = new_itr > q_vector->itr_val ?
max((new_itr * q_vector->itr_val) /
(new_itr + (q_vector->itr_val >> 2)),
- new_itr) :
+ new_itr) :
new_itr;
/* Don't write the value here; it resets the adapter's
* internal timer, and causes us to delay far longer than
@@ -4830,7 +4817,7 @@ static void igb_write_itr(struct igb_q_vector *q_vector)
if (adapter->hw.mac.type == e1000_82575)
itr_val |= itr_val << 16;
else
- itr_val |= 0x8000000;
+ itr_val |= E1000_EITR_CNT_IGNR;
writel(itr_val, q_vector->itr_register);
q_vector->set_itr = 0;
@@ -4858,8 +4845,8 @@ static void igb_update_dca(struct igb_q_vector *q_vector)
if (q_vector->cpu == cpu)
goto out_no_update;
- if (q_vector->tx_ring) {
- int q = q_vector->tx_ring->reg_idx;
+ if (q_vector->tx.ring) {
+ int q = q_vector->tx.ring->reg_idx;
u32 dca_txctrl = rd32(E1000_DCA_TXCTRL(q));
if (hw->mac.type == e1000_82575) {
dca_txctrl &= ~E1000_DCA_TXCTRL_CPUID_MASK;
@@ -4872,8 +4859,8 @@ static void igb_update_dca(struct igb_q_vector *q_vector)
dca_txctrl |= E1000_DCA_TXCTRL_DESC_DCA_EN;
wr32(E1000_DCA_TXCTRL(q), dca_txctrl);
}
- if (q_vector->rx_ring) {
- int q = q_vector->rx_ring->reg_idx;
+ if (q_vector->rx.ring) {
+ int q = q_vector->rx.ring->reg_idx;
u32 dca_rxctrl = rd32(E1000_DCA_RXCTRL(q));
if (hw->mac.type == e1000_82575) {
dca_rxctrl &= ~E1000_DCA_RXCTRL_CPUID_MASK;
@@ -5517,16 +5504,14 @@ static irqreturn_t igb_intr(int irq, void *data)
/* Interrupt Auto-Mask...upon reading ICR, interrupts are masked. No
* need for the IMC write */
u32 icr = rd32(E1000_ICR);
- if (!icr)
- return IRQ_NONE; /* Not our interrupt */
-
- igb_write_itr(q_vector);
/* IMS will not auto-mask if INT_ASSERTED is not set, and if it is
* not set, then the adapter didn't send an interrupt */
if (!(icr & E1000_ICR_INT_ASSERTED))
return IRQ_NONE;
+ igb_write_itr(q_vector);
+
if (icr & E1000_ICR_DRSTA)
schedule_work(&adapter->reset_task);
@@ -5547,15 +5532,15 @@ static irqreturn_t igb_intr(int irq, void *data)
return IRQ_HANDLED;
}
-static inline void igb_ring_irq_enable(struct igb_q_vector *q_vector)
+void igb_ring_irq_enable(struct igb_q_vector *q_vector)
{
struct igb_adapter *adapter = q_vector->adapter;
struct e1000_hw *hw = &adapter->hw;
- if ((q_vector->rx_ring && (adapter->rx_itr_setting & 3)) ||
- (!q_vector->rx_ring && (adapter->tx_itr_setting & 3))) {
- if (!adapter->msix_entries)
- igb_set_itr(adapter);
+ if ((q_vector->rx.ring && (adapter->rx_itr_setting & 3)) ||
+ (!q_vector->rx.ring && (adapter->tx_itr_setting & 3))) {
+ if ((adapter->num_q_vectors == 1) && !adapter->vf_data)
+ igb_set_itr(q_vector);
else
igb_update_ring_itr(q_vector);
}
@@ -5584,10 +5569,10 @@ static int igb_poll(struct napi_struct *napi, int budget)
if (q_vector->adapter->flags & IGB_FLAG_DCA_ENABLED)
igb_update_dca(q_vector);
#endif
- if (q_vector->tx_ring)
+ if (q_vector->tx.ring)
clean_complete = igb_clean_tx_irq(q_vector);
- if (q_vector->rx_ring)
+ if (q_vector->rx.ring)
clean_complete &= igb_clean_rx_irq(q_vector, budget);
/* If all work not completed, return budget and keep polling */
@@ -5667,11 +5652,11 @@ static void igb_tx_hwtstamp(struct igb_q_vector *q_vector,
static bool igb_clean_tx_irq(struct igb_q_vector *q_vector)
{
struct igb_adapter *adapter = q_vector->adapter;
- struct igb_ring *tx_ring = q_vector->tx_ring;
+ struct igb_ring *tx_ring = q_vector->tx.ring;
struct igb_tx_buffer *tx_buffer;
union e1000_adv_tx_desc *tx_desc, *eop_desc;
unsigned int total_bytes = 0, total_packets = 0;
- unsigned int budget = q_vector->tx_work_limit;
+ unsigned int budget = q_vector->tx.work_limit;
unsigned int i = tx_ring->next_to_clean;
if (test_bit(__IGB_DOWN, &adapter->state))
@@ -5757,8 +5742,8 @@ static bool igb_clean_tx_irq(struct igb_q_vector *q_vector)
tx_ring->tx_stats.bytes += total_bytes;
tx_ring->tx_stats.packets += total_packets;
u64_stats_update_end(&tx_ring->tx_syncp);
- tx_ring->total_bytes += total_bytes;
- tx_ring->total_packets += total_packets;
+ q_vector->tx.total_bytes += total_bytes;
+ q_vector->tx.total_packets += total_packets;
if (tx_ring->detect_tx_hung) {
struct e1000_hw *hw = &adapter->hw;
@@ -5907,7 +5892,7 @@ static inline u16 igb_get_hlen(union e1000_adv_rx_desc *rx_desc)
static bool igb_clean_rx_irq(struct igb_q_vector *q_vector, int budget)
{
- struct igb_ring *rx_ring = q_vector->rx_ring;
+ struct igb_ring *rx_ring = q_vector->rx.ring;
union e1000_adv_rx_desc *rx_desc;
const int current_node = numa_node_id();
unsigned int total_bytes = 0, total_packets = 0;
@@ -6024,8 +6009,8 @@ next_desc:
rx_ring->rx_stats.packets += total_packets;
rx_ring->rx_stats.bytes += total_bytes;
u64_stats_update_end(&rx_ring->rx_syncp);
- rx_ring->total_packets += total_packets;
- rx_ring->total_bytes += total_bytes;
+ q_vector->rx.total_packets += total_packets;
+ q_vector->rx.total_bytes += total_bytes;
if (cleaned_count)
igb_alloc_rx_buffers(rx_ring, cleaned_count);
--
1.7.6.4
* [net-next 06/11] igb: cleanup IVAR configuration
2011-10-08 6:47 [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
` (4 preceding siblings ...)
2011-10-08 6:47 ` [net-next 05/11] igb: Move ITR related data into work container within the q_vector Jeff Kirsher
@ 2011-10-08 6:47 ` Jeff Kirsher
2011-10-08 6:47 ` [net-next 07/11] igb: retire the RX_CSUM flag and use the netdev flag instead Jeff Kirsher
` (5 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2011-10-08 6:47 UTC (permalink / raw)
To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
From: Alexander Duyck <alexander.h.duyck@intel.com>
This change is meant to clean up some of the IVAR register
configuration. igb_assign_vector had become quite large, with multiple
copies of the same general code for setting the IVAR. This change
consolidates most of that code by adding an igb_write_ivar function,
which lets us just compute the index and offset and then use that
information to set up the IVAR.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
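Two worked examples of the index/offset math, following the callers
added by this patch:

82576 (column-major), rx_queue = 10:
	index  = 10 & 0x7        = 2	(row in the IVAR table)
	offset = (10 & 0x8) << 1 = 16	(third byte: Rx queues 8-15)

82580/i350 (row-major), tx_queue = 5:
	index  = 5 >> 1               = 2
	offset = ((5 & 0x1) << 4) + 8 = 24	(high byte: odd Tx queue)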
drivers/net/ethernet/intel/igb/igb_main.c | 120 +++++++++++++---------------
1 files changed, 56 insertions(+), 64 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 8dc04e0..ec715f4 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -754,15 +754,40 @@ err:
return -ENOMEM;
}
+/**
+ * igb_write_ivar - configure ivar for given MSI-X vector
+ * @hw: pointer to the HW structure
+ * @msix_vector: vector number we are allocating to a given ring
+ * @index: row index of IVAR register to write within IVAR table
+ * @offset: column offset within IVAR, should be a multiple of 8
+ *
+ * This function is intended to handle the writing of the IVAR register
+ * for adapters 82576 and newer. The IVAR table consists of 2 columns,
+ * each containing a cause allocation for an Rx and Tx ring, and a
+ * variable number of rows depending on the number of queues supported.
+ **/
+static void igb_write_ivar(struct e1000_hw *hw, int msix_vector,
+ int index, int offset)
+{
+ u32 ivar = array_rd32(E1000_IVAR0, index);
+
+ /* clear any bits that are currently set */
+ ivar &= ~((u32)0xFF << offset);
+
+ /* write vector and valid bit */
+ ivar |= (msix_vector | E1000_IVAR_VALID) << offset;
+
+ array_wr32(E1000_IVAR0, index, ivar);
+}
+
#define IGB_N0_QUEUE -1
static void igb_assign_vector(struct igb_q_vector *q_vector, int msix_vector)
{
- u32 msixbm = 0;
struct igb_adapter *adapter = q_vector->adapter;
struct e1000_hw *hw = &adapter->hw;
- u32 ivar, index;
int rx_queue = IGB_N0_QUEUE;
int tx_queue = IGB_N0_QUEUE;
+ u32 msixbm = 0;
if (q_vector->rx.ring)
rx_queue = q_vector->rx.ring->reg_idx;
@@ -785,72 +810,39 @@ static void igb_assign_vector(struct igb_q_vector *q_vector, int msix_vector)
q_vector->eims_value = msixbm;
break;
case e1000_82576:
- /* 82576 uses a table-based method for assigning vectors.
- Each queue has a single entry in the table to which we write
- a vector number along with a "valid" bit. Sadly, the layout
- of the table is somewhat counterintuitive. */
- if (rx_queue > IGB_N0_QUEUE) {
- index = (rx_queue & 0x7);
- ivar = array_rd32(E1000_IVAR0, index);
- if (rx_queue < 8) {
- /* vector goes into low byte of register */
- ivar = ivar & 0xFFFFFF00;
- ivar |= msix_vector | E1000_IVAR_VALID;
- } else {
- /* vector goes into third byte of register */
- ivar = ivar & 0xFF00FFFF;
- ivar |= (msix_vector | E1000_IVAR_VALID) << 16;
- }
- array_wr32(E1000_IVAR0, index, ivar);
- }
- if (tx_queue > IGB_N0_QUEUE) {
- index = (tx_queue & 0x7);
- ivar = array_rd32(E1000_IVAR0, index);
- if (tx_queue < 8) {
- /* vector goes into second byte of register */
- ivar = ivar & 0xFFFF00FF;
- ivar |= (msix_vector | E1000_IVAR_VALID) << 8;
- } else {
- /* vector goes into high byte of register */
- ivar = ivar & 0x00FFFFFF;
- ivar |= (msix_vector | E1000_IVAR_VALID) << 24;
- }
- array_wr32(E1000_IVAR0, index, ivar);
- }
+ /*
+ * 82576 uses a table that essentially consists of 2 columns
+ * with 8 rows. The ordering is column-major so we use the
+ * lower 3 bits as the row index, and the 4th bit as the
+ * column offset.
+ */
+ if (rx_queue > IGB_N0_QUEUE)
+ igb_write_ivar(hw, msix_vector,
+ rx_queue & 0x7,
+ (rx_queue & 0x8) << 1);
+ if (tx_queue > IGB_N0_QUEUE)
+ igb_write_ivar(hw, msix_vector,
+ tx_queue & 0x7,
+ ((tx_queue & 0x8) << 1) + 8);
q_vector->eims_value = 1 << msix_vector;
break;
case e1000_82580:
case e1000_i350:
- /* 82580 uses the same table-based approach as 82576 but has fewer
- entries as a result we carry over for queues greater than 4. */
- if (rx_queue > IGB_N0_QUEUE) {
- index = (rx_queue >> 1);
- ivar = array_rd32(E1000_IVAR0, index);
- if (rx_queue & 0x1) {
- /* vector goes into third byte of register */
- ivar = ivar & 0xFF00FFFF;
- ivar |= (msix_vector | E1000_IVAR_VALID) << 16;
- } else {
- /* vector goes into low byte of register */
- ivar = ivar & 0xFFFFFF00;
- ivar |= msix_vector | E1000_IVAR_VALID;
- }
- array_wr32(E1000_IVAR0, index, ivar);
- }
- if (tx_queue > IGB_N0_QUEUE) {
- index = (tx_queue >> 1);
- ivar = array_rd32(E1000_IVAR0, index);
- if (tx_queue & 0x1) {
- /* vector goes into high byte of register */
- ivar = ivar & 0x00FFFFFF;
- ivar |= (msix_vector | E1000_IVAR_VALID) << 24;
- } else {
- /* vector goes into second byte of register */
- ivar = ivar & 0xFFFF00FF;
- ivar |= (msix_vector | E1000_IVAR_VALID) << 8;
- }
- array_wr32(E1000_IVAR0, index, ivar);
- }
+ /*
+ * On 82580 and newer adapters the scheme is similar to 82576
+ * however instead of ordering column-major we have things
+ * ordered row-major. So we traverse the table by using
+ * bit 0 as the column offset, and the remaining bits as the
+ * row index.
+ */
+ if (rx_queue > IGB_N0_QUEUE)
+ igb_write_ivar(hw, msix_vector,
+ rx_queue >> 1,
+ (rx_queue & 0x1) << 4);
+ if (tx_queue > IGB_N0_QUEUE)
+ igb_write_ivar(hw, msix_vector,
+ tx_queue >> 1,
+ ((tx_queue & 0x1) << 4) + 8);
q_vector->eims_value = 1 << msix_vector;
break;
default:
--
1.7.6.4
* [net-next 07/11] igb: retire the RX_CSUM flag and use the netdev flag instead
2011-10-08 6:47 [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
` (5 preceding siblings ...)
2011-10-08 6:47 ` [net-next 06/11] igb: cleanup IVAR configuration Jeff Kirsher
@ 2011-10-08 6:47 ` Jeff Kirsher
2011-10-08 6:47 ` [net-next 08/11] igb: leave staterr in place and instead use a helper function to check bits Jeff Kirsher
` (4 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2011-10-08 6:47 UTC (permalink / raw)
To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
From: Alexander Duyck <alexander.h.duyck@intel.com>
Since the netdev now has its own checksum flag to indicate whether Rx
checksum is enabled, we might as well use that instead of the ring flag.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
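The replacement check in the Rx hot path reduces to reading the netdev
feature bit; roughly (a simplified sketch, not the exact driver code):

#include <linux/netdevice.h>

static bool rx_csum_enabled_sketch(const struct net_device *netdev)
{
	/* single source of truth: no per-ring copy for
	 * igb_set_features() to keep in sync */
	return !!(netdev->features & NETIF_F_RXCSUM);
}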
drivers/net/ethernet/intel/igb/igb.h | 1 -
drivers/net/ethernet/intel/igb/igb_main.c | 22 ++++++----------------
2 files changed, 6 insertions(+), 17 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index 91f90fe..fde381a 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -245,7 +245,6 @@ struct igb_ring {
};
enum e1000_ring_flags_t {
- IGB_RING_FLAG_RX_CSUM,
IGB_RING_FLAG_RX_SCTP_CSUM,
IGB_RING_FLAG_TX_CTX_IDX,
IGB_RING_FLAG_TX_DETECT_HANG
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index ec715f4..cae4abb 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -732,8 +732,6 @@ static int igb_alloc_queues(struct igb_adapter *adapter)
ring->dev = &adapter->pdev->dev;
ring->netdev = adapter->netdev;
ring->numa_node = adapter->node;
- /* enable rx checksum */
- set_bit(IGB_RING_FLAG_RX_CSUM, &ring->flags);
/* set flag indicating ring supports SCTP checksum offload */
if (adapter->hw.mac.type >= e1000_82576)
set_bit(IGB_RING_FLAG_RX_SCTP_CSUM, &ring->flags);
@@ -1811,19 +1809,8 @@ static u32 igb_fix_features(struct net_device *netdev, u32 features)
static int igb_set_features(struct net_device *netdev, u32 features)
{
- struct igb_adapter *adapter = netdev_priv(netdev);
- int i;
u32 changed = netdev->features ^ features;
- for (i = 0; i < adapter->num_rx_queues; i++) {
- if (features & NETIF_F_RXCSUM)
- set_bit(IGB_RING_FLAG_RX_CSUM,
- &adapter->rx_ring[i]->flags);
- else
- clear_bit(IGB_RING_FLAG_RX_CSUM,
- &adapter->rx_ring[i]->flags);
- }
-
if (changed & NETIF_F_HW_VLAN_RX)
igb_vlan_mode(netdev, features);
@@ -5807,9 +5794,12 @@ static inline void igb_rx_checksum(struct igb_ring *ring,
{
skb_checksum_none_assert(skb);
- /* Ignore Checksum bit is set or checksum is disabled through ethtool */
- if (!test_bit(IGB_RING_FLAG_RX_CSUM, &ring->flags) ||
- (status_err & E1000_RXD_STAT_IXSM))
+ /* Ignore Checksum bit is set */
+ if (status_err & E1000_RXD_STAT_IXSM)
+ return;
+
+ /* Rx checksum disabled via ethtool */
+ if (!(ring->netdev->features & NETIF_F_RXCSUM))
return;
/* TCP/UDP checksum error bit is set */
--
1.7.6.4
* [net-next 08/11] igb: leave staterr in place and instead use a helper function to check bits
2011-10-08 6:47 [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
` (6 preceding siblings ...)
2011-10-08 6:47 ` [net-next 07/11] igb: retire the RX_CSUM flag and use the netdev flag instead Jeff Kirsher
@ 2011-10-08 6:47 ` Jeff Kirsher
2011-10-08 6:47 ` [net-next 09/11] igb: fix recent VLAN changes that would leave VLANs disabled after reset Jeff Kirsher
` (3 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2011-10-08 6:47 UTC (permalink / raw)
To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
From: Alexander Duyck <alexander.h.duyck@intel.com>
Instead of doing a byte swap on the staterr bits in the Rx descriptor,
we can save ourselves a bit of space and some CPU time by testing the
relevant bits in the Rx descriptor directly.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/igb/igb.h | 7 +++
drivers/net/ethernet/intel/igb/igb_ethtool.c | 5 +--
drivers/net/ethernet/intel/igb/igb_main.c | 55 ++++++++++++++-----------
3 files changed, 39 insertions(+), 28 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index fde381a..11d17f1 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -259,6 +259,13 @@ enum e1000_ring_flags_t {
#define IGB_TX_CTXTDESC(R, i) \
(&(((struct e1000_adv_tx_context_desc *)((R)->desc))[i]))
+/* igb_test_staterr - tests bits within Rx descriptor status and error fields */
+static inline __le32 igb_test_staterr(union e1000_adv_rx_desc *rx_desc,
+ const u32 stat_err_bits)
+{
+ return rx_desc->wb.upper.status_error & cpu_to_le32(stat_err_bits);
+}
+
/* igb_desc_unused - calculate if we have unused descriptors */
static inline int igb_desc_unused(struct igb_ring *ring)
{
diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index 5ebe992..bc198ea 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -1581,16 +1581,14 @@ static int igb_clean_test_rings(struct igb_ring *rx_ring,
union e1000_adv_rx_desc *rx_desc;
struct igb_rx_buffer *rx_buffer_info;
struct igb_tx_buffer *tx_buffer_info;
- u32 staterr;
u16 rx_ntc, tx_ntc, count = 0;
/* initialize next to clean and descriptor values */
rx_ntc = rx_ring->next_to_clean;
tx_ntc = tx_ring->next_to_clean;
rx_desc = IGB_RX_DESC(rx_ring, rx_ntc);
- staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
- while (staterr & E1000_RXD_STAT_DD) {
+ while (igb_test_staterr(rx_desc, E1000_RXD_STAT_DD)) {
/* check rx buffer */
rx_buffer_info = &rx_ring->rx_buffer_info[rx_ntc];
@@ -1619,7 +1617,6 @@ static int igb_clean_test_rings(struct igb_ring *rx_ring,
/* fetch next descriptor */
rx_desc = IGB_RX_DESC(rx_ring, rx_ntc);
- staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
}
/* re-map buffers to ring, store next to clean values */
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index cae4abb..1419ae8 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -5790,12 +5790,13 @@ static bool igb_clean_tx_irq(struct igb_q_vector *q_vector)
}
static inline void igb_rx_checksum(struct igb_ring *ring,
- u32 status_err, struct sk_buff *skb)
+ union e1000_adv_rx_desc *rx_desc,
+ struct sk_buff *skb)
{
skb_checksum_none_assert(skb);
/* Ignore Checksum bit is set */
- if (status_err & E1000_RXD_STAT_IXSM)
+ if (igb_test_staterr(rx_desc, E1000_RXD_STAT_IXSM))
return;
/* Rx checksum disabled via ethtool */
@@ -5803,8 +5804,9 @@ static inline void igb_rx_checksum(struct igb_ring *ring,
return;
/* TCP/UDP checksum error bit is set */
- if (status_err &
- (E1000_RXDEXT_STATERR_TCPE | E1000_RXDEXT_STATERR_IPE)) {
+ if (igb_test_staterr(rx_desc,
+ E1000_RXDEXT_STATERR_TCPE |
+ E1000_RXDEXT_STATERR_IPE)) {
/*
* work around errata with sctp packets where the TCPE aka
* L4E bit is set incorrectly on 64 byte (60 byte w/o crc)
@@ -5820,19 +5822,26 @@ static inline void igb_rx_checksum(struct igb_ring *ring,
return;
}
/* It must be a TCP or UDP packet with a valid checksum */
- if (status_err & (E1000_RXD_STAT_TCPCS | E1000_RXD_STAT_UDPCS))
+ if (igb_test_staterr(rx_desc, E1000_RXD_STAT_TCPCS |
+ E1000_RXD_STAT_UDPCS))
skb->ip_summed = CHECKSUM_UNNECESSARY;
- dev_dbg(ring->dev, "cksum success: bits %08X\n", status_err);
+ dev_dbg(ring->dev, "cksum success: bits %08X\n",
+ le32_to_cpu(rx_desc->wb.upper.status_error));
}
-static void igb_rx_hwtstamp(struct igb_q_vector *q_vector, u32 staterr,
- struct sk_buff *skb)
+static void igb_rx_hwtstamp(struct igb_q_vector *q_vector,
+ union e1000_adv_rx_desc *rx_desc,
+ struct sk_buff *skb)
{
struct igb_adapter *adapter = q_vector->adapter;
struct e1000_hw *hw = &adapter->hw;
u64 regval;
+ if (!igb_test_staterr(rx_desc, E1000_RXDADV_STAT_TSIP |
+ E1000_RXDADV_STAT_TS))
+ return;
+
/*
* If this bit is set, then the RX registers contain the time stamp. No
* other packet will be time stamped until we read these registers, so
@@ -5844,7 +5853,7 @@ static void igb_rx_hwtstamp(struct igb_q_vector *q_vector, u32 staterr,
* If nothing went wrong, then it should have a shared tx_flags that we
* can turn into a skb_shared_hwtstamps.
*/
- if (staterr & E1000_RXDADV_STAT_TSIP) {
+ if (igb_test_staterr(rx_desc, E1000_RXDADV_STAT_TSIP)) {
u32 *stamp = (u32 *)skb->data;
regval = le32_to_cpu(*(stamp + 2));
regval |= (u64)le32_to_cpu(*(stamp + 3)) << 32;
@@ -5878,14 +5887,12 @@ static bool igb_clean_rx_irq(struct igb_q_vector *q_vector, int budget)
union e1000_adv_rx_desc *rx_desc;
const int current_node = numa_node_id();
unsigned int total_bytes = 0, total_packets = 0;
- u32 staterr;
u16 cleaned_count = igb_desc_unused(rx_ring);
u16 i = rx_ring->next_to_clean;
rx_desc = IGB_RX_DESC(rx_ring, i);
- staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
- while (staterr & E1000_RXD_STAT_DD) {
+ while (igb_test_staterr(rx_desc, E1000_RXD_STAT_DD)) {
struct igb_rx_buffer *buffer_info = &rx_ring->rx_buffer_info[i];
struct sk_buff *skb = buffer_info->skb;
union e1000_adv_rx_desc *next_rxd;
@@ -5938,7 +5945,7 @@ static bool igb_clean_rx_irq(struct igb_q_vector *q_vector, int budget)
buffer_info->page_dma = 0;
}
- if (!(staterr & E1000_RXD_STAT_EOP)) {
+ if (!igb_test_staterr(rx_desc, E1000_RXD_STAT_EOP)) {
struct igb_rx_buffer *next_buffer;
next_buffer = &rx_ring->rx_buffer_info[i];
buffer_info->skb = next_buffer->skb;
@@ -5948,25 +5955,26 @@ static bool igb_clean_rx_irq(struct igb_q_vector *q_vector, int budget)
goto next_desc;
}
- if (staterr & E1000_RXDEXT_ERR_FRAME_ERR_MASK) {
+ if (igb_test_staterr(rx_desc,
+ E1000_RXDEXT_ERR_FRAME_ERR_MASK)) {
dev_kfree_skb_any(skb);
goto next_desc;
}
- if (staterr & (E1000_RXDADV_STAT_TSIP | E1000_RXDADV_STAT_TS))
- igb_rx_hwtstamp(q_vector, staterr, skb);
- total_bytes += skb->len;
- total_packets++;
-
- igb_rx_checksum(rx_ring, staterr, skb);
-
- skb->protocol = eth_type_trans(skb, rx_ring->netdev);
+ igb_rx_hwtstamp(q_vector, rx_desc, skb);
+ igb_rx_checksum(rx_ring, rx_desc, skb);
- if (staterr & E1000_RXD_STAT_VP) {
+ if (igb_test_staterr(rx_desc, E1000_RXD_STAT_VP)) {
u16 vid = le16_to_cpu(rx_desc->wb.upper.vlan);
__vlan_hwaccel_put_tag(skb, vid);
}
+
+ total_bytes += skb->len;
+ total_packets++;
+
+ skb->protocol = eth_type_trans(skb, rx_ring->netdev);
+
napi_gro_receive(&q_vector->napi, skb);
budget--;
@@ -5983,7 +5991,6 @@ next_desc:
/* use prefetched values */
rx_desc = next_rxd;
- staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
}
rx_ring->next_to_clean = i;
--
1.7.6.4
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [net-next 09/11] igb: fix recent VLAN changes that would leave VLANs disabled after reset
2011-10-08 6:47 [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
` (7 preceding siblings ...)
2011-10-08 6:47 ` [net-next 08/11] igb: leave staterr in place and instead use a helper function to check bits Jeff Kirsher
@ 2011-10-08 6:47 ` Jeff Kirsher
2011-10-08 6:47 ` [net-next 10/11] igb: move TX hang check flag into ring->flags Jeff Kirsher
` (2 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2011-10-08 6:47 UTC (permalink / raw)
To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
From: Alexander Duyck <alexander.h.duyck@intel.com>
This patch cleans up several issues with VLANs on igb after the recent
changes that were meant to control whether VLANs are enabled or disabled
via the netdev->features flags.
Specifically, the Rx VLAN settings were being dropped after reset because
they were not being restored correctly. In addition, I removed the IRQ
disable/enable calls since they were only in place to protect the setting
of vlgrp.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/igb/igb_main.c | 18 ++++--------------
1 files changed, 4 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 1419ae8..971aea9 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2112,8 +2112,6 @@ static int __devinit igb_probe(struct pci_dev *pdev,
if (err)
goto err_register;
- igb_vlan_mode(netdev, netdev->features);
-
/* carrier off reporting is important to ethtool even BEFORE open */
netif_carrier_off(netdev);
@@ -5120,7 +5118,6 @@ static s32 igb_vlvf_set(struct igb_adapter *adapter, u32 vid, bool add, u32 vf)
}
adapter->vf_data[vf].vlans_enabled++;
- return 0;
}
} else {
if (i < E1000_VLVF_ARRAY_SIZE) {
@@ -6385,10 +6382,9 @@ static void igb_vlan_mode(struct net_device *netdev, u32 features)
struct igb_adapter *adapter = netdev_priv(netdev);
struct e1000_hw *hw = &adapter->hw;
u32 ctrl, rctl;
+ bool enable = !!(features & NETIF_F_HW_VLAN_RX);
- igb_irq_disable(adapter);
-
- if (features & NETIF_F_HW_VLAN_RX) {
+ if (enable) {
/* enable VLAN tag insert/strip */
ctrl = rd32(E1000_CTRL);
ctrl |= E1000_CTRL_VME;
@@ -6406,9 +6402,6 @@ static void igb_vlan_mode(struct net_device *netdev, u32 features)
}
igb_rlpml_set(adapter);
-
- if (!test_bit(__IGB_DOWN, &adapter->state))
- igb_irq_enable(adapter);
}
static void igb_vlan_rx_add_vid(struct net_device *netdev, u16 vid)
@@ -6433,11 +6426,6 @@ static void igb_vlan_rx_kill_vid(struct net_device *netdev, u16 vid)
int pf_id = adapter->vfs_allocated_count;
s32 err;
- igb_irq_disable(adapter);
-
- if (!test_bit(__IGB_DOWN, &adapter->state))
- igb_irq_enable(adapter);
-
/* remove vlan from VLVF table array */
err = igb_vlvf_set(adapter, vid, false, pf_id);
@@ -6452,6 +6440,8 @@ static void igb_restore_vlan(struct igb_adapter *adapter)
{
u16 vid;
+ igb_vlan_mode(adapter->netdev, adapter->netdev->features);
+
for_each_set_bit(vid, adapter->active_vlans, VLAN_N_VID)
igb_vlan_rx_add_vid(adapter->netdev, vid);
}
--
1.7.6.4
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [net-next 10/11] igb: move TX hang check flag into ring->flags
2011-10-08 6:47 [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
` (8 preceding siblings ...)
2011-10-08 6:47 ` [net-next 09/11] igb: fix recent VLAN changes that would leave VLANs disabled after reset Jeff Kirsher
@ 2011-10-08 6:47 ` Jeff Kirsher
2011-10-08 6:47 ` [net-next 11/11] igb: add support for NETIF_F_RXHASH Jeff Kirsher
2011-10-08 6:52 ` [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
11 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2011-10-08 6:47 UTC (permalink / raw)
To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
From: Alexander Duyck <alexander.h.duyck@intel.com>
This change moves the Tx hang check flag out of its own bool member in the
ring structure and into the ring->flags bit field.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/igb/igb.h | 1 -
drivers/net/ethernet/intel/igb/igb_main.c | 6 +++---
2 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index 11d17f1..4e665a9 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -231,7 +231,6 @@ struct igb_ring {
struct igb_tx_queue_stats tx_stats;
struct u64_stats_sync tx_syncp;
struct u64_stats_sync tx_syncp2;
- bool detect_tx_hung;
};
/* RX */
struct {
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 971aea9..77ade67 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3754,7 +3754,7 @@ static void igb_watchdog_task(struct work_struct *work)
}
/* Force detection of hung controller every watchdog period */
- tx_ring->detect_tx_hung = true;
+ set_bit(IGB_RING_FLAG_TX_DETECT_HANG, &tx_ring->flags);
}
/* Cause software interrupt to ensure rx ring is cleaned */
@@ -5721,14 +5721,14 @@ static bool igb_clean_tx_irq(struct igb_q_vector *q_vector)
q_vector->tx.total_bytes += total_bytes;
q_vector->tx.total_packets += total_packets;
- if (tx_ring->detect_tx_hung) {
+ if (test_bit(IGB_RING_FLAG_TX_DETECT_HANG, &tx_ring->flags)) {
struct e1000_hw *hw = &adapter->hw;
eop_desc = tx_buffer->next_to_watch;
/* Detect a transmit hang in hardware, this serializes the
* check with the clearing of time_stamp and movement of i */
- tx_ring->detect_tx_hung = false;
+ clear_bit(IGB_RING_FLAG_TX_DETECT_HANG, &tx_ring->flags);
if (eop_desc &&
time_after(jiffies, tx_buffer->time_stamp +
(adapter->tx_timeout_factor * HZ)) &&
--
1.7.6.4
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [net-next 11/11] igb: add support for NETIF_F_RXHASH
2011-10-08 6:47 [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
` (9 preceding siblings ...)
2011-10-08 6:47 ` [net-next 10/11] igb: move TX hang check flag into ring->flags Jeff Kirsher
@ 2011-10-08 6:47 ` Jeff Kirsher
2011-10-08 6:52 ` [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
11 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2011-10-08 6:47 UTC (permalink / raw)
To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
From: Alexander Duyck <alexander.h.duyck@intel.com>
This patch adds support for Rx hashing: when NETIF_F_RXHASH is enabled,
the RSS hash computed by the hardware is copied out of the Rx descriptor
into skb->rxhash.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/igb/igb_main.c | 52 +++++++++++++++++++---------
1 files changed, 35 insertions(+), 17 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 77ade67..10670f9 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -1978,23 +1978,32 @@ static int __devinit igb_probe(struct pci_dev *pdev,
dev_info(&pdev->dev,
"PHY reset is blocked due to SOL/IDER session.\n");
- netdev->hw_features = NETIF_F_SG |
- NETIF_F_IP_CSUM |
- NETIF_F_IPV6_CSUM |
- NETIF_F_TSO |
- NETIF_F_TSO6 |
- NETIF_F_RXCSUM |
- NETIF_F_HW_VLAN_RX;
-
- netdev->features = netdev->hw_features |
- NETIF_F_HW_VLAN_TX |
- NETIF_F_HW_VLAN_FILTER;
-
- netdev->vlan_features |= NETIF_F_TSO;
- netdev->vlan_features |= NETIF_F_TSO6;
- netdev->vlan_features |= NETIF_F_IP_CSUM;
- netdev->vlan_features |= NETIF_F_IPV6_CSUM;
- netdev->vlan_features |= NETIF_F_SG;
+ /*
+ * features is initialized to 0 at allocation; it might have bits
+ * set by igb_sw_init, so we should use an OR instead of an
+ * assignment.
+ */
+ netdev->features |= NETIF_F_SG |
+ NETIF_F_IP_CSUM |
+ NETIF_F_IPV6_CSUM |
+ NETIF_F_TSO |
+ NETIF_F_TSO6 |
+ NETIF_F_RXHASH |
+ NETIF_F_RXCSUM |
+ NETIF_F_HW_VLAN_RX |
+ NETIF_F_HW_VLAN_TX;
+
+ /* copy netdev features into list of user selectable features */
+ netdev->hw_features |= netdev->features;
+
+ /* set this bit last since it cannot be part of hw_features */
+ netdev->features |= NETIF_F_HW_VLAN_FILTER;
+
+ netdev->vlan_features |= NETIF_F_TSO |
+ NETIF_F_TSO6 |
+ NETIF_F_IP_CSUM |
+ NETIF_F_IPV6_CSUM |
+ NETIF_F_SG;
if (pci_using_dac) {
netdev->features |= NETIF_F_HIGHDMA;
@@ -5827,6 +5836,14 @@ static inline void igb_rx_checksum(struct igb_ring *ring,
le32_to_cpu(rx_desc->wb.upper.status_error));
}
+static inline void igb_rx_hash(struct igb_ring *ring,
+ union e1000_adv_rx_desc *rx_desc,
+ struct sk_buff *skb)
+{
+ if (ring->netdev->features & NETIF_F_RXHASH)
+ skb->rxhash = le32_to_cpu(rx_desc->wb.lower.hi_dword.rss);
+}
+
static void igb_rx_hwtstamp(struct igb_q_vector *q_vector,
union e1000_adv_rx_desc *rx_desc,
struct sk_buff *skb)
@@ -5959,6 +5976,7 @@ static bool igb_clean_rx_irq(struct igb_q_vector *q_vector, int budget)
}
igb_rx_hwtstamp(q_vector, rx_desc, skb);
+ igb_rx_hash(rx_ring, rx_desc, skb);
igb_rx_checksum(rx_ring, rx_desc, skb);
if (igb_test_staterr(rx_desc, E1000_RXD_STAT_VP)) {
--
1.7.6.4
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [net-next 00/11][pull request] Intel Wired LAN Driver Updates
2011-10-08 6:47 [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
` (10 preceding siblings ...)
2011-10-08 6:47 ` [net-next 11/11] igb: add support for NETIF_F_RXHASH Jeff Kirsher
@ 2011-10-08 6:52 ` Jeff Kirsher
11 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2011-10-08 6:52 UTC (permalink / raw)
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, gospo@redhat.com, sassmann@redhat.com
On Fri, 2011-10-07 at 23:47 -0700, Kirsher, Jeffrey T wrote:
> The following series contains updates to igb only. They are a
> continuation of the cleanups and refactoring that Alex has done.
> After this series there are 4-5 more patches to complete the work
> that Alex has done on igb.
>
> The following are changes since commit
> 1d0861acfb24d0ca0661ff5a156b992b2c589458:
> Add ethtool -g support to 8139cp
> and are available in the git repository at
> git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next.git
> or
> git://github.com/Jkirsher/net-next.git
Even though I have my kernel.org tree back up and running, I will keep
the github tree updated (at least for now).
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [net-next 02/11] igb: Use node specific allocations for the q_vectors and rings
2011-10-08 6:47 ` [net-next 02/11] igb: Use node specific allocations for the q_vectors and rings Jeff Kirsher
@ 2011-10-08 19:51 ` David Miller
2011-10-10 16:15 ` Alexander Duyck
2011-10-09 18:08 ` Andi Kleen
1 sibling, 1 reply; 20+ messages in thread
From: David Miller @ 2011-10-08 19:51 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: alexander.h.duyck, netdev, gospo, sassmann
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Fri, 7 Oct 2011 23:47:32 -0700
> From: Alexander Duyck <alexander.h.duyck@intel.com>
>
> This change is meant to update the ring and vector allocations so that they
> are per node instead of allocating everything on the node that
> ifconfig/modprobe is called on. By doing this we can cut down
> significantly on cross node traffic.
>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
adapter->node seems superfluous.
It's always "-1" when we enter the allocation functions, and we
always restore it to it's original value upon exit from such
functions.
Just get rid of it and use a local variable in these functions
to keep track of the current allocation node.
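Something along these lines (an untested sketch, condensed; the
round-robin walk and fallback here are illustrative, not the actual
igb code):

static int igb_alloc_queues(struct igb_adapter *adapter)
{
	struct igb_ring *ring;
	int node = -1;	/* local variable replaces adapter->node */
	int i;

	for (i = 0; i < adapter->num_tx_queues; i++) {
		/* walk the online nodes round-robin */
		node = next_online_node(node);
		if (node == MAX_NUMNODES)
			node = first_online_node;

		ring = kzalloc_node(sizeof(*ring), GFP_KERNEL, node);
		if (!ring)
			ring = kzalloc(sizeof(*ring), GFP_KERNEL);
		if (!ring)
			return -ENOMEM;

		ring->numa_node = node;
		adapter->tx_ring[i] = ring;
	}

	return 0;
}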
Also, what ensures that MSI-X interrupts are targeted to a CPU
on the node where you've made these allocations? I was
pretty sure Ben Hutchings added infrastructure that's usable
to ensure this, but I can't see where you're using it.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [net-next 02/11] igb: Use node specific allocations for the q_vectors and rings
2011-10-08 6:47 ` [net-next 02/11] igb: Use node specific allocations for the q_vectors and rings Jeff Kirsher
2011-10-08 19:51 ` David Miller
@ 2011-10-09 18:08 ` Andi Kleen
2011-10-10 16:25 ` Alexander Duyck
1 sibling, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2011-10-09 18:08 UTC (permalink / raw)
To: Jeff Kirsher; +Cc: davem, Alexander Duyck, netdev, gospo, sassmann
Jeff Kirsher <jeffrey.t.kirsher@intel.com> writes:
>
> for (i = 0; i < adapter->num_tx_queues; i++) {
> - ring = kzalloc(sizeof(struct igb_ring), GFP_KERNEL);
> + if (orig_node == -1) {
> + int cur_node = next_online_node(adapter->node);
> + if (cur_node == MAX_NUMNODES)
> + cur_node = first_online_node;
RR seems quite arbitrary. Who guarantees those nodes have any
relationship with the CPUs submitting on those queues? Or the node
the device is on.
Anyways if it's a good idea probably need to add a
dma_alloc_coherent_node() too
-Andi
--
ak@linux.intel.com -- Speaking for myself only
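For illustration, such a helper could look roughly like this (hypothetical
sketch; dma_alloc_coherent_node() is not an existing kernel API, it merely
wraps the real dma_alloc_coherent() with a temporary device-node override):

static inline void *dma_alloc_coherent_node(int node, struct device *dev,
					    size_t size, dma_addr_t *handle,
					    gfp_t gfp)
{
	/* hypothetical sketch: prefer allocating on "node" by temporarily
	 * overriding the device's NUMA node, then restoring it */
	int orig_node = dev_to_node(dev);
	void *ret;

	set_dev_node(dev, node);
	ret = dma_alloc_coherent(dev, size, handle, gfp);
	set_dev_node(dev, orig_node);

	return ret;
}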
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [net-next 02/11] igb: Use node specific allocations for the q_vectors and rings
2011-10-08 19:51 ` David Miller
@ 2011-10-10 16:15 ` Alexander Duyck
2011-10-10 17:50 ` David Miller
0 siblings, 1 reply; 20+ messages in thread
From: Alexander Duyck @ 2011-10-10 16:15 UTC (permalink / raw)
To: David Miller; +Cc: jeffrey.t.kirsher, netdev, gospo, sassmann
On 10/08/2011 12:51 PM, David Miller wrote:
> From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Date: Fri, 7 Oct 2011 23:47:32 -0700
>
>> From: Alexander Duyck <alexander.h.duyck@intel.com>
>>
>> This change is meant to update the ring and vector allocations so that they
>> are per node instead of allocating everything on the node that
>> ifconfig/modprobe is called on. By doing this we can cut down
>> significantly on cross node traffic.
>>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
>> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>
> adapter->node seems superfluous.
>
> It's always "-1" when we enter the allocation functions, and we
> always restore it to it's original value upon exit from such
> functions.
>
> Just get rid of it and use a local variable in these functions
> to keep track of the current allocation node.
>
> Also, what ensures that MSI-X interrupts are targeted to a CPU
> on the node where you've made these allocations? I was
> pretty sure Ben Hutchings added infrastructure that's usable
> to ensure this, but I can't see where you're using it.
Actually, the main reason for having adapter->node is that in our
out-of-tree driver we end up using it as a module parameter in the event
that someone is running in single-queue mode and wants to split up the
ports between nodes. As such I would prefer to keep the parameter
around and just default it to -1 as I am currently doing. However, if it
must go, I guess I can work around that sync-up issue.
In this case we don't have any guarantee beyond the fact that most
people tuning for performance will arrange their IRQs in a round-robin
fashion. However, this approach is still preferable to allocating all of
the rings on one node and incurring the overhead of nearly all accesses
landing on a single node. The igb implementation doesn't yet have the
code in place for the IRQ affinity hints; it is one of the few things
remaining for me to sync up between igb and ixgbe, and it is on my list
of things to do. A rough idea of what that would look like is sketched
below.
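For reference, the affinity-hint setup would look something like the
following (illustrative sketch only; it assumes struct igb_q_vector gains
an affinity_mask cpumask member, as ixgbe has, and is not the current igb
code):

static void igb_set_affinity_hints(struct igb_adapter *adapter)
{
	int v_idx, cpu;

	for (v_idx = 0; v_idx < adapter->num_q_vectors; v_idx++) {
		struct igb_q_vector *q_vector = adapter->q_vector[v_idx];

		/* spread the vectors over the online CPUs round-robin */
		cpu = v_idx % num_online_cpus();
		cpumask_clear(&q_vector->affinity_mask);
		cpumask_set_cpu(cpu, &q_vector->affinity_mask);
		irq_set_affinity_hint(adapter->msix_entries[v_idx].vector,
				      &q_vector->affinity_mask);
	}
}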
Thanks,
Alex
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [net-next 02/11] igb: Use node specific allocations for the q_vectors and rings
2011-10-09 18:08 ` Andi Kleen
@ 2011-10-10 16:25 ` Alexander Duyck
2011-10-10 16:32 ` Andi Kleen
0 siblings, 1 reply; 20+ messages in thread
From: Alexander Duyck @ 2011-10-10 16:25 UTC (permalink / raw)
To: Andi Kleen; +Cc: Jeff Kirsher, davem, netdev, gospo, sassmann
On 10/09/2011 11:08 AM, Andi Kleen wrote:
> Jeff Kirsher <jeffrey.t.kirsher@intel.com> writes:
>>
>> for (i = 0; i < adapter->num_tx_queues; i++) {
>> - ring = kzalloc(sizeof(struct igb_ring), GFP_KERNEL);
>> + if (orig_node == -1) {
>> + int cur_node = next_online_node(adapter->node);
>> + if (cur_node == MAX_NUMNODES)
>> + cur_node = first_online_node;
>
> RR seems quite arbitrary. Who guarantees those nodes have any
> relationship with the CPUs submitting on those queues? Or the node
> the device is on.
>
> Anyways if it's a good idea probably need to add a
> dma_alloc_coherent_node() too
>
> -Andi
>
The RR configuration is somewhat arbitrary. However, it is still better
than dumping everything on a single node, and it works well in the
configuration where the ring numbers line up with the CPU numbers, since
normally the CPUs are assigned round-robin across the nodes. From what I
have seen it works quite well and prevents almost all cross-node memory
accesses when running a routing workload.
I was thinking along the same lines for dma_alloc_coherent_node as well.
I've been meaning to get to it but just haven't had the time. I'm
intentionally holding off on the ixgbe version of these patches until I
get the time to write up such a function, at which point I will write a
patch to convert igb over to it.
Thanks,
Alex
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [net-next 02/11] igb: Use node specific allocations for the q_vectors and rings
2011-10-10 16:25 ` Alexander Duyck
@ 2011-10-10 16:32 ` Andi Kleen
2011-10-10 17:02 ` Alexander Duyck
0 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2011-10-10 16:32 UTC (permalink / raw)
To: Alexander Duyck; +Cc: Andi Kleen, Jeff Kirsher, davem, netdev, gospo, sassmann
> The RR configuration is somewhat arbitrary. However, it is still better
> than dumping everything on a single node, and it works well in the
> configuration where the ring numbers line up with the CPU numbers, since
> normally the CPUs are assigned round-robin across the nodes. From what I
> have seen it works quite well and prevents almost all cross-node memory
> accesses when running a routing workload.
Ok, so it's optimized for one specific workload. I'm sure you'll
find some other workload where it doesn't work out.
I suppose it's hard to get right in the general case, but it would be
best if ethtool at least had a simple interface to set it.
However, one disadvantage of this patch compared to the existing state of
the art (numactl modprobe ...) is that there is now no way to override the
placement. So if you do the forced RR I think you need the ethtool part
too, or at least a module parameter to turn it off.
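For example, a hypothetical module parameter along these lines (sketch
only, not a proposal for this exact interface):

/* hypothetical knob: pin ring/q_vector allocations to one node,
 * or -1 to keep the round-robin-across-nodes behaviour */
static int alloc_node = -1;
module_param(alloc_node, int, 0444);
MODULE_PARM_DESC(alloc_node,
		 "NUMA node for ring/q_vector allocations (-1 = round-robin)");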
-Andi
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [net-next 02/11] igb: Use node specific allocations for the q_vectors and rings
2011-10-10 16:32 ` Andi Kleen
@ 2011-10-10 17:02 ` Alexander Duyck
0 siblings, 0 replies; 20+ messages in thread
From: Alexander Duyck @ 2011-10-10 17:02 UTC (permalink / raw)
To: Andi Kleen; +Cc: Jeff Kirsher, davem, netdev, gospo, sassmann
On 10/10/2011 09:32 AM, Andi Kleen wrote:
>> The RR configuration is somewhat arbitrary. However, it is still better
>> than dumping everything on a single node, and it works well in the
>> configuration where the ring numbers line up with the CPU numbers, since
>> normally the CPUs are assigned round-robin across the nodes. From what I
>> have seen it works quite well and prevents almost all cross-node memory
>> accesses when running a routing workload.
>
> Ok so it's optimized for one specific workload. I'm sure you'll
> find some other workload where it doesn't work out.
It isn't that I optimized it for one specific workload; I was just
citing that workload as one of the ones that sees the advantage.
> I suppose it's hard to get right in the general case, but it would be
> best if ethtool at least had a simple interface to set it.
It seems the general case is never right for this. At least with this
approach it becomes much easier to line up the memory and interrupts so
that they are all affinitized to the same core. From there, RPS/RFS can
typically be used to spread the work out further if necessary.
> However, one disadvantage of this patch compared to the existing state of
> the art (numactl modprobe ...) is that there is now no way to override the
> placement. So if you do the forced RR I think you need the ethtool part
> too, or at least a module parameter to turn it off.
>
> -Andi
The counter-argument, though, is that the approach you mention always
limits you to one node. At least with this approach we are spread out
over multiple nodes, so we can make full use of the memory bandwidth on
the system.
Thanks,
Alex
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [net-next 02/11] igb: Use node specific allocations for the q_vectors and rings
2011-10-10 16:15 ` Alexander Duyck
@ 2011-10-10 17:50 ` David Miller
0 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2011-10-10 17:50 UTC (permalink / raw)
To: alexander.h.duyck; +Cc: jeffrey.t.kirsher, netdev, gospo, sassmann
From: Alexander Duyck <alexander.h.duyck@intel.com>
Date: Mon, 10 Oct 2011 09:15:02 -0700
> Actually, the main reason for having adapter->node is that in our
> out-of-tree driver we end up using it as a module parameter in the event
> that someone is running in single-queue mode and wants to split up the
> ports between nodes. As such I would prefer to keep the parameter
> around and just default it to -1 as I am currently doing. However, if it
> must go, I guess I can work around that sync-up issue.
Please stop adding such hacks to your out-of-tree driver and add
appropriate, generic configuration mechanisms to the upstream tree.
It absolutely is not appropriate to add something which is completely
useless to the upstream tree for the sake of something being done
only externally.
You guys are the best at upstream net driver maintenance, so it
really surprises me that you continue to do completely unacceptable
crap like this. Write the necessary generic, non-module-option
mechanisms to facilitate the features you need and kill your
out-of-tree driver _now_.
^ permalink raw reply [flat|nested] 20+ messages in thread
End of thread, other threads: [~2011-10-10 17:50 UTC | newest]
Thread overview: 20+ messages
2011-10-08 6:47 [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
2011-10-08 6:47 ` [net-next 01/11] igb: push data into first igb_tx_buffer sooner to reduce stack usage Jeff Kirsher
2011-10-08 6:47 ` [net-next 02/11] igb: Use node specific allocations for the q_vectors and rings Jeff Kirsher
2011-10-08 19:51 ` David Miller
2011-10-10 16:15 ` Alexander Duyck
2011-10-10 17:50 ` David Miller
2011-10-09 18:08 ` Andi Kleen
2011-10-10 16:25 ` Alexander Duyck
2011-10-10 16:32 ` Andi Kleen
2011-10-10 17:02 ` Alexander Duyck
2011-10-08 6:47 ` [net-next 03/11] igb: avoid unnecessary conversions from u16 to int Jeff Kirsher
2011-10-08 6:47 ` [net-next 04/11] igb: Consolidate all of the ring feature flags into a single value Jeff Kirsher
2011-10-08 6:47 ` [net-next 05/11] igb: Move ITR related data into work container within the q_vector Jeff Kirsher
2011-10-08 6:47 ` [net-next 06/11] igb: cleanup IVAR configuration Jeff Kirsher
2011-10-08 6:47 ` [net-next 07/11] igb: retire the RX_CSUM flag and use the netdev flag instead Jeff Kirsher
2011-10-08 6:47 ` [net-next 08/11] igb: leave staterr in place and instead use a helper function to check bits Jeff Kirsher
2011-10-08 6:47 ` [net-next 09/11] igb: fix recent VLAN changes that would leave VLANs disabled after reset Jeff Kirsher
2011-10-08 6:47 ` [net-next 10/11] igb: move TX hang check flag into ring->flags Jeff Kirsher
2011-10-08 6:47 ` [net-next 11/11] igb: add support for NETIF_F_RXHASH Jeff Kirsher
2011-10-08 6:52 ` [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher