Netdev List
* [net-next 3/7] ixgbevf: remove counters for Tx/Rx checksum offload
From: Aaron Brown @ 2014-01-18  2:30 UTC
  To: davem; +Cc: Emil Tantilov, netdev, gospo, sassmann, Alexander Duyck,
	Aaron Brown
In-Reply-To: <1390012205-21995-1-git-send-email-aaron.f.brown@intel.com>

From: Emil Tantilov <emil.s.tantilov@intel.com>

This patch removes the Tx/Rx counters for checksum offload.

The Tx counter was never updated and the Rx counter is of limited use.
This is part of an effort to clean up the counters and make them
consistent with the counters shown by ixgbe.

This patch also removes some members of the adapter structure that were
never used and reorders others to reduce the number of holes.

before:
	/* size: 1568, cachelines: 25, members: 48 */
	/* sum members: 1519, holes: 10, sum holes: 43 */
	/* padding: 6 */
	/* last cacheline: 32 bytes */

after:
	/* size: 1480, cachelines: 24, members: 43 */
	/* sum members: 1479, holes: 1, sum holes: 1 */
	/* last cacheline: 8 bytes */

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ethtool.c      |  2 --
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      | 23 +++++++++--------------
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |  4 ----
 3 files changed, 9 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ethtool.c b/drivers/net/ethernet/intel/ixgbevf/ethtool.c
index 0769306..b48df78 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ethtool.c
@@ -79,9 +79,7 @@ static const struct ixgbe_stats ixgbe_gstrings_stats[] = {
 	{"tx_busy", IXGBEVF_ZSTAT(tx_busy)},
 	{"multicast", IXGBEVF_STAT(stats.vfmprc, stats.base_vfmprc,
 				   stats.saved_reset_vfmprc)},
-	{"rx_csum_offload_good", IXGBEVF_ZSTAT(hw_csum_rx_good)},
 	{"rx_csum_offload_errors", IXGBEVF_ZSTAT(hw_csum_rx_error)},
-	{"tx_csum_offload_ctxt", IXGBEVF_ZSTAT(hw_csum_tx_good)},
 #ifdef BP_EXTENDED_STATS
 	{"rx_bp_poll_yield", IXGBEVF_ZSTAT(bp_rx_yields)},
 	{"rx_bp_cleaned", IXGBEVF_ZSTAT(bp_rx_cleaned)},
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
index 0642bd2..0068428 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
@@ -106,7 +106,6 @@ struct ixgbevf_ring {
 	};
 
 	u64 hw_csum_rx_error;
-	u64 hw_csum_rx_good;
 	u8 __iomem *tail;
 
 	u16 reg_idx; /* holds the special value that gets the hardware register
@@ -336,7 +335,6 @@ static inline u16 ixgbevf_desc_unused(struct ixgbevf_ring *ring)
 struct ixgbevf_adapter {
 	struct timer_list watchdog_timer;
 	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
-	u16 bd_number;
 	struct work_struct reset_task;
 	struct ixgbevf_q_vector *q_vector[MAX_MSIX_Q_VECTORS];
 
@@ -349,25 +347,18 @@ struct ixgbevf_adapter {
 	u32 eims_other;
 
 	/* TX */
-	struct ixgbevf_ring *tx_ring[MAX_TX_QUEUES]; /* One per active queue */
 	int num_tx_queues;
+	struct ixgbevf_ring *tx_ring[MAX_TX_QUEUES]; /* One per active queue */
 	u64 restart_queue;
-	u64 hw_csum_tx_good;
-	u64 lsc_int;
-	u64 hw_tso_ctxt;
-	u64 hw_tso6_ctxt;
 	u32 tx_timeout_count;
 
 	/* RX */
-	struct ixgbevf_ring *rx_ring[MAX_TX_QUEUES]; /* One per active queue */
 	int num_rx_queues;
+	struct ixgbevf_ring *rx_ring[MAX_TX_QUEUES]; /* One per active queue */
 	u64 hw_csum_rx_error;
 	u64 hw_rx_no_dma_resources;
-	u64 hw_csum_rx_good;
 	u64 non_eop_descs;
 	int num_msix_vectors;
-	struct msix_entry *msix_entries;
-
 	u32 alloc_rx_page_failed;
 	u32 alloc_rx_buff_failed;
 
@@ -379,6 +370,8 @@ struct ixgbevf_adapter {
 #define IXGBE_FLAG_IN_NETPOLL                   (u32)(1 << 1)
 #define IXGBEVF_FLAG_QUEUE_RESET_REQUESTED	(u32)(1 << 2)
 
+	struct msix_entry *msix_entries;
+
 	/* OS defined structs */
 	struct net_device *netdev;
 	struct pci_dev *pdev;
@@ -386,10 +379,12 @@ struct ixgbevf_adapter {
 	/* structs defined in ixgbe_vf.h */
 	struct ixgbe_hw hw;
 	u16 msg_enable;
-	struct ixgbevf_hw_stats stats;
+	u16 bd_number;
 	/* Interrupt Throttle Rate */
 	u32 eitr_param;
 
+	struct ixgbevf_hw_stats stats;
+
 	unsigned long state;
 	u64 tx_busy;
 	unsigned int tx_ring_count;
@@ -408,9 +403,9 @@ struct ixgbevf_adapter {
 	u32 link_speed;
 	bool link_up;
 
-	struct work_struct watchdog_task;
-
 	spinlock_t mbx_lock;
+
+	struct work_struct watchdog_task;
 };
 
 enum ixbgevf_state_t {
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 77ddda6..41b72ed 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -357,7 +357,6 @@ static inline void ixgbevf_rx_checksum(struct ixgbevf_ring *ring,
 
 	/* It must be a TCP or UDP packet with a valid checksum */
 	skb->ip_summed = CHECKSUM_UNNECESSARY;
-	ring->hw_csum_rx_good++;
 }
 
 /**
@@ -2263,10 +2262,7 @@ void ixgbevf_update_stats(struct ixgbevf_adapter *adapter)
 	for (i = 0;  i  < adapter->num_rx_queues;  i++) {
 		adapter->hw_csum_rx_error +=
 			adapter->rx_ring[i]->hw_csum_rx_error;
-		adapter->hw_csum_rx_good +=
-			adapter->rx_ring[i]->hw_csum_rx_good;
 		adapter->rx_ring[i]->hw_csum_rx_error = 0;
-		adapter->rx_ring[i]->hw_csum_rx_good = 0;
 	}
 }
 
-- 
1.8.5.GIT
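
The hole reduction reported in the 3/7 changelog follows from C alignment rules: a small member placed in front of an 8-byte member usually forces the compiler to pad up to the next 8-byte boundary. The sketch below illustrates this with two hypothetical structs. They are not the real struct ixgbevf_adapter layout, the member names are borrowed only for flavour, and the exact sizes depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts for illustration; not the real ixgbevf_adapter. */
struct holey {
	uint16_t bd_number;	/* 2 bytes, usually followed by a hole */
	uint64_t restart_queue;
	uint16_t msg_enable;	/* 2 bytes, usually followed by padding */
	uint64_t tx_busy;
};

struct reordered {
	uint64_t restart_queue;
	uint64_t tx_busy;
	uint16_t bd_number;	/* the two small members now sit together */
	uint16_t msg_enable;
};

/* Sum of the member sizes, identical for both layouts. */
#define MEMBER_BYTES (2 * sizeof(uint64_t) + 2 * sizeof(uint16_t))

/* Bytes added by the compiler (holes plus tail padding). */
static size_t holey_padding(void)
{
	return sizeof(struct holey) - MEMBER_BYTES;
}

static size_t reordered_padding(void)
{
	return sizeof(struct reordered) - MEMBER_BYTES;
}

/* Grouping members by size should never make the struct larger. */
static_assert(sizeof(struct reordered) <= sizeof(struct holey),
	      "reordering should not grow the struct");
```

On a typical 64-bit ABI the first layout is expected to carry more padding than the second. The "size / holes / padding" figures quoted in the changelog have the shape of pahole output, which reports the same information for real kernel structures.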


* [net-next 4/7] ixgbevf: add tx counters
From: Aaron Brown @ 2014-01-18  2:30 UTC
  To: davem; +Cc: Emil Tantilov, netdev, gospo, sassmann, Alexander Duyck,
	Aaron Brown
In-Reply-To: <1390012205-21995-1-git-send-email-aaron.f.brown@intel.com>

From: Emil Tantilov <emil.s.tantilov@intel.com>

This patch adds the tx_restart_queue and tx_timeout_count counters to the
ethtool statistics.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ethtool.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ethtool.c b/drivers/net/ethernet/intel/ixgbevf/ethtool.c
index b48df78..f68b78c 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ethtool.c
@@ -77,6 +77,8 @@ static const struct ixgbe_stats ixgbe_gstrings_stats[] = {
 	{"tx_bytes", IXGBEVF_STAT(stats.vfgotc, stats.base_vfgotc,
 				  stats.saved_reset_vfgotc)},
 	{"tx_busy", IXGBEVF_ZSTAT(tx_busy)},
+	{"tx_restart_queue", IXGBEVF_ZSTAT(restart_queue)},
+	{"tx_timeout_count", IXGBEVF_ZSTAT(tx_timeout_count)},
 	{"multicast", IXGBEVF_STAT(stats.vfmprc, stats.base_vfmprc,
 				   stats.saved_reset_vfmprc)},
 	{"rx_csum_offload_errors", IXGBEVF_ZSTAT(hw_csum_rx_error)},
-- 
1.8.5.GIT
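
The 4/7 patch only adds two rows to a name-to-field table. The bodies of IXGBEVF_STAT and IXGBEVF_ZSTAT are not shown in this thread, so the following is a sketch of the general pattern under assumptions: each row records a display name plus the size and offset of a field, and a reader walks the table. All names here (demo_adapter, demo_stat, DEMO_STAT, demo_read_stat) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the adapter; only two counters are modelled. */
struct demo_adapter {
	uint64_t restart_queue;
	uint32_t tx_timeout_count;
};

struct demo_stat {
	const char *name;	/* string shown to the user */
	size_t size;		/* width of the backing field in bytes */
	size_t offset;		/* offset of the field within demo_adapter */
};

#define DEMO_STAT(n, m) \
	{ n, sizeof(((struct demo_adapter *)0)->m), \
	  offsetof(struct demo_adapter, m) }

static const struct demo_stat demo_stats[] = {
	DEMO_STAT("tx_restart_queue", restart_queue),
	DEMO_STAT("tx_timeout_count", tx_timeout_count),
};

#define DEMO_STATS_LEN (sizeof(demo_stats) / sizeof(demo_stats[0]))

/* Look a counter up by name and widen it to 64 bits; 0 if unknown. */
static uint64_t demo_read_stat(const struct demo_adapter *a, const char *name)
{
	size_t i;

	assert(a && name);
	for (i = 0; i < DEMO_STATS_LEN; i++) {
		const char *p = (const char *)a + demo_stats[i].offset;

		if (strcmp(demo_stats[i].name, name) != 0)
			continue;
		if (demo_stats[i].size == sizeof(uint64_t)) {
			uint64_t v;

			memcpy(&v, p, sizeof(v));
			return v;
		} else {
			uint32_t v;

			memcpy(&v, p, sizeof(v));
			return v;
		}
	}
	return 0;
}
```

Recording the field size alongside the offset is what lets a u64 counter and a u32 counter share one table, which matters here because restart_queue is a u64 while tx_timeout_count is a u32 in the adapter structure.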


* [net-next 5/7] ixgbevf: make the first tx_buffer a repository for most of the skb info
From: Aaron Brown @ 2014-01-18  2:30 UTC
  To: davem; +Cc: Emil Tantilov, netdev, gospo, sassmann, Alexander Duyck,
	Aaron Brown
In-Reply-To: <1390012205-21995-1-git-send-email-aaron.f.brown@intel.com>

From: Emil Tantilov <emil.s.tantilov@intel.com>

This change makes it so that the first tx_buffer structure acts as a
central storage location for most of the info about the skb we are about
to transmit.

In addition, this patch makes tx_flags part of the ixgbevf_tx_buffer struct.
This allows us to use the flags directly from the structure and, as a result,
removes the tx_flags parameter from some functions. As a cleanup,
mapped_as_page is folded into tx_flags and some unused flags are removed.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |  12 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 213 +++++++++++++---------
 2 files changed, 130 insertions(+), 95 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
index 0068428..bad3219 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
@@ -46,12 +46,15 @@
 /* wrapper around a pointer to a socket buffer,
  * so a DMA handle can be stored along with the buffer */
 struct ixgbevf_tx_buffer {
+	union ixgbe_adv_tx_desc *next_to_watch;
+	unsigned long time_stamp;
 	struct sk_buff *skb;
+	unsigned int bytecount;
+	unsigned short gso_segs;
+	__be16 protocol;
 	dma_addr_t dma;
-	unsigned long time_stamp;
-	union ixgbe_adv_tx_desc *next_to_watch;
+	u32 tx_flags;
 	u16 length;
-	u16 mapped_as_page;
 };
 
 struct ixgbevf_rx_buffer {
@@ -144,8 +147,7 @@ struct ixgbevf_ring {
 #define IXGBE_TX_FLAGS_VLAN		(u32)(1 << 1)
 #define IXGBE_TX_FLAGS_TSO		(u32)(1 << 2)
 #define IXGBE_TX_FLAGS_IPV4		(u32)(1 << 3)
-#define IXGBE_TX_FLAGS_FCOE		(u32)(1 << 4)
-#define IXGBE_TX_FLAGS_FSO		(u32)(1 << 5)
+#define IXGBE_TX_FLAGS_MAPPED_AS_PAGE	(u32)(1 << 4)
 #define IXGBE_TX_FLAGS_VLAN_MASK	0xffff0000
 #define IXGBE_TX_FLAGS_VLAN_PRIO_MASK	0x0000e000
 #define IXGBE_TX_FLAGS_VLAN_SHIFT	16
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 41b72ed..61425f8 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -149,7 +149,7 @@ static void ixgbevf_unmap_and_free_tx_resource(struct ixgbevf_ring *tx_ring,
 					       *tx_buffer_info)
 {
 	if (tx_buffer_info->dma) {
-		if (tx_buffer_info->mapped_as_page)
+		if (tx_buffer_info->tx_flags & IXGBE_TX_FLAGS_MAPPED_AS_PAGE)
 			dma_unmap_page(tx_ring->dev,
 				       tx_buffer_info->dma,
 				       tx_buffer_info->length,
@@ -187,20 +187,21 @@ static bool ixgbevf_clean_tx_irq(struct ixgbevf_q_vector *q_vector,
 				 struct ixgbevf_ring *tx_ring)
 {
 	struct ixgbevf_adapter *adapter = q_vector->adapter;
-	union ixgbe_adv_tx_desc *tx_desc, *eop_desc;
-	struct ixgbevf_tx_buffer *tx_buffer_info;
-	unsigned int i, count = 0;
+	struct ixgbevf_tx_buffer *tx_buffer;
+	union ixgbe_adv_tx_desc *tx_desc;
 	unsigned int total_bytes = 0, total_packets = 0;
+	unsigned int budget = tx_ring->count / 2;
+	unsigned int i = tx_ring->next_to_clean;
 
 	if (test_bit(__IXGBEVF_DOWN, &adapter->state))
 		return true;
 
-	i = tx_ring->next_to_clean;
-	tx_buffer_info = &tx_ring->tx_buffer_info[i];
-	eop_desc = tx_buffer_info->next_to_watch;
+	tx_buffer = &tx_ring->tx_buffer_info[i];
+	tx_desc = IXGBEVF_TX_DESC(tx_ring, i);
+	i -= tx_ring->count;
 
 	do {
-		bool cleaned = false;
+		union ixgbe_adv_tx_desc *eop_desc = tx_buffer->next_to_watch;
 
 		/* if next_to_watch is not set then there is no work pending */
 		if (!eop_desc)
@@ -214,67 +215,77 @@ static bool ixgbevf_clean_tx_irq(struct ixgbevf_q_vector *q_vector,
 			break;
 
 		/* clear next_to_watch to prevent false hangs */
-		tx_buffer_info->next_to_watch = NULL;
+		tx_buffer->next_to_watch = NULL;
 
-		for ( ; !cleaned; count++) {
-			struct sk_buff *skb;
-			tx_desc = IXGBEVF_TX_DESC(tx_ring, i);
-			cleaned = (tx_desc == eop_desc);
-			skb = tx_buffer_info->skb;
-
-			if (cleaned && skb) {
-				unsigned int segs, bytecount;
-
-				/* gso_segs is currently only valid for tcp */
-				segs = skb_shinfo(skb)->gso_segs ?: 1;
-				/* multiply data chunks by size of headers */
-				bytecount = ((segs - 1) * skb_headlen(skb)) +
-					    skb->len;
-				total_packets += segs;
-				total_bytes += bytecount;
-			}
+		/* update the statistics for this packet */
+		total_bytes += tx_buffer->bytecount;
+		total_packets += tx_buffer->gso_segs;
 
-			ixgbevf_unmap_and_free_tx_resource(tx_ring,
-							   tx_buffer_info);
+		/* clear tx_buffer data */
+		ixgbevf_unmap_and_free_tx_resource(tx_ring, tx_buffer);
 
+		/* unmap remaining buffers */
+		while (tx_desc != eop_desc) {
 			tx_desc->wb.status = 0;
 
+			tx_buffer++;
+			tx_desc++;
 			i++;
-			if (i == tx_ring->count)
-				i = 0;
+			if (unlikely(!i)) {
+				i -= tx_ring->count;
+				tx_buffer = tx_ring->tx_buffer_info;
+				tx_desc = IXGBEVF_TX_DESC(tx_ring, 0);
+			}
 
-			tx_buffer_info = &tx_ring->tx_buffer_info[i];
+			ixgbevf_unmap_and_free_tx_resource(tx_ring, tx_buffer);
 		}
 
-		eop_desc = tx_buffer_info->next_to_watch;
-	} while (count < tx_ring->count);
+		tx_desc->wb.status = 0;
 
+		/* move us one more past the eop_desc for start of next pkt */
+		tx_buffer++;
+		tx_desc++;
+		i++;
+		if (unlikely(!i)) {
+			i -= tx_ring->count;
+			tx_buffer = tx_ring->tx_buffer_info;
+			tx_desc = IXGBEVF_TX_DESC(tx_ring, 0);
+		}
+
+		/* issue prefetch for next Tx descriptor */
+		prefetch(tx_desc);
+
+		/* update budget accounting */
+		budget--;
+	} while (likely(budget));
+
+	i += tx_ring->count;
 	tx_ring->next_to_clean = i;
+	u64_stats_update_begin(&tx_ring->syncp);
+	tx_ring->stats.bytes += total_bytes;
+	tx_ring->stats.packets += total_packets;
+	u64_stats_update_end(&tx_ring->syncp);
+	q_vector->tx.total_bytes += total_bytes;
+	q_vector->tx.total_packets += total_packets;
 
 #define TX_WAKE_THRESHOLD (DESC_NEEDED * 2)
-	if (unlikely(count && netif_carrier_ok(tx_ring->netdev) &&
+	if (unlikely(total_packets && netif_carrier_ok(tx_ring->netdev) &&
 		     (ixgbevf_desc_unused(tx_ring) >= TX_WAKE_THRESHOLD))) {
 		/* Make sure that anybody stopping the queue after this
 		 * sees the new next_to_clean.
 		 */
 		smp_mb();
+
 		if (__netif_subqueue_stopped(tx_ring->netdev,
 					     tx_ring->queue_index) &&
 		    !test_bit(__IXGBEVF_DOWN, &adapter->state)) {
 			netif_wake_subqueue(tx_ring->netdev,
 					    tx_ring->queue_index);
-			++adapter->restart_queue;
+			++tx_ring->tx_stats.restart_queue;
 		}
 	}
 
-	u64_stats_update_begin(&tx_ring->syncp);
-	tx_ring->stats.bytes += total_bytes;
-	tx_ring->stats.packets += total_packets;
-	u64_stats_update_end(&tx_ring->syncp);
-	q_vector->tx.total_bytes += total_bytes;
-	q_vector->tx.total_packets += total_packets;
-
-	return count < tx_ring->count;
+	return !!budget;
 }
 
 /**
@@ -2759,8 +2770,10 @@ static void ixgbevf_tx_ctxtdesc(struct ixgbevf_ring *tx_ring,
 }
 
 static int ixgbevf_tso(struct ixgbevf_ring *tx_ring,
-		       struct sk_buff *skb, u32 tx_flags, u8 *hdr_len)
+		       struct ixgbevf_tx_buffer *first,
+		       u8 *hdr_len)
 {
+	struct sk_buff *skb = first->skb;
 	u32 vlan_macip_lens, type_tucmd;
 	u32 mss_l4len_idx, l4len;
 
@@ -2785,12 +2798,17 @@ static int ixgbevf_tso(struct ixgbevf_ring *tx_ring,
 							 IPPROTO_TCP,
 							 0);
 		type_tucmd |= IXGBE_ADVTXD_TUCMD_IPV4;
+		first->tx_flags |= IXGBE_TX_FLAGS_TSO |
+				   IXGBE_TX_FLAGS_CSUM |
+				   IXGBE_TX_FLAGS_IPV4;
 	} else if (skb_is_gso_v6(skb)) {
 		ipv6_hdr(skb)->payload_len = 0;
 		tcp_hdr(skb)->check =
 		    ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
 				     &ipv6_hdr(skb)->daddr,
 				     0, IPPROTO_TCP, 0);
+		first->tx_flags |= IXGBE_TX_FLAGS_TSO |
+				   IXGBE_TX_FLAGS_CSUM;
 	}
 
 	/* compute header lengths */
@@ -2798,6 +2816,10 @@ static int ixgbevf_tso(struct ixgbevf_ring *tx_ring,
 	*hdr_len += l4len;
 	*hdr_len = skb_transport_offset(skb) + l4len;
 
+	/* update gso size and bytecount with header size */
+	first->gso_segs = skb_shinfo(skb)->gso_segs;
+	first->bytecount += (first->gso_segs - 1) * *hdr_len;
+
 	/* mss_l4len_id: use 1 as index for TSO */
 	mss_l4len_idx = l4len << IXGBE_ADVTXD_L4LEN_SHIFT;
 	mss_l4len_idx |= skb_shinfo(skb)->gso_size << IXGBE_ADVTXD_MSS_SHIFT;
@@ -2806,7 +2828,7 @@ static int ixgbevf_tso(struct ixgbevf_ring *tx_ring,
 	/* vlan_macip_lens: HEADLEN, MACLEN, VLAN tag */
 	vlan_macip_lens = skb_network_header_len(skb);
 	vlan_macip_lens |= skb_network_offset(skb) << IXGBE_ADVTXD_MACLEN_SHIFT;
-	vlan_macip_lens |= tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
+	vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
 
 	ixgbevf_tx_ctxtdesc(tx_ring, vlan_macip_lens,
 			    type_tucmd, mss_l4len_idx);
@@ -2814,9 +2836,10 @@ static int ixgbevf_tso(struct ixgbevf_ring *tx_ring,
 	return 1;
 }
 
-static bool ixgbevf_tx_csum(struct ixgbevf_ring *tx_ring,
-			    struct sk_buff *skb, u32 tx_flags)
+static void ixgbevf_tx_csum(struct ixgbevf_ring *tx_ring,
+			    struct ixgbevf_tx_buffer *first)
 {
+	struct sk_buff *skb = first->skb;
 	u32 vlan_macip_lens = 0;
 	u32 mss_l4len_idx = 0;
 	u32 type_tucmd = 0;
@@ -2837,7 +2860,7 @@ static bool ixgbevf_tx_csum(struct ixgbevf_ring *tx_ring,
 			if (unlikely(net_ratelimit())) {
 				dev_warn(tx_ring->dev,
 				 "partial checksum but proto=%x!\n",
-				 skb->protocol);
+				 first->protocol);
 			}
 			break;
 		}
@@ -2865,21 +2888,23 @@ static bool ixgbevf_tx_csum(struct ixgbevf_ring *tx_ring,
 			}
 			break;
 		}
+
+		/* update TX checksum flag */
+		first->tx_flags |= IXGBE_TX_FLAGS_CSUM;
 	}
 
 	/* vlan_macip_lens: MACLEN, VLAN tag */
 	vlan_macip_lens |= skb_network_offset(skb) << IXGBE_ADVTXD_MACLEN_SHIFT;
-	vlan_macip_lens |= tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
+	vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
 
 	ixgbevf_tx_ctxtdesc(tx_ring, vlan_macip_lens,
 			    type_tucmd, mss_l4len_idx);
-
-	return (skb->ip_summed == CHECKSUM_PARTIAL);
 }
 
 static int ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
-			  struct sk_buff *skb, u32 tx_flags)
+			  struct ixgbevf_tx_buffer *first)
 {
+	struct sk_buff *skb = first->skb;
 	struct ixgbevf_tx_buffer *tx_buffer_info;
 	unsigned int len;
 	unsigned int total = skb->len;
@@ -2897,7 +2922,7 @@ static int ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
 		size = min(len, (unsigned int)IXGBE_MAX_DATA_PER_TXD);
 
 		tx_buffer_info->length = size;
-		tx_buffer_info->mapped_as_page = false;
+		tx_buffer_info->tx_flags = first->tx_flags;
 		tx_buffer_info->dma = dma_map_single(tx_ring->dev,
 						     skb->data + offset,
 						     size, DMA_TO_DEVICE);
@@ -2928,10 +2953,11 @@ static int ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
 			tx_buffer_info->dma =
 				skb_frag_dma_map(tx_ring->dev, frag,
 						 offset, size, DMA_TO_DEVICE);
+			tx_buffer_info->tx_flags |=
+						IXGBE_TX_FLAGS_MAPPED_AS_PAGE;
 			if (dma_mapping_error(tx_ring->dev,
 					      tx_buffer_info->dma))
 				goto dma_error;
-			tx_buffer_info->mapped_as_page = true;
 
 			len -= size;
 			total -= size;
@@ -2949,7 +2975,9 @@ static int ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
 		i = tx_ring->count - 1;
 	else
 		i = i - 1;
-	tx_ring->tx_buffer_info[i].skb = skb;
+
+	first->next_to_watch = IXGBEVF_TX_DESC(tx_ring, i);
+	first->time_stamp = jiffies;
 
 	return count;
 
@@ -2973,13 +3001,15 @@ dma_error:
 	return count;
 }
 
-static void ixgbevf_tx_queue(struct ixgbevf_ring *tx_ring, int tx_flags,
-			     int count, unsigned int first, u32 paylen,
-			     u8 hdr_len)
+static void ixgbevf_tx_queue(struct ixgbevf_ring *tx_ring,
+			     struct ixgbevf_tx_buffer *first,
+			     int count, u8 hdr_len)
 {
 	union ixgbe_adv_tx_desc *tx_desc = NULL;
+	struct sk_buff *skb = first->skb;
 	struct ixgbevf_tx_buffer *tx_buffer_info;
 	u32 olinfo_status = 0, cmd_type_len = 0;
+	u32 tx_flags = first->tx_flags;
 	unsigned int i;
 
 	u32 txd_cmd = IXGBE_TXD_CMD_EOP | IXGBE_TXD_CMD_RS | IXGBE_TXD_CMD_IFCS;
@@ -3009,7 +3039,7 @@ static void ixgbevf_tx_queue(struct ixgbevf_ring *tx_ring, int tx_flags,
 	 */
 	olinfo_status |= IXGBE_ADVTXD_CC;
 
-	olinfo_status |= ((paylen - hdr_len) << IXGBE_ADVTXD_PAYLEN_SHIFT);
+	olinfo_status |= ((skb->len - hdr_len) << IXGBE_ADVTXD_PAYLEN_SHIFT);
 
 	i = tx_ring->next_to_use;
 	while (count--) {
@@ -3026,16 +3056,6 @@ static void ixgbevf_tx_queue(struct ixgbevf_ring *tx_ring, int tx_flags,
 
 	tx_desc->read.cmd_type_len |= cpu_to_le32(txd_cmd);
 
-	tx_ring->tx_buffer_info[first].time_stamp = jiffies;
-
-	/* Force memory writes to complete before letting h/w
-	 * know there are new descriptors to fetch.  (Only
-	 * applicable for weak-ordered memory model archs,
-	 * such as IA-64).
-	 */
-	wmb();
-
-	tx_ring->tx_buffer_info[first].next_to_watch = tx_desc;
 	tx_ring->next_to_use = i;
 }
 
@@ -3069,22 +3089,23 @@ static int ixgbevf_maybe_stop_tx(struct ixgbevf_ring *tx_ring, int size)
 static int ixgbevf_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 {
 	struct ixgbevf_adapter *adapter = netdev_priv(netdev);
+	struct ixgbevf_tx_buffer *first;
 	struct ixgbevf_ring *tx_ring;
-	unsigned int first;
-	unsigned int tx_flags = 0;
-	u8 hdr_len = 0;
-	int r_idx = 0, tso;
+	int tso;
+	u32 tx_flags = 0;
 	u16 count = TXD_USE_COUNT(skb_headlen(skb));
 #if PAGE_SIZE > IXGBE_MAX_DATA_PER_TXD
 	unsigned short f;
 #endif
+	u8 hdr_len = 0;
 	u8 *dst_mac = skb_header_pointer(skb, 0, 0, NULL);
+
 	if (!dst_mac || is_link_local_ether_addr(dst_mac)) {
 		dev_kfree_skb(skb);
 		return NETDEV_TX_OK;
 	}
 
-	tx_ring = adapter->tx_ring[r_idx];
+	tx_ring = adapter->tx_ring[skb->queue_mapping];
 
 	/*
 	 * need: 1 descriptor per page * PAGE_SIZE/IXGBE_MAX_DATA_PER_TXD,
@@ -3104,36 +3125,48 @@ static int ixgbevf_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 		return NETDEV_TX_BUSY;
 	}
 
+	/* record the location of the first descriptor for this packet */
+	first = &tx_ring->tx_buffer_info[tx_ring->next_to_use];
+	first->skb = skb;
+	first->bytecount = skb->len;
+	first->gso_segs = 1;
+
 	if (vlan_tx_tag_present(skb)) {
 		tx_flags |= vlan_tx_tag_get(skb);
 		tx_flags <<= IXGBE_TX_FLAGS_VLAN_SHIFT;
 		tx_flags |= IXGBE_TX_FLAGS_VLAN;
 	}
 
-	first = tx_ring->next_to_use;
+	/* record initial flags and protocol */
+	first->tx_flags = tx_flags;
+	first->protocol = vlan_get_protocol(skb);
 
-	if (skb->protocol == htons(ETH_P_IP))
-		tx_flags |= IXGBE_TX_FLAGS_IPV4;
-	tso = ixgbevf_tso(tx_ring, skb, tx_flags, &hdr_len);
-	if (tso < 0) {
-		dev_kfree_skb_any(skb);
-		return NETDEV_TX_OK;
-	}
+	tso = ixgbevf_tso(tx_ring, first, &hdr_len);
+	if (tso < 0)
+		goto out_drop;
+	else
+		ixgbevf_tx_csum(tx_ring, first);
 
-	if (tso)
-		tx_flags |= IXGBE_TX_FLAGS_TSO | IXGBE_TX_FLAGS_CSUM;
-	else if (ixgbevf_tx_csum(tx_ring, skb, tx_flags))
-		tx_flags |= IXGBE_TX_FLAGS_CSUM;
+	ixgbevf_tx_queue(tx_ring, first,
+			 ixgbevf_tx_map(tx_ring, first), hdr_len);
 
-	ixgbevf_tx_queue(tx_ring, tx_flags,
-			 ixgbevf_tx_map(tx_ring, skb, tx_flags),
-			 first, skb->len, hdr_len);
+	/* Force memory writes to complete before letting h/w
+	 * know there are new descriptors to fetch.  (Only
+	 * applicable for weak-ordered memory model archs,
+	 * such as IA-64).
+	 */
+	wmb();
 
 	writel(tx_ring->next_to_use, tx_ring->tail);
-
 	ixgbevf_maybe_stop_tx(tx_ring, DESC_NEEDED);
 
 	return NETDEV_TX_OK;
+
+out_drop:
+	dev_kfree_skb_any(first->skb);
+	first->skb = NULL;
+
+	return NETDEV_TX_OK;
 }
 
 /**
-- 
1.8.5.GIT
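
Two details of the 5/7 patch are easy to miss when reading the diff. The reworked ixgbevf_clean_tx_irq() biases the ring index by subtracting the ring size, so that wrap-around can be detected with a cheap `!i` test. Separately, the per-packet byte count is computed once at transmit time as skb->len plus one extra header per additional TSO segment. The sketch below isolates both ideas in plain C. The function names ring_advance and tso_bytecount are hypothetical and do not exist in the driver.

```c
#include <assert.h>

/*
 * Advance a ring index by 'steps' slots using the biased-index trick:
 * the index is kept as (real_index - count) in unsigned arithmetic, so
 * it reaches 0 exactly when the real index would run off the end.
 */
static unsigned int ring_advance(unsigned int start, unsigned int steps,
				 unsigned int count)
{
	unsigned int i = start;

	assert(count != 0 && start < count);

	i -= count;		/* apply the bias */
	while (steps--) {
		i++;
		if (!i)		/* stepped past the last slot: wrap */
			i -= count;
	}
	return i + count;	/* remove the bias again */
}

/*
 * Bytes accounted for one packet: the payload is sent once, and the
 * headers are repeated for every segment after the first.
 */
static unsigned int tso_bytecount(unsigned int skb_len,
				  unsigned int gso_segs,
				  unsigned int hdr_len)
{
	assert(gso_segs >= 1);
	return skb_len + (gso_segs - 1) * hdr_len;
}
```

For example, starting at slot 6 of an 8-entry ring and advancing three slots visits 7, 0 and 1, so ring_advance(6, 3, 8) yields 1. A 3000-byte skb split into three segments with 54 bytes of headers is accounted as 3000 + 2 * 54 = 3108 bytes.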


* [net-next 7/7] ixgbevf: merge ixgbevf_tx_map and ixgbevf_tx_queue into a single function
From: Aaron Brown @ 2014-01-18  2:30 UTC
  To: davem; +Cc: Emil Tantilov, netdev, gospo, sassmann, Alexander Duyck,
	Aaron Brown
In-Reply-To: <1390012205-21995-1-git-send-email-aaron.f.brown@intel.com>

From: Emil Tantilov <emil.s.tantilov@intel.com>

This change merges the ixgbevf_tx_map call and the ixgbevf_tx_queue call
into a single function.  To make room for this, the setting of the cmd_type
and olinfo flags is done in separate functions.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/defines.h      |   1 +
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 272 +++++++++++-----------
 2 files changed, 133 insertions(+), 140 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h
index 5426b2d..05e4f32 100644
--- a/drivers/net/ethernet/intel/ixgbevf/defines.h
+++ b/drivers/net/ethernet/intel/ixgbevf/defines.h
@@ -183,6 +183,7 @@ typedef u32 ixgbe_link_speed;
 #define IXGBE_TXD_CMD_DEXT   0x20000000 /* Descriptor extension (0 = legacy) */
 #define IXGBE_TXD_CMD_VLE    0x40000000 /* Add VLAN tag */
 #define IXGBE_TXD_STAT_DD    0x00000001 /* Descriptor Done */
+#define IXGBE_TXD_CMD	     (IXGBE_TXD_CMD_EOP | IXGBE_TXD_CMD_RS)
 
 /* Transmit Descriptor - Advanced */
 union ixgbe_adv_tx_desc {
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 0fc0433..43496cd 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -233,8 +233,6 @@ static bool ixgbevf_clean_tx_irq(struct ixgbevf_q_vector *q_vector,
 
 		/* unmap remaining buffers */
 		while (tx_desc != eop_desc) {
-			tx_desc->wb.status = 0;
-
 			tx_buffer++;
 			tx_desc++;
 			i++;
@@ -254,8 +252,6 @@ static bool ixgbevf_clean_tx_irq(struct ixgbevf_q_vector *q_vector,
 			}
 		}
 
-		tx_desc->wb.status = 0;
-
 		/* move us one more past the eop_desc for start of next pkt */
 		tx_buffer++;
 		tx_desc++;
@@ -2915,166 +2911,171 @@ static void ixgbevf_tx_csum(struct ixgbevf_ring *tx_ring,
 			    type_tucmd, mss_l4len_idx);
 }
 
-static int ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
-			  struct ixgbevf_tx_buffer *first)
+static __le32 ixgbevf_tx_cmd_type(u32 tx_flags)
 {
-	dma_addr_t dma;
-	struct sk_buff *skb = first->skb;
-	struct ixgbevf_tx_buffer *tx_buffer_info;
-	unsigned int len;
-	unsigned int total = skb->len;
-	unsigned int offset = 0, size;
-	int count = 0;
-	unsigned int nr_frags = skb_shinfo(skb)->nr_frags;
-	unsigned int f;
-	int i;
+	/* set type for advanced descriptor with frame checksum insertion */
+	__le32 cmd_type = cpu_to_le32(IXGBE_ADVTXD_DTYP_DATA |
+				      IXGBE_ADVTXD_DCMD_IFCS |
+				      IXGBE_ADVTXD_DCMD_DEXT);
 
-	i = tx_ring->next_to_use;
+	/* set HW vlan bit if vlan is present */
+	if (tx_flags & IXGBE_TX_FLAGS_VLAN)
+		cmd_type |= cpu_to_le32(IXGBE_ADVTXD_DCMD_VLE);
 
-	len = min(skb_headlen(skb), total);
-	while (len) {
-		tx_buffer_info = &tx_ring->tx_buffer_info[i];
-		size = min(len, (unsigned int)IXGBE_MAX_DATA_PER_TXD);
+	/* set segmentation enable bits for TSO/FSO */
+	if (tx_flags & IXGBE_TX_FLAGS_TSO)
+		cmd_type |= cpu_to_le32(IXGBE_ADVTXD_DCMD_TSE);
 
-		tx_buffer_info->tx_flags = first->tx_flags;
-		dma = dma_map_single(tx_ring->dev, skb->data + offset,
-				     size, DMA_TO_DEVICE);
-		if (dma_mapping_error(tx_ring->dev, dma))
-			goto dma_error;
+	return cmd_type;
+}
 
-		/* record length, and DMA address */
-		dma_unmap_len_set(tx_buffer_info, len, size);
-		dma_unmap_addr_set(tx_buffer_info, dma, dma);
+static void ixgbevf_tx_olinfo_status(union ixgbe_adv_tx_desc *tx_desc,
+				     u32 tx_flags, unsigned int paylen)
+{
+	__le32 olinfo_status = cpu_to_le32(paylen << IXGBE_ADVTXD_PAYLEN_SHIFT);
 
-		len -= size;
-		total -= size;
-		offset += size;
-		count++;
-		i++;
-		if (i == tx_ring->count)
-			i = 0;
-	}
+	/* enable L4 checksum for TSO and TX checksum offload */
+	if (tx_flags & IXGBE_TX_FLAGS_CSUM)
+		olinfo_status |= cpu_to_le32(IXGBE_ADVTXD_POPTS_TXSM);
 
-	for (f = 0; f < nr_frags; f++) {
-		const struct skb_frag_struct *frag;
+	/* enble IPv4 checksum for TSO */
+	if (tx_flags & IXGBE_TX_FLAGS_IPV4)
+		olinfo_status |= cpu_to_le32(IXGBE_ADVTXD_POPTS_IXSM);
 
-		frag = &skb_shinfo(skb)->frags[f];
-		len = min((unsigned int)skb_frag_size(frag), total);
-		offset = 0;
+	/* use index 1 context for TSO/FSO/FCOE */
+	if (tx_flags & IXGBE_TX_FLAGS_TSO)
+		olinfo_status |= cpu_to_le32(1 << IXGBE_ADVTXD_IDX_SHIFT);
 
-		while (len) {
-			tx_buffer_info = &tx_ring->tx_buffer_info[i];
-			size = min(len, (unsigned int)IXGBE_MAX_DATA_PER_TXD);
+	/* Check Context must be set if Tx switch is enabled, which it
+	 * always is for case where virtual functions are running
+	 */
+	olinfo_status |= cpu_to_le32(IXGBE_ADVTXD_CC);
 
-			dma = skb_frag_dma_map(tx_ring->dev, frag,
-					       offset, size, DMA_TO_DEVICE);
-			if (dma_mapping_error(tx_ring->dev, dma))
-				goto dma_error;
+	tx_desc->read.olinfo_status = olinfo_status;
+}
 
-			/* record length, and DMA address */
-			dma_unmap_len_set(tx_buffer_info, len, size);
-			dma_unmap_addr_set(tx_buffer_info, dma, dma);
+static void ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
+			   struct ixgbevf_tx_buffer *first,
+			   const u8 hdr_len)
+{
+	dma_addr_t dma;
+	struct sk_buff *skb = first->skb;
+	struct ixgbevf_tx_buffer *tx_buffer;
+	union ixgbe_adv_tx_desc *tx_desc;
+	struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[0];
+	unsigned int data_len = skb->data_len;
+	unsigned int size = skb_headlen(skb);
+	unsigned int paylen = skb->len - hdr_len;
+	u32 tx_flags = first->tx_flags;
+	__le32 cmd_type;
+	u16 i = tx_ring->next_to_use;
 
-			len -= size;
-			total -= size;
-			offset += size;
-			count++;
-			i++;
-			if (i == tx_ring->count)
-				i = 0;
-		}
-		if (total == 0)
-			break;
-	}
+	tx_desc = IXGBEVF_TX_DESC(tx_ring, i);
 
-	if (i == 0)
-		i = tx_ring->count - 1;
-	else
-		i = i - 1;
+	ixgbevf_tx_olinfo_status(tx_desc, tx_flags, paylen);
+	cmd_type = ixgbevf_tx_cmd_type(tx_flags);
 
-	first->next_to_watch = IXGBEVF_TX_DESC(tx_ring, i);
-	first->time_stamp = jiffies;
+	dma = dma_map_single(tx_ring->dev, skb->data, size, DMA_TO_DEVICE);
+	if (dma_mapping_error(tx_ring->dev, dma))
+		goto dma_error;
 
-	return count;
+	/* record length, and DMA address */
+	dma_unmap_len_set(first, len, size);
+	dma_unmap_addr_set(first, dma, dma);
 
-dma_error:
-	dev_err(tx_ring->dev, "TX DMA map failed\n");
+	tx_desc->read.buffer_addr = cpu_to_le64(dma);
 
-	/* clear timestamp and dma mappings for failed tx_buffer_info map */
-	tx_buffer_info->dma = 0;
-	count--;
+	for (;;) {
+		while (unlikely(size > IXGBE_MAX_DATA_PER_TXD)) {
+			tx_desc->read.cmd_type_len =
+				cmd_type | cpu_to_le32(IXGBE_MAX_DATA_PER_TXD);
 
-	/* clear timestamp and dma mappings for remaining portion of packet */
-	while (count >= 0) {
-		count--;
-		i--;
-		if (i < 0)
-			i += tx_ring->count;
-		tx_buffer_info = &tx_ring->tx_buffer_info[i];
-		ixgbevf_unmap_and_free_tx_resource(tx_ring, tx_buffer_info);
-	}
+			i++;
+			tx_desc++;
+			if (i == tx_ring->count) {
+				tx_desc = IXGBEVF_TX_DESC(tx_ring, 0);
+				i = 0;
+			}
 
-	return count;
-}
+			dma += IXGBE_MAX_DATA_PER_TXD;
+			size -= IXGBE_MAX_DATA_PER_TXD;
 
-static void ixgbevf_tx_queue(struct ixgbevf_ring *tx_ring,
-			     struct ixgbevf_tx_buffer *first,
-			     int count, u8 hdr_len)
-{
-	union ixgbe_adv_tx_desc *tx_desc = NULL;
-	struct sk_buff *skb = first->skb;
-	struct ixgbevf_tx_buffer *tx_buffer_info;
-	u32 olinfo_status = 0, cmd_type_len = 0;
-	u32 tx_flags = first->tx_flags;
-	unsigned int i;
+			tx_desc->read.buffer_addr = cpu_to_le64(dma);
+			tx_desc->read.olinfo_status = 0;
+		}
 
-	u32 txd_cmd = IXGBE_TXD_CMD_EOP | IXGBE_TXD_CMD_RS | IXGBE_TXD_CMD_IFCS;
+		if (likely(!data_len))
+			break;
 
-	cmd_type_len |= IXGBE_ADVTXD_DTYP_DATA;
+		tx_desc->read.cmd_type_len = cmd_type | cpu_to_le32(size);
 
-	cmd_type_len |= IXGBE_ADVTXD_DCMD_IFCS | IXGBE_ADVTXD_DCMD_DEXT;
+		i++;
+		tx_desc++;
+		if (i == tx_ring->count) {
+			tx_desc = IXGBEVF_TX_DESC(tx_ring, 0);
+			i = 0;
+		}
 
-	if (tx_flags & IXGBE_TX_FLAGS_VLAN)
-		cmd_type_len |= IXGBE_ADVTXD_DCMD_VLE;
+		size = skb_frag_size(frag);
+		data_len -= size;
 
-	if (tx_flags & IXGBE_TX_FLAGS_CSUM)
-		olinfo_status |= IXGBE_ADVTXD_POPTS_TXSM;
+		dma = skb_frag_dma_map(tx_ring->dev, frag, 0, size,
+				       DMA_TO_DEVICE);
+		if (dma_mapping_error(tx_ring->dev, dma))
+			goto dma_error;
 
-	if (tx_flags & IXGBE_TX_FLAGS_TSO) {
-		cmd_type_len |= IXGBE_ADVTXD_DCMD_TSE;
+		tx_buffer = &tx_ring->tx_buffer_info[i];
+		dma_unmap_len_set(tx_buffer, len, size);
+		dma_unmap_addr_set(tx_buffer, dma, dma);
 
-		/* use index 1 context for tso */
-		olinfo_status |= (1 << IXGBE_ADVTXD_IDX_SHIFT);
-		if (tx_flags & IXGBE_TX_FLAGS_IPV4)
-			olinfo_status |= IXGBE_ADVTXD_POPTS_IXSM;
+		tx_desc->read.buffer_addr = cpu_to_le64(dma);
+		tx_desc->read.olinfo_status = 0;
+
+		frag++;
 	}
 
-	/*
-	 * Check Context must be set if Tx switch is enabled, which it
-	 * always is for case where virtual functions are running
+	/* write last descriptor with RS and EOP bits */
+	cmd_type |= cpu_to_le32(size) | cpu_to_le32(IXGBE_TXD_CMD);
+	tx_desc->read.cmd_type_len = cmd_type;
+
+	/* set the timestamp */
+	first->time_stamp = jiffies;
+
+	/* Force memory writes to complete before letting h/w know there
+	 * are new descriptors to fetch.  (Only applicable for weak-ordered
+	 * memory model archs, such as IA-64).
+	 *
+	 * We also need this memory barrier (wmb) to make certain all of the
+	 * status bits have been updated before next_to_watch is written.
 	 */
-	olinfo_status |= IXGBE_ADVTXD_CC;
+	wmb();
 
-	olinfo_status |= ((skb->len - hdr_len) << IXGBE_ADVTXD_PAYLEN_SHIFT);
+	/* set next_to_watch value indicating a packet is present */
+	first->next_to_watch = tx_desc;
 
-	i = tx_ring->next_to_use;
-	while (count--) {
-		dma_addr_t dma;
-		unsigned int len;
+	i++;
+	if (i == tx_ring->count)
+		i = 0;
 
-		tx_buffer_info = &tx_ring->tx_buffer_info[i];
-		dma = dma_unmap_addr(tx_buffer_info, dma);
-		len = dma_unmap_len(tx_buffer_info, len);
-		tx_desc = IXGBEVF_TX_DESC(tx_ring, i);
-		tx_desc->read.buffer_addr = cpu_to_le64(dma);
-		tx_desc->read.cmd_type_len = cpu_to_le32(cmd_type_len | len);
-		tx_desc->read.olinfo_status = cpu_to_le32(olinfo_status);
-		i++;
-		if (i == tx_ring->count)
-			i = 0;
-	}
+	tx_ring->next_to_use = i;
 
-	tx_desc->read.cmd_type_len |= cpu_to_le32(txd_cmd);
+	/* notify HW of packet */
+	writel(i, tx_ring->tail);
+
+	return;
+dma_error:
+	dev_err(tx_ring->dev, "TX DMA map failed\n");
+
+	/* clear dma mappings for failed tx_buffer_info map */
+	for (;;) {
+		tx_buffer = &tx_ring->tx_buffer_info[i];
+		ixgbevf_unmap_and_free_tx_resource(tx_ring, tx_buffer);
+		if (tx_buffer == first)
+			break;
+		if (i == 0)
+			i = tx_ring->count;
+		i--;
+	}
 
 	tx_ring->next_to_use = i;
 }
@@ -3167,17 +3168,8 @@ static int ixgbevf_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 	else
 		ixgbevf_tx_csum(tx_ring, first);
 
-	ixgbevf_tx_queue(tx_ring, first,
-			 ixgbevf_tx_map(tx_ring, first), hdr_len);
-
-	/* Force memory writes to complete before letting h/w
-	 * know there are new descriptors to fetch.  (Only
-	 * applicable for weak-ordered memory model archs,
-	 * such as IA-64).
-	 */
-	wmb();
+	ixgbevf_tx_map(tx_ring, first, hdr_len);
 
-	writel(tx_ring->next_to_use, tx_ring->tail);
 	ixgbevf_maybe_stop_tx(tx_ring, DESC_NEEDED);
 
 	return NETDEV_TX_OK;
-- 
1.8.5.GIT

^ permalink raw reply related
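
The dma_error path in the merged ixgbevf_tx_map() above walks backwards
from the slot that failed to map until it reaches `first`, wrapping from
index 0 to the end of the ring. A minimal user-space sketch of that
index arithmetic follows. It is not driver code: struct model_tx_buffer
and model_unwind() are hypothetical names, and the unmap call is
modelled as clearing a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct ixgbevf_tx_buffer: it only tracks
 * whether the slot currently holds a DMA mapping.
 */
struct model_tx_buffer {
	bool mapped;
};

/* Walk backwards from the failing slot `i` to slot `first`, releasing
 * each slot and wrapping from 0 to count - 1, in the same order as the
 * for (;;) loop under dma_error.  Returns the index that would be
 * written back to next_to_use.
 */
static unsigned int model_unwind(struct model_tx_buffer *ring,
				 unsigned int count,
				 unsigned int first, unsigned int i)
{
	assert(count > 0 && first < count && i < count);

	for (;;) {
		ring[i].mapped = false;	/* models unmap_and_free */
		if (i == first)
			break;
		if (i == 0)
			i = count;
		i--;
	}

	return i;
}
```

With an 8-entry ring, first = 6 and a failure at slot 1, the loop
releases slots 1, 0, 7 and 6 in that order and stops at `first`, so
next_to_use ends up pointing at the start of the aborted packet.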

* [net-next 6/7] ixgbevf: redo dma mapping using the tx buffer info
From: Aaron Brown @ 2014-01-18  2:30 UTC (permalink / raw)
  To: davem; +Cc: Emil Tantilov, netdev, gospo, sassmann, Alexander Duyck,
	Aaron Brown
In-Reply-To: <1390012205-21995-1-git-send-email-aaron.f.brown@intel.com>

From: Emil Tantilov <emil.s.tantilov@intel.com>

This patch takes advantage of the dma buffer always being present in the
first descriptor and mapped as single. As such we can call dma_unmap_single
and don't need to check for DMA mapping in ixgbevf_clean_tx_irq().

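In addition this patch makes use of the DMA unmap state API
(DEFINE_DMA_UNMAP_ADDR/DEFINE_DMA_UNMAP_LEN and the dma_unmap_addr/
dma_unmap_len accessors) to record each mapping in the tx buffer info.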

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |  5 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 94 ++++++++++++++---------
 2 files changed, 59 insertions(+), 40 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
index bad3219..5482932 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
@@ -52,9 +52,9 @@ struct ixgbevf_tx_buffer {
 	unsigned int bytecount;
 	unsigned short gso_segs;
 	__be16 protocol;
-	dma_addr_t dma;
+	DEFINE_DMA_UNMAP_ADDR(dma);
+	DEFINE_DMA_UNMAP_LEN(len);
 	u32 tx_flags;
-	u16 length;
 };
 
 struct ixgbevf_rx_buffer {
@@ -147,7 +147,6 @@ struct ixgbevf_ring {
 #define IXGBE_TX_FLAGS_VLAN		(u32)(1 << 1)
 #define IXGBE_TX_FLAGS_TSO		(u32)(1 << 2)
 #define IXGBE_TX_FLAGS_IPV4		(u32)(1 << 3)
-#define IXGBE_TX_FLAGS_MAPPED_AS_PAGE	(u32)(1 << 4)
 #define IXGBE_TX_FLAGS_VLAN_MASK	0xffff0000
 #define IXGBE_TX_FLAGS_VLAN_PRIO_MASK	0x0000e000
 #define IXGBE_TX_FLAGS_VLAN_SHIFT	16
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 61425f8..0fc0433 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -145,28 +145,25 @@ static void ixgbevf_set_ivar(struct ixgbevf_adapter *adapter, s8 direction,
 }
 
 static void ixgbevf_unmap_and_free_tx_resource(struct ixgbevf_ring *tx_ring,
-					       struct ixgbevf_tx_buffer
-					       *tx_buffer_info)
-{
-	if (tx_buffer_info->dma) {
-		if (tx_buffer_info->tx_flags & IXGBE_TX_FLAGS_MAPPED_AS_PAGE)
-			dma_unmap_page(tx_ring->dev,
-				       tx_buffer_info->dma,
-				       tx_buffer_info->length,
-				       DMA_TO_DEVICE);
-		else
+					struct ixgbevf_tx_buffer *tx_buffer)
+{
+	if (tx_buffer->skb) {
+		dev_kfree_skb_any(tx_buffer->skb);
+		if (dma_unmap_len(tx_buffer, len))
 			dma_unmap_single(tx_ring->dev,
-					 tx_buffer_info->dma,
-					 tx_buffer_info->length,
+					 dma_unmap_addr(tx_buffer, dma),
+					 dma_unmap_len(tx_buffer, len),
 					 DMA_TO_DEVICE);
-		tx_buffer_info->dma = 0;
-	}
-	if (tx_buffer_info->skb) {
-		dev_kfree_skb_any(tx_buffer_info->skb);
-		tx_buffer_info->skb = NULL;
+	} else if (dma_unmap_len(tx_buffer, len)) {
+		dma_unmap_page(tx_ring->dev,
+			       dma_unmap_addr(tx_buffer, dma),
+			       dma_unmap_len(tx_buffer, len),
+			       DMA_TO_DEVICE);
 	}
-	tx_buffer_info->time_stamp = 0;
-	/* tx_buffer_info must be completely set up in the transmit path */
+	tx_buffer->next_to_watch = NULL;
+	tx_buffer->skb = NULL;
+	dma_unmap_len_set(tx_buffer, len, 0);
+	/* tx_buffer must be completely set up in the transmit path */
 }
 
 #define IXGBE_MAX_TXD_PWR	14
@@ -221,8 +218,18 @@ static bool ixgbevf_clean_tx_irq(struct ixgbevf_q_vector *q_vector,
 		total_bytes += tx_buffer->bytecount;
 		total_packets += tx_buffer->gso_segs;
 
+		/* free the skb */
+		dev_kfree_skb_any(tx_buffer->skb);
+
+		/* unmap skb header data */
+		dma_unmap_single(tx_ring->dev,
+				 dma_unmap_addr(tx_buffer, dma),
+				 dma_unmap_len(tx_buffer, len),
+				 DMA_TO_DEVICE);
+
 		/* clear tx_buffer data */
-		ixgbevf_unmap_and_free_tx_resource(tx_ring, tx_buffer);
+		tx_buffer->skb = NULL;
+		dma_unmap_len_set(tx_buffer, len, 0);
 
 		/* unmap remaining buffers */
 		while (tx_desc != eop_desc) {
@@ -237,7 +244,14 @@ static bool ixgbevf_clean_tx_irq(struct ixgbevf_q_vector *q_vector,
 				tx_desc = IXGBEVF_TX_DESC(tx_ring, 0);
 			}
 
-			ixgbevf_unmap_and_free_tx_resource(tx_ring, tx_buffer);
+			/* unmap any remaining paged data */
+			if (dma_unmap_len(tx_buffer, len)) {
+				dma_unmap_page(tx_ring->dev,
+					       dma_unmap_addr(tx_buffer, dma),
+					       dma_unmap_len(tx_buffer, len),
+					       DMA_TO_DEVICE);
+				dma_unmap_len_set(tx_buffer, len, 0);
+			}
 		}
 
 		tx_desc->wb.status = 0;
@@ -2904,6 +2918,7 @@ static void ixgbevf_tx_csum(struct ixgbevf_ring *tx_ring,
 static int ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
 			  struct ixgbevf_tx_buffer *first)
 {
+	dma_addr_t dma;
 	struct sk_buff *skb = first->skb;
 	struct ixgbevf_tx_buffer *tx_buffer_info;
 	unsigned int len;
@@ -2921,14 +2936,16 @@ static int ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
 		tx_buffer_info = &tx_ring->tx_buffer_info[i];
 		size = min(len, (unsigned int)IXGBE_MAX_DATA_PER_TXD);
 
-		tx_buffer_info->length = size;
 		tx_buffer_info->tx_flags = first->tx_flags;
-		tx_buffer_info->dma = dma_map_single(tx_ring->dev,
-						     skb->data + offset,
-						     size, DMA_TO_DEVICE);
-		if (dma_mapping_error(tx_ring->dev, tx_buffer_info->dma))
+		dma = dma_map_single(tx_ring->dev, skb->data + offset,
+				     size, DMA_TO_DEVICE);
+		if (dma_mapping_error(tx_ring->dev, dma))
 			goto dma_error;
 
+		/* record length, and DMA address */
+		dma_unmap_len_set(tx_buffer_info, len, size);
+		dma_unmap_addr_set(tx_buffer_info, dma, dma);
+
 		len -= size;
 		total -= size;
 		offset += size;
@@ -2949,16 +2966,15 @@ static int ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
 			tx_buffer_info = &tx_ring->tx_buffer_info[i];
 			size = min(len, (unsigned int)IXGBE_MAX_DATA_PER_TXD);
 
-			tx_buffer_info->length = size;
-			tx_buffer_info->dma =
-				skb_frag_dma_map(tx_ring->dev, frag,
-						 offset, size, DMA_TO_DEVICE);
-			tx_buffer_info->tx_flags |=
-						IXGBE_TX_FLAGS_MAPPED_AS_PAGE;
-			if (dma_mapping_error(tx_ring->dev,
-					      tx_buffer_info->dma))
+			dma = skb_frag_dma_map(tx_ring->dev, frag,
+					       offset, size, DMA_TO_DEVICE);
+			if (dma_mapping_error(tx_ring->dev, dma))
 				goto dma_error;
 
+			/* record length, and DMA address */
+			dma_unmap_len_set(tx_buffer_info, len, size);
+			dma_unmap_addr_set(tx_buffer_info, dma, dma);
+
 			len -= size;
 			total -= size;
 			offset += size;
@@ -3043,11 +3059,15 @@ static void ixgbevf_tx_queue(struct ixgbevf_ring *tx_ring,
 
 	i = tx_ring->next_to_use;
 	while (count--) {
+		dma_addr_t dma;
+		unsigned int len;
+
 		tx_buffer_info = &tx_ring->tx_buffer_info[i];
+		dma = dma_unmap_addr(tx_buffer_info, dma);
+		len = dma_unmap_len(tx_buffer_info, len);
 		tx_desc = IXGBEVF_TX_DESC(tx_ring, i);
-		tx_desc->read.buffer_addr = cpu_to_le64(tx_buffer_info->dma);
-		tx_desc->read.cmd_type_len =
-			cpu_to_le32(cmd_type_len | tx_buffer_info->length);
+		tx_desc->read.buffer_addr = cpu_to_le64(dma);
+		tx_desc->read.cmd_type_len = cpu_to_le32(cmd_type_len | len);
 		tx_desc->read.olinfo_status = cpu_to_le32(olinfo_status);
 		i++;
 		if (i == tx_ring->count)
-- 
1.8.5.GIT

^ permalink raw reply related
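
The reworked ixgbevf_unmap_and_free_tx_resource() above no longer needs
the MAPPED_AS_PAGE flag: a non-zero stored length means "mapped", and
the presence of an skb tells a header mapping (dma_unmap_single) from a
fragment mapping (dma_unmap_page). The user-space sketch below models
only that decision. The macros are simplified stand-ins written for
this example, not the kernel definitions, and struct model_tx_buffer
and model_unmap_kind() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;	/* user-space stand-in */

/* Simplified stand-ins for the kernel's unmap-state macros, written
 * for the case where the address and length are actually stored.
 */
#define DEFINE_DMA_UNMAP_ADDR(name)		dma_addr_t name
#define DEFINE_DMA_UNMAP_LEN(name)		uint32_t name
#define dma_unmap_addr(ptr, name)		((ptr)->name)
#define dma_unmap_addr_set(ptr, name, val)	(((ptr)->name) = (val))
#define dma_unmap_len(ptr, name)		((ptr)->name)
#define dma_unmap_len_set(ptr, name, val)	(((ptr)->name) = (val))

struct model_tx_buffer {
	void *skb;	/* non-NULL only on the first buffer of a packet */
	DEFINE_DMA_UNMAP_ADDR(dma);
	DEFINE_DMA_UNMAP_LEN(len);
};

enum model_unmap { UNMAP_NONE, UNMAP_SINGLE, UNMAP_PAGE };

/* Report which unmap call the patch would issue for this buffer, then
 * reset the buffer the way the patch does (skb = NULL, len = 0).
 */
static enum model_unmap model_unmap_kind(struct model_tx_buffer *buf)
{
	enum model_unmap kind = UNMAP_NONE;

	assert(buf != NULL);

	if (dma_unmap_len(buf, len))
		kind = buf->skb ? UNMAP_SINGLE : UNMAP_PAGE;

	buf->skb = NULL;
	dma_unmap_len_set(buf, len, 0);
	return kind;
}
```

Because the length is zeroed after every unmap, calling the helper a
second time on the same buffer reports UNMAP_NONE, which is the
property the clean-up paths in the patch rely on.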

* Re: [PATCH net-next] ipv4: fix a dst leak in tunnels
From: David Miller @ 2014-01-18  2:37 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, therbert, maze, cwang
In-Reply-To: <1389919279.31367.439.camel@edumazet-glaptop2.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 16 Jan 2014 16:41:19 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> This patch :
> 
> 1) Remove a dst leak if DST_NOCACHE was set on dst
>    Fix this by holding a reference only if the dst is really cached.
> 
> 2) Remove a lockdep warning in __tunnel_dst_set()
>     This was reported by Cong Wang.
> 
> 3) Remove usage of a spinlock where xchg() is enough
> 
> 4) Remove some spurious inline keywords.
>    Let compiler decide for us.
> 
> Fixes: 7d442fab0a67 ("ipv4: Cache dst in tunnels")
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.

^ permalink raw reply

* Re: [Patch v6 net-next 0/2] Intel Wired LAN Driver Updates
From: David Miller @ 2014-01-18  2:38 UTC (permalink / raw)
  To: aaron.f.brown; +Cc: netdev, gospo, sassmann, ethan.kernel
In-Reply-To: <1389930065-3330-1-git-send-email-aaron.f.brown@intel.com>

From: Aaron Brown <aaron.f.brown@intel.com>
Date: Thu, 16 Jan 2014 19:41:03 -0800

> This series contains updates to ixgbe from Ethan Zhao.  The first one replaces
> the magic number "63" with a macro, IXGBE_MAX_VFS_DRV_LIMIT, the second 
> moves the call to set driver_max_VFS to before SRIOV is enabled.
> 
> The code of these patches matches the v3 (1/2) and v2 (2/2) versions sent
> to the e1000-devel and netdev mailing lists.  The intermediate versions
> (v4, v5) are from sorting out style issues, mostly tabs to spaces and
> split lines probably introduced via mailer.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next] net: vxlan: do not use vxlan_net before checking event type
From: David Miller @ 2014-01-18  2:50 UTC (permalink / raw)
  To: dborkman; +Cc: netdev, ebiederm, jesse.brandeburg, xiyou.wangcong
In-Reply-To: <1389959706-30976-1-git-send-email-dborkman@redhat.com>

From: Daniel Borkmann <dborkman@redhat.com>
Date: Fri, 17 Jan 2014 12:55:06 +0100

> Jesse Brandeburg reported that commit acaf4e70997f caused a panic
> when adding a network namespace while vxlan module was present in
> the system:
 ...
> Apparently loopback device is being registered first and thus we
> receive an event notification when vxlan_net is not ready. Hence,
> when we call net_generic() and request vxlan_net_id, we seem to
> access garbage at that point in time. In setup_net() where we set
> up a newly allocated network namespace, we traverse the list of
> pernet ops ...
> 
> list_for_each_entry(ops, &pernet_list, list) {
> 	error = ops_init(ops, net);
> 	if (error < 0)
> 		goto out_undo;
> }
> 
> ... and loopback_net_init() is invoked first here, so in the middle
> of setup_net() we get this notification in vxlan. As currently we
> only care about devices that unregister, move access through
> net_generic() there. Fix is based on Cong Wang's proposal, but
> only changes what is needed here. It sucks a bit as we only work
> around the actual cure: right now it seems the only way to check if
> a netns actually finished traversing all init ops would be to check
> if it's part of net_namespace_list. But that I find quite expensive
> each time we go through a notifier callback. Anyway, did a couple
> of tests and it seems good for now.
> 
> Fixes: acaf4e70997f ("net: vxlan: when lower dev unregisters remove vxlan dev as well")
> Reported-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>

Applied, thanks.

^ permalink raw reply
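
The fix described above amounts to checking the event type before
touching per-namespace data that may not have been set up yet. A
minimal user-space sketch of that ordering follows. All names here
(model_net, model_pernet, model_notifier) are hypothetical, and a NULL
pointer stands in for the uninitialised state that net_generic() would
return garbage for.

```c
#include <assert.h>
#include <stddef.h>

enum model_event { MODEL_REGISTER, MODEL_UNREGISTER };

/* Hypothetical per-namespace state owned by the module. */
struct model_pernet {
	int removed;
};

/* Hypothetical namespace: the pointer stays NULL until the module's
 * init op has run for this namespace.
 */
struct model_net {
	struct model_pernet *vxlan;
};

/* Returns 1 if the event was handled, 0 if it was ignored.  The
 * per-namespace pointer is only read inside the branch that needs it.
 */
static int model_notifier(struct model_net *net, enum model_event event)
{
	struct model_pernet *vn;

	assert(net != NULL);

	if (event != MODEL_UNREGISTER)
		return 0;	/* early events never touch net->vxlan */

	vn = net->vxlan;	/* set up by the time devices unregister */
	vn->removed++;
	return 1;
}
```

As the commit message notes, this only works around the ordering
problem: the sketch is safe because the one event it acts on cannot
arrive before the per-namespace state exists.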

* Re: [PATCH net-next v2 0/2] bonding: add slave netlink and sysfs support
From: David Miller @ 2014-01-18  2:53 UTC (permalink / raw)
  To: sfeldma; +Cc: vfalico, fubar, andy, netdev, roopa, shm, dingtianhong
In-Reply-To: <20140117065316.3194.94624.stgit@monster-03.cumulusnetworks.com>

From: Scott Feldman <sfeldma@cumulusnetworks.com>
Date: Thu, 16 Jan 2014 22:57:42 -0800

> v2:
> 
>   - Address review comment from Ding (and Veaceslav): handle kobj cleanup
>     if sysfs_create_file() fails when adding slave attribute nodes.
> 
> v1:
> 
>   The following series adds bonding slave netlink and sysfs interfaces.
>   Slave interfaces get a new IFLA_SLAVE set of netlink attributes, along
>   with RTM_NEWLINK notification when slave's active status changes.  The
>   sysfs interface adds read-only nodes for slave attributes under a /slave
>   dir, similar to how bond interfaces get a /bonding dir for bonding
>   attributes.

Applied, thanks Scott.

^ permalink raw reply

* Re: [PATCH net-next] net: ftgmac100: use kfree_skb() where appropriate
From: David Miller @ 2014-01-18  2:54 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1389944304.31367.476.camel@edumazet-glaptop2.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 16 Jan 2014 23:38:24 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> In order to get correct drop monitor notifications for dropped
> packets, we should call kfree_skb() instead of dev_kfree_skb()
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.

^ permalink raw reply

* Re: [PATCH v2 net] bpf: do not use reciprocal divide
From: David Miller @ 2014-01-18  2:56 UTC (permalink / raw)
  To: heiko.carstens
  Cc: eric.dumazet, schwidefsky, hannes, netdev, dborkman, darkjames-ws,
	mgherzan, rmk+kernel, matt
In-Reply-To: <20140117085916.GA4208@osiris>

From: Heiko Carstens <heiko.carstens@de.ibm.com>
Date: Fri, 17 Jan 2014 09:59:16 +0100

> Could you please also apply the patch below to your tree? It would only
> generate a merge conflict, that would need fixing, if it would sit in the
> s390 tree.

Applied and I queued it up for -stable so I can combine it with
Eric's original change when I submit it to -stable.

^ permalink raw reply

* Re: [net-next 0/3] Intel Wired LAN Driver Updates
From: David Miller @ 2014-01-18  2:58 UTC (permalink / raw)
  To: aaron.f.brown; +Cc: netdev, gospo, sassmann
In-Reply-To: <1389950498-8820-1-git-send-email-aaron.f.brown@intel.com>

From: Aaron Brown <aaron.f.brown@intel.com>
Date: Fri, 17 Jan 2014 01:21:35 -0800

> This series contains updates to ixgbe and ixgbevf.
> 
> Jacob adds braces around some ixgbe_qv_lock_* calls to better adhere
> to Kernel style guidelines.  Don bumps the versions on ixgbe and
> ixgbevf to match internal driver functionality better.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next] virtio-net: fix build error when CONFIG_AVERAGE is not enabled
From: David Miller @ 2014-01-18  2:58 UTC (permalink / raw)
  To: mwdalton; +Cc: mst, netdev, virtualization, edumazet
In-Reply-To: <1389950828-24039-1-git-send-email-mwdalton@google.com>

From: Michael Dalton <mwdalton@google.com>
Date: Fri, 17 Jan 2014 01:27:08 -0800

> Commit ab7db91705e9 ("virtio-net: auto-tune mergeable rx buffer size for
> improved performance") introduced a virtio-net dependency on EWMA.
> The inclusion of EWMA is controlled by CONFIG_AVERAGE. Fix build error
> when CONFIG_AVERAGE is not enabled by adding select AVERAGE to
> virtio-net's Kconfig entry.
> 
> Build failure reported using config make ARCH=s390 defconfig.
> 
> Signed-off-by: Michael Dalton <mwdalton@google.com>

Applied.

^ permalink raw reply

* Re: [PATCH 0/4] Patchset - Support for configurable RSS hash key
From: David Miller @ 2014-01-18  3:01 UTC (permalink / raw)
  To: VenkatKumar.Duvvuru; +Cc: netdev
In-Reply-To: <BF3270C86E8B1349A26C34E4EC1C44CB2C83D8B7@CMEXMB1.ad.emulex.com>

From: Venkata Duvvuru <VenkatKumar.Duvvuru@Emulex.Com>
Date: Fri, 17 Jan 2014 13:02:10 +0000

> NIC drivers that support RSS use either a hard-coded value or a random value for the RSS hash key. Irrespective of the type of the key used, the user would want to change the hash key if he/she is not satisfied with the effectiveness of the default hash-key in spreading the incoming flows evenly across the RSS queues.
> 
> This patch set provides support for configuring the RSS hash-key via the ethtool interface.
> 
> The patch set consists of:
> a) ethtool user-land patches
> b) ethtool kernel patch
> c) be2net patch that implements the ethtool hooks

Your submission is confusing.

Changes for the kernel side of things should be submitted as a separate
series, and you do not need to mention where the tree was, via SHA ID,
in the commit message.  It is sufficient to say that your patches are
against the 'net-next' tree, but after the "---" delimiter.  It's not
useful in the commit message proper.

^ permalink raw reply

* Re: [PATCH 17/41] net: Replace __this_cpu_inc in route.c with raw_cpu_inc
From: David Miller @ 2014-01-18  3:05 UTC (permalink / raw)
  To: cl; +Cc: tj, akpm, rostedt, linux-kernel, mingo, peterz, tglx, netdev,
	edumazet
In-Reply-To: <20140117151836.140608046@linux.com>

From: Christoph Lameter <cl@linux.com>
Date: Fri, 17 Jan 2014 09:18:29 -0600

> Acked-by: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Christoph Lameter <cl@linux.com>

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply

* Re: [PATCH 24/41] net: Replace get_cpu_var through this_cpu_ptr
From: David Miller @ 2014-01-18  3:05 UTC (permalink / raw)
  To: cl; +Cc: tj, akpm, rostedt, linux-kernel, mingo, peterz, tglx, netdev,
	edumazet
In-Reply-To: <20140117151836.883704411@linux.com>

From: Christoph Lameter <cl@linux.com>
Date: Fri, 17 Jan 2014 09:18:36 -0600

> [Patch depends on another patch in this series that introduces raw_cpu_ops]
> 
> Replace uses of get_cpu_var for address calculation through this_cpu_ptr.
> 
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: netdev@vger.kernel.org
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Christoph Lameter <cl@linux.com>

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply

* Re: [PATCH] Bluetooth: remove direct compilation of 6lowpan_iphc.c
From: David Miller @ 2014-01-18  3:10 UTC (permalink / raw)
  To: swarren
  Cc: dbaryshkov, marcel, sfr, linux-next, linux-kernel,
	linux-zigbee-devel, alex.bluesman.smirnov, netdev, jukka.rissanen,
	swarren
In-Reply-To: <1389986964-5177-1-git-send-email-swarren@wwwdotorg.org>

From: Stephen Warren <swarren@wwwdotorg.org>
Date: Fri, 17 Jan 2014 12:29:24 -0700

> From: Stephen Warren <swarren@nvidia.com>
> 
> It's now built as a separate utility module, and enabling BT selects
> that module in Kconfig. This fixes:
 ...
> (this change probably simply wasn't "git add"d to a53d34c3465b)
> 
> Fixes: a53d34c3465b ("net: move 6lowpan compression code to separate module")
> Fixes: 18722c247023 ("Bluetooth: Enable 6LoWPAN support for BT LE devices")
> Signed-off-by: Stephen Warren <swarren@nvidia.com>

Applied to net-next, thanks a lot.

^ permalink raw reply

* Re: pull request: wireless-next 2014-01-17
From: David Miller @ 2014-01-18  3:11 UTC (permalink / raw)
  To: linville; +Cc: linux-wireless, netdev
In-Reply-To: <20140117200028.GC15145@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Fri, 17 Jan 2014 15:00:29 -0500

> Please pull this batch of updates for the 3.14 stream!

Pulled, thanks John.

^ permalink raw reply

* Re: [net-next 0/9] Intel Wired LAN Driver Updates
From: David Miller @ 2014-01-18  3:17 UTC (permalink / raw)
  To: aaron.f.brown; +Cc: netdev, gospo, sassmann
In-Reply-To: <1390001799-19425-1-git-send-email-aaron.f.brown@intel.com>

From: Aaron Brown <aaron.f.brown@intel.com>
Date: Fri, 17 Jan 2014 15:36:30 -0800

> This series contains updates to i40e.  
> 
> Neerav implements DCB and DCBNL support and adds DCB options
> to Kconfig.  DCB is disabled by default.
> 
> Anjali refactors flow control director to fix inconsistencies
> that were preventing clean unloads of the driver, moves the
> queues for handling flow director error into their own hardware
> VSI and implements a corrected version of the basic ethtool add
> ntuple rule.
> 
> Jesse provides fixes for a compiler warning, firmware workaround,
> white space fixes and renames some defines.
> 
> Shannon reworks the device ID #defines to follow the
> DEV_ID_ convention followed by our other drivers.

Series applied.

^ permalink raw reply

* Re: [net-next 0/7] Intel Wired LAN Driver Updates
From: David Miller @ 2014-01-18  3:18 UTC (permalink / raw)
  To: aaron.f.brown; +Cc: netdev, gospo, sassmann
In-Reply-To: <1390012205-21995-1-git-send-email-aaron.f.brown@intel.com>

From: Aaron Brown <aaron.f.brown@intel.com>
Date: Fri, 17 Jan 2014 18:29:58 -0800

> This series contains updates from Emil to ixgbevf.
> 
> He cleans up the code by removing the adapter structure as a
> parameter from multiple functions in favor of using the ixgbevf_ring
> structure and moves hot-path specific statistics into the ring
> structure for anticipated performance gains.
> 
> He also removes the Tx/Rx counters for checksum offload and adds 
> counters for tx_restart_queue and tx_timeout_count.
> 
> Next he makes it so that the first tx_buffer structure acts as a
> central storage location for most of the skb info we are about to
> transmit, then takes advantage of the dma buffer always being
> present in the first descriptor and mapped as single allowing a 
> call to dma_unmap_single which alleviates the need to check for
> DMA mapping in ixgbevf_clean_tx_irq().  
> 
> Finally he merges the ixgbevf_tx_map call and the ixgbevf_tx_queue
> call into a single function.

Series applied, thanks.

^ permalink raw reply

* [GIT] Networking
From: David Miller @ 2014-01-18  3:25 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


1) The value chosen for the new SO_MAX_PACING_RATE socket option on
   parisc was very poorly chosen, let's fix it while we still can.
   From Eric Dumazet.

2) Our generic reciprocal divide was found to handle some edge cases
   incorrectly, part of this is encoded into the BPF as deep as the
   JIT engines themselves.  Just use a real divide throughout for now.
   From Eric Dumazet.

3) Because the initial lookup is lockless, the TCP metrics engine
   can end up creating two entries for the same lookup key.  Fix this
   by doing a second lookup under the lock before we actually create
   the new entry.  From Christoph Paasch.

4) Fix scatter-gather list init in usbnet driver, from Bjørn Mork.

5) Fix unintended 32-bit truncation in cxgb4 driver's bit shifting.
   From Dan Carpenter.

6) Netlink socket dumping uses the wrong socket state for timewait
   sockets.  Fix from Neal Cardwell.

7) Fix netlink memory leak in ieee802154_add_iface(), from Christian
   Engelmayer.

8) Multicast forwarding in ipv4 can overflow the per-rule reference
   counts, causing all multicast traffic to cease.  Fix from
   Hannes Frederic Sowa.

9) via-rhine needs to stop all TX queues when it resets the device,
   from Richard Weinberger.

10) Fix RDS per-cpu accesses broken by the this_cpu_* conversions.
    From Gerald Schaefer.

Please pull, thanks a lot!

The following changes since commit 228fdc083b017eaf90e578fa86fb1ecfd5ffae87:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2014-01-11 06:37:11 +0700)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master

for you to fetch changes up to 3af57f78c38131b7a66e2b01e06fdacae01992a3:

  s390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions (2014-01-17 18:54:49 -0800)

----------------------------------------------------------------
Bjørn Mork (1):
      net: usbnet: fix SG initialisation

Christian Engelmayer (1):
      ieee802154: Fix memory leak in ieee802154_add_iface()

Christoph Paasch (1):
      tcp: metrics: Avoid duplicate entries with the same destination-IP

Dan Carpenter (1):
      cxgb4: silence shift wrapping static checker warning

David S. Miller (1):
      Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Eric Dumazet (2):
      bpf: do not use reciprocal divide
      parisc: fix SO_MAX_PACING_RATE typo

Gerald Schaefer (1):
      net: rds: fix per-cpu helper usage

Hannes Frederic Sowa (2):
      net: avoid reference counter overflows on fib_rules in multicast forwarding
      ipv6: simplify detection of first operational link-local address on interface

Heiko Carstens (1):
      s390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions

Ivan Vecera (1):
      be2net: add dma_mapping_error() check for dma_map_page()

Jitendra Kalsaria (1):
      qlge: Fix vlan netdev features.

Marek Lindner (1):
      batman-adv: fix batman-adv header overhead calculation

Michael S. Tsirkin (1):
      MAINTAINERS: add virtio-dev ML for virtio

Mika Westerberg (1):
      e1000e: Fix compilation warning when !CONFIG_PM_SLEEP

Neal Cardwell (1):
      inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets

Peter Korsgaard (1):
      dm9601: add USB IDs for new dm96xx variants

Richard Weinberger (1):
      net,via-rhine: Fix tx_timeout handling

Yuval Mintz (1):
      bnx2x: Don't release PCI bars on shutdown

 MAINTAINERS                                      |  3 +++
 arch/arm/net/bpf_jit_32.c                        |  6 +++---
 arch/parisc/include/uapi/asm/socket.h            |  2 +-
 arch/powerpc/net/bpf_jit_comp.c                  |  7 ++++---
 arch/s390/net/bpf_jit_comp.c                     | 29 +++++++++++++++++-----------
 arch/sparc/net/bpf_jit_comp.c                    | 17 ++++++++++++++---
 arch/x86/net/bpf_jit_comp.c                      | 14 ++++++++++----
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 29 ++++++++++++++--------------
 drivers/net/ethernet/chelsio/cxgb4/l2t.c         |  2 +-
 drivers/net/ethernet/emulex/benet/be_main.c      | 11 +++++++++--
 drivers/net/ethernet/intel/e1000e/netdev.c       |  8 ++------
 drivers/net/ethernet/qlogic/qlge/qlge_main.c     |  2 ++
 drivers/net/ethernet/via/via-rhine.c             |  1 +
 drivers/net/usb/dm9601.c                         | 12 ++++++++++++
 drivers/net/usb/usbnet.c                         |  2 +-
 include/net/if_inet6.h                           |  1 -
 net/batman-adv/main.c                            |  2 +-
 net/core/filter.c                                | 30 ++---------------------------
 net/ieee802154/nl-phy.c                          |  6 ++++--
 net/ipv4/inet_diag.c                             |  5 ++++-
 net/ipv4/ipmr.c                                  |  7 +++++--
 net/ipv4/tcp_metrics.c                           | 51 +++++++++++++++++++++++++++++++-------------------
 net/ipv6/addrconf.c                              | 38 +++++++++++++++++--------------------
 net/ipv6/ip6mr.c                                 |  7 +++++--
 net/rds/ib_recv.c                                |  7 +++----
 25 files changed, 169 insertions(+), 130 deletions(-)

^ permalink raw reply
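
Item 3 of the pull request above describes a common pattern: do a fast
lockless lookup first, and if it misses, repeat the lookup under the
lock before creating the entry, so that two racing callers cannot both
insert the same key. A minimal user-space sketch of that control flow
follows. It is not the tcp_metrics code: the names are hypothetical,
the lock is a C11 atomic_flag spinlock, and the sketch does not model
RCU, so the unlocked list walk here is only illustrative.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct model_entry {
	unsigned int key;
	struct model_entry *next;
};

struct model_table {
	struct model_entry *head;
	atomic_flag lock;
	unsigned int created;	/* counts allocations, for illustration */
};

static void model_table_init(struct model_table *t)
{
	t->head = NULL;
	t->created = 0;
	atomic_flag_clear(&t->lock);
}

static struct model_entry *model_lookup(struct model_table *t,
					unsigned int key)
{
	struct model_entry *e;

	for (e = t->head; e; e = e->next)
		if (e->key == key)
			return e;
	return NULL;
}

static struct model_entry *model_get_or_create(struct model_table *t,
					       unsigned int key)
{
	struct model_entry *e;

	assert(t != NULL);

	e = model_lookup(t, key);	/* first lookup, no lock held */
	if (e)
		return e;

	while (atomic_flag_test_and_set(&t->lock))
		;			/* spin until the lock is ours */

	e = model_lookup(t, key);	/* second lookup, under the lock */
	if (!e) {
		e = malloc(sizeof(*e));
		if (e) {
			e->key = key;
			e->next = t->head;
			t->head = e;
			t->created++;
		}
	}

	atomic_flag_clear(&t->lock);
	return e;
}
```

The second lookup is what closes the race: a caller that lost the race
for the lock finds the entry its competitor just inserted and returns
it instead of allocating a duplicate.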

* Re: [PATCH net-next] net: vxlan: do not use vxlan_net before checking event type
From: Cong Wang @ 2014-01-18  3:50 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David Miller, Linux Kernel Network Developers, Eric W. Biederman,
	Jesse Brandeburg
In-Reply-To: <52D97746.1040408@redhat.com>

On Fri, Jan 17, 2014 at 10:32 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
>
>
> If you want to do cleanups, whatever, I really don't care.
> You had your chance to complain about that when you reviewed
> the initial version ... it has nothing to do with the fix.

This is not for stable; as long as it doesn't harm readability,
we are free to do any cleanups.

If unsure, check Eric's patch for tunnel dst cache.

BTW, I am the original author of the patch, you just updated
it *trivially* and set yourself as the author. :) I don't mind, but
remember that this may not be appropriate for others. At the
very least, I didn't and don't do this myself.

^ permalink raw reply

* Re: [PATCH RFC 4/6] net: rfkill: gpio: add device tree support
From: Chen-Yu Tsai @ 2014-01-18  4:41 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Mika Westerberg, Heikki Krogerus, Alexandre Courbot,
	Arnd Bergmann, linux-arm-kernel, Johannes Berg, David S. Miller,
	devicetree, netdev, linux-wireless, linux-sunxi, linux-kernel,
	Maxime Ripard
In-Reply-To: <CACRpkdZOD4zeA8T5kbJ4c5NsnuzHCg1mw8rRMYNT9c4R-Qnc6A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Sat, Jan 18, 2014 at 7:11 AM, Linus Walleij <linus.walleij-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org> wrote:
> On Fri, Jan 17, 2014 at 6:43 PM, Chen-Yu Tsai <wens-jdAy2FN1RRM@public.gmane.org> wrote:
>> On Sat, Jan 18, 2014 at 12:47 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
>
>>>> +- NAME_shutdown-gpios  : GPIO phandle to shutdown control
>>>> +                         (phandle must be the second)
>>>> +- NAME_reset-gpios     : GPIO phandle to reset control
>>>> +
>>>> +NAME must match the rfkill-name property. NAME_shutdown-gpios or
>>>> +NAME_reset-gpios, or both, must be defined.
>>>> +
>>>
>>> I don't understand this part. Why do you include the name in the
>>> gpios property, rather than just hardcoding the property strings
>>> to "shutdown-gpios" and "reset-gpios"?
>>
>> This quirk is a result of how gpiod_get_index implements device tree
>> lookup.
>
> Why can't it just have a single property "gpios", where the first
> element is the reset GPIO and the second is the shutdown GPIO?
>
> rfkill-gpio does this:
>
> gpio = devm_gpiod_get_index(&pdev->dev, rfkill->reset_name, 0);
> gpio = devm_gpiod_get_index(&pdev->dev, rfkill->shutdown_name, 1);
>
> The passed con ID name parameter is only there for the device
> tree case it seems. (ACPI ignores it.) So what about you just
> don't pass it at all and patch it to do like this instead:
>
> gpio = devm_gpiod_get_index(&pdev->dev, NULL, 0);
> gpio = devm_gpiod_get_index(&pdev->dev, NULL, 1);

I'd like that. It's much cleaner.

> Heikki, are you OK with this change?
>
> I think this is actually necessary if the ACPI and DT unification
> pipe dream shall limp forward, we cannot have arguments passed
> that have a semantic effect on DT but not on ACPI... Drivers
> that are supposed to use both ACPI and DT will always
> have to pass NULL as con ID.
>
>> If con_id is given, it is prepended to "gpios" as the property string.
>> con_id is also used as the label passed to gpiod_request, which is
>> then shown in /sys/kernel/debug/gpio.
>
> If your problem is really what turns up in debugfs, then we need
> to figure out a way to label gpios outside of the *gpiod_get* calls.

Let's add a gpiod_set_label call. Currently there's a desc_set_label
in gpiolib, which is static inlined. We can either rename and promote
it to non-static, or add a new wrapping function.

> The string passed in *gpiod_get* is a "connection ID" not a proper
> name for the GPIO.

I see. Perhaps we should not pass this to gpiod_request as the label,
or add a comment stating consumers can use the new gpiod_set_label call
to change it.


Cheers,
ChenYu
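
The con_id behaviour discussed above can be sketched in plain userspace
C. This is only an illustration, not the gpiolib code: the helper name
gpio_prop_name() is hypothetical, and the hyphen separator is inferred
from the "NAME_shutdown-gpios" binding quoted in the thread. It shows
why a non-NULL con_id changes the property name the device tree lookup
searches for, while NULL selects the plain "gpios" property.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Userspace sketch (hypothetical helper, not kernel code): build the
 * device tree property name that a con_id-based lookup would search
 * for. A NULL con_id selects the plain "gpios" property.
 * Returns the snprintf() result, i.e. the untruncated name length.
 */
int gpio_prop_name(char *buf, size_t len, const char *con_id)
{
	assert(buf != NULL && len > 0);

	if (con_id)
		return snprintf(buf, len, "%s-gpios", con_id);

	return snprintf(buf, len, "gpios");
}
```

With con_id "NAME_shutdown" the helper yields "NAME_shutdown-gpios";
with NULL it yields "gpios", where the index argument alone would
distinguish the reset line from the shutdown line.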


* Re: [RFC PATCH net-next 3/3] virtio-net: Add accelerated RFS support
From: Tom Herbert @ 2014-01-18  4:59 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Zhi Yong Wu, Stefan Hajnoczi, Linux Netdev List, Eric Dumazet,
	David S. Miller, Zhi Yong Wu
In-Reply-To: <1389979208.27141.11.camel@bwh-desktop.uk.level5networks.com>

Ben,

I've never quite understood why flow management in aRFS has to be done
with separate messages, and if I recall correctly this seems to reduce
the performance gains to a large extent. It seems like we should be able
to piggyback information about the RX side of a connection, namely the
rxhash and queue mapping, on a TX descriptor for that connection.
State creation should be implicit on just seeing a new rxhash value;
teardown might be accomplished with a separate flag on the final TX
packet on the connection (this would need some additional logic in the
stack). Is this method not feasible in either NICs or virtio-net?
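
A rough userspace sketch of the piggyback idea, to make the question
concrete. Everything here is an assumption for illustration: the
struct and function names, the table size, and the direct-mapped table
(a colliding rxhash simply overwrites the slot) are hypothetical and do
not describe any existing NIC or the virtio-net interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FLOW_TABLE_SIZE 256	/* hypothetical; must be a power of two */

/* Hypothetical per-packet hint carried alongside a TX descriptor. */
struct tx_rx_hint {
	uint32_t rxhash;	/* hash the RX side will compute for this flow */
	uint16_t rx_queue;	/* RX queue the stack wants the flow steered to */
	bool last;		/* set on the final TX packet of the connection */
};

struct flow_entry {
	uint32_t rxhash;
	uint16_t rx_queue;
	bool valid;
};

struct flow_table {
	struct flow_entry slot[FLOW_TABLE_SIZE];
};

void flow_table_init(struct flow_table *t)
{
	assert(t != NULL);
	memset(t, 0, sizeof(*t));
}

/* Device side: inspect the hint on every transmitted descriptor. */
void flow_table_on_tx(struct flow_table *t, const struct tx_rx_hint *h)
{
	struct flow_entry *e;

	assert(t != NULL && h != NULL);
	e = &t->slot[h->rxhash & (FLOW_TABLE_SIZE - 1)];

	if (h->last) {
		/* Teardown: drop the state only if it belongs to this flow. */
		if (e->valid && e->rxhash == h->rxhash)
			e->valid = false;
		return;
	}

	/* Implicit creation or update: a new rxhash installs the state. */
	e->rxhash = h->rxhash;
	e->rx_queue = h->rx_queue;
	e->valid = true;
}

/* RX side: return the steering target, or -1 if no state exists. */
int flow_table_lookup(const struct flow_table *t, uint32_t rxhash)
{
	const struct flow_entry *e;

	assert(t != NULL);
	e = &t->slot[rxhash & (FLOW_TABLE_SIZE - 1)];

	return (e->valid && e->rxhash == rxhash) ? (int)e->rx_queue : -1;
}
```

No separate control message appears anywhere in this model: the first
TX packet of a flow creates the entry, and the flagged final packet
removes it.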

On Fri, Jan 17, 2014 at 9:20 AM, Ben Hutchings
<bhutchings@solarflare.com> wrote:
> On Sat, 2014-01-18 at 00:54 +0800, Zhi Yong Wu wrote:
>> On Fri, Jan 17, 2014 at 7:16 AM, Ben Hutchings
>> <bhutchings@solarflare.com> wrote:
> [...]
>> > However, to take advantage of ARFS on a physical net driver, it would be
>> > necessary to send a control request for part 2.
>> aRFS on a physical net driver? What is this physical net driver? I
>> thought that in order to enable aRFS, guest virtio_net driver should
>> send a control request to its emulated virtio_net NIC.
> [...]
>
> If the backend is connected to a macvlan device on top of a physical net
> device that supports ARFS, then there is further potential for improving
> performance by steering to the best physical RX queue and CPU as well as
> the best virtio_net RX queue and vCPU.
>
> Ben.
>
> --
> Ben Hutchings, Staff Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
>


* Re: [PATCH net] net: core: orphan frags before queuing to slow qdisc
From: Jason Wang @ 2014-01-18  5:35 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, netdev, linux-kernel, Michael S. Tsirkin
In-Reply-To: <1389968897.31367.489.camel@edumazet-glaptop2.roam.corp.google.com>

On 01/17/2014 10:28 PM, Eric Dumazet wrote:
> On Fri, 2014-01-17 at 17:42 +0800, Jason Wang wrote:
>> Many qdiscs can queue a packet for a long time, which causes an issue
>> with zerocopy skbs: the frags will not be orphaned within the expected
>> short time, which breaks the assumption that virtio-net will transmit the
>> packet in time.
>>
>> So if guest packets are queued through such a qdisc and hit the
>> limit on the max pending packets for virtio/vhost, all packets that
>> go from the guest to other destinations will also be blocked.
>>
>> A case for reproducing the issue:
>>
>> - Boot two VMs and connect them to the same bridge kvmbr.
>> - Set up tbf with a very low rate/burst on eth0, which is a port of kvmbr.
>> - Let VM1 send lots of packets through eth0.
>> - After a while, VM1 is unable to send any packets out since the number of
>>    pending packets (queued to tbf) exceeds the limit of vhost/virtio.
> So what's the problem? If the limit is low, you cannot send packets.

It was just an extreme case. The problem is that if zerocopy packets from
VM1 are throttled by the qdisc on eth0, probably all packets from VM1 are
throttled, even those that do not go through eth0.
> Solution : increase the limit, or tell the vm to lower its rate.
>
> Oh wait, are you bitten because you did some prior skb_orphan() to allow
> the vm to send unlimited number of skbs ???
>

The problem is that sndbuf was defaulted to INT_MAX to prevent a similar
issue for non-zerocopy packets. For zerocopy, vhost can notify virtio-net
of tx completion only after the frags have been orphaned. So an
INT_MAX sndbuf is not enough.
>> Solve this issue by orphaning the frags before queuing it to a slow qdisc (the
>> one without TCQ_F_CAN_BYPASS).
> Why does orphaning only the frags solve the problem? A skb without zerocopy
> frags should also be blocked for a while.

It's OK for a non-zerocopy packet to be blocked, since VM1 thinks the
packets have been sent rather than pending in the virtqueue. So VM1 can
still send packets to other destinations.
> Seriously, let's admit this zero copy stuff is utterly broken.
>
>
> TCQ_F_CAN_BYPASS is not enough. Some NICs have separate queues with
> strict priorities.
>

Yes, but that looks less serious than traffic shaping.
> It seems to me that you are pushing to use FIFO (the only qdisc setting
> TCQ_F_CAN_BYPASS), by adding yet another test in fast path (I do not
> know how we can still call it a fast path), while we already have smart
> qdisc to avoid the inherent HOL and unfairness problems of FIFO.
>

It was just a workaround, like the sndbuf case, until we have a better
solution. So it looks like using sfq or fq in the guest can mitigate the issue?
>> Cc: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   net/core/dev.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 0ce469e..1209774 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -2700,6 +2700,12 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
>>   	contended = qdisc_is_running(q);
>>   	if (unlikely(contended))
>>   		spin_lock(&q->busylock);
>> +	if (!(q->flags & TCQ_F_CAN_BYPASS) &&
>> +	    unlikely(skb_orphan_frags(skb, GFP_ATOMIC))) {
>> +		kfree_skb(skb);
>> +		rc = NET_XMIT_DROP;
>> +		goto out;
>> +	}
> Are you aware that copying stuff takes time?
>
> If yes, why is it done after taking the busylock spinlock?
>

Yes, and it should be done outside the spinlock.
>>
>>   	spin_lock(root_lock);
>>   	if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
>> @@ -2739,6 +2745,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
>>   		}
>>   	}
>>   	spin_unlock(root_lock);
>> +out:
>>   	if (unlikely(contended))
>>   		spin_unlock(&q->busylock);
>>   	return rc;
>
>
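
The head-of-line blocking argued about in this message can be modelled
with a toy userspace sketch. It is not vhost or qdisc code: the names
guest_send() and guest_complete() and the cap MAX_PENDING are
hypothetical, and the model assumes only what the thread states, namely
that a zerocopy packet counts against the pending limit until its frags
are released, and that orphaning (copying) the frags lets completion be
signalled at once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 4	/* hypothetical cap on uncompleted zerocopy packets */

struct guest_tx {
	int pending;	/* zerocopy packets still awaiting tx completion */
};

/*
 * Try to send one packet. Returns false when the guest is blocked by
 * the pending cap, which applies whatever the destination is.
 */
bool guest_send(struct guest_tx *g, bool dst_is_slow, bool orphan_frags_first)
{
	assert(g != NULL);

	if (g->pending >= MAX_PENDING)
		return false;

	/*
	 * A packet parked in a slow qdisc pins its frags, so completion
	 * stays outstanding. If the frags are orphaned before queuing,
	 * or the path is fast, completion is treated as immediate.
	 */
	if (dst_is_slow && !orphan_frags_first)
		g->pending++;

	return true;
}

/* The slow qdisc finally dequeued one packet and released its frags. */
void guest_complete(struct guest_tx *g)
{
	assert(g != NULL);

	if (g->pending > 0)
		g->pending--;
}
```

Without orphaning, MAX_PENDING packets sent towards the throttled port
are enough to make a later send to any other destination fail; with
orphaning, the counter never grows. The model deliberately ignores the
cost of the copy, which is the objection raised in the review.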


