netdev.vger.kernel.org archive mirror
* [net-next PATCH 0/4] Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate, round 2
@ 2015-04-09  1:49 Alexander Duyck
  2015-04-09  1:49 ` [net-next PATCH 1/4] cxgb3/4/4vf: Update drivers to use dma_rmb/wmb where appropriate Alexander Duyck
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Alexander Duyck @ 2015-04-09  1:49 UTC (permalink / raw)
  To: netdev; +Cc: davem

More cleanup of drivers in order to start making use of dma_rmb and dma_wmb
calls.  This is another pass of what I would consider to be low hanging
fruit.  There may be other opportunities to make use of the barriers in the
Mellanox and Chelsio drivers but I didn't want to risk meddling with code I
was not completely familiar with so I am leaving that for future work.

I have revisited the Mellanox driver changes.  This time around I went only
for the sections with a clearly defined pattern.  For dma_wmb I used it
between accesses of the descriptor bits followed by owner or size.  For
dma_wmb I used it to replace rmb following a read of the ownership bit in
the descriptor.
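
For reference, the two patterns look roughly like this (an illustrative
sketch only; the descriptor layout and field names here are invented and
do not come from any one driver):

        /* producer: fill the descriptor body, then flip ownership */
        desc->addr = cpu_to_be64(mapping);
        desc->len = cpu_to_be32(len);
        dma_wmb();              /* body visible before the ownership flip */
        desc->owner = cpu_to_be32(OWNER_HW);

        /* consumer: test ownership, then read the rest of the descriptor */
        if (!(desc->owner & cpu_to_be32(OWNER_HW))) {
                dma_rmb();      /* ownership test ordered before body reads */
                len = be32_to_cpu(desc->len);
        }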

---

Alexander Duyck (4):
      cxgb3/4/4vf: Update drivers to use dma_rmb/wmb where appropriate
      mlx4/mlx5: Use dma_wmb/rmb where appropriate
      i40e/i40evf: Use dma_rmb where appropriate
      e100: Use dma_rmb/wmb where appropriate


 drivers/net/ethernet/chelsio/cxgb3/sge.c      |   20 ++++++++++----------
 drivers/net/ethernet/chelsio/cxgb4/sge.c      |    4 ++--
 drivers/net/ethernet/chelsio/cxgb4vf/sge.c    |    4 ++--
 drivers/net/ethernet/intel/e100.c             |    6 +++---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   |    4 ++--
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c |    4 ++--
 drivers/net/ethernet/mellanox/mlx4/en_rx.c    |    2 +-
 drivers/net/ethernet/mellanox/mlx4/en_tx.c    |   12 ++++++------
 drivers/net/ethernet/mellanox/mlx4/eq.c       |    4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/eq.c  |    2 +-
 10 files changed, 31 insertions(+), 31 deletions(-)

--


* [net-next PATCH 1/4] cxgb3/4/4vf: Update drivers to use dma_rmb/wmb where appropriate
  2015-04-09  1:49 [net-next PATCH 0/4] Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate, round 2 Alexander Duyck
@ 2015-04-09  1:49 ` Alexander Duyck
  2015-04-09  1:49 ` [net-next PATCH 2/4] mlx4/mlx5: Use dma_wmb/rmb " Alexander Duyck
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Alexander Duyck @ 2015-04-09  1:49 UTC (permalink / raw)
  To: netdev; +Cc: Santosh Raspatur, Casey Leedom, Hariprasad S, davem

Update the Chelsio Ethernet drivers to use the dma_rmb/wmb calls instead of
the full barriers in order to improve performance.
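
The RX paths changed here all follow the same generation-bit pattern; a
condensed sketch of the cxgb4 response loop (error handling and ring
bookkeeping omitted, see the hunks below for the real code):

        while (budget_left) {
                if (!is_new_response(rc, q))    /* generation bit check */
                        break;

                dma_rmb();      /* flag check ordered before reads of *rc */
                rsp_type = RSPD_TYPE(rc->type_gen);
                /* ... process the response, advance rc ... */
        }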

Cc: Santosh Raspatur <santosh@chelsio.com>
Cc: Hariprasad S <hariprasad@chelsio.com>
Cc: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 drivers/net/ethernet/chelsio/cxgb3/sge.c   |   20 ++++++++++----------
 drivers/net/ethernet/chelsio/cxgb4/sge.c   |    4 ++--
 drivers/net/ethernet/chelsio/cxgb4vf/sge.c |    4 ++--
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb3/sge.c b/drivers/net/ethernet/chelsio/cxgb3/sge.c
index d6aa602f168d..e4b5b057f417 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/sge.c
@@ -422,7 +422,7 @@ static inline int add_one_rx_buf(void *va, unsigned int len,
 
 	d->addr_lo = cpu_to_be32(mapping);
 	d->addr_hi = cpu_to_be32((u64) mapping >> 32);
-	wmb();
+	dma_wmb();
 	d->len_gen = cpu_to_be32(V_FLD_GEN1(gen));
 	d->gen2 = cpu_to_be32(V_FLD_GEN2(gen));
 	return 0;
@@ -433,7 +433,7 @@ static inline int add_one_rx_chunk(dma_addr_t mapping, struct rx_desc *d,
 {
 	d->addr_lo = cpu_to_be32(mapping);
 	d->addr_hi = cpu_to_be32((u64) mapping >> 32);
-	wmb();
+	dma_wmb();
 	d->len_gen = cpu_to_be32(V_FLD_GEN1(gen));
 	d->gen2 = cpu_to_be32(V_FLD_GEN2(gen));
 	return 0;
@@ -579,7 +579,7 @@ static void recycle_rx_buf(struct adapter *adap, struct sge_fl *q,
 	q->sdesc[q->pidx] = q->sdesc[idx];
 	to->addr_lo = from->addr_lo;	/* already big endian */
 	to->addr_hi = from->addr_hi;	/* likewise */
-	wmb();
+	dma_wmb();
 	to->len_gen = cpu_to_be32(V_FLD_GEN1(q->gen));
 	to->gen2 = cpu_to_be32(V_FLD_GEN2(q->gen));
 
@@ -1068,7 +1068,7 @@ static void write_wr_hdr_sgl(unsigned int ndesc, struct sk_buff *skb,
 		sd->eop = 1;
 		wrp->wr_hi = htonl(F_WR_SOP | F_WR_EOP | V_WR_DATATYPE(1) |
 				   V_WR_SGLSFLT(flits)) | wr_hi;
-		wmb();
+		dma_wmb();
 		wrp->wr_lo = htonl(V_WR_LEN(flits + sgl_flits) |
 				   V_WR_GEN(gen)) | wr_lo;
 		wr_gen2(d, gen);
@@ -1114,7 +1114,7 @@ static void write_wr_hdr_sgl(unsigned int ndesc, struct sk_buff *skb,
 		}
 		sd->eop = 1;
 		wrp->wr_hi |= htonl(F_WR_EOP);
-		wmb();
+		dma_wmb();
 		wp->wr_lo = htonl(V_WR_LEN(WR_FLITS) | V_WR_GEN(ogen)) | wr_lo;
 		wr_gen2((struct tx_desc *)wp, ogen);
 		WARN_ON(ndesc != 0);
@@ -1184,7 +1184,7 @@ static void write_tx_pkt_wr(struct adapter *adap, struct sk_buff *skb,
 			cpl->wr.wr_hi = htonl(V_WR_BCNTLFLT(skb->len & 7) |
 					      V_WR_OP(FW_WROPCODE_TUNNEL_TX_PKT)
 					      | F_WR_SOP | F_WR_EOP | compl);
-			wmb();
+			dma_wmb();
 			cpl->wr.wr_lo = htonl(V_WR_LEN(flits) | V_WR_GEN(gen) |
 					      V_WR_TID(q->token));
 			wr_gen2(d, gen);
@@ -1342,7 +1342,7 @@ static inline void write_imm(struct tx_desc *d, struct sk_buff *skb,
 
 	to->wr_hi = from->wr_hi | htonl(F_WR_SOP | F_WR_EOP |
 					V_WR_BCNTLFLT(len & 7));
-	wmb();
+	dma_wmb();
 	to->wr_lo = from->wr_lo | htonl(V_WR_GEN(gen) |
 					V_WR_LEN((len + 7) / 8));
 	wr_gen2(d, gen);
@@ -2271,7 +2271,7 @@ static int process_responses(struct adapter *adap, struct sge_qset *qs,
 		u32 len, flags;
 		__be32 rss_hi, rss_lo;
 
-		rmb();
+		dma_rmb();
 		eth = r->rss_hdr.opcode == CPL_RX_PKT;
 		rss_hi = *(const __be32 *)r;
 		rss_lo = r->rss_hdr.rss_hash_val;
@@ -2488,7 +2488,7 @@ static int process_pure_responses(struct adapter *adap, struct sge_qset *qs,
 		}
 		if (!is_new_response(r, q))
 			break;
-		rmb();
+		dma_rmb();
 	} while (is_pure_response(r));
 
 	if (sleeping)
@@ -2523,7 +2523,7 @@ static inline int handle_responses(struct adapter *adap, struct sge_rspq *q)
 
 	if (!is_new_response(r, q))
 		return -1;
-	rmb();
+	dma_rmb();
 	if (is_pure_response(r) && process_pure_responses(adap, qs, r) == 0) {
 		t3_write_reg(adap, A_SG_GTS, V_RSPQ(q->cntxt_id) |
 			     V_NEWTIMER(q->holdoff_tmr) | V_NEWINDEX(q->cidx));
diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index c438f3895c40..e622214e2eca 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -1968,7 +1968,7 @@ static int process_responses(struct sge_rspq *q, int budget)
 		if (!is_new_response(rc, q))
 			break;
 
-		rmb();
+		dma_rmb();
 		rsp_type = RSPD_TYPE(rc->type_gen);
 		if (likely(rsp_type == RSP_TYPE_FLBUF)) {
 			struct page_frag *fp;
@@ -2160,7 +2160,7 @@ static unsigned int process_intrq(struct adapter *adap)
 		if (!is_new_response(rc, q))
 			break;
 
-		rmb();
+		dma_rmb();
 		if (RSPD_TYPE(rc->type_gen) == RSP_TYPE_INTR) {
 			unsigned int qid = ntohl(rc->pldbuflen_qid);
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4vf/sge.c b/drivers/net/ethernet/chelsio/cxgb4vf/sge.c
index 7715982230e5..482f6de6817d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4vf/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4vf/sge.c
@@ -1751,7 +1751,7 @@ static int process_responses(struct sge_rspq *rspq, int budget)
 		 * Figure out what kind of response we've received from the
 		 * SGE.
 		 */
-		rmb();
+		dma_rmb();
 		rsp_type = RSPD_TYPE(rc->type_gen);
 		if (likely(rsp_type == RSP_TYPE_FLBUF)) {
 			struct page_frag *fp;
@@ -1935,7 +1935,7 @@ static unsigned int process_intrq(struct adapter *adapter)
 		 * error and go on to the next response message.  This should
 		 * never happen ...
 		 */
-		rmb();
+		dma_rmb();
 		if (unlikely(RSPD_TYPE(rc->type_gen) != RSP_TYPE_INTR)) {
 			dev_err(adapter->pdev_dev,
 				"Unexpected INTRQ response type %d\n",


* [net-next PATCH 2/4] mlx4/mlx5: Use dma_wmb/rmb where appropriate
  2015-04-09  1:49 [net-next PATCH 0/4] Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate, round 2 Alexander Duyck
  2015-04-09  1:49 ` [net-next PATCH 1/4] cxgb3/4/4vf: Update drivers to use dma_rmb/wmb where appropriate Alexander Duyck
@ 2015-04-09  1:49 ` Alexander Duyck
  2015-04-09  1:49 ` [net-next PATCH 3/4] i40e/i40evf: Use dma_rmb " Alexander Duyck
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Alexander Duyck @ 2015-04-09  1:49 UTC (permalink / raw)
  To: netdev; +Cc: davem, Amir Vadai, Ido Shamay, Eli Cohen

This patch should help to improve the performance of the mlx4 and mlx5
drivers on a number of architectures.  For example, on x86 the dma_wmb/rmb
calls equate to a barrier() call as the architecture is already strongly
ordered, and on PowerPC they work out to a lwsync, which is significantly
less expensive than the sync instruction that was being used for wmb.

I placed the new barriers between any spots that seemed to be trying to
order memory/memory reads or writes.  If a spot involved MMIO I left the
existing wmb in place, as the new barriers cannot order transactions
between coherent and non-coherent memories.
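
To make the cost difference concrete, the per-architecture definitions
map out roughly as follows (paraphrased from the arch barrier headers of
this era, not verbatim):

        /* x86: cacheable memory is already strongly ordered, so the
         * dma_* barriers only need to stop compiler reordering */
        #define dma_rmb()       barrier()
        #define dma_wmb()       barrier()

        /* powerpc (simplified): lwsync orders cacheable accesses and is
         * much cheaper than the full sync behind wmb()/rmb() */
        #define dma_rmb()       __asm__ __volatile__ ("lwsync" : : : "memory")
        #define dma_wmb()       __asm__ __volatile__ ("lwsync" : : : "memory")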

v2: Reduced the replacements to just the spots where I could clearly
    identify the usage pattern.

Cc: Amir Vadai <amirv@mellanox.com>
Cc: Ido Shamay <idos@mellanox.com>
Cc: Eli Cohen <eli@mellanox.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c   |    2 +-
 drivers/net/ethernet/mellanox/mlx4/en_tx.c   |   12 ++++++------
 drivers/net/ethernet/mellanox/mlx4/eq.c      |    4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/eq.c |    2 +-
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 79b1501e7951..4fdd3c37e47b 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -771,7 +771,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 		/*
 		 * make sure we read the CQE after we read the ownership bit
 		 */
-		rmb();
+		dma_rmb();
 
 		/* Drop packet on bad receive or bad checksum */
 		if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) ==
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 55f9f5c5344e..1783705273d8 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -416,7 +416,7 @@ static bool mlx4_en_process_tx_cq(struct net_device *dev,
 		 * make sure we read the CQE after we read the
 		 * ownership bit
 		 */
-		rmb();
+		dma_rmb();
 
 		if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) ==
 			     MLX4_CQE_OPCODE_ERROR)) {
@@ -667,7 +667,7 @@ static void build_inline_wqe(struct mlx4_en_tx_desc *tx_desc,
 				       skb_frag_size(&shinfo->frags[0]));
 		}
 
-		wmb();
+		dma_wmb();
 		inl->byte_count = cpu_to_be32(1 << 31 | (skb->len - spc));
 	}
 }
@@ -804,7 +804,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 
 			data->addr = cpu_to_be64(dma);
 			data->lkey = ring->mr_key;
-			wmb();
+			dma_wmb();
 			data->byte_count = cpu_to_be32(byte_count);
 			--data;
 		}
@@ -821,7 +821,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 
 			data->addr = cpu_to_be64(dma);
 			data->lkey = ring->mr_key;
-			wmb();
+			dma_wmb();
 			data->byte_count = cpu_to_be32(byte_count);
 		}
 		/* tx completion can avoid cache line miss for common cases */
@@ -938,7 +938,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 		/* Ensure new descriptor hits memory
 		 * before setting ownership of this descriptor to HW
 		 */
-		wmb();
+		dma_wmb();
 		tx_desc->ctrl.owner_opcode = op_own;
 
 		wmb();
@@ -958,7 +958,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 		/* Ensure new descriptor hits memory
 		 * before setting ownership of this descriptor to HW
 		 */
-		wmb();
+		dma_wmb();
 		tx_desc->ctrl.owner_opcode = op_own;
 		if (send_doorbell) {
 			wmb();
diff --git a/drivers/net/ethernet/mellanox/mlx4/eq.c b/drivers/net/ethernet/mellanox/mlx4/eq.c
index 6e70ffee8e87..190fd624bdfe 100644
--- a/drivers/net/ethernet/mellanox/mlx4/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/eq.c
@@ -188,7 +188,7 @@ static void slave_event(struct mlx4_dev *dev, u8 slave, struct mlx4_eqe *eqe)
 	memcpy(s_eqe, eqe, dev->caps.eqe_size - 1);
 	s_eqe->slave_id = slave;
 	/* ensure all information is written before setting the ownersip bit */
-	wmb();
+	dma_wmb();
 	s_eqe->owner = !!(slave_eq->prod & SLAVE_EVENT_EQ_SIZE) ? 0x0 : 0x80;
 	++slave_eq->prod;
 
@@ -473,7 +473,7 @@ static int mlx4_eq_int(struct mlx4_dev *dev, struct mlx4_eq *eq)
 		 * Make sure we read EQ entry contents after we've
 		 * checked the ownership bit.
 		 */
-		rmb();
+		dma_rmb();
 
 		switch (eqe->type) {
 		case MLX4_EVENT_TYPE_COMP:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index dbf190d9b9ad..58800e4f3958 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -208,7 +208,7 @@ static int mlx5_eq_int(struct mlx5_core_dev *dev, struct mlx5_eq *eq)
 		 * Make sure we read EQ entry contents after we've
 		 * checked the ownership bit.
 		 */
-		rmb();
+		dma_rmb();
 
 		mlx5_core_dbg(eq->dev, "eqn %d, eqe type %s\n",
 			      eq->eqn, eqe_type_str(eqe->type));


* [net-next PATCH 3/4] i40e/i40evf: Use dma_rmb where appropriate
  2015-04-09  1:49 [net-next PATCH 0/4] Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate, round 2 Alexander Duyck
  2015-04-09  1:49 ` [net-next PATCH 1/4] cxgb3/4/4vf: Update drivers to use dma_rmb/wmb where appropriate Alexander Duyck
  2015-04-09  1:49 ` [net-next PATCH 2/4] mlx4/mlx5: Use dma_wmb/rmb " Alexander Duyck
@ 2015-04-09  1:49 ` Alexander Duyck
  2015-04-09  2:52   ` Jeff Kirsher
  2015-04-09  1:49 ` [net-next PATCH 4/4] e100: Use dma_rmb/wmb " Alexander Duyck
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 10+ messages in thread
From: Alexander Duyck @ 2015-04-09  1:49 UTC (permalink / raw)
  To: netdev; +Cc: Jeff Kirsher, davem

Update i40e and i40evf to use dma_rmb.  This should improve performance by
decreasing the barrier overhead on strongly ordered architectures.
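
Every hunk below has the same shape: read the descriptor status qword,
test the DD bit, and only then touch the rest of the descriptor.
Condensed (macro and field names as in the driver, surrounding loop
omitted):

        qword = le64_to_cpu(rx_desc->wb.qword1.status_error_len);
        rx_status = (qword & I40E_RXD_QW1_STATUS_MASK) >>
                    I40E_RXD_QW1_STATUS_SHIFT;
        if (!(rx_status & (1 << I40E_RX_DESC_STATUS_DD_SHIFT)))
                break;

        dma_rmb();      /* DD test ordered before any other rx_desc reads */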

Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   |    4 ++--
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index d8989f9d1798..4bd3a80aba82 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1554,7 +1554,7 @@ static int i40e_clean_rx_irq_ps(struct i40e_ring *rx_ring, int budget)
 		 * any other fields out of the rx_desc until we know the
 		 * DD bit is set.
 		 */
-		rmb();
+		dma_rmb();
 		if (i40e_rx_is_programming_status(qword)) {
 			i40e_clean_programming_status(rx_ring, rx_desc);
 			I40E_RX_INCREMENT(rx_ring, i);
@@ -1745,7 +1745,7 @@ static int i40e_clean_rx_irq_1buf(struct i40e_ring *rx_ring, int budget)
 		 * any other fields out of the rx_desc until we know the
 		 * DD bit is set.
 		 */
-		rmb();
+		dma_rmb();
 
 		if (i40e_rx_is_programming_status(qword)) {
 			i40e_clean_programming_status(rx_ring, rx_desc);
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index e2ddb30e96f5..b077e02a0cc7 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -1034,7 +1034,7 @@ static int i40e_clean_rx_irq_ps(struct i40e_ring *rx_ring, int budget)
 		 * any other fields out of the rx_desc until we know the
 		 * DD bit is set.
 		 */
-		rmb();
+		dma_rmb();
 		rx_bi = &rx_ring->rx_bi[i];
 		skb = rx_bi->skb;
 		if (likely(!skb)) {
@@ -1213,7 +1213,7 @@ static int i40e_clean_rx_irq_1buf(struct i40e_ring *rx_ring, int budget)
 		 * any other fields out of the rx_desc until we know the
 		 * DD bit is set.
 		 */
-		rmb();
+		dma_rmb();
 
 		rx_bi = &rx_ring->rx_bi[i];
 		skb = rx_bi->skb;


* [net-next PATCH 4/4] e100: Use dma_rmb/wmb where appropriate
  2015-04-09  1:49 [net-next PATCH 0/4] Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate, round 2 Alexander Duyck
                   ` (2 preceding siblings ...)
  2015-04-09  1:49 ` [net-next PATCH 3/4] i40e/i40evf: Use dma_rmb " Alexander Duyck
@ 2015-04-09  1:49 ` Alexander Duyck
  2015-04-09  2:48   ` Jeff Kirsher
  2015-04-09  2:36 ` [net-next PATCH 0/4] Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate, round 2 David Miller
  2015-04-09 18:32 ` David Miller
  5 siblings, 1 reply; 10+ messages in thread
From: Alexander Duyck @ 2015-04-09  1:49 UTC (permalink / raw)
  To: netdev; +Cc: Jeff Kirsher, davem

Reduce the CPU overhead for transmit and receive by using the lightweight
dma_rmb/dma_wmb barriers instead of full barriers where they are applicable.
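
A rough illustration of "where they are applicable": a write ordered
only against other coherent descriptor memory can take the light
barrier, while a write that must be ordered against an MMIO doorbell
still needs the full wmb(), as the mlx4 changes in this series also
show (names here are illustrative, not taken from e100):

        desc->flags = cpu_to_le16(flags);
        dma_wmb();      /* descriptor vs. descriptor: light barrier is enough */
        desc->owner = cpu_to_le16(OWNER_HW);

        wmb();          /* descriptor vs. MMIO doorbell: keep the full wmb() */
        writel(ring->tail, ring->tail_reg);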

Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 drivers/net/ethernet/intel/e100.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/e100.c b/drivers/net/ethernet/intel/e100.c
index 05f88394f9a5..1a450f4b6b12 100644
--- a/drivers/net/ethernet/intel/e100.c
+++ b/drivers/net/ethernet/intel/e100.c
@@ -899,7 +899,7 @@ static int e100_exec_cb(struct nic *nic, struct sk_buff *skb,
 	/* Order is important otherwise we'll be in a race with h/w:
 	 * set S-bit in current first, then clear S-bit in previous. */
 	cb->command |= cpu_to_le16(cb_s);
-	wmb();
+	dma_wmb();
 	cb->prev->command &= cpu_to_le16(~cb_s);
 
 	while (nic->cb_to_send != nic->cb_to_use) {
@@ -1843,7 +1843,7 @@ static int e100_tx_clean(struct nic *nic)
 	for (cb = nic->cb_to_clean;
 	    cb->status & cpu_to_le16(cb_complete);
 	    cb = nic->cb_to_clean = cb->next) {
-		rmb(); /* read skb after status */
+		dma_rmb(); /* read skb after status */
 		netif_printk(nic, tx_done, KERN_DEBUG, nic->netdev,
 			     "cb[%d]->status = 0x%04X\n",
 			     (int)(((void*)cb - (void*)nic->cbs)/sizeof(struct cb)),
@@ -1993,7 +1993,7 @@ static int e100_rx_indicate(struct nic *nic, struct rx *rx,
 
 	netif_printk(nic, rx_status, KERN_DEBUG, nic->netdev,
 		     "status=0x%04X\n", rfd_status);
-	rmb(); /* read size after status bit */
+	dma_rmb(); /* read size after status bit */
 
 	/* If data isn't ready, nothing to indicate */
 	if (unlikely(!(rfd_status & cb_complete))) {


* Re: [net-next PATCH 0/4] Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate, round 2
  2015-04-09  1:49 [net-next PATCH 0/4] Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate, round 2 Alexander Duyck
                   ` (3 preceding siblings ...)
  2015-04-09  1:49 ` [net-next PATCH 4/4] e100: Use dma_rmb/wmb " Alexander Duyck
@ 2015-04-09  2:36 ` David Miller
  2015-04-09  2:46   ` Alexander Duyck
  2015-04-09 18:32 ` David Miller
  5 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2015-04-09  2:36 UTC (permalink / raw)
  To: alexander.h.duyck; +Cc: netdev

From: Alexander Duyck <alexander.h.duyck@redhat.com>
Date: Wed, 08 Apr 2015 18:49:23 -0700

> More cleanup of drivers in order to start making use of dma_rmb and dma_wmb
> calls.  This is another pass of what I would consider to be low hanging
> fruit.  There may be other opportunities to make use of the barriers in the
> Mellanox and Chelsio drivers but I didn't want to risk meddling with code I
> was not completely familiar with so I am leaving that for future work.
> 
> I have revisited the Mellanox driver changes.  This time around I went only
> for the sections with a clearly defined pattern.  For dma_wmb I used it
> between accesses of the descriptor bits followed by owner or size.  For
> dma_wmb I used it to replace rmb following a read of the ownership bit in
> the descriptor.

That last sentence should be talking about "dma_rmb" instead of "dma_wmb"
right?


* Re: [net-next PATCH 0/4] Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate, round 2
  2015-04-09  2:36 ` [net-next PATCH 0/4] Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate, round 2 David Miller
@ 2015-04-09  2:46   ` Alexander Duyck
  0 siblings, 0 replies; 10+ messages in thread
From: Alexander Duyck @ 2015-04-09  2:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev



On 04/08/2015 07:36 PM, David Miller wrote:
> From: Alexander Duyck <alexander.h.duyck@redhat.com>
> Date: Wed, 08 Apr 2015 18:49:23 -0700
>
>> More cleanup of drivers in order to start making use of dma_rmb and dma_wmb
>> calls.  This is another pass of what I would consider to be low hanging
>> fruit.  There may be other opportunities to make use of the barriers in the
>> Mellanox and Chelsio drivers but I didn't want to risk meddling with code I
>> was not completely familiar with so I am leaving that for future work.
>>
>> I have revisited the Mellanox driver changes.  This time around I went only
>> for the sections with a clearly defined pattern.  For dma_wmb I used it
>> between accesses of the descriptor bits followed by owner or size.  For
>> dma_wmb I used it to replace rmb following a read of the ownership bit in
>> the descriptor.
> That last sentence should be talking about "dma_rmb" instead of "dma_wmb"
> right?

Yeah, it was just a typo.

rmb -> dma_rmb
wmb -> dma_wmb

I'll double check my patches to be sure I didn't make the mistake 
anywhere in the code.

Thanks.

- Alex


* Re: [net-next PATCH 4/4] e100: Use dma_rmb/wmb where appropriate
  2015-04-09  1:49 ` [net-next PATCH 4/4] e100: Use dma_rmb/wmb " Alexander Duyck
@ 2015-04-09  2:48   ` Jeff Kirsher
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff Kirsher @ 2015-04-09  2:48 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: netdev, davem


On Wed, 2015-04-08 at 18:49 -0700, Alexander Duyck wrote:
> Reduce the CPU overhead for transmit and receive by using the lightweight
> dma_rmb/dma_wmb barriers instead of full barriers where they are
> applicable.
> 
> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>

Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

> ---
>  drivers/net/ethernet/intel/e100.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)





* Re: [net-next PATCH 3/4] i40e/i40evf: Use dma_rmb where appropriate
  2015-04-09  1:49 ` [net-next PATCH 3/4] i40e/i40evf: Use dma_rmb " Alexander Duyck
@ 2015-04-09  2:52   ` Jeff Kirsher
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff Kirsher @ 2015-04-09  2:52 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: netdev, davem


On Wed, 2015-04-08 at 18:49 -0700, Alexander Duyck wrote:
> Update i40e and i40evf to use dma_rmb.  This should improve performance
> by decreasing the barrier overhead on strongly ordered architectures.
> 
> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>

Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

> ---
>  drivers/net/ethernet/intel/i40e/i40e_txrx.c   |    4 ++--
>  drivers/net/ethernet/intel/i40evf/i40e_txrx.c |    4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)





* Re: [net-next PATCH 0/4] Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate, round 2
  2015-04-09  1:49 [net-next PATCH 0/4] Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate, round 2 Alexander Duyck
                   ` (4 preceding siblings ...)
  2015-04-09  2:36 ` [net-next PATCH 0/4] Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate, round 2 David Miller
@ 2015-04-09 18:32 ` David Miller
  5 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2015-04-09 18:32 UTC (permalink / raw)
  To: alexander.h.duyck; +Cc: netdev

From: Alexander Duyck <alexander.h.duyck@redhat.com>
Date: Wed, 08 Apr 2015 18:49:23 -0700

> More cleanup of drivers in order to start making use of dma_rmb and dma_wmb
> calls.  This is another pass of what I would consider to be low hanging
> fruit.  There may be other opportunities to make use of the barriers in the
> Mellanox and Chelsio drivers but I didn't want to risk meddling with code I
> was not completely familiar with so I am leaving that for future work.
> 
> I have revisited the Mellanox driver changes.  This time around I went only
> for the sections with a clearly defined pattern.  For dma_wmb I used it
> between accesses of the descriptor bits followed by owner or size.  For
> dma_wmb I used it to replace rmb following a read of the ownership bit in
> the descriptor.

Series applied with fixed commit message.

