Netdev List
 help / color / mirror / Atom feed
* Re: [PATCHv4 net-next 3/6] {IPv4,xfrm} Add ESN support for AH ingress part
From: Steffen Klassert @ 2014-01-16  9:56 UTC (permalink / raw)
  To: Fan Du; +Cc: davem, netdev
In-Reply-To: <1389769995-350-4-git-send-email-fan.du@windriver.com>

On Wed, Jan 15, 2014 at 03:13:12PM +0800, Fan Du wrote:
>  
> -	ahash_request_set_crypt(req, sg, icv, skb->len);
> +	if (x->props.flags & XFRM_STATE_ESN) {
> +		/* Attach seqhi sg right after packet payload */
> +		*seqhi = htonl(XFRM_SKB_CB(skb)->seq.input.hi);

XFRM_SKB_CB(skb)->seq.input.hi is already in network byte order,
please remove the htonl(). The ipv6 patch has the same problem.

^ permalink raw reply

* Re: [PATCHv3 net-next 1/2] flowcache: Make flow cache name space aware
From: Steffen Klassert @ 2014-01-16 10:19 UTC (permalink / raw)
  To: Fan Du; +Cc: davem, netdev
In-Reply-To: <1389771254-8738-2-git-send-email-fan.du@windriver.com>

On Wed, Jan 15, 2014 at 03:34:13PM +0800, Fan Du wrote:
> Inserting a entry into flowcache, or flushing flowcache should be based
> on per net scope. The reason to do so is flushing operation from fat
> netns crammed with flow entries will also making the slim netns with only
> a few flow cache entries go away in original implementation.
> 
> Since flowcache is tightly coupled with IPsec, so it would be easier to
> put flow cache global parameters into xfrm namespace part.

With this change the flow cache is not globally available any more,
it becomes a part of IPsec. IPsec is the only user of the flow cache
and I guess noone else want to use it (we removed the routing cache
to avoid such caching recently).

So I would not mind to move this into IPsec, but I want to see an ack
from David before I take this into ipsec-next.

^ permalink raw reply

* Kernel Crash
From: Nieścierowicz Adam @ 2014-01-16 10:24 UTC (permalink / raw)
  To: netdev

Hi,

I do not know if I am writing in a good place.

With the launch of the new kernel 3.13.0-rc8 on on our router errors 
appear.

http://wklej.org/id/1238075/ [1]

Is this a kernel bug or wrong configuration?

-- 

Thanks,
Adam Nieścierowicz


Links:
------
[1] http://wklej.org/id/1238075/

^ permalink raw reply

* Re: [PATCH net-next 2/2] reciprocal_divide: correction/update of the algorithm
From: Hannes Frederic Sowa @ 2014-01-16 10:26 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Daniel Borkmann, davem, netdev, linux-kernel, Austin S Hemmelgarn,
	Jesse Gross, Jamal Hadi Salim, Stephen Hemminger, Matt Mackall,
	Pekka Enberg, Christoph Lameter, Andy Gospodarek,
	Veaceslav Falico, Jay Vosburgh, Jakub Zawadzki
In-Reply-To: <1389841646.31367.391.camel@edumazet-glaptop2.roam.corp.google.com>

Hi Eric!

On Wed, Jan 15, 2014 at 07:07:26PM -0800, Eric Dumazet wrote:
> On Thu, 2014-01-16 at 00:23 +0100, Daniel Borkmann wrote:
> 
> > Also, reciprocal_value() and reciprocal_divide() always return 0
> > for divisions by 1. This is a bit worrisome as those functions
> > also get used in mm/slab.c and lib/flex_array.c, apparently for
> > index calculation to access array slots. 
> 
> Hi Daniel
> 
> This off-by-one limitation is a known one,
> and mm/slab.c does not have an issue with it because :
> 
> - Minimal object size is not 1 byte, but 8 (or maybe 4)
> - We always divide a multiple of the divisor,
>   so there is no off-by-one effect.
> 
> Little attached prog does a brute force check if needed.
> 
> So far, the only relevant issue was about BPF, and a better
> documentation of reciprocal_divide() use cases.
> 
> (I let Jesse comment on the flex_array case)
> 
> I am unsure we want to 'fix' things, we tried hard in the past to avoid
> divides, so the ones we use are usually because the divisor is not
> constant, so the reciprocal doesn't help.
> 
> (BPF is fixed in David tree)
> 
> Thanks !

You are right, we rewrite that part. The text is still from the first commit
message where we did no full impact analysis. ;)

Thanks,

  Hannes

^ permalink raw reply

* [net-next 6/6] ixgbe: Fix incorrect logic for fixed fiber eeprom write
From: Aaron Brown @ 2014-01-16 10:30 UTC (permalink / raw)
  To: davem; +Cc: Don Skidmore, netdev, gospo, sassmann, Aaron Brown
In-Reply-To: <1389868210-24035-1-git-send-email-aaron.f.brown@intel.com>

From: Don Skidmore <donald.c.skidmore@intel.com>

In this code we wanted to set the bit in IXGBE_SFF_SOFT_RS_SELECT_MASK to
the value in rs.  So we really needed a logical or rather than an and, this
patch makes that change.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c
index 007a008..edda681 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c
@@ -626,7 +626,7 @@ static void ixgbe_set_fiber_fixed_speed(struct ixgbe_hw *hw,
 		goto out;
 	}
 
-	eeprom_data = (eeprom_data & ~IXGBE_SFF_SOFT_RS_SELECT_MASK) & rs;
+	eeprom_data = (eeprom_data & ~IXGBE_SFF_SOFT_RS_SELECT_MASK) | rs;
 
 	status = hw->phy.ops.write_i2c_byte(hw, IXGBE_SFF_SFF_8472_OSCB,
 					    IXGBE_I2C_EEPROM_DEV_ADDR2,
-- 
1.8.5.GIT

^ permalink raw reply related

* [net-next 0/6] Intel Wired LAN Driver Updates
From: Aaron Brown @ 2014-01-16 10:30 UTC (permalink / raw)
  To: davem; +Cc: Aaron Brown, netdev, gospo, sassmann

This series contains updates to ixgbe and ixgbevf.

John adds rtnl lock / unlock semantics for ixgbe_reinit_locked()
which was being called without the rtnl lock being held.

Jacob corrects an issue where ixgbevf_qv_disable function does not
set the disabled bit correctly.


>From the community, Wei uses a type of struct for pci driver-specific
data in ixgbevf_suspend()

Don changes the way we store ring arrays in a manner that allows 
support of multiple queues on multiple nodes and creates new ring
initialization functions for work previously done across multiple
functions - making the code closer to ixgbe and hopefully more readable.
He also fixes incorrect fiber eeprom write logic. 

John Fastabend (1):
  ixgbe: reinit_locked() should be called with rtnl_lock

Jacob Keller (1):
  ixgbevf: set the disable state when ixgbevf_qv_disable is called

Wei Yongjun (1):
  ixgbevf: use pci drvdata correctly in ixgbevf_suspend()

Don Skidmore (3):
  ixgbevf: Convert ring storage form pointer to an array to array of
    pointers
  ixgbevf: create function for all of ring init
  ixgbe: Fix incorrect logic for fixed fiber eeprom write

 drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c    |   2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     |   2 +
 drivers/net/ethernet/intel/ixgbevf/defines.h      |  17 +
 drivers/net/ethernet/intel/ixgbevf/ethtool.c      |  30 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |   5 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 431 ++++++++++++----------
 6 files changed, 274 insertions(+), 213 deletions(-)

-- 
1.8.5.GIT

^ permalink raw reply

* [net-next 2/6] ixgbevf: set the disable state when ixgbevf_qv_disable is called
From: Aaron Brown @ 2014-01-16 10:30 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, gospo, sassmann, Aaron Brown
In-Reply-To: <1389868210-24035-1-git-send-email-aaron.f.brown@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

The ixgbevf_qv_disable function used by CONFIG_NET_RX_BUSY_POLL is broken,
because it does not properly set the IXGBEVF_QV_STATE_DISABLED bit, indicating
that the q_vector should be disabled (and preventing future locks from
obtaining the vector). This patch corrects the issue by setting the disable
state.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
index bb76e96..d7cec1d 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
@@ -260,6 +260,7 @@ static inline bool ixgbevf_qv_disable(struct ixgbevf_q_vector *q_vector)
 	spin_lock_bh(&q_vector->lock);
 	if (q_vector->state & IXGBEVF_QV_OWNED)
 		rc = false;
+	q_vector->state |= IXGBEVF_QV_STATE_DISABLED;
 	spin_unlock_bh(&q_vector->lock);
 	return rc;
 }
-- 
1.8.5.GIT

^ permalink raw reply related

* [net-next 1/6] ixgbe: reinit_locked() should be called with rtnl_lock
From: Aaron Brown @ 2014-01-16 10:30 UTC (permalink / raw)
  To: davem; +Cc: John Fastabend, netdev, gospo, sassmann, Aaron Brown
In-Reply-To: <1389868210-24035-1-git-send-email-aaron.f.brown@intel.com>

From: John Fastabend <john.r.fastabend@intel.com>

ixgbe_service_task() is calling ixgbe_reinit_locked() without
the rtnl_lock being held. This is because it is being called
from a worker thread and not a rtnl netlink or dcbnl path.

Add rtnl_{un}lock() semantics. I found this during code review.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 3ca59d2..b445ad1 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -6392,7 +6392,9 @@ static void ixgbe_reset_subtask(struct ixgbe_adapter *adapter)
 	netdev_err(adapter->netdev, "Reset adapter\n");
 	adapter->tx_timeout_count++;
 
+	rtnl_lock();
 	ixgbe_reinit_locked(adapter);
+	rtnl_unlock();
 }
 
 /**
-- 
1.8.5.GIT

^ permalink raw reply related

* [net-next 3/6] ixgbevf: use pci drvdata correctly in ixgbevf_suspend()
From: Aaron Brown @ 2014-01-16 10:30 UTC (permalink / raw)
  To: davem; +Cc: Wei Yongjun, netdev, gospo, sassmann, Aaron Brown
In-Reply-To: <1389868210-24035-1-git-send-email-aaron.f.brown@intel.com>

From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

We had set the pci driver-specific data in ixgbevf_probe() as a type of
struct net_device, so we should use it as netdev in ixgbevf_suspend().

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index a5d3167..74af295 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -3222,8 +3222,8 @@ static int ixgbevf_suspend(struct pci_dev *pdev, pm_message_t state)
 #ifdef CONFIG_PM
 static int ixgbevf_resume(struct pci_dev *pdev)
 {
-	struct ixgbevf_adapter *adapter = pci_get_drvdata(pdev);
-	struct net_device *netdev = adapter->netdev;
+	struct net_device *netdev = pci_get_drvdata(pdev);
+	struct ixgbevf_adapter *adapter = netdev_priv(netdev);
 	u32 err;
 
 	pci_set_power_state(pdev, PCI_D0);
-- 
1.8.5.GIT

^ permalink raw reply related

* [net-next 4/6] ixgbevf: Convert ring storage form pointer to an array to array of pointers
From: Aaron Brown @ 2014-01-16 10:30 UTC (permalink / raw)
  To: davem; +Cc: Don Skidmore, netdev, gospo, sassmann, Alexander Duyck,
	Aaron Brown
In-Reply-To: <1389868210-24035-1-git-send-email-aaron.f.brown@intel.com>

From: Don Skidmore <donald.c.skidmore@intel.com>

This will change how we store rings arrays in the adapter sturct.
We use to have a pointer to an array now we will be using an array
of pointers.  This will allow us to support multiple queues on
muliple nodes at some point we would be able to reallocate the rings
so that each is on a local node if needed.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ethtool.c      |  30 ++---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |   4 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 150 ++++++++++++----------
 3 files changed, 98 insertions(+), 86 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ethtool.c b/drivers/net/ethernet/intel/ixgbevf/ethtool.c
index 54d9ace..515ba4e 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ethtool.c
@@ -286,9 +286,9 @@ static int ixgbevf_set_ringparam(struct net_device *netdev,
 
 	if (!netif_running(adapter->netdev)) {
 		for (i = 0; i < adapter->num_tx_queues; i++)
-			adapter->tx_ring[i].count = new_tx_count;
+			adapter->tx_ring[i]->count = new_tx_count;
 		for (i = 0; i < adapter->num_rx_queues; i++)
-			adapter->rx_ring[i].count = new_rx_count;
+			adapter->rx_ring[i]->count = new_rx_count;
 		adapter->tx_ring_count = new_tx_count;
 		adapter->rx_ring_count = new_rx_count;
 		goto clear_reset;
@@ -303,7 +303,7 @@ static int ixgbevf_set_ringparam(struct net_device *netdev,
 
 		for (i = 0; i < adapter->num_tx_queues; i++) {
 			/* clone ring and setup updated count */
-			tx_ring[i] = adapter->tx_ring[i];
+			tx_ring[i] = *adapter->tx_ring[i];
 			tx_ring[i].count = new_tx_count;
 			err = ixgbevf_setup_tx_resources(adapter, &tx_ring[i]);
 			if (!err)
@@ -329,7 +329,7 @@ static int ixgbevf_set_ringparam(struct net_device *netdev,
 
 		for (i = 0; i < adapter->num_rx_queues; i++) {
 			/* clone ring and setup updated count */
-			rx_ring[i] = adapter->rx_ring[i];
+			rx_ring[i] = *adapter->rx_ring[i];
 			rx_ring[i].count = new_rx_count;
 			err = ixgbevf_setup_rx_resources(adapter, &rx_ring[i]);
 			if (!err)
@@ -352,9 +352,8 @@ static int ixgbevf_set_ringparam(struct net_device *netdev,
 	/* Tx */
 	if (tx_ring) {
 		for (i = 0; i < adapter->num_tx_queues; i++) {
-			ixgbevf_free_tx_resources(adapter,
-						  &adapter->tx_ring[i]);
-			adapter->tx_ring[i] = tx_ring[i];
+			ixgbevf_free_tx_resources(adapter, adapter->tx_ring[i]);
+			*adapter->tx_ring[i] = tx_ring[i];
 		}
 		adapter->tx_ring_count = new_tx_count;
 
@@ -365,9 +364,8 @@ static int ixgbevf_set_ringparam(struct net_device *netdev,
 	/* Rx */
 	if (rx_ring) {
 		for (i = 0; i < adapter->num_rx_queues; i++) {
-			ixgbevf_free_rx_resources(adapter,
-						  &adapter->rx_ring[i]);
-			adapter->rx_ring[i] = rx_ring[i];
+			ixgbevf_free_rx_resources(adapter, adapter->rx_ring[i]);
+			*adapter->rx_ring[i] = rx_ring[i];
 		}
 		adapter->rx_ring_count = new_rx_count;
 
@@ -413,15 +411,15 @@ static void ixgbevf_get_ethtool_stats(struct net_device *netdev,
 	    tx_yields = 0, tx_cleaned = 0, tx_missed = 0;
 
 	for (i = 0; i < adapter->num_rx_queues; i++) {
-		rx_yields += adapter->rx_ring[i].bp_yields;
-		rx_cleaned += adapter->rx_ring[i].bp_cleaned;
-		rx_yields += adapter->rx_ring[i].bp_yields;
+		rx_yields += adapter->rx_ring[i]->bp_yields;
+		rx_cleaned += adapter->rx_ring[i]->bp_cleaned;
+		rx_yields += adapter->rx_ring[i]->bp_yields;
 	}
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
-		tx_yields += adapter->tx_ring[i].bp_yields;
-		tx_cleaned += adapter->tx_ring[i].bp_cleaned;
-		tx_yields += adapter->tx_ring[i].bp_yields;
+		tx_yields += adapter->tx_ring[i]->bp_yields;
+		tx_cleaned += adapter->tx_ring[i]->bp_cleaned;
+		tx_yields += adapter->tx_ring[i]->bp_yields;
 	}
 
 	adapter->bp_rx_yields = rx_yields;
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
index d7cec1d..0547e40 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
@@ -327,7 +327,7 @@ struct ixgbevf_adapter {
 	u32 eims_other;
 
 	/* TX */
-	struct ixgbevf_ring *tx_ring;	/* One per active queue */
+	struct ixgbevf_ring *tx_ring[MAX_TX_QUEUES]; /* One per active queue */
 	int num_tx_queues;
 	u64 restart_queue;
 	u64 hw_csum_tx_good;
@@ -337,7 +337,7 @@ struct ixgbevf_adapter {
 	u32 tx_timeout_count;
 
 	/* RX */
-	struct ixgbevf_ring *rx_ring;	/* One per active queue */
+	struct ixgbevf_ring *rx_ring[MAX_TX_QUEUES]; /* One per active queue */
 	int num_rx_queues;
 	u64 hw_csum_rx_error;
 	u64 hw_rx_no_dma_resources;
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 74af295..202fc47 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -848,8 +848,8 @@ static inline void map_vector_to_rxq(struct ixgbevf_adapter *a, int v_idx,
 {
 	struct ixgbevf_q_vector *q_vector = a->q_vector[v_idx];
 
-	a->rx_ring[r_idx].next = q_vector->rx.ring;
-	q_vector->rx.ring = &a->rx_ring[r_idx];
+	a->rx_ring[r_idx]->next = q_vector->rx.ring;
+	q_vector->rx.ring = a->rx_ring[r_idx];
 	q_vector->rx.count++;
 }
 
@@ -858,8 +858,8 @@ static inline void map_vector_to_txq(struct ixgbevf_adapter *a, int v_idx,
 {
 	struct ixgbevf_q_vector *q_vector = a->q_vector[v_idx];
 
-	a->tx_ring[t_idx].next = q_vector->tx.ring;
-	q_vector->tx.ring = &a->tx_ring[t_idx];
+	a->tx_ring[t_idx]->next = q_vector->tx.ring;
+	q_vector->tx.ring = a->tx_ring[t_idx];
 	q_vector->tx.count++;
 }
 
@@ -1100,7 +1100,7 @@ static void ixgbevf_configure_tx(struct ixgbevf_adapter *adapter)
 
 	/* Setup the HW Tx Head and Tail descriptor pointers */
 	for (i = 0; i < adapter->num_tx_queues; i++) {
-		struct ixgbevf_ring *ring = &adapter->tx_ring[i];
+		struct ixgbevf_ring *ring = adapter->tx_ring[i];
 		j = ring->reg_idx;
 		tdba = ring->dma;
 		tdlen = ring->count * sizeof(union ixgbe_adv_tx_desc);
@@ -1130,7 +1130,7 @@ static void ixgbevf_configure_srrctl(struct ixgbevf_adapter *adapter, int index)
 	struct ixgbe_hw *hw = &adapter->hw;
 	u32 srrctl;
 
-	rx_ring = &adapter->rx_ring[index];
+	rx_ring = adapter->rx_ring[index];
 
 	srrctl = IXGBE_SRRCTL_DROP_EN;
 
@@ -1188,7 +1188,7 @@ static void ixgbevf_set_rx_buffer_len(struct ixgbevf_adapter *adapter)
 		rx_buf_len = IXGBEVF_RXBUFFER_10K;
 
 	for (i = 0; i < adapter->num_rx_queues; i++)
-		adapter->rx_ring[i].rx_buf_len = rx_buf_len;
+		adapter->rx_ring[i]->rx_buf_len = rx_buf_len;
 }
 
 /**
@@ -1212,7 +1212,7 @@ static void ixgbevf_configure_rx(struct ixgbevf_adapter *adapter)
 	/* Setup the HW Rx Head and Tail Descriptor Pointers and
 	 * the Base and Length of the Rx Descriptor Ring */
 	for (i = 0; i < adapter->num_rx_queues; i++) {
-		struct ixgbevf_ring *ring = &adapter->rx_ring[i];
+		struct ixgbevf_ring *ring = adapter->rx_ring[i];
 		rdba = ring->dma;
 		j = ring->reg_idx;
 		rdlen = ring->count * sizeof(union ixgbe_adv_rx_desc);
@@ -1389,7 +1389,7 @@ static int ixgbevf_configure_dcb(struct ixgbevf_adapter *adapter)
 
 	if (num_tcs > 1) {
 		/* update default Tx ring register index */
-		adapter->tx_ring[0].reg_idx = def_q;
+		adapter->tx_ring[0]->reg_idx = def_q;
 
 		/* we need as many queues as traffic classes */
 		num_rx_queues = num_tcs;
@@ -1421,7 +1421,7 @@ static void ixgbevf_configure(struct ixgbevf_adapter *adapter)
 	ixgbevf_configure_tx(adapter);
 	ixgbevf_configure_rx(adapter);
 	for (i = 0; i < adapter->num_rx_queues; i++) {
-		struct ixgbevf_ring *ring = &adapter->rx_ring[i];
+		struct ixgbevf_ring *ring = adapter->rx_ring[i];
 		ixgbevf_alloc_rx_buffers(adapter, ring,
 					 ixgbevf_desc_unused(ring));
 	}
@@ -1429,24 +1429,23 @@ static void ixgbevf_configure(struct ixgbevf_adapter *adapter)
 
 #define IXGBEVF_MAX_RX_DESC_POLL 10
 static void ixgbevf_rx_desc_queue_enable(struct ixgbevf_adapter *adapter,
-					 int rxr)
+					 struct ixgbevf_ring *ring)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
 	int wait_loop = IXGBEVF_MAX_RX_DESC_POLL;
 	u32 rxdctl;
-	int j = adapter->rx_ring[rxr].reg_idx;
+	u8 reg_idx = ring->reg_idx;
 
 	do {
 		usleep_range(1000, 2000);
-		rxdctl = IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(j));
+		rxdctl = IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(reg_idx));
 	} while (--wait_loop && !(rxdctl & IXGBE_RXDCTL_ENABLE));
 
 	if (!wait_loop)
 		hw_dbg(hw, "RXDCTL.ENABLE queue %d not set while polling\n",
-		       rxr);
+		       reg_idx);
 
-	ixgbevf_release_rx_desc(&adapter->rx_ring[rxr],
-				(adapter->rx_ring[rxr].count - 1));
+	ixgbevf_release_rx_desc(ring, ring->count - 1);
 }
 
 static void ixgbevf_disable_rx_queue(struct ixgbevf_adapter *adapter,
@@ -1541,7 +1540,7 @@ static void ixgbevf_up_complete(struct ixgbevf_adapter *adapter)
 	u32 txdctl, rxdctl;
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
-		j = adapter->tx_ring[i].reg_idx;
+		j = adapter->tx_ring[i]->reg_idx;
 		txdctl = IXGBE_READ_REG(hw, IXGBE_VFTXDCTL(j));
 		/* enable WTHRESH=8 descriptors, to encourage burst writeback */
 		txdctl |= (8 << 16);
@@ -1549,14 +1548,14 @@ static void ixgbevf_up_complete(struct ixgbevf_adapter *adapter)
 	}
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
-		j = adapter->tx_ring[i].reg_idx;
+		j = adapter->tx_ring[i]->reg_idx;
 		txdctl = IXGBE_READ_REG(hw, IXGBE_VFTXDCTL(j));
 		txdctl |= IXGBE_TXDCTL_ENABLE;
 		IXGBE_WRITE_REG(hw, IXGBE_VFTXDCTL(j), txdctl);
 	}
 
 	for (i = 0; i < num_rx_rings; i++) {
-		j = adapter->rx_ring[i].reg_idx;
+		j = adapter->rx_ring[i]->reg_idx;
 		rxdctl = IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(j));
 		rxdctl |= IXGBE_RXDCTL_ENABLE | IXGBE_RXDCTL_VME;
 		if (hw->mac.type == ixgbe_mac_X540_vf) {
@@ -1565,7 +1564,7 @@ static void ixgbevf_up_complete(struct ixgbevf_adapter *adapter)
 				   IXGBE_RXDCTL_RLPML_EN);
 		}
 		IXGBE_WRITE_REG(hw, IXGBE_VFRXDCTL(j), rxdctl);
-		ixgbevf_rx_desc_queue_enable(adapter, i);
+		ixgbevf_rx_desc_queue_enable(adapter, adapter->rx_ring[i]);
 	}
 
 	ixgbevf_configure_msix(adapter);
@@ -1686,7 +1685,7 @@ static void ixgbevf_clean_all_rx_rings(struct ixgbevf_adapter *adapter)
 	int i;
 
 	for (i = 0; i < adapter->num_rx_queues; i++)
-		ixgbevf_clean_rx_ring(adapter, &adapter->rx_ring[i]);
+		ixgbevf_clean_rx_ring(adapter, adapter->rx_ring[i]);
 }
 
 /**
@@ -1698,7 +1697,7 @@ static void ixgbevf_clean_all_tx_rings(struct ixgbevf_adapter *adapter)
 	int i;
 
 	for (i = 0; i < adapter->num_tx_queues; i++)
-		ixgbevf_clean_tx_ring(adapter, &adapter->tx_ring[i]);
+		ixgbevf_clean_tx_ring(adapter, adapter->tx_ring[i]);
 }
 
 void ixgbevf_down(struct ixgbevf_adapter *adapter)
@@ -1713,7 +1712,7 @@ void ixgbevf_down(struct ixgbevf_adapter *adapter)
 
 	/* disable all enabled rx queues */
 	for (i = 0; i < adapter->num_rx_queues; i++)
-		ixgbevf_disable_rx_queue(adapter, &adapter->rx_ring[i]);
+		ixgbevf_disable_rx_queue(adapter, adapter->rx_ring[i]);
 
 	netif_tx_disable(netdev);
 
@@ -1734,7 +1733,7 @@ void ixgbevf_down(struct ixgbevf_adapter *adapter)
 
 	/* disable transmits in the hardware now that interrupts are off */
 	for (i = 0; i < adapter->num_tx_queues; i++) {
-		j = adapter->tx_ring[i].reg_idx;
+		j = adapter->tx_ring[i]->reg_idx;
 		txdctl = IXGBE_READ_REG(hw, IXGBE_VFTXDCTL(j));
 		IXGBE_WRITE_REG(hw, IXGBE_VFTXDCTL(j),
 				(txdctl & ~IXGBE_TXDCTL_ENABLE));
@@ -1875,40 +1874,50 @@ static void ixgbevf_set_num_queues(struct ixgbevf_adapter *adapter)
  **/
 static int ixgbevf_alloc_queues(struct ixgbevf_adapter *adapter)
 {
-	int i;
+	struct ixgbevf_ring *ring;
+	int rx = 0, tx = 0;
 
-	adapter->tx_ring = kcalloc(adapter->num_tx_queues,
-				   sizeof(struct ixgbevf_ring), GFP_KERNEL);
-	if (!adapter->tx_ring)
-		goto err_tx_ring_allocation;
+	for (; tx < adapter->num_tx_queues; tx++) {
+		ring = kzalloc(sizeof(*ring), GFP_KERNEL);
+		if (!ring)
+			goto err_allocation;
 
-	adapter->rx_ring = kcalloc(adapter->num_rx_queues,
-				   sizeof(struct ixgbevf_ring), GFP_KERNEL);
-	if (!adapter->rx_ring)
-		goto err_rx_ring_allocation;
+		ring->dev = &adapter->pdev->dev;
+		ring->netdev = adapter->netdev;
+		ring->count = adapter->tx_ring_count;
+		ring->queue_index = tx;
+		ring->reg_idx = tx;
 
-	for (i = 0; i < adapter->num_tx_queues; i++) {
-		adapter->tx_ring[i].count = adapter->tx_ring_count;
-		adapter->tx_ring[i].queue_index = i;
-		/* reg_idx may be remapped later by DCB config */
-		adapter->tx_ring[i].reg_idx = i;
-		adapter->tx_ring[i].dev = &adapter->pdev->dev;
-		adapter->tx_ring[i].netdev = adapter->netdev;
+		adapter->tx_ring[tx] = ring;
 	}
 
-	for (i = 0; i < adapter->num_rx_queues; i++) {
-		adapter->rx_ring[i].count = adapter->rx_ring_count;
-		adapter->rx_ring[i].queue_index = i;
-		adapter->rx_ring[i].reg_idx = i;
-		adapter->rx_ring[i].dev = &adapter->pdev->dev;
-		adapter->rx_ring[i].netdev = adapter->netdev;
+	for (; rx < adapter->num_rx_queues; rx++) {
+		ring = kzalloc(sizeof(*ring), GFP_KERNEL);
+		if (!ring)
+			goto err_allocation;
+
+		ring->dev = &adapter->pdev->dev;
+		ring->netdev = adapter->netdev;
+
+		ring->count = adapter->rx_ring_count;
+		ring->queue_index = rx;
+		ring->reg_idx = rx;
+
+		adapter->rx_ring[rx] = ring;
 	}
 
 	return 0;
 
-err_rx_ring_allocation:
-	kfree(adapter->tx_ring);
-err_tx_ring_allocation:
+err_allocation:
+	while (tx) {
+		kfree(adapter->tx_ring[--tx]);
+		adapter->tx_ring[tx] = NULL;
+	}
+
+	while (rx) {
+		kfree(adapter->rx_ring[--rx]);
+		adapter->rx_ring[rx] = NULL;
+	}
 	return -ENOMEM;
 }
 
@@ -2099,6 +2108,17 @@ err_set_interrupt:
  **/
 static void ixgbevf_clear_interrupt_scheme(struct ixgbevf_adapter *adapter)
 {
+	int i;
+
+	for (i = 0; i < adapter->num_tx_queues; i++) {
+		kfree(adapter->tx_ring[i]);
+		adapter->tx_ring[i] = NULL;
+	}
+	for (i = 0; i < adapter->num_rx_queues; i++) {
+		kfree(adapter->rx_ring[i]);
+		adapter->rx_ring[i] = NULL;
+	}
+
 	adapter->num_tx_queues = 0;
 	adapter->num_rx_queues = 0;
 
@@ -2229,11 +2249,11 @@ void ixgbevf_update_stats(struct ixgbevf_adapter *adapter)
 
 	for (i = 0;  i  < adapter->num_rx_queues;  i++) {
 		adapter->hw_csum_rx_error +=
-			adapter->rx_ring[i].hw_csum_rx_error;
+			adapter->rx_ring[i]->hw_csum_rx_error;
 		adapter->hw_csum_rx_good +=
-			adapter->rx_ring[i].hw_csum_rx_good;
-		adapter->rx_ring[i].hw_csum_rx_error = 0;
-		adapter->rx_ring[i].hw_csum_rx_good = 0;
+			adapter->rx_ring[i]->hw_csum_rx_good;
+		adapter->rx_ring[i]->hw_csum_rx_error = 0;
+		adapter->rx_ring[i]->hw_csum_rx_good = 0;
 	}
 }
 
@@ -2413,10 +2433,8 @@ static void ixgbevf_free_all_tx_resources(struct ixgbevf_adapter *adapter)
 	int i;
 
 	for (i = 0; i < adapter->num_tx_queues; i++)
-		if (adapter->tx_ring[i].desc)
-			ixgbevf_free_tx_resources(adapter,
-						  &adapter->tx_ring[i]);
-
+		if (adapter->tx_ring[i]->desc)
+			ixgbevf_free_tx_resources(adapter, adapter->tx_ring[i]);
 }
 
 /**
@@ -2471,7 +2489,7 @@ static int ixgbevf_setup_all_tx_resources(struct ixgbevf_adapter *adapter)
 	int i, err = 0;
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
-		err = ixgbevf_setup_tx_resources(adapter, &adapter->tx_ring[i]);
+		err = ixgbevf_setup_tx_resources(adapter, adapter->tx_ring[i]);
 		if (!err)
 			continue;
 		hw_dbg(&adapter->hw,
@@ -2533,7 +2551,7 @@ static int ixgbevf_setup_all_rx_resources(struct ixgbevf_adapter *adapter)
 	int i, err = 0;
 
 	for (i = 0; i < adapter->num_rx_queues; i++) {
-		err = ixgbevf_setup_rx_resources(adapter, &adapter->rx_ring[i]);
+		err = ixgbevf_setup_rx_resources(adapter, adapter->rx_ring[i]);
 		if (!err)
 			continue;
 		hw_dbg(&adapter->hw,
@@ -2577,9 +2595,8 @@ static void ixgbevf_free_all_rx_resources(struct ixgbevf_adapter *adapter)
 	int i;
 
 	for (i = 0; i < adapter->num_rx_queues; i++)
-		if (adapter->rx_ring[i].desc)
-			ixgbevf_free_rx_resources(adapter,
-						  &adapter->rx_ring[i]);
+		if (adapter->rx_ring[i]->desc)
+			ixgbevf_free_rx_resources(adapter, adapter->rx_ring[i]);
 }
 
 /**
@@ -3069,7 +3086,7 @@ static int ixgbevf_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 		return NETDEV_TX_OK;
 	}
 
-	tx_ring = &adapter->tx_ring[r_idx];
+	tx_ring = adapter->tx_ring[r_idx];
 
 	/*
 	 * need: 1 descriptor per page * PAGE_SIZE/IXGBE_MAX_DATA_PER_TXD,
@@ -3282,7 +3299,7 @@ static struct rtnl_link_stats64 *ixgbevf_get_stats(struct net_device *netdev,
 	stats->multicast = adapter->stats.vfmprc - adapter->stats.base_vfmprc;
 
 	for (i = 0; i < adapter->num_rx_queues; i++) {
-		ring = &adapter->rx_ring[i];
+		ring = adapter->rx_ring[i];
 		do {
 			start = u64_stats_fetch_begin_bh(&ring->syncp);
 			bytes = ring->total_bytes;
@@ -3293,7 +3310,7 @@ static struct rtnl_link_stats64 *ixgbevf_get_stats(struct net_device *netdev,
 	}
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
-		ring = &adapter->tx_ring[i];
+		ring = adapter->tx_ring[i];
 		do {
 			start = u64_stats_fetch_begin_bh(&ring->syncp);
 			bytes = ring->total_bytes;
@@ -3528,9 +3545,6 @@ static void ixgbevf_remove(struct pci_dev *pdev)
 
 	hw_dbg(&adapter->hw, "Remove complete\n");
 
-	kfree(adapter->tx_ring);
-	kfree(adapter->rx_ring);
-
 	free_netdev(netdev);
 
 	pci_disable_device(pdev);
-- 
1.8.5.GIT

^ permalink raw reply related

* [net-next 5/6] ixgbevf: create function for all of ring init
From: Aaron Brown @ 2014-01-16 10:30 UTC (permalink / raw)
  To: davem; +Cc: Don Skidmore, netdev, gospo, sassmann, Alexander Duyck,
	Aaron Brown
In-Reply-To: <1389868210-24035-1-git-send-email-aaron.f.brown@intel.com>

From: Don Skidmore <donald.c.skidmore@intel.com>

This patch creates new functions for ring initialization,
ixgbevf_configure_tx_ring() and ixgbevf_configure_rx_ring(). The work done
in these function previously was spread between several other functions and
this change should hopefully lead to greater readability and make the code
more like ixgbe.  This patch also moves the placement of some older functions
to avoid having to write prototypes.  It also promotes a couple of debug
messages to errors.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/defines.h      |  17 ++
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 303 ++++++++++++----------
 2 files changed, 183 insertions(+), 137 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h
index 3147795..5426b2d 100644
--- a/drivers/net/ethernet/intel/ixgbevf/defines.h
+++ b/drivers/net/ethernet/intel/ixgbevf/defines.h
@@ -277,4 +277,21 @@ struct ixgbe_adv_tx_context_desc {
 #define IXGBE_ERR_RESET_FAILED                  -2
 #define IXGBE_ERR_INVALID_ARGUMENT              -3
 
+/* Transmit Config masks */
+#define IXGBE_TXDCTL_ENABLE		0x02000000 /* Ena specific Tx Queue */
+#define IXGBE_TXDCTL_SWFLSH		0x04000000 /* Tx Desc. wr-bk flushing */
+#define IXGBE_TXDCTL_WTHRESH_SHIFT	16	   /* shift to WTHRESH bits */
+
+#define IXGBE_DCA_RXCTRL_DESC_DCA_EN	(1 << 5)  /* Rx Desc enable */
+#define IXGBE_DCA_RXCTRL_HEAD_DCA_EN	(1 << 6)  /* Rx Desc header ena */
+#define IXGBE_DCA_RXCTRL_DATA_DCA_EN	(1 << 7)  /* Rx Desc payload ena */
+#define IXGBE_DCA_RXCTRL_DESC_RRO_EN	(1 << 9)  /* Rx rd Desc Relax Order */
+#define IXGBE_DCA_RXCTRL_DATA_WRO_EN	(1 << 13) /* Rx wr data Relax Order */
+#define IXGBE_DCA_RXCTRL_HEAD_WRO_EN	(1 << 15) /* Rx wr header RO */
+
+#define IXGBE_DCA_TXCTRL_DESC_DCA_EN	(1 << 5)  /* DCA Tx Desc enable */
+#define IXGBE_DCA_TXCTRL_DESC_RRO_EN	(1 << 9)  /* Tx rd Desc Relax Order */
+#define IXGBE_DCA_TXCTRL_DESC_WRO_EN	(1 << 11) /* Tx Desc writeback RO bit */
+#define IXGBE_DCA_TXCTRL_DATA_RRO_EN	(1 << 13) /* Tx rd data Relax Order */
+
 #endif /* _IXGBEVF_DEFINES_H_ */
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 202fc47..6cf4120 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -1087,6 +1087,70 @@ static inline void ixgbevf_irq_enable(struct ixgbevf_adapter *adapter)
 }
 
 /**
+ * ixgbevf_configure_tx_ring - Configure 82599 VF Tx ring after Reset
+ * @adapter: board private structure
+ * @ring: structure containing ring specific data
+ *
+ * Configure the Tx descriptor ring after a reset.
+ **/
+static void ixgbevf_configure_tx_ring(struct ixgbevf_adapter *adapter,
+				      struct ixgbevf_ring *ring)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	u64 tdba = ring->dma;
+	int wait_loop = 10;
+	u32 txdctl = IXGBE_TXDCTL_ENABLE;
+	u8 reg_idx = ring->reg_idx;
+
+	/* disable queue to avoid issues while updating state */
+	IXGBE_WRITE_REG(hw, IXGBE_VFTXDCTL(reg_idx), IXGBE_TXDCTL_SWFLSH);
+	IXGBE_WRITE_FLUSH(hw);
+
+	IXGBE_WRITE_REG(hw, IXGBE_VFTDBAL(reg_idx), tdba & DMA_BIT_MASK(32));
+	IXGBE_WRITE_REG(hw, IXGBE_VFTDBAH(reg_idx), tdba >> 32);
+	IXGBE_WRITE_REG(hw, IXGBE_VFTDLEN(reg_idx),
+			ring->count * sizeof(union ixgbe_adv_tx_desc));
+
+	/* disable head writeback */
+	IXGBE_WRITE_REG(hw, IXGBE_VFTDWBAH(reg_idx), 0);
+	IXGBE_WRITE_REG(hw, IXGBE_VFTDWBAL(reg_idx), 0);
+
+	/* enable relaxed ordering */
+	IXGBE_WRITE_REG(hw, IXGBE_VFDCA_TXCTRL(reg_idx),
+			(IXGBE_DCA_TXCTRL_DESC_RRO_EN |
+			 IXGBE_DCA_TXCTRL_DATA_RRO_EN));
+
+	/* reset head and tail pointers */
+	IXGBE_WRITE_REG(hw, IXGBE_VFTDH(reg_idx), 0);
+	IXGBE_WRITE_REG(hw, IXGBE_VFTDT(reg_idx), 0);
+	ring->tail = hw->hw_addr + IXGBE_VFTDT(reg_idx);
+
+	/* reset ntu and ntc to place SW in sync with hardwdare */
+	ring->next_to_clean = 0;
+	ring->next_to_use = 0;
+
+	/* In order to avoid issues WTHRESH + PTHRESH should always be equal
+	 * to or less than the number of on chip descriptors, which is
+	 * currently 40.
+	 */
+	txdctl |= (8 << 16);    /* WTHRESH = 8 */
+
+	/* Setting PTHRESH to 32 both improves performance */
+	txdctl |= (1 << 8) |    /* HTHRESH = 1 */
+		  32;          /* PTHRESH = 32 */
+
+	IXGBE_WRITE_REG(hw, IXGBE_VFTXDCTL(reg_idx), txdctl);
+
+	/* poll to verify queue is enabled */
+	do {
+		usleep_range(1000, 2000);
+		txdctl = IXGBE_READ_REG(hw, IXGBE_VFTXDCTL(reg_idx));
+	}  while (--wait_loop && !(txdctl & IXGBE_TXDCTL_ENABLE));
+	if (!wait_loop)
+		pr_err("Could not enable Tx Queue %d\n", reg_idx);
+}
+
+/**
  * ixgbevf_configure_tx - Configure 82599 VF Transmit Unit after Reset
  * @adapter: board private structure
  *
@@ -1094,32 +1158,11 @@ static inline void ixgbevf_irq_enable(struct ixgbevf_adapter *adapter)
  **/
 static void ixgbevf_configure_tx(struct ixgbevf_adapter *adapter)
 {
-	u64 tdba;
-	struct ixgbe_hw *hw = &adapter->hw;
-	u32 i, j, tdlen, txctrl;
+	u32 i;
 
 	/* Setup the HW Tx Head and Tail descriptor pointers */
-	for (i = 0; i < adapter->num_tx_queues; i++) {
-		struct ixgbevf_ring *ring = adapter->tx_ring[i];
-		j = ring->reg_idx;
-		tdba = ring->dma;
-		tdlen = ring->count * sizeof(union ixgbe_adv_tx_desc);
-		IXGBE_WRITE_REG(hw, IXGBE_VFTDBAL(j),
-				(tdba & DMA_BIT_MASK(32)));
-		IXGBE_WRITE_REG(hw, IXGBE_VFTDBAH(j), (tdba >> 32));
-		IXGBE_WRITE_REG(hw, IXGBE_VFTDLEN(j), tdlen);
-		IXGBE_WRITE_REG(hw, IXGBE_VFTDH(j), 0);
-		IXGBE_WRITE_REG(hw, IXGBE_VFTDT(j), 0);
-		ring->tail = hw->hw_addr + IXGBE_VFTDT(j);
-		ring->next_to_clean = 0;
-		ring->next_to_use = 0;
-		/* Disable Tx Head Writeback RO bit, since this hoses
-		 * bookkeeping if things aren't delivered in order.
-		 */
-		txctrl = IXGBE_READ_REG(hw, IXGBE_VFDCA_TXCTRL(j));
-		txctrl &= ~IXGBE_DCA_TXCTRL_TX_WB_RO_EN;
-		IXGBE_WRITE_REG(hw, IXGBE_VFDCA_TXCTRL(j), txctrl);
-	}
+	for (i = 0; i < adapter->num_tx_queues; i++)
+		ixgbevf_configure_tx_ring(adapter, adapter->tx_ring[i]);
 }
 
 #define IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT	2
@@ -1191,6 +1234,92 @@ static void ixgbevf_set_rx_buffer_len(struct ixgbevf_adapter *adapter)
 		adapter->rx_ring[i]->rx_buf_len = rx_buf_len;
 }
 
+#define IXGBEVF_MAX_RX_DESC_POLL 10
+static void ixgbevf_disable_rx_queue(struct ixgbevf_adapter *adapter,
+				     struct ixgbevf_ring *ring)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	int wait_loop = IXGBEVF_MAX_RX_DESC_POLL;
+	u32 rxdctl;
+	u8 reg_idx = ring->reg_idx;
+
+	rxdctl = IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(reg_idx));
+	rxdctl &= ~IXGBE_RXDCTL_ENABLE;
+
+	/* write value back with RXDCTL.ENABLE bit cleared */
+	IXGBE_WRITE_REG(hw, IXGBE_VFRXDCTL(reg_idx), rxdctl);
+
+	/* the hardware may take up to 100us to really disable the rx queue */
+	do {
+		udelay(10);
+		rxdctl = IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(reg_idx));
+	} while (--wait_loop && (rxdctl & IXGBE_RXDCTL_ENABLE));
+
+	if (!wait_loop)
+		pr_err("RXDCTL.ENABLE queue %d not cleared while polling\n",
+		       reg_idx);
+}
+
+static void ixgbevf_rx_desc_queue_enable(struct ixgbevf_adapter *adapter,
+					 struct ixgbevf_ring *ring)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	int wait_loop = IXGBEVF_MAX_RX_DESC_POLL;
+	u32 rxdctl;
+	u8 reg_idx = ring->reg_idx;
+
+	do {
+		usleep_range(1000, 2000);
+		rxdctl = IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(reg_idx));
+	} while (--wait_loop && !(rxdctl & IXGBE_RXDCTL_ENABLE));
+
+	if (!wait_loop)
+		pr_err("RXDCTL.ENABLE queue %d not set while polling\n",
+		       reg_idx);
+}
+
+static void ixgbevf_configure_rx_ring(struct ixgbevf_adapter *adapter,
+				      struct ixgbevf_ring *ring)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	u64 rdba = ring->dma;
+	u32 rxdctl;
+	u8 reg_idx = ring->reg_idx;
+
+	/* disable queue to avoid issues while updating state */
+	rxdctl = IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(reg_idx));
+	ixgbevf_disable_rx_queue(adapter, ring);
+
+	IXGBE_WRITE_REG(hw, IXGBE_VFRDBAL(reg_idx), rdba & DMA_BIT_MASK(32));
+	IXGBE_WRITE_REG(hw, IXGBE_VFRDBAH(reg_idx), rdba >> 32);
+	IXGBE_WRITE_REG(hw, IXGBE_VFRDLEN(reg_idx),
+			ring->count * sizeof(union ixgbe_adv_rx_desc));
+
+	/* enable relaxed ordering */
+	IXGBE_WRITE_REG(hw, IXGBE_VFDCA_RXCTRL(reg_idx),
+			IXGBE_DCA_RXCTRL_DESC_RRO_EN);
+
+	/* reset head and tail pointers */
+	IXGBE_WRITE_REG(hw, IXGBE_VFRDH(reg_idx), 0);
+	IXGBE_WRITE_REG(hw, IXGBE_VFRDT(reg_idx), 0);
+	ring->tail = hw->hw_addr + IXGBE_VFRDT(reg_idx);
+
+	/* reset ntu and ntc to place SW in sync with hardwdare */
+	ring->next_to_clean = 0;
+	ring->next_to_use = 0;
+
+	ixgbevf_configure_srrctl(adapter, reg_idx);
+
+	/* prevent DMA from exceeding buffer space available */
+	rxdctl &= ~IXGBE_RXDCTL_RLPMLMASK;
+	rxdctl |= ring->rx_buf_len | IXGBE_RXDCTL_RLPML_EN;
+	rxdctl |= IXGBE_RXDCTL_ENABLE | IXGBE_RXDCTL_VME;
+	IXGBE_WRITE_REG(hw, IXGBE_VFRXDCTL(reg_idx), rxdctl);
+
+	ixgbevf_rx_desc_queue_enable(adapter, ring);
+	ixgbevf_alloc_rx_buffers(adapter, ring, ixgbevf_desc_unused(ring));
+}
+
 /**
  * ixgbevf_configure_rx - Configure 82599 VF Receive Unit after Reset
  * @adapter: board private structure
@@ -1199,10 +1328,7 @@ static void ixgbevf_set_rx_buffer_len(struct ixgbevf_adapter *adapter)
  **/
 static void ixgbevf_configure_rx(struct ixgbevf_adapter *adapter)
 {
-	u64 rdba;
-	struct ixgbe_hw *hw = &adapter->hw;
-	int i, j;
-	u32 rdlen;
+	int i;
 
 	ixgbevf_setup_psrtype(adapter);
 
@@ -1211,23 +1337,8 @@ static void ixgbevf_configure_rx(struct ixgbevf_adapter *adapter)
 
 	/* Setup the HW Rx Head and Tail Descriptor Pointers and
 	 * the Base and Length of the Rx Descriptor Ring */
-	for (i = 0; i < adapter->num_rx_queues; i++) {
-		struct ixgbevf_ring *ring = adapter->rx_ring[i];
-		rdba = ring->dma;
-		j = ring->reg_idx;
-		rdlen = ring->count * sizeof(union ixgbe_adv_rx_desc);
-		IXGBE_WRITE_REG(hw, IXGBE_VFRDBAL(j),
-				(rdba & DMA_BIT_MASK(32)));
-		IXGBE_WRITE_REG(hw, IXGBE_VFRDBAH(j), (rdba >> 32));
-		IXGBE_WRITE_REG(hw, IXGBE_VFRDLEN(j), rdlen);
-		IXGBE_WRITE_REG(hw, IXGBE_VFRDH(j), 0);
-		IXGBE_WRITE_REG(hw, IXGBE_VFRDT(j), 0);
-		ring->tail = hw->hw_addr + IXGBE_VFRDT(j);
-		ring->next_to_clean = 0;
-		ring->next_to_use = 0;
-
-		ixgbevf_configure_srrctl(adapter, j);
-	}
+	for (i = 0; i < adapter->num_rx_queues; i++)
+		ixgbevf_configure_rx_ring(adapter, adapter->rx_ring[i]);
 }
 
 static int ixgbevf_vlan_rx_add_vid(struct net_device *netdev,
@@ -1409,68 +1520,14 @@ static int ixgbevf_configure_dcb(struct ixgbevf_adapter *adapter)
 
 static void ixgbevf_configure(struct ixgbevf_adapter *adapter)
 {
-	struct net_device *netdev = adapter->netdev;
-	int i;
-
 	ixgbevf_configure_dcb(adapter);
 
-	ixgbevf_set_rx_mode(netdev);
+	ixgbevf_set_rx_mode(adapter->netdev);
 
 	ixgbevf_restore_vlan(adapter);
 
 	ixgbevf_configure_tx(adapter);
 	ixgbevf_configure_rx(adapter);
-	for (i = 0; i < adapter->num_rx_queues; i++) {
-		struct ixgbevf_ring *ring = adapter->rx_ring[i];
-		ixgbevf_alloc_rx_buffers(adapter, ring,
-					 ixgbevf_desc_unused(ring));
-	}
-}
-
-#define IXGBEVF_MAX_RX_DESC_POLL 10
-static void ixgbevf_rx_desc_queue_enable(struct ixgbevf_adapter *adapter,
-					 struct ixgbevf_ring *ring)
-{
-	struct ixgbe_hw *hw = &adapter->hw;
-	int wait_loop = IXGBEVF_MAX_RX_DESC_POLL;
-	u32 rxdctl;
-	u8 reg_idx = ring->reg_idx;
-
-	do {
-		usleep_range(1000, 2000);
-		rxdctl = IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(reg_idx));
-	} while (--wait_loop && !(rxdctl & IXGBE_RXDCTL_ENABLE));
-
-	if (!wait_loop)
-		hw_dbg(hw, "RXDCTL.ENABLE queue %d not set while polling\n",
-		       reg_idx);
-
-	ixgbevf_release_rx_desc(ring, ring->count - 1);
-}
-
-static void ixgbevf_disable_rx_queue(struct ixgbevf_adapter *adapter,
-				     struct ixgbevf_ring *ring)
-{
-	struct ixgbe_hw *hw = &adapter->hw;
-	int wait_loop = IXGBEVF_MAX_RX_DESC_POLL;
-	u32 rxdctl;
-	u8 reg_idx = ring->reg_idx;
-
-	rxdctl = IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(reg_idx));
-	rxdctl &= ~IXGBE_RXDCTL_ENABLE;
-
-	/* write value back with RXDCTL.ENABLE bit cleared */
-	IXGBE_WRITE_REG(hw, IXGBE_VFRXDCTL(reg_idx), rxdctl);
-
-	/* the hardware may take up to 100us to really disable the rx queue */
-	do {
-		udelay(10);
-		rxdctl = IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(reg_idx));
-	} while (--wait_loop && (rxdctl & IXGBE_RXDCTL_ENABLE));
-
-	if (!wait_loop)
-		hw_dbg(hw, "RXDCTL.ENABLE queue %d not cleared while polling\n",
-		       reg_idx);
 }
 
 static void ixgbevf_save_reset_stats(struct ixgbevf_adapter *adapter)
@@ -1535,37 +1592,6 @@ static void ixgbevf_up_complete(struct ixgbevf_adapter *adapter)
 {
 	struct net_device *netdev = adapter->netdev;
 	struct ixgbe_hw *hw = &adapter->hw;
-	int i, j = 0;
-	int num_rx_rings = adapter->num_rx_queues;
-	u32 txdctl, rxdctl;
-
-	for (i = 0; i < adapter->num_tx_queues; i++) {
-		j = adapter->tx_ring[i]->reg_idx;
-		txdctl = IXGBE_READ_REG(hw, IXGBE_VFTXDCTL(j));
-		/* enable WTHRESH=8 descriptors, to encourage burst writeback */
-		txdctl |= (8 << 16);
-		IXGBE_WRITE_REG(hw, IXGBE_VFTXDCTL(j), txdctl);
-	}
-
-	for (i = 0; i < adapter->num_tx_queues; i++) {
-		j = adapter->tx_ring[i]->reg_idx;
-		txdctl = IXGBE_READ_REG(hw, IXGBE_VFTXDCTL(j));
-		txdctl |= IXGBE_TXDCTL_ENABLE;
-		IXGBE_WRITE_REG(hw, IXGBE_VFTXDCTL(j), txdctl);
-	}
-
-	for (i = 0; i < num_rx_rings; i++) {
-		j = adapter->rx_ring[i]->reg_idx;
-		rxdctl = IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(j));
-		rxdctl |= IXGBE_RXDCTL_ENABLE | IXGBE_RXDCTL_VME;
-		if (hw->mac.type == ixgbe_mac_X540_vf) {
-			rxdctl &= ~IXGBE_RXDCTL_RLPMLMASK;
-			rxdctl |= ((netdev->mtu + ETH_HLEN + ETH_FCS_LEN) |
-				   IXGBE_RXDCTL_RLPML_EN);
-		}
-		IXGBE_WRITE_REG(hw, IXGBE_VFRXDCTL(j), rxdctl);
-		ixgbevf_rx_desc_queue_enable(adapter, adapter->rx_ring[i]);
-	}
 
 	ixgbevf_configure_msix(adapter);
 
@@ -1704,8 +1730,7 @@ void ixgbevf_down(struct ixgbevf_adapter *adapter)
 {
 	struct net_device *netdev = adapter->netdev;
 	struct ixgbe_hw *hw = &adapter->hw;
-	u32 txdctl;
-	int i, j;
+	int i;
 
 	/* signal that we are down to the interrupt handler */
 	set_bit(__IXGBEVF_DOWN, &adapter->state);
@@ -1733,10 +1758,10 @@ void ixgbevf_down(struct ixgbevf_adapter *adapter)
 
 	/* disable transmits in the hardware now that interrupts are off */
 	for (i = 0; i < adapter->num_tx_queues; i++) {
-		j = adapter->tx_ring[i]->reg_idx;
-		txdctl = IXGBE_READ_REG(hw, IXGBE_VFTXDCTL(j));
-		IXGBE_WRITE_REG(hw, IXGBE_VFTXDCTL(j),
-				(txdctl & ~IXGBE_TXDCTL_ENABLE));
+		u8 reg_idx = adapter->tx_ring[i]->reg_idx;
+
+		IXGBE_WRITE_REG(hw, IXGBE_VFTXDCTL(reg_idx),
+				IXGBE_TXDCTL_SWFLSH);
 	}
 
 	netif_carrier_off(netdev);
@@ -2416,6 +2441,10 @@ void ixgbevf_free_tx_resources(struct ixgbevf_adapter *adapter,
 	vfree(tx_ring->tx_buffer_info);
 	tx_ring->tx_buffer_info = NULL;
 
+	/* if not set, then don't free */
+	if (!tx_ring->desc)
+		return;
+
 	dma_free_coherent(&pdev->dev, tx_ring->size, tx_ring->desc,
 			  tx_ring->dma);
 
-- 
1.8.5.GIT

^ permalink raw reply related

* [PATCH v2 0/9] net: stmmac PM related fixes.
From: srinivas.kandagatla @ 2014-01-16 10:48 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe CAVALLARO, linux-kernel, srinivas.kandagatla
In-Reply-To: <1384774256-10039-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas Kandagatla <srinivas.kandagatla@st.com>

Hi Peppe/Dave,

During PM_SUSPEND_FREEZE testing, I have noticed that PM support in STMMAC is
partly broken. I had to re-arrange the code to do PM correctly. There were lot
of things I did not like personally and some bits did not work in the first
place. I thought this is the nice opportunity to clean the mess up.

Here is what I did:
 any
1> Test PM suspend freeze via pm_test
It did not work for following reasons.
 - If the power to gmac is removed when it enters in low power state.
stmmac_resume could not cope up with such behaviour, it was expecting the ip
register contents to be still same as before entering low power, This
assumption is wrong. So I started to add some code to do Hardware
initialization, thats when I started to re-arrange the code. stmmac_open
contains both resource and memory allocations and hardware initialization. I
had to separate these two things in two different functions.

These two patches do that
  net: stmmac: move dma allocation to new function
  net: stmmac: move hardware setup for stmmac_open to new function

And rest of the other patches are fixing the loose ends, things like mdio
reset, which might be necessary in cases likes hibernation(I did not test).

In hibernation cases the driver was just unregistering with subsystems and
releasing resources which I did not like and its not necessary to do this as
part of PM. So using the same stmmac_suspend/resume made more sense for
hibernation cases than using stmmac_open/release.
Also fixed a NULL pointer dereference bug too.

2> Test WOL via PM_SUSPEND_FREEZE
Did get an wakeup interrupt, but could not wakeup a freeze system.
So I had to add pm_wakeup_event to the driver.
net: stmmac: notify the PM core of a wakeup event. patch.

Also few patches like 
  net: stmmac: make stmmac_mdio_reset non-static
  net: stmmac: restore pinstate in pm resume.
helps the resume function to reset the phy and put back the pins in default
state.

Changes since RFC:
	- Rebased to net-next on Dave's suggestion.

All these patches are Acked by Peppe.

Thanks,
srini

Srinivas Kandagatla (9):
  net: stmmac: support max-speed device tree property
  net: stmmac: mdio: remove reset gpio free
  net: stmmac: move dma allocation to new function
  net: stmmac: move hardware setup for stmmac_open to new function
  net: stmmac: make stmmac_mdio_reset non-static
  net: stmmac: fix power management suspend-resume case
  net: stmmac: use suspend functions for hibernation
  net: stmmac: restore pinstate in pm resume.
  net: stmmac: notify the PM core of a wakeup event.

 drivers/net/ethernet/stmicro/stmmac/stmmac.h       |    4 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |  361 ++++++++++----------
 drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c  |    3 +-
 .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |   48 +--
 include/linux/stmmac.h                             |    1 +
 5 files changed, 209 insertions(+), 208 deletions(-)

-- 
1.7.6.5

^ permalink raw reply

* [PATCH v2 1/9] net: stmmac: support max-speed device tree property
From: srinivas.kandagatla @ 2014-01-16 10:51 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe CAVALLARO, David Miller, linux-kernel,
	srinivas.kandagatla
In-Reply-To: <1389869450-28003-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas Kandagatla <srinivas.kandagatla@st.com>

This patch adds support to "max-speed" property which is a standard
Ethernet device tree property. max-speed specifies maximum speed
(specified in megabits per second) supported the device.

Depending on the clocking schemes some of the boards can only support
few link speeds, so having a way to limit the link speed in the mac
driver would allow such setups to work reliably.

Without this patch there is no way to tell the driver to limit the
link speed.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |    4 +++-
 .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |    4 ++++
 include/linux/stmmac.h                             |    1 +
 3 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index ecdc8ab..15192c0 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -776,6 +776,7 @@ static int stmmac_init_phy(struct net_device *dev)
 	char phy_id_fmt[MII_BUS_ID_SIZE + 3];
 	char bus_id[MII_BUS_ID_SIZE];
 	int interface = priv->plat->interface;
+	int max_speed = priv->plat->max_speed;
 	priv->oldlink = 0;
 	priv->speed = 0;
 	priv->oldduplex = -1;
@@ -800,7 +801,8 @@ static int stmmac_init_phy(struct net_device *dev)
 
 	/* Stop Advertising 1000BASE Capability if interface is not GMII */
 	if ((interface == PHY_INTERFACE_MODE_MII) ||
-	    (interface == PHY_INTERFACE_MODE_RMII))
+	    (interface == PHY_INTERFACE_MODE_RMII) ||
+		(max_speed < 1000 &&  max_speed > 0))
 		phydev->advertising &= ~(SUPPORTED_1000baseT_Half |
 					 SUPPORTED_1000baseT_Full);
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index 38bd1f4..9377ee6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -42,6 +42,10 @@ static int stmmac_probe_config_dt(struct platform_device *pdev,
 	*mac = of_get_mac_address(np);
 	plat->interface = of_get_phy_mode(np);
 
+	/* Get max speed of operation from device tree */
+	if (of_property_read_u32(np, "max-speed", &plat->max_speed))
+		plat->max_speed = -1;
+
 	plat->bus_id = of_alias_get_id(np, "ethernet");
 	if (plat->bus_id < 0)
 		plat->bus_id = 0;
diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
index bb5deb0..33ace71 100644
--- a/include/linux/stmmac.h
+++ b/include/linux/stmmac.h
@@ -110,6 +110,7 @@ struct plat_stmmacenet_data {
 	int force_sf_dma_mode;
 	int force_thresh_dma_mode;
 	int riwt_off;
+	int max_speed;
 	void (*fix_mac_speed)(void *priv, unsigned int speed);
 	void (*bus_setup)(void __iomem *ioaddr);
 	int (*init)(struct platform_device *pdev);
-- 
1.7.6.5

^ permalink raw reply related

* [PATCH v2 2/9] net: stmmac: mdio: remove reset gpio free
From: srinivas.kandagatla @ 2014-01-16 10:51 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe CAVALLARO, David Miller, linux-kernel,
	srinivas.kandagatla
In-Reply-To: <1389869450-28003-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas Kandagatla <srinivas.kandagatla@st.com>

This patch removes gpio_free for reset line of the phy, driver stores
the gpio number in its private data-structure to use in future. As the
driver uses this pin in future this pin should not be freed.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
index fe7bc99..aab12d2 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
@@ -166,7 +166,6 @@ static int stmmac_mdio_reset(struct mii_bus *bus)
 			udelay(data->delays[1]);
 			gpio_set_value(reset_gpio, active_low ? 1 : 0);
 			udelay(data->delays[2]);
-			gpio_free(reset_gpio);
 		}
 	}
 #endif
-- 
1.7.6.5

^ permalink raw reply related

* [PATCH v2 5/9] net: stmmac: make stmmac_mdio_reset non-static
From: srinivas.kandagatla @ 2014-01-16 10:52 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe CAVALLARO, David Miller, linux-kernel,
	srinivas.kandagatla
In-Reply-To: <1389869450-28003-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas Kandagatla <srinivas.kandagatla@st.com>

This patch promotes stmmac_mdio_reset function from static to
non-static, so that power management functions can decide to reset if
the IP comes out from lowe power state specially hibernation cases.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac.h      |    1 +
 drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c |    2 +-
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 92be6b3..5a568015 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -110,6 +110,7 @@ struct stmmac_priv {
 
 int stmmac_mdio_unregister(struct net_device *ndev);
 int stmmac_mdio_register(struct net_device *ndev);
+int stmmac_mdio_reset(struct mii_bus *mii);
 void stmmac_set_ethtool_ops(struct net_device *netdev);
 extern const struct stmmac_desc_ops enh_desc_ops;
 extern const struct stmmac_desc_ops ndesc_ops;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
index aab12d2..a468eb1 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
@@ -128,7 +128,7 @@ static int stmmac_mdio_write(struct mii_bus *bus, int phyaddr, int phyreg,
  * @bus: points to the mii_bus structure
  * Description: reset the MII bus
  */
-static int stmmac_mdio_reset(struct mii_bus *bus)
+int stmmac_mdio_reset(struct mii_bus *bus)
 {
 #if defined(CONFIG_STMMAC_PLATFORM)
 	struct net_device *ndev = bus->priv;
-- 
1.7.6.5

^ permalink raw reply related

* [PATCH v2 6/9] net: stmmac: fix power management suspend-resume case
From: srinivas.kandagatla @ 2014-01-16 10:52 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe CAVALLARO, David Miller, linux-kernel,
	srinivas.kandagatla
In-Reply-To: <1389869450-28003-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas Kandagatla <srinivas.kandagatla@st.com>

The driver PM resume assumes that the IP is still powered up and the
all the register contents are not disturbed when it comes out of low
power suspend case. This assumption is wrong, basically the driver
should not consider any state of registers after it comes out of low
power. However driver can keep the part of the IP powered up if its a
wake up source. But it can not assume the register state of the IP. Also
its possible that SOC glue layer can take the power off the IP if its
not wake-up source to reduce the power consumption.

This patch re initializes hardware by calling stmmac_hw_setup function in
resume case.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |   13 +++++++------
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 341c8dc..742a83f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2887,18 +2887,19 @@ int stmmac_resume(struct net_device *ndev)
 	 * this bit because it can generate problems while resuming
 	 * from another devices (e.g. serial console).
 	 */
-	if (device_may_wakeup(priv->device))
+	if (device_may_wakeup(priv->device)) {
 		priv->hw->mac->pmt(priv->ioaddr, 0);
-	else
+	} else {
 		/* enable the clk prevously disabled */
 		clk_prepare_enable(priv->stmmac_clk);
+		/* reset the phy so that it's ready */
+		if (priv->mii)
+			stmmac_mdio_reset(priv->mii);
+	}
 
 	netif_device_attach(ndev);
 
-	/* Enable the MAC and DMA */
-	stmmac_set_mac(priv->ioaddr, true);
-	priv->hw->dma->start_tx(priv->ioaddr);
-	priv->hw->dma->start_rx(priv->ioaddr);
+	stmmac_hw_setup(ndev);
 
 	napi_enable(&priv->napi);
 
-- 
1.7.6.5

^ permalink raw reply related

* [PATCH v2 7/9] net: stmmac: use suspend functions for hibernation
From: srinivas.kandagatla @ 2014-01-16 10:52 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe CAVALLARO, David Miller, linux-kernel,
	srinivas.kandagatla
In-Reply-To: <1389869450-28003-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas Kandagatla <srinivas.kandagatla@st.com>

In hibernation freeze case the driver just releases the resources like
dma buffers, irqs, unregisters the drivers and during restore it does
register, request the resources. This is not really necessary, as part
of power management all the data structures are intact, all the
previously allocated resources can be used after coming out of low
power.

This patch uses the suspend and resume callbacks for freeze and
restore which initializes the hardware correctly without unregistering
or releasing the resources, this should also help in reducing the time
to restore.

Also this patch fixes a bug in stmmac_pltfr_restore and
stmmac_pltfr_freeze where it tries to get hold of platform data via
dev_get_platdata call, which would return NULL in device tree cases and
the next if statement would crash as there is no NULL check.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac.h       |    2 -
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |   16 -------
 .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |   44 +++++--------------
 3 files changed, 12 insertions(+), 50 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 5a568015..027f1dd 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -117,8 +117,6 @@ extern const struct stmmac_desc_ops ndesc_ops;
 extern const struct stmmac_hwtimestamp stmmac_ptp;
 int stmmac_ptp_register(struct stmmac_priv *priv);
 void stmmac_ptp_unregister(struct stmmac_priv *priv);
-int stmmac_freeze(struct net_device *ndev);
-int stmmac_restore(struct net_device *ndev);
 int stmmac_resume(struct net_device *ndev);
 int stmmac_suspend(struct net_device *ndev);
 int stmmac_dvr_remove(struct net_device *ndev);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 742a83f..c1298a0 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2912,22 +2912,6 @@ int stmmac_resume(struct net_device *ndev)
 
 	return 0;
 }
-
-int stmmac_freeze(struct net_device *ndev)
-{
-	if (!ndev || !netif_running(ndev))
-		return 0;
-
-	return stmmac_release(ndev);
-}
-
-int stmmac_restore(struct net_device *ndev)
-{
-	if (!ndev || !netif_running(ndev))
-		return 0;
-
-	return stmmac_open(ndev);
-}
 #endif /* CONFIG_PM */
 
 /* Driver can be configured w/ and w/ both PCI and Platf drivers
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index 9377ee6..6d0bf22 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -211,55 +211,35 @@ static int stmmac_pltfr_remove(struct platform_device *pdev)
 #ifdef CONFIG_PM
 static int stmmac_pltfr_suspend(struct device *dev)
 {
-	struct net_device *ndev = dev_get_drvdata(dev);
-
-	return stmmac_suspend(ndev);
-}
-
-static int stmmac_pltfr_resume(struct device *dev)
-{
-	struct net_device *ndev = dev_get_drvdata(dev);
-
-	return stmmac_resume(ndev);
-}
-
-static int stmmac_pltfr_freeze(struct device *dev)
-{
 	int ret;
-	struct plat_stmmacenet_data *plat_dat = dev_get_platdata(dev);
 	struct net_device *ndev = dev_get_drvdata(dev);
+	struct stmmac_priv *priv = netdev_priv(ndev);
 	struct platform_device *pdev = to_platform_device(dev);
 
-	ret = stmmac_freeze(ndev);
-	if (plat_dat->exit)
-		plat_dat->exit(pdev);
+	ret = stmmac_suspend(ndev);
+	if (priv->plat->exit)
+		priv->plat->exit(pdev);
 
 	return ret;
 }
 
-static int stmmac_pltfr_restore(struct device *dev)
+static int stmmac_pltfr_resume(struct device *dev)
 {
-	struct plat_stmmacenet_data *plat_dat = dev_get_platdata(dev);
 	struct net_device *ndev = dev_get_drvdata(dev);
+	struct stmmac_priv *priv = netdev_priv(ndev);
 	struct platform_device *pdev = to_platform_device(dev);
 
-	if (plat_dat->init)
-		plat_dat->init(pdev);
+	if (priv->plat->init)
+		priv->plat->init(pdev);
 
-	return stmmac_restore(ndev);
+	return stmmac_resume(ndev);
 }
 
-static const struct dev_pm_ops stmmac_pltfr_pm_ops = {
-	.suspend = stmmac_pltfr_suspend,
-	.resume = stmmac_pltfr_resume,
-	.freeze = stmmac_pltfr_freeze,
-	.thaw = stmmac_pltfr_restore,
-	.restore = stmmac_pltfr_restore,
-};
-#else
-static const struct dev_pm_ops stmmac_pltfr_pm_ops;
 #endif /* CONFIG_PM */
 
+static SIMPLE_DEV_PM_OPS(stmmac_pltfr_pm_ops,
+			stmmac_pltfr_suspend, stmmac_pltfr_resume);
+
 static const struct of_device_id stmmac_dt_ids[] = {
 	{ .compatible = "st,spear600-gmac"},
 	{ .compatible = "snps,dwmac-3.610"},
-- 
1.7.6.5

^ permalink raw reply related

* [PATCH v2 8/9] net: stmmac: restore pinstate in pm resume.
From: srinivas.kandagatla @ 2014-01-16 10:52 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe CAVALLARO, David Miller, linux-kernel,
	srinivas.kandagatla
In-Reply-To: <1389869450-28003-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas Kandagatla <srinivas.kandagatla@st.com>

This patch adds code to restore default pinstate of the pins when it
comes back from low power state. Without this patch the state of the
pins would be unknown and the driver would not work.

This patch also adds code to put the pins in to sleep state when the
driver enters low power state.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index c1298a0..df7d8d6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -43,6 +43,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/slab.h>
 #include <linux/prefetch.h>
+#include <linux/pinctrl/consumer.h>
 #ifdef CONFIG_STMMAC_DEBUG_FS
 #include <linux/debugfs.h>
 #include <linux/seq_file.h>
@@ -2864,6 +2865,7 @@ int stmmac_suspend(struct net_device *ndev)
 		priv->hw->mac->pmt(priv->ioaddr, priv->wolopts);
 	else {
 		stmmac_set_mac(priv->ioaddr, false);
+		pinctrl_pm_select_sleep_state(priv->device);
 		/* Disable clock in case of PWM is off */
 		clk_disable_unprepare(priv->stmmac_clk);
 	}
@@ -2890,6 +2892,7 @@ int stmmac_resume(struct net_device *ndev)
 	if (device_may_wakeup(priv->device)) {
 		priv->hw->mac->pmt(priv->ioaddr, 0);
 	} else {
+		pinctrl_pm_select_default_state(priv->device);
 		/* enable the clk prevously disabled */
 		clk_prepare_enable(priv->stmmac_clk);
 		/* reset the phy so that it's ready */
-- 
1.7.6.5

^ permalink raw reply related

* [PATCH v2 0/9] net: stmmac PM related fixes.
From: srinivas.kandagatla @ 2014-01-16 10:50 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe CAVALLARO, David Miller, linux-kernel,
	srinivas.kandagatla
In-Reply-To: <1384774256-10039-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas Kandagatla <srinivas.kandagatla@st.com>

Hi Peppe/Dave,

During PM_SUSPEND_FREEZE testing, I have noticed that PM support in STMMAC is
partly broken. I had to re-arrange the code to do PM correctly. There were lot
of things I did not like personally and some bits did not work in the first
place. I thought this is the nice opportunity to clean the mess up.

Here is what I did:
 any
1> Test PM suspend freeze via pm_test
It did not work for following reasons.
 - If the power to gmac is removed when it enters in low power state.
stmmac_resume could not cope up with such behaviour, it was expecting the ip
register contents to be still same as before entering low power, This
assumption is wrong. So I started to add some code to do Hardware
initialization, thats when I started to re-arrange the code. stmmac_open
contains both resource and memory allocations and hardware initialization. I
had to separate these two things in two different functions.

These two patches do that
  net: stmmac: move dma allocation to new function
  net: stmmac: move hardware setup for stmmac_open to new function

And rest of the other patches are fixing the loose ends, things like mdio
reset, which might be necessary in cases likes hibernation(I did not test).

In hibernation cases the driver was just unregistering with subsystems and
releasing resources which I did not like and its not necessary to do this as
part of PM. So using the same stmmac_suspend/resume made more sense for
hibernation cases than using stmmac_open/release.
Also fixed a NULL pointer dereference bug too.

2> Test WOL via PM_SUSPEND_FREEZE
Did get an wakeup interrupt, but could not wakeup a freeze system.
So I had to add pm_wakeup_event to the driver.
net: stmmac: notify the PM core of a wakeup event. patch.

Also few patches like 
  net: stmmac: make stmmac_mdio_reset non-static
  net: stmmac: restore pinstate in pm resume.
helps the resume function to reset the phy and put back the pins in default
state.

Changes since RFC:
	- Rebased to net-next on Dave's suggestion.

All these patches are Acked by Peppe.

Thanks,
srini

Srinivas Kandagatla (9):
  net: stmmac: support max-speed device tree property
  net: stmmac: mdio: remove reset gpio free
  net: stmmac: move dma allocation to new function
  net: stmmac: move hardware setup for stmmac_open to new function
  net: stmmac: make stmmac_mdio_reset non-static
  net: stmmac: fix power management suspend-resume case
  net: stmmac: use suspend functions for hibernation
  net: stmmac: restore pinstate in pm resume.
  net: stmmac: notify the PM core of a wakeup event.

 drivers/net/ethernet/stmicro/stmmac/stmmac.h       |    4 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |  361 ++++++++++----------
 drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c  |    3 +-
 .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |   48 +--
 include/linux/stmmac.h                             |    1 +
 5 files changed, 209 insertions(+), 208 deletions(-)

-- 
1.7.6.5

^ permalink raw reply

* [PATCH v2 3/9] net: stmmac: move dma allocation to new function
From: srinivas.kandagatla @ 2014-01-16 10:52 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe CAVALLARO, David Miller, linux-kernel,
	srinivas.kandagatla
In-Reply-To: <1389869450-28003-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas Kandagatla <srinivas.kandagatla@st.com>

This patch moves dma resource allocation to a new function
alloc_dma_desc_resources, the reason for moving this to a new function
is to keep the memory allocations in a separate function. One more reason
it to get suspend and hibernation cases working without releasing and
allocating these resources during suspend-resume and freeze-restore
cases.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |  169 +++++++++++----------
 1 files changed, 85 insertions(+), 84 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 15192c0..532f2b4 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -996,66 +996,6 @@ static int init_dma_desc_rings(struct net_device *dev)
 		pr_debug("%s: txsize %d, rxsize %d, bfsize %d\n", __func__,
 			 txsize, rxsize, bfsize);
 
-	if (priv->extend_desc) {
-		priv->dma_erx = dma_alloc_coherent(priv->device, rxsize *
-						   sizeof(struct
-							  dma_extended_desc),
-						   &priv->dma_rx_phy,
-						   GFP_KERNEL);
-		if (!priv->dma_erx)
-			goto err_dma;
-
-		priv->dma_etx = dma_alloc_coherent(priv->device, txsize *
-						   sizeof(struct
-							  dma_extended_desc),
-						   &priv->dma_tx_phy,
-						   GFP_KERNEL);
-		if (!priv->dma_etx) {
-			dma_free_coherent(priv->device, priv->dma_rx_size *
-					sizeof(struct dma_extended_desc),
-					priv->dma_erx, priv->dma_rx_phy);
-			goto err_dma;
-		}
-	} else {
-		priv->dma_rx = dma_alloc_coherent(priv->device, rxsize *
-						  sizeof(struct dma_desc),
-						  &priv->dma_rx_phy,
-						  GFP_KERNEL);
-		if (!priv->dma_rx)
-			goto err_dma;
-
-		priv->dma_tx = dma_alloc_coherent(priv->device, txsize *
-						  sizeof(struct dma_desc),
-						  &priv->dma_tx_phy,
-						  GFP_KERNEL);
-		if (!priv->dma_tx) {
-			dma_free_coherent(priv->device, priv->dma_rx_size *
-					sizeof(struct dma_desc),
-					priv->dma_rx, priv->dma_rx_phy);
-			goto err_dma;
-		}
-	}
-
-	priv->rx_skbuff_dma = kmalloc_array(rxsize, sizeof(dma_addr_t),
-					    GFP_KERNEL);
-	if (!priv->rx_skbuff_dma)
-		goto err_rx_skbuff_dma;
-
-	priv->rx_skbuff = kmalloc_array(rxsize, sizeof(struct sk_buff *),
-					GFP_KERNEL);
-	if (!priv->rx_skbuff)
-		goto err_rx_skbuff;
-
-	priv->tx_skbuff_dma = kmalloc_array(txsize, sizeof(dma_addr_t),
-					    GFP_KERNEL);
-	if (!priv->tx_skbuff_dma)
-		goto err_tx_skbuff_dma;
-
-	priv->tx_skbuff = kmalloc_array(txsize, sizeof(struct sk_buff *),
-					GFP_KERNEL);
-	if (!priv->tx_skbuff)
-		goto err_tx_skbuff;
-
 	if (netif_msg_probe(priv)) {
 		pr_debug("(%s) dma_rx_phy=0x%08x dma_tx_phy=0x%08x\n", __func__,
 			 (u32) priv->dma_rx_phy, (u32) priv->dma_tx_phy);
@@ -1123,30 +1063,6 @@ static int init_dma_desc_rings(struct net_device *dev)
 err_init_rx_buffers:
 	while (--i >= 0)
 		stmmac_free_rx_buffers(priv, i);
-	kfree(priv->tx_skbuff);
-err_tx_skbuff:
-	kfree(priv->tx_skbuff_dma);
-err_tx_skbuff_dma:
-	kfree(priv->rx_skbuff);
-err_rx_skbuff:
-	kfree(priv->rx_skbuff_dma);
-err_rx_skbuff_dma:
-	if (priv->extend_desc) {
-		dma_free_coherent(priv->device, priv->dma_tx_size *
-				  sizeof(struct dma_extended_desc),
-				  priv->dma_etx, priv->dma_tx_phy);
-		dma_free_coherent(priv->device, priv->dma_rx_size *
-				  sizeof(struct dma_extended_desc),
-				  priv->dma_erx, priv->dma_rx_phy);
-	} else {
-		dma_free_coherent(priv->device,
-				priv->dma_tx_size * sizeof(struct dma_desc),
-				priv->dma_tx, priv->dma_tx_phy);
-		dma_free_coherent(priv->device,
-				priv->dma_rx_size * sizeof(struct dma_desc),
-				priv->dma_rx, priv->dma_rx_phy);
-	}
-err_dma:
 	return ret;
 }
 
@@ -1182,6 +1098,85 @@ static void dma_free_tx_skbufs(struct stmmac_priv *priv)
 	}
 }
 
+static int alloc_dma_desc_resources(struct stmmac_priv *priv)
+{
+	unsigned int txsize = priv->dma_tx_size;
+	unsigned int rxsize = priv->dma_rx_size;
+	int ret = -ENOMEM;
+
+	priv->rx_skbuff_dma = kmalloc_array(rxsize, sizeof(dma_addr_t),
+					    GFP_KERNEL);
+	if (!priv->rx_skbuff_dma)
+		return -ENOMEM;
+
+	priv->rx_skbuff = kmalloc_array(rxsize, sizeof(struct sk_buff *),
+					GFP_KERNEL);
+	if (!priv->rx_skbuff)
+		goto err_rx_skbuff;
+
+	priv->tx_skbuff_dma = kmalloc_array(txsize, sizeof(dma_addr_t),
+					    GFP_KERNEL);
+	if (!priv->tx_skbuff_dma)
+		goto err_tx_skbuff_dma;
+
+	priv->tx_skbuff = kmalloc_array(txsize, sizeof(struct sk_buff *),
+					GFP_KERNEL);
+	if (!priv->tx_skbuff)
+		goto err_tx_skbuff;
+
+	if (priv->extend_desc) {
+		priv->dma_erx = dma_alloc_coherent(priv->device, rxsize *
+						   sizeof(struct
+							  dma_extended_desc),
+						   &priv->dma_rx_phy,
+						   GFP_KERNEL);
+		if (!priv->dma_erx)
+			goto err_dma;
+
+		priv->dma_etx = dma_alloc_coherent(priv->device, txsize *
+						   sizeof(struct
+							  dma_extended_desc),
+						   &priv->dma_tx_phy,
+						   GFP_KERNEL);
+		if (!priv->dma_etx) {
+			dma_free_coherent(priv->device, priv->dma_rx_size *
+					sizeof(struct dma_extended_desc),
+					priv->dma_erx, priv->dma_rx_phy);
+			goto err_dma;
+		}
+	} else {
+		priv->dma_rx = dma_alloc_coherent(priv->device, rxsize *
+						  sizeof(struct dma_desc),
+						  &priv->dma_rx_phy,
+						  GFP_KERNEL);
+		if (!priv->dma_rx)
+			goto err_dma;
+
+		priv->dma_tx = dma_alloc_coherent(priv->device, txsize *
+						  sizeof(struct dma_desc),
+						  &priv->dma_tx_phy,
+						  GFP_KERNEL);
+		if (!priv->dma_tx) {
+			dma_free_coherent(priv->device, priv->dma_rx_size *
+					sizeof(struct dma_desc),
+					priv->dma_rx, priv->dma_rx_phy);
+			goto err_dma;
+		}
+	}
+
+	return 0;
+
+err_dma:
+	kfree(priv->tx_skbuff);
+err_tx_skbuff:
+	kfree(priv->tx_skbuff_dma);
+err_tx_skbuff_dma:
+	kfree(priv->rx_skbuff);
+err_rx_skbuff:
+	kfree(priv->rx_skbuff_dma);
+	return ret;
+}
+
 static void free_dma_desc_resources(struct stmmac_priv *priv)
 {
 	/* Release the DMA TX/RX socket buffers */
@@ -1623,6 +1618,12 @@ static int stmmac_open(struct net_device *dev)
 	priv->dma_rx_size = STMMAC_ALIGN(dma_rxsize);
 	priv->dma_buf_sz = STMMAC_ALIGN(buf_sz);
 
+	alloc_dma_desc_resources(priv);
+	if (ret < 0) {
+		pr_err("%s: DMA descriptors allocation failed\n", __func__);
+		goto dma_desc_error;
+	}
+
 	ret = init_dma_desc_rings(dev);
 	if (ret < 0) {
 		pr_err("%s: DMA descriptors initialization failed\n", __func__);
-- 
1.7.6.5

^ permalink raw reply related

* [PATCH v2 4/9] net: stmmac: move hardware setup for stmmac_open to new function
From: srinivas.kandagatla @ 2014-01-16 10:52 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe CAVALLARO, David Miller, linux-kernel,
	srinivas.kandagatla
In-Reply-To: <1389869450-28003-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas Kandagatla <srinivas.kandagatla@st.com>

This patch moves hardware setup part of the code in stmmac_open to a new
function stmmac_hw_setup, the reason for doing this is to make hw
initialization independent function so that PM functions can re-use it to
re-initialize the IP after returning from low power state.
This will also avoid code duplication across stmmac_resume/restore and
stmmac_open.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |  155 ++++++++++++---------
 1 files changed, 88 insertions(+), 67 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 532f2b4..341c8dc 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1586,6 +1586,86 @@ static void stmmac_init_tx_coalesce(struct stmmac_priv *priv)
 }
 
 /**
+ * stmmac_hw_setup: setup mac in a usable state.
+ *  @dev : pointer to the device structure.
+ *  Description:
+ *  This function sets up the ip in a usable state.
+ *  Return value:
+ *  0 on success and an appropriate (-)ve integer as defined in errno.h
+ *  file on failure.
+ */
+static int stmmac_hw_setup(struct net_device *dev)
+{
+	struct stmmac_priv *priv = netdev_priv(dev);
+	int ret;
+
+	ret = init_dma_desc_rings(dev);
+	if (ret < 0) {
+		pr_err("%s: DMA descriptors initialization failed\n", __func__);
+		return ret;
+	}
+	/* DMA initialization and SW reset */
+	ret = stmmac_init_dma_engine(priv);
+	if (ret < 0) {
+		pr_err("%s: DMA engine initialization failed\n", __func__);
+		return ret;
+	}
+
+	/* Copy the MAC addr into the HW  */
+	priv->hw->mac->set_umac_addr(priv->ioaddr, dev->dev_addr, 0);
+
+	/* If required, perform hw setup of the bus. */
+	if (priv->plat->bus_setup)
+		priv->plat->bus_setup(priv->ioaddr);
+
+	/* Initialize the MAC Core */
+	priv->hw->mac->core_init(priv->ioaddr);
+
+	/* Enable the MAC Rx/Tx */
+	stmmac_set_mac(priv->ioaddr, true);
+
+	/* Set the HW DMA mode and the COE */
+	stmmac_dma_operation_mode(priv);
+
+	stmmac_mmc_setup(priv);
+
+	ret = stmmac_init_ptp(priv);
+	if (ret)
+		pr_warn("%s: failed PTP initialisation\n", __func__);
+
+#ifdef CONFIG_STMMAC_DEBUG_FS
+	ret = stmmac_init_fs(dev);
+	if (ret < 0)
+		pr_warn("%s: failed debugFS registration\n", __func__);
+#endif
+	/* Start the ball rolling... */
+	pr_debug("%s: DMA RX/TX processes started...\n", dev->name);
+	priv->hw->dma->start_tx(priv->ioaddr);
+	priv->hw->dma->start_rx(priv->ioaddr);
+
+	/* Dump DMA/MAC registers */
+	if (netif_msg_hw(priv)) {
+		priv->hw->mac->dump_regs(priv->ioaddr);
+		priv->hw->dma->dump_regs(priv->ioaddr);
+	}
+	priv->tx_lpi_timer = STMMAC_DEFAULT_TWT_LS;
+
+	priv->eee_enabled = stmmac_eee_init(priv);
+
+	stmmac_init_tx_coalesce(priv);
+
+	if ((priv->use_riwt) && (priv->hw->dma->rx_watchdog)) {
+		priv->rx_riwt = MAX_DMA_RIWT;
+		priv->hw->dma->rx_watchdog(priv->ioaddr, MAX_DMA_RIWT);
+	}
+
+	if (priv->pcs && priv->hw->mac->ctrl_ane)
+		priv->hw->mac->ctrl_ane(priv->ioaddr, 0);
+
+	return 0;
+}
+
+/**
  *  stmmac_open - open entry point of the driver
  *  @dev : pointer to the device structure.
  *  Description:
@@ -1613,6 +1693,10 @@ static int stmmac_open(struct net_device *dev)
 		}
 	}
 
+	/* Extra statistics */
+	memset(&priv->xstats, 0, sizeof(struct stmmac_extra_stats));
+	priv->xstats.threshold = tc;
+
 	/* Create and initialize the TX/RX descriptors chains. */
 	priv->dma_tx_size = STMMAC_ALIGN(dma_txsize);
 	priv->dma_rx_size = STMMAC_ALIGN(dma_rxsize);
@@ -1624,28 +1708,14 @@ static int stmmac_open(struct net_device *dev)
 		goto dma_desc_error;
 	}
 
-	ret = init_dma_desc_rings(dev);
+	ret = stmmac_hw_setup(dev);
 	if (ret < 0) {
-		pr_err("%s: DMA descriptors initialization failed\n", __func__);
-		goto dma_desc_error;
-	}
-
-	/* DMA initialization and SW reset */
-	ret = stmmac_init_dma_engine(priv);
-	if (ret < 0) {
-		pr_err("%s: DMA engine initialization failed\n", __func__);
+		pr_err("%s: Hw setup failed\n", __func__);
 		goto init_error;
 	}
 
-	/* Copy the MAC addr into the HW  */
-	priv->hw->mac->set_umac_addr(priv->ioaddr, dev->dev_addr, 0);
-
-	/* If required, perform hw setup of the bus. */
-	if (priv->plat->bus_setup)
-		priv->plat->bus_setup(priv->ioaddr);
-
-	/* Initialize the MAC Core */
-	priv->hw->mac->core_init(priv->ioaddr);
+	if (priv->phydev)
+		phy_start(priv->phydev);
 
 	/* Request the IRQ lines */
 	ret = request_irq(dev->irq, stmmac_interrupt,
@@ -1678,55 +1748,6 @@ static int stmmac_open(struct net_device *dev)
 		}
 	}
 
-	/* Enable the MAC Rx/Tx */
-	stmmac_set_mac(priv->ioaddr, true);
-
-	/* Set the HW DMA mode and the COE */
-	stmmac_dma_operation_mode(priv);
-
-	/* Extra statistics */
-	memset(&priv->xstats, 0, sizeof(struct stmmac_extra_stats));
-	priv->xstats.threshold = tc;
-
-	stmmac_mmc_setup(priv);
-
-	ret = stmmac_init_ptp(priv);
-	if (ret)
-		pr_warn("%s: failed PTP initialisation\n", __func__);
-
-#ifdef CONFIG_STMMAC_DEBUG_FS
-	ret = stmmac_init_fs(dev);
-	if (ret < 0)
-		pr_warn("%s: failed debugFS registration\n", __func__);
-#endif
-	/* Start the ball rolling... */
-	pr_debug("%s: DMA RX/TX processes started...\n", dev->name);
-	priv->hw->dma->start_tx(priv->ioaddr);
-	priv->hw->dma->start_rx(priv->ioaddr);
-
-	/* Dump DMA/MAC registers */
-	if (netif_msg_hw(priv)) {
-		priv->hw->mac->dump_regs(priv->ioaddr);
-		priv->hw->dma->dump_regs(priv->ioaddr);
-	}
-
-	if (priv->phydev)
-		phy_start(priv->phydev);
-
-	priv->tx_lpi_timer = STMMAC_DEFAULT_TWT_LS;
-
-	priv->eee_enabled = stmmac_eee_init(priv);
-
-	stmmac_init_tx_coalesce(priv);
-
-	if ((priv->use_riwt) && (priv->hw->dma->rx_watchdog)) {
-		priv->rx_riwt = MAX_DMA_RIWT;
-		priv->hw->dma->rx_watchdog(priv->ioaddr, MAX_DMA_RIWT);
-	}
-
-	if (priv->pcs && priv->hw->mac->ctrl_ane)
-		priv->hw->mac->ctrl_ane(priv->ioaddr, 0);
-
 	napi_enable(&priv->napi);
 	netif_start_queue(dev);
 
-- 
1.7.6.5

^ permalink raw reply related

* [PATCH v2 9/9] net: stmmac: notify the PM core of a wakeup event.
From: srinivas.kandagatla @ 2014-01-16 10:53 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe CAVALLARO, David Miller, linux-kernel,
	srinivas.kandagatla
In-Reply-To: <1389869450-28003-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas Kandagatla <srinivas.kandagatla@st.com>

In PM_SUSPEND_FREEZE and WOL(Wakeup On Lan) case, when the driver gets a
wakeup event, either the driver or platform specific PM code should notify
the pm core about it, so that the system can wakeup from low power.

In cases where there is no involvement of platform specific PM, it
becomes driver responsibility to notify the PM core to wakeup the
system.

Without this WOL with PM_SUSPEND_FREEZE does not work on STi based SOCs.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac.h      |    1 +
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |    9 +++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 027f1dd..73709e9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -105,6 +105,7 @@ struct stmmac_priv {
 	unsigned int default_addend;
 	u32 adv_ts;
 	int use_riwt;
+	int irq_wake;
 	spinlock_t ptp_lock;
 };
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index df7d8d6..cddcf76 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2320,6 +2320,9 @@ static irqreturn_t stmmac_interrupt(int irq, void *dev_id)
 	struct net_device *dev = (struct net_device *)dev_id;
 	struct stmmac_priv *priv = netdev_priv(dev);
 
+	if (priv->irq_wake)
+		pm_wakeup_event(priv->device, 0);
+
 	if (unlikely(!dev)) {
 		pr_err("%s: invalid dev pointer\n", __func__);
 		return IRQ_NONE;
@@ -2861,9 +2864,10 @@ int stmmac_suspend(struct net_device *ndev)
 	stmmac_clear_descriptors(priv);
 
 	/* Enable Power down mode by programming the PMT regs */
-	if (device_may_wakeup(priv->device))
+	if (device_may_wakeup(priv->device)) {
 		priv->hw->mac->pmt(priv->ioaddr, priv->wolopts);
-	else {
+		priv->irq_wake = 1;
+	} else {
 		stmmac_set_mac(priv->ioaddr, false);
 		pinctrl_pm_select_sleep_state(priv->device);
 		/* Disable clock in case of PWM is off */
@@ -2891,6 +2895,7 @@ int stmmac_resume(struct net_device *ndev)
 	 */
 	if (device_may_wakeup(priv->device)) {
 		priv->hw->mac->pmt(priv->ioaddr, 0);
+		priv->irq_wake = 0;
 	} else {
 		pinctrl_pm_select_default_state(priv->device);
 		/* enable the clk prevously disabled */
-- 
1.7.6.5

^ permalink raw reply related

* Re: [Xen-devel][PATCH net-next v2] xen-netfront: clean up code in xennet_release_rx_bufs
From: David Vrabel @ 2014-01-16 11:10 UTC (permalink / raw)
  To: Annie Li
  Cc: xen-devel, netdev, davem, konrad.wilk, ian.campbell, wei.liu2,
	andrew.bennieston
In-Reply-To: <1389830228-2381-1-git-send-email-Annie.li@oracle.com>

On 15/01/14 23:57, Annie Li wrote:
> This patch implements two things:
> 
> * release grant reference and skb for rx path, this fixex resource leaking.
> * clean up grant transfer code kept from old netfront(2.6.18) which grants
> pages for access/map and transfer. But grant transfer is deprecated in current
> netfront, so remove corresponding release code for transfer.
> 
> gnttab_end_foreign_access_ref may fail when the grant entry is currently used
> for reading or writing. But this patch does not cover this and improvement for
> this failure may be implemented in a separate patch.

I don't think replacing a resource leak with a security bug is a good idea.

If you would prefer not to fix the gnttab_end_foreign_access() call, I
think you can fix this in netfront by taking a reference to the page
before calling gnttab_end_foreign_access().  This will ensure the page
isn't freed until the subsequent kfree_skb(), or the gref is released by
the foreign domain (whichever is later).

David

^ permalink raw reply

* Re: [PATCH net-next] ipv6: send Change Status Report after DAD is completed
From: Daniel Borkmann @ 2014-01-16 11:21 UTC (permalink / raw)
  To: Flavio Leitner; +Cc: netdev, Hideaki YOSHIFUJI, Hannes Frederic Sowa
In-Reply-To: <1389813008-23851-1-git-send-email-fbl@redhat.com>

On 01/15/2014 08:10 PM, Flavio Leitner wrote:
> The RFC 3810 defines two type of messages for multicast
> listeners. The "Current State Report" message, as the name
> implies, refreshes the *current* state to the querier.
> Since the querier sends Query messages periodically, there
> is no need to retransmit the report.
>
> On the other hand, any change should be reported immediately
> using "State Change Report" messages. Since it's an event
> triggered by a change and that it can be affected by packet
> loss, the rfc states it should be retransmitted [RobVar] times
> to make sure routers will receive timely.
>
> Currently, we are sending "Current State Reports" after
> DAD is completed.  Before that, we send messages using
> unspecified address (::) which should be silently discarded
> by routers.
>
> This patch changes to send "State Change Report" messages
> after DAD is completed fixing the behavior to be RFC compliant
> and also to pass TAHI IPv6 testsuite.
>
> Signed-off-by: Flavio Leitner <fbl@redhat.com>
> ---
>   net/ipv6/mcast.c | 64 ++++++++++++++++++++++++++++++++++----------------------
>   1 file changed, 39 insertions(+), 25 deletions(-)
>
> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> index 99cd65c..8ac17f5 100644
> --- a/net/ipv6/mcast.c
> +++ b/net/ipv6/mcast.c
> @@ -1493,7 +1493,7 @@ static struct sk_buff *add_grhead(struct sk_buff *skb, struct ifmcaddr6 *pmc,
>   	skb_tailroom(skb)) : 0)
>
>   static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> -	int type, int gdeleted, int sdeleted)
> +	int type, int gdeleted, int sdeleted, int crsend)

Maybe we should convert the last three to 'bool' at some point in time?

Nit: argument indent

>   {
>   	struct inet6_dev *idev = pmc->idev;
>   	struct net_device *dev = idev->dev;
> @@ -1585,7 +1585,7 @@ empty_source:
>   		if (type == MLD2_ALLOW_NEW_SOURCES ||
>   		    type == MLD2_BLOCK_OLD_SOURCES)
>   			return skb;
> -		if (pmc->mca_crcount || isquery) {
> +		if (pmc->mca_crcount || isquery || crsend) {
>   			/* make sure we have room for group header */
>   			if (skb && AVAILABLE(skb) < sizeof(struct mld2_grec)) {
>   				mld_sendpack(skb);
> @@ -1602,6 +1602,28 @@ empty_source:
>   	return skb;
>   }
>
> +static void mld_send_initial_cr(struct inet6_dev *idev)
> +{
> +	struct sk_buff *skb;
> +	struct ifmcaddr6 *pmc;
> +	int type;
> +
> +	skb = NULL;
> +	read_lock_bh(&idev->lock);
> +	for (pmc=idev->mc_list; pmc; pmc=pmc->next) {
> +		spin_lock_bh(&pmc->mca_lock);
> +		if (pmc->mca_sfcount[MCAST_EXCLUDE])
> +			type = MLD2_CHANGE_TO_EXCLUDE;
> +		else
> +			type = MLD2_CHANGE_TO_INCLUDE;
> +		skb = add_grec(skb, pmc, type, 0, 0, 1);
> +		spin_unlock_bh(&pmc->mca_lock);
> +	}
> +	read_unlock_bh(&idev->lock);
> +	if (skb)
> +		mld_sendpack(skb);
> +}
> +
>   static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
>   {
>   	struct sk_buff *skb = NULL;
> @@ -1617,7 +1639,7 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
>   				type = MLD2_MODE_IS_EXCLUDE;
>   			else
>   				type = MLD2_MODE_IS_INCLUDE;
> -			skb = add_grec(skb, pmc, type, 0, 0);
> +			skb = add_grec(skb, pmc, type, 0, 0, 0);
>   			spin_unlock_bh(&pmc->mca_lock);
>   		}
>   	} else {
> @@ -1626,7 +1648,7 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
>   			type = MLD2_MODE_IS_EXCLUDE;
>   		else
>   			type = MLD2_MODE_IS_INCLUDE;
> -		skb = add_grec(skb, pmc, type, 0, 0);
> +		skb = add_grec(skb, pmc, type, 0, 0, 0);
>   		spin_unlock_bh(&pmc->mca_lock);
>   	}
>   	read_unlock_bh(&idev->lock);
> @@ -1671,13 +1693,13 @@ static void mld_send_cr(struct inet6_dev *idev)
>   		if (pmc->mca_sfmode == MCAST_INCLUDE) {
>   			type = MLD2_BLOCK_OLD_SOURCES;
>   			dtype = MLD2_BLOCK_OLD_SOURCES;
> -			skb = add_grec(skb, pmc, type, 1, 0);
> -			skb = add_grec(skb, pmc, dtype, 1, 1);
> +			skb = add_grec(skb, pmc, type, 1, 0, 0);
> +			skb = add_grec(skb, pmc, dtype, 1, 1, 0);
>   		}
>   		if (pmc->mca_crcount) {
>   			if (pmc->mca_sfmode == MCAST_EXCLUDE) {
>   				type = MLD2_CHANGE_TO_INCLUDE;
> -				skb = add_grec(skb, pmc, type, 1, 0);
> +				skb = add_grec(skb, pmc, type, 1, 0, 0);
>   			}
>   			pmc->mca_crcount--;
>   			if (pmc->mca_crcount == 0) {
> @@ -1708,8 +1730,8 @@ static void mld_send_cr(struct inet6_dev *idev)
>   			type = MLD2_ALLOW_NEW_SOURCES;
>   			dtype = MLD2_BLOCK_OLD_SOURCES;
>   		}
> -		skb = add_grec(skb, pmc, type, 0, 0);
> -		skb = add_grec(skb, pmc, dtype, 0, 1);	/* deleted sources */
> +		skb = add_grec(skb, pmc, type, 0, 0, 0);
> +		skb = add_grec(skb, pmc, dtype, 0, 1, 0);	/* deleted sources */
>
>   		/* filter mode changes */
>   		if (pmc->mca_crcount) {
> @@ -1717,7 +1739,7 @@ static void mld_send_cr(struct inet6_dev *idev)
>   				type = MLD2_CHANGE_TO_EXCLUDE;
>   			else
>   				type = MLD2_CHANGE_TO_INCLUDE;
> -			skb = add_grec(skb, pmc, type, 0, 0);
> +			skb = add_grec(skb, pmc, type, 0, 0, 0);
>   			pmc->mca_crcount--;
>   		}
>   		spin_unlock_bh(&pmc->mca_lock);
> @@ -1825,27 +1847,19 @@ err_out:
>   	goto out;
>   }
>
> -static void mld_resend_report(struct inet6_dev *idev)
> +static void mld_resend_cr(struct inet6_dev *idev)
>   {
> -	if (MLD_V1_SEEN(idev)) {
> -		struct ifmcaddr6 *mcaddr;
> -		read_lock_bh(&idev->lock);
> -		for (mcaddr = idev->mc_list; mcaddr; mcaddr = mcaddr->next) {
> -			if (!(mcaddr->mca_flags & MAF_NOREPORT))
> -				igmp6_send(&mcaddr->mca_addr, idev->dev,
> -					   ICMPV6_MGM_REPORT);
> -		}
> -		read_unlock_bh(&idev->lock);
> -	} else {
> -		mld_send_report(idev, NULL);
> -	}
> +	if (MLD_V1_SEEN(idev))
> +		return;

$ git grep -n MLD_V1_SEEN
$

I believe I have removed that macro some time ago. ;)

I presume you mean rather: mld_in_v1_mode(idev)

> +	mld_send_initial_cr(idev);
>   }
>
>   void ipv6_mc_dad_complete(struct inet6_dev *idev)
>   {
>   	idev->mc_dad_count = idev->mc_qrv;
>   	if (idev->mc_dad_count) {
> -		mld_resend_report(idev);
> +		mld_resend_cr(idev);
>   		idev->mc_dad_count--;
>   		if (idev->mc_dad_count)
>   			mld_dad_start_timer(idev, idev->mc_maxdelay);
> @@ -1856,7 +1870,7 @@ static void mld_dad_timer_expire(unsigned long data)
>   {
>   	struct inet6_dev *idev = (struct inet6_dev *)data;
>
> -	mld_resend_report(idev);
> +	mld_resend_cr(idev);
>   	if (idev->mc_dad_count) {
>   		idev->mc_dad_count--;
>   		if (idev->mc_dad_count)
>

^ permalink raw reply

* Re: [PATCH net-next v3 5/5] virtio-net: initial rx sysfs support, export mergeable rx buffer size
From: Michael S. Tsirkin @ 2014-01-16 11:53 UTC (permalink / raw)
  To: Michael Dalton; +Cc: netdev, virtualization, Eric Dumazet, David S. Miller
In-Reply-To: <1389865126-26225-5-git-send-email-mwdalton@google.com>

On Thu, Jan 16, 2014 at 01:38:46AM -0800, Michael Dalton wrote:
> Add initial support for per-rx queue sysfs attributes to virtio-net. If
> mergeable packet buffers are enabled, adds a read-only mergeable packet
> buffer size sysfs attribute for each RX queue.
> 
> Signed-off-by: Michael Dalton <mwdalton@google.com>
> ---
>  drivers/net/virtio_net.c | 66 +++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 62 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 3e82311..f315cbb 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -27,6 +27,7 @@
>  #include <linux/slab.h>
>  #include <linux/cpu.h>
>  #include <linux/average.h>
> +#include <linux/seqlock.h>
>  
>  static int napi_weight = NAPI_POLL_WEIGHT;
>  module_param(napi_weight, int, 0444);
> @@ -89,6 +90,12 @@ struct receive_queue {
>  	/* Average packet length for mergeable receive buffers. */
>  	struct ewma mrg_avg_pkt_len;
>  
> +	/* Sequence counter to allow sysfs readers to safely access stats.
> +	 * Assumes a single virtio-net writer, which is enforced by virtio-net
> +	 * and NAPI.
> +	 */
> +	seqcount_t sysfs_seq;
> +
>  	/* Page frag for packet buffer allocation. */
>  	struct page_frag alloc_frag;
>  
> @@ -416,7 +423,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  		}
>  	}
>  
> +	write_seqcount_begin(&rq->sysfs_seq);
>  	ewma_add(&rq->mrg_avg_pkt_len, head_skb->len);
> +	write_seqcount_end(&rq->sysfs_seq);
>  	return head_skb;
>  
>  err_skb:

Hmm this adds overhead just to prevent sysfs from getting wrong value.
Can't sysfs simply disable softirq while it's reading the value?

> @@ -604,18 +613,29 @@ static int add_recvbuf_big(struct receive_queue *rq, gfp_t gfp)
>  	return err;
>  }
>  
> -static int add_recvbuf_mergeable(struct receive_queue *rq, gfp_t gfp)
> +static unsigned int get_mergeable_buf_len(struct ewma *avg_pkt_len)
>  {
>  	const size_t hdr_len = sizeof(struct virtio_net_hdr_mrg_rxbuf);
> +	unsigned int len;
> +
> +	len = hdr_len + clamp_t(unsigned int, ewma_read(avg_pkt_len),
> +			GOOD_PACKET_LEN, PAGE_SIZE - hdr_len);
> +	return ALIGN(len, MERGEABLE_BUFFER_ALIGN);
> +}
> +
> +static int add_recvbuf_mergeable(struct receive_queue *rq, gfp_t gfp)
> +{
>  	struct page_frag *alloc_frag = &rq->alloc_frag;
>  	char *buf;
>  	unsigned long ctx;
>  	int err;
>  	unsigned int len, hole;
>  
> -	len = hdr_len + clamp_t(unsigned int, ewma_read(&rq->mrg_avg_pkt_len),
> -				GOOD_PACKET_LEN, PAGE_SIZE - hdr_len);
> -	len = ALIGN(len, MERGEABLE_BUFFER_ALIGN);
> +	/* avg_pkt_len is written only in NAPI rx softirq context. We may
> +	 * read avg_pkt_len without using the sysfs_seq seqcount, as this code
> +	 * is called only in NAPI rx softirq context or when NAPI is disabled.
> +	 */
> +	len = get_mergeable_buf_len(&rq->mrg_avg_pkt_len);
>  	if (unlikely(!skb_page_frag_refill(len, alloc_frag, gfp)))
>  		return -ENOMEM;
>  
> @@ -1557,6 +1577,7 @@ static int virtnet_alloc_queues(struct virtnet_info *vi)
>  			       napi_weight);
>  
>  		sg_init_table(vi->rq[i].sg, ARRAY_SIZE(vi->rq[i].sg));
> +		seqcount_init(&vi->rq[i].sysfs_seq);
>  		ewma_init(&vi->rq[i].mrg_avg_pkt_len, 1, RECEIVE_AVG_WEIGHT);
>  		sg_init_table(vi->sq[i].sg, ARRAY_SIZE(vi->sq[i].sg));
>  	}
> @@ -1594,6 +1615,39 @@ err:
>  	return ret;
>  }
>  
> +#ifdef CONFIG_SYSFS
> +static ssize_t mergeable_rx_buffer_size_show(struct netdev_rx_queue *queue,
> +		struct rx_queue_attribute *attribute, char *buf)
> +{
> +	struct virtnet_info *vi = netdev_priv(queue->dev);
> +	unsigned int queue_index = get_netdev_rx_queue_index(queue);
> +	struct receive_queue *rq;
> +	struct ewma avg;
> +	unsigned int start;
> +
> +	BUG_ON(queue_index >= vi->max_queue_pairs);
> +	rq = &vi->rq[queue_index];
> +	do {
> +		start = read_seqcount_begin(&rq->sysfs_seq);
> +		avg = rq->mrg_avg_pkt_len;
> +	} while (read_seqcount_retry(&rq->sysfs_seq, start));
> +	return sprintf(buf, "%u\n", get_mergeable_buf_len(&avg));
> +}
> +
> +static struct rx_queue_attribute mergeable_rx_buffer_size_attribute =
> +	__ATTR_RO(mergeable_rx_buffer_size);
> +
> +static struct attribute *virtio_net_mrg_rx_attrs[] = {
> +	&mergeable_rx_buffer_size_attribute.attr,
> +	NULL
> +};
> +
> +static const struct attribute_group virtio_net_mrg_rx_group = {
> +	.name = "virtio_net",
> +	.attrs = virtio_net_mrg_rx_attrs
> +};
> +#endif
> +
>  static int virtnet_probe(struct virtio_device *vdev)
>  {
>  	int i, err;
> @@ -1708,6 +1762,10 @@ static int virtnet_probe(struct virtio_device *vdev)
>  	if (err)
>  		goto free_stats;
>  
> +#ifdef CONFIG_SYSFS
> +	if (vi->mergeable_rx_bufs)
> +		dev->sysfs_rx_queue_group = &virtio_net_mrg_rx_group;
> +#endif
>  	netif_set_real_num_tx_queues(dev, vi->curr_queue_pairs);
>  	netif_set_real_num_rx_queues(dev, vi->curr_queue_pairs);
>  
> -- 
> 1.8.5.2

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox