Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v2 1/6] ethtool: move option parsing related code into function
From: Michal Kubecek @ 2019-02-06  9:56 UTC (permalink / raw)
  To: netdev; +Cc: Jeff Kirsher, linville, Nicholas Nunley, nhorman, sassmann
In-Reply-To: <20190206000106.24364-1-jeffrey.t.kirsher@intel.com>

On Tue, Feb 05, 2019 at 04:01:01PM -0800, Jeff Kirsher wrote:
> From: Nicholas Nunley <nicholas.d.nunley@intel.com>
> 
> Move option parsing code into find_option function.
> 
> No behavior changes.
> 
> Based on patch by Kan Liang <kan.liang@intel.com>
> 
> Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
>  ethtool.c | 49 +++++++++++++++++++++++++++++++------------------
>  1 file changed, 31 insertions(+), 18 deletions(-)
> 
> diff --git a/ethtool.c b/ethtool.c
> index 2f7e96b..8b7c224 100644
> --- a/ethtool.c
> +++ b/ethtool.c
> @@ -5265,6 +5265,29 @@ static int show_usage(struct cmd_context *ctx)
>  	return 0;
>  }
>  
> +static int find_option(int argc, char **argp)
> +{
> +	const char *opt;
> +	size_t len;
> +	int k;
> +
> +	for (k = 0; args[k].opts; k++) {
> +		opt = args[k].opts;
> +		for (;;) {
> +			len = strcspn(opt, "|");
> +			if (strncmp(*argp, opt, len) == 0 &&
> +			    (*argp)[len] == 0)
> +				return k;
> +
> +			if (opt[len] == 0)
> +				break;
> +			opt += len + 1;
> +		}
> +	}
> +
> +	return -1;
> +}
> +
>  int main(int argc, char **argp)
>  {
>  	int (*func)(struct cmd_context *);
> @@ -5284,24 +5307,14 @@ int main(int argc, char **argp)
>  	 */
>  	if (argc == 0)
>  		exit_bad_args();
> -	for (k = 0; args[k].opts; k++) {
> -		const char *opt;
> -		size_t len;
> -		opt = args[k].opts;
> -		for (;;) {
> -			len = strcspn(opt, "|");
> -			if (strncmp(*argp, opt, len) == 0 &&
> -			    (*argp)[len] == 0) {
> -				argp++;
> -				argc--;
> -				func = args[k].func;
> -				want_device = args[k].want_device;
> -				goto opt_found;
> -			}
> -			if (opt[len] == 0)
> -				break;
> -			opt += len + 1;
> -		}
> +
> +	k = find_option(argc, argp);
> +	if (k >= 0) {
> +		argp++;
> +		argc--;
> +		func = args[k].func;
> +		want_device = args[k].want_device;
> +		goto opt_found;
>  	}
>  	if ((*argp)[0] == '-')
>  		exit_bad_args();

After hiding the loop into find_option() helper, you can put the
following few lines into else branch and get rid of the goto.

Michal Kubecek

^ permalink raw reply

* Re: [PATCH net-next v7 0/8] devlink: Add configuration parameters support for devlink_port
From: Vasundhara Volam @ 2019-02-06 10:13 UTC (permalink / raw)
  To: Michal Kubecek
  Cc: Jakub Kicinski, Netdev, David Miller, michael.chan@broadcom.com,
	Jiri Pirko
In-Reply-To: <20190205165133.GF21401@unicorn.suse.cz>

On Tue, Feb 5, 2019 at 10:21 PM Michal Kubecek <mkubecek@suse.cz> wrote:
>
> On Tue, Feb 05, 2019 at 09:53:26AM +0530, Vasundhara Volam wrote:
> > On Tue, Feb 5, 2019 at 8:26 AM Jakub Kicinski
> > >
> > > No?  We were talking about using the soon-too-come ethtool netlink
> > > API with additional indication that given configuration request is
> > > supposed to be persisted.  Adding more devlink parameters is exactly
> > > the opposite of what you should be doing.
> >
> > Okay. So, till then can we have the devlink wake_on_lan parameter or
> > you want this to be removed? Could you please clarify?
> >
> > Once ethtool netlink API is available with persisted support, I can remove
> > this wake_on_lan parameter from devlink. Thanks.
>
> Once you provide an interface for userspace and applications start using
> it, it's hard to get rid of it. As an extreme example, the legacy ioctl
> interface used by ifconfig has been declared obsolete since kernel 2.2.0
> (January 1999, i.e. 20 years ago) and we still have to maintain it.
>
Okay Got it. I will revert only the wake_on_lan parameter and send the patch.
We will wait for soon-too-come ethtool netlink API.

Thank you.
> Michal Kubecek

^ permalink raw reply

* [PATCH 6/8] net: thunderx: add mutex to protect mailbox from concurrent calls for same VF
From: Vadim Lomovtsev @ 2019-02-06 10:13 UTC (permalink / raw)
  To: sgoutham@cavium.com, rric@kernel.org, davem@davemloft.net,
	linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: dnelson@redhat.com, Vadim Lomovtsev
In-Reply-To: <20190206101351.16744-1-vlomovtsev@marvell.com>

In some cases it could happen that nicvf_send_msg_to_pf() could be called
concurrently for the same NIC VF, and thus re-writing mailbox contents and
breaking messaging sequence with PF by re-writing NICVF data.

This commit is to implement mutex for NICVF to protect mailbox registers
and NICVF messaging control data from concurrent access.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
---
 drivers/net/ethernet/cavium/thunder/nic.h        |  2 ++
 drivers/net/ethernet/cavium/thunder/nicvf_main.c | 13 ++++++++++---
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index 227343625e83..86cda3f4b37b 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -329,6 +329,8 @@ struct nicvf {
 	spinlock_t              rx_mode_wq_lock;
 	/* workqueue for handling kernel ndo_set_rx_mode() calls */
 	struct workqueue_struct *nicvf_rx_mode_wq;
+	/* mutex to protect VF's mailbox contents from concurrent access */
+	struct mutex            rx_mode_mtx;
 
 	/* PTP timestamp */
 	struct cavium_ptp	*ptp_clock;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 30c7f54b4f17..a05e2989ec76 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -124,6 +124,9 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx)
 {
 	int timeout = NIC_MBOX_MSG_TIMEOUT;
 	int sleep = 10;
+	int ret = 0;
+
+	mutex_lock(&nic->rx_mode_mtx);
 
 	nic->pf_acked = false;
 	nic->pf_nacked = false;
@@ -136,7 +139,8 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx)
 			netdev_err(nic->netdev,
 				   "PF NACK to mbox msg 0x%02x from VF%d\n",
 				   (mbx->msg.msg & 0xFF), nic->vf_id);
-			return -EINVAL;
+			ret = -EINVAL;
+			break;
 		}
 		msleep(sleep);
 		if (nic->pf_acked)
@@ -146,10 +150,12 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx)
 			netdev_err(nic->netdev,
 				   "PF didn't ACK to mbox msg 0x%02x from VF%d\n",
 				   (mbx->msg.msg & 0xFF), nic->vf_id);
-			return -EBUSY;
+			ret = -EBUSY;
+			break;
 		}
 	}
-	return 0;
+	mutex_unlock(&nic->rx_mode_mtx);
+	return ret;
 }
 
 /* Checks if VF is able to comminicate with PF
@@ -2211,6 +2217,7 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 							nic->vf_id);
 	INIT_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task);
 	spin_lock_init(&nic->rx_mode_wq_lock);
+	mutex_init(&nic->rx_mode_mtx);
 
 	err = register_netdev(netdev);
 	if (err) {
-- 
2.17.2

^ permalink raw reply related

* [PATCH 7/8] net: thunderx: implement helpers to read mailbox IRQ status
From: Vadim Lomovtsev @ 2019-02-06 10:13 UTC (permalink / raw)
  To: sgoutham@cavium.com, rric@kernel.org, davem@davemloft.net,
	linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: dnelson@redhat.com, Vadim Lomovtsev
In-Reply-To: <20190206101351.16744-1-vlomovtsev@marvell.com>

This commit is to implement routines to read mailbox IRQ status
for particular VF at PF side, and for mailbox IRQ status
from PF at VF side.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
---
 drivers/net/ethernet/cavium/thunder/nic_main.c     | 13 +++++++++++++
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 14 ++++++++++++++
 drivers/net/ethernet/cavium/thunder/nicvf_queues.h |  1 +
 3 files changed, 28 insertions(+)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 620dbe082ca0..a32c1bd75794 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -104,6 +104,19 @@ static u64 nic_reg_read(struct nicpf *nic, u64 offset)
 	return readq_relaxed(nic->reg_base + offset);
 }
 
+static int nic_is_mbox_intr_active(struct nicpf *nic, int vf_id)
+{
+	int ret = 0;
+
+	if (vf_id < NIC_VF_PER_MBX_REG) {
+		ret = nic_reg_read(nic, NIC_PF_MAILBOX_INT) & BIT_ULL(vf_id);
+	} else {
+		ret = nic_reg_read(nic, NIC_PF_MAILBOX_INT + sizeof(u64)) &
+			BIT_ULL(vf_id - NIC_VF_PER_MBX_REG);
+	}
+	return ret;
+}
+
 /* PF -> VF mailbox communication APIs */
 static void nic_enable_mbx_intr(struct nicpf *nic)
 {
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 5b4d3badcb73..e7ee7005657c 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -1801,6 +1801,20 @@ void nicvf_clear_intr(struct nicvf *nic, int int_type, int q_idx)
 	nicvf_reg_write(nic, NIC_VF_INT, mask);
 }
 
+/* Check if interrupt is active */
+int nicvf_check_is_intr_active(struct nicvf *nic, int int_type, int q_idx)
+{
+	u64 mask = nicvf_int_type_to_mask(int_type, q_idx);
+
+	if (!mask) {
+		netdev_dbg(nic->netdev,
+			   "Failed to read interrupt status: unknown type\n");
+		return 0;
+	}
+
+	return (mask & nicvf_reg_read(nic, NIC_VF_INT));
+}
+
 /* Check if interrupt is enabled */
 int nicvf_is_intr_enabled(struct nicvf *nic, int int_type, int q_idx)
 {
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index 5e9a03cf1b4d..58f6fbe48bce 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -358,6 +358,7 @@ void nicvf_enable_intr(struct nicvf *nic, int int_type, int q_idx);
 void nicvf_disable_intr(struct nicvf *nic, int int_type, int q_idx);
 void nicvf_clear_intr(struct nicvf *nic, int int_type, int q_idx);
 int nicvf_is_intr_enabled(struct nicvf *nic, int int_type, int q_idx);
+int nicvf_check_is_intr_active(struct nicvf *nic, int int_type, int q_idx);
 
 /* Register access APIs */
 void nicvf_reg_write(struct nicvf *nic, u64 offset, u64 val);
-- 
2.17.2

^ permalink raw reply related

* [PATCH 8/8] net: thunderx: check status of mailbox IRQ before sending a message
From: Vadim Lomovtsev @ 2019-02-06 10:13 UTC (permalink / raw)
  To: sgoutham@cavium.com, rric@kernel.org, davem@davemloft.net,
	linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: dnelson@redhat.com, Vadim Lomovtsev
In-Reply-To: <20190206101351.16744-1-vlomovtsev@marvell.com>

In order to prevent mailbox data re-writing at VF side we need to check if
there is an active mailbox IRQ from PF, and if there is no one proceed with
sending message to PF. Having spinlock at irq handler and message send
routing wont help since by the moment when code flow would reach the irq
handler and acquire spinlock the message send routine could be already
invoked and thus re-write data in the mailbox.

The same is true for PF while sending messages to VF.

This commit is to implement mailbox IRQ status check before sending
message to VF from PF. Same is for sending message to PF from VF.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
---
 .../net/ethernet/cavium/thunder/nic_main.c    | 39 ++++++++-----------
 .../net/ethernet/cavium/thunder/nicvf_main.c  |  3 ++
 2 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index a32c1bd75794..e0041692caef 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -64,7 +64,6 @@ struct nicpf {
 	u32			*speed;
 	u16			cpi_base[MAX_NUM_VFS_SUPPORTED];
 	u16			rssi_base[MAX_NUM_VFS_SUPPORTED];
-	bool			mbx_lock[MAX_NUM_VFS_SUPPORTED];
 
 	/* MSI-X */
 	u8			num_vec;
@@ -954,8 +953,6 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
 	int i;
 	int ret = 0;
 
-	nic->mbx_lock[vf] = true;
-
 	mbx_addr = nic_get_mbx_addr(vf);
 	mbx_data = (u64 *)&mbx;
 
@@ -975,7 +972,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
 			nic->duplex[vf] = 0;
 			nic->speed[vf] = 0;
 		}
-		goto unlock;
+		return;
 	case NIC_MBOX_MSG_QS_CFG:
 		reg_addr = NIC_PF_QSET_0_127_CFG |
 			   (mbx.qs.num << NIC_QS_ID_SHIFT);
@@ -1044,7 +1041,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
 		break;
 	case NIC_MBOX_MSG_RSS_SIZE:
 		nic_send_rss_size(nic, vf);
-		goto unlock;
+		return;
 	case NIC_MBOX_MSG_RSS_CFG:
 	case NIC_MBOX_MSG_RSS_CFG_CONT:
 		nic_config_rss(nic, &mbx.rss_cfg);
@@ -1062,19 +1059,19 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
 		break;
 	case NIC_MBOX_MSG_ALLOC_SQS:
 		nic_alloc_sqs(nic, &mbx.sqs_alloc);
-		goto unlock;
+		return;
 	case NIC_MBOX_MSG_NICVF_PTR:
 		nic->nicvf[vf] = mbx.nicvf.nicvf;
 		break;
 	case NIC_MBOX_MSG_PNICVF_PTR:
 		nic_send_pnicvf(nic, vf);
-		goto unlock;
+		return;
 	case NIC_MBOX_MSG_SNICVF_PTR:
 		nic_send_snicvf(nic, &mbx.nicvf);
-		goto unlock;
+		return;
 	case NIC_MBOX_MSG_BGX_STATS:
 		nic_get_bgx_stats(nic, &mbx.bgx_stats);
-		goto unlock;
+		return;
 	case NIC_MBOX_MSG_LOOPBACK:
 		ret = nic_config_loopback(nic, &mbx.lbk);
 		break;
@@ -1083,7 +1080,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
 		break;
 	case NIC_MBOX_MSG_PFC:
 		nic_pause_frame(nic, vf, &mbx.pfc);
-		goto unlock;
+		return;
 	case NIC_MBOX_MSG_PTP_CFG:
 		nic_config_timestamp(nic, vf, &mbx.ptp);
 		break;
@@ -1134,8 +1131,6 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
 			mbx.msg.msg, vf);
 		nic_mbx_send_nack(nic, vf);
 	}
-unlock:
-	nic->mbx_lock[vf] = false;
 }
 
 static irqreturn_t nic_mbx_intr_handler(int irq, void *nic_irq)
@@ -1313,18 +1308,18 @@ static void nic_poll_for_link(struct work_struct *work)
 		if (nic->link[vf] == link.link_up)
 			continue;
 
-		if (!nic->mbx_lock[vf]) {
-			nic->link[vf] = link.link_up;
-			nic->duplex[vf] = link.duplex;
-			nic->speed[vf] = link.speed;
+		nic->link[vf] = link.link_up;
+		nic->duplex[vf] = link.duplex;
+		nic->speed[vf] = link.speed;
 
-			/* Send a mbox message to VF with current link status */
-			mbx.link_status.link_up = link.link_up;
-			mbx.link_status.duplex = link.duplex;
-			mbx.link_status.speed = link.speed;
-			mbx.link_status.mac_type = link.mac_type;
+		/* Send a mbox message to VF with current link status */
+		mbx.link_status.link_up = link.link_up;
+		mbx.link_status.duplex = link.duplex;
+		mbx.link_status.speed = link.speed;
+		mbx.link_status.mac_type = link.mac_type;
+
+		if (!nic_is_mbox_intr_active(nic, vf))
 			nic_send_msg_to_vf(nic, vf, &mbx);
-		}
 	}
 	queue_delayed_work(nic->check_link, &nic->dwork, HZ * 2);
 }
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index a05e2989ec76..66e19c207467 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -128,6 +128,9 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx)
 
 	mutex_lock(&nic->rx_mode_mtx);
 
+	while (nicvf_check_is_intr_active(nic, NICVF_INTR_MBOX, 0))
+		msleep(1);
+
 	nic->pf_acked = false;
 	nic->pf_nacked = false;
 
-- 
2.17.2

^ permalink raw reply related

* [PATCH 3/8] net: thunderx: make CFG_DONE message to run through generic send-ack sequence
From: Vadim Lomovtsev @ 2019-02-06 10:13 UTC (permalink / raw)
  To: sgoutham@cavium.com, rric@kernel.org, davem@davemloft.net,
	linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: dnelson@redhat.com, Vadim Lomovtsev
In-Reply-To: <20190206101351.16744-1-vlomovtsev@marvell.com>

At the end of NIC VF initialization it send CFG_DONE message to PF without
using nicvf_msg_send_to_pf routine. This potentially could re-write data in
mailbox. This commit is to implement common way of sending CFG_DONE message
by the same way with other configuration messages by using
nicvf_send_msg_to_pf() routine.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
---
 drivers/net/ethernet/cavium/thunder/nic_main.c |  2 +-
 .../net/ethernet/cavium/thunder/nicvf_main.c   | 18 +++++++++++++++---
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 6c8dcb65ff03..90497a27df18 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -1039,7 +1039,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
 	case NIC_MBOX_MSG_CFG_DONE:
 		/* Last message of VF config msg sequence */
 		nic_enable_vf(nic, vf, true);
-		goto unlock;
+		break;
 	case NIC_MBOX_MSG_SHUTDOWN:
 		/* First msg in VF teardown sequence */
 		if (vf >= nic->num_vf_en)
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index abf24e7dff2d..b0e8a04e0f1e 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -169,6 +169,20 @@ static int nicvf_check_pf_ready(struct nicvf *nic)
 	return 1;
 }
 
+static int nicvf_send_cfg_done(struct nicvf *nic)
+{
+	union nic_mbx mbx = {};
+
+	mbx.msg.msg = NIC_MBOX_MSG_CFG_DONE;
+	if (nicvf_send_msg_to_pf(nic, &mbx)) {
+		netdev_err(nic->netdev,
+			   "PF didn't respond to CFG DONE msg\n");
+		return 0;
+	}
+
+	return 1;
+}
+
 static void nicvf_read_bgx_stats(struct nicvf *nic, struct bgx_stats_msg *bgx)
 {
 	if (bgx->rx)
@@ -1416,7 +1430,6 @@ int nicvf_open(struct net_device *netdev)
 	struct nicvf *nic = netdev_priv(netdev);
 	struct queue_set *qs = nic->qs;
 	struct nicvf_cq_poll *cq_poll = NULL;
-	union nic_mbx mbx = {};
 
 	/* wait till all queued set_rx_mode tasks completes if any */
 	drain_workqueue(nic->nicvf_rx_mode_wq);
@@ -1515,8 +1528,7 @@ int nicvf_open(struct net_device *netdev)
 		nicvf_enable_intr(nic, NICVF_INTR_RBDR, qidx);
 
 	/* Send VF config done msg to PF */
-	mbx.msg.msg = NIC_MBOX_MSG_CFG_DONE;
-	nicvf_write_to_mbx(nic, &mbx);
+	nicvf_send_cfg_done(nic);
 
 	return 0;
 cleanup:
-- 
2.17.2

^ permalink raw reply related

* [PATCH 0/8] nic: thunderx: fix communication races betwen VF & PF
From: Vadim Lomovtsev @ 2019-02-06 10:13 UTC (permalink / raw)
  To: sgoutham@cavium.com, rric@kernel.org, davem@davemloft.net,
	linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: dnelson@redhat.com, Vadim Lomovtsev

The ThunderX CN88XX NIC Virtual Function driver uses mailbox interface 
to communicate to physical function driver. Each of VF has it's own pair
of mailbox registers to read from and write to. The mailbox registers
has no protection from possible races, so it has to be implemented
at software side.

After long term testing by loop of 'ip link set <ifname> up/down'
command it was found that there are two possible scenarios when
race condition appears:
 1. VF receives link change message from PF and VF send RX mode
configuration message to PF in the same time from separate thread.
 2. PF receives RX mode configuration from VF and in the same time,
in separate thread PF detects link status change and sends appropriate
message to particular VF.

Both cases leads to mailbox data to be rewritten, NIC VF messaging control
data to be updated incorrectly and communication sequence gets broken.

This patch series is to address race condition with VF & PF communication.

Vadim Lomovtsev (8):
  net: thunderx: correct typo in macro name
  net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs
    to private for each of them.
  net: thunderx: make CFG_DONE message to run through generic send-ack
    sequence
  net: thunderx: add nicvf_send_msg_to_pf result check for
    set_rx_mode_task
  net: thunderx: rework xcast message structure to make it fit into 64
    bit
  net: thunderx: add mutex to protect mailbox from concurrent calls for
    same VF
  net: thunderx: implement helpers to read mailbox IRQ status
  net: thunderx: check status of mailbox IRQ before sending a message

 drivers/net/ethernet/cavium/thunder/nic.h     | 12 +--
 .../net/ethernet/cavium/thunder/nic_main.c    | 58 +++++++------
 .../net/ethernet/cavium/thunder/nicvf_main.c  | 82 +++++++++++++------
 .../ethernet/cavium/thunder/nicvf_queues.c    | 14 ++++
 .../ethernet/cavium/thunder/nicvf_queues.h    |  1 +
 .../net/ethernet/cavium/thunder/thunder_bgx.c |  2 +-
 .../net/ethernet/cavium/thunder/thunder_bgx.h |  2 +-
 7 files changed, 112 insertions(+), 59 deletions(-)

-- 
2.17.2

^ permalink raw reply

* [PATCH 4/8] net: thunderx: add nicvf_send_msg_to_pf result check for set_rx_mode_task
From: Vadim Lomovtsev @ 2019-02-06 10:13 UTC (permalink / raw)
  To: sgoutham@cavium.com, rric@kernel.org, davem@davemloft.net,
	linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: dnelson@redhat.com, Vadim Lomovtsev
In-Reply-To: <20190206101351.16744-1-vlomovtsev@marvell.com>

The rx_set_mode invokes number of messages to be send to PF for receive
mode configuration. In case if there any issues we need to stop sending
messages and release allocated memory.

This commit is to implement check of nicvf_msg_send_to_pf() result.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
---
 drivers/net/ethernet/cavium/thunder/nicvf_main.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index b0e8a04e0f1e..dbd8862d60d6 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1956,7 +1956,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
 
 	/* flush DMAC filters and reset RX mode */
 	mbx.xcast.msg = NIC_MBOX_MSG_RESET_XCAST;
-	nicvf_send_msg_to_pf(nic, &mbx);
+	if (nicvf_send_msg_to_pf(nic, &mbx) < 0)
+		goto free_mc;
 
 	if (mode & BGX_XCAST_MCAST_FILTER) {
 		/* once enabling filtering, we need to signal to PF to add
@@ -1964,7 +1965,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
 		 */
 		mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST;
 		mbx.xcast.data.mac = 0;
-		nicvf_send_msg_to_pf(nic, &mbx);
+		if (nicvf_send_msg_to_pf(nic, &mbx) < 0)
+			goto free_mc;
 	}
 
 	/* check if we have any specific MACs to be added to PF DMAC filter */
@@ -1973,9 +1975,9 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
 		for (idx = 0; idx < mc_addrs->count; idx++) {
 			mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST;
 			mbx.xcast.data.mac = mc_addrs->mc[idx];
-			nicvf_send_msg_to_pf(nic, &mbx);
+			if (nicvf_send_msg_to_pf(nic, &mbx) < 0)
+				goto free_mc;
 		}
-		kfree(mc_addrs);
 	}
 
 	/* and finally set rx mode for PF accordingly */
@@ -1983,6 +1985,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
 	mbx.xcast.data.mode = mode;
 
 	nicvf_send_msg_to_pf(nic, &mbx);
+free_mc:
+	kfree(mc_addrs);
 }
 
 static void nicvf_set_rx_mode_task(struct work_struct *work_arg)
-- 
2.17.2

^ permalink raw reply related

* [PATCH 5/8] net: thunderx: rework xcast message structure to make it fit into 64 bit
From: Vadim Lomovtsev @ 2019-02-06 10:13 UTC (permalink / raw)
  To: sgoutham@cavium.com, rric@kernel.org, davem@davemloft.net,
	linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: dnelson@redhat.com, Vadim Lomovtsev
In-Reply-To: <20190206101351.16744-1-vlomovtsev@marvell.com>

To communicate to PF each of ThunderX NIC VF uses mailbox which is
pair of 64 bit registers available to both VFn and PF.

This commit is to change the xcast message structure in order to
fit it into 64 bit.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
---
 drivers/net/ethernet/cavium/thunder/nic.h        | 6 ++----
 drivers/net/ethernet/cavium/thunder/nic_main.c   | 4 ++--
 drivers/net/ethernet/cavium/thunder/nicvf_main.c | 6 +++---
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index 376a96bce33f..227343625e83 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -577,10 +577,8 @@ struct set_ptp {
 
 struct xcast {
 	u8    msg;
-	union {
-		u8    mode;
-		u64   mac;
-	} data;
+	u8    mode;
+	u64   mac:48;
 };
 
 /* 128 bit shared memory between PF and each VF */
diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 90497a27df18..620dbe082ca0 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -1094,7 +1094,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
 		bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
 		lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
 		bgx_set_dmac_cam_filter(nic->node, bgx, lmac,
-					mbx.xcast.data.mac,
+					mbx.xcast.mac,
 					vf < NIC_VF_PER_MBX_REG ? vf :
 					vf - NIC_VF_PER_MBX_REG);
 		break;
@@ -1106,7 +1106,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
 		}
 		bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
 		lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
-		bgx_set_xcast_mode(nic->node, bgx, lmac, mbx.xcast.data.mode);
+		bgx_set_xcast_mode(nic->node, bgx, lmac, mbx.xcast.mode);
 		break;
 	default:
 		dev_err(&nic->pdev->dev,
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index dbd8862d60d6..30c7f54b4f17 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1964,7 +1964,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
 		 * its' own LMAC to the filter to accept packets for it.
 		 */
 		mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST;
-		mbx.xcast.data.mac = 0;
+		mbx.xcast.mac = 0;
 		if (nicvf_send_msg_to_pf(nic, &mbx) < 0)
 			goto free_mc;
 	}
@@ -1974,7 +1974,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
 		/* now go through kernel list of MACs and add them one by one */
 		for (idx = 0; idx < mc_addrs->count; idx++) {
 			mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST;
-			mbx.xcast.data.mac = mc_addrs->mc[idx];
+			mbx.xcast.mac = mc_addrs->mc[idx];
 			if (nicvf_send_msg_to_pf(nic, &mbx) < 0)
 				goto free_mc;
 		}
@@ -1982,7 +1982,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
 
 	/* and finally set rx mode for PF accordingly */
 	mbx.xcast.msg = NIC_MBOX_MSG_SET_XCAST;
-	mbx.xcast.data.mode = mode;
+	mbx.xcast.mode = mode;
 
 	nicvf_send_msg_to_pf(nic, &mbx);
 free_mc:
-- 
2.17.2

^ permalink raw reply related

* [PATCH 2/8] net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs to private for each of them.
From: Vadim Lomovtsev @ 2019-02-06 10:13 UTC (permalink / raw)
  To: sgoutham@cavium.com, rric@kernel.org, davem@davemloft.net,
	linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: dnelson@redhat.com, Vadim Lomovtsev
In-Reply-To: <20190206101351.16744-1-vlomovtsev@marvell.com>

Having one work queue for receive mode configuration ndo_set_rx_mode()
call for all VFs results in making each of them wait till the
set_rx_mode() call completes for another VF if any of close, set
receive mode and change flags calls being already invoked. Potentially
this could cause device state change before appropriate call of receive
mode configuration completes, so the call itself became meaningless,
corrupt data or break configuration sequence.

We don't need any delays in NIC VF configuration sequence so having delayed
work call with 0 delay has no sense.

This commit is to implement one work queue for each NIC VF for set_rx_mode
task and to let them work independently and replacing delayed_work
with work_struct.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
---
 drivers/net/ethernet/cavium/thunder/nic.h     |  4 ++-
 .../net/ethernet/cavium/thunder/nicvf_main.c  | 30 ++++++++++---------
 2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index f4d81765221e..376a96bce33f 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -271,7 +271,7 @@ struct xcast_addr_list {
 };
 
 struct nicvf_work {
-	struct delayed_work    work;
+	struct work_struct     work;
 	u8                     mode;
 	struct xcast_addr_list *mc;
 };
@@ -327,6 +327,8 @@ struct nicvf {
 	struct nicvf_work       rx_mode_work;
 	/* spinlock to protect workqueue arguments from concurrent access */
 	spinlock_t              rx_mode_wq_lock;
+	/* workqueue for handling kernel ndo_set_rx_mode() calls */
+	struct workqueue_struct *nicvf_rx_mode_wq;
 
 	/* PTP timestamp */
 	struct cavium_ptp	*ptp_clock;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 88f8a8fa93cd..abf24e7dff2d 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -68,9 +68,6 @@ module_param(cpi_alg, int, 0444);
 MODULE_PARM_DESC(cpi_alg,
 		 "PFC algorithm (0=none, 1=VLAN, 2=VLAN16, 3=IP Diffserv)");
 
-/* workqueue for handling kernel ndo_set_rx_mode() calls */
-static struct workqueue_struct *nicvf_rx_mode_wq;
-
 static inline u8 nicvf_netdev_qidx(struct nicvf *nic, u8 qidx)
 {
 	if (nic->sqs_mode)
@@ -1311,6 +1308,9 @@ int nicvf_stop(struct net_device *netdev)
 	struct nicvf_cq_poll *cq_poll = NULL;
 	union nic_mbx mbx = {};
 
+	/* wait till all queued set_rx_mode tasks completes */
+	drain_workqueue(nic->nicvf_rx_mode_wq);
+
 	mbx.msg.msg = NIC_MBOX_MSG_SHUTDOWN;
 	nicvf_send_msg_to_pf(nic, &mbx);
 
@@ -1418,6 +1418,9 @@ int nicvf_open(struct net_device *netdev)
 	struct nicvf_cq_poll *cq_poll = NULL;
 	union nic_mbx mbx = {};
 
+	/* wait till all queued set_rx_mode tasks completes if any */
+	drain_workqueue(nic->nicvf_rx_mode_wq);
+
 	netif_carrier_off(netdev);
 
 	err = nicvf_register_misc_interrupt(nic);
@@ -1973,7 +1976,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
 static void nicvf_set_rx_mode_task(struct work_struct *work_arg)
 {
 	struct nicvf_work *vf_work = container_of(work_arg, struct nicvf_work,
-						  work.work);
+						  work);
 	struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work);
 	u8 mode;
 	struct xcast_addr_list *mc;
@@ -2030,7 +2033,7 @@ static void nicvf_set_rx_mode(struct net_device *netdev)
 	kfree(nic->rx_mode_work.mc);
 	nic->rx_mode_work.mc = mc_list;
 	nic->rx_mode_work.mode = mode;
-	queue_delayed_work(nicvf_rx_mode_wq, &nic->rx_mode_work.work, 0);
+	queue_work(nic->nicvf_rx_mode_wq, &nic->rx_mode_work.work);
 	spin_unlock(&nic->rx_mode_wq_lock);
 }
 
@@ -2187,7 +2190,10 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	INIT_WORK(&nic->reset_task, nicvf_reset_task);
 
-	INIT_DELAYED_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task);
+	nic->nicvf_rx_mode_wq = alloc_ordered_workqueue("nicvf_rx_mode_wq_VF%d",
+							WQ_MEM_RECLAIM,
+							nic->vf_id);
+	INIT_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task);
 	spin_lock_init(&nic->rx_mode_wq_lock);
 
 	err = register_netdev(netdev);
@@ -2228,13 +2234,15 @@ static void nicvf_remove(struct pci_dev *pdev)
 	nic = netdev_priv(netdev);
 	pnetdev = nic->pnicvf->netdev;
 
-	cancel_delayed_work_sync(&nic->rx_mode_work.work);
-
 	/* Check if this Qset is assigned to different VF.
 	 * If yes, clean primary and all secondary Qsets.
 	 */
 	if (pnetdev && (pnetdev->reg_state == NETREG_REGISTERED))
 		unregister_netdev(pnetdev);
+	if (nic->nicvf_rx_mode_wq) {
+		destroy_workqueue(nic->nicvf_rx_mode_wq);
+		nic->nicvf_rx_mode_wq = NULL;
+	}
 	nicvf_unregister_interrupts(nic);
 	pci_set_drvdata(pdev, NULL);
 	if (nic->drv_stats)
@@ -2261,17 +2269,11 @@ static struct pci_driver nicvf_driver = {
 static int __init nicvf_init_module(void)
 {
 	pr_info("%s, ver %s\n", DRV_NAME, DRV_VERSION);
-	nicvf_rx_mode_wq = alloc_ordered_workqueue("nicvf_generic",
-						   WQ_MEM_RECLAIM);
 	return pci_register_driver(&nicvf_driver);
 }
 
 static void __exit nicvf_cleanup_module(void)
 {
-	if (nicvf_rx_mode_wq) {
-		destroy_workqueue(nicvf_rx_mode_wq);
-		nicvf_rx_mode_wq = NULL;
-	}
 	pci_unregister_driver(&nicvf_driver);
 }
 
-- 
2.17.2

^ permalink raw reply related

* [PATCH 1/8] net: thunderx: correct typo in macro name
From: Vadim Lomovtsev @ 2019-02-06 10:13 UTC (permalink / raw)
  To: sgoutham@cavium.com, rric@kernel.org, davem@davemloft.net,
	linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: dnelson@redhat.com, Vadim Lomovtsev
In-Reply-To: <20190206101351.16744-1-vlomovtsev@marvell.com>

Correct STREERING to STEERING at macro name for BGX steering register.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
---
 drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 2 +-
 drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index e337da6ba2a4..673c57b8023f 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -1217,7 +1217,7 @@ static void bgx_init_hw(struct bgx *bgx)
 
 	/* Disable MAC steering (NCSI traffic) */
 	for (i = 0; i < RX_TRAFFIC_STEER_RULE_COUNT; i++)
-		bgx_reg_write(bgx, 0, BGX_CMR_RX_STREERING + (i * 8), 0x00);
+		bgx_reg_write(bgx, 0, BGX_CMR_RX_STEERING + (i * 8), 0x00);
 }
 
 static u8 bgx_get_lane2sds_cfg(struct bgx *bgx, struct lmac *lmac)
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
index cbdd20b9ee6f..5cbc54e9eb19 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
@@ -60,7 +60,7 @@
 #define  RX_DMACX_CAM_EN			BIT_ULL(48)
 #define  RX_DMACX_CAM_LMACID(x)			(((u64)x) << 49)
 #define  RX_DMAC_COUNT				32
-#define BGX_CMR_RX_STREERING		0x300
+#define BGX_CMR_RX_STEERING		0x300
 #define  RX_TRAFFIC_STEER_RULE_COUNT		8
 #define BGX_CMR_CHAN_MSK_AND		0x450
 #define BGX_CMR_BIST_STATUS		0x460
-- 
2.17.2

^ permalink raw reply related

* [PATCH] net: sxgbe: fix unintended sign extension
From: Colin King @ 2019-02-06 10:25 UTC (permalink / raw)
  To: Byungho An, Girish K S, Vipul Pandya, David S . Miller, netdev
  Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

Shifting a u8 by 24 will cause the value to be promoted to an integer. If
the top bit of the u8 is set then the following conversion to an unsigned
long will sign extend the value causing the upper 32 bits to be set in
the result.

Fix this by casting the u8 value to an unsigned long before the shift.

Detected by CoverityScan, CID#1195586 ("Unintended sign extension")

Fixes: 1edb9ca69e8a ("net: sxgbe: add basic framework for Samsung 10Gb ethernet driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c b/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
index 6d22dd500790..51fe161e5da9 100644
--- a/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
+++ b/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
@@ -1822,7 +1822,8 @@ static void sxgbe_set_umac_addr(void __iomem *ioaddr, unsigned char *addr,
 	 * is RO.
 	 */
 	writel(data | SXGBE_HI_REG_AE, ioaddr + SXGBE_ADDR_HIGH(reg_n));
-	data = (addr[3] << 24) | (addr[2] << 16) | (addr[1] << 8) | addr[0];
+	data = ((unsigned long)addr[3] << 24) | (addr[2] << 16) |
+		(addr[1] << 8) | addr[0];
 	writel(data, ioaddr + SXGBE_ADDR_LOW(reg_n));
 }
 
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH v2 3/6] ethtool: introduce new ioctl for per-queue settings
From: Michal Kubecek @ 2019-02-06 10:32 UTC (permalink / raw)
  To: netdev; +Cc: Jeff Kirsher, linville, Nicholas Nunley, nhorman, sassmann
In-Reply-To: <20190206000106.24364-3-jeffrey.t.kirsher@intel.com>

On Tue, Feb 05, 2019 at 04:01:03PM -0800, Jeff Kirsher wrote:
> +static int find_max_num_queues(struct cmd_context *ctx)
> +{
> +	struct ethtool_channels echannels;
> +
> +	echannels.cmd = ETHTOOL_GCHANNELS;
> +	if (send_ioctl(ctx, &echannels))
> +		return -1;
> +
> +	return MAX(MAX(echannels.rx_count, echannels.tx_count),
> +		   echannels.combined_count);
> +}

Is the outer MAX() correct here? From the documentation to -L option, it
rather seems we might want

	return MAX(echannels.rx_count, echannels.tx_count) +
	       echannels.combined_count;

But I can't find any NIC around which would have non-zero rx_count or
tx_count so that I cannot check.

Michal Kubecek

^ permalink raw reply

* Re: stmmac / meson8b-dwmac
From: Emiliano Ingrassia @ 2019-02-06 10:36 UTC (permalink / raw)
  To: Martin Blumenstingl, Simon Huelck
  Cc: Gpeppe.cavallaro, alexandre.torgue, linux-amlogic, netdev
In-Reply-To: <CAFBinCAkXN42piiYsc_Qmuqv0w4dbK86gq10AbZyrB-zKuh9hg@mail.gmail.com>

Hi Martin, Hi Simon,

On Mon, Feb 04, 2019 at 03:34:41PM +0100, Martin Blumenstingl wrote:
> On Thu, Jan 17, 2019 at 10:23 PM Simon Huelck <simonmail@gmx.de> wrote:
> [...]
> > >> I got problems with my ODROID c2 running on 4.19.16 ( and some releases
> > >> earlier ). the stmmac / dwmac driver doesnt provide the 800M/900M
> > >> performance that i was used to earlier.
> > >>

Simon, did you ever reach 1 Gbps full duplex speed?
If yes, what was the kernel version did you use?

> > >>
> > >> Now im stuck near 550M/600M in the same environment. but what really
> > >> confuses me that duplex does hurt even more.
> > > interesting that you see this on the Odroid-C2 as well.
> > > previously I have only observed it on an Odroid-C1
> > >
> > >> PC --- VLAN3 --> switch --VLAN3--> ODROID
> > >>
> > >> NAS <-- VLAN1 -- switch <-- VLAN1-- ODROID
> > >>
> > >>
> > >> this means when im doing a iperf from PC to NAS, that my ODROID has load
> > >> on RX/TX same time (duplex). this shouldnt be an issue , all is 1GBits
> > >> FD. And in the past that wasnt an issue.
> +Cc Emiliano who has seen a similar duplex issue on his Odroid-C1: [0]
> (please note that all kernels prior to v5.1 with the pending patches
> from [1] applied are only receiving data on RXD0 and RXD1 but not on
> RXD2 and RXD3)
>
> Emiliano, can you confirm the duplex issue observed by Simon is
> similar to the one you see on your Odroid-C1?
>

It could be but, if I understand correctly, Simon is limited in
speed also in half duplex transmission (~550/600 Mbps), while we can
reach at least 900 Mbps.

> > >>
> > >>
> > >> Now what happens:
> > >>
> > >> - benchmark between PC - ODROID is roughly 550M
> > >>
> > >> - benchmark between NAS - ODROID is roughly 550M
> > >>
> > >> - benchmark between PC - NAS is only around 300M
> > >>
> > >>
> > >> and like i said i was easliy able to hit 800 or even 900M to my NAS
> > >> earlier. I applied some .dtb fixes for interrupt levels for the
> > >> meson-gx.dtsi and meson-gxbb-odroid-c2.dtb, which will be mainlined ,
> > >> but the effect stayed identical.
> > > good that you have the interrupt patches already applied
> > > I believe it don't fix any performance issues - it's a fix for the
> > > Ethernet controller seemingly getting "stuck" (not processing data
> > > anymore). however, that already rules out one potential issue
> > >
> > >> are you aware of this problem ? Earlier kernel versions were all
> > >> perfectly fine and i stepped ( self compiled) kernel through all major
> > >> releases since odroid c2 was mainlined.
> Guiseppe, Alexandre: what kind of data do you need from us if we see
> the speeds drop (in both directions) when we send and receive at the
> same time?
>
> [...]
> > the problem is that i dont have these kernel sources anymore :-(. but i
> > can provide some testing and numbers. maybe i dig if i got these kernel
> > configs somewhere around but i did not change much during migrating
> do you remember the kernel version where it worked fine?
>
> > im using a zyxel gs1900-8 switch and a qnap ts231p , and as i said i
> > didnt change my setup. i was able to hit 100MByte/s from my NAS , so
> > close to the benchmarks of 900MBit/s
> I typically only do small transfers or I have traffic only in one direction.
> thus it's likely that I missed this in my own tests
>
>
> Regards
> Martin
>
>
> [0] http://lists.infradead.org/pipermail/linux-amlogic/2018-December/009679.html
> [1] https://patchwork.kernel.org/cover/10744905/

Regards,

Emiliano

^ permalink raw reply

* [PATCH] net: sfp: do not probe SFP module before we're attached
From: Russell King @ 2019-02-06 10:52 UTC (permalink / raw)
  To: netdev; +Cc: Andrew Lunn, Florian Fainelli, Heiner Kallweit, David S. Miller

When we probe a SFP module, we expect to be able to call the upstream
device's module_insert() function so that the upstream link can be
configured.  However, when the upstream device is delayed, we currently
may end up probing the module before the upstream device is available,
and lose the module_insert() call.

Avoid this by holding off probing the module until the SFP bus is
properly connected to both the SFP socket driver and the upstream
driver.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp-bus.c |  2 ++
 drivers/net/phy/sfp.c     | 30 +++++++++++++++++++++---------
 drivers/net/phy/sfp.h     |  2 ++
 3 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c
index ad9db652874d..fef701bfad62 100644
--- a/drivers/net/phy/sfp-bus.c
+++ b/drivers/net/phy/sfp-bus.c
@@ -347,6 +347,7 @@ static int sfp_register_bus(struct sfp_bus *bus)
 				return ret;
 		}
 	}
+	bus->socket_ops->attach(bus->sfp);
 	if (bus->started)
 		bus->socket_ops->start(bus->sfp);
 	bus->netdev->sfp_bus = bus;
@@ -362,6 +363,7 @@ static void sfp_unregister_bus(struct sfp_bus *bus)
 	if (bus->registered) {
 		if (bus->started)
 			bus->socket_ops->stop(bus->sfp);
+		bus->socket_ops->detach(bus->sfp);
 		if (bus->phydev && ops && ops->disconnect_phy)
 			ops->disconnect_phy(bus->upstream);
 	}
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index fd8bb998ae52..68c8fbf099f8 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -184,6 +184,7 @@ struct sfp {
 
 	struct gpio_desc *gpio[GPIO_MAX];
 
+	bool attached;
 	unsigned int state;
 	struct delayed_work poll;
 	struct delayed_work timeout;
@@ -1475,7 +1476,7 @@ static void sfp_sm_event(struct sfp *sfp, unsigned int event)
 	 */
 	switch (sfp->sm_mod_state) {
 	default:
-		if (event == SFP_E_INSERT) {
+		if (event == SFP_E_INSERT && sfp->attached) {
 			sfp_module_tx_disable(sfp);
 			sfp_sm_ins_next(sfp, SFP_MOD_PROBE, T_PROBE_INIT);
 		}
@@ -1607,6 +1608,19 @@ static void sfp_sm_event(struct sfp *sfp, unsigned int event)
 	mutex_unlock(&sfp->sm_mutex);
 }
 
+static void sfp_attach(struct sfp *sfp)
+{
+	sfp->attached = true;
+	if (sfp->state & SFP_F_PRESENT)
+		sfp_sm_event(sfp, SFP_E_INSERT);
+}
+
+static void sfp_detach(struct sfp *sfp)
+{
+	sfp->attached = false;
+	sfp_sm_event(sfp, SFP_E_REMOVE);
+}
+
 static void sfp_start(struct sfp *sfp)
 {
 	sfp_sm_event(sfp, SFP_E_DEV_UP);
@@ -1667,6 +1681,8 @@ static int sfp_module_eeprom(struct sfp *sfp, struct ethtool_eeprom *ee,
 }
 
 static const struct sfp_socket_ops sfp_module_ops = {
+	.attach = sfp_attach,
+	.detach = sfp_detach,
 	.start = sfp_start,
 	.stop = sfp_stop,
 	.module_info = sfp_module_info,
@@ -1834,10 +1850,6 @@ static int sfp_probe(struct platform_device *pdev)
 	dev_info(sfp->dev, "Host maximum power %u.%uW\n",
 		 sfp->max_power_mW / 1000, (sfp->max_power_mW / 100) % 10);
 
-	sfp->sfp_bus = sfp_register_socket(sfp->dev, sfp, &sfp_module_ops);
-	if (!sfp->sfp_bus)
-		return -ENOMEM;
-
 	/* Get the initial state, and always signal TX disable,
 	 * since the network interface will not be up.
 	 */
@@ -1848,10 +1860,6 @@ static int sfp_probe(struct platform_device *pdev)
 		sfp->state |= SFP_F_RATE_SELECT;
 	sfp_set_state(sfp, sfp->state);
 	sfp_module_tx_disable(sfp);
-	rtnl_lock();
-	if (sfp->state & SFP_F_PRESENT)
-		sfp_sm_event(sfp, SFP_E_INSERT);
-	rtnl_unlock();
 
 	for (i = 0; i < GPIO_MAX; i++) {
 		if (gpio_flags[i] != GPIOD_IN || !sfp->gpio[i])
@@ -1884,6 +1892,10 @@ static int sfp_probe(struct platform_device *pdev)
 		dev_warn(sfp->dev,
 			 "No tx_disable pin: SFP modules will always be emitting.\n");
 
+	sfp->sfp_bus = sfp_register_socket(sfp->dev, sfp, &sfp_module_ops);
+	if (!sfp->sfp_bus)
+		return -ENOMEM;
+
 	return 0;
 }
 
diff --git a/drivers/net/phy/sfp.h b/drivers/net/phy/sfp.h
index 31b0acf337e2..64f54b0bbd8c 100644
--- a/drivers/net/phy/sfp.h
+++ b/drivers/net/phy/sfp.h
@@ -7,6 +7,8 @@
 struct sfp;
 
 struct sfp_socket_ops {
+	void (*attach)(struct sfp *sfp);
+	void (*detach)(struct sfp *sfp);
 	void (*start)(struct sfp *sfp);
 	void (*stop)(struct sfp *sfp);
 	int (*module_info)(struct sfp *sfp, struct ethtool_modinfo *modinfo);
-- 
2.7.4


^ permalink raw reply related

* [PATCH] MAINTAINERS: add maintainer for SFF/SFP/SFP+ support
From: Russell King @ 2019-02-06 10:54 UTC (permalink / raw)
  To: netdev; +Cc: Andrew Lunn, David S. Miller, Florian Fainelli, Heiner Kallweit

Add maintainer entry for SFF/SFP/SFP+ support.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 MAINTAINERS | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index f3a5c97e3419..f10b60dcb0bb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13510,6 +13510,15 @@ L:	netdev@vger.kernel.org
 S:	Supported
 F:	drivers/net/ethernet/sfc/
 
+SFF/SFP/SFP+ MODULE SUPPORT
+M:	Russell King <linux@armlinux.org.uk>
+L:	netdev@vger.kernel.org
+S:	Maintained
+F:	drivers/net/phy/phylink.c
+F:	drivers/net/phy/sfp*
+F:	include/linux/phylink.h
+F:	include/linux/sfp.h
+
 SGI GRU DRIVER
 M:	Dimitri Sivanich <sivanich@sgi.com>
 S:	Maintained
-- 
2.7.4


^ permalink raw reply related

* Re: [PATCH iproute2-next] Introduce ip-brctl shell script
From: Stefano Brivio @ 2019-02-06 10:55 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Roopa Prabhu, Nikolay Aleksandrov, Phil Sutter, Eric Garver,
	Tomas Dolezal, Lennert Buytenhek, netdev
In-Reply-To: <20190205145033.5d90dc42@hermes.lan>

On Tue, 5 Feb 2019 14:50:33 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:

> Providing brctl or ifconfig scripts is possible, but it really should
> be put in a sample or demo directory and not installed by default.
> It would just cause too much pain to distributions.

On one hand, I did it this way exactly to make it easy for
distributions (after all, I work for a distributor). The rationale is
that it's easier to get rid of things from package scripts rather than
add them. Also implementing a Debian package diversion looked more
elegant this way. No idea about other ones though.

On the other hand, I didn't even think of adding that to examples/.
Maybe that would alleviate David's concern about having to maintain it
forever (if it breaks temporarily, or if we need to remove it for some
reason, it's not a drama).

> I love concise human readable output and hate long winded VMS style
> commands.

That's exactly where I feel the current tools included in iproute2 fall
rather short. Compare "brctl show" to the closest equivalent "IFS='
'
for b in $(ip -br link show type bridge); do ip -br link show type
bridge_slave master ${b%% *}; done".

Sure, as Roopa said, we could and should improve 'bridge', and by now
I'm even almost convinced it's doable without breaking the existing
syntax, but it's not something we're doing overnight.

-- 
Stefano

^ permalink raw reply

* Re: [PATCH RFC RFT net-next 00/10] Modernize mv88e6060 and remove legacy probe
From: Gregory CLEMENT @ 2019-02-06 11:31 UTC (permalink / raw)
  To: Andrew Lunn, netdev
  Cc: Vivien Didelot, Florian Fainelli, Pavel Machek, Andrew Lunn
In-Reply-To: <20190130003758.23852-1-andrew@lunn.ch>

Hi Andrew,
 
 On mer., janv. 30 2019, Andrew Lunn <andrew@lunn.ch> wrote:

> The mv88e6060 is the last device using the legacy method of probing an
> DSA Ethernet switch. This patchset applies some cleanups to the
> driver, and then adds support for probing the device as an MDIO bus
> device. The legacy probe is then removed from the driver, and then
> from DSA as a whole.
>
> This is compile tested only. Comment and testing welcome.
>
> It should not be merged yet, and one of the patches should go via
> arm-soc.
>
> Andrew Lunn (10):
>   net: dsa: mv88e6xxx: Remove legacy probe support
>   net: dsa: mv88e6060: Replace ds with priv
>   net: dsa: mv88e6060: Replace REG_WRITE macro
>   net: dsa: mv88e6060: Replace REG_READ macro
>   net: dsa: mv88e6060: Support probing as an mdio device
>   net: dsa: mv88e6060: Remove support for legacy probing
>   net: dsa: mv88e6060: Add SPDX header
>   net: dsa: Remove legacy probing support
>   arch: arm: dts: Remove disabled marvell,dsa properties
>   bt-bindings: net: DSA: Remove legacy binding

I saw there was some test already done on this series, but unless I am
wrong it was not applied yet.

Do you plan to send a new iteration soon?

I am asking because this week the dt part can be applied to be merged in
v5.1, whereas next week, it will be doable but as rc6 will be released
there will be less chance to be accepted.

Thanks,

Gregory


>
>  .../devicetree/bindings/net/dsa/dsa.txt       | 155 ----
>  arch/arm/boot/dts/armada-370-rd.dts           |  42 -
>  arch/arm/boot/dts/armada-388-clearfog.dts     |  58 --
>  arch/arm/boot/dts/armada-xp-linksys-mamba.dts |  47 --
>  arch/arm/boot/dts/kirkwood-dir665.dts         |  47 --
>  arch/arm/boot/dts/kirkwood-linksys-viper.dts  |  47 --
>  .../arm/boot/dts/kirkwood-mv88f6281gtw-ge.dts |  47 --
>  arch/arm/boot/dts/kirkwood-rd88f6281.dtsi     |  41 -
>  drivers/net/dsa/Kconfig                       |   2 +-
>  drivers/net/dsa/mv88e6060.c                   | 217 ++---
>  drivers/net/dsa/mv88e6060.h                   |   1 +
>  drivers/net/dsa/mv88e6xxx/chip.c              |  71 +-
>  include/net/dsa.h                             |  23 -
>  net/dsa/Kconfig                               |   9 -
>  net/dsa/Makefile                              |   1 -
>  net/dsa/dsa.c                                 |   5 -
>  net/dsa/dsa_priv.h                            |  12 -
>  net/dsa/legacy.c                              | 745 ------------------
>  18 files changed, 121 insertions(+), 1449 deletions(-)
>
> -- 
> 2.20.1
>

-- 
Gregory Clement, Bootlin
Embedded Linux and Kernel engineering
http://bootlin.com

^ permalink raw reply

* [PATCH net-next 0/6] Add comphy support for Armada 38x
From: Russell King - ARM Linux admin @ 2019-02-06 11:34 UTC (permalink / raw)
  To: Andrew Lunn, Gregory Clement, Jason Cooper,
	Kishon Vijay Abraham I, Sebastian Hesselbarth, Thomas Petazzoni
  Cc: David S. Miller, devicetree, linux-arm-kernel, Mark Rutland,
	netdev, Rob Herring

Hi,

This series adds support for the comphy for Armada 38x, which allows
these SoCs to use 2500BASE-X mode with appropriate SFP modules.

Tested on SolidRun Clearfog.

 .../bindings/net/marvell-armada-370-neta.txt       |   2 +-
 .../bindings/phy/phy-armada38x-comphy.txt          |  40 ++++
 arch/arm/boot/dts/armada-388-clearfog.dtsi         |   2 +
 arch/arm/boot/dts/armada-38x.dtsi                  |  37 ++++
 drivers/net/ethernet/marvell/mvneta.c              |  57 ++++-
 drivers/phy/marvell/Kconfig                        |  10 +
 drivers/phy/marvell/Makefile                       |   1 +
 drivers/phy/marvell/phy-armada38x-comphy.c         | 232 +++++++++++++++++++++
 8 files changed, 376 insertions(+), 5 deletions(-)

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply

* [PATCH net-next 1/6] dt-bindings: phy: Armada 38x common phy bindings
From: Russell King @ 2019-02-06 11:34 UTC (permalink / raw)
  To: Andrew Lunn, Gregory Clement, Jason Cooper,
	Kishon Vijay Abraham I, Sebastian Hesselbarth, Thomas Petazzoni
  Cc: devicetree, linux-arm-kernel, netdev, Rob Herring, Mark Rutland
In-Reply-To: <20190206113409.ko7lwuv5kb7l7rjo@shell.armlinux.org.uk>

Add the Marvell Armada 38x common phy bindings.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 .../bindings/phy/phy-armada38x-comphy.txt          | 40 ++++++++++++++++++++++
 1 file changed, 40 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/phy/phy-armada38x-comphy.txt

diff --git a/Documentation/devicetree/bindings/phy/phy-armada38x-comphy.txt b/Documentation/devicetree/bindings/phy/phy-armada38x-comphy.txt
new file mode 100644
index 000000000000..ad49e5c01334
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/phy-armada38x-comphy.txt
@@ -0,0 +1,40 @@
+mvebu armada 38x comphy driver
+------------------------------
+
+This comphy controller can be found on Marvell Armada 38x. It provides a
+number of shared PHYs used by various interfaces (network, sata, usb,
+PCIe...).
+
+Required properties:
+
+- compatible: should be "marvell,armada-380-comphy"
+- reg: should contain the comphy register location and length.
+- #address-cells: should be 1.
+- #size-cells: should be 0.
+
+A sub-node is required for each comphy lane provided by the comphy.
+
+Required properties (child nodes):
+
+- reg: comphy lane number.
+- #phy-cells : from the generic phy bindings, must be 1. Defines the
+               input port to use for a given comphy lane.
+
+Example:
+
+	comphy: phy@18300 {
+		compatible = "marvell,armada-380-comphy";
+		reg = <0x18300 0x100>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		cpm_comphy0: phy@0 {
+			reg = <0>;
+			#phy-cells = <1>;
+		};
+
+		cpm_comphy1: phy@1 {
+			reg = <1>;
+			#phy-cells = <1>;
+		};
+	};
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 2/6] phy: armada38x: add common phy support
From: Russell King @ 2019-02-06 11:34 UTC (permalink / raw)
  To: Andrew Lunn, Gregory Clement, Jason Cooper,
	Kishon Vijay Abraham I, Sebastian Hesselbarth, Thomas Petazzoni
  Cc: devicetree, linux-arm-kernel, netdev
In-Reply-To: <20190206113409.ko7lwuv5kb7l7rjo@shell.armlinux.org.uk>

Add support for the Armada 38x common phy to allow us to change the
speed of the Ethernet serdes lane.  This driver only supports
manipulation of the speed, it does not support configuration of the
common phy.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/phy/marvell/Kconfig                |  10 ++
 drivers/phy/marvell/Makefile               |   1 +
 drivers/phy/marvell/phy-armada38x-comphy.c | 232 +++++++++++++++++++++++++++++
 3 files changed, 243 insertions(+)
 create mode 100644 drivers/phy/marvell/phy-armada38x-comphy.c

diff --git a/drivers/phy/marvell/Kconfig b/drivers/phy/marvell/Kconfig
index 6fb4b56e4c14..224ea4e6a46d 100644
--- a/drivers/phy/marvell/Kconfig
+++ b/drivers/phy/marvell/Kconfig
@@ -21,6 +21,16 @@ config PHY_BERLIN_USB
 	help
 	  Enable this to support the USB PHY on Marvell Berlin SoCs.
 
+config PHY_MVEBU_A38X_COMPHY
+	tristate "Marvell Armada 38x comphy driver"
+	depends on ARCH_MVEBU || COMPILE_TEST
+	depends on OF
+	select GENERIC_PHY
+	help
+	  This driver allows to control the comphy, an hardware block providing
+	  shared serdes PHYs on Marvell Armada 38x. Its serdes lanes can be
+	  used by various controllers (Ethernet, sata, usb, PCIe...).
+
 config PHY_MVEBU_CP110_COMPHY
 	tristate "Marvell CP110 comphy driver"
 	depends on ARCH_MVEBU || COMPILE_TEST
diff --git a/drivers/phy/marvell/Makefile b/drivers/phy/marvell/Makefile
index 3975b144f8ec..59b6c03ef756 100644
--- a/drivers/phy/marvell/Makefile
+++ b/drivers/phy/marvell/Makefile
@@ -2,6 +2,7 @@
 obj-$(CONFIG_ARMADA375_USBCLUSTER_PHY)	+= phy-armada375-usb2.o
 obj-$(CONFIG_PHY_BERLIN_SATA)		+= phy-berlin-sata.o
 obj-$(CONFIG_PHY_BERLIN_USB)		+= phy-berlin-usb.o
+obj-$(CONFIG_PHY_MVEBU_A38X_COMPHY)	+= phy-armada38x-comphy.o
 obj-$(CONFIG_PHY_MVEBU_CP110_COMPHY)	+= phy-mvebu-cp110-comphy.o
 obj-$(CONFIG_PHY_MVEBU_SATA)		+= phy-mvebu-sata.o
 obj-$(CONFIG_PHY_PXA_28NM_HSIC)		+= phy-pxa-28nm-hsic.o
diff --git a/drivers/phy/marvell/phy-armada38x-comphy.c b/drivers/phy/marvell/phy-armada38x-comphy.c
new file mode 100644
index 000000000000..f921ff448cfb
--- /dev/null
+++ b/drivers/phy/marvell/phy-armada38x-comphy.c
@@ -0,0 +1,232 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2018 Russell King, Deep Blue Solutions Ltd.
+ *
+ * Partly derived from CP110 comphy driver by Antoine Tenart
+ * <antoine.tenart@bootlin.com>
+ */
+#include <linux/delay.h>
+#include <linux/iopoll.h>
+#include <linux/module.h>
+#include <linux/phy/phy.h>
+#include <linux/platform_device.h>
+
+#define MAX_A38X_COMPHY	6
+#define MAX_A38X_PORTS	3
+
+#define COMPHY_CFG1		0x00
+#define  COMPHY_CFG1_GEN_TX(x)		((x) << 26)
+#define  COMPHY_CFG1_GEN_TX_MSK		COMPHY_CFG1_GEN_TX(15)
+#define  COMPHY_CFG1_GEN_RX(x)		((x) << 22)
+#define  COMPHY_CFG1_GEN_RX_MSK		COMPHY_CFG1_GEN_RX(15)
+#define  GEN_SGMII_1_25GBPS		6
+#define  GEN_SGMII_3_125GBPS		8
+
+#define COMPHY_STAT1		0x18
+#define  COMPHY_STAT1_PLL_RDY_TX	BIT(3)
+#define  COMPHY_STAT1_PLL_RDY_RX	BIT(2)
+
+#define COMPHY_SELECTOR		0xfc
+
+struct a38x_comphy;
+
+struct a38x_comphy_lane {
+	void __iomem *base;
+	struct a38x_comphy *priv;
+	unsigned int n;
+
+	int port;
+};
+
+struct a38x_comphy {
+	void __iomem *base;
+	struct device *dev;
+	struct a38x_comphy_lane lane[MAX_A38X_COMPHY];
+};
+
+static const u8 gbe_mux[MAX_A38X_COMPHY][MAX_A38X_PORTS] = {
+	{ 0, 0, 0 },
+	{ 4, 5, 0 },
+	{ 0, 4, 0 },
+	{ 0, 0, 4 },
+	{ 0, 3, 0 },
+	{ 0, 0, 3 },
+};
+
+static void a38x_comphy_set_reg(struct a38x_comphy_lane *lane,
+				unsigned int offset, u32 mask, u32 value)
+{
+	u32 val;
+
+	val = readl_relaxed(lane->base + offset) & ~mask;
+	writel(val | value, lane->base + offset);
+}
+
+static void a38x_comphy_set_speed(struct a38x_comphy_lane *lane,
+				  unsigned int gen_tx, unsigned int gen_rx)
+{
+	a38x_comphy_set_reg(lane, COMPHY_CFG1,
+			    COMPHY_CFG1_GEN_TX_MSK | COMPHY_CFG1_GEN_RX_MSK,
+			    COMPHY_CFG1_GEN_TX(gen_tx) |
+		            COMPHY_CFG1_GEN_RX(gen_rx));
+}
+
+static int a38x_comphy_poll(struct a38x_comphy_lane *lane,
+			    unsigned int offset, u32 mask, u32 value)
+{
+	u32 val;
+	int ret;
+
+	ret = readl_relaxed_poll_timeout_atomic(lane->base + offset, val,
+						(val & mask) == value,
+						10, 100);
+
+	if (ret)
+		dev_err(lane->priv->dev,
+			"comphy%u: timed out waiting for status\n", lane->n);
+
+	return ret;
+}
+
+/*
+ * We only support changing the speed for comphys configured for GBE.
+ * Since that is all we do, we only poll for PLL ready status.
+ */
+static int a38x_comphy_set_mode(struct phy *phy, enum phy_mode mode)
+{
+	struct a38x_comphy_lane *lane = phy_get_drvdata(phy);
+	unsigned int gen;
+
+	switch (mode) {
+	case PHY_MODE_SGMII:
+		gen = GEN_SGMII_1_25GBPS;
+		break;
+
+	case PHY_MODE_2500SGMII:
+		gen = GEN_SGMII_3_125GBPS;
+		break;
+
+	default:
+		return -EINVAL;
+	}
+
+	a38x_comphy_set_speed(lane, gen, gen);
+
+	return a38x_comphy_poll(lane, COMPHY_STAT1,
+				COMPHY_STAT1_PLL_RDY_TX |
+				COMPHY_STAT1_PLL_RDY_RX,
+				COMPHY_STAT1_PLL_RDY_TX |
+				COMPHY_STAT1_PLL_RDY_RX);
+}
+
+static const struct phy_ops a38x_comphy_ops = {
+	.set_mode	= a38x_comphy_set_mode,
+	.owner		= THIS_MODULE,
+};
+
+static struct phy *a38x_comphy_xlate(struct device *dev,
+				     struct of_phandle_args *args)
+{
+	struct a38x_comphy_lane *lane;
+	struct phy *phy;
+	u32 val;
+
+	if (WARN_ON(args->args[0] >= MAX_A38X_PORTS))
+		return ERR_PTR(-EINVAL);
+
+	phy = of_phy_simple_xlate(dev, args);
+	if (IS_ERR(phy))
+		return phy;
+
+	lane = phy_get_drvdata(phy);
+	if (lane->port >= 0)
+		return ERR_PTR(-EBUSY);
+
+	lane->port = args->args[0];
+
+	val = readl_relaxed(lane->priv->base + COMPHY_SELECTOR);
+	val = (val >> (4 * lane->n)) & 0xf;
+
+	if (!gbe_mux[lane->n][lane->port] ||
+	    val != gbe_mux[lane->n][lane->port]) {
+		dev_warn(lane->priv->dev,
+			 "comphy%u: not configured for GBE\n", lane->n);
+		phy = ERR_PTR(-EINVAL);
+	}
+
+	return phy;
+}
+
+static int a38x_comphy_probe(struct platform_device *pdev)
+{
+	struct a38x_comphy *priv;
+	struct phy_provider *provider;
+	struct device_node *child;
+	struct resource *res;
+	void __iomem *base;
+
+	priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(base))
+		return PTR_ERR(base);
+
+	priv->dev = &pdev->dev;
+	priv->base = base;
+
+	for_each_available_child_of_node(pdev->dev.of_node, child) {
+		struct phy *phy;
+		int ret;
+		u32 val;
+
+		ret = of_property_read_u32(child, "reg", &val);
+		if (ret < 0) {
+			dev_err(&pdev->dev, "missing 'reg' property (%d)\n",
+				ret);
+			continue;
+		}
+
+		if (val >= MAX_A38X_COMPHY || priv->lane[val].base) {
+			dev_err(&pdev->dev, "invalid 'reg' property\n");
+			continue;
+		}
+
+		phy = devm_phy_create(&pdev->dev, child, &a38x_comphy_ops);
+		if (IS_ERR(phy))
+			return PTR_ERR(phy);
+
+		priv->lane[val].base = base + 0x28 * val;
+		priv->lane[val].priv = priv;
+		priv->lane[val].n = val;
+		priv->lane[val].port = -1;
+		phy_set_drvdata(phy, &priv->lane[val]);
+	}
+
+	dev_set_drvdata(&pdev->dev, priv);
+
+	provider = devm_of_phy_provider_register(&pdev->dev, a38x_comphy_xlate);
+
+	return PTR_ERR_OR_ZERO(provider);
+}
+
+static const struct of_device_id a38x_comphy_of_match_table[] = {
+	{ .compatible = "marvell,armada-380-comphy" },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, a38x_comphy_of_match_table);
+
+static struct platform_driver a38x_comphy_driver = {
+	.probe	= a38x_comphy_probe,
+	.driver	= {
+		.name = "armada-38x-comphy",
+		.of_match_table = a38x_comphy_of_match_table,
+	},
+};
+module_platform_driver(a38x_comphy_driver);
+
+MODULE_AUTHOR("Russell King <rmk+kernel@armlinux.org.uk>");
+MODULE_DESCRIPTION("Common PHY driver for Armada 38x SoCs");
+MODULE_LICENSE("GPL v2");
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 3/6] ARM: dts: add description for Armada 38x common phy
From: Russell King @ 2019-02-06 11:34 UTC (permalink / raw)
  To: Andrew Lunn, Gregory Clement, Jason Cooper,
	Kishon Vijay Abraham I, Sebastian Hesselbarth, Thomas Petazzoni
  Cc: devicetree, linux-arm-kernel, netdev, Rob Herring, Mark Rutland
In-Reply-To: <20190206113409.ko7lwuv5kb7l7rjo@shell.armlinux.org.uk>

Add the DT description for the Armada 38x common phy.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 arch/arm/boot/dts/armada-38x.dtsi | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/arch/arm/boot/dts/armada-38x.dtsi b/arch/arm/boot/dts/armada-38x.dtsi
index 929459c42760..7b2e2bd6479b 100644
--- a/arch/arm/boot/dts/armada-38x.dtsi
+++ b/arch/arm/boot/dts/armada-38x.dtsi
@@ -335,6 +335,43 @@
 				#clock-cells = <1>;
 			};
 
+			comphy: phy@18300 {
+				compatible = "marvell,armada-380-comphy";
+				reg = <0x18300 0x100>;
+				#address-cells = <1>;
+				#size-cells = <0>;
+
+				comphy0: phy@0 {
+					reg = <0>;
+					#phy-cells = <1>;
+				};
+
+				comphy1: phy@1 {
+					reg = <1>;
+					#phy-cells = <1>;
+				};
+
+				comphy2: phy@2 {
+					reg = <2>;
+					#phy-cells = <1>;
+				};
+
+				comphy3: phy@3 {
+					reg = <3>;
+					#phy-cells = <1>;
+				};
+
+				comphy4: phy@4 {
+					reg = <4>;
+					#phy-cells = <1>;
+				};
+
+				comphy5: phy@5 {
+					reg = <5>;
+					#phy-cells = <1>;
+				};
+			};
+
 			coreclk: mvebu-sar@18600 {
 				compatible = "marvell,armada-380-core-clock";
 				reg = <0x18600 0x04>;
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 4/6] dt-bindings: net: mvneta: add phys property
From: Russell King @ 2019-02-06 11:35 UTC (permalink / raw)
  To: Andrew Lunn, Gregory Clement, Jason Cooper,
	Kishon Vijay Abraham I, Sebastian Hesselbarth, Thomas Petazzoni
  Cc: devicetree, linux-arm-kernel, netdev, David S. Miller,
	Rob Herring, Mark Rutland
In-Reply-To: <20190206113409.ko7lwuv5kb7l7rjo@shell.armlinux.org.uk>

Add an optional phys property to the mvneta binding documentation for
the common phy.

Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
index bedcfd5a52cd..691f886cfc4a 100644
--- a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
+++ b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
@@ -19,7 +19,7 @@
   "marvell,armada-370-neta" and 9800B for others.
 - clock-names: List of names corresponding to clocks property; shall be
   "core" for core clock and "bus" for the optional bus clock.
-
+- phys: comphy for the ethernet port, see ../phy/phy-bindings.txt
 
 Optional properties (valid only for Armada XP/38x):
 
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 5/6] net: marvell: neta: add comphy support
From: Russell King @ 2019-02-06 11:35 UTC (permalink / raw)
  To: Andrew Lunn, Gregory Clement, Jason Cooper,
	Kishon Vijay Abraham I, Sebastian Hesselbarth, Thomas Petazzoni
  Cc: devicetree, linux-arm-kernel, netdev, David S. Miller
In-Reply-To: <20190206113409.ko7lwuv5kb7l7rjo@shell.armlinux.org.uk>

Add support for the common phy binding, so that we can reconfigure the
comphy according to the desired ethernet speed.  This will allow us to
support 1000base-X and 2500base-X SFPs dynamically on SolidRun Clearfog.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/ethernet/marvell/mvneta.c | 57 ++++++++++++++++++++++++++++++++---
 1 file changed, 53 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 61b23497f836..86bdb0eb463a 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -27,6 +27,7 @@
 #include <linux/of_irq.h>
 #include <linux/of_mdio.h>
 #include <linux/of_net.h>
+#include <linux/phy/phy.h>
 #include <linux/phy.h>
 #include <linux/phylink.h>
 #include <linux/platform_device.h>
@@ -436,6 +437,7 @@ struct mvneta_port {
 	struct device_node *dn;
 	unsigned int tx_csum_limit;
 	struct phylink *phylink;
+	struct phy *comphy;
 
 	struct mvneta_bm *bm_priv;
 	struct mvneta_bm_pool *pool_long;
@@ -3151,6 +3153,8 @@ static void mvneta_start_dev(struct mvneta_port *pp)
 {
 	int cpu;
 
+	WARN_ON(phy_power_on(pp->comphy));
+
 	mvneta_max_rx_size_set(pp, pp->pkt_size);
 	mvneta_txq_max_tx_size_set(pp, pp->pkt_size);
 
@@ -3213,6 +3217,8 @@ static void mvneta_stop_dev(struct mvneta_port *pp)
 
 	mvneta_tx_reset(pp);
 	mvneta_rx_reset(pp);
+
+	WARN_ON(phy_power_off(pp->comphy));
 }
 
 static void mvneta_percpu_enable(void *arg)
@@ -3338,6 +3344,7 @@ static int mvneta_set_mac_addr(struct net_device *dev, void *addr)
 static void mvneta_validate(struct net_device *ndev, unsigned long *supported,
 			    struct phylink_link_state *state)
 {
+	struct mvneta_port *pp = netdev_priv(ndev);
 	__ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };
 
 	/* We only support QSGMII, SGMII, 802.3z and RGMII modes */
@@ -3358,8 +3365,13 @@ static void mvneta_validate(struct net_device *ndev, unsigned long *supported,
 	phylink_set(mask, Pause);
 
 	/* Half-duplex at speeds higher than 100Mbit is unsupported */
-	phylink_set(mask, 1000baseT_Full);
-	phylink_set(mask, 1000baseX_Full);
+	if (pp->comphy || state->interface != PHY_INTERFACE_MODE_2500BASEX) {
+		phylink_set(mask, 1000baseT_Full);
+		phylink_set(mask, 1000baseX_Full);
+	}
+	if (pp->comphy || state->interface == PHY_INTERFACE_MODE_2500BASEX) {
+		phylink_set(mask, 2500baseX_Full);
+	}
 
 	if (!phy_interface_mode_is_8023z(state->interface)) {
 		/* 10M and 100M are only supported in non-802.3z mode */
@@ -3373,6 +3385,11 @@ static void mvneta_validate(struct net_device *ndev, unsigned long *supported,
 		   __ETHTOOL_LINK_MODE_MASK_NBITS);
 	bitmap_and(state->advertising, state->advertising, mask,
 		   __ETHTOOL_LINK_MODE_MASK_NBITS);
+
+	/* We can only operate at 2500BaseX or 1000BaseX.  If requested
+	 * to advertise both, only report advertising at 2500BaseX.
+	 */
+	phylink_helper_basex_speed(state);
 }
 
 static int mvneta_mac_link_state(struct net_device *ndev,
@@ -3384,7 +3401,9 @@ static int mvneta_mac_link_state(struct net_device *ndev,
 	gmac_stat = mvreg_read(pp, MVNETA_GMAC_STATUS);
 
 	if (gmac_stat & MVNETA_GMAC_SPEED_1000)
-		state->speed = SPEED_1000;
+		state->speed =
+			state->interface == PHY_INTERFACE_MODE_2500BASEX ?
+			SPEED_2500 : SPEED_1000;
 	else if (gmac_stat & MVNETA_GMAC_SPEED_100)
 		state->speed = SPEED_100;
 	else
@@ -3499,12 +3518,32 @@ static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
 			    MVNETA_GMAC_FORCE_LINK_DOWN);
 	}
 
+
 	/* When at 2.5G, the link partner can send frames with shortened
 	 * preambles.
 	 */
 	if (state->speed == SPEED_2500)
 		new_ctrl4 |= MVNETA_GMAC4_SHORT_PREAMBLE_ENABLE;
 
+	if (pp->comphy) {
+		enum phy_mode mode = PHY_MODE_INVALID;
+
+		switch (state->interface) {
+		case PHY_INTERFACE_MODE_SGMII:
+		case PHY_INTERFACE_MODE_1000BASEX:
+			mode = PHY_MODE_SGMII;
+			break;
+		case PHY_INTERFACE_MODE_2500BASEX:
+			mode = PHY_MODE_2500SGMII;
+			break;
+		default:
+			break;
+		}
+
+		if (mode != PHY_MODE_INVALID)
+			WARN_ON(phy_set_mode(pp->comphy, mode));
+	}
+
 	if (new_ctrl0 != gmac_ctrl0)
 		mvreg_write(pp, MVNETA_GMAC_CTRL_0, new_ctrl0);
 	if (new_ctrl2 != gmac_ctrl2)
@@ -4405,7 +4444,7 @@ static int mvneta_port_power_up(struct mvneta_port *pp, int phy_mode)
 	if (phy_mode == PHY_INTERFACE_MODE_QSGMII)
 		mvreg_write(pp, MVNETA_SERDES_CFG, MVNETA_QSGMII_SERDES_PROTO);
 	else if (phy_mode == PHY_INTERFACE_MODE_SGMII ||
-		 phy_mode == PHY_INTERFACE_MODE_1000BASEX)
+		 phy_interface_mode_is_8023z(phy_mode))
 		mvreg_write(pp, MVNETA_SERDES_CFG, MVNETA_SGMII_SERDES_PROTO);
 	else if (!phy_interface_mode_is_rgmii(phy_mode))
 		return -EINVAL;
@@ -4422,6 +4461,7 @@ static int mvneta_probe(struct platform_device *pdev)
 	struct mvneta_port *pp;
 	struct net_device *dev;
 	struct phylink *phylink;
+	struct phy *comphy;
 	const char *dt_mac_addr;
 	char hw_mac_addr[ETH_ALEN];
 	const char *mac_from;
@@ -4447,6 +4487,14 @@ static int mvneta_probe(struct platform_device *pdev)
 		goto err_free_irq;
 	}
 
+	comphy = devm_of_phy_get(&pdev->dev, dn, NULL);
+	if (comphy == ERR_PTR(-EPROBE_DEFER)) {
+		err = -EPROBE_DEFER;
+		goto err_free_irq;
+	} else if (IS_ERR(comphy)) {
+		comphy = NULL;
+	}
+
 	phylink = phylink_create(dev, pdev->dev.fwnode, phy_mode,
 				 &mvneta_phylink_ops);
 	if (IS_ERR(phylink)) {
@@ -4463,6 +4511,7 @@ static int mvneta_probe(struct platform_device *pdev)
 	pp = netdev_priv(dev);
 	spin_lock_init(&pp->lock);
 	pp->phylink = phylink;
+	pp->comphy = comphy;
 	pp->phy_interface = phy_mode;
 	pp->dn = dn;
 
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 6/6] ARM: dts: clearfog: add comphy settings for Ethernet interfaces
From: Russell King @ 2019-02-06 11:35 UTC (permalink / raw)
  To: Andrew Lunn, Gregory Clement, Jason Cooper,
	Kishon Vijay Abraham I, Sebastian Hesselbarth, Thomas Petazzoni
  Cc: devicetree, linux-arm-kernel, netdev, Rob Herring, Mark Rutland
In-Reply-To: <20190206113409.ko7lwuv5kb7l7rjo@shell.armlinux.org.uk>

Add the comphy settings for the Ethernet interfaces.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 arch/arm/boot/dts/armada-388-clearfog.dtsi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/boot/dts/armada-388-clearfog.dtsi b/arch/arm/boot/dts/armada-388-clearfog.dtsi
index 1b0d0680c8b6..0d81600ca247 100644
--- a/arch/arm/boot/dts/armada-388-clearfog.dtsi
+++ b/arch/arm/boot/dts/armada-388-clearfog.dtsi
@@ -93,6 +93,7 @@
 	bm,pool-long = <2>;
 	bm,pool-short = <1>;
 	buffer-manager = <&bm>;
+	phys = <&comphy1 1>;
 	phy-mode = "sgmii";
 	status = "okay";
 };
@@ -103,6 +104,7 @@
 	bm,pool-short = <1>;
 	buffer-manager = <&bm>;
 	managed = "in-band-status";
+	phys = <&comphy5 2>;
 	phy-mode = "sgmii";
 	sfp = <&sfp>;
 	status = "okay";
-- 
2.7.4


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox