Netdev List
 help / color / mirror / Atom feed
* [PATCH net 0/5] l2tp: fix some l2tp_tunnel_find() issues in l2tp_netlink
From: Guillaume Nault @ 2017-08-25 14:51 UTC (permalink / raw)
  To: netdev; +Cc: James Chapman

Since l2tp_tunnel_find() doesn't take a reference on the tunnel it
returns, its users are almost guaranteed to be racy.

This series defines l2tp_tunnel_get() which can be used as a safe
replacement, and converts some of l2tp_tunnel_find() users in the
l2tp_netlink module.

Other users often combine this issue with other more or less subtle
races. They will be fixed incrementally in followup series.

Guillaume Nault (5):
  l2tp: hold tunnel while looking up sessions in l2tp_netlink
  l2tp: hold tunnel while processing genl delete command
  l2tp: hold tunnel while handling genl tunnel updates
  l2tp: hold tunnel while handling genl TUNNEL_GET commands
  l2tp: hold tunnel used while creating sessions with netlink

 net/l2tp/l2tp_core.c    | 66 ++++++++++++++++---------------------------------
 net/l2tp/l2tp_core.h    | 13 ++++++++++
 net/l2tp/l2tp_netlink.c | 66 +++++++++++++++++++++++++++++--------------------
 3 files changed, 73 insertions(+), 72 deletions(-)

-- 
2.14.1

^ permalink raw reply

* Re: [PATCH 6/7] netfilter: nat: make rhashtable_params const
From: Florian Westphal @ 2017-08-25 14:50 UTC (permalink / raw)
  To: Bhumika Goyal; +Cc: netfilter-devel, netdev
In-Reply-To: <1503670907-23221-7-git-send-email-bhumirks@gmail.com>

Bhumika Goyal <bhumirks@gmail.com> wrote:

[..]

trimming CC list, also this should probably have been
sent to netfilter-devel@vger.kernel.org on its own, with [nf-next] in
subject line.  That aside:

Acked-by: Florian Westphal <fw@strlen.de>


^ permalink raw reply

* Re: [PATCH 7/7] SUNRPC: make kvec const
From: Jeff Layton @ 2017-08-25 14:48 UTC (permalink / raw)
  To: Bhumika Goyal, julia.lawall, marcel, gustavo, johan.hedberg,
	davem, pablo, kadlec, fw, stephen, alex.aring, stefan, kuznet,
	yoshfuji, santosh.shilimkar, trond.myklebust, anna.schumaker,
	bfields, linux-bluetooth, netdev, linux-kernel, netfilter-devel,
	coreteam, bridge, linux-wpan, linux-rdma, rds-devel, linux-nfs
In-Reply-To: <1503670907-23221-8-git-send-email-bhumirks@gmail.com>

On Fri, 2017-08-25 at 19:51 +0530, Bhumika Goyal wrote:
> Make this const as it is only used during a copy operation.
> 
> Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
> ---
>  net/sunrpc/xdr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> index e34f4ee..a096025 100644
> --- a/net/sunrpc/xdr.c
> +++ b/net/sunrpc/xdr.c
> @@ -998,7 +998,7 @@ void xdr_enter_page(struct xdr_stream *xdr, unsigned int len)
>  }
>  EXPORT_SYMBOL_GPL(xdr_enter_page);
>  
> -static struct kvec empty_iov = {.iov_base = NULL, .iov_len = 0};
> +static const struct kvec empty_iov = {.iov_base = NULL, .iov_len = 0};
>  
>  void
>  xdr_buf_from_iov(struct kvec *iov, struct xdr_buf *buf)

Reviewed-by: Jeff Layton <jlayton@redhat.com>

^ permalink raw reply

* Re: [PATCH] net: stmmac: dwmac-sun8i: Use reset exclusive
From: Maxime Ripard @ 2017-08-25 14:48 UTC (permalink / raw)
  To: Corentin Labbe
  Cc: peppe.cavallaro, alexandre.torgue, wens, netdev, linux-arm-kernel,
	linux-kernel
In-Reply-To: <20170825143805.21733-1-clabbe.montjoie@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1462 bytes --]

On Fri, Aug 25, 2017 at 04:38:05PM +0200, Corentin Labbe wrote:
> The current dwmac_sun8i module cannot be rmmod/modprobe due to that
> the reset controller was not released when removed.
> 
> This patch remove ambiguity, by using of_reset_control_get_exclusive and
> add the missing reset_control_put().
> 
> Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
> ---
>  drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
> index fffd6d5fc907..675a09629d85 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
> @@ -782,6 +782,7 @@ static int sun8i_dwmac_unpower_internal_phy(struct sunxi_priv_data *gmac)
>  
>  	clk_disable_unprepare(gmac->ephy_clk);
>  	reset_control_assert(gmac->rst_ephy);
> +	reset_control_put(gmac->rst_ephy);
>  	return 0;
>  }
>  
> @@ -942,7 +943,7 @@ static int sun8i_dwmac_probe(struct platform_device *pdev)
>  			return -EINVAL;
>  		}
>  
> -		gmac->rst_ephy = of_reset_control_get(plat_dat->phy_node, NULL);
> +		gmac->rst_ephy = of_reset_control_get_exclusive(plat_dat->phy_node, NULL);

Why not just use devm_reset_control_get?

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply

* [PATCH net-next v2 14/14] arm64: defconfig: enable Marvell CP110 comphy
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Miquel Raynal, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, netdev, Antoine Tenart
In-Reply-To: <20170825144821.31129-1-antoine.tenart@free-electrons.com>

From: Miquel Raynal <miquel.raynal@free-electrons.com>

The comphy is an hardware block giving access to common PHYs that can be
used by various other engines (Network, SATA, ...). This is used on
Marvell 7k/8k platforms for now. Enable the corresponding driver.

Signed-off-by: Miquel Raynal <miquel.raynal@free-electrons.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index cdde4f56a281..e671e37b30af 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -509,6 +509,7 @@ CONFIG_PWM_TEGRA=m
 CONFIG_PHY_RCAR_GEN3_USB2=y
 CONFIG_PHY_HI6220_USB=y
 CONFIG_PHY_SUN4I_USB=y
+CONFIG_PHY_MVEBU_CP110_COMPHY=y
 CONFIG_PHY_ROCKCHIP_INNO_USB2=y
 CONFIG_PHY_ROCKCHIP_EMMC=y
 CONFIG_PHY_ROCKCHIP_PCIE=m
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next v2 13/14] arm64: dts: marvell: 7040-db: add comphy references to Ethernet ports
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Antoine Tenart, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, miquel.raynal, netdev
In-Reply-To: <20170825144821.31129-1-antoine.tenart@free-electrons.com>

This patch adds comphy phandles to the Ethernet ports in the 7040-db
device tree. The comphy is used to configure the serdes PHYs used by
these ports.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
---
 arch/arm64/boot/dts/marvell/armada-7040-db.dts | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/marvell/armada-7040-db.dts b/arch/arm64/boot/dts/marvell/armada-7040-db.dts
index 92c761c380d3..03d1c42d7c47 100644
--- a/arch/arm64/boot/dts/marvell/armada-7040-db.dts
+++ b/arch/arm64/boot/dts/marvell/armada-7040-db.dts
@@ -180,6 +180,7 @@
 	status = "okay";
 	phy = <&phy0>;
 	phy-mode = "sgmii";
+	phys = <&cpm_comphy0 1>;
 };
 
 &cpm_eth2 {
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next v2 12/14] arm64: dts: marvell: mcbin: add comphy references to Ethernet ports
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Antoine Tenart, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, miquel.raynal, netdev
In-Reply-To: <20170825144821.31129-1-antoine.tenart@free-electrons.com>

This patch adds comphy phandles to the Ethernet ports in the mcbin
device tree. The comphy is used to configure the serdes PHYs used by
these ports.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
---
 arch/arm64/boot/dts/marvell/armada-8040-mcbin.dts | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/boot/dts/marvell/armada-8040-mcbin.dts b/arch/arm64/boot/dts/marvell/armada-8040-mcbin.dts
index 9f0a00802452..970081ca197b 100644
--- a/arch/arm64/boot/dts/marvell/armada-8040-mcbin.dts
+++ b/arch/arm64/boot/dts/marvell/armada-8040-mcbin.dts
@@ -148,6 +148,7 @@
 &cpm_eth0 {
 	status = "okay";
 	phy = <&phy0>;
+	phys = <&cpm_comphy4 0>;
 	phy-mode = "10gbase-kr";
 };
 
@@ -181,6 +182,7 @@
 &cps_eth0 {
 	status = "okay";
 	phy = <&phy8>;
+	phys = <&cps_comphy4 0>;
 	phy-mode = "10gbase-kr";
 };
 
@@ -189,6 +191,7 @@
 	status = "okay";
 	phy = <&ge_phy>;
 	phy-mode = "sgmii";
+	phys = <&cps_comphy0 1>;
 };
 
 &cps_sata0 {
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next v2 11/14] arm64: dts: marvell: add comphy nodes on cp110 master and slave
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Antoine Tenart, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, miquel.raynal, netdev
In-Reply-To: <20170825144821.31129-1-antoine.tenart@free-electrons.com>

Now that the comphy driver is available, this patch adds the
corresponding nodes in the cp110 master and slave device trees.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
---
 .../boot/dts/marvell/armada-cp110-master.dtsi      | 38 ++++++++++++++++++++++
 .../arm64/boot/dts/marvell/armada-cp110-slave.dtsi | 38 ++++++++++++++++++++++
 2 files changed, 76 insertions(+)

diff --git a/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi
index 9b2581473183..f2a50552bad4 100644
--- a/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi
@@ -91,6 +91,44 @@
 				};
 			};
 
+			cpm_comphy: phy@120000 {
+				compatible = "marvell,comphy-cp110";
+				reg = <0x120000 0x6000>;
+				marvell,system-controller = <&cpm_syscon0>;
+				#address-cells = <1>;
+				#size-cells = <0>;
+
+				cpm_comphy0: phy@0 {
+					reg = <0>;
+					#phy-cells = <1>;
+				};
+
+				cpm_comphy1: phy@1 {
+					reg = <1>;
+					#phy-cells = <1>;
+				};
+
+				cpm_comphy2: phy@2 {
+					reg = <2>;
+					#phy-cells = <1>;
+				};
+
+				cpm_comphy3: phy@3 {
+					reg = <3>;
+					#phy-cells = <1>;
+				};
+
+				cpm_comphy4: phy@4 {
+					reg = <4>;
+					#phy-cells = <1>;
+				};
+
+				cpm_comphy5: phy@5 {
+					reg = <5>;
+					#phy-cells = <1>;
+				};
+			};
+
 			cpm_mdio: mdio@12a200 {
 				#address-cells = <1>;
 				#size-cells = <0>;
diff --git a/arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi
index d3902f218c46..bd7f7d0e6de9 100644
--- a/arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi
@@ -98,6 +98,44 @@
 				};
 			};
 
+			cps_comphy: phy@120000 {
+				compatible = "marvell,comphy-cp110";
+				reg = <0x120000 0x6000>;
+				marvell,system-controller = <&cps_syscon0>;
+				#address-cells = <1>;
+				#size-cells = <0>;
+
+				cps_comphy0: phy@0 {
+					reg = <0>;
+					#phy-cells = <1>;
+				};
+
+				cps_comphy1: phy@1 {
+					reg = <1>;
+					#phy-cells = <1>;
+				};
+
+				cps_comphy2: phy@2 {
+					reg = <2>;
+					#phy-cells = <1>;
+				};
+
+				cps_comphy3: phy@3 {
+					reg = <3>;
+					#phy-cells = <1>;
+				};
+
+				cps_comphy4: phy@4 {
+					reg = <4>;
+					#phy-cells = <1>;
+				};
+
+				cps_comphy5: phy@5 {
+					reg = <5>;
+					#phy-cells = <1>;
+				};
+			};
+
 			cps_mdio: mdio@12a200 {
 				#address-cells = <1>;
 				#size-cells = <0>;
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next v2 10/14] arm64: dts: marvell: extend the cp110 syscon register area length
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Antoine Tenart, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, miquel.raynal, netdev
In-Reply-To: <20170825144821.31129-1-antoine.tenart@free-electrons.com>

This patch extends on both cp110 the system register area length to
include some of the comphy registers as well.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
---
 arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi | 2 +-
 arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi
index 18299e164cb7..9b2581473183 100644
--- a/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi
@@ -118,7 +118,7 @@
 
 			cpm_syscon0: system-controller@440000 {
 				compatible = "syscon", "simple-mfd";
-				reg = <0x440000 0x1000>;
+				reg = <0x440000 0x2000>;
 
 				cpm_clk: clock {
 					compatible = "marvell,cp110-clock";
diff --git a/arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi
index 5ae8fa575859..d3902f218c46 100644
--- a/arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi
@@ -125,7 +125,7 @@
 
 			cps_syscon0: system-controller@440000 {
 				compatible = "syscon", "simple-mfd";
-				reg = <0x440000 0x1000>;
+				reg = <0x440000 0x2000>;
 
 				cps_clk: clock {
 					compatible = "marvell,cp110-clock";
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next v2 09/14] net: mvpp2: dynamic reconfiguration of the PHY mode
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Antoine Tenart, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, miquel.raynal, netdev
In-Reply-To: <20170825144821.31129-1-antoine.tenart@free-electrons.com>

This patch adds logic to reconfigure the comphy/gop when the link status
change at runtime. This is very useful on boards such as the mcbin which
have SFP and Ethernet ports connected to the same MAC port: depending on
what the user connects the driver will automatically reconfigure the
link mode.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
---
 drivers/net/ethernet/marvell/mvpp2.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index 49a6789a4142..04e0c8ab7b51 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -5740,6 +5740,7 @@ static void mvpp2_link_event(struct net_device *dev)
 {
 	struct mvpp2_port *port = netdev_priv(dev);
 	struct phy_device *phydev = dev->phydev;
+	bool link_reconfigured = false;
 
 	if (!netif_running(dev))
 		return;
@@ -5750,9 +5751,27 @@ static void mvpp2_link_event(struct net_device *dev)
 			port->duplex = phydev->duplex;
 			port->speed  = phydev->speed;
 		}
+
+		if (port->phy_interface != phydev->interface && port->comphy) {
+	                /* disable current port for reconfiguration */
+	                mvpp2_interrupts_disable(port);
+	                netif_carrier_off(port->dev);
+	                mvpp2_port_disable(port);
+			phy_power_off(port->comphy);
+
+	                /* comphy reconfiguration */
+	                port->phy_interface = phydev->interface;
+	                mvpp22_comphy_init(port);
+
+	                /* gop/mac reconfiguration */
+	                mvpp22_gop_init(port);
+	                mvpp2_port_mii_set(port);
+
+	                link_reconfigured = true;
+		}
 	}
 
-	if (phydev->link != port->link) {
+	if (phydev->link != port->link || link_reconfigured) {
 		port->link = phydev->link;
 
 		if (phydev->link) {
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next v2 08/14] net: mvpp2: check the netif is running in the link_event function
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Antoine Tenart, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, miquel.raynal, netdev
In-Reply-To: <20170825144821.31129-1-antoine.tenart@free-electrons.com>

This patch adds an extra check when the link_event function is called,
so that it won't do anything when the netif isn't running.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
---
 drivers/net/ethernet/marvell/mvpp2.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index 1b26f5ed994f..49a6789a4142 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -5741,6 +5741,9 @@ static void mvpp2_link_event(struct net_device *dev)
 	struct mvpp2_port *port = netdev_priv(dev);
 	struct phy_device *phydev = dev->phydev;
 
+	if (!netif_running(dev))
+		return;
+
 	if (phydev->link) {
 		if ((port->speed != phydev->speed) ||
 		    (port->duplex != phydev->duplex)) {
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next v2 07/14] net: mvpp2: improve the link management function
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Antoine Tenart, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, miquel.raynal, netdev
In-Reply-To: <20170825144821.31129-1-antoine.tenart@free-electrons.com>

When the link status changes, the phylib calls the link_event function
in the mvpp2 driver. Before this patch only the egress/ingress transmit
was enabled/disabled. This patch adds more functionality to the link
status management code by enabling/disabling the port per-cpu
interrupts, and the port itself. The queues are now stopped as well, and
the netif carrier helpers are called.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
---
 drivers/net/ethernet/marvell/mvpp2.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index a9dea9b344de..1b26f5ed994f 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -5753,14 +5753,24 @@ static void mvpp2_link_event(struct net_device *dev)
 		port->link = phydev->link;
 
 		if (phydev->link) {
+			mvpp2_interrupts_enable(port);
+			mvpp2_port_enable(port);
+
 			mvpp2_egress_enable(port);
 			mvpp2_ingress_enable(port);
+			netif_carrier_on(dev);
+			netif_tx_wake_all_queues(dev);
 		} else {
 			port->duplex = -1;
 			port->speed = 0;
 
+			netif_tx_stop_all_queues(dev);
+			netif_carrier_off(dev);
 			mvpp2_ingress_disable(port);
 			mvpp2_egress_disable(port);
+
+			mvpp2_port_disable(port);
+			mvpp2_interrupts_disable(port);
 		}
 
 		phy_print_status(phydev);
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next v2 06/14] net: mvpp2: simplify the link_event function
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Antoine Tenart, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, miquel.raynal, netdev
In-Reply-To: <20170825144821.31129-1-antoine.tenart@free-electrons.com>

The link_event function is somewhat complicated. This cosmetic patch
simplifies it.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
---
 drivers/net/ethernet/marvell/mvpp2.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index 498a4969dc58..a9dea9b344de 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -5740,7 +5740,6 @@ static void mvpp2_link_event(struct net_device *dev)
 {
 	struct mvpp2_port *port = netdev_priv(dev);
 	struct phy_device *phydev = dev->phydev;
-	int status_change = 0;
 
 	if (phydev->link) {
 		if ((port->speed != phydev->speed) ||
@@ -5751,23 +5750,19 @@ static void mvpp2_link_event(struct net_device *dev)
 	}
 
 	if (phydev->link != port->link) {
-		if (!phydev->link) {
-			port->duplex = -1;
-			port->speed = 0;
-		}
-
 		port->link = phydev->link;
-		status_change = 1;
-	}
 
-	if (status_change) {
 		if (phydev->link) {
 			mvpp2_egress_enable(port);
 			mvpp2_ingress_enable(port);
 		} else {
+			port->duplex = -1;
+			port->speed = 0;
+
 			mvpp2_ingress_disable(port);
 			mvpp2_egress_disable(port);
 		}
+
 		phy_print_status(phydev);
 	}
 }
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next v2 05/14] net: mvpp2: do not force the link mode
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Antoine Tenart, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, miquel.raynal, netdev
In-Reply-To: <20170825144821.31129-1-antoine.tenart@free-electrons.com>

The link mode (speed, duplex) was forced based on what the phylib
returns. This should not be the case, and only forced by ethtool
functions manually. This patch removes the link mode enforcement from
the phylib link_event callback.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
---
 drivers/net/ethernet/marvell/mvpp2.c | 24 ------------------------
 1 file changed, 24 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index fab231858a41..498a4969dc58 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -5741,30 +5741,10 @@ static void mvpp2_link_event(struct net_device *dev)
 	struct mvpp2_port *port = netdev_priv(dev);
 	struct phy_device *phydev = dev->phydev;
 	int status_change = 0;
-	u32 val;
 
 	if (phydev->link) {
 		if ((port->speed != phydev->speed) ||
 		    (port->duplex != phydev->duplex)) {
-			u32 val;
-
-			val = readl(port->base + MVPP2_GMAC_AUTONEG_CONFIG);
-			val &= ~(MVPP2_GMAC_CONFIG_MII_SPEED |
-				 MVPP2_GMAC_CONFIG_GMII_SPEED |
-				 MVPP2_GMAC_CONFIG_FULL_DUPLEX |
-				 MVPP2_GMAC_AN_SPEED_EN |
-				 MVPP2_GMAC_AN_DUPLEX_EN);
-
-			if (phydev->duplex)
-				val |= MVPP2_GMAC_CONFIG_FULL_DUPLEX;
-
-			if (phydev->speed == SPEED_1000)
-				val |= MVPP2_GMAC_CONFIG_GMII_SPEED;
-			else if (phydev->speed == SPEED_100)
-				val |= MVPP2_GMAC_CONFIG_MII_SPEED;
-
-			writel(val, port->base + MVPP2_GMAC_AUTONEG_CONFIG);
-
 			port->duplex = phydev->duplex;
 			port->speed  = phydev->speed;
 		}
@@ -5782,10 +5762,6 @@ static void mvpp2_link_event(struct net_device *dev)
 
 	if (status_change) {
 		if (phydev->link) {
-			val = readl(port->base + MVPP2_GMAC_AUTONEG_CONFIG);
-			val |= (MVPP2_GMAC_FORCE_LINK_PASS |
-				MVPP2_GMAC_FORCE_LINK_DOWN);
-			writel(val, port->base + MVPP2_GMAC_AUTONEG_CONFIG);
 			mvpp2_egress_enable(port);
 			mvpp2_ingress_enable(port);
 		} else {
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next v2 04/14] net: mvpp2: initialize the comphy
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Antoine Tenart, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, miquel.raynal, netdev
In-Reply-To: <20170825144821.31129-1-antoine.tenart@free-electrons.com>

On some platforms, the comphy is between the MAC GoP and the PHYs. The
mvpp2 driver currently relies on the firmware/bootloader to configure
the comphy. As a comphy driver was added to the generic PHY framework,
this patch uses it in the mvpp2 driver to configure the comphy at boot
time to avoid relying on the bootloader.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
---
 drivers/net/ethernet/marvell/mvpp2.c | 44 +++++++++++++++++++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index e312dfc3555b..fab231858a41 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -28,6 +28,7 @@
 #include <linux/of_address.h>
 #include <linux/of_device.h>
 #include <linux/phy.h>
+#include <linux/phy/phy.h>
 #include <linux/clk.h>
 #include <linux/hrtimer.h>
 #include <linux/ktime.h>
@@ -861,6 +862,7 @@ struct mvpp2_port {
 
 	phy_interface_t phy_interface;
 	struct device_node *phy_node;
+	struct phy *comphy;
 	unsigned int link;
 	unsigned int duplex;
 	unsigned int speed;
@@ -4420,6 +4422,32 @@ static int mvpp22_gop_init(struct mvpp2_port *port)
 	return -EINVAL;
 }
 
+static int mvpp22_comphy_init(struct mvpp2_port *port)
+{
+	enum phy_mode mode;
+	int ret;
+
+	if (!port->comphy)
+		return 0;
+
+	switch (port->phy_interface) {
+	case PHY_INTERFACE_MODE_SGMII:
+		mode = PHY_MODE_SGMII;
+		break;
+	case PHY_INTERFACE_MODE_10GKR:
+		mode = PHY_MODE_10GKR;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	ret = phy_set_mode(port->comphy, mode);
+	if (ret)
+		return ret;
+
+	return phy_power_on(port->comphy);
+}
+
 static void mvpp2_port_mii_gmac_configure_mode(struct mvpp2_port *port)
 {
 	u32 val;
@@ -6404,8 +6432,10 @@ static void mvpp2_start_dev(struct mvpp2_port *port)
 	/* Enable interrupts on all CPUs */
 	mvpp2_interrupts_enable(port);
 
-	if (port->priv->hw_version == MVPP22)
+	if (port->priv->hw_version == MVPP22) {
+		mvpp22_comphy_init(port);
 		mvpp22_gop_init(port);
+	}
 
 	mvpp2_port_mii_set(port);
 	mvpp2_port_enable(port);
@@ -6436,6 +6466,7 @@ static void mvpp2_stop_dev(struct mvpp2_port *port)
 	mvpp2_egress_disable(port);
 	mvpp2_port_disable(port);
 	phy_stop(ndev->phydev);
+	phy_power_off(port->comphy);
 }
 
 static int mvpp2_check_ringparam_valid(struct net_device *dev,
@@ -7270,6 +7301,7 @@ static int mvpp2_port_probe(struct platform_device *pdev,
 			    struct mvpp2 *priv)
 {
 	struct device_node *phy_node;
+	struct phy *comphy;
 	struct mvpp2_port *port;
 	struct mvpp2_port_pcpu *port_pcpu;
 	struct net_device *dev;
@@ -7311,6 +7343,15 @@ static int mvpp2_port_probe(struct platform_device *pdev,
 		goto err_free_netdev;
 	}
 
+	comphy = devm_of_phy_get(&pdev->dev, port_node, NULL);
+	if (IS_ERR(comphy)) {
+		if (PTR_ERR(comphy) == -EPROBE_DEFER) {
+			err = -EPROBE_DEFER;
+			goto err_free_netdev;
+		}
+		comphy = NULL;
+	}
+
 	if (of_property_read_u32(port_node, "port-id", &id)) {
 		err = -EINVAL;
 		dev_err(&pdev->dev, "missing port-id value\n");
@@ -7344,6 +7385,7 @@ static int mvpp2_port_probe(struct platform_device *pdev,
 
 	port->phy_node = phy_node;
 	port->phy_interface = phy_mode;
+	port->comphy = comphy;
 
 	if (priv->hw_version == MVPP21) {
 		res = platform_get_resource(pdev, IORESOURCE_MEM, 2 + id);
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next v2 03/14] Documentation/bindings: phy: document the Marvell comphy driver
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Antoine Tenart, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, miquel.raynal, netdev
In-Reply-To: <20170825144821.31129-1-antoine.tenart@free-electrons.com>

The Marvell Armada 7K/8K SoCs contains an hardware block called COMPHY
that provides a number of shared PHYs used by various interfaces in the
SoC: network, SATA, PCIe, etc. This Device Tree binding allows to
describe this COMPHY hardware block.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
---
 .../devicetree/bindings/phy/phy-mvebu-comphy.txt   | 43 ++++++++++++++++++++++
 1 file changed, 43 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/phy/phy-mvebu-comphy.txt

diff --git a/Documentation/devicetree/bindings/phy/phy-mvebu-comphy.txt b/Documentation/devicetree/bindings/phy/phy-mvebu-comphy.txt
new file mode 100644
index 000000000000..bfcf80341657
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/phy-mvebu-comphy.txt
@@ -0,0 +1,43 @@
+mvebu comphy driver
+-------------------
+
+A comphy controller can be found on Marvell Armada 7k/8k on the CP110. It
+provides a number of shared PHYs used by various interfaces (network, sata,
+usb, PCIe...).
+
+Required properties:
+
+- compatible: should be "marvell,comphy-cp110"
+- reg: should contain the comphy register location and length.
+- marvell,system-controller: should contain a phandle to the
+                             system controller node.
+- #address-cells: should be 1.
+- #size-cells: should be 0.
+
+A sub-node is required for each comphy lane provided by the comphy.
+
+Required properties (child nodes):
+
+- reg: comphy lane number.
+- #phy-cells : from the generic phy bindings, must be 1. Defines the
+               input port to use for a given comphy lane.
+
+Example:
+
+	cpm_comphy: phy@120000 {
+		compatible = "marvell,comphy-cp110";
+		reg = <0x120000 0x6000>;
+		marvell,system-controller = <&cpm_syscon0>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		cpm_comphy0: phy@0 {
+			reg = <0>;
+			#phy-cells = <1>;
+		};
+
+		cpm_comphy1: phy@1 {
+			reg = <1>;
+			#phy-cells = <1>;
+		};
+	};
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next v2 02/14] phy: add the mvebu cp110 comphy driver
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Antoine Tenart, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, miquel.raynal, netdev
In-Reply-To: <20170825144821.31129-1-antoine.tenart@free-electrons.com>

On the CP110 unit, which can be found on various Marvell platforms such
as the 7k and 8k (currently), a comphy (common PHYs) hardware block can
be found. This block provides a number of PHYs which can be used in
various modes by other controllers (network, SATA ...). These common
PHYs must be configured for the controllers using them to work correctly
either at boot time, or when the system runs to switch the mode used.
This patch adds a driver for this comphy hardware block, providing
callbacks for the its PHYs so that consumers can configure the modes
used.

As of this commit, two modes are supported by the comphy driver: sgmii
and 10gkr.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
---
 drivers/phy/marvell/Kconfig                  |  10 +
 drivers/phy/marvell/Makefile                 |   1 +
 drivers/phy/marvell/phy-mvebu-cp110-comphy.c | 656 +++++++++++++++++++++++++++
 3 files changed, 667 insertions(+)
 create mode 100644 drivers/phy/marvell/phy-mvebu-cp110-comphy.c

diff --git a/drivers/phy/marvell/Kconfig b/drivers/phy/marvell/Kconfig
index 048d8893bc2e..26755f3d1a9a 100644
--- a/drivers/phy/marvell/Kconfig
+++ b/drivers/phy/marvell/Kconfig
@@ -21,6 +21,16 @@ config PHY_BERLIN_USB
 	help
 	  Enable this to support the USB PHY on Marvell Berlin SoCs.
 
+config PHY_MVEBU_CP110_COMPHY
+	tristate "Marvell CP110 comphy driver"
+	depends on ARCH_MVEBU && OF
+	select GENERIC_PHY
+	help
+	  This driver allows to control the comphy, an hardware block providing
+	  shared serdes PHYs on Marvell Armada 7k/8k (in the CP110). Its serdes
+	  lanes can be used by various controllers (Ethernet, sata, usb,
+	  PCIe...).
+
 config PHY_MVEBU_SATA
 	def_bool y
 	depends on ARCH_DOVE || MACH_DOVE || MACH_KIRKWOOD
diff --git a/drivers/phy/marvell/Makefile b/drivers/phy/marvell/Makefile
index 3fc188f59118..0cf6a7cbaf9f 100644
--- a/drivers/phy/marvell/Makefile
+++ b/drivers/phy/marvell/Makefile
@@ -1,6 +1,7 @@
 obj-$(CONFIG_ARMADA375_USBCLUSTER_PHY)	+= phy-armada375-usb2.o
 obj-$(CONFIG_PHY_BERLIN_SATA)		+= phy-berlin-sata.o
 obj-$(CONFIG_PHY_BERLIN_USB)		+= phy-berlin-usb.o
+obj-$(CONFIG_PHY_MVEBU_CP110_COMPHY)	+= phy-mvebu-cp110-comphy.o
 obj-$(CONFIG_PHY_MVEBU_SATA)		+= phy-mvebu-sata.o
 obj-$(CONFIG_PHY_PXA_28NM_HSIC)		+= phy-pxa-28nm-hsic.o
 obj-$(CONFIG_PHY_PXA_28NM_USB2)		+= phy-pxa-28nm-usb2.o
diff --git a/drivers/phy/marvell/phy-mvebu-cp110-comphy.c b/drivers/phy/marvell/phy-mvebu-cp110-comphy.c
new file mode 100644
index 000000000000..41556e790856
--- /dev/null
+++ b/drivers/phy/marvell/phy-mvebu-cp110-comphy.c
@@ -0,0 +1,656 @@
+/*
+ * Copyright (C) 2017 Marvell
+ *
+ * Antoine Tenart <antoine.tenart@free-electrons.com>
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/io.h>
+#include <linux/iopoll.h>
+#include <linux/mfd/syscon.h>
+#include <linux/module.h>
+#include <linux/phy/phy.h>
+#include <linux/platform_device.h>
+#include <linux/regmap.h>
+
+/* Relative to priv->base */
+#define MVEBU_COMPHY_SERDES_CFG0(n)		(0x0 + (n) * 0x1000)
+#define     MVEBU_COMPHY_SERDES_CFG0_PU_PLL	BIT(1)
+#define     MVEBU_COMPHY_SERDES_CFG0_GEN_RX(n)	((n) << 3)
+#define     MVEBU_COMPHY_SERDES_CFG0_GEN_TX(n)	((n) << 7)
+#define     MVEBU_COMPHY_SERDES_CFG0_PU_RX	BIT(11)
+#define     MVEBU_COMPHY_SERDES_CFG0_PU_TX	BIT(12)
+#define     MVEBU_COMPHY_SERDES_CFG0_HALF_BUS	BIT(14)
+#define MVEBU_COMPHY_SERDES_CFG1(n)		(0x4 + (n) * 0x1000)
+#define     MVEBU_COMPHY_SERDES_CFG1_RESET	BIT(3)
+#define     MVEBU_COMPHY_SERDES_CFG1_RX_INIT	BIT(4)
+#define     MVEBU_COMPHY_SERDES_CFG1_CORE_RESET	BIT(5)
+#define     MVEBU_COMPHY_SERDES_CFG1_RF_RESET	BIT(6)
+#define MVEBU_COMPHY_SERDES_CFG2(n)		(0x8 + (n) * 0x1000)
+#define     MVEBU_COMPHY_SERDES_CFG2_DFE_EN	BIT(4)
+#define MVEBU_COMPHY_SERDES_STATUS0(n)		(0x18 + (n) * 0x1000)
+#define     MVEBU_COMPHY_SERDES_STATUS0_TX_PLL_RDY	BIT(2)
+#define     MVEBU_COMPHY_SERDES_STATUS0_RX_PLL_RDY	BIT(3)
+#define     MVEBU_COMPHY_SERDES_STATUS0_RX_INIT		BIT(4)
+#define MVEBU_COMPHY_PWRPLL_CTRL(n)		(0x804 + (n) * 0x1000)
+#define     MVEBU_COMPHY_PWRPLL_CTRL_RFREQ(n)	((n) << 0)
+#define     MVEBU_COMPHY_PWRPLL_PHY_MODE(n)	((n) << 5)
+#define MVEBU_COMPHY_IMP_CAL(n)			(0x80c + (n) * 0x1000)
+#define     MVEBU_COMPHY_IMP_CAL_TX_EXT(n)	((n) << 10)
+#define     MVEBU_COMPHY_IMP_CAL_TX_EXT_EN	BIT(15)
+#define MVEBU_COMPHY_DFE_RES(n)			(0x81c + (n) * 0x1000)
+#define     MVEBU_COMPHY_DFE_RES_FORCE_GEN_TBL	BIT(15)
+#define MVEBU_COMPHY_COEF(n)			(0x828 + (n) * 0x1000)
+#define     MVEBU_COMPHY_COEF_DFE_EN		BIT(14)
+#define     MVEBU_COMPHY_COEF_DFE_CTRL		BIT(15)
+#define MVEBU_COMPHY_GEN1_S0(n)			(0x834 + (n) * 0x1000)
+#define     MVEBU_COMPHY_GEN1_S0_TX_AMP(n)	((n) << 1)
+#define     MVEBU_COMPHY_GEN1_S0_TX_EMPH(n)	((n) << 7)
+#define MVEBU_COMPHY_GEN1_S1(n)			(0x838 + (n) * 0x1000)
+#define     MVEBU_COMPHY_GEN1_S1_RX_MUL_PI(n)	((n) << 0)
+#define     MVEBU_COMPHY_GEN1_S1_RX_MUL_PF(n)	((n) << 3)
+#define     MVEBU_COMPHY_GEN1_S1_RX_MUL_FI(n)	((n) << 6)
+#define     MVEBU_COMPHY_GEN1_S1_RX_MUL_FF(n)	((n) << 8)
+#define     MVEBU_COMPHY_GEN1_S1_RX_DFE_EN	BIT(10)
+#define     MVEBU_COMPHY_GEN1_S1_RX_DIV(n)	((n) << 11)
+#define MVEBU_COMPHY_GEN1_S2(n)			(0x8f4 + (n) * 0x1000)
+#define     MVEBU_COMPHY_GEN1_S2_TX_EMPH(n)	((n) << 0)
+#define     MVEBU_COMPHY_GEN1_S2_TX_EMPH_EN	BIT(4)
+#define MVEBU_COMPHY_LOOPBACK(n)		(0x88c + (n) * 0x1000)
+#define     MVEBU_COMPHY_LOOPBACK_DBUS_WIDTH(n)	((n) << 1)
+#define MVEBU_COMPHY_VDD_CAL0(n)		(0x908 + (n) * 0x1000)
+#define     MVEBU_COMPHY_VDD_CAL0_CONT_MODE	BIT(15)
+#define MVEBU_COMPHY_EXT_SELV(n)		(0x914 + (n) * 0x1000)
+#define     MVEBU_COMPHY_EXT_SELV_RX_SAMPL(n)	((n) << 5)
+#define MVEBU_COMPHY_MISC_CTRL0(n)		(0x93c + (n) * 0x1000)
+#define     MVEBU_COMPHY_MISC_CTRL0_ICP_FORCE	BIT(5)
+#define     MVEBU_COMPHY_MISC_CTRL0_REFCLK_SEL	BIT(10)
+#define MVEBU_COMPHY_RX_CTRL1(n)		(0x940 + (n) * 0x1000)
+#define     MVEBU_COMPHY_RX_CTRL1_RXCLK2X_SEL	BIT(11)
+#define     MVEBU_COMPHY_RX_CTRL1_CLK8T_EN	BIT(12)
+#define MVEBU_COMPHY_SPEED_DIV(n)		(0x954 + (n) * 0x1000)
+#define     MVEBU_COMPHY_SPEED_DIV_TX_FORCE	BIT(7)
+#define MVEBU_SP_CALIB(n)			(0x96c + (n) * 0x1000)
+#define     MVEBU_SP_CALIB_SAMPLER(n)		((n) << 8)
+#define     MVEBU_SP_CALIB_SAMPLER_EN		BIT(12)
+#define MVEBU_COMPHY_TX_SLEW_RATE(n)		(0x974 + (n) * 0x1000)
+#define     MVEBU_COMPHY_TX_SLEW_RATE_EMPH(n)	((n) << 5)
+#define     MVEBU_COMPHY_TX_SLEW_RATE_SLC(n)	((n) << 10)
+#define MVEBU_COMPHY_DLT_CTRL(n)		(0x984 + (n) * 0x1000)
+#define     MVEBU_COMPHY_DLT_CTRL_DTL_FLOOP_EN	BIT(2)
+#define MVEBU_COMPHY_FRAME_DETECT0(n)		(0xa14 + (n) * 0x1000)
+#define     MVEBU_COMPHY_FRAME_DETECT0_PATN(n)	((n) << 7)
+#define MVEBU_COMPHY_FRAME_DETECT3(n)		(0xa20 + (n) * 0x1000)
+#define     MVEBU_COMPHY_FRAME_DETECT3_LOST_TIMEOUT_EN	BIT(12)
+#define MVEBU_COMPHY_DME(n)			(0xa28 + (n) * 0x1000)
+#define     MVEBU_COMPHY_DME_ETH_MODE		BIT(7)
+#define MVEBU_COMPHY_TRAINING0(n)		(0xa68 + (n) * 0x1000)
+#define     MVEBU_COMPHY_TRAINING0_P2P_HOLD	BIT(15)
+#define MVEBU_COMPHY_TRAINING5(n)		(0xaa4 + (n) * 0x1000)
+#define	    MVEBU_COMPHY_TRAINING5_RX_TIMER(n)	((n) << 0)
+#define MVEBU_COMPHY_TX_TRAIN_PRESET(n)		(0xb1c + (n) * 0x1000)
+#define     MVEBU_COMPHY_TX_TRAIN_PRESET_16B_AUTO_EN	BIT(8)
+#define     MVEBU_COMPHY_TX_TRAIN_PRESET_PRBS11		BIT(9)
+#define MVEBU_COMPHY_GEN1_S3(n)			(0xc40 + (n) * 0x1000)
+#define     MVEBU_COMPHY_GEN1_S3_FBCK_SEL	BIT(9)
+#define MVEBU_COMPHY_GEN1_S4(n)			(0xc44 + (n) * 0x1000)
+#define	    MVEBU_COMPHY_GEN1_S4_DFE_RES(n)	((n) << 8)
+#define MVEBU_COMPHY_TX_PRESET(n)		(0xc68 + (n) * 0x1000)
+#define     MVEBU_COMPHY_TX_PRESET_INDEX(n)	((n) << 0)
+#define MVEBU_COMPHY_GEN1_S5(n)			(0xd38 + (n) * 0x1000)
+#define     MVEBU_COMPHY_GEN1_S5_ICP(n)		((n) << 0)
+
+/* Relative to priv->regmap */
+#define MVEBU_COMPHY_CONF1(n)			(0x1000 + (n) * 0x28)
+#define     MVEBU_COMPHY_CONF1_PWRUP		BIT(1)
+#define     MVEBU_COMPHY_CONF1_USB_PCIE		BIT(2)	/* 0: Ethernet/SATA */
+#define MVEBU_COMPHY_CONF6(n)			(0x1014 + (n) * 0x28)
+#define     MVEBU_COMPHY_CONF6_40B		BIT(18)
+#define MVEBU_COMPHY_SELECTOR			0x1140
+#define     MVEBU_COMPHY_SELECTOR_PHY(n)	((n) * 0x4)
+
+#define MVEBU_COMPHY_LANES	6
+#define MVEBU_COMPHY_PORTS	3
+
+struct mvebu_comhy_conf {
+	enum phy_mode mode;
+	unsigned lane;
+	unsigned port;
+	u32 mux;
+};
+
+#define MVEBU_COMPHY_CONF(_lane, _port, _mode, _mux)	\
+	{						\
+		.lane = _lane,				\
+		.port = _port,				\
+		.mode = _mode,				\
+		.mux = _mux,				\
+	}
+
+static const struct mvebu_comhy_conf mvebu_comphy_cp110_modes[] = {
+	/* lane 0 */
+	MVEBU_COMPHY_CONF(0, 1, PHY_MODE_SGMII, 0x1),
+	/* lane 1 */
+	MVEBU_COMPHY_CONF(1, 2, PHY_MODE_SGMII, 0x1),
+	/* lane 2 */
+	MVEBU_COMPHY_CONF(2, 0, PHY_MODE_SGMII, 0x1),
+	MVEBU_COMPHY_CONF(2, 0, PHY_MODE_10GKR, 0x1),
+	/* lane 3 */
+	MVEBU_COMPHY_CONF(3, 1, PHY_MODE_SGMII, 0x2),
+	/* lane 4 */
+	MVEBU_COMPHY_CONF(4, 0, PHY_MODE_SGMII, 0x2),
+	MVEBU_COMPHY_CONF(4, 0, PHY_MODE_10GKR, 0x2),
+	MVEBU_COMPHY_CONF(4, 1, PHY_MODE_SGMII, 0x1),
+	/* lane 5 */
+	MVEBU_COMPHY_CONF(5, 2, PHY_MODE_SGMII, 0x1),
+};
+
+struct mvebu_comphy_priv {
+	void __iomem *base;
+	struct regmap *regmap;
+	struct device *dev;
+	struct phy *phys[MVEBU_COMPHY_LANES];
+	int modes[MVEBU_COMPHY_LANES];
+};
+
+struct mvebu_comphy_lane {
+	struct mvebu_comphy_priv *priv;
+	struct device_node *of_node;
+	unsigned id;
+	enum phy_mode mode;
+	int port;
+};
+
+static int mvebu_comphy_get_mux(int lane, int port, enum phy_mode mode)
+{
+	int i, n = ARRAY_SIZE(mvebu_comphy_cp110_modes);
+
+	/* Unused PHY mux value is 0x0 */
+	if (mode == PHY_MODE_INVALID)
+		return 0;
+
+	for (i = 0; i < n; i++) {
+		if (mvebu_comphy_cp110_modes[i].lane == lane &&
+		    mvebu_comphy_cp110_modes[i].port == port &&
+		    mvebu_comphy_cp110_modes[i].mode == mode)
+			break;
+	}
+
+	if (i == n)
+		return -EINVAL;
+
+	return mvebu_comphy_cp110_modes[i].mux;
+}
+
+static void mvebu_comphy_ethernet_init_reset(struct mvebu_comphy_lane *lane,
+					     enum phy_mode mode)
+{
+	struct mvebu_comphy_priv *priv = lane->priv;
+	u32 val;
+
+	regmap_read(priv->regmap, MVEBU_COMPHY_CONF1(lane->id), &val);
+	val &= ~MVEBU_COMPHY_CONF1_USB_PCIE;
+	val |= MVEBU_COMPHY_CONF1_PWRUP;
+	regmap_write(priv->regmap, MVEBU_COMPHY_CONF1(lane->id), val);
+
+	/* Select baud rates and PLLs */
+	val = readl(priv->base + MVEBU_COMPHY_SERDES_CFG0(lane->id));
+	val &= ~(MVEBU_COMPHY_SERDES_CFG0_PU_PLL |
+		 MVEBU_COMPHY_SERDES_CFG0_PU_RX |
+		 MVEBU_COMPHY_SERDES_CFG0_PU_TX |
+		 MVEBU_COMPHY_SERDES_CFG0_HALF_BUS |
+		 MVEBU_COMPHY_SERDES_CFG0_GEN_RX(0xf) |
+		 MVEBU_COMPHY_SERDES_CFG0_GEN_TX(0xf));
+	if (mode == PHY_MODE_10GKR)
+		val |= MVEBU_COMPHY_SERDES_CFG0_GEN_RX(0xe) |
+		       MVEBU_COMPHY_SERDES_CFG0_GEN_TX(0xe);
+	else if (mode == PHY_MODE_SGMII)
+		val |= MVEBU_COMPHY_SERDES_CFG0_GEN_RX(0x6) |
+		       MVEBU_COMPHY_SERDES_CFG0_GEN_TX(0x6) |
+		       MVEBU_COMPHY_SERDES_CFG0_HALF_BUS;
+	writel(val, priv->base + MVEBU_COMPHY_SERDES_CFG0(lane->id));
+
+	/* reset */
+	val = readl(priv->base + MVEBU_COMPHY_SERDES_CFG1(lane->id));
+	val &= ~(MVEBU_COMPHY_SERDES_CFG1_RESET |
+		 MVEBU_COMPHY_SERDES_CFG1_CORE_RESET |
+		 MVEBU_COMPHY_SERDES_CFG1_RF_RESET);
+	writel(val, priv->base + MVEBU_COMPHY_SERDES_CFG1(lane->id));
+
+	/* de-assert reset */
+	val = readl(priv->base + MVEBU_COMPHY_SERDES_CFG1(lane->id));
+	val |= MVEBU_COMPHY_SERDES_CFG1_RESET |
+	       MVEBU_COMPHY_SERDES_CFG1_CORE_RESET;
+	writel(val, priv->base + MVEBU_COMPHY_SERDES_CFG1(lane->id));
+
+	/* wait until clocks are ready */
+	mdelay(1);
+
+	/* exlicitly disable 40B, the bits isn't clear on reset */
+	regmap_read(priv->regmap, MVEBU_COMPHY_CONF6(lane->id), &val);
+	val &= ~MVEBU_COMPHY_CONF6_40B;
+	regmap_write(priv->regmap, MVEBU_COMPHY_CONF6(lane->id), val);
+
+	/* refclk selection */
+	val = readl(priv->base + MVEBU_COMPHY_MISC_CTRL0(lane->id));
+	val &= ~MVEBU_COMPHY_MISC_CTRL0_REFCLK_SEL;
+	if (mode == PHY_MODE_10GKR)
+		val |= MVEBU_COMPHY_MISC_CTRL0_ICP_FORCE;
+	writel(val, priv->base + MVEBU_COMPHY_MISC_CTRL0(lane->id));
+
+	/* power and pll selection */
+	val = readl(priv->base + MVEBU_COMPHY_PWRPLL_CTRL(lane->id));
+	val &= ~(MVEBU_COMPHY_PWRPLL_CTRL_RFREQ(0x1f) |
+		 MVEBU_COMPHY_PWRPLL_PHY_MODE(0x7));
+	val |= MVEBU_COMPHY_PWRPLL_CTRL_RFREQ(0x1) |
+	       MVEBU_COMPHY_PWRPLL_PHY_MODE(0x4);
+	writel(val, priv->base + MVEBU_COMPHY_PWRPLL_CTRL(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_LOOPBACK(lane->id));
+	val &= ~MVEBU_COMPHY_LOOPBACK_DBUS_WIDTH(0x7);
+	val |= MVEBU_COMPHY_LOOPBACK_DBUS_WIDTH(0x1);
+	writel(val, priv->base + MVEBU_COMPHY_LOOPBACK(lane->id));
+}
+
+static int mvebu_comphy_init_plls(struct mvebu_comphy_lane *lane,
+				  enum phy_mode mode)
+{
+	struct mvebu_comphy_priv *priv = lane->priv;
+	u32 val;
+
+	/* SERDES external config */
+	val = readl(priv->base + MVEBU_COMPHY_SERDES_CFG0(lane->id));
+	val |= MVEBU_COMPHY_SERDES_CFG0_PU_PLL |
+	       MVEBU_COMPHY_SERDES_CFG0_PU_RX |
+	       MVEBU_COMPHY_SERDES_CFG0_PU_TX;
+	writel(val, priv->base + MVEBU_COMPHY_SERDES_CFG0(lane->id));
+
+	/* check rx/tx pll */
+	readl_poll_timeout(priv->base + MVEBU_COMPHY_SERDES_STATUS0(lane->id),
+			   val,
+			   val & (MVEBU_COMPHY_SERDES_STATUS0_RX_PLL_RDY |
+				  MVEBU_COMPHY_SERDES_STATUS0_TX_PLL_RDY),
+			   1000, 150000);
+	if (!(val & (MVEBU_COMPHY_SERDES_STATUS0_RX_PLL_RDY |
+		     MVEBU_COMPHY_SERDES_STATUS0_TX_PLL_RDY)))
+		return -ETIMEDOUT;
+
+	/* rx init */
+	val = readl(priv->base + MVEBU_COMPHY_SERDES_CFG1(lane->id));
+	val |= MVEBU_COMPHY_SERDES_CFG1_RX_INIT;
+	writel(val, priv->base + MVEBU_COMPHY_SERDES_CFG1(lane->id));
+
+	/* check rx */
+	readl_poll_timeout(priv->base + MVEBU_COMPHY_SERDES_STATUS0(lane->id),
+			   val, val & MVEBU_COMPHY_SERDES_STATUS0_RX_INIT,
+			   1000, 10000);
+	if (!(val & MVEBU_COMPHY_SERDES_STATUS0_RX_INIT))
+		return -ETIMEDOUT;
+
+	val = readl(priv->base + MVEBU_COMPHY_SERDES_CFG1(lane->id));
+	val &= ~MVEBU_COMPHY_SERDES_CFG1_RX_INIT;
+	writel(val, priv->base + MVEBU_COMPHY_SERDES_CFG1(lane->id));
+
+	return 0;
+}
+
+static int mvebu_comphy_set_mode_sgmii(struct phy *phy, enum phy_mode mode)
+{
+	struct mvebu_comphy_lane *lane = phy_get_drvdata(phy);
+	struct mvebu_comphy_priv *priv = lane->priv;
+	u32 val;
+
+	mvebu_comphy_ethernet_init_reset(lane, mode);
+
+	val = readl(priv->base + MVEBU_COMPHY_RX_CTRL1(lane->id));
+	val &= ~MVEBU_COMPHY_RX_CTRL1_CLK8T_EN;
+	val |= MVEBU_COMPHY_RX_CTRL1_RXCLK2X_SEL;
+	writel(val, priv->base + MVEBU_COMPHY_RX_CTRL1(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_DLT_CTRL(lane->id));
+	val &= ~MVEBU_COMPHY_DLT_CTRL_DTL_FLOOP_EN;
+	writel(val, priv->base + MVEBU_COMPHY_DLT_CTRL(lane->id));
+
+	regmap_read(priv->regmap, MVEBU_COMPHY_CONF1(lane->id), &val);
+	val &= ~MVEBU_COMPHY_CONF1_USB_PCIE;
+	val |= MVEBU_COMPHY_CONF1_PWRUP;
+	regmap_write(priv->regmap, MVEBU_COMPHY_CONF1(lane->id), val);
+
+	val = readl(priv->base + MVEBU_COMPHY_GEN1_S0(lane->id));
+	val &= ~MVEBU_COMPHY_GEN1_S0_TX_EMPH(0xf);
+	val |= MVEBU_COMPHY_GEN1_S0_TX_EMPH(0x1);
+	writel(val, priv->base + MVEBU_COMPHY_GEN1_S0(lane->id));
+
+	return mvebu_comphy_init_plls(lane, mode);
+}
+
+static int mvebu_comphy_set_mode_10gkr(struct phy *phy, enum phy_mode mode)
+{
+	struct mvebu_comphy_lane *lane = phy_get_drvdata(phy);
+	struct mvebu_comphy_priv *priv = lane->priv;
+	u32 val;
+
+	mvebu_comphy_ethernet_init_reset(lane, mode);
+
+	val = readl(priv->base + MVEBU_COMPHY_RX_CTRL1(lane->id));
+	val |= MVEBU_COMPHY_RX_CTRL1_RXCLK2X_SEL |
+	       MVEBU_COMPHY_RX_CTRL1_CLK8T_EN;
+	writel(val, priv->base + MVEBU_COMPHY_RX_CTRL1(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_DLT_CTRL(lane->id));
+	val |= MVEBU_COMPHY_DLT_CTRL_DTL_FLOOP_EN;
+	writel(val, priv->base + MVEBU_COMPHY_DLT_CTRL(lane->id));
+
+	/* Speed divider */
+	val = readl(priv->base + MVEBU_COMPHY_SPEED_DIV(lane->id));
+	val |= MVEBU_COMPHY_SPEED_DIV_TX_FORCE;
+	writel(val, priv->base + MVEBU_COMPHY_SPEED_DIV(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_SERDES_CFG2(lane->id));
+	val |= MVEBU_COMPHY_SERDES_CFG2_DFE_EN;
+	writel(val, priv->base + MVEBU_COMPHY_SERDES_CFG2(lane->id));
+
+	/* DFE resolution */
+	val = readl(priv->base + MVEBU_COMPHY_DFE_RES(lane->id));
+	val |= MVEBU_COMPHY_DFE_RES_FORCE_GEN_TBL;
+	writel(val, priv->base + MVEBU_COMPHY_DFE_RES(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_GEN1_S0(lane->id));
+	val &= ~(MVEBU_COMPHY_GEN1_S0_TX_AMP(0x1f) |
+		 MVEBU_COMPHY_GEN1_S0_TX_EMPH(0xf));
+	val |= MVEBU_COMPHY_GEN1_S0_TX_AMP(0x1c) |
+	       MVEBU_COMPHY_GEN1_S0_TX_EMPH(0xe);
+	writel(val, priv->base + MVEBU_COMPHY_GEN1_S0(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_GEN1_S2(lane->id));
+	val &= ~MVEBU_COMPHY_GEN1_S2_TX_EMPH(0xf);
+	val |= MVEBU_COMPHY_GEN1_S2_TX_EMPH_EN;
+	writel(val, priv->base + MVEBU_COMPHY_GEN1_S2(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_TX_SLEW_RATE(lane->id));
+	val |= MVEBU_COMPHY_TX_SLEW_RATE_EMPH(0x3) |
+	       MVEBU_COMPHY_TX_SLEW_RATE_SLC(0x3f);
+	writel(val, priv->base + MVEBU_COMPHY_TX_SLEW_RATE(lane->id));
+
+	/* Impedance calibration */
+	val = readl(priv->base + MVEBU_COMPHY_IMP_CAL(lane->id));
+	val &= ~MVEBU_COMPHY_IMP_CAL_TX_EXT(0x1f);
+	val |= MVEBU_COMPHY_IMP_CAL_TX_EXT(0xe) |
+	       MVEBU_COMPHY_IMP_CAL_TX_EXT_EN;
+	writel(val, priv->base + MVEBU_COMPHY_IMP_CAL(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_GEN1_S5(lane->id));
+	val &= ~MVEBU_COMPHY_GEN1_S5_ICP(0xf);
+	writel(val, priv->base + MVEBU_COMPHY_GEN1_S5(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_GEN1_S1(lane->id));
+	val &= ~(MVEBU_COMPHY_GEN1_S1_RX_MUL_PI(0x7) |
+		 MVEBU_COMPHY_GEN1_S1_RX_MUL_PF(0x7) |
+		 MVEBU_COMPHY_GEN1_S1_RX_MUL_FI(0x3) |
+		 MVEBU_COMPHY_GEN1_S1_RX_MUL_FF(0x3));
+	val |= MVEBU_COMPHY_GEN1_S1_RX_DFE_EN |
+	       MVEBU_COMPHY_GEN1_S1_RX_MUL_PI(0x2) |
+	       MVEBU_COMPHY_GEN1_S1_RX_MUL_PF(0x2) |
+	       MVEBU_COMPHY_GEN1_S1_RX_MUL_FF(0x1) |
+	       MVEBU_COMPHY_GEN1_S1_RX_DIV(0x3);
+	writel(val, priv->base + MVEBU_COMPHY_GEN1_S1(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_COEF(lane->id));
+	val &= ~(MVEBU_COMPHY_COEF_DFE_EN | MVEBU_COMPHY_COEF_DFE_CTRL);
+	writel(val, priv->base + MVEBU_COMPHY_COEF(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_GEN1_S4(lane->id));
+	val &= ~MVEBU_COMPHY_GEN1_S4_DFE_RES(0x3);
+	val |= MVEBU_COMPHY_GEN1_S4_DFE_RES(0x1);
+	writel(val, priv->base + MVEBU_COMPHY_GEN1_S4(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_GEN1_S3(lane->id));
+	val |= MVEBU_COMPHY_GEN1_S3_FBCK_SEL;
+	writel(val, priv->base + MVEBU_COMPHY_GEN1_S3(lane->id));
+
+	/* rx training timer */
+	val = readl(priv->base + MVEBU_COMPHY_TRAINING5(lane->id));
+	val &= ~MVEBU_COMPHY_TRAINING5_RX_TIMER(0x3ff);
+	val |= MVEBU_COMPHY_TRAINING5_RX_TIMER(0x13);
+	writel(val, priv->base + MVEBU_COMPHY_TRAINING5(lane->id));
+
+	/* tx train peak to peak hold */
+	val = readl(priv->base + MVEBU_COMPHY_TRAINING0(lane->id));
+	val |= MVEBU_COMPHY_TRAINING0_P2P_HOLD;
+	writel(val, priv->base + MVEBU_COMPHY_TRAINING0(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_TX_PRESET(lane->id));
+	val &= ~MVEBU_COMPHY_TX_PRESET_INDEX(0xf);
+	val |= MVEBU_COMPHY_TX_PRESET_INDEX(0x2);	/* preset coeff */
+	writel(val, priv->base + MVEBU_COMPHY_TX_PRESET(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_FRAME_DETECT3(lane->id));
+	val &= ~MVEBU_COMPHY_FRAME_DETECT3_LOST_TIMEOUT_EN;
+	writel(val, priv->base + MVEBU_COMPHY_FRAME_DETECT3(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_TX_TRAIN_PRESET(lane->id));
+	val |= MVEBU_COMPHY_TX_TRAIN_PRESET_16B_AUTO_EN |
+	       MVEBU_COMPHY_TX_TRAIN_PRESET_PRBS11;
+	writel(val, priv->base + MVEBU_COMPHY_TX_TRAIN_PRESET(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_FRAME_DETECT0(lane->id));
+	val &= ~MVEBU_COMPHY_FRAME_DETECT0_PATN(0x1ff);
+	val |= MVEBU_COMPHY_FRAME_DETECT0_PATN(0x88);
+	writel(val, priv->base + MVEBU_COMPHY_FRAME_DETECT0(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_DME(lane->id));
+	val |= MVEBU_COMPHY_DME_ETH_MODE;
+	writel(val, priv->base + MVEBU_COMPHY_DME(lane->id));
+
+	val = readl(priv->base + MVEBU_COMPHY_VDD_CAL0(lane->id));
+	val |= MVEBU_COMPHY_VDD_CAL0_CONT_MODE;
+	writel(val, priv->base + MVEBU_COMPHY_VDD_CAL0(lane->id));
+
+	val = readl(priv->base + MVEBU_SP_CALIB(lane->id));
+	val &= ~MVEBU_SP_CALIB_SAMPLER(0x3);
+	val |= MVEBU_SP_CALIB_SAMPLER(0x3) |
+	       MVEBU_SP_CALIB_SAMPLER_EN;
+	writel(val, priv->base + MVEBU_SP_CALIB(lane->id));
+	val &= ~MVEBU_SP_CALIB_SAMPLER_EN;
+	writel(val, priv->base + MVEBU_SP_CALIB(lane->id));
+
+	/* External rx regulator */
+	val = readl(priv->base + MVEBU_COMPHY_EXT_SELV(lane->id));
+	val &= ~MVEBU_COMPHY_EXT_SELV_RX_SAMPL(0x1f);
+	val |= MVEBU_COMPHY_EXT_SELV_RX_SAMPL(0x1a);
+	writel(val, priv->base + MVEBU_COMPHY_EXT_SELV(lane->id));
+
+	return mvebu_comphy_init_plls(lane, mode);
+}
+
+static int mvebu_comphy_power_on(struct phy *phy)
+{
+	struct mvebu_comphy_lane *lane = phy_get_drvdata(phy);
+	struct mvebu_comphy_priv *priv = lane->priv;
+	int ret;
+	u32 mux, val;
+
+	mux = mvebu_comphy_get_mux(lane->id, lane->port, lane->mode);
+	if (mux < 0)
+		return -ENOTSUPP;
+
+	regmap_read(priv->regmap, MVEBU_COMPHY_SELECTOR, &val);
+	val &= ~(0xf << MVEBU_COMPHY_SELECTOR_PHY(lane->id));
+	val |= mux << MVEBU_COMPHY_SELECTOR_PHY(lane->id);
+	regmap_write(priv->regmap, MVEBU_COMPHY_SELECTOR, val);
+
+	switch (lane->mode) {
+	case PHY_MODE_SGMII:
+		ret = mvebu_comphy_set_mode_sgmii(phy, lane->mode);
+		break;
+	case PHY_MODE_10GKR:
+		ret = mvebu_comphy_set_mode_10gkr(phy, lane->mode);
+		break;
+	default:
+		return -ENOTSUPP;
+	}
+
+	/* digital reset */
+	val = readl(priv->base + MVEBU_COMPHY_SERDES_CFG1(lane->id));
+	val |= MVEBU_COMPHY_SERDES_CFG1_RF_RESET;
+	writel(val, priv->base + MVEBU_COMPHY_SERDES_CFG1(lane->id));
+
+	return ret;
+}
+
+static int mvebu_comphy_set_mode(struct phy *phy, enum phy_mode mode)
+{
+	struct mvebu_comphy_lane *lane = phy_get_drvdata(phy);
+
+	if (mvebu_comphy_get_mux(lane->id, lane->port, mode) < 0)
+		return -EINVAL;
+
+	lane->mode = mode;
+	return 0;
+}
+
+static int mvebu_comphy_power_off(struct phy *phy)
+{
+	struct mvebu_comphy_lane *lane = phy_get_drvdata(phy);
+	struct mvebu_comphy_priv *priv = lane->priv;
+	u32 val;
+
+	val = readl(priv->base + MVEBU_COMPHY_SERDES_CFG1(lane->id));
+	val &= ~(MVEBU_COMPHY_SERDES_CFG1_RESET |
+		 MVEBU_COMPHY_SERDES_CFG1_CORE_RESET |
+		 MVEBU_COMPHY_SERDES_CFG1_RF_RESET);
+	writel(val, priv->base + MVEBU_COMPHY_SERDES_CFG1(lane->id));
+
+	regmap_read(priv->regmap, MVEBU_COMPHY_SELECTOR, &val);
+	val &= ~(0xf << MVEBU_COMPHY_SELECTOR_PHY(lane->id));
+	regmap_write(priv->regmap, MVEBU_COMPHY_SELECTOR, val);
+
+	return 0;
+}
+
+static const struct phy_ops mvebu_comphy_ops = {
+	.power_on	= mvebu_comphy_power_on,
+	.power_off	= mvebu_comphy_power_off,
+	.set_mode	= mvebu_comphy_set_mode,
+};
+
+static struct phy *mvebu_comphy_xlate(struct device *dev,
+				      struct of_phandle_args *args)
+{
+	struct mvebu_comphy_priv *priv = dev_get_drvdata(dev);
+	struct mvebu_comphy_lane *lane;
+	int i;
+
+	if (WARN_ON(args->args[0] >= MVEBU_COMPHY_PORTS))
+		return ERR_PTR(-EINVAL);
+
+	for (i = 0; i < MVEBU_COMPHY_LANES; i++) {
+		if (!priv->phys[i])
+			continue;
+
+		lane = phy_get_drvdata(priv->phys[i]);
+		if (priv->phys[i] && args->np == lane->of_node)
+			break;
+	}
+
+	if (i == MVEBU_COMPHY_LANES)
+		return ERR_PTR(-ENODEV);
+
+	if (lane->port >= 0)
+		return ERR_PTR(-EBUSY);
+
+	lane->port = args->args[0];
+	return priv->phys[i];
+}
+
+static int mvebu_comphy_probe(struct platform_device *pdev)
+{
+	struct mvebu_comphy_priv *priv;
+	struct phy_provider *provider;
+	struct device_node *child;
+	struct resource *res;
+
+	priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	priv->dev = &pdev->dev;
+	priv->regmap =
+		syscon_regmap_lookup_by_phandle(pdev->dev.of_node,
+						"marvell,system-controller");
+	if (IS_ERR(priv->regmap))
+		return PTR_ERR(priv->regmap);
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	priv->base = devm_ioremap_resource(&pdev->dev, res);
+	if (!priv->base)
+		return -ENOMEM;
+
+	for_each_available_child_of_node(pdev->dev.of_node, child) {
+		struct mvebu_comphy_lane *lane;
+		struct phy *phy;
+		int ret;
+		u32 val;
+
+		ret = of_property_read_u32(child, "reg", &val);
+		if (ret < 0) {
+			dev_err(&pdev->dev, "missing 'reg' property (%d)\n",
+				ret);
+			continue;
+		}
+
+		if (val >= MVEBU_COMPHY_LANES) {
+			dev_err(&pdev->dev, "invalid 'reg' property\n");
+			continue;
+		}
+
+		lane = devm_kzalloc(&pdev->dev, sizeof(*lane), GFP_KERNEL);
+		if (!lane)
+			return -ENOMEM;
+
+		phy = devm_phy_create(&pdev->dev, NULL, &mvebu_comphy_ops);
+		if (IS_ERR(phy))
+			return PTR_ERR(phy);
+
+		lane->priv = priv;
+		lane->of_node = child;
+		lane->mode = PHY_MODE_INVALID;
+		lane->id = val;
+		lane->port = -1;
+		phy_set_drvdata(phy, lane);
+
+		priv->phys[val] = phy;
+
+		/*
+		 * Once all modes are supported in this driver we should call
+		 * mvebu_comphy_power_off(phy) here to avoid relying on the
+		 * bootloader/firmware configuration.
+		 */
+	}
+
+	dev_set_drvdata(&pdev->dev, priv);
+	provider = devm_of_phy_provider_register(&pdev->dev,
+						 mvebu_comphy_xlate);
+	return PTR_ERR_OR_ZERO(provider);
+}
+
+static const struct of_device_id mvebu_comphy_of_match_table[] = {
+	{ .compatible = "marvell,comphy-cp110" },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, mvebu_comphy_of_match_table);
+
+static struct platform_driver mvebu_comphy_driver = {
+	.probe	= mvebu_comphy_probe,
+	.driver	= {
+		.name = "mvebu-comphy",
+		.of_match_table = mvebu_comphy_of_match_table,
+	},
+};
+module_platform_driver(mvebu_comphy_driver);
+
+MODULE_AUTHOR("Antoine Tenart <antoine.tenart@free-electrons.com>");
+MODULE_DESCRIPTION("Common PHY driver for mvebu SoCs");
+MODULE_LICENSE("GPL v2");
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next v2 01/14] phy: add sgmii and 10gkr modes to the phy_mode enum
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Antoine Tenart, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, miquel.raynal, netdev
In-Reply-To: <20170825144821.31129-1-antoine.tenart@free-electrons.com>

This patch adds more generic PHY modes to the phy_mode enum, to
allow configuring generic PHYs to the SGMII and/or the 10GKR mode
by using the set_mode callback.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
---
 include/linux/phy/phy.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/phy/phy.h b/include/linux/phy/phy.h
index 78bb0d7f6b11..e694d4008c4a 100644
--- a/include/linux/phy/phy.h
+++ b/include/linux/phy/phy.h
@@ -27,6 +27,8 @@ enum phy_mode {
 	PHY_MODE_USB_HOST,
 	PHY_MODE_USB_DEVICE,
 	PHY_MODE_USB_OTG,
+	PHY_MODE_SGMII,
+	PHY_MODE_10GKR,
 };
 
 /**
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next v2 00/14] net: mvpp2: comphy configuration
From: Antoine Tenart @ 2017-08-25 14:48 UTC (permalink / raw)
  To: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement
  Cc: Antoine Tenart, thomas.petazzoni, nadavh, linux, linux-kernel, mw,
	stefanc, miquel.raynal, netdev

Hi all,

This series, following up the one one the GoP/MAC configuration, aims at
stopping to depend on the firmware/bootloader configuration when using
the PPv2 engine. With this series the PPv2 driver does not need to rely
on a previous configuration, and dynamic reconfiguration while the
kernel is running can be done (i.e. switch one port from SGMII to 10GKR,
or the opposite). A port can now be configured in a different mode than
what's done in the firmware/bootloader as well.

The series first contain patches in the generic PHY framework to support
what is called the comphy (common PHYs), which is an h/w block providing
PHYs that can be configured in various modes ranging from SGMII, 10GKR
to SATA and others. As of now only the SGMII and 10GKR modes are
supported by the comphy driver.

Then patches are modifying the PPv2 driver to first add the comphy
initialization sequence (i.e. calls to the generic PHY framework) and to
then take advantage of this to allow dynamic reconfiguration (i.e.
configuring the mode of a port given what's connected, between sgmii and
10G). Note the use of the comphy in the PPv2 driver is kept optional
(i.e. if not described in dt the driver still as before an relies on the
firmware/bootloader configuration).

Finally there are dt/defconfig patches to describe and take advantage of
this.

This was tested on a range of devices: 8040-db, 8040-mcbin and 7040-db.

Thanks!
Antoine

Since v1:
  - Updated the mode settings variable name in the comphy driver to
    have 'cp110' in it.
  - Documented the PHY cell argument in the dt documentation.
  - New patch adding comphy phandles for the 7040-db board.
  - Checked if the carrier_on/off functions were needed. They are.
  - s/PHY/generic PHY/ in commit log of patch 1.
  - Rebased on the latest net-next/master.

Antoine Tenart (13):
  phy: add sgmii and 10gkr modes to the phy_mode enum
  phy: add the mvebu cp110 comphy driver
  Documentation/bindings: phy: document the Marvell comphy driver
  net: mvpp2: initialize the comphy
  net: mvpp2: do not force the link mode
  net: mvpp2: simplify the link_event function
  net: mvpp2: improve the link management function
  net: mvpp2: check the netif is running in the link_event function
  net: mvpp2: dynamic reconfiguration of the PHY mode
  arm64: dts: marvell: extend the cp110 syscon register area length
  arm64: dts: marvell: add comphy nodes on cp110 master and slave
  arm64: dts: marvell: mcbin: add comphy references to Ethernet ports
  arm64: dts: marvell: 7040-db: add comphy references to Ethernet ports

Miquel Raynal (1):
  arm64: defconfig: enable Marvell CP110 comphy

 .../devicetree/bindings/phy/phy-mvebu-comphy.txt   |  43 ++
 arch/arm64/boot/dts/marvell/armada-7040-db.dts     |   1 +
 arch/arm64/boot/dts/marvell/armada-8040-mcbin.dts  |   3 +
 .../boot/dts/marvell/armada-cp110-master.dtsi      |  40 +-
 .../arm64/boot/dts/marvell/armada-cp110-slave.dtsi |  40 +-
 arch/arm64/configs/defconfig                       |   1 +
 drivers/net/ethernet/marvell/mvpp2.c               | 111 ++--
 drivers/phy/marvell/Kconfig                        |  10 +
 drivers/phy/marvell/Makefile                       |   1 +
 drivers/phy/marvell/phy-mvebu-cp110-comphy.c       | 656 +++++++++++++++++++++
 include/linux/phy/phy.h                            |   2 +
 11 files changed, 873 insertions(+), 35 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/phy/phy-mvebu-comphy.txt
 create mode 100644 drivers/phy/marvell/phy-mvebu-cp110-comphy.c

-- 
2.13.5

^ permalink raw reply

* [PATCH] net: stmmac: Handle possible fixed-link with need_mdio_ids
From: Corentin Labbe @ 2017-08-25 14:42 UTC (permalink / raw)
  To: peppe.cavallaro, alexandre.torgue; +Cc: netdev, linux-kernel, Corentin Labbe

In case of fixed link, there are no mdio node.
This patch add a test for fixed-link for bypassing MDIO node register.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index a366b3747eeb..e1be5735365b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -332,7 +332,7 @@ static int stmmac_dt_phy(struct plat_stmmacenet_data *plat,
 		mdio = false;
 	}
 
-	if (of_match_node(need_mdio_ids, np)) {
+	if (of_match_node(need_mdio_ids, np) && !of_phy_is_fixed_link(np)) {
 		plat->mdio_node = of_get_child_by_name(np, "mdio");
 	} else {
 		/**
-- 
2.13.5

^ permalink raw reply related

* [PATCH] net: stmmac: dwmac-sun8i: Use reset exclusive
From: Corentin Labbe @ 2017-08-25 14:38 UTC (permalink / raw)
  To: peppe.cavallaro, alexandre.torgue, maxime.ripard, wens
  Cc: netdev, linux-arm-kernel, linux-kernel, Corentin Labbe

The current dwmac_sun8i module cannot be rmmod/modprobe due to that
the reset controller was not released when removed.

This patch remove ambiguity, by using of_reset_control_get_exclusive and
add the missing reset_control_put().

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
 drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
index fffd6d5fc907..675a09629d85 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
@@ -782,6 +782,7 @@ static int sun8i_dwmac_unpower_internal_phy(struct sunxi_priv_data *gmac)
 
 	clk_disable_unprepare(gmac->ephy_clk);
 	reset_control_assert(gmac->rst_ephy);
+	reset_control_put(gmac->rst_ephy);
 	return 0;
 }
 
@@ -942,7 +943,7 @@ static int sun8i_dwmac_probe(struct platform_device *pdev)
 			return -EINVAL;
 		}
 
-		gmac->rst_ephy = of_reset_control_get(plat_dat->phy_node, NULL);
+		gmac->rst_ephy = of_reset_control_get_exclusive(plat_dat->phy_node, NULL);
 		if (IS_ERR(gmac->rst_ephy)) {
 			ret = PTR_ERR(gmac->rst_ephy);
 			if (ret == -EPROBE_DEFER)
-- 
2.13.5

^ permalink raw reply related

* Re: [PATCH 3/7] ieee802154: 6lowpan: make header_ops const
From: Stefan Schmidt @ 2017-08-25 14:36 UTC (permalink / raw)
  To: Bhumika Goyal, julia.lawall, marcel, gustavo, johan.hedberg,
	davem, pablo, kadlec, fw, stephen, alex.aring, kuznet, yoshfuji,
	santosh.shilimkar, trond.myklebust, anna.schumaker, bfields,
	jlayton, linux-bluetooth, netdev, linux-kernel, netfilter-devel,
	coreteam, bridge, linux-wpan, linux-rdma, rds-devel, linux-nfs
In-Reply-To: <1503670907-23221-4-git-send-email-bhumirks@gmail.com>

Hello.

On 08/25/2017 04:21 PM, Bhumika Goyal wrote:
> Make this const as it is only stored as a reference in a const field of
> a net_device structure.
> 
> Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
> ---
>   net/ieee802154/6lowpan/core.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/ieee802154/6lowpan/core.c b/net/ieee802154/6lowpan/core.c
> index de2661c..974765b 100644
> --- a/net/ieee802154/6lowpan/core.c
> +++ b/net/ieee802154/6lowpan/core.c
> @@ -54,7 +54,7 @@
>   
>   static int open_count;
>   
> -static struct header_ops lowpan_header_ops = {
> +static const struct header_ops lowpan_header_ops = {
>   	.create	= lowpan_header_create,
>   };
>   
> 

Acked-by: Stefan Schmidt <stefan@osg.samsung.com>

regards
Stefan Schmidt

^ permalink raw reply

* Re: [PATCH net-next v1] net: gso: Add GSO support for NSH (Network Service Header)
From: Yang, Yi @ 2017-08-25 14:16 UTC (permalink / raw)
  To: Jiri Benc; +Cc: netdev@vger.kernel.org, simon.horman@netronome.com
In-Reply-To: <20170824161326.1f5f7c7f@griffin>

On Thu, Aug 24, 2017 at 10:13:26PM +0800, Jiri Benc wrote:
> On Thu, 24 Aug 2017 17:36:16 +0800, Yi Yang wrote:
> >  include/net/nsh.h             | 307 ++++++++++++++++++++++++++++++++++++++++++
> >  include/uapi/linux/if_ether.h |   1 +
> >  net/Kconfig                   |   1 +
> >  net/Makefile                  |   1 +
> >  net/nsh/Kconfig               |  10 ++
> >  net/nsh/Makefile              |   4 +
> >  net/nsh/nsh_gso.c             | 116 ++++++++++++++++
> 
> Please consider making this a patchset, with a patch adding the NSH
> structures and helper functions, a patch adding GSO and a patch adding
> OVS support. That way, everything can be reviewed together, yet the
> patches are reasonably contained.

Sent out v6 as you said here.

> 
> > +config NET_NSH_GSO
> > +	bool "NSH GSO support"
> > +	depends on INET
> > +	default y
> > +	---help---
> > +	 This allows segmentation of GSO packet that have had NSH header pushed         onto them and thus become NSH GSO packets.
> 
> Seems you're missing a newline here.

Yes, fixed in v6.

> 
> > +static struct sk_buff *nsh_gso_segment(struct sk_buff *skb,
> > +				       netdev_features_t features)
> > +{
> > +	struct sk_buff *segs = ERR_PTR(-EINVAL);
> > +	__be16 protocol = skb->protocol;
> > +	__be16 inner_proto;
> > +	u16 mac_offset = skb->mac_header;
> > +	u16 mac_len = skb->mac_len;
> > +	struct nsh_hdr *nsh;
> > +	unsigned int nsh_hlen;
> > +	const struct net_offload *ops;
> > +	struct sk_buff *(*gso_inner_segment)(struct sk_buff *skb,
> > +					     netdev_features_t features);
> > +
> > +	skb_reset_network_header(skb);
> > +	nsh = (struct nsh_hdr *)skb_network_header(skb);
> > +	nsh_hlen = nsh_hdr_len(nsh);
> > +	if (unlikely(!pskb_may_pull(skb, nsh_hlen)))
> > +		goto out;
> 
> You have to reload nsh after this.

Yeah, reload it in v6.

> 
> > +
> > +	__skb_pull(skb, nsh_hlen);
> > +
> > +	skb_reset_mac_header(skb);
> > +	skb_reset_mac_len(skb);
> > +
> > +	rcu_read_lock();
> > +	switch (nsh->next_proto) {
> > +	case NSH_P_ETHERNET:
> > +		inner_proto = htons(ETH_P_TEB);
> > +		gso_inner_segment = skb_mac_gso_segment;
> > +		break;
> > +	case NSH_P_IPV4:
> > +		inner_proto = htons(ETH_P_IP);
> > +		ops = rcu_dereference(inet_offloads[inner_proto]);
> > +		if (!ops || !ops->callbacks.gso_segment)
> > +			goto out;
> > +		gso_inner_segment = ops->callbacks.gso_segment;
> > +		break;
> > +	case NSH_P_IPV6:
> > +		inner_proto = htons(ETH_P_IPV6);
> > +		ops = rcu_dereference(inet6_offloads[inner_proto]);
> > +		if (!ops || !ops->callbacks.gso_segment)
> > +			goto out;
> > +		gso_inner_segment = ops->callbacks.gso_segment;
> > +		break;
> > +	case NSH_P_NSH:
> > +		inner_proto = htons(ETH_P_NSH);
> > +		gso_inner_segment = nsh_gso_segment;
> 
> This doesn't look correct. Recursive call to nsh_gso_segment will reset
> mac header, causing skb_segment not to copy the previous NSH header(s)
> to the new segments.

There are some errors here, I carefully checked inet_gso_segment again,
it is very similiar to nsh_gso_segment. v6 should be ok.

> 
> > +		break;
> > +	default:
> > +		skb_gso_error_unwind(skb, protocol, nsh_hlen, mac_offset,
> > +				     mac_len);
> > +		goto out;
> > +	}
> > +
> > +	skb->protocol = inner_proto;
> > +	segs = gso_inner_segment(skb, features);
> > +	if (IS_ERR_OR_NULL(segs)) {
> > +		skb_gso_error_unwind(skb, protocol, nsh_hlen, mac_offset,
> > +				     mac_len);
> > +		goto out;
> > +	}
> > +
> > +	do {
> > +		skb->mac_len = mac_len;
> > +		skb->protocol = protocol;
> > +
> > +		skb_reset_inner_network_header(skb);
> 
> This looks superfluous.

v6 looks simple :-)

> 
> > +
> > +		__skb_push(skb, nsh_hlen);
> > +
> > +		skb_reset_mac_header(skb);
> > +		skb_set_network_header(skb, mac_len);
> > +	} while ((skb = skb->next));
> > +
> > +out:
> > +	rcu_read_unlock();
> > +	return segs;
> > +}
> > +
> > +static struct packet_offload nsh_offload __read_mostly = {
> > +	.type = cpu_to_be16(ETH_P_NSH),
> > +	.priority = 15,
> > +	.callbacks = {
> > +		.gso_segment    =	nsh_gso_segment,
> > +	},
> > +};
> > +
> > +static int __init nsh_gso_init(void)
> > +{
> > +	dev_add_offload(&nsh_offload);
> > +
> > +	return 0;
> > +}
> > +
> > +fs_initcall(nsh_gso_init);
> 
> device_initcall should be enough. I doubt we'll be doing NFS over
> NSH ;-)

I have used device_initcall in v6.

> 
>  Jiri

^ permalink raw reply

* Re: [PATCH net v2 1/4] net: mvpp2: fix the mac address used when using PPv2.2
From: Antoine Tenart @ 2017-08-25 14:29 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Antoine Tenart, davem, thomas.petazzoni, gregory.clement, nadavh,
	linux, linux-kernel, mw, stefanc, netdev
In-Reply-To: <20170825141939.GB30922@lunn.ch>

[-- Attachment #1: Type: text/plain, Size: 1617 bytes --]

Hi Andrew,

On Fri, Aug 25, 2017 at 04:19:39PM +0200, Andrew Lunn wrote:
> On Fri, Aug 25, 2017 at 04:14:17PM +0200, Antoine Tenart wrote:
> > The mac address is only retrieved from h/w when using PPv2.1. Otherwise
> > the variable holding it is still checked and used if it contains a valid
> > value. As the variable isn't initialized to an invalid mac address
> > value, we end up with random mac addresses which can be the same for all
> > the ports handled by this PPv2 driver.
> > 
> > Fixes this by initializing the h/w mac address variable to {0}, which is
> > an invalid mac address value. This way the random assignation fallback
> > is called and all ports end up with their own addresses.
> > 
> > Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
> > Fixes: 2697582144dd ("net: mvpp2: handle misc PPv2.1/PPv2.2 differences")
> 
> Is this patch alone sufficient to fix the problem?

Yes it is.

> Ideally, you want a minimal patch for stable, i.e. -net, and a fuller
> fix can go into net-next.

That was my intention (providing a minimal patch for stable), and I
asked about this approach in the v1. Dave told me to resend the series
to net. Maybe my questions weren't clear enough.

So probably the best way to handle this would have been to send 1/4 to
net and 2-4/4 to net-next (but then there's a dependency between the two
series).

Let's wait for Dave's answer, I'll respin if needed so that it's easy
for him to apply it.

Thanks!
Antoine

-- 
Antoine Ténart, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH net-next v6 3/3] openvswitch: enable NSH support
From: Yi Yang @ 2017-08-25 14:20 UTC (permalink / raw)
  To: netdev; +Cc: dev, jbenc, e, blp, jan.scheurich, Yi Yang
In-Reply-To: <1503670805-31051-1-git-send-email-yi.y.yang@intel.com>

OVS master and 2.8 branch has merged NSH userspace
patch series, this patch is to enable NSH support
in kernel data path in order that OVS can support
NSH in compat mode by porting this.

Signed-off-by: Yi Yang <yi.y.yang@intel.com>
---
 drivers/net/vxlan.c              |   7 +
 include/uapi/linux/openvswitch.h |  28 +++
 net/openvswitch/actions.c        | 178 +++++++++++++++++
 net/openvswitch/flow.c           |  55 ++++++
 net/openvswitch/flow.h           |  11 ++
 net/openvswitch/flow_netlink.c   | 404 ++++++++++++++++++++++++++++++++++++++-
 net/openvswitch/flow_netlink.h   |   4 +
 7 files changed, 686 insertions(+), 1 deletion(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index ae3a1da..a36c41e 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -27,6 +27,7 @@
 #include <net/net_namespace.h>
 #include <net/netns/generic.h>
 #include <net/vxlan.h>
+#include <net/nsh.h>
 
 #if IS_ENABLED(CONFIG_IPV6)
 #include <net/ip6_tunnel.h>
@@ -1268,6 +1269,9 @@ static bool vxlan_parse_gpe_hdr(struct vxlanhdr *unparsed,
 	case VXLAN_GPE_NP_IPV6:
 		*protocol = htons(ETH_P_IPV6);
 		break;
+	case VXLAN_GPE_NP_NSH:
+		*protocol = htons(ETH_P_NSH);
+		break;
 	case VXLAN_GPE_NP_ETHERNET:
 		*protocol = htons(ETH_P_TEB);
 		break;
@@ -1807,6 +1811,9 @@ static int vxlan_build_gpe_hdr(struct vxlanhdr *vxh, u32 vxflags,
 	case htons(ETH_P_IPV6):
 		gpe->next_protocol = VXLAN_GPE_NP_IPV6;
 		return 0;
+	case htons(ETH_P_NSH):
+		gpe->next_protocol = VXLAN_GPE_NP_NSH;
+		return 0;
 	case htons(ETH_P_TEB):
 		gpe->next_protocol = VXLAN_GPE_NP_ETHERNET;
 		return 0;
diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index 156ee4c..91dee5b 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -333,6 +333,7 @@ enum ovs_key_attr {
 	OVS_KEY_ATTR_CT_LABELS,	/* 16-octet connection tracking label */
 	OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4,   /* struct ovs_key_ct_tuple_ipv4 */
 	OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6,   /* struct ovs_key_ct_tuple_ipv6 */
+	OVS_KEY_ATTR_NSH,       /* Nested set of ovs_nsh_key_* */
 
 #ifdef __KERNEL__
 	OVS_KEY_ATTR_TUNNEL_INFO,  /* struct ip_tunnel_info */
@@ -491,6 +492,29 @@ struct ovs_key_ct_tuple_ipv6 {
 	__u8   ipv6_proto;
 };
 
+enum ovs_nsh_key_attr {
+	OVS_NSH_KEY_ATTR_BASE,  /* struct ovs_nsh_key_base. */
+	OVS_NSH_KEY_ATTR_MD1,   /* struct ovs_nsh_key_md1. */
+	OVS_NSH_KEY_ATTR_MD2,   /* variable-length octets for MD type 2. */
+	__OVS_NSH_KEY_ATTR_MAX
+};
+
+#define OVS_NSH_KEY_ATTR_MAX (__OVS_NSH_KEY_ATTR_MAX - 1)
+
+struct ovs_nsh_key_base {
+	__u8 flags;
+	__u8 ttl;
+	__u8 mdtype;
+	__u8 np;
+	__be32 path_hdr;
+};
+
+#define NSH_MD1_CONTEXT_SIZE 4
+
+struct ovs_nsh_key_md1 {
+	__be32 context[NSH_MD1_CONTEXT_SIZE];
+};
+
 /**
  * enum ovs_flow_attr - attributes for %OVS_FLOW_* commands.
  * @OVS_FLOW_ATTR_KEY: Nested %OVS_KEY_ATTR_* attributes specifying the flow
@@ -806,6 +830,8 @@ struct ovs_action_push_eth {
  * packet.
  * @OVS_ACTION_ATTR_POP_ETH: Pop the outermost Ethernet header off the
  * packet.
+ * @OVS_ACTION_ATTR_PUSH_NSH: push NSH header to the packet.
+ * @OVS_ACTION_ATTR_POP_NSH: pop the outermost NSH header off the packet.
  *
  * Only a single header can be set with a single %OVS_ACTION_ATTR_SET.  Not all
  * fields within a header are modifiable, e.g. the IPv4 protocol and fragment
@@ -835,6 +861,8 @@ enum ovs_action_attr {
 	OVS_ACTION_ATTR_TRUNC,        /* u32 struct ovs_action_trunc. */
 	OVS_ACTION_ATTR_PUSH_ETH,     /* struct ovs_action_push_eth. */
 	OVS_ACTION_ATTR_POP_ETH,      /* No argument. */
+	OVS_ACTION_ATTR_PUSH_NSH,     /* Nested OVS_NSH_KEY_ATTR_*. */
+	OVS_ACTION_ATTR_POP_NSH,      /* No argument. */
 
 	__OVS_ACTION_ATTR_MAX,	      /* Nothing past this will be accepted
 				       * from userspace. */
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index a54a556..9ca1a84 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -43,6 +43,7 @@
 #include "flow.h"
 #include "conntrack.h"
 #include "vport.h"
+#include "flow_netlink.h"
 
 struct deferred_action {
 	struct sk_buff *skb;
@@ -380,6 +381,95 @@ static int push_eth(struct sk_buff *skb, struct sw_flow_key *key,
 	return 0;
 }
 
+static int push_nsh(struct sk_buff *skb, struct sw_flow_key *key,
+		    const struct nsh_hdr *nsh_src)
+{
+	struct nsh_hdr *nsh;
+	size_t length = nsh_hdr_len(nsh_src);
+	u8 next_proto;
+
+	if (key->mac_proto == MAC_PROTO_ETHERNET) {
+		next_proto = NSH_P_ETHERNET;
+	} else {
+		switch (ntohs(skb->protocol)) {
+		case ETH_P_IP:
+			next_proto = NSH_P_IPV4;
+			break;
+		case ETH_P_IPV6:
+			next_proto = NSH_P_IPV6;
+			break;
+		case ETH_P_NSH:
+			next_proto = NSH_P_NSH;
+			break;
+		default:
+			return -ENOTSUPP;
+		}
+	}
+
+	/* Add the NSH header */
+	if (skb_cow_head(skb, length) < 0)
+		return -ENOMEM;
+
+	skb_push(skb, length);
+	nsh = (struct nsh_hdr *)(skb->data);
+	memcpy(nsh, nsh_src, length);
+	nsh->np = next_proto;
+	nsh->mdtype &= NSH_MDTYPE_MASK;
+
+	skb->protocol = htons(ETH_P_NSH);
+	key->eth.type = htons(ETH_P_NSH);
+	skb_reset_mac_header(skb);
+	skb_reset_mac_len(skb);
+
+	/* safe right before invalidate_flow_key */
+	key->mac_proto = MAC_PROTO_NONE;
+	invalidate_flow_key(key);
+	return 0;
+}
+
+static int pop_nsh(struct sk_buff *skb, struct sw_flow_key *key)
+{
+	struct nsh_hdr *nsh = (struct nsh_hdr *)(skb->data);
+	size_t length;
+	u16 inner_proto;
+
+	if (ovs_key_mac_proto(key) != MAC_PROTO_NONE ||
+	    skb->protocol != htons(ETH_P_NSH)) {
+		return -EINVAL;
+	}
+
+	switch (nsh->np) {
+	case NSH_P_ETHERNET:
+		inner_proto = htons(ETH_P_TEB);
+		break;
+	case NSH_P_IPV4:
+		inner_proto = htons(ETH_P_IP);
+		break;
+	case NSH_P_IPV6:
+		inner_proto = htons(ETH_P_IPV6);
+		break;
+	case NSH_P_NSH:
+		inner_proto = htons(ETH_P_NSH);
+		break;
+	default:
+		return -ENOTSUPP;
+	}
+
+	length = nsh_hdr_len(nsh);
+	skb_pull(skb, length);
+	skb_reset_mac_header(skb);
+	skb_reset_mac_len(skb);
+	skb->protocol = inner_proto;
+
+	/* safe right before invalidate_flow_key */
+	if (inner_proto == htons(ETH_P_TEB))
+		key->mac_proto = MAC_PROTO_ETHERNET;
+	else
+		key->mac_proto = MAC_PROTO_NONE;
+	invalidate_flow_key(key);
+	return 0;
+}
+
 static void update_ip_l4_checksum(struct sk_buff *skb, struct iphdr *nh,
 				  __be32 addr, __be32 new_addr)
 {
@@ -602,6 +692,53 @@ static int set_ipv6(struct sk_buff *skb, struct sw_flow_key *flow_key,
 	return 0;
 }
 
+static int set_nsh(struct sk_buff *skb, struct sw_flow_key *flow_key,
+		   const struct ovs_key_nsh *key,
+		   const struct ovs_key_nsh *mask)
+{
+	struct nsh_hdr *nsh;
+	int err;
+	u8 flags;
+	u8 ttl;
+	int i;
+
+	err = skb_ensure_writable(skb, skb_network_offset(skb) +
+				  sizeof(struct nsh_hdr));
+	if (unlikely(err))
+		return err;
+
+	nsh = (struct nsh_hdr *)skb_network_header(skb);
+
+	flags = nsh_get_flags(nsh);
+	flags = OVS_MASKED(flags, key->flags, mask->flags);
+	flow_key->nsh.flags = flags;
+	ttl = nsh_get_ttl(nsh);
+	ttl = OVS_MASKED(ttl, key->ttl, mask->ttl);
+	flow_key->nsh.ttl = ttl;
+	nsh_set_flags_and_ttl(nsh, flags, ttl);
+	nsh->path_hdr = OVS_MASKED(nsh->path_hdr, key->path_hdr,
+				   mask->path_hdr);
+	flow_key->nsh.path_hdr = nsh->path_hdr;
+	switch (nsh->mdtype) {
+	case NSH_M_TYPE1:
+		for (i = 0; i < NSH_MD1_CONTEXT_SIZE; i++) {
+			nsh->md1.context[i] =
+			    OVS_MASKED(nsh->md1.context[i], key->context[i],
+				       mask->context[i]);
+		}
+		memcpy(flow_key->nsh.context, nsh->md1.context,
+		       sizeof(nsh->md1.context));
+		break;
+	case NSH_M_TYPE2:
+		memset(flow_key->nsh.context, 0,
+		       sizeof(flow_key->nsh.context));
+		break;
+	default:
+		return -EINVAL;
+	}
+	return 0;
+}
+
 /* Must follow skb_ensure_writable() since that can move the skb data. */
 static void set_tp_port(struct sk_buff *skb, __be16 *port,
 			__be16 new_port, __sum16 *check)
@@ -1024,6 +1161,32 @@ static int execute_masked_set_action(struct sk_buff *skb,
 				   get_mask(a, struct ovs_key_ethernet *));
 		break;
 
+	case OVS_KEY_ATTR_NSH: {
+		struct ovs_key_nsh nsh;
+		struct ovs_key_nsh nsh_mask;
+		size_t size = nla_len(a) / 2;
+		struct {
+			struct nlattr nla;
+			u8 data[size];
+		} attr, mask;
+
+		attr.nla.nla_type = nla_type(a);
+		attr.nla.nla_len = NLA_HDRLEN + size;
+		memcpy(attr.data, (char *)(a + 1), size);
+
+		mask.nla = attr.nla;
+		memcpy(mask.data, (char *)(a + 1) + size, size);
+
+		err = nsh_key_from_nlattr(&attr.nla, &nsh);
+		if (err)
+			break;
+		err = nsh_key_from_nlattr(&mask.nla, &nsh_mask);
+		if (err)
+			break;
+		err = set_nsh(skb, flow_key, &nsh, &nsh_mask);
+		break;
+	}
+
 	case OVS_KEY_ATTR_IPV4:
 		err = set_ipv4(skb, flow_key, nla_data(a),
 			       get_mask(a, struct ovs_key_ipv4 *));
@@ -1210,6 +1373,21 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
 		case OVS_ACTION_ATTR_POP_ETH:
 			err = pop_eth(skb, key);
 			break;
+
+		case OVS_ACTION_ATTR_PUSH_NSH: {
+			u8 buffer[NSH_HDR_MAX_LEN];
+			struct nsh_hdr *nsh_hdr = (struct nsh_hdr *)buffer;
+			const struct nsh_hdr *nsh_src = nsh_hdr;
+
+			nsh_hdr_from_nlattr(nla_data(a), nsh_hdr,
+					    NSH_HDR_MAX_LEN);
+			err = push_nsh(skb, key, nsh_src);
+			break;
+		}
+
+		case OVS_ACTION_ATTR_POP_NSH:
+			err = pop_nsh(skb, key);
+			break;
 		}
 
 		if (unlikely(err)) {
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index 8c94cef..34865b5 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -46,6 +46,7 @@
 #include <net/ipv6.h>
 #include <net/mpls.h>
 #include <net/ndisc.h>
+#include <net/nsh.h>
 
 #include "conntrack.h"
 #include "datapath.h"
@@ -490,6 +491,56 @@ static int parse_icmpv6(struct sk_buff *skb, struct sw_flow_key *key,
 	return 0;
 }
 
+static int parse_nsh(struct sk_buff *skb, struct sw_flow_key *key)
+{
+	struct nsh_hdr *nsh;
+	unsigned int nh_ofs = skb_network_offset(skb);
+	u8 version, length;
+	int err;
+
+	err = check_header(skb, nh_ofs + NSH_BASE_HDR_LEN);
+	if (unlikely(err))
+		return err;
+
+	nsh = (struct nsh_hdr *)skb_network_header(skb);
+	version = nsh_get_ver(nsh);
+	length = nsh_hdr_len(nsh);
+
+	if (version != 0)
+		return -EINVAL;
+
+	err = check_header(skb, nh_ofs + length);
+	if (unlikely(err))
+		return err;
+
+	nsh = (struct nsh_hdr *)skb_network_header(skb);
+	key->nsh.flags = nsh_get_flags(nsh);
+	key->nsh.ttl = nsh_get_ttl(nsh);
+	key->nsh.mdtype = nsh->mdtype;
+	key->nsh.np = nsh->np;
+	key->nsh.path_hdr = nsh->path_hdr;
+	switch (key->nsh.mdtype) {
+	case NSH_M_TYPE1:
+		if (length != NSH_M_TYPE1_LEN)
+			return -EINVAL;
+		memcpy(key->nsh.context, nsh->md1.context,
+		       sizeof(nsh->md1));
+		break;
+	case NSH_M_TYPE2:
+		/* Don't support MD type 2 metedata parsing yet */
+		if (length < NSH_BASE_HDR_LEN)
+			return -EINVAL;
+
+		memset(key->nsh.context, 0,
+		       sizeof(nsh->md1));
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 /**
  * key_extract - extracts a flow key from an Ethernet frame.
  * @skb: sk_buff that contains the frame, with skb->data pointing to the
@@ -735,6 +786,10 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key)
 				memset(&key->tp, 0, sizeof(key->tp));
 			}
 		}
+	} else if (key->eth.type == htons(ETH_P_NSH)) {
+		error = parse_nsh(skb, key);
+		if (error)
+			return error;
 	}
 	return 0;
 }
diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index 1875bba..6a3cd9c 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -35,6 +35,7 @@
 #include <net/inet_ecn.h>
 #include <net/ip_tunnels.h>
 #include <net/dst_metadata.h>
+#include <net/nsh.h>
 
 struct sk_buff;
 
@@ -66,6 +67,15 @@ struct vlan_head {
 	(offsetof(struct sw_flow_key, recirc_id) +	\
 	FIELD_SIZEOF(struct sw_flow_key, recirc_id))
 
+struct ovs_key_nsh {
+	u8 flags;
+	u8 ttl;
+	u8 mdtype;
+	u8 np;
+	__be32 path_hdr;
+	__be32 context[NSH_MD1_CONTEXT_SIZE];
+};
+
 struct sw_flow_key {
 	u8 tun_opts[IP_TUNNEL_OPTS_MAX];
 	u8 tun_opts_len;
@@ -144,6 +154,7 @@ struct sw_flow_key {
 			};
 		} ipv6;
 	};
+	struct ovs_key_nsh nsh;         /* network service header */
 	struct {
 		/* Connection tracking fields not packed above. */
 		struct {
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index e8eb427..3b7f166 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -78,9 +78,11 @@ static bool actions_may_change_flow(const struct nlattr *actions)
 		case OVS_ACTION_ATTR_HASH:
 		case OVS_ACTION_ATTR_POP_ETH:
 		case OVS_ACTION_ATTR_POP_MPLS:
+		case OVS_ACTION_ATTR_POP_NSH:
 		case OVS_ACTION_ATTR_POP_VLAN:
 		case OVS_ACTION_ATTR_PUSH_ETH:
 		case OVS_ACTION_ATTR_PUSH_MPLS:
+		case OVS_ACTION_ATTR_PUSH_NSH:
 		case OVS_ACTION_ATTR_PUSH_VLAN:
 		case OVS_ACTION_ATTR_SAMPLE:
 		case OVS_ACTION_ATTR_SET:
@@ -322,12 +324,27 @@ size_t ovs_tun_key_attr_size(void)
 		+ nla_total_size(2);   /* OVS_TUNNEL_KEY_ATTR_TP_DST */
 }
 
+size_t ovs_nsh_key_attr_size(void)
+{
+	/* Whenever adding new OVS_NSH_KEY_ FIELDS, we should consider
+	 * updating this function.
+	 */
+	return  nla_total_size(NSH_BASE_HDR_LEN) /* OVS_NSH_KEY_ATTR_BASE */
+		/* OVS_NSH_KEY_ATTR_MD1 and OVS_NSH_KEY_ATTR_MD2 are
+		 * mutually exclusive, so the bigger one can cover
+		 * the small one.
+		 *
+		 * OVS_NSH_KEY_ATTR_MD2
+		 */
+		+ nla_total_size(NSH_CTX_HDRS_MAX_LEN);
+}
+
 size_t ovs_key_attr_size(void)
 {
 	/* Whenever adding new OVS_KEY_ FIELDS, we should consider
 	 * updating this function.
 	 */
-	BUILD_BUG_ON(OVS_KEY_ATTR_TUNNEL_INFO != 28);
+	BUILD_BUG_ON(OVS_KEY_ATTR_TUNNEL_INFO != 29);
 
 	return    nla_total_size(4)   /* OVS_KEY_ATTR_PRIORITY */
 		+ nla_total_size(0)   /* OVS_KEY_ATTR_TUNNEL */
@@ -341,6 +358,8 @@ size_t ovs_key_attr_size(void)
 		+ nla_total_size(4)   /* OVS_KEY_ATTR_CT_MARK */
 		+ nla_total_size(16)  /* OVS_KEY_ATTR_CT_LABELS */
 		+ nla_total_size(40)  /* OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6 */
+		+ nla_total_size(0)   /* OVS_KEY_ATTR_NSH */
+		  + ovs_nsh_key_attr_size()
 		+ nla_total_size(12)  /* OVS_KEY_ATTR_ETHERNET */
 		+ nla_total_size(2)   /* OVS_KEY_ATTR_ETHERTYPE */
 		+ nla_total_size(4)   /* OVS_KEY_ATTR_VLAN */
@@ -373,6 +392,13 @@ static const struct ovs_len_tbl ovs_tunnel_key_lens[OVS_TUNNEL_KEY_ATTR_MAX + 1]
 	[OVS_TUNNEL_KEY_ATTR_IPV6_DST]      = { .len = sizeof(struct in6_addr) },
 };
 
+static const struct ovs_len_tbl
+ovs_nsh_key_attr_lens[OVS_NSH_KEY_ATTR_MAX + 1] = {
+	[OVS_NSH_KEY_ATTR_BASE]     = { .len = 8 },
+	[OVS_NSH_KEY_ATTR_MD1]      = { .len = 16 },
+	[OVS_NSH_KEY_ATTR_MD2]      = { .len = OVS_ATTR_VARIABLE },
+};
+
 /* The size of the argument for each %OVS_KEY_ATTR_* Netlink attribute.  */
 static const struct ovs_len_tbl ovs_key_lens[OVS_KEY_ATTR_MAX + 1] = {
 	[OVS_KEY_ATTR_ENCAP]	 = { .len = OVS_ATTR_NESTED },
@@ -405,6 +431,8 @@ static const struct ovs_len_tbl ovs_key_lens[OVS_KEY_ATTR_MAX + 1] = {
 		.len = sizeof(struct ovs_key_ct_tuple_ipv4) },
 	[OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6] = {
 		.len = sizeof(struct ovs_key_ct_tuple_ipv6) },
+	[OVS_KEY_ATTR_NSH]       = { .len = OVS_ATTR_NESTED,
+				     .next = ovs_nsh_key_attr_lens, },
 };
 
 static bool check_attr_len(unsigned int attr_len, unsigned int expected_len)
@@ -1179,6 +1207,303 @@ static int metadata_from_nlattrs(struct net *net, struct sw_flow_match *match,
 	return 0;
 }
 
+int nsh_hdr_from_nlattr(const struct nlattr *attr,
+			struct nsh_hdr *nsh, size_t size)
+{
+	struct nlattr *a;
+	int rem;
+	u8 flags = 0;
+	u8 ttl = 0;
+	int mdlen = 0;
+	bool has_md1 = false;
+	bool has_md2 = false;
+	u8 len;
+
+	nla_for_each_nested(a, attr, rem) {
+		int type = nla_type(a);
+
+		if (type > OVS_NSH_KEY_ATTR_MAX) {
+			OVS_NLERR(1, "nsh attr %d is out of range max %d",
+				  type, OVS_NSH_KEY_ATTR_MAX);
+			return -EINVAL;
+		}
+
+		if (!check_attr_len(nla_len(a),
+				    ovs_nsh_key_attr_lens[type].len)) {
+			OVS_NLERR(
+			    1,
+			    "nsh attr %d has unexpected len %d expected %d",
+			    type,
+			    nla_len(a),
+			    ovs_nsh_key_attr_lens[type].len
+			);
+			return -EINVAL;
+		}
+
+		switch (type) {
+		case OVS_NSH_KEY_ATTR_BASE: {
+			const struct ovs_nsh_key_base *base =
+				(struct ovs_nsh_key_base *)nla_data(a);
+			flags = base->flags;
+			ttl = base->ttl;
+			nsh->np = base->np;
+			nsh->mdtype = base->mdtype;
+			nsh->path_hdr = base->path_hdr;
+			break;
+		}
+		case OVS_NSH_KEY_ATTR_MD1: {
+			const struct ovs_nsh_key_md1 *md1 =
+				(struct ovs_nsh_key_md1 *)nla_data(a);
+			struct nsh_md1_ctx *md1_dst = &nsh->md1;
+
+			has_md1 = true;
+			mdlen = nla_len(a);
+			if (((mdlen + NSH_BASE_HDR_LEN) != NSH_M_TYPE1_LEN) ||
+			    ((mdlen + NSH_BASE_HDR_LEN) > size) ||
+			    (mdlen <= 0)) {
+				OVS_NLERR(
+				    1,
+				    "length %d of nsh attr %d is invalid",
+				    mdlen,
+				    type
+				);
+				return -EINVAL;
+			}
+			memcpy(md1_dst, md1, mdlen);
+			break;
+		}
+		case OVS_NSH_KEY_ATTR_MD2: {
+			const struct u8 *md2 = nla_data(a);
+			struct nsh_md2_tlv *md2_dst = &nsh->md2;
+
+			has_md2 = true;
+			mdlen = nla_len(a);
+			if (((mdlen + NSH_BASE_HDR_LEN) > size) ||
+			    (mdlen <= 0)) {
+				OVS_NLERR(
+				    1,
+				    "length %d of nsh attr %d is invalid",
+				    mdlen,
+				    type
+				);
+				return -EINVAL;
+			}
+			memcpy(md2_dst, md2, mdlen);
+			break;
+		}
+		default:
+			OVS_NLERR(1, "Unknown nsh attribute %d",
+				  type);
+			return -EINVAL;
+		}
+	}
+
+	if (rem > 0) {
+		OVS_NLERR(1, "nsh attribute has %d unknown bytes.", rem);
+		return -EINVAL;
+	}
+
+	if ((has_md1 && nsh->mdtype != NSH_M_TYPE1) ||
+	    (has_md2 && nsh->mdtype != NSH_M_TYPE2)) {
+		OVS_NLERR(1,
+			  "nsh attribute has unmatched MD type %d.",
+			  nsh->mdtype);
+		return -EINVAL;
+	}
+
+	if (unlikely(has_md1 && has_md2)) {
+		OVS_NLERR(1, "both nsh md1 and md2 attribute are there");
+		return -EINVAL;
+	}
+
+	if (unlikely(!has_md1 && !has_md2)) {
+		OVS_NLERR(1, "neither nsh md1 nor md2 attribute is there");
+		return -EINVAL;
+	}
+
+	/* nsh header length  = NSH_BASE_HDR_LEN + mdlen */
+	len = NSH_BASE_HDR_LEN + mdlen;
+	nsh_set_flags_ttl_len(nsh, flags, ttl, len);
+
+	return 0;
+}
+
+int nsh_key_from_nlattr(const struct nlattr *attr,
+			struct ovs_key_nsh *nsh)
+{
+	struct nlattr *a;
+	int rem;
+	bool has_md1 = false;
+
+	nla_for_each_nested(a, attr, rem) {
+		int type = nla_type(a);
+
+		if (type > OVS_NSH_KEY_ATTR_MAX) {
+			OVS_NLERR(1, "nsh attr %d is out of range max %d",
+				  type, OVS_NSH_KEY_ATTR_MAX);
+			return -EINVAL;
+		}
+
+		if (!check_attr_len(nla_len(a),
+				    ovs_nsh_key_attr_lens[type].len)) {
+			OVS_NLERR(
+			    1,
+			    "nsh attr %d has unexpected len %d expected %d",
+			    type,
+			    nla_len(a),
+			    ovs_nsh_key_attr_lens[type].len
+			);
+			return -EINVAL;
+		}
+
+		switch (type) {
+		case OVS_NSH_KEY_ATTR_BASE: {
+			const struct ovs_nsh_key_base *base =
+				(struct ovs_nsh_key_base *)nla_data(a);
+
+			memcpy(nsh, base, sizeof(*base));
+			break;
+		}
+		case OVS_NSH_KEY_ATTR_MD1: {
+			const struct ovs_nsh_key_md1 *md1 =
+				(struct ovs_nsh_key_md1 *)nla_data(a);
+
+			has_md1 = true;
+			memcpy(nsh->context, md1->context, sizeof(*md1));
+			break;
+		}
+		case OVS_NSH_KEY_ATTR_MD2:
+			/* Not supported yet */
+			return -ENOTSUPP;
+		default:
+			OVS_NLERR(1, "Unknown nsh attribute %d",
+				  type);
+			return -EINVAL;
+		}
+	}
+
+	if (rem > 0) {
+		OVS_NLERR(1, "nsh attribute has %d unknown bytes.", rem);
+		return -EINVAL;
+	}
+
+	if ((has_md1 && nsh->mdtype != NSH_M_TYPE1)) {
+		OVS_NLERR(1, "nsh attribute has unmatched MD type %d.",
+			  nsh->mdtype);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int nsh_key_put_from_nlattr(const struct nlattr *attr,
+				   struct sw_flow_match *match, bool is_mask,
+				   bool is_push_nsh, bool log)
+{
+	struct nlattr *a;
+	int rem;
+	bool has_md1 = false;
+	bool has_md2 = false;
+	u8 mdtype = 0;
+	int mdlen = 0;
+
+	nla_for_each_nested(a, attr, rem) {
+		int type = nla_type(a);
+		int i;
+
+		if (type > OVS_NSH_KEY_ATTR_MAX) {
+			OVS_NLERR(log, "nsh attr %d is out of range max %d",
+				  type, OVS_NSH_KEY_ATTR_MAX);
+			return -EINVAL;
+		}
+
+		if (!check_attr_len(nla_len(a),
+				    ovs_nsh_key_attr_lens[type].len)) {
+			OVS_NLERR(
+			    log,
+			    "nsh attr %d has unexpected len %d expected %d",
+			    type,
+			    nla_len(a),
+			    ovs_nsh_key_attr_lens[type].len
+			);
+			return -EINVAL;
+		}
+
+		switch (type) {
+		case OVS_NSH_KEY_ATTR_BASE: {
+			const struct ovs_nsh_key_base *base =
+				(struct ovs_nsh_key_base *)nla_data(a);
+
+			mdtype = base->mdtype;
+			SW_FLOW_KEY_PUT(match, nsh.flags,
+					base->flags, is_mask);
+			SW_FLOW_KEY_PUT(match, nsh.ttl,
+					base->ttl, is_mask);
+			SW_FLOW_KEY_PUT(match, nsh.mdtype,
+					base->mdtype, is_mask);
+			SW_FLOW_KEY_PUT(match, nsh.np,
+					base->np, is_mask);
+			SW_FLOW_KEY_PUT(match, nsh.path_hdr,
+					base->path_hdr, is_mask);
+			break;
+		}
+		case OVS_NSH_KEY_ATTR_MD1: {
+			const struct ovs_nsh_key_md1 *md1 =
+				(struct ovs_nsh_key_md1 *)nla_data(a);
+
+			has_md1 = true;
+			for (i = 0; i < NSH_MD1_CONTEXT_SIZE; i++)
+				SW_FLOW_KEY_PUT(match, nsh.context[i],
+						md1->context[i], is_mask);
+			break;
+		}
+		case OVS_NSH_KEY_ATTR_MD2:
+			if (!is_push_nsh) /* Not supported MD type 2 yet */
+				return -ENOTSUPP;
+
+			has_md2 = true;
+			mdlen = nla_len(a);
+			if ((mdlen > NSH_CTX_HDRS_MAX_LEN) ||
+			    (mdlen <= 0))
+				return -EINVAL;
+			break;
+		default:
+			OVS_NLERR(log, "Unknown nsh attribute %d",
+				  type);
+			return -EINVAL;
+		}
+	}
+
+	if (rem > 0) {
+		OVS_NLERR(log, "nsh attribute has %d unknown bytes.", rem);
+		return -EINVAL;
+	}
+
+	if (!is_mask) {
+		if ((has_md1 && mdtype != NSH_M_TYPE1)) {
+			OVS_NLERR(1, "nsh attribute has unmatched MD type %d.",
+				  mdtype);
+			return -EINVAL;
+		}
+	}
+
+	if (is_push_nsh & !is_mask) {
+		if ((has_md1 && mdtype != NSH_M_TYPE1) ||
+		    (has_md2 && mdtype != NSH_M_TYPE2) ||
+		    (has_md1 && has_md2) ||
+		    (!has_md1 && !has_md2)) {
+			OVS_NLERR(
+			    1,
+			    "push nsh attributes are invalid for type %d.",
+			    mdtype
+			);
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
 static int ovs_key_from_nlattrs(struct net *net, struct sw_flow_match *match,
 				u64 attrs, const struct nlattr **a,
 				bool is_mask, bool log)
@@ -1306,6 +1631,13 @@ static int ovs_key_from_nlattrs(struct net *net, struct sw_flow_match *match,
 		attrs &= ~(1 << OVS_KEY_ATTR_ARP);
 	}
 
+	if (attrs & (1 << OVS_KEY_ATTR_NSH)) {
+		if (nsh_key_put_from_nlattr(a[OVS_KEY_ATTR_NSH], match,
+					    is_mask, false, log) < 0)
+			return -EINVAL;
+		attrs &= ~(1 << OVS_KEY_ATTR_NSH);
+	}
+
 	if (attrs & (1 << OVS_KEY_ATTR_MPLS)) {
 		const struct ovs_key_mpls *mpls_key;
 
@@ -1622,6 +1954,40 @@ static int ovs_nla_put_vlan(struct sk_buff *skb, const struct vlan_head *vh,
 	return 0;
 }
 
+static int nsh_key_to_nlattr(const struct ovs_key_nsh *nsh, bool is_mask,
+			     struct sk_buff *skb)
+{
+	struct nlattr *start;
+	struct ovs_nsh_key_base base;
+	struct ovs_nsh_key_md1 md1;
+
+	memcpy(&base, nsh, sizeof(base));
+
+	if (is_mask || nsh->mdtype == NSH_M_TYPE1)
+		memcpy(md1.context, nsh->context, sizeof(md1));
+
+	start = nla_nest_start(skb, OVS_KEY_ATTR_NSH);
+	if (!start)
+		return -EMSGSIZE;
+
+	if (nla_put(skb, OVS_NSH_KEY_ATTR_BASE, sizeof(base), &base))
+		goto nla_put_failure;
+
+	if (is_mask || nsh->mdtype == NSH_M_TYPE1) {
+		if (nla_put(skb, OVS_NSH_KEY_ATTR_MD1, sizeof(md1), &md1))
+			goto nla_put_failure;
+	}
+
+	/* Don't support MD type 2 yet */
+
+	nla_nest_end(skb, start);
+
+	return 0;
+
+nla_put_failure:
+	return -EMSGSIZE;
+}
+
 static int __ovs_nla_put_key(const struct sw_flow_key *swkey,
 			     const struct sw_flow_key *output, bool is_mask,
 			     struct sk_buff *skb)
@@ -1750,6 +2116,9 @@ static int __ovs_nla_put_key(const struct sw_flow_key *swkey,
 		ipv6_key->ipv6_tclass = output->ip.tos;
 		ipv6_key->ipv6_hlimit = output->ip.ttl;
 		ipv6_key->ipv6_frag = output->ip.frag;
+	} else if (swkey->eth.type == htons(ETH_P_NSH)) {
+		if (nsh_key_to_nlattr(&output->nsh, is_mask, skb))
+			goto nla_put_failure;
 	} else if (swkey->eth.type == htons(ETH_P_ARP) ||
 		   swkey->eth.type == htons(ETH_P_RARP)) {
 		struct ovs_key_arp *arp_key;
@@ -2242,6 +2611,19 @@ static int validate_and_copy_set_tun(const struct nlattr *attr,
 	return err;
 }
 
+static bool validate_nsh(const struct nlattr *attr, bool is_mask,
+			 bool is_push_nsh, bool log)
+{
+	struct sw_flow_match match;
+	struct sw_flow_key key;
+	int ret = 0;
+
+	ovs_match_init(&match, &key, true, NULL);
+	ret = nsh_key_put_from_nlattr(attr, &match, is_mask,
+				      is_push_nsh, log);
+	return ((ret != 0) ? false : true);
+}
+
 /* Return false if there are any non-masked bits set.
  * Mask follows data immediately, before any netlink padding.
  */
@@ -2384,6 +2766,11 @@ static int validate_set(const struct nlattr *a,
 
 		break;
 
+	case OVS_KEY_ATTR_NSH:
+		if (!validate_nsh(nla_data(a), masked, false, log))
+			return -EINVAL;
+		break;
+
 	default:
 		return -EINVAL;
 	}
@@ -2482,6 +2869,8 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 			[OVS_ACTION_ATTR_TRUNC] = sizeof(struct ovs_action_trunc),
 			[OVS_ACTION_ATTR_PUSH_ETH] = sizeof(struct ovs_action_push_eth),
 			[OVS_ACTION_ATTR_POP_ETH] = 0,
+			[OVS_ACTION_ATTR_PUSH_NSH] = (u32)-1,
+			[OVS_ACTION_ATTR_POP_NSH] = 0,
 		};
 		const struct ovs_action_push_vlan *vlan;
 		int type = nla_type(a);
@@ -2636,6 +3025,19 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 			mac_proto = MAC_PROTO_ETHERNET;
 			break;
 
+		case OVS_ACTION_ATTR_PUSH_NSH:
+			mac_proto = MAC_PROTO_NONE;
+			if (!validate_nsh(nla_data(a), false, true, true))
+				return -EINVAL;
+			break;
+
+		case OVS_ACTION_ATTR_POP_NSH:
+			if (key->nsh.np == NSH_P_ETHERNET)
+				mac_proto = MAC_PROTO_ETHERNET;
+			else
+				mac_proto = MAC_PROTO_NONE;
+			break;
+
 		default:
 			OVS_NLERR(log, "Unknown Action type %d", type);
 			return -EINVAL;
diff --git a/net/openvswitch/flow_netlink.h b/net/openvswitch/flow_netlink.h
index 929c665..a849945 100644
--- a/net/openvswitch/flow_netlink.h
+++ b/net/openvswitch/flow_netlink.h
@@ -79,4 +79,8 @@ int ovs_nla_put_actions(const struct nlattr *attr,
 void ovs_nla_free_flow_actions(struct sw_flow_actions *);
 void ovs_nla_free_flow_actions_rcu(struct sw_flow_actions *);
 
+int nsh_key_from_nlattr(const struct nlattr *attr, struct ovs_key_nsh *nsh);
+int nsh_hdr_from_nlattr(const struct nlattr *attr, struct nsh_hdr *nsh_src,
+			size_t size);
+
 #endif /* flow_netlink.h */
-- 
2.5.5

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox