Netdev List
 help / color / mirror / Atom feed
* Re: [net 0/2] DPAA fixes
From: David Miller @ 2018-06-30  9:51 UTC (permalink / raw)
  To: madalin.bucur; +Cc: netdev, linux-kernel
In-Reply-To: <1530188811-8918-1-git-send-email-madalin.bucur@nxp.com>

From: Madalin Bucur <madalin.bucur@nxp.com>
Date: Thu, 28 Jun 2018 15:26:49 +0300

> A couple of fixes for the DPAA drivers, addressing an issue
> with short UDP or TCP frames (with padding) that were marked
> as having a wrong checksum and dropped by the FMan hardware
> and a problem with the buffer used for the scatter-gather
> table being too small as per the hardware requirements.

Series applied.

^ permalink raw reply

* [PATCH net-next 0/5] Introduce matching on double vlan/QinQ headers for TC flower
From: Jianbo Liu @ 2018-06-30  9:53 UTC (permalink / raw)
  To: netdev, davem, jiri; +Cc: Jianbo Liu

Currently TC flower supports only one vlan tag, it doesn't match on both outer
and inner vlan headers for QinQ. To do this, we add support to get both outer
and inner vlan headers for flow dissector, and then TC flower do matching on
those information.

We also plan to extend TC command to support this feature. We add new
cvlan_id/cvlan_prio/cvlan_ethtype keywords for inner vlan header. The existing
vlan_id/vlan_prio/vlan_ethtype are for outer vlan header, and vlan_ethtype must
be 802.1q or 802.1ad.

The examples for command and output are as the following.
# tc filter add dev ens1f1 parent ffff: protocol 802.1ad pref 33 \
        flower vlan_id 1000 vlan_ethtype 802.1q \
        cvlan_id 100 cvlan_ethtype ipv4 \
        action vlan pop \
        action vlan pop \
        action mirred egress redirect dev ens1f1_0

# tc filter show dev ens1f1 ingress
filter protocol 802.1ad pref 33 flower chain 0
filter protocol 802.1ad pref 33 flower chain 0 handle 0x1
  vlan_id 1000
  vlan_ethtype 802.1Q
  cvlan_id 100
  cvlan_ethtype ip
  eth_type ipv4
  in_hw
    ...

Jianbo Liu (5):
  net/flow_dissector: Save vlan ethertype from headers
  net/sched: flower: Add support for matching on vlan ethertype
  net/flow_dissector: Add support for QinQ dissection
  net/sched: flower: Dump the ethertype encapsulated in vlan
  net/sched: flower: Add supprt for matching on QinQ vlan headers

 include/net/flow_dissector.h |  4 ++-
 include/uapi/linux/pkt_cls.h |  4 +++
 net/core/flow_dissector.c    | 34 +++++++++++----------
 net/sched/cls_flower.c       | 70 ++++++++++++++++++++++++++++++++++++--------
 4 files changed, 83 insertions(+), 29 deletions(-)

-- 
2.9.5

^ permalink raw reply

* [PATCH net-next 1/5] net/flow_dissector: Save vlan ethertype from headers
From: Jianbo Liu @ 2018-06-30  9:53 UTC (permalink / raw)
  To: netdev, davem, jiri
  Cc: Jianbo Liu, Jiri Pirko, Simon Horman, Jakub Kicinski, Tom Herbert,
	Paolo Abeni, John Crispin, Sven Eckelmann, WANG Cong, David Ahern,
	Jon Maloy
In-Reply-To: <20180630095317.5691-1-jianbol@mellanox.com>

Change vlan dissector key to save vlan tpid to support both 802.1Q
and 802.1AD ethertype.

Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/flow_dissector.h | 2 +-
 net/core/flow_dissector.c    | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index adc24df5..8f89968 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -47,7 +47,7 @@ struct flow_dissector_key_tags {
 struct flow_dissector_key_vlan {
 	u16	vlan_id:12,
 		vlan_priority:3;
-	u16	padding;
+	__be16	vlan_tpid;
 };
 
 struct flow_dissector_key_mpls {
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 4fc1e84..e068ccf 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -751,6 +751,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 		const struct vlan_hdr *vlan;
 		struct vlan_hdr _vlan;
 		bool vlan_tag_present = skb && skb_vlan_tag_present(skb);
+		__be16 saved_vlan_tpid = proto;
 
 		if (vlan_tag_present)
 			proto = skb->protocol;
@@ -789,6 +790,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 					(ntohs(vlan->h_vlan_TCI) &
 					 VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT;
 			}
+			key_vlan->vlan_tpid = saved_vlan_tpid;
 		}
 
 		fdret = FLOW_DISSECT_RET_PROTO_AGAIN;
-- 
2.9.5

^ permalink raw reply related

* [PATCH net-next 2/5] net/sched: flower: Add support for matching on vlan ethertype
From: Jianbo Liu @ 2018-06-30  9:53 UTC (permalink / raw)
  To: netdev, davem, jiri; +Cc: Jianbo Liu, Jamal Hadi Salim, Cong Wang
In-Reply-To: <20180630095317.5691-1-jianbol@mellanox.com>

As flow dissector stores vlan ethertype, tc flower now can match on that.
It is to make preparation for supporting QinQ.

Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/cls_flower.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 4e74508..e6774af 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -483,6 +483,7 @@ static int fl_set_key_mpls(struct nlattr **tb,
 }
 
 static void fl_set_key_vlan(struct nlattr **tb,
+			    __be16 ethertype,
 			    struct flow_dissector_key_vlan *key_val,
 			    struct flow_dissector_key_vlan *key_mask)
 {
@@ -499,6 +500,8 @@ static void fl_set_key_vlan(struct nlattr **tb,
 			VLAN_PRIORITY_MASK;
 		key_mask->vlan_priority = VLAN_PRIORITY_MASK;
 	}
+	key_val->vlan_tpid = ethertype;
+	key_mask->vlan_tpid = cpu_to_be16(~0);
 }
 
 static void fl_set_key_flag(u32 flower_key, u32 flower_mask,
@@ -575,8 +578,8 @@ static int fl_set_key(struct net *net, struct nlattr **tb,
 	if (tb[TCA_FLOWER_KEY_ETH_TYPE]) {
 		ethertype = nla_get_be16(tb[TCA_FLOWER_KEY_ETH_TYPE]);
 
-		if (ethertype == htons(ETH_P_8021Q)) {
-			fl_set_key_vlan(tb, &key->vlan, &mask->vlan);
+		if (eth_type_vlan(ethertype)) {
+			fl_set_key_vlan(tb, ethertype, &key->vlan, &mask->vlan);
 			fl_set_key_val(tb, &key->basic.n_proto,
 				       TCA_FLOWER_KEY_VLAN_ETH_TYPE,
 				       &mask->basic.n_proto, TCA_FLOWER_UNSPEC,
-- 
2.9.5

^ permalink raw reply related

* [PATCH net-next 3/5] net/flow_dissector: Add support for QinQ dissection
From: Jianbo Liu @ 2018-06-30  9:53 UTC (permalink / raw)
  To: netdev, davem, jiri
  Cc: Jianbo Liu, Jiri Pirko, Jon Maloy, Simon Horman, Tom Herbert,
	Paolo Abeni, John Crispin, Sven Eckelmann, WANG Cong, David Ahern
In-Reply-To: <20180630095317.5691-1-jianbol@mellanox.com>

Dissect the QinQ packets to get both outer and inner vlan information,
then store to the extended flow keys.

Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/flow_dissector.h |  2 ++
 net/core/flow_dissector.c    | 32 +++++++++++++++++---------------
 2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 8f89968..c644067 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -206,6 +206,7 @@ enum flow_dissector_key_id {
 	FLOW_DISSECTOR_KEY_MPLS, /* struct flow_dissector_key_mpls */
 	FLOW_DISSECTOR_KEY_TCP, /* struct flow_dissector_key_tcp */
 	FLOW_DISSECTOR_KEY_IP, /* struct flow_dissector_key_ip */
+	FLOW_DISSECTOR_KEY_CVLAN, /* struct flow_dissector_key_flow_vlan */
 
 	FLOW_DISSECTOR_KEY_MAX,
 };
@@ -237,6 +238,7 @@ struct flow_keys {
 	struct flow_dissector_key_basic basic;
 	struct flow_dissector_key_tags tags;
 	struct flow_dissector_key_vlan vlan;
+	struct flow_dissector_key_vlan cvlan;
 	struct flow_dissector_key_keyid keyid;
 	struct flow_dissector_key_ports ports;
 	struct flow_dissector_key_addrs addrs;
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index e068ccf..09360ae 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -589,7 +589,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 	struct flow_dissector_key_tags *key_tags;
 	struct flow_dissector_key_vlan *key_vlan;
 	enum flow_dissect_ret fdret;
-	bool skip_vlan = false;
+	enum flow_dissector_key_id dissector_vlan = FLOW_DISSECTOR_KEY_MAX;
 	int num_hdrs = 0;
 	u8 ip_proto = 0;
 	bool ret;
@@ -748,15 +748,14 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 	}
 	case htons(ETH_P_8021AD):
 	case htons(ETH_P_8021Q): {
-		const struct vlan_hdr *vlan;
+		const struct vlan_hdr *vlan = NULL;
 		struct vlan_hdr _vlan;
-		bool vlan_tag_present = skb && skb_vlan_tag_present(skb);
 		__be16 saved_vlan_tpid = proto;
 
-		if (vlan_tag_present)
+		if (dissector_vlan == FLOW_DISSECTOR_KEY_MAX &&
+		    skb && skb_vlan_tag_present(skb)) {
 			proto = skb->protocol;
-
-		if (!vlan_tag_present || eth_type_vlan(skb->protocol)) {
+		} else {
 			vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan),
 						    data, hlen, &_vlan);
 			if (!vlan) {
@@ -766,20 +765,23 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 
 			proto = vlan->h_vlan_encapsulated_proto;
 			nhoff += sizeof(*vlan);
-			if (skip_vlan) {
-				fdret = FLOW_DISSECT_RET_PROTO_AGAIN;
-				break;
-			}
 		}
 
-		skip_vlan = true;
-		if (dissector_uses_key(flow_dissector,
-				       FLOW_DISSECTOR_KEY_VLAN)) {
+		if (dissector_vlan == FLOW_DISSECTOR_KEY_MAX) {
+			dissector_vlan = FLOW_DISSECTOR_KEY_VLAN;
+		} else if (dissector_vlan == FLOW_DISSECTOR_KEY_VLAN) {
+			dissector_vlan = FLOW_DISSECTOR_KEY_CVLAN;
+		} else {
+			fdret = FLOW_DISSECT_RET_PROTO_AGAIN;
+			break;
+		}
+
+		if (dissector_uses_key(flow_dissector, dissector_vlan)) {
 			key_vlan = skb_flow_dissector_target(flow_dissector,
-							     FLOW_DISSECTOR_KEY_VLAN,
+							     dissector_vlan,
 							     target_container);
 
-			if (vlan_tag_present) {
+			if (!vlan) {
 				key_vlan->vlan_id = skb_vlan_tag_get_id(skb);
 				key_vlan->vlan_priority =
 					(skb_vlan_tag_get_prio(skb) >> VLAN_PRIO_SHIFT);
-- 
2.9.5

^ permalink raw reply related

* [PATCH net-next 5/5] net/sched: flower: Add supprt for matching on QinQ vlan headers
From: Jianbo Liu @ 2018-06-30  9:53 UTC (permalink / raw)
  To: netdev, davem, jiri; +Cc: Jianbo Liu, Jamal Hadi Salim, Cong Wang
In-Reply-To: <20180630095317.5691-1-jianbol@mellanox.com>

As support dissecting of QinQ inner and outer vlan headers, user can
add rules to match on QinQ vlan headers.

Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/uapi/linux/pkt_cls.h |  4 +++
 net/sched/cls_flower.c       | 65 ++++++++++++++++++++++++++++++++++----------
 2 files changed, 55 insertions(+), 14 deletions(-)

diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 84e4c1d..c4262d9 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -469,6 +469,10 @@ enum {
 	TCA_FLOWER_KEY_IP_TTL,		/* u8 */
 	TCA_FLOWER_KEY_IP_TTL_MASK,	/* u8 */
 
+	TCA_FLOWER_KEY_CVLAN_ID,	/* be16 */
+	TCA_FLOWER_KEY_CVLAN_PRIO,	/* u8   */
+	TCA_FLOWER_KEY_CVLAN_ETH_TYPE,	/* be16 */
+
 	__TCA_FLOWER_MAX,
 };
 
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 30f3e18..e3ecffd 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -35,6 +35,7 @@ struct fl_flow_key {
 	struct flow_dissector_key_basic basic;
 	struct flow_dissector_key_eth_addrs eth;
 	struct flow_dissector_key_vlan vlan;
+	struct flow_dissector_key_vlan cvlan;
 	union {
 		struct flow_dissector_key_ipv4_addrs ipv4;
 		struct flow_dissector_key_ipv6_addrs ipv6;
@@ -432,6 +433,9 @@ static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 1] = {
 	[TCA_FLOWER_KEY_IP_TOS_MASK]	= { .type = NLA_U8 },
 	[TCA_FLOWER_KEY_IP_TTL]		= { .type = NLA_U8 },
 	[TCA_FLOWER_KEY_IP_TTL_MASK]	= { .type = NLA_U8 },
+	[TCA_FLOWER_KEY_CVLAN_ID]	= { .type = NLA_U16 },
+	[TCA_FLOWER_KEY_CVLAN_PRIO]	= { .type = NLA_U8 },
+	[TCA_FLOWER_KEY_CVLAN_ETH_TYPE]	= { .type = NLA_U16 },
 };
 
 static void fl_set_key_val(struct nlattr **tb,
@@ -484,19 +488,20 @@ static int fl_set_key_mpls(struct nlattr **tb,
 
 static void fl_set_key_vlan(struct nlattr **tb,
 			    __be16 ethertype,
+			    int vlan_id_key, int vlan_prio_key,
 			    struct flow_dissector_key_vlan *key_val,
 			    struct flow_dissector_key_vlan *key_mask)
 {
 #define VLAN_PRIORITY_MASK	0x7
 
-	if (tb[TCA_FLOWER_KEY_VLAN_ID]) {
+	if (tb[vlan_id_key]) {
 		key_val->vlan_id =
-			nla_get_u16(tb[TCA_FLOWER_KEY_VLAN_ID]) & VLAN_VID_MASK;
+			nla_get_u16(tb[vlan_id_key]) & VLAN_VID_MASK;
 		key_mask->vlan_id = VLAN_VID_MASK;
 	}
-	if (tb[TCA_FLOWER_KEY_VLAN_PRIO]) {
+	if (tb[vlan_prio_key]) {
 		key_val->vlan_priority =
-			nla_get_u8(tb[TCA_FLOWER_KEY_VLAN_PRIO]) &
+			nla_get_u8(tb[vlan_prio_key]) &
 			VLAN_PRIORITY_MASK;
 		key_mask->vlan_priority = VLAN_PRIORITY_MASK;
 	}
@@ -579,11 +584,25 @@ static int fl_set_key(struct net *net, struct nlattr **tb,
 		ethertype = nla_get_be16(tb[TCA_FLOWER_KEY_ETH_TYPE]);
 
 		if (eth_type_vlan(ethertype)) {
-			fl_set_key_vlan(tb, ethertype, &key->vlan, &mask->vlan);
-			fl_set_key_val(tb, &key->basic.n_proto,
-				       TCA_FLOWER_KEY_VLAN_ETH_TYPE,
-				       &mask->basic.n_proto, TCA_FLOWER_UNSPEC,
-				       sizeof(key->basic.n_proto));
+			fl_set_key_vlan(tb, ethertype, TCA_FLOWER_KEY_VLAN_ID,
+					TCA_FLOWER_KEY_VLAN_PRIO, &key->vlan,
+					&mask->vlan);
+
+			ethertype = nla_get_be16(tb[TCA_FLOWER_KEY_VLAN_ETH_TYPE]);
+			if (eth_type_vlan(ethertype)) {
+				fl_set_key_vlan(tb, ethertype,
+						TCA_FLOWER_KEY_CVLAN_ID,
+						TCA_FLOWER_KEY_CVLAN_PRIO,
+						&key->cvlan, &mask->cvlan);
+				fl_set_key_val(tb, &key->basic.n_proto,
+					       TCA_FLOWER_KEY_CVLAN_ETH_TYPE,
+					       &mask->basic.n_proto,
+					       TCA_FLOWER_UNSPEC,
+					       sizeof(key->basic.n_proto));
+			} else {
+				key->basic.n_proto = ethertype;
+				mask->basic.n_proto = cpu_to_be16(~0);
+			}
 		} else {
 			key->basic.n_proto = ethertype;
 			mask->basic.n_proto = cpu_to_be16(~0);
@@ -809,6 +828,8 @@ static void fl_init_dissector(struct fl_flow_mask *mask)
 	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
 			     FLOW_DISSECTOR_KEY_VLAN, vlan);
 	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+			     FLOW_DISSECTOR_KEY_CVLAN, cvlan);
+	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
 			     FLOW_DISSECTOR_KEY_ENC_KEYID, enc_key_id);
 	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
 			     FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS, enc_ipv4);
@@ -1143,6 +1164,7 @@ static int fl_dump_key_ip(struct sk_buff *skb,
 }
 
 static int fl_dump_key_vlan(struct sk_buff *skb,
+			    int vlan_id_key, int vlan_prio_key,
 			    struct flow_dissector_key_vlan *vlan_key,
 			    struct flow_dissector_key_vlan *vlan_mask)
 {
@@ -1151,13 +1173,13 @@ static int fl_dump_key_vlan(struct sk_buff *skb,
 	if (!memchr_inv(vlan_mask, 0, sizeof(*vlan_mask)))
 		return 0;
 	if (vlan_mask->vlan_id) {
-		err = nla_put_u16(skb, TCA_FLOWER_KEY_VLAN_ID,
+		err = nla_put_u16(skb, vlan_id_key,
 				  vlan_key->vlan_id);
 		if (err)
 			return err;
 	}
 	if (vlan_mask->vlan_priority) {
-		err = nla_put_u8(skb, TCA_FLOWER_KEY_VLAN_PRIO,
+		err = nla_put_u8(skb, vlan_prio_key,
 				 vlan_key->vlan_priority);
 		if (err)
 			return err;
@@ -1252,13 +1274,28 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
 	if (fl_dump_key_mpls(skb, &key->mpls, &mask->mpls))
 		goto nla_put_failure;
 
-	if (fl_dump_key_vlan(skb, &key->vlan, &mask->vlan))
+	if (fl_dump_key_vlan(skb, TCA_FLOWER_KEY_VLAN_ID,
+			     TCA_FLOWER_KEY_VLAN_PRIO, &key->vlan, &mask->vlan))
 		goto nla_put_failure;
 
-	if (mask->vlan.vlan_tpid &&
-	    nla_put_u16(skb, TCA_FLOWER_KEY_VLAN_ETH_TYPE, key->basic.n_proto))
+	if (fl_dump_key_vlan(skb, TCA_FLOWER_KEY_CVLAN_ID,
+			     TCA_FLOWER_KEY_CVLAN_PRIO,
+			     &key->cvlan, &mask->cvlan) ||
+	    (mask->cvlan.vlan_tpid &&
+	     nla_put_u16(skb, TCA_FLOWER_KEY_VLAN_ETH_TYPE,
+			 key->cvlan.vlan_tpid)))
 		goto nla_put_failure;
 
+	if (mask->cvlan.vlan_tpid) {
+		if (nla_put_u16(skb, TCA_FLOWER_KEY_CVLAN_ETH_TYPE,
+				key->basic.n_proto))
+			goto nla_put_failure;
+	} else if (mask->vlan.vlan_tpid) {
+		if (nla_put_u16(skb, TCA_FLOWER_KEY_VLAN_ETH_TYPE,
+				key->basic.n_proto))
+			goto nla_put_failure;
+	}
+
 	if ((key->basic.n_proto == htons(ETH_P_IP) ||
 	     key->basic.n_proto == htons(ETH_P_IPV6)) &&
 	    (fl_dump_key_val(skb, &key->basic.ip_proto, TCA_FLOWER_KEY_IP_PROTO,
-- 
2.9.5

^ permalink raw reply related

* [PATCH net-next 4/5] net/sched: flower: Dump the ethertype encapsulated in vlan
From: Jianbo Liu @ 2018-06-30  9:53 UTC (permalink / raw)
  To: netdev, davem, jiri; +Cc: Jianbo Liu, Jamal Hadi Salim, Cong Wang
In-Reply-To: <20180630095317.5691-1-jianbol@mellanox.com>

Currently the encapsulated ethertype is not dumped as it's the same as
TCA_FLOWER_KEY_ETH_TYPE keyvalue. But the dumping result is inconsistent
with input, we add dumping it with TCA_FLOWER_KEY_VLAN_ETH_TYPE.

Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/cls_flower.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index e6774af..30f3e18 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -1255,6 +1255,10 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
 	if (fl_dump_key_vlan(skb, &key->vlan, &mask->vlan))
 		goto nla_put_failure;
 
+	if (mask->vlan.vlan_tpid &&
+	    nla_put_u16(skb, TCA_FLOWER_KEY_VLAN_ETH_TYPE, key->basic.n_proto))
+		goto nla_put_failure;
+
 	if ((key->basic.n_proto == htons(ETH_P_IP) ||
 	     key->basic.n_proto == htons(ETH_P_IPV6)) &&
 	    (fl_dump_key_val(skb, &key->basic.ip_proto, TCA_FLOWER_KEY_IP_PROTO,
-- 
2.9.5

^ permalink raw reply related

* [PATCH iproute2/net-next] tc: flower: Add support for QinQ
From: Jianbo Liu @ 2018-06-30 10:01 UTC (permalink / raw)
  To: netdev, davem, jiri; +Cc: Jianbo Liu

To support matching on both outer and inner vlan headers,
we add new cvlan_id/cvlan_prio/cvlan_ethtype for inner vlan header.

Example:
# tc filter add dev eth0 protocol 802.1ad parent ffff: \
    flower vlan_id 1000 vlan_ethtype 802.1q \
        cvlan_id 100 cvlan_ethtype ipv4 \
    action vlan pop \
    action vlan pop \
    action mirred egress redirect dev eth1

# tc filter show dev eth0 ingress
filter protocol 802.1ad pref 1 flower chain 0
filter protocol 802.1ad pref 1 flower chain 0 handle 0x1
  vlan_id 1000
  vlan_ethtype 802.1Q
  cvlan_id 100
  cvlan_ethtype ip
  eth_type ipv4
  in_hw

Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/uapi/linux/pkt_cls.h |   4 ++
 man/man8/tc-flower.8         |  23 ++++++++++
 tc/f_flower.c                | 103 ++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 118 insertions(+), 12 deletions(-)

diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index be05e66..e7f7c52 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -468,6 +468,10 @@ enum {
 	TCA_FLOWER_KEY_IP_TTL,		/* u8 */
 	TCA_FLOWER_KEY_IP_TTL_MASK,	/* u8 */
 
+	TCA_FLOWER_KEY_CVLAN_ID,	/* be16 */
+	TCA_FLOWER_KEY_CVLAN_PRIO,	/* u8   */
+	TCA_FLOWER_KEY_CVLAN_ETH_TYPE,	/* be16 */
+
 	__TCA_FLOWER_MAX,
 };
 
diff --git a/man/man8/tc-flower.8 b/man/man8/tc-flower.8
index a561443..c684652 100644
--- a/man/man8/tc-flower.8
+++ b/man/man8/tc-flower.8
@@ -32,6 +32,12 @@ flower \- flow based traffic control filter
 .IR PRIORITY " | "
 .BR vlan_ethtype " { " ipv4 " | " ipv6 " | "
 .IR ETH_TYPE " } | "
+.B cvlan_id
+.IR VID " | "
+.B cvlan_prio
+.IR PRIORITY " | "
+.BR cvlan_ethtype " { " ipv4 " | " ipv6 " | "
+.IR ETH_TYPE " } | "
 .B mpls_label
 .IR LABEL " | "
 .B mpls_tc
@@ -132,6 +138,23 @@ Match on layer three protocol.
 .I VLAN_ETH_TYPE
 may be either
 .BR ipv4 ", " ipv6
+or an unsigned 16bit value in hexadecimal format. To match on QinQ packet, it must be 802.1Q or 802.1AD.
+.TP
+.BI cvlan_id " VID"
+Match on QinQ inner vlan tag id.
+.I VID
+is an unsigned 12bit value in decimal format.
+.TP
+.BI cvlan_prio " PRIORITY"
+Match on QinQ inner vlan tag priority.
+.I PRIORITY
+is an unsigned 3bit value in decimal format.
+.TP
+.BI cvlan_ethtype " VLAN_ETH_TYPE"
+Match on QinQ layer three protocol.
+.I VLAN_ETH_TYPE
+may be either
+.BR ipv4 ", " ipv6
 or an unsigned 16bit value in hexadecimal format.
 .TP
 .BI mpls_label " LABEL"
diff --git a/tc/f_flower.c b/tc/f_flower.c
index ba8eb66..71f2bc1 100644
--- a/tc/f_flower.c
+++ b/tc/f_flower.c
@@ -50,6 +50,9 @@ static void explain(void)
 		"                       vlan_id VID |\n"
 		"                       vlan_prio PRIORITY |\n"
 		"                       vlan_ethtype [ ipv4 | ipv6 | ETH-TYPE ] |\n"
+		"                       cvlan_id VID |\n"
+		"                       cvlan_prio PRIORITY |\n"
+		"                       cvlan_ethtype [ ipv4 | ipv6 | ETH-TYPE ] |\n"
 		"                       dst_mac MASKED-LLADDR |\n"
 		"                       src_mac MASKED-LLADDR |\n"
 		"                       ip_proto [tcp | udp | sctp | icmp | icmpv6 | IP-PROTO ] |\n"
@@ -128,15 +131,21 @@ err:
 	return err;
 }
 
+static bool eth_type_vlan(__be16 ethertype)
+{
+	return ethertype == htons(ETH_P_8021Q) ||
+	       ethertype == htons(ETH_P_8021AD);
+}
+
 static int flower_parse_vlan_eth_type(char *str, __be16 eth_type, int type,
 				      __be16 *p_vlan_eth_type,
 				      struct nlmsghdr *n)
 {
 	__be16 vlan_eth_type;
 
-	if (eth_type != htons(ETH_P_8021Q)) {
-		fprintf(stderr,
-			"Can't set \"vlan_ethtype\" if ethertype isn't 802.1Q\n");
+	if (!eth_type_vlan(eth_type)) {
+		fprintf(stderr, "Can't set \"%s\" if ethertype isn't 802.1Q or 802.1AD\n",
+			type == TCA_FLOWER_KEY_VLAN_ETH_TYPE ? "vlan_ethtype" : "cvlan_ethtype");
 		return -1;
 	}
 
@@ -586,6 +595,7 @@ static int flower_parse_opt(struct filter_util *qu, char *handle,
 	struct rtattr *tail;
 	__be16 eth_type = TC_H_MIN(t->tcm_info);
 	__be16 vlan_ethtype = 0;
+	__be16 cvlan_ethtype = 0;
 	__u8 ip_proto = 0xff;
 	__u32 flags = 0;
 	__u32 mtf = 0;
@@ -661,9 +671,8 @@ static int flower_parse_opt(struct filter_util *qu, char *handle,
 			__u16 vid;
 
 			NEXT_ARG();
-			if (eth_type != htons(ETH_P_8021Q)) {
-				fprintf(stderr,
-					"Can't set \"vlan_id\" if ethertype isn't 802.1Q\n");
+			if (!eth_type_vlan(eth_type)) {
+				fprintf(stderr, "Can't set \"vlan_id\" if ethertype isn't 802.1Q or 802.1AD\n");
 				return -1;
 			}
 			ret = get_u16(&vid, *argv, 10);
@@ -676,9 +685,8 @@ static int flower_parse_opt(struct filter_util *qu, char *handle,
 			__u8 vlan_prio;
 
 			NEXT_ARG();
-			if (eth_type != htons(ETH_P_8021Q)) {
-				fprintf(stderr,
-					"Can't set \"vlan_prio\" if ethertype isn't 802.1Q\n");
+			if (!eth_type_vlan(eth_type)) {
+				fprintf(stderr, "Can't set \"vlan_prio\" if ethertype isn't 802.1Q or 802.1AD\n");
 				return -1;
 			}
 			ret = get_u8(&vlan_prio, *argv, 10);
@@ -695,6 +703,42 @@ static int flower_parse_opt(struct filter_util *qu, char *handle,
 						 &vlan_ethtype, n);
 			if (ret < 0)
 				return -1;
+		} else if (matches(*argv, "cvlan_id") == 0) {
+			__u16 vid;
+
+			NEXT_ARG();
+			if (!eth_type_vlan(vlan_ethtype)) {
+				fprintf(stderr, "Can't set \"cvlan_id\" if inner vlan ethertype isn't 802.1Q or 802.1AD\n");
+				return -1;
+			}
+			ret = get_u16(&vid, *argv, 10);
+			if (ret < 0 || vid & ~0xfff) {
+				fprintf(stderr, "Illegal \"cvlan_id\"\n");
+				return -1;
+			}
+			addattr16(n, MAX_MSG, TCA_FLOWER_KEY_CVLAN_ID, vid);
+		} else if (matches(*argv, "cvlan_prio") == 0) {
+			__u8 cvlan_prio;
+
+			NEXT_ARG();
+			if (!eth_type_vlan(vlan_ethtype)) {
+				fprintf(stderr, "Can't set \"cvlan_prio\" if inner vlan ethertype isn't 802.1Q or 802.1AD\n");
+				return -1;
+			}
+			ret = get_u8(&cvlan_prio, *argv, 10);
+			if (ret < 0 || cvlan_prio & ~0x7) {
+				fprintf(stderr, "Illegal \"cvlan_prio\"\n");
+				return -1;
+			}
+			addattr8(n, MAX_MSG,
+				 TCA_FLOWER_KEY_CVLAN_PRIO, cvlan_prio);
+		} else if (matches(*argv, "cvlan_ethtype") == 0) {
+			NEXT_ARG();
+			ret = flower_parse_vlan_eth_type(*argv, vlan_ethtype,
+						 TCA_FLOWER_KEY_CVLAN_ETH_TYPE,
+						 &cvlan_ethtype, n);
+			if (ret < 0)
+				return -1;
 		} else if (matches(*argv, "mpls_label") == 0) {
 			__u32 label;
 
@@ -781,7 +825,8 @@ static int flower_parse_opt(struct filter_util *qu, char *handle,
 			}
 		} else if (matches(*argv, "ip_proto") == 0) {
 			NEXT_ARG();
-			ret = flower_parse_ip_proto(*argv, vlan_ethtype ?
+			ret = flower_parse_ip_proto(*argv, cvlan_ethtype ?
+						    cvlan_ethtype : vlan_ethtype ?
 						    vlan_ethtype : eth_type,
 						    TCA_FLOWER_KEY_IP_PROTO,
 						    &ip_proto, n);
@@ -811,7 +856,8 @@ static int flower_parse_opt(struct filter_util *qu, char *handle,
 			}
 		} else if (matches(*argv, "dst_ip") == 0) {
 			NEXT_ARG();
-			ret = flower_parse_ip_addr(*argv, vlan_ethtype ?
+			ret = flower_parse_ip_addr(*argv, cvlan_ethtype ?
+						   cvlan_ethtype : vlan_ethtype ?
 						   vlan_ethtype : eth_type,
 						   TCA_FLOWER_KEY_IPV4_DST,
 						   TCA_FLOWER_KEY_IPV4_DST_MASK,
@@ -824,7 +870,8 @@ static int flower_parse_opt(struct filter_util *qu, char *handle,
 			}
 		} else if (matches(*argv, "src_ip") == 0) {
 			NEXT_ARG();
-			ret = flower_parse_ip_addr(*argv, vlan_ethtype ?
+			ret = flower_parse_ip_addr(*argv, cvlan_ethtype ?
+						   cvlan_ethtype : vlan_ethtype ?
 						   vlan_ethtype : eth_type,
 						   TCA_FLOWER_KEY_IPV4_SRC,
 						   TCA_FLOWER_KEY_IPV4_SRC_MASK,
@@ -1374,6 +1421,38 @@ static int flower_print_opt(struct filter_util *qu, FILE *f,
 			   rta_getattr_u8(attr));
 	}
 
+	if (tb[TCA_FLOWER_KEY_VLAN_ETH_TYPE]) {
+		SPRINT_BUF(buf);
+		struct rtattr *attr = tb[TCA_FLOWER_KEY_VLAN_ETH_TYPE];
+
+		print_string(PRINT_ANY, "vlan_ethtype", "\n  vlan_ethtype %s",
+			     ll_proto_n2a(rta_getattr_u16(attr),
+			     buf, sizeof(buf)));
+	}
+
+	if (tb[TCA_FLOWER_KEY_CVLAN_ID]) {
+		struct rtattr *attr = tb[TCA_FLOWER_KEY_CVLAN_ID];
+
+		print_uint(PRINT_ANY, "cvlan_id", "\n  cvlan_id %u",
+			   rta_getattr_u16(attr));
+	}
+
+	if (tb[TCA_FLOWER_KEY_CVLAN_PRIO]) {
+		struct rtattr *attr = tb[TCA_FLOWER_KEY_CVLAN_PRIO];
+
+		print_uint(PRINT_ANY, "cvlan_prio", "\n  cvlan_prio %d",
+			   rta_getattr_u8(attr));
+	}
+
+	if (tb[TCA_FLOWER_KEY_CVLAN_ETH_TYPE]) {
+		SPRINT_BUF(buf);
+		struct rtattr *attr = tb[TCA_FLOWER_KEY_CVLAN_ETH_TYPE];
+
+		print_string(PRINT_ANY, "cvlan_ethtype", "\n  cvlan_ethtype %s",
+			     ll_proto_n2a(rta_getattr_u16(attr),
+			     buf, sizeof(buf)));
+	}
+
 	flower_print_eth_addr("dst_mac", tb[TCA_FLOWER_KEY_ETH_DST],
 			      tb[TCA_FLOWER_KEY_ETH_DST_MASK]);
 	flower_print_eth_addr("src_mac", tb[TCA_FLOWER_KEY_ETH_SRC],
-- 
2.9.5

^ permalink raw reply related

* [PATCH net-next] net, mm: avoid unnecessary memcg charge skmem
From: Yafang Shao @ 2018-06-30 10:09 UTC (permalink / raw)
  To: davem; +Cc: hannes, mhocko, cgroups, linux-mm, netdev, linux-kernel,
	Yafang Shao

In __sk_mem_raise_allocated(), if mem_cgroup_charge_skmem() return
false, mem_cgroup_uncharge_skmem will be executed.

The logic is as bellow,
__sk_mem_raise_allocated
	ret = mem_cgroup_uncharge_skmem
		try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages);
		return false
	if (!ret)
		mem_cgroup_uncharge_skmem(sk->sk_memcg, amt);

So it is unnecessary to charge if it is not forced.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/memcontrol.h |  3 ++-
 mm/memcontrol.c            | 12 +++++++++---
 net/core/sock.c            |  5 +++--
 net/ipv4/tcp_output.c      |  2 +-
 4 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 6c6fb11..56c07c9 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1160,7 +1160,8 @@ static inline void mem_cgroup_wb_stats(struct bdi_writeback *wb,
 #endif	/* CONFIG_CGROUP_WRITEBACK */
 
 struct sock;
-bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
+bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages,
+			     bool force);
 void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
 #ifdef CONFIG_MEMCG
 extern struct static_key_false memcg_sockets_enabled_key;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e6f0d5e..1122be2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5929,7 +5929,8 @@ void mem_cgroup_sk_free(struct sock *sk)
  * Charges @nr_pages to @memcg. Returns %true if the charge fit within
  * @memcg's configured limit, %false if the charge had to be forced.
  */
-bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
+bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages,
+			     bool force)
 {
 	gfp_t gfp_mask = GFP_KERNEL;
 
@@ -5940,7 +5941,10 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
 			memcg->tcpmem_pressure = 0;
 			return true;
 		}
-		page_counter_charge(&memcg->tcpmem, nr_pages);
+
+		if (force)
+			page_counter_charge(&memcg->tcpmem, nr_pages);
+
 		memcg->tcpmem_pressure = 1;
 		return false;
 	}
@@ -5954,7 +5958,9 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
 	if (try_charge(memcg, gfp_mask, nr_pages) == 0)
 		return true;
 
-	try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages);
+	if (force)
+		try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages);
+
 	return false;
 }
 
diff --git a/net/core/sock.c b/net/core/sock.c
index bcc4182..148a840 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2401,9 +2401,10 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
 {
 	struct proto *prot = sk->sk_prot;
 	long allocated = sk_memory_allocated_add(sk, amt);
+	bool charged = false;
 
 	if (mem_cgroup_sockets_enabled && sk->sk_memcg &&
-	    !mem_cgroup_charge_skmem(sk->sk_memcg, amt))
+	    !(charged = mem_cgroup_charge_skmem(sk->sk_memcg, amt, false)))
 		goto suppress_allocation;
 
 	/* Under limit. */
@@ -2465,7 +2466,7 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
 
 	sk_memory_allocated_sub(sk, amt);
 
-	if (mem_cgroup_sockets_enabled && sk->sk_memcg)
+	if (mem_cgroup_sockets_enabled && sk->sk_memcg && charged)
 		mem_cgroup_uncharge_skmem(sk->sk_memcg, amt);
 
 	return 0;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index f8f6129..9b741d4 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3014,7 +3014,7 @@ void sk_forced_mem_schedule(struct sock *sk, int size)
 	sk_memory_allocated_add(sk, amt);
 
 	if (mem_cgroup_sockets_enabled && sk->sk_memcg)
-		mem_cgroup_charge_skmem(sk->sk_memcg, amt);
+		mem_cgroup_charge_skmem(sk->sk_memcg, amt, true);
 }
 
 /* Send a FIN. The caller locks the socket for us.
-- 
1.8.3.1

^ permalink raw reply related

* Re: [patch net-next v2 0/9] net: sched: introduce chain templates support with offloading to mlxsw
From: Jiri Pirko @ 2018-06-30 10:12 UTC (permalink / raw)
  To: Cong Wang
  Cc: sridhar.samudrala, David Ahern, Jamal Hadi Salim,
	Linux Kernel Network Developers, David Miller, Jakub Kicinski,
	Simon Horman, john.hurley, mlxsw
In-Reply-To: <CAM_iQpVj_FHUBGsuBoCAPTbB5416MZ9GoFJVb-+a6C58NaFzJA@mail.gmail.com>

Sat, Jun 30, 2018 at 12:18:02AM CEST, xiyou.wangcong@gmail.com wrote:
>On Fri, Jun 29, 2018 at 10:06 AM Samudrala, Sridhar
><sridhar.samudrala@intel.com> wrote:
>>
>> So instead of introducing 'chaintemplate' object in the kernel, can't we add 'chain'
>> object in the kernel that takes the 'template' as an attribute?
>
>This is exactly what I mean above. Making the chain a standalone object
>in kernel would benefit:
>
>1. Align with 'tc chain' in iproute2, add/del an object is natural
>
>2. Template is an attribute of this object when creating it:
># tc chain add template ....
># tc chain add ... # non-template chain

Okay. So that would allow either create a chain or "chain with
template". Once that is done, there would be no means to manipulate the
template. One can only remove the chain.

What about refounting? I think it would make sense that this implicit
chain addition would take one reference. That means if later on the last
filter is removed, the chain would stay there until user removes it by
hand.

Okay. Sounds good to me. Will do.

Thanks!


>
>3. Easier for sharing by qdiscs:
># tc chain add X block Y ...
># tc filter add ... chain X block Y ...
># tc qdisc add dev eth0 block Y ...
>
>The current 'ingress_block 22 ingress' syntax is ugly.

^ permalink raw reply

* Re: [PATCH net-next] net, mm: avoid unnecessary memcg charge skmem
From: Yafang Shao @ 2018-06-30 10:53 UTC (permalink / raw)
  To: David Miller
  Cc: Johannes Weiner, Michal Hocko, Cgroups, Linux MM, netdev, LKML,
	Yafang Shao
In-Reply-To: <1530353397-12948-1-git-send-email-laoar.shao@gmail.com>

On Sat, Jun 30, 2018 at 6:09 PM, Yafang Shao <laoar.shao@gmail.com> wrote:
> In __sk_mem_raise_allocated(), if mem_cgroup_charge_skmem() return
> false, mem_cgroup_uncharge_skmem will be executed.
>
> The logic is as bellow,
> __sk_mem_raise_allocated
>         ret = mem_cgroup_uncharge_skmem
>                 try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages);
>                 return false
>         if (!ret)
>                 mem_cgroup_uncharge_skmem(sk->sk_memcg, amt);
>
> So it is unnecessary to charge if it is not forced.
>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
>  include/linux/memcontrol.h |  3 ++-
>  mm/memcontrol.c            | 12 +++++++++---
>  net/core/sock.c            |  5 +++--
>  net/ipv4/tcp_output.c      |  2 +-
>  4 files changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 6c6fb11..56c07c9 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -1160,7 +1160,8 @@ static inline void mem_cgroup_wb_stats(struct bdi_writeback *wb,
>  #endif /* CONFIG_CGROUP_WRITEBACK */
>
>  struct sock;
> -bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
> +bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages,
> +                            bool force);
>  void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
>  #ifdef CONFIG_MEMCG
>  extern struct static_key_false memcg_sockets_enabled_key;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e6f0d5e..1122be2 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5929,7 +5929,8 @@ void mem_cgroup_sk_free(struct sock *sk)
>   * Charges @nr_pages to @memcg. Returns %true if the charge fit within
>   * @memcg's configured limit, %false if the charge had to be forced.
>   */
> -bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
> +bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages,
> +                            bool force)
>  {
>         gfp_t gfp_mask = GFP_KERNEL;
>
> @@ -5940,7 +5941,10 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
>                         memcg->tcpmem_pressure = 0;
>                         return true;
>                 }
> -               page_counter_charge(&memcg->tcpmem, nr_pages);
> +
> +               if (force)
> +                       page_counter_charge(&memcg->tcpmem, nr_pages);
> +
>                 memcg->tcpmem_pressure = 1;
>                 return false;
>         }
> @@ -5954,7 +5958,9 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
>         if (try_charge(memcg, gfp_mask, nr_pages) == 0)
>                 return true;
>
> -       try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages);
> +       if (force)
> +               try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages);
> +
>         return false;
>  }
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index bcc4182..148a840 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -2401,9 +2401,10 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
>  {
>         struct proto *prot = sk->sk_prot;
>         long allocated = sk_memory_allocated_add(sk, amt);
> +       bool charged = false;
>
>         if (mem_cgroup_sockets_enabled && sk->sk_memcg &&
> -           !mem_cgroup_charge_skmem(sk->sk_memcg, amt))
> +           !(charged = mem_cgroup_charge_skmem(sk->sk_memcg, amt, false)))
>                 goto suppress_allocation;
>
>         /* Under limit. */
> @@ -2465,7 +2466,7 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
>
>         sk_memory_allocated_sub(sk, amt);
>
> -       if (mem_cgroup_sockets_enabled && sk->sk_memcg)
> +       if (mem_cgroup_sockets_enabled && sk->sk_memcg && charged)
>                 mem_cgroup_uncharge_skmem(sk->sk_memcg, amt);
>
>         return 0;
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index f8f6129..9b741d4 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -3014,7 +3014,7 @@ void sk_forced_mem_schedule(struct sock *sk, int size)
>         sk_memory_allocated_add(sk, amt);
>
>         if (mem_cgroup_sockets_enabled && sk->sk_memcg)
> -               mem_cgroup_charge_skmem(sk->sk_memcg, amt);
> +               mem_cgroup_charge_skmem(sk->sk_memcg, amt, true);
>  }
>
>  /* Send a FIN. The caller locks the socket for us.
> --
> 1.8.3.1
>

Pls. ignore this patch.

Sorry about the noise.

Thanks
Yafang

^ permalink raw reply

* Re: [PATCH 3/3] net: dsa: Add Vitesse VSC73xx DSA router driver
From: Linus Walleij @ 2018-06-30 11:05 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, Vivien Didelot, netdev, OpenWrt Development List,
	LEDE Development List, Gabor Juhos
In-Reply-To: <3ad23735-38be-16ea-3bd0-c77eccc22f16@gmail.com>

Hi Florian,

Thanks for your review!

will send a v2 shortly, just some minor feedbacks:

On Thu, Jun 14, 2018 at 6:51 PM Florian Fainelli <f.fainelli@gmail.com> wrote:
> On 06/14/2018 05:35 AM, Linus Walleij wrote:

> > This driver (currently) only takes control of the switch chip over
> > SPI and configures it to route packages around when connected to a
> > CPU port. The chip has embedded PHYs and VLAN support so we model it
> > using DSA as a best fit so we can easily add VLAN support and maybe
> > later also exploit the internal frame header to get more direct
> > control over the switch.
>
> Yes having the internal frame header working would be really great,
> DSA_TAG_PROTO_NONE is really difficult to use without knowing all the
> DSA details which reminds that we should have the following action items:
>
> - document how DSA_TAG_PROTO_NONE behave differently with respect to
> bridges/VLAN configuration and the DSA master device
>
> - possibly introduce DSA_TAG_PROTO_8021Q which would automatically
> partition ports by allocating one VLAN ID per-port (e.g: from top to
> bottom) that would effectively offer the same features/paradigms as what
> a proper header would offer (Port separation, if nothing else) and it
> could be made seemingly automatic from within DSA

To me this makes a lot of sense. I haven't implemented VLAN yet
for this switch chip, but for the Realtek RTL8366RB that I'm also
working with, the vendor driver already does more or less exactly
this by default. It would be great to have the kernel just put this
in place so we have it under control.

The RTL8366 chips does not even seem to have an internal
frame format AFAICT, it seems they don't know what that is.
So they will likely always have to use VLAN for this.

> > The four built-in GPIO lines are exposed using a standard GPIO chip.
>
> What are those typically used for out of curiosity? Is this to connect
> to an EEPROM?

No, there are actually separate address and data lines for connecting
an EEPROM and booting the 8051 CPU from it if need be.

This is likely mostly for the stand-alone usecase. The VSC73xx
and other Vitesse chips are only used with another CPU (like
one running Linux) as an exception, IIUC. The usual usecase is
to put the Vitesse chip with an EEPROM and SW from the vendor
running on the 8051, and box it up as a rack-mounted or small
switch or router.

The vendor provides an SDK for the 8051 CPU including an
SNMP implementation so it can be remotely managed.

If you use this to build a rack-mounted or boxed product,
you likely want to add some "reset" key or a few LEDs to
indicate whatever, and then those 4 GPIO lines come in
handy.

With another OS, like Linux running, these lines can be used
for whatever you want, just an extra GPIO expander
essentially.

> > +     u16                     chipid;
> > +     bool                    is_vsc7385;
> > +     bool                    is_vsc7388;
> > +     bool                    is_vsc7395;
> > +     bool                    is_vsc7398;
>
> How about having an u16/u32 chip_id instead?

OK I already have the u16 chipid there.

So I added some macros like IS_VSC7385() instead.

> > +     /* The switch internally uses a 8 byte header with length,
> > +      * source port, tag, LPA and priority. This is supposedly
> > +      * only accessible when operating the switch using the internal
> > +      * CPU or with an external CPU mapping the device in, but not
> > +      * when operating the switch over SPI and putting frames in/out
> > +      * on port 6 (the CPU port). So far we must assume that we
> > +      * cannot access the tag. (See "Internal frame header" section
> > +      * 3.9.1 in the manual.)
>
> I would be really good if we could get this to work even in SPI with the
> CPU controlling the switch, I cannot really think of why the 8051 would
> have to be involved, because having the 8051 means either the switch is
> entirely standalone and runs off an EEPROM (which is additional cost on
> your BOM), or the host, through SPI can entirely take over.

If you think back on the original usecase in a switch or router it does
make a bit of sense. Another OS is just in the back seat, the 8051
is supposed to do the heavy lifting here.

This is different from devices without an internal CPU where
some "outside OS" (Linux) is supposed to deal with the raw
DSA-tagged frames.

OpenWRT have a bunch of firmware from vendors being uploaded
to the 8051 for different hardware:
http://mirror2.openwrt.org/sources/vsc73x5-ucode.tar.bz2

These look very small, for the internal RAM. I think those are
usually quite silly hacks and unnecessary, the only thing that really
makes sense to load over there is the full SNMP stack that one
can get from the vendor. I haven't disassembled their code though.
That's why I think these devices are better off using this driver
that just take control of the device over SPI.

But it would be neat to figure out if we can get access to the
raw frames. :/

> Is the datasheet public somehow?

Partminer accidentally managed to download and archive it at
one point, but it seems it disappeared from the link:
https://www.buyparts.io/download.php?id=6263858&t=ih

Since Microchips are much more liberal with datasheets I
expect it to become public ASAP, I just got access to all datasheets
when I asked.

> > +static void vsc73xx_adjust_link(struct dsa_switch *ds, int port,
> > +                             struct phy_device *phydev)
> > +{
> > +     struct vsc73xx *vsc = ds->priv;
> > +     u32 val;
>
> No is_pseudo_fixed_link() check, you really have to do all of this for
> each front-panel port? That is really bad if that is the case... most
> switches with front-panel built-in PHYs are at the very least capable of
> re-configuring their internal MAC accordingly.

Sadly this is needed. The manual explicitly says that the MAC
has to be set up depending on speed immediately after the PHY
comes up. I guess the ASIC engineer skipped any integration
of the PHY and MAC and they are just connected, not really
talking to each other.

Luckily the .adjust_link() callback is handy for this.

> > +static const struct of_device_id vsc73xx_of_match[] = {
> > +     {
> > +             .compatible = "vitesse,vsc7385",
>
> Would not you want to pass additional data here, like the possible port
> layout/chip id to be reading?

Since the chip is self-identifying (chip ID) it is not necessary.

If I wanted to be overzealous I could check that the internal
chip ID matches the compatible string (I have seen that some
drivers do that.)

Yours,
Linus Walleij

^ permalink raw reply

* Re: [PATCH 3/3] net: dsa: Add Vitesse VSC73xx DSA router driver
From: Linus Walleij @ 2018-06-30 11:05 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, Vivien Didelot, netdev, OpenWrt Development List,
	LEDE Development List, Gabor Juhos
In-Reply-To: <3ad23735-38be-16ea-3bd0-c77eccc22f16@gmail.com>

Hi Florian,

Thanks for your review!

will send a v2 shortly, just some minor feedbacks:

On Thu, Jun 14, 2018 at 6:51 PM Florian Fainelli <f.fainelli@gmail.com> wrote:
> On 06/14/2018 05:35 AM, Linus Walleij wrote:

> > This driver (currently) only takes control of the switch chip over
> > SPI and configures it to route packages around when connected to a
> > CPU port. The chip has embedded PHYs and VLAN support so we model it
> > using DSA as a best fit so we can easily add VLAN support and maybe
> > later also exploit the internal frame header to get more direct
> > control over the switch.
>
> Yes having the internal frame header working would be really great,
> DSA_TAG_PROTO_NONE is really difficult to use without knowing all the
> DSA details which reminds that we should have the following action items:
>
> - document how DSA_TAG_PROTO_NONE behave differently with respect to
> bridges/VLAN configuration and the DSA master device
>
> - possibly introduce DSA_TAG_PROTO_8021Q which would automatically
> partition ports by allocating one VLAN ID per-port (e.g: from top to
> bottom) that would effectively offer the same features/paradigms as what
> a proper header would offer (Port separation, if nothing else) and it
> could be made seemingly automatic from within DSA

To me this makes a lot of sense. I haven't implemented VLAN yet
for this switch chip, but for the Realtek RTL8366RB that I'm also
working with, the vendor driver already does more or less exactly
this by default. It would be great to have the kernel just put this
in place so we have it under control.

The RTL8366 chips does not even seem to have an internal
frame format AFAICT, it seems they don't know what that is.
So they will likely always have to use VLAN for this.

> > The four built-in GPIO lines are exposed using a standard GPIO chip.
>
> What are those typically used for out of curiosity? Is this to connect
> to an EEPROM?

No, there are actually separate address and data lines for connecting
an EEPROM and booting the 8051 CPU from it if need be.

This is likely mostly for the stand-alone usecase. The VSC73xx
and other Vitesse chips are only used with another CPU (like
one running Linux) as an exception, IIUC. The usual usecase is
to put the Vitesse chip with an EEPROM and SW from the vendor
running on the 8051, and box it up as a rack-mounted or small
switch or router.

The vendor provides an SDK for the 8051 CPU including an
SNMP implementation so it can be remotely managed.

If you use this to build a rack-mounted or boxed product,
you likely want to add some "reset" key or a few LEDs to
indicate whatever, and then those 4 GPIO lines come in
handy.

With another OS, like Linux running, these lines can be used
for whatever you want, just an extra GPIO expander
essentially.

> > +     u16                     chipid;
> > +     bool                    is_vsc7385;
> > +     bool                    is_vsc7388;
> > +     bool                    is_vsc7395;
> > +     bool                    is_vsc7398;
>
> How about having an u16/u32 chip_id instead?

OK I already have the u16 chipid there.

So I added some macros like IS_VSC7385() instead.

> > +     /* The switch internally uses a 8 byte header with length,
> > +      * source port, tag, LPA and priority. This is supposedly
> > +      * only accessible when operating the switch using the internal
> > +      * CPU or with an external CPU mapping the device in, but not
> > +      * when operating the switch over SPI and putting frames in/out
> > +      * on port 6 (the CPU port). So far we must assume that we
> > +      * cannot access the tag. (See "Internal frame header" section
> > +      * 3.9.1 in the manual.)
>
> I would be really good if we could get this to work even in SPI with the
> CPU controlling the switch, I cannot really think of why the 8051 would
> have to be involved, because having the 8051 means either the switch is
> entirely standalone and runs off an EEPROM (which is additional cost on
> your BOM), or the host, through SPI can entirely take over.

If you think back on the original usecase in a switch or router it does
make a bit of sense. Another OS is just in the back seat, the 8051
is supposed to do the heavy lifting here.

This is different from devices without an internal CPU where
some "outside OS" (Linux) is supposed to deal with the raw
DSA-tagged frames.

OpenWRT have a bunch of firmware from vendors being uploaded
to the 8051 for different hardware:
http://mirror2.openwrt.org/sources/vsc73x5-ucode.tar.bz2

These look very small, for the internal RAM. I think those are
usually quite silly hacks and unnecessary, the only thing that really
makes sense to load over there is the full SNMP stack that one
can get from the vendor. I haven't disassembled their code though.
That's why I think these devices are better off using this driver
that just take control of the device over SPI.

But it would be neat to figure out if we can get access to the
raw frames. :/

> Is the datasheet public somehow?

Partminer accidentally managed to download and archive it at
one point, but it seems it disappeared from the link:
https://www.buyparts.io/download.php?id=6263858&t=ih

Since Microchips are much more liberal with datasheets I
expect it to become public ASAP, I just got access to all datasheets
when I asked.

> > +static void vsc73xx_adjust_link(struct dsa_switch *ds, int port,
> > +                             struct phy_device *phydev)
> > +{
> > +     struct vsc73xx *vsc = ds->priv;
> > +     u32 val;
>
> No is_pseudo_fixed_link() check, you really have to do all of this for
> each front-panel port? That is really bad if that is the case... most
> switches with front-panel built-in PHYs are at the very least capable of
> re-configuring their internal MAC accordingly.

Sadly this is needed. The manual explicitly says that the MAC
has to be set up depending on speed immediately after the PHY
comes up. I guess the ASIC engineer skipped any integration
of the PHY and MAC and they are just connected, not really
talking to each other.

Luckily the .adjust_link() callback is handy for this.

> > +static const struct of_device_id vsc73xx_of_match[] = {
> > +     {
> > +             .compatible = "vitesse,vsc7385",
>
> Would not you want to pass additional data here, like the possible port
> layout/chip id to be reading?

Since the chip is self-identifying (chip ID) it is not necessary.

If I wanted to be overzealous I could check that the internal
chip ID matches the compatible string (I have seen that some
drivers do that.)

Yours,
Linus Walleij

^ permalink raw reply

* Re: [PATCH v3 net-next 0/5] Fixes coding style in xilinx_emaclite.c
From: David Miller @ 2018-06-30 11:16 UTC (permalink / raw)
  To: radhey.shyam.pandey
  Cc: michal.simek, joe, andrew, netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <1530191510-10310-1-git-send-email-radhey.shyam.pandey@xilinx.com>

From: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Date: Thu, 28 Jun 2018 18:41:45 +0530

> This patchset fixes checkpatch and kernel-doc warnings in
> xilinx emaclite driver. No functional change.
> 
> Changes from v2:
> -In 2/5 patch refactor if-else to make failure path return early.
> -In 2/5 patch coalesce the format onto a single line and add the
> missing space after the comma.

Series applied.

^ permalink raw reply

* Re: [PATCH] net: cleanup gfp mask in alloc_skb_with_frags
From: David Miller @ 2018-06-30 11:19 UTC (permalink / raw)
  To: mhocko; +Cc: edumazet, netdev, linux-kernel, mhocko
In-Reply-To: <20180628155306.29038-1-mhocko@kernel.org>

From: Michal Hocko <mhocko@kernel.org>
Date: Thu, 28 Jun 2018 17:53:06 +0200

> From: Michal Hocko <mhocko@suse.com>
> 
> alloc_skb_with_frags uses __GFP_NORETRY for non-sleeping allocations
> which is just a noop and a little bit confusing.
> 
> __GFP_NORETRY was added by ed98df3361f0 ("net: use __GFP_NORETRY for
> high order allocations") to prevent from the OOM killer. Yet this was
> not enough because fb05e7a89f50 ("net: don't wait for order-3 page
> allocation") didn't want an excessive reclaim for non-costly orders
> so it made it completely NOWAIT while it preserved __GFP_NORETRY in
> place which is now redundant.
> 
> Drop the pointless __GFP_NORETRY because this function is used as
> copy&paste source for other places.
> 
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Applied.

^ permalink raw reply

* [PATCH 1/3 v2] net: dsa: Add DT bindings for Vitesse VSC73xx switches
From: Linus Walleij @ 2018-06-30 11:17 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli
  Cc: netdev, openwrt-devel, LEDE Development List, Gabor Juhos,
	Linus Walleij, devicetree

This adds the device tree bindings for the Vitesse VSC73xx
switches. We also add the vendor name for Vitesse.

Cc: devicetree@vger.kernel.org
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
ChangeLog v1->v2:
- Fix spelling error
- Properly reference the GPIO bindings
- Collect Florians ACK
---
 .../bindings/net/dsa/vitesse,vsc73xx.txt      | 81 +++++++++++++++++++
 .../devicetree/bindings/vendor-prefixes.txt   |  1 +
 2 files changed, 82 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/dsa/vitesse,vsc73xx.txt

diff --git a/Documentation/devicetree/bindings/net/dsa/vitesse,vsc73xx.txt b/Documentation/devicetree/bindings/net/dsa/vitesse,vsc73xx.txt
new file mode 100644
index 000000000000..ed4710c40641
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/dsa/vitesse,vsc73xx.txt
@@ -0,0 +1,81 @@
+Vitesse VSC73xx Switches
+========================
+
+This defines device tree bindings for the Vitesse VSC73xx switch chips.
+The Vitesse company has been acquired by Microsemi and Microsemi in turn
+acquired by Microchip but retains this vendor branding.
+
+The currently supported switch chips are:
+Vitesse VSC7385 SparX-G5 5+1-port Integrated Gigabit Ethernet Switch
+Vitesse VSC7388 SparX-G8 8-port Integrated Gigabit Ethernet Switch
+Vitesse VSC7395 SparX-G5e 5+1-port Integrated Gigabit Ethernet Switch
+Vitesse VSC7398 SparX-G8e 8-port Integrated Gigabit Ethernet Switch
+
+The device tree node is an SPI device so it must reside inside a SPI bus
+device tree node, see spi/spi-bus.txt
+
+Required properties:
+
+- compatible: must be exactly one of:
+	"vitesse,vsc7385"
+	"vitesse,vsc7388"
+	"vitesse,vsc7395"
+	"vitesse,vsc7398"
+- gpio-controller: indicates that this switch is also a GPIO controller,
+  see gpio/gpio.txt
+- #gpio-cells: this must be set to <2> and indicates that we are a twocell
+  GPIO controller, see gpio/gpio.txt
+
+Optional properties:
+
+- reset-gpios: a handle to a GPIO line that can issue reset of the chip.
+  It should be tagged as active low.
+
+Required subnodes:
+
+See net/dsa/dsa.txt for a list of additional required and optional properties
+and subnodes of DSA switches.
+
+Examples:
+
+switch@0 {
+	compatible = "vitesse,vsc7395";
+	reg = <0>;
+	/* Specified for 2.5 MHz or below */
+	spi-max-frequency = <2500000>;
+	gpio-controller;
+	#gpio-cells = <2>;
+
+	ports {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		port@0 {
+			reg = <0>;
+			label = "lan1";
+		};
+		port@1 {
+			reg = <1>;
+			label = "lan2";
+		};
+		port@2 {
+			reg = <2>;
+			label = "lan3";
+		};
+		port@3 {
+			reg = <3>;
+			label = "lan4";
+		};
+		vsc: port@6 {
+			reg = <6>;
+			label = "cpu";
+			ethernet = <&gmac1>;
+			phy-mode = "rgmii";
+			fixed-link {
+				speed = <1000>;
+				full-duplex;
+				pause;
+			};
+		};
+	};
+};
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index 7cad066191ee..3e5398f87eac 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -395,6 +395,7 @@ v3	V3 Semiconductor
 variscite	Variscite Ltd.
 via	VIA Technologies, Inc.
 virtio	Virtual I/O Device Specification, developed by the OASIS consortium
+vitesse	Vitesse Semiconductor Corporation
 vivante	Vivante Corporation
 vocore VoCore Studio
 voipac	Voipac Technologies s.r.o.
-- 
2.17.1

^ permalink raw reply related

* [PATCH 0/2 ] cdc_ncm: Handle multicast Ethernet traffic
From: Miguel Rodríguez Pérez @ 2018-06-30 11:21 UTC (permalink / raw)
  To: linux-usb; +Cc: netdev

Sending again, as the previous try had the wrong subjects. Sorry about that.

Dell D6000 dock (and I guess other docks too) exposes a CDC_NCM device
for Ethernet traffic. However, multicast Ethernet traffic is not
processed making IPv6 not functional. Other services, like mDNS used for
LAN service discovery are also hindered.

The actual reason is that CDC_NCM driver was not processing requests to
filter (admit) multicast traffic. I provide two patches to the linux
kernel that admit all Ethernet multicast traffic whenever a multicast
group is being joined.

The solution is not optimal, as it makes the system receive more traffic
than that strictly needed, but otherwise this only happens when the
computer is connected to a dock and thus is running on AC power. I
believe it is not worth the hassle to join only the requested groups.
This is the same that is done in the CDN_ETHER driver.

Best regards,

-- 
Miguel Rodríguez Pérez
Laboratorio de Redes
EE Telecomunicación – Universidade de Vigo

^ permalink raw reply

* [PATCH 2/3 v2] net: phy: vitesse: Add support for VSC73xx
From: Linus Walleij @ 2018-06-30 11:17 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli
  Cc: netdev, openwrt-devel, LEDE Development List, Gabor Juhos,
	Linus Walleij
In-Reply-To: <20180630111731.19551-1-linus.walleij@linaro.org>

The VSC7385, VSC7388, VSC7395 and VSC7398 are integrated
switch/router chips for 5+1 or 8-port switches/routers. When
managed directly by Linux using DSA we need to do a special
set-up "dance" on the PHY. Unfortunately these sequences
switches the PHY to undocumented pages named 2a30 and 52b6
and does undocumented things. It is described by these opaque
sequences also in the reference manual. This is a best
effort to integrate it anyways.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
ChangeLog v1->v2:
- Drop <linux/delay.h> from an earlier iteration.
- Implement an .config_aneg() routine that does nothing: the
  imroved genphy_config_aneg() makes the device fail as of
  v4.18-rc1 and the device seems to feel best like this: it
  comes up in autonegotiation mode and we do not try to instruct
  it.
- Use some MII defines when reading/writing registers.
- Collect Florian's ACK.
---
 drivers/net/phy/vitesse.c | 175 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 175 insertions(+)

diff --git a/drivers/net/phy/vitesse.c b/drivers/net/phy/vitesse.c
index d9dd8fbfffc7..fbf9ad429593 100644
--- a/drivers/net/phy/vitesse.c
+++ b/drivers/net/phy/vitesse.c
@@ -72,6 +72,10 @@
 #define PHY_ID_VSC8572			0x000704d0
 #define PHY_ID_VSC8574			0x000704a0
 #define PHY_ID_VSC8601			0x00070420
+#define PHY_ID_VSC7385			0x00070450
+#define PHY_ID_VSC7388			0x00070480
+#define PHY_ID_VSC7395			0x00070550
+#define PHY_ID_VSC7398			0x00070580
 #define PHY_ID_VSC8662			0x00070660
 #define PHY_ID_VSC8221			0x000fc550
 #define PHY_ID_VSC8211			0x000fc4b0
@@ -116,6 +120,137 @@ static int vsc824x_config_init(struct phy_device *phydev)
 	return err;
 }
 
+#define VSC73XX_EXT_PAGE_ACCESS 0x1f
+
+static int vsc73xx_read_page(struct phy_device *phydev)
+{
+	return __phy_read(phydev, VSC73XX_EXT_PAGE_ACCESS);
+}
+
+static int vsc73xx_write_page(struct phy_device *phydev, int page)
+{
+	return __phy_write(phydev, VSC73XX_EXT_PAGE_ACCESS, page);
+}
+
+static void vsc73xx_config_init(struct phy_device *phydev)
+{
+	/* Receiver init */
+	phy_write(phydev, 0x1f, 0x2a30);
+	phy_modify(phydev, 0x0c, 0x0300, 0x0200);
+	phy_write(phydev, 0x1f, 0x0000);
+
+	/* Config LEDs 0x61 */
+	phy_modify(phydev, MII_TPISTATUS, 0xff00, 0x0061);
+}
+
+static int vsc738x_config_init(struct phy_device *phydev)
+{
+	u16 rev;
+	/* This magic sequence appear in the application note
+	 * "VSC7385/7388 PHY Configuration".
+	 *
+	 * Maybe one day we will get to know what it all means.
+	 */
+	phy_write(phydev, 0x1f, 0x2a30);
+	phy_modify(phydev, 0x08, 0x0200, 0x0200);
+	phy_write(phydev, 0x1f, 0x52b5);
+	phy_write(phydev, 0x10, 0xb68a);
+	phy_modify(phydev, 0x12, 0xff07, 0x0003);
+	phy_modify(phydev, 0x11, 0x00ff, 0x00a2);
+	phy_write(phydev, 0x10, 0x968a);
+	phy_write(phydev, 0x1f, 0x2a30);
+	phy_modify(phydev, 0x08, 0x0200, 0x0000);
+	phy_write(phydev, 0x1f, 0x0000);
+
+	/* Read revision */
+	rev = phy_read(phydev, MII_PHYSID2);
+	rev &= 0x0f;
+
+	/* Special quirk for revision 0 */
+	if (rev == 0) {
+		phy_write(phydev, 0x1f, 0x2a30);
+		phy_modify(phydev, 0x08, 0x0200, 0x0200);
+		phy_write(phydev, 0x1f, 0x52b5);
+		phy_write(phydev, 0x12, 0x0000);
+		phy_write(phydev, 0x11, 0x0689);
+		phy_write(phydev, 0x10, 0x8f92);
+		phy_write(phydev, 0x1f, 0x52b5);
+		phy_write(phydev, 0x12, 0x0000);
+		phy_write(phydev, 0x11, 0x0e35);
+		phy_write(phydev, 0x10, 0x9786);
+		phy_write(phydev, 0x1f, 0x2a30);
+		phy_modify(phydev, 0x08, 0x0200, 0x0000);
+		phy_write(phydev, 0x17, 0xff80);
+		phy_write(phydev, 0x17, 0x0000);
+	}
+
+	phy_write(phydev, 0x1f, 0x0000);
+	phy_write(phydev, 0x12, 0x0048);
+
+	if (rev == 0) {
+		phy_write(phydev, 0x1f, 0x2a30);
+		phy_write(phydev, 0x14, 0x6600);
+		phy_write(phydev, 0x1f, 0x0000);
+		phy_write(phydev, 0x18, 0xa24e);
+	} else {
+		phy_write(phydev, 0x1f, 0x2a30);
+		phy_modify(phydev, 0x16, 0x0fc0, 0x0240);
+		phy_modify(phydev, 0x14, 0x6000, 0x4000);
+		/* bits 14-15 in extended register 0x14 controls DACG amplitude
+		 * 6 = -8%, 2 is hardware default
+		 */
+		phy_write(phydev, 0x1f, 0x0001);
+		phy_modify(phydev, 0x14, 0xe000, 0x6000);
+		phy_write(phydev, 0x1f, 0x0000);
+	}
+
+	vsc73xx_config_init(phydev);
+
+	return genphy_config_init(phydev);
+}
+
+static int vsc739x_config_init(struct phy_device *phydev)
+{
+	/* This magic sequence appears in the VSC7395 SparX-G5e application
+	 * note "VSC7395/VSC7398 PHY Configuration"
+	 *
+	 * Maybe one day we will get to know what it all means.
+	 */
+	phy_write(phydev, 0x1f, 0x2a30);
+	phy_modify(phydev, 0x08, 0x0200, 0x0200);
+	phy_write(phydev, 0x1f, 0x52b5);
+	phy_write(phydev, 0x10, 0xb68a);
+	phy_modify(phydev, 0x12, 0xff07, 0x0003);
+	phy_modify(phydev, 0x11, 0x00ff, 0x00a2);
+	phy_write(phydev, 0x10, 0x968a);
+	phy_write(phydev, 0x1f, 0x2a30);
+	phy_modify(phydev, 0x08, 0x0200, 0x0000);
+	phy_write(phydev, 0x1f, 0x0000);
+
+	phy_write(phydev, 0x1f, 0x0000);
+	phy_write(phydev, 0x12, 0x0048);
+	phy_write(phydev, 0x1f, 0x2a30);
+	phy_modify(phydev, 0x16, 0x0fc0, 0x0240);
+	phy_modify(phydev, 0x14, 0x6000, 0x4000);
+	phy_write(phydev, 0x1f, 0x0001);
+	phy_modify(phydev, 0x14, 0xe000, 0x6000);
+	phy_write(phydev, 0x1f, 0x0000);
+
+	vsc73xx_config_init(phydev);
+
+	return genphy_config_init(phydev);
+}
+
+static int vsc73xx_config_aneg(struct phy_device *phydev)
+{
+	/* The VSC73xx switches does not like to be instructed to
+	 * do autonegotiation in any way, it prefers that you just go
+	 * with the power-on/reset defaults. Writing some registers will
+	 * just make autonegotiation permanently fail.
+	 */
+	return 0;
+}
+
 /* This adds a skew for both TX and RX clocks, so the skew should only be
  * applied to "rgmii-id" interfaces. It may not work as expected
  * on "rgmii-txid", "rgmii-rxid" or "rgmii" interfaces. */
@@ -318,6 +453,42 @@ static struct phy_driver vsc82xx_driver[] = {
 	.config_init    = &vsc8601_config_init,
 	.ack_interrupt  = &vsc824x_ack_interrupt,
 	.config_intr    = &vsc82xx_config_intr,
+}, {
+	.phy_id         = PHY_ID_VSC7385,
+	.name           = "Vitesse VSC7385",
+	.phy_id_mask    = 0x000ffff0,
+	.features       = PHY_GBIT_FEATURES,
+	.config_init    = vsc738x_config_init,
+	.config_aneg    = vsc73xx_config_aneg,
+	.read_page      = vsc73xx_read_page,
+	.write_page     = vsc73xx_write_page,
+}, {
+	.phy_id         = PHY_ID_VSC7388,
+	.name           = "Vitesse VSC7388",
+	.phy_id_mask    = 0x000ffff0,
+	.features       = PHY_GBIT_FEATURES,
+	.config_init    = vsc738x_config_init,
+	.config_aneg    = vsc73xx_config_aneg,
+	.read_page      = vsc73xx_read_page,
+	.write_page     = vsc73xx_write_page,
+}, {
+	.phy_id         = PHY_ID_VSC7395,
+	.name           = "Vitesse VSC7395",
+	.phy_id_mask    = 0x000ffff0,
+	.features       = PHY_GBIT_FEATURES,
+	.config_init    = vsc739x_config_init,
+	.config_aneg    = vsc73xx_config_aneg,
+	.read_page      = vsc73xx_read_page,
+	.write_page     = vsc73xx_write_page,
+}, {
+	.phy_id         = PHY_ID_VSC7398,
+	.name           = "Vitesse VSC7398",
+	.phy_id_mask    = 0x000ffff0,
+	.features       = PHY_GBIT_FEATURES,
+	.config_init    = vsc739x_config_init,
+	.config_aneg    = vsc73xx_config_aneg,
+	.read_page      = vsc73xx_read_page,
+	.write_page     = vsc73xx_write_page,
 }, {
 	.phy_id         = PHY_ID_VSC8662,
 	.name           = "Vitesse VSC8662",
@@ -358,6 +529,10 @@ static struct mdio_device_id __maybe_unused vitesse_tbl[] = {
 	{ PHY_ID_VSC8514, 0x000ffff0 },
 	{ PHY_ID_VSC8572, 0x000ffff0 },
 	{ PHY_ID_VSC8574, 0x000ffff0 },
+	{ PHY_ID_VSC7385, 0x000ffff0 },
+	{ PHY_ID_VSC7388, 0x000ffff0 },
+	{ PHY_ID_VSC7395, 0x000ffff0 },
+	{ PHY_ID_VSC7398, 0x000ffff0 },
 	{ PHY_ID_VSC8662, 0x000ffff0 },
 	{ PHY_ID_VSC8221, 0x000ffff0 },
 	{ PHY_ID_VSC8211, 0x000ffff0 },
-- 
2.17.1

^ permalink raw reply related

* [PATCH 3/3 v2] net: dsa: Add Vitesse VSC73xx DSA router driver
From: Linus Walleij @ 2018-06-30 11:17 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli
  Cc: netdev, openwrt-devel, LEDE Development List, Gabor Juhos,
	Linus Walleij
In-Reply-To: <20180630111731.19551-1-linus.walleij@linaro.org>

This adds a DSA driver for:

Vitesse VSC7385 SparX-G5 5-port Integrated Gigabit Ethernet Switch
Vitesse VSC7388 SparX-G8 8-port Integrated Gigabit Ethernet Switch
Vitesse VSC7395 SparX-G5e 5+1-port Integrated Gigabit Ethernet Switch
Vitesse VSC7398 SparX-G8e 8-port Integrated Gigabit Ethernet Switch

These switches have a built-in 8051 CPU and can download and execute
firmware in this CPU. They can also be configured to use an external
CPU handling the switch in a memory-mapped manner by connecting to
that external CPU's memory bus.

This driver (currently) only takes control of the switch chip over
SPI and configures it to route packages around when connected to a
CPU port. The chip has embedded PHYs and VLAN support so we model it
using DSA as a best fit so we can easily add VLAN support and maybe
later also exploit the internal frame header to get more direct
control over the switch.

The four built-in GPIO lines are exposed using a standard GPIO chip.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
ChangeLog v1->v2:
- Update .get_strings() and .get_sset_count() to match the signature
  with the new argument for sset type.
- Drop DSA trailer select from Kconfig.
- Use MII_* namespace definitions instead of hard-coded hex
  values and calls to read into the PHY: instead use the genphy
  decoded link state already present in the phydev.
- Drop extraneous port number check in .enable() and .disable(),
  this should not happen.
- Move the GPIO chip set-up into a separate function.
- Add some missing static in front of a counter function.
- Drop the bool flags in the state container, use some macros with
  the chipid to identify model instead like IS_VSC739X().
- Check phydev->interface() for configuring the CPU port into
  RGMII mode.
---
 drivers/net/dsa/Kconfig           |   11 +
 drivers/net/dsa/Makefile          |    1 +
 drivers/net/dsa/vitesse-vsc73xx.c | 1364 +++++++++++++++++++++++++++++
 3 files changed, 1376 insertions(+)
 create mode 100644 drivers/net/dsa/vitesse-vsc73xx.c

diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
index 2b81b97e994f..733653d8359a 100644
--- a/drivers/net/dsa/Kconfig
+++ b/drivers/net/dsa/Kconfig
@@ -76,4 +76,15 @@ config NET_DSA_SMSC_LAN9303_MDIO
 	  Enable access functions if the SMSC/Microchip LAN9303 is configured
 	  for MDIO managed mode.
 
+config NET_DSA_VITESSE_VSC73XX
+	tristate "Vitesse VSC7385/7388/7395/7398 support"
+	depends on OF && SPI
+	depends on NET_DSA
+	select FIXED_PHY
+	select VITESSE_PHY
+	select GPIOLIB
+	---help---
+	  This enables support for the Vitesse VSC7385, VSC7388,
+	  VSC7395 and VSC7398 SparX integrated ethernet switches.
+
 endmenu
diff --git a/drivers/net/dsa/Makefile b/drivers/net/dsa/Makefile
index 15c2a831edf1..d4f873ae2f6a 100644
--- a/drivers/net/dsa/Makefile
+++ b/drivers/net/dsa/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_NET_DSA_QCA8K)	+= qca8k.o
 obj-$(CONFIG_NET_DSA_SMSC_LAN9303) += lan9303-core.o
 obj-$(CONFIG_NET_DSA_SMSC_LAN9303_I2C) += lan9303_i2c.o
 obj-$(CONFIG_NET_DSA_SMSC_LAN9303_MDIO) += lan9303_mdio.o
+obj-$(CONFIG_NET_DSA_VITESSE_VSC73XX) += vitesse-vsc73xx.o
 obj-y				+= b53/
 obj-y				+= microchip/
 obj-y				+= mv88e6xxx/
diff --git a/drivers/net/dsa/vitesse-vsc73xx.c b/drivers/net/dsa/vitesse-vsc73xx.c
new file mode 100644
index 000000000000..a4fc260006de
--- /dev/null
+++ b/drivers/net/dsa/vitesse-vsc73xx.c
@@ -0,0 +1,1364 @@
+// SPDX-License-Identifier: GPL-2.0
+/* DSA driver for:
+ * Vitesse VSC7385 SparX-G5 5+1-port Integrated Gigabit Ethernet Switch
+ * Vitesse VSC7388 SparX-G8 8-port Integrated Gigabit Ethernet Switch
+ * Vitesse VSC7395 SparX-G5e 5+1-port Integrated Gigabit Ethernet Switch
+ * Vitesse VSC7398 SparX-G8e 8-port Integrated Gigabit Ethernet Switch
+ *
+ * These switches have a built-in 8051 CPU and can download and execute a
+ * firmware in this CPU. They can also be configured to use an external CPU
+ * handling the switch in a memory-mapped manner by connecting to that external
+ * CPU's memory bus.
+ *
+ * This driver (currently) only takes control of the switch chip over SPI and
+ * configures it to route packages around when connected to a CPU port. The
+ * chip has embedded PHYs and VLAN support so we model it using DSA.
+ *
+ * Copyright (C) 2018 Linus Wallej <linus.walleij@linaro.org>
+ * Includes portions of code from the firmware uploader by:
+ * Copyright (C) 2009 Gabor Juhos <juhosg@openwrt.org>
+ */
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/device.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/of_mdio.h>
+#include <linux/platform_device.h>
+#include <linux/spi/spi.h>
+#include <linux/bitops.h>
+#include <linux/if_bridge.h>
+#include <linux/etherdevice.h>
+#include <linux/gpio/consumer.h>
+#include <linux/gpio/driver.h>
+#include <linux/random.h>
+#include <net/dsa.h>
+
+#define VSC73XX_BLOCK_MAC	0x1 /* Subblocks 0-4, 6 (CPU port) */
+#define VSC73XX_BLOCK_ANALYZER	0x2 /* Only subblock 0 */
+#define VSC73XX_BLOCK_MII	0x3 /* Subblocks 0 and 1 */
+#define VSC73XX_BLOCK_MEMINIT	0x3 /* Only subblock 2 */
+#define VSC73XX_BLOCK_CAPTURE	0x4 /* Only subblock 2 */
+#define VSC73XX_BLOCK_ARBITER	0x5 /* Only subblock 0 */
+#define VSC73XX_BLOCK_SYSTEM	0x7 /* Only subblock 0 */
+
+#define CPU_PORT	6 /* CPU port */
+
+/* MAC Block registers */
+#define VSC73XX_MAC_CFG		0x00
+#define VSC73XX_MACHDXGAP	0x02
+#define VSC73XX_FCCONF		0x04
+#define VSC73XX_FCMACHI		0x08
+#define VSC73XX_FCMACLO		0x0c
+#define VSC73XX_MAXLEN		0x10
+#define VSC73XX_ADVPORTM	0x19
+#define VSC73XX_TXUPDCFG	0x24
+#define VSC73XX_TXQ_SELECT_CFG	0x28
+#define VSC73XX_RXOCT		0x50
+#define VSC73XX_TXOCT		0x51
+#define VSC73XX_C_RX0		0x52
+#define VSC73XX_C_RX1		0x53
+#define VSC73XX_C_RX2		0x54
+#define VSC73XX_C_TX0		0x55
+#define VSC73XX_C_TX1		0x56
+#define VSC73XX_C_TX2		0x57
+#define VSC73XX_C_CFG		0x58
+#define VSC73XX_CAT_DROP	0x6e
+#define VSC73XX_CAT_PR_MISC_L2	0x6f
+#define VSC73XX_CAT_PR_USR_PRIO	0x75
+#define VSC73XX_Q_MISC_CONF	0xdf
+
+/* MAC_CFG register bits */
+#define VSC73XX_MAC_CFG_WEXC_DIS	BIT(31)
+#define VSC73XX_MAC_CFG_PORT_RST	BIT(29)
+#define VSC73XX_MAC_CFG_TX_EN		BIT(28)
+#define VSC73XX_MAC_CFG_SEED_LOAD	BIT(27)
+#define VSC73XX_MAC_CFG_SEED_MASK	GENMASK(26, 19)
+#define VSC73XX_MAC_CFG_SEED_OFFSET	19
+#define VSC73XX_MAC_CFG_FDX		BIT(18)
+#define VSC73XX_MAC_CFG_GIGA_MODE	BIT(17)
+#define VSC73XX_MAC_CFG_RX_EN		BIT(16)
+#define VSC73XX_MAC_CFG_VLAN_DBLAWR	BIT(15)
+#define VSC73XX_MAC_CFG_VLAN_AWR	BIT(14)
+#define VSC73XX_MAC_CFG_100_BASE_T	BIT(13) /* Not in manual */
+#define VSC73XX_MAC_CFG_TX_IPG_MASK	GENMASK(10, 6)
+#define VSC73XX_MAC_CFG_TX_IPG_OFFSET	6
+#define VSC73XX_MAC_CFG_TX_IPG_1000M	(6 << VSC73XX_MAC_CFG_TX_IPG_OFFSET)
+#define VSC73XX_MAC_CFG_TX_IPG_100_10M	(17 << VSC73XX_MAC_CFG_TX_IPG_OFFSET)
+#define VSC73XX_MAC_CFG_MAC_RX_RST	BIT(5)
+#define VSC73XX_MAC_CFG_MAC_TX_RST	BIT(4)
+#define VSC73XX_MAC_CFG_CLK_SEL_MASK	GENMASK(2, 0)
+#define VSC73XX_MAC_CFG_CLK_SEL_OFFSET	0
+#define VSC73XX_MAC_CFG_CLK_SEL_1000M	1
+#define VSC73XX_MAC_CFG_CLK_SEL_100M	2
+#define VSC73XX_MAC_CFG_CLK_SEL_10M	3
+#define VSC73XX_MAC_CFG_CLK_SEL_EXT	4
+
+#define VSC73XX_MAC_CFG_1000M_F_PHY	(VSC73XX_MAC_CFG_FDX | \
+					 VSC73XX_MAC_CFG_GIGA_MODE | \
+					 VSC73XX_MAC_CFG_TX_IPG_1000M | \
+					 VSC73XX_MAC_CFG_CLK_SEL_EXT)
+#define VSC73XX_MAC_CFG_100_10M_F_PHY	(VSC73XX_MAC_CFG_FDX | \
+					 VSC73XX_MAC_CFG_TX_IPG_100_10M | \
+					 VSC73XX_MAC_CFG_CLK_SEL_EXT)
+#define VSC73XX_MAC_CFG_100_10M_H_PHY	(VSC73XX_MAC_CFG_TX_IPG_100_10M | \
+					 VSC73XX_MAC_CFG_CLK_SEL_EXT)
+#define VSC73XX_MAC_CFG_1000M_F_RGMII	(VSC73XX_MAC_CFG_FDX | \
+					 VSC73XX_MAC_CFG_GIGA_MODE | \
+					 VSC73XX_MAC_CFG_TX_IPG_1000M | \
+					 VSC73XX_MAC_CFG_CLK_SEL_1000M)
+#define VSC73XX_MAC_CFG_RESET		(VSC73XX_MAC_CFG_PORT_RST | \
+					 VSC73XX_MAC_CFG_MAC_RX_RST | \
+					 VSC73XX_MAC_CFG_MAC_TX_RST)
+
+/* Flow control register bits */
+#define VSC73XX_FCCONF_ZERO_PAUSE_EN	BIT(17)
+#define VSC73XX_FCCONF_FLOW_CTRL_OBEY	BIT(16)
+#define VSC73XX_FCCONF_PAUSE_VAL_MASK	GENMASK(15, 0)
+
+/* ADVPORTM advanced port setup register bits */
+#define VSC73XX_ADVPORTM_IFG_PPM	BIT(7)
+#define VSC73XX_ADVPORTM_EXC_COL_CONT	BIT(6)
+#define VSC73XX_ADVPORTM_EXT_PORT	BIT(5)
+#define VSC73XX_ADVPORTM_INV_GTX	BIT(4)
+#define VSC73XX_ADVPORTM_ENA_GTX	BIT(3)
+#define VSC73XX_ADVPORTM_DDR_MODE	BIT(2)
+#define VSC73XX_ADVPORTM_IO_LOOPBACK	BIT(1)
+#define VSC73XX_ADVPORTM_HOST_LOOPBACK	BIT(0)
+
+/* CAT_DROP categorizer frame dropping register bits */
+#define VSC73XX_CAT_DROP_DROP_MC_SMAC_ENA	BIT(6)
+#define VSC73XX_CAT_DROP_FWD_CTRL_ENA		BIT(4)
+#define VSC73XX_CAT_DROP_FWD_PAUSE_ENA		BIT(3)
+#define VSC73XX_CAT_DROP_UNTAGGED_ENA		BIT(2)
+#define VSC73XX_CAT_DROP_TAGGED_ENA		BIT(1)
+#define VSC73XX_CAT_DROP_NULL_MAC_ENA		BIT(0)
+
+#define VSC73XX_Q_MISC_CONF_EXTENT_MEM		BIT(31)
+#define VSC73XX_Q_MISC_CONF_EARLY_TX_MASK	GENMASK(4, 1)
+#define VSC73XX_Q_MISC_CONF_EARLY_TX_512	(1 << 1)
+#define VSC73XX_Q_MISC_CONF_MAC_PAUSE_MODE	BIT(0)
+
+/* Frame analyzer block 2 registers */
+#define VSC73XX_STORMLIMIT	0x02
+#define VSC73XX_ADVLEARN	0x03
+#define VSC73XX_IFLODMSK	0x04
+#define VSC73XX_VLANMASK	0x05
+#define VSC73XX_MACHDATA	0x06
+#define VSC73XX_MACLDATA	0x07
+#define VSC73XX_ANMOVED		0x08
+#define VSC73XX_ANAGEFIL	0x09
+#define VSC73XX_ANEVENTS	0x0a
+#define VSC73XX_ANCNTMASK	0x0b
+#define VSC73XX_ANCNTVAL	0x0c
+#define VSC73XX_LEARNMASK	0x0d
+#define VSC73XX_UFLODMASK	0x0e
+#define VSC73XX_MFLODMASK	0x0f
+#define VSC73XX_RECVMASK	0x10
+#define VSC73XX_AGGRCTRL	0x20
+#define VSC73XX_AGGRMSKS	0x30 /* Until 0x3f */
+#define VSC73XX_DSTMASKS	0x40 /* Until 0x7f */
+#define VSC73XX_SRCMASKS	0x80 /* Until 0x87 */
+#define VSC73XX_CAPENAB		0xa0
+#define VSC73XX_MACACCESS	0xb0
+#define VSC73XX_IPMCACCESS	0xb1
+#define VSC73XX_MACTINDX	0xc0
+#define VSC73XX_VLANACCESS	0xd0
+#define VSC73XX_VLANTIDX	0xe0
+#define VSC73XX_AGENCTRL	0xf0
+#define VSC73XX_CAPRST		0xff
+
+#define VSC73XX_MACACCESS_CPU_COPY		BIT(14)
+#define VSC73XX_MACACCESS_FWD_KILL		BIT(13)
+#define VSC73XX_MACACCESS_IGNORE_VLAN		BIT(12)
+#define VSC73XX_MACACCESS_AGED_FLAG		BIT(11)
+#define VSC73XX_MACACCESS_VALID			BIT(10)
+#define VSC73XX_MACACCESS_LOCKED		BIT(9)
+#define VSC73XX_MACACCESS_DEST_IDX_MASK		GENMASK(8, 3)
+#define VSC73XX_MACACCESS_CMD_MASK		GENMASK(2, 0)
+#define VSC73XX_MACACCESS_CMD_IDLE		0
+#define VSC73XX_MACACCESS_CMD_LEARN		1
+#define VSC73XX_MACACCESS_CMD_FORGET		2
+#define VSC73XX_MACACCESS_CMD_AGE_TABLE		3
+#define VSC73XX_MACACCESS_CMD_FLUSH_TABLE	4
+#define VSC73XX_MACACCESS_CMD_CLEAR_TABLE	5
+#define VSC73XX_MACACCESS_CMD_READ_ENTRY	6
+#define VSC73XX_MACACCESS_CMD_WRITE_ENTRY	7
+
+#define VSC73XX_VLANACCESS_LEARN_DISABLED	BIT(30)
+#define VSC73XX_VLANACCESS_VLAN_MIRROR		BIT(29)
+#define VSC73XX_VLANACCESS_VLAN_SRC_CHECK	BIT(28)
+#define VSC73XX_VLANACCESS_VLAN_PORT_MASK	GENMASK(9, 2)
+#define VSC73XX_VLANACCESS_VLAN_TBL_CMD_MASK	GENMASK(2, 0)
+#define VSC73XX_VLANACCESS_VLAN_TBL_CMD_IDLE	0
+#define VSC73XX_VLANACCESS_VLAN_TBL_CMD_READ_ENTRY	1
+#define VSC73XX_VLANACCESS_VLAN_TBL_CMD_WRITE_ENTRY	2
+#define VSC73XX_VLANACCESS_VLAN_TBL_CMD_CLEAR_TABLE	3
+
+/* MII block 3 registers */
+#define VSC73XX_MII_STAT	0x0
+#define VSC73XX_MII_CMD		0x1
+#define VSC73XX_MII_DATA	0x2
+
+/* Arbiter block 5 registers */
+#define VSC73XX_ARBEMPTY		0x0c
+#define VSC73XX_ARBDISC			0x0e
+#define VSC73XX_SBACKWDROP		0x12
+#define VSC73XX_DBACKWDROP		0x13
+#define VSC73XX_ARBBURSTPROB		0x15
+
+/* System block 7 registers */
+#define VSC73XX_ICPU_SIPAD		0x01
+#define VSC73XX_GMIIDELAY		0x05
+#define VSC73XX_ICPU_CTRL		0x10
+#define VSC73XX_ICPU_ADDR		0x11
+#define VSC73XX_ICPU_SRAM		0x12
+#define VSC73XX_HWSEM			0x13
+#define VSC73XX_GLORESET		0x14
+#define VSC73XX_ICPU_MBOX_VAL		0x15
+#define VSC73XX_ICPU_MBOX_SET		0x16
+#define VSC73XX_ICPU_MBOX_CLR		0x17
+#define VSC73XX_CHIPID			0x18
+#define VSC73XX_GPIO			0x34
+
+#define VSC73XX_GMIIDELAY_GMII0_GTXDELAY_NONE	0
+#define VSC73XX_GMIIDELAY_GMII0_GTXDELAY_1_4_NS	1
+#define VSC73XX_GMIIDELAY_GMII0_GTXDELAY_1_7_NS	2
+#define VSC73XX_GMIIDELAY_GMII0_GTXDELAY_2_0_NS	3
+
+#define VSC73XX_GMIIDELAY_GMII0_RXDELAY_NONE	(0 << 4)
+#define VSC73XX_GMIIDELAY_GMII0_RXDELAY_1_4_NS	(1 << 4)
+#define VSC73XX_GMIIDELAY_GMII0_RXDELAY_1_7_NS	(2 << 4)
+#define VSC73XX_GMIIDELAY_GMII0_RXDELAY_2_0_NS	(3 << 4)
+
+#define VSC73XX_ICPU_CTRL_WATCHDOG_RST	BIT(31)
+#define VSC73XX_ICPU_CTRL_CLK_DIV_MASK	GENMASK(12, 8)
+#define VSC73XX_ICPU_CTRL_SRST_HOLD	BIT(7)
+#define VSC73XX_ICPU_CTRL_ICPU_PI_EN	BIT(6)
+#define VSC73XX_ICPU_CTRL_BOOT_EN	BIT(3)
+#define VSC73XX_ICPU_CTRL_EXT_ACC_EN	BIT(2)
+#define VSC73XX_ICPU_CTRL_CLK_EN	BIT(1)
+#define VSC73XX_ICPU_CTRL_SRST		BIT(0)
+
+#define VSC73XX_CHIPID_ID_SHIFT		12
+#define VSC73XX_CHIPID_ID_MASK		0xffff
+#define VSC73XX_CHIPID_REV_SHIFT	28
+#define VSC73XX_CHIPID_REV_MASK		0xf
+#define VSC73XX_CHIPID_ID_7385		0x7385
+#define VSC73XX_CHIPID_ID_7388		0x7388
+#define VSC73XX_CHIPID_ID_7395		0x7395
+#define VSC73XX_CHIPID_ID_7398		0x7398
+
+#define VSC73XX_GLORESET_STROBE		BIT(4)
+#define VSC73XX_GLORESET_ICPU_LOCK	BIT(3)
+#define VSC73XX_GLORESET_MEM_LOCK	BIT(2)
+#define VSC73XX_GLORESET_PHY_RESET	BIT(1)
+#define VSC73XX_GLORESET_MASTER_RESET	BIT(0)
+
+#define VSC73XX_CMD_MODE_READ		0
+#define VSC73XX_CMD_MODE_WRITE		1
+#define VSC73XX_CMD_MODE_SHIFT		4
+#define VSC73XX_CMD_BLOCK_SHIFT		5
+#define VSC73XX_CMD_BLOCK_MASK		0x7
+#define VSC73XX_CMD_SUBBLOCK_MASK	0xf
+
+#define VSC7385_CLOCK_DELAY		((3 << 4) | 3)
+#define VSC7385_CLOCK_DELAY_MASK	((3 << 4) | 3)
+
+#define VSC73XX_ICPU_CTRL_STOP	(VSC73XX_ICPU_CTRL_SRST_HOLD | \
+				 VSC73XX_ICPU_CTRL_BOOT_EN | \
+				 VSC73XX_ICPU_CTRL_EXT_ACC_EN)
+
+#define VSC73XX_ICPU_CTRL_START	(VSC73XX_ICPU_CTRL_CLK_DIV | \
+				 VSC73XX_ICPU_CTRL_BOOT_EN | \
+				 VSC73XX_ICPU_CTRL_CLK_EN | \
+				 VSC73XX_ICPU_CTRL_SRST)
+
+/**
+ * struct vsc73xx - VSC73xx state container
+ */
+struct vsc73xx {
+	struct device		*dev;
+	struct gpio_desc	*reset;
+	struct spi_device	*spi;
+	struct dsa_switch	*ds;
+	struct gpio_chip	gc;
+	u16			chipid;
+	u8			addr[ETH_ALEN];
+	struct mutex		lock; /* Protects SPI traffic */
+};
+
+#define IS_7385(a) ((a)->chipid == VSC73XX_CHIPID_ID_7385)
+#define IS_7388(a) ((a)->chipid == VSC73XX_CHIPID_ID_7388)
+#define IS_7395(a) ((a)->chipid == VSC73XX_CHIPID_ID_7395)
+#define IS_7398(a) ((a)->chipid == VSC73XX_CHIPID_ID_7398)
+#define IS_739X(a) (IS_7395(a) || IS_7398(a))
+
+struct vsc73xx_counter {
+	u8 counter;
+	const char *name;
+};
+
+/* Counters are named according to the MIB standards where applicable.
+ * Some counters are custom, non-standard. The standard counters are
+ * named in accordance with RFC2819, RFC2021 and IEEE Std 802.3-2002 Annex
+ * 30A Counters.
+ */
+static const struct vsc73xx_counter vsc73xx_rx_counters[] = {
+	{ 0, "RxEtherStatsPkts" },
+	{ 1, "RxBroadcast+MulticastPkts" }, /* non-standard counter */
+	{ 2, "RxTotalErrorPackets" }, /* non-standard counter */
+	{ 3, "RxEtherStatsBroadcastPkts" },
+	{ 4, "RxEtherStatsMulticastPkts" },
+	{ 5, "RxEtherStatsPkts64Octets" },
+	{ 6, "RxEtherStatsPkts65to127Octets" },
+	{ 7, "RxEtherStatsPkts128to255Octets" },
+	{ 8, "RxEtherStatsPkts256to511Octets" },
+	{ 9, "RxEtherStatsPkts512to1023Octets" },
+	{ 10, "RxEtherStatsPkts1024to1518Octets" },
+	{ 11, "RxJumboFrames" }, /* non-standard counter */
+	{ 12, "RxaPauseMACControlFramesTransmitted" },
+	{ 13, "RxFIFODrops" }, /* non-standard counter */
+	{ 14, "RxBackwardDrops" }, /* non-standard counter */
+	{ 15, "RxClassifierDrops" }, /* non-standard counter */
+	{ 16, "RxEtherStatsCRCAlignErrors" },
+	{ 17, "RxEtherStatsUndersizePkts" },
+	{ 18, "RxEtherStatsOversizePkts" },
+	{ 19, "RxEtherStatsFragments" },
+	{ 20, "RxEtherStatsJabbers" },
+	{ 21, "RxaMACControlFramesReceived" },
+	/* 22-24 are undefined */
+	{ 25, "RxaFramesReceivedOK" },
+	{ 26, "RxQoSClass0" }, /* non-standard counter */
+	{ 27, "RxQoSClass1" }, /* non-standard counter */
+	{ 28, "RxQoSClass2" }, /* non-standard counter */
+	{ 29, "RxQoSClass3" }, /* non-standard counter */
+};
+
+static const struct vsc73xx_counter vsc73xx_tx_counters[] = {
+	{ 0, "TxEtherStatsPkts" },
+	{ 1, "TxBroadcast+MulticastPkts" }, /* non-standard counter */
+	{ 2, "TxTotalErrorPackets" }, /* non-standard counter */
+	{ 3, "TxEtherStatsBroadcastPkts" },
+	{ 4, "TxEtherStatsMulticastPkts" },
+	{ 5, "TxEtherStatsPkts64Octets" },
+	{ 6, "TxEtherStatsPkts65to127Octets" },
+	{ 7, "TxEtherStatsPkts128to255Octets" },
+	{ 8, "TxEtherStatsPkts256to511Octets" },
+	{ 9, "TxEtherStatsPkts512to1023Octets" },
+	{ 10, "TxEtherStatsPkts1024to1518Octets" },
+	{ 11, "TxJumboFrames" }, /* non-standard counter */
+	{ 12, "TxaPauseMACControlFramesTransmitted" },
+	{ 13, "TxFIFODrops" }, /* non-standard counter */
+	{ 14, "TxDrops" }, /* non-standard counter */
+	{ 15, "TxEtherStatsCollisions" },
+	{ 16, "TxEtherStatsCRCAlignErrors" },
+	{ 17, "TxEtherStatsUndersizePkts" },
+	{ 18, "TxEtherStatsOversizePkts" },
+	{ 19, "TxEtherStatsFragments" },
+	{ 20, "TxEtherStatsJabbers" },
+	/* 21-24 are undefined */
+	{ 25, "TxaFramesReceivedOK" },
+	{ 26, "TxQoSClass0" }, /* non-standard counter */
+	{ 27, "TxQoSClass1" }, /* non-standard counter */
+	{ 28, "TxQoSClass2" }, /* non-standard counter */
+	{ 29, "TxQoSClass3" }, /* non-standard counter */
+};
+
+static int vsc73xx_is_addr_valid(u8 block, u8 subblock)
+{
+	switch (block) {
+	case VSC73XX_BLOCK_MAC:
+		switch (subblock) {
+		case 0 ... 4:
+		case 6:
+			return 1;
+		}
+		break;
+
+	case VSC73XX_BLOCK_ANALYZER:
+	case VSC73XX_BLOCK_SYSTEM:
+		switch (subblock) {
+		case 0:
+			return 1;
+		}
+		break;
+
+	case VSC73XX_BLOCK_MII:
+	case VSC73XX_BLOCK_CAPTURE:
+	case VSC73XX_BLOCK_ARBITER:
+		switch (subblock) {
+		case 0 ... 1:
+			return 1;
+		}
+		break;
+	}
+
+	return 0;
+}
+
+static u8 vsc73xx_make_addr(u8 mode, u8 block, u8 subblock)
+{
+	u8 ret;
+
+	ret = (block & VSC73XX_CMD_BLOCK_MASK) << VSC73XX_CMD_BLOCK_SHIFT;
+	ret |= (mode & 1) << VSC73XX_CMD_MODE_SHIFT;
+	ret |= subblock & VSC73XX_CMD_SUBBLOCK_MASK;
+
+	return ret;
+}
+
+static int vsc73xx_read(struct vsc73xx *vsc, u8 block, u8 subblock, u8 reg,
+			u32 *val)
+{
+	struct spi_transfer t[2];
+	struct spi_message m;
+	u8 cmd[4];
+	u8 buf[4];
+	int ret;
+
+	if (!vsc73xx_is_addr_valid(block, subblock))
+		return -EINVAL;
+
+	spi_message_init(&m);
+
+	memset(&t, 0, sizeof(t));
+
+	t[0].tx_buf = cmd;
+	t[0].len = sizeof(cmd);
+	spi_message_add_tail(&t[0], &m);
+
+	t[1].rx_buf = buf;
+	t[1].len = sizeof(buf);
+	spi_message_add_tail(&t[1], &m);
+
+	cmd[0] = vsc73xx_make_addr(VSC73XX_CMD_MODE_READ, block, subblock);
+	cmd[1] = reg;
+	cmd[2] = 0;
+	cmd[3] = 0;
+
+	mutex_lock(&vsc->lock);
+	ret = spi_sync(vsc->spi, &m);
+	mutex_unlock(&vsc->lock);
+
+	if (ret)
+		return ret;
+
+	*val = (buf[0] << 24) | (buf[1] << 16) | (buf[2] << 8) | buf[3];
+
+	return 0;
+}
+
+static int vsc73xx_write(struct vsc73xx *vsc, u8 block, u8 subblock, u8 reg,
+			 u32 val)
+{
+	struct spi_transfer t[2];
+	struct spi_message m;
+	u8 cmd[2];
+	u8 buf[4];
+	int ret;
+
+	if (!vsc73xx_is_addr_valid(block, subblock))
+		return -EINVAL;
+
+	spi_message_init(&m);
+
+	memset(&t, 0, sizeof(t));
+
+	t[0].tx_buf = cmd;
+	t[0].len = sizeof(cmd);
+	spi_message_add_tail(&t[0], &m);
+
+	t[1].tx_buf = buf;
+	t[1].len = sizeof(buf);
+	spi_message_add_tail(&t[1], &m);
+
+	cmd[0] = vsc73xx_make_addr(VSC73XX_CMD_MODE_WRITE, block, subblock);
+	cmd[1] = reg;
+
+	buf[0] = (val >> 24) & 0xff;
+	buf[1] = (val >> 16) & 0xff;
+	buf[2] = (val >> 8) & 0xff;
+	buf[3] = val & 0xff;
+
+	mutex_lock(&vsc->lock);
+	ret = spi_sync(vsc->spi, &m);
+	mutex_unlock(&vsc->lock);
+
+	return ret;
+}
+
+static int vsc73xx_update_bits(struct vsc73xx *vsc, u8 block, u8 subblock,
+			       u8 reg, u32 mask, u32 val)
+{
+	u32 tmp, orig;
+	int ret;
+
+	/* Same read-modify-write algorithm as e.g. regmap */
+	ret = vsc73xx_read(vsc, block, subblock, reg, &orig);
+	if (ret)
+		return ret;
+	tmp = orig & ~mask;
+	tmp |= val & mask;
+	return vsc73xx_write(vsc, block, subblock, reg, tmp);
+}
+
+static int vsc73xx_detect(struct vsc73xx *vsc)
+{
+	bool icpu_si_boot_en;
+	bool icpu_pi_en;
+	u32 val;
+	u32 rev;
+	int ret;
+	u32 id;
+
+	ret = vsc73xx_read(vsc, VSC73XX_BLOCK_SYSTEM, 0,
+			   VSC73XX_ICPU_MBOX_VAL, &val);
+	if (ret) {
+		dev_err(vsc->dev, "unable to read mailbox (%d)\n", ret);
+		return ret;
+	}
+
+	if (val == 0xffffffff) {
+		dev_info(vsc->dev, "chip seems dead, assert reset\n");
+		gpiod_set_value_cansleep(vsc->reset, 1);
+		/* Reset pulse should be 20ns minimum, according to datasheet
+		 * table 245, so 10us should be fine
+		 */
+		usleep_range(10, 100);
+		gpiod_set_value_cansleep(vsc->reset, 0);
+		/* Wait 20ms according to datasheet table 245 */
+		msleep(20);
+
+		ret = vsc73xx_read(vsc, VSC73XX_BLOCK_SYSTEM, 0,
+				   VSC73XX_ICPU_MBOX_VAL, &val);
+		if (val == 0xffffffff) {
+			dev_err(vsc->dev, "seems not to help, giving up\n");
+			return -ENODEV;
+		}
+	}
+
+	ret = vsc73xx_read(vsc, VSC73XX_BLOCK_SYSTEM, 0,
+			   VSC73XX_CHIPID, &val);
+	if (ret) {
+		dev_err(vsc->dev, "unable to read chip id (%d)\n", ret);
+		return ret;
+	}
+
+	id = (val >> VSC73XX_CHIPID_ID_SHIFT) &
+		VSC73XX_CHIPID_ID_MASK;
+	switch (id) {
+	case VSC73XX_CHIPID_ID_7385:
+	case VSC73XX_CHIPID_ID_7388:
+	case VSC73XX_CHIPID_ID_7395:
+	case VSC73XX_CHIPID_ID_7398:
+		break;
+	default:
+		dev_err(vsc->dev, "unsupported chip, id=%04x\n", id);
+		return -ENODEV;
+	}
+
+	vsc->chipid = id;
+	rev = (val >> VSC73XX_CHIPID_REV_SHIFT) &
+		VSC73XX_CHIPID_REV_MASK;
+	dev_info(vsc->dev, "VSC%04X (rev: %d) switch found\n", id, rev);
+
+	ret = vsc73xx_read(vsc, VSC73XX_BLOCK_SYSTEM, 0,
+			   VSC73XX_ICPU_CTRL, &val);
+	if (ret) {
+		dev_err(vsc->dev, "unable to read iCPU control\n");
+		return ret;
+	}
+
+	/* The iCPU can always be used but can boot in different ways.
+	 * If it is initially disabled and has no external memory,
+	 * we are in control and can do whatever we like, else we
+	 * are probably in trouble (we need some way to communicate
+	 * with the running firmware) so we bail out for now.
+	 */
+	icpu_pi_en = !!(val & VSC73XX_ICPU_CTRL_ICPU_PI_EN);
+	icpu_si_boot_en = !!(val & VSC73XX_ICPU_CTRL_BOOT_EN);
+	if (icpu_si_boot_en && icpu_pi_en) {
+		dev_err(vsc->dev,
+			"iCPU enabled boots from SI, has external memory\n");
+		dev_err(vsc->dev, "no idea how to deal with this\n");
+		return -ENODEV;
+	}
+	if (icpu_si_boot_en && !icpu_pi_en) {
+		dev_err(vsc->dev,
+			"iCPU enabled boots from SI, no external memory\n");
+		dev_err(vsc->dev, "no idea how to deal with this\n");
+		return -ENODEV;
+	}
+	if (!icpu_si_boot_en && icpu_pi_en) {
+		dev_err(vsc->dev,
+			"iCPU enabled, boots from PI external memory\n");
+		dev_err(vsc->dev, "no idea how to deal with this\n");
+		return -ENODEV;
+	}
+	/* !icpu_si_boot_en && !cpu_pi_en */
+	dev_info(vsc->dev, "iCPU disabled, no external memory\n");
+
+	return 0;
+}
+
+static int vsc73xx_phy_read(struct dsa_switch *ds, int phy, int regnum)
+{
+	struct vsc73xx *vsc = ds->priv;
+	u32 cmd;
+	u32 val;
+	int ret;
+
+	/* Setting bit 26 means "read" */
+	cmd = BIT(26) | (phy << 21) | (regnum << 16);
+	ret = vsc73xx_write(vsc, VSC73XX_BLOCK_MII, 0, 1, cmd);
+	if (ret)
+		return ret;
+	msleep(2);
+	ret = vsc73xx_read(vsc, VSC73XX_BLOCK_MII, 0, 2, &val);
+	if (ret)
+		return ret;
+	if (val & BIT(16)) {
+		dev_err(vsc->dev, "reading reg %02x from phy%d failed\n",
+			regnum, phy);
+		return -EIO;
+	}
+	val &= 0xFFFFU;
+
+	dev_dbg(vsc->dev, "read reg %02x from phy%d = %04x\n",
+		regnum, phy, val);
+
+	return val;
+}
+
+static int vsc73xx_phy_write(struct dsa_switch *ds, int phy, int regnum,
+			     u16 val)
+{
+	struct vsc73xx *vsc = ds->priv;
+	u32 cmd;
+	int ret;
+
+	/* It was found through tedious experiments that this router
+	 * chip really hates to have it's PHYs reset. They
+	 * never recover if that happens: autonegotiation stops
+	 * working after a reset. Just filter out this command.
+	 * (Resetting the whole chip is OK.)
+	 */
+	if (regnum == 0 && (val & BIT(15))) {
+		dev_info(vsc->dev, "reset PHY - disallowed\n");
+		return 0;
+	}
+
+	cmd = (phy << 21) | (regnum << 16);
+	ret = vsc73xx_write(vsc, VSC73XX_BLOCK_MII, 0, 1, cmd);
+	if (ret)
+		return ret;
+
+	dev_dbg(vsc->dev, "write %04x to reg %02x in phy%d\n",
+		val, regnum, phy);
+	return 0;
+}
+
+static enum dsa_tag_protocol vsc73xx_get_tag_protocol(struct dsa_switch *ds,
+						      int port)
+{
+	/* The switch internally uses a 8 byte header with length,
+	 * source port, tag, LPA and priority. This is supposedly
+	 * only accessible when operating the switch using the internal
+	 * CPU or with an external CPU mapping the device in, but not
+	 * when operating the switch over SPI and putting frames in/out
+	 * on port 6 (the CPU port). So far we must assume that we
+	 * cannot access the tag. (See "Internal frame header" section
+	 * 3.9.1 in the manual.)
+	 */
+	return DSA_TAG_PROTO_NONE;
+}
+
+static int vsc73xx_setup(struct dsa_switch *ds)
+{
+	struct vsc73xx *vsc = ds->priv;
+	int i;
+
+	dev_info(vsc->dev, "set up the switch\n");
+
+	/* Issue RESET */
+	vsc73xx_write(vsc, VSC73XX_BLOCK_SYSTEM, 0, VSC73XX_GLORESET,
+		      VSC73XX_GLORESET_MASTER_RESET);
+	usleep_range(125, 200);
+
+	/* Initialize memory, initialize RAM bank 0..15 except 6 and 7
+	 * This sequence appears in the
+	 * VSC7385 SparX-G5 datasheet section 6.6.1
+	 * VSC7395 SparX-G5e datasheet section 6.6.1
+	 * "initialization sequence".
+	 * No explanation is given to the 0x1010400 magic number.
+	 */
+	for (i = 0; i <= 15; i++) {
+		if (i != 6 && i != 7) {
+			vsc73xx_write(vsc, VSC73XX_BLOCK_MEMINIT,
+				      2,
+				      0, 0x1010400 + i);
+			mdelay(1);
+		}
+	}
+	mdelay(30);
+
+	/* Clear MAC table */
+	vsc73xx_write(vsc, VSC73XX_BLOCK_ANALYZER, 0,
+		      VSC73XX_MACACCESS,
+		      VSC73XX_MACACCESS_CMD_CLEAR_TABLE);
+
+	/* Clear VLAN table */
+	vsc73xx_write(vsc, VSC73XX_BLOCK_ANALYZER, 0,
+		      VSC73XX_VLANACCESS,
+		      VSC73XX_VLANACCESS_VLAN_TBL_CMD_CLEAR_TABLE);
+
+	msleep(40);
+
+	/* Use 20KiB buffers on all ports on VSC7395
+	 * The VSC7385 has 16KiB buffers and that is the
+	 * default if we don't set this up explicitly.
+	 * Port "31" is "all ports".
+	 */
+	if (IS_739X(vsc))
+		vsc73xx_write(vsc, VSC73XX_BLOCK_MAC, 0x1f,
+			      VSC73XX_Q_MISC_CONF,
+			      VSC73XX_Q_MISC_CONF_EXTENT_MEM);
+
+	/* Put all ports into reset until enabled */
+	for (i = 0; i < 7; i++) {
+		if (i == 5)
+			continue;
+		vsc73xx_write(vsc, VSC73XX_BLOCK_MAC, 4,
+			      VSC73XX_MAC_CFG, VSC73XX_MAC_CFG_RESET);
+	}
+
+	/* MII delay, set both GTX and RX delay to 2 ns */
+	vsc73xx_write(vsc, VSC73XX_BLOCK_SYSTEM, 0, VSC73XX_GMIIDELAY,
+		      VSC73XX_GMIIDELAY_GMII0_GTXDELAY_2_0_NS |
+		      VSC73XX_GMIIDELAY_GMII0_RXDELAY_2_0_NS);
+	/* Enable reception of frames on all ports */
+	vsc73xx_write(vsc, VSC73XX_BLOCK_ANALYZER, 0, VSC73XX_RECVMASK,
+		      0x5f);
+	/* IP multicast flood mask (table 144) */
+	vsc73xx_write(vsc, VSC73XX_BLOCK_ANALYZER, 0, VSC73XX_IFLODMSK,
+		      0xff);
+
+	mdelay(50);
+
+	/* Release reset from the internal PHYs */
+	vsc73xx_write(vsc, VSC73XX_BLOCK_SYSTEM, 0, VSC73XX_GLORESET,
+		      VSC73XX_GLORESET_PHY_RESET);
+
+	udelay(4);
+
+	return 0;
+}
+
+static void vsc73xx_init_port(struct vsc73xx *vsc, int port)
+{
+	u32 val;
+
+	/* MAC configure, first reset the port and then write defaults */
+	vsc73xx_write(vsc, VSC73XX_BLOCK_MAC,
+		      port,
+		      VSC73XX_MAC_CFG,
+		      VSC73XX_MAC_CFG_RESET);
+
+	/* Take up the port in 1Gbit mode by default, this will be
+	 * augmented after auto-negotiation on the PHY-facing
+	 * ports.
+	 */
+	if (port == CPU_PORT)
+		val = VSC73XX_MAC_CFG_1000M_F_RGMII;
+	else
+		val = VSC73XX_MAC_CFG_1000M_F_PHY;
+
+	vsc73xx_write(vsc, VSC73XX_BLOCK_MAC,
+		      port,
+		      VSC73XX_MAC_CFG,
+		      val |
+		      VSC73XX_MAC_CFG_TX_EN |
+		      VSC73XX_MAC_CFG_RX_EN);
+
+	/* Max length, we can do up to 9.6 KiB, so allow that.
+	 * According to application not "VSC7398 Jumbo Frames" setting
+	 * up the MTU to 9.6 KB does not affect the performance on standard
+	 * frames, so just enable it. It is clear from the application note
+	 * that "9.6 kilobytes" == 9600 bytes.
+	 */
+	vsc73xx_write(vsc, VSC73XX_BLOCK_MAC,
+		      port,
+		      VSC73XX_MAXLEN, 9600);
+
+	/* Flow control for the CPU port:
+	 * Use a zero delay pause frame when pause condition is left
+	 * Obey pause control frames
+	 */
+	vsc73xx_write(vsc, VSC73XX_BLOCK_MAC,
+		      port,
+		      VSC73XX_FCCONF,
+		      VSC73XX_FCCONF_ZERO_PAUSE_EN |
+		      VSC73XX_FCCONF_FLOW_CTRL_OBEY);
+
+	/* Issue pause control frames on PHY facing ports.
+	 * Allow early initiation of MAC transmission if the amount
+	 * of egress data is below 512 bytes on CPU port.
+	 * FIXME: enable 20KiB buffers?
+	 */
+	if (port == CPU_PORT)
+		val = VSC73XX_Q_MISC_CONF_EARLY_TX_512;
+	else
+		val = VSC73XX_Q_MISC_CONF_MAC_PAUSE_MODE;
+	val |= VSC73XX_Q_MISC_CONF_EXTENT_MEM;
+	vsc73xx_write(vsc, VSC73XX_BLOCK_MAC,
+		      port,
+		      VSC73XX_Q_MISC_CONF,
+		      val);
+
+	/* Flow control MAC: a MAC address used in flow control frames */
+	val = (vsc->addr[5] << 16) | (vsc->addr[4] << 8) | (vsc->addr[3]);
+	vsc73xx_write(vsc, VSC73XX_BLOCK_MAC,
+		      port,
+		      VSC73XX_FCMACHI,
+		      val);
+	val = (vsc->addr[2] << 16) | (vsc->addr[1] << 8) | (vsc->addr[0]);
+	vsc73xx_write(vsc, VSC73XX_BLOCK_MAC,
+		      port,
+		      VSC73XX_FCMACLO,
+		      val);
+
+	/* Tell the categorizer to forward pause frames, not control
+	 * frame. Do not drop anything.
+	 */
+	vsc73xx_write(vsc, VSC73XX_BLOCK_MAC,
+		      port,
+		      VSC73XX_CAT_DROP,
+		      VSC73XX_CAT_DROP_FWD_PAUSE_ENA);
+
+	/* Clear all counters */
+	vsc73xx_write(vsc, VSC73XX_BLOCK_MAC,
+		      port, VSC73XX_C_RX0, 0);
+}
+
+static void vsc73xx_adjust_enable_port(struct vsc73xx *vsc,
+				       int port, struct phy_device *phydev,
+				       u32 initval)
+{
+	u32 val = initval;
+	u8 seed;
+
+	/* Reset this port FIXME: break out subroutine */
+	val |= VSC73XX_MAC_CFG_RESET;
+	vsc73xx_write(vsc, VSC73XX_BLOCK_MAC, port, VSC73XX_MAC_CFG, val);
+
+	/* Seed the port randomness with randomness */
+	get_random_bytes(&seed, 1);
+	val |= seed << VSC73XX_MAC_CFG_SEED_OFFSET;
+	val |= VSC73XX_MAC_CFG_SEED_LOAD;
+	val |= VSC73XX_MAC_CFG_WEXC_DIS;
+	vsc73xx_write(vsc, VSC73XX_BLOCK_MAC, port, VSC73XX_MAC_CFG, val);
+
+	/* Flow control for the PHY facing ports:
+	 * Use a zero delay pause frame when pause condition is left
+	 * Obey pause control frames
+	 * When generating pause frames, use 0xff as pause value
+	 */
+	vsc73xx_write(vsc, VSC73XX_BLOCK_MAC, port, VSC73XX_FCCONF,
+		      VSC73XX_FCCONF_ZERO_PAUSE_EN |
+		      VSC73XX_FCCONF_FLOW_CTRL_OBEY |
+		      0xff);
+
+	/* Disallow backward dropping of frames from this port */
+	vsc73xx_update_bits(vsc, VSC73XX_BLOCK_ARBITER, 0,
+			    VSC73XX_SBACKWDROP, BIT(port), 0);
+
+	/* Enable TX, RX, deassert reset, stop loading seed */
+	vsc73xx_update_bits(vsc, VSC73XX_BLOCK_MAC, port,
+			    VSC73XX_MAC_CFG,
+			    VSC73XX_MAC_CFG_RESET | VSC73XX_MAC_CFG_SEED_LOAD |
+			    VSC73XX_MAC_CFG_TX_EN | VSC73XX_MAC_CFG_RX_EN,
+			    VSC73XX_MAC_CFG_TX_EN | VSC73XX_MAC_CFG_RX_EN);
+}
+
+static void vsc73xx_adjust_link(struct dsa_switch *ds, int port,
+				struct phy_device *phydev)
+{
+	struct vsc73xx *vsc = ds->priv;
+	u32 val;
+
+	/* Special handling of the CPU-facing port */
+	if (port == CPU_PORT) {
+		/* Other ports are already initialized but not this one */
+		vsc73xx_init_port(vsc, CPU_PORT);
+		/* Select the external port for this interface (EXT_PORT)
+		 * Enable the GMII GTX external clock
+		 * Use double data rate (DDR mode)
+		 */
+		vsc73xx_write(vsc, VSC73XX_BLOCK_MAC,
+			      CPU_PORT,
+			      VSC73XX_ADVPORTM,
+			      VSC73XX_ADVPORTM_EXT_PORT |
+			      VSC73XX_ADVPORTM_ENA_GTX |
+			      VSC73XX_ADVPORTM_DDR_MODE);
+	}
+
+	/* This is the MAC confiuration that always need to happen
+	 * after a PHY or the CPU port comes up or down.
+	 */
+	if (!phydev->link) {
+		int maxloop = 10;
+
+		dev_dbg(vsc->dev, "port %d: went down\n",
+			port);
+
+		/* Disable RX on this port */
+		vsc73xx_update_bits(vsc, VSC73XX_BLOCK_MAC, port,
+				    VSC73XX_MAC_CFG,
+				    VSC73XX_MAC_CFG_RX_EN, 0);
+
+		/* Discard packets */
+		vsc73xx_update_bits(vsc, VSC73XX_BLOCK_ARBITER, 0,
+				    VSC73XX_ARBDISC, BIT(port), BIT(port));
+
+		/* Wait until queue is empty */
+		vsc73xx_read(vsc, VSC73XX_BLOCK_ARBITER, 0,
+			     VSC73XX_ARBEMPTY, &val);
+		while (!(val & BIT(port))) {
+			msleep(1);
+			vsc73xx_read(vsc, VSC73XX_BLOCK_ARBITER, 0,
+				     VSC73XX_ARBEMPTY, &val);
+			if (--maxloop == 0) {
+				dev_err(vsc->dev,
+					"timeout waitting for block arbiter\n");
+				/* Continue anyway */
+				break;
+			}
+		}
+
+		/* Put this port into reset */
+		vsc73xx_write(vsc, VSC73XX_BLOCK_MAC, port, VSC73XX_MAC_CFG,
+			      VSC73XX_MAC_CFG_RESET);
+
+		/* Accept packets again */
+		vsc73xx_update_bits(vsc, VSC73XX_BLOCK_ARBITER, 0,
+				    VSC73XX_ARBDISC, BIT(port), 0);
+
+		/* Allow backward dropping of frames from this port */
+		vsc73xx_update_bits(vsc, VSC73XX_BLOCK_ARBITER, 0,
+				    VSC73XX_SBACKWDROP, BIT(port), BIT(port));
+
+		/* Receive mask (disable forwarding) */
+		vsc73xx_update_bits(vsc, VSC73XX_BLOCK_ANALYZER, 0,
+				    VSC73XX_RECVMASK, BIT(port), 0);
+
+		return;
+	}
+
+	/* Figure out what speed was negotiated */
+	if (phydev->speed == SPEED_1000) {
+		dev_dbg(vsc->dev, "port %d: 1000 Mbit mode full duplex\n",
+			port);
+
+		/* Set up default for internal port or external RGMII */
+		if (phydev->interface == PHY_INTERFACE_MODE_RGMII)
+			val = VSC73XX_MAC_CFG_1000M_F_RGMII;
+		else
+			val = VSC73XX_MAC_CFG_1000M_F_PHY;
+		vsc73xx_adjust_enable_port(vsc, port, phydev, val);
+	} else if (phydev->speed == SPEED_100) {
+		if (phydev->duplex == DUPLEX_FULL) {
+			val = VSC73XX_MAC_CFG_100_10M_F_PHY;
+			dev_dbg(vsc->dev,
+				"port %d: 100 Mbit full duplex mode\n",
+				port);
+		} else {
+			val = VSC73XX_MAC_CFG_100_10M_H_PHY;
+			dev_dbg(vsc->dev,
+				"port %d: 100 Mbit half duplex mode\n",
+				port);
+		}
+		vsc73xx_adjust_enable_port(vsc, port, phydev, val);
+	} else if (phydev->speed == SPEED_10) {
+		if (phydev->duplex == DUPLEX_FULL) {
+			val = VSC73XX_MAC_CFG_100_10M_F_PHY;
+			dev_dbg(vsc->dev,
+				"port %d: 10 Mbit full duplex mode\n",
+				port);
+		} else {
+			val = VSC73XX_MAC_CFG_100_10M_H_PHY;
+			dev_dbg(vsc->dev,
+				"port %d: 10 Mbit half duplex mode\n",
+				port);
+		}
+		vsc73xx_adjust_enable_port(vsc, port, phydev, val);
+	} else {
+		dev_err(vsc->dev,
+			"could not adjust link: unknown speed\n");
+	}
+
+	/* Enable port (forwarding) in the receieve mask */
+	vsc73xx_update_bits(vsc, VSC73XX_BLOCK_ANALYZER, 0,
+			    VSC73XX_RECVMASK, BIT(port), BIT(port));
+}
+
+static int vsc73xx_port_enable(struct dsa_switch *ds, int port,
+			       struct phy_device *phy)
+{
+	struct vsc73xx *vsc = ds->priv;
+
+	dev_info(vsc->dev, "enable port %d\n", port);
+	vsc73xx_init_port(vsc, port);
+
+	return 0;
+}
+
+static void vsc73xx_port_disable(struct dsa_switch *ds, int port,
+				 struct phy_device *phy)
+{
+	struct vsc73xx *vsc = ds->priv;
+
+	/* Just put the port into reset */
+	vsc73xx_write(vsc, VSC73XX_BLOCK_MAC, port,
+		      VSC73XX_MAC_CFG, VSC73XX_MAC_CFG_RESET);
+}
+
+static const struct vsc73xx_counter *
+vsc73xx_find_counter(struct vsc73xx *vsc,
+		     u8 counter,
+		     bool tx)
+{
+	const struct vsc73xx_counter *cnts;
+	int num_cnts;
+	int i;
+
+	if (tx) {
+		cnts = vsc73xx_tx_counters;
+		num_cnts = ARRAY_SIZE(vsc73xx_tx_counters);
+	} else {
+		cnts = vsc73xx_rx_counters;
+		num_cnts = ARRAY_SIZE(vsc73xx_rx_counters);
+	}
+
+	for (i = 0; i < num_cnts; i++) {
+		const struct vsc73xx_counter *cnt;
+
+		cnt = &cnts[i];
+		if (cnt->counter == counter)
+			return cnt;
+	}
+
+	return NULL;
+}
+
+void vsc73xx_get_strings(struct dsa_switch *ds, int port, u32 stringset,
+			 uint8_t *data)
+{
+	const struct vsc73xx_counter *cnt;
+	struct vsc73xx *vsc = ds->priv;
+	u8 indices[6];
+	int i, j;
+	u32 val;
+	int ret;
+
+	if (stringset != ETH_SS_STATS)
+		return;
+
+	ret = vsc73xx_read(vsc, VSC73XX_BLOCK_MAC, port,
+			   VSC73XX_C_CFG, &val);
+	if (ret)
+		return;
+
+	indices[0] = (val & 0x1f); /* RX counter 0 */
+	indices[1] = ((val >> 5) & 0x1f); /* RX counter 1 */
+	indices[2] = ((val >> 10) & 0x1f); /* RX counter 2 */
+	indices[3] = ((val >> 16) & 0x1f); /* TX counter 0 */
+	indices[4] = ((val >> 21) & 0x1f); /* TX counter 1 */
+	indices[5] = ((val >> 26) & 0x1f); /* TX counter 2 */
+
+	/* The first counters is the RX octets */
+	j = 0;
+	strncpy(data + j * ETH_GSTRING_LEN,
+		"RxEtherStatsOctets", ETH_GSTRING_LEN);
+	j++;
+
+	/* Each port supports recording 3 RX counters and 3 TX counters,
+	 * figure out what counters we use in this set-up and return the
+	 * names of them. The hardware default counters will be number of
+	 * packets on RX/TX, combined broadcast+multicast packets RX/TX and
+	 * total error packets RX/TX.
+	 */
+	for (i = 0; i < 3; i++) {
+		cnt = vsc73xx_find_counter(vsc, indices[i], false);
+		if (cnt)
+			strncpy(data + j * ETH_GSTRING_LEN,
+				cnt->name, ETH_GSTRING_LEN);
+		j++;
+	}
+
+	/* TX stats begins with the number of TX octets */
+	strncpy(data + j * ETH_GSTRING_LEN,
+		"TxEtherStatsOctets", ETH_GSTRING_LEN);
+	j++;
+
+	for (i = 3; i < 6; i++) {
+		cnt = vsc73xx_find_counter(vsc, indices[i], true);
+		if (cnt)
+			strncpy(data + j * ETH_GSTRING_LEN,
+				cnt->name, ETH_GSTRING_LEN);
+		j++;
+	}
+}
+
+int vsc73xx_get_sset_count(struct dsa_switch *ds, int port, int sset)
+{
+	/* We only support SS_STATS */
+	if (sset != ETH_SS_STATS)
+		return 0;
+	/* RX and TX packets, then 3 RX counters, 3 TX counters */
+	return 8;
+}
+
+void vsc73xx_get_ethtool_stats(struct dsa_switch *ds, int port, uint64_t *data)
+{
+	struct vsc73xx *vsc = ds->priv;
+	u8 regs[] = {
+		VSC73XX_RXOCT,
+		VSC73XX_C_RX0,
+		VSC73XX_C_RX1,
+		VSC73XX_C_RX2,
+		VSC73XX_TXOCT,
+		VSC73XX_C_TX0,
+		VSC73XX_C_TX1,
+		VSC73XX_C_TX2,
+	};
+	u32 val;
+	int ret;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(regs); i++) {
+		ret = vsc73xx_read(vsc, VSC73XX_BLOCK_MAC, port,
+				   regs[i], &val);
+		if (ret) {
+			dev_err(vsc->dev, "error reading counter %d\n", i);
+			return;
+		}
+		data[i] = val;
+	}
+}
+
+static const struct dsa_switch_ops vsc73xx_ds_ops = {
+	.get_tag_protocol = vsc73xx_get_tag_protocol,
+	.setup = vsc73xx_setup,
+	.phy_read = vsc73xx_phy_read,
+	.phy_write = vsc73xx_phy_write,
+	.adjust_link = vsc73xx_adjust_link,
+	.get_strings = vsc73xx_get_strings,
+	.get_ethtool_stats = vsc73xx_get_ethtool_stats,
+	.get_sset_count = vsc73xx_get_sset_count,
+	.port_enable = vsc73xx_port_enable,
+	.port_disable = vsc73xx_port_disable,
+};
+
+static int vsc73xx_gpio_get(struct gpio_chip *chip, unsigned int offset)
+{
+	struct vsc73xx *vsc = gpiochip_get_data(chip);
+	u32 val;
+	int ret;
+
+	ret = vsc73xx_read(vsc, VSC73XX_BLOCK_SYSTEM, 0,
+			   VSC73XX_GPIO, &val);
+	if (ret)
+		return ret;
+
+	return !!(val & BIT(offset));
+}
+
+static void vsc73xx_gpio_set(struct gpio_chip *chip, unsigned int offset,
+			     int val)
+{
+	struct vsc73xx *vsc = gpiochip_get_data(chip);
+	u32 tmp = val ? BIT(offset) : 0;
+
+	vsc73xx_update_bits(vsc, VSC73XX_BLOCK_SYSTEM, 0,
+			    VSC73XX_GPIO, BIT(offset), tmp);
+}
+
+static int vsc73xx_gpio_direction_output(struct gpio_chip *chip,
+					 unsigned int offset, int val)
+{
+	struct vsc73xx *vsc = gpiochip_get_data(chip);
+	u32 tmp = val ? BIT(offset) : 0;
+
+	return vsc73xx_update_bits(vsc, VSC73XX_BLOCK_SYSTEM, 0,
+				   VSC73XX_GPIO, BIT(offset + 4) | BIT(offset),
+				   BIT(offset + 4) | tmp);
+}
+
+static int vsc73xx_gpio_direction_input(struct gpio_chip *chip,
+					unsigned int offset)
+{
+	struct vsc73xx *vsc = gpiochip_get_data(chip);
+
+	return  vsc73xx_update_bits(vsc, VSC73XX_BLOCK_SYSTEM, 0,
+				    VSC73XX_GPIO, BIT(offset + 4),
+				    0);
+}
+
+static int vsc73xx_gpio_get_direction(struct gpio_chip *chip,
+				      unsigned int offset)
+{
+	struct vsc73xx *vsc = gpiochip_get_data(chip);
+	u32 val;
+	int ret;
+
+	ret = vsc73xx_read(vsc, VSC73XX_BLOCK_SYSTEM, 0,
+			   VSC73XX_GPIO, &val);
+	if (ret)
+		return ret;
+
+	return !(val & BIT(offset + 4));
+}
+
+static int vsc73xx_gpio_probe(struct vsc73xx *vsc)
+{
+	int ret;
+
+	vsc->gc.label = devm_kasprintf(vsc->dev, GFP_KERNEL, "VSC%04x",
+				       vsc->chipid);
+	vsc->gc.ngpio = 4;
+	vsc->gc.owner = THIS_MODULE;
+	vsc->gc.parent = vsc->dev;
+	vsc->gc.of_node = vsc->dev->of_node;
+	vsc->gc.base = -1;
+	vsc->gc.get = vsc73xx_gpio_get;
+	vsc->gc.set = vsc73xx_gpio_set;
+	vsc->gc.direction_input = vsc73xx_gpio_direction_input;
+	vsc->gc.direction_output = vsc73xx_gpio_direction_output;
+	vsc->gc.get_direction = vsc73xx_gpio_get_direction;
+	vsc->gc.can_sleep = true;
+	ret = devm_gpiochip_add_data(vsc->dev, &vsc->gc, vsc);
+	if (ret) {
+		dev_err(vsc->dev, "unable to register GPIO chip\n");
+		return ret;
+	}
+	return 0;
+}
+
+static int vsc73xx_probe(struct spi_device *spi)
+{
+	struct device *dev = &spi->dev;
+	struct vsc73xx *vsc;
+	int ret;
+
+	vsc = devm_kzalloc(dev, sizeof(*vsc), GFP_KERNEL);
+	if (!vsc)
+		return -ENOMEM;
+
+	spi_set_drvdata(spi, vsc);
+	vsc->spi = spi_dev_get(spi);
+	vsc->dev = dev;
+	mutex_init(&vsc->lock);
+
+	/* Release reset, if any */
+	vsc->reset = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_LOW);
+	if (IS_ERR(vsc->reset)) {
+		dev_err(dev, "failed to get RESET GPIO\n");
+		return PTR_ERR(vsc->reset);
+	}
+	if (vsc->reset)
+		/* Wait 20ms according to datasheet table 245 */
+		msleep(20);
+
+	spi->mode = SPI_MODE_0;
+	spi->bits_per_word = 8;
+	ret = spi_setup(spi);
+	if (ret < 0) {
+		dev_err(dev, "spi setup failed.\n");
+		return ret;
+	}
+
+	ret = vsc73xx_detect(vsc);
+	if (ret) {
+		dev_err(dev, "no chip found (%d)\n", ret);
+		return -ENODEV;
+	}
+
+	eth_random_addr(vsc->addr);
+	dev_info(vsc->dev,
+		 "MAC for control frames: %02X:%02X:%02X:%02X:%02X:%02X\n",
+		 vsc->addr[0], vsc->addr[1], vsc->addr[2],
+		 vsc->addr[3], vsc->addr[4], vsc->addr[5]);
+
+	/* The VSC7395 switch chips have 5+1 ports which means 5
+	 * ordinary ports and a sixth CPU port facing the processor
+	 * with an RGMII interface. These ports are numbered 0..4
+	 * and 6, so they leave a "hole" in the port map for port 5,
+	 * which is invalid.
+	 *
+	 * The VSC7398 has 8 ports, port 7 is again the CPU port.
+	 *
+	 * We allocate 8 ports and avoid access to the nonexistant
+	 * ports.
+	 */
+	vsc->ds = dsa_switch_alloc(dev, 8);
+	if (!vsc->ds)
+		return -ENOMEM;
+	vsc->ds->priv = vsc;
+
+	vsc->ds->ops = &vsc73xx_ds_ops;
+	ret = dsa_register_switch(vsc->ds);
+	if (ret) {
+		dev_err(dev, "unable to register switch (%d)\n", ret);
+		return ret;
+	}
+
+	ret = vsc73xx_gpio_probe(vsc);
+	if (ret) {
+		dsa_unregister_switch(vsc->ds);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int vsc73xx_remove(struct spi_device *spi)
+{
+	struct vsc73xx *vsc = spi_get_drvdata(spi);
+
+	dsa_unregister_switch(vsc->ds);
+	gpiod_set_value(vsc->reset, 1);
+
+	return 0;
+}
+
+static const struct of_device_id vsc73xx_of_match[] = {
+	{
+		.compatible = "vitesse,vsc7385",
+	},
+	{
+		.compatible = "vitesse,vsc7388",
+	},
+	{
+		.compatible = "vitesse,vsc7395",
+	},
+	{
+		.compatible = "vitesse,vsc7398",
+	},
+	{ },
+};
+MODULE_DEVICE_TABLE(of, vsc73xx_of_match);
+
+static struct spi_driver vsc73xx_driver = {
+	.probe = vsc73xx_probe,
+	.remove = vsc73xx_remove,
+	.driver = {
+		.name = "vsc73xx",
+		.of_match_table = vsc73xx_of_match,
+	},
+};
+module_spi_driver(vsc73xx_driver);
+
+MODULE_AUTHOR("Linus Walleij <linus.walleij@linaro.org>");
+MODULE_DESCRIPTION("Vitesse VSC7385/7388/7395/7398 driver");
+MODULE_LICENSE("GPL v2");
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH 1/2] cdc_ncm: Hook into usbnet_change_mtu respecting usbnet,driver_info
From: Miguel Rodríguez Pérez @ 2018-06-30 11:26 UTC (permalink / raw)
  To: linux-usb; +Cc: netdev
In-Reply-To: <69bbee1a-9a88-eabe-554f-fbc59e7fde79@det.uvigo.gal>

Change the way cdc_ncm_change_mtu hooks into the netdev_ops
structure so that changes into usbnet driver_info operations
can be respected. Without this, is was not possible to hook
into usbnet_set_rx_mode.

Signed-off-by: Miguel Rodríguez Pérez <miguel@det.uvigo.gal>
---
 drivers/net/usb/cdc_ncm.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index 9e1b74590682..d6b51e2b9495 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -750,16 +750,7 @@ int cdc_ncm_change_mtu(struct net_device *net, int
new_mtu)
 }
 EXPORT_SYMBOL_GPL(cdc_ncm_change_mtu);

-static const struct net_device_ops cdc_ncm_netdev_ops = {
-	.ndo_open	     = usbnet_open,
-	.ndo_stop	     = usbnet_stop,
-	.ndo_start_xmit	     = usbnet_start_xmit,
-	.ndo_tx_timeout	     = usbnet_tx_timeout,
-	.ndo_get_stats64     = usbnet_get_stats64,
-	.ndo_change_mtu	     = cdc_ncm_change_mtu,
-	.ndo_set_mac_address = eth_mac_addr,
-	.ndo_validate_addr   = eth_validate_addr,
-};
+static struct net_device_ops cdc_ncm_netdev_ops;

 int cdc_ncm_bind_common(struct usbnet *dev, struct usb_interface *intf,
u8 data_altsetting, int drvflags)
 {
@@ -939,6 +930,8 @@ int cdc_ncm_bind_common(struct usbnet *dev, struct
usb_interface *intf, u8 data_
 	dev->net->sysfs_groups[0] = &cdc_ncm_sysfs_attr_group;

 	/* must handle MTU changes */
+	cdc_ncm_netdev_ops = *dev->net->netdev_ops;
+	cdc_ncm_netdev_ops.ndo_change_mtu = cdc_ncm_change_mtu;
 	dev->net->netdev_ops = &cdc_ncm_netdev_ops;
 	dev->net->max_mtu = cdc_ncm_max_dgram_size(dev) - cdc_ncm_eth_hlen(dev);

-- 
2.17.1

-- 
Miguel Rodríguez Pérez
Laboratorio de Redes
EE Telecomunicación – Universidade de Vigo

^ permalink raw reply related

* Re: [PATCH 2/2] cdc_ncm: Admit multicast traffic
From: Miguel Rodríguez Pérez @ 2018-06-30 11:26 UTC (permalink / raw)
  To: linux-usb; +Cc: netdev
In-Reply-To: <69bbee1a-9a88-eabe-554f-fbc59e7fde79@det.uvigo.gal>

Some CDC_NCM devices are used as docks for laptops. In this case, it
makes sense to accept multicast Ethernet traffic, as these devices
can reside in a proper LAN. Without this, mDNS or IPv6 simply do not
work.

Signed-off-by: Miguel Rodríguez Pérez <miguel@det.uvigo.gal>
---
 drivers/net/usb/cdc_ncm.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index d6b51e2b9495..50af1d9d0102 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -132,6 +132,33 @@ static void cdc_ncm_get_strings(struct net_device
__always_unused *netdev, u32 s

 static void cdc_ncm_update_rxtx_max(struct usbnet *dev, u32 new_rx, u32
new_tx);

+static void cdc_ncm_update_filter(struct usbnet *dev)
+{
+       struct cdc_ncm_ctx *ctx = (struct cdc_ncm_ctx *)dev->data[0];
+	u8 iface_no = ctx->control->cur_altsetting->desc.bInterfaceNumber;
+	struct net_device *net = dev->net;
+
+	u16 cdc_filter = USB_CDC_PACKET_TYPE_DIRECTED
+			| USB_CDC_PACKET_TYPE_BROADCAST;
+
+	/* filtering on the device is an optional feature and not worth
+	 * the hassle so we just roughly care about snooping and if any
+	 * multicast is requested, we take every multicast
+	 */
+	if (net->flags & IFF_PROMISC)
+		cdc_filter |= USB_CDC_PACKET_TYPE_PROMISCUOUS;
+	if (!netdev_mc_empty(net) || (net->flags & IFF_ALLMULTI))
+		cdc_filter |= USB_CDC_PACKET_TYPE_ALL_MULTICAST;
+
+	usbnet_write_cmd(dev,
+			USB_CDC_SET_ETHERNET_PACKET_FILTER,
+			USB_TYPE_CLASS | USB_DIR_OUT | USB_RECIP_INTERFACE,
+			cdc_filter,
+			iface_no,
+			NULL,
+			0);
+}
+
 static const struct ethtool_ops cdc_ncm_ethtool_ops = {
 	.get_link          = usbnet_get_link,
 	.nway_reset        = usbnet_nway_reset,
@@ -1652,6 +1679,7 @@ static const struct driver_info cdc_ncm_info = {
 	.status = cdc_ncm_status,
 	.rx_fixup = cdc_ncm_rx_fixup,
 	.tx_fixup = cdc_ncm_tx_fixup,
+	.set_rx_mode = cdc_ncm_update_filter,
 };

 /* Same as cdc_ncm_info, but with FLAG_WWAN */
-- 
2.17.1


On 30/06/18 13:21, Miguel Rodríguez Pérez wrote:
> Sending again, as the previous try had the wrong subjects. Sorry about that.
> 
> Dell D6000 dock (and I guess other docks too) exposes a CDC_NCM device
> for Ethernet traffic. However, multicast Ethernet traffic is not
> processed making IPv6 not functional. Other services, like mDNS used for
> LAN service discovery are also hindered.
> 
> The actual reason is that CDC_NCM driver was not processing requests to
> filter (admit) multicast traffic. I provide two patches to the linux
> kernel that admit all Ethernet multicast traffic whenever a multicast
> group is being joined.
> 
> The solution is not optimal, as it makes the system receive more traffic
> than that strictly needed, but otherwise this only happens when the
> computer is connected to a dock and thus is running on AC power. I
> believe it is not worth the hassle to join only the requested groups.
> This is the same that is done in the CDN_ETHER driver.
> 
> Best regards,
> 

-- 
Miguel Rodríguez Pérez
Laboratorio de Redes
EE Telecomunicación – Universidade de Vigo

^ permalink raw reply related

* Re: [PATCH net-next 0/4] Fixes for running mirror-to-gretap tests on veth
From: David Miller @ 2018-06-30 11:35 UTC (permalink / raw)
  To: petrm; +Cc: netdev, linux-kselftest, shuah
In-Reply-To: <cover.1530204784.git.petrm@mellanox.com>

From: Petr Machata <petrm@mellanox.com>
Date: Thu, 28 Jun 2018 18:56:15 +0200

> The forwarding selftests infrastructure makes it possible to run the
> individual tests on a purely software netdevices. Names of interfaces to
> run the test with can be passed as command line arguments to a test.
> lib.sh then creates veth pairs backing the interfaces if none exist in
> the system.
> 
> However, the tests need to recognize that they might be run on a soft
> device. Many mirror-to-gretap tests are buggy in this regard. This patch
> set aims to fix the problems in running mirror-to-gretap tests on veth
> devices.
> 
> In patch #1, a service function is split out of setup_wait().
> In patch #2, installing a trap is made optional.
> In patch #3, tc filters in several tests are tweaked to work with veth.
> In patch #4, the logic for waiting for neighbor is fixed for veth.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next] r8169: use standard debug output functions
From: David Miller @ 2018-06-30 11:45 UTC (permalink / raw)
  To: hkallweit1; +Cc: nic_swsd, netdev
In-Reply-To: <02ff0bd1-cf54-bc1d-9e85-d433bda5909a@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Thu, 28 Jun 2018 20:36:15 +0200

> I see no need to define a private debug output symbol, let's use the
> standard debug output functions instead. In this context also remove
> the deprecated PFX define.
> 
> The one assertion is wrong IMO anyway, this code path is used also
> by chip version 01.
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next 00/10] pnetid and SMC-D support
From: David Miller @ 2018-06-30 11:46 UTC (permalink / raw)
  To: ubraun
  Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, hwippel,
	sebott
In-Reply-To: <20180628170513.5089-1-ubraun@linux.ibm.com>

From: Ursula Braun <ubraun@linux.ibm.com>
Date: Thu, 28 Jun 2018 19:05:03 +0200

> SMC requires a configured pnet table to map Ethernet interfaces to
> RoCE adapter ports. For s390 there exists hardware support to group
> such devices. The first three patches cover the s390 pnetid support,
> enabling SMC-R usage on s390 without configuring an extra pnet table.
> 
> SMC currently requires RoCE adapters, and uses RDMA-techniques
> implemented with IB-verbs. But s390 offers another method for
> intra-CEC Shared Memory communication. The following seven patches
> implement a solution to run SMC traffic based on intra-CEC DMA,
> called SMC-D.

Series applied.

^ permalink raw reply

* Re: [PATCH net-next] net: phy: realtek: add support for RTL8211
From: David Miller @ 2018-06-30 11:47 UTC (permalink / raw)
  To: hkallweit1; +Cc: andrew, f.fainelli, nic_swsd, netdev
In-Reply-To: <4ba9b12d-d7a2-6cfa-7ca5-fc4414c0b1e6@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Thu, 28 Jun 2018 20:46:45 +0200

> In preparation of adding phylib support to the r8169 driver we need
> PHY drivers for all chip-internal PHY types. Fortunately almost all
> of them are either supported by the Realtek PHY driver already or work
> with the genphy driver.
> Still missing is support for the PHY of RTL8169s, it requires a quirk
> to properly support 100Mbit-fixed mode. The quirk was copied from
> r8169 driver which copied it from the vendor driver.
> Based on the PHYID the internal PHY seems to be a RTL8211.
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>

Applied.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox