Netdev List
* Re: [PATCH net-next 4/5] net: phy: bcm7xxx: Add support for downshift/Wirespeed
From: Andrew Lunn @ 2016-11-22 20:57 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, davem, bcm-kernel-feedback-list, allan.nielsen,
	raju.lakkaraju, vivien.didelot
In-Reply-To: <1902f0f0-46e5-d3b3-90c1-10867f4fb826@gmail.com>

> > Maybe we should think about this locking a bit. It is normal for the
> > lock to be held when using ops in the phy driver structure. The
> > exception is suspend/resume. Maybe we should also take the lock before
> > calling the phydev->drv->get_tunable() and phydev->drv->set_tunable()?
> 
> Yes, that certainly seems like a good approach to me, let me cook a
> patch doing that.

Hi Florian

There are a couple of mutex locks/unlocks you will need to remove from
mscc.c when you centralize this mutex.

       Andrew

* Re: [PATCH net-next] net/sched: cls_flower: verify root pointer before dereferncing it
From: Daniel Borkmann @ 2016-11-22 20:41 UTC (permalink / raw)
  To: Cong Wang, Jiri Pirko
  Cc: Roi Dayan, David S. Miller, Linux Kernel Network Developers,
	Jiri Pirko, Or Gerlitz, Cong Wang, John Fastabend
In-Reply-To: <CAM_iQpVTtqQRg7KgQbMQSxxVJRB8a46DGgfU52m=SxF1uYDFWQ@mail.gmail.com>

On 11/22/2016 08:28 PM, Cong Wang wrote:
> On Tue, Nov 22, 2016 at 8:11 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>> Tue, Nov 22, 2016 at 05:04:11PM CET, daniel@iogearbox.net wrote:
>>> Hmm, I don't think we want to have such an additional test in fast
>>> path for each and every classifier. Can we think of ways to avoid that?
>>>
>>> My question is, since we unlink individual instances from such tp-internal
>>> lists through RCU and release the instance through call_rcu() as well as
>>> the head (tp->root) via kfree_rcu() eventually, against what are we protecting
>>> setting RCU_INIT_POINTER(tp->root, NULL) in ->destroy() callback? Something
>>> not respecting grace period?
>>
>> If you call tp->ops->destroy in call_rcu, you don't have to set tp->root
>> to null.

But that's not really an answer to my question. ;)

> We do need to respect the grace period if we touch the globally visible
> data structure tp in tcf_destroy(). Therefore Roi's patch is not fixing the
> right place.

I think there may be multiple issues actually.

At the time we go into tc_classify(), from the ingress as well as the egress
side, we're under RCU, but the BH variant. In the cls delete()/destroy()
callbacks, we use call_rcu() and kfree_rcu() everywhere, same as for
tcf_destroy() where we use kfree_rcu() on tp, although we iterate tps (and
implicitly inner filters) via rcu_dereference_bh() from the reader side. Is
there a reason why we don't use the call_rcu_bh() variant on destruction for
all this instead?

Just looking at cls_bpf and others, what does RCU_INIT_POINTER(tp->root,
NULL) protect against? The tp is unlinked in tc_ctl_tfilter() from the tp chain in
tcf_destroy() cases. Still active readers under RCU BH can race against this
(tp->root being NULL), as the commit identified. Only the get() callback checks
for head against NULL, but both are serialized under rtnl, and the only place
we call this is tc_ctl_tfilter(). Even if we create a new tp, head should not
be NULL there, if it was assigned during the init() cb, but contains an empty
list. (It's different for things like cls_cgroup, though.) So, I'm wondering
if the RCU_INIT_POINTER(tp->root, NULL) can just be removed instead (unless I'm
missing something obvious)?

> Also I don't know why you blame my commit, this problem should already
> exist prior to my commit, probably date back to John's RCU patches.

It seems so.

* [PATCH ethtool v4 2/2] Ethtool: Implements ETHTOOL_PHY_GTUNABLE/ETHTOOL_PHY_STUNABLE and PHY downshift
From: Allan W. Nielsen @ 2016-11-22 20:32 UTC (permalink / raw)
  To: netdev; +Cc: andrew, f.fainelli, raju.lakkaraju, allan.nielsen, Raju Lakkaraju
In-Reply-To: <1479846737-18669-1-git-send-email-allan.nielsen@microsemi.com>

From: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>

Add ethtool get and set tunable to access PHY drivers.

Ethtool Help: ethtool -h for PHY tunables
    ethtool --set-phy-tunable DEVNAME      Set PHY tunable
                [ downshift on|off [count N] ]
    ethtool --get-phy-tunable DEVNAME      Get PHY tunable
                [ downshift ]

Ethtool ex:
  ethtool --set-phy-tunable eth0 downshift on
  ethtool --set-phy-tunable eth0 downshift off
  ethtool --set-phy-tunable eth0 downshift on count 2

  ethtool --get-phy-tunable eth0 downshift

Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>
Signed-off-by: Allan W. Nielsen <allan.nielsen@microsemi.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
---
 ethtool.8.in |  40 +++++++++++++++++
 ethtool.c    | 144 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 184 insertions(+)

diff --git a/ethtool.8.in b/ethtool.8.in
index 9631847..5c36c06 100644
--- a/ethtool.8.in
+++ b/ethtool.8.in
@@ -340,6 +340,18 @@ ethtool \- query or control network driver and hardware settings
 .B2 tx-lpi on off
 .BN tx-timer
 .BN advertise
+.HP
+.B ethtool \-\-set\-phy\-tunable
+.I devname
+.RB [
+.B downshift
+.A1 on off
+.BN count
+.RB ]
+.HP
+.B ethtool \-\-get\-phy\-tunable
+.I devname
+.RB [ downshift ]
 .
 .\" Adjust lines (i.e. full justification) and hyphenate.
 .ad
@@ -947,6 +959,34 @@ Values are as for
 Sets the amount of time the device should stay in idle mode prior to asserting
 its Tx LPI (in microseconds). This has meaning only when Tx LPI is enabled.
 .RE
+.TP
+.B \-\-set\-phy\-tunable
+Sets the PHY tunable parameters.
+.RS 4
+.TP
+.A2 downshift on off
+Specifies whether downshift should be enabled.
+.TS
+nokeep;
+lB	l.
+.BI count \ N
+Sets the PHY downshift retry count.
+.TE
+.PD
+.RE
+.TP
+.B \-\-get\-phy\-tunable
+Gets the PHY tunable parameters.
+.RS 4
+.TP
+.B downshift
+For operation in cabling environments that are incompatible with 1000BASE-T,
+a PHY device may provide an automatic link speed downshift operation.
+The link speed is downshifted after N failed 1000BASE-T auto-negotiation attempts.
+Downshift is useful where the cable does not have all 4 pairs wired.
+
+Gets the PHY downshift count/status.
+.RE
 .SH BUGS
 Not supported (in part or whole) on all network drivers.
 .SH AUTHOR
diff --git a/ethtool.c b/ethtool.c
index 49ac94e..7dcd005 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -4520,6 +4520,146 @@ static int do_seee(struct cmd_context *ctx)
 	return 0;
 }
 
+static int do_get_phy_tunable(struct cmd_context *ctx)
+{
+	int argc = ctx->argc;
+	char **argp = ctx->argp;
+	int err, i;
+	u8 downshift_changed = 0;
+
+	if (argc < 1)
+		exit_bad_args();
+	for (i = 0; i < argc; i++) {
+		if (!strcmp(argp[i], "downshift")) {
+			downshift_changed = 1;
+			i += 1;
+			if (i < argc)
+				exit_bad_args();
+		} else  {
+			exit_bad_args();
+		}
+	}
+
+	if (downshift_changed) {
+		struct ethtool_tunable ds;
+		u8 count = 0;
+
+		ds.cmd = ETHTOOL_PHY_GTUNABLE;
+		ds.id = ETHTOOL_PHY_DOWNSHIFT;
+		ds.type_id = ETHTOOL_TUNABLE_U8;
+		ds.len = 1;
+		ds.data[0] = &count;
+		err = send_ioctl(ctx, &ds);
+		if (err < 0) {
+			perror("Cannot Get PHY downshift count");
+			return 87;
+		}
+		count = *((u8 *)&ds.data[0]);
+		if (count)
+			fprintf(stdout, "Downshift count: %d\n", count);
+		else
+			fprintf(stdout, "Downshift disabled\n");
+	}
+
+	return err;
+}
+
+static int parse_named_bool(struct cmd_context *ctx, const char *name, u8 *on)
+{
+	if (ctx->argc < 2)
+		return 0;
+
+	if (strcmp(*ctx->argp, name))
+		return 0;
+
+	if (!strcmp(*(ctx->argp + 1), "on")) {
+		*on = 1;
+	} else if (!strcmp(*(ctx->argp + 1), "off")) {
+		*on = 0;
+	} else {
+		fprintf(stderr, "Invalid boolean\n");
+		exit_bad_args();
+	}
+
+	ctx->argc -= 2;
+	ctx->argp += 2;
+
+	return 1;
+}
+
+static int parse_named_u8(struct cmd_context *ctx, const char *name, u8 *val)
+{
+	if (ctx->argc < 2)
+		return 0;
+
+	if (strcmp(*ctx->argp, name))
+		return 0;
+
+	*val = get_uint_range(*(ctx->argp + 1), 0, 0xff);
+
+	ctx->argc -= 2;
+	ctx->argp += 2;
+
+	return 1;
+}
+
+static int do_set_phy_tunable(struct cmd_context *ctx)
+{
+	int err = 0;
+	u8 ds_cnt = DOWNSHIFT_DEV_DEFAULT_COUNT;
+	u8 ds_changed = 0, ds_has_cnt = 0, ds_enable = 0;
+
+	if (ctx->argc == 0)
+		exit_bad_args();
+
+	/* Parse arguments */
+	while (ctx->argc) {
+		if (parse_named_bool(ctx, "downshift", &ds_enable)) {
+			ds_changed = 1;
+			ds_has_cnt = parse_named_u8(ctx, "count", &ds_cnt);
+		} else {
+			exit_bad_args();
+		}
+	}
+
+	/* Validate parameters */
+	if (ds_changed) {
+		if (!ds_enable && ds_has_cnt) {
+			fprintf(stderr, "'count' may not be set when downshift "
+				        "is off.\n");
+			exit_bad_args();
+		}
+
+		if (ds_enable && ds_has_cnt && ds_cnt == 0) {
+			fprintf(stderr, "'count' may not be zero.\n");
+			exit_bad_args();
+		}
+
+		if (!ds_enable)
+			ds_cnt = DOWNSHIFT_DEV_DISABLE;
+	}
+
+	/* Do it */
+	if (ds_changed) {
+		struct ethtool_tunable ds;
+		u8 count;
+
+		ds.cmd = ETHTOOL_PHY_STUNABLE;
+		ds.id = ETHTOOL_PHY_DOWNSHIFT;
+		ds.type_id = ETHTOOL_TUNABLE_U8;
+		ds.len = 1;
+		ds.data[0] = &count;
+		*((u8 *)&ds.data[0]) = ds_cnt;
+		err = send_ioctl(ctx, &ds);
+		if (err < 0) {
+			perror("Cannot Set PHY downshift count");
+			err = 87;
+		}
+	}
+
+	return err;
+}
+
 #ifndef TEST_ETHTOOL
 int send_ioctl(struct cmd_context *ctx, void *cmd)
 {
@@ -4681,6 +4821,10 @@ static const struct option {
 	  "		[ advertise %x ]\n"
 	  "		[ tx-lpi on|off ]\n"
 	  "		[ tx-timer %d ]\n"},
+	{ "--set-phy-tunable", 1, do_set_phy_tunable, "Set PHY tunable",
+	  "		[ downshift on|off [count N] ]\n"},
+	{ "--get-phy-tunable", 1, do_get_phy_tunable, "Get PHY tunable",
+	  "		[ downshift ]\n"},
 	{ "-h|--help", 0, show_usage, "Show this help" },
 	{ "--version", 0, do_version, "Show version number" },
 	{}
-- 
2.7.3
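
A minimal sketch of one way to lay out the buffer for a one-byte tunable. In
ethtool-copy.h `data` is declared as a zero-length array, so `ds.data[0]` in
the patch above refers to storage past the end of a stack-allocated
`struct ethtool_tunable`. The sketch assumes the kernel side transfers `len`
payload bytes located directly after the header. The names `tunable_hdr`,
`tunable_u8`, `fake_get_tunable()` and `get_downshift_count()` are
hypothetical, and the value 3 is made up for the demonstration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the fixed part of struct ethtool_tunable. */
struct tunable_hdr {
	uint32_t cmd;
	uint32_t id;
	uint32_t type_id;
	uint32_t len;
};

/* Header plus one byte of payload, laid out contiguously. */
struct tunable_u8 {
	struct tunable_hdr hdr;
	uint8_t value;
};

/* Stand-in for the kernel side of a "get" (assumption): it writes
 * hdr.len payload bytes starting right after the header.
 */
static int fake_get_tunable(void *buf)
{
	struct tunable_hdr hdr;
	uint8_t count = 3;	/* made-up downshift count */

	memcpy(&hdr, buf, sizeof(hdr));
	if (hdr.len != sizeof(count))
		return -1;
	memcpy((unsigned char *)buf + sizeof(hdr), &count, sizeof(count));
	return 0;
}

/* Returns the downshift count, or -1 on error. */
int get_downshift_count(void)
{
	struct tunable_u8 req;

	memset(&req, 0, sizeof(req));
	req.hdr.len = sizeof(req.value);
	/* The payload must sit immediately after the header. */
	assert(offsetof(struct tunable_u8, value) == sizeof(struct tunable_hdr));
	if (fake_get_tunable(&req) < 0)
		return -1;
	return req.value;
}
```

With this layout the payload is read back from `req.value`, and no pointer is
stored in the tunable itself.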

* [PATCH ethtool v4 1/2] ethtool-copy.h:sync with net
From: Allan W. Nielsen @ 2016-11-22 20:32 UTC (permalink / raw)
  To: netdev; +Cc: andrew, f.fainelli, raju.lakkaraju, allan.nielsen, Raju Lakkaraju
In-Reply-To: <1479846737-18669-1-git-send-email-allan.nielsen@microsemi.com>

From: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>

This covers kernel changes up to:

commit f5a4732f85613b3fb43f8bc33a017e3db3b3605a
Author: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>
Date:   Wed Nov 9 16:33:09 2016 +0530

    ethtool: (uapi) Add ETHTOOL_PHY_DOWNSHIFT to PHY tunables

    For operation in cabling environments that are incompatible with
    1000BASE-T, PHY device may provide an automatic link speed downshift
    operation. When enabled, the device automatically changes its 1000BASE-T
    auto-negotiation to the next slower speed after a configured number of
    failed attempts at 1000BASE-T.  This feature is useful in setting up in
    networks using older cable installations that include only pairs A and B,
    and not pairs C and D.

    Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>
    Signed-off-by: Allan W. Nielsen <allan.nielsen@microsemi.com>

Signed-off-by: Allan W. Nielsen <allan.nielsen@microsemi.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
---
 ethtool-copy.h | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/ethtool-copy.h b/ethtool-copy.h
index 70748f5..2e2448f 100644
--- a/ethtool-copy.h
+++ b/ethtool-copy.h
@@ -247,6 +247,19 @@ struct ethtool_tunable {
 	void	*data[0];
 };
 
+#define DOWNSHIFT_DEV_DEFAULT_COUNT	0xff
+#define DOWNSHIFT_DEV_DISABLE		0
+
+enum phy_tunable_id {
+	ETHTOOL_PHY_ID_UNSPEC,
+	ETHTOOL_PHY_DOWNSHIFT,
+	/*
+	 * Add your fresh new phy tunable attribute above and remember to update
+	 * phy_tunable_strings[] in net/core/ethtool.c
+	 */
+	__ETHTOOL_PHY_TUNABLE_COUNT,
+};
+
 /**
  * struct ethtool_regs - hardware register dump
  * @cmd: Command number = %ETHTOOL_GREGS
@@ -547,6 +560,7 @@ struct ethtool_pauseparam {
  * @ETH_SS_FEATURES: Device feature names
  * @ETH_SS_RSS_HASH_FUNCS: RSS hush function names
  * @ETH_SS_PHY_STATS: Statistic names, for use with %ETHTOOL_GPHYSTATS
+ * @ETH_SS_PHY_TUNABLES: PHY tunable names
  */
 enum ethtool_stringset {
 	ETH_SS_TEST		= 0,
@@ -557,6 +571,7 @@ enum ethtool_stringset {
 	ETH_SS_RSS_HASH_FUNCS,
 	ETH_SS_TUNABLES,
 	ETH_SS_PHY_STATS,
+	ETH_SS_PHY_TUNABLES,
 };
 
 /**
@@ -1312,7 +1327,8 @@ struct ethtool_per_queue_op {
 
 #define ETHTOOL_GLINKSETTINGS	0x0000004c /* Get ethtool_link_settings */
 #define ETHTOOL_SLINKSETTINGS	0x0000004d /* Set ethtool_link_settings */
-
+#define ETHTOOL_PHY_GTUNABLE	0x0000004e /* Get PHY tunable configuration */
+#define ETHTOOL_PHY_STUNABLE	0x0000004f /* Set PHY tunable configuration */
 
 /* compatibility with older code */
 #define SPARC_ETH_GSET		ETHTOOL_GSET
-- 
2.7.3

* [PATCH ethtool v4 0/2] Adding downshift support to ethtool
From: Allan W. Nielsen @ 2016-11-22 20:32 UTC (permalink / raw)
  To: netdev; +Cc: andrew, f.fainelli, raju.lakkaraju, allan.nielsen

(downshift feature is applied in the net-next tree - d3c19c0a72)

This series adds support for downshift (using phy-tunables).

Downshifting can either be turned on/off, or it can be configured to a
specific count.

"count" is optional.

Change set:
v1:
- Initial version of set/get phy tunable with downshift feature.
v2:
- (ethtool) Syntax is changed from "--set-phy-tunable downshift on|off|%d"
  to "--set-phy-tunable [downshift on|off [count N]]" - as requested by
  Andrew.
v3:
- Fixed spelling in "ethtool-copy.h:sync with net"
- Fixed "if send_ioctl() returns an error, print the error message and then
  still print the value of count".
v4:
- Fixing spelling in the example included in the commit message
- Improve the description in the man-page

Raju Lakkaraju (2):
  ethtool-copy.h:sync with net
  Ethtool: Implements ETHTOOL_PHY_GTUNABLE/ETHTOOL_PHY_STUNABLE and PHY
    downshift

 ethtool-copy.h |  18 +++++++-
 ethtool.8.in   |  40 ++++++++++++++++
 ethtool.c      | 144 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 201 insertions(+), 1 deletion(-)

-- 
2.7.3
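
The "downshift on|off [count N]" syntax described above maps onto a single u8
value. A minimal sketch of that mapping follows, assuming the same rules as
do_set_phy_tunable() in patch 2/2: "off" gives 0, "on" alone gives the 0xff
default, and "count" is only valid with "on" and must be non-zero. The
function name `parse_downshift()` and the `DS_*` macros are hypothetical, and
the sketch returns -EINVAL where ethtool would call exit_bad_args().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DS_DEFAULT_COUNT 0xff	/* mirrors DOWNSHIFT_DEV_DEFAULT_COUNT */
#define DS_DISABLE       0	/* mirrors DOWNSHIFT_DEV_DISABLE */

/* Parse "downshift on|off [count N]" into the single u8 sent to the PHY.
 * Returns 0 on success or -EINVAL on a malformed command line.
 */
int parse_downshift(int argc, const char *const *argv, uint8_t *out)
{
	unsigned long n;
	char *end;
	int enable;

	assert(out != NULL);
	if (argc < 2 || strcmp(argv[0], "downshift"))
		return -EINVAL;

	if (!strcmp(argv[1], "on"))
		enable = 1;
	else if (!strcmp(argv[1], "off"))
		enable = 0;
	else
		return -EINVAL;

	if (argc == 2) {
		*out = enable ? DS_DEFAULT_COUNT : DS_DISABLE;
		return 0;
	}

	/* Anything after on|off must be exactly "count N", and only with "on". */
	if (argc != 4 || strcmp(argv[2], "count") || !enable)
		return -EINVAL;

	n = strtoul(argv[3], &end, 0);
	if (end == argv[3] || *end != '\0' || n == 0 || n > 0xff)
		return -EINVAL;

	*out = (uint8_t)n;
	return 0;
}
```

Note that N = 255 is indistinguishable from the default sentinel in this
encoding, which is also the case for the 0..0xff range accepted in the patch.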

* [PATCH net-next] ethtool: Protect {get,set}_phy_tunable with PHY device mutex
From: Florian Fainelli @ 2016-11-22 20:13 UTC (permalink / raw)
  To: netdev
  Cc: davem, bcm-kernel-feedback-list, andrew, allan.nielsen,
	raju.lakkaraju, vivien.didelot, Florian Fainelli

PHY drivers should be able to rely on the caller of {get,set}_tunable to
have acquired the PHY device mutex, in order to serialize both against
concurrent calls of these functions and against PHY state machine
changes. All ethtool PHY-level functions do this, except
{get,set}_tunable, so we make them consistent here as well.

Fixes: 968ad9da7e0e ("ethtool: Implements ETHTOOL_PHY_GTUNABLE/ETHTOOL_PHY_STUNABLE")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 net/core/ethtool.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index e9b4556751ff..0adb3bec5b5a 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -2466,7 +2466,9 @@ static int get_phy_tunable(struct net_device *dev, void __user *useraddr)
 	data = kmalloc(tuna.len, GFP_USER);
 	if (!data)
 		return -ENOMEM;
+	mutex_lock(&phydev->lock);
 	ret = phydev->drv->get_tunable(phydev, &tuna, data);
+	mutex_unlock(&phydev->lock);
 	if (ret)
 		goto out;
 	useraddr += sizeof(tuna);
@@ -2501,7 +2503,9 @@ static int set_phy_tunable(struct net_device *dev, void __user *useraddr)
 	ret = -EFAULT;
 	if (copy_from_user(data, useraddr, tuna.len))
 		goto out;
+	mutex_lock(&phydev->lock);
 	ret = phydev->drv->set_tunable(phydev, &tuna, data);
+	mutex_unlock(&phydev->lock);
 
 out:
 	kfree(data);
-- 
2.9.3
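
The convention this patch establishes (the core takes the PHY lock, the
driver callbacks assume it is held) can be modelled in userspace. The sketch
below is an illustration only, using a pthread mutex in place of
phydev->lock. All `toy_*` and `core_*` names are hypothetical, and the
`lock_held` flag is a debugging aid added for the sketch, not something the
kernel code has.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

struct toy_phy;

/* Hypothetical driver ops: both expect the caller to hold phy->lock. */
struct toy_phy_ops {
	int (*get_tunable)(struct toy_phy *phy, uint8_t *val);
	int (*set_tunable)(struct toy_phy *phy, uint8_t val);
};

struct toy_phy {
	pthread_mutex_t lock;
	int lock_held;		/* debug aid, only touched under lock */
	uint8_t downshift;	/* stands in for a shadow register */
	const struct toy_phy_ops *ops;
};

static int toy_get(struct toy_phy *phy, uint8_t *val)
{
	assert(phy->lock_held);	/* caller contract */
	*val = phy->downshift;
	return 0;
}

static int toy_set(struct toy_phy *phy, uint8_t val)
{
	assert(phy->lock_held);	/* caller contract */
	phy->downshift = val;
	return 0;
}

const struct toy_phy_ops toy_ops = {
	.get_tunable = toy_get,
	.set_tunable = toy_set,
};

/* Core-side wrappers: the only place where the mutex is taken. */
int core_get_tunable(struct toy_phy *phy, uint8_t *val)
{
	int ret;

	if (!phy->ops || !phy->ops->get_tunable)
		return -EOPNOTSUPP;
	pthread_mutex_lock(&phy->lock);
	phy->lock_held = 1;
	ret = phy->ops->get_tunable(phy, val);
	phy->lock_held = 0;
	pthread_mutex_unlock(&phy->lock);
	return ret;
}

int core_set_tunable(struct toy_phy *phy, uint8_t val)
{
	int ret;

	if (!phy->ops || !phy->ops->set_tunable)
		return -EOPNOTSUPP;
	pthread_mutex_lock(&phy->lock);
	phy->lock_held = 1;
	ret = phy->ops->set_tunable(phy, val);
	phy->lock_held = 0;
	pthread_mutex_unlock(&phy->lock);
	return ret;
}
```

Keeping the lock in the wrappers means a driver never locks on its own in
these callbacks, which is why driver-local lock/unlock pairs become redundant
once the core does it.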

* Re: [PATCH net-next 4/5] net: phy: bcm7xxx: Add support for downshift/Wirespeed
From: Florian Fainelli @ 2016-11-22 20:07 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, davem, bcm-kernel-feedback-list, allan.nielsen,
	raju.lakkaraju, vivien.didelot
In-Reply-To: <20161122200228.GG14947@lunn.ch>

On 11/22/2016 12:02 PM, Andrew Lunn wrote:
>> +static int bcm7xxx_28nm_set_tunable(struct phy_device *phydev,
>> +				    struct ethtool_tunable *tuna,
>> +				    const void *data)
>> +{
>> +	u8 count = *(u8 *)data;
>> +	int ret;
>> +
>> +	switch (tuna->id) {
>> +	case ETHTOOL_PHY_DOWNSHIFT:
>> +		ret = bcm_phy_downshift_set(phydev, count);
>> +		break;
>> +	default:
>> +		return -EOPNOTSUPP;
>> +	}
>> +
>> +	if (ret)
>> +		return ret;
>> +
>> +	/* Disable EEE advertisement since this prevents the PHY
>> +	 * from successfully linking up, trigger auto-negotiation restart
>> +	 * to let the MAC decide what to do.
>> +	 */
>> +	ret = bcm_phy_set_eee(phydev, count == DOWNSHIFT_DEV_DISABLE);
>> +	if (ret)
>> +		return ret;
>> +
>> +	return genphy_restart_aneg(phydev);
>> +}
> 
> Hi Florian
> 
> Is the locking O.K. here? The core code does not take the phy lock.
> But I think your shadow register accesses at least need to be
> protected by the lock?

There should be some kind of protection, but I was expecting it to be
done at the caller level, so that when {get,set}_tunable run, they are
serialized with respect to each other. Clearly, looking at the code,
this is not the case.

> 
> Maybe we should think about this locking a bit. It is normal for the
> lock to be held when using ops in the phy driver structure. The
> exception is suspend/resume. Maybe we should also take the lock before
> calling the phydev->drv->get_tunable() and phydev->drv->set_tunable()?

Yes, that certainly seems like a good approach to me, let me cook a
patch doing that.
-- 
Florian

* Re: [PATCH net-next 4/5] net: phy: bcm7xxx: Add support for downshift/Wirespeed
From: Andrew Lunn @ 2016-11-22 20:02 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, davem, bcm-kernel-feedback-list, allan.nielsen,
	raju.lakkaraju, vivien.didelot
In-Reply-To: <20161122194058.29820-5-f.fainelli@gmail.com>

> +static int bcm7xxx_28nm_set_tunable(struct phy_device *phydev,
> +				    struct ethtool_tunable *tuna,
> +				    const void *data)
> +{
> +	u8 count = *(u8 *)data;
> +	int ret;
> +
> +	switch (tuna->id) {
> +	case ETHTOOL_PHY_DOWNSHIFT:
> +		ret = bcm_phy_downshift_set(phydev, count);
> +		break;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +
> +	if (ret)
> +		return ret;
> +
> +	/* Disable EEE advertisement since this prevents the PHY
> +	 * from successfully linking up, trigger auto-negotiation restart
> +	 * to let the MAC decide what to do.
> +	 */
> +	ret = bcm_phy_set_eee(phydev, count == DOWNSHIFT_DEV_DISABLE);
> +	if (ret)
> +		return ret;
> +
> +	return genphy_restart_aneg(phydev);
> +}

Hi Florian

Is the locking O.K. here? The core code does not take the phy lock.
But I think your shadow register accesses at least need to be
protected by the lock?

Maybe we should think about this locking a bit. It is normal for the
lock to be held when using ops in the phy driver structure. The
exception is suspend/resume. Maybe we should also take the lock before
calling the phydev->drv->get_tunable() and phydev->drv->set_tunable()?

	  Andrew

* Re: [RFC 02/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) Bus driver
From: Vishwanathapura, Niranjana @ 2016-11-22 19:49 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, linux-rdma, netdev, Dennis Dalessandro
In-Reply-To: <20161122170407.GE3956@obsidianresearch.com>

Ok, I do understand Jason's point that we should probably not put this driver
under drivers/infiniband/sw/.., as this driver is not an HCA.
It is a ULP similar to ipoib, built on top of Omni-path irrespective of
whether we register a hfi_vnic_bus or a direct custom interface with HFI1.
This ULP will transmit and receive Omni-path packets over the fabric, and is
dependent on the IB MAD interface and the HFI1 driver.

Doug,
Will it be acceptable if we put it under 'drivers/infiniband/ulp/hfi_vnic'?

Niranjana

* Re: [PATCH net] flow_dissect: call init_default_flow_dissectors() earlier
From: David Miller @ 2016-11-22 19:44 UTC (permalink / raw)
  To: eric.dumazet
  Cc: maan, jiri, alexander.h.duyck, edumazet, linux-kernel, ast,
	willemb, gregkh, jslaby, yibyang, netdev
In-Reply-To: <1479842250.8455.452.camel@edumazet-glaptop3.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 22 Nov 2016 11:17:30 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> Andre Noll reported panics after my recent fix (commit 34fad54c2537
> "net: __skb_flow_dissect() must cap its return value")
> 
> After some more headaches, Alexander root caused the problem to
> init_default_flow_dissectors() being called too late, in case
> a network driver like IGB is not a module and receives DHCP message
> very early.
> 
> Fix is to call init_default_flow_dissectors() much earlier,
> as it is a core infrastructure and does not depend on another
> kernel service.
> 
> Fixes: 06635a35d13d4 ("flow_dissect: use programable dissector in skb_flow_dissect and friends")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Andre Noll <maan@tuebingen.mpg.de>
> Diagnosed-by: Alexander Duyck <alexander.h.duyck@intel.com>

Applied and queued up for -stable, I'll try to fast-track this.
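
To illustrate why the initcall level matters here, the following toy model
runs a table of init functions level by level. It is a simplified userspace
sketch, not the kernel's initcall machinery: the three-level enum and all
function names are hypothetical, and a built-in NIC driver is assumed to be
initialised at a "device" level that runs before the late-sync level.

```c
#include <assert.h>
#include <stddef.h>

enum level { LVL_CORE, LVL_DEVICE, LVL_LATE_SYNC };

struct initcall {
	enum level level;
	int (*fn)(void);
};

static int dissector_ready;
static int early_packet_ok;

static int init_dissectors(void)
{
	dissector_ready = 1;
	return 0;
}

/* A built-in driver that receives a packet as soon as it is probed. */
static int probe_nic(void)
{
	early_packet_ok = dissector_ready;
	return 0;
}

/* Run the table one level at a time, lowest level first. */
static void run_initcalls(const struct initcall *tbl, size_t n)
{
	for (int lvl = LVL_CORE; lvl <= LVL_LATE_SYNC; lvl++)
		for (size_t i = 0; i < n; i++) {
			assert(tbl[i].fn != NULL);
			if (tbl[i].level == (enum level)lvl)
				tbl[i].fn();
		}
}

/* Returns 1 if the early packet saw an initialised dissector. */
int boot_with_dissector_at(enum level lvl)
{
	const struct initcall tbl[] = {
		{ lvl, init_dissectors },
		{ LVL_DEVICE, probe_nic },
	};

	dissector_ready = 0;
	early_packet_ok = 0;
	run_initcalls(tbl, sizeof(tbl) / sizeof(tbl[0]));
	return early_packet_ok;
}
```

In this model, registering the dissector setup at the late-sync level lets
the NIC probe run first, while registering it at the core level does not.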

* Re: [PATCH net] flow_dissect: call init_default_flow_dissectors() earlier
From: Andre Noll @ 2016-11-22 19:44 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jiri Pirko, Duyck, Alexander H, edumazet@google.com,
	linux-kernel@vger.kernel.org, ast@kernel.org, willemb@google.com,
	gregkh@linuxfoundation.org, jslaby@suse.cz, davem@davemloft.net,
	yibyang@cisco.com, netdev
In-Reply-To: <1479842250.8455.452.camel@edumazet-glaptop3.roam.corp.google.com>

On Tue, Nov 22, 11:17, Eric Dumazet wrote
> -late_initcall_sync(init_default_flow_dissectors);
> +core_initcall(init_default_flow_dissectors);

Indeed, that fixed it. Feel free to add

Tested-by: Andre Noll <maan@tuebingen.mpg.de>

Thanks a lot
Andre
-- 
Max Planck Institute for Developmental Biology
Spemannstraße 35, 72076 Tübingen, Germany. Phone: (+49) 7071 601 829
http://people.tuebingen.mpg.de/maan/

* [PATCH net-next 5/5] net: dsa: bcm_sf2: Ensure we re-negotiate EEE during after link change
From: Florian Fainelli @ 2016-11-22 19:40 UTC (permalink / raw)
  To: netdev
  Cc: davem, bcm-kernel-feedback-list, andrew, allan.nielsen,
	raju.lakkaraju, vivien.didelot, Florian Fainelli
In-Reply-To: <20161122194058.29820-1-f.fainelli@gmail.com>

In case the link changes and EEE is enabled or disabled, always try to
re-negotiate this with the link partner.

Fixes: 450b05c15f9c ("net: dsa: bcm_sf2: add support for controlling EEE")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/dsa/bcm_sf2.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index e3ee27ce13dd..9ec33b51a0ed 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -588,6 +588,7 @@ static void bcm_sf2_sw_adjust_link(struct dsa_switch *ds, int port,
 				   struct phy_device *phydev)
 {
 	struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
+	struct ethtool_eee *p = &priv->port_sts[port].eee;
 	u32 id_mode_dis = 0, port_mode;
 	const char *str = NULL;
 	u32 reg;
@@ -662,6 +663,9 @@ static void bcm_sf2_sw_adjust_link(struct dsa_switch *ds, int port,
 		reg |= DUPLX_MODE;
 
 	core_writel(priv, reg, CORE_STS_OVERRIDE_GMIIP_PORT(port));
+
+	if (!phydev->is_pseudo_fixed_link)
+		p->eee_enabled = bcm_sf2_eee_init(ds, port, phydev);
 }
 
 static void bcm_sf2_sw_fixed_link_update(struct dsa_switch *ds, int port,
-- 
2.9.3

* [PATCH net-next 4/5] net: phy: bcm7xxx: Add support for downshift/Wirespeed
From: Florian Fainelli @ 2016-11-22 19:40 UTC (permalink / raw)
  To: netdev
  Cc: davem, bcm-kernel-feedback-list, andrew, allan.nielsen,
	raju.lakkaraju, vivien.didelot, Florian Fainelli
In-Reply-To: <20161122194058.29820-1-f.fainelli@gmail.com>

Add support for configuring the downshift/Wirespeed enable/disable
toggles and specify a link retry value ranging from 1 to 9. Since the
integrated BCM7xxx PHYs have issues when Wirespeed and EEE are both
enabled, we disable EEE if Wirespeed is enabled.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/phy/bcm7xxx.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/bcm7xxx.c b/drivers/net/phy/bcm7xxx.c
index b7789e879670..5b3be4c67be8 100644
--- a/drivers/net/phy/bcm7xxx.c
+++ b/drivers/net/phy/bcm7xxx.c
@@ -167,6 +167,7 @@ static int bcm7xxx_28nm_config_init(struct phy_device *phydev)
 {
 	u8 rev = PHY_BRCM_7XXX_REV(phydev->dev_flags);
 	u8 patch = PHY_BRCM_7XXX_PATCH(phydev->dev_flags);
+	u8 count;
 	int ret = 0;
 
 	pr_info_once("%s: %s PHY revision: 0x%02x, patch: %d\n",
@@ -199,7 +200,12 @@ static int bcm7xxx_28nm_config_init(struct phy_device *phydev)
 	if (ret)
 		return ret;
 
-	ret = bcm_phy_set_eee(phydev, true);
+	ret = bcm_phy_downshift_get(phydev, &count);
+	if (ret)
+		return ret;
+
+	/* Only enable EEE if Wirespeed/downshift is disabled */
+	ret = bcm_phy_set_eee(phydev, count == DOWNSHIFT_DEV_DISABLE);
 	if (ret)
 		return ret;
 
@@ -303,6 +309,47 @@ static int bcm7xxx_suspend(struct phy_device *phydev)
 	return 0;
 }
 
+static int bcm7xxx_28nm_get_tunable(struct phy_device *phydev,
+				    struct ethtool_tunable *tuna,
+				    void *data)
+{
+	switch (tuna->id) {
+	case ETHTOOL_PHY_DOWNSHIFT:
+		return bcm_phy_downshift_get(phydev, (u8 *)data);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int bcm7xxx_28nm_set_tunable(struct phy_device *phydev,
+				    struct ethtool_tunable *tuna,
+				    const void *data)
+{
+	u8 count = *(u8 *)data;
+	int ret;
+
+	switch (tuna->id) {
+	case ETHTOOL_PHY_DOWNSHIFT:
+		ret = bcm_phy_downshift_set(phydev, count);
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	if (ret)
+		return ret;
+
+	/* Disable EEE advertisement since this prevents the PHY
+	 * from successfully linking up, trigger auto-negotiation restart
+	 * to let the MAC decide what to do.
+	 */
+	ret = bcm_phy_set_eee(phydev, count == DOWNSHIFT_DEV_DISABLE);
+	if (ret)
+		return ret;
+
+	return genphy_restart_aneg(phydev);
+}
+
 #define BCM7XXX_28NM_GPHY(_oui, _name)					\
 {									\
 	.phy_id		= (_oui),					\
@@ -315,6 +362,8 @@ static int bcm7xxx_suspend(struct phy_device *phydev)
 	.config_aneg	= genphy_config_aneg,				\
 	.read_status	= genphy_read_status,				\
 	.resume		= bcm7xxx_28nm_resume,				\
+	.get_tunable	= bcm7xxx_28nm_get_tunable,			\
+	.set_tunable	= bcm7xxx_28nm_set_tunable,			\
 }
 
 #define BCM7XXX_40NM_EPHY(_oui, _name)					\
-- 
2.9.3

* [PATCH net-next 2/5] net: phy: broadcom: Add support code for downshift/Wirespeed
From: Florian Fainelli @ 2016-11-22 19:40 UTC (permalink / raw)
  To: netdev
  Cc: davem, bcm-kernel-feedback-list, andrew, allan.nielsen,
	raju.lakkaraju, vivien.didelot, Florian Fainelli
In-Reply-To: <20161122194058.29820-1-f.fainelli@gmail.com>

Broadcom's Wirespeed feature allows us to configure how auto-negotiation
should behave with fewer working pairs of wires on a cable. Add support
code for retrieving and setting such downshift counters using the
recently added ethtool downshift tunables.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/phy/bcm-phy-lib.c | 86 +++++++++++++++++++++++++++++++++++++++++++
 drivers/net/phy/bcm-phy-lib.h |  5 +++
 include/linux/brcmphy.h       | 10 +++++
 3 files changed, 101 insertions(+)

diff --git a/drivers/net/phy/bcm-phy-lib.c b/drivers/net/phy/bcm-phy-lib.c
index 18e11b3a0f41..d742894816f6 100644
--- a/drivers/net/phy/bcm-phy-lib.c
+++ b/drivers/net/phy/bcm-phy-lib.c
@@ -225,6 +225,92 @@ int bcm_phy_enable_eee(struct phy_device *phydev)
 }
 EXPORT_SYMBOL_GPL(bcm_phy_enable_eee);
 
+int bcm_phy_downshift_get(struct phy_device *phydev, u8 *count)
+{
+	int val;
+
+	val = bcm54xx_auxctl_read(phydev, MII_BCM54XX_AUXCTL_SHDWSEL_MISC);
+	if (val < 0)
+		return val;
+
+	/* Check if wirespeed is enabled or not */
+	if (!(val & MII_BCM54XX_AUXCTL_SHDWSEL_MISC_WIRESPEED_EN)) {
+		*count = DOWNSHIFT_DEV_DISABLE;
+		return 0;
+	}
+
+	val = bcm_phy_read_shadow(phydev, BCM54XX_SHD_SCR2);
+	if (val < 0)
+		return val;
+
+	/* Downgrade after one link attempt */
+	if (val & BCM54XX_SHD_SCR2_WSPD_RTRY_DIS) {
+		*count = 1;
+	} else {
+		/* Downgrade after configured retry count */
+		val >>= BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_SHIFT;
+		val &= BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_MASK;
+		*count = val + BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_OFFSET;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(bcm_phy_downshift_get);
+
+int bcm_phy_downshift_set(struct phy_device *phydev, u8 count)
+{
+	int val = 0, ret = 0;
+
+	/* Range check the number given */
+	if (count - BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_OFFSET >
+	    BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_MASK &&
+	    count != DOWNSHIFT_DEV_DEFAULT_COUNT) {
+		return -ERANGE;
+	}
+
+	val = bcm54xx_auxctl_read(phydev, MII_BCM54XX_AUXCTL_SHDWSEL_MISC);
+	if (val < 0)
+		return val;
+
+	/* Set the write enable bit */
+	val |= MII_BCM54XX_AUXCTL_MISC_WREN;
+
+	if (count == DOWNSHIFT_DEV_DISABLE) {
+		val &= ~MII_BCM54XX_AUXCTL_SHDWSEL_MISC_WIRESPEED_EN;
+		return bcm54xx_auxctl_write(phydev,
+					    MII_BCM54XX_AUXCTL_SHDWSEL_MISC,
+					    val);
+	} else {
+		val |= MII_BCM54XX_AUXCTL_SHDWSEL_MISC_WIRESPEED_EN;
+		ret = bcm54xx_auxctl_write(phydev,
+					   MII_BCM54XX_AUXCTL_SHDWSEL_MISC,
+					   val);
+		if (ret < 0)
+			return ret;
+	}
+
+	val = bcm_phy_read_shadow(phydev, BCM54XX_SHD_SCR2);
+	val &= ~(BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_MASK <<
+		 BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_SHIFT |
+		 BCM54XX_SHD_SCR2_WSPD_RTRY_DIS);
+
+	switch (count) {
+	case 1:
+		val |= BCM54XX_SHD_SCR2_WSPD_RTRY_DIS;
+		break;
+	case DOWNSHIFT_DEV_DEFAULT_COUNT:
+		val |= 1 << BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_SHIFT;
+		break;
+	default:
+		val |= (count - BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_OFFSET) <<
+			BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_SHIFT;
+		break;
+	}
+
+	return bcm_phy_write_shadow(phydev, BCM54XX_SHD_SCR2, val);
+}
+EXPORT_SYMBOL_GPL(bcm_phy_downshift_set);
+
 MODULE_DESCRIPTION("Broadcom PHY Library");
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR("Broadcom Corporation");
diff --git a/drivers/net/phy/bcm-phy-lib.h b/drivers/net/phy/bcm-phy-lib.h
index 31cb4fdf5d5a..3f492e629094 100644
--- a/drivers/net/phy/bcm-phy-lib.h
+++ b/drivers/net/phy/bcm-phy-lib.h
@@ -37,4 +37,9 @@ int bcm_phy_config_intr(struct phy_device *phydev);
 int bcm_phy_enable_apd(struct phy_device *phydev, bool dll_pwr_down);
 
 int bcm_phy_enable_eee(struct phy_device *phydev);
+
+int bcm_phy_downshift_get(struct phy_device *phydev, u8 *count);
+
+int bcm_phy_downshift_set(struct phy_device *phydev, u8 count);
+
 #endif /* _LINUX_BCM_PHY_LIB_H */
diff --git a/include/linux/brcmphy.h b/include/linux/brcmphy.h
index 848dc508ef57..f9f8aaf9c943 100644
--- a/include/linux/brcmphy.h
+++ b/include/linux/brcmphy.h
@@ -114,6 +114,7 @@
 #define MII_BCM54XX_AUXCTL_SHDWSEL_MISC	0x0007
 #define MII_BCM54XX_AUXCTL_SHDWSEL_READ_SHIFT	12
 #define MII_BCM54XX_AUXCTL_SHDWSEL_MISC_RGMII_SKEW_EN	(1 << 8)
+#define MII_BCM54XX_AUXCTL_SHDWSEL_MISC_WIRESPEED_EN	(1 << 4)
 
 #define MII_BCM54XX_AUXCTL_SHDWSEL_MASK	0x0007
 
@@ -130,6 +131,7 @@
 #define BCM_LED_SRC_INTR	0x6
 #define BCM_LED_SRC_QUALITY	0x7
 #define BCM_LED_SRC_RCVLED	0x8
+#define BCM_LED_SRC_WIRESPEED	0x9
 #define BCM_LED_SRC_MULTICOLOR1	0xa
 #define BCM_LED_SRC_OPENSHORT	0xb
 #define BCM_LED_SRC_OFF		0xe	/* Tied high */
@@ -141,6 +143,14 @@
  * Shadow values go into bits [14:10] of register 0x1c to select a shadow
  * register to access.
  */
+
+/* 00100: Reserved control register 2 */
+#define BCM54XX_SHD_SCR2		0x04
+#define  BCM54XX_SHD_SCR2_WSPD_RTRY_DIS	0x100
+#define  BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_SHIFT	2
+#define  BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_OFFSET	2
+#define  BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_MASK	0x7
+
 /* 00101: Spare Control Register 3 */
 #define BCM54XX_SHD_SCR3		0x05
 #define  BCM54XX_SHD_SCR3_DEF_CLK125	0x0001
-- 
2.9.3

^ permalink raw reply related
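
The retry-limit encoding in bcm_phy_downshift_set() above can be sketched in isolation. This is only an illustration, not the driver code: the helper name scr2_encode_downshift() is hypothetical, the masking step is inferred from the visible tail of the statement in the hunk, the 0xff value for DOWNSHIFT_DEV_DEFAULT_COUNT is assumed from the ethtool UAPI header, and the range check on count is an assumption about what the caller has already validated.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the include/linux/brcmphy.h hunk above. */
#define BCM54XX_SHD_SCR2_WSPD_RTRY_DIS		0x100
#define BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_SHIFT	2
#define BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_OFFSET	2
#define BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_MASK	0x7

/* Assumption: same value as DOWNSHIFT_DEV_DEFAULT_COUNT in the ethtool UAPI. */
#define DOWNSHIFT_DEV_DEFAULT_COUNT		0xff

/* Hypothetical helper: return the SCR2 value with the Wirespeed retry
 * fields rewritten for the requested downshift count.
 */
static uint16_t scr2_encode_downshift(uint16_t val, uint8_t count)
{
	/* Clear the retry limit field and the retry disable bit first. */
	val &= (uint16_t)~((BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_MASK <<
			    BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_SHIFT) |
			   BCM54XX_SHD_SCR2_WSPD_RTRY_DIS);

	switch (count) {
	case 1:
		/* A single attempt means retries are disabled. */
		val |= BCM54XX_SHD_SCR2_WSPD_RTRY_DIS;
		break;
	case DOWNSHIFT_DEV_DEFAULT_COUNT:
		/* Default: field value 1 in the retry limit field. */
		val |= 1 << BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_SHIFT;
		break;
	default:
		/* Assumed precondition: count fits the 3-bit field after
		 * subtracting the offset, i.e. 2 <= count <= 9.
		 */
		assert(count >= BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_OFFSET);
		assert(count - BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_OFFSET <=
		       BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_MASK);
		val |= (uint16_t)((count - BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_OFFSET) <<
				  BCM54XX_SHD_SCR2_WSPD_RTRY_LMT_SHIFT);
		break;
	}

	return val;
}
```

With these definitions, a count of 5 is stored as field value 3, and unrelated bits of the register are left untouched.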

* [PATCH net-next 3/5] net: phy: broadcom: Allow enabling or disabling of EEE
From: Florian Fainelli @ 2016-11-22 19:40 UTC (permalink / raw)
  To: netdev
  Cc: davem, bcm-kernel-feedback-list, andrew, allan.nielsen,
	raju.lakkaraju, vivien.didelot, Florian Fainelli
In-Reply-To: <20161122194058.29820-1-f.fainelli@gmail.com>

In preparation for adding support for Wirespeed/downshift, we need to
change bcm_phy_eee_enable() to allow enabling or disabling EEE, so make
the function take an extra enable/disable boolean parameter and rename
it to illustrate it sets EEE, not necessarily just enables it.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/phy/bcm-cygnus.c  |  2 +-
 drivers/net/phy/bcm-phy-lib.c | 14 ++++++++++----
 drivers/net/phy/bcm-phy-lib.h |  2 +-
 drivers/net/phy/bcm7xxx.c     |  2 +-
 4 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/net/phy/bcm-cygnus.c b/drivers/net/phy/bcm-cygnus.c
index 49bbc6826883..196400cddf68 100644
--- a/drivers/net/phy/bcm-cygnus.c
+++ b/drivers/net/phy/bcm-cygnus.c
@@ -104,7 +104,7 @@ static int bcm_cygnus_config_init(struct phy_device *phydev)
 		return rc;
 
 	/* Advertise EEE */
-	rc = bcm_phy_enable_eee(phydev);
+	rc = bcm_phy_set_eee(phydev, true);
 	if (rc)
 		return rc;
 
diff --git a/drivers/net/phy/bcm-phy-lib.c b/drivers/net/phy/bcm-phy-lib.c
index d742894816f6..3156ce6d5861 100644
--- a/drivers/net/phy/bcm-phy-lib.c
+++ b/drivers/net/phy/bcm-phy-lib.c
@@ -195,7 +195,7 @@ int bcm_phy_enable_apd(struct phy_device *phydev, bool dll_pwr_down)
 }
 EXPORT_SYMBOL_GPL(bcm_phy_enable_apd);
 
-int bcm_phy_enable_eee(struct phy_device *phydev)
+int bcm_phy_set_eee(struct phy_device *phydev, bool enable)
 {
 	int val;
 
@@ -205,7 +205,10 @@ int bcm_phy_enable_eee(struct phy_device *phydev)
 	if (val < 0)
 		return val;
 
-	val |= LPI_FEATURE_EN | LPI_FEATURE_EN_DIG1000X;
+	if (enable)
+		val |= LPI_FEATURE_EN | LPI_FEATURE_EN_DIG1000X;
+	else
+		val &= ~(LPI_FEATURE_EN | LPI_FEATURE_EN_DIG1000X);
 
 	phy_write_mmd_indirect(phydev, BRCM_CL45VEN_EEE_CONTROL,
 			       MDIO_MMD_AN, (u32)val);
@@ -216,14 +219,17 @@ int bcm_phy_enable_eee(struct phy_device *phydev)
 	if (val < 0)
 		return val;
 
-	val |= (MDIO_AN_EEE_ADV_100TX | MDIO_AN_EEE_ADV_1000T);
+	if (enable)
+		val |= (MDIO_AN_EEE_ADV_100TX | MDIO_AN_EEE_ADV_1000T);
+	else
+		val &= ~(MDIO_AN_EEE_ADV_100TX | MDIO_AN_EEE_ADV_1000T);
 
 	phy_write_mmd_indirect(phydev, BCM_CL45VEN_EEE_ADV,
 			       MDIO_MMD_AN, (u32)val);
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(bcm_phy_enable_eee);
+EXPORT_SYMBOL_GPL(bcm_phy_set_eee);
 
 int bcm_phy_downshift_get(struct phy_device *phydev, u8 *count)
 {
diff --git a/drivers/net/phy/bcm-phy-lib.h b/drivers/net/phy/bcm-phy-lib.h
index 3f492e629094..a117f657c6d7 100644
--- a/drivers/net/phy/bcm-phy-lib.h
+++ b/drivers/net/phy/bcm-phy-lib.h
@@ -36,7 +36,7 @@ int bcm_phy_config_intr(struct phy_device *phydev);
 
 int bcm_phy_enable_apd(struct phy_device *phydev, bool dll_pwr_down);
 
-int bcm_phy_enable_eee(struct phy_device *phydev);
+int bcm_phy_set_eee(struct phy_device *phydev, bool enable);
 
 int bcm_phy_downshift_get(struct phy_device *phydev, u8 *count);
 
diff --git a/drivers/net/phy/bcm7xxx.c b/drivers/net/phy/bcm7xxx.c
index 9636da0b6efc..b7789e879670 100644
--- a/drivers/net/phy/bcm7xxx.c
+++ b/drivers/net/phy/bcm7xxx.c
@@ -199,7 +199,7 @@ static int bcm7xxx_28nm_config_init(struct phy_device *phydev)
 	if (ret)
 		return ret;
 
-	ret = bcm_phy_enable_eee(phydev);
+	ret = bcm_phy_set_eee(phydev, true);
 	if (ret)
 		return ret;
 
-- 
2.9.3

^ permalink raw reply related
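
The enable/disable handling added to bcm_phy_set_eee() above is a plain read-modify-write on a bit mask. A minimal sketch of the advertisement half follows, outside the kernel and without any MDIO access; the helper name eee_adv_apply() is hypothetical, and the two bit values are assumed to match the definitions in the Linux MDIO UAPI header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: same values as in the Linux MDIO UAPI header. */
#define MDIO_AN_EEE_ADV_100TX	0x0002
#define MDIO_AN_EEE_ADV_1000T	0x0004

/* Hypothetical helper: set or clear the EEE advertisement bits in a
 * register value that was read beforehand, leaving other bits alone.
 */
static uint16_t eee_adv_apply(uint16_t val, bool enable)
{
	const uint16_t bits = MDIO_AN_EEE_ADV_100TX | MDIO_AN_EEE_ADV_1000T;

	if (enable)
		val |= bits;
	else
		val &= (uint16_t)~bits;

	return val;
}
```

Calling it with enable set to false on a value of 0x000e yields 0x0008, so only the two EEE bits change.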

* [PATCH net-next 1/5] net: phy: broadcom: Move bcm54xx_auxctl_{read,write} to common library
From: Florian Fainelli @ 2016-11-22 19:40 UTC (permalink / raw)
  To: netdev
  Cc: davem, bcm-kernel-feedback-list, andrew, allan.nielsen,
	raju.lakkaraju, vivien.didelot, Florian Fainelli
In-Reply-To: <20161122194058.29820-1-f.fainelli@gmail.com>

We are going to need these functions to implement support for Broadcom
Wirespeed, aka downshift.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/phy/bcm-phy-lib.c | 17 +++++++++++++++++
 drivers/net/phy/bcm-phy-lib.h |  3 +++
 drivers/net/phy/broadcom.c    | 15 ---------------
 3 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/net/phy/bcm-phy-lib.c b/drivers/net/phy/bcm-phy-lib.c
index df0416db0b88..18e11b3a0f41 100644
--- a/drivers/net/phy/bcm-phy-lib.c
+++ b/drivers/net/phy/bcm-phy-lib.c
@@ -50,6 +50,23 @@ int bcm_phy_read_exp(struct phy_device *phydev, u16 reg)
 }
 EXPORT_SYMBOL_GPL(bcm_phy_read_exp);
 
+int bcm54xx_auxctl_read(struct phy_device *phydev, u16 regnum)
+{
+	/* The register must be written to both the Shadow Register Select and
+	 * the Shadow Read Register Selector
+	 */
+	phy_write(phydev, MII_BCM54XX_AUX_CTL, regnum |
+		  regnum << MII_BCM54XX_AUXCTL_SHDWSEL_READ_SHIFT);
+	return phy_read(phydev, MII_BCM54XX_AUX_CTL);
+}
+EXPORT_SYMBOL_GPL(bcm54xx_auxctl_read);
+
+int bcm54xx_auxctl_write(struct phy_device *phydev, u16 regnum, u16 val)
+{
+	return phy_write(phydev, MII_BCM54XX_AUX_CTL, regnum | val);
+}
+EXPORT_SYMBOL(bcm54xx_auxctl_write);
+
 int bcm_phy_write_misc(struct phy_device *phydev,
 		       u16 reg, u16 chl, u16 val)
 {
diff --git a/drivers/net/phy/bcm-phy-lib.h b/drivers/net/phy/bcm-phy-lib.h
index b2091c88b44d..31cb4fdf5d5a 100644
--- a/drivers/net/phy/bcm-phy-lib.h
+++ b/drivers/net/phy/bcm-phy-lib.h
@@ -19,6 +19,9 @@
 int bcm_phy_write_exp(struct phy_device *phydev, u16 reg, u16 val);
 int bcm_phy_read_exp(struct phy_device *phydev, u16 reg);
 
+int bcm54xx_auxctl_write(struct phy_device *phydev, u16 regnum, u16 val);
+int bcm54xx_auxctl_read(struct phy_device *phydev, u16 regnum);
+
 int bcm_phy_write_misc(struct phy_device *phydev,
 		       u16 reg, u16 chl, u16 value);
 int bcm_phy_read_misc(struct phy_device *phydev,
diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c
index b1e32e9be1b3..409b365f12b1 100644
--- a/drivers/net/phy/broadcom.c
+++ b/drivers/net/phy/broadcom.c
@@ -30,21 +30,6 @@ MODULE_DESCRIPTION("Broadcom PHY driver");
 MODULE_AUTHOR("Maciej W. Rozycki");
 MODULE_LICENSE("GPL");
 
-static int bcm54xx_auxctl_read(struct phy_device *phydev, u16 regnum)
-{
-	/* The register must be written to both the Shadow Register Select and
-	 * the Shadow Read Register Selector
-	 */
-	phy_write(phydev, MII_BCM54XX_AUX_CTL, regnum |
-		  regnum << MII_BCM54XX_AUXCTL_SHDWSEL_READ_SHIFT);
-	return phy_read(phydev, MII_BCM54XX_AUX_CTL);
-}
-
-static int bcm54xx_auxctl_write(struct phy_device *phydev, u16 regnum, u16 val)
-{
-	return phy_write(phydev, MII_BCM54XX_AUX_CTL, regnum | val);
-}
-
 static int bcm54810_config(struct phy_device *phydev)
 {
 	int rc, val;
-- 
2.9.3

^ permalink raw reply related
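
The comment in bcm54xx_auxctl_read() above says the shadow register number has to be written to two fields of the same register. The sketch below only computes the 16-bit values that would be handed to phy_write(); the helper names are hypothetical and no hardware access is modelled. The two constants are taken from the brcmphy.h context shown earlier in this listing.

```c
#include <stdint.h>

/* From include/linux/brcmphy.h as quoted in this thread. */
#define MII_BCM54XX_AUXCTL_SHDWSEL_MISC		0x0007
#define MII_BCM54XX_AUXCTL_SHDWSEL_READ_SHIFT	12

/* Hypothetical helper: value written before a read, carrying regnum in
 * both the shadow select field and the shadow read selector field.
 */
static uint16_t auxctl_read_select(uint16_t regnum)
{
	return (uint16_t)(regnum |
			  (regnum << MII_BCM54XX_AUXCTL_SHDWSEL_READ_SHIFT));
}

/* Hypothetical helper: value written for a write, regnum OR'ed with the
 * payload bits, as bcm54xx_auxctl_write() does.
 */
static uint16_t auxctl_write_value(uint16_t regnum, uint16_t val)
{
	return (uint16_t)(regnum | val);
}
```

For the MISC shadow register (0x0007) the read selector value works out to 0x7007.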

* [PATCH net-next 0/5] net: phy: broadcom: Wirespeed/downshift support
From: Florian Fainelli @ 2016-11-22 19:40 UTC (permalink / raw)
  To: netdev
  Cc: davem, bcm-kernel-feedback-list, andrew, allan.nielsen,
	raju.lakkaraju, vivien.didelot, Florian Fainelli

Hi all,

This patch series adds support for the Broadcom Wirespeed, aka downshift, feature
utilizing the recently added ethtool PHY tunables.

Tested with two Gigabit link partners with a 4-wire cable having only 2 pairs
connected.

The last patch in the series is a fix that was required for testing, which should
make it to -stable, and which I can submit separately against net if you prefer, David.

Thanks!

Florian Fainelli (5):
  net: phy: broadcom: Move bcm54xx_auxctl_{read,write} to common library
  net: phy: broadcom: Add support code for downshift/Wirespeed
  net: phy: broadcom: Allow enabling or disabling of EEE
  net: phy: bcm7xxx: Add support for downshift/Wirespeed
  net: dsa: bcm_sf2: Ensure we re-negotiate EEE during after link change

 drivers/net/dsa/bcm_sf2.c     |   4 ++
 drivers/net/phy/bcm-cygnus.c  |   2 +-
 drivers/net/phy/bcm-phy-lib.c | 117 ++++++++++++++++++++++++++++++++++++++++--
 drivers/net/phy/bcm-phy-lib.h |  10 +++-
 drivers/net/phy/bcm7xxx.c     |  51 +++++++++++++++++-
 drivers/net/phy/broadcom.c    |  15 ------
 include/linux/brcmphy.h       |  10 ++++
 7 files changed, 187 insertions(+), 22 deletions(-)

-- 
2.9.3

^ permalink raw reply

* Re: [PATCH net-next] net/sched: cls_flower: verify root pointer before dereferncing it
From: Cong Wang @ 2016-11-22 19:28 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Daniel Borkmann, Roi Dayan, David S. Miller,
	Linux Kernel Network Developers, Jiri Pirko, Or Gerlitz,
	Cong Wang, John Fastabend
In-Reply-To: <20161122161159.GC1819@nanopsycho>

On Tue, Nov 22, 2016 at 8:11 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> Tue, Nov 22, 2016 at 05:04:11PM CET, daniel@iogearbox.net wrote:
>>Hmm, I don't think we want to have such an additional test in fast
>>path for each and every classifier. Can we think of ways to avoid that?
>>
>>My question is, since we unlink individual instances from such tp-internal
>>lists through RCU and release the instance through call_rcu() as well as
>>the head (tp->root) via kfree_rcu() eventually, against what are we protecting
>>setting RCU_INIT_POINTER(tp->root, NULL) in ->destroy() callback? Something
>>not respecting grace period?
>
> If you call tp->ops->destroy in call_rcu, you don't have to set tp->root
> to null.

We do need to respect the grace period if we touch the globally visible
data structure tp in tcf_destroy(). Therefore Roi's patch is not fixing the
right place.

Also I don't know why you blame my commit, this problem should already
exist prior to my commit, probably dating back to John's RCU patches.

I am working on a patch.

^ permalink raw reply

* [PATCH net] flow_dissect: call init_default_flow_dissectors() earlier
From: Eric Dumazet @ 2016-11-22 19:17 UTC (permalink / raw)
  To: Andre Noll, Jiri Pirko
  Cc: Duyck, Alexander H, edumazet@google.com,
	linux-kernel@vger.kernel.org, ast@kernel.org, willemb@google.com,
	gregkh@linuxfoundation.org, jslaby@suse.cz, davem@davemloft.net,
	yibyang@cisco.com, netdev
In-Reply-To: <20161122182258.GF19939@tuebingen.mpg.de>

From: Eric Dumazet <edumazet@google.com>

Andre Noll reported panics after my recent fix (commit 34fad54c2537
"net: __skb_flow_dissect() must cap its return value")

After some more headaches, Alexander root caused the problem to
init_default_flow_dissectors() being called too late, in case
a network driver like IGB is not a module and receives a DHCP message
very early.

Fix is to call init_default_flow_dissectors() much earlier,
as it is a core infrastructure and does not depend on another
kernel service.

Fixes: 06635a35d13d4 ("flow_dissect: use programable dissector in skb_flow_dissect and friends")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andre Noll <maan@tuebingen.mpg.de>
Diagnosed-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 net/core/flow_dissector.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 69e4463a4b1b..c6d8207ffa7e 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -1013,4 +1013,4 @@ static int __init init_default_flow_dissectors(void)
 	return 0;
 }
 
-late_initcall_sync(init_default_flow_dissectors);
+core_initcall(init_default_flow_dissectors);

^ permalink raw reply related
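
The ordering problem described in the flow_dissect patch above can be modelled with a toy comparison of initcall levels. This is a simplified sketch, not kernel code: the enum and the helper runs_before() are hypothetical, intermediate levels are omitted, and it relies on the assumption that a built-in driver registered through module_init() runs at the device initcall level.

```c
#include <stdbool.h>

/* Simplified model: only the levels relevant here, in execution order.
 * The real kernel has more levels in between.
 */
enum initcall_level {
	LEVEL_CORE = 1,		/* core_initcall() */
	LEVEL_DEVICE = 6,	/* device_initcall(), built-in module_init() */
	LEVEL_LATE_SYNC = 8,	/* late_initcall_sync(), after late_initcall() */
};

/* Hypothetical helper: true if an initcall at level a runs before one
 * at level b during boot.
 */
static bool runs_before(enum initcall_level a, enum initcall_level b)
{
	return a < b;
}
```

In this model a built-in driver at the device level runs before a late_initcall_sync() hook but after a core_initcall() hook, which matches the motivation for the one-line change.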

* Re: [PATCH net-next] tcp: enhance tcp_collapse_retrans() with skb_shift()
From: Eric Dumazet @ 2016-11-22 18:57 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Neal Cardwell, Yuchung Cheng
In-Reply-To: <1479243110.8455.135.camel@edumazet-glaptop3.roam.corp.google.com>

On Tue, 2016-11-15 at 12:51 -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> In commit 2331ccc5b323 ("tcp: enhance tcp collapsing"),
> we made a first step allowing copying right skb to left skb head.
> 
> Since all skbs in socket write queue are headless (but possibly the very
> first one), this strategy often does not work.
> 
> This patch extends tcp_collapse_retrans() to perform frag shifting,
> thanks to skb_shift() helper.
> 
> This helper needs to not BUG on non headless skbs, as callers are ok
> with that.
> 
> Tested:
> 
> Following packetdrill test now passes :
> 
> 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
>    +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>    +0 bind(3, ..., ...) = 0
>    +0 listen(3, 1) = 0
> 
>    +0 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 8>
>    +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
> +.100 < . 1:1(0) ack 1 win 257
>    +0 accept(3, ..., ...) = 4
> 
>    +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
>    +0 write(4, ..., 200) = 200
>    +0 > P. 1:201(200) ack 1
> +.001 write(4, ..., 200) = 200
>    +0 > P. 201:401(200) ack 1
> +.001 write(4, ..., 200) = 200
>    +0 > P. 401:601(200) ack 1
> +.001 write(4, ..., 200) = 200
>    +0 > P. 601:801(200) ack 1
> +.001 write(4, ..., 200) = 200
>    +0 > P. 801:1001(200) ack 1
> +.001 write(4, ..., 100) = 100
>    +0 > P. 1001:1101(100) ack 1
> +.001 write(4, ..., 100) = 100
>    +0 > P. 1101:1201(100) ack 1
> +.001 write(4, ..., 100) = 100
>    +0 > P. 1201:1301(100) ack 1
> +.001 write(4, ..., 100) = 100
>    +0 > P. 1301:1401(100) ack 1
> 
> +.099 < . 1:1(0) ack 201 win 257
> +.001 < . 1:1(0) ack 201 win 257 <nop,nop,sack 1001:1401>
>    +0 > P. 201:1001(800) ack 1
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> ---
>  net/core/skbuff.c     |    4 +++-
>  net/ipv4/tcp_output.c |   22 +++++++++++-----------
>  2 files changed, 14 insertions(+), 12 deletions(-)
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 0b2a6e94af2de73ed638634c47a0fb71e2cbc1cb..a9cb81a10c4ba895587727aa4cf098e9a38424ea 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2656,7 +2656,9 @@ int skb_shift(struct sk_buff *tgt, struct sk_buff *skb, int shiftlen)
>  	struct skb_frag_struct *fragfrom, *fragto;
>  
>  	BUG_ON(shiftlen > skb->len);
> -	BUG_ON(skb_headlen(skb));	/* Would corrupt stream */
> +
> +	if (skb_headlen(skb))
> +		return 0;
>  
>  	todo = shiftlen;
>  	from = 0;
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index f57b5aa51b59cf0a58975fe34a7dcdb886ea8c50..19105b46a30436ebb85fe97ee43089e77aa028bb 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2514,7 +2514,7 @@ void tcp_skb_collapse_tstamp(struct sk_buff *skb,
>  }
>  
>  /* Collapses two adjacent SKB's during retransmission. */
> -static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
> +static bool tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
>  {
>  	struct tcp_sock *tp = tcp_sk(sk);
>  	struct sk_buff *next_skb = tcp_write_queue_next(sk, skb);
> @@ -2525,14 +2525,17 @@ static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
>  
>  	BUG_ON(tcp_skb_pcount(skb) != 1 || tcp_skb_pcount(next_skb) != 1);
>  
> +	if (next_skb_size) {
> +		if (next_skb_size <= skb_availroom(skb))
> +			skb_copy_bits(next_skb, 0, skb_put(skb, next_skb_size),
> +				      next_skb_size);
> +		else if (!skb_shift(skb, next_skb, next_skb_size))
> +			return false;
> +	}
>  	tcp_highest_sack_combine(sk, next_skb, skb);
>  
>  	tcp_unlink_write_queue(next_skb, sk);
>  
> -	if (next_skb_size)
> -		skb_copy_bits(next_skb, 0, skb_put(skb, next_skb_size),
> -			      next_skb_size);
> -
>  	if (next_skb->ip_summed == CHECKSUM_PARTIAL)
>  		skb->ip_summed = CHECKSUM_PARTIAL;
>  
> @@ -2561,6 +2564,7 @@ static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
>  	tcp_skb_collapse_tstamp(skb, next_skb);
>  
>  	sk_wmem_free_skb(sk, next_skb);
> +	return true;
>  }
>  
>  /* Check if coalescing SKBs is legal. */
> @@ -2610,16 +2614,12 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
>  
>  		if (space < 0)
>  			break;
> -		/* Punt if not enough space exists in the first SKB for
> -		 * the data in the second
> -		 */
> -		if (skb->len > skb_availroom(to))
> -			break;
>  
>  		if (after(TCP_SKB_CB(skb)->end_seq, tcp_wnd_end(tp)))
>  			break;
>  
> -		tcp_collapse_retrans(sk, to);
> +		if (!tcp_collapse_retrans(sk, to))
> +			break;
>  	}
>  }
>  


David, patch is marked 'Superseded' in
https://patchwork.ozlabs.org/patch/695264/

Not sure what this means exactly ?
Did I miss a mail/feedback/something ?

Thanks !

^ permalink raw reply

* Re: net/icmp: null-ptr-deref in icmp6_send
From: David Ahern @ 2016-11-22 19:14 UTC (permalink / raw)
  To: Cong Wang
  Cc: Andrey Konovalov, David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, LKML, Dmitry Vyukov,
	Alexander Potapenko, Kostya Serebryany, Eric Dumazet, syzkaller
In-Reply-To: <CAM_iQpUQtt2CGrYF4H2DJv_NRsmnBqkXj5rJVXD8osJBFz_qYw@mail.gmail.com>



> On Nov 22, 2016, at 1:11 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> 
>> On Tue, Nov 22, 2016 at 2:23 AM, Andrey Konovalov <andreyknvl@google.com> wrote:
>> Hi,
>> 
>> I've got the following error report while fuzzing the kernel with syzkaller.
>> 
>> It seems that skb_dst(skb) may end up being NULL.
>> 
>> As far as I can see the bug was introduced in commit 5d41ce29e ("net:
>> icmp6_send should use dst dev to determine L3 domain").
>> ICMP v4 probably has a similar issue due to 9d1a6c4ea ("net:
>> icmp_route_lookup should use rt dev to determine L3 domain").
> 
> 
> ipv6_parse_hopopts() is called before NF_INET_PRE_ROUTING,
> so the skb_dst could be NULL.
> 
> I have no idea what commit 5d41ce29e tried to fix, but we already
> use skb->dev a few lines before l3mdev_master_ifindex(), so I don't
> understand why skb->dev could be NULL, maybe just for vrf dev?

On PTO this week and currently at the beach. Will take a look tonight. Thanks for the report. 

^ permalink raw reply

* Re: [PATCH net] bnxt: do not busy-poll when link is down
From: Eric Dumazet @ 2016-11-22 19:06 UTC (permalink / raw)
  To: Michael Chan; +Cc: Andy Gospodarek, Netdev
In-Reply-To: <CACKFLik9rvPfSh1ppcTtjS-bbgsOh4hKJCEaXkCN46eb5_fhpA@mail.gmail.com>

On Tue, 2016-11-22 at 10:55 -0800, Michael Chan wrote:
> On Tue, Nov 22, 2016 at 10:38 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:

> >
> > Any plans removing this busy polling stuff, now it is done in core
> > networking stack ?
> >
> > This would remove bnxt_lock_napi() extra overhead in normal path ( napi
> > poll )
> >
> > I could do this but I do not have the hardware to do the tests.
> >
> It's on my list of many TODO things.  Probably in the next few weeks.

Awesome, thanks !

^ permalink raw reply

* Re: [PATCH] net: dsa: mv88e6xxx: egress all frames
From: Andrew Lunn @ 2016-11-22 19:02 UTC (permalink / raw)
  To: Stefan Eichenberger
  Cc: Stefan Eichenberger, vivien.didelot, f.fainelli, netdev
In-Reply-To: <20161122183732.GD13973@gruene.netmodule.intranet>

On Tue, Nov 22, 2016 at 07:37:33PM +0100, Stefan Eichenberger wrote:
> Hi Andrew
> 
> On Tue, Nov 22, 2016 at 04:03:30PM +0100, Andrew Lunn wrote:
> > On Tue, Nov 22, 2016 at 11:39:44AM +0100, Stefan Eichenberger wrote:
> > > Egress multicast and egress unicast is only enabled for CPU/DSA ports
> > > but for switching operation it seems it should be enabled for all ports.
> > > Do I miss something here?
> > > 
> > > I did the following test:
> > > brctl addbr br0
> > > brctl addif br0 lan0
> > > brctl addif br0 lan1
> > > 
> > > In this scenario the unicast and multicast packets were not forwarded,
> > > therefore ARP requests were not resolved, and no connection could be
> > > established.
> > 
> > Hi Stefan
> > 
> > This is probably specific to the 6097 family. It works fine without
> > this on other devices. Creating a bridge like above and pinging across
> > it is one of my standard tests. But i only test modern devices like
> > the 6165, 6352, 6351, 6390 families.
> 
> Okay perfect, I wasn't 100% sure if I would have to configure something
> additionally.

No. The idea is you treat the interfaces as normal interfaces. You
should not need to do anything additional to what you would do with a
normal interface, when adding it to a bridge.
 
> > In fact, you might need to review all the code and look where
> > mv88e6xxx_6095_family(chip) is used and consider if you need to add
> > mv88e6xxx_6097_family(chip). e.g.
> > 
> >         if (mv88e6xxx_6095_family(chip) || mv88e6xxx_6185_family(chip)) {
> >                 /* Set the upstream port this port should use */
> >                 reg |= dsa_upstream_port(ds);
> >                 /* enable forwarding of unknown multicast addresses to
> >                  * the upstream port
> >                  */
> >                 if (port == dsa_upstream_port(ds))
> >                         reg |= PORT_CONTROL_2_FORWARD_UNKNOWN;
> >         }
> > 
> > Maybe this is your problem?
> 
> I think I still don't understand exactly how the driver works.
> 
> My problem is that the multicast and broadcast frames are filtered and
> the following counter is increasing in ethtool:
> sw_in_filtered: 596

This is not what is supposed to happen. Broadcast and multicast frames
should go to all ports in the bridge. There are two different ways
this can happen:

1) The mv88e6xxx driver started out with the host doing all bridge
operations. The switch forwards all frames to the software bridge, and
the software bridge then sends them out another port if needed.

2) We later added support for hardware bridging. That is, the switch
itself bridges frames between ports. It will only pass frames to the
software bridge if it does not know what to do with a frame itself.

Now, the different families are not 100% compatible with each
other. We never had access to a 6097, so it has not been tested
recently, and we have probably broken it... My guess would be,
anywhere mv88e6xxx_6095_family(chip) is used, there also needs to be
an mv88e6xxx_6097_family(chip). But i could be wrong.

What you might find useful is

https://github.com/vivien/linux.git 161b96bd7d16d21b0f046c935b70c3b2d277ccc2

although it might need some changes for recent commits.

With that, you can see deeper into the switch's registers.

     Andrew

^ permalink raw reply

* Re: [PATCH net] bnxt: do not busy-poll when link is down
From: Michael Chan @ 2016-11-22 18:55 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Andy Gospodarek, Netdev
In-Reply-To: <1479839916.8455.439.camel@edumazet-glaptop3.roam.corp.google.com>

On Tue, Nov 22, 2016 at 10:38 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2016-11-22 at 13:14 -0500, Andy Gospodarek wrote:
>> When busy polling while a link is down (during a link-flap test), TX
>> timeouts were observed as well as the following messages in the ring
>> buffer:
>>
>> bnxt_en 0008:01:00.2 enP8p1s0f2d2: Resp cmpl intr err msg: 0x51
>> bnxt_en 0008:01:00.2 enP8p1s0f2d2: hwrm_ring_free tx failed. rc:-1
>> bnxt_en 0008:01:00.2 enP8p1s0f2d2: Resp cmpl intr err msg: 0x51
>> bnxt_en 0008:01:00.2 enP8p1s0f2d2: hwrm_ring_free rx failed. rc:-1
>>
>> These were resolved by checking for link status and returning if link
>> was not up.
>>
>> Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
>> Signed-off-by: Michael Chan <michael.chan@broadcom.com>
>> Tested-by: Rob Miller <rob.miller@broadcom.com>
>> ---
>>  drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>> index e18635b..013e373 100644
>> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>> @@ -1811,6 +1811,9 @@ static int bnxt_busy_poll(struct napi_struct *napi)
>>       if (atomic_read(&bp->intr_sem) != 0)
>>               return LL_FLUSH_FAILED;
>>
>> +     if (!bp->link_info.link_up)
>> +             return LL_FLUSH_FAILED;
>> +
>>       if (!bnxt_lock_poll(bnapi))
>>               return LL_FLUSH_BUSY;
>>
>
>
> Any plans removing this busy polling stuff, now it is done in core
> networking stack ?
>
> This would remove bnxt_lock_napi() extra overhead in normal path ( napi
> poll )
>
> I could do this but I do not have the hardware to do the tests.
>
It's on my list of many TODO things.  Probably in the next few weeks.

^ permalink raw reply

* Re: [PATCH] net: dsa: mv88e6xxx: egress all frames
From: Stefan Eichenberger @ 2016-11-22 18:37 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Stefan Eichenberger, vivien.didelot, f.fainelli, netdev
In-Reply-To: <20161122150330.GE2691@lunn.ch>

Hi Andrew

On Tue, Nov 22, 2016 at 04:03:30PM +0100, Andrew Lunn wrote:
> On Tue, Nov 22, 2016 at 11:39:44AM +0100, Stefan Eichenberger wrote:
> > Egress multicast and egress unicast is only enabled for CPU/DSA ports
> > but for switching operation it seems it should be enabled for all ports.
> > Do I miss something here?
> > 
> > I did the following test:
> > brctl addbr br0
> > brctl addif br0 lan0
> > brctl addif br0 lan1
> > 
> > In this scenario the unicast and multicast packets were not forwarded,
> > therefore ARP requests were not resolved, and no connection could be
> > established.
> 
> Hi Stefan
> 
> This is probably specific to the 6097 family. It works fine without
> this on other devices. Creating a bridge like above and pinging across
> it is one of my standard tests. But i only test modern devices like
> the 6165, 6352, 6351, 6390 families.

Okay perfect, I wasn't 100% sure if I would have to configure something
additionally.

> 
> In fact, you might need to review all the code and look where
> mv88e6xxx_6095_family(chip) is used and consider if you need to add
> mv88e6xxx_6097_family(chip). e.g.
> 
>         if (mv88e6xxx_6095_family(chip) || mv88e6xxx_6185_family(chip)) {
>                 /* Set the upstream port this port should use */
>                 reg |= dsa_upstream_port(ds);
>                 /* enable forwarding of unknown multicast addresses to
>                  * the upstream port
>                  */
>                 if (port == dsa_upstream_port(ds))
>                         reg |= PORT_CONTROL_2_FORWARD_UNKNOWN;
>         }
> 
> Maybe this is your problem?

I think I still don't understand exactly how the driver works.

My problem is that the multicast and broadcast frames are filtered and
the following counter is increasing in ethtool:
sw_in_filtered: 596

This makes sense because "Egress Floods" in the Port Control Register is
set to 0. What kind of mechanism should make sure that for example ARP
packets are sent through all ports anyway?

Unfortunately I don't have any boards with more modern devices
available, so I can't double check the registers.

Regards,
Stefan

^ permalink raw reply

