Netdev List
* RE: [Intel-wired-lan] [PATCH] net: ixgbe: fix memory leaks
From: Bowers, AndrewX @ 2019-08-28 16:22 UTC (permalink / raw)
  To: open list:NETWORKING DRIVERS,
	moderated list:INTEL ETHERNET DRIVERS, open list
In-Reply-To: <1565554067-4994-1-git-send-email-wenwen@cs.uga.edu>

> -----Original Message-----
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On
> Behalf Of Wenwen Wang
> Sent: Sunday, August 11, 2019 1:08 PM
> To: Wenwen Wang <wenwen@cs.uga.edu>
> Cc: open list:NETWORKING DRIVERS <netdev@vger.kernel.org>; moderated
> list:INTEL ETHERNET DRIVERS <intel-wired-lan@lists.osuosl.org>; open list
> <linux-kernel@vger.kernel.org>; David S. Miller <davem@davemloft.net>
> Subject: [Intel-wired-lan] [PATCH] net: ixgbe: fix memory leaks
> 
> In ixgbe_configure_clsu32(), 'jump', 'input', and 'mask' are allocated through
> kzalloc() respectively in a for loop body. Then,
> ixgbe_clsu32_build_input() is invoked to build the input. If this process fails,
> next iteration of the for loop will be executed. However, the allocated
> 'jump', 'input', and 'mask' are not deallocated on this execution path, leading
> to memory leaks.
> 
> Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 ++++
>  1 file changed, 4 insertions(+)

Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
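
Since the diff itself is not quoted here, the leak pattern described in the commit message can be sketched in user space. This is an illustration under stated assumptions, not the ixgbe code: build_input() and configure_entries() are hypothetical stand-ins, and calloc()/free() take the place of kzalloc()/kfree().

```c
#include <assert.h>
#include <stdlib.h>

/* Counts live allocations so a leak would be visible. */
static int live_allocs;

static void *tracked_alloc(size_t size)
{
	void *p = calloc(1, size);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/* Hypothetical stand-in for ixgbe_clsu32_build_input(): fails on odd i. */
static int build_input(int i, int *input, int *mask)
{
	if (i % 2)
		return -1;
	*input = i;
	*mask = ~0;
	return 0;
}

/* Returns the number of entries that were built successfully. */
static int configure_entries(int n)
{
	int built = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		int *jump = tracked_alloc(sizeof(*jump));
		int *input = tracked_alloc(sizeof(*input));
		int *mask = tracked_alloc(sizeof(*mask));

		if (!jump || !input || !mask || build_input(i, input, mask)) {
			/* Free on the failure path before the next
			 * iteration; skipping this leaks all three.
			 */
			tracked_free(mask);
			tracked_free(input);
			tracked_free(jump);
			continue;
		}
		built++;
		/* Sketch only: real code would keep or hand off ownership. */
		tracked_free(mask);
		tracked_free(input);
		tracked_free(jump);
	}
	return built;
}
```

With the three frees on the failure path removed, live_allocs would stay above zero after any failed iteration, which is the leak the patch addresses.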




* RE: [Intel-wired-lan] [PATCH] i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask
From: Bowers, AndrewX @ 2019-08-28 16:23 UTC (permalink / raw)
  To: intel-wired-lan@lists.osuosl.org; +Cc: netdev@vger.kernel.org
In-Reply-To: <20190821140929.26985-1-sassmann@kpanic.de>

> -----Original Message-----
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On
> Behalf Of Stefan Assmann
> Sent: Wednesday, August 21, 2019 7:09 AM
> To: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org; davem@davemloft.net; sassmann@kpanic.de
> Subject: [Intel-wired-lan] [PATCH] i40e: check __I40E_VF_DISABLE bit in
> i40e_sync_filters_subtask
> 
> While testing VF spawn/destroy the following panic occurred.
> 
> BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000029 [...]
> Workqueue: i40e i40e_service_task [i40e]
> RIP: 0010:i40e_sync_vsi_filters+0x6fd/0xc60 [i40e] [...] Call Trace:
>  ? __switch_to_asm+0x35/0x70
>  ? __switch_to_asm+0x41/0x70
>  ? __switch_to_asm+0x35/0x70
>  ? _cond_resched+0x15/0x30
>  i40e_sync_filters_subtask+0x56/0x70 [i40e]
>  i40e_service_task+0x382/0x11b0 [i40e]
>  ? __switch_to_asm+0x41/0x70
>  ? __switch_to_asm+0x41/0x70
>  process_one_work+0x1a7/0x3b0
>  worker_thread+0x30/0x390
>  ? create_worker+0x1a0/0x1a0
>  kthread+0x112/0x130
>  ? kthread_bind+0x30/0x30
>  ret_from_fork+0x35/0x40
> 
> Investigation revealed a race where pf->vf[vsi->vf_id].trusted may get
> accessed by the watchdog via i40e_sync_filters_subtask() although
> i40e_free_vfs() already free'd pf->vf.
> To avoid this the call to i40e_sync_vsi_filters() in
> i40e_sync_filters_subtask() needs to be guarded by __I40E_VF_DISABLE,
> which is also used by i40e_free_vfs().
> 
> Note: put the __I40E_VF_DISABLE check after the
> __I40E_MACVLAN_SYNC_PENDING check as the latter is more likely to
> trigger.
> 
> Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 5 +++++
>  1 file changed, 5 insertions(+)

Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
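
The guard described in the commit message can be sketched in user space with C11 atomics. This is a rough model, not the i40e code: struct pf_sketch, its fields, and sync_filters() are hypothetical, and an atomic_bool stands in for the __I40E_VF_DISABLE state bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct pf_sketch {
	atomic_bool vf_disable;	/* plays the role of __I40E_VF_DISABLE */
	int *vf;		/* plays the role of pf->vf */
	int synced;		/* accumulates work done by the sync path */
};

/* Returns true if the sync ran, false if it backed off. */
static bool sync_filters(struct pf_sketch *pf)
{
	assert(pf);

	/* Atomically set the flag and look at its previous value. If it
	 * was already set, teardown owns pf->vf and we must not touch it.
	 */
	if (atomic_exchange(&pf->vf_disable, true))
		return false;

	if (pf->vf)
		pf->synced += pf->vf[0];	/* vf cannot be freed here */

	atomic_store(&pf->vf_disable, false);
	return true;
}
```

The teardown path is assumed to take the same flag before freeing vf, so whichever side sets it first excludes the other.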




* [PATCH net-next] net: dsa: mv88e6xxx: get serdes lane after lock
From: Vivien Didelot @ 2019-08-28 16:26 UTC (permalink / raw)
  To: netdev; +Cc: davem, Marek Behún, f.fainelli, andrew, Vivien Didelot

This is a follow-up patch for commit 17deaf5cb37a ("net: dsa:
mv88e6xxx: create serdes_get_lane chip operation").

The .serdes_get_lane implementations access the CMODE of a port,
even though it is cached at the moment, it is safer to call them
after the mutex is locked, not before.

At the same time, check for an eventual error and return IRQ_DONE,
instead of blindly ignoring it.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
---
 drivers/net/dsa/mv88e6xxx/serdes.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/serdes.c b/drivers/net/dsa/mv88e6xxx/serdes.c
index 9424e401dbc7..38c0da2492c0 100644
--- a/drivers/net/dsa/mv88e6xxx/serdes.c
+++ b/drivers/net/dsa/mv88e6xxx/serdes.c
@@ -646,10 +646,12 @@ static irqreturn_t mv88e6390_serdes_thread_fn(int irq, void *dev_id)
 	int err;
 	u8 lane;
 
-	mv88e6xxx_serdes_get_lane(chip, port->port, &lane);
-
 	mv88e6xxx_reg_lock(chip);
 
+	err = mv88e6xxx_serdes_get_lane(chip, port->port, &lane);
+	if (err)
+		goto out;
+
 	switch (cmode) {
 	case MV88E6XXX_PORT_STS_CMODE_SGMII:
 	case MV88E6XXX_PORT_STS_CMODE_1000BASEX:
-- 
2.23.0
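
The hunk above does not show the rest of the handler, so here is a small user-space sketch of the ordering it establishes: take the lock, then query the lane, and leave through a single exit that always unlocks. All names (chip_sketch, get_lane, thread_fn, IRQ_RESULT_*) are hypothetical stand-ins, and a boolean models the register mutex.

```c
#include <assert.h>
#include <stdbool.h>

enum irq_result { IRQ_RESULT_NONE, IRQ_RESULT_HANDLED };

struct chip_sketch {
	bool reg_locked;	/* stands in for the register mutex */
	int lanes[4];		/* per-port lane, negative if none */
};

static void reg_lock(struct chip_sketch *chip)
{
	assert(!chip->reg_locked);
	chip->reg_locked = true;
}

static void reg_unlock(struct chip_sketch *chip)
{
	assert(chip->reg_locked);
	chip->reg_locked = false;
}

static int get_lane(struct chip_sketch *chip, int port, int *lane)
{
	assert(chip->reg_locked);	/* callers must hold the lock */
	if (port < 0 || port >= 4 || chip->lanes[port] < 0)
		return -1;
	*lane = chip->lanes[port];
	return 0;
}

static enum irq_result thread_fn(struct chip_sketch *chip, int port)
{
	enum irq_result ret = IRQ_RESULT_NONE;
	int lane;

	reg_lock(chip);
	if (get_lane(chip, port, &lane))
		goto out;	/* error is not ignored; lock is released */
	ret = IRQ_RESULT_HANDLED;
out:
	reg_unlock(chip);
	return ret;
}
```

The assert in get_lane() is the sketch's analogue of a "lock must be held" check: calling it before reg_lock(), as the old ordering did, would trip it.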



* [PATCH net-next] net: dsa: mv88e6xxx: keep CMODE writable code private
From: Vivien Didelot @ 2019-08-28 16:26 UTC (permalink / raw)
  To: netdev; +Cc: davem, Marek Behún, f.fainelli, andrew, Vivien Didelot

This is a follow-up patch for commit 7a3007d22e8d ("net: dsa:
mv88e6xxx: fully support SERDES on Topaz family").

Since .port_set_cmode is only called from mv88e6xxx_port_setup_mac and
mv88e6xxx_phylink_mac_config, it is fine to keep this "make writable"
code private to the mv88e6341_port_set_cmode implementation, instead
of adding yet another operation to the switch info structure.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c | 8 --------
 drivers/net/dsa/mv88e6xxx/chip.h | 1 -
 drivers/net/dsa/mv88e6xxx/port.c | 9 ++++++++-
 drivers/net/dsa/mv88e6xxx/port.h | 1 -
 4 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 54e88aafba2f..6525075f6bd3 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -454,12 +454,6 @@ int mv88e6xxx_port_setup_mac(struct mv88e6xxx_chip *chip, int port, int link,
 			goto restore_link;
 	}
 
-	if (chip->info->ops->port_set_cmode_writable) {
-		err = chip->info->ops->port_set_cmode_writable(chip, port);
-		if (err && err != -EOPNOTSUPP)
-			goto restore_link;
-	}
-
 	if (chip->info->ops->port_set_cmode) {
 		err = chip->info->ops->port_set_cmode(chip, port, mode);
 		if (err && err != -EOPNOTSUPP)
@@ -2919,7 +2913,6 @@ static const struct mv88e6xxx_ops mv88e6141_ops = {
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
 	.port_link_state = mv88e6352_port_link_state,
 	.port_get_cmode = mv88e6352_port_get_cmode,
-	.port_set_cmode_writable = mv88e6341_port_set_cmode_writable,
 	.port_set_cmode = mv88e6341_port_set_cmode,
 	.port_setup_message_port = mv88e6xxx_setup_message_port,
 	.stats_snapshot = mv88e6390_g1_stats_snapshot,
@@ -3618,7 +3611,6 @@ static const struct mv88e6xxx_ops mv88e6341_ops = {
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
 	.port_link_state = mv88e6352_port_link_state,
 	.port_get_cmode = mv88e6352_port_get_cmode,
-	.port_set_cmode_writable = mv88e6341_port_set_cmode_writable,
 	.port_set_cmode = mv88e6341_port_set_cmode,
 	.port_setup_message_port = mv88e6xxx_setup_message_port,
 	.stats_snapshot = mv88e6390_g1_stats_snapshot,
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index d6b1aa35aa1a..421e8b84bec3 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -400,7 +400,6 @@ struct mv88e6xxx_ops {
 	/* CMODE control what PHY mode the MAC will use, eg. SGMII, RGMII, etc.
 	 * Some chips allow this to be configured on specific ports.
 	 */
-	int (*port_set_cmode_writable)(struct mv88e6xxx_chip *chip, int port);
 	int (*port_set_cmode)(struct mv88e6xxx_chip *chip, int port,
 			      phy_interface_t mode);
 	int (*port_get_cmode)(struct mv88e6xxx_chip *chip, int port, u8 *cmode);
diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
index 542201214c36..4f841335ea32 100644
--- a/drivers/net/dsa/mv88e6xxx/port.c
+++ b/drivers/net/dsa/mv88e6xxx/port.c
@@ -510,7 +510,8 @@ int mv88e6390_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
 	return mv88e6xxx_port_set_cmode(chip, port, mode);
 }
 
-int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip, int port)
+static int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip,
+					     int port)
 {
 	int err, addr;
 	u16 reg, bits;
@@ -537,6 +538,8 @@ int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip, int port)
 int mv88e6341_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
 			     phy_interface_t mode)
 {
+	int err;
+
 	if (port != 5)
 		return -EOPNOTSUPP;
 
@@ -551,6 +554,10 @@ int mv88e6341_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
 		break;
 	}
 
+	err = mv88e6341_port_set_cmode_writable(chip, port);
+	if (err)
+		return err;
+
 	return mv88e6xxx_port_set_cmode(chip, port, mode);
 }
 
diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
index e78d68c3e671..d4e9bea6e82f 100644
--- a/drivers/net/dsa/mv88e6xxx/port.h
+++ b/drivers/net/dsa/mv88e6xxx/port.h
@@ -336,7 +336,6 @@ int mv88e6097_port_pause_limit(struct mv88e6xxx_chip *chip, int port, u8 in,
 			       u8 out);
 int mv88e6390_port_pause_limit(struct mv88e6xxx_chip *chip, int port, u8 in,
 			       u8 out);
-int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip, int port);
 int mv88e6341_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
 			     phy_interface_t mode);
 int mv88e6390_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
-- 
2.23.0



* Re: [PATCH net 1/3] taprio: Fix kernel panic in taprio_destroy
From: Vinicius Costa Gomes @ 2019-08-28 16:31 UTC (permalink / raw)
  To: Vladimir Oltean, jhs, xiyou.wangcong, jiri, davem, vedang.patel,
	leandro.maciel.dorileo
  Cc: netdev, Vladimir Oltean
In-Reply-To: <20190828144829.32570-2-olteanv@gmail.com>

Hi,

Vladimir Oltean <olteanv@gmail.com> writes:

> taprio_init may fail earlier than this line:
>
> 	list_add(&q->taprio_list, &taprio_list);
>
> i.e. due to the net device not being multi queue.

Good catch.

>
> Attempting to remove q from the global taprio_list when it is not part
> of it will result in a kernel panic.
>
> Fix it by iterating through the list and removing it only if found.
>
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---
>  net/sched/sch_taprio.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> index 540bde009ea5..f1eea8c68011 100644
> --- a/net/sched/sch_taprio.c
> +++ b/net/sched/sch_taprio.c
> @@ -1199,12 +1199,17 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
>  
>  static void taprio_destroy(struct Qdisc *sch)
>  {
> -	struct taprio_sched *q = qdisc_priv(sch);
> +	struct taprio_sched *p, *q = qdisc_priv(sch);
>  	struct net_device *dev = qdisc_dev(sch);
> +	struct list_head *pos, *tmp;
>  	unsigned int i;
>  
>  	spin_lock(&taprio_list_lock);
> -	list_del(&q->taprio_list);
> +	list_for_each_safe(pos, tmp, &taprio_list) {
> +		p = list_entry(pos, struct taprio_sched, taprio_list);
> +		if (p == q)
> +			list_del(&q->taprio_list);
> +	}

Personally, I would do things differently, I am thinking: adding the
taprio instance earlier to the list in taprio_init(), and keeping
taprio_destroy() the way it is now. But take this more as a suggestion
:-)
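
To make the two options concrete, here is a user-space sketch with a minimal list type. It is not the taprio code: struct node and all function names are hypothetical, and it assumes the private data is zero-initialized, so an unlinked node has NULL next/prev pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct node {
	struct node *next, *prev;
};

static void list_init(struct node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_front(struct node *n, struct node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unconditional unlink: dereferences n->next and n->prev, so it must
 * only be called on a node that is actually linked.
 */
static void list_remove(struct node *n)
{
	assert(n->next && n->prev);
	n->next->prev = n->prev;
	n->prev->next = n->next;
	n->next = NULL;
	n->prev = NULL;
}

/* Approach in the patch: search first, unlink only if found. */
static bool remove_if_present(struct node *n, struct node *head)
{
	for (struct node *pos = head->next; pos != head; pos = pos->next) {
		if (pos == n) {
			list_remove(n);
			return true;
		}
	}
	return false;
}

/* Approach suggested in the reply: link the node before anything in
 * init can fail, so destroy may unlink unconditionally.
 */
static int instance_init(struct node *n, struct node *head, bool fail_later)
{
	list_add_front(n, head);
	return fail_later ? -1 : 0;
}

static void instance_destroy(struct node *n)
{
	list_remove(n);
}
```

The first approach costs a list walk on every destroy but tolerates any init failure point; the second keeps destroy O(1) but relies on the add staying ahead of every early return in init.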


Cheers,
--
Vinicius



* [PATCH net-next 0/2] Fixes for unlocked cls hardware offload API refactoring
From: Vlad Buslov @ 2019-08-28 16:41 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, saeedm, idosch, Vlad Buslov

Two fixes for my "Refactor cls hardware offload API to support
rtnl-independent drivers" series.

Vlad Buslov (2):
  net: sched: cls_matchall: cleanup flow_action before deallocating
  net/mlx5e: Move local var definition into ifdef block

 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 6 ++++--
 net/sched/cls_matchall.c                          | 2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

-- 
2.21.0



* [PATCH net-next 1/2] net: sched: cls_matchall: cleanup flow_action before deallocating
From: Vlad Buslov @ 2019-08-28 16:41 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, saeedm, idosch, Vlad Buslov
In-Reply-To: <20190828164104.6020-1-vladbu@mellanox.com>

Recent rtnl lock removal patch changed flow_action infra to require proper
cleanup besides simple memory deallocation. However, matchall classifier
was not updated to call tc_cleanup_flow_action(). Add proper cleanup to
mall_replace_hw_filter() and mall_reoffload().

Fixes: 5a6ff4b13d59 ("net: sched: take reference to action dev before calling offloads")
Reported-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
---
 net/sched/cls_matchall.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
index 3266f25011cc..7fc2eb62aa98 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -111,6 +111,7 @@ static int mall_replace_hw_filter(struct tcf_proto *tp,
 
 	err = tc_setup_cb_add(block, tp, TC_SETUP_CLSMATCHALL, &cls_mall,
 			      skip_sw, &head->flags, &head->in_hw_count, true);
+	tc_cleanup_flow_action(&cls_mall.rule->action);
 	kfree(cls_mall.rule);
 
 	if (err) {
@@ -313,6 +314,7 @@ static int mall_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb,
 	err = tc_setup_cb_reoffload(block, tp, add, cb, TC_SETUP_CLSMATCHALL,
 				    &cls_mall, cb_priv, &head->flags,
 				    &head->in_hw_count);
+	tc_cleanup_flow_action(&cls_mall.rule->action);
 	kfree(cls_mall.rule);
 
 	if (err)
-- 
2.21.0
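
Why a plain free is no longer enough can be sketched in user space. The model below is hypothetical (dev_sketch, rule_sketch, and the helpers are not kernel names); it only assumes, as the commit message implies, that building the rule takes a reference that freeing the memory does not drop.

```c
#include <assert.h>
#include <stdlib.h>

struct dev_sketch {
	int refcnt;
};

struct rule_sketch {
	struct dev_sketch *dev;	/* reference held while the rule exists */
};

static struct rule_sketch *rule_alloc(struct dev_sketch *dev)
{
	struct rule_sketch *r = calloc(1, sizeof(*r));

	if (!r)
		return NULL;
	dev->refcnt++;		/* setup takes a reference */
	r->dev = dev;
	return r;
}

/* Drops what setup acquired; must run before the memory is freed. */
static void rule_cleanup(struct rule_sketch *r)
{
	if (r->dev) {
		r->dev->refcnt--;
		r->dev = NULL;
	}
}

static int offload_rule(struct dev_sketch *dev)
{
	struct rule_sketch *r;

	assert(dev);
	r = rule_alloc(dev);
	if (!r)
		return -1;

	/* ... the offload callback would run here ... */

	rule_cleanup(r);	/* release references first */
	free(r);		/* then release the memory */
	return 0;
}
```

Dropping the rule_cleanup() call leaves refcnt permanently elevated after each offload attempt, which is the kind of imbalance the two added lines in the patch prevent.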



* [PATCH net-next 2/2] net/mlx5e: Move local var definition into ifdef block
From: Vlad Buslov @ 2019-08-28 16:41 UTC (permalink / raw)
  To: netdev
  Cc: jhs, xiyou.wangcong, jiri, davem, saeedm, idosch, Vlad Buslov,
	tanhuazhong
In-Reply-To: <20190828164104.6020-1-vladbu@mellanox.com>

New local variable "struct flow_block_offload *f" was added to
mlx5e_setup_tc() in recent rtnl lock removal patches. The variable is used
in code that is only compiled when CONFIG_MLX5_ESWITCH is enabled. This
results in a compilation warning about an unused variable when
CONFIG_MLX5_ESWITCH is not set. Move the variable definition from the
beginning of mlx5e_setup_tc() into the eswitch-specific code block.

Fixes: c9f14470d048 ("net: sched: add API for registering unlocked offload block callbacks")
Reported-by: tanhuazhong <tanhuazhong@huawei.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 8592b98d0e70..c10a1fc8e469 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3470,16 +3470,18 @@ static int mlx5e_setup_tc(struct net_device *dev, enum tc_setup_type type,
 			  void *type_data)
 {
 	struct mlx5e_priv *priv = netdev_priv(dev);
-	struct flow_block_offload *f = type_data;
 
 	switch (type) {
 #ifdef CONFIG_MLX5_ESWITCH
-	case TC_SETUP_BLOCK:
+	case TC_SETUP_BLOCK: {
+		struct flow_block_offload *f = type_data;
+
 		f->unlocked_driver_cb = true;
 		return flow_block_cb_setup_simple(type_data,
 						  &mlx5e_block_cb_list,
 						  mlx5e_setup_tc_block_cb,
 						  priv, priv, true);
+	}
 #endif
 	case TC_SETUP_QDISC_MQPRIO:
 		return mlx5e_setup_tc_mqprio(priv, type_data);
-- 
2.21.0
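
The same technique in isolation, as a standalone sketch: FEATURE_X, setup(), and the types are hypothetical names chosen for illustration. Scoping the variable inside the braced case means it simply does not exist when the feature is compiled out, so there is nothing for the compiler to flag as unused.

```c
#include <assert.h>

/* Build with -DFEATURE_X to compile the guarded case in. */
enum setup_type { SETUP_BLOCK, SETUP_QDISC };

struct block_offload {
	int unlocked;
};

static int setup(enum setup_type type, void *type_data)
{
	assert(type_data);

	switch (type) {
#ifdef FEATURE_X
	case SETUP_BLOCK: {
		/* Declared inside the guarded block, so it cannot be
		 * "unused" when FEATURE_X is off.
		 */
		struct block_offload *f = type_data;

		f->unlocked = 1;
		return 0;
	}
#endif
	case SETUP_QDISC:
		return 1;
	default:
		return -1;
	}
}
```

The braces after the case label are required because a declaration cannot directly follow a label in C before C23; they also limit the variable's scope to that one case.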



* Re: [PATCH net 2/3] taprio: Set default link speed to 10 Mbps in taprio_set_picos_per_byte
From: Vinicius Costa Gomes @ 2019-08-28 16:42 UTC (permalink / raw)
  To: Vladimir Oltean, jhs, xiyou.wangcong, jiri, davem, vedang.patel,
	leandro.maciel.dorileo
  Cc: netdev, Vladimir Oltean
In-Reply-To: <20190828144829.32570-3-olteanv@gmail.com>

Vladimir Oltean <olteanv@gmail.com> writes:

> The taprio budget needs to be adapted at runtime according to interface
> link speed. But that handling is problematic.
>
> For one thing, installing a qdisc on an interface that doesn't have
> carrier is not illegal. But taprio prints the following stack trace:
>
> [   31.851373] ------------[ cut here ]------------
> [   31.856024] WARNING: CPU: 1 PID: 207 at net/sched/sch_taprio.c:481 taprio_dequeue+0x1a8/0x2d4
> [   31.864566] taprio: dequeue() called with unknown picos per byte.
> [   31.864570] Modules linked in:
> [   31.873701] CPU: 1 PID: 207 Comm: tc Not tainted 5.3.0-rc5-01199-g8838fe023cd6 #1689
> [   31.881398] Hardware name: Freescale LS1021A
> [   31.885661] [<c03133a4>] (unwind_backtrace) from [<c030d8cc>] (show_stack+0x10/0x14)
> [   31.893368] [<c030d8cc>] (show_stack) from [<c10ac958>] (dump_stack+0xb4/0xc8)
> [   31.900555] [<c10ac958>] (dump_stack) from [<c0349d04>] (__warn+0xe0/0xf8)
> [   31.907395] [<c0349d04>] (__warn) from [<c0349d64>] (warn_slowpath_fmt+0x48/0x6c)
> [   31.914841] [<c0349d64>] (warn_slowpath_fmt) from [<c0f38db4>] (taprio_dequeue+0x1a8/0x2d4)
> [   31.923150] [<c0f38db4>] (taprio_dequeue) from [<c0f227b0>] (__qdisc_run+0x90/0x61c)
> [   31.930856] [<c0f227b0>] (__qdisc_run) from [<c0ec82ac>] (net_tx_action+0x12c/0x2bc)
> [   31.938560] [<c0ec82ac>] (net_tx_action) from [<c0302298>] (__do_softirq+0x130/0x3c8)
> [   31.946350] [<c0302298>] (__do_softirq) from [<c03502a0>] (irq_exit+0xbc/0xd8)
> [   31.953536] [<c03502a0>] (irq_exit) from [<c03a4808>] (__handle_domain_irq+0x60/0xb4)
> [   31.961328] [<c03a4808>] (__handle_domain_irq) from [<c0754478>] (gic_handle_irq+0x58/0x9c)
> [   31.969638] [<c0754478>] (gic_handle_irq) from [<c0301a8c>] (__irq_svc+0x6c/0x90)
> [   31.977076] Exception stack(0xe8167b20 to 0xe8167b68)
> [   31.982100] 7b20: e9d4bd80 00000cc0 000000cf 00000000 e9d4bd80 c1f38958 00000cc0 c1f38960
> [   31.990234] 7b40: 00000001 000000cf 00000004 e9dc0800 00000000 e8167b70 c0f478ec c0f46d94
> [   31.998363] 7b60: 60070013 ffffffff
> [   32.001833] [<c0301a8c>] (__irq_svc) from [<c0f46d94>] (netlink_trim+0x18/0xd8)
> [   32.009104] [<c0f46d94>] (netlink_trim) from [<c0f478ec>] (netlink_broadcast_filtered+0x34/0x414)
> [   32.017930] [<c0f478ec>] (netlink_broadcast_filtered) from [<c0f47cec>] (netlink_broadcast+0x20/0x28)
> [   32.027102] [<c0f47cec>] (netlink_broadcast) from [<c0eea378>] (rtnetlink_send+0x34/0x88)
> [   32.035238] [<c0eea378>] (rtnetlink_send) from [<c0f25890>] (notify_and_destroy+0x2c/0x44)
> [   32.043461] [<c0f25890>] (notify_and_destroy) from [<c0f25e08>] (qdisc_graft+0x398/0x470)
> [   32.051595] [<c0f25e08>] (qdisc_graft) from [<c0f27a00>] (tc_modify_qdisc+0x3a4/0x724)
> [   32.059470] [<c0f27a00>] (tc_modify_qdisc) from [<c0ee4c84>] (rtnetlink_rcv_msg+0x260/0x2ec)
> [   32.067864] [<c0ee4c84>] (rtnetlink_rcv_msg) from [<c0f4a988>] (netlink_rcv_skb+0xb8/0x110)
> [   32.076172] [<c0f4a988>] (netlink_rcv_skb) from [<c0f4a170>] (netlink_unicast+0x1b4/0x22c)
> [   32.084392] [<c0f4a170>] (netlink_unicast) from [<c0f4a5e4>] (netlink_sendmsg+0x33c/0x380)
> [   32.092614] [<c0f4a5e4>] (netlink_sendmsg) from [<c0ea9f40>] (sock_sendmsg+0x14/0x24)
> [   32.100403] [<c0ea9f40>] (sock_sendmsg) from [<c0eaa780>] (___sys_sendmsg+0x214/0x228)
> [   32.108279] [<c0eaa780>] (___sys_sendmsg) from [<c0eabad0>] (__sys_sendmsg+0x50/0x8c)
> [   32.116068] [<c0eabad0>] (__sys_sendmsg) from [<c0301000>] (ret_fast_syscall+0x0/0x54)
> [   32.123938] Exception stack(0xe8167fa8 to 0xe8167ff0)
> [   32.128960] 7fa0:                   b6fa68c8 000000f8 00000003 bea142d0 00000000 00000000
> [   32.137093] 7fc0: b6fa68c8 000000f8 0052154c 00000128 5d6468a2 00000000 00000028 00558c9c
> [   32.145224] 7fe0: 00000070 bea14278 00530d64 b6e17e64
> [   32.150659] ---[ end trace 2139c9827c3e5177 ]---
>
> This happens because the qdisc ->dequeue callback gets called. Which
> again is not illegal, the qdisc will dequeue even when the interface is
> up but doesn't have carrier (and hence SPEED_UNKNOWN), and the frames
> will be dropped further down the stack in dev_direct_xmit().
>
> And, at the end of the day, for what? For calculating the initial budget
> of an interface which is non-operational at the moment and where frames
> will get dropped anyway.
>
> So if we can't figure out the link speed, default to SPEED_10 and move
> along. We can also remove the runtime check now.
>
> Cc: Leandro Dorileo <leandro.maciel.dorileo@intel.com>
> Fixes: 7b9eba7ba0c1 ("net/sched: taprio: fix picos_per_byte miscalculation")
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---

Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
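
For a sense of the numbers involved, the fallback can be worked through in a small sketch. This is unit arithmetic rather than the kernel function: picos_per_byte() and the *_SKETCH constants are hypothetical, with the constants chosen to mirror ethtool's SPEED_UNKNOWN (-1) and SPEED_10 (10 Mbps).

```c
#include <assert.h>
#include <stdint.h>

#define SPEED_UNKNOWN_SKETCH	(-1)	/* mirrors ethtool's SPEED_UNKNOWN */
#define SPEED_10_SKETCH		10	/* Mbps */

/* Picoseconds needed to put one byte on the wire at speed_mbps.
 * Unknown or non-positive speeds fall back to 10 Mbps.
 */
static int64_t picos_per_byte(int speed_mbps)
{
	const int64_t picos_per_sec = 1000000000000LL;
	int64_t result;

	if (speed_mbps == SPEED_UNKNOWN_SKETCH || speed_mbps <= 0)
		speed_mbps = SPEED_10_SKETCH;

	/* 8 bits per byte, divided by bits per second. */
	result = (picos_per_sec * 8) / ((int64_t)speed_mbps * 1000000LL);
	assert(result > 0);
	return result;
}
```

At 10 Mbps this gives 800000 ps per byte, and at 1 Gbps 8000 ps per byte, so the default errs toward a slow link and the result is always positive once the fallback is applied.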



* Re: [PATCH net 3/3] net/sched: cbs: Set default link speed to 10 Mbps in cbs_set_port_rate
From: Vinicius Costa Gomes @ 2019-08-28 16:45 UTC (permalink / raw)
  To: Vladimir Oltean, jhs, xiyou.wangcong, jiri, davem, vedang.patel,
	leandro.maciel.dorileo
  Cc: netdev, Vladimir Oltean
In-Reply-To: <20190828144829.32570-4-olteanv@gmail.com>

Vladimir Oltean <olteanv@gmail.com> writes:

> The discussion to be made is absolutely the same as in the case of
> previous patch ("taprio: Set default link speed to 10 Mbps in
> taprio_set_picos_per_byte"). Nothing is lost when setting a default.
>
> Cc: Leandro Dorileo <leandro.maciel.dorileo@intel.com>
> Fixes: e0a7683d30e9 ("net/sched: cbs: fix port_rate miscalculation")
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---

Hm, taking another look at cbs, it has a problem similar to the one
your patch 1/3 solves for taprio. I will propose a patch in a few
moments.

For this one:

Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>


Cheers,
--
Vinicius


* Re: [PATCH net-next] net: dsa: mv88e6xxx: get serdes lane after lock
From: Marek Behún @ 2019-08-28 16:48 UTC (permalink / raw)
  To: Vivien Didelot; +Cc: netdev, davem, f.fainelli, andrew
In-Reply-To: <20190828162611.10064-1-vivien.didelot@gmail.com>

On Wed, 28 Aug 2019 12:26:11 -0400
Vivien Didelot <vivien.didelot@gmail.com> wrote:

> This is a follow-up patch for commit 17deaf5cb37a ("net: dsa:
> mv88e6xxx: create serdes_get_lane chip operation").
> 
> The .serdes_get_lane implementations access the CMODE of a port,
> even though it is cached at the moment, it is safer to call them
> after the mutex is locked, not before.
> 
> At the same time, check for an eventual error and return IRQ_DONE,
> instead of blindly ignoring it.
> 
> Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
> ---
>  drivers/net/dsa/mv88e6xxx/serdes.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/dsa/mv88e6xxx/serdes.c b/drivers/net/dsa/mv88e6xxx/serdes.c
> index 9424e401dbc7..38c0da2492c0 100644
> --- a/drivers/net/dsa/mv88e6xxx/serdes.c
> +++ b/drivers/net/dsa/mv88e6xxx/serdes.c
> @@ -646,10 +646,12 @@ static irqreturn_t mv88e6390_serdes_thread_fn(int irq, void *dev_id)
>  	int err;
>  	u8 lane;
>  
> -	mv88e6xxx_serdes_get_lane(chip, port->port, &lane);
> -
>  	mv88e6xxx_reg_lock(chip);
>  
> +	err = mv88e6xxx_serdes_get_lane(chip, port->port, &lane);
> +	if (err)
> +		goto out;
> +
>  	switch (cmode) {
>  	case MV88E6XXX_PORT_STS_CMODE_SGMII:
>  	case MV88E6XXX_PORT_STS_CMODE_1000BASEX:

Reviewed-by: Marek Behún <marek.behun@nic.cz>


* Re: [PATCH net-next] net: dsa: mv88e6xxx: keep CMODE writable code private
From: Marek Behún @ 2019-08-28 16:49 UTC (permalink / raw)
  To: Vivien Didelot; +Cc: netdev, davem, f.fainelli, andrew
In-Reply-To: <20190828162659.10306-1-vivien.didelot@gmail.com>

On Wed, 28 Aug 2019 12:26:59 -0400
Vivien Didelot <vivien.didelot@gmail.com> wrote:

> This is a follow-up patch for commit 7a3007d22e8d ("net: dsa:
> mv88e6xxx: fully support SERDES on Topaz family").
> 
> Since .port_set_cmode is only called from mv88e6xxx_port_setup_mac and
> mv88e6xxx_phylink_mac_config, it is fine to keep this "make writable"
> code private to the mv88e6341_port_set_cmode implementation, instead
> of adding yet another operation to the switch info structure.
> 
> Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
> ---
>  drivers/net/dsa/mv88e6xxx/chip.c | 8 --------
>  drivers/net/dsa/mv88e6xxx/chip.h | 1 -
>  drivers/net/dsa/mv88e6xxx/port.c | 9 ++++++++-
>  drivers/net/dsa/mv88e6xxx/port.h | 1 -
>  4 files changed, 8 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index 54e88aafba2f..6525075f6bd3 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -454,12 +454,6 @@ int mv88e6xxx_port_setup_mac(struct mv88e6xxx_chip *chip, int port, int link,
>  			goto restore_link;
>  	}
>  
> -	if (chip->info->ops->port_set_cmode_writable) {
> -		err = chip->info->ops->port_set_cmode_writable(chip, port);
> -		if (err && err != -EOPNOTSUPP)
> -			goto restore_link;
> -	}
> -
>  	if (chip->info->ops->port_set_cmode) {
>  		err = chip->info->ops->port_set_cmode(chip, port, mode);
>  		if (err && err != -EOPNOTSUPP)
> @@ -2919,7 +2913,6 @@ static const struct mv88e6xxx_ops mv88e6141_ops = {
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
>  	.port_link_state = mv88e6352_port_link_state,
>  	.port_get_cmode = mv88e6352_port_get_cmode,
> -	.port_set_cmode_writable = mv88e6341_port_set_cmode_writable,
>  	.port_set_cmode = mv88e6341_port_set_cmode,
>  	.port_setup_message_port = mv88e6xxx_setup_message_port,
>  	.stats_snapshot = mv88e6390_g1_stats_snapshot,
> @@ -3618,7 +3611,6 @@ static const struct mv88e6xxx_ops mv88e6341_ops = {
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
>  	.port_link_state = mv88e6352_port_link_state,
>  	.port_get_cmode = mv88e6352_port_get_cmode,
> -	.port_set_cmode_writable = mv88e6341_port_set_cmode_writable,
>  	.port_set_cmode = mv88e6341_port_set_cmode,
>  	.port_setup_message_port = mv88e6xxx_setup_message_port,
>  	.stats_snapshot = mv88e6390_g1_stats_snapshot,
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
> index d6b1aa35aa1a..421e8b84bec3 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.h
> +++ b/drivers/net/dsa/mv88e6xxx/chip.h
> @@ -400,7 +400,6 @@ struct mv88e6xxx_ops {
>  	/* CMODE control what PHY mode the MAC will use, eg. SGMII, RGMII, etc.
>  	 * Some chips allow this to be configured on specific ports.
>  	 */
> -	int (*port_set_cmode_writable)(struct mv88e6xxx_chip *chip, int port);
>  	int (*port_set_cmode)(struct mv88e6xxx_chip *chip, int port,
>  			      phy_interface_t mode);
>  	int (*port_get_cmode)(struct mv88e6xxx_chip *chip, int port, u8 *cmode);
> diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
> index 542201214c36..4f841335ea32 100644
> --- a/drivers/net/dsa/mv88e6xxx/port.c
> +++ b/drivers/net/dsa/mv88e6xxx/port.c
> @@ -510,7 +510,8 @@ int mv88e6390_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
>  	return mv88e6xxx_port_set_cmode(chip, port, mode);
>  }
>  
> -int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip, int port)
> +static int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip,
> +					     int port)
>  {
>  	int err, addr;
>  	u16 reg, bits;
> @@ -537,6 +538,8 @@ int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip, int port)
>  int mv88e6341_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
>  			     phy_interface_t mode)
>  {
> +	int err;
> +
>  	if (port != 5)
>  		return -EOPNOTSUPP;
>  
> @@ -551,6 +554,10 @@ int mv88e6341_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
>  		break;
>  	}
>  
> +	err = mv88e6341_port_set_cmode_writable(chip, port);
> +	if (err)
> +		return err;
> +
>  	return mv88e6xxx_port_set_cmode(chip, port, mode);
>  }
>  
> diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
> index e78d68c3e671..d4e9bea6e82f 100644
> --- a/drivers/net/dsa/mv88e6xxx/port.h
> +++ b/drivers/net/dsa/mv88e6xxx/port.h
> @@ -336,7 +336,6 @@ int mv88e6097_port_pause_limit(struct mv88e6xxx_chip *chip, int port, u8 in,
>  			       u8 out);
>  int mv88e6390_port_pause_limit(struct mv88e6xxx_chip *chip, int port, u8 in,
>  			       u8 out);
> -int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip, int port);
>  int mv88e6341_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
>  			     phy_interface_t mode);
>  int mv88e6390_port_set_cmode(struct mv88e6xxx_chip *chip, int port,

Reviewed-by: Marek Behún <marek.behun@nic.cz>


* Re: [PATCH net-next] net: dsa: mv88e6xxx: get serdes lane after lock
From: Andrew Lunn @ 2019-08-28 16:49 UTC (permalink / raw)
  To: Vivien Didelot; +Cc: netdev, davem, Marek Behún, f.fainelli
In-Reply-To: <20190828162611.10064-1-vivien.didelot@gmail.com>

On Wed, Aug 28, 2019 at 12:26:11PM -0400, Vivien Didelot wrote:
> This is a follow-up patch for commit 17deaf5cb37a ("net: dsa:
> mv88e6xxx: create serdes_get_lane chip operation").
> 
> The .serdes_get_lane implementations access the CMODE of a port,
> even though it is cached at the moment, it is safer to call them
> after the mutex is locked, not before.
> 
> At the same time, check for an eventual error and return IRQ_DONE,
> instead of blindly ignoring it.
> 
> Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


* Re: [PATCH net-next] net: dsa: mv88e6xxx: keep CMODE writable code private
From: Andrew Lunn @ 2019-08-28 16:51 UTC (permalink / raw)
  To: Vivien Didelot; +Cc: netdev, davem, Marek Behún, f.fainelli
In-Reply-To: <20190828162659.10306-1-vivien.didelot@gmail.com>

On Wed, Aug 28, 2019 at 12:26:59PM -0400, Vivien Didelot wrote:
> This is a follow-up patch for commit 7a3007d22e8d ("net: dsa:
> mv88e6xxx: fully support SERDES on Topaz family").
> 
> Since .port_set_cmode is only called from mv88e6xxx_port_setup_mac and
> mv88e6xxx_phylink_mac_config, it is fine to keep this "make writable"
> code private to the mv88e6341_port_set_cmode implementation, instead
> of adding yet another operation to the switch info structure.
> 
> Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


* Re: [PATCH net 1/3] taprio: Fix kernel panic in taprio_destroy
From: Vladimir Oltean @ 2019-08-28 16:51 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: jhs, xiyou.wangcong, Jiri Pirko, David S. Miller, vedang.patel,
	leandro.maciel.dorileo, netdev
In-Reply-To: <87a7btqmk7.fsf@intel.com>

Hi Vinicius,

On Wed, 28 Aug 2019 at 19:31, Vinicius Costa Gomes
<vinicius.gomes@intel.com> wrote:
>
> Hi,
>
> Vladimir Oltean <olteanv@gmail.com> writes:
>
> > taprio_init may fail earlier than this line:
> >
> >       list_add(&q->taprio_list, &taprio_list);
> >
> > i.e. due to the net device not being multi queue.
>
> Good catch.
>
> >
> > Attempting to remove q from the global taprio_list when it is not part
> > of it will result in a kernel panic.
> >
> > Fix it by iterating through the list and removing it only if found.
> >
> > Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> > ---
> >  net/sched/sch_taprio.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> > index 540bde009ea5..f1eea8c68011 100644
> > --- a/net/sched/sch_taprio.c
> > +++ b/net/sched/sch_taprio.c
> > @@ -1199,12 +1199,17 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
> >
> >  static void taprio_destroy(struct Qdisc *sch)
> >  {
> > -     struct taprio_sched *q = qdisc_priv(sch);
> > +     struct taprio_sched *p, *q = qdisc_priv(sch);
> >       struct net_device *dev = qdisc_dev(sch);
> > +     struct list_head *pos, *tmp;
> >       unsigned int i;
> >
> >       spin_lock(&taprio_list_lock);
> > -     list_del(&q->taprio_list);
> > +     list_for_each_safe(pos, tmp, &taprio_list) {
> > +             p = list_entry(pos, struct taprio_sched, taprio_list);
> > +             if (p == q)
> > +                     list_del(&q->taprio_list);
> > +     }
>
> Personally, I would do things differently, I am thinking: adding the
> taprio instance earlier to the list in taprio_init(), and keeping
> taprio_destroy() the way it is now. But take this more as a suggestion
> :-)
>

While I don't strongly oppose your proposal (keep the list removal
unconditional, but match it better in placement to the list addition),
I think it's rather fragile and I do see this bug recurring in the
future. Anyway if you want to keep it "simpler" I can respin it like
that.

>
> Cheers,
> --
> Vinicius
>

Regards,
-Vladimir
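
The fragility concern above has a third common answer besides "search before deleting" and "add to the list earlier": initialise the list node at the very top of init and remove it with a list_del_init()-style helper, so that destroying a never-added instance is harmless. The following is a minimal userspace sketch of that idiom, not the taprio code. The list helpers are hand-rolled stand-ins that mimic <linux/list.h>, and struct sched, sched_init() and sched_destroy() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add(struct list_head *node, struct list_head *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* Unlink the entry and re-initialise it, so a repeated removal is a no-op. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);
}

/* Hypothetical scheduler instance; only the list linkage matters here. */
struct sched {
	struct list_head node;
};

/* Returns 0 on success, -1 if init bails out before the list_add(). */
static int sched_init(struct sched *q, struct list_head *global, bool fail_early)
{
	INIT_LIST_HEAD(&q->node);	/* done first, before any failure path */
	if (fail_early)
		return -1;
	list_add(&q->node, global);
	return 0;
}

/* Safe whether or not sched_init() reached the list_add(). */
static void sched_destroy(struct sched *q)
{
	list_del_init(&q->node);
	assert(list_empty(&q->node));
}
```

With this shape the destroy path stays unconditional and does not need to walk the global list, at the cost of one extra initialisation in init.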

^ permalink raw reply

* Re: [PATCH net-next v5] sched: Add dualpi2 qdisc
From: Dave Taht @ 2019-08-28 16:55 UTC (permalink / raw)
  To: Bob Briscoe
  Cc: Tilmans, Olivier (Nokia - BE/Antwerp), Eric Dumazet,
	Stephen Hemminger, Olga Albisser,
	De Schepper, Koen (Nokia - BE/Antwerp), Henrik Steen,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <bded966b-5176-69c8-4ac3-70d81d344c22@bobbriscoe.net>

On Wed, Aug 28, 2019 at 7:00 AM Bob Briscoe <research@bobbriscoe.net> wrote:
>
> Olivier, Dave,
>
> On 23/08/2019 13:59, Tilmans, Olivier (Nokia - BE/Antwerp) wrote:
>
> as best as I can
> tell (but could be wrong) the NQB idea wants to put something into the
> l4s fast queue? Or is NQB supposed to
> be a third queue?
>
> NQB is not supported in this release of the code. But FYI, it's not for a third queue.

At the time of my code review of dualpi I had not gone back to review
the NQB draft fully.

> We can add support for NQB in the future, by expanding the
> dualpi2_skb_classify() function. This is however out of scope at the
> moment as NQB is not yet adopted by the TSV WG. I'd guess we may want more

> than just the NQB DSCP codepoint in the L queue, which then warrant
> another way to classify traffic, e.g., using tc filter hints.

Yes, you'll find folk are fans of being able to put tc (and ebpf)
filters in front of various qdiscs for classification, logging, and/or
dropping behavior.

A fairly typical stanza is here:
https://github.com/torvalds/linux/blob/master/net/sched/sch_sfq.c#L171
to line 193.

> The IETF adopted the NQB draft at the meeting just passed in July, but the draft has not yet been updated to reflect that: https://tools.ietf.org/html/draft-white-tsvwg-nqb-02

Hmmm... no. I think Olivier's statement was correct.

NQB was put into the "call for adoption into tsvwg" state (
https://mailarchive.ietf.org/arch/msg/tsvwg/fjyYQgU9xQCNalwPO7v9-al6mGk
) in the tsvwg on Aug 21st, which
doesn't mean "adopted by the ietf", either. In response to that call
several folk did put in (rather pithy)
comments on the current state of the NQB idea and internet draft, starting here:

https://mailarchive.ietf.org/arch/msg/tsvwg/hZGjm899t87YZl9JJUOWQq4KBsk

For those here that are not familiar with IETF processes (and there
are many!) there are "internet drafts" that may or may not become
working group items, that if they become accepted by the working group
may or may not evolve to become actual RFCs.  Unlike lkml usage where
we use RFC in its original meaning as a mere request for comments,
there are several classes of IETF RFC - standards track, experimental,
and informational - whenever they are adopted and published by the
ietf.

There are RFCs for how they do RFCs, and BCPs and other TLAs, and if
you really want to know more about how the ietf processes actually
work, please contact me off list. Anyway...

Much of the experimental L4S architecture itself (of which NQB MAY
become part, and dualpi/tcpprague/etc are) is presently an accepted
tsvwg wg item with a list of 11 problems on the bug database here (
https://trac.ietf.org/trac/tsvwg/report/1?sort=ticket&asc=1&page=1 ).
IMHO it's not currently near last call for standardization as a set of
experimental RFCs.

L4S takes advantage of several RFCs that have
indeed been published as experimental, notably, RFC8311, which too few
have read as yet.

While using up ECT1 in the L4S code as an identifier and not as a
congestion indicator is very controversial for me (
https://lwn.net/Articles/783673/ ), AND I'd rather it not be baked
into the linux api for dualpi should this identifier not be chosen by
the wg (thus my suggestion of a mask or lookup table)...

... I also dearly would like both sides of this code - dualpi and tcp
prague - in a simultaneously testable and high quality state. Without
that, many core ideas in dualpi cannot be tested, nor objectively
evaluated against other tcps and qdiscs using rfc3168 behavior along
the path. Multiple experimental ideas in RFC8311 (such as those in
section 4.3) have also not been re-evaluated in any context.

Is the known to work reference codebase for "tcp prague" still 3.19 based?

> The draft requests 0x2A (decimal 42) as the DSCP but, until the IETF converges on a specific DSCP for NQB, I believe we should not code in a default classifier anyway.
>
>
>
> Bob
>
> --
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/



--

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

^ permalink raw reply

* Re: [PATCH v1 net-next] net: phy: mdio_bus: make mdiobus_scan also cover PHY that only talks C45
From: Florian Fainelli @ 2019-08-28 17:00 UTC (permalink / raw)
  To: Ong, Boon Leong, Andrew Lunn
  Cc: David S. Miller, Maxime Coquelin, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Jose Abreu, Voon, Weifeng,
	Heiner Kallweit
In-Reply-To: <AF233D1473C1364ABD51D28909A1B1B75C22CD3C@pgsmsx114.gar.corp.intel.com>

On 8/28/19 8:41 AM, Ong, Boon Leong wrote:
>> On Tue, Aug 27, 2019 at 03:23:34PM +0000, Voon, Weifeng wrote:
>>>>>> Make mdiobus_scan() to try harder to look for any PHY that only
>>>> talks C45.
>>>>> If you are not using Device Tree or ACPI, and you are letting the MDIO
>>>>> bus be scanned, it sounds like there should be a way for you to
>>>>> provide a hint as to which addresses should be scanned (that's
>>>>> mii_bus::phy_mask) and possibly enhance that with a mask of possible
>>>>> C45 devices?
>>>>
>>>> Yes, i don't like this unconditional c45 scanning. A lot of MDIO bus
>>>> drivers don't look for the MII_ADDR_C45. They are going to do a C22
>>>> transfer, and maybe not mask out the MII_ADDR_C45 from reg, causing an
>>>> invalid register write. Bad things can then happen.
>>>>
>>>> With DT and ACPI, we have an explicit indication that C45 should be used,
>>>> so we know on this platform C45 is safe to use. We need something
>>>> similar when not using DT or ACPI.
>>>>
>>>> 	  Andrew
>>>
>>> Florian and Andrew,
>>> The mdio c22 is using the start-of-frame ST=01 while mdio c45 is using ST=00
>>> as identifier. So mdio c22 device will not respond to mdio c45 protocol.
>>> As IEEE 802.3ae-2002 Annex 45A.3 mentions:
>>> " Even though the Clause 45 MDIO frames using the ST=00 frame code
>>> will also be driven on to the Clause 22 MII Management interface,
>>> the Clause 22 PHYs will ignore the frames. "
>>>
>>> Hence, I am not seeing any concern that the c45 scanning will mess up with
>>> c22 devices.
>>
>> Hi Voon
>>
>> Take for example mdio-hisi-femac.c
>>
>> static int hisi_femac_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
>> {
>>        struct hisi_femac_mdio_data *data = bus->priv;
>>        int ret;
>>
>>        ret = hisi_femac_mdio_wait_ready(data);
>>        if (ret)
>>                return ret;
>>
>>        writel((mii_id << BIT_PHY_ADDR_OFFSET) | regnum,
>>               data->membase + MDIO_RWCTRL);
>>
>>
>> There is no check here for MII_ADDR_C45. So it will perform a C22
>> transfer. And regnum will still have MII_ADDR_C45 in it, so the
>> writel() is going to set bit 30, since #define MII_ADDR_C45
>> (1<<30). What happens on this hardware under these conditions?
>>
>> You cannot unconditionally ask an MDIO driver to do a C45
>> transfer. Some drivers are going to do bad things.
> 
> Andrew & Florian, thanks for your review on this patch and insights on it.
> We will look into the implementation as suggested as follow. 
> 
> - for each bit clear in mii_bus::phy_mask, scan it as C22
> - for each bit clear in mii_bus::phy_c45_mask, scan it as C45
> 
> We will work on this and resubmit soonest. 

Sounds good. If you do not need to scan the MDIO bus, another approach
is to call get_phy_device() by passing the is_c45 boolean to true in
order to connect directly to a C45 device for which you already know the
address.

Assuming this is done for the stmmac PCI changes that you have
submitted, and that those cards have a fixed set of addresses for their
PHYs, maybe scanning the bus is overkill?
-- 
Florian
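
The two-mask scan agreed on above can be sketched in userspace as follows. This is only an illustration under stated assumptions: phy_c45_mask is the field proposed in this thread (it is not an existing mii_bus member here), struct bus, probe_fn and demo_probe() are hypothetical, and MII_ADDR_C45 is taken as (1<<30) as quoted earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PHY_MAX_ADDR	32
#define MII_ADDR_C45	(1u << 30)	/* flag bit OR'ed into regnum for C45 */

/* Hypothetical, cut-down bus: a set bit in a mask means "skip this address". */
struct bus {
	uint32_t phy_mask;	/* addresses excluded from the C22 scan */
	uint32_t phy_c45_mask;	/* proposed: addresses excluded from the C45 scan */
};

/* Hypothetical probe callback: returns true if a device answered. */
typedef bool (*probe_fn)(int addr, bool is_c45);

/*
 * Try each unmasked address as C22 first; if nothing answered (or C22 is
 * masked), try it as C45 when that is unmasked too. Returns devices found.
 */
static int bus_scan(const struct bus *bus, probe_fn probe)
{
	int addr, found = 0;

	assert(probe);
	for (addr = 0; addr < PHY_MAX_ADDR; addr++) {
		if (!(bus->phy_mask & (1u << addr)) && probe(addr, false))
			found++;
		else if (!(bus->phy_c45_mask & (1u << addr)) && probe(addr, true))
			found++;
	}
	return found;
}

/* The check a bus driver needs before treating regnum as a plain C22 register. */
static bool regnum_is_c45(uint32_t regnum)
{
	return (regnum & MII_ADDR_C45) != 0;
}

/* Demo wiring: a C22 PHY at address 1 and a C45-only PHY at address 5. */
static bool demo_probe(int addr, bool is_c45)
{
	return is_c45 ? addr == 5 : addr == 1;
}
```

A driver that cannot do C45 transfers would leave phy_c45_mask fully set, so the bus is never asked for a C45 access it does not understand.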

^ permalink raw reply

* RE: [PATCH v1 net-next] net: stmmac: Add support for MDIO interrupts
From: Voon, Weifeng @ 2019-08-28 17:07 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn
  Cc: David S. Miller, Maxime Coquelin, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Jose Abreu, Giuseppe Cavallaro,
	Alexandre Torgue, Ong, Boon Leong
In-Reply-To: <cac5aba0-b47b-00c6-f99b-64c6b385308a@gmail.com>

> >> DW EQoS v5.xx controllers added capability for interrupt generation
> >> when MDIO interface is done (GMII Busy bit is cleared).
> >> This patch adds support for this interrupt on supported HW to avoid
> >> polling on GMII Busy bit.
> >>
> >> stmmac_mdio_read() & stmmac_mdio_write() will sleep until wake_up()
> >> is called by the interrupt handler.
> >
> > Hi Voon
> >
> > I _think_ there are some order of operation issues here. The mdiobus
> > is registered in the probe function. As soon as of_mdiobus_register()
> > is called, the MDIO bus must work. At that point MDIO read/writes can
> > start to happen.
> >
> > As far as i can see, the interrupt handler is only requested in
> > stmmac_open(). So it seems like any MDIO operations after probe, but
> > before open are going to fail?
> 
> AFAIR, wait_event_timeout() will continue to busy loop and wait until
> the timeout, but not return an error because the polled condition was
> true, at least that is my recollection from having the same issue with
> the bcmgenet driver before it was moved to connecting to the PHY in the
> ndo_open() function.
> --
> Florian

Florian is right: the poll condition is still true after the timeout.
Hence, any MDIO operation after probe and before ndo_open will still work.
The only downside is that attaching the PHY takes the full timeout
for each mdio_read and mdio_write.
So should we attach the PHY only after the interrupt handler is requested?
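
To make the behaviour above concrete, here is a simplified userspace model of the wait_event_timeout() return values: 0 if the condition is still false when the timeout expires, 1 if it is true at expiry, and the remaining time if a wake-up arrives earlier. It is only a model in abstract ticks, not kernel code; model_wait_event_timeout() and model_mdio_wait() are hypothetical names.

```c
#include <assert.h>

/*
 * Simplified model of wait_event_timeout() return values, in ticks.
 * done_at: tick at which the condition becomes true (<= 0: already true)
 * wake_at: tick at which wake_up() is called (negative: never called)
 */
static long model_wait_event_timeout(long done_at, long wake_at, long timeout)
{
	assert(timeout >= 0);

	/* Condition true on entry: no sleep, full timeout left (minimum 1). */
	if (done_at <= 0)
		return timeout > 0 ? timeout : 1;

	/* Woken after the condition became true and before expiry. */
	if (wake_at >= done_at && wake_at < timeout)
		return timeout - wake_at;

	/* Nobody woke us: the condition is re-checked once at expiry. */
	return done_at <= timeout ? 1 : 0;
}

/* Typical caller: only a return of 0 is treated as a timeout error. */
static int model_mdio_wait(long done_at, long wake_at, long timeout)
{
	return model_wait_event_timeout(done_at, wake_at, timeout) ? 0 : -1;
}
```

With no handler installed (wake_at negative) the caller still succeeds as long as the hardware finished in time, but only after sleeping for the whole timeout, which is the cost described above.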
 

^ permalink raw reply

* Re: [PATCH net-next 03/15] net: sgi: ioc3-eth: remove checkpatch errors/warning
From: Joe Perches @ 2019-08-28 17:10 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Ralf Baechle, Paul Burton, James Hogan,
	David S. Miller, linux-mips, linux-kernel, netdev
In-Reply-To: <20190828140315.17048-4-tbogendoerfer@suse.de>

On Wed, 2019-08-28 at 16:03 +0200, Thomas Bogendoerfer wrote:
> Before massaging the driver further fix oddities found by checkpatch like
> - wrong indention
> - comment formatting
> - use of printk instead or netdev_xxx/pr_xxx

trivial notes:

Please try to make the code better rather than merely
shutting up checkpatch.

> diff --git a/drivers/net/ethernet/sgi/ioc3-eth.c b/drivers/net/ethernet/sgi/ioc3-eth.c
[]
> @@ -209,8 +201,7 @@ static inline void nic_write_bit(u32 __iomem *mcr, int bit)
>  	nic_wait(mcr);
>  }
>  
> -/*
> - * Read a byte from an iButton device
> +/* Read a byte from an iButton device
>   */

These comment styles would be simpler on a single line

/* Read a byte from an iButton device */

>  static u32 nic_read_byte(u32 __iomem *mcr)
>  {
> @@ -223,8 +214,7 @@ static u32 nic_read_byte(u32 __iomem *mcr)
>  	return result;
>  }
>  
> -/*
> - * Write a byte to an iButton device
> +/* Write a byte to an iButton device
>   */

/* Write a byte to an iButton device */

etc...

[]
> @@ -323,16 +315,15 @@ static int nic_init(u32 __iomem *mcr)
>  		break;
>  	}
>  
> -	printk("Found %s NIC", type);
> +	pr_info("Found %s NIC", type);
>  	if (type != unknown)
> -		printk (" registration number %pM, CRC %02x", serial, crc);
> -	printk(".\n");
> +		pr_cont(" registration number %pM, CRC %02x", serial, crc);
> +	pr_cont(".\n");

This code would be more sensible as

	if (type != unknown)
		pr_info("Found %s NIC registration number %pM, CRC %02x\n",
			type, serial, crc);
	else
		pr_info("Found %s NIC\n", type); 

Though I don't know if registration number is actually a MAC
address or something else.  If it's just a 6 byte identifier
that uses colon separation it should probably use "%6phC"
instead of "%pM"

[] 

> @@ -645,22 +636,21 @@ static inline void ioc3_tx(struct net_device *dev)
>  static void ioc3_error(struct net_device *dev, u32 eisr)
>  {
>  	struct ioc3_private *ip = netdev_priv(dev);
> -	unsigned char *iface = dev->name;
>  
>  	spin_lock(&ip->ioc3_lock);
>  
>  	if (eisr & EISR_RXOFLO)
> -		printk(KERN_ERR "%s: RX overflow.\n", iface);
> +		netdev_err(dev, "RX overflow.\n");
>  	if (eisr & EISR_RXBUFOFLO)
> -		printk(KERN_ERR "%s: RX buffer overflow.\n", iface);
> +		netdev_err(dev, "RX buffer overflow.\n");
>  	if (eisr & EISR_RXMEMERR)
> -		printk(KERN_ERR "%s: RX PCI error.\n", iface);
> +		netdev_err(dev, "RX PCI error.\n");
>  	if (eisr & EISR_RXPARERR)
> -		printk(KERN_ERR "%s: RX SSRAM parity error.\n", iface);
> +		netdev_err(dev, "RX SSRAM parity error.\n");
>  	if (eisr & EISR_TXBUFUFLO)
> -		printk(KERN_ERR "%s: TX buffer underflow.\n", iface);
> +		netdev_err(dev, "TX buffer underflow.\n");
>  	if (eisr & EISR_TXMEMERR)
> -		printk(KERN_ERR "%s: TX PCI error.\n", iface);
> +		netdev_err(dev, "TX PCI error.\n");

All of these should probably be ratelimited() output.
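
For readers unfamiliar with the idea, rate limiting here means allowing at most a burst of messages per time window and counting the rest as suppressed. In the kernel one would normally reach for an existing helper such as net_ratelimit() rather than write this by hand; the sketch below is only a userspace model of the interval/burst scheme, and struct ratelimit and ratelimit_ok() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of an interval/burst limiter. */
struct ratelimit {
	long interval;	/* window length, in ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages emitted in this window */
	int missed;	/* messages suppressed in this window */
};

/* Returns true if the caller may print at time `now`. */
static bool ratelimit_ok(struct ratelimit *rs, long now)
{
	assert(rs->interval > 0 && rs->burst > 0);

	if (now - rs->begin >= rs->interval) {
		/* New window: reset the counters. */
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->missed++;
	return false;
}
```

An error path that can fire on every interrupt would wrap each message in such a check, so a persistent fault produces a bounded amount of log output.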



^ permalink raw reply

* Re: [RFC PATCH 1/1] phylink: Set speed to SPEED_UNKNOWN when there is no PHY connected
From: Russell King - ARM Linux admin @ 2019-08-28 17:14 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: andrew, f.fainelli, asolokha, netdev
In-Reply-To: <20190828145802.3609-2-olteanv@gmail.com>

On Wed, Aug 28, 2019 at 05:58:02PM +0300, Vladimir Oltean wrote:
> phylink_ethtool_ksettings_get can be called while the interface may not
> even be up, which should not be a problem. But there are drivers (e.g.
> gianfar) which connect to the PHY in .ndo_open and disconnect in
> .ndo_close. While odd, to my knowledge this is again not illegal and
> there may be more that do the same. But PHYLINK for example has this
> check in phylink_ethtool_ksettings_get:
> 
> 	if (pl->phydev) {
> 		phy_ethtool_ksettings_get(pl->phydev, kset);
> 	} else {
> 		kset->base.port = pl->link_port;
> 	}
> 
> So it will not populate kset->base.speed if there is no PHY connected.
> The speed will be 0, by way of a previous memset. Not SPEED_UNKNOWN.
> It is arguable whether that is legal or not. include/uapi/linux/ethtool.h
> says:
> 
> 	All values 0 to INT_MAX are legal.
> 
> By that measure it may be. But it sure would make users of the
> __ethtool_get_link_ksettings API need make more complicated checks
> (against -1, against 0, 1, etc). So far the kernel community has been ok
> with just checking for SPEED_UNKNOWN.
> 
> Take net/sched/sch_taprio.c for example. The check in
> taprio_set_picos_per_byte is currently not robust enough and will
> trigger this division by zero, due to PHYLINK not setting SPEED_UNKNOWN:
> 
> 	if (!__ethtool_get_link_ksettings(dev, &ecmd) &&
> 	    ecmd.base.speed != SPEED_UNKNOWN)
> 		picos_per_byte = div64_s64(NSEC_PER_SEC * 1000LL * 8,
> 					   ecmd.base.speed * 1000 * 1000);

The ethtool API says:

 * If it is enabled then they are read-only; if the link
 * is up they represent the negotiated link mode; if the link is down,
 * the speed is 0, %SPEED_UNKNOWN or the highest enabled speed and
 * @duplex is %DUPLEX_UNKNOWN or the best enabled duplex mode.

So, it seems that taprio is not following the API... I'd suggest either
fixing taprio, or getting agreement to change the ethtool API.
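
As a rough illustration of what "fixing taprio" could look like, the check can treat both 0 and SPEED_UNKNOWN as "no usable speed" before dividing. This is a userspace sketch, not the actual taprio change: picos_per_byte() here is a hypothetical standalone helper, and returning -1 for "unknown" is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SPEED_UNKNOWN	(-1)		/* as in include/uapi/linux/ethtool.h */
#define NSEC_PER_SEC	1000000000LL

/*
 * Picoseconds needed to send one byte at `speed` Mb/s, or -1 if the
 * reported speed cannot be used (0 or SPEED_UNKNOWN while the link is down).
 */
static int64_t picos_per_byte(uint32_t speed)
{
	if (speed == 0 || speed == (uint32_t)SPEED_UNKNOWN)
		return -1;

	return (NSEC_PER_SEC * 1000LL * 8) / ((int64_t)speed * 1000 * 1000);
}

/* Small self-check: 1 Gb/s is 8 ns per byte, and a down link is rejected. */
static void picos_per_byte_examples(void)
{
	assert(picos_per_byte(1000) == 8000);
	assert(picos_per_byte(0) == -1);
	assert(picos_per_byte((uint32_t)SPEED_UNKNOWN) == -1);
}
```

Guarding against 0 in the consumer follows the ethtool comment quoted above, which allows a down link to report 0, SPEED_UNKNOWN or the highest enabled speed.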

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply

* Re: [PATCH v1 net-next] net: stmmac: Add support for MDIO interrupts
From: Florian Fainelli @ 2019-08-28 17:14 UTC (permalink / raw)
  To: Voon, Weifeng, Andrew Lunn
  Cc: David S. Miller, Maxime Coquelin, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Jose Abreu, Giuseppe Cavallaro,
	Alexandre Torgue, Ong, Boon Leong
In-Reply-To: <D6759987A7968C4889FDA6FA91D5CBC814759747@PGSMSX103.gar.corp.intel.com>

On 8/28/19 10:07 AM, Voon, Weifeng wrote:
>>>> DW EQoS v5.xx controllers added capability for interrupt generation
>>>> when MDIO interface is done (GMII Busy bit is cleared).
>>>> This patch adds support for this interrupt on supported HW to avoid
>>>> polling on GMII Busy bit.
>>>>
>>>> stmmac_mdio_read() & stmmac_mdio_write() will sleep until wake_up()
>>>> is called by the interrupt handler.
>>>
>>> Hi Voon
>>>
>>> I _think_ there are some order of operation issues here. The mdiobus
>>> is registered in the probe function. As soon as of_mdiobus_register()
>>> is called, the MDIO bus must work. At that point MDIO read/writes can
>>> start to happen.
>>>
>>> As far as i can see, the interrupt handler is only requested in
>>> stmmac_open(). So it seems like any MDIO operations after probe, but
>>> before open are going to fail?
>>
>> AFAIR, wait_event_timeout() will continue to busy loop and wait until
>> the timeout, but not return an error because the polled condition was
>> true, at least that is my recollection from having the same issue with
>> the bcmgenet driver before it was moved to connecting to the PHY in the
>> ndo_open() function.
>> --
>> Florian
> 
> Florian is right as the poll condition is still true after the timeout. 
> Hence, any mdio operation after probe and before ndo_open will still work.
> The only cons here is that attaching the PHY will takes a full length of 
> timeout time for each mdio_read and mdio_write. 
> So we should attach the phy only after the interrupt handler is requested?

From a power management/resource utilization perspective, it is better
to initialize as close as possible to the time when you are actually
going to use the hardware, therefore ndo_open().

This may not be convenient or possible given how widely used stmmac is,
and I do not know if parts of the Ethernet MAC require the PHY to supply
the clock, in which case you may have some chicken-and-egg conditions if
the design does not allow for MDIO to work independently from the data
plane. Also, I would be worried about introducing bugs.

You could do a couple of things:

- continue to probe the device with interrupts disabled and add a
condition around the call to wait_event_timeout(): if the interrupt line
is not requested yet, do a busy-loop that does not go to the maximum
defined timeout; once it is requested, use wait_event_timeout()

- request the interrupt during the probe function, but only
unmask/enable the MDIO interrupts for the probe to succeed and leave the
data path interrupts for a later enabling during ndo_open()
-- 
Florian
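
The first option above (poll while no handler is installed, sleep on the interrupt once it is) can be sketched as a userspace simulation. Everything here is hypothetical: struct mdio_hw, hw_busy() and mdio_wait_done() only model the control flow, with the sleeping branch standing in for wait_event_timeout() and the polling branch standing in for a bounded poll of the GMII Busy bit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the MDIO block; not the stmmac data structures. */
struct mdio_hw {
	bool irq_requested;	/* set once the handler is installed, e.g. in open() */
	int busy_ticks;		/* ticks until the simulated Busy bit clears */
	int polls;		/* bookkeeping: how often Busy was polled */
	int sleeps;		/* bookkeeping: how often we slept on the IRQ */
};

/* Simulated register read: Busy stays set for busy_ticks reads. */
static bool hw_busy(struct mdio_hw *hw)
{
	if (hw->busy_ticks > 0) {
		hw->busy_ticks--;
		return true;
	}
	return false;
}

/* Returns 0 when the transfer completed, -ETIMEDOUT otherwise. */
static int mdio_wait_done(struct mdio_hw *hw, int max_ticks)
{
	int t;

	assert(max_ticks > 0);

	if (hw->irq_requested) {
		/* Stand-in for sleeping until the handler wakes us. */
		hw->sleeps++;
		if (hw->busy_ticks > max_ticks)
			return -ETIMEDOUT;
		hw->busy_ticks = 0;
		return 0;
	}

	/* No handler yet: poll Busy instead of sleeping for the full timeout. */
	for (t = 0; t < max_ticks; t++) {
		hw->polls++;
		if (!hw_busy(hw))
			return 0;
	}
	return -ETIMEDOUT;
}
```

The point of the split is that accesses made between probe and open return as soon as the hardware is done, instead of each one paying the full timeout.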

^ permalink raw reply

* [RFC net-next v1 1/5] net: phy: make mdiobus_create_device() function callable from Eth driver
From: Ong Boon Leong @ 2019-08-28 17:33 UTC (permalink / raw)
  To: davem, linux, mcoquelin.stm32, joabreu, f.fainelli, andrew
  Cc: netdev, linux-kernel, peppe.cavallaro, alexandre.torgue,
	weifeng.voon
In-Reply-To: <20190828173321.25334-1-boon.leong.ong@intel.com>

PHY converter and external PHY drivers depend on the MDIO functions of the
Eth driver, and completion of such MDIO reads/writes may fire an IRQ. The
ISR for the MDIO completion IRQ is registered in the open() function of
the driver.

For a PHY converter mdio driver whose probe() function registers an ISR and
uses the MDIO read/write functions, the MDIO ISR must already have been set
up before the mdio driver probe() is called. For this reason, mdio device
creation and registration needs to be callable from the Eth driver open()
function.

The existing way to register an mdio_device for a PHY converter, via
mdiobus_register_board_info(), is not feasible because the mdio device
creation and registration then happens inside the Eth driver probe()
function, specifically in mdiobus_setup_mdiodev_from_board_info(), which is
called by mdiobus_register().

Therefore, to fulfill the need mentioned above, we make
mdiobus_create_device() callable from the Eth driver open().

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
---
 drivers/net/phy/mdio_bus.c | 5 +++--
 include/linux/phy.h        | 7 +++++++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index bd04fe762056..06658d9197a1 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -338,8 +338,8 @@ static inline void of_mdiobus_link_mdiodev(struct mii_bus *mdio,
  *
  * Returns 0 on success or < 0 on error.
  */
-static int mdiobus_create_device(struct mii_bus *bus,
-				 struct mdio_board_info *bi)
+int mdiobus_create_device(struct mii_bus *bus,
+			  struct mdio_board_info *bi)
 {
 	struct mdio_device *mdiodev;
 	int ret = 0;
@@ -359,6 +359,7 @@ static int mdiobus_create_device(struct mii_bus *bus,
 
 	return ret;
 }
+EXPORT_SYMBOL(mdiobus_create_device);
 
 /**
  * __mdiobus_register - bring up all the PHYs on a given bus and attach them to bus
diff --git a/include/linux/phy.h b/include/linux/phy.h
index d26779f1fb6b..4524db57fe0b 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -1249,12 +1249,19 @@ struct mdio_board_info {
 #if IS_ENABLED(CONFIG_MDIO_DEVICE)
 int mdiobus_register_board_info(const struct mdio_board_info *info,
 				unsigned int n);
+int mdiobus_create_device(struct mii_bus *bus, struct mdio_board_info *bi);
 #else
 static inline int mdiobus_register_board_info(const struct mdio_board_info *i,
 					      unsigned int n)
 {
 	return 0;
 }
+
+static inline int mdiobus_create_device(struct mii_bus *bus,
+					struct mdio_board_info *bi)
+{
+	return 0;
+}
 #endif
 
 
-- 
2.17.0


^ permalink raw reply related

* [RFC net-next v1 2/5] net: phy: introduce mdiobus_get_mdio_device
From: Ong Boon Leong @ 2019-08-28 17:33 UTC (permalink / raw)
  To: davem, linux, mcoquelin.stm32, joabreu, f.fainelli, andrew
  Cc: netdev, linux-kernel, peppe.cavallaro, alexandre.torgue,
	weifeng.voon
In-Reply-To: <20190828173321.25334-1-boon.leong.ong@intel.com>

Add the function to get mdio_device based on the mdio addr.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
---
 drivers/net/phy/mdio_bus.c | 6 ++++++
 include/linux/mdio.h       | 1 +
 2 files changed, 7 insertions(+)

diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index 06658d9197a1..96ef94f87ff1 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -130,6 +130,12 @@ struct phy_device *mdiobus_get_phy(struct mii_bus *bus, int addr)
 }
 EXPORT_SYMBOL(mdiobus_get_phy);
 
+struct mdio_device *mdiobus_get_mdio_device(struct mii_bus *bus, int addr)
+{
+	return bus->mdio_map[addr];
+}
+EXPORT_SYMBOL(mdiobus_get_mdio_device);
+
 bool mdiobus_is_registered_device(struct mii_bus *bus, int addr)
 {
 	return bus->mdio_map[addr];
diff --git a/include/linux/mdio.h b/include/linux/mdio.h
index e8242ad88c81..e0ccd56a7ac0 100644
--- a/include/linux/mdio.h
+++ b/include/linux/mdio.h
@@ -315,6 +315,7 @@ int mdiobus_register_device(struct mdio_device *mdiodev);
 int mdiobus_unregister_device(struct mdio_device *mdiodev);
 bool mdiobus_is_registered_device(struct mii_bus *bus, int addr);
 struct phy_device *mdiobus_get_phy(struct mii_bus *bus, int addr);
+struct mdio_device *mdiobus_get_mdio_device(struct mii_bus *bus, int addr);
 
 /**
  * mdio_module_driver() - Helper macro for registering mdio drivers
-- 
2.17.0


^ permalink raw reply related

* [RFC net-next v1 3/5] net: phy: add private data to mdio_device
From: Ong Boon Leong @ 2019-08-28 17:33 UTC (permalink / raw)
  To: davem, linux, mcoquelin.stm32, joabreu, f.fainelli, andrew
  Cc: netdev, linux-kernel, peppe.cavallaro, alexandre.torgue,
	weifeng.voon
In-Reply-To: <20190828173321.25334-1-boon.leong.ong@intel.com>

PHY converter device is represented as mdio_device and requires private
data. So, we add pointer for private data to mdio_device struct.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
---
 include/linux/mdio.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/mdio.h b/include/linux/mdio.h
index e0ccd56a7ac0..fc7dfbe75006 100644
--- a/include/linux/mdio.h
+++ b/include/linux/mdio.h
@@ -40,6 +40,8 @@ struct mdio_device {
 	struct reset_control *reset_ctrl;
 	unsigned int reset_assert_delay;
 	unsigned int reset_deassert_delay;
+	/* Private data */
+	void *priv;
 };
 #define to_mdio_device(d) container_of(d, struct mdio_device, dev)
 
-- 
2.17.0


^ permalink raw reply related

* [RFC net-next v1 5/5] net: stmmac: add dwxpcs boardinfo for mdio_device registration
From: Ong Boon Leong @ 2019-08-28 17:33 UTC (permalink / raw)
  To: davem, linux, mcoquelin.stm32, joabreu, f.fainelli, andrew
  Cc: netdev, linux-kernel, peppe.cavallaro, alexandre.torgue,
	weifeng.voon
In-Reply-To: <20190828173321.25334-1-boon.leong.ong@intel.com>

For EHL & TGL Ethernet PCS, the mdio bus address is the same across all
TSN controller instances. The external PHY uses the default mdio bus
address of 0x0. As the Ethernet DW PCS is only applicable to the SGMII
interface, we register setup_intel_mgbe_phy_conv() only for TSN controllers
with an SGMII interface.

Also introduce a callback that removes the mdio_device, for unloading the
driver.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
---
 drivers/net/ethernet/stmicro/stmmac/Kconfig   |  1 +
 drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  2 +
 .../net/ethernet/stmicro/stmmac/stmmac_main.c | 25 +++++++++++
 .../net/ethernet/stmicro/stmmac/stmmac_pci.c  | 45 ++++++++++++++++++-
 include/linux/stmmac.h                        |  3 ++
 5 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig
index 2325b40dff6e..db4332863611 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
+++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
@@ -200,6 +200,7 @@ endif
 config STMMAC_PCI
 	tristate "STMMAC PCI bus support"
 	depends on STMMAC_ETH && PCI
+	select DWXPCS
 	---help---
 	  This selects the platform specific bus support for the stmmac driver.
 	  This driver was tested on XLINX XC2V3000 FF1152AMT0221
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index dcb2e29a5717..d4e232223941 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -29,6 +29,7 @@ struct stmmac_resources {
 	int wol_irq;
 	int lpi_irq;
 	int irq;
+	int phy_conv_irq;
 };
 
 struct stmmac_tx_info {
@@ -203,6 +204,7 @@ struct stmmac_priv {
 	void __iomem *mmcaddr;
 	void __iomem *ptpaddr;
 	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
+	int phy_conv_irq;
 
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *dbgfs_dir;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 06ccd216ae90..43e3d3799581 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2726,11 +2726,23 @@ static int stmmac_open(struct net_device *dev)
 		}
 	}
 
+	/* Start phy converter after MDIO bus IRQ handling is up */
+	if (priv->plat->setup_phy_conv) {
+		ret = priv->plat->setup_phy_conv(priv->mii, priv->phy_conv_irq);
+		if (ret < 0) {
+			netdev_err(priv->dev,
+				   "%s: ERROR: setup phy conv (error: %d)\n",
+				   __func__, ret);
+			goto phy_conv_error;
+		}
+	}
+
 	stmmac_enable_all_queues(priv);
 	stmmac_start_all_queues(priv);
 
 	return 0;
 
+phy_conv_error:
 lpiirq_error:
 	if (priv->wol_irq != dev->irq)
 		free_irq(priv->wol_irq, dev);
@@ -2760,6 +2772,7 @@ static int stmmac_release(struct net_device *dev)
 {
 	struct stmmac_priv *priv = netdev_priv(dev);
 	u32 chan;
+	int ret;
 
 	if (priv->eee_enabled)
 		del_timer_sync(&priv->eee_ctrl_timer);
@@ -2782,6 +2795,17 @@ static int stmmac_release(struct net_device *dev)
 	if (priv->lpi_irq > 0)
 		free_irq(priv->lpi_irq, dev);
 
+	/* Remove phy converter before stopping DMA */
+	if (priv->plat->remove_phy_conv) {
+		ret = priv->plat->remove_phy_conv(priv->mii);
+		if (ret < 0) {
+			netdev_err(priv->dev,
+				   "%s: ERROR: remove phy conv (error: %d)\n",
+				   __func__, ret);
+			return 0;
+		}
+	}
+
 	/* Stop TX/RX DMA and clear the descriptors */
 	stmmac_stop_all_dma(priv);
 
@@ -4424,6 +4448,7 @@ int stmmac_dvr_probe(struct device *device,
 	priv->dev->irq = res->irq;
 	priv->wol_irq = res->wol_irq;
 	priv->lpi_irq = res->lpi_irq;
+	priv->phy_conv_irq = res->phy_conv_irq;
 
 	if (!IS_ERR_OR_NULL(res->mac))
 		memcpy(priv->dev->dev_addr, res->mac, ETH_ALEN);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c
index 20906287b6d4..c3dfb0e9b025 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c
@@ -10,9 +10,10 @@
 *******************************************************************************/
 
 #include <linux/clk-provider.h>
+#include <linux/phy.h>
 #include <linux/pci.h>
 #include <linux/dmi.h>
-
+#include <linux/dwxpcs.h>
 #include "stmmac.h"
 
 /*
@@ -109,6 +110,42 @@ static const struct stmmac_pci_info stmmac_pci_info = {
 	.setup = stmmac_default_data,
 };
 
+static struct dwxpcs_platform_data intel_mgbe_pdata = {
+	.mode = DWXPCS_MODE_SGMII_AN,
+	.ext_phy_addr = 0x0,
+};
+
+static struct mdio_board_info intel_mgbe_bdinfo = {
+	.bus_id = "stmmac-1",
+	.modalias = "dwxpcs",
+	.mdio_addr = 0x16,
+	.platform_data = &intel_mgbe_pdata,
+};
+
+static int setup_intel_mgbe_phy_conv(struct mii_bus *bus, int irq)
+{
+	struct dwxpcs_platform_data *pdata = &intel_mgbe_pdata;
+
+	pdata->irq = irq;
+
+	return mdiobus_create_device(bus, &intel_mgbe_bdinfo);
+}
+
+static int remove_intel_mgbe_phy_conv(struct mii_bus *bus)
+{
+	struct mdio_board_info *bdinfo = &intel_mgbe_bdinfo;
+	struct mdio_device *mdiodev;
+
+	mdiodev = mdiobus_get_mdio_device(bus, bdinfo->mdio_addr);
+
+	if (!mdiodev)
+		return -1;
+
+	mdio_device_remove(mdiodev);
+
+	return 0;
+}
+
 static int intel_mgbe_common_data(struct pci_dev *pdev,
 				  struct plat_stmmacenet_data *plat)
 {
@@ -197,6 +234,11 @@ static int intel_mgbe_common_data(struct pci_dev *pdev,
 	/* Set the maxmtu to a default of JUMBO_LEN */
 	plat->maxmtu = JUMBO_LEN;
 
+	if (plat->interface == PHY_INTERFACE_MODE_SGMII) {
+		plat->setup_phy_conv = setup_intel_mgbe_phy_conv;
+		plat->remove_phy_conv = remove_intel_mgbe_phy_conv;
+	}
+
 	return 0;
 }
 
@@ -441,6 +483,7 @@ static int stmmac_pci_probe(struct pci_dev *pdev,
 	res.addr = pcim_iomap_table(pdev)[i];
 	res.wol_irq = pdev->irq;
 	res.irq = pdev->irq;
+	res.phy_conv_irq = res.irq;
 
 	return stmmac_dvr_probe(&pdev->dev, plat, &res);
 }
diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
index 7ad7ae35cf88..9ffd0e9c21b1 100644
--- a/include/linux/stmmac.h
+++ b/include/linux/stmmac.h
@@ -12,6 +12,7 @@
 #ifndef __STMMAC_PLATFORM_DATA
 #define __STMMAC_PLATFORM_DATA
 
+#include <linux/phy.h>
 #include <linux/platform_device.h>
 
 #define MTL_MAX_RX_QUEUES	8
@@ -162,6 +163,8 @@ struct plat_stmmacenet_data {
 	int (*init)(struct platform_device *pdev, void *priv);
 	void (*exit)(struct platform_device *pdev, void *priv);
 	struct mac_device_info *(*setup)(void *priv);
+	int (*setup_phy_conv)(struct mii_bus *bus, int irq);
+	int (*remove_phy_conv)(struct mii_bus *bus);
 	void *bsp_priv;
 	struct clk *stmmac_clk;
 	struct clk *pclk;
-- 
2.17.0


^ permalink raw reply related

