Netdev List
* [PATCH] e1000e: Use dev_get_drvdata where possible
From: Chuhong Yuan @ 2019-07-23 14:15 UTC
  Cc: Jeff Kirsher, David S . Miller, intel-wired-lan, netdev,
	linux-kernel, Chuhong Yuan

Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index e4baa13b3cda..fa2755849c54 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -6297,7 +6297,7 @@ static void e1000e_flush_lpic(struct pci_dev *pdev)
 
 static int e1000e_pm_freeze(struct device *dev)
 {
-	struct net_device *netdev = pci_get_drvdata(to_pci_dev(dev));
+	struct net_device *netdev = dev_get_drvdata(dev);
 	struct e1000_adapter *adapter = netdev_priv(netdev);
 
 	netif_device_detach(netdev);
@@ -6630,7 +6630,7 @@ static int __e1000_resume(struct pci_dev *pdev)
 #ifdef CONFIG_PM_SLEEP
 static int e1000e_pm_thaw(struct device *dev)
 {
-	struct net_device *netdev = pci_get_drvdata(to_pci_dev(dev));
+	struct net_device *netdev = dev_get_drvdata(dev);
 	struct e1000_adapter *adapter = netdev_priv(netdev);
 
 	e1000e_set_interrupt_capability(adapter);
@@ -6679,8 +6679,7 @@ static int e1000e_pm_resume(struct device *dev)
 
 static int e1000e_pm_runtime_idle(struct device *dev)
 {
-	struct pci_dev *pdev = to_pci_dev(dev);
-	struct net_device *netdev = pci_get_drvdata(pdev);
+	struct net_device *netdev = dev_get_drvdata(dev);
 	struct e1000_adapter *adapter = netdev_priv(netdev);
 	u16 eee_lp;
 
-- 
2.20.1
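
The substitution in the patch above is safe because the PCI accessor is
a thin wrapper around the generic one. A minimal userspace sketch of
that relationship follows. The struct layouts here are simplified
stand-ins, not the real kernel definitions (those live in
<linux/device.h> and <linux/pci.h>), and drvdata_paths_agree() is a
hypothetical helper added only for illustration.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
struct device {
	void *driver_data;
};

struct pci_dev {
	int devfn;
	struct device dev;	/* generic device embedded in the PCI device */
};

/* Recover the enclosing pci_dev from a pointer to its embedded device. */
#define to_pci_dev(d) \
	((struct pci_dev *)((char *)(d) - offsetof(struct pci_dev, dev)))

static void *dev_get_drvdata(const struct device *dev)
{
	return dev->driver_data;
}

static void dev_set_drvdata(struct device *dev, void *data)
{
	dev->driver_data = data;
}

/* The PCI helpers only forward to the embedded generic device. */
static void *pci_get_drvdata(struct pci_dev *pdev)
{
	return dev_get_drvdata(&pdev->dev);
}

static void pci_set_drvdata(struct pci_dev *pdev, void *data)
{
	dev_set_drvdata(&pdev->dev, data);
}

/* Hypothetical check: returns 1 when both lookup styles agree. */
static int drvdata_paths_agree(struct device *dev)
{
	return pci_get_drvdata(to_pci_dev(dev)) == dev_get_drvdata(dev);
}
```

Since to_pci_dev() followed by &pdev->dev lands back on the same struct
device, the round trip through struct pci_dev adds nothing when the
callback already receives a struct device pointer.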



* [PATCH] fm10k: Use dev_get_drvdata
From: Chuhong Yuan @ 2019-07-23 14:15 UTC
  Cc: Jeff Kirsher, David S . Miller, intel-wired-lan, netdev,
	linux-kernel, Chuhong Yuan

Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index e49fb51d3613..7bfc8a5b6f55 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -2352,7 +2352,7 @@ static int fm10k_handle_resume(struct fm10k_intfc *interface)
  **/
 static int __maybe_unused fm10k_resume(struct device *dev)
 {
-	struct fm10k_intfc *interface = pci_get_drvdata(to_pci_dev(dev));
+	struct fm10k_intfc *interface = dev_get_drvdata(dev);
 	struct net_device *netdev = interface->netdev;
 	struct fm10k_hw *hw = &interface->hw;
 	int err;
@@ -2379,7 +2379,7 @@ static int __maybe_unused fm10k_resume(struct device *dev)
  **/
 static int __maybe_unused fm10k_suspend(struct device *dev)
 {
-	struct fm10k_intfc *interface = pci_get_drvdata(to_pci_dev(dev));
+	struct fm10k_intfc *interface = dev_get_drvdata(dev);
 	struct net_device *netdev = interface->netdev;
 
 	netif_device_detach(netdev);
-- 
2.20.1



* [PATCH] i40e: Use dev_get_drvdata
From: Chuhong Yuan @ 2019-07-23 14:15 UTC
  Cc: Jeff Kirsher, David S . Miller, intel-wired-lan, netdev,
	linux-kernel, Chuhong Yuan

Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 9ebbe3da61bb..44da407e0bf9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -15605,8 +15605,7 @@ static void i40e_shutdown(struct pci_dev *pdev)
  **/
 static int __maybe_unused i40e_suspend(struct device *dev)
 {
-	struct pci_dev *pdev = to_pci_dev(dev);
-	struct i40e_pf *pf = pci_get_drvdata(pdev);
+	struct i40e_pf *pf = dev_get_drvdata(dev);
 	struct i40e_hw *hw = &pf->hw;
 
 	/* If we're already suspended, then there is nothing to do */
@@ -15656,8 +15655,7 @@ static int __maybe_unused i40e_suspend(struct device *dev)
  **/
 static int __maybe_unused i40e_resume(struct device *dev)
 {
-	struct pci_dev *pdev = to_pci_dev(dev);
-	struct i40e_pf *pf = pci_get_drvdata(pdev);
+	struct i40e_pf *pf = dev_get_drvdata(dev);
 	int err;
 
 	/* If we're not suspended, then there is nothing to do */
@@ -15674,7 +15672,7 @@ static int __maybe_unused i40e_resume(struct device *dev)
 	 */
 	err = i40e_restore_interrupt_scheme(pf);
 	if (err) {
-		dev_err(&pdev->dev, "Cannot restore interrupt scheme: %d\n",
+		dev_err(dev, "Cannot restore interrupt scheme: %d\n",
 			err);
 	}
 
-- 
2.20.1



* [PATCH] igb: Use dev_get_drvdata where possible
From: Chuhong Yuan @ 2019-07-23 14:16 UTC
  Cc: Jeff Kirsher, David S . Miller, intel-wired-lan, netdev,
	linux-kernel, Chuhong Yuan

Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index b4df3e319467..145f58ee0451 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -8879,8 +8879,7 @@ static int __maybe_unused igb_resume(struct device *dev)
 
 static int __maybe_unused igb_runtime_idle(struct device *dev)
 {
-	struct pci_dev *pdev = to_pci_dev(dev);
-	struct net_device *netdev = pci_get_drvdata(pdev);
+	struct net_device *netdev = dev_get_drvdata(dev);
 	struct igb_adapter *adapter = netdev_priv(netdev);
 
 	if (!igb_has_link(adapter))
-- 
2.20.1



* Re: [RFC PATCH net-next 10/12] drop_monitor: Add packet alert mode
From: Ido Schimmel @ 2019-07-23 14:16 UTC
  To: Neil Horman
  Cc: netdev, davem, dsahern, roopa, nikolay, jakub.kicinski, toke,
	andy, f.fainelli, andrew, vivien.didelot, mlxsw, Ido Schimmel
In-Reply-To: <20190723124340.GA10377@hmswarspite.think-freely.org>

On Tue, Jul 23, 2019 at 08:43:40AM -0400, Neil Horman wrote:
> On Mon, Jul 22, 2019 at 09:31:32PM +0300, Ido Schimmel wrote:
> > +static void net_dm_packet_work(struct work_struct *work)
> > +{
> > +	struct per_cpu_dm_data *data;
> > +	struct sk_buff_head list;
> > +	struct sk_buff *skb;
> > +	unsigned long flags;
> > +
> > +	data = container_of(work, struct per_cpu_dm_data, dm_alert_work);
> > +
> > +	__skb_queue_head_init(&list);
> > +
> > +	spin_lock_irqsave(&data->drop_queue.lock, flags);
> > +	skb_queue_splice_tail_init(&data->drop_queue, &list);
> > +	spin_unlock_irqrestore(&data->drop_queue.lock, flags);
> > +
> These functions are all executed in a per-cpu context.  While there's nothing
> wrong with using a spinlock here, I think you can get away with just doing
> local_irq_save and local_irq_restore.

Hi Neil,

Thanks a lot for reviewing. I might be missing something, but please
note that this function is executed from a workqueue and therefore the
CPU it is running on does not have to be the same CPU to which 'data'
belongs. If so, I'm not sure how I can avoid taking the spinlock, as
otherwise two different CPUs can modify the list concurrently.

> 
> Neil
> 
> > +	while ((skb = __skb_dequeue(&list)))
> > +		net_dm_packet_report(skb);
> > +}
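
The pattern under discussion (detach the whole shared queue while
holding its lock, then process the private copy with no lock held) can
be sketched in userspace as follows. This is only an illustration under
stated assumptions: struct locked_queue and the queue_* helpers are
hypothetical names, a C11 atomic_flag stands in for the kernel spinlock,
and interrupt masking is not modelled at all.

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
	struct node *next;
	int value;
};

/* Hypothetical shared queue: a singly linked list guarded by a spinlock. */
struct locked_queue {
	atomic_flag lock;
	struct node *head;
	struct node *tail;
};

static void queue_init(struct locked_queue *q)
{
	atomic_flag_clear(&q->lock);
	q->head = NULL;
	q->tail = NULL;
}

static void queue_lock(struct locked_queue *q)
{
	/* Busy-wait until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&q->lock,
						 memory_order_acquire))
		;
}

static void queue_unlock(struct locked_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Producer side: append one node under the lock. */
static void queue_push(struct locked_queue *q, struct node *n)
{
	n->next = NULL;
	queue_lock(q);
	if (q->tail)
		q->tail->next = n;
	else
		q->head = n;
	q->tail = n;
	queue_unlock(q);
}

/*
 * Consumer side: move the whole list to a private pointer while holding
 * the lock, then walk it without the lock. Returns the sum of the values
 * as a stand-in for "report each packet".
 */
static int queue_drain_sum(struct locked_queue *q)
{
	struct node *list;
	int sum = 0;

	queue_lock(q);
	list = q->head;
	q->head = NULL;
	q->tail = NULL;
	queue_unlock(q);

	for (; list; list = list->next)
		sum += list->value;
	return sum;
}
```

Because the consumer may run on a different CPU than the producer, the
lock is what keeps the two from touching head and tail at the same time.
Disabling local interrupts alone would only protect against code running
on the same CPU.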


* [PATCH] net: jme: Use dev_get_drvdata
From: Chuhong Yuan @ 2019-07-23 14:16 UTC
  Cc: Guo-Fu Tseng, David S . Miller, netdev, linux-kernel,
	Chuhong Yuan

Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
---
 drivers/net/ethernet/jme.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/jme.c b/drivers/net/ethernet/jme.c
index 0b668357db4d..db7e10e23310 100644
--- a/drivers/net/ethernet/jme.c
+++ b/drivers/net/ethernet/jme.c
@@ -3193,8 +3193,7 @@ jme_shutdown(struct pci_dev *pdev)
 static int
 jme_suspend(struct device *dev)
 {
-	struct pci_dev *pdev = to_pci_dev(dev);
-	struct net_device *netdev = pci_get_drvdata(pdev);
+	struct net_device *netdev = dev_get_drvdata(dev);
 	struct jme_adapter *jme = netdev_priv(netdev);
 
 	if (!netif_running(netdev))
@@ -3236,8 +3235,7 @@ jme_suspend(struct device *dev)
 static int
 jme_resume(struct device *dev)
 {
-	struct pci_dev *pdev = to_pci_dev(dev);
-	struct net_device *netdev = pci_get_drvdata(pdev);
+	struct net_device *netdev = dev_get_drvdata(dev);
 	struct jme_adapter *jme = netdev_priv(netdev);
 
 	if (!netif_running(netdev))
-- 
2.20.1



* Re: [PATCH] net: atheros: Use dev_get_drvdata
From: Joe Perches @ 2019-07-23 14:17 UTC
  To: Chuhong Yuan
  Cc: Jay Cliburn, Chris Snook, David S . Miller, netdev, linux-kernel
In-Reply-To: <20190723131856.31932-1-hslester96@gmail.com>

On Tue, 2019-07-23 at 21:18 +0800, Chuhong Yuan wrote:
> Instead of using to_pci_dev + pci_get_drvdata,
> use dev_get_drvdata to make code simpler.

unrelated trivia:

> diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
[]
> @@ -2422,8 +2422,7 @@ static int atl1c_close(struct net_device *netdev)
>  
>  static int atl1c_suspend(struct device *dev)
>  {
> -	struct pci_dev *pdev = to_pci_dev(dev);
> -	struct net_device *netdev = pci_get_drvdata(pdev);
> +	struct net_device *netdev = dev_get_drvdata(dev);
>  	struct atl1c_adapter *adapter = netdev_priv(netdev);
>  	struct atl1c_hw *hw = &adapter->hw;
>  	u32 wufc = adapter->wol;
> @@ -2437,7 +2436,7 @@ static int atl1c_suspend(struct device *dev)
>  
>  	if (wufc)
>  		if (atl1c_phy_to_ps_link(hw) != 0)
> -			dev_dbg(&pdev->dev, "phy power saving failed");
> +			dev_dbg(dev, "phy power saving failed");

These and similar uses could/should use netdev_dbg

			netdev_dbg(netdev, "phy power saving failed\n");

with the terminating newline too

> diff --git a/drivers/net/ethernet/atheros/atlx/atl1.c b/drivers/net/ethernet/atheros/atlx/atl1.c
[]
> @@ -2780,7 +2779,7 @@ static int atl1_suspend(struct device *dev)
>  		val = atl1_get_speed_and_duplex(hw, &speed, &duplex);
>  		if (val) {
>  			if (netif_msg_ifdown(adapter))
> -				dev_printk(KERN_DEBUG, &pdev->dev,
> +				dev_printk(KERN_DEBUG, dev,
>  					"error getting speed/duplex\n");

netdev_printk(KERN_DEBUG, netdev, etc...);




* [PATCH v2] tun: mark small packets as owned by the tap sock
From: Alexis Bauvin @ 2019-07-23 14:23 UTC
  To: stephen, davem, jasowang; +Cc: netdev, abauvin

- v1 -> v2: Move skb_set_owner_w to __tun_build_skb to reduce patch size

Small packets going out of a tap device go through an optimized code
path that uses build_skb() rather than sock_alloc_send_pskb(). The
latter calls skb_set_owner_w(), but the small packet code path does not.

The net effect is that small packets are not owned by the userland
application's socket (e.g. QEMU), while large packets are.
This can be seen with a TCP session, where packets are not owned when
the window size is small enough (around PAGE_SIZE), while they are once
the window grows (note that this requires the host to support virtio
tso for the guest to offload segmentation).
All this leads to inconsistent behaviour in the kernel, especially in
netfilter modules that use sk->socket (e.g. xt_owner).

Signed-off-by: Alexis Bauvin <abauvin@scaleway.com>
Fixes: 66ccbc9c87c2 ("tap: use build_skb() for small packet")
---
 drivers/net/tun.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 3d443597bd04..db16d7a13e00 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1599,7 +1599,8 @@ static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
 	return true;
 }
 
-static struct sk_buff *__tun_build_skb(struct page_frag *alloc_frag, char *buf,
+static struct sk_buff *__tun_build_skb(struct tun_file *tfile,
+				       struct page_frag *alloc_frag, char *buf,
 				       int buflen, int len, int pad)
 {
 	struct sk_buff *skb = build_skb(buf, buflen);
@@ -1609,6 +1610,7 @@ static struct sk_buff *__tun_build_skb(struct page_frag *alloc_frag, char *buf,
 
 	skb_reserve(skb, pad);
 	skb_put(skb, len);
+	skb_set_owner_w(skb, tfile->socket.sk);
 
 	get_page(alloc_frag->page);
 	alloc_frag->offset += buflen;
@@ -1686,7 +1688,8 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
 	 */
 	if (hdr->gso_type || !xdp_prog) {
 		*skb_xdp = 1;
-		return __tun_build_skb(alloc_frag, buf, buflen, len, pad);
+		return __tun_build_skb(tfile, alloc_frag, buf, buflen, len,
+				       pad);
 	}
 
 	*skb_xdp = 0;
@@ -1723,7 +1726,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
 	rcu_read_unlock();
 	local_bh_enable();
 
-	return __tun_build_skb(alloc_frag, buf, buflen, len, pad);
+	return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, pad);
 
 err_xdp:
 	put_page(alloc_frag->page);
-- 



* Re: b53 DSA : vlan tagging broken ?
From: Anand Raj Manickam @ 2019-07-23 14:26 UTC
  To: f.fainelli, netdev, andrew
In-Reply-To: <CAEyr1FS-8uBEMBS+7U4K8wBLJgPZD0Lxa4FyzuvYZ0RGhTH8fA@mail.gmail.com>

The issue was resolved by enabling vlan_filtering on the bridge and
changing the phy-mode from "rgmii-txid" to "rgmii" in the dts file.


On Mon, Jul 22, 2019 at 6:57 PM Anand Raj Manickam <anandrm@gmail.com> wrote:
>
> Hi,
> I had DSA working with the 4.9.184 kernel on BCM53125, rev 4 hardware.
> It had 2 bridges:
> br0            8000.00       no              lan1
>                                              lan2
>                                              lan3
>                                              eth0.101
>
> br1            8000.01       no              eth0.102
>                                              wan
> # bridge vlan
> port    vlan ids
> wan      102 PVID Egress Untagged
> wan      102 PVID Egress Untagged
> lan3     101 PVID Egress Untagged
> lan3     101 PVID Egress Untagged
> lan2     101 PVID Egress Untagged
> lan2     101 PVID Egress Untagged
> lan1     101 PVID Egress Untagged
> lan1     101 PVID Egress Untagged
> eth0.102  102 PVID
> eth0.102
> br1     1 PVID Egress Untagged
> eth0.101  101 PVID
> eth0.101
> br0     1 PVID Egress Untagged
>
> I upgraded the kernel to 5.2 and the behavior is broken. I had to strip
> the config down and go through the init scripts to find what was broken.
> The bridge vlan commands failed to add, as the newer kernel requires
> the vlan interfaces to be up.
> https://lkml.org/lkml/2018/5/22/887 - I saw the same behaviour as in this thread.
> I re-added them manually, so that we have the same bridge to vlan
> mapping as with the previous kernel,
> but the ingress packets for WAN were going to the LAN (bridge) and the
> egress packets were on the WAN (bridge), but the packets never leave the
> interface.
>
> I tested this with a simple config:
>  ip link add link eth0 name eth0.101 type vlan id 101
>  ip link add link eth0 name eth0.102 type vlan id 102
>  ip link set eth0.101 up
>  ip link set eth0.102 up
>  ip link add br0 type bridge
>   ip link add br1 type bridge
>   ip link set lan1 master br1
>   ip link set lan2 master br1
>   ip link set lan3 master br1
>   ip link set wan master br0
>   bridge vlan add vid 101 dev lan1 pvid untagged
>   bridge vlan add vid 101 dev lan2 pvid untagged
>   bridge vlan add vid 101 dev lan3 pvid untagged
>   bridge vlan add vid 102 dev wan pvid untagged
>   bridge vlan del vid 1 dev wan
>   bridge vlan del vid 1 dev lan1
>   bridge vlan del vid 1 dev lan2
>   bridge vlan del vid 1 dev lan3
>   ip link set eth0.101 master br1
>   ip link set eth0.102 master br0
>   bridge vlan del vid 1 dev eth0.102
>  bridge vlan del vid 1 dev eth0.101
>   bridge vlan add vid 102 dev eth0.102 pvid
>   bridge vlan add vid 101 dev eth0.101 pvid
>   ifconfig br0 up
>   ifconfig br1 up
>   ifconfig wan up
>   ifconfig lan1 up
>   ifconfig lan2 up
>   ifconfig lan3 up
>
> I do not see any packets with a tag on eth0
> ~# bridge vlan
> port    vlan ids
> wan      102 PVID Egress Untagged
> lan3     101 PVID Egress Untagged
> lan2     101 PVID Egress Untagged
> lan1     101 PVID Egress Untagged
> eth0.101         101 PVID
> eth0.102         102 PVID
> br0      1 PVID Egress Untagged
> br1      1 PVID Egress Untagged
>
> These are the loaded modules:
> # lsmod
> Module                  Size  Used by
> b53_mdio               16384  0
> b53_mmap               16384  0
> b53_common             28672  2 b53_mdio,b53_mmap
> tag_8021q              16384  0
> dsa_core               32768  9 b53_mdio,b53_common,b53_mmap,tag_8021q
> phylink                20480  2 b53_common,dsa_core
>
> If I reconfigure:
> #bridge vlan add vid 102 dev wan pvid untagged
> #bridge vlan add vid 102 dev eth0.102 pvid
> Then I see the tags for ingress packets, but no packets are
> transmitted on the wire, although the stats in ifconfig show them as
> transmitted.
> # ifconfig br0
> br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>         inet 10.17.33.137  netmask 255.255.255.0  broadcast 10.17.33.255
>         inet6 fe80::3ef8:4aff:fe9c:5a04  prefixlen 64  scopeid 0x20<link>
>         ether 3c:f8:4a:9c:5a:04  txqueuelen 1000  (Ethernet)
>         RX packets 616  bytes 32351 (31.5 KiB)
>         RX errors 0  dropped 0  overruns 0  frame 0
>         TX packets 679  bytes 30286 (29.5 KiB)
>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
> #ifconfig eth0
> eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>         inet6 fe80::d6:5ff:fec2:93af  prefixlen 64  scopeid 0x20<link>
>         ether 02:d6:05:c2:93:af  txqueuelen 1000  (Ethernet)
>         RX packets 58017  bytes 4004093 (3.8 MiB)
>         RX errors 0  dropped 0  overruns 0  frame 0
>         TX packets 4322  bytes 301365 (294.3 KiB)
>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>         device interrupt 56
>
> Can someone shed some light on this config?
> -Anand


* [PATCH net-next 0/4] nfp: Offload MPLS actions
From: John Hurley @ 2019-07-23 14:33 UTC
  To: netdev; +Cc: davem, simon.horman, jakub.kicinski, oss-drivers, John Hurley

The module act_mpls has recently been added to the kernel. This allows the
manipulation of MPLS headers on packets including push, pop and modify.
Add these new actions and parameters to the intermediate representation
API for hardware offload. Follow this by implementing the offload of these
MPLS actions in the NFP driver.

John Hurley (4):
  net: sched: include mpls actions in hardware intermediate
    representation
  nfp: flower: offload MPLS push action
  nfp: flower: offload MPLS pop action
  nfp: flower: offload MPLS set action

 drivers/net/ethernet/netronome/nfp/flower/action.c | 120 +++++++++++++++++++++
 drivers/net/ethernet/netronome/nfp/flower/cmsg.h   |  21 ++++
 include/net/flow_offload.h                         |  19 ++++
 include/net/tc_act/tc_mpls.h                       |  75 +++++++++++++
 net/sched/cls_api.c                                |  25 +++++
 5 files changed, 260 insertions(+)

-- 
2.7.4



* [PATCH net-next 1/4] net: sched: include mpls actions in hardware intermediate representation
From: John Hurley @ 2019-07-23 14:33 UTC
  To: netdev; +Cc: davem, simon.horman, jakub.kicinski, oss-drivers, John Hurley
In-Reply-To: <1563892442-4654-1-git-send-email-john.hurley@netronome.com>

A recent addition to TC actions is the ability to manipulate the MPLS
headers on packets.

In preparation to offload such actions to hardware, update the IR code to
accept and prepare the new actions.

Note that no driver currently implements the MPLS dec_ttl action so this
is not included.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 include/net/flow_offload.h   | 19 +++++++++++
 include/net/tc_act/tc_mpls.h | 75 ++++++++++++++++++++++++++++++++++++++++++++
 net/sched/cls_api.c          | 25 +++++++++++++++
 3 files changed, 119 insertions(+)

diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index b16d216..00b9aab 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -131,6 +131,9 @@ enum flow_action_id {
 	FLOW_ACTION_SAMPLE,
 	FLOW_ACTION_POLICE,
 	FLOW_ACTION_CT,
+	FLOW_ACTION_MPLS_PUSH,
+	FLOW_ACTION_MPLS_POP,
+	FLOW_ACTION_MPLS_MANGLE,
 };
 
 /* This is mirroring enum pedit_header_type definition for easy mapping between
@@ -184,6 +187,22 @@ struct flow_action_entry {
 			int action;
 			u16 zone;
 		} ct;
+		struct {				/* FLOW_ACTION_MPLS_PUSH */
+			u32		label;
+			__be16		proto;
+			u8		tc;
+			u8		bos;
+			u8		ttl;
+		} mpls_push;
+		struct {				/* FLOW_ACTION_MPLS_POP */
+			__be16		proto;
+		} mpls_pop;
+		struct {				/* FLOW_ACTION_MPLS_MANGLE */
+			u32		label;
+			u8		tc;
+			u8		bos;
+			u8		ttl;
+		} mpls_mangle;
 	};
 };
 
diff --git a/include/net/tc_act/tc_mpls.h b/include/net/tc_act/tc_mpls.h
index 4bc3d92..721de4f 100644
--- a/include/net/tc_act/tc_mpls.h
+++ b/include/net/tc_act/tc_mpls.h
@@ -27,4 +27,79 @@ struct tcf_mpls {
 };
 #define to_mpls(a) ((struct tcf_mpls *)a)
 
+static inline bool is_tcf_mpls(const struct tc_action *a)
+{
+#ifdef CONFIG_NET_CLS_ACT
+	if (a->ops && a->ops->id == TCA_ID_MPLS)
+		return true;
+#endif
+	return false;
+}
+
+static inline u32 tcf_mpls_action(const struct tc_action *a)
+{
+	u32 tcfm_action;
+
+	rcu_read_lock();
+	tcfm_action = rcu_dereference(to_mpls(a)->mpls_p)->tcfm_action;
+	rcu_read_unlock();
+
+	return tcfm_action;
+}
+
+static inline __be16 tcf_mpls_proto(const struct tc_action *a)
+{
+	__be16 tcfm_proto;
+
+	rcu_read_lock();
+	tcfm_proto = rcu_dereference(to_mpls(a)->mpls_p)->tcfm_proto;
+	rcu_read_unlock();
+
+	return tcfm_proto;
+}
+
+static inline u32 tcf_mpls_label(const struct tc_action *a)
+{
+	u32 tcfm_label;
+
+	rcu_read_lock();
+	tcfm_label = rcu_dereference(to_mpls(a)->mpls_p)->tcfm_label;
+	rcu_read_unlock();
+
+	return tcfm_label;
+}
+
+static inline u8 tcf_mpls_tc(const struct tc_action *a)
+{
+	u8 tcfm_tc;
+
+	rcu_read_lock();
+	tcfm_tc = rcu_dereference(to_mpls(a)->mpls_p)->tcfm_tc;
+	rcu_read_unlock();
+
+	return tcfm_tc;
+}
+
+static inline u8 tcf_mpls_bos(const struct tc_action *a)
+{
+	u8 tcfm_bos;
+
+	rcu_read_lock();
+	tcfm_bos = rcu_dereference(to_mpls(a)->mpls_p)->tcfm_bos;
+	rcu_read_unlock();
+
+	return tcfm_bos;
+}
+
+static inline u8 tcf_mpls_ttl(const struct tc_action *a)
+{
+	u8 tcfm_ttl;
+
+	rcu_read_lock();
+	tcfm_ttl = rcu_dereference(to_mpls(a)->mpls_p)->tcfm_ttl;
+	rcu_read_unlock();
+
+	return tcfm_ttl;
+}
+
 #endif /* __NET_TC_MPLS_H */
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index efd3cfb..3565d9a 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -36,6 +36,7 @@
 #include <net/tc_act/tc_sample.h>
 #include <net/tc_act/tc_skbedit.h>
 #include <net/tc_act/tc_ct.h>
+#include <net/tc_act/tc_mpls.h>
 
 extern const struct nla_policy rtm_tca_policy[TCA_MAX + 1];
 
@@ -3269,6 +3270,30 @@ int tc_setup_flow_action(struct flow_action *flow_action,
 			entry->id = FLOW_ACTION_CT;
 			entry->ct.action = tcf_ct_action(act);
 			entry->ct.zone = tcf_ct_zone(act);
+		} else if (is_tcf_mpls(act)) {
+			switch (tcf_mpls_action(act)) {
+			case TCA_MPLS_ACT_PUSH:
+				entry->id = FLOW_ACTION_MPLS_PUSH;
+				entry->mpls_push.proto = tcf_mpls_proto(act);
+				entry->mpls_push.label = tcf_mpls_label(act);
+				entry->mpls_push.tc = tcf_mpls_tc(act);
+				entry->mpls_push.bos = tcf_mpls_bos(act);
+				entry->mpls_push.ttl = tcf_mpls_ttl(act);
+				break;
+			case TCA_MPLS_ACT_POP:
+				entry->id = FLOW_ACTION_MPLS_POP;
+				entry->mpls_pop.proto = tcf_mpls_proto(act);
+				break;
+			case TCA_MPLS_ACT_MODIFY:
+				entry->id = FLOW_ACTION_MPLS_MANGLE;
+				entry->mpls_mangle.label = tcf_mpls_label(act);
+				entry->mpls_mangle.tc = tcf_mpls_tc(act);
+				entry->mpls_mangle.bos = tcf_mpls_bos(act);
+				entry->mpls_mangle.ttl = tcf_mpls_ttl(act);
+				break;
+			default:
+				goto err_out;
+			}
 		} else {
 			goto err_out;
 		}
-- 
2.7.4



* [PATCH net-next 2/4] nfp: flower: offload MPLS push action
From: John Hurley @ 2019-07-23 14:34 UTC
  To: netdev; +Cc: davem, simon.horman, jakub.kicinski, oss-drivers, John Hurley
In-Reply-To: <1563892442-4654-1-git-send-email-john.hurley@netronome.com>

Recent additions to the kernel include a TC action module to manipulate
MPLS headers on packets. Such actions are available to offload via the
flow_offload intermediate representation API.

Modify the NFP driver to allow the offload of MPLS push actions to
firmware.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/flower/action.c | 50 ++++++++++++++++++++++
 drivers/net/ethernet/netronome/nfp/flower/cmsg.h   |  7 +++
 2 files changed, 57 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/flower/action.c b/drivers/net/ethernet/netronome/nfp/flower/action.c
index 5a54fe8..9e18bec 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/action.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/action.c
@@ -2,10 +2,12 @@
 /* Copyright (C) 2017-2018 Netronome Systems, Inc. */
 
 #include <linux/bitfield.h>
+#include <linux/mpls.h>
 #include <net/pkt_cls.h>
 #include <net/tc_act/tc_csum.h>
 #include <net/tc_act/tc_gact.h>
 #include <net/tc_act/tc_mirred.h>
+#include <net/tc_act/tc_mpls.h>
 #include <net/tc_act/tc_pedit.h>
 #include <net/tc_act/tc_vlan.h>
 #include <net/tc_act/tc_tunnel_key.h>
@@ -25,6 +27,38 @@
 						 NFP_FL_TUNNEL_KEY | \
 						 NFP_FL_TUNNEL_GENEVE_OPT)
 
+static int
+nfp_fl_push_mpls(struct nfp_fl_push_mpls *push_mpls,
+		 const struct flow_action_entry *act,
+		 struct netlink_ext_ack *extack)
+{
+	size_t act_size = sizeof(struct nfp_fl_push_mpls);
+	u32 mpls_lse = 0;
+
+	push_mpls->head.jump_id = NFP_FL_ACTION_OPCODE_PUSH_MPLS;
+	push_mpls->head.len_lw = act_size >> NFP_FL_LW_SIZ;
+
+	/* BOS is optional in the TC action but required for offload. */
+	if (act->mpls_push.bos != ACT_MPLS_BOS_NOT_SET) {
+		mpls_lse |= act->mpls_push.bos << MPLS_LS_S_SHIFT;
+	} else {
+		NL_SET_ERR_MSG_MOD(extack, "unsupported offload: BOS field must explicitly be set for MPLS push");
+		return -EOPNOTSUPP;
+	}
+
+	/* Leave MPLS TC as a default value of 0 if not explicitly set. */
+	if (act->mpls_push.tc != ACT_MPLS_TC_NOT_SET)
+		mpls_lse |= act->mpls_push.tc << MPLS_LS_TC_SHIFT;
+
+	/* Proto, label and TTL are enforced and verified for MPLS push. */
+	mpls_lse |= act->mpls_push.label << MPLS_LS_LABEL_SHIFT;
+	mpls_lse |= act->mpls_push.ttl << MPLS_LS_TTL_SHIFT;
+	push_mpls->ethtype = act->mpls_push.proto;
+	push_mpls->lse = cpu_to_be32(mpls_lse);
+
+	return 0;
+}
+
 static void nfp_fl_pop_vlan(struct nfp_fl_pop_vlan *pop_vlan)
 {
 	size_t act_size = sizeof(struct nfp_fl_pop_vlan);
@@ -869,6 +903,7 @@ nfp_flower_loop_action(struct nfp_app *app, const struct flow_action_entry *act,
 	struct nfp_fl_set_ipv4_tun *set_tun;
 	struct nfp_fl_pre_tunnel *pre_tun;
 	struct nfp_fl_push_vlan *psh_v;
+	struct nfp_fl_push_mpls *psh_m;
 	struct nfp_fl_pop_vlan *pop_v;
 	int err;
 
@@ -975,6 +1010,21 @@ nfp_flower_loop_action(struct nfp_app *app, const struct flow_action_entry *act,
 		 */
 		*csum_updated &= ~act->csum_flags;
 		break;
+	case FLOW_ACTION_MPLS_PUSH:
+		if (*a_len +
+		    sizeof(struct nfp_fl_push_mpls) > NFP_FL_MAX_A_SIZ) {
+			NL_SET_ERR_MSG_MOD(extack, "unsupported offload: maximum allowed action list size exceeded at push MPLS");
+			return -EOPNOTSUPP;
+		}
+
+		psh_m = (struct nfp_fl_push_mpls *)&nfp_fl->action_data[*a_len];
+		nfp_fl->meta.shortcut = cpu_to_be32(NFP_FL_SC_ACT_NULL);
+
+		err = nfp_fl_push_mpls(psh_m, act, extack);
+		if (err)
+			return err;
+		*a_len += sizeof(struct nfp_fl_push_mpls);
+		break;
 	default:
 		/* Currently we do not handle any other actions. */
 		NL_SET_ERR_MSG_MOD(extack, "unsupported offload: unsupported action in action list");
diff --git a/drivers/net/ethernet/netronome/nfp/flower/cmsg.h b/drivers/net/ethernet/netronome/nfp/flower/cmsg.h
index 0f1706a..91af0fa 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/cmsg.h
+++ b/drivers/net/ethernet/netronome/nfp/flower/cmsg.h
@@ -68,6 +68,7 @@
 #define NFP_FL_ACTION_OPCODE_OUTPUT		0
 #define NFP_FL_ACTION_OPCODE_PUSH_VLAN		1
 #define NFP_FL_ACTION_OPCODE_POP_VLAN		2
+#define NFP_FL_ACTION_OPCODE_PUSH_MPLS		3
 #define NFP_FL_ACTION_OPCODE_SET_IPV4_TUNNEL	6
 #define NFP_FL_ACTION_OPCODE_SET_ETHERNET	7
 #define NFP_FL_ACTION_OPCODE_SET_IPV4_ADDRS	9
@@ -232,6 +233,12 @@ struct nfp_fl_push_geneve {
 	u8 opt_data[];
 };
 
+struct nfp_fl_push_mpls {
+	struct nfp_fl_act_head head;
+	__be16 ethtype;
+	__be32 lse;
+};
+
 /* Metadata with L2 (1W/4B)
  * ----------------------------------------------------------------
  *    3                   2                   1
-- 
2.7.4
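
For readers unfamiliar with the label stack entry that nfp_fl_push_mpls()
assembles in the patch above, here is a small standalone sketch of the
same bit packing. The shift values match the kernel's
include/uapi/linux/mpls.h. The function names mpls_lse_pack() and
mpls_lse_to_wire() are hypothetical, and the field masking is an
addition for illustration; the patch itself relies on the values having
been validated earlier.

```c
#include <stdint.h>

/* Shift values as defined in include/uapi/linux/mpls.h. */
#define MPLS_LS_LABEL_SHIFT	12
#define MPLS_LS_TC_SHIFT	9
#define MPLS_LS_S_SHIFT		8
#define MPLS_LS_TTL_SHIFT	0

/*
 * Pack a 20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag
 * and 8-bit TTL into one 32-bit label stack entry (host byte order).
 */
static uint32_t mpls_lse_pack(uint32_t label, uint8_t tc, uint8_t bos,
			      uint8_t ttl)
{
	uint32_t lse = 0;

	lse |= (label & 0xFFFFFu) << MPLS_LS_LABEL_SHIFT;
	lse |= (uint32_t)(tc & 0x7u) << MPLS_LS_TC_SHIFT;
	lse |= (uint32_t)(bos & 0x1u) << MPLS_LS_S_SHIFT;
	lse |= (uint32_t)ttl << MPLS_LS_TTL_SHIFT;

	return lse;
}

/* Store the entry in network (big-endian) byte order, like cpu_to_be32(). */
static void mpls_lse_to_wire(uint32_t lse, uint8_t out[4])
{
	out[0] = (uint8_t)(lse >> 24);
	out[1] = (uint8_t)(lse >> 16);
	out[2] = (uint8_t)(lse >> 8);
	out[3] = (uint8_t)lse;
}
```

With label 100, TC 0, BOS 1 and TTL 64 the packed value is 0x00064140,
which goes on the wire as the bytes 00 06 41 40.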



* [PATCH net-next 3/4] nfp: flower: offload MPLS pop action
From: John Hurley @ 2019-07-23 14:34 UTC
  To: netdev; +Cc: davem, simon.horman, jakub.kicinski, oss-drivers, John Hurley
In-Reply-To: <1563892442-4654-1-git-send-email-john.hurley@netronome.com>

Recent additions to the kernel include a TC action module to manipulate
MPLS headers on packets. Such actions are available to offload via the
flow_offload intermediate representation API.

Modify the NFP driver to allow the offload of MPLS pop actions to
firmware. The act_mpls TC module enforces that the next protocol is
supplied along with the pop action. Passing this to firmware allows it
to properly rebuild the underlying packet after the pop.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/flower/action.c | 25 ++++++++++++++++++++++
 drivers/net/ethernet/netronome/nfp/flower/cmsg.h   |  6 ++++++
 2 files changed, 31 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/flower/action.c b/drivers/net/ethernet/netronome/nfp/flower/action.c
index 9e18bec..7f288ae 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/action.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/action.c
@@ -59,6 +59,17 @@ nfp_fl_push_mpls(struct nfp_fl_push_mpls *push_mpls,
 	return 0;
 }
 
+static void
+nfp_fl_pop_mpls(struct nfp_fl_pop_mpls *pop_mpls,
+		const struct flow_action_entry *act)
+{
+	size_t act_size = sizeof(struct nfp_fl_pop_mpls);
+
+	pop_mpls->head.jump_id = NFP_FL_ACTION_OPCODE_POP_MPLS;
+	pop_mpls->head.len_lw = act_size >> NFP_FL_LW_SIZ;
+	pop_mpls->ethtype = act->mpls_pop.proto;
+}
+
 static void nfp_fl_pop_vlan(struct nfp_fl_pop_vlan *pop_vlan)
 {
 	size_t act_size = sizeof(struct nfp_fl_pop_vlan);
@@ -905,6 +916,7 @@ nfp_flower_loop_action(struct nfp_app *app, const struct flow_action_entry *act,
 	struct nfp_fl_push_vlan *psh_v;
 	struct nfp_fl_push_mpls *psh_m;
 	struct nfp_fl_pop_vlan *pop_v;
+	struct nfp_fl_pop_mpls *pop_m;
 	int err;
 
 	switch (act->id) {
@@ -1025,6 +1037,19 @@ nfp_flower_loop_action(struct nfp_app *app, const struct flow_action_entry *act,
 			return err;
 		*a_len += sizeof(struct nfp_fl_push_mpls);
 		break;
+	case FLOW_ACTION_MPLS_POP:
+		if (*a_len +
+		    sizeof(struct nfp_fl_pop_mpls) > NFP_FL_MAX_A_SIZ) {
+			NL_SET_ERR_MSG_MOD(extack, "unsupported offload: maximum allowed action list size exceeded at pop MPLS");
+			return -EOPNOTSUPP;
+		}
+
+		pop_m = (struct nfp_fl_pop_mpls *)&nfp_fl->action_data[*a_len];
+		nfp_fl->meta.shortcut = cpu_to_be32(NFP_FL_SC_ACT_NULL);
+
+		nfp_fl_pop_mpls(pop_m, act);
+		*a_len += sizeof(struct nfp_fl_pop_mpls);
+		break;
 	default:
 		/* Currently we do not handle any other actions. */
 		NL_SET_ERR_MSG_MOD(extack, "unsupported offload: unsupported action in action list");
diff --git a/drivers/net/ethernet/netronome/nfp/flower/cmsg.h b/drivers/net/ethernet/netronome/nfp/flower/cmsg.h
index 91af0fa..3198ad4 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/cmsg.h
+++ b/drivers/net/ethernet/netronome/nfp/flower/cmsg.h
@@ -69,6 +69,7 @@
 #define NFP_FL_ACTION_OPCODE_PUSH_VLAN		1
 #define NFP_FL_ACTION_OPCODE_POP_VLAN		2
 #define NFP_FL_ACTION_OPCODE_PUSH_MPLS		3
+#define NFP_FL_ACTION_OPCODE_POP_MPLS		4
 #define NFP_FL_ACTION_OPCODE_SET_IPV4_TUNNEL	6
 #define NFP_FL_ACTION_OPCODE_SET_ETHERNET	7
 #define NFP_FL_ACTION_OPCODE_SET_IPV4_ADDRS	9
@@ -239,6 +240,11 @@ struct nfp_fl_push_mpls {
 	__be32 lse;
 };
 
+struct nfp_fl_pop_mpls {
+	struct nfp_fl_act_head head;
+	__be16 ethtype;
+};
+
 /* Metadata with L2 (1W/4B)
  * ----------------------------------------------------------------
  *    3                   2                   1
-- 
2.7.4


^ permalink raw reply related
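
The pop action in the patch above encodes its own length in 4-byte long words (`act_size >> NFP_FL_LW_SIZ`) and passes the next protocol through unchanged. A minimal userspace sketch of that encoding follows. The `fl_*` names are hypothetical, the two-byte header layout is an assumption about `struct nfp_fl_act_head`, and the shift of 2 is an assumption about `NFP_FL_LW_SIZ`; only the opcode value 4 is taken from the patch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FL_LW_SIZ          2 /* assumed: log2 of the 4-byte long-word size */
#define FL_OPCODE_POP_MPLS 4 /* opcode value from the patch */

/* Assumed layout of the common action header: opcode plus length. */
struct fl_act_head {
	uint8_t jump_id;
	uint8_t len_lw; /* action length in 4-byte long words */
};

struct fl_pop_mpls {
	struct fl_act_head head;
	uint16_t ethtype; /* next protocol, network byte order */
};

/* Fill a pop action; proto_be must already be in network byte order. */
void fl_pop_mpls_fill(struct fl_pop_mpls *pop, uint16_t proto_be)
{
	assert(pop != NULL);

	pop->head.jump_id = FL_OPCODE_POP_MPLS;
	pop->head.len_lw = sizeof(*pop) >> FL_LW_SIZ;
	pop->ethtype = proto_be;
}
```

With this layout the structure is 4 bytes, so the encoded length is one long word.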

* [PATCH net-next 4/4] nfp: flower: offload MPLS set action
From: John Hurley @ 2019-07-23 14:34 UTC (permalink / raw)
  To: netdev; +Cc: davem, simon.horman, jakub.kicinski, oss-drivers, John Hurley
In-Reply-To: <1563892442-4654-1-git-send-email-john.hurley@netronome.com>

Recent additions to the kernel include a TC action module to manipulate
MPLS headers on packets. Such actions are available to offload via the
flow_offload intermediate representation API.

Modify the NFP driver to allow the offload of MPLS set actions to
firmware. Set actions update the outermost MPLS header. The offload
includes a mask to specify which fields should be set.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/flower/action.c | 45 ++++++++++++++++++++++
 drivers/net/ethernet/netronome/nfp/flower/cmsg.h   |  8 ++++
 2 files changed, 53 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/flower/action.c b/drivers/net/ethernet/netronome/nfp/flower/action.c
index 7f288ae..ff2f419 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/action.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/action.c
@@ -70,6 +70,37 @@ nfp_fl_pop_mpls(struct nfp_fl_pop_mpls *pop_mpls,
 	pop_mpls->ethtype = act->mpls_pop.proto;
 }
 
+static void
+nfp_fl_set_mpls(struct nfp_fl_set_mpls *set_mpls,
+		const struct flow_action_entry *act)
+{
+	size_t act_size = sizeof(struct nfp_fl_set_mpls);
+	u32 mpls_lse = 0, mpls_mask = 0;
+
+	set_mpls->head.jump_id = NFP_FL_ACTION_OPCODE_SET_MPLS;
+	set_mpls->head.len_lw = act_size >> NFP_FL_LW_SIZ;
+
+	if (act->mpls_mangle.label != ACT_MPLS_LABEL_NOT_SET) {
+		mpls_lse |= act->mpls_mangle.label << MPLS_LS_LABEL_SHIFT;
+		mpls_mask |= MPLS_LS_LABEL_MASK;
+	}
+	if (act->mpls_mangle.tc != ACT_MPLS_TC_NOT_SET) {
+		mpls_lse |= act->mpls_mangle.tc << MPLS_LS_TC_SHIFT;
+		mpls_mask |= MPLS_LS_TC_MASK;
+	}
+	if (act->mpls_mangle.bos != ACT_MPLS_BOS_NOT_SET) {
+		mpls_lse |= act->mpls_mangle.bos << MPLS_LS_S_SHIFT;
+		mpls_mask |= MPLS_LS_S_MASK;
+	}
+	if (act->mpls_mangle.ttl) {
+		mpls_lse |= act->mpls_mangle.ttl << MPLS_LS_TTL_SHIFT;
+		mpls_mask |= MPLS_LS_TTL_MASK;
+	}
+
+	set_mpls->lse = cpu_to_be32(mpls_lse);
+	set_mpls->lse_mask = cpu_to_be32(mpls_mask);
+}
+
 static void nfp_fl_pop_vlan(struct nfp_fl_pop_vlan *pop_vlan)
 {
 	size_t act_size = sizeof(struct nfp_fl_pop_vlan);
@@ -917,6 +948,7 @@ nfp_flower_loop_action(struct nfp_app *app, const struct flow_action_entry *act,
 	struct nfp_fl_push_mpls *psh_m;
 	struct nfp_fl_pop_vlan *pop_v;
 	struct nfp_fl_pop_mpls *pop_m;
+	struct nfp_fl_set_mpls *set_m;
 	int err;
 
 	switch (act->id) {
@@ -1050,6 +1082,19 @@ nfp_flower_loop_action(struct nfp_app *app, const struct flow_action_entry *act,
 		nfp_fl_pop_mpls(pop_m, act);
 		*a_len += sizeof(struct nfp_fl_pop_mpls);
 		break;
+	case FLOW_ACTION_MPLS_MANGLE:
+		if (*a_len +
+		    sizeof(struct nfp_fl_set_mpls) > NFP_FL_MAX_A_SIZ) {
+			NL_SET_ERR_MSG_MOD(extack, "unsupported offload: maximum allowed action list size exceeded at set MPLS");
+			return -EOPNOTSUPP;
+		}
+
+		set_m = (struct nfp_fl_set_mpls *)&nfp_fl->action_data[*a_len];
+		nfp_fl->meta.shortcut = cpu_to_be32(NFP_FL_SC_ACT_NULL);
+
+		nfp_fl_set_mpls(set_m, act);
+		*a_len += sizeof(struct nfp_fl_set_mpls);
+		break;
 	default:
 		/* Currently we do not handle any other actions. */
 		NL_SET_ERR_MSG_MOD(extack, "unsupported offload: unsupported action in action list");
diff --git a/drivers/net/ethernet/netronome/nfp/flower/cmsg.h b/drivers/net/ethernet/netronome/nfp/flower/cmsg.h
index 3198ad4..3324394 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/cmsg.h
+++ b/drivers/net/ethernet/netronome/nfp/flower/cmsg.h
@@ -72,6 +72,7 @@
 #define NFP_FL_ACTION_OPCODE_POP_MPLS		4
 #define NFP_FL_ACTION_OPCODE_SET_IPV4_TUNNEL	6
 #define NFP_FL_ACTION_OPCODE_SET_ETHERNET	7
+#define NFP_FL_ACTION_OPCODE_SET_MPLS		8
 #define NFP_FL_ACTION_OPCODE_SET_IPV4_ADDRS	9
 #define NFP_FL_ACTION_OPCODE_SET_IPV4_TTL_TOS	10
 #define NFP_FL_ACTION_OPCODE_SET_IPV6_SRC	11
@@ -245,6 +246,13 @@ struct nfp_fl_pop_mpls {
 	__be16 ethtype;
 };
 
+struct nfp_fl_set_mpls {
+	struct nfp_fl_act_head head;
+	__be16 reserved;
+	__be32 lse_mask;
+	__be32 lse;
+};
+
 /* Metadata with L2 (1W/4B)
  * ----------------------------------------------------------------
  *    3                   2                   1
-- 
2.7.4


^ permalink raw reply related
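
The set action in the patch above builds a label stack entry together with a mask that tells firmware which fields to overwrite. The userspace sketch below reproduces that logic in host byte order (the driver additionally converts both words with `cpu_to_be32()`). The shift and mask constants follow the MPLS label stack entry layout; the `*_NOT_SET` sentinel values and all `sketch`-style names here are assumptions for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MPLS label stack entry layout: label(20) | tc(3) | s(1) | ttl(8) */
#define LS_LABEL_MASK  0xFFFFF000u
#define LS_LABEL_SHIFT 12
#define LS_TC_MASK     0x00000E00u
#define LS_TC_SHIFT    9
#define LS_S_MASK      0x00000100u
#define LS_S_SHIFT     8
#define LS_TTL_MASK    0x000000FFu
#define LS_TTL_SHIFT   0

/* Assumed sentinel values meaning "leave this field alone". */
#define LABEL_NOT_SET 0xFFFFFFFFu
#define TC_NOT_SET    0xFFu
#define BOS_NOT_SET   0xFFu

struct mpls_mangle {
	uint32_t label;
	uint8_t tc;
	uint8_t bos;
	uint8_t ttl; /* 0 means "do not set", as in the patch */
};

/* Compute the entry and the mask of fields to overwrite. */
void mpls_build_lse(const struct mpls_mangle *m, uint32_t *lse, uint32_t *mask)
{
	assert(m != NULL && lse != NULL && mask != NULL);

	*lse = 0;
	*mask = 0;
	if (m->label != LABEL_NOT_SET) {
		*lse |= m->label << LS_LABEL_SHIFT;
		*mask |= LS_LABEL_MASK;
	}
	if (m->tc != TC_NOT_SET) {
		*lse |= (uint32_t)m->tc << LS_TC_SHIFT;
		*mask |= LS_TC_MASK;
	}
	if (m->bos != BOS_NOT_SET) {
		*lse |= (uint32_t)m->bos << LS_S_SHIFT;
		*mask |= LS_S_MASK;
	}
	if (m->ttl) {
		*lse |= (uint32_t)m->ttl << LS_TTL_SHIFT;
		*mask |= LS_TTL_MASK;
	}
}
```

Setting only label 100 and TTL 64, for example, leaves the TC and bottom-of-stack bits out of the mask, so firmware would keep whatever the packet already carries in those fields.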

* Re: [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf
From: Andy Lutomirski @ 2019-07-23 15:11 UTC (permalink / raw)
  To: Song Liu
  Cc: Andy Lutomirski, Kees Cook, linux-security@vger.kernel.org,
	Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team,
	Lorenz Bauer, Jann Horn, Greg KH, Linux API
In-Reply-To: <4A7A225A-6C23-4C0F-9A95-7C6C56B281ED@fb.com>

On Mon, Jul 22, 2019 at 1:54 PM Song Liu <songliubraving@fb.com> wrote:
>
> Hi Andy, Lorenz, and all,
>
> > On Jul 2, 2019, at 2:32 PM, Andy Lutomirski <luto@kernel.org> wrote:
> >
> > On Tue, Jul 2, 2019 at 2:04 PM Kees Cook <keescook@chromium.org> wrote:
> >>
> >> On Mon, Jul 01, 2019 at 06:59:13PM -0700, Andy Lutomirski wrote:
> >>> I think I'm understanding your motivation.  You're not trying to make
> >>> bpf() generically usable without privilege -- you're trying to create
> >>> a way to allow certain users to access dangerous bpf functionality
> >>> within some limits.
> >>>
> >>> That's a perfectly fine goal, but I think you're reinventing the
> >>> wheel, and the wheel you're reinventing is quite complicated and
> >>> already exists.  I think you should teach bpftool to be secure when
> >>> installed setuid root or with fscaps enabled and put your policy in
> >>> bpftool.  If you want to harden this a little bit, it would seem
> >>> entirely reasonable to add a new CAP_BPF_ADMIN and change some, but
> >>> not all, of the capable() checks to check CAP_BPF_ADMIN instead of the
> >>> capabilities that they currently check.
> >>
> >> If finer grained controls are wanted, it does seem like the /dev/bpf
> >> path makes the most sense. open, request abilities, use fd. The open can
> >> be mediated by DAC and LSM. The request can be mediated by LSM. This
> >> provides a way to add policy at the LSM level and at the tool level.
> >> (i.e. For tool-level controls: leave LSM wide open, make /dev/bpf owned
> >> by "bpfadmin" and bpftool becomes setuid "bpfadmin". For fine-grained
> >> controls, leave /dev/bpf wide open and add policy to SELinux, etc.)
> >>
> >> With only a new CAP, you don't get the fine-grained controls. (The
> >> "request abilities" part is the key there.)
> >
> > Sure you do: the effective set.  It has somewhat bizarre defaults, but
> > I don't think that's a real problem.  Also, this wouldn't be like
> > CAP_DAC_READ_SEARCH -- you can't accidentally use your BPF caps.
> >
> > I think that a /dev capability-like object isn't totally nuts, but I
> > think we should do it well, and this patch doesn't really achieve
> > that.  But I don't think bpf wants fine-grained controls like this at
> > all -- as I pointed upthread, a fine-grained solution really wants
> > different treatment for the different capable() checks, and a bunch of
> > them won't resemble capabilities or /dev/bpf at all.
>
> With 5.3-rc1 out, I am back on this. :)
>
> How about we modify the set as:
>   1. Introduce sys_bpf_with_cap() that takes fd of /dev/bpf.

I'm fine with this in principle, but:

>   2. Better handling of capable() calls through bpf code. I guess the
>      biggest problem here is is_priv in verifier.c:bpf_check().

I think it would be good to understand exactly what /dev/bpf will
enable one to do.  Without some care, it would just become the next
CAP_SYS_ADMIN: if you can open it, sure, you're not root, but you can
intercept network traffic, modify cgroup behavior, and do plenty of
other things, any of which can probably be used to completely take
over the system.

It would also be nice to understand why you can't do what you need to
do entirely in user code using setuid or fscaps.

Finally, at risk of rehashing some old arguments, I'll point out that
the bpf() syscall is an unusual design to begin with.  As an example,
consider bpf_prog_attach().  Outside of bpf(), if I want to change the
behavior of a cgroup, I would write to a file in
/sys/kernel/cgroup/unified/whatever/, and normal DAC and MAC rules
apply.  With bpf(), however, I just call bpf() to attach a program to
the cgroup.  bpf() says "oh, you are capable(CAP_NET_ADMIN) -- go for
it!".  Unless I missed something major, and I just re-read the code,
there is no check that the caller has write or LSM permission to
anything at all in cgroupfs, and the existing API would make it very
awkward to impose any kind of DAC rules here.

So I think it might actually be time to repay some technical debt and
come up with a real fix.  As a less intrusive approach, you could see
about requiring ownership of the cgroup directory instead of
CAP_NET_ADMIN.  As a more intrusive but perhaps better approach, you
could invert the logic to make it work like everything outside of
cgroup: add pseudo-files like bpf.inet_ingress to the cgroup
directories, and require a writable fd to *that* to a new improved
attach API.  If a user could do:

int fd = open("/sys/fs/cgroup/.../bpf.inet_attach", O_RDWR);
/* usual DAC and MAC policy applies */
int bpf_fd = setup the bpf stuff;
/* no privilege required, unless the program is huge or needs is_priv */
bpf(BPF_IMPROVED_ATTACH, target = fd, program = bpf_fd);

there would be no capabilities or global privilege at all required for
this.  It would just work with cgroup delegation, containers, etc.

I think you could even pull off this type of API change with only
libbpf changes.  In particular, there's this code:

int bpf_prog_attach(int prog_fd, int target_fd, enum bpf_attach_type type,
                    unsigned int flags)
{
        union bpf_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.target_fd     = target_fd;
        attr.attach_bpf_fd = prog_fd;
        attr.attach_type   = type;
        attr.attach_flags  = flags;

        return sys_bpf(BPF_PROG_ATTACH, &attr, sizeof(attr));
}

This would instead do something like:

int specific_target_fd = openat(target_fd, bpf_type_to_target[type], O_RDWR);
attr.target_fd = specific_target_fd;
...

return sys_bpf(BPF_PROG_IMPROVED_ATTACH, &attr, sizeof(attr));
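
The `bpf_type_to_target` table used above is not defined anywhere in this message. A minimal sketch of what such a lookup might look like follows; the enum, the pseudo-file names, and the helper functions are all hypothetical, and no such files exist in cgroupfs today. The point is only that the privilege check reduces to an ordinary `openat()` with `O_RDWR`, so the usual DAC and MAC rules decide who may attach.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical attach types; not the kernel's enum bpf_attach_type. */
enum sketch_attach_type {
	SKETCH_ATTACH_INET_INGRESS,
	SKETCH_ATTACH_INET_EGRESS,
	SKETCH_ATTACH_MAX
};

/* Hypothetical per-cgroup pseudo-file for each attach type. */
static const char *const sketch_type_to_target[SKETCH_ATTACH_MAX] = {
	[SKETCH_ATTACH_INET_INGRESS] = "bpf.inet_ingress",
	[SKETCH_ATTACH_INET_EGRESS]  = "bpf.inet_egress",
};

/* Return the pseudo-file name for a type, or NULL if out of range. */
const char *sketch_target_name(int type)
{
	if (type < 0 || type >= SKETCH_ATTACH_MAX)
		return NULL;
	assert(sketch_type_to_target[type] != NULL);
	return sketch_type_to_target[type];
}

/*
 * Open the attach point relative to a cgroup directory fd. Returns a
 * writable fd, or -1 if the type is unknown or the open is refused.
 */
int sketch_open_target(int cgroup_dirfd, int type)
{
	const char *name = sketch_target_name(type);

	if (name == NULL)
		return -1;
	return openat(cgroup_dirfd, name, O_RDWR);
}
```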

Would this solve your problem without needing /dev/bpf at all?

--Andy

^ permalink raw reply

* Re: [RFC PATCH net-next 00/12] drop_monitor: Capture dropped packets and metadata
From: Ido Schimmel @ 2019-07-23 15:14 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: netdev, davem, nhorman, dsahern, roopa, nikolay, jakub.kicinski,
	andy, f.fainelli, andrew, vivien.didelot, mlxsw, Ido Schimmel
In-Reply-To: <875znt3pxu.fsf@toke.dk>

On Tue, Jul 23, 2019 at 02:17:49PM +0200, Toke Høiland-Jørgensen wrote:
> Ido Schimmel <idosch@idosch.org> writes:
> 
> > On Mon, Jul 22, 2019 at 09:43:15PM +0200, Toke Høiland-Jørgensen wrote:
> >> Is there a mechanism for the user to filter the packets before they are
> >> sent to userspace? A bpf filter would be the obvious choice I guess...
> >
> > Hi Toke,
> >
> > Yes, it's on my TODO list to write an eBPF program that only allows
> > "unique" packets to be enqueued on the netlink socket. Where "unique" is
> > defined as {5-tuple, PC}. The rest of the copies will be counted in an
> > eBPF map, which is just a hash table keyed by {5-tuple, PC}.
> 
> Yeah, that's a good idea. Or even something simpler like tcpdump-style
> filters for the packets returned by drop monitor (say if I'm just trying
> to figure out what happens to my HTTP requests).

Yep, that's a good idea. I guess different users will use different
programs. Will look into both options.

> > I think it would be good to have the program as part of the bcc
> > repository [1]. What do you think?
> 
> Sure. We could also add it to the XDP tutorial[2]; it could go into a
> section on introspection and debugging (just added a TODO about that[3]).

Great!

> >> For integrating with XDP the trick would be to find a way to do it that
> >> doesn't incur any overhead when it's not enabled. Are you envisioning
> >> that this would be enabled separately for the different "modes" (kernel,
> >> hardware, XDP, etc)?
> >
> > Yes. Drop monitor has commands to enable and disable tracing, but they
> > don't carry any attributes at the moment. My plan is to add an attribute
> > (e.g., 'NET_DM_ATTR_DROP_TYPE') that will specify the type of drops
> > you're interested in - SW/HW/XDP. If the attribute is not specified,
> > then current behavior is maintained and all the drop types are traced.
> > But if you're only interested in SW drops, then overhead for the rest
> > should be zero.
> 
> Makes sense (although "should be" is the key here ;)).
> 
> I'm also worried about the drop monitor getting overwhelmed; if you turn
> it on for XDP and you're running a filtering program there, you'll
> suddenly get *a lot* of drops.
> 
> As I read your patch, the current code can basically queue up an
> unbounded number of packets waiting to go out over netlink, can't it?

That's a very good point. Each CPU holds a drop list. It probably makes
sense to limit it by default (to 1000?) and allow the user to change it
later, if needed. I can expose a counter that shows how many packets
were dropped because of this limit. It can be used as an indication to
adjust the queue length (or flip to 'summary' mode).
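
A rough sketch of the accounting I have in mind, in plain userspace C. It models only the length bookkeeping of one per-CPU list; the structure and function names are made up for illustration and are not part of the patch set, and the real code would do this under the queue lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct drop_queue {
	size_t len;               /* packets currently queued */
	size_t limit;             /* maximum queue length, user tunable */
	unsigned long tail_drops; /* packets discarded because the queue was full */
};

void drop_queue_init(struct drop_queue *q, size_t limit)
{
	assert(q != NULL);

	q->len = 0;
	q->limit = limit;
	q->tail_drops = 0;
}

/* Account for one packet; returns false if it had to be discarded. */
bool drop_queue_enqueue(struct drop_queue *q)
{
	assert(q != NULL);

	if (q->len >= q->limit) {
		q->tail_drops++;
		return false;
	}
	q->len++;
	return true;
}
```

A steadily growing tail_drops value would be the hint to raise the limit or switch to 'summary' mode.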

^ permalink raw reply

* Re: [PATCH v12 1/5] can: m_can: Create a m_can platform framework
From: Dan Murphy @ 2019-07-23 15:14 UTC (permalink / raw)
  To: wg, mkl, davem, gregkh; +Cc: linux-can, netdev, linux-kernel
In-Reply-To: <f236a88a-485c-9002-1e4a-9a5ad0e1c81f@ti.com>

Hello

On 7/10/19 7:08 AM, Dan Murphy wrote:
> Hello
>
> On 6/17/19 10:09 AM, Dan Murphy wrote:
>> Marc
>>
>> On 6/10/19 11:35 AM, Dan Murphy wrote:
>>> Bump
>>>
>>> On 6/6/19 8:16 AM, Dan Murphy wrote:
>>>> Marc
>>>>
>>>> Bump
>>>>
>>>> On 5/31/19 6:51 AM, Dan Murphy wrote:
>>>>> Marc
>>>>>
>>>>> On 5/15/19 3:54 PM, Dan Murphy wrote:
>>>>>> Marc
>>>>>>
>>>>>> On 5/9/19 11:11 AM, Dan Murphy wrote:
>>>>>>> Create a m_can platform framework that peripheral
>>>>>>> devices can register to and use common code and register sets.
>>>>>>> The peripheral devices may provide read/write and configuration
>>>>>>> support of the IP.
>>>>>>>
>>>>>>> Acked-by: Wolfgang Grandegger <wg@grandegger.com>
>>>>>>> Signed-off-by: Dan Murphy <dmurphy@ti.com>
>>>>>>> ---
>>>>>>>
>>>>>>> v12 - Update the m_can_read/write functions to create a 
>>>>>>> backtrace if the callback
>>>>>>> pointer is NULL. - https://lore.kernel.org/patchwork/patch/1052302/
>>>>>>>
>>>>>> Is this able to be merged now?
>>>>>
>>>>> ping
>>
>> Wondering if there is anything else we need to do?
>>
>> The part has officially shipped and we had hoped to have driver 
>> support in Linux as part of the announcement.
>>
> Is this being sent in a PR for 5.3?
>
> Dan
>
Adding Greg to this thread as I have no idea what is going on with
this.  This patch set has missed 2 merge windows and has been ready
since May.  Our customers are requesting status, but we can only point
to the mail thread.

Here is the reference for the pinging I have done without reply:

https://lore.kernel.org/patchwork/patch/1071894/

Dan


>
>> Dan
>>
>>
>>>>>
>>>>>
>>>>>> Dan
>>>>>>
>>>>>> <snip>

^ permalink raw reply

* Re: [PATCH bpf] tools/bpf: fix bpftool build with OUTPUT set
From: Ilya Leoshkevich @ 2019-07-23 15:14 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: bpf, netdev, lmb, gor, heiko.carstens, Arnaldo Carvalho de Melo
In-Reply-To: <20190719111716.1cbf62d1@cakuba.netronome.com>

> Am 19.07.2019 um 20:17 schrieb Jakub Kicinski <jakub.kicinski@netronome.com>:
> 
> On Fri, 19 Jul 2019 15:12:24 +0200, Ilya Leoshkevich wrote:
>>> Am 18.07.2019 um 20:51 schrieb Jakub Kicinski <jakub.kicinski@netronome.com>:
>>> 
>>> We should probably make a script with all the ways of calling make
>>> should work. Otherwise we can lose track too easily.  
>> 
>> Thanks for the script!
>> 
>> I’m trying to make it all pass now, and hitting a weird issue in the
>> Kbuild case. The build prints "No rule to make target
>> 'scripts/Makefile.ubsan.o'" and proceeds with an empty BPFTOOL_VERSION,
>> which causes problems later on.
> 
> Does it only break with UBSAN enabled?

No, all the time. I think this is a coincidence - make happens to scan
scripts/Makefile.ubsan first.

> 
>> I've found that this is caused by sub_make_done=1 environment variable,
>> and unsetting it indeed fixes the problem, since the root Makefile no
>> longer uses the implicit %.o rule.
>> 
>> However, I wonder if that would be acceptable in the final version of
>> the patch, and whether there is a cleaner way to achieve the same
>> effect?
> 
> I'm not sure to be honest. Did you check how perf deals with that?

perf obtains the version using "git describe". However, if we are
building it from a tarball, it falls back to "make kernelversion" and
fails in a similar way:

linux-5.3-rc1$ make defconfig
linux-5.3-rc1$ make tools/perf
<snip>
make[6]: Circular scripts/Makefile.ubsan.mod <- scripts/Makefile.ubsan.o dependency dropped.
make[6]: m2c: Command not found
make[6]: *** [<builtin>: scripts/Makefile.ubsan.o] Error 127
make[5]: *** [Makefile:1765: scripts/Makefile.ubsan.o] Error 2
<snip>

The same trick helps:

--- tools/perf/util/PERF-VERSION-GEN.orig	2019-07-23 17:12:07.621123187 +0200
+++ tools/perf/util/PERF-VERSION-GEN	2019-07-23 17:12:33.441133619 +0200
@@ -26,7 +26,7 @@
 fi
 if test -z "$TAG"
 then
-	TAG=$(MAKEFLAGS= make -sC ../.. kernelversion)
+	TAG=$(MAKEFLAGS= sub_make_done= make -sC ../.. kernelversion)
 fi
 VN="$TAG$CID"
 if test -n "$CID"

^ permalink raw reply

* Re: [RFC PATCH net-next 10/12] drop_monitor: Add packet alert mode
From: Neil Horman @ 2019-07-23 15:14 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, dsahern, roopa, nikolay, jakub.kicinski, toke,
	andy, f.fainelli, andrew, vivien.didelot, mlxsw, Ido Schimmel
In-Reply-To: <20190723141625.GA8972@splinter>

On Tue, Jul 23, 2019 at 05:16:25PM +0300, Ido Schimmel wrote:
> On Tue, Jul 23, 2019 at 08:43:40AM -0400, Neil Horman wrote:
> > On Mon, Jul 22, 2019 at 09:31:32PM +0300, Ido Schimmel wrote:
> > > +static void net_dm_packet_work(struct work_struct *work)
> > > +{
> > > +	struct per_cpu_dm_data *data;
> > > +	struct sk_buff_head list;
> > > +	struct sk_buff *skb;
> > > +	unsigned long flags;
> > > +
> > > +	data = container_of(work, struct per_cpu_dm_data, dm_alert_work);
> > > +
> > > +	__skb_queue_head_init(&list);
> > > +
> > > +	spin_lock_irqsave(&data->drop_queue.lock, flags);
> > > +	skb_queue_splice_tail_init(&data->drop_queue, &list);
> > > +	spin_unlock_irqrestore(&data->drop_queue.lock, flags);
> > > +
> > These functions are all executed in a per-cpu context.  While there's nothing
> > wrong with using a spinlock here, I think you can get away with just doing
> > local_irq_save and local_irq_restore.
> 
> Hi Neil,
> 
> Thanks a lot for reviewing. I might be missing something, but please
> note that this function is executed from a workqueue and therefore the
> CPU it is running on does not have to be the same CPU to which 'data'
> belongs to. If so, I'm not sure how I can avoid taking the spinlock, as
> otherwise two different CPUs can modify the list concurrently.
> 
Ah, my bad, I was under the impression that the schedule_work call for
that particular work queue was actually a call to schedule_work_on,
which would have affined it to a specific cpu.  That said, looking at
it, I think using schedule_work_on was my initial intent, as the work
queue is registered per cpu.  And converting it to schedule_work_on
would allow you to reduce the spin_lock to a faster local_irq_save.
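
For reference, the pattern in the quoted function (detach the whole list while holding the lock, then process the private copy with the lock dropped) looks roughly like this in userspace C. This is only an illustration with made-up names, using a pthread mutex in place of the spinlock; the per-cpu alternative with schedule_work_on has no direct userspace equivalent and is not shown.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct node {
	struct node *next;
	int value;
};

struct locked_list {
	pthread_mutex_t lock;
	struct node *head;
};

/* Producer side: add one node under the lock (order is not preserved). */
void locked_list_push(struct locked_list *l, struct node *n)
{
	assert(l != NULL && n != NULL);

	pthread_mutex_lock(&l->lock);
	n->next = l->head;
	l->head = n;
	pthread_mutex_unlock(&l->lock);
}

/* Detach everything in one step; the lock is held only for the swap. */
struct node *locked_list_splice(struct locked_list *l)
{
	struct node *priv;

	assert(l != NULL);

	pthread_mutex_lock(&l->lock);
	priv = l->head;
	l->head = NULL;
	pthread_mutex_unlock(&l->lock);
	return priv;
}

/* Consumer side: walk the private chain without holding the lock. */
int locked_list_drain_sum(struct locked_list *l)
{
	int sum = 0;

	for (struct node *n = locked_list_splice(l); n != NULL; n = n->next)
		sum += n->value;
	return sum;
}
```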

Otherwise, though, this looks really good to me.
Neil

> > 
> > Neil
> > 
> > > +	while ((skb = __skb_dequeue(&list)))
> > > +		net_dm_packet_report(skb);
> > > +}
> 

^ permalink raw reply

* [PATCH] sky2: Disable MSI on ASUS P6T
From: Takashi Iwai @ 2019-07-23 15:15 UTC (permalink / raw)
  To: netdev; +Cc: Mirko Lindner, Stephen Hemminger, Marcus Seyfarth,
	David S . Miller

The onboard sky2 NIC on ASUS P6T WS PRO doesn't work after PM resume
due to the infamous IRQ problem.  Disabling MSI works around it, so
let's add it to the blacklist.

Unfortunately the BIOS on the machine doesn't fill the standard
DMI_SYS_* entry, so we pick up DMI_BOARD_* entries instead.

BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1142496
Reported-and-tested-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 drivers/net/ethernet/marvell/sky2.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/marvell/sky2.c b/drivers/net/ethernet/marvell/sky2.c
index f518312ffe69..a01c75ede871 100644
--- a/drivers/net/ethernet/marvell/sky2.c
+++ b/drivers/net/ethernet/marvell/sky2.c
@@ -4924,6 +4924,13 @@ static const struct dmi_system_id msi_blacklist[] = {
 			DMI_MATCH(DMI_PRODUCT_NAME, "P5W DH Deluxe"),
 		},
 	},
+	{
+		.ident = "ASUS P6T",
+		.matches = {
+			DMI_MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
+			DMI_MATCH(DMI_BOARD_NAME, "P6T"),
+		},
+	},
 	{}
 };
 
-- 
2.16.4


^ permalink raw reply related
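
The blacklist entry in the patch above relies on DMI_MATCH doing a substring comparison, which is why the short board name "P6T" is enough to cover the "P6T WS PRO" named in the commit message. A minimal userspace sketch of that table lookup follows. The structure and function names are hypothetical, and the board strings in the usage are illustrative rather than read from real firmware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Strings a platform would report for its baseboard (illustrative). */
struct board_info {
	const char *board_vendor;
	const char *board_name;
};

struct quirk {
	const char *ident;
	const char *vendor_substr;
	const char *name_substr;
};

/* Table terminated by an all-NULL entry, like the driver's "{}". */
static const struct quirk msi_quirks[] = {
	{ "ASUS P6T", "ASUSTeK Computer INC.", "P6T" },
	{ NULL, NULL, NULL }
};

/* Return the first entry whose substrings all occur in the board strings. */
const struct quirk *quirk_lookup(const struct board_info *b)
{
	assert(b != NULL && b->board_vendor != NULL && b->board_name != NULL);

	for (const struct quirk *q = msi_quirks; q->ident != NULL; q++) {
		if (strstr(b->board_vendor, q->vendor_substr) &&
		    strstr(b->board_name, q->name_substr))
			return q;
	}
	return NULL;
}
```

A substring match is broad: any ASUS board whose name contains "P6T" would also hit this entry.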

* [PATCH net-next 3/3] arm64: dts: ls1028a: Enable eth port1 on the ls1028a QDS board
From: Claudiu Manoil @ 2019-07-23 15:15 UTC (permalink / raw)
  To: David S . Miller
  Cc: Rob Herring, Li Yang, alexandru.marginean, netdev, devicetree,
	linux-arm-kernel, linux-kernel
In-Reply-To: <1563894955-545-1-git-send-email-claudiu.manoil@nxp.com>

LS1028a has one Ethernet management interface. On the QDS board, the
MDIO signals are multiplexed to either the on-board AR8035 PHY device
or to 4 PCIe slots allowing for SGMII cards.
To enable the Ethernet ENETC Port 1, which can only be connected to an
RGMII PHY, the multiplexer needs to be configured to route the MDIO to
the AR8035 PHY.  The MDIO/MDC routing is controlled by bits 7:4 of FPGA
board config register 0x54, and value 0 selects the on-board RGMII PHY.
The FPGA board config registers are accessible on the i2c bus, at address
0x66.

The PF3 MDIO PCIe integrated endpoint device allows for centralized access
to the MDIO bus.  Add the corresponding devicetree node and set it to be
the MDIO bus parent.

Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
---
 .../boot/dts/freescale/fsl-ls1028a-qds.dts    | 40 +++++++++++++++++++
 .../arm64/boot/dts/freescale/fsl-ls1028a.dtsi |  6 +++
 2 files changed, 46 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts b/arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts
index de6ef39f3118..663c4b728c07 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts
@@ -85,6 +85,26 @@
 			system-clock-frequency = <25000000>;
 		};
 	};
+
+	mdio-mux {
+		compatible = "mdio-mux-multiplexer";
+		mux-controls = <&mux 0>;
+		mdio-parent-bus = <&enetc_mdio_pf3>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		/* on-board RGMII PHY */
+		mdio@0 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			reg = <0>;
+
+			qds_phy1: ethernet-phy@5 {
+				/* Atheros 8035 */
+				reg = <5>;
+			};
+		};
+	};
 };
 
 &duart0 {
@@ -164,6 +184,26 @@
 			};
 		};
 	};
+
+	fpga@66 {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		compatible = "fsl,ls1028aqds-fpga", "fsl,fpga-qixis-i2c",
+			     "simple-mfd";
+		reg = <0x66>;
+
+		mux: mux-controller {
+			compatible = "reg-mux";
+			#mux-control-cells = <1>;
+			mux-reg-masks = <0x54 0xf0>; /* 0: reg 0x54, bits 7:4 */
+		};
+	};
+
+};
+
+&enetc_port1 {
+	phy-handle = <&qds_phy1>;
+	phy-connection-type = "rgmii-id";
 };
 
 &sai1 {
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
index 7975519b4f56..de71153fda00 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
@@ -536,6 +536,12 @@
 				compatible = "fsl,enetc";
 				reg = <0x000100 0 0 0 0>;
 			};
+			enetc_mdio_pf3: mdio@0,3 {
+				compatible = "fsl,enetc-mdio";
+				reg = <0x000300 0 0 0 0>;
+				#address-cells = <1>;
+				#size-cells = <0>;
+			};
 			ethernet@0,4 {
 				compatible = "fsl,enetc-ptp";
 				reg = <0x000400 0 0 0 0>;
-- 
2.17.1


^ permalink raw reply related
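
The `mux-reg-masks = <0x54 0xf0>` property in the patch above describes a bit field (bits 7:4) inside FPGA register 0x54, and mux state 0 selects the on-board RGMII PHY. A minimal sketch of the read-modify-write such a field implies is shown below. The helper name is hypothetical and the code only models the arithmetic, not the actual reg-mux driver or the i2c access.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the new register value after writing 'state' into the field
 * selected by 'mask'. Bits outside the mask are preserved.
 */
uint8_t mux_apply(uint8_t reg, uint8_t mask, uint8_t state)
{
	unsigned int shift = 0;

	assert(mask != 0);

	/* The field starts at the lowest set bit of the mask. */
	while (((mask >> shift) & 1u) == 0)
		shift++;

	return (uint8_t)((reg & ~mask) | ((state << shift) & mask));
}
```

For mask 0xf0, state 0 clears bits 7:4 and leaves the low nibble of the register untouched.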

* [PATCH net-next 1/3] enetc: Add mdio bus driver for the PCIe MDIO endpoint
From: Claudiu Manoil @ 2019-07-23 15:15 UTC (permalink / raw)
  To: David S . Miller
  Cc: Rob Herring, Li Yang, alexandru.marginean, netdev, devicetree,
	linux-arm-kernel, linux-kernel
In-Reply-To: <1563894955-545-1-git-send-email-claudiu.manoil@nxp.com>

ENETC ports can manage the MDIO bus via a local register
interface.  However, there's also a centralized way
to manage the MDIO bus, via the MDIO PCIe endpoint
device integrated by the same root complex that also
integrates the ENETC ports (eth controllers).

Depending on board design and use case, centralized
access to MDIO may be better than using local ENETC
port registers, for instance on the LS1028A QDS board,
where MDIO muxing is required.  Also, the LS1028A on-chip
switch doesn't have a local MDIO register interface.

The current patch registers the above PCIe endpoint as a
separate MDIO bus and provides a driver for it by re-using
the code used for local MDIO access.  It also allows the
ENETC port PHYs to be managed by this driver if the local
"mdio" node is missing from the ENETC port node.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
---
 .../net/ethernet/freescale/enetc/enetc_mdio.c | 90 +++++++++++++++++++
 .../net/ethernet/freescale/enetc/enetc_pf.c   |  5 +-
 2 files changed, 94 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc_mdio.c b/drivers/net/ethernet/freescale/enetc/enetc_mdio.c
index 77b9cd10ba2b..efa8a29f463d 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_mdio.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_mdio.c
@@ -197,3 +197,93 @@ void enetc_mdio_remove(struct enetc_pf *pf)
 		mdiobus_free(pf->mdio);
 	}
 }
+
+#define ENETC_MDIO_DEV_ID	0xee01
+#define ENETC_MDIO_DEV_NAME	"FSL PCIe IE Central MDIO"
+#define ENETC_MDIO_BUS_NAME	ENETC_MDIO_DEV_NAME " Bus"
+#define ENETC_MDIO_DRV_NAME	ENETC_MDIO_DEV_NAME " driver"
+#define ENETC_MDIO_DRV_ID	"fsl_enetc_mdio"
+
+static int enetc_pci_mdio_probe(struct pci_dev *pdev,
+				const struct pci_device_id *ent)
+{
+	struct device *dev = &pdev->dev;
+	struct mii_bus *bus;
+	int err;
+
+	bus = mdiobus_alloc_size(sizeof(u32 *));
+	if (!bus)
+		return -ENOMEM;
+
+	bus->name = ENETC_MDIO_BUS_NAME;
+	bus->read = enetc_mdio_read;
+	bus->write = enetc_mdio_write;
+	bus->parent = dev;
+	snprintf(bus->id, MII_BUS_ID_SIZE, "%s", dev_name(dev));
+
+	pcie_flr(pdev);
+	err = pci_enable_device_mem(pdev);
+	if (err) {
+		dev_err(dev, "device enable failed\n");
+		return err;
+	}
+
+	err = pci_request_mem_regions(pdev, ENETC_MDIO_DRV_ID);
+	if (err) {
+		dev_err(dev, "pci_request_mem_regions failed\n");
+		goto err_pci_mem_reg;
+	}
+
+	bus->priv = pci_iomap_range(pdev, 0, ENETC_MDIO_REG_OFFSET, 0);
+	if (!bus->priv) {
+		err = -ENXIO;
+		dev_err(dev, "ioremap failed\n");
+		goto err_ioremap;
+	}
+
+	err = of_mdiobus_register(bus, dev->of_node);
+	if (err)
+		goto err_mdiobus_reg;
+
+	pci_set_drvdata(pdev, bus);
+
+	return 0;
+
+err_mdiobus_reg:
+	iounmap(bus->priv);
+err_ioremap:
+	pci_release_mem_regions(pdev);
+err_pci_mem_reg:
+	pci_disable_device(pdev);
+
+	return err;
+}
+
+static void enetc_pci_mdio_remove(struct pci_dev *pdev)
+{
+	struct mii_bus *bus = pci_get_drvdata(pdev);
+
+	mdiobus_unregister(bus);
+	iounmap(bus->priv);
+	mdiobus_free(bus);
+
+	pci_release_mem_regions(pdev);
+	pci_disable_device(pdev);
+}
+
+static const struct pci_device_id enetc_pci_mdio_id_table[] = {
+	{ PCI_DEVICE(PCI_VENDOR_ID_FREESCALE, ENETC_MDIO_DEV_ID) },
+	{ 0, } /* End of table. */
+};
+MODULE_DEVICE_TABLE(pci, enetc_pci_mdio_id_table);
+
+static struct pci_driver enetc_pci_mdio_driver = {
+	.name = ENETC_MDIO_DRV_ID,
+	.id_table = enetc_pci_mdio_id_table,
+	.probe = enetc_pci_mdio_probe,
+	.remove = enetc_pci_mdio_remove,
+};
+module_pci_driver(enetc_pci_mdio_driver);
+
+MODULE_DESCRIPTION(ENETC_MDIO_DRV_NAME);
+MODULE_LICENSE("Dual BSD/GPL");
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
index 258b3cb38a6f..7d6513ff8507 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
@@ -750,6 +750,7 @@ static int enetc_of_get_phy(struct enetc_ndev_priv *priv)
 {
 	struct enetc_pf *pf = enetc_si_priv(priv->si);
 	struct device_node *np = priv->dev->of_node;
+	struct device_node *mdio_np;
 	int err;
 
 	if (!np) {
@@ -773,7 +774,9 @@ static int enetc_of_get_phy(struct enetc_ndev_priv *priv)
 		priv->phy_node = of_node_get(np);
 	}
 
-	if (!of_phy_is_fixed_link(np)) {
+	mdio_np = of_get_child_by_name(np, "mdio");
+	if (mdio_np) {
+		of_node_put(mdio_np);
 		err = enetc_mdio_probe(pf);
 		if (err) {
 			of_node_put(priv->phy_node);
-- 
2.17.1


^ permalink raw reply related
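
The probe function in the patch above uses the usual goto-based unwinding, where each label undoes one step and falls through to the labels for earlier steps. As posted, the error paths do not appear to call mdiobus_free() on the bus allocated at the top of the function. The userspace sketch below models the same sequence with boolean flags and releases the bus as well; releasing it is an assumption about the intended behavior, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct probe_state {
	bool bus_allocated;
	bool device_enabled;
	bool regions_held;
	bool mapped;
};

enum probe_step {
	STEP_NONE, /* no injected failure */
	STEP_ENABLE,
	STEP_REGIONS,
	STEP_MAP,
	STEP_REGISTER
};

/* Run the probe sequence, failing at 'fail_at' to exercise unwinding. */
int probe_sketch(struct probe_state *s, enum probe_step fail_at)
{
	assert(s != NULL);

	s->bus_allocated = true;
	if (fail_at == STEP_ENABLE)
		goto err_enable;
	s->device_enabled = true;
	if (fail_at == STEP_REGIONS)
		goto err_regions;
	s->regions_held = true;
	if (fail_at == STEP_MAP)
		goto err_map;
	s->mapped = true;
	if (fail_at == STEP_REGISTER)
		goto err_register;
	return 0;

err_register:
	s->mapped = false;         /* iounmap */
err_map:
	s->regions_held = false;   /* release regions */
err_regions:
	s->device_enabled = false; /* disable device */
err_enable:
	s->bus_allocated = false;  /* free the bus (assumed to be desired) */
	return -1;
}
```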

* [PATCH net-next 2/3] dt-bindings: net: fsl: enetc: Add bindings for the central MDIO PCIe endpoint
From: Claudiu Manoil @ 2019-07-23 15:15 UTC (permalink / raw)
  To: David S . Miller
  Cc: Rob Herring, Li Yang, alexandru.marginean, netdev, devicetree,
	linux-arm-kernel, linux-kernel
In-Reply-To: <1563894955-545-1-git-send-email-claudiu.manoil@nxp.com>

The on-chip PCIe root complex that integrates the ENETC ethernet
controllers also integrates a PCIe endpoint for the MDIO controller,
providing centralized control of the ENETC mdio bus.
Add bindings for this "central" MDIO Integrated PCIe Endpoint.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
---
 .../devicetree/bindings/net/fsl-enetc.txt     | 42 +++++++++++++++++--
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/fsl-enetc.txt b/Documentation/devicetree/bindings/net/fsl-enetc.txt
index 25fc687419db..c090f6df7a39 100644
--- a/Documentation/devicetree/bindings/net/fsl-enetc.txt
+++ b/Documentation/devicetree/bindings/net/fsl-enetc.txt
@@ -11,7 +11,9 @@ Required properties:
 		  to parent node bindings.
 - compatible	: Should be "fsl,enetc".
 
-1) The ENETC external port is connected to a MDIO configurable phy:
+1. The ENETC external port is connected to a MDIO configurable phy
+
+1.1. Using the local ENETC Port MDIO interface
 
 In this case, the ENETC node should include a "mdio" sub-node
 that in turn should contain the "ethernet-phy" node describing the
@@ -47,8 +49,42 @@ Example:
 		};
 	};
 
-2) The ENETC port is an internal port or has a fixed-link external
-connection:
+1.2. Using the central MDIO PCIe endpoint device
+
+In this case, the mdio node should be defined as another PCIe
+endpoint node, at the same level with the ENETC port nodes.
+
+Required properties:
+
+- reg		: Specifies PCIe Device Number and Function
+		  Number of the ENETC endpoint device, according
+		  to parent node bindings.
+- compatible	: Should be "fsl,enetc-mdio".
+
+The remaining required mdio bus properties are standard; their bindings are
+already defined in Documentation/devicetree/bindings/net/mdio.txt.
+
+Example:
+
+	ethernet@0,0 {
+		compatible = "fsl,enetc";
+		reg = <0x000000 0 0 0 0>;
+		phy-handle = <&sgmii_phy0>;
+		phy-connection-type = "sgmii";
+	};
+
+	mdio@0,3 {
+		compatible = "fsl,enetc-mdio";
+		reg = <0x000300 0 0 0 0>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+		sgmii_phy0: ethernet-phy@2 {
+			reg = <0x2>;
+		};
+	};
+
+2. The ENETC port is an internal port or has a fixed-link external
+connection
 
 In this case, the ENETC port node defines a fixed link connection,
 as specified by Documentation/devicetree/bindings/net/fixed-link.txt.
-- 
2.17.1



* [PATCH net-next 0/3] enetc: Add mdio bus driver for the PCIe MDIO endpoint
From: Claudiu Manoil @ 2019-07-23 15:15 UTC (permalink / raw)
  To: David S . Miller
  Cc: Rob Herring, Li Yang, alexandru.marginean, netdev, devicetree,
	linux-arm-kernel, linux-kernel

The first patch registers the PCIe endpoint device containing
the MDIO registers as a standalone MDIO bus driver, providing
an alternative way to control the MDIO bus.  The same code used
by the ENETC ports (eth controllers) to manage MDIO via local
registers applies and is reused.

Bindings are provided for the new MDIO node, similar to the
ENETC port node bindings.

The last patch enables ENETC port 1 and its RGMII PHY on the
LS1028A QDS board, where the MDIO muxing configuration relies
on the MDIO support provided in the first patch.


Claudiu Manoil (3):
  enetc: Add mdio bus driver for the PCIe MDIO endpoint
  dt-bindings: net: fsl: enetc: Add bindings for the central MDIO PCIe
    endpoint
  arm64: dts: ls1028a: Enable eth port1 on the ls1028a QDS board

 .../devicetree/bindings/net/fsl-enetc.txt     | 42 ++++++++-
 .../boot/dts/freescale/fsl-ls1028a-qds.dts    | 40 +++++++++
 .../arm64/boot/dts/freescale/fsl-ls1028a.dtsi |  6 ++
 .../net/ethernet/freescale/enetc/enetc_mdio.c | 90 +++++++++++++++++++
 .../net/ethernet/freescale/enetc/enetc_pf.c   |  5 +-
 5 files changed, 179 insertions(+), 4 deletions(-)

-- 
2.17.1



* [RFC PATCH 0/2] convert gianfar to phylink
From: Arseny Solokha @ 2019-07-23 15:17 UTC (permalink / raw)
  To: Claudiu Manoil, Ioana Ciornei, Russell King, Andrew Lunn
  Cc: netdev, Arseny Solokha

The first patch in the series (almost) converts gianfar to phylink API. The
incentive behind this effort was to get proper support for 1000Base-X and
SGMII SFP modules.

Some uses of the older phylib remain, as the SerDes has to be configured
and its parameters queried via a TBI interface, and I've failed to find a
reasonably easy way to do that with phylink without much surgery. That is
the first reason for the RFC. However, the older API is used only for two
special cases of underlying hardware management and is not involved in
link and SFP management directly.

The conversion was tested with various 1000Base-X connected optical modules
and SGMII-connected copper ones.

The second patch deals with an issue in phylink proper which only
manifests when bringing up or shutting down a network interface with an
SGMII SFP module connected: phy_start() or phy_stop() ends up being called
twice in a row for such modules. It doesn't look like a proper fix to me,
though, hence the second reason for the RFC.
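
To make the symptom concrete, here is a minimal userspace sketch of a
start/stop pair that tolerates being called twice in a row. It is only an
illustration of the problem described above, not the change made in the
second patch (whose diff is not reproduced here); struct model_phy and the
model_phy_*_once() helpers are hypothetical names and do not exist in phylib
or phylink.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a PHY: tracks state and how often work was done. */
struct model_phy {
	bool started;
	int start_calls;	/* times the underlying start actually ran */
	int stop_calls;		/* times the underlying stop actually ran */
};

/* Start the PHY unless it is already running; a repeated call is a no-op. */
static void model_phy_start_once(struct model_phy *phy)
{
	assert(phy != NULL);

	if (phy->started)
		return;

	phy->started = true;
	phy->start_calls++;
}

/* Stop the PHY unless it is already stopped; a repeated call is a no-op. */
static void model_phy_stop_once(struct model_phy *phy)
{
	assert(phy != NULL);

	if (!phy->started)
		return;

	phy->started = false;
	phy->stop_calls++;
}
```

With such a guard, bringing the interface up and down would run the
underlying start and stop exactly once each, regardless of how many call
paths reach them.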

Arseny Solokha (2):
  gianfar: convert to phylink
  net: phylink: don't start and stop SGMII PHYs in SFP modules twice

 drivers/net/ethernet/freescale/Kconfig        |   2 +-
 drivers/net/ethernet/freescale/gianfar.c      | 409 +++++++++---------
 drivers/net/ethernet/freescale/gianfar.h      |  26 +-
 .../net/ethernet/freescale/gianfar_ethtool.c  |  79 ++--
 drivers/net/phy/phylink.c                     |   6 +-
 5 files changed, 254 insertions(+), 268 deletions(-)

-- 
2.22.0

