Netdev List
* Re: [RFC PATCH v2] net: phy: Added device tree binding for dev-addr and dev-addr code check-up
From: Andrew Lunn @ 2018-03-23 15:44 UTC (permalink / raw)
  To: Vicentiu Galanopulo
  Cc: netdev, linux-kernel, robh+dt, mark.rutland, davem, marcel,
	devicetree, madalin.bucur, alexandru.marginean
In-Reply-To: <20180323150522.9603-1-vicentiu.galanopulo@nxp.com>

> --- a/drivers/of/of_mdio.c
> +++ b/drivers/of/of_mdio.c
> @@ -24,6 +24,8 @@
>  
>  #define DEFAULT_GPIO_RESET_DELAY	10	/* in microseconds */
>  
> +struct phy_c45_device_ids mdio_c45_ids = {0};

You do know that Linux is multi-threaded. It could be probing two MDIO
busses at once.

	Andrew
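
As a minimal sketch of one way around the shared global (the struct and
function names below are hypothetical, not taken from the patch or from
the kernel), the C45 id state could live in a context owned by each bus
being probed, so that two concurrent probes never write to the same
memory:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-in for struct phy_c45_device_ids; not the kernel type. */
struct c45_ids {
	uint32_t devices_in_package;
	uint32_t device_ids[32];
	uint32_t devices_addrs[32];
};

/* Hypothetical per-bus probe context: each bus owns its own copy. */
struct bus_probe_ctx {
	int bus_id;
	struct c45_ids ids;
};

void bus_probe_ctx_init(struct bus_probe_ctx *ctx, int bus_id)
{
	assert(ctx);
	memset(ctx, 0, sizeof(*ctx));
	ctx->bus_id = bus_id;
}

/* Record dev_addr for MMD slot mmd in this bus's context only. */
void bus_probe_set_dev_addr(struct bus_probe_ctx *ctx, unsigned int mmd,
			    uint32_t dev_addr)
{
	assert(ctx && mmd < 32);
	ctx->ids.devices_in_package |= UINT32_C(1) << mmd;
	ctx->ids.devices_addrs[mmd] = dev_addr;
}
```

With this shape, recording a dev-addr while probing one bus cannot leak
into the ids seen by a probe of another bus. In the kernel the state
might instead be passed down the call chain or attached to the bus; that
choice is outside what this sketch shows.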


* [GIT] 'net' merged into 'net-next'
From: David Miller @ 2018-03-23 15:40 UTC (permalink / raw)
  To: netdev; +Cc: jgg, dledford, idosch, dsahern, sd


This merge was a little bit more hectic than usual.

But thankfully, I had some sample conflict resolutions to work
with, in particular for the mlx5 infiniband changes which were
the most difficult to resolve.

Please double check my work and provide any fixup patches if
necessary.

Thank you.


* RE: [PATCH net-next,2/2] hv_netvsc: Add range checking for rx packet offset and length
From: Haiyang Zhang @ 2018-03-23 15:25 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Haiyang Zhang
  Cc: davem@davemloft.net, netdev@vger.kernel.org, KY Srinivasan,
	Stephen Hemminger, olaf@aepfle.de, devel@linuxdriverproject.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <87sh8q4y9s.fsf@vitty.brq.redhat.com>



> -----Original Message-----
> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> Sent: Friday, March 23, 2018 11:17 AM
> To: Haiyang Zhang <haiyangz@linuxonhyperv.com>
> Cc: davem@davemloft.net; netdev@vger.kernel.org; Haiyang Zhang
> <haiyangz@microsoft.com>; KY Srinivasan <kys@microsoft.com>; Stephen
> Hemminger <sthemmin@microsoft.com>; olaf@aepfle.de;
> devel@linuxdriverproject.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH net-next,2/2] hv_netvsc: Add range checking for rx packet
> offset and length
> 
> Haiyang Zhang <haiyangz@linuxonhyperv.com> writes:
> 
> > From: Haiyang Zhang <haiyangz@microsoft.com>
> >
> > This patch adds range checking for rx packet offset and length.
> > It may only happen if there is a host side bug.
> >
> > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> > ---
> >  drivers/net/hyperv/hyperv_net.h |  1 +
> >  drivers/net/hyperv/netvsc.c     | 17 +++++++++++++++--
> >  2 files changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
> > index 0db3bd1ea06f..49c05ac894e5 100644
> > --- a/drivers/net/hyperv/hyperv_net.h
> > +++ b/drivers/net/hyperv/hyperv_net.h
> > @@ -793,6 +793,7 @@ struct netvsc_device {
> >
> >  	/* Receive buffer allocated by us but manages by NetVSP */
> >  	void *recv_buf;
> > +	u32 recv_buf_size; /* allocated bytes */
> >  	u32 recv_buf_gpadl_handle;
> >  	u32 recv_section_cnt;
> >  	u32 recv_section_size;
> > diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
> > index 1ddb2c39b6e4..a6700d65f206 100644
> > --- a/drivers/net/hyperv/netvsc.c
> > +++ b/drivers/net/hyperv/netvsc.c
> > @@ -289,6 +289,8 @@ static int netvsc_init_buf(struct hv_device *device,
> >  		goto cleanup;
> >  	}
> >
> > +	net_device->recv_buf_size = buf_size;
> > +
> >  	/*
> >  	 * Establish the gpadl handle for this buffer on this
> >  	 * channel.  Note: This call uses the vmbus connection rather
> > @@ -1095,11 +1097,22 @@ static int netvsc_receive(struct net_device *ndev,
> >
> >  	/* Each range represents 1 RNDIS pkt that contains 1 ethernet frame */
> >  	for (i = 0; i < count; i++) {
> > -		void *data = recv_buf
> > -			+ vmxferpage_packet->ranges[i].byte_offset;
> > +		u32 offset = vmxferpage_packet->ranges[i].byte_offset;
> >  		u32 buflen = vmxferpage_packet->ranges[i].byte_count;
> > +		void *data;
> >  		int ret;
> >
> > +		if (unlikely(offset + buflen > net_device->recv_buf_size)) {
> > +			status = NVSP_STAT_FAIL;
> > +			netif_err(net_device_ctx, rx_err, ndev,
> > +				  "Packet offset:%u + len:%u too big\n",
> > +				  offset, buflen);
> 
> This shouldn't happen, of course, but I'd rather ratelimit this error or even use
> something like netdev_WARN_ONCE().

Actually, I thought about ratelimiting, but this range check is only there to
catch a host-side bug. It should not happen.
If it does happen, the VM should not be used anymore, and we need to debug
the host. Similarly, other checks of this kind in the same function are not
ratelimited:

        if (unlikely(nvsp->hdr.msg_type != NVSP_MSG1_TYPE_SEND_RNDIS_PKT)) {
                netif_err(net_device_ctx, rx_err, ndev,
                          "Unknown nvsp packet type received %u\n",
                          nvsp->hdr.msg_type);

Thanks,
- Haiyang
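
For readers less familiar with the two options under discussion, here is
a rough userspace sketch of the difference between reporting once and
reporting at a limited rate. The types and helpers are hypothetical
stand-ins, not the kernel's netdev_WARN_ONCE() or ratelimit code:

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical once-only gate: true on the first call, false afterwards. */
struct once_flag {
	bool fired;
};

bool report_once(struct once_flag *f)
{
	if (f->fired)
		return false;
	f->fired = true;
	return true;
}

/* Hypothetical limiter: at most `burst` messages per `interval` seconds. */
struct ratelimit {
	time_t window_start;
	int interval;
	int burst;
	int printed;
};

bool ratelimit_ok(struct ratelimit *rl, time_t now)
{
	/* Start a new window once the previous one has expired. */
	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;
		rl->printed = 0;
	}
	if (rl->printed >= rl->burst)
		return false;
	rl->printed++;
	return true;
}
```

The once-only gate keeps a single record of the event, while the limiter
keeps reporting for as long as the condition persists but bounds the log
volume.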


* Re: [patch net-next RFC 00/12] devlink: introduce port flavours and common phys_port_name generation
From: Andrew Lunn @ 2018-03-23 15:24 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, idosch, jakub.kicinski, mlxsw, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, simon.horman,
	pieter.jansenvanvuuren, john.hurley, dirk.vandermerwe,
	alexander.h.duyck, ogerlitz, dsahern, vijaya.guvva,
	satananda.burla, raghu.vatsavayi, felix.manlunas, gospo,
	sathya.perla, vasundhara-v.volam, tariqt, eranbe,
	jeffrey.t.kirsher
In-Reply-To: <20180323145935.GC2125@nanopsycho>

On Fri, Mar 23, 2018 at 03:59:35PM +0100, Jiri Pirko wrote:
> Fri, Mar 23, 2018 at 02:43:57PM CET, andrew@lunn.ch wrote:
> >> I tested this for mlxsw and nfp. I have no way to test this on DSA hw,
> >> I would really appreciate it if the DSA guys could test this.
> >
> >Hi Jiri
> >
> >With the missing break added, i get:
> >
> >root@zii-devel-b:~# ./iproute2/devlink/devlink port 
> >mdio_bus/0.1:00/0: type eth netdev lan0 flavour physical number 0
> >mdio_bus/0.1:00/1: type eth netdev lan1 flavour physical number 1
> >mdio_bus/0.1:00/2: type eth netdev lan2 flavour physical number 2
> >mdio_bus/0.1:00/3: type notset
> >mdio_bus/0.1:00/4: type notset
> >mdio_bus/0.1:00/5: type notset flavour dsa number 5
> >mdio_bus/0.1:00/6: type notset flavour cpu number 6
> >mdio_bus/0.2:00/0: type eth netdev lan3 flavour physical number 0
> >mdio_bus/0.2:00/1: type eth netdev lan4 flavour physical number 1
> >mdio_bus/0.2:00/2: type eth netdev lan5 flavour physical number 2
> >mdio_bus/0.2:00/3: type notset
> >mdio_bus/0.2:00/4: type notset
> >mdio_bus/0.2:00/5: type notset flavour dsa number 5
> >mdio_bus/0.2:00/6: type notset flavour dsa number 6
> >mdio_bus/0.4:00/0: type eth netdev lan6 flavour physical number 0
> >mdio_bus/0.4:00/1: type eth netdev lan7 flavour physical number 1
> >mdio_bus/0.4:00/2: type eth netdev lan8 flavour physical number 2
> >mdio_bus/0.4:00/3: type eth netdev optical3 flavour physical number 3
> >mdio_bus/0.4:00/4: type eth netdev optical4 flavour physical number 4
> >mdio_bus/0.4:00/5: type notset
> >mdio_bus/0.4:00/6: type notset
> >mdio_bus/0.4:00/7: type notset
> >mdio_bus/0.4:00/8: type notset
> >mdio_bus/0.4:00/9: type notset flavour dsa number 9

> That is basically front panel number for physical ports.

You cannot make that assumption. As you can see here, we have 3 ports
with the number 0.

Look at clearfog, armada-388-clearfog.dts: port 0=lan5, port 1=lan4,
port 2=lan3, port 3=lan2, port 4=lan1, port 5=cpu, port 6=lan6.

The hardware and mechanical engineers are free to wire switch ports to
the front panel however they want. That is why we put the netdev name
in the device tree.

    Andrew
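
To make the point concrete, here is a small sketch of the clearfog
wiring as a lookup table. The labels are copied from the list in this
mail and have not been checked against the .dts; the function name is
hypothetical:

```c
#include <stddef.h>

/* Switch port index -> front panel label, as listed in the mail above. */
static const char *const clearfog_labels[] = {
	"lan5", "lan4", "lan3", "lan2", "lan1", "cpu", "lan6",
};

/* Return the label wired to a switch port, or NULL if out of range. */
const char *clearfog_port_label(unsigned int port)
{
	size_t n = sizeof(clearfog_labels) / sizeof(clearfog_labels[0]);

	return port < n ? clearfog_labels[port] : NULL;
}
```

Nothing in the port index predicts the label: port 0 is lan5 and port 6
is lan6, so the mapping has to come from board data.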


* Re: [PATCH net-next, 2/2] hv_netvsc: Add range checking for rx packet offset and length
From: Vitaly Kuznetsov @ 2018-03-23 15:17 UTC (permalink / raw)
  To: Haiyang Zhang
  Cc: olaf, sthemmin, netdev, haiyangz, linux-kernel, devel, davem
In-Reply-To: <20180322190114.25596-3-haiyangz@linuxonhyperv.com>

Haiyang Zhang <haiyangz@linuxonhyperv.com> writes:

> From: Haiyang Zhang <haiyangz@microsoft.com>
>
> This patch adds range checking for rx packet offset and length.
> It may only happen if there is a host side bug.
>
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> ---
>  drivers/net/hyperv/hyperv_net.h |  1 +
>  drivers/net/hyperv/netvsc.c     | 17 +++++++++++++++--
>  2 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
> index 0db3bd1ea06f..49c05ac894e5 100644
> --- a/drivers/net/hyperv/hyperv_net.h
> +++ b/drivers/net/hyperv/hyperv_net.h
> @@ -793,6 +793,7 @@ struct netvsc_device {
>
>  	/* Receive buffer allocated by us but manages by NetVSP */
>  	void *recv_buf;
> +	u32 recv_buf_size; /* allocated bytes */
>  	u32 recv_buf_gpadl_handle;
>  	u32 recv_section_cnt;
>  	u32 recv_section_size;
> diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
> index 1ddb2c39b6e4..a6700d65f206 100644
> --- a/drivers/net/hyperv/netvsc.c
> +++ b/drivers/net/hyperv/netvsc.c
> @@ -289,6 +289,8 @@ static int netvsc_init_buf(struct hv_device *device,
>  		goto cleanup;
>  	}
>
> +	net_device->recv_buf_size = buf_size;
> +
>  	/*
>  	 * Establish the gpadl handle for this buffer on this
>  	 * channel.  Note: This call uses the vmbus connection rather
> @@ -1095,11 +1097,22 @@ static int netvsc_receive(struct net_device *ndev,
>
>  	/* Each range represents 1 RNDIS pkt that contains 1 ethernet frame */
>  	for (i = 0; i < count; i++) {
> -		void *data = recv_buf
> -			+ vmxferpage_packet->ranges[i].byte_offset;
> +		u32 offset = vmxferpage_packet->ranges[i].byte_offset;
>  		u32 buflen = vmxferpage_packet->ranges[i].byte_count;
> +		void *data;
>  		int ret;
>
> +		if (unlikely(offset + buflen > net_device->recv_buf_size)) {
> +			status = NVSP_STAT_FAIL;
> +			netif_err(net_device_ctx, rx_err, ndev,
> +				  "Packet offset:%u + len:%u too big\n",
> +				  offset, buflen);

This shouldn't happen, of course, but I'd rather ratelimit this error or
even use something like netdev_WARN_ONCE().

> +
> +			continue;
> +		}
> +
> +		data = recv_buf + offset;
> +
>  		trace_rndis_recv(ndev, q_idx, data);
>
>  		/* Pass it to the upper layer */

-- 
  Vitaly
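
A hedged aside on the check itself, not a review of the patch: offset
and buflen are both u32, so the sum offset + buflen can wrap around
before it is compared against the buffer size. A sketch of a comparison
that cannot wrap, in plain C with hypothetical helper names:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when [offset, offset + len) lies inside a buffer of
 * buf_size bytes. No intermediate value can wrap.
 */
bool range_in_buffer(uint32_t offset, uint32_t len, uint32_t buf_size)
{
	if (offset > buf_size)
		return false;
	return len <= buf_size - offset;
}

/* Same comparison as in the hunk above, for contrast: the sum may wrap. */
bool range_in_buffer_naive(uint32_t offset, uint32_t len, uint32_t buf_size)
{
	return !((uint32_t)(offset + len) > buf_size);
}
```

For example, offset 0xFFFFFFF0 with length 0x20 sums to 0x10 in 32-bit
arithmetic, so the naive form accepts it for a 4096-byte buffer while
the subtraction form rejects it.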


* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
From: David Ahern @ 2018-03-23 15:13 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski,
	David Ahern
In-Reply-To: <20180323150516.GE2125@nanopsycho>

On 3/23/18 9:05 AM, Jiri Pirko wrote:
> Fri, Mar 23, 2018 at 04:03:40PM CET, dsa@cumulusnetworks.com wrote:
>> On 3/23/18 9:01 AM, Jiri Pirko wrote:
>>> Fri, Mar 23, 2018 at 03:31:02PM CET, dsa@cumulusnetworks.com wrote:
>>>> On 3/23/18 12:50 AM, Jiri Pirko wrote:
>>>>>> +void nsim_devlink_setup(struct netdevsim *ns)
>>>>>> +{
>>>>>> +	struct net *net = dev_net(ns->netdev);
>>>>>> +	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>>>>>> +	struct devlink *devlink;
>>>>>> +	int err = -ENOMEM;
>>>>>> +
>>>>>> +	/* only one device per namespace controls devlink */
>>>>>> +	if (!*reg_devlink) {
>>>>>> +		ns->devlink = NULL;
>>>>>> +		return;
>>>>>> +	}
>>>>>> +
>>>>>> +	devlink = devlink_alloc(&nsim_devlink_ops, 0);
>>>>>> +	if (!devlink)
>>>>>> +		return;
>>>>>> +
>>>>>> +	devlink_net_set(devlink, net);
>>>>>> +	err = devlink_register(devlink, &ns->dev);
>>>>>
>>>>> This reg_devlink construct looks odd. Why don't you leave the devlink
>>>>> instance in init_ns?
>>>>
>>>> It is a per-network namespace resource controller. Since struct devlink
>>>
>>> Wait a second. What do you mean by "per-network namespace"? Devlink
>>> instance is always associated with one physical device. Like an ASIC.
>>>
>>>
>>>> has a net entry, the simplest design is to put it into the namespace of
>>>> the controller. Without it, controlling resource sizes in namespace
>>>> 'foobar' has to be done from init_net, which is just wrong.
>>
>> you need to look at how netdevsim creates a device per netdevice.
> 
> That means one devlink instance for each netdevsim device, doesn't it?
> 

yes.


* Re: [PATCH] of_net: Implement of_get_nvmem_mac_address helper
From: Andrew Lunn @ 2018-03-23 15:11 UTC (permalink / raw)
  To: Mike Looijmans
  Cc: netdev, linux-kernel, devicetree, f.fainelli, robh+dt,
	frowand.list
In-Reply-To: <1521815074-30424-1-git-send-email-mike.looijmans@topic.nl>

On Fri, Mar 23, 2018 at 03:24:34PM +0100, Mike Looijmans wrote:
> It's common practice to store MAC addresses for network interfaces in
> nvmem devices. However, the kernel lacks the code to actually read them,
> so this patch adds of_get_nvmem_mac_address() for drivers to obtain the
> address from an nvmem cell provider.
> 
> This is particularly useful on devices where the ethernet interface cannot
> be configured by the bootloader, for example because it's in an FPGA.
> 
> Tested by adapting the cadence macb driver to call this instead of
> of_get_mac_address().

Hi Mike

Please can you document the device tree binding. I assume you are
adding nvmem-cells and nvmem-cell-names properties to the Ethernet
node in the device tree.

> +/**
> + * Search the device tree for a MAC address, by calling of_get_mac_address
> + * and if that doesn't provide an address, fetch it from an nvmem provider
> + * using the name 'mac-address'.
> + * On success, copies the new address into memory pointed to by addr and
> + * returns 0. Returns a negative error code otherwise.
> + * @dev:	Pointer to the device containing the device_node
> + * @addr:	Pointer to receive the MAC address using ether_addr_copy()
> + */
> +int of_get_nvmem_mac_address(struct device *dev, char *addr)
> +{
> +	const char *mac;
> +	struct nvmem_cell *cell;
> +	size_t len;
> +	int ret;
> +
> +	mac = of_get_mac_address(dev->of_node);
> +	if (mac) {
> +		ether_addr_copy(addr, mac);
> +		return 0;
> +	}

Is there a need to add a new API? Could of_get_mac_address() be
extended to look in NVMEM? The MAC driver does not care where the
address comes from. It is just saying: using OF, get me a MAC address.
One API seems sufficient, and it would mean you don't need to change
the MAC drivers.

     Andrew
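
A rough userspace sketch of the single-API idea: one lookup function
walks an ordered list of sources and the caller never learns which one
answered. Every name here is hypothetical, and the two stub sources only
simulate a missing property and an NVMEM cell:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical source callback: fills addr and returns true on success. */
typedef bool (*mac_source_fn)(unsigned char addr[ETH_ALEN]);

/* Stub: simulates a node with no mac-address property. */
bool mac_from_property_stub(unsigned char addr[ETH_ALEN])
{
	(void)addr;
	return false;
}

/* Stub: simulates an NVMEM cell holding a fixed, locally administered MAC. */
bool mac_from_nvmem_stub(unsigned char addr[ETH_ALEN])
{
	static const unsigned char cell[ETH_ALEN] = {
		0x02, 0x00, 0x00, 0x00, 0x00, 0x01,
	};

	memcpy(addr, cell, ETH_ALEN);
	return true;
}

/* Try each source in order; the first one that yields an address wins. */
int get_mac_address(const mac_source_fn *sources, size_t n,
		    unsigned char addr[ETH_ALEN])
{
	size_t i;

	for (i = 0; i < n; i++)
		if (sources[i] && sources[i](addr))
			return 0;
	return -1;
}
```

Adding NVMEM as one more entry in the list leaves every caller
unchanged, which is the property the suggestion above is after.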


* [PATCH net-next] devlink: Remove top_hierarchy arg for DEVLINK disabled path
From: David Ahern @ 2018-03-23 15:09 UTC (permalink / raw)
  To: netdev; +Cc: David Ahern

Earlier change missed the path where CONFIG_NET_DEVLINK is disabled.
Thanks to Jiri for spotting.

Fixes: 145307460ba9 ("devlink: Remove top_hierarchy arg to devlink_resource_register")
Signed-off-by: David Ahern <dsahern@gmail.com>
---
 include/net/devlink.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index d5b707375e48..e21d8cadd480 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -559,7 +559,6 @@ devlink_dpipe_match_put(struct sk_buff *skb,
 static inline int
 devlink_resource_register(struct devlink *devlink,
 			  const char *resource_name,
-			  bool top_hierarchy,
 			  u64 resource_size,
 			  u64 resource_id,
 			  u64 parent_resource_id,
-- 
2.11.0
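
The class of bug fixed here can be sketched generically: a header offers
a real prototype when a feature is built in and a static inline stub
when it is not, and the two argument lists have to stay in step. The
config symbol and function below are hypothetical, not the devlink API:

```c
#include <stdint.h>

/* CONFIG_EXAMPLE_FEATURE is deliberately left undefined in this sketch. */
#ifdef CONFIG_EXAMPLE_FEATURE
int example_resource_register(const char *name, uint64_t size, uint64_t id);
#else
/*
 * Stub for the disabled case. Its parameter list must match the real
 * prototype above, or callers only fail to build with the feature off.
 */
static inline int example_resource_register(const char *name, uint64_t size,
					    uint64_t id)
{
	(void)name;
	(void)size;
	(void)id;
	return 0;
}
#endif
```

Because most builds enable the feature, a stale stub tends to go
unnoticed until someone compiles the disabled configuration.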


* Re: [PATCH net-next] ibmvnic: Potential NULL dereference in clean_one_tx_pool()
From: Thomas Falcon @ 2018-03-23 15:06 UTC (permalink / raw)
  To: Dan Carpenter, Benjamin Herrenschmidt
  Cc: netdev, kernel-janitors, John Allen, Paul Mackerras, linuxppc-dev
In-Reply-To: <20180323113615.GA28518@mwanda>

On 03/23/2018 06:36 AM, Dan Carpenter wrote:
> There is an && vs || typo here, which potentially leads to a NULL
> dereference.

Thanks for catching that!
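
For anyone skimming, a minimal standalone sketch of why the operator
matters. The struct definitions are simplified stand-ins for the driver
types, and the helper name is hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the driver structures. */
struct tx_buff {
	int placeholder;
};

struct tx_pool {
	struct tx_buff *tx_buff;
	int num_buffers;
};

/*
 * True when there is nothing to clean. With ||, the second test only
 * runs when pool is non-NULL. With &&, the second test runs exactly
 * when pool IS NULL, which dereferences the NULL pointer.
 */
bool pool_has_nothing_to_clean(const struct tx_pool *pool)
{
	return !pool || !pool->tx_buff;
}
```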

>
> Fixes: e9e1e97884b7 ("ibmvnic: Update TX pool cleaning routine")
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
>
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 5632c030811b..0389a7a52152 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -1135,7 +1135,7 @@ static void clean_one_tx_pool(struct ibmvnic_adapter *adapter,
>  	u64 tx_entries;
>  	int i;
>
> -	if (!tx_pool && !tx_pool->tx_buff)
> +	if (!tx_pool || !tx_pool->tx_buff)
>  		return;
>
>  	tx_entries = tx_pool->num_buffers;
>


* [RFC PATCH v2] net: phy: Added device tree binding for dev-addr and dev-addr code check-up
From: Vicentiu Galanopulo @ 2018-03-23 15:05 UTC (permalink / raw)
  To: netdev, linux-kernel, robh+dt, mark.rutland, davem, marcel,
	devicetree
  Cc: madalin.bucur, alexandru.marginean

The reason for this patch is that the Inphi PHY has a
vendor-specific address space for accessing the
C45 MDIO registers, starting from 0x1e.

of_mdiobus_register() now searches each PHY node for the dev-addr
property. If the property is found, of_mdiobus_register_static_phy()
is called. This is a wrapper around of_mdiobus_register_phy() which
finds the device in package based on dev-addr and fills in
devices_addrs, a new field added to phy_c45_device_ids. The new
field stores the dev-addr property at the same index at which the
device in package was found.

To make dev-addr available in get_phy_c45_ids(), mdio_c45_ids is
passed from of_mdio.c to phy_device.c as an external variable.
get_phy_device() copies mdio_c45_ids over the local c45_ids (which
is empty). After the copy, c45_ids also contains the static device
found from dev-addr. With dev-addr stored in devices_addrs,
get_phy_c45_ids() can extract it while probing the identifiers and
probe at that address whenever devices_addrs[current_identifier]
is not 0.
This way, changing the kernel API is avoided completely.
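
The address selection described above can be sketched in standalone C.
The two constants are written from memory of the kernel headers and
should be treated as assumptions; the helper names are hypothetical:

```c
#include <stdint.h>

/* Assumed values, mirroring MII_ADDR_C45 and MII_PHYSID1 in the kernel. */
#define MII_ADDR_C45	(UINT32_C(1) << 30)
#define MII_PHYSID1	0x02

/*
 * Pick the MMD address to probe for identifier slot i: the dev-addr
 * override if one was recorded (non-zero), otherwise the slot index.
 */
uint32_t c45_probe_dev_addr(const uint32_t *devices_addrs, unsigned int i)
{
	return devices_addrs[i] ? devices_addrs[i] : i;
}

/* Compose a C45 register address from an MMD address and a register. */
uint32_t c45_reg_addr(uint32_t dev_addr, uint32_t reg)
{
	return MII_ADDR_C45 | (dev_addr << 16) | reg;
}
```

One consequence of using 0 as the "not set" marker is that a dev-addr of
0 can never be expressed as an override.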

In addition, num_ids in get_phy_c45_ids() has the value 8
(ARRAY_SIZE(c45_ids->device_ids)), but the u32 *devs bitfield
can flag 32 devices. If a device is stored in *devs in
bits 32 to 9, it will not be found. This is the reason for
changing the size of the device_ids array in phy.h.
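
A small sketch of the effect, assuming the scan starts at slot 1 and
stops at the array size as in the loops touched by this patch. The
function name is hypothetical:

```c
#include <stdint.h>

/*
 * Count the devices flagged in the 32-bit devices-in-package bitmap
 * that a scan over slots 1 .. num_ids - 1 would actually visit.
 */
unsigned int c45_count_visible(uint32_t devices_in_package,
			       unsigned int num_ids)
{
	unsigned int i, n = 0;

	for (i = 1; i < num_ids && i < 32; i++)
		if (devices_in_package & (UINT32_C(1) << i))
			n++;
	return n;
}
```

With devices flagged at slot 1 and slot 30 (0x1e), a scan bounded by 8
slots visits only the first one, while a scan bounded by 32 visits both.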

Signed-off-by: Vicentiu Galanopulo <vicentiu.galanopulo@nxp.com>
---
 Documentation/devicetree/bindings/net/phy.txt |  6 ++
 drivers/net/phy/phy_device.c                  | 22 +++++-
 drivers/of/of_mdio.c                          | 98 ++++++++++++++++++++++++++-
 include/linux/phy.h                           |  5 +-
 4 files changed, 125 insertions(+), 6 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/phy.txt b/Documentation/devicetree/bindings/net/phy.txt
index d2169a5..82692e2 100644
--- a/Documentation/devicetree/bindings/net/phy.txt
+++ b/Documentation/devicetree/bindings/net/phy.txt
@@ -61,6 +61,11 @@ Optional Properties:
 - reset-deassert-us: Delay after the reset was deasserted in microseconds.
   If this property is missing the delay will be skipped.
 
+- dev-addr: If set, it indicates the device address of the PHY to be used
+  when accessing the C45 PHY registers over MDIO. It is used for vendor specific
+  register space addresses that do not conform to the standard address for the MDIO
+  registers (e.g. MMD30)
+
 Example:
 
 ethernet-phy@0 {
@@ -72,4 +77,5 @@ ethernet-phy@0 {
 	reset-gpios = <&gpio1 4 GPIO_ACTIVE_LOW>;
 	reset-assert-us = <1000>;
 	reset-deassert-us = <2000>;
+	dev-addr = <0x1e>;
 };
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index b285323..f5051cf6 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -71,6 +71,7 @@ static void phy_mdio_device_remove(struct mdio_device *mdiodev)
 
 static struct phy_driver genphy_driver;
 extern struct phy_driver genphy_10g_driver;
+extern struct phy_c45_device_ids mdio_c45_ids;
 
 static LIST_HEAD(phy_fixup_list);
 static DEFINE_MUTEX(phy_fixup_lock);
@@ -457,7 +458,7 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
 static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 			   struct phy_c45_device_ids *c45_ids) {
 	int phy_reg;
-	int i, reg_addr;
+	int i, reg_addr, dev_addr;
 	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
 	u32 *devs = &c45_ids->devices_in_package;
 
@@ -493,13 +494,23 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 		if (!(c45_ids->devices_in_package & (1 << i)))
 			continue;
 
-		reg_addr = MII_ADDR_C45 | i << 16 | MII_PHYSID1;
+		/* if c45_ids->devices_addrs for the current id is not 0,
+		 * then dev-addr was defined in the device tree node,
+		 * and the PHY has been seen as a valid device, and added,
+		 * in the package. In this case we can use the
+		 * dev-addr(c45_ids->devices_addrs[i]) to do the MDIO
+		 * reading of the PHY ID.
+		 */
+		dev_addr = !!c45_ids->devices_addrs[i] ?
+					c45_ids->devices_addrs[i] : i;
+
+		reg_addr = MII_ADDR_C45 | dev_addr << 16 | MII_PHYSID1;
 		phy_reg = mdiobus_read(bus, addr, reg_addr);
 		if (phy_reg < 0)
 			return -EIO;
 		c45_ids->device_ids[i] = (phy_reg & 0xffff) << 16;
 
-		reg_addr = MII_ADDR_C45 | i << 16 | MII_PHYSID2;
+		reg_addr = MII_ADDR_C45 | dev_addr << 16 | MII_PHYSID2;
 		phy_reg = mdiobus_read(bus, addr, reg_addr);
 		if (phy_reg < 0)
 			return -EIO;
@@ -566,6 +577,11 @@ struct phy_device *get_phy_device(struct mii_bus *bus, int addr, bool is_c45)
 	u32 phy_id = 0;
 	int r;
 
+	/* copy the external mdio_c45_ids (which may contain the id's found
+	 * by searching the device tree dev-addr property) to local c45_ids
+	 */
+	memcpy(&c45_ids, &mdio_c45_ids, sizeof(struct phy_c45_device_ids));
+
 	r = get_phy_id(bus, addr, &phy_id, is_c45, &c45_ids);
 	if (r)
 		return ERR_PTR(r);
diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c
index 8c0c927..cbc34f6 100644
--- a/drivers/of/of_mdio.c
+++ b/drivers/of/of_mdio.c
@@ -24,6 +24,8 @@
 
 #define DEFAULT_GPIO_RESET_DELAY	10	/* in microseconds */
 
+struct phy_c45_device_ids mdio_c45_ids = {0};
+
 MODULE_AUTHOR("Grant Likely <grant.likely@secretlab.ca>");
 MODULE_LICENSE("GPL");
 
@@ -190,6 +192,71 @@ static bool of_mdiobus_child_is_phy(struct device_node *child)
 	return false;
 }
 
+static void of_fill_c45ids_devs_addrs(u32 dev_addr)
+{
+	int i;
+	const int num_ids = ARRAY_SIZE(mdio_c45_ids.device_ids);
+
+	/* Search through all Device Identifiers and set
+	 * dev_addr in mdio_c45_ids.devices_addrs,
+	 * if the device bit is set in
+	 * mdio_c45_ids.devices_in_package
+	 */
+	for (i = 1; i < num_ids; i++) {
+		if (!(mdio_c45_ids.devices_in_package & (1 << i)))
+			continue;
+
+		mdio_c45_ids.devices_addrs[i] = dev_addr;
+		break;
+	}
+}
+
+static int of_find_devaddr_in_pkg(struct mii_bus *bus, u32 addr,
+				  u32 dev_addr)
+{
+	u32 *devs = &mdio_c45_ids.devices_in_package;
+	int phy_reg, reg_addr;
+
+	reg_addr = MII_ADDR_C45 | dev_addr << 16 | MDIO_DEVS2;
+	phy_reg = mdiobus_read(bus, addr, reg_addr);
+	if (phy_reg < 0)
+		return -EIO;
+
+	*devs = (phy_reg & 0xffff) << 16;
+
+	reg_addr = MII_ADDR_C45 | dev_addr << 16 | MDIO_DEVS1;
+	phy_reg = mdiobus_read(bus, addr, reg_addr);
+	if (phy_reg < 0)
+		return -EIO;
+
+	*devs |= (phy_reg & 0xffff);
+
+	return 0;
+}
+
+/*
+ * Finds the device in package and populates the mdio_c45_ids
+ * if any device is found at dev_addr address. After this
+ * the PHY is registered
+ */
+static int of_mdiobus_register_static_phy(struct mii_bus *mdio,
+					  struct device_node *child,
+					  u32 addr, u32 dev_addr)
+{
+	int dev_err = 0;
+
+	if (!dev_addr)
+		goto exit_register_phy;
+
+	dev_err = of_find_devaddr_in_pkg(mdio, addr, dev_addr);
+
+	if (!dev_err)
+		of_fill_c45ids_devs_addrs(dev_addr);
+
+exit_register_phy:
+	return of_mdiobus_register_phy(mdio, child, addr);
+}
+
 /**
  * of_mdiobus_register - Register mii_bus and create PHYs from the device tree
  * @mdio: pointer to mii_bus structure
@@ -202,7 +269,10 @@ int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np)
 {
 	struct device_node *child;
 	bool scanphys = false;
+	bool dev_addr_found = true;
 	int addr, rc;
+	int dev_addr = 0;
+	int ret;
 
 	/* Do not continue if the node is disabled */
 	if (!of_device_is_available(np))
@@ -226,6 +296,14 @@ int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np)
 
 	/* Loop over the child nodes and register a phy_device for each phy */
 	for_each_available_child_of_node(np, child) {
+		/* Check if dev-addr is set in the PHY node */
+		ret = of_property_read_u32(child, "dev-addr", &dev_addr);
+
+		if (ret < 0) {
+			/* either not set or invalid */
+			dev_addr_found = false;
+		}
+
 		addr = of_mdio_parse_addr(&mdio->dev, child);
 		if (addr < 0) {
 			scanphys = true;
@@ -233,7 +311,11 @@ int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np)
 		}
 
 		if (of_mdiobus_child_is_phy(child))
-			rc = of_mdiobus_register_phy(mdio, child, addr);
+			if (dev_addr_found)
+				rc = of_mdiobus_register_static_phy(mdio, child,
+								    addr, dev_addr);
+			else
+				rc = of_mdiobus_register_phy(mdio, child, addr);
 		else
 			rc = of_mdiobus_register_device(mdio, child, addr);
 
@@ -248,8 +330,16 @@ int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np)
 	if (!scanphys)
 		return 0;
 
+	/* reset device found variable */
+	dev_addr_found = true;
+
 	/* auto scan for PHYs with empty reg property */
 	for_each_available_child_of_node(np, child) {
+		/* Check if dev-addr is set in the PHY node,
+		 * for PHYs which don't have reg property set
+		 */
+		ret = of_property_read_u32(child, "dev-addr", &dev_addr);
+
 		/* Skip PHYs with reg property set */
 		if (of_find_property(child, "reg", NULL))
 			continue;
@@ -264,7 +354,11 @@ int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np)
 				 child->name, addr);
 
 			if (of_mdiobus_child_is_phy(child)) {
-				rc = of_mdiobus_register_phy(mdio, child, addr);
+				if (dev_addr_found)
+					rc = of_mdiobus_register_static_phy(mdio, child,
+									    addr, dev_addr);
+				else
+					rc = of_mdiobus_register_phy(mdio, child, addr);
 				if (rc && rc != -ENODEV)
 					goto unregister;
 			}
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 5a9b175..161ad90 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -357,10 +357,13 @@ enum phy_state {
  * struct phy_c45_device_ids - 802.3-c45 Device Identifiers
  * @devices_in_package: Bit vector of devices present.
  * @device_ids: The device identifer for each present device.
+ * @devices_addrs: The devices addresses from the device tree
+ *		   for each present device.
  */
 struct phy_c45_device_ids {
 	u32 devices_in_package;
-	u32 device_ids[8];
+	u32 device_ids[32];
+	u32 devices_addrs[32];
 };
 
 /* phy_device: An instance of a PHY
-- 
2.7.4


* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
From: Jiri Pirko @ 2018-03-23 15:05 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski,
	David Ahern
In-Reply-To: <e7900c83-3dbd-5e7a-57ec-2647e0371e75@cumulusnetworks.com>

Fri, Mar 23, 2018 at 04:03:40PM CET, dsa@cumulusnetworks.com wrote:
>On 3/23/18 9:01 AM, Jiri Pirko wrote:
>> Fri, Mar 23, 2018 at 03:31:02PM CET, dsa@cumulusnetworks.com wrote:
>>> On 3/23/18 12:50 AM, Jiri Pirko wrote:
>>>>> +void nsim_devlink_setup(struct netdevsim *ns)
>>>>> +{
>>>>> +	struct net *net = dev_net(ns->netdev);
>>>>> +	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>>>>> +	struct devlink *devlink;
>>>>> +	int err = -ENOMEM;
>>>>> +
>>>>> +	/* only one device per namespace controls devlink */
>>>>> +	if (!*reg_devlink) {
>>>>> +		ns->devlink = NULL;
>>>>> +		return;
>>>>> +	}
>>>>> +
>>>>> +	devlink = devlink_alloc(&nsim_devlink_ops, 0);
>>>>> +	if (!devlink)
>>>>> +		return;
>>>>> +
>>>>> +	devlink_net_set(devlink, net);
>>>>> +	err = devlink_register(devlink, &ns->dev);
>>>>
>>>> This reg_devlink construct looks odd. Why don't you leave the devlink
>>>> instance in init_ns?
>>>
>>> It is a per-network namespace resource controller. Since struct devlink
>> 
>> Wait a second. What do you mean by "per-network namespace"? Devlink
>> instance is always associated with one physical device. Like an ASIC.
>> 
>> 
>>> has a net entry, the simplest design is to put it into the namespace of
>>> the controller. Without it, controlling resource sizes in namespace
>>> 'foobar' has to be done from init_net, which is just wrong.
>
>you need to look at how netdevsim creates a device per netdevice.

That means one devlink instance for each netdevsim device, doesn't it?


* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
From: David Ahern @ 2018-03-23 15:03 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski,
	David Ahern
In-Reply-To: <20180323150149.GD2125@nanopsycho>

On 3/23/18 9:01 AM, Jiri Pirko wrote:
> Fri, Mar 23, 2018 at 03:31:02PM CET, dsa@cumulusnetworks.com wrote:
>> On 3/23/18 12:50 AM, Jiri Pirko wrote:
>>>> +void nsim_devlink_setup(struct netdevsim *ns)
>>>> +{
>>>> +	struct net *net = dev_net(ns->netdev);
>>>> +	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>>>> +	struct devlink *devlink;
>>>> +	int err = -ENOMEM;
>>>> +
>>>> +	/* only one device per namespace controls devlink */
>>>> +	if (!*reg_devlink) {
>>>> +		ns->devlink = NULL;
>>>> +		return;
>>>> +	}
>>>> +
>>>> +	devlink = devlink_alloc(&nsim_devlink_ops, 0);
>>>> +	if (!devlink)
>>>> +		return;
>>>> +
>>>> +	devlink_net_set(devlink, net);
>>>> +	err = devlink_register(devlink, &ns->dev);
>>>
>>> This reg_devlink construct looks odd. Why don't you leave the devlink
>>> instance in init_ns?
>>
>> It is a per-network namespace resource controller. Since struct devlink
> 
> Wait a second. What do you mean by "per-network namespace"? Devlink
> instance is always associated with one physical device. Like an ASIC.
> 
> 
>> has a net entry, the simplest design is to put it into the namespace of
>> the controller. Without it, controlling resource sizes in namespace
>> 'foobar' has to be done from init_net, which is just wrong.

you need to look at how netdevsim creates a device per netdevice.


* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
From: Jiri Pirko @ 2018-03-23 15:01 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski,
	David Ahern
In-Reply-To: <03eade79-1727-3a31-8e31-a0a7f51b72cf@cumulusnetworks.com>

Fri, Mar 23, 2018 at 03:31:02PM CET, dsa@cumulusnetworks.com wrote:
>On 3/23/18 12:50 AM, Jiri Pirko wrote:
>>> +void nsim_devlink_setup(struct netdevsim *ns)
>>> +{
>>> +	struct net *net = dev_net(ns->netdev);
>>> +	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>>> +	struct devlink *devlink;
>>> +	int err = -ENOMEM;
>>> +
>>> +	/* only one device per namespace controls devlink */
>>> +	if (!*reg_devlink) {
>>> +		ns->devlink = NULL;
>>> +		return;
>>> +	}
>>> +
>>> +	devlink = devlink_alloc(&nsim_devlink_ops, 0);
>>> +	if (!devlink)
>>> +		return;
>>> +
>>> +	devlink_net_set(devlink, net);
>>> +	err = devlink_register(devlink, &ns->dev);
>> 
>> This reg_devlink construct looks odd. Why don't you leave the devlink
>> instance in init_ns?
>
>It is a per-network namespace resource controller. Since struct devlink

Wait a second. What do you mean by "per-network namespace"? Devlink
instance is always associated with one physical device. Like an ASIC.


>has a net entry, the simplest design is to put it into the namespace of
>the controller. Without it, controlling resource sizes in namespace
>'foobar' has to be done from init_net, which is just wrong.


* Re: [patch net-next RFC 00/12] devlink: introduce port flavours and common phys_port_name generation
From: Jiri Pirko @ 2018-03-23 14:59 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, davem, idosch, jakub.kicinski, mlxsw, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, simon.horman,
	pieter.jansenvanvuuren, john.hurley, dirk.vandermerwe,
	alexander.h.duyck, ogerlitz, dsahern, vijaya.guvva,
	satananda.burla, raghu.vatsavayi, felix.manlunas, gospo,
	sathya.perla, vasundhara-v.volam, tariqt, eranbe,
	jeffrey.t.kirsher
In-Reply-To: <20180323134357.GG5145@lunn.ch>

Fri, Mar 23, 2018 at 02:43:57PM CET, andrew@lunn.ch wrote:
>> I tested this for mlxsw and nfp. I have no way to test this on DSA hw,
>> I would really appreciate DSA guys to test this.
>
>Hi Jiri
>
>With the missing break added, i get:
>
>root@zii-devel-b:~# ./iproute2/devlink/devlink port 
>mdio_bus/0.1:00/0: type eth netdev lan0 flavour physical number 0
>mdio_bus/0.1:00/1: type eth netdev lan1 flavour physical number 1
>mdio_bus/0.1:00/2: type eth netdev lan2 flavour physical number 2
>mdio_bus/0.1:00/3: type notset
>mdio_bus/0.1:00/4: type notset
>mdio_bus/0.1:00/5: type notset flavour dsa number 5
>mdio_bus/0.1:00/6: type notset flavour cpu number 6
>mdio_bus/0.2:00/0: type eth netdev lan3 flavour physical number 0
>mdio_bus/0.2:00/1: type eth netdev lan4 flavour physical number 1
>mdio_bus/0.2:00/2: type eth netdev lan5 flavour physical number 2
>mdio_bus/0.2:00/3: type notset
>mdio_bus/0.2:00/4: type notset
>mdio_bus/0.2:00/5: type notset flavour dsa number 5
>mdio_bus/0.2:00/6: type notset flavour dsa number 6
>mdio_bus/0.4:00/0: type eth netdev lan6 flavour physical number 0
>mdio_bus/0.4:00/1: type eth netdev lan7 flavour physical number 1
>mdio_bus/0.4:00/2: type eth netdev lan8 flavour physical number 2
>mdio_bus/0.4:00/3: type eth netdev optical3 flavour physical number 3
>mdio_bus/0.4:00/4: type eth netdev optical4 flavour physical number 4
>mdio_bus/0.4:00/5: type notset
>mdio_bus/0.4:00/6: type notset
>mdio_bus/0.4:00/7: type notset
>mdio_bus/0.4:00/8: type notset
>mdio_bus/0.4:00/9: type notset flavour dsa number 9
>
>This is on a board with a DSA cluster of three switches. Some of the
>switch ports are not connected to anything, so are plain 'notset'.

Okay. That looks fine. I wonder if it would make sense to have another
flavour for "unused" ports.


>
>What is the "number X" meant to mean?

That is basically the front panel number for physical ports. It is used
for generating phys_port_name. CPU ports and DSA ports should most
probably have separate numbering. However, since they have no netdevice
associated, the number is not used and is only shown here.

In case of mlxsw switch port 1, the netdev name is then
for example: "enp3s0np1".
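
A rough sketch of how the "p1" part of such a name could be produced
from the port number. This assumes the generated phys_port_name for a
physical port is simply "p<number>" and that the "enp3s0n" prefix is
added by userspace naming; the function and enum names below are
hypothetical and are not the devlink API.

```c
#include <stdio.h>

/* Hypothetical flavours, mirroring the ones in the devlink output above. */
enum sketch_flavour { SKETCH_PHYSICAL, SKETCH_CPU, SKETCH_DSA };

/*
 * Write "p<number>" into buf for physical ports.
 * Returns the snprintf() result, or -1 for flavours that have no
 * netdevice and therefore (in this sketch) no phys_port_name.
 */
static int sketch_phys_port_name(enum sketch_flavour flavour,
				 unsigned int number,
				 char *buf, size_t len)
{
	if (flavour != SKETCH_PHYSICAL)
		return -1;
	return snprintf(buf, len, "p%u", number);
}
```

Calling sketch_phys_port_name(SKETCH_PHYSICAL, 1, buf, sizeof(buf))
leaves "p1" in buf, while a cpu or dsa port gets no name at all.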

^ permalink raw reply

* Re: l2tp stable request
From: Guillaume Nault @ 2018-03-23 14:58 UTC (permalink / raw)
  To: Daniel Rosenberg; +Cc: netdev, Greg Kroah-Hartman, stable
In-Reply-To: <fbb33606-b817-356f-acaa-81aab44327cb@google.com>

On Thu, Mar 22, 2018 at 05:55:30PM -0700, Daniel Rosenberg wrote:
> f3c66d4e144a0904ea9b95d23ed9f8eb38c11bfb        l2tp: prevent creation of
> sessions on terminated tunnels
> 9ee369a405c57613d7c83a3967780c3e30c52ecc        l2tp: initialise session's
> refcount before making it reachable
> dbdbc73b44782e22b3b4b6e8b51e7a3d245f3086        l2tp: fix duplicate session
> creation
> 61b9a047729bb230978178bca6729689d0c50ca2        l2tp: fix race in
> l2tp_recv_common()
> 
> For v3.18+. It requires some minor backporting.
> 
> Without these, I'm seeing a null pointer in l2tp_session_create. These logs
> are from a 3.18 kernel, although I was able to hit it on a 4.4 kernel I
> tested as well.
> 
No objection from me. Let me know if there are any difficulties with a
backport.

Guillaume

^ permalink raw reply

* [PATCH net] ipv6: fix possible deadlock in rt6_age_examine_exception()
From: Eric Dumazet @ 2018-03-23 14:56 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Eric Dumazet, Wei Wang, Martin KaFai Lau

syzbot reported a LOCKDEP splat [1] in rt6_age_examine_exception()

rt6_age_examine_exception() is called while rt6_exception_lock is held.
This lock is the lower one in the lock hierarchy, so we cannot call
dst_neigh_lookup(), as it can fall back to neigh_create().

We should instead do a pure RCU lookup. As a bonus we avoid
a pair of atomic operations on neigh refcount.

[1]

WARNING: possible circular locking dependency detected
4.16.0-rc4+ #277 Not tainted

syz-executor7/4015 is trying to acquire lock:
 (&ndev->lock){++--}, at: [<00000000416dce19>] __ipv6_dev_mc_dec+0x45/0x350 net/ipv6/mcast.c:928

but task is already holding lock:
 (&tbl->lock){++-.}, at: [<00000000b5cb1d65>] neigh_ifdown+0x3d/0x250 net/core/neighbour.c:292

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (&tbl->lock){++-.}:
       __raw_write_lock_bh include/linux/rwlock_api_smp.h:203 [inline]
       _raw_write_lock_bh+0x31/0x40 kernel/locking/spinlock.c:312
       __neigh_create+0x87e/0x1d90 net/core/neighbour.c:528
       neigh_create include/net/neighbour.h:315 [inline]
       ip6_neigh_lookup+0x9a7/0xba0 net/ipv6/route.c:228
       dst_neigh_lookup include/net/dst.h:405 [inline]
       rt6_age_examine_exception net/ipv6/route.c:1609 [inline]
       rt6_age_exceptions+0x381/0x660 net/ipv6/route.c:1645
       fib6_age+0xfb/0x140 net/ipv6/ip6_fib.c:2033
       fib6_clean_node+0x389/0x580 net/ipv6/ip6_fib.c:1919
       fib6_walk_continue+0x46c/0x8a0 net/ipv6/ip6_fib.c:1845
       fib6_walk+0x91/0xf0 net/ipv6/ip6_fib.c:1893
       fib6_clean_tree+0x1e6/0x340 net/ipv6/ip6_fib.c:1970
       __fib6_clean_all+0x1f4/0x3a0 net/ipv6/ip6_fib.c:1986
       fib6_clean_all net/ipv6/ip6_fib.c:1997 [inline]
       fib6_run_gc+0x16b/0x3c0 net/ipv6/ip6_fib.c:2053
       ndisc_netdev_event+0x3c2/0x4a0 net/ipv6/ndisc.c:1781
       notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93
       __raw_notifier_call_chain kernel/notifier.c:394 [inline]
       raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
       call_netdevice_notifiers_info+0x32/0x70 net/core/dev.c:1707
       call_netdevice_notifiers net/core/dev.c:1725 [inline]
       __dev_notify_flags+0x262/0x430 net/core/dev.c:6960
       dev_change_flags+0xf5/0x140 net/core/dev.c:6994
       devinet_ioctl+0x126a/0x1ac0 net/ipv4/devinet.c:1080
       inet_ioctl+0x184/0x310 net/ipv4/af_inet.c:919
       sock_do_ioctl+0xef/0x390 net/socket.c:957
       sock_ioctl+0x36b/0x610 net/socket.c:1081
       vfs_ioctl fs/ioctl.c:46 [inline]
       do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
       SYSC_ioctl fs/ioctl.c:701 [inline]
       SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
       do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x42/0xb7

-> #2 (rt6_exception_lock){+.-.}:
       __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
       _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
       spin_lock_bh include/linux/spinlock.h:315 [inline]
       rt6_flush_exceptions+0x21/0x210 net/ipv6/route.c:1367
       fib6_del_route net/ipv6/ip6_fib.c:1677 [inline]
       fib6_del+0x624/0x12c0 net/ipv6/ip6_fib.c:1761
       __ip6_del_rt+0xc7/0x120 net/ipv6/route.c:2980
       ip6_del_rt+0x132/0x1a0 net/ipv6/route.c:2993
       __ipv6_dev_ac_dec+0x3b1/0x600 net/ipv6/anycast.c:332
       ipv6_dev_ac_dec net/ipv6/anycast.c:345 [inline]
       ipv6_sock_ac_close+0x2b4/0x3e0 net/ipv6/anycast.c:200
       inet6_release+0x48/0x70 net/ipv6/af_inet6.c:433
       sock_release+0x8d/0x1e0 net/socket.c:594
       sock_close+0x16/0x20 net/socket.c:1149
       __fput+0x327/0x7e0 fs/file_table.c:209
       ____fput+0x15/0x20 fs/file_table.c:243
       task_work_run+0x199/0x270 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x9bb/0x1ad0 kernel/exit.c:865
       do_group_exit+0x149/0x400 kernel/exit.c:968
       get_signal+0x73a/0x16d0 kernel/signal.c:2469
       do_signal+0x90/0x1e90 arch/x86/kernel/signal.c:809
       exit_to_usermode_loop+0x258/0x2f0 arch/x86/entry/common.c:162
       prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
       syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
       do_syscall_64+0x6ec/0x940 arch/x86/entry/common.c:292
       entry_SYSCALL_64_after_hwframe+0x42/0xb7

-> #1 (&(&tb->tb6_lock)->rlock){+.-.}:
       __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
       _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
       spin_lock_bh include/linux/spinlock.h:315 [inline]
       __ip6_ins_rt+0x56/0x90 net/ipv6/route.c:1007
       ip6_route_add+0x141/0x190 net/ipv6/route.c:2955
       addrconf_prefix_route+0x44f/0x620 net/ipv6/addrconf.c:2359
       fixup_permanent_addr net/ipv6/addrconf.c:3368 [inline]
       addrconf_permanent_addr net/ipv6/addrconf.c:3391 [inline]
       addrconf_notify+0x1ad2/0x2310 net/ipv6/addrconf.c:3460
       notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93
       __raw_notifier_call_chain kernel/notifier.c:394 [inline]
       raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
       call_netdevice_notifiers_info+0x32/0x70 net/core/dev.c:1707
       call_netdevice_notifiers net/core/dev.c:1725 [inline]
       __dev_notify_flags+0x15d/0x430 net/core/dev.c:6958
       dev_change_flags+0xf5/0x140 net/core/dev.c:6994
       do_setlink+0xa22/0x3bb0 net/core/rtnetlink.c:2357
       rtnl_newlink+0xf37/0x1a50 net/core/rtnetlink.c:2965
       rtnetlink_rcv_msg+0x57f/0xb10 net/core/rtnetlink.c:4641
       netlink_rcv_skb+0x14b/0x380 net/netlink/af_netlink.c:2444
       rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4659
       netlink_unicast_kernel net/netlink/af_netlink.c:1308 [inline]
       netlink_unicast+0x4c4/0x6b0 net/netlink/af_netlink.c:1334
       netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1897
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:639
       ___sys_sendmsg+0x767/0x8b0 net/socket.c:2047
       __sys_sendmsg+0xe5/0x210 net/socket.c:2081
       SYSC_sendmsg net/socket.c:2092 [inline]
       SyS_sendmsg+0x2d/0x50 net/socket.c:2088
       do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x42/0xb7

-> #0 (&ndev->lock){++--}:
       lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920
       __raw_write_lock_bh include/linux/rwlock_api_smp.h:203 [inline]
       _raw_write_lock_bh+0x31/0x40 kernel/locking/spinlock.c:312
       __ipv6_dev_mc_dec+0x45/0x350 net/ipv6/mcast.c:928
       ipv6_dev_mc_dec+0x110/0x1f0 net/ipv6/mcast.c:961
       pndisc_destructor+0x21a/0x340 net/ipv6/ndisc.c:392
       pneigh_ifdown net/core/neighbour.c:695 [inline]
       neigh_ifdown+0x149/0x250 net/core/neighbour.c:294
       rt6_disable_ip+0x537/0x700 net/ipv6/route.c:3874
       addrconf_ifdown+0x14b/0x14f0 net/ipv6/addrconf.c:3633
       addrconf_notify+0x5f8/0x2310 net/ipv6/addrconf.c:3557
       notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93
       __raw_notifier_call_chain kernel/notifier.c:394 [inline]
       raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
       call_netdevice_notifiers_info+0x32/0x70 net/core/dev.c:1707
       call_netdevice_notifiers net/core/dev.c:1725 [inline]
       __dev_notify_flags+0x262/0x430 net/core/dev.c:6960
       dev_change_flags+0xf5/0x140 net/core/dev.c:6994
       devinet_ioctl+0x126a/0x1ac0 net/ipv4/devinet.c:1080
       inet_ioctl+0x184/0x310 net/ipv4/af_inet.c:919
       packet_ioctl+0x1ff/0x310 net/packet/af_packet.c:4066
       sock_do_ioctl+0xef/0x390 net/socket.c:957
       sock_ioctl+0x36b/0x610 net/socket.c:1081
       vfs_ioctl fs/ioctl.c:46 [inline]
       do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
       SYSC_ioctl fs/ioctl.c:701 [inline]
       SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
       do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x42/0xb7

other info that might help us debug this:

Chain exists of:
  &ndev->lock --> rt6_exception_lock --> &tbl->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&tbl->lock);
                               lock(rt6_exception_lock);
                               lock(&tbl->lock);
  lock(&ndev->lock);

 *** DEADLOCK ***

2 locks held by syz-executor7/4015:
 #0:  (rtnl_mutex){+.+.}, at: [<00000000a2f16daa>] rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
 #1:  (&tbl->lock){++-.}, at: [<00000000b5cb1d65>] neigh_ifdown+0x3d/0x250 net/core/neighbour.c:292

stack backtrace:
CPU: 0 PID: 4015 Comm: syz-executor7 Not tainted 4.16.0-rc4+ #277
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x194/0x24d lib/dump_stack.c:53
 print_circular_bug.isra.38+0x2cd/0x2dc kernel/locking/lockdep.c:1223
 check_prev_add kernel/locking/lockdep.c:1863 [inline]
 check_prevs_add kernel/locking/lockdep.c:1976 [inline]
 validate_chain kernel/locking/lockdep.c:2417 [inline]
 __lock_acquire+0x30a8/0x3e00 kernel/locking/lockdep.c:3431
 lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920
 __raw_write_lock_bh include/linux/rwlock_api_smp.h:203 [inline]
 _raw_write_lock_bh+0x31/0x40 kernel/locking/spinlock.c:312
 __ipv6_dev_mc_dec+0x45/0x350 net/ipv6/mcast.c:928
 ipv6_dev_mc_dec+0x110/0x1f0 net/ipv6/mcast.c:961
 pndisc_destructor+0x21a/0x340 net/ipv6/ndisc.c:392
 pneigh_ifdown net/core/neighbour.c:695 [inline]
 neigh_ifdown+0x149/0x250 net/core/neighbour.c:294
 rt6_disable_ip+0x537/0x700 net/ipv6/route.c:3874
 addrconf_ifdown+0x14b/0x14f0 net/ipv6/addrconf.c:3633
 addrconf_notify+0x5f8/0x2310 net/ipv6/addrconf.c:3557
 notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93
 __raw_notifier_call_chain kernel/notifier.c:394 [inline]
 raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
 call_netdevice_notifiers_info+0x32/0x70 net/core/dev.c:1707
 call_netdevice_notifiers net/core/dev.c:1725 [inline]
 __dev_notify_flags+0x262/0x430 net/core/dev.c:6960
 dev_change_flags+0xf5/0x140 net/core/dev.c:6994
 devinet_ioctl+0x126a/0x1ac0 net/ipv4/devinet.c:1080
 inet_ioctl+0x184/0x310 net/ipv4/af_inet.c:919
 packet_ioctl+0x1ff/0x310 net/packet/af_packet.c:4066
 sock_do_ioctl+0xef/0x390 net/socket.c:957
 sock_ioctl+0x36b/0x610 net/socket.c:1081
 vfs_ioctl fs/ioctl.c:46 [inline]
 do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
 SYSC_ioctl fs/ioctl.c:701 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x42/0xb7

Fixes: c757faa8bfa2 ("ipv6: prepare fib6_age() for exception table")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
---
 net/ipv6/route.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index b0d5c64e19780ce94feb112285ed1d85dbe07e9e..b33d057ac5eb2a85e19be59f0bceacf547cc9e59 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1626,11 +1626,10 @@ static void rt6_age_examine_exception(struct rt6_exception_bucket *bucket,
 		struct neighbour *neigh;
 		__u8 neigh_flags = 0;
 
-		neigh = dst_neigh_lookup(&rt->dst, &rt->rt6i_gateway);
-		if (neigh) {
+		neigh = __ipv6_neigh_lookup_noref(rt->dst.dev, &rt->rt6i_gateway);
+		if (neigh)
 			neigh_flags = neigh->flags;
-			neigh_release(neigh);
-		}
+
 		if (!(neigh_flags & NTF_ROUTER)) {
 			RT6_TRACE("purging route %p via non-router but gateway\n",
 				  rt);
@@ -1654,7 +1653,8 @@ void rt6_age_exceptions(struct rt6_info *rt,
 	if (!rcu_access_pointer(rt->rt6i_exception_bucket))
 		return;
 
-	spin_lock_bh(&rt6_exception_lock);
+	rcu_read_lock_bh();
+	spin_lock(&rt6_exception_lock);
 	bucket = rcu_dereference_protected(rt->rt6i_exception_bucket,
 				    lockdep_is_held(&rt6_exception_lock));
 
@@ -1668,7 +1668,8 @@ void rt6_age_exceptions(struct rt6_info *rt,
 			bucket++;
 		}
 	}
-	spin_unlock_bh(&rt6_exception_lock);
+	spin_unlock(&rt6_exception_lock);
+	rcu_read_unlock_bh();
 }
 
 struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
-- 
2.17.0.rc0.231.g781580f067-goog
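
To illustrate the "pair of atomic operations" remark in the changelog,
here is a userspace sketch contrasting a refcounted lookup with a
"noref" one. All names are hypothetical. The real code relies on
rcu_read_lock_bh() to keep the entry alive during the noref access,
which this sketch does not model.

```c
#include <stdatomic.h>
#include <stddef.h>

struct sketch_neigh {
	atomic_int refcnt;
	unsigned char flags;
};

/* Refcounted lookup: one atomic increment, caller must release later. */
static struct sketch_neigh *sketch_lookup(struct sketch_neigh *n)
{
	if (n)
		atomic_fetch_add(&n->refcnt, 1);
	return n;
}

/* Drop the reference taken by sketch_lookup(): one atomic decrement. */
static void sketch_release(struct sketch_neigh *n)
{
	atomic_fetch_sub(&n->refcnt, 1);
}

/* Old pattern: lookup, read flags, release (two atomic operations). */
static unsigned char flags_refcounted(struct sketch_neigh *entry)
{
	struct sketch_neigh *n = sketch_lookup(entry);
	unsigned char flags = 0;

	if (n) {
		flags = n->flags;
		sketch_release(n);
	}
	return flags;
}

/* New pattern: read flags directly, no refcount traffic at all. */
static unsigned char flags_noref(struct sketch_neigh *entry)
{
	return entry ? entry->flags : 0;
}
```

Both helpers return the same flags and leave the refcount where it
started; the noref variant simply never touches it.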

^ permalink raw reply related

* Re: [bpf-next V5 PATCH 11/15] page_pool: refurbish version of page_pool code
From: Eric Dumazet @ 2018-03-23 14:55 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Eric Dumazet
  Cc: netdev, BjörnTöpel, magnus.karlsson, eugenia,
	Jason Wang, John Fastabend, Eran Ben Elisha, Saeed Mahameed, galp,
	Daniel Borkmann, Alexei Starovoitov, Tariq Toukan
In-Reply-To: <20180323151522.2d3dde07@redhat.com>



On 03/23/2018 07:15 AM, Jesper Dangaard Brouer wrote:
> On Fri, 23 Mar 2018 06:29:55 -0700
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
>> On 03/23/2018 05:18 AM, Jesper Dangaard Brouer wrote:
>>
>>> +
>>> +void page_pool_destroy_rcu(struct page_pool *pool)
>>> +{
>>> +	call_rcu(&pool->rcu, __page_pool_destroy_rcu);
>>> +}
>>> +EXPORT_SYMBOL(page_pool_destroy_rcu);
>>>   
>>
>>
>> Why do we need to respect one rcu grace period before destroying a page pool ?
> 
> Due to previous allocator ID patch, which can have a pointer reference
> to a page_pool, and the allocator ID lookup uses RCU.
> 

I am not convinced.  How can a patch that comes _before_ this one have any impact?

Normally, we put infrastructure first, then something using it.

An rcu grace period before freeing huge quantities of pages is problematic and could
be used by syzbot to OOM the host.

^ permalink raw reply

* [PATCH v2] fsl/fman: remove unnecessary set_dma_ops() call and HAS_DMA dependency
From: Madalin Bucur @ 2018-03-23 14:52 UTC (permalink / raw)
  To: davem, geert.uytterhoeven; +Cc: netdev, linux-kernel, Madalin Bucur

The platform device is no longer used for DMA mapping, so the
(questionable) setting of the DMA ops done here is no longer
needed. Remove it together with the HAS_DMA dependency that
it required.

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
---
 drivers/net/ethernet/freescale/fman/Kconfig | 1 -
 drivers/net/ethernet/freescale/fman/mac.c   | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fman/Kconfig b/drivers/net/ethernet/freescale/fman/Kconfig
index 8870a9a..dc0850b 100644
--- a/drivers/net/ethernet/freescale/fman/Kconfig
+++ b/drivers/net/ethernet/freescale/fman/Kconfig
@@ -2,7 +2,6 @@ config FSL_FMAN
 	tristate "FMan support"
 	depends on FSL_SOC || ARCH_LAYERSCAPE || COMPILE_TEST
 	select GENERIC_ALLOCATOR
-	depends on HAS_DMA
 	select PHYLIB
 	default n
 	help
diff --git a/drivers/net/ethernet/freescale/fman/mac.c b/drivers/net/ethernet/freescale/fman/mac.c
index 4829dcd..7b5b95f 100644
--- a/drivers/net/ethernet/freescale/fman/mac.c
+++ b/drivers/net/ethernet/freescale/fman/mac.c
@@ -567,7 +567,6 @@ static struct platform_device *dpaa_eth_add_device(int fman_id,
 	}
 
 	pdev->dev.parent = priv->dev;
-	set_dma_ops(&pdev->dev, get_dma_ops(priv->dev));
 
 	ret = platform_device_add_data(pdev, &data, sizeof(data));
 	if (ret)
-- 
2.1.0

^ permalink raw reply related

* Re: [patch net-next RFC 04/12] dsa: set devlink port attrs for dsa ports
From: Jiri Pirko @ 2018-03-23 14:49 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, davem, idosch, jakub.kicinski, mlxsw, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, simon.horman,
	pieter.jansenvanvuuren, john.hurley, dirk.vandermerwe,
	alexander.h.duyck, ogerlitz, dsahern, vijaya.guvva,
	satananda.burla, raghu.vatsavayi, felix.manlunas, gospo,
	sathya.perla, vasundhara-v.volam, tariqt, eranbe,
	jeffrey.t.kirsher
In-Reply-To: <20180323133002.GF5145@lunn.ch>

Fri, Mar 23, 2018 at 02:30:02PM CET, andrew@lunn.ch wrote:
>On Thu, Mar 22, 2018 at 11:55:14AM +0100, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> Set the attrs and allow to expose port flavour to user via devlink.
>> 
>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>> ---
>>  net/dsa/dsa2.c | 23 +++++++++++++++++++++++
>>  1 file changed, 23 insertions(+)
>> 
>> diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
>> index adf50fbc4c13..49453690696d 100644
>> --- a/net/dsa/dsa2.c
>> +++ b/net/dsa/dsa2.c
>> @@ -270,7 +270,27 @@ static int dsa_port_setup(struct dsa_port *dp)
>>  	case DSA_PORT_TYPE_UNUSED:
>>  		break;
>>  	case DSA_PORT_TYPE_CPU:
>> +		/* dp->index is used now as port_number. However
>> +		 * CPU ports should have separate numbering
>> +		 * independent from front panel port numbers.
>> +		 */
>> +		devlink_port_attrs_set(&dp->devlink_port,
>> +				       DEVLINK_PORT_FLAVOUR_CPU,
>> +				       dp->index, false, 0);
>> +		err = dsa_port_link_register_of(dp);
>> +		if (err) {
>> +			dev_err(ds->dev, "failed to setup link for port %d.%d\n",
>> +				ds->index, dp->index);
>> +			return err;
>> +		}
>
>Ah, i get it. These used to be two case statements with one code
>block. But you split them apart, so needed to duplicate the
>dsa_port_link_register.
>
>Unfortunately, you forgot to add a 'break;', so it still falls
>through, and overwrites the port flavour to DSA.

ah, crap. Don't have hw to test this :/
Will fix. Thanks!

>
>>  	case DSA_PORT_TYPE_DSA:
>> +		/* dp->index is used now as port_number. However
>> +		 * DSA ports should have separate numbering
>> +		 * independent from front panel port numbers.
>> +		 */
>> +		devlink_port_attrs_set(&dp->devlink_port,
>> +				       DEVLINK_PORT_FLAVOUR_DSA,
>> +				       dp->index, false, 0);
>
>  Andrew
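
The fall-through described above can be reproduced in isolation. This
is a minimal sketch with hypothetical enum and function names, not the
dsa2.c code: without the 'break;', the CPU case runs on into the DSA
case and the flavour ends up as DSA.

```c
enum sketch_port_type { SKETCH_PORT_UNUSED, SKETCH_PORT_CPU, SKETCH_PORT_DSA };
enum sketch_port_flavour { SKETCH_FLAVOUR_NONE, SKETCH_FLAVOUR_CPU,
			   SKETCH_FLAVOUR_DSA };

/* Buggy variant: the CPU case has no break. */
static enum sketch_port_flavour flavour_buggy(enum sketch_port_type type)
{
	enum sketch_port_flavour f = SKETCH_FLAVOUR_NONE;

	switch (type) {
	case SKETCH_PORT_UNUSED:
		break;
	case SKETCH_PORT_CPU:
		f = SKETCH_FLAVOUR_CPU;
		/* falls through - the missing break is the bug */
	case SKETCH_PORT_DSA:
		f = SKETCH_FLAVOUR_DSA;
		break;
	}
	return f;
}

/* Fixed variant: each case ends with its own break. */
static enum sketch_port_flavour flavour_fixed(enum sketch_port_type type)
{
	enum sketch_port_flavour f = SKETCH_FLAVOUR_NONE;

	switch (type) {
	case SKETCH_PORT_UNUSED:
		break;
	case SKETCH_PORT_CPU:
		f = SKETCH_FLAVOUR_CPU;
		break;
	case SKETCH_PORT_DSA:
		f = SKETCH_FLAVOUR_DSA;
		break;
	}
	return f;
}
```

With the buggy variant a CPU port reports the DSA flavour; with the
fixed one it reports CPU.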

^ permalink raw reply

* Re: [net-next:master 304/314] drivers/net/ethernet/mellanox/mlxsw/spectrum.c:3878:8: error: too few arguments to function 'devlink_resource_register'
From: David Ahern @ 2018-03-23 14:33 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: kbuild test robot, kbuild-all, netdev
In-Reply-To: <20180323065310.GN2074@nanopsycho.orion>

On 3/23/18 12:53 AM, Jiri Pirko wrote:
> Fri, Mar 23, 2018 at 02:53:38AM CET, dsahern@gmail.com wrote:
>> On 3/22/18 6:47 PM, kbuild test robot wrote:
>>> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
>>> head:   6686c459e1449a3ee5f3fd313b0a559ace7a700e
>>> commit: 145307460ba9c11489807de7acd3f4c7395f60b7 [304/314] devlink: Remove top_hierarchy arg to devlink_resource_register
>>> config: x86_64-randconfig-s1-03230751 (attached as .config)
>>> compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
>>> reproduce:
>>>         git checkout 145307460ba9c11489807de7acd3f4c7395f60b7
>>>         # save the attached .config to linux build tree
>>>         make ARCH=x86_64 
>>>
>>> All error/warnings (new ones prefixed by >>):
>>>
>>>    drivers/net/ethernet/mellanox/mlxsw/spectrum.c: In function 'mlxsw_sp_resources_register':
>>>>> drivers/net/ethernet/mellanox/mlxsw/spectrum.c:3881:6: warning: passing argument 6 of 'devlink_resource_register' makes integer from pointer without a cast [-Wint-conversion]
>>>          &kvd_size_params,
>>>          ^
>>>    In file included from drivers/net/ethernet/mellanox/mlxsw/core.h:47:0,
>>>                     from drivers/net/ethernet/mellanox/mlxsw/spectrum.h:54,
>>>                     from drivers/net/ethernet/mellanox/mlxsw/spectrum.c:64:
>>>    include/net/devlink.h:560:1: note: expected 'u64 {aka long long unsigned int}' but argument is of type 'struct devlink_resource_size_params *'
>>>     devlink_resource_register(struct devlink *devlink,
>>>     ^~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> I just did another full build (allmodconfig) on net-next and did not hit
>> this error.
> 
> The "else branch" in "#if IS_ENABLED(CONFIG_NET_DEVLINK)" is the
> problem:
> 
> static inline int
> devlink_resource_register(struct devlink *devlink,
>                           const char *resource_name,
>                           bool top_hierarchy,
>                           u64 resource_size,
>                           u64 resource_id,
>                           u64 parent_resource_id,
>                           const struct devlink_resource_size_params *size_params,
>                           const struct devlink_resource_ops *resource_ops)
> {
>         return 0;
> }
> 

ugh. Thanks, Jiri. Will fix.
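
A minimal sketch of the pattern at issue, with hypothetical names: when
a header offers a real prototype under a config switch and an inline
stub otherwise, both must take the same arguments, or callers only
build in one configuration. SKETCH_FEATURE_ENABLED stands in for
IS_ENABLED(CONFIG_NET_DEVLINK) and is an assumption of this example.

```c
/* Hypothetical feature switch; 0 selects the stub, as in the randconfig. */
#ifndef SKETCH_FEATURE_ENABLED
#define SKETCH_FEATURE_ENABLED 0
#endif

struct sketch_size_params {
	unsigned long long size_min;
	unsigned long long size_max;
};

#if SKETCH_FEATURE_ENABLED
/* Real implementation lives elsewhere. */
int sketch_resource_register(const char *name, unsigned long long size,
			     const struct sketch_size_params *params);
#else
/* Stub: argument list kept identical to the prototype above. */
static inline int
sketch_resource_register(const char *name, unsigned long long size,
			 const struct sketch_size_params *params)
{
	(void)name;
	(void)size;
	(void)params;
	return 0;
}
#endif

/* A caller written against the prototype builds in both configurations. */
static int sketch_caller(void)
{
	struct sketch_size_params p = { 1, 2 };

	return sketch_resource_register("kvd", 64, &p);
}
```

If an argument is dropped from only one of the two branches, the caller
fails to compile in the other configuration, which is what the kbuild
robot hit.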

^ permalink raw reply

* [PATCH net] udp6: set dst cache for a connected sk before udp_v6_send_skb
From: Alexey Kodanev @ 2018-03-23 14:39 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Alexey Kodanev

After commit 33c162a980fe ("ipv6: datagram: Update dst cache of a
connected datagram sk during pmtu update"), when the error occurs on
sending datagram in udpv6_sendmsg() due to ICMPV6_PKT_TOOBIG type,
error handler can trigger the following path and call ip6_dst_store():

    udpv6_err()
        ip6_sk_update_pmtu()
            ip6_datagram_dst_update()
                ip6_dst_lookup_flow(), can create a RTF_CACHE clone
                ...
                ip6_dst_store()

This can happen before a connected UDP socket invokes ip6_dst_store()
at the end of udpv6_sendmsg(), on destination release. As a result,
that later call restores the old dst, so the next udpv6_sendmsg() call
does not get the updated dst cache.

This patch moves ip6_dst_store() in udpv6_sendmsg(), so that it is
invoked after ip6_sk_dst_lookup_flow() and before udp_v6_send_skb().

Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
---
 net/ipv6/udp.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 52e3ea0..0d413c6 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1299,6 +1299,16 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	if (ipc6.hlimit < 0)
 		ipc6.hlimit = ip6_sk_dst_hoplimit(np, &fl6, dst);
 
+	if (connected)
+		ip6_dst_store(sk, dst,
+			      ipv6_addr_equal(&fl6.daddr, &sk->sk_v6_daddr) ?
+			      &sk->sk_v6_daddr : NULL,
+#ifdef CONFIG_IPV6_SUBTREES
+			      ipv6_addr_equal(&fl6.saddr, &np->saddr) ?
+			      &np->saddr :
+#endif
+			      NULL);
+
 	if (msg->msg_flags&MSG_CONFIRM)
 		goto do_confirm;
 back_from_confirm:
@@ -1350,18 +1360,8 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 release_dst:
 	if (dst) {
-		if (connected) {
-			ip6_dst_store(sk, dst,
-				      ipv6_addr_equal(&fl6.daddr, &sk->sk_v6_daddr) ?
-				      &sk->sk_v6_daddr : NULL,
-#ifdef CONFIG_IPV6_SUBTREES
-				      ipv6_addr_equal(&fl6.saddr, &np->saddr) ?
-				      &np->saddr :
-#endif
-				      NULL);
-		} else {
+		if (!connected)
 			dst_release(dst);
-		}
 		dst = NULL;
 	}
 
-- 
1.8.3.1
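
The ordering problem from the changelog can be shown with a
single-threaded simulation of the interleaving. This is only a sketch
under simplifying assumptions: the socket, the dst values and every
function name are hypothetical stand-ins, and the error handler is
modelled as a plain store in the middle of the send path.

```c
struct sketch_sock {
	int dst_cache;
};

enum { SKETCH_DST_OLD = 1, SKETCH_DST_NEW = 2 };

static void sketch_dst_store(struct sketch_sock *sk, int dst)
{
	sk->dst_cache = dst;
}

/* Before the patch: the store happens last, on destination release. */
static int sendmsg_store_last(struct sketch_sock *sk)
{
	int dst = SKETCH_DST_OLD;		/* result of the route lookup */

	sketch_dst_store(sk, SKETCH_DST_NEW);	/* PMTU error handler runs */
	sketch_dst_store(sk, dst);		/* overwrites with the old dst */
	return sk->dst_cache;
}

/* After the patch: the store happens right after the lookup. */
static int sendmsg_store_first(struct sketch_sock *sk)
{
	int dst = SKETCH_DST_OLD;		/* result of the route lookup */

	sketch_dst_store(sk, dst);		/* store before sending */
	sketch_dst_store(sk, SKETCH_DST_NEW);	/* handler update survives */
	return sk->dst_cache;
}
```

In the first variant the cache ends up holding the old dst; in the
second, the update made by the error handler is what the next send sees.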

^ permalink raw reply related

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
From: David Ahern @ 2018-03-23 14:31 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski,
	David Ahern
In-Reply-To: <20180323065010.GM2074@nanopsycho.orion>

On 3/23/18 12:50 AM, Jiri Pirko wrote:
>> +void nsim_devlink_setup(struct netdevsim *ns)
>> +{
>> +	struct net *net = dev_net(ns->netdev);
>> +	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>> +	struct devlink *devlink;
>> +	int err = -ENOMEM;
>> +
>> +	/* only one device per namespace controls devlink */
>> +	if (!*reg_devlink) {
>> +		ns->devlink = NULL;
>> +		return;
>> +	}
>> +
>> +	devlink = devlink_alloc(&nsim_devlink_ops, 0);
>> +	if (!devlink)
>> +		return;
>> +
>> +	devlink_net_set(devlink, net);
>> +	err = devlink_register(devlink, &ns->dev);
> 
> This reg_devlink construct looks odd. Why don't you leave the devlink
> instance in init_ns?

It is a per-network namespace resource controller. Since struct devlink
has a net entry, the simplest design is to put it into the namespace of
the controller. Without it, controlling resource sizes in namespace
'foobar' has to be done from init_net, which is just wrong.

^ permalink raw reply

* [PATCH] of_net: Implement of_get_nvmem_mac_address helper
From: Mike Looijmans @ 2018-03-23 14:24 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, devicetree, andrew, f.fainelli, robh+dt,
	frowand.list, Mike Looijmans

It's common practice to store MAC addresses for network interfaces in
nvmem devices. However, the kernel lacks the code to actually do this,
so this patch adds of_get_nvmem_mac_address() for drivers to obtain the
address from an nvmem cell provider.

This is particularly useful on devices where the ethernet interface cannot
be configured by the bootloader, for example because it's in an FPGA.

Tested by adapting the cadence macb driver to call this instead of
of_get_mac_address().

Signed-off-by: Mike Looijmans <mike.looijmans@topic.nl>
---
 drivers/of/of_net.c    | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/of_net.h |  6 ++++++
 2 files changed, 54 insertions(+)

diff --git a/drivers/of/of_net.c b/drivers/of/of_net.c
index d820f3e..316a537 100644
--- a/drivers/of/of_net.c
+++ b/drivers/of/of_net.c
@@ -7,6 +7,7 @@
  */
 #include <linux/etherdevice.h>
 #include <linux/kernel.h>
+#include <linux/nvmem-consumer.h>
 #include <linux/of_net.h>
 #include <linux/phy.h>
 #include <linux/export.h>
@@ -80,3 +81,50 @@ const void *of_get_mac_address(struct device_node *np)
 	return of_get_mac_addr(np, "address");
 }
 EXPORT_SYMBOL(of_get_mac_address);
+
+/**
+ * Search the device tree for a MAC address, by calling of_get_mac_address
+ * and if that doesn't provide an address, fetch it from an nvmem provider
+ * using the name 'mac-address'.
+ * On success, copies the new address into memory pointed to by addr and
+ * returns 0. Returns a negative error code otherwise.
+ * @dev:	Pointer to the device containing the device_node
+ * @addr:	Pointer to receive the MAC address using ether_addr_copy()
+ */
+int of_get_nvmem_mac_address(struct device *dev, char *addr)
+{
+	const char *mac;
+	struct nvmem_cell *cell;
+	size_t len;
+	int ret;
+
+	mac = of_get_mac_address(dev->of_node);
+	if (mac) {
+		ether_addr_copy(addr, mac);
+		return 0;
+	}
+
+	cell = nvmem_cell_get(dev, "mac-address");
+	if (IS_ERR(cell))
+		return PTR_ERR(cell);
+
+	mac = (const char *)nvmem_cell_read(cell, &len);
+
+	nvmem_cell_put(cell);
+
+	if (IS_ERR(mac))
+		return PTR_ERR(mac);
+
+	if (len < 6 || !is_valid_ether_addr(mac)) {
+		dev_err(dev, "MAC address from NVMEM is invalid\n");
+		ret = -EINVAL;
+	} else {
+		ether_addr_copy(addr, mac);
+		ret = 0;
+	}
+
+	kfree(mac);
+
+	return ret;
+}
+EXPORT_SYMBOL(of_get_nvmem_mac_address);
diff --git a/include/linux/of_net.h b/include/linux/of_net.h
index 9cd72aa..0d52e1d 100644
--- a/include/linux/of_net.h
+++ b/include/linux/of_net.h
@@ -13,6 +13,7 @@
 struct net_device;
 extern int of_get_phy_mode(struct device_node *np);
 extern const void *of_get_mac_address(struct device_node *np);
+extern int of_get_nvmem_mac_address(struct device *dev, char *addr);
 extern struct net_device *of_find_net_device_by_node(struct device_node *np);
 #else
 static inline int of_get_phy_mode(struct device_node *np)
@@ -25,6 +26,11 @@ static inline const void *of_get_mac_address(struct device_node *np)
 	return NULL;
 }
 
+static inline int of_get_nvmem_mac_address(struct device *dev, char *addr)
+{
+	return -ENODEV;
+}
+
 static inline struct net_device *of_find_net_device_by_node(struct device_node *np)
 {
 	return NULL;
-- 
1.9.1
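
The validation step in the patch (length check plus
is_valid_ether_addr()) can be tried out in userspace. The sketch below
reimplements the check on the assumption that a valid address is one
that is neither all-zero nor multicast; the names are hypothetical and
-1 stands in for -EINVAL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6

/* Userspace stand-in for is_valid_ether_addr(). */
static bool sketch_is_valid_ether_addr(const unsigned char *a)
{
	static const unsigned char zero[SKETCH_ETH_ALEN];

	if (memcmp(a, zero, SKETCH_ETH_ALEN) == 0)
		return false;		/* all-zero address */
	return !(a[0] & 0x01);		/* group bit set means multicast */
}

/*
 * Validate a buffer read from an nvmem cell and copy it to out.
 * Returns 0 on success, -1 if the data is too short or not a valid
 * unicast address.
 */
static int sketch_check_mac(const unsigned char *buf, size_t len,
			    unsigned char *out)
{
	if (len < SKETCH_ETH_ALEN || !sketch_is_valid_ether_addr(buf))
		return -1;
	memcpy(out, buf, SKETCH_ETH_ALEN);
	return 0;
}
```

A locally administered unicast address such as 02:00:00:00:00:01 passes,
while a short buffer, an all-zero cell or a multicast address is
rejected.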

^ permalink raw reply related

* Re: linux-next: manual merge of the net-next tree with the rdma-fixes tree
From: David Miller @ 2018-03-23 14:22 UTC (permalink / raw)
  To: jgg; +Cc: dledford, sfr, netdev, linux-next, linux-kernel, markb, leonro
In-Reply-To: <20180323043315.GB13185@mellanox.com>

From: Jason Gunthorpe <jgg@mellanox.com>
Date: Thu, 22 Mar 2018 22:33:15 -0600

> Doug and I moved to a shared repo location when we started maintaining it
> as a team:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
> 
> The commit is here:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?h=for-next&id=2d873449a202d02e0c4d90009fb2beb7013ac575

Thanks a lot.

^ permalink raw reply

* Re: [bpf-next V5 PATCH 11/15] page_pool: refurbish version of page_pool code
From: Jesper Dangaard Brouer @ 2018-03-23 14:15 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, BjörnTöpel, magnus.karlsson, eugenia,
	Jason Wang, John Fastabend, Eran Ben Elisha, Saeed Mahameed, galp,
	Daniel Borkmann, Alexei Starovoitov, Tariq Toukan, brouer
In-Reply-To: <b8463e12-d1eb-d862-c5f4-09fc0ac33382@gmail.com>

On Fri, 23 Mar 2018 06:29:55 -0700
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On 03/23/2018 05:18 AM, Jesper Dangaard Brouer wrote:
> 
> > +
> > +void page_pool_destroy_rcu(struct page_pool *pool)
> > +{
> > +	call_rcu(&pool->rcu, __page_pool_destroy_rcu);
> > +}
> > +EXPORT_SYMBOL(page_pool_destroy_rcu);
> >   
> 
> 
> Why do we need to respect one rcu grace period before destroying a page pool ?

Due to previous allocator ID patch, which can have a pointer reference
to a page_pool, and the allocator ID lookup uses RCU.

> In any case, this should be called page_pool_destroy()

Okay.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

