Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH rdma-next 0/3] Dump and fill MKEY
From: Jason Gunthorpe @ 2018-06-19 14:21 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Guy Levi,
	Yishai Hadas, Yonatan Cohen, Saeed Mahameed, linux-netdev
In-Reply-To: <20180619054724.32677-1-leon@kernel.org>

On Tue, Jun 19, 2018 at 08:47:21AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> MLX5 IB HCA offers the memory key, dump_fill_mkey to increase
> performance, when used in a send or receive operations.
> 
> It is used to force local HCA operations to skip the PCI bus access,
> while keeping track of the processed length in the ibv_sge handling.
> 
> In this three patch series, we expose various bits in our HW
> spec file (mlx5_ifc.h), move unneeded for mlx5_core FW command and
> export such memory key to user space thought our mlx5-abi header file.

Where is the user space for this?

Jason

^ permalink raw reply

* RE: [PATCH] net/phy: Micrel KSZ8061 PHY link failure after cable connect
From: Onnasch, Alexander (EXT) @ 2018-06-19 14:23 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <20180606123947.GB14898@lunn.ch>

Hi Andrew
thanks for the hint. But actually I cannot confirm - or I don't see it yet. 

Without having tested, just from the code, the struct phy_driver instance for PHY_ID_KSZ8061 in micrel.c does not have a .write_mmd function assigned, thus phy_write_mmd should evaluate to its else-clause (see below) and not to mdiobus_write (as in phy_write).

Also the ksz8061_extended_write() function which I have added uses the same principle as already existing HW-specific functions in micrel.c for simular reasons (kszphy_extended_write and ksz9031_extended_write).
They use phy_write all over the place in that file and never phy_write_mmd - for whatever reason they had.
Thus I thought it would be a good idea ...

best regards, Alex


static inline int phy_write(struct phy_device *phydev, u32 regnum, u16 val)
{
	return mdiobus_write(phydev->mdio.bus, phydev->mdio.addr, regnum, val);
}



int phy_write_mmd(struct phy_device *phydev, int devad, u32 regnum, u16 val)
{
	int ret;

	if (regnum > (u16)~0 || devad > 32)
		return -EINVAL;

	if (phydev->drv->write_mmd) {
		ret = phydev->drv->write_mmd(phydev, devad, regnum, val);
	} else if (phydev->is_c45) {
		u32 addr = MII_ADDR_C45 | (devad << 16) | (regnum & 0xffff);

		ret = mdiobus_write(phydev->mdio.bus, phydev->mdio.addr,
				    addr, val);
	} else {
		struct mii_bus *bus = phydev->mdio.bus;
		int phy_addr = phydev->mdio.addr;

		mutex_lock(&bus->mdio_lock);
		mmd_phy_indirect(bus, phy_addr, devad, regnum);

		/* Write the data into MMD's selected register */
		bus->write(bus, phy_addr, MII_MMD_DATA, val);
		mutex_unlock(&bus->mdio_lock);

		ret = 0;
	}
	return ret;
}

-----Original Message-----
From: Andrew Lunn <andrew@lunn.ch> 
Sent: Mittwoch, 6. Juni 2018 14:40
To: Onnasch, Alexander (EXT) <Alexander.Onnasch@landisgyr.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>; netdev@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] net/phy: Micrel KSZ8061 PHY link failure after cable connect

On Wed, Jun 06, 2018 at 10:03:18AM +0200, Alexander Onnasch wrote:
> With Micrel KSZ8061 PHY, the link may occasionally not come up after 
> Ethernet cable connect. The vendor's (Microchip, former Micrel) errata 
> sheet 80000688A.pdf describes the problem and possible workarounds in 
> detail, see below.
> The patch implements workaround 1, which permanently fixes the issue.
> 
> DESCRIPTION
> Link-up may not occur properly when the Ethernet cable is initially 
> connected. This issue occurs more commonly when the cable is connected 
> slowly, but it may occur any time a cable is connected. This issue 
> occurs in the auto-negotiation circuit, and will not occur if 
> auto-negotiation is disabled (which requires that the two link 
> partners be set to the same speed and duplex).
> 
> END USER IMPLICATIONS
> When this issue occurs, link is not established. Subsequent cable 
> plug/unplug cycles will not correct the issue.
> 
> WORK AROUND
> There are four approaches to work around this issue:
> 1.  This issue can be prevented by setting bit 15 in MMD device address 1,
>     register 2, prior to connecting the  cable or prior to setting the
>     Restart Auto-Negotiation bit in register 0h.The MMD registers are
>     accessed via the indirect access registers Dh and Eh, or via the Micrel
>     EthUtil utility as shown here:
>     •  If using the EthUtil utility (usually with a Micrel KSZ8061
>        Evaluation Board), type the following commands:
>        > address 1
>        > mmd 1
>        > iw 2 b61a
>     •  Alternatively, write the following registers to write to the
>        indirect MMD register:
>        Write register Dh, data 0001h
>        Write register Eh, data 0002h
>        Write register Dh, data 4001h
>        Write register Eh, data B61Ah
> 2.  The issue can be avoided by disabling auto-negotiation in the KSZ8061,
>     either by the strapping option, or by clearing bit 12 in register 0h.
>     Care must be taken to ensure that the KSZ8061 and the link partner
>     will link with the same speed and duplex. Note that the KSZ8061
>     defaults to full-duplex when auto-negotiation is off, but other
>     devices may default to half-duplex in the event of failed
>     auto-negotiation.
> 3.  The issue can be avoided by connecting the cable prior to powering-up
>     or resetting the KSZ8061, and leaving it plugged in thereafter.
> 4.  If the above measures are not taken and the problem occurs, link can
>     be recovered by setting the Restart Auto-Negotiation bit in
>     register 0h, or by resetting or power cycling the device. Reset may
>     be either hardware reset or software reset (register 0h, bit 15).
> 
> PLAN
> This errata will not be corrected in a future revision.
> 
> Signed-off-by: Alexander Onnasch <alexander.onnasch@landisgyr.com>
> ---
>  drivers/net/phy/micrel.c | 24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c index 
> 6c45ff6..7d80a00 100644
> --- a/drivers/net/phy/micrel.c
> +++ b/drivers/net/phy/micrel.c
> @@ -339,6 +339,28 @@ static int ksz8041_config_aneg(struct phy_device *phydev)
>  	return genphy_config_aneg(phydev);
>  }
>  
> +#define MII_KSZ8061RN_MMD_CTRL_REG	    0x0d
> +#define MII_KSZ8061RN_MMD_REGDATA_REG	0x0e
> +
> +static int ksz8061_extended_write(struct phy_device *phydev,
> +				  u8 mode, u32 dev_addr, u32 regnum, u16 val) {
> +	phy_write(phydev, MII_KSZ8061RN_MMD_CTRL_REG, dev_addr);
> +	phy_write(phydev, MII_KSZ8061RN_MMD_REGDATA_REG, regnum);
> +	phy_write(phydev, MII_KSZ8061RN_MMD_CTRL_REG, (mode << 14) | dev_addr);
> +	return phy_write(phydev, MII_KSZ8061RN_MMD_REGDATA_REG, val); }

Hi Alexander

This looks a lot like phy_write_mmd().

     Andrew

^ permalink raw reply

* Re: [PATCH rdma-next 0/3] Dump and fill MKEY
From: Leon Romanovsky @ 2018-06-19 14:26 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, RDMA mailing list, Guy Levi, Yishai Hadas,
	Yonatan Cohen, Saeed Mahameed, linux-netdev
In-Reply-To: <20180619142106.GA27457@mellanox.com>

On Tue, Jun 19, 2018 at 08:21:06AM -0600, Jason Gunthorpe wrote:
> On Tue, Jun 19, 2018 at 08:47:21AM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@mellanox.com>
> >
> > MLX5 IB HCA offers the memory key, dump_fill_mkey to increase
> > performance, when used in a send or receive operations.
> >
> > It is used to force local HCA operations to skip the PCI bus access,
> > while keeping track of the processed length in the ibv_sge handling.
> >
> > In this three patch series, we expose various bits in our HW
> > spec file (mlx5_ifc.h), move unneeded for mlx5_core FW command and
> > export such memory key to user space thought our mlx5-abi header file.
>
> Where is the user space for this?

It will be published in the near future.

Thanks

>
> Jason

^ permalink raw reply

* Re: [PATCH bpf-net] bpf: Change bpf_fib_lookup to return lookup status
From: David Ahern @ 2018-06-19 15:02 UTC (permalink / raw)
  To: Quentin Monnet, dsahern, netdev, borkmann, ast; +Cc: davem
In-Reply-To: <a856c7da-da18-9460-3eff-4d25ed6852ea@netronome.com>

On 6/19/18 3:36 AM, Quentin Monnet wrote:
> Since you are about to respin (I think?), could you please also fix the
> formatting in your change to the doc? The "BPF_FIB_LKUP_RET_" is not
> emphasized (and will even cause an error message when producing the man
> page, because of the trailing underscore that gets interpreted in RST),
> and the three cases for the return value are not formatted properly for
> the conversion.
> 
> Something like the following would work:
> 
> ---
>  *     Return
>  *		* < 0 if any input argument is invalid.
>  *		*   0 on success (packet is forwarded and nexthop neighbor exists).
>  *		* > 0: one of **BPF_FIB_LKUP_RET_** codes on FIB lookup response.
> ---
> 

Will do. thanks for the review.

^ permalink raw reply

* Re: [RFC PATCH ghak86 V1] audit: eliminate audit_enabled magic number comparison
From: Paul Moore @ 2018-06-19 15:10 UTC (permalink / raw)
  To: rgb
  Cc: linux-audit, linux-kernel, netdev, netfilter-devel,
	linux-security-module, Eric Paris, sgrubb
In-Reply-To: <490a00a7902582823fe8c532f5dd995a1da61fb1.1528214962.git.rgb@redhat.com>

On Tue, Jun 5, 2018 at 7:22 PM Richard Guy Briggs <rgb@redhat.com> wrote:
>
> Remove comparison of audit_enabled to magic numbers outside of audit.
>
> Related: https://github.com/linux-audit/audit-kernel/issues/86
>
> Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
> ---
>  drivers/tty/tty_audit.c      | 2 +-
>  include/linux/audit.h        | 5 ++++-
>  include/net/xfrm.h           | 2 +-
>  kernel/audit.c               | 3 ---
>  net/netfilter/xt_AUDIT.c     | 2 +-
>  net/netlabel/netlabel_user.c | 2 +-
>  6 files changed, 8 insertions(+), 8 deletions(-)

Merged, thanks.

> diff --git a/drivers/tty/tty_audit.c b/drivers/tty/tty_audit.c
> index e30aa6b..50f567b 100644
> --- a/drivers/tty/tty_audit.c
> +++ b/drivers/tty/tty_audit.c
> @@ -92,7 +92,7 @@ static void tty_audit_buf_push(struct tty_audit_buf *buf)
>  {
>         if (buf->valid == 0)
>                 return;
> -       if (audit_enabled == 0) {
> +       if (audit_enabled == AUDIT_OFF) {
>                 buf->valid = 0;
>                 return;
>         }
> diff --git a/include/linux/audit.h b/include/linux/audit.h
> index 69c7847..9334fbe 100644
> --- a/include/linux/audit.h
> +++ b/include/linux/audit.h
> @@ -117,6 +117,9 @@ struct audit_field {
>
>  extern void audit_log_session_info(struct audit_buffer *ab);
>
> +#define AUDIT_OFF      0
> +#define AUDIT_ON       1
> +#define AUDIT_LOCKED   2
>  #ifdef CONFIG_AUDIT
>  /* These are defined in audit.c */
>                                 /* Public API */
> @@ -202,7 +205,7 @@ static inline int audit_log_task_context(struct audit_buffer *ab)
>  static inline void audit_log_task_info(struct audit_buffer *ab,
>                                        struct task_struct *tsk)
>  { }
> -#define audit_enabled 0
> +#define audit_enabled AUDIT_OFF
>  #endif /* CONFIG_AUDIT */
>
>  #ifdef CONFIG_AUDIT_COMPAT_GENERIC
> diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> index 7f2e31a..ce995a1 100644
> --- a/include/net/xfrm.h
> +++ b/include/net/xfrm.h
> @@ -734,7 +734,7 @@ static inline struct audit_buffer *xfrm_audit_start(const char *op)
>  {
>         struct audit_buffer *audit_buf = NULL;
>
> -       if (audit_enabled == 0)
> +       if (audit_enabled == AUDIT_OFF)
>                 return NULL;
>         audit_buf = audit_log_start(audit_context(), GFP_ATOMIC,
>                                     AUDIT_MAC_IPSEC_EVENT);
> diff --git a/kernel/audit.c b/kernel/audit.c
> index e7478cb..8442c65 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -83,9 +83,6 @@
>  #define AUDIT_INITIALIZED      1
>  static int     audit_initialized;
>
> -#define AUDIT_OFF      0
> -#define AUDIT_ON       1
> -#define AUDIT_LOCKED   2
>  u32            audit_enabled = AUDIT_OFF;
>  bool           audit_ever_enabled = !!AUDIT_OFF;
>
> diff --git a/net/netfilter/xt_AUDIT.c b/net/netfilter/xt_AUDIT.c
> index f368ee6..af883f1 100644
> --- a/net/netfilter/xt_AUDIT.c
> +++ b/net/netfilter/xt_AUDIT.c
> @@ -72,7 +72,7 @@ static bool audit_ip6(struct audit_buffer *ab, struct sk_buff *skb)
>         struct audit_buffer *ab;
>         int fam = -1;
>
> -       if (audit_enabled == 0)
> +       if (audit_enabled == AUDIT_OFF)
>                 goto errout;
>         ab = audit_log_start(NULL, GFP_ATOMIC, AUDIT_NETFILTER_PKT);
>         if (ab == NULL)
> diff --git a/net/netlabel/netlabel_user.c b/net/netlabel/netlabel_user.c
> index 2f328af..4676f5b 100644
> --- a/net/netlabel/netlabel_user.c
> +++ b/net/netlabel/netlabel_user.c
> @@ -101,7 +101,7 @@ struct audit_buffer *netlbl_audit_start_common(int type,
>         char *secctx;
>         u32 secctx_len;
>
> -       if (audit_enabled == 0)
> +       if (audit_enabled == AUDIT_OFF)
>                 return NULL;
>
>         audit_buf = audit_log_start(audit_context(), GFP_ATOMIC, type);
> --
> 1.8.3.1
>


-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* Re: [PATCH bpf-net] bpf: Change bpf_fib_lookup to return lookup status
From: David Ahern @ 2018-06-19 15:11 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: dsahern, netdev, borkmann, ast, davem
In-Reply-To: <20180618205537.2j645mfujdsqxf2b@kafai-mbp.dhcp.thefacebook.com>

On 6/18/18 2:55 PM, Martin KaFai Lau wrote:
>>
>> Arguably BPF_FIB_LKUP_RET_NO_NHDEV is not needed. See below.
>>

...

>>>> @@ -4252,16 +4277,19 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
>>>>  	if (check_mtu) {
>>>>  		mtu = ipv6_stub->ip6_mtu_from_fib6(f6i, dst, src);
>>>>  		if (params->tot_len > mtu)
>>>> -			return 0;
>>>> +			return BPF_FIB_LKUP_RET_FRAG_NEEDED;
>>>>  	}
>>>>  
>>>>  	if (f6i->fib6_nh.nh_lwtstate)
>>>> -		return 0;
>>>> +		return BPF_FIB_LKUP_RET_UNSUPP_LWT;
>>>>  
>>>>  	if (f6i->fib6_flags & RTF_GATEWAY)
>>>>  		*dst = f6i->fib6_nh.nh_gw;
>>>>  
>>>>  	dev = f6i->fib6_nh.nh_dev;
>>>> +	if (unlikely(!dev))
>>>> +		return BPF_FIB_LKUP_RET_NO_NHDEV;
>>> Is this a bug fix?
>>>
>>
>> Difference between IPv4 and IPv6. Making them consistent.
>>
>> It is a major BUG in the kernel to reach this point in either protocol
>> to have a unicast route not tied to a device. IPv4 has checks; v6 does
>> not. I figured this being new code, why not make bpf_ipv{4,6}_fib_lookup
>> as close to the same as possible.
> Make sense.  A comment in the commit log will be useful if there is a
> re-spin.
> 

Upon further review, I will remove BPF_FIB_LKUP_RET_NO_NHDEV. The dev
check is not needed in either ipv4 or ipv6. For IPv4 after fib_lookup
calls both __mkroute_input and ip_route_output_key_hash_rcu expect dev
to be set for unicast as it should be.

^ permalink raw reply

* Re: [PATCH] optoe: driver to read/write SFP/QSFP EEPROMs
From: Andrew Lunn @ 2018-06-19 15:15 UTC (permalink / raw)
  To: Don Bollinger
  Cc: Tom Lendacky, Arnd Bergmann, Greg Kroah-Hartman, linux-kernel,
	brandon_chuang, wally_wang, roy_lee, rick_burchett, quentin.chang,
	steven.noble, jeffrey.townsend, scotte, roopa, David Ahern,
	luke.williams, Guohan Lu, Russell King, netdev@vger.kernel.org
In-Reply-To: <20180618194127.uaeqlo3dy35qs3ip@thebollingers.org>

> > If you are using Linux as a boot loader, i doubt you will find any
> > network kernel developers who are willing to consider this driver. The
> 
> It isn't a boot loader.  It is the kernel that is running on the switch
> when it is doing its switch thing.  The kernel hosts the drivers and the
> switch SDK and all the apps that configure and manage the networking.

That is exactly using it as a boot loader. Linux network stack itself
has nothing to do with the switches. It just loads an application
which controls the switch from user space.

> This is real code, that fits a real need, and would like the
> benefits of being maintained as part of the mainline kernel.

Have you ever attended one of the netdev conferences?

https://www.netdevconf.org/2.1/submit-proposal.html

Netdev 2.1 is a community-driven conference geared towards Linux
netheads. Linux kernel networking and user space utilization of the
interfaces to the Linux kernel networking subsystem are the focus. If
you are using Linux as a boot system for proprietary networking, then
this conference _may not be for you_.

That gives you an idea of what the kernel community feels about this
sort of thing. Why should we help maintain code which brings no
benefit to the netdev community?

If you make use of the existing SFP code, extend it so it both
benefits the netdev kernel community and your own, then we might
consider merging your code. But optoe as it is brings no benefit to
the netdev community, so is unlikely to get merged.

    Andrew

^ permalink raw reply

* [PATCH] bpfilter: fix build error
From: Matteo Croce @ 2018-06-19 15:16 UTC (permalink / raw)
  To: netdev; +Cc: Alexei Starovoitov

bpfilter Makefile assumes that the system locale is en_US, and the
parsing of objdump output fails.
Set LC_ALL=C and, while at it, rewrite the objdump parsing so it spawns
only 2 processes instead of 7.

Fixes: d2ba09c17a064 ("net: add skeleton of bpfilter kernel module")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
---
 net/bpfilter/Makefile | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index e0bbe7583e58..dd86b022eff0 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -21,8 +21,10 @@ endif
 # which bpfilter_kern.c passes further into umh blob loader at run-time
 quiet_cmd_copy_umh = GEN $@
       cmd_copy_umh = echo ':' > $(obj)/.bpfilter_umh.o.cmd; \
-      $(OBJCOPY) -I binary -O `$(OBJDUMP) -f $<|grep format|cut -d' ' -f8` \
-      -B `$(OBJDUMP) -f $<|grep architecture|cut -d, -f1|cut -d' ' -f2` \
+      $(OBJCOPY) -I binary \
+          `LC_ALL=C objdump -f net/bpfilter/bpfilter_umh \
+          |awk -F' |,' '/file format/{print "-O",$$NF} \
+          /^architecture:/{print "-B",$$2}'` \
       --rename-section .data=.init.rodata $< $@
 
 $(obj)/bpfilter_umh.o: $(obj)/bpfilter_umh
-- 
2.17.1

^ permalink raw reply related

* iproute2 won't compile without AF_VSOCK
From: Steve Wise @ 2018-06-19 15:17 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev

Hey David,

I'm trying to compile the latest iproute2 on an RHEL-7.3 distro, and it
fails to compile because AF_VSOCK is not defined.  Should this
functionality be a configure option to disable it on older distros?


Thanks,

Steve.

----

misc
    CC       ss.o
ss.c:301:27: error: ‘AF_VSOCK’ undeclared here (not in a function)
   .families = FAMILY_MASK(AF_VSOCK),
                           ^
ss.c:252:46: note: in definition of macro ‘FAMILY_MASK’
 #define FAMILY_MASK(family) ((uint64_t)1 << (family))
                                              ^
ss.c:334:2: error: array index in initializer not of integer type
  [AF_VSOCK] = {
  ^
ss.c:334:2: error: (near initialization for ‘default_afs’)
make[1]: *** [ss.o] Error 1
make: *** [all] Error 2


^ permalink raw reply

* [PATCH] bpfilter: ignore binary files
From: Matteo Croce @ 2018-06-19 15:21 UTC (permalink / raw)
  To: netdev; +Cc: Alexei Starovoitov

net/bpfilter/bpfilter_umh is a binary file generated when bpfilter is
enabled, add it to .gitignore to avoid committing it.

Fixes: d2ba09c17a064 ("net: add skeleton of bpfilter kernel module")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
---
 net/bpfilter/.gitignore | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 net/bpfilter/.gitignore

diff --git a/net/bpfilter/.gitignore b/net/bpfilter/.gitignore
new file mode 100644
index 000000000000..e97084e3eea2
--- /dev/null
+++ b/net/bpfilter/.gitignore
@@ -0,0 +1 @@
+bpfilter_umh
-- 
2.17.1

^ permalink raw reply related

* Re: Link modes representation in phylib
From: Andrew Lunn @ 2018-06-19 15:21 UTC (permalink / raw)
  To: Maxime Chevallier
  Cc: davem, Russell King - ARM Linux, Florian Fainelli, netdev,
	Antoine Tenart, thomas.petazzoni@bootlin.com, Gregory CLEMENT,
	Miquel Raynal
In-Reply-To: <20180619113053.11df78a2@bootlin.com>

> What I propose is that we add 3 link_mode fields in phy_device, and keep
> the legacy fields for now. It would be up to the driver to fill the new
> "supported" field in config_init, kind of like what's done in the
> marvell10g driver.

Hi Maxime

You can do this conversion in the core. If features == 0, and some
bits are set in the features link_mode, do the conversion at probe
time. The same can be done for lp_advertising, when the call into the
drivers read_status() has completed.

> Would that be acceptable ?

It sounds reasonable. Lets see what the code looks like.

   Andrew

^ permalink raw reply

* Re: [PATCH bpf-net] bpf: Change bpf_fib_lookup to return lookup status
From: Martin KaFai Lau @ 2018-06-19 15:25 UTC (permalink / raw)
  To: David Ahern; +Cc: dsahern, netdev, borkmann, ast, davem
In-Reply-To: <1339f6f2-9dd3-886c-2178-7088b0ae4746@gmail.com>

On Mon, Jun 18, 2018 at 03:35:25PM -0600, David Ahern wrote:
> On 6/18/18 2:55 PM, Martin KaFai Lau wrote:
> >> 	/* rc > 0 case */
> >> 	switch(rc) {
> >> 	case BPF_FIB_LKUP_RET_BLACKHOLE:
> >> 	case BPF_FIB_LKUP_RET_UNREACHABLE:
> >> 	case BPF_FIB_LKUP_RET_PROHIBIT:
> >> 		return XDP_DROP;
> >> 	}
> >>
> >> For the others it becomes a question of do we share why the stack needs
> >> to be involved? Maybe the program wants to collect stats to show traffic
> >> patterns that can be improved (BPF_FIB_LKUP_RET_FRAG_NEEDED) or support
> >> in the kernel needs to be improved (BPF_FIB_LKUP_RET_UNSUPP_LWT) or an
> >> interface is misconfigured (BPF_FIB_LKUP_RET_FWD_DISABLED).
> > Thanks for the explanation.
> > 
> > Agree on the bpf able to collect stats will be useful.
> > 
> > I am wondering, if a new BPF_FIB_LKUP_RET_XYZ is added later,
> > how may the old xdp_prog work/not-work?  As of now, the return value
> > is straight forward, FWD, PASS (to stack) or DROP (error).
> > With this change, the xdp_prog needs to match/switch() the
> > BPF_FIB_LKUP_RET_* to at least PASS and DROP.
> 
> IMO, programs should only call XDP_DROP for known reasons - like the 3
> above. Anything else punt to the stack.
> 
> If a new RET_XYZ comes along:
> 1. the new XYZ is a new ACL response where the packet is to be dropped.
> If the program does not understand XYZ and punts to the stack
> (recommendation), then a second lookup is done during normal packet
> processing and the stack drops it.
> 
> 2. the new XYZ is a new path in the kernel that is unsupported with
> respect to XDP forwarding, nothing new for the program to do.
> 
> Either way I would expect stats on BPF_FIB_LKUP_RET_* to give a hint to
> the program writer.
> 
> Worst case of punting packets to the stack for any rc != 0 means the
> stack is doing 2 lookups - 1 in XDP based on its lookup parameters and 1
> in normal stack processing - to handle the packet.
Instead of having the xdp_prog to follow the meaning of what RET_SYZ is,
should the bpf_*_fib_lookup() return value be kept as is such that
the xdp_prog is clear what to do.  The reason can be returned in
the 'struct bpf_fib_lookup'.  The number of reasons can be extended.
If the xdp_prog does not understand a reason, it still will not
affect its decision because the return value is clear.
I think the situation here is similar to regular syscall which usually
uses -1 to clearly states error and errno to spells out the reason.

> 
> > 
> >>
> >> Arguably BPF_FIB_LKUP_RET_NO_NHDEV is not needed. See below.
> >>
> >>>> @@ -2612,6 +2613,19 @@ struct bpf_raw_tracepoint_args {
> >>>>  #define BPF_FIB_LOOKUP_DIRECT  BIT(0)
> >>>>  #define BPF_FIB_LOOKUP_OUTPUT  BIT(1)
> >>>>  
> >>>> +enum {
> >>>> +	BPF_FIB_LKUP_RET_SUCCESS,      /* lookup successful */
> >>>> +	BPF_FIB_LKUP_RET_BLACKHOLE,    /* dest is blackholed */
> >>>> +	BPF_FIB_LKUP_RET_UNREACHABLE,  /* dest is unreachable */
> >>>> +	BPF_FIB_LKUP_RET_PROHIBIT,     /* dest not allowed */
> >>>> +	BPF_FIB_LKUP_RET_NOT_FWDED,    /* pkt is not forwardded */
> >>> BPF_FIB_LKUP_RET_NOT_FWDED is a catch all?
> >>>
> >>
> >> Destination is local. More precisely, the FIB lookup is not unicast so
> >> not forwarded. It could be RTN_LOCAL, RTN_BROADCAST, RTN_ANYCAST, or
> >> RTN_MULTICAST. The next ones -- blackhole, reachable, prohibit -- are
> >> called out.
> > I think it also includes the tbid not found case.
> 
> Another one of those "should never happen scenarios". The user does not
> specify the table; it is retrieved based on device association. Table
> defaults to the main table - which always exists - and any VRF
> enslavement of a device happens after the VRF device creates the table.
> 
> > 
> >>
> >>>> @@ -4252,16 +4277,19 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
> >>>>  	if (check_mtu) {
> >>>>  		mtu = ipv6_stub->ip6_mtu_from_fib6(f6i, dst, src);
> >>>>  		if (params->tot_len > mtu)
> >>>> -			return 0;
> >>>> +			return BPF_FIB_LKUP_RET_FRAG_NEEDED;
> >>>>  	}
> >>>>  
> >>>>  	if (f6i->fib6_nh.nh_lwtstate)
> >>>> -		return 0;
> >>>> +		return BPF_FIB_LKUP_RET_UNSUPP_LWT;
> >>>>  
> >>>>  	if (f6i->fib6_flags & RTF_GATEWAY)
> >>>>  		*dst = f6i->fib6_nh.nh_gw;
> >>>>  
> >>>>  	dev = f6i->fib6_nh.nh_dev;
> >>>> +	if (unlikely(!dev))
> >>>> +		return BPF_FIB_LKUP_RET_NO_NHDEV;
> >>> Is this a bug fix?
> >>>
> >>
> >> Difference between IPv4 and IPv6. Making them consistent.
> >>
> >> It is a major BUG in the kernel to reach this point in either protocol
> >> to have a unicast route not tied to a device. IPv4 has checks; v6 does
> >> not. I figured this being new code, why not make bpf_ipv{4,6}_fib_lookup
> >> as close to the same as possible.
> > Make sense.  A comment in the commit log will be useful if there is a
> > re-spin.
> > 
> 
> ok.

^ permalink raw reply

* Re: [PATCH] net/phy: Micrel KSZ8061 PHY link failure after cable connect
From: Andrew Lunn @ 2018-06-19 15:28 UTC (permalink / raw)
  To: Onnasch, Alexander (EXT)
  Cc: Florian Fainelli, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <AM6PR01MB4262F8BB6004D246DFF5C3FEF0700@AM6PR01MB4262.eurprd01.prod.exchangelabs.com>

On Tue, Jun 19, 2018 at 02:23:41PM +0000, Onnasch, Alexander (EXT) wrote:
> Hi Andrew
> thanks for the hint. But actually I cannot confirm - or I don't see it yet. 
> 
> Without having tested, just from the code, the struct phy_driver instance for PHY_ID_KSZ8061 in micrel.c does not have a .write_mmd function assigned, thus phy_write_mmd should evaluate to its else-clause (see below) and not to mdiobus_write (as in phy_write).
> 
> Also the ksz8061_extended_write() function which I have added uses the same principle as already existing HW-specific functions in micrel.c for simular reasons (kszphy_extended_write and ksz9031_extended_write).
> They use phy_write all over the place in that file and never phy_write_mmd - for whatever reason they had.
> Thus I thought it would be a good idea ...

Hi Alexander

Please don't top post. And wrap your lines at around 75 characters 

> 		struct mii_bus *bus = phydev->mdio.bus;
> 		int phy_addr = phydev->mdio.addr;
> 
> 		mutex_lock(&bus->mdio_lock);
> 		mmd_phy_indirect(bus, phy_addr, devad, regnum);
> 
> 		/* Write the data into MMD's selected register */
> 		bus->write(bus, phy_addr, MII_MMD_DATA, val);
> 		mutex_unlock(&bus->mdio_lock);


> > +static int ksz8061_extended_write(struct phy_device *phydev,
> > +				  u8 mode, u32 dev_addr, u32 regnum, u16 val) {
> > +	phy_write(phydev, MII_KSZ8061RN_MMD_CTRL_REG, dev_addr);
> > +	phy_write(phydev, MII_KSZ8061RN_MMD_REGDATA_REG, regnum);
> > +	phy_write(phydev, MII_KSZ8061RN_MMD_CTRL_REG, (mode << 14) | dev_addr);
> > +	return phy_write(phydev, MII_KSZ8061RN_MMD_REGDATA_REG, val); }
> 
> Hi Alexander
> 
> This looks a lot like phy_write_mmd().

Look closely at the two implementations. Look at what
mmd_phy_indirect() does. I _think_ these are identical. So don't add
your own helper, please use the core code.

     Andrew

^ permalink raw reply

* Re: [PATCH bpf-net] bpf: Change bpf_fib_lookup to return lookup status
From: David Ahern @ 2018-06-19 15:34 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: dsahern, netdev, borkmann, ast, davem
In-Reply-To: <20180619152529.rkzeyyqgmiwsvjp6@kafai-mbp.dhcp.thefacebook.com>

On 6/19/18 9:25 AM, Martin KaFai Lau wrote:
> On Mon, Jun 18, 2018 at 03:35:25PM -0600, David Ahern wrote:
>> On 6/18/18 2:55 PM, Martin KaFai Lau wrote:
>>>> 	/* rc > 0 case */
>>>> 	switch(rc) {
>>>> 	case BPF_FIB_LKUP_RET_BLACKHOLE:
>>>> 	case BPF_FIB_LKUP_RET_UNREACHABLE:
>>>> 	case BPF_FIB_LKUP_RET_PROHIBIT:
>>>> 		return XDP_DROP;
>>>> 	}
>>>>
>>>> For the others it becomes a question of do we share why the stack needs
>>>> to be involved? Maybe the program wants to collect stats to show traffic
>>>> patterns that can be improved (BPF_FIB_LKUP_RET_FRAG_NEEDED) or support
>>>> in the kernel needs to be improved (BPF_FIB_LKUP_RET_UNSUPP_LWT) or an
>>>> interface is misconfigured (BPF_FIB_LKUP_RET_FWD_DISABLED).
>>> Thanks for the explanation.
>>>
>>> Agree on the bpf able to collect stats will be useful.
>>>
>>> I am wondering, if a new BPF_FIB_LKUP_RET_XYZ is added later,
>>> how may the old xdp_prog work/not-work?  As of now, the return value
>>> is straight forward, FWD, PASS (to stack) or DROP (error).
>>> With this change, the xdp_prog needs to match/switch() the
>>> BPF_FIB_LKUP_RET_* to at least PASS and DROP.
>>
>> IMO, programs should only call XDP_DROP for known reasons - like the 3
>> above. Anything else punt to the stack.
>>
>> If a new RET_XYZ comes along:
>> 1. the new XYZ is a new ACL response where the packet is to be dropped.
>> If the program does not understand XYZ and punts to the stack
>> (recommendation), then a second lookup is done during normal packet
>> processing and the stack drops it.
>>
>> 2. the new XYZ is a new path in the kernel that is unsupported with
>> respect to XDP forwarding, nothing new for the program to do.
>>
>> Either way I would expect stats on BPF_FIB_LKUP_RET_* to give a hint to
>> the program writer.
>>
>> Worst case of punting packets to the stack for any rc != 0 means the
>> stack is doing 2 lookups - 1 in XDP based on its lookup parameters and 1
>> in normal stack processing - to handle the packet.
> Instead of having the xdp_prog to follow the meaning of what RET_SYZ is,
> should the bpf_*_fib_lookup() return value be kept as is such that
> the xdp_prog is clear what to do.  The reason can be returned in
> the 'struct bpf_fib_lookup'.  The number of reasons can be extended.
> If the xdp_prog does not understand a reason, it still will not
> affect its decision because the return value is clear.
> I think the situation here is similar to regular syscall which usually
> uses -1 to clearly states error and errno to spells out the reason.
> 

I did consider returning the status in struct bpf_fib_lookup. However,
it is 64 bytes and can not be extended without a big performance
penalty, so the only option there is to make an existing entry a union
the most logical of which is the ifindex. It seemed odd to me to have
the result by hidden in the struct as a union on ifindex and returning
the egress index from the function:

@@ -2625,7 +2636,11 @@ struct bpf_fib_lookup {

        /* total length of packet from network header - used for MTU
check */
        __u16   tot_len;
-       __u32   ifindex;  /* L3 device index for lookup */
+
+       union {
+               __u32   ifindex;  /* input: L3 device index for lookup */
+               __u32   result;   /* output: one of BPF_FIB_LKUP_RET_* */
+       };


It seemed more natural to have ifindex stay ifindex and only change
value on return:

@@ -2625,7 +2639,11 @@ struct bpf_fib_lookup {

 	/* total length of packet from network header - used for MTU check */
 	__u16	tot_len;
-	__u32	ifindex;  /* L3 device index for lookup */
+
+	/* input: L3 device index for lookup
+	 * output: nexthop device index from FIB lookup
+	 */
+	__u32	ifindex;

 	union {
 		/* inputs to lookup */


>From a program's perspective:

rc < 0  -- program is passing incorrect data
rc == 0 -- packet can be forwarded
rc > 0  -- packet can not be forwarded.

BPF programs are not required to track the LKUP_RET values any more than
a function returning multiple negative values - the caller just checks
rc < 0 means failure. If the program cares it can look at specific
values of rc to see the specific value.

The same applies with the LKUP_RET values - they are there to provide
insight into why the packet is not forwarded directly if the program
cares to know why.

^ permalink raw reply

* [PATCH] net: stmmac: socfpga: add additional ocp reset line for Stratix10
From: Dinh Nguyen @ 2018-06-19 15:35 UTC (permalink / raw)
  To: netdev
  Cc: dinguyen, davem, joabreu, alexandre.torgue, peppe.cavallaro,
	linux-kernel

The Stratix10 platform has an additional reset line, OCP(Open Core Protocol),
that also needs to get deasserted for the stmmac ethernet controller to work.
Thus we need to update the Kconfig to include ARCH_STRATIX10 in order to build
dwmac-socfpga.

Also, remove the redundant check for the reset controller pointer. The
reset driver already checks for the pointer and returns 0 if the pointer
is NULL.

Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
---
 drivers/net/ethernet/stmicro/stmmac/Kconfig         |  2 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c | 18 ++++++++++++++----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig
index cb5b0f5..edf2036 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
+++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
@@ -111,7 +111,7 @@ config DWMAC_ROCKCHIP
 config DWMAC_SOCFPGA
 	tristate "SOCFPGA dwmac support"
 	default ARCH_SOCFPGA
-	depends on OF && (ARCH_SOCFPGA || COMPILE_TEST)
+	depends on OF && (ARCH_SOCFPGA || ARCH_STRATIX10 || COMPILE_TEST)
 	select MFD_SYSCON
 	help
 	  Support for ethernet controller on Altera SOCFPGA
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c
index 6e35957..5b3b06a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c
@@ -55,6 +55,7 @@ struct socfpga_dwmac {
 	struct	device *dev;
 	struct regmap *sys_mgr_base_addr;
 	struct reset_control *stmmac_rst;
+	struct reset_control *stmmac_ocp_rst;
 	void __iomem *splitter_base;
 	bool f2h_ptp_ref_clk;
 	struct tse_pcs pcs;
@@ -262,8 +263,8 @@ static int socfpga_dwmac_set_phy_mode(struct socfpga_dwmac *dwmac)
 		val = SYSMGR_EMACGRP_CTRL_PHYSEL_ENUM_GMII_MII;
 
 	/* Assert reset to the enet controller before changing the phy mode */
-	if (dwmac->stmmac_rst)
-		reset_control_assert(dwmac->stmmac_rst);
+	reset_control_assert(dwmac->stmmac_ocp_rst);
+	reset_control_assert(dwmac->stmmac_rst);
 
 	regmap_read(sys_mgr_base_addr, reg_offset, &ctrl);
 	ctrl &= ~(SYSMGR_EMACGRP_CTRL_PHYSEL_MASK << reg_shift);
@@ -288,8 +289,8 @@ static int socfpga_dwmac_set_phy_mode(struct socfpga_dwmac *dwmac)
 	/* Deassert reset for the phy configuration to be sampled by
 	 * the enet controller, and operation to start in requested mode
 	 */
-	if (dwmac->stmmac_rst)
-		reset_control_deassert(dwmac->stmmac_rst);
+	reset_control_deassert(dwmac->stmmac_ocp_rst);
+	reset_control_deassert(dwmac->stmmac_rst);
 	if (phymode == PHY_INTERFACE_MODE_SGMII) {
 		if (tse_pcs_init(dwmac->pcs.tse_pcs_base, &dwmac->pcs) != 0) {
 			dev_err(dwmac->dev, "Unable to initialize TSE PCS");
@@ -324,6 +325,15 @@ static int socfpga_dwmac_probe(struct platform_device *pdev)
 		goto err_remove_config_dt;
 	}
 
+	dwmac->stmmac_ocp_rst = devm_reset_control_get_optional(dev, "stmmaceth-ocp");
+	if (IS_ERR(dwmac->stmmac_ocp_rst)) {
+		ret = PTR_ERR(dwmac->stmmac_ocp_rst);
+		dev_err(dev, "error getting reset control of ocp %d\n", ret);
+		goto err_remove_config_dt;
+	}
+
+	reset_control_deassert(dwmac->stmmac_ocp_rst);
+
 	ret = socfpga_dwmac_parse_data(dwmac, dev);
 	if (ret) {
 		dev_err(dev, "Unable to parse OF data\n");
-- 
2.7.4

^ permalink raw reply related

* Re: iproute2 won't compile without AF_VSOCK
From: Stephen Hemminger @ 2018-06-19 15:47 UTC (permalink / raw)
  To: Steve Wise; +Cc: David Ahern, netdev
In-Reply-To: <c06b38c7-1680-806d-5b93-a7e04313183d@opengridcomputing.com>

On Tue, 19 Jun 2018 10:17:45 -0500
Steve Wise <swise@opengridcomputing.com> wrote:

> Hey David,
> 
> I'm trying to compile the latest iproute2 on an RHEL-7.3 distro, and it
> fails to compile because AF_VSOCK is not defined.  Should this
> functionality be a configure option to disable it on older distros?
> 
> 
> Thanks,
> 
> Steve.
> 
> ----
> 
> misc
>     CC       ss.o
> ss.c:301:27: error: ‘AF_VSOCK’ undeclared here (not in a function)
>    .families = FAMILY_MASK(AF_VSOCK),
>                            ^
> ss.c:252:46: note: in definition of macro ‘FAMILY_MASK’
>  #define FAMILY_MASK(family) ((uint64_t)1 << (family))
>                                               ^
> ss.c:334:2: error: array index in initializer not of integer type
>   [AF_VSOCK] = {
>   ^
> ss.c:334:2: error: (near initialization for ‘default_afs’)
> make[1]: *** [ss.o] Error 1
> make: *** [all] Error 2
> 

Probably should just add an #ifdef to takeout that if not present

^ permalink raw reply

* [PATCH 0/3] net: davinci_emac: fix suspend/resume (both a regression and a common clk problem)
From: Bartosz Golaszewski @ 2018-06-19 16:09 UTC (permalink / raw)
  To: Grygorii Strashko, David S . Miller, Florian Fainelli,
	Dan Carpenter, Ivan Khoronzhuk, Rob Herring, Lukas Wunner,
	Kevin Hilman, David Lechner, Sekhar Nori, Andrew Lunn
  Cc: linux-omap, netdev, linux-kernel, Bartosz Golaszewski

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

Earlier today I sent the first patch as a solution to a regression
introduced during the v4.16 merge window, but after testing David's
common clock series on top of 4.18-rc1 + this patch it turned out that
the problem persisted.

This is a follow-up containing the regression fix and two additional
patches that make suspend/resume work with David's changes.

Bartosz Golaszewski (3):
  net: ethernet: fix suspend/resume in davinci_emac
  net: phy: set the of_node in the mdiodev's struct device
  net: davinci_emac: match the mdio device against its compatible if
    possible

 drivers/net/ethernet/ti/davinci_emac.c | 19 +++++++++++++++++--
 drivers/net/phy/phy_device.c           |  1 +
 2 files changed, 18 insertions(+), 2 deletions(-)

-- 
2.17.1

^ permalink raw reply

* [PATCH 1/3] net: ethernet: fix suspend/resume in davinci_emac
From: Bartosz Golaszewski @ 2018-06-19 16:09 UTC (permalink / raw)
  To: Grygorii Strashko, David S . Miller, Florian Fainelli,
	Dan Carpenter, Ivan Khoronzhuk, Rob Herring, Lukas Wunner,
	Kevin Hilman, David Lechner, Sekhar Nori, Andrew Lunn
  Cc: linux-omap, netdev, linux-kernel, Bartosz Golaszewski, stable
In-Reply-To: <20180619160950.6283-1-brgl@bgdev.pl>

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

This patch reverts commit 3243ff2a05ec ("net: ethernet: davinci_emac:
Deduplicate bus_find_device() by name matching") and adds a comment
which should stop anyone from reintroducing the same "fix" in the future.

We can't use bus_find_device_by_name() here because the device name is
not guaranteed to be 'davinci_mdio'. On some systems it can be
'davinci_mdio.0' so we need to use strncmp() against the first part of
the string to correctly match it.

Fixes: 3243ff2a05ec ("net: ethernet: davinci_emac: Deduplicate bus_find_device() by name matching")
Cc: stable@vger.kernel.org
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
---
 drivers/net/ethernet/ti/davinci_emac.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
index 06d7c9e4dcda..a1a6445b5a7e 100644
--- a/drivers/net/ethernet/ti/davinci_emac.c
+++ b/drivers/net/ethernet/ti/davinci_emac.c
@@ -1385,6 +1385,11 @@ static int emac_devioctl(struct net_device *ndev, struct ifreq *ifrq, int cmd)
 		return -EOPNOTSUPP;
 }
 
+static int match_first_device(struct device *dev, void *data)
+{
+	return !strncmp(dev_name(dev), "davinci_mdio", 12);
+}
+
 /**
  * emac_dev_open - EMAC device open
  * @ndev: The DaVinci EMAC network adapter
@@ -1484,8 +1489,14 @@ static int emac_dev_open(struct net_device *ndev)
 
 	/* use the first phy on the bus if pdata did not give us a phy id */
 	if (!phydev && !priv->phy_id) {
-		phy = bus_find_device_by_name(&mdio_bus_type, NULL,
-					      "davinci_mdio");
+		/* NOTE: we can't use bus_find_device_by_name() here because
+		 * the device name is not guaranteed to be 'davinci_mdio'. On
+		 * some systems it can be 'davinci_mdio.0' so we need to use
+		 * strncmp() against the first part of the string to correctly
+		 * match it.
+		 */
+		phy = bus_find_device(&mdio_bus_type, NULL, NULL,
+				      match_first_device);
 		if (phy) {
 			priv->phy_id = dev_name(phy);
 			if (!priv->phy_id || !*priv->phy_id)
-- 
2.17.1

^ permalink raw reply related

* [PATCH 2/3] net: phy: set the of_node in the mdiodev's struct device
From: Bartosz Golaszewski @ 2018-06-19 16:09 UTC (permalink / raw)
  To: Grygorii Strashko, David S . Miller, Florian Fainelli,
	Dan Carpenter, Ivan Khoronzhuk, Rob Herring, Lukas Wunner,
	Kevin Hilman, David Lechner, Sekhar Nori, Andrew Lunn
  Cc: linux-omap, netdev, linux-kernel, Bartosz Golaszewski
In-Reply-To: <20180619160950.6283-1-brgl@bgdev.pl>

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

Copy the of_node over from mii_bus's struct device. This is needed
for device-tree systems to be able to check the mdio device's
compatible string.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
---
 drivers/net/phy/phy_device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index bd0f339f69fd..a92d5ee61813 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -411,6 +411,7 @@ struct phy_device *phy_device_create(struct mii_bus *bus, int addr, int phy_id,
 	mdiodev->dev.parent = &bus->dev;
 	mdiodev->dev.bus = &mdio_bus_type;
 	mdiodev->dev.type = &mdio_bus_phy_type;
+	mdiodev->dev.of_node = bus->dev.of_node;
 	mdiodev->bus = bus;
 	mdiodev->bus_match = phy_bus_match;
 	mdiodev->addr = addr;
-- 
2.17.1

^ permalink raw reply related

* [PATCH 3/3] net: davinci_emac: match the mdio device against its compatible if possible
From: Bartosz Golaszewski @ 2018-06-19 16:09 UTC (permalink / raw)
  To: Grygorii Strashko, David S . Miller, Florian Fainelli,
	Dan Carpenter, Ivan Khoronzhuk, Rob Herring, Lukas Wunner,
	Kevin Hilman, David Lechner, Sekhar Nori, Andrew Lunn
  Cc: linux-omap, netdev, linux-kernel, Bartosz Golaszewski
In-Reply-To: <20180619160950.6283-1-brgl@bgdev.pl>

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

Device tree based systems without of_dev_auxdata will have the mdio
device named differently than "davinci_mdio(.0)". In this case use the
device's compatible string for matching.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
---
 drivers/net/ethernet/ti/davinci_emac.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
index a1a6445b5a7e..c28a35bb852f 100644
--- a/drivers/net/ethernet/ti/davinci_emac.c
+++ b/drivers/net/ethernet/ti/davinci_emac.c
@@ -1387,6 +1387,10 @@ static int emac_devioctl(struct net_device *ndev, struct ifreq *ifrq, int cmd)
 
 static int match_first_device(struct device *dev, void *data)
 {
+	if (dev->of_node)
+		return of_device_is_compatible(dev->of_node,
+					       "ti,davinci_mdio");
+
 	return !strncmp(dev_name(dev), "davinci_mdio", 12);
 }
 
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH] net: ethernet: fix suspend/resume in davinci_emac
From: Bartosz Golaszewski @ 2018-06-19 16:11 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Grygorii Strashko, David S . Miller, Florian Fainelli,
	Dan Carpenter, Ivan Khoronzhuk, Rob Herring, Kevin Hilman,
	David Lechner, Sekhar Nori, linux-omap, netdev,
	Linux Kernel Mailing List, Bartosz Golaszewski, stable
In-Reply-To: <20180619135219.GA7312@wunner.de>

2018-06-19 15:52 GMT+02:00 Lukas Wunner <lukas@wunner.de>:
> On Tue, Jun 19, 2018 at 02:44:00PM +0200, Bartosz Golaszewski wrote:
>> From: Bartosz Golaszewski <bgolaszewski@baylibre.com>
>>
>> This patch reverts commit 3243ff2a05ec ("net: ethernet: davinci_emac:
>> Deduplicate bus_find_device() by name matching") and adds a comment
>> which should stop anyone from reintroducing the same "fix" in the future.
>>
>> We can't use bus_find_device_by_name() here because the device name is
>> not guaranteed to be 'davinci_mdio'. On some systems it can be
>> 'davinci_mdio.0' so we need to use strncmp() against the first part of
>> the string to correctly match it.
>>
>> Fixes: 3243ff2a05ec ("net: ethernet: davinci_emac: Deduplicate bus_find_device() by name matching")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
>
> Acked-by: Lukas Wunner <lukas@wunner.de>
>
> My apologies Bartosz, it wasn't clear to me that the driver deliberately
> only matched against the prefix of the name.  Sorry for the breakage.
>

No issue. I submitted a follow-up series with additional changes
required to make suspend/resume work. Please leave your Acked-by there
as well as I forgot to pick it up.

Thanks in advance,
Bartosz Golaszewski

^ permalink raw reply

* [PATCH] selftests/net: Fix permissions for fib_tests.sh
From: Daniel Díaz @ 2018-06-19 16:20 UTC (permalink / raw)
  To: shuahkh, linux-kselftest
  Cc: Daniel Díaz, David S. Miller, Shuah Khan,
	open list:NETWORKING [GENERAL], open list

fib_tests.sh became non-executable at some point. This is
what happens:
  selftests: net: fib_tests.sh: Warning: file fib_tests.sh is
  not executable, correct this.
  not ok 1..11 selftests: net: fib_tests.sh [FAIL]

Fixes: d69faad76584 ("selftests: fib_tests: Add prefix route tests with metric")

Signed-off-by: Daniel Díaz <daniel.diaz@linaro.org>
---
 tools/testing/selftests/net/fib_tests.sh | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 mode change 100644 => 100755 tools/testing/selftests/net/fib_tests.sh

diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
old mode 100644
new mode 100755
-- 
2.7.4

^ permalink raw reply

* Re: [PATCH bpf-net] bpf: Change bpf_fib_lookup to return lookup status
From: Martin KaFai Lau @ 2018-06-19 16:36 UTC (permalink / raw)
  To: David Ahern; +Cc: dsahern, netdev, borkmann, ast, davem
In-Reply-To: <2d6278d1-45ab-ff53-7b97-d9593203ff3e@gmail.com>

On Tue, Jun 19, 2018 at 09:34:28AM -0600, David Ahern wrote:
> On 6/19/18 9:25 AM, Martin KaFai Lau wrote:
> > On Mon, Jun 18, 2018 at 03:35:25PM -0600, David Ahern wrote:
> >> On 6/18/18 2:55 PM, Martin KaFai Lau wrote:
> >>>> 	/* rc > 0 case */
> >>>> 	switch(rc) {
> >>>> 	case BPF_FIB_LKUP_RET_BLACKHOLE:
> >>>> 	case BPF_FIB_LKUP_RET_UNREACHABLE:
> >>>> 	case BPF_FIB_LKUP_RET_PROHIBIT:
> >>>> 		return XDP_DROP;
> >>>> 	}
> >>>>
> >>>> For the others it becomes a question of do we share why the stack needs
> >>>> to be involved? Maybe the program wants to collect stats to show traffic
> >>>> patterns that can be improved (BPF_FIB_LKUP_RET_FRAG_NEEDED) or support
> >>>> in the kernel needs to be improved (BPF_FIB_LKUP_RET_UNSUPP_LWT) or an
> >>>> interface is misconfigured (BPF_FIB_LKUP_RET_FWD_DISABLED).
> >>> Thanks for the explanation.
> >>>
> >>> Agree on the bpf able to collect stats will be useful.
> >>>
> >>> I am wondering, if a new BPF_FIB_LKUP_RET_XYZ is added later,
> >>> how may the old xdp_prog work/not-work?  As of now, the return value
> >>> is straight forward, FWD, PASS (to stack) or DROP (error).
> >>> With this change, the xdp_prog needs to match/switch() the
> >>> BPF_FIB_LKUP_RET_* to at least PASS and DROP.
> >>
> >> IMO, programs should only call XDP_DROP for known reasons - like the 3
> >> above. Anything else punt to the stack.
> >>
> >> If a new RET_XYZ comes along:
> >> 1. the new XYZ is a new ACL response where the packet is to be dropped.
> >> If the program does not understand XYZ and punts to the stack
> >> (recommendation), then a second lookup is done during normal packet
> >> processing and the stack drops it.
> >>
> >> 2. the new XYZ is a new path in the kernel that is unsupported with
> >> respect to XDP forwarding, nothing new for the program to do.
> >>
> >> Either way I would expect stats on BPF_FIB_LKUP_RET_* to give a hint to
> >> the program writer.
> >>
> >> Worst case of punting packets to the stack for any rc != 0 means the
> >> stack is doing 2 lookups - 1 in XDP based on its lookup parameters and 1
> >> in normal stack processing - to handle the packet.
> > Instead of having the xdp_prog to follow the meaning of what RET_SYZ is,
> > should the bpf_*_fib_lookup() return value be kept as is such that
> > the xdp_prog is clear what to do.  The reason can be returned in
> > the 'struct bpf_fib_lookup'.  The number of reasons can be extended.
> > If the xdp_prog does not understand a reason, it still will not
> > affect its decision because the return value is clear.
> > I think the situation here is similar to regular syscall which usually
> > uses -1 to clearly states error and errno to spells out the reason.
> > 
> 
> I did consider returning the status in struct bpf_fib_lookup. However,
> it is 64 bytes and can not be extended without a big performance
> penalty, so the only option there is to make an existing entry a union
> the most logical of which is the ifindex. It seemed odd to me to have
> the result by hidden in the struct as a union on ifindex and returning
> the egress index from the function:
> 
> @@ -2625,7 +2636,11 @@ struct bpf_fib_lookup {
> 
>         /* total length of packet from network header - used for MTU
> check */
>         __u16   tot_len;
> -       __u32   ifindex;  /* L3 device index for lookup */
> +
> +       union {
> +               __u32   ifindex;  /* input: L3 device index for lookup */
> +               __u32   result;   /* output: one of BPF_FIB_LKUP_RET_* */
> +       };
> 
> 
> It seemed more natural to have ifindex stay ifindex and only change
> value on return:
> 
> @@ -2625,7 +2639,11 @@ struct bpf_fib_lookup {
> 
>  	/* total length of packet from network header - used for MTU check */
>  	__u16	tot_len;
> -	__u32	ifindex;  /* L3 device index for lookup */
> +
> +	/* input: L3 device index for lookup
> +	 * output: nexthop device index from FIB lookup
> +	 */
> +	__u32	ifindex;
> 
>  	union {
>  		/* inputs to lookup */
> 
> 
> From a program's perspective:
> 
> rc < 0  -- program is passing incorrect data
> rc == 0 -- packet can be forwarded
> rc > 0  -- packet can not be forwarded.
> 
> BPF programs are not required to track the LKUP_RET values any more than
> a function returning multiple negative values - the caller just checks
> rc < 0 means failure. If the program cares it can look at specific
> values of rc to see the specific value.
> 
> The same applies with the LKUP_RET values - they are there to provide
> insight into why the packet is not forwarded directly if the program
> cares to know why.
hmm...ic. My concern is, the prog can interpret rc > 0 (in this patch) to be
drop vs pass (although we can advise them in bpf.h to always pass if it does
not understand a rc but it is not a strong contract),  it may catch people
a surprise if a xdp_prog suddenly drops everything when running in a
newer kernel where the upper stack can actually handle it.

while the current behavior (i.e. before this patch, rc == 0) is always pass
to the stack.

I think at least comments should be put in the enum such that
the xdp/tc_prog should expect the enum could be extended later, so
the suggested behavior should be a pass for unknown LKUP_RET and let
the stack to decide.

^ permalink raw reply

* [PATCH] ucc_geth: Add BQL support
From: Joakim Tjernlund @ 2018-06-19 16:30 UTC (permalink / raw)
  To: Li Yang, netdev; +Cc: Joakim Tjernlund

Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
---
 drivers/net/ethernet/freescale/ucc_geth.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
index f77ba9fa257b..6c99a9af6647 100644
--- a/drivers/net/ethernet/freescale/ucc_geth.c
+++ b/drivers/net/ethernet/freescale/ucc_geth.c
@@ -3096,6 +3096,7 @@ static int ucc_geth_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	ugeth_vdbg("%s: IN", __func__);
 
+	netdev_sent_queue(dev, skb->len);
 	spin_lock_irqsave(&ugeth->lock, flags);
 
 	dev->stats.tx_bytes += skb->len;
@@ -3242,6 +3243,8 @@ static int ucc_geth_tx(struct net_device *dev, u8 txQ)
 	struct ucc_geth_private *ugeth = netdev_priv(dev);
 	u8 __iomem *bd;		/* BD pointer */
 	u32 bd_status;
+	int howmany = 0;
+	unsigned int bytes_sent = 0;
 
 	bd = ugeth->confBd[txQ];
 	bd_status = in_be32((u32 __iomem *)bd);
@@ -3257,7 +3260,8 @@ static int ucc_geth_tx(struct net_device *dev, u8 txQ)
 		skb = ugeth->tx_skbuff[txQ][ugeth->skb_dirtytx[txQ]];
 		if (!skb)
 			break;
-
+		howmany++;
+		bytes_sent += skb->len;
 		dev->stats.tx_packets++;
 
 		dev_consume_skb_any(skb);
@@ -3279,6 +3283,7 @@ static int ucc_geth_tx(struct net_device *dev, u8 txQ)
 		bd_status = in_be32((u32 __iomem *)bd);
 	}
 	ugeth->confBd[txQ] = bd;
+	netdev_completed_queue(dev, howmany, bytes_sent);
 	return 0;
 }
 
-- 
2.13.6

^ permalink raw reply related

* [PATCH] selftests: net: add config fragments
From: Anders Roxell @ 2018-06-19 16:41 UTC (permalink / raw)
  To: davem, shuah, fw, shannon.nelson
  Cc: netdev, linux-kselftest, linux-kernel, Anders Roxell

Add fragments to pass bridge and vlan tests.

Fixes: 33b01b7b4f19 ("selftests: add rtnetlink test script")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
---

Hi,

net/rtnetlink.sh still fails on tc hbt hierarchy, addrlabel and ipsec:
Error: Specified qdisc not found.
RTNETLINK answers: No such file or directory
Error: Parent Qdisc doesn't exists.
We have an error talking to the kernel, -1
Error: Parent Qdisc doesn't exists.
We have an error talking to the kernel, -1
Error: Parent Qdisc doesn't exists.
We have an error talking to the kernel, -1
Error: Parent Qdisc doesn't exists.
We have an error talking to the kernel, -1
Error: Parent Qdisc doesn't exists.
We have an error talking to the kernel, -1
Error: Parent Qdisc doesn't exists.
We have an error talking to the kernel, -1
Error: Invalid handle.
FAIL: tc htb hierarchy

FAIL: ipv6 addrlabel

FAIL: can't add fou port 7777, skipping test
RTNETLINK answers: Operation not supported
FAIL: can't add macsec interface, skipping test
RTNETLINK answers: Protocol not supported
RTNETLINK answers: No such process
RTNETLINK answers: No such process
./rtnetlink.sh: line 527:  5356 Terminated              ip x m >
$tmpfile
FAIL: ipsec

I'm using iproute2 tag: 4.17 and tried the qdisc command from the
function kci_test_tc in net/rtnetlink.sh:
$ tc qdisc add dev lo root handle 1: htb
Error: Specified qdisc not found.

For kci_test_addrlabel it fails on this row:
ip addrlabel list |grep -q "prefix dead::/64 dev lo label 1"

Any idea why these three fails?

Cheers,
Anders

 tools/testing/selftests/net/config | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/testing/selftests/net/config b/tools/testing/selftests/net/config
index 7ba089b33e8b..cd3a2f1545b5 100644
--- a/tools/testing/selftests/net/config
+++ b/tools/testing/selftests/net/config
@@ -12,3 +12,5 @@ CONFIG_NET_IPVTI=y
 CONFIG_INET6_XFRM_MODE_TUNNEL=y
 CONFIG_IPV6_VTI=y
 CONFIG_DUMMY=y
+CONFIG_BRIDGE=y
+CONFIG_VLAN_8021Q=y
-- 
2.17.1

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox