Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH net-next] net: phy: don't double-read link status register if link is up
From: Heiner Kallweit @ 2019-02-07 19:22 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, David Miller; +Cc: netdev@vger.kernel.org

The link status register latches link-down events. Therefore, if link
is reported as being up, there's no need for a second read.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/phy-c45.c    | 2 ++
 drivers/net/phy/phy_device.c | 6 +++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c
index d429c6a3e..19334fe5f 100644
--- a/drivers/net/phy/phy-c45.c
+++ b/drivers/net/phy/phy-c45.c
@@ -147,6 +147,8 @@ int genphy_c45_read_link(struct phy_device *phydev)
 			val = phy_read_mmd(phydev, devad, MDIO_STAT1);
 			if (val < 0)
 				return val;
+			else if (val & MDIO_STAT1_LSTATUS)
+				continue;
 		}
 
 		val = phy_read_mmd(phydev, devad, MDIO_STAT1);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index d490cd2a8..9369e1323 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1735,8 +1735,12 @@ int genphy_update_link(struct phy_device *phydev)
 	 */
 	if (!phy_polling_mode(phydev)) {
 		status = phy_read(phydev, MII_BMSR);
-		if (status < 0)
+		if (status < 0) {
 			return status;
+		} else if (status & BMSR_LSTATUS) {
+			phydev->link = 1;
+			return 0;
+		}
 	}
 
 	/* Read link and autonegotiation status */
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH v2 bpf-next 2/2] btf: expose API to work with raw btf data
From: Andrii Nakryiko @ 2019-02-07 19:21 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Song Liu, Yonghong Song, Alexei Starovoitov,
	Martin Lau, netdev, Daniel Borkmann
In-Reply-To: <20190206062456.tvh7fe73ku5rwyx7@ast-mbp.dhcp.thefacebook.com>

On Tue, Feb 5, 2019 at 10:25 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Feb 05, 2019 at 09:46:14PM -0800, Andrii Nakryiko wrote:
> > On Tue, Feb 5, 2019 at 7:07 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Tue, Feb 05, 2019 at 04:29:49PM -0800, Andrii Nakryiko wrote:
> > > > This patch exposes two new APIs btf__get_raw_data_size() and
> > > > btf__get_raw_data() that allows to get a copy of raw BTF data out of
> > > > struct btf. This is useful for external programs that need to manipulate
> > > > raw data, e.g., pahole using btf__dedup() to deduplicate BTF type info
> > > > and then writing it back to file.
> > > >
> > > > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > > > Acked-by: Song Liu <songliubraving@fb.com>
> > > > ---
> > > >  tools/lib/bpf/btf.c      | 10 ++++++++++
> > > >  tools/lib/bpf/btf.h      |  2 ++
> > > >  tools/lib/bpf/libbpf.map |  2 ++
> > > >  3 files changed, 14 insertions(+)
> > > >
> > > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > > > index 1c2ba7182400..34bfb3641aac 100644
> > > > --- a/tools/lib/bpf/btf.c
> > > > +++ b/tools/lib/bpf/btf.c
> > > > @@ -437,6 +437,16 @@ int btf__fd(const struct btf *btf)
> > > >       return btf->fd;
> > > >  }
> > > >
> > > > +__u32 btf__get_raw_data_size(const struct btf *btf)
> > > > +{
> > > > +     return btf->data_size;
> > > > +}
> > > > +
> > > > +void btf__get_raw_data(const struct btf *btf, char *data)
> > > > +{
> > > > +     memcpy(data, btf->data, btf->data_size);
> > > > +}
> > >
> > > I cannot think of any other way to use this api,
> > > but to call btf__get_raw_data_size() first,
> > > then malloc that much memory and then call btf__get_raw_data()
> > > to store btf into it.
> > >
> > > If so, may be api should be single call that allocates, copies,
> > > and returns pointer to new mem and its size?
> > > Probably less error prone?
> > >
> >
> > I don't have strong preference, but providing pointer to allocated memory
> > seems more flexible and allows more clever/optimal use of memory from caller
> > side. E.g., instead of doing two mallocs, you can imagine doing something
> > like:
> >
> > int max_size = max(btf__get_raw_data_size(btf),
> >                    btf_ext__get_raw_data_size(btf_ext));
> > char *m = malloc(max_size);
> > btf__get_raw_data(btf, m);
> > dump_btf_section_to_file(m, some_file);
> > btf_ext__get_raw_data(btf_ext, m);
> > dump_btf_ext_section_to_file(m, some_file);
> > free(m);
> >
> > Also, pointer to memory could be mmap()'ed file, for instance. In general,
> > for a library it might be a good thing to not be prescriptive as to how one
> > gets that piece of memory.
>
> Plausible, but I'd like to see pahole patches to be convinced ;)
>

Here's a summary of proposed ways to expose raw data through new api,
with pros/cons.

1. Originally proposed two functions. `int btf__get_raw_data_size()`
to get size, `void btf__get_raw_data(void* buf)` to write raw data to
a provided buf.

Pros:
  - allows maximal flexibility for users of this API. They can manage
memory as it's convenient for them (e.g., reusing same buffer for
multiple btf and btf_ext raw data).
  - allows using mmap()'ed memory, as allocation and memory ownership
is delegated to user

Cons:
  - has potential of buffer overflows, if user doesn't provide big enough buffer


2. Alexei's proposal to combine getting size in single function that
internally allocates new memory buffer, copies data and returns it to
users to use and later free.

Pros:
  - one less API function
  - more straightforward usage, it's hard to misuse it (except for
memory leaking, if memory is not freed)

Cons:
  - always allocated for each call
  - least flexible approach, doesn't allow caller to manage memory,
prevents any kind of direct write to mmap()'ed file

3. Daniel proposed realloc-like approach, where caller optionally
provides memory buffer, but we always call realloc() internally to
ensure we have long enough buffer.

Pros:
  - allows callers to provide their memory buffer (similar to approach
#1, but see cons below)
  - prevents user error with providing too small buffer (similar to approach #2)

Cons:
  - realloc expects that memory was allocated by previous malloc()
call, so caller can't allocate bigger chunk of memory and provide
pointer inside that area (behavior is undefined in that case). This
requirement is not immediately obvious, so this approach feels more
error prone than either of approach #1 and #2
  - still doesn't allow mmap()'ed usage, again due to realloc()'s requirements


Approach #3 looks most subtly-error-prone, as it's too easy to just
provide pointer that's not at the beginning of malloc()'ed memory, but
this might not be detected immediately, and could potentially lead to
silent memory corruption.

I'd still go with approach #1 as it provides least restrictive API,
even though approach #2 will provide marginally better usability for
common cases.

Alexei, Daniel, which approach you'd prefer in the end after
considering all pros and cons?

^ permalink raw reply

* Re: [PATCH bpf-next] tools/bpf: add missing strings.h include
From: Arnaldo Carvalho de Melo @ 2019-02-07 19:21 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: yhs, songliubraving, ast, kafai, netdev, kernel-team
In-Reply-To: <20190207175027.1950358-1-andriin@fb.com>

Em Thu, Feb 07, 2019 at 09:50:27AM -0800, Andrii Nakryiko escreveu:
> Few files in libbpf are using bzero() function (defined in strings.h header), but
> don't include corresponding header. When libbpf is added as a dependency to pahole,
> this undeterministically causes warnings on some machines:
> 
> bpf.c:225:2: warning: implicit declaration of function ‘bzero’ [-Wimplicit-function-declaration]
>   bzero(&attr, sizeof(attr));
     ^~~~~

You forgot your:

Signed-off-by: Andrii Nakryiko <andriin@fb.com>

And a:

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>

:-)

- Arnaldo

> ---
>  tools/lib/bpf/bpf.c    | 1 +
>  tools/lib/bpf/btf.c    | 1 +
>  tools/lib/bpf/libbpf.c | 1 +
>  3 files changed, 3 insertions(+)
> 
> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index 3defad77dc7a..92fd27fe0599 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -22,6 +22,7 @@
>   */
>  
>  #include <stdlib.h>
> +#include <strings.h>
>  #include <memory.h>
>  #include <unistd.h>
>  #include <asm/unistd.h>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index ab6528c935a1..4324eb47d214 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -4,6 +4,7 @@
>  #include <stdio.h>
>  #include <stdlib.h>
>  #include <string.h>
> +#include <strings.h>
>  #include <unistd.h>
>  #include <errno.h>
>  #include <linux/err.h>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 47969aa0faf8..8d64ada5f728 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -18,6 +18,7 @@
>  #include <libgen.h>
>  #include <inttypes.h>
>  #include <string.h>
> +#include <strings.h>
>  #include <unistd.h>
>  #include <fcntl.h>
>  #include <errno.h>
> -- 
> 2.17.1

-- 

- Arnaldo

^ permalink raw reply

* Re: [PATCH bpf-next] tools/bpf: add missing strings.h include
From: Alexei Starovoitov @ 2019-02-07 19:18 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: yhs, songliubraving, ast, kafai, netdev, kernel-team
In-Reply-To: <20190207175027.1950358-1-andriin@fb.com>

On Thu, Feb 07, 2019 at 09:50:27AM -0800, Andrii Nakryiko wrote:
> Few files in libbpf are using bzero() function (defined in strings.h header), but
> don't include corresponding header. When libbpf is added as a dependency to pahole,
> this undeterministically causes warnings on some machines:
> 
> bpf.c:225:2: warning: implicit declaration of function ‘bzero’ [-Wimplicit-function-declaration]
>   bzero(&attr, sizeof(attr));
>     ^~~~~
> ---

SOB is missing.


^ permalink raw reply

* Re: [PATCH net-next] net: vxlan: Free a leaked vetoed multicast rdst
From: David Miller @ 2019-02-07 19:18 UTC (permalink / raw)
  To: petrm; +Cc: netdev, jiri
In-Reply-To: <f0b53ffcc4ca130436a8f2799f85bd90695f6d45.1549540888.git.petrm@mellanox.com>

From: Petr Machata <petrm@mellanox.com>
Date: Thu, 7 Feb 2019 12:18:02 +0000

> When an rdst is rejected by a driver, the current code removes it from
> the remote list, but neglects to free it. This is triggered by
> tools/testing/selftests/drivers/net/mlxsw/vxlan_fdb_veto.sh and shows as
> the following kmemleak trace:
> 
> unreferenced object 0xffff88817fa3d888 (size 96):
>   comm "softirq", pid 0, jiffies 4372702718 (age 165.252s)
>   hex dump (first 32 bytes):
>     02 00 00 00 c6 33 64 03 80 f5 a2 61 81 88 ff ff  .....3d....a....
>     06 df 71 ae ff ff ff ff 0c 00 00 00 04 d2 6a 6b  ..q...........jk
>   backtrace:
>     [<00000000296b27ac>] kmem_cache_alloc_trace+0x1ae/0x370
>     [<0000000075c86dc6>] vxlan_fdb_append.part.12+0x62/0x3b0 [vxlan]
>     [<00000000e0414b63>] vxlan_fdb_update+0xc61/0x1020 [vxlan]
>     [<00000000f330c4bd>] vxlan_fdb_add+0x2e8/0x3d0 [vxlan]
>     [<0000000008f81c2c>] rtnl_fdb_add+0x4c2/0xa10
>     [<00000000bdc4b270>] rtnetlink_rcv_msg+0x6dd/0x970
>     [<000000006701f2ce>] netlink_rcv_skb+0x290/0x410
>     [<00000000c08a5487>] rtnetlink_rcv+0x15/0x20
>     [<00000000d5f54b1e>] netlink_unicast+0x43f/0x5e0
>     [<00000000db4336bb>] netlink_sendmsg+0x789/0xcd0
>     [<00000000e1ee26b6>] sock_sendmsg+0xba/0x100
>     [<00000000ba409802>] ___sys_sendmsg+0x631/0x960
>     [<000000003c332113>] __sys_sendmsg+0xea/0x180
>     [<00000000f4139144>] __x64_sys_sendmsg+0x78/0xb0
>     [<000000006d1ddc59>] do_syscall_64+0x94/0x410
>     [<00000000c8defa9a>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> Move vxlan_dst_free() up and schedule a call thereof to plug this leak.
> 
> Fixes: 61f46fe8c646 ("vxlan: Allow vetoing of FDB notifications")
> Signed-off-by: Petr Machata <petrm@mellanox.com>
> Acked-by: Jiri Pirko <jiri@mellanox.com>

Applied, thanks.

^ permalink raw reply

* Re: Resource management for ndo_xdp_xmit (Was: [PATCH net] virtio_net: Account for tx bytes and packets on sending xdp_frames)
From: Saeed Mahameed @ 2019-02-07 19:08 UTC (permalink / raw)
  To: brouer@redhat.com
  Cc: thoiland@redhat.com, hawk@kernel.org,
	virtualization@lists.linux-foundation.org, borkmann@iogearbox.net,
	Tariq Toukan, john.fastabend@gmail.com, toke@toke.dk,
	mst@redhat.com, jakub.kicinski@netronome.com, dsahern@gmail.com,
	netdev@vger.kernel.org, jasowang@redhat.com, davem@davemloft.net,
	makita.toshiaki@lab.ntt.co.jp
In-Reply-To: <20190207084815.1e8bd817@carbon>

On Thu, 2019-02-07 at 08:48 +0100, Jesper Dangaard Brouer wrote:
> On Wed, 6 Feb 2019 00:06:33 +0000 Saeed Mahameed <saeedm@mellanox.com
> > wrote:
> 
> > On Mon, 2019-02-04 at 19:13 -0800, David Ahern wrote:
> [...]
> > > mlx5 needs some work. As I recall it still has the bug/panic
> > > removing xdp programs - at least I don't recall seeing a patch
> > > for
> > > it.  
> > 
> > Only when xdp_redirect to mlx5, and removing the program while
> > redirect is happening, this is actually due to a lack of
> > synchronization means between different drivers, we have some ideas
> > to overcome this using a standard XDP API, or just use a hack in
> > mlx5
> > driver which i don't like:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=topic/xdp-redirect-fix&id=a3652d03cc35fd3ad62744986c8ccaca74c9f20c
> > 
> > I will be working on this towards the end of this week.
> 
> Toke and I have been discussing how to solve this.
> 
> The main idea for fixing this is to tie resource allocation to
> interface
> insertion into interface maps (kernel/bpf/devmap.c). As the =devmap=
> already have the needed synchronisation mechanisms and steps for
> safely
> adding and removing =net_devices= (e.g. stopping RX side, flushing
> remaining frames, waiting RCU period before freeing objects, etc.)
> 
> As described here:
>  
> https://github.com/xdp-project/xdp-project/blob/master/xdp-project.org#better-ndo_xdp_xmit-resource-management
> 
> --Jesper

Yes you already suggested this approach @LPC:

So 
1) on dev_map_update_elem() we will call
dev->dev->ndo_bpf() to notify the device on the intention to start/stop
redirect, and wait for it to create/destroy the HW resources
before/after actually updating the map

But:
2) this won't totally solve our problem, since sometimes the driver can
decide to recreate (change of configuration) hw resources on the fly
while redirect/devmap is already happening, so we need some kind of a
dev_map_notification or a flag with rcu synch, for when the driver want
to make the xdp redirect resources unavailable.

Thanks,
Saeed.

^ permalink raw reply

* [PATCH net-next] net: phy: let genphy_c45_read_link manage the devices to check
From: Heiner Kallweit @ 2019-02-07 19:05 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, David Miller, Russell King
  Cc: netdev@vger.kernel.org

Let genphy_c45_read_link manage the devices to check, this removes
overhead from callers. Devices VEND1 and VEND2 will never be checked,
for now adopt the logic of the Marvell driver to also exclude PHY XS.

At the moment we have very few clause 45 PHY drivers, so we are
lacking experience whether other drivers will have to exclude further
devices, or may need to check PHY XS. If we should figure out that
list of devices to check needs to be configurable, I think best will
be to add a device list member to struct phy_driver.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/marvell10g.c | 10 +---------
 drivers/net/phy/phy-c45.c    | 17 +++++++++--------
 include/linux/phy.h          |  2 +-
 include/uapi/linux/mdio.h    |  2 ++
 4 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/drivers/net/phy/marvell10g.c b/drivers/net/phy/marvell10g.c
index 296a537cd..96a79c6c7 100644
--- a/drivers/net/phy/marvell10g.c
+++ b/drivers/net/phy/marvell10g.c
@@ -428,16 +428,8 @@ static int mv3310_read_10gbr_status(struct phy_device *phydev)
 
 static int mv3310_read_status(struct phy_device *phydev)
 {
-	u32 mmd_mask = phydev->c45_ids.devices_in_package;
 	int val;
 
-	/* The vendor devads do not report link status.  Avoid the PHYXS
-	 * instance as there are three, and its status depends on the MAC
-	 * being appropriately configured for the negotiated speed.
-	 */
-	mmd_mask &= ~(BIT(MDIO_MMD_VEND1) | BIT(MDIO_MMD_VEND2) |
-		      BIT(MDIO_MMD_PHYXS));
-
 	phydev->speed = SPEED_UNKNOWN;
 	phydev->duplex = DUPLEX_UNKNOWN;
 	linkmode_zero(phydev->lp_advertising);
@@ -453,7 +445,7 @@ static int mv3310_read_status(struct phy_device *phydev)
 	if (val & MDIO_STAT1_LSTATUS)
 		return mv3310_read_10gbr_status(phydev);
 
-	val = genphy_c45_read_link(phydev, mmd_mask);
+	val = genphy_c45_read_link(phydev);
 	if (val < 0)
 		return val;
 
diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c
index 449f0f986..d429c6a3e 100644
--- a/drivers/net/phy/phy-c45.c
+++ b/drivers/net/phy/phy-c45.c
@@ -118,17 +118,23 @@ EXPORT_SYMBOL_GPL(genphy_c45_aneg_done);
 /**
  * genphy_c45_read_link - read the overall link status from the MMDs
  * @phydev: target phy_device struct
- * @mmd_mask: MMDs to read status from
  *
  * Read the link status from the specified MMDs, and if they all indicate
  * that the link is up, set phydev->link to 1.  If an error is encountered,
  * a negative errno will be returned, otherwise zero.
  */
-int genphy_c45_read_link(struct phy_device *phydev, u32 mmd_mask)
+int genphy_c45_read_link(struct phy_device *phydev)
 {
+	u32 mmd_mask = phydev->c45_ids.devices_in_package;
 	int val, devad;
 	bool link = true;
 
+	/* The vendor devads do not report link status.  Avoid the PHYXS
+	 * instance as its status may depend on the MAC being appropriately
+	 * configured for the negotiated speed.
+	 */
+	mmd_mask &= ~(MDIO_DEVS_VEND1 | MDIO_DEVS_VEND2 | MDIO_DEVS_PHYXS);
+
 	while (mmd_mask && link) {
 		devad = __ffs(mmd_mask);
 		mmd_mask &= ~BIT(devad);
@@ -272,16 +278,11 @@ EXPORT_SYMBOL_GPL(gen10g_config_aneg);
 
 int gen10g_read_status(struct phy_device *phydev)
 {
-	u32 mmd_mask = phydev->c45_ids.devices_in_package;
-
 	/* For now just lie and say it's 10G all the time */
 	phydev->speed = SPEED_10000;
 	phydev->duplex = DUPLEX_FULL;
 
-	/* Avoid reading the vendor MMDs */
-	mmd_mask &= ~(BIT(MDIO_MMD_VEND1) | BIT(MDIO_MMD_VEND2));
-
-	return genphy_c45_read_link(phydev, mmd_mask);
+	return genphy_c45_read_link(phydev);
 }
 EXPORT_SYMBOL_GPL(gen10g_read_status);
 
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 237dd0358..f41bf651f 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -1094,7 +1094,7 @@ int genphy_write_mmd_unsupported(struct phy_device *phdev, int devnum,
 /* Clause 45 PHY */
 int genphy_c45_restart_aneg(struct phy_device *phydev);
 int genphy_c45_aneg_done(struct phy_device *phydev);
-int genphy_c45_read_link(struct phy_device *phydev, u32 mmd_mask);
+int genphy_c45_read_link(struct phy_device *phydev);
 int genphy_c45_read_lpa(struct phy_device *phydev);
 int genphy_c45_read_pma(struct phy_device *phydev);
 int genphy_c45_pma_setup_forced(struct phy_device *phydev);
diff --git a/include/uapi/linux/mdio.h b/include/uapi/linux/mdio.h
index d435b00d6..2e6e309f0 100644
--- a/include/uapi/linux/mdio.h
+++ b/include/uapi/linux/mdio.h
@@ -123,6 +123,8 @@
 #define MDIO_DEVS_TC			MDIO_DEVS_PRESENT(MDIO_MMD_TC)
 #define MDIO_DEVS_AN			MDIO_DEVS_PRESENT(MDIO_MMD_AN)
 #define MDIO_DEVS_C22EXT		MDIO_DEVS_PRESENT(MDIO_MMD_C22EXT)
+#define MDIO_DEVS_VEND1			MDIO_DEVS_PRESENT(MDIO_MMD_VEND1)
+#define MDIO_DEVS_VEND2			MDIO_DEVS_PRESENT(MDIO_MMD_VEND2)
 
 /* Control register 2. */
 #define MDIO_PMA_CTRL2_TYPE		0x000f	/* PMA/PMD type selection */
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH bpf-next] tools/bpf: add missing strings.h include
From: Andrii Nakryiko @ 2019-02-07 19:02 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Yonghong Song, Song Liu, Alexei Starovoitov, Martin Lau, netdev,
	kernel-team, Arnaldo Carvalho de Melo
In-Reply-To: <20190207175027.1950358-1-andriin@fb.com>

cc Arnaldo

On Thu, Feb 7, 2019 at 9:52 AM Andrii Nakryiko <andriin@fb.com> wrote:
>
> Few files in libbpf are using bzero() function (defined in strings.h header), but
> don't include corresponding header. When libbpf is added as a dependency to pahole,
> this undeterministically causes warnings on some machines:
>
> bpf.c:225:2: warning: implicit declaration of function ‘bzero’ [-Wimplicit-function-declaration]
>   bzero(&attr, sizeof(attr));
>     ^~~~~
> ---
>  tools/lib/bpf/bpf.c    | 1 +
>  tools/lib/bpf/btf.c    | 1 +
>  tools/lib/bpf/libbpf.c | 1 +
>  3 files changed, 3 insertions(+)
>
> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index 3defad77dc7a..92fd27fe0599 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -22,6 +22,7 @@
>   */
>
>  #include <stdlib.h>
> +#include <strings.h>
>  #include <memory.h>
>  #include <unistd.h>
>  #include <asm/unistd.h>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index ab6528c935a1..4324eb47d214 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -4,6 +4,7 @@
>  #include <stdio.h>
>  #include <stdlib.h>
>  #include <string.h>
> +#include <strings.h>
>  #include <unistd.h>
>  #include <errno.h>
>  #include <linux/err.h>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 47969aa0faf8..8d64ada5f728 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -18,6 +18,7 @@
>  #include <libgen.h>
>  #include <inttypes.h>
>  #include <string.h>
> +#include <strings.h>
>  #include <unistd.h>
>  #include <fcntl.h>
>  #include <errno.h>
> --
> 2.17.1
>

^ permalink raw reply

* Re: [PATCH v2 net 0/2] fix two kernel panics when disabled IPv6 on boot up
From: David Miller @ 2019-02-07 18:48 UTC (permalink / raw)
  To: liuhangbin; +Cc: netdev, sbrivio, eric.dumazet
In-Reply-To: <20190207103611.25046-1-liuhangbin@gmail.com>

From: Hangbin Liu <liuhangbin@gmail.com>
Date: Thu,  7 Feb 2019 18:36:09 +0800

> When disabled IPv6 on boot up, since there is no ipv6 route tables, we should
> not call rt6_lookup. Fix them by checking if we have inet6_dev pointer on
> netdevice.
> 
> v2: Fix idev reference leak, declarations and code mixing as Stefano,
>     Eric pointed. Since we only want to check if idev exists and not
>     reference it, use __in6_dev_get() insteand of in6_dev_get().

Series applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [ISSUE][4.20.6] mlx5 and checksum failures
From: Saeed Mahameed @ 2019-02-07 18:42 UTC (permalink / raw)
  To: Ian Kumlien
  Cc: Cong Wang, David Miller, Saeed Mahameed,
	Linux Kernel Network Developers
In-Reply-To: <CAA85sZu+Ptn99rsa0KrtedU8YiedOiNtULabHuvmqXEEhW0tZA@mail.gmail.com>

On Thu, Feb 7, 2019 at 2:17 AM Ian Kumlien <ian.kumlien@gmail.com> wrote:
>
> On Thu, Feb 7, 2019 at 2:01 AM Saeed Mahameed <saeedm@dev.mellanox.co.il> wrote:
> > On Wed, Feb 6, 2019 at 3:00 PM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> > > It changes directly after the first hw checksum failure, I don't know why =/
> >
> > weird, Maybe a real check-summing issue/corruption on the PCI ?!
>
> Actually, it seems to have been introduced in 4.20.6 - 4.20.5 works just fine
>

Great, the difference is only 120 patches.
that is bisect-able, it will only take 5 iterations to find the
offending commit.


> Just FYI, my dmesg testcase:
> time ssh <server> "dmesg && exit
> real    3m5.845s
> user    0m0.035s
> sys     0m0.041s
>
> > can you try turning off checksum offloads
> > ethtool -K ethX  rx off
>
> same test:
> real    0m3.408s
> user    0m0.022s
> sys     0m0.032s
>
> So yes, something in 4.20.6 goes wrong on the receiving part :/

^ permalink raw reply

* Re: [PATCH v4 net-next 00/11] Devlink health reporting and recovery system
From: David Miller @ 2019-02-07 18:37 UTC (permalink / raw)
  To: eranbe; +Cc: netdev, saeedm, jiri, moshe, ayal
In-Reply-To: <1549532202-943-1-git-send-email-eranbe@mellanox.com>

From: Eran Ben Elisha <eranbe@mellanox.com>
Date: Thu,  7 Feb 2019 11:36:31 +0200

> The health mechanism is targeted for Real Time Alerting, in order to know when
> something bad had happened to a PCI device
> - Provide alert debug information
> - Self healing
> - If problem needs vendor support, provide a way to gather all needed debugging
>   information.
> 
> The main idea is to unify and centralize driver health reports in the
> generic devlink instance and allow the user to set different
> attributes of the health reporting and recovery procedures.
 ...

Series applied, thank you.

^ permalink raw reply

* Re: linux-next: manual merge of the net-next tree with the net tree
From: Pablo Neira Ayuso @ 2019-02-07 18:36 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: David Miller, Networking, Linux Next Mailing List,
	Linux Kernel Mailing List, Guy Shattah, Saeed Mahameed
In-Reply-To: <20190207115425.61c3708a@canb.auug.org.au>

Hi Stephen,

On Thu, Feb 07, 2019 at 11:54:24AM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> Today's linux-next merge of the net-next tree got a conflict in:
> 
>   drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> 
> between commit:
> 
>   1651925d403e ("net/mlx5e: Use the inner headers to determine tc/pedit offload limitation on decap flows")
> 
> from the net tree and commit:
> 
>   738678817573 ("drivers: net: use flow action infrastructure")
> 
> from the net-next tree.

This conflict resolution when merging net into net-next looks good to me.

Thanks.

^ permalink raw reply

* Re: [PATCH net-next v2 01/10] net: phy: Update PHY linkmodes after config_init
From: Heiner Kallweit @ 2019-02-07 18:21 UTC (permalink / raw)
  To: Maxime Chevallier, Andrew Lunn
  Cc: davem, netdev, linux-kernel, Florian Fainelli, Russell King,
	linux-arm-kernel, Antoine Tenart, thomas.petazzoni,
	gregory.clement, miquel.raynal, nadavh, stefanc, mw
In-Reply-To: <20190207145517.08bb8975@bootlin.com>

On 07.02.2019 14:55, Maxime Chevallier wrote:
> Hello Andrew,
> 
> On Thu, 7 Feb 2019 14:48:07 +0100
> Andrew Lunn <andrew@lunn.ch> wrote:
> 
>> On Thu, Feb 07, 2019 at 10:49:30AM +0100, Maxime Chevallier wrote:
>>> We want to be able to update a PHY's supported list in the config_init
>>> callback, so move the Pause parameters settings from phydrv->features
>>> after calling config_init to make sure these  parameters aren't
>>> overwritten.  
>>
>> Hi Maxime
>>
>> I have a patch which makes some core changes to support PHY doing
>> runtime feature detection. I would prefer to use them, than this.
>>
>> Either I or Heiner will post them soon.
> 
> Sure, no problem, thanks for doing this. I'll be happy to review and
> test that on my side.
> 
The second version of the patch set comes in time because I was just
about to submit parts of it on Maxime's behalf. We may have some
dependencies between adding the get_features callback in phy_driver
and this patch set. Maybe the patch set needs to be splitted.
I think tomorrow or the day after I can have a closer look at this.

> As I said, I lack the big picture view on that part so my approach was
> pretty naive, I'm glad you can take care of this :)
> 
Don't be so self-effacing, it's good work :)

> Thanks,
> 
> Maxime
> 
Heiner

^ permalink raw reply

* Re: [PATCH net-next 2/2] net: Change TCA_ACT_* to TCA_ID_* to match that of TCA_ID_POLICE
From: David Miller @ 2019-02-07 18:01 UTC (permalink / raw)
  To: eli
  Cc: jhs, xiyou.wangcong, jiri, netdev, linux-kernel, simon.horman,
	jakub.kicinski, dirk.vandermerwe, francois.theron, quentin.monnet,
	john.hurley, edwin.peer
In-Reply-To: <20190207074549.29861-3-eli@mellanox.com>

From: Eli Cohen <eli@mellanox.com>
Date: Thu,  7 Feb 2019 09:45:49 +0200

> diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c
> index 902957beceb3..d54cb608dbaf 100644
> --- a/net/sched/act_simple.c
> +++ b/net/sched/act_simple.c
> @@ -19,8 +19,6 @@
>  #include <net/netlink.h>
>  #include <net/pkt_sched.h>
>  
> -#define TCA_ACT_SIMP 22
> -
>  #include <linux/tc_act/tc_defact.h>
>  #include <net/tc_act/tc_defact.h>
>  

I would do this in patch #1.

Actually, because you didn't, after patch #1 there are two #defines
evaluated for this macro.  One in pkt_cls.h and one here.

The only reason the compiler doesn't warn and complain is because the
definitions are identical.

Thank you.

^ permalink raw reply

* Re: [PATCH net-next] net: fixed-phy: Add fixed_phy_register_with_gpiod() API
From: Florian Fainelli @ 2019-02-07 17:59 UTC (permalink / raw)
  To: Moritz Fischer, andrew; +Cc: hkallweit1, davem, netdev, linux-kernel
In-Reply-To: <20190207175210.9309-1-mdf@kernel.org>

On 2/7/19 9:52 AM, Moritz Fischer wrote:
> Add fixed_phy_register_with_gpiod() API. It lets users create a
> fixed_phy instance that uses a GPIO descriptor which was obtained
> externally e.g. through platform data.
> This enables platform devices (non-DT based) to use GPIOs for link
> status.
> 
> Signed-off-by: Moritz Fischer <mdf@kernel.org>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>

Thank you!
-- 
Florian

^ permalink raw reply

* Re: [PATCH bpf-next v7 3/6] bpf: handle GSO in bpf_lwt_push_encap
From: Willem de Bruijn @ 2019-02-07 17:57 UTC (permalink / raw)
  To: Peter Oskolkov
  Cc: Alexei Starovoitov, Daniel Borkmann, Network Development,
	Peter Oskolkov, David Ahern, Willem de Bruijn
In-Reply-To: <20190207003720.51096-4-posk@google.com>

On Wed, Feb 6, 2019 at 7:39 PM Peter Oskolkov <posk@google.com> wrote:
>
> This patch adds handling of GSO packets in bpf_lwt_push_ip_encap()
> (called from bpf_lwt_push_encap):
>
> * IPIP, GRE, and UDP encapsulation types are deduced by looking
>   into iphdr->protocol or ipv6hdr->next_header;
> * an error is returned if the same GSO encap type is set on the skb;
> * SCTP GSO packets are not supported (as bpf_skb_proto_4_to_6
>   and similar do);
> * UDP_L4 GSO packets are also not supported (although they are
>   not blocked in bpf_skb_proto_4_to_6 and similar), as
>   skb_decrease_gso_size() will break it;

Yes, I was not aware of this gso_size resizing when adding udp gso.
Good catch. Will send a separate patch. That's unrelated to this
patchset.

> * SKB_GSO_DODGY bit is set.
>
> Note: it may be possible to support SCTP and UDP_L4 gso packets;
>       but as these cases seem to be not well handled by other
>       tunneling/encapping code paths, the solution should
>       be generic enough to apply to all tunneling/encapping code.
>
> Signed-off-by: Peter Oskolkov <posk@google.com>
> ---
>  net/core/lwt_bpf.c | 62 +++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 59 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
> index 786b96148937..4ff60757bf23 100644
> --- a/net/core/lwt_bpf.c
> +++ b/net/core/lwt_bpf.c
> @@ -16,6 +16,7 @@
>  #include <linux/types.h>
>  #include <linux/bpf.h>
>  #include <net/lwtunnel.h>
> +#include <net/gre.h>
>
>  struct bpf_lwt_prog {
>         struct bpf_prog *prog;
> @@ -390,15 +391,70 @@ static const struct lwtunnel_encap_ops bpf_encap_ops = {
>         .owner          = THIS_MODULE,
>  };
>
> -static int handle_gso_encap(struct sk_buff *skb, bool ipv4, int encap_len)
> +static int handle_gso_type(struct sk_buff *skb, unsigned int gso_type,
> +                          int encap_len)
>  {
> -       /* Handling of GSO-enabled packets is added in the next patch. */
> -       if (unlikely(skb_is_gso(skb)))

Passing all packets to a function called with gso in the name and then
filtering out non-gso in there is a bit confusing. It is customary to
only have gso packets traverse functions names gso. Please move this
test (in patch 2) into the main encap function and only call this
function for gso packets.

> +       struct skb_shared_info *shinfo = skb_shinfo(skb);
> +
> +       /* Refuse to double-encap with the same type. */
> +       if (shinfo->gso_type & gso_type)
>                 return -EINVAL;

Why is double encap allowed with different protocols, but not the
same? That seems arbitrary.

The current interface does not set an upper bound on the encap, so a
single program could also double encap directly.

As of commit 121d57af308d ("gso: validate gso_type in GSO handlers")
at least the GSO stack is robust against packets with mismatching
gso_types, so it is safe to accept types given by untrusted sources
like PF_PACKET or a BPF program, even if they prove wrong or (with
double encap) incomplete. The packet will just get dropped.

>
> +       gso_type |= SKB_GSO_DODGY;
> +       shinfo->gso_type |= gso_type;
> +       skb_decrease_gso_size(shinfo, encap_len);
> +       shinfo->gso_segs = 0;
>         return 0;
>  }
>
> +static int handle_gso_encap(struct sk_buff *skb, bool ipv4, int encap_len)
> +{
> +       void *next_hdr;
> +       __u8 protocol;
> +
> +       if (!skb_is_gso(skb))
> +               return 0;
> +
> +       /* SCTP and UDP_L4 gso need more nuanced handling than what
> +        * handle_gso_type() does above: skb_decrease_gso_size() is not enough.
> +        */
> +       if (unlikely(skb_shinfo(skb)->gso_type &
> +                    (SKB_GSO_SCTP | SKB_GSO_UDP_L4)))
> +               return -ENOTSUPP;
> +
> +       if (ipv4) {
> +               protocol = ip_hdr(skb)->protocol;
> +               next_hdr = skb_network_header(skb) + sizeof(struct iphdr);
> +       } else {
> +               protocol = ipv6_hdr(skb)->nexthdr;
> +               next_hdr = skb_network_header(skb) + sizeof(struct ipv6hdr);
> +       }
> +
> +       switch (protocol) {
> +       case IPPROTO_GRE:
> +               if (((struct gre_base_hdr *)next_hdr)->flags & GRE_CSUM)
> +                       return handle_gso_type(skb, SKB_GSO_GRE_CSUM,
> +                                              encap_len);
> +               return handle_gso_type(skb, SKB_GSO_GRE, encap_len);
> +
> +       case IPPROTO_UDP:
> +               if (((struct udphdr *)next_hdr)->check)

These tests may read beyond the end of the packet. The only guarantee
is that encap_len >= sizeof(struct iphdr). Not that there is an actual
fully formed follow on header.

If this is called from LWT_XMIT, there is the assurance that the inner
packet is at least sizeof(struct iphdr) as well. But that is still not
sufficient to unconditionally peek inside the packet.

We probably need the usual precaution wrt pskb_may_pull.

> +                       return handle_gso_type(skb, SKB_GSO_UDP_TUNNEL_CSUM,
> +                                              encap_len);
> +               return handle_gso_type(skb, SKB_GSO_UDP_TUNNEL, encap_len);
> +
> +       case IPPROTO_IP:
> +       case IPPROTO_IPV6:
> +               if (ipv4)
> +                       return handle_gso_type(skb, SKB_GSO_IPXIP4, encap_len);
> +               else
> +                       return handle_gso_type(skb, SKB_GSO_IPXIP6, encap_len);
> +
> +       default:
> +               return -EPROTONOSUPPORT;
> +       }
> +}
> +
>  int bpf_lwt_push_ip_encap(struct sk_buff *skb, void *hdr, u32 len, bool ingress)
>  {
>         struct iphdr *iph;
> --
> 2.20.1.611.gfbb209baf1-goog
>

^ permalink raw reply

* Re: [PATCH] net: stmmac: Variable "val" in function sun8i_dwmac_set_syscon() could be uninitialized
From: David Miller @ 2019-02-07 17:53 UTC (permalink / raw)
  To: yzhai003
  Cc: csong, zhiyunq, peppe.cavallaro, alexandre.torgue, maxime.ripard,
	wens, netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <CABvMjLQ6MYtGYeLxwceZsvcyn4oScgMo+BGQMHw7SkZ1uxFmHQ@mail.gmail.com>

From: Yizhuo Zhai <yzhai003@ucr.edu>
Date: Wed, 6 Feb 2019 21:52:15 -0800

> Thanks, but why initialization matters here? Is performance the main
> concern?

Code that is unnecessary is hard to audit.

People will ask "why" is it initialized?  In what situations is the
initialized value of "0" ever used?

You are wasting people's time and energy by writing unnecessary code.

^ permalink raw reply

* Re: [PATCH v2 net-next] net: phy: fixed_phy: Fix fixed_phy not checking GPIO
From: David Miller @ 2019-02-07 17:52 UTC (permalink / raw)
  To: mdf; +Cc: netdev, linux-kernel, hkallweit1, f.fainelli, andrew, moritz
In-Reply-To: <20190207054529.2865-1-mdf@kernel.org>

From: Moritz Fischer <mdf@kernel.org>
Date: Wed,  6 Feb 2019 21:45:29 -0800

> Fix fixed_phy not checking GPIO if no link_update callback
> is registered.
> 
> In the original version all users registered a link_update
> callback so the issue was masked.
> 
> Fixes: a5597008dbc2 ("phy: fixed_phy: Add gpio to determine link up/down.")
> Reviewed-by: Andrew Lunn <andrew@lunn.ch>
> Signed-off-by: Moritz Fischer <mdf@kernel.org>

Applied, thank you.

^ permalink raw reply

* [PATCH net-next] net: fixed-phy: Add fixed_phy_register_with_gpiod() API
From: Moritz Fischer @ 2019-02-07 17:52 UTC (permalink / raw)
  To: andrew; +Cc: f.fainelli, hkallweit1, davem, netdev, linux-kernel,
	Moritz Fischer

Add fixed_phy_register_with_gpiod() API. It lets users create a
fixed_phy instance that uses a GPIO descriptor which was obtained
externally e.g. through platform data.
This enables platform devices (non-DT based) to use GPIOs for link
status.

Signed-off-by: Moritz Fischer <mdf@kernel.org>
---

Changes from RFC:
- Implemented Andrew's/Florians suggestion to add new API
  instead of changing old API

---
 drivers/net/phy/fixed_phy.c | 32 +++++++++++++++++++++++++-------
 include/linux/phy_fixed.h   | 15 +++++++++++++++
 2 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/drivers/net/phy/fixed_phy.c b/drivers/net/phy/fixed_phy.c
index d810f914aaa4..b0d1368c3400 100644
--- a/drivers/net/phy/fixed_phy.c
+++ b/drivers/net/phy/fixed_phy.c
@@ -229,12 +229,12 @@ static struct gpio_desc *fixed_phy_get_gpiod(struct device_node *np)
 }
 #endif
 
-struct phy_device *fixed_phy_register(unsigned int irq,
-				      struct fixed_phy_status *status,
-				      struct device_node *np)
+static struct phy_device *__fixed_phy_register(unsigned int irq,
+					       struct fixed_phy_status *status,
+					       struct device_node *np,
+					       struct gpio_desc *gpiod)
 {
 	struct fixed_mdio_bus *fmb = &platform_fmb;
-	struct gpio_desc *gpiod = NULL;
 	struct phy_device *phy;
 	int phy_addr;
 	int ret;
@@ -243,9 +243,11 @@ struct phy_device *fixed_phy_register(unsigned int irq,
 		return ERR_PTR(-EPROBE_DEFER);
 
 	/* Check if we have a GPIO associated with this fixed phy */
-	gpiod = fixed_phy_get_gpiod(np);
-	if (IS_ERR(gpiod))
-		return ERR_CAST(gpiod);
+	if (!gpiod) {
+		gpiod = fixed_phy_get_gpiod(np);
+		if (IS_ERR(gpiod))
+			return ERR_CAST(gpiod);
+	}
 
 	/* Get the next available PHY address, up to PHY_MAX_ADDR */
 	phy_addr = ida_simple_get(&phy_fixed_ida, 0, PHY_MAX_ADDR, GFP_KERNEL);
@@ -308,8 +310,24 @@ struct phy_device *fixed_phy_register(unsigned int irq,
 
 	return phy;
 }
+
+struct phy_device *fixed_phy_register(unsigned int irq,
+				      struct fixed_phy_status *status,
+				      struct device_node *np)
+{
+	return __fixed_phy_register(irq, status, np, NULL);
+}
 EXPORT_SYMBOL_GPL(fixed_phy_register);
 
+struct phy_device *
+fixed_phy_register_with_gpiod(unsigned int irq,
+			      struct fixed_phy_status *status,
+			      struct gpio_desc *gpiod)
+{
+	return __fixed_phy_register(irq, status, NULL, gpiod);
+}
+EXPORT_SYMBOL_GPL(fixed_phy_register_with_gpiod);
+
 void fixed_phy_unregister(struct phy_device *phy)
 {
 	phy_device_remove(phy);
diff --git a/include/linux/phy_fixed.h b/include/linux/phy_fixed.h
index c78fc203db43..1e5d86ebdaeb 100644
--- a/include/linux/phy_fixed.h
+++ b/include/linux/phy_fixed.h
@@ -19,6 +19,12 @@ extern int fixed_phy_add(unsigned int irq, int phy_id,
 extern struct phy_device *fixed_phy_register(unsigned int irq,
 					     struct fixed_phy_status *status,
 					     struct device_node *np);
+
+extern struct phy_device *
+fixed_phy_register_with_gpiod(unsigned int irq,
+			      struct fixed_phy_status *status,
+			      struct gpio_desc *gpiod);
+
 extern void fixed_phy_unregister(struct phy_device *phydev);
 extern int fixed_phy_set_link_update(struct phy_device *phydev,
 			int (*link_update)(struct net_device *,
@@ -35,6 +41,15 @@ static inline struct phy_device *fixed_phy_register(unsigned int irq,
 {
 	return ERR_PTR(-ENODEV);
 }
+
+static inline struct phy_device *
+fixed_phy_register_with_gpiod(unsigned int irq,
+			      struct fixed_phy_status *status,
+			      struct gpio_desc *gpiod)
+{
+	return ERR_PTR(-ENODEV);
+}
+
 static inline void fixed_phy_unregister(struct phy_device *phydev)
 {
 }
-- 
2.20.1


^ permalink raw reply related

* [PATCH bpf-next] tools/bpf: add missing strings.h include
From: Andrii Nakryiko @ 2019-02-07 17:50 UTC (permalink / raw)
  To: yhs, songliubraving, ast, kafai, netdev, kernel-team; +Cc: Andrii Nakryiko

Few files in libbpf are using bzero() function (defined in strings.h header), but
don't include corresponding header. When libbpf is added as a dependency to pahole,
this undeterministically causes warnings on some machines:

bpf.c:225:2: warning: implicit declaration of function ‘bzero’ [-Wimplicit-function-declaration]
  bzero(&attr, sizeof(attr));
    ^~~~~
---
 tools/lib/bpf/bpf.c    | 1 +
 tools/lib/bpf/btf.c    | 1 +
 tools/lib/bpf/libbpf.c | 1 +
 3 files changed, 3 insertions(+)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 3defad77dc7a..92fd27fe0599 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -22,6 +22,7 @@
  */
 
 #include <stdlib.h>
+#include <strings.h>
 #include <memory.h>
 #include <unistd.h>
 #include <asm/unistd.h>
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index ab6528c935a1..4324eb47d214 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -4,6 +4,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <strings.h>
 #include <unistd.h>
 #include <errno.h>
 #include <linux/err.h>
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 47969aa0faf8..8d64ada5f728 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -18,6 +18,7 @@
 #include <libgen.h>
 #include <inttypes.h>
 #include <string.h>
+#include <strings.h>
 #include <unistd.h>
 #include <fcntl.h>
 #include <errno.h>
-- 
2.17.1


^ permalink raw reply related

* Re: [PATCH] net: stmmac: Variable "val" in function sun8i_dwmac_set_syscon() could be uninitialized
From: Yizhuo Zhai @ 2019-02-07 17:48 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: David Miller, Chengyu Song, Zhiyun Qian, Giuseppe Cavallaro,
	Alexandre Torgue, Chen-Yu Tsai, netdev, linux-arm-kernel,
	linux-kernel
In-Reply-To: <20190207092454.xq5jvnfnuhcp37nm@flea>

Make sense, I will send the new patch. Thanks for the opinion.

On Thu, Feb 7, 2019 at 1:25 AM Maxime Ripard <maxime.ripard@bootlin.com> wrote:
>
> On Wed, Feb 06, 2019 at 09:53:16PM -0800, Yizhuo Zhai wrote:
> >
> >
> > On Wed, Feb 6, 2019 at 9:52 PM Yizhuo Zhai <yzhai003@ucr.edu> wrote:
> > >
> > > Thanks, but why initialization matters here? Is performance the main concern?
> > >
> > > On Wed, Feb 6, 2019 at 8:17 PM David Miller <davem@davemloft.net> wrote:
> > >>
> > >> From: Yizhuo <yzhai003@ucr.edu>
> > >> Date: Tue,  5 Feb 2019 14:15:59 -0800
> > >>
> > >> > @@ -639,9 +639,14 @@ static int sun8i_dwmac_set_syscon(struct stmmac_priv *priv)
> > >> >       struct sunxi_priv_data *gmac = priv->plat->bsp_priv;
> > >> >       struct device_node *node = priv->device->of_node;
> > >> >       int ret;
> > >> > -     u32 reg, val;
> > >> > +     u32 reg, val = 0;
> > >> > +
> > >> > +     ret = regmap_read(gmac->regmap, SYSCON_EMAC_REG, &val);
> > >> > +     if (ret) {
> > >> > +             dev_err(priv->device, "Fail to read SYSCON_EMAC_REG.\n");
> > >> > +             return ret;
> > >> > +     }
> > >>
> > >> I agree with the other reviewer that since you check 'ret' the initialization of
> > >> 'val' is no longer needed.
> >
> > Thanks, but why initialization matters here? Is performance the main
> > concern?
>
> Not really, but if we turn this the other way around, why should we do
> something that doesn't bring anything?
>
> Maxime
>
> --
> Maxime Ripard, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com



-- 
Kind Regards,

Yizhuo Zhai

Computer Science, Graduate Student
University of California, Riverside

^ permalink raw reply

* Re: [PATCH net] sctp: make sctp_setsockopt_events() less strict about the option length
From: 'Marcelo Ricardo Leitner' @ 2019-02-07 17:47 UTC (permalink / raw)
  To: David Laight
  Cc: Julien Gomes, netdev@vger.kernel.org, linux-sctp@vger.kernel.org,
	linux-kernel@vger.kernel.org, davem@davemloft.net,
	nhorman@tuxdriver.com, vyasevich@gmail.com, lucien.xin@gmail.com
In-Reply-To: <1257941619984a2a992af8d801855c04@AcuMS.aculab.com>

On Thu, Feb 07, 2019 at 05:33:07PM +0000, David Laight wrote:
> From: Marcelo Ricardo Leitner
> > Sent: 06 February 2019 21:07
> > 
> > On Wed, Feb 06, 2019 at 12:48:38PM -0800, Julien Gomes wrote:
> > >
> > >
> > > On 2/6/19 12:37 PM, Marcelo Ricardo Leitner wrote:
> > > > On Wed, Feb 06, 2019 at 12:14:30PM -0800, Julien Gomes wrote:
> > > >> Make sctp_setsockopt_events() able to accept sctp_event_subscribe
> > > >> structures longer than the current definitions.
> > > >>
> > > >> This should prevent unjustified setsockopt() failures due to struct
> > > >> sctp_event_subscribe extensions (as in 4.11 and 4.12) when using
> > > >> binaries that should be compatible, but were built with later kernel
> > > >> uapi headers.
> > > >
> > > > Not sure if we support backwards compatibility like this?
> > > >
> > > > My issue with this change is that by doing this, application will have
> > > > no clue if the new bits were ignored or not and it may think that an
> > > > event is enabled while it is not.
> > > >
> > > > A workaround would be to do a getsockopt and check the size that was
> > > > returned. But then, it might as well use the right struct here in the
> > > > first place.
> > > >
> > > > I'm seeing current implementation as an implicitly versioned argument:
> > > > it will always accept setsockopt calls with an old struct (v4.11 or
> > > > v4.12), but if the user tries to use v3 on a v1-only system, it will
> > > > be rejected. Pretty much like using a newer setsockopt on an old
> > > > system.
> > >
> > > With the current implementation, given sources that say are supposed to
> > > run on a 4.9 kernel (no use of any newer field added in 4.11 or 4.12),
> > > we can't rebuild the exact same sources on a 4.19 kernel and still run
> > > them on 4.9 without messing with structures re-definition.
> > 
> > Maybe what we want(ed) here then is explicit versioning, to have the 3
> > definitions available. Then the application is able to use, say struct
> > sctp_event_subscribe, and be happy with it, while there is struct
> > sctp_event_subscribe_v2 and struct sctp_event_subscribe_v3 there too.
> > 
> > But it's too late for that now because that would break applications
> > already using the new fields in sctp_event_subscribe.
> 
> It is probably better to break the recompilation of the few programs
> that use the new fields (and have them not work on old kernels)
> than to stop recompilations of old programs stop working on old
> kernels or have requested new options silently ignored.

I got confused here, not sure what you mean. Seems there is one "stop"
word too many.

> 
> There are all sorts of reasons why programs get built on systems that
> are newer than the ones they need to run on.
> I'm currently planning to get around the glibc 'memcpy()' fubar so I
> can retire some very old build systems before their disks die.

You can virtualize those. That's not really a good reason for
building with newer kernel and running on old systems, as virtually
any old system can be virtualized.

  Marcelo

> 
> Fortunately our sctp code is in the kernel - so has to be compiled
> with the correct headers.
> 
> > > I understand your point, but this still looks like a sort of uapi
> > > breakage to me.
> > 
> > Not disagreeing. I really just don't know how supported that is.
> > Willing to know so I can pay more attention to this on future changes.
> 
> Agreed, these structures should never be changed.
> 
> 	David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
> 

^ permalink raw reply

* [PATCH] net: stmmac: Variable "val" in function sun8i_dwmac_set_syscon() could be uninitialized
From: Yizhuo @ 2019-02-07 17:46 UTC (permalink / raw)
  Cc: csong, zhiyunq, Yizhuo, Giuseppe Cavallaro, Alexandre Torgue,
	Maxime Ripard, Chen-Yu Tsai, netdev, linux-arm-kernel,
	linux-kernel

In function sun8i_dwmac_set_syscon(), local variable "val" could
be uninitialized if function regmap_read() returns -EINVAL.
However, it will be used directly in the if statement, which
is potentially unsafe.

Signed-off-by: Yizhuo <yzhai003@ucr.edu>
---
 drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
index 39c2122a4f26..50cfd6d83052 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
@@ -641,7 +641,12 @@ static int sun8i_dwmac_set_syscon(struct stmmac_priv *priv)
 	int ret;
 	u32 reg, val;
 
-	regmap_read(gmac->regmap, SYSCON_EMAC_REG, &val);
+	ret = regmap_read(gmac->regmap, SYSCON_EMAC_REG, &val);
+	if (ret) {
+		dev_err(priv->device, "Fail to read SYSCON_EMAC_REG.\n");
+		return ret;
+	}
+
 	reg = gmac->variant->default_syscon_value;
 	if (reg != val)
 		dev_warn(priv->device,
-- 
2.17.1


^ permalink raw reply related

* [PATCH bpf-next v3] tools/bpf: add log_level to bpf_load_program_attr
From: Yonghong Song @ 2019-02-07 17:34 UTC (permalink / raw)
  To: netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Yonghong Song

The kernel verifier has three levels of logs:
    0: no logs
    1: logs mostly useful
  > 1: verbose

Current libbpf API functions bpf_load_program_xattr() and
bpf_load_program() cannot specify log_level.
The bcc, however, provides an interface for user to
specify log_level 2 for verbose output.

This patch added log_level into structure
bpf_load_program_attr, so users, including bcc, can use
bpf_load_program_xattr() to change log_level. The
supported log_level is 0, 1, and 2.

The bpf selftest test_sock.c is modified to enable log_level = 2.
If the "verbose" in test_sock.c is changed to true,
the test will output logs like below:
  $ ./test_sock
  func#0 @0
  0: R1=ctx(id=0,off=0,imm=0) R10=fp0,call_-1
  0: (bf) r6 = r1
  1: R1=ctx(id=0,off=0,imm=0) R6_w=ctx(id=0,off=0,imm=0) R10=fp0,call_-1
  1: (61) r7 = *(u32 *)(r6 +28)
  invalid bpf_context access off=28 size=4

  Test case: bind4 load with invalid access: src_ip6 .. [PASS]
  ...
  Test case: bind6 allow all .. [PASS]
  Summary: 16 PASSED, 0 FAILED

Some test_sock tests are negative tests and verbose verifier
log will be printed out as shown in the above.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/lib/bpf/bpf.c                     | 22 +++++++++++++++++-----
 tools/lib/bpf/bpf.h                     |  1 +
 tools/testing/selftests/bpf/test_sock.c |  9 ++++++++-
 3 files changed, 26 insertions(+), 6 deletions(-)

Changelog:
  v2 -> v3:
    . returning error immediately for inconsistent log_level/log_buf/log_buf_sz
      values
    . for log_level=1/2, always doing logging during bpf_prog_load.
  v1 -> v2:
    . make log_level as the last member of struct bpf_load_program_attr.
    . return -EINVAL if bpf_load_program_attr.log_level > 2.

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 3defad77dc7a..3f5290ee1b47 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -214,10 +214,15 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
 {
 	void *finfo = NULL, *linfo = NULL;
 	union bpf_attr attr;
+	__u32 log_level;
 	__u32 name_len;
 	int fd;
 
-	if (!load_attr)
+	if (!load_attr || !log_buf != !log_buf_sz)
+		return -EINVAL;
+
+	log_level = load_attr->log_level;
+	if (log_level > 2 || (log_level && !log_buf))
 		return -EINVAL;
 
 	name_len = load_attr->name ? strlen(load_attr->name) : 0;
@@ -228,9 +233,16 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
 	attr.insn_cnt = (__u32)load_attr->insns_cnt;
 	attr.insns = ptr_to_u64(load_attr->insns);
 	attr.license = ptr_to_u64(load_attr->license);
-	attr.log_buf = ptr_to_u64(NULL);
-	attr.log_size = 0;
-	attr.log_level = 0;
+
+	attr.log_level = log_level;
+	if (log_level) {
+		attr.log_buf = ptr_to_u64(log_buf);
+		attr.log_size = log_buf_sz;
+	} else {
+		attr.log_buf = ptr_to_u64(NULL);
+		attr.log_size = 0;
+	}
+
 	attr.kern_version = load_attr->kern_version;
 	attr.prog_ifindex = load_attr->prog_ifindex;
 	attr.prog_btf_fd = load_attr->prog_btf_fd;
@@ -286,7 +298,7 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
 			goto done;
 	}
 
-	if (!log_buf || !log_buf_sz)
+	if (log_level || !log_buf)
 		goto done;
 
 	/* Try again with log */
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index ed09eed2dc3b..6ffdd79bea89 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -85,6 +85,7 @@ struct bpf_load_program_attr {
 	__u32 line_info_rec_size;
 	const void *line_info;
 	__u32 line_info_cnt;
+	__u32 log_level;
 };
 
 /* Flags to direct loading requirements */
diff --git a/tools/testing/selftests/bpf/test_sock.c b/tools/testing/selftests/bpf/test_sock.c
index 561ffb6d6433..fb679ac3d4b0 100644
--- a/tools/testing/selftests/bpf/test_sock.c
+++ b/tools/testing/selftests/bpf/test_sock.c
@@ -20,6 +20,7 @@
 #define MAX_INSNS	512
 
 char bpf_log_buf[BPF_LOG_BUF_SIZE];
+static bool verbose = false;
 
 struct sock_test {
 	const char *descr;
@@ -325,6 +326,7 @@ static int load_sock_prog(const struct bpf_insn *prog,
 			  enum bpf_attach_type attach_type)
 {
 	struct bpf_load_program_attr attr;
+	int ret;
 
 	memset(&attr, 0, sizeof(struct bpf_load_program_attr));
 	attr.prog_type = BPF_PROG_TYPE_CGROUP_SOCK;
@@ -332,8 +334,13 @@ static int load_sock_prog(const struct bpf_insn *prog,
 	attr.insns = prog;
 	attr.insns_cnt = probe_prog_length(attr.insns);
 	attr.license = "GPL";
+	attr.log_level = 2;
 
-	return bpf_load_program_xattr(&attr, bpf_log_buf, BPF_LOG_BUF_SIZE);
+	ret = bpf_load_program_xattr(&attr, bpf_log_buf, BPF_LOG_BUF_SIZE);
+	if (verbose && ret < 0)
+		fprintf(stderr, "%s\n", bpf_log_buf);
+
+	return ret;
 }
 
 static int attach_sock_prog(int cgfd, int progfd,
-- 
2.17.1


^ permalink raw reply related

* RE: [PATCH net] sctp: make sctp_setsockopt_events() less strict about the option length
From: David Laight @ 2019-02-07 17:33 UTC (permalink / raw)
  To: 'Marcelo Ricardo Leitner', Julien Gomes
  Cc: netdev@vger.kernel.org, linux-sctp@vger.kernel.org,
	linux-kernel@vger.kernel.org, davem@davemloft.net,
	nhorman@tuxdriver.com, vyasevich@gmail.com, lucien.xin@gmail.com
In-Reply-To: <20190206210723.GD13621@localhost.localdomain>

From: Marcelo Ricardo Leitner
> Sent: 06 February 2019 21:07
> 
> On Wed, Feb 06, 2019 at 12:48:38PM -0800, Julien Gomes wrote:
> >
> >
> > On 2/6/19 12:37 PM, Marcelo Ricardo Leitner wrote:
> > > On Wed, Feb 06, 2019 at 12:14:30PM -0800, Julien Gomes wrote:
> > >> Make sctp_setsockopt_events() able to accept sctp_event_subscribe
> > >> structures longer than the current definitions.
> > >>
> > >> This should prevent unjustified setsockopt() failures due to struct
> > >> sctp_event_subscribe extensions (as in 4.11 and 4.12) when using
> > >> binaries that should be compatible, but were built with later kernel
> > >> uapi headers.
> > >
> > > Not sure if we support backwards compatibility like this?
> > >
> > > My issue with this change is that by doing this, application will have
> > > no clue if the new bits were ignored or not and it may think that an
> > > event is enabled while it is not.
> > >
> > > A workaround would be to do a getsockopt and check the size that was
> > > returned. But then, it might as well use the right struct here in the
> > > first place.
> > >
> > > I'm seeing current implementation as an implicitly versioned argument:
> > > it will always accept setsockopt calls with an old struct (v4.11 or
> > > v4.12), but if the user tries to use v3 on a v1-only system, it will
> > > be rejected. Pretty much like using a newer setsockopt on an old
> > > system.
> >
> > With the current implementation, given sources that say are supposed to
> > run on a 4.9 kernel (no use of any newer field added in 4.11 or 4.12),
> > we can't rebuild the exact same sources on a 4.19 kernel and still run
> > them on 4.9 without messing with structures re-definition.
> 
> Maybe what we want(ed) here then is explicit versioning, to have the 3
> definitions available. Then the application is able to use, say struct
> sctp_event_subscribe, and be happy with it, while there is struct
> sctp_event_subscribe_v2 and struct sctp_event_subscribe_v3 there too.
> 
> But it's too late for that now because that would break applications
> already using the new fields in sctp_event_subscribe.

It is probably better to break the recompilation of the few programs
that use the new fields (and have them not work on old kernels)
than to stop recompilations of old programs stop working on old
kernels or have requested new options silently ignored.

There are all sorts of reasons why programs get built on systems that
are newer than the ones they need to run on.
I'm currently planning to get around the glibc 'memcpy()' fubar so I
can retire some very old build systems before their disks die.

Fortunately our sctp code is in the kernel - so has to be compiled
with the correct headers.

> > I understand your point, but this still looks like a sort of uapi
> > breakage to me.
> 
> Not disagreeing. I really just don't know how supported that is.
> Willing to know so I can pay more attention to this on future changes.

Agreed, these structures should never be changed.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox