Netdev List


* [PATCH net-next v4 3/8] devlink: add generic info version names
From: Jakub Kicinski @ 2019-01-31 18:50 UTC (permalink / raw)
  To: davem
  Cc: netdev, oss-drivers, jiri, andrew, f.fainelli, mkubecek, eugenem,
	jonathan.lemon, Jakub Kicinski
In-Reply-To: <20190131185047.27685-1-jakub.kicinski@netronome.com>

Add defines and docs for generic info versions.

v3:
 - add docs;
 - separate patch (Jiri).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 .../networking/devlink-info-versions.rst      | 38 +++++++++++++++++++
 Documentation/networking/index.rst            |  1 +
 include/net/devlink.h                         | 14 +++++++
 3 files changed, 53 insertions(+)
 create mode 100644 Documentation/networking/devlink-info-versions.rst

diff --git a/Documentation/networking/devlink-info-versions.rst b/Documentation/networking/devlink-info-versions.rst
new file mode 100644
index 000000000000..7d4ecf6b6f34
--- /dev/null
+++ b/Documentation/networking/devlink-info-versions.rst
@@ -0,0 +1,38 @@
+.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+
+=====================
+Devlink info versions
+=====================
+
+board.id
+========
+
+Unique identifier of the board design.
+
+board.rev
+=========
+
+Board design revision.
+
+fw.mgmt
+=======
+
+Control unit firmware version. This firmware is responsible for house
+keeping tasks, PHY control etc. but not the packet-by-packet data path
+operation.
+
+fw.app
+======
+
+Data path microcode controlling high-speed packet processing.
+
+fw.undi
+=======
+
+UNDI software, may include the UEFI driver, firmware or both.
+
+fw.ncsi
+=======
+
+Version of the software responsible for supporting/handling the
+Network Controller Sideband Interface.
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index f1627ca2a0ea..9a32451cd201 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -24,6 +24,7 @@ Linux Networking Documentation
    device_drivers/intel/i40e
    device_drivers/intel/iavf
    device_drivers/intel/ice
+   devlink-info-versions
    kapi
    z8530book
    msg_zerocopy
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 6dc0ef964392..6b417f141fd6 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -428,6 +428,20 @@ enum devlink_param_wol_types {
 	.validate = _validate,						\
 }
 
+/* Part number, identifier of board design */
+#define DEVLINK_INFO_VERSION_GENERIC_BOARD_ID	"board.id"
+/* Revision of board design */
+#define DEVLINK_INFO_VERSION_GENERIC_BOARD_REV	"board.rev"
+
+/* Control processor FW version */
+#define DEVLINK_INFO_VERSION_GENERIC_FW_MGMT	"fw.mgmt"
+/* Data path microcode controlling high-speed packet processing */
+#define DEVLINK_INFO_VERSION_GENERIC_FW_APP	"fw.app"
+/* UNDI software version */
+#define DEVLINK_INFO_VERSION_GENERIC_FW_UNDI	"fw.undi"
+/* NCSI support/handler version */
+#define DEVLINK_INFO_VERSION_GENERIC_FW_NCSI	"fw.ncsi"
+
 struct devlink_region;
 struct devlink_info_req;
 
-- 
2.19.2
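
For illustration only, a minimal userspace sketch of how the generic names
might be consumed, for example by a tool that wants to tell generic version
names apart from driver-specific ones. The macros mirror the defines added to
include/net/devlink.h above; the table and the devlink_info_generic_desc()
helper are hypothetical and not part of this patch.

```c
#include <stddef.h>
#include <string.h>

/* Mirrors the defines added to include/net/devlink.h by this patch. */
#define DEVLINK_INFO_VERSION_GENERIC_BOARD_ID	"board.id"
#define DEVLINK_INFO_VERSION_GENERIC_BOARD_REV	"board.rev"
#define DEVLINK_INFO_VERSION_GENERIC_FW_MGMT	"fw.mgmt"
#define DEVLINK_INFO_VERSION_GENERIC_FW_APP	"fw.app"
#define DEVLINK_INFO_VERSION_GENERIC_FW_UNDI	"fw.undi"
#define DEVLINK_INFO_VERSION_GENERIC_FW_NCSI	"fw.ncsi"

/* Hypothetical table pairing each generic name with its documented meaning. */
static const struct {
	const char *name;
	const char *desc;
} generic_versions[] = {
	{ DEVLINK_INFO_VERSION_GENERIC_BOARD_ID,  "Unique identifier of the board design" },
	{ DEVLINK_INFO_VERSION_GENERIC_BOARD_REV, "Board design revision" },
	{ DEVLINK_INFO_VERSION_GENERIC_FW_MGMT,   "Control unit firmware version" },
	{ DEVLINK_INFO_VERSION_GENERIC_FW_APP,    "Data path microcode" },
	{ DEVLINK_INFO_VERSION_GENERIC_FW_UNDI,   "UNDI software" },
	{ DEVLINK_INFO_VERSION_GENERIC_FW_NCSI,   "NCSI support/handler software" },
};

/* Hypothetical helper: return the description of a generic version name,
 * or NULL if the name is not one of the generic ones.
 */
static const char *devlink_info_generic_desc(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(generic_versions) / sizeof(generic_versions[0]); i++)
		if (!strcmp(generic_versions[i].name, name))
			return generic_versions[i].desc;
	return NULL;
}
```

A name that is not in the table, such as a driver-specific one, yields NULL.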



* [PATCH net-next v4 1/8] devlink: add device information API
From: Jakub Kicinski @ 2019-01-31 18:50 UTC (permalink / raw)
  To: davem
  Cc: netdev, oss-drivers, jiri, andrew, f.fainelli, mkubecek, eugenem,
	jonathan.lemon, Jakub Kicinski
In-Reply-To: <20190131185047.27685-1-jakub.kicinski@netronome.com>

ethtool -i has served us well for a long time, but it's showing
its limitations more and more. The device information should
also be reported per device not per-netdev.

Lay foundation for a simple devlink-based way of reading device
info. Add driver name and device serial number as initial pieces
of information exposed via this new API.

v3:
 - rename helpers (Jiri);
 - rename driver name attr (Jiri);
 - remove double spacing in commit message (Jiri).
RFC v2:
 - wrap the skb into an opaque structure (Jiri);
 - allow the serial number to be any length (Jiri & Andrew);
 - add driver name (Jonathan).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/devlink.h        |  18 ++++++
 include/uapi/linux/devlink.h |   5 ++
 net/core/devlink.c           | 112 +++++++++++++++++++++++++++++++++++
 3 files changed, 135 insertions(+)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index 85c9eabaf056..a6d0a530483d 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -429,6 +429,7 @@ enum devlink_param_wol_types {
 }
 
 struct devlink_region;
+struct devlink_info_req;
 
 typedef void devlink_snapshot_data_dest_t(const void *data);
 
@@ -484,6 +485,8 @@ struct devlink_ops {
 	int (*eswitch_encap_mode_get)(struct devlink *devlink, u8 *p_encap_mode);
 	int (*eswitch_encap_mode_set)(struct devlink *devlink, u8 encap_mode,
 				      struct netlink_ext_ack *extack);
+	int (*info_get)(struct devlink *devlink, struct devlink_info_req *req,
+			struct netlink_ext_ack *extack);
 };
 
 static inline void *devlink_priv(struct devlink *devlink)
@@ -607,6 +610,10 @@ u32 devlink_region_shapshot_id_get(struct devlink *devlink);
 int devlink_region_snapshot_create(struct devlink_region *region, u64 data_len,
 				   u8 *data, u32 snapshot_id,
 				   devlink_snapshot_data_dest_t *data_destructor);
+int devlink_info_serial_number_put(struct devlink_info_req *req,
+				   const char *sn);
+int devlink_info_driver_name_put(struct devlink_info_req *req,
+				 const char *name);
 
 #else
 
@@ -905,6 +912,17 @@ devlink_region_snapshot_create(struct devlink_region *region, u64 data_len,
 	return 0;
 }
 
+static inline int
+devlink_info_driver_name_put(struct devlink_info_req *req, const char *name)
+{
+	return 0;
+}
+
+static inline int
+devlink_info_serial_number_put(struct devlink_info_req *req, const char *sn)
+{
+	return 0;
+}
 #endif
 
 #endif /* _NET_DEVLINK_H_ */
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 61b4447a6c5b..142710d45093 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -94,6 +94,8 @@ enum devlink_command {
 	DEVLINK_CMD_PORT_PARAM_NEW,
 	DEVLINK_CMD_PORT_PARAM_DEL,
 
+	DEVLINK_CMD_INFO_GET,		/* can dump */
+
 	/* add new commands above here */
 	__DEVLINK_CMD_MAX,
 	DEVLINK_CMD_MAX = __DEVLINK_CMD_MAX - 1
@@ -290,6 +292,9 @@ enum devlink_attr {
 	DEVLINK_ATTR_REGION_CHUNK_ADDR,         /* u64 */
 	DEVLINK_ATTR_REGION_CHUNK_LEN,          /* u64 */
 
+	DEVLINK_ATTR_INFO_DRIVER_NAME,		/* string */
+	DEVLINK_ATTR_INFO_SERIAL_NUMBER,	/* string */
+
 	/* add new attributes above here, update the policy in devlink.c */
 
 	__DEVLINK_ATTR_MAX,
diff --git a/net/core/devlink.c b/net/core/devlink.c
index e6f170caf449..f456f6aa3d40 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -3714,6 +3714,110 @@ static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
 	return 0;
 }
 
+struct devlink_info_req {
+	struct sk_buff *msg;
+};
+
+int devlink_info_driver_name_put(struct devlink_info_req *req, const char *name)
+{
+	return nla_put_string(req->msg, DEVLINK_ATTR_INFO_DRIVER_NAME, name);
+}
+EXPORT_SYMBOL_GPL(devlink_info_driver_name_put);
+
+int devlink_info_serial_number_put(struct devlink_info_req *req, const char *sn)
+{
+	return nla_put_string(req->msg, DEVLINK_ATTR_INFO_SERIAL_NUMBER, sn);
+}
+EXPORT_SYMBOL_GPL(devlink_info_serial_number_put);
+
+static int
+devlink_nl_info_fill(struct sk_buff *msg, struct devlink *devlink,
+		     enum devlink_command cmd, u32 portid,
+		     u32 seq, int flags, struct netlink_ext_ack *extack)
+{
+	struct devlink_info_req req;
+	void *hdr;
+	int err;
+
+	hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd);
+	if (!hdr)
+		return -EMSGSIZE;
+
+	err = -EMSGSIZE;
+	if (devlink_nl_put_handle(msg, devlink))
+		goto err_cancel_msg;
+
+	req.msg = msg;
+	err = devlink->ops->info_get(devlink, &req, extack);
+	if (err)
+		goto err_cancel_msg;
+
+	genlmsg_end(msg, hdr);
+	return 0;
+
+err_cancel_msg:
+	genlmsg_cancel(msg, hdr);
+	return err;
+}
+
+static int devlink_nl_cmd_info_get_doit(struct sk_buff *skb,
+					struct genl_info *info)
+{
+	struct devlink *devlink = info->user_ptr[0];
+	struct sk_buff *msg;
+	int err;
+
+	if (!devlink->ops || !devlink->ops->info_get)
+		return -EOPNOTSUPP;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	err = devlink_nl_info_fill(msg, devlink, DEVLINK_CMD_INFO_GET,
+				   info->snd_portid, info->snd_seq, 0,
+				   info->extack);
+	if (err) {
+		nlmsg_free(msg);
+		return err;
+	}
+
+	return genlmsg_reply(msg, info);
+}
+
+static int devlink_nl_cmd_info_get_dumpit(struct sk_buff *msg,
+					  struct netlink_callback *cb)
+{
+	struct devlink *devlink;
+	int start = cb->args[0];
+	int idx = 0;
+	int err;
+
+	mutex_lock(&devlink_mutex);
+	list_for_each_entry(devlink, &devlink_list, list) {
+		if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+			continue;
+		if (idx < start) {
+			idx++;
+			continue;
+		}
+
+		mutex_lock(&devlink->lock);
+		err = devlink_nl_info_fill(msg, devlink, DEVLINK_CMD_INFO_GET,
+					   NETLINK_CB(cb->skb).portid,
+					   cb->nlh->nlmsg_seq, NLM_F_MULTI,
+					   cb->extack);
+		mutex_unlock(&devlink->lock);
+		if (err)
+			break;
+		idx++;
+	}
+	mutex_unlock(&devlink_mutex);
+
+	cb->args[0] = idx;
+	return msg->len;
+}
+
 static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING },
@@ -3974,6 +4078,14 @@ static const struct genl_ops devlink_nl_ops[] = {
 		.flags = GENL_ADMIN_PERM,
 		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
 	},
+	{
+		.cmd = DEVLINK_CMD_INFO_GET,
+		.doit = devlink_nl_cmd_info_get_doit,
+		.dumpit = devlink_nl_cmd_info_get_dumpit,
+		.policy = devlink_nl_policy,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
+		/* can be retrieved by unprivileged users */
+	},
 };
 
 static struct genl_family devlink_nl_family __ro_after_init = {
-- 
2.19.2
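
As a rough sketch of how a driver might fill in the request from its info_get
callback, here is a userspace mock. Everything in it is an assumption made for
illustration: struct devlink_info_req is replaced by a plain text buffer
(the kernel version wraps an skb), mock_put_string() stands in for
nla_put_string(), the "driver"/"serial_number" keys are made up, and
foo_info_get() is a hypothetical callback that drops the devlink and extack
arguments.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for the opaque request; the kernel one wraps an skb. */
struct devlink_info_req {
	char buf[128];
	size_t len;
};

/* Stand-in for nla_put_string(): append "key=value\n", -EMSGSIZE if full. */
static int mock_put_string(struct devlink_info_req *req, const char *key,
			   const char *val)
{
	size_t room = sizeof(req->buf) - req->len;
	int n = snprintf(req->buf + req->len, room, "%s=%s\n", key, val);

	if (n < 0 || (size_t)n >= room) {
		req->buf[req->len] = '\0';	/* drop the partial write */
		return -EMSGSIZE;
	}
	req->len += (size_t)n;
	return 0;
}

int devlink_info_driver_name_put(struct devlink_info_req *req, const char *name)
{
	return mock_put_string(req, "driver", name);
}

int devlink_info_serial_number_put(struct devlink_info_req *req, const char *sn)
{
	return mock_put_string(req, "serial_number", sn);
}

/* Hypothetical driver callback: report the name first, then the serial,
 * and propagate the first error so the caller can cancel the message.
 */
static int foo_info_get(struct devlink_info_req *req)
{
	int err;

	err = devlink_info_driver_name_put(req, "foo");
	if (err)
		return err;
	return devlink_info_serial_number_put(req, "SN0123456789");
}
```

Returning the first error unchanged matches how devlink_nl_info_fill() above
treats a non-zero return from info_get: it cancels the message and passes the
error on.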



* Re: [PATCH bpf-next v3 1/3] libbpf: move pr_*() functions to common header file
From: Alexei Starovoitov @ 2019-01-31 18:52 UTC (permalink / raw)
  To: Magnus Karlsson
  Cc: bjorn.topel, ast, daniel, netdev, jakub.kicinski, bjorn.topel,
	qi.z.zhang, brouer, acme, yhs
In-Reply-To: <1548774737-16579-2-git-send-email-magnus.karlsson@intel.com>

On Tue, Jan 29, 2019 at 04:12:15PM +0100, Magnus Karlsson wrote:
> Move the pr_*() functions in libbpf.c to a common header file called
> libbpf_internal.h. This so that the later libbpf AF_XDP helper library
> code in xsk.c can use these printing functions too.
> 
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> ---
>  tools/lib/bpf/libbpf.c          | 30 +-----------------------------
>  tools/lib/bpf/libbpf_internal.h | 41 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 42 insertions(+), 29 deletions(-)
>  create mode 100644 tools/lib/bpf/libbpf_internal.h
> 
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 2ccde17..1d7fe26 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -39,6 +39,7 @@
>  #include <gelf.h>
>  
>  #include "libbpf.h"
> +#include "libbpf_internal.h"
>  #include "bpf.h"
>  #include "btf.h"
>  #include "str_error.h"
> @@ -51,34 +52,6 @@
>  #define BPF_FS_MAGIC		0xcafe4a11
>  #endif
>  
> -#define __printf(a, b)	__attribute__((format(printf, a, b)))
> -
> -__printf(1, 2)
> -static int __base_pr(const char *format, ...)
> -{
> -	va_list args;
> -	int err;
> -
> -	va_start(args, format);
> -	err = vfprintf(stderr, format, args);
> -	va_end(args);
> -	return err;
> -}
> -
> -static __printf(1, 2) libbpf_print_fn_t __pr_warning = __base_pr;
> -static __printf(1, 2) libbpf_print_fn_t __pr_info = __base_pr;
> -static __printf(1, 2) libbpf_print_fn_t __pr_debug;
> -
> -#define __pr(func, fmt, ...)	\
> -do {				\
> -	if ((func))		\
> -		(func)("libbpf: " fmt, ##__VA_ARGS__); \
> -} while (0)
> -
> -#define pr_warning(fmt, ...)	__pr(__pr_warning, fmt, ##__VA_ARGS__)
> -#define pr_info(fmt, ...)	__pr(__pr_info, fmt, ##__VA_ARGS__)
> -#define pr_debug(fmt, ...)	__pr(__pr_debug, fmt, ##__VA_ARGS__)

Since these funcs are about to be used more widely,
let's clean up this API while we still can.
How about we convert it to a single pr_log callback function
with a verbosity flag instead of three callbacks?
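
One possible shape of such an interface, as a sketch only. The names
(enum pr_level, pr_log_fn_t, set_pr_log, pr_log) are hypothetical and are not
taken from the patch or from libbpf; the idea is simply one callback that
receives the level, with the existing pr_*() macros layered on top.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical verbosity levels replacing the three separate callbacks. */
enum pr_level { PR_WARN, PR_INFO, PR_DEBUG };

typedef int (*pr_log_fn_t)(enum pr_level level, const char *format,
			   va_list args);

/* Default: print warnings and info to stderr, drop debug output
 * (mirrors __pr_debug being unset by default in the quoted code).
 */
static int default_pr_log(enum pr_level level, const char *format,
			  va_list args)
{
	if (level == PR_DEBUG)
		return 0;
	return vfprintf(stderr, format, args);
}

static pr_log_fn_t pr_log_cb = default_pr_log;

/* Install a new callback; NULL silences all output. */
static void set_pr_log(pr_log_fn_t fn)
{
	pr_log_cb = fn;
}

__attribute__((format(printf, 2, 3)))
static int pr_log(enum pr_level level, const char *format, ...)
{
	va_list args;
	int err;

	if (!pr_log_cb)
		return 0;
	va_start(args, format);
	err = pr_log_cb(level, format, args);
	va_end(args);
	return err;
}

#define pr_warning(fmt, ...)	pr_log(PR_WARN, "libbpf: " fmt, ##__VA_ARGS__)
#define pr_info(fmt, ...)	pr_log(PR_INFO, "libbpf: " fmt, ##__VA_ARGS__)
#define pr_debug(fmt, ...)	pr_log(PR_DEBUG, "libbpf: " fmt, ##__VA_ARGS__)
```

With this shape the callers of pr_warning()/pr_info()/pr_debug() stay
unchanged, and a user who wants debug output installs one callback instead of
three.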



* Re: bpf memory model. Was: [PATCH v4 bpf-next 1/9] bpf: introduce bpf_spin_lock
From: Alexei Starovoitov @ 2019-01-31 18:47 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Will Deacon, Peter Zijlstra, Alexei Starovoitov, davem, daniel,
	jakub.kicinski, netdev, kernel-team, mingo, jannh
In-Reply-To: <20190131140156.GF4240@linux.ibm.com>

On Thu, Jan 31, 2019 at 06:01:56AM -0800, Paul E. McKenney wrote:
> On Wed, Jan 30, 2019 at 02:57:43PM -0800, Alexei Starovoitov wrote:
> > On Wed, Jan 30, 2019 at 01:05:36PM -0800, Paul E. McKenney wrote:
> > > On Wed, Jan 30, 2019 at 11:51:14AM -0800, Alexei Starovoitov wrote:
> > > > On Wed, Jan 30, 2019 at 10:36:18AM -0800, Paul E. McKenney wrote:
> > > > > On Wed, Jan 30, 2019 at 06:11:00PM +0000, Will Deacon wrote:
> > > > > > Hi Alexei,
> > > > > > 
> > > > > > On Mon, Jan 28, 2019 at 01:56:24PM -0800, Alexei Starovoitov wrote:
> > > > > > > On Mon, Jan 28, 2019 at 10:24:08AM +0100, Peter Zijlstra wrote:
> > > > > > > > On Fri, Jan 25, 2019 at 04:17:26PM -0800, Alexei Starovoitov wrote:
> > > > > > > > > What I want to avoid is to define the whole execution ordering model upfront.
> > > > > > > > > We cannot say that BPF ISA is weakly ordered like alpha.
> > > > > > > > > Most of the bpf progs are written and running on x86. We shouldn't
> > > > > > > > > twist bpf developer's arm by artificially relaxing memory model.
> > > > > > > > > BPF memory model is equal to memory model of underlying architecture.
> > > > > > > > > > What we can do is to make bpf progs a bit more portable with
> > > > > > > > > smp_rmb instructions, but we must not force weak execution on the developer.
> > > > > > > > 
> > > > > > > > Well, I agree with only introducing bits you actually need, and my
> > > > > > > > smp_rmb() example might have been poorly chosen, smp_load_acquire() /
> > > > > > > > smp_store_release() might have been a far more useful example.
> > > > > > > > 
> > > > > > > > But I disagree with the last part; we have to pick a model now;
> > > > > > > > > otherwise you'll paint yourself into a corner.
> > > > > > > > 
> > > > > > > > Also; Alpha isn't very relevant these days; however ARM64 does seem to
> > > > > > > > be gaining a lot of attention and that is very much a weak architecture.
> > > > > > > > Adding strongly ordered assumptions to BPF now, will penalize them in
> > > > > > > > the long run.
> > > > > > > 
> > > > > > > arm64 is gaining attention just like riscV is gaining it too.
> > > > > > > BPF jit for arm64 is very solid, while BPF jit for riscV is being worked on.
> > > > > > > BPF is not picking sides in CPU HW and ISA battles.
> > > > > > 
> > > > > > It's not about picking a side, it's about providing an abstraction of the
> > > > > > various CPU architectures out there so that the programmer doesn't need to
> > > > > > worry about where their program may run. Hell, even if you just said "eBPF
> > > > > > follows x86 semantics" that would be better than saying nothing (and then we
> > > > > > could have a discussion about whether x86 semantics are really what you
> > > > > > want).
> > > > > 
> > > > > To reinforce this point, the Linux-kernel memory model (tools/memory-model)
> > > > > is that abstraction for the Linux kernel.  Why not just use that for BPF?
> > > > 
> > > > I already answered this earlier in the thread.
> > > > tldr: not going to sacrifice performance.
> > > 
> > > Understood.
> > > 
> > > But can we at least say that where there are no performance consequences,
> > > BPF should follow LKMM?  You already mentioned smp_load_acquire()
> > > and smp_store_release(), but the void atomics (e.g., atomic_inc())
> > > should also work because they don't provide any ordering guarantees.
> > > The _relaxed(), _release(), and _acquire() variants of the value-returning
> > > atomics should be just fine as well.
> > > 
> > > The other value-returning atomics have strong ordering, which is fine
> > > on many systems, but potentially suboptimal for the weakly ordered ones.
> > > Though you have to have pretty good locality of reference to be able to
> > > see the difference, because otherwise cache-miss overhead dominates.
> > > 
> > > Things like cmpxchg() don't seem to fit BPF because they are normally
> > > used in spin loops, though there are some non-spinning use cases.
> > > 
> > > You correctly pointed out that READ_ONCE() and WRITE_ONCE() are suboptimal
> > > on systems that don't support all sizes of loads, but I bet that there
> > > are some sizes for which they are just fine across systems, for example,
> > > pointer size and int size.
> > > 
> > > Does that help?  Or am I missing additional cases where performance
> > > could be degraded?
> > 
> > bpf doesn't have smp_load_acquire, atomic_fetch_add, xchg, fence instructions.
> > They can be added step by step. That's easy.
> > I believe folks already started working on adding atomic_fetch_add.
> > What I have problem with is making a statement today that bpf's end
> > goal is LKMM. Even after adding all sorts of instructions it may
> > not be practical.
> > Only when real use case requires adding new instruction we do it.
> > Do you have a bpf program that needs smp_load_acquire ?
> 
> We seem to be talking past each other.  Let me try again...
> 
> I believe that if BPF adds a given concurrency feature, it should follow
> LKMM unless there is some specific problem with its doing so.
> 
> My paragraphs in my previous email list the concurrency features BPF
> could follow LKMM without penalty, should BPF choose to add them.
> 
> Does that help?

yeah. we're talking past each other indeed.
Doesn't look like more emails will help.
Let's resolve it either f2f during next conference or join our bi-weekly
bpf bluejeans call Wed 11am pacific.
Reminders and links are on this list
https://lists.iovisor.org/g/iovisor-dev/messages?p=created,0,,20,2,0,0
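
For readers following the acquire/release part of the quoted discussion, a
small userspace sketch of the message-passing pattern those primitives are
meant for. It uses C11 atomics as a rough analogue of smp_store_release() and
smp_load_acquire(); it is not BPF code, and the names payload, ready,
publish() and try_consume() are made up for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;		/* plain data, published via the flag */
static atomic_bool ready;	/* zero-initialized: false */

/* Producer: write the data, then release-store the flag
 * (roughly what smp_store_release() provides in the kernel).
 */
static void publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer: acquire-load the flag; if it is set, the payload write
 * that preceded the release store is guaranteed to be visible
 * (roughly smp_load_acquire()).
 */
static bool try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return false;
	*out = payload;
	return true;
}
```

On x86 both operations compile to plain loads and stores, while weakly
ordered CPUs need ordering instructions, which is where the portability
question raised in the thread comes from.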



* Re: [PATCH v6 bpf-next 1/9] bpf: introduce bpf_spin_lock
From: Alexei Starovoitov @ 2019-01-31 18:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Alexei Starovoitov, davem, daniel, jannh, netdev, kernel-team
In-Reply-To: <20190131104754.GA31516@hirez.programming.kicks-ass.net>

On Thu, Jan 31, 2019 at 11:47:54AM +0100, Peter Zijlstra wrote:
> > +	local_irq_save(flags);
> > +	__bpf_spin_lock(lock);
> > +	this_cpu_write(irqsave_flags, flags);
> 
> 	__this_cpu_write()

sure. will do



* Re: [PATCH v2 bpf-next] bpf: add optional memory accounting for maps
From: Alexei Starovoitov @ 2019-01-31 18:36 UTC (permalink / raw)
  To: Martynas Pumputis; +Cc: netdev, ys114321, ast, daniel, Yonghong Song
In-Reply-To: <20190131093801.32220-1-m@lambda.lt>

On Thu, Jan 31, 2019 at 10:38:01AM +0100, Martynas Pumputis wrote:
> Previously, memory allocated for a map was not accounted. Therefore,
> this memory could not be taken into consideration by the cgroups
> memory controller.
> 
> This patch introduces the "BPF_F_ACCOUNT_MEM" flag which enables
> the memory accounting for a map, and it can be set during
> the map creation ("BPF_MAP_CREATE") in "map_flags".
> 
> When enabled, we account only that amount of memory which is charged
> against the "RLIMIT_MEMLOCK" limit.
> 
> To validate the change, first we create the memory cgroup-v1 "test-map":
> 
>     # mkdir /sys/fs/cgroup/memory/test-map
> 
> And then we run the following program against the cgroup:
> 
>     $ cat test_map.c
>     <..>
>     int main() {
>         usleep(3 * 1000000);
>         assert(bpf_create_map(BPF_MAP_TYPE_HASH, 8, 16, 65536, 0) > 0);
>         usleep(3 * 1000000);
>     }
>     # cgexec -g memory:test-map ./test_map &
>     # cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
>     397312
>     258048
> 
>     <after 3 sec the map has been created>
> 
>     # bpftool map list
>     19: hash  flags 0x0
>         key 8B  value 16B  max_entries 65536  memlock 5771264B
>     # cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
>     401408
>     262144
> 
> As we can see, the memory allocated for the map is not accounted, as
> 397312B + 5771264B > 401408B.
> 
> Next, we enable the accounting and re-run the test:
> 
>     $ cat test_map.c
>     <..>
>     int main() {
>         usleep(3 * 1000000);
>         assert(bpf_create_map(BPF_MAP_TYPE_HASH, 8, 16, 65536, BPF_F_ACCOUNT_MEM) > 0);
>         usleep(3 * 1000000);
>     }
>     # cgexec -g memory:test-map ./test_map &
>     # cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
>     450560
>     307200
> 
>     <after 3 sec the map has been created>
> 
>     # bpftool map list
>     20: hash  flags 0x80
>         key 8B  value 16B  max_entries 65536  memlock 5771264B
>     # cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
>     6221824
>     6078464
> 
> This time, the memory (including kmem) is accounted, as
> 450560B + 5771264B <= 6221824B
> 
> Acked-by: Yonghong Song <yhs@fb.com>
> Signed-off-by: Martynas Pumputis <m@lambda.lt>

see my reply in other thread.



* Re: [PATCH bpf-next] bpf: add optional memory accounting for maps
From: Alexei Starovoitov @ 2019-01-31 18:35 UTC (permalink / raw)
  To: Martynas Pumputis; +Cc: netdev, ast, daniel
In-Reply-To: <20190130140251.23784-1-m@lambda.lt>

On Wed, Jan 30, 2019 at 03:02:51PM +0100, Martynas Pumputis wrote:
> Previously, memory allocated for a map was not accounted. Therefore,
> this memory could not be taken into consideration by the cgroups
> memory controller.
> 
> This patch introduces the "BPF_F_ACCOUNT_MEM" flag which enables
> the memory accounting for a map, and it can be set during
> the map creation ("BPF_MAP_CREATE") in "map_flags".
> 
> When enabled, we account only that amount of memory which is charged
> against the "RLIMIT_MEMLOCK" limit.
> 
> To validate the change, first we create the memory cgroup "test-map":
> 
>     # mkdir /sys/fs/cgroup/memory/test-map
> 
> And then we run the following program against the cgroup:
> 
>     $ cat test_map.c
>     <..>
>     int main() {
>         usleep(3 * 1000000);
>         assert(bpf_create_map(BPF_MAP_TYPE_HASH, 8, 16, 65536, 0) > 0);
>         usleep(3 * 1000000);
>     }
>     # cgexec -g memory:test-map ./test_map &
>     # cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
>     397312
>     258048
> 
>     <after 3 sec the map has been created>
> 
>     # bpftool map list
>     19: hash  flags 0x0
>         key 8B  value 16B  max_entries 65536  memlock 5771264B
>     # cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
>     401408
>     262144
> 
> As we can see, the memory allocated for the map is not accounted, as
> 397312B + 5771264B > 401408B.
> 
> Next, we enable the accounting and re-run the test:
> 
>     $ cat test_map.c
>     <..>
>     int main() {
>         usleep(3 * 1000000);
>         assert(bpf_create_map(BPF_MAP_TYPE_HASH, 8, 16, 65536, BPF_F_ACCOUNT_MEM) > 0);
>         usleep(3 * 1000000);
>     }
>     # cgexec -g memory:test-map ./test_map &
>     # cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
>     450560
>     307200
> 
>     <after 3 sec the map has been created>
> 
>     # bpftool map list
>     20: hash  flags 0x80
>         key 8B  value 16B  max_entries 65536  memlock 5771264B
>     # cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
>     6221824
>     6078464
> 
> This time, the memory (including kmem) is accounted, as
> 450560B + 5771264B <= 6221824B
> 
> Signed-off-by: Martynas Pumputis <m@lambda.lt>
...
> @@ -49,7 +51,9 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
>  
>  	err = -ENOMEM;
>  
> -	m->flush_list = alloc_percpu(struct list_head);
> +	if (account_mem)
> +		gfp |= __GFP_ACCOUNT;
> +	m->flush_list = alloc_percpu_gfp(struct list_head, gfp);

I think it's better to account this memory by default.
Extra flag during map creation is not needed.
There are nokmem and nosocket memcg boot options.
We can add one more to turn off accounting of bpf map memory.
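
The validation arithmetic from the quoted commit message can be written down
as a small check. This is only a sketch of that comparison; the helper name
map_mem_accounted() is hypothetical, and the numbers in the usage note are the
ones reported in the quoted test runs.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the cgroup's memory usage grew by at least the map's memlock
 * size between the two readings, i.e. before + memlock <= after.
 */
static bool map_mem_accounted(uint64_t usage_before, uint64_t usage_after,
			      uint64_t memlock)
{
	return usage_after >= usage_before &&
	       usage_after - usage_before >= memlock;
}
```

With the quoted readings, map_mem_accounted(397312, 401408, 5771264) is false
(usage grew by only 4096 bytes), while
map_mem_accounted(450560, 6221824, 5771264) is true (usage grew by exactly
5771264 bytes).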



* Re: r8169 Driver - Poor Network Performance Since Kernel 4.19
From: Heiner Kallweit @ 2019-01-31 18:28 UTC (permalink / raw)
  To: Peter Ceiley, David Chang; +Cc: Realtek linux nic maintainers, netdev
In-Reply-To: <CAMLO_R5m+tFa2yzeMbacROrFirwWN+zCUVAbDd864RHVMNe08Q@mail.gmail.com>

Thanks for testing, Peter!
So we have an ASPM-related issue indeed. I'm aware that there are certain
incompatibilities between board chipsets and network chip versions
(although it's not known which combinations are affected).
And we don't know whether it's a hardware or BIOS issue.

Older driver versions dealt with this by simply disabling ASPM in general.
As a result all systems with a supported Realtek chip didn't reach higher
package power-saving states, resulting in significantly reduced battery
lifetime on notebooks.
The network driver has no stake in dealing with the ASPM policies, this
is handled by lower PCI layers.

Unfortunately we can't detect ASPM incompatibilities at runtime. Maybe
we could build some heuristics based on rx_missed percentage, but it's
not clear that ASPM issues always show the same symptoms.

So for now people with affected systems have to set a proper
pcie_aspm.policy parameter.
What is not clear to me is why pcie_aspm=off doesn't help.

@David:
I assume you'll check with the affected user to test the ASPM policy
parameter.

Heiner
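
To confirm which ASPM policy is actually in effect after booting with a
pcie_aspm.policy= setting, the active policy can be read back from sysfs.
A minimal sketch, assuming the policy file lives at
/sys/module/pcie_aspm/parameters/policy and marks the active entry with
square brackets (for example "[default] performance powersave
powersupersave"); both helper names are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Assumed location of the ASPM policy parameter in sysfs. */
#define ASPM_POLICY_PATH "/sys/module/pcie_aspm/parameters/policy"

/* Copy the bracketed (active) entry of a policy line into out.
 * Returns 0 on success, -1 if no "[...]" entry is found or out is too small.
 */
static int aspm_active_policy(const char *line, char *out, size_t outlen)
{
	const char *start = strchr(line, '[');
	const char *end = start ? strchr(start, ']') : NULL;
	size_t len;

	if (!end)
		return -1;
	len = (size_t)(end - start - 1);
	if (len + 1 > outlen)
		return -1;
	memcpy(out, start + 1, len);
	out[len] = '\0';
	return 0;
}

/* Read the policy file and report the active policy; -1 on any failure. */
static int aspm_read_active_policy(char *out, size_t outlen)
{
	char line[128];
	FILE *f = fopen(ASPM_POLICY_PATH, "r");
	int ok;

	if (!f)
		return -1;
	ok = fgets(line, sizeof(line), f) != NULL;
	fclose(f);
	return ok ? aspm_active_policy(line, out, outlen) : -1;
}
```

Comparing the reported policy across the settings listed in the quoted test
results would show whether the kernel parameter was picked up.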


On 31.01.2019 13:09, Peter Ceiley wrote:
> Hi Heiner,
> 
> A quick update on my testing with different pcie_aspm settings:
> 
> pcie_aspm=off | no change
> pcie_aspm.policy=default | no change
> pcie_aspm.policy=performance | issue resolved
> pcie_aspm.policy=powersave | issue resolved
> pcie_aspm.policy=powersupersave | issue resolved
> 
> It seems the new driver does not play nicely with the default ASPM policy.
> 
> As requested, I've included an output of ethtool below when experiencing
> the issue - note that no errors are recorded.
> 
> # ethtool -S enp3s0
> NIC statistics:
>      tx_packets: 2749
>      rx_packets: 4089
>      tx_errors: 0
>      rx_errors: 0
>      rx_missed: 0
>      align_errors: 0
>      tx_single_collisions: 0
>      tx_multi_collisions: 0
>      unicast: 4078
>      broadcast: 9
>      multicast: 2
>      tx_aborted: 0
>      tx_underrun: 0
> 
> David, I hope this helps for your user as well. I appreciate you sharing
> the bug ticket - thanks.
> 
> Heiner, thanks very much for your help to date.
> 
> Regards,
> 
> Peter.
> 
> On Thu, 31 Jan 2019 at 18:23, David Chang <dchang@suse.com> wrote:
>>
>> Hi Heiner,
>>
>> On Jan 31, 2019 at 07:35:30 +0100, Heiner Kallweit wrote:
>>> Hi David, two more things:
>>>
>>> 1. Could you please test a recent linux-next kernel?
>>> 2. Please get a register dump (ethtool -d <if>) from 4.18 and 4.19
>>>    and compare them.
>>
>> I'm sorry that I do not have the issue machine handy. I would ask
>> our user to do the test. Thanks!
>>
>> Regards,
>> David
>>
>>>
>>> Heiner
>>>
>>>
>>> On 31.01.2019 07:21, Heiner Kallweit wrote:
>>>> David, thanks for the link to the bug ticket.
>>>> I think only a proper bisect can help to find the offending commit.
>>>>
>>>> Heiner
>>>>
>>>>
>>>> On 31.01.2019 03:32, David Chang wrote:
>>>>> Hi,
>>>>>
>>>>> We had a similar case here.
>>>>> - Realtek r8169 receive performance regression in kernel 4.19
>>>>>   https://bugzilla.suse.com/show_bug.cgi?id=1119649
>>>>>
>>>>> kernel: r8169 0000:01:00.0 eth0: RTL8168h/8111h, XID 54100880
>>>>> The major symptom is a high rx_missed count.
>>>>>
>>>>>
>>>>> On Jan 30, 2019 at 20:15:45 +0100, Heiner Kallweit wrote:
>>>>>> Hi Peter,
>>>>>>
>>>>>> recently I had somebody where pcie_aspm=off for whatever reason didn't
>>>>>> do the trick, can you also check with pcie_aspm.policy=performance.
>>>>>
>>>>> We will give it a try later.
>>>>>
>>>>>> And please check with "ethtool -S <if>" whether the chip statistics
>>>>>> show a significant number of errors.
>>>>>>
>>>>>> If this doesn't help you may have to bisect to find the offending commit.
>>>>>
>>>>> We tried falling the driver back to a few previous commits, as follows,
>>>>> but with no luck.
>>>>>
>>>>> 9675931e6b65 r8169: re-enable MSI-X on RTL8168g (v4.19)
>>>>> 098b01ad9837 r8169: don't include asm headers directly (v4.19-rc1)
>>>>> a2965f12fde6 r8169: remove rtl8169_set_speed_xmii (v4.19-rc1)
>>>>> 6fcf9b1d4d6c r8169: fix runtime suspend (v4.19-rc1)
>>>>> e397286b8e89 r8169: remove TBI 1000BaseX support (v4.19-rc1)
>>>>>
>>>>> Thanks,
>>>>> David Chang
>>>>>
>>>>>>
>>>>>> Heiner
>>>>>>
>>>>>>
>>>>>> On 30.01.2019 10:59, Peter Ceiley wrote:
>>>>>>> Hi Heiner,
>>>>>>>
>>>>>>> I tried disabling the ASPM using the pcie_aspm=off kernel parameter
>>>>>>> and this made no difference.
>>>>>>>
>>>>>>> I tried compiling the 4.18.16 r8169.c with the 4.19.18 source and
>>>>>>> subsequently loaded the module in the running 4.19.18 kernel. I can
>>>>>>> confirm that this immediately resolved the issue and access to the NFS
>>>>>>> shares operated as expected.
>>>>>>>
>>>>>>> I presume this means it is an issue with the r8169 driver included in
>>>>>>> 4.19 onwards?
>>>>>>>
>>>>>>> To answer your last questions:
>>>>>>>
>>>>>>> Base Board Information
>>>>>>>     Manufacturer: Alienware
>>>>>>>     Product Name: 0PGRP5
>>>>>>>     Version: A02
>>>>>>>
>>>>>>> ... and yes, the RTL8168 is the onboard network chip.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Peter.
>>>>>>>
>>>>>>> On Tue, 29 Jan 2019 at 17:44, Heiner Kallweit <hkallweit1@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi Peter,
>>>>>>>>
>>>>>>>> I think the vendor driver doesn't enable ASPM per default.
>>>>>>>> So it's worth a try to disable ASPM in the BIOS or via sysfs.
>>>>>>>> A few older systems seem to have issues with ASPM. What kind of
>>>>>>>> system / mainboard are you using? The RTL8168 is the onboard
>>>>>>>> network chip?
>>>>>>>>
>>>>>>>> Rgds, Heiner
>>>>>>>>
>>>>>>>>
>>>>>>>> On 29.01.2019 07:20, Peter Ceiley wrote:
>>>>>>>>> Hi Heiner,
>>>>>>>>>
>>>>>>>>> Thanks, I'll do some more testing. It might not be the driver - I
>>>>>>>>> assumed it was due to the fact that using the r8168 driver 'resolves'
>>>>>>>>> the issue. I'll see if I can test the r8169.c on top of 4.19 - this is
>>>>>>>>> a good idea.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> Peter.
>>>>>>>>>
>>>>>>>>> On Tue, 29 Jan 2019 at 17:16, Heiner Kallweit <hkallweit1@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Peter,
>>>>>>>>>>
>>>>>>>>>> at a first glance it doesn't look like a typical driver issue.
>>>>>>>>>> What you could do:
>>>>>>>>>>
>>>>>>>>>> - Test the r8169.c from 4.18 on top of 4.19.
>>>>>>>>>>
>>>>>>>>>> - Check whether disabling ASPM (/sys/module/pcie_aspm) has an effect.
>>>>>>>>>>
>>>>>>>>>> - Bisect between 4.18 and 4.19 to find the offending commit.
>>>>>>>>>>
>>>>>>>>>> Any specific reason why you think root cause is in the driver and not
>>>>>>>>>> elsewhere in the network subsystem?
>>>>>>>>>>
>>>>>>>>>> Heiner
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 28.01.2019 23:10, Peter Ceiley wrote:
>>>>>>>>>>> Hi Heiner,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for getting back to me.
>>>>>>>>>>>
>>>>>>>>>>> No, I don't use jumbo packets.
>>>>>>>>>>>
>>>>>>>>>>> Bandwidth is *generally* good, and iperf results to my NAS provide
>>>>>>>>>>> over 900 Mbits/s in both circumstances. The issue seems to appear when
>>>>>>>>>>> establishing a connection and is most notable, for example, on my
>>>>>>>>>>> mounted NFS shares where it takes seconds (up to 10's of seconds on
>>>>>>>>>>> larger directories) to list the contents of each directory. Once a
>>>>>>>>>>> transfer begins on a file, I appear to get good bandwidth.
>>>>>>>>>>>
>>>>>>>>>>> I'm unsure of the best scientific data to provide you in order to
>>>>>>>>>>> troubleshoot this issue. Running the following
>>>>>>>>>>>
>>>>>>>>>>>     netstat -s |grep retransmitted
>>>>>>>>>>>
>>>>>>>>>>> shows a steady increase in retransmitted segments each time I list the
>>>>>>>>>>> contents of a remote directory, for example, running 'ls' on a
>>>>>>>>>>> directory containing 345 media files did the following using kernel
>>>>>>>>>>> 4.19.18:
>>>>>>>>>>>
>>>>>>>>>>> increased retransmitted segments by 21 and the 'time' command showed
>>>>>>>>>>> the following:
>>>>>>>>>>>     real    0m19.867s
>>>>>>>>>>>     user    0m0.012s
>>>>>>>>>>>     sys    0m0.036s
>>>>>>>>>>>
>>>>>>>>>>> The same command shows no retransmitted segments running kernel
>>>>>>>>>>> 4.18.16 and 'time' showed:
>>>>>>>>>>>     real    0m0.300s
>>>>>>>>>>>     user    0m0.004s
>>>>>>>>>>>     sys    0m0.007s
>>>>>>>>>>>
>>>>>>>>>>> ifconfig does not show any RX/TX errors nor dropped packets in either case.
>>>>>>>>>>>
>>>>>>>>>>> dmesg XID:
>>>>>>>>>>> [    2.979984] r8169 0000:03:00.0 eth0: RTL8168g/8111g,
>>>>>>>>>>> f8:b1:56:fe:67:e0, XID 4c000800, IRQ 32
>>>>>>>>>>>
>>>>>>>>>>> # lspci -vv
>>>>>>>>>>> 03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
>>>>>>>>>>> RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
>>>>>>>>>>>     Subsystem: Dell RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
>>>>>>>>>>>     Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>>>>>>>>>>> ParErr- Stepping- SERR- FastB2B- DisINTx+
>>>>>>>>>>>     Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>>>>>>>>>>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>>>>>>>>>>     Latency: 0, Cache Line Size: 64 bytes
>>>>>>>>>>>     Interrupt: pin A routed to IRQ 19
>>>>>>>>>>>     Region 0: I/O ports at d000 [size=256]
>>>>>>>>>>>     Region 2: Memory at f7b00000 (64-bit, non-prefetchable) [size=4K]
>>>>>>>>>>>     Region 4: Memory at f2100000 (64-bit, prefetchable) [size=16K]
>>>>>>>>>>>     Capabilities: [40] Power Management version 3
>>>>>>>>>>>         Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA
>>>>>>>>>>> PME(D0+,D1+,D2+,D3hot+,D3cold+)
>>>>>>>>>>>         Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>>>>>>>>>>>     Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
>>>>>>>>>>>         Address: 0000000000000000  Data: 0000
>>>>>>>>>>>     Capabilities: [70] Express (v2) Endpoint, MSI 01
>>>>>>>>>>>         DevCap:    MaxPayload 128 bytes, PhantFunc 0, Latency L0s
>>>>>>>>>>> <512ns, L1 <64us
>>>>>>>>>>>             ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>>>>>>>>>>> SlotPowerLimit 10.000W
>>>>>>>>>>>         DevCtl:    CorrErr- NonFatalErr- FatalErr- UnsupReq-
>>>>>>>>>>>             RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
>>>>>>>>>>>             MaxPayload 128 bytes, MaxReadReq 4096 bytes
>>>>>>>>>>>         DevSta:    CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
>>>>>>>>>>>         LnkCap:    Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit
>>>>>>>>>>> Latency L0s unlimited, L1 <64us
>>>>>>>>>>>             ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
>>>>>>>>>>>         LnkCtl:    ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
>>>>>>>>>>>             ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
>>>>>>>>>>>         LnkSta:    Speed 2.5GT/s (ok), Width x1 (ok)
>>>>>>>>>>>             TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>>>>>>>>>>>         DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+,
>>>>>>>>>>> OBFF Via message/WAKE#
>>>>>>>>>>>              AtomicOpsCap: 32bit- 64bit- 128bitCAS-
>>>>>>>>>>>         DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+,
>>>>>>>>>>> OBFF Disabled
>>>>>>>>>>>              AtomicOpsCtl: ReqEn-
>>>>>>>>>>>         LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
>>>>>>>>>>>              Transmit Margin: Normal Operating Range,
>>>>>>>>>>> EnterModifiedCompliance- ComplianceSOS-
>>>>>>>>>>>              Compliance De-emphasis: -6dB
>>>>>>>>>>>         LnkSta2: Current De-emphasis Level: -6dB,
>>>>>>>>>>> EqualizationComplete-, EqualizationPhase1-
>>>>>>>>>>>              EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>>>>>>>>>>>     Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
>>>>>>>>>>>         Vector table: BAR=4 offset=00000000
>>>>>>>>>>>         PBA: BAR=4 offset=00000800
>>>>>>>>>>>     Capabilities: [d0] Vital Product Data
>>>>>>>>>>> pcilib: sysfs_read_vpd: read failed: Input/output error
>>>>>>>>>>>         Not readable
>>>>>>>>>>>     Capabilities: [100 v1] Advanced Error Reporting
>>>>>>>>>>>         UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
>>>>>>>>>>> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>>>>>>>>>>         UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
>>>>>>>>>>> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>>>>>>>>>>         UESvrt:    DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
>>>>>>>>>>> RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>>>>>>>>>>>         CESta:    RxErr+ BadTLP+ BadDLLP+ Rollover- Timeout+ AdvNonFatalErr-
>>>>>>>>>>>         CEMsk:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
>>>>>>>>>>>         AERCap:    First Error Pointer: 00, ECRCGenCap+ ECRCGenEn-
>>>>>>>>>>> ECRCChkCap+ ECRCChkEn-
>>>>>>>>>>>             MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
>>>>>>>>>>>         HeaderLog: 00000000 00000000 00000000 00000000
>>>>>>>>>>>     Capabilities: [140 v1] Virtual Channel
>>>>>>>>>>>         Caps:    LPEVC=0 RefClk=100ns PATEntryBits=1
>>>>>>>>>>>         Arb:    Fixed- WRR32- WRR64- WRR128-
>>>>>>>>>>>         Ctrl:    ArbSelect=Fixed
>>>>>>>>>>>         Status:    InProgress-
>>>>>>>>>>>         VC0:    Caps:    PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
>>>>>>>>>>>             Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
>>>>>>>>>>>             Ctrl:    Enable+ ID=0 ArbSelect=Fixed TC/VC=01
>>>>>>>>>>>             Status:    NegoPending- InProgress-
>>>>>>>>>>>     Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00
>>>>>>>>>>>     Capabilities: [170 v1] Latency Tolerance Reporting
>>>>>>>>>>>         Max snoop latency: 71680ns
>>>>>>>>>>>         Max no snoop latency: 71680ns
>>>>>>>>>>>     Kernel driver in use: r8169
>>>>>>>>>>>     Kernel modules: r8169
>>>>>>>>>>>
>>>>>>>>>>> Please let me know if you have any other ideas in terms of testing.
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>>
>>>>>>>>>>> Peter.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, 29 Jan 2019 at 05:28, Heiner Kallweit <hkallweit1@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 28.01.2019 12:13, Peter Ceiley wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have been experiencing very poor network performance since Kernel
>>>>>>>>>>>>> 4.19 and I'm confident it's related to the r8169 driver.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have no issue with kernel versions 4.18 and prior. I am experiencing
>>>>>>>>>>>>> this issue in kernels 4.19 and 4.20 (currently running/testing with
>>>>>>>>>>>>> 4.20.4 & 4.19.18).
>>>>>>>>>>>>>
>>>>>>>>>>>>> If someone could guide me in the right direction, I'm happy to help
>>>>>>>>>>>>> troubleshoot this issue. Note that I have been keeping an eye on one
>>>>>>>>>>>>> issue related to loading of the PHY driver, however, my symptoms
>>>>>>>>>>>>> differ in that I still have a network connection. I have attempted to
>>>>>>>>>>>>> reload the driver on a running system, but this does not improve the
>>>>>>>>>>>>> situation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Using the proprietary r8168 driver returns my device to proper working order.
>>>>>>>>>>>>>
>>>>>>>>>>>>> lshw shows:
>>>>>>>>>>>>>        description: Ethernet interface
>>>>>>>>>>>>>        product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
>>>>>>>>>>>>>        vendor: Realtek Semiconductor Co., Ltd.
>>>>>>>>>>>>>        physical id: 0
>>>>>>>>>>>>>        bus info: pci@0000:03:00.0
>>>>>>>>>>>>>        logical name: enp3s0
>>>>>>>>>>>>>        version: 0c
>>>>>>>>>>>>>        serial:
>>>>>>>>>>>>>        size: 1Gbit/s
>>>>>>>>>>>>>        capacity: 1Gbit/s
>>>>>>>>>>>>>        width: 64 bits
>>>>>>>>>>>>>        clock: 33MHz
>>>>>>>>>>>>>        capabilities: pm msi pciexpress msix vpd bus_master cap_list
>>>>>>>>>>>>> ethernet physical tp aui bnc mii fibre 10bt 10bt-fd 100bt 100bt-fd
>>>>>>>>>>>>> 1000bt-fd autonegotiation
>>>>>>>>>>>>>        configuration: autonegotiation=on broadcast=yes driver=r8169
>>>>>>>>>>>>> duplex=full firmware=rtl8168g-2_0.0.1 02/06/13 ip=192.168.1.25
>>>>>>>>>>>>> latency=0 link=yes multicast=yes port=MII speed=1Gbit/s
>>>>>>>>>>>>>        resources: irq:19 ioport:d000(size=256)
>>>>>>>>>>>>> memory:f7b00000-f7b00fff memory:f2100000-f2103fff
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kind Regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Peter.
>>>>>>>>>>>>>
>>>>>>>>>>>> Hi Peter,
>>>>>>>>>>>>
>>>>>>>>>>>> the description "poor network performance" is quite vague, therefore:
>>>>>>>>>>>>
>>>>>>>>>>>> - Can you provide any measurements?
>>>>>>>>>>>> - iperf results before and after
>>>>>>>>>>>> - statistics about dropped packets (rx and/or tx)
>>>>>>>>>>>> - Do you use jumbo packets?
>>>>>>>>>>>>
>>>>>>>>>>>> Also helpful would be the "lspci -vv" output for the network card
>>>>>>>>>>>> and the dmesg output line with the chip XID.
>>>>>>>>>>>>
>>>>>>>>>>>> Heiner


^ permalink raw reply

* [PATCH iproute2-next] tc: add 'kind' property to 'csum' action
From: Davide Caratti @ 2019-01-31 17:58 UTC (permalink / raw)
  To: David Ahern, Stephen Hemminger; +Cc: netdev

Unlike other TC actions that already support JSON printout, 'csum' does
not print the value of TCA_KIND in the 'kind' property. Remove the word
'csum' from the 'csum' property and add a separate 'kind' property
containing the action name. The human-readable printout is preserved.

Tested with:
 # ./tdc.py -c csum

Cc: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 tc/m_csum.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tc/m_csum.c b/tc/m_csum.c
index 752269d1d020..84396d6a482d 100644
--- a/tc/m_csum.c
+++ b/tc/m_csum.c
@@ -199,10 +199,11 @@ print_csum(struct action_util *au, FILE *f, struct rtattr *arg)
 		uflag_1 = "?empty";
 	}
 
+	print_string(PRINT_ANY, "kind", "%s ", "csum");
 	snprintf(buf, sizeof(buf), "%s%s%s%s%s%s%s",
 		 uflag_1, uflag_2, uflag_3,
 		 uflag_4, uflag_5, uflag_6, uflag_7);
-	print_string(PRINT_ANY, "csum", "csum (%s) ", buf);
+	print_string(PRINT_ANY, "csum", "(%s) ", buf);
 
 	print_action_control(f, "action ", sel->action, "\n");
 	print_uint(PRINT_ANY, "index", "\tindex %u", sel->index);
-- 
2.20.1


^ permalink raw reply related

* [PATCH iproute2-next] tc: full JSON support for 'bpf' actions
From: Davide Caratti @ 2019-01-31 17:58 UTC (permalink / raw)
  To: David Ahern, Stephen Hemminger; +Cc: netdev

Add full JSON output support in the dump of 'act_bpf'.

Example using eBPF:

 # tc actions flush action bpf
 # tc action add action bpf object bpf/action.o section 'action-ok'
 # tc -j action list action bpf | jq
 [
   {
     "total acts": 1
   },
   {
     "actions": [
       {
         "order": 0,
         "kind": "bpf",
         "bpf_name": "action.o:[action-ok]",
         "prog": {
           "id": 33,
           "tag": "a04f5eef06a7f555",
           "jited": 1
         },
         "control_action": {
           "type": "pipe"
         },
         "index": 1,
         "ref": 1,
         "bind": 0
       }
     ]
   }
 ]

Example using cBPF:

 # tc actions flush action bpf
 # a=$(mktemp)
 # tcpdump -ddd not ether proto 0x888e >$a
 # tc action add action bpf bytecode-file $a index 42
 # rm $a
 # tc -j action list action bpf | jq
 [
   {
     "total acts": 1
   },
   {
     "actions": [
       {
         "order": 0,
         "kind": "bpf",
         "bytecode": {
           "length": 4,
           "insns": [
             {
               "code": 40,
               "jt": 0,
               "jf": 0,
               "k": 12
             },
             {
               "code": 21,
               "jt": 0,
               "jf": 1,
               "k": 34958
             },
             {
               "code": 6,
               "jt": 0,
               "jf": 0,
               "k": 0
             },
             {
               "code": 6,
               "jt": 0,
               "jf": 0,
               "k": 262144
             }
           ]
         },
         "control_action": {
           "type": "pipe"
         },
         "index": 42,
         "ref": 1,
         "bind": 0
       }
     ]
   }
 ]
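
For comparison with the JSON above, the legacy (non-JSON) text form of
the same cBPF program can be sketched as below. This is only a userspace
illustration, not the iproute2 code: struct demo_insn and
demo_format_ops() are hypothetical names, and the output goes to a
caller-supplied buffer rather than through the print_*() helpers.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct sock_filter (code, jt, jf, k). */
struct demo_insn {
	uint16_t code;
	uint8_t jt;
	uint8_t jf;
	uint32_t k;
};

/*
 * Format 'len' instructions as "bytecode 'len,code jt jf k,...'".
 * Returns the number of characters written, 0 for an empty program,
 * or -1 if the buffer is too small.
 */
static int demo_format_ops(char *buf, size_t size,
			   const struct demo_insn *ops, unsigned int len)
{
	size_t off;
	unsigned int i;
	int n;

	if (len == 0) {
		if (size)
			buf[0] = '\0';
		return 0;
	}

	n = snprintf(buf, size, "bytecode '%u,", len);
	if (n < 0 || (size_t)n >= size)
		return -1;
	off = (size_t)n;

	for (i = 0; i < len; i++) {
		/* The last instruction is closed with a quote, the others with a comma. */
		n = snprintf(buf + off, size - off, "%u %u %u %u%c",
			     (unsigned int)ops[i].code,
			     (unsigned int)ops[i].jt,
			     (unsigned int)ops[i].jf,
			     (unsigned int)ops[i].k,
			     i == len - 1 ? '\'' : ',');
		if (n < 0 || (size_t)n >= size - off)
			return -1;
		off += (size_t)n;
	}

	return (int)off;
}
```

Feeding it the four instructions from the cBPF example yields a single
line starting with "bytecode '4," followed by the comma-separated
"code jt jf k" tuples.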

Tested with:
 # ./tdc.py -c bpf

Cc: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/bpf_util.h |  2 +-
 lib/bpf.c          | 26 ++++++++++++++++++--------
 tc/f_bpf.c         |  2 +-
 tc/m_bpf.c         | 32 +++++++++++++++++---------------
 4 files changed, 37 insertions(+), 25 deletions(-)

diff --git a/include/bpf_util.h b/include/bpf_util.h
index 63837a04e56f..63db07ca49ae 100644
--- a/include/bpf_util.h
+++ b/include/bpf_util.h
@@ -272,7 +272,7 @@ const char *bpf_prog_to_default_section(enum bpf_prog_type type);
 int bpf_graft_map(const char *map_path, uint32_t *key, int argc, char **argv);
 int bpf_trace_pipe(void);
 
-void bpf_print_ops(FILE *f, struct rtattr *bpf_ops, __u16 len);
+void bpf_print_ops(struct rtattr *bpf_ops, __u16 len);
 
 int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns,
 		  size_t size_insns, const char *license, char *log,
diff --git a/lib/bpf.c b/lib/bpf.c
index 5e85cfc0bdd5..dfc4f4f522c3 100644
--- a/lib/bpf.c
+++ b/lib/bpf.c
@@ -339,7 +339,7 @@ out:
 	return ret;
 }
 
-void bpf_print_ops(FILE *f, struct rtattr *bpf_ops, __u16 len)
+void bpf_print_ops(struct rtattr *bpf_ops, __u16 len)
 {
 	struct sock_filter *ops = RTA_DATA(bpf_ops);
 	int i;
@@ -347,14 +347,24 @@ void bpf_print_ops(FILE *f, struct rtattr *bpf_ops, __u16 len)
 	if (len == 0)
 		return;
 
-	fprintf(f, "bytecode \'%u,", len);
-
-	for (i = 0; i < len - 1; i++)
-		fprintf(f, "%hu %hhu %hhu %u,", ops[i].code, ops[i].jt,
-			ops[i].jf, ops[i].k);
+	open_json_object("bytecode");
+	print_uint(PRINT_ANY, "length", "bytecode \'%u,", len);
+	open_json_array(PRINT_JSON, "insns");
+
+	for (i = 0; i < len; i++) {
+		open_json_object(NULL);
+		print_uint(PRINT_ANY, "code", "%hu ", ops[i].code);
+		print_uint(PRINT_ANY, "jt", "%hhu ", ops[i].jt);
+		print_uint(PRINT_ANY, "jf", "%hhu ", ops[i].jf);
+		if (i == len - 1)
+			print_uint(PRINT_ANY, "k", "%u\'", ops[i].k);
+		else
+			print_uint(PRINT_ANY, "k", "%u,", ops[i].k);
+		close_json_object();
+	}
 
-	fprintf(f, "%hu %hhu %hhu %u\'", ops[i].code, ops[i].jt,
-		ops[i].jf, ops[i].k);
+	close_json_array(PRINT_JSON, NULL);
+	close_json_object();
 }
 
 static void bpf_map_pin_report(const struct bpf_elf_map *pin,
diff --git a/tc/f_bpf.c b/tc/f_bpf.c
index 5906f8bb969d..948d9051b9a5 100644
--- a/tc/f_bpf.c
+++ b/tc/f_bpf.c
@@ -235,7 +235,7 @@ static int bpf_print_opt(struct filter_util *qu, FILE *f,
 	}
 
 	if (tb[TCA_BPF_OPS] && tb[TCA_BPF_OPS_LEN])
-		bpf_print_ops(f, tb[TCA_BPF_OPS],
+		bpf_print_ops(tb[TCA_BPF_OPS],
 			      rta_getattr_u16(tb[TCA_BPF_OPS_LEN]));
 
 	if (tb[TCA_BPF_ID])
diff --git a/tc/m_bpf.c b/tc/m_bpf.c
index 7c6f8c298abd..3e8468c68324 100644
--- a/tc/m_bpf.c
+++ b/tc/m_bpf.c
@@ -157,7 +157,7 @@ static int bpf_print_opt(struct action_util *au, FILE *f, struct rtattr *arg)
 {
 	struct rtattr *tb[TCA_ACT_BPF_MAX + 1];
 	struct tc_act_bpf *parm;
-	int dump_ok = 0;
+	int d_ok = 0;
 
 	if (arg == NULL)
 		return -1;
@@ -170,31 +170,33 @@ static int bpf_print_opt(struct action_util *au, FILE *f, struct rtattr *arg)
 	}
 
 	parm = RTA_DATA(tb[TCA_ACT_BPF_PARMS]);
-	fprintf(f, "bpf ");
+	print_string(PRINT_ANY, "kind", "%s ", "bpf");
 
 	if (tb[TCA_ACT_BPF_NAME])
-		fprintf(f, "%s ", rta_getattr_str(tb[TCA_ACT_BPF_NAME]));
-
+		print_string(PRINT_ANY, "bpf_name", "%s ",
+			     rta_getattr_str(tb[TCA_ACT_BPF_NAME]));
 	if (tb[TCA_ACT_BPF_OPS] && tb[TCA_ACT_BPF_OPS_LEN]) {
-		bpf_print_ops(f, tb[TCA_ACT_BPF_OPS],
+		bpf_print_ops(tb[TCA_ACT_BPF_OPS],
 			      rta_getattr_u16(tb[TCA_ACT_BPF_OPS_LEN]));
-		fprintf(f, " ");
+		print_string(PRINT_FP, NULL, "%s", " ");
 	}
 
 	if (tb[TCA_ACT_BPF_ID])
-		dump_ok = bpf_dump_prog_info(f, rta_getattr_u32(tb[TCA_ACT_BPF_ID]));
-	if (!dump_ok && tb[TCA_ACT_BPF_TAG]) {
+		d_ok = bpf_dump_prog_info(f,
+					  rta_getattr_u32(tb[TCA_ACT_BPF_ID]));
+	if (!d_ok && tb[TCA_ACT_BPF_TAG]) {
 		SPRINT_BUF(b);
 
-		fprintf(f, "tag %s ",
-			hexstring_n2a(RTA_DATA(tb[TCA_ACT_BPF_TAG]),
-				      RTA_PAYLOAD(tb[TCA_ACT_BPF_TAG]),
-				      b, sizeof(b)));
+		print_string(PRINT_ANY, "tag", "tag %s ",
+			     hexstring_n2a(RTA_DATA(tb[TCA_ACT_BPF_TAG]),
+			     RTA_PAYLOAD(tb[TCA_ACT_BPF_TAG]),
+			     b, sizeof(b)));
 	}
 
-	print_action_control(f, "default-action ", parm->action, "\n");
-	fprintf(f, "\tindex %u ref %d bind %d", parm->index, parm->refcnt,
-		parm->bindcnt);
+	print_action_control(f, "default-action ", parm->action, _SL_);
+	print_uint(PRINT_ANY, "index", "\t index %u", parm->index);
+	print_int(PRINT_ANY, "ref", " ref %d", parm->refcnt);
+	print_int(PRINT_ANY, "bind", " bind %d", parm->bindcnt);
 
 	if (show_stats) {
 		if (tb[TCA_ACT_BPF_TM]) {
-- 
2.20.1


^ permalink raw reply related

* Re: pull-request: ieee802154 for net 2019-01-31
From: David Miller @ 2019-01-31 17:48 UTC (permalink / raw)
  To: stefan; +Cc: linux-wpan, alex.aring, netdev
In-Reply-To: <20190131170001.25905-1-stefan@datenfreihafen.org>

From: Stefan Schmidt <stefan@datenfreihafen.org>
Date: Thu, 31 Jan 2019 18:00:01 +0100

> An update from ieee802154 for your *net* tree.
> 
> I waited a while to see if anything else comes up, but it seems this time
> we only have one fixup patch for the -rc rounds.
> Colin fixed some indentation in the mcr20a drivers. That's about it.
> 
> If there are any problems with taking these two before the final 5.0 let
> me know.

Pulled, thanks!

> Greetings from Brussels, FOSDEM ahead. :-)

Enjoy!

^ permalink raw reply

* Re: [PATCH net] rds: fix refcount bug in rds_sock_addref
From: David Miller @ 2019-01-31 17:46 UTC (permalink / raw)
  To: edumazet
  Cc: netdev, eric.dumazet, syzkaller, sowmini.varadhan,
	santosh.shilimkar, rds-devel, xiyou.wangcong
In-Reply-To: <20190131164710.230590-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Thu, 31 Jan 2019 08:47:10 -0800

> syzbot was able to catch a bug in rds [1]
> 
> The issue here is that the socket might be found in a hash table
> but that its refcount has already been set to 0 by another cpu.
> 
> We need to use refcount_inc_not_zero() to be safe here.
 ...
> Fixes: cc4dfb7f70a3 ("rds: fix two RCU related problems")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: syzbot <syzkaller@googlegroups.com>

Applied and queued up for -stable, thanks Eric.
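
For readers following along, the "increment only if not already zero"
idea behind refcount_inc_not_zero() can be sketched in userspace with
C11 atomics. This is an illustration under assumptions, not the kernel
implementation: struct demo_obj and demo_get_unless_zero() are
hypothetical names, and the kernel's refcount_t additionally handles
saturation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; not the kernel's refcount_t. */
struct demo_obj {
	atomic_uint refcnt;
};

/*
 * Take a reference only if the count is still non-zero.
 * Returns false if the object is already on its way to being freed.
 */
static bool demo_get_unless_zero(struct demo_obj *o)
{
	unsigned int old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}
```

A lookup that finds an object in a hash table would call
demo_get_unless_zero() and treat a false return as "not found".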

^ permalink raw reply

* Re: [PATCH net] virtio_net: Account for tx bytes and packets on sending xdp_frames
From: David Miller @ 2019-01-31 17:45 UTC (permalink / raw)
  To: mst; +Cc: makita.toshiaki, jasowang, netdev, virtualization, dsahern, hawk
In-Reply-To: <20190131101516-mutt-send-email-mst@kernel.org>

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Thu, 31 Jan 2019 10:25:17 -0500

> On Thu, Jan 31, 2019 at 08:40:30PM +0900, Toshiaki Makita wrote:
>> Previously virtnet_xdp_xmit() did not account for device tx counters,
>> which caused confusion.
>> To be consistent with SKBs, account them on freeing xdp_frames.
>> 
>> Reported-by: David Ahern <dsahern@gmail.com>
>> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> 
> Well we count them on receive so I guess it makes sense for consistency
> 
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> 
> however, I really wonder whether adding more and more standard net stack
> things like this will end up costing most of XDP its speed.
> 
> Should we instead make sure *not* to account XDP packets
> in any counters at all? XDP programs can use maps
> to do their own counting...

This has definitely been a discussion point, and something we should
develop a clear, strong policy on.

David, Jesper, care to chime in on where we ended up in the last thread
discussing this?

^ permalink raw reply

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
From: David Miller @ 2019-01-31 17:38 UTC (permalink / raw)
  To: tariqt; +Cc: netdev, eranbe, saeedm, edumazet
In-Reply-To: <1548939763-20074-1-git-send-email-tariqt@mellanox.com>

From: Tariq Toukan <tariqt@mellanox.com>
Date: Thu, 31 Jan 2019 15:02:43 +0200

> From: Saeed Mahameed <saeedm@mellanox.com>
> 
> When an ethernet frame is padded to meet the minimum ethernet frame
> size, the padding octets are not covered by the hardware checksum.
> Fortunately the padding octets are usually zeros, which don't affect
> the checksum. However, this is not guaranteed. For example, switches
> might choose to make other use of these octets.
> This repeatedly causes kernel hardware checksum faults.
> 
> Prior to the cited commit below, the skb checksum was forced to be
> CHECKSUM_NONE when padding was detected. After it, we need to keep
> skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires
> verifying and parsing IP headers; it is not worth the effort, as the
> packets are so small that CHECKSUM_COMPLETE has no significant advantage.
> 
> Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends")
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> ---
>  drivers/net/ethernet/mellanox/mlx4/en_rx.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> Hi Dave,
> Please queue for -stable >= v4.18.

Please look into Eric's feedback and update the comment as needed.

Thank you.
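
As a small illustration of why the padding matters, the sketch below
computes a 16-bit ones' complement sum (the arithmetic used by the
Internet checksum) over a buffer. Zero padding leaves the sum unchanged,
while non-zero padding generally changes it. demo_csum16() is a
hypothetical helper for illustration only; it is not the mlx4 or kernel
checksum code.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit ones' complement sum over 'len' bytes, big-endian word order. */
static uint16_t demo_csum16(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];

	/* An odd trailing byte is treated as the high byte of a final word. */
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)sum;
}
```

Summing a short frame with and without trailing zero octets gives the
same value, so a checksum that ignores the padding only stays valid
while the padding really is zero.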

^ permalink raw reply

* Re: BUG: KASAN: double-free or invalid-free in ip_defrag after upgrade from 4.19.13
From: David Miller @ 2019-01-31 17:38 UTC (permalink / raw)
  To: gregkh; +Cc: edumazet, ivan, mkubecek, netdev, ignat, sbohrer, jakub
In-Reply-To: <20190131124816.GA8031@kroah.com>

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Thu, 31 Jan 2019 13:48:16 +0100

> Thanks for this, I'll turn this into a real patch and backport it to
> where it is needed.

Thanks a lot for taking care of this!

^ permalink raw reply

* Re: [PATCH v3] lib/test_rhashtable: Make test_insert_dup() allocate its hash table dynamically
From: David Miller @ 2019-01-31 17:37 UTC (permalink / raw)
  To: herbert; +Cc: bvanassche, tgraf, netdev, linux-kernel
In-Reply-To: <20190131120826.l4xd3vfonwmldudd@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 31 Jan 2019 20:08:26 +0800

> On Wed, Jan 30, 2019 at 10:42:30AM -0800, Bart Van Assche wrote:
>> The test_insert_dup() function from lib/test_rhashtable.c passes a
>> pointer to a stack object to rhltable_init(). Allocate the hash table
>> dynamically so that the following is no longer reported with object
>> debugging enabled:
 ...
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
 ...
> 
> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, thanks everyone.

^ permalink raw reply

* Re: [PATCH net-next v3 8/8] ethtool: add compat for devlink info
From: Jakub Kicinski @ 2019-01-31 17:29 UTC (permalink / raw)
  To: David Miller
  Cc: lkp, kbuild-all, netdev, oss-drivers, jiri, andrew, f.fainelli,
	mkubecek, eugenem, jonathan.lemon
In-Reply-To: <20190131092508.53bb519c@cakuba.hsd1.ca.comcast.net>

On Thu, 31 Jan 2019 09:25:08 -0800, Jakub Kicinski wrote:
> On Thu, 31 Jan 2019 09:23:03 -0800 (PST), David Miller wrote:
> > From: kbuild test robot <lkp@intel.com>
> > Date: Fri, 1 Feb 2019 00:19:33 +0800
> >   
> > > All errors (new ones prefixed by >>):
> > > 
> > >    m68k-linux-gnu-ld: drivers/rtc/proc.o: in function `is_rtc_hctosys.isra.0':
> > >    proc.c:(.text+0x178): undefined reference to `strcmp'
> > >    m68k-linux-gnu-ld: net/core/ethtool.o: in function `ethtool_get_drvinfo':    
> > >>> ethtool.c:(.text+0xc08): undefined reference to `devlink_compat_running_version'    
> > 
> > Missing string.h include perhaps?  
> 
> Yeah, that one looks like an existing m68k bug, but I also think I need
> to cater to the DEVLINK=m case: since ethtool code is always built in,
> we can't use the MAY_USE_DEVLINK trick :S

I think we have to do this:

#if IS_REACHABLE(CONFIG_NET_DEVLINK)
void devlink_compat_running_version(struct net_device *dev,
				    char *buf, size_t len);
#else
static inline void
devlink_compat_running_version(struct net_device *dev, char *buf, size_t len)
{
}
#endif

Jiri, any objections?
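
For context, IS_REACHABLE(CONFIG_FOO) is true when the option is built
in, or when it is a module and the calling code is itself modular, so
built-in code never ends up referencing a symbol that only exists in a
module. The same stub pattern can be sketched outside the kernel as
below; DEMO_DEVLINK_REACHABLE and the demo_* functions are hypothetical
names used only for this illustration.

```c
#include <stddef.h>

/* Hypothetical switch standing in for IS_REACHABLE(CONFIG_NET_DEVLINK). */
#ifndef DEMO_DEVLINK_REACHABLE
#define DEMO_DEVLINK_REACHABLE 0
#endif

#if DEMO_DEVLINK_REACHABLE
/* Real implementation lives in another translation unit. */
void demo_compat_running_version(char *buf, size_t len);
#else
/* Stub: compiles away, so callers need no #ifdef of their own. */
static inline void demo_compat_running_version(char *buf, size_t len)
{
	(void)buf;
	(void)len;
}
#endif

/* Caller fills in a default and lets the helper override it if available. */
static void demo_get_fw_version(char *buf, size_t len)
{
	if (len == 0)
		return;
	buf[0] = '\0';
	demo_compat_running_version(buf, len);
}
```

With the stub selected, demo_get_fw_version() simply leaves the buffer
empty, and the link step never needs the external symbol.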

^ permalink raw reply

* Re: net: phylink: dsa: mv88e6xxx: flaky link detection on switch ports with internal PHYs
From: John David Anglin @ 2019-01-31 17:27 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Russell King, Vivien Didelot, Florian Fainelli, netdev
In-Reply-To: <9415d82e-965b-7777-0ad0-f23d6c9f177e@bell.net>

On 2019-01-30 8:27 p.m., John David Anglin wrote:
> On 2019-01-30 5:38 p.m., Andrew Lunn wrote:
>> I'd suggest you take a look at the datasheet for the 37xx and check
>> what the hardware actually supports. You might need to extend the
>> driver.
> I did look and the GIC does support level interrupts.  But all the
> documentation is in generic ARM documents that I don't currently have.
> I'll see if I can find them tomorrow.
On a closer look at MV-S110897-00C, I see that the north and south
bridge GPIO interrupt registers only provide edge polarity control.
The GPIO pins don't appear to support level interrupts on 88F37xx.

Dave

-- 
John David Anglin  dave.anglin@bell.net



^ permalink raw reply

* Re: [PATCH net-next] macvlan: use netif_is_macvlan_port()
From: David Miller @ 2019-01-31 17:26 UTC (permalink / raw)
  To: jwi; +Cc: netdev
In-Reply-To: <20190131094810.57656-1-jwi@linux.ibm.com>

From: Julian Wiedmann <jwi@linux.ibm.com>
Date: Thu, 31 Jan 2019 10:48:10 +0100

> Replace the macvlan_port_exists() macro with its twin from netdevice.h
> 
> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>

Looks great, applied.

Thanks for cleaning this up.

^ permalink raw reply

* Re: [PATCH -next] mISDN: hfcsusb: Fix potential NULL pointer dereference
From: David Miller @ 2019-01-31 17:25 UTC (permalink / raw)
  To: yuehaibing; +Cc: isdn, gustavo, bigeasy, linux-kernel, netdev
In-Reply-To: <b565e6fa-a52f-f4f1-21b9-c63332d06ea0@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Thu, 31 Jan 2019 17:41:46 +0800

> On 2019/1/31 2:10, David Miller wrote:
>> From: YueHaibing <yuehaibing@huawei.com>
>> Date: Wed, 30 Jan 2019 18:19:02 +0800
>> 
>>> There is a potential NULL pointer dereference in case
>>> kzalloc() fails and returns NULL.
>>>
>>> Fixes: 69f52adb2d53 ("mISDN: Add HFC USB driver")
>>> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
>>> ---
>>>  drivers/isdn/hardware/mISDN/hfcsusb.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/isdn/hardware/mISDN/hfcsusb.c b/drivers/isdn/hardware/mISDN/hfcsusb.c
>>> index 124ff53..5660d5a 100644
>>> --- a/drivers/isdn/hardware/mISDN/hfcsusb.c
>>> +++ b/drivers/isdn/hardware/mISDN/hfcsusb.c
>>> @@ -263,6 +263,8 @@ hfcsusb_ph_info(struct hfcsusb *hw)
>>>  	int i;
>>>  
>>>  	phi = kzalloc(struct_size(phi, bch, dch->dev.nrbchan), GFP_ATOMIC);
>>> +	if (!phi)
>>> +		return;
>> 
>> If we fail with an error and do not perform the operation we were requested to
>> make, we must return an error to the caller, and the caller must do something
>> reasonable with that error (perhaps return it to its caller) and so on and
>> so forth.
> 
> 
> hfcsusb_ph_info allocates 'phi', then uses it for _alloc_mISDN_skb in _queue_data.
> When _alloc_mISDN_skb fails, it also just returns without error handling, and then phi is freed with kfree(phi).
> It seems that none of the callers of hfcsusb_ph_info care about the return value.

And that's a bug!
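
A minimal userspace sketch of the pattern being asked for: report the
allocation failure with an error code instead of returning silently, so
callers can act on it. Everything here (struct demo_info,
demo_ph_info(), the injectable allocator) is hypothetical and only
mirrors the shape of the driver code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical info block with a flexible array, one entry per channel. */
struct demo_info {
	int nr;
	int ch[];
};

/*
 * Build and release an info block. 'alloc' must be malloc-compatible,
 * because the result is released with free(). Returns 0 on success or
 * -ENOMEM if the allocation fails.
 */
static int demo_ph_info(int nrbchan, void *(*alloc)(size_t))
{
	struct demo_info *phi;

	phi = alloc(sizeof(*phi) + (size_t)nrbchan * sizeof(phi->ch[0]));
	if (!phi)
		return -ENOMEM;	/* tell the caller, don't just return */

	phi->nr = nrbchan;
	free(phi);
	return 0;
}

/* Allocator that always fails, used to exercise the error path. */
static void *demo_failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}
```

A caller would check the return value and pass -ENOMEM up its own call
chain rather than ignoring it.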

^ permalink raw reply

* Re: [PATCH net-next v3 8/8] ethtool: add compat for devlink info
From: Jakub Kicinski @ 2019-01-31 17:25 UTC (permalink / raw)
  To: David Miller
  Cc: lkp, kbuild-all, netdev, oss-drivers, jiri, andrew, f.fainelli,
	mkubecek, eugenem, jonathan.lemon
In-Reply-To: <20190131.092303.1843359963647606693.davem@davemloft.net>

On Thu, 31 Jan 2019 09:23:03 -0800 (PST), David Miller wrote:
> From: kbuild test robot <lkp@intel.com>
> Date: Fri, 1 Feb 2019 00:19:33 +0800
> 
> > All errors (new ones prefixed by >>):
> > 
> >    m68k-linux-gnu-ld: drivers/rtc/proc.o: in function `is_rtc_hctosys.isra.0':
> >    proc.c:(.text+0x178): undefined reference to `strcmp'
> >    m68k-linux-gnu-ld: net/core/ethtool.o: in function `ethtool_get_drvinfo':  
> >>> ethtool.c:(.text+0xc08): undefined reference to `devlink_compat_running_version'  
> 
> Missing string.h include perhaps?

Yeah, that one looks like an existing m68k bug, but I also think I need
to cater to the DEVLINK=m case: since ethtool code is always built in,
we can't use the MAY_USE_DEVLINK trick :S

^ permalink raw reply

* Re: [PATCH net-next v8 0/8] devlink: Add configuration parameters support for devlink_port
From: David Miller @ 2019-01-31 17:25 UTC (permalink / raw)
  To: vasundhara-v.volam; +Cc: jakub.kicinski, michael.chan, jiri, mkubecek, netdev
In-Reply-To: <CAACQVJokoZYSuus+_0NOs4Mjw7i-JnCOU-wz2tFBZS4a7Lqdzw@mail.gmail.com>

From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Date: Thu, 31 Jan 2019 14:14:02 +0530

> On Thu, Jan 31, 2019 at 5:28 AM Jakub Kicinski
> <jakub.kicinski@netronome.com> wrote:
>>
>> On Mon, 28 Jan 2019 18:00:19 +0530, Vasundhara Volam wrote:
>> > This patchset adds support for configuration parameters setting through
>> > devlink_port.  Each device registers supported configuration parameters
>> > table.
>> >
>> > The user can retrieve data on these parameters by
>> > "devlink port param show" command and can set new value to a
>> > parameter by "devlink port param set" command.
>> > All configuration modes supported by devlink_dev are supported
>> > by devlink_port also.
>>
>> Hm, I think we were kind of going somewhere with the ethtool/nl
>> attribute encapsulation idea.  You seem to have ignored those comments
>> on v7 and reposted v8 a day after.
> Jakub, I have added the idea of future expansion of WOL in my v8 cover letter
> mentioning the same. I will work on this as a future patchset.
>>
>> I think we should explore the nesting further.  The only obstacle is
>> that ethtool netlink conversion is not yet finished, but that's just
>> a simple matter of programming.  Do you disagree with that direction?
>> Please comment.
> No, I agree with you about ethtool netlink encapsulation.

This is great.

But this has to be resolved before the next merge window, otherwise I will
really have to revert this patch series.  You have been warned, so do not
let this slip through the cracks.

Thank you.

^ permalink raw reply

* Re: [PATCH net-next v3 8/8] ethtool: add compat for devlink info
From: David Miller @ 2019-01-31 17:23 UTC (permalink / raw)
  To: lkp
  Cc: jakub.kicinski, kbuild-all, netdev, oss-drivers, jiri, andrew,
	f.fainelli, mkubecek, eugenem, jonathan.lemon
In-Reply-To: <201902010037.2Jsqwihj%fengguang.wu@intel.com>

From: kbuild test robot <lkp@intel.com>
Date: Fri, 1 Feb 2019 00:19:33 +0800

> All errors (new ones prefixed by >>):
> 
>    m68k-linux-gnu-ld: drivers/rtc/proc.o: in function `is_rtc_hctosys.isra.0':
>    proc.c:(.text+0x178): undefined reference to `strcmp'
>    m68k-linux-gnu-ld: net/core/ethtool.o: in function `ethtool_get_drvinfo':
>>> ethtool.c:(.text+0xc08): undefined reference to `devlink_compat_running_version'

Missing string.h include perhaps?

^ permalink raw reply

* Re: [PATCH net] l2tp: copy 4 more bytes to linear part if necessary
From: David Miller @ 2019-01-31 17:20 UTC (permalink / raw)
  To: jian.w.wen; +Cc: netdev, gnault
In-Reply-To: <20190131071856.22120-1-jian.w.wen@oracle.com>

From: Jacob Wen <jian.w.wen@oracle.com>
Date: Thu, 31 Jan 2019 15:18:56 +0800

> The size of the L2TPv2 header with all optional fields is 14 bytes.
> l2tp_udp_recv_core only moves 10 bytes to the linear part of an
> skb. This may lead to l2tp_recv_common reading data outside of the skb.
> 
> This patch makes sure that there are at least 14 bytes in the linear
> part of the skb to meet the maximum need of l2tp_udp_recv_core and
> l2tp_recv_common. The minimum sizes of both a PPP HDLC-like frame and
> an Ethernet frame are larger than 14 bytes, so it is safe to do so.
> 
> Also remove L2TP_HDR_SIZE_NOSEQ, as it is now unused.
> 
> Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
> Suggested-by: Guillaume Nault <gnault@redhat.com>
> Signed-off-by: Jacob Wen <jian.w.wen@oracle.com>

Applied and queued up for -stable, thanks.
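
The 14-byte figure can be checked against the L2TPv2 header layout in
RFC 2661: 6 mandatory bytes (flags/version, tunnel ID, session ID) plus
the optional Length (2), Ns/Nr (4) and Offset Size (2) fields. The
sketch below derives the header length from the flag bits; the DEMO_*
macros and demo_l2tpv2_hdr_len() are hypothetical names, and any offset
padding that follows the Offset Size field is deliberately not counted.

```c
#include <stdint.h>

/* L2TPv2 header flag bits (host byte order), per RFC 2661. */
#define DEMO_L2TP_FLAG_L 0x4000	/* Length field present */
#define DEMO_L2TP_FLAG_S 0x0800	/* Ns and Nr fields present */
#define DEMO_L2TP_FLAG_O 0x0200	/* Offset Size field present */

/* Header length in bytes implied by the flags, excluding offset padding. */
static unsigned int demo_l2tpv2_hdr_len(uint16_t flags)
{
	unsigned int len = 2 + 2 + 2;	/* flags/version, tunnel ID, session ID */

	if (flags & DEMO_L2TP_FLAG_L)
		len += 2;		/* Length */
	if (flags & DEMO_L2TP_FLAG_S)
		len += 4;		/* Ns + Nr */
	if (flags & DEMO_L2TP_FLAG_O)
		len += 2;		/* Offset Size */

	return len;
}
```

With no optional fields this gives 6 bytes, with sequence numbers only
it gives 10, and with all three flags set it gives the 14 bytes
mentioned in the commit message.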

^ permalink raw reply
