Netdev List


* Re: [PATCH net-next 2/3] net/ipv6: Udate fib6_table_lookup tracepoint
From: David Miller @ 2018-05-23 17:09 UTC (permalink / raw)
  To: dsahern; +Cc: netdev, dsahern
In-Reply-To: <20180521212443.23612-3-dsahern@kernel.org>

From: dsahern@kernel.org
Date: Mon, 21 May 2018 14:24:42 -0700

> +		__entry->err = ip6_rt_type_to_error(f6i->fib6_type);

As the kbuild bot discovered, this doesn't work when IPV6=m.


* Re: [PATCH net-next 2/3] net/ipv6: Udate fib6_table_lookup tracepoint
From: David Ahern @ 2018-05-23 17:13 UTC (permalink / raw)
  To: David Miller, dsahern; +Cc: netdev
In-Reply-To: <20180523.130952.2276806407125412362.davem@davemloft.net>

On 5/23/18 11:09 AM, David Miller wrote:
> From: dsahern@kernel.org
> Date: Mon, 21 May 2018 14:24:42 -0700
> 
>> +		__entry->err = ip6_rt_type_to_error(f6i->fib6_type);
> 
> As the kbuild bot discovered, this doesn't work when IPV6=m.
> 

Yep. I'll take a look later today. I'm thinking about moving the tracepoint
creation from net/core/net-traces.c to net/ipv6/route.c.


* Re: [PATCH net-next v4 0/2] openvswitch: Support conntrack zone limit
From: David Miller @ 2018-05-23 17:13 UTC (permalink / raw)
  To: yihung.wei; +Cc: netdev, pshelar
In-Reply-To: <1526948165-32443-1-git-send-email-yihung.wei@gmail.com>

From: Yi-Hung Wei <yihung.wei@gmail.com>
Date: Mon, 21 May 2018 17:16:03 -0700

> v3->v4:
>   - Addresses comments from Pravin, including simplifying the netlink API
>     and removing unnecessary RCU locking.
>   - Rebases to master.

Pravin, please review.


* Re: [PATCH bpf-next v3 2/7] bpf: introduce bpf subcommand BPF_TASK_FD_QUERY
From: Martin KaFai Lau @ 2018-05-23 17:13 UTC (permalink / raw)
  To: Yonghong Song; +Cc: peterz, ast, daniel, netdev, kernel-team
In-Reply-To: <20180522163048.3128924-3-yhs@fb.com>

On Tue, May 22, 2018 at 09:30:46AM -0700, Yonghong Song wrote:
> Currently, suppose a userspace application has loaded a bpf program
> and attached it to a tracepoint/kprobe/uprobe, and a bpf
> introspection tool, e.g., bpftool, wants to show which bpf program
> is attached to which tracepoint/kprobe/uprobe. Such attachment
> information will be really useful to understand the overall bpf
> deployment in the system.
> 
> There is a name field (16 bytes) for each program, which could
> be used to encode the attachment point. There are some drawbacks
> to this approach. First, a bpftool user (e.g., an admin) may not
> really understand the association between the name and the
> attachment point. Second, if one program is attached to multiple
> places, encoding a proper name which can imply all these
> attachments becomes difficult.
> 
> This patch introduces a new bpf subcommand BPF_TASK_FD_QUERY.
> Given a pid and fd, if the <pid, fd> is associated with a
> tracepoint/kprobe/uprobe perf event, BPF_TASK_FD_QUERY will return
>    . prog_id
>    . tracepoint name, or
>    . k[ret]probe funcname + offset or kernel addr, or
>    . u[ret]probe filename + offset
> to the userspace.
> The user can use "bpftool prog" to find more information about
> bpf program itself with prog_id.
LGTM, some comments inline.

> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
>  include/linux/trace_events.h |  16 ++++++
>  include/uapi/linux/bpf.h     |  27 ++++++++++
>  kernel/bpf/syscall.c         | 124 +++++++++++++++++++++++++++++++++++++++++++
>  kernel/trace/bpf_trace.c     |  48 +++++++++++++++++
>  kernel/trace/trace_kprobe.c  |  29 ++++++++++
>  kernel/trace/trace_uprobe.c  |  22 ++++++++
>  6 files changed, 266 insertions(+)
> 
> diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
> index 2bde3ef..eab806d 100644
> --- a/include/linux/trace_events.h
> +++ b/include/linux/trace_events.h
> @@ -473,6 +473,9 @@ int perf_event_query_prog_array(struct perf_event *event, void __user *info);
>  int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
>  int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
>  struct bpf_raw_event_map *bpf_find_raw_tracepoint(const char *name);
> +int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id,
> +			    u32 *attach_info, const char **buf,
> +			    u64 *probe_offset, u64 *probe_addr);
The first arg is 'const struct perf_event *event' while...

>  #else
>  static inline unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
>  {
> @@ -504,6 +507,12 @@ static inline struct bpf_raw_event_map *bpf_find_raw_tracepoint(const char *name
>  {
>  	return NULL;
>  }
> +static inline int bpf_get_perf_event_info(const struct file *file, u32 *prog_id,
this one has 'const struct file *file'?

> +					  u32 *attach_info, const char **buf,
> +					  u64 *probe_offset, u64 *probe_addr)
> +{
> +	return -EOPNOTSUPP;
> +}
>  #endif
>  
>  enum {
> @@ -560,10 +569,17 @@ extern void perf_trace_del(struct perf_event *event, int flags);
>  #ifdef CONFIG_KPROBE_EVENTS
>  extern int  perf_kprobe_init(struct perf_event *event, bool is_retprobe);
>  extern void perf_kprobe_destroy(struct perf_event *event);
> +extern int bpf_get_kprobe_info(const struct perf_event *event,
> +			       u32 *attach_info, const char **symbol,
> +			       u64 *probe_offset, u64 *probe_addr,
> +			       bool perf_type_tracepoint);
>  #endif
>  #ifdef CONFIG_UPROBE_EVENTS
>  extern int  perf_uprobe_init(struct perf_event *event, bool is_retprobe);
>  extern void perf_uprobe_destroy(struct perf_event *event);
> +extern int bpf_get_uprobe_info(const struct perf_event *event,
> +			       u32 *attach_info, const char **filename,
> +			       u64 *probe_offset, bool perf_type_tracepoint);
>  #endif
>  extern int  ftrace_profile_set_filter(struct perf_event *event, int event_id,
>  				     char *filter_str);
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 97446bb..a602150 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -97,6 +97,7 @@ enum bpf_cmd {
>  	BPF_RAW_TRACEPOINT_OPEN,
>  	BPF_BTF_LOAD,
>  	BPF_BTF_GET_FD_BY_ID,
> +	BPF_TASK_FD_QUERY,
>  };
>  
>  enum bpf_map_type {
> @@ -379,6 +380,22 @@ union bpf_attr {
>  		__u32		btf_log_size;
>  		__u32		btf_log_level;
>  	};
> +
> +	struct {
> +		int		pid;		/* input: pid */
> +		int		fd;		/* input: fd */
Should fd and pid always be positive?
The existing fd fields in bpf_attr (like map_fd) use __u32.

> +		__u32		flags;		/* input: flags */
> +		__u32		buf_len;	/* input: buf len */
> +		__aligned_u64	buf;		/* input/output:
> +						 *   tp_name for tracepoint
> +						 *   symbol for kprobe
> +						 *   filename for uprobe
> +						 */
> +		__u32		prog_id;	/* output: prog_id */
> +		__u32		attach_info;	/* output: BPF_ATTACH_* */
> +		__u64		probe_offset;	/* output: probe_offset */
> +		__u64		probe_addr;	/* output: probe_addr */
> +	} task_fd_query;
>  } __attribute__((aligned(8)));
>  
>  /* The description below is an attempt at providing documentation to eBPF
> @@ -2458,4 +2475,14 @@ struct bpf_fib_lookup {
>  	__u8	dmac[6];     /* ETH_ALEN */
>  };
>  
> +/* used by <task, fd> based query */
> +enum {
Nit. Instead of a comment, is it better to give this
enum a descriptive name?

> +	BPF_ATTACH_RAW_TRACEPOINT,	/* tp name */
> +	BPF_ATTACH_TRACEPOINT,		/* tp name */
> +	BPF_ATTACH_KPROBE,		/* (symbol + offset) or addr */
> +	BPF_ATTACH_KRETPROBE,		/* (symbol + offset) or addr */
> +	BPF_ATTACH_UPROBE,		/* filename + offset */
> +	BPF_ATTACH_URETPROBE,		/* filename + offset */
> +};
> +
>  #endif /* _UAPI__LINUX_BPF_H__ */
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index bfcde94..9356f0e 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -18,7 +18,9 @@
>  #include <linux/vmalloc.h>
>  #include <linux/mmzone.h>
>  #include <linux/anon_inodes.h>
> +#include <linux/fdtable.h>
>  #include <linux/file.h>
> +#include <linux/fs.h>
>  #include <linux/license.h>
>  #include <linux/filter.h>
>  #include <linux/version.h>
> @@ -2102,6 +2104,125 @@ static int bpf_btf_get_fd_by_id(const union bpf_attr *attr)
>  	return btf_get_fd_by_id(attr->btf_id);
>  }
>  
> +static int bpf_task_fd_query_copy(const union bpf_attr *attr,
> +				    union bpf_attr __user *uattr,
> +				    u32 prog_id, u32 attach_info,
> +				    const char *buf, u64 probe_offset,
> +				    u64 probe_addr)
> +{
> +	__u64 __user *ubuf;
Nit. ubuf is a string instead of an array of __u64?

> +	int len;
> +
> +	ubuf = u64_to_user_ptr(attr->task_fd_query.buf);
> +	if (buf) {
> +		len = strlen(buf);
> +		if (attr->task_fd_query.buf_len < len + 1)
I think the current convention is to take the min, copy whatever
fits into buf, and return the real len/size in buf_len.  See,
e.g., prog_ids and prog_cnt in __cgroup_bpf_query().

Should the same be done here, or does it not make sense to
truncate the string?  The user may not need the trailing chars
anyway if its pretty-print has limited width.  The user would
still need to know what buf_len should be in order to retry,
but I guess any reasonable buf_len should work?

> +			return -ENOSPC;
> +		if (copy_to_user(ubuf, buf, len + 1))
> +			return -EFAULT;
> +	} else if (attr->task_fd_query.buf_len) {
> +		/* copy '\0' to ubuf */
> +		__u8 zero = 0;
> +
> +		if (copy_to_user(ubuf, &zero, 1))
> +			return -EFAULT;
> +	}
> +
> +	if (copy_to_user(&uattr->task_fd_query.prog_id, &prog_id,
> +			 sizeof(prog_id)) ||
> +	    copy_to_user(&uattr->task_fd_query.attach_info, &attach_info,
> +			 sizeof(attach_info)) ||
> +	    copy_to_user(&uattr->task_fd_query.probe_offset, &probe_offset,
> +			 sizeof(probe_offset)) ||
> +	    copy_to_user(&uattr->task_fd_query.probe_addr, &probe_addr,
> +			 sizeof(probe_addr)))
Nit. put_user() may be able to shorten them.

> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +#define BPF_TASK_FD_QUERY_LAST_FIELD task_fd_query.probe_addr
> +
> +static int bpf_task_fd_query(const union bpf_attr *attr,
> +			     union bpf_attr __user *uattr)
> +{
> +	pid_t pid = attr->task_fd_query.pid;
> +	int fd = attr->task_fd_query.fd;
> +	const struct perf_event *event;
> +	struct files_struct *files;
> +	struct task_struct *task;
> +	struct file *file;
> +	int err;
> +
> +	if (CHECK_ATTR(BPF_TASK_FD_QUERY))
> +		return -EINVAL;
> +
> +	if (!capable(CAP_SYS_ADMIN))
> +		return -EPERM;
> +
> +	if (attr->task_fd_query.flags != 0)
How is flags used?

> +		return -EINVAL;
> +
> +	task = get_pid_task(find_vpid(pid), PIDTYPE_PID);
> +	if (!task)
> +		return -ENOENT;
> +
> +	files = get_files_struct(task);
> +	put_task_struct(task);
> +	if (!files)
> +		return -ENOENT;
> +
> +	err = 0;
> +	spin_lock(&files->file_lock);
> +	file = fcheck_files(files, fd);
> +	if (!file)
> +		err = -EBADF;
> +	else
> +		get_file(file);
> +	spin_unlock(&files->file_lock);
> +	put_files_struct(files);
> +
> +	if (err)
> +		goto out;
> +
> +	if (file->f_op == &bpf_raw_tp_fops) {
> +		struct bpf_raw_tracepoint *raw_tp = file->private_data;
> +		struct bpf_raw_event_map *btp = raw_tp->btp;
> +
> +		if (!raw_tp->prog)
> +			err = -ENOENT;
> +		else
> +			err = bpf_task_fd_query_copy(attr, uattr,
> +						     raw_tp->prog->aux->id,
> +						     BPF_ATTACH_RAW_TRACEPOINT,
> +						     btp->tp->name, 0, 0);
> +		goto put_file;
> +	}
> +
> +	event = perf_get_event(file);
> +	if (!IS_ERR(event)) {
> +		u64 probe_offset, probe_addr;
> +		u32 prog_id, attach_info;
> +		const char *buf;
> +
> +		err = bpf_get_perf_event_info(event, &prog_id, &attach_info,
> +					      &buf, &probe_offset,
> +					      &probe_addr);
> +		if (!err)
> +			err = bpf_task_fd_query_copy(attr, uattr, prog_id,
> +						     attach_info, buf,
> +						     probe_offset,
> +						     probe_addr);
> +		goto put_file;
> +	}
> +
> +	err = -ENOTSUPP;
> +put_file:
> +	fput(file);
> +out:
> +	return err;
> +}
> +
>  SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
>  {
>  	union bpf_attr attr = {};
> @@ -2188,6 +2309,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
>  	case BPF_BTF_GET_FD_BY_ID:
>  		err = bpf_btf_get_fd_by_id(&attr);
>  		break;
> +	case BPF_TASK_FD_QUERY:
> +		err = bpf_task_fd_query(&attr, uattr);
> +		break;
>  	default:
>  		err = -EINVAL;
>  		break;
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index ce2cbbf..323c80e 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -14,6 +14,7 @@
>  #include <linux/uaccess.h>
>  #include <linux/ctype.h>
>  #include <linux/kprobes.h>
> +#include <linux/syscalls.h>
>  #include <linux/error-injection.h>
>  
>  #include "trace_probe.h"
> @@ -1163,3 +1164,50 @@ int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog)
>  	mutex_unlock(&bpf_event_mutex);
>  	return err;
>  }
> +
> +int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id,
> +			    u32 *attach_info, const char **buf,
> +			    u64 *probe_offset, u64 *probe_addr)
> +{
> +	bool is_tracepoint, is_syscall_tp;
> +	struct bpf_prog *prog;
> +	int flags, err = 0;
> +
> +	prog = event->prog;
> +	if (!prog)
> +		return -ENOENT;
> +
> +	/* not supporting BPF_PROG_TYPE_PERF_EVENT yet */
> +	if (prog->type == BPF_PROG_TYPE_PERF_EVENT)
> +		return -EOPNOTSUPP;
> +
> +	*prog_id = prog->aux->id;
> +	flags = event->tp_event->flags;
> +	is_tracepoint = flags & TRACE_EVENT_FL_TRACEPOINT;
> +	is_syscall_tp = is_syscall_trace_event(event->tp_event);
> +
> +	if (is_tracepoint || is_syscall_tp) {
> +		*buf = is_tracepoint ? event->tp_event->tp->name
> +				     : event->tp_event->name;
> +		*attach_info = BPF_ATTACH_TRACEPOINT;
> +		*probe_offset = 0x0;
> +		*probe_addr = 0x0;
> +	} else {
> +		/* kprobe/uprobe */
> +		err = -EOPNOTSUPP;
> +#ifdef CONFIG_KPROBE_EVENTS
> +		if (flags & TRACE_EVENT_FL_KPROBE)
> +			err = bpf_get_kprobe_info(event, attach_info, buf,
> +						  probe_offset, probe_addr,
> +						  event->attr.type == PERF_TYPE_TRACEPOINT);
> +#endif
> +#ifdef CONFIG_UPROBE_EVENTS
> +		if (flags & TRACE_EVENT_FL_UPROBE)
> +			err = bpf_get_uprobe_info(event, attach_info, buf,
> +						  probe_offset,
> +						  event->attr.type == PERF_TYPE_TRACEPOINT);
> +#endif
> +	}
> +
> +	return err;
> +}
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index 02aed76..32e9190 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -1287,6 +1287,35 @@ kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
>  			      head, NULL);
>  }
>  NOKPROBE_SYMBOL(kretprobe_perf_func);
> +
> +int bpf_get_kprobe_info(const struct perf_event *event, u32 *attach_info,
> +			const char **symbol, u64 *probe_offset,
> +			u64 *probe_addr, bool perf_type_tracepoint)
> +{
> +	const char *pevent = trace_event_name(event->tp_event);
> +	const char *group = event->tp_event->class->system;
> +	struct trace_kprobe *tk;
> +
> +	if (perf_type_tracepoint)
> +		tk = find_trace_kprobe(pevent, group);
> +	else
> +		tk = event->tp_event->data;
> +	if (!tk)
> +		return -EINVAL;
> +
> +	*attach_info = trace_kprobe_is_return(tk) ? BPF_ATTACH_KRETPROBE
> +						  : BPF_ATTACH_KPROBE;
> +	if (tk->symbol) {
> +		*symbol = tk->symbol;
> +		*probe_offset = tk->rp.kp.offset;
> +		*probe_addr = 0;
> +	} else {
> +		*symbol = NULL;
> +		*probe_offset = 0;
> +		*probe_addr = (unsigned long)tk->rp.kp.addr;
> +	}
> +	return 0;
> +}
>  #endif	/* CONFIG_PERF_EVENTS */
>  
>  /*
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index ac89287..12a3667 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -1161,6 +1161,28 @@ static void uretprobe_perf_func(struct trace_uprobe *tu, unsigned long func,
>  {
>  	__uprobe_perf_func(tu, func, regs, ucb, dsize);
>  }
> +
> +int bpf_get_uprobe_info(const struct perf_event *event, u32 *attach_info,
> +			const char **filename, u64 *probe_offset,
> +			bool perf_type_tracepoint)
> +{
> +	const char *pevent = trace_event_name(event->tp_event);
> +	const char *group = event->tp_event->class->system;
> +	struct trace_uprobe *tu;
> +
> +	if (perf_type_tracepoint)
> +		tu = find_probe_event(pevent, group);
> +	else
> +		tu = event->tp_event->data;
> +	if (!tu)
> +		return -EINVAL;
> +
> +	*attach_info = is_ret_probe(tu) ? BPF_ATTACH_URETPROBE
> +					: BPF_ATTACH_UPROBE;
> +	*filename = tu->filename;
> +	*probe_offset = tu->offset;
> +	return 0;
> +}
>  #endif	/* CONFIG_PERF_EVENTS */
>  
>  static int
> -- 
> 2.9.5
> 
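
The buf_len convention raised above for bpf_task_fd_query_copy() (take the
min, copy what fits, report the real size) can be sketched in plain
userspace C. This is only an illustrative model, not kernel code:
copy_str_truncating() is a hypothetical name, memcpy() stands in for
copy_to_user(), and NUL-terminating the truncated copy is an assumption
that the review comment does not settle.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace model of "copy what fits, report the real size".
 * On return *buf_len holds the number of bytes the full string needs,
 * including the trailing '\0', so a caller can retry with a larger buffer.
 * Returns 0 if the whole string fit, 1 if it was truncated.
 */
static int copy_str_truncating(char *ubuf, size_t *buf_len, const char *src)
{
	size_t need = strlen(src) + 1;		/* bytes needed, with '\0' */
	size_t avail = *buf_len;
	size_t n = need < avail ? need : avail;	/* take the min */

	if (n) {
		memcpy(ubuf, src, n);
		ubuf[n - 1] = '\0';		/* assumption: always terminate */
	}
	*buf_len = need;			/* report the real size */
	return need > avail;
}
```

With a 4-byte buffer and the string "sys_enter", this model would copy
"sys", set *buf_len to 10 and report truncation, which is the information
a caller needs in order to retry.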


* Re: [PATCH bpf-next v3 4/7] tools/bpf: add ksym_get_addr() in trace_helpers
From: Martin KaFai Lau @ 2018-05-23 17:16 UTC (permalink / raw)
  To: Yonghong Song; +Cc: peterz, ast, daniel, netdev, kernel-team
In-Reply-To: <20180522163048.3128924-5-yhs@fb.com>

On Tue, May 22, 2018 at 09:30:48AM -0700, Yonghong Song wrote:
> Given a kernel function name, ksym_get_addr() will return the kernel
> address for this function, or 0 if it cannot find this function name
> in /proc/kallsyms. This function will be used later when a kernel
> address is used to initiate a kprobe perf event.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
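
For readers unfamiliar with the lookup described in the commit message, here
is a rough sketch of resolving a symbol name from kallsyms-style text. It is
not the ksym_get_addr() from the patch: find_ksym_addr() is a hypothetical
helper that scans "address type name" lines from an already opened stream,
and it returns 0 when the name is not found, mirroring the behaviour the
commit message describes.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan lines of the form "<hex addr> <type> <name>"
 * (the layout used by /proc/kallsyms) and return the address of the first
 * symbol whose name matches, or 0 if there is no match.
 */
static unsigned long long find_ksym_addr(FILE *f, const char *name)
{
	char line[256], sym[128], type;
	unsigned long long addr;

	while (fgets(line, sizeof(line), f)) {
		/* Skip lines that do not have all three fields. */
		if (sscanf(line, "%llx %c %127s", &addr, &type, sym) != 3)
			continue;
		if (strcmp(sym, name) == 0)
			return addr;
	}
	return 0;
}
```

A caller would typically pass the result of fopen("/proc/kallsyms", "r").
Note that on systems where kernel pointers are hidden from unprivileged
users, the addresses read this way may all be zero.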


* Re: [PATCH v3 net-next 0/2] bpfilter
From: David Miller @ 2018-05-23 17:26 UTC (permalink / raw)
  To: ast
  Cc: daniel, torvalds, gregkh, luto, mcgrof, keescook, netdev,
	linux-kernel, kernel-team
In-Reply-To: <20180522022230.2492505-1-ast@kernel.org>

From: Alexei Starovoitov <ast@kernel.org>
Date: Mon, 21 May 2018 19:22:28 -0700

> v2->v3:
> - followed Luis's suggestion and significantly simplified first patch
>   with shmem_kernel_file_setup+kernel_write. Added kdoc for new helper
> - fixed typos and race to access pipes with mutex
> - tested with bpfilter being 'builtin'. CONFIG_BPFILTER_UMH=y|m both work.
>   Interesting to see a usermode executable being embedded inside vmlinux.
> - it doesn't hurt to enable bpfilter in .config.
>   ip_setsockopt commands sent to usermode via pipes and -ENOPROTOOPT is
>   returned from userspace, so kernel falls back to original iptables code
> 
> v1->v2:
> this patch set is almost a full rewrite of the earlier umh modules approach
> The v1 of patches and follow up discussion was covered by LWN:
> https://lwn.net/Articles/749108/
> 
> I believe the v2 addresses all issues brought up by Andy and others.
> Mainly there are zero changes to kernel/module.c
> Instead of teaching module loading logic to recognize special
> umh module, let normal kernel modules execute part of its own
> .init.rodata as a new user space process (Andy's idea)
> Patch 1 introduces this new helper:
> int fork_usermode_blob(void *data, size_t len, struct umh_info *info);
> Input:
>   data + len == executable file
> Output:
>   struct umh_info {
>        struct file *pipe_to_umh;
>        struct file *pipe_from_umh;
>        pid_t pid;
>   };

Series applied, let the madness begin... :-)


* Re: [PATCH v3 net-next 0/2] bpfilter
From: Greg KH @ 2018-05-23 17:33 UTC (permalink / raw)
  To: David Miller
  Cc: ast, daniel, torvalds, luto, mcgrof, keescook, netdev,
	linux-kernel, kernel-team
In-Reply-To: <20180523.132648.459690706167609338.davem@davemloft.net>

On Wed, May 23, 2018 at 01:26:48PM -0400, David Miller wrote:
> From: Alexei Starovoitov <ast@kernel.org>
> Date: Mon, 21 May 2018 19:22:28 -0700
> 
> > v2->v3:
> > - followed Luis's suggestion and significantly simplified first patch
> >   with shmem_kernel_file_setup+kernel_write. Added kdoc for new helper
> > - fixed typos and race to access pipes with mutex
> > - tested with bpfilter being 'builtin'. CONFIG_BPFILTER_UMH=y|m both work.
> >   Interesting to see a usermode executable being embedded inside vmlinux.
> > - it doesn't hurt to enable bpfilter in .config.
> >   ip_setsockopt commands sent to usermode via pipes and -ENOPROTOOPT is
> >   returned from userspace, so kernel falls back to original iptables code
> > 
> > v1->v2:
> > this patch set is almost a full rewrite of the earlier umh modules approach
> > The v1 of patches and follow up discussion was covered by LWN:
> > https://lwn.net/Articles/749108/
> > 
> > I believe the v2 addresses all issues brought up by Andy and others.
> > Mainly there are zero changes to kernel/module.c
> > Instead of teaching module loading logic to recognize special
> > umh module, let normal kernel modules execute part of its own
> > .init.rodata as a new user space process (Andy's idea)
> > Patch 1 introduces this new helper:
> > int fork_usermode_blob(void *data, size_t len, struct umh_info *info);
> > Input:
> >   data + len == executable file
> > Output:
> >   struct umh_info {
> >        struct file *pipe_to_umh;
> >        struct file *pipe_from_umh;
> >        pid_t pid;
> >   };
> 
> Series applied, let the madness begin... :-)

Yeah, this is going to be fun :)


* Re: [PATCH net V2 0/4] Fix several issues of virtio-net mergeable XDP
From: David Miller @ 2018-05-23 17:37 UTC (permalink / raw)
  To: jasowang; +Cc: mst, virtualization, netdev, linux-kernel
In-Reply-To: <1526960671-11782-1-git-send-email-jasowang@redhat.com>

From: Jason Wang <jasowang@redhat.com>
Date: Tue, 22 May 2018 11:44:27 +0800

> Please review the patches that try to fix several issues of
> virtio-net mergeable XDP.
> 
> Changes from V1:
> - check against 1 before decreasing instead of resetting to 1
> - typo fixes

Series applied and queued up for -stable.


* [PATCH bpf-next] bpf: btf: Avoid variable length array
From: Martin KaFai Lau @ 2018-05-23 17:46 UTC (permalink / raw)
  To: netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team

Sparse warning:
kernel/bpf/btf.c:1985:34: warning: Variable length array is used.

This patch moves the nr_secs from btf_check_sec_info() to a macro.

Fixes: f80442a4cd18 ("bpf: btf: Change how section is supported in btf_header")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 kernel/bpf/btf.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 9cbeabb5aca3..517296712774 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -1970,6 +1970,8 @@ static const size_t btf_sec_info_offset[] = {
 	offsetof(struct btf_header, str_off),
 };
 
+#define NR_SECS ARRAY_SIZE(btf_sec_info_offset)
+
 static int btf_sec_info_cmp(const void *a, const void *b)
 {
 	const struct btf_sec_info *x = a;
@@ -1981,8 +1983,7 @@ static int btf_sec_info_cmp(const void *a, const void *b)
 static int btf_check_sec_info(struct btf_verifier_env *env,
 			      u32 btf_data_size)
 {
-	const unsigned int nr_secs = ARRAY_SIZE(btf_sec_info_offset);
-	struct btf_sec_info secs[nr_secs];
+	struct btf_sec_info secs[NR_SECS];
 	u32 total, expected_total, i;
 	const struct btf_header *hdr;
 	const struct btf *btf;
@@ -1991,17 +1992,17 @@ static int btf_check_sec_info(struct btf_verifier_env *env,
 	hdr = &btf->hdr;
 
 	/* Populate the secs from hdr */
-	for (i = 0; i < nr_secs; i++)
+	for (i = 0; i < NR_SECS; i++)
 		secs[i] = *(struct btf_sec_info *)((void *)hdr +
 						   btf_sec_info_offset[i]);
 
-	sort(secs, nr_secs, sizeof(struct btf_sec_info),
+	sort(secs, NR_SECS, sizeof(struct btf_sec_info),
 	     btf_sec_info_cmp, NULL);
 
 	/* Check for gaps and overlap among sections */
 	total = 0;
 	expected_total = btf_data_size - hdr->hdr_len;
-	for (i = 0; i < nr_secs; i++) {
+	for (i = 0; i < NR_SECS; i++) {
 		if (expected_total < secs[i].off) {
 			btf_verifier_log(env, "Invalid section offset");
 			return -EINVAL;
-- 
2.9.5


* Re: [PATCH v2] ath10k: transmit queued frames after waking queues
From: Rajkumar Manoharan @ 2018-05-23 18:05 UTC (permalink / raw)
  To: Erik Stromdahl
  Cc: Niklas Cassel, Kalle Valo, David S. Miller, ath10k,
	linux-wireless, netdev, linux-kernel, linux-wireless-owner
In-Reply-To: <c131da6e-6479-3a40-fbd3-9c61d6690ba8@gmail.com>

On 2018-05-23 09:25, Erik Stromdahl wrote:
> On 05/22/2018 11:15 PM, Niklas Cassel wrote:
> 
[...]
>> 
>> Perhaps it would be possible to call ath10k_mac_tx_push_pending()
>> from the equivalent to ath10k_htt_txrx_compl_task(),
>> but from SDIO's point of view.
> An equivalent for SDIO would most likely be 
> *ath10k_htt_htc_t2h_msg_handler*
> or any of the other functions called from this function.
> 
> *ath10k_txrx_tx_unref* is actually called from 
> *ath10k_htt_htc_t2h_msg_handler*,
> so that function could be viewed as an equivalent.
> 
> If the call should be added in the bus driver (sdio.c) it should most 
> likely be
> placed in *ath10k_sdio_mbox_rx_process_packets*
> 
> 		if (!pkt->trailer_only) {
> 			ep->ep_ops.ep_rx_complete(ar_sdio->ar, pkt->skb);
> 			ath10k_mac_tx_push_pending(ar_sdio->ar);
> 		} else {
> 			kfree_skb(pkt->skb);
> 		}
> 
> The above call would of course result in lots of calls to
> *ath10k_mac_tx_push_pending*
> Adding a htt_num_pending check here wouldn't look nice.
> 
> The HL RX path differs from the LL path in that the t2h_msg_handler 
> returns
> false indicating that it has consumed the skb.
> 
> This is because it is the HL RX indication handler that delivers the 
> skb's
> to mac80211.
> 
I also don't prefer calling *_push_pending for every HTC packet. Similar to
the LL approach, call ath10k_mac_tx_push_pending after processing all pending
rx messages, e.g. from ath10k_sdio_mbox_rxmsg_pending_handler.

--- a/drivers/net/wireless/ath/ath10k/sdio.c
+++ b/drivers/net/wireless/ath/ath10k/sdio.c
@@ -807,6 +807,8 @@ static int ath10k_sdio_mbox_rxmsg_pending_handler(struct ath10k *ar,
                 ath10k_warn(ar, "failed to get pending recv messages: %d\n",
                             ret);

+       ath10k_mac_tx_push_pending(ar);
+
         return ret;
  }

> Another solution could be to add an *else-statement* as a part of the
> *if (release)*
> in *ath10k_htt_htc_t2h_msg_handler*, where
> *ath10k_mac_tx_push_pending* could be called.
> 
> Something like this perhaps:
> 
> 	/* Free the indication buffer */
> 	if (release)
> 		dev_kfree_skb_any(skb);
> 	else if (!ar->htt.num_pending_tx)
> 		ath10k_mac_tx_push_pending(ar);
> 
> I think I prefer your original patch though.
>> 

Better to make the change in the HL-specific path instead of the common path.
The above change would impact QCA6174-based devices.

-Rajkumar


* Re: [PATCH bpf-next] bpf: btf: Avoid variable length array
From: Joe Perches @ 2018-05-23 18:11 UTC (permalink / raw)
  To: Martin KaFai Lau, netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team
In-Reply-To: <20180523174659.354660-1-kafai@fb.com>

On Wed, 2018-05-23 at 10:46 -0700, Martin KaFai Lau wrote:
> Sparse warning:
> kernel/bpf/btf.c:1985:34: warning: Variable length array is used.

Perhaps use ARRAY_SIZE directly instead of indirectly via a #define

> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
[]
> @@ -1970,6 +1970,8 @@ static const size_t btf_sec_info_offset[] = {
>  	offsetof(struct btf_header, str_off),
>  };
>  
> +#define NR_SECS ARRAY_SIZE(btf_sec_info_offset)
> +
>  static int btf_sec_info_cmp(const void *a, const void *b)
>  {
>  	const struct btf_sec_info *x = a;
> @@ -1981,8 +1983,7 @@ static int btf_sec_info_cmp(const void *a, const void *b)
>  static int btf_check_sec_info(struct btf_verifier_env *env,
>  			      u32 btf_data_size)
>  {
> -	const unsigned int nr_secs = ARRAY_SIZE(btf_sec_info_offset);
> -	struct btf_sec_info secs[nr_secs];
> +	struct btf_sec_info secs[NR_SECS];

	struct btf_sec_info secs[ARRAY_SIZE(btf_sec_info_offset)];

>  	u32 total, expected_total, i;
>  	const struct btf_header *hdr;
>  	const struct btf *btf;
> @@ -1991,17 +1992,17 @@ static int btf_check_sec_info(struct btf_verifier_env *env,
>  	hdr = &btf->hdr;
>  
>  	/* Populate the secs from hdr */
> -	for (i = 0; i < nr_secs; i++)
> +	for (i = 0; i < NR_SECS; i++)

	for (i = 0; i < ARRAY_SIZE(btf_sec_info_offset); i++)

>  		secs[i] = *(struct btf_sec_info *)((void *)hdr +
>  						   btf_sec_info_offset[i]);

which makes this loop more intelligible.

> -	sort(secs, nr_secs, sizeof(struct btf_sec_info),
> +	sort(secs, NR_SECS, sizeof(struct btf_sec_info),
>  	     btf_sec_info_cmp, NULL);

etc...
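
The underlying C rule can be shown in a small standalone sketch. In C
(unlike C++), a const-qualified variable is not an integer constant
expression, so sizing an array with it declares a variable length array
even when the value is known at build time. The names below (sec_info,
sec_offsets, sum_sec_offsets) are made up for illustration, and this
ARRAY_SIZE is a simplified stand-in for the kernel macro, without its
array type check.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's ARRAY_SIZE(). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct sec_info {
	unsigned int off;
	unsigned int len;
};

static const size_t sec_offsets[] = { 0, 8, 16 };

/*
 * Writing
 *	const unsigned int n = ARRAY_SIZE(sec_offsets);
 *	struct sec_info secs[n];
 * would make secs a VLA. Sizing it with the sizeof-based expression
 * directly (or through a macro that expands to it) keeps it fixed-size.
 */
static unsigned int sum_sec_offsets(void)
{
	struct sec_info secs[ARRAY_SIZE(sec_offsets)];	/* fixed-size array */
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < ARRAY_SIZE(secs); i++) {
		secs[i].off = (unsigned int)sec_offsets[i];
		secs[i].len = 0;
		total += secs[i].off;
	}
	return total;
}
```

Both the NR_SECS macro and the direct ARRAY_SIZE() use discussed in this
thread avoid the VLA for the same reason; the difference between them is
only readability.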


* Re: [PATCH v2 1/1] tools/lib/libbpf.c: fix string format to allow build on arm32
From: Daniel Borkmann @ 2018-05-23 18:19 UTC (permalink / raw)
  To: Sirio Balmelli; +Cc: netdev
In-Reply-To: <20180523161704.4f5af2ehqdh6cqrh@vm4>

On 05/23/2018 06:17 PM, Sirio Balmelli wrote:
> On arm32, 'cd tools/testing/selftests/bpf && make' fails with:
> 
> libbpf.c:80:10: error: format ‘%ld’ expects argument of type ‘long int’, but argument 4 has type ‘int64_t {aka long long int}’ [-Werror=format=]
>    (func)("libbpf: " fmt, ##__VA_ARGS__); \
>           ^
> libbpf.c:83:30: note: in expansion of macro ‘__pr’
>  #define pr_warning(fmt, ...) __pr(__pr_warning, fmt, ##__VA_ARGS__)
>                               ^~~~
> libbpf.c:1072:3: note: in expansion of macro ‘pr_warning’
>    pr_warning("map:%s value_type:%s has BTF type_size:%ld != value_size:%u\n",
> 
> To fix, typecast 'key_size' and amend format string.
> 
> Signed-off-by: Sirio Balmelli <sirio@b-ad.ch>

Applied to bpf-next, thanks Sirio!
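
The class of warning quoted above comes from int64_t being 'long' on
typical 64-bit Linux targets but 'long long' on arm32, so "%ld" only
matches on the former. A minimal sketch of one portable pattern follows:
cast to long long and print with "%lld". The function name and message
text are illustrative and not taken from the applied patch; the PRId64
macro from <inttypes.h> would be another option.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative only: format a 64-bit size next to a 32-bit size in a way
 * that builds cleanly on both 32-bit and 64-bit targets.
 */
static int format_size_mismatch(char *out, size_t out_len,
				int64_t type_size, uint32_t value_size)
{
	/* The explicit casts make the arguments match %lld and %u everywhere. */
	return snprintf(out, out_len,
			"BTF type_size:%lld != value_size:%u",
			(long long)type_size, (unsigned int)value_size);
}
```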


* Re: [PATCH net-next 00/13] nfp: abm: add basic support for advanced buffering NIC
From: David Miller @ 2018-05-23 18:28 UTC (permalink / raw)
  To: jakub.kicinski; +Cc: netdev, oss-drivers
In-Reply-To: <20180522051255.9438-1-jakub.kicinski@netronome.com>

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Mon, 21 May 2018 22:12:42 -0700

> This series lays groundwork for advanced buffer management NIC feature.
> It makes necessary NFP core changes, spawns representors and adds devlink
> glue.  Following series will add the actual buffering configuration (patch
> series size limit).
> 
> First three patches add support for configuring NFP buffer pools via a
> mailbox.  The existing devlink APIs are used for the purpose.
> 
> Third patch allows us to perform small reads from the NFP memory.
> 
> The rest of the patch set adds eswitch mode change support and makes
> the driver spawn appropriate representors.

Series applied, thank you!


* [PATCH net] ipv4: remove warning in ip_recv_error
From: Willem de Bruijn @ 2018-05-23 18:29 UTC (permalink / raw)
  To: netdev; +Cc: davem, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

A precondition check in ip_recv_error triggered on an otherwise benign
race. Remove the warning.

The warning triggers when passing an ipv6 socket to this ipv4 error
handling function. RaceFuzzer was able to trigger it due to a race
in setsockopt IPV6_ADDRFORM.

  ---
  CPU0
    do_ipv6_setsockopt
      sk->sk_socket->ops = &inet_dgram_ops;

  ---
  CPU1
    sk->sk_prot->recvmsg
      udp_recvmsg
        ip_recv_error
          WARN_ON_ONCE(sk->sk_family == AF_INET6);

  ---
  CPU0
    do_ipv6_setsockopt
      sk->sk_family = PF_INET;

This socket option converts a v6 socket that is connected to a v4 peer
to a v4 socket. It updates the socket on the fly, changing fields in
sk as well as other structs. This is inherently non-atomic. It races
with the lockless udp_recvmsg path.

No other code makes an assumption that these fields are updated
atomically. It is benign here, too, as ip_recv_error cares only about
the protocol of the skbs enqueued on the error queue, for which
sk_family is not a precise predictor (thanks to another issue with
IPV6_ADDRFORM).

Link: http://lkml.kernel.org/r/20180518120826.GA19515@dragonet.kaist.ac.kr
Fixes: 7ce875e5ecb8 ("ipv4: warn once on passing AF_INET6 socket to ip_recv_error")
Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 net/ipv4/ip_sockglue.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 5ad2d8ed3a3f..57bbb060faaf 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -505,8 +505,6 @@ int ip_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 	int err;
 	int copied;

-	WARN_ON_ONCE(sk->sk_family == AF_INET6);
-
 	err = -EAGAIN;
 	skb = sock_dequeue_err_skb(sk);
 	if (!skb)
-- 
2.17.0.441.gb46fe60e1d-goog

^ permalink raw reply related
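
The interleaving in the commit message above can be modelled outside the
kernel. The following is only a toy sketch: the toy_* types and functions are
hypothetical stand-ins, not kernel code, and the two "steps" simply mirror the
two CPU0 stores shown in the trace. It illustrates why a reader that runs
between the two stores sees a socket that is already on the v4 path but still
reports the v6 family.

```c
#include <stdbool.h>

/* Toy stand-ins for the two fields the conversion rewrites; not kernel types. */
enum toy_family { TOY_AF_INET, TOY_AF_INET6 };
enum toy_ops { TOY_INET_DGRAM_OPS, TOY_INET6_DGRAM_OPS };

struct toy_sock {
	enum toy_ops ops;	/* models sk->sk_socket->ops */
	enum toy_family family;	/* models sk->sk_family */
};

/* First CPU0 store in the trace: switch to the v4 ops table. */
static void toy_addrform_step1(struct toy_sock *sk)
{
	sk->ops = TOY_INET_DGRAM_OPS;
}

/* Second CPU0 store in the trace: switch the address family. */
static void toy_addrform_step2(struct toy_sock *sk)
{
	sk->family = TOY_AF_INET;
}

/*
 * Models the removed check. The real WARN_ON_ONCE only tested the family;
 * the ops test here stands for "the caller already reached the v4 handler".
 */
static bool toy_warning_would_fire(const struct toy_sock *sk)
{
	return sk->ops == TOY_INET_DGRAM_OPS && sk->family == TOY_AF_INET6;
}
```

In this model the check is false before step 1 and after step 2, and true
only in the window between them, which is the window RaceFuzzer hit.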

* Re: WARNING in ip_recv_error
From: Willem de Bruijn @ 2018-05-23 18:30 UTC (permalink / raw)
  To: David Miller
  Cc: Eric Dumazet, DaeLyong Jeong, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Network Development, LKML, Byoungyoung Lee, Kyungtae Kim,
	bammanag, Willem de Bruijn
In-Reply-To: <CAF=yD-KTfUbXGvU7qQy4=eHbuUB88=g_tQ8sp8TEebhW=rzKVQ@mail.gmail.com>

On Wed, May 23, 2018 at 11:40 AM, Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
> On Sun, May 20, 2018 at 7:13 PM, Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>> On Fri, May 18, 2018 at 2:59 PM, Willem de Bruijn
>> <willemdebruijn.kernel@gmail.com> wrote:
>>> On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
>>> <willemdebruijn.kernel@gmail.com> wrote:
>>>> On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
>>>> <willemdebruijn.kernel@gmail.com> wrote:
>>>>> On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
>>>>> <willemdebruijn.kernel@gmail.com> wrote:
>>>>>> On Fri, May 18, 2018 at 11:44 AM, David Miller <davem@davemloft.net> wrote:
>>>>>>> From: Eric Dumazet <eric.dumazet@gmail.com>
>>>>>>> Date: Fri, 18 May 2018 08:30:43 -0700
>>>>>>>
>>>>>>>> We probably need to revert Willem patch (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>>>>>>>
>>>>>>> Is it really valid to reach ip_recv_err with an ipv6 socket?
>>>>>>
>>>>>> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
>>>>>> atomic operation, so that the socket is neither fully ipv4 nor fully
>>>>>> ipv6 by the time it reaches ip_recv_error.
>>>>>>
>>>>>>   sk->sk_socket->ops = &inet_dgram_ops;
>>>>>>   < HERE >
>>>>>>   sk->sk_family = PF_INET;
>>>>>>
>>>>>> Even calling inet_recv_error to demux would not necessarily help.
>>>>>>
>>>>>> Safest would be to look up by skb->protocol, similar to what
>>>>>> ipv6_recv_error does to handle v4-mapped-v6.
>>>>>>
>>>>>> Or to make that function safe with PF_INET and swap the order
>>>>>> of the above two operations.
>>>>>>
>>>>>> All sound needlessly complicated for this rare socket option, but
>>>>>> I don't have a better idea yet. Dropping on the floor is not nice,
>>>>>> either.
>>>>>
>>>>> Ensuring that ip_recv_error correctly handles packets from either
>>>>> socket and removing the warning should indeed be good.
>>>>>
>>>>> It is robust against v4-mapped packets from an AF_INET6 socket,
>>>>> but see caveat on reconnect below.
>>>>>
>>>>> The code between ipv6_recv_error for v4-mapped addresses and
>>>>> ip_recv_error is essentially the same, the main difference being
>>>>> whether to return network headers as sockaddr_in with SOL_IP
>>>>> or sockaddr_in6 with SOL_IPV6.
>>>>>
>>>>> There are very few other locations in the stack that explicitly test
>>>>> sk_family in this way and thus would be vulnerable to races with
>>>>> IPV6_ADDRFORM.
>>>>>
>>>>> I'm not sure whether it is possible for a udpv6 socket to queue a
>>>>> real ipv6 packet on the error queue, disconnect, connect to an
>>>>> ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
>>>>> on a true ipv6 packet. That would return buggy data, e.g., in
>>>>> msg_name.
>>>>
>>>> In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
>>>> error queue is empty, and then take its lock for the duration of the
>>>> operation.
>>>
>>> Actually, no reason to hold the lock. This setsockopt holds the socket
>>> lock, which connect would need, too. So testing that the queue
>>> is empty after testing that it is connected to a v4 address is
>>> sufficient to ensure that no ipv6 packets are queued for reception.
>>>
>>> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
>>> index 4d780c7f0130..a975d6311341 100644
>>> --- a/net/ipv6/ipv6_sockglue.c
>>> +++ b/net/ipv6/ipv6_sockglue.c
>>> @@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
>>> int level, int optname,
>>>
>>>                         if (ipv6_only_sock(sk) ||
>>>                             !ipv6_addr_v4mapped(&sk->sk_v6_daddr)) {
>>>                                 retv = -EADDRNOTAVAIL;
>>>                                 break;
>>>                         }
>>>
>>> +                       if (!skb_queue_empty(&sk->sk_error_queue)) {
>>> +                               retv = -EBUSY;
>>> +                               break;
>>> +                       }
>>> +
>>>                         fl6_free_socklist(sk);
>>>                         __ipv6_sock_mc_close(sk);
>>>
>>> After this it should be safe to remove the warning in ip_recv_error.
>>
>> Hmm.. nope.
>>
>> This ensures that the socket cannot produce any new true v6 packets.
>> But it does not guarantee that they are not already in the system, e.g.
>> queued in tc, and will find their way to the error queue later.
>>
>> We'll have to just be able to handle ipv6 packets in ip_recv_error.
>> Since IPV6_ADDRFORM is used to pass to legacy v4-only
>> processes and those likely are only confused by SOL_IPV6
>> error messages, it is probably best to just drop them and perhaps
>> WARN_ONCE.
>
> Even more fun, this is not limited to the error queue.
>
> I can queue a v6 packet for reception on a socket, connect to a v4
> address, call IPV6_ADDRFORM and then a regular recvfrom will
> return a partial v6 address as AF_INET.
>
> We definitely do not want to have to add a check
>
>   if (skb->protocol == htons(ETH_P_IPV6)) {
>     kfree_skb(skb);
>     goto try_again;
>   }
>
> to the normal recvmsg path.
>
> An alternative may be to tighten the check on when to allow
> IPV6_ADDRFORM. Not only return EBUSY if a packet is pending,
> but also if any sk_{rmem, omem, wmem}_alloc is non-zero. Only,
> these tightened constraints could break a legacy application.
>
> Either way, this race is somewhat tangential to the one that
> RaceFuzzer found. The sk changes that IPV6_ADDRFORM makes
> to sk_prot, sk_socket->ops and sk_family are not atomic and will
> not be. They need not be, because no other code assumes this
> consistency.
>
> So I'll start by removing the warning as Eric suggested.

http://patchwork.ozlabs.org/patch/919270/

^ permalink raw reply

* Re: [PATCH net] tuntap: correctly set SOCKWQ_ASYNC_NOSPACE
From: David Miller @ 2018-05-23 18:32 UTC (permalink / raw)
  To: jasowang; +Cc: netdev, linux-kernel, mst, hannes, edumazet
In-Reply-To: <1526970064-29711-1-git-send-email-jasowang@redhat.com>

From: Jason Wang <jasowang@redhat.com>
Date: Tue, 22 May 2018 14:21:04 +0800

> When link is down, writes to the device might fail with
> -EIO. Userspace needs an indication when the status is resolved.  As a
> fix, tun_net_open() attempts to wake up writers - but that is only
> effective if SOCKWQ_ASYNC_NOSPACE has been set in the past. This is
> not the case for vhost_net, which only polls for EPOLLOUT after it meets
> errors during sendmsg().
> 
> This patch fixes this by making sure SOCKWQ_ASYNC_NOSPACE is set when
> socket is not writable or device is down to guarantee EPOLLOUT will be
> raised in either tun_chr_poll() or tun_sock_write_space() after device
> is up.
> 
> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Cc: Eric Dumazet <edumazet@google.com>
> Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Applied and queued up for -stable, thanks Jason.

^ permalink raw reply

* [PATCH v2 bpf-next] bpf: btf: Avoid variable length array
From: Martin KaFai Lau @ 2018-05-23 18:32 UTC (permalink / raw)
  To: netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team

Sparse warning:
kernel/bpf/btf.c:1985:34: warning: Variable length array is used.

This patch directly uses ARRAY_SIZE().

Fixes: f80442a4cd18 ("bpf: btf: Change how section is supported in btf_header")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 kernel/bpf/btf.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 9cbeabb5aca3..7e90fd13b5b5 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -1981,8 +1981,7 @@ static int btf_sec_info_cmp(const void *a, const void *b)
 static int btf_check_sec_info(struct btf_verifier_env *env,
 			      u32 btf_data_size)
 {
-	const unsigned int nr_secs = ARRAY_SIZE(btf_sec_info_offset);
-	struct btf_sec_info secs[nr_secs];
+	struct btf_sec_info secs[ARRAY_SIZE(btf_sec_info_offset)];
 	u32 total, expected_total, i;
 	const struct btf_header *hdr;
 	const struct btf *btf;
@@ -1991,17 +1990,17 @@ static int btf_check_sec_info(struct btf_verifier_env *env,
 	hdr = &btf->hdr;
 
 	/* Populate the secs from hdr */
-	for (i = 0; i < nr_secs; i++)
+	for (i = 0; i < ARRAY_SIZE(btf_sec_info_offset); i++)
 		secs[i] = *(struct btf_sec_info *)((void *)hdr +
 						   btf_sec_info_offset[i]);
 
-	sort(secs, nr_secs, sizeof(struct btf_sec_info),
-	     btf_sec_info_cmp, NULL);
+	sort(secs, ARRAY_SIZE(btf_sec_info_offset),
+	     sizeof(struct btf_sec_info), btf_sec_info_cmp, NULL);
 
 	/* Check for gaps and overlap among sections */
 	total = 0;
 	expected_total = btf_data_size - hdr->hdr_len;
-	for (i = 0; i < nr_secs; i++) {
+	for (i = 0; i < ARRAY_SIZE(btf_sec_info_offset); i++) {
 		if (expected_total < secs[i].off) {
 			btf_verifier_log(env, "Invalid section offset");
 			return -EINVAL;
-- 
2.9.5

^ permalink raw reply related
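
For context on the sparse warning in the patch above: in C, a const-qualified
variable is not an integer constant expression, so an array sized by it is a
variable length array even when its value is known at build time. The sketch
below is a standalone illustration, not the btf.c code. The demo_* names are
hypothetical, and ARRAY_SIZE is a simplified version of the kernel macro
(which also rejects non-array arguments).

```c
#include <stddef.h>

/* Simplified ARRAY_SIZE; the kernel macro additionally rejects pointers. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical stand-ins for btf_sec_info_offset and struct btf_sec_info. */
static const size_t demo_sec_offset[] = { 8, 16 };

struct demo_sec_info {
	unsigned int off;
	unsigned int len;
};

static size_t demo_nr_secs(void)
{
	/* The bound is an integer constant expression: a fixed-size array. */
	struct demo_sec_info secs[ARRAY_SIZE(demo_sec_offset)];

	/*
	 * sizeof of a fixed-size array is a compile-time constant, so this
	 * assertion is accepted; it would not compile if secs were a VLA.
	 */
	_Static_assert(sizeof(secs) == 2 * sizeof(struct demo_sec_info),
		       "secs must have a compile-time size");

	return ARRAY_SIZE(secs);
}
```

Declaring "const unsigned int n = ARRAY_SIZE(demo_sec_offset);" and then
"secs[n]" would produce the same object code with most compilers, but it is
formally a VLA, which is what sparse reports.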

* Re: [PATCH net-next v2 0/3] net: sfp: small improvements
From: David Miller @ 2018-05-23 18:34 UTC (permalink / raw)
  To: antoine.tenart
  Cc: linux, netdev, linux-kernel, thomas.petazzoni, maxime.chevallier,
	gregory.clement, miquel.raynal, nadavh, stefanc, ymarkman, mw
In-Reply-To: <20180522101801.18947-1-antoine.tenart@bootlin.com>

From: Antoine Tenart <antoine.tenart@bootlin.com>
Date: Tue, 22 May 2018 12:17:58 +0200

> A small series of patches improving the SFP support by adding a warning
> when no Tx disable pin is available, and making the i2c-bus property
> mandatory.
 ...
> Since v1:
>   - Removed the patch fixing the sfp driver when no i2c bus was described.
>   - Made two new patches to make the i2c-bus property mandatory for sfp modules.

Series applied, thank you.

^ permalink raw reply

* Re: [PATCH v3] powerpc: Implement csum_ipv6_magic in assembly
From: Segher Boessenkool @ 2018-05-23 18:34 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	linux-kernel, linuxppc-dev, netdev
In-Reply-To: <20180522065701.9DE696CCB4@po14934vm.idsi0.si.c-s.fr>

On Tue, May 22, 2018 at 08:57:01AM +0200, Christophe Leroy wrote:
> The generic csum_ipv6_magic() generates a pretty bad result

<snip>

Please try with a more recent compiler, what you used is pretty ancient.
It's not like recent compilers do great on this either, but it's not
*that* bad anymore ;-)

> --- a/arch/powerpc/lib/checksum_32.S
> +++ b/arch/powerpc/lib/checksum_32.S
> @@ -293,3 +293,36 @@ dst_error:
>  	EX_TABLE(51b, dst_error);
>  
>  EXPORT_SYMBOL(csum_partial_copy_generic)
> +
> +/*
> + * static inline __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
> + *				      const struct in6_addr *daddr,
> + *				      __u32 len, __u8 proto, __wsum sum)
> + */
> +
> +_GLOBAL(csum_ipv6_magic)
> +	lwz	r8, 0(r3)
> +	lwz	r9, 4(r3)
> +	lwz	r10, 8(r3)
> +	lwz	r11, 12(r3)
> +	addc	r0, r5, r6
> +	adde	r0, r0, r7
> +	adde	r0, r0, r8
> +	adde	r0, r0, r9
> +	adde	r0, r0, r10
> +	adde	r0, r0, r11
> +	lwz	r8, 0(r4)
> +	lwz	r9, 4(r4)
> +	lwz	r10, 8(r4)
> +	lwz	r11, 12(r4)
> +	adde	r0, r0, r8
> +	adde	r0, r0, r9
> +	adde	r0, r0, r10
> +	adde	r0, r0, r11
> +	addze	r0, r0
> +	rotlwi	r3, r0, 16
> +	add	r3, r0, r3
> +	not	r3, r3
> +	rlwinm	r3, r3, 16, 16, 31
> +	blr
> +EXPORT_SYMBOL(csum_ipv6_magic)

Clustering the loads and carry insns together is pretty much the worst you
can do on most 32-bit CPUs.


Segher

^ permalink raw reply
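
As a reading aid for the assembly quoted above, here is a portable C sketch
of the value it computes: a ones' complement sum over the eight address
words plus len, proto and the incoming sum, folded to 16 bits and inverted.
It says nothing about instruction scheduling, which is the point of the
review comment. Assumptions: the function name is hypothetical, the address
words are passed as already-loaded 32-bit values (as lwz yields on big-endian
32-bit powerpc), and a little-endian host would need byte-order handling that
this sketch leaves out.

```c
#include <stdint.h>

static uint16_t csum_ipv6_magic_sketch(const uint32_t saddr[4],
				       const uint32_t daddr[4],
				       uint32_t len, uint8_t proto,
				       uint32_t sum)
{
	/* A 64-bit accumulator collects the carries that addc/adde chain. */
	uint64_t acc = (uint64_t)len + proto + sum;
	uint32_t folded;
	int i;

	for (i = 0; i < 4; i++)
		acc += (uint64_t)saddr[i] + daddr[i];

	/* End-around carry: add the bits above 32 back in, twice. */
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);

	/* Fold 32 bits to 16, again with end-around carry. */
	folded = (uint32_t)acc;
	folded = (folded & 0xffffu) + (folded >> 16);
	folded = (folded & 0xffffu) + (folded >> 16);

	/* The checksum is the complement of the folded sum. */
	return (uint16_t)~folded;
}
```

The final three steps correspond to the rotlwi/add/not/rlwinm tail of the
assembly; the two forms differ in how they fold but yield the same 16-bit
result.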

* Re: [PATCH net-next] enic: set DMA mask to 47 bit
From: David Miller @ 2018-05-23 18:37 UTC (permalink / raw)
  To: gvaradar; +Cc: netdev, benve
In-Reply-To: <20180522133724.1188-1-gvaradar@cisco.com>

From: Govindarajulu Varadarajan <gvaradar@cisco.com>
Date: Tue, 22 May 2018 06:37:24 -0700

> In commit 624dbf55a359b ("driver/net: enic: Try DMA 64 first, then
> failover to DMA") DMA mask was changed from 40 bits to 64 bits.
> Hardware actually supports only 47 bits.
> 
> Fixes: 624dbf55a359b("driver/net: enic: Try DMA 64 first, then failover
> to DMA")

Do not chop up long Fixes: tag lines, they should always be one
single line.

Also, need a space after the SHA1 ID.

> Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>

This is a fix for a bug which is very old, back to 3.12.  Therefore
you should target the 'net' tree, rather than 'net-next'.

^ permalink raw reply
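
The two formatting points raised in the review above (one unwrapped line, and
a space between the SHA1 ID and the parenthesised subject) can be checked
mechanically. The helper below is a hypothetical sketch, not an existing
kernel or checkpatch function. It assumes the usual shape
Fixes: <sha1> ("<subject>") and the 12-character minimum for the abbreviated
SHA1 that the kernel's patch submission documentation recommends.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if line has the shape  Fixes: <sha1> ("<subject>") */
static bool fixes_tag_looks_ok(const char *line)
{
	static const char prefix[] = "Fixes: ";
	size_t len = strlen(line);
	size_t i = sizeof(prefix) - 1;
	size_t hex = 0;

	if (strncmp(line, prefix, i) != 0)
		return false;
	if (strchr(line, '\n'))		/* the tag must not be wrapped */
		return false;

	/* Count the hex digits of the abbreviated or full SHA1. */
	while (isxdigit((unsigned char)line[i])) {
		i++;
		hex++;
	}
	if (hex < 12 || hex > 40)
		return false;

	/* One space, then (" ... ") running to the end of the line. */
	if (strncmp(line + i, " (\"", 3) != 0)
		return false;
	return len > i + 5 && strcmp(line + len - 2, "\")") == 0;
}
```

With this check, the tag as submitted fails twice (missing space, wrapped
line), while the same tag on one line with a space after 624dbf55a359b passes.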

* Re: [PATCH net-next v2 0/3] net: sfp: small improvements
From: Florian Fainelli @ 2018-05-23 18:40 UTC (permalink / raw)
  To: Antoine Tenart, davem, linux
  Cc: netdev, linux-kernel, thomas.petazzoni, maxime.chevallier,
	gregory.clement, miquel.raynal, nadavh, stefanc, ymarkman, mw
In-Reply-To: <20180522101801.18947-1-antoine.tenart@bootlin.com>

On 05/22/2018 03:17 AM, Antoine Tenart wrote:
> Hi Russell, David,
> 
> A small series of patches improving the SFP support by adding a warning
> when no Tx disable pin is available, and making the i2c-bus property
> mandatory.
> 
> Thanks!
> Antoine

Antoine, can you please CC the people who worked on that code before
and, arguably, send an update to the MAINTAINERS file to create a
specific section for PHYLINK?

Thank you

> 
> Since v1:
>   - Removed the patch fixing the sfp driver when no i2c bus was described.
>   - Made two new patches to make the i2c-bus property mandatory for sfp modules.
> 
> Since the phylink series:
>   - s/-EOPNOTSUPP/-ENODEV/ in patch 1/2.
>   - I added the acked-by tag in patch 2/2.
> 
> Antoine Tenart (3):
>   net: phy: sfp: warn the user when no tx_disable pin is available
>   net: phy: sfp: make the i2c-bus dt property mandatory
>   Documentation/bindings: net: the sfp i2c-bus property is now mandatory
> 
>  .../devicetree/bindings/net/sff,sfp.txt       |  4 +-
>  drivers/net/phy/sfp.c                         | 37 ++++++++++++-------
>  2 files changed, 26 insertions(+), 15 deletions(-)
> 


-- 
Florian

^ permalink raw reply

* [PATCH 00/18] Netfilter updates for net-next
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter updates for your net-next
tree, they are:

1) Remove obsolete nf_log tracing from nf_tables, from Florian Westphal.

2) Add support for map lookups to numgen, random and hash expressions,
   from Laura Garcia.

3) Allow registering nat hooks for iptables and nftables at the same
   time. Patchset from Florian Westphal.

4) Timeout support for rbtree sets.

5) ip6_rpfilter needs the input interface for link-local addresses, from
   Vincent Bernat.

6) Add nf_ct_hook and nf_nat_hook structures and use them.

7) Do not drop packets when they race to insert conntrack entries
   into hashes; this is particularly a problem in nfqueue setups.

8) Address fallout from xt_osf separation to nf_osf, patches
   from Florian Westphal and Fernando Mancera.

9) Remove reference to struct nft_af_info, which doesn't exist anymore.
   From Taehee Yoo.

This batch comes with a conflict between 25fd386e0bc0 ("netfilter:
core: add missing __rcu annotation") in your tree and 2c205dd3981f
("netfilter: add struct nf_nat_hook and use it") coming in this batch.
This conflict can be solved by leaving the __rcu tag on
__netfilter_net_init() - added by 25fd386e0bc0 - and removing all code
related to nf_nat_decode_session_hook - which is gone after
2c205dd3981f, as described by:

diff --cc net/netfilter/core.c
index e0ae4aae96f5,206fb2c4c319..168af54db975
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@@ -611,7 -580,13 +611,8 @@@ const struct nf_conntrack_zone nf_ct_zo
  EXPORT_SYMBOL_GPL(nf_ct_zone_dflt);
  #endif /* CONFIG_NF_CONNTRACK */
  
- static void __net_init __netfilter_net_init(struct nf_hook_entries **e, int max)
 -#ifdef CONFIG_NF_NAT_NEEDED
 -void (*nf_nat_decode_session_hook)(struct sk_buff *, struct flowi *);
 -EXPORT_SYMBOL(nf_nat_decode_session_hook);
 -#endif
 -
+ static void __net_init
+ __netfilter_net_init(struct nf_hook_entries __rcu **e, int max)
  {
  	int h;
  

I can also merge your net-next tree into nf-next, solve the conflict and
resend the pull request if you prefer.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git

Thanks.

----------------------------------------------------------------

The following changes since commit 289e1f4e9e4a09c73a1c0152bb93855ea351ccda:

  net: ipv4: ipconfig: fix unused variable (2018-05-13 20:27:25 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git HEAD

for you to fetch changes up to 0c6bca747111dee19aa48c8f73d77fc85fcb8dd0:

  netfilter: nf_tables: remove nft_af_info. (2018-05-23 12:16:25 +0200)

----------------------------------------------------------------
Fernando Fernandez Mancera (1):
      netfilter: make NF_OSF non-visible symbol

Florian Westphal (9):
      netfilter: fix fallout from xt/nf osf separation
      netfilter: nf_tables: remove old nf_log based tracing
      netfilter: nf_nat: move common nat code to nat core
      netfilter: xtables: allow table definitions not backed by hook_ops
      netfilter: nf_tables: allow chain type to override hook register
      netfilter: core: export raw versions of add/delete hook functions
      netfilter: nf_nat: add nat hook register functions to nf_nat
      netfilter: nf_nat: add nat type hooks to nat core
      netfilter: lift one-nat-hook-only restriction

Laura Garcia Liebana (2):
      netfilter: nft_numgen: add map lookups for numgen random operations
      netfilter: nft_hash: add map lookups for hashing operations

Pablo Neira Ayuso (4):
      netfilter: nft_set_rbtree: add timeout support
      netfilter: add struct nf_ct_hook and use it
      netfilter: add struct nf_nat_hook and use it
      netfilter: nfnetlink_queue: resolve clash for unconfirmed conntracks

Taehee Yoo (1):
      netfilter: nf_tables: remove nft_af_info.

Vincent Bernat (1):
      netfilter: ip6t_rpfilter: provide input interface for route lookup

 include/linux/netfilter.h                |  34 +++-
 include/linux/netfilter/nf_osf.h         |   6 +
 include/net/netfilter/nf_nat.h           |   4 +
 include/net/netfilter/nf_nat_core.h      |  11 +-
 include/net/netfilter/nf_nat_l3proto.h   |  52 +-----
 include/net/netfilter/nf_tables.h        |   8 +-
 include/net/netns/nftables.h             |   2 -
 include/uapi/linux/netfilter/nf_osf.h    |   8 +-
 include/uapi/linux/netfilter/nf_tables.h |   4 +
 net/ipv4/netfilter/ip_tables.c           |   5 +-
 net/ipv4/netfilter/iptable_nat.c         |  85 ++++-----
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c | 135 ++++++--------
 net/ipv4/netfilter/nft_chain_nat_ipv4.c  |  52 ++----
 net/ipv6/netfilter/ip6_tables.c          |   5 +-
 net/ipv6/netfilter/ip6t_rpfilter.c       |   2 +
 net/ipv6/netfilter/ip6table_nat.c        |  84 ++++-----
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c | 129 ++++++--------
 net/ipv6/netfilter/nft_chain_nat_ipv6.c  |  48 ++---
 net/netfilter/Kconfig                    |   2 +-
 net/netfilter/core.c                     | 102 +++++++----
 net/netfilter/nf_conntrack_core.c        |  91 +++++++++-
 net/netfilter/nf_conntrack_netlink.c     |  10 +-
 net/netfilter/nf_internals.h             |   5 +
 net/netfilter/nf_nat_core.c              | 294 ++++++++++++++++++++++++++++---
 net/netfilter/nf_tables_api.c            |  87 ++-------
 net/netfilter/nf_tables_core.c           |  29 +--
 net/netfilter/nfnetlink_queue.c          |  28 ++-
 net/netfilter/nft_hash.c                 | 131 +++++++++++++-
 net/netfilter/nft_numgen.c               |  76 +++++++-
 net/netfilter/nft_set_rbtree.c           |  75 +++++++-
 30 files changed, 1033 insertions(+), 571 deletions(-)

^ permalink raw reply

* [PATCH 02/18] netfilter: nf_tables: remove old nf_log based tracing
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20180523184254.22599-1-pablo@netfilter.org>

From: Florian Westphal <fw@strlen.de>

nfnetlink tracing is available since nft 0.6 (June 2016).
Remove old nf_log based tracing to avoid the rule counter in the main loop.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_tables_core.c | 29 +++++++----------------------
 1 file changed, 7 insertions(+), 22 deletions(-)

diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c
index 9cf47c4cb9d5..d457d854fcae 100644
--- a/net/netfilter/nf_tables_core.c
+++ b/net/netfilter/nf_tables_core.c
@@ -41,7 +41,7 @@ static const struct nf_loginfo trace_loginfo = {
 
 static noinline void __nft_trace_packet(struct nft_traceinfo *info,
 					const struct nft_chain *chain,
-					int rulenum, enum nft_trace_types type)
+					enum nft_trace_types type)
 {
 	const struct nft_pktinfo *pkt = info->pkt;
 
@@ -52,22 +52,16 @@ static noinline void __nft_trace_packet(struct nft_traceinfo *info,
 	info->type = type;
 
 	nft_trace_notify(info);
-
-	nf_log_trace(nft_net(pkt), nft_pf(pkt), nft_hook(pkt), pkt->skb,
-		     nft_in(pkt), nft_out(pkt), &trace_loginfo,
-		     "TRACE: %s:%s:%s:%u ",
-		     chain->table->name, chain->name, comments[type], rulenum);
 }
 
 static inline void nft_trace_packet(struct nft_traceinfo *info,
 				    const struct nft_chain *chain,
 				    const struct nft_rule *rule,
-				    int rulenum,
 				    enum nft_trace_types type)
 {
 	if (static_branch_unlikely(&nft_trace_enabled)) {
 		info->rule = rule;
-		__nft_trace_packet(info, chain, rulenum, type);
+		__nft_trace_packet(info, chain, type);
 	}
 }
 
@@ -133,7 +127,6 @@ static noinline void nft_update_chain_stats(const struct nft_chain *chain,
 struct nft_jumpstack {
 	const struct nft_chain	*chain;
 	const struct nft_rule	*rule;
-	int			rulenum;
 };
 
 unsigned int
@@ -146,7 +139,6 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 	struct nft_regs regs;
 	unsigned int stackptr = 0;
 	struct nft_jumpstack jumpstack[NFT_JUMP_STACK_SIZE];
-	int rulenum;
 	unsigned int gencursor = nft_genmask_cur(net);
 	struct nft_traceinfo info;
 
@@ -154,7 +146,6 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 	if (static_branch_unlikely(&nft_trace_enabled))
 		nft_trace_init(&info, pkt, &regs.verdict, basechain);
 do_chain:
-	rulenum = 0;
 	rule = list_entry(&chain->rules, struct nft_rule, list);
 next_rule:
 	regs.verdict.code = NFT_CONTINUE;
@@ -164,8 +155,6 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 		if (unlikely(rule->genmask & gencursor))
 			continue;
 
-		rulenum++;
-
 		nft_rule_for_each_expr(expr, last, rule) {
 			if (expr->ops == &nft_cmp_fast_ops)
 				nft_cmp_fast_eval(expr, &regs);
@@ -183,7 +172,7 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 			continue;
 		case NFT_CONTINUE:
 			nft_trace_packet(&info, chain, rule,
-					 rulenum, NFT_TRACETYPE_RULE);
+					 NFT_TRACETYPE_RULE);
 			continue;
 		}
 		break;
@@ -195,7 +184,7 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 	case NF_QUEUE:
 	case NF_STOLEN:
 		nft_trace_packet(&info, chain, rule,
-				 rulenum, NFT_TRACETYPE_RULE);
+				 NFT_TRACETYPE_RULE);
 		return regs.verdict.code;
 	}
 
@@ -204,21 +193,19 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 		BUG_ON(stackptr >= NFT_JUMP_STACK_SIZE);
 		jumpstack[stackptr].chain = chain;
 		jumpstack[stackptr].rule  = rule;
-		jumpstack[stackptr].rulenum = rulenum;
 		stackptr++;
 		/* fall through */
 	case NFT_GOTO:
 		nft_trace_packet(&info, chain, rule,
-				 rulenum, NFT_TRACETYPE_RULE);
+				 NFT_TRACETYPE_RULE);
 
 		chain = regs.verdict.chain;
 		goto do_chain;
 	case NFT_CONTINUE:
-		rulenum++;
 		/* fall through */
 	case NFT_RETURN:
 		nft_trace_packet(&info, chain, rule,
-				 rulenum, NFT_TRACETYPE_RETURN);
+				 NFT_TRACETYPE_RETURN);
 		break;
 	default:
 		WARN_ON(1);
@@ -228,12 +215,10 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 		stackptr--;
 		chain = jumpstack[stackptr].chain;
 		rule  = jumpstack[stackptr].rule;
-		rulenum = jumpstack[stackptr].rulenum;
 		goto next_rule;
 	}
 
-	nft_trace_packet(&info, basechain, NULL, -1,
-			 NFT_TRACETYPE_POLICY);
+	nft_trace_packet(&info, basechain, NULL, NFT_TRACETYPE_POLICY);
 
 	if (static_branch_unlikely(&nft_counters_enabled))
 		nft_update_chain_stats(basechain, pkt);
-- 
2.11.0

^ permalink raw reply related

* [PATCH 03/18] netfilter: nft_numgen: add map lookups for numgen random operations
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20180523184254.22599-1-pablo@netfilter.org>

From: Laura Garcia Liebana <nevola@gmail.com>

This patch applies the existing map lookup support to random number
generation.

Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_numgen.c | 76 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 72 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nft_numgen.c b/net/netfilter/nft_numgen.c
index 8a64db8f2e69..cdbc62a53933 100644
--- a/net/netfilter/nft_numgen.c
+++ b/net/netfilter/nft_numgen.c
@@ -166,18 +166,43 @@ struct nft_ng_random {
 	enum nft_registers      dreg:8;
 	u32			modulus;
 	u32			offset;
+	struct nft_set		*map;
 };
 
+static u32 nft_ng_random_gen(struct nft_ng_random *priv)
+{
+	struct rnd_state *state = this_cpu_ptr(&nft_numgen_prandom_state);
+
+	return reciprocal_scale(prandom_u32_state(state), priv->modulus) +
+	       priv->offset;
+}
+
 static void nft_ng_random_eval(const struct nft_expr *expr,
 			       struct nft_regs *regs,
 			       const struct nft_pktinfo *pkt)
 {
 	struct nft_ng_random *priv = nft_expr_priv(expr);
-	struct rnd_state *state = this_cpu_ptr(&nft_numgen_prandom_state);
-	u32 val;
 
-	val = reciprocal_scale(prandom_u32_state(state), priv->modulus);
-	regs->data[priv->dreg] = val + priv->offset;
+	regs->data[priv->dreg] = nft_ng_random_gen(priv);
+}
+
+static void nft_ng_random_map_eval(const struct nft_expr *expr,
+				   struct nft_regs *regs,
+				   const struct nft_pktinfo *pkt)
+{
+	struct nft_ng_random *priv = nft_expr_priv(expr);
+	const struct nft_set *map = priv->map;
+	const struct nft_set_ext *ext;
+	u32 result;
+	bool found;
+
+	result = nft_ng_random_gen(priv);
+	found = map->ops->lookup(nft_net(pkt), map, &result, &ext);
+	if (!found)
+		return;
+
+	nft_data_copy(&regs->data[priv->dreg],
+		      nft_set_ext_data(ext), map->dlen);
 }
 
 static int nft_ng_random_init(const struct nft_ctx *ctx,
@@ -204,6 +229,23 @@ static int nft_ng_random_init(const struct nft_ctx *ctx,
 					   NFT_DATA_VALUE, sizeof(u32));
 }
 
+static int nft_ng_random_map_init(const struct nft_ctx *ctx,
+				  const struct nft_expr *expr,
+				  const struct nlattr * const tb[])
+{
+	struct nft_ng_random *priv = nft_expr_priv(expr);
+	u8 genmask = nft_genmask_next(ctx->net);
+
+	nft_ng_random_init(ctx, expr, tb);
+	priv->map = nft_set_lookup_global(ctx->net, ctx->table,
+					  tb[NFTA_NG_SET_NAME],
+					  tb[NFTA_NG_SET_ID], genmask);
+	if (IS_ERR(priv->map))
+		return PTR_ERR(priv->map);
+
+	return 0;
+}
+
 static int nft_ng_random_dump(struct sk_buff *skb, const struct nft_expr *expr)
 {
 	const struct nft_ng_random *priv = nft_expr_priv(expr);
@@ -212,6 +254,22 @@ static int nft_ng_random_dump(struct sk_buff *skb, const struct nft_expr *expr)
 			   priv->offset);
 }
 
+static int nft_ng_random_map_dump(struct sk_buff *skb,
+				  const struct nft_expr *expr)
+{
+	const struct nft_ng_random *priv = nft_expr_priv(expr);
+
+	if (nft_ng_dump(skb, priv->dreg, priv->modulus,
+			NFT_NG_RANDOM, priv->offset) ||
+	    nla_put_string(skb, NFTA_NG_SET_NAME, priv->map->name))
+		goto nla_put_failure;
+
+	return 0;
+
+nla_put_failure:
+	return -1;
+}
+
 static struct nft_expr_type nft_ng_type;
 static const struct nft_expr_ops nft_ng_inc_ops = {
 	.type		= &nft_ng_type,
@@ -237,6 +295,14 @@ static const struct nft_expr_ops nft_ng_random_ops = {
 	.dump		= nft_ng_random_dump,
 };
 
+static const struct nft_expr_ops nft_ng_random_map_ops = {
+	.type		= &nft_ng_type,
+	.size		= NFT_EXPR_SIZE(sizeof(struct nft_ng_random)),
+	.eval		= nft_ng_random_map_eval,
+	.init		= nft_ng_random_map_init,
+	.dump		= nft_ng_random_map_dump,
+};
+
 static const struct nft_expr_ops *
 nft_ng_select_ops(const struct nft_ctx *ctx, const struct nlattr * const tb[])
 {
@@ -255,6 +321,8 @@ nft_ng_select_ops(const struct nft_ctx *ctx, const struct nlattr * const tb[])
 			return &nft_ng_inc_map_ops;
 		return &nft_ng_inc_ops;
 	case NFT_NG_RANDOM:
+		if (tb[NFTA_NG_SET_NAME])
+			return &nft_ng_random_map_ops;
 		return &nft_ng_random_ops;
 	}
 
-- 
2.11.0

^ permalink raw reply related
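
For readers unfamiliar with the helper used by nft_ng_random_gen() in the
patch above: reciprocal_scale() maps a 32-bit value into [0, modulus) with a
multiply and a shift instead of a division. The sketch below reproduces that
arithmetic in standalone C. The *_sketch names are hypothetical, and the
random value is passed in as a parameter (rather than drawn from
prandom_u32_state()) so the behaviour is deterministic.

```c
#include <stdint.h>

/*
 * Same arithmetic as the kernel's reciprocal_scale(): maps val from
 * [0, 2^32) into [0, ep_ro) without a division.
 */
static uint32_t reciprocal_scale_sketch(uint32_t val, uint32_t ep_ro)
{
	return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}

/* Hypothetical mirror of nft_ng_random_gen() with the random input exposed. */
static uint32_t ng_random_gen_sketch(uint32_t rnd, uint32_t modulus,
				     uint32_t offset)
{
	return reciprocal_scale_sketch(rnd, modulus) + offset;
}
```

With modulus 10 and offset 100, the smallest input (0) yields 100 and the
largest (0xffffffff) yields 109, so the value used as the map lookup key
stays within [offset, offset + modulus).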

* [PATCH 01/18] netfilter: fix fallout from xt/nf osf separation
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20180523184254.22599-1-pablo@netfilter.org>

From: Florian Westphal <fw@strlen.de>

Stephen Rothwell says:
  today's linux-next build (x86_64 allmodconfig) produced this warning:
  ./usr/include/linux/netfilter/nf_osf.h:25: found __[us]{8,16,32,64} type without #include <linux/types.h>

Fix that up and also move kernel-private struct out of uapi (it was not
exposed in any released kernel version).

Tested via allmodconfig build + make headers_check.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: bfb15f2a95cb ("netfilter: extract Passive OS fingerprint infrastructure from xt_osf")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter/nf_osf.h      | 6 ++++++
 include/uapi/linux/netfilter/nf_osf.h | 8 ++------
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/include/linux/netfilter/nf_osf.h b/include/linux/netfilter/nf_osf.h
index a2b39602e87d..0e114c492fb8 100644
--- a/include/linux/netfilter/nf_osf.h
+++ b/include/linux/netfilter/nf_osf.h
@@ -21,6 +21,12 @@ enum osf_fmatch_states {
 	FMATCH_OPT_WRONG,
 };
 
+struct nf_osf_finger {
+	struct rcu_head			rcu_head;
+	struct list_head		finger_entry;
+	struct nf_osf_user_finger	finger;
+};
+
 bool nf_osf_match(const struct sk_buff *skb, u_int8_t family,
 		  int hooknum, struct net_device *in, struct net_device *out,
 		  const struct nf_osf_info *info, struct net *net,
diff --git a/include/uapi/linux/netfilter/nf_osf.h b/include/uapi/linux/netfilter/nf_osf.h
index 45376eae31ef..8f2f2f403183 100644
--- a/include/uapi/linux/netfilter/nf_osf.h
+++ b/include/uapi/linux/netfilter/nf_osf.h
@@ -1,6 +1,8 @@
 #ifndef _NF_OSF_H
 #define _NF_OSF_H
 
+#include <linux/types.h>
+
 #define MAXGENRELEN	32
 
 #define NF_OSF_GENRE	(1 << 0)
@@ -57,12 +59,6 @@ struct nf_osf_user_finger {
 	struct nf_osf_opt	opt[MAX_IPOPTLEN];
 };
 
-struct nf_osf_finger {
-	struct rcu_head			rcu_head;
-	struct list_head		finger_entry;
-	struct nf_osf_user_finger	finger;
-};
-
 struct nf_osf_nlmsg {
 	struct nf_osf_user_finger	f;
 	struct iphdr			ip;
-- 
2.11.0

^ permalink raw reply related
