* Re: [PATCH 2/3] macb: Update compatibility string for SiFive FU540-C000
From: Paul Walmsley @ 2019-08-13 18:42 UTC (permalink / raw)
To: Nicolas Ferre, David Miller
Cc: Yash Shah, Rob Herring, netdev, devicetree,
linux-kernel@vger.kernel.org List, linux-riscv, Mark Rutland,
Palmer Dabbelt, Albert Ou, Petr Štetiar, Sachin Ghadi
In-Reply-To: <CAJ2_jOEHoh+D76VpAoVq3XnpAZEQxdQtaVX5eiKw5X4r+ypKVw@mail.gmail.com>
Dave, Nicolas,
On Mon, 22 Jul 2019, Yash Shah wrote:
> On Fri, Jul 19, 2019 at 5:36 PM <Nicolas.Ferre@microchip.com> wrote:
> >
> > On 19/07/2019 at 13:10, Yash Shah wrote:
> > > Update the compatibility string for SiFive FU540-C000 as per the new
> > > string updated in the binding doc.
> > > Reference: https://lkml.org/lkml/2019/7/17/200
> >
> > Maybe referring to lore.kernel.org is better:
> > https://lore.kernel.org/netdev/CAJ2_jOFEVZQat0Yprg4hem4jRrqkB72FKSeQj4p8P5KA-+rgww@mail.gmail.com/
>
> Sure. Will keep that in mind for future reference.
>
> >
> > > Signed-off-by: Yash Shah <yash.shah@sifive.com>
> >
> > Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
>
> Thanks.
I am assuming you'll pick this up for the -net tree for v5.4-rc1 or earlier.
If not, please let us know.
- Paul
* Re: [RESEND][PATCH v3 bpf-next] btf: expose BTF info through sysfs
From: Arnaldo Carvalho de Melo @ 2019-08-13 18:45 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Daniel Borkmann, Andrii Nakryiko, bpf, Networking,
Alexei Starovoitov, Kernel Team
In-Reply-To: <CAEf4BzZr4FGfy+QpDQzVxMxCGWx5DYCcu9jsQJWK235+f3Oigg@mail.gmail.com>
Em Tue, Aug 13, 2019 at 11:08:14AM -0700, Andrii Nakryiko escreveu:
> On Tue, Aug 13, 2019 at 7:20 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> > On 8/12/19 8:39 PM, Andrii Nakryiko wrote:
> > > 3. final vmlinux image is generated by linking this object file (and
> > > kallsyms, if necessary). sysfs_btf.c then creates
> > > /sys/kernel/btf/kernel file and exposes embedded BTF contents through
> > > > it. This allows, e.g., libbpf and bpftool to access BTF info at a
> > > > well-known location, without resorting to searching for vmlinux image
> > > > on disk (location of which is not standardized and vmlinux image
> > > > might not even be available in some scenarios, e.g., inside qemu
> > > during testing).
> > Small question: given modules will be covered later, would it not be more
> > obvious to name it /sys/kernel/btf/vmlinux instead?
> vmlinux totally makes sense, not sure why I didn't think about that initially...
Agreed :-)
> I'll follow up with a rename.
Great.
- Arnaldo
* [PATCH bpf-next V9 0/3] BPF: New helper to obtain namespace data from current task
From: Carlos Neira @ 2019-08-13 18:47 UTC (permalink / raw)
To: netdev; +Cc: yhs, ebiederm, brouer, cneirabustos, bpf
This helper obtains the active namespace from current and returns pid, tgid,
device and namespace id as seen from that namespace, which makes it possible
to instrument a process inside a container.
Device is read from /proc/self/ns/pid because, in the future, different
pid_ns files may belong to different devices, according to the discussion
between Eric Biederman and Yonghong at the 2017 Linux Plumbers Conference.
Currently bpf_get_current_pid_tgid() is used to do pid filtering in bcc's
scripts, but that helper returns the pid as seen by the root namespace, which
is fine only when a bcc script is not executed inside a container.
When the process of interest is inside a container, pid filtering will not work
if bpf_get_current_pid_tgid() is used. This helper addresses this limitation
by returning the pid as seen by the current namespace where the script is
executing.
This helper has the same use cases as bpf_get_current_pid_tgid(), as it can be
used to do pid filtering even inside a container.
For example, a bcc script using bpf_get_current_pid_tgid() (tools/funccount.py):
	u32 pid = bpf_get_current_pid_tgid() >> 32;
	if (pid != <pid_arg_passed_in>)
		return 0;
could be modified to use bpf_get_current_pidns_info() as follows:
	struct bpf_pidns_info pidns;
	bpf_get_current_pidns_info(&pidns, sizeof(pidns));
	u32 pid = pidns.tgid;
	u32 nsid = pidns.nsid;
	if ((pid != <pid_arg_passed_in>) || (nsid != <nsid_arg_passed_in>))
		return 0;
To find out the PID namespace id of a process, you could use this command:
$ ps -h -o pidns -p <pid_of_interest>
Or this other command:
$ ls -Li /proc/<pid_of_interest>/ns/pid
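The filtering idea and the namespace id lookup can be sketched in plain userspace C. This is only an illustration, not part of the patch: pidns_filter_matches() and read_pidns_inode() are hypothetical names, and the sketch assumes that the namespace id is the inode number of /proc/<pid>/ns/pid, which is the number the ls command above prints.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical predicate: an event is kept only when both the tgid and the
 * PID namespace id match the values the user asked for.
 */
static bool pidns_filter_matches(uint32_t tgid, uint32_t nsid,
				 uint32_t want_tgid, uint32_t want_nsid)
{
	return tgid == want_tgid && nsid == want_nsid;
}

/* Hypothetical helper: read the PID namespace inode of <pid> from procfs.
 * Returns 0 on success and stores the inode in *nsid, -1 on failure.
 */
static int read_pidns_inode(pid_t pid, uint32_t *nsid)
{
	char path[64];
	struct stat st;

	snprintf(path, sizeof(path), "/proc/%d/ns/pid", (int)pid);
	if (stat(path, &st) != 0)
		return -1;
	*nsid = (uint32_t)st.st_ino;
	return 0;
}
```

A tracing tool could call read_pidns_inode() once for the process of interest and pass the result, together with the pid as seen inside the container, as the two filter arguments.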
Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
Carlos Neira (3):
bpf: new helper to obtain namespace data from current task
samples/bpf: added sample code for bpf_get_current_pidns_info.
tools/testing/selftests/bpf: Add self-tests for new helper.
fs/internal.h | 2 -
fs/namei.c | 1 -
include/linux/bpf.h | 1 +
include/linux/namei.h | 4 +
include/uapi/linux/bpf.h | 31 ++++-
kernel/bpf/core.c | 1 +
kernel/bpf/helpers.c | 64 ++++++++++
kernel/trace/bpf_trace.c | 2 +
samples/bpf/Makefile | 3 +
samples/bpf/trace_ns_info_user.c | 35 ++++++
samples/bpf/trace_ns_info_user_kern.c | 44 +++++++
tools/include/uapi/linux/bpf.h | 31 ++++-
tools/testing/selftests/bpf/Makefile | 2 +-
tools/testing/selftests/bpf/bpf_helpers.h | 3 +
.../testing/selftests/bpf/progs/test_pidns_kern.c | 51 ++++++++
tools/testing/selftests/bpf/test_pidns.c | 138 +++++++++++++++++++++
16 files changed, 407 insertions(+), 6 deletions(-)
create mode 100644 samples/bpf/trace_ns_info_user.c
create mode 100644 samples/bpf/trace_ns_info_user_kern.c
create mode 100644 tools/testing/selftests/bpf/progs/test_pidns_kern.c
create mode 100644 tools/testing/selftests/bpf/test_pidns.c
--
2.11.0
* [PATCH bpf-next V9 1/3] bpf: new helper to obtain namespace data from current task
From: Carlos Neira @ 2019-08-13 18:47 UTC (permalink / raw)
To: netdev; +Cc: yhs, ebiederm, brouer, cneirabustos, bpf
In-Reply-To: <20190813184747.12225-1-cneirabustos@gmail.com>
From: Carlos <cneirabustos@gmail.com>
New bpf helper bpf_get_current_pidns_info.
This helper obtains the active namespace from current and returns
pid, tgid, device and namespace id as seen from that namespace,
which makes it possible to instrument a process inside a container.
Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
---
fs/internal.h | 2 --
fs/namei.c | 1 -
include/linux/bpf.h | 1 +
include/linux/namei.h | 4 +++
include/uapi/linux/bpf.h | 31 ++++++++++++++++++++++-
kernel/bpf/core.c | 1 +
kernel/bpf/helpers.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
kernel/trace/bpf_trace.c | 2 ++
8 files changed, 102 insertions(+), 4 deletions(-)
diff --git a/fs/internal.h b/fs/internal.h
index 315fcd8d237c..6647e15dd419 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -59,8 +59,6 @@ extern int finish_clean_context(struct fs_context *fc);
/*
* namei.c
*/
-extern int filename_lookup(int dfd, struct filename *name, unsigned flags,
- struct path *path, struct path *root);
extern int user_path_mountpoint_at(int, const char __user *, unsigned int, struct path *);
extern int vfs_path_lookup(struct dentry *, struct vfsmount *,
const char *, unsigned int, struct path *);
diff --git a/fs/namei.c b/fs/namei.c
index 209c51a5226c..a89fc72a4a10 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -19,7 +19,6 @@
#include <linux/export.h>
#include <linux/kernel.h>
#include <linux/slab.h>
-#include <linux/fs.h>
#include <linux/namei.h>
#include <linux/pagemap.h>
#include <linux/fsnotify.h>
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f9a506147c8a..e4adf5e05afd 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1050,6 +1050,7 @@ extern const struct bpf_func_proto bpf_get_local_storage_proto;
extern const struct bpf_func_proto bpf_strtol_proto;
extern const struct bpf_func_proto bpf_strtoul_proto;
extern const struct bpf_func_proto bpf_tcp_sock_proto;
+extern const struct bpf_func_proto bpf_get_current_pidns_info_proto;
/* Shared helpers among cBPF and eBPF. */
void bpf_user_rnd_init_once(void);
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 9138b4471dbf..b45c8b6f7cb4 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -6,6 +6,7 @@
#include <linux/path.h>
#include <linux/fcntl.h>
#include <linux/errno.h>
+#include <linux/fs.h>
enum { MAX_NESTED_LINKS = 8 };
@@ -97,6 +98,9 @@ extern void unlock_rename(struct dentry *, struct dentry *);
extern void nd_jump_link(struct path *path);
+extern int filename_lookup(int dfd, struct filename *name, unsigned flags,
+ struct path *path, struct path *root);
+
static inline void nd_terminate_link(void *name, size_t len, size_t maxlen)
{
((char *) name)[min(len, maxlen)] = '\0';
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4393bd4b2419..db241857ec15 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2741,6 +2741,28 @@ union bpf_attr {
* **-EOPNOTSUPP** kernel configuration does not enable SYN cookies
*
* **-EPROTONOSUPPORT** IP packet version is not 4 or 6
+ *
+ * int bpf_get_current_pidns_info(struct bpf_pidns_info *pidns, u32 size_of_pidns)
+ * Description
+ * Copies into *pidns* pid, namespace id and tgid as seen by the
+ * current namespace and also device from /proc/self/ns/pid.
+ * *size_of_pidns* must be the size of *pidns*
+ *
+ * This helper is used when pid filtering is needed inside a
+ * container as bpf_get_current_pid_tgid() helper always returns the
+ * pid as seen by the root namespace.
+ * Return
+ * 0 on success
+ *
+ * **-EINVAL** if *size_of_pidns* is not valid or unable to get ns, pid
+ * or tgid of the current task.
+ *
+ * **-ECHILD** if /proc/self/ns/pid does not exist.
+ *
+ * **-ENOTDIR** if /proc/self/ns does not exist.
+ *
+ * **-ENOMEM** if allocation fails.
+ *
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@@ -2853,7 +2875,8 @@ union bpf_attr {
FN(sk_storage_get), \
FN(sk_storage_delete), \
FN(send_signal), \
- FN(tcp_gen_syncookie),
+ FN(tcp_gen_syncookie), \
+ FN(get_current_pidns_info),
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
* function eBPF program intends to call
@@ -3604,4 +3627,10 @@ struct bpf_sockopt {
__s32 retval;
};
+struct bpf_pidns_info {
+ __u32 dev;
+ __u32 nsid;
+ __u32 tgid;
+ __u32 pid;
+};
#endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 8191a7db2777..3159f2a0188c 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2038,6 +2038,7 @@ const struct bpf_func_proto bpf_get_current_uid_gid_proto __weak;
const struct bpf_func_proto bpf_get_current_comm_proto __weak;
const struct bpf_func_proto bpf_get_current_cgroup_id_proto __weak;
const struct bpf_func_proto bpf_get_local_storage_proto __weak;
+const struct bpf_func_proto bpf_get_current_pidns_info_proto __weak;
const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void)
{
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 5e28718928ca..41fbf1f28a48 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -11,6 +11,12 @@
#include <linux/uidgid.h>
#include <linux/filter.h>
#include <linux/ctype.h>
+#include <linux/pid_namespace.h>
+#include <linux/major.h>
+#include <linux/stat.h>
+#include <linux/namei.h>
+#include <linux/version.h>
+
#include "../../lib/kstrtox.h"
@@ -312,6 +318,64 @@ void copy_map_value_locked(struct bpf_map *map, void *dst, void *src,
preempt_enable();
}
+BPF_CALL_2(bpf_get_current_pidns_info, struct bpf_pidns_info *, pidns_info, u32,
+ size)
+{
+ const char *pidns_path = "/proc/self/ns/pid";
+ struct pid_namespace *pidns = NULL;
+ struct filename *tmp = NULL;
+ struct inode *inode;
+ struct path kp;
+ pid_t tgid = 0;
+ pid_t pid = 0;
+ int ret;
+ int len;
+
+ if (unlikely(size != sizeof(struct bpf_pidns_info)))
+ return -EINVAL;
+ pidns = task_active_pid_ns(current);
+ if (unlikely(!pidns))
+ goto clear;
+ pidns_info->nsid = pidns->ns.inum;
+ pid = task_pid_nr_ns(current, pidns);
+ if (unlikely(!pid))
+ goto clear;
+ tgid = task_tgid_nr_ns(current, pidns);
+ if (unlikely(!tgid))
+ goto clear;
+ pidns_info->tgid = (u32) tgid;
+ pidns_info->pid = (u32) pid;
+ tmp = kmem_cache_alloc(names_cachep, GFP_ATOMIC);
+ if (unlikely(!tmp)) {
+ memset((void *)pidns_info, 0, (size_t) size);
+ return -ENOMEM;
+ }
+ len = strlen(pidns_path) + 1;
+ memcpy((char *)tmp->name, pidns_path, len);
+ tmp->uptr = NULL;
+ tmp->aname = NULL;
+ tmp->refcnt = 1;
+ ret = filename_lookup(AT_FDCWD, tmp, 0, &kp, NULL);
+ if (ret) {
+ memset((void *)pidns_info, 0, (size_t) size);
+ return ret;
+ }
+ inode = d_backing_inode(kp.dentry);
+ pidns_info->dev = inode->i_sb->s_dev;
+ return 0;
+clear:
+ memset((void *)pidns_info, 0, (size_t) size);
+ return -EINVAL;
+}
+
+const struct bpf_func_proto bpf_get_current_pidns_info_proto = {
+ .func = bpf_get_current_pidns_info,
+ .gpl_only = false,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_PTR_TO_UNINIT_MEM,
+ .arg2_type = ARG_CONST_SIZE,
+};
+
#ifdef CONFIG_CGROUPS
BPF_CALL_0(bpf_get_current_cgroup_id)
{
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index ca1255d14576..5e1dc22765a5 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -709,6 +709,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
#endif
case BPF_FUNC_send_signal:
return &bpf_send_signal_proto;
+ case BPF_FUNC_get_current_pidns_info:
+ return &bpf_get_current_pidns_info_proto;
default:
return NULL;
}
--
2.11.0
* [PATCH bpf-next V9 2/3] samples/bpf: added sample code for bpf_get_current_pidns_info.
From: Carlos Neira @ 2019-08-13 18:47 UTC (permalink / raw)
To: netdev; +Cc: yhs, ebiederm, brouer, cneirabustos, bpf
In-Reply-To: <20190813184747.12225-1-cneirabustos@gmail.com>
From: Carlos <cneirabustos@gmail.com>
Sample program to call the new bpf helper bpf_get_current_pidns_info.
Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
---
samples/bpf/Makefile | 3 +++
samples/bpf/trace_ns_info_user.c | 35 ++++++++++++++++++++++++++++
samples/bpf/trace_ns_info_user_kern.c | 44 +++++++++++++++++++++++++++++++++++
3 files changed, 82 insertions(+)
create mode 100644 samples/bpf/trace_ns_info_user.c
create mode 100644 samples/bpf/trace_ns_info_user_kern.c
diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 1d9be26b4edd..238453ff27d2 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -53,6 +53,7 @@ hostprogs-y += task_fd_query
hostprogs-y += xdp_sample_pkts
hostprogs-y += ibumad
hostprogs-y += hbm
+hostprogs-y += trace_ns_info
# Libbpf dependencies
LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a
@@ -109,6 +110,7 @@ task_fd_query-objs := bpf_load.o task_fd_query_user.o $(TRACE_HELPERS)
xdp_sample_pkts-objs := xdp_sample_pkts_user.o $(TRACE_HELPERS)
ibumad-objs := bpf_load.o ibumad_user.o $(TRACE_HELPERS)
hbm-objs := bpf_load.o hbm.o $(CGROUP_HELPERS)
+trace_ns_info-objs := bpf_load.o trace_ns_info_user.o
# Tell kbuild to always build the programs
always := $(hostprogs-y)
@@ -170,6 +172,7 @@ always += xdp_sample_pkts_kern.o
always += ibumad_kern.o
always += hbm_out_kern.o
always += hbm_edt_kern.o
+always += trace_ns_info_user_kern.o
KBUILD_HOSTCFLAGS += -I$(objtree)/usr/include
KBUILD_HOSTCFLAGS += -I$(srctree)/tools/lib/bpf/
diff --git a/samples/bpf/trace_ns_info_user.c b/samples/bpf/trace_ns_info_user.c
new file mode 100644
index 000000000000..e06d08db6f30
--- /dev/null
+++ b/samples/bpf/trace_ns_info_user.c
@@ -0,0 +1,35 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018 Carlos Neira cneirabustos@gmail.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+
+#include <stdio.h>
+#include <linux/bpf.h>
+#include <unistd.h>
+#include "bpf/libbpf.h"
+#include "bpf_load.h"
+
+/* This code was taken verbatim from tracex1_user.c; it's used
+ * to exercise the bpf_get_current_pidns_info() helper call.
+ */
+int main(int ac, char **argv)
+{
+ FILE *f;
+ char filename[256];
+
+ snprintf(filename, sizeof(filename), "%s_user_kern.o", argv[0]);
+ printf("loading %s\n", filename);
+
+ if (load_bpf_file(filename)) {
+ printf("%s", bpf_log_buf);
+ return 1;
+ }
+
+ f = popen("taskset 1 ping localhost", "r");
+ (void) f;
+ read_trace_pipe();
+ return 0;
+}
diff --git a/samples/bpf/trace_ns_info_user_kern.c b/samples/bpf/trace_ns_info_user_kern.c
new file mode 100644
index 000000000000..96675e02b707
--- /dev/null
+++ b/samples/bpf/trace_ns_info_user_kern.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018 Carlos Neira cneirabustos@gmail.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#include <linux/skbuff.h>
+#include <linux/netdevice.h>
+#include <linux/version.h>
+#include <uapi/linux/bpf.h>
+#include "bpf_helpers.h"
+
+typedef __u64 u64;
+typedef __u32 u32;
+
+
+/* kprobe is NOT a stable ABI
+ * kernel functions can be removed, renamed or completely change semantics.
+ * Number of arguments and their positions can change, etc.
+ * In such case this bpf+kprobe example will no longer be meaningful
+ */
+
+/* This will call bpf_get_current_pidns_info() to display pid and ns values
+ * as seen by the current namespace; on the far left you will see the pid as
+ * seen by the root namespace.
+ */
+
+SEC("kprobe/__netif_receive_skb_core")
+int bpf_prog1(struct pt_regs *ctx)
+{
+ char fmt[] = "nsid:%u, dev: %u, pid:%u\n";
+ struct bpf_pidns_info nsinfo;
+ int ok = 0;
+
+ ok = bpf_get_current_pidns_info(&nsinfo, sizeof(nsinfo));
+ if (ok == 0)
+ bpf_trace_printk(fmt, sizeof(fmt), (u32)nsinfo.nsid,
+ (u32) nsinfo.dev, (u32)nsinfo.pid);
+
+ return 0;
+}
+char _license[] SEC("license") = "GPL";
+u32 _version SEC("version") = LINUX_VERSION_CODE;
--
2.11.0
* [PATCH bpf-next V9 3/3] tools/testing/selftests/bpf: Add self-tests for new helper.
From: Carlos Neira @ 2019-08-13 18:47 UTC (permalink / raw)
To: netdev; +Cc: yhs, ebiederm, brouer, cneirabustos, bpf
In-Reply-To: <20190813184747.12225-1-cneirabustos@gmail.com>
From: Carlos <cneirabustos@gmail.com>
Added self-tests for new helper bpf_get_current_pidns_info.
Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
---
tools/include/uapi/linux/bpf.h | 31 ++++-
tools/testing/selftests/bpf/Makefile | 2 +-
tools/testing/selftests/bpf/bpf_helpers.h | 3 +
.../testing/selftests/bpf/progs/test_pidns_kern.c | 51 ++++++++
tools/testing/selftests/bpf/test_pidns.c | 138 +++++++++++++++++++++
5 files changed, 223 insertions(+), 2 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/test_pidns_kern.c
create mode 100644 tools/testing/selftests/bpf/test_pidns.c
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4393bd4b2419..db241857ec15 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2741,6 +2741,28 @@ union bpf_attr {
* **-EOPNOTSUPP** kernel configuration does not enable SYN cookies
*
* **-EPROTONOSUPPORT** IP packet version is not 4 or 6
+ *
+ * int bpf_get_current_pidns_info(struct bpf_pidns_info *pidns, u32 size_of_pidns)
+ * Description
+ * Copies into *pidns* pid, namespace id and tgid as seen by the
+ * current namespace and also device from /proc/self/ns/pid.
+ * *size_of_pidns* must be the size of *pidns*
+ *
+ * This helper is used when pid filtering is needed inside a
+ * container as bpf_get_current_pid_tgid() helper always returns the
+ * pid as seen by the root namespace.
+ * Return
+ * 0 on success
+ *
+ * **-EINVAL** if *size_of_pidns* is not valid or unable to get ns, pid
+ * or tgid of the current task.
+ *
+ * **-ECHILD** if /proc/self/ns/pid does not exist.
+ *
+ * **-ENOTDIR** if /proc/self/ns does not exist.
+ *
+ * **-ENOMEM** if allocation fails.
+ *
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@@ -2853,7 +2875,8 @@ union bpf_attr {
FN(sk_storage_get), \
FN(sk_storage_delete), \
FN(send_signal), \
- FN(tcp_gen_syncookie),
+ FN(tcp_gen_syncookie), \
+ FN(get_current_pidns_info),
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
* function eBPF program intends to call
@@ -3604,4 +3627,10 @@ struct bpf_sockopt {
__s32 retval;
};
+struct bpf_pidns_info {
+ __u32 dev;
+ __u32 nsid;
+ __u32 tgid;
+ __u32 pid;
+};
#endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 3bd0f4a0336a..1f97b571b581 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -29,7 +29,7 @@ TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test
test_cgroup_storage test_select_reuseport test_section_names \
test_netcnt test_tcpnotify_user test_sock_fields test_sysctl test_hashmap \
test_btf_dump test_cgroup_attach xdping test_sockopt test_sockopt_sk \
- test_sockopt_multi test_tcp_rtt
+ test_sockopt_multi test_tcp_rtt test_pidns
BPF_OBJ_FILES = $(patsubst %.c,%.o, $(notdir $(wildcard progs/*.c)))
TEST_GEN_FILES = $(BPF_OBJ_FILES)
diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
index 8b503ea142f0..3fae3b9fcd2c 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -231,6 +231,9 @@ static int (*bpf_send_signal)(unsigned sig) = (void *)BPF_FUNC_send_signal;
static long long (*bpf_tcp_gen_syncookie)(struct bpf_sock *sk, void *ip,
int ip_len, void *tcp, int tcp_len) =
(void *) BPF_FUNC_tcp_gen_syncookie;
+static int (*bpf_get_current_pidns_info)(struct bpf_pidns_info *buf,
+ unsigned int buf_size) =
+ (void *) BPF_FUNC_get_current_pidns_info;
/* llvm builtin functions that eBPF C program may use to
* emit BPF_LD_ABS and BPF_LD_IND instructions
diff --git a/tools/testing/selftests/bpf/progs/test_pidns_kern.c b/tools/testing/selftests/bpf/progs/test_pidns_kern.c
new file mode 100644
index 000000000000..e1d2facfa762
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_pidns_kern.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018 Carlos Neira cneirabustos@gmail.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+
+#include <linux/bpf.h>
+#include <errno.h>
+#include "bpf_helpers.h"
+
+struct bpf_map_def SEC("maps") nsidmap = {
+ .type = BPF_MAP_TYPE_ARRAY,
+ .key_size = sizeof(__u32),
+ .value_size = sizeof(__u32),
+ .max_entries = 1,
+};
+
+struct bpf_map_def SEC("maps") pidmap = {
+ .type = BPF_MAP_TYPE_ARRAY,
+ .key_size = sizeof(__u32),
+ .value_size = sizeof(__u32),
+ .max_entries = 1,
+};
+
+SEC("tracepoint/syscalls/sys_enter_nanosleep")
+int trace(void *ctx)
+{
+ struct bpf_pidns_info nsinfo;
+ __u32 key = 0, *expected_pid, *val;
+ char fmt[] = "ERROR nspid:%d\n";
+
+ if (bpf_get_current_pidns_info(&nsinfo, sizeof(nsinfo)))
+ return -EINVAL;
+
+ expected_pid = bpf_map_lookup_elem(&pidmap, &key);
+
+
+ if (!expected_pid || *expected_pid != nsinfo.pid)
+ return 0;
+
+ val = bpf_map_lookup_elem(&nsidmap, &key);
+ if (val)
+ *val = nsinfo.nsid;
+
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";
+__u32 _version SEC("version") = 1;
diff --git a/tools/testing/selftests/bpf/test_pidns.c b/tools/testing/selftests/bpf/test_pidns.c
new file mode 100644
index 000000000000..a7254055f294
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_pidns.c
@@ -0,0 +1,138 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018 Carlos Neira cneirabustos@gmail.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <syscall.h>
+#include <unistd.h>
+#include <linux/perf_event.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+
+#include <linux/bpf.h>
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+
+#include "cgroup_helpers.h"
+#include "bpf_rlimit.h"
+
+#define CHECK(condition, tag, format...) ({ \
+ int __ret = !!(condition); \
+ if (__ret) { \
+ printf("%s:FAIL:%s ", __func__, tag); \
+ printf(format); \
+ } else { \
+ printf("%s:PASS:%s\n", __func__, tag); \
+ } \
+ __ret; \
+})
+
+static int bpf_find_map(const char *test, struct bpf_object *obj,
+ const char *name)
+{
+ struct bpf_map *map;
+
+ map = bpf_object__find_map_by_name(obj, name);
+ if (!map)
+ return -1;
+ return bpf_map__fd(map);
+}
+
+
+int main(int argc, char **argv)
+{
+ const char *probe_name = "syscalls/sys_enter_nanosleep";
+ const char *file = "test_pidns_kern.o";
+ int err, bytes, efd, prog_fd, pmu_fd;
+ int pidmap_fd, nsidmap_fd;
+ struct perf_event_attr attr = {};
+ struct bpf_object *obj;
+ __u32 knsid = 0;
+ __u32 key = 0, pid;
+ int exit_code = 1;
+ struct stat st;
+ char buf[256];
+
+ err = bpf_prog_load(file, BPF_PROG_TYPE_TRACEPOINT, &obj, &prog_fd);
+ if (CHECK(err, "bpf_prog_load", "err %d errno %d\n", err, errno))
+ goto cleanup_cgroup_env;
+
+ nsidmap_fd = bpf_find_map(__func__, obj, "nsidmap");
+ if (CHECK(nsidmap_fd < 0, "bpf_find_map", "err %d errno %d\n",
+ nsidmap_fd, errno))
+ goto close_prog;
+
+ pidmap_fd = bpf_find_map(__func__, obj, "pidmap");
+ if (CHECK(pidmap_fd < 0, "bpf_find_map", "err %d errno %d\n",
+ pidmap_fd, errno))
+ goto close_prog;
+
+ pid = getpid();
+ bpf_map_update_elem(pidmap_fd, &key, &pid, 0);
+
+ snprintf(buf, sizeof(buf),
+ "/sys/kernel/debug/tracing/events/%s/id", probe_name);
+ efd = open(buf, O_RDONLY, 0);
+ if (CHECK(efd < 0, "open", "err %d errno %d\n", efd, errno))
+ goto close_prog;
+ bytes = read(efd, buf, sizeof(buf));
+ close(efd);
+ if (CHECK(bytes <= 0 || bytes >= sizeof(buf), "read",
+ "bytes %d errno %d\n", bytes, errno))
+ goto close_prog;
+
+ attr.config = strtol(buf, NULL, 0);
+ attr.type = PERF_TYPE_TRACEPOINT;
+ attr.sample_type = PERF_SAMPLE_RAW;
+ attr.sample_period = 1;
+ attr.wakeup_events = 1;
+
+ pmu_fd = syscall(__NR_perf_event_open, &attr, getpid(), -1, -1, 0);
+ if (CHECK(pmu_fd < 0, "perf_event_open", "err %d errno %d\n", pmu_fd,
+ errno))
+ goto close_prog;
+
+ err = ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0);
+ if (CHECK(err, "perf_event_ioc_enable", "err %d errno %d\n", err,
+ errno))
+ goto close_pmu;
+
+ err = ioctl(pmu_fd, PERF_EVENT_IOC_SET_BPF, prog_fd);
+ if (CHECK(err, "perf_event_ioc_set_bpf", "err %d errno %d\n", err,
+ errno))
+ goto close_pmu;
+
+ /* trigger some syscalls */
+ sleep(1);
+
+ err = bpf_map_lookup_elem(nsidmap_fd, &key, &knsid);
+ if (CHECK(err, "bpf_map_lookup_elem", "err %d errno %d\n", err, errno))
+ goto close_pmu;
+
+ if (stat("/proc/self/ns/pid", &st))
+ goto close_pmu;
+
+ if (CHECK(knsid != (__u32) st.st_ino, "compare_namespace_id",
+ "kern knsid %u user unsid %u\n", knsid, (__u32) st.st_ino))
+ goto close_pmu;
+
+ exit_code = 0;
+ printf("%s:PASS\n", argv[0]);
+
+close_pmu:
+ close(pmu_fd);
+close_prog:
+ bpf_object__close(obj);
+cleanup_cgroup_env:
+ return exit_code;
+}
--
2.11.0
* [PATCH bpf-next 0/2] libbpf: make use of BTF through sysfs
From: Andrii Nakryiko @ 2019-08-13 18:54 UTC (permalink / raw)
To: bpf, netdev, ast, daniel, acme
Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko
Now that the kernel's BTF is exposed through sysfs at a well-known location,
attempt to load it first as a target BTF for the purpose of BPF CO-RE
relocations.
Patch #1 is a follow-up patch to rename /sys/kernel/btf/kernel into
/sys/kernel/btf/vmlinux.
Patch #2 adds the ability to load raw BTF contents from sysfs and expands the
list of locations libbpf attempts to load vmlinux BTF from.
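As a rough illustration of what "raw BTF contents" means for a consumer, the sketch below reads the beginning of a file and checks for the BTF magic number. It is not libbpf's code: read_file_prefix() and looks_like_raw_btf() are hypothetical names, and the only fact taken from the kernel headers is the magic value 0xeB9F (BTF_MAGIC in include/uapi/linux/btf.h), stored as the first 16-bit field of the BTF header.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Same value as BTF_MAGIC in include/uapi/linux/btf.h. */
#define BTF_MAGIC_SKETCH 0xeB9F

/* Hypothetical helper: read up to cap bytes from path into buf.
 * Returns the number of bytes read, or -1 if the file cannot be opened.
 */
static long read_file_prefix(const char *path, void *buf, size_t cap)
{
	FILE *f = fopen(path, "rb");
	size_t n;

	if (!f)
		return -1;
	n = fread(buf, 1, cap, f);
	fclose(f);
	return (long)n;
}

/* Hypothetical check: returns 1 if buf starts with the BTF magic in native
 * byte order, 0 otherwise. A real parser would also validate the version,
 * header length and section offsets.
 */
static int looks_like_raw_btf(const void *buf, size_t len)
{
	uint16_t magic;

	if (len < sizeof(magic))
		return 0;
	memcpy(&magic, buf, sizeof(magic));
	return magic == BTF_MAGIC_SKETCH;
}
```

On a kernel with this series applied, reading a few bytes from /sys/kernel/btf/vmlinux with read_file_prefix() and passing them to looks_like_raw_btf() is a quick way to confirm that the file holds raw BTF rather than an ELF image.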
Andrii Nakryiko (2):
btf: rename /sys/kernel/btf/kernel into /sys/kernel/btf/vmlinux
libbpf: attempt to load kernel BTF from sysfs first
Documentation/ABI/testing/sysfs-kernel-btf | 2 +-
kernel/bpf/sysfs_btf.c | 30 +++++-----
scripts/link-vmlinux.sh | 18 +++---
tools/lib/bpf/libbpf.c | 64 +++++++++++++++++++---
4 files changed, 82 insertions(+), 32 deletions(-)
--
2.17.1
* [PATCH bpf-next 1/2] btf: rename /sys/kernel/btf/kernel into /sys/kernel/btf/vmlinux
From: Andrii Nakryiko @ 2019-08-13 18:54 UTC (permalink / raw)
To: bpf, netdev, ast, daniel, acme
Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko
In-Reply-To: <20190813185443.437829-1-andriin@fb.com>
Expose kernel's BTF under the name vmlinux to be more uniform with using
kernel module names as file names in the future.
Fixes: 341dfcf8d78e ("btf: expose BTF info through sysfs")
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
Documentation/ABI/testing/sysfs-kernel-btf | 2 +-
kernel/bpf/sysfs_btf.c | 30 +++++++++++-----------
scripts/link-vmlinux.sh | 18 ++++++-------
3 files changed, 25 insertions(+), 25 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-kernel-btf b/Documentation/ABI/testing/sysfs-kernel-btf
index 5390f8001f96..2c9744b2cd59 100644
--- a/Documentation/ABI/testing/sysfs-kernel-btf
+++ b/Documentation/ABI/testing/sysfs-kernel-btf
@@ -6,7 +6,7 @@ Description:
Contains BTF type information and related data for kernel and
kernel modules.
-What: /sys/kernel/btf/kernel
+What: /sys/kernel/btf/vmlinux
Date: Aug 2019
KernelVersion: 5.5
Contact: bpf@vger.kernel.org
diff --git a/kernel/bpf/sysfs_btf.c b/kernel/bpf/sysfs_btf.c
index 092e63b9758b..4659349fc795 100644
--- a/kernel/bpf/sysfs_btf.c
+++ b/kernel/bpf/sysfs_btf.c
@@ -9,30 +9,30 @@
#include <linux/sysfs.h>
/* See scripts/link-vmlinux.sh, gen_btf() func for details */
-extern char __weak _binary__btf_kernel_bin_start[];
-extern char __weak _binary__btf_kernel_bin_end[];
+extern char __weak _binary__btf_vmlinux_bin_start[];
+extern char __weak _binary__btf_vmlinux_bin_end[];
static ssize_t
-btf_kernel_read(struct file *file, struct kobject *kobj,
- struct bin_attribute *bin_attr,
- char *buf, loff_t off, size_t len)
+btf_vmlinux_read(struct file *file, struct kobject *kobj,
+ struct bin_attribute *bin_attr,
+ char *buf, loff_t off, size_t len)
{
- memcpy(buf, _binary__btf_kernel_bin_start + off, len);
+ memcpy(buf, _binary__btf_vmlinux_bin_start + off, len);
return len;
}
-static struct bin_attribute bin_attr_btf_kernel __ro_after_init = {
- .attr = { .name = "kernel", .mode = 0444, },
- .read = btf_kernel_read,
+static struct bin_attribute bin_attr_btf_vmlinux __ro_after_init = {
+ .attr = { .name = "vmlinux", .mode = 0444, },
+ .read = btf_vmlinux_read,
};
static struct kobject *btf_kobj;
-static int __init btf_kernel_init(void)
+static int __init btf_vmlinux_init(void)
{
int err;
- if (!_binary__btf_kernel_bin_start)
+ if (!_binary__btf_vmlinux_bin_start)
return 0;
btf_kobj = kobject_create_and_add("btf", kernel_kobj);
@@ -42,10 +42,10 @@ static int __init btf_kernel_init(void)
return err;
}
- bin_attr_btf_kernel.size = _binary__btf_kernel_bin_end -
- _binary__btf_kernel_bin_start;
+ bin_attr_btf_vmlinux.size = _binary__btf_vmlinux_bin_end -
+ _binary__btf_vmlinux_bin_start;
- return sysfs_create_bin_file(btf_kobj, &bin_attr_btf_kernel);
+ return sysfs_create_bin_file(btf_kobj, &bin_attr_btf_vmlinux);
}
-subsys_initcall(btf_kernel_init);
+subsys_initcall(btf_vmlinux_init);
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index cb93832c6ad7..f7933c606f27 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -117,9 +117,9 @@ gen_btf()
# dump .BTF section into raw binary file to link with final vmlinux
bin_arch=$(${OBJDUMP} -f ${1} | grep architecture | \
cut -d, -f1 | cut -d' ' -f2)
- ${OBJCOPY} --dump-section .BTF=.btf.kernel.bin ${1} 2>/dev/null
+ ${OBJCOPY} --dump-section .BTF=.btf.vmlinux.bin ${1} 2>/dev/null
${OBJCOPY} -I binary -O ${CONFIG_OUTPUT_FORMAT} -B ${bin_arch} \
- --rename-section .data=.BTF .btf.kernel.bin ${2}
+ --rename-section .data=.BTF .btf.vmlinux.bin ${2}
}
# Create ${2} .o file with all symbols from the ${1} object file
@@ -227,10 +227,10 @@ ${MAKE} -f "${srctree}/scripts/Makefile.modpost" vmlinux.o
info MODINFO modules.builtin.modinfo
${OBJCOPY} -j .modinfo -O binary vmlinux.o modules.builtin.modinfo
-btf_kernel_bin_o=""
+btf_vmlinux_bin_o=""
if [ -n "${CONFIG_DEBUG_INFO_BTF}" ]; then
- if gen_btf .tmp_vmlinux.btf .btf.kernel.bin.o ; then
- btf_kernel_bin_o=.btf.kernel.bin.o
+ if gen_btf .tmp_vmlinux.btf .btf.vmlinux.bin.o ; then
+ btf_vmlinux_bin_o=.btf.vmlinux.bin.o
fi
fi
@@ -265,11 +265,11 @@ if [ -n "${CONFIG_KALLSYMS}" ]; then
kallsyms_vmlinux=.tmp_vmlinux2
# step 1
- vmlinux_link .tmp_vmlinux1 ${btf_kernel_bin_o}
+ vmlinux_link .tmp_vmlinux1 ${btf_vmlinux_bin_o}
kallsyms .tmp_vmlinux1 .tmp_kallsyms1.o
# step 2
- vmlinux_link .tmp_vmlinux2 .tmp_kallsyms1.o ${btf_kernel_bin_o}
+ vmlinux_link .tmp_vmlinux2 .tmp_kallsyms1.o ${btf_vmlinux_bin_o}
kallsyms .tmp_vmlinux2 .tmp_kallsyms2.o
# step 3
@@ -280,13 +280,13 @@ if [ -n "${CONFIG_KALLSYMS}" ]; then
kallsymso=.tmp_kallsyms3.o
kallsyms_vmlinux=.tmp_vmlinux3
- vmlinux_link .tmp_vmlinux3 .tmp_kallsyms2.o ${btf_kernel_bin_o}
+ vmlinux_link .tmp_vmlinux3 .tmp_kallsyms2.o ${btf_vmlinux_bin_o}
kallsyms .tmp_vmlinux3 .tmp_kallsyms3.o
fi
fi
info LD vmlinux
-vmlinux_link vmlinux "${kallsymso}" "${btf_kernel_bin_o}"
+vmlinux_link vmlinux "${kallsymso}" "${btf_vmlinux_bin_o}"
if [ -n "${CONFIG_BUILDTIME_EXTABLE_SORT}" ]; then
info SORTEX vmlinux
--
2.17.1
^ permalink raw reply related
* [PATCH bpf-next 2/2] libbpf: attempt to load kernel BTF from sysfs first
From: Andrii Nakryiko @ 2019-08-13 18:54 UTC (permalink / raw)
To: bpf, netdev, ast, daniel, acme
Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko
In-Reply-To: <20190813185443.437829-1-andriin@fb.com>
Add support for loading kernel BTF from sysfs (/sys/kernel/btf/vmlinux)
as a target BTF. Also extend the list of on disk search paths for
vmlinux ELF image with entries that perf is searching for.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
tools/lib/bpf/libbpf.c | 64 +++++++++++++++++++++++++++++++++++++-----
1 file changed, 57 insertions(+), 7 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 3abf2dd1b3b5..8462dab02812 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -2807,15 +2807,61 @@ static int bpf_core_reloc_insn(struct bpf_program *prog, int insn_off,
return 0;
}
+static struct btf *btf_load_raw(const char *path)
+{
+ struct btf *btf;
+ size_t read_cnt;
+ struct stat st;
+ void *data;
+ FILE *f;
+
+ if (stat(path, &st))
+ return ERR_PTR(-errno);
+
+ data = malloc(st.st_size);
+ if (!data)
+ return ERR_PTR(-ENOMEM);
+
+ f = fopen(path, "rb");
+ if (!f) {
+ btf = ERR_PTR(-errno);
+ goto cleanup;
+ }
+
+ read_cnt = fread(data, 1, st.st_size, f);
+ fclose(f);
+ if (read_cnt < st.st_size) {
+ btf = ERR_PTR(-EBADF);
+ goto cleanup;
+ }
+
+ btf = btf__new(data, read_cnt);
+
+cleanup:
+ free(data);
+ return btf;
+}
+
/*
* Probe few well-known locations for vmlinux kernel image and try to load BTF
* data out of it to use for target BTF.
*/
static struct btf *bpf_core_find_kernel_btf(void)
{
- const char *locations[] = {
- "/lib/modules/%1$s/vmlinux-%1$s",
- "/usr/lib/modules/%1$s/kernel/vmlinux",
+ struct {
+ const char *path_fmt;
+ bool raw_btf;
+ } locations[] = {
+ /* try canonical vmlinux BTF through sysfs first */
+ { "/sys/kernel/btf/vmlinux", true /* raw BTF */ },
+ /* fall back to trying to find vmlinux ELF on disk otherwise */
+ { "/boot/vmlinux-%1$s" },
+ { "/lib/modules/%1$s/vmlinux-%1$s" },
+ { "/lib/modules/%1$s/build/vmlinux" },
+ { "/usr/lib/modules/%1$s/kernel/vmlinux" },
+ { "/usr/lib/debug/boot/vmlinux-%1$s" },
+ { "/usr/lib/debug/boot/vmlinux-%1$s.debug" },
+ { "/usr/lib/debug/lib/modules/%1$s/vmlinux" },
};
char path[PATH_MAX + 1];
struct utsname buf;
@@ -2825,14 +2871,18 @@ static struct btf *bpf_core_find_kernel_btf(void)
uname(&buf);
for (i = 0; i < ARRAY_SIZE(locations); i++) {
- snprintf(path, PATH_MAX, locations[i], buf.release);
+ snprintf(path, PATH_MAX, locations[i].path_fmt, buf.release);
if (access(path, R_OK))
continue;
- btf = btf__parse_elf(path, NULL);
- pr_debug("kernel BTF load from '%s': %ld\n",
- path, PTR_ERR(btf));
+ if (locations[i].raw_btf)
+ btf = btf_load_raw(path);
+ else
+ btf = btf__parse_elf(path, NULL);
+
+ pr_debug("loading kernel BTF '%s': %ld\n",
+ path, IS_ERR(btf) ? PTR_ERR(btf) : 0);
if (IS_ERR(btf))
continue;
--
2.17.1
^ permalink raw reply related
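The lookup order described in the patch above (sysfs first, then vmlinux ELF images on disk) can be sketched in isolation. This is only an illustration under stated assumptions: the names btf_location, btf_location_path() and btf_first_readable() are hypothetical, the table holds just a subset of the paths from the patch, and actual BTF parsing is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* One candidate location; raw_btf marks files that hold raw BTF, not ELF. */
struct btf_location {
	const char *path_fmt;
	bool raw_btf;
};

/* Subset of the locations from the patch, in probing order. */
static const struct btf_location btf_locations[] = {
	{ "/sys/kernel/btf/vmlinux", true },
	{ "/boot/vmlinux-%1$s", false },
	{ "/lib/modules/%1$s/vmlinux-%1$s", false },
};

#define BTF_LOCATION_CNT (sizeof(btf_locations) / sizeof(btf_locations[0]))

/*
 * Expand the i-th template for the given kernel release.
 * "%1$s" is a POSIX positional argument, so one release string
 * can be substituted several times. Returns the snprintf() result.
 */
static int btf_location_path(size_t i, const char *release,
			     char *out, size_t out_sz)
{
	return snprintf(out, out_sz, btf_locations[i].path_fmt, release);
}

/* Return the index of the first readable candidate, or -1 if none is. */
static int btf_first_readable(const char *release, char *out, size_t out_sz)
{
	size_t i;

	for (i = 0; i < BTF_LOCATION_CNT; i++) {
		int n = btf_location_path(i, release, out, out_sz);

		if (n < 0 || (size_t)n >= out_sz)
			continue; /* formatting failed or path was truncated */
		if (access(out, R_OK) == 0)
			return (int)i;
	}
	return -1;
}
```

A caller would pass utsname.release from uname() as the release string and then pick the raw or ELF loader based on the raw_btf flag of the returned entry.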
* Re: [PATCH bpf-next 3/3] samples: bpf: syscal_nrs: use mmap2 if defined
From: Ivan Khoronzhuk @ 2019-08-13 18:59 UTC (permalink / raw)
To: Jonathan Lemon
Cc: magnus.karlsson, bjorn.topel, davem, hawk, john.fastabend,
jakub.kicinski, daniel, netdev, bpf, xdp-newbies, linux-kernel
In-Reply-To: <036BCF4A-53D6-4000-BBDE-07C04B8B23FA@flugsvamp.com>
On Tue, Aug 13, 2019 at 10:41:54AM -0700, Jonathan Lemon wrote:
>
>
>On 13 Aug 2019, at 3:23, Ivan Khoronzhuk wrote:
>
>> For arm32 xdp sockets mmap2 is preferred, so use it if it's defined.
>>
>> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
>
>Doesn't this change the application API?
>--
>Jonathan
From what I know there is no reason to use both, so __NR_mmap2 can be defined
while __NR_mmap is not. __NR_mmap could still be defined internally, say
#define __NR_mmap (__NR_SYSCALL_BASE + 90)
and be used anyway, but at least arm has two definitions, one for the old ABI
and one for the new, and both the names and the numbers differ:
#define __NR_mmap (__NR_SYSCALL_BASE + 90)
#define __NR_mmap2 (__NR_SYSCALL_BASE + 192)
So they are not interchangeable: with EABI only __NR_mmap2 is defined, with
OABI only __NR_mmap. mmap() uses only one of them and can hide this from the
user.
This patch accesses the syscall number directly, so there is no declaration of
__NR_mmap and the build breaks. There are several solutions: drop __NR_mmap
entirely, replace it with __NR_mmap2, or define it by hand (but what for?).
I decided to replace it with the real one.
>
>
>> ---
>> samples/bpf/syscall_nrs.c | 5 +++++
>> samples/bpf/tracex5_kern.c | 11 +++++++++++
>> 2 files changed, 16 insertions(+)
>>
>> diff --git a/samples/bpf/syscall_nrs.c b/samples/bpf/syscall_nrs.c
>> index 516e255cbe8f..2dec94238350 100644
>> --- a/samples/bpf/syscall_nrs.c
>> +++ b/samples/bpf/syscall_nrs.c
>> @@ -9,5 +9,10 @@ void syscall_defines(void)
>> COMMENT("Linux system call numbers.");
>> SYSNR(__NR_write);
>> SYSNR(__NR_read);
>> +#ifdef __NR_mmap2
>> + SYSNR(__NR_mmap2);
>> +#else
>> SYSNR(__NR_mmap);
>> +#endif
>> +
>> }
>> diff --git a/samples/bpf/tracex5_kern.c b/samples/bpf/tracex5_kern.c
>> index f57f4e1ea1ec..300350ad299a 100644
>> --- a/samples/bpf/tracex5_kern.c
>> +++ b/samples/bpf/tracex5_kern.c
>> @@ -68,12 +68,23 @@ PROG(SYS__NR_read)(struct pt_regs *ctx)
>> return 0;
>> }
>>
>> +#ifdef __NR_mmap2
>> +PROG(SYS__NR_mmap2)(struct pt_regs *ctx)
>> +{
>> + char fmt[] = "mmap2\n";
>> +
>> + bpf_trace_printk(fmt, sizeof(fmt));
>> + return 0;
>> +}
>> +#else
>> PROG(SYS__NR_mmap)(struct pt_regs *ctx)
>> {
>> char fmt[] = "mmap\n";
>> +
>> bpf_trace_printk(fmt, sizeof(fmt));
>> return 0;
>> }
>> +#endif
>>
>> char _license[] SEC("license") = "GPL";
>> u32 _version SEC("version") = LINUX_VERSION_CODE;
>> --
>> 2.17.1
--
Regards,
Ivan Khoronzhuk
^ permalink raw reply
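The compile-time choice discussed in this reply can be shown as a small userspace sketch. It assumes a Linux toolchain where <sys/syscall.h> provides the __NR_* macros; the TRACED_MMAP_* macros and the two helper functions are hypothetical names used only for illustration, not part of samples/bpf.

```c
#include <sys/syscall.h> /* provides the __NR_* definitions on Linux */

/* Prefer mmap2 when the ABI defines it, otherwise fall back to mmap. */
#if defined(__NR_mmap2)
#define TRACED_MMAP_NR   __NR_mmap2
#define TRACED_MMAP_NAME "mmap2"
#elif defined(__NR_mmap)
#define TRACED_MMAP_NR   __NR_mmap
#define TRACED_MMAP_NAME "mmap"
#else
#error "neither __NR_mmap2 nor __NR_mmap is defined"
#endif

/* Syscall number selected at compile time. */
static long traced_mmap_nr(void)
{
	return TRACED_MMAP_NR;
}

/* Name matching the selected number, e.g. for a trace message. */
static const char *traced_mmap_name(void)
{
	return TRACED_MMAP_NAME;
}
```

Because the selection happens in the preprocessor, exactly one of the two variants is compiled in, which mirrors the #ifdef/#else structure of the quoted patch.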
* Re: general protection fault in tls_write_space
From: Jakub Kicinski @ 2019-08-13 18:59 UTC (permalink / raw)
To: John Fastabend
Cc: Hillf Danton, syzbot, aviadye, borisp, daniel, davejwatson, davem,
linux-kernel, netdev, oss-drivers, syzkaller-bugs, willemb
In-Reply-To: <5d5301a82578_268d2b12c8efa5b470@john-XPS-13-9370.notmuch>
On Tue, 13 Aug 2019 11:30:00 -0700, John Fastabend wrote:
> Jakub Kicinski wrote:
> > On Tue, 13 Aug 2019 10:17:06 -0700, John Fastabend wrote:
> > > > Followup of commit 95fa145479fb
> > > > ("bpf: sockmap/tls, close can race with map free")
> > > >
> > > > --- a/net/tls/tls_main.c
> > > > +++ b/net/tls/tls_main.c
> > > > @@ -308,6 +308,9 @@ static void tls_sk_proto_close(struct so
> > > > if (free_ctx)
> > > > icsk->icsk_ulp_data = NULL;
> > > > sk->sk_prot = ctx->sk_proto;
> > > > + /* tls will go; restore sock callback before enabling bh */
> > > > + if (sk->sk_write_space == tls_write_space)
> > > > + sk->sk_write_space = ctx->sk_write_space;
> > > > write_unlock_bh(&sk->sk_callback_lock);
> > > > release_sock(sk);
> > > > if (ctx->tx_conf == TLS_SW)
> > >
> > > Hi Hillf,
> > >
> > > We need this patch (although slightly updated for bpf tree) do
> > > you want to send it? Otherwise I can. We should only set this if
> > > TX path was enabled otherwise we null it. Checking against
> > > tls_write_space seems best to me as well.
> > >
> > > Against bpf this patch should fix it.
> > >
> > > diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> > > index ce6ef56a65ef..43252a801c3f 100644
> > > --- a/net/tls/tls_main.c
> > > +++ b/net/tls/tls_main.c
> > > @@ -308,7 +308,8 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
> > > if (free_ctx)
> > > icsk->icsk_ulp_data = NULL;
> > > sk->sk_prot = ctx->sk_proto;
> > > - sk->sk_write_space = ctx->sk_write_space;
> > > + if (sk->sk_write_space == tls_write_space)
> > > + sk->sk_write_space = ctx->sk_write_space;
> > > write_unlock_bh(&sk->sk_callback_lock);
> > > release_sock(sk);
> > > if (ctx->tx_conf == TLS_SW)
> >
> > This is already in net since Friday:
>
> Don't we need to guard that with an
>
> if (sk->sk_write_space == tls_write_space)
>
> or something similar? Where is ctx->sk_write_space set in the rx only
> case? In do_tls_setsockopt_conf() we have this block
>
> if (tx) {
> ctx->sk_write_space = sk->sk_write_space;
> sk->sk_write_space = tls_write_space;
> } else {
> sk->sk_socket->ops = &tls_sw_proto_ops;
> }
>
> which makes me think ctx->sk_write_space may not be set correctly in
> all cases.
Ah damn, you're right I remember looking at that but then I went down
the rabbit hole of trying to repro and forgot :/
Do you want to send an incremental change?
^ permalink raw reply
* [PATCH v6 1/4] dt-bindings: net: phy: Add subnode for LED configuration
From: Matthias Kaehlcke @ 2019-08-13 19:11 UTC (permalink / raw)
To: David S . Miller, Rob Herring, Mark Rutland, Andrew Lunn,
Florian Fainelli, Heiner Kallweit
Cc: netdev, devicetree, linux-kernel, Douglas Anderson,
Matthias Kaehlcke
In-Reply-To: <20190813191147.19936-1-mka@chromium.org>
The LED behavior of some Ethernet PHYs is configurable. Add an
optional 'leds' subnode with a child node for each LED to be
configured. The binding aims to be compatible with the common
LED binding (see devicetree/bindings/leds/common.txt).
A LED can be configured to be:
- 'on' when a link is active, some PHYs allow configuration for
certain link speeds
- 'off'
- blink on RX/TX activity, some PHYs allow configuration for
certain link speeds
For the configuration to be effective it needs to be supported by
the hardware and the corresponding PHY driver.
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
---
Changes in v6:
- none
Changes in v5:
- renamed triggers from 'phy_link_<speed>_active' to 'phy-link-<speed>'
- added entries for 'phy-link-<speed>-activity'
- added 'phy-link' and 'phy-link-activity' for triggers with any link
speed
- added entry for trigger 'none'
Changes in v4:
- patch added to the series
---
.../devicetree/bindings/net/ethernet-phy.yaml | 59 +++++++++++++++++++
1 file changed, 59 insertions(+)
diff --git a/Documentation/devicetree/bindings/net/ethernet-phy.yaml b/Documentation/devicetree/bindings/net/ethernet-phy.yaml
index f70f18ff821f..98ba320f828b 100644
--- a/Documentation/devicetree/bindings/net/ethernet-phy.yaml
+++ b/Documentation/devicetree/bindings/net/ethernet-phy.yaml
@@ -153,6 +153,50 @@ properties:
Delay after the reset was deasserted in microseconds. If
this property is missing the delay will be skipped.
+patternProperties:
+ "^leds$":
+ type: object
+ description:
+ Subnode with configuration of the PHY LEDs.
+
+ patternProperties:
+ "^led@[0-9]+$":
+ type: object
+ description:
+ Subnode with the configuration of a single PHY LED.
+
+ properties:
+ reg:
+ description:
+ The ID number of the LED, typically corresponds to a hardware ID.
+ $ref: "/schemas/types.yaml#/definitions/uint32"
+
+ linux,default-trigger:
+ description:
+ This parameter, if present, is a string specifying the trigger
+ assigned to the LED. Supported triggers are:
+ "none" - LED will be solid off
+ "phy-link" - LED will be solid on when a link is active
+ "phy-link-10m" - LED will be solid on when a 10Mb/s link is active
+ "phy-link-100m" - LED will be solid on when a 100Mb/s link is active
+ "phy-link-1g" - LED will be solid on when a 1Gb/s link is active
+ "phy-link-10g" - LED will be solid on when a 10Gb/s link is active
+ "phy-link-activity" - LED will be on when link is active and blink
+ off with activity.
+ "phy-link-10m-activity" - LED will be on when 10Mb/s link is active
+ and blink off with activity.
+ "phy-link-100m-activity" - LED will be on when 100Mb/s link is
+ active and blink off with activity.
+ "phy-link-1g-activity" - LED will be on when 1Gb/s link is active
+ and blink off with activity.
+ "phy-link-10g-activity" - LED will be on when 10Gb/s link is active
+ and blink off with activity.
+
+ $ref: "/schemas/types.yaml#/definitions/string"
+
+ required:
+ - reg
+
required:
- reg
@@ -173,5 +217,20 @@ examples:
reset-gpios = <&gpio1 4 1>;
reset-assert-us = <1000>;
reset-deassert-us = <2000>;
+
+ leds {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ led@0 {
+ reg = <0>;
+ linux,default-trigger = "phy-link-1g";
+ };
+
+ led@1 {
+ reg = <1>;
+ linux,default-trigger = "phy-link-100m-activity";
+ };
+ };
};
};
--
2.23.0.rc1.153.gdeed80330f-goog
^ permalink raw reply related
* [PATCH v6 2/4] net: phy: Add support for generic LED configuration through the DT
From: Matthias Kaehlcke @ 2019-08-13 19:11 UTC (permalink / raw)
To: David S . Miller, Rob Herring, Mark Rutland, Andrew Lunn,
Florian Fainelli, Heiner Kallweit
Cc: netdev, devicetree, linux-kernel, Douglas Anderson,
Matthias Kaehlcke
In-Reply-To: <20190813191147.19936-1-mka@chromium.org>
For PHYs with a device tree node, look for LED trigger configuration
using the generic binding. If it exists, try to apply it via the new
driver hook .config_led.
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
---
Changes in v6:
- delete unnecessary of_node_put() inside for_each_child_of_node()
loop
- use continue instead of goto in of_phy_config_leds()
- check return value of ->config_led() and print a warning if !0
Changes in v5:
- add callback to configure a LED to the PHY driver, instead of
having the driver retrieve the DT data
- use new trigger names
- added support for trigger 'none'
- release DT nodes after use
- renamed 'PHY_LED_LINK_*' to 'PHY_LED_TRIGGER_LINK_*'
- added anonymous struct to struct phy_led_config to track
'activity' in a separate flag. this could be changed to 'flags' if
needed/desired.
- updated commit message (previous subject was 'net: phy: Add
function to retrieve LED configuration from the DT')
Changes in v4:
- patch added to the series
---
drivers/net/phy/phy_device.c | 72 ++++++++++++++++++++++++++++++++++++
include/linux/phy.h | 22 +++++++++++
2 files changed, 94 insertions(+)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 6b5cb87f3866..80315777ae67 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -29,6 +29,7 @@
#include <linux/phy_led_triggers.h>
#include <linux/mdio.h>
#include <linux/io.h>
+#include <linux/of.h>
#include <linux/uaccess.h>
MODULE_DESCRIPTION("PHY library");
@@ -1064,6 +1065,75 @@ static int phy_poll_reset(struct phy_device *phydev)
return 0;
}
+static void of_phy_config_leds(struct phy_device *phydev)
+{
+ struct device_node *np, *child;
+ struct phy_led_config cfg;
+ const char *trigger;
+ int ret;
+
+ if (!IS_ENABLED(CONFIG_OF_MDIO) || !phydev->drv->config_led)
+ return;
+
+ np = of_find_node_by_name(phydev->mdio.dev.of_node, "leds");
+ if (!np)
+ return;
+
+ for_each_child_of_node(np, child) {
+ u32 led;
+
+ if (of_property_read_u32(child, "reg", &led))
+ continue;
+
+ ret = of_property_read_string(child, "linux,default-trigger",
+ &trigger);
+ if (ret)
+ trigger = "none";
+
+ memset(&cfg, 0, sizeof(cfg));
+
+ if (!strcmp(trigger, "none")) {
+ cfg.trigger.t = PHY_LED_TRIGGER_NONE;
+ } else if (!strcmp(trigger, "phy-link")) {
+ cfg.trigger.t = PHY_LED_TRIGGER_LINK;
+ } else if (!strcmp(trigger, "phy-link-10m")) {
+ cfg.trigger.t = PHY_LED_TRIGGER_LINK_10M;
+ } else if (!strcmp(trigger, "phy-link-100m")) {
+ cfg.trigger.t = PHY_LED_TRIGGER_LINK_100M;
+ } else if (!strcmp(trigger, "phy-link-1g")) {
+ cfg.trigger.t = PHY_LED_TRIGGER_LINK_1G;
+ } else if (!strcmp(trigger, "phy-link-10g")) {
+ cfg.trigger.t = PHY_LED_TRIGGER_LINK_10G;
+ } else if (!strcmp(trigger, "phy-link-activity")) {
+ cfg.trigger.t = PHY_LED_TRIGGER_LINK;
+ cfg.trigger.activity = true;
+ } else if (!strcmp(trigger, "phy-link-10m-activity")) {
+ cfg.trigger.t = PHY_LED_TRIGGER_LINK_10M;
+ cfg.trigger.activity = true;
+ } else if (!strcmp(trigger, "phy-link-100m-activity")) {
+ cfg.trigger.t = PHY_LED_TRIGGER_LINK_100M;
+ cfg.trigger.activity = true;
+ } else if (!strcmp(trigger, "phy-link-1g-activity")) {
+ cfg.trigger.t = PHY_LED_TRIGGER_LINK_1G;
+ cfg.trigger.activity = true;
+ } else if (!strcmp(trigger, "phy-link-10g-activity")) {
+ cfg.trigger.t = PHY_LED_TRIGGER_LINK_10G;
+ cfg.trigger.activity = true;
+ } else {
+ phydev_warn(phydev, "trigger '%s' for LED%d is invalid\n",
+ trigger, led);
+ continue;
+ }
+
+ ret = phydev->drv->config_led(phydev, led, &cfg);
+ if (ret)
+ phydev_warn(phydev, "trigger '%s' for LED%d not supported\n",
+ trigger, led);
+ }
+
+ of_node_put(np);
+}
+
int phy_init_hw(struct phy_device *phydev)
{
int ret = 0;
@@ -1087,6 +1157,8 @@ int phy_init_hw(struct phy_device *phydev)
if (phydev->drv->config_init)
ret = phydev->drv->config_init(phydev);
+ of_phy_config_leds(phydev);
+
return ret;
}
EXPORT_SYMBOL(phy_init_hw);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 462b90b73f93..3a07390fc5e9 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -325,6 +325,24 @@ struct phy_c45_device_ids {
u32 device_ids[8];
};
+/* Triggers for PHY LEDs */
+enum phy_led_trigger {
+ PHY_LED_TRIGGER_NONE,
+ PHY_LED_TRIGGER_LINK,
+ PHY_LED_TRIGGER_LINK_10M,
+ PHY_LED_TRIGGER_LINK_100M,
+ PHY_LED_TRIGGER_LINK_1G,
+ PHY_LED_TRIGGER_LINK_10G,
+};
+
+/* Configuration of a single PHY LED */
+struct phy_led_config {
+ struct {
+ enum phy_led_trigger t;
+ bool activity;
+ } trigger;
+};
+
/* phy_device: An instance of a PHY
*
* drv: Pointer to the driver for this PHY instance
@@ -626,6 +644,10 @@ struct phy_driver {
struct ethtool_tunable *tuna,
const void *data);
int (*set_loopback)(struct phy_device *dev, bool enable);
+
+ /* Configure a PHY LED */
+ int (*config_led)(struct phy_device *dev, int led,
+ struct phy_led_config *cfg);
};
#define to_phy_driver(d) container_of(to_mdio_common_driver(d), \
struct phy_driver, mdiodrv)
--
2.23.0.rc1.153.gdeed80330f-goog
^ permalink raw reply related
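The string matching in of_phy_config_leds() maps each trigger name to a trigger type plus an 'activity' flag. The same mapping can be expressed as a lookup table. This is a standalone sketch, not the code from the patch: led_trigger, led_trigger_cfg and led_trigger_parse() are hypothetical userspace stand-ins for the kernel types.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for enum phy_led_trigger from the patch. */
enum led_trigger {
	LED_TRIGGER_NONE,
	LED_TRIGGER_LINK,
	LED_TRIGGER_LINK_10M,
	LED_TRIGGER_LINK_100M,
	LED_TRIGGER_LINK_1G,
	LED_TRIGGER_LINK_10G,
};

/* Stand-in for the anonymous 'trigger' struct in struct phy_led_config. */
struct led_trigger_cfg {
	enum led_trigger t;
	bool activity;
};

/* One row per trigger string listed in the binding. */
static const struct {
	const char *name;
	enum led_trigger t;
	bool activity;
} led_trigger_table[] = {
	{ "none",                   LED_TRIGGER_NONE,      false },
	{ "phy-link",               LED_TRIGGER_LINK,      false },
	{ "phy-link-10m",           LED_TRIGGER_LINK_10M,  false },
	{ "phy-link-100m",          LED_TRIGGER_LINK_100M, false },
	{ "phy-link-1g",            LED_TRIGGER_LINK_1G,   false },
	{ "phy-link-10g",           LED_TRIGGER_LINK_10G,  false },
	{ "phy-link-activity",      LED_TRIGGER_LINK,      true  },
	{ "phy-link-10m-activity",  LED_TRIGGER_LINK_10M,  true  },
	{ "phy-link-100m-activity", LED_TRIGGER_LINK_100M, true  },
	{ "phy-link-1g-activity",   LED_TRIGGER_LINK_1G,   true  },
	{ "phy-link-10g-activity",  LED_TRIGGER_LINK_10G,  true  },
};

/* Fill *cfg for a known trigger name; return -EINVAL for unknown names. */
static int led_trigger_parse(const char *name, struct led_trigger_cfg *cfg)
{
	size_t i;

	for (i = 0; i < sizeof(led_trigger_table) / sizeof(led_trigger_table[0]); i++) {
		if (strcmp(name, led_trigger_table[i].name) == 0) {
			cfg->t = led_trigger_table[i].t;
			cfg->activity = led_trigger_table[i].activity;
			return 0;
		}
	}
	return -EINVAL;
}
```

A table keeps the binding strings and their meaning in one place; whether that reads better than the if/else chain in the patch is a matter of taste.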
* [PATCH v6 3/4] net: phy: realtek: Add helpers for accessing RTL8211x extension pages
From: Matthias Kaehlcke @ 2019-08-13 19:11 UTC (permalink / raw)
To: David S . Miller, Rob Herring, Mark Rutland, Andrew Lunn,
Florian Fainelli, Heiner Kallweit
Cc: netdev, devicetree, linux-kernel, Douglas Anderson,
Matthias Kaehlcke
In-Reply-To: <20190813191147.19936-1-mka@chromium.org>
Some RTL8211x PHYs have extension pages, which can be accessed
after selecting a page through a custom method. Add a function to
modify bits in a register of an extension page and a helper for
selecting an ext page. Use rtl8211x_modify_ext_paged() in
rtl8211e_config_init() instead of doing things 'manually'.
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
---
Changes in v6:
- none
Changes in v5:
- renamed 'rtl8211e_<action>_ext_page' to 'rtl8211x_<action>_ext_page'
- updated commit message
Changes in v4:
- don't add constant RTL8211E_EXT_PAGE, it's only used once,
use a literal instead
- pass 'oldpage' to phy_restore_page() in rtl8211e_select_ext_page(),
not 'page'
- return 'oldpage' in rtl8211e_select_ext_page()
- use __phy_modify() in rtl8211e_modify_ext_paged() instead of
reimplementing __phy_modify_changed()
- in rtl8211e_modify_ext_paged() return directly when
rtl8211e_select_ext_page() fails
Changes in v3:
- use the new function in rtl8211e_config_init() instead of
doing things 'manually'
- use existing RTL8211E_EXT_PAGE instead of adding a new define
- updated commit message
Changes in v2:
- use phy_select_page() and phy_restore_page(), get rid of
rtl8211e_restore_page()
- s/rtl821e_select_ext_page/rtl8211e_select_ext_page/
- updated commit message
---
drivers/net/phy/realtek.c | 47 +++++++++++++++++++++++++++------------
1 file changed, 33 insertions(+), 14 deletions(-)
diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index a669945eb829..a5b3708dc4d8 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -53,6 +53,36 @@ static int rtl821x_write_page(struct phy_device *phydev, int page)
return __phy_write(phydev, RTL821x_PAGE_SELECT, page);
}
+static int rtl8211x_select_ext_page(struct phy_device *phydev, int page)
+{
+ int ret, oldpage;
+
+ oldpage = phy_select_page(phydev, 7);
+ if (oldpage < 0)
+ return oldpage;
+
+ ret = __phy_write(phydev, RTL821x_EXT_PAGE_SELECT, page);
+ if (ret)
+ return phy_restore_page(phydev, oldpage, ret);
+
+ return oldpage;
+}
+
+static int rtl8211x_modify_ext_paged(struct phy_device *phydev, int page,
+ u32 regnum, u16 mask, u16 set)
+{
+ int ret = 0;
+ int oldpage;
+
+ oldpage = rtl8211x_select_ext_page(phydev, page);
+ if (oldpage < 0)
+ return oldpage;
+
+ ret = __phy_modify(phydev, regnum, mask, set);
+
+ return phy_restore_page(phydev, oldpage, ret);
+}
+
static int rtl8201_ack_interrupt(struct phy_device *phydev)
{
int err;
@@ -184,7 +214,6 @@ static int rtl8211f_config_init(struct phy_device *phydev)
static int rtl8211e_config_init(struct phy_device *phydev)
{
- int ret = 0, oldpage;
u16 val;
/* enable TX/RX delay for rgmii-* modes, and disable them for rgmii. */
@@ -213,19 +242,9 @@ static int rtl8211e_config_init(struct phy_device *phydev)
* 2 = RX Delay, 1 = TX Delay, 0 = SELRGV (see original PHY datasheet
* for details).
*/
- oldpage = phy_select_page(phydev, 0x7);
- if (oldpage < 0)
- goto err_restore_page;
-
- ret = __phy_write(phydev, RTL821x_EXT_PAGE_SELECT, 0xa4);
- if (ret)
- goto err_restore_page;
-
- ret = __phy_modify(phydev, 0x1c, RTL8211E_TX_DELAY | RTL8211E_RX_DELAY,
- val);
-
-err_restore_page:
- return phy_restore_page(phydev, oldpage, ret);
+ return rtl8211x_modify_ext_paged(phydev, 0xa4, 0x1c,
+ RTL8211E_TX_DELAY | RTL8211E_RX_DELAY,
+ val);
}
static int rtl8211b_suspend(struct phy_device *phydev)
--
2.23.0.rc1.153.gdeed80330f-goog
^ permalink raw reply related
* [PATCH v6 4/4] net: phy: realtek: Add LED configuration support for RTL8211E
From: Matthias Kaehlcke @ 2019-08-13 19:11 UTC (permalink / raw)
To: David S . Miller, Rob Herring, Mark Rutland, Andrew Lunn,
Florian Fainelli, Heiner Kallweit
Cc: netdev, devicetree, linux-kernel, Douglas Anderson,
Matthias Kaehlcke
In-Reply-To: <20190813191147.19936-1-mka@chromium.org>
Add a .config_led hook which is called by the PHY core when
configuration data for a PHY LED is available. Each LED can be
configured to be solid 'off', solid 'on' for certain (or all)
link speeds or to blink on RX/TX activity.
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
---
Changes in v6:
- return -EOPNOTSUPP if trigger is not supported, don't log warning
- don't log errors if MDIO ops fail, this is rare and the phy_device
will log a warning
- added parentheses around macro argument used in arithmetics to
avoid possible operator precedence issues
- minor formatting changes
Changes in v5:
- use 'config_leds' driver callback instead of requesting the DT
configuration
- added support for trigger 'none'
- always disable EEE LED mode when a LED is configured. We have no
device data struct to keep track of its state, the number of LEDs
is limited, so the overhead of disabling it multiple times (once for
each LED that is configured) during initialization is negligible
- print warning when disabling EEE LED mode fails
- updated commit message (previous subject was 'net: phy: realtek:
configure RTL8211E LEDs')
Changes in v4:
- use the generic PHY LED binding
- keep default/current configuration if none is specified
- added rtl8211e_disable_eee_led_mode()
- was previously in separate patch, however since we always want to
disable EEE LED mode when a LED configuration is specified it makes
sense to just add the function here.
- don't call phy_restore_page() in rtl8211e_config_leds() if
selection of the extended page failed.
- use phydev_warn() instead of phydev_err() if LED configuration
fails since we don't bail out
- use hex number to specify page for consistency
- add hex number to comment about ext page 44 to facilitate searching
Changes in v3:
- sanity check led-modes values
- set LACR bits in a more readable way
- use phydev_err() instead of dev_err()
- log an error if LED configuration fails
Changes in v2:
- patch added to the series
---
drivers/net/phy/realtek.c | 90 ++++++++++++++++++++++++++++++++++++++-
1 file changed, 89 insertions(+), 1 deletion(-)
diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index a5b3708dc4d8..2bca3b91d43d 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -9,8 +9,9 @@
* Copyright (c) 2004 Freescale Semiconductor, Inc.
*/
#include <linux/bitops.h>
-#include <linux/phy.h>
+#include <linux/bits.h>
#include <linux/module.h>
+#include <linux/phy.h>
#define RTL821x_PHYSR 0x11
#define RTL821x_PHYSR_DUPLEX BIT(13)
@@ -26,6 +27,19 @@
#define RTL821x_EXT_PAGE_SELECT 0x1e
#define RTL821x_PAGE_SELECT 0x1f
+/* RTL8211E page 5 */
+#define RTL8211E_EEE_LED_MODE1 0x05
+#define RTL8211E_EEE_LED_MODE2 0x06
+
+/* RTL8211E extension page 44 (0x2c) */
+#define RTL8211E_LACR 0x1a
+#define RLT8211E_LACR_LEDACTCTRL_SHIFT 4
+#define RTL8211E_LCR 0x1c
+
+#define LACR_MASK(led) BIT(4 + (led))
+#define LCR_MASK(led) GENMASK(((led) * 4) + 2,\
+ (led) * 4)
+
#define RTL8211F_INSR 0x1d
#define RTL8211F_TX_DELAY BIT(8)
@@ -83,6 +97,79 @@ static int rtl8211x_modify_ext_paged(struct phy_device *phydev, int page,
return phy_restore_page(phydev, oldpage, ret);
}
+static void rtl8211e_disable_eee_led_mode(struct phy_device *phydev)
+{
+ int oldpage;
+ int err = 0;
+
+ oldpage = phy_select_page(phydev, 5);
+ if (oldpage < 0)
+ goto out;
+
+ /* write magic values to disable EEE LED mode */
+ err = __phy_write(phydev, RTL8211E_EEE_LED_MODE1, 0x8b82);
+ if (err)
+ goto out;
+
+ err = __phy_write(phydev, RTL8211E_EEE_LED_MODE2, 0x052b);
+
+out:
+ if (err)
+ phydev_warn(phydev, "failed to disable EEE LED mode: %d\n",
+ err);
+
+ phy_restore_page(phydev, oldpage, err);
+}
+
+static int rtl8211e_config_led(struct phy_device *phydev, int led,
+ struct phy_led_config *cfg)
+{
+ u16 lacr_bits = 0, lcr_bits = 0;
+ int oldpage, ret;
+
+ switch (cfg->trigger.t) {
+ case PHY_LED_TRIGGER_LINK:
+ lcr_bits = 7 << (led * 4);
+ break;
+
+ case PHY_LED_TRIGGER_LINK_10M:
+ lcr_bits = 1 << (led * 4);
+ break;
+
+ case PHY_LED_TRIGGER_LINK_100M:
+ lcr_bits = 2 << (led * 4);
+ break;
+
+ case PHY_LED_TRIGGER_LINK_1G:
+ lcr_bits |= 4 << (led * 4);
+ break;
+
+ case PHY_LED_TRIGGER_NONE:
+ break;
+
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ if (cfg->trigger.activity)
+ lacr_bits = BIT(RLT8211E_LACR_LEDACTCTRL_SHIFT + led);
+
+ rtl8211e_disable_eee_led_mode(phydev);
+
+ oldpage = rtl8211x_select_ext_page(phydev, 0x2c);
+ if (oldpage < 0)
+ return oldpage;
+
+ ret = __phy_modify(phydev, RTL8211E_LACR, LACR_MASK(led), lacr_bits);
+ if (ret)
+ goto err;
+
+ ret = __phy_modify(phydev, RTL8211E_LCR, LCR_MASK(led), lcr_bits);
+
+err:
+ return phy_restore_page(phydev, oldpage, ret);
+}
+
static int rtl8201_ack_interrupt(struct phy_device *phydev)
{
int err;
@@ -330,6 +417,7 @@ static struct phy_driver realtek_drvs[] = {
.config_init = &rtl8211e_config_init,
.ack_interrupt = &rtl821x_ack_interrupt,
.config_intr = &rtl8211e_config_intr,
+ .config_led = &rtl8211e_config_led,
.suspend = genphy_suspend,
.resume = genphy_resume,
.read_page = rtl821x_read_page,
--
2.23.0.rc1.153.gdeed80330f-goog
^ permalink raw reply related
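The per-LED masks used with __phy_modify() in the patch above come down to simple bit arithmetic: one activity bit per LED starting at bit 4 of LACR, and a 3-bit link-speed field per LED at a 4-bit stride in LCR. The sketch below reproduces that arithmetic in userspace. The helper names are hypothetical, and reg_modify() only imitates the read-modify-write semantics on a plain value, without any MDIO access.

```c
#include <stdint.h>

/* Activity-control bit for one LED, same value as LACR_MASK(led). */
static uint16_t lacr_mask(unsigned int led)
{
	return (uint16_t)(1u << (4 + led));
}

/* 3-bit link-speed field for one LED, same value as LCR_MASK(led). */
static uint16_t lcr_mask(unsigned int led)
{
	return (uint16_t)(0x7u << (led * 4));
}

/*
 * Place speed bits (1 = 10M, 2 = 100M, 4 = 1G, 7 = any link, as in
 * the patch) into the field belonging to the given LED.
 */
static uint16_t lcr_bits(unsigned int led, unsigned int speed_bits)
{
	return (uint16_t)((speed_bits & 0x7u) << (led * 4));
}

/* Clear the masked bits of a register value, then set the new ones. */
static uint16_t reg_modify(uint16_t old, uint16_t mask, uint16_t set)
{
	return (uint16_t)((old & ~mask) | set);
}
```

For example, reg_modify(old, lcr_mask(1), lcr_bits(1, 4)) only touches bits 4 to 6 of the value, leaving the fields of the other LEDs as they were.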
* [PATCH v6 0/4] net: phy: Add support for DT configuration of PHY LEDs and use it for RTL8211E
From: Matthias Kaehlcke @ 2019-08-13 19:11 UTC (permalink / raw)
To: David S . Miller, Rob Herring, Mark Rutland, Andrew Lunn,
Florian Fainelli, Heiner Kallweit
Cc: netdev, devicetree, linux-kernel, Douglas Anderson,
Matthias Kaehlcke
This series adds a generic binding to configure PHY LEDs through
the device tree, and phylib support for reading the information
from the DT. PHY drivers that support the generic binding should
implement the new hook .config_led.
Enable DT configuration of the RTL8211E LEDs by implementing the
.config_led hook of the driver. Certain registers of the RTL8211E
can only be accessed through a vendor specific extended page
mechanism. Extended pages need to be accessed for the LED
configuration. This series adds helpers to facilitate accessing
extended pages.
Matthias Kaehlcke (4):
dt-bindings: net: phy: Add subnode for LED configuration
net: phy: Add support for generic LED configuration through the DT
net: phy: realtek: Add helpers for accessing RTL8211x extension pages
net: phy: realtek: Add LED configuration support for RTL8211E
.../devicetree/bindings/net/ethernet-phy.yaml | 59 ++++++++
drivers/net/phy/phy_device.c | 72 +++++++++
drivers/net/phy/realtek.c | 137 ++++++++++++++++--
include/linux/phy.h | 22 +++
4 files changed, 275 insertions(+), 15 deletions(-)
--
2.23.0.rc1.153.gdeed80330f-goog
^ permalink raw reply
* Re: general protection fault in tls_write_space
From: John Fastabend @ 2019-08-13 19:30 UTC (permalink / raw)
To: Jakub Kicinski, John Fastabend
Cc: Hillf Danton, syzbot, aviadye, borisp, daniel, davejwatson, davem,
linux-kernel, netdev, oss-drivers, syzkaller-bugs, willemb
In-Reply-To: <20190813115948.5f57b272@cakuba.netronome.com>
Jakub Kicinski wrote:
> On Tue, 13 Aug 2019 11:30:00 -0700, John Fastabend wrote:
> > Jakub Kicinski wrote:
> > > On Tue, 13 Aug 2019 10:17:06 -0700, John Fastabend wrote:
> > > > > Followup of commit 95fa145479fb
> > > > > ("bpf: sockmap/tls, close can race with map free")
> > > > >
> > > > > --- a/net/tls/tls_main.c
> > > > > +++ b/net/tls/tls_main.c
> > > > > @@ -308,6 +308,9 @@ static void tls_sk_proto_close(struct so
> > > > > if (free_ctx)
> > > > > icsk->icsk_ulp_data = NULL;
> > > > > sk->sk_prot = ctx->sk_proto;
> > > > > + /* tls will go; restore sock callback before enabling bh */
> > > > > + if (sk->sk_write_space == tls_write_space)
> > > > > + sk->sk_write_space = ctx->sk_write_space;
> > > > > write_unlock_bh(&sk->sk_callback_lock);
> > > > > release_sock(sk);
> > > > > if (ctx->tx_conf == TLS_SW)
> > > >
> > > > Hi Hillf,
> > > >
> > > > We need this patch (although slightly updated for bpf tree) do
> > > > you want to send it? Otherwise I can. We should only set this if
> > > > TX path was enabled otherwise we null it. Checking against
> > > > tls_write_space seems best to me as well.
> > > >
> > > > Against bpf this patch should fix it.
> > > >
> > > > diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> > > > index ce6ef56a65ef..43252a801c3f 100644
> > > > --- a/net/tls/tls_main.c
> > > > +++ b/net/tls/tls_main.c
> > > > @@ -308,7 +308,8 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
> > > > if (free_ctx)
> > > > icsk->icsk_ulp_data = NULL;
> > > > sk->sk_prot = ctx->sk_proto;
> > > > - sk->sk_write_space = ctx->sk_write_space;
> > > > + if (sk->sk_write_space == tls_write_space)
> > > > + sk->sk_write_space = ctx->sk_write_space;
> > > > write_unlock_bh(&sk->sk_callback_lock);
> > > > release_sock(sk);
> > > > if (ctx->tx_conf == TLS_SW)
> > >
> > > This is already in net since Friday:
> >
> > Don't we need to guard that with an
> >
> > if (sk->sk_write_space == tls_write_space)
> >
> > or something similar? Where is ctx->sk_write_space set in the rx only
> > case? In do_tls_setsockopt_conf() we have this block
> >
> > if (tx) {
> > ctx->sk_write_space = sk->sk_write_space;
> > sk->sk_write_space = tls_write_space;
> > } else {
> > sk->sk_socket->ops = &tls_sw_proto_ops;
> > }
> >
> > which makes me think ctx->sk_write_space may not be set correctly in
> > all cases.
>
> Ah damn, you're right. I remember looking at that, but then I went down
> the rabbit hole of trying to repro and forgot :/
>
> Do you want to send an incremental change?
Sure, I'll send something out this afternoon.
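
For readers following along, here is a minimal userspace sketch of why the guard matters. It is not kernel code: struct sock_model, struct ctx_model and every function name below are hypothetical stand-ins for struct sock, struct tls_context and the real callbacks. It only models the point made above, that the saved callback is recorded on the TX path alone, so an unconditional restore on close would install an unset pointer for an RX-only socket.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct sock and struct tls_context. */
struct sock_model;
typedef void (*write_space_fn)(struct sock_model *sk);

struct sock_model {
	write_space_fn sk_write_space;
};

struct ctx_model {
	write_space_fn sk_write_space; /* saved callback, TX path only */
};

static void orig_write_space(struct sock_model *sk) { (void)sk; }
static void tls_write_space_model(struct sock_model *sk) { (void)sk; }

/* Models the "if (tx)" block quoted above: only TX saves and replaces. */
static void setup_model(struct sock_model *sk, struct ctx_model *ctx, int tx)
{
	if (tx) {
		ctx->sk_write_space = sk->sk_write_space;
		sk->sk_write_space = tls_write_space_model;
	}
}

/* Guarded restore: touch the callback only if TLS actually replaced it. */
static void close_guarded(struct sock_model *sk, const struct ctx_model *ctx)
{
	if (sk->sk_write_space == tls_write_space_model)
		sk->sk_write_space = ctx->sk_write_space;
}

/* Unguarded restore, shown only for contrast. */
static void close_unguarded(struct sock_model *sk, const struct ctx_model *ctx)
{
	sk->sk_write_space = ctx->sk_write_space;
}

/* Runs setup followed by close and returns the callback left on the socket. */
static write_space_fn callback_after_close(int tx, int guarded)
{
	struct sock_model sk = { orig_write_space };
	struct ctx_model ctx = { NULL };

	setup_model(&sk, &ctx, tx);
	if (guarded)
		close_guarded(&sk, &ctx);
	else
		close_unguarded(&sk, &ctx);
	return sk.sk_write_space;
}
```

In this model, callback_after_close(0, 0) leaves the socket with a NULL callback, while the guarded variant keeps the original one in both the TX and the RX-only case.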
^ permalink raw reply
* Re: [PATCH net] net: dsa: mv88e6xxx: drop adjust_link to enabled phylink
From: Florian Fainelli @ 2019-08-13 19:50 UTC (permalink / raw)
To: Hubert Feurstein, Andrew Lunn
Cc: netdev, linux-kernel, Vivien Didelot, David S. Miller
In-Reply-To: <CAFfN3gX6_dvAkRqRuXdR_+nfsFyBd2UNSzYo1H3am49xyb-hBQ@mail.gmail.com>
On 8/5/19 1:49 AM, Hubert Feurstein wrote:
> Hi Andrew,
>
> It looks like some work is still needed in b53_phylink_mac_config to
> take over the functionality of the current adjust_link implementation.
Indeed, I will look into it in the next few weeks.
--
Florian
^ permalink raw reply
* Re: [PATCH net-next,v4 08/12] drivers: net: use flow block API
From: Pablo Neira Ayuso @ 2019-08-13 19:51 UTC (permalink / raw)
To: Edward Cree; +Cc: netdev, netfilter-devel
In-Reply-To: <75eec70e-60de-e33b-aea0-be595ca625f4@solarflare.com>
On Mon, Aug 12, 2019 at 06:50:09PM +0100, Edward Cree wrote:
> On 09/07/2019 21:55, Pablo Neira Ayuso wrote:
> > This patch updates flow_block_cb_setup_simple() to use the flow block API.
> > Several drivers are also adjusted to use it.
> >
> > This patch introduces the per-driver list of flow blocks to account for
> > blocks that are already in use.
> >
> > Remove tc_block_offload alias.
> >
> > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> > ---
> > v4: fix typo in list in nfp driver - Jakub Kicinski.
> > Move driver_list handling to the driver code, this list is transitional,
> > until drivers are updated to support multiple subsystems. No more
> > driver_list handling from core.
>
> Pablo, can you explain (because this commit message doesn't) why these
> per-driver lists are needed, and what the information/state is that has
> module (rather than, say, netdevice) scope?
The idea is to update drivers to support one flow_block per subsystem,
one for ethtool, one for tc, and so on. So far, existing drivers only
allow for binding one single flow_block to one of the existing
subsystems. So this limitation applies at driver level.
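
To make the driver-level limitation concrete, here is a rough userspace sketch. It is only a model under stated assumptions: struct driver_block_list, block_is_busy() and block_bind() are hypothetical names, not the kernel flow block API, and the fixed-size array stands in for a real list. The idea it illustrates is that the driver keeps its own record of bound (callback, identity) pairs and refuses a second bind of the same pair.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BLOCKS 8

typedef int (*setup_cb_t)(void *cb_priv);

/* Hypothetical per-driver record of flow blocks that are already bound. */
struct driver_block_list {
	struct {
		setup_cb_t cb;
		void *cb_ident;
	} entries[MAX_BLOCKS];
	size_t count;
};

/* True if this (callback, identity) pair is already bound in the driver. */
static bool block_is_busy(const struct driver_block_list *list,
			  setup_cb_t cb, void *cb_ident)
{
	for (size_t i = 0; i < list->count; i++)
		if (list->entries[i].cb == cb &&
		    list->entries[i].cb_ident == cb_ident)
			return true;
	return false;
}

/* Bind a block, rejecting a pair that the driver already has in use. */
static int block_bind(struct driver_block_list *list,
		      setup_cb_t cb, void *cb_ident)
{
	if (block_is_busy(list, cb, cb_ident))
		return -EBUSY;
	if (list->count == MAX_BLOCKS)
		return -ENOMEM;
	list->entries[list->count].cb = cb;
	list->entries[list->count].cb_ident = cb_ident;
	list->count++;
	return 0;
}

/* Placeholder callback used only to have something to bind. */
static int demo_setup_cb(void *cb_priv)
{
	(void)cb_priv;
	return 0;
}
```

Because the record lives in the driver rather than in the core, the "one block at a time" rule is enforced where the limitation actually is, and can go away driver by driver.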
^ permalink raw reply
* Re: [PATCH net-next] mcast: ensure L-L IPv6 packets are accepted by bridge
From: Ido Schimmel @ 2019-08-13 19:53 UTC (permalink / raw)
To: Patrick Ruddy; +Cc: netdev, roopa, nikolay, linus.luessing
In-Reply-To: <20190813141804.20515-1-pruddy@vyatta.att-mail.com>
+ Bridge maintainers, Linus
On Tue, Aug 13, 2019 at 03:18:04PM +0100, Patrick Ruddy wrote:
> At present only all-nodes IPv6 multicast packets are accepted by
> a bridge interface that is not in multicast router mode. Other
> protocols, e.g. OSPFv3 and IPv6 ND, can be running in the absence of
> multicast forwarding, so change the test to allow all of the
> FFx2::/16 range to be accepted when not in multicast router mode.
> This aligns the code with IPv4 link-local reception and RFC 4291.
Can you please quote the relevant part from RFC 4291?
>
> Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
> ---
> include/net/addrconf.h | 15 +++++++++++++++
> net/bridge/br_multicast.c | 2 +-
> 2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/addrconf.h b/include/net/addrconf.h
> index becdad576859..05b42867e969 100644
> --- a/include/net/addrconf.h
> +++ b/include/net/addrconf.h
> @@ -434,6 +434,21 @@ static inline void addrconf_addr_solict_mult(const struct in6_addr *addr,
> htonl(0xFF000000) | addr->s6_addr32[3]);
> }
>
> +/*
> + * link local multicast address range ffx2::/16 rfc4291
> + */
> +static inline bool ipv6_addr_is_ll_mcast(const struct in6_addr *addr)
> +{
> +#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
> + __be64 *p = (__be64 *)addr;
> + return ((p[0] & cpu_to_be64(0xff0f000000000000UL))
> + ^ cpu_to_be64(0xff02000000000000UL)) == 0UL;
> +#else
> + return ((addr->s6_addr32[0] & htonl(0xff0f0000)) ^
> + htonl(0xff020000)) == 0;
> +#endif
> +}
> +
> static inline bool ipv6_addr_is_ll_all_nodes(const struct in6_addr *addr)
> {
> #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
> diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
> index 9b379e110129..ed3957381fa2 100644
> --- a/net/bridge/br_multicast.c
> +++ b/net/bridge/br_multicast.c
> @@ -1664,7 +1664,7 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br,
> err = ipv6_mc_check_mld(skb);
>
> if (err == -ENOMSG) {
> - if (!ipv6_addr_is_ll_all_nodes(&ipv6_hdr(skb)->daddr))
> + if (!ipv6_addr_is_ll_mcast(&ipv6_hdr(skb)->daddr))
> BR_INPUT_SKB_CB(skb)->mrouters_only = 1;
IIUC, you want IPv6 link-local packets to be locally received, but this
also changes how these packets are flooded. RFC 4541 says that packets
addressed to the all hosts address are a special case and should be
forwarded to all ports:
"In IPv6, the data forwarding rules are more straight forward because MLD is
mandated for addresses with scope 2 (link-scope) or greater. The only exception
is the address FF02::1 which is the all hosts link-scope address for which MLD
messages are never sent. Packets with the all hosts link-scope address should
be forwarded on all ports."
Maybe you want something like:
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 09b1dd8cd853..9f312a73f61c 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -132,7 +132,8 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
if ((mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb)) &&
br_multicast_querier_exists(br, eth_hdr(skb))) {
if ((mdst && mdst->host_joined) ||
- br_multicast_is_router(br)) {
+ br_multicast_is_router(br) ||
+ BR_INPUT_SKB_CB_LOCAL_RECEIVE(skb)) {
local_rcv = true;
br->dev->stats.multicast++;
}
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 9b379e110129..f03cecf6174e 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1667,6 +1667,9 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br,
if (!ipv6_addr_is_ll_all_nodes(&ipv6_hdr(skb)->daddr))
BR_INPUT_SKB_CB(skb)->mrouters_only = 1;
+ if (ipv6_addr_is_ll_mcast(&ipv6_hdr(skb)->daddr))
+ BR_INPUT_SKB_CB(skb)->local_receive = 1;
+
if (ipv6_addr_is_all_snoopers(&ipv6_hdr(skb)->daddr)) {
err = br_ip6_multicast_mrd_rcv(br, port, skb);
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index b7a4942ff1b3..d76394ca4059 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -426,6 +426,7 @@ struct br_input_skb_cb {
#ifdef CONFIG_BRIDGE_IGMP_SNOOPING
u8 igmp;
u8 mrouters_only:1;
+ u8 local_receive:1;
#endif
u8 proxyarp_replied:1;
u8 src_port_isolated:1;
@@ -445,8 +446,10 @@ struct br_input_skb_cb {
#ifdef CONFIG_BRIDGE_IGMP_SNOOPING
# define BR_INPUT_SKB_CB_MROUTERS_ONLY(__skb) (BR_INPUT_SKB_CB(__skb)->mrouters_only)
+# define BR_INPUT_SKB_CB_LOCAL_RECEIVE(__skb) (BR_INPUT_SKB_CB(__skb)->local_receive)
#else
# define BR_INPUT_SKB_CB_MROUTERS_ONLY(__skb) (0)
+# define BR_INPUT_SKB_CB_LOCAL_RECEIVE(__skb) (0)
#endif
#define br_printk(level, br, format, args...) \
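
As a quick sanity check of the ffx2::/16 test used in both versions, here is a small userspace sketch. It mirrors the 32-bit (#else) branch of the quoted helper using memcpy instead of s6_addr32; the names addr_is_ll_mcast() and str_is_ll_mcast() are made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Link-local multicast means: first byte 0xff, scope nibble (low nibble of
 * the second byte) equal to 2. The flags nibble is masked out, hence the
 * "x" in ffx2::/16.
 */
static bool addr_is_ll_mcast(const struct in6_addr *addr)
{
	uint32_t first_word;

	/* Copy the first 4 bytes; they are already in network byte order. */
	memcpy(&first_word, addr->s6_addr, sizeof(first_word));
	return ((first_word & htonl(0xff0f0000)) ^ htonl(0xff020000)) == 0;
}

/* Convenience wrapper: parse a textual address, then apply the check. */
static bool str_is_ll_mcast(const char *text)
{
	struct in6_addr addr;

	if (inet_pton(AF_INET6, text, &addr) != 1)
		return false;
	return addr_is_ll_mcast(&addr);
}
```

With this, ff02::1 (all nodes) and ff02::5 (OSPFv3 AllSPFRouters) both match, as does ff12::1 with a non-zero flags nibble, while ff05::2 (site scope) and fe80::1 (unicast) do not.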
^ permalink raw reply related
* Re: [PATCH v6 1/4] dt-bindings: net: phy: Add subnode for LED configuration
From: Andrew Lunn @ 2019-08-13 19:54 UTC (permalink / raw)
To: Matthias Kaehlcke
Cc: David S . Miller, Rob Herring, Mark Rutland, Florian Fainelli,
Heiner Kallweit, netdev, devicetree, linux-kernel,
Douglas Anderson
In-Reply-To: <20190813191147.19936-2-mka@chromium.org>
On Tue, Aug 13, 2019 at 12:11:44PM -0700, Matthias Kaehlcke wrote:
> The LED behavior of some Ethernet PHYs is configurable. Add an
> optional 'leds' subnode with a child node for each LED to be
> configured. The binding aims to be compatible with the common
> LED binding (see devicetree/bindings/leds/common.txt).
>
> A LED can be configured to be:
>
> - 'on' when a link is active, some PHYs allow configuration for
> certain link speeds
> - 'off'
> - blink on RX/TX activity, some PHYs allow configuration for
> certain link speeds
>
> For the configuration to be effective it needs to be supported by
> the hardware and the corresponding PHY driver.
>
> Suggested-by: Andrew Lunn <andrew@lunn.ch>
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH net-next] net: phy: modify assignment to OR for dev_flags in phy_attach_direct
From: Florian Fainelli @ 2019-08-13 19:54 UTC (permalink / raw)
To: Tao Ren, Andrew Lunn, Heiner Kallweit, David S . Miller,
Vladimir Oltean, Arun Parameswaran, Justin Chen, netdev,
linux-kernel, openbmc
In-Reply-To: <20190805185551.3140564-1-taoren@fb.com>
On 8/5/19 11:55 AM, Tao Ren wrote:
> Modify the assignment to OR when dealing with phydev->dev_flags in
> phy_attach_direct function, and this is to make sure dev_flags set in
> driver's probe callback won't be lost.
As Andrew pointed out already, this probably needs to be reworked, but
for now this looks reasonable:
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
--
Florian
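
A tiny userspace sketch of the difference the patch makes, assuming a flag was set during probe and attach is later called with its own flags. struct phy_model and the DEMO_* flag values are hypothetical; only the assignment versus OR behaviour is the point.

```c
#include <stdint.h>

/* Hypothetical stand-in for the dev_flags field of struct phy_device. */
struct phy_model {
	uint32_t dev_flags;
};

#define DEMO_FLAG_FROM_PROBE  0x1u /* set by the driver's probe callback */
#define DEMO_FLAG_FROM_ATTACH 0x4u /* passed in by the attach caller */

/* Old behaviour: plain assignment discards whatever probe had set. */
static void attach_assign(struct phy_model *phydev, uint32_t flags)
{
	phydev->dev_flags = flags;
}

/* Patched behaviour: OR keeps the probe-time flags and adds the new ones. */
static void attach_or(struct phy_model *phydev, uint32_t flags)
{
	phydev->dev_flags |= flags;
}

/* Simulates probe followed by attach and returns the resulting flags. */
static uint32_t flags_after_attach(int use_or)
{
	struct phy_model phydev = { DEMO_FLAG_FROM_PROBE };

	if (use_or)
		attach_or(&phydev, DEMO_FLAG_FROM_ATTACH);
	else
		attach_assign(&phydev, DEMO_FLAG_FROM_ATTACH);
	return phydev.dev_flags;
}
```

With assignment only the attach-time flag survives; with OR both flags are present afterwards.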
^ permalink raw reply
* Re: [PATCH v6 2/4] net: phy: Add support for generic LED configuration through the DT
From: Andrew Lunn @ 2019-08-13 19:56 UTC (permalink / raw)
To: Matthias Kaehlcke
Cc: David S . Miller, Rob Herring, Mark Rutland, Florian Fainelli,
Heiner Kallweit, netdev, devicetree, linux-kernel,
Douglas Anderson
In-Reply-To: <20190813191147.19936-3-mka@chromium.org>
On Tue, Aug 13, 2019 at 12:11:45PM -0700, Matthias Kaehlcke wrote:
> For PHYs with a device tree node, look for LED trigger configuration
> using the generic binding; if it exists, try to apply it via the new
> driver hook .config_led.
>
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH v6 3/4] net: phy: realtek: Add helpers for accessing RTL8211x extension pages
From: Andrew Lunn @ 2019-08-13 20:09 UTC (permalink / raw)
To: Matthias Kaehlcke
Cc: David S . Miller, Rob Herring, Mark Rutland, Florian Fainelli,
Heiner Kallweit, netdev, devicetree, linux-kernel,
Douglas Anderson
In-Reply-To: <20190813191147.19936-4-mka@chromium.org>
On Tue, Aug 13, 2019 at 12:11:46PM -0700, Matthias Kaehlcke wrote:
> Some RTL8211x PHYs have extension pages, which can be accessed
> after selecting a page through a custom method. Add a function to
> modify bits in a register of an extension page and a helper for
> selecting an ext page. Use rtl8211x_modify_ext_paged() in
> rtl8211e_config_init() instead of doing things 'manually'.
>
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH v6 4/4] net: phy: realtek: Add LED configuration support for RTL8211E
From: Andrew Lunn @ 2019-08-13 20:14 UTC (permalink / raw)
To: Matthias Kaehlcke
Cc: David S . Miller, Rob Herring, Mark Rutland, Florian Fainelli,
Heiner Kallweit, netdev, devicetree, linux-kernel,
Douglas Anderson
In-Reply-To: <20190813191147.19936-5-mka@chromium.org>
> +static int rtl8211e_config_led(struct phy_device *phydev, int led,
> + struct phy_led_config *cfg)
> +{
> + u16 lacr_bits = 0, lcr_bits = 0;
> + int oldpage, ret;
> +
You should probably check that led is 0 or 1.
Otherwise this looks good.
Andrew
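
The kind of check meant here could look roughly like the sketch below. It assumes two configurable LEDs indexed 0 and 1, as suggested above; DEMO_NUM_LEDS and demo_check_led() are hypothetical names, and returning -EINVAL is just one reasonable choice.

```c
#include <errno.h>

/* Assumption for this sketch: two configurable LEDs, indexed 0 and 1. */
#define DEMO_NUM_LEDS 2

/* Reject LED indices the hardware does not have before touching registers. */
static int demo_check_led(int led)
{
	if (led < 0 || led >= DEMO_NUM_LEDS)
		return -EINVAL;
	return 0;
}
```

Calling such a helper at the top of the config hook keeps an out-of-range index from being turned into register bit offsets.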
^ permalink raw reply