* Re: [PATCH net-next v4 0/2] r8152: save EEE
From: David Miller @ 2019-08-23 21:31 UTC
To: hayeswang; +Cc: netdev, nic_swsd, linux-kernel
In-Reply-To: <1394712342-15778-311-Taiwan-albertk@realtek.com>
From: Hayes Wang <hayeswang@realtek.com>
Date: Fri, 23 Aug 2019 15:33:39 +0800
> v4:
> For patch #2, remove redundant calling of "ocp_reg_write(tp, OCP_EEE_ADV, 0)".
>
> v3:
> For patch #2, fix the mistake caused by copying and pasting.
>
> v2:
> Adjust patch #1. The EEE has been disabled in the beginning of
> r8153_hw_phy_cfg() and r8153b_hw_phy_cfg(), so only check if
> it is necessary to enable EEE.
>
> Add the patch #2 for the helper function.
>
> v1:
> Save the EEE settings so that they do not revert to the defaults
> after reset_resume().
Series applied.
* Re: [PATCH net 2/2] r8152: avoid using napi_disable after netif_napi_del.
From: David Miller @ 2019-08-23 21:33 UTC
To: hayeswang; +Cc: netdev, nic_swsd, linux-kernel, jslaby
In-Reply-To: <1394712342-15778-316-Taiwan-albertk@realtek.com>
From: Hayes Wang <hayeswang@realtek.com>
Date: Fri, 23 Aug 2019 16:53:02 +0800
> Exchange netif_napi_del() and unregister_netdev() in rtl8152_disconnect()
> to avoid using napi_disable() after netif_napi_del().
>
> Signed-off-by: Hayes Wang <hayeswang@realtek.com>
> ---
> drivers/net/usb/r8152.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
> index 690a24d1ef82..29390eda5251 100644
> --- a/drivers/net/usb/r8152.c
> +++ b/drivers/net/usb/r8152.c
> @@ -5364,8 +5364,8 @@ static void rtl8152_disconnect(struct usb_interface *intf)
> if (tp) {
> rtl_set_unplug(tp);
>
> - netif_napi_del(&tp->napi);
> unregister_netdev(tp->netdev);
> + netif_napi_del(&tp->napi);
> cancel_delayed_work_sync(&tp->hw_phy_work);
> tp->rtl_ops.unload(tp);
> free_netdev(tp->netdev);
This is completely redundant because free_netdev() will perform all of
the necessary netif_napi_del() calls.
* Re: [PATCH 1/2] rtnetlink: gate MAC address with an LSM hook
From: David Miller @ 2019-08-23 21:41 UTC
To: jeffv; +Cc: netdev, linux-security-module, selinux
In-Reply-To: <CABXk95BF=RfqFSHU_---DRHDoKyFON5kS_vYJbc4ns2OS=_t0w@mail.gmail.com>
From: Jeffrey Vander Stoep <jeffv@google.com>
Date: Fri, 23 Aug 2019 13:41:38 +0200
> I could make this really generic by adding a single hook to the end of
> sock_msgrecv() which would allow an LSM to modify the message to omit
> the MAC address and any other information that we deem as sensitive in the
> future. Basically what Casey was suggesting. Thoughts on that approach?
Editing the SKB in place is generally frowned upon. It could be cloned and
in use by other code paths, so it would need to be copied or COW'd first.
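
The copy-before-write idea mentioned above can be sketched in plain user-space C. This is only an illustration under stated assumptions: struct demo_buf, demo_make_writable() and demo_redact() are hypothetical names invented here, not the real sk_buff or any kernel API, and the 6-byte field simply stands in for a MAC address.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a shared packet buffer; not the real sk_buff. */
struct demo_buf {
	int users;		/* number of holders of this buffer */
	size_t len;
	unsigned char data[64];
};

/*
 * Return a buffer that is safe to modify. If other holders exist, drop
 * our reference and return a private copy (the caller must free() it).
 * Returns NULL on allocation failure, leaving the original untouched.
 */
static struct demo_buf *demo_make_writable(struct demo_buf *b)
{
	struct demo_buf *copy;

	if (b->users == 1)
		return b;

	copy = malloc(sizeof(*copy));
	if (!copy)
		return NULL;
	memcpy(copy, b, sizeof(*copy));
	copy->users = 1;
	b->users--;
	return copy;
}

/* Blank out a 6-byte field at 'off' without disturbing other holders. */
static struct demo_buf *demo_redact(struct demo_buf *b, size_t off)
{
	struct demo_buf *w;

	assert(off + 6 <= sizeof(b->data));
	w = demo_make_writable(b);
	if (w)
		memset(w->data + off, 0, 6);
	return w;
}
```

The point of the sketch is that the redaction never touches a buffer that another holder can still see; whether a hook at the end of the receive path could afford that extra copy is a separate question.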
* Re: [PATCH net-next] net: ipv6: fix listify ip6_rcv_finish in case of forwarding
From: David Miller @ 2019-08-23 21:42 UTC
To: lucien.xin
Cc: netdev, linux-sctp, marcelo.leitner, nhorman, brouer, ecree,
dvyukov, syzkaller-bugs
In-Reply-To: <e355527b374f6ce70fcc286457f87592cd8f3dcc.1566559983.git.lucien.xin@gmail.com>
From: Xin Long <lucien.xin@gmail.com>
Date: Fri, 23 Aug 2019 19:33:03 +0800
> We need a similar fix for ipv6 as Commit 0761680d5215 ("net: ipv4: fix
> listify ip_rcv_finish in case of forwarding") does for ipv4.
>
> This issue can be reproduced by syzbot since Commit 323ebb61e32b ("net:
> use listified RX for handling GRO_NORMAL skbs") on net-next. The call
> trace was:
...
> Fixes: d8269e2cbf90 ("net: ipv6: listify ipv6_rcv() and ip6_rcv_finish()")
> Fixes: 323ebb61e32b ("net: use listified RX for handling GRO_NORMAL skbs")
> Reported-by: syzbot+eb349eeee854e389c36d@syzkaller.appspotmail.com
> Reported-by: syzbot+4a0643a653ac375612d1@syzkaller.appspotmail.com
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
Applied, thanks.
* Re: [PATCH net-next] net/mlx5: Fix return code in case of hyperv wrong size read
From: David Miller @ 2019-08-23 21:45 UTC
To: eranbe; +Cc: netdev, saeedm, haiyangz
In-Reply-To: <1566563687-29760-1-git-send-email-eranbe@mellanox.com>
From: Eran Ben Elisha <eranbe@mellanox.com>
Date: Fri, 23 Aug 2019 15:34:47 +0300
> Return code value could be non deterministic in case of wrong size read.
> With this patch, if such error occurs, set rc to be -EIO.
>
> In addition, mlx5_hv_config_common() supports reading
> HV_CONFIG_BLOCK_SIZE_MAX bytes only, so return an error early on
> bad input.
>
> Fixes: 913d14e86657 ("net/mlx5: Add wrappers for HyperV PCIe operations")
> Reported-by: Leon Romanovsky <leon@kernel.org>
> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Applied, thank you.
* Re: [PATCH net] ipv6: propagate ipv6_add_dev's error returns out of ipv6_find_idev
From: David Miller @ 2019-08-23 21:53 UTC
To: sd; +Cc: netdev
In-Reply-To: <5bc330e3f8123eb139113ae93851cc17100c22da.1566566438.git.sd@queasysnail.net>
From: Sabrina Dubroca <sd@queasysnail.net>
Date: Fri, 23 Aug 2019 15:44:36 +0200
> Currently, ipv6_find_idev returns NULL when ipv6_add_dev fails,
> ignoring the specific error value. This results in addrconf_add_dev
> returning ENOBUFS in all cases, which is unfortunate in cases such as:
>
> # ip link add dummyX type dummy
> # ip link set dummyX mtu 1200 up
> # ip addr add 2000::/64 dev dummyX
> RTNETLINK answers: No buffer space available
>
> Commit a317a2f19da7 ("ipv6: fail early when creating netdev named all
> or default") introduced error returns in ipv6_add_dev. Before that,
> that function would simply return NULL for all failures.
>
> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Looks good, applied, thanks Sabrina.
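
For readers unfamiliar with the pattern, the difference between collapsing every failure to NULL and propagating the specific error can be sketched in user-space C. The err_ptr()/ptr_err()/is_err() helpers below are simplified stand-ins for the kernel's ERR_PTR family, and demo_add_dev() and the two callers are hypothetical. The choice of -EINVAL for a too-small MTU is an assumption made for illustration, not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's ERR_PTR helpers. */
#define DEMO_MAX_ERRNO 4095

static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_dev {		/* hypothetical per-device state */
	int mtu;
};

static struct demo_dev demo_idev;

/* Hypothetical: refuse devices whose MTU is below 1280 (assumed -EINVAL). */
static struct demo_dev *demo_add_dev(int mtu)
{
	if (mtu < 1280)
		return err_ptr(-EINVAL);
	demo_idev.mtu = mtu;
	return &demo_idev;
}

/* Old behaviour: the error is collapsed to NULL and reported as -ENOBUFS. */
static int demo_add_addr_old(int mtu)
{
	struct demo_dev *idev = demo_add_dev(mtu);

	if (is_err(idev))
		idev = NULL;
	if (!idev)
		return -ENOBUFS;
	return 0;
}

/* New behaviour: the specific error value reaches the caller. */
static int demo_add_addr_new(int mtu)
{
	struct demo_dev *idev = demo_add_dev(mtu);

	if (is_err(idev))
		return (int)ptr_err(idev);
	return 0;
}
```

With an MTU of 1200, as in the quoted reproducer, the old path reports "No buffer space available" while the new path reports the underlying error.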
* Re: [PATCH net-next] net/rds: Whitelist rdma_cookie and rx_tstamp for usercopy
From: David Miller @ 2019-08-23 21:56 UTC
To: dag.moxnes; +Cc: santosh.shilimkar, netdev, linux-rdma, rds-devel
In-Reply-To: <1566568998-26222-1-git-send-email-dag.moxnes@oracle.com>
From: Dag Moxnes <dag.moxnes@oracle.com>
Date: Fri, 23 Aug 2019 16:03:18 +0200
> Add the RDMA cookie and RX timestamp to the usercopy whitelist.
>
> After the introduction of hardened usercopy whitelisting
> (https://lwn.net/Articles/727322/), a warning is displayed when the
> RDMA cookie or RX timestamp is copied to userspace:
>
> kernel: WARNING: CPU: 3 PID: 5750 at
> mm/usercopy.c:81 usercopy_warn+0x8e/0xa6
> [...]
> kernel: Call Trace:
> kernel: __check_heap_object+0xb8/0x11b
> kernel: __check_object_size+0xe3/0x1bc
> kernel: put_cmsg+0x95/0x115
> kernel: rds_recvmsg+0x43d/0x620 [rds]
> kernel: sock_recvmsg+0x43/0x4a
> kernel: ___sys_recvmsg+0xda/0x1e6
> kernel: ? __handle_mm_fault+0xcae/0xf79
> kernel: __sys_recvmsg+0x51/0x8a
> kernel: SyS_recvmsg+0x12/0x1c
> kernel: do_syscall_64+0x79/0x1ae
>
> When the whitelisting feature was introduced, the memory for the RDMA
> cookie and RX timestamp in RDS was not added to the whitelist, causing
> the warning above.
>
> Signed-off-by: Dag Moxnes <dag.moxnes@oracle.com>
> Tested-by: jenny.x.xu@oracle.com
Applied, with tested-by tag fixed.
Thanks.
* Re: [PATCH net-next] drop_monitor: Make timestamps y2038 safe
From: David Miller @ 2019-08-23 21:58 UTC
To: idosch; +Cc: netdev, nhorman, arnd, andrew, ayal, mlxsw, idosch
In-Reply-To: <20190823154721.9927-1-idosch@idosch.org>
From: Ido Schimmel <idosch@idosch.org>
Date: Fri, 23 Aug 2019 18:47:21 +0300
> From: Ido Schimmel <idosch@mellanox.com>
>
> Timestamps are currently communicated to user space as 'struct
> timespec', which is not considered y2038 safe since it uses a 32-bit
> signed value for seconds.
>
> Fix this while the API is still not part of any official kernel release
> by using 64-bit nanoseconds timestamps instead.
>
> Fixes: ca30707dee2b ("drop_monitor: Add packet alert mode")
> Fixes: 5e58109b1ea4 ("drop_monitor: Add support for packet alert mode for hardware drops")
> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Applied, thanks Ido.
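
The arithmetic behind the y2038 concern can be sketched as below. This is not the drop_monitor code; demo_to_ns() and demo_fits_in_s32() are hypothetical helpers that only show why a 32-bit signed seconds field runs out in January 2038 while a single 64-bit nanosecond count does not.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000LL

/* Pack seconds and nanoseconds into one 64-bit nanosecond count. */
static inline int64_t demo_to_ns(int64_t sec, long nsec)
{
	return sec * DEMO_NSEC_PER_SEC + nsec;
}

/* Would this seconds value still fit in a 32-bit signed field? */
static inline int demo_fits_in_s32(int64_t sec)
{
	return sec >= INT32_MIN && sec <= INT32_MAX;
}
```

INT32_MAX seconds after the epoch is 2038-01-19 03:14:07 UTC; one second later no longer fits in 32 bits, whereas the same instant expressed as signed 64-bit nanoseconds is still far from overflow.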
* Re: [PATCH net] ipv4: mpls: fix mpls_xmit for iptunnel
From: David Miller @ 2019-08-23 22:12 UTC
To: dsahern; +Cc: alexey.kodanev, netdev
In-Reply-To: <38b351be-b24e-cb05-7c93-74134796a9d7@gmail.com>
From: David Ahern <dsahern@gmail.com>
Date: Fri, 23 Aug 2019 13:59:05 -0400
> I am traveling today and doubt I will be able to take a deep look at
> this until Monday.
I'll wait until you've had a chance to review this properly.
* Re: [PATCH net] Revert "r8169: remove not needed call to dma_sync_single_for_device"
From: David Miller @ 2019-08-23 22:12 UTC
To: hkallweit1; +Cc: nic_swsd, netdev, aaro.koskinen
In-Reply-To: <573e5947-3a12-f69d-d1b3-1b0d1c49f367@gmail.com>
From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Fri, 23 Aug 2019 19:57:49 +0200
> This reverts commit f072218cca5b076dd99f3dfa3aaafedfd0023a51.
>
> As reported by Aaro this patch causes network problems on
> MIPS Loongson platform. Therefore revert it.
>
> Fixes: f072218cca5b ("r8169: remove not needed call to dma_sync_single_for_device")
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Applied.
* Re: [PATCH net-next] r8169: fix DMA issue on MIPS platform
From: David Miller @ 2019-08-23 22:12 UTC
To: hkallweit1; +Cc: nic_swsd, aaro.koskinen, netdev
In-Reply-To: <c732685d-591c-3dca-95b8-1207bdf0d37f@gmail.com>
From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Fri, 23 Aug 2019 20:07:26 +0200
> As reported by Aaro this patch causes network problems on
> MIPS Loongson platform. Therefore revert it.
>
> Fixes: f072218cca5b ("r8169: remove not needed call to dma_sync_single_for_device")
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Applied.
* Re: [PATCH v4 net-next 4/7] ip6tlvs: Registration of TLV handlers and parameters
From: David Miller @ 2019-08-23 22:16 UTC
To: tom; +Cc: netdev, tom
In-Reply-To: <1566587643-16594-5-git-send-email-tom@herbertland.com>
From: Tom Herbert <tom@herbertland.com>
Date: Fri, 23 Aug 2019 12:14:00 -0700
> int off, enum ipeh_parse_errors error))
> {
> const unsigned char *nh = skb_network_header(skb);
> - const struct tlvtype_proc *curr;
> + const struct tlv_proc *curr;
> bool disallow_unknowns = false;
> int tlv_count = 0;
> int padlen = 0;
Please retain the reverse christmas tree ordering here.
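
"Reverse christmas tree" means local variable declarations are ordered from the longest line to the shortest. In the quoted hunk the renamed declaration became shorter than the line below it, which breaks that ordering. A minimal sketch of the convention follows; struct demo_tlv_proc, demo_lookup() and demo_count_tlvs() are hypothetical and unrelated to the real ip6tlvs code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_tlv_proc {		/* hypothetical stand-in type */
	int type;
};

static const struct demo_tlv_proc demo_known = { .type = 1 };

static const struct demo_tlv_proc *demo_lookup(unsigned char type)
{
	return type == demo_known.type ? &demo_known : NULL;
}

/* Count (type, length, value) options in buf; -1 on a malformed option. */
static int demo_count_tlvs(const unsigned char *buf, size_t len)
{
	/* Declarations ordered longest line first, shortest line last. */
	const struct demo_tlv_proc *curr = NULL;
	const unsigned char *end = buf + len;
	bool disallow_unknowns = false;
	int tlv_count = 0;

	while (end - buf >= 2) {
		size_t optlen = (size_t)buf[1] + 2;

		if (optlen > (size_t)(end - buf))
			return -1;
		curr = demo_lookup(buf[0]);
		if (!curr && disallow_unknowns)
			return -1;
		tlv_count++;
		buf += optlen;
	}
	return tlv_count;
}
```

Applied to the quoted hunk, the shortened "curr" declaration would move below "disallow_unknowns" to keep the ordering.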
* Re: [PATCH] net/mlx5: fix a -Wstringop-truncation warning
From: David Miller @ 2019-08-23 22:18 UTC
To: cai; +Cc: saeedm, leon, moshe, ferasda, eranbe, netdev, linux-rdma,
linux-kernel
In-Reply-To: <1566590183-9898-1-git-send-email-cai@lca.pw>
Saeed, I assume I'll get this from you.
* Re: [PATCH v2] riscv: add support for SECCOMP and SECCOMP_FILTER
From: Carlos Eduardo de Paula @ 2019-08-23 22:54 UTC
To: David Abdurachmanov
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Oleg Nesterov,
Kees Cook, Andy Lutomirski, Will Drewry, Shuah Khan,
Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
Yonghong Song, David Abdurachmanov, Thomas Gleixner,
Allison Randal, Alexios Zavras, Anup Patel, Vincent Chen,
Alan Kao, linux-riscv, linux-kernel, linux-kselftest, netdev, bpf
In-Reply-To: <20190822205533.4877-1-david.abdurachmanov@sifive.com>
On Thu, Aug 22, 2019 at 5:56 PM David Abdurachmanov
<david.abdurachmanov@gmail.com> wrote:
>
> This patch was extensively tested on Fedora/RISCV (applied by default on
> top of 5.2-rc7 kernel for <2 months). The patch was also tested with 5.3-rc
> on QEMU and SiFive Unleashed board.
>
> libseccomp (userspace) was rebased:
> https://github.com/seccomp/libseccomp/pull/134
>
> Fully passes libseccomp regression testing (simulation and live).
>
> There is one failing kernel selftest: global.user_notification_signal
>
> v1 -> v2:
> - return immediately if secure_computing(NULL) returns -1
> - fixed whitespace issues
> - add missing seccomp.h
> - remove patch #2 (solved now)
> - add riscv to seccomp kernel selftest
>
> Cc: keescook@chromium.org
> Cc: me@carlosedp.com
>
> Signed-off-by: David Abdurachmanov <david.abdurachmanov@sifive.com>
> ---
> arch/riscv/Kconfig | 14 ++++++++++
> arch/riscv/include/asm/seccomp.h | 10 +++++++
> arch/riscv/include/asm/thread_info.h | 5 +++-
> arch/riscv/kernel/entry.S | 27 +++++++++++++++++--
> arch/riscv/kernel/ptrace.c | 10 +++++++
> tools/testing/selftests/seccomp/seccomp_bpf.c | 8 +++++-
> 6 files changed, 70 insertions(+), 4 deletions(-)
> create mode 100644 arch/riscv/include/asm/seccomp.h
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 59a4727ecd6c..441e63ff5adc 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -31,6 +31,7 @@ config RISCV
> select GENERIC_SMP_IDLE_THREAD
> select GENERIC_ATOMIC64 if !64BIT
> select HAVE_ARCH_AUDITSYSCALL
> + select HAVE_ARCH_SECCOMP_FILTER
> select HAVE_MEMBLOCK_NODE_MAP
> select HAVE_DMA_CONTIGUOUS
> select HAVE_FUTEX_CMPXCHG if FUTEX
> @@ -235,6 +236,19 @@ menu "Kernel features"
>
> source "kernel/Kconfig.hz"
>
> +config SECCOMP
> + bool "Enable seccomp to safely compute untrusted bytecode"
> + help
> + This kernel feature is useful for number crunching applications
> + that may need to compute untrusted bytecode during their
> + execution. By using pipes or other transports made available to
> + the process as file descriptors supporting the read/write
> + syscalls, it's possible to isolate those applications in
> + their own address space using seccomp. Once seccomp is
> + enabled via prctl(PR_SET_SECCOMP), it cannot be disabled
> + and the task is only allowed to execute a few safe syscalls
> + defined by each seccomp mode.
> +
> endmenu
>
> menu "Boot options"
> diff --git a/arch/riscv/include/asm/seccomp.h b/arch/riscv/include/asm/seccomp.h
> new file mode 100644
> index 000000000000..bf7744ee3b3d
> --- /dev/null
> +++ b/arch/riscv/include/asm/seccomp.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef _ASM_SECCOMP_H
> +#define _ASM_SECCOMP_H
> +
> +#include <asm/unistd.h>
> +
> +#include <asm-generic/seccomp.h>
> +
> +#endif /* _ASM_SECCOMP_H */
> diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
> index 905372d7eeb8..a0b2a29a0da1 100644
> --- a/arch/riscv/include/asm/thread_info.h
> +++ b/arch/riscv/include/asm/thread_info.h
> @@ -75,6 +75,7 @@ struct thread_info {
> #define TIF_MEMDIE 5 /* is terminating due to OOM killer */
> #define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */
> #define TIF_SYSCALL_AUDIT 7 /* syscall auditing */
> +#define TIF_SECCOMP 8 /* syscall secure computing */
>
> #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
> #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
> @@ -82,11 +83,13 @@ struct thread_info {
> #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
> #define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
> #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
> +#define _TIF_SECCOMP (1 << TIF_SECCOMP)
>
> #define _TIF_WORK_MASK \
> (_TIF_NOTIFY_RESUME | _TIF_SIGPENDING | _TIF_NEED_RESCHED)
>
> #define _TIF_SYSCALL_WORK \
> - (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_AUDIT)
> + (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_AUDIT | \
> + _TIF_SECCOMP )
>
> #endif /* _ASM_RISCV_THREAD_INFO_H */
> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> index bc7a56e1ca6f..0bbedfa3e47d 100644
> --- a/arch/riscv/kernel/entry.S
> +++ b/arch/riscv/kernel/entry.S
> @@ -203,8 +203,25 @@ check_syscall_nr:
> /* Check to make sure we don't jump to a bogus syscall number. */
> li t0, __NR_syscalls
> la s0, sys_ni_syscall
> - /* Syscall number held in a7 */
> - bgeu a7, t0, 1f
> + /*
> + * The tracer can change syscall number to valid/invalid value.
> + * We use syscall_set_nr helper in syscall_trace_enter thus we
> + * cannot trust the current value in a7 and have to reload from
> + * the current task pt_regs.
> + */
> + REG_L a7, PT_A7(sp)
> + /*
> + * Syscall number held in a7.
> + * If syscall number is above allowed value, redirect to ni_syscall.
> + */
> + bge a7, t0, 1f
> + /*
> + * Check if syscall is rejected by tracer or seccomp, i.e., a7 == -1.
> + * If yes, we pretend it was executed.
> + */
> + li t1, -1
> + beq a7, t1, ret_from_syscall_rejected
> + /* Call syscall */
> la s0, sys_call_table
> slli t0, a7, RISCV_LGPTR
> add s0, s0, t0
> @@ -215,6 +232,12 @@ check_syscall_nr:
> ret_from_syscall:
> /* Set user a0 to kernel a0 */
> REG_S a0, PT_A0(sp)
> + /*
> + * We didn't execute the actual syscall.
> + * Seccomp already set return value for the current task pt_regs.
> + * (If it was configured with SECCOMP_RET_ERRNO/TRACE)
> + */
> +ret_from_syscall_rejected:
> /* Trace syscalls, but only if requested by the user. */
> REG_L t0, TASK_TI_FLAGS(tp)
> andi t0, t0, _TIF_SYSCALL_WORK
> diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c
> index 368751438366..63e47c9f85f0 100644
> --- a/arch/riscv/kernel/ptrace.c
> +++ b/arch/riscv/kernel/ptrace.c
> @@ -154,6 +154,16 @@ void do_syscall_trace_enter(struct pt_regs *regs)
> if (tracehook_report_syscall_entry(regs))
> syscall_set_nr(current, regs, -1);
>
> + /*
> + * Do the secure computing after ptrace; failures should be fast.
> + * If this fails we might have return value in a0 from seccomp
> + * (via SECCOMP_RET_ERRNO/TRACE).
> + */
> + if (secure_computing(NULL) == -1) {
> + syscall_set_nr(current, regs, -1);
> + return;
> + }
> +
> #ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS
> if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
> trace_sys_enter(regs, syscall_get_nr(current, regs));
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 6ef7f16c4cf5..492e0adad9d3 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -112,6 +112,8 @@ struct seccomp_data {
> # define __NR_seccomp 383
> # elif defined(__aarch64__)
> # define __NR_seccomp 277
> +# elif defined(__riscv)
> +# define __NR_seccomp 277
> # elif defined(__hppa__)
> # define __NR_seccomp 338
> # elif defined(__powerpc__)
> @@ -1582,6 +1584,10 @@ TEST_F(TRACE_poke, getpid_runs_normally)
> # define ARCH_REGS struct user_pt_regs
> # define SYSCALL_NUM regs[8]
> # define SYSCALL_RET regs[0]
> +#elif defined(__riscv) && __riscv_xlen == 64
> +# define ARCH_REGS struct user_regs_struct
> +# define SYSCALL_NUM a7
> +# define SYSCALL_RET a0
> #elif defined(__hppa__)
> # define ARCH_REGS struct user_regs_struct
> # define SYSCALL_NUM gr[20]
> @@ -1671,7 +1677,7 @@ void change_syscall(struct __test_metadata *_metadata,
> EXPECT_EQ(0, ret) {}
>
> #if defined(__x86_64__) || defined(__i386__) || defined(__powerpc__) || \
> - defined(__s390__) || defined(__hppa__)
> + defined(__s390__) || defined(__hppa__) || defined(__riscv)
> {
> regs.SYSCALL_NUM = syscall;
> }
> --
> 2.21.0
>
Tested-by: Carlos de Paula <me@carlosedp.com>
--
________________________________________
Carlos Eduardo de Paula
me@carlosedp.com
http://carlosedp.com
http://twitter.com/carlosedp
Linkedin
________________________________________
* Re: [PATCHv4 net 2/2] xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md mode
From: Jonathan Lemon @ 2019-08-23 22:30 UTC
To: Hangbin Liu
Cc: netdev, Stefano Brivio, wenxu, Alexei Starovoitov,
David S . Miller, Eric Dumazet, Julian Anastasov
In-Reply-To: <20190822141949.29561-3-liuhangbin@gmail.com>
On 22 Aug 2019, at 7:19, Hangbin Liu wrote:
> In decode_session{4,6} there is a possibility that the skb dst dev is NULL,
> e.g. with tunnel collect_md mode, which will cause a kernel crash.
> Here is what the code path looks like, for GRE:
>
> - ip6gre_tunnel_xmit
> - ip6gre_xmit_ipv6
> - __gre6_xmit
> - ip6_tnl_xmit
> - if skb->len - t->tun_hlen - eth_hlen > mtu; return -EMSGSIZE
> - icmpv6_send
> - icmpv6_route_lookup
> - xfrm_decode_session_reverse
> - decode_session4
> - oif = skb_dst(skb)->dev->ifindex; <-- here
> - decode_session6
> - oif = skb_dst(skb)->dev->ifindex; <-- here
>
> The reason is that __metadata_dst_init() initializes dst->dev to NULL by default.
> We could not fix it in __metadata_dst_init() as there is no dev supplied.
> On the other hand, the skb_dst(skb)->dev is actually not needed as we
> called decode_session{4,6} via xfrm_decode_session_reverse(), so oif is not
> used by: fl4->flowi4_oif = reverse ? skb->skb_iif : oif;
>
> So adding a dst dev check here should be clean and safe.
>
> v4: No changes.
>
> v3: No changes.
>
> v2: fix the issue in decode_session{4,6} instead of updating shared dst dev
> in {ip_md, ip6}_tunnel_xmit.
>
> Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels")
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Tested-by: Jonathan Lemon <jonathan.lemon@gmail.com>
This does resolve a local crash where the dev pointer is NULL.
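
The kind of guard described in the quoted commit message can be sketched in user-space C as follows. The struct and function names are hypothetical stand-ins, not the actual xfrm code; only the shape of the NULL check and the "reverse ? skb_iif : oif" selection are taken from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct demo_dev {
	int ifindex;
};

struct demo_dst {
	struct demo_dev *dev;	/* NULL for a metadata dst */
};

struct demo_skb {
	struct demo_dst *dst;
	int skb_iif;
};

/* Pick the flow's interface index without dereferencing a NULL dev. */
static int demo_flow_oif(const struct demo_skb *skb, int reverse)
{
	int oif = 0;

	if (skb->dst && skb->dst->dev)
		oif = skb->dst->dev->ifindex;

	return reverse ? skb->skb_iif : oif;
}
```

In the reverse case the output interface is never consulted, which is why leaving oif at 0 for a metadata dst is harmless there.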
* Re: [PATCH v2] riscv: add support for SECCOMP and SECCOMP_FILTER
From: Carlos Eduardo de Paula @ 2019-08-23 23:01 UTC
To: David Abdurachmanov
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Oleg Nesterov,
Kees Cook, Andy Lutomirski, Will Drewry, Shuah Khan,
Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
Yonghong Song, David Abdurachmanov, Thomas Gleixner,
Allison Randal, Alexios Zavras, Anup Patel, Vincent Chen,
Alan Kao, linux-riscv, linux-kernel, linux-kselftest, netdev, bpf
In-Reply-To: <20190822205533.4877-1-david.abdurachmanov@sifive.com>
On Thu, Aug 22, 2019 at 5:56 PM David Abdurachmanov
<david.abdurachmanov@gmail.com> wrote:
>
> This patch was extensively tested on Fedora/RISCV (applied by default on
> top of 5.2-rc7 kernel for <2 months). The patch was also tested with 5.3-rc
> on QEMU and SiFive Unleashed board.
>
> libseccomp (userspace) was rebased:
> https://github.com/seccomp/libseccomp/pull/134
>
> Fully passes libseccomp regression testing (simulation and live).
>
> There is one failing kernel selftest: global.user_notification_signal
>
> v1 -> v2:
> - return immediately if secure_computing(NULL) returns -1
> - fixed whitespace issues
> - add missing seccomp.h
> - remove patch #2 (solved now)
> - add riscv to seccomp kernel selftest
>
> Cc: keescook@chromium.org
> Cc: me@carlosedp.com
>
> Signed-off-by: David Abdurachmanov <david.abdurachmanov@sifive.com>
> Tested-by: Carlos de Paula <me@carlosedp.com>
> ---
> arch/riscv/Kconfig | 14 ++++++++++
> arch/riscv/include/asm/seccomp.h | 10 +++++++
> arch/riscv/include/asm/thread_info.h | 5 +++-
> arch/riscv/kernel/entry.S | 27 +++++++++++++++++--
> arch/riscv/kernel/ptrace.c | 10 +++++++
> tools/testing/selftests/seccomp/seccomp_bpf.c | 8 +++++-
> 6 files changed, 70 insertions(+), 4 deletions(-)
> create mode 100644 arch/riscv/include/asm/seccomp.h
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 59a4727ecd6c..441e63ff5adc 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -31,6 +31,7 @@ config RISCV
> select GENERIC_SMP_IDLE_THREAD
> select GENERIC_ATOMIC64 if !64BIT
> select HAVE_ARCH_AUDITSYSCALL
> + select HAVE_ARCH_SECCOMP_FILTER
> select HAVE_MEMBLOCK_NODE_MAP
> select HAVE_DMA_CONTIGUOUS
> select HAVE_FUTEX_CMPXCHG if FUTEX
> @@ -235,6 +236,19 @@ menu "Kernel features"
>
> source "kernel/Kconfig.hz"
>
> +config SECCOMP
> + bool "Enable seccomp to safely compute untrusted bytecode"
> + help
> + This kernel feature is useful for number crunching applications
> + that may need to compute untrusted bytecode during their
> + execution. By using pipes or other transports made available to
> + the process as file descriptors supporting the read/write
> + syscalls, it's possible to isolate those applications in
> + their own address space using seccomp. Once seccomp is
> + enabled via prctl(PR_SET_SECCOMP), it cannot be disabled
> + and the task is only allowed to execute a few safe syscalls
> + defined by each seccomp mode.
> +
> endmenu
>
> menu "Boot options"
> diff --git a/arch/riscv/include/asm/seccomp.h b/arch/riscv/include/asm/seccomp.h
> new file mode 100644
> index 000000000000..bf7744ee3b3d
> --- /dev/null
> +++ b/arch/riscv/include/asm/seccomp.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef _ASM_SECCOMP_H
> +#define _ASM_SECCOMP_H
> +
> +#include <asm/unistd.h>
> +
> +#include <asm-generic/seccomp.h>
> +
> +#endif /* _ASM_SECCOMP_H */
> diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
> index 905372d7eeb8..a0b2a29a0da1 100644
> --- a/arch/riscv/include/asm/thread_info.h
> +++ b/arch/riscv/include/asm/thread_info.h
> @@ -75,6 +75,7 @@ struct thread_info {
> #define TIF_MEMDIE 5 /* is terminating due to OOM killer */
> #define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */
> #define TIF_SYSCALL_AUDIT 7 /* syscall auditing */
> +#define TIF_SECCOMP 8 /* syscall secure computing */
>
> #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
> #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
> @@ -82,11 +83,13 @@ struct thread_info {
> #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
> #define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
> #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
> +#define _TIF_SECCOMP (1 << TIF_SECCOMP)
>
> #define _TIF_WORK_MASK \
> (_TIF_NOTIFY_RESUME | _TIF_SIGPENDING | _TIF_NEED_RESCHED)
>
> #define _TIF_SYSCALL_WORK \
> - (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_AUDIT)
> + (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_AUDIT | \
> + _TIF_SECCOMP )
>
> #endif /* _ASM_RISCV_THREAD_INFO_H */
> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> index bc7a56e1ca6f..0bbedfa3e47d 100644
> --- a/arch/riscv/kernel/entry.S
> +++ b/arch/riscv/kernel/entry.S
> @@ -203,8 +203,25 @@ check_syscall_nr:
> /* Check to make sure we don't jump to a bogus syscall number. */
> li t0, __NR_syscalls
> la s0, sys_ni_syscall
> - /* Syscall number held in a7 */
> - bgeu a7, t0, 1f
> + /*
> + * The tracer can change syscall number to valid/invalid value.
> + * We use syscall_set_nr helper in syscall_trace_enter thus we
> + * cannot trust the current value in a7 and have to reload from
> + * the current task pt_regs.
> + */
> + REG_L a7, PT_A7(sp)
> + /*
> + * Syscall number held in a7.
> + * If syscall number is above allowed value, redirect to ni_syscall.
> + */
> + bge a7, t0, 1f
> + /*
> + * Check if syscall is rejected by tracer or seccomp, i.e., a7 == -1.
> + * If yes, we pretend it was executed.
> + */
> + li t1, -1
> + beq a7, t1, ret_from_syscall_rejected
> + /* Call syscall */
> la s0, sys_call_table
> slli t0, a7, RISCV_LGPTR
> add s0, s0, t0
> @@ -215,6 +232,12 @@ check_syscall_nr:
> ret_from_syscall:
> /* Set user a0 to kernel a0 */
> REG_S a0, PT_A0(sp)
> + /*
> + * We didn't execute the actual syscall.
> + * Seccomp already set return value for the current task pt_regs.
> + * (If it was configured with SECCOMP_RET_ERRNO/TRACE)
> + */
> +ret_from_syscall_rejected:
> /* Trace syscalls, but only if requested by the user. */
> REG_L t0, TASK_TI_FLAGS(tp)
> andi t0, t0, _TIF_SYSCALL_WORK
> diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c
> index 368751438366..63e47c9f85f0 100644
> --- a/arch/riscv/kernel/ptrace.c
> +++ b/arch/riscv/kernel/ptrace.c
> @@ -154,6 +154,16 @@ void do_syscall_trace_enter(struct pt_regs *regs)
> if (tracehook_report_syscall_entry(regs))
> syscall_set_nr(current, regs, -1);
>
> + /*
> + * Do the secure computing after ptrace; failures should be fast.
> + * If this fails we might have return value in a0 from seccomp
> + * (via SECCOMP_RET_ERRNO/TRACE).
> + */
> + if (secure_computing(NULL) == -1) {
> + syscall_set_nr(current, regs, -1);
> + return;
> + }
> +
> #ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS
> if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
> trace_sys_enter(regs, syscall_get_nr(current, regs));
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 6ef7f16c4cf5..492e0adad9d3 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -112,6 +112,8 @@ struct seccomp_data {
> # define __NR_seccomp 383
> # elif defined(__aarch64__)
> # define __NR_seccomp 277
> +# elif defined(__riscv)
> +# define __NR_seccomp 277
> # elif defined(__hppa__)
> # define __NR_seccomp 338
> # elif defined(__powerpc__)
> @@ -1582,6 +1584,10 @@ TEST_F(TRACE_poke, getpid_runs_normally)
> # define ARCH_REGS struct user_pt_regs
> # define SYSCALL_NUM regs[8]
> # define SYSCALL_RET regs[0]
> +#elif defined(__riscv) && __riscv_xlen == 64
> +# define ARCH_REGS struct user_regs_struct
> +# define SYSCALL_NUM a7
> +# define SYSCALL_RET a0
> #elif defined(__hppa__)
> # define ARCH_REGS struct user_regs_struct
> # define SYSCALL_NUM gr[20]
> @@ -1671,7 +1677,7 @@ void change_syscall(struct __test_metadata *_metadata,
> EXPECT_EQ(0, ret) {}
>
> #if defined(__x86_64__) || defined(__i386__) || defined(__powerpc__) || \
> - defined(__s390__) || defined(__hppa__)
> + defined(__s390__) || defined(__hppa__) || defined(__riscv)
> {
> regs.SYSCALL_NUM = syscall;
> }
> --
> 2.21.0
>
Kernel selftests results:
➜ uname -a
Linux fedora-unleashed 5.2.0-rc7-30159-g2d072d4-dirty #3 SMP Thu Jul 4 20:18:21 -03 2019 riscv64 riscv64 riscv64 GNU/Linux
➜ sudo ./seccomp_bpf
[==========] Running 74 tests from 1 test cases.
[ RUN ] global.mode_strict_support
[ OK ] global.mode_strict_support
[ RUN ] global.mode_strict_cannot_call_prctl
[ OK ] global.mode_strict_cannot_call_prctl
[ RUN ] global.no_new_privs_support
[ OK ] global.no_new_privs_support
[ RUN ] global.mode_filter_support
[ OK ] global.mode_filter_support
[ RUN ] global.mode_filter_without_nnp
[ OK ] global.mode_filter_without_nnp
[ RUN ] global.filter_size_limits
[ OK ] global.filter_size_limits
[ RUN ] global.filter_chain_limits
[ OK ] global.filter_chain_limits
[ RUN ] global.mode_filter_cannot_move_to_strict
[ OK ] global.mode_filter_cannot_move_to_strict
[ RUN ] global.mode_filter_get_seccomp
[ OK ] global.mode_filter_get_seccomp
[ RUN ] global.ALLOW_all
[ OK ] global.ALLOW_all
[ RUN ] global.empty_prog
[ OK ] global.empty_prog
[ RUN ] global.log_all
[ OK ] global.log_all
[ RUN ] global.unknown_ret_is_kill_inside
[ OK ] global.unknown_ret_is_kill_inside
[ RUN ] global.unknown_ret_is_kill_above_allow
[ OK ] global.unknown_ret_is_kill_above_allow
[ RUN ] global.KILL_all
[ OK ] global.KILL_all
[ RUN ] global.KILL_one
[ OK ] global.KILL_one
[ RUN ] global.KILL_one_arg_one
[ OK ] global.KILL_one_arg_one
[ RUN ] global.KILL_one_arg_six
[ OK ] global.KILL_one_arg_six
[ RUN ] global.KILL_thread
[ OK ] global.KILL_thread
[ RUN ] global.KILL_process
[ OK ] global.KILL_process
[ RUN ] global.arg_out_of_range
[ OK ] global.arg_out_of_range
[ RUN ] global.ERRNO_valid
[ OK ] global.ERRNO_valid
[ RUN ] global.ERRNO_zero
[ OK ] global.ERRNO_zero
[ RUN ] global.ERRNO_capped
[ OK ] global.ERRNO_capped
[ RUN ] global.ERRNO_order
[ OK ] global.ERRNO_order
[ RUN ] TRAP.dfl
[ OK ] TRAP.dfl
[ RUN ] TRAP.ign
[ OK ] TRAP.ign
[ RUN ] TRAP.handler
[ OK ] TRAP.handler
[ RUN ] precedence.allow_ok
[ OK ] precedence.allow_ok
[ RUN ] precedence.kill_is_highest
[ OK ] precedence.kill_is_highest
[ RUN ] precedence.kill_is_highest_in_any_order
[ OK ] precedence.kill_is_highest_in_any_order
[ RUN ] precedence.trap_is_second
[ OK ] precedence.trap_is_second
[ RUN ] precedence.trap_is_second_in_any_order
[ OK ] precedence.trap_is_second_in_any_order
[ RUN ] precedence.errno_is_third
[ OK ] precedence.errno_is_third
[ RUN ] precedence.errno_is_third_in_any_order
[ OK ] precedence.errno_is_third_in_any_order
[ RUN ] precedence.trace_is_fourth
[ OK ] precedence.trace_is_fourth
[ RUN ] precedence.trace_is_fourth_in_any_order
[ OK ] precedence.trace_is_fourth_in_any_order
[ RUN ] precedence.log_is_fifth
[ OK ] precedence.log_is_fifth
[ RUN ] precedence.log_is_fifth_in_any_order
[ OK ] precedence.log_is_fifth_in_any_order
[ RUN ] TRACE_poke.read_has_side_effects
[ OK ] TRACE_poke.read_has_side_effects
[ RUN ] TRACE_poke.getpid_runs_normally
[ OK ] TRACE_poke.getpid_runs_normally
[ RUN ] TRACE_syscall.ptrace_syscall_redirected
[ OK ] TRACE_syscall.ptrace_syscall_redirected
[ RUN ] TRACE_syscall.ptrace_syscall_errno
[ OK ] TRACE_syscall.ptrace_syscall_errno
[ RUN ] TRACE_syscall.ptrace_syscall_faked
[ OK ] TRACE_syscall.ptrace_syscall_faked
[ RUN ] TRACE_syscall.syscall_allowed
[ OK ] TRACE_syscall.syscall_allowed
[ RUN ] TRACE_syscall.syscall_redirected
[ OK ] TRACE_syscall.syscall_redirected
[ RUN ] TRACE_syscall.syscall_errno
[ OK ] TRACE_syscall.syscall_errno
[ RUN ] TRACE_syscall.syscall_faked
[ OK ] TRACE_syscall.syscall_faked
[ RUN ] TRACE_syscall.skip_after_RET_TRACE
[ OK ] TRACE_syscall.skip_after_RET_TRACE
[ RUN ] TRACE_syscall.kill_after_RET_TRACE
[ OK ] TRACE_syscall.kill_after_RET_TRACE
[ RUN ] TRACE_syscall.skip_after_ptrace
[ OK ] TRACE_syscall.skip_after_ptrace
[ RUN ] TRACE_syscall.kill_after_ptrace
[ OK ] TRACE_syscall.kill_after_ptrace
[ RUN ] global.seccomp_syscall
[ OK ] global.seccomp_syscall
[ RUN ] global.seccomp_syscall_mode_lock
[ OK ] global.seccomp_syscall_mode_lock
[ RUN ] global.detect_seccomp_filter_flags
[ OK ] global.detect_seccomp_filter_flags
[ RUN ] global.TSYNC_first
[ OK ] global.TSYNC_first
[ RUN ] TSYNC.siblings_fail_prctl
[ OK ] TSYNC.siblings_fail_prctl
[ RUN ] TSYNC.two_siblings_with_ancestor
[ OK ] TSYNC.two_siblings_with_ancestor
[ RUN ] TSYNC.two_sibling_want_nnp
[ OK ] TSYNC.two_sibling_want_nnp
[ RUN ] TSYNC.two_siblings_with_no_filter
[ OK ] TSYNC.two_siblings_with_no_filter
[ RUN ] TSYNC.two_siblings_with_one_divergence
[ OK ] TSYNC.two_siblings_with_one_divergence
[ RUN ] TSYNC.two_siblings_not_under_filter
[ OK ] TSYNC.two_siblings_not_under_filter
[ RUN ] global.syscall_restart
[ OK ] global.syscall_restart
[ RUN ] global.filter_flag_log
[ OK ] global.filter_flag_log
[ RUN ] global.get_action_avail
[ OK ] global.get_action_avail
[ RUN ] global.get_metadata
[ OK ] global.get_metadata
[ RUN ] global.user_notification_basic
[ OK ] global.user_notification_basic
[ RUN ] global.user_notification_kill_in_middle
[ OK ] global.user_notification_kill_in_middle
[ RUN ] global.user_notification_signal
[1] 5951 alarm sudo ./seccomp_bpf
carlosedp in ~ at fedora-unleashed
➜ sudo ./seccomp_benchmark
Calibrating reasonable sample size...
1564584448.964538790 - 1564584448.964529687 = 9103
1564584448.964588859 - 1564584448.964575204 = 13655
1564584448.964631342 - 1564584448.964604790 = 26552
1564584448.964710239 - 1564584448.964644997 = 65242
1564584448.964842239 - 1564584448.964726928 = 115311
1564584448.965072859 - 1564584448.964857411 = 215448
1564584448.965513618 - 1564584448.965089549 = 424069
1564584448.966417894 - 1564584448.965532584 = 885310
1564584448.968286377 - 1564584448.966443687 = 1842690
1564584448.971667549 - 1564584448.968314446 = 3353103
1564584448.978288790 - 1564584448.971694101 = 6594689
1564584448.991803618 - 1564584448.978313066 = 13490552
1564584449.017692308 - 1564584448.991836239 = 25856069
1564584449.069651756 - 1564584449.017713549 = 51938207
1564584449.173110928 - 1564584449.069673756 = 103437172
1564584449.380001204 - 1564584449.173132928 = 206868276
1564584449.793857618 - 1564584449.380041411 = 413816207
1564584450.625367342 - 1564584449.793898584 = 831468758
1564584452.299529411 - 1564584450.625426514 = 1674102897
1564584455.665938307 - 1564584452.299592376 = 3366345931
1564584462.331777479 - 1564584455.665973962 = 6665803517
Benchmarking 33554432 samples...
18.107882743 - 12.075641371 = 6032241372
getpid native: 179 ns
34.720410331 - 18.107978605 = 16612431726
getpid RET_ALLOW: 495 ns
Estimated seccomp overhead per syscall: 316 ns
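The summary lines above follow from the two measured totals and the sample count. A minimal sketch of that arithmetic, assuming the tool divides each total (in nanoseconds) by the number of samples and truncates the result; the function names are illustrative and are not taken from seccomp_benchmark itself:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Average cost of one call in ns, rounded down (integer division). */
uint64_t per_call_ns(uint64_t total_ns, uint64_t samples)
{
	assert(samples > 0);
	return total_ns / samples;
}

/*
 * Print the three summary lines from the two measured totals and
 * return the estimated per-syscall overhead in ns.
 */
uint64_t report_overhead(uint64_t native_total_ns, uint64_t filtered_total_ns,
			 uint64_t samples)
{
	uint64_t native = per_call_ns(native_total_ns, samples);
	uint64_t filtered = per_call_ns(filtered_total_ns, samples);

	printf("getpid native: %" PRIu64 " ns\n", native);
	printf("getpid RET_ALLOW: %" PRIu64 " ns\n", filtered);
	printf("Estimated seccomp overhead per syscall: %" PRIu64 " ns\n",
	       filtered - native);
	return filtered - native;
}
```

With the totals from this run, report_overhead(6032241372, 16612431726, 33554432) yields 179 ns native, 495 ns with the filter, and 316 ns of overhead.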
--
________________________________________
Carlos Eduardo de Paula
me@carlosedp.com
http://carlosedp.com
http://twitter.com/carlosedp
________________________________________
^ permalink raw reply
* Re: RFC: very rough draft of a bpf permission model
From: Andy Lutomirski @ 2019-08-23 23:09 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Andy Lutomirski, Daniel Borkmann, Song Liu, Kees Cook, Networking,
bpf, Alexei Starovoitov, Kernel Team, Lorenz Bauer, Jann Horn,
Greg KH, Linux API, LSM List, Chenbo Feng
In-Reply-To: <20190822232620.p5tql4rrlzlk35z7@ast-mbp.dhcp.thefacebook.com>
On Thu, Aug 22, 2019 at 4:26 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> You're proposing all of the above in addition to CAP_BPF, right?
> Otherwise I don't see how it addresses the use cases I kept
> explaining for the last few weeks.
None of my proposal is intended to exclude changes like CAP_BPF to
make privileged bpf() operations need less privilege. But I think
it's very hard to evaluate CAP_BPF without both a full description of
exactly what CAP_BPF would do and what at least one full example of a
user would look like.
I also think that users who want CAP_BPF should look at manipulating
their effective capability set instead. A daemon that wants to use
bpf() but otherwise minimize the chance of accidentally causing a
problem can use capset() to clear its effective and inheritable masks.
Then, each time it wants to call bpf(), it could re-add CAP_SYS_ADMIN
or CAP_NET_ADMIN to its effective set, call bpf(), and then clear its
effective set again. This works in current kernels and is generally
good practice.
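A rough sketch of that raise, call, drop pattern, using the raw capget(2)/capset(2) syscalls so it does not depend on libcap. The helper names (caps_get, caps_set, caps_drop_effective, with_cap) are made up for illustration, and the callback stands in for whatever bpf() call the daemon wants to make. Note that capset() only affects the calling thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/capability.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Read the calling thread's capability sets (v3 layout: two 32-bit words). */
int caps_get(struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3])
{
	struct __user_cap_header_struct hdr = {
		.version = _LINUX_CAPABILITY_VERSION_3,
		.pid = 0,	/* 0 means the calling thread */
	};

	return (int)syscall(SYS_capget, &hdr, data);
}

/* Write the calling thread's capability sets. */
int caps_set(struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3])
{
	struct __user_cap_header_struct hdr = {
		.version = _LINUX_CAPABILITY_VERSION_3,
		.pid = 0,
	};

	return (int)syscall(SYS_capset, &hdr, data);
}

/* Clear the effective and inheritable sets; permitted is left untouched. */
int caps_drop_effective(void)
{
	struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];
	int i;

	if (caps_get(data))
		return -1;
	for (i = 0; i < _LINUX_CAPABILITY_U32S_3; i++) {
		data[i].effective = 0;
		data[i].inheritable = 0;
	}
	return caps_set(data);
}

/*
 * Run fn(arg) with `cap` raised in the effective set, then clear the
 * effective set again. Fails with EPERM if `cap` is not permitted.
 */
int with_cap(int cap, int (*fn)(void *), void *arg)
{
	struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];
	int ret, saved_errno;

	assert(cap >= 0 && cap <= CAP_LAST_CAP);
	if (caps_get(data))
		return -1;
	data[CAP_TO_INDEX(cap)].effective |= CAP_TO_MASK(cap);
	if (caps_set(data))
		return -1;

	ret = fn(arg);		/* e.g. a wrapper around the bpf() syscall */
	saved_errno = errno;
	caps_drop_effective();
	errno = saved_errno;
	return ret;
}
```

A daemon following this sketch would call caps_drop_effective() once at startup and wrap each privileged call as with_cap(CAP_SYS_ADMIN, fn, arg).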
Aside from this, and depending on exactly what CAP_BPF would be, I
have some further concerns. Looking at your example in this email:
> Here is another example of use case that CAP_BPF is solving:
> The daemon X is started by pid=1 and currently runs as root.
> It loads a bunch of tracing progs and attaches them to kprobes
> and tracepoints. It also loads cgroup-bpf progs and attaches them
> to cgroups. All progs are collecting data about the system and
> logging it for further analysis.
This needs more than just bpf(). Creating a perf kprobe event
requires CAP_SYS_ADMIN, and without a perf kprobe event, you can't
attach a bpf program. And the privilege to attach bpf programs to
cgroups without any DAC or MAC checks (which is what the current API
does) is an extremely broad privilege that is not that much weaker
than CAP_SYS_ADMIN or CAP_NET_ADMIN. Also:
> This tracing bpf is looking into kernel memory
> and using bpf_probe_read. Clearly it's not _secure_. But it's _safe_.
> The system is not going to crash because of BPF,
> but it can easily crash because of simple coding bugs in the user
> space bits of that daemon.
The BPF verifier and interpreter, taken in isolation, may be extremely
safe, but attaching BPF programs to various hooks can easily take down
the system, deliberately or by accident. A handler, especially if it
can access user memory or otherwise fault, will explode if attached to
an inappropriate kprobe, hw_breakpoint, or function entry trace event.
(I and the other maintainers consider this to be a bug if it happens,
and we'll fix it, but these bugs definitely exist.) A cgroup-bpf hook
that blocks all network traffic will effectively kill a machine,
especially if it's a server. A bpf program that runs excessively
slowly attached to a high-frequency hook will kill the system, too.
(I bet a buggy bpf program that calls bpf_probe_read() on an unmapped
address repeatedly could be made extremely slow. Page faults take
thousands to tens of thousands of cycles.) A bpf firewall rule that's
wrong can cut a machine off from the network -- I've killed machines
using iptables more than once, and bpf isn't magically safer.
Something finer-grained can mitigate some of this. CAP_BPF as I think
you're imagining it will not.
I'm wondering if something like CAP_TRACING would make sense.
CAP_TRACING would allow operations that can reveal kernel memory and
other secret kernel state but that do not, by design, allow modifying
system behavior. So, for example, CAP_TRACING would allow privileged
perf_event_open() operations and privileged bpf verifier usage. But
it would not allow cgroup-bpf unless further restrictions were added,
and it would not allow the *_BY_ID operations, as those can modify
other users' bpf programs' behavior.
(To get CAP_TRACING to work with cgroup-bpf, there could be a flag to
attach a "tracing" bpf program to a cgroup. This program would run in
addition to normal or MULTI programs, but it would not be allowed to
return a rejection result.)
^ permalink raw reply
* Re: [PATCH ipsec-next 0/7] ipsec: add TCP encapsulation support (RFC 8229)
From: Carl-Daniel Hailfinger @ 2019-08-23 23:17 UTC (permalink / raw)
To: Sabrina Dubroca; +Cc: herbert, netdev, steffen.klassert
In-Reply-To: <cover.1566395202.git.sd@queasysnail.net>
Hi!
On Wed, 21 Aug 2019 23:46:18 +0200, Sabrina Dubroca wrote:
> This patchset introduces support for TCP encapsulation of IKE and ESP
> messages, as defined by RFC 8229 [0]. It is an evolution of what
> Herbert Xu proposed in January 2018 [1] that addresses the main
> criticism against it, by not interfering with the TCP implementation
> at all. The networking stack now has infrastructure for this: TCP ULPs
> and Stream Parsers.
> [...]
Thank you very much for the patchset. Where I live, a substantial amount
of free and paid Wifi networks restrict UDP to port 53. TCP ports are
usually unaffected by such restrictions.
Running IKE/ESP over TCP is sometimes the only remaining option, and
this patch makes that option available.
> The main omission in this submission is IPv6 support. ESP
> encapsulation over UDP with IPv6 is currently not supported in the
> kernel either, as UDP encapsulation is aimed at NAT traversal, and NAT
> is not frequently used with IPv6.
Side note: The lack of support for ESP over UDP with IPv6 is the reason
why third-party Android IPsec management apps (e.g. the strongswan app)
can't connect to IPv6-only remote endpoints. AFAIK Android apps do not
have permission to send ESP packets directly, whereas establishing TCP
connections and sending UDP datagrams is permitted. But even without
IPv6 support, this patch is a great step forward.
Regards,
Carl-Daniel
^ permalink raw reply
* Re: [PATCH 0/3] Add NETIF_F_HW_BRIDGE feature
From: Florian Fainelli @ 2019-08-23 23:25 UTC (permalink / raw)
To: Horatiu Vultur, roopa, nikolay, davem, UNGLinuxDriver,
alexandre.belloni, allan.nielsen, netdev, linux-kernel, bridge
In-Reply-To: <1566500850-6247-1-git-send-email-horatiu.vultur@microchip.com>
On 8/22/19 12:07 PM, Horatiu Vultur wrote:
> Current implementation of the SW bridge is setting the interfaces in
> promisc mode when they are added to bridge if learning of the frames is
> enabled.
> In case of Ocelot which has HW capabilities to switch frames, it is not
> needed to set the ports in promisc mode because the HW already capable of
> doing that. Therefore add NETIF_F_HW_BRIDGE feature to indicate that the
> HW has bridge capabilities. Therefore the SW bridge doesn't need to set
> the ports in promisc mode to do the switching.
Then do not do anything when the ndo_set_rx_mode() for the ocelot
network device is called and indicates that IFF_PROMISC is set and that
your network port is a bridge port member. That is what mlxsw does AFAICT.
As others pointed out, the Linux bridge implements a software bridge by
default, and because it needs to operate on a wide variety of network
devices, all with different capabilities, the easiest way to make sure
that all management (IGMP, BPDU, etc.) as well as non-management
traffic can make it to the bridge ports, is to put the network devices
in promiscuous mode. If this is suboptimal for you, you can take
shortcuts in your driver that do not hinder the overall functionality.
> This optimization takes places only if all the interfaces that are part
> of the bridge have this flag and have the same network driver.
>
> If the bridge interfaces is added in promisc mode then also the ports part
> of the bridge are set in promisc mode.
>
> Horatiu Vultur (3):
> net: Add HW_BRIDGE offload feature
> net: mscc: Use NETIF_F_HW_BRIDGE
> net: mscc: Implement promisc mode.
>
> drivers/net/ethernet/mscc/ocelot.c | 26 ++++++++++++++++++++++++--
> include/linux/netdev_features.h | 3 +++
> net/bridge/br_if.c | 29 ++++++++++++++++++++++++++++-
> net/core/ethtool.c | 1 +
> 4 files changed, 56 insertions(+), 3 deletions(-)
>
--
Florian
^ permalink raw reply
* Re: [PATCH 1/3] net: Add HW_BRIDGE offload feature
From: Florian Fainelli @ 2019-08-23 23:30 UTC (permalink / raw)
To: Horatiu Vultur, Andrew Lunn
Cc: roopa, nikolay, davem, UNGLinuxDriver, alexandre.belloni,
allan.nielsen, netdev, linux-kernel, bridge
In-Reply-To: <20190823123929.ta4ikozz7jwkwbo2@soft-dev3.microsemi.net>
On 8/23/19 5:39 AM, Horatiu Vultur wrote:
> The 08/22/2019 22:08, Andrew Lunn wrote:
>> External E-Mail
>>
>>
>>> +/* Determin if the SW bridge can be offloaded to HW. Return true if all
>>> + * the interfaces of the bridge have the feature NETIF_F_HW_SWITCHDEV set
>>> + * and have the same netdev_ops.
>>> + */
>>
>> Hi Horatiu
>>
>> Why do you need these restrictions. The HW bridge should be able to
>> learn that a destination MAC address can be reached via the SW
>> bridge. The software bridge can then forward it out the correct
>> interface.
>>
>> Or are you saying your hardware cannot learn from frames which come
>> from the CPU?
>>
>> Andrew
>>
> Hi Andrew,
>
> I do not believe that our HW can learn from frames which comes from the
> CPU, at least not in the way they are injected today. But in case of Ocelot
> (and the next chip we are working on), we have other issues in mixing with
> foreign interfaces which is why we have the check in
> ocelot_netdevice_dev_check.
>
> More important, as we responded to Nikolay, we properly introduced this
> restriction for the wrong reasons.
>
> In SW bridge I will remove all these restrictions and only set ports in
> promisc mode only if NETIF_F_HW_BRIDGE is not set.
> Then in the network driver I can see if a foreign interface is added to
> the bridge, and when that happens I can set the port in promisc mode.
> Then the frames will be flooded to the SW bridge which eventually will
> send to the foreign interface.
Is that really necessary? Is not the skb->fwd_offload_mark as well as
the phys_switch_id supposed to tell that information to the bridge already?
--
Florian
^ permalink raw reply
* Re: [PATCH bpf] flow_dissector: Fix potential use-after-free on BPF_PROG_DETACH
From: Daniel Borkmann @ 2019-08-23 23:34 UTC (permalink / raw)
To: Jakub Sitnicki, bpf
Cc: netdev, kernel-team, Petar Penkov, Willem de Bruijn, Lorenz Bauer
In-Reply-To: <20190821121720.22009-1-jakub@cloudflare.com>
On 8/21/19 2:17 PM, Jakub Sitnicki wrote:
> Call to bpf_prog_put(), with help of call_rcu(), queues an RCU-callback to
> free the program once a grace period has elapsed. The callback can run
> together with new RCU readers that started after the last grace period.
> New RCU readers can potentially see the "old" to-be-freed or already-freed
> pointer to the program object before the RCU update-side NULLs it.
>
> Reorder the operations so that the RCU update-side resets the protected
> pointer before the end of the grace period after which the program will be
> freed.
>
> Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook")
> Reported-by: Lorenz Bauer <lmb@cloudflare.com>
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Applied, thanks!
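The reordering described in the quoted commit message can be modeled in userspace as follows. This is only a sketch of the ordering, under loud assumptions: C11 atomics stand in for the RCU-protected pointer, all names are invented, and there is no grace period here, so the caller of model_detach() is responsible for releasing the object only once no reader can still hold it (the job call_rcu() does in the kernel).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an attached program object. */
struct model_prog {
	int id;
};

/* The published pointer that readers look at (NULL when detached). */
static _Atomic(struct model_prog *) attached_prog;

/* Reader side: take a snapshot of the currently published program. */
struct model_prog *model_lookup(void)
{
	return atomic_load_explicit(&attached_prog, memory_order_acquire);
}

/* Publish a program; fails if one is already attached. */
int model_attach(struct model_prog *prog)
{
	struct model_prog *expected = NULL;

	assert(prog != NULL);
	if (!atomic_compare_exchange_strong_explicit(&attached_prog, &expected,
						     prog, memory_order_acq_rel,
						     memory_order_acquire))
		return -1;
	return 0;
}

/*
 * Detach: unpublish first, release second. After the exchange, new
 * readers can only observe NULL, so the returned object may be handed
 * to deferred freeing without new readers picking it up.
 */
struct model_prog *model_detach(void)
{
	return atomic_exchange_explicit(&attached_prog,
					(struct model_prog *)NULL,
					memory_order_acq_rel);
}
```

Doing it the other way round (queueing the free, then clearing the pointer) leaves a window in which a new reader can still load the pointer to an object that is already on its way to being freed.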
^ permalink raw reply
* Re: [PATCH bpf] bpf: fix precision tracking in presence of bpf2bpf calls
From: Daniel Borkmann @ 2019-08-23 23:35 UTC (permalink / raw)
To: Alexei Starovoitov, davem; +Cc: netdev, bpf, kernel-team
In-Reply-To: <20190821210710.1276117-1-ast@kernel.org>
On 8/21/19 11:07 PM, Alexei Starovoitov wrote:
> While adding extra tests for precision tracking and extra infra
> to adjust verifier heuristics the existing test
> "calls: cross frame pruning - liveness propagation" started to fail.
> The root cause is the same as described in verifier.c comment:
>
> * Also if parent's curframe > frame where backtracking started,
> * the verifier need to mark registers in both frames, otherwise callees
> * may incorrectly prune callers. This is similar to
> * commit 7640ead93924 ("bpf: verifier: make sure callees don't prune with caller differences")
> * For now backtracking falls back into conservative marking.
>
> Turned out though that returning -ENOTSUPP from backtrack_insn() and
> doing mark_all_scalars_precise() in the current parentage chain is not enough.
> Depending on how is_state_visited() heuristic is creating parentage chain
> it's possible that callee will incorrectly prune caller.
> Fix the issue by setting precise=true earlier and more aggressively.
> Before this fix the precision tracking _within_ functions that don't do
> bpf2bpf calls would still work. Whereas now precision tracking is completely
> disabled when bpf2bpf calls are present anywhere in the program.
>
> No difference in cilium tests (they don't have bpf2bpf calls).
> No difference in test_progs though some of them have bpf2bpf calls,
> but precision tracking wasn't effective there.
>
> Fixes: b5dc0163d8fd ("bpf: precise scalar_value tracking")
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Applied, thanks!
^ permalink raw reply
* [net-next 02/14] ice: Account for all states of FW DCBx and LLDP
From: Jeff Kirsher @ 2019-08-23 23:37 UTC (permalink / raw)
To: davem
Cc: Dave Ertman, netdev, nhorman, sassmann, Tony Nguyen,
Andrew Bowers, Jeff Kirsher
In-Reply-To: <20190823233750.7997-1-jeffrey.t.kirsher@intel.com>
From: Dave Ertman <david.m.ertman@intel.com>
Currently, only the DCBx status is taken into account to
determine if FW LLDP is possible. But there are NVM versions
coming out with DCBx enabled, and FW LLDP disabled. This
is causing errors where the driver sees that DCBx is not
disabled, and then tries to register for LLDP MIB change
events, and fails.
Change the logic to detect both DCBx and LLDP states in the
FW engine.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/ice/ice_dcb_lib.c | 34 +++++++-------------
1 file changed, 12 insertions(+), 22 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
index bf6cd4760a48..22bdc244c7e0 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
@@ -319,6 +319,11 @@ void ice_dcb_rebuild(struct ice_pf *pf)
}
ice_init_dcb(&pf->hw);
+ if (pf->hw.port_info->dcbx_status == ICE_DCBX_STATUS_DIS)
+ pf->hw.port_info->is_sw_lldp = true;
+ else
+ pf->hw.port_info->is_sw_lldp = false;
+
if (ice_dcb_need_recfg(pf, prev_cfg, local_dcbx_cfg)) {
/* difference in cfg detected - disable DCB till next MIB */
dev_err(&pf->pdev->dev, "Set local MIB not accurate\n");
@@ -440,35 +445,17 @@ int ice_init_pf_dcb(struct ice_pf *pf, bool locked)
struct device *dev = &pf->pdev->dev;
struct ice_port_info *port_info;
struct ice_hw *hw = &pf->hw;
- int sw_default = 0;
int err;
port_info = hw->port_info;
err = ice_init_dcb(hw);
if (err) {
- /* FW LLDP is not active, default to SW DCBX/LLDP */
- dev_info(&pf->pdev->dev, "FW LLDP is not active\n");
- hw->port_info->dcbx_status = ICE_DCBX_STATUS_NOT_STARTED;
- hw->port_info->is_sw_lldp = true;
- }
-
- if (port_info->dcbx_status == ICE_DCBX_STATUS_DIS)
- dev_info(&pf->pdev->dev, "DCBX disabled\n");
-
- /* LLDP disabled in FW */
- if (port_info->is_sw_lldp) {
- sw_default = 1;
- dev_info(&pf->pdev->dev, "DCBx/LLDP in SW mode.\n");
+ /* FW LLDP is disabled, activate SW DCBX/LLDP mode */
+ dev_info(&pf->pdev->dev,
+ "FW LLDP is disabled, DCBx/LLDP in SW mode.\n");
+ port_info->is_sw_lldp = true;
clear_bit(ICE_FLAG_ENABLE_FW_LLDP, pf->flags);
- } else {
- set_bit(ICE_FLAG_ENABLE_FW_LLDP, pf->flags);
- }
-
- if (port_info->dcbx_status == ICE_DCBX_STATUS_NOT_STARTED)
- dev_info(&pf->pdev->dev, "DCBX not started\n");
-
- if (sw_default) {
err = ice_dcb_sw_dflt_cfg(pf, locked);
if (err) {
dev_err(&pf->pdev->dev,
@@ -483,6 +470,9 @@ int ice_init_pf_dcb(struct ice_pf *pf, bool locked)
return 0;
}
+ port_info->is_sw_lldp = false;
+ set_bit(ICE_FLAG_ENABLE_FW_LLDP, pf->flags);
+
/* DCBX in FW and LLDP enabled in FW */
pf->dcbx_cap = DCB_CAP_DCBX_LLD_MANAGED | DCB_CAP_DCBX_VER_IEEE;
--
2.21.0
^ permalink raw reply related
* [net-next 01/14] ice: Allow egress control packets from PF_VSI
From: Jeff Kirsher @ 2019-08-23 23:37 UTC (permalink / raw)
To: davem; +Cc: Dave Ertman, netdev, nhorman, sassmann, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190823233750.7997-1-jeffrey.t.kirsher@intel.com>
From: Dave Ertman <david.m.ertman@intel.com>
For control packets (i.e. LLDP packets) to be able to egress
from the main VSI, a bit has to be set in the TX_descriptor.
This should only be done for the main VSI and only if the
FW LLDP agent is disabled. A bit to allow this also has to
be set in the VSI context.
Add the logic to add the necessary bits in the VSI context
for the PF_VSI and the TX_descriptors for control packets
egressing the PF_VSI.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/ice/ice_lib.c | 7 +++++++
drivers/net/ethernet/intel/ice/ice_txrx.c | 11 ++++++++++-
2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 6e34c40e7840..d6279dfe029e 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -1010,6 +1010,13 @@ static int ice_vsi_init(struct ice_vsi *vsi)
ICE_AQ_VSI_SEC_FLAG_ENA_MAC_ANTI_SPOOF;
}
+ /* Allow control frames out of main VSI */
+ if (vsi->type == ICE_VSI_PF) {
+ ctxt->info.sec_flags |= ICE_AQ_VSI_SEC_FLAG_ALLOW_DEST_OVRD;
+ ctxt->info.valid_sections |=
+ cpu_to_le16(ICE_AQ_VSI_PROP_SECURITY_VALID);
+ }
+
ret = ice_add_vsi(hw, vsi->idx, ctxt, NULL);
if (ret) {
dev_err(&pf->pdev->dev,
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index e5c4c9139e54..5bf5c179a738 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -2106,6 +2106,7 @@ static netdev_tx_t
ice_xmit_frame_ring(struct sk_buff *skb, struct ice_ring *tx_ring)
{
struct ice_tx_offload_params offload = { 0 };
+ struct ice_vsi *vsi = tx_ring->vsi;
struct ice_tx_buf *first;
unsigned int count;
int tso, csum;
@@ -2153,7 +2154,15 @@ ice_xmit_frame_ring(struct sk_buff *skb, struct ice_ring *tx_ring)
if (csum < 0)
goto out_drop;
- if (tso || offload.cd_tunnel_params) {
+ /* allow CONTROL frames egress from main VSI if FW LLDP disabled */
+ if (unlikely(skb->priority == TC_PRIO_CONTROL &&
+ vsi->type == ICE_VSI_PF &&
+ vsi->port_info->is_sw_lldp))
+ offload.cd_qw1 |= (u64)(ICE_TX_DESC_DTYPE_CTX |
+ ICE_TX_CTX_DESC_SWTCH_UPLINK <<
+ ICE_TXD_CTX_QW1_CMD_S);
+
+ if (offload.cd_qw1 & ICE_TX_DESC_DTYPE_CTX) {
struct ice_tx_ctx_desc *cdesc;
int i = tx_ring->next_to_use;
--
2.21.0
^ permalink raw reply related
* [net-next 12/14] ice: update ethtool stats on-demand
From: Jeff Kirsher @ 2019-08-23 23:37 UTC (permalink / raw)
To: davem; +Cc: Bruce Allan, netdev, nhorman, sassmann, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190823233750.7997-1-jeffrey.t.kirsher@intel.com>
From: Bruce Allan <bruce.w.allan@intel.com>
Users expect ethtool statistics to be updated on-demand when invoking
'ethtool -S <iface>' instead of providing a snapshot of statistics taken
once a second (the frequency of the watchdog task where stats are currently
updated). Update stats every time 'ethtool -S <iface>' is run.
Also, fix an indentation style issue and an unnecessary local variable
initialization in ice_get_ethtool_stats() discovered while investigating
the subject issue.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/ice/ice.h | 2 ++
drivers/net/ethernet/intel/ice/ice_ethtool.c | 7 +++++--
drivers/net/ethernet/intel/ice/ice_main.c | 6 ++----
3 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 99e0febd8e50..97d0f61cf52b 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -447,6 +447,8 @@ ice_find_vsi_by_type(struct ice_pf *pf, enum ice_vsi_type type)
int ice_vsi_setup_tx_rings(struct ice_vsi *vsi);
int ice_vsi_setup_rx_rings(struct ice_vsi *vsi);
void ice_set_ethtool_ops(struct net_device *netdev);
+void ice_update_vsi_stats(struct ice_vsi *vsi);
+void ice_update_pf_stats(struct ice_pf *pf);
int ice_up(struct ice_vsi *vsi);
int ice_down(struct ice_vsi *vsi);
int ice_vsi_cfg(struct ice_vsi *vsi);
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index 948a33716290..f7dd0bd03d39 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -1319,14 +1319,17 @@ ice_get_ethtool_stats(struct net_device *netdev,
struct ice_vsi *vsi = np->vsi;
struct ice_pf *pf = vsi->back;
struct ice_ring *ring;
- unsigned int j = 0;
+ unsigned int j;
int i = 0;
char *p;
+ ice_update_pf_stats(pf);
+ ice_update_vsi_stats(vsi);
+
for (j = 0; j < ICE_VSI_STATS_LEN; j++) {
p = (char *)vsi + ice_gstrings_vsi_stats[j].stat_offset;
data[i++] = (ice_gstrings_vsi_stats[j].sizeof_stat ==
- sizeof(u64)) ? *(u64 *)p : *(u32 *)p;
+ sizeof(u64)) ? *(u64 *)p : *(u32 *)p;
}
/* populate per queue stats */
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index a0d148f590c2..6dd806b763ea 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -34,8 +34,6 @@ static const struct net_device_ops ice_netdev_ops;
static void ice_rebuild(struct ice_pf *pf);
static void ice_vsi_release_all(struct ice_pf *pf);
-static void ice_update_vsi_stats(struct ice_vsi *vsi);
-static void ice_update_pf_stats(struct ice_pf *pf);
/**
* ice_get_tx_pending - returns number of Tx descriptors not processed
@@ -3254,7 +3252,7 @@ static void ice_update_vsi_ring_stats(struct ice_vsi *vsi)
* ice_update_vsi_stats - Update VSI stats counters
* @vsi: the VSI to be updated
*/
-static void ice_update_vsi_stats(struct ice_vsi *vsi)
+void ice_update_vsi_stats(struct ice_vsi *vsi)
{
struct rtnl_link_stats64 *cur_ns = &vsi->net_stats;
struct ice_eth_stats *cur_es = &vsi->eth_stats;
@@ -3290,7 +3288,7 @@ static void ice_update_vsi_stats(struct ice_vsi *vsi)
* ice_update_pf_stats - Update PF port stats counters
* @pf: PF whose stats needs to be updated
*/
-static void ice_update_pf_stats(struct ice_pf *pf)
+void ice_update_pf_stats(struct ice_pf *pf)
{
struct ice_hw_port_stats *prev_ps, *cur_ps;
struct ice_hw *hw = &pf->hw;
--
2.21.0
^ permalink raw reply related