From mboxrd@z Thu Jan 1 00:00:00 1970 From: Daniel Colascione Subject: Re: [RFC] Add BPF_SYNCHRONIZE bpf(2) command Date: Fri, 6 Jul 2018 20:22:49 -0700 Message-ID: References: <20180707015616.25988-1-dancol@google.com> <20180707025426.ssxipi7hsehoiuyo@ast-mbp.dhcp.thefacebook.com> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Cc: Joel Fernandes , ast@fb.com, linux-kernel@vger.kernel.org, Tim Murray , daniel@iogearbox.net, netdev@vger.kernel.org To: Alexei Starovoitov Return-path: Received: from mail-io0-f196.google.com ([209.85.223.196]:41534 "EHLO mail-io0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932865AbeGGDWv (ORCPT ); Fri, 6 Jul 2018 23:22:51 -0400 Received: by mail-io0-f196.google.com with SMTP id q9-v6so12515892ioj.8 for ; Fri, 06 Jul 2018 20:22:51 -0700 (PDT) In-Reply-To: <20180707025426.ssxipi7hsehoiuyo@ast-mbp.dhcp.thefacebook.com> Sender: netdev-owner@vger.kernel.org List-ID: On Fri, Jul 6, 2018 at 7:54 PM, Alexei Starovoitov wrote: > On Fri, Jul 06, 2018 at 06:56:16PM -0700, Daniel Colascione wrote: >> BPF_SYNCHRONIZE waits for any BPF programs active at the time of >> BPF_SYNCHRONIZE to complete, allowing userspace to ensure atomicity of >> RCU data structure operations with respect to active programs. For >> example, userspace can update a map->map entry to point to a new map, >> use BPF_SYNCHRONIZE to wait for any BPF programs using the old map to >> complete, and then drain the old map without fear that BPF programs >> may still be updating it. >> >> Signed-off-by: Daniel Colascione >> --- >> include/uapi/linux/bpf.h | 1 + >> kernel/bpf/syscall.c | 14 ++++++++++++++ >> 2 files changed, 15 insertions(+) >> >> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h >> index b7db3261c62d..4365c50e8055 100644 >> --- a/include/uapi/linux/bpf.h >> +++ b/include/uapi/linux/bpf.h >> @@ -98,6 +98,7 @@ enum bpf_cmd { >> BPF_BTF_LOAD, >> BPF_BTF_GET_FD_BY_ID, >> BPF_TASK_FD_QUERY, >> + BPF_SYNCHRONIZE, >> }; >> >> enum bpf_map_type { >> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c >> index d10ecd78105f..60ec7811846e 100644 >> --- a/kernel/bpf/syscall.c >> +++ b/kernel/bpf/syscall.c >> @@ -2272,6 +2272,20 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz >> if (sysctl_unprivileged_bpf_disabled && !capable(CAP_SYS_ADMIN)) >> return -EPERM; >> >> + if (cmd == BPF_SYNCHRONIZE) { >> + if (uattr != NULL || size != 0) >> + return -EINVAL; >> + err = security_bpf(cmd, NULL, 0); >> + if (err < 0) >> + return err; >> + /* BPF programs are run with preempt disabled, so >> + * synchronize_sched is sufficient even with >> + * RCU_PREEMPT. >> + */ >> + synchronize_sched(); >> + return 0; > > I don't think it's necessary. sys_membarrier() can do this already > and some folks use it exactly for this use case. True --- implementation-wise, today. The point of the patch is to sort out contractual guarantees. membarrier isn't documented to have the BPF_SYNCHRONIZE effect right now, although it does. If we want to extend membarrier's contract, great, but that might make it more expensive than necessary for callers who, well, just want a memory barrier. It's worthwhile to have a dedicated BPF command for this task, especially considering the simplicity of the implementation.