* [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
@ 2016-03-31 0:13 Daniel Borkmann
2016-03-31 1:18 ` Alexei Starovoitov
2016-03-31 9:15 ` Jiri Slaby
0 siblings, 2 replies; 20+ messages in thread
From: Daniel Borkmann @ 2016-03-31 0:13 UTC (permalink / raw)
To: davem
Cc: alexei.starovoitov, mkubecek, sasha.levin, jslaby, eric.dumazet,
mst, netdev, Daniel Borkmann
Sasha Levin reported a suspicious rcu_dereference_protected() warning
found while fuzzing with trinity that is similar to this one:
[ 52.765684] net/core/filter.c:2262 suspicious rcu_dereference_protected() usage!
[ 52.765688] other info that might help us debug this:
[ 52.765695] rcu_scheduler_active = 1, debug_locks = 1
[ 52.765701] 1 lock held by a.out/1525:
[ 52.765704] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff816a64b7>] rtnl_lock+0x17/0x20
[ 52.765721] stack backtrace:
[ 52.765728] CPU: 1 PID: 1525 Comm: a.out Not tainted 4.5.0+ #264
[...]
[ 52.765768] Call Trace:
[ 52.765775] [<ffffffff813e488d>] dump_stack+0x85/0xc8
[ 52.765784] [<ffffffff810f2fa5>] lockdep_rcu_suspicious+0xd5/0x110
[ 52.765792] [<ffffffff816afdc2>] sk_detach_filter+0x82/0x90
[ 52.765801] [<ffffffffa0883425>] tun_detach_filter+0x35/0x90 [tun]
[ 52.765810] [<ffffffffa0884ed4>] __tun_chr_ioctl+0x354/0x1130 [tun]
[ 52.765818] [<ffffffff8136fed0>] ? selinux_file_ioctl+0x130/0x210
[ 52.765827] [<ffffffffa0885ce3>] tun_chr_ioctl+0x13/0x20 [tun]
[ 52.765834] [<ffffffff81260ea6>] do_vfs_ioctl+0x96/0x690
[ 52.765843] [<ffffffff81364af3>] ? security_file_ioctl+0x43/0x60
[ 52.765850] [<ffffffff81261519>] SyS_ioctl+0x79/0x90
[ 52.765858] [<ffffffff81003ba2>] do_syscall_64+0x62/0x140
[ 52.765866] [<ffffffff817d563f>] entry_SYSCALL64_slow_path+0x25/0x25
Same can be triggered with PROVE_RCU (+ PROVE_RCU_REPEATEDLY) enabled
from tun_attach_filter() when user space calls ioctl(tun_fd, TUN{ATTACH,
DETACH}FILTER, ...) for adding/removing a BPF filter on tap devices.
Since the fix in f91ff5b9ff52 ("net: sk_{detach|attach}_filter() rcu
fixes") sk_attach_filter()/sk_detach_filter() now dereferences the
filter with rcu_dereference_protected(), checking whether socket lock
is held in control path.
Since its introduction in 994051625981 ("tun: socket filter support"),
tap filters are managed under RTNL lock from __tun_chr_ioctl(). Thus the
sock_owned_by_user(sk) doesn't apply in this specific case and therefore
triggers the false positive.
Extend the BPF API with __sk_attach_filter()/__sk_detach_filter() pair
that is used by tap filters and pass in lockdep_rtnl_is_held() for the
rcu_dereference_protected() checks instead.
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
drivers/net/tun.c | 8 +++++---
include/linux/filter.h | 4 ++++
net/core/filter.c | 33 +++++++++++++++++++++------------
3 files changed, 30 insertions(+), 15 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index afdf950..510e90a 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -622,7 +622,8 @@ static int tun_attach(struct tun_struct *tun, struct file *file, bool skip_filte
/* Re-attach the filter to persist device */
if (!skip_filter && (tun->filter_attached == true)) {
- err = sk_attach_filter(&tun->fprog, tfile->socket.sk);
+ err = __sk_attach_filter(&tun->fprog, tfile->socket.sk,
+ lockdep_rtnl_is_held());
if (!err)
goto out;
}
@@ -1822,7 +1823,7 @@ static void tun_detach_filter(struct tun_struct *tun, int n)
for (i = 0; i < n; i++) {
tfile = rtnl_dereference(tun->tfiles[i]);
- sk_detach_filter(tfile->socket.sk);
+ __sk_detach_filter(tfile->socket.sk, lockdep_rtnl_is_held());
}
tun->filter_attached = false;
@@ -1835,7 +1836,8 @@ static int tun_attach_filter(struct tun_struct *tun)
for (i = 0; i < tun->numqueues; i++) {
tfile = rtnl_dereference(tun->tfiles[i]);
- ret = sk_attach_filter(&tun->fprog, tfile->socket.sk);
+ ret = __sk_attach_filter(&tun->fprog, tfile->socket.sk,
+ lockdep_rtnl_is_held());
if (ret) {
tun_detach_filter(tun, i);
return ret;
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 43aa1f8..a51a536 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -465,10 +465,14 @@ int bpf_prog_create_from_user(struct bpf_prog **pfp, struct sock_fprog *fprog,
void bpf_prog_destroy(struct bpf_prog *fp);
int sk_attach_filter(struct sock_fprog *fprog, struct sock *sk);
+int __sk_attach_filter(struct sock_fprog *fprog, struct sock *sk,
+ bool locked);
int sk_attach_bpf(u32 ufd, struct sock *sk);
int sk_reuseport_attach_filter(struct sock_fprog *fprog, struct sock *sk);
int sk_reuseport_attach_bpf(u32 ufd, struct sock *sk);
int sk_detach_filter(struct sock *sk);
+int __sk_detach_filter(struct sock *sk, bool locked);
+
int sk_get_filter(struct sock *sk, struct sock_filter __user *filter,
unsigned int len);
diff --git a/net/core/filter.c b/net/core/filter.c
index 4b81b71..ca7f832 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1149,7 +1149,8 @@ void bpf_prog_destroy(struct bpf_prog *fp)
}
EXPORT_SYMBOL_GPL(bpf_prog_destroy);
-static int __sk_attach_prog(struct bpf_prog *prog, struct sock *sk)
+static int __sk_attach_prog(struct bpf_prog *prog, struct sock *sk,
+ bool locked)
{
struct sk_filter *fp, *old_fp;
@@ -1165,10 +1166,8 @@ static int __sk_attach_prog(struct bpf_prog *prog, struct sock *sk)
return -ENOMEM;
}
- old_fp = rcu_dereference_protected(sk->sk_filter,
- sock_owned_by_user(sk));
+ old_fp = rcu_dereference_protected(sk->sk_filter, locked);
rcu_assign_pointer(sk->sk_filter, fp);
-
if (old_fp)
sk_filter_uncharge(sk, old_fp);
@@ -1247,7 +1246,8 @@ struct bpf_prog *__get_filter(struct sock_fprog *fprog, struct sock *sk)
* occurs or there is insufficient memory for the filter a negative
* errno code is returned. On success the return is zero.
*/
-int sk_attach_filter(struct sock_fprog *fprog, struct sock *sk)
+int __sk_attach_filter(struct sock_fprog *fprog, struct sock *sk,
+ bool locked)
{
struct bpf_prog *prog = __get_filter(fprog, sk);
int err;
@@ -1255,7 +1255,7 @@ int sk_attach_filter(struct sock_fprog *fprog, struct sock *sk)
if (IS_ERR(prog))
return PTR_ERR(prog);
- err = __sk_attach_prog(prog, sk);
+ err = __sk_attach_prog(prog, sk, locked);
if (err < 0) {
__bpf_prog_release(prog);
return err;
@@ -1263,7 +1263,12 @@ int sk_attach_filter(struct sock_fprog *fprog, struct sock *sk)
return 0;
}
-EXPORT_SYMBOL_GPL(sk_attach_filter);
+EXPORT_SYMBOL_GPL(__sk_attach_filter);
+
+int sk_attach_filter(struct sock_fprog *fprog, struct sock *sk)
+{
+ return __sk_attach_filter(fprog, sk, sock_owned_by_user(sk));
+}
int sk_reuseport_attach_filter(struct sock_fprog *fprog, struct sock *sk)
{
@@ -1309,7 +1314,7 @@ int sk_attach_bpf(u32 ufd, struct sock *sk)
if (IS_ERR(prog))
return PTR_ERR(prog);
- err = __sk_attach_prog(prog, sk);
+ err = __sk_attach_prog(prog, sk, sock_owned_by_user(sk));
if (err < 0) {
bpf_prog_put(prog);
return err;
@@ -2250,7 +2255,7 @@ static int __init register_sk_filter_ops(void)
}
late_initcall(register_sk_filter_ops);
-int sk_detach_filter(struct sock *sk)
+int __sk_detach_filter(struct sock *sk, bool locked)
{
int ret = -ENOENT;
struct sk_filter *filter;
@@ -2258,8 +2263,7 @@ int sk_detach_filter(struct sock *sk)
if (sock_flag(sk, SOCK_FILTER_LOCKED))
return -EPERM;
- filter = rcu_dereference_protected(sk->sk_filter,
- sock_owned_by_user(sk));
+ filter = rcu_dereference_protected(sk->sk_filter, locked);
if (filter) {
RCU_INIT_POINTER(sk->sk_filter, NULL);
sk_filter_uncharge(sk, filter);
@@ -2268,7 +2272,12 @@ int sk_detach_filter(struct sock *sk)
return ret;
}
-EXPORT_SYMBOL_GPL(sk_detach_filter);
+EXPORT_SYMBOL_GPL(__sk_detach_filter);
+
+int sk_detach_filter(struct sock *sk)
+{
+ return __sk_detach_filter(sk, sock_owned_by_user(sk));
+}
int sk_get_filter(struct sock *sk, struct sock_filter __user *ubuf,
unsigned int len)
--
1.9.3
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 0:13 [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter Daniel Borkmann
@ 2016-03-31 1:18 ` Alexei Starovoitov
2016-03-31 5:01 ` Michal Kubecek
2016-03-31 9:15 ` Jiri Slaby
1 sibling, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2016-03-31 1:18 UTC (permalink / raw)
To: Daniel Borkmann
Cc: davem, mkubecek, sasha.levin, jslaby, eric.dumazet, mst, netdev
On Thu, Mar 31, 2016 at 02:13:18AM +0200, Daniel Borkmann wrote:
> Sasha Levin reported a suspicious rcu_dereference_protected() warning
> found while fuzzing with trinity that is similar to this one:
>
> [ 52.765684] net/core/filter.c:2262 suspicious rcu_dereference_protected() usage!
> [ 52.765688] other info that might help us debug this:
> [ 52.765695] rcu_scheduler_active = 1, debug_locks = 1
> [ 52.765701] 1 lock held by a.out/1525:
> [ 52.765704] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff816a64b7>] rtnl_lock+0x17/0x20
> [ 52.765721] stack backtrace:
> [ 52.765728] CPU: 1 PID: 1525 Comm: a.out Not tainted 4.5.0+ #264
> [...]
> [ 52.765768] Call Trace:
> [ 52.765775] [<ffffffff813e488d>] dump_stack+0x85/0xc8
> [ 52.765784] [<ffffffff810f2fa5>] lockdep_rcu_suspicious+0xd5/0x110
> [ 52.765792] [<ffffffff816afdc2>] sk_detach_filter+0x82/0x90
> [ 52.765801] [<ffffffffa0883425>] tun_detach_filter+0x35/0x90 [tun]
> [ 52.765810] [<ffffffffa0884ed4>] __tun_chr_ioctl+0x354/0x1130 [tun]
> [ 52.765818] [<ffffffff8136fed0>] ? selinux_file_ioctl+0x130/0x210
> [ 52.765827] [<ffffffffa0885ce3>] tun_chr_ioctl+0x13/0x20 [tun]
> [ 52.765834] [<ffffffff81260ea6>] do_vfs_ioctl+0x96/0x690
> [ 52.765843] [<ffffffff81364af3>] ? security_file_ioctl+0x43/0x60
> [ 52.765850] [<ffffffff81261519>] SyS_ioctl+0x79/0x90
> [ 52.765858] [<ffffffff81003ba2>] do_syscall_64+0x62/0x140
> [ 52.765866] [<ffffffff817d563f>] entry_SYSCALL64_slow_path+0x25/0x25
>
> Same can be triggered with PROVE_RCU (+ PROVE_RCU_REPEATEDLY) enabled
> from tun_attach_filter() when user space calls ioctl(tun_fd, TUN{ATTACH,
> DETACH}FILTER, ...) for adding/removing a BPF filter on tap devices.
>
> Since the fix in f91ff5b9ff52 ("net: sk_{detach|attach}_filter() rcu
> fixes") sk_attach_filter()/sk_detach_filter() now dereferences the
> filter with rcu_dereference_protected(), checking whether socket lock
> is held in control path.
>
> Since its introduction in 994051625981 ("tun: socket filter support"),
> tap filters are managed under RTNL lock from __tun_chr_ioctl(). Thus the
> sock_owned_by_user(sk) doesn't apply in this specific case and therefore
> triggers the false positive.
>
> Extend the BPF API with __sk_attach_filter()/__sk_detach_filter() pair
> that is used by tap filters and pass in lockdep_rtnl_is_held() for the
> rcu_dereference_protected() checks instead.
>
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> ---
> drivers/net/tun.c | 8 +++++---
> include/linux/filter.h | 4 ++++
> net/core/filter.c | 33 +++++++++++++++++++++------------
> 3 files changed, 30 insertions(+), 15 deletions(-)
kinda heavy patch to shut up lockdep.
Can we do
old_fp = rcu_dereference_protected(sk->sk_filter,
sock_owned_by_user(sk) || lockdep_rtnl_is_held());
and it always be correct?
I think right now tun is the only such user, but if it's correct for tun,
it's correct for future users too. If not correct then not correct for tun either.
Or I'm missing something?
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 1:18 ` Alexei Starovoitov
@ 2016-03-31 5:01 ` Michal Kubecek
2016-03-31 5:08 ` Alexei Starovoitov
0 siblings, 1 reply; 20+ messages in thread
From: Michal Kubecek @ 2016-03-31 5:01 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Daniel Borkmann, davem, sasha.levin, jslaby, eric.dumazet, mst,
netdev
On Wed, Mar 30, 2016 at 06:18:42PM -0700, Alexei Starovoitov wrote:
> On Thu, Mar 31, 2016 at 02:13:18AM +0200, Daniel Borkmann wrote:
> > Sasha Levin reported a suspicious rcu_dereference_protected() warning
> > found while fuzzing with trinity that is similar to this one:
> >
> > [ 52.765684] net/core/filter.c:2262 suspicious rcu_dereference_protected() usage!
> > [ 52.765688] other info that might help us debug this:
> > [ 52.765695] rcu_scheduler_active = 1, debug_locks = 1
> > [ 52.765701] 1 lock held by a.out/1525:
> > [ 52.765704] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff816a64b7>] rtnl_lock+0x17/0x20
> > [ 52.765721] stack backtrace:
> > [ 52.765728] CPU: 1 PID: 1525 Comm: a.out Not tainted 4.5.0+ #264
> > [...]
> > [ 52.765768] Call Trace:
> > [ 52.765775] [<ffffffff813e488d>] dump_stack+0x85/0xc8
> > [ 52.765784] [<ffffffff810f2fa5>] lockdep_rcu_suspicious+0xd5/0x110
> > [ 52.765792] [<ffffffff816afdc2>] sk_detach_filter+0x82/0x90
> > [ 52.765801] [<ffffffffa0883425>] tun_detach_filter+0x35/0x90 [tun]
> > [ 52.765810] [<ffffffffa0884ed4>] __tun_chr_ioctl+0x354/0x1130 [tun]
> > [ 52.765818] [<ffffffff8136fed0>] ? selinux_file_ioctl+0x130/0x210
> > [ 52.765827] [<ffffffffa0885ce3>] tun_chr_ioctl+0x13/0x20 [tun]
> > [ 52.765834] [<ffffffff81260ea6>] do_vfs_ioctl+0x96/0x690
> > [ 52.765843] [<ffffffff81364af3>] ? security_file_ioctl+0x43/0x60
> > [ 52.765850] [<ffffffff81261519>] SyS_ioctl+0x79/0x90
> > [ 52.765858] [<ffffffff81003ba2>] do_syscall_64+0x62/0x140
> > [ 52.765866] [<ffffffff817d563f>] entry_SYSCALL64_slow_path+0x25/0x25
> >
> > Same can be triggered with PROVE_RCU (+ PROVE_RCU_REPEATEDLY) enabled
> > from tun_attach_filter() when user space calls ioctl(tun_fd, TUN{ATTACH,
> > DETACH}FILTER, ...) for adding/removing a BPF filter on tap devices.
> >
> > Since the fix in f91ff5b9ff52 ("net: sk_{detach|attach}_filter() rcu
> > fixes") sk_attach_filter()/sk_detach_filter() now dereferences the
> > filter with rcu_dereference_protected(), checking whether socket lock
> > is held in control path.
> >
> > Since its introduction in 994051625981 ("tun: socket filter support"),
> > tap filters are managed under RTNL lock from __tun_chr_ioctl(). Thus the
> > sock_owned_by_user(sk) doesn't apply in this specific case and therefore
> > triggers the false positive.
> >
> > Extend the BPF API with __sk_attach_filter()/__sk_detach_filter() pair
> > that is used by tap filters and pass in lockdep_rtnl_is_held() for the
> > rcu_dereference_protected() checks instead.
> >
> > Reported-by: Sasha Levin <sasha.levin@oracle.com>
> > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> > ---
> > drivers/net/tun.c | 8 +++++---
> > include/linux/filter.h | 4 ++++
> > net/core/filter.c | 33 +++++++++++++++++++++------------
> > 3 files changed, 30 insertions(+), 15 deletions(-)
>
> kinda heavy patch to shut up lockdep.
> Can we do
> old_fp = rcu_dereference_protected(sk->sk_filter,
> sock_owned_by_user(sk) || lockdep_rtnl_is_held());
> and it always be correct?
> I think right now tun is the only such user, but if it's correct for tun,
> it's correct for future users too. If not correct then not correct for tun either.
> Or I'm missing something?
Already discussed here:
http://thread.gmane.org/gmane.linux.kernel/2158069/focus=405853
Michal Kubecek
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 5:01 ` Michal Kubecek
@ 2016-03-31 5:08 ` Alexei Starovoitov
2016-03-31 5:22 ` Michal Kubecek
0 siblings, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2016-03-31 5:08 UTC (permalink / raw)
To: Michal Kubecek
Cc: Daniel Borkmann, davem, sasha.levin, jslaby, eric.dumazet, mst,
netdev
On Thu, Mar 31, 2016 at 07:01:15AM +0200, Michal Kubecek wrote:
> On Wed, Mar 30, 2016 at 06:18:42PM -0700, Alexei Starovoitov wrote:
> > On Thu, Mar 31, 2016 at 02:13:18AM +0200, Daniel Borkmann wrote:
> > > Sasha Levin reported a suspicious rcu_dereference_protected() warning
> > > found while fuzzing with trinity that is similar to this one:
> > >
> > > [ 52.765684] net/core/filter.c:2262 suspicious rcu_dereference_protected() usage!
> > > [ 52.765688] other info that might help us debug this:
> > > [ 52.765695] rcu_scheduler_active = 1, debug_locks = 1
> > > [ 52.765701] 1 lock held by a.out/1525:
> > > [ 52.765704] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff816a64b7>] rtnl_lock+0x17/0x20
> > > [ 52.765721] stack backtrace:
> > > [ 52.765728] CPU: 1 PID: 1525 Comm: a.out Not tainted 4.5.0+ #264
> > > [...]
> > > [ 52.765768] Call Trace:
> > > [ 52.765775] [<ffffffff813e488d>] dump_stack+0x85/0xc8
> > > [ 52.765784] [<ffffffff810f2fa5>] lockdep_rcu_suspicious+0xd5/0x110
> > > [ 52.765792] [<ffffffff816afdc2>] sk_detach_filter+0x82/0x90
> > > [ 52.765801] [<ffffffffa0883425>] tun_detach_filter+0x35/0x90 [tun]
> > > [ 52.765810] [<ffffffffa0884ed4>] __tun_chr_ioctl+0x354/0x1130 [tun]
> > > [ 52.765818] [<ffffffff8136fed0>] ? selinux_file_ioctl+0x130/0x210
> > > [ 52.765827] [<ffffffffa0885ce3>] tun_chr_ioctl+0x13/0x20 [tun]
> > > [ 52.765834] [<ffffffff81260ea6>] do_vfs_ioctl+0x96/0x690
> > > [ 52.765843] [<ffffffff81364af3>] ? security_file_ioctl+0x43/0x60
> > > [ 52.765850] [<ffffffff81261519>] SyS_ioctl+0x79/0x90
> > > [ 52.765858] [<ffffffff81003ba2>] do_syscall_64+0x62/0x140
> > > [ 52.765866] [<ffffffff817d563f>] entry_SYSCALL64_slow_path+0x25/0x25
> > >
> > > Same can be triggered with PROVE_RCU (+ PROVE_RCU_REPEATEDLY) enabled
> > > from tun_attach_filter() when user space calls ioctl(tun_fd, TUN{ATTACH,
> > > DETACH}FILTER, ...) for adding/removing a BPF filter on tap devices.
> > >
> > > Since the fix in f91ff5b9ff52 ("net: sk_{detach|attach}_filter() rcu
> > > fixes") sk_attach_filter()/sk_detach_filter() now dereferences the
> > > filter with rcu_dereference_protected(), checking whether socket lock
> > > is held in control path.
> > >
> > > Since its introduction in 994051625981 ("tun: socket filter support"),
> > > tap filters are managed under RTNL lock from __tun_chr_ioctl(). Thus the
> > > sock_owned_by_user(sk) doesn't apply in this specific case and therefore
> > > triggers the false positive.
> > >
> > > Extend the BPF API with __sk_attach_filter()/__sk_detach_filter() pair
> > > that is used by tap filters and pass in lockdep_rtnl_is_held() for the
> > > rcu_dereference_protected() checks instead.
> > >
> > > Reported-by: Sasha Levin <sasha.levin@oracle.com>
> > > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> > > ---
> > > drivers/net/tun.c | 8 +++++---
> > > include/linux/filter.h | 4 ++++
> > > net/core/filter.c | 33 +++++++++++++++++++++------------
> > > 3 files changed, 30 insertions(+), 15 deletions(-)
> >
> > kinda heavy patch to shut up lockdep.
> > Can we do
> > old_fp = rcu_dereference_protected(sk->sk_filter,
> > sock_owned_by_user(sk) || lockdep_rtnl_is_held());
> > and it always be correct?
> > I think right now tun is the only such user, but if it's correct for tun,
> > it's correct for future users too. If not correct then not correct for tun either.
> > Or I'm missing something?
>
> Already discussed here:
>
> http://thread.gmane.org/gmane.linux.kernel/2158069/focus=405853
I saw that. My point above was challenging 'less accurate' part.
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 5:08 ` Alexei Starovoitov
@ 2016-03-31 5:22 ` Michal Kubecek
2016-03-31 5:43 ` Alexei Starovoitov
0 siblings, 1 reply; 20+ messages in thread
From: Michal Kubecek @ 2016-03-31 5:22 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Daniel Borkmann, davem, sasha.levin, jslaby, eric.dumazet, mst,
netdev
On Wed, Mar 30, 2016 at 10:08:10PM -0700, Alexei Starovoitov wrote:
> On Thu, Mar 31, 2016 at 07:01:15AM +0200, Michal Kubecek wrote:
> > On Wed, Mar 30, 2016 at 06:18:42PM -0700, Alexei Starovoitov wrote:
> > >
> > > kinda heavy patch to shut up lockdep.
> > > Can we do
> > > old_fp = rcu_dereference_protected(sk->sk_filter,
> > > sock_owned_by_user(sk) || lockdep_rtnl_is_held());
> > > and it always be correct?
> > > I think right now tun is the only such user, but if it's correct
> > > for tun, it's correct for future users too. If not correct then
> > > not correct for tun either.
> > > Or I'm missing something?
> >
> > Already discussed here:
> >
> > http://thread.gmane.org/gmane.linux.kernel/2158069/focus=405853
>
> I saw that. My point above was challenging 'less accurate' part.
>
Daniel's point was that lockdep_rtnl_is_held() does not mean "we hold
RTNL" but "someone holds RTNL" so that some other task holding RTNL at
the moment could make the check happy even when called by someone
supposed to own the socket.
So I guess the key question is if avoiding this type of false negative
is important enough to justify the extra complexity (taking into account
this race would have to happen every time the check is performed to
really hide a locking issue).
Michal Kubecek
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 5:22 ` Michal Kubecek
@ 2016-03-31 5:43 ` Alexei Starovoitov
2016-03-31 11:35 ` Daniel Borkmann
2016-03-31 12:12 ` Hannes Frederic Sowa
0 siblings, 2 replies; 20+ messages in thread
From: Alexei Starovoitov @ 2016-03-31 5:43 UTC (permalink / raw)
To: Michal Kubecek
Cc: Daniel Borkmann, davem, sasha.levin, jslaby, eric.dumazet, mst,
netdev
On Thu, Mar 31, 2016 at 07:22:32AM +0200, Michal Kubecek wrote:
> On Wed, Mar 30, 2016 at 10:08:10PM -0700, Alexei Starovoitov wrote:
> > On Thu, Mar 31, 2016 at 07:01:15AM +0200, Michal Kubecek wrote:
> > > On Wed, Mar 30, 2016 at 06:18:42PM -0700, Alexei Starovoitov wrote:
> > > >
> > > > kinda heavy patch to shut up lockdep.
> > > > Can we do
> > > > old_fp = rcu_dereference_protected(sk->sk_filter,
> > > > sock_owned_by_user(sk) || lockdep_rtnl_is_held());
> > > > and it always be correct?
> > > > I think right now tun is the only such user, but if it's correct
> > > > for tun, it's correct for future users too. If not correct then
> > > > not correct for tun either.
> > > > Or I'm missing something?
> > >
> > > Already discussed here:
> > >
> > > http://thread.gmane.org/gmane.linux.kernel/2158069/focus=405853
> >
> > I saw that. My point above was challenging 'less accurate' part.
> >
> Daniel's point was that lockdep_rtnl_is_held() does not mean "we hold
> RTNL" but "someone holds RTNL" so that some other task holding RTNL at
> the moment could make the check happy even when called by someone
> supposed to own the socket.
Of course... and that is the case for all rtnl_dereference() calls...
yet we're not paranoid about it.
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 5:43 ` Alexei Starovoitov
@ 2016-03-31 11:35 ` Daniel Borkmann
2016-03-31 11:59 ` Eric Dumazet
2016-03-31 12:12 ` Hannes Frederic Sowa
1 sibling, 1 reply; 20+ messages in thread
From: Daniel Borkmann @ 2016-03-31 11:35 UTC (permalink / raw)
To: Alexei Starovoitov, Michal Kubecek
Cc: davem, sasha.levin, jslaby, eric.dumazet, mst, netdev
On 03/31/2016 07:43 AM, Alexei Starovoitov wrote:
> On Thu, Mar 31, 2016 at 07:22:32AM +0200, Michal Kubecek wrote:
>> On Wed, Mar 30, 2016 at 10:08:10PM -0700, Alexei Starovoitov wrote:
>>> On Thu, Mar 31, 2016 at 07:01:15AM +0200, Michal Kubecek wrote:
>>>> On Wed, Mar 30, 2016 at 06:18:42PM -0700, Alexei Starovoitov wrote:
>>>>>
>>>>> kinda heavy patch to shut up lockdep.
>>>>> Can we do
>>>>> old_fp = rcu_dereference_protected(sk->sk_filter,
>>>>> sock_owned_by_user(sk) || lockdep_rtnl_is_held());
>>>>> and it always be correct?
>>>>> I think right now tun is the only such user, but if it's correct
>>>>> for tun, it's correct for future users too. If not correct then
>>>>> not correct for tun either.
>>>>> Or I'm missing something?
>>>>
>>>> Already discussed here:
>>>>
>>>> http://thread.gmane.org/gmane.linux.kernel/2158069/focus=405853
>>>
>>> I saw that. My point above was challenging 'less accurate' part.
>>>
>> Daniel's point was that lockdep_rtnl_is_held() does not mean "we hold
>> RTNL" but "someone holds RTNL" so that some other task holding RTNL at
>> the moment could make the check happy even when called by someone
>> supposed to own the socket.
>
> Of course... and that is the case for all rtnl_dereference() calls...
> yet we're not paranoid about it.
Sure, but the rtnl case is a bit different, no? In the sense that there's
only one global mutex. So, imho, I don't think it's appropriate to relax
the current rcu_dereference_protected() check for the socket case _just_
in order to silence the tun case warning, if we _can_ actually do better
than this w/o much effort.
I thought about some alternatives if we really don't want to change the
API code like this: We could change the rcu_dereference_protected() just
into a rcu_dereference(), but with the trade-off of not having lockdep
which is probably not really what we want. We could hack the tun case to
create some 'fake' ownership by setting sk->sk_lock.owned, but this seems
very hacky imho, and messing around with sk_lock details that shouldn't
be messed with. Or, as in the other thread mentioned, we could add a flag
like below to mark the socket that it doesn't need to be locked in the
expected way. That diff works as well, is smaller, and the flag could
perhaps be reused in other cases, too. Downside is that we burn a socket
flag, but as it's not uapi, it's not set in stone and can still be changed
should we get into a shortage of bits in future. Have no strong opinion
whether this seems better or not.
Thanks,
Daniel
drivers/net/tun.c | 1 +
include/net/sock.h | 8 ++++++++
net/core/filter.c | 7 ++++---
3 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index afdf950..8dc7d3e 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2252,6 +2252,7 @@ static int tun_chr_open(struct inode *inode, struct file * file)
INIT_LIST_HEAD(&tfile->next);
sock_set_flag(&tfile->sk, SOCK_ZEROCOPY);
+ sock_set_flag(&tfile->sk, SOCK_EXTERNAL_OWNER);
return 0;
}
diff --git a/include/net/sock.h b/include/net/sock.h
index 255d3e0..8d90673 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -720,6 +720,9 @@ enum sock_flags {
*/
SOCK_FILTER_LOCKED, /* Filter cannot be changed anymore */
SOCK_SELECT_ERR_QUEUE, /* Wake select on error queue */
+ SOCK_EXTERNAL_OWNER, /* External locking (e.g. RTNL) is used instead
+ * of sk_lock for control path.
+ */
};
#define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
@@ -1330,6 +1333,11 @@ static inline void sock_release_ownership(struct sock *sk)
sk->sk_lock.owned = 0;
}
+static inline bool sock_owned_externally(const struct sock *sk)
+{
+ return sk->sk_flags & (1UL << SOCK_EXTERNAL_OWNER);
+}
+
/*
* Macro so as to not evaluate some arguments when
* lockdep is not enabled.
diff --git a/net/core/filter.c b/net/core/filter.c
index 4b81b71..828274e 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1166,9 +1166,9 @@ static int __sk_attach_prog(struct bpf_prog *prog, struct sock *sk)
}
old_fp = rcu_dereference_protected(sk->sk_filter,
- sock_owned_by_user(sk));
+ sock_owned_by_user(sk) ||
+ sock_owned_externally(sk));
rcu_assign_pointer(sk->sk_filter, fp);
-
if (old_fp)
sk_filter_uncharge(sk, old_fp);
@@ -2259,7 +2259,8 @@ int sk_detach_filter(struct sock *sk)
return -EPERM;
filter = rcu_dereference_protected(sk->sk_filter,
- sock_owned_by_user(sk));
+ sock_owned_by_user(sk) ||
+ sock_owned_externally(sk));
if (filter) {
RCU_INIT_POINTER(sk->sk_filter, NULL);
sk_filter_uncharge(sk, filter);
--
1.9.3
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 11:35 ` Daniel Borkmann
@ 2016-03-31 11:59 ` Eric Dumazet
2016-03-31 12:16 ` Daniel Borkmann
0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2016-03-31 11:59 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Alexei Starovoitov, Michal Kubecek, davem, sasha.levin, jslaby,
mst, netdev
On Thu, 2016-03-31 at 13:35 +0200, Daniel Borkmann wrote:
> +static inline bool sock_owned_externally(const struct sock *sk)
> +{
> + return sk->sk_flags & (1UL << SOCK_EXTERNAL_OWNER);
> +}
> +
Have you reinvented sock_flag(sl, SOCK_EXTERNAL_OWNER) ? ;)
Anyway, using a flag for this purpose sounds overkill to me.
Setting it is a way to 'fool' lockdep anyway...
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 11:59 ` Eric Dumazet
@ 2016-03-31 12:16 ` Daniel Borkmann
2016-03-31 19:21 ` David Miller
0 siblings, 1 reply; 20+ messages in thread
From: Daniel Borkmann @ 2016-03-31 12:16 UTC (permalink / raw)
To: Eric Dumazet
Cc: Alexei Starovoitov, Michal Kubecek, davem, sasha.levin, jslaby,
mst, netdev
On 03/31/2016 01:59 PM, Eric Dumazet wrote:
> On Thu, 2016-03-31 at 13:35 +0200, Daniel Borkmann wrote:
>
>> +static inline bool sock_owned_externally(const struct sock *sk)
>> +{
>> + return sk->sk_flags & (1UL << SOCK_EXTERNAL_OWNER);
>> +}
>> +
>
> Have you reinvented sock_flag(sl, SOCK_EXTERNAL_OWNER) ? ;)
>
> Anyway, using a flag for this purpose sounds overkill to me.
Right.
> Setting it is a way to 'fool' lockdep anyway...
Yep, correct, we'd be fooling the tun case, so this diff doesn't
really make it any better there.
Thanks,
Daniel
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 12:16 ` Daniel Borkmann
@ 2016-03-31 19:21 ` David Miller
2016-03-31 19:24 ` Hannes Frederic Sowa
0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2016-03-31 19:21 UTC (permalink / raw)
To: daniel
Cc: eric.dumazet, alexei.starovoitov, mkubecek, sasha.levin, jslaby,
mst, netdev
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Thu, 31 Mar 2016 14:16:18 +0200
> On 03/31/2016 01:59 PM, Eric Dumazet wrote:
>> On Thu, 2016-03-31 at 13:35 +0200, Daniel Borkmann wrote:
>>
>>> +static inline bool sock_owned_externally(const struct sock *sk)
>>> +{
>>> + return sk->sk_flags & (1UL << SOCK_EXTERNAL_OWNER);
>>> +}
>>> +
>>
>> Have you reinvented sock_flag(sl, SOCK_EXTERNAL_OWNER) ? ;)
>>
>> Anyway, using a flag for this purpose sounds overkill to me.
>
> Right.
>
>> Setting it is a way to 'fool' lockdep anyway...
>
> Yep, correct, we'd be fooling the tun case, so this diff doesn't
> really make it any better there.
I like the currently proposed patch where TUN says that RTNL is what
the synchronizing element is.
Maybe we could make a helper of some sort but since we only have once
case like this is just overkill.
Alexei, do you really mind if I apply Danile's patch?
Thanks.
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 19:21 ` David Miller
@ 2016-03-31 19:24 ` Hannes Frederic Sowa
2016-03-31 19:31 ` Alexei Starovoitov
2016-03-31 19:36 ` David Miller
0 siblings, 2 replies; 20+ messages in thread
From: Hannes Frederic Sowa @ 2016-03-31 19:24 UTC (permalink / raw)
To: David Miller, daniel
Cc: eric.dumazet, alexei.starovoitov, mkubecek, sasha.levin, jslaby,
mst, netdev
Hello,
On 31.03.2016 21:21, David Miller wrote:
> From: Daniel Borkmann <daniel@iogearbox.net>
> Date: Thu, 31 Mar 2016 14:16:18 +0200
>
>> On 03/31/2016 01:59 PM, Eric Dumazet wrote:
>>> On Thu, 2016-03-31 at 13:35 +0200, Daniel Borkmann wrote:
>>>
>>>> +static inline bool sock_owned_externally(const struct sock *sk)
>>>> +{
>>>> + return sk->sk_flags & (1UL << SOCK_EXTERNAL_OWNER);
>>>> +}
>>>> +
>>>
>>> Have you reinvented sock_flag(sl, SOCK_EXTERNAL_OWNER) ? ;)
>>>
>>> Anyway, using a flag for this purpose sounds overkill to me.
>>
>> Right.
>>
>>> Setting it is a way to 'fool' lockdep anyway...
>>
>> Yep, correct, we'd be fooling the tun case, so this diff doesn't
>> really make it any better there.
>
> I like the currently proposed patch where TUN says that RTNL is what
> the synchronizing element is.
>
> Maybe we could make a helper of some sort but since we only have once
> case like this is just overkill.
>
> Alexei, do you really mind if I apply Danile's patch?
I proposed the following patch to Daniel and he seemed to like it. I
was just waiting for his feedback and tags and wanted to send it out
then.
What do you think?
lockdep_sock_is_held does also have some other applications in other
parts of the source.
Bye,
Hannes
diff --git a/include/net/sock.h b/include/net/sock.h
index 255d3e03727b73..651b84a38cfb9b 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1330,6 +1330,12 @@ static inline void sock_release_ownership(struct
sock *sk)
sk->sk_lock.owned = 0;
}
+static inline bool lockdep_sock_is_held(struct sock *sk)
+{
+ return lockdep_is_held(&sk->sk_lock) ||
+ lockdep_is_held(&sk->sk_lock.slock);
+}
+
/*
* Macro so as to not evaluate some arguments when
* lockdep is not enabled.
diff --git a/net/core/filter.c b/net/core/filter.c
index 4b81b71171b4ce..8ab270d5ce5507 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1166,7 +1166,8 @@ static int __sk_attach_prog(struct bpf_prog *prog,
struct sock *sk)
}
old_fp = rcu_dereference_protected(sk->sk_filter,
- sock_owned_by_user(sk));
+ lockdep_rtnl_is_held() ||
+ lockdep_sock_is_held(sk));
rcu_assign_pointer(sk->sk_filter, fp);
if (old_fp)
@@ -2259,7 +2260,9 @@ int sk_detach_filter(struct sock *sk)
return -EPERM;
filter = rcu_dereference_protected(sk->sk_filter,
- sock_owned_by_user(sk));
+ lockdep_rtnl_is_held() ||
+ lockdep_sock_is_held(sk));
+
if (filter) {
RCU_INIT_POINTER(sk->sk_filter, NULL);
sk_filter_uncharge(sk, filter);
c
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 19:24 ` Hannes Frederic Sowa
@ 2016-03-31 19:31 ` Alexei Starovoitov
2016-03-31 19:48 ` David Miller
2016-03-31 19:36 ` David Miller
1 sibling, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2016-03-31 19:31 UTC (permalink / raw)
To: Hannes Frederic Sowa
Cc: David Miller, daniel, eric.dumazet, mkubecek, sasha.levin, jslaby,
mst, netdev
On Thu, Mar 31, 2016 at 09:24:12PM +0200, Hannes Frederic Sowa wrote:
> Hello,
>
> On 31.03.2016 21:21, David Miller wrote:
> >From: Daniel Borkmann <daniel@iogearbox.net>
> >Date: Thu, 31 Mar 2016 14:16:18 +0200
> >
> >>On 03/31/2016 01:59 PM, Eric Dumazet wrote:
> >>>On Thu, 2016-03-31 at 13:35 +0200, Daniel Borkmann wrote:
> >>>
> >>>>+static inline bool sock_owned_externally(const struct sock *sk)
> >>>>+{
> >>>>+ return sk->sk_flags & (1UL << SOCK_EXTERNAL_OWNER);
> >>>>+}
> >>>>+
> >>>
> >>>Have you reinvented sock_flag(sl, SOCK_EXTERNAL_OWNER) ? ;)
> >>>
> >>>Anyway, using a flag for this purpose sounds overkill to me.
> >>
> >>Right.
> >>
> >>>Setting it is a way to 'fool' lockdep anyway...
> >>
> >>Yep, correct, we'd be fooling the tun case, so this diff doesn't
> >>really make it any better there.
> >
> >I like the currently proposed patch where TUN says that RTNL is what
> >the synchronizing element is.
> >
> >Maybe we could make a helper of some sort but since we only have once
> >case like this is just overkill.
> >
> >Alexei, do you really mind if I apply Danile's patch?
I don't have strong opinion either way.
Though Hannes's patch below looks simpler and easier to backport.
Yeah, I do care about backports quite a bit more nowadays :)
> I proposed the following patch to Daniel and he seemed to like it. I
> was just waiting for his feedback and tags and wanted to send it out
> then.
>
> What do you think?
>
> lockdep_sock_is_held does also have some other applications in other
> parts of the source.
>
> Bye,
> Hannes
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 255d3e03727b73..651b84a38cfb9b 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1330,6 +1330,12 @@ static inline void sock_release_ownership(struct sock
> *sk)
> sk->sk_lock.owned = 0;
> }
>
> +static inline bool lockdep_sock_is_held(struct sock *sk)
> +{
> + return lockdep_is_held(&sk->sk_lock) ||
> + lockdep_is_held(&sk->sk_lock.slock);
> +}
> +
> /*
> * Macro so as to not evaluate some arguments when
> * lockdep is not enabled.
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 4b81b71171b4ce..8ab270d5ce5507 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -1166,7 +1166,8 @@ static int __sk_attach_prog(struct bpf_prog *prog,
> struct sock *sk)
> }
>
> old_fp = rcu_dereference_protected(sk->sk_filter,
> - sock_owned_by_user(sk));
> + lockdep_rtnl_is_held() ||
> + lockdep_sock_is_held(sk));
> rcu_assign_pointer(sk->sk_filter, fp);
>
> if (old_fp)
> @@ -2259,7 +2260,9 @@ int sk_detach_filter(struct sock *sk)
> return -EPERM;
>
> filter = rcu_dereference_protected(sk->sk_filter,
> - sock_owned_by_user(sk));
> + lockdep_rtnl_is_held() ||
> + lockdep_sock_is_held(sk));
> +
> if (filter) {
> RCU_INIT_POINTER(sk->sk_filter, NULL);
> sk_filter_uncharge(sk, filter);
>
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 19:31 ` Alexei Starovoitov
@ 2016-03-31 19:48 ` David Miller
0 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2016-03-31 19:48 UTC (permalink / raw)
To: alexei.starovoitov
Cc: hannes, daniel, eric.dumazet, mkubecek, sasha.levin, jslaby, mst,
netdev
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Date: Thu, 31 Mar 2016 12:31:56 -0700
> On Thu, Mar 31, 2016 at 09:24:12PM +0200, Hannes Frederic Sowa wrote:
>> Hello,
>>
>> On 31.03.2016 21:21, David Miller wrote:
>> >From: Daniel Borkmann <daniel@iogearbox.net>
>> >Date: Thu, 31 Mar 2016 14:16:18 +0200
>> >
>> >Alexei, do you really mind if I apply Danile's patch?
>
> I don't have strong opinion either way.
> Though Hannes's patch below looks simpler and easier to backport.
> Yeah, I do care about backports quite a bit more nowadays :)
You know, I care a lot about backports too :)
But Hannes's patch has the same fundamental issue, I think.
If we accept both synchornization styles, false positives are more
likely.
And in the long term, we can fix the false positive possibilities with
the RTNL checks.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 19:24 ` Hannes Frederic Sowa
2016-03-31 19:31 ` Alexei Starovoitov
@ 2016-03-31 19:36 ` David Miller
2016-03-31 19:48 ` Hannes Frederic Sowa
1 sibling, 1 reply; 20+ messages in thread
From: David Miller @ 2016-03-31 19:36 UTC (permalink / raw)
To: hannes
Cc: daniel, eric.dumazet, alexei.starovoitov, mkubecek, sasha.levin,
jslaby, mst, netdev
From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Thu, 31 Mar 2016 21:24:12 +0200
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 4b81b71171b4ce..8ab270d5ce5507 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -1166,7 +1166,8 @@ static int __sk_attach_prog(struct bpf_prog
> *prog, struct sock *sk)
> }
>
> old_fp = rcu_dereference_protected(sk->sk_filter,
> - sock_owned_by_user(sk));
> + lockdep_rtnl_is_held() ||
> + lockdep_sock_is_held(sk));
> rcu_assign_pointer(sk->sk_filter, fp);
>
> if (old_fp)
I have the same objections Daniel did.
Not all socket filter clients use RTNL as the synchornization
mechanism. The caller, or some descriptive element, should tell us
what that synchronizing element is.
Yes, I understand how these RTNL checks can pass "accidently" but
the opposite is true too. A socket locking synchornizing user,
who didn't lock the socket, might now pass because RTNL happens
to be held elsewhere.
Constraining the test properly, based upon the user, makes this less
likely to happen. And that's desirable.
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 19:36 ` David Miller
@ 2016-03-31 19:48 ` Hannes Frederic Sowa
2016-03-31 19:50 ` David Miller
2016-03-31 21:52 ` Daniel Borkmann
0 siblings, 2 replies; 20+ messages in thread
From: Hannes Frederic Sowa @ 2016-03-31 19:48 UTC (permalink / raw)
To: David Miller
Cc: daniel, eric.dumazet, alexei.starovoitov, mkubecek, sasha.levin,
jslaby, mst, netdev
On 31.03.2016 21:36, David Miller wrote:
> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Date: Thu, 31 Mar 2016 21:24:12 +0200
>
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 4b81b71171b4ce..8ab270d5ce5507 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -1166,7 +1166,8 @@ static int __sk_attach_prog(struct bpf_prog
>> *prog, struct sock *sk)
>> }
>>
>> old_fp = rcu_dereference_protected(sk->sk_filter,
>> - sock_owned_by_user(sk));
>> + lockdep_rtnl_is_held() ||
>> + lockdep_sock_is_held(sk));
>> rcu_assign_pointer(sk->sk_filter, fp);
>>
>> if (old_fp)
>
> I have the same objections Daniel did.
>
> Not all socket filter clients use RTNL as the synchornization
> mechanism. The caller, or some descriptive element, should tell us
> what that synchronizing element is.
>
> Yes, I understand how these RTNL checks can pass "accidently" but
> the opposite is true too. A socket locking synchornizing user,
> who didn't lock the socket, might now pass because RTNL happens
> to be held elsewhere.
Actually lockdep_rtnl_is_held checks if this specific code/thread holds
the lock and no other cpu/thread. So it will not pass here in case
another cpu has the lock.
lockdep stores the current held locks in current->held_locks, if we
preempt we switch current pointer, if we take a spin_lock we can't sleep
thus not preempt. Thus we always know that this specific code has the lock.
Using sock_owned_by_user actually has this problem, and thus I am
replacing it. We don't know who has the socket locked.
Tightest solution would probably be to combine both patches.
bool called_by_tuntap;
old_fp = rcu_dereference_protected(sk->sk_filter, called_by_tuntap ?
lockdep_rtnl_is_held() : lockdep_sock_is_held());
Bye,
Hannes
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 19:48 ` Hannes Frederic Sowa
@ 2016-03-31 19:50 ` David Miller
2016-03-31 21:52 ` Daniel Borkmann
1 sibling, 0 replies; 20+ messages in thread
From: David Miller @ 2016-03-31 19:50 UTC (permalink / raw)
To: hannes
Cc: daniel, eric.dumazet, alexei.starovoitov, mkubecek, sasha.levin,
jslaby, mst, netdev
From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Thu, 31 Mar 2016 21:48:27 +0200
> Tightest solution would probably be to combine both patches.
>
> bool called_by_tuntap;
>
> old_fp = rcu_dereference_protected(sk->sk_filter, called_by_tuntap ?
> lockdep_rtnl_is_held() : lockdep_sock_is_held());
Ok, I see what you're saying.
I misunderstood how the RTNL lockdep checks work and thought we could
get false positives from other entities taking RTNL.
Can you cook up the combined patch?
Thanks.
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 19:48 ` Hannes Frederic Sowa
2016-03-31 19:50 ` David Miller
@ 2016-03-31 21:52 ` Daniel Borkmann
2016-03-31 23:31 ` Hannes Frederic Sowa
1 sibling, 1 reply; 20+ messages in thread
From: Daniel Borkmann @ 2016-03-31 21:52 UTC (permalink / raw)
To: Hannes Frederic Sowa, David Miller
Cc: eric.dumazet, alexei.starovoitov, mkubecek, sasha.levin, jslaby,
mst, netdev
On 03/31/2016 09:48 PM, Hannes Frederic Sowa wrote:
[...]
> Tightest solution would probably be to combine both patches.
>
> bool called_by_tuntap;
>
> old_fp = rcu_dereference_protected(sk->sk_filter, called_by_tuntap ? lockdep_rtnl_is_held() : lockdep_sock_is_held());
If I understand you correctly with combining them, you mean you'd still
need the API change to pass the bool 'called_by_tuntap' down, right?
If so, your main difference is, after all, to replace the sock_owned_by_user()
with the lockdep_sock_is_held() construction instead, correct?
But then, isn't it already sufficient when you pass the bool itself down
that 'demuxes' in this case between the sock_owned_by_user() vs
lockdep_rtnl_is_held() check?
Thanks,
Daniel
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 21:52 ` Daniel Borkmann
@ 2016-03-31 23:31 ` Hannes Frederic Sowa
0 siblings, 0 replies; 20+ messages in thread
From: Hannes Frederic Sowa @ 2016-03-31 23:31 UTC (permalink / raw)
To: Daniel Borkmann, David Miller
Cc: eric.dumazet, alexei.starovoitov, mkubecek, sasha.levin, jslaby,
mst, netdev
Hi Daniel,
On 31.03.2016 23:52, Daniel Borkmann wrote:
> On 03/31/2016 09:48 PM, Hannes Frederic Sowa wrote:
> [...]
>> Tightest solution would probably be to combine both patches.
>>
>> bool called_by_tuntap;
>>
>> old_fp = rcu_dereference_protected(sk->sk_filter, called_by_tuntap ?
>> lockdep_rtnl_is_held() : lockdep_sock_is_held());
>
> If I understand you correctly with combining them, you mean you'd still
> need the API change to pass the bool 'called_by_tuntap' down, right?
I actually decided to simply lock the sockets. So that there will always
be a clear lock owner during the updates. I think this is cleaner. What
do you think?
> If so, your main difference is, after all, to replace the
> sock_owned_by_user()
> with the lockdep_sock_is_held() construction instead, correct?
I just didn't do that part because we hold socket lock now.
> But then, isn't it already sufficient when you pass the bool itself down
> that 'demuxes' in this case between the sock_owned_by_user() vs
> lockdep_rtnl_is_held() check?
I just send out the patches, please have a look.
Thanks,
Hannes
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 5:43 ` Alexei Starovoitov
2016-03-31 11:35 ` Daniel Borkmann
@ 2016-03-31 12:12 ` Hannes Frederic Sowa
1 sibling, 0 replies; 20+ messages in thread
From: Hannes Frederic Sowa @ 2016-03-31 12:12 UTC (permalink / raw)
To: Alexei Starovoitov, Michal Kubecek
Cc: Daniel Borkmann, davem, sasha.levin, jslaby, eric.dumazet, mst,
netdev
On 31.03.2016 07:43, Alexei Starovoitov wrote:
> On Thu, Mar 31, 2016 at 07:22:32AM +0200, Michal Kubecek wrote:
>> On Wed, Mar 30, 2016 at 10:08:10PM -0700, Alexei Starovoitov wrote:
>>> On Thu, Mar 31, 2016 at 07:01:15AM +0200, Michal Kubecek wrote:
>>>> On Wed, Mar 30, 2016 at 06:18:42PM -0700, Alexei Starovoitov wrote:
>>>>>
>>>>> kinda heavy patch to shut up lockdep.
>>>>> Can we do
>>>>> old_fp = rcu_dereference_protected(sk->sk_filter,
>>>>> sock_owned_by_user(sk) || lockdep_rtnl_is_held());
>>>>> and it always be correct?
>>>>> I think right now tun is the only such user, but if it's correct
>>>>> for tun, it's correct for future users too. If not correct then
>>>>> not correct for tun either.
>>>>> Or I'm missing something?
>>>>
>>>> Already discussed here:
>>>>
>>>> http://thread.gmane.org/gmane.linux.kernel/2158069/focus=405853
>>>
>>> I saw that. My point above was challenging 'less accurate' part.
>>>
>> Daniel's point was that lockdep_rtnl_is_held() does not mean "we hold
>> RTNL" but "someone holds RTNL" so that some other task holding RTNL at
>> the moment could make the check happy even when called by someone
>> supposed to own the socket.
>
> Of course... and that is the case for all rtnl_dereference() calls...
> yet we're not paranoid about it.
lockdep_rtnl_is_held actually checks *current if the currently running
code actually has the lock, no?
Bye,
Hannes
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
2016-03-31 0:13 [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter Daniel Borkmann
2016-03-31 1:18 ` Alexei Starovoitov
@ 2016-03-31 9:15 ` Jiri Slaby
1 sibling, 0 replies; 20+ messages in thread
From: Jiri Slaby @ 2016-03-31 9:15 UTC (permalink / raw)
To: Daniel Borkmann, davem
Cc: alexei.starovoitov, mkubecek, sasha.levin, eric.dumazet, mst,
netdev
On 03/31/2016, 02:13 AM, Daniel Borkmann wrote:
> Sasha Levin reported a suspicious rcu_dereference_protected() warning
> found while fuzzing with trinity that is similar to this one:
>
> [ 52.765684] net/core/filter.c:2262 suspicious rcu_dereference_protected() usage!
> [ 52.765688] other info that might help us debug this:
> [ 52.765695] rcu_scheduler_active = 1, debug_locks = 1
> [ 52.765701] 1 lock held by a.out/1525:
> [ 52.765704] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff816a64b7>] rtnl_lock+0x17/0x20
> [ 52.765721] stack backtrace:
> [ 52.765728] CPU: 1 PID: 1525 Comm: a.out Not tainted 4.5.0+ #264
> [...]
> [ 52.765768] Call Trace:
> [ 52.765775] [<ffffffff813e488d>] dump_stack+0x85/0xc8
> [ 52.765784] [<ffffffff810f2fa5>] lockdep_rcu_suspicious+0xd5/0x110
> [ 52.765792] [<ffffffff816afdc2>] sk_detach_filter+0x82/0x90
> [ 52.765801] [<ffffffffa0883425>] tun_detach_filter+0x35/0x90 [tun]
> [ 52.765810] [<ffffffffa0884ed4>] __tun_chr_ioctl+0x354/0x1130 [tun]
> [ 52.765818] [<ffffffff8136fed0>] ? selinux_file_ioctl+0x130/0x210
> [ 52.765827] [<ffffffffa0885ce3>] tun_chr_ioctl+0x13/0x20 [tun]
> [ 52.765834] [<ffffffff81260ea6>] do_vfs_ioctl+0x96/0x690
> [ 52.765843] [<ffffffff81364af3>] ? security_file_ioctl+0x43/0x60
> [ 52.765850] [<ffffffff81261519>] SyS_ioctl+0x79/0x90
> [ 52.765858] [<ffffffff81003ba2>] do_syscall_64+0x62/0x140
> [ 52.765866] [<ffffffff817d563f>] entry_SYSCALL64_slow_path+0x25/0x25
>
> Same can be triggered with PROVE_RCU (+ PROVE_RCU_REPEATEDLY) enabled
> from tun_attach_filter() when user space calls ioctl(tun_fd, TUN{ATTACH,
> DETACH}FILTER, ...) for adding/removing a BPF filter on tap devices.
>
> Since the fix in f91ff5b9ff52 ("net: sk_{detach|attach}_filter() rcu
> fixes") sk_attach_filter()/sk_detach_filter() now dereferences the
> filter with rcu_dereference_protected(), checking whether socket lock
> is held in control path.
>
> Since its introduction in 994051625981 ("tun: socket filter support"),
> tap filters are managed under RTNL lock from __tun_chr_ioctl(). Thus the
> sock_owned_by_user(sk) doesn't apply in this specific case and therefore
> triggers the false positive.
>
> Extend the BPF API with __sk_attach_filter()/__sk_detach_filter() pair
> that is used by tap filters and pass in lockdep_rtnl_is_held() for the
> rcu_dereference_protected() checks instead.
It seems to be gone with this patch here.
thanks,
--
js
suse labs
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2016-03-31 23:31 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-03-31 0:13 [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter Daniel Borkmann
2016-03-31 1:18 ` Alexei Starovoitov
2016-03-31 5:01 ` Michal Kubecek
2016-03-31 5:08 ` Alexei Starovoitov
2016-03-31 5:22 ` Michal Kubecek
2016-03-31 5:43 ` Alexei Starovoitov
2016-03-31 11:35 ` Daniel Borkmann
2016-03-31 11:59 ` Eric Dumazet
2016-03-31 12:16 ` Daniel Borkmann
2016-03-31 19:21 ` David Miller
2016-03-31 19:24 ` Hannes Frederic Sowa
2016-03-31 19:31 ` Alexei Starovoitov
2016-03-31 19:48 ` David Miller
2016-03-31 19:36 ` David Miller
2016-03-31 19:48 ` Hannes Frederic Sowa
2016-03-31 19:50 ` David Miller
2016-03-31 21:52 ` Daniel Borkmann
2016-03-31 23:31 ` Hannes Frederic Sowa
2016-03-31 12:12 ` Hannes Frederic Sowa
2016-03-31 9:15 ` Jiri Slaby
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).