* [PATCH] rds: fix use-after-free read in rds_find_bound
@ 2017-12-30 19:36 Santosh Shilimkar
2017-12-30 20:26 ` Sowmini Varadhan
0 siblings, 1 reply; 7+ messages in thread
From: Santosh Shilimkar @ 2017-12-30 19:36 UTC (permalink / raw)
To: netdev, davem; +Cc: linux-kernel, Santosh Shilimkar
socket buffer can get freed as part of sock_close
callback so before adding reference check underneath
socket validity.
Reported-by: syzbot+93a5839deb355537440f@syzkaller.appspotmail.com
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
---
net/rds/bind.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/rds/bind.c b/net/rds/bind.c
index 75d43dc..8dec06e 100644
--- a/net/rds/bind.c
+++ b/net/rds/bind.c
@@ -61,7 +61,7 @@ struct rds_sock *rds_find_bound(__be32 addr, __be16 port)
struct rds_sock *rs;
rs = rhashtable_lookup_fast(&bind_hash_table, &key, ht_parms);
- if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
+ if (rs && rds_rs_to_sk(rs) && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
rds_sock_addref(rs);
else
rs = NULL;
--
1.9.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] rds: fix use-after-free read in rds_find_bound
2017-12-30 19:36 [PATCH] rds: fix use-after-free read in rds_find_bound Santosh Shilimkar
@ 2017-12-30 20:26 ` Sowmini Varadhan
2017-12-30 21:37 ` santosh.shilimkar
0 siblings, 1 reply; 7+ messages in thread
From: Sowmini Varadhan @ 2017-12-30 20:26 UTC (permalink / raw)
To: Santosh Shilimkar; +Cc: netdev, davem
On (12/30/17 11:36), Santosh Shilimkar wrote:
>
> socket buffer can get freed as part of sock_close
> callback so before adding reference check underneath
> socket validity.
I'm not sure I understand this fix-
struct rds_sock is:
struct rds_sock {
struct sock rs_sk;
:
}
How can rs be non-null but rds_rs_to_sk() is null? (Note that
rds_rs_to_sk just returns &rs->rs_sk) so the changed line is
identical to the original line.
> - if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
> + if (rs && rds_rs_to_sk(rs) && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
I think the real issue is refcount bug somewhere,
Was the syzbot test run with http://patchwork.ozlabs.org/patch/852492/
this sounds like that type of bug.
--Sowmini
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] rds: fix use-after-free read in rds_find_bound
2017-12-30 20:26 ` Sowmini Varadhan
@ 2017-12-30 21:37 ` santosh.shilimkar
2017-12-30 22:32 ` Sowmini Varadhan
0 siblings, 1 reply; 7+ messages in thread
From: santosh.shilimkar @ 2017-12-30 21:37 UTC (permalink / raw)
To: Sowmini Varadhan; +Cc: netdev, davem
On 12/30/17 12:26 PM, Sowmini Varadhan wrote:
> On (12/30/17 11:36), Santosh Shilimkar wrote:
>>
>> socket buffer can get freed as part of sock_close
>> callback so before adding reference check underneath
>> socket validity.
>
> I'm not sure I understand this fix-
>
> struct rds_sock is:
> struct rds_sock {
> struct sock rs_sk;
> :
> }
>
> How can rs be non-null but rds_rs_to_sk() is null? (Note that
> rds_rs_to_sk just returns &rs->rs_sk) so the changed line is
> identical to the original line.
>
Well thats what the report says o.w flag test wouldn't have
been attempted.
>> - if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
>> + if (rs && rds_rs_to_sk(rs) && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
>
> I think the real issue is refcount bug somewhere,
>
Thats what I thought as well initially but since the reported case,
the rs seems to be valid where as sk seems to be freed up as part of
sock_release callback.
> Was the syzbot test run with http://patchwork.ozlabs.org/patch/852492/
> this sounds like that type of bug.
>
That fix scenario, the rs don't get inserted in hash table and
in this particular bug, the lookup was successful so am not sure
if these two bugs are related.
But since bound address fix was still not part of the build
reproduced use after free bug, $subject fix can wait for next
reproduction. Unfortunately as per the report, there is no
reproducer for it to test if other fix fixes this issue.
Regards,
Santosh
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] rds: fix use-after-free read in rds_find_bound
2017-12-30 21:37 ` santosh.shilimkar
@ 2017-12-30 22:32 ` Sowmini Varadhan
2017-12-31 5:09 ` santosh.shilimkar
0 siblings, 1 reply; 7+ messages in thread
From: Sowmini Varadhan @ 2017-12-30 22:32 UTC (permalink / raw)
To: santosh.shilimkar@oracle.com; +Cc: netdev, davem
On (12/30/17 13:37), santosh.shilimkar@oracle.com wrote:
> Well thats what the report says o.w flag test wouldn't have
> been attempted.
the bug report says "use-after-free".
It doesnt say that rds_rs_to_sk(rs) is null (if rds_rs_to_sk(rs) was null,
rs would also be null, please cscope struct rds_sock)
What the bug report says is
" The buggy address belongs to the object at ffff8801c09a6080
which belongs to the cache RDS of size 1472
The buggy address is located 96 bytes inside of .."
96 is the offset of sk->sk_flags. so yes, there is a socket refcount
issue.
But the patch you sent (see next two lines) will not solve that.
> >>- if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
> >>+ if (rs && rds_rs_to_sk(rs) && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
Sowmini>I think the real issue is refcount bug somewhere,
> Thats what I thought as well initially but since the reported case,
> the rs seems to be valid where as sk seems to be freed up as part of
> sock_release callback.
I dont understand the statement above- how can "rs be valid, and sk
be freed"?
rs_sk is embedded in the struct rds_sock, it is not a pointer.
let's find and fix the refcount bug. See stack trace in commit comment.
The socket release is happening prematurely and existing WARN_ONs
are not catching it.
> >Was the syzbot test run with http://patchwork.ozlabs.org/patch/852492/
> >this sounds like that type of bug.
--Sowmini
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] rds: fix use-after-free read in rds_find_bound
2017-12-30 22:32 ` Sowmini Varadhan
@ 2017-12-31 5:09 ` santosh.shilimkar
2017-12-31 12:33 ` Sowmini Varadhan
0 siblings, 1 reply; 7+ messages in thread
From: santosh.shilimkar @ 2017-12-31 5:09 UTC (permalink / raw)
To: Sowmini Varadhan; +Cc: netdev, davem
On 12/30/17 2:32 PM, Sowmini Varadhan wrote:
> On (12/30/17 13:37), santosh.shilimkar@oracle.com wrote:
[...]
>> Thats what I thought as well initially but since the reported case,
>> the rs seems to be valid where as sk seems to be freed up as part of
>> sock_release callback.
>
> I dont understand the statement above- how can "rs be valid, and sk
> be freed"?
>
> rs_sk is embedded in the struct rds_sock, it is not a pointer.
>
I was going with order of evaluation of if () but you made good point.
rs_sk isn't a pointer so sk can't be null.
> let's find and fix the refcount bug. See stack trace in commit comment.
> The socket release is happening prematurely and existing WARN_ONs
> are not catching it.
>
Right. This was loop transport in action so xmit will just flip
the direction with receive. And rds_recv_incoming() can race with
socket_release. rds_find_bound() is suppose to add ref count on
socket for rds_recv_incoming() but by that time socket is DEAD &
freed by socket release callback.
And rds_release is suppose to be synced with rs_recv_lock. But
release callback is marking the sk orphan before syncing
up with receive path and updating the bind table. Probably it
can pushed down further after the socket clean up buut need
to think bit more.
diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c
index b405f77..11e1426 100644
--- a/net/rds/af_rds.c
+++ b/net/rds/af_rds.c
@@ -65,7 +65,6 @@ static int rds_release(struct socket *sock)
rs = rds_sk_to_rs(sk);
- sock_orphan(sk);
/* Note - rds_clear_recv_queue grabs rs_recv_lock, so
* that ensures the recv path has completed messing
* with the socket. */
@@ -85,6 +84,7 @@ static int rds_release(struct socket *sock)
rds_trans_put(rs->rs_transport);
+ sock_orphan(sk);
sock->sk = NULL;
sock_put(sk);
out:
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH] rds: fix use-after-free read in rds_find_bound
2017-12-31 5:09 ` santosh.shilimkar
@ 2017-12-31 12:33 ` Sowmini Varadhan
2017-12-31 22:30 ` santosh.shilimkar
0 siblings, 1 reply; 7+ messages in thread
From: Sowmini Varadhan @ 2017-12-31 12:33 UTC (permalink / raw)
To: santosh.shilimkar@oracle.com; +Cc: netdev, davem
On (12/30/17 21:09), santosh.shilimkar@oracle.com wrote:
> Right. This was loop transport in action so xmit will just flip
> the direction with receive. And rds_recv_incoming() can race with
> socket_release. rds_find_bound() is suppose to add ref count on
> socket for rds_recv_incoming() but by that time socket is DEAD &
> freed by socket release callback.
correct, that makes sense.
> And rds_release is suppose to be synced with rs_recv_lock. But
> release callback is marking the sk orphan before syncing
> up with receive path and updating the bind table. Probably it
> can pushed down further after the socket clean up buut need
> to think bit more.
However, I'm not sure this seals the race.. according to the
bug report rds_recv_incoming->rds_find_bound is being called
in rds_send_worker context and the rds_find_bound code is
63 rs = rhashtable_lookup_fast(&bind_hash_table, &key, ht_parms);
64 if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
65 rds_sock_addref(rs);
66 else
67 rs = NULL;
68
Since the entire logic of rds_release can interleave between line 63
and 64, (whereas we only addref at line 65), moving the sock_orphan
will not help.
I see that there was an explicic synchornization via the bucket->lock
before 7b5654349e. I think you need something like that, or some type
or rcu-based hash list.
Patch below may make race-window smaller, but race window is still there.
>
>
> diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c
> index b405f77..11e1426 100644
> --- a/net/rds/af_rds.c
> +++ b/net/rds/af_rds.c
> @@ -65,7 +65,6 @@ static int rds_release(struct socket *sock)
>
> rs = rds_sk_to_rs(sk);
>
> - sock_orphan(sk);
> /* Note - rds_clear_recv_queue grabs rs_recv_lock, so
> * that ensures the recv path has completed messing
> * with the socket. */
> @@ -85,6 +84,7 @@ static int rds_release(struct socket *sock)
>
> rds_trans_put(rs->rs_transport);
>
> + sock_orphan(sk);
> sock->sk = NULL;
> sock_put(sk);
> out:
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] rds: fix use-after-free read in rds_find_bound
2017-12-31 12:33 ` Sowmini Varadhan
@ 2017-12-31 22:30 ` santosh.shilimkar
0 siblings, 0 replies; 7+ messages in thread
From: santosh.shilimkar @ 2017-12-31 22:30 UTC (permalink / raw)
To: Sowmini Varadhan; +Cc: netdev, davem
On 12/31/17 4:33 AM, Sowmini Varadhan wrote:
> On (12/30/17 21:09), santosh.shilimkar@oracle.com wrote:
>> Right. This was loop transport in action so xmit will just flip
>> the direction with receive. And rds_recv_incoming() can race with
>> socket_release. rds_find_bound() is suppose to add ref count on
>> socket for rds_recv_incoming() but by that time socket is DEAD &
>> freed by socket release callback.
>
> correct, that makes sense.
>
Yea. In fact the earlier point of sk being null and rs not is
also possible because socket release explicitly marks its
NULL ("sock->sk = NULL"). But it just side effect of the actual
race.
>> And rds_release is suppose to be synced with rs_recv_lock. But
>> release callback is marking the sk orphan before syncing
>> up with receive path and updating the bind table. Probably it
>> can pushed down further after the socket clean up buut need
>> to think bit more.
>
> However, I'm not sure this seals the race.. according to the
> bug report rds_recv_incoming->rds_find_bound is being called
> in rds_send_worker context and the rds_find_bound code is
>
> 63 rs = rhashtable_lookup_fast(&bind_hash_table, &key, ht_parms);
> 64 if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
> 65 rds_sock_addref(rs);
> 66 else
> 67 rs = NULL;
> 68
>
> Since the entire logic of rds_release can interleave between line 63
> and 64, (whereas we only addref at line 65), moving the sock_orphan
> will not help.
>
> I see that there was an explicic synchornization via the bucket->lock
> before 7b5654349e. I think you need something like that, or some type
> or rcu-based hash list.
>
The rhashtable already has internal bucket lock so those operation like
add/remove are synced up. But yes reference addition can still race
with receive since receive lock is taken after find bound.
> Patch below may make race-window smaller, but race window is still there.
>If the receive lock is taken ahead then with sock_orphan
change socket release will get synchronized with receive. But
preventing socket release to be getting triggered while in receive
path by means ref count is better option. Moving socket_orphan
down is anyway a good change but its not enough. Will think bit
more about it.
Thanks for the good discussion.
Regards,
Santosh
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2017-12-31 22:30 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-12-30 19:36 [PATCH] rds: fix use-after-free read in rds_find_bound Santosh Shilimkar
2017-12-30 20:26 ` Sowmini Varadhan
2017-12-30 21:37 ` santosh.shilimkar
2017-12-30 22:32 ` Sowmini Varadhan
2017-12-31 5:09 ` santosh.shilimkar
2017-12-31 12:33 ` Sowmini Varadhan
2017-12-31 22:30 ` santosh.shilimkar
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).