From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sowmini Varadhan
Subject: Re: [Patch net] rds: mark bound socket with SOCK_RCU_FREE
Date: Mon, 10 Sep 2018 19:30:07 -0400
Message-ID: <20180910233007.GJ4668@oracle.com>
References: <20180910222422.19470-1-xiyou.wangcong@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Cong Wang, netdev@vger.kernel.org, rds-devel@oss.oracle.com
To: Santosh Shilimkar
Return-path:
Received: from userp2120.oracle.com ([156.151.31.85]:54620 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726143AbeIKE0g (ORCPT ); Tue, 11 Sep 2018 00:26:36 -0400
Content-Disposition: inline
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

On (09/10/18 15:43), Santosh Shilimkar wrote:
> On 9/10/2018 3:24 PM, Cong Wang wrote:
> >When a rds sock is bound, it is inserted into the bind_hash_table
> >which is protected by RCU. But when releasing rds sock, after it
> >is removed from this hash table, it is freed immediately without
> >respecting RCU grace period. This could cause some use-after-free
> >as reported by syzbot.
> >
> Indeed.
>
> >Mark the rds sock as SOCK_RCU_FREE before inserting it into the
> >bind_hash_table, so that it would be always freed after a RCU grace
> >period.

So I'm not sure I understand.

Yes, Cong's fix may eliminate *some* of the syzbot failures, but the
basic problem is not solved.

To take one example of possible races (one that was discussed in
https://www.spinics.net/lists/netdev/msg475074.html)
rds_recv_incoming->rds_find_bound is being called in rds_send_worker
context and the rds_find_bound code is

     63         rs = rhashtable_lookup_fast(&bind_hash_table, &key, ht_parms);
     64         if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
     65                 rds_sock_addref(rs);
     66         else
     67                 rs = NULL;
     68

After we find an rs at line 63, how can we be sure that the entire
logic of rds_release does not execute on another cpu, and free the rs,
before we hit line 64 with the bad rs?
Normally synchronize_rcu() or synchronize_net() in rds_release() would
ensure this. How do we ensure this with SOCK_RCU_FREE (or is the
intention to just reduce *some* of the syzbot failures)?

--Sowmini
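
To make the window between line 63 and line 64 concrete, here is a toy,
single-threaded replay of that interleaving. It is a userspace sketch and
not the RDS code: toy_sock, table_slot, pending_free, readers and every
toy_* helper are hypothetical names, the "grace period" is modelled as the
moment the last reader leaves, and the model assumes the reader stays
inside a read-side section from the lookup through the flag check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an rds_sock; "freed" marks memory that would be gone. */
struct toy_sock {
	bool dead;   /* models SOCK_DEAD */
	bool freed;  /* models the free having already run */
	int refcnt;
};

static struct toy_sock *table_slot;   /* one-entry stand-in for the hash table */
static struct toy_sock *pending_free; /* object waiting for a grace period */
static int readers;                   /* readers inside a read-side section */

static void toy_free(struct toy_sock *s) { s->freed = true; }

static void read_lock(void) { readers++; }

/* In this model the grace period ends when the last reader leaves. */
static void read_unlock(void)
{
	assert(readers > 0);
	if (--readers == 0 && pending_free) {
		toy_free(pending_free);
		pending_free = NULL;
	}
}

/* Models the release path: unhash, mark dead, then free now or later. */
static void toy_release(struct toy_sock *s, bool deferred_free)
{
	table_slot = NULL;
	s->dead = true;
	if (deferred_free && readers > 0)
		pending_free = s;   /* free is postponed past the reader */
	else
		toy_free(s);        /* immediate free, no grace period */
}

/*
 * Replays: lookup, release on "another cpu", then the flag check.
 * Returns true if the reader would have touched freed memory.
 */
static bool reader_saw_freed_sock(bool deferred_free)
{
	struct toy_sock sk = { .dead = false, .freed = false, .refcnt = 1 };
	bool touched_freed;

	table_slot = &sk;
	pending_free = NULL;
	readers = 0;

	read_lock();
	struct toy_sock *rs = table_slot;   /* "line 63": lookup */
	toy_release(rs, deferred_free);     /* release races in here */
	touched_freed = rs->freed;          /* "line 64" would read rs here */
	if (!touched_freed && !rs->dead)
		rs->refcnt++;               /* "line 65": addref */
	read_unlock();
	return touched_freed;
}
```

In this model reader_saw_freed_sock(false) reports that the reader hit
freed memory, and reader_saw_freed_sock(true) does not, but only because
the reader is kept inside the read-side section across all three steps.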