From: "D. Wythe"
Date: Tue, 26 Sep 2023 17:06:06 +0800
Subject: Re: [PATCH net] net/smc: fix panic smc_tcp_syn_recv_sock() while closing listen socket
To: Alexandra Winter, Wenjia Zhang, kgraul@linux.ibm.com, jaka@linux.ibm.com
Cc: kuba@kernel.org, davem@davemloft.net, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-rdma@vger.kernel.org

On 9/26/23 3:18 PM, Alexandra Winter wrote:
>
> On 26.09.23 05:00, D. Wythe wrote:
>> You are right. The key point is how to ensure the validity of the smc sock
>> during the lifetime of the clc sock. If so, READ_ONCE is good enough.
>> Unfortunately, I found that there is no such guarantee, so it's still a
>> lifetime problem.
> Did you discover a scenario where the clc sock could live longer than the smc sock?
> Wouldn't that be a dangerous scenario in itself? I still have some hope that
> the lifetime of an smc socket is by design longer than that of the
> corresponding tcp socket.

Hi Alexandra,

Yes, there is. Consider this scenario:

tcp_v4_rcv(skb)
/* req sock */
reqsk = _inet_lookup_skb(skb)

/* listen sock */
sk = reqsk(reqsk)->rsk_listener;
sock_hold(sk);
tcp_check_req(sk)

                        smc_release /* release smc listen sock */
                        __smc_release
                            smc_close_active() /* smc_sk->sk_state = SMC_CLOSED; */
                            if (smc_sk->sk_state == SMC_CLOSED)
                                smc_clcsock_release();
                                    sock_release(clcsk); /* close clcsock */
                                        sock_put(sk); /* might not be the final refcnt */
                        sock_put(smc_sk) /* might be the final refcnt of smc_sock */

syn_recv_sock(sk...)
/* might be the final refcnt of tcp listen sock */
sock_put(sk);

Fortunately, this scenario only affects smc_syn_recv_sock and
smc_hs_congested. The other callbacks already take locks that protect smc,
which guarantees that sk_user_data is either NULL (set in smc_close_active)
or valid while the lock is held.

>> Considering the const, maybe we need to do:
>>
>> 1. hold a refcnt of smc_sock for syn_recv_sock to keep the smc sock valid
>>    during the lifetime of the clc sock
>> 2. put the refcnt of smc_sock in sk_destruct in tcp_sock to release that
>>    smc sock.
>>
>> In that way, we can always be sure the smc sock is valid during the
>> lifetime of the clc sock. Then we can use READ_ONCE rather than a lock.
>> What do you think?
> I am not sure I fully understand the details of what you propose to do. And it
> is not only syn_recv_sock(), right?
> You need to consider all relations between smc socks and tcp socks: fallback
> to tcp, initial creation, children of listen sockets, variants of shutdown, ...
> Preferably a single simple mechanism covers all situations. Maybe there is
> such a mechanism already today?
> (I don't think clcsock->sk->sk_user_data or sk_callback_lock provide this
> general coverage)
> If we really have a gap, a general refcnt'ing on smc sock could be a solution,
> but needs to be designed carefully.

You are right, we need to design it with care. We will try the referenced
solutions internally first, and I will also send some RFCs so that everyone
can track the latest progress and we can all agree on the result.

> Many thanks to you and the team to help make smc more stable and robust.

Our pleasure 😁. The stability of smc is important to us too.

Best wishes,
D. Wythe
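
For illustration only, the two-step refcount idea from the message (the child object holds a reference on its parent for its whole lifetime and drops it in its destructor) can be modeled in user space. This is a minimal sketch under stated assumptions, not kernel code and not the proposed patch: `parent_obj` stands in for the smc sock, `child_obj` for the clc (tcp) sock, and every name below is hypothetical. It uses C11 atomics in place of the kernel's sock_hold()/sock_put().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the smc sock (not struct smc_sock). */
struct parent_obj {
    atomic_int refcnt;
    bool *freed_flag;          /* lets a caller observe the free */
};

/* Hypothetical stand-in for the clc (tcp) sock. */
struct child_obj {
    atomic_int refcnt;
    struct parent_obj *parent; /* stays valid while the child exists */
};

static void parent_hold(struct parent_obj *p)
{
    atomic_fetch_add(&p->refcnt, 1);
}

static void parent_put(struct parent_obj *p)
{
    /* fetch_sub returns the old value: 1 means this was the last ref */
    if (atomic_fetch_sub(&p->refcnt, 1) == 1) {
        if (p->freed_flag)
            *p->freed_flag = true;
        free(p);
    }
}

static struct parent_obj *parent_new(bool *freed_flag)
{
    struct parent_obj *p = malloc(sizeof(*p));
    if (!p)
        return NULL;
    atomic_init(&p->refcnt, 1); /* the owner's reference */
    p->freed_flag = freed_flag;
    return p;
}

static struct child_obj *child_new(struct parent_obj *p)
{
    struct child_obj *c = malloc(sizeof(*c));
    if (!c)
        return NULL;
    atomic_init(&c->refcnt, 1);
    parent_hold(p);             /* step 1: child pins its parent */
    c->parent = p;
    return c;
}

static void child_hold(struct child_obj *c)
{
    atomic_fetch_add(&c->refcnt, 1);
}

static void child_put(struct child_obj *c)
{
    if (atomic_fetch_sub(&c->refcnt, 1) == 1) {
        /* step 2: the child's destructor drops the parent reference */
        parent_put(c->parent);
        free(c);
    }
}
```

In this model, a receive path that has taken child_hold() can still dereference c->parent after the release path has dropped both the owner's parent reference and its own child reference. The parent is freed only when the last child reference goes away.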