From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B84C5C64E90 for ; Tue, 1 Dec 2020 14:43:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6BB762084C for ; Tue, 1 Dec 2020 14:43:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391670AbgLAOnX (ORCPT ); Tue, 1 Dec 2020 09:43:23 -0500 Received: from correo.us.es ([193.147.175.20]:55930 "EHLO mail.us.es" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387963AbgLAOnX (ORCPT ); Tue, 1 Dec 2020 09:43:23 -0500 Received: from antivirus1-rhel7.int (unknown [192.168.2.11]) by mail.us.es (Postfix) with ESMTP id 290F11E2C7E for ; Tue, 1 Dec 2020 15:42:40 +0100 (CET) Received: from antivirus1-rhel7.int (localhost [127.0.0.1]) by antivirus1-rhel7.int (Postfix) with ESMTP id 1823BDA86C for ; Tue, 1 Dec 2020 15:42:40 +0100 (CET) Received: by antivirus1-rhel7.int (Postfix, from userid 99) id 0A679DA850; Tue, 1 Dec 2020 15:42:40 +0100 (CET) Received: from antivirus1-rhel7.int (localhost [127.0.0.1]) by antivirus1-rhel7.int (Postfix) with ESMTP id C760ADA7B6; Tue, 1 Dec 2020 15:42:37 +0100 (CET) Received: from 192.168.1.97 (192.168.1.97) by antivirus1-rhel7.int (F-Secure/fsigk_smtp/550/antivirus1-rhel7.int); Tue, 01 Dec 2020 15:42:37 +0100 (CET) X-Virus-Status: clean(F-Secure/fsigk_smtp/550/antivirus1-rhel7.int) Received: from us.es (unknown [90.77.255.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) (Authenticated sender: 1984lsi) by entrada.int (Postfix) with ESMTPSA id 9B63E4265A5A; Tue, 1 Dec 2020 15:42:37 +0100 (CET) Date: Tue, 1 Dec 2020 15:42:38 +0100 X-SMTPAUTHUS: auth mail.us.es From: Pablo Neira Ayuso To: Vladimir Oltean Cc: Jakub Kicinski , Eric Dumazet , Stephen Hemminger , netdev , Paul Gortmaker , Jiri Benc , Or Gerlitz , Cong Wang , Jamal Hadi Salim , Andrew Lunn , Florian Fainelli , fw@strlen.de Subject: Re: Correct usage of dev_base_lock in 2020 Message-ID: <20201201144238.GA5970@salvia> References: <20201129182435.jgqfjbekqmmtaief@skbuf> <20201129205817.hti2l4hm2fbp2iwy@skbuf> <20201129211230.4d704931@hermes.local> <20201130101405.73901b17@kicinski-fedora-pc1c0hjn.DHCP.thefacebook.com> <20201130184828.x56bwxxiwydsxt3k@skbuf> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20201130184828.x56bwxxiwydsxt3k@skbuf> User-Agent: Mutt/1.10.1 (2018-07-13) X-Virus-Scanned: ClamAV using ClamSMTP Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On Mon, Nov 30, 2020 at 08:48:28PM +0200, Vladimir Oltean wrote: [...] > There are 2 separate classes of problems: > - We already have two ways of protecting pure readers: via RCU and via > the rwlock. It doesn't help if we also add a second way of locking out > pure writers. We need to first clean up what we have. That's the > reason why I started the discussion about the rwlock. > - Some callers appear to not use any sort of protection at all. Does > this code path look ok to you? > > nfnetlink_rcv_batch > -> NFT_MSG_NEWCHAIN This path holds the nft commit mutex. > -> nf_tables_addchain > -> nft_chain_parse_hook > -> nft_chain_parse_netdev > -> nf_tables_parse_netdev_hooks > -> nft_netdev_hook_alloc > -> __dev_get_by_name > -> netdev_name_node_lookup: must be under RTNL mutex or dev_base_lock protection The nf_tables_netdev_event() notifier holds the nft commit mutex too. Assuming worst case, race between __dev_get_by_name() and device removal, the notifier waits for the NFT_MSG_NEWCHAIN path to finish. If the nf_tables_netdev_event() notifier wins race, then __dev_get_by_name() hits ENOENT. The idea is explained here: commit 90d2723c6d4cb2ace50fc3b932a2bcc77710450b Author: Pablo Neira Ayuso Date: Tue Mar 20 17:00:19 2018 +0100 netfilter: nf_tables: do not hold reference on netdevice from preparation phase The netfilter netdevice event handler hold the nfnl_lock mutex, this avoids races with a device going away while such device is being attached to hooks from the netlink control plane. Therefore, either control plane bails out with ENOENT or netdevice event path waits until the hook that is attached to net_device is registered. I can submit a patch adding a comment so anyone else does not get confused :-)