From: Vadim Fedorenko <vadim.fedorenko@linux.dev>
To: Gilad Naaman <gnaaman@drivenets.com>,
	Kuniyuki Iwashima <kuniyu@amazon.com>,
	"David S. Miller" <davem@davemloft.net>,
	David Ahern <dsahern@kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Simon Horman <horms@kernel.org>,
	netdev@vger.kernel.org
Subject: Re: [PATCH net-next v2] Avoid traversing addrconf hash on ifdown
Date: Sat, 9 Nov 2024 15:00:55 +0000
Message-ID: <ea009a4a-c9f2-4843-b84d-e6b72982228e@linux.dev>
In-Reply-To: <20241108052559.2926114-1-gnaaman@drivenets.com>

On 08/11/2024 05:25, Gilad Naaman wrote:
> struct inet6_dev already has a list of addresses owned by the device,
> enabling us to traverse this much shorter list, instead of scanning
> the entire hash-table.
> 
> Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
> ---
> Changes in v2:
>   - Remove double BH sections
>   - Styling fixes (extra {}, extra newline)
> ---
>   net/ipv6/addrconf.c | 38 +++++++++++++++++---------------------
>   1 file changed, 17 insertions(+), 21 deletions(-)
> 
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index d0a99710d65d..c6fbd634912a 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -3846,12 +3846,12 @@ static int addrconf_ifdown(struct net_device *dev, bool unregister)
>   {
>   	unsigned long event = unregister ? NETDEV_UNREGISTER : NETDEV_DOWN;
>   	struct net *net = dev_net(dev);
> -	struct inet6_dev *idev;
>   	struct inet6_ifaddr *ifa;
>   	LIST_HEAD(tmp_addr_list);
> +	struct inet6_dev *idev;
>   	bool keep_addr = false;
>   	bool was_ready;
> -	int state, i;
> +	int state;
>   
>   	ASSERT_RTNL();
>   
> @@ -3890,28 +3890,24 @@ static int addrconf_ifdown(struct net_device *dev, bool unregister)
>   	}
>   
>   	/* Step 2: clear hash table */
> -	for (i = 0; i < IN6_ADDR_HSIZE; i++) {
> -		struct hlist_head *h = &net->ipv6.inet6_addr_lst[i];
> +	read_lock_bh(&idev->lock);
> +	spin_lock(&net->ipv6.addrconf_hash_lock);
>   
> -		spin_lock_bh(&net->ipv6.addrconf_hash_lock);
> -restart:
> -		hlist_for_each_entry_rcu(ifa, h, addr_lst) {
> -			if (ifa->idev == idev) {
> -				addrconf_del_dad_work(ifa);
> -				/* combined flag + permanent flag decide if
> -				 * address is retained on a down event
> -				 */
> -				if (!keep_addr ||
> -				    !(ifa->flags & IFA_F_PERMANENT) ||
> -				    addr_is_local(&ifa->addr)) {
> -					hlist_del_init_rcu(&ifa->addr_lst);
> -					goto restart;
> -				}
> -			}
> -		}
> -		spin_unlock_bh(&net->ipv6.addrconf_hash_lock);
> +	list_for_each_entry(ifa, &idev->addr_list, if_list) {
> +		addrconf_del_dad_work(ifa);
> +
> +		/* combined flag + permanent flag decide if
> +		 * address is retained on a down event
> +		 */
> +		if (!keep_addr ||
> +		    !(ifa->flags & IFA_F_PERMANENT) ||
> +		    addr_is_local(&ifa->addr))
> +			hlist_del_init_rcu(&ifa->addr_lst);
>   	}
>   
> +	spin_unlock(&net->ipv6.addrconf_hash_lock);
> +	read_unlock_bh(&idev->lock);

Why is this read lock needed here? Holding the addrconf_hash_lock
spinlock prevents any RCU grace period from completing, so we can
safely traverse idev->addr_list with list_for_each_entry_rcu()...

> +
>   	write_lock_bh(&idev->lock);

If we are trying to protect idev->addr_list against additions, then we
have to extend the write_lock scope. Otherwise another thread may grab
the write lock between read_unlock and write_lock.

Am I missing something?
