All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Ahern <dsahern@kernel.org>
To: Vladimir Oltean <vladimir.oltean@nxp.com>, netdev@vger.kernel.org
Cc: Jakub Kicinski <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>, Ido Schimmel <idosch@nvidia.com>
Subject: Re: [RFC PATCH net] net: ipv4/ipv6 addrconf: call igmp{,6}_group_dropped() while dev is still up
Date: Sun, 9 Apr 2023 20:08:01 -0600	[thread overview]
Message-ID: <febbbc75-2cf5-1cf9-8ed9-6a42ff295ab9@kernel.org> (raw)
In-Reply-To: <20230406233058.780721-1-vladimir.oltean@nxp.com>

[ cc Ido in case such a change has implications to mlxsw ]

On 4/6/23 5:30 PM, Vladimir Oltean wrote:
> ipv4 devinet calls ip_mc_down(), and ipv6 calls addrconf_ifdown(), and
> both of these eventually result in calls to dev_mc_del(), either through
> igmp_group_dropped() or igmp6_group_dropped().
> 
> The problem is that dev_mc_del() does call __dev_set_rx_mode(), but this
> will not propagate all the way to the ndo_set_rx_mode() of the device,
> because of this check:
> 
> 	/* dev_open will call this function so the list will stay sane. */
> 	if (!(dev->flags&IFF_UP))
> 		return;
> 
> and the NETDEV_DOWN notifier is emitted while the interface is already
> down. OTOH we have NETDEV_GOING_DOWN which is emitted a bit earlier -
> see:
> 
> dev_close_many()
> -> __dev_close_many()
>    -> call_netdevice_notifiers(NETDEV_GOING_DOWN, dev);
>    -> dev->flags &= ~IFF_UP;
> -> call_netdevice_notifiers(NETDEV_DOWN, dev);
> 
> Normally this oversight is easy to miss, because the addresses aren't
> lost, just not synced to the device until the next up event.
> 
> DSA does some processing in its dsa_slave_set_rx_mode(), and assumes
> that all addresses that were synced are also unsynced by the time the
> device is unregistered. Due to that assumption not being satisfied,
> the WARN_ON(!list_empty(&dp->mdbs)); from dsa_switch_release_ports()
> triggers, and we leak memory corresponding to the multicast addresses
> that were never synced.
> 
> Minimal reproducer:
> ip link set swp0 up
> ip link set swp0 down
> echo 0000:00:00.5 > /sys/bus/pci/drivers/mscc_felix/unbind
> 
> The proposal is to respond to that slightly earlier notifier with the
> IGMP address deletion, so that the ndo_set_rx_mode() of the device does
> actually get called. I am not familiar with the details of these layers,
> but it appeared to me that NETDEV_DOWN needed to be replaced everywhere
> with NETDEV_GOING_DOWN, so I blindly did that and it worked.
> 
> Fixes: 5e8a1e03aa4d ("net: dsa: install secondary unicast and multicast addresses as host FDB/MDB")
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
> Obviously DSA is not the only affected driver, but the extent to which
> other drivers are impacted is not obvious to me. At least in DSA, there
> is a WARN_ON() and a memory leak, so this is why I chose that Fixes tag.
> 
>  net/ipv4/devinet.c  |  7 ++++---
>  net/ipv6/addrconf.c | 12 ++++++------
>  2 files changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index 5deac0517ef7..95690d16d651 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -392,7 +392,8 @@ static void __inet_del_ifa(struct in_device *in_dev,
>  
>  				rtmsg_ifa(RTM_DELADDR, ifa, nlh, portid);
>  				blocking_notifier_call_chain(&inetaddr_chain,
> -						NETDEV_DOWN, ifa);
> +							     NETDEV_GOING_DOWN,
> +							     ifa);
>  				inet_free_ifa(ifa);
>  			} else {
>  				promote = ifa;
> @@ -429,7 +430,7 @@ static void __inet_del_ifa(struct in_device *in_dev,
>  	   So that, this order is correct.
>  	 */
>  	rtmsg_ifa(RTM_DELADDR, ifa1, nlh, portid);
> -	blocking_notifier_call_chain(&inetaddr_chain, NETDEV_DOWN, ifa1);
> +	blocking_notifier_call_chain(&inetaddr_chain, NETDEV_GOING_DOWN, ifa1);
>  
>  	if (promote) {
>  		struct in_ifaddr *next_sec;
> @@ -1588,7 +1589,7 @@ static int inetdev_event(struct notifier_block *this, unsigned long event,
>  		/* Send gratuitous ARP to notify of link change */
>  		inetdev_send_gratuitous_arp(dev, in_dev);
>  		break;
> -	case NETDEV_DOWN:
> +	case NETDEV_GOING_DOWN:
>  		ip_mc_down(in_dev);
>  		break;
>  	case NETDEV_PRE_TYPE_CHANGE:
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index 3797917237d0..9e484f829f1c 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -1307,7 +1307,7 @@ static void ipv6_del_addr(struct inet6_ifaddr *ifp)
>  
>  	ipv6_ifa_notify(RTM_DELADDR, ifp);
>  
> -	inet6addr_notifier_call_chain(NETDEV_DOWN, ifp);
> +	inet6addr_notifier_call_chain(NETDEV_GOING_DOWN, ifp);
>  
>  	if (action != CLEANUP_PREFIX_RT_NOP) {
>  		cleanup_prefix_route(ifp, expires,
> @@ -3670,12 +3670,12 @@ static int addrconf_notify(struct notifier_block *this, unsigned long event,
>  		}
>  		break;
>  
> -	case NETDEV_DOWN:
> +	case NETDEV_GOING_DOWN:
>  	case NETDEV_UNREGISTER:
>  		/*
>  		 *	Remove all addresses from this interface.
>  		 */
> -		addrconf_ifdown(dev, event != NETDEV_DOWN);
> +		addrconf_ifdown(dev, event != NETDEV_GOING_DOWN);
>  		break;
>  
>  	case NETDEV_CHANGENAME:
> @@ -3741,7 +3741,7 @@ static bool addr_is_local(const struct in6_addr *addr)
>  
>  static int addrconf_ifdown(struct net_device *dev, bool unregister)
>  {
> -	unsigned long event = unregister ? NETDEV_UNREGISTER : NETDEV_DOWN;
> +	unsigned long event = unregister ? NETDEV_UNREGISTER : NETDEV_GOING_DOWN;
>  	struct net *net = dev_net(dev);
>  	struct inet6_dev *idev;
>  	struct inet6_ifaddr *ifa;
> @@ -3877,7 +3877,7 @@ static int addrconf_ifdown(struct net_device *dev, bool unregister)
>  
>  		if (state != INET6_IFADDR_STATE_DEAD) {
>  			__ipv6_ifa_notify(RTM_DELADDR, ifa);
> -			inet6addr_notifier_call_chain(NETDEV_DOWN, ifa);
> +			inet6addr_notifier_call_chain(NETDEV_GOING_DOWN, ifa);
>  		} else {
>  			if (idev->cnf.forwarding)
>  				addrconf_leave_anycast(ifa);
> @@ -6252,7 +6252,7 @@ static void dev_disable_change(struct inet6_dev *idev)
>  
>  	netdev_notifier_info_init(&info, idev->dev);
>  	if (idev->cnf.disable_ipv6)
> -		addrconf_notify(NULL, NETDEV_DOWN, &info);
> +		addrconf_notify(NULL, NETDEV_GOING_DOWN, &info);
>  	else
>  		addrconf_notify(NULL, NETDEV_UP, &info);
>  }


  reply	other threads:[~2023-04-10  2:08 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-06 23:30 [RFC PATCH net] net: ipv4/ipv6 addrconf: call igmp{,6}_group_dropped() while dev is still up Vladimir Oltean
2023-04-10  2:08 ` David Ahern [this message]
2023-04-10  7:55   ` Ido Schimmel
2023-04-10 10:09     ` Vladimir Oltean
2023-04-10 11:43       ` Ido Schimmel
2023-04-10 17:07         ` Vladimir Oltean

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=febbbc75-2cf5-1cf9-8ed9-6a42ff295ab9@kernel.org \
    --to=dsahern@kernel.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=idosch@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=vladimir.oltean@nxp.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.