Netdev List
 help / color / mirror / Atom feed
From: Stanislav Fomichev <sdf.kernel@gmail.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: davem@davemloft.net, netdev@vger.kernel.org, edumazet@google.com,
	 pabeni@redhat.com, andrew+netdev@lunn.ch, horms@kernel.org,
	jv@jvosburgh.net,  sdf@fomichev.me, dongchenchen2@huawei.com,
	idosch@nvidia.com, n05ec@lzu.edu.cn,  yuantan098@gmail.com,
	kuniyu@google.com, nb@tipi-net.de, aleksandr.loktionov@intel.com,
	 dtatulea@nvidia.com,
	syzbot+09da62a8b78959ceb8bb@syzkaller.appspotmail.com,
	 syzbot+cb67c392b0b8f0fd0fc1@syzkaller.appspotmail.com,
	syzbot+9bb8bd77f3966641f298@syzkaller.appspotmail.com
Subject: Re: [PATCH net 3/4] vlan: defer real device state propagation to netdev_work
Date: Thu, 25 Jun 2026 08:49:24 -0700	[thread overview]
Message-ID: <aj1N_W_RwvNxPrLk@devvm7509.cco0.facebook.com> (raw)
In-Reply-To: <20260624182018.2445732-4-kuba@kernel.org>

On 06/24, Jakub Kicinski wrote:
> vlan_device_event() generates nested UP/DOWN, MTU and feature
> change events. It executes an event for the VLAN device directly
> from the notifier - while the locks of the lower device are held.
> 
> This causes deadlocks, for example:
> 
>   bond    (3) bond_update_speed_duplex(vlan)
>     |           ^                v
>   vlan    (2) UP(vlan)    (4) vlan_ethtool_get_link_ksettings()
>     |           ^                v
>   dummy   (1) UP(dummy)   (5) __ethtool_get_link_ksettings()
> 
> The dummy device is ops locked, vlan creates a nested event (2),
> then bond wants to ask vlan for link state (3). bond uses the
> "I'm already holding the instance lock" flavor of API. But in
> this case the lock held refers to vlan itself. We hit vlan's
> link settings trampoline (4) and call __ethtool_get_link_ksettings()
> which tries to lock dummy. Deadlock. There's no clean way for us
> to tell the vlan_ethtool_get_link_ksettings() that the caller
> is already in lower device's critical section.
> 
> Defer the propagation to the per-netdev work facility instead:
> the notifier only schedules netdev_work_sched(vlandev, VLAN_WORK_*),
> and ndo_work (vlan_dev_work) applies the change later. Hopefully
> nobody expects the VLAN state changes to be instantaneous.
> 
> If someone does expect the changes to be instantaneous we will
> have to do the same thing Stan did for rx_mode and "strategically"
> place sync calls, to make sure such delayed works are executed
> after we drop the ops lock but before we drop rtnl_lock.
> 
> Stan suggests that if we need that down the line we may
> consider reshaping the mechanism into "async notifications".
> AFAICT only vlan does this sort of netdev open chaining,
> so as a first try I think that sticking the complexity into
> the vlan code makes sense.
> 
> One corner case is that we need to cancel the event if user
> explicitly changes the state before work could run. Consider
> the following operations with vlan0 on top of dummy0:
> 
>   ip link set dev dummy0 up    # queues work to up vlan0
>   ip link set dev vlan0 down   # user explicitly downs the vlan
>   ndo_work                     # acts on the stale event
> 
> Reported-by: syzbot+09da62a8b78959ceb8bb@syzkaller.appspotmail.com
> Reported-by: syzbot+cb67c392b0b8f0fd0fc1@syzkaller.appspotmail.com
> Reported-by: syzbot+9bb8bd77f3966641f298@syzkaller.appspotmail.com
> Fixes: 9f275c2e9020 ("net: ethtool: make sure __ethtool_get_link_ksettings() is ops-locked")
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Acked-by: Stanislav Fomichev <sdf@fomichev.me>

  parent reply	other threads:[~2026-06-25 16:05 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-24 18:20 [PATCH net 0/4] net: avoid nested UP notifier events Jakub Kicinski
2026-06-24 18:20 ` [PATCH net 1/4] net: turn the rx_mode work into a generic netdev_work facility Jakub Kicinski
2026-06-25  5:55   ` Kuniyuki Iwashima
2026-06-25 15:17     ` Jakub Kicinski
2026-06-25 15:48   ` Stanislav Fomichev
2026-06-24 18:20 ` [PATCH net 2/4] net: add the driver-facing netdev_work scheduling API Jakub Kicinski
2026-06-25  5:55   ` Kuniyuki Iwashima
2026-06-25 15:48   ` Stanislav Fomichev
2026-06-24 18:20 ` [PATCH net 3/4] vlan: defer real device state propagation to netdev_work Jakub Kicinski
2026-06-25  5:57   ` Kuniyuki Iwashima
2026-06-25 11:37   ` Nicolai Buchwitz
2026-06-25 15:49   ` Stanislav Fomichev [this message]
2026-06-24 18:20 ` [PATCH net 4/4] selftests: bonding: add a test for VLAN propagation over a bonded real device Jakub Kicinski
2026-06-25  8:41   ` Loktionov, Aleksandr
2026-06-25 15:49   ` Stanislav Fomichev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aj1N_W_RwvNxPrLk@devvm7509.cco0.facebook.com \
    --to=sdf.kernel@gmail.com \
    --cc=aleksandr.loktionov@intel.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=dongchenchen2@huawei.com \
    --cc=dtatulea@nvidia.com \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=idosch@nvidia.com \
    --cc=jv@jvosburgh.net \
    --cc=kuba@kernel.org \
    --cc=kuniyu@google.com \
    --cc=n05ec@lzu.edu.cn \
    --cc=nb@tipi-net.de \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@fomichev.me \
    --cc=syzbot+09da62a8b78959ceb8bb@syzkaller.appspotmail.com \
    --cc=syzbot+9bb8bd77f3966641f298@syzkaller.appspotmail.com \
    --cc=syzbot+cb67c392b0b8f0fd0fc1@syzkaller.appspotmail.com \
    --cc=yuantan098@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox