All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nicolai Buchwitz <nb@tipi-net.de>
To: Jakub Kicinski <kuba@kernel.org>
Cc: davem@davemloft.net, netdev@vger.kernel.org, edumazet@google.com,
	pabeni@redhat.com, andrew+netdev@lunn.ch, horms@kernel.org,
	jv@jvosburgh.net, sdf@fomichev.me, dongchenchen2@huawei.com,
	idosch@nvidia.com, n05ec@lzu.edu.cn, yuantan098@gmail.com,
	kuniyu@google.com, aleksandr.loktionov@intel.com,
	dtatulea@nvidia.com,
	syzbot+09da62a8b78959ceb8bb@syzkaller.appspotmail.com,
	syzbot+cb67c392b0b8f0fd0fc1@syzkaller.appspotmail.com,
	syzbot+9bb8bd77f3966641f298@syzkaller.appspotmail.com
Subject: Re: [PATCH net 3/4] vlan: defer real device state propagation to netdev_work
Date: Thu, 25 Jun 2026 13:37:31 +0200	[thread overview]
Message-ID: <dd271c70bbcdedf79940e013523451b7@tipi-net.de> (raw)
In-Reply-To: <20260624182018.2445732-4-kuba@kernel.org>

On 24.6.2026 20:20, Jakub Kicinski wrote:
> vlan_device_event() generates nested UP/DOWN, MTU and feature
> change events. It executes an event for the VLAN device directly
> from the notifier - while the locks of the lower device are held.
> 
> This causes deadlocks, for example:
> 
>   bond    (3) bond_update_speed_duplex(vlan)
>     |           ^                v
>   vlan    (2) UP(vlan)    (4) vlan_ethtool_get_link_ksettings()
>     |           ^                v
>   dummy   (1) UP(dummy)   (5) __ethtool_get_link_ksettings()
> 
> The dummy device is ops locked, vlan creates a nested event (2),
> then bond wants to ask vlan for link state (3). bond uses the
> "I'm already holding the instance lock" flavor of API. But in
> this case the lock held refers to vlan itself. We hit vlan's
> link settings trampoline (4) and call __ethtool_get_link_ksettings()
> which tries to lock dummy. Deadlock. There's no clean way for us
> to tell the vlan_ethtool_get_link_ksettings() that the caller
> is already in lower device's critical section.
> 
> Defer the propagation to the per-netdev work facility instead:
> the notifier only schedules netdev_work_sched(vlandev, VLAN_WORK_*),
> and ndo_work (vlan_dev_work) applies the change later. Hopefully
> nobody expects the VLAN state changes to be instantaneous.
> 
> If someone does expect the changes to be instantaneous we will
> have to do the same thing Stan did for rx_mode and "strategically"
> place sync calls, to make sure such delayed works are executed
> after we drop the ops lock but before we drop rtnl_lock.
> 
> Stan suggests that if we need that down the line we may
> consider reshaping the mechanism into "async notifications".
> AFAICT only vlan does this sort of netdev open chaining,
> so as a first try I think that sticking the complexity into
> the vlan code makes sense.
> 
> One corner case is that we need to cancel the event if user
> explicitly changes the state before work could run. Consider
> the following operations with vlan0 on top of dummy0:
> 
>   ip link set dev dummy0 up    # queues work to up vlan0
>   ip link set dev vlan0 down   # user explicitly downs the vlan
>   ndo_work                     # acts on the stale event
> 
> Reported-by: syzbot+09da62a8b78959ceb8bb@syzkaller.appspotmail.com
> Reported-by: syzbot+cb67c392b0b8f0fd0fc1@syzkaller.appspotmail.com
> Reported-by: syzbot+9bb8bd77f3966641f298@syzkaller.appspotmail.com
> Fixes: 9f275c2e9020 ("net: ethtool: make sure 
> __ethtool_get_link_ksettings() is ops-locked")
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---

> [...]

Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>

Thanks
Nicolai

  parent reply	other threads:[~2026-06-25 11:37 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-24 18:20 [PATCH net 0/4] net: avoid nested UP notifier events Jakub Kicinski
2026-06-24 18:20 ` [PATCH net 1/4] net: turn the rx_mode work into a generic netdev_work facility Jakub Kicinski
2026-06-25  5:55   ` Kuniyuki Iwashima
2026-06-25 15:17     ` Jakub Kicinski
2026-06-25 15:48   ` Stanislav Fomichev
2026-06-24 18:20 ` [PATCH net 2/4] net: add the driver-facing netdev_work scheduling API Jakub Kicinski
2026-06-25  5:55   ` Kuniyuki Iwashima
2026-06-25 15:48   ` Stanislav Fomichev
2026-06-24 18:20 ` [PATCH net 3/4] vlan: defer real device state propagation to netdev_work Jakub Kicinski
2026-06-25  5:57   ` Kuniyuki Iwashima
2026-06-25 11:37   ` Nicolai Buchwitz [this message]
2026-06-25 15:49   ` Stanislav Fomichev
2026-06-24 18:20 ` [PATCH net 4/4] selftests: bonding: add a test for VLAN propagation over a bonded real device Jakub Kicinski
2026-06-25  8:41   ` Loktionov, Aleksandr
2026-06-25 15:49   ` Stanislav Fomichev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dd271c70bbcdedf79940e013523451b7@tipi-net.de \
    --to=nb@tipi-net.de \
    --cc=aleksandr.loktionov@intel.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=dongchenchen2@huawei.com \
    --cc=dtatulea@nvidia.com \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=idosch@nvidia.com \
    --cc=jv@jvosburgh.net \
    --cc=kuba@kernel.org \
    --cc=kuniyu@google.com \
    --cc=n05ec@lzu.edu.cn \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@fomichev.me \
    --cc=syzbot+09da62a8b78959ceb8bb@syzkaller.appspotmail.com \
    --cc=syzbot+9bb8bd77f3966641f298@syzkaller.appspotmail.com \
    --cc=syzbot+cb67c392b0b8f0fd0fc1@syzkaller.appspotmail.com \
    --cc=yuantan098@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.