From: Sabrina Dubroca <sd@queasysnail.net>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Leon Romanovsky <leon@kernel.org>,
Steffen Klassert <steffen.klassert@secunet.com>,
Herbert Xu <herbert@gondor.apana.org.au>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>, Ilan Tayari <ilant@mellanox.com>,
Guy Shapiro <guysh@mellanox.com>,
Yossi Kuperman <yossiku@mellanox.com>,
Network Development <netdev@vger.kernel.org>
Subject: Re: [PATCH net] xfrm: always flush state and policy upon NETDEV_DOWN/NETDEV_UNREGISTER events
Date: Thu, 29 Jan 2026 17:05:32 +0100 [thread overview]
Message-ID: <aXuFTJZ3SvaCBXj5@krikkit> (raw)
In-Reply-To: <d0aa3c51-291b-40fd-82b4-15024dc4404b@I-love.SAKURA.ne.jp>
2026-01-29, 19:16:30 +0900, Tetsuo Handa wrote:
> On 2026/01/29 18:09, Leon Romanovsky wrote:
> > On Thu, Jan 29, 2026 at 05:06:08PM +0900, Tetsuo Handa wrote:
> >> On 2026/01/28 21:35, Leon Romanovsky wrote:
> >>> On Wed, Jan 28, 2026 at 07:44:02PM +0900, Tetsuo Handa wrote:
> >>>> On 2026/01/28 19:24, Leon Romanovsky wrote:
> >>>>> I think this can work, but IMHO the more robust approach is to ensure that all
> >>>>> states and policies are removed when the NETIF_F_HW_ESP feature bit is cleared.
> >>>>
> >>>> The transaction will become complicated, for dev->features manipulation
> >>>> function can fail.
> >>>
> >>> Line above returning NOTIFY_OK, check that NETIF_F_HW_ESP is cleared,
> >>> and remove everything.
> >>
> >> That answer needs more clarification. I came to get confused about what we should do.
> >>
> >> Question 1:
> >>
> >> Since NETIF_F_HW_ESP is a hardware dependent flag, not all "struct net_device"
> >> support NETIF_F_HW_ESP flag. Is this interpretation correct?
> >
> > Yes, however any device (SW or HW) should set this flag if they want to
> > provide IPsec offload.
>
> OK. There are "IPsec with offload" and "IPsec without offload".
> Both cases use code in net/xfrm/ directory.
>
> Users (not the kernel source but Linux administrator) can choose
> "IPsec without offload" by clearing the NETIF_F_HW_ESP bit via
> "ethtool -K $dev esp-hw-offload off" command even if $dev supports
> both "IPsec with offload" and "IPsec without offload".
We should avoid talking about "IPsec with/without offload" when this
can mean multiple different things:
- ip xfrm state add ... offload ...
(and the offload request actually succeeded)
- packet going through all the offload code and to the device
- device with NETIF_F_HW_ESP set in dev->features
- device with ->xdo_dev_state_add
(I'm probably forgetting a few more)
> >> Question 2:
> >>
> >> Sabrina Dubroca commented
> >>
> >> But the current behavior ("ignore NETIF_F_HW_ESP and call
> >> xdo_dev_state_add for new states anyway") has been established for
> >> multiple years. Changing that now seems a bit risky.
> >>
> >> at https://lkml.kernel.org/r/aXd3QjzwOVm0Q9LF@krikkit .
> >>
> >> Is that comment saying that we have been permitting a "struct net_device"
> >> to be selected by xfrm_dev_state_add() even if that "struct net_device"
> >> does not support NETIF_F_HW_ESP flag. Is this interpretation correct?
> >
> > I don't understand what does it mean "device doesn't support offload but
> > state was offloaded anyway".
To Leon: this is not what Tetsuo wrote.
(but to be fair, "net_device does not support NETIF_F_HW_ESP flag" is
a bit confusing)
> Users (not the kernel source but Linux administrator) who are using $dev
> which supports only "IPsec without offload" can call xfrm_dev_state_add()
> because xfrm_dev_state_add() does not check for the NETIF_F_HW_ESP bit.
>
> Therefore such users can create "struct xfrm_state" with a reference to
> "struct net_device" held at
> https://elixir.bootlin.com/linux/v6.19-rc5/source/net/xfrm/xfrm_user.c#L986 .
>
> >
> >>
> >> Question 3:
> >>
> >> Leon Romanovsky suggested that, as a more robust approach, remove all states
> >> and policies when the NETIF_F_HW_ESP feature bit is cleared.
> >>
> >> But I consider that such approach will not work, for (according to Q2 above)
> >> xfrm_dev_state_add() can be called even if the NETIF_F_HW_ESP feature bit is not
> >> set. Also, I think that there is no guarantee that dev->features manipulation
> >> function is called after xfrm_dev_state_add() was called.
> >>
> >> Therefore, we need an event that are guaranteed to be called.
> >> The NETDEV_UNREGISTER event is guaranteed to be called when unregistring
> >> a "struct net_device", and therefore a good place to remove all states and
> >> policies.
> >>
> >> Is this interpretation correct?
>
> Since we don't have a syzbot reproducer, I can't tell whether syzbot is manually
> clearing the NETIF_F_HW_ESP bit or not. But as described above, a syzbot report
>
> unregister_netdevice: waiting for netdevsim0 to become free. Usage count = 2
> ref_tracker: netdev@ffff888052f24618 has 1/1 users at
> __netdev_tracker_alloc include/linux/netdevice.h:4400 [inline]
> netdev_tracker_alloc include/linux/netdevice.h:4412 [inline]
> xfrm_dev_state_add+0x3a5/0x1080 net/xfrm/xfrm_device.c:316
>
> indicates that "struct xfrm_state" with a reference to "struct net_device" held
> is remaining because xfrm_dev_state_flush() is not called upon NETDEV_UNREGISTER
> event.
I don't know what syzbot did, but this triggers the same symptoms for
me:
echo 0 > /sys/bus/netdevsim/new_device
dev=$(ls -1 /sys/bus/netdevsim/devices/netdevsim0/net/)
ip x s a src 192.168.13.1 dst 192.168.13.2 proto esp spi 0x1000 mode tunnel aead 'rfc4106(gcm(aes))' $key 128 offload crypto dev $dev dir out
ethtool -K $dev esp-hw-offload off
echo 0 > /sys/bus/netdevsim/del_device
--
Sabrina
next prev parent reply other threads:[~2026-01-29 16:05 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-27 15:32 [PATCH net] xfrm: always flush state and policy upon NETDEV_DOWN/NETDEV_UNREGISTER events Tetsuo Handa
2026-01-28 10:24 ` Leon Romanovsky
2026-01-28 10:44 ` Tetsuo Handa
2026-01-28 12:35 ` Leon Romanovsky
2026-01-29 8:06 ` Tetsuo Handa
2026-01-29 9:09 ` Leon Romanovsky
2026-01-29 10:16 ` Tetsuo Handa
2026-01-29 10:32 ` Tetsuo Handa
2026-01-29 16:05 ` Sabrina Dubroca [this message]
2026-02-01 13:12 ` Leon Romanovsky
2026-02-01 14:17 ` Tetsuo Handa
2026-01-29 15:59 ` Sabrina Dubroca
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aXuFTJZ3SvaCBXj5@krikkit \
--to=sd@queasysnail.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=guysh@mellanox.com \
--cc=herbert@gondor.apana.org.au \
--cc=horms@kernel.org \
--cc=ilant@mellanox.com \
--cc=kuba@kernel.org \
--cc=leon@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=penguin-kernel@i-love.sakura.ne.jp \
--cc=steffen.klassert@secunet.com \
--cc=yossiku@mellanox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.