From: Jakub Kicinski <kuba@kernel.org>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Mattias Forsblad <mattias.forsblad@gmail.com>,
Vladimir Oltean <olteanv@gmail.com>,
Baowen Zheng <baowen.zheng@corigine.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"roid@nvidia.com" <roid@nvidia.com>,
"vladbu@nvidia.com" <vladbu@nvidia.com>,
Eli Cohen <elic@nvidia.com>, Jiri Pirko <jiri@resnulli.us>,
Tobias Waldekranz <tobias@waldekranz.com>
Subject: Re: [RFC net-next] net: tc: flow indirect framework issue
Date: Thu, 14 Apr 2022 10:57:01 +0200
Message-ID: <20220414105701.54c3fba4@kernel.org>
In-Reply-To: <YlbR4Cgzd/ulpT25@salvia>
On Wed, 13 Apr 2022 15:36:32 +0200 Pablo Neira Ayuso wrote:
> A bit of a long email...
>
> This commit 74fc4f828769 handles this scenario:
>
> 1) eth0 is gone (module removal)
> 2) vxlan0 device is still in place, tc ingress also contains rules for
> vxlan0.
> 3) eth0 is reloaded.
>
> A bit of background: tc ingress removes rules for eth0 if eth0 is
> gone (I am referring to software rules, in general). In this model, the
> tc ingress rules are attached to the device, and if the device eth0 is
> gone, those rules are also gone and, then, once this device eth0 comes
> back, the user has to add the tc ingress software rules for eth0 again.
> There is no replay mechanism for tc ingress rules in this case.
>
> IIRC, Eli's patch re-adds the flow block for vxlan0 because he got a
> bug report that says that after reloading the driver module and eth0
> comes back, rules for tc vxlan0 were not hardware offloaded.
>
> The indirect flow block infrastructure is tracking devices such as
> vxlan0 that the given driver *might* be able to hardware offload.
> But from the control plane (user) perspective, this detail is hidden.
> To me, the problem is that there is no way from the control plane to
> relate vxlan0 with the real device that performs the hardware offload.
> There is also no flag for the user to request "please hardware offload
> vxlan0 tc ingress rules". Instead, the flow indirect block
> infrastructure performs the hardware offload "transparently" to the user.
TBH I don't understand why indirect infra is important. Mattias said he
gets a replay of the block bind. So it's the replay of rules that's
broken. Whether the block bind came from indir infra or the block is
shared and got bound to a new dev is not important.
> I think some people believe doing things fully transparent is good, at
> the cost of adding more kernel complexity and hiding details that are
> relevant to the user (such as if hardware offload is enabled for
> vxlan0 and what is the real device that is actually being used for the
> vxlan0 to be offloaded).
>
> So, there are no flags when setting up the vxlan0 device for the user
> to say: "I would like to hardware offload vxlan0", and going slightly
> further, there is no "please attach this vxlan0 device to eth0 for
> hardware offload". Any real device could be potentially used to
> offload vxlan0, the user does not know which one is actually used.
>
> Exposing this information is a bit more work for the user, but:
>
> 1) it will be transparent: the control plane shows that the vxlan0 is
> hardware offloaded. Then if eth0 is gone, vxlan0 tc ingress can be
> removed too, because it depends on eth0.
>
> 2) The control plane validates whether vxlan0 can be hardware offloaded.
> If this is not possible, display an error to the user: "sorry, I cannot
> offload vxlan0 on eth0 for reason X".
>
> Since this is not exposed to the control plane, the existing
> infrastructure follows a snooping scheme, but tracking devices that
> might be able to hardware offload.
>
> There is no obvious way to relate vxlan0 with the real device
> (eth0) that is actually performing the hardware offloading.
Let's not over-complicate things, Mattias just needs replay to work.
90% sure it worked when we did the work back in the day with John H,
before the nft rewrite etc.