From: Jakub Kicinski <kuba@kernel.org>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Mattias Forsblad <mattias.forsblad@gmail.com>,
Vladimir Oltean <olteanv@gmail.com>,
Baowen Zheng <baowen.zheng@corigine.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"roid@nvidia.com" <roid@nvidia.com>,
"vladbu@nvidia.com" <vladbu@nvidia.com>,
Eli Cohen <elic@nvidia.com>, Jiri Pirko <jiri@resnulli.us>,
Tobias Waldekranz <tobias@waldekranz.com>
Subject: Re: [RFC net-next] net: tc: flow indirect framework issue
Date: Thu, 14 Apr 2022 10:57:01 +0200
Message-ID: <20220414105701.54c3fba4@kernel.org>
In-Reply-To: <YlbR4Cgzd/ulpT25@salvia>
On Wed, 13 Apr 2022 15:36:32 +0200 Pablo Neira Ayuso wrote:
> A bit of a long email...
>
> This commit 74fc4f828769 handles this scenario:
>
> 1) eth0 is gone (module removal)
> 2) vxlan0 device is still in place, tc ingress also contains rules for
> vxlan0.
> 3) eth0 is reloaded.
>
> A bit of background: tc ingress removes rules for eth0 if eth0 is
> gone (I am referring to software rules, in general). In this model, the
> tc ingress rules are attached to the device, and if the device eth0 is
> gone, those rules are also gone and, then, once this device eth0 comes
> back, the user has to add the tc ingress software rules for eth0 again.
> There is no replay mechanism for tc ingress rules in this case.
>
> IIRC, Eli's patch re-adds the flow block for vxlan0 because he got a
> bug report that says that after reloading the driver module and eth0
> comes back, rules for tc vxlan0 were not hardware offloaded.
>
> The indirect flow block infrastructure is tracking devices such as
> vxlan0 that the given driver *might* be able to hardware offload.
> But from the control plane (user) perspective, this detail is hidden.
> To me, the problem is that there is no way from the control plane to
> relate vxlan0 with the real device that performs the hardware offload.
> There is also no flag for the user to request "please hardware offload
> vxlan0 tc ingress rules". Instead, the flow indirect block
> infrastructure performs the hardware offload "transparently" to the user.
TBH I don't understand why indirect infra is important. Mattias said he
gets a replay of the block bind. So it's the replay of rules that's
broken. Whether the block bind came from indir infra or the block is
shared and got bound to a new dev is not important.
> I think some people believe doing things fully transparently is good, at
> the cost of adding more kernel complexity and hiding details that are
> relevant to the user (such as if hardware offload is enabled for
> vxlan0 and what is the real device that is actually being used for the
> vxlan0 to be offloaded).
>
> So, there are no flags when setting up the vxlan0 device for the user
> to say: "I would like to hardware offload vxlan0", and going slightly
> further there is no "please attach this vxlan0 device to eth0 for
> hardware offload". Any real device could be potentially used to
> offload vxlan0, the user does not know which one is actually used.
>
> Exposing this information is a bit more work on top of the user, but:
>
> 1) it will be transparent: the control plane shows that the vxlan0 is
> hardware offloaded. Then if eth0 is gone, vxlan0 tc ingress can be
> removed too, because it depends on eth0.
>
> 2) The control plane validates if hardware offload for vxlan0 is possible. If this
> is not possible, display an error to the user: "sorry, I cannot
> offload vxlan0 on eth0 for reason X".
>
> Since this is not exposed to the control plane, the existing
> infrastructure follows a snooping scheme, but tracking devices that
> might be able to hardware offload.
>
> There is no obvious way to relate vxlan0 with the real device
> (eth0) that is actually performing the hardware offloading.
Let's not over-complicate things, Mattias just needs replay to work.
90% sure it worked when we did the work back in the day with John H,
before the nft rewrite etc.
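
The distinction drawn above, between replaying the block bind and replaying the rules already in the block, can be sketched with a toy model. This is not kernel code: FlowBlock, Driver, and reload_scenario are hypothetical names invented for illustration, and the rule is an opaque placeholder string rather than real tc syntax. The sketch only assumes that a block keeps its rules while the offloading driver goes away and comes back.

```python
from typing import Callable, List

Rule = str
OffloadCb = Callable[[str, Rule], None]  # called as cb(command, rule)


class FlowBlock:
    """Toy stand-in for a flow block: a rule list plus bound driver callbacks."""

    def __init__(self) -> None:
        self.rules: List[Rule] = []
        self.callbacks: List[OffloadCb] = []

    def add_rule(self, rule: Rule) -> None:
        # New rules are pushed to every driver currently bound to the block.
        self.rules.append(rule)
        for cb in self.callbacks:
            cb("add", rule)

    def bind(self, cb: OffloadCb, replay: bool = True) -> None:
        # Binding registers the callback. Replay additionally pushes the
        # rules that were installed before this driver showed up.
        self.callbacks.append(cb)
        if replay:
            for rule in self.rules:
                cb("add", rule)

    def unbind(self, cb: OffloadCb) -> None:
        # The driver drops its hardware state; the block keeps its rules.
        self.callbacks.remove(cb)
        for rule in self.rules:
            cb("del", rule)


class Driver:
    """Toy driver whose 'hardware table' is just a list of rules."""

    def __init__(self) -> None:
        self.hw_rules: List[Rule] = []

    def offload(self, command: str, rule: Rule) -> None:
        if command == "add":
            self.hw_rules.append(rule)
        elif command == "del":
            self.hw_rules.remove(rule)


def reload_scenario(replay: bool) -> List[Rule]:
    """Bind, add a rule, unload the driver, reload it, return the new hw table."""
    block = FlowBlock()
    old = Driver()
    block.bind(old.offload)
    block.add_rule("placeholder-rule-on-vxlan0")
    block.unbind(old.offload)          # driver module removed
    new = Driver()                     # driver module loaded again
    block.bind(new.offload, replay=replay)
    return new.hw_rules
```

In this model the bind itself succeeds in both cases; only the replay flag decides whether the reloaded driver ends up with the pre-existing rule in its table. How the bind was triggered plays no role in the outcome.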