From: Jakub Kicinski <kuba@kernel.org>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Mattias Forsblad <mattias.forsblad@gmail.com>,
	Vladimir Oltean <olteanv@gmail.com>,
	Baowen Zheng <baowen.zheng@corigine.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"roid@nvidia.com" <roid@nvidia.com>,
	"vladbu@nvidia.com" <vladbu@nvidia.com>,
	Eli Cohen <elic@nvidia.com>, Jiri Pirko <jiri@resnulli.us>,
	Tobias Waldekranz <tobias@waldekranz.com>
Subject: Re: [RFC net-next] net: tc: flow indirect framework issue
Date: Thu, 14 Apr 2022 10:57:01 +0200
Message-ID: <20220414105701.54c3fba4@kernel.org>
In-Reply-To: <YlbR4Cgzd/ulpT25@salvia>

On Wed, 13 Apr 2022 15:36:32 +0200 Pablo Neira Ayuso wrote:
> A bit of a long email...
> 
> This commit 74fc4f828769 handles this scenario:
> 
> 1) eth0 is gone (module removal)
> 2) vxlan0 device is still in place, tc ingress also contains rules for
>    vxlan0.
> 3) eth0 is reloaded.
> 
> A bit of background: tc ingress removes rules for eth0 if eth0 is
> gone (I am referring to software rules, in general). In this model, the
> tc ingress rules are attached to the device, and if the device eth0 is
> gone, those rules are also gone and, then, once this device eth0 comes
> back, the user has to add the tc ingress software rules for eth0 again.
> There is no replay mechanism for tc ingress rules in this case.
> 
> IIRC, Eli's patch re-adds the flow block for vxlan0 because he got a
> bug report that says that after reloading the driver module and eth0
> comes back, rules for tc vxlan0 were not hardware offloaded.
> 
> The indirect flow block infrastructure is tracking devices such as
> vxlan0 that the given driver *might* be able to hardware offload.
> But from the control plane (user) perspective, this detail is hidden.
> To me, the problem is that there is no way from the control plane to
> relate vxlan0 with the real device that performs the hardware offload.
> There is also no flag for the user to request "please hardware offload
> vxlan0 tc ingress rules". Instead, the flow indirect block
> infrastructure performs the hardware offload "transparently" to the user.

TBH I don't understand why indirect infra is important. Mattias said he
gets a replay of the block bind. So it's the replay of rules that's
broken. Whether the block bind came from indir infra or the block is
shared and got bound to a new dev is not important.
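
For illustration only, the replay-on-bind behaviour discussed here can be
modelled with a toy sketch. This is not kernel code: ToyBlock, make_driver
and the rule string are hypothetical names, and the model only assumes that
a callback bound to a block is expected to see the rules the block already
holds, regardless of how the bind was triggered.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical toy model; none of these names exist in the kernel.
Callback = Callable[[str, str], None]  # (command, rule) -> None


@dataclass
class ToyBlock:
    rules: List[str] = field(default_factory=list)
    callbacks: List[Callback] = field(default_factory=list)

    def add_rule(self, rule: str) -> None:
        # New rules are pushed to every callback currently bound.
        self.rules.append(rule)
        for cb in self.callbacks:
            cb("add", rule)

    def bind(self, cb: Callback) -> None:
        # Replay: a late-bound callback first sees all existing rules.
        for rule in self.rules:
            cb("add", rule)
        self.callbacks.append(cb)

    def unbind(self, cb: Callback) -> None:
        # On unbind, the callback is told to drop every rule it was given.
        self.callbacks.remove(cb)
        for rule in self.rules:
            cb("del", rule)


def make_driver(log: List[Tuple[str, str]]) -> Callback:
    # A stand-in "driver" that just records what it was asked to do.
    def cb(cmd: str, rule: str) -> None:
        log.append((cmd, rule))
    return cb


block = ToyBlock()

# First driver instance binds, a rule is added, then the driver goes away.
first_log: List[Tuple[str, str]] = []
first = make_driver(first_log)
block.bind(first)
block.add_rule("match vni 42 -> redirect")  # hypothetical rule text
block.unbind(first)

# Driver comes back: binding again replays the rule still held by the block.
second_log: List[Tuple[str, str]] = []
block.bind(make_driver(second_log))
```

In this model the second bind receives the existing rule without the user
re-adding it; where the bind request originated plays no role in the replay.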

> I think some people believe doing things fully transparent is good, at
> the cost of adding more kernel complexity and hiding details that are
> relevant to the user (such as if hardware offload is enabled for
> vxlan0 and what is the real device that is actually being used for the
> vxlan0 to be offloaded).
> 
> So, there are no flags when setting up the vxlan0 device for the user
> to say: "I would like to hardware offload vxlan0", and going slightly
> further there is no "please attach this vxlan0 device to eth0 for
> hardware offload". Any real device could potentially be used to
> offload vxlan0; the user does not know which one is actually used.
> 
> Exposing this information is a bit more work for the user, but:
> 
> 1) it will be transparent: the control plane shows that the vxlan0 is
>    hardware offloaded. Then if eth0 is gone, vxlan0 tc ingress can be
>    removed too, because it depends on eth0.
> 
> 2) The control plane validates whether hardware offload is possible
>    for vxlan0. If it is not, display an error to the user: "sorry, I
>    cannot offload vxlan0 on eth0 for reason X".
> 
> Since this is not exposed to the control plane, the existing
> infrastructure follows a snooping scheme, but tracking devices that
> might be able to hardware offload.
> 
> There is no obvious way to relate vxlan0 with the real device
> (eth0) that is actually performing the hardware offloading.

Let's not over-complicate things, Mattias just needs replay to work.
90% sure it worked when we did the work back in the day with John H,
before the nft rewrite etc.
