Netdev List
 help / color / mirror / Atom feed
From: Tariq Toukan <ttoukan.linux@gmail.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Jarod Wilson <jarod@redhat.com>, Tariq Toukan <tariqt@nvidia.com>,
	"David S. Miller" <davem@davemloft.net>,
	netdev@vger.kernel.org, Saeed Mahameed <saeedm@nvidia.com>,
	Moshe Shemesh <moshe@nvidia.com>,
	Maor Gottlieb <maorg@nvidia.com>
Subject: Re: [PATCH net-next 0/2] TLS TX HW offload for Bond
Date: Tue, 24 Nov 2020 17:08:42 +0200	[thread overview]
Message-ID: <7056d516-20a5-113b-ca04-b32d326f7742@gmail.com> (raw)
In-Reply-To: <20201123102040.338f32c7@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>



On 11/23/2020 8:20 PM, Jakub Kicinski wrote:
> On Sun, 22 Nov 2020 14:48:04 +0200 Tariq Toukan wrote:
>> On 11/19/2020 6:38 PM, Jakub Kicinski wrote:
>>> On Thu, 19 Nov 2020 17:59:38 +0200 Tariq Toukan wrote:
>>>> On 11/19/2020 2:02 AM, Jakub Kicinski wrote:
>>>>> On Sun, 15 Nov 2020 15:42:49 +0200 Tariq Toukan wrote:
>>>>>> This series opens TLS TX HW offload for bond interfaces.
>>>>>> This allows bond interfaces to benefit from capable slave devices.
>>>>>>
>>>>>> The first patch adds real_dev field in TLS context structure, and aligns
>>>>>> usages in TLS module and supporting drivers.
>>>>>> The second patch opens the offload for bond interfaces.
>>>>>>
>>>>>> For the configuration above, SW kTLS keeps picking the same slave
>>>>>> To keep simple track of the HW and SW TLS contexts, we bind each socket to
>>>>>> a specific slave for the socket's whole lifetime. This is logically valid
>>>>>> (and similar to the SW kTLS behavior) in the following bond configuration,
>>>>>> so we restrict the offload support to it:
>>>>>>
>>>>>> ((mode == balance-xor) or (mode == 802.3ad))
>>>>>> and xmit_hash_policy == layer3+4.
>>>>>
>>>>> This does not feel extremely clean, maybe you can convince me otherwise.
>>>>>
>>>>> Can we extend netdev_get_xmit_slave() and figure out the output dev
>>>>> (and if it's "stable") in a more generic way? And just feed that dev
>>>>> into TLS handling?
>>>>
>>>> I don't see we go through netdev_get_xmit_slave(), but through
>>>> .ndo_start_xmit (bond_start_xmit).
>>>
>>> I may be misunderstanding the purpose of netdev_get_xmit_slave(),
>>> please correct me if I'm wrong. AFAIU it's supposed to return a
>>> lower netdev that the skb should then be xmited on.
>>
>> That's true. It was recently added and used by the RDMA team. Not used
>> or integrated in the Eth networking stack.
>>
>>> So what I was thinking was either construct an skb or somehow reshuffle
>>> the netdev_get_xmit_slave() code to take a flow dissector output or
>>> ${insert other ideas}. Then add a helper in the core that would drill
>>> down from the socket netdev to the "egress" netdev. Have TLS call
>>> that helper, and talk to the "egress" netdev from the start, rather
>>> than the socket's netdev. Then loosen the checks on software devices.
>>
>> As I understand it, best if we can even generalize this to apply to all
>> kinds of traffic: bond driver won't do the xmit itself anymore, it just
>> picks an egress dev and returns it. The core infrastructure will call
>> the xmit function for the egress dev.
> 
> I think you went way further than I was intending :) I was only
> considering the control path. Leave the datapath unchanged.
> 
> AFAIK you're making 3 changes:
>   - forwarding tls ops
>   - pinning flows
>   - handling features
> 
> Pinning of the TLS device to a leg of the bond looks like ~15LoC.
> I think we can live with that.
> 

Good.
You mean the changes under __bond_start_xmit ?

> It's the 150 LoC of forwarding TLS ops and duplicating dev selection
> logic in bond_sk_hash_l34() that I'd rather avoid.
> 

I see.
But there are several issues with this:

1. The .ndo_get_xmit_slave acts on an SKB, not a socket. Hence, it 
doesn't fit for the stage of calling tls_dev_add, unless the ndo goes 
through some refactoring before the feature itself.

2. Existing hash logic acts on an SKB. We must have one that acts on a 
socket to be used inside get_slave(sk). Hence, I don't really see how 
the logic under bond_sk_hash_l34() are going to disappear, maybe just 
move around to a new place.


> Handling features is probably fine, too, I haven't thought about that
> much.
> 

Good.

>> I like the idea, it can generalize code structures for all kinds of
>> upper-devices and sockets, taking them into a common place in core,
>> which reduces code duplications.
>>
>> If we go only half the way, i.e. keep xmit logic in bond for
>> non-TLS-offloaded traffic, then we have to let TLS module (and others in
>> the future) act deferentially for different kinds of devs (upper/lower)
>> which IMHO reduces generality.
> 
> How so? I was expecting TLS to just do something like:
> 
> 	netdev = sk_get_xmit_dev_lowest(sk);
> 
> which would recursively call get_xmit_slave(CONST) until it reaches
> a device which doesn't resolve further.
> 
> BTW is the flow pinning to bond legs actually a must-do? I don't know
> much about bonding but wouldn't that mean that if the selected leg goes
> down we'd lose connectivity, rather than falling back to SW crypto?
> 

Right. As long as we don't have logic for un-offloading.
Currently in TLS, the device-offloaded connections has some 
"independence" once they are created, it's hard to modify them and apply 
configuration modifications to them (example: interaction with tx csum 
offload).
So I think there is a missing un-offloading mechanism in TLS that should 
address all of these together.

This fits with your comments below.

>> I'm in favor of the deeper change. It will be on a larger scale, and
>> totally orthogonal to the current TLS offload support in bond.
>>
>> If we decide to apply the idea only to TLS sockets (or any subset of
>> sockets) we're actually taking a generic one-flow (the xmit patch of a
>> bond dev) and turning it into two (or potentially more) flows, depending
>> on the socket type. This also reduces generality.
> 
> I don't follow this part.
> 
>>> I'm probably missing the problem you're trying to explain to me :S
>>
>> I kept the patch minimal, and kept the TLS offload logic internal to the
>> bond driver, just like it is internal to the device drivers (mlx5e, and
>> others), with no core infrastructure modification.
>>
>>> Side note - Jarod, I'd be happy to take a patch renaming
>>> netdev_get_xmit_slave() and the ndo, if you have the cycles to send
>>> a patch. It's a recent addition, and in the core we should make more
>>> of an effort to avoid sensitive terms.
>>>    
>>>> Currently I have my check there to
>>>> catch all skbs belonging to offloaded TLS sockets.
>>>>
>>>> The TLS offload get_slave() logic decision is per socket, so the result
>>>> cannot be saved in the bond memory. Currently I save the real_dev field
>>>> in the TLS context structure.
>>>
>>> Right, but we could just have ctx->netdev be the "egress" netdev
>>> always, right? Do we expect somewhere that it's going to be matching
>>> the socket's dst?
>>
>> So once the offload context is established we totally bypass the bond
>> dev? and lose interaction or reference to it?
> 
> Yup, I don't think we need it.
> 
>> What if the egress dev is detached form the bond? We must then be
>> notified somehow.
> 
> Do we notify TLS when routing changes? I think it's a separate topic.
> 
> If we have the code to "un-offload" a flow we could handle clearing
> features better and notify from sk_validate_xmit_skb that the flow
> started hitting unexpected dev, hence it should be re-offloaded.
> 
> I don't think we need an explicit invalidation from the particular
> drivers here.
> 

Agree.

  reply	other threads:[~2020-11-24 15:09 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-15 13:42 [PATCH net-next 0/2] TLS TX HW offload for Bond Tariq Toukan
2020-11-15 13:42 ` [PATCH net-next 1/2] net/tls: Add real_dev field to TLS context Tariq Toukan
2020-11-15 13:42 ` [PATCH net-next 2/2] bond: Add TLS TX offload support Tariq Toukan
2020-11-19  0:02 ` [PATCH net-next 0/2] TLS TX HW offload for Bond Jakub Kicinski
2020-11-19 15:59   ` Tariq Toukan
2020-11-19 16:38     ` Jakub Kicinski
2020-11-22 12:48       ` Tariq Toukan
2020-11-23 18:20         ` Jakub Kicinski
2020-11-24 15:08           ` Tariq Toukan [this message]
2020-11-30  7:35           ` Boris Pismenny

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7056d516-20a5-113b-ca04-b32d326f7742@gmail.com \
    --to=ttoukan.linux@gmail.com \
    --cc=davem@davemloft.net \
    --cc=jarod@redhat.com \
    --cc=kuba@kernel.org \
    --cc=maorg@nvidia.com \
    --cc=moshe@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=tariqt@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox