Re: [PATCH v5 1/2] net/handshake: Create a NETLINK service for handling handshake requests

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Hannes Reinecke <hare@suse.de>
To: Chuck Lever III <chuck.lever@oracle.com>
Cc: Chuck Lever <cel@kernel.org>, "kuba@kernel.org" <kuba@kernel.org>,
	"pabeni@redhat.com" <pabeni@redhat.com>,
	"edumazet@google.com" <edumazet@google.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"kernel-tls-handshake@lists.linux.dev"
	<kernel-tls-handshake@lists.linux.dev>
Subject: Re: [PATCH v5 1/2] net/handshake: Create a NETLINK service for handling handshake requests
Date: Tue, 28 Feb 2023 16:48:38 +0100	[thread overview]
Message-ID: <aae5c8e6-738e-669f-3f0e-059282d55449@suse.de> (raw)
In-Reply-To: <D06D70DC-D053-4212-B72C-82C1BB1AF9F2@oracle.com>

On 2/28/23 15:28, Chuck Lever III wrote:
> 
> 
>> On Feb 28, 2023, at 1:58 AM, Hannes Reinecke <hare@suse.de> wrote:
>>
>> On 2/27/23 19:10, Chuck Lever III wrote:
>>>> On Feb 27, 2023, at 12:21 PM, Hannes Reinecke <hare@suse.de> wrote:
>>>>
>>>>> On 2/27/23 16:39, Chuck Lever III wrote:
>>>>>> On Feb 27, 2023, at 10:14 AM, Hannes Reinecke <hare@suse.de> wrote:
>>>>>>
>>>>>> Problem here is with using different key materials.
>>>>>> As the current handshake can only deal with one key at a time
>>>>>> the only chance we have for several possible keys is to retry
>>>>>> the handshake with the next key.
>>>>>> But out of necessity we have to use the _same_ connection
>>>>>> (as tlshd doesn't control the socket). So we cannot close
>>>>>> the socket, and hence we can't notify userspace to give up the handshake attempt.
>>>>>> Being able to send a signal would be simple; sending SIGHUP to userspace, and wait for the 'done' call.
>>>>>> If it doesn't come we can terminate all attempts.
>>>>>> But if we get the 'done' call we know it's safe to start with the next attempt.
>>>>> We solve this problem by enabling the kernel to provide all those
>>>>> materials to tlshd in one go.
>>>> Ah. Right, that would work, too; provide all possible keys to the
>>>> 'accept' call and let the userspace agent figure out what to do with
>>>> them. That makes life certainly easier for the kernel side.
>>>>
>>>>> I don't think there's a "retry" situation here. Once the handshake
>>>>> has failed, the client peer has to know to try again. That would
>>>>> mean retrying would have to be part of the upper layer protocol.
>>>>> Does an NVMe initiator know it has to drive another handshake if
>>>>> the first one fails, or does it rely on the handshake itself to
>>>>> try all available identities?
>>>>> We don't have a choice but to provide all the keys at once and
>>>>> let the handshake negotiation deal with it.
>>>>> I'm working on DONE passing multiple remote peer IDs back to the
>>>>> kernel now. I don't see why ACCEPT couldn't pass multiple peer IDs
>>>>> the other way.
>>>> Nope. That's not required.
>>>> DONE can only ever have one peer id (TLS 1.3 specifies that the client
>>>> sends a list of identities, the server picks one, and sends that one back
>>>> to the client). So for DONE we will only ever have 1 peer ID.
>>>> If we allow for several peer IDs to be present in the client ACCEPT message
>>>> then we'd need to include the resulting peer ID in the client DONE, too;
>>>> otherwise we'll need it for the server DONE only.
>>>>
>>>> So all in all I think we should be going with the multiple IDs in the
>>>> ACCEPT call (ie move the key id from being part of the message into an
>>>> attribute), and have a peer id present in the DONE all for both versions,
>>>> server and client.
>>> To summarize:
>>> ---
>>> The ACCEPT request (from tlshd) would have just the handler class
>>> "Which handler is responding". The kernel uses that to find a
>>> handshake request waiting for that type of handler. In our case,
>>> "tlshd".
>>> The ACCEPT response (from the kernel) would have the socket fd,
>>> the handshake parameters, and zero or more peer ID key serial
>>> numbers. (Today, just zero or one peer IDs).
>>>> There is also an errno status in the ACCEPT response, which
>>> the kernel can use to indicate things like "no requests in that
>>> class were found" or that the request was otherwise improperly
>>> formed.
>>> ---
>>> The DONE request (from tlshd) would have the socket fd (and
>>> implicitly, the handler's PID), the session status, and zero
>>> or one remote peer ID key serial numbers.
>>>> The DONE response (from the kernel) is an ACK. (Today it's
>>> more than that, but that's broken and will be removed).
>>> ---
>>> For the DONE request, the session status is one of:
>>> 0: session established -- see @peerid for authentication status
>>> EIO: local error
>>> EACCES: handshake rejected
>>> For server handshake completion:
>>> @peerid contains the remote peer ID if the session was
>>> authenticated, or TLS_NO_PEERID if the session was not
>>> authenticated.
>>> status == EACCES if authentication material was present from
>>> both peers but verification failed.
>>> For client handshake completion:
>>> @peerid contains the remote peer ID if authentication was
>>> requested and the session was authenticated
>>> status == EACCES if authentication was requested and the
>>> session was not authenticated, or if verification failed.
>>> (Maybe client could work like the server side, and the
>>> kernel consumer would need to figure out if it cares
>>> whether there was authentication).
>> Yes, that would be my preference. Always return @peerid
>> for DONE if the TLS session was established.
> 
> You mean if the TLS session was authenticated. The server
> won't receive a remote peer identity if the client peer
> doesn't authenticate.
> 
Ah, yes, forgot about that.
(PSK always 'authenticate' as the identity is that used to
find the appropriate PSK ...)

> 
>> We might also consider returning @peerid with EACCESS
>> to indicate the offending ID.
> 
> I'll look into that.
> 
> 
>>> Is that adequate?
>> Yes, it is.
> 
> What about the narrow set of DONE status values? You've
> recently wanted to add ENOMEM, ENOKEY, and EINVAL to
> this set. My experience is that these status values are
> nearly always obscured before they can get back to the
> requesting user.
> 
> Can the kernel make use of ENOMEM, for example? It might
> be able to retry, I suppose... retrying is not sensible
> for the server side.
> 
The usual problem: Retry or no retry.
Sadly error numbers are no good indicator to that.
Maybe we should take the NVMe approach and add a _different_
attribute indicating whether this particular error status
should be retried.

> 
>> So the only bone of contention is the timeout; as we won't
>> be implementing signals I still think that we should have
>> a 'timeout' attribute. And if only to feed the TLS timeout
>> parameter for gnutls ...
> 
> I'm still not seeing the case for making it an individual
> parameter for each handshake request. Maybe a config
> parameter, if a short timeout is actually needed... even
> then, maybe a built-in timeout is preferable to yet another
> tuning knob that can be abused.
> 
The problem I see is that the kernel-side needs to make forward
progress eventually, and calling into userspace is a good recipe
of violating that principle.
Sending a timeout value as a netlink parameter has the advantage
the both sides are aware that there _is_ a timeout.
The alternative would be an unconditional wait in the kernel,
and a very real possibility of a stuck process.

> I'd like to see some testing results to determine that a
> short timeout is the only way to handle corner cases.
> 
Short timeouts are especially useful for testing and debugging;
timeout handlers are prone to issues, and hence need a really good
bashing to hash out issues.
And not having a timeout is also not a good idea, see above.

But yeah, in theory we could use a configuration timeout in tlshd.

In the end, it's _just_ another netlink attribute, which might
(or might not) be present. Which replaces a built-in value.
I hadn't thought this to be such an issue ...

Cheers,

Hannes

next prev parent reply	other threads:[~2023-02-28 15:48 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-24 19:19 [PATCH v5 0/2] Another crack at a handshake upcall mechanism Chuck Lever
2023-02-24 19:19 ` [PATCH v5 1/2] net/handshake: Create a NETLINK service for handling handshake requests Chuck Lever
2023-02-27  9:24   ` Hannes Reinecke
2023-02-27 14:59     ` Chuck Lever III
2023-02-27 15:14       ` Hannes Reinecke
2023-02-27 15:39         ` Chuck Lever III
2023-02-27 17:21           ` Hannes Reinecke
2023-02-27 18:10             ` Chuck Lever III
2023-02-28  6:58               ` Hannes Reinecke
2023-02-28 14:28                 ` Chuck Lever III
2023-02-28 15:48                   ` Hannes Reinecke [this message]
2023-02-28 16:01                     ` Chuck Lever III
2023-02-24 19:19 ` [PATCH v5 2/2] net/tls: Add kernel APIs for requesting a TLSv1.3 handshake Chuck Lever
2023-02-27  9:36   ` Hannes Reinecke
2023-02-27 15:01     ` Chuck Lever III

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aae5c8e6-738e-669f-3f0e-059282d55449@suse.de \
    --to=hare@suse.de \
    --cc=cel@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=edumazet@google.com \
    --cc=kernel-tls-handshake@lists.linux.dev \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.