From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-eopbgr10080.outbound.protection.outlook.com ([40.107.1.80]:43648 "EHLO EUR02-HE1-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1750713AbeCHTtT (ORCPT ); Thu, 8 Mar 2018 14:49:19 -0500 Subject: Re: [PATCH RFC 4/5] tls: RX path for ktls To: Dave Watson , "David S. Miller" , Tom Herbert , Alexei Starovoitov , herbert@gondor.apana.org.au, linux-crypto@vger.kernel.org, netdev@vger.kernel.org, ilyal@mellanox.com Cc: Atul Gupta , Vakul Garg , Hannes Frederic Sowa , Steffen Klassert , John Fastabend , Daniel Borkmann References: <20180308165043.GA19592@davejwatson-mba> From: Boris Pismenny Message-ID: <510189dd-7aa3-0cc3-41a0-047e67c58e83@mellanox.com> Date: Thu, 8 Mar 2018 21:48:45 +0200 MIME-Version: 1.0 In-Reply-To: <20180308165043.GA19592@davejwatson-mba> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: Hi Dave, On 03/08/18 18:50, Dave Watson wrote: > Add rx path for tls software implementation. > > recvmsg, splice_read, and poll implemented. > > An additional sockopt TLS_RX is added, with the same interface as > TLS_TX. Either TLX_RX or TLX_TX may be provided separately, or > together (with two different setsockopt calls with appropriate keys). > > Control messages are passed via CMSG in a similar way to transmit. > If no cmsg buffer is passed, then only application data records > will be passed to userspace, and EIO is returned for other types of > alerts. > > EBADMSG is passed for decryption errors, and E2BIG is passed for framing > errors. Both are unrecoverable. I think E2BIG is for too long argument list. EMSGSIZE might be more appropriate. Also, we must check that the record is not too short (cipher specific). For TLS1.2 with AES-GCM the minimum length is 8 (IV) + 16 (TAG). The correct error for this case is EBADMSG, like a decryption failure. Also, how about bad TLS version (e.g. not TLS1.2)? A separate error type is required for bad version, because it triggers a unique alert in libraries such as OpenSSL. I thought of using EINVAL for bad version. What do you think? I wonder if we should provide a more flexible method of obtaining errors for the future. Maybe use a special CMSG for errors? This CMSG will be triggered only after the socket enters the error state. > .... > + > +int tls_sw_recvmsg(struct sock *sk, > + struct msghdr *msg, > + size_t len, > + int nonblock, > + int flags, > + int *addr_len) > +{ > + struct tls_context *tls_ctx = tls_get_ctx(sk); > + struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx); > + unsigned char control; > + struct strp_msg *rxm; > + struct sk_buff *skb; > + ssize_t copied = 0; > + bool cmsg = false; > + int err = 0; > + long timeo; Maybe try to read from the error queue here? > + > + flags |= nonblock; > + > + lock_sock(sk); > + > + timeo = sock_rcvtimeo(sk, flags & MSG_DONTWAIT); > + do { > + bool zc = false; > + int chunk = 0; > + > + skb = tls_wait_data(sk, flags, timeo, &err); > + if (!skb) > + goto recv_end; > + > + rxm = strp_msg(skb); > + if (!cmsg) { > + int cerr; > + > + cerr = put_cmsg(msg, SOL_TLS, TLS_GET_RECORD_TYPE, > + sizeof(ctx->control), &ctx->control); > + cmsg = true; > + control = ctx->control; > + if (ctx->control != TLS_RECORD_TYPE_DATA) { > + if (cerr || msg->msg_flags & MSG_CTRUNC) { > + err = -EIO; > + goto recv_end; > + } > + } > + } else if (control != ctx->control) { > + goto recv_end; > + } > + > + if (!ctx->decrypted) { > + int page_count; > + int to_copy; > + > + page_count = iov_iter_npages(&msg->msg_iter, > + MAX_SKB_FRAGS); > + to_copy = rxm->full_len - tls_ctx->rx.overhead_size; > + if (to_copy <= len && page_count < MAX_SKB_FRAGS && > + likely(!(flags & MSG_PEEK))) { > + struct scatterlist sgin[MAX_SKB_FRAGS + 1]; > + char unused[21]; > + int pages = 0; > + > + zc = true; > + sg_init_table(sgin, MAX_SKB_FRAGS + 1); > + sg_set_buf(&sgin[0], unused, 13); > + > + err = zerocopy_from_iter(sk, &msg->msg_iter, > + to_copy, &pages, > + &chunk, &sgin[1], > + MAX_SKB_FRAGS, false); > + if (err < 0) > + goto fallback_to_reg_recv; > + > + err = decrypt_skb(sk, skb, sgin); > + for (; pages > 0; pages--) > + put_page(sg_page(&sgin[pages])); > + if (err < 0) { > + tls_err_abort(sk, EBADMSG); > + goto recv_end; > + } > + } else { > +fallback_to_reg_recv: > + err = decrypt_skb(sk, skb, NULL); > + if (err < 0) { > + tls_err_abort(sk, EBADMSG); > + goto recv_end; > + } > + } > + ctx->decrypted = true; > + } > + > + if (!zc) { > + chunk = min_t(unsigned int, rxm->full_len, len); > + err = skb_copy_datagram_msg(skb, rxm->offset, msg, > + chunk); > + if (err < 0) > + goto recv_end; > + } > + > + copied += chunk; > + len -= chunk; > + if (likely(!(flags & MSG_PEEK))) { > + u8 control = ctx->control; > + > + if (tls_sw_advance_skb(sk, skb, chunk)) { > + /* Return full control message to > + * userspace before trying to parse > + * another message type > + */ > + msg->msg_flags |= MSG_EOR; > + if (control != TLS_RECORD_TYPE_DATA) > + goto recv_end; > + } > + } > + } while (len); > + > +recv_end: > + release_sock(sk); > + return copied ? : err; > +} > + > ... Best, Boris.