* [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow
  2023-06-09 12:51 [PATCHv2 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
@ 2023-06-09 12:51 ` Hannes Reinecke
  2023-06-09 16:37   ` Sabrina Dubroca
  0 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-09 12:51 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski, netdev,
	Hannes Reinecke

tls_sw_sendmsg() / tls_sw_do_sendpage() already handle
MSG_MORE / MSG_SENDPAGE_NOTLAST, but bail out on MSG_EOR.
Seeing that MSG_EOR is basically the opposite of
MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
MSG_EOR by treating it as the negation of MSG_MORE.
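
To illustrate the intended semantics, a minimal (hypothetical)
userspace sequence on a kTLS socket would be:

	/* keep the TLS record open, more data to follow */
	send(fd, hdr, hdr_len, MSG_MORE);
	/* MSG_EOR negates MSG_MORE: close and push the record */
	send(fd, payload, payload_len, MSG_EOR);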

Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 net/tls/tls_sw.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 635b8bf6b937..be8e0459d403 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -953,9 +953,12 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	int pending;
 
 	if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
-			       MSG_CMSG_COMPAT))
+			       MSG_EOR | MSG_CMSG_COMPAT))
 		return -EOPNOTSUPP;
 
+	if (msg->msg_flags & MSG_EOR)
+		eor = true;
+
 	ret = mutex_lock_interruptible(&tls_ctx->tx_lock);
 	if (ret)
 		return ret;
@@ -1173,6 +1176,8 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
 	bool eor;
 
 	eor = !(flags & MSG_SENDPAGE_NOTLAST);
+	if (flags & MSG_EOR)
+		eor = true;
 	sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
 
 	/* Call the sk_stream functions to manage the sndbuf mem. */
@@ -1274,7 +1279,7 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
 int tls_sw_sendpage_locked(struct sock *sk, struct page *page,
 			   int offset, size_t size, int flags)
 {
-	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
 		      MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY |
 		      MSG_NO_SHARED_FRAGS))
 		return -EOPNOTSUPP;
@@ -1288,7 +1293,7 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	int ret;
 
-	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
 		      MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY))
 		return -EOPNOTSUPP;
 
-- 
2.35.3



* Re: [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow
  2023-06-09 12:51 ` [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow Hannes Reinecke
@ 2023-06-09 16:37   ` Sabrina Dubroca
  0 siblings, 0 replies; 19+ messages in thread
From: Sabrina Dubroca @ 2023-06-09 16:37 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme,
	Jakub Kicinski, netdev

Hi Hannes,

2023-06-09, 14:51:50 +0200, Hannes Reinecke wrote:
> tls_sw_sendmsg() / tls_sw_do_sendpage() already handle
> MSG_MORE / MSG_SENDPAGE_NOTLAST, but bail out on MSG_EOR.
> Seeing that MSG_EOR is basically the opposite of
> MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
> MSG_EOR by treating it as the negation of MSG_MORE.
> 
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: netdev@vger.kernel.org
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  net/tls/tls_sw.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 635b8bf6b937..be8e0459d403 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -953,9 +953,12 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
>  	int pending;
>  
>  	if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
> -			       MSG_CMSG_COMPAT))
> +			       MSG_EOR | MSG_CMSG_COMPAT))
>  		return -EOPNOTSUPP;
>  
> +	if (msg->msg_flags & MSG_EOR)
> +		eor = true;

Is MSG_EOR supposed to be incompatible with MSG_MORE, or is it
supposed to cancel it? (i.e., is MSG_MORE | MSG_EOR invalid, or does
MSG_MORE | MSG_EOR behave like MSG_EOR?) The current code already
behaves as if _EOR was passed as long as MSG_MORE isn't passed, so
_EOR is only needed to cancel out _MORE (or, in your case, because
NVMe-over-TLS sets it).

If _EOR and _MORE (or MSG_SENDPAGE_NOTLAST below) are supposed to be
incompatible, we should return an error when they're both set. If we
accept both flags being set at the same time, I think we should
document the expected behavior ("_EOR overrides _MORE/_NOTLAST") and
add specific selftests to avoid regressions.
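
For example, a rough sketch of such a test (hypothetical, modeled on
the existing TEST_F() cases in tools/testing/selftests/net/tls.c),
assuming the "_EOR overrides _MORE" semantics:

TEST_F(tls, msg_eor_with_more)
{
	char const *test_str = "test_eor";
	int send_len = 9;
	char buf[9];

	/* _EOR wins: the record is closed and pushed despite _MORE */
	EXPECT_EQ(send(self->fd, test_str, send_len, MSG_MORE | MSG_EOR),
		  send_len);
	EXPECT_EQ(recv(self->cfd, buf, send_len, MSG_WAITALL), send_len);
	EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
}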

-- 
Sabrina



* [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow
  2023-06-12 14:38 [PATCHv3 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
@ 2023-06-12 14:38 ` Hannes Reinecke
  0 siblings, 0 replies; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-12 14:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski, netdev,
	Hannes Reinecke

tls_sw_sendmsg() / tls_sw_do_sendpage() already handle
MSG_MORE / MSG_SENDPAGE_NOTLAST, but bail out on MSG_EOR.
Seeing that MSG_EOR is basically the opposite of
MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
MSG_EOR by treating it as the negation of MSG_MORE, and errors
out if MSG_EOR is specified together with either MSG_MORE or
MSG_SENDPAGE_NOTLAST.

Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 net/tls/tls_sw.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 635b8bf6b937..16eae0c5c819 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -953,7 +953,10 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	int pending;
 
 	if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
-			       MSG_CMSG_COMPAT))
+			       MSG_EOR | MSG_CMSG_COMPAT))
+		return -EOPNOTSUPP;
+
+	if (!eor && msg->msg_flags & MSG_EOR)
 		return -EOPNOTSUPP;
 
 	ret = mutex_lock_interruptible(&tls_ctx->tx_lock);
@@ -1274,11 +1277,15 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
 int tls_sw_sendpage_locked(struct sock *sk, struct page *page,
 			   int offset, size_t size, int flags)
 {
-	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
 		      MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY |
 		      MSG_NO_SHARED_FRAGS))
 		return -EOPNOTSUPP;
 
+	if ((flags & (MSG_MORE | MSG_SENDPAGE_NOTLAST)) &&
+	    (flags & MSG_EOR))
+		return -EINVAL;
+
 	return tls_sw_do_sendpage(sk, page, offset, size, flags);
 }
 
@@ -1288,10 +1295,14 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	int ret;
 
-	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
 		      MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY))
 		return -EOPNOTSUPP;
 
+	if ((flags & (MSG_MORE | MSG_SENDPAGE_NOTLAST)) &&
+	    (flags & MSG_EOR))
+		return -EOPNOTSUPP;
+
 	ret = mutex_lock_interruptible(&tls_ctx->tx_lock);
 	if (ret)
 		return ret;
-- 
2.35.3



* [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow
  2023-06-14  6:22 [PATCHv4 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
@ 2023-06-14  6:22 ` Hannes Reinecke
  2023-06-17  6:26   ` Jakub Kicinski
  0 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-14  6:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski, netdev,
	Hannes Reinecke

tls_sw_sendmsg() / tls_sw_do_sendpage() already handle
MSG_MORE / MSG_SENDPAGE_NOTLAST, but bail out on MSG_EOR.
Seeing that MSG_EOR is basically the opposite of
MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
MSG_EOR by treating it as the negation of MSG_MORE, and errors
out if MSG_EOR is specified together with either MSG_MORE or
MSG_SENDPAGE_NOTLAST.

Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 net/tls/tls_sw.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 319f61590d2c..47eeff4d7d10 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -984,6 +984,9 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
 	int ret = 0;
 	int pending;
 
+	if (!eor && (msg->msg_flags & MSG_EOR))
+		return -EINVAL;
+
 	if (unlikely(msg->msg_controllen)) {
 		ret = tls_process_cmsg(sk, msg, &record_type);
 		if (ret) {
@@ -1193,7 +1196,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	int ret;
 
 	if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
-			       MSG_CMSG_COMPAT | MSG_SPLICE_PAGES |
+			       MSG_CMSG_COMPAT | MSG_SPLICE_PAGES | MSG_EOR |
 			       MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY))
 		return -EOPNOTSUPP;
 
@@ -1287,7 +1290,7 @@ int tls_sw_sendpage_locked(struct sock *sk, struct page *page,
 	struct bio_vec bvec;
 	struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, };
 
-	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
 		      MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY |
 		      MSG_NO_SHARED_FRAGS))
 		return -EOPNOTSUPP;
-- 
2.35.3



* Re: [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow
  2023-06-14  6:22 ` [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow Hannes Reinecke
@ 2023-06-17  6:26   ` Jakub Kicinski
  0 siblings, 0 replies; 19+ messages in thread
From: Jakub Kicinski @ 2023-06-17  6:26 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme, netdev

On Wed, 14 Jun 2023 08:22:09 +0200 Hannes Reinecke wrote:
> @@ -1287,7 +1290,7 @@ int tls_sw_sendpage_locked(struct sock *sk, struct page *page,
>  	struct bio_vec bvec;
>  	struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, };
>  
> -	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
> +	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
>  		      MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY |
>  		      MSG_NO_SHARED_FRAGS))
>  		return -EOPNOTSUPP;

you added it to sendpage_locked but not normal sendpage


* [PATCHv5 0/4] net/tls: fixes for NVMe-over-TLS
@ 2023-06-20 10:28 Hannes Reinecke
  2023-06-20 10:28 ` [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow Hannes Reinecke
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-20 10:28 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski,
	Eric Dumazet, Paolo Abeni, netdev, Hannes Reinecke

Hi all,

here are some small fixes to get NVMe-over-TLS up and running.
The first three are just minor modifications to have MSG_EOR handled
for TLS (and to add a test for it), but the last one implements the
->read_sock() callback for tls_sw and I guess could do with some
review.
It does work with my NVMe-TLS test harness, but what do I know :-)

As usual, comments and reviews are welcome.

Changes to the original submission:
- Add a testcase for MSG_EOR handling

Changes to v2:
- Bail out on conflicting message flags
- Rework flag handling

Changes to v3:
- Return -EINVAL on conflicting flags
- Rebase on top of net-next

Changes to v4:
- Add tls_rx_reader_lock() to read_sock()
- Add MSG_EOR handling to tls_sw_sendpage()

Hannes Reinecke (4):
  net/tls: handle MSG_EOR for tls_sw TX flow
  net/tls: handle MSG_EOR for tls_device TX flow
  selftests/net/tls: add test for MSG_EOR
  net/tls: implement ->read_sock()

 net/tls/tls.h                     |  2 +
 net/tls/tls_device.c              | 25 +++++++--
 net/tls/tls_main.c                |  2 +
 net/tls/tls_sw.c                  | 87 +++++++++++++++++++++++++++++--
 tools/testing/selftests/net/tls.c | 11 ++++
 5 files changed, 119 insertions(+), 8 deletions(-)

-- 
2.35.3



* [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow
  2023-06-20 10:28 [PATCHv5 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
@ 2023-06-20 10:28 ` Hannes Reinecke
  2023-06-20 10:28 ` [PATCH 2/4] net/tls: handle MSG_EOR for tls_device " Hannes Reinecke
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-20 10:28 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski,
	Eric Dumazet, Paolo Abeni, netdev, Hannes Reinecke

tls_sw_sendmsg() / tls_sw_do_sendpage() already handle
MSG_MORE / MSG_SENDPAGE_NOTLAST, but bail out on MSG_EOR.
Seeing that MSG_EOR is basically the opposite of
MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
MSG_EOR by treating it as the negation of MSG_MORE, and errors
out if MSG_EOR is specified together with either MSG_MORE or
MSG_SENDPAGE_NOTLAST.

Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 net/tls/tls_sw.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 319f61590d2c..97379e34c997 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -984,6 +984,9 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
 	int ret = 0;
 	int pending;
 
+	if (!eor && (msg->msg_flags & MSG_EOR))
+		return -EINVAL;
+
 	if (unlikely(msg->msg_controllen)) {
 		ret = tls_process_cmsg(sk, msg, &record_type);
 		if (ret) {
@@ -1193,7 +1196,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	int ret;
 
 	if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
-			       MSG_CMSG_COMPAT | MSG_SPLICE_PAGES |
+			       MSG_CMSG_COMPAT | MSG_SPLICE_PAGES | MSG_EOR |
 			       MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY))
 		return -EOPNOTSUPP;
 
@@ -1287,7 +1290,7 @@ int tls_sw_sendpage_locked(struct sock *sk, struct page *page,
 	struct bio_vec bvec;
 	struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, };
 
-	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
 		      MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY |
 		      MSG_NO_SHARED_FRAGS))
 		return -EOPNOTSUPP;
@@ -1305,7 +1308,7 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
 	struct bio_vec bvec;
 	struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, };
 
-	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+	if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
 		      MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY))
 		return -EOPNOTSUPP;
 	if (flags & MSG_SENDPAGE_NOTLAST)
-- 
2.35.3



* [PATCH 2/4] net/tls: handle MSG_EOR for tls_device TX flow
  2023-06-20 10:28 [PATCHv5 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
  2023-06-20 10:28 ` [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow Hannes Reinecke
@ 2023-06-20 10:28 ` Hannes Reinecke
  2023-06-20 17:12   ` Sabrina Dubroca
  2023-06-20 10:28 ` [PATCH 3/4] selftests/net/tls: add test for MSG_EOR Hannes Reinecke
  2023-06-20 10:28 ` [PATCH 4/4] net/tls: implement ->read_sock() Hannes Reinecke
  3 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-20 10:28 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski,
	Eric Dumazet, Paolo Abeni, netdev, Hannes Reinecke

tls_push_data() already handles MSG_MORE / MSG_SENDPAGE_NOTLAST,
but bails out on MSG_EOR.
Seeing that MSG_EOR is basically the opposite of
MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
MSG_EOR by treating it as the absence of MSG_MORE.
Consequently, an error is returned when both are set.

Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 net/tls/tls_device.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index b82770f68807..ebefd148ecf5 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -440,11 +440,6 @@ static int tls_push_data(struct sock *sk,
 	int copy, rc = 0;
 	long timeo;
 
-	if (flags &
-	    ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST |
-	      MSG_SPLICE_PAGES))
-		return -EOPNOTSUPP;
-
 	if (unlikely(sk->sk_err))
 		return -sk->sk_err;
 
@@ -536,6 +531,10 @@ static int tls_push_data(struct sock *sk,
 				more = true;
 				break;
 			}
+			if (flags & MSG_EOR) {
+				more = false;
+				break;
+			}
 
 			done = true;
 		}
@@ -582,6 +581,14 @@ int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	if (!tls_ctx->zerocopy_sendfile)
 		msg->msg_flags &= ~MSG_SPLICE_PAGES;
 
+	if (msg->msg_flags &
+	    ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST |
+	      MSG_SPLICE_PAGES | MSG_EOR))
+		return -EOPNOTSUPP;
+
+	if ((msg->msg_flags & (MSG_MORE | MSG_EOR)) == (MSG_MORE | MSG_EOR))
+		return -EOPNOTSUPP;
+
 	mutex_lock(&tls_ctx->tx_lock);
 	lock_sock(sk);
 
@@ -627,9 +634,17 @@ int tls_device_sendpage(struct sock *sk, struct page *page,
 	struct bio_vec bvec;
 	struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, };
 
+	if (flags &
+	    ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST |
+	      MSG_SPLICE_PAGES | MSG_EOR))
+		return -EOPNOTSUPP;
+
 	if (flags & MSG_SENDPAGE_NOTLAST)
 		msg.msg_flags |= MSG_MORE;
 
+	if ((msg.msg_flags & (MSG_MORE | MSG_EOR)) == (MSG_MORE | MSG_EOR))
+		return -EINVAL;
+
 	if (flags & MSG_OOB)
 		return -EOPNOTSUPP;
 
-- 
2.35.3



* [PATCH 3/4] selftests/net/tls: add test for MSG_EOR
  2023-06-20 10:28 [PATCHv5 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
  2023-06-20 10:28 ` [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow Hannes Reinecke
  2023-06-20 10:28 ` [PATCH 2/4] net/tls: handle MSG_EOR for tls_device " Hannes Reinecke
@ 2023-06-20 10:28 ` Hannes Reinecke
  2023-06-20 10:28 ` [PATCH 4/4] net/tls: implement ->read_sock() Hannes Reinecke
  3 siblings, 0 replies; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-20 10:28 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski,
	Eric Dumazet, Paolo Abeni, netdev, Hannes Reinecke

As the previous patches modify the MSG_EOR handling behaviour for
TLS, we should have a test for it.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
---
 tools/testing/selftests/net/tls.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index eccea9845c65..e8540fa5b310 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -477,6 +477,17 @@ TEST_F(tls, msg_more_unsent)
 	EXPECT_EQ(recv(self->cfd, buf, send_len, MSG_DONTWAIT), -1);
 }
 
+TEST_F(tls, msg_eor)
+{
+	char const *test_str = "test_read";
+	int send_len = 10;
+	char buf[10];
+
+	EXPECT_EQ(send(self->fd, test_str, send_len, MSG_EOR), send_len);
+	EXPECT_EQ(recv(self->cfd, buf, send_len, MSG_WAITALL), send_len);
+	EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
+}
+
 TEST_F(tls, sendmsg_single)
 {
 	struct msghdr msg;
-- 
2.35.3



* [PATCH 4/4] net/tls: implement ->read_sock()
  2023-06-20 10:28 [PATCHv5 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
                   ` (2 preceding siblings ...)
  2023-06-20 10:28 ` [PATCH 3/4] selftests/net/tls: add test for MSG_EOR Hannes Reinecke
@ 2023-06-20 10:28 ` Hannes Reinecke
  2023-06-20 13:21   ` Sagi Grimberg
  3 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-20 10:28 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski,
	Eric Dumazet, Paolo Abeni, netdev, Hannes Reinecke,
	Boris Pismenny

Implement ->read_sock() function for use with nvme-tcp.
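
For context, a minimal sketch of the consumer side (illustrative
names, not the actual nvme-tcp code):

static int demo_recv_actor(read_descriptor_t *desc, struct sk_buff *skb,
			   unsigned int offset, size_t len)
{
	/* consume up to len bytes of decrypted record data starting at
	 * offset; returning len tells ->read_sock() all of it was used */
	return len;
}

static int demo_try_recv(struct socket *sock)
{
	read_descriptor_t rd_desc = { .count = 1, };

	/* nvme-tcp would stash its queue in rd_desc.arg.data and drive
	 * this from its io_work, holding the socket lock */
	return sock->ops->read_sock(sock->sk, &rd_desc, demo_recv_actor);
}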

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Cc: Boris Pismenny <boris.pismenny@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
---
 net/tls/tls.h      |  2 ++
 net/tls/tls_main.c |  2 ++
 net/tls/tls_sw.c   | 78 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 82 insertions(+)

diff --git a/net/tls/tls.h b/net/tls/tls.h
index d002c3af1966..ba55cd5c4913 100644
--- a/net/tls/tls.h
+++ b/net/tls/tls.h
@@ -114,6 +114,8 @@ bool tls_sw_sock_is_readable(struct sock *sk);
 ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
 			   struct pipe_inode_info *pipe,
 			   size_t len, unsigned int flags);
+int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
+		     sk_read_actor_t read_actor);
 
 int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
 void tls_device_splice_eof(struct socket *sock);
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 7b9c83dd7de2..1a062a8c6d33 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -963,10 +963,12 @@ static void build_proto_ops(struct proto_ops ops[TLS_NUM_CONFIG][TLS_NUM_CONFIG]
 	ops[TLS_BASE][TLS_SW  ] = ops[TLS_BASE][TLS_BASE];
 	ops[TLS_BASE][TLS_SW  ].splice_read	= tls_sw_splice_read;
 	ops[TLS_BASE][TLS_SW  ].poll		= tls_sk_poll;
+	ops[TLS_BASE][TLS_SW  ].read_sock	= tls_sw_read_sock;
 
 	ops[TLS_SW  ][TLS_SW  ] = ops[TLS_SW  ][TLS_BASE];
 	ops[TLS_SW  ][TLS_SW  ].splice_read	= tls_sw_splice_read;
 	ops[TLS_SW  ][TLS_SW  ].poll		= tls_sk_poll;
+	ops[TLS_SW  ][TLS_SW  ].read_sock	= tls_sw_read_sock;
 
 #ifdef CONFIG_TLS_DEVICE
 	ops[TLS_HW  ][TLS_BASE] = ops[TLS_BASE][TLS_BASE];
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 97379e34c997..e918c98bbeb2 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2231,6 +2231,84 @@ ssize_t tls_sw_splice_read(struct socket *sock,  loff_t *ppos,
 	goto splice_read_end;
 }
 
+int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
+		     sk_read_actor_t read_actor)
+{
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
+	struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
+	struct strp_msg *rxm = NULL;
+	struct tls_msg *tlm;
+	struct sk_buff *skb;
+	ssize_t copied = 0;
+	int err, used;
+
+	err = tls_rx_reader_lock(sk, ctx, true);
+	if (err < 0)
+		return err;
+	if (!skb_queue_empty(&ctx->rx_list)) {
+		skb = __skb_dequeue(&ctx->rx_list);
+	} else {
+		struct tls_decrypt_arg darg;
+
+		err = tls_rx_rec_wait(sk, NULL, true, true);
+		if (err <= 0) {
+			tls_rx_reader_unlock(sk, ctx);
+			return err;
+		}
+
+		memset(&darg.inargs, 0, sizeof(darg.inargs));
+
+		err = tls_rx_one_record(sk, NULL, &darg);
+		if (err < 0) {
+			tls_err_abort(sk, -EBADMSG);
+			tls_rx_reader_unlock(sk, ctx);
+			return err;
+		}
+
+		tls_rx_rec_done(ctx);
+		skb = darg.skb;
+	}
+
+	do {
+		rxm = strp_msg(skb);
+		tlm = tls_msg(skb);
+
+		/* read_sock does not support reading control messages */
+		if (tlm->control != TLS_RECORD_TYPE_DATA) {
+			err = -EINVAL;
+			goto read_sock_requeue;
+		}
+
+		used = read_actor(desc, skb, rxm->offset, rxm->full_len);
+		if (used <= 0) {
+			err = used;
+			goto read_sock_end;
+		}
+
+		copied += used;
+		if (used < rxm->full_len) {
+			rxm->offset += used;
+			rxm->full_len -= used;
+			if (!desc->count)
+				goto read_sock_requeue;
+		} else {
+			consume_skb(skb);
+			if (desc->count && !skb_queue_empty(&ctx->rx_list))
+				skb = __skb_dequeue(&ctx->rx_list);
+			else
+				skb = NULL;
+		}
+	} while (skb);
+
+read_sock_end:
+	tls_rx_reader_unlock(sk, ctx);
+	return copied ? : err;
+
+read_sock_requeue:
+	__skb_queue_head(&ctx->rx_list, skb);
+	goto read_sock_end;
+}
+
 bool tls_sw_sock_is_readable(struct sock *sk)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
-- 
2.35.3



* Re: [PATCH 4/4] net/tls: implement ->read_sock()
  2023-06-20 10:28 ` [PATCH 4/4] net/tls: implement ->read_sock() Hannes Reinecke
@ 2023-06-20 13:21   ` Sagi Grimberg
  2023-06-20 17:08     ` Jakub Kicinski
  0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2023-06-20 13:21 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, linux-nvme, Jakub Kicinski, Eric Dumazet,
	Paolo Abeni, netdev, Boris Pismenny


> Implement ->read_sock() function for use with nvme-tcp.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
> Cc: Boris Pismenny <boris.pismenny@gmail.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: netdev@vger.kernel.org
> ---
>   net/tls/tls.h      |  2 ++
>   net/tls/tls_main.c |  2 ++
>   net/tls/tls_sw.c   | 78 ++++++++++++++++++++++++++++++++++++++++++++++
>   3 files changed, 82 insertions(+)
> 
> diff --git a/net/tls/tls.h b/net/tls/tls.h
> index d002c3af1966..ba55cd5c4913 100644
> --- a/net/tls/tls.h
> +++ b/net/tls/tls.h
> @@ -114,6 +114,8 @@ bool tls_sw_sock_is_readable(struct sock *sk);
>   ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
>   			   struct pipe_inode_info *pipe,
>   			   size_t len, unsigned int flags);
> +int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
> +		     sk_read_actor_t read_actor);
>   
>   int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
>   void tls_device_splice_eof(struct socket *sock);
> diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> index 7b9c83dd7de2..1a062a8c6d33 100644
> --- a/net/tls/tls_main.c
> +++ b/net/tls/tls_main.c
> @@ -963,10 +963,12 @@ static void build_proto_ops(struct proto_ops ops[TLS_NUM_CONFIG][TLS_NUM_CONFIG]
>   	ops[TLS_BASE][TLS_SW  ] = ops[TLS_BASE][TLS_BASE];
>   	ops[TLS_BASE][TLS_SW  ].splice_read	= tls_sw_splice_read;
>   	ops[TLS_BASE][TLS_SW  ].poll		= tls_sk_poll;
> +	ops[TLS_BASE][TLS_SW  ].read_sock	= tls_sw_read_sock;
>   
>   	ops[TLS_SW  ][TLS_SW  ] = ops[TLS_SW  ][TLS_BASE];
>   	ops[TLS_SW  ][TLS_SW  ].splice_read	= tls_sw_splice_read;
>   	ops[TLS_SW  ][TLS_SW  ].poll		= tls_sk_poll;
> +	ops[TLS_SW  ][TLS_SW  ].read_sock	= tls_sw_read_sock;
>   
>   #ifdef CONFIG_TLS_DEVICE
>   	ops[TLS_HW  ][TLS_BASE] = ops[TLS_BASE][TLS_BASE];
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 97379e34c997..e918c98bbeb2 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2231,6 +2231,84 @@ ssize_t tls_sw_splice_read(struct socket *sock,  loff_t *ppos,
>   	goto splice_read_end;
>   }
>   
> +int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
> +		     sk_read_actor_t read_actor)
> +{
> +	struct tls_context *tls_ctx = tls_get_ctx(sk);
> +	struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
> +	struct strp_msg *rxm = NULL;
> +	struct tls_msg *tlm;
> +	struct sk_buff *skb;
> +	ssize_t copied = 0;
> +	int err, used;
> +
> +	err = tls_rx_reader_lock(sk, ctx, true);
> +	if (err < 0)
> +		return err;

Unlike recvmsg or splice_read, the caller of read_sock is assumed to
have the socket locked, and tls_rx_reader_lock also calls lock_sock,
how is this not a deadlock?

I'm not exactly clear why the lock is needed here or what is the subtle
distinction between tls_rx_reader_lock and what lock_sock provides.


* Re: [PATCH 4/4] net/tls: implement ->read_sock()
  2023-06-20 13:21   ` Sagi Grimberg
@ 2023-06-20 17:08     ` Jakub Kicinski
  2023-06-21  6:44       ` Hannes Reinecke
  0 siblings, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2023-06-20 17:08 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Hannes Reinecke, Christoph Hellwig, Keith Busch, linux-nvme,
	Eric Dumazet, Paolo Abeni, netdev, Boris Pismenny

On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
> > +	err = tls_rx_reader_lock(sk, ctx, true);
> > +	if (err < 0)
> > +		return err;  
> 
> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
> how is this not a deadlock?

Yeah :|

> I'm not exactly clear why the lock is needed here or what is the subtle
> distinction between tls_rx_reader_lock and what lock_sock provides.

It's a bit of a workaround for the consistency of the data stream.
There's a bunch of state in the TLS ULP, and waiting for mem or data
releases and re-takes the socket lock. So to stop annoying corner-case
races I slapped a lock around all of the reader.

IMHO depending on the socket lock for anything non-trivial and outside
of the socket itself is a bad idea in general.

The immediate need at the time was that if you did a read() and someone
else did a peek() at the same time from a stream of A B C D you may read
A D B C.


* Re: [PATCH 2/4] net/tls: handle MSG_EOR for tls_device TX flow
  2023-06-20 10:28 ` [PATCH 2/4] net/tls: handle MSG_EOR for tls_device " Hannes Reinecke
@ 2023-06-20 17:12   ` Sabrina Dubroca
  2023-06-21  6:09     ` Hannes Reinecke
  0 siblings, 1 reply; 19+ messages in thread
From: Sabrina Dubroca @ 2023-06-20 17:12 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme,
	Jakub Kicinski, Eric Dumazet, Paolo Abeni, netdev

2023-06-20, 12:28:54 +0200, Hannes Reinecke wrote:
> tls_push_data() already handles MSG_MORE / MSG_SENDPAGE_NOTLAST,
> but bails out on MSG_EOR.
> Seeing that MSG_EOR is basically the opposite of
> MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
> MSG_EOR by treating it as the absence of MSG_MORE.
> Consequently, an error is returned when both are set.
> 
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: netdev@vger.kernel.org
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  net/tls/tls_device.c | 25 ++++++++++++++++++++-----
>  1 file changed, 20 insertions(+), 5 deletions(-)
> 
> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
> index b82770f68807..ebefd148ecf5 100644
> --- a/net/tls/tls_device.c
> +++ b/net/tls/tls_device.c
> @@ -440,11 +440,6 @@ static int tls_push_data(struct sock *sk,
>  	int copy, rc = 0;
>  	long timeo;
>  
> -	if (flags &
> -	    ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST |
> -	      MSG_SPLICE_PAGES))
> -		return -EOPNOTSUPP;
> -
>  	if (unlikely(sk->sk_err))
>  		return -sk->sk_err;
>  
> @@ -536,6 +531,10 @@ static int tls_push_data(struct sock *sk,
>  				more = true;
>  				break;
>  			}
> +			if (flags & MSG_EOR) {
> +				more = false;
> +				break;

Why the break here? We don't want to close and push the record in that
case? (the "if (done || ...)" block just below)


> +			}
>  
>  			done = true;
>  		}

Thanks,

-- 
Sabrina



* Re: [PATCH 2/4] net/tls: handle MSG_EOR for tls_device TX flow
  2023-06-20 17:12   ` Sabrina Dubroca
@ 2023-06-21  6:09     ` Hannes Reinecke
  0 siblings, 0 replies; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-21  6:09 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme,
	Jakub Kicinski, Eric Dumazet, Paolo Abeni, netdev

On 6/20/23 19:12, Sabrina Dubroca wrote:
> 2023-06-20, 12:28:54 +0200, Hannes Reinecke wrote:
>> tls_push_data() already handles MSG_MORE / MSG_SENDPAGE_NOTLAST,
>> but bails out on MSG_EOR.
>> Seeing that MSG_EOR is basically the opposite of
>> MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
>> MSG_EOR by treating it as the absence of MSG_MORE.
>> Consequently, an error is returned when both are set.
>>
>> Cc: Jakub Kicinski <kuba@kernel.org>
>> Cc: netdev@vger.kernel.org
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   net/tls/tls_device.c | 25 ++++++++++++++++++++-----
>>   1 file changed, 20 insertions(+), 5 deletions(-)
>>
>> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
>> index b82770f68807..ebefd148ecf5 100644
>> --- a/net/tls/tls_device.c
>> +++ b/net/tls/tls_device.c
>> @@ -440,11 +440,6 @@ static int tls_push_data(struct sock *sk,
>>   	int copy, rc = 0;
>>   	long timeo;
>>   
>> -	if (flags &
>> -	    ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST |
>> -	      MSG_SPLICE_PAGES))
>> -		return -EOPNOTSUPP;
>> -
>>   	if (unlikely(sk->sk_err))
>>   		return -sk->sk_err;
>>   
>> @@ -536,6 +531,10 @@ static int tls_push_data(struct sock *sk,
>>   				more = true;
>>   				break;
>>   			}
>> +			if (flags & MSG_EOR) {
>> +				more = false;
>> +				break;
> 
> Why the break here? We don't want to close and push the record in that
> case? (the "if (done || ...)" block just below)
> 
Ah, yes, you are correct. Will be fixing it.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



* Re: [PATCH 4/4] net/tls: implement ->read_sock()
  2023-06-20 17:08     ` Jakub Kicinski
@ 2023-06-21  6:44       ` Hannes Reinecke
  2023-06-21  8:39         ` Sagi Grimberg
  0 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-21  6:44 UTC (permalink / raw)
  To: Jakub Kicinski, Sagi Grimberg
  Cc: Christoph Hellwig, Keith Busch, linux-nvme, Eric Dumazet,
	Paolo Abeni, netdev, Boris Pismenny

On 6/20/23 19:08, Jakub Kicinski wrote:
> On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
>>> +	err = tls_rx_reader_lock(sk, ctx, true);
>>> +	if (err < 0)
>>> +		return err;
>>
>> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
>> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
>> how is this not a deadlock?
> 
> Yeah :|
> 
>> I'm not exactly clear why the lock is needed here or what is the subtle
>> distinction between tls_rx_reader_lock and what lock_sock provides.
> 
> It's a bit of a workaround for the consistency of the data stream.
> There's a bunch of state in the TLS ULP, and waiting for mem or data
> releases and re-takes the socket lock. So to stop annoying corner-case
> races I slapped a lock around all of the reader.
> 
> IMHO depending on the socket lock for anything non-trivial and outside
> of the socket itself is a bad idea in general.
> 
> The immediate need at the time was that if you did a read() and someone
> else did a peek() at the same time from a stream of A B C D you may read
> A D B C.

Leaving me ever so confused.

read_sock() is a generic interface; we cannot require a protocol 
specific lock before calling it.

What to do now?
Drop the tls_rx_read_lock from read_sock() again?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



* Re: [PATCH 4/4] net/tls: implement ->read_sock()
  2023-06-21  6:44       ` Hannes Reinecke
@ 2023-06-21  8:39         ` Sagi Grimberg
  2023-06-21  9:08           ` Hannes Reinecke
  0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2023-06-21  8:39 UTC (permalink / raw)
  To: Hannes Reinecke, Jakub Kicinski
  Cc: Christoph Hellwig, Keith Busch, linux-nvme, Eric Dumazet,
	Paolo Abeni, netdev, Boris Pismenny


>> On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
>>>> +    err = tls_rx_reader_lock(sk, ctx, true);
>>>> +    if (err < 0)
>>>> +        return err;
>>>
>>> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
>>> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
>>> how is this not a deadlock?
>>
>> Yeah :|
>>
>>> I'm not exactly clear why the lock is needed here or what is the subtle
>>> distinction between tls_rx_reader_lock and what lock_sock provides.
>>
>> It's a bit of a workaround for the consistency of the data stream.
>> There's bunch of state in the TLS ULP and waiting for mem or data
>> releases and re-takes the socket lock. So to stop the flow annoying
>> corner case races I slapped a lock around all of the reader.
>>
>> IMHO depending on the socket lock for anything non-trivial and outside
>> of the socket itself is a bad idea in general.
>>
>> The immediate need at the time was that if you did a read() and someone
>> else did a peek() at the same time from a stream of A B C D you may read
>> A D B C.
> 
> Leaving me ever so confused.
> 
> read_sock() is a generic interface; we cannot require a protocol 
> specific lock before calling it.
> 
> What to do now?
> Drop the tls_rx_read_lock from read_sock() again?

Probably just need to synchronize the readers by splitting that from
tls_rx_reader_lock:
--
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 53f944e6d8ef..53404c3fdcc6 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1845,13 +1845,10 @@ tls_read_flush_backlog(struct sock *sk, struct tls_prot_info *prot,
         return sk_flush_backlog(sk);
  }

-static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
-                             bool nonblock)
+static int tls_rx_reader_acquire(struct sock *sk, struct tls_sw_context_rx *ctx,
+                            bool nonblock)
  {
         long timeo;
-       int err;
-
-       lock_sock(sk);

         timeo = sock_rcvtimeo(sk, nonblock);

@@ -1865,26 +1862,30 @@ static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
                               !READ_ONCE(ctx->reader_present), &wait);
                 remove_wait_queue(&ctx->wq, &wait);

-               if (timeo <= 0) {
-                       err = -EAGAIN;
-                       goto err_unlock;
-               }
-               if (signal_pending(current)) {
-                       err = sock_intr_errno(timeo);
-                       goto err_unlock;
-               }
+               if (timeo <= 0)
+                       return -EAGAIN;
+               if (signal_pending(current))
+                       return sock_intr_errno(timeo);
         }

         WRITE_ONCE(ctx->reader_present, 1);

         return 0;
+}

-err_unlock:
-       release_sock(sk);
+static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
+                             bool nonblock)
+{
+       int err;
+
+       lock_sock(sk);
+       err = tls_rx_reader_acquire(sk, ctx, nonblock);
+       if (err)
+               release_sock(sk);
         return err;
  }

-static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
+static void tls_rx_reader_release(struct sock *sk, struct tls_sw_context_rx *ctx)
  {
         if (unlikely(ctx->reader_contended)) {
                 if (wq_has_sleeper(&ctx->wq))
@@ -1896,6 +1897,11 @@ static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
         }

         WRITE_ONCE(ctx->reader_present, 0);
+}
+
+static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
+{
+       tls_rx_reader_release(sk, ctx);
         release_sock(sk);
  }
--

Then read_sock can just acquire/release.
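
i.e. the entry/exit of tls_sw_read_sock() would then become something
like (untested, assuming the caller already holds the socket lock):

	err = tls_rx_reader_acquire(sk, ctx, true);
	if (err < 0)
		return err;

	/* ... dequeue/decrypt records and feed them to read_actor() ... */

	tls_rx_reader_release(sk, ctx);
	return copied ? : err;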


* Re: [PATCH 4/4] net/tls: implement ->read_sock()
  2023-06-21  8:39         ` Sagi Grimberg
@ 2023-06-21  9:08           ` Hannes Reinecke
  2023-06-21  9:49             ` Sagi Grimberg
  0 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-21  9:08 UTC (permalink / raw)
  To: Sagi Grimberg, Jakub Kicinski
  Cc: Christoph Hellwig, Keith Busch, linux-nvme, Eric Dumazet,
	Paolo Abeni, netdev, Boris Pismenny

On 6/21/23 10:39, Sagi Grimberg wrote:
> 
>>> On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
>>>>> +    err = tls_rx_reader_lock(sk, ctx, true);
>>>>> +    if (err < 0)
>>>>> +        return err;
>>>>
>>>> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
>>>> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
>>>> how is this not a deadlock?
>>>
>>> Yeah :|
>>>
>>>> I'm not exactly clear why the lock is needed here or what is the subtle
>>>> distinction between tls_rx_reader_lock and what lock_sock provides.
>>>
>>> It's a bit of a workaround for the consistency of the data stream.
>>> There's a bunch of state in the TLS ULP, and waiting for mem or data
>>> releases and re-takes the socket lock. So to stop annoying corner-case
>>> races I slapped a lock around all of the reader.
>>>
>>> IMHO depending on the socket lock for anything non-trivial and outside
>>> of the socket itself is a bad idea in general.
>>>
>>> The immediate need at the time was that if you did a read() and someone
>>> else did a peek() at the same time from a stream of A B C D you may read
>>> A D B C.
>>
>> Leaving me ever so confused.
>>
>> read_sock() is a generic interface; we cannot require a protocol 
>> specific lock before calling it.
>>
>> What to do now?
>> Drop the tls_rx_read_lock from read_sock() again?
> 
> Probably just need to synchronize the readers by splitting that from
> tls_rx_reader_lock:
> -- 
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 53f944e6d8ef..53404c3fdcc6 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -1845,13 +1845,10 @@ tls_read_flush_backlog(struct sock *sk, struct tls_prot_info *prot,
>          return sk_flush_backlog(sk);
>   }
> 
> -static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
> -                             bool nonblock)
> +static int tls_rx_reader_acquire(struct sock *sk, struct tls_sw_context_rx *ctx,
> +                            bool nonblock)
>   {
>          long timeo;
> -       int err;
> -
> -       lock_sock(sk);
> 
>          timeo = sock_rcvtimeo(sk, nonblock);
> 
> @@ -1865,26 +1862,30 @@ static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
>                                !READ_ONCE(ctx->reader_present), &wait);
>                  remove_wait_queue(&ctx->wq, &wait);
> 
> -               if (timeo <= 0) {
> -                       err = -EAGAIN;
> -                       goto err_unlock;
> -               }
> -               if (signal_pending(current)) {
> -                       err = sock_intr_errno(timeo);
> -                       goto err_unlock;
> -               }
> +               if (timeo <= 0)
> +                       return -EAGAIN;
> +               if (signal_pending(current))
> +                       return sock_intr_errno(timeo);
>          }
> 
>          WRITE_ONCE(ctx->reader_present, 1);
> 
>          return 0;
> +}
> 
> -err_unlock:
> -       release_sock(sk);
> +static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
> +                             bool nonblock)
> +{
> +       int err;
> +
> +       lock_sock(sk);
> +       err = tls_rx_reader_acquire(sk, ctx, nonblock);
> +       if (err)
> +               release_sock(sk);
>          return err;
>   }
> 
> -static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
> +static void tls_rx_reader_release(struct sock *sk, struct tls_sw_context_rx *ctx)
>   {
>          if (unlikely(ctx->reader_contended)) {
>                  if (wq_has_sleeper(&ctx->wq))
> @@ -1896,6 +1897,11 @@ static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
>          }
> 
>          WRITE_ONCE(ctx->reader_present, 0);
> +}
> +
> +static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
> +{
> +       tls_rx_reader_release(sk, ctx);
>          release_sock(sk);
>   }
> -- 
> 
> Then read_sock can just acquire/release.

Good suggestion.
Will be including it in the next round.

Cheers,

Hannes



* Re: [PATCH 4/4] net/tls: implement ->read_sock()
  2023-06-21  9:08           ` Hannes Reinecke
@ 2023-06-21  9:49             ` Sagi Grimberg
  2023-06-21 19:31               ` Jakub Kicinski
  0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2023-06-21  9:49 UTC (permalink / raw)
  To: Hannes Reinecke, Jakub Kicinski
  Cc: Christoph Hellwig, Keith Busch, linux-nvme, Eric Dumazet,
	Paolo Abeni, netdev, Boris Pismenny



On 6/21/23 12:08, Hannes Reinecke wrote:
> On 6/21/23 10:39, Sagi Grimberg wrote:
>>
>>>> On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
>>>>>> +    err = tls_rx_reader_lock(sk, ctx, true);
>>>>>> +    if (err < 0)
>>>>>> +        return err;
>>>>>
>>>>> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
>>>>> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
>>>>> how is this not a deadlock?
>>>>
>>>> Yeah :|
>>>>
>>>>> I'm not exactly clear why the lock is needed here or what is the subtle
>>>>> distinction between tls_rx_reader_lock and what lock_sock provides.
>>>>
>>>> It's a bit of a workaround for the consistency of the data stream.
>>>> There's a bunch of state in the TLS ULP, and waiting for mem or data
>>>> releases and re-takes the socket lock. So to stop annoying corner-case
>>>> races I slapped a lock around all of the reader.
>>>>
>>>> IMHO depending on the socket lock for anything non-trivial and outside
>>>> of the socket itself is a bad idea in general.
>>>>
>>>> The immediate need at the time was that if you did a read() and someone
>>>> else did a peek() at the same time from a stream of A B C D you may read
>>>> A D B C.
>>>
>>> Leaving me ever so confused.
>>>
>>> read_sock() is a generic interface; we cannot require a protocol 
>>> specific lock before calling it.
>>>
>>> What to do now?
>>> Drop the tls_rx_read_lock from read_sock() again?
>>
>> Probably just need to synchronize the readers by splitting that from
>> tls_rx_reader_lock:
>> -- 
>> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
>> index 53f944e6d8ef..53404c3fdcc6 100644
>> --- a/net/tls/tls_sw.c
>> +++ b/net/tls/tls_sw.c
>> @@ -1845,13 +1845,10 @@ tls_read_flush_backlog(struct sock *sk, struct tls_prot_info *prot,
>>          return sk_flush_backlog(sk);
>>   }
>>
>> -static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
>> -                             bool nonblock)
>> +static int tls_rx_reader_acquire(struct sock *sk, struct tls_sw_context_rx *ctx,
>> +                            bool nonblock)
>>   {
>>          long timeo;
>> -       int err;
>> -
>> -       lock_sock(sk);
>>
>>          timeo = sock_rcvtimeo(sk, nonblock);
>>
>> @@ -1865,26 +1862,30 @@ static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
>>                                !READ_ONCE(ctx->reader_present), &wait);
>>                  remove_wait_queue(&ctx->wq, &wait);
>>
>> -               if (timeo <= 0) {
>> -                       err = -EAGAIN;
>> -                       goto err_unlock;
>> -               }
>> -               if (signal_pending(current)) {
>> -                       err = sock_intr_errno(timeo);
>> -                       goto err_unlock;
>> -               }
>> +               if (timeo <= 0)
>> +                       return -EAGAIN;
>> +               if (signal_pending(current))
>> +                       return sock_intr_errno(timeo);
>>          }
>>
>>          WRITE_ONCE(ctx->reader_present, 1);
>>
>>          return 0;
>> +}
>>
>> -err_unlock:
>> -       release_sock(sk);
>> +static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
>> +                             bool nonblock)
>> +{
>> +       int err;
>> +
>> +       lock_sock(sk);
>> +       err = tls_rx_reader_acquire(sk, ctx, nonblock);
>> +       if (err)
>> +               release_sock(sk);
>>          return err;
>>   }
>>
>> -static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
>> +static void tls_rx_reader_release(struct sock *sk, struct tls_sw_context_rx *ctx)
>>   {
>>          if (unlikely(ctx->reader_contended)) {
>>                  if (wq_has_sleeper(&ctx->wq))
>> @@ -1896,6 +1897,11 @@ static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
>>          }
>>
>>          WRITE_ONCE(ctx->reader_present, 0);
>> +}
>> +
>> +static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
>> +{
>> +       tls_rx_reader_release(sk, ctx);
>>          release_sock(sk);
>>   }
>> -- 
>>
>> Then read_sock can just acquire/release.
> 
> Good suggestion.
> Will be including it in the next round.

Maybe more appropriate helper names would be
tls_rx_reader_enter / tls_rx_reader_exit.

Whatever Jakub prefers...


* Re: [PATCH 4/4] net/tls: implement ->read_sock()
  2023-06-21  9:49             ` Sagi Grimberg
@ 2023-06-21 19:31               ` Jakub Kicinski
  0 siblings, 0 replies; 19+ messages in thread
From: Jakub Kicinski @ 2023-06-21 19:31 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Hannes Reinecke, Christoph Hellwig, Keith Busch, linux-nvme,
	Eric Dumazet, Paolo Abeni, netdev, Boris Pismenny

On Wed, 21 Jun 2023 12:49:21 +0300 Sagi Grimberg wrote:
> > Good suggestion.
> > Will be including it in the next round.  
> 
> Maybe more appropriate helper names would be
> tls_rx_reader_enter / tls_rx_reader_exit.
> 
> Whatever Jakub prefers...

I was thinking along the same lines but with __ in front of the names
of the factored out code. Your naming as suggested in the diff is
better.

