* [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow
2023-06-09 12:51 [PATCHv2 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
@ 2023-06-09 12:51 ` Hannes Reinecke
2023-06-09 16:37 ` Sabrina Dubroca
0 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-09 12:51 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski, netdev,
Hannes Reinecke
tls_sw_sendmsg() / tls_sw_do_sendpage() already handle
MSG_MORE / MSG_SENDPAGE_NOTLAST, but bail out on MSG_EOR.
Seeing that MSG_EOR is basically the opposite of
MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
MSG_EOR by treating it as the negation of MSG_MORE.
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
net/tls/tls_sw.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 635b8bf6b937..be8e0459d403 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -953,9 +953,12 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
int pending;
if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
- MSG_CMSG_COMPAT))
+ MSG_EOR | MSG_CMSG_COMPAT))
return -EOPNOTSUPP;
+ if (msg->msg_flags & MSG_EOR)
+ eor = true;
+
ret = mutex_lock_interruptible(&tls_ctx->tx_lock);
if (ret)
return ret;
@@ -1173,6 +1176,8 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
bool eor;
eor = !(flags & MSG_SENDPAGE_NOTLAST);
+ if (flags & MSG_EOR)
+ eor = true;
sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
/* Call the sk_stream functions to manage the sndbuf mem. */
@@ -1274,7 +1279,7 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
int tls_sw_sendpage_locked(struct sock *sk, struct page *page,
int offset, size_t size, int flags)
{
- if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+ if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY |
MSG_NO_SHARED_FRAGS))
return -EOPNOTSUPP;
@@ -1288,7 +1293,7 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
struct tls_context *tls_ctx = tls_get_ctx(sk);
int ret;
- if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+ if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY))
return -EOPNOTSUPP;
--
2.35.3
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow
2023-06-09 12:51 ` [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow Hannes Reinecke
@ 2023-06-09 16:37 ` Sabrina Dubroca
0 siblings, 0 replies; 19+ messages in thread
From: Sabrina Dubroca @ 2023-06-09 16:37 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme,
Jakub Kicinski, netdev
Hi Hannes,
2023-06-09, 14:51:50 +0200, Hannes Reinecke wrote:
> tls_sw_sendmsg() / tls_sw_do_sendpage() already handle
> MSG_MORE / MSG_SENDPAGE_NOTLAST, but bail out on MSG_EOR.
> Seeing that MSG_EOR is basically the opposite of
> MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
> MSG_EOR by treating it as the negation of MSG_MORE.
>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: netdev@vger.kernel.org
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
> net/tls/tls_sw.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 635b8bf6b937..be8e0459d403 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -953,9 +953,12 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
> int pending;
>
> if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
> - MSG_CMSG_COMPAT))
> + MSG_EOR | MSG_CMSG_COMPAT))
> return -EOPNOTSUPP;
>
> + if (msg->msg_flags & MSG_EOR)
> + eor = true;
Is MSG_EOR supposed to be incompatible with MSG_MORE, or is it
supposed to cancel it? (ie: MSG_MORE | MSG_EOR is invalid, or
MSG_MORE | MSG_EOR behaves like MSG_EOR) The current code already
behaves as if _EOR was passed as long as MSG_MORE isn't passed, so
_EOR is only needed to cancel out _MORE (or in your case, because
NVMe-over-TLS sets it).
If _EOR and _MORE (or MSG_SENDPAGE_NOTLAST below) are supposed to be
incompatible, we should return an error when they're both set. If we
accept both flags being set at the same time, I think we should
document the expected behavior ("_EOR overrides _MORE/_NOTLAST") and
add specific selftests to avoid regressions.
--
Sabrina
* [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow
2023-06-12 14:38 [PATCHv3 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
@ 2023-06-12 14:38 ` Hannes Reinecke
0 siblings, 0 replies; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-12 14:38 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski, netdev,
Hannes Reinecke
tls_sw_sendmsg() / tls_sw_do_sendpage() already handle
MSG_MORE / MSG_SENDPAGE_NOTLAST, but bail out on MSG_EOR.
Seeing that MSG_EOR is basically the opposite of
MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
MSG_EOR by treating it as the negation of MSG_MORE,
and errors out if MSG_EOR is specified together with either
MSG_MORE or MSG_SENDPAGE_NOTLAST.
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
net/tls/tls_sw.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 635b8bf6b937..16eae0c5c819 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -953,7 +953,10 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
int pending;
if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
- MSG_CMSG_COMPAT))
+ MSG_EOR | MSG_CMSG_COMPAT))
+ return -EOPNOTSUPP;
+
+ if (!eor && msg->msg_flags & MSG_EOR)
return -EOPNOTSUPP;
ret = mutex_lock_interruptible(&tls_ctx->tx_lock);
@@ -1274,11 +1277,15 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
int tls_sw_sendpage_locked(struct sock *sk, struct page *page,
int offset, size_t size, int flags)
{
- if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+ if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY |
MSG_NO_SHARED_FRAGS))
return -EOPNOTSUPP;
+ if ((flags & (MSG_MORE | MSG_SENDPAGE_NOTLAST)) &&
+ (flags & MSG_EOR))
+ return -EINVAL;
+
return tls_sw_do_sendpage(sk, page, offset, size, flags);
}
@@ -1288,10 +1295,14 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
struct tls_context *tls_ctx = tls_get_ctx(sk);
int ret;
- if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+ if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY))
return -EOPNOTSUPP;
+ if ((flags & (MSG_MORE | MSG_SENDPAGE_NOTLAST)) &&
+ (flags & MSG_EOR))
+ return -EOPNOTSUPP;
+
ret = mutex_lock_interruptible(&tls_ctx->tx_lock);
if (ret)
return ret;
--
2.35.3
* [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow
2023-06-14 6:22 [PATCHv4 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
@ 2023-06-14 6:22 ` Hannes Reinecke
2023-06-17 6:26 ` Jakub Kicinski
0 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-14 6:22 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski, netdev,
Hannes Reinecke
tls_sw_sendmsg() / tls_sw_do_sendpage() already handle
MSG_MORE / MSG_SENDPAGE_NOTLAST, but bail out on MSG_EOR.
Seeing that MSG_EOR is basically the opposite of
MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
MSG_EOR by treating it as the negation of MSG_MORE,
and errors out if MSG_EOR is specified together with either
MSG_MORE or MSG_SENDPAGE_NOTLAST.
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
net/tls/tls_sw.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 319f61590d2c..47eeff4d7d10 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -984,6 +984,9 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
int ret = 0;
int pending;
+ if (!eor && (msg->msg_flags & MSG_EOR))
+ return -EINVAL;
+
if (unlikely(msg->msg_controllen)) {
ret = tls_process_cmsg(sk, msg, &record_type);
if (ret) {
@@ -1193,7 +1196,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
int ret;
if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
- MSG_CMSG_COMPAT | MSG_SPLICE_PAGES |
+ MSG_CMSG_COMPAT | MSG_SPLICE_PAGES | MSG_EOR |
MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY))
return -EOPNOTSUPP;
@@ -1287,7 +1290,7 @@ int tls_sw_sendpage_locked(struct sock *sk, struct page *page,
struct bio_vec bvec;
struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, };
- if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+ if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY |
MSG_NO_SHARED_FRAGS))
return -EOPNOTSUPP;
--
2.35.3
* Re: [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow
2023-06-14 6:22 ` [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow Hannes Reinecke
@ 2023-06-17 6:26 ` Jakub Kicinski
0 siblings, 0 replies; 19+ messages in thread
From: Jakub Kicinski @ 2023-06-17 6:26 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme, netdev
On Wed, 14 Jun 2023 08:22:09 +0200 Hannes Reinecke wrote:
> @@ -1287,7 +1290,7 @@ int tls_sw_sendpage_locked(struct sock *sk, struct page *page,
> struct bio_vec bvec;
> struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, };
>
> - if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
> + if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
> MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY |
> MSG_NO_SHARED_FRAGS))
> return -EOPNOTSUPP;
you added it to sendpage_locked but not normal sendpage
* [PATCHv5 0/4] net/tls: fixes for NVMe-over-TLS
@ 2023-06-20 10:28 Hannes Reinecke
2023-06-20 10:28 ` [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow Hannes Reinecke
` (3 more replies)
0 siblings, 4 replies; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-20 10:28 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski,
Eric Dumazet, Paolo Abeni, netdev, Hannes Reinecke
Hi all,
here are some small fixes to get NVMe-over-TLS up and running.
The first three are just minor modifications to have MSG_EOR handled
for TLS (and adding a test for it), but the last implements the
->read_sock() callback for tls_sw and I guess could do with some
reviews.
It does work with my NVMe-TLS test harness, but what do I know :-)
As usual, comments and reviews are welcome.
Changes to the original submission:
- Add a testcase for MSG_EOR handling
Changes to v2:
- Bail out on conflicting message flags
- Rework flag handling
Changes to v3:
- Return -EINVAL on conflicting flags
- Rebase on top of net-next
Changes to v4:
- Add tls_rx_reader_lock() to read_sock
- Add MSG_EOR handling to tls_sw_readpages()
Hannes Reinecke (4):
net/tls: handle MSG_EOR for tls_sw TX flow
net/tls: handle MSG_EOR for tls_device TX flow
selftests/net/tls: add test for MSG_EOR
net/tls: implement ->read_sock()
net/tls/tls.h | 2 +
net/tls/tls_device.c | 25 +++++++--
net/tls/tls_main.c | 2 +
net/tls/tls_sw.c | 87 +++++++++++++++++++++++++++++--
tools/testing/selftests/net/tls.c | 11 ++++
5 files changed, 119 insertions(+), 8 deletions(-)
--
2.35.3
* [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow
2023-06-20 10:28 [PATCHv5 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
@ 2023-06-20 10:28 ` Hannes Reinecke
2023-06-20 10:28 ` [PATCH 2/4] net/tls: handle MSG_EOR for tls_device " Hannes Reinecke
` (2 subsequent siblings)
3 siblings, 0 replies; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-20 10:28 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski,
Eric Dumazet, Paolo Abeni, netdev, Hannes Reinecke
tls_sw_sendmsg() / tls_sw_do_sendpage() already handle
MSG_MORE / MSG_SENDPAGE_NOTLAST, but bail out on MSG_EOR.
Seeing that MSG_EOR is basically the opposite of
MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
MSG_EOR by treating it as the negation of MSG_MORE,
and errors out if MSG_EOR is specified together with either
MSG_MORE or MSG_SENDPAGE_NOTLAST.
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
net/tls/tls_sw.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 319f61590d2c..97379e34c997 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -984,6 +984,9 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
int ret = 0;
int pending;
+ if (!eor && (msg->msg_flags & MSG_EOR))
+ return -EINVAL;
+
if (unlikely(msg->msg_controllen)) {
ret = tls_process_cmsg(sk, msg, &record_type);
if (ret) {
@@ -1193,7 +1196,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
int ret;
if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
- MSG_CMSG_COMPAT | MSG_SPLICE_PAGES |
+ MSG_CMSG_COMPAT | MSG_SPLICE_PAGES | MSG_EOR |
MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY))
return -EOPNOTSUPP;
@@ -1287,7 +1290,7 @@ int tls_sw_sendpage_locked(struct sock *sk, struct page *page,
struct bio_vec bvec;
struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, };
- if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+ if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY |
MSG_NO_SHARED_FRAGS))
return -EOPNOTSUPP;
@@ -1305,7 +1308,7 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
struct bio_vec bvec;
struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, };
- if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+ if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_EOR |
MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY))
return -EOPNOTSUPP;
if (flags & MSG_SENDPAGE_NOTLAST)
--
2.35.3
* [PATCH 2/4] net/tls: handle MSG_EOR for tls_device TX flow
2023-06-20 10:28 [PATCHv5 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
2023-06-20 10:28 ` [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow Hannes Reinecke
@ 2023-06-20 10:28 ` Hannes Reinecke
2023-06-20 17:12 ` Sabrina Dubroca
2023-06-20 10:28 ` [PATCH 3/4] selftests/net/tls: add test for MSG_EOR Hannes Reinecke
2023-06-20 10:28 ` [PATCH 4/4] net/tls: implement ->read_sock() Hannes Reinecke
3 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-20 10:28 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski,
Eric Dumazet, Paolo Abeni, netdev, Hannes Reinecke
tls_push_data() already handles MSG_MORE / MSG_SENDPAGE_NOTLAST,
but bails out on MSG_EOR.
Seeing that MSG_EOR is basically the opposite of
MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
MSG_EOR by treating it as the absence of MSG_MORE.
Consequently, an error is returned when both are set.
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
net/tls/tls_device.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index b82770f68807..ebefd148ecf5 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -440,11 +440,6 @@ static int tls_push_data(struct sock *sk,
int copy, rc = 0;
long timeo;
- if (flags &
- ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST |
- MSG_SPLICE_PAGES))
- return -EOPNOTSUPP;
-
if (unlikely(sk->sk_err))
return -sk->sk_err;
@@ -536,6 +531,10 @@ static int tls_push_data(struct sock *sk,
more = true;
break;
}
+ if (flags & MSG_EOR) {
+ more = false;
+ break;
+ }
done = true;
}
@@ -582,6 +581,14 @@ int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
if (!tls_ctx->zerocopy_sendfile)
msg->msg_flags &= ~MSG_SPLICE_PAGES;
+ if (msg->msg_flags &
+ ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST |
+ MSG_SPLICE_PAGES | MSG_EOR))
+ return -EOPNOTSUPP;
+
+ if ((msg->msg_flags & (MSG_MORE | MSG_EOR)) == (MSG_MORE | MSG_EOR))
+ return -EOPNOTSUPP;
+
mutex_lock(&tls_ctx->tx_lock);
lock_sock(sk);
@@ -627,9 +634,17 @@ int tls_device_sendpage(struct sock *sk, struct page *page,
struct bio_vec bvec;
struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, };
+ if (flags &
+ ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST |
+ MSG_SPLICE_PAGES | MSG_EOR))
+ return -EOPNOTSUPP;
+
if (flags & MSG_SENDPAGE_NOTLAST)
msg.msg_flags |= MSG_MORE;
+ if ((msg.msg_flags & (MSG_MORE | MSG_EOR)) == (MSG_MORE | MSG_EOR))
+ return -EINVAL;
+
if (flags & MSG_OOB)
return -EOPNOTSUPP;
--
2.35.3
* [PATCH 3/4] selftests/net/tls: add test for MSG_EOR
2023-06-20 10:28 [PATCHv5 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
2023-06-20 10:28 ` [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow Hannes Reinecke
2023-06-20 10:28 ` [PATCH 2/4] net/tls: handle MSG_EOR for tls_device " Hannes Reinecke
@ 2023-06-20 10:28 ` Hannes Reinecke
2023-06-20 10:28 ` [PATCH 4/4] net/tls: implement ->read_sock() Hannes Reinecke
3 siblings, 0 replies; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-20 10:28 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski,
Eric Dumazet, Paolo Abeni, netdev, Hannes Reinecke
As the recent patches modify the MSG_EOR handling behaviour for TLS,
we should have a test for it.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
---
tools/testing/selftests/net/tls.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index eccea9845c65..e8540fa5b310 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -477,6 +477,17 @@ TEST_F(tls, msg_more_unsent)
EXPECT_EQ(recv(self->cfd, buf, send_len, MSG_DONTWAIT), -1);
}
+TEST_F(tls, msg_eor)
+{
+ char const *test_str = "test_read";
+ int send_len = 10;
+ char buf[10];
+
+ EXPECT_EQ(send(self->fd, test_str, send_len, MSG_EOR), send_len);
+ EXPECT_EQ(recv(self->cfd, buf, send_len, MSG_WAITALL), send_len);
+ EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
+}
+
TEST_F(tls, sendmsg_single)
{
struct msghdr msg;
--
2.35.3
* [PATCH 4/4] net/tls: implement ->read_sock()
2023-06-20 10:28 [PATCHv5 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
` (2 preceding siblings ...)
2023-06-20 10:28 ` [PATCH 3/4] selftests/net/tls: add test for MSG_EOR Hannes Reinecke
@ 2023-06-20 10:28 ` Hannes Reinecke
2023-06-20 13:21 ` Sagi Grimberg
3 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-20 10:28 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski,
Eric Dumazet, Paolo Abeni, netdev, Hannes Reinecke,
Boris Pismenny
Implement ->read_sock() function for use with nvme-tcp.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Cc: Boris Pismenny <boris.pismenny@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
---
net/tls/tls.h | 2 ++
net/tls/tls_main.c | 2 ++
net/tls/tls_sw.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 82 insertions(+)
diff --git a/net/tls/tls.h b/net/tls/tls.h
index d002c3af1966..ba55cd5c4913 100644
--- a/net/tls/tls.h
+++ b/net/tls/tls.h
@@ -114,6 +114,8 @@ bool tls_sw_sock_is_readable(struct sock *sk);
ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
struct pipe_inode_info *pipe,
size_t len, unsigned int flags);
+int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
+ sk_read_actor_t read_actor);
int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
void tls_device_splice_eof(struct socket *sock);
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 7b9c83dd7de2..1a062a8c6d33 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -963,10 +963,12 @@ static void build_proto_ops(struct proto_ops ops[TLS_NUM_CONFIG][TLS_NUM_CONFIG]
ops[TLS_BASE][TLS_SW ] = ops[TLS_BASE][TLS_BASE];
ops[TLS_BASE][TLS_SW ].splice_read = tls_sw_splice_read;
ops[TLS_BASE][TLS_SW ].poll = tls_sk_poll;
+ ops[TLS_BASE][TLS_SW ].read_sock = tls_sw_read_sock;
ops[TLS_SW ][TLS_SW ] = ops[TLS_SW ][TLS_BASE];
ops[TLS_SW ][TLS_SW ].splice_read = tls_sw_splice_read;
ops[TLS_SW ][TLS_SW ].poll = tls_sk_poll;
+ ops[TLS_SW ][TLS_SW ].read_sock = tls_sw_read_sock;
#ifdef CONFIG_TLS_DEVICE
ops[TLS_HW ][TLS_BASE] = ops[TLS_BASE][TLS_BASE];
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 97379e34c997..e918c98bbeb2 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2231,6 +2231,84 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
goto splice_read_end;
}
+int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
+ sk_read_actor_t read_actor)
+{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
+ struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
+ struct strp_msg *rxm = NULL;
+ struct tls_msg *tlm;
+ struct sk_buff *skb;
+ ssize_t copied = 0;
+ int err, used;
+
+ err = tls_rx_reader_lock(sk, ctx, true);
+ if (err < 0)
+ return err;
+ if (!skb_queue_empty(&ctx->rx_list)) {
+ skb = __skb_dequeue(&ctx->rx_list);
+ } else {
+ struct tls_decrypt_arg darg;
+
+ err = tls_rx_rec_wait(sk, NULL, true, true);
+ if (err <= 0) {
+ tls_rx_reader_unlock(sk, ctx);
+ return err;
+ }
+
+ memset(&darg.inargs, 0, sizeof(darg.inargs));
+
+ err = tls_rx_one_record(sk, NULL, &darg);
+ if (err < 0) {
+ tls_err_abort(sk, -EBADMSG);
+ tls_rx_reader_unlock(sk, ctx);
+ return err;
+ }
+
+ tls_rx_rec_done(ctx);
+ skb = darg.skb;
+ }
+
+ do {
+ rxm = strp_msg(skb);
+ tlm = tls_msg(skb);
+
+ /* read_sock does not support reading control messages */
+ if (tlm->control != TLS_RECORD_TYPE_DATA) {
+ err = -EINVAL;
+ goto read_sock_requeue;
+ }
+
+ used = read_actor(desc, skb, rxm->offset, rxm->full_len);
+ if (used <= 0) {
+ err = used;
+ goto read_sock_end;
+ }
+
+ copied += used;
+ if (used < rxm->full_len) {
+ rxm->offset += used;
+ rxm->full_len -= used;
+ if (!desc->count)
+ goto read_sock_requeue;
+ } else {
+ consume_skb(skb);
+ if (desc->count && !skb_queue_empty(&ctx->rx_list))
+ skb = __skb_dequeue(&ctx->rx_list);
+ else
+ skb = NULL;
+ }
+ } while (skb);
+
+read_sock_end:
+ tls_rx_reader_unlock(sk, ctx);
+ return copied ? : err;
+
+read_sock_requeue:
+ __skb_queue_head(&ctx->rx_list, skb);
+ goto read_sock_end;
+}
+
bool tls_sw_sock_is_readable(struct sock *sk)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
--
2.35.3
* Re: [PATCH 4/4] net/tls: implement ->read_sock()
2023-06-20 10:28 ` [PATCH 4/4] net/tls: implement ->read_sock() Hannes Reinecke
@ 2023-06-20 13:21 ` Sagi Grimberg
2023-06-20 17:08 ` Jakub Kicinski
0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2023-06-20 13:21 UTC (permalink / raw)
To: Hannes Reinecke, Christoph Hellwig
Cc: Keith Busch, linux-nvme, Jakub Kicinski, Eric Dumazet,
Paolo Abeni, netdev, Boris Pismenny
> Implement ->read_sock() function for use with nvme-tcp.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
> Cc: Boris Pismenny <boris.pismenny@gmail.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: netdev@vger.kernel.org
> ---
> net/tls/tls.h | 2 ++
> net/tls/tls_main.c | 2 ++
> net/tls/tls_sw.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 82 insertions(+)
>
> diff --git a/net/tls/tls.h b/net/tls/tls.h
> index d002c3af1966..ba55cd5c4913 100644
> --- a/net/tls/tls.h
> +++ b/net/tls/tls.h
> @@ -114,6 +114,8 @@ bool tls_sw_sock_is_readable(struct sock *sk);
> ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
> struct pipe_inode_info *pipe,
> size_t len, unsigned int flags);
> +int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
> + sk_read_actor_t read_actor);
>
> int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
> void tls_device_splice_eof(struct socket *sock);
> diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> index 7b9c83dd7de2..1a062a8c6d33 100644
> --- a/net/tls/tls_main.c
> +++ b/net/tls/tls_main.c
> @@ -963,10 +963,12 @@ static void build_proto_ops(struct proto_ops ops[TLS_NUM_CONFIG][TLS_NUM_CONFIG]
> ops[TLS_BASE][TLS_SW ] = ops[TLS_BASE][TLS_BASE];
> ops[TLS_BASE][TLS_SW ].splice_read = tls_sw_splice_read;
> ops[TLS_BASE][TLS_SW ].poll = tls_sk_poll;
> + ops[TLS_BASE][TLS_SW ].read_sock = tls_sw_read_sock;
>
> ops[TLS_SW ][TLS_SW ] = ops[TLS_SW ][TLS_BASE];
> ops[TLS_SW ][TLS_SW ].splice_read = tls_sw_splice_read;
> ops[TLS_SW ][TLS_SW ].poll = tls_sk_poll;
> + ops[TLS_SW ][TLS_SW ].read_sock = tls_sw_read_sock;
>
> #ifdef CONFIG_TLS_DEVICE
> ops[TLS_HW ][TLS_BASE] = ops[TLS_BASE][TLS_BASE];
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 97379e34c997..e918c98bbeb2 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2231,6 +2231,84 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
> goto splice_read_end;
> }
>
> +int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
> + sk_read_actor_t read_actor)
> +{
> + struct tls_context *tls_ctx = tls_get_ctx(sk);
> + struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
> + struct strp_msg *rxm = NULL;
> + struct tls_msg *tlm;
> + struct sk_buff *skb;
> + ssize_t copied = 0;
> + int err, used;
> +
> + err = tls_rx_reader_lock(sk, ctx, true);
> + if (err < 0)
> + return err;
Unlike recvmsg or splice_read, the caller of read_sock is assumed to
have the socket locked, and tls_rx_reader_lock also calls lock_sock,
how is this not a deadlock?
I'm not exactly clear why the lock is needed here or what is the subtle
distinction between tls_rx_reader_lock and what lock_sock provides.
* Re: [PATCH 4/4] net/tls: implement ->read_sock()
2023-06-20 13:21 ` Sagi Grimberg
@ 2023-06-20 17:08 ` Jakub Kicinski
2023-06-21 6:44 ` Hannes Reinecke
0 siblings, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2023-06-20 17:08 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Hannes Reinecke, Christoph Hellwig, Keith Busch, linux-nvme,
Eric Dumazet, Paolo Abeni, netdev, Boris Pismenny
On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
> > + err = tls_rx_reader_lock(sk, ctx, true);
> > + if (err < 0)
> > + return err;
>
> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
> how is this not a deadlock?
Yeah :|
> I'm not exactly clear why the lock is needed here or what is the subtle
> distinction between tls_rx_reader_lock and what lock_sock provides.
It's a bit of a workaround for the consistency of the data stream.
There's a bunch of state in the TLS ULP, and waiting for mem or data
releases and re-takes the socket lock. So to stop the annoying
corner case races I slapped a lock around all of the reader.
IMHO depending on the socket lock for anything non-trivial and outside
of the socket itself is a bad idea in general.
The immediate need at the time was that if you did a read() and someone
else did a peek() at the same time from a stream of A B C D you may read
A D B C.
* Re: [PATCH 2/4] net/tls: handle MSG_EOR for tls_device TX flow
2023-06-20 10:28 ` [PATCH 2/4] net/tls: handle MSG_EOR for tls_device " Hannes Reinecke
@ 2023-06-20 17:12 ` Sabrina Dubroca
2023-06-21 6:09 ` Hannes Reinecke
0 siblings, 1 reply; 19+ messages in thread
From: Sabrina Dubroca @ 2023-06-20 17:12 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme,
Jakub Kicinski, Eric Dumazet, Paolo Abeni, netdev
2023-06-20, 12:28:54 +0200, Hannes Reinecke wrote:
> tls_push_data() already handles MSG_MORE / MSG_SENDPAGE_NOTLAST,
> but bails out on MSG_EOR.
> Seeing that MSG_EOR is basically the opposite of
> MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
> MSG_EOR by treating it as the absence of MSG_MORE.
> Consequently, an error is returned when both are set.
>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: netdev@vger.kernel.org
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
> net/tls/tls_device.c | 25 ++++++++++++++++++++-----
> 1 file changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
> index b82770f68807..ebefd148ecf5 100644
> --- a/net/tls/tls_device.c
> +++ b/net/tls/tls_device.c
> @@ -440,11 +440,6 @@ static int tls_push_data(struct sock *sk,
> int copy, rc = 0;
> long timeo;
>
> - if (flags &
> - ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST |
> - MSG_SPLICE_PAGES))
> - return -EOPNOTSUPP;
> -
> if (unlikely(sk->sk_err))
> return -sk->sk_err;
>
> @@ -536,6 +531,10 @@ static int tls_push_data(struct sock *sk,
> more = true;
> break;
> }
> + if (flags & MSG_EOR) {
> + more = false;
> + break;
Why the break here? We don't want to close and push the record in that
case? (the "if (done || ...)" block just below)
> + }
>
> done = true;
> }
Thanks,
--
Sabrina
* Re: [PATCH 2/4] net/tls: handle MSG_EOR for tls_device TX flow
2023-06-20 17:12 ` Sabrina Dubroca
@ 2023-06-21 6:09 ` Hannes Reinecke
0 siblings, 0 replies; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-21 6:09 UTC (permalink / raw)
To: Sabrina Dubroca
Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme,
Jakub Kicinski, Eric Dumazet, Paolo Abeni, netdev
On 6/20/23 19:12, Sabrina Dubroca wrote:
> 2023-06-20, 12:28:54 +0200, Hannes Reinecke wrote:
>> tls_push_data() already handles MSG_MORE / MSG_SENDPAGE_NOTLAST,
>> but bails out on MSG_EOR.
>> Seeing that MSG_EOR is basically the opposite of
>> MSG_MORE / MSG_SENDPAGE_NOTLAST, this patch adds handling for
>> MSG_EOR by treating it as the absence of MSG_MORE.
>> Consequently, an error is returned when both are set.
>>
>> Cc: Jakub Kicinski <kuba@kernel.org>
>> Cc: netdev@vger.kernel.org
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>> net/tls/tls_device.c | 25 ++++++++++++++++++++-----
>> 1 file changed, 20 insertions(+), 5 deletions(-)
>>
>> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
>> index b82770f68807..ebefd148ecf5 100644
>> --- a/net/tls/tls_device.c
>> +++ b/net/tls/tls_device.c
>> @@ -440,11 +440,6 @@ static int tls_push_data(struct sock *sk,
>> int copy, rc = 0;
>> long timeo;
>>
>> - if (flags &
>> - ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST |
>> - MSG_SPLICE_PAGES))
>> - return -EOPNOTSUPP;
>> -
>> if (unlikely(sk->sk_err))
>> return -sk->sk_err;
>>
>> @@ -536,6 +531,10 @@ static int tls_push_data(struct sock *sk,
>> more = true;
>> break;
>> }
>> + if (flags & MSG_EOR) {
>> + more = false;
>> + break;
>
> Why the break here? We don't want to close and push the record in that
> case? (the "if (done || ...)" block just below)
>
Ah, yes, you are correct. Will be fixing it.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/4] net/tls: implement ->read_sock()
2023-06-20 17:08 ` Jakub Kicinski
@ 2023-06-21 6:44 ` Hannes Reinecke
2023-06-21 8:39 ` Sagi Grimberg
0 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-21 6:44 UTC (permalink / raw)
To: Jakub Kicinski, Sagi Grimberg
Cc: Christoph Hellwig, Keith Busch, linux-nvme, Eric Dumazet,
Paolo Abeni, netdev, Boris Pismenny
On 6/20/23 19:08, Jakub Kicinski wrote:
> On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
>>> + err = tls_rx_reader_lock(sk, ctx, true);
>>> + if (err < 0)
>>> + return err;
>>
>> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
>> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
>> how is this not a deadlock?
>
> Yeah :|
>
>> I'm not exactly clear why the lock is needed here or what is the subtle
>> distinction between tls_rx_reader_lock and what lock_sock provides.
>
> It's a bit of a workaround for the consistency of the data stream.
> There's a bunch of state in the TLS ULP, and waiting for mem or data
> releases and re-takes the socket lock. So to stop the flow of annoying
> corner-case races I slapped a lock around all of the reader.
>
> IMHO depending on the socket lock for anything non-trivial and outside
> of the socket itself is a bad idea in general.
>
> The immediate need at the time was that if you did a read() and someone
> else did a peek() at the same time from a stream of A B C D you may read
> A D B C.
Leaving me ever so confused.
read_sock() is a generic interface; we cannot require a protocol-specific
lock before calling it.
What to do now?
Drop the tls_rx_reader_lock from read_sock() again?
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman
* Re: [PATCH 4/4] net/tls: implement ->read_sock()
2023-06-21 6:44 ` Hannes Reinecke
@ 2023-06-21 8:39 ` Sagi Grimberg
2023-06-21 9:08 ` Hannes Reinecke
0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2023-06-21 8:39 UTC (permalink / raw)
To: Hannes Reinecke, Jakub Kicinski
Cc: Christoph Hellwig, Keith Busch, linux-nvme, Eric Dumazet,
Paolo Abeni, netdev, Boris Pismenny
>> On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
>>>> + err = tls_rx_reader_lock(sk, ctx, true);
>>>> + if (err < 0)
>>>> + return err;
>>>
>>> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
>>> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
>>> how is this not a deadlock?
>>
>> Yeah :|
>>
>>> I'm not exactly clear why the lock is needed here or what is the subtle
>>> distinction between tls_rx_reader_lock and what lock_sock provides.
>>
>> It's a bit of a workaround for the consistency of the data stream.
>> There's a bunch of state in the TLS ULP, and waiting for mem or data
>> releases and re-takes the socket lock. So to stop the flow of annoying
>> corner-case races I slapped a lock around all of the reader.
>>
>> IMHO depending on the socket lock for anything non-trivial and outside
>> of the socket itself is a bad idea in general.
>>
>> The immediate need at the time was that if you did a read() and someone
>> else did a peek() at the same time from a stream of A B C D you may read
>> A D B C.
>
> Leaving me ever so confused.
>
> read_sock() is a generic interface; we cannot require a protocol-specific
> lock before calling it.
>
> What to do now?
> Drop the tls_rx_reader_lock from read_sock() again?
Probably just need to synchronize the readers by splitting that from
tls_rx_reader_lock:
--
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 53f944e6d8ef..53404c3fdcc6 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1845,13 +1845,10 @@ tls_read_flush_backlog(struct sock *sk, struct tls_prot_info *prot,
 		return sk_flush_backlog(sk);
 }
 
-static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
-			      bool nonblock)
+static int tls_rx_reader_acquire(struct sock *sk, struct tls_sw_context_rx *ctx,
+				 bool nonblock)
 {
 	long timeo;
-	int err;
-
-	lock_sock(sk);
 
 	timeo = sock_rcvtimeo(sk, nonblock);
 
@@ -1865,26 +1862,30 @@ static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
 					      !READ_ONCE(ctx->reader_present), &wait);
 		remove_wait_queue(&ctx->wq, &wait);
 
-		if (timeo <= 0) {
-			err = -EAGAIN;
-			goto err_unlock;
-		}
-		if (signal_pending(current)) {
-			err = sock_intr_errno(timeo);
-			goto err_unlock;
-		}
+		if (timeo <= 0)
+			return -EAGAIN;
+		if (signal_pending(current))
+			return sock_intr_errno(timeo);
 	}
 
 	WRITE_ONCE(ctx->reader_present, 1);
 
 	return 0;
+}
 
-err_unlock:
-	release_sock(sk);
+static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
+			      bool nonblock)
+{
+	int err;
+
+	lock_sock(sk);
+	err = tls_rx_reader_acquire(sk, ctx, nonblock);
+	if (err)
+		release_sock(sk);
 	return err;
 }
 
-static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
+static void tls_rx_reader_release(struct sock *sk, struct tls_sw_context_rx *ctx)
 {
 	if (unlikely(ctx->reader_contended)) {
 		if (wq_has_sleeper(&ctx->wq))
@@ -1896,6 +1897,11 @@ static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
 	}
 
 	WRITE_ONCE(ctx->reader_present, 0);
+}
+
+static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
+{
+	tls_rx_reader_release(sk, ctx);
 	release_sock(sk);
 }
--
Then read_sock can just acquire/release.
* Re: [PATCH 4/4] net/tls: implement ->read_sock()
2023-06-21 8:39 ` Sagi Grimberg
@ 2023-06-21 9:08 ` Hannes Reinecke
2023-06-21 9:49 ` Sagi Grimberg
0 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2023-06-21 9:08 UTC (permalink / raw)
To: Sagi Grimberg, Jakub Kicinski
Cc: Christoph Hellwig, Keith Busch, linux-nvme, Eric Dumazet,
Paolo Abeni, netdev, Boris Pismenny
On 6/21/23 10:39, Sagi Grimberg wrote:
>
>>> On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
>>>>> + err = tls_rx_reader_lock(sk, ctx, true);
>>>>> + if (err < 0)
>>>>> + return err;
>>>>
>>>> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
>>>> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
>>>> how is this not a deadlock?
>>>
>>> Yeah :|
>>>
>>>> I'm not exactly clear why the lock is needed here or what is the subtle
>>>> distinction between tls_rx_reader_lock and what lock_sock provides.
>>>
>>> It's a bit of a workaround for the consistency of the data stream.
>>> There's a bunch of state in the TLS ULP, and waiting for mem or data
>>> releases and re-takes the socket lock. So to stop the flow of annoying
>>> corner-case races I slapped a lock around all of the reader.
>>>
>>> IMHO depending on the socket lock for anything non-trivial and outside
>>> of the socket itself is a bad idea in general.
>>>
>>> The immediate need at the time was that if you did a read() and someone
>>> else did a peek() at the same time from a stream of A B C D you may read
>>> A D B C.
>>
>> Leaving me ever so confused.
>>
>> read_sock() is a generic interface; we cannot require a protocol-specific
>> lock before calling it.
>>
>> What to do now?
>> Drop the tls_rx_reader_lock from read_sock() again?
>
> Probably just need to synchronize the readers by splitting that from
> tls_rx_reader_lock:
> --
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 53f944e6d8ef..53404c3fdcc6 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -1845,13 +1845,10 @@ tls_read_flush_backlog(struct sock *sk, struct tls_prot_info *prot,
>  		return sk_flush_backlog(sk);
>  }
>  
> -static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
> -			      bool nonblock)
> +static int tls_rx_reader_acquire(struct sock *sk, struct tls_sw_context_rx *ctx,
> +				 bool nonblock)
>  {
>  	long timeo;
> -	int err;
> -
> -	lock_sock(sk);
>  
>  	timeo = sock_rcvtimeo(sk, nonblock);
>  
> @@ -1865,26 +1862,30 @@ static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
>  					      !READ_ONCE(ctx->reader_present), &wait);
>  		remove_wait_queue(&ctx->wq, &wait);
>  
> -		if (timeo <= 0) {
> -			err = -EAGAIN;
> -			goto err_unlock;
> -		}
> -		if (signal_pending(current)) {
> -			err = sock_intr_errno(timeo);
> -			goto err_unlock;
> -		}
> +		if (timeo <= 0)
> +			return -EAGAIN;
> +		if (signal_pending(current))
> +			return sock_intr_errno(timeo);
>  	}
>  
>  	WRITE_ONCE(ctx->reader_present, 1);
>  
>  	return 0;
> +}
>  
> -err_unlock:
> -	release_sock(sk);
> +static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
> +			      bool nonblock)
> +{
> +	int err;
> +
> +	lock_sock(sk);
> +	err = tls_rx_reader_acquire(sk, ctx, nonblock);
> +	if (err)
> +		release_sock(sk);
>  	return err;
>  }
>  
> -static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
> +static void tls_rx_reader_release(struct sock *sk, struct tls_sw_context_rx *ctx)
>  {
>  	if (unlikely(ctx->reader_contended)) {
>  		if (wq_has_sleeper(&ctx->wq))
> @@ -1896,6 +1897,11 @@ static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
>  	}
>  
>  	WRITE_ONCE(ctx->reader_present, 0);
> +}
> +
> +static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
> +{
> +	tls_rx_reader_release(sk, ctx);
>  	release_sock(sk);
>  }
> --
>
> Then read_sock can just acquire/release.
Good suggestion.
Will be including it in the next round.
Cheers,
Hannes
* Re: [PATCH 4/4] net/tls: implement ->read_sock()
2023-06-21 9:08 ` Hannes Reinecke
@ 2023-06-21 9:49 ` Sagi Grimberg
2023-06-21 19:31 ` Jakub Kicinski
0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2023-06-21 9:49 UTC (permalink / raw)
To: Hannes Reinecke, Jakub Kicinski
Cc: Christoph Hellwig, Keith Busch, linux-nvme, Eric Dumazet,
Paolo Abeni, netdev, Boris Pismenny
On 6/21/23 12:08, Hannes Reinecke wrote:
> On 6/21/23 10:39, Sagi Grimberg wrote:
>>
>>>> On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
>>>>>> + err = tls_rx_reader_lock(sk, ctx, true);
>>>>>> + if (err < 0)
>>>>>> + return err;
>>>>>
>>>>> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
>>>>> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
>>>>> how is this not a deadlock?
>>>>
>>>> Yeah :|
>>>>
>>>>> I'm not exactly clear why the lock is needed here or what is the
>>>>> subtle
>>>>> distinction between tls_rx_reader_lock and what lock_sock provides.
>>>>
>>>> It's a bit of a workaround for the consistency of the data stream.
>>>> There's a bunch of state in the TLS ULP, and waiting for mem or data
>>>> releases and re-takes the socket lock. So to stop the flow of annoying
>>>> corner-case races I slapped a lock around all of the reader.
>>>>
>>>> IMHO depending on the socket lock for anything non-trivial and outside
>>>> of the socket itself is a bad idea in general.
>>>>
>>>> The immediate need at the time was that if you did a read() and someone
>>>> else did a peek() at the same time from a stream of A B C D you may
>>>> read
>>>> A D B C.
>>>
>>> Leaving me ever so confused.
>>>
>>> read_sock() is a generic interface; we cannot require a protocol-specific
>>> lock before calling it.
>>>
>>> What to do now?
>>> Drop the tls_rx_reader_lock from read_sock() again?
>>
>> Probably just need to synchronize the readers by splitting that from
>> tls_rx_reader_lock:
>> --
>> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
>> index 53f944e6d8ef..53404c3fdcc6 100644
>> --- a/net/tls/tls_sw.c
>> +++ b/net/tls/tls_sw.c
>> @@ -1845,13 +1845,10 @@ tls_read_flush_backlog(struct sock *sk, struct tls_prot_info *prot,
>>  		return sk_flush_backlog(sk);
>>  }
>>  
>> -static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
>> -			      bool nonblock)
>> +static int tls_rx_reader_acquire(struct sock *sk, struct tls_sw_context_rx *ctx,
>> +				 bool nonblock)
>>  {
>>  	long timeo;
>> -	int err;
>> -
>> -	lock_sock(sk);
>>  
>>  	timeo = sock_rcvtimeo(sk, nonblock);
>>  
>> @@ -1865,26 +1862,30 @@ static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
>>  					      !READ_ONCE(ctx->reader_present), &wait);
>>  		remove_wait_queue(&ctx->wq, &wait);
>>  
>> -		if (timeo <= 0) {
>> -			err = -EAGAIN;
>> -			goto err_unlock;
>> -		}
>> -		if (signal_pending(current)) {
>> -			err = sock_intr_errno(timeo);
>> -			goto err_unlock;
>> -		}
>> +		if (timeo <= 0)
>> +			return -EAGAIN;
>> +		if (signal_pending(current))
>> +			return sock_intr_errno(timeo);
>>  	}
>>  
>>  	WRITE_ONCE(ctx->reader_present, 1);
>>  
>>  	return 0;
>> +}
>>  
>> -err_unlock:
>> -	release_sock(sk);
>> +static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
>> +			      bool nonblock)
>> +{
>> +	int err;
>> +
>> +	lock_sock(sk);
>> +	err = tls_rx_reader_acquire(sk, ctx, nonblock);
>> +	if (err)
>> +		release_sock(sk);
>>  	return err;
>>  }
>>  
>> -static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
>> +static void tls_rx_reader_release(struct sock *sk, struct tls_sw_context_rx *ctx)
>>  {
>>  	if (unlikely(ctx->reader_contended)) {
>>  		if (wq_has_sleeper(&ctx->wq))
>> @@ -1896,6 +1897,11 @@ static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
>>  	}
>>  
>>  	WRITE_ONCE(ctx->reader_present, 0);
>> +}
>> +
>> +static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
>> +{
>> +	tls_rx_reader_release(sk, ctx);
>>  	release_sock(sk);
>>  }
>> --
>>
>> Then read_sock can just acquire/release.
>
> Good suggestion.
> Will be including it in the next round.
Maybe more appropriate helper names would be
tls_rx_reader_enter / tls_rx_reader_exit.
Whatever Jakub prefers...
* Re: [PATCH 4/4] net/tls: implement ->read_sock()
2023-06-21 9:49 ` Sagi Grimberg
@ 2023-06-21 19:31 ` Jakub Kicinski
0 siblings, 0 replies; 19+ messages in thread
From: Jakub Kicinski @ 2023-06-21 19:31 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Hannes Reinecke, Christoph Hellwig, Keith Busch, linux-nvme,
Eric Dumazet, Paolo Abeni, netdev, Boris Pismenny
On Wed, 21 Jun 2023 12:49:21 +0300 Sagi Grimberg wrote:
> > Good suggestion.
> > Will be including it in the next round.
>
> Maybe more appropriate helper names would be
> tls_rx_reader_enter / tls_rx_reader_exit.
>
> Whatever Jakub prefers...
I was thinking along the same lines but with __ in front of the names
of the factored-out code. Your naming as suggested in the diff is
better.
end of thread, other threads:[~2023-06-21 19:31 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-20 10:28 [PATCHv5 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
2023-06-20 10:28 ` [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow Hannes Reinecke
2023-06-20 10:28 ` [PATCH 2/4] net/tls: handle MSG_EOR for tls_device " Hannes Reinecke
2023-06-20 17:12 ` Sabrina Dubroca
2023-06-21 6:09 ` Hannes Reinecke
2023-06-20 10:28 ` [PATCH 3/4] selftests/net/tls: add test for MSG_EOR Hannes Reinecke
2023-06-20 10:28 ` [PATCH 4/4] net/tls: implement ->read_sock() Hannes Reinecke
2023-06-20 13:21 ` Sagi Grimberg
2023-06-20 17:08 ` Jakub Kicinski
2023-06-21 6:44 ` Hannes Reinecke
2023-06-21 8:39 ` Sagi Grimberg
2023-06-21 9:08 ` Hannes Reinecke
2023-06-21 9:49 ` Sagi Grimberg
2023-06-21 19:31 ` Jakub Kicinski
-- strict thread matches above, loose matches on Subject: below --
2023-06-14 6:22 [PATCHv4 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
2023-06-14 6:22 ` [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow Hannes Reinecke
2023-06-17 6:26 ` Jakub Kicinski
2023-06-12 14:38 [PATCHv3 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
2023-06-12 14:38 ` [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow Hannes Reinecke
2023-06-09 12:51 [PATCHv2 0/4] net/tls: fixes for NVMe-over-TLS Hannes Reinecke
2023-06-09 12:51 ` [PATCH 1/4] net/tls: handle MSG_EOR for tls_sw TX flow Hannes Reinecke
2023-06-09 16:37 ` Sabrina Dubroca