public inbox for linux-nvme@lists.infradead.org
* [PATCH 0/1] nvme-tcp: fence TCP socket on transport error
@ 2023-03-20 22:09 Chris Leech
  2023-03-20 22:09 ` [PATCH 1/1] " Chris Leech
  2023-03-21 16:30 ` [PATCH v2 1/1] nvme-tcp: fence TCP socket on receive error Chris Leech
  0 siblings, 2 replies; 8+ messages in thread
From: Chris Leech @ 2023-03-20 22:09 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme; +Cc: John Meneghini

Even after commit 160f3549a907 ("nvme-tcp: fix UAF when detecting digest
errors"), I don't think the queue->rd_enabled flag is fencing the TCP socket
from further receive processing.

io_work can be re-queued while running, and if it's pending when a receive
error occurs it will bypass the queue->rd_enabled checks that should prevent
queueing in nvme_tcp_data_ready.  Actually, it looks like nvme_tcp_write_space
and nvme_tcp_queue_request would schedule io_work anyway, which will always
call nvme_tcp_try_recv once regardless of rd_enabled.  And nvme_tcp_poll has no
checks against rd_enabled.

After receiving an unsupported PDU, the header has been read but the payload
remains in the socket queue.  And the nvme_tcp queue state looks like it's
ready to receive the payload of a C2H data PDU.  Any additional calls to
nvme_tcp_try_recv can incorrectly interpret the next bits as a command ID,
look up a request from the tagset using this bogus ID, and start copying the
payload data from the unsupported PDU to an invalid destination address.

This has been seen with a buggy target that sent extraneous bytes in the TCP
stream, but I believe also with a properly functioning target that sent a
Controller to Host Terminate Connection (C2HTermReq).  The Fatal Error Status
field was used as a Command ID and brought the host system down.

An additional check against queue->rd_enabled at the start of nvme_tcp_recv_skb
should protect against both additional io_work scheduling and nvme_tcp_poll use
after a receive transport error.

- Chris

Chris Leech (1):
  nvme-tcp: fence TCP socket on receive error

 drivers/nvme/host/tcp.c | 7 +++++++
 1 file changed, 7 insertions(+)

-- 
2.39.2




* [PATCH 1/1] nvme-tcp: fence TCP socket on transport error
  2023-03-20 22:09 [PATCH 0/1] nvme-tcp: fence TCP socket on transport error Chris Leech
@ 2023-03-20 22:09 ` Chris Leech
  2023-03-21  8:30   ` Sagi Grimberg
  2023-03-21 16:30 ` [PATCH v2 1/1] nvme-tcp: fence TCP socket on receive error Chris Leech
  1 sibling, 1 reply; 8+ messages in thread
From: Chris Leech @ 2023-03-20 22:09 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme; +Cc: John Meneghini

Ensure that no further socket reads occur after a receive processing
error, either from io_work being re-scheduled or nvme_tcp_poll.

Failing to do so can result in unrecognised PDU payloads or TCP stream
garbage being processed as a C2H data PDU, and potentially start copying
the payload to an invalid destination after looking up a request using a
bogus command id.

Signed-off-by: Chris Leech <cleech@redhat.com>
---
 drivers/nvme/host/tcp.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 42c0598c31f2..49e8eb576527 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -888,6 +888,13 @@ static int nvme_tcp_recv_skb(read_descriptor_t *desc, struct sk_buff *skb,
 	size_t consumed = len;
 	int result;
 
+	if (!queue->rd_enabled) {
+		/* io_work or polling happening after receive error
+		 * waiting on error recovery
+		 */
+		return -EFAULT;
+	}
+
 	while (len) {
 		switch (nvme_tcp_recv_state(queue)) {
 		case NVME_TCP_RECV_PDU:
-- 
2.39.2




* Re: [PATCH 1/1] nvme-tcp: fence TCP socket on transport error
  2023-03-20 22:09 ` [PATCH 1/1] " Chris Leech
@ 2023-03-21  8:30   ` Sagi Grimberg
  2023-03-21 16:30     ` Chris Leech
  0 siblings, 1 reply; 8+ messages in thread
From: Sagi Grimberg @ 2023-03-21  8:30 UTC (permalink / raw)
  To: Chris Leech, linux-nvme; +Cc: John Meneghini

Hey Chris,

> Ensure that no further socket reads occur after a receive processing
> error, either from io_work being re-scheduled or nvme_tcp_poll.
> 
> Failing to do so can result in unrecognised PDU payloads or TCP stream
> garbage being processed as a C2H data PDU, and potentially start copying
> the payload to an invalid destination after looking up a request using a
> bogus command id.

I agree with your analysis.

> 
> Signed-off-by: Chris Leech <cleech@redhat.com>
> ---
>   drivers/nvme/host/tcp.c | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 42c0598c31f2..49e8eb576527 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -888,6 +888,13 @@ static int nvme_tcp_recv_skb(read_descriptor_t *desc, struct sk_buff *skb,
>   	size_t consumed = len;
>   	int result;
>   
> +	if (!queue->rd_enabled) {
> +		/* io_work or polling happening after receive error
> +		 * waiting on error recovery
> +		 */
> +		return -EFAULT;
> +	}

I think we can drop the comment; the code is somewhat self-explanatory:
if read is not enabled, we shouldn't try to read from the socket.

	if (!queue->rd_enabled)
		return -EFAULT;



* Re: [PATCH 1/1] nvme-tcp: fence TCP socket on transport error
  2023-03-21  8:30   ` Sagi Grimberg
@ 2023-03-21 16:30     ` Chris Leech
  0 siblings, 0 replies; 8+ messages in thread
From: Chris Leech @ 2023-03-21 16:30 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-nvme, John Meneghini

On Tue, Mar 21, 2023 at 1:30 AM Sagi Grimberg <sagi@grimberg.me> wrote:
> > +     if (!queue->rd_enabled) {
> > +             /* io_work or polling happening after receive error
> > +              * waiting on error recovery
> > +              */
> > +             return -EFAULT;
> > +     }
>
> I think we can drop the comment, the code is somewhat self-explanatory,
> if read is not enabled, we shouldn't try and read from the socket.
>
>         if (!queue->rd_enabled)
>                 return -EFAULT;

Sure, and as it's an error check in the data path I'll add an unlikely hint.

- Chris




* [PATCH v2 1/1] nvme-tcp: fence TCP socket on receive error
  2023-03-20 22:09 [PATCH 0/1] nvme-tcp: fence TCP socket on transport error Chris Leech
  2023-03-20 22:09 ` [PATCH 1/1] " Chris Leech
@ 2023-03-21 16:30 ` Chris Leech
  2023-03-21 20:19   ` John Meneghini
                     ` (2 more replies)
  1 sibling, 3 replies; 8+ messages in thread
From: Chris Leech @ 2023-03-21 16:30 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme; +Cc: John Meneghini

Ensure that no further socket reads occur after a receive processing
error, either from io_work being re-scheduled or nvme_tcp_poll.

Failing to do so can result in unrecognised PDU payloads or TCP stream
garbage being processed as a C2H data PDU, and potentially start copying
the payload to an invalid destination after looking up a request using a
bogus command id.

Signed-off-by: Chris Leech <cleech@redhat.com>
---
 drivers/nvme/host/tcp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 42c0598c31f2..99ad715210af 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -888,6 +888,9 @@ static int nvme_tcp_recv_skb(read_descriptor_t *desc, struct sk_buff *skb,
 	size_t consumed = len;
 	int result;
 
+	if (unlikely(!queue->rd_enabled))
+		return -EFAULT;
+
 	while (len) {
 		switch (nvme_tcp_recv_state(queue)) {
 		case NVME_TCP_RECV_PDU:
-- 
2.39.2




* Re: [PATCH v2 1/1] nvme-tcp: fence TCP socket on receive error
  2023-03-21 16:30 ` [PATCH v2 1/1] nvme-tcp: fence TCP socket on receive error Chris Leech
@ 2023-03-21 20:19   ` John Meneghini
  2023-03-22  7:13   ` Sagi Grimberg
  2023-04-05 15:21   ` Christoph Hellwig
  2 siblings, 0 replies; 8+ messages in thread
From: John Meneghini @ 2023-03-21 20:19 UTC (permalink / raw)
  To: Chris Leech, Sagi Grimberg, linux-nvme

Reviewed-by: John Meneghini <jmeneghi@redhat.com>

On 3/21/23 12:30, Chris Leech wrote:
> Ensure that no further socket reads occur after a receive processing
> error, either from io_work being re-scheduled or nvme_tcp_poll.
> 
> Failing to do so can result in unrecognised PDU payloads or TCP stream
> garbage being processed as a C2H data PDU, and potentially start copying
> the payload to an invalid destination after looking up a request using a
> bogus command id.
> 
> Signed-off-by: Chris Leech <cleech@redhat.com>
> ---
>   drivers/nvme/host/tcp.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 42c0598c31f2..99ad715210af 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -888,6 +888,9 @@ static int nvme_tcp_recv_skb(read_descriptor_t *desc, struct sk_buff *skb,
>   	size_t consumed = len;
>   	int result;
>   
> +	if (unlikely(!queue->rd_enabled))
> +		return -EFAULT;
> +
>   	while (len) {
>   		switch (nvme_tcp_recv_state(queue)) {
>   		case NVME_TCP_RECV_PDU:




* Re: [PATCH v2 1/1] nvme-tcp: fence TCP socket on receive error
  2023-03-21 16:30 ` [PATCH v2 1/1] nvme-tcp: fence TCP socket on receive error Chris Leech
  2023-03-21 20:19   ` John Meneghini
@ 2023-03-22  7:13   ` Sagi Grimberg
  2023-04-05 15:21   ` Christoph Hellwig
  2 siblings, 0 replies; 8+ messages in thread
From: Sagi Grimberg @ 2023-03-22  7:13 UTC (permalink / raw)
  To: Chris Leech, linux-nvme; +Cc: John Meneghini

Thanks Chris,

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>



* Re: [PATCH v2 1/1] nvme-tcp: fence TCP socket on receive error
  2023-03-21 16:30 ` [PATCH v2 1/1] nvme-tcp: fence TCP socket on receive error Chris Leech
  2023-03-21 20:19   ` John Meneghini
  2023-03-22  7:13   ` Sagi Grimberg
@ 2023-04-05 15:21   ` Christoph Hellwig
  2 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2023-04-05 15:21 UTC (permalink / raw)
  To: Chris Leech; +Cc: Sagi Grimberg, linux-nvme, John Meneghini

Thanks,

applied to nvme-6.4.

Let me know if you really need this in 6.3, but it seems like we're
pretty late in the cycle and this is a rather old bug.

Nit: a Fixes tag would have been nice.


