qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal
@ 2024-08-02 19:26 Eric Blake
  2024-08-02 19:26 ` [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public Eric Blake
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Eric Blake @ 2024-08-02 19:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, rjones

I've requested a CVE from Red Hat, and hope to have an assigned number
soon.  Meanwhile, we can get review started, to make sure this is
ready to include in 9.1.  'qemu-img info' should never print untrusted
data in a way that might take over a user's terminal.

There are probably other spots where qemu-img info is printing
untrusted data (such as filenames), where we probably ought to use the
same sanitization tactics as shown here.  Identifying those spots
would be a useful part of this review, and may mean a v2 that is even
more extensive in the number of patches.

In patch 1, I created mod_utf8_sanitize_len(), with the intent that I
could sanitize just a prefix of a string without having to copy it
into a NUL-terminated buffer.  I didn't end up needing it in my
current version of patch 2 (since the code was already copying to a
NUL-terminated buffer for trace purposes), but we may find uses for
it; in fact, it raises the question of whether any of our trace_ calls
need to sanitize untrusted data (or whether we can rely on ALL trace
engines to be doing that on our behalf, already).

Eric Blake (2):
  util: Refactor json-writer's string sanitizer to be public
  qemu-img: CVE-XXX Sanitize untrusted output from NBD server

 include/qemu/unicode.h |  3 ++
 nbd/client.c           |  5 ++-
 qobject/json-writer.c  | 47 +----------------------
 util/unicode.c         | 84 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 92 insertions(+), 47 deletions(-)

-- 
2.45.2



^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public
  2024-08-02 19:26 [PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal Eric Blake
@ 2024-08-02 19:26 ` Eric Blake
  2024-08-02 21:00   ` Richard W.M. Jones
                     ` (3 more replies)
  2024-08-02 19:26 ` [PATCH 2/2] qemu-img: CVE-XXX Sanitize untrusted output from NBD server Eric Blake
  2024-08-05 18:48 ` [PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal Eric Blake
  2 siblings, 4 replies; 18+ messages in thread
From: Eric Blake @ 2024-08-02 19:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, rjones, Markus Armbruster

My next patch needs to convert text from an untrusted input into an
output representation that is suitable for display on a terminal is
useful to more than just the json-writer; the text should normally be
UTF-8, but blindly allowing all Unicode code points (including ASCII
ESC) through to a terminal risks remote-code-execution attacks on some
terminals.  Extract the existing body of json-writer's quoted_strinto
a new helper routine mod_utf8_sanitize, and generalize it to also work
on data that is length-limited rather than NUL-terminated.  [I was
actually surprised that glib does not have such a sanitizer already -
Google turns up lots of examples of rolling your own string
sanitizer.]

If desired in the future, we may want to tweak whether the output is
guaranteed to be ASCII (using lots of \u escape sequences, including
surrogate pairs for code points outside the BMP) or if we are okay
passing printable Unicode through (we still need to escape control
characters).  But for now, I went for minimal code churn, including
the fact that the resulting function allows a non-UTF-8 2-byte synonym
for U+0000.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 include/qemu/unicode.h |  3 ++
 qobject/json-writer.c  | 47 +----------------------
 util/unicode.c         | 84 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 88 insertions(+), 46 deletions(-)

diff --git a/include/qemu/unicode.h b/include/qemu/unicode.h
index 7fa10b8e604..e1013b29f72 100644
--- a/include/qemu/unicode.h
+++ b/include/qemu/unicode.h
@@ -4,4 +4,7 @@
 int mod_utf8_codepoint(const char *s, size_t n, char **end);
 ssize_t mod_utf8_encode(char buf[], size_t bufsz, int codepoint);

+void mod_utf8_sanitize(GString *buf, const char *str);
+void mod_utf8_sanitize_len(GString *buf, const char *str, ssize_t len);
+
 #endif
diff --git a/qobject/json-writer.c b/qobject/json-writer.c
index 309a31d57a9..5c14574efee 100644
--- a/qobject/json-writer.c
+++ b/qobject/json-writer.c
@@ -104,53 +104,8 @@ static void pretty_newline_or_space(JSONWriter *writer)

 static void quoted_str(JSONWriter *writer, const char *str)
 {
-    const char *ptr;
-    char *end;
-    int cp;
-
     g_string_append_c(writer->contents, '"');
-
-    for (ptr = str; *ptr; ptr = end) {
-        cp = mod_utf8_codepoint(ptr, 6, &end);
-        switch (cp) {
-        case '\"':
-            g_string_append(writer->contents, "\\\"");
-            break;
-        case '\\':
-            g_string_append(writer->contents, "\\\\");
-            break;
-        case '\b':
-            g_string_append(writer->contents, "\\b");
-            break;
-        case '\f':
-            g_string_append(writer->contents, "\\f");
-            break;
-        case '\n':
-            g_string_append(writer->contents, "\\n");
-            break;
-        case '\r':
-            g_string_append(writer->contents, "\\r");
-            break;
-        case '\t':
-            g_string_append(writer->contents, "\\t");
-            break;
-        default:
-            if (cp < 0) {
-                cp = 0xFFFD; /* replacement character */
-            }
-            if (cp > 0xFFFF) {
-                /* beyond BMP; need a surrogate pair */
-                g_string_append_printf(writer->contents, "\\u%04X\\u%04X",
-                                       0xD800 + ((cp - 0x10000) >> 10),
-                                       0xDC00 + ((cp - 0x10000) & 0x3FF));
-            } else if (cp < 0x20 || cp >= 0x7F) {
-                g_string_append_printf(writer->contents, "\\u%04X", cp);
-            } else {
-                g_string_append_c(writer->contents, cp);
-            }
-        }
-    };
-
+    mod_utf8_sanitize(writer->contents, str);
     g_string_append_c(writer->contents, '"');
 }

diff --git a/util/unicode.c b/util/unicode.c
index 8580bc598b3..a419ed4de35 100644
--- a/util/unicode.c
+++ b/util/unicode.c
@@ -154,3 +154,87 @@ ssize_t mod_utf8_encode(char buf[], size_t bufsz, int codepoint)
     buf[4] = 0;
     return 4;
 }
+
+/**
+ * mod_utf8_sanitize:
+ * @buf: Destination buffer
+ * @str: Modified UTF-8 string to sanitize
+ *
+ * Append the contents of the NUL-terminated Modified UTF-8 string
+ * @str into @buf, with escape sequences used for non-printable ASCII
+ * and for quote characters.  The result is therefore safe for output
+ * to a terminal.
+ *
+ * Modified UTF-8 is exactly like UTF-8, except U+0000 is encoded as
+ * "\xC0\x80".
+ */
+void mod_utf8_sanitize(GString *buf, const char *str)
+{
+    mod_utf8_sanitize_len(buf, str, -1);
+}
+
+/**
+ * mod_utf8_sanitize:
+ * @buf: Destination buffer
+ * @str: Modified UTF-8 string to sanitize
+ * @len: Length of @str, or negative to stop at NUL terminator
+ *
+ * Append the contents of @len bytes of the Modified UTF-8 string
+ * @str into @buf, with escape sequences used for non-printable ASCII
+ * and for quote characters.  The result is therefore safe for output
+ * to a terminal.
+ *
+ * Modified UTF-8 is exactly like UTF-8, except U+0000 is encoded as
+ * "\xC0\x80".
+ */
+void mod_utf8_sanitize_len(GString *buf, const char *str, ssize_t len)
+{
+    const char *ptr;
+    char *end;
+    int cp;
+
+    if (len < 0) {
+        len = strlen(str);
+    }
+
+    for (ptr = str; *ptr; ptr = end) {
+        cp = mod_utf8_codepoint(ptr, MIN(6, str + len - ptr), &end);
+        switch (cp) {
+        case '\"':
+            g_string_append(buf, "\\\"");
+            break;
+        case '\\':
+            g_string_append(buf, "\\\\");
+            break;
+        case '\b':
+            g_string_append(buf, "\\b");
+            break;
+        case '\f':
+            g_string_append(buf, "\\f");
+            break;
+        case '\n':
+            g_string_append(buf, "\\n");
+            break;
+        case '\r':
+            g_string_append(buf, "\\r");
+            break;
+        case '\t':
+            g_string_append(buf, "\\t");
+            break;
+        default:
+            if (cp < 0) {
+                cp = 0xFFFD; /* replacement character */
+            }
+            if (cp > 0xFFFF) {
+                /* beyond BMP; need a surrogate pair */
+                g_string_append_printf(buf, "\\u%04X\\u%04X",
+                                       0xD800 + ((cp - 0x10000) >> 10),
+                                       0xDC00 + ((cp - 0x10000) & 0x3FF));
+            } else if (cp < 0x20 || cp >= 0x7F) {
+                g_string_append_printf(buf, "\\u%04X", cp);
+            } else {
+                g_string_append_c(buf, cp);
+            }
+        }
+    }
+}
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/2] qemu-img: CVE-XXX Sanitize untrusted output from NBD server
  2024-08-02 19:26 [PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal Eric Blake
  2024-08-02 19:26 ` [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public Eric Blake
@ 2024-08-02 19:26 ` Eric Blake
  2024-08-02 21:03   ` Richard W.M. Jones
                     ` (2 more replies)
  2024-08-05 18:48 ` [PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal Eric Blake
  2 siblings, 3 replies; 18+ messages in thread
From: Eric Blake @ 2024-08-02 19:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, rjones, Vladimir Sementsov-Ogievskiy

Error messages from an NBD server must be treated as untrusted; a
malicious server can inject escape sequences to try and trigger RCE
flaws via escape sequences to whatever terminal happens to be running
qemu-img.  The easiest solution is to sanitize the output with the
same code we use to produce sanitized (pseudo-)JSON over QMP.

Rich Jones originally pointed this flaw out at:
https://lists.libguestfs.org/archives/list/guestfs@lists.libguestfs.org/thread/2NXA23G2V3HPWJYAO726PLNBEAAEUJAU/

With this patch, and a malicious server run with nbdkit 1.40 as:

$ nbdkit --log=null eval open=' printf \
  "EPERM x\\r mess up the output \e[31mmess up the output\e[m mess up" >&2; \
  exit 1 ' get_size=' echo 0 ' --run 'qemu-img info "$uri"'

we now get:

qemu-img: Could not open 'nbd://localhost': Requested export not available
server reported: /tmp/nbdkitOZHOKB/open: x\r mess up the output \u001B[31mmess up the output\u001B[m mess up

instead of an attempt to hide the name of the Unix socket and forcing
the terminal to render part of the text red.

Note that I did _not_ sanitize the string being sent through
trace-events in trace_nbd_server_error_msg; this is because I assume
that our trace engines already treat all string strings as untrusted
input and apply their own escaping as needed.

Reported-by: "Richard W.M. Jones" <rjones@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

---

If my assumption about allowing raw escape bytes through to trace_
calls is wrong (such as when tracing to stderr), let me know.  That's
a much bigger audit to determine which trace points, if any, should
sanitize data before tracing, and/or change the trace engines to
sanitize all strings (with possible knock-on effects if trace output
changes unexpectedly for a tool expecting something unsanitized).
---
 nbd/client.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/nbd/client.c b/nbd/client.c
index c89c7504673..baa20d10d69 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -23,6 +23,7 @@
 #include "trace.h"
 #include "nbd-internal.h"
 #include "qemu/cutils.h"
+#include "qemu/unicode.h"

 /* Definitions for opaque data types */

@@ -230,7 +231,9 @@ static int nbd_handle_reply_err(QIOChannel *ioc, NBDOptionReply *reply,
     }

     if (msg) {
-        error_append_hint(errp, "server reported: %s\n", msg);
+        g_autoptr(GString) buf = g_string_sized_new(reply->length);
+        mod_utf8_sanitize(buf, msg);
+        error_append_hint(errp, "server reported: %s\n", buf->str);
     }

  err:
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public
  2024-08-02 19:26 ` [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public Eric Blake
@ 2024-08-02 21:00   ` Richard W.M. Jones
  2024-08-02 21:38   ` Philippe Mathieu-Daudé
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Richard W.M. Jones @ 2024-08-02 21:00 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, Markus Armbruster

On Fri, Aug 02, 2024 at 02:26:05PM -0500, Eric Blake wrote:
> My next patch needs to convert text from an untrusted input into an
> output representation that is suitable for display on a terminal is
> useful to more than just the json-writer; the text should normally be
> UTF-8, but blindly allowing all Unicode code points (including ASCII
> ESC) through to a terminal risks remote-code-execution attacks on some
> terminals.  Extract the existing body of json-writer's quoted_strinto
> a new helper routine mod_utf8_sanitize, and generalize it to also work
> on data that is length-limited rather than NUL-terminated.  [I was
> actually surprised that glib does not have such a sanitizer already -
> Google turns up lots of examples of rolling your own string
> sanitizer.]
> 
> If desired in the future, we may want to tweak whether the output is
> guaranteed to be ASCII (using lots of \u escape sequences, including
> surrogate pairs for code points outside the BMP) or if we are okay
> passing printable Unicode through (we still need to escape control
> characters).  But for now, I went for minimal code churn, including
> the fact that the resulting function allows a non-UTF-8 2-byte synonym
> for U+0000.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

I have to admit I'd never heard of "modified UTF-8" before, but it's a
thing:

https://en.wikipedia.org/wiki/UTF-8#Modified_UTF-8

As the patch is almost a simple code motion:

Reviewed-by: Richard W.M. Jones <rjones@redhat.com>

Rich.

> ---
>  include/qemu/unicode.h |  3 ++
>  qobject/json-writer.c  | 47 +----------------------
>  util/unicode.c         | 84 ++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 88 insertions(+), 46 deletions(-)
> 
> diff --git a/include/qemu/unicode.h b/include/qemu/unicode.h
> index 7fa10b8e604..e1013b29f72 100644
> --- a/include/qemu/unicode.h
> +++ b/include/qemu/unicode.h
> @@ -4,4 +4,7 @@
>  int mod_utf8_codepoint(const char *s, size_t n, char **end);
>  ssize_t mod_utf8_encode(char buf[], size_t bufsz, int codepoint);
> 
> +void mod_utf8_sanitize(GString *buf, const char *str);
> +void mod_utf8_sanitize_len(GString *buf, const char *str, ssize_t len);
> +
>  #endif
> diff --git a/qobject/json-writer.c b/qobject/json-writer.c
> index 309a31d57a9..5c14574efee 100644
> --- a/qobject/json-writer.c
> +++ b/qobject/json-writer.c
> @@ -104,53 +104,8 @@ static void pretty_newline_or_space(JSONWriter *writer)
> 
>  static void quoted_str(JSONWriter *writer, const char *str)
>  {
> -    const char *ptr;
> -    char *end;
> -    int cp;
> -
>      g_string_append_c(writer->contents, '"');
> -
> -    for (ptr = str; *ptr; ptr = end) {
> -        cp = mod_utf8_codepoint(ptr, 6, &end);
> -        switch (cp) {
> -        case '\"':
> -            g_string_append(writer->contents, "\\\"");
> -            break;
> -        case '\\':
> -            g_string_append(writer->contents, "\\\\");
> -            break;
> -        case '\b':
> -            g_string_append(writer->contents, "\\b");
> -            break;
> -        case '\f':
> -            g_string_append(writer->contents, "\\f");
> -            break;
> -        case '\n':
> -            g_string_append(writer->contents, "\\n");
> -            break;
> -        case '\r':
> -            g_string_append(writer->contents, "\\r");
> -            break;
> -        case '\t':
> -            g_string_append(writer->contents, "\\t");
> -            break;
> -        default:
> -            if (cp < 0) {
> -                cp = 0xFFFD; /* replacement character */
> -            }
> -            if (cp > 0xFFFF) {
> -                /* beyond BMP; need a surrogate pair */
> -                g_string_append_printf(writer->contents, "\\u%04X\\u%04X",
> -                                       0xD800 + ((cp - 0x10000) >> 10),
> -                                       0xDC00 + ((cp - 0x10000) & 0x3FF));
> -            } else if (cp < 0x20 || cp >= 0x7F) {
> -                g_string_append_printf(writer->contents, "\\u%04X", cp);
> -            } else {
> -                g_string_append_c(writer->contents, cp);
> -            }
> -        }
> -    };
> -
> +    mod_utf8_sanitize(writer->contents, str);
>      g_string_append_c(writer->contents, '"');
>  }
> 
> diff --git a/util/unicode.c b/util/unicode.c
> index 8580bc598b3..a419ed4de35 100644
> --- a/util/unicode.c
> +++ b/util/unicode.c
> @@ -154,3 +154,87 @@ ssize_t mod_utf8_encode(char buf[], size_t bufsz, int codepoint)
>      buf[4] = 0;
>      return 4;
>  }
> +
> +/**
> + * mod_utf8_sanitize:
> + * @buf: Destination buffer
> + * @str: Modified UTF-8 string to sanitize
> + *
> + * Append the contents of the NUL-terminated Modified UTF-8 string
> + * @str into @buf, with escape sequences used for non-printable ASCII
> + * and for quote characters.  The result is therefore safe for output
> + * to a terminal.
> + *
> + * Modified UTF-8 is exactly like UTF-8, except U+0000 is encoded as
> + * "\xC0\x80".
> + */
> +void mod_utf8_sanitize(GString *buf, const char *str)
> +{
> +    mod_utf8_sanitize_len(buf, str, -1);
> +}
> +
> +/**
> + * mod_utf8_sanitize:
> + * @buf: Destination buffer
> + * @str: Modified UTF-8 string to sanitize
> + * @len: Length of @str, or negative to stop at NUL terminator
> + *
> + * Append the contents of @len bytes of the Modified UTF-8 string
> + * @str into @buf, with escape sequences used for non-printable ASCII
> + * and for quote characters.  The result is therefore safe for output
> + * to a terminal.
> + *
> + * Modified UTF-8 is exactly like UTF-8, except U+0000 is encoded as
> + * "\xC0\x80".
> + */
> +void mod_utf8_sanitize_len(GString *buf, const char *str, ssize_t len)
> +{
> +    const char *ptr;
> +    char *end;
> +    int cp;
> +
> +    if (len < 0) {
> +        len = strlen(str);
> +    }
> +
> +    for (ptr = str; *ptr; ptr = end) {
> +        cp = mod_utf8_codepoint(ptr, MIN(6, str + len - ptr), &end);
> +        switch (cp) {
> +        case '\"':
> +            g_string_append(buf, "\\\"");
> +            break;
> +        case '\\':
> +            g_string_append(buf, "\\\\");
> +            break;
> +        case '\b':
> +            g_string_append(buf, "\\b");
> +            break;
> +        case '\f':
> +            g_string_append(buf, "\\f");
> +            break;
> +        case '\n':
> +            g_string_append(buf, "\\n");
> +            break;
> +        case '\r':
> +            g_string_append(buf, "\\r");
> +            break;
> +        case '\t':
> +            g_string_append(buf, "\\t");
> +            break;
> +        default:
> +            if (cp < 0) {
> +                cp = 0xFFFD; /* replacement character */
> +            }
> +            if (cp > 0xFFFF) {
> +                /* beyond BMP; need a surrogate pair */
> +                g_string_append_printf(buf, "\\u%04X\\u%04X",
> +                                       0xD800 + ((cp - 0x10000) >> 10),
> +                                       0xDC00 + ((cp - 0x10000) & 0x3FF));
> +            } else if (cp < 0x20 || cp >= 0x7F) {
> +                g_string_append_printf(buf, "\\u%04X", cp);
> +            } else {
> +                g_string_append_c(buf, cp);
> +            }
> +        }
> +    }
> +}
> -- 
> 2.45.2

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] qemu-img: CVE-XXX Sanitize untrusted output from NBD server
  2024-08-02 19:26 ` [PATCH 2/2] qemu-img: CVE-XXX Sanitize untrusted output from NBD server Eric Blake
@ 2024-08-02 21:03   ` Richard W.M. Jones
  2024-08-07 18:45     ` Daniel P. Berrangé
  2024-08-02 21:41   ` Philippe Mathieu-Daudé
  2024-08-02 22:01   ` Richard W.M. Jones
  2 siblings, 1 reply; 18+ messages in thread
From: Richard W.M. Jones @ 2024-08-02 21:03 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, Vladimir Sementsov-Ogievskiy

On Fri, Aug 02, 2024 at 02:26:06PM -0500, Eric Blake wrote:
> Error messages from an NBD server must be treated as untrusted; a
> malicious server can inject escape sequences to try and trigger RCE
> flaws via escape sequences to whatever terminal happens to be running
> qemu-img.  The easiest solution is to sanitize the output with the
> same code we use to produce sanitized (pseudo-)JSON over QMP.
> 
> Rich Jones originally pointed this flaw out at:
> https://lists.libguestfs.org/archives/list/guestfs@lists.libguestfs.org/thread/2NXA23G2V3HPWJYAO726PLNBEAAEUJAU/
> 
> With this patch, and a malicious server run with nbdkit 1.40 as:
> 
> $ nbdkit --log=null eval open=' printf \
>   "EPERM x\\r mess up the output \e[31mmess up the output\e[m mess up" >&2; \
>   exit 1 ' get_size=' echo 0 ' --run 'qemu-img info "$uri"'
> 
> we now get:
> 
> qemu-img: Could not open 'nbd://localhost': Requested export not available
> server reported: /tmp/nbdkitOZHOKB/open: x\r mess up the output \u001B[31mmess up the output\u001B[m mess up
> 
> instead of an attempt to hide the name of the Unix socket and forcing
> the terminal to render part of the text red.
> 
> Note that I did _not_ sanitize the string being sent through
> trace-events in trace_nbd_server_error_msg; this is because I assume
> that our trace engines already treat all string strings as untrusted
> input and apply their own escaping as needed.
> 
> Reported-by: "Richard W.M. Jones" <rjones@redhat.com>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> 
> If my assumption about allowing raw escape bytes through to trace_
> calls is wrong (such as when tracing to stderr), let me know.  That's
> a much bigger audit to determine which trace points, if any, should
> sanitize data before tracing, and/or change the trace engines to
> sanitize all strings (with possible knock-on effects if trace output
> changes unexpectedly for a tool expecting something unsanitized).
> ---
>  nbd/client.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/nbd/client.c b/nbd/client.c
> index c89c7504673..baa20d10d69 100644
> --- a/nbd/client.c
> +++ b/nbd/client.c
> @@ -23,6 +23,7 @@
>  #include "trace.h"
>  #include "nbd-internal.h"
>  #include "qemu/cutils.h"
> +#include "qemu/unicode.h"
> 
>  /* Definitions for opaque data types */
> 
> @@ -230,7 +231,9 @@ static int nbd_handle_reply_err(QIOChannel *ioc, NBDOptionReply *reply,
>      }
> 
>      if (msg) {
> -        error_append_hint(errp, "server reported: %s\n", msg);
> +        g_autoptr(GString) buf = g_string_sized_new(reply->length);
> +        mod_utf8_sanitize(buf, msg);
> +        error_append_hint(errp, "server reported: %s\n", buf->str);
>      }

Reviewed-by: Richard W.M. Jones <rjones@redhat.com>

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins
https://gitlab.com/nbdkit/nbdkit



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public
  2024-08-02 19:26 ` [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public Eric Blake
  2024-08-02 21:00   ` Richard W.M. Jones
@ 2024-08-02 21:38   ` Philippe Mathieu-Daudé
  2024-08-07 18:49   ` Daniel P. Berrangé
  2024-08-08  7:54   ` Markus Armbruster
  3 siblings, 0 replies; 18+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-08-02 21:38 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, rjones, Markus Armbruster

On 2/8/24 21:26, Eric Blake wrote:
> My next patch needs to convert text from an untrusted input into an
> output representation that is suitable for display on a terminal is
> useful to more than just the json-writer; the text should normally be
> UTF-8, but blindly allowing all Unicode code points (including ASCII
> ESC) through to a terminal risks remote-code-execution attacks on some
> terminals.  Extract the existing body of json-writer's quoted_strinto
> a new helper routine mod_utf8_sanitize, and generalize it to also work
> on data that is length-limited rather than NUL-terminated.  [I was
> actually surprised that glib does not have such a sanitizer already -
> Google turns up lots of examples of rolling your own string
> sanitizer.]
> 
> If desired in the future, we may want to tweak whether the output is
> guaranteed to be ASCII (using lots of \u escape sequences, including
> surrogate pairs for code points outside the BMP) or if we are okay
> passing printable Unicode through (we still need to escape control
> characters).  But for now, I went for minimal code churn, including
> the fact that the resulting function allows a non-UTF-8 2-byte synonym
> for U+0000.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>   include/qemu/unicode.h |  3 ++
>   qobject/json-writer.c  | 47 +----------------------
>   util/unicode.c         | 84 ++++++++++++++++++++++++++++++++++++++++++
>   3 files changed, 88 insertions(+), 46 deletions(-)

Preferably moving the docstring help to the header,

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] qemu-img: CVE-XXX Sanitize untrusted output from NBD server
  2024-08-02 19:26 ` [PATCH 2/2] qemu-img: CVE-XXX Sanitize untrusted output from NBD server Eric Blake
  2024-08-02 21:03   ` Richard W.M. Jones
@ 2024-08-02 21:41   ` Philippe Mathieu-Daudé
  2024-08-07 13:43     ` Stefan Hajnoczi
  2024-08-02 22:01   ` Richard W.M. Jones
  2 siblings, 1 reply; 18+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-08-02 21:41 UTC (permalink / raw)
  To: Eric Blake, qemu-devel, Stefan Hajnoczi
  Cc: qemu-block, rjones, Vladimir Sementsov-Ogievskiy

On 2/8/24 21:26, Eric Blake wrote:
> Error messages from an NBD server must be treated as untrusted; a
> malicious server can inject escape sequences to try and trigger RCE
> flaws via escape sequences to whatever terminal happens to be running
> qemu-img.  The easiest solution is to sanitize the output with the
> same code we use to produce sanitized (pseudo-)JSON over QMP.
> 
> Rich Jones originally pointed this flaw out at:
> https://lists.libguestfs.org/archives/list/guestfs@lists.libguestfs.org/thread/2NXA23G2V3HPWJYAO726PLNBEAAEUJAU/
> 
> With this patch, and a malicious server run with nbdkit 1.40 as:
> 
> $ nbdkit --log=null eval open=' printf \
>    "EPERM x\\r mess up the output \e[31mmess up the output\e[m mess up" >&2; \
>    exit 1 ' get_size=' echo 0 ' --run 'qemu-img info "$uri"'
> 
> we now get:
> 
> qemu-img: Could not open 'nbd://localhost': Requested export not available
> server reported: /tmp/nbdkitOZHOKB/open: x\r mess up the output \u001B[31mmess up the output\u001B[m mess up
> 
> instead of an attempt to hide the name of the Unix socket and forcing
> the terminal to render part of the text red.
> 
> Note that I did _not_ sanitize the string being sent through
> trace-events in trace_nbd_server_error_msg; this is because I assume
> that our trace engines already treat all string strings as untrusted
> input and apply their own escaping as needed.
> 
> Reported-by: "Richard W.M. Jones" <rjones@redhat.com>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> 
> If my assumption about allowing raw escape bytes through to trace_
> calls is wrong (such as when tracing to stderr), let me know.  That's
> a much bigger audit to determine which trace points, if any, should
> sanitize data before tracing, and/or change the trace engines to
> sanitize all strings (with possible knock-on effects if trace output
> changes unexpectedly for a tool expecting something unsanitized).

I doubt the trace core layer sanitizes, but it feels it is the
trace backend responsibility, since core layer might just pass
pointer to the backends.

> ---
>   nbd/client.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] qemu-img: CVE-XXX Sanitize untrusted output from NBD server
  2024-08-02 19:26 ` [PATCH 2/2] qemu-img: CVE-XXX Sanitize untrusted output from NBD server Eric Blake
  2024-08-02 21:03   ` Richard W.M. Jones
  2024-08-02 21:41   ` Philippe Mathieu-Daudé
@ 2024-08-02 22:01   ` Richard W.M. Jones
  2024-08-03  8:20     ` Richard W.M. Jones
  2 siblings, 1 reply; 18+ messages in thread
From: Richard W.M. Jones @ 2024-08-02 22:01 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, Vladimir Sementsov-Ogievskiy

On Fri, Aug 02, 2024 at 02:26:06PM -0500, Eric Blake wrote:
> Error messages from an NBD server must be treated as untrusted; a
> malicious server can inject escape sequences to try and trigger RCE
> flaws via escape sequences to whatever terminal happens to be running
> qemu-img.

This presentation is relevant:

https://dgl.cx/2023/09/ansi-terminal-security

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] qemu-img: CVE-XXX Sanitize untrusted output from NBD server
  2024-08-02 22:01   ` Richard W.M. Jones
@ 2024-08-03  8:20     ` Richard W.M. Jones
  0 siblings, 0 replies; 18+ messages in thread
From: Richard W.M. Jones @ 2024-08-03  8:20 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, Vladimir Sementsov-Ogievskiy

On Fri, Aug 02, 2024 at 11:01:36PM +0100, Richard W.M. Jones wrote:
> On Fri, Aug 02, 2024 at 02:26:06PM -0500, Eric Blake wrote:
> > Error messages from an NBD server must be treated as untrusted; a
> > malicious server can inject escape sequences to try and trigger RCE
> > flaws via escape sequences to whatever terminal happens to be running
> > qemu-img.
> 
> This presentation is relevant:
> 
> https://dgl.cx/2023/09/ansi-terminal-security

This took way too long, but ...

$ wget http://oirase.annexia.org/tmp/nyan.c
$ nbdkit --log=null cc /tmp/nyan.c --run 'qemu-img info "$uri"'

Needs nbdkit >= 1.40, and don't worry, it doesn't exploit the terminal
except for silly internet memes.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal
  2024-08-02 19:26 [PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal Eric Blake
  2024-08-02 19:26 ` [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public Eric Blake
  2024-08-02 19:26 ` [PATCH 2/2] qemu-img: CVE-XXX Sanitize untrusted output from NBD server Eric Blake
@ 2024-08-05 18:48 ` Eric Blake
  2024-08-05 19:11   ` Richard W.M. Jones
  2 siblings, 1 reply; 18+ messages in thread
From: Eric Blake @ 2024-08-05 18:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, rjones, kwolf

On Fri, Aug 02, 2024 at 02:26:04PM GMT, Eric Blake wrote:
> I've requested a CVE from Red Hat, and hope to have an assigned number
> soon.  Meanwhile, we can get review started, to make sure this is
> ready to include in 9.1.  'qemu-img info' should never print untrusted
> data in a way that might take over a user's terminal.
> 
> There are probably other spots where qemu-img info is printing
> untrusted data (such as filenames), where we probably ought to use the
> same sanitization tactics as shown here.  Identifying those spots
> would be a useful part of this review, and may mean a v2 that is even
> more extensive in the number of patches.

In fact, should we insist that 'qemu-img info XXX' refuse to accept
any control characters on any command-line filename, and that it
reject any backing file name with control characters as a malformed
qcow2 file?  For reference, we JUST fixed qemu-img info to change
qcow2 files with a claimed backing file of json:... to favor the local
file ./json:... over the potentially dangerous user-controlled
format/protocol description, so we are _already_ changing how strict
qemu-img is for 9.1, and adding one more restriction to avoid
escape-sequence madness makes sense.

Note that with:

touch $'\e[m' && qemu-img info --output=json $'\e[m'

we already escape our output, but without --output=json, we are
vulnerable to user-controlled ESC leaking through to stdout for more
than just the NBD server errors that I addressed in v1 of this patch
series.  Hence my question on whether v2 of the series should touch
more places in the code, and whether doing something like flat-out
refusing users stupid enough to embed control characters in their
filenames is a safe change this close to release.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal
  2024-08-05 18:48 ` [PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal Eric Blake
@ 2024-08-05 19:11   ` Richard W.M. Jones
  2024-08-07 17:51     ` Eric Blake
  0 siblings, 1 reply; 18+ messages in thread
From: Richard W.M. Jones @ 2024-08-05 19:11 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, kwolf

On Mon, Aug 05, 2024 at 01:48:12PM -0500, Eric Blake wrote:
> On Fri, Aug 02, 2024 at 02:26:04PM GMT, Eric Blake wrote:
> > I've requested a CVE from Red Hat, and hope to have an assigned number
> > soon.  Meanwhile, we can get review started, to make sure this is
> > ready to include in 9.1.  'qemu-img info' should never print untrusted
> > data in a way that might take over a user's terminal.
> > 
> > There are probably other spots where qemu-img info is printing
> > untrusted data (such as filenames), where we probably ought to use the
> > same sanitization tactics as shown here.  Identifying those spots
> > would be a useful part of this review, and may mean a v2 that is even
> > more extensive in the number of patches.
> 
> In fact, should we insist that 'qemu-img info XXX' refuse to accept
> any control characters on any command-line filename, and that it
> reject any backing file name with control characters as a malformed
> qcow2 file?  For reference, we JUST fixed qemu-img info to change
> qcow2 files with a claimed backing file of json:... to favor the local
> file ./json:... over the potentially dangerous user-controlled
> format/protocol description, so we are _already_ changing how strict
> qemu-img is for 9.1, and adding one more restriction to avoid
> escape-sequence madness makes sense.
> 
> Note that with:
> 
> touch $'\e[m' && qemu-img info --output=json $'\e[m'
> 
> we already escape our output, but without --output=json, we are
> vulnerable to user-controlled ESC leaking through to stdout for more
> than just the NBD server errors that I addressed in v1 of this patch
> series.  Hence my question on whether v2 of the series should touch
> more places in the code, and whether doing something like flat-out
> refusing users stupid enough to embed control characters in their
> filenames is a safe change this close to release.

You could say if someone gives you a "malicious" text file which you
cat to stdout, it could change your terminal settings.  I don't think
therefore any of this is very serious.  We should probably fix any
obvious things, but it doesn't need to happen right before 9.1 is
released, we can take our time.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] qemu-img: CVE-XXX Sanitize untrusted output from NBD server
  2024-08-02 21:41   ` Philippe Mathieu-Daudé
@ 2024-08-07 13:43     ` Stefan Hajnoczi
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2024-08-07 13:43 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Eric Blake, qemu-devel, qemu-block, rjones,
	Vladimir Sementsov-Ogievskiy

[-- Attachment #1: Type: text/plain, Size: 2653 bytes --]

On Fri, Aug 02, 2024 at 11:41:35PM +0200, Philippe Mathieu-Daudé wrote:
> On 2/8/24 21:26, Eric Blake wrote:
> > Error messages from an NBD server must be treated as untrusted; a
> > malicious server can inject escape sequences to try and trigger RCE
> > flaws via escape sequences to whatever terminal happens to be running
> > qemu-img.  The easiest solution is to sanitize the output with the
> > same code we use to produce sanitized (pseudo-)JSON over QMP.
> > 
> > Rich Jones originally pointed this flaw out at:
> > https://lists.libguestfs.org/archives/list/guestfs@lists.libguestfs.org/thread/2NXA23G2V3HPWJYAO726PLNBEAAEUJAU/
> > 
> > With this patch, and a malicious server run with nbdkit 1.40 as:
> > 
> > $ nbdkit --log=null eval open=' printf \
> >    "EPERM x\\r mess up the output \e[31mmess up the output\e[m mess up" >&2; \
> >    exit 1 ' get_size=' echo 0 ' --run 'qemu-img info "$uri"'
> > 
> > we now get:
> > 
> > qemu-img: Could not open 'nbd://localhost': Requested export not available
> > server reported: /tmp/nbdkitOZHOKB/open: x\r mess up the output \u001B[31mmess up the output\u001B[m mess up
> > 
> > instead of an attempt to hide the name of the Unix socket and forcing
> > the terminal to render part of the text red.
> > 
> > Note that I did _not_ sanitize the string being sent through
> > trace-events in trace_nbd_server_error_msg; this is because I assume
> > that our trace engines already treat all string strings as untrusted
> > input and apply their own escaping as needed.
> > 
> > Reported-by: "Richard W.M. Jones" <rjones@redhat.com>
> > Signed-off-by: Eric Blake <eblake@redhat.com>
> > 
> > ---
> > 
> > If my assumption about allowing raw escape bytes through to trace_
> > calls is wrong (such as when tracing to stderr), let me know.  That's
> > a much bigger audit to determine which trace points, if any, should
> > sanitize data before tracing, and/or change the trace engines to
> > sanitize all strings (with possible knock-on effects if trace output
> > changes unexpectedly for a tool expecting something unsanitized).
> 
> I doubt the trace core layer sanitizes, but it feels it is the
> trace backend responsibility, since core layer might just pass
> pointer to the backends.

Yes, strings are not escaped by the core tracing code. They probably
should not be escaped by QEMU since the trace backend may wish to
process the strings and it expects to operate on raw strings.

> 
> > ---
> >   nbd/client.c | 5 ++++-
> >   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal
  2024-08-05 19:11   ` Richard W.M. Jones
@ 2024-08-07 17:51     ` Eric Blake
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Blake @ 2024-08-07 17:51 UTC (permalink / raw)
  To: Richard W.M. Jones; +Cc: qemu-devel, qemu-block, kwolf

On Mon, Aug 05, 2024 at 08:11:31PM GMT, Richard W.M. Jones wrote:
> On Mon, Aug 05, 2024 at 01:48:12PM -0500, Eric Blake wrote:
> > On Fri, Aug 02, 2024 at 02:26:04PM GMT, Eric Blake wrote:
> > > I've requested a CVE from Red Hat, and hope to have an assigned number
> > > soon.  Meanwhile, we can get review started, to make sure this is
> > > ready to include in 9.1.  'qemu-img info' should never print untrusted
> > > data in a way that might take over a user's terminal.
> > > 
> > > There are probably other spots where qemu-img info is printing
> > > untrusted data (such as filenames), where we probably ought to use the
> > > same sanitization tactics as shown here.  Identifying those spots
> > > would be a useful part of this review, and may mean a v2 that is even
> > > more extensive in the number of patches.
> > 
> > In fact, should we insist that 'qemu-img info XXX' refuse to accept
> > any control characters on any command-line filename, and that it
> > reject any backing file name with control characters as a malformed
> > qcow2 file?  For reference, we JUST fixed qemu-img info to change
> > qcow2 files with a claimed backing file of json:... to favor the local
> > file ./json:... over the potentially dangerous user-controlled
> > format/protocol description, so we are _already_ changing how strict
> > qemu-img is for 9.1, and adding one more restriction to avoid
> > escape-sequence madness makes sense.
> > 
> > Note that with:
> > 
> > touch $'\e[m' && qemu-img info --output=json $'\e[m'
> > 
> > we already escape our output, but without --output=json, we are
> > vulnerable to user-controlled ESC leaking through to stdout for more
> > than just the NBD server errors that I addressed in v1 of this patch
> > series.  Hence my question on whether v2 of the series should touch
> > more places in the code, and whether doing something like flat-out
> > refusing users stupid enough to embed control characters in their
> > filenames is a safe change this close to release.
> 
> You could say if someone gives you a "malicious" text file which you
> cat to stdout, it could change your terminal settings.  I don't think
> therefore any of this is very serious.  We should probably fix any
> obvious things, but it doesn't need to happen right before 9.1 is
> released, we can take our time.

After consulting with Red Hat security, they agree: their decision at
this time is that any CVE related to escape sequences taking over a
terminal would best be filed against the terminal and/or shell that
allowed the escape, rather than against every single app that can
produce such pass-through output.  So at this time they were unwilling
to issue a CVE against qemu for this particular issue, and I will
clean up the subject line for v2.

So I agree that cleaning up low-hanging fruit is still worth it, but
no longer a reason to rush this series into 9.1.  If it still gets a
timely positive review, I can include it alongside the other NBD
patches (we are fixing a CVE with qemu hitting SEGV on
nbd-server-stop).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] qemu-img: CVE-XXX Sanitize untrusted output from NBD server
  2024-08-02 21:03   ` Richard W.M. Jones
@ 2024-08-07 18:45     ` Daniel P. Berrangé
  0 siblings, 0 replies; 18+ messages in thread
From: Daniel P. Berrangé @ 2024-08-07 18:45 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Eric Blake, qemu-devel, qemu-block, Vladimir Sementsov-Ogievskiy

On Fri, Aug 02, 2024 at 10:03:05PM +0100, Richard W.M. Jones wrote:
> On Fri, Aug 02, 2024 at 02:26:06PM -0500, Eric Blake wrote:
> > Error messages from an NBD server must be treated as untrusted; a
> > malicious server can inject escape sequences to try and trigger RCE
> > flaws via escape sequences to whatever terminal happens to be running
> > qemu-img.  The easiest solution is to sanitize the output with the
> > same code we use to produce sanitized (pseudo-)JSON over QMP.
> > 
> > Rich Jones originally pointed this flaw out at:
> > https://lists.libguestfs.org/archives/list/guestfs@lists.libguestfs.org/thread/2NXA23G2V3HPWJYAO726PLNBEAAEUJAU/
> > 
> > With this patch, and a malicious server run with nbdkit 1.40 as:
> > 
> > $ nbdkit --log=null eval open=' printf \
> >   "EPERM x\\r mess up the output \e[31mmess up the output\e[m mess up" >&2; \
> >   exit 1 ' get_size=' echo 0 ' --run 'qemu-img info "$uri"'
> > 
> > we now get:
> > 
> > qemu-img: Could not open 'nbd://localhost': Requested export not available
> > server reported: /tmp/nbdkitOZHOKB/open: x\r mess up the output \u001B[31mmess up the output\u001B[m mess up
> > 
> > instead of an attempt to hide the name of the Unix socket and forcing
> > the terminal to render part of the text red.
> > 
> > Note that I did _not_ sanitize the string being sent through
> > trace-events in trace_nbd_server_error_msg; this is because I assume
> > that our trace engines already treat all string strings as untrusted
> > input and apply their own escaping as needed.
> > 
> > Reported-by: "Richard W.M. Jones" <rjones@redhat.com>
> > Signed-off-by: Eric Blake <eblake@redhat.com>
> > 
> > ---
> > 
> > If my assumption about allowing raw escape bytes through to trace_
> > calls is wrong (such as when tracing to stderr), let me know.  That's
> > a much bigger audit to determine which trace points, if any, should
> > sanitize data before tracing, and/or change the trace engines to
> > sanitize all strings (with possible knock-on effects if trace output
> > changes unexpectedly for a tool expecting something unsanitized).

For nearly all trace backends, QEMU is not emitting anything onthe
console, so there's no escaping QEMU needs to do.

The exception is the "log" backend which calls qemu_log(). IOW, if
that's a concern then the qemu_log() function needs to sanitize the
resulting buffer after printf, rather than the tracing code do anything.

The simpletrace.py script could probably need similar.

I lean towards "don't bother" though, as when tracing I think it is
important to see the raw un-modified output for the sake of accuracy.


> >  nbd/client.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/nbd/client.c b/nbd/client.c
> > index c89c7504673..baa20d10d69 100644
> > --- a/nbd/client.c
> > +++ b/nbd/client.c
> > @@ -23,6 +23,7 @@
> >  #include "trace.h"
> >  #include "nbd-internal.h"
> >  #include "qemu/cutils.h"
> > +#include "qemu/unicode.h"
> > 
> >  /* Definitions for opaque data types */
> > 
> > @@ -230,7 +231,9 @@ static int nbd_handle_reply_err(QIOChannel *ioc, NBDOptionReply *reply,
> >      }
> > 
> >      if (msg) {
> > -        error_append_hint(errp, "server reported: %s\n", msg);
> > +        g_autoptr(GString) buf = g_string_sized_new(reply->length);
> > +        mod_utf8_sanitize(buf, msg);
> > +        error_append_hint(errp, "server reported: %s\n", buf->str);
> >      }

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public
  2024-08-02 19:26 ` [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public Eric Blake
  2024-08-02 21:00   ` Richard W.M. Jones
  2024-08-02 21:38   ` Philippe Mathieu-Daudé
@ 2024-08-07 18:49   ` Daniel P. Berrangé
  2024-08-08  7:57     ` Markus Armbruster
  2024-08-08  7:54   ` Markus Armbruster
  3 siblings, 1 reply; 18+ messages in thread
From: Daniel P. Berrangé @ 2024-08-07 18:49 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, rjones, Markus Armbruster

On Fri, Aug 02, 2024 at 02:26:05PM -0500, Eric Blake wrote:
> My next patch needs to convert text from an untrusted input into an
> output representation that is suitable for display on a terminal is
> useful to more than just the json-writer; the text should normally be
> UTF-8, but blindly allowing all Unicode code points (including ASCII
> ESC) through to a terminal risks remote-code-execution attacks on some
> terminals.  Extract the existing body of json-writer's quoted_strinto
> a new helper routine mod_utf8_sanitize, and generalize it to also work
> on data that is length-limited rather than NUL-terminated.  [I was
> actually surprised that glib does not have such a sanitizer already -
> Google turns up lots of examples of rolling your own string
> sanitizer.]
> 
> If desired in the future, we may want to tweak whether the output is
> guaranteed to be ASCII (using lots of \u escape sequences, including
> surrogate pairs for code points outside the BMP) or if we are okay
> passing printable Unicode through (we still need to escape control
> characters).  But for now, I went for minimal code churn, including
> the fact that the resulting function allows a non-UTF-8 2-byte synonym
> for U+0000.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  include/qemu/unicode.h |  3 ++
>  qobject/json-writer.c  | 47 +----------------------
>  util/unicode.c         | 84 ++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 88 insertions(+), 46 deletions(-)

I was going to ask for a unit test, but "escaped_string" in
test-qjson.c  looks like it will be covering this sufficiently
well already, that we don't need to test it in isolation.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public
  2024-08-02 19:26 ` [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public Eric Blake
                     ` (2 preceding siblings ...)
  2024-08-07 18:49   ` Daniel P. Berrangé
@ 2024-08-08  7:54   ` Markus Armbruster
  2024-08-08 14:02     ` Eric Blake
  3 siblings, 1 reply; 18+ messages in thread
From: Markus Armbruster @ 2024-08-08  7:54 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, rjones

Eric Blake <eblake@redhat.com> writes:

> My next patch needs to convert text from an untrusted input into an
> output representation that is suitable for display on a terminal is
> useful to more than just the json-writer; the text should normally be
> UTF-8, but blindly allowing all Unicode code points (including ASCII
> ESC) through to a terminal risks remote-code-execution attacks on some
> terminals.  Extract the existing body of json-writer's quoted_strinto
> a new helper routine mod_utf8_sanitize, and generalize it to also work
> on data that is length-limited rather than NUL-terminated.

This is two changes in one patch: factor out, and generalize.

You don't actually use the generalized variant.  Please leave that for
later, and keep this patch simple.

>                                                             [I was
> actually surprised that glib does not have such a sanitizer already -
> Google turns up lots of examples of rolling your own string
> sanitizer.]
>
> If desired in the future, we may want to tweak whether the output is
> guaranteed to be ASCII (using lots of \u escape sequences, including
> surrogate pairs for code points outside the BMP) or if we are okay
> passing printable Unicode through (we still need to escape control
> characters).  But for now, I went for minimal code churn, including
> the fact that the resulting function allows a non-UTF-8 2-byte synonym
> for U+0000.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  include/qemu/unicode.h |  3 ++
>  qobject/json-writer.c  | 47 +----------------------
>  util/unicode.c         | 84 ++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 88 insertions(+), 46 deletions(-)
>
> diff --git a/include/qemu/unicode.h b/include/qemu/unicode.h
> index 7fa10b8e604..e1013b29f72 100644
> --- a/include/qemu/unicode.h
> +++ b/include/qemu/unicode.h
> @@ -4,4 +4,7 @@
>  int mod_utf8_codepoint(const char *s, size_t n, char **end);
>  ssize_t mod_utf8_encode(char buf[], size_t bufsz, int codepoint);
>
> +void mod_utf8_sanitize(GString *buf, const char *str);
> +void mod_utf8_sanitize_len(GString *buf, const char *str, ssize_t len);
> +
>  #endif
> diff --git a/qobject/json-writer.c b/qobject/json-writer.c
> index 309a31d57a9..5c14574efee 100644
> --- a/qobject/json-writer.c
> +++ b/qobject/json-writer.c
> @@ -104,53 +104,8 @@ static void pretty_newline_or_space(JSONWriter *writer)
>
>  static void quoted_str(JSONWriter *writer, const char *str)
>  {
> -    const char *ptr;
> -    char *end;
> -    int cp;
> -
>      g_string_append_c(writer->contents, '"');
> -
> -    for (ptr = str; *ptr; ptr = end) {
> -        cp = mod_utf8_codepoint(ptr, 6, &end);
> -        switch (cp) {
> -        case '\"':
> -            g_string_append(writer->contents, "\\\"");
> -            break;
> -        case '\\':
> -            g_string_append(writer->contents, "\\\\");
> -            break;
> -        case '\b':
> -            g_string_append(writer->contents, "\\b");
> -            break;
> -        case '\f':
> -            g_string_append(writer->contents, "\\f");
> -            break;
> -        case '\n':
> -            g_string_append(writer->contents, "\\n");
> -            break;
> -        case '\r':
> -            g_string_append(writer->contents, "\\r");
> -            break;
> -        case '\t':
> -            g_string_append(writer->contents, "\\t");
> -            break;
> -        default:
> -            if (cp < 0) {
> -                cp = 0xFFFD; /* replacement character */
> -            }
> -            if (cp > 0xFFFF) {
> -                /* beyond BMP; need a surrogate pair */
> -                g_string_append_printf(writer->contents, "\\u%04X\\u%04X",
> -                                       0xD800 + ((cp - 0x10000) >> 10),
> -                                       0xDC00 + ((cp - 0x10000) & 0x3FF));
> -            } else if (cp < 0x20 || cp >= 0x7F) {
> -                g_string_append_printf(writer->contents, "\\u%04X", cp);
> -            } else {
> -                g_string_append_c(writer->contents, cp);
> -            }
> -        }
> -    };
> -
> +    mod_utf8_sanitize(writer->contents, str);
>      g_string_append_c(writer->contents, '"');
>  }
>
> diff --git a/util/unicode.c b/util/unicode.c
> index 8580bc598b3..a419ed4de35 100644
> --- a/util/unicode.c
> +++ b/util/unicode.c
> @@ -154,3 +154,87 @@ ssize_t mod_utf8_encode(char buf[], size_t bufsz, int codepoint)
>      buf[4] = 0;
>      return 4;
>  }
> +
> +/**
> + * mod_utf8_sanitize:
> + * @buf: Destination buffer
> + * @str: Modified UTF-8 string to sanitize
> + *
> + * Append the contents of the NUL-terminated Modified UTF-8 string
> + * @str into @buf, with escape sequences used for non-printable ASCII

"Append into" or "append to"?

> + * and for quote characters.  The result is therefore safe for output
> + * to a terminal.

Actually, we escape double quote, backslash, ASCII control characters,
and non-ASCII characters, i.e. we leave just printable ASCII characters
other than '"' and '\\' unescaped.  See the code quoted below.

Escaping '\\' is of course necessary.

Escaping '"' is necessary only for callers that want to enclose the
result in double quotes without ambiguity.  Which is what the existing
caller wants.  Future callers may prefer not to escape '"'.  But we can
worry about this when we run into such callers.

> + *
> + * Modified UTF-8 is exactly like UTF-8, except U+0000 is encoded as
> + * "\xC0\x80".
> + */

Missing: behavior for invalid sequences.  Each such sequence is replaced
by a replacement character U+FFFD.  For the definition of "invalid
sequence", see mod_utf8_codepoint().

On the function name.  "Sanitize" could be many things.  This function
actually does two things: (1) replace invalid sequences, and (2) escape
to printable ASCII.  What about append_mod_utf8_as_printable_ascii()?
Admittedly a mouthful.

> +void mod_utf8_sanitize(GString *buf, const char *str)
> +{
> +    mod_utf8_sanitize_len(buf, str, -1);
> +}
> +
> +/**
> + * mod_utf8_sanitize:
> + * @buf: Destination buffer
> + * @str: Modified UTF-8 string to sanitize
> + * @len: Length of @str, or negative to stop at NUL terminator
> + *
> + * Append the contents of @len bytes of the Modified UTF-8 string
> + * @str into @buf, with escape sequences used for non-printable ASCII
> + * and for quote characters.  The result is therefore safe for output
> + * to a terminal.
> + *
> + * Modified UTF-8 is exactly like UTF-8, except U+0000 is encoded as
> + * "\xC0\x80".
> + */
> +void mod_utf8_sanitize_len(GString *buf, const char *str, ssize_t len)
> +{
> +    const char *ptr;
> +    char *end;
> +    int cp;
> +
> +    if (len < 0) {
> +        len = strlen(str);
> +    }
> +
> +    for (ptr = str; *ptr; ptr = end) {
> +        cp = mod_utf8_codepoint(ptr, MIN(6, str + len - ptr), &end);
> +        switch (cp) {
> +        case '\"':
> +            g_string_append(buf, "\\\"");
> +            break;
> +        case '\\':
> +            g_string_append(buf, "\\\\");
> +            break;
> +        case '\b':
> +            g_string_append(buf, "\\b");
> +            break;
> +        case '\f':
> +            g_string_append(buf, "\\f");
> +            break;
> +        case '\n':
> +            g_string_append(buf, "\\n");
> +            break;
> +        case '\r':
> +            g_string_append(buf, "\\r");
> +            break;
> +        case '\t':
> +            g_string_append(buf, "\\t");
> +            break;
> +        default:
> +            if (cp < 0) {
> +                cp = 0xFFFD; /* replacement character */
> +            }
> +            if (cp > 0xFFFF) {
> +                /* beyond BMP; need a surrogate pair */
> +                g_string_append_printf(buf, "\\u%04X\\u%04X",
> +                                       0xD800 + ((cp - 0x10000) >> 10),
> +                                       0xDC00 + ((cp - 0x10000) & 0x3FF));
> +            } else if (cp < 0x20 || cp >= 0x7F) {
> +                g_string_append_printf(buf, "\\u%04X", cp);
> +            } else {
> +                g_string_append_c(buf, cp);
> +            }
> +        }
> +    }
> +}



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public
  2024-08-07 18:49   ` Daniel P. Berrangé
@ 2024-08-08  7:57     ` Markus Armbruster
  0 siblings, 0 replies; 18+ messages in thread
From: Markus Armbruster @ 2024-08-08  7:57 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Eric Blake, qemu-devel, qemu-block, rjones

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Fri, Aug 02, 2024 at 02:26:05PM -0500, Eric Blake wrote:
>> My next patch needs to convert text from an untrusted input into an
>> output representation that is suitable for display on a terminal is
>> useful to more than just the json-writer; the text should normally be
>> UTF-8, but blindly allowing all Unicode code points (including ASCII
>> ESC) through to a terminal risks remote-code-execution attacks on some
>> terminals.  Extract the existing body of json-writer's quoted_strinto
>> a new helper routine mod_utf8_sanitize, and generalize it to also work
>> on data that is length-limited rather than NUL-terminated.  [I was
>> actually surprised that glib does not have such a sanitizer already -
>> Google turns up lots of examples of rolling your own string
>> sanitizer.]
>> 
>> If desired in the future, we may want to tweak whether the output is
>> guaranteed to be ASCII (using lots of \u escape sequences, including
>> surrogate pairs for code points outside the BMP) or if we are okay
>> passing printable Unicode through (we still need to escape control
>> characters).  But for now, I went for minimal code churn, including
>> the fact that the resulting function allows a non-UTF-8 2-byte synonym
>> for U+0000.
>> 
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> ---
>>  include/qemu/unicode.h |  3 ++
>>  qobject/json-writer.c  | 47 +----------------------
>>  util/unicode.c         | 84 ++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 88 insertions(+), 46 deletions(-)
>
> I was going to ask for a unit test, but "escaped_string" in
> test-qjson.c  looks like it will be covering this sufficiently

check-qjson.c, and other test cases torture it some more.

> well already, that we don't need to test it in isolation.
>
> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
>
>
> With regards,
> Daniel



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public
  2024-08-08  7:54   ` Markus Armbruster
@ 2024-08-08 14:02     ` Eric Blake
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Blake @ 2024-08-08 14:02 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: qemu-devel, qemu-block, rjones

On Thu, Aug 08, 2024 at 09:54:26AM GMT, Markus Armbruster wrote:
> Eric Blake <eblake@redhat.com> writes:
> 
> > My next patch needs to convert text from an untrusted input into an
> > output representation that is suitable for display on a terminal is
> > useful to more than just the json-writer; the text should normally be
> > UTF-8, but blindly allowing all Unicode code points (including ASCII
> > ESC) through to a terminal risks remote-code-execution attacks on some
> > terminals.  Extract the existing body of json-writer's quoted_strinto
> > a new helper routine mod_utf8_sanitize, and generalize it to also work
> > on data that is length-limited rather than NUL-terminated.
> 
> This is two changes in one patch: factor out, and generalize.
> 
> You don't actually use the generalized variant.  Please leave that for
> later, and keep this patch simple.

Makes sense. Will simplify in v2.

> 
> >                                                             [I was
> > actually surprised that glib does not have such a sanitizer already -
> > Google turns up lots of examples of rolling your own string
> > sanitizer.]

See https://gitlab.gnome.org/GNOME/glib/-/issues/3426 where I asked if
glib should provide a more generic sanitizer.  In turn, I found glib
does have:

char*
g_uri_escape_string (
  const char* unescaped,
  const char* reserved_chars_allowed,
  gboolean allow_utf8
)

which is a bit more powerful (you have some fine-tuning on what gets
escaped), but a different escaping mechanism (%XX instead of \uXXXX
escapes, where % rather than \ is special).  I'm not sure whether it
is nicer to have a malloc'd result or to append into a larger g_string
as the base operation (and where you could write an easy wrapper to
provide the other operation).

> > +/**
> > + * mod_utf8_sanitize:
> > + * @buf: Destination buffer
> > + * @str: Modified UTF-8 string to sanitize
> > + *
> > + * Append the contents of the NUL-terminated Modified UTF-8 string
> > + * @str into @buf, with escape sequences used for non-printable ASCII
> 
> "Append into" or "append to"?

"append to" works, will simplify.

> 
> > + * and for quote characters.  The result is therefore safe for output
> > + * to a terminal.
> 
> Actually, we escape double quote, backslash, ASCII control characters,
> and non-ASCII characters, i.e. we leave just printable ASCII characters
> other than '"' and '\\' unescaped.  See the code quoted below.
> 
> Escaping '\\' is of course necessary.
> 
> Escaping '"' is necessary only for callers that want to enclose the
> result in double quotes without ambiguity.  Which is what the existing
> caller wants.  Future callers may prefer not to escape '"'.  But we can
> worry about this when we run into such callers.

If we encounter more users, I could see this morphing into a backend
that takes a flag argument on knobs like whether to escape " or ',
whether to preserve printing Unicode, ...; coupled with frontends with
fewer arguments that pass the right flags intot the backend.

> 
> > + *
> > + * Modified UTF-8 is exactly like UTF-8, except U+0000 is encoded as
> > + * "\xC0\x80".
> > + */
> 
> Missing: behavior for invalid sequences.  Each such sequence is replaced
> by a replacement character U+FFFD.  For the definition of "invalid
> sequence", see mod_utf8_codepoint().

Indeed, more documentation here wouldn't hurt (both on \ and " being
escaped, and on U+FFFD codepoints being placed into the output
stream).

> 
> On the function name.  "Sanitize" could be many things.  This function
> actually does two things: (1) replace invalid sequences, and (2) escape
> to printable ASCII.  What about append_mod_utf8_as_printable_ascii()?
> Admittedly a mouthful.

Naming is always tough.  Your suggestion is longer, but indeed
accurate.  Maybe a shorter compromise of append_escaped_mod_utf8()?

> 
> > +void mod_utf8_sanitize(GString *buf, const char *str)
> > +{
> > +    mod_utf8_sanitize_len(buf, str, -1);
> > +}

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-08-08 14:03 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-02 19:26 [PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal Eric Blake
2024-08-02 19:26 ` [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public Eric Blake
2024-08-02 21:00   ` Richard W.M. Jones
2024-08-02 21:38   ` Philippe Mathieu-Daudé
2024-08-07 18:49   ` Daniel P. Berrangé
2024-08-08  7:57     ` Markus Armbruster
2024-08-08  7:54   ` Markus Armbruster
2024-08-08 14:02     ` Eric Blake
2024-08-02 19:26 ` [PATCH 2/2] qemu-img: CVE-XXX Sanitize untrusted output from NBD server Eric Blake
2024-08-02 21:03   ` Richard W.M. Jones
2024-08-07 18:45     ` Daniel P. Berrangé
2024-08-02 21:41   ` Philippe Mathieu-Daudé
2024-08-07 13:43     ` Stefan Hajnoczi
2024-08-02 22:01   ` Richard W.M. Jones
2024-08-03  8:20     ` Richard W.M. Jones
2024-08-05 18:48 ` [PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal Eric Blake
2024-08-05 19:11   ` Richard W.M. Jones
2024-08-07 17:51     ` Eric Blake

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).