qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/4] raw: Prohibit dangerous writes for probed images
@ 2014-10-30 12:26 Kevin Wolf
  2014-10-30 12:26 ` [Qemu-devel] [PATCH 1/4] block: Factor bdrv_probe_all() out of find_image_format() Kevin Wolf
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Kevin Wolf @ 2014-10-30 12:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, jcody, armbru, mreitz, stefanha

See the commit message of patch 3 for the why and how.


Old relevant discussions:

October 2014: '[PATCH RFC 0/2] block: Warn on insecure format probing'
http://lists.nongnu.org/archive/html/qemu-devel/2014-10/msg03440.html

August 2014: '[PATCH 0/3] vpc: support probing of fixed size images'
https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg02456.html

July 2010: 'Make default invocation of block drivers safer (v3)'
http://patchwork.ozlabs.org/patch/58980/


Kevin Wolf (3):
  block: Read only one sector for format probing
  raw: Prohibit dangerous writes for probed images
  qemu-iotests: Test writing non-raw image headers to raw image

Markus Armbruster (1):
  block: Factor bdrv_probe_all() out of find_image_format()

 block.c                    |  48 ++++++++++++-----
 block/raw_bsd.c            |  46 +++++++++++++++-
 include/block/block_int.h  |   5 ++
 tests/qemu-iotests/109     | 100 +++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/109.out | 128 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/group   |   1 +
 6 files changed, 313 insertions(+), 15 deletions(-)
 create mode 100755 tests/qemu-iotests/109
 create mode 100644 tests/qemu-iotests/109.out

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH 1/4] block: Factor bdrv_probe_all() out of find_image_format()
  2014-10-30 12:26 [Qemu-devel] [PATCH 0/4] raw: Prohibit dangerous writes for probed images Kevin Wolf
@ 2014-10-30 12:26 ` Kevin Wolf
  2014-10-30 12:26 ` [Qemu-devel] [PATCH 2/4] block: Read only one sector for format probing Kevin Wolf
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Kevin Wolf @ 2014-10-30 12:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, jcody, armbru, mreitz, stefanha

From: Markus Armbruster <armbru@redhat.com>

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c | 45 ++++++++++++++++++++++++++++++++-------------
 1 file changed, 32 insertions(+), 13 deletions(-)

diff --git a/block.c b/block.c
index 88f6d9b..8da6e61 100644
--- a/block.c
+++ b/block.c
@@ -644,11 +644,40 @@ BlockDriver *bdrv_find_protocol(const char *filename,
     return NULL;
 }
 
+/*
+ * Guess image format by probing its contents.
+ * This is not a good idea when your image is raw (CVE-2008-2004), but
+ * we do it anyway for backward compatibility.
+ * @buf contains the image's first @buf_size bytes.
+ * @buf_size is 2048 or the image's size, whatever is smaller.
+ * @filename is its filename.
+ * For all block drivers, call the bdrv_probe() method to get its
+ * probing score.
+ * Return the first block driver with the highest probing score.
+ */
+static BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size,
+                                   const char *filename)
+{
+    int score_max = 0, score;
+    BlockDriver *drv = NULL, *d;
+
+    QLIST_FOREACH(d, &bdrv_drivers, list) {
+        if (d->bdrv_probe) {
+            score = d->bdrv_probe(buf, buf_size, filename);
+            if (score > score_max) {
+                score_max = score;
+                drv = d;
+            }
+        }
+    }
+
+    return drv;
+}
+
 static int find_image_format(BlockDriverState *bs, const char *filename,
                              BlockDriver **pdrv, Error **errp)
 {
-    int score, score_max;
-    BlockDriver *drv1, *drv;
+    BlockDriver *drv;
     uint8_t buf[2048];
     int ret = 0;
 
@@ -671,17 +700,7 @@ static int find_image_format(BlockDriverState *bs, const char *filename,
         return ret;
     }
 
-    score_max = 0;
-    drv = NULL;
-    QLIST_FOREACH(drv1, &bdrv_drivers, list) {
-        if (drv1->bdrv_probe) {
-            score = drv1->bdrv_probe(buf, ret, filename);
-            if (score > score_max) {
-                score_max = score;
-                drv = drv1;
-            }
-        }
-    }
+    drv = bdrv_probe_all(buf, ret, filename);
     if (!drv) {
         error_setg(errp, "Could not determine image format: No compatible "
                    "driver found");
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH 2/4] block: Read only one sector for format probing
  2014-10-30 12:26 [Qemu-devel] [PATCH 0/4] raw: Prohibit dangerous writes for probed images Kevin Wolf
  2014-10-30 12:26 ` [Qemu-devel] [PATCH 1/4] block: Factor bdrv_probe_all() out of find_image_format() Kevin Wolf
@ 2014-10-30 12:26 ` Kevin Wolf
  2014-11-04 15:32   ` Stefan Hajnoczi
  2014-10-30 12:26 ` [Qemu-devel] [PATCH 3/4] raw: Prohibit dangerous writes for probed images Kevin Wolf
  2014-10-30 12:26 ` [Qemu-devel] [PATCH 4/4] qemu-iotests: Test writing non-raw image headers to raw image Kevin Wolf
  3 siblings, 1 reply; 10+ messages in thread
From: Kevin Wolf @ 2014-10-30 12:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, jcody, armbru, mreitz, stefanha

The only image format driver that even potentially accesses anything
after 512 bytes in its bdrv_probe() implementation is VMDK, which reads
a plain-text descriptor file. In practice, the field it's looking for
seems to come first and will be well within the first 512 bytes, too.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c                   | 2 +-
 include/block/block_int.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 8da6e61..8006685 100644
--- a/block.c
+++ b/block.c
@@ -678,7 +678,7 @@ static int find_image_format(BlockDriverState *bs, const char *filename,
                              BlockDriver **pdrv, Error **errp)
 {
     BlockDriver *drv;
-    uint8_t buf[2048];
+    uint8_t buf[BLOCK_PROBE_BUF_SIZE];
     int ret = 0;
 
     /* Return the raw BlockDriver * to scsi-generic devices or empty drives */
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8898c6c..697cd5d 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -57,6 +57,8 @@
 #define BLOCK_OPT_REDUNDANCY        "redundancy"
 #define BLOCK_OPT_NOCOW             "nocow"
 
+#define BLOCK_PROBE_BUF_SIZE        512
+
 typedef struct BdrvTrackedRequest {
     BlockDriverState *bs;
     int64_t offset;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH 3/4] raw: Prohibit dangerous writes for probed images
  2014-10-30 12:26 [Qemu-devel] [PATCH 0/4] raw: Prohibit dangerous writes for probed images Kevin Wolf
  2014-10-30 12:26 ` [Qemu-devel] [PATCH 1/4] block: Factor bdrv_probe_all() out of find_image_format() Kevin Wolf
  2014-10-30 12:26 ` [Qemu-devel] [PATCH 2/4] block: Read only one sector for format probing Kevin Wolf
@ 2014-10-30 12:26 ` Kevin Wolf
  2014-10-30 12:43   ` Kevin Wolf
                     ` (2 more replies)
  2014-10-30 12:26 ` [Qemu-devel] [PATCH 4/4] qemu-iotests: Test writing non-raw image headers to raw image Kevin Wolf
  3 siblings, 3 replies; 10+ messages in thread
From: Kevin Wolf @ 2014-10-30 12:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, jcody, armbru, mreitz, stefanha

If the user neglects to specify the image format, QEMU probes the
image to guess it automatically, for convenience.

Relying on format probing is insecure for raw images (CVE-2008-2004).
If the guest writes a suitable header to the device, the next probe
will recognize a format chosen by the guest.  A malicious guest can
abuse this to gain access to host files, e.g. by crafting a QCOW2
header with backing file /etc/shadow.

Commit 1e72d3b (April 2008) provided -drive parameter format to let
users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
optionally store the backing file format, to let users disable backing
file probing.  QED has had a flag to suppress probing since the
beginning (2010), set whenever a raw backing file is assigned.

All of these additions that allow to avoid format probing have to be
specified explicitly. The default still allows the attack.

In order to fix this, commit 79368c8 (July 2010) put probed raw images
in a restricted mode, in which they wouldn't be able to overwrite the
first few bytes of the image so that they would identify as a different
image. If a write to the first sector would write one of the signatures
of another driver, qemu would instead zero out the first four bytes.
This patch was later reverted in commit 8b33d9e (September 2010) because
it didn't get the handling of unaligned qiov members right.

Today's block layer that is based on coroutines and has qiov utility
functions makes it much easier to get this functionality right, so this
patch implements it.

The other differences of this patch to the old one are that it doesn't
silently write something different than the guest requested by zeroing
out some bytes (it fails the request instead) and that it doesn't
maintain a list of signatures in the raw driver (it calls the usual
probe function instead).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c                   |  5 +++--
 block/raw_bsd.c           | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 include/block/block_int.h |  3 +++
 3 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index 8006685..bd7b2b5 100644
--- a/block.c
+++ b/block.c
@@ -655,8 +655,8 @@ BlockDriver *bdrv_find_protocol(const char *filename,
  * probing score.
  * Return the first block driver with the highest probing score.
  */
-static BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size,
-                                   const char *filename)
+BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size,
+                            const char *filename)
 {
     int score_max = 0, score;
     BlockDriver *drv = NULL, *d;
@@ -1482,6 +1482,7 @@ int bdrv_open(BlockDriverState **pbs, const char *filename,
     }
 
     /* Image format probing */
+    bs->probed = !drv;
     if (!drv && file) {
         ret = find_image_format(file, filename, &drv, &local_err);
         if (ret < 0) {
diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 401b967..80f3a50 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -58,8 +58,52 @@ static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num,
 static int coroutine_fn raw_co_writev(BlockDriverState *bs, int64_t sector_num,
                                       int nb_sectors, QEMUIOVector *qiov)
 {
+    void *buf = NULL;
+    BlockDriver *drv;
+    QEMUIOVector local_qiov;
+    int ret;
+
+    if (bs->probed && sector_num == 0) {
+        /* As long as these conditions are true, we can't get partial writes to
+         * the probe buffer and can just directly check the request. */
+        QEMU_BUILD_BUG_ON(BLOCK_PROBE_BUF_SIZE != 512);
+        QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != 512);
+
+        buf = g_try_malloc(512);
+        if (!buf) {
+            ret = -ENOMEM;
+            goto fail;
+        }
+
+        ret = qemu_iovec_to_buf(qiov, 0, buf, 512);
+        if (ret != 512) {
+            ret = -EINVAL;
+            goto fail;
+        }
+
+        drv = bdrv_probe_all(buf, 512, NULL);
+        if (drv != bs->drv) {
+            ret = -EPERM;
+            goto fail;
+        }
+
+        /* Use the checked buffer, a malicious guest might be overwriting its
+         * original buffer in the background. */
+        qemu_iovec_init(&local_qiov, qiov->niov + 1);
+        qemu_iovec_add(&local_qiov, buf, 512);
+        qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size - 512);
+        qiov = &local_qiov;
+    }
+
     BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
-    return bdrv_co_writev(bs->file, sector_num, nb_sectors, qiov);
+    ret = bdrv_co_writev(bs->file, sector_num, nb_sectors, qiov);
+
+fail:
+    if (qiov == &local_qiov) {
+        qemu_iovec_destroy(&local_qiov);
+    }
+    g_free(buf);
+    return ret;
 }
 
 static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 697cd5d..ba436d7 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -322,6 +322,7 @@ struct BlockDriverState {
     int sg;        /* if true, the device is a /dev/sg* */
     int copy_on_read; /* if true, copy read backing sectors into image
                          note this is a reference count */
+    bool probed;
 
     BlockDriver *drv; /* NULL means no media */
     void *opaque;
@@ -410,6 +411,8 @@ struct BlockDriverState {
 };
 
 int get_tmp_filename(char *filename, int size);
+BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size,
+                            const char *filename);
 
 void bdrv_set_io_limits(BlockDriverState *bs,
                         ThrottleConfig *cfg);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH 4/4] qemu-iotests: Test writing non-raw image headers to raw image
  2014-10-30 12:26 [Qemu-devel] [PATCH 0/4] raw: Prohibit dangerous writes for probed images Kevin Wolf
                   ` (2 preceding siblings ...)
  2014-10-30 12:26 ` [Qemu-devel] [PATCH 3/4] raw: Prohibit dangerous writes for probed images Kevin Wolf
@ 2014-10-30 12:26 ` Kevin Wolf
  3 siblings, 0 replies; 10+ messages in thread
From: Kevin Wolf @ 2014-10-30 12:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, jcody, armbru, mreitz, stefanha

This is forbidden if the raw driver was probed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 tests/qemu-iotests/109     | 100 +++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/109.out | 128 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 229 insertions(+)
 create mode 100755 tests/qemu-iotests/109
 create mode 100644 tests/qemu-iotests/109.out

diff --git a/tests/qemu-iotests/109 b/tests/qemu-iotests/109
new file mode 100755
index 0000000..ca7ea43
--- /dev/null
+++ b/tests/qemu-iotests/109
@@ -0,0 +1,100 @@
+#!/bin/bash
+#
+# Test writing image headers of other formats into raw images
+#
+# Copyright (C) 2014 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=kwolf@redhat.com
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+here="$PWD"
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+    rm -f $TEST_IMG.src
+	_cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+
+_supported_fmt raw
+_supported_proto file
+_supported_os Linux
+
+qemu_comm_method=qmp
+
+for fmt in qcow qcow2 qed vdi vhdx vmdk vpc; do
+
+    echo
+    echo "=== Writing a $fmt header into raw ==="
+    echo
+
+    _make_test_img 64M
+    TEST_IMG="$TEST_IMG.src" IMGFMT=$fmt _make_test_img 64M
+
+
+    # This first test should fail: The image format was probed, we may not
+    # write an image header at the start of the image
+
+    _launch_qemu -drive file="${TEST_IMG}.src",format=raw,cache=${CACHEMODE},id=src
+    _send_qemu_cmd $QEMU_HANDLE "{ 'execute': 'qmp_capabilities' }" "return"
+
+    _send_qemu_cmd $QEMU_HANDLE \
+        "{'execute':'drive-mirror', 'arguments':{
+            'device': 'src', 'target': '$TEST_IMG',
+            'mode': 'existing', 'sync': 'full'}}" \
+        "return"
+
+    _send_qemu_cmd $QEMU_HANDLE '' "BLOCK_JOB_ERROR"
+    _send_qemu_cmd $QEMU_HANDLE '{"execute":"query-block-jobs"}' "return"
+    _cleanup_qemu
+
+    $QEMU_IO -c 'read -P 0 0 64k' "$TEST_IMG" | _filter_qemu_io
+
+
+    # When raw was explicitly specified, the same must succeed
+
+    _launch_qemu -drive file="${TEST_IMG}.src",format=raw,cache=${CACHEMODE},id=src
+    _send_qemu_cmd $QEMU_HANDLE "{ 'execute': 'qmp_capabilities' }" "return"
+
+    _send_qemu_cmd $QEMU_HANDLE \
+        "{'execute':'drive-mirror', 'arguments':{
+            'device': 'src', 'target': '$TEST_IMG', 'format': 'raw',
+            'mode': 'existing', 'sync': 'full'}}" \
+        "return"
+
+    _send_qemu_cmd $QEMU_HANDLE '' "BLOCK_JOB_READY"
+    _send_qemu_cmd $QEMU_HANDLE '{"execute":"query-block-jobs"}' "return"
+    _cleanup_qemu
+
+    $QEMU_IMG compare -f raw -F raw "$TEST_IMG" "$TEST_IMG.src"
+
+done
+
+# success, all done
+echo '*** done'
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/109.out b/tests/qemu-iotests/109.out
new file mode 100644
index 0000000..e617306
--- /dev/null
+++ b/tests/qemu-iotests/109.out
@@ -0,0 +1,128 @@
+QA output created by 109
+
+=== Writing a qcow header into raw ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+Formatting 'TEST_DIR/t.raw.src', fmt=IMGFMT size=67108864 
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": "report"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 1024, "offset": 0, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
+{"return": []}
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 1024, "offset": 1024, "speed": 0, "type": "mirror"}}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 1024, "offset": 1024, "paused": false, "speed": 0, "type": "mirror"}]}
+Warning: Image size mismatch!
+Images are identical.
+
+=== Writing a qcow2 header into raw ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+Formatting 'TEST_DIR/t.raw.src', fmt=IMGFMT size=67108864 
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": "report"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 197120, "offset": 0, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
+{"return": []}
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 197120, "offset": 197120, "speed": 0, "type": "mirror"}}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 197120, "offset": 197120, "paused": false, "speed": 0, "type": "mirror"}]}
+Warning: Image size mismatch!
+Images are identical.
+
+=== Writing a qed header into raw ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+Formatting 'TEST_DIR/t.raw.src', fmt=IMGFMT size=67108864 
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": "report"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 327680, "offset": 0, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
+{"return": []}
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 327680, "offset": 327680, "speed": 0, "type": "mirror"}}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 327680, "offset": 327680, "paused": false, "speed": 0, "type": "mirror"}]}
+Warning: Image size mismatch!
+Images are identical.
+
+=== Writing a vdi header into raw ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+Formatting 'TEST_DIR/t.raw.src', fmt=IMGFMT size=67108864 
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": "report"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 1024, "offset": 0, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
+{"return": []}
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 1024, "offset": 1024, "speed": 0, "type": "mirror"}}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 1024, "offset": 1024, "paused": false, "speed": 0, "type": "mirror"}]}
+Warning: Image size mismatch!
+Images are identical.
+
+=== Writing a vhdx header into raw ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+Formatting 'TEST_DIR/t.raw.src', fmt=IMGFMT size=67108864 
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": "report"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 8388608, "offset": 0, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
+{"return": []}
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 8388608, "offset": 8388608, "speed": 0, "type": "mirror"}}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 8388608, "offset": 8388608, "paused": false, "speed": 0, "type": "mirror"}]}
+Warning: Image size mismatch!
+Images are identical.
+
+=== Writing a vmdk header into raw ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+Formatting 'TEST_DIR/t.raw.src', fmt=IMGFMT size=67108864 
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": "report"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 65536, "offset": 0, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
+{"return": []}
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 65536, "offset": 65536, "speed": 0, "type": "mirror"}}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 65536, "offset": 65536, "paused": false, "speed": 0, "type": "mirror"}]}
+Warning: Image size mismatch!
+Images are identical.
+
+=== Writing a vpc header into raw ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+Formatting 'TEST_DIR/t.raw.src', fmt=IMGFMT size=67108864 
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": "report"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 2560, "offset": 0, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
+{"return": []}
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 2560, "offset": 2560, "speed": 0, "type": "mirror"}}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 2560, "offset": 2560, "paused": false, "speed": 0, "type": "mirror"}]}
+Warning: Image size mismatch!
+Images are identical.
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 9bbd5d3..659b086 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -109,3 +109,4 @@
 105 rw auto quick
 107 rw auto quick
 108 rw auto quick
+109 rw auto quick
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] raw: Prohibit dangerous writes for probed images
  2014-10-30 12:26 ` [Qemu-devel] [PATCH 3/4] raw: Prohibit dangerous writes for probed images Kevin Wolf
@ 2014-10-30 12:43   ` Kevin Wolf
  2014-10-30 14:27   ` Eric Blake
  2014-11-04 15:41   ` Stefan Hajnoczi
  2 siblings, 0 replies; 10+ messages in thread
From: Kevin Wolf @ 2014-10-30 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: jcody, armbru, stefanha, mreitz

Am 30.10.2014 um 13:26 hat Kevin Wolf geschrieben:
> If the user neglects to specify the image format, QEMU probes the
> image to guess it automatically, for convenience.
> 
> Relying on format probing is insecure for raw images (CVE-2008-2004).
> If the guest writes a suitable header to the device, the next probe
> will recognize a format chosen by the guest.  A malicious guest can
> abuse this to gain access to host files, e.g. by crafting a QCOW2
> header with backing file /etc/shadow.
> 
> Commit 1e72d3b (April 2008) provided -drive parameter format to let
> users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
> optionally store the backing file format, to let users disable backing
> file probing.  QED has had a flag to suppress probing since the
> beginning (2010), set whenever a raw backing file is assigned.
> 
> All of these additions that allow to avoid format probing have to be
> specified explicitly. The default still allows the attack.
> 
> In order to fix this, commit 79368c8 (July 2010) put probed raw images
> in a restricted mode, in which they wouldn't be able to overwrite the
> first few bytes of the image so that they would identify as a different
> image. If a write to the first sector would write one of the signatures
> of another driver, qemu would instead zero out the first four bytes.
> This patch was later reverted in commit 8b33d9e (September 2010) because
> it didn't get the handling of unaligned qiov members right.
> 
> Today's block layer that is based on coroutines and has qiov utility
> functions makes it much easier to get this functionality right, so this
> patch implements it.
> 
> The other differences of this patch to the old one are that it doesn't
> silently write something different than the guest requested by zeroing
> out some bytes (it fails the request instead) and that it doesn't
> maintain a list of signatures in the raw driver (it calls the usual
> probe function instead).
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

I was a little too quick with sending this out, so I'll have to
point out a bug myself, but the idea should be clear enough to have a
discussion.

> diff --git a/block/raw_bsd.c b/block/raw_bsd.c
> index 401b967..80f3a50 100644
> --- a/block/raw_bsd.c
> +++ b/block/raw_bsd.c
> @@ -58,8 +58,52 @@ static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num,
>  static int coroutine_fn raw_co_writev(BlockDriverState *bs, int64_t sector_num,
>                                        int nb_sectors, QEMUIOVector *qiov)
>  {
> +    void *buf = NULL;
> +    BlockDriver *drv;
> +    QEMUIOVector local_qiov;
> +    int ret;
> +
> +    if (bs->probed && sector_num == 0) {
> +        /* As long as these conditions are true, we can't get partial writes to
> +         * the probe buffer and can just directly check the request. */
> +        QEMU_BUILD_BUG_ON(BLOCK_PROBE_BUF_SIZE != 512);
> +        QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != 512);
> +
> +        buf = g_try_malloc(512);

Should be qemu_try_blockalign().

> +        if (!buf) {
> +            ret = -ENOMEM;
> +            goto fail;
> +        }
> +
> +        ret = qemu_iovec_to_buf(qiov, 0, buf, 512);
> +        if (ret != 512) {
> +            ret = -EINVAL;
> +            goto fail;
> +        }
> +
> +        drv = bdrv_probe_all(buf, 512, NULL);
> +        if (drv != bs->drv) {
> +            ret = -EPERM;
> +            goto fail;
> +        }
> +
> +        /* Use the checked buffer, a malicious guest might be overwriting its
> +         * original buffer in the background. */
> +        qemu_iovec_init(&local_qiov, qiov->niov + 1);
> +        qemu_iovec_add(&local_qiov, buf, 512);
> +        qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size - 512);

And here the offset obviously needs to be 512 instead of 0.

> +        qiov = &local_qiov;
> +    }
> +
>      BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
> -    return bdrv_co_writev(bs->file, sector_num, nb_sectors, qiov);
> +    ret = bdrv_co_writev(bs->file, sector_num, nb_sectors, qiov);
> +
> +fail:
> +    if (qiov == &local_qiov) {
> +        qemu_iovec_destroy(&local_qiov);
> +    }
> +    g_free(buf);
> +    return ret;
>  }

Kevin

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] raw: Prohibit dangerous writes for probed images
  2014-10-30 12:26 ` [Qemu-devel] [PATCH 3/4] raw: Prohibit dangerous writes for probed images Kevin Wolf
  2014-10-30 12:43   ` Kevin Wolf
@ 2014-10-30 14:27   ` Eric Blake
  2014-10-31  9:34     ` Kevin Wolf
  2014-11-04 15:41   ` Stefan Hajnoczi
  2 siblings, 1 reply; 10+ messages in thread
From: Eric Blake @ 2014-10-30 14:27 UTC (permalink / raw)
  To: Kevin Wolf, qemu-devel; +Cc: jcody, armbru, stefanha, mreitz

[-- Attachment #1: Type: text/plain, Size: 1221 bytes --]

On 10/30/2014 06:26 AM, Kevin Wolf wrote:
> 
> The other differences of this patch to the old one are that it doesn't
> silently write something different than the guest requested by zeroing
> out some bytes (it fails the request instead) and that it doesn't
> maintain a list of signatures in the raw driver (it calls the usual
> probe function instead).
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c                   |  5 +++--

> +
> +        drv = bdrv_probe_all(buf, 512, NULL);
> +        if (drv != bs->drv) {
> +            ret = -EPERM;
> +            goto fail;
> +        }

So, what happens when this returns -EPERM?  If the guest is configured
to halt on write errors, does this halt the guest and send an event to
management?  How does it compare to the case of halting on ENOSPACE?  Is
this particular failure mode something that the host should be able to
easily distinguish from other failure modes?

But I definitely like that you only do this failure on probed images,
and that a user that requests an explicit raw format will never trip up.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] raw: Prohibit dangerous writes for probed images
  2014-10-30 14:27   ` Eric Blake
@ 2014-10-31  9:34     ` Kevin Wolf
  0 siblings, 0 replies; 10+ messages in thread
From: Kevin Wolf @ 2014-10-31  9:34 UTC (permalink / raw)
  To: Eric Blake; +Cc: jcody, mreitz, qemu-devel, stefanha, armbru

[-- Attachment #1: Type: text/plain, Size: 1654 bytes --]

Am 30.10.2014 um 15:27 hat Eric Blake geschrieben:
> On 10/30/2014 06:26 AM, Kevin Wolf wrote:
> > 
> > The other differences of this patch to the old one are that it doesn't
> > silently write something different than the guest requested by zeroing
> > out some bytes (it fails the request instead) and that it doesn't
> > maintain a list of signatures in the raw driver (it calls the usual
> > probe function instead).
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  block.c                   |  5 +++--
> 
> > +
> > +        drv = bdrv_probe_all(buf, 512, NULL);
> > +        if (drv != bs->drv) {
> > +            ret = -EPERM;
> > +            goto fail;
> > +        }
> 
> So, what happens when this returns -EPERM?  If the guest is configured
> to halt on write errors, does this halt the guest and send an event to
> management?  How does it compare to the case of halting on ENOSPACE?  Is
> this particular failure mode something that the host should be able to
> easily distinguish from other failure modes?

This -EPERM is returned the same way as error that come directly from
the kernel, so the usual werror/rerror rules apply. It can easily be
distinguished from ENOSPC (nospace=false in the QMP event,
io-status=failed in BlockInfo for query-block), but it looks the same as
a regular I/O error.

> But I definitely like that you only do this failure on probed images,
> and that a user that requests an explicit raw format will never trip up.

Right, a management tool should always be passing the format explicitly
and shouldn't need to deal with this error case at all.

Kevin

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] block: Read only one sector for format probing
  2014-10-30 12:26 ` [Qemu-devel] [PATCH 2/4] block: Read only one sector for format probing Kevin Wolf
@ 2014-11-04 15:32   ` Stefan Hajnoczi
  0 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2014-11-04 15:32 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: jcody, mreitz, qemu-devel, armbru

[-- Attachment #1: Type: text/plain, Size: 715 bytes --]

On Thu, Oct 30, 2014 at 01:26:14PM +0100, Kevin Wolf wrote:
> The only image format driver that even potentially accesses anything
> after 512 bytes in its bdrv_probe() implementation is VMDK, which reads
> a plain-text descriptor file. In practice, the field it's looking for
> seems to come first and will be well within the first 512 bytes, too.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c                   | 2 +-
>  include/block/block_int.h | 2 ++
>  2 files changed, 3 insertions(+), 1 deletion(-)

I tried to find a logical reason for the historical 1024 and 2048 byte
sizes but it's not clear.  512 should be fine.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] raw: Prohibit dangerous writes for probed images
  2014-10-30 12:26 ` [Qemu-devel] [PATCH 3/4] raw: Prohibit dangerous writes for probed images Kevin Wolf
  2014-10-30 12:43   ` Kevin Wolf
  2014-10-30 14:27   ` Eric Blake
@ 2014-11-04 15:41   ` Stefan Hajnoczi
  2 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2014-11-04 15:41 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: jcody, mreitz, qemu-devel, armbru

[-- Attachment #1: Type: text/plain, Size: 5565 bytes --]

On Thu, Oct 30, 2014 at 01:26:15PM +0100, Kevin Wolf wrote:
> If the user neglects to specify the image format, QEMU probes the
> image to guess it automatically, for convenience.
> 
> Relying on format probing is insecure for raw images (CVE-2008-2004).
> If the guest writes a suitable header to the device, the next probe
> will recognize a format chosen by the guest.  A malicious guest can
> abuse this to gain access to host files, e.g. by crafting a QCOW2
> header with backing file /etc/shadow.
> 
> Commit 1e72d3b (April 2008) provided -drive parameter format to let
> users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
> optionally store the backing file format, to let users disable backing
> file probing.  QED has had a flag to suppress probing since the
> beginning (2010), set whenever a raw backing file is assigned.
> 
> All of these additions that allow to avoid format probing have to be
> specified explicitly. The default still allows the attack.
> 
> In order to fix this, commit 79368c8 (July 2010) put probed raw images
> in a restricted mode, in which they wouldn't be able to overwrite the
> first few bytes of the image so that they would identify as a different
> image. If a write to the first sector would write one of the signatures
> of another driver, qemu would instead zero out the first four bytes.
> This patch was later reverted in commit 8b33d9e (September 2010) because
> it didn't get the handling of unaligned qiov members right.
> 
> Today's block layer that is based on coroutines and has qiov utility
> functions makes it much easier to get this functionality right, so this
> patch implements it.
> 
> The other differences of this patch to the old one are that it doesn't
> silently write something different than the guest requested by zeroing
> out some bytes (it fails the request instead) and that it doesn't
> maintain a list of signatures in the raw driver (it calls the usual
> probe function instead).
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c                   |  5 +++--
>  block/raw_bsd.c           | 46 +++++++++++++++++++++++++++++++++++++++++++++-
>  include/block/block_int.h |  3 +++
>  3 files changed, 51 insertions(+), 3 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 8006685..bd7b2b5 100644
> --- a/block.c
> +++ b/block.c
> @@ -655,8 +655,8 @@ BlockDriver *bdrv_find_protocol(const char *filename,
>   * probing score.
>   * Return the first block driver with the highest probing score.
>   */
> -static BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size,
> -                                   const char *filename)
> +BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size,
> +                            const char *filename)
>  {
>      int score_max = 0, score;
>      BlockDriver *drv = NULL, *d;
> @@ -1482,6 +1482,7 @@ int bdrv_open(BlockDriverState **pbs, const char *filename,
>      }
>  
>      /* Image format probing */
> +    bs->probed = !drv;
>      if (!drv && file) {
>          ret = find_image_format(file, filename, &drv, &local_err);
>          if (ret < 0) {
> diff --git a/block/raw_bsd.c b/block/raw_bsd.c
> index 401b967..80f3a50 100644
> --- a/block/raw_bsd.c
> +++ b/block/raw_bsd.c
> @@ -58,8 +58,52 @@ static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num,
>  static int coroutine_fn raw_co_writev(BlockDriverState *bs, int64_t sector_num,
>                                        int nb_sectors, QEMUIOVector *qiov)
>  {
> +    void *buf = NULL;
> +    BlockDriver *drv;
> +    QEMUIOVector local_qiov;
> +    int ret;
> +
> +    if (bs->probed && sector_num == 0) {
> +        /* As long as these conditions are true, we can't get partial writes to
> +         * the probe buffer and can just directly check the request. */
> +        QEMU_BUILD_BUG_ON(BLOCK_PROBE_BUF_SIZE != 512);
> +        QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != 512);
> +
> +        buf = g_try_malloc(512);
> +        if (!buf) {
> +            ret = -ENOMEM;
> +            goto fail;
> +        }
> +
> +        ret = qemu_iovec_to_buf(qiov, 0, buf, 512);
> +        if (ret != 512) {
> +            ret = -EINVAL;
> +            goto fail;
> +        }
> +
> +        drv = bdrv_probe_all(buf, 512, NULL);
> +        if (drv != bs->drv) {
> +            ret = -EPERM;
> +            goto fail;
> +        }
> +
> +        /* Use the checked buffer, a malicious guest might be overwriting its
> +         * original buffer in the background. */
> +        qemu_iovec_init(&local_qiov, qiov->niov + 1);
> +        qemu_iovec_add(&local_qiov, buf, 512);
> +        qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size - 512);
> +        qiov = &local_qiov;
> +    }
> +
>      BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
> -    return bdrv_co_writev(bs->file, sector_num, nb_sectors, qiov);
> +    ret = bdrv_co_writev(bs->file, sector_num, nb_sectors, qiov);
> +
> +fail:
> +    if (qiov == &local_qiov) {
> +        qemu_iovec_destroy(&local_qiov);
> +    }
> +    g_free(buf);
> +    return ret;
>  }

Looking at this patch, I'm surprised to say I actually kind of like it.

Although I think it's problematic to restrict the guest from doing
something that a real machine is allowed to do, it only happens in
limited circumstances:
1. The image format was probed
2. The format was determined to be raw

I'd be okay with this.  Less invasive than I imagined.

Stefan

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-11-04 15:42 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-30 12:26 [Qemu-devel] [PATCH 0/4] raw: Prohibit dangerous writes for probed images Kevin Wolf
2014-10-30 12:26 ` [Qemu-devel] [PATCH 1/4] block: Factor bdrv_probe_all() out of find_image_format() Kevin Wolf
2014-10-30 12:26 ` [Qemu-devel] [PATCH 2/4] block: Read only one sector for format probing Kevin Wolf
2014-11-04 15:32   ` Stefan Hajnoczi
2014-10-30 12:26 ` [Qemu-devel] [PATCH 3/4] raw: Prohibit dangerous writes for probed images Kevin Wolf
2014-10-30 12:43   ` Kevin Wolf
2014-10-30 14:27   ` Eric Blake
2014-10-31  9:34     ` Kevin Wolf
2014-11-04 15:41   ` Stefan Hajnoczi
2014-10-30 12:26 ` [Qemu-devel] [PATCH 4/4] qemu-iotests: Test writing non-raw image headers to raw image Kevin Wolf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).