qemu-devel.nongnu.org archive mirror
From: Michael Roth <michael.roth@amd.com>
To: <qemu-devel@nongnu.org>
Cc: qemu-stable@nongnu.org,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>,
	"Fam Zheng" <fam@euphon.net>,
	"Maxim Levitsky" <mlevitsk@redhat.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Michal Prívozník" <mprivozn@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>
Subject: [PATCH 27/64] block/nvme: Fix VFIO_MAP_DMA failed: No space left on device
Date: Tue, 19 Oct 2021 09:09:07 -0500
Message-ID: <20211019140944.152419-28-michael.roth@amd.com>
In-Reply-To: <20211019140944.152419-1-michael.roth@amd.com>

From: Philippe Mathieu-Daudé <philmd@redhat.com>

When the NVMe block driver was introduced (see commit bdd6a90a9e5,
January 2018), Linux VFIO_IOMMU_MAP_DMA ioctl was only returning
-ENOMEM in case of error. The driver was correctly handling the
error path to recycle its volatile IOVA mappings.

To fix CVE-2019-3882, Linux commit 492855939bdb ("vfio/type1: Limit
DMA mappings per container", April 2019) added the -ENOSPC error to
signal that the user has exhausted the DMA mappings available for a
container.

Since then, the block driver misbehaves on such kernels:

  qemu-system-x86_64: VFIO_MAP_DMA failed: No space left on device
  (qemu)
  (qemu) info status
  VM status: paused (io-error)
  (qemu) c
  VFIO_MAP_DMA failed: No space left on device
  (qemu) c
  VFIO_MAP_DMA failed: No space left on device

(The VM cannot be resumed from here, so it is stuck.)

Fix by handling the new -ENOSPC error (returned when DMA mappings
are exhausted) exactly like the existing -ENOMEM error, so the
behavior does not change on old kernels where the CVE-2019-3882 fix
is not present.

An easy way to reproduce this bug is to restrict the DMA mapping
limit (65535 by default) when loading the VFIO IOMMU module:

  # modprobe vfio_iommu_type1 dma_entry_limit=666

Cc: qemu-stable@nongnu.org
Cc: Fam Zheng <fam@euphon.net>
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Michal Prívozník <mprivozn@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210723195843.1032825-1-philmd@redhat.com
Fixes: bdd6a90a9e5 ("block: Add VFIO based NVMe driver")
Buglink: https://bugs.launchpad.net/qemu/+bug/1863333
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/65
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 15a730e7a3aaac180df72cd5730e0617bcf44a5a)
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 block/nvme.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/block/nvme.c b/block/nvme.c
index 2b5421e7aa..e8dbbc2317 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -1030,7 +1030,29 @@ try_map:
         r = qemu_vfio_dma_map(s->vfio,
                               qiov->iov[i].iov_base,
                               len, true, &iova);
+        if (r == -ENOSPC) {
+            /*
+             * In addition to the -ENOMEM error, the VFIO_IOMMU_MAP_DMA
+             * ioctl returns -ENOSPC to signal the user exhausted the DMA
+             * mappings available for a container since Linux kernel commit
+             * 492855939bdb ("vfio/type1: Limit DMA mappings per container",
+             * April 2019, see CVE-2019-3882).
+             *
+             * This block driver already handles this error path by checking
+             * for the -ENOMEM error, so we directly replace -ENOSPC by
+             * -ENOMEM. Besides, -ENOSPC has a specific meaning for blockdev
+             * coroutines: it triggers BLOCKDEV_ON_ERROR_ENOSPC and
+             * BLOCK_ERROR_ACTION_STOP which stops the VM, asking the operator
+             * to add more storage to the blockdev. Not something we can do
+             * easily with an IOMMU :)
+             */
+            r = -ENOMEM;
+        }
         if (r == -ENOMEM && retry) {
+            /*
+             * We exhausted the DMA mappings available for our container:
+             * recycle the volatile IOVA mappings.
+             */
             retry = false;
             trace_nvme_dma_flush_queue_wait(s);
             if (s->dma_map_count) {
-- 
2.25.1
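
For readers who want to see the error handling in isolation, the following
is a minimal, self-contained sketch of the pattern the patch applies: map
-ENOSPC to -ENOMEM, then flush the recyclable mappings and retry once. It is
not the driver code. FakeContainer, fake_dma_map(), fake_dma_reset_temporary(),
normalize_map_error() and map_with_retry() are hypothetical names invented
for this illustration; the real driver calls qemu_vfio_dma_map() and also
waits for in-flight requests before recycling, which is left out here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a VFIO container with a DMA mapping limit. */
typedef struct {
    int limit;       /* plays the role of dma_entry_limit */
    int used;        /* mappings currently held */
    int volatile_n;  /* how many of those can be recycled */
} FakeContainer;

/* Hypothetical mapper: 0 on success, -ENOSPC once the limit is reached. */
static int fake_dma_map(FakeContainer *c)
{
    if (c->used >= c->limit) {
        return -ENOSPC;
    }
    c->used++;
    c->volatile_n++;
    return 0;
}

/* Hypothetical recycler: drop every volatile mapping. */
static void fake_dma_reset_temporary(FakeContainer *c)
{
    assert(c->used >= c->volatile_n);
    c->used -= c->volatile_n;
    c->volatile_n = 0;
}

/* Treat "mappings exhausted" like "out of memory"; pass the rest through. */
static int normalize_map_error(int r)
{
    return r == -ENOSPC ? -ENOMEM : r;
}

/* Try to map; on exhaustion, recycle once and try again. */
static int map_with_retry(FakeContainer *c)
{
    bool retry = true;

    for (;;) {
        int r = normalize_map_error(fake_dma_map(c));

        if (r == -ENOMEM && retry) {
            retry = false;
            fake_dma_reset_temporary(c);
            continue;
        }
        return r;
    }
}
```

With a limit of 2, a third call to map_with_retry() hits the limit, recycles
the two volatile mappings and then succeeds. With a limit of 0 the retry
cannot help and the caller gets -ENOMEM rather than -ENOSPC, so the block
layer never sees the error code that would pause the VM.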


