From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:44959)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1XmOYE-0001Ql-3A for qemu-devel@nongnu.org;
	Thu, 06 Nov 2014 10:00:48 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1XmOY5-0007oC-V7 for qemu-devel@nongnu.org;
	Thu, 06 Nov 2014 10:00:41 -0500
Received: from mx1.redhat.com ([209.132.183.28]:36104)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1XmOY5-0007nj-Kc for qemu-devel@nongnu.org;
	Thu, 06 Nov 2014 10:00:33 -0500
Received: from int-mx11.intmail.prod.int.phx2.redhat.com
	(int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24])
	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id sA6F0Vim031254
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL)
	for ; Thu, 6 Nov 2014 10:00:32 -0500
Message-ID: <545B8D0C.4040600@redhat.com>
Date: Thu, 06 Nov 2014 16:00:28 +0100
From: Max Reitz
MIME-Version: 1.0
References: <87lhnq3iul.fsf@blackfin.pond.sub.org>
	<5459E210.2020008@redhat.com> <87a944y0od.fsf@blackfin.pond.sub.org>
	<545B6F4F.4050106@redhat.com> <20141106145658.GD23802@localhost.localdomain>
In-Reply-To: <20141106145658.GD23802@localhost.localdomain>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Image probing: how it can be insecure, and what we could do about it
To: Jeff Cody
Cc: Kevin Wolf , Markus Armbruster , Stefan Hajnoczi , qemu-devel@nongnu.org

On 2014-11-06 at 15:56, Jeff Cody wrote:
> On Thu, Nov 06, 2014 at 01:53:35PM +0100, Max Reitz wrote:
>> On 2014-11-06 at 13:26, Markus Armbruster wrote:
>>> Max Reitz writes:
>>>
>>>> On 2014-11-04 at 19:45, Markus Armbruster wrote:
>>>>> I'll try to explain all solutions fairly. Isn't easy when you're as
>>>>> biased towards one of them as I am. Please bear with me.
>>>>>
>>>>>
>>>>> = The trust boundary between image contents and meta-data =
>>>>>
>>>>> A disk image consists of image contents and meta-data.
>>>>>
>>>>> Example: all of a raw image's contents is image contents. Leaves just
>>>>> file name and attributes for meta-data.
>>>>>
>>>>> Example: QCOW2 meta-data includes header, header extensions, L1 table,
>>>>> L2 tables, ... The meta-data defines where in the image the actual
>>>>> contents is stored.
>>>>>
>>>>> A guest can access the image contents, not the meta-data.
>>>>>
>>>>> Image contents you've let an untrusted guest write is untrusted.
>>>>>
>>>>> Therefore, there's a trust boundary between image contents and
>>>>> meta-data. QEMU has to trust image meta-data, but shouldn't trust image
>>>>> contents. The exact location of the trust boundary depends on the image
>>>>> format.
>>>>>
>>>>>
>>>>> = How we instruct QEMU what to trust =
>>>>>
>>>>> By configuring QEMU to use an image, the user instructs QEMU to trust
>>>>> the image's meta-data.
>>>>>
>>>>> When the user's configuration specifies the image format explicitly, the
>>>>> trust boundary is clear.
>>>>>
>>>>> Else, the trust boundary is ambiguous when more than one format is
>>>>> possible.
>>>>>
>>>>> QEMU resolves this ambiguity by picking the first format with the
>>>>> highest "score". Raw format is always possible, and always has the
>>>>> lowest score.
>>>>>
>>>>>
>>>>> = How this lets the guest escape isolation =
>>>>>
>>>>> Unfortunately, this lets the guest shift the trust boundary and escape
>>>>> isolation, as follows:
>>>>>
>>>>> * Expose a raw image to the guest (whether you specify the format=raw or
>>>>>   let QEMU guess it doesn't matter). The complete contents becomes
>>>>>   untrusted.
>>>>>
>>>>> * Reuse the image *without* specifying the raw format. QEMU guesses the
>>>>>   format based on untrusted image contents. Now QEMU guesses a format
>>>>>   chosen by the guest, with meta-data chosen by the guest.
>>>>>   By controlling image meta-data, the malicious guest can access
>>>>>   arbitrary files as QEMU, enlarge its storage, and more. A
>>>>>   non-malicious guest can accidentally DoS itself, by writing a
>>>>>   pattern probing recognizes.
>>>> Thank you for bringing that to my attention. This means that I'm even
>>>> more in favor of using Kevin's patches because in fact they don't
>>>> break anything.
>>> They break things differently. The difference may or may not matter.
>>>
>>> Example: innocent guest writes a recognized pattern.
>>>
>>> Now: next restart fails, guest DoSed itself until host operator gets
>>> around to adding format=raw to the configuration. Consequence:
>>> downtime (probably lengthy), but no data corruption.
>>>
>>> With Kevin's patch: write fails, guest may or may not handle the
>>> failure gracefully. Consequences can range from "guest complains to
>>> its logs (who cares)" via "guest stops whatever it's doing and refuses
>>> to continue until its hardware gets fixed (downtime as above)" to
>>> "data corruption".
>> You somehow seem convinced that writing to sector 0 is a completely
>> normal operation. For x86, it isn't, though.
>>
>> There are only a couple of programs which do that, I can only think
>> of partitioning and setting up boot loaders. There's not a myriad of
>> programs which would increase the probability of one both writing a
>> recognizable pattern *and* not handling EPERM correctly.
>>
>> I see the probability of both happening at the same time as
>> extremely low, not least because there are only a handful of
>> programs which access that sector.
>>
> I'm not yet opposed to the "restricted-raw" method, but...
>
> I think the above is a somewhat dangerous viewpoint to take with QEMU.
> It is a bit of a slippery slope to start to assume what data guests
> will write to the disks provided to them.
> Even if the probability of this happening is very low, with what
> usage we envision now, it is still entirely legitimate usage for a
> guest to write data starting at sector 0.

Then let's officially deprecate format probing, if we haven't done so
already. That way, there's no excuse.

What I'm saying is that there are obviously no compatibility issues.
There is no guest software which did write recognizable patterns (so far
nobody provided a counterexample), and since format probing is
deprecated (or should be), you have no excuse for running future guests
in qemu without having explicitly specified the format. And if you are
specifying the format, Kevin's patches will not prevent the guest from
making its disk a qcow2 image whatsoever.

Max
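
Illustration (a toy model, not QEMU's code): the probing rule quoted in
this thread (pick the first format with the highest score, raw always
possible with the lowest score) and the effect of specifying the format
explicitly can be sketched roughly as below. The names guess_format and
open_image and the scores 100 and 1 are assumptions made up for this
sketch; only the qcow2 magic bytes "QFI\xfb" are taken from the real
format.

```python
from typing import Callable, List, Optional, Tuple

# First four bytes of a qcow2 header (the real magic number).
QCOW2_MAGIC = b"QFI\xfb"


def probe_qcow2(buf: bytes) -> int:
    # Hypothetical score: 100 if the magic matches, 0 otherwise.
    return 100 if buf[:4] == QCOW2_MAGIC else 0


def probe_raw(buf: bytes) -> int:
    # Raw is always possible and always gets the lowest non-zero score.
    return 1


# Order matters: on equal scores, the first entry wins.
PROBERS: List[Tuple[str, Callable[[bytes], int]]] = [
    ("qcow2", probe_qcow2),
    ("raw", probe_raw),
]


def guess_format(buf: bytes) -> str:
    """Return the first format with the highest probe score."""
    best_name, best_score = "raw", 0
    for name, probe in PROBERS:
        score = probe(buf)
        if score > best_score:  # strict '>' keeps the first of equal scores
            best_name, best_score = name, score
    return best_name


def open_image(buf: bytes, fmt: Optional[str] = None) -> str:
    """Return the format used; an explicit fmt skips probing entirely."""
    return fmt if fmt is not None else guess_format(buf)


if __name__ == "__main__":
    clean = bytes(512)                       # sector 0 of a fresh raw image
    tainted = QCOW2_MAGIC + bytes(508)       # guest wrote the magic to sector 0
    print(open_image(clean))                 # raw
    print(open_image(tainted))               # qcow2: the guest chose the format
    print(open_image(tainted, fmt="raw"))    # raw: contents are never consulted
```

The last call corresponds to having format=raw in the configuration: the
guest-controlled contents no longer influence which format is used.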