From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:35588)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <mreitz@redhat.com>) id 1Xlven-0006ed-1x
	for qemu-devel@nongnu.org; Wed, 05 Nov 2014 03:09:38 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <mreitz@redhat.com>) id 1Xlveh-0003SC-Su
	for qemu-devel@nongnu.org; Wed, 05 Nov 2014 03:09:33 -0500
Received: from mx1.redhat.com ([209.132.183.28]:38666)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <mreitz@redhat.com>) id 1Xlveh-0003S6-L8
	for qemu-devel@nongnu.org; Wed, 05 Nov 2014 03:09:27 -0500
Received: from int-mx11.intmail.prod.int.phx2.redhat.com
	(int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24])
	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id sA589RRY008451
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256
	verify=FAIL)
	for <qemu-devel@nongnu.org>; Wed, 5 Nov 2014 03:09:27 -0500
Message-ID: <5459DB34.4010207@redhat.com>
Date: Wed, 05 Nov 2014 09:09:24 +0100
From: Max Reitz <mreitz@redhat.com>
MIME-Version: 1.0
References: <1414512220-19058-1-git-send-email-armbru@redhat.com>	<1414512220-19058-3-git-send-email-armbru@redhat.com>	<20141029101242.GA3719@noname.str.redhat.com>	<877fzjc76v.fsf@blackfin.pond.sub.org>	<20141030093040.GA9097@noname.str.redhat.com>	<87oastzprh.fsf@blackfin.pond.sub.org>	<20141103110028.GC4437@noname.str.redhat.com>	<87h9yfl2xz.fsf@blackfin.pond.sub.org>	<20141104160917.GB23802@localhost.localdomain>
	<87mw86ul75.fsf@blackfin.pond.sub.org>
In-Reply-To: <87mw86ul75.fsf@blackfin.pond.sub.org>
Content-Type: text/plain; charset=iso-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH RFC 2/2] block: Warn on insecure format
	probing
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Markus Armbruster <armbru@redhat.com>, Jeff Cody <jcody@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>, qemu-devel@nongnu.org, stefanha@redhat.com

On 2014-11-05 at 09:05, Markus Armbruster wrote:
> Jeff Cody <jcody@redhat.com> writes:
>
>> On Tue, Nov 04, 2014 at 10:39:36AM +0100, Markus Armbruster wrote:
>>> Kevin Wolf <kwolf@redhat.com> writes:
>>>
>>>> Am 30.10.2014 um 13:49 hat Markus Armbruster geschrieben:
>>>>> Kevin Wolf <kwolf@redhat.com> writes:
>>>>>
>>>>>> Am 29.10.2014 um 14:54 hat Markus Armbruster geschrieben:
>>>>>>> Kevin Wolf <kwolf@redhat.com> writes:
>>>>>>>> Instead, let me try once more to sell my old proposal [1] from the
>>>>>>>> thread you mentioned:
>>>>>>>>
>>>>>>>>> What if we let the raw driver know that it was probed and then it
>>>>>>>>> enables a check that returns -EIO for any write on the
>>>>>>>>> first 2k if that
>>>>>>>>> write would make the image look like a different format?
>>>>>>>> Attacks the problem where it arises instead of trying to detect the
>>>>>>>> outcome of it, and works in whatever way it is nested in the BDS graph
>>>>>>>> and whatever way is used to address the image file.
>>>>>>>>
>>>>>>>> Kevin
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg02548.html
>>>>>>> Arbitrarily breaking the virtual hardware that way is incompatible, too.
>>>>>>> It's a bigger no-no for me than changing user interface corner cases or
>>>>>>> deciding what to do with a file based on its file name extension.
>>>>>> It is arbitrarily breaking theoretical cases, while your patch is less
>>>>>> arbitrarily breaking common cases. I strongly prefer the former.
>>>>> I challenge "theoretical" as well as "common".
>>>>>
>>>>> For "theoretical", see below.
>>>>>
>>>>> As to "common": are unrecognizable image names really common?  I doubt
>>>>> it.  If I'm wrong, I expect users to complain about my warning, and then
>>>>> we can reconsider.
>>>> As I said in my other mail, I remember seeing BZs with command lines
>>>> that don't have a file extension. Block devices are affected anyway.
>>>>
>>>> Your RFC also misses the extension .raw that I personally use
>>>> occasionally, so I would get the warning even though the image name
>>>> isn't "unrecognizable" at all. And other people probably have other
>>>> extensions in use that we don't know of.
>>>>
>>>>>> Nobody will ever want to write a qcow2 header into their boot sector for
>>>>>> just running a guest:
>>>>>>
>>>>>> 0:   51                      push   %cx
>>>>>> 1:   46                      inc    %si
>>>>>> 2:   49                      dec    %cx
>>>>>> 3:   fb                      sti
>>>>>>
>>>>>> This code doesn't make any sense. It's similar for other formats. And if
>>>>>> they really have some odd reason to do that, the fix is easy: Either
>>>>>> don't use raw, or pass an explicit format=raw.
>>>>> A disk needn't start with a PC-style boot sector.  Even on a PC.  Not
>>>>> every disk needs to be bootable, or even partitioned.  Databases exist
>>>>> that like to work on raw devices.  Giving them /dev/sdb instead of a
>>>>> whole-disk partition /dev/sdb1 may not be the smartest thing to do, but
>>>>> it *works* with real hardware, so why not with virtual hardware?
>>>> Oh, it does. Just use a format that can represent this reliably (and as
>>>> a compromise, we're going to declare explicitly requested raw as such -
>>>> again, see my other mail for details).
>>> This is "opt in safety".  I dislike that for the same reason I dislike
>>> "opt in security".
>>>
>>> Insecure: we risk guest escaping isolation.
>>>
>>> Unsafe: we risk virtual hardware breakage.
>>>
>>>>> You either have to prevent *any* writing of the first 2048 bytes (the
>>>>> part that can be examined by a bdrv_probe() method, or your have to
>>>>> prevent writing anything a probe recognizes, or the user has to specify
>>>>> the format explicitly.
>>>>>
>>>>> If you do the former, you're way outside the realm of "theoretical".
>>>> Right, that's not an option.
>>>>
>>>>> If you do the latter, the degree of "theoreticalness" depends on the
>>>>> union of the patterns you prevent.  Issues:
>>>>>
>>>>> 1. Anthony's method of checking a few known signatures is fragile.  The
>>>>> only obviously robust way to test "is probing going to find something
>>>>> other than 'raw'?" is running the probes.  Feasible.
>>>> Yes, this is what I've been talking about.
>>>>
>>>>> 2. Whether the union of patterns qualifies as "theoretical" for all our
>>>>> targets is not obvious, and whether it'll remain "theoretical" for all
>>>>> future formats and target machines even less.
>>>> At least we don't have examples of it happening yet. And even if it
>>>> happens to become not theoretical at some point, the only thing that
>>>> happens is that they need to add format=raw - something that we already
>>>> know is sure to happen with your approach.
>>>>
>>>> Once we get to know of such cases, we can probably even resolve them by
>>>> improving the probing function of the problematic format.
>>> Your proposal "improves" things from "insecure by default" to "now less
>>> insecure, but additionally a bit unsafe".
>>>
>>> Your proposed remedy for "unsafe" seems to be "if it breaks, you get to
>>> keep the pieces, but you may ask us to narrow the conditions for
>>> breakage so it wouldn't have broken for you if you had imported the
>>> changed version from the future" ;)
>>>
>>>>> 3. A write access that works just fine in one QEMU version can be
>>>>> rejected by another version that recognizes more formats.  Or probes the
>>>>> same formats differently.
>>>> True. We're only sure to protect this binary, and we have good chances
>>>> of protecting some other qemu versions and some other tools, too.
>>>>
>>>> If the attacker has a time machine and knows what the next installed
>>>> qemu version will recognize (or if they exploit a file format that is
>>>> not a disk image, with a program other than qemu), we don't fully
>>>> protect the user. We'd have to completely forbid raw for that.
>>>>
>>>>> 4. Rejecting a write fails *late*, and looks like hardware failure to
>>>>> the guest.  With imperfect guest software, this risks data corruption.
>>>> Okay. So your common case is that you have a badly written database that
>>>> gets the full unpartitioned disk assigned and doesn't start to write at
>>>> the first block, but randomly, and when it finally tries to use block 0,
>>>> it gets deadly confused by the I/O error so that it will break the data
>>>> that it has already stored elsewhere.
>>>>
>>>> Yeah right.
>>>>
>>>> And you probably deserved it then for using such software. It would have
>>>> happened anyway sooner or later.
>>> Since you fail writes to sector 0 only when a "bad" pattern is written,
>>> your hypothesis that any software I should be using surely writes sector
>>> 0 early is irrelevant.
>>>
>>> "Common" doesn't matter either (and I never claimed it was common).  If
>>> I deploy software that uses whole disks, I'd consider disks that can
>>> fail writes to sector 0, even though the actual hardware is in good
>>> working order, defective, and I wouldn't buy them[*].  Regardless of how
>>> common the failure was.  Fortunately, no such disks exist.  Now you're
>>> proposing to create some, with a special "do not break" switch (factory
>>> default "do break").  Your excuse for doing so seems to be:
>>>
>>> (a) If I'm doing something *that* outlandish, I should know which disks
>>> to buy (i.e. anything but virtual raw images), or where to find the "do
>>> not break" switch (i.e. use format=raw).
>>>
>>> (b) If the breakage causes me harm, I deserve it, because my software
>>> sucks.
>>>
>>> For me, (a) is debatable, but (b) is simply bad attitude.  Let's not go
>>> there.
>>>
>>>
>>> [*] In fact, I'd consider them defective and wouldn't buy them no matter
>>> what software I plan to use.
>> The ultimate goal (correct me if I am wrong) of your patch is to get
>> rid of content probing during open, and rely solely on filename
>> extension.
> The goal is to separate raw and non-raw images sufficiently so we can
> exercise the proper care with untrusted raw image contents.
>
> Ditching probing is just a means.  I readily admit I rather like the
> means, since I never liked probing.

As I said before, I hope you won't ditch probing, because I'd strongly 
oppose that. I am however completely fine with adding more entropy to 
probing, i.e. using the filename extension as an additional source of 
information.

>> That leaves us open to guest data corruption in the opposite way -
>> identifying a qcow2 image as a raw image.  If I have qcow2 image named
>> mydisk.img, and QEMU 'detects' it as raw, then that means the first
>> guest data write will corrupt the image (especially to sector 0).
> I worked this issue into my "Image probing: how it can be insecure, and
> what we could do about it" treatise.
>
>> So if we are not specifying the image format, and allowing QEMU to
>> determine the format, Kevin's "restricted-raw" approach actually seems
>> safer to me, and less likely to break the guest.
> Relative likelihoods are of course debatable.
>
>> The other option is to just remove any sort of probing/format
>> autodetection (aside from qemu-img info) completely.  This may break
>> scripts, but switching to filename extensions as a detection mechanism
>> may break some scripts anyway.
> It breaks a whole lot more, and with fewer workarounds.  Fine with me,
> but Kevin hates it.

It would break my heart, too. (SCNR)

Max