From: Anthony Liguori <anthony@codemonkey.ws>
To: Markus Armbruster <armbru@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Christoph Hellwig <hch@lst.de>,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] Make default invocation of block drivers safer
Date: Fri, 16 Jul 2010 11:16:31 -0500
Message-ID: <4C4085DF.1080307@codemonkey.ws>
In-Reply-To: <m31vb3za1i.fsf@blackfin.pond.sub.org>

On 07/16/2010 11:06 AM, Markus Armbruster wrote:
> Anthony Liguori<anthony@codemonkey.ws>  writes:
>
>    
>> On 07/15/2010 10:19 AM, Markus Armbruster wrote:
>>      
>>> Anthony Liguori<anthony@codemonkey.ws>   writes:
>>>
>>>
>>>        
>>>> On 07/14/2010 01:43 PM, Christoph Hellwig wrote:
>>>>
>>>>          
>>>>> Err, strong NACK.  Please don't start messing with the contents of the
>>>>> data plane, we're getting into real trouble there.  It's perfectly
>>>>> valid for a guest to create an image inside an image, and with hardware
>>>>> support for nested virtualization I guess this use case will become
>>>>> rather common, just as it already is on S/390 with VM.
>>>>>
>>>>>
>>>>>            
>>>> Then we have to remove block format probing.
>>>>
>>>> The two things are fundamentally incompatible.
>>>>
>>>>          
>>> I agree with Christoph: changing guest writes is a big no-no, and
>>> changing them silently is even worse.
>>>
>>>        
>> I do sympathize.  The problem is we're already doing this.  This patch
>> simply changes the behavior to not be a security problem.  I've
>> committed it to attempt to resolve that security problem.  However, we
>> still have a problem and I don't consider the issue closed.
>>
>>      
>>> I could perhaps accept EIO.  Elsewhere in this thread you wrote that you
>>> rejected that approach because "it would trigger the stop-on-error
>>> behavior and the result would be far too difficult for a management
>>> tool/person to deal with."  I think that would be *far* superior in
>>> fact: it fails spectacularly, immediately and safely instead of silently
>>> corrupting disk contents.
>>>
>>>        
>> There's really nothing wrong with this type of write, so EIO doesn't
>> solve the problem.  While we can argue whether writing zeros or EIO is
>> a "better bad" solution, let's try to figure out a good solution.
>>
>>      
>>> The real problem in need of fixing is the unsafe default.  You wrote
>>> that "most users want block probing".  I disagree.  Users want to set up
>>> drives with as little hassle as possible.  If format is optional, and
>>> appears to work, why bother specifying it?
>>>        
>> I really think specifying the format is a burden that is nice to avoid.
>>      
> Yes, users don't like having to specify the "obvious".
>
>    
>> I have another idea that I hope will solve the problem in a more
>> complete way.  The fundamental issue is that it's impossible to probe
>> raw images reliably.  We can probe qcow2, vmdk, etc but not raw.
>>
>> So, let's do the following: have raw_probe() always fail.  Probing
>> shouldn't be a heuristic, it should be an absolute.  We can't prove
>> it's a raw image, so we should always fail.
>>      
> Note: if we stop right here, the security hole is patched, but use of
> raw images requires explicit specification of format.
>
>    
>> To accommodate current use-cases with raw, let's introduce a new format
>> called "probed_raw".  probed_raw's semantics will be the following:
>>
>> The signature of a probed_raw will be ~{'QFI\xfb', 'VMDK', 'COWD',
>> 'OOOM', ...}.  If the signature is 'QRAW', then instead of reading the
>> first sector at offset 0, we read the first sector at offset LENGTH,
>> where LENGTH is computed as FILE_SIZE - 512.
>>
>> For probed_raw, write requests to sector 0 are checked.  If the first
>> four bytes are an invalid probed_raw signature or QRAW, we write a QRAW
>> signature to file offset 0 and copy the first sector to the end of the
>> file, redirecting reads and writes to the end of the file.
>>      
> Doesn't this require an image that can grow?  What about host block
> devices?
>    

I don't believe we probe host block devices.  We assume they're raw 
which means they would never be probed_raw.
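
To make the sector 0 handling quoted above concrete, here is a rough
Python sketch.  It is not QEMU code: the ProbedRaw class, its method
names, and the in-memory bytearray standing in for the image file are
all made up for illustration.  It also assumes that the rest of file
sector 0 is zero-filled once the QRAW marker is written, which the
proposal does not specify, and the signature list is incomplete because
the original list ends in "...".

```python
SECTOR = 512
QRAW = b"QRAW"
# Signatures named in the proposal; the original list is open-ended.
KNOWN_SIGNATURES = (b"QFI\xfb", b"VMDK", b"COWD", b"OOOM")


def needs_redirect(sector0):
    """True if writing this sector at offset 0 would confuse probing."""
    head = bytes(sector0[:4])
    return head == QRAW or head in KNOWN_SIGNATURES


class ProbedRaw:
    """Hypothetical model of the proposed format over a growable image."""

    def __init__(self, image):
        self.image = image  # bytearray, stands in for the image file

    def _redirected(self):
        return bytes(self.image[:4]) == QRAW

    def read_sector0(self):
        # With a QRAW marker, guest sector 0 lives at FILE_SIZE - 512.
        if self._redirected():
            return bytes(self.image[-SECTOR:])
        return bytes(self.image[:SECTOR])

    def write_sector0(self, data):
        if len(data) != SECTOR:
            raise ValueError("sector 0 writes must be exactly 512 bytes")
        if needs_redirect(data):
            if not self._redirected():
                # Grow the file by one sector and mark offset 0.
                # Assumption: the remainder of file sector 0 is zeroed.
                self.image.extend(bytes(SECTOR))
                self.image[:SECTOR] = QRAW + bytes(SECTOR - 4)
            self.image[-SECTOR:] = data
        else:
            if self._redirected():
                # Harmless signature again: drop the spare sector so
                # the file is a plain raw image once more.
                del self.image[-SECTOR:]
            self.image[:SECTOR] = data
```

The sketch only works because the image can grow by one sector, which
is why the question about host block devices matters.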

>> An approach like this has the following properties:
>>
>> 1) We can make the bdrv_probe check 100% reliable and return a boolean.
>> 2) In the cases where we know format=raw, none of this code is ever
>> invoked.
>> 3) probed_raw images look exactly like raw images in most cases.
>> 4) In the degenerate cases, probed_raw images are still mountable in
>> the normal way.
>> 5) Even after the QRAW signature is applied, if the guest writes a
>> valid signature, we can truncate the file and make it appear as a
>> normal raw image.
>>
>> Christoph/Markus/Stefan, does this seem like a more reasonable approach?
>>      
> I'm not convinced it's a good idea.  It's clearly a less bad idea,
> though :)
>
> It avoids guest-visible lossage, and that's good.
>
> There's still host-visible lossage: as soon as we redirect sector 0, the
> image isn't raw anymore, and accessing it with non-qemu tools (say
> losetup + kpartx) no longer works.  You need to know what QEMU did to
> your no-longer-raw image to work around the lossage (say losetup -o
> 512).
>    

Yeah, but as previously discussed, we can't probe raw.  So probed_raw 
ends up being a compromise.
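
As a rough illustration of what "absolute" probing could look like,
here is a small Python sketch.  The function names are hypothetical and
this is not the actual bdrv_probe interface; the signature list is the
partial one from the proposal.  Each probe returns a plain boolean, raw
never claims an image, and probed_raw claims exactly the images that no
known signature matches.

```python
# Signatures named in the proposal; the original list is open-ended.
KNOWN_SIGNATURES = (b"QFI\xfb", b"VMDK", b"COWD", b"OOOM")


def magic_probe(header, magic):
    """A format with a magic number can be probed with certainty."""
    return bytes(header[:len(magic)]) == magic


def raw_probe(header):
    """We can never prove an image is raw, so never claim it."""
    return False


def probed_raw_probe(header):
    """Complement of every known signature (a QRAW header matches too)."""
    return not any(magic_probe(header, m) for m in KNOWN_SIGNATURES)
```

With this split, plain raw is only ever used when it is requested
explicitly.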

>>>     That they get an unsafe
>>> default that way is a big surprise to them.  And I can't blame them!
>>> Users can reasonably expect programs not to trap them.
>>>
>>> If we want to let users define drives without having to specify the
>>> format, we can guess the format from the file name.
>>>        
> I still think guessing the format from the file name is a better
> way to spare users from having to specify formats.
>    
I think that would be true if we had done it from day 1, but it would
have a huge impact on users if we did it today.

Regards,

Anthony Liguori
