qemu-devel.nongnu.org archive mirror
From: Max Reitz <mreitz@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 3/3] block: Catch !bs->drv in bdrv_check()
Date: Sat, 09 Aug 2014 00:53:18 +0200
Message-ID: <53E554DE.5050409@redhat.com>
In-Reply-To: <53E53D0F.3030808@redhat.com>

On 08.08.2014 23:11, Max Reitz wrote:
> On 08.08.2014 11:15, Kevin Wolf wrote:
>> On 07.08.2014 at 22:47, Max Reitz wrote:
>>> qemu-img check calls bdrv_check() twice if the first run repaired some
>>> inconsistencies. If the first run however again triggered corruption
>>> prevention (on qcow2) due to very bad inconsistencies, bs->drv may be
>>> NULL afterwards. Thus, bdrv_check() should check whether bs->drv is 
>>> set.
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> I suppose there was a real case of this happening? I think bdrv_check()
>> triggering corruption prevention is a rather bad sign. The most
>> important point for image repair should be that it doesn't make the
>> situation any worse. Smells like a follow-up patch to the qcow2 code.
>
> Yes, as I wrote in the cover letter, using the image provided in 
> https://bugs.launchpad.net/qemu/+bug/1353456 and setting the refblock 
> offset to 0 (the reftable entry) results in a segmentation fault.
>
> A simple way to trigger corruption during bdrv_check() is creating an 
> image, setting the first (and only) reftable entry to 0 and running 
> qemu-img check -r all. bdrv_check() will try to allocate a refblock, 
> but since the first clusters are unallocated, it will allocate them 
> there which would obviously overwrite the image header and/or L1 table 
> and/or reftable.
>
> The only way I can imagine to fix this is to completely disregard the 
> on-disk refcount information during bdrv_check() and instead use only 
> the calculated refcounts. This would require dedicated allocation 
> functions, which would probably be rather simple, but in any case we'd 
> need to write them.
>
> I think I should have some time, so I'll have a look into it.

Okay, after thinking about the situation (which involved looking through 
the other bug reports by Maria), I think there is only one way to truly 
do the repair operation correctly. The general problem is that a damaged 
refcount structure may lead to a new reftable or new refblocks being 
allocated during the repair process. However, since the refcounts are 
not accurate, these new clusters may collide with existing allocations. 
We could fix this by replicating all the refcount operations for the 
in-memory refcounts (which qcow2_check_refcounts() creates), but I think 
that would be a rather bad idea.

Instead, I'd rather create completely new refcount structures in 
qcow2_check_refcounts() when so much as a single referenced cluster with 
refcount=0 is encountered. If there is any cluster which is indeed 
referenced but for which the refcount structures say it's free, any new 
allocation may break things. Since changing refcounts may result in new 
cluster allocations, we should not update the existing refcount 
structures at all.
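
To make the trigger condition concrete, here is a minimal sketch of the 
check, assuming both the on-disk and the calculated refcounts are 
available as flat arrays indexed by cluster number. 
needs_refcount_rebuild() and its parameters are hypothetical names for 
illustration only, not existing qcow2 functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): returns true if at least one
 * cluster is referenced according to the calculated refcounts while the
 * on-disk refcount structures claim it is free (refcount == 0).
 */
static bool needs_refcount_rebuild(const uint16_t *on_disk,
                                   const uint16_t *calculated,
                                   size_t nb_clusters)
{
    assert(on_disk != NULL && calculated != NULL);

    for (size_t i = 0; i < nb_clusters; i++) {
        if (calculated[i] > 0 && on_disk[i] == 0) {
            /* A new allocation could land on this in-use cluster. */
            return true;
        }
    }
    return false;
}
```

A mere mismatch (say, on-disk 2 versus calculated 1) would not trigger 
the rebuild in this sketch; only a referenced cluster that is recorded 
as free does, since that is the case in which an allocation can 
overwrite live data.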

Alternatively, we can rewrite the refcount update functions to take an 
in-memory refcount table to know which clusters to avoid, but 
considering that those functions are complicated enough already, I'd 
rather refrain from that.
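
For illustration, consulting an in-memory refcount table during 
allocation could look roughly like the sketch below. It assumes a flat 
array of calculated refcounts indexed by cluster number; 
alloc_clusters_in_memory() is a hypothetical name and does not exist in 
the qcow2 code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical allocator (not part of QEMU): finds the first run of
 * `count` contiguous clusters whose in-memory refcount is 0, marks them
 * as used, and returns the index of the first cluster, or -1 if no such
 * run exists within nb_clusters.
 */
static int64_t alloc_clusters_in_memory(uint16_t *refcounts,
                                        size_t nb_clusters,
                                        size_t count)
{
    assert(refcounts != NULL && count > 0);

    size_t run = 0;
    for (size_t i = 0; i < nb_clusters; i++) {
        /* Extend the current free run, or reset it on a used cluster. */
        run = (refcounts[i] == 0) ? run + 1 : 0;
        if (run == count) {
            size_t first = i + 1 - count;
            for (size_t j = first; j <= i; j++) {
                refcounts[j] = 1;
            }
            return (int64_t)first;
        }
    }
    return -1;
}
```

Because the image header, L1 table and reftable all have a nonzero 
calculated refcount, an allocator of this kind would skip them even when 
the on-disk refcounts wrongly report them as free.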

Max
