Re: [PATCH 1/1] Fix qcow2 corruption on discard

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Maxim Levitsky <mlevitsk@redhat.com>
To: Alberto Garcia <berto@igalia.com>, Kevin Wolf <kwolf@redhat.com>
Cc: vsementsov@virtuozzo.com, qemu-block@nongnu.org,
	zhang_youjia@126.com, qemu-devel@nongnu.org,
	Max Reitz <mreitz@redhat.com>,
	andrey.shinkevich@virtuozzo.com
Subject: Re: [PATCH 1/1] Fix qcow2 corruption on discard
Date: Tue, 24 Nov 2020 21:44:00 +0200	[thread overview]
Message-ID: <35b073136cba9987a6cbfc59549dbe8633e0e803.camel@redhat.com> (raw)
In-Reply-To: <f6df1eedb9785b907807e0581be12c790112d939.camel@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 2830 bytes --]

On Tue, 2020-11-24 at 20:59 +0200, Maxim Levitsky wrote:
> On Tue, 2020-11-24 at 19:59 +0100, Alberto Garcia wrote:
> > On Tue 24 Nov 2020 10:17:23 AM CET, Kevin Wolf wrote:
> > > We can then continue work to find a minimal reproducer and merge the
> > > test case in the early 6.0 cycle.
> > 
> > I haven't been able to reproduce the problem yet, do you have any
> > findings?
> > 
> > Berto
> > 
> 
> I have a working reproducer script. I'll send it in a hour or so.
> Best regards,
> 	Maxim Levitsky

I have attached a minimal reproducer for this issue.
I can convert this to an iotest if you think that this is worth it.

So these are the exact conditions for the corruption to happen:

1. Image must have at least 5 refcount tables 
(1 more that default refcount table cache size, which is 4 by default)

2. IO pattern that populates the 4 entry refcount table cache fully:

 Easiest way to do it is to have 4 L2 entries populated in the base image,
 such as each entry references a physical cluster that is served by different
 refcount table.

 Then discard these entries in the snapshot, triggering discard in the
 base file during the commit, which will populate the refcount table cache.

4. A discard of a cluster that belongs to 5th refcount table (done in the
   exact same way as above discards).
   It should be done soon, before L2 cache flushed due to some unrelated
   IO.

   This triggers the corruption:

The call stack is:

2. qcow2_free_any_cluster->
	qcow2_free_clusters->
		update_refcount:

			//This sets dependency between flushing the refcount cache and l2 cache.
    			if (decrease)
        			qcow2_cache_set_dependency(bs, s->refcount_block_cache,s->l2_table_cache);

			ret = alloc_refcount_block(bs, cluster_index, &refcount_block);
				return load_refcount_block(bs, refcount_block_offset, refcount_block);
					return qcow2_cache_get(...
						qcow2_cache_do_get
							/* because of a cache miss, we have to evict an entry*/
							ret = qcow2_cache_entry_flush(bs, c, i);
							if (c->depends) {
								/* this flushes the L2 cache */
        							ret = qcow2_cache_flush_dependency(bs, c);
							}

I had attached a reproducer that works with almost any cluster size and refcount block size.
Cluster sizes below 4K don't work because commit which is done by the mirror job which works on 4K granularity,
and that results in it not doing any discards due to various alignment restrictions.

If I patch qemu to make mirror job work on 512B granularity, test reproduces for small clusters as well.

The reproducer creates a qcow2 image in the current directory and it needs about 11G for default parameters.
(64K cluster size, 16 bit refcounts).
For 4K cluster size and 64 bit refcounts, it needs only 11M.
(This can be changed by editing the script)

Best regards,
	Maxim Levitsky

[-- Attachment #2: reproducer.sh --]
[-- Type: application/x-shellscript, Size: 2088 bytes --]

next prev parent reply	other threads:[~2020-11-24 19:46 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-23 15:49 [PATCH 0/1] Fix qcow2 corruption after addition of subcluster support Maxim Levitsky
2020-11-23 15:49 ` [PATCH 1/1] Fix qcow2 corruption on discard Maxim Levitsky
2020-11-23 16:09   ` Kevin Wolf
2020-11-23 18:20     ` Alberto Garcia
2020-11-23 18:23       ` Maxim Levitsky
2020-11-23 17:38   ` Kevin Wolf
2020-11-23 18:11     ` Maxim Levitsky
2020-11-24  9:17       ` Kevin Wolf
2020-11-24 18:59         ` Alberto Garcia
2020-11-24 18:59           ` Maxim Levitsky
2020-11-24 19:44             ` Maxim Levitsky [this message]
2020-11-25 16:49               ` Alberto Garcia

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=35b073136cba9987a6cbfc59549dbe8633e0e803.camel@redhat.com \
    --to=mlevitsk@redhat.com \
    --cc=andrey.shinkevich@virtuozzo.com \
    --cc=berto@igalia.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    --cc=zhang_youjia@126.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).