Re: [Qemu-devel] [RFC] Propose the Fast Virtual Disk (FVD) image format that outperforms QCOW2 by 249%


From: Stefan Weil <weil@mail.berlios.de>
To: Chunqiang Tang <ctang@us.ibm.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC] Propose the Fast Virtual Disk (FVD) image format that outperforms QCOW2 by 249%
Date: Sat, 15 Jan 2011 18:27:17 +0100
Message-ID: <4D31D8F5.3090700@mail.berlios.de>
In-Reply-To: <OFF6DFF2E0.4CB6499B-ON85257819.0012401A-85257819.0012FB87@us.ibm.com>

On 15.01.2011 04:28, Chunqiang Tang wrote:
>> The community block I/O test suite is qemu-iotests:
>> http://git.kernel.org/?p=linux/kernel/git/hch/qemu-iotests.git;a=summary
>> If you have tests that you'd like to contribute, please put them into
>> that framework so other developers can run them as part of their
>> regular testing.
>
> Hi Stefan,
>
> What I described is not a qemu-io test case. I also use qemu-io, which is
> very helpful, but I observed that qemu-io has several limitations in
> discovering elusive bugs:
>
> B1) qemu-io cannot trigger many race condition bugs, because it does not
> fully control the timing of events. For example, qemu-io cannot test this
> scenario: three concurrent writes a, b, and c are processed by
> bdrv_aio_writev() in the order Pa, Pb, Pc; their data is actually
> persisted on disk in a different order, Wc, Wa, Wb; and finally their
> callbacks are invoked in yet another order, Vb, Vc, Va. Race condition
> bugs (e.g., inappropriate locking) may exist in code that does not
> anticipate that such orderings are possible. This is just one example. In
> theory, there can be 100 concurrent reads or writes, and their events can
> happen in an arbitrary permutation. It is nearly impossible to manually
> generate test cases for all of them.
>
> B2) Even if a race condition bug is triggered by chance, its behavior
> depends on subtle event timing that is hard to repeat and hence hard to
> debug.
>
> B3) With qemu-io, it is hard to test code paths that handle I/O failures.
> For example, a disk write may fail due to a disk media error. Because
> these errors are rare, the failure handling code paths may never be
> tested. They may, for example, contain a null pointer bug that crashes the
> entire VM, or gradually leak resources (e.g., memory) due to incomplete
> cleanup.
>
> B4) qemu-io requires manually creating test cases, which is not only time
> consuming but also leads to low test coverage. This is because many bugs
> happen in scenarios that the developers do not anticipate, and hence do
> not know how to create test cases for in the first place.
>
> The FVD patch includes a new testing framework that addresses the above
> issues. This testing framework is orthogonal to FVD and can be used to
> test other block device drivers as well. It includes two components that
> can be used either separately or in combination:
>
> T1) To address problems B1-B3, I implemented an emulated disk in
> block/sim.c, which allows full control of event timing, either manually
> or automatically. Given the three concurrent writes in the example above,
> their 9 events (Pa, Pb, Pc, Wa, Wb, Wc, Va, Vb, and Vc) can be precisely
> controlled to execute in any given order. Moreover, the emulated disk
> can inject disk I/O errors in a controlled manner. For example, it can
> fail a specific read or write to test how the code handles that, or it
> can even fail as many as 90% of the reads/writes to test whether the code
> has resource leaks. qemu-io is extended with a module qemu-io-sim.c to
> work with the emulated disk block/sim.c, so that the tester can use the
> qemu-io console to manually control the order of events or fail disk
> reads or writes.
>
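
A minimal sketch of the idea behind T1, written in Python rather than the C of block/sim.c. It does not reproduce the patch's code: the names SimDisk and run_schedule, the payloads, and the use of -5 as a stand-in for -EIO are all assumptions made for illustration. Each write is split into the three events named above (P = processed, W = persisted, V = callback invoked), and the caller dictates the exact order in which they run.

```python
class SimDisk:
    """Toy emulated disk: every write passes through three explicit events."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.pending = {}   # write name -> (offset, payload)
        self.fail = set()   # names of writes that should fail (fault injection)
        self.log = []       # events in the order they were executed

    def submit(self, name, offset, payload):
        # Event P<name>: the request is accepted, nothing is persisted yet.
        self.pending[name] = (offset, bytes(payload))
        self.log.append("P" + name)

    def persist(self, name):
        # Event W<name>: data reaches the medium unless a failure is injected.
        offset, payload = self.pending[name]
        if name not in self.fail:
            self.data[offset:offset + len(payload)] = payload
        self.log.append("W" + name)

    def complete(self, name):
        # Event V<name>: the completion callback fires with a status code.
        self.pending.pop(name)
        self.log.append("V" + name)
        return -5 if name in self.fail else 0  # -5 stands in for -EIO


def run_schedule(disk, writes, schedule):
    """Execute events such as 'Pa', 'Wc', 'Vb' in exactly the given order."""
    results = {}
    for event in schedule:
        kind, name = event[0], event[1:]
        if kind == "P":
            offset, payload = writes[name]
            disk.submit(name, offset, payload)
        elif kind == "W":
            disk.persist(name)
        elif kind == "V":
            results[name] = disk.complete(name)
        else:
            raise ValueError("unknown event " + event)
    return results


# Three overlapping writes and the ordering from the example in B1.
writes = {"a": (0, b"AAAA"), "b": (2, b"BBBB"), "c": (1, b"CCCC")}
schedule = ["Pa", "Pb", "Pc", "Wc", "Wa", "Wb", "Vb", "Vc", "Va"]

disk = SimDisk(8)
results = run_schedule(disk, writes, schedule)

# Same schedule, but write b is made to fail.
faulty = SimDisk(8)
faulty.fail.add("b")
faulty_results = run_schedule(faulty, writes, schedule)

print(bytes(disk.data), results)
print(bytes(faulty.data), faulty_results)
```

With the persist order Wc, Wa, Wb the last writer wins on the overlapping bytes, so the first disk ends up holding AABBBB followed by two zero bytes. In the second run write b never reaches the medium and its callback reports the error, which is the kind of path B3 says is otherwise hard to exercise.
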
> T2) The solution in T1 still does not address problem B4, i.e.,
> manually generating test cases is time consuming and gives low coverage.
> This problem is solved by a new testing tool called qemu-test. qemu-test
> can 1) automatically generate an unlimited number of randomized test
> cases that, e.g., execute 1,000 concurrent disk reads or writes on
> overlapping disk regions; and 2) automatically generate the corresponding
> expected results, run the tests, and compare the actual results with the
> expected ones. Once it discovers a difference, which indicates a bug, it
> halts testing and waits for the developer to debug. The randomized test
> cases created by qemu-test are driven by a pseudo random number
> generator, and hence the behavior is completely repeatable. Therefore,
> once a bug is triggered, it can be reproduced precisely, any number of
> times, to facilitate debugging, even if the bug occurs extremely rarely
> in real runs of a VM. qemu-test is fully automated. Once started, it can
> run continuously, e.g., for months, to exercise an enormous number of
> test cases.
>
> The implementation of qemu-test is actually not that complicated. It
> opens two virtual disks, the so-called truth image and test image.
> The truth image is served by a trivial synchronous block device driver so
> that its behavior is guaranteed to be correct. The test image is served
> by a real block device driver (e.g., FVD or QCOW2) that we want to test.
> qemu-test submits the same randomly generated sequence of disk I/O
> requests to the truth image and the test image, and expects that the two
> images’ contents never diverge. If they do, it indicates a bug in the
> test image’s block device driver. qemu-test works with the emulated disk
> block/sim.c so that it can randomize event timings in a controlled manner
> and can inject disk I/O errors randomly.
>
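
The truth-image/test-image comparison described above can be sketched as follows. This is an illustration under stated assumptions, not qemu-test itself: TruthImage, ChunkedImage, and run_test are hypothetical names, the "driver under test" is a toy chunk store with an optional deliberate bug, and the generator is Python's random.Random rather than whatever qemu-test uses. The seed value is borrowed from the command line later in this message purely as an example number.

```python
import random


class TruthImage:
    """Trivial synchronous reference: writes land immediately, in order."""

    def __init__(self, size):
        self.data = bytearray(size)

    def write(self, offset, payload):
        self.data[offset:offset + len(payload)] = payload

    def read(self, offset, length):
        return bytes(self.data[offset:offset + length])


class ChunkedImage:
    """Stand-in for the driver under test: stores data in fixed-size chunks."""

    CHUNK = 16

    def __init__(self, size, buggy=False):
        self.size = size
        self.chunks = {}    # chunk index -> bytearray(CHUNK), allocated lazily
        self.buggy = buggy

    def write(self, offset, payload):
        for i, byte in enumerate(payload):
            index, within = divmod(offset + i, self.CHUNK)
            if self.buggy and within == self.CHUNK - 1:
                continue    # deliberate bug: drops the last byte of each chunk
            chunk = self.chunks.setdefault(index, bytearray(self.CHUNK))
            chunk[within] = byte

    def read(self, offset, length):
        out = bytearray()
        for pos in range(offset, offset + length):
            index, within = divmod(pos, self.CHUNK)
            chunk = self.chunks.get(index)
            out.append(chunk[within] if chunk else 0)  # unallocated reads as 0
        return bytes(out)


def run_test(seed, test_image, size=256, rounds=200):
    """Return the first round at which the images diverge, or None."""
    rng = random.Random(seed)   # seeded, so the request sequence is repeatable
    truth = TruthImage(size)
    for round_no in range(rounds):
        offset = rng.randrange(size)
        length = rng.randint(1, size - offset)
        # Non-zero bytes, so a dropped byte is visible against a fresh image.
        payload = bytes(rng.randrange(1, 256) for _ in range(length))
        truth.write(offset, payload)
        test_image.write(offset, payload)
        if test_image.read(0, size) != truth.read(0, size):
            return round_no
    return None


seed = 116579177
clean = run_test(seed, ChunkedImage(256))
first = run_test(seed, ChunkedImage(256, buggy=True))
again = run_test(seed, ChunkedImage(256, buggy=True))
print("correct driver diverged:", clean)
print("buggy driver diverged at round", first, "and again at round", again)
```

Because the request sequence depends only on the seed, rerunning with the same seed hits the divergence at the same round, which is the repeatability property the quoted text relies on for debugging.
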
> I found qemu-test extremely powerful in discovering elusive bugs that I
> never anticipated, and using qemu-test is effortless. Whenever I
> completed a major code upgrade, I simply started qemu-test in the evening
> and came back in the morning to collect bugs, if any. Debugging them is
> also easy because the bugs are precisely repeatable even if they are hard
> to trigger.
>
> As for the QCOW2 bug I mentioned previously, it can be triggered by
> test-qcow2.sh. A faster way to trigger it is to bypass those correct test
> runs by executing the commands below:
>
> dd if=/dev/zero of=/var/ramdisk/truth.raw count=0 bs=1 seek=1155683840
> dd if=/dev/zero of=/var/ramdisk/zero-500M.raw count=0 bs=1 seek=609064448
> ./qemu-img create -f qcow2 -b /var/ramdisk/zero-500M.raw \
>     /var/ramdisk/test.qcow2 1155683840
> ./qemu-test --seed=116579177 --truth=/var/ramdisk/truth.raw \
>     --test=/var/ramdisk/test.qcow2 --verify_write=true --compare_before=false \
>     --compare_after=true --round=100000 --parallel=100 --io_size=10485760 \
>     --fail_prob=0 --cancel_prob=0 --instant_qemubh=true
>
> As for the FVD patch that includes the new testing framework, I tried to
> post it on the mailing list twice, but it bounced both times, either
> because the message is too big or because of a Notes client configuration
> issue. Until I figure it out, please download the FVD patch from
> https://researcher.ibm.com/researcher/files/us-ctang/FVD-01-14-2011.patch
>
> Best regards,
> ChunQiang (CQ) Tang, Ph.D.
> Homepage: http://www.research.ibm.com/people/c/ctang

Hi,

When I tried to use your patch, I found several problems:

* The patch does not apply cleanly to the latest QEMU.
   This is caused by recent changes in QEMU git master.

* The new code uses tabs instead of spaces (the QEMU coding rules
   require spaces).

* Some lines of the new code end with trailing whitespace.

* The patch adds empty lines at the end of some files.

The last two points are reported by newer versions of git
(which refuse to take such patches with the default setting).
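
A rough sketch of how the tab and trailing-whitespace points could be checked before resending, similar in spirit to what `git diff --check` reports. It is not part of QEMU or git: check_patch is a hypothetical helper, the sample diff is made up, and the parsing is simplified (it treats any line starting with "+++ " as a file header and does not detect blank lines added at end of file).

```python
def check_patch(patch_text):
    """Return (line_number, problem) pairs for lines that the patch adds."""
    problems = []
    for number, line in enumerate(patch_text.splitlines(), start=1):
        # Only added lines matter; "+++ " introduces a file header.
        if not line.startswith("+") or line.startswith("+++ "):
            continue
        added = line[1:]
        if "\t" in added:
            problems.append((number, "tab character"))
        if added != added.rstrip():
            problems.append((number, "trailing whitespace"))
    return problems


# Made-up unified diff used only to exercise the checker.
sample = "\n".join([
    "--- a/block/sim.c",
    "+++ b/block/sim.c",
    "@@ -1,2 +1,4 @@",
    " int a;",
    "+\tint b;",
    "+int c;  ",
    "+int d;",
])

report = check_patch(sample)
for number, problem in report:
    print("line", number, "of the patch:", problem)
```

Tabs are flagged unconditionally here, which would be wrong for files such as Makefiles where tabs are required, so a real check would need a per-file exception.
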

Could you please update your patch to fix these issues?
I'd like to apply it to my QEMU code and try the new FVD.

If needed, I could also send your patch to qemu-devel.

Kind regards,
Stefan Weil

