* [Qemu-devel] Cutting a new QEMU release @ 2009-02-03 20:48 Anthony Liguori 2009-02-03 20:58 ` Glauber Costa ` (7 more replies) 0 siblings, 8 replies; 82+ messages in thread From: Anthony Liguori @ 2009-02-03 20:48 UTC (permalink / raw) To: qemu-devel@nongnu.org What do people think? TCG seems to be in a good place. We've got virtio, KVM, live migration, tons of new devices, bsd-user, etc. We could decide to cut one by the end of the month. I'm already doing some test work in QEMU so I can follow up with some more detailed notes about what is working and what isn't working. That gives us some time to decide if there's anything we need to fix before a release. Regards, Anthony Liguori ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 20:48 [Qemu-devel] Cutting a new QEMU release Anthony Liguori @ 2009-02-03 20:58 ` Glauber Costa 2009-02-03 21:35 ` Laurent Desnogues 2009-02-03 21:48 ` Rick Vernam ` (6 subsequent siblings) 7 siblings, 1 reply; 82+ messages in thread From: Glauber Costa @ 2009-02-03 20:58 UTC (permalink / raw) To: qemu-devel On Tue, Feb 3, 2009 at 6:48 PM, Anthony Liguori <anthony@codemonkey.ws> wrote: > What do people think? TCG seems to be in a good place. We've got virtio, > KVM, live migration, tons of new devices, bsd-user, etc. > > We could decide to cut one by the end of the month. I'm already doing some > test work in QEMU so I can follow up with some more detailed notes about > what is working and what isn't working. That gives us some time to decide > if there's anything we need to fix before a release. > > Regards, I'm totally for it. -- Glauber Costa. "Free as in Freedom" http://glommer.net "The less confident you are, the more serious you have to act." ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 20:58 ` Glauber Costa @ 2009-02-03 21:35 ` Laurent Desnogues 2009-02-03 21:50 ` Anthony Liguori ` (2 more replies) 0 siblings, 3 replies; 82+ messages in thread From: Laurent Desnogues @ 2009-02-03 21:35 UTC (permalink / raw) To: qemu-devel On Tue, Feb 3, 2009 at 9:58 PM, Glauber Costa <glommer@gmail.com> wrote: > On Tue, Feb 3, 2009 at 6:48 PM, Anthony Liguori <anthony@codemonkey.ws> wrote: >> What do people think? TCG seems to be in a good place. We've got virtio, >> KVM, live migration, tons of new devices, bsd-user, etc. >> >> We could decide to cut one by the end of the month. I'm already doing some >> test work in QEMU so I can follow up with some more detailed notes about >> what is working and what isn't working. That gives us some time to decide >> if there's anything we need to fix before a release. >> >> Regards, > > I'm totally for it. So am I, but who will test user mode, and more generally (user and system), what is the test procedure? For instance someone (Andrzej?) mentioned that ARM in system mode is half slower than it was before TCG. Also the ARM target needs some fixing. Perhaps doing at least one release candidate to get feedback (and focus on fixing reported bugs) would be appropriate. Cheers, Laurent ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 21:35 ` Laurent Desnogues @ 2009-02-03 21:50 ` Anthony Liguori 2009-02-03 22:05 ` Laurent Desnogues 2009-02-04 13:09 ` Ulrich Hecht 2009-02-04 0:31 ` David Turner [not found] ` <74222928-D24B-4780-BDB0-D537A83C4F68@hotmail.com> 2 siblings, 2 replies; 82+ messages in thread From: Anthony Liguori @ 2009-02-03 21:50 UTC (permalink / raw) To: qemu-devel Laurent Desnogues wrote: > On Tue, Feb 3, 2009 at 9:58 PM, Glauber Costa <glommer@gmail.com> wrote: > >> I'm totally for it. >> > > So am I, but who will test user mode and more generally (user and system) > what is the test procedure? > I'd like to approach this gently. Historically, there's been no formal release process. I'm not inclined to start out by introducing any sort of heavyweight procedure. I'll poke things as best I can over the next couple of weeks. I encourage everyone else to do the same. I'll keep track of what's working and what's broken and make it available publicly. At some point, we can decide whether things are too embarrassing to release or not :-) > For instance someone (Andrzej?) mentioned ARM in system mode is half > slower than it was before TCG. Also the ARM target needs some fixing. > > Perhaps doing at least one release candidate to get feedback (and focus on > fixing reported bugs) would be appropriate. > A release doesn't have to be perfect to be useful. I think what matters most is whether something is likely to be fixed in the reasonably near future. We're going to have some regressions compared to 0.9.1. There are a number of platforms that are no longer supported (ia64 and s390, for instance) but we could wait another year and I doubt these features would appear. Regards, Anthony Liguori > Cheers, > > Laurent > > > ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 21:50 ` Anthony Liguori @ 2009-02-03 22:05 ` Laurent Desnogues 2009-02-03 22:47 ` Anthony Liguori 2009-02-04 13:09 ` Ulrich Hecht 1 sibling, 1 reply; 82+ messages in thread From: Laurent Desnogues @ 2009-02-03 22:05 UTC (permalink / raw) To: qemu-devel On Tue, Feb 3, 2009 at 10:50 PM, Anthony Liguori <anthony@codemonkey.ws> wrote: > Laurent Desnogues wrote: [...] >> So am I, but who will test user mode and more generally (user and system) >> what is the test procedure? >> > > I'd like to approach this gently. Historically, there's been no formal > release process. I'm not inclined to start out by introducing any sort of > heavy weight procedure. Don't get me wrong, I am not for formal processes at all :) I just want to be sure user mode won't be forgotten due to the lack of a maintainer. > I'll poke things as best I can over the next couple weeks. I encourage > everyone else to do the same. I'll keep track of what's working and what's > broken and make it available publicly. At some point, we can decide as if > things are too embarrassing to release or not :-) I intend to test various things on my side too. Be sure I'll let you know of problems and will also provide patches. >> For instance someone (Andrzej?) mentioned ARM in system mode is half >> slower than it was before TCG. Also the ARM target needs some fixing. >> >> Perhaps doing at least one release candidate to get feedback (and focus on >> fixing reported bugs) would be appropriate. >> > > A release doesn't have to be perfect to be useful. I think what matters > most is whether something is likely to be fixed in the reasonably near > future. We're going to have some regressions compared to 0.9.1. There are > a number of platforms that are no longer supported (ia64 and s390, for > instance) but we could wait another year and I doubt these features would > appear. I agree we should not care now about targets that are not here anymore. 
But things that are important for the community should be handled with care (and arm linux user mode is certainly a very important target). Laurent ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 22:05 ` Laurent Desnogues @ 2009-02-03 22:47 ` Anthony Liguori 2009-02-03 23:48 ` Glauber Costa 0 siblings, 1 reply; 82+ messages in thread From: Anthony Liguori @ 2009-02-03 22:47 UTC (permalink / raw) To: qemu-devel Laurent Desnogues wrote: > On Tue, Feb 3, 2009 at 10:50 PM, Anthony Liguori <anthony@codemonkey.ws> wrote: > >> Laurent Desnogues wrote: >> >>> For instance someone (Andrzej?) mentioned ARM in system mode is half >>> slower than it was before TCG. Also the ARM target needs some fixing. >>> >>> Perhaps doing at least one release candidate to get feedback (and focus on >>> fixing reported bugs) would be appropriate. >>> >>> >> A release doesn't have to be perfect to be useful. I think what matters >> most is whether something is likely to be fixed in the reasonably near >> future. We're going to have some regressions compared to 0.9.1. There are >> a number of platforms that are no longer supported (ia64 and s390, for >> instance) but we could wait another year and I doubt these features would >> appear. >> > > I agree we should not care now about targets that are not here anymore. > But things that are important for the community should be taken with > care (and arm linux user mode is certainly a very important target). > If someone is actively fixing it, then I'm perfectly happy to wait. If it's a known issue that no one is resolving, I don't think delaying a release helps anyone. However, documenting all of these things somewhere so that they are clearly visible may make it easier for someone to fix them, so the process of going through a release would probably be helpful in general. Regards, Anthony Liguori > Laurent > > > ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 22:47 ` Anthony Liguori @ 2009-02-03 23:48 ` Glauber Costa 0 siblings, 0 replies; 82+ messages in thread From: Glauber Costa @ 2009-02-03 23:48 UTC (permalink / raw) To: qemu-devel >> >> I agree we should not care now about targets that are not here anymore. >> But things that are important for the community should be taken with >> care (and arm linux user mode is certainly a very important target). >> > > If someone is actively fixing it, then I'm perfectly happy to wait. If it's > a known issue that no one is resolving, I don't think delaying a release > helps anyone. However, documenting all of these things somewhere so that > they are clearly visible may make it easier for someone to fix so the > process of going through a release would probably be helpful in general. > We also have to take into account that making a release is likely to increase the quality of the QEMU code base as a whole, simply because any user visiting the QEMU site today, or getting code from a distro, is likely to be using something very old. Old enough that any bug reports will probably be useless to us. Bug reports against 0.9.1 have already come up many times on the list. -- Glauber Costa. "Free as in Freedom" http://glommer.net "The less confident you are, the more serious you have to act." ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 21:50 ` Anthony Liguori 2009-02-03 22:05 ` Laurent Desnogues @ 2009-02-04 13:09 ` Ulrich Hecht 1 sibling, 0 replies; 82+ messages in thread From: Ulrich Hecht @ 2009-02-04 13:09 UTC (permalink / raw) To: qemu-devel On Tuesday 03 February 2009, Anthony Liguori wrote: > There are a number of platforms that are no longer > supported (ia64 and s390, for instance) but we could wait another year > and I doubt these features would appear. I am working on S/390 host support. Currently, it's good enough to show the PC BIOS startup screen (softmmu) and to run an i386 shell (linux-user). ATM I cannot give an estimate when it will be good enough for a release, though. CU Uli -- SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg) ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 21:35 ` Laurent Desnogues 2009-02-03 21:50 ` Anthony Liguori @ 2009-02-04 0:31 ` David Turner [not found] ` <74222928-D24B-4780-BDB0-D537A83C4F68@hotmail.com> 2 siblings, 0 replies; 82+ messages in thread From: David Turner @ 2009-02-04 0:31 UTC (permalink / raw) To: qemu-devel On Tue, Feb 3, 2009 at 10:35 PM, Laurent Desnogues < laurent.desnogues@gmail.com> wrote: > > For instance someone (Andrzej?) mentioned ARM in system mode is half > slower than it was before TCG. Also the ARM target needs some fixing. > I have integrated the TCG ARM backend in the Android emulator, and my measurements show an improvement in performance, when running various Android performance tests, of between x1.10 and x1.90 compared to the old dyngen-based translator. (To be honest, the improvements are not consistent: a few rare tests run at x0.89, but they're not critical to me.) Note that the TCG binary is compiled with GCC 4.2, while the old one was built with GCC 3.3 (for the usual ugly dyngen reasons). This is only when comparing the same ARMv5 binaries, but it sounds good enough for me. An official release would be very welcome at this point. The amount of changes since the last one has been dramatic. Moreover, this will allow everyone to reset the clock on their forks and more easily share patches with upstream. Just my 2 cents > > Perhaps doing at least one release candidate to get feedback (and focus on > fixing reported bugs) would be appropriate. > > Cheers, > > Laurent > > > ^ permalink raw reply [flat|nested] 82+ messages in thread
[parent not found: <74222928-D24B-4780-BDB0-D537A83C4F68@hotmail.com>]
* Re: [Qemu-devel] Cutting a new QEMU release [not found] ` <74222928-D24B-4780-BDB0-D537A83C4F68@hotmail.com> @ 2009-02-04 5:08 ` C.W. Betts 0 siblings, 0 replies; 82+ messages in thread From: C.W. Betts @ 2009-02-04 5:08 UTC (permalink / raw) To: qemu-devel I am all for releasing a release candidate. This signals to everyone that qemu is thinking of releasing a stable version, and people will try to find bugs. On Feb 3, 2009, at 2:35 PM, Laurent Desnogues wrote: > Perhaps doing at least one release candidate to get feedback (and > focus on > fixing reported bugs) would be appropriate. > > Cheers, > > Laurent > > ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 20:48 [Qemu-devel] Cutting a new QEMU release Anthony Liguori 2009-02-03 20:58 ` Glauber Costa @ 2009-02-03 21:48 ` Rick Vernam 2009-02-03 22:07 ` Daniel P. Berrange ` (5 subsequent siblings) 7 siblings, 0 replies; 82+ messages in thread From: Rick Vernam @ 2009-02-03 21:48 UTC (permalink / raw) To: qemu-devel On Tuesday 03 February 2009 2:48:22 pm Anthony Liguori wrote: > What do people think? TCG seems to be in a good place. We've got > virtio, KVM, live migration, tons of new devices, bsd-user, etc. > > We could decide to cut one by the end of the month. I'm already doing > some test work in QEMU so I can follow up with some more detailed notes > about what is working and what isn't working. That gives us some time > to decide if there's anything we need to fix before a release. > > Regards, > > Anthony Liguori -vga vmware doesn't work on either of my wxp guests, nor my w2k guest. I continue to get triple faults from any Windows XP guest at a particular point while booting, when invoked with qemu-system-x86_64. When invoked with plain qemu, it seems to work just fine. This is with or without -kqemu or -kernel-kqemu (my athlon 3700+ doesn't do kvm). I've posted about these in the past, and I don't have any additional information to add to those (ancient) discussions. I don't intend to start a discussion about them now, this is just a friendly reminder about some unresolved issues. Thanks -Rick ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 20:48 [Qemu-devel] Cutting a new QEMU release Anthony Liguori 2009-02-03 20:58 ` Glauber Costa 2009-02-03 21:48 ` Rick Vernam @ 2009-02-03 22:07 ` Daniel P. Berrange 2009-02-04 14:50 ` Aurelien Jarno ` (4 subsequent siblings) 7 siblings, 0 replies; 82+ messages in thread From: Daniel P. Berrange @ 2009-02-03 22:07 UTC (permalink / raw) To: qemu-devel On Tue, Feb 03, 2009 at 02:48:22PM -0600, Anthony Liguori wrote: > What do people think? TCG seems to be in a good place. We've got > virtio, KVM, live migration, tons of new devices, bsd-user, etc. I'd like to see a new release if at all practical. For Fedora there is a push to ship KVM and QEMU packages based off the same source tree to make patching security flaws more practical. Given that KVM ships off a QEMU SVN snapshot, having a single source tree would mean shipping our full multi-arch QEMU package off a SVN snapshot too. I don't find this a particularly appealing thing - if CVS snapshot is stable enough for it to be exposed to Fedora users, I'd like to think QEMU developers would be happy with an official release. If the QEMU dev community considers the code too unstable to release, then exposing it to Fedora users seems sub-optimal. Personally I test & use the i386 and x86_64 system emulator parts of QEMU, and those seem generally stable enough to base a new release off. So I'd welcome a new release from that POV. I'll leave others to comment on the quality of the other arch targets. > We could decide to cut one by the end of the month. I'm already doing > some test work in QEMU so I can follow up with some more detailed notes > about what is working and what isn't working. That gives us some time > to decide if there's anything we need to fix before a release. A QEMU release by the end of the month would work pretty well for the time scale we're working on to get stuff into the Fedora 11 release too. 
Regards, Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 20:48 [Qemu-devel] Cutting a new QEMU release Anthony Liguori ` (2 preceding siblings ...) 2009-02-03 22:07 ` Daniel P. Berrange @ 2009-02-04 14:50 ` Aurelien Jarno 2009-02-04 15:23 ` Tristan Gingold ` (3 more replies) 2009-02-04 15:58 ` Glauber Costa ` (3 subsequent siblings) 7 siblings, 4 replies; 82+ messages in thread From: Aurelien Jarno @ 2009-02-04 14:50 UTC (permalink / raw) To: qemu-devel On Tue, Feb 03, 2009 at 02:48:22PM -0600, Anthony Liguori wrote: > What do people think? TCG seems to be in a good place. We've got > virtio, KVM, live migration, tons of new devices, bsd-user, etc. > > We could decide to cut one by the end of the month. I'm already doing > some test work in QEMU so I can follow up with some more detailed notes > about what is working and what isn't working. That gives us some time > to decide if there's anything we need to fix before a release. > That's a really good idea. I would like to see the switch of the remaining PowerPC machines from OpenHackware to OpenBIOS. We don't have the sources of the current ppc_rom.bin binary, and I don't feel comfortable making a release with it. We probably have the sources of an older version. This at least concerns ppc_chrp.c (ppc_prep.c could probably simply be dropped). I have no idea how long it would take. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurelien@aurel32.net http://www.aurel32.net ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-04 14:50 ` Aurelien Jarno @ 2009-02-04 15:23 ` Tristan Gingold 2009-02-04 15:43 ` Lennart Sorensen 2009-02-04 17:39 ` [Qemu-devel] " Blue Swirl ` (2 subsequent siblings) 3 siblings, 1 reply; 82+ messages in thread From: Tristan Gingold @ 2009-02-04 15:23 UTC (permalink / raw) To: qemu-devel On Feb 4, 2009, at 3:50 PM, Aurelien Jarno wrote: > > This at least concerned ppc_chrp.c (ppc_prep.c could probably simply > be dropped). Please, don't drop ppc_prep.c. Even if it is not supported by OpenBios I know several uses of prep with direct images. Tristan. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-04 15:23 ` Tristan Gingold @ 2009-02-04 15:43 ` Lennart Sorensen 2009-02-04 16:01 ` Tristan Gingold 0 siblings, 1 reply; 82+ messages in thread From: Lennart Sorensen @ 2009-02-04 15:43 UTC (permalink / raw) To: qemu-devel On Wed, Feb 04, 2009 at 04:23:10PM +0100, Tristan Gingold wrote: > Please, don't drop ppc_prep.c. Even if it is not supported by > OpenBios I know several uses of prep > with direct images. Who still supports it? The linux kernel seems to have thrown away the prep support code as of 2.6.27. -- Len Sorensen ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-04 15:43 ` Lennart Sorensen @ 2009-02-04 16:01 ` Tristan Gingold 2009-02-04 18:17 ` [Qemu-devel] " Consul 0 siblings, 1 reply; 82+ messages in thread From: Tristan Gingold @ 2009-02-04 16:01 UTC (permalink / raw) To: qemu-devel On Feb 4, 2009, at 4:43 PM, Lennart Sorensen wrote: > On Wed, Feb 04, 2009 at 04:23:10PM +0100, Tristan Gingold wrote: >> Please, don't drop ppc_prep.c. Even if it is not supported by >> OpenBios I know several uses of prep >> with direct images. > > Who still supports it? I know at least one non-free OS that runs on prep. We also create raw programs that run on prep (yes, we could switch to chrp). > The linux kernel seems to have thrown away the prep support code as of > 2.6.27. Yes, but the world is not only linux 2.6.27+ :-) ^ permalink raw reply [flat|nested] 82+ messages in thread
* [Qemu-devel] Re: Cutting a new QEMU release 2009-02-04 16:01 ` Tristan Gingold @ 2009-02-04 18:17 ` Consul 0 siblings, 0 replies; 82+ messages in thread From: Consul @ 2009-02-04 18:17 UTC (permalink / raw) To: qemu-devel > > Yes, but the world is not only linux 2.6.27+ :-) > Of course not. All the world's a VAX! ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-04 14:50 ` Aurelien Jarno 2009-02-04 15:23 ` Tristan Gingold @ 2009-02-04 17:39 ` Blue Swirl 2009-02-04 17:50 ` Jonathan Kalbfeld 2009-02-04 20:07 ` Blue Swirl 2009-02-07 14:15 ` Stuart Brady 3 siblings, 1 reply; 82+ messages in thread From: Blue Swirl @ 2009-02-04 17:39 UTC (permalink / raw) To: qemu-devel On 2/4/09, Aurelien Jarno <aurelien@aurel32.net> wrote: > On Tue, Feb 03, 2009 at 02:48:22PM -0600, Anthony Liguori wrote: > > > What do people think? TCG seems to be in a good place. We've got > > virtio, KVM, live migration, tons of new devices, bsd-user, etc. > > > > We could decide to cut one by the end of the month. I'm already doing > > some test work in QEMU so I can follow up with some more detailed notes > > about what is working and what isn't working. That gives us some time > > to decide if there's anything we need to fix before a release. > > > > > That's a really good idea. > > I would like to see the switch of the remaining PowerPC machine from > OpenHackware to OpenBIOS. We don't have the sources of the current > ppc_rom.bin binary, and I don't feel comfortable making a release with > it. We probably have the sources of an older version. > > This at least concerned ppc_chrp.c (ppc_prep.c could probably simply > be dropped). > > I have no idea about how long it would take. PPC development on OpenBIOS side is taking quick leaps, I'd hate to rush a release just now when we are very close to a fully working system. On Sparc32/64 side, things are moving more slowly. Sparc32 is pretty much release quality with support for Linux, OpenBSD and NetBSD boot and a lot of boards. Sparc64 is still unusable, but it's not worth waiting for. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-04 17:39 ` [Qemu-devel] " Blue Swirl @ 2009-02-04 17:50 ` Jonathan Kalbfeld 0 siblings, 0 replies; 82+ messages in thread From: Jonathan Kalbfeld @ 2009-02-04 17:50 UTC (permalink / raw) To: qemu-devel Are the host details fixed on Solaris/SPARC? Does anyone want access to my build environment to try and make it work? I haven't gotten anything since the January 13, 2008 release to work without SIGSEGV on a sparc. jonathan On Wed, Feb 4, 2009 at 9:39 AM, Blue Swirl <blauwirbel@gmail.com> wrote: > On 2/4/09, Aurelien Jarno <aurelien@aurel32.net> wrote: >> On Tue, Feb 03, 2009 at 02:48:22PM -0600, Anthony Liguori wrote: >> >> > What do people think? TCG seems to be in a good place. We've got >> > virtio, KVM, live migration, tons of new devices, bsd-user, etc. >> > >> > We could decide to cut one by the end of the month. I'm already doing >> > some test work in QEMU so I can follow up with some more detailed notes >> > about what is working and what isn't working. That gives us some time >> > to decide if there's anything we need to fix before a release. >> > >> >> >> That's a really good idea. >> >> I would like to see the switch of the remaining PowerPC machine from >> OpenHackware to OpenBIOS. We don't have the sources of the current >> ppc_rom.bin binary, and I don't feel comfortable making a release with >> it. We probably have the sources of an older version. >> >> This at least concerned ppc_chrp.c (ppc_prep.c could probably simply >> be dropped). >> >> I have no idea about how long it would take. > > PPC development on OpenBIOS side is taking quick leaps, I'd hate to > rush a release just now when we are very close to a fully working > system. > > On Sparc32/64 side, things are moving more slowly. Sparc32 is pretty > much release quality with support for Linux, OpenBSD and NetBSD boot > and a lot of boards. Sparc64 is still unusable, but it's not worth > waiting for. 
> > > -- -- Jonathan Kalbfeld ThoughtWave Technologies LLC www.thoughtwave.com "Yes, we did!" Learn UNIX For Free at unixlessons.com ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-04 14:50 ` Aurelien Jarno 2009-02-04 15:23 ` Tristan Gingold 2009-02-04 17:39 ` [Qemu-devel] " Blue Swirl @ 2009-02-04 20:07 ` Blue Swirl 2009-02-07 14:15 ` Stuart Brady 3 siblings, 0 replies; 82+ messages in thread From: Blue Swirl @ 2009-02-04 20:07 UTC (permalink / raw) To: qemu-devel On 2/4/09, Aurelien Jarno <aurelien@aurel32.net> wrote: > On Tue, Feb 03, 2009 at 02:48:22PM -0600, Anthony Liguori wrote: > > > What do people think? TCG seems to be in a good place. We've got > > virtio, KVM, live migration, tons of new devices, bsd-user, etc. > > > > We could decide to cut one by the end of the month. I'm already doing > > some test work in QEMU so I can follow up with some more detailed notes > > about what is working and what isn't working. That gives us some time > > to decide if there's anything we need to fix before a release. > > > > > That's a really good idea. > > I would like to see the switch of the remaining PowerPC machine from > OpenHackware to OpenBIOS. We don't have the sources of the current > ppc_rom.bin binary, and I don't feel comfortable making a release with > it. We probably have the sources of an older version. > > This at least concerned ppc_chrp.c (ppc_prep.c could probably simply > be dropped). > > I have no idea about how long it would take. 15 minutes :-), though for Qemu side only. The attached patch switches CHRP to OpenBIOS. The screen flashes with something, some more stuff is needed on OpenBIOS side. [-- Attachment #2: chrp_use_openbios.diff --] [-- Type: plain/text, Size: 2327 bytes --] ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-04 14:50 ` Aurelien Jarno ` (2 preceding siblings ...) 2009-02-04 20:07 ` Blue Swirl @ 2009-02-07 14:15 ` Stuart Brady 3 siblings, 0 replies; 82+ messages in thread From: Stuart Brady @ 2009-02-07 14:15 UTC (permalink / raw) To: qemu-devel On Wed, Feb 04, 2009 at 03:50:52PM +0100, Aurelien Jarno wrote: > We don't have the sources of the current ppc_rom.bin binary, and I > don't feel comfortable making a release with it. We probably have the > sources of an older version. Ouch! :( I noticed that the upstream site for Open Hack'Ware disappeared a while ago... Various distros still have the source (and QEMU has a patch against it, pc-bios/ohw.diff), and I thought that it had not been modified in quite a while... I'm glad that OpenBIOS is doing well, but if we really have lost some of the Open Hack'Ware source, that's slightly disconcerting (even if it's not really needed any longer.) Cheers, -- Stuart Brady ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 20:48 [Qemu-devel] Cutting a new QEMU release Anthony Liguori ` (3 preceding siblings ...) 2009-02-04 14:50 ` Aurelien Jarno @ 2009-02-04 15:58 ` Glauber Costa 2009-02-07 15:29 ` Shin-ichiro KAWASAKI ` (2 subsequent siblings) 7 siblings, 0 replies; 82+ messages in thread From: Glauber Costa @ 2009-02-04 15:58 UTC (permalink / raw) To: qemu-devel On Tue, Feb 3, 2009 at 6:48 PM, Anthony Liguori <anthony@codemonkey.ws> wrote: > What do people think? TCG seems to be in a good place. We've got virtio, > KVM, live migration, tons of new devices, bsd-user, etc. > > We could decide to cut one by the end of the month. I'm already doing some > test work in QEMU so I can follow up with some more detailed notes about > what is working and what isn't working. That gives us some time to decide > if there's anything we need to fix before a release. As a curiosity, what would be the preferred version number? 0.9.2? 0.10 ? 1.0 ? "Phoenix"? "Crisis Release" ? -- Glauber Costa. "Free as in Freedom" http://glommer.net "The less confident you are, the more serious you have to act." ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 20:48 [Qemu-devel] Cutting a new QEMU release Anthony Liguori ` (4 preceding siblings ...) 2009-02-04 15:58 ` Glauber Costa @ 2009-02-07 15:29 ` Shin-ichiro KAWASAKI 2009-02-11 21:49 ` Rob Landley 2009-02-09 12:43 ` Mark McLoughlin 2009-02-13 8:40 ` Riku Voipio 7 siblings, 1 reply; 82+ messages in thread From: Shin-ichiro KAWASAKI @ 2009-02-07 15:29 UTC (permalink / raw) To: qemu-devel Anthony Liguori wrote: > What do people think? TCG seems to be in a good place. We've got > virtio, KVM, live migration, tons of new devices, bsd-user, etc. The development of sh4 is rather slow, as you know, so there is no need to take it into account when deciding when to cut the next version. From the point of view of sh4 system emulation, a release is good news. It is a good way to provide the current features to sh4 developers. But before the release, I hope someone can handle these two points. [1] USB support Current sh4 system emulation (r2d board) does not support USB host. Without it, the graphic console does not receive any key input via USB keyboard emulation, so the graphics console is not usable now. I guess sh4 developers would find that strange. The following patch adds USB host. http://lists.gnu.org/archive/html/qemu-devel/2008-12/msg01620.html It does not apply to current svn head, because of line mismatches. I'm willing to post a new version, if the patch is OK. Could anyone review it? [2] sh4 kernel & disk image on QEMU's download page Making a kernel & disk image for sh4 is bothersome work. So that it does not get in the way of sh4 developers, I provide a small (4MB) set at the following URL. http://www.assembla.com/spaces/qemu-sh4/documents/b18oeq850r3AhNab7jnrAJ/download?filename=sh-test-0.1.tar.bz2 Could anyone put the image on QEMU's download page? http://bellard.org/qemu/download.html I think it is the best place to provide it. 
Regards, Shin-ichiro KAWASAKI ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-07 15:29 ` Shin-ichiro KAWASAKI @ 2009-02-11 21:49 ` Rob Landley 2009-02-12 14:44 ` Shin-ichiro KAWASAKI 0 siblings, 1 reply; 82+ messages in thread From: Rob Landley @ 2009-02-11 21:49 UTC (permalink / raw) To: qemu-devel; +Cc: Shin-ichiro KAWASAKI On Saturday 07 February 2009 09:29:42 Shin-ichiro KAWASAKI wrote: > Anthony Liguori wrote: > > What do people think? TCG seems to be in a good place. We've got > > virtio, KVM, live migration, tons of new devices, bsd-user, etc. > > The development of sh4 is rather slow, you know. Then there is no need to > think about it to decide when to cut the next version. > > From the point of view from sh4 system emulation, that's a good news. > It is a good way to provide current features for sh4 developers. > But before release, I hope these two points would handled by anyone. > > [1] USB support > > Current sh4 system emulation (r2d board) does not support USB host. > Without it, the graphic console does not receive any key input > via USB keyboard emulation. Then, graphics console is not available > now. I guess sh4 developers would feel it strange. > > Following patch adds usb host. > > http://lists.gnu.org/archive/html/qemu-devel/2008-12/msg01620.html > > It does not apply to current svn head, because of line mismatch. > I'm willing to post new version, if the patch is OK. > Could anyone review it? > > > [2] sh4 kernel & disk image on QEMU's download page > > It's a bothering work to make kernel & disk image for sh4. Not to obstruct > sh4 developers with it, I provide a small (4MB) set at following URL. > > http://www.assembla.com/spaces/qemu-sh4/documents/b18oeq850r3AhNab7jnrAJ/do >wnload?filename=sh-test-0.1.tar.bz2 I downloaded this and tried it out with an svn snapshot from today (svn 6613), built on Ubuntu 8.10 with default "./configure; make; sudo make install". Your README has a typo, it says "-kernel r2d_zImage" but the one you've packaged is just "zImage". 
Once that's worked around, it pops up a qemu window within which it boots to a login prompt, but I can't type anything. This is the USB issue you mentioned, confirmed by the stdout from qemu: char device redirected to /dev/pts/14 Warning: could not add USB device keyboard long read to SH7750_WCR1_A7 (0x000000001f800008) ignored long read to SH7750_WCR2_A7 (0x000000001f80000c) ignored long read to SH7750_WCR3_A7 (0x000000001f800010) ignored long read to SH7750_MCR_A7 (0x000000001f800014) ignored long read to SH7750_MCR_A7 (0x000000001f800014) ignored I thought I'd work around that with a serial console (which is what I actually use in my FWL project anyway), but I can't get that to work either. Your command line -appends "console=ttySC0,115200" and "early_printk=serial" but ctrl-alt-2 shows nothing, and booting with -nographic gives no output. Also, going back to ctrl-alt-1 doesn't redraw the vga screen. (The screen will partially redraw itself if it's still producing output, but it never redraws the penguin logo at the top of the frame buffer, and if the console has finished producing output it just stays black from then on.) This is more functionality than I've ever gotten out of sh4, and I really look forward to adding sh4 support to http://impactlinux.com/fwl . If I can get a serial console working I'm probably good to go, but right now it's not quite working for me yet... Your README says that you can extract the config.gz from the kernel with linux/scripts/extract-ikconfig, but when I tried it, it said: ERROR: Unable to extract kernel configuration information. This kernel image may not have the config info. Obviously, I can't get it by logging in and catting /proc/config.gz without a working keyboard or serial console... Rob ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-11 21:49 ` Rob Landley @ 2009-02-12 14:44 ` Shin-ichiro KAWASAKI 2009-02-12 21:08 ` Rob Landley 2009-02-12 21:44 ` Rob Landley 0 siblings, 2 replies; 82+ messages in thread From: Shin-ichiro KAWASAKI @ 2009-02-12 14:44 UTC (permalink / raw) To: Rob Landley; +Cc: qemu-devel Rob Landley wrote: > On Saturday 07 February 2009 09:29:42 Shin-ichiro KAWASAKI wrote: >> Anthony Liguori wrote: >>> What do people think? TCG seems to be in a good place. We've got >>> virtio, KVM, live migration, tons of new devices, bsd-user, etc. >> The development of sh4 is rather slow, you know. Then there is no need to >> think about it to decide when to cut the next version. >> >> From the point of view from sh4 system emulation, that's a good news. >> It is a good way to provide current features for sh4 developers. >> But before release, I hope these two points would handled by anyone. >> >> [1] USB support >> >> Current sh4 system emulation (r2d board) does not support USB host. >> Without it, the graphic console does not receive any key input >> via USB keyboard emulation. Then, graphics console is not available >> now. I guess sh4 developers would feel it strange. >> >> Following patch adds usb host. >> >> http://lists.gnu.org/archive/html/qemu-devel/2008-12/msg01620.html >> >> It does not apply to current svn head, because of line mismatch. >> I'm willing to post new version, if the patch is OK. >> Could anyone review it? >> >> >> [2] sh4 kernel & disk image on QEMU's download page >> >> It's a bothering work to make kernel & disk image for sh4. Not to obstruct >> sh4 developers with it, I provide a small (4MB) set at following URL. >> >> http://www.assembla.com/spaces/qemu-sh4/documents/b18oeq850r3AhNab7jnrAJ/do >> wnload?filename=sh-test-0.1.tar.bz2 > > I downloaded this and tried it out with an svn snapshot from today (svn 6613), > built on Ubuntu 8.10 with default "./configure; make; sudo make install". 
Thank you Rob for trying, and sorry for my mistakes in the package. I went over the contents again and have uploaded a new version to the same URL as before. > Your README has a typo, it says "-kernel r2d_zImage" but the one you've > packaged is just "zImage". Once that's worked around, it pops up a qemu > window within which it boots to a login prompt, but I can't type anything. > This is the USB issue you mentioned, confirmed by the stdout from qemu: > > char device redirected to /dev/pts/14 > Warning: could not add USB device keyboard > long read to SH7750_WCR1_A7 (0x000000001f800008) ignored > long read to SH7750_WCR2_A7 (0x000000001f80000c) ignored > long read to SH7750_WCR3_A7 (0x000000001f800010) ignored > long read to SH7750_MCR_A7 (0x000000001f800014) ignored > long read to SH7750_MCR_A7 (0x000000001f800014) ignored > > I thought I'd work around that with a serial console (which is what I actually > use for in my FWL project anyway), but I can't get that to work either. Your > command line -appends "console=ttySC0,115200" and "early_printk=serial" but > ctrl-alt-2 shows nothing, and booting with > -nographic gives no output. > > Also, going back to ctrl-alt-1 doesn't redraw the vga screen. (The screen > will partially redraw itself if it's still producing output, but it never > redraws the penguin logo at the top of the frame buffer, and if the console > has finished producing output it just stays black from then on.) Sorry to say, I completely forgot to add '-serial null -serial stdio' to the command line example. Could you try again with the following line? % ./qemu-system-sh4 -M r2d -kernel zImage -hda sh-linux-mini.img -serial null -serial stdio -nographic I hope you'll see the shell prompt. > This is more functionality than I've ever gotten out of sh4, and I really look > forward to adding sh4 support to http://impactlinux.com/fwl . If I can get a > serial console working I'm probably good to go, but right now it's not quite > working for me yet... 
> > Your README says that you can extract the config.gz from the kernel with > linux/scripts/extract-ikconfig, but when I tried it it said: > > ERROR: Unable to extract kernel configuration information. > This kernel image may not have the config info. > > Obviously, I can't get it by logging in and catting /proc/config.gz without a > working keyboard or serial console... In my Ubuntu 8.04 environment, I can't get the config with scripts/extract-ikconfig. In an Ubuntu 8.10 environment, on the other hand, I can. I'm not sure why. Anyway, I removed the explanation of how to get the config before booting. Additionally, I modified the default kernel boot options slightly, and ran fsck.ext2 on the disk image before packing. I hope the package is now appropriate for qemu-sh users. I'm looking forward to the fwl system image. I've been struggling to get a stable userland on which gcc is available. Regards, Shin-ichiro KAWASAKI ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-12 14:44 ` Shin-ichiro KAWASAKI @ 2009-02-12 21:08 ` Rob Landley 2009-02-12 21:44 ` Rob Landley 1 sibling, 0 replies; 82+ messages in thread From: Rob Landley @ 2009-02-12 21:08 UTC (permalink / raw) To: Shin-ichiro KAWASAKI; +Cc: qemu-devel On Thursday 12 February 2009 08:44:48 Shin-ichiro KAWASAKI wrote: > Sorry to say, I completely missed to add '-serial null -serial stdio' in > the command line example. Could you try following line again? > > % ./qemu-system-sh4 -M r2d -kernel zImage -hda sh-linux-mini.img -serial > null -serial stdio -nographic > > I hope you'll see the shell prompt. Yes I did! Cool! Thanks, that's what I needed. > I'm looking forward the fwl system image. I've been struggling > to get stable userland on which gcc is available. I'll try to make it work this evening. Thank you very much, Rob ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-12 14:44 ` Shin-ichiro KAWASAKI 2009-02-12 21:08 ` Rob Landley @ 2009-02-12 21:44 ` Rob Landley 1 sibling, 0 replies; 82+ messages in thread From: Rob Landley @ 2009-02-12 21:44 UTC (permalink / raw) To: Shin-ichiro KAWASAKI; +Cc: qemu-devel On Thursday 12 February 2009 08:44:48 Shin-ichiro KAWASAKI wrote: > Rob Landley wrote: > > On Saturday 07 February 2009 09:29:42 Shin-ichiro KAWASAKI wrote: > % ./qemu-system-sh4 -M r2d -kernel zImage -hda sh-linux-mini.img -serial > null -serial stdio -nographic > > I hope you'll see the shell prompt. FYI, here's something I typed at that shell prompt which made qemu-system-sh4 unhappy: # reboot The system is going down NOW! Sending SIGTERM to all processes Sending SIGKILL to all processes Requesting system reboot Restarting system. Unauthorized access qemu: fatal: Trying to execute code outside RAM or ROM at 0xa0000000 pc=0xa0000000 sr=0x700000f0 pr=0x8c03864c fpscr=0x00080000 spc=0x8c0126a6 ssr=0x10000000 gbr=0x2975b450 vbr=0x8c018000 sgr=0x8f989e8c dbr=0x00000000 delayed_pc=0x8c0126a0 fpul=0x00000000 r0=0x00000016 r1=0x80000001 r2=0x10000000 r3=0x0000198e r4=0x00000000 r5=0x0000198e r6=0xffffffff r7=0xffffffff r8=0x28121969 r9=0xfee1dead r10=0x000001a0 r11=0x01234567 r12=0x297577b8 r13=0x004c2c10 r14=0x7bea3aa0 r15=0x8f989e8c r16=0x00000000 r17=0xffffff0f r18=0xffffffff r19=0x40008000 r20=0x8f989e08 r21=0x00000000 r22=0x00000000 r23=0x8f988000 Just FYI. :) Rob ^ permalink raw reply [flat|nested] 82+ messages in thread
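A note on reading the register dump in the message above: the faulting address 0xa0000000 lies in what the SH-4 architecture calls the P2 region, an uncached window onto physical address 0, which is also where an SH-4 starts executing after a reset. The sketch below decodes addresses from the dump, assuming the standard SH-4 privileged-mode split of the 32-bit address space (P0 below 0x80000000, then P1 to P4 in 512 MB steps). The helper names `sh4_region` and `sh4_p1p2_to_phys` are hypothetical and are not QEMU functions; the thread does not say what, if anything, the r2d machine model maps at physical 0, so this illustrates the address layout only and is not a diagnosis.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the five SH-4 privileged-mode regions. */
typedef enum { SH4_P0, SH4_P1, SH4_P2, SH4_P3, SH4_P4 } sh4_region_t;

/* Classify a 32-bit virtual address by its top three bits. */
static sh4_region_t sh4_region(uint32_t vaddr)
{
    if (vaddr < 0x80000000u)
        return SH4_P0;              /* 0x00000000-0x7fffffff */
    switch (vaddr >> 29) {
    case 4:  return SH4_P1;         /* 0x80000000-0x9fffffff, cached   */
    case 5:  return SH4_P2;         /* 0xa0000000-0xbfffffff, uncached */
    case 6:  return SH4_P3;         /* 0xc0000000-0xdfffffff           */
    default: return SH4_P4;         /* 0xe0000000-0xffffffff           */
    }
}

/* P1 and P2 are fixed windows: drop the top three bits to get the
 * physical address. Only meaningful for P1/P2 addresses. */
static uint32_t sh4_p1p2_to_phys(uint32_t vaddr)
{
    return vaddr & 0x1fffffffu;
}

/* Decode two values taken from the dump: pc and vbr. */
int sh4_demo(void)
{
    assert(sh4_region(0xa0000000u) == SH4_P2);          /* pc  */
    assert(sh4_p1p2_to_phys(0xa0000000u) == 0u);
    assert(sh4_region(0x8c018000u) == SH4_P1);          /* vbr */
    assert(sh4_p1p2_to_phys(0x8c018000u) == 0x0c018000u);
    return 0;
}
```

Under these assumptions the kernel's own addresses (vbr, pr, spc) all decode to the P1 window, while the pc at the time of the fault is the uncached alias of physical 0.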
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 20:48 [Qemu-devel] Cutting a new QEMU release Anthony Liguori ` (5 preceding siblings ...) 2009-02-07 15:29 ` Shin-ichiro KAWASAKI @ 2009-02-09 12:43 ` Mark McLoughlin 2009-02-09 21:36 ` Anthony Liguori 2009-02-10 0:47 ` Rob Landley 2009-02-13 8:40 ` Riku Voipio 7 siblings, 2 replies; 82+ messages in thread From: Mark McLoughlin @ 2009-02-09 12:43 UTC (permalink / raw) To: qemu-devel On Tue, 2009-02-03 at 14:48 -0600, Anthony Liguori wrote: > What do people think? TCG seems to be in a good place. We've got > virtio, KVM, live migration, tons of new devices, bsd-user, etc. > > We could decide to cut one by the end of the month. I'm already doing > some test work in QEMU so I can follow up with some more detailed notes > about what is working and what isn't working. That gives us some time > to decide if there's anything we need to fix before a release. Sounds great to me. From a Fedora perspective, qemu-0.9.1 is a year old and upstream has moved on a lot. As a package maintainer, it's hard to justify caring too much about bugs reported against 0.9.1, since the bug is likely to have very little relevance to the latest upstream. Also, it would be really nice to have a kvm-userspace based off a solid qemu release ... qemu moving so fast is great, but it means it's hard to predict the stability of a given kvm-userspace release. Some questions: - Will there be a period before the release when only bug fixes are merged? - Will there be a release candidate? - Are there any missing features that we might push out the release date for? - Post-release, is there any interest in maintaining a stable branch until the next release? - The plan for the next release is roughly 6 months, yes? Thanks, Mark. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-09 12:43 ` Mark McLoughlin @ 2009-02-09 21:36 ` Anthony Liguori 2009-02-10 0:47 ` Rob Landley 1 sibling, 0 replies; 82+ messages in thread From: Anthony Liguori @ 2009-02-09 21:36 UTC (permalink / raw) To: Mark McLoughlin, qemu-devel Mark McLoughlin wrote: > On Tue, 2009-02-03 at 14:48 -0600, Anthony Liguori wrote: > >> What do people think? TCG seems to be in a good place. We've got >> virtio, KVM, live migration, tons of new devices, bsd-user, etc. >> >> We could decide to cut one by the end of the month. I'm already doing >> some test work in QEMU so I can follow up with some more detailed notes >> about what is working and what isn't working. That gives us some time >> to decide if there's anything we need to fix before a release. >> > > Sounds great to me. > > >From a Fedora perspective, qemu-0.9.1 is a year old and upstream has > moved on a lot. As a package maintainer, it's hard to justify caring too > much about bugs reported against 0.9.1, since the bug is likely to have > very little relevance to the latest upstream. > > Also, it would be really nice to have a kvm-userspace based off a solid > qemu release ... qemu moving so fast is great, but it means it's hard to > predict the stability of a given kvm-userspace release. > > Some questions: > > - Will there be a period before the release when only bug fixes are > merged? > It's a good idea, but it may be hard to pull off practically speaking for the first release. Let's see how it works out. > - Will there be a release candidate? > Sometime this week, I'll try to post something summarizing our current state and anything outstanding. If there's time to put out an -rc, I'll try to make one available. Things may hiccup a bit. > - Is there any missing features that we might push out the release > date for? > Personally, I don't think so. I think openbios was the biggest issue because we don't have the code for the current firmware. 
It looks like that's been almost resolved. I'm more interested in getting a release out in a timely manner than holding up for any particular feature. If we have lots of features going in, I'd rather do more frequent releases than hold up releases. > - Post-release, is there any interest in maintaining a stable branch > until the next release? > I am tempted to try it out. Let's see how it goes. > - The plan for the next release is roughly 6 months, yes? > Yup. Regards, Anthony Liguori > Thanks, > Mark. > > > > ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-09 12:43 ` Mark McLoughlin 2009-02-09 21:36 ` Anthony Liguori @ 2009-02-10 0:47 ` Rob Landley 2009-02-10 7:22 ` M. Warner Losh 1 sibling, 1 reply; 82+ messages in thread From: Rob Landley @ 2009-02-10 0:47 UTC (permalink / raw) To: qemu-devel, Mark McLoughlin On Monday 09 February 2009 06:43:34 Mark McLoughlin wrote: > On Tue, 2009-02-03 at 14:48 -0600, Anthony Liguori wrote: > > What do people think? TCG seems to be in a good place. We've got > > virtio, KVM, live migration, tons of new devices, bsd-user, etc. > > > > We could decide to cut one by the end of the month. I'm already doing > > some test work in QEMU so I can follow up with some more detailed notes > > about what is working and what isn't working. That gives us some time > > to decide if there's anything we need to fix before a release. > > Sounds great to me. > > From a Fedora perspective, qemu-0.9.1 is a year old and upstream has > moved on a lot. As a package maintainer, it's hard to justify caring too > much about bugs reported against 0.9.1, since the bug is likely to have > very little relevance to the latest upstream. > > Also, it would be really nice to have a kvm-userspace based off a solid > qemu release ... qemu moving so fast is great, but it means it's hard to > predict the stability of a given kvm-userspace release. I'd like to point out a relevant Google tech talk video: http://video.google.com/videoplay?docid=-5503858974016723264 April 19, 2007 Release Management in Large Free Software Projects - Martin Michlmayr (Debian) ABSTRACT: Time based releases are made according to a specific time interval, instead of making a release when a particular functionality or set of features have been implemented. 
This talk argues that time based release management acts as an effective coordination mechanism in large volunteer projects and shows examples from seven projects that have moved to time based releases: Debian, GCC, GNOME, Linux, OpenOffice, Plone, and X.org. > Some questions: > > - Will there be a period before the release when only bug fixes are > merged? > > - Will there be a release candidate? Those two answer each other. If your 0.9.2 release turns out to have bugs, you can trivially cut a bugfix-only 0.9.2.1, 0.9.2.2, 0.9.2.3... as needed. Weekly even. So 0.9.2 being bug-free isn't that important. And it's actually just about impossible for your .0 to be bug-free, because you get 20 times as many testers for an actual release as you get for any snapshot, so they _will_ find new bugs. It's just about guaranteed. Also, unless your stabilization period is a hard freeze preventing new development from going into the repository, you'll be introducing new bugs while you try to fix 'em... > - Post-release, is there any interest in maintaining a stable branch > until the next release? That's kind of necessary for the previous two, but as long as it's clearly bugfix-only then it should have zero impact on new development, and can be done in a completely separate repository by a different maintainer. (That's how the linux kernel does things.) > - Is there any missing features that we might push out the release > date for? Defeats the purpose of time based releases: it's ok to bump things from this release if the next release is a finite amount of time away. If you have no idea when the next release will be, then getting every last feature into this release (and holding up the release for it) is a big deal, and thus you have endless delays, feature creep, a rush to merge things that aren't quite ready when a release is floated... > - The plan for the next release is roughly 6 months, yes? 
The general theory of having regular scheduled releases is that bumping stuff until the next release is no longer the end of the world, because there _will_ be a next release, and this has lots and lots of positive side effects, as described in the video. > Thanks, > Mark. Rob ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-10 0:47 ` Rob Landley @ 2009-02-10 7:22 ` M. Warner Losh 0 siblings, 0 replies; 82+ messages in thread From: M. Warner Losh @ 2009-02-10 7:22 UTC (permalink / raw) To: qemu-devel, rob; +Cc: markmc Re Time based releases: You need to have someone drive the time based releases long term, otherwise you'll slide back into the feature based release mode. And since there's always another feature, that, as has been pointed out, tends to stretch out things a very long time. Warner ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-03 20:48 [Qemu-devel] Cutting a new QEMU release Anthony Liguori ` (6 preceding siblings ...) 2009-02-09 12:43 ` Mark McLoughlin @ 2009-02-13 8:40 ` Riku Voipio 2009-02-13 9:59 ` Stefano Stabellini 2009-02-13 16:30 ` Jamie Lokier 7 siblings, 2 replies; 82+ messages in thread From: Riku Voipio @ 2009-02-13 8:40 UTC (permalink / raw) To: qemu-devel [-- Attachment #1: Type: text/plain, Size: 488 bytes --] On Tue, Feb 03, 2009 at 02:48:22PM -0600, Anthony Liguori wrote: > We could decide to cut one by the end of the month. This would indeed be really cool. > .. to decide if there's anything we need to fix before a release. At least the OS X (cocoa) host is broken, which is IMHO a pretty bad regression. Apart from that I'm not aware of any major regression (wearing arm-linux-user, some arm-softmmu and debian hats). -- "rm -rf" only sounds scary if you don't have backups [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-13 8:40 ` Riku Voipio @ 2009-02-13 9:59 ` Stefano Stabellini 2009-02-13 16:30 ` Jamie Lokier 1 sibling, 0 replies; 82+ messages in thread From: Stefano Stabellini @ 2009-02-13 9:59 UTC (permalink / raw) To: qemu-devel@nongnu.org Riku Voipio wrote: >> .. to decide if there's anything we need to fix before a release. > > At least the OS X (cocoa) host is broken, which is IMHO pretty > bad regression. Apart from that I'm not aware of any major regression > (wearing arm-linux-user, some arm-softmmu and debian hats). > There was a patch lying around a few weeks ago that was OK for most cases but didn't work in some. If we don't have anything better I think we could accept the patch and add a few #ifdefs in vga.c to make sure the patch does not break. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-13 8:40 ` Riku Voipio 2009-02-13 9:59 ` Stefano Stabellini @ 2009-02-13 16:30 ` Jamie Lokier 2009-02-13 17:00 ` Anthony Liguori 1 sibling, 1 reply; 82+ messages in thread From: Jamie Lokier @ 2009-02-13 16:30 UTC (permalink / raw) To: qemu-devel Riku Voipio wrote: > On Tue, Feb 03, 2009 at 02:48:22PM -0600, Anthony Liguori wrote: > > We could decide to cut one by the end of the month. > > This would indeed be really cool. > > > .. to decide if there's anything we need to fix before a release. > > At least the OS X (cocoa) host is broken, which is IMHO pretty > bad regression. Apart from that I'm not aware of any major regression > (wearing arm-linux-user, some arm-softmmu and debian hats). I'd say the two qcow2 data corruption bugs are a major regression. (Both were reported in another thread.) qemu 0.9.1 has the qcow2 code from kvm-72, which doesn't exhibit either of those corruption bugs. A new release based on current kvm userspace would introduce those bugs. One of the bugs (reported by Marc) corrupts a qcow2 image so you can't use it even if you revert to an older qemu/kvm. It's not clear if the other bug causes permanent corruption itself, but anything which causes a guest to see the wrong data can lead to the guest writing corrupt data elsewhere later on. Simply reverting the qcow2 code appears to fix those problems, so it needn't hold up cutting a release. That's what I recommend. -- Jamie ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Cutting a new QEMU release 2009-02-13 16:30 ` Jamie Lokier @ 2009-02-13 17:00 ` Anthony Liguori 2009-02-13 19:04 ` [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports Jamie Lokier 0 siblings, 1 reply; 82+ messages in thread From: Anthony Liguori @ 2009-02-13 17:00 UTC (permalink / raw) To: qemu-devel Jamie Lokier wrote: > Riku Voipio wrote: > >> On Tue, Feb 03, 2009 at 02:48:22PM -0600, Anthony Liguori wrote: >> >>> We could decide to cut one by the end of the month. >>> >> This would indeed be really cool. >> >> >>> .. to decide if there's anything we need to fix before a release. >>> >> At least the OS X (cocoa) host is broken, which is IMHO pretty >> bad regression. Apart from that I'm not aware of any major regression >> (wearing arm-linux-user, some arm-softmmu and debian hats). >> > > I'd say the two qcow2 data corruption bugs are a major regression. > (Both reported in in another thread). > > qemu 0.9.1 has the qcow2 code from kvm-72, which doesn't exhibit > either of those corruption bugs. A new release based on current kvm > userspace would introduce those bugs. One of the bugs (reported by > Marc) corrupts a qcow2 image so you can't use it even if you revert to > an older qemu/kvm. It's not clear if the other bug causes permanent > corruption itself, but anything which causes a guest to see the wrong > data can lead to the guest writing corrupt data elsewhere later on. > > Simply reverting the qcow2 code appears to fix those problems, so it > needn't hold up cutting a release. That's what I recommend. > Send some patches. Regards, Anthony Liguori > -- Jamie > > > ^ permalink raw reply [flat|nested] 82+ messages in thread
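For readers following the qcow2 subthread: the patch description in the next message quotes Marc Bevand's guess that the image-destroying bug comes from an L2-table write that goes ahead after a cluster allocation has failed, with the failure signalled by an offset of 0, so that the table lands on top of the qcow2 header. A minimal, self-contained sketch of that suspected failure mode and of the check that would prevent it follows. Everything in it is hypothetical: the in-memory "image", `toy_alloc_cluster`, `toy_new_l2_table` and the other `toy_` names are for illustration only, and are neither QEMU's functions nor the fix that was eventually applied. Only the four-byte magic at the start of the image reflects the real format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { CLUSTER_SIZE = 4096, IMAGE_CLUSTERS = 3 };

/* Toy in-memory image; cluster 0 stands in for the qcow2 header. */
static uint8_t image[IMAGE_CLUSTERS * CLUSTER_SIZE];
static unsigned next_free_cluster;

static void toy_format(void)
{
    memset(image, 0xAA, sizeof(image));
    memcpy(image, "QFI\xfb", 4);    /* qcow2 magic bytes */
    next_free_cluster = 1;          /* cluster 0 is the header */
}

static int toy_header_intact(void)
{
    return memcmp(image, "QFI\xfb", 4) == 0;
}

/* Hypothetical allocator: returns a byte offset, or 0 on failure.
 * Offset 0 is also a valid place in the file, which is the trap. */
static uint64_t toy_alloc_cluster(void)
{
    if (next_free_cluster >= IMAGE_CLUSTERS)
        return 0;
    return (uint64_t)next_free_cluster++ * CLUSTER_SIZE;
}

/* Write a zeroed L2 table into a freshly allocated cluster.
 * With check_alloc == 0 the failure value is used as an offset. */
static int toy_new_l2_table(int check_alloc)
{
    uint64_t offset = toy_alloc_cluster();
    if (check_alloc && offset == 0)
        return -1;                  /* report the error, write nothing */
    memset(image + offset, 0, CLUSTER_SIZE);
    return 0;
}

int toy_demo(void)
{
    toy_format();
    assert(toy_new_l2_table(1) == 0);   /* uses cluster 1 */
    assert(toy_new_l2_table(1) == 0);   /* uses cluster 2, image now full */
    assert(toy_new_l2_table(1) == -1);  /* checked: fails cleanly */
    assert(toy_header_intact());
    assert(toy_new_l2_table(0) == 0);   /* unchecked: "succeeds" */
    assert(!toy_header_intact());       /* header overwritten with zeros */
    return 0;
}
```

The unchecked call reproduces the reported symptom in miniature: the first 4 kB of the image end up as an all-zero table. Whether this is what the kvm-73 and later code actually did is Marc's intuition, not something the thread confirms.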
* [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports 2009-02-13 17:00 ` Anthony Liguori @ 2009-02-13 19:04 ` Jamie Lokier 2009-02-14 22:23 ` Dor Laor 2009-02-14 23:13 ` Anthony Liguori 0 siblings, 2 replies; 82+ messages in thread From: Jamie Lokier @ 2009-02-13 19:04 UTC (permalink / raw) To: qemu-devel Anthony Liguori wrote: > >Simply reverting the qcow2 code appears to fix those problems, so it > >needn't hold up cutting a release. That's what I recommend. > > Send some patches. I did already. Here it is again. This should fix my bug and Marc's bug according to his report that reverting qcow2.c fixes it. -- Jamie Subject: Revert block-qcow2.c to kvm-72 version due to corruption reports This fixes two kinds of qcow2 corruption observed in kvm-83 (actually kvm-73 and later), from three bug reports. Bug 1: Windows 2000 guests complain of corrupt registry. Many Windows 2000 guests which boot and run fine in kvm-72 fail with a blue screen indicating file corruption errors in kvm-73 through to kvm-83 (the latest), and succeed if we replace block-qcow2.c with the version from kvm-72. The blue screen appears towards the end of the boot sequence, and shows only briefly before rebooting. It says: STOP: c0000218 (Registry File Failure) The registry cannot load the hive (file): \SystemRoot\System32\Config\SOFTWARE or its log or alternate. It is corrupt, absent, or not writable. Beginning dump of physical memory Physical memory dump complete. Contact your system administrator or technical support [...?] This is narrowed down to the difference in block-qcow2.c between kvm-72 and kvm-73 (not -83). From kvm-73 to kvm-83, there have been more changes to block-qcow2.c, but the observed corruption still occurs. The bug isn't evident when only reading. When using "qemu-img convert" to convert a qcow2 file to a raw file, with broken and fixed versions of block-qcow2.c it produces the same raw file. 
Also, when using "-snapshot" with qemu, the blue screen doesn't occur. This bug was observed by Jamie Lokier <jamie@shareable.org> and confirmed for multiple Windows 2000 guests by Marc Bevand <m.bevand@gmail.com>. Bug 2: Windows 2003 guests complain of corrupt registry. According to http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2001452&group_id=180599 Windows 2003 32-bit guests randomly spew disk corruption messages like this: Windows – Registry Hive Recovered Registry hive (file): SOFTWARE was corrupted and it has been recovered. Some data might have been lost. and The system cannot log on due to the following error: Unable to complete the requested operation because of either a catastrophic media failure or a data structure corruption on the disk. This bug was reported by <gerdwachs@users.sourceforge.net> and confirmed by Marc Bevand, noting: kvm-73+ also causes some of my Windows 2003 guests to exhibit this exact registry corruption error. [...] This bug is also fixed by reverting block-qcow2.c to the version from kvm-72. Worryingly, gerdwachs' bug report says it's for kvm-70, implying this patch may not fix all the Windows 2003 guest corruption problems. At least Marc says his observed problem goes away with kvm-72's qcow2. Bug 3: Corruption of qcow2 index rendering the file unusable. Marc Bevand writes: I tested kvm-81 and kvm-83 as well (can't test kvm-80 or older because of the qcow2 performance regression caused by the default writethrough caching policy) but it randomly triggers an even worse bug: the moment I shut down a guest by typing "quit" in the monitor, it sometimes overwrites the first 4kB of the disk image with mostly NUL bytes (!) which completely destroys it. I am familiar with the qcow2 format and apparently this 4kB block seems to be an L2 table with most entries set to zero. I have had to restore at least 6 or 7 disk images from backup after occurrences of that bug. 
My intuition tells me this may be the qcow2 code trying to allocate a cluster to write a new L2 table, but not noticing the allocation failed (represented by a 0 offset), and writing the L2 table at that 0 offset, overwriting the qcow2 header. Fortunately this bug is also fixed by running kvm-75 with block-qcow2.c reverted to its kvm-72 version. Basically qcow2 in kvm-73 or newer is completely unreliable. Reverting block-qcow2.c to the version in kvm-72 appears to fix the corruption symptoms reported by Marc and Jamie, although gerdwachs' related bug is against kvm-70 so it may not fix that. Unfortunately this reverts some optimisations, but fixing corruption is more important until the new code is reliable. This patch reverts block-qcow2.c in kvm-83 to the version in kvm-72, except the "cache=writeback" default performance tweak is retained and there's no need to define "offsetof". Signed-Off-By: Jamie Lokier <jamie@shareable.org> --- kvm-83-real/qemu/block-qcow2.c 2009-01-13 13:29:42.000000000 +0000 +++ kvm-83/qemu/block-qcow2.c 2009-02-13 18:51:12.000000000 +0000 @@ -52,8 +52,6 @@ #define QCOW_CRYPT_NONE 0 #define QCOW_CRYPT_AES 1 -#define QCOW_MAX_CRYPT_CLUSTERS 32 - /* indicate that the refcount of the referenced cluster is exactly one. 
*/ #define QCOW_OFLAG_COPIED (1LL << 63) /* indicate that the cluster is compressed (they never have the copied flag) */ @@ -269,8 +267,7 @@ if (!s->cluster_cache) goto fail; /* one more sector for decompressed data alignment */ - s->cluster_data = qemu_malloc(QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size - + 512); + s->cluster_data = qemu_malloc(s->cluster_size + 512); if (!s->cluster_data) goto fail; s->cluster_cache_offset = -1; @@ -437,7 +434,8 @@ int new_l1_size, new_l1_size2, ret, i; uint64_t *new_l1_table; uint64_t new_l1_table_offset; - uint8_t data[12]; + uint64_t data64; + uint32_t data32; new_l1_size = s->l1_size; if (min_size <= new_l1_size) @@ -467,10 +465,13 @@ new_l1_table[i] = be64_to_cpu(new_l1_table[i]); /* set new table */ - cpu_to_be32w((uint32_t*)data, new_l1_size); - cpu_to_be64w((uint64_t*)(data + 4), new_l1_table_offset); - if (bdrv_pwrite(s->hd, offsetof(QCowHeader, l1_size), data, - sizeof(data)) != sizeof(data)) + data64 = cpu_to_be64(new_l1_table_offset); + if (bdrv_pwrite(s->hd, offsetof(QCowHeader, l1_table_offset), + &data64, sizeof(data64)) != sizeof(data64)) + goto fail; + data32 = cpu_to_be32(new_l1_size); + if (bdrv_pwrite(s->hd, offsetof(QCowHeader, l1_size), + &data32, sizeof(data32)) != sizeof(data32)) goto fail; qemu_free(s->l1_table); free_clusters(bs, s->l1_table_offset, s->l1_size * sizeof(uint64_t)); @@ -483,549 +484,169 @@ return -EIO; } -/* - * seek_l2_table +/* 'allocate' is: * - * seek l2_offset in the l2_cache table - * if not found, return NULL, - * if found, - * increments the l2 cache hit count of the entry, - * if counter overflow, divide by two all counters - * return the pointer to the l2 cache entry + * 0 not to allocate. 
* - */ - -static uint64_t *seek_l2_table(BDRVQcowState *s, uint64_t l2_offset) -{ - int i, j; - - for(i = 0; i < L2_CACHE_SIZE; i++) { - if (l2_offset == s->l2_cache_offsets[i]) { - /* increment the hit count */ - if (++s->l2_cache_counts[i] == 0xffffffff) { - for(j = 0; j < L2_CACHE_SIZE; j++) { - s->l2_cache_counts[j] >>= 1; - } - } - return s->l2_cache + (i << s->l2_bits); - } - } - return NULL; -} - -/* - * l2_load + * 1 to allocate a normal cluster (for sector indexes 'n_start' to + * 'n_end') * - * Loads a L2 table into memory. If the table is in the cache, the cache - * is used; otherwise the L2 table is loaded from the image file. + * 2 to allocate a compressed cluster of size + * 'compressed_size'. 'compressed_size' must be > 0 and < + * cluster_size * - * Returns a pointer to the L2 table on success, or NULL if the read from - * the image file failed. + * return 0 if not allocated. */ - -static uint64_t *l2_load(BlockDriverState *bs, uint64_t l2_offset) -{ - BDRVQcowState *s = bs->opaque; - int min_index; - uint64_t *l2_table; - - /* seek if the table for the given offset is in the cache */ - - l2_table = seek_l2_table(s, l2_offset); - if (l2_table != NULL) - return l2_table; - - /* not found: load a new entry in the least used one */ - - min_index = l2_cache_new_entry(bs); - l2_table = s->l2_cache + (min_index << s->l2_bits); - if (bdrv_pread(s->hd, l2_offset, l2_table, s->l2_size * sizeof(uint64_t)) != - s->l2_size * sizeof(uint64_t)) - return NULL; - s->l2_cache_offsets[min_index] = l2_offset; - s->l2_cache_counts[min_index] = 1; - - return l2_table; -} - -/* - * l2_allocate - * - * Allocate a new l2 entry in the file. If l1_index points to an already - * used entry in the L2 table (i.e. we are doing a copy on write for the L2 - * table) copy the contents of the old L2 table into the newly allocated one. - * Otherwise the new table is initialized with zeros. 
- * - */ - -static uint64_t *l2_allocate(BlockDriverState *bs, int l1_index) -{ - BDRVQcowState *s = bs->opaque; - int min_index; - uint64_t old_l2_offset, tmp; - uint64_t *l2_table, l2_offset; - - old_l2_offset = s->l1_table[l1_index]; - - /* allocate a new l2 entry */ - - l2_offset = alloc_clusters(bs, s->l2_size * sizeof(uint64_t)); - - /* update the L1 entry */ - - s->l1_table[l1_index] = l2_offset | QCOW_OFLAG_COPIED; - - tmp = cpu_to_be64(l2_offset | QCOW_OFLAG_COPIED); - if (bdrv_pwrite(s->hd, s->l1_table_offset + l1_index * sizeof(tmp), - &tmp, sizeof(tmp)) != sizeof(tmp)) - return NULL; - - /* allocate a new entry in the l2 cache */ - - min_index = l2_cache_new_entry(bs); - l2_table = s->l2_cache + (min_index << s->l2_bits); - - if (old_l2_offset == 0) { - /* if there was no old l2 table, clear the new table */ - memset(l2_table, 0, s->l2_size * sizeof(uint64_t)); - } else { - /* if there was an old l2 table, read it from the disk */ - if (bdrv_pread(s->hd, old_l2_offset, - l2_table, s->l2_size * sizeof(uint64_t)) != - s->l2_size * sizeof(uint64_t)) - return NULL; - } - /* write the l2 table to the file */ - if (bdrv_pwrite(s->hd, l2_offset, - l2_table, s->l2_size * sizeof(uint64_t)) != - s->l2_size * sizeof(uint64_t)) - return NULL; - - /* update the l2 cache entry */ - - s->l2_cache_offsets[min_index] = l2_offset; - s->l2_cache_counts[min_index] = 1; - - return l2_table; -} - -static int size_to_clusters(BDRVQcowState *s, int64_t size) -{ - return (size + (s->cluster_size - 1)) >> s->cluster_bits; -} - -static int count_contiguous_clusters(uint64_t nb_clusters, int cluster_size, - uint64_t *l2_table, uint64_t start, uint64_t mask) -{ - int i; - uint64_t offset = be64_to_cpu(l2_table[0]) & ~mask; - - if (!offset) - return 0; - - for (i = start; i < start + nb_clusters; i++) - if (offset + i * cluster_size != (be64_to_cpu(l2_table[i]) & ~mask)) - break; - - return (i - start); -} - -static int count_contiguous_free_clusters(uint64_t nb_clusters, uint64_t 
*l2_table) -{ - int i = 0; - - while(nb_clusters-- && l2_table[i] == 0) - i++; - - return i; -} - -/* - * get_cluster_offset - * - * For a given offset of the disk image, return cluster offset in - * qcow2 file. - * - * on entry, *num is the number of contiguous clusters we'd like to - * access following offset. - * - * on exit, *num is the number of contiguous clusters we can read. - * - * Return 1, if the offset is found - * Return 0, otherwise. - * - */ - static uint64_t get_cluster_offset(BlockDriverState *bs, - uint64_t offset, int *num) -{ - BDRVQcowState *s = bs->opaque; - int l1_index, l2_index; - uint64_t l2_offset, *l2_table, cluster_offset; - int l1_bits, c; - int index_in_cluster, nb_available, nb_needed, nb_clusters; - - index_in_cluster = (offset >> 9) & (s->cluster_sectors - 1); - nb_needed = *num + index_in_cluster; - - l1_bits = s->l2_bits + s->cluster_bits; - - /* compute how many bytes there are between the offset and - * the end of the l1 entry - */ - - nb_available = (1 << l1_bits) - (offset & ((1 << l1_bits) - 1)); - - /* compute the number of available sectors */ - - nb_available = (nb_available >> 9) + index_in_cluster; - - cluster_offset = 0; - - /* seek the the l2 offset in the l1 table */ - - l1_index = offset >> l1_bits; - if (l1_index >= s->l1_size) - goto out; - - l2_offset = s->l1_table[l1_index]; - - /* seek the l2 table of the given l2 offset */ - - if (!l2_offset) - goto out; - - /* load the l2 table in memory */ - - l2_offset &= ~QCOW_OFLAG_COPIED; - l2_table = l2_load(bs, l2_offset); - if (l2_table == NULL) - return 0; - - /* find the cluster offset for the given disk offset */ - - l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1); - cluster_offset = be64_to_cpu(l2_table[l2_index]); - nb_clusters = size_to_clusters(s, nb_needed << 9); - - if (!cluster_offset) { - /* how many empty clusters ? */ - c = count_contiguous_free_clusters(nb_clusters, &l2_table[l2_index]); - } else { - /* how many allocated clusters ? 
*/ - c = count_contiguous_clusters(nb_clusters, s->cluster_size, - &l2_table[l2_index], 0, QCOW_OFLAG_COPIED); - } - - nb_available = (c * s->cluster_sectors); -out: - if (nb_available > nb_needed) - nb_available = nb_needed; - - *num = nb_available - index_in_cluster; - - return cluster_offset & ~QCOW_OFLAG_COPIED; -} - -/* - * free_any_clusters - * - * free clusters according to its type: compressed or not - * - */ - -static void free_any_clusters(BlockDriverState *bs, - uint64_t cluster_offset, int nb_clusters) -{ - BDRVQcowState *s = bs->opaque; - - /* free the cluster */ - - if (cluster_offset & QCOW_OFLAG_COMPRESSED) { - int nb_csectors; - nb_csectors = ((cluster_offset >> s->csize_shift) & - s->csize_mask) + 1; - free_clusters(bs, (cluster_offset & s->cluster_offset_mask) & ~511, - nb_csectors * 512); - return; - } - - free_clusters(bs, cluster_offset, nb_clusters << s->cluster_bits); - - return; -} - -/* - * get_cluster_table - * - * for a given disk offset, load (and allocate if needed) - * the l2 table. - * - * the l2 table offset in the qcow2 file and the cluster index - * in the l2 table are given to the caller. 
- * - */ - -static int get_cluster_table(BlockDriverState *bs, uint64_t offset, - uint64_t **new_l2_table, - uint64_t *new_l2_offset, - int *new_l2_index) + uint64_t offset, int allocate, + int compressed_size, + int n_start, int n_end) { BDRVQcowState *s = bs->opaque; - int l1_index, l2_index, ret; - uint64_t l2_offset, *l2_table; - - /* seek the the l2 offset in the l1 table */ + int min_index, i, j, l1_index, l2_index, ret; + uint64_t l2_offset, *l2_table, cluster_offset, tmp, old_l2_offset; l1_index = offset >> (s->l2_bits + s->cluster_bits); if (l1_index >= s->l1_size) { - ret = grow_l1_table(bs, l1_index + 1); - if (ret < 0) + /* outside l1 table is allowed: we grow the table if needed */ + if (!allocate) + return 0; + if (grow_l1_table(bs, l1_index + 1) < 0) return 0; } l2_offset = s->l1_table[l1_index]; + if (!l2_offset) { + if (!allocate) + return 0; + l2_allocate: + old_l2_offset = l2_offset; + /* allocate a new l2 entry */ + l2_offset = alloc_clusters(bs, s->l2_size * sizeof(uint64_t)); + /* update the L1 entry */ + s->l1_table[l1_index] = l2_offset | QCOW_OFLAG_COPIED; + tmp = cpu_to_be64(l2_offset | QCOW_OFLAG_COPIED); + if (bdrv_pwrite(s->hd, s->l1_table_offset + l1_index * sizeof(tmp), + &tmp, sizeof(tmp)) != sizeof(tmp)) + return 0; + min_index = l2_cache_new_entry(bs); + l2_table = s->l2_cache + (min_index << s->l2_bits); - /* seek the l2 table of the given l2 offset */ - - if (l2_offset & QCOW_OFLAG_COPIED) { - /* load the l2 table in memory */ - l2_offset &= ~QCOW_OFLAG_COPIED; - l2_table = l2_load(bs, l2_offset); - if (l2_table == NULL) + if (old_l2_offset == 0) { + memset(l2_table, 0, s->l2_size * sizeof(uint64_t)); + } else { + if (bdrv_pread(s->hd, old_l2_offset, + l2_table, s->l2_size * sizeof(uint64_t)) != + s->l2_size * sizeof(uint64_t)) + return 0; + } + if (bdrv_pwrite(s->hd, l2_offset, + l2_table, s->l2_size * sizeof(uint64_t)) != + s->l2_size * sizeof(uint64_t)) return 0; } else { - if (l2_offset) - free_clusters(bs, l2_offset, 
s->l2_size * sizeof(uint64_t)); - l2_table = l2_allocate(bs, l1_index); - if (l2_table == NULL) + if (!(l2_offset & QCOW_OFLAG_COPIED)) { + if (allocate) { + free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t)); + goto l2_allocate; + } + } else { + l2_offset &= ~QCOW_OFLAG_COPIED; + } + for(i = 0; i < L2_CACHE_SIZE; i++) { + if (l2_offset == s->l2_cache_offsets[i]) { + /* increment the hit count */ + if (++s->l2_cache_counts[i] == 0xffffffff) { + for(j = 0; j < L2_CACHE_SIZE; j++) { + s->l2_cache_counts[j] >>= 1; + } + } + l2_table = s->l2_cache + (i << s->l2_bits); + goto found; + } + } + /* not found: load a new entry in the least used one */ + min_index = l2_cache_new_entry(bs); + l2_table = s->l2_cache + (min_index << s->l2_bits); + if (bdrv_pread(s->hd, l2_offset, l2_table, s->l2_size * sizeof(uint64_t)) != + s->l2_size * sizeof(uint64_t)) return 0; - l2_offset = s->l1_table[l1_index] & ~QCOW_OFLAG_COPIED; } - - /* find the cluster offset for the given disk offset */ - + s->l2_cache_offsets[min_index] = l2_offset; + s->l2_cache_counts[min_index] = 1; + found: l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1); - - *new_l2_table = l2_table; - *new_l2_offset = l2_offset; - *new_l2_index = l2_index; - - return 1; -} - -/* - * alloc_compressed_cluster_offset - * - * For a given offset of the disk image, return cluster offset in - * qcow2 file. - * - * If the offset is not found, allocate a new compressed cluster. - * - * Return the cluster offset if successful, - * Return 0, otherwise. 
- * - */ - -static uint64_t alloc_compressed_cluster_offset(BlockDriverState *bs, - uint64_t offset, - int compressed_size) -{ - BDRVQcowState *s = bs->opaque; - int l2_index, ret; - uint64_t l2_offset, *l2_table, cluster_offset; - int nb_csectors; - - ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index); - if (ret == 0) - return 0; - cluster_offset = be64_to_cpu(l2_table[l2_index]); - if (cluster_offset & QCOW_OFLAG_COPIED) - return cluster_offset & ~QCOW_OFLAG_COPIED; - - if (cluster_offset) - free_any_clusters(bs, cluster_offset, 1); - - cluster_offset = alloc_bytes(bs, compressed_size); - nb_csectors = ((cluster_offset + compressed_size - 1) >> 9) - - (cluster_offset >> 9); - - cluster_offset |= QCOW_OFLAG_COMPRESSED | - ((uint64_t)nb_csectors << s->csize_shift); - - /* update L2 table */ - - /* compressed clusters never have the copied flag */ - - l2_table[l2_index] = cpu_to_be64(cluster_offset); - if (bdrv_pwrite(s->hd, - l2_offset + l2_index * sizeof(uint64_t), - l2_table + l2_index, - sizeof(uint64_t)) != sizeof(uint64_t)) - return 0; - - return cluster_offset; -} - -typedef struct QCowL2Meta -{ - uint64_t offset; - int n_start; - int nb_available; - int nb_clusters; -} QCowL2Meta; - -static int alloc_cluster_link_l2(BlockDriverState *bs, uint64_t cluster_offset, - QCowL2Meta *m) -{ - BDRVQcowState *s = bs->opaque; - int i, j = 0, l2_index, ret; - uint64_t *old_cluster, start_sect, l2_offset, *l2_table; - - if (m->nb_clusters == 0) - return 0; - - if (!(old_cluster = qemu_malloc(m->nb_clusters * sizeof(uint64_t)))) - return -ENOMEM; - - /* copy content of unmodified sectors */ - start_sect = (m->offset & ~(s->cluster_size - 1)) >> 9; - if (m->n_start) { - ret = copy_sectors(bs, start_sect, cluster_offset, 0, m->n_start); - if (ret < 0) - goto err; + if (!cluster_offset) { + if (!allocate) + return cluster_offset; + } else if (!(cluster_offset & QCOW_OFLAG_COPIED)) { + if (!allocate) + return cluster_offset; + /* free the cluster */ + if 
(cluster_offset & QCOW_OFLAG_COMPRESSED) { + int nb_csectors; + nb_csectors = ((cluster_offset >> s->csize_shift) & + s->csize_mask) + 1; + free_clusters(bs, (cluster_offset & s->cluster_offset_mask) & ~511, + nb_csectors * 512); + } else { + free_clusters(bs, cluster_offset, s->cluster_size); + } + } else { + cluster_offset &= ~QCOW_OFLAG_COPIED; + return cluster_offset; } - - if (m->nb_available & (s->cluster_sectors - 1)) { - uint64_t end = m->nb_available & ~(uint64_t)(s->cluster_sectors - 1); - ret = copy_sectors(bs, start_sect + end, cluster_offset + (end << 9), - m->nb_available - end, s->cluster_sectors); - if (ret < 0) - goto err; + if (allocate == 1) { + /* allocate a new cluster */ + cluster_offset = alloc_clusters(bs, s->cluster_size); + + /* we must initialize the cluster content which won't be + written */ + if ((n_end - n_start) < s->cluster_sectors) { + uint64_t start_sect; + + start_sect = (offset & ~(s->cluster_size - 1)) >> 9; + ret = copy_sectors(bs, start_sect, + cluster_offset, 0, n_start); + if (ret < 0) + return 0; + ret = copy_sectors(bs, start_sect, + cluster_offset, n_end, s->cluster_sectors); + if (ret < 0) + return 0; + } + tmp = cpu_to_be64(cluster_offset | QCOW_OFLAG_COPIED); + } else { + int nb_csectors; + cluster_offset = alloc_bytes(bs, compressed_size); + nb_csectors = ((cluster_offset + compressed_size - 1) >> 9) - + (cluster_offset >> 9); + cluster_offset |= QCOW_OFLAG_COMPRESSED | + ((uint64_t)nb_csectors << s->csize_shift); + /* compressed clusters never have the copied flag */ + tmp = cpu_to_be64(cluster_offset); } - - ret = -EIO; /* update L2 table */ - if (!get_cluster_table(bs, m->offset, &l2_table, &l2_offset, &l2_index)) - goto err; - - for (i = 0; i < m->nb_clusters; i++) { - if(l2_table[l2_index + i] != 0) - old_cluster[j++] = l2_table[l2_index + i]; - - l2_table[l2_index + i] = cpu_to_be64((cluster_offset + - (i << s->cluster_bits)) | QCOW_OFLAG_COPIED); - } - - if (bdrv_pwrite(s->hd, l2_offset + l2_index * 
sizeof(uint64_t), - l2_table + l2_index, m->nb_clusters * sizeof(uint64_t)) != - m->nb_clusters * sizeof(uint64_t)) - goto err; - - for (i = 0; i < j; i++) - free_any_clusters(bs, old_cluster[i], 1); - - ret = 0; -err: - qemu_free(old_cluster); - return ret; - } - -/* - * alloc_cluster_offset - * - * For a given offset of the disk image, return cluster offset in - * qcow2 file. - * - * If the offset is not found, allocate a new cluster. - * - * Return the cluster offset if successful, - * Return 0, otherwise. - * - */ - -static uint64_t alloc_cluster_offset(BlockDriverState *bs, - uint64_t offset, - int n_start, int n_end, - int *num, QCowL2Meta *m) -{ - BDRVQcowState *s = bs->opaque; - int l2_index, ret; - uint64_t l2_offset, *l2_table, cluster_offset; - int nb_clusters, i = 0; - - ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index); - if (ret == 0) + l2_table[l2_index] = tmp; + if (bdrv_pwrite(s->hd, + l2_offset + l2_index * sizeof(tmp), &tmp, sizeof(tmp)) != sizeof(tmp)) return 0; - - nb_clusters = size_to_clusters(s, n_end << 9); - - nb_clusters = MIN(nb_clusters, s->l2_size - l2_index); - - cluster_offset = be64_to_cpu(l2_table[l2_index]); - - /* We keep all QCOW_OFLAG_COPIED clusters */ - - if (cluster_offset & QCOW_OFLAG_COPIED) { - nb_clusters = count_contiguous_clusters(nb_clusters, s->cluster_size, - &l2_table[l2_index], 0, 0); - - cluster_offset &= ~QCOW_OFLAG_COPIED; - m->nb_clusters = 0; - - goto out; - } - - /* for the moment, multiple compressed clusters are not managed */ - - if (cluster_offset & QCOW_OFLAG_COMPRESSED) - nb_clusters = 1; - - /* how many available clusters ? 
*/ - - while (i < nb_clusters) { - i += count_contiguous_clusters(nb_clusters - i, s->cluster_size, - &l2_table[l2_index], i, 0); - - if(be64_to_cpu(l2_table[l2_index + i])) - break; - - i += count_contiguous_free_clusters(nb_clusters - i, - &l2_table[l2_index + i]); - - cluster_offset = be64_to_cpu(l2_table[l2_index + i]); - - if ((cluster_offset & QCOW_OFLAG_COPIED) || - (cluster_offset & QCOW_OFLAG_COMPRESSED)) - break; - } - nb_clusters = i; - - /* allocate a new cluster */ - - cluster_offset = alloc_clusters(bs, nb_clusters * s->cluster_size); - - /* save info needed for meta data update */ - m->offset = offset; - m->n_start = n_start; - m->nb_clusters = nb_clusters; - -out: - m->nb_available = MIN(nb_clusters << (s->cluster_bits - 9), n_end); - - *num = m->nb_available - n_start; - return cluster_offset; } static int qcow_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum) { + BDRVQcowState *s = bs->opaque; + int index_in_cluster, n; uint64_t cluster_offset; - *pnum = nb_sectors; - cluster_offset = get_cluster_offset(bs, sector_num << 9, pnum); - + cluster_offset = get_cluster_offset(bs, sector_num << 9, 0, 0, 0, 0); + index_in_cluster = sector_num & (s->cluster_sectors - 1); + n = s->cluster_sectors - index_in_cluster; + if (n > nb_sectors) + n = nb_sectors; + *pnum = n; return (cluster_offset != 0); } @@ -1102,9 +723,11 @@ uint64_t cluster_offset; while (nb_sectors > 0) { - n = nb_sectors; - cluster_offset = get_cluster_offset(bs, sector_num << 9, &n); + cluster_offset = get_cluster_offset(bs, sector_num << 9, 0, 0, 0, 0); index_in_cluster = sector_num & (s->cluster_sectors - 1); + n = s->cluster_sectors - index_in_cluster; + if (n > nb_sectors) + n = nb_sectors; if (!cluster_offset) { if (bs->backing_hd) { /* read from the base image */ @@ -1143,18 +766,15 @@ BDRVQcowState *s = bs->opaque; int ret, index_in_cluster, n; uint64_t cluster_offset; - int n_end; - QCowL2Meta l2meta; while (nb_sectors > 0) { index_in_cluster = 
sector_num & (s->cluster_sectors - 1); - n_end = index_in_cluster + nb_sectors; - if (s->crypt_method && - n_end > QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors) - n_end = QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors; - cluster_offset = alloc_cluster_offset(bs, sector_num << 9, - index_in_cluster, - n_end, &n, &l2meta); + n = s->cluster_sectors - index_in_cluster; + if (n > nb_sectors) + n = nb_sectors; + cluster_offset = get_cluster_offset(bs, sector_num << 9, 1, 0, + index_in_cluster, + index_in_cluster + n); if (!cluster_offset) return -1; if (s->crypt_method) { @@ -1165,10 +785,8 @@ } else { ret = bdrv_pwrite(s->hd, cluster_offset + index_in_cluster * 512, buf, n * 512); } - if (ret != n * 512 || alloc_cluster_link_l2(bs, cluster_offset, &l2meta) < 0) { - free_any_clusters(bs, cluster_offset, l2meta.nb_clusters); + if (ret != n * 512) return -1; - } nb_sectors -= n; sector_num += n; buf += n * 512; @@ -1186,33 +804,8 @@ uint64_t cluster_offset; uint8_t *cluster_data; BlockDriverAIOCB *hd_aiocb; - QEMUBH *bh; - QCowL2Meta l2meta; } QCowAIOCB; -static void qcow_aio_read_cb(void *opaque, int ret); -static void qcow_aio_read_bh(void *opaque) -{ - QCowAIOCB *acb = opaque; - qemu_bh_delete(acb->bh); - acb->bh = NULL; - qcow_aio_read_cb(opaque, 0); -} - -static int qcow_schedule_bh(QEMUBHFunc *cb, QCowAIOCB *acb) -{ - if (acb->bh) - return -EIO; - - acb->bh = qemu_bh_new(cb, acb); - if (!acb->bh) - return -EIO; - - qemu_bh_schedule(acb->bh); - - return 0; -} - static void qcow_aio_read_cb(void *opaque, int ret) { QCowAIOCB *acb = opaque; @@ -1222,12 +815,13 @@ acb->hd_aiocb = NULL; if (ret < 0) { -fail: + fail: acb->common.cb(acb->common.opaque, ret); qemu_aio_release(acb); return; } + redo: /* post process the read buffer */ if (!acb->cluster_offset) { /* nothing to do */ @@ -1253,9 +847,12 @@ } /* prepare next AIO request */ - acb->n = acb->nb_sectors; - acb->cluster_offset = get_cluster_offset(bs, acb->sector_num << 9, &acb->n); + acb->cluster_offset = 
get_cluster_offset(bs, acb->sector_num << 9, + 0, 0, 0, 0); index_in_cluster = acb->sector_num & (s->cluster_sectors - 1); + acb->n = s->cluster_sectors - index_in_cluster; + if (acb->n > acb->nb_sectors) + acb->n = acb->nb_sectors; if (!acb->cluster_offset) { if (bs->backing_hd) { @@ -1268,16 +865,12 @@ if (acb->hd_aiocb == NULL) goto fail; } else { - ret = qcow_schedule_bh(qcow_aio_read_bh, acb); - if (ret < 0) - goto fail; + goto redo; } } else { /* Note: in this case, no need to wait */ memset(acb->buf, 0, 512 * acb->n); - ret = qcow_schedule_bh(qcow_aio_read_bh, acb); - if (ret < 0) - goto fail; + goto redo; } } else if (acb->cluster_offset & QCOW_OFLAG_COMPRESSED) { /* add AIO support for compressed blocks ? */ @@ -1285,9 +878,7 @@ goto fail; memcpy(acb->buf, s->cluster_cache + index_in_cluster * 512, 512 * acb->n); - ret = qcow_schedule_bh(qcow_aio_read_bh, acb); - if (ret < 0) - goto fail; + goto redo; } else { if ((acb->cluster_offset & 511) != 0) { ret = -EIO; @@ -1316,7 +907,6 @@ acb->nb_sectors = nb_sectors; acb->n = 0; acb->cluster_offset = 0; - acb->l2meta.nb_clusters = 0; return acb; } @@ -1340,8 +930,8 @@ BlockDriverState *bs = acb->common.bs; BDRVQcowState *s = bs->opaque; int index_in_cluster; + uint64_t cluster_offset; const uint8_t *src_buf; - int n_end; acb->hd_aiocb = NULL; @@ -1352,11 +942,6 @@ return; } - if (alloc_cluster_link_l2(bs, acb->cluster_offset, &acb->l2meta) < 0) { - free_any_clusters(bs, acb->cluster_offset, acb->l2meta.nb_clusters); - goto fail; - } - acb->nb_sectors -= acb->n; acb->sector_num += acb->n; acb->buf += acb->n * 512; @@ -1369,22 +954,19 @@ } index_in_cluster = acb->sector_num & (s->cluster_sectors - 1); - n_end = index_in_cluster + acb->nb_sectors; - if (s->crypt_method && - n_end > QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors) - n_end = QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors; - - acb->cluster_offset = alloc_cluster_offset(bs, acb->sector_num << 9, - index_in_cluster, - n_end, &acb->n, &acb->l2meta); - if 
(!acb->cluster_offset || (acb->cluster_offset & 511) != 0) { + acb->n = s->cluster_sectors - index_in_cluster; + if (acb->n > acb->nb_sectors) + acb->n = acb->nb_sectors; + cluster_offset = get_cluster_offset(bs, acb->sector_num << 9, 1, 0, + index_in_cluster, + index_in_cluster + acb->n); + if (!cluster_offset || (cluster_offset & 511) != 0) { ret = -EIO; goto fail; } if (s->crypt_method) { if (!acb->cluster_data) { - acb->cluster_data = qemu_mallocz(QCOW_MAX_CRYPT_CLUSTERS * - s->cluster_size); + acb->cluster_data = qemu_mallocz(s->cluster_size); if (!acb->cluster_data) { ret = -ENOMEM; goto fail; @@ -1397,7 +979,7 @@ src_buf = acb->buf; } acb->hd_aiocb = bdrv_aio_write(s->hd, - (acb->cluster_offset >> 9) + index_in_cluster, + (cluster_offset >> 9) + index_in_cluster, src_buf, acb->n, qcow_aio_write_cb, acb); if (acb->hd_aiocb == NULL) @@ -1571,7 +1153,7 @@ memset(s->l1_table, 0, l1_length); if (bdrv_pwrite(s->hd, s->l1_table_offset, s->l1_table, l1_length) < 0) - return -1; + return -1; ret = bdrv_truncate(s->hd, s->l1_table_offset + l1_length); if (ret < 0) return ret; @@ -1637,10 +1219,8 @@ /* could not compress: write normal cluster */ qcow_write(bs, sector_num, buf, s->cluster_sectors); } else { - cluster_offset = alloc_compressed_cluster_offset(bs, sector_num << 9, - out_len); - if (!cluster_offset) - return -1; + cluster_offset = get_cluster_offset(bs, sector_num << 9, 2, + out_len, 0, 0); cluster_offset &= s->cluster_offset_mask; if (bdrv_pwrite(s->hd, cluster_offset, out_buf, out_len) != out_len) { qemu_free(out_buf); @@ -2225,19 +1805,26 @@ BDRVQcowState *s = bs->opaque; int i, nb_clusters; - nb_clusters = size_to_clusters(s, size); -retry: - for(i = 0; i < nb_clusters; i++) { - int64_t i = s->free_cluster_index++; - if (get_refcount(bs, i) != 0) - goto retry; - } + nb_clusters = (size + s->cluster_size - 1) >> s->cluster_bits; + for(;;) { + if (get_refcount(bs, s->free_cluster_index) == 0) { + s->free_cluster_index++; + for(i = 1; i < nb_clusters; i++) 
{ + if (get_refcount(bs, s->free_cluster_index) != 0) + goto not_found; + s->free_cluster_index++; + } #ifdef DEBUG_ALLOC2 - printf("alloc_clusters: size=%lld -> %lld\n", - size, - (s->free_cluster_index - nb_clusters) << s->cluster_bits); + printf("alloc_clusters: size=%lld -> %lld\n", + size, + (s->free_cluster_index - nb_clusters) << s->cluster_bits); #endif - return (s->free_cluster_index - nb_clusters) << s->cluster_bits; + return (s->free_cluster_index - nb_clusters) << s->cluster_bits; + } else { + not_found: + s->free_cluster_index++; + } + } } static int64_t alloc_clusters(BlockDriverState *bs, int64_t size) @@ -2301,7 +1888,8 @@ int new_table_size, new_table_size2, refcount_table_clusters, i, ret; uint64_t *new_table; int64_t table_offset; - uint8_t data[12]; + uint64_t data64; + uint32_t data32; int old_table_size; int64_t old_table_offset; @@ -2340,10 +1928,13 @@ for(i = 0; i < s->refcount_table_size; i++) be64_to_cpus(&new_table[i]); - cpu_to_be64w((uint64_t*)data, table_offset); - cpu_to_be32w((uint32_t*)(data + 8), refcount_table_clusters); + data64 = cpu_to_be64(table_offset); if (bdrv_pwrite(s->hd, offsetof(QCowHeader, refcount_table_offset), - data, sizeof(data)) != sizeof(data)) + &data64, sizeof(data64)) != sizeof(data64)) + goto fail; + data32 = cpu_to_be32(refcount_table_clusters); + if (bdrv_pwrite(s->hd, offsetof(QCowHeader, refcount_table_clusters), + &data32, sizeof(data32)) != sizeof(data32)) goto fail; qemu_free(s->refcount_table); old_table_offset = s->refcount_table_offset; @@ -2572,7 +2163,7 @@ uint16_t *refcount_table; size = bdrv_getlength(s->hd); - nb_clusters = size_to_clusters(s, size); + nb_clusters = (size + s->cluster_size - 1) >> s->cluster_bits; refcount_table = qemu_mallocz(nb_clusters * sizeof(uint16_t)); /* header */ @@ -2624,7 +2215,7 @@ int refcount; size = bdrv_getlength(s->hd); - nb_clusters = size_to_clusters(s, size); + nb_clusters = (size + s->cluster_size - 1) >> s->cluster_bits; for(k = 0; k < nb_clusters;) { k1 
= k; refcount = get_refcount(bs, k); ^ permalink raw reply [flat|nested] 82+ messages in thread
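The reverted read and write paths in the patch above handle one cluster per loop iteration, clamping each step so it never crosses a cluster boundary. A minimal standalone sketch of that clamping follows. It is an illustration only, not code from block-qcow2.c: the helper names `sectors_in_cluster` and `count_steps` are hypothetical, and `cluster_sectors` is assumed to be a power of two, as the masking in the patch implies.

```c
#include <stdint.h>

/* Hypothetical helper (not in block-qcow2.c): how many sectors can be
 * processed starting at sector_num without crossing into the next cluster.
 * Assumes cluster_sectors is a power of two. */
static int sectors_in_cluster(int64_t sector_num, int nb_sectors,
                              int cluster_sectors)
{
    /* position of the first sector inside its cluster */
    int index_in_cluster = (int)(sector_num & (cluster_sectors - 1));
    /* sectors left until the end of this cluster */
    int n = cluster_sectors - index_in_cluster;

    if (n > nb_sectors)
        n = nb_sectors;
    return n;
}

/* Hypothetical helper: number of per-cluster iterations a request needs,
 * mirroring the shape of the while (nb_sectors > 0) loops in the patch. */
static int count_steps(int64_t sector_num, int nb_sectors,
                       int cluster_sectors)
{
    int steps = 0;

    while (nb_sectors > 0) {
        int n = sectors_in_cluster(sector_num, nb_sectors, cluster_sectors);
        nb_sectors -= n;
        sector_num += n;
        steps++;
    }
    return steps;
}
```

With 128 sectors per cluster, a 200-sector request starting at sector 130 is split into a 126-sector step followed by a 74-sector step. The multi-cluster allocation that the patch removes would presumably have been able to cover such a request in fewer steps.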
* Re: [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports 2009-02-13 19:04 ` [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports Jamie Lokier @ 2009-02-14 22:23 ` Dor Laor 2009-02-15 2:20 ` Jamie Lokier 2009-02-14 23:13 ` Anthony Liguori 1 sibling, 1 reply; 82+ messages in thread From: Dor Laor @ 2009-02-14 22:23 UTC (permalink / raw) To: qemu-devel Jamie Lokier wrote: > Anthony Liguori wrote: > >>> Simply reverting the qcow2 code appears to fix those problems, so it >>> needn't hold up cutting a release. That's what I recommend. >>> >> Send some patches. >> > > I did already. > > Here it is again. This should fix my bug and Marc's bug according to > his report that reverting qcow2.c fixes it. > Going back to kvm-72 is not good either. First, there were qcow2 corruptions before it; they were very rare, but they still exist. Not long ago we did not even know that qcow2 was at fault. In addition, Gleb fixed the ordering of some qcow2 metadata writes. We need to keep those fixes in. The solution is to find the real cause of the corruption. > -- Jamie > > > Subject: Revert block-qcow2.c to kvm-72 version due to corruption reports > > This fixes two kinds of qcow2 corruption observed in kvm-83 (actually > kvm-73 and later), from three bug reports. > > > Bug 1: Windows 2000 guests complain of corrupt registry. > > Many Windows 2000 guests which boot and run fine in kvm-72 fail with > a blue-screen indicating file corruption errors in kvm-73 through to > kvm-83 (the latest), and succeed if we replace block-qcow2.c with the > version from kvm-72. > > The blue screen appears towards the end of the boot sequence, and > shows only briefly before rebooting. It says: > > STOP: c0000218 (Registry File Failure) > The registry cannot load the hive (file): > \SystemRoot\System32\Config\SOFTWARE > or its log or alternate. > It is corrupt, absent, or not writable. 
> > Beginning dump of physical memory > Physical memory dump complete. Contact your system administrator or > technical support [...?] > > This is narrowed down to the difference in block-qcow2.c between > kvm-72 and kvm-73 (not -83). From kvm-73 to kvm-83, there have been > more changes to block-qcow2.c, but the observed corruption still occurs. > > The bug isn't evident when only reading. When using "qemu-img > convert" to convert a qcow2 file to a raw file, the broken and fixed > versions of block-qcow2.c produce the same raw file. Also, when > using "-snapshot" with qemu, the blue screen doesn't occur. > > This bug was observed by Jamie Lokier <jamie@shareable.org> and > confirmed for multiple Windows 2000 guests by > Marc Bevand <m.bevand@gmail.com>. > > > Bug 2: Windows 2003 guests complain of corrupt registry. > > According to > http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2001452&group_id=180599 > > Windows 2003 32-bit guests randomly spew disk corruption messages > like this: > > Windows – Registry Hive Recovered > Registry hive (file): SOFTWARE was corrupted and it has > been recovered. Some data might have been lost. > > and > > The system cannot log on due to the following error: > Unable to complete the requested operation because of > either a catastrophic media failure or a data structure > corruption on the disk. > > This bug was reported by <gerdwachs@users.sourceforge.net> and > confirmed by Marc Bevand, noting: > > kvm-73+ also causes some of my Windows 2003 guests to exhibit this > exact registry corruption error. [...] This bug is also fixed by > reverting block-qcow2.c to the version from kvm-72. > > Worryingly, gerdwachs' bug report says it's for kvm-70, implying this > patch may not fix all the Windows 2003 guest corruption problems. > > At least Marc says his observed problem goes away with kvm-72's qcow2. > > > Bug 3: Corruption of qcow2 index rendering the file unusable. 
> > Marc Bevand writes: > > I tested kvm-81 and kvm-83 as well (can't test kvm-80 or older because > of the qcow2 performance regression caused by the default writethrough > caching policy) but it randomly triggers an even worse bug: the moment > I shut down a guest by typing "quit" in the monitor, it sometimes > overwrites the first 4kB of the disk image with mostly NUL bytes (!) > which completely destroys it. I am familiar with the qcow2 format and > apparently this 4kB block seems to be an L2 table with most entries > set to zero. I have had to restore at least 6 or 7 disk images from > backup after occurrences of that bug. My intuition tells me this may be > the qcow2 code trying to allocate a cluster to write a new L2 table, > but not noticing the allocation failed (represented by a 0 offset), > and writing the L2 table at that 0 offset, overwriting the qcow2 > header. > > Fortunately this bug is also fixed by running kvm-75 with > block-qcow2.c reverted to its kvm-72 version. > > Basically qcow2 in kvm-73 or newer is completely unreliable. > > > Reverting block-qcow2.c to the version in kvm-72 appears to fix the > corruption symptoms reported by Marc and Jamie, although gerdwachs' > related bug is against kvm-70 so it may not fix that. > > Unfortunately this reverts some optimisations, but fixing corruption > is more important until the new code is reliable. > > This patch reverts block-qcow2.c in kvm-83 to the version in kvm-72, > except the "cache=writeback" default performance tweak is retained and > there's no need to define "offsetof". > > Signed-Off-By: Jamie Lokier <jamie@shareable.org> > > > --- kvm-83-real/qemu/block-qcow2.c 2009-01-13 13:29:42.000000000 +0000 > +++ kvm-83/qemu/block-qcow2.c 2009-02-13 18:51:12.000000000 +0000 > @@ -52,8 +52,6 @@ > #define QCOW_CRYPT_NONE 0 > #define QCOW_CRYPT_AES 1 > > -#define QCOW_MAX_CRYPT_CLUSTERS 32 > - > /* indicate that the refcount of the referenced cluster is exactly one. 
*/ > #define QCOW_OFLAG_COPIED (1LL << 63) > /* indicate that the cluster is compressed (they never have the copied flag) */ > @@ -269,8 +267,7 @@ > if (!s->cluster_cache) > goto fail; > /* one more sector for decompressed data alignment */ > - s->cluster_data = qemu_malloc(QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size > - + 512); > + s->cluster_data = qemu_malloc(s->cluster_size + 512); > if (!s->cluster_data) > goto fail; > s->cluster_cache_offset = -1; > @@ -437,7 +434,8 @@ > int new_l1_size, new_l1_size2, ret, i; > uint64_t *new_l1_table; > uint64_t new_l1_table_offset; > - uint8_t data[12]; > + uint64_t data64; > + uint32_t data32; > > new_l1_size = s->l1_size; > if (min_size <= new_l1_size) > @@ -467,10 +465,13 @@ > new_l1_table[i] = be64_to_cpu(new_l1_table[i]); > > /* set new table */ > - cpu_to_be32w((uint32_t*)data, new_l1_size); > - cpu_to_be64w((uint64_t*)(data + 4), new_l1_table_offset); > - if (bdrv_pwrite(s->hd, offsetof(QCowHeader, l1_size), data, > - sizeof(data)) != sizeof(data)) > + data64 = cpu_to_be64(new_l1_table_offset); > + if (bdrv_pwrite(s->hd, offsetof(QCowHeader, l1_table_offset), > + &data64, sizeof(data64)) != sizeof(data64)) > + goto fail; > + data32 = cpu_to_be32(new_l1_size); > + if (bdrv_pwrite(s->hd, offsetof(QCowHeader, l1_size), > + &data32, sizeof(data32)) != sizeof(data32)) > goto fail; > qemu_free(s->l1_table); > free_clusters(bs, s->l1_table_offset, s->l1_size * sizeof(uint64_t)); > @@ -483,549 +484,169 @@ > return -EIO; > } > > -/* > - * seek_l2_table > +/* 'allocate' is: > * > - * seek l2_offset in the l2_cache table > - * if not found, return NULL, > - * if found, > - * increments the l2 cache hit count of the entry, > - * if counter overflow, divide by two all counters > - * return the pointer to the l2 cache entry > + * 0 not to allocate. 
> * > - */ > - > -static uint64_t *seek_l2_table(BDRVQcowState *s, uint64_t l2_offset) > -{ > - int i, j; > - > - for(i = 0; i < L2_CACHE_SIZE; i++) { > - if (l2_offset == s->l2_cache_offsets[i]) { > - /* increment the hit count */ > - if (++s->l2_cache_counts[i] == 0xffffffff) { > - for(j = 0; j < L2_CACHE_SIZE; j++) { > - s->l2_cache_counts[j] >>= 1; > - } > - } > - return s->l2_cache + (i << s->l2_bits); > - } > - } > - return NULL; > -} > - > -/* > - * l2_load > + * 1 to allocate a normal cluster (for sector indexes 'n_start' to > + * 'n_end') > * > - * Loads a L2 table into memory. If the table is in the cache, the cache > - * is used; otherwise the L2 table is loaded from the image file. > + * 2 to allocate a compressed cluster of size > + * 'compressed_size'. 'compressed_size' must be > 0 and < > + * cluster_size > * > - * Returns a pointer to the L2 table on success, or NULL if the read from > - * the image file failed. > + * return 0 if not allocated. > */ > - > -static uint64_t *l2_load(BlockDriverState *bs, uint64_t l2_offset) > -{ > - BDRVQcowState *s = bs->opaque; > - int min_index; > - uint64_t *l2_table; > - > - /* seek if the table for the given offset is in the cache */ > - > - l2_table = seek_l2_table(s, l2_offset); > - if (l2_table != NULL) > - return l2_table; > - > - /* not found: load a new entry in the least used one */ > - > - min_index = l2_cache_new_entry(bs); > - l2_table = s->l2_cache + (min_index << s->l2_bits); > - if (bdrv_pread(s->hd, l2_offset, l2_table, s->l2_size * sizeof(uint64_t)) != > - s->l2_size * sizeof(uint64_t)) > - return NULL; > - s->l2_cache_offsets[min_index] = l2_offset; > - s->l2_cache_counts[min_index] = 1; > - > - return l2_table; > -} > - > -/* > - * l2_allocate > - * > - * Allocate a new l2 entry in the file. If l1_index points to an already > - * used entry in the L2 table (i.e. we are doing a copy on write for the L2 > - * table) copy the contents of the old L2 table into the newly allocated one. 
> - * Otherwise the new table is initialized with zeros. > - * > - */ > - > -static uint64_t *l2_allocate(BlockDriverState *bs, int l1_index) > -{ > - BDRVQcowState *s = bs->opaque; > - int min_index; > - uint64_t old_l2_offset, tmp; > - uint64_t *l2_table, l2_offset; > - > - old_l2_offset = s->l1_table[l1_index]; > - > - /* allocate a new l2 entry */ > - > - l2_offset = alloc_clusters(bs, s->l2_size * sizeof(uint64_t)); > - > - /* update the L1 entry */ > - > - s->l1_table[l1_index] = l2_offset | QCOW_OFLAG_COPIED; > - > - tmp = cpu_to_be64(l2_offset | QCOW_OFLAG_COPIED); > - if (bdrv_pwrite(s->hd, s->l1_table_offset + l1_index * sizeof(tmp), > - &tmp, sizeof(tmp)) != sizeof(tmp)) > - return NULL; > - > - /* allocate a new entry in the l2 cache */ > - > - min_index = l2_cache_new_entry(bs); > - l2_table = s->l2_cache + (min_index << s->l2_bits); > - > - if (old_l2_offset == 0) { > - /* if there was no old l2 table, clear the new table */ > - memset(l2_table, 0, s->l2_size * sizeof(uint64_t)); > - } else { > - /* if there was an old l2 table, read it from the disk */ > - if (bdrv_pread(s->hd, old_l2_offset, > - l2_table, s->l2_size * sizeof(uint64_t)) != > - s->l2_size * sizeof(uint64_t)) > - return NULL; > - } > - /* write the l2 table to the file */ > - if (bdrv_pwrite(s->hd, l2_offset, > - l2_table, s->l2_size * sizeof(uint64_t)) != > - s->l2_size * sizeof(uint64_t)) > - return NULL; > - > - /* update the l2 cache entry */ > - > - s->l2_cache_offsets[min_index] = l2_offset; > - s->l2_cache_counts[min_index] = 1; > - > - return l2_table; > -} > - > -static int size_to_clusters(BDRVQcowState *s, int64_t size) > -{ > - return (size + (s->cluster_size - 1)) >> s->cluster_bits; > -} > - > -static int count_contiguous_clusters(uint64_t nb_clusters, int cluster_size, > - uint64_t *l2_table, uint64_t start, uint64_t mask) > -{ > - int i; > - uint64_t offset = be64_to_cpu(l2_table[0]) & ~mask; > - > - if (!offset) > - return 0; > - > - for (i = start; i < start + 
nb_clusters; i++) > - if (offset + i * cluster_size != (be64_to_cpu(l2_table[i]) & ~mask)) > - break; > - > - return (i - start); > -} > - > -static int count_contiguous_free_clusters(uint64_t nb_clusters, uint64_t *l2_table) > -{ > - int i = 0; > - > - while(nb_clusters-- && l2_table[i] == 0) > - i++; > - > - return i; > -} > - > -/* > - * get_cluster_offset > - * > - * For a given offset of the disk image, return cluster offset in > - * qcow2 file. > - * > - * on entry, *num is the number of contiguous clusters we'd like to > - * access following offset. > - * > - * on exit, *num is the number of contiguous clusters we can read. > - * > - * Return 1, if the offset is found > - * Return 0, otherwise. > - * > - */ > - > static uint64_t get_cluster_offset(BlockDriverState *bs, > - uint64_t offset, int *num) > -{ > - BDRVQcowState *s = bs->opaque; > - int l1_index, l2_index; > - uint64_t l2_offset, *l2_table, cluster_offset; > - int l1_bits, c; > - int index_in_cluster, nb_available, nb_needed, nb_clusters; > - > - index_in_cluster = (offset >> 9) & (s->cluster_sectors - 1); > - nb_needed = *num + index_in_cluster; > - > - l1_bits = s->l2_bits + s->cluster_bits; > - > - /* compute how many bytes there are between the offset and > - * the end of the l1 entry > - */ > - > - nb_available = (1 << l1_bits) - (offset & ((1 << l1_bits) - 1)); > - > - /* compute the number of available sectors */ > - > - nb_available = (nb_available >> 9) + index_in_cluster; > - > - cluster_offset = 0; > - > - /* seek the the l2 offset in the l1 table */ > - > - l1_index = offset >> l1_bits; > - if (l1_index >= s->l1_size) > - goto out; > - > - l2_offset = s->l1_table[l1_index]; > - > - /* seek the l2 table of the given l2 offset */ > - > - if (!l2_offset) > - goto out; > - > - /* load the l2 table in memory */ > - > - l2_offset &= ~QCOW_OFLAG_COPIED; > - l2_table = l2_load(bs, l2_offset); > - if (l2_table == NULL) > - return 0; > - > - /* find the cluster offset for the given disk offset */ 
> - > - l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1); > - cluster_offset = be64_to_cpu(l2_table[l2_index]); > - nb_clusters = size_to_clusters(s, nb_needed << 9); > - > - if (!cluster_offset) { > - /* how many empty clusters ? */ > - c = count_contiguous_free_clusters(nb_clusters, &l2_table[l2_index]); > - } else { > - /* how many allocated clusters ? */ > - c = count_contiguous_clusters(nb_clusters, s->cluster_size, > - &l2_table[l2_index], 0, QCOW_OFLAG_COPIED); > - } > - > - nb_available = (c * s->cluster_sectors); > -out: > - if (nb_available > nb_needed) > - nb_available = nb_needed; > - > - *num = nb_available - index_in_cluster; > - > - return cluster_offset & ~QCOW_OFLAG_COPIED; > -} > - > -/* > - * free_any_clusters > - * > - * free clusters according to its type: compressed or not > - * > - */ > - > -static void free_any_clusters(BlockDriverState *bs, > - uint64_t cluster_offset, int nb_clusters) > -{ > - BDRVQcowState *s = bs->opaque; > - > - /* free the cluster */ > - > - if (cluster_offset & QCOW_OFLAG_COMPRESSED) { > - int nb_csectors; > - nb_csectors = ((cluster_offset >> s->csize_shift) & > - s->csize_mask) + 1; > - free_clusters(bs, (cluster_offset & s->cluster_offset_mask) & ~511, > - nb_csectors * 512); > - return; > - } > - > - free_clusters(bs, cluster_offset, nb_clusters << s->cluster_bits); > - > - return; > -} > - > -/* > - * get_cluster_table > - * > - * for a given disk offset, load (and allocate if needed) > - * the l2 table. > - * > - * the l2 table offset in the qcow2 file and the cluster index > - * in the l2 table are given to the caller. 
> - * > - */ > - > -static int get_cluster_table(BlockDriverState *bs, uint64_t offset, > - uint64_t **new_l2_table, > - uint64_t *new_l2_offset, > - int *new_l2_index) > + uint64_t offset, int allocate, > + int compressed_size, > + int n_start, int n_end) > { > BDRVQcowState *s = bs->opaque; > - int l1_index, l2_index, ret; > - uint64_t l2_offset, *l2_table; > - > - /* seek the the l2 offset in the l1 table */ > + int min_index, i, j, l1_index, l2_index, ret; > + uint64_t l2_offset, *l2_table, cluster_offset, tmp, old_l2_offset; > > l1_index = offset >> (s->l2_bits + s->cluster_bits); > if (l1_index >= s->l1_size) { > - ret = grow_l1_table(bs, l1_index + 1); > - if (ret < 0) > + /* outside l1 table is allowed: we grow the table if needed */ > + if (!allocate) > + return 0; > + if (grow_l1_table(bs, l1_index + 1) < 0) > return 0; > } > l2_offset = s->l1_table[l1_index]; > + if (!l2_offset) { > + if (!allocate) > + return 0; > + l2_allocate: > + old_l2_offset = l2_offset; > + /* allocate a new l2 entry */ > + l2_offset = alloc_clusters(bs, s->l2_size * sizeof(uint64_t)); > + /* update the L1 entry */ > + s->l1_table[l1_index] = l2_offset | QCOW_OFLAG_COPIED; > + tmp = cpu_to_be64(l2_offset | QCOW_OFLAG_COPIED); > + if (bdrv_pwrite(s->hd, s->l1_table_offset + l1_index * sizeof(tmp), > + &tmp, sizeof(tmp)) != sizeof(tmp)) > + return 0; > + min_index = l2_cache_new_entry(bs); > + l2_table = s->l2_cache + (min_index << s->l2_bits); > > - /* seek the l2 table of the given l2 offset */ > - > - if (l2_offset & QCOW_OFLAG_COPIED) { > - /* load the l2 table in memory */ > - l2_offset &= ~QCOW_OFLAG_COPIED; > - l2_table = l2_load(bs, l2_offset); > - if (l2_table == NULL) > + if (old_l2_offset == 0) { > + memset(l2_table, 0, s->l2_size * sizeof(uint64_t)); > + } else { > + if (bdrv_pread(s->hd, old_l2_offset, > + l2_table, s->l2_size * sizeof(uint64_t)) != > + s->l2_size * sizeof(uint64_t)) > + return 0; > + } > + if (bdrv_pwrite(s->hd, l2_offset, > + l2_table, s->l2_size * 
sizeof(uint64_t)) != > + s->l2_size * sizeof(uint64_t)) > return 0; > } else { > - if (l2_offset) > - free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t)); > - l2_table = l2_allocate(bs, l1_index); > - if (l2_table == NULL) > + if (!(l2_offset & QCOW_OFLAG_COPIED)) { > + if (allocate) { > + free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t)); > + goto l2_allocate; > + } > + } else { > + l2_offset &= ~QCOW_OFLAG_COPIED; > + } > + for(i = 0; i < L2_CACHE_SIZE; i++) { > + if (l2_offset == s->l2_cache_offsets[i]) { > + /* increment the hit count */ > + if (++s->l2_cache_counts[i] == 0xffffffff) { > + for(j = 0; j < L2_CACHE_SIZE; j++) { > + s->l2_cache_counts[j] >>= 1; > + } > + } > + l2_table = s->l2_cache + (i << s->l2_bits); > + goto found; > + } > + } > + /* not found: load a new entry in the least used one */ > + min_index = l2_cache_new_entry(bs); > + l2_table = s->l2_cache + (min_index << s->l2_bits); > + if (bdrv_pread(s->hd, l2_offset, l2_table, s->l2_size * sizeof(uint64_t)) != > + s->l2_size * sizeof(uint64_t)) > return 0; > - l2_offset = s->l1_table[l1_index] & ~QCOW_OFLAG_COPIED; > } > - > - /* find the cluster offset for the given disk offset */ > - > + s->l2_cache_offsets[min_index] = l2_offset; > + s->l2_cache_counts[min_index] = 1; > + found: > l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1); > - > - *new_l2_table = l2_table; > - *new_l2_offset = l2_offset; > - *new_l2_index = l2_index; > - > - return 1; > -} > - > -/* > - * alloc_compressed_cluster_offset > - * > - * For a given offset of the disk image, return cluster offset in > - * qcow2 file. > - * > - * If the offset is not found, allocate a new compressed cluster. > - * > - * Return the cluster offset if successful, > - * Return 0, otherwise. 
> - * > - */ > - > -static uint64_t alloc_compressed_cluster_offset(BlockDriverState *bs, > - uint64_t offset, > - int compressed_size) > -{ > - BDRVQcowState *s = bs->opaque; > - int l2_index, ret; > - uint64_t l2_offset, *l2_table, cluster_offset; > - int nb_csectors; > - > - ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index); > - if (ret == 0) > - return 0; > - > cluster_offset = be64_to_cpu(l2_table[l2_index]); > - if (cluster_offset & QCOW_OFLAG_COPIED) > - return cluster_offset & ~QCOW_OFLAG_COPIED; > - > - if (cluster_offset) > - free_any_clusters(bs, cluster_offset, 1); > - > - cluster_offset = alloc_bytes(bs, compressed_size); > - nb_csectors = ((cluster_offset + compressed_size - 1) >> 9) - > - (cluster_offset >> 9); > - > - cluster_offset |= QCOW_OFLAG_COMPRESSED | > - ((uint64_t)nb_csectors << s->csize_shift); > - > - /* update L2 table */ > - > - /* compressed clusters never have the copied flag */ > - > - l2_table[l2_index] = cpu_to_be64(cluster_offset); > - if (bdrv_pwrite(s->hd, > - l2_offset + l2_index * sizeof(uint64_t), > - l2_table + l2_index, > - sizeof(uint64_t)) != sizeof(uint64_t)) > - return 0; > - > - return cluster_offset; > -} > - > -typedef struct QCowL2Meta > -{ > - uint64_t offset; > - int n_start; > - int nb_available; > - int nb_clusters; > -} QCowL2Meta; > - > -static int alloc_cluster_link_l2(BlockDriverState *bs, uint64_t cluster_offset, > - QCowL2Meta *m) > -{ > - BDRVQcowState *s = bs->opaque; > - int i, j = 0, l2_index, ret; > - uint64_t *old_cluster, start_sect, l2_offset, *l2_table; > - > - if (m->nb_clusters == 0) > - return 0; > - > - if (!(old_cluster = qemu_malloc(m->nb_clusters * sizeof(uint64_t)))) > - return -ENOMEM; > - > - /* copy content of unmodified sectors */ > - start_sect = (m->offset & ~(s->cluster_size - 1)) >> 9; > - if (m->n_start) { > - ret = copy_sectors(bs, start_sect, cluster_offset, 0, m->n_start); > - if (ret < 0) > - goto err; > + if (!cluster_offset) { > + if (!allocate) > + 
return cluster_offset; > + } else if (!(cluster_offset & QCOW_OFLAG_COPIED)) { > + if (!allocate) > + return cluster_offset; > + /* free the cluster */ > + if (cluster_offset & QCOW_OFLAG_COMPRESSED) { > + int nb_csectors; > + nb_csectors = ((cluster_offset >> s->csize_shift) & > + s->csize_mask) + 1; > + free_clusters(bs, (cluster_offset & s->cluster_offset_mask) & ~511, > + nb_csectors * 512); > + } else { > + free_clusters(bs, cluster_offset, s->cluster_size); > + } > + } else { > + cluster_offset &= ~QCOW_OFLAG_COPIED; > + return cluster_offset; > } > - > - if (m->nb_available & (s->cluster_sectors - 1)) { > - uint64_t end = m->nb_available & ~(uint64_t)(s->cluster_sectors - 1); > - ret = copy_sectors(bs, start_sect + end, cluster_offset + (end << 9), > - m->nb_available - end, s->cluster_sectors); > - if (ret < 0) > - goto err; > + if (allocate == 1) { > + /* allocate a new cluster */ > + cluster_offset = alloc_clusters(bs, s->cluster_size); > + > + /* we must initialize the cluster content which won't be > + written */ > + if ((n_end - n_start) < s->cluster_sectors) { > + uint64_t start_sect; > + > + start_sect = (offset & ~(s->cluster_size - 1)) >> 9; > + ret = copy_sectors(bs, start_sect, > + cluster_offset, 0, n_start); > + if (ret < 0) > + return 0; > + ret = copy_sectors(bs, start_sect, > + cluster_offset, n_end, s->cluster_sectors); > + if (ret < 0) > + return 0; > + } > + tmp = cpu_to_be64(cluster_offset | QCOW_OFLAG_COPIED); > + } else { > + int nb_csectors; > + cluster_offset = alloc_bytes(bs, compressed_size); > + nb_csectors = ((cluster_offset + compressed_size - 1) >> 9) - > + (cluster_offset >> 9); > + cluster_offset |= QCOW_OFLAG_COMPRESSED | > + ((uint64_t)nb_csectors << s->csize_shift); > + /* compressed clusters never have the copied flag */ > + tmp = cpu_to_be64(cluster_offset); > } > - > - ret = -EIO; > /* update L2 table */ > - if (!get_cluster_table(bs, m->offset, &l2_table, &l2_offset, &l2_index)) > - goto err; > - > - for (i = 0; i < 
m->nb_clusters; i++) { > - if(l2_table[l2_index + i] != 0) > - old_cluster[j++] = l2_table[l2_index + i]; > - > - l2_table[l2_index + i] = cpu_to_be64((cluster_offset + > - (i << s->cluster_bits)) | QCOW_OFLAG_COPIED); > - } > - > - if (bdrv_pwrite(s->hd, l2_offset + l2_index * sizeof(uint64_t), > - l2_table + l2_index, m->nb_clusters * sizeof(uint64_t)) != > - m->nb_clusters * sizeof(uint64_t)) > - goto err; > - > - for (i = 0; i < j; i++) > - free_any_clusters(bs, old_cluster[i], 1); > - > - ret = 0; > -err: > - qemu_free(old_cluster); > - return ret; > - } > - > -/* > - * alloc_cluster_offset > - * > - * For a given offset of the disk image, return cluster offset in > - * qcow2 file. > - * > - * If the offset is not found, allocate a new cluster. > - * > - * Return the cluster offset if successful, > - * Return 0, otherwise. > - * > - */ > - > -static uint64_t alloc_cluster_offset(BlockDriverState *bs, > - uint64_t offset, > - int n_start, int n_end, > - int *num, QCowL2Meta *m) > -{ > - BDRVQcowState *s = bs->opaque; > - int l2_index, ret; > - uint64_t l2_offset, *l2_table, cluster_offset; > - int nb_clusters, i = 0; > - > - ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index); > - if (ret == 0) > + l2_table[l2_index] = tmp; > + if (bdrv_pwrite(s->hd, > + l2_offset + l2_index * sizeof(tmp), &tmp, sizeof(tmp)) != sizeof(tmp)) > return 0; > - > - nb_clusters = size_to_clusters(s, n_end << 9); > - > - nb_clusters = MIN(nb_clusters, s->l2_size - l2_index); > - > - cluster_offset = be64_to_cpu(l2_table[l2_index]); > - > - /* We keep all QCOW_OFLAG_COPIED clusters */ > - > - if (cluster_offset & QCOW_OFLAG_COPIED) { > - nb_clusters = count_contiguous_clusters(nb_clusters, s->cluster_size, > - &l2_table[l2_index], 0, 0); > - > - cluster_offset &= ~QCOW_OFLAG_COPIED; > - m->nb_clusters = 0; > - > - goto out; > - } > - > - /* for the moment, multiple compressed clusters are not managed */ > - > - if (cluster_offset & QCOW_OFLAG_COMPRESSED) > - 
nb_clusters = 1; > - > - /* how many available clusters ? */ > - > - while (i < nb_clusters) { > - i += count_contiguous_clusters(nb_clusters - i, s->cluster_size, > - &l2_table[l2_index], i, 0); > - > - if(be64_to_cpu(l2_table[l2_index + i])) > - break; > - > - i += count_contiguous_free_clusters(nb_clusters - i, > - &l2_table[l2_index + i]); > - > - cluster_offset = be64_to_cpu(l2_table[l2_index + i]); > - > - if ((cluster_offset & QCOW_OFLAG_COPIED) || > - (cluster_offset & QCOW_OFLAG_COMPRESSED)) > - break; > - } > - nb_clusters = i; > - > - /* allocate a new cluster */ > - > - cluster_offset = alloc_clusters(bs, nb_clusters * s->cluster_size); > - > - /* save info needed for meta data update */ > - m->offset = offset; > - m->n_start = n_start; > - m->nb_clusters = nb_clusters; > - > -out: > - m->nb_available = MIN(nb_clusters << (s->cluster_bits - 9), n_end); > - > - *num = m->nb_available - n_start; > - > return cluster_offset; > } > > static int qcow_is_allocated(BlockDriverState *bs, int64_t sector_num, > int nb_sectors, int *pnum) > { > + BDRVQcowState *s = bs->opaque; > + int index_in_cluster, n; > uint64_t cluster_offset; > > - *pnum = nb_sectors; > - cluster_offset = get_cluster_offset(bs, sector_num << 9, pnum); > - > + cluster_offset = get_cluster_offset(bs, sector_num << 9, 0, 0, 0, 0); > + index_in_cluster = sector_num & (s->cluster_sectors - 1); > + n = s->cluster_sectors - index_in_cluster; > + if (n > nb_sectors) > + n = nb_sectors; > + *pnum = n; > return (cluster_offset != 0); > } > > @@ -1102,9 +723,11 @@ > uint64_t cluster_offset; > > while (nb_sectors > 0) { > - n = nb_sectors; > - cluster_offset = get_cluster_offset(bs, sector_num << 9, &n); > + cluster_offset = get_cluster_offset(bs, sector_num << 9, 0, 0, 0, 0); > index_in_cluster = sector_num & (s->cluster_sectors - 1); > + n = s->cluster_sectors - index_in_cluster; > + if (n > nb_sectors) > + n = nb_sectors; > if (!cluster_offset) { > if (bs->backing_hd) { > /* read from the base image 
*/ > @@ -1143,18 +766,15 @@ > BDRVQcowState *s = bs->opaque; > int ret, index_in_cluster, n; > uint64_t cluster_offset; > - int n_end; > - QCowL2Meta l2meta; > > while (nb_sectors > 0) { > index_in_cluster = sector_num & (s->cluster_sectors - 1); > - n_end = index_in_cluster + nb_sectors; > - if (s->crypt_method && > - n_end > QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors) > - n_end = QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors; > - cluster_offset = alloc_cluster_offset(bs, sector_num << 9, > - index_in_cluster, > - n_end, &n, &l2meta); > + n = s->cluster_sectors - index_in_cluster; > + if (n > nb_sectors) > + n = nb_sectors; > + cluster_offset = get_cluster_offset(bs, sector_num << 9, 1, 0, > + index_in_cluster, > + index_in_cluster + n); > if (!cluster_offset) > return -1; > if (s->crypt_method) { > @@ -1165,10 +785,8 @@ > } else { > ret = bdrv_pwrite(s->hd, cluster_offset + index_in_cluster * 512, buf, n * 512); > } > - if (ret != n * 512 || alloc_cluster_link_l2(bs, cluster_offset, &l2meta) < 0) { > - free_any_clusters(bs, cluster_offset, l2meta.nb_clusters); > + if (ret != n * 512) > return -1; > - } > nb_sectors -= n; > sector_num += n; > buf += n * 512; > @@ -1186,33 +804,8 @@ > uint64_t cluster_offset; > uint8_t *cluster_data; > BlockDriverAIOCB *hd_aiocb; > - QEMUBH *bh; > - QCowL2Meta l2meta; > } QCowAIOCB; > > -static void qcow_aio_read_cb(void *opaque, int ret); > -static void qcow_aio_read_bh(void *opaque) > -{ > - QCowAIOCB *acb = opaque; > - qemu_bh_delete(acb->bh); > - acb->bh = NULL; > - qcow_aio_read_cb(opaque, 0); > -} > - > -static int qcow_schedule_bh(QEMUBHFunc *cb, QCowAIOCB *acb) > -{ > - if (acb->bh) > - return -EIO; > - > - acb->bh = qemu_bh_new(cb, acb); > - if (!acb->bh) > - return -EIO; > - > - qemu_bh_schedule(acb->bh); > - > - return 0; > -} > - > static void qcow_aio_read_cb(void *opaque, int ret) > { > QCowAIOCB *acb = opaque; > @@ -1222,12 +815,13 @@ > > acb->hd_aiocb = NULL; > if (ret < 0) { > -fail: > + fail: > 
acb->common.cb(acb->common.opaque, ret); > qemu_aio_release(acb); > return; > } > > + redo: > /* post process the read buffer */ > if (!acb->cluster_offset) { > /* nothing to do */ > @@ -1253,9 +847,12 @@ > } > > /* prepare next AIO request */ > - acb->n = acb->nb_sectors; > - acb->cluster_offset = get_cluster_offset(bs, acb->sector_num << 9, &acb->n); > + acb->cluster_offset = get_cluster_offset(bs, acb->sector_num << 9, > + 0, 0, 0, 0); > index_in_cluster = acb->sector_num & (s->cluster_sectors - 1); > + acb->n = s->cluster_sectors - index_in_cluster; > + if (acb->n > acb->nb_sectors) > + acb->n = acb->nb_sectors; > > if (!acb->cluster_offset) { > if (bs->backing_hd) { > @@ -1268,16 +865,12 @@ > if (acb->hd_aiocb == NULL) > goto fail; > } else { > - ret = qcow_schedule_bh(qcow_aio_read_bh, acb); > - if (ret < 0) > - goto fail; > + goto redo; > } > } else { > /* Note: in this case, no need to wait */ > memset(acb->buf, 0, 512 * acb->n); > - ret = qcow_schedule_bh(qcow_aio_read_bh, acb); > - if (ret < 0) > - goto fail; > + goto redo; > } > } else if (acb->cluster_offset & QCOW_OFLAG_COMPRESSED) { > /* add AIO support for compressed blocks ? 
*/ > @@ -1285,9 +878,7 @@ > goto fail; > memcpy(acb->buf, > s->cluster_cache + index_in_cluster * 512, 512 * acb->n); > - ret = qcow_schedule_bh(qcow_aio_read_bh, acb); > - if (ret < 0) > - goto fail; > + goto redo; > } else { > if ((acb->cluster_offset & 511) != 0) { > ret = -EIO; > @@ -1316,7 +907,6 @@ > acb->nb_sectors = nb_sectors; > acb->n = 0; > acb->cluster_offset = 0; > - acb->l2meta.nb_clusters = 0; > return acb; > } > > @@ -1340,8 +930,8 @@ > BlockDriverState *bs = acb->common.bs; > BDRVQcowState *s = bs->opaque; > int index_in_cluster; > + uint64_t cluster_offset; > const uint8_t *src_buf; > - int n_end; > > acb->hd_aiocb = NULL; > > @@ -1352,11 +942,6 @@ > return; > } > > - if (alloc_cluster_link_l2(bs, acb->cluster_offset, &acb->l2meta) < 0) { > - free_any_clusters(bs, acb->cluster_offset, acb->l2meta.nb_clusters); > - goto fail; > - } > - > acb->nb_sectors -= acb->n; > acb->sector_num += acb->n; > acb->buf += acb->n * 512; > @@ -1369,22 +954,19 @@ > } > > index_in_cluster = acb->sector_num & (s->cluster_sectors - 1); > - n_end = index_in_cluster + acb->nb_sectors; > - if (s->crypt_method && > - n_end > QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors) > - n_end = QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors; > - > - acb->cluster_offset = alloc_cluster_offset(bs, acb->sector_num << 9, > - index_in_cluster, > - n_end, &acb->n, &acb->l2meta); > - if (!acb->cluster_offset || (acb->cluster_offset & 511) != 0) { > + acb->n = s->cluster_sectors - index_in_cluster; > + if (acb->n > acb->nb_sectors) > + acb->n = acb->nb_sectors; > + cluster_offset = get_cluster_offset(bs, acb->sector_num << 9, 1, 0, > + index_in_cluster, > + index_in_cluster + acb->n); > + if (!cluster_offset || (cluster_offset & 511) != 0) { > ret = -EIO; > goto fail; > } > if (s->crypt_method) { > if (!acb->cluster_data) { > - acb->cluster_data = qemu_mallocz(QCOW_MAX_CRYPT_CLUSTERS * > - s->cluster_size); > + acb->cluster_data = qemu_mallocz(s->cluster_size); > if (!acb->cluster_data) { > ret = 
-ENOMEM; > goto fail; > @@ -1397,7 +979,7 @@ > src_buf = acb->buf; > } > acb->hd_aiocb = bdrv_aio_write(s->hd, > - (acb->cluster_offset >> 9) + index_in_cluster, > + (cluster_offset >> 9) + index_in_cluster, > src_buf, acb->n, > qcow_aio_write_cb, acb); > if (acb->hd_aiocb == NULL) > @@ -1571,7 +1153,7 @@ > > memset(s->l1_table, 0, l1_length); > if (bdrv_pwrite(s->hd, s->l1_table_offset, s->l1_table, l1_length) < 0) > - return -1; > + return -1; > ret = bdrv_truncate(s->hd, s->l1_table_offset + l1_length); > if (ret < 0) > return ret; > @@ -1637,10 +1219,8 @@ > /* could not compress: write normal cluster */ > qcow_write(bs, sector_num, buf, s->cluster_sectors); > } else { > - cluster_offset = alloc_compressed_cluster_offset(bs, sector_num << 9, > - out_len); > - if (!cluster_offset) > - return -1; > + cluster_offset = get_cluster_offset(bs, sector_num << 9, 2, > + out_len, 0, 0); > cluster_offset &= s->cluster_offset_mask; > if (bdrv_pwrite(s->hd, cluster_offset, out_buf, out_len) != out_len) { > qemu_free(out_buf); > @@ -2225,19 +1805,26 @@ > BDRVQcowState *s = bs->opaque; > int i, nb_clusters; > > - nb_clusters = size_to_clusters(s, size); > -retry: > - for(i = 0; i < nb_clusters; i++) { > - int64_t i = s->free_cluster_index++; > - if (get_refcount(bs, i) != 0) > - goto retry; > - } > + nb_clusters = (size + s->cluster_size - 1) >> s->cluster_bits; > + for(;;) { > + if (get_refcount(bs, s->free_cluster_index) == 0) { > + s->free_cluster_index++; > + for(i = 1; i < nb_clusters; i++) { > + if (get_refcount(bs, s->free_cluster_index) != 0) > + goto not_found; > + s->free_cluster_index++; > + } > #ifdef DEBUG_ALLOC2 > - printf("alloc_clusters: size=%lld -> %lld\n", > - size, > - (s->free_cluster_index - nb_clusters) << s->cluster_bits); > + printf("alloc_clusters: size=%lld -> %lld\n", > + size, > + (s->free_cluster_index - nb_clusters) << s->cluster_bits); > #endif > - return (s->free_cluster_index - nb_clusters) << s->cluster_bits; > + return (s->free_cluster_index 
- nb_clusters) << s->cluster_bits; > + } else { > + not_found: > + s->free_cluster_index++; > + } > + } > } > > static int64_t alloc_clusters(BlockDriverState *bs, int64_t size) > @@ -2301,7 +1888,8 @@ > int new_table_size, new_table_size2, refcount_table_clusters, i, ret; > uint64_t *new_table; > int64_t table_offset; > - uint8_t data[12]; > + uint64_t data64; > + uint32_t data32; > int old_table_size; > int64_t old_table_offset; > > @@ -2340,10 +1928,13 @@ > for(i = 0; i < s->refcount_table_size; i++) > be64_to_cpus(&new_table[i]); > > - cpu_to_be64w((uint64_t*)data, table_offset); > - cpu_to_be32w((uint32_t*)(data + 8), refcount_table_clusters); > + data64 = cpu_to_be64(table_offset); > if (bdrv_pwrite(s->hd, offsetof(QCowHeader, refcount_table_offset), > - data, sizeof(data)) != sizeof(data)) > + &data64, sizeof(data64)) != sizeof(data64)) > + goto fail; > + data32 = cpu_to_be32(refcount_table_clusters); > + if (bdrv_pwrite(s->hd, offsetof(QCowHeader, refcount_table_clusters), > + &data32, sizeof(data32)) != sizeof(data32)) > goto fail; > qemu_free(s->refcount_table); > old_table_offset = s->refcount_table_offset; > @@ -2572,7 +2163,7 @@ > uint16_t *refcount_table; > > size = bdrv_getlength(s->hd); > - nb_clusters = size_to_clusters(s, size); > + nb_clusters = (size + s->cluster_size - 1) >> s->cluster_bits; > refcount_table = qemu_mallocz(nb_clusters * sizeof(uint16_t)); > > /* header */ > @@ -2624,7 +2215,7 @@ > int refcount; > > size = bdrv_getlength(s->hd); > - nb_clusters = size_to_clusters(s, size); > + nb_clusters = (size + s->cluster_size - 1) >> s->cluster_bits; > for(k = 0; k < nb_clusters;) { > k1 = k; > refcount = get_refcount(bs, k); > > > [-- Attachment #2: Type: text/html, Size: 43068 bytes --] ^ permalink raw reply [flat|nested] 82+ messages in thread
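Marc's hypothesis quoted above is that an allocation failure, signalled by a 0 offset, goes unnoticed and the new L2 table then gets written at offset 0, on top of the qcow2 header. The following is only a toy sketch of that failure mode and of the check that would prevent it. It is not QEMU code: ToyImage, toy_alloc_cluster and toy_write_new_table are hypothetical names, and the 16-byte "clusters" are chosen purely for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model, not QEMU code: a tiny in-memory "image" whose first
 * cluster stands in for the qcow2 header. All names are hypothetical. */
#define TOY_CLUSTER_SIZE 16
#define TOY_NB_CLUSTERS  4

typedef struct ToyImage {
    uint8_t data[TOY_NB_CLUSTERS * TOY_CLUSTER_SIZE];
    int next_free;                  /* index of the next unused cluster */
} ToyImage;

static void toy_init(ToyImage *img)
{
    memset(img->data, 0, sizeof(img->data));
    memset(img->data, 0xAB, TOY_CLUSTER_SIZE);  /* stand-in for the header */
    img->next_free = 1;                         /* cluster 0 is the header */
}

/* Returns the byte offset of a fresh cluster, or 0 when the image is full.
 * Offset 0 can double as the error value because cluster 0 is never handed out. */
static uint64_t toy_alloc_cluster(ToyImage *img)
{
    if (img->next_free >= TOY_NB_CLUSTERS)
        return 0;
    return (uint64_t)(img->next_free++) * TOY_CLUSTER_SIZE;
}

/* Writes a zeroed table into a newly allocated cluster.
 * Returns the table's offset, or 0 if no cluster could be allocated. */
static uint64_t toy_write_new_table(ToyImage *img)
{
    uint64_t off = toy_alloc_cluster(img);

    if (off == 0)
        return 0;   /* without this check the memset below would hit the header */
    assert(off + TOY_CLUSTER_SIZE <= sizeof(img->data));
    memset(img->data + off, 0, TOY_CLUSTER_SIZE);
    return off;
}

/* Returns 1 if the header bytes are untouched, 0 otherwise. */
static int toy_header_intact(const ToyImage *img)
{
    for (int i = 0; i < TOY_CLUSTER_SIZE; i++)
        if (img->data[i] != 0xAB)
            return 0;
    return 1;
}
```

In this model, calling toy_write_new_table() once the image is full returns 0 and leaves the header alone. Dropping the `off == 0` check would zero the first cluster instead, which resembles the symptom Marc describes. Whether the real bug has this cause is not established anywhere in the thread.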
* Re: [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports 2009-02-14 22:23 ` Dor Laor @ 2009-02-15 2:20 ` Jamie Lokier 0 siblings, 0 replies; 82+ messages in thread From: Jamie Lokier @ 2009-02-15 2:20 UTC (permalink / raw) To: dlaor, qemu-devel Dor Laor wrote: > The solution is to find the real cause of the corruption. I agree: if someone is able to do that, great. But if not, practical reality leaves us with these choices: 1. Ship the current code, which results in corruption on Windows 2000 and 2003 guests (and who knows what else), and by the way is unlikely to have anything to do with device emulation. 2. Revert to (nearly) kvm-72 code, which appears to fix the majority of those corruption cases, although there is still something rare, which may be a different bug. Which is the best choice? From a QA POV, I would revert the known bug until someone has a fix, then reinstate everything after it which is thought to be good. > Jamie Lokier wrote: > Anthony Liguori wrote: > Simply reverting the qcow2 code appears to fix > those problems, so it > needn't hold up cutting a release. That's what I > recommend. > Send some patches. > I did already. > > Here it is again. This should fix my bug and Marc's bug according to > his report that reverting qcow2.c fixes it. > Going back to kvm-72 is not good either. > First, there were qcow2 corruptions before it, they were very rare but still > exist. That's true. But they were noticeably rarer - to the point that people clearly are using kvm-72 with qcow2 and not reporting many problems. Ubuntu 8.10 shipped kvm-72, and that coincided with their announcement that they're supporting KVM as their official virtualisation solution. I imagine kvm-72 is getting a fair bit of usage because of that. Of course they could be having rare problems and think it's a bug in the guest or its applications :-) > Not long ago we did not even know that qcow2 was the faulty part. Worrying, isn't it. 
Does qcow2 get any rigorous testing? Should that be added - a blockdev test suite? There hasn't been a complete lack of bug reports about qcow2, but maybe they aren't getting to the right places, and maybe they're too difficult to reproduce and too easy to work around ("my guest occasionally shows random corruption", "don't use KVM for that guest", "I switched to raw and it went away"). I very luckily discovered it prevented one of my VMs from booting, as soon as I upgraded from kvm-72 (shipped with Ubuntu) to something newer. If it had only caused occasional rare corruption instead of preventing the boot, I might not have realised it was qcow2 at all. Guest corruption can occur for many reasons, and -win2k-hack implies that the IDE emulation is not quite right in some way. -- Jamie ^ permalink raw reply [flat|nested] 82+ messages in thread
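As a rough illustration of what the smallest piece of such a blockdev test suite might look like, here is a sketch of a write-then-verify pattern check. It is an assumption-laden toy: it works on a plain stdio FILE rather than through any QEMU block driver, and toy_fill_sector, toy_write_pattern and toy_verify_pattern are hypothetical names, not existing QEMU functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TOY_SECTOR_SIZE 512

/* Fill one sector with bytes derived from its sector number, so that a
 * zeroed or misplaced sector shows up as a mismatch on read-back. */
static void toy_fill_sector(uint8_t *buf, uint32_t sector)
{
    for (int i = 0; i < TOY_SECTOR_SIZE; i++)
        buf[i] = (uint8_t)(sector * 31u + (uint32_t)i);
}

/* Write nb_sectors pattern sectors from the start of f.
 * Returns 0 on success, -1 on I/O error. */
static int toy_write_pattern(FILE *f, uint32_t nb_sectors)
{
    uint8_t buf[TOY_SECTOR_SIZE];

    assert(f != NULL);
    if (fseek(f, 0, SEEK_SET) != 0)
        return -1;
    for (uint32_t s = 0; s < nb_sectors; s++) {
        toy_fill_sector(buf, s);
        if (fwrite(buf, 1, sizeof(buf), f) != sizeof(buf))
            return -1;
    }
    return fflush(f) == 0 ? 0 : -1;
}

/* Read the sectors back and compare them with the expected pattern.
 * Returns the number of mismatching sectors, or -1 on I/O error. */
static int toy_verify_pattern(FILE *f, uint32_t nb_sectors)
{
    uint8_t expect[TOY_SECTOR_SIZE], got[TOY_SECTOR_SIZE];
    int bad = 0;

    assert(f != NULL);
    if (fseek(f, 0, SEEK_SET) != 0)
        return -1;
    for (uint32_t s = 0; s < nb_sectors; s++) {
        toy_fill_sector(expect, s);
        if (fread(got, 1, sizeof(got), f) != sizeof(got))
            return -1;
        if (memcmp(expect, got, sizeof(got)) != 0)
            bad++;
    }
    return bad;
}
```

A real suite would presumably run the same kind of pattern through the qcow2 driver, with snapshots, compression and guest shutdown in between, and compare against a raw image. None of that is shown here.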
* Re: [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports 2009-02-13 19:04 ` [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports Jamie Lokier 2009-02-14 22:23 ` Dor Laor @ 2009-02-14 23:13 ` Anthony Liguori 2009-02-15 2:01 ` Jamie Lokier 1 sibling, 1 reply; 82+ messages in thread From: Anthony Liguori @ 2009-02-14 23:13 UTC (permalink / raw) To: qemu-devel Jamie Lokier wrote: > Anthony Liguori wrote: > >>> Simply reverting the qcow2 code appears to fix those problems, so it >>> needn't hold up cutting a release. That's what I recommend. >>> >> Send some patches. >> > > I did already. > > Here it is again. This should fix my bug and Marc's bug according to > his report that reverting qcow2.c fixes it. > Well, such a large reversion is a bad idea. Can you git bisect to the actual changeset that introduced the bug you see? You're effectively reverting a very large number of changes whereas only one is likely causing your problem. Regards, Anthony Liguori > -- Jamie > > > Subject: Revert block-qcow2.c to kvm-72 version due to corruption reports > > This fixes two kinds of qcow2 corruption observed in kvm-83 (actually > kvm-73 and later), from three bug reports. > > > Bug 1: Windows 2000 guests complain of corrupt registry. > > Many Windows 2000 guests which boot and run fine in kvm-72, fail with > a blue-screen indicating file corruption errors in kvm-73 through to > kvm-83 (the latest), and succeed if we replace block-qcow2.c with the > version from kvm-72. > > The blue screen appears towards the end of the boot sequence, and > shows only briefly before rebooting. It says: > > STOP: c0000218 (Registry File Failure) > The registry cannot load the hive (file): > \SystemRoot\System32\Config\SOFTWARE > or its log or alternate. > It is corrupt, absent, or not writable. > > Beginning dump of physical memory > Physical memory dump complete. Contact your system administrator or > technical support [...?] 
> > This is narrowed down to the difference in block-qcow2.c between > kvm-72 and kvm-73 (not -83). From kvm-73 to kvm-83, there have been > more changes to block-qcow2.c, but the observed corruption still occurs. > > The bug isn't evident when only reading. When using "qemu-img > convert" to convert a qcow2 file to a raw file, with broken and fixed > versions of block-qcow2.c it produces the same raw file. Also, when > using "-snapshot" with qemu, the blue screen doesn't occur. > > This bug was observed by Jamie Lokier <jamie@shareable.org> and > confirmed for multiple Windows 2000 guests by > Marc Bevand <m.bevand@gmail.com>. > > > Bug 2: Windows 2003 guests complain of corrupt registry. > > According to > http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2001452&group_id=180599 > > Windows 2003 32-bit guests randomly spew disk corruption messages > like this: > > Windows – Registry Hive Recovered > Registry hive (file): SOFTWARE was corrupted and it has > been recovered. Some data might have been lost. > > and > > The system cannot log on due to the following error: > Unable to complete the requested operation because of > either a catastrophic media failure or a data structure > corruption on the disk. > > This bug was reported by <gerdwachs@users.sourceforge.net> and > confirmed by Marc Bevand, noting: > > kvm-73+ also causes some of my Windows 2003 guests to exhibit this > exact registry corruption error. [...] This bug is also fixed by > reverting block-qcow2.c to the version from kvm-72. > > Worryingly, gerdwachs' bug report says it's for kvm-70, implying this > patch may not fix all the Windows 2003 guest corruption problems. > > At least Marc says his observed problem goes away with kvm-72's qcow2. > > > Bug 3: Corruption of qcow2 index rendering the file unusable. 
> > Marc Bevand writes: > > I tested kvm-81 and kvm-83 as well (can't test kvm-80 or older because > of the qcow2 performance regression caused by the default writethrough > caching policy) but it randomly triggers an even worse bug: the moment > I shut down a guest by typing "quit" in the monitor, it sometimes > overwrites the first 4kB of the disk image with mostly NUL bytes (!) > which completely destroys it. I am familiar with the qcow2 format and > apparently this 4kB block seems to be an L2 table with most entries > set to zero. I have had to restore at least 6 or 7 disk images from > backup after occurrences of that bug. My intuition tells me this may be > the qcow2 code trying to allocate a cluster to write a new L2 table, > but not noticing the allocation failed (represented by a 0 offset), > and writing the L2 table at that 0 offset, overwriting the qcow2 > header. > > Fortunately this bug is also fixed by running kvm-75 with > block-qcow2.c reverted to its kvm-72 version. > > Basically qcow2 in kvm-73 or newer is completely unreliable. > > > Reverting block-qcow2.c to the version in kvm-72 appears to fix the > corruption symptoms reported by Marc and Jamie, although gerdwachs' > related bug is against kvm-70 so it may not fix that. > > Unfortunately this reverts some optimisations, but fixing corruption > is more important until the new code is reliable. > > This patch reverts block-qcow2.c in kvm-83 to the version in kvm-72, > except the "cache=writeback" default performance tweak is retained and > there's no need to define "offsetof". > > Signed-Off-By: Jamie Lokier <jamie@shareable.org> > > > --- kvm-83-real/qemu/block-qcow2.c 2009-01-13 13:29:42.000000000 +0000 > +++ kvm-83/qemu/block-qcow2.c 2009-02-13 18:51:12.000000000 +0000 > @@ -52,8 +52,6 @@ > #define QCOW_CRYPT_NONE 0 > #define QCOW_CRYPT_AES 1 > > -#define QCOW_MAX_CRYPT_CLUSTERS 32 > - > /* indicate that the refcount of the referenced cluster is exactly one. 
*/ > #define QCOW_OFLAG_COPIED (1LL << 63) > /* indicate that the cluster is compressed (they never have the copied flag) */ > @@ -269,8 +267,7 @@ > if (!s->cluster_cache) > goto fail; > /* one more sector for decompressed data alignment */ > - s->cluster_data = qemu_malloc(QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size > - + 512); > + s->cluster_data = qemu_malloc(s->cluster_size + 512); > if (!s->cluster_data) > goto fail; > s->cluster_cache_offset = -1; > @@ -437,7 +434,8 @@ > int new_l1_size, new_l1_size2, ret, i; > uint64_t *new_l1_table; > uint64_t new_l1_table_offset; > - uint8_t data[12]; > + uint64_t data64; > + uint32_t data32; > > new_l1_size = s->l1_size; > if (min_size <= new_l1_size) > @@ -467,10 +465,13 @@ > new_l1_table[i] = be64_to_cpu(new_l1_table[i]); > > /* set new table */ > - cpu_to_be32w((uint32_t*)data, new_l1_size); > - cpu_to_be64w((uint64_t*)(data + 4), new_l1_table_offset); > - if (bdrv_pwrite(s->hd, offsetof(QCowHeader, l1_size), data, > - sizeof(data)) != sizeof(data)) > + data64 = cpu_to_be64(new_l1_table_offset); > + if (bdrv_pwrite(s->hd, offsetof(QCowHeader, l1_table_offset), > + &data64, sizeof(data64)) != sizeof(data64)) > + goto fail; > + data32 = cpu_to_be32(new_l1_size); > + if (bdrv_pwrite(s->hd, offsetof(QCowHeader, l1_size), > + &data32, sizeof(data32)) != sizeof(data32)) > goto fail; > qemu_free(s->l1_table); > free_clusters(bs, s->l1_table_offset, s->l1_size * sizeof(uint64_t)); > @@ -483,549 +484,169 @@ > return -EIO; > } > > -/* > - * seek_l2_table > +/* 'allocate' is: > * > - * seek l2_offset in the l2_cache table > - * if not found, return NULL, > - * if found, > - * increments the l2 cache hit count of the entry, > - * if counter overflow, divide by two all counters > - * return the pointer to the l2 cache entry > + * 0 not to allocate. 
> * > - */ > - > -static uint64_t *seek_l2_table(BDRVQcowState *s, uint64_t l2_offset) > -{ > - int i, j; > - > - for(i = 0; i < L2_CACHE_SIZE; i++) { > - if (l2_offset == s->l2_cache_offsets[i]) { > - /* increment the hit count */ > - if (++s->l2_cache_counts[i] == 0xffffffff) { > - for(j = 0; j < L2_CACHE_SIZE; j++) { > - s->l2_cache_counts[j] >>= 1; > - } > - } > - return s->l2_cache + (i << s->l2_bits); > - } > - } > - return NULL; > -} > - > -/* > - * l2_load > + * 1 to allocate a normal cluster (for sector indexes 'n_start' to > + * 'n_end') > * > - * Loads a L2 table into memory. If the table is in the cache, the cache > - * is used; otherwise the L2 table is loaded from the image file. > + * 2 to allocate a compressed cluster of size > + * 'compressed_size'. 'compressed_size' must be > 0 and < > + * cluster_size > * > - * Returns a pointer to the L2 table on success, or NULL if the read from > - * the image file failed. > + * return 0 if not allocated. > */ > - > -static uint64_t *l2_load(BlockDriverState *bs, uint64_t l2_offset) > -{ > - BDRVQcowState *s = bs->opaque; > - int min_index; > - uint64_t *l2_table; > - > - /* seek if the table for the given offset is in the cache */ > - > - l2_table = seek_l2_table(s, l2_offset); > - if (l2_table != NULL) > - return l2_table; > - > - /* not found: load a new entry in the least used one */ > - > - min_index = l2_cache_new_entry(bs); > - l2_table = s->l2_cache + (min_index << s->l2_bits); > - if (bdrv_pread(s->hd, l2_offset, l2_table, s->l2_size * sizeof(uint64_t)) != > - s->l2_size * sizeof(uint64_t)) > - return NULL; > - s->l2_cache_offsets[min_index] = l2_offset; > - s->l2_cache_counts[min_index] = 1; > - > - return l2_table; > -} > - > -/* > - * l2_allocate > - * > - * Allocate a new l2 entry in the file. If l1_index points to an already > - * used entry in the L2 table (i.e. we are doing a copy on write for the L2 > - * table) copy the contents of the old L2 table into the newly allocated one. 
> - * Otherwise the new table is initialized with zeros. > - * > - */ > - > -static uint64_t *l2_allocate(BlockDriverState *bs, int l1_index) > -{ > - BDRVQcowState *s = bs->opaque; > - int min_index; > - uint64_t old_l2_offset, tmp; > - uint64_t *l2_table, l2_offset; > - > - old_l2_offset = s->l1_table[l1_index]; > - > - /* allocate a new l2 entry */ > - > - l2_offset = alloc_clusters(bs, s->l2_size * sizeof(uint64_t)); > - > - /* update the L1 entry */ > - > - s->l1_table[l1_index] = l2_offset | QCOW_OFLAG_COPIED; > - > - tmp = cpu_to_be64(l2_offset | QCOW_OFLAG_COPIED); > - if (bdrv_pwrite(s->hd, s->l1_table_offset + l1_index * sizeof(tmp), > - &tmp, sizeof(tmp)) != sizeof(tmp)) > - return NULL; > - > - /* allocate a new entry in the l2 cache */ > - > - min_index = l2_cache_new_entry(bs); > - l2_table = s->l2_cache + (min_index << s->l2_bits); > - > - if (old_l2_offset == 0) { > - /* if there was no old l2 table, clear the new table */ > - memset(l2_table, 0, s->l2_size * sizeof(uint64_t)); > - } else { > - /* if there was an old l2 table, read it from the disk */ > - if (bdrv_pread(s->hd, old_l2_offset, > - l2_table, s->l2_size * sizeof(uint64_t)) != > - s->l2_size * sizeof(uint64_t)) > - return NULL; > - } > - /* write the l2 table to the file */ > - if (bdrv_pwrite(s->hd, l2_offset, > - l2_table, s->l2_size * sizeof(uint64_t)) != > - s->l2_size * sizeof(uint64_t)) > - return NULL; > - > - /* update the l2 cache entry */ > - > - s->l2_cache_offsets[min_index] = l2_offset; > - s->l2_cache_counts[min_index] = 1; > - > - return l2_table; > -} > - > -static int size_to_clusters(BDRVQcowState *s, int64_t size) > -{ > - return (size + (s->cluster_size - 1)) >> s->cluster_bits; > -} > - > -static int count_contiguous_clusters(uint64_t nb_clusters, int cluster_size, > - uint64_t *l2_table, uint64_t start, uint64_t mask) > -{ > - int i; > - uint64_t offset = be64_to_cpu(l2_table[0]) & ~mask; > - > - if (!offset) > - return 0; > - > - for (i = start; i < start + 
nb_clusters; i++) > - if (offset + i * cluster_size != (be64_to_cpu(l2_table[i]) & ~mask)) > - break; > - > - return (i - start); > -} > - > -static int count_contiguous_free_clusters(uint64_t nb_clusters, uint64_t *l2_table) > -{ > - int i = 0; > - > - while(nb_clusters-- && l2_table[i] == 0) > - i++; > - > - return i; > -} > - > -/* > - * get_cluster_offset > - * > - * For a given offset of the disk image, return cluster offset in > - * qcow2 file. > - * > - * on entry, *num is the number of contiguous clusters we'd like to > - * access following offset. > - * > - * on exit, *num is the number of contiguous clusters we can read. > - * > - * Return 1, if the offset is found > - * Return 0, otherwise. > - * > - */ > - > static uint64_t get_cluster_offset(BlockDriverState *bs, > - uint64_t offset, int *num) > -{ > - BDRVQcowState *s = bs->opaque; > - int l1_index, l2_index; > - uint64_t l2_offset, *l2_table, cluster_offset; > - int l1_bits, c; > - int index_in_cluster, nb_available, nb_needed, nb_clusters; > - > - index_in_cluster = (offset >> 9) & (s->cluster_sectors - 1); > - nb_needed = *num + index_in_cluster; > - > - l1_bits = s->l2_bits + s->cluster_bits; > - > - /* compute how many bytes there are between the offset and > - * the end of the l1 entry > - */ > - > - nb_available = (1 << l1_bits) - (offset & ((1 << l1_bits) - 1)); > - > - /* compute the number of available sectors */ > - > - nb_available = (nb_available >> 9) + index_in_cluster; > - > - cluster_offset = 0; > - > - /* seek the the l2 offset in the l1 table */ > - > - l1_index = offset >> l1_bits; > - if (l1_index >= s->l1_size) > - goto out; > - > - l2_offset = s->l1_table[l1_index]; > - > - /* seek the l2 table of the given l2 offset */ > - > - if (!l2_offset) > - goto out; > - > - /* load the l2 table in memory */ > - > - l2_offset &= ~QCOW_OFLAG_COPIED; > - l2_table = l2_load(bs, l2_offset); > - if (l2_table == NULL) > - return 0; > - > - /* find the cluster offset for the given disk offset */ 
> - > - l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1); > - cluster_offset = be64_to_cpu(l2_table[l2_index]); > - nb_clusters = size_to_clusters(s, nb_needed << 9); > - > - if (!cluster_offset) { > - /* how many empty clusters ? */ > - c = count_contiguous_free_clusters(nb_clusters, &l2_table[l2_index]); > - } else { > - /* how many allocated clusters ? */ > - c = count_contiguous_clusters(nb_clusters, s->cluster_size, > - &l2_table[l2_index], 0, QCOW_OFLAG_COPIED); > - } > - > - nb_available = (c * s->cluster_sectors); > -out: > - if (nb_available > nb_needed) > - nb_available = nb_needed; > - > - *num = nb_available - index_in_cluster; > - > - return cluster_offset & ~QCOW_OFLAG_COPIED; > -} > - > -/* > - * free_any_clusters > - * > - * free clusters according to its type: compressed or not > - * > - */ > - > -static void free_any_clusters(BlockDriverState *bs, > - uint64_t cluster_offset, int nb_clusters) > -{ > - BDRVQcowState *s = bs->opaque; > - > - /* free the cluster */ > - > - if (cluster_offset & QCOW_OFLAG_COMPRESSED) { > - int nb_csectors; > - nb_csectors = ((cluster_offset >> s->csize_shift) & > - s->csize_mask) + 1; > - free_clusters(bs, (cluster_offset & s->cluster_offset_mask) & ~511, > - nb_csectors * 512); > - return; > - } > - > - free_clusters(bs, cluster_offset, nb_clusters << s->cluster_bits); > - > - return; > -} > - > -/* > - * get_cluster_table > - * > - * for a given disk offset, load (and allocate if needed) > - * the l2 table. > - * > - * the l2 table offset in the qcow2 file and the cluster index > - * in the l2 table are given to the caller. 
> - * > - */ > - > -static int get_cluster_table(BlockDriverState *bs, uint64_t offset, > - uint64_t **new_l2_table, > - uint64_t *new_l2_offset, > - int *new_l2_index) > + uint64_t offset, int allocate, > + int compressed_size, > + int n_start, int n_end) > { > BDRVQcowState *s = bs->opaque; > - int l1_index, l2_index, ret; > - uint64_t l2_offset, *l2_table; > - > - /* seek the the l2 offset in the l1 table */ > + int min_index, i, j, l1_index, l2_index, ret; > + uint64_t l2_offset, *l2_table, cluster_offset, tmp, old_l2_offset; > > l1_index = offset >> (s->l2_bits + s->cluster_bits); > if (l1_index >= s->l1_size) { > - ret = grow_l1_table(bs, l1_index + 1); > - if (ret < 0) > + /* outside l1 table is allowed: we grow the table if needed */ > + if (!allocate) > + return 0; > + if (grow_l1_table(bs, l1_index + 1) < 0) > return 0; > } > l2_offset = s->l1_table[l1_index]; > + if (!l2_offset) { > + if (!allocate) > + return 0; > + l2_allocate: > + old_l2_offset = l2_offset; > + /* allocate a new l2 entry */ > + l2_offset = alloc_clusters(bs, s->l2_size * sizeof(uint64_t)); > + /* update the L1 entry */ > + s->l1_table[l1_index] = l2_offset | QCOW_OFLAG_COPIED; > + tmp = cpu_to_be64(l2_offset | QCOW_OFLAG_COPIED); > + if (bdrv_pwrite(s->hd, s->l1_table_offset + l1_index * sizeof(tmp), > + &tmp, sizeof(tmp)) != sizeof(tmp)) > + return 0; > + min_index = l2_cache_new_entry(bs); > + l2_table = s->l2_cache + (min_index << s->l2_bits); > > - /* seek the l2 table of the given l2 offset */ > - > - if (l2_offset & QCOW_OFLAG_COPIED) { > - /* load the l2 table in memory */ > - l2_offset &= ~QCOW_OFLAG_COPIED; > - l2_table = l2_load(bs, l2_offset); > - if (l2_table == NULL) > + if (old_l2_offset == 0) { > + memset(l2_table, 0, s->l2_size * sizeof(uint64_t)); > + } else { > + if (bdrv_pread(s->hd, old_l2_offset, > + l2_table, s->l2_size * sizeof(uint64_t)) != > + s->l2_size * sizeof(uint64_t)) > + return 0; > + } > + if (bdrv_pwrite(s->hd, l2_offset, > + l2_table, s->l2_size * 
sizeof(uint64_t)) != > + s->l2_size * sizeof(uint64_t)) > return 0; > } else { > - if (l2_offset) > - free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t)); > - l2_table = l2_allocate(bs, l1_index); > - if (l2_table == NULL) > + if (!(l2_offset & QCOW_OFLAG_COPIED)) { > + if (allocate) { > + free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t)); > + goto l2_allocate; > + } > + } else { > + l2_offset &= ~QCOW_OFLAG_COPIED; > + } > + for(i = 0; i < L2_CACHE_SIZE; i++) { > + if (l2_offset == s->l2_cache_offsets[i]) { > + /* increment the hit count */ > + if (++s->l2_cache_counts[i] == 0xffffffff) { > + for(j = 0; j < L2_CACHE_SIZE; j++) { > + s->l2_cache_counts[j] >>= 1; > + } > + } > + l2_table = s->l2_cache + (i << s->l2_bits); > + goto found; > + } > + } > + /* not found: load a new entry in the least used one */ > + min_index = l2_cache_new_entry(bs); > + l2_table = s->l2_cache + (min_index << s->l2_bits); > + if (bdrv_pread(s->hd, l2_offset, l2_table, s->l2_size * sizeof(uint64_t)) != > + s->l2_size * sizeof(uint64_t)) > return 0; > - l2_offset = s->l1_table[l1_index] & ~QCOW_OFLAG_COPIED; > } > - > - /* find the cluster offset for the given disk offset */ > - > + s->l2_cache_offsets[min_index] = l2_offset; > + s->l2_cache_counts[min_index] = 1; > + found: > l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1); > - > - *new_l2_table = l2_table; > - *new_l2_offset = l2_offset; > - *new_l2_index = l2_index; > - > - return 1; > -} > - > -/* > - * alloc_compressed_cluster_offset > - * > - * For a given offset of the disk image, return cluster offset in > - * qcow2 file. > - * > - * If the offset is not found, allocate a new compressed cluster. > - * > - * Return the cluster offset if successful, > - * Return 0, otherwise. 
> - * > - */ > - > -static uint64_t alloc_compressed_cluster_offset(BlockDriverState *bs, > - uint64_t offset, > - int compressed_size) > -{ > - BDRVQcowState *s = bs->opaque; > - int l2_index, ret; > - uint64_t l2_offset, *l2_table, cluster_offset; > - int nb_csectors; > - > - ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index); > - if (ret == 0) > - return 0; > - > cluster_offset = be64_to_cpu(l2_table[l2_index]); > - if (cluster_offset & QCOW_OFLAG_COPIED) > - return cluster_offset & ~QCOW_OFLAG_COPIED; > - > - if (cluster_offset) > - free_any_clusters(bs, cluster_offset, 1); > - > - cluster_offset = alloc_bytes(bs, compressed_size); > - nb_csectors = ((cluster_offset + compressed_size - 1) >> 9) - > - (cluster_offset >> 9); > - > - cluster_offset |= QCOW_OFLAG_COMPRESSED | > - ((uint64_t)nb_csectors << s->csize_shift); > - > - /* update L2 table */ > - > - /* compressed clusters never have the copied flag */ > - > - l2_table[l2_index] = cpu_to_be64(cluster_offset); > - if (bdrv_pwrite(s->hd, > - l2_offset + l2_index * sizeof(uint64_t), > - l2_table + l2_index, > - sizeof(uint64_t)) != sizeof(uint64_t)) > - return 0; > - > - return cluster_offset; > -} > - > -typedef struct QCowL2Meta > -{ > - uint64_t offset; > - int n_start; > - int nb_available; > - int nb_clusters; > -} QCowL2Meta; > - > -static int alloc_cluster_link_l2(BlockDriverState *bs, uint64_t cluster_offset, > - QCowL2Meta *m) > -{ > - BDRVQcowState *s = bs->opaque; > - int i, j = 0, l2_index, ret; > - uint64_t *old_cluster, start_sect, l2_offset, *l2_table; > - > - if (m->nb_clusters == 0) > - return 0; > - > - if (!(old_cluster = qemu_malloc(m->nb_clusters * sizeof(uint64_t)))) > - return -ENOMEM; > - > - /* copy content of unmodified sectors */ > - start_sect = (m->offset & ~(s->cluster_size - 1)) >> 9; > - if (m->n_start) { > - ret = copy_sectors(bs, start_sect, cluster_offset, 0, m->n_start); > - if (ret < 0) > - goto err; > + if (!cluster_offset) { > + if (!allocate) > + 
return cluster_offset; > + } else if (!(cluster_offset & QCOW_OFLAG_COPIED)) { > + if (!allocate) > + return cluster_offset; > + /* free the cluster */ > + if (cluster_offset & QCOW_OFLAG_COMPRESSED) { > + int nb_csectors; > + nb_csectors = ((cluster_offset >> s->csize_shift) & > + s->csize_mask) + 1; > + free_clusters(bs, (cluster_offset & s->cluster_offset_mask) & ~511, > + nb_csectors * 512); > + } else { > + free_clusters(bs, cluster_offset, s->cluster_size); > + } > + } else { > + cluster_offset &= ~QCOW_OFLAG_COPIED; > + return cluster_offset; > } > - > - if (m->nb_available & (s->cluster_sectors - 1)) { > - uint64_t end = m->nb_available & ~(uint64_t)(s->cluster_sectors - 1); > - ret = copy_sectors(bs, start_sect + end, cluster_offset + (end << 9), > - m->nb_available - end, s->cluster_sectors); > - if (ret < 0) > - goto err; > + if (allocate == 1) { > + /* allocate a new cluster */ > + cluster_offset = alloc_clusters(bs, s->cluster_size); > + > + /* we must initialize the cluster content which won't be > + written */ > + if ((n_end - n_start) < s->cluster_sectors) { > + uint64_t start_sect; > + > + start_sect = (offset & ~(s->cluster_size - 1)) >> 9; > + ret = copy_sectors(bs, start_sect, > + cluster_offset, 0, n_start); > + if (ret < 0) > + return 0; > + ret = copy_sectors(bs, start_sect, > + cluster_offset, n_end, s->cluster_sectors); > + if (ret < 0) > + return 0; > + } > + tmp = cpu_to_be64(cluster_offset | QCOW_OFLAG_COPIED); > + } else { > + int nb_csectors; > + cluster_offset = alloc_bytes(bs, compressed_size); > + nb_csectors = ((cluster_offset + compressed_size - 1) >> 9) - > + (cluster_offset >> 9); > + cluster_offset |= QCOW_OFLAG_COMPRESSED | > + ((uint64_t)nb_csectors << s->csize_shift); > + /* compressed clusters never have the copied flag */ > + tmp = cpu_to_be64(cluster_offset); > } > - > - ret = -EIO; > /* update L2 table */ > - if (!get_cluster_table(bs, m->offset, &l2_table, &l2_offset, &l2_index)) > - goto err; > - > - for (i = 0; i < 
m->nb_clusters; i++) { > - if(l2_table[l2_index + i] != 0) > - old_cluster[j++] = l2_table[l2_index + i]; > - > - l2_table[l2_index + i] = cpu_to_be64((cluster_offset + > - (i << s->cluster_bits)) | QCOW_OFLAG_COPIED); > - } > - > - if (bdrv_pwrite(s->hd, l2_offset + l2_index * sizeof(uint64_t), > - l2_table + l2_index, m->nb_clusters * sizeof(uint64_t)) != > - m->nb_clusters * sizeof(uint64_t)) > - goto err; > - > - for (i = 0; i < j; i++) > - free_any_clusters(bs, old_cluster[i], 1); > - > - ret = 0; > -err: > - qemu_free(old_cluster); > - return ret; > - } > - > -/* > - * alloc_cluster_offset > - * > - * For a given offset of the disk image, return cluster offset in > - * qcow2 file. > - * > - * If the offset is not found, allocate a new cluster. > - * > - * Return the cluster offset if successful, > - * Return 0, otherwise. > - * > - */ > - > -static uint64_t alloc_cluster_offset(BlockDriverState *bs, > - uint64_t offset, > - int n_start, int n_end, > - int *num, QCowL2Meta *m) > -{ > - BDRVQcowState *s = bs->opaque; > - int l2_index, ret; > - uint64_t l2_offset, *l2_table, cluster_offset; > - int nb_clusters, i = 0; > - > - ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index); > - if (ret == 0) > + l2_table[l2_index] = tmp; > + if (bdrv_pwrite(s->hd, > + l2_offset + l2_index * sizeof(tmp), &tmp, sizeof(tmp)) != sizeof(tmp)) > return 0; > - > - nb_clusters = size_to_clusters(s, n_end << 9); > - > - nb_clusters = MIN(nb_clusters, s->l2_size - l2_index); > - > - cluster_offset = be64_to_cpu(l2_table[l2_index]); > - > - /* We keep all QCOW_OFLAG_COPIED clusters */ > - > - if (cluster_offset & QCOW_OFLAG_COPIED) { > - nb_clusters = count_contiguous_clusters(nb_clusters, s->cluster_size, > - &l2_table[l2_index], 0, 0); > - > - cluster_offset &= ~QCOW_OFLAG_COPIED; > - m->nb_clusters = 0; > - > - goto out; > - } > - > - /* for the moment, multiple compressed clusters are not managed */ > - > - if (cluster_offset & QCOW_OFLAG_COMPRESSED) > - 
nb_clusters = 1; > - > - /* how many available clusters ? */ > - > - while (i < nb_clusters) { > - i += count_contiguous_clusters(nb_clusters - i, s->cluster_size, > - &l2_table[l2_index], i, 0); > - > - if(be64_to_cpu(l2_table[l2_index + i])) > - break; > - > - i += count_contiguous_free_clusters(nb_clusters - i, > - &l2_table[l2_index + i]); > - > - cluster_offset = be64_to_cpu(l2_table[l2_index + i]); > - > - if ((cluster_offset & QCOW_OFLAG_COPIED) || > - (cluster_offset & QCOW_OFLAG_COMPRESSED)) > - break; > - } > - nb_clusters = i; > - > - /* allocate a new cluster */ > - > - cluster_offset = alloc_clusters(bs, nb_clusters * s->cluster_size); > - > - /* save info needed for meta data update */ > - m->offset = offset; > - m->n_start = n_start; > - m->nb_clusters = nb_clusters; > - > -out: > - m->nb_available = MIN(nb_clusters << (s->cluster_bits - 9), n_end); > - > - *num = m->nb_available - n_start; > - > return cluster_offset; > } > > static int qcow_is_allocated(BlockDriverState *bs, int64_t sector_num, > int nb_sectors, int *pnum) > { > + BDRVQcowState *s = bs->opaque; > + int index_in_cluster, n; > uint64_t cluster_offset; > > - *pnum = nb_sectors; > - cluster_offset = get_cluster_offset(bs, sector_num << 9, pnum); > - > + cluster_offset = get_cluster_offset(bs, sector_num << 9, 0, 0, 0, 0); > + index_in_cluster = sector_num & (s->cluster_sectors - 1); > + n = s->cluster_sectors - index_in_cluster; > + if (n > nb_sectors) > + n = nb_sectors; > + *pnum = n; > return (cluster_offset != 0); > } > > @@ -1102,9 +723,11 @@ > uint64_t cluster_offset; > > while (nb_sectors > 0) { > - n = nb_sectors; > - cluster_offset = get_cluster_offset(bs, sector_num << 9, &n); > + cluster_offset = get_cluster_offset(bs, sector_num << 9, 0, 0, 0, 0); > index_in_cluster = sector_num & (s->cluster_sectors - 1); > + n = s->cluster_sectors - index_in_cluster; > + if (n > nb_sectors) > + n = nb_sectors; > if (!cluster_offset) { > if (bs->backing_hd) { > /* read from the base image 
*/ > @@ -1143,18 +766,15 @@ > BDRVQcowState *s = bs->opaque; > int ret, index_in_cluster, n; > uint64_t cluster_offset; > - int n_end; > - QCowL2Meta l2meta; > > while (nb_sectors > 0) { > index_in_cluster = sector_num & (s->cluster_sectors - 1); > - n_end = index_in_cluster + nb_sectors; > - if (s->crypt_method && > - n_end > QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors) > - n_end = QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors; > - cluster_offset = alloc_cluster_offset(bs, sector_num << 9, > - index_in_cluster, > - n_end, &n, &l2meta); > + n = s->cluster_sectors - index_in_cluster; > + if (n > nb_sectors) > + n = nb_sectors; > + cluster_offset = get_cluster_offset(bs, sector_num << 9, 1, 0, > + index_in_cluster, > + index_in_cluster + n); > if (!cluster_offset) > return -1; > if (s->crypt_method) { > @@ -1165,10 +785,8 @@ > } else { > ret = bdrv_pwrite(s->hd, cluster_offset + index_in_cluster * 512, buf, n * 512); > } > - if (ret != n * 512 || alloc_cluster_link_l2(bs, cluster_offset, &l2meta) < 0) { > - free_any_clusters(bs, cluster_offset, l2meta.nb_clusters); > + if (ret != n * 512) > return -1; > - } > nb_sectors -= n; > sector_num += n; > buf += n * 512; > @@ -1186,33 +804,8 @@ > uint64_t cluster_offset; > uint8_t *cluster_data; > BlockDriverAIOCB *hd_aiocb; > - QEMUBH *bh; > - QCowL2Meta l2meta; > } QCowAIOCB; > > -static void qcow_aio_read_cb(void *opaque, int ret); > -static void qcow_aio_read_bh(void *opaque) > -{ > - QCowAIOCB *acb = opaque; > - qemu_bh_delete(acb->bh); > - acb->bh = NULL; > - qcow_aio_read_cb(opaque, 0); > -} > - > -static int qcow_schedule_bh(QEMUBHFunc *cb, QCowAIOCB *acb) > -{ > - if (acb->bh) > - return -EIO; > - > - acb->bh = qemu_bh_new(cb, acb); > - if (!acb->bh) > - return -EIO; > - > - qemu_bh_schedule(acb->bh); > - > - return 0; > -} > - > static void qcow_aio_read_cb(void *opaque, int ret) > { > QCowAIOCB *acb = opaque; > @@ -1222,12 +815,13 @@ > > acb->hd_aiocb = NULL; > if (ret < 0) { > -fail: > + fail: > 
acb->common.cb(acb->common.opaque, ret); > qemu_aio_release(acb); > return; > } > > + redo: > /* post process the read buffer */ > if (!acb->cluster_offset) { > /* nothing to do */ > @@ -1253,9 +847,12 @@ > } > > /* prepare next AIO request */ > - acb->n = acb->nb_sectors; > - acb->cluster_offset = get_cluster_offset(bs, acb->sector_num << 9, &acb->n); > + acb->cluster_offset = get_cluster_offset(bs, acb->sector_num << 9, > + 0, 0, 0, 0); > index_in_cluster = acb->sector_num & (s->cluster_sectors - 1); > + acb->n = s->cluster_sectors - index_in_cluster; > + if (acb->n > acb->nb_sectors) > + acb->n = acb->nb_sectors; > > if (!acb->cluster_offset) { > if (bs->backing_hd) { > @@ -1268,16 +865,12 @@ > if (acb->hd_aiocb == NULL) > goto fail; > } else { > - ret = qcow_schedule_bh(qcow_aio_read_bh, acb); > - if (ret < 0) > - goto fail; > + goto redo; > } > } else { > /* Note: in this case, no need to wait */ > memset(acb->buf, 0, 512 * acb->n); > - ret = qcow_schedule_bh(qcow_aio_read_bh, acb); > - if (ret < 0) > - goto fail; > + goto redo; > } > } else if (acb->cluster_offset & QCOW_OFLAG_COMPRESSED) { > /* add AIO support for compressed blocks ? 
*/ > @@ -1285,9 +878,7 @@ > goto fail; > memcpy(acb->buf, > s->cluster_cache + index_in_cluster * 512, 512 * acb->n); > - ret = qcow_schedule_bh(qcow_aio_read_bh, acb); > - if (ret < 0) > - goto fail; > + goto redo; > } else { > if ((acb->cluster_offset & 511) != 0) { > ret = -EIO; > @@ -1316,7 +907,6 @@ > acb->nb_sectors = nb_sectors; > acb->n = 0; > acb->cluster_offset = 0; > - acb->l2meta.nb_clusters = 0; > return acb; > } > > @@ -1340,8 +930,8 @@ > BlockDriverState *bs = acb->common.bs; > BDRVQcowState *s = bs->opaque; > int index_in_cluster; > + uint64_t cluster_offset; > const uint8_t *src_buf; > - int n_end; > > acb->hd_aiocb = NULL; > > @@ -1352,11 +942,6 @@ > return; > } > > - if (alloc_cluster_link_l2(bs, acb->cluster_offset, &acb->l2meta) < 0) { > - free_any_clusters(bs, acb->cluster_offset, acb->l2meta.nb_clusters); > - goto fail; > - } > - > acb->nb_sectors -= acb->n; > acb->sector_num += acb->n; > acb->buf += acb->n * 512; > @@ -1369,22 +954,19 @@ > } > > index_in_cluster = acb->sector_num & (s->cluster_sectors - 1); > - n_end = index_in_cluster + acb->nb_sectors; > - if (s->crypt_method && > - n_end > QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors) > - n_end = QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors; > - > - acb->cluster_offset = alloc_cluster_offset(bs, acb->sector_num << 9, > - index_in_cluster, > - n_end, &acb->n, &acb->l2meta); > - if (!acb->cluster_offset || (acb->cluster_offset & 511) != 0) { > + acb->n = s->cluster_sectors - index_in_cluster; > + if (acb->n > acb->nb_sectors) > + acb->n = acb->nb_sectors; > + cluster_offset = get_cluster_offset(bs, acb->sector_num << 9, 1, 0, > + index_in_cluster, > + index_in_cluster + acb->n); > + if (!cluster_offset || (cluster_offset & 511) != 0) { > ret = -EIO; > goto fail; > } > if (s->crypt_method) { > if (!acb->cluster_data) { > - acb->cluster_data = qemu_mallocz(QCOW_MAX_CRYPT_CLUSTERS * > - s->cluster_size); > + acb->cluster_data = qemu_mallocz(s->cluster_size); > if (!acb->cluster_data) { > ret = 
-ENOMEM; > goto fail; > @@ -1397,7 +979,7 @@ > src_buf = acb->buf; > } > acb->hd_aiocb = bdrv_aio_write(s->hd, > - (acb->cluster_offset >> 9) + index_in_cluster, > + (cluster_offset >> 9) + index_in_cluster, > src_buf, acb->n, > qcow_aio_write_cb, acb); > if (acb->hd_aiocb == NULL) > @@ -1571,7 +1153,7 @@ > > memset(s->l1_table, 0, l1_length); > if (bdrv_pwrite(s->hd, s->l1_table_offset, s->l1_table, l1_length) < 0) > - return -1; > + return -1; > ret = bdrv_truncate(s->hd, s->l1_table_offset + l1_length); > if (ret < 0) > return ret; > @@ -1637,10 +1219,8 @@ > /* could not compress: write normal cluster */ > qcow_write(bs, sector_num, buf, s->cluster_sectors); > } else { > - cluster_offset = alloc_compressed_cluster_offset(bs, sector_num << 9, > - out_len); > - if (!cluster_offset) > - return -1; > + cluster_offset = get_cluster_offset(bs, sector_num << 9, 2, > + out_len, 0, 0); > cluster_offset &= s->cluster_offset_mask; > if (bdrv_pwrite(s->hd, cluster_offset, out_buf, out_len) != out_len) { > qemu_free(out_buf); > @@ -2225,19 +1805,26 @@ > BDRVQcowState *s = bs->opaque; > int i, nb_clusters; > > - nb_clusters = size_to_clusters(s, size); > -retry: > - for(i = 0; i < nb_clusters; i++) { > - int64_t i = s->free_cluster_index++; > - if (get_refcount(bs, i) != 0) > - goto retry; > - } > + nb_clusters = (size + s->cluster_size - 1) >> s->cluster_bits; > + for(;;) { > + if (get_refcount(bs, s->free_cluster_index) == 0) { > + s->free_cluster_index++; > + for(i = 1; i < nb_clusters; i++) { > + if (get_refcount(bs, s->free_cluster_index) != 0) > + goto not_found; > + s->free_cluster_index++; > + } > #ifdef DEBUG_ALLOC2 > - printf("alloc_clusters: size=%lld -> %lld\n", > - size, > - (s->free_cluster_index - nb_clusters) << s->cluster_bits); > + printf("alloc_clusters: size=%lld -> %lld\n", > + size, > + (s->free_cluster_index - nb_clusters) << s->cluster_bits); > #endif > - return (s->free_cluster_index - nb_clusters) << s->cluster_bits; > + return (s->free_cluster_index 
- nb_clusters) << s->cluster_bits; > + } else { > + not_found: > + s->free_cluster_index++; > + } > + } > } > > static int64_t alloc_clusters(BlockDriverState *bs, int64_t size) > @@ -2301,7 +1888,8 @@ > int new_table_size, new_table_size2, refcount_table_clusters, i, ret; > uint64_t *new_table; > int64_t table_offset; > - uint8_t data[12]; > + uint64_t data64; > + uint32_t data32; > int old_table_size; > int64_t old_table_offset; > > @@ -2340,10 +1928,13 @@ > for(i = 0; i < s->refcount_table_size; i++) > be64_to_cpus(&new_table[i]); > > - cpu_to_be64w((uint64_t*)data, table_offset); > - cpu_to_be32w((uint32_t*)(data + 8), refcount_table_clusters); > + data64 = cpu_to_be64(table_offset); > if (bdrv_pwrite(s->hd, offsetof(QCowHeader, refcount_table_offset), > - data, sizeof(data)) != sizeof(data)) > + &data64, sizeof(data64)) != sizeof(data64)) > + goto fail; > + data32 = cpu_to_be32(refcount_table_clusters); > + if (bdrv_pwrite(s->hd, offsetof(QCowHeader, refcount_table_clusters), > + &data32, sizeof(data32)) != sizeof(data32)) > goto fail; > qemu_free(s->refcount_table); > old_table_offset = s->refcount_table_offset; > @@ -2572,7 +2163,7 @@ > uint16_t *refcount_table; > > size = bdrv_getlength(s->hd); > - nb_clusters = size_to_clusters(s, size); > + nb_clusters = (size + s->cluster_size - 1) >> s->cluster_bits; > refcount_table = qemu_mallocz(nb_clusters * sizeof(uint16_t)); > > /* header */ > @@ -2624,7 +2215,7 @@ > int refcount; > > size = bdrv_getlength(s->hd); > - nb_clusters = size_to_clusters(s, size); > + nb_clusters = (size + s->cluster_size - 1) >> s->cluster_bits; > for(k = 0; k < nb_clusters;) { > k1 = k; > refcount = get_refcount(bs, k); > > > ^ permalink raw reply [flat|nested] 82+ messages in thread
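Editorial sketch: the allocation hunk in the patch above scans forward from `free_cluster_index` until it finds enough consecutive clusters whose refcount is zero. The standalone C below illustrates that first-fit idea only. It is not the code from either side of the diff: `ToyImage`, `toy_get_refcount` and `toy_alloc_clusters` are hypothetical names, the refcounts live in a plain array, clusters past the end of the array are assumed to be free, and a single run counter replaces the nested loop and `goto`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the image state: one 16-bit refcount per cluster. */
typedef struct {
    const uint16_t *refcounts;  /* refcounts[i] == 0 means cluster i is free */
    int64_t nb_clusters;        /* number of entries in refcounts */
    int64_t free_cluster_index; /* where the next search starts */
    int cluster_bits;           /* log2 of the cluster size in bytes */
} ToyImage;

/* Assumption of this sketch: clusters past the end of the table count as
 * free, because the image file can simply grow. */
static int toy_get_refcount(const ToyImage *img, int64_t index)
{
    assert(index >= 0);
    return index < img->nb_clusters ? img->refcounts[index] : 0;
}

/* First-fit search: return the byte offset of the first run of free clusters
 * large enough to hold `size` bytes. Like the hunk above, it only advances
 * free_cluster_index and does not touch any refcount. */
static int64_t toy_alloc_clusters(ToyImage *img, int64_t size)
{
    int64_t cluster_size = (int64_t)1 << img->cluster_bits;
    int64_t nb = (size + cluster_size - 1) >> img->cluster_bits; /* round up */
    int64_t run = 0; /* length of the current run of free clusters */

    assert(size > 0);
    for (;;) {
        if (toy_get_refcount(img, img->free_cluster_index) == 0) {
            run++;
        } else {
            run = 0; /* an in-use cluster breaks the run */
        }
        img->free_cluster_index++;
        if (run == nb) {
            return (img->free_cluster_index - nb) << img->cluster_bits;
        }
    }
}
```

The loop always terminates in this sketch because everything beyond the table is treated as free; a real image format would also have to record the new references before the clusters can be considered allocated.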
* Re: [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports 2009-02-14 23:13 ` Anthony Liguori @ 2009-02-15 2:01 ` Jamie Lokier 2009-02-15 4:09 ` Anthony Liguori 0 siblings, 1 reply; 82+ messages in thread From: Jamie Lokier @ 2009-02-15 2:01 UTC (permalink / raw) To: qemu-devel Anthony Liguori wrote: > Well such a large reversion is a bad idea. Can you git bisect to the > actual changeset that introduced the bug you see? Have done, did you read the other thread? Message-ID: <20090211114126.GC31997@shareable.org> Subject: Re: qcow2 corruption observed, fixed by reverting old change Jamie Lokier wrote: > Kevin Wolf wrote: > > Jamie Lokier wrote: > > > Although there are many ways to make Windows blue screen in KVM, in > > > this case I've narrowed it down to the difference in > > > qemu/block-qcow2.c between kvm-72 and kvm-73 (not -83). > > > > This must be one of SVN revisions 5003 to 5008 in upstream qemu. Can you > > narrow it down to one of these? I certainly don't feel like reviewing > > all of them once again. > > It's QEMU SVN delta 5005-5006, copied below. I don't have time to disentangle the different optimisations done to qcow2 around that changeset, nor fix the changeset itself, but I can test proposed patches on my guest VM image, which I've copied aside because it's consistent about failing or not. If nobody else has time either, then I think an imminent new QEMU release, which may get rolled into distros and so on, is better off with the changes reverted than corrupting guest images. I'm not proposing throwing away all the good work done on qcow2, only that fixing observed corruption is important especially for a major release, and reverting later changes can be temporary until the bug is found and fixed. -- Jamie ^ permalink raw reply [flat|nested] 82+ messages in thread
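Editorial sketch: narrowing a regression down to a single revision delta, as described in the message above, is a binary search between a known-good and a known-bad revision. The C below shows only that search. The names `first_bad_revision` and `example_bug_from_5006` are hypothetical, the predicate stands in for "build this revision and run the failing guest image", and the sketch assumes the bug stays present in every revision after the one that introduced it.

```c
#include <assert.h>

/* Hypothetical predicate: nonzero if the given revision shows the bug.
 * In practice this means building that revision and running the test. */
typedef int (*rev_is_bad_fn)(int rev);

/* Illustrative predicate only, matching the "delta 5005-5006" report:
 * revisions up to 5005 are good, 5006 and later are bad. */
static int example_bug_from_5006(int rev)
{
    return rev >= 5006;
}

/* Bisect between a known-good and a known-bad revision (good < bad).
 * Returns the first revision for which is_bad() is true, assuming the
 * bug persists in all later revisions. */
static int first_bad_revision(int good, int bad, rev_is_bad_fn is_bad)
{
    assert(good < bad);
    while (bad - good > 1) {
        int mid = good + (bad - good) / 2;
        if (is_bad(mid)) {
            bad = mid;   /* bug already present: look earlier */
        } else {
            good = mid;  /* still fine: look later */
        }
    }
    return bad;
}
```

With revision 5002 taken as the last known-good point and 5008 as known-bad, `first_bad_revision(5002, 5008, example_bug_from_5006)` needs two test runs to isolate the change.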
* Re: [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports 2009-02-15 2:01 ` Jamie Lokier @ 2009-02-15 4:09 ` Anthony Liguori 2009-02-15 15:42 ` Jamie Lokier 0 siblings, 1 reply; 82+ messages in thread From: Anthony Liguori @ 2009-02-15 4:09 UTC (permalink / raw) To: qemu-devel On Sat, Feb 14, 2009 at 8:01 PM, Jamie Lokier <jamie@shareable.org> wrote: > Have done, did you read the other thread? Yes, but your patch confused me (which is admittedly not hard). >> It's QEMU SVN delta 5005-5006, copied below. So why such an aggressive revert? Why not just revert the problematic changesets? Regards, Anthony Liguori ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports 2009-02-15 4:09 ` Anthony Liguori @ 2009-02-15 15:42 ` Jamie Lokier 2009-02-15 18:19 ` Anthony Liguori 0 siblings, 1 reply; 82+ messages in thread From: Jamie Lokier @ 2009-02-15 15:42 UTC (permalink / raw) To: qemu-devel Anthony Liguori wrote: > On Sat, Feb 14, 2009 at 8:01 PM, Jamie Lokier <jamie@shareable.org> wrote: > > Have done, did you read the other thread? > > Yes, but your patch confused me (which is admittedly not hard). > > >> It's QEMU SVN delta 5005-5006, copied below. > > So why such an aggressive revert? Why not just revert the problematic > changesets? Because most of the following changes look too dependent on it. I did keep a couple of changes which are trivially independent since that one - default to "cache=writeback" and eliminating #define offsetof. You have a point that QEMU SVN deltas up to 5005 don't need to be reverted. Reason for that: I simply don't have time to trim the patch down to its bare essentials quickly, and being a corruption bug, it should be dealt with quickly. This one seems to work; feel free to improve it by reverting less, or waiting a long time for me to do so :-) -- Jamie ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports 2009-02-15 15:42 ` Jamie Lokier @ 2009-02-15 18:19 ` Anthony Liguori 2009-02-15 18:34 ` Johannes Schindelin 2009-02-17 1:01 ` Jamie Lokier 0 siblings, 2 replies; 82+ messages in thread From: Anthony Liguori @ 2009-02-15 18:19 UTC (permalink / raw) To: qemu-devel Jamie Lokier wrote: > Anthony Liguori wrote: > >> On Sat, Feb 14, 2009 at 8:01 PM, Jamie Lokier <jamie@shareable.org> wrote: >> >>> Have done, did you read the other thread? >>> >> Yes, but your patch confused me (which is admittedly not hard). >> >> >>>> It's QEMU SVN delta 5005-5006, copied below. >>>> >> So why such an aggressive revert? Why not just revert the problematic >> changesets? >> > > Because most of the following changes look too dependent on it. > Too dependent on the introduced functionality or too dependent to make porting trivial? My impression upon looking was that it's the latter, not the former. If that is the case, then someone needs to do the work of properly reverting. > I did keep a couple of changes which are trivially independent since > that one - default to "cache=writeback" and eliminating #define > offsetof. > > You have a point that QEMU SVN deltas up to 5005 don't need to be > reverted. Reason for that: I simply don't have time to trim the patch > down to its bare essentials quickly, and being a corruption bug, it > should be dealt with quickly. This one seems to work; feel free to > improve it by reverting less, or waiting a long time for me to do so :-) > But many of the changes since 5005 were also corruption fixes. And let's be clear, your data is *not* safe with qcow2. So I don't consider this to be a show stopping issue. Regards, Anthony Liguori > -- Jamie > > > ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports 2009-02-15 18:19 ` Anthony Liguori @ 2009-02-15 18:34 ` Johannes Schindelin 2009-02-16 1:01 ` Anthony Liguori 2009-02-16 1:19 ` Anthony Liguori 2009-02-17 1:01 ` Jamie Lokier 1 sibling, 2 replies; 82+ messages in thread From: Johannes Schindelin @ 2009-02-15 18:34 UTC (permalink / raw) To: Anthony Liguori; +Cc: qemu-devel Hi, On Sun, 15 Feb 2009, Anthony Liguori wrote: > And let's be clear, your data is *not* safe with qcow2. So I don't > consider this to be a show stopping issue. I beg your pardon? The one format that was recommended for quite a long time now is considered unsafe? That would not have happened with Fabrice in charge, Dscho ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports 2009-02-15 18:34 ` Johannes Schindelin @ 2009-02-16 1:01 ` Anthony Liguori 2009-02-17 0:52 ` Jamie Lokier 2009-02-16 1:19 ` Anthony Liguori 1 sibling, 1 reply; 82+ messages in thread From: Anthony Liguori @ 2009-02-16 1:01 UTC (permalink / raw) To: Johannes Schindelin; +Cc: qemu-devel Johannes Schindelin wrote: > Hi, > > On Sun, 15 Feb 2009, Anthony Liguori wrote: > > >> And let's be clear, your data is *not* safe with qcow2. So I don't >> consider this to be a show stopping issue. >> > > I beg your pardon? The one format that was recommended for quite a long > time now is considered unsafe? > It's always been that way. It's unsafe for a number of reasons that have been discussed at great length. Regards, Anthony Liguori > That would not have happened with Fabrice in charge, > Dscho > > ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports 2009-02-16 1:01 ` Anthony Liguori @ 2009-02-17 0:52 ` Jamie Lokier 2009-02-17 2:55 ` Anthony Liguori 0 siblings, 1 reply; 82+ messages in thread From: Jamie Lokier @ 2009-02-17 0:52 UTC (permalink / raw) To: qemu-devel Anthony Liguori wrote: > >>And let's be clear, your data is *not* safe with qcow2. So I don't > >>consider this to be a show stopping issue. > > > >I beg your pardon? The one format that was recommended for quite a long > >time now is considered unsafe? > > It's always been that way. It's unsafe for a number of reasons that > have been discussed at great length. It sure isn't mentioned in the documentation. If it was, I would never have used it, and I imagine I'm not alone. QEMU might be an emulator project where people expect quirks, but KVM and Xen are professional virtualisation platforms competing with VMware. It is really not very professional that the documentation places "your data is not safe" formats on an equal footing with safe formats - without saying anything about it - and doesn't even recommend one or the other. That said, maybe Microsoft is doing the same thing - their documentation happily recommends their VHD format if you're not concerned about running out of disk space, and maybe VHD has similar corruption windows. -- Jamie ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports 2009-02-17 0:52 ` Jamie Lokier @ 2009-02-17 2:55 ` Anthony Liguori 0 siblings, 0 replies; 82+ messages in thread From: Anthony Liguori @ 2009-02-17 2:55 UTC (permalink / raw) To: qemu-devel Jamie Lokier wrote: > Anthony Liguori wrote: > >>>> And let's be clear, your data is *not* safe with qcow2. So I don't >>>> consider this to be a show stopping issue. >>>> >>> I beg your pardon? The one format that was recommended for quite a long >>> time now is considered unsafe? >>> >> It's always been that way. It's unsafe for a number of reasons that >> have been discussed at great length. >> > > It sure isn't mentioned in the documentation. > If it was, I would never have used it, and I imagine I'm not alone. > > QEMU might be an emulator project where people expect quirks, but KVM > and Xen are professional virtualisation platforms competing with > VMware. > > It is really not very professional that the documentation places "your > data is not safe" formats on an equal footing with safe formats - > without saying anything about it - and doesn't even recommend one or > the other. > Please submit patches. I don't disagree with you and that is why I'm trying to make this clear now. > That said, maybe Microsoft is doing the same thing - their > documentation happily recommends their VHD format if you're not > concerned about running out of disk space, and it's maybe VHD has > similar corruption windows. > Yeah, it's hard to make a truly reliable format that isn't raw. It basically is the same problem file systems solve and requires either a journal or an fsck step. I'm thinking that this is a problem for other software too. Regards, Anthony Liguori > -- Jamie > > > ^ permalink raw reply [flat|nested] 82+ messages in thread
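Editorial sketch: the "fsck step" mentioned above can be pictured as recomputing, from the image metadata, how many references each cluster should have and comparing that with the refcounts stored in the image. The C below is a toy version of that comparison over plain arrays. It is not QEMU's check code: `toy_check_refcounts` and its parameters are hypothetical, and the list of references is assumed to have been collected already by walking the header and tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Compare stored refcounts with refcounts recomputed from a list of
 * references. refs[i] is the index of a cluster that some piece of
 * metadata points at (hypothetical input, gathered by the caller).
 * Returns the number of inconsistencies found, or -1 if memory for the
 * scratch table cannot be allocated. */
static int64_t toy_check_refcounts(const uint16_t *stored, int64_t nb_clusters,
                                   const int64_t *refs, size_t nb_refs)
{
    uint16_t *expected;
    int64_t k, errors = 0;
    size_t i;

    assert(nb_clusters > 0);
    expected = calloc((size_t)nb_clusters, sizeof(uint16_t));
    if (expected == NULL) {
        return -1;
    }

    /* Pass 1: count every reference found in the metadata. */
    for (i = 0; i < nb_refs; i++) {
        if (refs[i] < 0 || refs[i] >= nb_clusters) {
            errors++; /* reference to a cluster outside the image */
            continue;
        }
        expected[refs[i]]++;
    }

    /* Pass 2: every stored refcount must match the recomputed one. */
    for (k = 0; k < nb_clusters; k++) {
        if (stored[k] != expected[k]) {
            errors++;
        }
    }

    free(expected);
    return errors;
}
```

A check like this can only report that the image is inconsistent after the fact; avoiding the inconsistency in the first place is what the journal alternative in the message above is about.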
* Re: [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports 2009-02-15 18:34 ` Johannes Schindelin 2009-02-16 1:01 ` Anthony Liguori @ 2009-02-16 1:19 ` Anthony Liguori 1 sibling, 0 replies; 82+ messages in thread From: Anthony Liguori @ 2009-02-16 1:19 UTC (permalink / raw) To: Johannes Schindelin; +Cc: qemu-devel Johannes Schindelin wrote: > Hi, > > On Sun, 15 Feb 2009, Anthony Liguori wrote: > > >> And let's be clear, your data is *not* safe with qcow2. So I don't >> consider this to be a show stopping issue. >> > > I beg your pardon? The one format that was recommended for quite a long > time now is considered unsafe? > Let me be abundantly clear here. qcow2 has never been the "recommended" format IMHO. When it was introduced, it was introduced as experimental. This is why block-qcow2.c was forked instead of qcow2 support being integrated into block-qcow.c. There's a ton of duplicate code between the two. The plan was to eventually merge the two once qcow2 stabilized. That's not happened. The code was never improved much since its introduction until recently. I have concerns about the fundamental design of qcow2. I believe it will be difficult to make it safe while having acceptable performance. There has been some work recently by Gleb Natapov to reduce the corruption window in qcow2 but it's still there. I can only recommend qcow2 to casual users. It should *not* be used in production environments. It pains me to say that because we don't have a good alternative but there's no way I could recommend its use. In particular, Jamie's patch reverts one set of patches that reduces the corruption window to "fix" a corruption bug that is now being experienced. However, since we don't know exactly what the cause of the new bug is, it's not necessarily true that the revert fixes the bug. It may just make it more difficult to expose. There's really no winning scenario here except finding the root cause of the new bug and fixing it. 
That still won't make qcow2 safe for production data though. So as far as I'm concerned, until qcow2 is made completely safe, it's still an experimental feature. Regards, Anthony Liguori > That would not have happened with Fabrice in charge, > Dscho > > ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports 2009-02-15 18:19 ` Anthony Liguori 2009-02-15 18:34 ` Johannes Schindelin @ 2009-02-17 1:01 ` Jamie Lokier 1 sibling, 0 replies; 82+ messages in thread From: Jamie Lokier @ 2009-02-17 1:01 UTC (permalink / raw) To: qemu-devel Anthony Liguori wrote: > >>>>It's QEMU SVN delta 5005-5006, copied below. > >>>> > >>So why such an aggressive revert? Why not just revert the problematic > >>changesets? > > > >Because most of the following changes look too dependent on it. > > > > Too dependent on the introduced functionality or too dependent to make > porting trivial? My impression upon looking was that it's the later, My impression is the former, in that it seems necessary to understand the changes in 5006 to understand how to rewrite subsequent patches which use the changed functions. But I didn't spend a long time on it, as I can't. Of course all such things reduce to trivial porting if you have enough time. > But many of the changes since 5005 were also corruption fixes. And > let's be clear, your data is *not* safe with qcow2. So I don't consider > this to be a show stopping issue. There's a HUGE difference between "not safe if the host/QEMU crashes" and "corrupts silently during normal operation with no errors". The former is a rare event we hope. Marc's report, based apparently on a big farm of VMs, is that he observes this corruption a lot with Windows guests. The scary thing is it looks like it doesn't have anything (directly) to do with the device emulation, which is more sensitive to guest OS type. I wonder if people using kvm >= 73 have silent corruption in their Linux guests without noticing yet. -- Jamie ^ permalink raw reply [flat|nested] 82+ messages in thread
* [Qemu-devel] Re: Cutting a new QEMU release @ 2009-02-05 9:13 Steve Fosdick 2009-02-05 14:26 ` Anthony Liguori 2009-02-05 14:55 ` Rick Vernam 0 siblings, 2 replies; 82+ messages in thread From: Steve Fosdick @ 2009-02-05 9:13 UTC (permalink / raw) To: qemu-devel Given the talk of a new release I thought I'd try the latest qemu from SVN. At the moment I am being hampered by kqemu-1.4.0pre1 not compiling though: CC [M] /usr/src/kqemu-1.4.0pre1/kqemu-linux.o /usr/src/kqemu-1.4.0pre1/kqemu-linux.c: In function ‘kqemu_lock_user_page’: /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:81: error: dereferencing pointer to incomplete type /usr/src/kqemu-1.4.0pre1/kqemu-linux.c: In function ‘kqemu_schedule’: /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:194: error: implicit declaration of function ‘need_resched’ /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:195: error: implicit declaration of function ‘schedule’ /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:197: error: implicit declaration of function ‘signal_pending’ make[2]: *** [/usr/src/kqemu-1.4.0pre1/kqemu-linux.o] Error 1 This is with kernel 2.6.28.2. kqemu-1.3.0pre11 seems to compile OK with the kernel. Any ideas? Regards, Steve. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-05 9:13 [Qemu-devel] Re: Cutting a new QEMU release Steve Fosdick @ 2009-02-05 14:26 ` Anthony Liguori 2009-02-05 15:36 ` Rick Vernam ` (2 more replies) 2009-02-05 14:55 ` Rick Vernam 1 sibling, 3 replies; 82+ messages in thread From: Anthony Liguori @ 2009-02-05 14:26 UTC (permalink / raw) To: qemu-devel Steve Fosdick wrote: > Given the talk of a new release I though I'd try the latest qemu from > SVN. At the moment I am being hampered by kqemu-1.4.0pre1 not compiling > though: > > CC [M] /usr/src/kqemu-1.4.0pre1/kqemu-linux.o > /usr/src/kqemu-1.4.0pre1/kqemu-linux.c: In function > ‘kqemu_lock_user_page’: > /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:81: error: dereferencing pointer > to incomplete type > /usr/src/kqemu-1.4.0pre1/kqemu-linux.c: In function ‘kqemu_schedule’: > /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:194: error: implicit declaration > of function ‘need_resched’ > /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:195: error: implicit declaration > of function ‘schedule’ > /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:197: error: implicit declaration > of function ‘signal_pending’ > make[2]: *** [/usr/src/kqemu-1.4.0pre1/kqemu-linux.o] Error 1 > > This is with kernel 2.6.28.2. kqemu-1.3.0pre11 seems to compile OK with > the kernel. Any ideas? > kqemu is unsupported and unmaintained. Regards, Anthony Liguori > Regards, > Steve. > > > ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-05 14:26 ` Anthony Liguori @ 2009-02-05 15:36 ` Rick Vernam 2009-02-05 16:27 ` Paul Brook 2009-02-05 15:55 ` René Rebe 2009-02-07 12:01 ` Stefan Weil 2 siblings, 1 reply; 82+ messages in thread From: Rick Vernam @ 2009-02-05 15:36 UTC (permalink / raw) To: qemu-devel On Thursday 05 February 2009 8:26:04 am Anthony Liguori wrote: > kqemu is unsupported and unmaintained. Interesting. When did it fall into that status? The Maintainers file shows Fabrice as the maintainer of kqemu. I suppose that needs to be updated? I see Fabrice released 1.4.0pre1 on May 30th, 2008, although I never did see anything declaring it unsupported (I'm not suggesting it was never declared, just that I never saw any such declaration). Are there any plans to support it in the future? This really is quite a shock to me, actually. I know qemu has a wide range of uses - but for me and surely others, virtualization is a primary use. To the best of my knowledge, kvm requires hardware support - where does this leave the class of users who need virtualization & don't have hardware virtualization support? Are we no longer a target audience of qemu? If not, fine, but apparently a statement needs to be made... Also, I had considered the web site at http://bellard.org/qemu/ to be accurate. Perhaps something should be done prior to a release so that those who browse to the site know that: 1 - the site is not an accurate source of information or 2 - kqemu is no longer supported or maintained Thanks -Rick > > Regards, > > Anthony Liguori > ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-05 15:36 ` Rick Vernam @ 2009-02-05 16:27 ` Paul Brook 2009-02-05 17:15 ` René Rebe 2009-02-05 17:51 ` Ben Taylor 0 siblings, 2 replies; 82+ messages in thread From: Paul Brook @ 2009-02-05 16:27 UTC (permalink / raw) To: qemu-devel On Thursday 05 February 2009, Rick Vernam wrote: > On Thursday 05 February 2009 8:26:04 am Anthony Liguori wrote: > > kqemu is unsupported and unmaintained. > > Interesting. When did it fall into that status? IMHO it's pretty much always been that way. > The Maintainers file shows Fabrice as the maintainer of kqemu. I suppose > that needs to be updated? > > I see Fabrice released 1.4.0pre1 on May 30th, 2008, although I never did > see anything declaring it unsupported (I'm not suggesting it was never > declared, just that I never saw any such declaration). > > Are there any plans to support it in the future? This really is quite a > shock to me, actually. I know qemu has a wide range of uses - but for me > and surely others, virtualization is a primary use. To the best of my > knowledge, kvm requires hardware support - where does this leave the class > of users who need virtualization & don't have hardware virtualization > support? Are we no longer the a target audience of qemu? If not, fine, > but apparently a statement needs to be made... You have the source, you're free to fork and maintain it yourself. In practice Fabrice is pretty much the only person who's ever done significant work on kqemu (except maybe some fairly minor host OS porting bits). There's never been a public source repository, so you get to use whatever random tarballs Fabrice leaves lying around. If those don't work, no one really cares. Paul ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-05 16:27 ` Paul Brook @ 2009-02-05 17:15 ` René Rebe 2009-02-05 17:36 ` Paul Brook 2009-02-05 17:51 ` Ben Taylor 1 sibling, 1 reply; 82+ messages in thread From: René Rebe @ 2009-02-05 17:15 UTC (permalink / raw) To: qemu-devel Paul Brook wrote: > On Thursday 05 February 2009, Rick Vernam wrote: > >> On Thursday 05 February 2009 8:26:04 am Anthony Liguori wrote: >> >>> kqemu is unsupported and unmaintained. >>> >> Interesting. When did it fall into that status? >> > > IMHO It's pretty much always been that way. > > >> The Maintainers file shows Fabrice as the maintainer of kqemu. I suppose >> that needs to be updated? >> >> I see Fabrice released 1.4.0pre1 on May 30th, 2008, although I never did >> see anything declaring it unsupported (I'm not suggesting it was never >> declared, just that I never saw any such declaration). >> >> Are there any plans to support it in the future? This really is quite a >> shock to me, actually. I know qemu has a wide range of uses - but for me >> and surely others, virtualization is a primary use. To the best of my >> knowledge, kvm requires hardware support - where does this leave the class >> of users who need virtualization & don't have hardware virtualization >> support? Are we no longer the a target audience of qemu? If not, fine, >> but apparently a statement needs to be made... >> > > You have the source, you're free to fork and maintain it yourself. > > In practice Fabice is pretty much the only person who's ever done significant > work on kqemu (except maybe some fairly minor host OS porting bits). There's > never been a public source repository, so you get to use whatever random > tarballs Fabrice leaves lying around. If those don't work, noone really > cares. > I find this rather drastic. So far it appears to work pretty well. And given the sheer amount of CPU silicon without VT/SVM it looks to be worth keeping it working. Maybe just pull it into the QEMU SVN? 
Btw, does anyone know what Fabrice is doing these days? -- René Rebe - ExactCODE GmbH - Europe, Germany, Berlin http://exactcode.de | http://t2-project.org | http://rene.rebe.name ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-05 17:15 ` René Rebe @ 2009-02-05 17:36 ` Paul Brook 2009-02-05 17:51 ` Daniel P. Berrange 0 siblings, 1 reply; 82+ messages in thread From: Paul Brook @ 2009-02-05 17:36 UTC (permalink / raw) To: qemu-devel; +Cc: René Rebe > given the sheer amount of CPU sililcon without VT/SVM it looks to be worth > keeping [kqemu] working. Maybe just to pull it into the Qemu SVN? I'd rather not. What you[1] really need to do is get it merged into upstream linux kernels. There have been several threads about this previously, the short version is that it probably involves rewriting to use the kvm API. You'll find that many developers (including myself) have extremely low tolerance for out of tree kernel modules[2]. Paul [1] Or someone else who actually cares/is paid to care about kqemu. [2] Obviously there's a bit of chicken and egg here. Upstream submission should at least be a fairly near-term goal. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-05 17:36 ` Paul Brook @ 2009-02-05 17:51 ` Daniel P. Berrange 0 siblings, 0 replies; 82+ messages in thread From: Daniel P. Berrange @ 2009-02-05 17:51 UTC (permalink / raw) To: qemu-devel; +Cc: René Rebe On Thu, Feb 05, 2009 at 05:36:22PM +0000, Paul Brook wrote: > > given the sheer amount of CPU sililcon without VT/SVM it looks to be worth > > keeping [kqemu] working. Maybe just to pull it into the Qemu SVN? > > I'd rather not. What you[1] really need to do is get it merged into upstream > linux kernels. There have been several threads about this previously, the > short version is that it probably involves rewriting to use the kvm API. > You'll find that many developers (including myself) have extremely low > tolerance for out of tree kernel modules[2]. More fundamentally, whether in or out of tree, someone needs to step forward & commit to being an active long term maintainer for the code. Having it in QEMU SVN without someone maintaining it won't help the current situation, and nor will dumping it upstream without someone maintaining it. Regards, Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-05 16:27 ` Paul Brook 2009-02-05 17:15 ` René Rebe @ 2009-02-05 17:51 ` Ben Taylor 2009-02-05 18:39 ` René Rebe ` (2 more replies) 1 sibling, 3 replies; 82+ messages in thread From: Ben Taylor @ 2009-02-05 17:51 UTC (permalink / raw) To: qemu-devel On Thu, Feb 5, 2009 at 11:27 AM, Paul Brook <paul@codesourcery.com> wrote: > > In practice Fabice is pretty much the only person who's ever done significant > work on kqemu (except maybe some fairly minor host OS porting bits). There's > never been a public source repository, so you get to use whatever random > tarballs Fabrice leaves lying around. If those don't work, noone really > cares. I've maintained tarballs for both 1.4.0 and 1.3.0 at the qemu project on OpenSolaris.org, and just realized that I never put into the SVN repo the mods I made to the 1.4.0 code. I had tested it with Solaris SXCE and Ubuntu 08.04. If anyone shows some interest in testing, I'll import the 1.4.0 into the SVN repo. I believe that I picked up the minor patches that were posted to the list to fix compilations on linux with some various kernels. Ben ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-05 17:51 ` Ben Taylor @ 2009-02-05 18:39 ` René Rebe 2009-02-05 19:03 ` Anthony Liguori 2009-02-15 15:25 ` Andreas Färber 2 siblings, 0 replies; 82+ messages in thread From: René Rebe @ 2009-02-05 18:39 UTC (permalink / raw) To: qemu-devel Ben Taylor wrote: > On Thu, Feb 5, 2009 at 11:27 AM, Paul Brook <paul@codesourcery.com> wrote: > >> In practice Fabice is pretty much the only person who's ever done significant >> work on kqemu (except maybe some fairly minor host OS porting bits). There's >> never been a public source repository, so you get to use whatever random >> tarballs Fabrice leaves lying around. If those don't work, noone really >> cares. >> > > I've maintained tarballs for both 1.4.0 and 1.3.0 at the qemu project > on OpenSolaris.org, and just realized that I never put into the SVN repo > the mods I made to the 1.4.0 code. I had tested it with Solaris SXCE > and Ubuntu 08.04. If anyone shows some interest in testing, I'll import > the 1.4.0 into the SVN repo. I believe that I picked up the minor > patches that were posted to the list to fix compilations on linux > with some various kernels. > Hm - kqemu-1.4.0pre1 builds for at least 2.6.28 and 2.6.26 for x86 and x86-64 on my side. Anyway, could you post your modifications? An unsorted drop sent to me privately is also welcome if you lack the time to sort it out. Thanks, -- René Rebe - ExactCODE GmbH - Europe, Germany, Berlin http://exactcode.de | http://t2-project.org | http://rene.rebe.name ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-05 17:51 ` Ben Taylor 2009-02-05 18:39 ` René Rebe @ 2009-02-05 19:03 ` Anthony Liguori 2009-02-06 10:54 ` Steve Fosdick [not found] ` <92CAE88C-36FF-4566-BD1D-ACA58C98CB0F@hotmail.com> 2009-02-15 15:25 ` Andreas Färber 2 siblings, 2 replies; 82+ messages in thread From: Anthony Liguori @ 2009-02-05 19:03 UTC (permalink / raw) To: qemu-devel Ben Taylor wrote: > On Thu, Feb 5, 2009 at 11:27 AM, Paul Brook <paul@codesourcery.com> wrote: > >> In practice Fabice is pretty much the only person who's ever done significant >> work on kqemu (except maybe some fairly minor host OS porting bits). There's >> never been a public source repository, so you get to use whatever random >> tarballs Fabrice leaves lying around. If those don't work, noone really >> cares. >> > > I've maintained tarballs for both 1.4.0 and 1.3.0 at the qemu project > on OpenSolaris.org, and just realized that I never put into the SVN repo > the mods I made to the 1.4.0 code. I had tested it with Solaris SXCE > and Ubuntu 08.04. If anyone shows some interest in testing, I'll import > the 1.4.0 into the SVN repo. I believe that I picked up the minor > patches that were posted to the list to fix compilations on linux > with some various kernels. > Personally, I'd prefer that it lived outside of the QEMU tree. It is never going to go into upstream Linux and it's not something that I think is worth supporting. Regards, Anthony Liguori > Ben > > > ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-05 19:03 ` Anthony Liguori @ 2009-02-06 10:54 ` Steve Fosdick 2009-02-06 15:57 ` René Rebe 2009-02-07 16:39 ` Jamie Lokier [not found] ` <92CAE88C-36FF-4566-BD1D-ACA58C98CB0F@hotmail.com> 1 sibling, 2 replies; 82+ messages in thread From: Steve Fosdick @ 2009-02-06 10:54 UTC (permalink / raw) To: qemu-devel On Thu, 2009-02-05 at 13:03 -0600, Anthony Liguori wrote: > Personally, I'd prefer that it lived outside of the QEMU tree. It is > never going to go into upstream Linux and it's not something that I > think is worth supporting. Does anyone here have any stats on what people are using QEMU for? I ask this because I suspect a significant use case is running an x86 guest on an x86 host and, at the moment, the only way to get reasonable performance on a non virtualisation-enhanced CPU seems to be to use kqemu. Now, I can understand the developers of kvm only supporting the virtualisation-enhanced CPUs because, looking to the future they will be common. I suspect at the moment though there are plenty of people running VMs on older hardware. I can also see that if it would take major refactoring to get kqemu into the main kernel tree it is probably not worth the effort as, by the time that work is complete, the ratio of virtualisation-enhanced CPUs to older, non virtualisation-enhanced CPUs would be higher. To my mind, what would be good right now is if someone (or some people) understands kqemu well enough that, if kernel changes break it, it can be fixed, not forever but until more people have virtualisation-enhanced CPUs and can use KVM instead. Regards, Steve. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-06 10:54 ` Steve Fosdick @ 2009-02-06 15:57 ` René Rebe 2009-02-06 17:12 ` Anthony Liguori 2009-02-06 21:53 ` René Rebe 2009-02-07 16:39 ` Jamie Lokier 1 sibling, 2 replies; 82+ messages in thread From: René Rebe @ 2009-02-06 15:57 UTC (permalink / raw) To: qemu-devel Hi, Steve Fosdick wrote: > On Thu, 2009-02-05 at 13:03 -0600, Anthony Liguori wrote: > > >> Personally, I'd prefer that it lived outside of the QEMU tree. It is >> never going to go into upstream Linux and it's not something that I >> think is worth supporting. >> > > Does anyone here have any stats on what people are using QEMU for? > > I ask this because I suspect a significant use case is running an x86 > guest on an x86 host and, at the moment, the only way to get reasonable > performance on a non virtualisation-enhanced CPU seems to be to use > kqmeu. > > Now, I can understand the developers of kvm only supporting the > virtualisation-enhanced CPUs because, looking to the future they will be > common. I suspect at the moment though there are plenty of people > running VMs on older hardware. > > I can also see that if it would take major refactoring to get kqemu into > the main kernal tree it is probably not worth the efforts as, by the > time that work is complete the ratio virtualisation-enhanced CPUs to > older, non virtualisation-enhanced CPUs would be higher. > > To my mind mind, what would be good right now is if someone (or some > people) understands kqemu well enough that, if kernel changes break it, > it can be fixed, not forever but until more people have > virtualisation-enhanced CPUs and can use KVM instead. > Indeed. Though I have used KVM for the past months to do Linux development and system testing / integration, I had a use case for kqemu (non-VT CPU) just this week and was surprised to find that a quite "old" kqemu release just builds and works for both 2.6.26 and 2.6.28. And so far there has been no problem with it.
While I have no problem with having it ported to the KVM interface in the long term, just declaring a quite useful and functional piece of open source work obsolete and unsupported is quite drastic. This work should not be lost so easily. If kqemu is supposed to go upstream, the question remains what to do with the FreeBSD, Windows, Solaris, etc. glue code. If I knew more of the internals of kqemu I would even volunteer to maintain it - however, I just took my first look at it yesterday, which does not really qualify me to maintain it just yet. I would, though, work on getting it adapted to future kernel changes, and/or even hunt a bug if it starts crashing in one or another scenario for me (but right now I have to hunt some crashes with 32-bit host KVM for a start). Yours, -- René Rebe - ExactCODE GmbH - Europe, Germany, Berlin http://exactcode.de | http://t2-project.org | http://rene.rebe.name ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-06 15:57 ` René Rebe @ 2009-02-06 17:12 ` Anthony Liguori 2009-02-06 21:47 ` René Rebe 2009-02-07 16:49 ` Jamie Lokier 2009-02-06 21:53 ` René Rebe 1 sibling, 2 replies; 82+ messages in thread From: Anthony Liguori @ 2009-02-06 17:12 UTC (permalink / raw) To: qemu-devel René Rebe wrote: > > Hi, >> > Indeed. Though I used KVM for the past months to do Linux development > and system testing / integration I had a use case for kqemu (non-VT CPU) > just this week and was surprised to find quite "old" kqemu release > just build > and work for booth 2.6.26 and 2.6.28. And so far there was no problem > with > it. > > While I have no problem having it long time ported to the KVM interface, > just declaring some quite useful and functional piece of open source work > obsolete and unsupported quite drastic. This work should be not be lost > so easily. I think you misunderstand. Noone is claiming that kqemu is no longer being supported. Quite rather, we're simply stating it's never been supported. It started as a binary kernel module, impossible to support within the QEMU community. While Fabrice has open sourced kqemu, it's never been included in QEMU. It's not maintained by the current QEMU maintainers and not supported by the current QEMU maintainers. It's essentially a separate project. Regards, Anthony Liguori > When kqemu is supposed to be gotten upstream the question remains what > to do with the freebsd, windows, solaris, etc. glue code. > > If I would know more of the internals of kqemu I would even volunteer to > maintain it - however, I just took the first look at it yesterday > which does > not really qualify to maintain it just yet. Though I would work on > getting > it adapted on future kernel changes, and/or even hunt a bug if it starts > crashing in one or another scenario for me (but right now I have to hunt > some crashing with 32bit host KVM for a start). 
> > Yours, > ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-06 17:12 ` Anthony Liguori @ 2009-02-06 21:47 ` René Rebe 2009-02-07 16:49 ` Jamie Lokier 1 sibling, 0 replies; 82+ messages in thread From: René Rebe @ 2009-02-06 21:47 UTC (permalink / raw) To: qemu-devel Anthony Liguori wrote: > René Rebe wrote: >> >> Hi, >>> >> Indeed. Though I used KVM for the past months to do Linux development >> and system testing / integration I had a use case for kqemu (non-VT CPU) >> just this week and was surprised to find quite "old" kqemu release >> just build >> and work for booth 2.6.26 and 2.6.28. And so far there was no problem >> with >> it. >> >> While I have no problem having it long time ported to the KVM interface, >> just declaring some quite useful and functional piece of open source work >> obsolete and unsupported quite drastic. This work should be not be lost >> so easily. > > I think you misunderstand. Noone is claiming that kqemu is no longer > being supported. Quite rather, we're simply stating it's never been > supported. > > It started as a binary kernel module, impossible to support within the > QEMU community. While Fabrice has open sourced kqemu, it's never been > included in QEMU. It's not maintained by the current QEMU maintainers > and not supported by the current QEMU maintainers. I know about the history pretty well. Btw. is Fabrice still actively working on Qemu related code these days? > It's essentially a separate project. Well - depends. The user-space part always was in Qemu, but the kernel module apparently is a little left aside. However, this should not stop us from improving the situation instead of letting it bitrot. -- René Rebe - ExactCODE GmbH - Europe, Germany, Berlin http://exactcode.de | http://t2-project.org | http://rene.rebe.name ^ permalink raw reply [flat|nested] 82+ messages in thread
* [Qemu-devel] Re: Cutting a new QEMU release 2009-02-06 17:12 ` Anthony Liguori 2009-02-06 21:47 ` René Rebe @ 2009-02-07 16:49 ` Jamie Lokier 2009-02-07 17:06 ` Laurent Desnogues 2009-02-07 23:46 ` Anthony Liguori 1 sibling, 2 replies; 82+ messages in thread From: Jamie Lokier @ 2009-02-07 16:49 UTC (permalink / raw) To: qemu-devel Anthony Liguori wrote: > >While I have no problem having it long time ported to the KVM interface, > >just declaring some quite useful and functional piece of open source work > >obsolete and unsupported quite drastic. This work should be not be lost > >so easily. > > I think you misunderstand. Noone is claiming that kqemu is no longer > being supported. Quite rather, we're simply stating it's never been > supported. > > It started as a binary kernel module, impossible to support within the > QEMU community. While Fabrice has open sourced kqemu, it's never been > included in QEMU. It's not maintained by the current QEMU maintainers > and not supported by the current QEMU maintainers. > > It's essentially a separate project. Yes, it's unfortunate how its history worked out. On the face of it, it looks like Fabrice was hoping for someone to pay for it. Maybe they did. I remember a vague murmur of an attempt to make an open source replacement for kqemu when it was still binary-only; that didn't go anywhere as far as I remember. Anthony: If one or more maintainers were to step up, perhaps even begin adapting the kqemu interface to kvm's, would you be interested in folding it in the main qemu/kvm project as an official feature? Straw poll: who here's interested in maintaining kqemu? I have very little time, but plenty of x86 intimate knowledge and kernel knowledge, and have used kqemu occasionally. I can offer my hand as "interested a bit, not by myself". (Also, perhaps some of the Windows / other kqemu bits might be useful in porting kvm to Windows. 
Now that we have nested kvm, those of us who never run a native Windows host can think about testing such a thing ;-) -- Jamie ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-07 16:49 ` Jamie Lokier @ 2009-02-07 17:06 ` Laurent Desnogues 2009-02-07 23:46 ` Anthony Liguori 1 sibling, 0 replies; 82+ messages in thread From: Laurent Desnogues @ 2009-02-07 17:06 UTC (permalink / raw) To: qemu-devel On Sat, Feb 7, 2009 at 5:49 PM, Jamie Lokier <jamie@shareable.org> wrote: > I remember a vague murmur of an attempt to make an open > source replacement for kqemu when it was still binary-only; that > didn't go anywhere as far as I remember. I think you're referring to Paul's qvm86. http://savannah.nongnu.org/projects/qvm86 Laurent ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-07 16:49 ` Jamie Lokier 2009-02-07 17:06 ` Laurent Desnogues @ 2009-02-07 23:46 ` Anthony Liguori 1 sibling, 0 replies; 82+ messages in thread From: Anthony Liguori @ 2009-02-07 23:46 UTC (permalink / raw) To: qemu-devel Jamie Lokier wrote: > Anthony Liguori wrote: > > Yes, it's unfortunate how its history worked out. On the face of it, > it looks like Fabrice was hoping for someone to pay for it. Maybe > they did. I remember a vague murmur of an attempt to make an open > source replacement for kqemu when it was still binary-only; that > didn't go anywhere as far as I remember. > > Anthony: If one or more maintainers were to step up, perhaps even > begin adapting the kqemu interface to kvm's, would you be interested > in folding it in the main qemu/kvm project as an official feature? > Actions speak louder than words. All it takes is for someone to set up a tree somewhere with kqemu in it, and start working on it and merging patches. Once that happens, we can discuss the long term future wrt KVM and QEMU. Otherwise, it's just pontificating here. Merging into Linux proper is going to be a lot of work. I strongly suspect you won't see anyone step up. From a developer perspective, it's a case of diminishing returns. The more work you put into it, the less useful it is to people. Every day that goes by, the potential audience grows smaller. Furthermore, the barrier to entry for someone to get a better solution (i.e. KVM) is rather small. Just buy a new CPU. I think the only way it would prove useful to maintain is if some developer either has a deep desire to mess around with this kind of stuff or has a large customer base with pre-VT/SVM hardware that they wish to support. So far, no such developer has proven to exist. Recall, even when kqemu was the only solution (but closed source), there wasn't really anyone interested/willing to maintain qvm86. N.B. KVM and kqemu are not equal solutions.
Even at its best, kqemu is going to be significantly slower than KVM in most cases. When dealing with more modern CPUs (Barcelonas and Core i7s), the difference is going to be extremely high. Regards, Anthony Liguori > Straw poll: who here's interested in maintaining kqemu? > > I have very little time, but plenty of x86 intimate knowledge and > kernel knowledge, and have used kqemu occasionally. I can offer my > hand as "interested a bit, not by myself". > > (Also, perhaps some of the Windows / other kqemu bits might be useful > in porting kvm to Windows. Now that we have nested kvm, those of us > who never run a native Windows host can think about testing such a thing ;-) > > -- Jamie > > > ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-06 15:57 ` René Rebe 2009-02-06 17:12 ` Anthony Liguori @ 2009-02-06 21:53 ` René Rebe 1 sibling, 0 replies; 82+ messages in thread From: René Rebe @ 2009-02-06 21:53 UTC (permalink / raw) To: qemu-devel René Rebe wrote: > If I would know more of the internals of kqemu I would even volunteer to > maintain it - however, I just took the first look at it yesterday which > does > not really qualify to maintain it just yet. Though I would work on getting > it adapted on future kernel changes, and/or even hunt a bug if it starts > crashing in one or another scenario for me (but right now I have to hunt > some crashing with 32bit host KVM for a start). Ok, those segfaults were due to an old non-NPTL glibc and the __thread support being nonfunctional in this combination used in the kvm tree :-) -- René Rebe - ExactCODE GmbH - Europe, Germany, Berlin http://exactcode.de | http://t2-project.org | http://rene.rebe.name ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-06 10:54 ` Steve Fosdick 2009-02-06 15:57 ` René Rebe @ 2009-02-07 16:39 ` Jamie Lokier 1 sibling, 0 replies; 82+ messages in thread From: Jamie Lokier @ 2009-02-07 16:39 UTC (permalink / raw) To: qemu-devel Steve Fosdick wrote: > Now, I can understand the developers of kvm only supporting the > virtualisation-enhanced CPUs because, looking to the future they will be > common. I suspect at the moment though there are plenty of people > running VMs on older hardware. In a couple of brief threads before, it was made fairly clear that kvm developers believe CPUs without the virtualisation feature are essentially obsolete, not just non-current. I sympathise with that view, now that my laptop has the feature :-) But it does seem harsh, a rather sudden cut-off point as it was only a few years ago that the virtualisation feature was not common, and it's still not available on all x86s. -- Jamie ^ permalink raw reply [flat|nested] 82+ messages in thread
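[Editor's note] Much of this sub-thread turns on whether a given host CPU has the hardware virtualisation extensions that KVM needs. On a Linux host these show up as the `vmx` (Intel VT-x) or `svm` (AMD-V) entries in the `flags` line of /proc/cpuinfo. The following is only a minimal sketch of that check in Python; the function names `hw_virt_flags` and `suggest_accelerator` are made up for illustration, and a present flag does not guarantee that the feature is enabled in the firmware.

```python
def hw_virt_flags(cpuinfo_text):
    """Return the hardware-virtualisation flags found in /proc/cpuinfo text.

    Only looks at x86-style "flags" lines; "vmx" is Intel VT-x and
    "svm" is AMD-V. Other architectures are out of scope for this sketch.
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            found.update(f for f in value.split() if f in ("vmx", "svm"))
    return found


def suggest_accelerator(cpuinfo_text):
    """Hypothetical helper: name the option discussed in the thread."""
    return "kvm" if hw_virt_flags(cpuinfo_text) else "kqemu or plain TCG"


if __name__ == "__main__":
    # Reads the real file only when run directly on a Linux host.
    with open("/proc/cpuinfo") as f:
        print(suggest_accelerator(f.read()))
```

A host with neither flag is the "older hardware" case that Steve and Jamie describe, where kqemu or unaccelerated emulation are the remaining choices.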
[parent not found: <92CAE88C-36FF-4566-BD1D-ACA58C98CB0F@hotmail.com>]
* Re: [Qemu-devel] Re: Cutting a new QEMU release [not found] ` <92CAE88C-36FF-4566-BD1D-ACA58C98CB0F@hotmail.com> @ 2009-02-09 5:01 ` C.W. Betts [not found] ` <784D2534-F9CD-4EA5-BBEE-67E9DE196598@hotmail.com> 0 siblings, 1 reply; 82+ messages in thread From: C.W. Betts @ 2009-02-09 5:01 UTC (permalink / raw) To: qemu-devel [-- Attachment #1: Type: text/plain, Size: 1398 bytes --] On Feb 5, 2009, at 12:03 PM, Anthony Liguori wrote: > Ben Taylor wrote: >> On Thu, Feb 5, 2009 at 11:27 AM, Paul Brook <paul@codesourcery.com> >> wrote: >> >>> In practice Fabice is pretty much the only person who's ever done >>> significant >>> work on kqemu (except maybe some fairly minor host OS porting >>> bits). There's >>> never been a public source repository, so you get to use whatever >>> random >>> tarballs Fabrice leaves lying around. If those don't work, noone >>> really >>> cares. >>> >> >> I've maintained tarballs for both 1.4.0 and 1.3.0 at the qemu project >> on OpenSolaris.org, and just realized that I never put into the SVN >> repo >> the mods I made to the 1.4.0 code. I had tested it with Solaris SXCE >> and Ubuntu 08.04. If anyone shows some interest in testing, I'll >> import >> the 1.4.0 into the SVN repo. I believe that I picked up the minor >> patches that were posted to the list to fix compilations on linux >> with some various kernels. >> > > Personally, I'd prefer that it lived outside of the QEMU tree. It > is never going to go into upstream Linux and it's not something that > I think is worth supporting. > The only thing that prevents kvm being used in Windows or Darwin/OS X is that it depends too heavily on the Linux Kernel. Kqemu, on the other hand, has been ported to Windows, and someone tried to do a Darwin port. [-- Attachment #2: Type: text/html, Size: 2536 bytes --] ^ permalink raw reply [flat|nested] 82+ messages in thread
[parent not found: <784D2534-F9CD-4EA5-BBEE-67E9DE196598@hotmail.com>]
* Re: [Qemu-devel] Re: Cutting a new QEMU release [not found] ` <784D2534-F9CD-4EA5-BBEE-67E9DE196598@hotmail.com> @ 2009-02-09 5:42 ` C.W. Betts 2009-02-09 10:29 ` René Rebe 0 siblings, 1 reply; 82+ messages in thread From: C.W. Betts @ 2009-02-09 5:42 UTC (permalink / raw) To: qemu-devel Okay, what is keeping Qemu from releasing a new version? I say we release a release candidate, wait for people to find the bugs (two or three weeks), then release a new "official" version. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-09 5:42 ` C.W. Betts @ 2009-02-09 10:29 ` René Rebe 0 siblings, 0 replies; 82+ messages in thread From: René Rebe @ 2009-02-09 10:29 UTC (permalink / raw) To: qemu-devel C.W. Betts wrote: > Okay, what is keeping Qemu from releasing a new version? I say we > release a release candidate, wait for people to find the bugs (two or > three weeks), then release a new "official" version. I would prefer a release "schedule" like kvm: every other month - often and quickly. -- René Rebe - ExactCODE GmbH - Europe, Germany, Berlin http://exactcode.de | http://t2-project.org | http://rene.rebe.name ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-05 17:51 ` Ben Taylor 2009-02-05 18:39 ` René Rebe 2009-02-05 19:03 ` Anthony Liguori @ 2009-02-15 15:25 ` Andreas Färber 2009-02-15 15:44 ` Jamie Lokier 2009-02-15 18:17 ` Anthony Liguori 2 siblings, 2 replies; 82+ messages in thread From: Andreas Färber @ 2009-02-15 15:25 UTC (permalink / raw) To: qemu-devel On 05.02.2009 at 18:51, Ben Taylor wrote: > On Thu, Feb 5, 2009 at 11:27 AM, Paul Brook <paul@codesourcery.com> > wrote: >> >> In practice Fabice is pretty much the only person who's ever done >> significant >> work on kqemu (except maybe some fairly minor host OS porting >> bits). There's >> never been a public source repository, so you get to use whatever >> random >> tarballs Fabrice leaves lying around. If those don't work, noone >> really >> cares. > > I've maintained tarballs for both 1.4.0 and 1.3.0 at the qemu project > on OpenSolaris.org, and just realized that I never put into the SVN > repo > the mods I made to the 1.4.0 code. I had tested it with Solaris SXCE > and Ubuntu 08.04. If anyone shows some interest in testing, I'll > import > the 1.4.0 into the SVN repo. I believe that I picked up the minor > patches that were posted to the list to fix compilations on linux > with some various kernels. I have happily used kqemu 1.4 on OpenSolaris for several months without problems, running Linux in sparc-softmmu and Haiku/BeOS in i386-softmmu. I did have to tweak the Makefile a little for kqemu to link on OpenSolaris/amd64, I believe. Possibly by replacing ld with path/to/amd64/ld. There has been no rumor of any KVM port to Solaris. Linux kernel integration cannot be the only criterion. It used to work in early December - could we set up a Git repo for Fabrice's official tarball? Then we could apply the OpenSolaris.org changes on a branch and play with our own Git forks to keep it working as long as there is no alternative. Asking for maintainers of unversioned software seems doomed to fail.
Andreas ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-15 15:25 ` Andreas Färber @ 2009-02-15 15:44 ` Jamie Lokier 2009-02-15 19:14 ` Andreas Färber 2009-02-15 18:17 ` Anthony Liguori 1 sibling, 1 reply; 82+ messages in thread From: Jamie Lokier @ 2009-02-15 15:44 UTC (permalink / raw) To: qemu-devel Andreas Färber wrote: > There has been no rumor of any KVM port to Solaris. Linux kernel > integration cannot be the only criteria. Does Solaris not have their own equivalent to KVM, for running VMs? -- Jamie ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-15 15:44 ` Jamie Lokier @ 2009-02-15 19:14 ` Andreas Färber 0 siblings, 0 replies; 82+ messages in thread From: Andreas Färber @ 2009-02-15 19:14 UTC (permalink / raw) To: qemu-devel On 15.02.2009 at 16:44, Jamie Lokier wrote: > Andreas Färber wrote: >> There has been no rumor of any KVM port to Solaris. Linux kernel >> integration cannot be the only criteria. > > Does Solaris not have their own equivalent to KVM, for running VMs? Sun has xVM (based on Xen) with virt-manager UI. But it didn't run, e.g., Haiku and doesn't help with non-native (sparc-/ppc-softmmu) emulation either. I'm not looking for a virtualization technology on Solaris but for a platform suited for my uses of QEMU emulation (and that box was pretty fast :). However unsupported, QEMU+kqemu on OpenSolaris/amd64 is much faster than unaccelerated QEMU on OSX/ppc! And trying to set up any KVM guest in Fedora was a pain. Haven't tried the new KVM integration in QEMU trunk yet. Maybe kqemu really has a bad kernel interface, but it's simple to set up and fits the needs of my use cases: - booting existing hard disk images without one-time booting from an ISO image first (virt-manager seems to require the latter) - storing the image files anywhere I like, including my home dir (SELinux messes with that on Fedora 10) - starting the VM from an unprivileged user, preferably from a shell script Andreas ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-15 15:25 ` Andreas Färber 2009-02-15 15:44 ` Jamie Lokier @ 2009-02-15 18:17 ` Anthony Liguori 2009-02-15 20:31 ` Andreas Färber 1 sibling, 1 reply; 82+ messages in thread From: Anthony Liguori @ 2009-02-15 18:17 UTC (permalink / raw) To: qemu-devel Andreas Färber wrote: > > Am 05.02.2009 um 18:51 schrieb Ben Taylor: > >> On Thu, Feb 5, 2009 at 11:27 AM, Paul Brook <paul@codesourcery.com> >> wrote: >>> >>> In practice Fabice is pretty much the only person who's ever done >>> significant >>> work on kqemu (except maybe some fairly minor host OS porting bits). >>> There's >>> never been a public source repository, so you get to use whatever >>> random >>> tarballs Fabrice leaves lying around. If those don't work, noone really >>> cares. >> >> I've maintained tarballs for both 1.4.0 and 1.3.0 at the qemu project >> on OpenSolaris.org, and just realized that I never put into the SVN repo >> the mods I made to the 1.4.0 code. I had tested it with Solaris SXCE >> and Ubuntu 08.04. If anyone shows some interest in testing, I'll import >> the 1.4.0 into the SVN repo. I believe that I picked up the minor >> patches that were posted to the list to fix compilations on linux >> with some various kernels. > > I have happily used kqemu 1.4 on OpenSolaris for several months > without problems, running Linux in sparc-softmmu and Haiku/BeOS in > i386-softmmu. > > I did have to tweak the Makefile a little for kqemu to link on > OpenSolaris/amd64, I believe. Possibly by replacing ld with > path/to/amd64/ld. > > There has been no rumor of any KVM port to Solaris. Linux kernel > integration cannot be the only criteria. > It used to work in early December - could we set up a Git repo for > Fabrice's official tarball? Then we could apply the OpenSolaris.org > changes on a branch and play with our own Git forks to keep it working > as long as there is no alternative. 
Asking for maintainers of > unversioned software seems doomed to fail. Set up a repository somewhere. You don't need anyone's permission for that. Savannah isn't a great place for hosting. You can only have one git repo per project. I'd suggest something like github or repo.or.cz. Regards, Anthony Liguori > > Andreas > > > ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-15 18:17 ` Anthony Liguori @ 2009-02-15 20:31 ` Andreas Färber 0 siblings, 0 replies; 82+ messages in thread From: Andreas Färber @ 2009-02-15 20:31 UTC (permalink / raw) To: qemu-devel On 15.02.2009 at 19:17, Anthony Liguori wrote: > Andreas Färber wrote: >> [kqemu] used to work in early December - could we set up a Git repo >> for Fabrice's official tarball? Then we could apply the >> OpenSolaris.org changes on a branch and play with our own Git forks >> to keep it working as long as there is no alternative. Asking for >> maintainers of unversioned software seems doomed to fail. > > Set up a repository somewhere. You don't need anyone's permission > for that. I just did: http://repo.or.cz/w/kqemu.git Andreas ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-05 14:26 ` Anthony Liguori 2009-02-05 15:36 ` Rick Vernam @ 2009-02-05 15:55 ` René Rebe 2009-02-07 12:01 ` Stefan Weil 2 siblings, 0 replies; 82+ messages in thread From: René Rebe @ 2009-02-05 15:55 UTC (permalink / raw) To: qemu-devel Hi, Anthony Liguori wrote: > Steve Fosdick wrote: >> Given the talk of a new release I though I'd try the latest qemu from >> SVN. At the moment I am being hampered by kqemu-1.4.0pre1 not compiling >> though: >> >> CC [M] /usr/src/kqemu-1.4.0pre1/kqemu-linux.o >> /usr/src/kqemu-1.4.0pre1/kqemu-linux.c: In function >> ‘kqemu_lock_user_page’: >> /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:81: error: dereferencing pointer >> to incomplete type >> /usr/src/kqemu-1.4.0pre1/kqemu-linux.c: In function ‘kqemu_schedule’: >> /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:194: error: implicit declaration >> of function ‘need_resched’ >> /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:195: error: implicit declaration >> of function ‘schedule’ >> /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:197: error: implicit declaration >> of function ‘signal_pending’ >> make[2]: *** [/usr/src/kqemu-1.4.0pre1/kqemu-linux.o] Error 1 >> >> This is with kernel 2.6.28.2. kqemu-1.3.0pre11 seems to compile OK with >> the kernel. Any ideas? >> > > kqemu is unsupported and unmaintained. Ouhm. Why's that? Given that the vast majority of CPUs in use still don't have hardware virtualization, ... That said kqemu builds for me and works for me. -- René Rebe - ExactCODE GmbH - Europe, Germany, Berlin http://exactcode.de | http://t2-project.org | http://rene.rebe.name ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-05 14:26 ` Anthony Liguori 2009-02-05 15:36 ` Rick Vernam 2009-02-05 15:55 ` René Rebe @ 2009-02-07 12:01 ` Stefan Weil 2009-02-07 15:08 ` Anthony Liguori 2009-02-07 15:36 ` Jamie Lokier 2 siblings, 2 replies; 82+ messages in thread From: Stefan Weil @ 2009-02-07 12:01 UTC (permalink / raw) To: qemu-devel Anthony Liguori wrote: > > kqemu is unsupported and unmaintained. > > Regards, > > Anthony Liguori > The kvm kernel module could be a good replacement for kqemu for those running linux on new cpus. It does not play this role in current linux distributions because you will need newer versions of kvm which are sometimes difficult to compile. It will never play this role for those running "old" cpus. And it will never play this role on Windows (or is there a kvm for Windows?). I am surprised that nobody mentions this part in the discussion. So even if kqemu is unmaintained, the Qemu developers should at least maintain the interface. I'd prefer to have a svn tree with kqemu beside qemu. Then patches could be sent, and maybe some day there could be a maintainer, too. Integration of code from virtualbox could be a way to replace kqemu, but I don't see this coming in the near future. Regards Stefan Weil ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-07 12:01 ` Stefan Weil @ 2009-02-07 15:08 ` Anthony Liguori 2009-02-07 15:36 ` Jamie Lokier 1 sibling, 0 replies; 82+ messages in thread From: Anthony Liguori @ 2009-02-07 15:08 UTC (permalink / raw) To: qemu-devel Stefan Weil wrote: > The kvm kernel module could be a good replacement for kqemu > for those running linux on new cpus. > > It does not play this role in current linux distributions because > you will need newer versions of kvm which are sometimes > difficult to compile. > > It will never play this role for those running "old" cpus. > > And it will never play this role on Windows (or is there a kvm > for Windows?). I am surprised that nobody mentions this part > in the discussion. > > So even if kqemu is unmaintained, the Qemu developers > should at least maintain the interface. > > I'd prefer to have a svn tree with kqemu beside qemu. > Then patches could be sent, and maybe some day there could > be a maintainer, too. > Nothing is stopping anyone from taking kqemu and setting up a SVN repo somewhere. That's the beauty of the GPL. Regards, Anthony Liguori > Integration of code from virtualbox could be a way to replace > kqemu, but I don't see this coming in the near future. > > Regards > Stefan Weil > > > > ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-07 12:01 ` Stefan Weil 2009-02-07 15:08 ` Anthony Liguori @ 2009-02-07 15:36 ` Jamie Lokier 2009-02-07 16:45 ` Jan Kiszka 1 sibling, 1 reply; 82+ messages in thread From: Jamie Lokier @ 2009-02-07 15:36 UTC (permalink / raw) To: qemu-devel Stefan Weil wrote: > The kvm kernel module could be a good replacement for kqemu > for those running linux on new cpus. It's not yet, though. kvm doesn't run 16-bit code properly. I use kqemu to run older OSes, and kvm to run current ones. I like the idea of a "kvm-soft", which is basically kqemu with a kvm interface. It would need a few extensions on the kvm interface, of course. Another potential use for _part_ of kqemu, or kvm-soft, is emulating other CPUs with host kernel support for the memory map, instead of full software TLB. That might be a performance accelerator for emulation, for some combinations of host and target CPUs where it's feasible to map the memory in that way. If kqemu were evolved into an accelerator for cross-CPU emulation in that way, then its current use as an x86-on-x86 accelerator would just be a special case of that. -- Jamie ^ permalink raw reply [flat|nested] 82+ messages in thread
* [Qemu-devel] Re: Cutting a new QEMU release 2009-02-07 15:36 ` Jamie Lokier @ 2009-02-07 16:45 ` Jan Kiszka 0 siblings, 0 replies; 82+ messages in thread From: Jan Kiszka @ 2009-02-07 16:45 UTC (permalink / raw) To: qemu-devel [-- Attachment #1: Type: text/plain, Size: 2483 bytes --] Jamie Lokier wrote: > Stefan Weil wrote: >> The kvm kernel module could be a good replacement for kqemu >> for those running linux on new cpus. > > It's not yet, though. kvm doesn't run 16-bit code properly. You mean real mode, I guess. I think there are still a few holes in the emulator that may bite you on certain DOSes or with some fancy boot loaders. But 16-bit protected mode runs as fine as 32 or 64 bit for quite a while now. > I use kqemu to run older OSes, and kvm to run current ones. I haven't found much code that caused troubles to kvm, but a lot that broke kqemu - YMMV. > > I like the idea of a "kvm-soft", which is basically kqemu with a kvm > interface. It would need a few extensions on the kvm interface, of > course. > > Another potential use for _part_ of kqemu, or kvm-soft, is emulating > other CPUs with host kernel support for the memory map, instead of > full software TLB. That might be a performance accelerator for > emulation, for some combinations of host and target CPUs where it's > feasible to map the memory in that way. > > If kqemu were evolved into an accelerator for cross-CPU emulation in > that way, then its current use as an x86-on-x86 accelerator would just > be a special case of that. Most of kqemu's code base deals with / works around unvirtualizable x86 cruft. Memory management, page table handling is only a small subset. And that part is focused on running guest code under the control of the monitor, not in the TCG user space environment. So even if there is something a kernel part could contribute to accelerate TCG execution, I'm not sure that there will be a high re-use of kqemu's infrastructure - or even kvm's. 
You should also be aware that kqemu's x86 virtualization is fairly fragile, only working for OSes like Linux and Windows, and even there not always reliably (I've seen Linux kernels crashing). We are evaluating alternative workarounds for these issues, but they will come with their own limitations. Either they are too costly to implement (binary translation) given the remaining lifetime of kqemu, or they cause problems with self-checking guests (patch in traps & emulate). And both will need special user space support beyond what kqemu or kvm currently need. Depending on the outcome (for the picky customer OS), we may be able to contribute to a properly maintained kqemu tree (or better: kvm-soft). But this is still open. Jan ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [Qemu-devel] Re: Cutting a new QEMU release 2009-02-05 9:13 [Qemu-devel] Re: Cutting a new QEMU release Steve Fosdick 2009-02-05 14:26 ` Anthony Liguori @ 2009-02-05 14:55 ` Rick Vernam 1 sibling, 0 replies; 82+ messages in thread From: Rick Vernam @ 2009-02-05 14:55 UTC (permalink / raw) To: qemu-devel On Thursday 05 February 2009 3:13:14 am Steve Fosdick wrote: > Given the talk of a new release I thought I'd try the latest qemu from > SVN. At the moment I am being hampered by kqemu-1.4.0pre1 not compiling > though: > > CC [M] /usr/src/kqemu-1.4.0pre1/kqemu-linux.o > /usr/src/kqemu-1.4.0pre1/kqemu-linux.c: In function > ‘kqemu_lock_user_page’: > /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:81: error: dereferencing pointer > to incomplete type > /usr/src/kqemu-1.4.0pre1/kqemu-linux.c: In function ‘kqemu_schedule’: > /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:194: error: implicit declaration > of function ‘need_resched’ > /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:195: error: implicit declaration > of function ‘schedule’ > /usr/src/kqemu-1.4.0pre1/kqemu-linux.c:197: error: implicit declaration > of function ‘signal_pending’ > make[2]: *** [/usr/src/kqemu-1.4.0pre1/kqemu-linux.o] Error 1 > > This is with kernel 2.6.28.2. kqemu-1.3.0pre11 seems to compile OK with > the kernel. Any ideas? Another poster and I wrote about this some time ago. The solution is a particular #include somewhere, which I don't recall off the top of my head. It's in the list archives somewhere, if you look hard enough. > > Regards, > Steve. ^ permalink raw reply [flat|nested] 82+ messages in thread